* [RFC PATCH v5 00/20] Refactor HW workaround code
@ 2017-11-03 18:09 Oscar Mateo
  2017-11-03 18:09 ` [RFC PATCH 01/20] drm/i915: Remove Gen9 WAs with no effect Oscar Mateo
                   ` (22 more replies)
  0 siblings, 23 replies; 30+ messages in thread
From: Oscar Mateo @ 2017-11-03 18:09 UTC (permalink / raw)
  To: intel-gfx

New approach using static tables instead of a programmatic one. This is an RFC for
two reasons: first, because I still need to re-review everything myself (I
wanted to get it out there asap), and second, because I'm not 100% convinced
by this approach.

While writing the patches, the approach felt forced: I couldn't use const
structs because a good number of things need calculations (e.g.
skl_tune_iz_hashing), or need to pass data between the pre- and post- hooks
(e.g. disable/enable_dop_clock_gating), or cannot be done in tables (e.g. asserting
that the values make sense for those registers that are masked), or need to extract
fields from structs (whitelist registers). I also cannot predict how future-proof
this approach is.
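
To illustrate the tension described above, here is a minimal, standalone sketch
of a workaround table in which most entries are static but one needs a value
computed at runtime. It is not the code from this series: struct wa_entry,
fake_mmio, compute_hash_field and wa_table_apply are all hypothetical names,
and the register file is a plain array standing in for MMIO.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for MMIO space (assumption, not driver code). */
static uint32_t fake_mmio[16];

struct wa_entry {
	uint32_t reg;    /* index into fake_mmio */
	uint32_t mask;   /* bits owned by this workaround */
	uint32_t value;  /* static value, used when compute is NULL */
	/* Optional hook for values only known at runtime. */
	uint32_t (*compute)(const void *ctx);
};

/*
 * Example runtime hook: derive a 2-bit field from a subslice index,
 * loosely modelled on the "3 - ss" step in skl_tune_iz_hashing.
 */
static uint32_t compute_hash_field(const void *ctx)
{
	const uint8_t *subslice = ctx;

	return (uint32_t)(3 - *subslice) & 0x3;
}

/* The table itself can stay const; only the hook runs code. */
static const struct wa_entry example_table[] = {
	{ .reg = 0, .mask = 0x1, .value = 0x1 },
	{ .reg = 1, .mask = 0x3, .compute = compute_hash_field },
};

/* Apply every entry with a read-modify-write limited to its mask. */
static void wa_table_apply(const struct wa_entry *t, size_t n,
			   const void *ctx)
{
	for (size_t i = 0; i < n; i++) {
		uint32_t val = t[i].compute ? t[i].compute(ctx) : t[i].value;
		uint32_t old = fake_mmio[t[i].reg];

		fake_mmio[t[i].reg] = (old & ~t[i].mask) | (val & t[i].mask);
	}
}
```

A function pointer per entry keeps the table const, but every computed value
then needs its own hook, which is part of why the table form can feel forced.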

Furthermore, this is going to be much more difficult to review than the previous
approach, if only because the delta is much bigger. If this approach is preferred,
I strongly suggest we do it on top of the previous one (that way we have the debugfs
output much earlier in the game to make sure we are not missing anything).

From previous cover letters:

Currently, deciding how/where to apply new workarounds is challenging. Often,
workarounds end up applied incorrectly and get lost under certain circumstances
(e.g. a context switch or a GPU reset). This is a proposal to attempt to
eliminate some of this pain, by clarifying the current classification of
workarounds (context saved/restored, global registers, whitelisting, BB),
putting them together on the same file, and improving the existing validation
infrastructure (debugfs/i-g-t).
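
As a rough sketch of what validating workarounds means here: given a list of
(address, mask, value) entries, read each register back and count the ones
whose masked bits no longer match, for example after a reset. The names below
(struct wa_reg, fake_regs, fake_read, wa_count_lost) are hypothetical and the
register file is simulated; this is an illustration under those assumptions,
not the debugfs code.

```c
#include <stddef.h>
#include <stdint.h>

struct wa_reg {
	uint32_t addr;   /* register identifier */
	uint32_t mask;   /* bits the workaround cares about */
	uint32_t value;  /* expected contents of those bits */
};

/* Hypothetical reader type; a driver would use its own MMIO accessor. */
typedef uint32_t (*reg_read_fn)(uint32_t addr);

/* Simulated register file so the sketch is self-contained. */
static uint32_t fake_regs[4];

static uint32_t fake_read(uint32_t addr)
{
	return fake_regs[addr % 4];
}

/* Count the workarounds whose masked bits differ from what was programmed. */
static size_t wa_count_lost(const struct wa_reg *regs, size_t n,
			    reg_read_fn rd)
{
	size_t lost = 0;

	for (size_t i = 0; i < n; i++) {
		uint32_t cur = rd(regs[i].addr);

		if ((cur & regs[i].mask) != (regs[i].value & regs[i].mask))
			lost++;
	}

	return lost;
}
```

Comparing only the masked bits means unrelated bits in the same register do
not produce false reports of a lost workaround.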

Oscar Mateo (20):
  drm/i915: Remove Gen9 WAs with no effect
  drm/i915: Move a bunch of workaround-related code to its own file
  drm/i915: Split out functions for different kinds of workarounds
  drm/i915: Transform context WAs into static tables
  drm/i915: Transform GT WAs into static tables
  drm/i915: Transform Whitelist WAs into static tables
  drm/i915: Create a new category of display WAs
  drm/i915: Print all workaround types correctly in debugfs
  drm/i915: Do not store the total counts of WAs
  drm/i915: Move WA BB stuff to the workarounds file as well
  drm/i915/cnl: Move GT and Display workarounds from init_clock_gating
  drm/i915/gen9: Move GT and Display workarounds from init_clock_gating
  drm/i915/cfl: Move GT and Display workarounds from init_clock_gating
  drm/i915/glk: Move GT and Display workarounds from init_clock_gating
  drm/i915/kbl: Move GT and Display workarounds from init_clock_gating
  drm/i915/bxt: Move GT and Display workarounds from init_clock_gating
  drm/i915/skl: Move GT and Display workarounds from init_clock_gating
  drm/i915/chv: Move GT and Display workarounds from init_clock_gating
  drm/i915/bdw: Move GT and Display workarounds from init_clock_gating
  drm/i915: Document the i915_workarounds file

 drivers/gpu/drm/i915/Makefile            |    3 +-
 drivers/gpu/drm/i915/i915_debugfs.c      |  117 ++-
 drivers/gpu/drm/i915/i915_drv.h          |   40 +-
 drivers/gpu/drm/i915/i915_gem.c          |    3 +
 drivers/gpu/drm/i915/i915_gem_context.c  |    1 +
 drivers/gpu/drm/i915/i915_reg.h          |    3 -
 drivers/gpu/drm/i915/intel_engine_cs.c   |  682 ------------
 drivers/gpu/drm/i915/intel_lrc.c         |  264 +----
 drivers/gpu/drm/i915/intel_pm.c          |  312 +-----
 drivers/gpu/drm/i915/intel_ringbuffer.c  |    5 +-
 drivers/gpu/drm/i915/intel_ringbuffer.h  |    3 -
 drivers/gpu/drm/i915/intel_workarounds.c | 1663 ++++++++++++++++++++++++++++++
 drivers/gpu/drm/i915/intel_workarounds.h |   51 +
 13 files changed, 1867 insertions(+), 1280 deletions(-)
 create mode 100644 drivers/gpu/drm/i915/intel_workarounds.c
 create mode 100644 drivers/gpu/drm/i915/intel_workarounds.h

-- 
1.9.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


* [RFC PATCH 01/20] drm/i915: Remove Gen9 WAs with no effect
  2017-11-03 18:09 [RFC PATCH v5 00/20] Refactor HW workaround code Oscar Mateo
@ 2017-11-03 18:09 ` Oscar Mateo
  2017-11-06 12:40   ` Chris Wilson
  2017-11-03 18:09 ` [RFC PATCH 02/20] drm/i915: Move a bunch of workaround-related code to its own file Oscar Mateo
                   ` (21 subsequent siblings)
  22 siblings, 1 reply; 30+ messages in thread
From: Oscar Mateo @ 2017-11-03 18:09 UTC (permalink / raw)
  To: intel-gfx

GEN8_CONFIG0 (0xD00) is protected by a lock (bit 31) which is set by
the BIOS, so there is no way we can enable the three chicken bits
mandated by the WA (the BIOS should be doing it instead).
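
A small simulation of why the removed write has no effect, assuming the
hardware silently drops writes once the lock bit is set (the exact behaviour
is an assumption for illustration; struct locked_reg, locked_reg_write and
try_set_bits are hypothetical names, not driver code):

```c
#include <stdint.h>

#define CONFIG_LOCK (1u << 31)

/* Simulated register; a real one would live in MMIO space. */
struct locked_reg {
	uint32_t val;
};

/* Model the assumed hardware rule: a locked register ignores writes. */
static void locked_reg_write(struct locked_reg *r, uint32_t v)
{
	if (r->val & CONFIG_LOCK)
		return;
	r->val = v;
}

/* Read-modify-write to set bits; returns 1 if the bits actually stuck. */
static int try_set_bits(struct locked_reg *r, uint32_t bits)
{
	locked_reg_write(r, r->val | bits);
	return (r->val & bits) == bits;
}
```

With the lock already set, try_set_bits() with (1 << 3 | 1 << 2 | 1 << 1)
leaves the register unchanged, which matches the reasoning for dropping the
write from the driver.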

v2: Rebased
v3: Standalone patch

Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
---
 drivers/gpu/drm/i915/i915_reg.h | 3 ---
 drivers/gpu/drm/i915/intel_pm.c | 3 ---
 2 files changed, 6 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index 8c775e9..9c57e0c 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -355,9 +355,6 @@ static inline bool i915_mmio_reg_valid(i915_reg_t reg)
 #define   ECOCHK_PPGTT_WT_HSW		(0x2<<3)
 #define   ECOCHK_PPGTT_WB_HSW		(0x3<<3)
 
-#define GEN8_CONFIG0			_MMIO(0xD00)
-#define  GEN9_DEFAULT_FIXES		(1 << 3 | 1 << 2 | 1 << 1)
-
 #define GAC_ECO_BITS			_MMIO(0x14090)
 #define   ECOBITS_SNB_BIT		(1<<13)
 #define   ECOBITS_PPGTT_CACHE64B	(3<<8)
diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index 07118c0..acd0cbb 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -75,9 +75,6 @@ static void gen9_init_clock_gating(struct drm_i915_private *dev_priv)
 	I915_WRITE(CHICKEN_PAR1_1,
 		   I915_READ(CHICKEN_PAR1_1) | SKL_EDP_PSR_FIX_RDWRAP);
 
-	I915_WRITE(GEN8_CONFIG0,
-		   I915_READ(GEN8_CONFIG0) | GEN9_DEFAULT_FIXES);
-
 	/* WaEnableChickenDCPR:skl,bxt,kbl,glk,cfl */
 	I915_WRITE(GEN8_CHICKEN_DCPR_1,
 		   I915_READ(GEN8_CHICKEN_DCPR_1) | MASK_WAKEMEM);
-- 
1.9.1


* [RFC PATCH 02/20] drm/i915: Move a bunch of workaround-related code to its own file
  2017-11-03 18:09 [RFC PATCH v5 00/20] Refactor HW workaround code Oscar Mateo
  2017-11-03 18:09 ` [RFC PATCH 01/20] drm/i915: Remove Gen9 WAs with no effect Oscar Mateo
@ 2017-11-03 18:09 ` Oscar Mateo
  2017-11-06 12:42   ` Chris Wilson
  2017-11-03 18:09 ` [RFC PATCH 03/20] drm/i915: Split out functions for different kinds of workarounds Oscar Mateo
                   ` (20 subsequent siblings)
  22 siblings, 1 reply; 30+ messages in thread
From: Oscar Mateo @ 2017-11-03 18:09 UTC (permalink / raw)
  To: intel-gfx

This has grown to be a sizable amount of code, so move it to
its own file before we try to refactor anything. For the moment,
we are leaving behind the WA BB code and the WAs that get applied
(incorrectly) in init_clock_gating, but we will deal with it later.

v2: Use intel_ prefix for code that deals with the hardware (Chris)
v3: Rebased

Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
---
 drivers/gpu/drm/i915/Makefile            |   3 +-
 drivers/gpu/drm/i915/intel_engine_cs.c   | 682 -----------------------------
 drivers/gpu/drm/i915/intel_lrc.c         |   1 +
 drivers/gpu/drm/i915/intel_ringbuffer.c  |   1 +
 drivers/gpu/drm/i915/intel_ringbuffer.h  |   3 -
 drivers/gpu/drm/i915/intel_workarounds.c | 708 +++++++++++++++++++++++++++++++
 drivers/gpu/drm/i915/intel_workarounds.h |  31 ++
 7 files changed, 743 insertions(+), 686 deletions(-)
 create mode 100644 drivers/gpu/drm/i915/intel_workarounds.c
 create mode 100644 drivers/gpu/drm/i915/intel_workarounds.h

diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
index 1bbc544..0eabc9e 100644
--- a/drivers/gpu/drm/i915/Makefile
+++ b/drivers/gpu/drm/i915/Makefile
@@ -41,7 +41,8 @@ i915-y := i915_drv.o \
 	  intel_csr.o \
 	  intel_device_info.o \
 	  intel_pm.o \
-	  intel_runtime_pm.o
+	  intel_runtime_pm.o \
+	  intel_workarounds.o
 
 i915-$(CONFIG_COMPAT)   += i915_ioc32.o
 i915-$(CONFIG_DEBUG_FS) += i915_debugfs.o intel_pipe_crc.o
diff --git a/drivers/gpu/drm/i915/intel_engine_cs.c b/drivers/gpu/drm/i915/intel_engine_cs.c
index f31f2d6..6e4440f 100644
--- a/drivers/gpu/drm/i915/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/intel_engine_cs.c
@@ -818,688 +818,6 @@ void intel_engine_get_instdone(struct intel_engine_cs *engine,
 	}
 }
 
-static int wa_add(struct drm_i915_private *dev_priv,
-		  i915_reg_t addr,
-		  const u32 mask, const u32 val)
-{
-	const u32 idx = dev_priv->workarounds.count;
-
-	if (WARN_ON(idx >= I915_MAX_WA_REGS))
-		return -ENOSPC;
-
-	dev_priv->workarounds.reg[idx].addr = addr;
-	dev_priv->workarounds.reg[idx].value = val;
-	dev_priv->workarounds.reg[idx].mask = mask;
-
-	dev_priv->workarounds.count++;
-
-	return 0;
-}
-
-#define WA_REG(addr, mask, val) do { \
-		const int r = wa_add(dev_priv, (addr), (mask), (val)); \
-		if (r) \
-			return r; \
-	} while (0)
-
-#define WA_SET_BIT_MASKED(addr, mask) \
-	WA_REG(addr, (mask), _MASKED_BIT_ENABLE(mask))
-
-#define WA_CLR_BIT_MASKED(addr, mask) \
-	WA_REG(addr, (mask), _MASKED_BIT_DISABLE(mask))
-
-#define WA_SET_FIELD_MASKED(addr, mask, value) \
-	WA_REG(addr, mask, _MASKED_FIELD(mask, value))
-
-static int wa_ring_whitelist_reg(struct intel_engine_cs *engine,
-				 i915_reg_t reg)
-{
-	struct drm_i915_private *dev_priv = engine->i915;
-	struct i915_workarounds *wa = &dev_priv->workarounds;
-	const uint32_t index = wa->hw_whitelist_count[engine->id];
-
-	if (WARN_ON(index >= RING_MAX_NONPRIV_SLOTS))
-		return -EINVAL;
-
-	I915_WRITE(RING_FORCE_TO_NONPRIV(engine->mmio_base, index),
-		   i915_mmio_reg_offset(reg));
-	wa->hw_whitelist_count[engine->id]++;
-
-	return 0;
-}
-
-static int gen8_init_workarounds(struct intel_engine_cs *engine)
-{
-	struct drm_i915_private *dev_priv = engine->i915;
-
-	WA_SET_BIT_MASKED(INSTPM, INSTPM_FORCE_ORDERING);
-
-	/* WaDisableAsyncFlipPerfMode:bdw,chv */
-	WA_SET_BIT_MASKED(MI_MODE, ASYNC_FLIP_PERF_DISABLE);
-
-	/* WaDisablePartialInstShootdown:bdw,chv */
-	WA_SET_BIT_MASKED(GEN8_ROW_CHICKEN,
-			  PARTIAL_INSTRUCTION_SHOOTDOWN_DISABLE);
-
-	/* Use Force Non-Coherent whenever executing a 3D context. This is a
-	 * workaround for for a possible hang in the unlikely event a TLB
-	 * invalidation occurs during a PSD flush.
-	 */
-	/* WaForceEnableNonCoherent:bdw,chv */
-	/* WaHdcDisableFetchWhenMasked:bdw,chv */
-	WA_SET_BIT_MASKED(HDC_CHICKEN0,
-			  HDC_DONOT_FETCH_MEM_WHEN_MASKED |
-			  HDC_FORCE_NON_COHERENT);
-
-	/* From the Haswell PRM, Command Reference: Registers, CACHE_MODE_0:
-	 * "The Hierarchical Z RAW Stall Optimization allows non-overlapping
-	 *  polygons in the same 8x4 pixel/sample area to be processed without
-	 *  stalling waiting for the earlier ones to write to Hierarchical Z
-	 *  buffer."
-	 *
-	 * This optimization is off by default for BDW and CHV; turn it on.
-	 */
-	WA_CLR_BIT_MASKED(CACHE_MODE_0_GEN7, HIZ_RAW_STALL_OPT_DISABLE);
-
-	/* Wa4x4STCOptimizationDisable:bdw,chv */
-	WA_SET_BIT_MASKED(CACHE_MODE_1, GEN8_4x4_STC_OPTIMIZATION_DISABLE);
-
-	/*
-	 * BSpec recommends 8x4 when MSAA is used,
-	 * however in practice 16x4 seems fastest.
-	 *
-	 * Note that PS/WM thread counts depend on the WIZ hashing
-	 * disable bit, which we don't touch here, but it's good
-	 * to keep in mind (see 3DSTATE_PS and 3DSTATE_WM).
-	 */
-	WA_SET_FIELD_MASKED(GEN7_GT_MODE,
-			    GEN6_WIZ_HASHING_MASK,
-			    GEN6_WIZ_HASHING_16x4);
-
-	return 0;
-}
-
-static int bdw_init_workarounds(struct intel_engine_cs *engine)
-{
-	struct drm_i915_private *dev_priv = engine->i915;
-	int ret;
-
-	ret = gen8_init_workarounds(engine);
-	if (ret)
-		return ret;
-
-	/* WaDisableThreadStallDopClockGating:bdw (pre-production) */
-	WA_SET_BIT_MASKED(GEN8_ROW_CHICKEN, STALL_DOP_GATING_DISABLE);
-
-	/* WaDisableDopClockGating:bdw
-	 *
-	 * Also see the related UCGTCL1 write in broadwell_init_clock_gating()
-	 * to disable EUTC clock gating.
-	 */
-	WA_SET_BIT_MASKED(GEN7_ROW_CHICKEN2,
-			  DOP_CLOCK_GATING_DISABLE);
-
-	WA_SET_BIT_MASKED(HALF_SLICE_CHICKEN3,
-			  GEN8_SAMPLER_POWER_BYPASS_DIS);
-
-	WA_SET_BIT_MASKED(HDC_CHICKEN0,
-			  /* WaForceContextSaveRestoreNonCoherent:bdw */
-			  HDC_FORCE_CONTEXT_SAVE_RESTORE_NON_COHERENT |
-			  /* WaDisableFenceDestinationToSLM:bdw (pre-prod) */
-			  (IS_BDW_GT3(dev_priv) ? HDC_FENCE_DEST_SLM_DISABLE : 0));
-
-	return 0;
-}
-
-static int chv_init_workarounds(struct intel_engine_cs *engine)
-{
-	struct drm_i915_private *dev_priv = engine->i915;
-	int ret;
-
-	ret = gen8_init_workarounds(engine);
-	if (ret)
-		return ret;
-
-	/* WaDisableThreadStallDopClockGating:chv */
-	WA_SET_BIT_MASKED(GEN8_ROW_CHICKEN, STALL_DOP_GATING_DISABLE);
-
-	/* Improve HiZ throughput on CHV. */
-	WA_SET_BIT_MASKED(HIZ_CHICKEN, CHV_HZ_8X8_MODE_IN_1X);
-
-	return 0;
-}
-
-static int gen9_init_workarounds(struct intel_engine_cs *engine)
-{
-	struct drm_i915_private *dev_priv = engine->i915;
-	int ret;
-
-	/* WaConextSwitchWithConcurrentTLBInvalidate:skl,bxt,kbl,glk,cfl */
-	I915_WRITE(GEN9_CSFE_CHICKEN1_RCS, _MASKED_BIT_ENABLE(GEN9_PREEMPT_GPGPU_SYNC_SWITCH_DISABLE));
-
-	/* WaEnableLbsSlaRetryTimerDecrement:skl,bxt,kbl,glk,cfl */
-	I915_WRITE(BDW_SCRATCH1, I915_READ(BDW_SCRATCH1) |
-		   GEN9_LBS_SLA_RETRY_TIMER_DECREMENT_ENABLE);
-
-	/* WaDisableKillLogic:bxt,skl,kbl */
-	if (!IS_COFFEELAKE(dev_priv))
-		I915_WRITE(GAM_ECOCHK, I915_READ(GAM_ECOCHK) |
-			   ECOCHK_DIS_TLB);
-
-	if (HAS_LLC(dev_priv)) {
-		/* WaCompressedResourceSamplerPbeMediaNewHashMode:skl,kbl
-		 *
-		 * Must match Display Engine. See
-		 * WaCompressedResourceDisplayNewHashMode.
-		 */
-		WA_SET_BIT_MASKED(COMMON_SLICE_CHICKEN2,
-				  GEN9_PBE_COMPRESSED_HASH_SELECTION);
-		WA_SET_BIT_MASKED(GEN9_HALF_SLICE_CHICKEN7,
-				  GEN9_SAMPLER_HASH_COMPRESSED_READ_ADDR);
-
-		I915_WRITE(MMCD_MISC_CTRL,
-			   I915_READ(MMCD_MISC_CTRL) |
-			   MMCD_PCLA |
-			   MMCD_HOTSPOT_EN);
-	}
-
-	/* WaClearFlowControlGpgpuContextSave:skl,bxt,kbl,glk,cfl */
-	/* WaDisablePartialInstShootdown:skl,bxt,kbl,glk,cfl */
-	WA_SET_BIT_MASKED(GEN8_ROW_CHICKEN,
-			  FLOW_CONTROL_ENABLE |
-			  PARTIAL_INSTRUCTION_SHOOTDOWN_DISABLE);
-
-	/* Syncing dependencies between camera and graphics:skl,bxt,kbl */
-	if (!IS_COFFEELAKE(dev_priv))
-		WA_SET_BIT_MASKED(HALF_SLICE_CHICKEN3,
-				  GEN9_DISABLE_OCL_OOB_SUPPRESS_LOGIC);
-
-	/* WaDisableDgMirrorFixInHalfSliceChicken5:bxt */
-	if (IS_BXT_REVID(dev_priv, 0, BXT_REVID_A1))
-		WA_CLR_BIT_MASKED(GEN9_HALF_SLICE_CHICKEN5,
-				  GEN9_DG_MIRROR_FIX_ENABLE);
-
-	/* WaSetDisablePixMaskCammingAndRhwoInCommonSliceChicken:bxt */
-	if (IS_BXT_REVID(dev_priv, 0, BXT_REVID_A1)) {
-		WA_SET_BIT_MASKED(GEN7_COMMON_SLICE_CHICKEN1,
-				  GEN9_RHWO_OPTIMIZATION_DISABLE);
-		/*
-		 * WA also requires GEN9_SLICE_COMMON_ECO_CHICKEN0[14:14] to be set
-		 * but we do that in per ctx batchbuffer as there is an issue
-		 * with this register not getting restored on ctx restore
-		 */
-	}
-
-	/* WaEnableYV12BugFixInHalfSliceChicken7:skl,bxt,kbl,glk,cfl */
-	/* WaEnableSamplerGPGPUPreemptionSupport:skl,bxt,kbl,cfl */
-	WA_SET_BIT_MASKED(GEN9_HALF_SLICE_CHICKEN7,
-			  GEN9_ENABLE_YV12_BUGFIX |
-			  GEN9_ENABLE_GPGPU_PREEMPTION);
-
-	/* Wa4x4STCOptimizationDisable:skl,bxt,kbl,glk,cfl */
-	/* WaDisablePartialResolveInVc:skl,bxt,kbl,cfl */
-	WA_SET_BIT_MASKED(CACHE_MODE_1, (GEN8_4x4_STC_OPTIMIZATION_DISABLE |
-					 GEN9_PARTIAL_RESOLVE_IN_VC_DISABLE));
-
-	/* WaCcsTlbPrefetchDisable:skl,bxt,kbl,glk,cfl */
-	WA_CLR_BIT_MASKED(GEN9_HALF_SLICE_CHICKEN5,
-			  GEN9_CCS_TLB_PREFETCH_ENABLE);
-
-	/* WaDisableMaskBasedCammingInRCC:bxt */
-	if (IS_BXT_REVID(dev_priv, 0, BXT_REVID_A1))
-		WA_SET_BIT_MASKED(SLICE_ECO_CHICKEN0,
-				  PIXEL_MASK_CAMMING_DISABLE);
-
-	/* WaForceContextSaveRestoreNonCoherent:skl,bxt,kbl,cfl */
-	WA_SET_BIT_MASKED(HDC_CHICKEN0,
-			  HDC_FORCE_CONTEXT_SAVE_RESTORE_NON_COHERENT |
-			  HDC_FORCE_CSR_NON_COHERENT_OVR_DISABLE);
-
-	/* WaForceEnableNonCoherent and WaDisableHDCInvalidation are
-	 * both tied to WaForceContextSaveRestoreNonCoherent
-	 * in some hsds for skl. We keep the tie for all gen9. The
-	 * documentation is a bit hazy and so we want to get common behaviour,
-	 * even though there is no clear evidence we would need both on kbl/bxt.
-	 * This area has been source of system hangs so we play it safe
-	 * and mimic the skl regardless of what bspec says.
-	 *
-	 * Use Force Non-Coherent whenever executing a 3D context. This
-	 * is a workaround for a possible hang in the unlikely event
-	 * a TLB invalidation occurs during a PSD flush.
-	 */
-
-	/* WaForceEnableNonCoherent:skl,bxt,kbl,cfl */
-	WA_SET_BIT_MASKED(HDC_CHICKEN0,
-			  HDC_FORCE_NON_COHERENT);
-
-	/* WaDisableHDCInvalidation:skl,bxt,kbl,cfl */
-	I915_WRITE(GAM_ECOCHK, I915_READ(GAM_ECOCHK) |
-		   BDW_DISABLE_HDC_INVALIDATION);
-
-	/* WaDisableSamplerPowerBypassForSOPingPong:skl,bxt,kbl,cfl */
-	if (IS_SKYLAKE(dev_priv) ||
-	    IS_KABYLAKE(dev_priv) ||
-	    IS_COFFEELAKE(dev_priv) ||
-	    IS_BXT_REVID(dev_priv, 0, BXT_REVID_B0))
-		WA_SET_BIT_MASKED(HALF_SLICE_CHICKEN3,
-				  GEN8_SAMPLER_POWER_BYPASS_DIS);
-
-	/* WaDisableSTUnitPowerOptimization:skl,bxt,kbl,glk,cfl */
-	WA_SET_BIT_MASKED(HALF_SLICE_CHICKEN2, GEN8_ST_PO_DISABLE);
-
-	/* WaOCLCoherentLineFlush:skl,bxt,kbl,cfl */
-	I915_WRITE(GEN8_L3SQCREG4, (I915_READ(GEN8_L3SQCREG4) |
-				    GEN8_LQSC_FLUSH_COHERENT_LINES));
-
-	/*
-	 * Supporting preemption with fine-granularity requires changes in the
-	 * batch buffer programming. Since we can't break old userspace, we
-	 * need to set our default preemption level to safe value. Userspace is
-	 * still able to use more fine-grained preemption levels, since in
-	 * WaEnablePreemptionGranularityControlByUMD we're whitelisting the
-	 * per-ctx register. As such, WaDisable{3D,GPGPU}MidCmdPreemption are
-	 * not real HW workarounds, but merely a way to start using preemption
-	 * while maintaining old contract with userspace.
-	 */
-
-	/* WaDisable3DMidCmdPreemption:skl,bxt,glk,cfl,[cnl] */
-	WA_CLR_BIT_MASKED(GEN8_CS_CHICKEN1, GEN9_PREEMPT_3D_OBJECT_LEVEL);
-
-	/* WaDisableGPGPUMidCmdPreemption:skl,bxt,blk,cfl,[cnl] */
-	WA_SET_FIELD_MASKED(GEN8_CS_CHICKEN1, GEN9_PREEMPT_GPGPU_LEVEL_MASK,
-			    GEN9_PREEMPT_GPGPU_COMMAND_LEVEL);
-
-	/* WaVFEStateAfterPipeControlwithMediaStateClear:skl,bxt,glk,cfl */
-	ret = wa_ring_whitelist_reg(engine, GEN9_CTX_PREEMPT_REG);
-	if (ret)
-		return ret;
-
-	/* WaEnablePreemptionGranularityControlByUMD:skl,bxt,kbl,cfl,[cnl] */
-	I915_WRITE(GEN7_FF_SLICE_CS_CHICKEN1,
-		   _MASKED_BIT_ENABLE(GEN9_FFSC_PERCTX_PREEMPT_CTRL));
-	ret = wa_ring_whitelist_reg(engine, GEN8_CS_CHICKEN1);
-	if (ret)
-		return ret;
-
-	/* WaAllowUMDToModifyHDCChicken1:skl,bxt,kbl,glk,cfl */
-	ret = wa_ring_whitelist_reg(engine, GEN8_HDC_CHICKEN1);
-	if (ret)
-		return ret;
-
-	return 0;
-}
-
-static int skl_tune_iz_hashing(struct intel_engine_cs *engine)
-{
-	struct drm_i915_private *dev_priv = engine->i915;
-	u8 vals[3] = { 0, 0, 0 };
-	unsigned int i;
-
-	for (i = 0; i < 3; i++) {
-		u8 ss;
-
-		/*
-		 * Only consider slices where one, and only one, subslice has 7
-		 * EUs
-		 */
-		if (!is_power_of_2(INTEL_INFO(dev_priv)->sseu.subslice_7eu[i]))
-			continue;
-
-		/*
-		 * subslice_7eu[i] != 0 (because of the check above) and
-		 * ss_max == 4 (maximum number of subslices possible per slice)
-		 *
-		 * ->    0 <= ss <= 3;
-		 */
-		ss = ffs(INTEL_INFO(dev_priv)->sseu.subslice_7eu[i]) - 1;
-		vals[i] = 3 - ss;
-	}
-
-	if (vals[0] == 0 && vals[1] == 0 && vals[2] == 0)
-		return 0;
-
-	/* Tune IZ hashing. See intel_device_info_runtime_init() */
-	WA_SET_FIELD_MASKED(GEN7_GT_MODE,
-			    GEN9_IZ_HASHING_MASK(2) |
-			    GEN9_IZ_HASHING_MASK(1) |
-			    GEN9_IZ_HASHING_MASK(0),
-			    GEN9_IZ_HASHING(2, vals[2]) |
-			    GEN9_IZ_HASHING(1, vals[1]) |
-			    GEN9_IZ_HASHING(0, vals[0]));
-
-	return 0;
-}
-
-static int skl_init_workarounds(struct intel_engine_cs *engine)
-{
-	struct drm_i915_private *dev_priv = engine->i915;
-	int ret;
-
-	ret = gen9_init_workarounds(engine);
-	if (ret)
-		return ret;
-
-	/* WaEnableGapsTsvCreditFix:skl */
-	I915_WRITE(GEN8_GARBCNTL, (I915_READ(GEN8_GARBCNTL) |
-				   GEN9_GAPS_TSV_CREDIT_DISABLE));
-
-	/* WaDisableGafsUnitClkGating:skl */
-	I915_WRITE(GEN7_UCGCTL4, (I915_READ(GEN7_UCGCTL4) |
-				  GEN8_EU_GAUNIT_CLOCK_GATE_DISABLE));
-
-	/* WaInPlaceDecompressionHang:skl */
-	if (IS_SKL_REVID(dev_priv, SKL_REVID_H0, REVID_FOREVER))
-		I915_WRITE(GEN9_GAMT_ECO_REG_RW_IA,
-			   (I915_READ(GEN9_GAMT_ECO_REG_RW_IA) |
-			    GAMT_ECO_ENABLE_IN_PLACE_DECOMPRESS));
-
-	/* WaDisableLSQCROPERFforOCL:skl */
-	ret = wa_ring_whitelist_reg(engine, GEN8_L3SQCREG4);
-	if (ret)
-		return ret;
-
-	return skl_tune_iz_hashing(engine);
-}
-
-static int bxt_init_workarounds(struct intel_engine_cs *engine)
-{
-	struct drm_i915_private *dev_priv = engine->i915;
-	int ret;
-
-	ret = gen9_init_workarounds(engine);
-	if (ret)
-		return ret;
-
-	/* WaStoreMultiplePTEenable:bxt */
-	/* This is a requirement according to Hardware specification */
-	if (IS_BXT_REVID(dev_priv, 0, BXT_REVID_A1))
-		I915_WRITE(TILECTL, I915_READ(TILECTL) | TILECTL_TLBPF);
-
-	/* WaSetClckGatingDisableMedia:bxt */
-	if (IS_BXT_REVID(dev_priv, 0, BXT_REVID_A1)) {
-		I915_WRITE(GEN7_MISCCPCTL, (I915_READ(GEN7_MISCCPCTL) &
-					    ~GEN8_DOP_CLOCK_GATE_MEDIA_ENABLE));
-	}
-
-	/* WaDisableThreadStallDopClockGating:bxt */
-	WA_SET_BIT_MASKED(GEN8_ROW_CHICKEN,
-			  STALL_DOP_GATING_DISABLE);
-
-	/* WaDisablePooledEuLoadBalancingFix:bxt */
-	if (IS_BXT_REVID(dev_priv, BXT_REVID_B0, REVID_FOREVER)) {
-		I915_WRITE(FF_SLICE_CS_CHICKEN2,
-			   _MASKED_BIT_ENABLE(GEN9_POOLED_EU_LOAD_BALANCING_FIX_DISABLE));
-	}
-
-	/* WaDisableSbeCacheDispatchPortSharing:bxt */
-	if (IS_BXT_REVID(dev_priv, 0, BXT_REVID_B0)) {
-		WA_SET_BIT_MASKED(
-			GEN7_HALF_SLICE_CHICKEN1,
-			GEN7_SBE_SS_CACHE_DISPATCH_PORT_SHARING_DISABLE);
-	}
-
-	/* WaDisableObjectLevelPreemptionForTrifanOrPolygon:bxt */
-	/* WaDisableObjectLevelPreemptionForInstancedDraw:bxt */
-	/* WaDisableObjectLevelPreemtionForInstanceId:bxt */
-	/* WaDisableLSQCROPERFforOCL:bxt */
-	if (IS_BXT_REVID(dev_priv, 0, BXT_REVID_A1)) {
-		ret = wa_ring_whitelist_reg(engine, GEN9_CS_DEBUG_MODE1);
-		if (ret)
-			return ret;
-
-		ret = wa_ring_whitelist_reg(engine, GEN8_L3SQCREG4);
-		if (ret)
-			return ret;
-	}
-
-	/* WaProgramL3SqcReg1DefaultForPerf:bxt */
-	if (IS_BXT_REVID(dev_priv, BXT_REVID_B0, REVID_FOREVER)) {
-		u32 val = I915_READ(GEN8_L3SQCREG1);
-		val &= ~L3_PRIO_CREDITS_MASK;
-		val |= L3_GENERAL_PRIO_CREDITS(62) | L3_HIGH_PRIO_CREDITS(2);
-		I915_WRITE(GEN8_L3SQCREG1, val);
-	}
-
-	/* WaToEnableHwFixForPushConstHWBug:bxt */
-	if (IS_BXT_REVID(dev_priv, BXT_REVID_C0, REVID_FOREVER))
-		WA_SET_BIT_MASKED(COMMON_SLICE_CHICKEN2,
-				  GEN8_SBE_DISABLE_REPLAY_BUF_OPTIMIZATION);
-
-	/* WaInPlaceDecompressionHang:bxt */
-	if (IS_BXT_REVID(dev_priv, BXT_REVID_C0, REVID_FOREVER))
-		I915_WRITE(GEN9_GAMT_ECO_REG_RW_IA,
-			   (I915_READ(GEN9_GAMT_ECO_REG_RW_IA) |
-			    GAMT_ECO_ENABLE_IN_PLACE_DECOMPRESS));
-
-	return 0;
-}
-
-static int cnl_init_workarounds(struct intel_engine_cs *engine)
-{
-	struct drm_i915_private *dev_priv = engine->i915;
-	int ret;
-
-	/* WaDisableI2mCycleOnWRPort:cnl (pre-prod) */
-	if (IS_CNL_REVID(dev_priv, CNL_REVID_B0, CNL_REVID_B0))
-		I915_WRITE(GAMT_CHKN_BIT_REG,
-			   (I915_READ(GAMT_CHKN_BIT_REG) |
-			    GAMT_CHKN_DISABLE_I2M_CYCLE_ON_WR_PORT));
-
-	/* WaForceContextSaveRestoreNonCoherent:cnl */
-	WA_SET_BIT_MASKED(CNL_HDC_CHICKEN0,
-			  HDC_FORCE_CONTEXT_SAVE_RESTORE_NON_COHERENT);
-
-	/* WaThrottleEUPerfToAvoidTDBackPressure:cnl(pre-prod) */
-	if (IS_CNL_REVID(dev_priv, CNL_REVID_B0, CNL_REVID_B0))
-		WA_SET_BIT_MASKED(GEN8_ROW_CHICKEN, THROTTLE_12_5);
-
-	/* WaDisableReplayBufferBankArbitrationOptimization:cnl */
-	WA_SET_BIT_MASKED(COMMON_SLICE_CHICKEN2,
-			  GEN8_SBE_DISABLE_REPLAY_BUF_OPTIMIZATION);
-
-	/* WaDisableEnhancedSBEVertexCaching:cnl (pre-prod) */
-	if (IS_CNL_REVID(dev_priv, 0, CNL_REVID_B0))
-		WA_SET_BIT_MASKED(COMMON_SLICE_CHICKEN2,
-				  GEN8_CSC2_SBE_VUE_CACHE_CONSERVATIVE);
-
-	/* WaInPlaceDecompressionHang:cnl */
-	I915_WRITE(GEN9_GAMT_ECO_REG_RW_IA,
-		   (I915_READ(GEN9_GAMT_ECO_REG_RW_IA) |
-		    GAMT_ECO_ENABLE_IN_PLACE_DECOMPRESS));
-
-	/* WaPushConstantDereferenceHoldDisable:cnl */
-	WA_SET_BIT_MASKED(GEN7_ROW_CHICKEN2, PUSH_CONSTANT_DEREF_DISABLE);
-
-	/* FtrEnableFastAnisoL1BankingFix: cnl */
-	WA_SET_BIT_MASKED(HALF_SLICE_CHICKEN3, CNL_FAST_ANISO_L1_BANKING_FIX);
-
-	/* WaDisable3DMidCmdPreemption:cnl */
-	WA_CLR_BIT_MASKED(GEN8_CS_CHICKEN1, GEN9_PREEMPT_3D_OBJECT_LEVEL);
-
-	/* WaDisableGPGPUMidCmdPreemption:cnl */
-	WA_SET_FIELD_MASKED(GEN8_CS_CHICKEN1, GEN9_PREEMPT_GPGPU_LEVEL_MASK,
-			    GEN9_PREEMPT_GPGPU_COMMAND_LEVEL);
-
-	/* WaEnablePreemptionGranularityControlByUMD:cnl */
-	I915_WRITE(GEN7_FF_SLICE_CS_CHICKEN1,
-		   _MASKED_BIT_ENABLE(GEN9_FFSC_PERCTX_PREEMPT_CTRL));
-	ret= wa_ring_whitelist_reg(engine, GEN8_CS_CHICKEN1);
-	if (ret)
-		return ret;
-
-	return 0;
-}
-
-static int kbl_init_workarounds(struct intel_engine_cs *engine)
-{
-	struct drm_i915_private *dev_priv = engine->i915;
-	int ret;
-
-	ret = gen9_init_workarounds(engine);
-	if (ret)
-		return ret;
-
-	/* WaEnableGapsTsvCreditFix:kbl */
-	I915_WRITE(GEN8_GARBCNTL, (I915_READ(GEN8_GARBCNTL) |
-				   GEN9_GAPS_TSV_CREDIT_DISABLE));
-
-	/* WaDisableDynamicCreditSharing:kbl */
-	if (IS_KBL_REVID(dev_priv, 0, KBL_REVID_B0))
-		I915_WRITE(GAMT_CHKN_BIT_REG,
-			   (I915_READ(GAMT_CHKN_BIT_REG) |
-			    GAMT_CHKN_DISABLE_DYNAMIC_CREDIT_SHARING));
-
-	/* WaDisableFenceDestinationToSLM:kbl (pre-prod) */
-	if (IS_KBL_REVID(dev_priv, KBL_REVID_A0, KBL_REVID_A0))
-		WA_SET_BIT_MASKED(HDC_CHICKEN0,
-				  HDC_FENCE_DEST_SLM_DISABLE);
-
-	/* WaToEnableHwFixForPushConstHWBug:kbl */
-	if (IS_KBL_REVID(dev_priv, KBL_REVID_C0, REVID_FOREVER))
-		WA_SET_BIT_MASKED(COMMON_SLICE_CHICKEN2,
-				  GEN8_SBE_DISABLE_REPLAY_BUF_OPTIMIZATION);
-
-	/* WaDisableGafsUnitClkGating:kbl */
-	I915_WRITE(GEN7_UCGCTL4, (I915_READ(GEN7_UCGCTL4) |
-				  GEN8_EU_GAUNIT_CLOCK_GATE_DISABLE));
-
-	/* WaDisableSbeCacheDispatchPortSharing:kbl */
-	WA_SET_BIT_MASKED(
-		GEN7_HALF_SLICE_CHICKEN1,
-		GEN7_SBE_SS_CACHE_DISPATCH_PORT_SHARING_DISABLE);
-
-	/* WaInPlaceDecompressionHang:kbl */
-	I915_WRITE(GEN9_GAMT_ECO_REG_RW_IA,
-		   (I915_READ(GEN9_GAMT_ECO_REG_RW_IA) |
-		    GAMT_ECO_ENABLE_IN_PLACE_DECOMPRESS));
-
-	/* WaDisableLSQCROPERFforOCL:kbl */
-	ret = wa_ring_whitelist_reg(engine, GEN8_L3SQCREG4);
-	if (ret)
-		return ret;
-
-	return 0;
-}
-
-static int glk_init_workarounds(struct intel_engine_cs *engine)
-{
-	struct drm_i915_private *dev_priv = engine->i915;
-	int ret;
-
-	ret = gen9_init_workarounds(engine);
-	if (ret)
-		return ret;
-
-	/* WaToEnableHwFixForPushConstHWBug:glk */
-	WA_SET_BIT_MASKED(COMMON_SLICE_CHICKEN2,
-			  GEN8_SBE_DISABLE_REPLAY_BUF_OPTIMIZATION);
-
-	return 0;
-}
-
-static int cfl_init_workarounds(struct intel_engine_cs *engine)
-{
-	struct drm_i915_private *dev_priv = engine->i915;
-	int ret;
-
-	ret = gen9_init_workarounds(engine);
-	if (ret)
-		return ret;
-
-	/* WaEnableGapsTsvCreditFix:cfl */
-	I915_WRITE(GEN8_GARBCNTL, (I915_READ(GEN8_GARBCNTL) |
-				   GEN9_GAPS_TSV_CREDIT_DISABLE));
-
-	/* WaToEnableHwFixForPushConstHWBug:cfl */
-	WA_SET_BIT_MASKED(COMMON_SLICE_CHICKEN2,
-			  GEN8_SBE_DISABLE_REPLAY_BUF_OPTIMIZATION);
-
-	/* WaDisableGafsUnitClkGating:cfl */
-	I915_WRITE(GEN7_UCGCTL4, (I915_READ(GEN7_UCGCTL4) |
-				  GEN8_EU_GAUNIT_CLOCK_GATE_DISABLE));
-
-	/* WaDisableSbeCacheDispatchPortSharing:cfl */
-	WA_SET_BIT_MASKED(
-		GEN7_HALF_SLICE_CHICKEN1,
-		GEN7_SBE_SS_CACHE_DISPATCH_PORT_SHARING_DISABLE);
-
-	/* WaInPlaceDecompressionHang:cfl */
-	I915_WRITE(GEN9_GAMT_ECO_REG_RW_IA,
-		   (I915_READ(GEN9_GAMT_ECO_REG_RW_IA) |
-		    GAMT_ECO_ENABLE_IN_PLACE_DECOMPRESS));
-
-	return 0;
-}
-
-int init_workarounds_ring(struct intel_engine_cs *engine)
-{
-	struct drm_i915_private *dev_priv = engine->i915;
-	int err;
-
-	WARN_ON(engine->id != RCS);
-
-	dev_priv->workarounds.count = 0;
-	dev_priv->workarounds.hw_whitelist_count[engine->id] = 0;
-
-	if (IS_BROADWELL(dev_priv))
-		err = bdw_init_workarounds(engine);
-	else if (IS_CHERRYVIEW(dev_priv))
-		err = chv_init_workarounds(engine);
-	else if (IS_SKYLAKE(dev_priv))
-		err =  skl_init_workarounds(engine);
-	else if (IS_BROXTON(dev_priv))
-		err = bxt_init_workarounds(engine);
-	else if (IS_KABYLAKE(dev_priv))
-		err = kbl_init_workarounds(engine);
-	else if (IS_GEMINILAKE(dev_priv))
-		err =  glk_init_workarounds(engine);
-	else if (IS_COFFEELAKE(dev_priv))
-		err = cfl_init_workarounds(engine);
-	else if (IS_CANNONLAKE(dev_priv))
-		err = cnl_init_workarounds(engine);
-	else
-		err = 0;
-	if (err)
-		return err;
-
-	DRM_DEBUG_DRIVER("%s: Number of context specific w/a: %d\n",
-			 engine->name, dev_priv->workarounds.count);
-	return 0;
-}
-
-int intel_ring_workarounds_emit(struct drm_i915_gem_request *req)
-{
-	struct i915_workarounds *w = &req->i915->workarounds;
-	u32 *cs;
-	int ret, i;
-
-	if (w->count == 0)
-		return 0;
-
-	ret = req->engine->emit_flush(req, EMIT_BARRIER);
-	if (ret)
-		return ret;
-
-	cs = intel_ring_begin(req, (w->count * 2 + 2));
-	if (IS_ERR(cs))
-		return PTR_ERR(cs);
-
-	*cs++ = MI_LOAD_REGISTER_IMM(w->count);
-	for (i = 0; i < w->count; i++) {
-		*cs++ = i915_mmio_reg_offset(w->reg[i].addr);
-		*cs++ = w->reg[i].value;
-	}
-	*cs++ = MI_NOOP;
-
-	intel_ring_advance(req, cs);
-
-	ret = req->engine->emit_flush(req, EMIT_BARRIER);
-	if (ret)
-		return ret;
-
-	return 0;
-}
-
 static bool ring_is_idle(struct intel_engine_cs *engine)
 {
 	struct drm_i915_private *dev_priv = engine->i915;
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 6840ec8..911df0c 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -137,6 +137,7 @@
 #include <drm/i915_drm.h>
 #include "i915_drv.h"
 #include "intel_mocs.h"
+#include "intel_workarounds.h"
 
 #define RING_EXECLIST_QFULL		(1 << 0x2)
 #define RING_EXECLIST1_VALID		(1 << 0x3)
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 47fadf8..1c721b2 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -33,6 +33,7 @@
 #include <drm/i915_drm.h>
 #include "i915_trace.h"
 #include "intel_drv.h"
+#include "intel_workarounds.h"
 
 /* Rough estimate of the typical request size, performing a flush,
  * set-context and then emitting the batch.
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index 69ad875..8613e4a 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -766,9 +766,6 @@ static inline u32 intel_engine_last_submit(struct intel_engine_cs *engine)
 	return READ_ONCE(engine->timeline->seqno);
 }
 
-int init_workarounds_ring(struct intel_engine_cs *engine);
-int intel_ring_workarounds_emit(struct drm_i915_gem_request *req);
-
 void intel_engine_get_instdone(struct intel_engine_cs *engine,
 			       struct intel_instdone *instdone);
 
diff --git a/drivers/gpu/drm/i915/intel_workarounds.c b/drivers/gpu/drm/i915/intel_workarounds.c
new file mode 100644
index 0000000..5f597d1
--- /dev/null
+++ b/drivers/gpu/drm/i915/intel_workarounds.c
@@ -0,0 +1,708 @@
+/*
+ * Copyright © 2017 Intel Corporation
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ *
+ */
+
+#include "i915_drv.h"
+#include "intel_workarounds.h"
+
+static int wa_add(struct drm_i915_private *dev_priv,
+		  i915_reg_t addr,
+		  const u32 mask, const u32 val)
+{
+	const u32 idx = dev_priv->workarounds.count;
+
+	if (WARN_ON(idx >= I915_MAX_WA_REGS))
+		return -ENOSPC;
+
+	dev_priv->workarounds.reg[idx].addr = addr;
+	dev_priv->workarounds.reg[idx].value = val;
+	dev_priv->workarounds.reg[idx].mask = mask;
+
+	dev_priv->workarounds.count++;
+
+	return 0;
+}
+
+#define WA_REG(addr, mask, val) do { \
+		const int r = wa_add(dev_priv, (addr), (mask), (val)); \
+		if (r) \
+			return r; \
+	} while (0)
+
+#define WA_SET_BIT_MASKED(addr, mask) \
+	WA_REG(addr, (mask), _MASKED_BIT_ENABLE(mask))
+
+#define WA_CLR_BIT_MASKED(addr, mask) \
+	WA_REG(addr, (mask), _MASKED_BIT_DISABLE(mask))
+
+#define WA_SET_FIELD_MASKED(addr, mask, value) \
+	WA_REG(addr, mask, _MASKED_FIELD(mask, value))
+
+static int wa_ring_whitelist_reg(struct intel_engine_cs *engine,
+				 i915_reg_t reg)
+{
+	struct drm_i915_private *dev_priv = engine->i915;
+	struct i915_workarounds *wa = &dev_priv->workarounds;
+	const uint32_t index = wa->hw_whitelist_count[engine->id];
+
+	if (WARN_ON(index >= RING_MAX_NONPRIV_SLOTS))
+		return -EINVAL;
+
+	I915_WRITE(RING_FORCE_TO_NONPRIV(engine->mmio_base, index),
+		   i915_mmio_reg_offset(reg));
+	wa->hw_whitelist_count[engine->id]++;
+
+	return 0;
+}
+
+static int gen8_init_workarounds(struct intel_engine_cs *engine)
+{
+	struct drm_i915_private *dev_priv = engine->i915;
+
+	WA_SET_BIT_MASKED(INSTPM, INSTPM_FORCE_ORDERING);
+
+	/* WaDisableAsyncFlipPerfMode:bdw,chv */
+	WA_SET_BIT_MASKED(MI_MODE, ASYNC_FLIP_PERF_DISABLE);
+
+	/* WaDisablePartialInstShootdown:bdw,chv */
+	WA_SET_BIT_MASKED(GEN8_ROW_CHICKEN,
+			  PARTIAL_INSTRUCTION_SHOOTDOWN_DISABLE);
+
+	/* Use Force Non-Coherent whenever executing a 3D context. This is a
+	 * workaround for a possible hang in the unlikely event a TLB
+	 * invalidation occurs during a PSD flush.
+	 */
+	/* WaForceEnableNonCoherent:bdw,chv */
+	/* WaHdcDisableFetchWhenMasked:bdw,chv */
+	WA_SET_BIT_MASKED(HDC_CHICKEN0,
+			  HDC_DONOT_FETCH_MEM_WHEN_MASKED |
+			  HDC_FORCE_NON_COHERENT);
+
+	/* From the Haswell PRM, Command Reference: Registers, CACHE_MODE_0:
+	 * "The Hierarchical Z RAW Stall Optimization allows non-overlapping
+	 *  polygons in the same 8x4 pixel/sample area to be processed without
+	 *  stalling waiting for the earlier ones to write to Hierarchical Z
+	 *  buffer."
+	 *
+	 * This optimization is off by default for BDW and CHV; turn it on.
+	 */
+	WA_CLR_BIT_MASKED(CACHE_MODE_0_GEN7, HIZ_RAW_STALL_OPT_DISABLE);
+
+	/* Wa4x4STCOptimizationDisable:bdw,chv */
+	WA_SET_BIT_MASKED(CACHE_MODE_1, GEN8_4x4_STC_OPTIMIZATION_DISABLE);
+
+	/*
+	 * BSpec recommends 8x4 when MSAA is used,
+	 * however in practice 16x4 seems fastest.
+	 *
+	 * Note that PS/WM thread counts depend on the WIZ hashing
+	 * disable bit, which we don't touch here, but it's good
+	 * to keep in mind (see 3DSTATE_PS and 3DSTATE_WM).
+	 */
+	WA_SET_FIELD_MASKED(GEN7_GT_MODE,
+			    GEN6_WIZ_HASHING_MASK,
+			    GEN6_WIZ_HASHING_16x4);
+
+	return 0;
+}
+
+static int bdw_init_workarounds(struct intel_engine_cs *engine)
+{
+	struct drm_i915_private *dev_priv = engine->i915;
+	int ret;
+
+	ret = gen8_init_workarounds(engine);
+	if (ret)
+		return ret;
+
+	/* WaDisableThreadStallDopClockGating:bdw (pre-production) */
+	WA_SET_BIT_MASKED(GEN8_ROW_CHICKEN, STALL_DOP_GATING_DISABLE);
+
+	/* WaDisableDopClockGating:bdw
+	 *
+	 * Also see the related UCGTCL1 write in broadwell_init_clock_gating()
+	 * to disable EUTC clock gating.
+	 */
+	WA_SET_BIT_MASKED(GEN7_ROW_CHICKEN2,
+			  DOP_CLOCK_GATING_DISABLE);
+
+	WA_SET_BIT_MASKED(HALF_SLICE_CHICKEN3,
+			  GEN8_SAMPLER_POWER_BYPASS_DIS);
+
+	WA_SET_BIT_MASKED(HDC_CHICKEN0,
+			  /* WaForceContextSaveRestoreNonCoherent:bdw */
+			  HDC_FORCE_CONTEXT_SAVE_RESTORE_NON_COHERENT |
+			  /* WaDisableFenceDestinationToSLM:bdw (pre-prod) */
+			  (IS_BDW_GT3(dev_priv) ? HDC_FENCE_DEST_SLM_DISABLE : 0));
+
+	return 0;
+}
+
+static int chv_init_workarounds(struct intel_engine_cs *engine)
+{
+	struct drm_i915_private *dev_priv = engine->i915;
+	int ret;
+
+	ret = gen8_init_workarounds(engine);
+	if (ret)
+		return ret;
+
+	/* WaDisableThreadStallDopClockGating:chv */
+	WA_SET_BIT_MASKED(GEN8_ROW_CHICKEN, STALL_DOP_GATING_DISABLE);
+
+	/* Improve HiZ throughput on CHV. */
+	WA_SET_BIT_MASKED(HIZ_CHICKEN, CHV_HZ_8X8_MODE_IN_1X);
+
+	return 0;
+}
+
+static int gen9_init_workarounds(struct intel_engine_cs *engine)
+{
+	struct drm_i915_private *dev_priv = engine->i915;
+	int ret;
+
+	/* WaConextSwitchWithConcurrentTLBInvalidate:skl,bxt,kbl,glk,cfl */
+	I915_WRITE(GEN9_CSFE_CHICKEN1_RCS, _MASKED_BIT_ENABLE(GEN9_PREEMPT_GPGPU_SYNC_SWITCH_DISABLE));
+
+	/* WaEnableLbsSlaRetryTimerDecrement:skl,bxt,kbl,glk,cfl */
+	I915_WRITE(BDW_SCRATCH1, I915_READ(BDW_SCRATCH1) |
+		   GEN9_LBS_SLA_RETRY_TIMER_DECREMENT_ENABLE);
+
+	/* WaDisableKillLogic:bxt,skl,kbl */
+	if (!IS_COFFEELAKE(dev_priv))
+		I915_WRITE(GAM_ECOCHK, I915_READ(GAM_ECOCHK) |
+			   ECOCHK_DIS_TLB);
+
+	if (HAS_LLC(dev_priv)) {
+		/* WaCompressedResourceSamplerPbeMediaNewHashMode:skl,kbl
+		 *
+		 * Must match Display Engine. See
+		 * WaCompressedResourceDisplayNewHashMode.
+		 */
+		WA_SET_BIT_MASKED(COMMON_SLICE_CHICKEN2,
+				  GEN9_PBE_COMPRESSED_HASH_SELECTION);
+		WA_SET_BIT_MASKED(GEN9_HALF_SLICE_CHICKEN7,
+				  GEN9_SAMPLER_HASH_COMPRESSED_READ_ADDR);
+
+		I915_WRITE(MMCD_MISC_CTRL,
+			   I915_READ(MMCD_MISC_CTRL) |
+			   MMCD_PCLA |
+			   MMCD_HOTSPOT_EN);
+	}
+
+	/* WaClearFlowControlGpgpuContextSave:skl,bxt,kbl,glk,cfl */
+	/* WaDisablePartialInstShootdown:skl,bxt,kbl,glk,cfl */
+	WA_SET_BIT_MASKED(GEN8_ROW_CHICKEN,
+			  FLOW_CONTROL_ENABLE |
+			  PARTIAL_INSTRUCTION_SHOOTDOWN_DISABLE);
+
+	/* Syncing dependencies between camera and graphics:skl,bxt,kbl */
+	if (!IS_COFFEELAKE(dev_priv))
+		WA_SET_BIT_MASKED(HALF_SLICE_CHICKEN3,
+				  GEN9_DISABLE_OCL_OOB_SUPPRESS_LOGIC);
+
+	/* WaDisableDgMirrorFixInHalfSliceChicken5:bxt */
+	if (IS_BXT_REVID(dev_priv, 0, BXT_REVID_A1))
+		WA_CLR_BIT_MASKED(GEN9_HALF_SLICE_CHICKEN5,
+				  GEN9_DG_MIRROR_FIX_ENABLE);
+
+	/* WaSetDisablePixMaskCammingAndRhwoInCommonSliceChicken:bxt */
+	if (IS_BXT_REVID(dev_priv, 0, BXT_REVID_A1)) {
+		WA_SET_BIT_MASKED(GEN7_COMMON_SLICE_CHICKEN1,
+				  GEN9_RHWO_OPTIMIZATION_DISABLE);
+		/*
+		 * WA also requires GEN9_SLICE_COMMON_ECO_CHICKEN0[14:14] to be set
+		 * but we do that in per ctx batchbuffer as there is an issue
+		 * with this register not getting restored on ctx restore
+		 */
+	}
+
+	/* WaEnableYV12BugFixInHalfSliceChicken7:skl,bxt,kbl,glk,cfl */
+	/* WaEnableSamplerGPGPUPreemptionSupport:skl,bxt,kbl,cfl */
+	WA_SET_BIT_MASKED(GEN9_HALF_SLICE_CHICKEN7,
+			  GEN9_ENABLE_YV12_BUGFIX |
+			  GEN9_ENABLE_GPGPU_PREEMPTION);
+
+	/* Wa4x4STCOptimizationDisable:skl,bxt,kbl,glk,cfl */
+	/* WaDisablePartialResolveInVc:skl,bxt,kbl,cfl */
+	WA_SET_BIT_MASKED(CACHE_MODE_1, (GEN8_4x4_STC_OPTIMIZATION_DISABLE |
+					 GEN9_PARTIAL_RESOLVE_IN_VC_DISABLE));
+
+	/* WaCcsTlbPrefetchDisable:skl,bxt,kbl,glk,cfl */
+	WA_CLR_BIT_MASKED(GEN9_HALF_SLICE_CHICKEN5,
+			  GEN9_CCS_TLB_PREFETCH_ENABLE);
+
+	/* WaDisableMaskBasedCammingInRCC:bxt */
+	if (IS_BXT_REVID(dev_priv, 0, BXT_REVID_A1))
+		WA_SET_BIT_MASKED(SLICE_ECO_CHICKEN0,
+				  PIXEL_MASK_CAMMING_DISABLE);
+
+	/* WaForceContextSaveRestoreNonCoherent:skl,bxt,kbl,cfl */
+	WA_SET_BIT_MASKED(HDC_CHICKEN0,
+			  HDC_FORCE_CONTEXT_SAVE_RESTORE_NON_COHERENT |
+			  HDC_FORCE_CSR_NON_COHERENT_OVR_DISABLE);
+
+	/* WaForceEnableNonCoherent and WaDisableHDCInvalidation are
+	 * both tied to WaForceContextSaveRestoreNonCoherent
+	 * in some hsds for skl. We keep the tie for all gen9. The
+	 * documentation is a bit hazy and so we want to get common behaviour,
+	 * even though there is no clear evidence we would need both on kbl/bxt.
+	 * This area has been source of system hangs so we play it safe
+	 * and mimic the skl regardless of what bspec says.
+	 *
+	 * Use Force Non-Coherent whenever executing a 3D context. This
+	 * is a workaround for a possible hang in the unlikely event
+	 * a TLB invalidation occurs during a PSD flush.
+	 */
+
+	/* WaForceEnableNonCoherent:skl,bxt,kbl,cfl */
+	WA_SET_BIT_MASKED(HDC_CHICKEN0,
+			  HDC_FORCE_NON_COHERENT);
+
+	/* WaDisableHDCInvalidation:skl,bxt,kbl,cfl */
+	I915_WRITE(GAM_ECOCHK, I915_READ(GAM_ECOCHK) |
+		   BDW_DISABLE_HDC_INVALIDATION);
+
+	/* WaDisableSamplerPowerBypassForSOPingPong:skl,bxt,kbl,cfl */
+	if (IS_SKYLAKE(dev_priv) ||
+	    IS_KABYLAKE(dev_priv) ||
+	    IS_COFFEELAKE(dev_priv) ||
+	    IS_BXT_REVID(dev_priv, 0, BXT_REVID_B0))
+		WA_SET_BIT_MASKED(HALF_SLICE_CHICKEN3,
+				  GEN8_SAMPLER_POWER_BYPASS_DIS);
+
+	/* WaDisableSTUnitPowerOptimization:skl,bxt,kbl,glk,cfl */
+	WA_SET_BIT_MASKED(HALF_SLICE_CHICKEN2, GEN8_ST_PO_DISABLE);
+
+	/* WaOCLCoherentLineFlush:skl,bxt,kbl,cfl */
+	I915_WRITE(GEN8_L3SQCREG4, (I915_READ(GEN8_L3SQCREG4) |
+				    GEN8_LQSC_FLUSH_COHERENT_LINES));
+
+	/*
+	 * Supporting preemption with fine-granularity requires changes in the
+	 * batch buffer programming. Since we can't break old userspace, we
+	 * need to set our default preemption level to safe value. Userspace is
+	 * still able to use more fine-grained preemption levels, since in
+	 * WaEnablePreemptionGranularityControlByUMD we're whitelisting the
+	 * per-ctx register. As such, WaDisable{3D,GPGPU}MidCmdPreemption are
+	 * not real HW workarounds, but merely a way to start using preemption
+	 * while maintaining old contract with userspace.
+	 */
+
+	/* WaDisable3DMidCmdPreemption:skl,bxt,glk,cfl,[cnl] */
+	WA_CLR_BIT_MASKED(GEN8_CS_CHICKEN1, GEN9_PREEMPT_3D_OBJECT_LEVEL);
+
+	/* WaDisableGPGPUMidCmdPreemption:skl,bxt,blk,cfl,[cnl] */
+	WA_SET_FIELD_MASKED(GEN8_CS_CHICKEN1, GEN9_PREEMPT_GPGPU_LEVEL_MASK,
+			    GEN9_PREEMPT_GPGPU_COMMAND_LEVEL);
+
+	/* WaVFEStateAfterPipeControlwithMediaStateClear:skl,bxt,glk,cfl */
+	ret = wa_ring_whitelist_reg(engine, GEN9_CTX_PREEMPT_REG);
+	if (ret)
+		return ret;
+
+	/* WaEnablePreemptionGranularityControlByUMD:skl,bxt,kbl,cfl,[cnl] */
+	I915_WRITE(GEN7_FF_SLICE_CS_CHICKEN1,
+		   _MASKED_BIT_ENABLE(GEN9_FFSC_PERCTX_PREEMPT_CTRL));
+	ret = wa_ring_whitelist_reg(engine, GEN8_CS_CHICKEN1);
+	if (ret)
+		return ret;
+
+	/* WaAllowUMDToModifyHDCChicken1:skl,bxt,kbl,glk,cfl */
+	ret = wa_ring_whitelist_reg(engine, GEN8_HDC_CHICKEN1);
+	if (ret)
+		return ret;
+
+	return 0;
+}
+
+static int skl_tune_iz_hashing(struct intel_engine_cs *engine)
+{
+	struct drm_i915_private *dev_priv = engine->i915;
+	u8 vals[3] = { 0, 0, 0 };
+	unsigned int i;
+
+	for (i = 0; i < 3; i++) {
+		u8 ss;
+
+		/*
+		 * Only consider slices where one, and only one, subslice has 7
+		 * EUs
+		 */
+		if (!is_power_of_2(INTEL_INFO(dev_priv)->sseu.subslice_7eu[i]))
+			continue;
+
+		/*
+		 * subslice_7eu[i] != 0 (because of the check above) and
+		 * ss_max == 4 (maximum number of subslices possible per slice)
+		 *
+		 * ->    0 <= ss <= 3;
+		 */
+		ss = ffs(INTEL_INFO(dev_priv)->sseu.subslice_7eu[i]) - 1;
+		vals[i] = 3 - ss;
+	}
+
+	if (vals[0] == 0 && vals[1] == 0 && vals[2] == 0)
+		return 0;
+
+	/* Tune IZ hashing. See intel_device_info_runtime_init() */
+	WA_SET_FIELD_MASKED(GEN7_GT_MODE,
+			    GEN9_IZ_HASHING_MASK(2) |
+			    GEN9_IZ_HASHING_MASK(1) |
+			    GEN9_IZ_HASHING_MASK(0),
+			    GEN9_IZ_HASHING(2, vals[2]) |
+			    GEN9_IZ_HASHING(1, vals[1]) |
+			    GEN9_IZ_HASHING(0, vals[0]));
+
+	return 0;
+}
+
+static int skl_init_workarounds(struct intel_engine_cs *engine)
+{
+	struct drm_i915_private *dev_priv = engine->i915;
+	int ret;
+
+	ret = gen9_init_workarounds(engine);
+	if (ret)
+		return ret;
+
+	/* WaEnableGapsTsvCreditFix:skl */
+	I915_WRITE(GEN8_GARBCNTL, (I915_READ(GEN8_GARBCNTL) |
+				   GEN9_GAPS_TSV_CREDIT_DISABLE));
+
+	/* WaDisableGafsUnitClkGating:skl */
+	I915_WRITE(GEN7_UCGCTL4, (I915_READ(GEN7_UCGCTL4) |
+				  GEN8_EU_GAUNIT_CLOCK_GATE_DISABLE));
+
+	/* WaInPlaceDecompressionHang:skl */
+	if (IS_SKL_REVID(dev_priv, SKL_REVID_H0, REVID_FOREVER))
+		I915_WRITE(GEN9_GAMT_ECO_REG_RW_IA,
+			   (I915_READ(GEN9_GAMT_ECO_REG_RW_IA) |
+			    GAMT_ECO_ENABLE_IN_PLACE_DECOMPRESS));
+
+	/* WaDisableLSQCROPERFforOCL:skl */
+	ret = wa_ring_whitelist_reg(engine, GEN8_L3SQCREG4);
+	if (ret)
+		return ret;
+
+	return skl_tune_iz_hashing(engine);
+}
+
+static int bxt_init_workarounds(struct intel_engine_cs *engine)
+{
+	struct drm_i915_private *dev_priv = engine->i915;
+	int ret;
+
+	ret = gen9_init_workarounds(engine);
+	if (ret)
+		return ret;
+
+	/* WaStoreMultiplePTEenable:bxt */
+	/* This is a requirement according to Hardware specification */
+	if (IS_BXT_REVID(dev_priv, 0, BXT_REVID_A1))
+		I915_WRITE(TILECTL, I915_READ(TILECTL) | TILECTL_TLBPF);
+
+	/* WaSetClckGatingDisableMedia:bxt */
+	if (IS_BXT_REVID(dev_priv, 0, BXT_REVID_A1)) {
+		I915_WRITE(GEN7_MISCCPCTL, (I915_READ(GEN7_MISCCPCTL) &
+					    ~GEN8_DOP_CLOCK_GATE_MEDIA_ENABLE));
+	}
+
+	/* WaDisableThreadStallDopClockGating:bxt */
+	WA_SET_BIT_MASKED(GEN8_ROW_CHICKEN,
+			  STALL_DOP_GATING_DISABLE);
+
+	/* WaDisablePooledEuLoadBalancingFix:bxt */
+	if (IS_BXT_REVID(dev_priv, BXT_REVID_B0, REVID_FOREVER)) {
+		I915_WRITE(FF_SLICE_CS_CHICKEN2,
+			   _MASKED_BIT_ENABLE(GEN9_POOLED_EU_LOAD_BALANCING_FIX_DISABLE));
+	}
+
+	/* WaDisableSbeCacheDispatchPortSharing:bxt */
+	if (IS_BXT_REVID(dev_priv, 0, BXT_REVID_B0)) {
+		WA_SET_BIT_MASKED(
+			GEN7_HALF_SLICE_CHICKEN1,
+			GEN7_SBE_SS_CACHE_DISPATCH_PORT_SHARING_DISABLE);
+	}
+
+	/* WaDisableObjectLevelPreemptionForTrifanOrPolygon:bxt */
+	/* WaDisableObjectLevelPreemptionForInstancedDraw:bxt */
+	/* WaDisableObjectLevelPreemtionForInstanceId:bxt */
+	/* WaDisableLSQCROPERFforOCL:bxt */
+	if (IS_BXT_REVID(dev_priv, 0, BXT_REVID_A1)) {
+		ret = wa_ring_whitelist_reg(engine, GEN9_CS_DEBUG_MODE1);
+		if (ret)
+			return ret;
+
+		ret = wa_ring_whitelist_reg(engine, GEN8_L3SQCREG4);
+		if (ret)
+			return ret;
+	}
+
+	/* WaProgramL3SqcReg1DefaultForPerf:bxt */
+	if (IS_BXT_REVID(dev_priv, BXT_REVID_B0, REVID_FOREVER)) {
+		u32 val = I915_READ(GEN8_L3SQCREG1);
+		val &= ~L3_PRIO_CREDITS_MASK;
+		val |= L3_GENERAL_PRIO_CREDITS(62) | L3_HIGH_PRIO_CREDITS(2);
+		I915_WRITE(GEN8_L3SQCREG1, val);
+	}
+
+	/* WaToEnableHwFixForPushConstHWBug:bxt */
+	if (IS_BXT_REVID(dev_priv, BXT_REVID_C0, REVID_FOREVER))
+		WA_SET_BIT_MASKED(COMMON_SLICE_CHICKEN2,
+				  GEN8_SBE_DISABLE_REPLAY_BUF_OPTIMIZATION);
+
+	/* WaInPlaceDecompressionHang:bxt */
+	if (IS_BXT_REVID(dev_priv, BXT_REVID_C0, REVID_FOREVER))
+		I915_WRITE(GEN9_GAMT_ECO_REG_RW_IA,
+			   (I915_READ(GEN9_GAMT_ECO_REG_RW_IA) |
+			    GAMT_ECO_ENABLE_IN_PLACE_DECOMPRESS));
+
+	return 0;
+}
+
+static int cnl_init_workarounds(struct intel_engine_cs *engine)
+{
+	struct drm_i915_private *dev_priv = engine->i915;
+	int ret;
+
+	/* WaDisableI2mCycleOnWRPort:cnl (pre-prod) */
+	if (IS_CNL_REVID(dev_priv, CNL_REVID_B0, CNL_REVID_B0))
+		I915_WRITE(GAMT_CHKN_BIT_REG,
+			   (I915_READ(GAMT_CHKN_BIT_REG) |
+			    GAMT_CHKN_DISABLE_I2M_CYCLE_ON_WR_PORT));
+
+	/* WaForceContextSaveRestoreNonCoherent:cnl */
+	WA_SET_BIT_MASKED(CNL_HDC_CHICKEN0,
+			  HDC_FORCE_CONTEXT_SAVE_RESTORE_NON_COHERENT);
+
+	/* WaThrottleEUPerfToAvoidTDBackPressure:cnl(pre-prod) */
+	if (IS_CNL_REVID(dev_priv, CNL_REVID_B0, CNL_REVID_B0))
+		WA_SET_BIT_MASKED(GEN8_ROW_CHICKEN, THROTTLE_12_5);
+
+	/* WaDisableReplayBufferBankArbitrationOptimization:cnl */
+	WA_SET_BIT_MASKED(COMMON_SLICE_CHICKEN2,
+			  GEN8_SBE_DISABLE_REPLAY_BUF_OPTIMIZATION);
+
+	/* WaDisableEnhancedSBEVertexCaching:cnl (pre-prod) */
+	if (IS_CNL_REVID(dev_priv, 0, CNL_REVID_B0))
+		WA_SET_BIT_MASKED(COMMON_SLICE_CHICKEN2,
+				  GEN8_CSC2_SBE_VUE_CACHE_CONSERVATIVE);
+
+	/* WaInPlaceDecompressionHang:cnl */
+	I915_WRITE(GEN9_GAMT_ECO_REG_RW_IA,
+		   (I915_READ(GEN9_GAMT_ECO_REG_RW_IA) |
+		    GAMT_ECO_ENABLE_IN_PLACE_DECOMPRESS));
+
+	/* WaPushConstantDereferenceHoldDisable:cnl */
+	WA_SET_BIT_MASKED(GEN7_ROW_CHICKEN2, PUSH_CONSTANT_DEREF_DISABLE);
+
+	/* FtrEnableFastAnisoL1BankingFix: cnl */
+	WA_SET_BIT_MASKED(HALF_SLICE_CHICKEN3, CNL_FAST_ANISO_L1_BANKING_FIX);
+
+	/* WaDisable3DMidCmdPreemption:cnl */
+	WA_CLR_BIT_MASKED(GEN8_CS_CHICKEN1, GEN9_PREEMPT_3D_OBJECT_LEVEL);
+
+	/* WaDisableGPGPUMidCmdPreemption:cnl */
+	WA_SET_FIELD_MASKED(GEN8_CS_CHICKEN1, GEN9_PREEMPT_GPGPU_LEVEL_MASK,
+			    GEN9_PREEMPT_GPGPU_COMMAND_LEVEL);
+
+	/* WaEnablePreemptionGranularityControlByUMD:cnl */
+	I915_WRITE(GEN7_FF_SLICE_CS_CHICKEN1,
+		   _MASKED_BIT_ENABLE(GEN9_FFSC_PERCTX_PREEMPT_CTRL));
+	ret = wa_ring_whitelist_reg(engine, GEN8_CS_CHICKEN1);
+	if (ret)
+		return ret;
+
+	return 0;
+}
+
+static int kbl_init_workarounds(struct intel_engine_cs *engine)
+{
+	struct drm_i915_private *dev_priv = engine->i915;
+	int ret;
+
+	ret = gen9_init_workarounds(engine);
+	if (ret)
+		return ret;
+
+	/* WaEnableGapsTsvCreditFix:kbl */
+	I915_WRITE(GEN8_GARBCNTL, (I915_READ(GEN8_GARBCNTL) |
+				   GEN9_GAPS_TSV_CREDIT_DISABLE));
+
+	/* WaDisableDynamicCreditSharing:kbl */
+	if (IS_KBL_REVID(dev_priv, 0, KBL_REVID_B0))
+		I915_WRITE(GAMT_CHKN_BIT_REG,
+			   (I915_READ(GAMT_CHKN_BIT_REG) |
+			    GAMT_CHKN_DISABLE_DYNAMIC_CREDIT_SHARING));
+
+	/* WaDisableFenceDestinationToSLM:kbl (pre-prod) */
+	if (IS_KBL_REVID(dev_priv, KBL_REVID_A0, KBL_REVID_A0))
+		WA_SET_BIT_MASKED(HDC_CHICKEN0,
+				  HDC_FENCE_DEST_SLM_DISABLE);
+
+	/* WaToEnableHwFixForPushConstHWBug:kbl */
+	if (IS_KBL_REVID(dev_priv, KBL_REVID_C0, REVID_FOREVER))
+		WA_SET_BIT_MASKED(COMMON_SLICE_CHICKEN2,
+				  GEN8_SBE_DISABLE_REPLAY_BUF_OPTIMIZATION);
+
+	/* WaDisableGafsUnitClkGating:kbl */
+	I915_WRITE(GEN7_UCGCTL4, (I915_READ(GEN7_UCGCTL4) |
+				  GEN8_EU_GAUNIT_CLOCK_GATE_DISABLE));
+
+	/* WaDisableSbeCacheDispatchPortSharing:kbl */
+	WA_SET_BIT_MASKED(
+		GEN7_HALF_SLICE_CHICKEN1,
+		GEN7_SBE_SS_CACHE_DISPATCH_PORT_SHARING_DISABLE);
+
+	/* WaInPlaceDecompressionHang:kbl */
+	I915_WRITE(GEN9_GAMT_ECO_REG_RW_IA,
+		   (I915_READ(GEN9_GAMT_ECO_REG_RW_IA) |
+		    GAMT_ECO_ENABLE_IN_PLACE_DECOMPRESS));
+
+	/* WaDisableLSQCROPERFforOCL:kbl */
+	ret = wa_ring_whitelist_reg(engine, GEN8_L3SQCREG4);
+	if (ret)
+		return ret;
+
+	return 0;
+}
+
+static int glk_init_workarounds(struct intel_engine_cs *engine)
+{
+	struct drm_i915_private *dev_priv = engine->i915;
+	int ret;
+
+	ret = gen9_init_workarounds(engine);
+	if (ret)
+		return ret;
+
+	/* WaToEnableHwFixForPushConstHWBug:glk */
+	WA_SET_BIT_MASKED(COMMON_SLICE_CHICKEN2,
+			  GEN8_SBE_DISABLE_REPLAY_BUF_OPTIMIZATION);
+
+	return 0;
+}
+
+static int cfl_init_workarounds(struct intel_engine_cs *engine)
+{
+	struct drm_i915_private *dev_priv = engine->i915;
+	int ret;
+
+	ret = gen9_init_workarounds(engine);
+	if (ret)
+		return ret;
+
+	/* WaEnableGapsTsvCreditFix:cfl */
+	I915_WRITE(GEN8_GARBCNTL, (I915_READ(GEN8_GARBCNTL) |
+				   GEN9_GAPS_TSV_CREDIT_DISABLE));
+
+	/* WaToEnableHwFixForPushConstHWBug:cfl */
+	WA_SET_BIT_MASKED(COMMON_SLICE_CHICKEN2,
+			  GEN8_SBE_DISABLE_REPLAY_BUF_OPTIMIZATION);
+
+	/* WaDisableGafsUnitClkGating:cfl */
+	I915_WRITE(GEN7_UCGCTL4, (I915_READ(GEN7_UCGCTL4) |
+				  GEN8_EU_GAUNIT_CLOCK_GATE_DISABLE));
+
+	/* WaDisableSbeCacheDispatchPortSharing:cfl */
+	WA_SET_BIT_MASKED(
+		GEN7_HALF_SLICE_CHICKEN1,
+		GEN7_SBE_SS_CACHE_DISPATCH_PORT_SHARING_DISABLE);
+
+	/* WaInPlaceDecompressionHang:cfl */
+	I915_WRITE(GEN9_GAMT_ECO_REG_RW_IA,
+		   (I915_READ(GEN9_GAMT_ECO_REG_RW_IA) |
+		    GAMT_ECO_ENABLE_IN_PLACE_DECOMPRESS));
+
+	return 0;
+}
+
+int init_workarounds_ring(struct intel_engine_cs *engine)
+{
+	struct drm_i915_private *dev_priv = engine->i915;
+	int err;
+
+	WARN_ON(engine->id != RCS);
+
+	dev_priv->workarounds.count = 0;
+	dev_priv->workarounds.hw_whitelist_count[engine->id] = 0;
+
+	if (IS_BROADWELL(dev_priv))
+		err = bdw_init_workarounds(engine);
+	else if (IS_CHERRYVIEW(dev_priv))
+		err = chv_init_workarounds(engine);
+	else if (IS_SKYLAKE(dev_priv))
+		err = skl_init_workarounds(engine);
+	else if (IS_BROXTON(dev_priv))
+		err = bxt_init_workarounds(engine);
+	else if (IS_KABYLAKE(dev_priv))
+		err = kbl_init_workarounds(engine);
+	else if (IS_GEMINILAKE(dev_priv))
+		err = glk_init_workarounds(engine);
+	else if (IS_COFFEELAKE(dev_priv))
+		err = cfl_init_workarounds(engine);
+	else if (IS_CANNONLAKE(dev_priv))
+		err = cnl_init_workarounds(engine);
+	else
+		err = 0;
+	if (err)
+		return err;
+
+	DRM_DEBUG_DRIVER("%s: Number of context specific w/a: %d\n",
+			 engine->name, dev_priv->workarounds.count);
+	return 0;
+}
+
+int intel_ring_workarounds_emit(struct drm_i915_gem_request *req)
+{
+	struct i915_workarounds *w = &req->i915->workarounds;
+	u32 *cs;
+	int ret, i;
+
+	if (w->count == 0)
+		return 0;
+
+	ret = req->engine->emit_flush(req, EMIT_BARRIER);
+	if (ret)
+		return ret;
+
+	cs = intel_ring_begin(req, (w->count * 2 + 2));
+	if (IS_ERR(cs))
+		return PTR_ERR(cs);
+
+	*cs++ = MI_LOAD_REGISTER_IMM(w->count);
+	for (i = 0; i < w->count; i++) {
+		*cs++ = i915_mmio_reg_offset(w->reg[i].addr);
+		*cs++ = w->reg[i].value;
+	}
+	*cs++ = MI_NOOP;
+
+	intel_ring_advance(req, cs);
+
+	ret = req->engine->emit_flush(req, EMIT_BARRIER);
+	if (ret)
+		return ret;
+
+	return 0;
+}
diff --git a/drivers/gpu/drm/i915/intel_workarounds.h b/drivers/gpu/drm/i915/intel_workarounds.h
new file mode 100644
index 0000000..27099fc
--- /dev/null
+++ b/drivers/gpu/drm/i915/intel_workarounds.h
@@ -0,0 +1,31 @@
+/*
+ * Copyright © 2017 Intel Corporation
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ *
+ */
+
+#ifndef _I915_WORKAROUNDS_H_
+#define _I915_WORKAROUNDS_H_
+
+int init_workarounds_ring(struct intel_engine_cs *engine);
+int intel_ring_workarounds_emit(struct drm_i915_gem_request *req);
+
+#endif
-- 
1.9.1


* [RFC PATCH 03/20] drm/i915: Split out functions for different kinds of workarounds
  2017-11-03 18:09 [RFC PATCH v5 00/20] Refactor HW workaround code Oscar Mateo
  2017-11-03 18:09 ` [RFC PATCH 01/20] drm/i915: Remove Gen9 WAs with no effect Oscar Mateo
  2017-11-03 18:09 ` [RFC PATCH 02/20] drm/i915: Move a bunch of workaround-related code to its own file Oscar Mateo
@ 2017-11-03 18:09 ` Oscar Mateo
  2017-11-03 18:09 ` [RFC PATCH 04/20] drm/i915: Transform context WAs into static tables Oscar Mateo
                   ` (19 subsequent siblings)
  22 siblings, 0 replies; 30+ messages in thread
From: Oscar Mateo @ 2017-11-03 18:09 UTC (permalink / raw)
  To: intel-gfx

There are different kinds of workarounds (those that modify registers that
live in the context image, those that modify global registers, those that
whitelist registers, etc.) and they have different requirements in terms
of where they are applied and how. Also, by splitting them apart, it should
be easier to decide where a new workaround should go.
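
To make the three categories concrete, here is a rough userspace sketch in C.
It is not driver code: struct fake_gpu, ctx_wa_add(), gt_wa_set_bits(),
whitelist_wa_add() and ctx_wa_emit() are hypothetical names invented for
illustration, and the header and trailing words written by ctx_wa_emit() are
placeholders rather than real command encodings. Only the masked-register
layout (mask in the upper 16 bits, value in the lower 16) and the
"two dwords per register plus two" sizing are meant to mirror the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins; none of these names come from the i915 driver. */
#define MAX_CTX_WAS		16
#define MAX_WHITELIST_SLOTS	12
#define FAKE_MMIO_WORDS		64

struct ctx_wa {
	uint32_t offset;
	uint32_t value;
};

struct fake_gpu {
	uint32_t mmio[FAKE_MMIO_WORDS];		/* global registers */
	struct ctx_wa ctx_was[MAX_CTX_WAS];	/* recorded, replayed per context */
	size_t ctx_wa_count;
	uint32_t whitelist[MAX_WHITELIST_SLOTS];/* offsets userspace may write */
	size_t whitelist_count;
};

/* Masked register: the upper 16 bits select which lower bits get written. */
static uint32_t masked_field(uint32_t mask, uint32_t value)
{
	assert(mask <= 0xffff && (value & ~mask) == 0);
	return (mask << 16) | value;
}

/* Kind 1, context WA: only recorded here, emitted later for each context. */
static int ctx_wa_add(struct fake_gpu *gpu, uint32_t offset, uint32_t value)
{
	if (gpu->ctx_wa_count >= MAX_CTX_WAS)
		return -1;
	gpu->ctx_was[gpu->ctx_wa_count].offset = offset;
	gpu->ctx_was[gpu->ctx_wa_count].value = value;
	gpu->ctx_wa_count++;
	return 0;
}

/* Kind 2, GT WA: read-modify-write on a global register at HW init time. */
static void gt_wa_set_bits(struct fake_gpu *gpu, uint32_t index, uint32_t bits)
{
	assert(index < FAKE_MMIO_WORDS);
	gpu->mmio[index] |= bits;
}

/* Kind 3, whitelist WA: consume one of a limited number of slots. */
static int whitelist_wa_add(struct fake_gpu *gpu, uint32_t offset)
{
	if (gpu->whitelist_count >= MAX_WHITELIST_SLOTS)
		return -1;
	gpu->whitelist[gpu->whitelist_count++] = offset;
	return 0;
}

/*
 * Serialise the recorded context WAs as: header, (offset, value) pairs,
 * padding. Returns the number of dwords written, or 0 if there is nothing
 * to emit or the buffer is too small.
 */
static size_t ctx_wa_emit(const struct fake_gpu *gpu, uint32_t *out,
			  size_t out_len)
{
	size_t needed = gpu->ctx_wa_count * 2 + 2;
	size_t i, n = 0;

	if (gpu->ctx_wa_count == 0 || out_len < needed)
		return 0;

	out[n++] = (uint32_t)gpu->ctx_wa_count;	/* placeholder header */
	for (i = 0; i < gpu->ctx_wa_count; i++) {
		out[n++] = gpu->ctx_was[i].offset;
		out[n++] = gpu->ctx_was[i].value;
	}
	out[n++] = 0;				/* placeholder padding */

	return n;
}
```

The point of the sketch is only that the three kinds have different
lifetimes: context WAs are recorded once and replayed into every new context,
GT WAs are written directly and have to be redone whenever the global state is
lost, and whitelist entries draw from a small per-engine pool of slots.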

v2:
  - Add multiple MISSING_CASE
  - Rebased

v3:
  - Rename mmio_workarounds to gt_workarounds (Chris, Mika)
  - Create empty placeholders for BDW and CHV GT WAs
  - Rebased

Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
---
 drivers/gpu/drm/i915/i915_gem.c          |   3 +
 drivers/gpu/drm/i915/i915_gem_context.c  |   5 +
 drivers/gpu/drm/i915/intel_lrc.c         |  10 +-
 drivers/gpu/drm/i915/intel_ringbuffer.c  |   4 +-
 drivers/gpu/drm/i915/intel_workarounds.c | 714 +++++++++++++++++++------------
 drivers/gpu/drm/i915/intel_workarounds.h |   8 +-
 6 files changed, 458 insertions(+), 286 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index e43688f..750e014 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -35,6 +35,7 @@
 #include "intel_drv.h"
 #include "intel_frontbuffer.h"
 #include "intel_mocs.h"
+#include "intel_workarounds.h"
 #include "i915_gemfs.h"
 #include <linux/dma-fence-array.h>
 #include <linux/kthread.h>
@@ -4916,6 +4917,8 @@ int i915_gem_init_hw(struct drm_i915_private *dev_priv)
 		}
 	}
 
+	intel_gt_workarounds_apply(dev_priv);
+
 	i915_gem_init_swizzling(dev_priv);
 
 	/*
diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index 10affb3..8548e571 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -90,6 +90,7 @@
 #include <drm/i915_drm.h>
 #include "i915_drv.h"
 #include "i915_trace.h"
+#include "intel_workarounds.h"
 
 #define ALL_L3_SLICES(dev) (1 << NUM_L3_SLICES(dev)) - 1
 
@@ -456,6 +457,10 @@ int i915_gem_contexts_init(struct drm_i915_private *dev_priv)
 
 	GEM_BUG_ON(dev_priv->kernel_context);
 
+	err = intel_ctx_workarounds_init(dev_priv);
+	if (err)
+		goto err;
+
 	INIT_LIST_HEAD(&dev_priv->contexts.list);
 	INIT_WORK(&dev_priv->contexts.free_work, contexts_free_worker);
 	init_llist_head(&dev_priv->contexts.free_list);
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 911df0c..f0b4d2f 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -1503,7 +1503,7 @@ static int gen8_init_render_ring(struct intel_engine_cs *engine)
 
 	I915_WRITE(INSTPM, _MASKED_BIT_ENABLE(INSTPM_FORCE_ORDERING));
 
-	return init_workarounds_ring(engine);
+	return 0;
 }
 
 static int gen9_init_render_ring(struct intel_engine_cs *engine)
@@ -1514,7 +1514,11 @@ static int gen9_init_render_ring(struct intel_engine_cs *engine)
 	if (ret)
 		return ret;
 
-	return init_workarounds_ring(engine);
+	ret = intel_whitelist_workarounds_apply(engine);
+	if (ret)
+		return ret;
+
+	return 0;
 }
 
 static void reset_common_ring(struct intel_engine_cs *engine,
@@ -1830,7 +1834,7 @@ static int gen8_init_rcs_context(struct drm_i915_gem_request *req)
 {
 	int ret;
 
-	ret = intel_ring_workarounds_emit(req);
+	ret = intel_ctx_workarounds_emit(req);
 	if (ret)
 		return ret;
 
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 1c721b2..b053fed 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -649,7 +649,7 @@ static int intel_rcs_ctx_init(struct drm_i915_gem_request *req)
 {
 	int ret;
 
-	ret = intel_ring_workarounds_emit(req);
+	ret = intel_ctx_workarounds_emit(req);
 	if (ret != 0)
 		return ret;
 
@@ -708,7 +708,7 @@ static int init_render_ring(struct intel_engine_cs *engine)
 	if (INTEL_INFO(dev_priv)->gen >= 6)
 		I915_WRITE_IMR(engine, ~engine->irq_keep_mask);
 
-	return init_workarounds_ring(engine);
+	return 0;
 }
 
 static void render_ring_cleanup(struct intel_engine_cs *engine)
diff --git a/drivers/gpu/drm/i915/intel_workarounds.c b/drivers/gpu/drm/i915/intel_workarounds.c
index 5f597d1..0a8f265 100644
--- a/drivers/gpu/drm/i915/intel_workarounds.c
+++ b/drivers/gpu/drm/i915/intel_workarounds.c
@@ -58,27 +58,8 @@ static int wa_add(struct drm_i915_private *dev_priv,
 #define WA_SET_FIELD_MASKED(addr, mask, value) \
 	WA_REG(addr, mask, _MASKED_FIELD(mask, value))
 
-static int wa_ring_whitelist_reg(struct intel_engine_cs *engine,
-				 i915_reg_t reg)
-{
-	struct drm_i915_private *dev_priv = engine->i915;
-	struct i915_workarounds *wa = &dev_priv->workarounds;
-	const uint32_t index = wa->hw_whitelist_count[engine->id];
-
-	if (WARN_ON(index >= RING_MAX_NONPRIV_SLOTS))
-		return -EINVAL;
-
-	I915_WRITE(RING_FORCE_TO_NONPRIV(engine->mmio_base, index),
-		   i915_mmio_reg_offset(reg));
-	wa->hw_whitelist_count[engine->id]++;
-
-	return 0;
-}
-
-static int gen8_init_workarounds(struct intel_engine_cs *engine)
+static int gen8_ctx_workarounds_init(struct drm_i915_private *dev_priv)
 {
-	struct drm_i915_private *dev_priv = engine->i915;
-
 	WA_SET_BIT_MASKED(INSTPM, INSTPM_FORCE_ORDERING);
 
 	/* WaDisableAsyncFlipPerfMode:bdw,chv */
@@ -126,12 +107,11 @@ static int gen8_init_workarounds(struct intel_engine_cs *engine)
 	return 0;
 }
 
-static int bdw_init_workarounds(struct intel_engine_cs *engine)
+static int bdw_ctx_workarounds_init(struct drm_i915_private *dev_priv)
 {
-	struct drm_i915_private *dev_priv = engine->i915;
 	int ret;
 
-	ret = gen8_init_workarounds(engine);
+	ret = gen8_ctx_workarounds_init(dev_priv);
 	if (ret)
 		return ret;
 
@@ -158,12 +138,11 @@ static int bdw_init_workarounds(struct intel_engine_cs *engine)
 	return 0;
 }
 
-static int chv_init_workarounds(struct intel_engine_cs *engine)
+static int chv_ctx_workarounds_init(struct drm_i915_private *dev_priv)
 {
-	struct drm_i915_private *dev_priv = engine->i915;
 	int ret;
 
-	ret = gen8_init_workarounds(engine);
+	ret = gen8_ctx_workarounds_init(dev_priv);
 	if (ret)
 		return ret;
 
@@ -176,23 +155,8 @@ static int chv_init_workarounds(struct intel_engine_cs *engine)
 	return 0;
 }
 
-static int gen9_init_workarounds(struct intel_engine_cs *engine)
+static int gen9_ctx_workarounds_init(struct drm_i915_private *dev_priv)
 {
-	struct drm_i915_private *dev_priv = engine->i915;
-	int ret;
-
-	/* WaConextSwitchWithConcurrentTLBInvalidate:skl,bxt,kbl,glk,cfl */
-	I915_WRITE(GEN9_CSFE_CHICKEN1_RCS, _MASKED_BIT_ENABLE(GEN9_PREEMPT_GPGPU_SYNC_SWITCH_DISABLE));
-
-	/* WaEnableLbsSlaRetryTimerDecrement:skl,bxt,kbl,glk,cfl */
-	I915_WRITE(BDW_SCRATCH1, I915_READ(BDW_SCRATCH1) |
-		   GEN9_LBS_SLA_RETRY_TIMER_DECREMENT_ENABLE);
-
-	/* WaDisableKillLogic:bxt,skl,kbl */
-	if (!IS_COFFEELAKE(dev_priv))
-		I915_WRITE(GAM_ECOCHK, I915_READ(GAM_ECOCHK) |
-			   ECOCHK_DIS_TLB);
-
 	if (HAS_LLC(dev_priv)) {
 		/* WaCompressedResourceSamplerPbeMediaNewHashMode:skl,kbl
 		 *
@@ -203,11 +167,6 @@ static int gen9_init_workarounds(struct intel_engine_cs *engine)
 				  GEN9_PBE_COMPRESSED_HASH_SELECTION);
 		WA_SET_BIT_MASKED(GEN9_HALF_SLICE_CHICKEN7,
 				  GEN9_SAMPLER_HASH_COMPRESSED_READ_ADDR);
-
-		I915_WRITE(MMCD_MISC_CTRL,
-			   I915_READ(MMCD_MISC_CTRL) |
-			   MMCD_PCLA |
-			   MMCD_HOTSPOT_EN);
 	}
 
 	/* WaClearFlowControlGpgpuContextSave:skl,bxt,kbl,glk,cfl */
@@ -279,10 +238,6 @@ static int gen9_init_workarounds(struct intel_engine_cs *engine)
 	WA_SET_BIT_MASKED(HDC_CHICKEN0,
 			  HDC_FORCE_NON_COHERENT);
 
-	/* WaDisableHDCInvalidation:skl,bxt,kbl,cfl */
-	I915_WRITE(GAM_ECOCHK, I915_READ(GAM_ECOCHK) |
-		   BDW_DISABLE_HDC_INVALIDATION);
-
 	/* WaDisableSamplerPowerBypassForSOPingPong:skl,bxt,kbl,cfl */
 	if (IS_SKYLAKE(dev_priv) ||
 	    IS_KABYLAKE(dev_priv) ||
@@ -294,10 +249,6 @@ static int gen9_init_workarounds(struct intel_engine_cs *engine)
 	/* WaDisableSTUnitPowerOptimization:skl,bxt,kbl,glk,cfl */
 	WA_SET_BIT_MASKED(HALF_SLICE_CHICKEN2, GEN8_ST_PO_DISABLE);
 
-	/* WaOCLCoherentLineFlush:skl,bxt,kbl,cfl */
-	I915_WRITE(GEN8_L3SQCREG4, (I915_READ(GEN8_L3SQCREG4) |
-				    GEN8_LQSC_FLUSH_COHERENT_LINES));
-
 	/*
 	 * Supporting preemption with fine-granularity requires changes in the
 	 * batch buffer programming. Since we can't break old userspace, we
@@ -316,29 +267,11 @@ static int gen9_init_workarounds(struct intel_engine_cs *engine)
 	WA_SET_FIELD_MASKED(GEN8_CS_CHICKEN1, GEN9_PREEMPT_GPGPU_LEVEL_MASK,
 			    GEN9_PREEMPT_GPGPU_COMMAND_LEVEL);
 
-	/* WaVFEStateAfterPipeControlwithMediaStateClear:skl,bxt,glk,cfl */
-	ret = wa_ring_whitelist_reg(engine, GEN9_CTX_PREEMPT_REG);
-	if (ret)
-		return ret;
-
-	/* WaEnablePreemptionGranularityControlByUMD:skl,bxt,kbl,cfl,[cnl] */
-	I915_WRITE(GEN7_FF_SLICE_CS_CHICKEN1,
-		   _MASKED_BIT_ENABLE(GEN9_FFSC_PERCTX_PREEMPT_CTRL));
-	ret = wa_ring_whitelist_reg(engine, GEN8_CS_CHICKEN1);
-	if (ret)
-		return ret;
-
-	/* WaAllowUMDToModifyHDCChicken1:skl,bxt,kbl,glk,cfl */
-	ret = wa_ring_whitelist_reg(engine, GEN8_HDC_CHICKEN1);
-	if (ret)
-		return ret;
-
 	return 0;
 }
 
-static int skl_tune_iz_hashing(struct intel_engine_cs *engine)
+static int skl_tune_iz_hashing(struct drm_i915_private *dev_priv)
 {
-	struct drm_i915_private *dev_priv = engine->i915;
 	u8 vals[3] = { 0, 0, 0 };
 	unsigned int i;
 
@@ -377,67 +310,29 @@ static int skl_tune_iz_hashing(struct intel_engine_cs *engine)
 	return 0;
 }
 
-static int skl_init_workarounds(struct intel_engine_cs *engine)
+static int skl_ctx_workarounds_init(struct drm_i915_private *dev_priv)
 {
-	struct drm_i915_private *dev_priv = engine->i915;
 	int ret;
 
-	ret = gen9_init_workarounds(engine);
-	if (ret)
-		return ret;
-
-	/* WaEnableGapsTsvCreditFix:skl */
-	I915_WRITE(GEN8_GARBCNTL, (I915_READ(GEN8_GARBCNTL) |
-				   GEN9_GAPS_TSV_CREDIT_DISABLE));
-
-	/* WaDisableGafsUnitClkGating:skl */
-	I915_WRITE(GEN7_UCGCTL4, (I915_READ(GEN7_UCGCTL4) |
-				  GEN8_EU_GAUNIT_CLOCK_GATE_DISABLE));
-
-	/* WaInPlaceDecompressionHang:skl */
-	if (IS_SKL_REVID(dev_priv, SKL_REVID_H0, REVID_FOREVER))
-		I915_WRITE(GEN9_GAMT_ECO_REG_RW_IA,
-			   (I915_READ(GEN9_GAMT_ECO_REG_RW_IA) |
-			    GAMT_ECO_ENABLE_IN_PLACE_DECOMPRESS));
-
-	/* WaDisableLSQCROPERFforOCL:skl */
-	ret = wa_ring_whitelist_reg(engine, GEN8_L3SQCREG4);
+	ret = gen9_ctx_workarounds_init(dev_priv);
 	if (ret)
 		return ret;
 
-	return skl_tune_iz_hashing(engine);
+	return skl_tune_iz_hashing(dev_priv);
 }
 
-static int bxt_init_workarounds(struct intel_engine_cs *engine)
+static int bxt_ctx_workarounds_init(struct drm_i915_private *dev_priv)
 {
-	struct drm_i915_private *dev_priv = engine->i915;
 	int ret;
 
-	ret = gen9_init_workarounds(engine);
+	ret = gen9_ctx_workarounds_init(dev_priv);
 	if (ret)
 		return ret;
 
-	/* WaStoreMultiplePTEenable:bxt */
-	/* This is a requirement according to Hardware specification */
-	if (IS_BXT_REVID(dev_priv, 0, BXT_REVID_A1))
-		I915_WRITE(TILECTL, I915_READ(TILECTL) | TILECTL_TLBPF);
-
-	/* WaSetClckGatingDisableMedia:bxt */
-	if (IS_BXT_REVID(dev_priv, 0, BXT_REVID_A1)) {
-		I915_WRITE(GEN7_MISCCPCTL, (I915_READ(GEN7_MISCCPCTL) &
-					    ~GEN8_DOP_CLOCK_GATE_MEDIA_ENABLE));
-	}
-
 	/* WaDisableThreadStallDopClockGating:bxt */
 	WA_SET_BIT_MASKED(GEN8_ROW_CHICKEN,
 			  STALL_DOP_GATING_DISABLE);
 
-	/* WaDisablePooledEuLoadBalancingFix:bxt */
-	if (IS_BXT_REVID(dev_priv, BXT_REVID_B0, REVID_FOREVER)) {
-		I915_WRITE(FF_SLICE_CS_CHICKEN2,
-			   _MASKED_BIT_ENABLE(GEN9_POOLED_EU_LOAD_BALANCING_FIX_DISABLE));
-	}
-
 	/* WaDisableSbeCacheDispatchPortSharing:bxt */
 	if (IS_BXT_REVID(dev_priv, 0, BXT_REVID_B0)) {
 		WA_SET_BIT_MASKED(
@@ -445,117 +340,22 @@ static int bxt_init_workarounds(struct intel_engine_cs *engine)
 			GEN7_SBE_SS_CACHE_DISPATCH_PORT_SHARING_DISABLE);
 	}
 
-	/* WaDisableObjectLevelPreemptionForTrifanOrPolygon:bxt */
-	/* WaDisableObjectLevelPreemptionForInstancedDraw:bxt */
-	/* WaDisableObjectLevelPreemtionForInstanceId:bxt */
-	/* WaDisableLSQCROPERFforOCL:bxt */
-	if (IS_BXT_REVID(dev_priv, 0, BXT_REVID_A1)) {
-		ret = wa_ring_whitelist_reg(engine, GEN9_CS_DEBUG_MODE1);
-		if (ret)
-			return ret;
-
-		ret = wa_ring_whitelist_reg(engine, GEN8_L3SQCREG4);
-		if (ret)
-			return ret;
-	}
-
-	/* WaProgramL3SqcReg1DefaultForPerf:bxt */
-	if (IS_BXT_REVID(dev_priv, BXT_REVID_B0, REVID_FOREVER)) {
-		u32 val = I915_READ(GEN8_L3SQCREG1);
-		val &= ~L3_PRIO_CREDITS_MASK;
-		val |= L3_GENERAL_PRIO_CREDITS(62) | L3_HIGH_PRIO_CREDITS(2);
-		I915_WRITE(GEN8_L3SQCREG1, val);
-	}
-
 	/* WaToEnableHwFixForPushConstHWBug:bxt */
 	if (IS_BXT_REVID(dev_priv, BXT_REVID_C0, REVID_FOREVER))
 		WA_SET_BIT_MASKED(COMMON_SLICE_CHICKEN2,
 				  GEN8_SBE_DISABLE_REPLAY_BUF_OPTIMIZATION);
 
-	/* WaInPlaceDecompressionHang:bxt */
-	if (IS_BXT_REVID(dev_priv, BXT_REVID_C0, REVID_FOREVER))
-		I915_WRITE(GEN9_GAMT_ECO_REG_RW_IA,
-			   (I915_READ(GEN9_GAMT_ECO_REG_RW_IA) |
-			    GAMT_ECO_ENABLE_IN_PLACE_DECOMPRESS));
-
-	return 0;
-}
-
-static int cnl_init_workarounds(struct intel_engine_cs *engine)
-{
-	struct drm_i915_private *dev_priv = engine->i915;
-	int ret;
-
-	/* WaDisableI2mCycleOnWRPort:cnl (pre-prod) */
-	if (IS_CNL_REVID(dev_priv, CNL_REVID_B0, CNL_REVID_B0))
-		I915_WRITE(GAMT_CHKN_BIT_REG,
-			   (I915_READ(GAMT_CHKN_BIT_REG) |
-			    GAMT_CHKN_DISABLE_I2M_CYCLE_ON_WR_PORT));
-
-	/* WaForceContextSaveRestoreNonCoherent:cnl */
-	WA_SET_BIT_MASKED(CNL_HDC_CHICKEN0,
-			  HDC_FORCE_CONTEXT_SAVE_RESTORE_NON_COHERENT);
-
-	/* WaThrottleEUPerfToAvoidTDBackPressure:cnl(pre-prod) */
-	if (IS_CNL_REVID(dev_priv, CNL_REVID_B0, CNL_REVID_B0))
-		WA_SET_BIT_MASKED(GEN8_ROW_CHICKEN, THROTTLE_12_5);
-
-	/* WaDisableReplayBufferBankArbitrationOptimization:cnl */
-	WA_SET_BIT_MASKED(COMMON_SLICE_CHICKEN2,
-			  GEN8_SBE_DISABLE_REPLAY_BUF_OPTIMIZATION);
-
-	/* WaDisableEnhancedSBEVertexCaching:cnl (pre-prod) */
-	if (IS_CNL_REVID(dev_priv, 0, CNL_REVID_B0))
-		WA_SET_BIT_MASKED(COMMON_SLICE_CHICKEN2,
-				  GEN8_CSC2_SBE_VUE_CACHE_CONSERVATIVE);
-
-	/* WaInPlaceDecompressionHang:cnl */
-	I915_WRITE(GEN9_GAMT_ECO_REG_RW_IA,
-		   (I915_READ(GEN9_GAMT_ECO_REG_RW_IA) |
-		    GAMT_ECO_ENABLE_IN_PLACE_DECOMPRESS));
-
-	/* WaPushConstantDereferenceHoldDisable:cnl */
-	WA_SET_BIT_MASKED(GEN7_ROW_CHICKEN2, PUSH_CONSTANT_DEREF_DISABLE);
-
-	/* FtrEnableFastAnisoL1BankingFix: cnl */
-	WA_SET_BIT_MASKED(HALF_SLICE_CHICKEN3, CNL_FAST_ANISO_L1_BANKING_FIX);
-
-	/* WaDisable3DMidCmdPreemption:cnl */
-	WA_CLR_BIT_MASKED(GEN8_CS_CHICKEN1, GEN9_PREEMPT_3D_OBJECT_LEVEL);
-
-	/* WaDisableGPGPUMidCmdPreemption:cnl */
-	WA_SET_FIELD_MASKED(GEN8_CS_CHICKEN1, GEN9_PREEMPT_GPGPU_LEVEL_MASK,
-			    GEN9_PREEMPT_GPGPU_COMMAND_LEVEL);
-
-	/* WaEnablePreemptionGranularityControlByUMD:cnl */
-	I915_WRITE(GEN7_FF_SLICE_CS_CHICKEN1,
-		   _MASKED_BIT_ENABLE(GEN9_FFSC_PERCTX_PREEMPT_CTRL));
-	ret= wa_ring_whitelist_reg(engine, GEN8_CS_CHICKEN1);
-	if (ret)
-		return ret;
-
 	return 0;
 }
 
-static int kbl_init_workarounds(struct intel_engine_cs *engine)
+static int kbl_ctx_workarounds_init(struct drm_i915_private *dev_priv)
 {
-	struct drm_i915_private *dev_priv = engine->i915;
 	int ret;
 
-	ret = gen9_init_workarounds(engine);
+	ret = gen9_ctx_workarounds_init(dev_priv);
 	if (ret)
 		return ret;
 
-	/* WaEnableGapsTsvCreditFix:kbl */
-	I915_WRITE(GEN8_GARBCNTL, (I915_READ(GEN8_GARBCNTL) |
-				   GEN9_GAPS_TSV_CREDIT_DISABLE));
-
-	/* WaDisableDynamicCreditSharing:kbl */
-	if (IS_KBL_REVID(dev_priv, 0, KBL_REVID_B0))
-		I915_WRITE(GAMT_CHKN_BIT_REG,
-			   (I915_READ(GAMT_CHKN_BIT_REG) |
-			    GAMT_CHKN_DISABLE_DYNAMIC_CREDIT_SHARING));
-
 	/* WaDisableFenceDestinationToSLM:kbl (pre-prod) */
 	if (IS_KBL_REVID(dev_priv, KBL_REVID_A0, KBL_REVID_A0))
 		WA_SET_BIT_MASKED(HDC_CHICKEN0,
@@ -566,34 +366,19 @@ static int kbl_init_workarounds(struct intel_engine_cs *engine)
 		WA_SET_BIT_MASKED(COMMON_SLICE_CHICKEN2,
 				  GEN8_SBE_DISABLE_REPLAY_BUF_OPTIMIZATION);
 
-	/* WaDisableGafsUnitClkGating:kbl */
-	I915_WRITE(GEN7_UCGCTL4, (I915_READ(GEN7_UCGCTL4) |
-				  GEN8_EU_GAUNIT_CLOCK_GATE_DISABLE));
-
 	/* WaDisableSbeCacheDispatchPortSharing:kbl */
 	WA_SET_BIT_MASKED(
 		GEN7_HALF_SLICE_CHICKEN1,
 		GEN7_SBE_SS_CACHE_DISPATCH_PORT_SHARING_DISABLE);
 
-	/* WaInPlaceDecompressionHang:kbl */
-	I915_WRITE(GEN9_GAMT_ECO_REG_RW_IA,
-		   (I915_READ(GEN9_GAMT_ECO_REG_RW_IA) |
-		    GAMT_ECO_ENABLE_IN_PLACE_DECOMPRESS));
-
-	/* WaDisableLSQCROPERFforOCL:kbl */
-	ret = wa_ring_whitelist_reg(engine, GEN8_L3SQCREG4);
-	if (ret)
-		return ret;
-
 	return 0;
 }
 
-static int glk_init_workarounds(struct intel_engine_cs *engine)
+static int glk_ctx_workarounds_init(struct drm_i915_private *dev_priv)
 {
-	struct drm_i915_private *dev_priv = engine->i915;
 	int ret;
 
-	ret = gen9_init_workarounds(engine);
+	ret = gen9_ctx_workarounds_init(dev_priv);
 	if (ret)
 		return ret;
 
@@ -604,77 +389,98 @@ static int glk_init_workarounds(struct intel_engine_cs *engine)
 	return 0;
 }
 
-static int cfl_init_workarounds(struct intel_engine_cs *engine)
+static int cfl_ctx_workarounds_init(struct drm_i915_private *dev_priv)
 {
-	struct drm_i915_private *dev_priv = engine->i915;
 	int ret;
 
-	ret = gen9_init_workarounds(engine);
+	ret = gen9_ctx_workarounds_init(dev_priv);
 	if (ret)
 		return ret;
 
-	/* WaEnableGapsTsvCreditFix:cfl */
-	I915_WRITE(GEN8_GARBCNTL, (I915_READ(GEN8_GARBCNTL) |
-				   GEN9_GAPS_TSV_CREDIT_DISABLE));
-
 	/* WaToEnableHwFixForPushConstHWBug:cfl */
 	WA_SET_BIT_MASKED(COMMON_SLICE_CHICKEN2,
 			  GEN8_SBE_DISABLE_REPLAY_BUF_OPTIMIZATION);
 
-	/* WaDisableGafsUnitClkGating:cfl */
-	I915_WRITE(GEN7_UCGCTL4, (I915_READ(GEN7_UCGCTL4) |
-				  GEN8_EU_GAUNIT_CLOCK_GATE_DISABLE));
-
 	/* WaDisableSbeCacheDispatchPortSharing:cfl */
 	WA_SET_BIT_MASKED(
 		GEN7_HALF_SLICE_CHICKEN1,
 		GEN7_SBE_SS_CACHE_DISPATCH_PORT_SHARING_DISABLE);
 
-	/* WaInPlaceDecompressionHang:cfl */
-	I915_WRITE(GEN9_GAMT_ECO_REG_RW_IA,
-		   (I915_READ(GEN9_GAMT_ECO_REG_RW_IA) |
-		    GAMT_ECO_ENABLE_IN_PLACE_DECOMPRESS));
-
 	return 0;
 }
 
-int init_workarounds_ring(struct intel_engine_cs *engine)
+static int cnl_ctx_workarounds_init(struct drm_i915_private *dev_priv)
 {
-	struct drm_i915_private *dev_priv = engine->i915;
-	int err;
-
-	WARN_ON(engine->id != RCS);
+	/* WaForceContextSaveRestoreNonCoherent:cnl */
+	WA_SET_BIT_MASKED(CNL_HDC_CHICKEN0,
+			  HDC_FORCE_CONTEXT_SAVE_RESTORE_NON_COHERENT);
 
-	dev_priv->workarounds.count = 0;
-	dev_priv->workarounds.hw_whitelist_count[engine->id] = 0;
+	/* WaThrottleEUPerfToAvoidTDBackPressure:cnl(pre-prod) */
+	if (IS_CNL_REVID(dev_priv, CNL_REVID_B0, CNL_REVID_B0))
+		WA_SET_BIT_MASKED(GEN8_ROW_CHICKEN, THROTTLE_12_5);
 
-	if (IS_BROADWELL(dev_priv))
-		err = bdw_init_workarounds(engine);
-	else if (IS_CHERRYVIEW(dev_priv))
-		err = chv_init_workarounds(engine);
-	else if (IS_SKYLAKE(dev_priv))
-		err =  skl_init_workarounds(engine);
-	else if (IS_BROXTON(dev_priv))
-		err = bxt_init_workarounds(engine);
-	else if (IS_KABYLAKE(dev_priv))
-		err = kbl_init_workarounds(engine);
-	else if (IS_GEMINILAKE(dev_priv))
-		err =  glk_init_workarounds(engine);
-	else if (IS_COFFEELAKE(dev_priv))
-		err = cfl_init_workarounds(engine);
-	else if (IS_CANNONLAKE(dev_priv))
-		err = cnl_init_workarounds(engine);
-	else
-		err = 0;
-	if (err)
-		return err;
+	/* WaDisableReplayBufferBankArbitrationOptimization:cnl */
+	WA_SET_BIT_MASKED(COMMON_SLICE_CHICKEN2,
+			  GEN8_SBE_DISABLE_REPLAY_BUF_OPTIMIZATION);
 
-	DRM_DEBUG_DRIVER("%s: Number of context specific w/a: %d\n",
-			 engine->name, dev_priv->workarounds.count);
-	return 0;
-}
+	/* WaDisableEnhancedSBEVertexCaching:cnl (pre-prod) */
+	if (IS_CNL_REVID(dev_priv, 0, CNL_REVID_B0))
+		WA_SET_BIT_MASKED(COMMON_SLICE_CHICKEN2,
+				  GEN8_CSC2_SBE_VUE_CACHE_CONSERVATIVE);
 
-int intel_ring_workarounds_emit(struct drm_i915_gem_request *req)
+	/* WaPushConstantDereferenceHoldDisable:cnl */
+	WA_SET_BIT_MASKED(GEN7_ROW_CHICKEN2, PUSH_CONSTANT_DEREF_DISABLE);
+
+	/* FtrEnableFastAnisoL1BankingFix:cnl */
+	WA_SET_BIT_MASKED(HALF_SLICE_CHICKEN3, CNL_FAST_ANISO_L1_BANKING_FIX);
+
+	/* WaDisable3DMidCmdPreemption:cnl */
+	WA_CLR_BIT_MASKED(GEN8_CS_CHICKEN1, GEN9_PREEMPT_3D_OBJECT_LEVEL);
+
+	/* WaDisableGPGPUMidCmdPreemption:cnl */
+	WA_SET_FIELD_MASKED(GEN8_CS_CHICKEN1, GEN9_PREEMPT_GPGPU_LEVEL_MASK,
+			    GEN9_PREEMPT_GPGPU_COMMAND_LEVEL);
+
+	return 0;
+}
+
+int intel_ctx_workarounds_init(struct drm_i915_private *dev_priv)
+{
+	int err;
+
+	dev_priv->workarounds.count = 0;
+
+	if (INTEL_GEN(dev_priv) < 8)
+		err = 0;
+	else if (IS_BROADWELL(dev_priv))
+		err = bdw_ctx_workarounds_init(dev_priv);
+	else if (IS_CHERRYVIEW(dev_priv))
+		err = chv_ctx_workarounds_init(dev_priv);
+	else if (IS_SKYLAKE(dev_priv))
+		err = skl_ctx_workarounds_init(dev_priv);
+	else if (IS_BROXTON(dev_priv))
+		err = bxt_ctx_workarounds_init(dev_priv);
+	else if (IS_KABYLAKE(dev_priv))
+		err = kbl_ctx_workarounds_init(dev_priv);
+	else if (IS_GEMINILAKE(dev_priv))
+		err = glk_ctx_workarounds_init(dev_priv);
+	else if (IS_COFFEELAKE(dev_priv))
+		err = cfl_ctx_workarounds_init(dev_priv);
+	else if (IS_CANNONLAKE(dev_priv))
+		err = cnl_ctx_workarounds_init(dev_priv);
+	else {
+		MISSING_CASE(INTEL_GEN(dev_priv));
+		err = 0;
+	}
+	if (err)
+		return err;
+
+	DRM_DEBUG_DRIVER("Number of context specific w/a: %d\n",
+			 dev_priv->workarounds.count);
+	return 0;
+}
+
+int intel_ctx_workarounds_emit(struct drm_i915_gem_request *req)
 {
 	struct i915_workarounds *w = &req->i915->workarounds;
 	u32 *cs;
@@ -706,3 +512,353 @@ int intel_ring_workarounds_emit(struct drm_i915_gem_request *req)
 
 	return 0;
 }
+
+static void bdw_gt_workarounds_apply(struct drm_i915_private *dev_priv)
+{
+}
+
+static void chv_gt_workarounds_apply(struct drm_i915_private *dev_priv)
+{
+}
+
+static void gen9_gt_workarounds_apply(struct drm_i915_private *dev_priv)
+{
+	if (HAS_LLC(dev_priv)) {
+		/* WaCompressedResourceSamplerPbeMediaNewHashMode:skl,kbl
+		 *
+		 * Must match Display Engine. See
+		 * WaCompressedResourceDisplayNewHashMode.
+		 */
+		I915_WRITE(MMCD_MISC_CTRL,
+			   I915_READ(MMCD_MISC_CTRL) |
+			   MMCD_PCLA |
+			   MMCD_HOTSPOT_EN);
+	}
+
+	/* WaContextSwitchWithConcurrentTLBInvalidate:skl,bxt,kbl,glk,cfl */
+	I915_WRITE(GEN9_CSFE_CHICKEN1_RCS,
+		   _MASKED_BIT_ENABLE(GEN9_PREEMPT_GPGPU_SYNC_SWITCH_DISABLE));
+
+	/* WaEnableLbsSlaRetryTimerDecrement:skl,bxt,kbl,glk,cfl */
+	I915_WRITE(BDW_SCRATCH1, I915_READ(BDW_SCRATCH1) |
+		   GEN9_LBS_SLA_RETRY_TIMER_DECREMENT_ENABLE);
+
+	/* WaDisableKillLogic:bxt,skl,kbl */
+	if (!IS_COFFEELAKE(dev_priv))
+		I915_WRITE(GAM_ECOCHK, I915_READ(GAM_ECOCHK) |
+			   ECOCHK_DIS_TLB);
+
+	/* WaDisableHDCInvalidation:skl,bxt,kbl,cfl */
+	I915_WRITE(GAM_ECOCHK, I915_READ(GAM_ECOCHK) |
+		   BDW_DISABLE_HDC_INVALIDATION);
+
+	/* WaOCLCoherentLineFlush:skl,bxt,kbl,cfl */
+	I915_WRITE(GEN8_L3SQCREG4, (I915_READ(GEN8_L3SQCREG4) |
+				    GEN8_LQSC_FLUSH_COHERENT_LINES));
+
+	/* WaEnablePreemptionGranularityControlByUMD:skl,bxt,kbl,cfl,[cnl] */
+	I915_WRITE(GEN7_FF_SLICE_CS_CHICKEN1,
+		   _MASKED_BIT_ENABLE(GEN9_FFSC_PERCTX_PREEMPT_CTRL));
+}
+
+static void skl_gt_workarounds_apply(struct drm_i915_private *dev_priv)
+{
+	gen9_gt_workarounds_apply(dev_priv);
+
+	/* WaEnableGapsTsvCreditFix:skl */
+	I915_WRITE(GEN8_GARBCNTL, (I915_READ(GEN8_GARBCNTL) |
+				   GEN9_GAPS_TSV_CREDIT_DISABLE));
+
+	/* WaDisableGafsUnitClkGating:skl */
+	I915_WRITE(GEN7_UCGCTL4, (I915_READ(GEN7_UCGCTL4) |
+				  GEN8_EU_GAUNIT_CLOCK_GATE_DISABLE));
+
+	/* WaInPlaceDecompressionHang:skl */
+	if (IS_SKL_REVID(dev_priv, SKL_REVID_H0, REVID_FOREVER))
+		I915_WRITE(GEN9_GAMT_ECO_REG_RW_IA,
+			   (I915_READ(GEN9_GAMT_ECO_REG_RW_IA) |
+			    GAMT_ECO_ENABLE_IN_PLACE_DECOMPRESS));
+}
+
+static void bxt_gt_workarounds_apply(struct drm_i915_private *dev_priv)
+{
+	gen9_gt_workarounds_apply(dev_priv);
+
+	/* WaStoreMultiplePTEenable:bxt */
+	/* This is a requirement according to Hardware specification */
+	if (IS_BXT_REVID(dev_priv, 0, BXT_REVID_A1))
+		I915_WRITE(TILECTL, I915_READ(TILECTL) | TILECTL_TLBPF);
+
+	/* WaSetClckGatingDisableMedia:bxt */
+	if (IS_BXT_REVID(dev_priv, 0, BXT_REVID_A1)) {
+		I915_WRITE(GEN7_MISCCPCTL, (I915_READ(GEN7_MISCCPCTL) &
+					    ~GEN8_DOP_CLOCK_GATE_MEDIA_ENABLE));
+	}
+
+	/* WaDisablePooledEuLoadBalancingFix:bxt */
+	if (IS_BXT_REVID(dev_priv, BXT_REVID_B0, REVID_FOREVER)) {
+		I915_WRITE(FF_SLICE_CS_CHICKEN2,
+			   _MASKED_BIT_ENABLE(GEN9_POOLED_EU_LOAD_BALANCING_FIX_DISABLE));
+	}
+
+	/* WaProgramL3SqcReg1DefaultForPerf:bxt */
+	if (IS_BXT_REVID(dev_priv, BXT_REVID_B0, REVID_FOREVER)) {
+		u32 val = I915_READ(GEN8_L3SQCREG1);
+		val &= ~L3_PRIO_CREDITS_MASK;
+		val |= L3_GENERAL_PRIO_CREDITS(62) | L3_HIGH_PRIO_CREDITS(2);
+		I915_WRITE(GEN8_L3SQCREG1, val);
+	}
+
+	/* WaInPlaceDecompressionHang:bxt */
+	if (IS_BXT_REVID(dev_priv, BXT_REVID_C0, REVID_FOREVER))
+		I915_WRITE(GEN9_GAMT_ECO_REG_RW_IA,
+			   (I915_READ(GEN9_GAMT_ECO_REG_RW_IA) |
+			    GAMT_ECO_ENABLE_IN_PLACE_DECOMPRESS));
+}
+
+static void kbl_gt_workarounds_apply(struct drm_i915_private *dev_priv)
+{
+	gen9_gt_workarounds_apply(dev_priv);
+
+	/* WaEnableGapsTsvCreditFix:kbl */
+	I915_WRITE(GEN8_GARBCNTL, (I915_READ(GEN8_GARBCNTL) |
+				   GEN9_GAPS_TSV_CREDIT_DISABLE));
+
+	/* WaDisableDynamicCreditSharing:kbl */
+	if (IS_KBL_REVID(dev_priv, 0, KBL_REVID_B0))
+		I915_WRITE(GAMT_CHKN_BIT_REG,
+			   (I915_READ(GAMT_CHKN_BIT_REG) |
+			    GAMT_CHKN_DISABLE_DYNAMIC_CREDIT_SHARING));
+
+	/* WaDisableGafsUnitClkGating:kbl */
+	I915_WRITE(GEN7_UCGCTL4, (I915_READ(GEN7_UCGCTL4) |
+				  GEN8_EU_GAUNIT_CLOCK_GATE_DISABLE));
+
+	/* WaInPlaceDecompressionHang:kbl */
+	I915_WRITE(GEN9_GAMT_ECO_REG_RW_IA,
+		   (I915_READ(GEN9_GAMT_ECO_REG_RW_IA) |
+		    GAMT_ECO_ENABLE_IN_PLACE_DECOMPRESS));
+}
+
+static void glk_gt_workarounds_apply(struct drm_i915_private *dev_priv)
+{
+	gen9_gt_workarounds_apply(dev_priv);
+}
+
+static void cfl_gt_workarounds_apply(struct drm_i915_private *dev_priv)
+{
+	gen9_gt_workarounds_apply(dev_priv);
+
+	/* WaEnableGapsTsvCreditFix:cfl */
+	I915_WRITE(GEN8_GARBCNTL, (I915_READ(GEN8_GARBCNTL) |
+				   GEN9_GAPS_TSV_CREDIT_DISABLE));
+
+	/* WaDisableGafsUnitClkGating:cfl */
+	I915_WRITE(GEN7_UCGCTL4, (I915_READ(GEN7_UCGCTL4) |
+				  GEN8_EU_GAUNIT_CLOCK_GATE_DISABLE));
+
+	/* WaInPlaceDecompressionHang:cfl */
+	I915_WRITE(GEN9_GAMT_ECO_REG_RW_IA,
+		   (I915_READ(GEN9_GAMT_ECO_REG_RW_IA) |
+		    GAMT_ECO_ENABLE_IN_PLACE_DECOMPRESS));
+}
+
+static void cnl_gt_workarounds_apply(struct drm_i915_private *dev_priv)
+{
+	/* WaDisableI2mCycleOnWRPort:cnl (pre-prod) */
+	if (IS_CNL_REVID(dev_priv, CNL_REVID_B0, CNL_REVID_B0))
+		I915_WRITE(GAMT_CHKN_BIT_REG,
+			   (I915_READ(GAMT_CHKN_BIT_REG) |
+			    GAMT_CHKN_DISABLE_I2M_CYCLE_ON_WR_PORT));
+
+	/* WaInPlaceDecompressionHang:cnl */
+	I915_WRITE(GEN9_GAMT_ECO_REG_RW_IA,
+		   (I915_READ(GEN9_GAMT_ECO_REG_RW_IA) |
+		    GAMT_ECO_ENABLE_IN_PLACE_DECOMPRESS));
+
+	/* WaEnablePreemptionGranularityControlByUMD:cnl */
+	I915_WRITE(GEN7_FF_SLICE_CS_CHICKEN1,
+		   _MASKED_BIT_ENABLE(GEN9_FFSC_PERCTX_PREEMPT_CTRL));
+}
+
+void intel_gt_workarounds_apply(struct drm_i915_private *dev_priv)
+{
+	if (INTEL_GEN(dev_priv) < 8)
+		return;
+	else if (IS_BROADWELL(dev_priv))
+		bdw_gt_workarounds_apply(dev_priv);
+	else if (IS_CHERRYVIEW(dev_priv))
+		chv_gt_workarounds_apply(dev_priv);
+	else if (IS_SKYLAKE(dev_priv))
+		skl_gt_workarounds_apply(dev_priv);
+	else if (IS_BROXTON(dev_priv))
+		bxt_gt_workarounds_apply(dev_priv);
+	else if (IS_KABYLAKE(dev_priv))
+		kbl_gt_workarounds_apply(dev_priv);
+	else if (IS_GEMINILAKE(dev_priv))
+		glk_gt_workarounds_apply(dev_priv);
+	else if (IS_COFFEELAKE(dev_priv))
+		cfl_gt_workarounds_apply(dev_priv);
+	else if (IS_CANNONLAKE(dev_priv))
+		cnl_gt_workarounds_apply(dev_priv);
+	else
+		MISSING_CASE(INTEL_GEN(dev_priv));
+}
+
+static int wa_ring_whitelist_reg(struct intel_engine_cs *engine,
+				 i915_reg_t reg)
+{
+	struct drm_i915_private *dev_priv = engine->i915;
+	struct i915_workarounds *wa = &dev_priv->workarounds;
+	const uint32_t index = wa->hw_whitelist_count[engine->id];
+
+	if (WARN_ON(index >= RING_MAX_NONPRIV_SLOTS))
+		return -EINVAL;
+
+	I915_WRITE(RING_FORCE_TO_NONPRIV(engine->mmio_base, index),
+		   i915_mmio_reg_offset(reg));
+	wa->hw_whitelist_count[engine->id]++;
+
+	return 0;
+}
+
+static int gen9_whitelist_workarounds_apply(struct intel_engine_cs *engine)
+{
+	int ret;
+
+	/* WaVFEStateAfterPipeControlwithMediaStateClear:skl,bxt,glk,cfl */
+	ret = wa_ring_whitelist_reg(engine, GEN9_CTX_PREEMPT_REG);
+	if (ret)
+		return ret;
+
+	/* WaEnablePreemptionGranularityControlByUMD:skl,bxt,kbl,cfl,[cnl] */
+	ret = wa_ring_whitelist_reg(engine, GEN8_CS_CHICKEN1);
+	if (ret)
+		return ret;
+
+	/* WaAllowUMDToModifyHDCChicken1:skl,bxt,kbl,glk,cfl */
+	ret = wa_ring_whitelist_reg(engine, GEN8_HDC_CHICKEN1);
+	if (ret)
+		return ret;
+
+	return 0;
+}
+
+static int skl_whitelist_workarounds_apply(struct intel_engine_cs *engine)
+{
+	int ret = gen9_whitelist_workarounds_apply(engine);
+	if (ret)
+		return ret;
+
+	/* WaDisableLSQCROPERFforOCL:skl */
+	ret = wa_ring_whitelist_reg(engine, GEN8_L3SQCREG4);
+	if (ret)
+		return ret;
+
+	return 0;
+}
+
+static int bxt_whitelist_workarounds_apply(struct intel_engine_cs *engine)
+{
+	struct drm_i915_private *dev_priv = engine->i915;
+
+	int ret = gen9_whitelist_workarounds_apply(engine);
+	if (ret)
+		return ret;
+
+	/* WaDisableObjectLevelPreemptionForTrifanOrPolygon:bxt */
+	/* WaDisableObjectLevelPreemptionForInstancedDraw:bxt */
+	/* WaDisableObjectLevelPreemtionForInstanceId:bxt */
+	/* WaDisableLSQCROPERFforOCL:bxt */
+	if (IS_BXT_REVID(dev_priv, 0, BXT_REVID_A1)) {
+		ret = wa_ring_whitelist_reg(engine, GEN9_CS_DEBUG_MODE1);
+		if (ret)
+			return ret;
+
+		ret = wa_ring_whitelist_reg(engine, GEN8_L3SQCREG4);
+		if (ret)
+			return ret;
+	}
+
+	return 0;
+}
+
+static int kbl_whitelist_workarounds_apply(struct intel_engine_cs *engine)
+{
+	int ret = gen9_whitelist_workarounds_apply(engine);
+	if (ret)
+		return ret;
+
+	/* WaDisableLSQCROPERFforOCL:kbl */
+	ret = wa_ring_whitelist_reg(engine, GEN8_L3SQCREG4);
+	if (ret)
+		return ret;
+
+	return 0;
+}
+
+static int glk_whitelist_workarounds_apply(struct intel_engine_cs *engine)
+{
+	int ret = gen9_whitelist_workarounds_apply(engine);
+	if (ret)
+		return ret;
+
+	return 0;
+}
+
+static int cfl_whitelist_workarounds_apply(struct intel_engine_cs *engine)
+{
+	int ret = gen9_whitelist_workarounds_apply(engine);
+	if (ret)
+		return ret;
+
+	return 0;
+}
+
+static int cnl_whitelist_workarounds_apply(struct intel_engine_cs *engine)
+{
+	int ret;
+
+	/* WaEnablePreemptionGranularityControlByUMD:cnl */
+	ret = wa_ring_whitelist_reg(engine, GEN8_CS_CHICKEN1);
+	if (ret)
+		return ret;
+
+	return 0;
+}
+
+int intel_whitelist_workarounds_apply(struct intel_engine_cs *engine)
+{
+	struct drm_i915_private *dev_priv = engine->i915;
+	int err;
+
+	WARN_ON(engine->id != RCS);
+
+	dev_priv->workarounds.hw_whitelist_count[engine->id] = 0;
+
+	if (INTEL_GEN(dev_priv) < 9) {
+		WARN(1, "No whitelisting in Gen%u\n", INTEL_GEN(dev_priv));
+		err = 0;
+	} else if (IS_SKYLAKE(dev_priv))
+		err = skl_whitelist_workarounds_apply(engine);
+	else if (IS_BROXTON(dev_priv))
+		err = bxt_whitelist_workarounds_apply(engine);
+	else if (IS_KABYLAKE(dev_priv))
+		err = kbl_whitelist_workarounds_apply(engine);
+	else if (IS_GEMINILAKE(dev_priv))
+		err = glk_whitelist_workarounds_apply(engine);
+	else if (IS_COFFEELAKE(dev_priv))
+		err = cfl_whitelist_workarounds_apply(engine);
+	else if (IS_CANNONLAKE(dev_priv))
+		err = cnl_whitelist_workarounds_apply(engine);
+	else {
+		MISSING_CASE(INTEL_GEN(dev_priv));
+		err = 0;
+	}
+	if (err)
+		return err;
+
+	DRM_DEBUG_DRIVER("%s: Number of whitelist w/a: %d\n", engine->name,
+			 dev_priv->workarounds.hw_whitelist_count[engine->id]);
+	return 0;
+}
diff --git a/drivers/gpu/drm/i915/intel_workarounds.h b/drivers/gpu/drm/i915/intel_workarounds.h
index 27099fc..bba51bb 100644
--- a/drivers/gpu/drm/i915/intel_workarounds.h
+++ b/drivers/gpu/drm/i915/intel_workarounds.h
@@ -25,7 +25,11 @@
 #ifndef _I915_WORKAROUNDS_H_
 #define _I915_WORKAROUNDS_H_
 
-int init_workarounds_ring(struct intel_engine_cs *engine);
-int intel_ring_workarounds_emit(struct drm_i915_gem_request *req);
+int intel_ctx_workarounds_init(struct drm_i915_private *dev_priv);
+int intel_ctx_workarounds_emit(struct drm_i915_gem_request *req);
+
+void intel_gt_workarounds_apply(struct drm_i915_private *dev_priv);
+
+int intel_whitelist_workarounds_apply(struct intel_engine_cs *engine);
 
 #endif
-- 
1.9.1


* [RFC PATCH 04/20] drm/i915: Transform context WAs into static tables
  2017-11-03 18:09 [RFC PATCH v5 00/20] Refactor HW workaround code Oscar Mateo
                   ` (2 preceding siblings ...)
  2017-11-03 18:09 ` [RFC PATCH 03/20] drm/i915: Split out functions for different kinds of workarounds Oscar Mateo
@ 2017-11-03 18:09 ` Oscar Mateo
  2017-11-06 11:59   ` Joonas Lahtinen
  2017-11-03 18:09 ` [RFC PATCH 05/20] drm/i915: Transform GT " Oscar Mateo
                   ` (18 subsequent siblings)
  22 siblings, 1 reply; 30+ messages in thread
From: Oscar Mateo @ 2017-11-03 18:09 UTC (permalink / raw)
  To: intel-gfx

This is for WAs that need to touch registers that get saved/restored
together with the logical context. The idea is that WAs are "pretty"
static, so a table is more declarative than a programmatic approach.
Note however that some amount of caching is needed for those things
that are dynamic (e.g. things that need some calculation, or have
criteria different from the more obvious GEN + stepping).

Also, this makes very explicit which WAs live in the context.
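
To illustrate the table idea outside the driver, here is a minimal
standalone sketch. It is not the code from this patch: the struct is a
reduced stand-in for i915_wa_reg, the WA names and register offsets are
made up, and REVID_FOREVER is assumed to be 0xff. It shows a
since/until stepping filter, the masked-register encoding (write-enable
mask in the upper 16 bits) and the same value/mask comparison the
debugfs file uses.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define REVID_FOREVER 0xff /* assumption: "no upper bound" stepping */

/* Reduced, hypothetical stand-in for struct i915_wa_reg. */
struct wa_reg {
	const char *name;
	uint8_t since;		/* first stepping the WA applies to */
	uint8_t until;		/* last stepping the WA applies to */
	uint32_t addr;		/* register offset (made up here) */
	uint32_t mask;		/* bits owned by this WA */
	uint32_t value;		/* value to write */
	bool is_masked_reg;	/* upper 16 bits are a write-enable mask */
};

/* Masked registers take the write-enable mask in bits 31:16. */
#define MASKED(mask, value) (((uint32_t)(mask) << 16) | (value))

/* Illustrative table; names and offsets are not real workarounds. */
static const struct wa_reg demo_ctx_was[] = {
	{ .name = "WaExampleAllRevs",
	  .since = 0, .until = REVID_FOREVER, .addr = 0x20c0,
	  .mask = 1u << 7, .value = MASKED(1u << 7, 1u << 7),
	  .is_masked_reg = true },
	{ .name = "WaExamplePreProdOnly",
	  .since = 0, .until = 1, .addr = 0x7300,
	  .mask = 1u << 2, .value = MASKED(1u << 2, 1u << 2),
	  .is_masked_reg = true },
};

/* A WA applies when the stepping falls inside [since, until]. */
static bool wa_applies(const struct wa_reg *wa, uint8_t revid)
{
	return revid >= wa->since && revid <= wa->until;
}

/* Same comparison as the debugfs check: only the WA's bits matter. */
static bool wa_verify(const struct wa_reg *wa, uint32_t read)
{
	return (wa->value & wa->mask) == (read & wa->mask);
}

/* Count the entries of a table that apply to a given stepping. */
static unsigned int count_applicable(const struct wa_reg *table,
				     size_t count, uint8_t revid)
{
	unsigned int n = 0;
	size_t i;

	for (i = 0; i < count; i++)
		if (wa_applies(&table[i], revid))
			n++;

	return n;
}
```

Note that this sketch can keep its table const only because it has no
hooks and no "applied" flag; the tables in the patch need to be
writable for exactly those dynamic cases.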

Suggested-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
---
 drivers/gpu/drm/i915/i915_debugfs.c      |  40 +-
 drivers/gpu/drm/i915/i915_drv.h          |  35 +-
 drivers/gpu/drm/i915/i915_gem_context.c  |   4 -
 drivers/gpu/drm/i915/intel_workarounds.c | 754 +++++++++++++++++--------------
 drivers/gpu/drm/i915/intel_workarounds.h |   4 +-
 5 files changed, 466 insertions(+), 371 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index 39883cd..12c4330 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -30,6 +30,7 @@
 #include <linux/sort.h>
 #include <linux/sched/mm.h>
 #include "intel_drv.h"
+#include "intel_workarounds.h"
 #include "i915_guc_submission.h"
 
 static inline struct drm_i915_private *node_to_i915(struct drm_info_node *node)
@@ -3357,13 +3358,16 @@ static int i915_shared_dplls_info(struct seq_file *m, void *unused)
 
 static int i915_wa_registers(struct seq_file *m, void *unused)
 {
-	int i;
-	int ret;
 	struct intel_engine_cs *engine;
 	struct drm_i915_private *dev_priv = node_to_i915(m->private);
 	struct drm_device *dev = &dev_priv->drm;
 	struct i915_workarounds *workarounds = &dev_priv->workarounds;
+	const struct i915_wa_reg_table *wa_table;
+	uint table_count;
 	enum intel_engine_id id;
+	int i, j, ret;
+
+	intel_ctx_workarounds_get(dev_priv, &wa_table, &table_count);
 
 	ret = mutex_lock_interruptible(&dev->struct_mutex);
 	if (ret)
@@ -3371,22 +3375,28 @@ static int i915_wa_registers(struct seq_file *m, void *unused)
 
 	intel_runtime_pm_get(dev_priv);
 
-	seq_printf(m, "Workarounds applied: %d\n", workarounds->count);
+	seq_printf(m, "Workarounds applied: %d\n", workarounds->ctx_count);
 	for_each_engine(engine, dev_priv, id)
 		seq_printf(m, "HW whitelist count for %s: %d\n",
 			   engine->name, workarounds->hw_whitelist_count[id]);
-	for (i = 0; i < workarounds->count; ++i) {
-		i915_reg_t addr;
-		u32 mask, value, read;
-		bool ok;
-
-		addr = workarounds->reg[i].addr;
-		mask = workarounds->reg[i].mask;
-		value = workarounds->reg[i].value;
-		read = I915_READ(addr);
-		ok = (value & mask) == (read & mask);
-		seq_printf(m, "0x%X: 0x%08X, mask: 0x%08X, read: 0x%08x, status: %s\n",
-			   i915_mmio_reg_offset(addr), value, mask, read, ok ? "OK" : "FAIL");
+
+	for (i = 0; i < table_count; i++) {
+		const struct i915_wa_reg *wa = wa_table[i].table;
+
+		for (j = 0; j < wa_table[i].count; j++) {
+			u32 read;
+			bool ok;
+
+			if (!wa[j].applied)
+				continue;
+
+			read = I915_READ(wa[j].addr);
+			ok = (wa[j].value & wa[j].mask) == (read & wa[j].mask);
+			seq_printf(m,
+				   "0x%X: 0x%08X, mask: 0x%08X, read: 0x%08x, status: %s, name: %s\n",
+				   i915_mmio_reg_offset(wa[j].addr), wa[j].value,
+				   wa[j].mask, read, ok ? "OK" : "FAIL", wa[j].name);
+		}
 	}
 
 	intel_runtime_pm_put(dev_priv);
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 4a7325c..1c73fec 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1970,18 +1970,43 @@ struct i915_frontbuffer_tracking {
 	unsigned flip_bits;
 };
 
+struct i915_wa_reg;
+
+typedef bool (* wa_pre_hook_func)(struct drm_i915_private *dev_priv,
+				  struct i915_wa_reg *wa);
+typedef void (* wa_post_hook_func)(struct drm_i915_private *dev_priv,
+				   struct i915_wa_reg *wa);
+
 struct i915_wa_reg {
+	const char *name;
+	enum wa_type {
+		I915_WA_TYPE_CONTEXT = 0,
+		I915_WA_TYPE_GT,
+		I915_WA_TYPE_DISPLAY,
+		I915_WA_TYPE_WHITELIST
+	} type;
+
+	u8 since;
+	u8 until;
+
 	i915_reg_t addr;
-	u32 value;
-	/* bitmask representing WA bits */
 	u32 mask;
+	u32 value;
+	bool is_masked_reg;
+
+	wa_pre_hook_func pre_hook;
+	wa_post_hook_func post_hook;
+	u32 hook_data;
+	bool applied;
 };
 
-#define I915_MAX_WA_REGS 16
+struct i915_wa_reg_table {
+	struct i915_wa_reg *table;
+	int count;
+};
 
 struct i915_workarounds {
-	struct i915_wa_reg reg[I915_MAX_WA_REGS];
-	u32 count;
+	u32 ctx_count;
 	u32 hw_whitelist_count[I915_NUM_ENGINES];
 };
 
diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index 8548e571..7d04b5e 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -457,10 +457,6 @@ int i915_gem_contexts_init(struct drm_i915_private *dev_priv)
 
 	GEM_BUG_ON(dev_priv->kernel_context);
 
-	err = intel_ctx_workarounds_init(dev_priv);
-	if (err)
-		goto err;
-
 	INIT_LIST_HEAD(&dev_priv->contexts.list);
 	INIT_WORK(&dev_priv->contexts.free_work, contexts_free_worker);
 	init_llist_head(&dev_priv->contexts.free_list);
diff --git a/drivers/gpu/drm/i915/intel_workarounds.c b/drivers/gpu/drm/i915/intel_workarounds.c
index 0a8f265..b00899e 100644
--- a/drivers/gpu/drm/i915/intel_workarounds.c
+++ b/drivers/gpu/drm/i915/intel_workarounds.c
@@ -25,61 +25,65 @@
 #include "i915_drv.h"
 #include "intel_workarounds.h"
 
-static int wa_add(struct drm_i915_private *dev_priv,
-		  i915_reg_t addr,
-		  const u32 mask, const u32 val)
-{
-	const u32 idx = dev_priv->workarounds.count;
+#define WA_CTX(wa)			\
+	.name = (wa),			\
+	.type = I915_WA_TYPE_CONTEXT
 
-	if (WARN_ON(idx >= I915_MAX_WA_REGS))
-		return -ENOSPC;
+#define ALL_REVS		\
+	.since = 0,		\
+	.until = REVID_FOREVER
 
-	dev_priv->workarounds.reg[idx].addr = addr;
-	dev_priv->workarounds.reg[idx].value = val;
-	dev_priv->workarounds.reg[idx].mask = mask;
+#define REVS(s, u)	\
+	.since = (s),	\
+	.until = (u)
 
-	dev_priv->workarounds.count++;
+#define REG(a)		\
+	.addr = (a)
 
-	return 0;
-}
+#define MASK(mask, value)	((mask) << 16 | (value))
+#define MASK_ENABLE(x)		(MASK((x), (x)))
+#define MASK_DISABLE(x)		(MASK((x), 0))
 
-#define WA_REG(addr, mask, val) do { \
-		const int r = wa_add(dev_priv, (addr), (mask), (val)); \
-		if (r) \
-			return r; \
-	} while (0)
+#define SET_BIT_MASKED(m) 		\
+	.mask = (m),			\
+	.value = MASK_ENABLE(m),	\
+	.is_masked_reg = true
 
-#define WA_SET_BIT_MASKED(addr, mask) \
-	WA_REG(addr, (mask), _MASKED_BIT_ENABLE(mask))
+#define CLEAR_BIT_MASKED(m) 		\
+	.mask = (m),			\
+	.value = MASK_DISABLE(m),	\
+	.is_masked_reg = true
 
-#define WA_CLR_BIT_MASKED(addr, mask) \
-	WA_REG(addr, (mask), _MASKED_BIT_DISABLE(mask))
+#define SET_FIELD_MASKED(m, v) 		\
+	.mask = (m),			\
+	.value = MASK(m, v),		\
+	.is_masked_reg = true
 
-#define WA_SET_FIELD_MASKED(addr, mask, value) \
-	WA_REG(addr, mask, _MASKED_FIELD(mask, value))
-
-static int gen8_ctx_workarounds_init(struct drm_i915_private *dev_priv)
-{
-	WA_SET_BIT_MASKED(INSTPM, INSTPM_FORCE_ORDERING);
+static struct i915_wa_reg gen8_ctx_was[] = {
+	{ WA_CTX(""),
+	  ALL_REVS, REG(INSTPM),
+	  SET_BIT_MASKED(INSTPM_FORCE_ORDERING) },
 
-	/* WaDisableAsyncFlipPerfMode:bdw,chv */
-	WA_SET_BIT_MASKED(MI_MODE, ASYNC_FLIP_PERF_DISABLE);
+	{ WA_CTX("WaDisableAsyncFlipPerfMode"),
+	  ALL_REVS, REG(MI_MODE),
+	  SET_BIT_MASKED(ASYNC_FLIP_PERF_DISABLE) },
 
-	/* WaDisablePartialInstShootdown:bdw,chv */
-	WA_SET_BIT_MASKED(GEN8_ROW_CHICKEN,
-			  PARTIAL_INSTRUCTION_SHOOTDOWN_DISABLE);
+	{ WA_CTX("WaDisablePartialInstShootdown"),
+	  ALL_REVS, REG(GEN8_ROW_CHICKEN),
+	  SET_BIT_MASKED(PARTIAL_INSTRUCTION_SHOOTDOWN_DISABLE) },
 
-	/* Use Force Non-Coherent whenever executing a 3D context. This is a
+	/*
+	 * Use Force Non-Coherent whenever executing a 3D context. This is a
 	 * workaround for for a possible hang in the unlikely event a TLB
 	 * invalidation occurs during a PSD flush.
 	 */
-	/* WaForceEnableNonCoherent:bdw,chv */
-	/* WaHdcDisableFetchWhenMasked:bdw,chv */
-	WA_SET_BIT_MASKED(HDC_CHICKEN0,
-			  HDC_DONOT_FETCH_MEM_WHEN_MASKED |
-			  HDC_FORCE_NON_COHERENT);
+	{ WA_CTX("WaForceEnableNonCoherent + WaHdcDisableFetchWhenMasked"),
+	  ALL_REVS, REG(HDC_CHICKEN0),
+	  SET_BIT_MASKED(HDC_FORCE_NON_COHERENT |
+			 HDC_DONOT_FETCH_MEM_WHEN_MASKED) },
 
-	/* From the Haswell PRM, Command Reference: Registers, CACHE_MODE_0:
+	/*
+	 * From the Haswell PRM, Command Reference: Registers, CACHE_MODE_0:
 	 * "The Hierarchical Z RAW Stall Optimization allows non-overlapping
 	 *  polygons in the same 8x4 pixel/sample area to be processed without
 	 *  stalling waiting for the earlier ones to write to Hierarchical Z
@@ -87,10 +91,13 @@ static int gen8_ctx_workarounds_init(struct drm_i915_private *dev_priv)
 	 *
 	 * This optimization is off by default for BDW and CHV; turn it on.
 	 */
-	WA_CLR_BIT_MASKED(CACHE_MODE_0_GEN7, HIZ_RAW_STALL_OPT_DISABLE);
+	{ WA_CTX(""),
+	  ALL_REVS, REG(CACHE_MODE_0_GEN7),
+	  CLEAR_BIT_MASKED(HIZ_RAW_STALL_OPT_DISABLE) },
 
-	/* Wa4x4STCOptimizationDisable:bdw,chv */
-	WA_SET_BIT_MASKED(CACHE_MODE_1, GEN8_4x4_STC_OPTIMIZATION_DISABLE);
+	{ WA_CTX("Wa4x4STCOptimizationDisable"),
+	  ALL_REVS, REG(CACHE_MODE_1),
+	  SET_BIT_MASKED(GEN8_4x4_STC_OPTIMIZATION_DISABLE) },
 
 	/*
 	 * BSpec recommends 8x4 when MSAA is used,
@@ -100,154 +107,126 @@ static int gen8_ctx_workarounds_init(struct drm_i915_private *dev_priv)
 	 * disable bit, which we don't touch here, but it's good
 	 * to keep in mind (see 3DSTATE_PS and 3DSTATE_WM).
 	 */
-	WA_SET_FIELD_MASKED(GEN7_GT_MODE,
-			    GEN6_WIZ_HASHING_MASK,
-			    GEN6_WIZ_HASHING_16x4);
-
-	return 0;
-}
+	{ WA_CTX(""),
+	  ALL_REVS, REG(GEN7_GT_MODE),
+	  SET_FIELD_MASKED(GEN6_WIZ_HASHING_MASK, GEN6_WIZ_HASHING_16x4) },
+};
 
-static int bdw_ctx_workarounds_init(struct drm_i915_private *dev_priv)
+static bool is_bdw_gt3(struct drm_i915_private *dev_priv, struct i915_wa_reg *wa)
 {
-	int ret;
-
-	ret = gen8_ctx_workarounds_init(dev_priv);
-	if (ret)
-		return ret;
+	return IS_BDW_GT3(dev_priv);
+}
 
-	/* WaDisableThreadStallDopClockGating:bdw (pre-production) */
-	WA_SET_BIT_MASKED(GEN8_ROW_CHICKEN, STALL_DOP_GATING_DISABLE);
+static struct i915_wa_reg bdw_ctx_was[] = {
+	{ WA_CTX("WaDisableThreadStallDopClockGating (pre-prod)"),
+	  ALL_REVS, REG(GEN8_ROW_CHICKEN),
+	  SET_BIT_MASKED(STALL_DOP_GATING_DISABLE) },
 
-	/* WaDisableDopClockGating:bdw
-	 *
+	/*
 	 * Also see the related UCGTCL1 write in broadwell_init_clock_gating()
 	 * to disable EUTC clock gating.
 	 */
-	WA_SET_BIT_MASKED(GEN7_ROW_CHICKEN2,
-			  DOP_CLOCK_GATING_DISABLE);
-
-	WA_SET_BIT_MASKED(HALF_SLICE_CHICKEN3,
-			  GEN8_SAMPLER_POWER_BYPASS_DIS);
+	{ WA_CTX("WaDisableDopClockGating"),
+	  ALL_REVS, REG(GEN7_ROW_CHICKEN2),
+	  SET_BIT_MASKED(DOP_CLOCK_GATING_DISABLE) },
 
-	WA_SET_BIT_MASKED(HDC_CHICKEN0,
-			  /* WaForceContextSaveRestoreNonCoherent:bdw */
-			  HDC_FORCE_CONTEXT_SAVE_RESTORE_NON_COHERENT |
-			  /* WaDisableFenceDestinationToSLM:bdw (pre-prod) */
-			  (IS_BDW_GT3(dev_priv) ? HDC_FENCE_DEST_SLM_DISABLE : 0));
+	{ WA_CTX(""),
+	  ALL_REVS, REG(HALF_SLICE_CHICKEN3),
+	  SET_BIT_MASKED(GEN8_SAMPLER_POWER_BYPASS_DIS) },
 
-	return 0;
-}
+	{ WA_CTX("WaForceContextSaveRestoreNonCoherent"),
+	  ALL_REVS, REG(HDC_CHICKEN0),
+	  SET_BIT_MASKED(HDC_FORCE_CONTEXT_SAVE_RESTORE_NON_COHERENT) },
 
-static int chv_ctx_workarounds_init(struct drm_i915_private *dev_priv)
-{
-	int ret;
+	{ WA_CTX("WaDisableFenceDestinationToSLM (pre-prod)"),
+	  ALL_REVS, REG(HDC_CHICKEN0),
+	  SET_BIT_MASKED(HDC_FENCE_DEST_SLM_DISABLE),
+	  .pre_hook = is_bdw_gt3 },
+};
 
-	ret = gen8_ctx_workarounds_init(dev_priv);
-	if (ret)
-		return ret;
-
-	/* WaDisableThreadStallDopClockGating:chv */
-	WA_SET_BIT_MASKED(GEN8_ROW_CHICKEN, STALL_DOP_GATING_DISABLE);
+static struct i915_wa_reg chv_ctx_was[] = {
+	{ WA_CTX("WaDisableThreadStallDopClockGating"),
+	  ALL_REVS, REG(GEN8_ROW_CHICKEN),
+	  SET_BIT_MASKED(STALL_DOP_GATING_DISABLE) },
 
 	/* Improve HiZ throughput on CHV. */
-	WA_SET_BIT_MASKED(HIZ_CHICKEN, CHV_HZ_8X8_MODE_IN_1X);
-
-	return 0;
-}
+	{ WA_CTX(""),
+	  ALL_REVS, REG(HIZ_CHICKEN),
+	  SET_BIT_MASKED(CHV_HZ_8X8_MODE_IN_1X) },
+};
 
-static int gen9_ctx_workarounds_init(struct drm_i915_private *dev_priv)
+static bool has_llc(struct drm_i915_private *dev_priv, struct i915_wa_reg *wa)
 {
-	if (HAS_LLC(dev_priv)) {
-		/* WaCompressedResourceSamplerPbeMediaNewHashMode:skl,kbl
-		 *
-		 * Must match Display Engine. See
-		 * WaCompressedResourceDisplayNewHashMode.
-		 */
-		WA_SET_BIT_MASKED(COMMON_SLICE_CHICKEN2,
-				  GEN9_PBE_COMPRESSED_HASH_SELECTION);
-		WA_SET_BIT_MASKED(GEN9_HALF_SLICE_CHICKEN7,
-				  GEN9_SAMPLER_HASH_COMPRESSED_READ_ADDR);
-	}
-
-	/* WaClearFlowControlGpgpuContextSave:skl,bxt,kbl,glk,cfl */
-	/* WaDisablePartialInstShootdown:skl,bxt,kbl,glk,cfl */
-	WA_SET_BIT_MASKED(GEN8_ROW_CHICKEN,
-			  FLOW_CONTROL_ENABLE |
-			  PARTIAL_INSTRUCTION_SHOOTDOWN_DISABLE);
+	return HAS_LLC(dev_priv);
+}
 
-	/* Syncing dependencies between camera and graphics:skl,bxt,kbl */
-	if (!IS_COFFEELAKE(dev_priv))
-		WA_SET_BIT_MASKED(HALF_SLICE_CHICKEN3,
-				  GEN9_DISABLE_OCL_OOB_SUPPRESS_LOGIC);
+static struct i915_wa_reg gen9_ctx_was[] = {
+	/*
+	 * Must match Display Engine. See
+	 * WaCompressedResourceDisplayNewHashMode.
+	 */
+	{ WA_CTX("WaCompressedResourceSamplerPbeMediaNewHashMode"),
+	  ALL_REVS, REG(COMMON_SLICE_CHICKEN2),
+	  SET_BIT_MASKED(GEN9_PBE_COMPRESSED_HASH_SELECTION),
+	  .pre_hook = has_llc },
+	{ WA_CTX("WaCompressedResourceSamplerPbeMediaNewHashMode"),
+	  ALL_REVS, REG(GEN9_HALF_SLICE_CHICKEN7),
+	  SET_BIT_MASKED(GEN9_SAMPLER_HASH_COMPRESSED_READ_ADDR),
+	  .pre_hook = has_llc },
+
+	{ WA_CTX("WaClearFlowControlGpgpuContextSave + WaDisablePartialInstShootdown"),
+	  ALL_REVS, REG(GEN8_ROW_CHICKEN),
+	  SET_BIT_MASKED(FLOW_CONTROL_ENABLE |
+			 PARTIAL_INSTRUCTION_SHOOTDOWN_DISABLE) },
 
-	/* WaDisableDgMirrorFixInHalfSliceChicken5:bxt */
-	if (IS_BXT_REVID(dev_priv, 0, BXT_REVID_A1))
-		WA_CLR_BIT_MASKED(GEN9_HALF_SLICE_CHICKEN5,
-				  GEN9_DG_MIRROR_FIX_ENABLE);
+	/*
+	 * WA also requires GEN9_SLICE_COMMON_ECO_CHICKEN0[14:14] to be set
+	 * but we do that in per ctx batchbuffer as there is an issue
+	 * with this register not getting restored on ctx restore
+	 */
+	{ WA_CTX("WaSetDisablePixMaskCammingAndRhwoInCommonSliceChicken"),
+	  REVS(0, BXT_REVID_A1), REG(GEN7_COMMON_SLICE_CHICKEN1),
+	  SET_BIT_MASKED(GEN9_RHWO_OPTIMIZATION_DISABLE) },
 
-	/* WaSetDisablePixMaskCammingAndRhwoInCommonSliceChicken:bxt */
-	if (IS_BXT_REVID(dev_priv, 0, BXT_REVID_A1)) {
-		WA_SET_BIT_MASKED(GEN7_COMMON_SLICE_CHICKEN1,
-				  GEN9_RHWO_OPTIMIZATION_DISABLE);
-		/*
-		 * WA also requires GEN9_SLICE_COMMON_ECO_CHICKEN0[14:14] to be set
-		 * but we do that in per ctx batchbuffer as there is an issue
-		 * with this register not getting restored on ctx restore
-		 */
-	}
+	{ WA_CTX("WaEnableYV12BugFixInHalfSliceChicken7 + WaEnableSamplerGPGPUPreemptionSupport"),
+	  ALL_REVS, REG(GEN9_HALF_SLICE_CHICKEN7),
+	  SET_BIT_MASKED(GEN9_ENABLE_YV12_BUGFIX |
+			 GEN9_ENABLE_GPGPU_PREEMPTION) },
 
-	/* WaEnableYV12BugFixInHalfSliceChicken7:skl,bxt,kbl,glk,cfl */
-	/* WaEnableSamplerGPGPUPreemptionSupport:skl,bxt,kbl,cfl */
-	WA_SET_BIT_MASKED(GEN9_HALF_SLICE_CHICKEN7,
-			  GEN9_ENABLE_YV12_BUGFIX |
-			  GEN9_ENABLE_GPGPU_PREEMPTION);
+	{ WA_CTX("Wa4x4STCOptimizationDisable + WaDisablePartialResolveInVc"),
+	  ALL_REVS, REG(CACHE_MODE_1),
+	  SET_BIT_MASKED(GEN8_4x4_STC_OPTIMIZATION_DISABLE |
+			 GEN9_PARTIAL_RESOLVE_IN_VC_DISABLE) },
 
-	/* Wa4x4STCOptimizationDisable:skl,bxt,kbl,glk,cfl */
-	/* WaDisablePartialResolveInVc:skl,bxt,kbl,cfl */
-	WA_SET_BIT_MASKED(CACHE_MODE_1, (GEN8_4x4_STC_OPTIMIZATION_DISABLE |
-					 GEN9_PARTIAL_RESOLVE_IN_VC_DISABLE));
+	{ WA_CTX("WaCcsTlbPrefetchDisable"),
+	  ALL_REVS, REG(GEN9_HALF_SLICE_CHICKEN5),
+	  CLEAR_BIT_MASKED(GEN9_CCS_TLB_PREFETCH_ENABLE) },
 
-	/* WaCcsTlbPrefetchDisable:skl,bxt,kbl,glk,cfl */
-	WA_CLR_BIT_MASKED(GEN9_HALF_SLICE_CHICKEN5,
-			  GEN9_CCS_TLB_PREFETCH_ENABLE);
+	{ WA_CTX("WaForceContextSaveRestoreNonCoherent"),
+	  ALL_REVS, REG(HDC_CHICKEN0),
+	  SET_BIT_MASKED(HDC_FORCE_CONTEXT_SAVE_RESTORE_NON_COHERENT |
+			 HDC_FORCE_CSR_NON_COHERENT_OVR_DISABLE) },
 
-	/* WaDisableMaskBasedCammingInRCC:bxt */
-	if (IS_BXT_REVID(dev_priv, 0, BXT_REVID_A1))
-		WA_SET_BIT_MASKED(SLICE_ECO_CHICKEN0,
-				  PIXEL_MASK_CAMMING_DISABLE);
-
-	/* WaForceContextSaveRestoreNonCoherent:skl,bxt,kbl,cfl */
-	WA_SET_BIT_MASKED(HDC_CHICKEN0,
-			  HDC_FORCE_CONTEXT_SAVE_RESTORE_NON_COHERENT |
-			  HDC_FORCE_CSR_NON_COHERENT_OVR_DISABLE);
-
-	/* WaForceEnableNonCoherent and WaDisableHDCInvalidation are
-	 * both tied to WaForceContextSaveRestoreNonCoherent
-	 * in some hsds for skl. We keep the tie for all gen9. The
-	 * documentation is a bit hazy and so we want to get common behaviour,
-	 * even though there is no clear evidence we would need both on kbl/bxt.
-	 * This area has been source of system hangs so we play it safe
-	 * and mimic the skl regardless of what bspec says.
+	/*
+	 * WaForceEnableNonCoherent and WaDisableHDCInvalidation are
+	 * both tied to WaForceContextSaveRestoreNonCoherent in some hsds for
+	 * skl. We keep the tie for all gen9. The documentation is a bit hazy
+	 * and so we want to get common behaviour, even though there is no clear
+	 * evidence we would need both on kbl/bxt. This area has been source of
+	 * system hangs so we play it safe and mimic the skl regardless of what
+	 * bspec says.
 	 *
 	 * Use Force Non-Coherent whenever executing a 3D context. This
 	 * is a workaround for a possible hang in the unlikely event
 	 * a TLB invalidation occurs during a PSD flush.
 	 */
+	{ WA_CTX("WaForceEnableNonCoherent"),
+	  ALL_REVS, REG(HDC_CHICKEN0),
+	  SET_BIT_MASKED(HDC_FORCE_NON_COHERENT) },
 
-	/* WaForceEnableNonCoherent:skl,bxt,kbl,cfl */
-	WA_SET_BIT_MASKED(HDC_CHICKEN0,
-			  HDC_FORCE_NON_COHERENT);
-
-	/* WaDisableSamplerPowerBypassForSOPingPong:skl,bxt,kbl,cfl */
-	if (IS_SKYLAKE(dev_priv) ||
-	    IS_KABYLAKE(dev_priv) ||
-	    IS_COFFEELAKE(dev_priv) ||
-	    IS_BXT_REVID(dev_priv, 0, BXT_REVID_B0))
-		WA_SET_BIT_MASKED(HALF_SLICE_CHICKEN3,
-				  GEN8_SAMPLER_POWER_BYPASS_DIS);
-
-	/* WaDisableSTUnitPowerOptimization:skl,bxt,kbl,glk,cfl */
-	WA_SET_BIT_MASKED(HALF_SLICE_CHICKEN2, GEN8_ST_PO_DISABLE);
+	{ WA_CTX("WaDisableSTUnitPowerOptimization"),
+	  ALL_REVS, REG(HALF_SLICE_CHICKEN2),
+	  SET_BIT_MASKED(GEN8_ST_PO_DISABLE) },
 
 	/*
 	 * Supporting preemption with fine-granularity requires changes in the
@@ -259,18 +238,17 @@ static int gen9_ctx_workarounds_init(struct drm_i915_private *dev_priv)
 	 * not real HW workarounds, but merely a way to start using preemption
 	 * while maintaining old contract with userspace.
 	 */
-
-	/* WaDisable3DMidCmdPreemption:skl,bxt,glk,cfl,[cnl] */
-	WA_CLR_BIT_MASKED(GEN8_CS_CHICKEN1, GEN9_PREEMPT_3D_OBJECT_LEVEL);
-
-	/* WaDisableGPGPUMidCmdPreemption:skl,bxt,blk,cfl,[cnl] */
-	WA_SET_FIELD_MASKED(GEN8_CS_CHICKEN1, GEN9_PREEMPT_GPGPU_LEVEL_MASK,
-			    GEN9_PREEMPT_GPGPU_COMMAND_LEVEL);
-
-	return 0;
-}
-
-static int skl_tune_iz_hashing(struct drm_i915_private *dev_priv)
+	{ WA_CTX("WaDisable3DMidCmdPreemption"),
+	  ALL_REVS, REG(GEN8_CS_CHICKEN1),
+	  CLEAR_BIT_MASKED(GEN9_PREEMPT_3D_OBJECT_LEVEL) },
+	{ WA_CTX("WaDisableGPGPUMidCmdPreemption"),
+	  ALL_REVS, REG(GEN8_CS_CHICKEN1),
+	  SET_FIELD_MASKED(GEN9_PREEMPT_GPGPU_LEVEL_MASK,
+			   GEN9_PREEMPT_GPGPU_COMMAND_LEVEL) },
+};
+
+static bool skl_tune_iz_hashing(struct drm_i915_private *dev_priv,
+				struct i915_wa_reg *wa)
 {
 	u8 vals[3] = { 0, 0, 0 };
 	unsigned int i;
@@ -296,211 +274,295 @@ static int skl_tune_iz_hashing(struct drm_i915_private *dev_priv)
 	}
 
 	if (vals[0] == 0 && vals[1] == 0 && vals[2] == 0)
-		return 0;
-
-	/* Tune IZ hashing. See intel_device_info_runtime_init() */
-	WA_SET_FIELD_MASKED(GEN7_GT_MODE,
-			    GEN9_IZ_HASHING_MASK(2) |
-			    GEN9_IZ_HASHING_MASK(1) |
-			    GEN9_IZ_HASHING_MASK(0),
-			    GEN9_IZ_HASHING(2, vals[2]) |
-			    GEN9_IZ_HASHING(1, vals[1]) |
-			    GEN9_IZ_HASHING(0, vals[0]));
-
-	return 0;
-}
+		return false;
 
-static int skl_ctx_workarounds_init(struct drm_i915_private *dev_priv)
-{
-	int ret;
-
-	ret = gen9_ctx_workarounds_init(dev_priv);
-	if (ret)
-		return ret;
+	wa->mask = GEN9_IZ_HASHING_MASK(2) |
+		   GEN9_IZ_HASHING_MASK(1) |
+		   GEN9_IZ_HASHING_MASK(0);
+	wa->value = _MASKED_FIELD(wa->mask, GEN9_IZ_HASHING(2, vals[2]) |
+					    GEN9_IZ_HASHING(1, vals[1]) |
+					    GEN9_IZ_HASHING(0, vals[0]));
 
-	return skl_tune_iz_hashing(dev_priv);
+	return true;
 }
 
-static int bxt_ctx_workarounds_init(struct drm_i915_private *dev_priv)
-{
-	int ret;
+static struct i915_wa_reg skl_ctx_was[] = {
+	/* Syncing dependencies between camera and graphics */
+	{ WA_CTX(""),
+	  ALL_REVS, REG(HALF_SLICE_CHICKEN3),
+	  SET_BIT_MASKED(GEN9_DISABLE_OCL_OOB_SUPPRESS_LOGIC) },
 
-	ret = gen9_ctx_workarounds_init(dev_priv);
-	if (ret)
-		return ret;
+	{ WA_CTX("WaDisableSamplerPowerBypassForSOPingPong"),
+	  ALL_REVS, REG(HALF_SLICE_CHICKEN3),
+	  SET_BIT_MASKED(GEN8_SAMPLER_POWER_BYPASS_DIS) },
 
-	/* WaDisableThreadStallDopClockGating:bxt */
-	WA_SET_BIT_MASKED(GEN8_ROW_CHICKEN,
-			  STALL_DOP_GATING_DISABLE);
+	/* Tune IZ hashing. See intel_device_info_runtime_init() */
+	{ WA_CTX(""),
+	  ALL_REVS, REG(GEN7_GT_MODE),
+	  .mask = 0, .value = 0, .is_masked_reg = true,
+	  .pre_hook = skl_tune_iz_hashing },
+};
+
+static struct i915_wa_reg bxt_ctx_was[] = {
+	/* Syncing dependencies between camera and graphics */
+	{ WA_CTX(""),
+	  ALL_REVS, REG(HALF_SLICE_CHICKEN3),
+	  SET_BIT_MASKED(GEN9_DISABLE_OCL_OOB_SUPPRESS_LOGIC) },
+
+	{ WA_CTX("WaDisableDgMirrorFixInHalfSliceChicken5"),
+	  REVS(0, BXT_REVID_A1), REG(GEN9_HALF_SLICE_CHICKEN5),
+	  CLEAR_BIT_MASKED(GEN9_DG_MIRROR_FIX_ENABLE) },
+
+	{ WA_CTX("WaDisableMaskBasedCammingInRCC"),
+	  REVS(0, BXT_REVID_A1), REG(SLICE_ECO_CHICKEN0),
+	  SET_BIT_MASKED(PIXEL_MASK_CAMMING_DISABLE) },
+
+	{ WA_CTX("WaDisableSamplerPowerBypassForSOPingPong"),
+	  REVS(0, BXT_REVID_B0), REG(HALF_SLICE_CHICKEN3),
+	  SET_BIT_MASKED(GEN8_SAMPLER_POWER_BYPASS_DIS) },
+
+	{ WA_CTX("WaDisableThreadStallDopClockGating"),
+	  ALL_REVS, REG(GEN8_ROW_CHICKEN),
+	  SET_BIT_MASKED(STALL_DOP_GATING_DISABLE) },
+
+	{ WA_CTX("WaDisableSbeCacheDispatchPortSharing"),
+	  REVS(0, BXT_REVID_B0), REG(GEN7_HALF_SLICE_CHICKEN1),
+	  SET_BIT_MASKED(GEN7_SBE_SS_CACHE_DISPATCH_PORT_SHARING_DISABLE) },
+
+	{ WA_CTX("WaToEnableHwFixForPushConstHWBug"),
+	  REVS(BXT_REVID_C0, REVID_FOREVER), REG(COMMON_SLICE_CHICKEN2),
+	  SET_BIT_MASKED(GEN8_SBE_DISABLE_REPLAY_BUF_OPTIMIZATION) },
+};
+
+static struct i915_wa_reg kbl_ctx_was[] = {
+	/* Syncing dependencies between camera and graphics */
+	{ WA_CTX(""),
+	  ALL_REVS, REG(HALF_SLICE_CHICKEN3),
+	  SET_BIT_MASKED(GEN9_DISABLE_OCL_OOB_SUPPRESS_LOGIC) },
+
+	{ WA_CTX("WaDisableSamplerPowerBypassForSOPingPong"),
+	  ALL_REVS, REG(HALF_SLICE_CHICKEN3),
+	  SET_BIT_MASKED(GEN8_SAMPLER_POWER_BYPASS_DIS) },
+
+	{ WA_CTX("WaDisableFenceDestinationToSLM (pre-prod)"),
+	  REVS(KBL_REVID_A0, KBL_REVID_A0), REG(HDC_CHICKEN0),
+	  SET_BIT_MASKED(HDC_FENCE_DEST_SLM_DISABLE) },
+
+	{ WA_CTX("WaToEnableHwFixForPushConstHWBug"),
+	  REVS(KBL_REVID_C0, REVID_FOREVER), REG(COMMON_SLICE_CHICKEN2),
+	  SET_BIT_MASKED(GEN8_SBE_DISABLE_REPLAY_BUF_OPTIMIZATION) },
+
+	{ WA_CTX("WaDisableSbeCacheDispatchPortSharing"),
+	  ALL_REVS, REG(GEN7_HALF_SLICE_CHICKEN1),
+	  SET_BIT_MASKED(GEN7_SBE_SS_CACHE_DISPATCH_PORT_SHARING_DISABLE) },
+};
+
+static struct i915_wa_reg glk_ctx_was[] = {
+	/* Syncing dependencies between camera and graphics */
+	{ WA_CTX(""),
+	  ALL_REVS, REG(HALF_SLICE_CHICKEN3),
+	  SET_BIT_MASKED(GEN9_DISABLE_OCL_OOB_SUPPRESS_LOGIC) },
+
+	{ WA_CTX("WaToEnableHwFixForPushConstHWBug"),
+	  ALL_REVS, REG(COMMON_SLICE_CHICKEN2),
+	  SET_BIT_MASKED(GEN8_SBE_DISABLE_REPLAY_BUF_OPTIMIZATION) },
+};
+
+static struct i915_wa_reg cfl_ctx_was[] = {
+	{ WA_CTX("WaDisableSamplerPowerBypassForSOPingPong"),
+	  ALL_REVS, REG(HALF_SLICE_CHICKEN3),
+	  SET_BIT_MASKED(GEN8_SAMPLER_POWER_BYPASS_DIS) },
+
+	{ WA_CTX("WaToEnableHwFixForPushConstHWBug"),
+	  ALL_REVS, REG(COMMON_SLICE_CHICKEN2),
+	  SET_BIT_MASKED(GEN8_SBE_DISABLE_REPLAY_BUF_OPTIMIZATION) },
+
+	{ WA_CTX("WaDisableSbeCacheDispatchPortSharing"),
+	  ALL_REVS, REG(GEN7_HALF_SLICE_CHICKEN1),
+	  SET_BIT_MASKED(GEN7_SBE_SS_CACHE_DISPATCH_PORT_SHARING_DISABLE) },
+};
+
+static struct i915_wa_reg cnl_ctx_was[] = {
+	{ WA_CTX("WaForceContextSaveRestoreNonCoherent"),
+	  ALL_REVS, REG(CNL_HDC_CHICKEN0),
+	  SET_BIT_MASKED(HDC_FORCE_CONTEXT_SAVE_RESTORE_NON_COHERENT) },
+
+	{ WA_CTX("WaThrottleEUPerfToAvoidTDBackPressure (pre-prod)"),
+	  REVS(CNL_REVID_B0, CNL_REVID_B0), REG(GEN8_ROW_CHICKEN),
+	  SET_BIT_MASKED(THROTTLE_12_5) },
+
+	{ WA_CTX("WaDisableReplayBufferBankArbitrationOptimization"),
+	  ALL_REVS, REG(COMMON_SLICE_CHICKEN2),
+	  SET_BIT_MASKED(GEN8_SBE_DISABLE_REPLAY_BUF_OPTIMIZATION) },
+
+	{ WA_CTX("WaDisableEnhancedSBEVertexCaching (pre-prod)"),
+	  REVS(0, CNL_REVID_B0), REG(COMMON_SLICE_CHICKEN2),
+	  SET_BIT_MASKED(GEN8_CSC2_SBE_VUE_CACHE_CONSERVATIVE) },
+
+	{ WA_CTX("WaPushConstantDereferenceHoldDisable"),
+	  ALL_REVS, REG(GEN7_ROW_CHICKEN2),
+	  SET_BIT_MASKED(PUSH_CONSTANT_DEREF_DISABLE) },
+
+	{ WA_CTX("FtrEnableFastAnisoL1BankingFix"),
+	  ALL_REVS, REG(HALF_SLICE_CHICKEN3),
+	  SET_BIT_MASKED(CNL_FAST_ANISO_L1_BANKING_FIX) },
+
+	{ WA_CTX("WaDisable3DMidCmdPreemption"),
+	  ALL_REVS, REG(GEN8_CS_CHICKEN1),
+	  CLEAR_BIT_MASKED(GEN9_PREEMPT_3D_OBJECT_LEVEL) },
+
+	{ WA_CTX("WaDisableGPGPUMidCmdPreemption"),
+	  ALL_REVS, REG(GEN8_CS_CHICKEN1),
+	  SET_FIELD_MASKED(GEN9_PREEMPT_GPGPU_LEVEL_MASK,
+			   GEN9_PREEMPT_GPGPU_COMMAND_LEVEL) },
+};
+
+static const struct i915_wa_reg_table bdw_ctx_wa_tbl[] = {
+	{ gen8_ctx_was, ARRAY_SIZE(gen8_ctx_was) },
+	{ bdw_ctx_was,  ARRAY_SIZE(bdw_ctx_was) },
+};
+
+static const struct i915_wa_reg_table chv_ctx_wa_tbl[] = {
+	{ gen8_ctx_was, ARRAY_SIZE(gen8_ctx_was) },
+	{ chv_ctx_was,  ARRAY_SIZE(chv_ctx_was) },
+};
+
+static const struct i915_wa_reg_table skl_ctx_wa_tbl[] = {
+	{ gen9_ctx_was, ARRAY_SIZE(gen9_ctx_was) },
+	{ skl_ctx_was,  ARRAY_SIZE(skl_ctx_was) },
+};
+
+static const struct i915_wa_reg_table bxt_ctx_wa_tbl[] = {
+	{ gen9_ctx_was, ARRAY_SIZE(gen9_ctx_was) },
+	{ bxt_ctx_was,  ARRAY_SIZE(bxt_ctx_was) },
+};
+
+static const struct i915_wa_reg_table kbl_ctx_wa_tbl[] = {
+	{ gen9_ctx_was, ARRAY_SIZE(gen9_ctx_was) },
+	{ kbl_ctx_was,  ARRAY_SIZE(kbl_ctx_was) },
+};
+
+static const struct i915_wa_reg_table glk_ctx_wa_tbl[] = {
+	{ gen9_ctx_was, ARRAY_SIZE(gen9_ctx_was) },
+	{ glk_ctx_was,  ARRAY_SIZE(glk_ctx_was) },
+};
+
+static const struct i915_wa_reg_table cfl_ctx_wa_tbl[] = {
+	{ gen9_ctx_was, ARRAY_SIZE(gen9_ctx_was) },
+	{ cfl_ctx_was,  ARRAY_SIZE(cfl_ctx_was) },
+};
+
+static const struct i915_wa_reg_table cnl_ctx_wa_tbl[] = {
+	{ cnl_ctx_was,  ARRAY_SIZE(cnl_ctx_was) },
+};
+
+void intel_ctx_workarounds_get(struct drm_i915_private *dev_priv,
+			       const struct i915_wa_reg_table **wa_table,
+			       uint *table_count)
+{
+	*wa_table = NULL;
+	*table_count = 0;
 
-	/* WaDisableSbeCacheDispatchPortSharing:bxt */
-	if (IS_BXT_REVID(dev_priv, 0, BXT_REVID_B0)) {
-		WA_SET_BIT_MASKED(
-			GEN7_HALF_SLICE_CHICKEN1,
-			GEN7_SBE_SS_CACHE_DISPATCH_PORT_SHARING_DISABLE);
+	if (INTEL_GEN(dev_priv) < 8)
+		return;
+	else if (IS_BROADWELL(dev_priv)) {
+		*wa_table = bdw_ctx_wa_tbl;
+		*table_count = ARRAY_SIZE(bdw_ctx_wa_tbl);
+	} else if (IS_CHERRYVIEW(dev_priv)) {
+		*wa_table = chv_ctx_wa_tbl;
+		*table_count = ARRAY_SIZE(chv_ctx_wa_tbl);
+	} else if (IS_SKYLAKE(dev_priv)) {
+		*wa_table = skl_ctx_wa_tbl;
+		*table_count = ARRAY_SIZE(skl_ctx_wa_tbl);
+	} else if (IS_BROXTON(dev_priv)) {
+		*wa_table = bxt_ctx_wa_tbl;
+		*table_count = ARRAY_SIZE(bxt_ctx_wa_tbl);
+	} else if (IS_KABYLAKE(dev_priv)) {
+		*wa_table = kbl_ctx_wa_tbl;
+		*table_count = ARRAY_SIZE(kbl_ctx_wa_tbl);
+	} else if (IS_GEMINILAKE(dev_priv)) {
+		*wa_table = glk_ctx_wa_tbl;
+		*table_count = ARRAY_SIZE(glk_ctx_wa_tbl);
+	} else if (IS_COFFEELAKE(dev_priv)) {
+		*wa_table = cfl_ctx_wa_tbl;
+		*table_count = ARRAY_SIZE(cfl_ctx_wa_tbl);
+	} else if (IS_CANNONLAKE(dev_priv)) {
+		*wa_table = cnl_ctx_wa_tbl;
+		*table_count = ARRAY_SIZE(cnl_ctx_wa_tbl);
+	} else {
+		MISSING_CASE(INTEL_GEN(dev_priv));
+		return;
 	}
-
-	/* WaToEnableHwFixForPushConstHWBug:bxt */
-	if (IS_BXT_REVID(dev_priv, BXT_REVID_C0, REVID_FOREVER))
-		WA_SET_BIT_MASKED(COMMON_SLICE_CHICKEN2,
-				  GEN8_SBE_DISABLE_REPLAY_BUF_OPTIMIZATION);
-
-	return 0;
-}
-
-static int kbl_ctx_workarounds_init(struct drm_i915_private *dev_priv)
-{
-	int ret;
-
-	ret = gen9_ctx_workarounds_init(dev_priv);
-	if (ret)
-		return ret;
-
-	/* WaDisableFenceDestinationToSLM:kbl (pre-prod) */
-	if (IS_KBL_REVID(dev_priv, KBL_REVID_A0, KBL_REVID_A0))
-		WA_SET_BIT_MASKED(HDC_CHICKEN0,
-				  HDC_FENCE_DEST_SLM_DISABLE);
-
-	/* WaToEnableHwFixForPushConstHWBug:kbl */
-	if (IS_KBL_REVID(dev_priv, KBL_REVID_C0, REVID_FOREVER))
-		WA_SET_BIT_MASKED(COMMON_SLICE_CHICKEN2,
-				  GEN8_SBE_DISABLE_REPLAY_BUF_OPTIMIZATION);
-
-	/* WaDisableSbeCacheDispatchPortSharing:kbl */
-	WA_SET_BIT_MASKED(
-		GEN7_HALF_SLICE_CHICKEN1,
-		GEN7_SBE_SS_CACHE_DISPATCH_PORT_SHARING_DISABLE);
-
-	return 0;
-}
-
-static int glk_ctx_workarounds_init(struct drm_i915_private *dev_priv)
-{
-	int ret;
-
-	ret = gen9_ctx_workarounds_init(dev_priv);
-	if (ret)
-		return ret;
-
-	/* WaToEnableHwFixForPushConstHWBug:glk */
-	WA_SET_BIT_MASKED(COMMON_SLICE_CHICKEN2,
-			  GEN8_SBE_DISABLE_REPLAY_BUF_OPTIMIZATION);
-
-	return 0;
-}
-
-static int cfl_ctx_workarounds_init(struct drm_i915_private *dev_priv)
-{
-	int ret;
-
-	ret = gen9_ctx_workarounds_init(dev_priv);
-	if (ret)
-		return ret;
-
-	/* WaToEnableHwFixForPushConstHWBug:cfl */
-	WA_SET_BIT_MASKED(COMMON_SLICE_CHICKEN2,
-			  GEN8_SBE_DISABLE_REPLAY_BUF_OPTIMIZATION);
-
-	/* WaDisableSbeCacheDispatchPortSharing:cfl */
-	WA_SET_BIT_MASKED(
-		GEN7_HALF_SLICE_CHICKEN1,
-		GEN7_SBE_SS_CACHE_DISPATCH_PORT_SHARING_DISABLE);
-
-	return 0;
 }
 
-static int cnl_ctx_workarounds_init(struct drm_i915_private *dev_priv)
+static uint ctx_workarounds_init(struct drm_i915_private *dev_priv,
+				 const struct i915_wa_reg_table *wa_table,
+				 uint table_count)
 {
-	/* WaForceContextSaveRestoreNonCoherent:cnl */
-	WA_SET_BIT_MASKED(CNL_HDC_CHICKEN0,
-			  HDC_FORCE_CONTEXT_SAVE_RESTORE_NON_COHERENT);
-
-	/* WaThrottleEUPerfToAvoidTDBackPressure:cnl(pre-prod) */
-	if (IS_CNL_REVID(dev_priv, CNL_REVID_B0, CNL_REVID_B0))
-		WA_SET_BIT_MASKED(GEN8_ROW_CHICKEN, THROTTLE_12_5);
-
-	/* WaDisableReplayBufferBankArbitrationOptimization:cnl */
-	WA_SET_BIT_MASKED(COMMON_SLICE_CHICKEN2,
-			  GEN8_SBE_DISABLE_REPLAY_BUF_OPTIMIZATION);
+	uint total_count = 0;
+	int i, j;
 
-	/* WaDisableEnhancedSBEVertexCaching:cnl (pre-prod) */
-	if (IS_CNL_REVID(dev_priv, 0, CNL_REVID_B0))
-		WA_SET_BIT_MASKED(COMMON_SLICE_CHICKEN2,
-				  GEN8_CSC2_SBE_VUE_CACHE_CONSERVATIVE);
+	for (i = 0; i < table_count; i++) {
+		struct i915_wa_reg *wa = wa_table[i].table;
 
-	/* WaPushConstantDereferenceHoldDisable:cnl */
-	WA_SET_BIT_MASKED(GEN7_ROW_CHICKEN2, PUSH_CONSTANT_DEREF_DISABLE);
+		for (j = 0; j < wa_table[i].count; j++) {
+			wa[j].applied =
+				IS_REVID(dev_priv, wa[j].since, wa[j].until);
 
-	/* FtrEnableFastAnisoL1BankingFix:cnl */
-	WA_SET_BIT_MASKED(HALF_SLICE_CHICKEN3, CNL_FAST_ANISO_L1_BANKING_FIX);
+			if (wa[j].applied && wa[j].pre_hook)
+				wa[j].applied = wa[j].pre_hook(dev_priv, &wa[j]);
 
-	/* WaDisable3DMidCmdPreemption:cnl */
-	WA_CLR_BIT_MASKED(GEN8_CS_CHICKEN1, GEN9_PREEMPT_3D_OBJECT_LEVEL);
+			/* Post-hooks do not make sense in context WAs */
+			GEM_BUG_ON(wa[j].post_hook);
 
-	/* WaDisableGPGPUMidCmdPreemption:cnl */
-	WA_SET_FIELD_MASKED(GEN8_CS_CHICKEN1, GEN9_PREEMPT_GPGPU_LEVEL_MASK,
-			    GEN9_PREEMPT_GPGPU_COMMAND_LEVEL);
+			/* We expect all context registers to be masked */
+			GEM_BUG_ON(!wa[j].is_masked_reg);
+			GEM_BUG_ON(wa[j].mask & 0xffff0000);
 
-	return 0;
-}
-
-int intel_ctx_workarounds_init(struct drm_i915_private *dev_priv)
-{
-	int err;
-
-	dev_priv->workarounds.count = 0;
-
-	if (INTEL_GEN(dev_priv) < 8)
-		err = 0;
-	else if (IS_BROADWELL(dev_priv))
-		err = bdw_ctx_workarounds_init(dev_priv);
-	else if (IS_CHERRYVIEW(dev_priv))
-		err = chv_ctx_workarounds_init(dev_priv);
-	else if (IS_SKYLAKE(dev_priv))
-		err = skl_ctx_workarounds_init(dev_priv);
-	else if (IS_BROXTON(dev_priv))
-		err = bxt_ctx_workarounds_init(dev_priv);
-	else if (IS_KABYLAKE(dev_priv))
-		err = kbl_ctx_workarounds_init(dev_priv);
-	else if (IS_GEMINILAKE(dev_priv))
-		err = glk_ctx_workarounds_init(dev_priv);
-	else if (IS_COFFEELAKE(dev_priv))
-		err = cfl_ctx_workarounds_init(dev_priv);
-	else if (IS_CANNONLAKE(dev_priv))
-		err = cnl_ctx_workarounds_init(dev_priv);
-	else {
-		MISSING_CASE(INTEL_GEN(dev_priv));
-		err = 0;
+			if (wa[j].applied)
+				total_count++;
+		}
 	}
-	if (err)
-		return err;
 
-	DRM_DEBUG_DRIVER("Number of context specific w/a: %d\n",
-			 dev_priv->workarounds.count);
-	return 0;
+	dev_priv->workarounds.ctx_count = total_count;
+	DRM_DEBUG_DRIVER("Number of context specific w/a: %u\n", total_count);
+
+	return total_count;
 }
 
 int intel_ctx_workarounds_emit(struct drm_i915_gem_request *req)
 {
-	struct i915_workarounds *w = &req->i915->workarounds;
+	struct drm_i915_private *dev_priv = req->i915;
+	const struct i915_wa_reg_table *wa_table;
+	uint table_count, total_count;
 	u32 *cs;
-	int ret, i;
+	int i, j, ret;
+
+	intel_ctx_workarounds_get(dev_priv, &wa_table, &table_count);
 
-	if (w->count == 0)
+	total_count = ctx_workarounds_init(dev_priv, wa_table, table_count);
+	if (total_count == 0)
 		return 0;
 
 	ret = req->engine->emit_flush(req, EMIT_BARRIER);
 	if (ret)
 		return ret;
 
-	cs = intel_ring_begin(req, (w->count * 2 + 2));
+	cs = intel_ring_begin(req, (total_count * 2 + 2));
 	if (IS_ERR(cs))
 		return PTR_ERR(cs);
 
-	*cs++ = MI_LOAD_REGISTER_IMM(w->count);
-	for (i = 0; i < w->count; i++) {
-		*cs++ = i915_mmio_reg_offset(w->reg[i].addr);
-		*cs++ = w->reg[i].value;
+	*cs++ = MI_LOAD_REGISTER_IMM(total_count);
+	for (i = 0; i < table_count; i++) {
+		const struct i915_wa_reg *wa = wa_table[i].table;
+
+		for (j = 0; j < wa_table[i].count; j++) {
+			if (!wa[j].applied)
+				continue;
+
+			*cs++ = i915_mmio_reg_offset(wa[j].addr);
+			*cs++ = wa[j].value;
+		}
 	}
 	*cs++ = MI_NOOP;
 
diff --git a/drivers/gpu/drm/i915/intel_workarounds.h b/drivers/gpu/drm/i915/intel_workarounds.h
index bba51bb..38763e7 100644
--- a/drivers/gpu/drm/i915/intel_workarounds.h
+++ b/drivers/gpu/drm/i915/intel_workarounds.h
@@ -25,7 +25,9 @@
 #ifndef _I915_WORKAROUNDS_H_
 #define _I915_WORKAROUNDS_H_
 
-int intel_ctx_workarounds_init(struct drm_i915_private *dev_priv);
+void intel_ctx_workarounds_get(struct drm_i915_private *dev_priv,
+			       const struct i915_wa_reg_table **wa_table,
+			       uint *table_count);
 int intel_ctx_workarounds_emit(struct drm_i915_gem_request *req);
 
 void intel_gt_workarounds_apply(struct drm_i915_private *dev_priv);
-- 
1.9.1
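
Note on the table format: a minimal standalone sketch (plain C, outside the
kernel) of how the masked-register values and the since/until plus pre-hook
filtering in this patch fit together. The sketch_ names and the trimmed struct
are hypothetical stand-ins for struct i915_wa_reg and ctx_workarounds_init(),
and the inclusive revision check assumes IS_REVID() compares with >= and <=.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Same shape as the patch's MASK() helpers: write-enable bits go in the
 * upper 16 bits, the new bit values in the lower 16 bits.
 */
#define MASK(mask, value)	((uint32_t)(mask) << 16 | (uint32_t)(value))
#define MASK_ENABLE(x)		MASK((x), (x))
#define MASK_DISABLE(x)		MASK((x), 0)

struct sketch_wa;
typedef bool (*sketch_pre_hook)(const void *dev, struct sketch_wa *wa);

/* Hypothetical, trimmed-down stand-in for struct i915_wa_reg. */
struct sketch_wa {
	const char *name;
	uint8_t since, until;		/* inclusive revision range */
	uint32_t mask, value;
	sketch_pre_hook pre_hook;	/* optional extra condition */
	bool applied;
};

/* Mark the entries that apply to revision 'rev' and return how many do. */
static unsigned int sketch_wa_init(const void *dev, uint8_t rev,
				   struct sketch_wa *wa, size_t count)
{
	unsigned int total = 0;
	size_t i;

	for (i = 0; i < count; i++) {
		/* Masked registers only carry 16 bits of payload. */
		assert((wa[i].mask & 0xffff0000u) == 0);

		wa[i].applied = rev >= wa[i].since && rev <= wa[i].until;
		if (wa[i].applied && wa[i].pre_hook)
			wa[i].applied = wa[i].pre_hook(dev, &wa[i]);
		if (wa[i].applied)
			total++;
	}

	return total;
}
```

With one entry covering revisions 0..0xff and another limited to 0..1, calling
sketch_wa_init() for revision 3 counts only the first entry, and
MASK_ENABLE(0x10) evaluates to 0x00100010.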



* [RFC PATCH 05/20] drm/i915: Transform GT WAs into static tables
  2017-11-03 18:09 [RFC PATCH v5 00/20] Refactor HW workaround code Oscar Mateo
                   ` (3 preceding siblings ...)
  2017-11-03 18:09 ` [RFC PATCH 04/20] drm/i915: Transform context WAs into static tables Oscar Mateo
@ 2017-11-03 18:09 ` Oscar Mateo
  2017-11-03 18:09 ` [RFC PATCH 06/20] drm/i915: Transform Whitelist " Oscar Mateo
                   ` (17 subsequent siblings)
  22 siblings, 0 replies; 30+ messages in thread
From: Oscar Mateo @ 2017-11-03 18:09 UTC (permalink / raw)
  To: intel-gfx

This is for WAs that need to touch global MMIO registers
related to GT.

Suggested-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
---
 drivers/gpu/drm/i915/i915_drv.h          |   1 +
 drivers/gpu/drm/i915/intel_workarounds.c | 404 +++++++++++++++++++------------
 drivers/gpu/drm/i915/intel_workarounds.h |   3 +
 3 files changed, 253 insertions(+), 155 deletions(-)
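
The SET_BIT/CLEAR_BIT/SET_FIELD macros added below store a plain mask/value
pair rather than a masked-register word. A minimal sketch of how such a pair
could be turned into the word that gets written, assuming non-masked GT
registers are updated with a read-modify-write and masked ones are written
directly. sketch_wa_write_value() is a hypothetical helper, not a function
from this series.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Compute the word to write for one table entry, given the register's
 * current contents. For masked registers 'value' already carries its own
 * write-enable bits, so the current contents are ignored.
 */
static uint32_t sketch_wa_write_value(uint32_t current, uint32_t mask,
				      uint32_t value, bool is_masked_reg)
{
	if (is_masked_reg)
		return value;

	/* A plain value must not set bits outside its mask. */
	assert((value & ~mask) == 0);

	return (current & ~mask) | value;
}
```

For a register currently holding 0xf0, a SET_FIELD-style entry with mask 0x3
and value 0x1 yields 0xf1, leaving the bits outside the mask untouched.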

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 1c73fec..72b5d80 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2007,6 +2007,7 @@ struct i915_wa_reg_table {
 
 struct i915_workarounds {
 	u32 ctx_count;
+	u32 gt_count;
 	u32 hw_whitelist_count[I915_NUM_ENGINES];
 };
 
diff --git a/drivers/gpu/drm/i915/intel_workarounds.c b/drivers/gpu/drm/i915/intel_workarounds.c
index b00899e..b07fbd0 100644
--- a/drivers/gpu/drm/i915/intel_workarounds.c
+++ b/drivers/gpu/drm/i915/intel_workarounds.c
@@ -29,6 +29,10 @@
 	.name = (wa),			\
 	.type = I915_WA_TYPE_CONTEXT
 
+#define WA_GT(wa)			\
+	.name = (wa),			\
+	.type = I915_WA_TYPE_GT
+
 #define ALL_REVS		\
 	.since = 0,		\
 	.until = REVID_FOREVER
@@ -40,6 +44,18 @@
 #define REG(a)		\
 	.addr = (a)
 
+#define SET_BIT(m) 		\
+	.mask = (m),		\
+	.value = (m)
+
+#define CLEAR_BIT(m) 		\
+	.mask = (m),		\
+	.value = 0
+
+#define SET_FIELD(m, v) 	\
+	.mask = (m),		\
+	.value = (v)
+
 #define MASK(mask, value)	((mask) << 16 | (value))
 #define MASK_ENABLE(x)		(MASK((x), (x)))
 #define MASK_DISABLE(x)		(MASK((x), 0))
@@ -575,196 +591,274 @@ int intel_ctx_workarounds_emit(struct drm_i915_gem_request *req)
 	return 0;
 }
 
-static void bdw_gt_workarounds_apply(struct drm_i915_private *dev_priv)
+static uint mmio_workarounds_apply(struct drm_i915_private *dev_priv,
+				   const struct i915_wa_reg_table *wa_table,
+				   uint table_count)
 {
-}
+	uint total_count = 0;
+	int i, j;
 
-static void chv_gt_workarounds_apply(struct drm_i915_private *dev_priv)
-{
-}
+	for (i = 0; i < table_count; i++) {
+		struct i915_wa_reg *wa = wa_table[i].table;
 
-static void gen9_gt_workarounds_apply(struct drm_i915_private *dev_priv)
-{
-	if (HAS_LLC(dev_priv)) {
-		/* WaCompressedResourceSamplerPbeMediaNewHashMode:skl,kbl
-		 *
-		 * Must match Display Engine. See
-		 * WaCompressedResourceDisplayNewHashMode.
-		 */
-		I915_WRITE(MMCD_MISC_CTRL,
-			   I915_READ(MMCD_MISC_CTRL) |
-			   MMCD_PCLA |
-			   MMCD_HOTSPOT_EN);
-	}
+		for (j = 0; j < wa_table[i].count; j++) {
+			wa[j].applied =
+				IS_REVID(dev_priv, wa[j].since, wa[j].until);
 
-	/* WaContextSwitchWithConcurrentTLBInvalidate:skl,bxt,kbl,glk,cfl */
-	I915_WRITE(GEN9_CSFE_CHICKEN1_RCS,
-		   _MASKED_BIT_ENABLE(GEN9_PREEMPT_GPGPU_SYNC_SWITCH_DISABLE));
+			if (wa[j].applied && wa[j].pre_hook)
+				wa[j].applied = wa[j].pre_hook(dev_priv, &wa[j]);
 
-	/* WaEnableLbsSlaRetryTimerDecrement:skl,bxt,kbl,glk,cfl */
-	I915_WRITE(BDW_SCRATCH1, I915_READ(BDW_SCRATCH1) |
-		   GEN9_LBS_SLA_RETRY_TIMER_DECREMENT_ENABLE);
+			if (wa[j].applied) {
+				i915_reg_t addr = wa[j].addr;
+				u32 value = wa[j].value;
+				u32 mask = wa[j].mask;
 
-	/* WaDisableKillLogic:bxt,skl,kbl */
-	if (!IS_COFFEELAKE(dev_priv))
-		I915_WRITE(GAM_ECOCHK, I915_READ(GAM_ECOCHK) |
-			   ECOCHK_DIS_TLB);
+				if (wa[j].is_masked_reg) {
+					GEM_BUG_ON(mask & 0xffff0000);
+					I915_WRITE(addr, value);
+				} else {
+					I915_WRITE(addr,
+						(I915_READ(addr) & ~mask) |
+						value);
+				}
 
-	/* WaDisableHDCInvalidation:skl,bxt,kbl,cfl */
-	I915_WRITE(GAM_ECOCHK, I915_READ(GAM_ECOCHK) |
-		   BDW_DISABLE_HDC_INVALIDATION);
+				if (wa[j].post_hook)
+					wa[j].post_hook(dev_priv, &wa[j]);
 
-	/* WaOCLCoherentLineFlush:skl,bxt,kbl,cfl */
-	I915_WRITE(GEN8_L3SQCREG4, (I915_READ(GEN8_L3SQCREG4) |
-				    GEN8_LQSC_FLUSH_COHERENT_LINES));
+				total_count++;
+			}
+		}
+	}
 
-	/* WaEnablePreemptionGranularityControlByUMD:skl,bxt,kbl,cfl,[cnl] */
-	I915_WRITE(GEN7_FF_SLICE_CS_CHICKEN1,
-		   _MASKED_BIT_ENABLE(GEN9_FFSC_PERCTX_PREEMPT_CTRL));
+	return total_count;
 }
 
-static void skl_gt_workarounds_apply(struct drm_i915_private *dev_priv)
-{
-	gen9_gt_workarounds_apply(dev_priv);
+static struct i915_wa_reg gen8_gt_was[] = {
+};
 
-	/* WaEnableGapsTsvCreditFix:skl */
-	I915_WRITE(GEN8_GARBCNTL, (I915_READ(GEN8_GARBCNTL) |
-				   GEN9_GAPS_TSV_CREDIT_DISABLE));
+static struct i915_wa_reg bdw_gt_was[] = {
+};
 
-	/* WaDisableGafsUnitClkGating:skl */
-	I915_WRITE(GEN7_UCGCTL4, (I915_READ(GEN7_UCGCTL4) |
-				  GEN8_EU_GAUNIT_CLOCK_GATE_DISABLE));
+static struct i915_wa_reg chv_gt_was[] = {
+};
 
-	/* WaInPlaceDecompressionHang:skl */
-	if (IS_SKL_REVID(dev_priv, SKL_REVID_H0, REVID_FOREVER))
-		I915_WRITE(GEN9_GAMT_ECO_REG_RW_IA,
-			   (I915_READ(GEN9_GAMT_ECO_REG_RW_IA) |
-			    GAMT_ECO_ENABLE_IN_PLACE_DECOMPRESS));
-}
+static struct i915_wa_reg gen9_gt_was[] = {
+	{ WA_GT("WaCompressedResourceSamplerPbeMediaNewHashMode"),
+	  ALL_REVS, REG(MMCD_MISC_CTRL),
+	  SET_BIT(MMCD_PCLA | MMCD_HOTSPOT_EN),
+	  .pre_hook = has_llc },
 
-static void bxt_gt_workarounds_apply(struct drm_i915_private *dev_priv)
-{
-	gen9_gt_workarounds_apply(dev_priv);
+	{ WA_GT("WaContextSwitchWithConcurrentTLBInvalidate"),
+	  ALL_REVS, REG(GEN9_CSFE_CHICKEN1_RCS),
+	  SET_BIT_MASKED(GEN9_PREEMPT_GPGPU_SYNC_SWITCH_DISABLE) },
+
+	{ WA_GT("WaEnableLbsSlaRetryTimerDecrement"),
+	  ALL_REVS, REG(BDW_SCRATCH1),
+	  SET_BIT(GEN9_LBS_SLA_RETRY_TIMER_DECREMENT_ENABLE) },
+
+	{ WA_GT("WaDisableHDCInvalidation"),
+	  ALL_REVS, REG(GAM_ECOCHK),
+	  SET_BIT(BDW_DISABLE_HDC_INVALIDATION) },
+
+	{ WA_GT("WaOCLCoherentLineFlush"),
+	  ALL_REVS, REG(GEN8_L3SQCREG4),
+	  SET_BIT(GEN8_LQSC_FLUSH_COHERENT_LINES) },
+
+	{ WA_GT("WaEnablePreemptionGranularityControlByUMD"),
+	  ALL_REVS, REG(GEN7_FF_SLICE_CS_CHICKEN1),
+	  SET_BIT_MASKED(GEN9_FFSC_PERCTX_PREEMPT_CTRL) },
+};
+
+static struct i915_wa_reg skl_gt_was[] = {
+	{ WA_GT("WaDisableKillLogic"),
+	  ALL_REVS, REG(GAM_ECOCHK),
+	  SET_BIT(ECOCHK_DIS_TLB) },
+
+	{ WA_GT("WaEnableGapsTsvCreditFix"),
+	  ALL_REVS, REG(GEN8_GARBCNTL),
+	  SET_BIT(GEN9_GAPS_TSV_CREDIT_DISABLE) },
+
+	{ WA_GT("WaDisableGafsUnitClkGating"),
+	  ALL_REVS, REG(GEN7_UCGCTL4),
+	  SET_BIT(GEN8_EU_GAUNIT_CLOCK_GATE_DISABLE) },
+
+	{ WA_GT("WaInPlaceDecompressionHang"),
+	  REVS(SKL_REVID_H0, REVID_FOREVER), REG(GEN9_GAMT_ECO_REG_RW_IA),
+	  SET_BIT(GAMT_ECO_ENABLE_IN_PLACE_DECOMPRESS) },
+};
+
+static struct i915_wa_reg bxt_gt_was[] = {
+	{ WA_GT("WaDisableKillLogic"),
+	  ALL_REVS, REG(GAM_ECOCHK),
+	  SET_BIT(ECOCHK_DIS_TLB) },
 
-	/* WaStoreMultiplePTEenable:bxt */
 	/* This is a requirement according to Hardware specification */
-	if (IS_BXT_REVID(dev_priv, 0, BXT_REVID_A1))
-		I915_WRITE(TILECTL, I915_READ(TILECTL) | TILECTL_TLBPF);
+	{ WA_GT("WaStoreMultiplePTEenable"),
+	  REVS(0, BXT_REVID_A1), REG(TILECTL),
+	  SET_BIT(TILECTL_TLBPF) },
+
+	{ WA_GT("WaSetClckGatingDisableMedia"),
+	  REVS(0, BXT_REVID_A1), REG(GEN7_MISCCPCTL),
+	  CLEAR_BIT(GEN8_DOP_CLOCK_GATE_MEDIA_ENABLE) },
+
+	{ WA_GT("WaDisablePooledEuLoadBalancingFix"),
+	  REVS(BXT_REVID_B0, REVID_FOREVER), REG(FF_SLICE_CS_CHICKEN2),
+	  SET_BIT_MASKED(GEN9_POOLED_EU_LOAD_BALANCING_FIX_DISABLE) },
+
+	{ WA_GT("WaProgramL3SqcReg1DefaultForPerf"),
+	  REVS(BXT_REVID_B0, REVID_FOREVER), REG(GEN8_L3SQCREG1),
+	  SET_FIELD(L3_PRIO_CREDITS_MASK, L3_GENERAL_PRIO_CREDITS(62) |
+					  L3_HIGH_PRIO_CREDITS(2)) },
+
+	{ WA_GT("WaInPlaceDecompressionHang"),
+	  REVS(BXT_REVID_C0, REVID_FOREVER), REG(GEN9_GAMT_ECO_REG_RW_IA),
+	  SET_BIT(GAMT_ECO_ENABLE_IN_PLACE_DECOMPRESS) },
+};
 
-	/* WaSetClckGatingDisableMedia:bxt */
-	if (IS_BXT_REVID(dev_priv, 0, BXT_REVID_A1)) {
-		I915_WRITE(GEN7_MISCCPCTL, (I915_READ(GEN7_MISCCPCTL) &
-					    ~GEN8_DOP_CLOCK_GATE_MEDIA_ENABLE));
-	}
+static struct i915_wa_reg kbl_gt_was[] = {
+	{ WA_GT("WaDisableKillLogic"),
+	  ALL_REVS, REG(GAM_ECOCHK),
+	  SET_BIT(ECOCHK_DIS_TLB) },
 
-	/* WaDisablePooledEuLoadBalancingFix:bxt */
-	if (IS_BXT_REVID(dev_priv, BXT_REVID_B0, REVID_FOREVER)) {
-		I915_WRITE(FF_SLICE_CS_CHICKEN2,
-			   _MASKED_BIT_ENABLE(GEN9_POOLED_EU_LOAD_BALANCING_FIX_DISABLE));
-	}
+	{ WA_GT("WaEnableGapsTsvCreditFix"),
+	  ALL_REVS, REG(GEN8_GARBCNTL),
+	  SET_BIT(GEN9_GAPS_TSV_CREDIT_DISABLE) },
 
-	/* WaProgramL3SqcReg1DefaultForPerf:bxt */
-	if (IS_BXT_REVID(dev_priv, BXT_REVID_B0, REVID_FOREVER)) {
-		u32 val = I915_READ(GEN8_L3SQCREG1);
-		val &= ~L3_PRIO_CREDITS_MASK;
-		val |= L3_GENERAL_PRIO_CREDITS(62) | L3_HIGH_PRIO_CREDITS(2);
-		I915_WRITE(GEN8_L3SQCREG1, val);
-	}
+	{ WA_GT("WaDisableDynamicCreditSharing"),
+	  REVS(0, KBL_REVID_B0), REG(GAMT_CHKN_BIT_REG),
+	  SET_BIT(GAMT_CHKN_DISABLE_DYNAMIC_CREDIT_SHARING) },
 
-	/* WaInPlaceDecompressionHang:bxt */
-	if (IS_BXT_REVID(dev_priv, BXT_REVID_C0, REVID_FOREVER))
-		I915_WRITE(GEN9_GAMT_ECO_REG_RW_IA,
-			   (I915_READ(GEN9_GAMT_ECO_REG_RW_IA) |
-			    GAMT_ECO_ENABLE_IN_PLACE_DECOMPRESS));
-}
+	{ WA_GT("WaDisableGafsUnitClkGating"),
+	  ALL_REVS, REG(GEN7_UCGCTL4),
+	  SET_BIT(GEN8_EU_GAUNIT_CLOCK_GATE_DISABLE) },
 
-static void kbl_gt_workarounds_apply(struct drm_i915_private *dev_priv)
-{
-	gen9_gt_workarounds_apply(dev_priv);
-
-	/* WaEnableGapsTsvCreditFix:kbl */
-	I915_WRITE(GEN8_GARBCNTL, (I915_READ(GEN8_GARBCNTL) |
-				   GEN9_GAPS_TSV_CREDIT_DISABLE));
-
-	/* WaDisableDynamicCreditSharing:kbl */
-	if (IS_KBL_REVID(dev_priv, 0, KBL_REVID_B0))
-		I915_WRITE(GAMT_CHKN_BIT_REG,
-			   (I915_READ(GAMT_CHKN_BIT_REG) |
-			    GAMT_CHKN_DISABLE_DYNAMIC_CREDIT_SHARING));
-
-	/* WaDisableGafsUnitClkGating:kbl */
-	I915_WRITE(GEN7_UCGCTL4, (I915_READ(GEN7_UCGCTL4) |
-				  GEN8_EU_GAUNIT_CLOCK_GATE_DISABLE));
-
-	/* WaInPlaceDecompressionHang:kbl */
-	I915_WRITE(GEN9_GAMT_ECO_REG_RW_IA,
-		   (I915_READ(GEN9_GAMT_ECO_REG_RW_IA) |
-		    GAMT_ECO_ENABLE_IN_PLACE_DECOMPRESS));
-}
+	{ WA_GT("WaInPlaceDecompressionHang"),
+	  ALL_REVS, REG(GEN9_GAMT_ECO_REG_RW_IA),
+	  SET_BIT(GAMT_ECO_ENABLE_IN_PLACE_DECOMPRESS) },
+};
 
-static void glk_gt_workarounds_apply(struct drm_i915_private *dev_priv)
-{
-	gen9_gt_workarounds_apply(dev_priv);
-}
+static struct i915_wa_reg glk_gt_was[] = {
+	{ WA_GT("WaDisableKillLogic"),
+	  ALL_REVS, REG(GAM_ECOCHK),
+	  SET_BIT(ECOCHK_DIS_TLB) },
+};
 
-static void cfl_gt_workarounds_apply(struct drm_i915_private *dev_priv)
-{
-	gen9_gt_workarounds_apply(dev_priv);
+static struct i915_wa_reg cfl_gt_was[] = {
+	{ WA_GT("WaEnableGapsTsvCreditFix"),
+	  ALL_REVS, REG(GEN8_GARBCNTL),
+	  SET_BIT(GEN9_GAPS_TSV_CREDIT_DISABLE) },
 
-	/* WaEnableGapsTsvCreditFix:cfl */
-	I915_WRITE(GEN8_GARBCNTL, (I915_READ(GEN8_GARBCNTL) |
-				   GEN9_GAPS_TSV_CREDIT_DISABLE));
+	{ WA_GT("WaDisableGafsUnitClkGating"),
+	  ALL_REVS, REG(GEN7_UCGCTL4),
+	  SET_BIT(GEN8_EU_GAUNIT_CLOCK_GATE_DISABLE) },
 
-	/* WaDisableGafsUnitClkGating:cfl */
-	I915_WRITE(GEN7_UCGCTL4, (I915_READ(GEN7_UCGCTL4) |
-				  GEN8_EU_GAUNIT_CLOCK_GATE_DISABLE));
+	{ WA_GT("WaInPlaceDecompressionHang"),
+	  ALL_REVS, REG(GEN9_GAMT_ECO_REG_RW_IA),
+	  SET_BIT(GAMT_ECO_ENABLE_IN_PLACE_DECOMPRESS) },
+};
 
-	/* WaInPlaceDecompressionHang:cfl */
-	I915_WRITE(GEN9_GAMT_ECO_REG_RW_IA,
-		   (I915_READ(GEN9_GAMT_ECO_REG_RW_IA) |
-		    GAMT_ECO_ENABLE_IN_PLACE_DECOMPRESS));
-}
+static struct i915_wa_reg cnl_gt_was[] = {
+	{ WA_GT("WaDisableI2mCycleOnWRPort"),
+	  REVS(CNL_REVID_B0, CNL_REVID_B0), REG(GAMT_CHKN_BIT_REG),
+	  SET_BIT(GAMT_CHKN_DISABLE_I2M_CYCLE_ON_WR_PORT) },
 
-static void cnl_gt_workarounds_apply(struct drm_i915_private *dev_priv)
-{
-	/* WaDisableI2mCycleOnWRPort:cnl (pre-prod) */
-	if (IS_CNL_REVID(dev_priv, CNL_REVID_B0, CNL_REVID_B0))
-		I915_WRITE(GAMT_CHKN_BIT_REG,
-			   (I915_READ(GAMT_CHKN_BIT_REG) |
-			    GAMT_CHKN_DISABLE_I2M_CYCLE_ON_WR_PORT));
+	{ WA_GT("WaInPlaceDecompressionHang"),
+	  ALL_REVS, REG(GEN9_GAMT_ECO_REG_RW_IA),
+	  SET_BIT(GAMT_ECO_ENABLE_IN_PLACE_DECOMPRESS) },
 
-	/* WaInPlaceDecompressionHang:cnl */
-	I915_WRITE(GEN9_GAMT_ECO_REG_RW_IA,
-		   (I915_READ(GEN9_GAMT_ECO_REG_RW_IA) |
-		    GAMT_ECO_ENABLE_IN_PLACE_DECOMPRESS));
+	{ WA_GT("WaEnablePreemptionGranularityControlByUMD"),
+	  ALL_REVS, REG(GEN7_FF_SLICE_CS_CHICKEN1),
+	  SET_BIT_MASKED(GEN9_FFSC_PERCTX_PREEMPT_CTRL) },
+};
 
-	/* WaEnablePreemptionGranularityControlByUMD:cnl */
-	I915_WRITE(GEN7_FF_SLICE_CS_CHICKEN1,
-		   _MASKED_BIT_ENABLE(GEN9_FFSC_PERCTX_PREEMPT_CTRL));
-}
+static const struct i915_wa_reg_table bdw_gt_wa_tbl[] = {
+	{ gen8_gt_was, ARRAY_SIZE(gen8_gt_was) },
+	{ bdw_gt_was,  ARRAY_SIZE(bdw_gt_was) },
+};
 
-void intel_gt_workarounds_apply(struct drm_i915_private *dev_priv)
+static const struct i915_wa_reg_table chv_gt_wa_tbl[] = {
+	{ gen8_gt_was, ARRAY_SIZE(gen8_gt_was) },
+	{ chv_gt_was,  ARRAY_SIZE(chv_gt_was) },
+};
+
+static const struct i915_wa_reg_table skl_gt_wa_tbl[] = {
+	{ gen9_gt_was, ARRAY_SIZE(gen9_gt_was) },
+	{ skl_gt_was,  ARRAY_SIZE(skl_gt_was) },
+};
+
+static const struct i915_wa_reg_table bxt_gt_wa_tbl[] = {
+	{ gen9_gt_was, ARRAY_SIZE(gen9_gt_was) },
+	{ bxt_gt_was,  ARRAY_SIZE(bxt_gt_was) },
+};
+
+static const struct i915_wa_reg_table kbl_gt_wa_tbl[] = {
+	{ gen9_gt_was, ARRAY_SIZE(gen9_gt_was) },
+	{ kbl_gt_was,  ARRAY_SIZE(kbl_gt_was) },
+};
+
+static const struct i915_wa_reg_table glk_gt_wa_tbl[] = {
+	{ gen9_gt_was, ARRAY_SIZE(gen9_gt_was) },
+	{ glk_gt_was,  ARRAY_SIZE(glk_gt_was) },
+};
+
+static const struct i915_wa_reg_table cfl_gt_wa_tbl[] = {
+	{ gen9_gt_was, ARRAY_SIZE(gen9_gt_was) },
+	{ cfl_gt_was,  ARRAY_SIZE(cfl_gt_was) },
+};
+
+static const struct i915_wa_reg_table cnl_gt_wa_tbl[] = {
+	{ cnl_gt_was,  ARRAY_SIZE(cnl_gt_was) },
+};
+
+void intel_gt_workarounds_get(struct drm_i915_private *dev_priv,
+			      const struct i915_wa_reg_table **wa_table,
+			      uint *table_count)
 {
+	*wa_table = NULL;
+	*table_count = 0;
+
 	if (INTEL_GEN(dev_priv) < 8)
 		return;
-	else if (IS_BROADWELL(dev_priv))
-		bdw_gt_workarounds_apply(dev_priv);
-	else if (IS_CHERRYVIEW(dev_priv))
-		chv_gt_workarounds_apply(dev_priv);
-	else if (IS_SKYLAKE(dev_priv))
-		skl_gt_workarounds_apply(dev_priv);
-	else if (IS_BROXTON(dev_priv))
-		bxt_gt_workarounds_apply(dev_priv);
-	else if (IS_KABYLAKE(dev_priv))
-		kbl_gt_workarounds_apply(dev_priv);
-	else if (IS_GEMINILAKE(dev_priv))
-		glk_gt_workarounds_apply(dev_priv);
-	else if (IS_COFFEELAKE(dev_priv))
-		cfl_gt_workarounds_apply(dev_priv);
-	else if (IS_CANNONLAKE(dev_priv))
-		cnl_gt_workarounds_apply(dev_priv);
-	else
+	else if (IS_BROADWELL(dev_priv)) {
+		*wa_table = bdw_gt_wa_tbl;
+		*table_count = ARRAY_SIZE(bdw_gt_wa_tbl);
+	} else if (IS_CHERRYVIEW(dev_priv)) {
+		*wa_table = chv_gt_wa_tbl;
+		*table_count = ARRAY_SIZE(chv_gt_wa_tbl);
+	} else if (IS_SKYLAKE(dev_priv)) {
+		*wa_table = skl_gt_wa_tbl;
+		*table_count = ARRAY_SIZE(skl_gt_wa_tbl);
+	} else if (IS_BROXTON(dev_priv)) {
+		*wa_table = bxt_gt_wa_tbl;
+		*table_count = ARRAY_SIZE(bxt_gt_wa_tbl);
+	} else if (IS_KABYLAKE(dev_priv)) {
+		*wa_table = kbl_gt_wa_tbl;
+		*table_count = ARRAY_SIZE(kbl_gt_wa_tbl);
+	} else if (IS_GEMINILAKE(dev_priv)) {
+		*wa_table = glk_gt_wa_tbl;
+		*table_count = ARRAY_SIZE(glk_gt_wa_tbl);
+	} else if (IS_COFFEELAKE(dev_priv)) {
+		*wa_table = cfl_gt_wa_tbl;
+		*table_count = ARRAY_SIZE(cfl_gt_wa_tbl);
+	} else if (IS_CANNONLAKE(dev_priv)) {
+		*wa_table = cnl_gt_wa_tbl;
+		*table_count = ARRAY_SIZE(cnl_gt_wa_tbl);
+	} else {
 		MISSING_CASE(INTEL_GEN(dev_priv));
+		return;
+	}
+}
+
+void intel_gt_workarounds_apply(struct drm_i915_private *dev_priv)
+{
+	const struct i915_wa_reg_table *wa_table;
+	uint table_count, total_count;
+
+	intel_gt_workarounds_get(dev_priv, &wa_table, &table_count);
+	total_count = mmio_workarounds_apply(dev_priv, wa_table, table_count);
+
+	dev_priv->workarounds.gt_count = total_count;
+	DRM_DEBUG_DRIVER("Number of GT specific w/a: %u\n", total_count);
 }
 
 static int wa_ring_whitelist_reg(struct intel_engine_cs *engine,
diff --git a/drivers/gpu/drm/i915/intel_workarounds.h b/drivers/gpu/drm/i915/intel_workarounds.h
index 38763e7..9bb3c48 100644
--- a/drivers/gpu/drm/i915/intel_workarounds.h
+++ b/drivers/gpu/drm/i915/intel_workarounds.h
@@ -30,6 +30,9 @@ void intel_ctx_workarounds_get(struct drm_i915_private *dev_priv,
                                uint *table_count);
 int intel_ctx_workarounds_emit(struct drm_i915_gem_request *req);
 
+void intel_gt_workarounds_get(struct drm_i915_private *dev_priv,
+                              const struct i915_wa_reg_table **wa_table,
+                              uint *table_count);
 void intel_gt_workarounds_apply(struct drm_i915_private *dev_priv);
 
 int intel_whitelist_workarounds_apply(struct intel_engine_cs *engine);
-- 
1.9.1


* [RFC PATCH 06/20] drm/i915: Transform Whitelist WAs into static tables
From: Oscar Mateo @ 2017-11-03 18:09 UTC (permalink / raw)
  To: intel-gfx

This is for WAs that whitelist a register.
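
To make the slot handling concrete, here is a minimal user-space
sketch of the idea. It assumes each whitelisted register offset is
written into the next free "non-privileged" slot, and that running out
of slots is an error. struct nonpriv_bank, whitelist_reg,
whitelist_apply and the slot count of 12 are assumptions made for the
sketch, not driver code:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed number of slots per engine, for illustration only. */
#define MAX_NONPRIV_SLOTS 12

/* Hypothetical model of one engine's bank of whitelist slots. */
struct nonpriv_bank {
	uint32_t slot[MAX_NONPRIV_SLOTS];
	unsigned int used;
};

/* Put one register offset in the next free slot; -1 when full. */
static int whitelist_reg(struct nonpriv_bank *bank, uint32_t reg_offset)
{
	assert(bank != NULL);

	if (bank->used >= MAX_NONPRIV_SLOTS)
		return -1;

	bank->slot[bank->used++] = reg_offset;
	return 0;
}

/* Re-program the bank from scratch; stop at the first failure. */
static int whitelist_apply(struct nonpriv_bank *bank,
			   const uint32_t *offsets, unsigned int count)
{
	unsigned int i;

	assert(bank != NULL && offsets != NULL);

	bank->used = 0;
	for (i = 0; i < count; i++) {
		if (whitelist_reg(bank, offsets[i]))
			return -1;
	}

	return 0;
}
```

The count left in bank->used plays the role that hw_whitelist_count
plays in the patch.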

Suggested-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
---
 drivers/gpu/drm/i915/i915_drv.h          |   2 +
 drivers/gpu/drm/i915/intel_workarounds.c | 249 +++++++++++++++----------------
 drivers/gpu/drm/i915/intel_workarounds.h |   3 +
 3 files changed, 128 insertions(+), 126 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 72b5d80..6a62a7b 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1989,6 +1989,8 @@ struct i915_wa_reg {
 	u8 since;
 	u8 until;
 
+	i915_reg_t whitelist_addr;
+
 	i915_reg_t addr;
 	u32 mask;
 	u32 value;
diff --git a/drivers/gpu/drm/i915/intel_workarounds.c b/drivers/gpu/drm/i915/intel_workarounds.c
index b07fbd0..849e70a 100644
--- a/drivers/gpu/drm/i915/intel_workarounds.c
+++ b/drivers/gpu/drm/i915/intel_workarounds.c
@@ -33,6 +33,10 @@
 	.name = (wa),			\
 	.type = I915_WA_TYPE_GT
 
+#define WA_WHITELIST(wa)		\
+	.name = (wa),			\
+	.type = I915_WA_TYPE_WHITELIST
+
 #define ALL_REVS		\
 	.since = 0,		\
 	.until = REVID_FOREVER
@@ -75,6 +79,9 @@
 	.value = MASK(m, v),		\
 	.is_masked_reg = true
 
+#define WHITELIST(reg)		\
+	.whitelist_addr = reg
+
 static struct i915_wa_reg gen8_ctx_was[] = {
 	{ WA_CTX(""),
 	  ALL_REVS, REG(INSTPM),
@@ -861,160 +868,150 @@ void intel_gt_workarounds_apply(struct drm_i915_private *dev_priv)
 	DRM_DEBUG_DRIVER("Number of GT specific w/a: %u\n", total_count);
 }
 
-static int wa_ring_whitelist_reg(struct intel_engine_cs *engine,
-				 i915_reg_t reg)
-{
-	struct drm_i915_private *dev_priv = engine->i915;
-	struct i915_workarounds *wa = &dev_priv->workarounds;
-	const uint32_t index = wa->hw_whitelist_count[engine->id];
-
-	if (WARN_ON(index >= RING_MAX_NONPRIV_SLOTS))
-		return -EINVAL;
-
-	I915_WRITE(RING_FORCE_TO_NONPRIV(engine->mmio_base, index),
-		   i915_mmio_reg_offset(reg));
-	wa->hw_whitelist_count[engine->id]++;
-
-	return 0;
-}
-
-static int gen9_whitelist_workarounds_apply(struct intel_engine_cs *engine)
-{
-	int ret;
-
-	/* WaVFEStateAfterPipeControlwithMediaStateClear:skl,bxt,glk,cfl */
-	ret = wa_ring_whitelist_reg(engine, GEN9_CTX_PREEMPT_REG);
-	if (ret)
-		return ret;
+static struct i915_wa_reg gen9_whitelist_was[] = {
+	{ WA_WHITELIST("WaVFEStateAfterPipeControlwithMediaStateClear"),
+	  ALL_REVS, WHITELIST(GEN9_CTX_PREEMPT_REG) },
 
-	/* WaEnablePreemptionGranularityControlByUMD:skl,bxt,kbl,cfl,[cnl] */
-	ret = wa_ring_whitelist_reg(engine, GEN8_CS_CHICKEN1);
-	if (ret)
-		return ret;
+	{ WA_WHITELIST("WaEnablePreemptionGranularityControlByUMD"),
+	  ALL_REVS, WHITELIST(GEN8_CS_CHICKEN1) },
 
-	/* WaAllowUMDToModifyHDCChicken1:skl,bxt,kbl,glk,cfl */
-	ret = wa_ring_whitelist_reg(engine, GEN8_HDC_CHICKEN1);
-	if (ret)
-		return ret;
+	{ WA_WHITELIST("WaAllowUMDToModifyHDCChicken1"),
+	  ALL_REVS, WHITELIST(GEN8_HDC_CHICKEN1) },
+};
 
-	return 0;
-}
+static struct i915_wa_reg skl_whitelist_was[] = {
+	{ WA_WHITELIST("WaDisableLSQCROPERFforOCL"),
+	  ALL_REVS, WHITELIST(GEN8_L3SQCREG4) },
+};
 
-static int skl_whitelist_workarounds_apply(struct intel_engine_cs *engine)
-{
-	int ret = gen9_whitelist_workarounds_apply(engine);
-	if (ret)
-		return ret;
+static struct i915_wa_reg bxt_whitelist_was[] = {
+	{ WA_WHITELIST("WaDisableObjectLevelPreemptionForTrifanOrPolygon +"
+		       "WaDisableObjectLevelPreemptionForInstancedDraw +"
+		       "WaDisableObjectLevelPreemtionForInstanceId +"
+		       "WaDisableLSQCROPERFforOCL"),
+	  REVS(0, BXT_REVID_A1), WHITELIST(GEN9_CS_DEBUG_MODE1) },
 
-	/* WaDisableLSQCROPERFforOCL:skl */
-	ret = wa_ring_whitelist_reg(engine, GEN8_L3SQCREG4);
-	if (ret)
-		return ret;
+	{ WA_WHITELIST("WaDisableObjectLevelPreemptionForTrifanOrPolygon +"
+		       "WaDisableObjectLevelPreemptionForInstancedDraw +"
+		       "WaDisableObjectLevelPreemtionForInstanceId +"
+		       "WaDisableLSQCROPERFforOCL"),
+	  REVS(0, BXT_REVID_A1), WHITELIST(GEN8_L3SQCREG4) },
+};
 
-	return 0;
-}
+static struct i915_wa_reg kbl_whitelist_was[] = {
+	{ WA_WHITELIST("WaDisableLSQCROPERFforOCL"),
+	  ALL_REVS, WHITELIST(GEN8_L3SQCREG4) },
+};
 
-static int bxt_whitelist_workarounds_apply(struct intel_engine_cs *engine)
-{
-	struct drm_i915_private *dev_priv = engine->i915;
+static struct i915_wa_reg cnl_whitelist_was[] = {
+	{ WA_WHITELIST("WaEnablePreemptionGranularityControlByUMD"),
+	  ALL_REVS, WHITELIST(GEN8_CS_CHICKEN1) },
+};
 
-	int ret = gen9_whitelist_workarounds_apply(engine);
-	if (ret)
-		return ret;
+static const struct i915_wa_reg_table skl_whitelist_wa_tbl[] = {
+	{ gen9_whitelist_was, ARRAY_SIZE(gen9_whitelist_was) },
+	{ skl_whitelist_was,  ARRAY_SIZE(skl_whitelist_was) },
+};
 
-	/* WaDisableObjectLevelPreemptionForTrifanOrPolygon:bxt */
-	/* WaDisableObjectLevelPreemptionForInstancedDraw:bxt */
-	/* WaDisableObjectLevelPreemtionForInstanceId:bxt */
-	/* WaDisableLSQCROPERFforOCL:bxt */
-	if (IS_BXT_REVID(dev_priv, 0, BXT_REVID_A1)) {
-		ret = wa_ring_whitelist_reg(engine, GEN9_CS_DEBUG_MODE1);
-		if (ret)
-			return ret;
-
-		ret = wa_ring_whitelist_reg(engine, GEN8_L3SQCREG4);
-		if (ret)
-			return ret;
-	}
+static const struct i915_wa_reg_table bxt_whitelist_wa_tbl[] = {
+	{ gen9_whitelist_was, ARRAY_SIZE(gen9_whitelist_was) },
+	{ bxt_whitelist_was,  ARRAY_SIZE(bxt_whitelist_was) },
+};
 
-	return 0;
-}
+static const struct i915_wa_reg_table kbl_whitelist_wa_tbl[] = {
+	{ gen9_whitelist_was, ARRAY_SIZE(gen9_whitelist_was) },
+	{ kbl_whitelist_was,  ARRAY_SIZE(kbl_whitelist_was) },
+};
 
-static int kbl_whitelist_workarounds_apply(struct intel_engine_cs *engine)
-{
-	int ret = gen9_whitelist_workarounds_apply(engine);
-	if (ret)
-		return ret;
+static const struct i915_wa_reg_table glk_whitelist_wa_tbl[] = {
+	{ gen9_whitelist_was, ARRAY_SIZE(gen9_whitelist_was) },
+};
 
-	/* WaDisableLSQCROPERFforOCL:kbl */
-	ret = wa_ring_whitelist_reg(engine, GEN8_L3SQCREG4);
-	if (ret)
-		return ret;
+static const struct i915_wa_reg_table cfl_whitelist_wa_tbl[] = {
+	{ gen9_whitelist_was, ARRAY_SIZE(gen9_whitelist_was) },
+};
 
-	return 0;
-}
+static const struct i915_wa_reg_table cnl_whitelist_wa_tbl[] = {
+	{ cnl_whitelist_was,  ARRAY_SIZE(cnl_whitelist_was) },
+};
 
-static int glk_whitelist_workarounds_apply(struct intel_engine_cs *engine)
+void intel_whitelist_workarounds_get(struct drm_i915_private *dev_priv,
+				     const struct i915_wa_reg_table **wa_table,
+				     uint *table_count)
 {
-	int ret = gen9_whitelist_workarounds_apply(engine);
-	if (ret)
-		return ret;
+	*wa_table = NULL;
+	*table_count = 0;
 
-	return 0;
+	if (INTEL_GEN(dev_priv) < 9) {
+		WARN(1, "No whitelisting in Gen%u\n", INTEL_GEN(dev_priv));
+		return;
+	} else if (IS_SKYLAKE(dev_priv)) {
+		*wa_table = skl_whitelist_wa_tbl;
+		*table_count = ARRAY_SIZE(skl_whitelist_wa_tbl);
+	} else if (IS_BROXTON(dev_priv)) {
+		*wa_table = bxt_whitelist_wa_tbl;
+		*table_count = ARRAY_SIZE(bxt_whitelist_wa_tbl);
+	} else if (IS_KABYLAKE(dev_priv)) {
+		*wa_table = kbl_whitelist_wa_tbl;
+		*table_count = ARRAY_SIZE(kbl_whitelist_wa_tbl);
+	} else if (IS_GEMINILAKE(dev_priv)) {
+		*wa_table = glk_whitelist_wa_tbl;
+		*table_count = ARRAY_SIZE(glk_whitelist_wa_tbl);
+	} else if (IS_COFFEELAKE(dev_priv)) {
+		*wa_table = cfl_whitelist_wa_tbl;
+		*table_count = ARRAY_SIZE(cfl_whitelist_wa_tbl);
+	} else if (IS_CANNONLAKE(dev_priv)) {
+		*wa_table = cnl_whitelist_wa_tbl;
+		*table_count = ARRAY_SIZE(cnl_whitelist_wa_tbl);
+	} else {
+		MISSING_CASE(INTEL_GEN(dev_priv));
+		return;
+	}
 }
 
-static int cfl_whitelist_workarounds_apply(struct intel_engine_cs *engine)
+int intel_whitelist_workarounds_apply(struct intel_engine_cs *engine)
 {
-	int ret = gen9_whitelist_workarounds_apply(engine);
-	if (ret)
-		return ret;
+	struct drm_i915_private *dev_priv = engine->i915;
+	const struct i915_wa_reg_table *wa_table;
+	uint table_count, total_count = 0;
+	int i, j;
 
-	return 0;
-}
+	intel_whitelist_workarounds_get(dev_priv, &wa_table, &table_count);
 
-static int cnl_whitelist_workarounds_apply(struct intel_engine_cs *engine)
-{
-	int ret;
+	for (i = 0; i < table_count; i++) {
+		struct i915_wa_reg *wa = wa_table[i].table;
 
-	/* WaEnablePreemptionGranularityControlByUMD:cnl */
-	ret = wa_ring_whitelist_reg(engine, GEN8_CS_CHICKEN1);
-	if (ret)
-		return ret;
+		for (j = 0; j < wa_table[i].count; j++) {
+			wa[j].applied =
+				IS_REVID(dev_priv, wa[j].since, wa[j].until);
 
-	return 0;
-}
+			if (wa[j].applied && wa[j].pre_hook)
+				wa[j].applied = wa[j].pre_hook(dev_priv, &wa[j]);
 
-int intel_whitelist_workarounds_apply(struct intel_engine_cs *engine)
-{
-	struct drm_i915_private *dev_priv = engine->i915;
-	int err;
+			if (wa[j].applied) {
+				if (WARN_ON(total_count >= RING_MAX_NONPRIV_SLOTS)) {
+					wa[j].applied = false;
+					return -EINVAL;
+				}
 
-	WARN_ON(engine->id != RCS);
+				/* Cache the translation of the whitelist entry */
+				wa[j].addr =
+					RING_FORCE_TO_NONPRIV(engine->mmio_base,
+							      total_count++);
+				wa[j].value =
+					i915_mmio_reg_offset(wa[j].whitelist_addr);
+				wa[j].mask = 0xffffffff;
 
-	dev_priv->workarounds.hw_whitelist_count[engine->id] = 0;
+				I915_WRITE(wa[j].addr, wa[j].value);
+			}
 
-	if (INTEL_GEN(dev_priv) < 9) {
-		WARN(1, "No whitelisting in Gen%u\n", INTEL_GEN(dev_priv));
-		err = 0;
-	} else if (IS_SKYLAKE(dev_priv))
-		err = skl_whitelist_workarounds_apply(engine);
-	else if (IS_BROXTON(dev_priv))
-		err = bxt_whitelist_workarounds_apply(engine);
-	else if (IS_KABYLAKE(dev_priv))
-		err = kbl_whitelist_workarounds_apply(engine);
-	else if (IS_GEMINILAKE(dev_priv))
-		err = glk_whitelist_workarounds_apply(engine);
-	else if (IS_COFFEELAKE(dev_priv))
-		err = cfl_whitelist_workarounds_apply(engine);
-	else if (IS_CANNONLAKE(dev_priv))
-		err = cnl_whitelist_workarounds_apply(engine);
-	else {
-		MISSING_CASE(INTEL_GEN(dev_priv));
-		err = 0;
+			GEM_BUG_ON(wa[j].post_hook);
+		}
 	}
-	if (err)
-		return err;
 
-	DRM_DEBUG_DRIVER("%s: Number of whitelist w/a: %d\n", engine->name,
-			 dev_priv->workarounds.hw_whitelist_count[engine->id]);
+	dev_priv->workarounds.hw_whitelist_count[engine->id] = total_count;
+	DRM_DEBUG_DRIVER("%s: Number of whitelist w/a: %u\n", engine->name,
+			 total_count);
+
 	return 0;
 }
diff --git a/drivers/gpu/drm/i915/intel_workarounds.h b/drivers/gpu/drm/i915/intel_workarounds.h
index 9bb3c48..f60913f 100644
--- a/drivers/gpu/drm/i915/intel_workarounds.h
+++ b/drivers/gpu/drm/i915/intel_workarounds.h
@@ -35,6 +35,9 @@ void intel_gt_workarounds_get(struct drm_i915_private *dev_priv,
                               uint *table_count);
 void intel_gt_workarounds_apply(struct drm_i915_private *dev_priv);
 
+void intel_whitelist_workarounds_get(struct drm_i915_private *dev_priv,
+                                     const struct i915_wa_reg_table **wa_table,
+                                     uint *table_count);
 int intel_whitelist_workarounds_apply(struct intel_engine_cs *engine);
 
 #endif
-- 
1.9.1


* [RFC PATCH 07/20] drm/i915: Create a new category of display WAs
From: Oscar Mateo @ 2017-11-03 18:09 UTC (permalink / raw)
  To: intel-gfx; +Cc: Rodrigo Vivi

Display workarounds do not need to be re-applied on a GPU reset
(doing so is, in Ville's words, "at the very least wasted effort [...]
and could even be actively harmful in case we end up clobbering
something the current display configuration depends on"). Therefore,
they have to be applied in a different place than the GT ones, so they
deserve their own category.

Actually populating this is left for future patches: we have to
start moving WAs from init_clock_gating into either GT or
Display functions, and this requires a good deal of careful code
reviewing.
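
The intended split can be pictured with a small user-space sketch. It
assumes GT workarounds are applied on both driver init and GPU reset,
while display workarounds are applied on init only. The enum, struct
wa_stats and the on_driver_init/on_gpu_reset helpers are hypothetical
names, not functions from this series:

```c
#include <assert.h>

/* Hypothetical categories, mirroring the GT/display split. */
enum wa_category {
	WA_CAT_GT,
	WA_CAT_DISPLAY,
	WA_CAT_COUNT
};

/* Counts how many times each category has been applied. */
struct wa_stats {
	unsigned int applied[WA_CAT_COUNT];
};

static void wa_apply(struct wa_stats *stats, enum wa_category cat)
{
	assert(cat < WA_CAT_COUNT);
	stats->applied[cat]++;
}

/* Driver init: both categories are applied once. */
static void on_driver_init(struct wa_stats *stats)
{
	wa_apply(stats, WA_CAT_GT);
	wa_apply(stats, WA_CAT_DISPLAY);
}

/* GPU reset: only GT state is lost, so only GT WAs are re-applied. */
static void on_gpu_reset(struct wa_stats *stats)
{
	wa_apply(stats, WA_CAT_GT);
}
```

Keeping the two categories in separate tables is what lets the reset
path skip the display entries entirely.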

v2: Rebased to carry the init_early nomenclature over (Chris)

v3: Static tables version (Joonas)

Suggested-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> (v2)
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.h          |   1 +
 drivers/gpu/drm/i915/intel_pm.c          |   2 +
 drivers/gpu/drm/i915/intel_workarounds.c | 124 +++++++++++++++++++++++++++++++
 drivers/gpu/drm/i915/intel_workarounds.h |   5 ++
 4 files changed, 132 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 6a62a7b..f781d1c 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2010,6 +2010,7 @@ struct i915_wa_reg_table {
 struct i915_workarounds {
 	u32 ctx_count;
 	u32 gt_count;
+	u32 disp_count;
 	u32 hw_whitelist_count[I915_NUM_ENGINES];
 };
 
diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index acd0cbb..0d0e84b 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -29,6 +29,7 @@
 #include <drm/drm_plane_helper.h>
 #include "i915_drv.h"
 #include "intel_drv.h"
+#include "intel_workarounds.h"
 #include "../../../platform/x86/intel_ips.h"
 #include <linux/module.h>
 #include <drm/drm_atomic_helper.h>
@@ -9013,6 +9014,7 @@ static void i830_init_clock_gating(struct drm_i915_private *dev_priv)
 void intel_init_clock_gating(struct drm_i915_private *dev_priv)
 {
 	dev_priv->display.init_clock_gating(dev_priv);
+	intel_display_workarounds_apply(dev_priv);
 }
 
 void intel_suspend_hw(struct drm_i915_private *dev_priv)
diff --git a/drivers/gpu/drm/i915/intel_workarounds.c b/drivers/gpu/drm/i915/intel_workarounds.c
index 849e70a..5a532a0 100644
--- a/drivers/gpu/drm/i915/intel_workarounds.c
+++ b/drivers/gpu/drm/i915/intel_workarounds.c
@@ -33,6 +33,10 @@
 	.name = (wa),			\
 	.type = I915_WA_TYPE_GT
 
+#define WA_DISP(wa)			\
+	.name = (wa),			\
+	.type = I915_WA_TYPE_DISPLAY
+
 #define WA_WHITELIST(wa)		\
 	.name = (wa),			\
 	.type = I915_WA_TYPE_WHITELIST
@@ -868,6 +872,126 @@ void intel_gt_workarounds_apply(struct drm_i915_private *dev_priv)
 	DRM_DEBUG_DRIVER("Number of GT specific w/a: %u\n", total_count);
 }
 
+static struct i915_wa_reg gen8_disp_was[] = {
+};
+
+static struct i915_wa_reg bdw_disp_was[] = {
+};
+
+static struct i915_wa_reg chv_disp_was[] = {
+};
+
+static struct i915_wa_reg gen9_disp_was[] = {
+};
+
+static struct i915_wa_reg skl_disp_was[] = {
+};
+
+static struct i915_wa_reg bxt_disp_was[] = {
+};
+
+static struct i915_wa_reg kbl_disp_was[] = {
+};
+
+static struct i915_wa_reg glk_disp_was[] = {
+};
+
+static struct i915_wa_reg cfl_disp_was[] = {
+};
+
+static struct i915_wa_reg cnl_disp_was[] = {
+};
+
+static const struct i915_wa_reg_table bdw_disp_wa_tbl[] = {
+	{ gen8_disp_was, ARRAY_SIZE(gen8_disp_was) },
+	{ bdw_disp_was,  ARRAY_SIZE(bdw_disp_was) },
+};
+
+static const struct i915_wa_reg_table chv_disp_wa_tbl[] = {
+	{ gen8_disp_was, ARRAY_SIZE(gen8_disp_was) },
+	{ chv_disp_was,  ARRAY_SIZE(chv_disp_was) },
+};
+
+static const struct i915_wa_reg_table skl_disp_wa_tbl[] = {
+	{ gen9_disp_was, ARRAY_SIZE(gen9_disp_was) },
+	{ skl_disp_was,  ARRAY_SIZE(skl_disp_was) },
+};
+
+static const struct i915_wa_reg_table bxt_disp_wa_tbl[] = {
+	{ gen9_disp_was, ARRAY_SIZE(gen9_disp_was) },
+	{ bxt_disp_was,  ARRAY_SIZE(bxt_disp_was) },
+};
+
+static const struct i915_wa_reg_table kbl_disp_wa_tbl[] = {
+	{ gen9_disp_was, ARRAY_SIZE(gen9_disp_was) },
+	{ kbl_disp_was,  ARRAY_SIZE(kbl_disp_was) },
+};
+
+static const struct i915_wa_reg_table glk_disp_wa_tbl[] = {
+	{ gen9_disp_was, ARRAY_SIZE(gen9_disp_was) },
+	{ glk_disp_was,  ARRAY_SIZE(glk_disp_was) },
+};
+
+static const struct i915_wa_reg_table cfl_disp_wa_tbl[] = {
+	{ gen9_disp_was, ARRAY_SIZE(gen9_disp_was) },
+	{ cfl_disp_was,  ARRAY_SIZE(cfl_disp_was) },
+};
+
+static const struct i915_wa_reg_table cnl_disp_wa_tbl[] = {
+	{ cnl_disp_was,  ARRAY_SIZE(cnl_disp_was) },
+};
+
+void intel_display_workarounds_get(struct drm_i915_private *dev_priv,
+				   const struct i915_wa_reg_table **wa_table,
+				   uint *table_count)
+{
+	*wa_table = NULL;
+	*table_count = 0;
+
+	if (INTEL_GEN(dev_priv) < 8)
+		return;
+	else if (IS_BROADWELL(dev_priv)) {
+		*wa_table = bdw_disp_wa_tbl;
+		*table_count = ARRAY_SIZE(bdw_disp_wa_tbl);
+	} else if (IS_CHERRYVIEW(dev_priv)) {
+		*wa_table = chv_disp_wa_tbl;
+		*table_count = ARRAY_SIZE(chv_disp_wa_tbl);
+	} else if (IS_SKYLAKE(dev_priv)) {
+		*wa_table = skl_disp_wa_tbl;
+		*table_count = ARRAY_SIZE(skl_disp_wa_tbl);
+	} else if (IS_BROXTON(dev_priv)) {
+		*wa_table = bxt_disp_wa_tbl;
+		*table_count = ARRAY_SIZE(bxt_disp_wa_tbl);
+	} else if (IS_KABYLAKE(dev_priv)) {
+		*wa_table = kbl_disp_wa_tbl;
+		*table_count = ARRAY_SIZE(kbl_disp_wa_tbl);
+	} else if (IS_GEMINILAKE(dev_priv)) {
+		*wa_table = glk_disp_wa_tbl;
+		*table_count = ARRAY_SIZE(glk_disp_wa_tbl);
+	} else if (IS_COFFEELAKE(dev_priv)) {
+		*wa_table = cfl_disp_wa_tbl;
+		*table_count = ARRAY_SIZE(cfl_disp_wa_tbl);
+	} else if (IS_CANNONLAKE(dev_priv)) {
+		*wa_table = cnl_disp_wa_tbl;
+		*table_count = ARRAY_SIZE(cnl_disp_wa_tbl);
+	} else {
+		MISSING_CASE(INTEL_GEN(dev_priv));
+		return;
+	}
+}
+
+void intel_display_workarounds_apply(struct drm_i915_private *dev_priv)
+{
+	const struct i915_wa_reg_table *wa_table;
+	uint table_count, total_count;
+
+	intel_display_workarounds_get(dev_priv, &wa_table, &table_count);
+	total_count = mmio_workarounds_apply(dev_priv, wa_table, table_count);
+
+	dev_priv->workarounds.disp_count = total_count;
+	DRM_DEBUG_DRIVER("Number of Display specific w/a: %u\n", total_count);
+}
+
 static struct i915_wa_reg gen9_whitelist_was[] = {
 	{ WA_WHITELIST("WaVFEStateAfterPipeControlwithMediaStateClear"),
 	  ALL_REVS, WHITELIST(GEN9_CTX_PREEMPT_REG) },
diff --git a/drivers/gpu/drm/i915/intel_workarounds.h b/drivers/gpu/drm/i915/intel_workarounds.h
index f60913f..c73dc66 100644
--- a/drivers/gpu/drm/i915/intel_workarounds.h
+++ b/drivers/gpu/drm/i915/intel_workarounds.h
@@ -35,6 +35,11 @@ void intel_gt_workarounds_get(struct drm_i915_private *dev_priv,
                               uint *table_count);
 void intel_gt_workarounds_apply(struct drm_i915_private *dev_priv);
 
+void intel_display_workarounds_get(struct drm_i915_private *dev_priv,
+                                   const struct i915_wa_reg_table **wa_table,
+                                   uint *table_count);
+void intel_display_workarounds_apply(struct drm_i915_private *dev_priv);
+
 void intel_whitelist_workarounds_get(struct drm_i915_private *dev_priv,
                                      const struct i915_wa_reg_table **wa_table,
                                      uint *table_count);
-- 
1.9.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

* [RFC PATCH 08/20] drm/i915: Print all workaround types correctly in debugfs
  2017-11-03 18:09 [RFC PATCH v5 00/20] Refactor HW workaround code Oscar Mateo
                   ` (6 preceding siblings ...)
  2017-11-03 18:09 ` [RFC PATCH 07/20] drm/i915: Create a new category of display WAs Oscar Mateo
@ 2017-11-03 18:09 ` Oscar Mateo
  2017-11-03 18:09 ` [RFC PATCH 09/20] drm/i915: Do not store the total counts of WAs Oscar Mateo
                   ` (14 subsequent siblings)
  22 siblings, 0 replies; 30+ messages in thread
From: Oscar Mateo @ 2017-11-03 18:09 UTC (permalink / raw)
  To: intel-gfx

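Let's try to make sure that all WAs are applied correctly and survive
resumes, resets, etc. (with some help from a companion i-g-t patch). To
that end, debugfs now lists GT, display and whitelist WAs as well, each
with the value read back from the register and an OK/FAIL status.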

v2:
  - Rebased
  - Print display WAs as well (Ville)

v3:
  - Grab the forcewake once for everyone, so that all reads are from
    the same power context (Chris)

v4: Rebase on top of static tables

Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
---
 drivers/gpu/drm/i915/i915_debugfs.c | 79 +++++++++++++++++++++++++++++--------
 1 file changed, 63 insertions(+), 16 deletions(-)
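
Note: the OK/FAIL status printed by this patch comes from a masked
comparison, so only the bits a workaround owns are checked. A minimal
standalone sketch of that check follows. struct wa_reg_sketch, wa_reg_ok()
and wa_reg_print() are hypothetical stand-ins for illustration only (not
driver code), and the register read is passed in as a plain value instead
of going through I915_READ_FW() under forcewake.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical, simplified stand-in for the driver's struct i915_wa_reg. */
struct wa_reg_sketch {
	const char *name;
	uint32_t offset;	/* MMIO offset, used for printing only */
	uint32_t value;		/* value the workaround is expected to program */
	uint32_t mask;		/* bits of the register the workaround owns */
};

/* A workaround is intact when the masked bits read back as programmed. */
static bool wa_reg_ok(const struct wa_reg_sketch *wa, uint32_t read)
{
	return (wa->value & wa->mask) == (read & wa->mask);
}

/* Same line layout as the seq_printf() in check_wa_register(). */
static void wa_reg_print(const struct wa_reg_sketch *wa, uint32_t read)
{
	printf("0x%X: 0x%08x, mask: 0x%08x, read: 0x%08x, status: %s, name: %s\n",
	       (unsigned int)wa->offset, (unsigned int)wa->value,
	       (unsigned int)wa->mask, (unsigned int)read,
	       wa_reg_ok(wa, read) ? "OK" : "FAIL", wa->name);
}
```

Bits outside the mask are ignored on purpose: other code may legitimately
own them, so a read of 0xffffffff still passes for a single-bit workaround.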

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index 12c4330..8a6fef4 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -3356,48 +3356,95 @@ static int i915_shared_dplls_info(struct seq_file *m, void *unused)
 	return 0;
 }
 
+static void check_wa_register(struct seq_file *m,
+			      const struct i915_wa_reg *wa_reg)
+{
+	struct drm_i915_private *dev_priv = node_to_i915(m->private);
+	u32 read;
+	bool ok;
+
+	assert_forcewakes_active(dev_priv, FORCEWAKE_ALL);
+
+	read = I915_READ_FW(wa_reg->addr);
+	ok = (wa_reg->value & wa_reg->mask) == (read & wa_reg->mask);
+	seq_printf(m,
+		   "0x%X: 0x%08x, mask: 0x%08x, read: 0x%08x, status: %s, name: %s\n",
+		   i915_mmio_reg_offset(wa_reg->addr),
+		   wa_reg->value, wa_reg->mask, read,
+		   ok ? "OK" : "FAIL", wa_reg->name);
+}
+
+static void check_wa_registers(struct seq_file *m,
+			       const struct i915_wa_reg_table *wa_table,
+			       uint table_count)
+{
+	int i, j;
+
+	for (i = 0; i < table_count; i++) {
+		const struct i915_wa_reg *wa = wa_table[i].table;
+
+		for (j = 0; j < wa_table[i].count; j++) {
+			if (!wa[j].applied)
+				continue;
+
+			check_wa_register(m, &wa[j]);
+		}
+	}
+}
+
 static int i915_wa_registers(struct seq_file *m, void *unused)
 {
-	struct intel_engine_cs *engine;
 	struct drm_i915_private *dev_priv = node_to_i915(m->private);
 	struct drm_device *dev = &dev_priv->drm;
 	struct i915_workarounds *workarounds = &dev_priv->workarounds;
 	const struct i915_wa_reg_table *wa_table;
 	uint table_count;
-	enum intel_engine_id id;
 	int i, j, ret;
 
-	intel_ctx_workarounds_get(dev_priv, &wa_table, &table_count);
-
 	ret = mutex_lock_interruptible(&dev->struct_mutex);
 	if (ret)
 		return ret;
 
 	intel_runtime_pm_get(dev_priv);
 
-	seq_printf(m, "Workarounds applied: %d\n", workarounds->ctx_count);
-	for_each_engine(engine, dev_priv, id)
-		seq_printf(m, "HW whitelist count for %s: %d\n",
-			   engine->name, workarounds->hw_whitelist_count[id]);
-
+	seq_printf(m, "Context workarounds applied: %d\n",
+		   workarounds->ctx_count);
+	intel_ctx_workarounds_get(dev_priv, &wa_table, &table_count);
 	for (i = 0; i < table_count; i++) {
 		const struct i915_wa_reg *wa = wa_table[i].table;
 
 		for (j = 0; j < wa_table[i].count; j++) {
-			u32 read;
-			bool ok;
-
 			if (!wa[j].applied)
 				continue;
 
-			read = I915_READ(wa[j].addr);
-			ok = (wa[j].value & wa[j].mask) == (read & wa[j].mask);
 			seq_printf(m,
-				   "0x%X: 0x%08X, mask: 0x%08X, read: 0x%08x, status: %s, name: %s\n",
+				   "0x%X: 0x%08X, mask: 0x%08X, name: %s\n",
 				   i915_mmio_reg_offset(wa[j].addr), wa[j].value,
-				   wa[j].mask, read, ok ? "OK" : "FAIL", wa[j].name);
+				   wa[j].mask, wa[j].name);
 		}
 	}
+	seq_putc(m, '\n');
+
+	intel_uncore_forcewake_get(dev_priv, FORCEWAKE_ALL);
+
+	seq_printf(m, "GT workarounds applied: %d\n", workarounds->gt_count);
+	intel_gt_workarounds_get(dev_priv, &wa_table, &table_count);
+	check_wa_registers(m, wa_table, table_count);
+	seq_putc(m, '\n');
+
+	seq_printf(m, "Display workarounds applied: %d\n",
+		   workarounds->disp_count);
+	intel_display_workarounds_get(dev_priv, &wa_table, &table_count);
+	check_wa_registers(m, wa_table, table_count);
+	seq_putc(m, '\n');
+
+	seq_printf(m, "Whitelist workarounds applied: %d\n",
+		   workarounds->hw_whitelist_count[RCS]);
+	intel_whitelist_workarounds_get(dev_priv, &wa_table, &table_count);
+	check_wa_registers(m, wa_table, table_count);
+	seq_putc(m, '\n');
+
+	intel_uncore_forcewake_put(dev_priv, FORCEWAKE_ALL);
 
 	intel_runtime_pm_put(dev_priv);
 	mutex_unlock(&dev->struct_mutex);
-- 
1.9.1

* [RFC PATCH 09/20] drm/i915: Do not store the total counts of WAs
  2017-11-03 18:09 [RFC PATCH v5 00/20] Refactor HW workaround code Oscar Mateo
                   ` (7 preceding siblings ...)
  2017-11-03 18:09 ` [RFC PATCH 08/20] drm/i915: Print all workaround types correctly in debugfs Oscar Mateo
@ 2017-11-03 18:09 ` Oscar Mateo
  2017-11-03 18:09 ` [RFC PATCH 10/20] drm/i915: Move WA BB stuff to the workarounds file as well Oscar Mateo
                   ` (13 subsequent siblings)
  22 siblings, 0 replies; 30+ messages in thread
From: Oscar Mateo @ 2017-11-03 18:09 UTC (permalink / raw)
  To: intel-gfx

Simply recalculate the counts as needed (by walking the tables and
counting the entries marked as applied) so that we can remove the
workarounds structure from dev_priv.

Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
---
 drivers/gpu/drm/i915/i915_debugfs.c      | 34 ++++++++++++++++++++++++--------
 drivers/gpu/drm/i915/i915_drv.h          |  9 ---------
 drivers/gpu/drm/i915/intel_workarounds.c |  4 ----
 3 files changed, 26 insertions(+), 21 deletions(-)
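
Note: the recalculation is just a walk over the table-of-tables that
counts the entries flagged as applied. A standalone sketch, assuming
simplified hypothetical types (wa_reg_sketch, wa_table_sketch) in place of
struct i915_wa_reg and struct i915_wa_reg_table:

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, reduced versions of the driver structures. */
struct wa_reg_sketch {
	const char *name;
	bool applied;		/* set once the workaround has been written */
};

struct wa_table_sketch {
	const struct wa_reg_sketch *table;
	size_t count;
};

/* Count applied entries across all tables instead of caching a total. */
static unsigned int count_applied(const struct wa_table_sketch *tables,
				  size_t table_count)
{
	unsigned int total = 0;
	size_t i, j;

	for (i = 0; i < table_count; i++) {
		for (j = 0; j < tables[i].count; j++) {
			if (tables[i].table[j].applied)
				total++;
		}
	}

	return total;
}
```

The trade-off is a small amount of work on every debugfs read in exchange
for not having to keep a cached total in sync with the tables.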

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index 8a6fef4..8fa8c68 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -3392,11 +3392,28 @@ static void check_wa_registers(struct seq_file *m,
 	}
 }
 
+static uint count_wa_registers(const struct i915_wa_reg_table *wa_table,
+			       uint table_count)
+{
+	uint total = 0;
+	int i, j;
+
+	for (i = 0; i < table_count; i++) {
+		const struct i915_wa_reg *wa = wa_table[i].table;
+
+		for (j = 0; j < wa_table[i].count; j++) {
+			if (wa[j].applied)
+				total++;
+		}
+	}
+
+	return total;
+}
+
 static int i915_wa_registers(struct seq_file *m, void *unused)
 {
 	struct drm_i915_private *dev_priv = node_to_i915(m->private);
 	struct drm_device *dev = &dev_priv->drm;
-	struct i915_workarounds *workarounds = &dev_priv->workarounds;
 	const struct i915_wa_reg_table *wa_table;
 	uint table_count;
 	int i, j, ret;
@@ -3407,9 +3424,9 @@ static int i915_wa_registers(struct seq_file *m, void *unused)
 
 	intel_runtime_pm_get(dev_priv);
 
-	seq_printf(m, "Context workarounds applied: %d\n",
-		   workarounds->ctx_count);
 	intel_ctx_workarounds_get(dev_priv, &wa_table, &table_count);
+	seq_printf(m, "Context workarounds applied: %d\n",
+		   count_wa_registers(wa_table, table_count));
 	for (i = 0; i < table_count; i++) {
 		const struct i915_wa_reg *wa = wa_table[i].table;
 
@@ -3427,20 +3444,21 @@ static int i915_wa_registers(struct seq_file *m, void *unused)
 
 	intel_uncore_forcewake_get(dev_priv, FORCEWAKE_ALL);
 
-	seq_printf(m, "GT workarounds applied: %d\n", workarounds->gt_count);
 	intel_gt_workarounds_get(dev_priv, &wa_table, &table_count);
+	seq_printf(m, "GT workarounds applied: %d\n",
+		   count_wa_registers(wa_table, table_count));
 	check_wa_registers(m, wa_table, table_count);
 	seq_putc(m, '\n');
 
-	seq_printf(m, "Display workarounds applied: %d\n",
-		   workarounds->disp_count);
 	intel_display_workarounds_get(dev_priv, &wa_table, &table_count);
+	seq_printf(m, "Display workarounds applied: %d\n",
+		   count_wa_registers(wa_table, table_count));
 	check_wa_registers(m, wa_table, table_count);
 	seq_putc(m, '\n');
 
-	seq_printf(m, "Whitelist workarounds applied: %d\n",
-		   workarounds->hw_whitelist_count[RCS]);
 	intel_whitelist_workarounds_get(dev_priv, &wa_table, &table_count);
+	seq_printf(m, "Whitelist workarounds applied: %d\n",
+		   count_wa_registers(wa_table, table_count));
 	check_wa_registers(m, wa_table, table_count);
 	seq_putc(m, '\n');
 
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index f781d1c..7efb59b 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2007,13 +2007,6 @@ struct i915_wa_reg_table {
 	int count;
 };
 
-struct i915_workarounds {
-	u32 ctx_count;
-	u32 gt_count;
-	u32 disp_count;
-	u32 hw_whitelist_count[I915_NUM_ENGINES];
-};
-
 struct i915_virtual_gpu {
 	bool active;
 	u32 caps;
@@ -2452,8 +2445,6 @@ struct drm_i915_private {
 
 	int dpio_phy_iosf_port[I915_NUM_PHYS_VLV];
 
-	struct i915_workarounds workarounds;
-
 	struct i915_frontbuffer_tracking fb_tracking;
 
 	struct intel_atomic_helper {
diff --git a/drivers/gpu/drm/i915/intel_workarounds.c b/drivers/gpu/drm/i915/intel_workarounds.c
index 5a532a0..74e59bb 100644
--- a/drivers/gpu/drm/i915/intel_workarounds.c
+++ b/drivers/gpu/drm/i915/intel_workarounds.c
@@ -551,7 +551,6 @@ static uint ctx_workarounds_init(struct drm_i915_private *dev_priv,
 		}
 	}
 
-	dev_priv->workarounds.ctx_count = total_count;
 	DRM_DEBUG_DRIVER("Number of context specific w/a: %u\n", total_count);
 
 	return total_count;
@@ -868,7 +867,6 @@ void intel_gt_workarounds_apply(struct drm_i915_private *dev_priv)
 	intel_gt_workarounds_get(dev_priv, &wa_table, &table_count);
 	total_count = mmio_workarounds_apply(dev_priv, wa_table, table_count);
 
-	dev_priv->workarounds.gt_count = total_count;
 	DRM_DEBUG_DRIVER("Number of GT specific w/a: %u\n", total_count);
 }
 
@@ -988,7 +986,6 @@ void intel_display_workarounds_apply(struct drm_i915_private *dev_priv)
 	intel_display_workarounds_get(dev_priv, &wa_table, &table_count);
 	total_count = mmio_workarounds_apply(dev_priv, wa_table, table_count);
 
-	dev_priv->workarounds.disp_count = total_count;
 	DRM_DEBUG_DRIVER("Number of Display specific w/a: %u\n", total_count);
 }
 
@@ -1133,7 +1130,6 @@ int intel_whitelist_workarounds_apply(struct intel_engine_cs *engine)
 		}
 	}
 
-	dev_priv->workarounds.hw_whitelist_count[engine->id] = total_count;
 	DRM_DEBUG_DRIVER("%s: Number of whitelist w/a: %u\n", engine->name,
 			 total_count);
 
-- 
1.9.1

* [RFC PATCH 10/20] drm/i915: Move WA BB stuff to the workarounds file as well
  2017-11-03 18:09 [RFC PATCH v5 00/20] Refactor HW workaround code Oscar Mateo
                   ` (8 preceding siblings ...)
  2017-11-03 18:09 ` [RFC PATCH 09/20] drm/i915: Do not store the total counts of WAs Oscar Mateo
@ 2017-11-03 18:09 ` Oscar Mateo
  2017-11-03 18:09 ` [RFC PATCH 11/20] drm/i915/cnl: Move GT and Display workarounds from init_clock_gating Oscar Mateo
                   ` (12 subsequent siblings)
  22 siblings, 0 replies; 30+ messages in thread
From: Oscar Mateo @ 2017-11-03 18:09 UTC (permalink / raw)
  To: intel-gfx

Since we are trying to put all the WA code together in one file, move the
batch buffer (BB) WAs there as well.

v2: s/intel_bb_workarounds_init/intel_engine_init_bb_workarounds (Chris)

v3: Rebased

Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/intel_lrc.c         | 253 +-----------------------------
 drivers/gpu/drm/i915/intel_workarounds.c | 254 +++++++++++++++++++++++++++++++
 drivers/gpu/drm/i915/intel_workarounds.h |   3 +
 3 files changed, 259 insertions(+), 251 deletions(-)
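
Note: for readers unfamiliar with the code being moved, the init function
emits each workaround batch back to back into one page, pads each to a
cacheline with NOOPs, and records its offset and size. A standalone sketch
of that bookkeeping follows. Everything named *_sketch or SKETCH_* is
hypothetical; the 64-byte cacheline and the zero NOOP encoding are
assumptions of this sketch, and alignment is computed relative to the
buffer start rather than from the absolute address as the driver does.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_CACHELINE_BYTES	64	/* assumed cacheline size */
#define SKETCH_MI_NOOP		0u	/* assumed NOOP command encoding */

struct bb_sketch {
	uint32_t offset;	/* byte offset of the batch within the page */
	uint32_t size;		/* byte size of the batch, padding included */
};

typedef uint32_t *(*emit_fn_sketch)(uint32_t *base, uint32_t *batch);

/* Append NOOPs until the write pointer sits on a cacheline boundary. */
static uint32_t *pad_to_cacheline(uint32_t *base, uint32_t *batch)
{
	while (((size_t)(batch - base) * sizeof(*batch)) % SKETCH_CACHELINE_BYTES)
		*batch++ = SKETCH_MI_NOOP;
	return batch;
}

/* Example emitter: three placeholder command dwords, then padding. */
static uint32_t *emit_three_dwords(uint32_t *base, uint32_t *batch)
{
	*batch++ = 0x1;
	*batch++ = 0x2;
	*batch++ = 0x3;
	return pad_to_cacheline(base, batch);
}

/* Emit every batch in turn, recording where it starts and how long it is. */
static int emit_all(uint32_t *base, const emit_fn_sketch *fns,
		    struct bb_sketch *bbs, size_t n)
{
	uint32_t *ptr = base;
	size_t i;

	for (i = 0; i < n; i++) {
		bbs[i].offset = (uint32_t)((size_t)(ptr - base) * sizeof(*ptr));
		if (bbs[i].offset % SKETCH_CACHELINE_BYTES)
			return -1;	/* previous batch was not padded */
		if (fns[i])
			ptr = fns[i](base, ptr);
		bbs[i].size = (uint32_t)((size_t)(ptr - base) * sizeof(*ptr)) -
			      bbs[i].offset;
	}

	return 0;
}
```

A NULL emitter simply yields a zero-sized batch at the current offset,
which matches how the unused per_ctx slot is handled in the moved code.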

diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index f0b4d2f..7f731d0 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -1181,255 +1181,6 @@ static int execlists_request_alloc(struct drm_i915_gem_request *request)
 	return 0;
 }
 
-/*
- * In this WA we need to set GEN8_L3SQCREG4[21:21] and reset it after
- * PIPE_CONTROL instruction. This is required for the flush to happen correctly
- * but there is a slight complication as this is applied in WA batch where the
- * values are only initialized once so we cannot take register value at the
- * beginning and reuse it further; hence we save its value to memory, upload a
- * constant value with bit21 set and then we restore it back with the saved value.
- * To simplify the WA, a constant value is formed by using the default value
- * of this register. This shouldn't be a problem because we are only modifying
- * it for a short period and this batch in non-premptible. We can ofcourse
- * use additional instructions that read the actual value of the register
- * at that time and set our bit of interest but it makes the WA complicated.
- *
- * This WA is also required for Gen9 so extracting as a function avoids
- * code duplication.
- */
-static u32 *
-gen8_emit_flush_coherentl3_wa(struct intel_engine_cs *engine, u32 *batch)
-{
-	*batch++ = MI_STORE_REGISTER_MEM_GEN8 | MI_SRM_LRM_GLOBAL_GTT;
-	*batch++ = i915_mmio_reg_offset(GEN8_L3SQCREG4);
-	*batch++ = i915_ggtt_offset(engine->scratch) + 256;
-	*batch++ = 0;
-
-	*batch++ = MI_LOAD_REGISTER_IMM(1);
-	*batch++ = i915_mmio_reg_offset(GEN8_L3SQCREG4);
-	*batch++ = 0x40400000 | GEN8_LQSC_FLUSH_COHERENT_LINES;
-
-	batch = gen8_emit_pipe_control(batch,
-				       PIPE_CONTROL_CS_STALL |
-				       PIPE_CONTROL_DC_FLUSH_ENABLE,
-				       0);
-
-	*batch++ = MI_LOAD_REGISTER_MEM_GEN8 | MI_SRM_LRM_GLOBAL_GTT;
-	*batch++ = i915_mmio_reg_offset(GEN8_L3SQCREG4);
-	*batch++ = i915_ggtt_offset(engine->scratch) + 256;
-	*batch++ = 0;
-
-	return batch;
-}
-
-/*
- * Typically we only have one indirect_ctx and per_ctx batch buffer which are
- * initialized at the beginning and shared across all contexts but this field
- * helps us to have multiple batches at different offsets and select them based
- * on a criteria. At the moment this batch always start at the beginning of the page
- * and at this point we don't have multiple wa_ctx batch buffers.
- *
- * The number of WA applied are not known at the beginning; we use this field
- * to return the no of DWORDS written.
- *
- * It is to be noted that this batch does not contain MI_BATCH_BUFFER_END
- * so it adds NOOPs as padding to make it cacheline aligned.
- * MI_BATCH_BUFFER_END will be added to perctx batch and both of them together
- * makes a complete batch buffer.
- */
-static u32 *gen8_init_indirectctx_bb(struct intel_engine_cs *engine, u32 *batch)
-{
-	/* WaDisableCtxRestoreArbitration:bdw,chv */
-	*batch++ = MI_ARB_ON_OFF | MI_ARB_DISABLE;
-
-	/* WaFlushCoherentL3CacheLinesAtContextSwitch:bdw */
-	if (IS_BROADWELL(engine->i915))
-		batch = gen8_emit_flush_coherentl3_wa(engine, batch);
-
-	/* WaClearSlmSpaceAtContextSwitch:bdw,chv */
-	/* Actual scratch location is at 128 bytes offset */
-	batch = gen8_emit_pipe_control(batch,
-				       PIPE_CONTROL_FLUSH_L3 |
-				       PIPE_CONTROL_GLOBAL_GTT_IVB |
-				       PIPE_CONTROL_CS_STALL |
-				       PIPE_CONTROL_QW_WRITE,
-				       i915_ggtt_offset(engine->scratch) +
-				       2 * CACHELINE_BYTES);
-
-	*batch++ = MI_ARB_ON_OFF | MI_ARB_ENABLE;
-
-	/* Pad to end of cacheline */
-	while ((unsigned long)batch % CACHELINE_BYTES)
-		*batch++ = MI_NOOP;
-
-	/*
-	 * MI_BATCH_BUFFER_END is not required in Indirect ctx BB because
-	 * execution depends on the length specified in terms of cache lines
-	 * in the register CTX_RCS_INDIRECT_CTX
-	 */
-
-	return batch;
-}
-
-static u32 *gen9_init_indirectctx_bb(struct intel_engine_cs *engine, u32 *batch)
-{
-	*batch++ = MI_ARB_ON_OFF | MI_ARB_DISABLE;
-
-	/* WaFlushCoherentL3CacheLinesAtContextSwitch:skl,bxt,glk */
-	batch = gen8_emit_flush_coherentl3_wa(engine, batch);
-
-	/* WaDisableGatherAtSetShaderCommonSlice:skl,bxt,kbl,glk */
-	*batch++ = MI_LOAD_REGISTER_IMM(1);
-	*batch++ = i915_mmio_reg_offset(COMMON_SLICE_CHICKEN2);
-	*batch++ = _MASKED_BIT_DISABLE(
-			GEN9_DISABLE_GATHER_AT_SET_SHADER_COMMON_SLICE);
-	*batch++ = MI_NOOP;
-
-	/* WaClearSlmSpaceAtContextSwitch:kbl */
-	/* Actual scratch location is at 128 bytes offset */
-	if (IS_KBL_REVID(engine->i915, 0, KBL_REVID_A0)) {
-		batch = gen8_emit_pipe_control(batch,
-					       PIPE_CONTROL_FLUSH_L3 |
-					       PIPE_CONTROL_GLOBAL_GTT_IVB |
-					       PIPE_CONTROL_CS_STALL |
-					       PIPE_CONTROL_QW_WRITE,
-					       i915_ggtt_offset(engine->scratch)
-					       + 2 * CACHELINE_BYTES);
-	}
-
-	/* WaMediaPoolStateCmdInWABB:bxt,glk */
-	if (HAS_POOLED_EU(engine->i915)) {
-		/*
-		 * EU pool configuration is setup along with golden context
-		 * during context initialization. This value depends on
-		 * device type (2x6 or 3x6) and needs to be updated based
-		 * on which subslice is disabled especially for 2x6
-		 * devices, however it is safe to load default
-		 * configuration of 3x6 device instead of masking off
-		 * corresponding bits because HW ignores bits of a disabled
-		 * subslice and drops down to appropriate config. Please
-		 * see render_state_setup() in i915_gem_render_state.c for
-		 * possible configurations, to avoid duplication they are
-		 * not shown here again.
-		 */
-		*batch++ = GEN9_MEDIA_POOL_STATE;
-		*batch++ = GEN9_MEDIA_POOL_ENABLE;
-		*batch++ = 0x00777000;
-		*batch++ = 0;
-		*batch++ = 0;
-		*batch++ = 0;
-	}
-
-	*batch++ = MI_ARB_ON_OFF | MI_ARB_ENABLE;
-
-	/* Pad to end of cacheline */
-	while ((unsigned long)batch % CACHELINE_BYTES)
-		*batch++ = MI_NOOP;
-
-	return batch;
-}
-
-#define CTX_WA_BB_OBJ_SIZE (PAGE_SIZE)
-
-static int lrc_setup_wa_ctx(struct intel_engine_cs *engine)
-{
-	struct drm_i915_gem_object *obj;
-	struct i915_vma *vma;
-	int err;
-
-	obj = i915_gem_object_create(engine->i915, CTX_WA_BB_OBJ_SIZE);
-	if (IS_ERR(obj))
-		return PTR_ERR(obj);
-
-	vma = i915_vma_instance(obj, &engine->i915->ggtt.base, NULL);
-	if (IS_ERR(vma)) {
-		err = PTR_ERR(vma);
-		goto err;
-	}
-
-	err = i915_vma_pin(vma, 0, PAGE_SIZE, PIN_GLOBAL | PIN_HIGH);
-	if (err)
-		goto err;
-
-	engine->wa_ctx.vma = vma;
-	return 0;
-
-err:
-	i915_gem_object_put(obj);
-	return err;
-}
-
-static void lrc_destroy_wa_ctx(struct intel_engine_cs *engine)
-{
-	i915_vma_unpin_and_release(&engine->wa_ctx.vma);
-}
-
-typedef u32 *(*wa_bb_func_t)(struct intel_engine_cs *engine, u32 *batch);
-
-static int intel_init_workaround_bb(struct intel_engine_cs *engine)
-{
-	struct i915_ctx_workarounds *wa_ctx = &engine->wa_ctx;
-	struct i915_wa_ctx_bb *wa_bb[2] = { &wa_ctx->indirect_ctx,
-					    &wa_ctx->per_ctx };
-	wa_bb_func_t wa_bb_fn[2];
-	struct page *page;
-	void *batch, *batch_ptr;
-	unsigned int i;
-	int ret;
-
-	if (WARN_ON(engine->id != RCS || !engine->scratch))
-		return -EINVAL;
-
-	switch (INTEL_GEN(engine->i915)) {
-	case 10:
-		return 0;
-	case 9:
-		wa_bb_fn[0] = gen9_init_indirectctx_bb;
-		wa_bb_fn[1] = NULL;
-		break;
-	case 8:
-		wa_bb_fn[0] = gen8_init_indirectctx_bb;
-		wa_bb_fn[1] = NULL;
-		break;
-	default:
-		MISSING_CASE(INTEL_GEN(engine->i915));
-		return 0;
-	}
-
-	ret = lrc_setup_wa_ctx(engine);
-	if (ret) {
-		DRM_DEBUG_DRIVER("Failed to setup context WA page: %d\n", ret);
-		return ret;
-	}
-
-	page = i915_gem_object_get_dirty_page(wa_ctx->vma->obj, 0);
-	batch = batch_ptr = kmap_atomic(page);
-
-	/*
-	 * Emit the two workaround batch buffers, recording the offset from the
-	 * start of the workaround batch buffer object for each and their
-	 * respective sizes.
-	 */
-	for (i = 0; i < ARRAY_SIZE(wa_bb_fn); i++) {
-		wa_bb[i]->offset = batch_ptr - batch;
-		if (WARN_ON(!IS_ALIGNED(wa_bb[i]->offset, CACHELINE_BYTES))) {
-			ret = -EINVAL;
-			break;
-		}
-		if (wa_bb_fn[i])
-			batch_ptr = wa_bb_fn[i](engine, batch_ptr);
-		wa_bb[i]->size = batch_ptr - (batch + wa_bb[i]->offset);
-	}
-
-	BUG_ON(batch_ptr - batch > CTX_WA_BB_OBJ_SIZE);
-
-	kunmap_atomic(batch);
-	if (ret)
-		lrc_destroy_wa_ctx(engine);
-
-	return ret;
-}
-
 static u8 gtiir[] = {
 	[RCS] = 0,
 	[BCS] = 0,
@@ -1875,7 +1626,7 @@ void intel_logical_ring_cleanup(struct intel_engine_cs *engine)
 
 	intel_engine_cleanup_common(engine);
 
-	lrc_destroy_wa_ctx(engine);
+	intel_engine_fini_bb_workarounds(engine);
 	engine->i915 = NULL;
 	dev_priv->engine[engine->id] = NULL;
 	kfree(engine);
@@ -1994,7 +1745,7 @@ int logical_render_ring_init(struct intel_engine_cs *engine)
 	if (ret)
 		return ret;
 
-	ret = intel_init_workaround_bb(engine);
+	ret = intel_engine_init_bb_workarounds(engine);
 	if (ret) {
 		/*
 		 * We continue even if we fail to initialize WA batch
diff --git a/drivers/gpu/drm/i915/intel_workarounds.c b/drivers/gpu/drm/i915/intel_workarounds.c
index 74e59bb..72e8d90 100644
--- a/drivers/gpu/drm/i915/intel_workarounds.c
+++ b/drivers/gpu/drm/i915/intel_workarounds.c
@@ -1135,3 +1135,257 @@ int intel_whitelist_workarounds_apply(struct intel_engine_cs *engine)
 
 	return 0;
 }
+
+/*
+ * In this WA we need to set GEN8_L3SQCREG4[21:21] and reset it after
+ * PIPE_CONTROL instruction. This is required for the flush to happen correctly
+ * but there is a slight complication as this is applied in WA batch where the
+ * values are only initialized once so we cannot take register value at the
+ * beginning and reuse it further; hence we save its value to memory, upload a
+ * constant value with bit21 set and then we restore it back with the saved value.
+ * To simplify the WA, a constant value is formed by using the default value
+ * of this register. This shouldn't be a problem because we are only modifying
+ * it for a short period and this batch is non-preemptible. We can of course
+ * use additional instructions that read the actual value of the register
+ * at that time and set our bit of interest but it makes the WA complicated.
+ *
+ * This WA is also required for Gen9 so extracting as a function avoids
+ * code duplication.
+ */
+static u32 *
+gen8_emit_flush_coherentl3_wa(struct intel_engine_cs *engine, u32 *batch)
+{
+	*batch++ = MI_STORE_REGISTER_MEM_GEN8 | MI_SRM_LRM_GLOBAL_GTT;
+	*batch++ = i915_mmio_reg_offset(GEN8_L3SQCREG4);
+	*batch++ = i915_ggtt_offset(engine->scratch) + 256;
+	*batch++ = 0;
+
+	*batch++ = MI_LOAD_REGISTER_IMM(1);
+	*batch++ = i915_mmio_reg_offset(GEN8_L3SQCREG4);
+	*batch++ = 0x40400000 | GEN8_LQSC_FLUSH_COHERENT_LINES;
+
+	batch = gen8_emit_pipe_control(batch,
+				       PIPE_CONTROL_CS_STALL |
+				       PIPE_CONTROL_DC_FLUSH_ENABLE,
+				       0);
+
+	*batch++ = MI_LOAD_REGISTER_MEM_GEN8 | MI_SRM_LRM_GLOBAL_GTT;
+	*batch++ = i915_mmio_reg_offset(GEN8_L3SQCREG4);
+	*batch++ = i915_ggtt_offset(engine->scratch) + 256;
+	*batch++ = 0;
+
+	return batch;
+}
+
+/*
+ * Typically we only have one indirect_ctx and per_ctx batch buffer which are
+ * initialized at the beginning and shared across all contexts but this field
+ * helps us to have multiple batches at different offsets and select them based
+ * on a criteria. At the moment this batch always start at the beginning of the page
+ * and at this point we don't have multiple wa_ctx batch buffers.
+ *
+ * The number of WA applied are not known at the beginning; we use this field
+ * to return the no of DWORDS written.
+ *
+ * It is to be noted that this batch does not contain MI_BATCH_BUFFER_END
+ * so it adds NOOPs as padding to make it cacheline aligned.
+ * MI_BATCH_BUFFER_END will be added to perctx batch and both of them together
+ * makes a complete batch buffer.
+ */
+static u32 *gen8_init_indirectctx_bb(struct intel_engine_cs *engine, u32 *batch)
+{
+	/* WaDisableCtxRestoreArbitration:bdw,chv */
+	*batch++ = MI_ARB_ON_OFF | MI_ARB_DISABLE;
+
+	/* WaFlushCoherentL3CacheLinesAtContextSwitch:bdw */
+	if (IS_BROADWELL(engine->i915))
+		batch = gen8_emit_flush_coherentl3_wa(engine, batch);
+
+	/* WaClearSlmSpaceAtContextSwitch:bdw,chv */
+	/* Actual scratch location is at 128 bytes offset */
+	batch = gen8_emit_pipe_control(batch,
+				       PIPE_CONTROL_FLUSH_L3 |
+				       PIPE_CONTROL_GLOBAL_GTT_IVB |
+				       PIPE_CONTROL_CS_STALL |
+				       PIPE_CONTROL_QW_WRITE,
+				       i915_ggtt_offset(engine->scratch) +
+				       2 * CACHELINE_BYTES);
+
+	*batch++ = MI_ARB_ON_OFF | MI_ARB_ENABLE;
+
+	/* Pad to end of cacheline */
+	while ((unsigned long)batch % CACHELINE_BYTES)
+		*batch++ = MI_NOOP;
+
+	/*
+	 * MI_BATCH_BUFFER_END is not required in Indirect ctx BB because
+	 * execution depends on the length specified in terms of cache lines
+	 * in the register CTX_RCS_INDIRECT_CTX
+	 */
+
+	return batch;
+}
+
+static u32 *gen9_init_indirectctx_bb(struct intel_engine_cs *engine, u32 *batch)
+{
+	*batch++ = MI_ARB_ON_OFF | MI_ARB_DISABLE;
+
+	/* WaFlushCoherentL3CacheLinesAtContextSwitch:skl,bxt,glk */
+	batch = gen8_emit_flush_coherentl3_wa(engine, batch);
+
+	/* WaDisableGatherAtSetShaderCommonSlice:skl,bxt,kbl,glk */
+	*batch++ = MI_LOAD_REGISTER_IMM(1);
+	*batch++ = i915_mmio_reg_offset(COMMON_SLICE_CHICKEN2);
+	*batch++ = _MASKED_BIT_DISABLE(
+			GEN9_DISABLE_GATHER_AT_SET_SHADER_COMMON_SLICE);
+	*batch++ = MI_NOOP;
+
+	/* WaClearSlmSpaceAtContextSwitch:kbl */
+	/* Actual scratch location is at 128 bytes offset */
+	if (IS_KBL_REVID(engine->i915, 0, KBL_REVID_A0)) {
+		batch = gen8_emit_pipe_control(batch,
+					       PIPE_CONTROL_FLUSH_L3 |
+					       PIPE_CONTROL_GLOBAL_GTT_IVB |
+					       PIPE_CONTROL_CS_STALL |
+					       PIPE_CONTROL_QW_WRITE,
+					       i915_ggtt_offset(engine->scratch)
+					       + 2 * CACHELINE_BYTES);
+	}
+
+	/* WaMediaPoolStateCmdInWABB:bxt,glk */
+	if (HAS_POOLED_EU(engine->i915)) {
+		/*
+		 * EU pool configuration is setup along with golden context
+		 * during context initialization. This value depends on
+		 * device type (2x6 or 3x6) and needs to be updated based
+		 * on which subslice is disabled especially for 2x6
+		 * devices, however it is safe to load default
+		 * configuration of 3x6 device instead of masking off
+		 * corresponding bits because HW ignores bits of a disabled
+		 * subslice and drops down to appropriate config. Please
+		 * see render_state_setup() in i915_gem_render_state.c for
+		 * possible configurations, to avoid duplication they are
+		 * not shown here again.
+		 */
+		*batch++ = GEN9_MEDIA_POOL_STATE;
+		*batch++ = GEN9_MEDIA_POOL_ENABLE;
+		*batch++ = 0x00777000;
+		*batch++ = 0;
+		*batch++ = 0;
+		*batch++ = 0;
+	}
+
+	*batch++ = MI_ARB_ON_OFF | MI_ARB_ENABLE;
+
+	/* Pad to end of cacheline */
+	while ((unsigned long)batch % CACHELINE_BYTES)
+		*batch++ = MI_NOOP;
+
+	return batch;
+}
+
+#define CTX_WA_BB_OBJ_SIZE (PAGE_SIZE)
+
+static int lrc_setup_wa_ctx(struct intel_engine_cs *engine)
+{
+	struct drm_i915_gem_object *obj;
+	struct i915_vma *vma;
+	int err;
+
+	obj = i915_gem_object_create(engine->i915, CTX_WA_BB_OBJ_SIZE);
+	if (IS_ERR(obj))
+		return PTR_ERR(obj);
+
+	vma = i915_vma_instance(obj, &engine->i915->ggtt.base, NULL);
+	if (IS_ERR(vma)) {
+		err = PTR_ERR(vma);
+		goto err;
+	}
+
+	err = i915_vma_pin(vma, 0, PAGE_SIZE, PIN_GLOBAL | PIN_HIGH);
+	if (err)
+		goto err;
+
+	engine->wa_ctx.vma = vma;
+	return 0;
+
+err:
+	i915_gem_object_put(obj);
+	return err;
+}
+
+static void lrc_destroy_wa_ctx(struct intel_engine_cs *engine)
+{
+	i915_vma_unpin_and_release(&engine->wa_ctx.vma);
+}
+
+typedef u32 *(*wa_bb_func_t)(struct intel_engine_cs *engine, u32 *batch);
+
+int intel_engine_init_bb_workarounds(struct intel_engine_cs *engine)
+{
+	struct i915_ctx_workarounds *wa_ctx = &engine->wa_ctx;
+	struct i915_wa_ctx_bb *wa_bb[2] = { &wa_ctx->indirect_ctx,
+					    &wa_ctx->per_ctx };
+	wa_bb_func_t wa_bb_fn[2];
+	struct page *page;
+	void *batch, *batch_ptr;
+	unsigned int i;
+	int ret;
+
+	if (WARN_ON(engine->id != RCS || !engine->scratch))
+		return -EINVAL;
+
+	switch (INTEL_GEN(engine->i915)) {
+	case 10:
+		return 0;
+	case 9:
+		wa_bb_fn[0] = gen9_init_indirectctx_bb;
+		wa_bb_fn[1] = NULL;
+		break;
+	case 8:
+		wa_bb_fn[0] = gen8_init_indirectctx_bb;
+		wa_bb_fn[1] = NULL;
+		break;
+	default:
+		MISSING_CASE(INTEL_GEN(engine->i915));
+		return 0;
+	}
+
+	ret = lrc_setup_wa_ctx(engine);
+	if (ret) {
+		DRM_DEBUG_DRIVER("Failed to set up context WA page: %d\n", ret);
+		return ret;
+	}
+
+	page = i915_gem_object_get_dirty_page(wa_ctx->vma->obj, 0);
+	batch = batch_ptr = kmap_atomic(page);
+
+	/*
+	 * Emit the two workaround batch buffers, recording the offset from the
+	 * start of the workaround batch buffer object for each and their
+	 * respective sizes.
+	 */
+	for (i = 0; i < ARRAY_SIZE(wa_bb_fn); i++) {
+		wa_bb[i]->offset = batch_ptr - batch;
+		if (WARN_ON(!IS_ALIGNED(wa_bb[i]->offset, CACHELINE_BYTES))) {
+			ret = -EINVAL;
+			break;
+		}
+		if (wa_bb_fn[i])
+			batch_ptr = wa_bb_fn[i](engine, batch_ptr);
+		wa_bb[i]->size = batch_ptr - (batch + wa_bb[i]->offset);
+	}
+
+	BUG_ON(batch_ptr - batch > CTX_WA_BB_OBJ_SIZE);
+
+	kunmap_atomic(batch);
+	if (ret)
+		lrc_destroy_wa_ctx(engine);
+
+	return ret;
+}
+
+void intel_engine_fini_bb_workarounds(struct intel_engine_cs *engine)
+{
+	lrc_destroy_wa_ctx(engine);
+}
diff --git a/drivers/gpu/drm/i915/intel_workarounds.h b/drivers/gpu/drm/i915/intel_workarounds.h
index c73dc66..e5c6eee 100644
--- a/drivers/gpu/drm/i915/intel_workarounds.h
+++ b/drivers/gpu/drm/i915/intel_workarounds.h
@@ -45,4 +45,7 @@ void intel_whitelist_workarounds_get(struct drm_i915_private *dev_priv,
                                      uint *table_count);
 int intel_whitelist_workarounds_apply(struct intel_engine_cs *engine);
 
+int intel_engine_init_bb_workarounds(struct intel_engine_cs *engine);
+void intel_engine_fini_bb_workarounds(struct intel_engine_cs *engine);
+
 #endif
-- 
1.9.1
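
For readers following the emit loop in intel_engine_init_bb_workarounds()
above, here is a minimal userspace sketch of the same bookkeeping: each batch
starts at a cacheline-aligned offset, the emitter pads itself to the end of a
cacheline, and the size is derived from how far the write pointer advanced.
It is not part of the patch. All names (wa_bb, emit_example_bb, emit_wa_bbs)
and the 64-byte cacheline are assumptions made for illustration, and the page
is assumed to be cacheline-aligned.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CACHELINE_BYTES 64u	/* assumed cacheline size */
#define MI_NOOP 0u		/* placeholder value for a no-op dword */

/* Byte offset and size of one batch inside the shared page (hypothetical). */
struct wa_bb {
	uint32_t offset;
	uint32_t size;
};

typedef uint32_t *(*wa_bb_func_t)(uint32_t *batch);

/* Hypothetical emitter: three payload dwords, then pad to a cacheline. */
static uint32_t *emit_example_bb(uint32_t *batch)
{
	*batch++ = 0x1;
	*batch++ = 0x2;
	*batch++ = 0x3;

	while ((uintptr_t)batch % CACHELINE_BYTES)
		*batch++ = MI_NOOP;

	return batch;
}

/*
 * Run each emitter in turn, recording where its batch starts and how many
 * bytes it wrote. A NULL emitter yields an empty batch. Returns 0 on
 * success, -1 if a batch would start at an unaligned offset.
 */
static int emit_wa_bbs(uint32_t *page, size_t page_bytes,
		       const wa_bb_func_t fn[], struct wa_bb bb[], size_t count)
{
	uint32_t *ptr = page;
	size_t i;

	for (i = 0; i < count; i++) {
		bb[i].offset = (uint32_t)((char *)ptr - (char *)page);
		if (bb[i].offset % CACHELINE_BYTES)
			return -1;
		if (fn[i])
			ptr = fn[i](ptr);
		bb[i].size = (uint32_t)((char *)ptr -
					((char *)page + bb[i].offset));
	}

	/* Like the BUG_ON() in the patch, this only detects overflow after the fact. */
	assert((size_t)((char *)ptr - (char *)page) <= page_bytes);
	return 0;
}
```

With one real emitter followed by a NULL slot, as in the gen8/gen9 cases, the
first batch lands at offset 0 with a size of one cacheline and the second is
recorded as an empty batch at the next cacheline boundary.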

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


* [RFC PATCH 11/20] drm/i915/cnl: Move GT and Display workarounds from init_clock_gating
  2017-11-03 18:09 [RFC PATCH v5 00/20] Refactor HW workaround code Oscar Mateo
                   ` (9 preceding siblings ...)
  2017-11-03 18:09 ` [RFC PATCH 10/20] drm/i915: Move WA BB stuff to the workarounds file as well Oscar Mateo
@ 2017-11-03 18:09 ` Oscar Mateo
  2017-11-03 18:09 ` [RFC PATCH 12/20] drm/i915/gen9: " Oscar Mateo
                   ` (11 subsequent siblings)
  22 siblings, 0 replies; 30+ messages in thread
From: Oscar Mateo @ 2017-11-03 18:09 UTC (permalink / raw)
  To: intel-gfx; +Cc: Rodrigo Vivi

Move them to their rightful place inside intel_workarounds.c.

v2: classify WaSarbUnitClockGatingDisable as GT WA (Ville)
v3: Static tables

Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> (v1)
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
---
 drivers/gpu/drm/i915/intel_pm.c          | 32 +-------------------------------
 drivers/gpu/drm/i915/intel_workarounds.c | 32 ++++++++++++++++++++++++++++++++
 2 files changed, 33 insertions(+), 31 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index 0d0e84b..ff3ac6c 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -8519,36 +8519,6 @@ static void cnp_init_clock_gating(struct drm_i915_private *dev_priv)
 		   CNP_PWM_CGE_GATING_DISABLE);
 }
 
-static void cnl_init_clock_gating(struct drm_i915_private *dev_priv)
-{
-	u32 val;
-	cnp_init_clock_gating(dev_priv);
-
-	/* This is not an Wa. Enable for better image quality */
-	I915_WRITE(_3D_CHICKEN3,
-		   _MASKED_BIT_ENABLE(_3D_CHICKEN3_AA_LINE_QUALITY_FIX_ENABLE));
-
-	/* WaEnableChickenDCPR:cnl */
-	I915_WRITE(GEN8_CHICKEN_DCPR_1,
-		   I915_READ(GEN8_CHICKEN_DCPR_1) | MASK_WAKEMEM);
-
-	/* WaFbcWakeMemOn:cnl */
-	I915_WRITE(DISP_ARB_CTL, I915_READ(DISP_ARB_CTL) |
-		   DISP_FBC_MEMORY_WAKE);
-
-	/* WaSarbUnitClockGatingDisable:cnl (pre-prod) */
-	if (IS_CNL_REVID(dev_priv, CNL_REVID_A0, CNL_REVID_B0))
-		I915_WRITE(SLICE_UNIT_LEVEL_CLKGATE,
-			   I915_READ(SLICE_UNIT_LEVEL_CLKGATE) |
-			   SARBUNIT_CLKGATE_DIS);
-
-	/* Display WA #1133: WaFbcSkipSegments:cnl */
-	val = I915_READ(ILK_DPFC_CHICKEN);
-	val &= ~GLK_SKIP_SEG_COUNT_MASK;
-	val |= GLK_SKIP_SEG_EN | GLK_SKIP_SEG_COUNT(1);
-	I915_WRITE(ILK_DPFC_CHICKEN, val);
-}
-
 static void cfl_init_clock_gating(struct drm_i915_private *dev_priv)
 {
 	cnp_init_clock_gating(dev_priv);
@@ -9040,7 +9010,7 @@ static void nop_init_clock_gating(struct drm_i915_private *dev_priv)
 void intel_init_clock_gating_hooks(struct drm_i915_private *dev_priv)
 {
 	if (IS_CANNONLAKE(dev_priv))
-		dev_priv->display.init_clock_gating = cnl_init_clock_gating;
+		dev_priv->display.init_clock_gating = nop_init_clock_gating;
 	else if (IS_COFFEELAKE(dev_priv))
 		dev_priv->display.init_clock_gating = cfl_init_clock_gating;
 	else if (IS_SKYLAKE(dev_priv))
diff --git a/drivers/gpu/drm/i915/intel_workarounds.c b/drivers/gpu/drm/i915/intel_workarounds.c
index 72e8d90..a0b34d9 100644
--- a/drivers/gpu/drm/i915/intel_workarounds.c
+++ b/drivers/gpu/drm/i915/intel_workarounds.c
@@ -779,6 +779,15 @@ static uint mmio_workarounds_apply(struct drm_i915_private *dev_priv,
 	{ WA_GT("WaEnablePreemptionGranularityControlByUMD"),
 	  ALL_REVS, REG(GEN7_FF_SLICE_CS_CHICKEN1),
 	  SET_BIT_MASKED(GEN9_FFSC_PERCTX_PREEMPT_CTRL) },
+
+	/* This is not a WA. Enable for better image quality */
+	{ WA_GT(""),
+	  ALL_REVS, REG(_3D_CHICKEN3),
+	  SET_BIT_MASKED(_3D_CHICKEN3_AA_LINE_QUALITY_FIX_ENABLE) },
+
+	{ WA_GT("WaSarbUnitClockGatingDisable (pre-prod)"),
+	  REVS(CNL_REVID_A0, CNL_REVID_B0), REG(SLICE_UNIT_LEVEL_CLKGATE),
+	  SET_BIT(SARBUNIT_CLKGATE_DIS) },
 };
 
 static const struct i915_wa_reg_table bdw_gt_wa_tbl[] = {
@@ -894,10 +903,33 @@ void intel_gt_workarounds_apply(struct drm_i915_private *dev_priv)
 static struct i915_wa_reg glk_disp_was[] = {
 };
 
+static bool has_pch_cnp(struct drm_i915_private *dev_priv,
+			struct i915_wa_reg *wa)
+{
+	return HAS_PCH_CNP(dev_priv);
+}
+
 static struct i915_wa_reg cfl_disp_was[] = {
 };
 
 static struct i915_wa_reg cnl_disp_was[] = {
+	{ WA_DISP("Wa #1181"),
+	  ALL_REVS, REG(SOUTH_DSPCLK_GATE_D),
+	  SET_BIT(CNP_PWM_CGE_GATING_DISABLE),
+	  .pre_hook = has_pch_cnp },
+
+	{ WA_DISP("WaEnableChickenDCPR"),
+	  ALL_REVS, REG(GEN8_CHICKEN_DCPR_1),
+	  SET_BIT(MASK_WAKEMEM) },
+
+	{ WA_DISP("WaFbcWakeMemOn"),
+	  ALL_REVS, REG(DISP_ARB_CTL),
+	  SET_BIT(DISP_FBC_MEMORY_WAKE) },
+
+	{ WA_DISP("Display WA #1133: WaFbcSkipSegments"),
+	  ALL_REVS, REG(ILK_DPFC_CHICKEN),
+	  SET_FIELD(GLK_SKIP_SEG_COUNT_MASK,
+		    GLK_SKIP_SEG_EN | GLK_SKIP_SEG_COUNT(1)) },
 };
 
 static const struct i915_wa_reg_table bdw_disp_wa_tbl[] = {
-- 
1.9.1
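
As a reading aid for the table entries above, the following sketch shows one
way a SET_BIT / CLEAR_BIT / SET_FIELD entry with an optional pre_hook could
be applied as a read-modify-write. It is not the code from this series: the
struct layout, the WA_* macros, the fake register file and apply_wa_table()
are all hypothetical, and the real definitions of i915_wa_reg and its helper
macros live in earlier patches that are not shown here. The update rule is
chosen to match the open-coded sequences being removed from intel_pm.c
(clear the mask, then OR in the value).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define FAKE_REG_COUNT 4u

/* Stand-in for the device: a tiny register file plus one capability flag. */
struct fake_dev {
	uint32_t regs[FAKE_REG_COUNT];
	bool has_pch_cnp;
};

/* Hypothetical table entry: which register, which bits, and an optional gate. */
struct wa_entry {
	const char *name;
	unsigned int reg;
	uint32_t mask;
	uint32_t value;
	bool (*pre_hook)(const struct fake_dev *dev);
};

/* Assumed semantics of the helper macros used in the tables. */
#define WA_SET_BIT(x)		.mask = (x), .value = (x)
#define WA_CLEAR_BIT(x)		.mask = (x), .value = 0
#define WA_SET_FIELD(m, v)	.mask = (m), .value = (v)

static bool dev_has_pch_cnp(const struct fake_dev *dev)
{
	return dev->has_pch_cnp;
}

/* Apply every entry whose pre_hook (if any) passes; return how many applied. */
static size_t apply_wa_table(struct fake_dev *dev,
			     const struct wa_entry *tbl, size_t count)
{
	size_t applied = 0;
	size_t i;

	for (i = 0; i < count; i++) {
		const struct wa_entry *wa = &tbl[i];

		assert(wa->reg < FAKE_REG_COUNT);
		if (wa->pre_hook && !wa->pre_hook(dev))
			continue;

		/* Same shape as: val &= ~mask; val |= value; */
		dev->regs[wa->reg] = (dev->regs[wa->reg] & ~wa->mask) | wa->value;
		applied++;
	}

	return applied;
}
```

Note that under this rule a SET_FIELD value may carry bits outside the mask,
which is what the WaFbcSkipSegments entry relies on when it ORs
GLK_SKIP_SEG_EN into the value while masking only the count field.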

* [RFC PATCH 12/20] drm/i915/gen9: Move GT and Display workarounds from init_clock_gating
  2017-11-03 18:09 [RFC PATCH v5 00/20] Refactor HW workaround code Oscar Mateo
                   ` (10 preceding siblings ...)
  2017-11-03 18:09 ` [RFC PATCH 11/20] drm/i915/cnl: Move GT and Display workarounds from init_clock_gating Oscar Mateo
@ 2017-11-03 18:09 ` Oscar Mateo
  2017-11-03 18:09 ` [RFC PATCH 13/20] drm/i915/cfl: " Oscar Mateo
                   ` (10 subsequent siblings)
  22 siblings, 0 replies; 30+ messages in thread
From: Oscar Mateo @ 2017-11-03 18:09 UTC (permalink / raw)
  To: intel-gfx; +Cc: Rodrigo Vivi

Move them to their rightful place inside intel_workarounds.c.

v2:
  - Rebase on WA removed
  - Rebased to carry the init_early nomenclature over (Chris)

v3: Static tables

Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> (v1)
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
---
 drivers/gpu/drm/i915/intel_pm.c          | 48 --------------------------------
 drivers/gpu/drm/i915/intel_workarounds.c | 28 +++++++++++++++++++
 2 files changed, 28 insertions(+), 48 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index ff3ac6c..f712b02 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -57,50 +57,8 @@
 #define INTEL_RC6p_ENABLE			(1<<1)
 #define INTEL_RC6pp_ENABLE			(1<<2)
 
-static void gen9_init_clock_gating(struct drm_i915_private *dev_priv)
-{
-	if (HAS_LLC(dev_priv)) {
-		/*
-		 * WaCompressedResourceDisplayNewHashMode:skl,kbl
-		 * Display WA#0390: skl,kbl
-		 *
-		 * Must match Sampler, Pixel Back End, and Media. See
-		 * WaCompressedResourceSamplerPbeMediaNewHashMode.
-		 */
-		I915_WRITE(CHICKEN_PAR1_1,
-			   I915_READ(CHICKEN_PAR1_1) |
-			   SKL_DE_COMPRESSED_HASH_MODE);
-	}
-
-	/* See Bspec note for PSR2_CTL bit 31, Wa#828:skl,bxt,kbl,cfl */
-	I915_WRITE(CHICKEN_PAR1_1,
-		   I915_READ(CHICKEN_PAR1_1) | SKL_EDP_PSR_FIX_RDWRAP);
-
-	/* WaEnableChickenDCPR:skl,bxt,kbl,glk,cfl */
-	I915_WRITE(GEN8_CHICKEN_DCPR_1,
-		   I915_READ(GEN8_CHICKEN_DCPR_1) | MASK_WAKEMEM);
-
-	/* WaFbcTurnOffFbcWatermark:skl,bxt,kbl,cfl */
-	/* WaFbcWakeMemOn:skl,bxt,kbl,glk,cfl */
-	I915_WRITE(DISP_ARB_CTL, I915_READ(DISP_ARB_CTL) |
-		   DISP_FBC_WM_DIS |
-		   DISP_FBC_MEMORY_WAKE);
-
-	/* WaFbcHighMemBwCorruptionAvoidance:skl,bxt,kbl,cfl */
-	I915_WRITE(ILK_DPFC_CHICKEN, I915_READ(ILK_DPFC_CHICKEN) |
-		   ILK_DPFC_DISABLE_DUMMY0);
-
-	if (IS_SKYLAKE(dev_priv)) {
-		/* WaDisableDopClockGating */
-		I915_WRITE(GEN7_MISCCPCTL, I915_READ(GEN7_MISCCPCTL)
-			   & ~GEN7_DOP_CLOCK_GATE_ENABLE);
-	}
-}
-
 static void bxt_init_clock_gating(struct drm_i915_private *dev_priv)
 {
-	gen9_init_clock_gating(dev_priv);
-
 	/* WaDisableSDEUnitClockGating:bxt */
 	I915_WRITE(GEN8_UCGCTL6, I915_READ(GEN8_UCGCTL6) |
 		   GEN8_SDEUNIT_CLOCK_GATE_DISABLE);
@@ -123,7 +81,6 @@ static void bxt_init_clock_gating(struct drm_i915_private *dev_priv)
 static void glk_init_clock_gating(struct drm_i915_private *dev_priv)
 {
 	u32 val;
-	gen9_init_clock_gating(dev_priv);
 
 	/*
 	 * WaDisablePWMClockGating:glk
@@ -8522,7 +8479,6 @@ static void cnp_init_clock_gating(struct drm_i915_private *dev_priv)
 static void cfl_init_clock_gating(struct drm_i915_private *dev_priv)
 {
 	cnp_init_clock_gating(dev_priv);
-	gen9_init_clock_gating(dev_priv);
 
 	/* WaFbcNukeOnHostModify:cfl */
 	I915_WRITE(ILK_DPFC_CHICKEN, I915_READ(ILK_DPFC_CHICKEN) |
@@ -8531,8 +8487,6 @@ static void cfl_init_clock_gating(struct drm_i915_private *dev_priv)
 
 static void kbl_init_clock_gating(struct drm_i915_private *dev_priv)
 {
-	gen9_init_clock_gating(dev_priv);
-
 	/* WaDisableSDEUnitClockGating:kbl */
 	if (IS_KBL_REVID(dev_priv, 0, KBL_REVID_B0))
 		I915_WRITE(GEN8_UCGCTL6, I915_READ(GEN8_UCGCTL6) |
@@ -8550,8 +8504,6 @@ static void kbl_init_clock_gating(struct drm_i915_private *dev_priv)
 
 static void skl_init_clock_gating(struct drm_i915_private *dev_priv)
 {
-	gen9_init_clock_gating(dev_priv);
-
 	/* WAC6entrylatency:skl */
 	I915_WRITE(FBC_LLC_READ_CTRL, I915_READ(FBC_LLC_READ_CTRL) |
 		   FBC_LLC_FULLY_OPEN);
diff --git a/drivers/gpu/drm/i915/intel_workarounds.c b/drivers/gpu/drm/i915/intel_workarounds.c
index a0b34d9..d5cbda1 100644
--- a/drivers/gpu/drm/i915/intel_workarounds.c
+++ b/drivers/gpu/drm/i915/intel_workarounds.c
@@ -889,9 +889,37 @@ void intel_gt_workarounds_apply(struct drm_i915_private *dev_priv)
 };
 
 static struct i915_wa_reg gen9_disp_was[] = {
+	/*
+	 * Must match Sampler, Pixel Back End, and Media. See
+	 * WaCompressedResourceSamplerPbeMediaNewHashMode.
+	 */
+	{ WA_DISP("WaCompressedResourceDisplayNewHashMode + Display WA#0390"),
+	  ALL_REVS, REG(CHICKEN_PAR1_1),
+	  SET_BIT(SKL_DE_COMPRESSED_HASH_MODE),
+	  .pre_hook = has_llc },
+
+	/* See Bspec note for PSR2_CTL bit 31 */
+	{ WA_DISP("Wa#828"),
+	  ALL_REVS, REG(CHICKEN_PAR1_1),
+	  SET_BIT(SKL_EDP_PSR_FIX_RDWRAP) },
+
+	{ WA_DISP("WaEnableChickenDCPR"),
+	  ALL_REVS, REG(GEN8_CHICKEN_DCPR_1),
+	  SET_BIT(MASK_WAKEMEM) },
+
+	{ WA_DISP("WaFbcTurnOffFbcWatermark + WaFbcWakeMemOn"),
+	  ALL_REVS, REG(DISP_ARB_CTL),
+	  SET_BIT(DISP_FBC_WM_DIS | DISP_FBC_MEMORY_WAKE) },
+
+	{ WA_DISP("WaFbcHighMemBwCorruptionAvoidance"),
+	  ALL_REVS, REG(ILK_DPFC_CHICKEN),
+	  SET_BIT(ILK_DPFC_DISABLE_DUMMY0) },
 };
 
 static struct i915_wa_reg skl_disp_was[] = {
+	{ WA_DISP("WaDisableDopClockGating"),
+	  ALL_REVS, REG(GEN7_MISCCPCTL),
+	  CLEAR_BIT(GEN7_DOP_CLOCK_GATE_ENABLE) },
 };
 
 static struct i915_wa_reg bxt_disp_was[] = {
-- 
1.9.1

* [RFC PATCH 13/20] drm/i915/cfl: Move GT and Display workarounds from init_clock_gating
  2017-11-03 18:09 [RFC PATCH v5 00/20] Refactor HW workaround code Oscar Mateo
                   ` (11 preceding siblings ...)
  2017-11-03 18:09 ` [RFC PATCH 12/20] drm/i915/gen9: " Oscar Mateo
@ 2017-11-03 18:09 ` Oscar Mateo
  2017-11-03 18:09 ` [RFC PATCH 14/20] drm/i915/glk: " Oscar Mateo
                   ` (9 subsequent siblings)
  22 siblings, 0 replies; 30+ messages in thread
From: Oscar Mateo @ 2017-11-03 18:09 UTC (permalink / raw)
  To: intel-gfx; +Cc: Rodrigo Vivi

Move them to their rightful place inside intel_workarounds.c.

v2: Static tables

Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> (v1)
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
---
 drivers/gpu/drm/i915/intel_pm.c          | 23 +----------------------
 drivers/gpu/drm/i915/intel_workarounds.c |  8 ++++++++
 2 files changed, 9 insertions(+), 22 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index f712b02..a85a001 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -8466,25 +8466,6 @@ static void gen8_set_l3sqc_credits(struct drm_i915_private *dev_priv,
 	I915_WRITE(GEN7_MISCCPCTL, misccpctl);
 }
 
-static void cnp_init_clock_gating(struct drm_i915_private *dev_priv)
-{
-	if (!HAS_PCH_CNP(dev_priv))
-		return;
-
-	/* Wa #1181 */
-	I915_WRITE(SOUTH_DSPCLK_GATE_D, I915_READ(SOUTH_DSPCLK_GATE_D) |
-		   CNP_PWM_CGE_GATING_DISABLE);
-}
-
-static void cfl_init_clock_gating(struct drm_i915_private *dev_priv)
-{
-	cnp_init_clock_gating(dev_priv);
-
-	/* WaFbcNukeOnHostModify:cfl */
-	I915_WRITE(ILK_DPFC_CHICKEN, I915_READ(ILK_DPFC_CHICKEN) |
-		   ILK_DPFC_NUKE_ON_ANY_MODIFICATION);
-}
-
 static void kbl_init_clock_gating(struct drm_i915_private *dev_priv)
 {
 	/* WaDisableSDEUnitClockGating:kbl */
@@ -8961,10 +8942,8 @@ static void nop_init_clock_gating(struct drm_i915_private *dev_priv)
  */
 void intel_init_clock_gating_hooks(struct drm_i915_private *dev_priv)
 {
-	if (IS_CANNONLAKE(dev_priv))
+	if (IS_CANNONLAKE(dev_priv) || IS_COFFEELAKE(dev_priv))
 		dev_priv->display.init_clock_gating = nop_init_clock_gating;
-	else if (IS_COFFEELAKE(dev_priv))
-		dev_priv->display.init_clock_gating = cfl_init_clock_gating;
 	else if (IS_SKYLAKE(dev_priv))
 		dev_priv->display.init_clock_gating = skl_init_clock_gating;
 	else if (IS_KABYLAKE(dev_priv))
diff --git a/drivers/gpu/drm/i915/intel_workarounds.c b/drivers/gpu/drm/i915/intel_workarounds.c
index d5cbda1..4fe1dd0 100644
--- a/drivers/gpu/drm/i915/intel_workarounds.c
+++ b/drivers/gpu/drm/i915/intel_workarounds.c
@@ -938,6 +938,14 @@ static bool has_pch_cnp(struct drm_i915_private *dev_priv,
 }
 
 static struct i915_wa_reg cfl_disp_was[] = {
+	{ WA_DISP("Wa #1181"),
+	  ALL_REVS, REG(SOUTH_DSPCLK_GATE_D),
+	  SET_BIT(CNP_PWM_CGE_GATING_DISABLE),
+	  .pre_hook = has_pch_cnp },
+
+	{ WA_DISP("WaFbcNukeOnHostModify"),
+	  ALL_REVS, REG(ILK_DPFC_CHICKEN),
+	  SET_BIT(ILK_DPFC_NUKE_ON_ANY_MODIFICATION) },
 };
 
 static struct i915_wa_reg cnl_disp_was[] = {
-- 
1.9.1

* [RFC PATCH 14/20] drm/i915/glk: Move GT and Display workarounds from init_clock_gating
  2017-11-03 18:09 [RFC PATCH v5 00/20] Refactor HW workaround code Oscar Mateo
                   ` (12 preceding siblings ...)
  2017-11-03 18:09 ` [RFC PATCH 13/20] drm/i915/cfl: " Oscar Mateo
@ 2017-11-03 18:09 ` Oscar Mateo
  2017-11-03 18:09 ` [RFC PATCH 15/20] drm/i915/kbl: " Oscar Mateo
                   ` (8 subsequent siblings)
  22 siblings, 0 replies; 30+ messages in thread
From: Oscar Mateo @ 2017-11-03 18:09 UTC (permalink / raw)
  To: intel-gfx; +Cc: Rodrigo Vivi

Move them to their rightful place inside intel_workarounds.c.

v2: Static tables

Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> (v1)
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
---
 drivers/gpu/drm/i915/intel_pm.c          | 33 ++------------------------------
 drivers/gpu/drm/i915/intel_workarounds.c | 16 ++++++++++++++++
 2 files changed, 18 insertions(+), 31 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index a85a001..b5e7432 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -78,34 +78,6 @@ static void bxt_init_clock_gating(struct drm_i915_private *dev_priv)
 		   PWM1_GATING_DIS | PWM2_GATING_DIS);
 }
 
-static void glk_init_clock_gating(struct drm_i915_private *dev_priv)
-{
-	u32 val;
-
-	/*
-	 * WaDisablePWMClockGating:glk
-	 * Backlight PWM may stop in the asserted state, causing backlight
-	 * to stay fully on.
-	 */
-	I915_WRITE(GEN9_CLKGATE_DIS_0, I915_READ(GEN9_CLKGATE_DIS_0) |
-		   PWM1_GATING_DIS | PWM2_GATING_DIS);
-
-	/* WaDDIIOTimeout:glk */
-	if (IS_GLK_REVID(dev_priv, 0, GLK_REVID_A1)) {
-		u32 val = I915_READ(CHICKEN_MISC_2);
-		val &= ~(GLK_CL0_PWR_DOWN |
-			 GLK_CL1_PWR_DOWN |
-			 GLK_CL2_PWR_DOWN);
-		I915_WRITE(CHICKEN_MISC_2, val);
-	}
-
-	/* Display WA #1133: WaFbcSkipSegments:glk */
-	val = I915_READ(ILK_DPFC_CHICKEN);
-	val &= ~GLK_SKIP_SEG_COUNT_MASK;
-	val |= GLK_SKIP_SEG_EN | GLK_SKIP_SEG_COUNT(1);
-	I915_WRITE(ILK_DPFC_CHICKEN, val);
-}
-
 static void i915_pineview_get_mem_freq(struct drm_i915_private *dev_priv)
 {
 	u32 tmp;
@@ -8942,7 +8914,8 @@ static void nop_init_clock_gating(struct drm_i915_private *dev_priv)
  */
 void intel_init_clock_gating_hooks(struct drm_i915_private *dev_priv)
 {
-	if (IS_CANNONLAKE(dev_priv) || IS_COFFEELAKE(dev_priv))
+	if (IS_CANNONLAKE(dev_priv) || IS_COFFEELAKE(dev_priv) ||
+	    IS_GEMINILAKE(dev_priv))
 		dev_priv->display.init_clock_gating = nop_init_clock_gating;
 	else if (IS_SKYLAKE(dev_priv))
 		dev_priv->display.init_clock_gating = skl_init_clock_gating;
@@ -8950,8 +8923,6 @@ void intel_init_clock_gating_hooks(struct drm_i915_private *dev_priv)
 		dev_priv->display.init_clock_gating = kbl_init_clock_gating;
 	else if (IS_BROXTON(dev_priv))
 		dev_priv->display.init_clock_gating = bxt_init_clock_gating;
-	else if (IS_GEMINILAKE(dev_priv))
-		dev_priv->display.init_clock_gating = glk_init_clock_gating;
 	else if (IS_BROADWELL(dev_priv))
 		dev_priv->display.init_clock_gating = bdw_init_clock_gating;
 	else if (IS_CHERRYVIEW(dev_priv))
diff --git a/drivers/gpu/drm/i915/intel_workarounds.c b/drivers/gpu/drm/i915/intel_workarounds.c
index 4fe1dd0..a438ce3 100644
--- a/drivers/gpu/drm/i915/intel_workarounds.c
+++ b/drivers/gpu/drm/i915/intel_workarounds.c
@@ -929,6 +929,22 @@ void intel_gt_workarounds_apply(struct drm_i915_private *dev_priv)
 };
 
 static struct i915_wa_reg glk_disp_was[] = {
+	/*
+	 * Backlight PWM may stop in the asserted state, causing backlight
+	 * to stay fully on.
+	 */
+	{ WA_DISP("WaDisablePWMClockGating"),
+	  ALL_REVS, REG(GEN9_CLKGATE_DIS_0),
+	  SET_BIT(PWM1_GATING_DIS | PWM2_GATING_DIS) },
+
+	{ WA_DISP("WaDDIIOTimeout"),
+	  REVS(0, GLK_REVID_A1), REG(CHICKEN_MISC_2),
+	  CLEAR_BIT(GLK_CL0_PWR_DOWN | GLK_CL1_PWR_DOWN | GLK_CL2_PWR_DOWN) },
+
+	{ WA_DISP("Display WA #1133: WaFbcSkipSegments"),
+	  ALL_REVS, REG(ILK_DPFC_CHICKEN),
+	  SET_FIELD(GLK_SKIP_SEG_COUNT_MASK,
+		    GLK_SKIP_SEG_EN | GLK_SKIP_SEG_COUNT(1)) },
 };
 
 static bool has_pch_cnp(struct drm_i915_private *dev_priv,
-- 
1.9.1

* [RFC PATCH 15/20] drm/i915/kbl: Move GT and Display workarounds from init_clock_gating
  2017-11-03 18:09 [RFC PATCH v5 00/20] Refactor HW workaround code Oscar Mateo
                   ` (13 preceding siblings ...)
  2017-11-03 18:09 ` [RFC PATCH 14/20] drm/i915/glk: " Oscar Mateo
@ 2017-11-03 18:09 ` Oscar Mateo
  2017-11-03 18:09 ` [RFC PATCH 16/20] drm/i915/bxt: " Oscar Mateo
                   ` (7 subsequent siblings)
  22 siblings, 0 replies; 30+ messages in thread
From: Oscar Mateo @ 2017-11-03 18:09 UTC (permalink / raw)
  To: intel-gfx; +Cc: Rodrigo Vivi

Move them to their rightful place inside intel_workarounds.c.

v2: Classify WaDisableSDEUnitClockGating and WaDisableGamClockGating
as GT WAs

v3: Static tables (Joonas)

Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> (v1)
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
---
 drivers/gpu/drm/i915/intel_pm.c          | 21 +--------------------
 drivers/gpu/drm/i915/intel_workarounds.c | 11 +++++++++++
 2 files changed, 12 insertions(+), 20 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index b5e7432..046553b 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -8438,23 +8438,6 @@ static void gen8_set_l3sqc_credits(struct drm_i915_private *dev_priv,
 	I915_WRITE(GEN7_MISCCPCTL, misccpctl);
 }
 
-static void kbl_init_clock_gating(struct drm_i915_private *dev_priv)
-{
-	/* WaDisableSDEUnitClockGating:kbl */
-	if (IS_KBL_REVID(dev_priv, 0, KBL_REVID_B0))
-		I915_WRITE(GEN8_UCGCTL6, I915_READ(GEN8_UCGCTL6) |
-			   GEN8_SDEUNIT_CLOCK_GATE_DISABLE);
-
-	/* WaDisableGamClockGating:kbl */
-	if (IS_KBL_REVID(dev_priv, 0, KBL_REVID_B0))
-		I915_WRITE(GEN6_UCGCTL1, I915_READ(GEN6_UCGCTL1) |
-			   GEN6_GAMUNIT_CLOCK_GATE_DISABLE);
-
-	/* WaFbcNukeOnHostModify:kbl */
-	I915_WRITE(ILK_DPFC_CHICKEN, I915_READ(ILK_DPFC_CHICKEN) |
-		   ILK_DPFC_NUKE_ON_ANY_MODIFICATION);
-}
-
 static void skl_init_clock_gating(struct drm_i915_private *dev_priv)
 {
 	/* WAC6entrylatency:skl */
@@ -8915,12 +8898,10 @@ static void nop_init_clock_gating(struct drm_i915_private *dev_priv)
 void intel_init_clock_gating_hooks(struct drm_i915_private *dev_priv)
 {
 	if (IS_CANNONLAKE(dev_priv) || IS_COFFEELAKE(dev_priv) ||
-	    IS_GEMINILAKE(dev_priv))
+	    IS_GEMINILAKE(dev_priv) || IS_KABYLAKE(dev_priv))
 		dev_priv->display.init_clock_gating = nop_init_clock_gating;
 	else if (IS_SKYLAKE(dev_priv))
 		dev_priv->display.init_clock_gating = skl_init_clock_gating;
-	else if (IS_KABYLAKE(dev_priv))
-		dev_priv->display.init_clock_gating = kbl_init_clock_gating;
 	else if (IS_BROXTON(dev_priv))
 		dev_priv->display.init_clock_gating = bxt_init_clock_gating;
 	else if (IS_BROADWELL(dev_priv))
diff --git a/drivers/gpu/drm/i915/intel_workarounds.c b/drivers/gpu/drm/i915/intel_workarounds.c
index a438ce3..396399b 100644
--- a/drivers/gpu/drm/i915/intel_workarounds.c
+++ b/drivers/gpu/drm/i915/intel_workarounds.c
@@ -745,6 +745,14 @@ static uint mmio_workarounds_apply(struct drm_i915_private *dev_priv,
 	{ WA_GT("WaInPlaceDecompressionHang"),
 	  ALL_REVS, REG(GEN9_GAMT_ECO_REG_RW_IA),
 	  SET_BIT(GAMT_ECO_ENABLE_IN_PLACE_DECOMPRESS) },
+
+	{ WA_GT("WaDisableSDEUnitClockGating"),
+	  REVS(0, KBL_REVID_B0), REG(GEN8_UCGCTL6),
+	  SET_BIT(GEN8_SDEUNIT_CLOCK_GATE_DISABLE) },
+
+	{ WA_GT("WaDisableGamClockGating"),
+	  REVS(0, KBL_REVID_B0), REG(GEN6_UCGCTL1),
+	  SET_BIT(GEN6_GAMUNIT_CLOCK_GATE_DISABLE) },
 };
 
 static struct i915_wa_reg glk_gt_was[] = {
@@ -926,6 +934,9 @@ void intel_gt_workarounds_apply(struct drm_i915_private *dev_priv)
 };
 
 static struct i915_wa_reg kbl_disp_was[] = {
+	{ WA_DISP("WaFbcNukeOnHostModify"),
+	  ALL_REVS, REG(ILK_DPFC_CHICKEN),
+	  SET_BIT(ILK_DPFC_NUKE_ON_ANY_MODIFICATION) },
 };
 
 static struct i915_wa_reg glk_disp_was[] = {
-- 
1.9.1
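
The REVS(0, KBL_REVID_B0) entries above replace the IS_KBL_REVID() checks that
used to guard each register write. A small sketch of the revision gating this
implies, assuming an inclusive [since, until] range and an 0xff "forever"
sentinel; the struct, the macro names and rev_in_range() are hypothetical and
not taken from the series:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define WA_REVID_FOREVER 0xffu	/* assumed sentinel for "no upper bound" */

/* Hypothetical inclusive revision range attached to a table entry. */
struct rev_range {
	uint8_t since;
	uint8_t until;
};

#define WA_REVS(lo, hi)	{ (lo), (hi) }
#define WA_ALL_REVS	{ 0, WA_REVID_FOREVER }

/* True if the workaround applies to this stepping (both ends inclusive). */
static bool rev_in_range(uint8_t revid, struct rev_range r)
{
	assert(r.since <= r.until);
	return revid >= r.since && revid <= r.until;
}
```

With this shape, a pre-production workaround such as WA_REVS(0, 1) stops
applying from the next stepping onwards, while WA_ALL_REVS matches every
revision without a special case in the apply loop.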

* [RFC PATCH 16/20] drm/i915/bxt: Move GT and Display workarounds from init_clock_gating
  2017-11-03 18:09 [RFC PATCH v5 00/20] Refactor HW workaround code Oscar Mateo
                   ` (14 preceding siblings ...)
  2017-11-03 18:09 ` [RFC PATCH 15/20] drm/i915/kbl: " Oscar Mateo
@ 2017-11-03 18:09 ` Oscar Mateo
  2017-11-03 18:09 ` [RFC PATCH 17/20] drm/i915/skl: " Oscar Mateo
                   ` (6 subsequent siblings)
  22 siblings, 0 replies; 30+ messages in thread
From: Oscar Mateo @ 2017-11-03 18:09 UTC (permalink / raw)
  To: intel-gfx; +Cc: Rodrigo Vivi

Move them to their rightful place inside intel_workarounds.c.

v2: Classify WaDisableSDEUnitClockGating as GT WA
v3: Static tables (Joonas)

Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> (v1)
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
---
 drivers/gpu/drm/i915/intel_pm.c          | 26 ++------------------------
 drivers/gpu/drm/i915/intel_workarounds.c | 19 +++++++++++++++++++
 2 files changed, 21 insertions(+), 24 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index 046553b..98e976e 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -57,27 +57,6 @@
 #define INTEL_RC6p_ENABLE			(1<<1)
 #define INTEL_RC6pp_ENABLE			(1<<2)
 
-static void bxt_init_clock_gating(struct drm_i915_private *dev_priv)
-{
-	/* WaDisableSDEUnitClockGating:bxt */
-	I915_WRITE(GEN8_UCGCTL6, I915_READ(GEN8_UCGCTL6) |
-		   GEN8_SDEUNIT_CLOCK_GATE_DISABLE);
-
-	/*
-	 * FIXME:
-	 * GEN8_HDCUNIT_CLOCK_GATE_DISABLE_HDCREQ applies on 3x6 GT SKUs only.
-	 */
-	I915_WRITE(GEN8_UCGCTL6, I915_READ(GEN8_UCGCTL6) |
-		   GEN8_HDCUNIT_CLOCK_GATE_DISABLE_HDCREQ);
-
-	/*
-	 * Wa: Backlight PWM may stop in the asserted state, causing backlight
-	 * to stay fully on.
-	 */
-	I915_WRITE(GEN9_CLKGATE_DIS_0, I915_READ(GEN9_CLKGATE_DIS_0) |
-		   PWM1_GATING_DIS | PWM2_GATING_DIS);
-}
-
 static void i915_pineview_get_mem_freq(struct drm_i915_private *dev_priv)
 {
 	u32 tmp;
@@ -8898,12 +8877,11 @@ static void nop_init_clock_gating(struct drm_i915_private *dev_priv)
 void intel_init_clock_gating_hooks(struct drm_i915_private *dev_priv)
 {
 	if (IS_CANNONLAKE(dev_priv) || IS_COFFEELAKE(dev_priv) ||
-	    IS_GEMINILAKE(dev_priv) || IS_KABYLAKE(dev_priv))
+	    IS_GEMINILAKE(dev_priv) || IS_KABYLAKE(dev_priv)   ||
+	    IS_BROXTON(dev_priv))
 		dev_priv->display.init_clock_gating = nop_init_clock_gating;
 	else if (IS_SKYLAKE(dev_priv))
 		dev_priv->display.init_clock_gating = skl_init_clock_gating;
-	else if (IS_BROXTON(dev_priv))
-		dev_priv->display.init_clock_gating = bxt_init_clock_gating;
 	else if (IS_BROADWELL(dev_priv))
 		dev_priv->display.init_clock_gating = bdw_init_clock_gating;
 	else if (IS_CHERRYVIEW(dev_priv))
diff --git a/drivers/gpu/drm/i915/intel_workarounds.c b/drivers/gpu/drm/i915/intel_workarounds.c
index 396399b..1ebe56d 100644
--- a/drivers/gpu/drm/i915/intel_workarounds.c
+++ b/drivers/gpu/drm/i915/intel_workarounds.c
@@ -723,6 +723,10 @@ static uint mmio_workarounds_apply(struct drm_i915_private *dev_priv,
 	{ WA_GT("WaInPlaceDecompressionHang"),
 	  REVS(BXT_REVID_C0, REVID_FOREVER), REG(GEN9_GAMT_ECO_REG_RW_IA),
 	  SET_BIT(GAMT_ECO_ENABLE_IN_PLACE_DECOMPRESS) },
+
+	{ WA_GT("WaDisableSDEUnitClockGating"),
+	  ALL_REVS, REG(GEN8_UCGCTL6),
+	  SET_BIT(GEN8_SDEUNIT_CLOCK_GATE_DISABLE) },
 };
 
 static struct i915_wa_reg kbl_gt_was[] = {
@@ -931,6 +935,21 @@ void intel_gt_workarounds_apply(struct drm_i915_private *dev_priv)
 };
 
 static struct i915_wa_reg bxt_disp_was[] = {
+	/*
+	 * FIXME:
+	 * GEN8_HDCUNIT_CLOCK_GATE_DISABLE_HDCREQ applies on 3x6 GT SKUs only.
+	 */
+	{ WA_DISP(""),
+	  ALL_REVS, REG(GEN8_UCGCTL6),
+	  SET_BIT(GEN8_HDCUNIT_CLOCK_GATE_DISABLE_HDCREQ) },
+
+	/*
+	 * Backlight PWM may stop in the asserted state, causing backlight
+	 * to stay fully on.
+	 */
+	{ WA_DISP(""),
+	  ALL_REVS, REG(GEN9_CLKGATE_DIS_0),
+	  SET_BIT(PWM1_GATING_DIS | PWM2_GATING_DIS) },
 };
 
 static struct i915_wa_reg kbl_disp_was[] = {
-- 
1.9.1

* [RFC PATCH 17/20] drm/i915/skl: Move GT and Display workarounds from init_clock_gating
  2017-11-03 18:09 [RFC PATCH v5 00/20] Refactor HW workaround code Oscar Mateo
                   ` (15 preceding siblings ...)
  2017-11-03 18:09 ` [RFC PATCH 16/20] drm/i915/bxt: " Oscar Mateo
@ 2017-11-03 18:09 ` Oscar Mateo
  2017-11-03 18:09 ` [RFC PATCH 18/20] drm/i915/chv: " Oscar Mateo
                   ` (5 subsequent siblings)
  22 siblings, 0 replies; 30+ messages in thread
From: Oscar Mateo @ 2017-11-03 18:09 UTC (permalink / raw)
  To: intel-gfx; +Cc: Rodrigo Vivi

Move them to their rightful place inside intel_workarounds.c.

v2: Static tables (Joonas)

Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> (v1)
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
---
 drivers/gpu/drm/i915/intel_pm.c          | 15 +--------------
 drivers/gpu/drm/i915/intel_workarounds.c |  8 ++++++++
 2 files changed, 9 insertions(+), 14 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index 98e976e..eb5bac0 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -8417,17 +8417,6 @@ static void gen8_set_l3sqc_credits(struct drm_i915_private *dev_priv,
 	I915_WRITE(GEN7_MISCCPCTL, misccpctl);
 }
 
-static void skl_init_clock_gating(struct drm_i915_private *dev_priv)
-{
-	/* WAC6entrylatency:skl */
-	I915_WRITE(FBC_LLC_READ_CTRL, I915_READ(FBC_LLC_READ_CTRL) |
-		   FBC_LLC_FULLY_OPEN);
-
-	/* WaFbcNukeOnHostModify:skl */
-	I915_WRITE(ILK_DPFC_CHICKEN, I915_READ(ILK_DPFC_CHICKEN) |
-		   ILK_DPFC_NUKE_ON_ANY_MODIFICATION);
-}
-
 static void bdw_init_clock_gating(struct drm_i915_private *dev_priv)
 {
 	/* The GTT cache must be disabled if the system is using 2M pages. */
@@ -8878,10 +8867,8 @@ void intel_init_clock_gating_hooks(struct drm_i915_private *dev_priv)
 {
 	if (IS_CANNONLAKE(dev_priv) || IS_COFFEELAKE(dev_priv) ||
 	    IS_GEMINILAKE(dev_priv) || IS_KABYLAKE(dev_priv)   ||
-	    IS_BROXTON(dev_priv))
+	    IS_BROXTON(dev_priv)    || IS_SKYLAKE(dev_priv))
 		dev_priv->display.init_clock_gating = nop_init_clock_gating;
-	else if (IS_SKYLAKE(dev_priv))
-		dev_priv->display.init_clock_gating = skl_init_clock_gating;
 	else if (IS_BROADWELL(dev_priv))
 		dev_priv->display.init_clock_gating = bdw_init_clock_gating;
 	else if (IS_CHERRYVIEW(dev_priv))
diff --git a/drivers/gpu/drm/i915/intel_workarounds.c b/drivers/gpu/drm/i915/intel_workarounds.c
index 1ebe56d..0e3f7c3 100644
--- a/drivers/gpu/drm/i915/intel_workarounds.c
+++ b/drivers/gpu/drm/i915/intel_workarounds.c
@@ -932,6 +932,14 @@ void intel_gt_workarounds_apply(struct drm_i915_private *dev_priv)
 	{ WA_DISP("WaDisableDopClockGating"),
 	  ALL_REVS, REG(GEN7_MISCCPCTL),
 	  CLEAR_BIT(GEN7_DOP_CLOCK_GATE_ENABLE) },
+
+	{ WA_DISP("WAC6entrylatency"),
+	  ALL_REVS, REG(FBC_LLC_READ_CTRL),
+	  SET_BIT(FBC_LLC_FULLY_OPEN) },
+
+	{ WA_DISP("WaFbcNukeOnHostModify"),
+	  ALL_REVS, REG(ILK_DPFC_CHICKEN),
+	  SET_BIT(ILK_DPFC_NUKE_ON_ANY_MODIFICATION) },
 };
 
 static struct i915_wa_reg bxt_disp_was[] = {
-- 
1.9.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


* [RFC PATCH 18/20] drm/i915/chv: Move GT and Display workarounds from init_clock_gating
  2017-11-03 18:09 [RFC PATCH v5 00/20] Refactor HW workaround code Oscar Mateo
                   ` (16 preceding siblings ...)
  2017-11-03 18:09 ` [RFC PATCH 17/20] drm/i915/skl: " Oscar Mateo
@ 2017-11-03 18:09 ` Oscar Mateo
  2017-11-03 18:09 ` [RFC PATCH 19/20] drm/i915/bdw: " Oscar Mateo
                   ` (4 subsequent siblings)
  22 siblings, 0 replies; 30+ messages in thread
From: Oscar Mateo @ 2017-11-03 18:09 UTC (permalink / raw)
  To: intel-gfx; +Cc: Rodrigo Vivi

Move these workarounds to their rightful place inside intel_workarounds.c.

v2: Classify WaDisableCSUnitClockGating and WaDisableSDEUnitClockGating
as GT WAs

v3:
  - Static tables (Joonas)
  - Also move WaProgramL3SqcReg1Default/WaTempDisableDOPClkGating

Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> (v1)
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
---
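Note: WaProgramL3SqcReg1Default needs DOP clock gating disabled around
the register write, which this patch expresses with a pre_hook and a
post_hook that share state through hook_data. The following is a
minimal user-space sketch of that flow under stated assumptions:
wa_entry, fake_regs, wa_apply_one, the DEMO_* constants and the
"skip the entry when pre_hook returns false" rule are hypothetical,
and the delay required before re-enabling clock gating is left out.

```c
#include <stdbool.h>
#include <stdint.h>

struct wa_entry;

typedef bool (*wa_pre_hook)(struct wa_entry *wa);
typedef void (*wa_post_hook)(struct wa_entry *wa);

/* Hypothetical, simplified stand-in for struct i915_wa_reg. */
struct wa_entry {
	uint32_t offset;	/* index into fake_regs */
	uint32_t mask;
	uint32_t value;
	uint32_t hook_data;	/* scratch space shared by both hooks */
	wa_pre_hook pre_hook;
	wa_post_hook post_hook;
};

enum { DEMO_MISCCPCTL = 0, DEMO_L3SQCREG1 = 1 };

#define DEMO_DOP_GATE_ENABLE	(1u << 0)	/* illustrative bit */
#define DEMO_CREDITS_MASK	0x00ff0000u	/* illustrative field */
#define DEMO_CREDITS_VALUE	0x00260000u	/* illustrative value */

static uint32_t fake_regs[8];

/* Save the gating register, then clear the gate-enable bit. */
static bool demo_disable_gating(struct wa_entry *wa)
{
	wa->hook_data = fake_regs[DEMO_MISCCPCTL];
	fake_regs[DEMO_MISCCPCTL] = wa->hook_data & ~DEMO_DOP_GATE_ENABLE;
	return true;
}

/* Restore whatever the pre hook saved. */
static void demo_restore_gating(struct wa_entry *wa)
{
	fake_regs[DEMO_MISCCPCTL] = wa->hook_data;
}

/* Assumed semantics: a pre hook returning false skips the entry. */
static bool wa_apply_one(struct wa_entry *wa)
{
	uint32_t val;

	if (wa->pre_hook && !wa->pre_hook(wa))
		return false;

	val = fake_regs[wa->offset];
	val = (val & ~wa->mask) | (wa->value & wa->mask);
	fake_regs[wa->offset] = val;

	if (wa->post_hook)
		wa->post_hook(wa);

	return true;
}

static struct wa_entry demo_l3sqc_wa = {
	.offset = DEMO_L3SQCREG1,
	.mask = DEMO_CREDITS_MASK,
	.value = DEMO_CREDITS_VALUE,
	.pre_hook = demo_disable_gating,
	.post_hook = demo_restore_gating,
};
```

Because the hooks write to hook_data, an entry of this kind cannot
live in a const table, which is one of the limitations mentioned in
the cover letter.
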
 drivers/gpu/drm/i915/intel_pm.c          | 39 +-------------------
 drivers/gpu/drm/i915/intel_workarounds.c | 63 +++++++++++++++++++++++++++++---
 2 files changed, 59 insertions(+), 43 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index eb5bac0..aef0aee 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -8705,40 +8705,6 @@ static void vlv_init_clock_gating(struct drm_i915_private *dev_priv)
 	I915_WRITE(VLV_GUNIT_CLOCK_GATE, GCFG_DIS);
 }
 
-static void chv_init_clock_gating(struct drm_i915_private *dev_priv)
-{
-	/* WaVSRefCountFullforceMissDisable:chv */
-	/* WaDSRefCountFullforceMissDisable:chv */
-	I915_WRITE(GEN7_FF_THREAD_MODE,
-		   I915_READ(GEN7_FF_THREAD_MODE) &
-		   ~(GEN8_FF_DS_REF_CNT_FFME | GEN7_FF_VS_REF_CNT_FFME));
-
-	/* WaDisableSemaphoreAndSyncFlipWait:chv */
-	I915_WRITE(GEN6_RC_SLEEP_PSMI_CONTROL,
-		   _MASKED_BIT_ENABLE(GEN8_RC_SEMA_IDLE_MSG_DISABLE));
-
-	/* WaDisableCSUnitClockGating:chv */
-	I915_WRITE(GEN6_UCGCTL1, I915_READ(GEN6_UCGCTL1) |
-		   GEN6_CSUNIT_CLOCK_GATE_DISABLE);
-
-	/* WaDisableSDEUnitClockGating:chv */
-	I915_WRITE(GEN8_UCGCTL6, I915_READ(GEN8_UCGCTL6) |
-		   GEN8_SDEUNIT_CLOCK_GATE_DISABLE);
-
-	/*
-	 * WaProgramL3SqcReg1Default:chv
-	 * See gfxspecs/Related Documents/Performance Guide/
-	 * LSQC Setting Recommendations.
-	 */
-	gen8_set_l3sqc_credits(dev_priv, 38, 2);
-
-	/*
-	 * GTT cache may not work with big pages, so if those
-	 * are ever enabled GTT cache may need to be disabled.
-	 */
-	I915_WRITE(HSW_GTT_CACHE_EN, GTT_CACHE_EN_ALL);
-}
-
 static void g4x_init_clock_gating(struct drm_i915_private *dev_priv)
 {
 	uint32_t dspclk_gate;
@@ -8867,12 +8833,11 @@ void intel_init_clock_gating_hooks(struct drm_i915_private *dev_priv)
 {
 	if (IS_CANNONLAKE(dev_priv) || IS_COFFEELAKE(dev_priv) ||
 	    IS_GEMINILAKE(dev_priv) || IS_KABYLAKE(dev_priv)   ||
-	    IS_BROXTON(dev_priv)    || IS_SKYLAKE(dev_priv))
+	    IS_BROXTON(dev_priv)    || IS_SKYLAKE(dev_priv)    ||
+	    IS_CHERRYVIEW(dev_priv))
 		dev_priv->display.init_clock_gating = nop_init_clock_gating;
 	else if (IS_BROADWELL(dev_priv))
 		dev_priv->display.init_clock_gating = bdw_init_clock_gating;
-	else if (IS_CHERRYVIEW(dev_priv))
-		dev_priv->display.init_clock_gating = chv_init_clock_gating;
 	else if (IS_HASWELL(dev_priv))
 		dev_priv->display.init_clock_gating = hsw_init_clock_gating;
 	else if (IS_IVYBRIDGE(dev_priv))
diff --git a/drivers/gpu/drm/i915/intel_workarounds.c b/drivers/gpu/drm/i915/intel_workarounds.c
index 0e3f7c3..1ebce4f 100644
--- a/drivers/gpu/drm/i915/intel_workarounds.c
+++ b/drivers/gpu/drm/i915/intel_workarounds.c
@@ -646,10 +646,67 @@ static uint mmio_workarounds_apply(struct drm_i915_private *dev_priv,
 static struct i915_wa_reg gen8_gt_was[] = {
 };
 
+/* WaTempDisableDOPClkGating */
+static bool disable_dop_clock_gating(struct drm_i915_private *dev_priv,
+                        	     struct i915_wa_reg *wa)
+{
+	u32 misccpctl = I915_READ(GEN7_MISCCPCTL);
+
+	wa->hook_data = misccpctl;
+	I915_WRITE(GEN7_MISCCPCTL, misccpctl & ~GEN7_DOP_CLOCK_GATE_ENABLE);
+
+	return true;
+}
+
+/* WaTempDisableDOPClkGating */
+static void enable_dop_clock_gating(struct drm_i915_private *dev_priv,
+                		    struct i915_wa_reg *wa)
+{
+	u32 misccpctl = wa->hook_data;
+
+	/*
+	 * Wait at least 100 clocks before re-enabling clock
+	 * gating. See the definition of L3SQCREG1 in BSpec.
+	 */
+	POSTING_READ(GEN8_L3SQCREG1);
+	udelay(1);
+	I915_WRITE(GEN7_MISCCPCTL, misccpctl);
+}
+
 static struct i915_wa_reg bdw_gt_was[] = {
 };
 
 static struct i915_wa_reg chv_gt_was[] = {
+	{ WA_GT("WaVSRefCountFullforceMissDisable + WaDSRefCountFullforceMissDisable"),
+	  ALL_REVS, REG(GEN7_FF_THREAD_MODE),
+	  CLEAR_BIT(GEN8_FF_DS_REF_CNT_FFME | GEN7_FF_VS_REF_CNT_FFME) },
+
+	{ WA_GT("WaDisableSemaphoreAndSyncFlipWait"),
+	  ALL_REVS, REG(GEN6_RC_SLEEP_PSMI_CONTROL),
+	  SET_BIT_MASKED(GEN8_RC_SEMA_IDLE_MSG_DISABLE) },
+
+	{ WA_GT("WaDisableCSUnitClockGating"),
+	  ALL_REVS, REG(GEN6_UCGCTL1),
+	  SET_BIT(GEN6_CSUNIT_CLOCK_GATE_DISABLE) },
+
+	{ WA_GT("WaDisableSDEUnitClockGating"),
+	  ALL_REVS, REG(GEN8_UCGCTL6),
+	  SET_BIT(GEN8_SDEUNIT_CLOCK_GATE_DISABLE) },
+
+	{ WA_GT("WaProgramL3SqcReg1Default"),
+	  ALL_REVS, REG(GEN8_L3SQCREG1),
+	  SET_FIELD(L3_PRIO_CREDITS_MASK,
+		    L3_GENERAL_PRIO_CREDITS(38) | L3_HIGH_PRIO_CREDITS(2)),
+	  .pre_hook = disable_dop_clock_gating,
+	  .post_hook = enable_dop_clock_gating },
+
+	/*
+	 * GTT cache may not work with big pages, so if those
+	 * are ever enabled GTT cache may need to be disabled.
+	 */
+	{ WA_GT(""),
+	  ALL_REVS, REG(HSW_GTT_CACHE_EN),
+	  SET_FIELD(0xFFFFFFFF, GTT_CACHE_EN_ALL) },
 };
 
 static struct i915_wa_reg gen9_gt_was[] = {
@@ -808,7 +865,6 @@ static uint mmio_workarounds_apply(struct drm_i915_private *dev_priv,
 };
 
 static const struct i915_wa_reg_table chv_gt_wa_tbl[] = {
-	{ gen8_gt_was, ARRAY_SIZE(gen8_gt_was) },
 	{ chv_gt_was,  ARRAY_SIZE(chv_gt_was) },
 };
 
@@ -897,9 +953,6 @@ void intel_gt_workarounds_apply(struct drm_i915_private *dev_priv)
 static struct i915_wa_reg bdw_disp_was[] = {
 };
 
-static struct i915_wa_reg chv_disp_was[] = {
-};
-
 static struct i915_wa_reg gen9_disp_was[] = {
 	/*
 	 * Must match Sampler, Pixel Back End, and Media. See
@@ -1028,8 +1081,6 @@ static bool has_pch_cnp(struct drm_i915_private *dev_priv,
 };
 
 static const struct i915_wa_reg_table chv_disp_wa_tbl[] = {
-	{ gen8_disp_was, ARRAY_SIZE(gen8_disp_was) },
-	{ chv_disp_was,  ARRAY_SIZE(chv_disp_was) },
 };
 
 static const struct i915_wa_reg_table skl_disp_wa_tbl[] = {
-- 
1.9.1


* [RFC PATCH 19/20] drm/i915/bdw: Move GT and Display workarounds from init_clock_gating
  2017-11-03 18:09 [RFC PATCH v5 00/20] Refactor HW workaround code Oscar Mateo
                   ` (17 preceding siblings ...)
  2017-11-03 18:09 ` [RFC PATCH 18/20] drm/i915/chv: " Oscar Mateo
@ 2017-11-03 18:09 ` Oscar Mateo
  2017-11-03 18:09 ` [RFC PATCH 20/20] drm/i915: Document the i915_workarounds file Oscar Mateo
                   ` (3 subsequent siblings)
  22 siblings, 0 replies; 30+ messages in thread
From: Oscar Mateo @ 2017-11-03 18:09 UTC (permalink / raw)
  To: intel-gfx; +Cc: Paulo Zanoni, Rodrigo Vivi

Move these workarounds to their rightful place inside intel_workarounds.c.

TODO2: Decide what to do with lpt_init_clock_gating (shouldn't
WADPOClockGatingDisable be marked as "bdw"? shouldn't it be
protected by HAS_PCH_LPT_LP? do we want to move the whole thing
to the workarounds file or not?).

v2: Classify WaDisableSDEUnitClockGating as GT WA
v3:
  - Static tables (Joonas)
  - Also move WaProgramL3SqcReg1Default/WaTempDisableDOPClkGating

Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
---
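Note: this patch also uses pre_hook in two other ways: as a predicate
(has_pipe, parameterised through hook_data) and as a way to compute
the value at apply time (use_gtt_cache). The following is a minimal
user-space sketch of both uses. The types device_info and wa_entry,
the helper wa_should_apply and the constant DEMO_CACHE_EN_ALL are
hypothetical names for illustration, and the assumption that a false
return skips the entry is inferred from has_pipe rather than shown in
this patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical device description used instead of drm_i915_private. */
struct device_info {
	unsigned int num_pipes;
	bool has_2m_pages;
};

/* Hypothetical, simplified stand-in for struct i915_wa_reg. */
struct wa_entry {
	uint32_t value;
	uint32_t hook_data;
	bool (*pre_hook)(const struct device_info *info, struct wa_entry *wa);
};

#define DEMO_CACHE_EN_ALL	0x00ff00ffu	/* illustrative value */

/* Predicate hook: apply only if the pipe stored in hook_data exists. */
static bool demo_has_pipe(const struct device_info *info, struct wa_entry *wa)
{
	return info->num_pipes > wa->hook_data;
}

/* Value hook: pick the register value at apply time, always apply. */
static bool demo_use_gtt_cache(const struct device_info *info,
			       struct wa_entry *wa)
{
	wa->value = info->has_2m_pages ? 0 : DEMO_CACHE_EN_ALL;
	return true;
}

/* Entries without a pre hook are applied unconditionally. */
static bool wa_should_apply(const struct device_info *info,
			    struct wa_entry *wa)
{
	return !wa->pre_hook || wa->pre_hook(info, wa);
}
```

One table entry per pipe keeps the table static, at the cost of
listing pipes A, B and C explicitly instead of looping with
for_each_pipe().
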
 drivers/gpu/drm/i915/intel_pm.c          | 76 ------------------------------
 drivers/gpu/drm/i915/intel_workarounds.c | 81 +++++++++++++++++++++++++++++---
 2 files changed, 74 insertions(+), 83 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index aef0aee..0fc0670 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -8391,87 +8391,11 @@ static void lpt_suspend_hw(struct drm_i915_private *dev_priv)
 	}
 }
 
-static void gen8_set_l3sqc_credits(struct drm_i915_private *dev_priv,
-				   int general_prio_credits,
-				   int high_prio_credits)
-{
-	u32 misccpctl;
-	u32 val;
-
-	/* WaTempDisableDOPClkGating:bdw */
-	misccpctl = I915_READ(GEN7_MISCCPCTL);
-	I915_WRITE(GEN7_MISCCPCTL, misccpctl & ~GEN7_DOP_CLOCK_GATE_ENABLE);
-
-	val = I915_READ(GEN8_L3SQCREG1);
-	val &= ~L3_PRIO_CREDITS_MASK;
-	val |= L3_GENERAL_PRIO_CREDITS(general_prio_credits);
-	val |= L3_HIGH_PRIO_CREDITS(high_prio_credits);
-	I915_WRITE(GEN8_L3SQCREG1, val);
-
-	/*
-	 * Wait at least 100 clocks before re-enabling clock gating.
-	 * See the definition of L3SQCREG1 in BSpec.
-	 */
-	POSTING_READ(GEN8_L3SQCREG1);
-	udelay(1);
-	I915_WRITE(GEN7_MISCCPCTL, misccpctl);
-}
-
 static void bdw_init_clock_gating(struct drm_i915_private *dev_priv)
 {
-	/* The GTT cache must be disabled if the system is using 2M pages. */
-	bool can_use_gtt_cache = !HAS_PAGE_SIZES(dev_priv,
-						 I915_GTT_PAGE_SIZE_2M);
-	enum pipe pipe;
-
 	ilk_init_lp_watermarks(dev_priv);
 
-	/* WaSwitchSolVfFArbitrationPriority:bdw */
-	I915_WRITE(GAM_ECOCHK, I915_READ(GAM_ECOCHK) | HSW_ECOCHK_ARB_PRIO_SOL);
-
-	/* WaPsrDPAMaskVBlankInSRD:bdw */
-	I915_WRITE(CHICKEN_PAR1_1,
-		   I915_READ(CHICKEN_PAR1_1) | DPA_MASK_VBLANK_SRD);
-
-	/* WaPsrDPRSUnmaskVBlankInSRD:bdw */
-	for_each_pipe(dev_priv, pipe) {
-		I915_WRITE(CHICKEN_PIPESL_1(pipe),
-			   I915_READ(CHICKEN_PIPESL_1(pipe)) |
-			   BDW_DPRS_MASK_VBLANK_SRD);
-	}
-
-	/* WaVSRefCountFullforceMissDisable:bdw */
-	/* WaDSRefCountFullforceMissDisable:bdw */
-	I915_WRITE(GEN7_FF_THREAD_MODE,
-		   I915_READ(GEN7_FF_THREAD_MODE) &
-		   ~(GEN8_FF_DS_REF_CNT_FFME | GEN7_FF_VS_REF_CNT_FFME));
-
-	I915_WRITE(GEN6_RC_SLEEP_PSMI_CONTROL,
-		   _MASKED_BIT_ENABLE(GEN8_RC_SEMA_IDLE_MSG_DISABLE));
-
-	/* WaDisableSDEUnitClockGating:bdw */
-	I915_WRITE(GEN8_UCGCTL6, I915_READ(GEN8_UCGCTL6) |
-		   GEN8_SDEUNIT_CLOCK_GATE_DISABLE);
-
-	/* WaProgramL3SqcReg1Default:bdw */
-	gen8_set_l3sqc_credits(dev_priv, 30, 2);
-
-	/* WaGttCachingOffByDefault:bdw */
-	I915_WRITE(HSW_GTT_CACHE_EN, can_use_gtt_cache ? GTT_CACHE_EN_ALL : 0);
-
-	/* WaKVMNotificationOnConfigChange:bdw */
-	I915_WRITE(CHICKEN_PAR2_1, I915_READ(CHICKEN_PAR2_1)
-		   | KVM_CONFIG_CHANGE_NOTIFICATION_SELECT);
-
 	lpt_init_clock_gating(dev_priv);
-
-	/* WaDisableDopClockGating:bdw
-	 *
-	 * Also see the CHICKEN2 write in bdw_init_workarounds() to disable DOP
-	 * clock gating.
-	 */
-	I915_WRITE(GEN6_UCGCTL1,
-		   I915_READ(GEN6_UCGCTL1) | GEN6_EU_TCUNIT_CLOCK_GATE_DISABLE);
 }
 
 static void hsw_init_clock_gating(struct drm_i915_private *dev_priv)
diff --git a/drivers/gpu/drm/i915/intel_workarounds.c b/drivers/gpu/drm/i915/intel_workarounds.c
index 1ebce4f..a8fe655 100644
--- a/drivers/gpu/drm/i915/intel_workarounds.c
+++ b/drivers/gpu/drm/i915/intel_workarounds.c
@@ -643,9 +643,6 @@ static uint mmio_workarounds_apply(struct drm_i915_private *dev_priv,
 	return total_count;
 }
 
-static struct i915_wa_reg gen8_gt_was[] = {
-};
-
 /* WaTempDisableDOPClkGating */
 static bool disable_dop_clock_gating(struct drm_i915_private *dev_priv,
                         	     struct i915_wa_reg *wa)
@@ -673,7 +670,45 @@ static void enable_dop_clock_gating(struct drm_i915_private *dev_priv,
 	I915_WRITE(GEN7_MISCCPCTL, misccpctl);
 }
 
+static bool use_gtt_cache(struct drm_i915_private *dev_priv,
+			  struct i915_wa_reg *wa)
+{
+	/* The GTT cache must be disabled if the system is using 2M pages. */
+	bool can_use_gtt_cache = !HAS_PAGE_SIZES(dev_priv, I915_GTT_PAGE_SIZE_2M);
+
+	wa->value = can_use_gtt_cache ? GTT_CACHE_EN_ALL : 0;
+
+	return true;
+}
+
 static struct i915_wa_reg bdw_gt_was[] = {
+	{ WA_GT("WaSwitchSolVfFArbitrationPriority"),
+	  ALL_REVS, REG(GAM_ECOCHK),
+	  SET_BIT(HSW_ECOCHK_ARB_PRIO_SOL) },
+
+	{ WA_GT("WaVSRefCountFullforceMissDisable + WaDSRefCountFullforceMissDisable"),
+	  ALL_REVS, REG(GEN7_FF_THREAD_MODE),
+	  CLEAR_BIT(GEN8_FF_DS_REF_CNT_FFME | GEN7_FF_VS_REF_CNT_FFME) },
+
+	{ WA_GT(""),
+	  ALL_REVS, REG(GEN6_RC_SLEEP_PSMI_CONTROL),
+	  SET_BIT_MASKED(GEN8_RC_SEMA_IDLE_MSG_DISABLE) },
+
+	{ WA_GT("WaDisableSDEUnitClockGating"),
+	  ALL_REVS, REG(GEN8_UCGCTL6),
+	  SET_BIT(GEN8_SDEUNIT_CLOCK_GATE_DISABLE) },
+
+	{ WA_GT("WaProgramL3SqcReg1Default"),
+	  ALL_REVS, REG(GEN8_L3SQCREG1),
+	  SET_FIELD(L3_PRIO_CREDITS_MASK,
+		    L3_GENERAL_PRIO_CREDITS(30) | L3_HIGH_PRIO_CREDITS(2)),
+	  .pre_hook = disable_dop_clock_gating,
+	  .post_hook = enable_dop_clock_gating },
+
+	{ WA_GT("WaGttCachingOffByDefault"),
+	  ALL_REVS, REG(HSW_GTT_CACHE_EN),
+	  SET_FIELD(0xFFFFFFFF, 0x0),
+	  .pre_hook = use_gtt_cache },
 };
 
 static struct i915_wa_reg chv_gt_was[] = {
@@ -860,7 +895,6 @@ static void enable_dop_clock_gating(struct drm_i915_private *dev_priv,
 };
 
 static const struct i915_wa_reg_table bdw_gt_wa_tbl[] = {
-	{ gen8_gt_was, ARRAY_SIZE(gen8_gt_was) },
 	{ bdw_gt_was,  ARRAY_SIZE(bdw_gt_was) },
 };
 
@@ -947,10 +981,44 @@ void intel_gt_workarounds_apply(struct drm_i915_private *dev_priv)
 	DRM_DEBUG_DRIVER("Number of GT specific w/a: %u\n", total_count);
 }
 
-static struct i915_wa_reg gen8_disp_was[] = {
-};
+static bool has_pipe(struct drm_i915_private *dev_priv, struct i915_wa_reg *wa)
+{
+	enum pipe pipe = wa->hook_data;
+
+	return (INTEL_INFO(dev_priv)->num_pipes > pipe);
+}
 
 static struct i915_wa_reg bdw_disp_was[] = {
+	{ WA_DISP("WaPsrDPAMaskVBlankInSRD"),
+	  ALL_REVS, REG(CHICKEN_PAR1_1),
+	  SET_BIT(DPA_MASK_VBLANK_SRD) },
+
+	{ WA_DISP("WaPsrDPRSUnmaskVBlankInSRD (pipe A)"),
+	  ALL_REVS, REG(CHICKEN_PIPESL_1(PIPE_A)),
+	  SET_BIT(BDW_DPRS_MASK_VBLANK_SRD),
+	  .hook_data = PIPE_A, .pre_hook = has_pipe },
+
+	{ WA_DISP("WaPsrDPRSUnmaskVBlankInSRD (pipe B)"),
+	  ALL_REVS, REG(CHICKEN_PIPESL_1(PIPE_B)),
+	  SET_BIT(BDW_DPRS_MASK_VBLANK_SRD),
+	  .hook_data = PIPE_B, .pre_hook = has_pipe },
+
+	{ WA_DISP("WaPsrDPRSUnmaskVBlankInSRD (pipe C)"),
+	  ALL_REVS, REG(CHICKEN_PIPESL_1(PIPE_C)),
+	  SET_BIT(BDW_DPRS_MASK_VBLANK_SRD),
+	  .hook_data = PIPE_C, .pre_hook = has_pipe },
+
+	{ WA_DISP("WaKVMNotificationOnConfigChange"),
+	  ALL_REVS, REG(CHICKEN_PAR2_1),
+	  SET_BIT(KVM_CONFIG_CHANGE_NOTIFICATION_SELECT) },
+
+	/*
+	 * Also see the CHICKEN2 write in bdw_gt_was to disable DOP
+	 * clock gating.
+	 */
+	{ WA_DISP("WaDisableDopClockGating"),
+	  ALL_REVS, REG(GEN6_UCGCTL1),
+	  SET_BIT(GEN6_EU_TCUNIT_CLOCK_GATE_DISABLE) },
 };
 
 static struct i915_wa_reg gen9_disp_was[] = {
@@ -1076,7 +1144,6 @@ static bool has_pch_cnp(struct drm_i915_private *dev_priv,
 };
 
 static const struct i915_wa_reg_table bdw_disp_wa_tbl[] = {
-	{ gen8_disp_was, ARRAY_SIZE(gen8_disp_was) },
 	{ bdw_disp_was,  ARRAY_SIZE(bdw_disp_was) },
 };
 
-- 
1.9.1


* [RFC PATCH 20/20] drm/i915: Document the i915_workarounds file
  2017-11-03 18:09 [RFC PATCH v5 00/20] Refactor HW workaround code Oscar Mateo
                   ` (18 preceding siblings ...)
  2017-11-03 18:09 ` [RFC PATCH 19/20] drm/i915/bdw: " Oscar Mateo
@ 2017-11-03 18:09 ` Oscar Mateo
  2017-11-03 18:37 ` ✗ Fi.CI.BAT: failure for Refactor HW workaround code (rev5) Patchwork
                   ` (2 subsequent siblings)
  22 siblings, 0 replies; 30+ messages in thread
From: Oscar Mateo @ 2017-11-03 18:09 UTC (permalink / raw)
  To: intel-gfx

Does what it says on the tin (plus a few fixes in some old comments).

v2: Include display WAs as a separate category.
v3: Rebased

Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
---
 drivers/gpu/drm/i915/intel_pm.c          |  4 +---
 drivers/gpu/drm/i915/intel_workarounds.c | 40 ++++++++++++++++++++++++++++----
 2 files changed, 37 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index 0fc0670..98c2ac8 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -8749,9 +8749,7 @@ static void nop_init_clock_gating(struct drm_i915_private *dev_priv)
  * @dev_priv: device private
  *
  * Setup the hooks that configure which clocks of a given platform can be
- * gated and also apply various GT and display specific workarounds for these
- * platforms. Note that some GT specific workarounds are applied separately
- * when GPU contexts or batchbuffers start their execution.
+ * gated.
  */
 void intel_init_clock_gating_hooks(struct drm_i915_private *dev_priv)
 {
diff --git a/drivers/gpu/drm/i915/intel_workarounds.c b/drivers/gpu/drm/i915/intel_workarounds.c
index a8fe655..005cff7 100644
--- a/drivers/gpu/drm/i915/intel_workarounds.c
+++ b/drivers/gpu/drm/i915/intel_workarounds.c
@@ -25,6 +25,37 @@
 #include "i915_drv.h"
 #include "intel_workarounds.h"
 
+/**
+ * DOC: Hardware workarounds
+ *
+ * This file is a central place to implement most* of the workarounds
+ * required for HW to work as originally intended. They fall in five basic
+ * categories depending on how/when they are applied:
+ *
+ * - Workarounds that touch registers that are saved/restored to/from the HW
+ *   context image. The list is emitted (via Load Register Immediate commands)
+ *   every time a new context is created.
+ * - GT workarounds. The list of these WAs is applied whenever these registers
+ *   revert to default values (on GPU reset, suspend/resume**, etc.).
+ * - Display workarounds. The list is applied during display clock-gating
+ *   initialization.
+ * - Workarounds that whitelist a privileged register, so that UMDs can manage
+ *   them directly. This is just a special case of a MMIO workaround (as we
+ *   write the list of these to-be-whitelisted registers to some special HW
+ *   registers).
+ * - Workaround batchbuffers, that get executed automatically by the hardware
+ *   on every HW context restore.
+ *
+ * * Please note that there are other WAs that, due to their nature, cannot be
+ *   applied from a central place. Those are peppered around the rest of the
+ *   code, as needed.
+ *
+ * ** Technically, some registers are powercontext saved & restored, so they
+ *    survive a suspend/resume. In practice, writing them again is not too
+ *    costly and simplifies things. We can revisit this in the future.
+ *
+ */
+
 #define WA_CTX(wa)			\
 	.name = (wa),			\
 	.type = I915_WA_TYPE_CONTEXT
@@ -1382,10 +1413,11 @@ int intel_whitelist_workarounds_apply(struct intel_engine_cs *engine)
  * but there is a slight complication as this is applied in WA batch where the
  * values are only initialized once so we cannot take register value at the
  * beginning and reuse it further; hence we save its value to memory, upload a
- * constant value with bit21 set and then we restore it back with the saved value.
+ * constant value with bit21 set and then we restore it back with the saved
+ * value.
  * To simplify the WA, a constant value is formed by using the default value
  * of this register. This shouldn't be a problem because we are only modifying
- * it for a short period and this batch in non-premptible. We can ofcourse
+ * it for a short period and this batch is non-preemptible. We can of course
  * use additional instructions that read the actual value of the register
  * at that time and set our bit of interest but it makes the WA complicated.
  *
@@ -1421,8 +1453,8 @@ int intel_whitelist_workarounds_apply(struct intel_engine_cs *engine)
  * Typically we only have one indirect_ctx and per_ctx batch buffer which are
  * initialized at the beginning and shared across all contexts but this field
  * helps us to have multiple batches at different offsets and select them based
- * on a criteria. At the moment this batch always start at the beginning of the page
- * and at this point we don't have multiple wa_ctx batch buffers.
+ * on a criterion. At the moment this batch always starts at the beginning of the
+ * page and at this point we don't have multiple wa_ctx batch buffers.
  *
  * The number of WA applied are not known at the beginning; we use this field
  * to return the no of DWORDS written.
-- 
1.9.1


* ✗ Fi.CI.BAT: failure for Refactor HW workaround code (rev5)
  2017-11-03 18:09 [RFC PATCH v5 00/20] Refactor HW workaround code Oscar Mateo
                   ` (19 preceding siblings ...)
  2017-11-03 18:09 ` [RFC PATCH 20/20] drm/i915: Document the i915_workarounds file Oscar Mateo
@ 2017-11-03 18:37 ` Patchwork
  2017-11-03 21:05 ` ✗ Fi.CI.BAT: failure for Refactor HW workaround code (rev6) Patchwork
  2017-11-03 21:51 ` [RFC PATCH v5 00/20] Refactor HW workaround code Oscar Mateo
  22 siblings, 0 replies; 30+ messages in thread
From: Patchwork @ 2017-11-03 18:37 UTC (permalink / raw)
  To: Oscar Mateo; +Cc: intel-gfx

== Series Details ==

Series: Refactor HW workaround code (rev5)
URL   : https://patchwork.freedesktop.org/series/31611/
State : failure

== Summary ==

Series 31611v5 Refactor HW workaround code
https://patchwork.freedesktop.org/api/1.0/series/31611/revisions/5/mbox/

Test chamelium:
        Subgroup dp-hpd-fast:
                skip       -> INCOMPLETE (fi-skl-6260u) fdo#102332
                skip       -> INCOMPLETE (fi-skl-6700k)
                skip       -> INCOMPLETE (fi-skl-6770hq)
                skip       -> INCOMPLETE (fi-skl-gvtdvm)
Test debugfs_test:
        Subgroup read_all_entries:
                pass       -> DMESG-WARN (fi-gdg-551)
                pass       -> DMESG-WARN (fi-blb-e6850)
                pass       -> DMESG-WARN (fi-pnv-d510)
                pass       -> DMESG-WARN (fi-bwr-2160)
                pass       -> DMESG-WARN (fi-elk-e7500)
                pass       -> DMESG-WARN (fi-ilk-650)
                pass       -> DMESG-WARN (fi-snb-2520m)
                pass       -> DMESG-WARN (fi-snb-2600)
                pass       -> DMESG-WARN (fi-ivb-3520m)
                pass       -> DMESG-WARN (fi-ivb-3770)
                pass       -> DMESG-WARN (fi-byt-j1900)
                pass       -> DMESG-WARN (fi-byt-n2820)
                pass       -> DMESG-WARN (fi-hsw-4770)
                pass       -> DMESG-WARN (fi-hsw-4770r)
                pass       -> DMESG-WARN (fi-bdw-5557u)
                pass       -> DMESG-WARN (fi-bdw-gvtdvm)
                pass       -> DMESG-WARN (fi-bsw-n3050)
Test gem_exec_reloc:
        Subgroup basic-write-gtt-active:
                fail       -> PASS       (fi-gdg-551) fdo#102582
Test gem_workarounds:
        Subgroup basic-read:
                pass       -> SKIP       (fi-bdw-5557u)
                pass       -> SKIP       (fi-bdw-gvtdvm)
                pass       -> SKIP       (fi-bsw-n3050)
                pass       -> SKIP       (fi-bxt-dsi)
                pass       -> SKIP       (fi-bxt-j4205)
                pass       -> SKIP       (fi-kbl-7500u)
                pass       -> SKIP       (fi-kbl-7560u)
                pass       -> SKIP       (fi-kbl-7567u)
                pass       -> SKIP       (fi-kbl-r)
                pass       -> SKIP       (fi-glk-1)
Test kms_busy:
        Subgroup basic-flip-b:
                fail       -> PASS       (fi-bwr-2160)
Test drv_module_reload:
        Subgroup basic-no-display:
                fail       -> PASS       (fi-hsw-4770r) fdo#103534

fdo#102332 https://bugs.freedesktop.org/show_bug.cgi?id=102332
fdo#102582 https://bugs.freedesktop.org/show_bug.cgi?id=102582
fdo#103534 https://bugs.freedesktop.org/show_bug.cgi?id=103534

fi-bdw-5557u     total:289  pass:266  dwarn:1   dfail:0   fail:0   skip:22  time:448s
fi-bdw-gvtdvm    total:289  pass:263  dwarn:1   dfail:0   fail:0   skip:25  time:454s
fi-blb-e6850     total:289  pass:222  dwarn:2   dfail:0   fail:0   skip:65  time:379s
fi-bsw-n3050     total:289  pass:241  dwarn:1   dfail:0   fail:0   skip:47  time:554s
fi-bwr-2160      total:289  pass:182  dwarn:1   dfail:0   fail:0   skip:106 time:275s
fi-bxt-dsi       total:289  pass:258  dwarn:0   dfail:0   fail:0   skip:31  time:506s
fi-bxt-j4205     total:289  pass:259  dwarn:0   dfail:0   fail:0   skip:30  time:510s
fi-byt-j1900     total:289  pass:252  dwarn:2   dfail:0   fail:0   skip:35  time:526s
fi-byt-n2820     total:289  pass:248  dwarn:2   dfail:0   fail:0   skip:39  time:503s
fi-cfl-s         total:289  pass:253  dwarn:3   dfail:0   fail:0   skip:33  time:552s
fi-elk-e7500     total:289  pass:228  dwarn:1   dfail:0   fail:0   skip:60  time:435s
fi-gdg-551       total:289  pass:177  dwarn:2   dfail:0   fail:1   skip:109 time:263s
fi-glk-1         total:289  pass:260  dwarn:0   dfail:0   fail:0   skip:29  time:581s
fi-hsw-4770      total:289  pass:261  dwarn:1   dfail:0   fail:0   skip:27  time:441s
fi-hsw-4770r     total:289  pass:261  dwarn:1   dfail:0   fail:0   skip:27  time:442s
fi-ilk-650       total:289  pass:227  dwarn:1   dfail:0   fail:0   skip:61  time:431s
fi-ivb-3520m     total:289  pass:259  dwarn:1   dfail:0   fail:0   skip:29  time:511s
fi-ivb-3770      total:289  pass:259  dwarn:1   dfail:0   fail:0   skip:29  time:472s
fi-kbl-7500u     total:289  pass:263  dwarn:1   dfail:0   fail:0   skip:25  time:494s
fi-kbl-7560u     total:289  pass:269  dwarn:0   dfail:0   fail:0   skip:20  time:576s
fi-kbl-7567u     total:289  pass:268  dwarn:0   dfail:0   fail:0   skip:21  time:480s
fi-kbl-r         total:289  pass:261  dwarn:0   dfail:0   fail:0   skip:28  time:590s
fi-pnv-d510      total:289  pass:221  dwarn:2   dfail:0   fail:0   skip:66  time:573s
fi-skl-6260u     total:1    pass:0    dwarn:0   dfail:0   fail:0   skip:0  
fi-skl-6700k     total:1    pass:0    dwarn:0   dfail:0   fail:0   skip:0  
fi-skl-6770hq    total:1    pass:0    dwarn:0   dfail:0   fail:0   skip:0  
fi-skl-gvtdvm    total:1    pass:0    dwarn:0   dfail:0   fail:0   skip:0  
fi-snb-2520m     total:289  pass:249  dwarn:1   dfail:0   fail:0   skip:39  time:584s
fi-snb-2600      total:289  pass:248  dwarn:1   dfail:0   fail:0   skip:40  time:433s

de359919ae463cdaef6bc6890156df84e19dee2a drm-tip: 2017y-11m-03d-14h-00m-37s UTC integration manifest
224598f43425 drm/i915: Document the i915_workarounds file
472f9c519616 drm/i915/bdw: Move GT and Display workarounds from init_clock_gating
9cd5ab2ab1f3 drm/i915/chv: Move GT and Display workarounds from init_clock_gating
9fea27f2ef56 drm/i915/skl: Move GT and Display workarounds from init_clock_gating
4775bddedaf3 drm/i915/bxt: Move GT and Display workarounds from init_clock_gating
a6b2390ba1b3 drm/i915/kbl: Move GT and Display workarounds from init_clock_gating
5eeb2504a4e8 drm/i915/glk: Move GT and Display workarounds from init_clock_gating
fe7adf2e3d75 drm/i915/cfl: Move GT and Display workarounds from init_clock_gating
c7c4ed0231f7 drm/i915/gen9: Move GT and Display workarounds from init_clock_gating
b901e5453911 drm/i915/cnl: Move GT and Display workarounds from init_clock_gating
c8cba80ea9aa drm/i915: Move WA BB stuff to the workarounds file as well
f6e4c1b0ab26 drm/i915: Do not store the total counts of WAs
609e7789d79a drm/i915: Print all workaround types correctly in debugfs
52178c527257 drm/i915: Create a new category of display WAs
641081623096 drm/i915: Transform Whitelist WAs into static tables
b80b5cd4af64 drm/i915: Transform GT WAs into static tables
5c5d72a03ffa drm/i915: Transform context WAs into static tables
7d5c51c54885 drm/i915: Split out functions for different kinds of workarounds
22f7d7efab27 drm/i915: Move a bunch of workaround-related code to its own file
1a991ddf1aec drm/i915: Remove Gen9 WAs with no effect

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_6949/

* [RFC PATCH v2] drm/i915: Transform Whitelist WAs into static tables
  2017-11-03 18:09 ` [RFC PATCH 06/20] drm/i915: Transform Whitelist " Oscar Mateo
@ 2017-11-03 20:43   ` Oscar Mateo
  0 siblings, 0 replies; 30+ messages in thread
From: Oscar Mateo @ 2017-11-03 20:43 UTC (permalink / raw)
  To: intel-gfx

This is for WAs that whitelist a register.

v2: Warn about older GENs in the apply, not in the get function

Suggested-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
---
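Note: the removed wa_ring_whitelist_reg() wrote each whitelisted
register offset into the next free RING_FORCE_TO_NONPRIV slot and
refused to go past RING_MAX_NONPRIV_SLOTS. The following is a minimal
user-space sketch of the same idea driven by a table. The names
whitelist_entry, nonpriv_slots, whitelist_apply and
DEMO_MAX_NONPRIV_SLOTS, the slot count of 12 and the register offsets
are assumptions for illustration only.

```c
#include <stddef.h>
#include <stdint.h>

#define DEMO_MAX_NONPRIV_SLOTS	12	/* assumed slot count */

/* Hypothetical stand-in for a whitelist-type table entry. */
struct whitelist_entry {
	const char *name;
	uint32_t whitelist_offset;	/* offset of the register to expose */
};

/* Fake per-engine FORCE_TO_NONPRIV slots. */
static uint32_t nonpriv_slots[DEMO_MAX_NONPRIV_SLOTS];

/*
 * Write one offset per slot, in table order. Returns the number of
 * slots used, or -1 if the table does not fit in the available slots.
 */
static int whitelist_apply(const struct whitelist_entry *tbl, size_t count)
{
	size_t i;

	if (count > DEMO_MAX_NONPRIV_SLOTS)
		return -1;

	for (i = 0; i < count; i++)
		nonpriv_slots[i] = tbl[i].whitelist_offset;

	return (int)count;
}

/* Illustrative table; the offsets are made up. */
static const struct whitelist_entry demo_whitelist_was[] = {
	{ "WaVFEStateAfterPipeControlwithMediaStateClear", 0x1000 },
	{ "WaEnablePreemptionGranularityControlByUMD", 0x2000 },
	{ "WaAllowUMDToModifyHDCChicken1", 0x3000 },
};
```

Checking the table size once up front replaces the per-register
WARN_ON in the removed helper; whether the real code does the check
per table or per entry is not something this sketch tries to settle.
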
 drivers/gpu/drm/i915/i915_drv.h          |   2 +
 drivers/gpu/drm/i915/intel_workarounds.c | 251 ++++++++++++++++---------------
 drivers/gpu/drm/i915/intel_workarounds.h |   3 +
 3 files changed, 131 insertions(+), 125 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index e68edf18..441d92e 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1991,6 +1991,8 @@ struct i915_wa_reg {
 	u8 since;
 	u8 until;
 
+	i915_reg_t whitelist_addr;
+
 	i915_reg_t addr;
 	u32 mask;
 	u32 value;
diff --git a/drivers/gpu/drm/i915/intel_workarounds.c b/drivers/gpu/drm/i915/intel_workarounds.c
index b07fbd0..efa6bc2 100644
--- a/drivers/gpu/drm/i915/intel_workarounds.c
+++ b/drivers/gpu/drm/i915/intel_workarounds.c
@@ -33,6 +33,10 @@
 	.name = (wa),			\
 	.type = I915_WA_TYPE_GT
 
+#define WA_WHITELIST(wa)		\
+	.name = (wa),			\
+	.type = I915_WA_TYPE_WHITELIST
+
 #define ALL_REVS		\
 	.since = 0,		\
 	.until = REVID_FOREVER
@@ -75,6 +79,9 @@
 	.value = MASK(m, v),		\
 	.is_masked_reg = true
 
+#define WHITELIST(reg)		\
+	.whitelist_addr = reg
+
 static struct i915_wa_reg gen8_ctx_was[] = {
 	{ WA_CTX(""),
 	  ALL_REVS, REG(INSTPM),
@@ -861,160 +868,154 @@ void intel_gt_workarounds_apply(struct drm_i915_private *dev_priv)
 	DRM_DEBUG_DRIVER("Number of GT specific w/a: %u\n", total_count);
 }
 
-static int wa_ring_whitelist_reg(struct intel_engine_cs *engine,
-				 i915_reg_t reg)
-{
-	struct drm_i915_private *dev_priv = engine->i915;
-	struct i915_workarounds *wa = &dev_priv->workarounds;
-	const uint32_t index = wa->hw_whitelist_count[engine->id];
-
-	if (WARN_ON(index >= RING_MAX_NONPRIV_SLOTS))
-		return -EINVAL;
+static struct i915_wa_reg gen9_whitelist_was[] = {
+	{ WA_WHITELIST("WaVFEStateAfterPipeControlwithMediaStateClear"),
+	  ALL_REVS, WHITELIST(GEN9_CTX_PREEMPT_REG) },
 
-	I915_WRITE(RING_FORCE_TO_NONPRIV(engine->mmio_base, index),
-		   i915_mmio_reg_offset(reg));
-	wa->hw_whitelist_count[engine->id]++;
+	{ WA_WHITELIST("WaEnablePreemptionGranularityControlByUMD"),
+	  ALL_REVS, WHITELIST(GEN8_CS_CHICKEN1) },
 
-	return 0;
-}
-
-static int gen9_whitelist_workarounds_apply(struct intel_engine_cs *engine)
-{
-	int ret;
+	{ WA_WHITELIST("WaAllowUMDToModifyHDCChicken1"),
+	  ALL_REVS, WHITELIST(GEN8_HDC_CHICKEN1) },
+};
 
-	/* WaVFEStateAfterPipeControlwithMediaStateClear:skl,bxt,glk,cfl */
-	ret = wa_ring_whitelist_reg(engine, GEN9_CTX_PREEMPT_REG);
-	if (ret)
-		return ret;
+static struct i915_wa_reg skl_whitelist_was[] = {
+	{ WA_WHITELIST("WaDisableLSQCROPERFforOCL"),
+	  ALL_REVS, WHITELIST(GEN8_L3SQCREG4) },
+};
 
-	/* WaEnablePreemptionGranularityControlByUMD:skl,bxt,kbl,cfl,[cnl] */
-	ret = wa_ring_whitelist_reg(engine, GEN8_CS_CHICKEN1);
-	if (ret)
-		return ret;
+static struct i915_wa_reg bxt_whitelist_was[] = {
+	{ WA_WHITELIST("WaDisableObjectLevelPreemptionForTrifanOrPolygon +"
+		       "WaDisableObjectLevelPreemptionForInstancedDraw +"
+		       "WaDisableObjectLevelPreemtionForInstanceId +"
+		       "WaDisableLSQCROPERFforOCL"),
+	  REVS(0, BXT_REVID_A1), WHITELIST(GEN9_CS_DEBUG_MODE1) },
 
-	/* WaAllowUMDToModifyHDCChicken1:skl,bxt,kbl,glk,cfl */
-	ret = wa_ring_whitelist_reg(engine, GEN8_HDC_CHICKEN1);
-	if (ret)
-		return ret;
+	{ WA_WHITELIST("WaDisableObjectLevelPreemptionForTrifanOrPolygon +"
+		       "WaDisableObjectLevelPreemptionForInstancedDraw +"
+		       "WaDisableObjectLevelPreemtionForInstanceId +"
+		       "WaDisableLSQCROPERFforOCL"),
+	  REVS(0, BXT_REVID_A1), WHITELIST(GEN8_L3SQCREG4) },
+};
 
-	return 0;
-}
+static struct i915_wa_reg kbl_whitelist_was[] = {
+	{ WA_WHITELIST("WaDisableLSQCROPERFforOCL"),
+	  ALL_REVS, WHITELIST(GEN8_L3SQCREG4) },
+};
 
-static int skl_whitelist_workarounds_apply(struct intel_engine_cs *engine)
-{
-	int ret = gen9_whitelist_workarounds_apply(engine);
-	if (ret)
-		return ret;
+static struct i915_wa_reg cnl_whitelist_was[] = {
+	{ WA_WHITELIST("WaEnablePreemptionGranularityControlByUMD"),
+	  ALL_REVS, WHITELIST(GEN8_CS_CHICKEN1) },
+};
 
-	/* WaDisableLSQCROPERFforOCL:skl */
-	ret = wa_ring_whitelist_reg(engine, GEN8_L3SQCREG4);
-	if (ret)
-		return ret;
+static const struct i915_wa_reg_table skl_whitelist_wa_tbl[] = {
+	{ gen9_whitelist_was, ARRAY_SIZE(gen9_whitelist_was) },
+	{ skl_whitelist_was,  ARRAY_SIZE(skl_whitelist_was) },
+};
 
-	return 0;
-}
+static const struct i915_wa_reg_table bxt_whitelist_wa_tbl[] = {
+	{ gen9_whitelist_was, ARRAY_SIZE(gen9_whitelist_was) },
+	{ bxt_whitelist_was,  ARRAY_SIZE(bxt_whitelist_was) },
+};
 
-static int bxt_whitelist_workarounds_apply(struct intel_engine_cs *engine)
-{
-	struct drm_i915_private *dev_priv = engine->i915;
+static const struct i915_wa_reg_table kbl_whitelist_wa_tbl[] = {
+	{ gen9_whitelist_was, ARRAY_SIZE(gen9_whitelist_was) },
+	{ kbl_whitelist_was,  ARRAY_SIZE(kbl_whitelist_was) },
+};
 
-	int ret = gen9_whitelist_workarounds_apply(engine);
-	if (ret)
-		return ret;
+static const struct i915_wa_reg_table glk_whitelist_wa_tbl[] = {
+	{ gen9_whitelist_was, ARRAY_SIZE(gen9_whitelist_was) },
+};
 
-	/* WaDisableObjectLevelPreemptionForTrifanOrPolygon:bxt */
-	/* WaDisableObjectLevelPreemptionForInstancedDraw:bxt */
-	/* WaDisableObjectLevelPreemtionForInstanceId:bxt */
-	/* WaDisableLSQCROPERFforOCL:bxt */
-	if (IS_BXT_REVID(dev_priv, 0, BXT_REVID_A1)) {
-		ret = wa_ring_whitelist_reg(engine, GEN9_CS_DEBUG_MODE1);
-		if (ret)
-			return ret;
-
-		ret = wa_ring_whitelist_reg(engine, GEN8_L3SQCREG4);
-		if (ret)
-			return ret;
-	}
+static const struct i915_wa_reg_table cfl_whitelist_wa_tbl[] = {
+	{ gen9_whitelist_was, ARRAY_SIZE(gen9_whitelist_was) },
+};
 
-	return 0;
-}
+static const struct i915_wa_reg_table cnl_whitelist_wa_tbl[] = {
+	{ cnl_whitelist_was,  ARRAY_SIZE(cnl_whitelist_was) },
+};
 
-static int kbl_whitelist_workarounds_apply(struct intel_engine_cs *engine)
+void intel_whitelist_workarounds_get(struct drm_i915_private *dev_priv,
+				     const struct i915_wa_reg_table **wa_table,
+				     uint *table_count)
 {
-	int ret = gen9_whitelist_workarounds_apply(engine);
-	if (ret)
-		return ret;
-
-	/* WaDisableLSQCROPERFforOCL:kbl */
-	ret = wa_ring_whitelist_reg(engine, GEN8_L3SQCREG4);
-	if (ret)
-		return ret;
+	*wa_table = NULL;
+	*table_count = 0;
 
-	return 0;
+	if (INTEL_GEN(dev_priv) < 9)
+		return;
+	else if (IS_SKYLAKE(dev_priv)) {
+		*wa_table = skl_whitelist_wa_tbl;
+		*table_count = ARRAY_SIZE(skl_whitelist_wa_tbl);
+	} else if (IS_BROXTON(dev_priv)) {
+		*wa_table = bxt_whitelist_wa_tbl;
+		*table_count = ARRAY_SIZE(bxt_whitelist_wa_tbl);
+	} else if (IS_KABYLAKE(dev_priv)) {
+		*wa_table = kbl_whitelist_wa_tbl;
+		*table_count = ARRAY_SIZE(kbl_whitelist_wa_tbl);
+	} else if (IS_GEMINILAKE(dev_priv)) {
+		*wa_table = glk_whitelist_wa_tbl;
+		*table_count = ARRAY_SIZE(glk_whitelist_wa_tbl);
+	} else if (IS_COFFEELAKE(dev_priv)) {
+		*wa_table = cfl_whitelist_wa_tbl;
+		*table_count = ARRAY_SIZE(cfl_whitelist_wa_tbl);
+	} else if (IS_CANNONLAKE(dev_priv)) {
+		*wa_table = cnl_whitelist_wa_tbl;
+		*table_count = ARRAY_SIZE(cnl_whitelist_wa_tbl);
+	} else {
+		MISSING_CASE(INTEL_GEN(dev_priv));
+		return;
+	}
 }
 
-static int glk_whitelist_workarounds_apply(struct intel_engine_cs *engine)
+int intel_whitelist_workarounds_apply(struct intel_engine_cs *engine)
 {
-	int ret = gen9_whitelist_workarounds_apply(engine);
-	if (ret)
-		return ret;
-
-	return 0;
-}
+	struct drm_i915_private *dev_priv = engine->i915;
+	const struct i915_wa_reg_table *wa_table;
+	uint table_count, total_count = 0;
+	int i, j;
 
-static int cfl_whitelist_workarounds_apply(struct intel_engine_cs *engine)
-{
-	int ret = gen9_whitelist_workarounds_apply(engine);
-	if (ret)
-		return ret;
+	if (INTEL_GEN(dev_priv) < 9) {
+		WARN(1, "No whitelisting in Gen%u\n", INTEL_GEN(dev_priv));
+		return -EINVAL;
+	}
 
-	return 0;
-}
+	intel_whitelist_workarounds_get(dev_priv, &wa_table, &table_count);
 
-static int cnl_whitelist_workarounds_apply(struct intel_engine_cs *engine)
-{
-	int ret;
+	for (i = 0; i < table_count; i++) {
+		struct i915_wa_reg *wa = wa_table[i].table;
 
-	/* WaEnablePreemptionGranularityControlByUMD:cnl */
-	ret = wa_ring_whitelist_reg(engine, GEN8_CS_CHICKEN1);
-	if (ret)
-		return ret;
+		for (j = 0; j < wa_table[i].count; j++) {
+			wa[j].applied =
+				IS_REVID(dev_priv, wa[j].since, wa[j].until);
 
-	return 0;
-}
+			if (wa[j].applied && wa[j].pre_hook)
+				wa[j].applied = wa[j].pre_hook(dev_priv, &wa[j]);
 
-int intel_whitelist_workarounds_apply(struct intel_engine_cs *engine)
-{
-	struct drm_i915_private *dev_priv = engine->i915;
-	int err;
+			if (wa[j].applied) {
+				if (WARN_ON(total_count >= RING_MAX_NONPRIV_SLOTS)) {
+					wa[j].applied = false;
+					return -EINVAL;
+				}
 
-	WARN_ON(engine->id != RCS);
+				/* Cache the translation of the */
+				wa[j].addr =
+					RING_FORCE_TO_NONPRIV(engine->mmio_base,
+							      total_count++);
+				wa[j].value =
+					i915_mmio_reg_offset(wa[j].whitelist_addr);
+				wa[j].mask = 0xffffffff;
 
-	dev_priv->workarounds.hw_whitelist_count[engine->id] = 0;
+				I915_WRITE(wa[j].addr, wa[j].value);
+			}
 
-	if (INTEL_GEN(dev_priv) < 9) {
-		WARN(1, "No whitelisting in Gen%u\n", INTEL_GEN(dev_priv));
-		err = 0;
-	} else if (IS_SKYLAKE(dev_priv))
-		err = skl_whitelist_workarounds_apply(engine);
-	else if (IS_BROXTON(dev_priv))
-		err = bxt_whitelist_workarounds_apply(engine);
-	else if (IS_KABYLAKE(dev_priv))
-		err = kbl_whitelist_workarounds_apply(engine);
-	else if (IS_GEMINILAKE(dev_priv))
-		err = glk_whitelist_workarounds_apply(engine);
-	else if (IS_COFFEELAKE(dev_priv))
-		err = cfl_whitelist_workarounds_apply(engine);
-	else if (IS_CANNONLAKE(dev_priv))
-		err = cnl_whitelist_workarounds_apply(engine);
-	else {
-		MISSING_CASE(INTEL_GEN(dev_priv));
-		err = 0;
+			GEM_BUG_ON(wa[j].post_hook);
+		}
 	}
-	if (err)
-		return err;
 
-	DRM_DEBUG_DRIVER("%s: Number of whitelist w/a: %d\n", engine->name,
-			 dev_priv->workarounds.hw_whitelist_count[engine->id]);
+	dev_priv->workarounds.hw_whitelist_count[engine->id] = total_count;
+	DRM_DEBUG_DRIVER("%s: Number of whitelist w/a: %u\n", engine->name,
+			 total_count);
+
 	return 0;
 }
diff --git a/drivers/gpu/drm/i915/intel_workarounds.h b/drivers/gpu/drm/i915/intel_workarounds.h
index 9bb3c48..f60913f 100644
--- a/drivers/gpu/drm/i915/intel_workarounds.h
+++ b/drivers/gpu/drm/i915/intel_workarounds.h
@@ -35,6 +35,9 @@ void intel_gt_workarounds_get(struct drm_i915_private *dev_priv,
                               uint *table_count);
 void intel_gt_workarounds_apply(struct drm_i915_private *dev_priv);
 
+void intel_whitelist_workarounds_get(struct drm_i915_private *dev_priv,
+                                     const struct i915_wa_reg_table **wa_table,
+                                     uint *table_count);
 int intel_whitelist_workarounds_apply(struct intel_engine_cs *engine);
 
 #endif
-- 
1.9.1


* ✗ Fi.CI.BAT: failure for Refactor HW workaround code (rev6)
  2017-11-03 18:09 [RFC PATCH v5 00/20] Refactor HW workaround code Oscar Mateo
                   ` (20 preceding siblings ...)
  2017-11-03 18:37 ` ✗ Fi.CI.BAT: failure for Refactor HW workaround code (rev5) Patchwork
@ 2017-11-03 21:05 ` Patchwork
  2017-11-03 21:51 ` [RFC PATCH v5 00/20] Refactor HW workaround code Oscar Mateo
  22 siblings, 0 replies; 30+ messages in thread
From: Patchwork @ 2017-11-03 21:05 UTC (permalink / raw)
  To: Oscar Mateo; +Cc: intel-gfx

== Series Details ==

Series: Refactor HW workaround code (rev6)
URL   : https://patchwork.freedesktop.org/series/31611/
State : failure

== Summary ==

Series 31611v6 Refactor HW workaround code
https://patchwork.freedesktop.org/api/1.0/series/31611/revisions/6/mbox/

Test chamelium:
        Subgroup dp-hpd-fast:
                skip       -> INCOMPLETE (fi-skl-6260u) fdo#102332
                skip       -> INCOMPLETE (fi-skl-6700k)
                skip       -> INCOMPLETE (fi-skl-6770hq)
                skip       -> INCOMPLETE (fi-skl-gvtdvm)
Test gem_basic:
        Subgroup create-close:
                dmesg-warn -> PASS       (fi-cfl-s)
        Subgroup create-fd-close:
                dmesg-warn -> PASS       (fi-cfl-s)
Test gem_workarounds:
        Subgroup basic-read:
                pass       -> SKIP       (fi-bdw-5557u)
                pass       -> SKIP       (fi-bdw-gvtdvm)
                pass       -> SKIP       (fi-bsw-n3050)
                pass       -> SKIP       (fi-bxt-dsi)
                pass       -> SKIP       (fi-bxt-j4205)
                pass       -> SKIP       (fi-kbl-7500u)
                pass       -> SKIP       (fi-kbl-7560u)
                pass       -> SKIP       (fi-kbl-7567u)
                pass       -> SKIP       (fi-kbl-r)
                pass       -> SKIP       (fi-glk-1)
                pass       -> SKIP       (fi-cfl-s)
                pass       -> SKIP       (fi-cnl-y)
Test kms_pipe_crc_basic:
        Subgroup read-crc-pipe-c:
                dmesg-warn -> PASS       (fi-bsw-n3050)

fdo#102332 https://bugs.freedesktop.org/show_bug.cgi?id=102332

fi-bdw-5557u     total:289  pass:267  dwarn:0   dfail:0   fail:0   skip:22  time:443s
fi-bdw-gvtdvm    total:289  pass:264  dwarn:0   dfail:0   fail:0   skip:25  time:454s
fi-blb-e6850     total:289  pass:223  dwarn:1   dfail:0   fail:0   skip:65  time:382s
fi-bsw-n3050     total:289  pass:242  dwarn:0   dfail:0   fail:0   skip:47  time:550s
fi-bwr-2160      total:289  pass:183  dwarn:0   dfail:0   fail:0   skip:106 time:274s
fi-bxt-dsi       total:289  pass:258  dwarn:0   dfail:0   fail:0   skip:31  time:508s
fi-bxt-j4205     total:289  pass:259  dwarn:0   dfail:0   fail:0   skip:30  time:507s
fi-byt-j1900     total:289  pass:253  dwarn:1   dfail:0   fail:0   skip:35  time:528s
fi-byt-n2820     total:289  pass:249  dwarn:1   dfail:0   fail:0   skip:39  time:499s
fi-cfl-s         total:289  pass:253  dwarn:3   dfail:0   fail:0   skip:33  time:557s
fi-cnl-y         total:289  pass:261  dwarn:0   dfail:0   fail:0   skip:28  time:612s
fi-elk-e7500     total:289  pass:229  dwarn:0   dfail:0   fail:0   skip:60  time:429s
fi-gdg-551       total:289  pass:178  dwarn:1   dfail:0   fail:1   skip:109 time:265s
fi-glk-1         total:289  pass:260  dwarn:0   dfail:0   fail:0   skip:29  time:587s
fi-hsw-4770      total:289  pass:262  dwarn:0   dfail:0   fail:0   skip:27  time:446s
fi-hsw-4770r     total:289  pass:262  dwarn:0   dfail:0   fail:0   skip:27  time:438s
fi-ilk-650       total:289  pass:228  dwarn:0   dfail:0   fail:0   skip:61  time:428s
fi-ivb-3520m     total:289  pass:260  dwarn:0   dfail:0   fail:0   skip:29  time:510s
fi-ivb-3770      total:289  pass:260  dwarn:0   dfail:0   fail:0   skip:29  time:470s
fi-kbl-7500u     total:289  pass:263  dwarn:1   dfail:0   fail:0   skip:25  time:490s
fi-kbl-7560u     total:289  pass:269  dwarn:0   dfail:0   fail:0   skip:20  time:574s
fi-kbl-7567u     total:289  pass:268  dwarn:0   dfail:0   fail:0   skip:21  time:479s
fi-kbl-r         total:289  pass:261  dwarn:0   dfail:0   fail:0   skip:28  time:582s
fi-pnv-d510      total:289  pass:222  dwarn:1   dfail:0   fail:0   skip:66  time:570s
fi-skl-6260u     total:1    pass:0    dwarn:0   dfail:0   fail:0   skip:0  
fi-skl-6700k     total:1    pass:0    dwarn:0   dfail:0   fail:0   skip:0  
fi-skl-6770hq    total:1    pass:0    dwarn:0   dfail:0   fail:0   skip:0  
fi-skl-gvtdvm    total:1    pass:0    dwarn:0   dfail:0   fail:0   skip:0  
fi-snb-2520m     total:289  pass:250  dwarn:0   dfail:0   fail:0   skip:39  time:579s
fi-snb-2600      total:289  pass:249  dwarn:0   dfail:0   fail:0   skip:40  time:432s

8b0ae6b50a229dc661a02f4034252ee854cc9b83 drm-tip: 2017y-11m-03d-17h-15m-57s UTC integration manifest
f2669847fa75 drm/i915: Document the i915_workarounds file
38aa0f69abf7 drm/i915/bdw: Move GT and Display workarounds from init_clock_gating
8421f2af7194 drm/i915/chv: Move GT and Display workarounds from init_clock_gating
22d162368c95 drm/i915/skl: Move GT and Display workarounds from init_clock_gating
1a1d677877b5 drm/i915/bxt: Move GT and Display workarounds from init_clock_gating
2d47c83c1f56 drm/i915/kbl: Move GT and Display workarounds from init_clock_gating
fb168c1de5aa drm/i915/glk: Move GT and Display workarounds from init_clock_gating
9e43fbf90189 drm/i915/cfl: Move GT and Display workarounds from init_clock_gating
c43c4633394c drm/i915/gen9: Move GT and Display workarounds from init_clock_gating
d1cf94810c0b drm/i915/cnl: Move GT and Display workarounds from init_clock_gating
426147e4b6d0 drm/i915: Move WA BB stuff to the workarounds file as well
e9c28cd29fff drm/i915: Do not store the total counts of WAs
0525c74c783c drm/i915: Print all workaround types correctly in debugfs
f4d433201ab7 drm/i915: Create a new category of display WAs
bd3c74fbba8d drm/i915: Transform Whitelist WAs into static tables
e19cc53c3be5 drm/i915: Transform GT WAs into static tables
f24d5f241e0c drm/i915: Transform context WAs into static tables
9cb951c1cb19 drm/i915: Split out functions for different kinds of workarounds
538ad04010d5 drm/i915: Move a bunch of workaround-related code to its own file
28d4083f3f13 drm/i915: Remove Gen9 WAs with no effect

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_6952/

* Re: [RFC PATCH v5 00/20] Refactor HW workaround code
  2017-11-03 18:09 [RFC PATCH v5 00/20] Refactor HW workaround code Oscar Mateo
                   ` (21 preceding siblings ...)
  2017-11-03 21:05 ` ✗ Fi.CI.BAT: failure for Refactor HW workaround code (rev6) Patchwork
@ 2017-11-03 21:51 ` Oscar Mateo
  22 siblings, 0 replies; 30+ messages in thread
From: Oscar Mateo @ 2017-11-03 21:51 UTC (permalink / raw)
  To: intel-gfx



On 11/03/2017 11:09 AM, Oscar Mateo wrote:
> New approach using static tables instead of a programmatic one. This is RFC for
> two reasons: firstly because I still need to re-review everything myself (I
> wanted to get it out there asap), and secondly because I'm not 100% convinced
> by this approach.
>
> While writing the patches, the approach seemed forceful: I couldn't use const
> structs

Or, in other words, this whole thing I just sent is broken and no 
better than using global variables. I'll try to cache things somewhere 
else and resend; sorry for even sending this stupid respin.

> because there are a good deal of things that need calculations (e.g.
> skl_tune_iz_hashing), or need to pass data between the pre- and the post- hooks
> (e.g. disable/enable_dop_clock_gating), or cannot be done in tables (e.g. assert
> the values make sense for those registers that are masked), or need to extract
> fields from structs (whitelist registers). I also cannot predict how future-proof
> this thing is.
>
> Furthermore, this is going to be much more difficult to review than the previous
> approach, if only because the delta is much bigger. If this approach is preferred,
> I strongly suggest we do it on top of the previous one (that way we have the debugfs
> output much earlier in the game to make sure we aren't missing anything).
>
>  From previous cover letters:
>
> Currently, deciding how/where to apply new workarounds is challenging. Often,
> workarounds end up applied incorrectly and get lost under certain circumstances
> (e.g. a context switch or a GPU reset). This is a proposal to attempt to
> eliminate some of this pain, by clarifying the current classification of
> workarounds (context saved/restored, global registers, whitelisting, BB),
> putting them together on the same file, and improving the existing validation
> infrastructure (debugfs/i-g-t).
>
> Oscar Mateo (20):
>    drm/i915: Remove Gen9 WAs with no effect
>    drm/i915: Move a bunch of workaround-related code to its own file
>    drm/i915: Split out functions for different kinds of workarounds
>    drm/i915: Transform context WAs into static tables
>    drm/i915: Transform GT WAs into static tables
>    drm/i915: Transform Whitelist WAs into static tables
>    drm/i915: Create a new category of display WAs
>    drm/i915: Print all workaround types correctly in debugfs
>    drm/i915: Do not store the total counts of WAs
>    drm/i915: Move WA BB stuff to the workarounds file as well
>    drm/i915/cnl: Move GT and Display workarounds from init_clock_gating
>    drm/i915/gen9: Move GT and Display workarounds from init_clock_gating
>    drm/i915/cfl: Move GT and Display workarounds from init_clock_gating
>    drm/i915/glk: Move GT and Display workarounds from init_clock_gating
>    drm/i915/kbl: Move GT and Display workarounds from init_clock_gating
>    drm/i915/bxt: Move GT and Display workarounds from init_clock_gating
>    drm/i915/skl: Move GT and Display workarounds from init_clock_gating
>    drm/i915/chv: Move GT and Display workarounds from init_clock_gating
>    drm/i915/bdw: Move GT and Display workarounds from init_clock_gating
>    drm/i915: Document the i915_workarounds file
>
>   drivers/gpu/drm/i915/Makefile            |    3 +-
>   drivers/gpu/drm/i915/i915_debugfs.c      |  117 ++-
>   drivers/gpu/drm/i915/i915_drv.h          |   40 +-
>   drivers/gpu/drm/i915/i915_gem.c          |    3 +
>   drivers/gpu/drm/i915/i915_gem_context.c  |    1 +
>   drivers/gpu/drm/i915/i915_reg.h          |    3 -
>   drivers/gpu/drm/i915/intel_engine_cs.c   |  682 ------------
>   drivers/gpu/drm/i915/intel_lrc.c         |  264 +----
>   drivers/gpu/drm/i915/intel_pm.c          |  312 +-----
>   drivers/gpu/drm/i915/intel_ringbuffer.c  |    5 +-
>   drivers/gpu/drm/i915/intel_ringbuffer.h  |    3 -
>   drivers/gpu/drm/i915/intel_workarounds.c | 1663 ++++++++++++++++++++++++++++++
>   drivers/gpu/drm/i915/intel_workarounds.h |   51 +
>   13 files changed, 1867 insertions(+), 1280 deletions(-)
>   create mode 100644 drivers/gpu/drm/i915/intel_workarounds.c
>   create mode 100644 drivers/gpu/drm/i915/intel_workarounds.h
>


* Re: [RFC PATCH 04/20] drm/i915: Transform context WAs into static tables
  2017-11-03 18:09 ` [RFC PATCH 04/20] drm/i915: Transform context WAs into static tables Oscar Mateo
@ 2017-11-06 11:59   ` Joonas Lahtinen
  2017-11-06 18:54     ` Oscar Mateo
  0 siblings, 1 reply; 30+ messages in thread
From: Joonas Lahtinen @ 2017-11-06 11:59 UTC (permalink / raw)
  To: Oscar Mateo, intel-gfx

On Fri, 2017-11-03 at 11:09 -0700, Oscar Mateo wrote:
> This is for WAs that need to touch registers that get saved/restored
> together with the logical context. The idea is that WAs are "pretty"
> static, so a table is more declarative than a programmatic approach.
> Note however that some amount of caching is needed for those things
> that are dynamic (e.g. things that need some calculation, or have
> criteria different than the more obvious GEN + stepping).
> 
> Also, this makes very explicit which WAs live in the context.
> 
> Suggested-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
> Cc: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>

<SNIP>

> +struct i915_wa_reg;
> +
> +typedef bool (* wa_pre_hook_func)(struct drm_i915_private *dev_priv,
> +				  struct i915_wa_reg *wa);
> +typedef void (* wa_post_hook_func)(struct drm_i915_private *dev_priv,
> +				   struct i915_wa_reg *wa);

To avoid carrying any variables over, how about just apply() hook?
Also, you don't have to have "_hook" going there, it's tak

>  struct i915_wa_reg {
> +	const char *name;

We may want some Kconfig option for skipping these.

> +	enum wa_type {
> +		I915_WA_TYPE_CONTEXT = 0,
> +		I915_WA_TYPE_GT,
> +		I915_WA_TYPE_DISPLAY,
> +		I915_WA_TYPE_WHITELIST
> +	} type;
> +

Any specific reason not to have the gen here too? Then you can have one
big table, instead of tables of tables. Then the numeric code of a WA
(position in that table) would be equally identifying it compared to
the WA name (which is nice to have information, so config time opt-in).

> +	u8 since;
> +	u8 until;

Most seem to have ALL_REVS, so this could be after the coarse-grained
gen-check in the apply function.

> +
>  	i915_reg_t addr;
> -	u32 value;
> -	/* bitmask representing WA bits */
>  	u32 mask;
> +	u32 value;
> +	bool is_masked_reg;

I'd hide this detail into the apply function.

> +
> +	wa_pre_hook_func pre_hook;
> +	wa_post_hook_func post_hook;

	bool (*apply)(const struct i915_wa *wa,
		      struct drm_i915_private *dev_priv);

> +	u32 hook_data;
> +	bool applied;

The big point would be to make this into const, so "applied" would
defeat that.

<SNIP>

> +#define MASK(mask, value)	((mask) << 16 | (value))
> +#define MASK_ENABLE(x)		(MASK((x), (x)))
> +#define MASK_DISABLE(x)		(MASK((x), 0))
>  
> -#define WA_REG(addr, mask, val) do { \
> -		const int r = wa_add(dev_priv, (addr), (mask), (val)); \
> -		if (r) \
> -			return r; \
> -	} while (0)
> +#define SET_BIT_MASKED(m) 		\
> +	.mask = (m),			\
> +	.value = MASK_ENABLE(m),	\
> +	.is_masked_reg = true
>  
> -#define WA_SET_BIT_MASKED(addr, mask) \
> -	WA_REG(addr, (mask), _MASKED_BIT_ENABLE(mask))
> +#define CLEAR_BIT_MASKED( m) 		\
> +	.mask = (m),			\
> +	.value = MASK_DISABLE(m),	\
> +	.is_masked_reg = true
>  
> -#define WA_CLR_BIT_MASKED(addr, mask) \
> -	WA_REG(addr, (mask), _MASKED_BIT_DISABLE(mask))
> +#define SET_FIELD_MASKED(m, v) 		\
> +	.mask = (m),			\
> +	.value = MASK(m, v),		\
> +	.is_masked_reg = true

Let's try to have the struct i915_wa as small as possible, so this could
be calculated in the apply function.
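
A minimal sketch of that idea, in standalone C rather than driver code: the table entry keeps only the affected bits, and the write value for a masked register is derived at apply time. The names `wa_entry`, `wa_kind` and `wa_write_value` are hypothetical; the bit layout (write-enable mask in the upper 16 bits, data in the lower 16) follows the MASK() macro in the quoted patch.

```c
#include <stdint.h>

/* Hypothetical, trimmed-down table entry: no precomputed write value. */
enum wa_kind { WA_SET_BIT, WA_CLEAR_BIT };

struct wa_entry {
	uint32_t reg_offset;	/* stand-in for i915_reg_t */
	uint32_t bits;		/* bits the workaround touches (low 16 bits) */
	enum wa_kind kind;
};

/*
 * Masked registers take the write-enable mask in bits 31:16 and the data
 * in bits 15:0, matching MASK(mask, value) in the quoted patch.
 */
static uint32_t wa_write_value(const struct wa_entry *wa)
{
	uint32_t enable = wa->bits << 16;

	/* Setting writes the bits back as data; clearing writes zeroes. */
	return wa->kind == WA_SET_BIT ? (enable | wa->bits) : enable;
}
```

With this shape the entry can stay const, since nothing is cached in it after the table is built.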

So, avoiding the macros this would indeed become rather declarative;

{
	WA_NAME("WaDisableAsyncFlipPerfMode")
	.gen = ...,
	.reg = MI_MODE,
	.value = ASYNC_FLIP_PERF_DISABLE,
	.apply = set_bit_masked,
},

Or, we could also have;

static const struct i915_wa WaDisableAsyncFlipPerfMode = {
	.gen = ...,
	.reg = MI_MODE,
	.value = ASYNC_FLIP_PERF_DISABLE,
	.apply = set_bit_masked,
};

And then one array of those.

	WA(WaDisableAsyncFlipPerfMode),

Then you could at compile time decide if you stringify and store the
name. But that'd be more const data than necessary (pointers to
structs, instead of an array of structs).
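
As a rough illustration of the compile-time opt-in for names, here is a sketch of the array-of-structs variant. Everything in it is an assumption for illustration: `WA_STORE_NAMES` stands in for a Kconfig symbol, `wa_desc` and `wa_name` are made-up names, and the gen, offset and value in the table are placeholders rather than real register data.

```c
#include <stddef.h>

/* Hypothetical switch; in the kernel this would be a Kconfig symbol. */
#ifndef WA_STORE_NAMES
#define WA_STORE_NAMES 1
#endif

#if WA_STORE_NAMES
#define WA_NAME(n)	.name = #n,	/* stringify and store the name */
#else
#define WA_NAME(n)			/* name compiled out entirely */
#endif

struct wa_desc {
#if WA_STORE_NAMES
	const char *name;
#endif
	unsigned int gen;
	unsigned int reg_offset;
	unsigned int value;
};

/* Placeholder values only; not taken from real hardware documentation. */
static const struct wa_desc wa_table[] = {
	{ WA_NAME(WaDisableAsyncFlipPerfMode)
	  .gen = 9, .reg_offset = 0x1000, .value = 1u << 14 },
};

/* Returns the stored name, or NULL when names are compiled out. */
static const char *wa_name(size_t i)
{
#if WA_STORE_NAMES
	return wa_table[i].name;
#else
	(void)i;
	return NULL;
#endif
}
```

When names are compiled out, the position in `wa_table` is the only identifier left, which is the trade-off described above.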

Regards, Joonas
-- 
Joonas Lahtinen
Open Source Technology Center
Intel Corporation

* Re: [RFC PATCH 01/20] drm/i915: Remove Gen9 WAs with no effect
  2017-11-03 18:09 ` [RFC PATCH 01/20] drm/i915: Remove Gen9 WAs with no effect Oscar Mateo
@ 2017-11-06 12:40   ` Chris Wilson
  0 siblings, 0 replies; 30+ messages in thread
From: Chris Wilson @ 2017-11-06 12:40 UTC (permalink / raw)
  To: Oscar Mateo, intel-gfx

Quoting Oscar Mateo (2017-11-03 18:09:29)
> GEN8_CONFIG0 (0xD00) is protected by a lock (bit 31) which is set by
> the BIOS, so there is no way we can enable the three chicken bits
> mandated by the WA (the BIOS should be doing it instead).
> 
> v2: Rebased
> v3: Standalone patch
> 
> Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
> Cc: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>

Mika, could you do the honours to get this out of the way?
-Chris

* Re: [RFC PATCH 02/20] drm/i915: Move a bunch of workaround-related code to its own file
  2017-11-03 18:09 ` [RFC PATCH 02/20] drm/i915: Move a bunch of workaround-related code to its own file Oscar Mateo
@ 2017-11-06 12:42   ` Chris Wilson
  2017-11-06 12:47     ` Joonas Lahtinen
  0 siblings, 1 reply; 30+ messages in thread
From: Chris Wilson @ 2017-11-06 12:42 UTC (permalink / raw)
  To: Oscar Mateo, intel-gfx

Quoting Oscar Mateo (2017-11-03 18:09:30)
> This has grown to be a sizable amount of code, so move it to
> its own file before we try to refactor anything. For the moment,
> we are leaving behind the WA BB code and the WAs that get applied
> (incorrectly) in init_clock_gating, but we will deal with it later.
> 
> v2: Use intel_ prefix for code that deals with the hardware (Chris)
> v3: Rebased
> 
> Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
> Cc: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>

I would like to start on this reclassification of w/a early; even just
moving the current set of register writes into the right groups should
be a massive help wrt our confused init ordering.

Anyone object to moving the existing code around before we get the
universal solution?
-Chris

* Re: [RFC PATCH 02/20] drm/i915: Move a bunch of workaround-related code to its own file
  2017-11-06 12:42   ` Chris Wilson
@ 2017-11-06 12:47     ` Joonas Lahtinen
  0 siblings, 0 replies; 30+ messages in thread
From: Joonas Lahtinen @ 2017-11-06 12:47 UTC (permalink / raw)
  To: Chris Wilson, Oscar Mateo, intel-gfx

On Mon, 2017-11-06 at 12:42 +0000, Chris Wilson wrote:
> Quoting Oscar Mateo (2017-11-03 18:09:30)
> > This has grown to be a sizable amount of code, so move it to
> > its own file before we try to refactor anything. For the moment,
> > we are leaving behind the WA BB code and the WAs that get applied
> > (incorrectly) in init_clock_gating, but we will deal with it later.
> > 
> > v2: Use intel_ prefix for code that deals with the hardware (Chris)
> > v3: Rebased
> > 
> > Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
> > Cc: Chris Wilson <chris@chris-wilson.co.uk>
> > Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
> 
> I would like to start on this reclassification of w/a early; even just
> moving the current set of register writes into the right groups should
> be a massive help wrt our confused init ordering.
> 
> Anyone object to moving the existing code around before we get the
> universal solution?

Nope, I was actually going to suggest getting the first patches merged
first. It'll then be easier to spin RFCs.

Regards, Joonas
-- 
Joonas Lahtinen
Open Source Technology Center
Intel Corporation

* Re: [RFC PATCH 04/20] drm/i915: Transform context WAs into static tables
  2017-11-06 11:59   ` Joonas Lahtinen
@ 2017-11-06 18:54     ` Oscar Mateo
  0 siblings, 0 replies; 30+ messages in thread
From: Oscar Mateo @ 2017-11-06 18:54 UTC (permalink / raw)
  To: Joonas Lahtinen, intel-gfx



On 11/06/2017 03:59 AM, Joonas Lahtinen wrote:
> On Fri, 2017-11-03 at 11:09 -0700, Oscar Mateo wrote:
>> This is for WAs that need to touch registers that get saved/restored
>> together with the logical context. The idea is that WAs are "pretty"
>> static, so a table is more declarative than a programmatic approach.
>> Note however that some amount of caching is needed for those things
>> that are dynamic (e.g. things that need some calculation, or have
>> criteria different than the more obvious GEN + stepping).
>>
>> Also, this makes very explicit which WAs live in the context.
>>
>> Suggested-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
>> Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
>> Cc: Chris Wilson <chris@chris-wilson.co.uk>
>> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
> <SNIP>
>
>> +struct i915_wa_reg;
>> +
>> +typedef bool (* wa_pre_hook_func)(struct drm_i915_private *dev_priv,
>> +				  struct i915_wa_reg *wa);
>> +typedef void (* wa_post_hook_func)(struct drm_i915_private *dev_priv,
>> +				   struct i915_wa_reg *wa);
> To avoid carrying any variables over, how about just apply() hook?
> Also, you don't have to have "_hook" going there, it's tak
>

Not all WAs are applied in the same way: ctx-style workarounds are 
emitted as LRI commands to the ring. Do you treat those differently?

>>   struct i915_wa_reg {
>> +	const char *name;
> We may want some Kconfig option for skipping these.

Sure. But we should first decide whether we want to store this at all: 
what do we expect to use it for, and is it worth it?

>> +	enum wa_type {
>> +		I915_WA_TYPE_CONTEXT = 0,
>> +		I915_WA_TYPE_GT,
>> +		I915_WA_TYPE_DISPLAY,
>> +		I915_WA_TYPE_WHITELIST
>> +	} type;
>> +
> Any specific reason not to have the gen here too? Then you can have one
> big table, instead of tables of tables. Then the numeric code of a WA
> (position in that table) would be equally identifying it compared to
> the WA name (which is nice to have information, so config time opt-in).

Such a "big table" would be quite big, indeed. And we know we want to
apply the workarounds from at least four different places, so looping
through the table each and every time to find the relevant WAs seems
like a waste. Also, in some places we would have to loop more than once
(to know the number of WAs to apply before we can reserve space in the
ring for ctx-style WAs, for example).

I could also go for 4 slightly smaller tables (one per type of WA) but
then there is another problem to solve: how do you record WAs that apply
to all revisions of one GEN, but only to some revisions of another?
(e.g. WaDisableFenceDestinationToSLM applies to all BDW steppings but
only to KBL A0).
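
To make the cost concrete, here is a rough sketch of the two passes a
single flat table would need for ctx-style WAs. Everything in it
(struct flat_wa, the function names, the table layout) is hypothetical
and only illustrates the argument; it is not code from the series:

```c
#include <stddef.h>
#include <stdint.h>

enum wa_type {
	WA_TYPE_CONTEXT,
	WA_TYPE_GT,
	WA_TYPE_DISPLAY,
	WA_TYPE_WHITELIST,
};

/* Hypothetical flat-table entry: every WA of every gen in one array. */
struct flat_wa {
	enum wa_type type;
	uint8_t gen;
	uint32_t offset;
	uint32_t value;
};

/* First pass: count the matches so the caller knows what to reserve. */
static size_t flat_wa_count(const struct flat_wa *tbl, size_t len,
			    enum wa_type type, uint8_t gen)
{
	size_t i, n = 0;

	for (i = 0; i < len; i++)
		if (tbl[i].type == type && tbl[i].gen == gen)
			n++;

	return n;
}

/*
 * Second pass: copy the matching offset/value pairs into out[] and
 * return the number of dwords written (two per matching entry).
 */
static size_t flat_wa_collect(const struct flat_wa *tbl, size_t len,
			      enum wa_type type, uint8_t gen, uint32_t *out)
{
	size_t i, n = 0;

	for (i = 0; i < len; i++) {
		if (tbl[i].type != type || tbl[i].gen != gen)
			continue;
		out[n++] = tbl[i].offset;
		out[n++] = tbl[i].value;
	}

	return n;
}
```

Both functions walk the whole table, including all the entries for
other gens and other WA types, which is the waste mentioned above.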

>> +	u8 since;
>> +	u8 until;
> Most seem to have ALL_REVS, so this could be after the coarse-grained
> gen-check in the apply function.

So every single WA that applies to specific REVS gets an "apply"
function? That looks like a lot of functions (I count 25 WAs that only
apply to some steppings already). Or are you simply saying here that I
should check the GEN before checking the stepping (which is the only
order that makes sense anyway)?
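
For reference, a sketch of the generic check when .since/.until stay in
the table, so that no per-WA function is needed. The names (struct
wa_range, wa_applies, the REV_* sentinels, the platform enum) are made
up for illustration, and the example keys on platform rather than GEN
because the BDW/KBL case above needs that granularity:

```c
#include <stdbool.h>
#include <stdint.h>

#define REV_FIRST	0x00
#define REV_LAST	0xff	/* hypothetical "no upper bound" sentinel */
#define ALL_REVS	.since = REV_FIRST, .until = REV_LAST

enum platform { PLAT_BDW, PLAT_KBL };

/* Hypothetical entry: which platform and which inclusive rev range. */
struct wa_range {
	enum platform platform;
	uint8_t since;
	uint8_t until;
};

/* Coarse platform check first, stepping range only if that matches. */
static bool wa_applies(const struct wa_range *wa, enum platform p,
		       uint8_t rev)
{
	if (wa->platform != p)
		return false;

	return rev >= wa->since && rev <= wa->until;
}
```

With this, something like WaDisableFenceDestinationToSLM would be two
table entries, one with ALL_REVS for BDW and one with since == until for
KBL A0, and both go through the same check.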

>> +
>>   	i915_reg_t addr;
>> -	u32 value;
>> -	/* bitmask representing WA bits */
>>   	u32 mask;
>> +	u32 value;
>> +	bool is_masked_reg;
> I'd hide this detail into the apply function.

I see. But if you don't store the mask: what do you output in debugfs?
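
A small model of why the mask matters for verification. The MASK()
macros follow the ones in the patch; masked_write() and wa_verify() are
hypothetical helpers that only simulate the hardware behaviour (upper 16
bits select which of the lower 16 bits a write may change) and the kind
of comparison debugfs would have to do:

```c
#include <stdbool.h>
#include <stdint.h>

#define MASK(mask, value)	((uint32_t)(mask) << 16 | (value))
#define MASK_ENABLE(x)		(MASK((x), (x)))
#define MASK_DISABLE(x)		(MASK((x), 0))

/*
 * Model of a masked register: a write only updates the low bits whose
 * write-enable bit is set in the upper half of the written dword.
 */
static uint32_t masked_write(uint32_t current, uint32_t wr)
{
	uint32_t enable = wr >> 16;

	return (current & ~enable & 0xffff) | (wr & enable);
}

/* Compare only the bits the workaround owns, ignoring everything else. */
static bool wa_verify(uint32_t read, uint32_t mask, uint32_t value)
{
	return (read & mask) == (value & mask);
}
```

Without .mask stored next to .value there is no way to tell which bits
of the read-back belong to the WA, so the check cannot be done
generically.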

>
>> +
>> +	wa_pre_hook_func pre_hook;
>> +	wa_post_hook_func post_hook;
> 	bool (*apply)(const struct i915_wa *wa,
> 		      struct drm_i915_private *dev_priv);
>
>> +	u32 hook_data;
>> +	bool applied;
> The big point would be to make this into const, so "applied" would
> defeat that.

Yeah, I realized. Keeping a separate bitmask of which WAs have been
applied is not a big deal, but then I became aware that there are many
more things that would need to be cached. For example, some WAs require
computing the actual value you write into their register. What do you
do with those? (remember that you still want to print the expected value
in debugfs for these).
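
One possible shape for this, sketched under the assumption that the
table itself stays const and everything mutable moves to a parallel
array. All names here (struct wa_desc, struct wa_state, struct hw_info,
example_compute) are hypothetical and the register write itself is left
out:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for whatever device info a computation needs. */
struct hw_info {
	unsigned int slice_count;
};

/* Const description of a WA; this part could live in .rodata. */
struct wa_desc {
	const char *name;
	uint32_t offset;
	uint32_t mask;
	uint32_t value;	/* used only when compute is NULL */
	uint32_t (*compute)(const struct hw_info *info);
};

/* Mutable per-device state, one slot per descriptor. */
struct wa_state {
	uint32_t value;	/* value actually chosen, kept for debugfs */
	bool applied;
};

/* Example of a value that is not known at table-definition time. */
static uint32_t example_compute(const struct hw_info *info)
{
	return (info->slice_count << 4) & 0xf0;
}

/* Resolve every value once and remember it next to the applied flag. */
static void wa_apply_all(const struct wa_desc *tbl, struct wa_state *st,
			 size_t n, const struct hw_info *info)
{
	size_t i;

	for (i = 0; i < n; i++) {
		st[i].value = tbl[i].compute ? tbl[i].compute(info)
					     : tbl[i].value;
		st[i].applied = true;
	}
}
```

This keeps the table const, but it also shows the concern above: the
cache is no longer one bit per WA, it is at least one dword per WA.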

> <SNIP>
>
>> +#define MASK(mask, value)	((mask) << 16 | (value))
>> +#define MASK_ENABLE(x)		(MASK((x), (x)))
>> +#define MASK_DISABLE(x)		(MASK((x), 0))
>>   
>> -#define WA_REG(addr, mask, val) do { \
>> -		const int r = wa_add(dev_priv, (addr), (mask), (val)); \
>> -		if (r) \
>> -			return r; \
>> -	} while (0)
>> +#define SET_BIT_MASKED(m) 		\
>> +	.mask = (m),			\
>> +	.value = MASK_ENABLE(m),	\
>> +	.is_masked_reg = true
>>   
>> -#define WA_SET_BIT_MASKED(addr, mask) \
>> -	WA_REG(addr, (mask), _MASKED_BIT_ENABLE(mask))
>> +#define CLEAR_BIT_MASKED( m) 		\
>> +	.mask = (m),			\
>> +	.value = MASK_DISABLE(m),	\
>> +	.is_masked_reg = true
>>   
>> -#define WA_CLR_BIT_MASKED(addr, mask) \
>> -	WA_REG(addr, (mask), _MASKED_BIT_DISABLE(mask))
>> +#define SET_FIELD_MASKED(m, v) 		\
>> +	.mask = (m),			\
>> +	.value = MASK(m, v),		\
>> +	.is_masked_reg = true
> Lets try to have the struct i915_wa as small as possible, so this could
> be calculated in the apply function.
>
> So, avoiding the macros this would indeed become rather declarative;
>
> {
> 	WA_NAME("WaDisableAsyncFlipPerfMode")
> 	.gen = ...,
> 	.reg = MI_MODE,
> 	.value = ASYNC_FLIP_PERF_DISABLE,
> 	.apply = set_bit_masked,
> },
> Or, we could also have;
>
> static const struct i915_wa WaDisableAsyncFlipPerfMode = {
> 	.gen = ...,
> 	.reg = MI_MODE,
> 	.value = ASYNC_FLIP_PERF_DISABLE,
> 	.apply = set_bit_masked,
> };
>
> And then one array of those.
>
> 	WA(WaDisableAsyncFlipPerfMode),

This is the list of problems we need to solve before we can go forward 
with this design:

- What to do with WAs that don't know a priori what .value should be,
because it gets computed in places like skl_tune_iz_hashing or
use_gtt_cache? (Yes, computing it in the apply function is the immediate
answer, but then how do you output that in debugfs?)
- What to do with context-style WAs, which are emitted instead of
applied, as I mentioned above?
- What to do with whitelist-style WAs, where you need to access
the .reg field of i915_reg_t to know the .value? Also, the .reg depends
on the engine (although I guess you can always statically codify that in
the table and apply the whitelist WAs later, once all the engines are up).
- You are not storing .since/.until. Does that mean every WA that
applies to only some steppings gets a custom apply function?
- If you don't store the computed mask anywhere, what do you output in
debugfs? (That output is the real improvement we want to achieve.)
- Something to be careful about: some WAs are named the same, but their
reg/value is different (because the register has changed in one
particular GEN or whatever). The solution could be a modifier to the
name (WaSomething_bdw_chv and WaSomething_skl) but this could be a
source of errors.
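
To illustrate the whitelist item in the list above, a sketch of why the
value cannot simply be a constant in the table. The i915_reg_t wrapper
mirrors the kernel's type; the slot layout (engine base + 0x4d0, four
bytes per slot) is quoted from memory of RING_FORCE_TO_NONPRIV and
should be treated as an assumption, and struct whitelist_write and
whitelist_entry() are hypothetical:

```c
#include <stdint.h>

/* Mirrors the kernel's typed register wrapper. */
typedef struct {
	uint32_t reg;
} i915_reg_t;

/* Same idea as i915_mmio_reg_offset(): unwrap the raw MMIO offset. */
static inline uint32_t mmio_offset(i915_reg_t r)
{
	return r.reg;
}

/* Assumed layout of the per-engine whitelist slots. */
#define FORCE_TO_NONPRIV_SLOT(base, i)	((base) + 0x4d0 + (i) * 4)

/* What ends up being written: which slot, and with which value. */
struct whitelist_write {
	uint32_t slot_offset;
	uint32_t value;
};

/*
 * The register being written depends on the engine and the slot index,
 * and the value written is the offset of the register to whitelist.
 */
static struct whitelist_write whitelist_entry(uint32_t engine_base,
					      unsigned int index,
					      i915_reg_t target)
{
	struct whitelist_write w = {
		FORCE_TO_NONPRIV_SLOT(engine_base, index),
		mmio_offset(target),
	};

	return w;
}
```

So for this type of WA the table entry is really "which register to
expose", and both .reg and .value of the actual write are derived from
it at apply time.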

> Then you could at compile time decide if you stringify and store the
> name. But that'd be more const data than necessary (pointers to
> structs, instead of an array of structs).
>
> Regards, Joonas

One more thing: I still urge you to reconsider merging what we already
have and doing these improvements (once we agree on a design) later on.
The reason is that the sooner we get a list of all WAs in debugfs, the
better: it can be used later on to verify any further improvements we
make.

Thanks for the review,
Oscar


end of thread, other threads:[~2017-11-06 18:53 UTC | newest]

Thread overview: 30+ messages
2017-11-03 18:09 [RFC PATCH v5 00/20] Refactor HW workaround code Oscar Mateo
2017-11-03 18:09 ` [RFC PATCH 01/20] drm/i915: Remove Gen9 WAs with no effect Oscar Mateo
2017-11-06 12:40   ` Chris Wilson
2017-11-03 18:09 ` [RFC PATCH 02/20] drm/i915: Move a bunch of workaround-related code to its own file Oscar Mateo
2017-11-06 12:42   ` Chris Wilson
2017-11-06 12:47     ` Joonas Lahtinen
2017-11-03 18:09 ` [RFC PATCH 03/20] drm/i915: Split out functions for different kinds of workarounds Oscar Mateo
2017-11-03 18:09 ` [RFC PATCH 04/20] drm/i915: Transform context WAs into static tables Oscar Mateo
2017-11-06 11:59   ` Joonas Lahtinen
2017-11-06 18:54     ` Oscar Mateo
2017-11-03 18:09 ` [RFC PATCH 05/20] drm/i915: Transform GT " Oscar Mateo
2017-11-03 18:09 ` [RFC PATCH 06/20] drm/i915: Transform Whitelist " Oscar Mateo
2017-11-03 20:43   ` [RFC PATCH v2] " Oscar Mateo
2017-11-03 18:09 ` [RFC PATCH 07/20] drm/i915: Create a new category of display WAs Oscar Mateo
2017-11-03 18:09 ` [RFC PATCH 08/20] drm/i915: Print all workaround types correctly in debugfs Oscar Mateo
2017-11-03 18:09 ` [RFC PATCH 09/20] drm/i915: Do not store the total counts of WAs Oscar Mateo
2017-11-03 18:09 ` [RFC PATCH 10/20] drm/i915: Move WA BB stuff to the workarounds file as well Oscar Mateo
2017-11-03 18:09 ` [RFC PATCH 11/20] drm/i915/cnl: Move GT and Display workarounds from init_clock_gating Oscar Mateo
2017-11-03 18:09 ` [RFC PATCH 12/20] drm/i915/gen9: " Oscar Mateo
2017-11-03 18:09 ` [RFC PATCH 13/20] drm/i915/cfl: " Oscar Mateo
2017-11-03 18:09 ` [RFC PATCH 14/20] drm/i915/glk: " Oscar Mateo
2017-11-03 18:09 ` [RFC PATCH 15/20] drm/i915/kbl: " Oscar Mateo
2017-11-03 18:09 ` [RFC PATCH 16/20] drm/i915/bxt: " Oscar Mateo
2017-11-03 18:09 ` [RFC PATCH 17/20] drm/i915/skl: " Oscar Mateo
2017-11-03 18:09 ` [RFC PATCH 18/20] drm/i915/chv: " Oscar Mateo
2017-11-03 18:09 ` [RFC PATCH 19/20] drm/i915/bdw: " Oscar Mateo
2017-11-03 18:09 ` [RFC PATCH 20/20] drm/i915: Document the i915_workarounds file Oscar Mateo
2017-11-03 18:37 ` ✗ Fi.CI.BAT: failure for Refactor HW workaround code (rev5) Patchwork
2017-11-03 21:05 ` ✗ Fi.CI.BAT: failure for Refactor HW workaround code (rev6) Patchwork
2017-11-03 21:51 ` [RFC PATCH v5 00/20] Refactor HW workaround code Oscar Mateo
