* [PATCH 0/2] Add BDW workarounds using golden render state
@ 2014-08-20 14:19 Arun Siluvery
  2014-08-20 14:19 ` [PATCH 1/2] drm/i915/bdw: Apply workarounds using the " Arun Siluvery
  2014-08-20 14:19 ` [PATCH 2/2] drm/i915/bdw: Extract workaround registers data from " Arun Siluvery
  0 siblings, 2 replies; 10+ messages in thread
From: Arun Siluvery @ 2014-08-20 14:19 UTC (permalink / raw)
  To: intel-gfx

In this series the workarounds for BDW are applied using the golden render
state. Only those registers that are part of the register state context are
added to this batch. The remaining workarounds stay where they are now, in
init_clock_gating(), because they are not affected by a GPU reset. I can send
another patch that moves them to the render ring init function, but during
testing I found that their state does not change after a reset.

Arun Siluvery (2):
  drm/i915/bdw: Apply workarounds using the golden render state
  drm/i915/bdw: Extract workaround registers data from golden render
    state

 drivers/gpu/drm/i915/i915_debugfs.c           | 35 ++++++++++
 drivers/gpu/drm/i915/i915_drv.h               | 14 ++++
 drivers/gpu/drm/i915/i915_gem_render_state.c  | 51 ++++++++++++++
 drivers/gpu/drm/i915/intel_pm.c               | 49 --------------
 drivers/gpu/drm/i915/intel_renderstate_gen8.c | 95 ++++++++++++++++++++-------
 5 files changed, 172 insertions(+), 72 deletions(-)

-- 
2.0.4


* [PATCH 1/2] drm/i915/bdw: Apply workarounds using the golden render state
  2014-08-20 14:19 [PATCH 0/2] Add BDW workarounds using golden render state Arun Siluvery
@ 2014-08-20 14:19 ` Arun Siluvery
  2014-08-20 14:24   ` Chris Wilson
  2014-08-20 14:52   ` Ville Syrjälä
  2014-08-20 14:19 ` [PATCH 2/2] drm/i915/bdw: Extract workaround registers data from " Arun Siluvery
  1 sibling, 2 replies; 10+ messages in thread
From: Arun Siluvery @ 2014-08-20 14:19 UTC (permalink / raw)
  To: intel-gfx

Workarounds for bdw are currently applied in init_clock_gating(), but they
are lost following a GPU reset. Some of the WA registers are part of the
register state context and are restored with every context switch, so
initializing them in the golden render state ensures that they are applied
even when we start with an uninitialized context or during hw initialization
following a reset.

v2: Add comments corresponding to WAs in golden render state (Chris).

Generating the render state is not a straightforward process. It would
be ideal to add the WA values during state setup rather than with a tool,
but that would be a follow-up patch.

Signed-off-by: Arun Siluvery <arun.siluvery@linux.intel.com>
---
 drivers/gpu/drm/i915/intel_pm.c               | 49 --------------
 drivers/gpu/drm/i915/intel_renderstate_gen8.c | 95 ++++++++++++++++++++-------
 2 files changed, 72 insertions(+), 72 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index c8f744c..bcae3dc 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -5507,101 +5507,52 @@ static void gen8_init_clock_gating(struct drm_device *dev)
 	struct drm_i915_private *dev_priv = dev->dev_private;
 	enum pipe pipe;
 
 	I915_WRITE(WM3_LP_ILK, 0);
 	I915_WRITE(WM2_LP_ILK, 0);
 	I915_WRITE(WM1_LP_ILK, 0);
 
 	/* FIXME(BDW): Check all the w/a, some might only apply to
 	 * pre-production hw. */
 
-	/* WaDisablePartialInstShootdown:bdw */
-	I915_WRITE(GEN8_ROW_CHICKEN,
-		   _MASKED_BIT_ENABLE(PARTIAL_INSTRUCTION_SHOOTDOWN_DISABLE));
-
-	/* WaDisableThreadStallDopClockGating:bdw */
-	/* FIXME: Unclear whether we really need this on production bdw. */
-	I915_WRITE(GEN8_ROW_CHICKEN,
-		   _MASKED_BIT_ENABLE(STALL_DOP_GATING_DISABLE));
-
-	/*
-	 * This GEN8_CENTROID_PIXEL_OPT_DIS W/A is only needed for
-	 * pre-production hardware
-	 */
-	I915_WRITE(HALF_SLICE_CHICKEN3,
-		   _MASKED_BIT_ENABLE(GEN8_CENTROID_PIXEL_OPT_DIS));
-	I915_WRITE(HALF_SLICE_CHICKEN3,
-		   _MASKED_BIT_ENABLE(GEN8_SAMPLER_POWER_BYPASS_DIS));
 	I915_WRITE(GAMTARBMODE, _MASKED_BIT_ENABLE(ARB_MODE_BWGTLB_DISABLE));
 
 	I915_WRITE(_3D_CHICKEN3,
 		   _MASKED_BIT_ENABLE(_3D_CHICKEN_SDE_LIMIT_FIFO_POLY_DEPTH(2)));
 
-	I915_WRITE(COMMON_SLICE_CHICKEN2,
-		   _MASKED_BIT_ENABLE(GEN8_CSC2_SBE_VUE_CACHE_CONSERVATIVE));
-
-	I915_WRITE(GEN7_HALF_SLICE_CHICKEN1,
-		   _MASKED_BIT_ENABLE(GEN7_SINGLE_SUBSCAN_DISPATCH_ENABLE));
-
-	/* WaDisableDopClockGating:bdw May not be needed for production */
-	I915_WRITE(GEN7_ROW_CHICKEN2,
-		   _MASKED_BIT_ENABLE(DOP_CLOCK_GATING_DISABLE));
-
 	/* WaSwitchSolVfFArbitrationPriority:bdw */
 	I915_WRITE(GAM_ECOCHK, I915_READ(GAM_ECOCHK) | HSW_ECOCHK_ARB_PRIO_SOL);
 
 	/* WaPsrDPAMaskVBlankInSRD:bdw */
 	I915_WRITE(CHICKEN_PAR1_1,
 		   I915_READ(CHICKEN_PAR1_1) | DPA_MASK_VBLANK_SRD);
 
 	/* WaPsrDPRSUnmaskVBlankInSRD:bdw */
 	for_each_pipe(pipe) {
 		I915_WRITE(CHICKEN_PIPESL_1(pipe),
 			   I915_READ(CHICKEN_PIPESL_1(pipe)) |
 			   BDW_DPRS_MASK_VBLANK_SRD);
 	}
 
-	/* Use Force Non-Coherent whenever executing a 3D context. This is a
-	 * workaround for for a possible hang in the unlikely event a TLB
-	 * invalidation occurs during a PSD flush.
-	 */
-	I915_WRITE(HDC_CHICKEN0,
-		   I915_READ(HDC_CHICKEN0) |
-		   _MASKED_BIT_ENABLE(HDC_FORCE_NON_COHERENT));
-
 	/* WaVSRefCountFullforceMissDisable:bdw */
 	/* WaDSRefCountFullforceMissDisable:bdw */
 	I915_WRITE(GEN7_FF_THREAD_MODE,
 		   I915_READ(GEN7_FF_THREAD_MODE) &
 		   ~(GEN8_FF_DS_REF_CNT_FFME | GEN7_FF_VS_REF_CNT_FFME));
 
-	/*
-	 * BSpec recommends 8x4 when MSAA is used,
-	 * however in practice 16x4 seems fastest.
-	 *
-	 * Note that PS/WM thread counts depend on the WIZ hashing
-	 * disable bit, which we don't touch here, but it's good
-	 * to keep in mind (see 3DSTATE_PS and 3DSTATE_WM).
-	 */
-	I915_WRITE(GEN7_GT_MODE,
-		   GEN6_WIZ_HASHING_MASK | GEN6_WIZ_HASHING_16x4);
 
 	I915_WRITE(GEN6_RC_SLEEP_PSMI_CONTROL,
 		   _MASKED_BIT_ENABLE(GEN8_RC_SEMA_IDLE_MSG_DISABLE));
 
 	/* WaDisableSDEUnitClockGating:bdw */
 	I915_WRITE(GEN8_UCGCTL6, I915_READ(GEN8_UCGCTL6) |
 		   GEN8_SDEUNIT_CLOCK_GATE_DISABLE);
-
-	/* Wa4x4STCOptimizationDisable:bdw */
-	I915_WRITE(CACHE_MODE_1,
-		   _MASKED_BIT_ENABLE(GEN8_4x4_STC_OPTIMIZATION_DISABLE));
 }
 
 static void haswell_init_clock_gating(struct drm_device *dev)
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
 
 	ilk_init_lp_watermarks(dev);
 
 	/* L3 caching of data atomics doesn't work -- disable it. */
 	I915_WRITE(HSW_SCRATCH1, HSW_SCRATCH1_L3_DATA_ATOMICS_DISABLE);
diff --git a/drivers/gpu/drm/i915/intel_renderstate_gen8.c b/drivers/gpu/drm/i915/intel_renderstate_gen8.c
index 75ef1b5d..617be0f 100644
--- a/drivers/gpu/drm/i915/intel_renderstate_gen8.c
+++ b/drivers/gpu/drm/i915/intel_renderstate_gen8.c
@@ -1,21 +1,78 @@
 #include "intel_renderstate.h"
 
 static const u32 gen8_null_state_relocs[] = {
-	0x00000048,
-	0x00000050,
-	0x00000060,
-	0x000003ec,
+	0x000000a8,
+	0x000000b0,
+	0x000000c0,
+	0x0000044c,
 	-1,
 };
 
 static const u32 gen8_null_state_batch[] = {
+	0x11000001,	/* Apply workarounds - start */
+	/* GEN8_ROW_CHICKEN
+	 * WaDisablePartialInstShootdown:bdw
+	 * WaDisableThreadStallDopClockGating:bdw
+	 */
+	0x0000e4f0,
+	0x83208320,
+	0x11000001,
+	/* GEN7_ROW_CHICKEN2
+	 * WaDisableDopClockGating:bdw, may not be needed for production.
+	 */
+	0x0000e4f4,
+	0x00010001,
+	0x11000001,
+	/* HALF_SLICE_CHICKEN3
+	 * This GEN8_CENTROID_PIXEL_OPT_DIS W/A is only needed for
+	 * pre-production hardware
+	 */
+	0x0000e184,
+	0x01020102,
+	0x11000001,
+	/* GEN7_HALF_SLICE_CHICKEN1
+	 * Wa: GEN7_SINGLE_SUBSCAN_DISPATCH_ENABLE
+	 */
+	0x0000e100,
+	0x04000400,
+	0x11000001,
+	/* COMMON_SLICE_CHICKEN2
+	 * Wa: GEN8_CSC2_SBE_VUE_CACHE_CONSERVATIVE
+	 */
+	0x00007014,
+	0x00010001,
+	0x11000001,
+	/* HDC_CHICKEN0
+	 * Use Force Non-Coherent whenever executing a 3D context. This is a
+	 * workaround for a possible hang in the unlikely event a TLB
+	 * invalidation occurs during a PSD flush.
+	 */
+	0x00007300,
+	0x00100010,
+	0x11000001,
+	/* CACHE_MODE_1
+	 * Wa4x4STCOptimizationDisable:bdw
+	 */
+	0x00007004,
+	0x00400040,
+	0x11000001,
+	/*
+	 * BSpec recommends 8x4 when MSAA is used,
+	 * however in practice 16x4 seems fastest.
+	 *
+	 * Note that PS/WM thread counts depend on the WIZ hashing
+	 * disable bit, which we don't touch here, but it's good
+	 * to keep in mind (see 3DSTATE_PS and 3DSTATE_WM).
+	 */
+	0x00007008,
+	0x02800200, /* Apply workarounds - end */
 	0x69040000,
 	0x61020001,
 	0x00000000,
 	0x00000000,
 	0x79120000,
 	0x00000000,
 	0x79130000,
 	0x00000000,
 	0x79140000,
 	0x00000000,
@@ -33,35 +90,35 @@ static const u32 gen8_null_state_batch[] = {
 	0x00000000,
 	0x00000000,
 	0x00000000,
 	0x00000001,	 /* reloc */
 	0x00000000,
 	0xfffff001,
 	0x00001001,
 	0xfffff001,
 	0x00001001,
 	0x78230000,
-	0x000006e0,
+	0x00000720,
 	0x78210000,
-	0x00000700,
+	0x00000740,
 	0x78300000,
 	0x08010040,
 	0x78330000,
 	0x08000000,
 	0x78310000,
 	0x08000000,
 	0x78320000,
 	0x08000000,
 	0x78240000,
-	0x00000641,
+	0x00000681,
 	0x780e0000,
-	0x00000601,
+	0x00000641,
 	0x780d0000,
 	0x00000000,
 	0x78180000,
 	0x00000001,
 	0x78520003,
 	0x00000000,
 	0x00000000,
 	0x00000000,
 	0x00000000,
 	0x78190009,
@@ -192,54 +249,54 @@ static const u32 gen8_null_state_batch[] = {
 	0x78500003,
 	0x00210000,
 	0x00000000,
 	0x00000000,
 	0x00000000,
 	0x78130002,
 	0x00000000,
 	0x00000000,
 	0x00000000,
 	0x782a0000,
-	0x00000480,
+	0x000004c0,
 	0x782f0000,
-	0x00000540,
+	0x00000580,
 	0x78140000,
 	0x00000800,
 	0x78170009,
 	0x00000000,
 	0x00000000,
 	0x00000000,
 	0x00000000,
 	0x00000000,
 	0x00000000,
 	0x00000000,
 	0x00000000,
 	0x00000000,
 	0x00000000,
 	0x7820000a,
-	0x00000580,
+	0x000005c0,
 	0x00000000,
 	0x08080000,
 	0x00000000,
 	0x00000000,
 	0x1f000002,
 	0x00060000,
 	0x00000000,
 	0x00000000,
 	0x00000000,
 	0x00000000,
 	0x784d0000,
 	0x40000000,
 	0x784f0000,
 	0x80000100,
 	0x780f0000,
-	0x00000740,
+	0x00000780,
 	0x78050006,
 	0x00000000,
 	0x00000000,
 	0x00000000,
 	0x00000000,
 	0x00000000,
 	0x00000000,
 	0x00000000,
 	0x78070003,
 	0x00000000,
@@ -253,21 +310,21 @@ static const u32 gen8_null_state_batch[] = {
 	0x00000000,
 	0x78040001,
 	0x00000000,
 	0x00000001,
 	0x79000002,
 	0xffffffff,
 	0x00000000,
 	0x00000000,
 	0x78080003,
 	0x00006000,
-	0x000005e0,	 /* reloc */
+	0x00000620,	 /* reloc */
 	0x00000000,
 	0x00000000,
 	0x78090005,
 	0x02000000,
 	0x22220000,
 	0x02f60000,
 	0x11230000,
 	0x02850004,
 	0x11230000,
 	0x784b0000,
@@ -282,30 +339,22 @@ static const u32 gen8_null_state_batch[] = {
 	0x00000001,
 	0x00000000,
 	0x00000000,
 	0x05000000,	 /* cmds end */
 	0x00000000,
 	0x00000000,
 	0x00000000,
 	0x00000000,
 	0x00000000,
 	0x00000000,
-	0x00000000,
-	0x00000000,
-	0x00000000,
-	0x00000000,
-	0x00000000,
-	0x00000000,
-	0x00000000,
-	0x00000000,
-	0x000004c0,	 /* state start */
-	0x00000500,
+	0x00000500,	 /* state start */
+	0x00000540,
 	0x00000000,
 	0x00000000,
 	0x00000000,
 	0x00000000,
 	0x00000000,
 	0x00000000,
 	0x00000000,
 	0x00000000,
 	0x00000000,
 	0x00000000,
-- 
2.0.4
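
For readers decoding the new dwords at the top of gen8_null_state_batch[]:
each workaround is a three-dword group (MI_LOAD_REGISTER_IMM(1), register
offset, value), and the value follows the masked-register convention that
_MASKED_BIT_ENABLE() produces. A minimal sketch follows. The macros mirror
the i915 definitions as far as I know them; struct masked_write and
decode_masked() are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* MI command header: opcode in bits 28:23, length field in the low bits. */
#define MI_INSTR(opcode, flags)  (((uint32_t)(opcode) << 23) | (flags))
/* Load n registers with immediates; n = 1 encodes as 0x11000001. */
#define MI_LOAD_REGISTER_IMM(n)  MI_INSTR(0x22, 2 * (n) - 1)

/* Masked registers: the high 16 bits select which low 16 bits get written. */
#define MASKED_BIT_ENABLE(a)     (((uint32_t)(a) << 16) | (a))

/* Hypothetical helper type: the two halves of a masked-register value. */
struct masked_write {
	uint32_t write_mask;	/* bits the hardware will update */
	uint32_t bits;		/* values written to those bits */
};

/* Split a value dword from the batch into its mask and data halves. */
static struct masked_write decode_masked(uint32_t dw)
{
	struct masked_write w = { dw >> 16, dw & 0xffff };
	return w;
}
```

For example, decode_masked(0x02800200) gives write_mask 0x0280 and bits
0x0200, which is the last workaround entry (offset 0x7008) in the batch.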


* [PATCH 2/2] drm/i915/bdw: Extract workaround registers data from golden render state
  2014-08-20 14:19 [PATCH 0/2] Add BDW workarounds using golden render state Arun Siluvery
  2014-08-20 14:19 ` [PATCH 1/2] drm/i915/bdw: Apply workarounds using the " Arun Siluvery
@ 2014-08-20 14:19 ` Arun Siluvery
  1 sibling, 0 replies; 10+ messages in thread
From: Arun Siluvery @ 2014-08-20 14:19 UTC (permalink / raw)
  To: intel-gfx

Workarounds are applied using the golden render state, and they are placed
at the beginning of its batch buffer. They are essentially register updates,
and we use this fact to extract them and generate a list of the WAs applied.
This list is also exported via a debugfs file, and it is used to validate
their status before and after a test condition (e.g. reset, suspend/resume).
This patch is only required to support testing.

Signed-off-by: Arun Siluvery <arun.siluvery@linux.intel.com>
---
 drivers/gpu/drm/i915/i915_debugfs.c          | 35 +++++++++++++++++++
 drivers/gpu/drm/i915/i915_drv.h              | 14 ++++++++
 drivers/gpu/drm/i915/i915_gem_render_state.c | 51 ++++++++++++++++++++++++++++
 3 files changed, 100 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index d42db6b..c1d4e6b 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -2451,20 +2451,54 @@ static int i915_shared_dplls_info(struct seq_file *m, void *unused)
 		seq_printf(m, " dpll_md: 0x%08x\n", pll->hw_state.dpll_md);
 		seq_printf(m, " fp0:     0x%08x\n", pll->hw_state.fp0);
 		seq_printf(m, " fp1:     0x%08x\n", pll->hw_state.fp1);
 		seq_printf(m, " wrpll:   0x%08x\n", pll->hw_state.wrpll);
 	}
 	drm_modeset_unlock_all(dev);
 
 	return 0;
 }
 
+static int intel_wa_registers(struct seq_file *m, void *unused)
+{
+	struct drm_info_node *node = (struct drm_info_node *) m->private;
+	struct drm_device *dev = node->minor->dev;
+	struct drm_i915_private *dev_priv = dev->dev_private;
+	int i;
+	int ret;
+
+	if (!IS_BROADWELL(dev)) {
+		DRM_DEBUG_DRIVER("Workaround table not available\n");
+		return -EINVAL;
+	}
+
+	ret = mutex_lock_interruptible(&dev->struct_mutex);
+	if (ret)
+		return ret;
+
+	intel_runtime_pm_get(dev_priv);
+
+	seq_printf(m, "Workarounds applied: %d\n", dev_priv->num_wa_regs);
+	for (i = 0; i < dev_priv->num_wa_regs; ++i) {
+		if (dev_priv->intel_wa_regs[i].addr)
+			seq_printf(m, "0x%X: 0x%08X, mask: 0x%08X\n",
+				   dev_priv->intel_wa_regs[i].addr,
+				   dev_priv->intel_wa_regs[i].value,
+				   dev_priv->intel_wa_regs[i].mask);
+	}
+
+	intel_runtime_pm_put(dev_priv);
+	mutex_unlock(&dev->struct_mutex);
+
+	return 0;
+}
+
 struct pipe_crc_info {
 	const char *name;
 	struct drm_device *dev;
 	enum pipe pipe;
 };
 
 static int i915_dp_mst_info(struct seq_file *m, void *unused)
 {
 	struct drm_info_node *node = (struct drm_info_node *) m->private;
 	struct drm_device *dev = node->minor->dev;
@@ -3980,20 +4014,21 @@ static const struct drm_info_list i915_debugfs_list[] = {
 	{"i915_llc", i915_llc, 0},
 	{"i915_edp_psr_status", i915_edp_psr_status, 0},
 	{"i915_sink_crc_eDP1", i915_sink_crc, 0},
 	{"i915_energy_uJ", i915_energy_uJ, 0},
 	{"i915_pc8_status", i915_pc8_status, 0},
 	{"i915_power_domain_info", i915_power_domain_info, 0},
 	{"i915_display_info", i915_display_info, 0},
 	{"i915_semaphore_status", i915_semaphore_status, 0},
 	{"i915_shared_dplls_info", i915_shared_dplls_info, 0},
 	{"i915_dp_mst_info", i915_dp_mst_info, 0},
+	{"intel_wa_registers", intel_wa_registers, 0},
 };
 #define I915_DEBUGFS_ENTRIES ARRAY_SIZE(i915_debugfs_list)
 
 static const struct i915_debugfs_files {
 	const char *name;
 	const struct file_operations *fops;
 } i915_debugfs_files[] = {
 	{"i915_wedged", &i915_wedged_fops},
 	{"i915_max_freq", &i915_max_freq_fops},
 	{"i915_min_freq", &i915_min_freq_fops},
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index bcf79f0..2fc34bd 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1546,20 +1546,34 @@ struct drm_i915_private {
 	wait_queue_head_t pending_flip_queue;
 
 #ifdef CONFIG_DEBUG_FS
 	struct intel_pipe_crc pipe_crc[I915_MAX_PIPES];
 #endif
 
 	int num_shared_dpll;
 	struct intel_shared_dpll shared_dplls[I915_NUM_PLLS];
 	int dpio_phy_iosf_port[I915_NUM_PHYS_VLV];
 
+	/*
+	 * workarounds are currently applied using golden render state
+	 * and the exact count is known only after parsing its content;
+	 * this patch is mainly for i-g-t so use a fixed number for now.
+	 */
+#define I915_MAX_WA_REGS  16
+	struct {
+		u32 addr;
+		u32 value;
+		/* bitmask representing WA bits */
+		u32 mask;
+	} intel_wa_regs[I915_MAX_WA_REGS];
+	u32 num_wa_regs;
+
 	/* Reclocking support */
 	bool render_reclock_avail;
 	bool lvds_downclock_avail;
 	/* indicates the reduced downclock for LVDS*/
 	int lvds_downclock;
 
 	struct i915_frontbuffer_tracking fb_tracking;
 
 	u16 orig_clock;
 
diff --git a/drivers/gpu/drm/i915/i915_gem_render_state.c b/drivers/gpu/drm/i915/i915_gem_render_state.c
index e60be3f..ef37122 100644
--- a/drivers/gpu/drm/i915/i915_gem_render_state.c
+++ b/drivers/gpu/drm/i915/i915_gem_render_state.c
@@ -120,20 +120,65 @@ static int render_state_setup(struct render_state *so)
 		return ret;
 
 	if (rodata->reloc[reloc_index] != -1) {
 		DRM_ERROR("only %d relocs resolved\n", reloc_index);
 		return -EINVAL;
 	}
 
 	return 0;
 }
 
+static int bdw_init_workarounds_data(struct render_state *so,
+				    struct intel_engine_cs *ring)
+{
+	struct drm_i915_private *dev_priv = ring->dev->dev_private;
+	const struct intel_renderstate_rodata *rodata = so->rodata;
+	unsigned int i = 0;
+	int ret = 0;
+
+	dev_priv->num_wa_regs = 0;
+
+	while (i < rodata->batch_items) {
+		u32 addr, value, mask;
+		u32 s = rodata->batch[i];
+
+		/*
+		 * Workarounds are placed at the beginning of
+		 * golden render state and they are essentially
+		 * register updates
+		 */
+		if (s != MI_LOAD_REGISTER_IMM(1)) {
+			ret = -EINVAL;
+			break;
+		}
+
+		addr = rodata->batch[i + 1];
+		mask = rodata->batch[i + 2] & 0xFFFF;
+		value = I915_READ(addr) | mask;
+
+		dev_priv->intel_wa_regs[dev_priv->num_wa_regs].addr = addr;
+		dev_priv->intel_wa_regs[dev_priv->num_wa_regs].value = value;
+		dev_priv->intel_wa_regs[dev_priv->num_wa_regs].mask = mask;
+
+		i += 3;
+		dev_priv->num_wa_regs++;
+	}
+
+	if (dev_priv->num_wa_regs > I915_MAX_WA_REGS)
+		dev_priv->num_wa_regs = I915_MAX_WA_REGS;
+
+	DRM_DEBUG_DRIVER("No. of workarounds found in render state: %d\n",
+			 dev_priv->num_wa_regs);
+
+	return ret;
+}
+
 static void render_state_fini(struct render_state *so)
 {
 	i915_gem_object_ggtt_unpin(so->obj);
 	drm_gem_object_unreference(&so->obj->base);
 }
 
 int i915_gem_render_state_init(struct intel_engine_cs *ring)
 {
 	struct render_state so;
 	int ret;
@@ -156,14 +201,20 @@ int i915_gem_render_state_init(struct intel_engine_cs *ring)
 					so.ggtt_offset,
 					so.rodata->batch_items * 4,
 					I915_DISPATCH_SECURE);
 	if (ret)
 		goto out;
 
 	i915_vma_move_to_active(i915_gem_obj_to_ggtt(so.obj), ring);
 
 	ret = __i915_add_request(ring, NULL, so.obj, NULL);
 	/* __i915_add_request moves object to inactive if it fails */
+
+	if (!ret) {
+		if (bdw_init_workarounds_data(&so, ring))
+			DRM_DEBUG_DRIVER("Workarounds data not found !!\n");
+	}
+
 out:
 	render_state_fini(&so);
 	return ret;
 }
-- 
2.0.4
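
The extraction loop can also be written with explicit bounds, so that it
stops at the first non-LRI dword, never reads past the end of the batch, and
never stores more entries than the table holds. A minimal standalone sketch,
assuming the triplet layout from patch 1. collect_wa_regs(),
wa_still_applied() and struct wa_reg are hypothetical names, and the register
read that the patch does with I915_READ() is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define LRI_1        0x11000001u	/* MI_LOAD_REGISTER_IMM(1) */
#define MAX_WA_REGS  16			/* same limit as I915_MAX_WA_REGS */

/* Hypothetical table entry: register offset and the WA bits set in it. */
struct wa_reg {
	uint32_t addr;
	uint32_t mask;
};

/*
 * Collect the LRI triplets at the start of a batch.
 * Stops at the first dword that is not an LRI header, when fewer than
 * three dwords remain, or when the output table is full.
 * Returns the number of entries stored in out[].
 */
static size_t collect_wa_regs(const uint32_t *batch, size_t items,
			      struct wa_reg *out, size_t max)
{
	size_t i = 0, n = 0;

	while (n < max && i + 2 < items && batch[i] == LRI_1) {
		out[n].addr = batch[i + 1];
		out[n].mask = batch[i + 2] & 0xffff;
		n++;
		i += 3;
	}
	return n;
}

/* A workaround is still in place if every one of its bits is still set. */
static bool wa_still_applied(uint32_t current, uint32_t mask)
{
	return (current & mask) == mask;
}
```

A test would call collect_wa_regs() once, then compare each register read
back after a reset or resume against the stored mask with
wa_still_applied().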


* Re: [PATCH 1/2] drm/i915/bdw: Apply workarounds using the golden render state
  2014-08-20 14:19 ` [PATCH 1/2] drm/i915/bdw: Apply workarounds using the " Arun Siluvery
@ 2014-08-20 14:24   ` Chris Wilson
  2014-08-20 14:52   ` Ville Syrjälä
  1 sibling, 0 replies; 10+ messages in thread
From: Chris Wilson @ 2014-08-20 14:24 UTC (permalink / raw)
  To: Arun Siluvery; +Cc: intel-gfx

On Wed, Aug 20, 2014 at 03:19:17PM +0100, Arun Siluvery wrote:
> Workarounds for bdw are currently applied in init_clock_gating() but they
> are lost following a gpu reset. Some of the WA registers are part of register
> state context and they are restored with every context switch so initializing
> them in golden render state ensures that they are applied even when we start
> with an uninitialized context or during hw initialization followed by a reset.
> 
> v2: Add comments corresponding to WAs in golden render state (Chris).
> 
> The generation of render state is not a straightforward process, it would
> be ideal to augment WA values from during the setup state as opposed to
> using a tool but that would be a follow up patch.

So far we have put those wa into the render ring init.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre


* Re: [PATCH 1/2] drm/i915/bdw: Apply workarounds using the golden render state
  2014-08-20 14:19 ` [PATCH 1/2] drm/i915/bdw: Apply workarounds using the " Arun Siluvery
  2014-08-20 14:24   ` Chris Wilson
@ 2014-08-20 14:52   ` Ville Syrjälä
  2014-08-22 11:06     ` Mika Kuoppala
  1 sibling, 1 reply; 10+ messages in thread
From: Ville Syrjälä @ 2014-08-20 14:52 UTC (permalink / raw)
  To: Arun Siluvery; +Cc: intel-gfx

On Wed, Aug 20, 2014 at 03:19:17PM +0100, Arun Siluvery wrote:
> Workarounds for bdw are currently applied in init_clock_gating() but they
> are lost following a gpu reset. Some of the WA registers are part of register
> state context and they are restored with every context switch so initializing
> them in golden render state ensures that they are applied even when we start
> with an uninitialized context or during hw initialization followed by a reset.
> 
> v2: Add comments corresponding to WAs in golden render state (Chris).
> 
> The generation of render state is not a straightforward process, it would
> be ideal to augment WA values from during the setup state as opposed to
> using a tool but that would be a follow up patch.

I'd still prefer just emitting the LRIs from code rather than mucking
about with the null batch. Fewer hoops to jump through when adding a new w/a.

-- 
Ville Syrjälä
Intel OTC


* Re: [PATCH 1/2] drm/i915/bdw: Apply workarounds using the golden render state
  2014-08-20 14:52   ` Ville Syrjälä
@ 2014-08-22 11:06     ` Mika Kuoppala
  2014-08-22 12:10       ` Siluvery, Arun
  0 siblings, 1 reply; 10+ messages in thread
From: Mika Kuoppala @ 2014-08-22 11:06 UTC (permalink / raw)
  To: Ville Syrjälä, Arun Siluvery; +Cc: intel-gfx

Ville Syrjälä <ville.syrjala@linux.intel.com> writes:

> On Wed, Aug 20, 2014 at 03:19:17PM +0100, Arun Siluvery wrote:
>> Workarounds for bdw are currently applied in init_clock_gating() but they
>> are lost following a gpu reset. Some of the WA registers are part of register
>> state context and they are restored with every context switch so initializing
>> them in golden render state ensures that they are applied even when we start
>> with an uninitialized context or during hw initialization followed by a reset.
>> 
>> v2: Add comments corresponding to WAs in golden render state (Chris).
>> 
>> The generation of render state is not a straightforward process, it would
>> be ideal to augment WA values from during the setup state as opposed to
>> using a tool but that would be a follow up patch.
>
> I'd still prefer just emitting the LRIs from code rather than mucking
> about with null batch. Less hoops to jump through when adding a new w/a.

I agree with this. We should aim to keep one null state per
gen. The workaround set differs between gtX variants within a
particular gen, so we would otherwise need multiple null states per gen.

After a brief chat with Ville, I think that the correct
spot to init the context-specific workarounds is after MI_SET_CONTEXT
to default and right before the null batch is run. If we do this
by emitting LRIs to the ring, we should be safe, as they are then saved
with the default ctx.

The default ctx is then used as a 'parent' for newly created
contexts. Of course, if registers get clobbered, then we inherit
crap.

If we have the per-gen null state and the ring is initializing
workarounds for the default context, then in future we can
save this state as a 'read only golden context' and use it as the
initial state for all newly created contexts.

The full plan for init would then look like this:

#1 reset the gpu (on driver load, on resume or on hang recovery)
#2 if we have a 'read only golden context', copy it to the default ctx
#3 switch to the default context
#4 if we had a 'read only golden context', we are done with the init

---

#5 otherwise this is driver load, so there is no 'read only golden context' yet
#6 init workarounds through ring LRIs
#7 run the null/golden state batch
#8 save this state as a 'read only golden context'

---

#9 for each new context, initialize the ctx obj with the 'read only golden
 context' (either by memcpy or by restoring from it when switching to the new one)
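
Steps #1 to #8 can be sketched as a small state machine. This is only an
illustration of the ordering; struct gpu_model, its counters and
init_after_reset() are hypothetical and do not correspond to existing i915
code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model: counts how often each init step has run. */
struct gpu_model {
	bool have_golden_ctx;	/* 'read only golden context' saved? */
	int golden_copies;	/* #2: golden ctx copied to default ctx */
	int ctx_switches;	/* #3: switches to the default context */
	int lri_wa_inits;	/* #6: workarounds emitted as ring LRIs */
	int null_batch_runs;	/* #7: null/golden state batch executed */
};

/* Called after every gpu reset (#1): load, resume or hang recovery. */
static void init_after_reset(struct gpu_model *g)
{
	bool had_golden = g->have_golden_ctx;

	if (had_golden)
		g->golden_copies++;	/* #2 */

	g->ctx_switches++;		/* #3 */

	if (had_golden)
		return;			/* #4: nothing more to do */

	/* #5: first init after driver load, build the golden context */
	g->lri_wa_inits++;		/* #6 */
	g->null_batch_runs++;		/* #7 */
	g->have_golden_ctx = true;	/* #8 */
}
```

Calling init_after_reset() twice on a zeroed model runs the LRI and null
batch steps once (driver load) and the copy step once (the later reset).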

-Mika
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


* Re: [PATCH 1/2] drm/i915/bdw: Apply workarounds using the golden render state
  2014-08-22 11:06     ` Mika Kuoppala
@ 2014-08-22 12:10       ` Siluvery, Arun
  2014-08-26 12:53         ` Daniel Vetter
  0 siblings, 1 reply; 10+ messages in thread
From: Siluvery, Arun @ 2014-08-22 12:10 UTC (permalink / raw)
  To: Mika Kuoppala, Ville Syrjälä; +Cc: intel-gfx

On 22/08/2014 12:06, Mika Kuoppala wrote:
> Ville Syrjälä <ville.syrjala@linux.intel.com> writes:
>
>> On Wed, Aug 20, 2014 at 03:19:17PM +0100, Arun Siluvery wrote:
>>> Workarounds for bdw are currently applied in init_clock_gating() but they
>>> are lost following a gpu reset. Some of the WA registers are part of register
>>> state context and they are restored with every context switch so initializing
>>> them in golden render state ensures that they are applied even when we start
>>> with an uninitialized context or during hw initialization followed by a reset.
>>>
>>> v2: Add comments corresponding to WAs in golden render state (Chris).
>>>
>>> The generation of render state is not a straightforward process, it would
>>> be ideal to augment WA values from during the setup state as opposed to
>>> using a tool but that would be a follow up patch.
>>
>> I'd still prefer just emitting the LRIs from code rather than mucking
>> about with null batch. Less hoops to jump through when adding a new w/a.
>
> I agree with this. We should aim to keep null state as per
> gen. Workaround set is different for gtX inside particular
> gen so we would need then multiple null states per gen.
>
> After brief chat with Ville, I think that the correct
> spot to init the context specific workarounds is after MI_SET_CONTEXT
> to default and right before null batch is run. If we do these
> with emitting LRIs to ring, we should be safe as they are then saved
> with default ctx.
>
> The default ctx is then used as a 'parent' for newly created
> contexts. Of course if registers get clobbered, then we inherit
> crap.
>
> If we have the per gen null state and the ring is initializing
> workarounds for the default context, then in future we can
> save this state as 'read only golden context'. And use it as the
> initial state for all newly created contexts.
>
> Then the full plan how to init would look like this:
>
> #1 reset the gpu (on driver load, on resume or on hang recovery)
> #2 if we have 'read only golden context', copy it to default ctx
> #3 switch to default context
> #4 if we had 'read only golden context' we are done with the init.
>
> ---
>
> #5 if this is driver load thus there is no 'read only golden context' yet.
> #6 init workarounds through ring LRIs
> #7 run null/golden state batch
> #8 save this state as a 'read only golden context'
>
> ---
>
> #9 for each new context, initialize ctx obj with 'read only golden
>   context' (either by memcpy or restoring from it when switching to new)
>
I understand that applying WAs using the null batch has its issues, but as
I mentioned in the commit message, I will fix this in a follow-up patch.
It is going to take some time, though, to change the patch to follow the
new sequence.
The patch in its current state helps fix WA issues after reset; can it
only be accepted once it is updated to follow the new sequence?

regards
Arun


> -Mika
>
>


* Re: [PATCH 1/2] drm/i915/bdw: Apply workarounds using the golden render state
  2014-08-22 12:10       ` Siluvery, Arun
@ 2014-08-26 12:53         ` Daniel Vetter
  2014-08-26 12:57           ` Siluvery, Arun
  0 siblings, 1 reply; 10+ messages in thread
From: Daniel Vetter @ 2014-08-26 12:53 UTC (permalink / raw)
  To: Siluvery, Arun; +Cc: intel-gfx

On Fri, Aug 22, 2014 at 01:10:26PM +0100, Siluvery, Arun wrote:
> On 22/08/2014 12:06, Mika Kuoppala wrote:
> >Ville Syrjälä <ville.syrjala@linux.intel.com> writes:
> >
> >>On Wed, Aug 20, 2014 at 03:19:17PM +0100, Arun Siluvery wrote:
> >>>Workarounds for bdw are currently applied in init_clock_gating() but they
> >>>are lost following a gpu reset. Some of the WA registers are part of register
> >>>state context and they are restored with every context switch so initializing
> >>>them in golden render state ensures that they are applied even when we start
> >>>with an uninitialized context or during hw initialization followed by a reset.
> >>>
> >>>v2: Add comments corresponding to WAs in golden render state (Chris).
> >>>
> >>>The generation of render state is not a straightforward process, it would
> >>>be ideal to augment WA values from during the setup state as opposed to
> >>>using a tool but that would be a follow up patch.
> >>
> >>I'd still prefer just emitting the LRIs from code rather than mucking
> >>about with null batch. Less hoops to jump through when adding a new w/a.
> >
> >I agree with this. We should aim to keep null state as per
> >gen. Workaround set is different for gtX inside particular
> >gen so we would need then multiple null states per gen.
> >
> >After brief chat with Ville, I think that the correct
> >spot to init the context specific workarounds is after MI_SET_CONTEXT
> >to default and right before null batch is run. If we do these
> >with emitting LRIs to ring, we should be safe as they are then saved
> >with default ctx.
> >
> >The default ctx is then used as a 'parent' for newly created
> >contexts. Of course if registers get clobbered, then we inherit
> >crap.
> >
> >If we have the per gen null state and the ring is initializing
> >workarounds for the default context, then in future we can
> >save this state as 'read only golden context'. And use it as the
> >initial state for all newly created contexts.
> >
> >Then the full plan how to init would look like this:
> >
> >#1 reset the gpu (on driver load, on resume or on hang recovery)
> >#2 if we have 'read only golden context', copy it to default ctx
> >#3 switch to default context
> >#4 if we had 'read only golden context' we are done with the init.
> >
> >---
> >
> >#5 if this is driver load thus there is no 'read only golden context' yet.
> >#6 init workarounds through ring LRIs
> >#7 run null/golden state batch
> >#8 save this state as a 'read only golden context'
> >
> >---
> >
> >#9 for each new context, initialize ctx obj with 'read only golden
> >  context' (either by memcpy or restoring from it when switching to new)
> >
> I understand applying WAs using null batch has its issues but as I mentioned
> in the commit msg I will fix this as a follow up patch.
> It is going to take some time though to change the patch as per the new
> sequence.
> The patch in its current state helps fix WA issues after reset; so it can
> only be accepted if it is updated as per the new sequence?

We already have a lot of "let's fix it later" experiments running, so I
don't want to overload the ship. So I highly prefer to merge the revised
version directly.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch


* Re: [PATCH 1/2] drm/i915/bdw: Apply workarounds using the golden render state
  2014-08-26 12:53         ` Daniel Vetter
@ 2014-08-26 12:57           ` Siluvery, Arun
  2014-08-26 13:37             ` Daniel Vetter
  0 siblings, 1 reply; 10+ messages in thread
From: Siluvery, Arun @ 2014-08-26 12:57 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: intel-gfx

On 26/08/2014 13:53, Daniel Vetter wrote:
> On Fri, Aug 22, 2014 at 01:10:26PM +0100, Siluvery, Arun wrote:
>> On 22/08/2014 12:06, Mika Kuoppala wrote:
>>> Ville Syrjälä <ville.syrjala@linux.intel.com> writes:
>>>
>>>> On Wed, Aug 20, 2014 at 03:19:17PM +0100, Arun Siluvery wrote:
>>>>> Workarounds for bdw are currently applied in init_clock_gating() but they
>>>>> are lost following a gpu reset. Some of the WA registers are part of register
>>>>> state context and they are restored with every context switch so initializing
>>>>> them in golden render state ensures that they are applied even when we start
>>>>> with an uninitialized context or during hw initialization followed by a reset.
>>>>>
>>>>> v2: Add comments corresponding to WAs in golden render state (Chris).
>>>>>
>>>>> The generation of render state is not a straightforward process; it would
>>>>> be ideal to augment WA values during state setup as opposed to
>>>>> using a tool but that would be a follow up patch.
>>>>
>>>> I'd still prefer just emitting the LRIs from code rather than mucking
>>>> about with null batch. Less hoops to jump through when adding a new w/a.
>>>
>>> I agree with this. We should aim to keep null state as per
>>> gen. Workaround set is different for gtX inside particular
>>> gen so we would need then multiple null states per gen.
>>>
>>> After brief chat with Ville, I think that the correct
>>> spot to init the context specific workarounds is after MI_SET_CONTEXT
>>> to default and right before null batch is run. If we do these
>>> with emitting LRIs to ring, we should be safe as they are then saved
>>> with default ctx.
>>>
>>> The default ctx is then used as a 'parent' for newly created
>>> contexts. Of course if registers get clobbered, then we inherit
>>> crap.
>>>
>>> If we have the per gen null state and the ring is initializing
>>> workarounds for the default context, then in future we can
>>> save this state as 'read only golden context'. And use it as the
>>> initial state for all newly created contexts.
>>>
>>> Then the full plan how to init would look like this:
>>>
>>> #1 reset the gpu (on driver load, on resume or on hang recovery)
>>> #2 if we have 'read only golden context', copy it to default ctx
>>> #3 switch to default context
>>> #4 if we had 'read only golden context' we are done with the init.
>>>
>>> ---
>>>
>>> #5 if this is driver load thus there is no 'read only golden context' yet.
>>> #6 init workarounds through ring LRIs
>>> #7 run null/golden state batch
>>> #8 save this state as a 'read only golden context'
>>>
>>> ---
>>>
>>> #9 for each new context, initialize ctx obj with 'read only golden
>>>   context' (either by memcpy or restoring from it when switching to new)
>>>
>> I understand applying WAs using null batch has its issues but as I mentioned
>> in the commit msg I will fix this as a follow up patch.
>> It is going to take some time though to change the patch as per the new
>> sequence.
>> The patch in its current state helps fix WA issues after reset; so can it
>> only be accepted if it is updated as per the new sequence?
>
> We already have a lot of "let's fix it later" experiments running, so I
> don't want to overload the ship. I'd strongly prefer to merge the revised
> version directly.
> -Daniel
>
I understand. A revised version that emits the LRIs from the driver has
already been submitted and is under review.

regards
Arun
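
For illustration, emitting the workaround LRIs from the driver rather
than baking them into the null batch could look roughly like the sketch
below. The MI_INSTR, MI_LOAD_REGISTER_IMM and masked-bit encodings mirror
the driver's command definitions; the register offsets, the table and
emit_wa_lris() are made up for this example and are not the submitted
patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Command encodings, following the i915 MI_* definitions */
#define MI_INSTR(opcode, flags)	(((uint32_t)(opcode) << 23) | (uint32_t)(flags))
#define MI_NOOP			MI_INSTR(0, 0)
#define MI_LOAD_REGISTER_IMM(n)	MI_INSTR(0x22, 2 * (n) - 1)

/* Masked registers: the upper 16 bits select which lower bits to write */
#define MASKED_BIT_ENABLE(a)	(((uint32_t)(a) << 16) | (uint32_t)(a))

struct wa_reg {
	uint32_t offset;	/* MMIO offset of the register */
	uint32_t value;		/* value to load */
};

/* Hypothetical offsets and bits, NOT real BDW workarounds */
const struct wa_reg example_was[] = {
	{ 0x1000, MASKED_BIT_ENABLE(1u << 3) },
	{ 0x1004, MASKED_BIT_ENABLE(1u << 8) },
	{ 0x1008, 0x00000020 },
};

/*
 * Write one LRI command covering all 'count' registers into 'cs'.
 * Returns the number of dwords written, or 0 if 'cs' is too small.
 * A trailing MI_NOOP pads the stream to an even dword count.
 */
size_t emit_wa_lris(uint32_t *cs, size_t capacity,
		    const struct wa_reg *was, size_t count)
{
	size_t needed = 1 + 2 * count;	/* header + (offset, value) pairs */
	size_t n = 0, i;

	assert(cs && was && count > 0);

	needed += needed & 1;		/* room for the padding NOOP */
	if (capacity < needed)
		return 0;

	cs[n++] = MI_LOAD_REGISTER_IMM(count);
	for (i = 0; i < count; i++) {
		cs[n++] = was[i].offset;
		cs[n++] = was[i].value;
	}
	if (n & 1)
		cs[n++] = MI_NOOP;

	return n;
}
```

With a table like this, adding a new w/a is one more array entry, which
is the "less hoops" argument made earlier in the thread.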


* Re: [PATCH 1/2] drm/i915/bdw: Apply workarounds using the golden render state
  2014-08-26 12:57           ` Siluvery, Arun
@ 2014-08-26 13:37             ` Daniel Vetter
  0 siblings, 0 replies; 10+ messages in thread
From: Daniel Vetter @ 2014-08-26 13:37 UTC (permalink / raw)
  To: Siluvery, Arun; +Cc: intel-gfx

On Tue, Aug 26, 2014 at 01:57:19PM +0100, Siluvery, Arun wrote:
> On 26/08/2014 13:53, Daniel Vetter wrote:
> >On Fri, Aug 22, 2014 at 01:10:26PM +0100, Siluvery, Arun wrote:
> >>On 22/08/2014 12:06, Mika Kuoppala wrote:
> >>>Ville Syrjälä <ville.syrjala@linux.intel.com> writes:
> >>>
> >>>>On Wed, Aug 20, 2014 at 03:19:17PM +0100, Arun Siluvery wrote:
> >>>>>Workarounds for bdw are currently applied in init_clock_gating() but they
> >>>>>are lost following a gpu reset. Some of the WA registers are part of register
> >>>>>state context and they are restored with every context switch so initializing
> >>>>>them in golden render state ensures that they are applied even when we start
> >>>>>with an uninitialized context or during hw initialization followed by a reset.
> >>>>>
> >>>>>v2: Add comments corresponding to WAs in golden render state (Chris).
> >>>>>
> >>>>>The generation of render state is not a straightforward process; it would
> >>>>>be ideal to augment WA values during state setup as opposed to
> >>>>>using a tool but that would be a follow up patch.
> >>>>
> >>>>I'd still prefer just emitting the LRIs from code rather than mucking
> >>>>about with null batch. Less hoops to jump through when adding a new w/a.
> >>>
> >>>I agree with this. We should aim to keep null state as per
> >>>gen. Workaround set is different for gtX inside particular
> >>>gen so we would need then multiple null states per gen.
> >>>
> >>>After brief chat with Ville, I think that the correct
> >>>spot to init the context specific workarounds is after MI_SET_CONTEXT
> >>>to default and right before null batch is run. If we do these
> >>>with emitting LRIs to ring, we should be safe as they are then saved
> >>>with default ctx.
> >>>
> >>>The default ctx is then used as a 'parent' for newly created
> >>>contexts. Of course if registers get clobbered, then we inherit
> >>>crap.
> >>>
> >>>If we have the per gen null state and the ring is initializing
> >>>workarounds for the default context, then in future we can
> >>>save this state as 'read only golden context'. And use it as the
> >>>initial state for all newly created contexts.
> >>>
> >>>Then the full plan how to init would look like this:
> >>>
> >>>#1 reset the gpu (on driver load, on resume or on hang recovery)
> >>>#2 if we have 'read only golden context', copy it to default ctx
> >>>#3 switch to default context
> >>>#4 if we had 'read only golden context' we are done with the init.
> >>>
> >>>---
> >>>
> >>>#5 if this is driver load thus there is no 'read only golden context' yet.
> >>>#6 init workarounds through ring LRIs
> >>>#7 run null/golden state batch
> >>>#8 save this state as a 'read only golden context'
> >>>
> >>>---
> >>>
> >>>#9 for each new context, initialize ctx obj with 'read only golden
> >>>  context' (either by memcpy or restoring from it when switching to new)
> >>>
> >>I understand applying WAs using null batch has its issues but as I mentioned
> >>in the commit msg I will fix this as a follow up patch.
> >>It is going to take some time though to change the patch as per the new
> >>sequence.
> >>The patch in its current state helps fix WA issues after reset; so can it
> >>only be accepted if it is updated as per the new sequence?
> >
> >We already have a lot of "let's fix it later" experiments running, so I
> >don't want to overload the ship. I'd strongly prefer to merge the revised
> >version directly.
> >-Daniel
> >
> I understand. A revised version that emits the LRIs from the driver has
> already been submitted and is under review.

Ah, I'm still catching up after last week's unusable network connection.
Please ignore me ;-)
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch


end of thread, other threads:[~2014-08-26 13:36 UTC | newest]

Thread overview: 10+ messages
2014-08-20 14:19 [PATCH 0/2] Add BDW workarounds using golden render state Arun Siluvery
2014-08-20 14:19 ` [PATCH 1/2] drm/i915/bdw: Apply workarounds using the " Arun Siluvery
2014-08-20 14:24   ` Chris Wilson
2014-08-20 14:52   ` Ville Syrjälä
2014-08-22 11:06     ` Mika Kuoppala
2014-08-22 12:10       ` Siluvery, Arun
2014-08-26 12:53         ` Daniel Vetter
2014-08-26 12:57           ` Siluvery, Arun
2014-08-26 13:37             ` Daniel Vetter
2014-08-20 14:19 ` [PATCH 2/2] drm/i915/bdw: Extract workaround registers data from " Arun Siluvery