* [PATCH 0/4] Rework workaround init fn for BDW and CHV
@ 2014-09-01 13:28 Arun Siluvery
2014-09-01 13:28 ` [PATCH 1/4] drm/i915: Rework workaround init functions " Arun Siluvery
` (3 more replies)
0 siblings, 4 replies; 14+ messages in thread
From: Arun Siluvery @ 2014-09-01 13:28 UTC (permalink / raw)
To: intel-gfx
The patches that initialize workarounds for BDW and CHV using LRIs
are already merged in the tree but I received few more comments
from Chris and Damien.
I have reworked these patches as per their comments; it fixes
some issues and the code now looks clean. we can easily add
more workarounds with minimal changes.
I have tried to split BDW and CHV changes as separate patches but
it was getting ugly hence combined them also since it is a fixup
patch it should be ok.
In the previous implementation we were emitting each LRI individually
so the total number was not clear during init so a temporary array
was used to save w/a data (this is exported to debugfs) but if not
initialized we could potentially overrun array and also the condition
was not correct.
In this rework, w/a are organized as an array so we know the exact
count from the beginning and temporary array is no longer required.
The corresponding i-g-t is gem_workarounds.c and it is also updated.
Arun Siluvery (2):
drm/i915: Rework workaround init functions for BDW and CHV
drm/i915: Rework workaround data exporting to debugfs
Damien Lespiau (2):
drm/i915: Rename intel_wa_registers with a i915_ prefix
drm/i915: Don't restrict i915_wa_registers to BDW
drivers/gpu/drm/i915/i915_debugfs.c | 42 ++++++---
drivers/gpu/drm/i915/i915_drv.h | 14 ---
drivers/gpu/drm/i915/intel_ringbuffer.c | 162 ++++++++++++++------------------
drivers/gpu/drm/i915/intel_ringbuffer.h | 8 ++
4 files changed, 107 insertions(+), 119 deletions(-)
--
2.0.4
^ permalink raw reply [flat|nested] 14+ messages in thread
* [PATCH 1/4] drm/i915: Rework workaround init functions for BDW and CHV
2014-09-01 13:28 [PATCH 0/4] Rework workaround init fn for BDW and CHV Arun Siluvery
@ 2014-09-01 13:28 ` Arun Siluvery
2014-09-02 9:14 ` [PATCH v2] " Arun Siluvery
2014-09-01 13:28 ` [PATCH 2/4] drm/i915: Rename intel_wa_registers with a i915_ prefix Arun Siluvery
` (2 subsequent siblings)
3 siblings, 1 reply; 14+ messages in thread
From: Arun Siluvery @ 2014-09-01 13:28 UTC (permalink / raw)
To: intel-gfx
This rework is based on suggestion from Chris.
Now w/a are organized in an array and all of them are emitted in
single fn instead of sending them individually. This approach is
very clean and new w/a can be added with minimal changes.
The same array can be used when exporting them to debugfs and the
temporary array in the current implementation is not required.
Signed-off-by: Arun Siluvery <arun.siluvery@linux.intel.com>
---
drivers/gpu/drm/i915/intel_ringbuffer.c | 149 +++++++++++++-------------------
1 file changed, 58 insertions(+), 91 deletions(-)
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index c5e4dc7..bae1527 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -650,87 +650,71 @@ err:
return ret;
}
-static inline void intel_ring_emit_wa(struct intel_engine_cs *ring,
- u32 addr, u32 value)
+static int i915_request_emit_lri(struct intel_engine_cs *ring,
+ int num_registers, const u32 *lri_list)
{
- struct drm_device *dev = ring->dev;
- struct drm_i915_private *dev_priv = dev->dev_private;
+ int i;
+ int ret;
- if (dev_priv->num_wa_regs > I915_MAX_WA_REGS)
- return;
+ ret = intel_ring_begin(ring, (2 * num_registers + 1));
+ if (ret)
+ return ret;
- intel_ring_emit(ring, MI_LOAD_REGISTER_IMM(1));
- intel_ring_emit(ring, addr);
- intel_ring_emit(ring, value);
+ intel_ring_emit(ring, MI_LOAD_REGISTER_IMM(num_registers));
+ for (i = 0; i < 2*num_registers; i += 2) {
+ intel_ring_emit(ring, *(lri_list + i));
+ intel_ring_emit(ring, *(lri_list + i + 1));
+ }
- dev_priv->intel_wa_regs[dev_priv->num_wa_regs].addr = addr;
- dev_priv->intel_wa_regs[dev_priv->num_wa_regs].mask = (value) & 0xFFFF;
- /* value is updated with the status of remaining bits of this
- * register when it is read from debugfs file
- */
- dev_priv->intel_wa_regs[dev_priv->num_wa_regs].value = value;
- dev_priv->num_wa_regs++;
+ intel_ring_emit(ring, MI_NOOP);
+ intel_ring_advance(ring);
- return;
+ return 0;
}
-static int bdw8_init_workarounds(struct intel_engine_cs *ring)
-{
- int ret;
- struct drm_device *dev = ring->dev;
- struct drm_i915_private *dev_priv = dev->dev_private;
+static const u32 bdw_ring_init_context[] = {
/*
* workarounds applied in this fn are part of register state context,
* they need to be re-initialized followed by gpu reset, suspend/resume,
* module reload.
*/
- dev_priv->num_wa_regs = 0;
- memset(dev_priv->intel_wa_regs, 0, sizeof(dev_priv->intel_wa_regs));
-
- /*
- * update the number of dwords required based on the
- * actual number of workarounds applied
- */
- ret = intel_ring_begin(ring, 24);
- if (ret)
- return ret;
/* WaDisablePartialInstShootdown:bdw */
/* WaDisableThreadStallDopClockGating:bdw */
/* FIXME: Unclear whether we really need this on production bdw. */
- intel_ring_emit_wa(ring, GEN8_ROW_CHICKEN,
- _MASKED_BIT_ENABLE(PARTIAL_INSTRUCTION_SHOOTDOWN_DISABLE
- | STALL_DOP_GATING_DISABLE));
+ GEN8_ROW_CHICKEN,
+ _MASKED_BIT_ENABLE(PARTIAL_INSTRUCTION_SHOOTDOWN_DISABLE) |
+ _MASKED_BIT_ENABLE(STALL_DOP_GATING_DISABLE),
/* WaDisableDopClockGating:bdw May not be needed for production */
- intel_ring_emit_wa(ring, GEN7_ROW_CHICKEN2,
- _MASKED_BIT_ENABLE(DOP_CLOCK_GATING_DISABLE));
+ GEN7_ROW_CHICKEN2,
+ _MASKED_BIT_ENABLE(DOP_CLOCK_GATING_DISABLE),
/*
* This GEN8_CENTROID_PIXEL_OPT_DIS W/A is only needed for
* pre-production hardware
*/
- intel_ring_emit_wa(ring, HALF_SLICE_CHICKEN3,
- _MASKED_BIT_ENABLE(GEN8_CENTROID_PIXEL_OPT_DIS
- | GEN8_SAMPLER_POWER_BYPASS_DIS));
+ HALF_SLICE_CHICKEN3,
+ _MASKED_BIT_ENABLE(GEN8_CENTROID_PIXEL_OPT_DIS
+ | GEN8_SAMPLER_POWER_BYPASS_DIS),
- intel_ring_emit_wa(ring, GEN7_HALF_SLICE_CHICKEN1,
- _MASKED_BIT_ENABLE(GEN7_SINGLE_SUBSCAN_DISPATCH_ENABLE));
+ GEN7_HALF_SLICE_CHICKEN1,
+ _MASKED_BIT_ENABLE(GEN7_SINGLE_SUBSCAN_DISPATCH_ENABLE),
- intel_ring_emit_wa(ring, COMMON_SLICE_CHICKEN2,
- _MASKED_BIT_ENABLE(GEN8_CSC2_SBE_VUE_CACHE_CONSERVATIVE));
+ COMMON_SLICE_CHICKEN2,
+ _MASKED_BIT_ENABLE(GEN8_CSC2_SBE_VUE_CACHE_CONSERVATIVE),
/* Use Force Non-Coherent whenever executing a 3D context. This is a
* workaround for for a possible hang in the unlikely event a TLB
* invalidation occurs during a PSD flush.
*/
- intel_ring_emit_wa(ring, HDC_CHICKEN0,
- _MASKED_BIT_ENABLE(HDC_FORCE_NON_COHERENT));
+ HDC_CHICKEN0,
+ _MASKED_BIT_ENABLE(HDC_FORCE_NON_COHERENT),
/* Wa4x4STCOptimizationDisable:bdw */
- intel_ring_emit_wa(ring, CACHE_MODE_1,
- _MASKED_BIT_ENABLE(GEN8_4x4_STC_OPTIMIZATION_DISABLE));
+ CACHE_MODE_1,
+ _MASKED_BIT_ENABLE(GEN8_4x4_STC_OPTIMIZATION_DISABLE),
/*
* BSpec recommends 8x4 when MSAA is used,
@@ -740,54 +724,40 @@ static int bdw8_init_workarounds(struct intel_engine_cs *ring)
* disable bit, which we don't touch here, but it's good
* to keep in mind (see 3DSTATE_PS and 3DSTATE_WM).
*/
- intel_ring_emit_wa(ring, GEN7_GT_MODE,
- GEN6_WIZ_HASHING_MASK | GEN6_WIZ_HASHING_16x4);
-
- intel_ring_advance(ring);
-
- DRM_DEBUG_DRIVER("Number of Workarounds applied: %d\n",
- dev_priv->num_wa_regs);
-
- return 0;
-}
-
-static int chv_init_workarounds(struct intel_engine_cs *ring)
-{
- int ret;
- struct drm_device *dev = ring->dev;
- struct drm_i915_private *dev_priv = dev->dev_private;
-
- /*
- * workarounds applied in this fn are part of register state context,
- * they need to be re-initialized followed by gpu reset, suspend/resume,
- * module reload.
- */
- dev_priv->num_wa_regs = 0;
- memset(dev_priv->intel_wa_regs, 0, sizeof(dev_priv->intel_wa_regs));
-
- ret = intel_ring_begin(ring, 12);
- if (ret)
- return ret;
+ GEN7_GT_MODE,
+ _MASKED_BIT_ENABLE(GEN6_WIZ_HASHING_MASK | GEN6_WIZ_HASHING_16x4),
+};
+static const u32 chv_ring_init_context[] = {
/* WaDisablePartialInstShootdown:chv */
- intel_ring_emit_wa(ring, GEN8_ROW_CHICKEN,
- _MASKED_BIT_ENABLE(PARTIAL_INSTRUCTION_SHOOTDOWN_DISABLE));
-
/* WaDisableThreadStallDopClockGating:chv */
- intel_ring_emit_wa(ring, GEN8_ROW_CHICKEN,
- _MASKED_BIT_ENABLE(STALL_DOP_GATING_DISABLE));
+ GEN8_ROW_CHICKEN,
+ _MASKED_BIT_ENABLE(PARTIAL_INSTRUCTION_SHOOTDOWN_DISABLE) |
+ _MASKED_BIT_ENABLE(STALL_DOP_GATING_DISABLE),
/* WaDisableDopClockGating:chv (pre-production hw) */
- intel_ring_emit_wa(ring, GEN7_ROW_CHICKEN2,
- _MASKED_BIT_ENABLE(DOP_CLOCK_GATING_DISABLE));
+ GEN7_ROW_CHICKEN2,
+ _MASKED_BIT_ENABLE(DOP_CLOCK_GATING_DISABLE),
/* WaDisableSamplerPowerBypass:chv (pre-production hw) */
- intel_ring_emit_wa(ring, HALF_SLICE_CHICKEN3,
- _MASKED_BIT_ENABLE(GEN8_SAMPLER_POWER_BYPASS_DIS));
+ HALF_SLICE_CHICKEN3,
+ _MASKED_BIT_ENABLE(GEN8_SAMPLER_POWER_BYPASS_DIS),
+};
- intel_ring_advance(ring);
+static int ring_init_context(struct intel_engine_cs *ring)
+{
+ int ret = -EINVAL;
- return 0;
+ if (IS_CHERRYVIEW(ring->dev))
+ ret = i915_request_emit_lri(ring,
+ ARRAY_SIZE(chv_ring_init_context)/2,
+ chv_ring_init_context);
+ else
+ ret = i915_request_emit_lri(ring,
+ ARRAY_SIZE(bdw_ring_init_context)/2,
+ bdw_ring_init_context);
+
+ return ret;
}
static int init_render_ring(struct intel_engine_cs *ring)
@@ -2275,10 +2245,7 @@ int intel_init_render_ring_buffer(struct drm_device *dev)
dev_priv->semaphore_obj = obj;
}
}
- if (IS_CHERRYVIEW(dev))
- ring->init_context = chv_init_workarounds;
- else
- ring->init_context = bdw_init_workarounds;
+ ring->init_context = ring_init_context;
ring->add_request = gen6_add_request;
ring->flush = gen8_render_ring_flush;
ring->irq_get = gen8_ring_get_irq;
--
2.0.4
^ permalink raw reply related [flat|nested] 14+ messages in thread
* [PATCH 2/4] drm/i915: Rename intel_wa_registers with a i915_ prefix
2014-09-01 13:28 [PATCH 0/4] Rework workaround init fn for BDW and CHV Arun Siluvery
2014-09-01 13:28 ` [PATCH 1/4] drm/i915: Rework workaround init functions " Arun Siluvery
@ 2014-09-01 13:28 ` Arun Siluvery
2014-09-01 13:28 ` [PATCH 3/4] drm/i915: Don't restrict i915_wa_registers to BDW Arun Siluvery
2014-09-01 13:28 ` [PATCH 4/4] drm/i915: Rework workaround data exporting to debugfs Arun Siluvery
3 siblings, 0 replies; 14+ messages in thread
From: Arun Siluvery @ 2014-09-01 13:28 UTC (permalink / raw)
To: intel-gfx
From: Damien Lespiau <damien.lespiau@intel.com>
Those debugfs files are prefixed by i915, the name of the kernel module,
presumably to make the difference with files exposed by core DRM.
Also, add a ',' at the end of the last entry. This is to ease the
conflict resolution when rebasing internal patches that add a member at
the end of the array. Without it, wiggle can't do its job as we need to
modify an existing line (appending the ',').
Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
---
drivers/gpu/drm/i915/i915_debugfs.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index f0d63f6..f09ba8c 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -2458,7 +2458,7 @@ static int i915_shared_dplls_info(struct seq_file *m, void *unused)
return 0;
}
-static int intel_wa_registers(struct seq_file *m, void *unused)
+static int i915_wa_registers(struct seq_file *m, void *unused)
{
int i;
int ret;
@@ -4026,7 +4026,7 @@ static const struct drm_info_list i915_debugfs_list[] = {
{"i915_semaphore_status", i915_semaphore_status, 0},
{"i915_shared_dplls_info", i915_shared_dplls_info, 0},
{"i915_dp_mst_info", i915_dp_mst_info, 0},
- {"intel_wa_registers", intel_wa_registers, 0}
+ {"i915_wa_registers", i915_wa_registers, 0},
};
#define I915_DEBUGFS_ENTRIES ARRAY_SIZE(i915_debugfs_list)
--
2.0.4
^ permalink raw reply related [flat|nested] 14+ messages in thread
* [PATCH 3/4] drm/i915: Don't restrict i915_wa_registers to BDW
2014-09-01 13:28 [PATCH 0/4] Rework workaround init fn for BDW and CHV Arun Siluvery
2014-09-01 13:28 ` [PATCH 1/4] drm/i915: Rework workaround init functions " Arun Siluvery
2014-09-01 13:28 ` [PATCH 2/4] drm/i915: Rename intel_wa_registers with a i915_ prefix Arun Siluvery
@ 2014-09-01 13:28 ` Arun Siluvery
2014-09-01 13:28 ` [PATCH 4/4] drm/i915: Rework workaround data exporting to debugfs Arun Siluvery
3 siblings, 0 replies; 14+ messages in thread
From: Arun Siluvery @ 2014-09-01 13:28 UTC (permalink / raw)
To: intel-gfx
From: Damien Lespiau <damien.lespiau@intel.com>
We have CHV code that already makes the test obsolete. Besides, when
num_wa_regs is 0 (platforms not gathering that W/A data), we expose
something sensible already.
Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
Reviewed-by: Arun Siluvery <arun.siluvery@linux.intel.com>
---
drivers/gpu/drm/i915/i915_debugfs.c | 5 -----
1 file changed, 5 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index f09ba8c..2727bda 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -2466,11 +2466,6 @@ static int i915_wa_registers(struct seq_file *m, void *unused)
struct drm_device *dev = node->minor->dev;
struct drm_i915_private *dev_priv = dev->dev_private;
- if (!IS_BROADWELL(dev)) {
- DRM_DEBUG_DRIVER("Workaround table not available !!\n");
- return -EINVAL;
- }
-
ret = mutex_lock_interruptible(&dev->struct_mutex);
if (ret)
return ret;
--
2.0.4
^ permalink raw reply related [flat|nested] 14+ messages in thread
* [PATCH 4/4] drm/i915: Rework workaround data exporting to debugfs
2014-09-01 13:28 [PATCH 0/4] Rework workaround init fn for BDW and CHV Arun Siluvery
` (2 preceding siblings ...)
2014-09-01 13:28 ` [PATCH 3/4] drm/i915: Don't restrict i915_wa_registers to BDW Arun Siluvery
@ 2014-09-01 13:28 ` Arun Siluvery
2014-09-01 14:06 ` Damien Lespiau
2014-09-02 9:15 ` [PATCH v2] " Arun Siluvery
3 siblings, 2 replies; 14+ messages in thread
From: Arun Siluvery @ 2014-09-01 13:28 UTC (permalink / raw)
To: intel-gfx
Now w/a are organized in an array so we know exactly how many of them
are applied; use the same array while exporting data to debugfs and
remove the temporary array we currently have in driver priv structure.
Signed-off-by: Arun Siluvery <arun.siluvery@linux.intel.com>
---
drivers/gpu/drm/i915/i915_debugfs.c | 41 +++++++++++++++++++++++----------
drivers/gpu/drm/i915/i915_drv.h | 14 -----------
drivers/gpu/drm/i915/intel_ringbuffer.c | 15 ++++++++++++
drivers/gpu/drm/i915/intel_ringbuffer.h | 8 +++++++
4 files changed, 52 insertions(+), 26 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index 2727bda..bab0408 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -2465,6 +2465,14 @@ static int i915_wa_registers(struct seq_file *m, void *unused)
struct drm_info_node *node = (struct drm_info_node *) m->private;
struct drm_device *dev = node->minor->dev;
struct drm_i915_private *dev_priv = dev->dev_private;
+ struct intel_ring_context_rodata ro_data;
+
+ ret = ring_context_rodata(dev, &ro_data);
+ if (ret) {
+ seq_printf(m, "Workarounds applied: 0\n");
+ DRM_DEBUG_DRIVER("Workaround table not available !!\n");
+ return -EINVAL;
+ }
ret = mutex_lock_interruptible(&dev->struct_mutex);
if (ret)
@@ -2472,18 +2480,27 @@ static int i915_wa_registers(struct seq_file *m, void *unused)
intel_runtime_pm_get(dev_priv);
- seq_printf(m, "Workarounds applied: %d\n", dev_priv->num_wa_regs);
- for (i = 0; i < dev_priv->num_wa_regs; ++i) {
- u32 addr, mask;
-
- addr = dev_priv->intel_wa_regs[i].addr;
- mask = dev_priv->intel_wa_regs[i].mask;
- dev_priv->intel_wa_regs[i].value = I915_READ(addr) | mask;
- if (dev_priv->intel_wa_regs[i].addr)
- seq_printf(m, "0x%X: 0x%08X, mask: 0x%08X\n",
- dev_priv->intel_wa_regs[i].addr,
- dev_priv->intel_wa_regs[i].value,
- dev_priv->intel_wa_regs[i].mask);
+ seq_printf(m, "Workarounds applied: %d\n", ro_data.num_items/2);
+ for (i = 0; i < ro_data.num_items; i += 2) {
+ u32 addr, mask, value;
+
+ addr = ro_data.init_context[i];
+ /*
+ * Most of workarounds are masked registers;
+ * to set a bit in lower 16-bits we set a mask bit in
+ * upper 16-bits so we can take either of them as mask but
+ * it doesn't work if the w/a is about clearing a bit so
+ * use upper 16-bits to cover both cases.
+ */
+ mask = ro_data.init_context[i+1] >> 16;
+
+ /*
+ * value represents the status of other bits in the
+ * register besides w/a bits
+ */
+ value = I915_READ(addr) | mask;
+ seq_printf(m, "0x%X: 0x%08X, mask: 0x%08X\n",
+ addr, value, mask);
}
intel_runtime_pm_put(dev_priv);
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 49b7be7..bcf79f0 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1553,20 +1553,6 @@ struct drm_i915_private {
struct intel_shared_dpll shared_dplls[I915_NUM_PLLS];
int dpio_phy_iosf_port[I915_NUM_PHYS_VLV];
- /*
- * workarounds are currently applied at different places and
- * changes are being done to consolidate them so exact count is
- * not clear at this point, use a max value for now.
- */
-#define I915_MAX_WA_REGS 16
- struct {
- u32 addr;
- u32 value;
- /* bitmask representing WA bits */
- u32 mask;
- } intel_wa_regs[I915_MAX_WA_REGS];
- u32 num_wa_regs;
-
/* Reclocking support */
bool render_reclock_avail;
bool lvds_downclock_avail;
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index bae1527..6c07e69 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -760,6 +760,21 @@ static int ring_init_context(struct intel_engine_cs *ring)
return ret;
}
+int ring_context_rodata(struct drm_device *dev,
+ struct intel_ring_context_rodata *ro_data)
+{
+ if (IS_CHERRYVIEW(dev)) {
+ ro_data->init_context = chv_ring_init_context;
+ ro_data->num_items = ARRAY_SIZE(chv_ring_init_context);
+ } else if (IS_BROADWELL(dev)) {
+ ro_data->init_context = bdw_ring_init_context;
+ ro_data->num_items = ARRAY_SIZE(bdw_ring_init_context);
+ } else
+ return -EINVAL;
+
+ return 0;
+}
+
static int init_render_ring(struct intel_engine_cs *ring)
{
struct drm_device *dev = ring->dev;
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index 96479c8..f555505 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -317,6 +317,14 @@ struct intel_engine_cs {
u32 (*get_cmd_length_mask)(u32 cmd_header);
};
+struct intel_ring_context_rodata {
+ u32 num_items;
+ const u32 *init_context;
+};
+
+int ring_context_rodata(struct drm_device *dev,
+ struct intel_ring_context_rodata *ro_data);
+
bool intel_ring_initialized(struct intel_engine_cs *ring);
static inline unsigned
--
2.0.4
^ permalink raw reply related [flat|nested] 14+ messages in thread
* Re: [PATCH 4/4] drm/i915: Rework workaround data exporting to debugfs
2014-09-01 13:28 ` [PATCH 4/4] drm/i915: Rework workaround data exporting to debugfs Arun Siluvery
@ 2014-09-01 14:06 ` Damien Lespiau
2014-09-01 14:45 ` Siluvery, Arun
2014-09-02 9:15 ` [PATCH v2] " Arun Siluvery
1 sibling, 1 reply; 14+ messages in thread
From: Damien Lespiau @ 2014-09-01 14:06 UTC (permalink / raw)
To: Arun Siluvery; +Cc: intel-gfx
On Mon, Sep 01, 2014 at 02:28:53PM +0100, Arun Siluvery wrote:
> Now w/a are organized in an array so we know exactly how many of them
> are applied; use the same array while exporting data to debugfs and
> remove the temporary array we currently have in driver priv structure.
>
> Signed-off-by: Arun Siluvery <arun.siluvery@linux.intel.com>
> ---
> drivers/gpu/drm/i915/i915_debugfs.c | 41 +++++++++++++++++++++++----------
> drivers/gpu/drm/i915/i915_drv.h | 14 -----------
> drivers/gpu/drm/i915/intel_ringbuffer.c | 15 ++++++++++++
> drivers/gpu/drm/i915/intel_ringbuffer.h | 8 +++++++
> 4 files changed, 52 insertions(+), 26 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
> index 2727bda..bab0408 100644
> --- a/drivers/gpu/drm/i915/i915_debugfs.c
> +++ b/drivers/gpu/drm/i915/i915_debugfs.c
> @@ -2465,6 +2465,14 @@ static int i915_wa_registers(struct seq_file *m, void *unused)
> struct drm_info_node *node = (struct drm_info_node *) m->private;
> struct drm_device *dev = node->minor->dev;
> struct drm_i915_private *dev_priv = dev->dev_private;
> + struct intel_ring_context_rodata ro_data;
> +
> + ret = ring_context_rodata(dev, &ro_data);
> + if (ret) {
> + seq_printf(m, "Workarounds applied: 0\n");
seq_puts()
> + DRM_DEBUG_DRIVER("Workaround table not available !!\n");
> + return -EINVAL;
> + }
>
> ret = mutex_lock_interruptible(&dev->struct_mutex);
> if (ret)
> @@ -2472,18 +2480,27 @@ static int i915_wa_registers(struct seq_file *m, void *unused)
>
> intel_runtime_pm_get(dev_priv);
>
> - seq_printf(m, "Workarounds applied: %d\n", dev_priv->num_wa_regs);
> - for (i = 0; i < dev_priv->num_wa_regs; ++i) {
> - u32 addr, mask;
> -
> - addr = dev_priv->intel_wa_regs[i].addr;
> - mask = dev_priv->intel_wa_regs[i].mask;
> - dev_priv->intel_wa_regs[i].value = I915_READ(addr) | mask;
> - if (dev_priv->intel_wa_regs[i].addr)
> - seq_printf(m, "0x%X: 0x%08X, mask: 0x%08X\n",
> - dev_priv->intel_wa_regs[i].addr,
> - dev_priv->intel_wa_regs[i].value,
> - dev_priv->intel_wa_regs[i].mask);
> + seq_printf(m, "Workarounds applied: %d\n", ro_data.num_items/2);
> + for (i = 0; i < ro_data.num_items; i += 2) {
> + u32 addr, mask, value;
> +
> + addr = ro_data.init_context[i];
> + /*
> + * Most of workarounds are masked registers;
> + * to set a bit in lower 16-bits we set a mask bit in
> + * upper 16-bits so we can take either of them as mask but
> + * it doesn't work if the w/a is about clearing a bit so
> + * use upper 16-bits to cover both cases.
> + */
> + mask = ro_data.init_context[i+1] >> 16;
"Most" doesn't seem good here. Either it's "all" and we're happy, or we
need a generic way to describe the W/A (masked register or not). value +
mask is generic enough to code for both cases.
> +
> + /*
> + * value represents the status of other bits in the
> + * register besides w/a bits
> + */
> + value = I915_READ(addr) | mask;
> + seq_printf(m, "0x%X: 0x%08X, mask: 0x%08X\n",
> + addr, value, mask);
> }
I still don't get it. 'value' is supposed to be the reference value for
the W/A, but you're or'ing the mask here, so you treat the mask as if it
were the reference value. This won't work if the W/A is about setting
multi-bits fields or about clearing a bit.
The comment is still not clear enough. You're saying "other bits besides
the w/a bits", but or'ing the mask doesn't do that.
Why do we care about the "other bits" in the reference value? they don't
matter. Why use something else than (ro_data.init_context[i+1] & 0xffff)
for the value here (as long we're talking about masked registers)?
--
Damien
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH 4/4] drm/i915: Rework workaround data exporting to debugfs
2014-09-01 14:06 ` Damien Lespiau
@ 2014-09-01 14:45 ` Siluvery, Arun
2014-09-01 15:04 ` Damien Lespiau
0 siblings, 1 reply; 14+ messages in thread
From: Siluvery, Arun @ 2014-09-01 14:45 UTC (permalink / raw)
To: Damien Lespiau; +Cc: intel-gfx
On 01/09/2014 15:06, Damien Lespiau wrote:
> On Mon, Sep 01, 2014 at 02:28:53PM +0100, Arun Siluvery wrote:
>> Now w/a are organized in an array so we know exactly how many of them
>> are applied; use the same array while exporting data to debugfs and
>> remove the temporary array we currently have in driver priv structure.
>>
>> Signed-off-by: Arun Siluvery <arun.siluvery@linux.intel.com>
>> ---
>> drivers/gpu/drm/i915/i915_debugfs.c | 41 +++++++++++++++++++++++----------
>> drivers/gpu/drm/i915/i915_drv.h | 14 -----------
>> drivers/gpu/drm/i915/intel_ringbuffer.c | 15 ++++++++++++
>> drivers/gpu/drm/i915/intel_ringbuffer.h | 8 +++++++
>> 4 files changed, 52 insertions(+), 26 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
>> index 2727bda..bab0408 100644
>> --- a/drivers/gpu/drm/i915/i915_debugfs.c
>> +++ b/drivers/gpu/drm/i915/i915_debugfs.c
>> @@ -2465,6 +2465,14 @@ static int i915_wa_registers(struct seq_file *m, void *unused)
>> struct drm_info_node *node = (struct drm_info_node *) m->private;
>> struct drm_device *dev = node->minor->dev;
>> struct drm_i915_private *dev_priv = dev->dev_private;
>> + struct intel_ring_context_rodata ro_data;
>> +
>> + ret = ring_context_rodata(dev, &ro_data);
>> + if (ret) {
>> + seq_printf(m, "Workarounds applied: 0\n");
>
> seq_puts()
>
>> + DRM_DEBUG_DRIVER("Workaround table not available !!\n");
>> + return -EINVAL;
>> + }
>>
>> ret = mutex_lock_interruptible(&dev->struct_mutex);
>> if (ret)
>> @@ -2472,18 +2480,27 @@ static int i915_wa_registers(struct seq_file *m, void *unused)
>>
>> intel_runtime_pm_get(dev_priv);
>>
>> - seq_printf(m, "Workarounds applied: %d\n", dev_priv->num_wa_regs);
>> - for (i = 0; i < dev_priv->num_wa_regs; ++i) {
>> - u32 addr, mask;
>> -
>> - addr = dev_priv->intel_wa_regs[i].addr;
>> - mask = dev_priv->intel_wa_regs[i].mask;
>> - dev_priv->intel_wa_regs[i].value = I915_READ(addr) | mask;
>> - if (dev_priv->intel_wa_regs[i].addr)
>> - seq_printf(m, "0x%X: 0x%08X, mask: 0x%08X\n",
>> - dev_priv->intel_wa_regs[i].addr,
>> - dev_priv->intel_wa_regs[i].value,
>> - dev_priv->intel_wa_regs[i].mask);
>> + seq_printf(m, "Workarounds applied: %d\n", ro_data.num_items/2);
>> + for (i = 0; i < ro_data.num_items; i += 2) {
>> + u32 addr, mask, value;
>> +
>> + addr = ro_data.init_context[i];
>> + /*
>> + * Most of workarounds are masked registers;
>> + * to set a bit in lower 16-bits we set a mask bit in
>> + * upper 16-bits so we can take either of them as mask but
>> + * it doesn't work if the w/a is about clearing a bit so
>> + * use upper 16-bits to cover both cases.
>> + */
>> + mask = ro_data.init_context[i+1] >> 16;
>
> "Most" doesn't seem good here. Either it's "all" and we're happy, or we
> need a generic way to describe the W/A (masked register or not). value +
> mask is generic enough to code for both cases.
>
It seems some of them could be unmasked registers.
We can use 'mask' itself to determine whether it is a masked/unmasked
register. mask == 0 if it is an unmasked register.
>> +
>> + /*
>> + * value represents the status of other bits in the
>> + * register besides w/a bits
>> + */
>> + value = I915_READ(addr) | mask;
>> + seq_printf(m, "0x%X: 0x%08X, mask: 0x%08X\n",
>> + addr, value, mask);
>> }
>
> I still don't get it. 'value' is supposed to be the reference value for
> the W/A, but you're or'ing the mask here, so you treat the mask as if it
> were the reference value. This won't work if the W/A is about setting
> multi-bits fields or about clearing a bit.
>
> The comment is still not clear enough. You're saying "other bits besides
> the w/a bits", but or'ing the mask doesn't do that.
>
> Why do we care about the "other bits" in the reference value? they don't
> matter. Why use something else than (ro_data.init_context[i+1] & 0xffff)
> for the value here (as long we're talking about masked registers)?
>
I have always considered value as the register value (remaining bits of
the register and w/a bits) and now I see your point.
Yes lower 16-bits can be used as reference value, depending on whether
it is a masked/unmasked we can use/not use the mask in conjunction with
value in the test.
regards
Arun
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH 4/4] drm/i915: Rework workaround data exporting to debugfs
2014-09-01 14:45 ` Siluvery, Arun
@ 2014-09-01 15:04 ` Damien Lespiau
0 siblings, 0 replies; 14+ messages in thread
From: Damien Lespiau @ 2014-09-01 15:04 UTC (permalink / raw)
To: Siluvery, Arun; +Cc: intel-gfx
On Mon, Sep 01, 2014 at 03:45:29PM +0100, Siluvery, Arun wrote:
> >>+ /*
> >>+ * Most of workarounds are masked registers;
> >>+ * to set a bit in lower 16-bits we set a mask bit in
> >>+ * upper 16-bits so we can take either of them as mask but
> >>+ * it doesn't work if the w/a is about clearing a bit so
> >>+ * use upper 16-bits to cover both cases.
> >>+ */
> >>+ mask = ro_data.init_context[i+1] >> 16;
> >
> >"Most" doesn't seem good here. Either it's "all" and we're happy, or we
> >need a generic way to describe the W/A (masked register or not). value +
> >mask is generic enough to code for both cases.
> >
> It seems some of them could be unmasked registers.
> We can use 'mask' itself to determine whether it is a
> masked/unmasked register. mask == 0 if it is an unmasked register.
I don't think we can use mask == 0 when it's an unmasked register. In
this case, we still need to be able to have a mask that tells us which
are the interesting bits in the value. If we want something generic that
can be applied to masked and unmasked register, we'll something like:
struct reg_wa {
unsigned int masked_register : 1;
u32 addr;
u32 mask;
u32 value;
};
So:
1. masked register case:
- masked_register = 1
- addr
- mask (selects the lower 16 bits that are coding for the W/A value)
- value (it only makes sense to some of the lower 16 bits set here)
2. 'normal' register case
- masked_register = 0
- addr
- mask
- value
>From that generic description you can cover all cases, eg.
1. the write for masked registers becomes: mask << 16 | value
2. the write for normal registers becomes: (read(addr) & ~mask) | value
2. check a W/A has been applied (both normal and masked register):
read(addr) & mask == value
We seem to have only masked registers atm, so it's fine to handle just
that case, but if you envision that we'll need more than masked
registers, we need a better W/A definition.
--
Damien
^ permalink raw reply [flat|nested] 14+ messages in thread
* [PATCH v2] drm/i915: Rework workaround init functions for BDW and CHV
2014-09-01 13:28 ` [PATCH 1/4] drm/i915: Rework workaround init functions " Arun Siluvery
@ 2014-09-02 9:14 ` Arun Siluvery
2014-09-02 9:45 ` Damien Lespiau
0 siblings, 1 reply; 14+ messages in thread
From: Arun Siluvery @ 2014-09-02 9:14 UTC (permalink / raw)
To: intel-gfx
This rework is based on suggestion from Chris.
Now w/a are organized in an array and all of them are emitted in
single fn instead of sending them individually. This approach is
clean and new w/a can be added with minimal changes.
The same array can be used when exporting them to debugfs and the
temporary array in the current implementation is not required.
v2: As suggested by Damien a new w/a reg definition is added.
This helps in initializing w/a that are unmasked registers.
Signed-off-by: Arun Siluvery <arun.siluvery@linux.intel.com>
---
drivers/gpu/drm/i915/intel_ringbuffer.c | 158 ++++++++++++++------------------
drivers/gpu/drm/i915/intel_ringbuffer.h | 26 ++++++
2 files changed, 93 insertions(+), 91 deletions(-)
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index c5e4dc7..3ed0ad5 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -650,87 +650,80 @@ err:
return ret;
}
-static inline void intel_ring_emit_wa(struct intel_engine_cs *ring,
- u32 addr, u32 value)
+static int i915_request_emit_lri(struct intel_engine_cs *ring,
+ int num_registers,
+ const struct intel_wa_reg *lri_list)
{
- struct drm_device *dev = ring->dev;
- struct drm_i915_private *dev_priv = dev->dev_private;
+ int i;
+ int ret;
+ struct drm_i915_private *dev_priv = ring->dev->dev_private;
- if (dev_priv->num_wa_regs > I915_MAX_WA_REGS)
- return;
+ ret = intel_ring_begin(ring, (2 * num_registers + 1));
+ if (ret)
+ return ret;
- intel_ring_emit(ring, MI_LOAD_REGISTER_IMM(1));
- intel_ring_emit(ring, addr);
- intel_ring_emit(ring, value);
+ intel_ring_emit(ring, MI_LOAD_REGISTER_IMM(num_registers));
+ for (i = 0; i < num_registers; ++i) {
+ u32 value;
+ const struct intel_wa_reg *p = (lri_list + i);
- dev_priv->intel_wa_regs[dev_priv->num_wa_regs].addr = addr;
- dev_priv->intel_wa_regs[dev_priv->num_wa_regs].mask = (value) & 0xFFFF;
- /* value is updated with the status of remaining bits of this
- * register when it is read from debugfs file
- */
- dev_priv->intel_wa_regs[dev_priv->num_wa_regs].value = value;
- dev_priv->num_wa_regs++;
+ if (p->masked_register)
+ value = (p->mask << 16) | p->value;
+ else
+ value = (I915_READ(p->addr) & ~p->mask) | p->value;
+ intel_ring_emit(ring, p->addr);
+ intel_ring_emit(ring, value);
+ }
- return;
+ intel_ring_emit(ring, MI_NOOP);
+ intel_ring_advance(ring);
+
+ return 0;
}
-static int bdw8_init_workarounds(struct intel_engine_cs *ring)
-{
- int ret;
- struct drm_device *dev = ring->dev;
- struct drm_i915_private *dev_priv = dev->dev_private;
+static const struct intel_wa_reg bdw_ring_init_context[] = {
/*
* workarounds applied in this fn are part of register state context,
* they need to be re-initialized followed by gpu reset, suspend/resume,
* module reload.
*/
- dev_priv->num_wa_regs = 0;
- memset(dev_priv->intel_wa_regs, 0, sizeof(dev_priv->intel_wa_regs));
-
- /*
- * update the number of dwords required based on the
- * actual number of workarounds applied
- */
- ret = intel_ring_begin(ring, 24);
- if (ret)
- return ret;
/* WaDisablePartialInstShootdown:bdw */
/* WaDisableThreadStallDopClockGating:bdw */
/* FIXME: Unclear whether we really need this on production bdw. */
- intel_ring_emit_wa(ring, GEN8_ROW_CHICKEN,
- _MASKED_BIT_ENABLE(PARTIAL_INSTRUCTION_SHOOTDOWN_DISABLE
- | STALL_DOP_GATING_DISABLE));
+ INIT_MASKED_WA(GEN8_ROW_CHICKEN,
+ _MASKED_BIT_ENABLE(PARTIAL_INSTRUCTION_SHOOTDOWN_DISABLE) |
+ _MASKED_BIT_ENABLE(STALL_DOP_GATING_DISABLE)),
/* WaDisableDopClockGating:bdw May not be needed for production */
- intel_ring_emit_wa(ring, GEN7_ROW_CHICKEN2,
- _MASKED_BIT_ENABLE(DOP_CLOCK_GATING_DISABLE));
+ INIT_MASKED_WA(GEN7_ROW_CHICKEN2,
+ _MASKED_BIT_ENABLE(DOP_CLOCK_GATING_DISABLE)),
/*
* This GEN8_CENTROID_PIXEL_OPT_DIS W/A is only needed for
* pre-production hardware
*/
- intel_ring_emit_wa(ring, HALF_SLICE_CHICKEN3,
- _MASKED_BIT_ENABLE(GEN8_CENTROID_PIXEL_OPT_DIS
- | GEN8_SAMPLER_POWER_BYPASS_DIS));
+ INIT_MASKED_WA(HALF_SLICE_CHICKEN3,
+ _MASKED_BIT_ENABLE(GEN8_CENTROID_PIXEL_OPT_DIS) |
+ _MASKED_BIT_ENABLE(GEN8_SAMPLER_POWER_BYPASS_DIS)),
- intel_ring_emit_wa(ring, GEN7_HALF_SLICE_CHICKEN1,
- _MASKED_BIT_ENABLE(GEN7_SINGLE_SUBSCAN_DISPATCH_ENABLE));
+ INIT_MASKED_WA(GEN7_HALF_SLICE_CHICKEN1,
+ _MASKED_BIT_ENABLE(GEN7_SINGLE_SUBSCAN_DISPATCH_ENABLE)),
- intel_ring_emit_wa(ring, COMMON_SLICE_CHICKEN2,
- _MASKED_BIT_ENABLE(GEN8_CSC2_SBE_VUE_CACHE_CONSERVATIVE));
+ INIT_MASKED_WA(COMMON_SLICE_CHICKEN2,
+ _MASKED_BIT_ENABLE(GEN8_CSC2_SBE_VUE_CACHE_CONSERVATIVE)),
/* Use Force Non-Coherent whenever executing a 3D context. This is a
* workaround for for a possible hang in the unlikely event a TLB
* invalidation occurs during a PSD flush.
*/
- intel_ring_emit_wa(ring, HDC_CHICKEN0,
- _MASKED_BIT_ENABLE(HDC_FORCE_NON_COHERENT));
+ INIT_MASKED_WA(HDC_CHICKEN0,
+ _MASKED_BIT_ENABLE(HDC_FORCE_NON_COHERENT)),
/* Wa4x4STCOptimizationDisable:bdw */
- intel_ring_emit_wa(ring, CACHE_MODE_1,
- _MASKED_BIT_ENABLE(GEN8_4x4_STC_OPTIMIZATION_DISABLE));
+ INIT_MASKED_WA(CACHE_MODE_1,
+ _MASKED_BIT_ENABLE(GEN8_4x4_STC_OPTIMIZATION_DISABLE)),
/*
* BSpec recommends 8x4 when MSAA is used,
@@ -740,54 +733,40 @@ static int bdw8_init_workarounds(struct intel_engine_cs *ring)
* disable bit, which we don't touch here, but it's good
* to keep in mind (see 3DSTATE_PS and 3DSTATE_WM).
*/
- intel_ring_emit_wa(ring, GEN7_GT_MODE,
- GEN6_WIZ_HASHING_MASK | GEN6_WIZ_HASHING_16x4);
-
- intel_ring_advance(ring);
-
- DRM_DEBUG_DRIVER("Number of Workarounds applied: %d\n",
- dev_priv->num_wa_regs);
-
- return 0;
-}
-
-static int chv_init_workarounds(struct intel_engine_cs *ring)
-{
- int ret;
- struct drm_device *dev = ring->dev;
- struct drm_i915_private *dev_priv = dev->dev_private;
-
- /*
- * workarounds applied in this fn are part of register state context,
- * they need to be re-initialized followed by gpu reset, suspend/resume,
- * module reload.
- */
- dev_priv->num_wa_regs = 0;
- memset(dev_priv->intel_wa_regs, 0, sizeof(dev_priv->intel_wa_regs));
-
- ret = intel_ring_begin(ring, 12);
- if (ret)
- return ret;
+ INIT_MASKED_WA(GEN7_GT_MODE,
+ (GEN6_WIZ_HASHING_MASK | GEN6_WIZ_HASHING_16x4)),
+};
+static const struct intel_wa_reg chv_ring_init_context[] = {
/* WaDisablePartialInstShootdown:chv */
- intel_ring_emit_wa(ring, GEN8_ROW_CHICKEN,
- _MASKED_BIT_ENABLE(PARTIAL_INSTRUCTION_SHOOTDOWN_DISABLE));
-
/* WaDisableThreadStallDopClockGating:chv */
- intel_ring_emit_wa(ring, GEN8_ROW_CHICKEN,
- _MASKED_BIT_ENABLE(STALL_DOP_GATING_DISABLE));
+ INIT_MASKED_WA(GEN8_ROW_CHICKEN,
+ _MASKED_BIT_ENABLE(PARTIAL_INSTRUCTION_SHOOTDOWN_DISABLE) |
+ _MASKED_BIT_ENABLE(STALL_DOP_GATING_DISABLE)),
/* WaDisableDopClockGating:chv (pre-production hw) */
- intel_ring_emit_wa(ring, GEN7_ROW_CHICKEN2,
- _MASKED_BIT_ENABLE(DOP_CLOCK_GATING_DISABLE));
+ INIT_MASKED_WA(GEN7_ROW_CHICKEN2,
+ _MASKED_BIT_ENABLE(DOP_CLOCK_GATING_DISABLE)),
/* WaDisableSamplerPowerBypass:chv (pre-production hw) */
- intel_ring_emit_wa(ring, HALF_SLICE_CHICKEN3,
- _MASKED_BIT_ENABLE(GEN8_SAMPLER_POWER_BYPASS_DIS));
+ INIT_MASKED_WA(HALF_SLICE_CHICKEN3,
+ _MASKED_BIT_ENABLE(GEN8_SAMPLER_POWER_BYPASS_DIS)),
+};
- intel_ring_advance(ring);
+static int ring_init_context(struct intel_engine_cs *ring)
+{
+ int ret = -EINVAL;
- return 0;
+ if (IS_CHERRYVIEW(ring->dev))
+ ret = i915_request_emit_lri(ring,
+ ARRAY_SIZE(chv_ring_init_context),
+ chv_ring_init_context);
+ else
+ ret = i915_request_emit_lri(ring,
+ ARRAY_SIZE(bdw_ring_init_context),
+ bdw_ring_init_context);
+
+ return ret;
}
static int init_render_ring(struct intel_engine_cs *ring)
@@ -2275,10 +2254,7 @@ int intel_init_render_ring_buffer(struct drm_device *dev)
dev_priv->semaphore_obj = obj;
}
}
- if (IS_CHERRYVIEW(dev))
- ring->init_context = chv_init_workarounds;
- else
- ring->init_context = bdw_init_workarounds;
+ ring->init_context = ring_init_context;
ring->add_request = gen6_add_request;
ring->flush = gen8_render_ring_flush;
ring->irq_get = gen8_ring_get_irq;
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index 96479c8..c9ed06c 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -317,6 +317,32 @@ struct intel_engine_cs {
u32 (*get_cmd_length_mask)(u32 cmd_header);
};
+/*
+ * Workaround register definition
+ * mask: bits correponding to the w/a
+ * value: represents reference values of w/a bits
+ */
+struct intel_wa_reg {
+ u32 masked_register : 1;
+ u32 addr;
+ u32 mask;
+ u32 value;
+};
+
+#define INIT_WA_REG(_masked, _addr, _mask, _value) \
+{ \
+ .masked_register = _masked, \
+ .addr = _addr, \
+ .mask = _mask, \
+ .value = _value,\
+}
+
+#define INIT_MASKED_WA(_addr, _value) \
+ INIT_WA_REG(1, _addr, ((_value) >> 16), ((_value) & 0xFFFF))
+
+#define INIT_UNMASKED_WA(_addr, _value) \
+ INIT_WA_REG(0, _addr, ~(_value), _value)
+
bool intel_ring_initialized(struct intel_engine_cs *ring);
static inline unsigned
--
2.0.4
^ permalink raw reply related [flat|nested] 14+ messages in thread
* [PATCH v2] drm/i915: Rework workaround data exporting to debugfs
2014-09-01 13:28 ` [PATCH 4/4] drm/i915: Rework workaround data exporting to debugfs Arun Siluvery
2014-09-01 14:06 ` Damien Lespiau
@ 2014-09-02 9:15 ` Arun Siluvery
2014-09-02 9:56 ` Daniel Vetter
1 sibling, 1 reply; 14+ messages in thread
From: Arun Siluvery @ 2014-09-02 9:15 UTC (permalink / raw)
To: intel-gfx
Now w/a are organized in an array so we know exactly how many of them
are applied; use the same array while exporting data to debugfs and
remove the temporary array we currently have in driver priv structure.
Signed-off-by: Arun Siluvery <arun.siluvery@linux.intel.com>
---
drivers/gpu/drm/i915/i915_debugfs.c | 29 +++++++++++++++++------------
drivers/gpu/drm/i915/i915_drv.h | 14 --------------
drivers/gpu/drm/i915/intel_ringbuffer.c | 15 +++++++++++++++
drivers/gpu/drm/i915/intel_ringbuffer.h | 8 ++++++++
4 files changed, 40 insertions(+), 26 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index 2727bda..0c1e294 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -2465,6 +2465,14 @@ static int i915_wa_registers(struct seq_file *m, void *unused)
struct drm_info_node *node = (struct drm_info_node *) m->private;
struct drm_device *dev = node->minor->dev;
struct drm_i915_private *dev_priv = dev->dev_private;
+ struct intel_ring_context_rodata ro_data;
+
+ ret = ring_context_rodata(dev, &ro_data);
+ if (ret) {
+ seq_puts(m, "Workarounds applied: 0\n");
+ DRM_DEBUG_DRIVER("Workaround table not available !!\n");
+ return -EINVAL;
+ }
ret = mutex_lock_interruptible(&dev->struct_mutex);
if (ret)
@@ -2472,18 +2480,15 @@ static int i915_wa_registers(struct seq_file *m, void *unused)
intel_runtime_pm_get(dev_priv);
- seq_printf(m, "Workarounds applied: %d\n", dev_priv->num_wa_regs);
- for (i = 0; i < dev_priv->num_wa_regs; ++i) {
- u32 addr, mask;
-
- addr = dev_priv->intel_wa_regs[i].addr;
- mask = dev_priv->intel_wa_regs[i].mask;
- dev_priv->intel_wa_regs[i].value = I915_READ(addr) | mask;
- if (dev_priv->intel_wa_regs[i].addr)
- seq_printf(m, "0x%X: 0x%08X, mask: 0x%08X\n",
- dev_priv->intel_wa_regs[i].addr,
- dev_priv->intel_wa_regs[i].value,
- dev_priv->intel_wa_regs[i].mask);
+ seq_printf(m, "Workarounds applied: %d\n", ro_data.num_items);
+ for (i = 0; i < ro_data.num_items; ++i) {
+ u32 addr, mask, value;
+
+ addr = ro_data.init_context[i].addr;
+ mask = ro_data.init_context[i].mask;
+ value = ro_data.init_context[i].value;
+ seq_printf(m, "0x%X: 0x%08X, mask: 0x%08X\n",
+ addr, value, mask);
}
intel_runtime_pm_put(dev_priv);
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 49b7be7..bcf79f0 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1553,20 +1553,6 @@ struct drm_i915_private {
struct intel_shared_dpll shared_dplls[I915_NUM_PLLS];
int dpio_phy_iosf_port[I915_NUM_PHYS_VLV];
- /*
- * workarounds are currently applied at different places and
- * changes are being done to consolidate them so exact count is
- * not clear at this point, use a max value for now.
- */
-#define I915_MAX_WA_REGS 16
- struct {
- u32 addr;
- u32 value;
- /* bitmask representing WA bits */
- u32 mask;
- } intel_wa_regs[I915_MAX_WA_REGS];
- u32 num_wa_regs;
-
/* Reclocking support */
bool render_reclock_avail;
bool lvds_downclock_avail;
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 3ed0ad5..7262c10 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -769,6 +769,21 @@ static int ring_init_context(struct intel_engine_cs *ring)
return ret;
}
+int ring_context_rodata(struct drm_device *dev,
+ struct intel_ring_context_rodata *ro_data)
+{
+ if (IS_CHERRYVIEW(dev)) {
+ ro_data->init_context = chv_ring_init_context;
+ ro_data->num_items = ARRAY_SIZE(chv_ring_init_context);
+ } else if (IS_BROADWELL(dev)) {
+ ro_data->init_context = bdw_ring_init_context;
+ ro_data->num_items = ARRAY_SIZE(bdw_ring_init_context);
+ } else
+ return -EINVAL;
+
+ return 0;
+}
+
static int init_render_ring(struct intel_engine_cs *ring)
{
struct drm_device *dev = ring->dev;
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index c9ed06c..33454ce 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -343,6 +343,14 @@ struct intel_wa_reg {
#define INIT_UNMASKED_WA(_addr, _value) \
INIT_WA_REG(0, _addr, ~(_value), _value)
+struct intel_ring_context_rodata {
+ u32 num_items;
+ const struct intel_wa_reg *init_context;
+};
+
+int ring_context_rodata(struct drm_device *dev,
+ struct intel_ring_context_rodata *ro_data);
+
bool intel_ring_initialized(struct intel_engine_cs *ring);
static inline unsigned
--
2.0.4
^ permalink raw reply related [flat|nested] 14+ messages in thread
* Re: [PATCH v2] drm/i915: Rework workaround init functions for BDW and CHV
2014-09-02 9:14 ` [PATCH v2] " Arun Siluvery
@ 2014-09-02 9:45 ` Damien Lespiau
2014-09-02 9:49 ` Damien Lespiau
2014-09-02 9:51 ` Chris Wilson
0 siblings, 2 replies; 14+ messages in thread
From: Damien Lespiau @ 2014-09-02 9:45 UTC (permalink / raw)
To: Arun Siluvery; +Cc: intel-gfx
On Tue, Sep 02, 2014 at 10:14:17AM +0100, Arun Siluvery wrote:
> This rework is based on suggestion from Chris.
> Now w/a are organized in an array and all of them are emitted in
> single fn instead of sending them individually. This approach is
> clean and new w/a can be added with minimal changes.
> The same array can be used when exporting them to debugfs and the
> temporary array in the current implementation is not required.
>
> v2: As suggested by Damien a new w/a reg definition is added.
> This helps in initializing w/a that are unmasked registers.
>
> Signed-off-by: Arun Siluvery <arun.siluvery@linux.intel.com>
> ---
> drivers/gpu/drm/i915/intel_ringbuffer.c | 158 ++++++++++++++------------------
> drivers/gpu/drm/i915/intel_ringbuffer.h | 26 ++++++
> 2 files changed, 93 insertions(+), 91 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
> index c5e4dc7..3ed0ad5 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> @@ -650,87 +650,80 @@ err:
> return ret;
> }
>
> -static inline void intel_ring_emit_wa(struct intel_engine_cs *ring,
> - u32 addr, u32 value)
> +static int i915_request_emit_lri(struct intel_engine_cs *ring,
> + int num_registers,
> + const struct intel_wa_reg *lri_list)
> {
[...]
> + for (i = 0; i < num_registers; ++i) {
> + u32 value;
> + const struct intel_wa_reg *p = (lri_list + i);
>
> + if (p->masked_register)
> + value = (p->mask << 16) | p->value;
> + else
> + value = (I915_READ(p->addr) & ~p->mask) | p->value;
There's one case when this won't work, when several WAs for a single
'normal' register are defined. The read done here means only the last of
those W/As will end up being applied (because the last LRI to that
register will be the value that ends up in the register. We'll probably
need to coalesce all W/A defined for a single normal register into one
write.
Considering that, and the comment about the define, maybe the simplest
way to go forward with the patch is to forget the non-masked case for
the moment, especially as it's not used in this patch.
> +#define INIT_MASKED_WA(_addr, _value) \
> + INIT_WA_REG(1, _addr, ((_value) >> 16), ((_value) & 0xFFFF))
Those outer '()' look unnecessary
> +#define INIT_UNMASKED_WA(_addr, _value) \
> + INIT_WA_REG(0, _addr, ~(_value), _value)
> +
~value is not a mask. Consider values with one or more bits defined to 0.
--
Damien
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH v2] drm/i915: Rework workaround init functions for BDW and CHV
2014-09-02 9:45 ` Damien Lespiau
@ 2014-09-02 9:49 ` Damien Lespiau
2014-09-02 9:51 ` Chris Wilson
1 sibling, 0 replies; 14+ messages in thread
From: Damien Lespiau @ 2014-09-02 9:49 UTC (permalink / raw)
To: Arun Siluvery; +Cc: intel-gfx
On Tue, Sep 02, 2014 at 10:45:45AM +0100, Damien Lespiau wrote:
> There's one case when this won't work, when several WAs for a single
> 'normal' register are defined. The read done here means only the last of
> those W/As will end up being applied (because the last LRI to that
> register will be the value that ends up in the register. We'll probably
> need to coalesce all W/A defined for a single normal register into one
> write.
To more correct/clear, it's not the read that's the problem, in the case
where a normal register has several W/A, the last write will override
the previous ones.
--
Damien
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH v2] drm/i915: Rework workaround init functions for BDW and CHV
2014-09-02 9:45 ` Damien Lespiau
2014-09-02 9:49 ` Damien Lespiau
@ 2014-09-02 9:51 ` Chris Wilson
1 sibling, 0 replies; 14+ messages in thread
From: Chris Wilson @ 2014-09-02 9:51 UTC (permalink / raw)
To: Damien Lespiau; +Cc: intel-gfx
On Tue, Sep 02, 2014 at 10:45:45AM +0100, Damien Lespiau wrote:
> On Tue, Sep 02, 2014 at 10:14:17AM +0100, Arun Siluvery wrote:
> > This rework is based on suggestion from Chris.
> > Now w/a are organized in an array and all of them are emitted in
> > single fn instead of sending them individually. This approach is
> > clean and new w/a can be added with minimal changes.
> > The same array can be used when exporting them to debugfs and the
> > temporary array in the current implementation is not required.
> >
> > v2: As suggested by Damien a new w/a reg definition is added.
> > This helps in initializing w/a that are unmasked registers.
> >
> > Signed-off-by: Arun Siluvery <arun.siluvery@linux.intel.com>
> > ---
> > drivers/gpu/drm/i915/intel_ringbuffer.c | 158 ++++++++++++++------------------
> > drivers/gpu/drm/i915/intel_ringbuffer.h | 26 ++++++
> > 2 files changed, 93 insertions(+), 91 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
> > index c5e4dc7..3ed0ad5 100644
> > --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> > +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> > @@ -650,87 +650,80 @@ err:
> > return ret;
> > }
> >
> > -static inline void intel_ring_emit_wa(struct intel_engine_cs *ring,
> > - u32 addr, u32 value)
> > +static int i915_request_emit_lri(struct intel_engine_cs *ring,
> > + int num_registers,
> > + const struct intel_wa_reg *lri_list)
> > {
>
> [...]
>
> > + for (i = 0; i < num_registers; ++i) {
> > + u32 value;
> > + const struct intel_wa_reg *p = (lri_list + i);
> >
> > + if (p->masked_register)
> > + value = (p->mask << 16) | p->value;
> > + else
> > + value = (I915_READ(p->addr) & ~p->mask) | p->value;
>
> There's one case when this won't work, when several WAs for a single
> 'normal' register are defined. The read done here means only the last of
> those W/As will end up being applied (because the last LRI to that
> register will be the value that ends up in the register. We'll probably
> need to coalesce all W/A defined for a single normal register into one
> write.
>
> Considering that, and the comment about the define, maybe the simplest
> way to go forward with the patch is to forget the non-masked case for
> the moment, especially as it's not used in this patch.
If you are still considering an in-kernel table, despite that it can be
done in userspace, design it to track all w/a and not just the ring
specific fixups.
-Chris
--
Chris Wilson, Intel Open Source Technology Centre
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH v2] drm/i915: Rework workaround data exporting to debugfs
2014-09-02 9:15 ` [PATCH v2] " Arun Siluvery
@ 2014-09-02 9:56 ` Daniel Vetter
0 siblings, 0 replies; 14+ messages in thread
From: Daniel Vetter @ 2014-09-02 9:56 UTC (permalink / raw)
To: Arun Siluvery; +Cc: intel-gfx
On Tue, Sep 02, 2014 at 10:15:31AM +0100, Arun Siluvery wrote:
> Now w/a are organized in an array so we know exactly how many of them
> are applied; use the same array while exporting data to debugfs and
> remove the temporary array we currently have in driver priv structure.
>
> Signed-off-by: Arun Siluvery <arun.siluvery@linux.intel.com>
> ---
> drivers/gpu/drm/i915/i915_debugfs.c | 29 +++++++++++++++++------------
> drivers/gpu/drm/i915/i915_drv.h | 14 --------------
> drivers/gpu/drm/i915/intel_ringbuffer.c | 15 +++++++++++++++
> drivers/gpu/drm/i915/intel_ringbuffer.h | 8 ++++++++
> 4 files changed, 40 insertions(+), 26 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
> index 2727bda..0c1e294 100644
> --- a/drivers/gpu/drm/i915/i915_debugfs.c
> +++ b/drivers/gpu/drm/i915/i915_debugfs.c
> @@ -2465,6 +2465,14 @@ static int i915_wa_registers(struct seq_file *m, void *unused)
> struct drm_info_node *node = (struct drm_info_node *) m->private;
> struct drm_device *dev = node->minor->dev;
> struct drm_i915_private *dev_priv = dev->dev_private;
> + struct intel_ring_context_rodata ro_data;
> +
> + ret = ring_context_rodata(dev, &ro_data);
> + if (ret) {
> + seq_puts(m, "Workarounds applied: 0\n");
> + DRM_DEBUG_DRIVER("Workaround table not available !!\n");
> + return -EINVAL;
> + }
>
> ret = mutex_lock_interruptible(&dev->struct_mutex);
> if (ret)
> @@ -2472,18 +2480,15 @@ static int i915_wa_registers(struct seq_file *m, void *unused)
>
> intel_runtime_pm_get(dev_priv);
>
> - seq_printf(m, "Workarounds applied: %d\n", dev_priv->num_wa_regs);
> - for (i = 0; i < dev_priv->num_wa_regs; ++i) {
> - u32 addr, mask;
> -
> - addr = dev_priv->intel_wa_regs[i].addr;
> - mask = dev_priv->intel_wa_regs[i].mask;
> - dev_priv->intel_wa_regs[i].value = I915_READ(addr) | mask;
> - if (dev_priv->intel_wa_regs[i].addr)
> - seq_printf(m, "0x%X: 0x%08X, mask: 0x%08X\n",
> - dev_priv->intel_wa_regs[i].addr,
> - dev_priv->intel_wa_regs[i].value,
> - dev_priv->intel_wa_regs[i].mask);
> + seq_printf(m, "Workarounds applied: %d\n", ro_data.num_items);
> + for (i = 0; i < ro_data.num_items; ++i) {
> + u32 addr, mask, value;
> +
> + addr = ro_data.init_context[i].addr;
> + mask = ro_data.init_context[i].mask;
> + value = ro_data.init_context[i].value;
> + seq_printf(m, "0x%X: 0x%08X, mask: 0x%08X\n",
> + addr, value, mask);
> }
>
> intel_runtime_pm_put(dev_priv);
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index 49b7be7..bcf79f0 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -1553,20 +1553,6 @@ struct drm_i915_private {
> struct intel_shared_dpll shared_dplls[I915_NUM_PLLS];
> int dpio_phy_iosf_port[I915_NUM_PHYS_VLV];
>
> - /*
> - * workarounds are currently applied at different places and
> - * changes are being done to consolidate them so exact count is
> - * not clear at this point, use a max value for now.
> - */
> -#define I915_MAX_WA_REGS 16
> - struct {
> - u32 addr;
> - u32 value;
> - /* bitmask representing WA bits */
> - u32 mask;
> - } intel_wa_regs[I915_MAX_WA_REGS];
> - u32 num_wa_regs;
> -
> /* Reclocking support */
> bool render_reclock_avail;
> bool lvds_downclock_avail;
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
> index 3ed0ad5..7262c10 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> @@ -769,6 +769,21 @@ static int ring_init_context(struct intel_engine_cs *ring)
> return ret;
> }
>
> +int ring_context_rodata(struct drm_device *dev,
> + struct intel_ring_context_rodata *ro_data)
> +{
> + if (IS_CHERRYVIEW(dev)) {
> + ro_data->init_context = chv_ring_init_context;
> + ro_data->num_items = ARRAY_SIZE(chv_ring_init_context);
> + } else if (IS_BROADWELL(dev)) {
> + ro_data->init_context = bdw_ring_init_context;
> + ro_data->num_items = ARRAY_SIZE(bdw_ring_init_context);
This will kinda break my idea that we could put _all_ static register
w/a into these schem, so also everything that's in the various init_clock
gating functions.
The goal of the test is after all to make sure that we set the w/a bits in
the right place, so if you only check the w/a emitted in the context init
(and this change here kinda bakes this in) then it will not be really
useful.
Maybe you should (just as a prove of concept of these refactorings)
convert the chv or bdw w/a in the init_clock_gating to this infrastructure
too?
Thanks, Daniel
> + } else
> + return -EINVAL;
> +
> + return 0;
> +}
> +
> static int init_render_ring(struct intel_engine_cs *ring)
> {
> struct drm_device *dev = ring->dev;
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
> index c9ed06c..33454ce 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.h
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
> @@ -343,6 +343,14 @@ struct intel_wa_reg {
> #define INIT_UNMASKED_WA(_addr, _value) \
> INIT_WA_REG(0, _addr, ~(_value), _value)
>
> +struct intel_ring_context_rodata {
> + u32 num_items;
> + const struct intel_wa_reg *init_context;
> +};
> +
> +int ring_context_rodata(struct drm_device *dev,
> + struct intel_ring_context_rodata *ro_data);
> +
> bool intel_ring_initialized(struct intel_engine_cs *ring);
>
> static inline unsigned
> --
> 2.0.4
>
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx
--
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
^ permalink raw reply [flat|nested] 14+ messages in thread
end of thread, other threads:[~2014-09-02 9:56 UTC | newest]
Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-09-01 13:28 [PATCH 0/4] Rework workaround init fn for BDW and CHV Arun Siluvery
2014-09-01 13:28 ` [PATCH 1/4] drm/i915: Rework workaround init functions " Arun Siluvery
2014-09-02 9:14 ` [PATCH v2] " Arun Siluvery
2014-09-02 9:45 ` Damien Lespiau
2014-09-02 9:49 ` Damien Lespiau
2014-09-02 9:51 ` Chris Wilson
2014-09-01 13:28 ` [PATCH 2/4] drm/i915: Rename intel_wa_registers with a i915_ prefix Arun Siluvery
2014-09-01 13:28 ` [PATCH 3/4] drm/i915: Don't restrict i915_wa_registers to BDW Arun Siluvery
2014-09-01 13:28 ` [PATCH 4/4] drm/i915: Rework workaround data exporting to debugfs Arun Siluvery
2014-09-01 14:06 ` Damien Lespiau
2014-09-01 14:45 ` Siluvery, Arun
2014-09-01 15:04 ` Damien Lespiau
2014-09-02 9:15 ` [PATCH v2] " Arun Siluvery
2014-09-02 9:56 ` Daniel Vetter
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.