* [PATCH 0/4] Rework workaround init fn for BDW and CHV @ 2014-09-01 13:28 Arun Siluvery 2014-09-01 13:28 ` [PATCH 1/4] drm/i915: Rework workaround init functions " Arun Siluvery ` (3 more replies) 0 siblings, 4 replies; 14+ messages in thread From: Arun Siluvery @ 2014-09-01 13:28 UTC (permalink / raw) To: intel-gfx The patches that initialize workarounds for BDW and CHV using LRIs are already merged in the tree but I received few more comments from Chris and Damien. I have reworked these patches as per their comments; it fixes some issues and the code now looks clean. we can easily add more workarounds with minimal changes. I have tried to split BDW and CHV changes as separate patches but it was getting ugly hence combined them also since it is a fixup patch it should be ok. In the previous implementation we were emitting each LRI individually so the total number was not clear during init so a temporary array was used to save w/a data (this is exported to debugfs) but if not initialized we could potentially overrun array and also the condition was not correct. In this rework, w/a are organized as an array so we know the exact count from the beginning and temporary array is no longer required. The corresponding i-g-t is gem_workarounds.c and it is also updated. Arun Siluvery (2): drm/i915: Rework workaround init functions for BDW and CHV drm/i915: Rework workaround data exporting to debugfs Damien Lespiau (2): drm/i915: Rename intel_wa_registers with a i915_ prefix drm/i915: Don't restrict i915_wa_registers to BDW drivers/gpu/drm/i915/i915_debugfs.c | 42 ++++++--- drivers/gpu/drm/i915/i915_drv.h | 14 --- drivers/gpu/drm/i915/intel_ringbuffer.c | 162 ++++++++++++++------------------ drivers/gpu/drm/i915/intel_ringbuffer.h | 8 ++ 4 files changed, 107 insertions(+), 119 deletions(-) -- 2.0.4 ^ permalink raw reply [flat|nested] 14+ messages in thread
* [PATCH 1/4] drm/i915: Rework workaround init functions for BDW and CHV 2014-09-01 13:28 [PATCH 0/4] Rework workaround init fn for BDW and CHV Arun Siluvery @ 2014-09-01 13:28 ` Arun Siluvery 2014-09-02 9:14 ` [PATCH v2] " Arun Siluvery 2014-09-01 13:28 ` [PATCH 2/4] drm/i915: Rename intel_wa_registers with a i915_ prefix Arun Siluvery ` (2 subsequent siblings) 3 siblings, 1 reply; 14+ messages in thread From: Arun Siluvery @ 2014-09-01 13:28 UTC (permalink / raw) To: intel-gfx This rework is based on suggestion from Chris. Now w/a are organized in an array and all of them are emitted in single fn instead of sending them individually. This approach is very clean and new w/a can be added with minimal changes. The same array can be used when exporting them to debugfs and the temporary array in the current implementation is not required. Signed-off-by: Arun Siluvery <arun.siluvery@linux.intel.com> --- drivers/gpu/drm/i915/intel_ringbuffer.c | 149 +++++++++++++------------------- 1 file changed, 58 insertions(+), 91 deletions(-) diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c index c5e4dc7..bae1527 100644 --- a/drivers/gpu/drm/i915/intel_ringbuffer.c +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c @@ -650,87 +650,71 @@ err: return ret; } -static inline void intel_ring_emit_wa(struct intel_engine_cs *ring, - u32 addr, u32 value) +static int i915_request_emit_lri(struct intel_engine_cs *ring, + int num_registers, const u32 *lri_list) { - struct drm_device *dev = ring->dev; - struct drm_i915_private *dev_priv = dev->dev_private; + int i; + int ret; - if (dev_priv->num_wa_regs > I915_MAX_WA_REGS) - return; + ret = intel_ring_begin(ring, (2 * num_registers + 1)); + if (ret) + return ret; - intel_ring_emit(ring, MI_LOAD_REGISTER_IMM(1)); - intel_ring_emit(ring, addr); - intel_ring_emit(ring, value); + intel_ring_emit(ring, MI_LOAD_REGISTER_IMM(num_registers)); + for (i = 0; i < 2*num_registers; i += 2) { + intel_ring_emit(ring, *(lri_list + i)); + intel_ring_emit(ring, *(lri_list + i + 1)); + } - dev_priv->intel_wa_regs[dev_priv->num_wa_regs].addr = addr; - dev_priv->intel_wa_regs[dev_priv->num_wa_regs].mask = (value) & 0xFFFF; - /* value is updated with the status of remaining bits of this - * register when it is read from debugfs file - */ - dev_priv->intel_wa_regs[dev_priv->num_wa_regs].value = value; - dev_priv->num_wa_regs++; + intel_ring_emit(ring, MI_NOOP); + intel_ring_advance(ring); - return; + return 0; } -static int bdw8_init_workarounds(struct intel_engine_cs *ring) -{ - int ret; - struct drm_device *dev = ring->dev; - struct drm_i915_private *dev_priv = dev->dev_private; +static const u32 bdw_ring_init_context[] = { /* * workarounds applied in this fn are part of register state context, * they need to be re-initialized followed by gpu reset, suspend/resume, * module reload. */ - dev_priv->num_wa_regs = 0; - memset(dev_priv->intel_wa_regs, 0, sizeof(dev_priv->intel_wa_regs)); - - /* - * update the number of dwords required based on the - * actual number of workarounds applied - */ - ret = intel_ring_begin(ring, 24); - if (ret) - return ret; /* WaDisablePartialInstShootdown:bdw */ /* WaDisableThreadStallDopClockGating:bdw */ /* FIXME: Unclear whether we really need this on production bdw. */ - intel_ring_emit_wa(ring, GEN8_ROW_CHICKEN, - _MASKED_BIT_ENABLE(PARTIAL_INSTRUCTION_SHOOTDOWN_DISABLE - | STALL_DOP_GATING_DISABLE)); + GEN8_ROW_CHICKEN, + _MASKED_BIT_ENABLE(PARTIAL_INSTRUCTION_SHOOTDOWN_DISABLE) | + _MASKED_BIT_ENABLE(STALL_DOP_GATING_DISABLE), /* WaDisableDopClockGating:bdw May not be needed for production */ - intel_ring_emit_wa(ring, GEN7_ROW_CHICKEN2, - _MASKED_BIT_ENABLE(DOP_CLOCK_GATING_DISABLE)); + GEN7_ROW_CHICKEN2, + _MASKED_BIT_ENABLE(DOP_CLOCK_GATING_DISABLE), /* * This GEN8_CENTROID_PIXEL_OPT_DIS W/A is only needed for * pre-production hardware */ - intel_ring_emit_wa(ring, HALF_SLICE_CHICKEN3, - _MASKED_BIT_ENABLE(GEN8_CENTROID_PIXEL_OPT_DIS - | GEN8_SAMPLER_POWER_BYPASS_DIS)); + HALF_SLICE_CHICKEN3, + _MASKED_BIT_ENABLE(GEN8_CENTROID_PIXEL_OPT_DIS + | GEN8_SAMPLER_POWER_BYPASS_DIS), - intel_ring_emit_wa(ring, GEN7_HALF_SLICE_CHICKEN1, - _MASKED_BIT_ENABLE(GEN7_SINGLE_SUBSCAN_DISPATCH_ENABLE)); + GEN7_HALF_SLICE_CHICKEN1, + _MASKED_BIT_ENABLE(GEN7_SINGLE_SUBSCAN_DISPATCH_ENABLE), - intel_ring_emit_wa(ring, COMMON_SLICE_CHICKEN2, - _MASKED_BIT_ENABLE(GEN8_CSC2_SBE_VUE_CACHE_CONSERVATIVE)); + COMMON_SLICE_CHICKEN2, + _MASKED_BIT_ENABLE(GEN8_CSC2_SBE_VUE_CACHE_CONSERVATIVE), /* Use Force Non-Coherent whenever executing a 3D context. This is a * workaround for for a possible hang in the unlikely event a TLB * invalidation occurs during a PSD flush. */ - intel_ring_emit_wa(ring, HDC_CHICKEN0, - _MASKED_BIT_ENABLE(HDC_FORCE_NON_COHERENT)); + HDC_CHICKEN0, + _MASKED_BIT_ENABLE(HDC_FORCE_NON_COHERENT), /* Wa4x4STCOptimizationDisable:bdw */ - intel_ring_emit_wa(ring, CACHE_MODE_1, - _MASKED_BIT_ENABLE(GEN8_4x4_STC_OPTIMIZATION_DISABLE)); + CACHE_MODE_1, + _MASKED_BIT_ENABLE(GEN8_4x4_STC_OPTIMIZATION_DISABLE), /* * BSpec recommends 8x4 when MSAA is used, @@ -740,54 +724,40 @@ static int bdw8_init_workarounds(struct intel_engine_cs *ring) * disable bit, which we don't touch here, but it's good * to keep in mind (see 3DSTATE_PS and 3DSTATE_WM). */ - intel_ring_emit_wa(ring, GEN7_GT_MODE, - GEN6_WIZ_HASHING_MASK | GEN6_WIZ_HASHING_16x4); - - intel_ring_advance(ring); - - DRM_DEBUG_DRIVER("Number of Workarounds applied: %d\n", - dev_priv->num_wa_regs); - - return 0; -} - -static int chv_init_workarounds(struct intel_engine_cs *ring) -{ - int ret; - struct drm_device *dev = ring->dev; - struct drm_i915_private *dev_priv = dev->dev_private; - - /* - * workarounds applied in this fn are part of register state context, - * they need to be re-initialized followed by gpu reset, suspend/resume, - * module reload. - */ - dev_priv->num_wa_regs = 0; - memset(dev_priv->intel_wa_regs, 0, sizeof(dev_priv->intel_wa_regs)); - - ret = intel_ring_begin(ring, 12); - if (ret) - return ret; + GEN7_GT_MODE, + _MASKED_BIT_ENABLE(GEN6_WIZ_HASHING_MASK | GEN6_WIZ_HASHING_16x4), +}; +static const u32 chv_ring_init_context[] = { /* WaDisablePartialInstShootdown:chv */ - intel_ring_emit_wa(ring, GEN8_ROW_CHICKEN, - _MASKED_BIT_ENABLE(PARTIAL_INSTRUCTION_SHOOTDOWN_DISABLE)); - /* WaDisableThreadStallDopClockGating:chv */ - intel_ring_emit_wa(ring, GEN8_ROW_CHICKEN, - _MASKED_BIT_ENABLE(STALL_DOP_GATING_DISABLE)); + GEN8_ROW_CHICKEN, + _MASKED_BIT_ENABLE(PARTIAL_INSTRUCTION_SHOOTDOWN_DISABLE) | + _MASKED_BIT_ENABLE(STALL_DOP_GATING_DISABLE), /* WaDisableDopClockGating:chv (pre-production hw) */ - intel_ring_emit_wa(ring, GEN7_ROW_CHICKEN2, - _MASKED_BIT_ENABLE(DOP_CLOCK_GATING_DISABLE)); + GEN7_ROW_CHICKEN2, + _MASKED_BIT_ENABLE(DOP_CLOCK_GATING_DISABLE), /* WaDisableSamplerPowerBypass:chv (pre-production hw) */ - intel_ring_emit_wa(ring, HALF_SLICE_CHICKEN3, - _MASKED_BIT_ENABLE(GEN8_SAMPLER_POWER_BYPASS_DIS)); + HALF_SLICE_CHICKEN3, + _MASKED_BIT_ENABLE(GEN8_SAMPLER_POWER_BYPASS_DIS), +}; - intel_ring_advance(ring); +static int ring_init_context(struct intel_engine_cs *ring) +{ + int ret = -EINVAL; - return 0; + if (IS_CHERRYVIEW(ring->dev)) + ret = i915_request_emit_lri(ring, + ARRAY_SIZE(chv_ring_init_context)/2, + chv_ring_init_context); + else + ret = i915_request_emit_lri(ring, + ARRAY_SIZE(bdw_ring_init_context)/2, + bdw_ring_init_context); + + return ret; } static int init_render_ring(struct intel_engine_cs *ring) @@ -2275,10 +2245,7 @@ int intel_init_render_ring_buffer(struct drm_device *dev) dev_priv->semaphore_obj = obj; } } - if (IS_CHERRYVIEW(dev)) - ring->init_context = chv_init_workarounds; - else - ring->init_context = bdw_init_workarounds; + ring->init_context = ring_init_context; ring->add_request = gen6_add_request; ring->flush = gen8_render_ring_flush; ring->irq_get = gen8_ring_get_irq; -- 2.0.4 ^ permalink raw reply related [flat|nested] 14+ messages in thread
* [PATCH v2] drm/i915: Rework workaround init functions for BDW and CHV 2014-09-01 13:28 ` [PATCH 1/4] drm/i915: Rework workaround init functions " Arun Siluvery @ 2014-09-02 9:14 ` Arun Siluvery 2014-09-02 9:45 ` Damien Lespiau 0 siblings, 1 reply; 14+ messages in thread From: Arun Siluvery @ 2014-09-02 9:14 UTC (permalink / raw) To: intel-gfx This rework is based on suggestion from Chris. Now w/a are organized in an array and all of them are emitted in single fn instead of sending them individually. This approach is clean and new w/a can be added with minimal changes. The same array can be used when exporting them to debugfs and the temporary array in the current implementation is not required. v2: As suggested by Damien a new w/a reg definition is added. This helps in initializing w/a that are unmasked registers. Signed-off-by: Arun Siluvery <arun.siluvery@linux.intel.com> --- drivers/gpu/drm/i915/intel_ringbuffer.c | 158 ++++++++++++++------------------ drivers/gpu/drm/i915/intel_ringbuffer.h | 26 ++++++ 2 files changed, 93 insertions(+), 91 deletions(-) diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c index c5e4dc7..3ed0ad5 100644 --- a/drivers/gpu/drm/i915/intel_ringbuffer.c +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c @@ -650,87 +650,80 @@ err: return ret; } -static inline void intel_ring_emit_wa(struct intel_engine_cs *ring, - u32 addr, u32 value) +static int i915_request_emit_lri(struct intel_engine_cs *ring, + int num_registers, + const struct intel_wa_reg *lri_list) { - struct drm_device *dev = ring->dev; - struct drm_i915_private *dev_priv = dev->dev_private; + int i; + int ret; + struct drm_i915_private *dev_priv = ring->dev->dev_private; - if (dev_priv->num_wa_regs > I915_MAX_WA_REGS) - return; + ret = intel_ring_begin(ring, (2 * num_registers + 1)); + if (ret) + return ret; - intel_ring_emit(ring, MI_LOAD_REGISTER_IMM(1)); - intel_ring_emit(ring, addr); - intel_ring_emit(ring, value); + intel_ring_emit(ring, MI_LOAD_REGISTER_IMM(num_registers)); + for (i = 0; i < num_registers; ++i) { + u32 value; + const struct intel_wa_reg *p = (lri_list + i); - dev_priv->intel_wa_regs[dev_priv->num_wa_regs].addr = addr; - dev_priv->intel_wa_regs[dev_priv->num_wa_regs].mask = (value) & 0xFFFF; - /* value is updated with the status of remaining bits of this - * register when it is read from debugfs file - */ - dev_priv->intel_wa_regs[dev_priv->num_wa_regs].value = value; - dev_priv->num_wa_regs++; + if (p->masked_register) + value = (p->mask << 16) | p->value; + else + value = (I915_READ(p->addr) & ~p->mask) | p->value; + intel_ring_emit(ring, p->addr); + intel_ring_emit(ring, value); + } - return; + intel_ring_emit(ring, MI_NOOP); + intel_ring_advance(ring); + + return 0; } -static int bdw8_init_workarounds(struct intel_engine_cs *ring) -{ - int ret; - struct drm_device *dev = ring->dev; - struct drm_i915_private *dev_priv = dev->dev_private; +static const struct intel_wa_reg bdw_ring_init_context[] = { /* * workarounds applied in this fn are part of register state context, * they need to be re-initialized followed by gpu reset, suspend/resume, * module reload. */ - dev_priv->num_wa_regs = 0; - memset(dev_priv->intel_wa_regs, 0, sizeof(dev_priv->intel_wa_regs)); - - /* - * update the number of dwords required based on the - * actual number of workarounds applied - */ - ret = intel_ring_begin(ring, 24); - if (ret) - return ret; /* WaDisablePartialInstShootdown:bdw */ /* WaDisableThreadStallDopClockGating:bdw */ /* FIXME: Unclear whether we really need this on production bdw. */ - intel_ring_emit_wa(ring, GEN8_ROW_CHICKEN, - _MASKED_BIT_ENABLE(PARTIAL_INSTRUCTION_SHOOTDOWN_DISABLE - | STALL_DOP_GATING_DISABLE)); + INIT_MASKED_WA(GEN8_ROW_CHICKEN, + _MASKED_BIT_ENABLE(PARTIAL_INSTRUCTION_SHOOTDOWN_DISABLE) | + _MASKED_BIT_ENABLE(STALL_DOP_GATING_DISABLE)), /* WaDisableDopClockGating:bdw May not be needed for production */ - intel_ring_emit_wa(ring, GEN7_ROW_CHICKEN2, - _MASKED_BIT_ENABLE(DOP_CLOCK_GATING_DISABLE)); + INIT_MASKED_WA(GEN7_ROW_CHICKEN2, + _MASKED_BIT_ENABLE(DOP_CLOCK_GATING_DISABLE)), /* * This GEN8_CENTROID_PIXEL_OPT_DIS W/A is only needed for * pre-production hardware */ - intel_ring_emit_wa(ring, HALF_SLICE_CHICKEN3, - _MASKED_BIT_ENABLE(GEN8_CENTROID_PIXEL_OPT_DIS - | GEN8_SAMPLER_POWER_BYPASS_DIS)); + INIT_MASKED_WA(HALF_SLICE_CHICKEN3, + _MASKED_BIT_ENABLE(GEN8_CENTROID_PIXEL_OPT_DIS) | + _MASKED_BIT_ENABLE(GEN8_SAMPLER_POWER_BYPASS_DIS)), - intel_ring_emit_wa(ring, GEN7_HALF_SLICE_CHICKEN1, - _MASKED_BIT_ENABLE(GEN7_SINGLE_SUBSCAN_DISPATCH_ENABLE)); + INIT_MASKED_WA(GEN7_HALF_SLICE_CHICKEN1, + _MASKED_BIT_ENABLE(GEN7_SINGLE_SUBSCAN_DISPATCH_ENABLE)), - intel_ring_emit_wa(ring, COMMON_SLICE_CHICKEN2, - _MASKED_BIT_ENABLE(GEN8_CSC2_SBE_VUE_CACHE_CONSERVATIVE)); + INIT_MASKED_WA(COMMON_SLICE_CHICKEN2, + _MASKED_BIT_ENABLE(GEN8_CSC2_SBE_VUE_CACHE_CONSERVATIVE)), /* Use Force Non-Coherent whenever executing a 3D context. This is a * workaround for for a possible hang in the unlikely event a TLB * invalidation occurs during a PSD flush. */ - intel_ring_emit_wa(ring, HDC_CHICKEN0, - _MASKED_BIT_ENABLE(HDC_FORCE_NON_COHERENT)); + INIT_MASKED_WA(HDC_CHICKEN0, + _MASKED_BIT_ENABLE(HDC_FORCE_NON_COHERENT)), /* Wa4x4STCOptimizationDisable:bdw */ - intel_ring_emit_wa(ring, CACHE_MODE_1, - _MASKED_BIT_ENABLE(GEN8_4x4_STC_OPTIMIZATION_DISABLE)); + INIT_MASKED_WA(CACHE_MODE_1, + _MASKED_BIT_ENABLE(GEN8_4x4_STC_OPTIMIZATION_DISABLE)), /* * BSpec recommends 8x4 when MSAA is used, @@ -740,54 +733,40 @@ static int bdw8_init_workarounds(struct intel_engine_cs *ring) * disable bit, which we don't touch here, but it's good * to keep in mind (see 3DSTATE_PS and 3DSTATE_WM). */ - intel_ring_emit_wa(ring, GEN7_GT_MODE, - GEN6_WIZ_HASHING_MASK | GEN6_WIZ_HASHING_16x4); - - intel_ring_advance(ring); - - DRM_DEBUG_DRIVER("Number of Workarounds applied: %d\n", - dev_priv->num_wa_regs); - - return 0; -} - -static int chv_init_workarounds(struct intel_engine_cs *ring) -{ - int ret; - struct drm_device *dev = ring->dev; - struct drm_i915_private *dev_priv = dev->dev_private; - - /* - * workarounds applied in this fn are part of register state context, - * they need to be re-initialized followed by gpu reset, suspend/resume, - * module reload. - */ - dev_priv->num_wa_regs = 0; - memset(dev_priv->intel_wa_regs, 0, sizeof(dev_priv->intel_wa_regs)); - - ret = intel_ring_begin(ring, 12); - if (ret) - return ret; + INIT_MASKED_WA(GEN7_GT_MODE, + (GEN6_WIZ_HASHING_MASK | GEN6_WIZ_HASHING_16x4)), +}; +static const struct intel_wa_reg chv_ring_init_context[] = { /* WaDisablePartialInstShootdown:chv */ - intel_ring_emit_wa(ring, GEN8_ROW_CHICKEN, - _MASKED_BIT_ENABLE(PARTIAL_INSTRUCTION_SHOOTDOWN_DISABLE)); - /* WaDisableThreadStallDopClockGating:chv */ - intel_ring_emit_wa(ring, GEN8_ROW_CHICKEN, - _MASKED_BIT_ENABLE(STALL_DOP_GATING_DISABLE)); + INIT_MASKED_WA(GEN8_ROW_CHICKEN, + _MASKED_BIT_ENABLE(PARTIAL_INSTRUCTION_SHOOTDOWN_DISABLE) | + _MASKED_BIT_ENABLE(STALL_DOP_GATING_DISABLE)), /* WaDisableDopClockGating:chv (pre-production hw) */ - intel_ring_emit_wa(ring, GEN7_ROW_CHICKEN2, - _MASKED_BIT_ENABLE(DOP_CLOCK_GATING_DISABLE)); + INIT_MASKED_WA(GEN7_ROW_CHICKEN2, + _MASKED_BIT_ENABLE(DOP_CLOCK_GATING_DISABLE)), /* WaDisableSamplerPowerBypass:chv (pre-production hw) */ - intel_ring_emit_wa(ring, HALF_SLICE_CHICKEN3, - _MASKED_BIT_ENABLE(GEN8_SAMPLER_POWER_BYPASS_DIS)); + INIT_MASKED_WA(HALF_SLICE_CHICKEN3, + _MASKED_BIT_ENABLE(GEN8_SAMPLER_POWER_BYPASS_DIS)), +}; - intel_ring_advance(ring); +static int ring_init_context(struct intel_engine_cs *ring) +{ + int ret = -EINVAL; - return 0; + if (IS_CHERRYVIEW(ring->dev)) + ret = i915_request_emit_lri(ring, + ARRAY_SIZE(chv_ring_init_context), + chv_ring_init_context); + else + ret = i915_request_emit_lri(ring, + ARRAY_SIZE(bdw_ring_init_context), + bdw_ring_init_context); + + return ret; } static int init_render_ring(struct intel_engine_cs *ring) @@ -2275,10 +2254,7 @@ int intel_init_render_ring_buffer(struct drm_device *dev) dev_priv->semaphore_obj = obj; } } - if (IS_CHERRYVIEW(dev)) - ring->init_context = chv_init_workarounds; - else - ring->init_context = bdw_init_workarounds; + ring->init_context = ring_init_context; ring->add_request = gen6_add_request; ring->flush = gen8_render_ring_flush; ring->irq_get = gen8_ring_get_irq; diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h index 96479c8..c9ed06c 100644 --- a/drivers/gpu/drm/i915/intel_ringbuffer.h +++ b/drivers/gpu/drm/i915/intel_ringbuffer.h @@ -317,6 +317,32 @@ struct intel_engine_cs { u32 (*get_cmd_length_mask)(u32 cmd_header); }; +/* + * Workaround register definition + * mask: bits correponding to the w/a + * value: represents reference values of w/a bits + */ +struct intel_wa_reg { + u32 masked_register : 1; + u32 addr; + u32 mask; + u32 value; +}; + +#define INIT_WA_REG(_masked, _addr, _mask, _value) \ +{ \ + .masked_register = _masked, \ + .addr = _addr, \ + .mask = _mask, \ + .value = _value,\ +} + +#define INIT_MASKED_WA(_addr, _value) \ + INIT_WA_REG(1, _addr, ((_value) >> 16), ((_value) & 0xFFFF)) + +#define INIT_UNMASKED_WA(_addr, _value) \ + INIT_WA_REG(0, _addr, ~(_value), _value) + bool intel_ring_initialized(struct intel_engine_cs *ring); static inline unsigned -- 2.0.4 ^ permalink raw reply related [flat|nested] 14+ messages in thread
* Re: [PATCH v2] drm/i915: Rework workaround init functions for BDW and CHV 2014-09-02 9:14 ` [PATCH v2] " Arun Siluvery @ 2014-09-02 9:45 ` Damien Lespiau 2014-09-02 9:49 ` Damien Lespiau 2014-09-02 9:51 ` Chris Wilson 0 siblings, 2 replies; 14+ messages in thread From: Damien Lespiau @ 2014-09-02 9:45 UTC (permalink / raw) To: Arun Siluvery; +Cc: intel-gfx On Tue, Sep 02, 2014 at 10:14:17AM +0100, Arun Siluvery wrote: > This rework is based on suggestion from Chris. > Now w/a are organized in an array and all of them are emitted in > single fn instead of sending them individually. This approach is > clean and new w/a can be added with minimal changes. > The same array can be used when exporting them to debugfs and the > temporary array in the current implementation is not required. > > v2: As suggested by Damien a new w/a reg definition is added. > This helps in initializing w/a that are unmasked registers. > > Signed-off-by: Arun Siluvery <arun.siluvery@linux.intel.com> > --- > drivers/gpu/drm/i915/intel_ringbuffer.c | 158 ++++++++++++++------------------ > drivers/gpu/drm/i915/intel_ringbuffer.h | 26 ++++++ > 2 files changed, 93 insertions(+), 91 deletions(-) > > diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c > index c5e4dc7..3ed0ad5 100644 > --- a/drivers/gpu/drm/i915/intel_ringbuffer.c > +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c > @@ -650,87 +650,80 @@ err: > return ret; > } > > -static inline void intel_ring_emit_wa(struct intel_engine_cs *ring, > - u32 addr, u32 value) > +static int i915_request_emit_lri(struct intel_engine_cs *ring, > + int num_registers, > + const struct intel_wa_reg *lri_list) > { [...] > + for (i = 0; i < num_registers; ++i) { > + u32 value; > + const struct intel_wa_reg *p = (lri_list + i); > > + if (p->masked_register) > + value = (p->mask << 16) | p->value; > + else > + value = (I915_READ(p->addr) & ~p->mask) | p->value; There's one case when this won't work, when several WAs for a single 'normal' register are defined. The read done here means only the last of those W/As will end up being applied (because the last LRI to that register will be the value that ends up in the register. We'll probably need to coalesce all W/A defined for a single normal register into one write. Considering that, and the comment about the define, maybe the simplest way to go forward with the patch is to forget the non-masked case for the moment, especially as it's not used in this patch. > +#define INIT_MASKED_WA(_addr, _value) \ > + INIT_WA_REG(1, _addr, ((_value) >> 16), ((_value) & 0xFFFF)) Those outer '()' look unnecessary > +#define INIT_UNMASKED_WA(_addr, _value) \ > + INIT_WA_REG(0, _addr, ~(_value), _value) > + ~value is not a mask. Consider values with one or more bits defined to 0. -- Damien ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH v2] drm/i915: Rework workaround init functions for BDW and CHV 2014-09-02 9:45 ` Damien Lespiau @ 2014-09-02 9:49 ` Damien Lespiau 2014-09-02 9:51 ` Chris Wilson 1 sibling, 0 replies; 14+ messages in thread From: Damien Lespiau @ 2014-09-02 9:49 UTC (permalink / raw) To: Arun Siluvery; +Cc: intel-gfx On Tue, Sep 02, 2014 at 10:45:45AM +0100, Damien Lespiau wrote: > There's one case when this won't work, when several WAs for a single > 'normal' register are defined. The read done here means only the last of > those W/As will end up being applied (because the last LRI to that > register will be the value that ends up in the register. We'll probably > need to coalesce all W/A defined for a single normal register into one > write. To more correct/clear, it's not the read that's the problem, in the case where a normal register has several W/A, the last write will override the previous ones. -- Damien ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH v2] drm/i915: Rework workaround init functions for BDW and CHV 2014-09-02 9:45 ` Damien Lespiau 2014-09-02 9:49 ` Damien Lespiau @ 2014-09-02 9:51 ` Chris Wilson 1 sibling, 0 replies; 14+ messages in thread From: Chris Wilson @ 2014-09-02 9:51 UTC (permalink / raw) To: Damien Lespiau; +Cc: intel-gfx On Tue, Sep 02, 2014 at 10:45:45AM +0100, Damien Lespiau wrote: > On Tue, Sep 02, 2014 at 10:14:17AM +0100, Arun Siluvery wrote: > > This rework is based on suggestion from Chris. > > Now w/a are organized in an array and all of them are emitted in > > single fn instead of sending them individually. This approach is > > clean and new w/a can be added with minimal changes. > > The same array can be used when exporting them to debugfs and the > > temporary array in the current implementation is not required. > > > > v2: As suggested by Damien a new w/a reg definition is added. > > This helps in initializing w/a that are unmasked registers. > > > > Signed-off-by: Arun Siluvery <arun.siluvery@linux.intel.com> > > --- > > drivers/gpu/drm/i915/intel_ringbuffer.c | 158 ++++++++++++++------------------ > > drivers/gpu/drm/i915/intel_ringbuffer.h | 26 ++++++ > > 2 files changed, 93 insertions(+), 91 deletions(-) > > > > diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c > > index c5e4dc7..3ed0ad5 100644 > > --- a/drivers/gpu/drm/i915/intel_ringbuffer.c > > +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c > > @@ -650,87 +650,80 @@ err: > > return ret; > > } > > > > -static inline void intel_ring_emit_wa(struct intel_engine_cs *ring, > > - u32 addr, u32 value) > > +static int i915_request_emit_lri(struct intel_engine_cs *ring, > > + int num_registers, > > + const struct intel_wa_reg *lri_list) > > { > > [...] > > > + for (i = 0; i < num_registers; ++i) { > > + u32 value; > > + const struct intel_wa_reg *p = (lri_list + i); > > > > + if (p->masked_register) > > + value = (p->mask << 16) | p->value; > > + else > > + value = (I915_READ(p->addr) & ~p->mask) | p->value; > > There's one case when this won't work, when several WAs for a single > 'normal' register are defined. The read done here means only the last of > those W/As will end up being applied (because the last LRI to that > register will be the value that ends up in the register. We'll probably > need to coalesce all W/A defined for a single normal register into one > write. > > Considering that, and the comment about the define, maybe the simplest > way to go forward with the patch is to forget the non-masked case for > the moment, especially as it's not used in this patch. If you are still considering an in-kernel table, despite that it can be done in userspace, design it to track all w/a and not just the ring specific fixups. -Chris -- Chris Wilson, Intel Open Source Technology Centre ^ permalink raw reply [flat|nested] 14+ messages in thread
* [PATCH 2/4] drm/i915: Rename intel_wa_registers with a i915_ prefix 2014-09-01 13:28 [PATCH 0/4] Rework workaround init fn for BDW and CHV Arun Siluvery 2014-09-01 13:28 ` [PATCH 1/4] drm/i915: Rework workaround init functions " Arun Siluvery @ 2014-09-01 13:28 ` Arun Siluvery 2014-09-01 13:28 ` [PATCH 3/4] drm/i915: Don't restrict i915_wa_registers to BDW Arun Siluvery 2014-09-01 13:28 ` [PATCH 4/4] drm/i915: Rework workaround data exporting to debugfs Arun Siluvery 3 siblings, 0 replies; 14+ messages in thread From: Arun Siluvery @ 2014-09-01 13:28 UTC (permalink / raw) To: intel-gfx From: Damien Lespiau <damien.lespiau@intel.com> Those debugfs files are prefixed by i915, the name of the kernel module, presumably to make the difference with files exposed by core DRM. Also, add a ',' at the end of the last entry. This is to ease the conflict resolution when rebasing internal patches that add a member at the end of the array. Without it, wiggle can't do its job as we need to modify an existing line (appending the ','). Signed-off-by: Damien Lespiau <damien.lespiau@intel.com> --- drivers/gpu/drm/i915/i915_debugfs.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c index f0d63f6..f09ba8c 100644 --- a/drivers/gpu/drm/i915/i915_debugfs.c +++ b/drivers/gpu/drm/i915/i915_debugfs.c @@ -2458,7 +2458,7 @@ static int i915_shared_dplls_info(struct seq_file *m, void *unused) return 0; } -static int intel_wa_registers(struct seq_file *m, void *unused) +static int i915_wa_registers(struct seq_file *m, void *unused) { int i; int ret; @@ -4026,7 +4026,7 @@ static const struct drm_info_list i915_debugfs_list[] = { {"i915_semaphore_status", i915_semaphore_status, 0}, {"i915_shared_dplls_info", i915_shared_dplls_info, 0}, {"i915_dp_mst_info", i915_dp_mst_info, 0}, - {"intel_wa_registers", intel_wa_registers, 0} + {"i915_wa_registers", i915_wa_registers, 0}, }; #define I915_DEBUGFS_ENTRIES ARRAY_SIZE(i915_debugfs_list) -- 2.0.4 ^ permalink raw reply related [flat|nested] 14+ messages in thread
* [PATCH 3/4] drm/i915: Don't restrict i915_wa_registers to BDW 2014-09-01 13:28 [PATCH 0/4] Rework workaround init fn for BDW and CHV Arun Siluvery 2014-09-01 13:28 ` [PATCH 1/4] drm/i915: Rework workaround init functions " Arun Siluvery 2014-09-01 13:28 ` [PATCH 2/4] drm/i915: Rename intel_wa_registers with a i915_ prefix Arun Siluvery @ 2014-09-01 13:28 ` Arun Siluvery 2014-09-01 13:28 ` [PATCH 4/4] drm/i915: Rework workaround data exporting to debugfs Arun Siluvery 3 siblings, 0 replies; 14+ messages in thread From: Arun Siluvery @ 2014-09-01 13:28 UTC (permalink / raw) To: intel-gfx From: Damien Lespiau <damien.lespiau@intel.com> We have CHV code that already makes the test obsolete. Besides, when num_wa_regs is 0 (platforms not gathering that W/A data), we expose something sensible already. Signed-off-by: Damien Lespiau <damien.lespiau@intel.com> Reviewed-by: Arun Siluvery <arun.siluvery@linux.intel.com> --- drivers/gpu/drm/i915/i915_debugfs.c | 5 ----- 1 file changed, 5 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c index f09ba8c..2727bda 100644 --- a/drivers/gpu/drm/i915/i915_debugfs.c +++ b/drivers/gpu/drm/i915/i915_debugfs.c @@ -2466,11 +2466,6 @@ static int i915_wa_registers(struct seq_file *m, void *unused) struct drm_device *dev = node->minor->dev; struct drm_i915_private *dev_priv = dev->dev_private; - if (!IS_BROADWELL(dev)) { - DRM_DEBUG_DRIVER("Workaround table not available !!\n"); - return -EINVAL; - } - ret = mutex_lock_interruptible(&dev->struct_mutex); if (ret) return ret; -- 2.0.4 ^ permalink raw reply related [flat|nested] 14+ messages in thread
* [PATCH 4/4] drm/i915: Rework workaround data exporting to debugfs 2014-09-01 13:28 [PATCH 0/4] Rework workaround init fn for BDW and CHV Arun Siluvery ` (2 preceding siblings ...) 2014-09-01 13:28 ` [PATCH 3/4] drm/i915: Don't restrict i915_wa_registers to BDW Arun Siluvery @ 2014-09-01 13:28 ` Arun Siluvery 2014-09-01 14:06 ` Damien Lespiau 2014-09-02 9:15 ` [PATCH v2] " Arun Siluvery 3 siblings, 2 replies; 14+ messages in thread From: Arun Siluvery @ 2014-09-01 13:28 UTC (permalink / raw) To: intel-gfx Now w/a are organized in an array so we know exactly how many of them are applied; use the same array while exporting data to debugfs and remove the temporary array we currently have in driver priv structure. Signed-off-by: Arun Siluvery <arun.siluvery@linux.intel.com> --- drivers/gpu/drm/i915/i915_debugfs.c | 41 +++++++++++++++++++++++---------- drivers/gpu/drm/i915/i915_drv.h | 14 ----------- drivers/gpu/drm/i915/intel_ringbuffer.c | 15 ++++++++++++ drivers/gpu/drm/i915/intel_ringbuffer.h | 8 +++++++ 4 files changed, 52 insertions(+), 26 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c index 2727bda..bab0408 100644 --- a/drivers/gpu/drm/i915/i915_debugfs.c +++ b/drivers/gpu/drm/i915/i915_debugfs.c @@ -2465,6 +2465,14 @@ static int i915_wa_registers(struct seq_file *m, void *unused) struct drm_info_node *node = (struct drm_info_node *) m->private; struct drm_device *dev = node->minor->dev; struct drm_i915_private *dev_priv = dev->dev_private; + struct intel_ring_context_rodata ro_data; + + ret = ring_context_rodata(dev, &ro_data); + if (ret) { + seq_printf(m, "Workarounds applied: 0\n"); + DRM_DEBUG_DRIVER("Workaround table not available !!\n"); + return -EINVAL; + } ret = mutex_lock_interruptible(&dev->struct_mutex); if (ret) @@ -2472,18 +2480,27 @@ static int i915_wa_registers(struct seq_file *m, void *unused) intel_runtime_pm_get(dev_priv); - seq_printf(m, "Workarounds applied: %d\n", dev_priv->num_wa_regs); - for (i = 0; i < dev_priv->num_wa_regs; ++i) { - u32 addr, mask; - - addr = dev_priv->intel_wa_regs[i].addr; - mask = dev_priv->intel_wa_regs[i].mask; - dev_priv->intel_wa_regs[i].value = I915_READ(addr) | mask; - if (dev_priv->intel_wa_regs[i].addr) - seq_printf(m, "0x%X: 0x%08X, mask: 0x%08X\n", - dev_priv->intel_wa_regs[i].addr, - dev_priv->intel_wa_regs[i].value, - dev_priv->intel_wa_regs[i].mask); + seq_printf(m, "Workarounds applied: %d\n", ro_data.num_items/2); + for (i = 0; i < ro_data.num_items; i += 2) { + u32 addr, mask, value; + + addr = ro_data.init_context[i]; + /* + * Most of workarounds are masked registers; + * to set a bit in lower 16-bits we set a mask bit in + * upper 16-bits so we can take either of them as mask but + * it doesn't work if the w/a is about clearing a bit so + * use upper 16-bits to cover both cases. + */ + mask = ro_data.init_context[i+1] >> 16; + + /* + * value represents the status of other bits in the + * register besides w/a bits + */ + value = I915_READ(addr) | mask; + seq_printf(m, "0x%X: 0x%08X, mask: 0x%08X\n", + addr, value, mask); } intel_runtime_pm_put(dev_priv); diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index 49b7be7..bcf79f0 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -1553,20 +1553,6 @@ struct drm_i915_private { struct intel_shared_dpll shared_dplls[I915_NUM_PLLS]; int dpio_phy_iosf_port[I915_NUM_PHYS_VLV]; - /* - * workarounds are currently applied at different places and - * changes are being done to consolidate them so exact count is - * not clear at this point, use a max value for now. - */ -#define I915_MAX_WA_REGS 16 - struct { - u32 addr; - u32 value; - /* bitmask representing WA bits */ - u32 mask; - } intel_wa_regs[I915_MAX_WA_REGS]; - u32 num_wa_regs; - /* Reclocking support */ bool render_reclock_avail; bool lvds_downclock_avail; diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c index bae1527..6c07e69 100644 --- a/drivers/gpu/drm/i915/intel_ringbuffer.c +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c @@ -760,6 +760,21 @@ static int ring_init_context(struct intel_engine_cs *ring) return ret; } +int ring_context_rodata(struct drm_device *dev, + struct intel_ring_context_rodata *ro_data) +{ + if (IS_CHERRYVIEW(dev)) { + ro_data->init_context = chv_ring_init_context; + ro_data->num_items = ARRAY_SIZE(chv_ring_init_context); + } else if (IS_BROADWELL(dev)) { + ro_data->init_context = bdw_ring_init_context; + ro_data->num_items = ARRAY_SIZE(bdw_ring_init_context); + } else + return -EINVAL; + + return 0; +} + static int init_render_ring(struct intel_engine_cs *ring) { struct drm_device *dev = ring->dev; diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h index 96479c8..f555505 100644 --- a/drivers/gpu/drm/i915/intel_ringbuffer.h +++ b/drivers/gpu/drm/i915/intel_ringbuffer.h @@ -317,6 +317,14 @@ struct intel_engine_cs { u32 (*get_cmd_length_mask)(u32 cmd_header); }; +struct intel_ring_context_rodata { + u32 num_items; + const u32 *init_context; +}; + +int ring_context_rodata(struct drm_device *dev, + struct intel_ring_context_rodata *ro_data); + bool intel_ring_initialized(struct intel_engine_cs *ring); static inline unsigned -- 2.0.4 ^ permalink raw reply related [flat|nested] 14+ messages in thread
* Re: [PATCH 4/4] drm/i915: Rework workaround data exporting to debugfs 2014-09-01 13:28 ` [PATCH 4/4] drm/i915: Rework workaround data exporting to debugfs Arun Siluvery @ 2014-09-01 14:06 ` Damien Lespiau 2014-09-01 14:45 ` Siluvery, Arun 2014-09-02 9:15 ` [PATCH v2] " Arun Siluvery 1 sibling, 1 reply; 14+ messages in thread From: Damien Lespiau @ 2014-09-01 14:06 UTC (permalink / raw) To: Arun Siluvery; +Cc: intel-gfx On Mon, Sep 01, 2014 at 02:28:53PM +0100, Arun Siluvery wrote: > Now w/a are organized in an array so we know exactly how many of them > are applied; use the same array while exporting data to debugfs and > remove the temporary array we currently have in driver priv structure. > > Signed-off-by: Arun Siluvery <arun.siluvery@linux.intel.com> > --- > drivers/gpu/drm/i915/i915_debugfs.c | 41 +++++++++++++++++++++++---------- > drivers/gpu/drm/i915/i915_drv.h | 14 ----------- > drivers/gpu/drm/i915/intel_ringbuffer.c | 15 ++++++++++++ > drivers/gpu/drm/i915/intel_ringbuffer.h | 8 +++++++ > 4 files changed, 52 insertions(+), 26 deletions(-) > > diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c > index 2727bda..bab0408 100644 > --- a/drivers/gpu/drm/i915/i915_debugfs.c > +++ b/drivers/gpu/drm/i915/i915_debugfs.c > @@ -2465,6 +2465,14 @@ static int i915_wa_registers(struct seq_file *m, void *unused) > struct drm_info_node *node = (struct drm_info_node *) m->private; > struct drm_device *dev = node->minor->dev; > struct drm_i915_private *dev_priv = dev->dev_private; > + struct intel_ring_context_rodata ro_data; > + > + ret = ring_context_rodata(dev, &ro_data); > + if (ret) { > + seq_printf(m, "Workarounds applied: 0\n"); seq_puts() > + DRM_DEBUG_DRIVER("Workaround table not available !!\n"); > + return -EINVAL; > + } > > ret = mutex_lock_interruptible(&dev->struct_mutex); > if (ret) > @@ -2472,18 +2480,27 @@ static int i915_wa_registers(struct seq_file *m, void *unused) > > intel_runtime_pm_get(dev_priv); > > - seq_printf(m, "Workarounds applied: %d\n", dev_priv->num_wa_regs); > - for (i = 0; i < dev_priv->num_wa_regs; ++i) { > - u32 addr, mask; > - > - addr = dev_priv->intel_wa_regs[i].addr; > - mask = dev_priv->intel_wa_regs[i].mask; > - dev_priv->intel_wa_regs[i].value = I915_READ(addr) | mask; > - if (dev_priv->intel_wa_regs[i].addr) > - seq_printf(m, "0x%X: 0x%08X, mask: 0x%08X\n", > - dev_priv->intel_wa_regs[i].addr, > - dev_priv->intel_wa_regs[i].value, > - dev_priv->intel_wa_regs[i].mask); > + seq_printf(m, "Workarounds applied: %d\n", ro_data.num_items/2); > + for (i = 0; i < ro_data.num_items; i += 2) { > + u32 addr, mask, value; > + > + addr = ro_data.init_context[i]; > + /* > + * Most of workarounds are masked registers; > + * to set a bit in lower 16-bits we set a mask bit in > + * upper 16-bits so we can take either of them as mask but > + * it doesn't work if the w/a is about clearing a bit so > + * use upper 16-bits to cover both cases. > + */ > + mask = ro_data.init_context[i+1] >> 16; "Most" doesn't seem good here. Either it's "all" and we're happy, or we need a generic way to describe the W/A (masked register or not). value + mask is generic enough to code for both cases. > + > + /* > + * value represents the status of other bits in the > + * register besides w/a bits > + */ > + value = I915_READ(addr) | mask; > + seq_printf(m, "0x%X: 0x%08X, mask: 0x%08X\n", > + addr, value, mask); > } I still don't get it. 'value' is supposed to be the reference value for the W/A, but you're or'ing the mask here, so you treat the mask as if it were the reference value. This won't work if the W/A is about setting multi-bits fields or about clearing a bit. The comment is still not clear enough. You're saying "other bits besides the w/a bits", but or'ing the mask doesn't do that. Why do we care about the "other bits" in the reference value? they don't matter. Why use something else than (ro_data.init_context[i+1] & 0xffff) for the value here (as long we're talking about masked registers)? -- Damien ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH 4/4] drm/i915: Rework workaround data exporting to debugfs 2014-09-01 14:06 ` Damien Lespiau @ 2014-09-01 14:45 ` Siluvery, Arun 2014-09-01 15:04 ` Damien Lespiau 0 siblings, 1 reply; 14+ messages in thread From: Siluvery, Arun @ 2014-09-01 14:45 UTC (permalink / raw) To: Damien Lespiau; +Cc: intel-gfx On 01/09/2014 15:06, Damien Lespiau wrote: > On Mon, Sep 01, 2014 at 02:28:53PM +0100, Arun Siluvery wrote: >> Now w/a are organized in an array so we know exactly how many of them >> are applied; use the same array while exporting data to debugfs and >> remove the temporary array we currently have in driver priv structure. >> >> Signed-off-by: Arun Siluvery <arun.siluvery@linux.intel.com> >> --- >> drivers/gpu/drm/i915/i915_debugfs.c | 41 +++++++++++++++++++++++---------- >> drivers/gpu/drm/i915/i915_drv.h | 14 ----------- >> drivers/gpu/drm/i915/intel_ringbuffer.c | 15 ++++++++++++ >> drivers/gpu/drm/i915/intel_ringbuffer.h | 8 +++++++ >> 4 files changed, 52 insertions(+), 26 deletions(-) >> >> diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c >> index 2727bda..bab0408 100644 >> --- a/drivers/gpu/drm/i915/i915_debugfs.c >> +++ b/drivers/gpu/drm/i915/i915_debugfs.c >> @@ -2465,6 +2465,14 @@ static int i915_wa_registers(struct seq_file *m, void *unused) >> struct drm_info_node *node = (struct drm_info_node *) m->private; >> struct drm_device *dev = node->minor->dev; >> struct drm_i915_private *dev_priv = dev->dev_private; >> + struct intel_ring_context_rodata ro_data; >> + >> + ret = ring_context_rodata(dev, &ro_data); >> + if (ret) { >> + seq_printf(m, "Workarounds applied: 0\n"); > > seq_puts() > >> + DRM_DEBUG_DRIVER("Workaround table not available !!\n"); >> + return -EINVAL; >> + } >> >> ret = mutex_lock_interruptible(&dev->struct_mutex); >> if (ret) >> @@ -2472,18 +2480,27 @@ static int i915_wa_registers(struct seq_file *m, void *unused) >> >> intel_runtime_pm_get(dev_priv); >> >> - seq_printf(m, "Workarounds applied: %d\n", dev_priv->num_wa_regs); >> - for (i = 0; i < dev_priv->num_wa_regs; ++i) { >> - u32 addr, mask; >> - >> - addr = dev_priv->intel_wa_regs[i].addr; >> - mask = dev_priv->intel_wa_regs[i].mask; >> - dev_priv->intel_wa_regs[i].value = I915_READ(addr) | mask; >> - if (dev_priv->intel_wa_regs[i].addr) >> - seq_printf(m, "0x%X: 0x%08X, mask: 0x%08X\n", >> - dev_priv->intel_wa_regs[i].addr, >> - dev_priv->intel_wa_regs[i].value, >> - dev_priv->intel_wa_regs[i].mask); >> + seq_printf(m, "Workarounds applied: %d\n", ro_data.num_items/2); >> + for (i = 0; i < ro_data.num_items; i += 2) { >> + u32 addr, mask, value; >> + >> + addr = ro_data.init_context[i]; >> + /* >> + * Most of workarounds are masked registers; >> + * to set a bit in lower 16-bits we set a mask bit in >> + * upper 16-bits so we can take either of them as mask but >> + * it doesn't work if the w/a is about clearing a bit so >> + * use upper 16-bits to cover both cases. >> + */ >> + mask = ro_data.init_context[i+1] >> 16; > > "Most" doesn't seem good here. Either it's "all" and we're happy, or we > need a generic way to describe the W/A (masked register or not). value + > mask is generic enough to code for both cases. > It seems some of them could be unmasked registers. We can use 'mask' itself to determine whether it is a masked/unmasked register. mask == 0 if it is an unmasked register. >> + >> + /* >> + * value represents the status of other bits in the >> + * register besides w/a bits >> + */ >> + value = I915_READ(addr) | mask; >> + seq_printf(m, "0x%X: 0x%08X, mask: 0x%08X\n", >> + addr, value, mask); >> } > > I still don't get it. 'value' is supposed to be the reference value for > the W/A, but you're or'ing the mask here, so you treat the mask as if it > were the reference value. This won't work if the W/A is about setting > multi-bits fields or about clearing a bit. > > The comment is still not clear enough. You're saying "other bits besides > the w/a bits", but or'ing the mask doesn't do that. > > Why do we care about the "other bits" in the reference value? they don't > matter. Why use something else than (ro_data.init_context[i+1] & 0xffff) > for the value here (as long we're talking about masked registers)? > I have always considered value as the register value (remaining bits of the register and w/a bits) and now I see your point. Yes lower 16-bits can be used as reference value, depending on whether it is a masked/unmasked we can use/not use the mask in conjunction with value in the test. regards Arun ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH 4/4] drm/i915: Rework workaround data exporting to debugfs 2014-09-01 14:45 ` Siluvery, Arun @ 2014-09-01 15:04 ` Damien Lespiau 0 siblings, 0 replies; 14+ messages in thread From: Damien Lespiau @ 2014-09-01 15:04 UTC (permalink / raw) To: Siluvery, Arun; +Cc: intel-gfx On Mon, Sep 01, 2014 at 03:45:29PM +0100, Siluvery, Arun wrote: > >>+ /* > >>+ * Most of workarounds are masked registers; > >>+ * to set a bit in lower 16-bits we set a mask bit in > >>+ * upper 16-bits so we can take either of them as mask but > >>+ * it doesn't work if the w/a is about clearing a bit so > >>+ * use upper 16-bits to cover both cases. > >>+ */ > >>+ mask = ro_data.init_context[i+1] >> 16; > > > >"Most" doesn't seem good here. Either it's "all" and we're happy, or we > >need a generic way to describe the W/A (masked register or not). value + > >mask is generic enough to code for both cases. > > > It seems some of them could be unmasked registers. > We can use 'mask' itself to determine whether it is a > masked/unmasked register. mask == 0 if it is an unmasked register. I don't think we can use mask == 0 when it's an unmasked register. In this case, we still need to be able to have a mask that tells us which are the interesting bits in the value. If we want something generic that can be applied to masked and unmasked register, we'll something like: struct reg_wa { unsigned int masked_register : 1; u32 addr; u32 mask; u32 value; }; So: 1. masked register case: - masked_register = 1 - addr - mask (selects the lower 16 bits that are coding for the W/A value) - value (it only makes sense to some of the lower 16 bits set here) 2. 'normal' register case - masked_register = 0 - addr - mask - value >From that generic description you can cover all cases, eg. 1. the write for masked registers becomes: mask << 16 | value 2. the write for normal registers becomes: (read(addr) & ~mask) | value 2. check a W/A has been applied (both normal and masked register): read(addr) & mask == value We seem to have only masked registers atm, so it's fine to handle just that case, but if you envision that we'll need more than masked registers, we need a better W/A definition. -- Damien ^ permalink raw reply [flat|nested] 14+ messages in thread
* [PATCH v2] drm/i915: Rework workaround data exporting to debugfs 2014-09-01 13:28 ` [PATCH 4/4] drm/i915: Rework workaround data exporting to debugfs Arun Siluvery 2014-09-01 14:06 ` Damien Lespiau @ 2014-09-02 9:15 ` Arun Siluvery 2014-09-02 9:56 ` Daniel Vetter 1 sibling, 1 reply; 14+ messages in thread From: Arun Siluvery @ 2014-09-02 9:15 UTC (permalink / raw) To: intel-gfx Now w/a are organized in an array so we know exactly how many of them are applied; use the same array while exporting data to debugfs and remove the temporary array we currently have in driver priv structure. Signed-off-by: Arun Siluvery <arun.siluvery@linux.intel.com> --- drivers/gpu/drm/i915/i915_debugfs.c | 29 +++++++++++++++++------------ drivers/gpu/drm/i915/i915_drv.h | 14 -------------- drivers/gpu/drm/i915/intel_ringbuffer.c | 15 +++++++++++++++ drivers/gpu/drm/i915/intel_ringbuffer.h | 8 ++++++++ 4 files changed, 40 insertions(+), 26 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c index 2727bda..0c1e294 100644 --- a/drivers/gpu/drm/i915/i915_debugfs.c +++ b/drivers/gpu/drm/i915/i915_debugfs.c @@ -2465,6 +2465,14 @@ static int i915_wa_registers(struct seq_file *m, void *unused) struct drm_info_node *node = (struct drm_info_node *) m->private; struct drm_device *dev = node->minor->dev; struct drm_i915_private *dev_priv = dev->dev_private; + struct intel_ring_context_rodata ro_data; + + ret = ring_context_rodata(dev, &ro_data); + if (ret) { + seq_puts(m, "Workarounds applied: 0\n"); + DRM_DEBUG_DRIVER("Workaround table not available !!\n"); + return -EINVAL; + } ret = mutex_lock_interruptible(&dev->struct_mutex); if (ret) @@ -2472,18 +2480,15 @@ static int i915_wa_registers(struct seq_file *m, void *unused) intel_runtime_pm_get(dev_priv); - seq_printf(m, "Workarounds applied: %d\n", dev_priv->num_wa_regs); - for (i = 0; i < dev_priv->num_wa_regs; ++i) { - u32 addr, mask; - - addr = dev_priv->intel_wa_regs[i].addr; - mask = dev_priv->intel_wa_regs[i].mask; - dev_priv->intel_wa_regs[i].value = I915_READ(addr) | mask; - if (dev_priv->intel_wa_regs[i].addr) - seq_printf(m, "0x%X: 0x%08X, mask: 0x%08X\n", - dev_priv->intel_wa_regs[i].addr, - dev_priv->intel_wa_regs[i].value, - dev_priv->intel_wa_regs[i].mask); + seq_printf(m, "Workarounds applied: %d\n", ro_data.num_items); + for (i = 0; i < ro_data.num_items; ++i) { + u32 addr, mask, value; + + addr = ro_data.init_context[i].addr; + mask = ro_data.init_context[i].mask; + value = ro_data.init_context[i].value; + seq_printf(m, "0x%X: 0x%08X, mask: 0x%08X\n", + addr, value, mask); } intel_runtime_pm_put(dev_priv); diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index 49b7be7..bcf79f0 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -1553,20 +1553,6 @@ struct drm_i915_private { struct intel_shared_dpll shared_dplls[I915_NUM_PLLS]; int dpio_phy_iosf_port[I915_NUM_PHYS_VLV]; - /* - * workarounds are currently applied at different places and - * changes are being done to consolidate them so exact count is - * not clear at this point, use a max value for now. - */ -#define I915_MAX_WA_REGS 16 - struct { - u32 addr; - u32 value; - /* bitmask representing WA bits */ - u32 mask; - } intel_wa_regs[I915_MAX_WA_REGS]; - u32 num_wa_regs; - /* Reclocking support */ bool render_reclock_avail; bool lvds_downclock_avail; diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c index 3ed0ad5..7262c10 100644 --- a/drivers/gpu/drm/i915/intel_ringbuffer.c +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c @@ -769,6 +769,21 @@ static int ring_init_context(struct intel_engine_cs *ring) return ret; } +int ring_context_rodata(struct drm_device *dev, + struct intel_ring_context_rodata *ro_data) +{ + if (IS_CHERRYVIEW(dev)) { + ro_data->init_context = chv_ring_init_context; + ro_data->num_items = ARRAY_SIZE(chv_ring_init_context); + } else if (IS_BROADWELL(dev)) { + ro_data->init_context = bdw_ring_init_context; + ro_data->num_items = ARRAY_SIZE(bdw_ring_init_context); + } else + return -EINVAL; + + return 0; +} + static int init_render_ring(struct intel_engine_cs *ring) { struct drm_device *dev = ring->dev; diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h index c9ed06c..33454ce 100644 --- a/drivers/gpu/drm/i915/intel_ringbuffer.h +++ b/drivers/gpu/drm/i915/intel_ringbuffer.h @@ -343,6 +343,14 @@ struct intel_wa_reg { #define INIT_UNMASKED_WA(_addr, _value) \ INIT_WA_REG(0, _addr, ~(_value), _value) +struct intel_ring_context_rodata { + u32 num_items; + const struct intel_wa_reg *init_context; +}; + +int ring_context_rodata(struct drm_device *dev, + struct intel_ring_context_rodata *ro_data); + bool intel_ring_initialized(struct intel_engine_cs *ring); static inline unsigned -- 2.0.4 ^ permalink raw reply related [flat|nested] 14+ messages in thread
* Re: [PATCH v2] drm/i915: Rework workaround data exporting to debugfs 2014-09-02 9:15 ` [PATCH v2] " Arun Siluvery @ 2014-09-02 9:56 ` Daniel Vetter 0 siblings, 0 replies; 14+ messages in thread From: Daniel Vetter @ 2014-09-02 9:56 UTC (permalink / raw) To: Arun Siluvery; +Cc: intel-gfx On Tue, Sep 02, 2014 at 10:15:31AM +0100, Arun Siluvery wrote: > Now w/a are organized in an array so we know exactly how many of them > are applied; use the same array while exporting data to debugfs and > remove the temporary array we currently have in driver priv structure. > > Signed-off-by: Arun Siluvery <arun.siluvery@linux.intel.com> > --- > drivers/gpu/drm/i915/i915_debugfs.c | 29 +++++++++++++++++------------ > drivers/gpu/drm/i915/i915_drv.h | 14 -------------- > drivers/gpu/drm/i915/intel_ringbuffer.c | 15 +++++++++++++++ > drivers/gpu/drm/i915/intel_ringbuffer.h | 8 ++++++++ > 4 files changed, 40 insertions(+), 26 deletions(-) > > diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c > index 2727bda..0c1e294 100644 > --- a/drivers/gpu/drm/i915/i915_debugfs.c > +++ b/drivers/gpu/drm/i915/i915_debugfs.c > @@ -2465,6 +2465,14 @@ static int i915_wa_registers(struct seq_file *m, void *unused) > struct drm_info_node *node = (struct drm_info_node *) m->private; > struct drm_device *dev = node->minor->dev; > struct drm_i915_private *dev_priv = dev->dev_private; > + struct intel_ring_context_rodata ro_data; > + > + ret = ring_context_rodata(dev, &ro_data); > + if (ret) { > + seq_puts(m, "Workarounds applied: 0\n"); > + DRM_DEBUG_DRIVER("Workaround table not available !!\n"); > + return -EINVAL; > + } > > ret = mutex_lock_interruptible(&dev->struct_mutex); > if (ret) > @@ -2472,18 +2480,15 @@ static int i915_wa_registers(struct seq_file *m, void *unused) > > intel_runtime_pm_get(dev_priv); > > - seq_printf(m, "Workarounds applied: %d\n", dev_priv->num_wa_regs); > - for (i = 0; i < dev_priv->num_wa_regs; ++i) { > - u32 addr, mask; > - > - addr = dev_priv->intel_wa_regs[i].addr; > - mask = dev_priv->intel_wa_regs[i].mask; > - dev_priv->intel_wa_regs[i].value = I915_READ(addr) | mask; > - if (dev_priv->intel_wa_regs[i].addr) > - seq_printf(m, "0x%X: 0x%08X, mask: 0x%08X\n", > - dev_priv->intel_wa_regs[i].addr, > - dev_priv->intel_wa_regs[i].value, > - dev_priv->intel_wa_regs[i].mask); > + seq_printf(m, "Workarounds applied: %d\n", ro_data.num_items); > + for (i = 0; i < ro_data.num_items; ++i) { > + u32 addr, mask, value; > + > + addr = ro_data.init_context[i].addr; > + mask = ro_data.init_context[i].mask; > + value = ro_data.init_context[i].value; > + seq_printf(m, "0x%X: 0x%08X, mask: 0x%08X\n", > + addr, value, mask); > } > > intel_runtime_pm_put(dev_priv); > diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h > index 49b7be7..bcf79f0 100644 > --- a/drivers/gpu/drm/i915/i915_drv.h > +++ b/drivers/gpu/drm/i915/i915_drv.h > @@ -1553,20 +1553,6 @@ struct drm_i915_private { > struct intel_shared_dpll shared_dplls[I915_NUM_PLLS]; > int dpio_phy_iosf_port[I915_NUM_PHYS_VLV]; > > - /* > - * workarounds are currently applied at different places and > - * changes are being done to consolidate them so exact count is > - * not clear at this point, use a max value for now. > - */ > -#define I915_MAX_WA_REGS 16 > - struct { > - u32 addr; > - u32 value; > - /* bitmask representing WA bits */ > - u32 mask; > - } intel_wa_regs[I915_MAX_WA_REGS]; > - u32 num_wa_regs; > - > /* Reclocking support */ > bool render_reclock_avail; > bool lvds_downclock_avail; > diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c > index 3ed0ad5..7262c10 100644 > --- a/drivers/gpu/drm/i915/intel_ringbuffer.c > +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c > @@ -769,6 +769,21 @@ static int ring_init_context(struct intel_engine_cs *ring) > return ret; > } > > +int ring_context_rodata(struct drm_device *dev, > + struct intel_ring_context_rodata *ro_data) > +{ > + if (IS_CHERRYVIEW(dev)) { > + ro_data->init_context = chv_ring_init_context; > + ro_data->num_items = ARRAY_SIZE(chv_ring_init_context); > + } else if (IS_BROADWELL(dev)) { > + ro_data->init_context = bdw_ring_init_context; > + ro_data->num_items = ARRAY_SIZE(bdw_ring_init_context); This will kinda break my idea that we could put _all_ static register w/a into these schem, so also everything that's in the various init_clock gating functions. The goal of the test is after all to make sure that we set the w/a bits in the right place, so if you only check the w/a emitted in the context init (and this change here kinda bakes this in) then it will not be really useful. Maybe you should (just as a prove of concept of these refactorings) convert the chv or bdw w/a in the init_clock_gating to this infrastructure too? Thanks, Daniel > + } else > + return -EINVAL; > + > + return 0; > +} > + > static int init_render_ring(struct intel_engine_cs *ring) > { > struct drm_device *dev = ring->dev; > diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h > index c9ed06c..33454ce 100644 > --- a/drivers/gpu/drm/i915/intel_ringbuffer.h > +++ b/drivers/gpu/drm/i915/intel_ringbuffer.h > @@ -343,6 +343,14 @@ struct intel_wa_reg { > #define INIT_UNMASKED_WA(_addr, _value) \ > INIT_WA_REG(0, _addr, ~(_value), _value) > > +struct intel_ring_context_rodata { > + u32 num_items; > + const struct intel_wa_reg *init_context; > +}; > + > +int ring_context_rodata(struct drm_device *dev, > + struct intel_ring_context_rodata *ro_data); > + > bool intel_ring_initialized(struct intel_engine_cs *ring); > > static inline unsigned > -- > 2.0.4 > > _______________________________________________ > Intel-gfx mailing list > Intel-gfx@lists.freedesktop.org > http://lists.freedesktop.org/mailman/listinfo/intel-gfx -- Daniel Vetter Software Engineer, Intel Corporation +41 (0) 79 365 57 48 - http://blog.ffwll.ch ^ permalink raw reply [flat|nested] 14+ messages in thread
end of thread, other threads:[~2014-09-02 9:56 UTC | newest] Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2014-09-01 13:28 [PATCH 0/4] Rework workaround init fn for BDW and CHV Arun Siluvery 2014-09-01 13:28 ` [PATCH 1/4] drm/i915: Rework workaround init functions " Arun Siluvery 2014-09-02 9:14 ` [PATCH v2] " Arun Siluvery 2014-09-02 9:45 ` Damien Lespiau 2014-09-02 9:49 ` Damien Lespiau 2014-09-02 9:51 ` Chris Wilson 2014-09-01 13:28 ` [PATCH 2/4] drm/i915: Rename intel_wa_registers with a i915_ prefix Arun Siluvery 2014-09-01 13:28 ` [PATCH 3/4] drm/i915: Don't restrict i915_wa_registers to BDW Arun Siluvery 2014-09-01 13:28 ` [PATCH 4/4] drm/i915: Rework workaround data exporting to debugfs Arun Siluvery 2014-09-01 14:06 ` Damien Lespiau 2014-09-01 14:45 ` Siluvery, Arun 2014-09-01 15:04 ` Damien Lespiau 2014-09-02 9:15 ` [PATCH v2] " Arun Siluvery 2014-09-02 9:56 ` Daniel Vetter
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.