* [PATCH] drm/i915: Move hsw GT w/a to engine initialisation
@ 2017-11-03  2:56 Chris Wilson
  2017-11-03  3:19 ` ✗ Fi.CI.BAT: failure for " Patchwork
  2017-11-03 10:25 ` [PATCH] " Ville Syrjälä
  0 siblings, 2 replies; 5+ messages in thread
From: Chris Wilson @ 2017-11-03  2:56 UTC (permalink / raw)
  To: intel-gfx; +Cc: Daniel Vetter

In commit b7048ea12fbb ("drm/i915: Do .init_clock_gating() earlier to
avoid it clobbering watermarks") init_clock_gating was called earlier in
the module load sequence, moving it before we acquired the forcewake
used to initialise the engines. This revealed that on Haswell, at least,
some of those GT w/as had been moved into the power context, and so as
we were now setting them outside of the power context, those settings
were being lost. While, strictly, we want to set power context registers
using LRI (that ensures there is a power context loaded!), we can
restore the earlier behaviour by moving the GT register writes back to
the same point in the module initialisation sequence.

Reported-by: Mark Janes <mark.a.janes@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103549
Fixes: b7048ea12fbb ("drm/i915: Do .init_clock_gating() earlier to avoid it clobbering watermarks")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mark Janes <mark.a.janes@intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Oscar Mateo <oscar.mateo@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
---
 drivers/gpu/drm/i915/intel_pm.c         | 38 ------------------------------
 drivers/gpu/drm/i915/intel_ringbuffer.c | 41 +++++++++++++++++++++++++++++++++
 2 files changed, 41 insertions(+), 38 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index 308439dd89d4..8a72526d491e 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -8588,49 +8588,11 @@ static void hsw_init_clock_gating(struct drm_i915_private *dev_priv)
 {
 	ilk_init_lp_watermarks(dev_priv);
 
-	/* L3 caching of data atomics doesn't work -- disable it. */
-	I915_WRITE(HSW_SCRATCH1, HSW_SCRATCH1_L3_DATA_ATOMICS_DISABLE);
-	I915_WRITE(HSW_ROW_CHICKEN3,
-		   _MASKED_BIT_ENABLE(HSW_ROW_CHICKEN3_L3_GLOBAL_ATOMICS_DISABLE));
-
 	/* This is required by WaCatErrorRejectionIssue:hsw */
 	I915_WRITE(GEN7_SQ_CHICKEN_MBCUNIT_CONFIG,
 			I915_READ(GEN7_SQ_CHICKEN_MBCUNIT_CONFIG) |
 			GEN7_SQ_CHICKEN_MBCUNIT_SQINTMOB);
 
-	/* WaVSRefCountFullforceMissDisable:hsw */
-	I915_WRITE(GEN7_FF_THREAD_MODE,
-		   I915_READ(GEN7_FF_THREAD_MODE) & ~GEN7_FF_VS_REF_CNT_FFME);
-
-	/* WaDisable_RenderCache_OperationalFlush:hsw */
-	I915_WRITE(CACHE_MODE_0_GEN7, _MASKED_BIT_DISABLE(RC_OP_FLUSH_ENABLE));
-
-	/* enable HiZ Raw Stall Optimization */
-	I915_WRITE(CACHE_MODE_0_GEN7,
-		   _MASKED_BIT_DISABLE(HIZ_RAW_STALL_OPT_DISABLE));
-
-	/* WaDisable4x2SubspanOptimization:hsw */
-	I915_WRITE(CACHE_MODE_1,
-		   _MASKED_BIT_ENABLE(PIXEL_SUBSPAN_COLLECT_OPT_DISABLE));
-
-	/*
-	 * BSpec recommends 8x4 when MSAA is used,
-	 * however in practice 16x4 seems fastest.
-	 *
-	 * Note that PS/WM thread counts depend on the WIZ hashing
-	 * disable bit, which we don't touch here, but it's good
-	 * to keep in mind (see 3DSTATE_PS and 3DSTATE_WM).
-	 */
-	I915_WRITE(GEN7_GT_MODE,
-		   _MASKED_FIELD(GEN6_WIZ_HASHING_MASK, GEN6_WIZ_HASHING_16x4));
-
-	/* WaSampleCChickenBitEnable:hsw */
-	I915_WRITE(HALF_SLICE_CHICKEN3,
-		   _MASKED_BIT_ENABLE(HSW_SAMPLE_C_PERFORMANCE));
-
-	/* WaSwitchSolVfFArbitrationPriority:hsw */
-	I915_WRITE(GAM_ECOCHK, I915_READ(GAM_ECOCHK) | HSW_ECOCHK_ARB_PRIO_SOL);
-
 	/* WaRsPkgCStateDisplayPMReq:hsw */
 	I915_WRITE(CHICKEN_PAR1_1,
 		   I915_READ(CHICKEN_PAR1_1) | FORCE_ARB_IDLE_PLANES);
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 3321b801e77d..3a2287b0d9f4 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -707,6 +707,47 @@ static int init_render_ring(struct intel_engine_cs *engine)
 	if (IS_GEN(dev_priv, 6, 7))
 		I915_WRITE(INSTPM, _MASKED_BIT_ENABLE(INSTPM_FORCE_ORDERING));
 
+
+	if (IS_HASWELL(dev_priv)) {
+		/* L3 caching of data atomics doesn't work -- disable it. */
+		I915_WRITE(HSW_SCRATCH1, HSW_SCRATCH1_L3_DATA_ATOMICS_DISABLE);
+		I915_WRITE(HSW_ROW_CHICKEN3,
+			   _MASKED_BIT_ENABLE(HSW_ROW_CHICKEN3_L3_GLOBAL_ATOMICS_DISABLE));
+
+		/* WaVSRefCountFullforceMissDisable:hsw */
+		I915_WRITE(GEN7_FF_THREAD_MODE,
+			   I915_READ(GEN7_FF_THREAD_MODE) & ~GEN7_FF_VS_REF_CNT_FFME);
+
+		/* WaDisable_RenderCache_OperationalFlush:hsw */
+		I915_WRITE(CACHE_MODE_0_GEN7, _MASKED_BIT_DISABLE(RC_OP_FLUSH_ENABLE));
+
+		/* enable HiZ Raw Stall Optimization */
+		I915_WRITE(CACHE_MODE_0_GEN7,
+			   _MASKED_BIT_DISABLE(HIZ_RAW_STALL_OPT_DISABLE));
+
+		/* WaDisable4x2SubspanOptimization:hsw */
+		I915_WRITE(CACHE_MODE_1,
+			   _MASKED_BIT_ENABLE(PIXEL_SUBSPAN_COLLECT_OPT_DISABLE));
+
+		/*
+		 * BSpec recommends 8x4 when MSAA is used,
+		 * however in practice 16x4 seems fastest.
+		 *
+		 * Note that PS/WM thread counts depend on the WIZ hashing
+		 * disable bit, which we don't touch here, but it's good
+		 * to keep in mind (see 3DSTATE_PS and 3DSTATE_WM).
+		 */
+		I915_WRITE(GEN7_GT_MODE,
+			   _MASKED_FIELD(GEN6_WIZ_HASHING_MASK, GEN6_WIZ_HASHING_16x4));
+
+		/* WaSampleCChickenBitEnable:hsw */
+		I915_WRITE(HALF_SLICE_CHICKEN3,
+			   _MASKED_BIT_ENABLE(HSW_SAMPLE_C_PERFORMANCE));
+
+		/* WaSwitchSolVfFArbitrationPriority:hsw */
+		I915_WRITE(GAM_ECOCHK, I915_READ(GAM_ECOCHK) | HSW_ECOCHK_ARB_PRIO_SOL);
+	}
+
 	if (INTEL_INFO(dev_priv)->gen >= 6)
 		I915_WRITE_IMR(engine, ~engine->irq_keep_mask);
 
-- 
2.15.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


* ✗ Fi.CI.BAT: failure for drm/i915: Move hsw GT w/a to engine initialisation
  2017-11-03  2:56 [PATCH] drm/i915: Move hsw GT w/a to engine initialisation Chris Wilson
@ 2017-11-03  3:19 ` Patchwork
  2017-11-03 10:25 ` [PATCH] " Ville Syrjälä
  1 sibling, 0 replies; 5+ messages in thread
From: Patchwork @ 2017-11-03  3:19 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

== Series Details ==

Series: drm/i915: Move hsw GT w/a to engine initialisation
URL   : https://patchwork.freedesktop.org/series/33095/
State : failure

== Summary ==

Series 33095v1 drm/i915: Move hsw GT w/a to engine initialisation
https://patchwork.freedesktop.org/api/1.0/series/33095/revisions/1/mbox/

Test chamelium:
        Subgroup dp-hpd-fast:
                skip       -> INCOMPLETE (fi-bdw-5557u)
        Subgroup dp-crc-fast:
                pass       -> FAIL       (fi-kbl-7500u) fdo#102514
Test gem_exec_reloc:
        Subgroup basic-cpu-read-active:
                pass       -> FAIL       (fi-gdg-551) fdo#102582

fdo#102514 https://bugs.freedesktop.org/show_bug.cgi?id=102514
fdo#102582 https://bugs.freedesktop.org/show_bug.cgi?id=102582

fi-bdw-5557u     total:1    pass:0    dwarn:0   dfail:0   fail:0   skip:0  
fi-blb-e6850     total:289  pass:223  dwarn:1   dfail:0   fail:0   skip:65  time:377s
fi-bsw-n3050     total:289  pass:243  dwarn:0   dfail:0   fail:0   skip:46  time:543s
fi-bwr-2160      total:289  pass:183  dwarn:0   dfail:0   fail:0   skip:106 time:276s
fi-bxt-dsi       total:289  pass:259  dwarn:0   dfail:0   fail:0   skip:30  time:510s
fi-bxt-j4205     total:289  pass:260  dwarn:0   dfail:0   fail:0   skip:29  time:503s
fi-byt-j1900     total:289  pass:253  dwarn:1   dfail:0   fail:0   skip:35  time:505s
fi-byt-n2820     total:289  pass:249  dwarn:1   dfail:0   fail:0   skip:39  time:491s
fi-cfl-s         total:289  pass:253  dwarn:4   dfail:0   fail:0   skip:32  time:690s
fi-elk-e7500     total:289  pass:229  dwarn:0   dfail:0   fail:0   skip:60  time:430s
fi-gdg-551       total:289  pass:177  dwarn:1   dfail:0   fail:2   skip:109 time:269s
fi-glk-1         total:289  pass:261  dwarn:0   dfail:0   fail:0   skip:28  time:579s
fi-hsw-4770      total:289  pass:262  dwarn:0   dfail:0   fail:0   skip:27  time:434s
fi-hsw-4770r     total:289  pass:262  dwarn:0   dfail:0   fail:0   skip:27  time:430s
fi-ilk-650       total:289  pass:228  dwarn:0   dfail:0   fail:0   skip:61  time:427s
fi-ivb-3770      total:289  pass:260  dwarn:0   dfail:0   fail:0   skip:29  time:459s
fi-kbl-7500u     total:289  pass:263  dwarn:1   dfail:0   fail:1   skip:24  time:481s
fi-kbl-7560u     total:289  pass:270  dwarn:0   dfail:0   fail:0   skip:19  time:575s
fi-kbl-7567u     total:289  pass:269  dwarn:0   dfail:0   fail:0   skip:20  time:481s
fi-kbl-r         total:289  pass:262  dwarn:0   dfail:0   fail:0   skip:27  time:582s
fi-pnv-d510      total:289  pass:222  dwarn:1   dfail:0   fail:0   skip:66  time:580s
fi-skl-6260u     total:289  pass:269  dwarn:0   dfail:0   fail:0   skip:20  time:454s
fi-skl-6600u     total:289  pass:262  dwarn:0   dfail:0   fail:0   skip:27  time:599s
fi-skl-6700hq    total:289  pass:263  dwarn:0   dfail:0   fail:0   skip:26  time:652s
fi-skl-6700k     total:289  pass:265  dwarn:0   dfail:0   fail:0   skip:24  time:523s
fi-skl-6770hq    total:289  pass:269  dwarn:0   dfail:0   fail:0   skip:20  time:505s
fi-snb-2520m     total:289  pass:250  dwarn:0   dfail:0   fail:0   skip:39  time:573s

2faf7577f4edf6e233c89b3b217440bcb87b651f drm-tip: 2017y-11m-02d-15h-33m-01s UTC integration manifest
002aa0c059a5 drm/i915: Move hsw GT w/a to engine initialisation

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_6934/

* Re: [PATCH] drm/i915: Move hsw GT w/a to engine initialisation
  2017-11-03  2:56 [PATCH] drm/i915: Move hsw GT w/a to engine initialisation Chris Wilson
  2017-11-03  3:19 ` ✗ Fi.CI.BAT: failure for " Patchwork
@ 2017-11-03 10:25 ` Ville Syrjälä
  2017-11-03 10:38   ` Chris Wilson
  1 sibling, 1 reply; 5+ messages in thread
From: Ville Syrjälä @ 2017-11-03 10:25 UTC (permalink / raw)
  To: Chris Wilson; +Cc: Daniel Vetter, intel-gfx

On Fri, Nov 03, 2017 at 02:56:28AM +0000, Chris Wilson wrote:
> In commit b7048ea12fbb ("drm/i915: Do .init_clock_gating() earlier to
> avoid it clobbering watermarks") init_clock_gating was called earlier in
> the module load sequence, moving it before we acquired the forcewake
> used to initialise the engines. This revealed that on Haswell, at least,
> some of those GT w/as had been moved into the power context, and so as
> we were now setting them outside of the power context, those settings
> were being lost.

Hmm. Writes shouldn't need forcewake as they go through the wake FIFO,
and the power context should have been set up by the BIOS. So I'm not
sure that explanation is entirely satisfactory, for masked registers
at least. For the ones that do RMW it could well be a problem.

Also there are some registers on the list that IIRC live in the
logical context, like GT_MODE/CACHE_MODE. I guess if the BIOS would
already enable rc6 those would be lost until we have a context set up.

This problem doesn't seem like it should be specific to HSW. So I wonder
if we should start by just reverting that offending patch and move just
the watermark thing out to some earlier position in the sequence.

> While, strictly, we want to set power context registers
> using LRI (that ensures there is a power context loaded!), we can
> restore the earlier behaviour by moving the GT register writes back to
> the same point in the module initialisation sequence.
> 
> Reported-by: Mark Janes <mark.a.janes@intel.com>
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103549
> Fixes: b7048ea12fbb ("drm/i915: Do .init_clock_gating() earlier to avoid it clobbering watermarks")
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Mark Janes <mark.a.janes@intel.com>
> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
> Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> Cc: Oscar Mateo <oscar.mateo@intel.com>
> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
> ---
>  drivers/gpu/drm/i915/intel_pm.c         | 38 ------------------------------
>  drivers/gpu/drm/i915/intel_ringbuffer.c | 41 +++++++++++++++++++++++++++++++++
>  2 files changed, 41 insertions(+), 38 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
> index 308439dd89d4..8a72526d491e 100644
> --- a/drivers/gpu/drm/i915/intel_pm.c
> +++ b/drivers/gpu/drm/i915/intel_pm.c
> @@ -8588,49 +8588,11 @@ static void hsw_init_clock_gating(struct drm_i915_private *dev_priv)
>  {
>  	ilk_init_lp_watermarks(dev_priv);
>  
> -	/* L3 caching of data atomics doesn't work -- disable it. */
> -	I915_WRITE(HSW_SCRATCH1, HSW_SCRATCH1_L3_DATA_ATOMICS_DISABLE);
> -	I915_WRITE(HSW_ROW_CHICKEN3,
> -		   _MASKED_BIT_ENABLE(HSW_ROW_CHICKEN3_L3_GLOBAL_ATOMICS_DISABLE));
> -
>  	/* This is required by WaCatErrorRejectionIssue:hsw */
>  	I915_WRITE(GEN7_SQ_CHICKEN_MBCUNIT_CONFIG,
>  			I915_READ(GEN7_SQ_CHICKEN_MBCUNIT_CONFIG) |
>  			GEN7_SQ_CHICKEN_MBCUNIT_SQINTMOB);
>  
> -	/* WaVSRefCountFullforceMissDisable:hsw */
> -	I915_WRITE(GEN7_FF_THREAD_MODE,
> -		   I915_READ(GEN7_FF_THREAD_MODE) & ~GEN7_FF_VS_REF_CNT_FFME);
> -
> -	/* WaDisable_RenderCache_OperationalFlush:hsw */
> -	I915_WRITE(CACHE_MODE_0_GEN7, _MASKED_BIT_DISABLE(RC_OP_FLUSH_ENABLE));
> -
> -	/* enable HiZ Raw Stall Optimization */
> -	I915_WRITE(CACHE_MODE_0_GEN7,
> -		   _MASKED_BIT_DISABLE(HIZ_RAW_STALL_OPT_DISABLE));
> -
> -	/* WaDisable4x2SubspanOptimization:hsw */
> -	I915_WRITE(CACHE_MODE_1,
> -		   _MASKED_BIT_ENABLE(PIXEL_SUBSPAN_COLLECT_OPT_DISABLE));
> -
> -	/*
> -	 * BSpec recommends 8x4 when MSAA is used,
> -	 * however in practice 16x4 seems fastest.
> -	 *
> -	 * Note that PS/WM thread counts depend on the WIZ hashing
> -	 * disable bit, which we don't touch here, but it's good
> -	 * to keep in mind (see 3DSTATE_PS and 3DSTATE_WM).
> -	 */
> -	I915_WRITE(GEN7_GT_MODE,
> -		   _MASKED_FIELD(GEN6_WIZ_HASHING_MASK, GEN6_WIZ_HASHING_16x4));
> -
> -	/* WaSampleCChickenBitEnable:hsw */
> -	I915_WRITE(HALF_SLICE_CHICKEN3,
> -		   _MASKED_BIT_ENABLE(HSW_SAMPLE_C_PERFORMANCE));
> -
> -	/* WaSwitchSolVfFArbitrationPriority:hsw */
> -	I915_WRITE(GAM_ECOCHK, I915_READ(GAM_ECOCHK) | HSW_ECOCHK_ARB_PRIO_SOL);
> -
>  	/* WaRsPkgCStateDisplayPMReq:hsw */
>  	I915_WRITE(CHICKEN_PAR1_1,
>  		   I915_READ(CHICKEN_PAR1_1) | FORCE_ARB_IDLE_PLANES);
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
> index 3321b801e77d..3a2287b0d9f4 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> @@ -707,6 +707,47 @@ static int init_render_ring(struct intel_engine_cs *engine)
>  	if (IS_GEN(dev_priv, 6, 7))
>  		I915_WRITE(INSTPM, _MASKED_BIT_ENABLE(INSTPM_FORCE_ORDERING));
>  
> +
> +	if (IS_HASWELL(dev_priv)) {
> +		/* L3 caching of data atomics doesn't work -- disable it. */
> +		I915_WRITE(HSW_SCRATCH1, HSW_SCRATCH1_L3_DATA_ATOMICS_DISABLE);
> +		I915_WRITE(HSW_ROW_CHICKEN3,
> +			   _MASKED_BIT_ENABLE(HSW_ROW_CHICKEN3_L3_GLOBAL_ATOMICS_DISABLE));
> +
> +		/* WaVSRefCountFullforceMissDisable:hsw */
> +		I915_WRITE(GEN7_FF_THREAD_MODE,
> +			   I915_READ(GEN7_FF_THREAD_MODE) & ~GEN7_FF_VS_REF_CNT_FFME);
> +
> +		/* WaDisable_RenderCache_OperationalFlush:hsw */
> +		I915_WRITE(CACHE_MODE_0_GEN7, _MASKED_BIT_DISABLE(RC_OP_FLUSH_ENABLE));
> +
> +		/* enable HiZ Raw Stall Optimization */
> +		I915_WRITE(CACHE_MODE_0_GEN7,
> +			   _MASKED_BIT_DISABLE(HIZ_RAW_STALL_OPT_DISABLE));
> +
> +		/* WaDisable4x2SubspanOptimization:hsw */
> +		I915_WRITE(CACHE_MODE_1,
> +			   _MASKED_BIT_ENABLE(PIXEL_SUBSPAN_COLLECT_OPT_DISABLE));
> +
> +		/*
> +		 * BSpec recommends 8x4 when MSAA is used,
> +		 * however in practice 16x4 seems fastest.
> +		 *
> +		 * Note that PS/WM thread counts depend on the WIZ hashing
> +		 * disable bit, which we don't touch here, but it's good
> +		 * to keep in mind (see 3DSTATE_PS and 3DSTATE_WM).
> +		 */
> +		I915_WRITE(GEN7_GT_MODE,
> +			   _MASKED_FIELD(GEN6_WIZ_HASHING_MASK, GEN6_WIZ_HASHING_16x4));
> +
> +		/* WaSampleCChickenBitEnable:hsw */
> +		I915_WRITE(HALF_SLICE_CHICKEN3,
> +			   _MASKED_BIT_ENABLE(HSW_SAMPLE_C_PERFORMANCE));
> +
> +		/* WaSwitchSolVfFArbitrationPriority:hsw */
> +		I915_WRITE(GAM_ECOCHK, I915_READ(GAM_ECOCHK) | HSW_ECOCHK_ARB_PRIO_SOL);
> +	}
> +
>  	if (INTEL_INFO(dev_priv)->gen >= 6)
>  		I915_WRITE_IMR(engine, ~engine->irq_keep_mask);
>  
> -- 
> 2.15.0

-- 
Ville Syrjälä
Intel OTC

* Re: [PATCH] drm/i915: Move hsw GT w/a to engine initialisation
  2017-11-03 10:25 ` [PATCH] " Ville Syrjälä
@ 2017-11-03 10:38   ` Chris Wilson
  2017-11-06 10:45     ` Chris Wilson
  0 siblings, 1 reply; 5+ messages in thread
From: Chris Wilson @ 2017-11-03 10:38 UTC (permalink / raw)
  To: Ville Syrjälä; +Cc: Daniel Vetter, intel-gfx

Quoting Ville Syrjälä (2017-11-03 10:25:17)
> On Fri, Nov 03, 2017 at 02:56:28AM +0000, Chris Wilson wrote:
> > In commit b7048ea12fbb ("drm/i915: Do .init_clock_gating() earlier to
> > avoid it clobbering watermarks") init_clock_gating was called earlier in
> > the module load sequence, moving it before we acquired the forcewake
> > used to initialise the engines. This revealed that on Haswell, at least,
> > some of those GT w/as had been moved into the power context, and so as
> > we were now setting them outside of the power context, those settings
> > were being lost.
> 
> Hmm. Writes shouldn't need forcewake as they go through the wake FIFO,
> and the power context should have been set up by the BIOS. So I'm not
> sure that explanation is entirely satisfactory, for masked registers
> at least. For the ones that do RMW it could well be a problem.
> 
> Also there are some registers on the list that IIRC live in the
> logical context, like GT_MODE/CACHE_MODE. I guess if the BIOS would
> already enable rc6 those would be lost until we have a context set up.

We don't overwrite the state when loading the first context, so that
itself shouldn't be an issue. It definitely does seem like some context
registers are being lost; and that is a behavior we have associated with
the "powercontext" (e.g. so many registers for execlists :(.
 
> This problem doesn't seem like it should be specific to HSW. So I wonder
> if we should start by just reverting that offending patch and move just
> the watermark thing out to some earlier position in the sequence.

Whatever makes the simpler cc:stable patch. We have to overhaul these
register initialisations, and I definitely pity the poor soul who has to
navigate all the old bspecs to work out where each register needs to
live.

Ville, do you want to take a pass at splitting the wm from clock-gating?
-Chris

* Re: [PATCH] drm/i915: Move hsw GT w/a to engine initialisation
  2017-11-03 10:38   ` Chris Wilson
@ 2017-11-06 10:45     ` Chris Wilson
  0 siblings, 0 replies; 5+ messages in thread
From: Chris Wilson @ 2017-11-06 10:45 UTC (permalink / raw)
  To: Ville Syrjälä; +Cc: Daniel Vetter, intel-gfx

Quoting Chris Wilson (2017-11-03 10:38:38)
> Quoting Ville Syrjälä (2017-11-03 10:25:17)
> > This problem doesn't seem like it should be specific to HSW. So I wonder
> > if we should start by just reverting that offending patch and move just
> > the watermark thing out to some earlier position in the sequence.
> 
> Whatever makes the simpler cc:stable patch. We have to overhaul these
> register initialisations, and I definitely pity the poor soul who has to
> navigate all the old bspecs to work out where each register needs to
> live.
> 
> Ville, do you want to take a pass at splitting the wm from clock-gating?

I think the conclusion we take from the more generic patch to split wm
from clock-gating is that there are display w/a that we need to apply
very early during HW probing. So ultimately we do need to split display
and GT w/a. Should we just run with this patch to fixup the known
regression and hope there are no others before we complete the w/a
splitting?
-Chris
