intel-gfx.lists.freedesktop.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/4] Haswell GPU hang fixes
@ 2012-08-17 21:35 Paulo Zanoni
  2012-08-17 21:35 ` [PATCH 1/4] drm/i915: add gen7_render_ring_flush Paulo Zanoni
                   ` (4 more replies)
  0 siblings, 5 replies; 11+ messages in thread
From: Paulo Zanoni @ 2012-08-17 21:35 UTC (permalink / raw)
  To: intel-gfx; +Cc: Paulo Zanoni

From: Paulo Zanoni <paulo.r.zanoni@intel.com>

Hi

Since we stopped running the gen6 workarounds on gen > 6 I started seeing a lot
of GPU hangs on my gen 7.5 machine. These patches add gen7+ workarounds which
prevent the GPU hangs I'm seeing.

These patches were tested mostly on HSW and briefly on IVB, but the workarounds
implemented are documented for both. This is also the first time I touch this
kind of code, so extra careful reviewing might be needed.

Thanks,
Paulo


Paulo Zanoni (4):
  drm/i915: add gen7_render_ring_flush
  drm/i915: add workarounds directly to gen6_render_ring_flush
  drm/i915: add workaround to gen7_render_ring_flush
  drm/i915: add one more workaround to gen7_render_ring_flush

 drivers/gpu/drm/i915/intel_ringbuffer.c | 81 ++++++++++++++++++++++++++++++---
 1 file changed, 74 insertions(+), 7 deletions(-)

-- 
1.7.11.2

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH 1/4] drm/i915: add gen7_render_ring_flush
  2012-08-17 21:35 [PATCH 0/4] Haswell GPU hang fixes Paulo Zanoni
@ 2012-08-17 21:35 ` Paulo Zanoni
  2012-08-17 21:35 ` [PATCH 2/4] drm/i915: add workarounds directly to gen6_render_ring_flush Paulo Zanoni
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 11+ messages in thread
From: Paulo Zanoni @ 2012-08-17 21:35 UTC (permalink / raw)
  To: intel-gfx; +Cc: Paulo Zanoni

From: Paulo Zanoni <paulo.r.zanoni@intel.com>

For now, just a copy of gen6_render_ring_flush. Different gens have
different workarounds, so we want different functions.

Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
---
 drivers/gpu/drm/i915/intel_ringbuffer.c | 50 ++++++++++++++++++++++++++++++++-
 1 file changed, 49 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index c828169..32e3034 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -258,6 +258,54 @@ gen6_render_ring_flush(struct intel_ring_buffer *ring,
 }
 
 static int
+gen7_render_ring_flush(struct intel_ring_buffer *ring,
+		       u32 invalidate_domains, u32 flush_domains)
+{
+	u32 flags = 0;
+	struct pipe_control *pc = ring->private;
+	u32 scratch_addr = pc->gtt_offset + 128;
+	int ret;
+
+	/* Just flush everything.  Experiments have shown that reducing the
+	 * number of bits based on the write domains has little performance
+	 * impact.
+	 */
+	if (flush_domains) {
+		flags |= PIPE_CONTROL_RENDER_TARGET_CACHE_FLUSH;
+		flags |= PIPE_CONTROL_DEPTH_CACHE_FLUSH;
+		/*
+		 * Ensure that any following seqno writes only happen
+		 * when the render cache is indeed flushed.
+		 */
+		flags |= PIPE_CONTROL_CS_STALL;
+	}
+	if (invalidate_domains) {
+		flags |= PIPE_CONTROL_TLB_INVALIDATE;
+		flags |= PIPE_CONTROL_INSTRUCTION_CACHE_INVALIDATE;
+		flags |= PIPE_CONTROL_TEXTURE_CACHE_INVALIDATE;
+		flags |= PIPE_CONTROL_VF_CACHE_INVALIDATE;
+		flags |= PIPE_CONTROL_CONST_CACHE_INVALIDATE;
+		flags |= PIPE_CONTROL_STATE_CACHE_INVALIDATE;
+		/*
+		 * TLB invalidate requires a post-sync write.
+		 */
+		flags |= PIPE_CONTROL_QW_WRITE;
+	}
+
+	ret = intel_ring_begin(ring, 4);
+	if (ret)
+		return ret;
+
+	intel_ring_emit(ring, GFX_OP_PIPE_CONTROL(4));
+	intel_ring_emit(ring, flags);
+	intel_ring_emit(ring, scratch_addr | PIPE_CONTROL_GLOBAL_GTT);
+	intel_ring_emit(ring, 0);
+	intel_ring_advance(ring);
+
+	return 0;
+}
+
+static int
 gen6_render_ring_flush__wa(struct intel_ring_buffer *ring,
 			   u32 invalidate_domains, u32 flush_domains)
 {
@@ -1385,7 +1433,7 @@ int intel_init_render_ring_buffer(struct drm_device *dev)
 
 	if (INTEL_INFO(dev)->gen >= 6) {
 		ring->add_request = gen6_add_request;
-		ring->flush = gen6_render_ring_flush;
+		ring->flush = gen7_render_ring_flush;
 		if (INTEL_INFO(dev)->gen == 6)
 			ring->flush = gen6_render_ring_flush__wa;
 		ring->irq_get = gen6_ring_get_irq;
-- 
1.7.11.2

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH 2/4] drm/i915: add workarounds directly to gen6_render_ring_flush
  2012-08-17 21:35 [PATCH 0/4] Haswell GPU hang fixes Paulo Zanoni
  2012-08-17 21:35 ` [PATCH 1/4] drm/i915: add gen7_render_ring_flush Paulo Zanoni
@ 2012-08-17 21:35 ` Paulo Zanoni
  2012-08-17 21:35 ` [PATCH 3/4] drm/i915: add workaround to gen7_render_ring_flush Paulo Zanoni
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 11+ messages in thread
From: Paulo Zanoni @ 2012-08-17 21:35 UTC (permalink / raw)
  To: intel-gfx; +Cc: Paulo Zanoni

From: Paulo Zanoni <paulo.r.zanoni@intel.com>

Since gen 7+ now run the new gen7_render_ring_flush function.

Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
---
 drivers/gpu/drm/i915/intel_ringbuffer.c | 21 ++++++---------------
 1 file changed, 6 insertions(+), 15 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 32e3034..dc5272b 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -218,6 +218,11 @@ gen6_render_ring_flush(struct intel_ring_buffer *ring,
 	u32 scratch_addr = pc->gtt_offset + 128;
 	int ret;
 
+	/* Force SNB workarounds for PIPE_CONTROL flushes */
+	ret = intel_emit_post_sync_nonzero_flush(ring);
+	if (ret)
+		return ret;
+
 	/* Just flush everything.  Experiments have shown that reducing the
 	 * number of bits based on the write domains has little performance
 	 * impact.
@@ -305,20 +310,6 @@ gen7_render_ring_flush(struct intel_ring_buffer *ring,
 	return 0;
 }
 
-static int
-gen6_render_ring_flush__wa(struct intel_ring_buffer *ring,
-			   u32 invalidate_domains, u32 flush_domains)
-{
-	int ret;
-
-	/* Force SNB workarounds for PIPE_CONTROL flushes */
-	ret = intel_emit_post_sync_nonzero_flush(ring);
-	if (ret)
-		return ret;
-
-	return gen6_render_ring_flush(ring, invalidate_domains, flush_domains);
-}
-
 static void ring_write_tail(struct intel_ring_buffer *ring,
 			    u32 value)
 {
@@ -1435,7 +1426,7 @@ int intel_init_render_ring_buffer(struct drm_device *dev)
 		ring->add_request = gen6_add_request;
 		ring->flush = gen7_render_ring_flush;
 		if (INTEL_INFO(dev)->gen == 6)
-			ring->flush = gen6_render_ring_flush__wa;
+			ring->flush = gen6_render_ring_flush;
 		ring->irq_get = gen6_ring_get_irq;
 		ring->irq_put = gen6_ring_put_irq;
 		ring->irq_enable_mask = GT_USER_INTERRUPT;
-- 
1.7.11.2

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH 3/4] drm/i915: add workaround to gen7_render_ring_flush
  2012-08-17 21:35 [PATCH 0/4] Haswell GPU hang fixes Paulo Zanoni
  2012-08-17 21:35 ` [PATCH 1/4] drm/i915: add gen7_render_ring_flush Paulo Zanoni
  2012-08-17 21:35 ` [PATCH 2/4] drm/i915: add workarounds directly to gen6_render_ring_flush Paulo Zanoni
@ 2012-08-17 21:35 ` Paulo Zanoni
  2012-08-28  9:17   ` Daniel Vetter
  2012-08-17 21:35 ` [PATCH 4/4] drm/i915: add one more " Paulo Zanoni
  2012-08-17 22:39 ` [PATCH 0/4] Haswell GPU hang fixes Ben Widawsky
  4 siblings, 1 reply; 11+ messages in thread
From: Paulo Zanoni @ 2012-08-17 21:35 UTC (permalink / raw)
  To: intel-gfx; +Cc: Paulo Zanoni

From: Paulo Zanoni <paulo.r.zanoni@intel.com>

The combination of this commit + the next one will prevent a lot of
gpu hangs.

Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
---
 drivers/gpu/drm/i915/intel_ringbuffer.c | 24 ++++++++++++++++++++++++
 1 file changed, 24 insertions(+)

diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index dc5272b..9895a6e 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -263,6 +263,25 @@ gen6_render_ring_flush(struct intel_ring_buffer *ring,
 }
 
 static int
+gen7_render_ring_cs_stall_wa(struct intel_ring_buffer *ring)
+{
+	int ret;
+
+	ret = intel_ring_begin(ring, 4);
+	if (ret)
+		return ret;
+
+	intel_ring_emit(ring, GFX_OP_PIPE_CONTROL(4));
+	intel_ring_emit(ring, PIPE_CONTROL_CS_STALL |
+			      PIPE_CONTROL_STALL_AT_SCOREBOARD);
+	intel_ring_emit(ring, 0);
+	intel_ring_emit(ring, 0);
+	intel_ring_advance(ring);
+
+	return 0;
+}
+
+static int
 gen7_render_ring_flush(struct intel_ring_buffer *ring,
 		       u32 invalidate_domains, u32 flush_domains)
 {
@@ -295,6 +314,11 @@ gen7_render_ring_flush(struct intel_ring_buffer *ring,
 		 * TLB invalidate requires a post-sync write.
 		 */
 		flags |= PIPE_CONTROL_QW_WRITE;
+
+		/* Workaround: we must issue a pipe_control with CS-stall bit
+		 * set before a pipe_control command that has the state cache
+		 * invalidate bit set. */
+		gen7_render_ring_cs_stall_wa(ring);
 	}
 
 	ret = intel_ring_begin(ring, 4);
-- 
1.7.11.2

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH 4/4] drm/i915: add one more workaround to gen7_render_ring_flush
  2012-08-17 21:35 [PATCH 0/4] Haswell GPU hang fixes Paulo Zanoni
                   ` (2 preceding siblings ...)
  2012-08-17 21:35 ` [PATCH 3/4] drm/i915: add workaround to gen7_render_ring_flush Paulo Zanoni
@ 2012-08-17 21:35 ` Paulo Zanoni
  2012-08-17 22:39 ` [PATCH 0/4] Haswell GPU hang fixes Ben Widawsky
  4 siblings, 0 replies; 11+ messages in thread
From: Paulo Zanoni @ 2012-08-17 21:35 UTC (permalink / raw)
  To: intel-gfx; +Cc: Paulo Zanoni

From: Paulo Zanoni <paulo.r.zanoni@intel.com>

The combination of this commit + the previous one prevents the dozens
of GPU hangs I'm seeing on my gen 7.5 machine.

Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
---
 drivers/gpu/drm/i915/intel_ringbuffer.c | 14 +++++++++-----
 1 file changed, 9 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 9895a6e..7c0bc96 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -290,6 +290,15 @@ gen7_render_ring_flush(struct intel_ring_buffer *ring,
 	u32 scratch_addr = pc->gtt_offset + 128;
 	int ret;
 
+	/*
+	 * Ensure that any following seqno writes only happen when the render
+	 * cache is indeed flushed.
+	 * The documentation also mentions that every 4th PIPE_CONTROL command
+	 * (except the ones with only read-cache invalidate bits set) must have
+	 * the CS_STALL bit set.
+	 */
+	flags |= PIPE_CONTROL_CS_STALL;
+
 	/* Just flush everything.  Experiments have shown that reducing the
 	 * number of bits based on the write domains has little performance
 	 * impact.
@@ -297,11 +306,6 @@ gen7_render_ring_flush(struct intel_ring_buffer *ring,
 	if (flush_domains) {
 		flags |= PIPE_CONTROL_RENDER_TARGET_CACHE_FLUSH;
 		flags |= PIPE_CONTROL_DEPTH_CACHE_FLUSH;
-		/*
-		 * Ensure that any following seqno writes only happen
-		 * when the render cache is indeed flushed.
-		 */
-		flags |= PIPE_CONTROL_CS_STALL;
 	}
 	if (invalidate_domains) {
 		flags |= PIPE_CONTROL_TLB_INVALIDATE;
-- 
1.7.11.2

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH 0/4] Haswell GPU hang fixes
  2012-08-17 21:35 [PATCH 0/4] Haswell GPU hang fixes Paulo Zanoni
                   ` (3 preceding siblings ...)
  2012-08-17 21:35 ` [PATCH 4/4] drm/i915: add one more " Paulo Zanoni
@ 2012-08-17 22:39 ` Ben Widawsky
  2012-08-17 22:42   ` Paulo Zanoni
  4 siblings, 1 reply; 11+ messages in thread
From: Ben Widawsky @ 2012-08-17 22:39 UTC (permalink / raw)
  To: Paulo Zanoni; +Cc: intel-gfx, Paulo Zanoni

On 2012-08-17 14:35, Paulo Zanoni wrote:
> From: Paulo Zanoni <paulo.r.zanoni@intel.com>
>
> Hi
>
> Since we stopped running the gen6 workarounds on gen > 6 I started
> seeing a lot
> of GPU hangs on my gen 7.5 machine. These patches add gen7+ 
> workarounds which
> prevent the GPU hangs I'm seeing.
>
> These patches were tested mostly on HSW and briefly on IVB, but the
> workarounds
> implemented are documented for both. This is also the first time I 
> touch this
> kind of code, so extra careful reviewing might be needed.
>
> Thanks,
> Paulo
>
>
> Paulo Zanoni (4):
>   drm/i915: add gen7_render_ring_flush
>   drm/i915: add workarounds directly to gen6_render_ring_flush
>   drm/i915: add workaround to gen7_render_ring_flush
>   drm/i915: add one more workaround to gen7_render_ring_flush
>
>  drivers/gpu/drm/i915/intel_ringbuffer.c | 81
> ++++++++++++++++++++++++++++++---
>  1 file changed, 74 insertions(+), 7 deletions(-)

I don't think this makes for ideal bisection. I'd rather you implement 
the ring flush + workarounds before you set the pointer. If you want to 
keep the patches split out like this, just add a 5 patch to do the 
pointer change that's in the first patch. Or else collapse patch 1, 3, 4 
into 1.

I didn't really review anything yet though.

-- 
Ben Widawsky, Intel Open Source Technology Center

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 0/4] Haswell GPU hang fixes
  2012-08-17 22:39 ` [PATCH 0/4] Haswell GPU hang fixes Ben Widawsky
@ 2012-08-17 22:42   ` Paulo Zanoni
  2012-08-17 22:53     ` Ben Widawsky
  0 siblings, 1 reply; 11+ messages in thread
From: Paulo Zanoni @ 2012-08-17 22:42 UTC (permalink / raw)
  To: Ben Widawsky; +Cc: intel-gfx, Paulo Zanoni

2012/8/17 Ben Widawsky <ben@bwidawsk.net>:
> On 2012-08-17 14:35, Paulo Zanoni wrote:
>>
>> From: Paulo Zanoni <paulo.r.zanoni@intel.com>
>>
>> Hi
>>
>> Since we stopped running the gen6 workarounds on gen > 6 I started
>> seeing a lot
>> of GPU hangs on my gen 7.5 machine. These patches add gen7+ workarounds
>> which
>> prevent the GPU hangs I'm seeing.
>>
>> These patches were tested mostly on HSW and briefly on IVB, but the
>> workarounds
>> implemented are documented for both. This is also the first time I touch
>> this
>> kind of code, so extra careful reviewing might be needed.
>>
>> Thanks,
>> Paulo
>>
>>
>> Paulo Zanoni (4):
>>   drm/i915: add gen7_render_ring_flush
>>   drm/i915: add workarounds directly to gen6_render_ring_flush
>>   drm/i915: add workaround to gen7_render_ring_flush
>>   drm/i915: add one more workaround to gen7_render_ring_flush
>>
>>  drivers/gpu/drm/i915/intel_ringbuffer.c | 81
>> ++++++++++++++++++++++++++++++---
>>  1 file changed, 74 insertions(+), 7 deletions(-)
>
>
> I don't think this makes for ideal bisection. I'd rather you implement the
> ring flush + workarounds before you set the pointer. If you want to keep the
> patches split out like this, just add a 5 patch to do the pointer change
> that's in the first patch. Or else collapse patch 1, 3, 4 into 1.

But currently gen7 is *already* not running any of the workarounds...
Patch 1 does not introduce any real difference (neither adds nor
removes bugs), patch 2 also does not change anything. Only patches 3
and 4 start to fix the bugs that were already present before the
series.


>
> I didn't really review anything yet though.
>
> --
> Ben Widawsky, Intel Open Source Technology Center



-- 
Paulo Zanoni

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 0/4] Haswell GPU hang fixes
  2012-08-17 22:42   ` Paulo Zanoni
@ 2012-08-17 22:53     ` Ben Widawsky
  0 siblings, 0 replies; 11+ messages in thread
From: Ben Widawsky @ 2012-08-17 22:53 UTC (permalink / raw)
  To: Paulo Zanoni; +Cc: intel-gfx, Paulo Zanoni

On 2012-08-17 15:42, Paulo Zanoni wrote:
> 2012/8/17 Ben Widawsky <ben@bwidawsk.net>:
>> On 2012-08-17 14:35, Paulo Zanoni wrote:
>>>
>>> From: Paulo Zanoni <paulo.r.zanoni@intel.com>
>>>
>>> Hi
>>>
>>> Since we stopped running the gen6 workarounds on gen > 6 I started
>>> seeing a lot
>>> of GPU hangs on my gen 7.5 machine. These patches add gen7+ 
>>> workarounds
>>> which
>>> prevent the GPU hangs I'm seeing.
>>>
>>> These patches were tested mostly on HSW and briefly on IVB, but the
>>> workarounds
>>> implemented are documented for both. This is also the first time I 
>>> touch
>>> this
>>> kind of code, so extra careful reviewing might be needed.
>>>
>>> Thanks,
>>> Paulo
>>>
>>>
>>> Paulo Zanoni (4):
>>>   drm/i915: add gen7_render_ring_flush
>>>   drm/i915: add workarounds directly to gen6_render_ring_flush
>>>   drm/i915: add workaround to gen7_render_ring_flush
>>>   drm/i915: add one more workaround to gen7_render_ring_flush
>>>
>>>  drivers/gpu/drm/i915/intel_ringbuffer.c | 81
>>> ++++++++++++++++++++++++++++++---
>>>  1 file changed, 74 insertions(+), 7 deletions(-)
>>
>>
>> I don't think this makes for ideal bisection. I'd rather you 
>> implement the
>> ring flush + workarounds before you set the pointer. If you want to 
>> keep the
>> patches split out like this, just add a 5 patch to do the pointer 
>> change
>> that's in the first patch. Or else collapse patch 1, 3, 4 into 1.
>
> But currently gen7 is *already* not running any of the workarounds...
> Patch 1 does not introduce any real difference (neither adds nor
> removes bugs), patch 2 also does not change anything. Only patches 3
> and 4 start to fix the bugs that were already present before the
> series.
>

I guess I thought the patches would be, revert Chris patch to separate 
out the workaround, then add IVB specific flush. So now I see you've 
done things slightly differently. Sorry for the rushed comments. Also 
this doesn't fix my platform :/

>
>>
>> I didn't really review anything yet though.
>>
>> --
>> Ben Widawsky, Intel Open Source Technology Center

-- 
Ben Widawsky, Intel Open Source Technology Center

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 3/4] drm/i915: add workaround to gen7_render_ring_flush
  2012-08-17 21:35 ` [PATCH 3/4] drm/i915: add workaround to gen7_render_ring_flush Paulo Zanoni
@ 2012-08-28  9:17   ` Daniel Vetter
  2012-08-28 13:10     ` Paulo Zanoni
  0 siblings, 1 reply; 11+ messages in thread
From: Daniel Vetter @ 2012-08-28  9:17 UTC (permalink / raw)
  To: Paulo Zanoni; +Cc: intel-gfx, Paulo Zanoni

On Fri, Aug 17, 2012 at 06:35:43PM -0300, Paulo Zanoni wrote:
> From: Paulo Zanoni <paulo.r.zanoni@intel.com>
> 
> The combination of this commit + the next one will prevent a lot of
> gpu hangs.
> 
> Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>

tbh I'm not happy with the justification in the commit message here. If
there's anything about this in Bspec, I want a full reference, if this is
just due to the simulator, please say so (and maybe paste the complaint
fulsim raises). If it's just empirical evidence please say which machines
this affects (since afaict it's only confirmed to help on hsw).

[Also directed at Ben.]

Thanks, Daniel
> ---
>  drivers/gpu/drm/i915/intel_ringbuffer.c | 24 ++++++++++++++++++++++++
>  1 file changed, 24 insertions(+)
> 
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
> index dc5272b..9895a6e 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> @@ -263,6 +263,25 @@ gen6_render_ring_flush(struct intel_ring_buffer *ring,
>  }
>  
>  static int
> +gen7_render_ring_cs_stall_wa(struct intel_ring_buffer *ring)
> +{
> +	int ret;
> +
> +	ret = intel_ring_begin(ring, 4);
> +	if (ret)
> +		return ret;
> +
> +	intel_ring_emit(ring, GFX_OP_PIPE_CONTROL(4));
> +	intel_ring_emit(ring, PIPE_CONTROL_CS_STALL |
> +			      PIPE_CONTROL_STALL_AT_SCOREBOARD);
> +	intel_ring_emit(ring, 0);
> +	intel_ring_emit(ring, 0);
> +	intel_ring_advance(ring);
> +
> +	return 0;
> +}
> +
> +static int
>  gen7_render_ring_flush(struct intel_ring_buffer *ring,
>  		       u32 invalidate_domains, u32 flush_domains)
>  {
> @@ -295,6 +314,11 @@ gen7_render_ring_flush(struct intel_ring_buffer *ring,
>  		 * TLB invalidate requires a post-sync write.
>  		 */
>  		flags |= PIPE_CONTROL_QW_WRITE;
> +
> +		/* Workaround: we must issue a pipe_control with CS-stall bit
> +		 * set before a pipe_control command that has the state cache
> +		 * invalidate bit set. */
> +		gen7_render_ring_cs_stall_wa(ring);
>  	}
>  
>  	ret = intel_ring_begin(ring, 4);
> -- 
> 1.7.11.2
> 
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx

-- 
Daniel Vetter
Mail: daniel@ffwll.ch
Mobile: +41 (0)79 365 57 48

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 3/4] drm/i915: add workaround to gen7_render_ring_flush
  2012-08-28  9:17   ` Daniel Vetter
@ 2012-08-28 13:10     ` Paulo Zanoni
  2012-08-28 15:17       ` Daniel Vetter
  0 siblings, 1 reply; 11+ messages in thread
From: Paulo Zanoni @ 2012-08-28 13:10 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: intel-gfx, Paulo Zanoni

2012/8/28 Daniel Vetter <daniel@ffwll.ch>:
> On Fri, Aug 17, 2012 at 06:35:43PM -0300, Paulo Zanoni wrote:
>> From: Paulo Zanoni <paulo.r.zanoni@intel.com>
>>
>> The combination of this commit + the next one will prevent a lot of
>> gpu hangs.
>>
>> Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
>
> tbh I'm not happy with the justification in the commit message here. If
> there's anything about this in Bspec, I want a full reference, if this is
> just due to the simulator, please say so (and maybe paste the complaint
> fulsim raises). If it's just empirical evidence please say which machines
> this affects (since afaict it's only confirmed to help on hsw).

It's from Bspec. See the page that describes the PIPE_CONTROL
register. Look for a piece of text that's very similar to the comment
I wrote. It affects IVB and newer.

This is a regression. If you're not happy with this solution, you can
revert all the commits that happened in gen6_render_ring_flush since
we stopped applying the gen6 workarounds to IVB and newer. This
regression makes Haswell a *pain* to use: you're typing on the
terminal and then suddenly everything freezes because there's a GPU
hang. Then 5 seconds later it happens again. Then 2 minutes later,
again. Then again.

Cheers,
Paulo

>
> [Also directed at Ben.]
>
> Thanks, Daniel
>> ---
>>  drivers/gpu/drm/i915/intel_ringbuffer.c | 24 ++++++++++++++++++++++++
>>  1 file changed, 24 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
>> index dc5272b..9895a6e 100644
>> --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
>> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
>> @@ -263,6 +263,25 @@ gen6_render_ring_flush(struct intel_ring_buffer *ring,
>>  }
>>
>>  static int
>> +gen7_render_ring_cs_stall_wa(struct intel_ring_buffer *ring)
>> +{
>> +     int ret;
>> +
>> +     ret = intel_ring_begin(ring, 4);
>> +     if (ret)
>> +             return ret;
>> +
>> +     intel_ring_emit(ring, GFX_OP_PIPE_CONTROL(4));
>> +     intel_ring_emit(ring, PIPE_CONTROL_CS_STALL |
>> +                           PIPE_CONTROL_STALL_AT_SCOREBOARD);
>> +     intel_ring_emit(ring, 0);
>> +     intel_ring_emit(ring, 0);
>> +     intel_ring_advance(ring);
>> +
>> +     return 0;
>> +}
>> +
>> +static int
>>  gen7_render_ring_flush(struct intel_ring_buffer *ring,
>>                      u32 invalidate_domains, u32 flush_domains)
>>  {
>> @@ -295,6 +314,11 @@ gen7_render_ring_flush(struct intel_ring_buffer *ring,
>>                * TLB invalidate requires a post-sync write.
>>                */
>>               flags |= PIPE_CONTROL_QW_WRITE;
>> +
>> +             /* Workaround: we must issue a pipe_control with CS-stall bit
>> +              * set before a pipe_control command that has the state cache
>> +              * invalidate bit set. */
>> +             gen7_render_ring_cs_stall_wa(ring);
>>       }
>>
>>       ret = intel_ring_begin(ring, 4);
>> --
>> 1.7.11.2
>>
>> _______________________________________________
>> Intel-gfx mailing list
>> Intel-gfx@lists.freedesktop.org
>> http://lists.freedesktop.org/mailman/listinfo/intel-gfx
>
> --
> Daniel Vetter
> Mail: daniel@ffwll.ch
> Mobile: +41 (0)79 365 57 48



-- 
Paulo Zanoni

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 3/4] drm/i915: add workaround to gen7_render_ring_flush
  2012-08-28 13:10     ` Paulo Zanoni
@ 2012-08-28 15:17       ` Daniel Vetter
  0 siblings, 0 replies; 11+ messages in thread
From: Daniel Vetter @ 2012-08-28 15:17 UTC (permalink / raw)
  To: Paulo Zanoni; +Cc: intel-gfx, Paulo Zanoni

On Tue, Aug 28, 2012 at 10:10:13AM -0300, Paulo Zanoni wrote:
> 2012/8/28 Daniel Vetter <daniel@ffwll.ch>:
> > On Fri, Aug 17, 2012 at 06:35:43PM -0300, Paulo Zanoni wrote:
> >> From: Paulo Zanoni <paulo.r.zanoni@intel.com>
> >>
> >> The combination of this commit + the next one will prevent a lot of
> >> gpu hangs.
> >>
> >> Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
> >
> > tbh I'm not happy with the justification in the commit message here. If
> > there's anything about this in Bspec, I want a full reference, if this is
> > just due to the simulator, please say so (and maybe paste the complaint
> > fulsim raises). If it's just empirical evidence please say which machines
> > this affects (since afaict it's only confirmed to help on hsw).
> 
> It's from Bspec. See the page that describes the PIPE_CONTROL
> register. Look for a piece of text that's very similar to the comment
> I wrote. It affects IVB and newer.
> 
> This is a regression. If you're not happy with this solution, you can
> revert all the commits that happened in gen6_render_ring_flush since
> we stopped applying the gen6 workarounds to IVB and newer. This
> regression makes Haswell a *pain* to use: you're typing on the
> terminal and then suddenly everything freezes because there's a GPU
> hang. Then 5 seconds later it happens again. Then 2 minutes later,
> again. Then again.

Ok, after some flaming in private on irc we've resolved our differences.
I've applied the first 2 patches as-is and squashed the later 2 workaround
patches together, with some bikeshedding applied (mostly to cite the bspec
in the commit message, cite the comment that introduced the regression and
also mentioned that this restores hsw stability).

Thanks for tracking this down and writing the patches.

Yours, Daniel
> 
> Cheers,
> Paulo
> 
> >
> > [Also directed at Ben.]
> >
> > Thanks, Daniel
> >> ---
> >>  drivers/gpu/drm/i915/intel_ringbuffer.c | 24 ++++++++++++++++++++++++
> >>  1 file changed, 24 insertions(+)
> >>
> >> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
> >> index dc5272b..9895a6e 100644
> >> --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> >> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> >> @@ -263,6 +263,25 @@ gen6_render_ring_flush(struct intel_ring_buffer *ring,
> >>  }
> >>
> >>  static int
> >> +gen7_render_ring_cs_stall_wa(struct intel_ring_buffer *ring)
> >> +{
> >> +     int ret;
> >> +
> >> +     ret = intel_ring_begin(ring, 4);
> >> +     if (ret)
> >> +             return ret;
> >> +
> >> +     intel_ring_emit(ring, GFX_OP_PIPE_CONTROL(4));
> >> +     intel_ring_emit(ring, PIPE_CONTROL_CS_STALL |
> >> +                           PIPE_CONTROL_STALL_AT_SCOREBOARD);
> >> +     intel_ring_emit(ring, 0);
> >> +     intel_ring_emit(ring, 0);
> >> +     intel_ring_advance(ring);
> >> +
> >> +     return 0;
> >> +}
> >> +
> >> +static int
> >>  gen7_render_ring_flush(struct intel_ring_buffer *ring,
> >>                      u32 invalidate_domains, u32 flush_domains)
> >>  {
> >> @@ -295,6 +314,11 @@ gen7_render_ring_flush(struct intel_ring_buffer *ring,
> >>                * TLB invalidate requires a post-sync write.
> >>                */
> >>               flags |= PIPE_CONTROL_QW_WRITE;
> >> +
> >> +             /* Workaround: we must issue a pipe_control with CS-stall bit
> >> +              * set before a pipe_control command that has the state cache
> >> +              * invalidate bit set. */
> >> +             gen7_render_ring_cs_stall_wa(ring);
> >>       }
> >>
> >>       ret = intel_ring_begin(ring, 4);
> >> --
> >> 1.7.11.2
> >>
> >> _______________________________________________
> >> Intel-gfx mailing list
> >> Intel-gfx@lists.freedesktop.org
> >> http://lists.freedesktop.org/mailman/listinfo/intel-gfx
> >
> > --
> > Daniel Vetter
> > Mail: daniel@ffwll.ch
> > Mobile: +41 (0)79 365 57 48
> 
> 
> 
> -- 
> Paulo Zanoni

-- 
Daniel Vetter
Mail: daniel@ffwll.ch
Mobile: +41 (0)79 365 57 48

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2012-08-28 15:17 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-08-17 21:35 [PATCH 0/4] Haswell GPU hang fixes Paulo Zanoni
2012-08-17 21:35 ` [PATCH 1/4] drm/i915: add gen7_render_ring_flush Paulo Zanoni
2012-08-17 21:35 ` [PATCH 2/4] drm/i915: add workarounds directly to gen6_render_ring_flush Paulo Zanoni
2012-08-17 21:35 ` [PATCH 3/4] drm/i915: add workaround to gen7_render_ring_flush Paulo Zanoni
2012-08-28  9:17   ` Daniel Vetter
2012-08-28 13:10     ` Paulo Zanoni
2012-08-28 15:17       ` Daniel Vetter
2012-08-17 21:35 ` [PATCH 4/4] drm/i915: add one more " Paulo Zanoni
2012-08-17 22:39 ` [PATCH 0/4] Haswell GPU hang fixes Ben Widawsky
2012-08-17 22:42   ` Paulo Zanoni
2012-08-17 22:53     ` Ben Widawsky

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).