All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] drm/i915: WA: FBC Render Nuke.
@ 2013-05-29  0:25 Rodrigo Vivi
  2013-05-29  0:36 ` Rodrigo Vivi
  2013-05-31 15:59 ` Ville Syrjälä
  0 siblings, 2 replies; 27+ messages in thread
From: Rodrigo Vivi @ 2013-05-29  0:25 UTC (permalink / raw)
  To: intel-gfx

WaFbcNukeOn3DBlt for IVB, HSW and VLV.

According BSPec: "Workaround: Do not enable Render Command Streamer tracking for FBC.
Instead insert a LRI to address 0x50380 with data 0x00000004 after the PIPE_CONTROL that
follows each render submission."

v2: Chris noticed that flush_domains check was missing here and also suggested to do
    LRI only when fbc is enabled. To avoid do a I915_READ on every flush lets use the
    module parameter check.

v3: Adding Wa name as Damien suggested.

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@gmail.com>
---
 drivers/gpu/drm/i915/i915_reg.h         |  2 ++
 drivers/gpu/drm/i915/intel_pm.c         |  2 +-
 drivers/gpu/drm/i915/intel_ringbuffer.c | 32 ++++++++++++++++++++++++++++++++
 3 files changed, 35 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index cc4c223..81ac584 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -977,6 +977,8 @@
 /* Framebuffer compression for Ivybridge */
 #define IVB_FBC_RT_BASE			0x7020
 
+#define MSG_FBC_REND_STATE	0x50380
+#define   FBC_REND_NUKE		(1<<2)
 
 #define _HSW_PIPE_SLICE_CHICKEN_1_A	0x420B0
 #define _HSW_PIPE_SLICE_CHICKEN_1_B	0x420B4
diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index 1879188..e830a9b 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -274,7 +274,7 @@ static void gen7_enable_fbc(struct drm_crtc *crtc, unsigned long interval)
 	struct drm_i915_gem_object *obj = intel_fb->obj;
 	struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
 
-	I915_WRITE(IVB_FBC_RT_BASE, obj->gtt_offset | ILK_FBC_RT_VALID);
+	I915_WRITE(IVB_FBC_RT_BASE, obj->gtt_offset);
 
 	if (!intel_edp_is_psr_enabled(dev))
 		I915_WRITE(ILK_DPFC_CONTROL, DPFC_CTL_EN | DPFC_CTL_LIMIT_1X |
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 3d2c236..69491db 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -280,6 +280,30 @@ gen7_render_ring_cs_stall_wa(struct intel_ring_buffer *ring)
 	return 0;
 }
 
+static int gen7_ring_fbc_flush(struct intel_ring_buffer *ring)
+{
+	struct drm_device *dev = ring->dev;
+	int ret;
+
+	if (i915_enable_fbc == 0)
+		return 0;
+
+	if (i915_enable_fbc < 0 && !IS_HASWELL(dev))
+		return 0;
+
+	ret = intel_ring_begin(ring, 4);
+	if (ret)
+		return ret;
+	intel_ring_emit(ring, MI_NOOP);
+	/* WaFbcNukeOn3DBlt:ivb/hsw/vlv */
+	intel_ring_emit(ring, MI_LOAD_REGISTER_IMM(1));
+	intel_ring_emit(ring, MSG_FBC_REND_STATE);
+	intel_ring_emit(ring, FBC_REND_NUKE);
+	intel_ring_advance(ring);
+
+	return 0;
+}
+
 static int
 gen7_render_ring_flush(struct intel_ring_buffer *ring,
 		       u32 invalidate_domains, u32 flush_domains)
@@ -336,6 +360,9 @@ gen7_render_ring_flush(struct intel_ring_buffer *ring,
 	intel_ring_emit(ring, 0);
 	intel_ring_advance(ring);
 
+	if (flush_domains)
+		return gen7_ring_fbc_flush(ring);
+
 	return 0;
 }
 
@@ -1623,6 +1650,7 @@ gen6_ring_dispatch_execbuffer(struct intel_ring_buffer *ring,
 static int blt_ring_flush(struct intel_ring_buffer *ring,
 			  u32 invalidate, u32 flush)
 {
+	struct drm_device *dev = ring->dev;
 	uint32_t cmd;
 	int ret;
 
@@ -1645,6 +1673,10 @@ static int blt_ring_flush(struct intel_ring_buffer *ring,
 	intel_ring_emit(ring, 0);
 	intel_ring_emit(ring, MI_NOOP);
 	intel_ring_advance(ring);
+
+	if (IS_GEN7(dev))
+		return gen7_ring_fbc_flush(ring);
+
 	return 0;
 }
 
-- 
1.8.1.4

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* Re: [PATCH] drm/i915: WA: FBC Render Nuke.
  2013-05-29  0:25 [PATCH] drm/i915: WA: FBC Render Nuke Rodrigo Vivi
@ 2013-05-29  0:36 ` Rodrigo Vivi
  2013-05-31 15:59 ` Ville Syrjälä
  1 sibling, 0 replies; 27+ messages in thread
From: Rodrigo Vivi @ 2013-05-29  0:36 UTC (permalink / raw)
  To: intel-gfx, Ville Syrjälä, Damien Lespiau

Ops, sorry about miss "in-reply-to".. I was here fighting with my git
send-email for a long time doing many attempts...

Anyways, going to the point.
Ville, I decide to let it without SRM described in PM guide because it
doesn't not work properly. I' ve made many attempts here but it doesn'
t work rpopertly.
>From the MI_SOTRE_REGISTER_MEM in BSpec: "This command temporarily
halts command execution."
Besides that we don't have a reliable way to detect directly front
buffer rendering...

I understand the synchronization concerns, but this patch follows the
BSPec workaround and it is working properly here on my both machines
(IVB and HSW) for more than on week now without a single corruption or
slowness.

So, this is the best we have now and I kindly ask you guys to accept
it. It fixed many corruption and slowness that Timothy faced.
In parallel I'm going to contact PM guide writers asking about this
SRM and also asking them new updated versions.

Thanks in advance,
Rodrigo




On Tue, May 28, 2013 at 9:25 PM, Rodrigo Vivi <rodrigo.vivi@gmail.com> wrote:
> WaFbcNukeOn3DBlt for IVB, HSW and VLV.
>
> According BSPec: "Workaround: Do not enable Render Command Streamer tracking for FBC.
> Instead insert a LRI to address 0x50380 with data 0x00000004 after the PIPE_CONTROL that
> follows each render submission."
>
> v2: Chris noticed that flush_domains check was missing here and also suggested to do
>     LRI only when fbc is enabled. To avoid do a I915_READ on every flush lets use the
>     module parameter check.
>
> v3: Adding Wa name as Damien suggested.
>
> Cc: Chris Wilson <chris@chris-wilson.co.uk>
> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@gmail.com>
> ---
>  drivers/gpu/drm/i915/i915_reg.h         |  2 ++
>  drivers/gpu/drm/i915/intel_pm.c         |  2 +-
>  drivers/gpu/drm/i915/intel_ringbuffer.c | 32 ++++++++++++++++++++++++++++++++
>  3 files changed, 35 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
> index cc4c223..81ac584 100644
> --- a/drivers/gpu/drm/i915/i915_reg.h
> +++ b/drivers/gpu/drm/i915/i915_reg.h
> @@ -977,6 +977,8 @@
>  /* Framebuffer compression for Ivybridge */
>  #define IVB_FBC_RT_BASE                        0x7020
>
> +#define MSG_FBC_REND_STATE     0x50380
> +#define   FBC_REND_NUKE                (1<<2)
>
>  #define _HSW_PIPE_SLICE_CHICKEN_1_A    0x420B0
>  #define _HSW_PIPE_SLICE_CHICKEN_1_B    0x420B4
> diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
> index 1879188..e830a9b 100644
> --- a/drivers/gpu/drm/i915/intel_pm.c
> +++ b/drivers/gpu/drm/i915/intel_pm.c
> @@ -274,7 +274,7 @@ static void gen7_enable_fbc(struct drm_crtc *crtc, unsigned long interval)
>         struct drm_i915_gem_object *obj = intel_fb->obj;
>         struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
>
> -       I915_WRITE(IVB_FBC_RT_BASE, obj->gtt_offset | ILK_FBC_RT_VALID);
> +       I915_WRITE(IVB_FBC_RT_BASE, obj->gtt_offset);
>
>         if (!intel_edp_is_psr_enabled(dev))
>                 I915_WRITE(ILK_DPFC_CONTROL, DPFC_CTL_EN | DPFC_CTL_LIMIT_1X |
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
> index 3d2c236..69491db 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> @@ -280,6 +280,30 @@ gen7_render_ring_cs_stall_wa(struct intel_ring_buffer *ring)
>         return 0;
>  }
>
> +static int gen7_ring_fbc_flush(struct intel_ring_buffer *ring)
> +{
> +       struct drm_device *dev = ring->dev;
> +       int ret;
> +
> +       if (i915_enable_fbc == 0)
> +               return 0;
> +
> +       if (i915_enable_fbc < 0 && !IS_HASWELL(dev))
> +               return 0;
> +
> +       ret = intel_ring_begin(ring, 4);
> +       if (ret)
> +               return ret;
> +       intel_ring_emit(ring, MI_NOOP);
> +       /* WaFbcNukeOn3DBlt:ivb/hsw/vlv */
> +       intel_ring_emit(ring, MI_LOAD_REGISTER_IMM(1));
> +       intel_ring_emit(ring, MSG_FBC_REND_STATE);
> +       intel_ring_emit(ring, FBC_REND_NUKE);
> +       intel_ring_advance(ring);
> +
> +       return 0;
> +}
> +
>  static int
>  gen7_render_ring_flush(struct intel_ring_buffer *ring,
>                        u32 invalidate_domains, u32 flush_domains)
> @@ -336,6 +360,9 @@ gen7_render_ring_flush(struct intel_ring_buffer *ring,
>         intel_ring_emit(ring, 0);
>         intel_ring_advance(ring);
>
> +       if (flush_domains)
> +               return gen7_ring_fbc_flush(ring);
> +
>         return 0;
>  }
>
> @@ -1623,6 +1650,7 @@ gen6_ring_dispatch_execbuffer(struct intel_ring_buffer *ring,
>  static int blt_ring_flush(struct intel_ring_buffer *ring,
>                           u32 invalidate, u32 flush)
>  {
> +       struct drm_device *dev = ring->dev;
>         uint32_t cmd;
>         int ret;
>
> @@ -1645,6 +1673,10 @@ static int blt_ring_flush(struct intel_ring_buffer *ring,
>         intel_ring_emit(ring, 0);
>         intel_ring_emit(ring, MI_NOOP);
>         intel_ring_advance(ring);
> +
> +       if (IS_GEN7(dev))
> +               return gen7_ring_fbc_flush(ring);
> +
>         return 0;
>  }
>
> --
> 1.8.1.4
>



-- 
Rodrigo Vivi
Blog: http://blog.vivi.eng.br

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH] drm/i915: WA: FBC Render Nuke.
  2013-05-29  0:25 [PATCH] drm/i915: WA: FBC Render Nuke Rodrigo Vivi
  2013-05-29  0:36 ` Rodrigo Vivi
@ 2013-05-31 15:59 ` Ville Syrjälä
  2013-05-31 20:15   ` Rodrigo Vivi
  1 sibling, 1 reply; 27+ messages in thread
From: Ville Syrjälä @ 2013-05-31 15:59 UTC (permalink / raw)
  To: Rodrigo Vivi; +Cc: intel-gfx

On Tue, May 28, 2013 at 09:25:12PM -0300, Rodrigo Vivi wrote:
> WaFbcNukeOn3DBlt for IVB, HSW and VLV.

VLV doesn't have FBC, so this is a bit incorrect.

> 
> According BSPec: "Workaround: Do not enable Render Command Streamer tracking for FBC.
> Instead insert a LRI to address 0x50380 with data 0x00000004 after the PIPE_CONTROL that
> follows each render submission."
> 
> v2: Chris noticed that flush_domains check was missing here and also suggested to do
>     LRI only when fbc is enabled. To avoid do a I915_READ on every flush lets use the
>     module parameter check.
> 
> v3: Adding Wa name as Damien suggested.
> 
> Cc: Chris Wilson <chris@chris-wilson.co.uk>
> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@gmail.com>
> ---
>  drivers/gpu/drm/i915/i915_reg.h         |  2 ++
>  drivers/gpu/drm/i915/intel_pm.c         |  2 +-
>  drivers/gpu/drm/i915/intel_ringbuffer.c | 32 ++++++++++++++++++++++++++++++++
>  3 files changed, 35 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
> index cc4c223..81ac584 100644
> --- a/drivers/gpu/drm/i915/i915_reg.h
> +++ b/drivers/gpu/drm/i915/i915_reg.h
> @@ -977,6 +977,8 @@
>  /* Framebuffer compression for Ivybridge */
>  #define IVB_FBC_RT_BASE			0x7020
>  
> +#define MSG_FBC_REND_STATE	0x50380
> +#define   FBC_REND_NUKE		(1<<2)
>  
>  #define _HSW_PIPE_SLICE_CHICKEN_1_A	0x420B0
>  #define _HSW_PIPE_SLICE_CHICKEN_1_B	0x420B4
> diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
> index 1879188..e830a9b 100644
> --- a/drivers/gpu/drm/i915/intel_pm.c
> +++ b/drivers/gpu/drm/i915/intel_pm.c
> @@ -274,7 +274,7 @@ static void gen7_enable_fbc(struct drm_crtc *crtc, unsigned long interval)
>  	struct drm_i915_gem_object *obj = intel_fb->obj;
>  	struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
>  
> -	I915_WRITE(IVB_FBC_RT_BASE, obj->gtt_offset | ILK_FBC_RT_VALID);
> +	I915_WRITE(IVB_FBC_RT_BASE, obj->gtt_offset);
>  
>  	if (!intel_edp_is_psr_enabled(dev))
>  		I915_WRITE(ILK_DPFC_CONTROL, DPFC_CTL_EN | DPFC_CTL_LIMIT_1X |
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
> index 3d2c236..69491db 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> @@ -280,6 +280,30 @@ gen7_render_ring_cs_stall_wa(struct intel_ring_buffer *ring)
>  	return 0;
>  }
>  
> +static int gen7_ring_fbc_flush(struct intel_ring_buffer *ring)
> +{
> +	struct drm_device *dev = ring->dev;
> +	int ret;
> +
> +	if (i915_enable_fbc == 0)
> +		return 0;
> +
> +	if (i915_enable_fbc < 0 && !IS_HASWELL(dev))
> +		return 0;
> +
> +	ret = intel_ring_begin(ring, 4);
> +	if (ret)
> +		return ret;
> +	intel_ring_emit(ring, MI_NOOP);
> +	/* WaFbcNukeOn3DBlt:ivb/hsw/vlv */

Another mention of vlv. I can see BSpec makes the same mistake in
the register description though.

> +	intel_ring_emit(ring, MI_LOAD_REGISTER_IMM(1));
> +	intel_ring_emit(ring, MSG_FBC_REND_STATE);
> +	intel_ring_emit(ring, FBC_REND_NUKE);
> +	intel_ring_advance(ring);
> +
> +	return 0;
> +}
> +
>  static int
>  gen7_render_ring_flush(struct intel_ring_buffer *ring,
>  		       u32 invalidate_domains, u32 flush_domains)
> @@ -336,6 +360,9 @@ gen7_render_ring_flush(struct intel_ring_buffer *ring,
>  	intel_ring_emit(ring, 0);
>  	intel_ring_advance(ring);
>  
> +	if (flush_domains)
> +		return gen7_ring_fbc_flush(ring);
> +
>  	return 0;
>  }
>  
> @@ -1623,6 +1650,7 @@ gen6_ring_dispatch_execbuffer(struct intel_ring_buffer *ring,
>  static int blt_ring_flush(struct intel_ring_buffer *ring,
>  			  u32 invalidate, u32 flush)
>  {
> +	struct drm_device *dev = ring->dev;
>  	uint32_t cmd;
>  	int ret;
>  
> @@ -1645,6 +1673,10 @@ static int blt_ring_flush(struct intel_ring_buffer *ring,
>  	intel_ring_emit(ring, 0);
>  	intel_ring_emit(ring, MI_NOOP);
>  	intel_ring_advance(ring);
> +
> +	if (IS_GEN7(dev))
> +		return gen7_ring_fbc_flush(ring);

Should check flush_domains here as well?

So we're now using the same nuke mechanism from the blt ring too.
Should we then drop the regular blitter tracking things from fbc_enable?

Oh and what about vcs and vecs, should we nuke from those rings as well?
I guess it would be strange to write to the primary plane's buffer via
vcs, but I'm assuming vebox could write the same formats that we can
scan out...

> +
>  	return 0;
>  }
>  
> -- 
> 1.8.1.4
> 
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx

-- 
Ville Syrjälä
Intel OTC

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH] drm/i915: WA: FBC Render Nuke.
  2013-05-31 15:59 ` Ville Syrjälä
@ 2013-05-31 20:15   ` Rodrigo Vivi
  2013-06-03 11:34     ` Ville Syrjälä
  0 siblings, 1 reply; 27+ messages in thread
From: Rodrigo Vivi @ 2013-05-31 20:15 UTC (permalink / raw)
  To: Ville Syrjälä; +Cc: intel-gfx

Hi Ville,

Thanks for the comments.


On Fri, May 31, 2013 at 12:59 PM, Ville Syrjälä
<ville.syrjala@linux.intel.com> wrote:
> On Tue, May 28, 2013 at 09:25:12PM -0300, Rodrigo Vivi wrote:
>> WaFbcNukeOn3DBlt for IVB, HSW and VLV.
>
> VLV doesn't have FBC, so this is a bit incorrect.

I'm going to remove the vlv mention that incorrectly came from spec...

>
>>
>> According BSPec: "Workaround: Do not enable Render Command Streamer tracking for FBC.
>> Instead insert a LRI to address 0x50380 with data 0x00000004 after the PIPE_CONTROL that
>> follows each render submission."
>>
>> v2: Chris noticed that flush_domains check was missing here and also suggested to do
>>     LRI only when fbc is enabled. To avoid do a I915_READ on every flush lets use the
>>     module parameter check.
>>
>> v3: Adding Wa name as Damien suggested.
>>
>> Cc: Chris Wilson <chris@chris-wilson.co.uk>
>> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@gmail.com>
>> ---
>>  drivers/gpu/drm/i915/i915_reg.h         |  2 ++
>>  drivers/gpu/drm/i915/intel_pm.c         |  2 +-
>>  drivers/gpu/drm/i915/intel_ringbuffer.c | 32 ++++++++++++++++++++++++++++++++
>>  3 files changed, 35 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
>> index cc4c223..81ac584 100644
>> --- a/drivers/gpu/drm/i915/i915_reg.h
>> +++ b/drivers/gpu/drm/i915/i915_reg.h
>> @@ -977,6 +977,8 @@
>>  /* Framebuffer compression for Ivybridge */
>>  #define IVB_FBC_RT_BASE                      0x7020
>>
>> +#define MSG_FBC_REND_STATE   0x50380
>> +#define   FBC_REND_NUKE              (1<<2)
>>
>>  #define _HSW_PIPE_SLICE_CHICKEN_1_A  0x420B0
>>  #define _HSW_PIPE_SLICE_CHICKEN_1_B  0x420B4
>> diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
>> index 1879188..e830a9b 100644
>> --- a/drivers/gpu/drm/i915/intel_pm.c
>> +++ b/drivers/gpu/drm/i915/intel_pm.c
>> @@ -274,7 +274,7 @@ static void gen7_enable_fbc(struct drm_crtc *crtc, unsigned long interval)
>>       struct drm_i915_gem_object *obj = intel_fb->obj;
>>       struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
>>
>> -     I915_WRITE(IVB_FBC_RT_BASE, obj->gtt_offset | ILK_FBC_RT_VALID);
>> +     I915_WRITE(IVB_FBC_RT_BASE, obj->gtt_offset);
>>
>>       if (!intel_edp_is_psr_enabled(dev))
>>               I915_WRITE(ILK_DPFC_CONTROL, DPFC_CTL_EN | DPFC_CTL_LIMIT_1X |
>> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
>> index 3d2c236..69491db 100644
>> --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
>> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
>> @@ -280,6 +280,30 @@ gen7_render_ring_cs_stall_wa(struct intel_ring_buffer *ring)
>>       return 0;
>>  }
>>
>> +static int gen7_ring_fbc_flush(struct intel_ring_buffer *ring)
>> +{
>> +     struct drm_device *dev = ring->dev;
>> +     int ret;
>> +
>> +     if (i915_enable_fbc == 0)
>> +             return 0;
>> +
>> +     if (i915_enable_fbc < 0 && !IS_HASWELL(dev))
>> +             return 0;
>> +
>> +     ret = intel_ring_begin(ring, 4);
>> +     if (ret)
>> +             return ret;
>> +     intel_ring_emit(ring, MI_NOOP);
>> +     /* WaFbcNukeOn3DBlt:ivb/hsw/vlv */
>
> Another mention of vlv. I can see BSpec makes the same mistake in
> the register description though.

... as you noticed.

>
>> +     intel_ring_emit(ring, MI_LOAD_REGISTER_IMM(1));
>> +     intel_ring_emit(ring, MSG_FBC_REND_STATE);
>> +     intel_ring_emit(ring, FBC_REND_NUKE);
>> +     intel_ring_advance(ring);
>> +
>> +     return 0;
>> +}
>> +
>>  static int
>>  gen7_render_ring_flush(struct intel_ring_buffer *ring,
>>                      u32 invalidate_domains, u32 flush_domains)
>> @@ -336,6 +360,9 @@ gen7_render_ring_flush(struct intel_ring_buffer *ring,
>>       intel_ring_emit(ring, 0);
>>       intel_ring_advance(ring);
>>
>> +     if (flush_domains)
>> +             return gen7_ring_fbc_flush(ring);
>> +
>>       return 0;
>>  }
>>
>> @@ -1623,6 +1650,7 @@ gen6_ring_dispatch_execbuffer(struct intel_ring_buffer *ring,
>>  static int blt_ring_flush(struct intel_ring_buffer *ring,
>>                         u32 invalidate, u32 flush)
>>  {
>> +     struct drm_device *dev = ring->dev;
>>       uint32_t cmd;
>>       int ret;
>>
>> @@ -1645,6 +1673,10 @@ static int blt_ring_flush(struct intel_ring_buffer *ring,
>>       intel_ring_emit(ring, 0);
>>       intel_ring_emit(ring, MI_NOOP);
>>       intel_ring_advance(ring);
>> +
>> +     if (IS_GEN7(dev))
>> +             return gen7_ring_fbc_flush(ring);
>
> Should check flush_domains here as well?

here is flush domain by definition, isn' t it?

>
> So we're now using the same nuke mechanism from the blt ring too.
> Should we then drop the regular blitter tracking things from fbc_enable?

This is a good question. Since this is a critical patch and it is
working in the way it is I prefer to let it in the way it is and
promisse that I will try to drop old blitter tracking for ivb and hsw
later. If it works I'll send the drop in another patch.

>
> Oh and what about vcs and vecs, should we nuke from those rings as well?
> I guess it would be strange to write to the primary plane's buffer via
> vcs, but I'm assuming vebox could write the same formats that we can
> scan out...

To be truly honest with you I have no idea about these case. specs
just says to put after every pipe_control following flush
renderings... and blt.

>
>> +
>>       return 0;
>>  }
>>
>> --
>> 1.8.1.4
>>
>> _______________________________________________
>> Intel-gfx mailing list
>> Intel-gfx@lists.freedesktop.org
>> http://lists.freedesktop.org/mailman/listinfo/intel-gfx
>
> --
> Ville Syrjälä
> Intel OTC



--
Rodrigo Vivi
Blog: http://blog.vivi.eng.br

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH] drm/i915: WA: FBC Render Nuke.
  2013-05-31 20:15   ` Rodrigo Vivi
@ 2013-06-03 11:34     ` Ville Syrjälä
  2013-06-03 16:50       ` Rodrigo Vivi
  0 siblings, 1 reply; 27+ messages in thread
From: Ville Syrjälä @ 2013-06-03 11:34 UTC (permalink / raw)
  To: Rodrigo Vivi; +Cc: intel-gfx

On Fri, May 31, 2013 at 05:15:41PM -0300, Rodrigo Vivi wrote:
> Hi Ville,
> 
> Thanks for the comments.
> 
> 
> On Fri, May 31, 2013 at 12:59 PM, Ville Syrjälä
> <ville.syrjala@linux.intel.com> wrote:
> > On Tue, May 28, 2013 at 09:25:12PM -0300, Rodrigo Vivi wrote:
> >> WaFbcNukeOn3DBlt for IVB, HSW and VLV.
> >
> > VLV doesn't have FBC, so this is a bit incorrect.
> 
> I'm going to remove the vlv mention that incorrectly came from spec...
> 
> >
> >>
> >> According BSPec: "Workaround: Do not enable Render Command Streamer tracking for FBC.
> >> Instead insert a LRI to address 0x50380 with data 0x00000004 after the PIPE_CONTROL that
> >> follows each render submission."
> >>
> >> v2: Chris noticed that flush_domains check was missing here and also suggested to do
> >>     LRI only when fbc is enabled. To avoid do a I915_READ on every flush lets use the
> >>     module parameter check.
> >>
> >> v3: Adding Wa name as Damien suggested.
> >>
> >> Cc: Chris Wilson <chris@chris-wilson.co.uk>
> >> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@gmail.com>
> >> ---
> >>  drivers/gpu/drm/i915/i915_reg.h         |  2 ++
> >>  drivers/gpu/drm/i915/intel_pm.c         |  2 +-
> >>  drivers/gpu/drm/i915/intel_ringbuffer.c | 32 ++++++++++++++++++++++++++++++++
> >>  3 files changed, 35 insertions(+), 1 deletion(-)
> >>
> >> diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
> >> index cc4c223..81ac584 100644
> >> --- a/drivers/gpu/drm/i915/i915_reg.h
> >> +++ b/drivers/gpu/drm/i915/i915_reg.h
> >> @@ -977,6 +977,8 @@
> >>  /* Framebuffer compression for Ivybridge */
> >>  #define IVB_FBC_RT_BASE                      0x7020
> >>
> >> +#define MSG_FBC_REND_STATE   0x50380
> >> +#define   FBC_REND_NUKE              (1<<2)
> >>
> >>  #define _HSW_PIPE_SLICE_CHICKEN_1_A  0x420B0
> >>  #define _HSW_PIPE_SLICE_CHICKEN_1_B  0x420B4
> >> diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
> >> index 1879188..e830a9b 100644
> >> --- a/drivers/gpu/drm/i915/intel_pm.c
> >> +++ b/drivers/gpu/drm/i915/intel_pm.c
> >> @@ -274,7 +274,7 @@ static void gen7_enable_fbc(struct drm_crtc *crtc, unsigned long interval)
> >>       struct drm_i915_gem_object *obj = intel_fb->obj;
> >>       struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
> >>
> >> -     I915_WRITE(IVB_FBC_RT_BASE, obj->gtt_offset | ILK_FBC_RT_VALID);
> >> +     I915_WRITE(IVB_FBC_RT_BASE, obj->gtt_offset);
> >>
> >>       if (!intel_edp_is_psr_enabled(dev))
> >>               I915_WRITE(ILK_DPFC_CONTROL, DPFC_CTL_EN | DPFC_CTL_LIMIT_1X |
> >> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
> >> index 3d2c236..69491db 100644
> >> --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> >> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> >> @@ -280,6 +280,30 @@ gen7_render_ring_cs_stall_wa(struct intel_ring_buffer *ring)
> >>       return 0;
> >>  }
> >>
> >> +static int gen7_ring_fbc_flush(struct intel_ring_buffer *ring)
> >> +{
> >> +     struct drm_device *dev = ring->dev;
> >> +     int ret;
> >> +
> >> +     if (i915_enable_fbc == 0)
> >> +             return 0;
> >> +
> >> +     if (i915_enable_fbc < 0 && !IS_HASWELL(dev))
> >> +             return 0;
> >> +
> >> +     ret = intel_ring_begin(ring, 4);
> >> +     if (ret)
> >> +             return ret;
> >> +     intel_ring_emit(ring, MI_NOOP);
> >> +     /* WaFbcNukeOn3DBlt:ivb/hsw/vlv */
> >
> > Another mention of vlv. I can see BSpec makes the same mistake in
> > the register description though.
> 
> ... as you noticed.
> 
> >
> >> +     intel_ring_emit(ring, MI_LOAD_REGISTER_IMM(1));
> >> +     intel_ring_emit(ring, MSG_FBC_REND_STATE);
> >> +     intel_ring_emit(ring, FBC_REND_NUKE);
> >> +     intel_ring_advance(ring);
> >> +
> >> +     return 0;
> >> +}
> >> +
> >>  static int
> >>  gen7_render_ring_flush(struct intel_ring_buffer *ring,
> >>                      u32 invalidate_domains, u32 flush_domains)
> >> @@ -336,6 +360,9 @@ gen7_render_ring_flush(struct intel_ring_buffer *ring,
> >>       intel_ring_emit(ring, 0);
> >>       intel_ring_advance(ring);
> >>
> >> +     if (flush_domains)
> >> +             return gen7_ring_fbc_flush(ring);
> >> +
> >>       return 0;
> >>  }
> >>
> >> @@ -1623,6 +1650,7 @@ gen6_ring_dispatch_execbuffer(struct intel_ring_buffer *ring,
> >>  static int blt_ring_flush(struct intel_ring_buffer *ring,
> >>                         u32 invalidate, u32 flush)
> >>  {
> >> +     struct drm_device *dev = ring->dev;
> >>       uint32_t cmd;
> >>       int ret;
> >>
> >> @@ -1645,6 +1673,10 @@ static int blt_ring_flush(struct intel_ring_buffer *ring,
> >>       intel_ring_emit(ring, 0);
> >>       intel_ring_emit(ring, MI_NOOP);
> >>       intel_ring_advance(ring);
> >> +
> >> +     if (IS_GEN7(dev))
> >> +             return gen7_ring_fbc_flush(ring);
> >
> > Should check flush_domains here as well?
> 
> here is flush domain by definition, isn' t it?

How so?

> 
> >
> > So we're now using the same nuke mechanism from the blt ring too.
> > Should we then drop the regular blitter tracking things from fbc_enable?
> 
> This is a good question. Since this is a critical patch and it is
> working in the way it is I prefer to let it in the way it is and
> promisse that I will try to drop old blitter tracking for ivb and hsw
> later. If it works I'll send the drop in another patch.
> 
> >
> > Oh and what about vcs and vecs, should we nuke from those rings as well?
> > I guess it would be strange to write to the primary plane's buffer via
> > vcs, but I'm assuming vebox could write the same formats that we can
> > scan out...
> 
> To be truly honest with you I have no idea about these case. specs
> just says to put after every pipe_control following flush
> renderings... and blt.

IIRC the spec doesn't say anything about blt.

Ah, the PM giude tells you to do LRIs w/ blt too. But it actually says
that you should do "cache clean" LRIs insted of "nuke" LRIs.

> 
> >
> >> +
> >>       return 0;
> >>  }
> >>
> >> --
> >> 1.8.1.4
> >>
> >> _______________________________________________
> >> Intel-gfx mailing list
> >> Intel-gfx@lists.freedesktop.org
> >> http://lists.freedesktop.org/mailman/listinfo/intel-gfx
> >
> > --
> > Ville Syrjälä
> > Intel OTC
> 
> 
> 
> --
> Rodrigo Vivi
> Blog: http://blog.vivi.eng.br

-- 
Ville Syrjälä
Intel OTC

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH] drm/i915: WA: FBC Render Nuke.
  2013-06-03 11:34     ` Ville Syrjälä
@ 2013-06-03 16:50       ` Rodrigo Vivi
  2013-06-03 17:03         ` Ville Syrjälä
  2013-06-03 17:08         ` Rodrigo Vivi
  0 siblings, 2 replies; 27+ messages in thread
From: Rodrigo Vivi @ 2013-06-03 16:50 UTC (permalink / raw)
  To: Ville Syrjälä; +Cc: intel-gfx

On Mon, Jun 3, 2013 at 8:34 AM, Ville Syrjälä
<ville.syrjala@linux.intel.com> wrote:
> On Fri, May 31, 2013 at 05:15:41PM -0300, Rodrigo Vivi wrote:
>> Hi Ville,
>>
>> Thanks for the comments.
>>
>>
>> On Fri, May 31, 2013 at 12:59 PM, Ville Syrjälä
>> <ville.syrjala@linux.intel.com> wrote:
>> > On Tue, May 28, 2013 at 09:25:12PM -0300, Rodrigo Vivi wrote:
>> >> WaFbcNukeOn3DBlt for IVB, HSW and VLV.
>> >
>> > VLV doesn't have FBC, so this is a bit incorrect.
>>
>> I'm going to remove the vlv mention that incorrectly came from spec...
>>
>> >
>> >>
>> >> According BSPec: "Workaround: Do not enable Render Command Streamer tracking for FBC.
>> >> Instead insert a LRI to address 0x50380 with data 0x00000004 after the PIPE_CONTROL that
>> >> follows each render submission."
>> >>
>> >> v2: Chris noticed that flush_domains check was missing here and also suggested to do
>> >>     LRI only when fbc is enabled. To avoid do a I915_READ on every flush lets use the
>> >>     module parameter check.
>> >>
>> >> v3: Adding Wa name as Damien suggested.
>> >>
>> >> Cc: Chris Wilson <chris@chris-wilson.co.uk>
>> >> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@gmail.com>
>> >> ---
>> >>  drivers/gpu/drm/i915/i915_reg.h         |  2 ++
>> >>  drivers/gpu/drm/i915/intel_pm.c         |  2 +-
>> >>  drivers/gpu/drm/i915/intel_ringbuffer.c | 32 ++++++++++++++++++++++++++++++++
>> >>  3 files changed, 35 insertions(+), 1 deletion(-)
>> >>
>> >> diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
>> >> index cc4c223..81ac584 100644
>> >> --- a/drivers/gpu/drm/i915/i915_reg.h
>> >> +++ b/drivers/gpu/drm/i915/i915_reg.h
>> >> @@ -977,6 +977,8 @@
>> >>  /* Framebuffer compression for Ivybridge */
>> >>  #define IVB_FBC_RT_BASE                      0x7020
>> >>
>> >> +#define MSG_FBC_REND_STATE   0x50380
>> >> +#define   FBC_REND_NUKE              (1<<2)
>> >>
>> >>  #define _HSW_PIPE_SLICE_CHICKEN_1_A  0x420B0
>> >>  #define _HSW_PIPE_SLICE_CHICKEN_1_B  0x420B4
>> >> diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
>> >> index 1879188..e830a9b 100644
>> >> --- a/drivers/gpu/drm/i915/intel_pm.c
>> >> +++ b/drivers/gpu/drm/i915/intel_pm.c
>> >> @@ -274,7 +274,7 @@ static void gen7_enable_fbc(struct drm_crtc *crtc, unsigned long interval)
>> >>       struct drm_i915_gem_object *obj = intel_fb->obj;
>> >>       struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
>> >>
>> >> -     I915_WRITE(IVB_FBC_RT_BASE, obj->gtt_offset | ILK_FBC_RT_VALID);
>> >> +     I915_WRITE(IVB_FBC_RT_BASE, obj->gtt_offset);
>> >>
>> >>       if (!intel_edp_is_psr_enabled(dev))
>> >>               I915_WRITE(ILK_DPFC_CONTROL, DPFC_CTL_EN | DPFC_CTL_LIMIT_1X |
>> >> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
>> >> index 3d2c236..69491db 100644
>> >> --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
>> >> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
>> >> @@ -280,6 +280,30 @@ gen7_render_ring_cs_stall_wa(struct intel_ring_buffer *ring)
>> >>       return 0;
>> >>  }
>> >>
>> >> +static int gen7_ring_fbc_flush(struct intel_ring_buffer *ring)
>> >> +{
>> >> +     struct drm_device *dev = ring->dev;
>> >> +     int ret;
>> >> +
>> >> +     if (i915_enable_fbc == 0)
>> >> +             return 0;
>> >> +
>> >> +     if (i915_enable_fbc < 0 && !IS_HASWELL(dev))
>> >> +             return 0;
>> >> +
>> >> +     ret = intel_ring_begin(ring, 4);
>> >> +     if (ret)
>> >> +             return ret;
>> >> +     intel_ring_emit(ring, MI_NOOP);
>> >> +     /* WaFbcNukeOn3DBlt:ivb/hsw/vlv */
>> >
>> > Another mention of vlv. I can see BSpec makes the same mistake in
>> > the register description though.
>>
>> ... as you noticed.
>>
>> >
>> >> +     intel_ring_emit(ring, MI_LOAD_REGISTER_IMM(1));
>> >> +     intel_ring_emit(ring, MSG_FBC_REND_STATE);
>> >> +     intel_ring_emit(ring, FBC_REND_NUKE);
>> >> +     intel_ring_advance(ring);
>> >> +
>> >> +     return 0;
>> >> +}
>> >> +
>> >>  static int
>> >>  gen7_render_ring_flush(struct intel_ring_buffer *ring,
>> >>                      u32 invalidate_domains, u32 flush_domains)
>> >> @@ -336,6 +360,9 @@ gen7_render_ring_flush(struct intel_ring_buffer *ring,
>> >>       intel_ring_emit(ring, 0);
>> >>       intel_ring_advance(ring);
>> >>
>> >> +     if (flush_domains)
>> >> +             return gen7_ring_fbc_flush(ring);
>> >> +
>> >>       return 0;
>> >>  }
>> >>
>> >> @@ -1623,6 +1650,7 @@ gen6_ring_dispatch_execbuffer(struct intel_ring_buffer *ring,
>> >>  static int blt_ring_flush(struct intel_ring_buffer *ring,
>> >>                         u32 invalidate, u32 flush)
>> >>  {
>> >> +     struct drm_device *dev = ring->dev;
>> >>       uint32_t cmd;
>> >>       int ret;
>> >>
>> >> @@ -1645,6 +1673,10 @@ static int blt_ring_flush(struct intel_ring_buffer *ring,
>> >>       intel_ring_emit(ring, 0);
>> >>       intel_ring_emit(ring, MI_NOOP);
>> >>       intel_ring_advance(ring);
>> >> +
>> >> +     if (IS_GEN7(dev))
>> >> +             return gen7_ring_fbc_flush(ring);
>> >
>> > Should check flush_domains here as well?
>>
>> here is flush domain by definition, isn' t it?
>
> How so?

this function is the ring->flush. how is it possible to have it out of
flush domain? or are the names just confusing me?

>
>>
>> >
>> > So we're now using the same nuke mechanism from the blt ring too.
>> > Should we then drop the regular blitter tracking things from fbc_enable?
>>
>> This is a good question. Since this is a critical patch and it is
>> working in the way it is I prefer to let it in the way it is and
>> promisse that I will try to drop old blitter tracking for ivb and hsw
>> later. If it works I'll send the drop in another patch.
>>
>> >
>> > Oh and what about vcs and vecs, should we nuke from those rings as well?
>> > I guess it would be strange to write to the primary plane's buffer via
>> > vcs, but I'm assuming vebox could write the same formats that we can
>> > scan out...
>>
>> To be truly honest with you I have no idea about these case. specs
>> just says to put after every pipe_control following flush
>> renderings... and blt.
>
> IIRC the spec doesn't say anything about blt.
>
> Ah, the PM giude tells you to do LRIs w/ blt too. But it actually says
> that you should do "cache clean" LRIs insted of "nuke" LRIs.

It is just the name, but same address and bit.
from BSPec: " Driver must program a MI_FLUSH_DW followed by a LRI into
the BCS ring to generate a cache clean message to FBC (LRI to offset
0x50380 with data 0x00000002)."

With this blit in the way it is I' m getting best rendering and power
saving performance.

>
>>
>> >
>> >> +
>> >>       return 0;
>> >>  }
>> >>
>> >> --
>> >> 1.8.1.4
>> >>
>> >> _______________________________________________
>> >> Intel-gfx mailing list
>> >> Intel-gfx@lists.freedesktop.org
>> >> http://lists.freedesktop.org/mailman/listinfo/intel-gfx
>> >
>> > --
>> > Ville Syrjälä
>> > Intel OTC
>>
>>
>>
>> --
>> Rodrigo Vivi
>> Blog: http://blog.vivi.eng.br
>
> --
> Ville Syrjälä
> Intel OTC



--
Rodrigo Vivi
Blog: http://blog.vivi.eng.br

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH] drm/i915: WA: FBC Render Nuke.
  2013-06-03 16:50       ` Rodrigo Vivi
@ 2013-06-03 17:03         ` Ville Syrjälä
  2013-06-03 18:41           ` Rodrigo Vivi
  2013-06-03 17:08         ` Rodrigo Vivi
  1 sibling, 1 reply; 27+ messages in thread
From: Ville Syrjälä @ 2013-06-03 17:03 UTC (permalink / raw)
  To: Rodrigo Vivi; +Cc: intel-gfx

On Mon, Jun 03, 2013 at 01:50:46PM -0300, Rodrigo Vivi wrote:
> On Mon, Jun 3, 2013 at 8:34 AM, Ville Syrjälä
> <ville.syrjala@linux.intel.com> wrote:
> > On Fri, May 31, 2013 at 05:15:41PM -0300, Rodrigo Vivi wrote:
> >> Hi Ville,
> >>
> >> Thanks for the comments.
> >>
> >>
> >> On Fri, May 31, 2013 at 12:59 PM, Ville Syrjälä
> >> <ville.syrjala@linux.intel.com> wrote:
> >> > On Tue, May 28, 2013 at 09:25:12PM -0300, Rodrigo Vivi wrote:
> >> >> WaFbcNukeOn3DBlt for IVB, HSW and VLV.
> >> >
> >> > VLV doesn't have FBC, so this is a bit incorrect.
> >>
> >> I'm going to remove the vlv mention that incorrectly came from spec...
> >>
> >> >
> >> >>
> >> >> According BSPec: "Workaround: Do not enable Render Command Streamer tracking for FBC.
> >> >> Instead insert a LRI to address 0x50380 with data 0x00000004 after the PIPE_CONTROL that
> >> >> follows each render submission."
> >> >>
> >> >> v2: Chris noticed that flush_domains check was missing here and also suggested to do
> >> >>     LRI only when fbc is enabled. To avoid do a I915_READ on every flush lets use the
> >> >>     module parameter check.
> >> >>
> >> >> v3: Adding Wa name as Damien suggested.
> >> >>
> >> >> Cc: Chris Wilson <chris@chris-wilson.co.uk>
> >> >> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@gmail.com>
> >> >> ---
> >> >>  drivers/gpu/drm/i915/i915_reg.h         |  2 ++
> >> >>  drivers/gpu/drm/i915/intel_pm.c         |  2 +-
> >> >>  drivers/gpu/drm/i915/intel_ringbuffer.c | 32 ++++++++++++++++++++++++++++++++
> >> >>  3 files changed, 35 insertions(+), 1 deletion(-)
> >> >>
> >> >> diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
> >> >> index cc4c223..81ac584 100644
> >> >> --- a/drivers/gpu/drm/i915/i915_reg.h
> >> >> +++ b/drivers/gpu/drm/i915/i915_reg.h
> >> >> @@ -977,6 +977,8 @@
> >> >>  /* Framebuffer compression for Ivybridge */
> >> >>  #define IVB_FBC_RT_BASE                      0x7020
> >> >>
> >> >> +#define MSG_FBC_REND_STATE   0x50380
> >> >> +#define   FBC_REND_NUKE              (1<<2)
> >> >>
> >> >>  #define _HSW_PIPE_SLICE_CHICKEN_1_A  0x420B0
> >> >>  #define _HSW_PIPE_SLICE_CHICKEN_1_B  0x420B4
> >> >> diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
> >> >> index 1879188..e830a9b 100644
> >> >> --- a/drivers/gpu/drm/i915/intel_pm.c
> >> >> +++ b/drivers/gpu/drm/i915/intel_pm.c
> >> >> @@ -274,7 +274,7 @@ static void gen7_enable_fbc(struct drm_crtc *crtc, unsigned long interval)
> >> >>       struct drm_i915_gem_object *obj = intel_fb->obj;
> >> >>       struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
> >> >>
> >> >> -     I915_WRITE(IVB_FBC_RT_BASE, obj->gtt_offset | ILK_FBC_RT_VALID);
> >> >> +     I915_WRITE(IVB_FBC_RT_BASE, obj->gtt_offset);
> >> >>
> >> >>       if (!intel_edp_is_psr_enabled(dev))
> >> >>               I915_WRITE(ILK_DPFC_CONTROL, DPFC_CTL_EN | DPFC_CTL_LIMIT_1X |
> >> >> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
> >> >> index 3d2c236..69491db 100644
> >> >> --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> >> >> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> >> >> @@ -280,6 +280,30 @@ gen7_render_ring_cs_stall_wa(struct intel_ring_buffer *ring)
> >> >>       return 0;
> >> >>  }
> >> >>
> >> >> +static int gen7_ring_fbc_flush(struct intel_ring_buffer *ring)
> >> >> +{
> >> >> +     struct drm_device *dev = ring->dev;
> >> >> +     int ret;
> >> >> +
> >> >> +     if (i915_enable_fbc == 0)
> >> >> +             return 0;
> >> >> +
> >> >> +     if (i915_enable_fbc < 0 && !IS_HASWELL(dev))
> >> >> +             return 0;
> >> >> +
> >> >> +     ret = intel_ring_begin(ring, 4);
> >> >> +     if (ret)
> >> >> +             return ret;
> >> >> +     intel_ring_emit(ring, MI_NOOP);
> >> >> +     /* WaFbcNukeOn3DBlt:ivb/hsw/vlv */
> >> >
> >> > Another mention of vlv. I can see BSpec makes the same mistake in
> >> > the register description though.
> >>
> >> ... as you noticed.
> >>
> >> >
> >> >> +     intel_ring_emit(ring, MI_LOAD_REGISTER_IMM(1));
> >> >> +     intel_ring_emit(ring, MSG_FBC_REND_STATE);
> >> >> +     intel_ring_emit(ring, FBC_REND_NUKE);
> >> >> +     intel_ring_advance(ring);
> >> >> +
> >> >> +     return 0;
> >> >> +}
> >> >> +
> >> >>  static int
> >> >>  gen7_render_ring_flush(struct intel_ring_buffer *ring,
> >> >>                      u32 invalidate_domains, u32 flush_domains)
> >> >> @@ -336,6 +360,9 @@ gen7_render_ring_flush(struct intel_ring_buffer *ring,
> >> >>       intel_ring_emit(ring, 0);
> >> >>       intel_ring_advance(ring);
> >> >>
> >> >> +     if (flush_domains)
> >> >> +             return gen7_ring_fbc_flush(ring);
> >> >> +
> >> >>       return 0;
> >> >>  }
> >> >>
> >> >> @@ -1623,6 +1650,7 @@ gen6_ring_dispatch_execbuffer(struct intel_ring_buffer *ring,
> >> >>  static int blt_ring_flush(struct intel_ring_buffer *ring,
> >> >>                         u32 invalidate, u32 flush)
> >> >>  {
> >> >> +     struct drm_device *dev = ring->dev;
> >> >>       uint32_t cmd;
> >> >>       int ret;
> >> >>
> >> >> @@ -1645,6 +1673,10 @@ static int blt_ring_flush(struct intel_ring_buffer *ring,
> >> >>       intel_ring_emit(ring, 0);
> >> >>       intel_ring_emit(ring, MI_NOOP);
> >> >>       intel_ring_advance(ring);
> >> >> +
> >> >> +     if (IS_GEN7(dev))
> >> >> +             return gen7_ring_fbc_flush(ring);
> >> >
> >> > Should check flush_domains here as well?
> >>
> >> here is flush domain by definition, isn' t it?
> >
> > How so?
> 
> this function is the ring->flush. how is it possible to have it out of
> flush domain? or are the names just confusing me?

->flush() takes as parameters invalidate and flush domain bitmasks. You
already check for flush_domains in gen7_render_ring_flush(). If you're
just invalidating, nothing gets flushed to the memory and hence the nuke
procedure is not needed.

> >> >
> >> > So we're now using the same nuke mechanism from the blt ring too.
> >> > Should we then drop the regular blitter tracking things from fbc_enable?
> >>
> >> This is a good question. Since this is a critical patch and it is
> >> working in the way it is I prefer to let it in the way it is and
> >> promisse that I will try to drop old blitter tracking for ivb and hsw
> >> later. If it works I'll send the drop in another patch.
> >>
> >> >
> >> > Oh and what about vcs and vecs, should we nuke from those rings as well?
> >> > I guess it would be strange to write to the primary plane's buffer via
> >> > vcs, but I'm assuming vebox could write the same formats that we can
> >> > scan out...
> >>
> >> To be truly honest with you I have no idea about these case. specs
> >> just says to put after every pipe_control following flush
> >> renderings... and blt.
> >
> > IIRC the spec doesn't say anything about blt.
> >
> > Ah, the PM giude tells you to do LRIs w/ blt too. But it actually says
> > that you should do "cache clean" LRIs insted of "nuke" LRIs.
> 
> It is just the name, but same address and bit.

It's not the same bit. Nuke is bit 2, cache clean is bit 1.

> from BSPec: " Driver must program a MI_FLUSH_DW followed by a LRI into
> the BCS ring to generate a cache clean message to FBC (LRI to offset
> 0x50380 with data 0x00000002)."

0x2 == (1 << 1)
aka. the cache clean bit, just like the text said.

> 
> With this blit in the way it is I' m getting best rendering and power
> saving performance.
> 
> >
> >>
> >> >
> >> >> +
> >> >>       return 0;
> >> >>  }
> >> >>
> >> >> --
> >> >> 1.8.1.4
> >> >>
> >> >> _______________________________________________
> >> >> Intel-gfx mailing list
> >> >> Intel-gfx@lists.freedesktop.org
> >> >> http://lists.freedesktop.org/mailman/listinfo/intel-gfx
> >> >
> >> > --
> >> > Ville Syrjälä
> >> > Intel OTC
> >>
> >>
> >>
> >> --
> >> Rodrigo Vivi
> >> Blog: http://blog.vivi.eng.br
> >
> > --
> > Ville Syrjälä
> > Intel OTC
> 
> 
> 
> --
> Rodrigo Vivi
> Blog: http://blog.vivi.eng.br

-- 
Ville Syrjälä
Intel OTC

^ permalink raw reply	[flat|nested] 27+ messages in thread

* [PATCH] drm/i915: WA: FBC Render Nuke.
  2013-06-03 16:50       ` Rodrigo Vivi
  2013-06-03 17:03         ` Ville Syrjälä
@ 2013-06-03 17:08         ` Rodrigo Vivi
  2013-06-03 17:10           ` Rodrigo Vivi
  1 sibling, 1 reply; 27+ messages in thread
From: Rodrigo Vivi @ 2013-06-03 17:08 UTC (permalink / raw)
  To: intel-gfx

WaFbcNukeOn3DBlt for IVB, HSW.

According BSPec: "Workaround: Do not enable Render Command Streamer tracking for FBC.
Instead insert a LRI to address 0x50380 with data 0x00000004 after the PIPE_CONTROL that
follows each render submission."

v2: Chris noticed that flush_domains check was missing here and also suggested to do
    LRI only when fbc is enabled. To avoid do a I915_READ on every flush lets use the
    module parameter check.

v3: Adding Wa name as Damien suggested.

v4: Ville noticed VLV doesn't support fbc at all and comment came wrong from spec.

Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@gmail.com>
---
 drivers/gpu/drm/i915/i915_reg.h         |  2 ++
 drivers/gpu/drm/i915/intel_pm.c         |  2 +-
 drivers/gpu/drm/i915/intel_ringbuffer.c | 32 ++++++++++++++++++++++++++++++++
 3 files changed, 35 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index cc4c223..81ac584 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -977,6 +977,8 @@
 /* Framebuffer compression for Ivybridge */
 #define IVB_FBC_RT_BASE			0x7020
 
+#define MSG_FBC_REND_STATE	0x50380
+#define   FBC_REND_NUKE		(1<<2)
 
 #define _HSW_PIPE_SLICE_CHICKEN_1_A	0x420B0
 #define _HSW_PIPE_SLICE_CHICKEN_1_B	0x420B4
diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index 1879188..e830a9b 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -274,7 +274,7 @@ static void gen7_enable_fbc(struct drm_crtc *crtc, unsigned long interval)
 	struct drm_i915_gem_object *obj = intel_fb->obj;
 	struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
 
-	I915_WRITE(IVB_FBC_RT_BASE, obj->gtt_offset | ILK_FBC_RT_VALID);
+	I915_WRITE(IVB_FBC_RT_BASE, obj->gtt_offset);
 
 	if (!intel_edp_is_psr_enabled(dev))
 		I915_WRITE(ILK_DPFC_CONTROL, DPFC_CTL_EN | DPFC_CTL_LIMIT_1X |
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 3d2c236..46df1ff 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -280,6 +280,30 @@ gen7_render_ring_cs_stall_wa(struct intel_ring_buffer *ring)
 	return 0;
 }
 
+static int gen7_ring_fbc_flush(struct intel_ring_buffer *ring)
+{
+	struct drm_device *dev = ring->dev;
+	int ret;
+
+	if (i915_enable_fbc == 0)
+		return 0;
+
+	if (i915_enable_fbc < 0 && !IS_HASWELL(dev))
+		return 0;
+
+	ret = intel_ring_begin(ring, 4);
+	if (ret)
+		return ret;
+	intel_ring_emit(ring, MI_NOOP);
+	/* WaFbcNukeOn3DBlt:ivb/hsw */
+	intel_ring_emit(ring, MI_LOAD_REGISTER_IMM(1));
+	intel_ring_emit(ring, MSG_FBC_REND_STATE);
+	intel_ring_emit(ring, FBC_REND_NUKE);
+	intel_ring_advance(ring);
+
+	return 0;
+}
+
 static int
 gen7_render_ring_flush(struct intel_ring_buffer *ring,
 		       u32 invalidate_domains, u32 flush_domains)
@@ -336,6 +360,9 @@ gen7_render_ring_flush(struct intel_ring_buffer *ring,
 	intel_ring_emit(ring, 0);
 	intel_ring_advance(ring);
 
+	if (flush_domains)
+		return gen7_ring_fbc_flush(ring);
+
 	return 0;
 }
 
@@ -1623,6 +1650,7 @@ gen6_ring_dispatch_execbuffer(struct intel_ring_buffer *ring,
 static int blt_ring_flush(struct intel_ring_buffer *ring,
 			  u32 invalidate, u32 flush)
 {
+	struct drm_device *dev = ring->dev;
 	uint32_t cmd;
 	int ret;
 
@@ -1645,6 +1673,10 @@ static int blt_ring_flush(struct intel_ring_buffer *ring,
 	intel_ring_emit(ring, 0);
 	intel_ring_emit(ring, MI_NOOP);
 	intel_ring_advance(ring);
+
+	if (IS_GEN7(dev))
+		return gen7_ring_fbc_flush(ring);
+
 	return 0;
 }
 
-- 
1.7.11.7

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* Re: [PATCH] drm/i915: WA: FBC Render Nuke.
  2013-06-03 17:08         ` Rodrigo Vivi
@ 2013-06-03 17:10           ` Rodrigo Vivi
  0 siblings, 0 replies; 27+ messages in thread
From: Rodrigo Vivi @ 2013-06-03 17:10 UTC (permalink / raw)
  To: intel-gfx

ops, ignore this... you were fast than my git send-email....
you are right.. my bad... going to test this clean version here...

On Mon, Jun 3, 2013 at 2:08 PM, Rodrigo Vivi <rodrigo.vivi@gmail.com> wrote:
> WaFbcNukeOn3DBlt for IVB, HSW.
>
> According BSPec: "Workaround: Do not enable Render Command Streamer tracking for FBC.
> Instead insert a LRI to address 0x50380 with data 0x00000004 after the PIPE_CONTROL that
> follows each render submission."
>
> v2: Chris noticed that flush_domains check was missing here and also suggested to do
>     LRI only when fbc is enabled. To avoid do a I915_READ on every flush lets use the
>     module parameter check.
>
> v3: Adding Wa name as Damien suggested.
>
> v4: Ville noticed VLV doesn't support fbc at all and comment came wrong from spec.
>
> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
> Cc: Chris Wilson <chris@chris-wilson.co.uk>
> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@gmail.com>
> ---
>  drivers/gpu/drm/i915/i915_reg.h         |  2 ++
>  drivers/gpu/drm/i915/intel_pm.c         |  2 +-
>  drivers/gpu/drm/i915/intel_ringbuffer.c | 32 ++++++++++++++++++++++++++++++++
>  3 files changed, 35 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
> index cc4c223..81ac584 100644
> --- a/drivers/gpu/drm/i915/i915_reg.h
> +++ b/drivers/gpu/drm/i915/i915_reg.h
> @@ -977,6 +977,8 @@
>  /* Framebuffer compression for Ivybridge */
>  #define IVB_FBC_RT_BASE                        0x7020
>
> +#define MSG_FBC_REND_STATE     0x50380
> +#define   FBC_REND_NUKE                (1<<2)
>
>  #define _HSW_PIPE_SLICE_CHICKEN_1_A    0x420B0
>  #define _HSW_PIPE_SLICE_CHICKEN_1_B    0x420B4
> diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
> index 1879188..e830a9b 100644
> --- a/drivers/gpu/drm/i915/intel_pm.c
> +++ b/drivers/gpu/drm/i915/intel_pm.c
> @@ -274,7 +274,7 @@ static void gen7_enable_fbc(struct drm_crtc *crtc, unsigned long interval)
>         struct drm_i915_gem_object *obj = intel_fb->obj;
>         struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
>
> -       I915_WRITE(IVB_FBC_RT_BASE, obj->gtt_offset | ILK_FBC_RT_VALID);
> +       I915_WRITE(IVB_FBC_RT_BASE, obj->gtt_offset);
>
>         if (!intel_edp_is_psr_enabled(dev))
>                 I915_WRITE(ILK_DPFC_CONTROL, DPFC_CTL_EN | DPFC_CTL_LIMIT_1X |
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
> index 3d2c236..46df1ff 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> @@ -280,6 +280,30 @@ gen7_render_ring_cs_stall_wa(struct intel_ring_buffer *ring)
>         return 0;
>  }
>
> +static int gen7_ring_fbc_flush(struct intel_ring_buffer *ring)
> +{
> +       struct drm_device *dev = ring->dev;
> +       int ret;
> +
> +       if (i915_enable_fbc == 0)
> +               return 0;
> +
> +       if (i915_enable_fbc < 0 && !IS_HASWELL(dev))
> +               return 0;
> +
> +       ret = intel_ring_begin(ring, 4);
> +       if (ret)
> +               return ret;
> +       intel_ring_emit(ring, MI_NOOP);
> +       /* WaFbcNukeOn3DBlt:ivb/hsw */
> +       intel_ring_emit(ring, MI_LOAD_REGISTER_IMM(1));
> +       intel_ring_emit(ring, MSG_FBC_REND_STATE);
> +       intel_ring_emit(ring, FBC_REND_NUKE);
> +       intel_ring_advance(ring);
> +
> +       return 0;
> +}
> +
>  static int
>  gen7_render_ring_flush(struct intel_ring_buffer *ring,
>                        u32 invalidate_domains, u32 flush_domains)
> @@ -336,6 +360,9 @@ gen7_render_ring_flush(struct intel_ring_buffer *ring,
>         intel_ring_emit(ring, 0);
>         intel_ring_advance(ring);
>
> +       if (flush_domains)
> +               return gen7_ring_fbc_flush(ring);
> +
>         return 0;
>  }
>
> @@ -1623,6 +1650,7 @@ gen6_ring_dispatch_execbuffer(struct intel_ring_buffer *ring,
>  static int blt_ring_flush(struct intel_ring_buffer *ring,
>                           u32 invalidate, u32 flush)
>  {
> +       struct drm_device *dev = ring->dev;
>         uint32_t cmd;
>         int ret;
>
> @@ -1645,6 +1673,10 @@ static int blt_ring_flush(struct intel_ring_buffer *ring,
>         intel_ring_emit(ring, 0);
>         intel_ring_emit(ring, MI_NOOP);
>         intel_ring_advance(ring);
> +
> +       if (IS_GEN7(dev))
> +               return gen7_ring_fbc_flush(ring);
> +
>         return 0;
>  }
>
> --
> 1.7.11.7
>



-- 
Rodrigo Vivi
Blog: http://blog.vivi.eng.br

^ permalink raw reply	[flat|nested] 27+ messages in thread

* [PATCH] drm/i915: WA: FBC Render Nuke.
  2013-06-03 17:03         ` Ville Syrjälä
@ 2013-06-03 18:41           ` Rodrigo Vivi
  2013-06-04  7:06             ` Ville Syrjälä
  0 siblings, 1 reply; 27+ messages in thread
From: Rodrigo Vivi @ 2013-06-03 18:41 UTC (permalink / raw)
  To: intel-gfx

WaFbcNukeOn3DBlt for IVB, HSW.

According BSPec: "Workaround: Do not enable Render Command Streamer tracking for FBC.
Instead insert a LRI to address 0x50380 with data 0x00000004 after the PIPE_CONTROL that
follows each render submission."

v2: Chris noticed that flush_domains check was missing here and also suggested to do
    LRI only when fbc is enabled. To avoid do a I915_READ on every flush lets use the
    module parameter check.

v3: Adding Wa name as Damien suggested.

v4: Ville noticed VLV doesn't support fbc at all and comment came wrong from spec.

v5: Ville noticed than on blt a Cache Clean LRI should be used instead the Nuke one.

Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@gmail.com>
---
 drivers/gpu/drm/i915/i915_reg.h         |  3 +++
 drivers/gpu/drm/i915/intel_pm.c         |  2 +-
 drivers/gpu/drm/i915/intel_ringbuffer.c | 32 ++++++++++++++++++++++++++++++++
 3 files changed, 36 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index cc4c223..f37ddee 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -977,6 +977,9 @@
 /* Framebuffer compression for Ivybridge */
 #define IVB_FBC_RT_BASE			0x7020
 
+#define MSG_FBC_REND_STATE	0x50380
+#define   FBC_REND_NUKE		(1<<2)
+#define   FBC_REND_CACHE_CLEAN	(1<<1)
 
 #define _HSW_PIPE_SLICE_CHICKEN_1_A	0x420B0
 #define _HSW_PIPE_SLICE_CHICKEN_1_B	0x420B4
diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index 1879188..e830a9b 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -274,7 +274,7 @@ static void gen7_enable_fbc(struct drm_crtc *crtc, unsigned long interval)
 	struct drm_i915_gem_object *obj = intel_fb->obj;
 	struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
 
-	I915_WRITE(IVB_FBC_RT_BASE, obj->gtt_offset | ILK_FBC_RT_VALID);
+	I915_WRITE(IVB_FBC_RT_BASE, obj->gtt_offset);
 
 	if (!intel_edp_is_psr_enabled(dev))
 		I915_WRITE(ILK_DPFC_CONTROL, DPFC_CTL_EN | DPFC_CTL_LIMIT_1X |
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 3d2c236..3e24639 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -280,6 +280,30 @@ gen7_render_ring_cs_stall_wa(struct intel_ring_buffer *ring)
 	return 0;
 }
 
+static int gen7_ring_fbc_flush(struct intel_ring_buffer *ring, bool nuke)
+{
+	struct drm_device *dev = ring->dev;
+	int ret;
+
+	if (i915_enable_fbc == 0)
+		return 0;
+
+	if (i915_enable_fbc < 0 && !IS_HASWELL(dev))
+		return 0;
+
+	ret = intel_ring_begin(ring, 4);
+	if (ret)
+		return ret;
+	intel_ring_emit(ring, MI_NOOP);
+	/* WaFbcNukeOn3DBlt:ivb/hsw */
+	intel_ring_emit(ring, MI_LOAD_REGISTER_IMM(1));
+	intel_ring_emit(ring, MSG_FBC_REND_STATE);
+	intel_ring_emit(ring, nuke ? FBC_REND_NUKE : FBC_REND_CACHE_CLEAN);
+	intel_ring_advance(ring);
+
+	return 0;
+}
+
 static int
 gen7_render_ring_flush(struct intel_ring_buffer *ring,
 		       u32 invalidate_domains, u32 flush_domains)
@@ -336,6 +360,9 @@ gen7_render_ring_flush(struct intel_ring_buffer *ring,
 	intel_ring_emit(ring, 0);
 	intel_ring_advance(ring);
 
+	if (flush_domains)
+		return gen7_ring_fbc_flush(ring, true);
+
 	return 0;
 }
 
@@ -1623,6 +1650,7 @@ gen6_ring_dispatch_execbuffer(struct intel_ring_buffer *ring,
 static int blt_ring_flush(struct intel_ring_buffer *ring,
 			  u32 invalidate, u32 flush)
 {
+	struct drm_device *dev = ring->dev;
 	uint32_t cmd;
 	int ret;
 
@@ -1645,6 +1673,10 @@ static int blt_ring_flush(struct intel_ring_buffer *ring,
 	intel_ring_emit(ring, 0);
 	intel_ring_emit(ring, MI_NOOP);
 	intel_ring_advance(ring);
+
+	if (IS_GEN7(dev))
+		return gen7_ring_fbc_flush(ring, false);
+
 	return 0;
 }
 
-- 
1.7.11.7

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* Re: [PATCH] drm/i915: WA: FBC Render Nuke.
  2013-06-03 18:41           ` Rodrigo Vivi
@ 2013-06-04  7:06             ` Ville Syrjälä
  2013-06-04  8:09               ` Chris Wilson
  0 siblings, 1 reply; 27+ messages in thread
From: Ville Syrjälä @ 2013-06-04  7:06 UTC (permalink / raw)
  To: Rodrigo Vivi; +Cc: intel-gfx

On Mon, Jun 03, 2013 at 03:41:49PM -0300, Rodrigo Vivi wrote:
> WaFbcNukeOn3DBlt for IVB, HSW.
> 
> According BSPec: "Workaround: Do not enable Render Command Streamer tracking for FBC.
> Instead insert a LRI to address 0x50380 with data 0x00000004 after the PIPE_CONTROL that
> follows each render submission."
> 
> v2: Chris noticed that flush_domains check was missing here and also suggested to do
>     LRI only when fbc is enabled. To avoid do a I915_READ on every flush lets use the
>     module parameter check.
> 
> v3: Adding Wa name as Damien suggested.
> 
> v4: Ville noticed VLV doesn't support fbc at all and comment came wrong from spec.
> 
> v5: Ville noticed than on blt a Cache Clean LRI should be used instead the Nuke one.
> 
> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
> Cc: Chris Wilson <chris@chris-wilson.co.uk>
> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@gmail.com>
> ---
>  drivers/gpu/drm/i915/i915_reg.h         |  3 +++
>  drivers/gpu/drm/i915/intel_pm.c         |  2 +-
>  drivers/gpu/drm/i915/intel_ringbuffer.c | 32 ++++++++++++++++++++++++++++++++
>  3 files changed, 36 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
> index cc4c223..f37ddee 100644
> --- a/drivers/gpu/drm/i915/i915_reg.h
> +++ b/drivers/gpu/drm/i915/i915_reg.h
> @@ -977,6 +977,9 @@
>  /* Framebuffer compression for Ivybridge */
>  #define IVB_FBC_RT_BASE			0x7020
>  
> +#define MSG_FBC_REND_STATE	0x50380
> +#define   FBC_REND_NUKE		(1<<2)
> +#define   FBC_REND_CACHE_CLEAN	(1<<1)
>  
>  #define _HSW_PIPE_SLICE_CHICKEN_1_A	0x420B0
>  #define _HSW_PIPE_SLICE_CHICKEN_1_B	0x420B4
> diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
> index 1879188..e830a9b 100644
> --- a/drivers/gpu/drm/i915/intel_pm.c
> +++ b/drivers/gpu/drm/i915/intel_pm.c
> @@ -274,7 +274,7 @@ static void gen7_enable_fbc(struct drm_crtc *crtc, unsigned long interval)
>  	struct drm_i915_gem_object *obj = intel_fb->obj;
>  	struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
>  
> -	I915_WRITE(IVB_FBC_RT_BASE, obj->gtt_offset | ILK_FBC_RT_VALID);
> +	I915_WRITE(IVB_FBC_RT_BASE, obj->gtt_offset);
>  
>  	if (!intel_edp_is_psr_enabled(dev))
>  		I915_WRITE(ILK_DPFC_CONTROL, DPFC_CTL_EN | DPFC_CTL_LIMIT_1X |
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
> index 3d2c236..3e24639 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> @@ -280,6 +280,30 @@ gen7_render_ring_cs_stall_wa(struct intel_ring_buffer *ring)
>  	return 0;
>  }
>  
> +static int gen7_ring_fbc_flush(struct intel_ring_buffer *ring, bool nuke)
> +{
> +	struct drm_device *dev = ring->dev;
> +	int ret;
> +
> +	if (i915_enable_fbc == 0)
> +		return 0;
> +
> +	if (i915_enable_fbc < 0 && !IS_HASWELL(dev))
> +		return 0;
> +
> +	ret = intel_ring_begin(ring, 4);
> +	if (ret)
> +		return ret;
> +	intel_ring_emit(ring, MI_NOOP);
> +	/* WaFbcNukeOn3DBlt:ivb/hsw */
> +	intel_ring_emit(ring, MI_LOAD_REGISTER_IMM(1));
> +	intel_ring_emit(ring, MSG_FBC_REND_STATE);
> +	intel_ring_emit(ring, nuke ? FBC_REND_NUKE : FBC_REND_CACHE_CLEAN);
> +	intel_ring_advance(ring);
> +
> +	return 0;
> +}
> +
>  static int
>  gen7_render_ring_flush(struct intel_ring_buffer *ring,
>  		       u32 invalidate_domains, u32 flush_domains)
> @@ -336,6 +360,9 @@ gen7_render_ring_flush(struct intel_ring_buffer *ring,
>  	intel_ring_emit(ring, 0);
>  	intel_ring_advance(ring);
>  
> +	if (flush_domains)
> +		return gen7_ring_fbc_flush(ring, true);
> +
>  	return 0;
>  }
>  
> @@ -1623,6 +1650,7 @@ gen6_ring_dispatch_execbuffer(struct intel_ring_buffer *ring,
>  static int blt_ring_flush(struct intel_ring_buffer *ring,
>  			  u32 invalidate, u32 flush)
>  {
> +	struct drm_device *dev = ring->dev;
>  	uint32_t cmd;
>  	int ret;
>  
> @@ -1645,6 +1673,10 @@ static int blt_ring_flush(struct intel_ring_buffer *ring,
>  	intel_ring_emit(ring, 0);
>  	intel_ring_emit(ring, MI_NOOP);
>  	intel_ring_advance(ring);
> +
> +	if (IS_GEN7(dev))
> +		return gen7_ring_fbc_flush(ring, false);

Still no flush_domains check?

Oh and looks like you need to rebase the patch since these functions
got renamed.

> +
>  	return 0;
>  }
>  
> -- 
> 1.7.11.7

-- 
Ville Syrjälä
Intel OTC

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH] drm/i915: WA: FBC Render Nuke.
  2013-06-04  7:06             ` Ville Syrjälä
@ 2013-06-04  8:09               ` Chris Wilson
  2013-06-06 14:49                 ` Rodrigo Vivi
  0 siblings, 1 reply; 27+ messages in thread
From: Chris Wilson @ 2013-06-04  8:09 UTC (permalink / raw)
  To: Ville Syrjälä; +Cc: intel-gfx

On Tue, Jun 04, 2013 at 10:06:08AM +0300, Ville Syrjälä wrote:
> On Mon, Jun 03, 2013 at 03:41:49PM -0300, Rodrigo Vivi wrote:
> > +	if (IS_GEN7(dev))
> > +		return gen7_ring_fbc_flush(ring, false);
> 
> Still no flush_domains check?

And if we are being picky, not using the fbc_dirty flag either.
Also pass the dword to gen7_ring_fbc_flush() rather than bool nuke, as
it is more descriptive and flexible.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

^ permalink raw reply	[flat|nested] 27+ messages in thread

* [PATCH] drm/i915: WA: FBC Render Nuke.
  2013-06-04  8:09               ` Chris Wilson
@ 2013-06-06 14:49                 ` Rodrigo Vivi
  2013-06-06 15:00                   ` Chris Wilson
  0 siblings, 1 reply; 27+ messages in thread
From: Rodrigo Vivi @ 2013-06-06 14:49 UTC (permalink / raw)
  To: intel-gfx

WaFbcNukeOn3DBlt for IVB, HSW.

According BSPec: "Workaround: Do not enable Render Command Streamer tracking for FBC.
Instead insert a LRI to address 0x50380 with data 0x00000004 after the PIPE_CONTROL that
follows each render submission."

v2: Chris noticed that flush_domains check was missing here and also suggested to do
    LRI only when fbc is enabled. To avoid do a I915_READ on every flush lets use the
    module parameter check.

v3: Adding Wa name as Damien suggested.

v4: Ville noticed VLV doesn't support fbc at all and comment came wrong from spec.

v5: Ville noticed than on blt a Cache Clean LRI should be used instead the Nuke one.

v6: Check for flush domain on blt (by Ville).
    Check for scanout dirty (by Chris).

Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@gmail.com>
---
 drivers/gpu/drm/i915/i915_reg.h         |  4 ++++
 drivers/gpu/drm/i915/intel_pm.c         |  2 +-
 drivers/gpu/drm/i915/intel_ringbuffer.c | 36 +++++++++++++++++++++++++++++++++
 3 files changed, 41 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index 47a9de0..757bbb7 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -1026,6 +1026,10 @@
 #define IPS_CTL		0x43408
 #define   IPS_ENABLE	(1 << 31)
 
+#define MSG_FBC_REND_STATE	0x50380
+#define   FBC_REND_NUKE		(1<<2)
+#define   FBC_REND_CACHE_CLEAN	(1<<1)
+
 #define _HSW_PIPE_SLICE_CHICKEN_1_A	0x420B0
 #define _HSW_PIPE_SLICE_CHICKEN_1_B	0x420B4
 #define   HSW_BYPASS_FBC_QUEUE		(1<<22)
diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index 50fe3d7..324d1ce 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -274,7 +274,7 @@ static void gen7_enable_fbc(struct drm_crtc *crtc, unsigned long interval)
 	struct drm_i915_gem_object *obj = intel_fb->obj;
 	struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
 
-	I915_WRITE(IVB_FBC_RT_BASE, obj->gtt_offset | ILK_FBC_RT_VALID);
+	I915_WRITE(IVB_FBC_RT_BASE, obj->gtt_offset);
 
 	I915_WRITE(ILK_DPFC_CONTROL, DPFC_CTL_EN | DPFC_CTL_LIMIT_1X |
 		   IVB_DPFC_CTL_FENCE_EN |
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 0e72da6..cec5246 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -280,6 +280,34 @@ gen7_render_ring_cs_stall_wa(struct intel_ring_buffer *ring)
 	return 0;
 }
 
+static int gen7_ring_fbc_flush(struct intel_ring_buffer *ring, u32 value)
+{
+	struct drm_device *dev = ring->dev;
+	struct drm_i915_gem_object *obj = ring->obj;
+	int ret;
+
+	if (i915_enable_fbc == 0)
+		return 0;
+
+	if (i915_enable_fbc < 0 && !IS_HASWELL(dev))
+		return 0;
+
+	if (obj->base.write_domain != I915_GEM_DOMAIN_GTT)
+		return 0;
+
+	ret = intel_ring_begin(ring, 4);
+	if (ret)
+		return ret;
+	intel_ring_emit(ring, MI_NOOP);
+	/* WaFbcNukeOn3DBlt:ivb/hsw */
+	intel_ring_emit(ring, MI_LOAD_REGISTER_IMM(1));
+	intel_ring_emit(ring, MSG_FBC_REND_STATE);
+	intel_ring_emit(ring, value);
+	intel_ring_advance(ring);
+
+	return 0;
+}
+
 static int
 gen7_render_ring_flush(struct intel_ring_buffer *ring,
 		       u32 invalidate_domains, u32 flush_domains)
@@ -336,6 +364,9 @@ gen7_render_ring_flush(struct intel_ring_buffer *ring,
 	intel_ring_emit(ring, 0);
 	intel_ring_advance(ring);
 
+	if (flush_domains)
+		return gen7_ring_fbc_flush(ring, FBC_REND_NUKE);
+
 	return 0;
 }
 
@@ -1685,6 +1716,7 @@ gen6_ring_dispatch_execbuffer(struct intel_ring_buffer *ring,
 static int gen6_ring_flush(struct intel_ring_buffer *ring,
 			   u32 invalidate, u32 flush)
 {
+	struct drm_device *dev = ring->dev;
 	uint32_t cmd;
 	int ret;
 
@@ -1707,6 +1739,10 @@ static int gen6_ring_flush(struct intel_ring_buffer *ring,
 	intel_ring_emit(ring, 0);
 	intel_ring_emit(ring, MI_NOOP);
 	intel_ring_advance(ring);
+
+	if (IS_GEN7(dev) && flush)
+		return gen7_ring_fbc_flush(ring, FBC_REND_CACHE_CLEAN);
+
 	return 0;
 }
 
-- 
1.7.11.7

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* Re: [PATCH] drm/i915: WA: FBC Render Nuke.
  2013-06-06 14:49                 ` Rodrigo Vivi
@ 2013-06-06 15:00                   ` Chris Wilson
  2013-06-06 15:16                     ` Rodrigo Vivi
                                       ` (2 more replies)
  0 siblings, 3 replies; 27+ messages in thread
From: Chris Wilson @ 2013-06-06 15:00 UTC (permalink / raw)
  To: Rodrigo Vivi; +Cc: intel-gfx

On Thu, Jun 06, 2013 at 11:49:56AM -0300, Rodrigo Vivi wrote:
> WaFbcNukeOn3DBlt for IVB, HSW.
> 
> According BSPec: "Workaround: Do not enable Render Command Streamer tracking for FBC.
> Instead insert a LRI to address 0x50380 with data 0x00000004 after the PIPE_CONTROL that
> follows each render submission."
> 
> v2: Chris noticed that flush_domains check was missing here and also suggested to do
>     LRI only when fbc is enabled. To avoid do a I915_READ on every flush lets use the
>     module parameter check.
> 
> v3: Adding Wa name as Damien suggested.
> 
> v4: Ville noticed VLV doesn't support fbc at all and comment came wrong from spec.
> 
> v5: Ville noticed than on blt a Cache Clean LRI should be used instead the Nuke one.
> 
> v6: Check for flush domain on blt (by Ville).
>     Check for scanout dirty (by Chris).
Note quite what I had in mind, see
https://patchwork.kernel.org/patch/2606131/
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH] drm/i915: WA: FBC Render Nuke.
  2013-06-06 15:00                   ` Chris Wilson
@ 2013-06-06 15:16                     ` Rodrigo Vivi
  2013-06-06 19:53                     ` [PATCH 1/2] drm/i915: Track when we dirty the scanout with render commands Rodrigo Vivi
  2013-06-06 19:53                     ` Rodrigo Vivi
  2 siblings, 0 replies; 27+ messages in thread
From: Rodrigo Vivi @ 2013-06-06 15:16 UTC (permalink / raw)
  To: Chris Wilson, Rodrigo Vivi, intel-gfx, Ville Syrjälä

please just ignore this version...
going to try fbc_dirty and other changes here...

On Thu, Jun 6, 2013 at 12:00 PM, Chris Wilson <chris@chris-wilson.co.uk> wrote:
> On Thu, Jun 06, 2013 at 11:49:56AM -0300, Rodrigo Vivi wrote:
>> WaFbcNukeOn3DBlt for IVB, HSW.
>>
>> According BSPec: "Workaround: Do not enable Render Command Streamer tracking for FBC.
>> Instead insert a LRI to address 0x50380 with data 0x00000004 after the PIPE_CONTROL that
>> follows each render submission."
>>
>> v2: Chris noticed that flush_domains check was missing here and also suggested to do
>>     LRI only when fbc is enabled. To avoid do a I915_READ on every flush lets use the
>>     module parameter check.
>>
>> v3: Adding Wa name as Damien suggested.
>>
>> v4: Ville noticed VLV doesn't support fbc at all and comment came wrong from spec.
>>
>> v5: Ville noticed than on blt a Cache Clean LRI should be used instead the Nuke one.
>>
>> v6: Check for flush domain on blt (by Ville).
>>     Check for scanout dirty (by Chris).
> Note quite what I had in mind, see
> https://patchwork.kernel.org/patch/2606131/
> -Chris
>
> --
> Chris Wilson, Intel Open Source Technology Centre



-- 
Rodrigo Vivi
Blog: http://blog.vivi.eng.br

^ permalink raw reply	[flat|nested] 27+ messages in thread

* [PATCH 1/2] drm/i915: Track when we dirty the scanout with render commands
  2013-06-06 15:00                   ` Chris Wilson
  2013-06-06 15:16                     ` Rodrigo Vivi
@ 2013-06-06 19:53                     ` Rodrigo Vivi
  2013-06-06 19:53                     ` Rodrigo Vivi
  2 siblings, 0 replies; 27+ messages in thread
From: Rodrigo Vivi @ 2013-06-06 19:53 UTC (permalink / raw)
  To: intel-gfx

From: Chris Wilson <chris@chris-wilson.co.uk>

This is required for tracking render damage for use with FBC and will be
used in subsequent patches.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@gmail.com>
---
 drivers/gpu/drm/i915/i915_gem_execbuffer.c |  2 +-
 drivers/gpu/drm/i915/intel_display.c       | 13 +++++++++----
 drivers/gpu/drm/i915/intel_drv.h           |  3 ++-
 drivers/gpu/drm/i915/intel_ringbuffer.h    |  1 +
 4 files changed, 13 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index a8bb62c..c98333d 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -786,7 +786,7 @@ i915_gem_execbuffer_move_to_active(struct list_head *objects,
 			obj->dirty = 1;
 			obj->last_write_seqno = intel_ring_get_seqno(ring);
 			if (obj->pin_count) /* check for potential scanout */
-				intel_mark_fb_busy(obj);
+				intel_mark_fb_busy(obj, ring);
 		}
 
 		trace_i915_gem_object_change_domain(obj, old_read, old_write);
diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index 40d047e..657f9a1 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -7042,7 +7042,8 @@ void intel_mark_idle(struct drm_device *dev)
 	}
 }
 
-void intel_mark_fb_busy(struct drm_i915_gem_object *obj)
+void intel_mark_fb_busy(struct drm_i915_gem_object *obj,
+			struct intel_ring_buffer *ring)
 {
 	struct drm_device *dev = obj->base.dev;
 	struct drm_crtc *crtc;
@@ -7054,8 +7055,12 @@ void intel_mark_fb_busy(struct drm_i915_gem_object *obj)
 		if (!crtc->fb)
 			continue;
 
-		if (to_intel_framebuffer(crtc->fb)->obj == obj)
-			intel_increase_pllclock(crtc);
+		if (to_intel_framebuffer(crtc->fb)->obj != obj)
+			continue;
+
+		intel_increase_pllclock(crtc);
+		if (ring && intel_fbc_enabled(dev))
+			ring->fbc_dirty = true;
 	}
 }
 
@@ -7505,7 +7510,7 @@ static int intel_crtc_page_flip(struct drm_crtc *crtc,
 		goto cleanup_pending;
 
 	intel_disable_fbc(dev);
-	intel_mark_fb_busy(obj);
+	intel_mark_fb_busy(obj, NULL);
 	mutex_unlock(&dev->struct_mutex);
 
 	trace_i915_flip_request(intel_crtc->plane, obj);
diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h
index eae3dbc..2f87652 100644
--- a/drivers/gpu/drm/i915/intel_drv.h
+++ b/drivers/gpu/drm/i915/intel_drv.h
@@ -563,7 +563,8 @@ extern bool intel_sdvo_init(struct drm_device *dev, uint32_t sdvo_reg,
 extern void intel_dvo_init(struct drm_device *dev);
 extern void intel_tv_init(struct drm_device *dev);
 extern void intel_mark_busy(struct drm_device *dev);
-extern void intel_mark_fb_busy(struct drm_i915_gem_object *obj);
+extern void intel_mark_fb_busy(struct drm_i915_gem_object *obj,
+			       struct intel_ring_buffer *ring);
 extern void intel_mark_idle(struct drm_device *dev);
 extern bool intel_lvds_init(struct drm_device *dev);
 extern bool intel_is_dual_link_lvds(struct drm_device *dev);
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index 4c7e103..efc403d 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -140,6 +140,7 @@ struct  intel_ring_buffer {
 	 */
 	u32 outstanding_lazy_request;
 	bool gpu_caches_dirty;
+	bool fbc_dirty;
 
 	wait_queue_head_t irq_queue;
 
-- 
1.7.11.7

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH 1/2] drm/i915: Track when we dirty the scanout with render commands
  2013-06-06 15:00                   ` Chris Wilson
  2013-06-06 15:16                     ` Rodrigo Vivi
  2013-06-06 19:53                     ` [PATCH 1/2] drm/i915: Track when we dirty the scanout with render commands Rodrigo Vivi
@ 2013-06-06 19:53                     ` Rodrigo Vivi
  2013-06-06 19:53                       ` [PATCH 2/2] drm/i915: WA: FBC Render Nuke Rodrigo Vivi
  2 siblings, 1 reply; 27+ messages in thread
From: Rodrigo Vivi @ 2013-06-06 19:53 UTC (permalink / raw)
  To: intel-gfx

From: Chris Wilson <chris@chris-wilson.co.uk>

This is required for tracking render damage for use with FBC and will be
used in subsequent patches.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@gmail.com>
---
 drivers/gpu/drm/i915/i915_gem_execbuffer.c |  2 +-
 drivers/gpu/drm/i915/intel_display.c       | 13 +++++++++----
 drivers/gpu/drm/i915/intel_drv.h           |  3 ++-
 drivers/gpu/drm/i915/intel_ringbuffer.h    |  1 +
 4 files changed, 13 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index a8bb62c..c98333d 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -786,7 +786,7 @@ i915_gem_execbuffer_move_to_active(struct list_head *objects,
 			obj->dirty = 1;
 			obj->last_write_seqno = intel_ring_get_seqno(ring);
 			if (obj->pin_count) /* check for potential scanout */
-				intel_mark_fb_busy(obj);
+				intel_mark_fb_busy(obj, ring);
 		}
 
 		trace_i915_gem_object_change_domain(obj, old_read, old_write);
diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index 40d047e..657f9a1 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -7042,7 +7042,8 @@ void intel_mark_idle(struct drm_device *dev)
 	}
 }
 
-void intel_mark_fb_busy(struct drm_i915_gem_object *obj)
+void intel_mark_fb_busy(struct drm_i915_gem_object *obj,
+			struct intel_ring_buffer *ring)
 {
 	struct drm_device *dev = obj->base.dev;
 	struct drm_crtc *crtc;
@@ -7054,8 +7055,12 @@ void intel_mark_fb_busy(struct drm_i915_gem_object *obj)
 		if (!crtc->fb)
 			continue;
 
-		if (to_intel_framebuffer(crtc->fb)->obj == obj)
-			intel_increase_pllclock(crtc);
+		if (to_intel_framebuffer(crtc->fb)->obj != obj)
+			continue;
+
+		intel_increase_pllclock(crtc);
+		if (ring && intel_fbc_enabled(dev))
+			ring->fbc_dirty = true;
 	}
 }
 
@@ -7505,7 +7510,7 @@ static int intel_crtc_page_flip(struct drm_crtc *crtc,
 		goto cleanup_pending;
 
 	intel_disable_fbc(dev);
-	intel_mark_fb_busy(obj);
+	intel_mark_fb_busy(obj, NULL);
 	mutex_unlock(&dev->struct_mutex);
 
 	trace_i915_flip_request(intel_crtc->plane, obj);
diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h
index eae3dbc..2f87652 100644
--- a/drivers/gpu/drm/i915/intel_drv.h
+++ b/drivers/gpu/drm/i915/intel_drv.h
@@ -563,7 +563,8 @@ extern bool intel_sdvo_init(struct drm_device *dev, uint32_t sdvo_reg,
 extern void intel_dvo_init(struct drm_device *dev);
 extern void intel_tv_init(struct drm_device *dev);
 extern void intel_mark_busy(struct drm_device *dev);
-extern void intel_mark_fb_busy(struct drm_i915_gem_object *obj);
+extern void intel_mark_fb_busy(struct drm_i915_gem_object *obj,
+			       struct intel_ring_buffer *ring);
 extern void intel_mark_idle(struct drm_device *dev);
 extern bool intel_lvds_init(struct drm_device *dev);
 extern bool intel_is_dual_link_lvds(struct drm_device *dev);
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index 4c7e103..efc403d 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -140,6 +140,7 @@ struct  intel_ring_buffer {
 	 */
 	u32 outstanding_lazy_request;
 	bool gpu_caches_dirty;
+	bool fbc_dirty;
 
 	wait_queue_head_t irq_queue;
 
-- 
1.7.11.7

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH 2/2] drm/i915: WA: FBC Render Nuke.
  2013-06-06 19:53                     ` Rodrigo Vivi
@ 2013-06-06 19:53                       ` Rodrigo Vivi
  2013-06-06 19:58                         ` [PATCH] " Rodrigo Vivi
  0 siblings, 1 reply; 27+ messages in thread
From: Rodrigo Vivi @ 2013-06-06 19:53 UTC (permalink / raw)
  To: intel-gfx

WaFbcNukeOn3DBlt for IVB, HSW.

According BSPec: "Workaround: Do not enable Render Command Streamer tracking for FBC.
Instead insert a LRI to address 0x50380 with data 0x00000004 after the PIPE_CONTROL that
follows each render submission."

v2: Chris noticed that flush_domains check was missing here and also suggested to do
    LRI only when fbc is enabled. To avoid do a I915_READ on every flush lets use the
    module parameter check.

v3: Adding Wa name as Damien suggested.

v4: Ville noticed VLV doesn't support fbc at all and comment came wrong from spec.

v5: Ville noticed than on blt a Cache Clean LRI should be used instead the Nuke one.

v6: Check for flush domain on blt (by Ville).
    Check for scanout dirty (by Chris).

v7: Apply proper fbc_dirty implemented by Chris.

Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@gmail.com>
---
 drivers/gpu/drm/i915/i915_reg.h         |  4 ++++
 drivers/gpu/drm/i915/intel_pm.c         |  2 +-
 drivers/gpu/drm/i915/intel_ringbuffer.c | 31 +++++++++++++++++++++++++++++++
 3 files changed, 36 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index 47a9de0..757bbb7 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -1026,6 +1026,10 @@
 #define IPS_CTL		0x43408
 #define   IPS_ENABLE	(1 << 31)
 
+#define MSG_FBC_REND_STATE	0x50380
+#define   FBC_REND_NUKE		(1<<2)
+#define   FBC_REND_CACHE_CLEAN	(1<<1)
+
 #define _HSW_PIPE_SLICE_CHICKEN_1_A	0x420B0
 #define _HSW_PIPE_SLICE_CHICKEN_1_B	0x420B4
 #define   HSW_BYPASS_FBC_QUEUE		(1<<22)
diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index 50fe3d7..324d1ce 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -274,7 +274,7 @@ static void gen7_enable_fbc(struct drm_crtc *crtc, unsigned long interval)
 	struct drm_i915_gem_object *obj = intel_fb->obj;
 	struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
 
-	I915_WRITE(IVB_FBC_RT_BASE, obj->gtt_offset | ILK_FBC_RT_VALID);
+	I915_WRITE(IVB_FBC_RT_BASE, obj->gtt_offset);
 
 	I915_WRITE(ILK_DPFC_CONTROL, DPFC_CTL_EN | DPFC_CTL_LIMIT_1X |
 		   IVB_DPFC_CTL_FENCE_EN |
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 0e72da6..85f06a8 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -280,6 +280,29 @@ gen7_render_ring_cs_stall_wa(struct intel_ring_buffer *ring)
 	return 0;
 }
 
+static int gen7_ring_fbc_flush(struct intel_ring_buffer *ring, u32 value)
+{
+	struct drm_device *dev = ring->dev;
+	struct drm_i915_gem_object *obj = ring->obj;
+	int ret;
+
+	if (!ring->fbc_dirty)
+		return 0;
+
+	ret = intel_ring_begin(ring, 4);
+	if (ret)
+		return ret;
+	intel_ring_emit(ring, MI_NOOP);
+	/* WaFbcNukeOn3DBlt:ivb/hsw */
+	intel_ring_emit(ring, MI_LOAD_REGISTER_IMM(1));
+	intel_ring_emit(ring, MSG_FBC_REND_STATE);
+	intel_ring_emit(ring, value);
+	intel_ring_advance(ring);
+
+	ring->fbc_dirty = false;
+	return 0;
+}
+
 static int
 gen7_render_ring_flush(struct intel_ring_buffer *ring,
 		       u32 invalidate_domains, u32 flush_domains)
@@ -336,6 +359,9 @@ gen7_render_ring_flush(struct intel_ring_buffer *ring,
 	intel_ring_emit(ring, 0);
 	intel_ring_advance(ring);
 
+	if (flush_domains)
+		return gen7_ring_fbc_flush(ring, FBC_REND_NUKE);
+
 	return 0;
 }
 
@@ -1685,6 +1711,7 @@ gen6_ring_dispatch_execbuffer(struct intel_ring_buffer *ring,
 static int gen6_ring_flush(struct intel_ring_buffer *ring,
 			   u32 invalidate, u32 flush)
 {
+	struct drm_device *dev = ring->dev;
 	uint32_t cmd;
 	int ret;
 
@@ -1707,6 +1734,10 @@ static int gen6_ring_flush(struct intel_ring_buffer *ring,
 	intel_ring_emit(ring, 0);
 	intel_ring_emit(ring, MI_NOOP);
 	intel_ring_advance(ring);
+
+	if (IS_GEN7(dev) && flush)
+		return gen7_ring_fbc_flush(ring, FBC_REND_CACHE_CLEAN);
+
 	return 0;
 }
 
-- 
1.7.11.7

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH] drm/i915: WA: FBC Render Nuke.
  2013-06-06 19:53                       ` [PATCH 2/2] drm/i915: WA: FBC Render Nuke Rodrigo Vivi
@ 2013-06-06 19:58                         ` Rodrigo Vivi
  2013-06-07 15:58                           ` Daniel Vetter
  0 siblings, 1 reply; 27+ messages in thread
From: Rodrigo Vivi @ 2013-06-06 19:58 UTC (permalink / raw)
  To: intel-gfx

WaFbcNukeOn3DBlt for IVB, HSW.

According BSPec: "Workaround: Do not enable Render Command Streamer tracking for FBC.
Instead insert a LRI to address 0x50380 with data 0x00000004 after the PIPE_CONTROL that
follows each render submission."

v2: Chris noticed that flush_domains check was missing here and also suggested to do
    LRI only when fbc is enabled. To avoid do a I915_READ on every flush lets use the
    module parameter check.

v3: Adding Wa name as Damien suggested.

v4: Ville noticed VLV doesn't support fbc at all and comment came wrong from spec.

v5: Ville noticed than on blt a Cache Clean LRI should be used instead the Nuke one.

v6: Check for flush domain on blt (by Ville).
    Check for scanout dirty (by Chris).

v7: Apply proper fbc_dirty implemented by Chris.

v8: remove unused variables.

Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@gmail.com>
---
 drivers/gpu/drm/i915/i915_reg.h         |  4 ++++
 drivers/gpu/drm/i915/intel_pm.c         |  2 +-
 drivers/gpu/drm/i915/intel_ringbuffer.c | 29 +++++++++++++++++++++++++++++
 3 files changed, 34 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index 47a9de0..757bbb7 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -1026,6 +1026,10 @@
 #define IPS_CTL		0x43408
 #define   IPS_ENABLE	(1 << 31)
 
+#define MSG_FBC_REND_STATE	0x50380
+#define   FBC_REND_NUKE		(1<<2)
+#define   FBC_REND_CACHE_CLEAN	(1<<1)
+
 #define _HSW_PIPE_SLICE_CHICKEN_1_A	0x420B0
 #define _HSW_PIPE_SLICE_CHICKEN_1_B	0x420B4
 #define   HSW_BYPASS_FBC_QUEUE		(1<<22)
diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index 50fe3d7..324d1ce 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -274,7 +274,7 @@ static void gen7_enable_fbc(struct drm_crtc *crtc, unsigned long interval)
 	struct drm_i915_gem_object *obj = intel_fb->obj;
 	struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
 
-	I915_WRITE(IVB_FBC_RT_BASE, obj->gtt_offset | ILK_FBC_RT_VALID);
+	I915_WRITE(IVB_FBC_RT_BASE, obj->gtt_offset);
 
 	I915_WRITE(ILK_DPFC_CONTROL, DPFC_CTL_EN | DPFC_CTL_LIMIT_1X |
 		   IVB_DPFC_CTL_FENCE_EN |
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 0e72da6..1ef081c 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -280,6 +280,27 @@ gen7_render_ring_cs_stall_wa(struct intel_ring_buffer *ring)
 	return 0;
 }
 
+static int gen7_ring_fbc_flush(struct intel_ring_buffer *ring, u32 value)
+{
+	int ret;
+
+	if (!ring->fbc_dirty)
+		return 0;
+
+	ret = intel_ring_begin(ring, 4);
+	if (ret)
+		return ret;
+	intel_ring_emit(ring, MI_NOOP);
+	/* WaFbcNukeOn3DBlt:ivb/hsw */
+	intel_ring_emit(ring, MI_LOAD_REGISTER_IMM(1));
+	intel_ring_emit(ring, MSG_FBC_REND_STATE);
+	intel_ring_emit(ring, value);
+	intel_ring_advance(ring);
+
+	ring->fbc_dirty = false;
+	return 0;
+}
+
 static int
 gen7_render_ring_flush(struct intel_ring_buffer *ring,
 		       u32 invalidate_domains, u32 flush_domains)
@@ -336,6 +357,9 @@ gen7_render_ring_flush(struct intel_ring_buffer *ring,
 	intel_ring_emit(ring, 0);
 	intel_ring_advance(ring);
 
+	if (flush_domains)
+		return gen7_ring_fbc_flush(ring, FBC_REND_NUKE);
+
 	return 0;
 }
 
@@ -1685,6 +1709,7 @@ gen6_ring_dispatch_execbuffer(struct intel_ring_buffer *ring,
 static int gen6_ring_flush(struct intel_ring_buffer *ring,
 			   u32 invalidate, u32 flush)
 {
+	struct drm_device *dev = ring->dev;
 	uint32_t cmd;
 	int ret;
 
@@ -1707,6 +1732,10 @@ static int gen6_ring_flush(struct intel_ring_buffer *ring,
 	intel_ring_emit(ring, 0);
 	intel_ring_emit(ring, MI_NOOP);
 	intel_ring_advance(ring);
+
+	if (IS_GEN7(dev) && flush)
+		return gen7_ring_fbc_flush(ring, FBC_REND_CACHE_CLEAN);
+
 	return 0;
 }
 
-- 
1.7.11.7

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* Re: [PATCH] drm/i915: WA: FBC Render Nuke.
  2013-06-06 19:58                         ` [PATCH] " Rodrigo Vivi
@ 2013-06-07 15:58                           ` Daniel Vetter
  0 siblings, 0 replies; 27+ messages in thread
From: Daniel Vetter @ 2013-06-07 15:58 UTC (permalink / raw)
  To: Rodrigo Vivi; +Cc: intel-gfx

On Thu, Jun 06, 2013 at 04:58:16PM -0300, Rodrigo Vivi wrote:
> WaFbcNukeOn3DBlt for IVB, HSW.
> 
> According BSPec: "Workaround: Do not enable Render Command Streamer tracking for FBC.
> Instead insert a LRI to address 0x50380 with data 0x00000004 after the PIPE_CONTROL that
> follows each render submission."
> 
> v2: Chris noticed that flush_domains check was missing here and also suggested to do
>     LRI only when fbc is enabled. To avoid do a I915_READ on every flush lets use the
>     module parameter check.
> 
> v3: Adding Wa name as Damien suggested.
> 
> v4: Ville noticed VLV doesn't support fbc at all and comment came wrong from spec.
> 
> v5: Ville noticed than on blt a Cache Clean LRI should be used instead the Nuke one.
> 
> v6: Check for flush domain on blt (by Ville).
>     Check for scanout dirty (by Chris).
> 
> v7: Apply proper fbc_dirty implemented by Chris.
> 
> v8: remove unused variables.
> 
> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
> Cc: Chris Wilson <chris@chris-wilson.co.uk>
> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@gmail.com>

Both queued for -next, thanks for the patch.
-Daniel

> ---
>  drivers/gpu/drm/i915/i915_reg.h         |  4 ++++
>  drivers/gpu/drm/i915/intel_pm.c         |  2 +-
>  drivers/gpu/drm/i915/intel_ringbuffer.c | 29 +++++++++++++++++++++++++++++
>  3 files changed, 34 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
> index 47a9de0..757bbb7 100644
> --- a/drivers/gpu/drm/i915/i915_reg.h
> +++ b/drivers/gpu/drm/i915/i915_reg.h
> @@ -1026,6 +1026,10 @@
>  #define IPS_CTL		0x43408
>  #define   IPS_ENABLE	(1 << 31)
>  
> +#define MSG_FBC_REND_STATE	0x50380
> +#define   FBC_REND_NUKE		(1<<2)
> +#define   FBC_REND_CACHE_CLEAN	(1<<1)
> +
>  #define _HSW_PIPE_SLICE_CHICKEN_1_A	0x420B0
>  #define _HSW_PIPE_SLICE_CHICKEN_1_B	0x420B4
>  #define   HSW_BYPASS_FBC_QUEUE		(1<<22)
> diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
> index 50fe3d7..324d1ce 100644
> --- a/drivers/gpu/drm/i915/intel_pm.c
> +++ b/drivers/gpu/drm/i915/intel_pm.c
> @@ -274,7 +274,7 @@ static void gen7_enable_fbc(struct drm_crtc *crtc, unsigned long interval)
>  	struct drm_i915_gem_object *obj = intel_fb->obj;
>  	struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
>  
> -	I915_WRITE(IVB_FBC_RT_BASE, obj->gtt_offset | ILK_FBC_RT_VALID);
> +	I915_WRITE(IVB_FBC_RT_BASE, obj->gtt_offset);
>  
>  	I915_WRITE(ILK_DPFC_CONTROL, DPFC_CTL_EN | DPFC_CTL_LIMIT_1X |
>  		   IVB_DPFC_CTL_FENCE_EN |
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
> index 0e72da6..1ef081c 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> @@ -280,6 +280,27 @@ gen7_render_ring_cs_stall_wa(struct intel_ring_buffer *ring)
>  	return 0;
>  }
>  
> +static int gen7_ring_fbc_flush(struct intel_ring_buffer *ring, u32 value)
> +{
> +	int ret;
> +
> +	if (!ring->fbc_dirty)
> +		return 0;
> +
> +	ret = intel_ring_begin(ring, 4);
> +	if (ret)
> +		return ret;
> +	intel_ring_emit(ring, MI_NOOP);
> +	/* WaFbcNukeOn3DBlt:ivb/hsw */
> +	intel_ring_emit(ring, MI_LOAD_REGISTER_IMM(1));
> +	intel_ring_emit(ring, MSG_FBC_REND_STATE);
> +	intel_ring_emit(ring, value);
> +	intel_ring_advance(ring);
> +
> +	ring->fbc_dirty = false;
> +	return 0;
> +}
> +
>  static int
>  gen7_render_ring_flush(struct intel_ring_buffer *ring,
>  		       u32 invalidate_domains, u32 flush_domains)
> @@ -336,6 +357,9 @@ gen7_render_ring_flush(struct intel_ring_buffer *ring,
>  	intel_ring_emit(ring, 0);
>  	intel_ring_advance(ring);
>  
> +	if (flush_domains)
> +		return gen7_ring_fbc_flush(ring, FBC_REND_NUKE);
> +
>  	return 0;
>  }
>  
> @@ -1685,6 +1709,7 @@ gen6_ring_dispatch_execbuffer(struct intel_ring_buffer *ring,
>  static int gen6_ring_flush(struct intel_ring_buffer *ring,
>  			   u32 invalidate, u32 flush)
>  {
> +	struct drm_device *dev = ring->dev;
>  	uint32_t cmd;
>  	int ret;
>  
> @@ -1707,6 +1732,10 @@ static int gen6_ring_flush(struct intel_ring_buffer *ring,
>  	intel_ring_emit(ring, 0);
>  	intel_ring_emit(ring, MI_NOOP);
>  	intel_ring_advance(ring);
> +
> +	if (IS_GEN7(dev) && flush)
> +		return gen7_ring_fbc_flush(ring, FBC_REND_CACHE_CLEAN);
> +
>  	return 0;
>  }
>  
> -- 
> 1.7.11.7
> 
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx

-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH] drm/i915: WA: FBC Render Nuke.
  2013-05-23 18:06   ` Rodrigo Vivi
@ 2013-05-24  8:13     ` Ville Syrjälä
  0 siblings, 0 replies; 27+ messages in thread
From: Ville Syrjälä @ 2013-05-24  8:13 UTC (permalink / raw)
  To: Rodrigo Vivi; +Cc: intel-gfx

On Thu, May 23, 2013 at 03:06:01PM -0300, Rodrigo Vivi wrote:
> Thanks Damien, I'll incorporate the nuke name.
> 
> Ville, I though this SRM was just when adding the one for PSR, but now
> you pointed out I' ve seen it on PM enabling guide. This LRI without
> the SRM by itself already solved rendering problems we were facing.

I think the SRM is there just because the register in question is in the 
display engine, and the docs say that you can only have one outstanding
LRI directed towards the display engine. So SRM must be added after each
LRI to make sure thw write has landed before the next one is issued.

> So, not sure if this is really needed, but I' m going to check that.
> Also, on BSpec it says on every flush, but on PM enabling guide it
> mentioned to do it only when touching front buffer... I tried many
> different ways to do lri only when marked as busy and other things but
> it worked much better when doing lri always as suggested by bspec.

That would seem to indicate that you're not detecting front buffer
rendering correctly...

> I' m afraid this SRM always might not be good for rendering
> performance... as less things better... I'll have to add the other LRI
> for PSR as well. :/
> 
> So, what do you think guys?
> Thanks,
> Rodrigo.
> 
> On Thu, May 23, 2013 at 7:28 AM, Damien Lespiau
> <damien.lespiau@intel.com> wrote:
> > On Wed, May 22, 2013 at 02:23:17PM -0300, Rodrigo Vivi wrote:
> >> According BSPec: "Workaround: Do not enable Render Command Streamer tracking for FBC.
> >> Instead insert a LRI to address 0x50380 with data 0x00000004 after the PIPE_CONTROL that
> >> follows each render submission."
> >>
> >> v2: Chris noticed that flush_domains check was missing here and also suggested to do
> >>     LRI only when fbc is enabled. To avoid do a I915_READ on every flush lets use the
> >>     module parameter check.
> >>
> >> Cc: Chris Wilson <chris@chris-wilson.co.uk>
> >> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@gmail.com>
> >> ---
> >>  drivers/gpu/drm/i915/i915_reg.h         |  2 ++
> >>  drivers/gpu/drm/i915/intel_pm.c         |  2 +-
> >>  drivers/gpu/drm/i915/intel_ringbuffer.c | 32 ++++++++++++++++++++++++++++++++
> >>  3 files changed, 35 insertions(+), 1 deletion(-)
> >>
> >> diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
> >> index cc4c223..81ac584 100644
> >> --- a/drivers/gpu/drm/i915/i915_reg.h
> >> +++ b/drivers/gpu/drm/i915/i915_reg.h
> >> @@ -977,6 +977,8 @@
> >>  /* Framebuffer compression for Ivybridge */
> >>  #define IVB_FBC_RT_BASE                      0x7020
> >>
> >> +#define MSG_FBC_REND_STATE   0x50380
> >> +#define   FBC_REND_NUKE              (1<<2)
> >>
> >>  #define _HSW_PIPE_SLICE_CHICKEN_1_A  0x420B0
> >>  #define _HSW_PIPE_SLICE_CHICKEN_1_B  0x420B4
> >> diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
> >> index 1879188..e830a9b 100644
> >> --- a/drivers/gpu/drm/i915/intel_pm.c
> >> +++ b/drivers/gpu/drm/i915/intel_pm.c
> >> @@ -274,7 +274,7 @@ static void gen7_enable_fbc(struct drm_crtc *crtc, unsigned long interval)
> >>       struct drm_i915_gem_object *obj = intel_fb->obj;
> >>       struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
> >>
> >> -     I915_WRITE(IVB_FBC_RT_BASE, obj->gtt_offset | ILK_FBC_RT_VALID);
> >> +     I915_WRITE(IVB_FBC_RT_BASE, obj->gtt_offset);
> >>
> >>       if (!intel_edp_is_psr_enabled(dev))
> >>               I915_WRITE(ILK_DPFC_CONTROL, DPFC_CTL_EN | DPFC_CTL_LIMIT_1X |
> >> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
> >> index 3d2c236..905d372 100644
> >> --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> >> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> >> @@ -280,6 +280,30 @@ gen7_render_ring_cs_stall_wa(struct intel_ring_buffer *ring)
> >>       return 0;
> >>  }
> >>
> >> +static int gen7_ring_fbc_flush(struct intel_ring_buffer *ring)
> >> +{
> >> +     struct drm_device *dev = ring->dev;
> >> +     int ret;
> >> +
> >> +     if (i915_enable_fbc == 0)
> >> +             return 0;
> >> +
> >> +     if (i915_enable_fbc < 0 && !IS_HASWELL(dev))
> >> +             return 0;
> >> +
> >> +     ret = intel_ring_begin(ring, 4);
> >> +     if (ret)
> >> +             return ret;
> >> +
> >> +     intel_ring_emit(ring, MI_NOOP);
> >> +     intel_ring_emit(ring, MI_LOAD_REGISTER_IMM(1));
> >> +     intel_ring_emit(ring, MSG_FBC_REND_STATE);
> >> +     intel_ring_emit(ring, FBC_REND_NUKE);
> >> +     intel_ring_advance(ring);
> >> +
> >> +     return 0;
> >> +}
> >> +
> >
> > Can you document you're implementing WaFbcNukeOn3DBlt for ivb/hsw/vlv
> > here using the newly introduced w/a tags?
> >
> >>  static int
> >>  gen7_render_ring_flush(struct intel_ring_buffer *ring,
> >>                      u32 invalidate_domains, u32 flush_domains)
> >> @@ -336,6 +360,9 @@ gen7_render_ring_flush(struct intel_ring_buffer *ring,
> >>       intel_ring_emit(ring, 0);
> >>       intel_ring_advance(ring);
> >>
> >> +     if (flush_domains)
> >> +             return gen7_ring_fbc_flush(ring);
> >> +
> >>       return 0;
> >>  }
> >>
> >> @@ -1623,6 +1650,7 @@ gen6_ring_dispatch_execbuffer(struct intel_ring_buffer *ring,
> >>  static int blt_ring_flush(struct intel_ring_buffer *ring,
> >>                         u32 invalidate, u32 flush)
> >>  {
> >> +     struct drm_device *dev = ring->dev;
> >>       uint32_t cmd;
> >>       int ret;
> >>
> >> @@ -1645,6 +1673,10 @@ static int blt_ring_flush(struct intel_ring_buffer *ring,
> >>       intel_ring_emit(ring, 0);
> >>       intel_ring_emit(ring, MI_NOOP);
> >>       intel_ring_advance(ring);
> >> +
> >> +     if (IS_GEN7(dev))
> >> +             return gen7_ring_fbc_flush(ring);
> >> +
> >>       return 0;
> >>  }
> >>
> >> --
> >> 1.8.1.4
> >>
> >> _______________________________________________
> >> Intel-gfx mailing list
> >> Intel-gfx@lists.freedesktop.org
> >> http://lists.freedesktop.org/mailman/listinfo/intel-gfx
> 
> 
> 
> -- 
> Rodrigo Vivi
> Blog: http://blog.vivi.eng.br
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx

-- 
Ville Syrjälä
Intel OTC

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH] drm/i915: WA: FBC Render Nuke.
  2013-05-23 10:28 ` Damien Lespiau
@ 2013-05-23 18:06   ` Rodrigo Vivi
  2013-05-24  8:13     ` Ville Syrjälä
  0 siblings, 1 reply; 27+ messages in thread
From: Rodrigo Vivi @ 2013-05-23 18:06 UTC (permalink / raw)
  To: Damien Lespiau; +Cc: intel-gfx

Thanks Damien, I'll incorporate the nuke name.

Ville, I though this SRM was just when adding the one for PSR, but now
you pointed out I' ve seen it on PM enabling guide. This LRI without
the SRM by itself already solved rendering problems we were facing.
So, not sure if this is really needed, but I' m going to check that.
Also, on BSpec it says on every flush, but on PM enabling guide it
mentioned to do it only when touching front buffer... I tried many
different ways to do lri only when marked as busy and other things but
it worked much better when doing lri always as suggested by bspec.
I' m afraid this SRM always might not be good for rendering
performance... as less things better... I'll have to add the other LRI
for PSR as well. :/

So, what do you think guys?
Thanks,
Rodrigo.

On Thu, May 23, 2013 at 7:28 AM, Damien Lespiau
<damien.lespiau@intel.com> wrote:
> On Wed, May 22, 2013 at 02:23:17PM -0300, Rodrigo Vivi wrote:
>> According BSPec: "Workaround: Do not enable Render Command Streamer tracking for FBC.
>> Instead insert a LRI to address 0x50380 with data 0x00000004 after the PIPE_CONTROL that
>> follows each render submission."
>>
>> v2: Chris noticed that flush_domains check was missing here and also suggested to do
>>     LRI only when fbc is enabled. To avoid do a I915_READ on every flush lets use the
>>     module parameter check.
>>
>> Cc: Chris Wilson <chris@chris-wilson.co.uk>
>> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@gmail.com>
>> ---
>>  drivers/gpu/drm/i915/i915_reg.h         |  2 ++
>>  drivers/gpu/drm/i915/intel_pm.c         |  2 +-
>>  drivers/gpu/drm/i915/intel_ringbuffer.c | 32 ++++++++++++++++++++++++++++++++
>>  3 files changed, 35 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
>> index cc4c223..81ac584 100644
>> --- a/drivers/gpu/drm/i915/i915_reg.h
>> +++ b/drivers/gpu/drm/i915/i915_reg.h
>> @@ -977,6 +977,8 @@
>>  /* Framebuffer compression for Ivybridge */
>>  #define IVB_FBC_RT_BASE                      0x7020
>>
>> +#define MSG_FBC_REND_STATE   0x50380
>> +#define   FBC_REND_NUKE              (1<<2)
>>
>>  #define _HSW_PIPE_SLICE_CHICKEN_1_A  0x420B0
>>  #define _HSW_PIPE_SLICE_CHICKEN_1_B  0x420B4
>> diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
>> index 1879188..e830a9b 100644
>> --- a/drivers/gpu/drm/i915/intel_pm.c
>> +++ b/drivers/gpu/drm/i915/intel_pm.c
>> @@ -274,7 +274,7 @@ static void gen7_enable_fbc(struct drm_crtc *crtc, unsigned long interval)
>>       struct drm_i915_gem_object *obj = intel_fb->obj;
>>       struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
>>
>> -     I915_WRITE(IVB_FBC_RT_BASE, obj->gtt_offset | ILK_FBC_RT_VALID);
>> +     I915_WRITE(IVB_FBC_RT_BASE, obj->gtt_offset);
>>
>>       if (!intel_edp_is_psr_enabled(dev))
>>               I915_WRITE(ILK_DPFC_CONTROL, DPFC_CTL_EN | DPFC_CTL_LIMIT_1X |
>> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
>> index 3d2c236..905d372 100644
>> --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
>> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
>> @@ -280,6 +280,30 @@ gen7_render_ring_cs_stall_wa(struct intel_ring_buffer *ring)
>>       return 0;
>>  }
>>
>> +static int gen7_ring_fbc_flush(struct intel_ring_buffer *ring)
>> +{
>> +     struct drm_device *dev = ring->dev;
>> +     int ret;
>> +
>> +     if (i915_enable_fbc == 0)
>> +             return 0;
>> +
>> +     if (i915_enable_fbc < 0 && !IS_HASWELL(dev))
>> +             return 0;
>> +
>> +     ret = intel_ring_begin(ring, 4);
>> +     if (ret)
>> +             return ret;
>> +
>> +     intel_ring_emit(ring, MI_NOOP);
>> +     intel_ring_emit(ring, MI_LOAD_REGISTER_IMM(1));
>> +     intel_ring_emit(ring, MSG_FBC_REND_STATE);
>> +     intel_ring_emit(ring, FBC_REND_NUKE);
>> +     intel_ring_advance(ring);
>> +
>> +     return 0;
>> +}
>> +
>
> Can you document you're implementing WaFbcNukeOn3DBlt for ivb/hsw/vlv
> here using the newly introduced w/a tags?
>
>>  static int
>>  gen7_render_ring_flush(struct intel_ring_buffer *ring,
>>                      u32 invalidate_domains, u32 flush_domains)
>> @@ -336,6 +360,9 @@ gen7_render_ring_flush(struct intel_ring_buffer *ring,
>>       intel_ring_emit(ring, 0);
>>       intel_ring_advance(ring);
>>
>> +     if (flush_domains)
>> +             return gen7_ring_fbc_flush(ring);
>> +
>>       return 0;
>>  }
>>
>> @@ -1623,6 +1650,7 @@ gen6_ring_dispatch_execbuffer(struct intel_ring_buffer *ring,
>>  static int blt_ring_flush(struct intel_ring_buffer *ring,
>>                         u32 invalidate, u32 flush)
>>  {
>> +     struct drm_device *dev = ring->dev;
>>       uint32_t cmd;
>>       int ret;
>>
>> @@ -1645,6 +1673,10 @@ static int blt_ring_flush(struct intel_ring_buffer *ring,
>>       intel_ring_emit(ring, 0);
>>       intel_ring_emit(ring, MI_NOOP);
>>       intel_ring_advance(ring);
>> +
>> +     if (IS_GEN7(dev))
>> +             return gen7_ring_fbc_flush(ring);
>> +
>>       return 0;
>>  }
>>
>> --
>> 1.8.1.4
>>
>> _______________________________________________
>> Intel-gfx mailing list
>> Intel-gfx@lists.freedesktop.org
>> http://lists.freedesktop.org/mailman/listinfo/intel-gfx



-- 
Rodrigo Vivi
Blog: http://blog.vivi.eng.br

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH] drm/i915: WA: FBC Render Nuke.
  2013-05-22 17:23 Rodrigo Vivi
  2013-05-23  9:48 ` Ville Syrjälä
@ 2013-05-23 10:28 ` Damien Lespiau
  2013-05-23 18:06   ` Rodrigo Vivi
  1 sibling, 1 reply; 27+ messages in thread
From: Damien Lespiau @ 2013-05-23 10:28 UTC (permalink / raw)
  To: Rodrigo Vivi; +Cc: intel-gfx

On Wed, May 22, 2013 at 02:23:17PM -0300, Rodrigo Vivi wrote:
> According BSPec: "Workaround: Do not enable Render Command Streamer tracking for FBC.
> Instead insert a LRI to address 0x50380 with data 0x00000004 after the PIPE_CONTROL that
> follows each render submission."
> 
> v2: Chris noticed that flush_domains check was missing here and also suggested to do
>     LRI only when fbc is enabled. To avoid do a I915_READ on every flush lets use the
>     module parameter check.
> 
> Cc: Chris Wilson <chris@chris-wilson.co.uk>
> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@gmail.com>
> ---
>  drivers/gpu/drm/i915/i915_reg.h         |  2 ++
>  drivers/gpu/drm/i915/intel_pm.c         |  2 +-
>  drivers/gpu/drm/i915/intel_ringbuffer.c | 32 ++++++++++++++++++++++++++++++++
>  3 files changed, 35 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
> index cc4c223..81ac584 100644
> --- a/drivers/gpu/drm/i915/i915_reg.h
> +++ b/drivers/gpu/drm/i915/i915_reg.h
> @@ -977,6 +977,8 @@
>  /* Framebuffer compression for Ivybridge */
>  #define IVB_FBC_RT_BASE			0x7020
>  
> +#define MSG_FBC_REND_STATE	0x50380
> +#define   FBC_REND_NUKE		(1<<2)
>  
>  #define _HSW_PIPE_SLICE_CHICKEN_1_A	0x420B0
>  #define _HSW_PIPE_SLICE_CHICKEN_1_B	0x420B4
> diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
> index 1879188..e830a9b 100644
> --- a/drivers/gpu/drm/i915/intel_pm.c
> +++ b/drivers/gpu/drm/i915/intel_pm.c
> @@ -274,7 +274,7 @@ static void gen7_enable_fbc(struct drm_crtc *crtc, unsigned long interval)
>  	struct drm_i915_gem_object *obj = intel_fb->obj;
>  	struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
>  
> -	I915_WRITE(IVB_FBC_RT_BASE, obj->gtt_offset | ILK_FBC_RT_VALID);
> +	I915_WRITE(IVB_FBC_RT_BASE, obj->gtt_offset);
>  
>  	if (!intel_edp_is_psr_enabled(dev))
>  		I915_WRITE(ILK_DPFC_CONTROL, DPFC_CTL_EN | DPFC_CTL_LIMIT_1X |
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
> index 3d2c236..905d372 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> @@ -280,6 +280,30 @@ gen7_render_ring_cs_stall_wa(struct intel_ring_buffer *ring)
>  	return 0;
>  }
>  
> +static int gen7_ring_fbc_flush(struct intel_ring_buffer *ring)
> +{
> +	struct drm_device *dev = ring->dev;
> +	int ret;
> +
> +	if (i915_enable_fbc == 0)
> +		return 0;
> +
> +	if (i915_enable_fbc < 0 && !IS_HASWELL(dev))
> +		return 0;
> +
> +	ret = intel_ring_begin(ring, 4);
> +	if (ret)
> +		return ret;
> +
> +	intel_ring_emit(ring, MI_NOOP);
> +	intel_ring_emit(ring, MI_LOAD_REGISTER_IMM(1));
> +	intel_ring_emit(ring, MSG_FBC_REND_STATE);
> +	intel_ring_emit(ring, FBC_REND_NUKE);
> +	intel_ring_advance(ring);
> +
> +	return 0;
> +}
> +

Can you document you're implementing WaFbcNukeOn3DBlt for ivb/hsw/vlv
here using the newly introduced w/a tags?

>  static int
>  gen7_render_ring_flush(struct intel_ring_buffer *ring,
>  		       u32 invalidate_domains, u32 flush_domains)
> @@ -336,6 +360,9 @@ gen7_render_ring_flush(struct intel_ring_buffer *ring,
>  	intel_ring_emit(ring, 0);
>  	intel_ring_advance(ring);
>  
> +	if (flush_domains)
> +		return gen7_ring_fbc_flush(ring);
> +
>  	return 0;
>  }
>  
> @@ -1623,6 +1650,7 @@ gen6_ring_dispatch_execbuffer(struct intel_ring_buffer *ring,
>  static int blt_ring_flush(struct intel_ring_buffer *ring,
>  			  u32 invalidate, u32 flush)
>  {
> +	struct drm_device *dev = ring->dev;
>  	uint32_t cmd;
>  	int ret;
>  
> @@ -1645,6 +1673,10 @@ static int blt_ring_flush(struct intel_ring_buffer *ring,
>  	intel_ring_emit(ring, 0);
>  	intel_ring_emit(ring, MI_NOOP);
>  	intel_ring_advance(ring);
> +
> +	if (IS_GEN7(dev))
> +		return gen7_ring_fbc_flush(ring);
> +
>  	return 0;
>  }
>  
> -- 
> 1.8.1.4
> 
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH] drm/i915: WA: FBC Render Nuke.
  2013-05-22 17:23 Rodrigo Vivi
@ 2013-05-23  9:48 ` Ville Syrjälä
  2013-05-23 10:28 ` Damien Lespiau
  1 sibling, 0 replies; 27+ messages in thread
From: Ville Syrjälä @ 2013-05-23  9:48 UTC (permalink / raw)
  To: Rodrigo Vivi; +Cc: intel-gfx

On Wed, May 22, 2013 at 02:23:17PM -0300, Rodrigo Vivi wrote:
> According BSPec: "Workaround: Do not enable Render Command Streamer tracking for FBC.
> Instead insert a LRI to address 0x50380 with data 0x00000004 after the PIPE_CONTROL that
> follows each render submission."
> 
> v2: Chris noticed that flush_domains check was missing here and also suggested to do
>     LRI only when fbc is enabled. To avoid do a I915_READ on every flush lets use the
>     module parameter check.
> 
> Cc: Chris Wilson <chris@chris-wilson.co.uk>
> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@gmail.com>
> ---
>  drivers/gpu/drm/i915/i915_reg.h         |  2 ++
>  drivers/gpu/drm/i915/intel_pm.c         |  2 +-
>  drivers/gpu/drm/i915/intel_ringbuffer.c | 32 ++++++++++++++++++++++++++++++++
>  3 files changed, 35 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
> index cc4c223..81ac584 100644
> --- a/drivers/gpu/drm/i915/i915_reg.h
> +++ b/drivers/gpu/drm/i915/i915_reg.h
> @@ -977,6 +977,8 @@
>  /* Framebuffer compression for Ivybridge */
>  #define IVB_FBC_RT_BASE			0x7020
>  
> +#define MSG_FBC_REND_STATE	0x50380
> +#define   FBC_REND_NUKE		(1<<2)
>  
>  #define _HSW_PIPE_SLICE_CHICKEN_1_A	0x420B0
>  #define _HSW_PIPE_SLICE_CHICKEN_1_B	0x420B4
> diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
> index 1879188..e830a9b 100644
> --- a/drivers/gpu/drm/i915/intel_pm.c
> +++ b/drivers/gpu/drm/i915/intel_pm.c
> @@ -274,7 +274,7 @@ static void gen7_enable_fbc(struct drm_crtc *crtc, unsigned long interval)
>  	struct drm_i915_gem_object *obj = intel_fb->obj;
>  	struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
>  
> -	I915_WRITE(IVB_FBC_RT_BASE, obj->gtt_offset | ILK_FBC_RT_VALID);
> +	I915_WRITE(IVB_FBC_RT_BASE, obj->gtt_offset);
>  
>  	if (!intel_edp_is_psr_enabled(dev))
>  		I915_WRITE(ILK_DPFC_CONTROL, DPFC_CTL_EN | DPFC_CTL_LIMIT_1X |
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
> index 3d2c236..905d372 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> @@ -280,6 +280,30 @@ gen7_render_ring_cs_stall_wa(struct intel_ring_buffer *ring)
>  	return 0;
>  }
>  
> +static int gen7_ring_fbc_flush(struct intel_ring_buffer *ring)
> +{
> +	struct drm_device *dev = ring->dev;
> +	int ret;
> +
> +	if (i915_enable_fbc == 0)
> +		return 0;
> +
> +	if (i915_enable_fbc < 0 && !IS_HASWELL(dev))
> +		return 0;
> +
> +	ret = intel_ring_begin(ring, 4);
> +	if (ret)
> +		return ret;
> +
> +	intel_ring_emit(ring, MI_NOOP);
> +	intel_ring_emit(ring, MI_LOAD_REGISTER_IMM(1));
> +	intel_ring_emit(ring, MSG_FBC_REND_STATE);
> +	intel_ring_emit(ring, FBC_REND_NUKE);
> +	intel_ring_advance(ring);

According to the PM enabling guide you should do a SRM from the same
register after the LRI. Is that not required?

> +
> +	return 0;
> +}
> +
>  static int
>  gen7_render_ring_flush(struct intel_ring_buffer *ring,
>  		       u32 invalidate_domains, u32 flush_domains)
> @@ -336,6 +360,9 @@ gen7_render_ring_flush(struct intel_ring_buffer *ring,
>  	intel_ring_emit(ring, 0);
>  	intel_ring_advance(ring);
>  
> +	if (flush_domains)
> +		return gen7_ring_fbc_flush(ring);
> +
>  	return 0;
>  }
>  
> @@ -1623,6 +1650,7 @@ gen6_ring_dispatch_execbuffer(struct intel_ring_buffer *ring,
>  static int blt_ring_flush(struct intel_ring_buffer *ring,
>  			  u32 invalidate, u32 flush)
>  {
> +	struct drm_device *dev = ring->dev;
>  	uint32_t cmd;
>  	int ret;
>  
> @@ -1645,6 +1673,10 @@ static int blt_ring_flush(struct intel_ring_buffer *ring,
>  	intel_ring_emit(ring, 0);
>  	intel_ring_emit(ring, MI_NOOP);
>  	intel_ring_advance(ring);
> +
> +	if (IS_GEN7(dev))
> +		return gen7_ring_fbc_flush(ring);
> +
>  	return 0;
>  }
>  
> -- 
> 1.8.1.4
> 
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx

-- 
Ville Syrjälä
Intel OTC

^ permalink raw reply	[flat|nested] 27+ messages in thread

* [PATCH] drm/i915: WA: FBC Render Nuke.
@ 2013-05-22 17:23 Rodrigo Vivi
  2013-05-23  9:48 ` Ville Syrjälä
  2013-05-23 10:28 ` Damien Lespiau
  0 siblings, 2 replies; 27+ messages in thread
From: Rodrigo Vivi @ 2013-05-22 17:23 UTC (permalink / raw)
  To: intel-gfx

According BSPec: "Workaround: Do not enable Render Command Streamer tracking for FBC.
Instead insert a LRI to address 0x50380 with data 0x00000004 after the PIPE_CONTROL that
follows each render submission."

v2: Chris noticed that flush_domains check was missing here and also suggested to do
    LRI only when fbc is enabled. To avoid do a I915_READ on every flush lets use the
    module parameter check.

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@gmail.com>
---
 drivers/gpu/drm/i915/i915_reg.h         |  2 ++
 drivers/gpu/drm/i915/intel_pm.c         |  2 +-
 drivers/gpu/drm/i915/intel_ringbuffer.c | 32 ++++++++++++++++++++++++++++++++
 3 files changed, 35 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index cc4c223..81ac584 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -977,6 +977,8 @@
 /* Framebuffer compression for Ivybridge */
 #define IVB_FBC_RT_BASE			0x7020
 
+#define MSG_FBC_REND_STATE	0x50380
+#define   FBC_REND_NUKE		(1<<2)
 
 #define _HSW_PIPE_SLICE_CHICKEN_1_A	0x420B0
 #define _HSW_PIPE_SLICE_CHICKEN_1_B	0x420B4
diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index 1879188..e830a9b 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -274,7 +274,7 @@ static void gen7_enable_fbc(struct drm_crtc *crtc, unsigned long interval)
 	struct drm_i915_gem_object *obj = intel_fb->obj;
 	struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
 
-	I915_WRITE(IVB_FBC_RT_BASE, obj->gtt_offset | ILK_FBC_RT_VALID);
+	I915_WRITE(IVB_FBC_RT_BASE, obj->gtt_offset);
 
 	if (!intel_edp_is_psr_enabled(dev))
 		I915_WRITE(ILK_DPFC_CONTROL, DPFC_CTL_EN | DPFC_CTL_LIMIT_1X |
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 3d2c236..905d372 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -280,6 +280,30 @@ gen7_render_ring_cs_stall_wa(struct intel_ring_buffer *ring)
 	return 0;
 }
 
+static int gen7_ring_fbc_flush(struct intel_ring_buffer *ring)
+{
+	struct drm_device *dev = ring->dev;
+	int ret;
+
+	if (i915_enable_fbc == 0)
+		return 0;
+
+	if (i915_enable_fbc < 0 && !IS_HASWELL(dev))
+		return 0;
+
+	ret = intel_ring_begin(ring, 4);
+	if (ret)
+		return ret;
+
+	intel_ring_emit(ring, MI_NOOP);
+	intel_ring_emit(ring, MI_LOAD_REGISTER_IMM(1));
+	intel_ring_emit(ring, MSG_FBC_REND_STATE);
+	intel_ring_emit(ring, FBC_REND_NUKE);
+	intel_ring_advance(ring);
+
+	return 0;
+}
+
 static int
 gen7_render_ring_flush(struct intel_ring_buffer *ring,
 		       u32 invalidate_domains, u32 flush_domains)
@@ -336,6 +360,9 @@ gen7_render_ring_flush(struct intel_ring_buffer *ring,
 	intel_ring_emit(ring, 0);
 	intel_ring_advance(ring);
 
+	if (flush_domains)
+		return gen7_ring_fbc_flush(ring);
+
 	return 0;
 }
 
@@ -1623,6 +1650,7 @@ gen6_ring_dispatch_execbuffer(struct intel_ring_buffer *ring,
 static int blt_ring_flush(struct intel_ring_buffer *ring,
 			  u32 invalidate, u32 flush)
 {
+	struct drm_device *dev = ring->dev;
 	uint32_t cmd;
 	int ret;
 
@@ -1645,6 +1673,10 @@ static int blt_ring_flush(struct intel_ring_buffer *ring,
 	intel_ring_emit(ring, 0);
 	intel_ring_emit(ring, MI_NOOP);
 	intel_ring_advance(ring);
+
+	if (IS_GEN7(dev))
+		return gen7_ring_fbc_flush(ring);
+
 	return 0;
 }
 
-- 
1.8.1.4

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* Re: [PATCH] drm/i915: WA: FBC Render Nuke.
  2013-05-22 10:57 Rodrigo Vivi
@ 2013-05-22 14:21 ` Chris Wilson
  0 siblings, 0 replies; 27+ messages in thread
From: Chris Wilson @ 2013-05-22 14:21 UTC (permalink / raw)
  To: Rodrigo Vivi; +Cc: intel-gfx

On Wed, May 22, 2013 at 07:57:59AM -0300, Rodrigo Vivi wrote:
> According BSPec: "Workaround: Do not enable Render Command Streamer tracking for FBC.
> Instead insert a LRI to address 0x50380 with data 0x00000004 after the PIPE_CONTROL that
> follows each render submission."

Missing check for flush_domains.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

^ permalink raw reply	[flat|nested] 27+ messages in thread

* [PATCH] drm/i915: WA: FBC Render Nuke.
@ 2013-05-22 10:57 Rodrigo Vivi
  2013-05-22 14:21 ` Chris Wilson
  0 siblings, 1 reply; 27+ messages in thread
From: Rodrigo Vivi @ 2013-05-22 10:57 UTC (permalink / raw)
  To: intel-gfx

According BSPec: "Workaround: Do not enable Render Command Streamer tracking for FBC.
Instead insert a LRI to address 0x50380 with data 0x00000004 after the PIPE_CONTROL that
follows each render submission."

Signed-off-by: Rodrigo Vivi <rodrigo.vivi@gmail.com>
---
 drivers/gpu/drm/i915/i915_reg.h         |  2 ++
 drivers/gpu/drm/i915/intel_pm.c         |  2 +-
 drivers/gpu/drm/i915/intel_ringbuffer.c | 24 +++++++++++++++++++++++-
 3 files changed, 26 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index cc4c223..81ac584 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -977,6 +977,8 @@
 /* Framebuffer compression for Ivybridge */
 #define IVB_FBC_RT_BASE			0x7020
 
+#define MSG_FBC_REND_STATE	0x50380
+#define   FBC_REND_NUKE		(1<<2)
 
 #define _HSW_PIPE_SLICE_CHICKEN_1_A	0x420B0
 #define _HSW_PIPE_SLICE_CHICKEN_1_B	0x420B4
diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index 1879188..e830a9b 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -274,7 +274,7 @@ static void gen7_enable_fbc(struct drm_crtc *crtc, unsigned long interval)
 	struct drm_i915_gem_object *obj = intel_fb->obj;
 	struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
 
-	I915_WRITE(IVB_FBC_RT_BASE, obj->gtt_offset | ILK_FBC_RT_VALID);
+	I915_WRITE(IVB_FBC_RT_BASE, obj->gtt_offset);
 
 	if (!intel_edp_is_psr_enabled(dev))
 		I915_WRITE(ILK_DPFC_CONTROL, DPFC_CTL_EN | DPFC_CTL_LIMIT_1X |
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 3d2c236..8294d4b 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -280,6 +280,23 @@ gen7_render_ring_cs_stall_wa(struct intel_ring_buffer *ring)
 	return 0;
 }
 
+static int gen7_ring_fbc_flush(struct intel_ring_buffer *ring)
+{
+	int ret;
+
+	ret = intel_ring_begin(ring, 4);
+	if (ret)
+		return ret;
+
+	intel_ring_emit(ring, MI_NOOP);
+	intel_ring_emit(ring, MI_LOAD_REGISTER_IMM(1));
+	intel_ring_emit(ring, MSG_FBC_REND_STATE);
+	intel_ring_emit(ring, FBC_REND_NUKE);
+	intel_ring_advance(ring);
+
+	return 0;
+}
+
 static int
 gen7_render_ring_flush(struct intel_ring_buffer *ring,
 		       u32 invalidate_domains, u32 flush_domains)
@@ -336,7 +353,7 @@ gen7_render_ring_flush(struct intel_ring_buffer *ring,
 	intel_ring_emit(ring, 0);
 	intel_ring_advance(ring);
 
-	return 0;
+	return gen7_ring_fbc_flush(ring);
 }
 
 static void ring_write_tail(struct intel_ring_buffer *ring,
@@ -1623,6 +1640,7 @@ gen6_ring_dispatch_execbuffer(struct intel_ring_buffer *ring,
 static int blt_ring_flush(struct intel_ring_buffer *ring,
 			  u32 invalidate, u32 flush)
 {
+	struct drm_device *dev = ring->dev;
 	uint32_t cmd;
 	int ret;
 
@@ -1645,6 +1663,10 @@ static int blt_ring_flush(struct intel_ring_buffer *ring,
 	intel_ring_emit(ring, 0);
 	intel_ring_emit(ring, MI_NOOP);
 	intel_ring_advance(ring);
+
+	if (IS_GEN7(dev))
+		return gen7_ring_fbc_flush(ring);
+
 	return 0;
 }
 
-- 
1.8.1.4

^ permalink raw reply related	[flat|nested] 27+ messages in thread

end of thread, other threads:[~2013-06-07 15:58 UTC | newest]

Thread overview: 27+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-05-29  0:25 [PATCH] drm/i915: WA: FBC Render Nuke Rodrigo Vivi
2013-05-29  0:36 ` Rodrigo Vivi
2013-05-31 15:59 ` Ville Syrjälä
2013-05-31 20:15   ` Rodrigo Vivi
2013-06-03 11:34     ` Ville Syrjälä
2013-06-03 16:50       ` Rodrigo Vivi
2013-06-03 17:03         ` Ville Syrjälä
2013-06-03 18:41           ` Rodrigo Vivi
2013-06-04  7:06             ` Ville Syrjälä
2013-06-04  8:09               ` Chris Wilson
2013-06-06 14:49                 ` Rodrigo Vivi
2013-06-06 15:00                   ` Chris Wilson
2013-06-06 15:16                     ` Rodrigo Vivi
2013-06-06 19:53                     ` [PATCH 1/2] drm/i915: Track when we dirty the scanout with render commands Rodrigo Vivi
2013-06-06 19:53                     ` Rodrigo Vivi
2013-06-06 19:53                       ` [PATCH 2/2] drm/i915: WA: FBC Render Nuke Rodrigo Vivi
2013-06-06 19:58                         ` [PATCH] " Rodrigo Vivi
2013-06-07 15:58                           ` Daniel Vetter
2013-06-03 17:08         ` Rodrigo Vivi
2013-06-03 17:10           ` Rodrigo Vivi
  -- strict thread matches above, loose matches on Subject: below --
2013-05-22 17:23 Rodrigo Vivi
2013-05-23  9:48 ` Ville Syrjälä
2013-05-23 10:28 ` Damien Lespiau
2013-05-23 18:06   ` Rodrigo Vivi
2013-05-24  8:13     ` Ville Syrjälä
2013-05-22 10:57 Rodrigo Vivi
2013-05-22 14:21 ` Chris Wilson

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.