* [PATCH 1/2] drm/i915: Only flush fbc on sw when fbc is enabled.
@ 2014-09-05 20:57 Rodrigo Vivi
  2014-09-05 20:57 ` [PATCH 2/2] drm/i915: Minimize the huge amount of unecessary fbc sw cache clean Rodrigo Vivi
  2014-09-17 19:55 ` [PATCH 1/2] drm/i915: Only flush fbc on sw when fbc is enabled Paulo Zanoni
  0 siblings, 2 replies; 13+ messages in thread
From: Rodrigo Vivi @ 2014-09-05 20:57 UTC (permalink / raw)
  To: intel-gfx; +Cc: Paulo Zanoni, Rodrigo Vivi

Avoid touching the FBC register when FBC is disabled.

Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
---
 drivers/gpu/drm/i915/intel_pm.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index 45f71e6..c3d3439 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -352,6 +352,9 @@ void gen8_fbc_sw_flush(struct drm_device *dev, u32 value)
 	if (!IS_GEN8(dev))
 		return;
 
+	if (!intel_fbc_enabled(dev))
+		return;
+
 	I915_WRITE(MSG_FBC_REND_STATE, value);
 }
 
-- 
1.9.3

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 2/2] drm/i915: Minimize the huge amount of unecessary fbc sw cache clean.
  2014-09-05 20:57 [PATCH 1/2] drm/i915: Only flush fbc on sw when fbc is enabled Rodrigo Vivi
@ 2014-09-05 20:57 ` Rodrigo Vivi
  2014-09-05 22:14   ` Rodrigo Vivi
  2014-09-17 20:49   ` Paulo Zanoni
  2014-09-17 19:55 ` [PATCH 1/2] drm/i915: Only flush fbc on sw when fbc is enabled Paulo Zanoni
  1 sibling, 2 replies; 13+ messages in thread
From: Rodrigo Vivi @ 2014-09-05 20:57 UTC (permalink / raw)
  To: intel-gfx; +Cc: Paulo Zanoni, Rodrigo Vivi

The sw cache clean on gen8 is a temporary workaround because we cannot
set the cache clean on the blt ring without risking hangs, so we are doing the
cache clean in sw. However, we are doing it much more often than needed, not
only when using the blt ring. So, with this extra w/a we minimize the amount
of cache cleans and call it only in the same cases where it was being called
on gen7.

fbc.need_sw_cache_clean works in the opposite information direction
of ring->fbc_dirty, telling the software frontbuffer tracking to perform
the cache clean on the sw side.

Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.h         | 8 ++++++++
 drivers/gpu/drm/i915/intel_display.c    | 4 +++-
 drivers/gpu/drm/i915/intel_ringbuffer.c | 9 +++++++--
 3 files changed, 18 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 5706b8a..5acda40 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -657,6 +657,14 @@ struct i915_fbc {
 
 	bool false_color;
 
+	/* On gen8 some rings cannont perform fbc clean operation so for now
+	 * we are doing this on SW with mmio.
+	 * This variable works in the opposite information direction
+	 * of ring->fbc_dirty telling software on frontbuffer tracking
+	 * to perform the cache clean on sw side.
+	 */
+	bool need_sw_cache_clean;
+
 	struct intel_fbc_work {
 		struct delayed_work work;
 		struct drm_crtc *crtc;
diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index 965eb3c..731d925 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -9137,8 +9137,10 @@ void intel_frontbuffer_flush(struct drm_device *dev,
 	 * needs to be reworked into a proper frontbuffer tracking scheme like
 	 * psr employs.
 	 */
-	if (IS_BROADWELL(dev))
+	if (IS_BROADWELL(dev) && dev_priv->fbc.need_sw_cache_clean) {
+		dev_priv->fbc.need_sw_cache_clean = false;
 		gen8_fbc_sw_flush(dev, FBC_REND_CACHE_CLEAN);
+	}
 }
 
 /**
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 85fc2b1..02b43cd 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -2227,6 +2227,7 @@ static int gen6_ring_flush(struct intel_engine_cs *ring,
 			   u32 invalidate, u32 flush)
 {
 	struct drm_device *dev = ring->dev;
+	struct drm_i915_private *dev_priv = dev->dev_private;
 	uint32_t cmd;
 	int ret;
 
@@ -2257,8 +2258,12 @@ static int gen6_ring_flush(struct intel_engine_cs *ring,
 	}
 	intel_ring_advance(ring);
 
-	if (IS_GEN7(dev) && !invalidate && flush)
-		return gen7_ring_fbc_flush(ring, FBC_REND_CACHE_CLEAN);
+	if (!invalidate && flush) {
+		if (IS_GEN7(dev))
+			return gen7_ring_fbc_flush(ring, FBC_REND_CACHE_CLEAN);
+		else if (IS_GEN8(dev))
+			dev_priv->fbc.need_sw_cache_clean = true;
+	}
 
 	return 0;
 }
-- 
1.9.3

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH 2/2] drm/i915: Minimize the huge amount of unecessary fbc sw cache clean.
  2014-09-05 20:57 ` [PATCH 2/2] drm/i915: Minimize the huge amount of unecessary fbc sw cache clean Rodrigo Vivi
@ 2014-09-05 22:14   ` Rodrigo Vivi
  2014-09-17 20:36     ` Rodrigo Vivi
  2014-09-17 20:49   ` Paulo Zanoni
  1 sibling, 1 reply; 13+ messages in thread
From: Rodrigo Vivi @ 2014-09-05 22:14 UTC (permalink / raw)
  To: Rodrigo Vivi; +Cc: intel-gfx, Paulo Zanoni


[-- Attachment #1.1: Type: text/plain, Size: 3926 bytes --]

Hey Daniel,

funny story: Remember that with idle_frames=1 on BDW it was working but it
was failing on HSW?
So, with these 2 patches applied now BDW PSR fails like HSW!!!

Ville, any thoughts on this?


On Fri, Sep 5, 2014 at 1:57 PM, Rodrigo Vivi <rodrigo.vivi@intel.com> wrote:

> The sw cache clean on gen8 is a tempoorary workaround because we cannot
> set cache clean on blt ring with risk of hungs. So we are doing the cache
> clean on sw.
> However we are doing much more than needed. Not only when using blt ring.
> So, with this extra w/a we minimize the ammount of cache cleans and call
> it only
> on same cases that it was being called on gen7.
>
> fbc.need_sw_cache_clean works in the opposite information direction
> of ring->fbc_dirty telling software on frontbuffer tracking to perform
> the cache clean on sw side.
>
> Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
> ---
>  drivers/gpu/drm/i915/i915_drv.h         | 8 ++++++++
>  drivers/gpu/drm/i915/intel_display.c    | 4 +++-
>  drivers/gpu/drm/i915/intel_ringbuffer.c | 9 +++++++--
>  3 files changed, 18 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_drv.h
> b/drivers/gpu/drm/i915/i915_drv.h
> index 5706b8a..5acda40 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -657,6 +657,14 @@ struct i915_fbc {
>
>         bool false_color;
>
> +       /* On gen8 some rings cannont perform fbc clean operation so for
> now
> +        * we are doing this on SW with mmio.
> +        * This variable works in the opposite information direction
> +        * of ring->fbc_dirty telling software on frontbuffer tracking
> +        * to perform the cache clean on sw side.
> +        */
> +       bool need_sw_cache_clean;
> +
>         struct intel_fbc_work {
>                 struct delayed_work work;
>                 struct drm_crtc *crtc;
> diff --git a/drivers/gpu/drm/i915/intel_display.c
> b/drivers/gpu/drm/i915/intel_display.c
> index 965eb3c..731d925 100644
> --- a/drivers/gpu/drm/i915/intel_display.c
> +++ b/drivers/gpu/drm/i915/intel_display.c
> @@ -9137,8 +9137,10 @@ void intel_frontbuffer_flush(struct drm_device *dev,
>          * needs to be reworked into a proper frontbuffer tracking scheme
> like
>          * psr employs.
>          */
> -       if (IS_BROADWELL(dev))
> +       if (IS_BROADWELL(dev) && dev_priv->fbc.need_sw_cache_clean) {
> +               dev_priv->fbc.need_sw_cache_clean = false;
>                 gen8_fbc_sw_flush(dev, FBC_REND_CACHE_CLEAN);
> +       }
>  }
>
>  /**
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c
> b/drivers/gpu/drm/i915/intel_ringbuffer.c
> index 85fc2b1..02b43cd 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> @@ -2227,6 +2227,7 @@ static int gen6_ring_flush(struct intel_engine_cs
> *ring,
>                            u32 invalidate, u32 flush)
>  {
>         struct drm_device *dev = ring->dev;
> +       struct drm_i915_private *dev_priv = dev->dev_private;
>         uint32_t cmd;
>         int ret;
>
> @@ -2257,8 +2258,12 @@ static int gen6_ring_flush(struct intel_engine_cs
> *ring,
>         }
>         intel_ring_advance(ring);
>
> -       if (IS_GEN7(dev) && !invalidate && flush)
> -               return gen7_ring_fbc_flush(ring, FBC_REND_CACHE_CLEAN);
> +       if (!invalidate && flush) {
> +               if (IS_GEN7(dev))
> +                       return gen7_ring_fbc_flush(ring,
> FBC_REND_CACHE_CLEAN);
> +               else if (IS_GEN8(dev))
> +                       dev_priv->fbc.need_sw_cache_clean = true;
> +       }
>
>         return 0;
>  }
> --
> 1.9.3
>
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx
>



-- 
Rodrigo Vivi
Blog: http://blog.vivi.eng.br

[-- Attachment #1.2: Type: text/html, Size: 5218 bytes --]

[-- Attachment #2: Type: text/plain, Size: 159 bytes --]

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 1/2] drm/i915: Only flush fbc on sw when fbc is enabled.
  2014-09-05 20:57 [PATCH 1/2] drm/i915: Only flush fbc on sw when fbc is enabled Rodrigo Vivi
  2014-09-05 20:57 ` [PATCH 2/2] drm/i915: Minimize the huge amount of unecessary fbc sw cache clean Rodrigo Vivi
@ 2014-09-17 19:55 ` Paulo Zanoni
  1 sibling, 0 replies; 13+ messages in thread
From: Paulo Zanoni @ 2014-09-17 19:55 UTC (permalink / raw)
  To: Rodrigo Vivi; +Cc: Intel Graphics Development, Paulo Zanoni

2014-09-05 17:57 GMT-03:00 Rodrigo Vivi <rodrigo.vivi@intel.com>:
> Avoid touching fbc register when fbc is disabled.

Just a note: we avoid touching _one_ fbc register, but now we touch
another one to check if FBC is enabled. OTOH
we probably want to modify intel_fbc_enabled() to also not touch
registers (since we run it dozens and dozens of times per second, even
when FBC is disabled). So:

Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
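
For reference, the kind of change I have in mind for intel_fbc_enabled() is
something like the untested sketch below: cache the FBC state in a software
flag instead of reading a register on every call. The fbc.enabled field here
is hypothetical, just to illustrate the idea:

	/* untested sketch: track FBC state in software so that
	 * intel_fbc_enabled() never has to touch a register;
	 * fbc.enabled would be updated on enable/disable. */
	bool intel_fbc_enabled(struct drm_device *dev)
	{
		struct drm_i915_private *dev_priv = dev->dev_private;

		return dev_priv->fbc.enabled;
	}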

>
> Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
> ---
>  drivers/gpu/drm/i915/intel_pm.c | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
> index 45f71e6..c3d3439 100644
> --- a/drivers/gpu/drm/i915/intel_pm.c
> +++ b/drivers/gpu/drm/i915/intel_pm.c
> @@ -352,6 +352,9 @@ void gen8_fbc_sw_flush(struct drm_device *dev, u32 value)
>         if (!IS_GEN8(dev))
>                 return;
>
> +       if (!intel_fbc_enabled(dev))
> +               return;
> +
>         I915_WRITE(MSG_FBC_REND_STATE, value);
>  }
>
> --
> 1.9.3
>
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx



-- 
Paulo Zanoni

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 2/2] drm/i915: Minimize the huge amount of unecessary fbc sw cache clean.
  2014-09-05 22:14   ` Rodrigo Vivi
@ 2014-09-17 20:36     ` Rodrigo Vivi
  0 siblings, 0 replies; 13+ messages in thread
From: Rodrigo Vivi @ 2014-09-17 20:36 UTC (permalink / raw)
  To: Rodrigo Vivi; +Cc: intel-gfx, Paulo Zanoni


[-- Attachment #1.1: Type: text/plain, Size: 4300 bytes --]

Please just ignore my latest comment here...
I confirm this works propperly with the PSR with my latest fixes...

On Fri, Sep 5, 2014 at 3:14 PM, Rodrigo Vivi <rodrigo.vivi@gmail.com> wrote:

> Hey Daniel,
>
> funny story: Remember that with idle_frames=1 on BDW it was working but it
> was faililng on HSW?
> So, with these 2 patches applied now BDW PSR fails like HSW!!!
>
> Ville, any thoughts on this?
>
>
> On Fri, Sep 5, 2014 at 1:57 PM, Rodrigo Vivi <rodrigo.vivi@intel.com>
> wrote:
>
>> The sw cache clean on gen8 is a tempoorary workaround because we cannot
>> set cache clean on blt ring with risk of hungs. So we are doing the cache
>> clean on sw.
>> However we are doing much more than needed. Not only when using blt ring.
>> So, with this extra w/a we minimize the ammount of cache cleans and call
>> it only
>> on same cases that it was being called on gen7.
>>
>> fbc.need_sw_cache_clean works in the opposite information direction
>> of ring->fbc_dirty telling software on frontbuffer tracking to perform
>> the cache clean on sw side.
>>
>> Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
>> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
>> ---
>>  drivers/gpu/drm/i915/i915_drv.h         | 8 ++++++++
>>  drivers/gpu/drm/i915/intel_display.c    | 4 +++-
>>  drivers/gpu/drm/i915/intel_ringbuffer.c | 9 +++++++--
>>  3 files changed, 18 insertions(+), 3 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/i915/i915_drv.h
>> b/drivers/gpu/drm/i915/i915_drv.h
>> index 5706b8a..5acda40 100644
>> --- a/drivers/gpu/drm/i915/i915_drv.h
>> +++ b/drivers/gpu/drm/i915/i915_drv.h
>> @@ -657,6 +657,14 @@ struct i915_fbc {
>>
>>         bool false_color;
>>
>> +       /* On gen8 some rings cannont perform fbc clean operation so for
>> now
>> +        * we are doing this on SW with mmio.
>> +        * This variable works in the opposite information direction
>> +        * of ring->fbc_dirty telling software on frontbuffer tracking
>> +        * to perform the cache clean on sw side.
>> +        */
>> +       bool need_sw_cache_clean;
>> +
>>         struct intel_fbc_work {
>>                 struct delayed_work work;
>>                 struct drm_crtc *crtc;
>> diff --git a/drivers/gpu/drm/i915/intel_display.c
>> b/drivers/gpu/drm/i915/intel_display.c
>> index 965eb3c..731d925 100644
>> --- a/drivers/gpu/drm/i915/intel_display.c
>> +++ b/drivers/gpu/drm/i915/intel_display.c
>> @@ -9137,8 +9137,10 @@ void intel_frontbuffer_flush(struct drm_device
>> *dev,
>>          * needs to be reworked into a proper frontbuffer tracking scheme
>> like
>>          * psr employs.
>>          */
>> -       if (IS_BROADWELL(dev))
>> +       if (IS_BROADWELL(dev) && dev_priv->fbc.need_sw_cache_clean) {
>> +               dev_priv->fbc.need_sw_cache_clean = false;
>>                 gen8_fbc_sw_flush(dev, FBC_REND_CACHE_CLEAN);
>> +       }
>>  }
>>
>>  /**
>> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c
>> b/drivers/gpu/drm/i915/intel_ringbuffer.c
>> index 85fc2b1..02b43cd 100644
>> --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
>> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
>> @@ -2227,6 +2227,7 @@ static int gen6_ring_flush(struct intel_engine_cs
>> *ring,
>>                            u32 invalidate, u32 flush)
>>  {
>>         struct drm_device *dev = ring->dev;
>> +       struct drm_i915_private *dev_priv = dev->dev_private;
>>         uint32_t cmd;
>>         int ret;
>>
>> @@ -2257,8 +2258,12 @@ static int gen6_ring_flush(struct intel_engine_cs
>> *ring,
>>         }
>>         intel_ring_advance(ring);
>>
>> -       if (IS_GEN7(dev) && !invalidate && flush)
>> -               return gen7_ring_fbc_flush(ring, FBC_REND_CACHE_CLEAN);
>> +       if (!invalidate && flush) {
>> +               if (IS_GEN7(dev))
>> +                       return gen7_ring_fbc_flush(ring,
>> FBC_REND_CACHE_CLEAN);
>> +               else if (IS_GEN8(dev))
>> +                       dev_priv->fbc.need_sw_cache_clean = true;
>> +       }
>>
>>         return 0;
>>  }
>> --
>> 1.9.3
>>
>> _______________________________________________
>> Intel-gfx mailing list
>> Intel-gfx@lists.freedesktop.org
>> http://lists.freedesktop.org/mailman/listinfo/intel-gfx
>>
>
>
>
> --
> Rodrigo Vivi
> Blog: http://blog.vivi.eng.br
>
>



-- 
Rodrigo Vivi
Blog: http://blog.vivi.eng.br

[-- Attachment #1.2: Type: text/html, Size: 6012 bytes --]

[-- Attachment #2: Type: text/plain, Size: 159 bytes --]

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 2/2] drm/i915: Minimize the huge amount of unecessary fbc sw cache clean.
  2014-09-05 20:57 ` [PATCH 2/2] drm/i915: Minimize the huge amount of unecessary fbc sw cache clean Rodrigo Vivi
  2014-09-05 22:14   ` Rodrigo Vivi
@ 2014-09-17 20:49   ` Paulo Zanoni
  2014-09-17 21:16     ` Rodrigo Vivi
  1 sibling, 1 reply; 13+ messages in thread
From: Paulo Zanoni @ 2014-09-17 20:49 UTC (permalink / raw)
  To: Rodrigo Vivi; +Cc: Intel Graphics Development, Paulo Zanoni

2014-09-05 17:57 GMT-03:00 Rodrigo Vivi <rodrigo.vivi@intel.com>:
> The sw cache clean on gen8 is a tempoorary workaround because we cannot
> set cache clean on blt ring with risk of hungs. So we are doing the cache clean on sw.
> However we are doing much more than needed. Not only when using blt ring.
> So, with this extra w/a we minimize the ammount of cache cleans and call it only
> on same cases that it was being called on gen7.
>
> fbc.need_sw_cache_clean works in the opposite information direction
> of ring->fbc_dirty telling software on frontbuffer tracking to perform
> the cache clean on sw side.
>
> Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
> ---
>  drivers/gpu/drm/i915/i915_drv.h         | 8 ++++++++
>  drivers/gpu/drm/i915/intel_display.c    | 4 +++-
>  drivers/gpu/drm/i915/intel_ringbuffer.c | 9 +++++++--
>  3 files changed, 18 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index 5706b8a..5acda40 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -657,6 +657,14 @@ struct i915_fbc {
>
>         bool false_color;
>
> +       /* On gen8 some rings cannont perform fbc clean operation so for now
> +        * we are doing this on SW with mmio.
> +        * This variable works in the opposite information direction
> +        * of ring->fbc_dirty telling software on frontbuffer tracking
> +        * to perform the cache clean on sw side.
> +        */
> +       bool need_sw_cache_clean;


Ok, this is my first time trying to look at the buffer tracking code,
so I have some noob questions.

It seems need_sw_cache_clean is set every time we call
gen6_ring_flush(), just like gen7_ring_fbc_flush(). But the thing
about gen7_ring_fbc_flush() is that it becomes a noop when fbc_dirty
is false. So maybe in this code we're doing the sw cache clean every
time we do a BLT operation, even if this operation is not on the front
buffer? That would be unnecessary overkill, right? Continuing the
same thought, why can't we just reuse the ring->fbc_dirty variable
instead of using this new need_sw_cache_clean? Maybe we should only do
the sw cache clean when both fbc_dirty *and* need_sw_cache_clean are true?
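
Something like this untested sketch is what I mean (only requesting the sw
cache clean when the frontbuffer was actually touched, mirroring what
gen7_ring_fbc_flush() already checks internally):

	/* untested sketch: gate the sw cache clean request on fbc_dirty */
	if (!invalidate && flush) {
		if (IS_GEN7(dev))
			return gen7_ring_fbc_flush(ring, FBC_REND_CACHE_CLEAN);
		else if (IS_GEN8(dev) && ring->fbc_dirty)
			dev_priv->fbc.need_sw_cache_clean = true;
	}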

More below.


> +
>         struct intel_fbc_work {
>                 struct delayed_work work;
>                 struct drm_crtc *crtc;
> diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
> index 965eb3c..731d925 100644
> --- a/drivers/gpu/drm/i915/intel_display.c
> +++ b/drivers/gpu/drm/i915/intel_display.c
> @@ -9137,8 +9137,10 @@ void intel_frontbuffer_flush(struct drm_device *dev,
>          * needs to be reworked into a proper frontbuffer tracking scheme like
>          * psr employs.
>          */
> -       if (IS_BROADWELL(dev))
> +       if (IS_BROADWELL(dev) && dev_priv->fbc.need_sw_cache_clean) {
> +               dev_priv->fbc.need_sw_cache_clean = false;
>                 gen8_fbc_sw_flush(dev, FBC_REND_CACHE_CLEAN);
> +       }


Bikeshed: since this is the only caller, I think it makes sense to put
the checks inside the function instead of the caller.
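
In other words, something along the lines of this untested sketch, where the
helper does all the checking and the caller stays trivial:

	/* untested sketch: gen8_fbc_sw_flush() decides by itself
	 * whether there is anything to do */
	void gen8_fbc_sw_flush(struct drm_device *dev, u32 value)
	{
		struct drm_i915_private *dev_priv = dev->dev_private;

		if (!IS_GEN8(dev))
			return;

		if (!intel_fbc_enabled(dev))
			return;

		if (!dev_priv->fbc.need_sw_cache_clean)
			return;

		dev_priv->fbc.need_sw_cache_clean = false;
		I915_WRITE(MSG_FBC_REND_STATE, value);
	}

	/* intel_frontbuffer_flush() then keeps a single call: */
	if (IS_BROADWELL(dev))
		gen8_fbc_sw_flush(dev, FBC_REND_CACHE_CLEAN);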


>  }
>
>  /**
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
> index 85fc2b1..02b43cd 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> @@ -2227,6 +2227,7 @@ static int gen6_ring_flush(struct intel_engine_cs *ring,


The thing about gen6_ring_flush() is that it is also the flush function
for the vebox ring. Is this a problem? What are the consequences of
doing the FBC thing on it?


>                            u32 invalidate, u32 flush)
>  {
>         struct drm_device *dev = ring->dev;
> +       struct drm_i915_private *dev_priv = dev->dev_private;
>         uint32_t cmd;
>         int ret;
>
> @@ -2257,8 +2258,12 @@ static int gen6_ring_flush(struct intel_engine_cs *ring,
>         }
>         intel_ring_advance(ring);
>
> -       if (IS_GEN7(dev) && !invalidate && flush)
> -               return gen7_ring_fbc_flush(ring, FBC_REND_CACHE_CLEAN);
> +       if (!invalidate && flush) {
> +               if (IS_GEN7(dev))
> +                       return gen7_ring_fbc_flush(ring, FBC_REND_CACHE_CLEAN);
> +               else if (IS_GEN8(dev))
> +                       dev_priv->fbc.need_sw_cache_clean = true;


Bikeshed: since I like to keep the callers clean, can't we just create
a new intel_blt_fbc_flush (or intel_ring_fbc_flush(), or whatever)
that will be responsible for deciding if it needs to call
gen7_ring_fbc_flush or set need_sw_cache_clean to true? That would
make even more sense if we conclude we also need to check for
fbc_dirty before setting need_sw_cache_clean to true...
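
Roughly something like this untested sketch (the helper name is just a
placeholder):

	/* untested sketch: one helper hides the gen7 vs gen8 difference */
	static int intel_ring_fbc_flush(struct intel_engine_cs *ring, u32 value)
	{
		struct drm_device *dev = ring->dev;
		struct drm_i915_private *dev_priv = dev->dev_private;

		if (IS_GEN7(dev))
			return gen7_ring_fbc_flush(ring, value);
		else if (IS_GEN8(dev))
			dev_priv->fbc.need_sw_cache_clean = true;

		return 0;
	}

	/* and the caller in gen6_ring_flush() becomes: */
	if (!invalidate && flush)
		return intel_ring_fbc_flush(ring, FBC_REND_CACHE_CLEAN);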


> +       }
>
>         return 0;
>  }
> --
> 1.9.3
>
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx



-- 
Paulo Zanoni

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 2/2] drm/i915: Minimize the huge amount of unecessary fbc sw cache clean.
  2014-09-17 20:49   ` Paulo Zanoni
@ 2014-09-17 21:16     ` Rodrigo Vivi
  2014-09-18 15:29       ` [PATCH] " Rodrigo Vivi
  0 siblings, 1 reply; 13+ messages in thread
From: Rodrigo Vivi @ 2014-09-17 21:16 UTC (permalink / raw)
  To: Paulo Zanoni; +Cc: Intel Graphics Development, Paulo Zanoni, Rodrigo Vivi


[-- Attachment #1.1: Type: text/plain, Size: 7332 bytes --]

On Wed, Sep 17, 2014 at 1:49 PM, Paulo Zanoni <przanoni@gmail.com> wrote:

> 2014-09-05 17:57 GMT-03:00 Rodrigo Vivi <rodrigo.vivi@intel.com>:
> > The sw cache clean on gen8 is a tempoorary workaround because we cannot
> > set cache clean on blt ring with risk of hungs. So we are doing the
> cache clean on sw.
> > However we are doing much more than needed. Not only when using blt ring.
> > So, with this extra w/a we minimize the ammount of cache cleans and call
> it only
> > on same cases that it was being called on gen7.
> >
> > fbc.need_sw_cache_clean works in the opposite information direction
> > of ring->fbc_dirty telling software on frontbuffer tracking to perform
> > the cache clean on sw side.
> >
> > Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
> > Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
> > ---
> >  drivers/gpu/drm/i915/i915_drv.h         | 8 ++++++++
> >  drivers/gpu/drm/i915/intel_display.c    | 4 +++-
> >  drivers/gpu/drm/i915/intel_ringbuffer.c | 9 +++++++--
> >  3 files changed, 18 insertions(+), 3 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/i915/i915_drv.h
> b/drivers/gpu/drm/i915/i915_drv.h
> > index 5706b8a..5acda40 100644
> > --- a/drivers/gpu/drm/i915/i915_drv.h
> > +++ b/drivers/gpu/drm/i915/i915_drv.h
> > @@ -657,6 +657,14 @@ struct i915_fbc {
> >
> >         bool false_color;
> >
> > +       /* On gen8 some rings cannont perform fbc clean operation so for
> now
> > +        * we are doing this on SW with mmio.
> > +        * This variable works in the opposite information direction
> > +        * of ring->fbc_dirty telling software on frontbuffer tracking
> > +        * to perform the cache clean on sw side.
> > +        */
> > +       bool need_sw_cache_clean;
>
>
> Ok, this is my first time trying to look at the buffer tracking code,
> so I have some noob questions.
>
> It seems need_sw_cache_clean is set every time we call
> gen6_ring_flush(), just like gen7_ring_fbc_flush(). But the thing
> about gen7_ring_fbc_flush() is that it becomes a noop when fbc_dirty
> is false. So maybe on this code we're doing the sw cache clean every
> time we do a BLT operation, even if this operation is not on the front
> buffer? That would be an unnecessary overkill, right? Continuing the
> same though, why can't we just reuse the ring->fbc_dirty variable
> instead of using this new need_sw_cache_clean? Maybe we should only do
> the sw cache clean when both fbc_dirty *and* need_sw_cache are true?
>

need_sw_cache_clean is exactly the opposite information flow of fbc_dirty.
fbc_dirty was actually the first name that came to my mind, but I noticed we
already had that name.

Well, the fbc cache needs to be cleaned when it is dirty ;) That means on a
ring flush (other than the render one) and when the frontbuffer was touched.

When the frontbuffer was touched and there is a render flush we nuke FBC, and
when the frontbuffer was touched and a ring other than render is flushed we
clean the cache (that is the hw logic implemented there). fbc_dirty tells the
ring that the frontbuffer was touched and that the cache clean LRI can be
executed on the ring. need_sw_cache_clean is exactly the opposite: when the
ring was flushed, it tells the next frontbuffer touch to clean the cache.

I know the order of things is different. The hw one is frontbuffer touch,
then ring flush with the LRI for the cache clean, while this one is ring
flush, then the next frontbuffer touch does the mmio to clean the cache. So I
imagine there might be cases where the cache needs to be cleaned but no
frontbuffer touch occurs, and fbc stays disabled, not compressing anything...
But this is exactly how the hw behaved without the sw cache clean, and my
tests here show that even without the proper order fbc just keeps working and
compressing properly.



>
> More below.
>
>
> > +
> >         struct intel_fbc_work {
> >                 struct delayed_work work;
> >                 struct drm_crtc *crtc;
> > diff --git a/drivers/gpu/drm/i915/intel_display.c
> b/drivers/gpu/drm/i915/intel_display.c
> > index 965eb3c..731d925 100644
> > --- a/drivers/gpu/drm/i915/intel_display.c
> > +++ b/drivers/gpu/drm/i915/intel_display.c
> > @@ -9137,8 +9137,10 @@ void intel_frontbuffer_flush(struct drm_device
> *dev,
> >          * needs to be reworked into a proper frontbuffer tracking
> scheme like
> >          * psr employs.
> >          */
> > -       if (IS_BROADWELL(dev))
> > +       if (IS_BROADWELL(dev) && dev_priv->fbc.need_sw_cache_clean) {
> > +               dev_priv->fbc.need_sw_cache_clean = false;
> >                 gen8_fbc_sw_flush(dev, FBC_REND_CACHE_CLEAN);
> > +       }
>
>
> Bikeshed: since this is the only caller, I think it makes sense to put
> the checks inside the function instead of the caller.
>

Agree.


>
>
> >  }
> >
> >  /**
> > diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c
> b/drivers/gpu/drm/i915/intel_ringbuffer.c
> > index 85fc2b1..02b43cd 100644
> > --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> > +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> > @@ -2227,6 +2227,7 @@ static int gen6_ring_flush(struct intel_engine_cs
> *ring,
>
>
> The thing about gen6_render_ring_flush is that it is also a function
> for the vebox ring. Is this a problem? What are the consequences of
> doing the FBC thing on it?
>

As far as I remember we should nuke on render flush and clean the cache on
all other ring flushes, so no problem with that.


>
>
> >                            u32 invalidate, u32 flush)
> >  {
> >         struct drm_device *dev = ring->dev;
> > +       struct drm_i915_private *dev_priv = dev->dev_private;
> >         uint32_t cmd;
> >         int ret;
> >
> > @@ -2257,8 +2258,12 @@ static int gen6_ring_flush(struct intel_engine_cs
> *ring,
> >         }
> >         intel_ring_advance(ring);
> >
> > -       if (IS_GEN7(dev) && !invalidate && flush)
> > -               return gen7_ring_fbc_flush(ring, FBC_REND_CACHE_CLEAN);
> > +       if (!invalidate && flush) {
> > +               if (IS_GEN7(dev))
> > +                       return gen7_ring_fbc_flush(ring,
> FBC_REND_CACHE_CLEAN);
> > +               else if (IS_GEN8(dev))
> > +                       dev_priv->fbc.need_sw_cache_clean = true;
>
>
> Bikeshed: since I like to keep the callers clean, can't we just create
> a new intel_blt_fbc_flush (or intel_ring_fbc_flush(), or whatever)
> that will be responsible for deciding if it needs to call
> gen7_ring_fbc_flush or set need_sw_cache_clean to true?


Agree this can be cleaner.


> That would
> make even more sense if we conclude we also need to check for
> fbc_dirty before setting need_sw_cache_clean to true...
>

I don't think so. fbc_dirty is set on the frontbuffer touch. That would only
require a flow where the frontbuffer is touched before the ring is flushed...
but I'm confident this is already happening, so we wouldn't be saving any
case.



>
>
> > +       }
> >
> >         return 0;
> >  }
> > --
> > 1.9.3
> >
> > _______________________________________________
> > Intel-gfx mailing list
> > Intel-gfx@lists.freedesktop.org
> > http://lists.freedesktop.org/mailman/listinfo/intel-gfx
>
>
>
> --
> Paulo Zanoni
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx
>



-- 
Rodrigo Vivi
Blog: http://blog.vivi.eng.br

[-- Attachment #1.2: Type: text/html, Size: 10256 bytes --]

[-- Attachment #2: Type: text/plain, Size: 159 bytes --]

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH] drm/i915: Minimize the huge amount of unecessary fbc sw cache clean.
  2014-09-17 21:16     ` Rodrigo Vivi
@ 2014-09-18 15:29       ` Rodrigo Vivi
  2014-09-19 16:32         ` Paulo Zanoni
  0 siblings, 1 reply; 13+ messages in thread
From: Rodrigo Vivi @ 2014-09-18 15:29 UTC (permalink / raw)
  To: intel-gfx; +Cc: Paulo Zanoni, Rodrigo Vivi

The sw cache clean on BDW is a temporary workaround because we cannot
set the cache clean on the blt ring without risking hangs, so we are doing the
cache clean in sw. However, we are doing it much more often than needed, not
only when using the blt ring. So, with this extra w/a we minimize the amount
of cache cleans and call it only in the same cases where it was being called
on gen7.

The traditional FBC cache clean happens via an LRI on the BLT ring when there
is a frontbuffer touch. Frontbuffer tracking sets the fbc_dirty variable to
tell the BLT flush that it must clean the FBC cache.

fbc.need_sw_cache_clean works in the opposite information direction
of ring->fbc_dirty, telling the software frontbuffer tracking to perform
the cache clean on the sw side.

v2: Clean it a little bit and fully check for Broadwell instead of gen8.

Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.h         | 10 +++++++++-
 drivers/gpu/drm/i915/intel_display.c    |  6 ++++--
 drivers/gpu/drm/i915/intel_pm.c         |  4 ++--
 drivers/gpu/drm/i915/intel_ringbuffer.c |  9 +++++++--
 4 files changed, 22 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index e3ca8df..a8636e5 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -659,6 +659,14 @@ struct i915_fbc {
 
 	bool false_color;
 
+	/* On gen8 some rings cannont perform fbc clean operation so for now
+	 * we are doing this on SW with mmio.
+	 * This variable works in the opposite information direction
+	 * of ring->fbc_dirty telling software on frontbuffer tracking
+	 * to perform the cache clean on sw side.
+	 */
+	bool need_sw_cache_clean;
+
 	struct intel_fbc_work {
 		struct delayed_work work;
 		struct drm_crtc *crtc;
@@ -2820,7 +2828,7 @@ extern void intel_modeset_setup_hw_state(struct drm_device *dev,
 extern void i915_redisable_vga(struct drm_device *dev);
 extern void i915_redisable_vga_power_on(struct drm_device *dev);
 extern bool intel_fbc_enabled(struct drm_device *dev);
-extern void gen8_fbc_sw_flush(struct drm_device *dev, u32 value);
+extern void bdw_fbc_sw_flush(struct drm_device *dev, u32 value);
 extern void intel_disable_fbc(struct drm_device *dev);
 extern bool ironlake_set_drps(struct drm_device *dev, u8 val);
 extern void intel_init_pch_refclk(struct drm_device *dev);
diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index b78f00a..20ba1eb 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -9225,8 +9225,10 @@ void intel_frontbuffer_flush(struct drm_device *dev,
 	 * needs to be reworked into a proper frontbuffer tracking scheme like
 	 * psr employs.
 	 */
-	if (IS_BROADWELL(dev))
-		gen8_fbc_sw_flush(dev, FBC_REND_CACHE_CLEAN);
+	if (dev_priv->fbc.need_sw_cache_clean) {
+		dev_priv->fbc.need_sw_cache_clean = false;
+		bdw_fbc_sw_flush(dev, FBC_REND_CACHE_CLEAN);
+	}
 }
 
 /**
diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index 6f3b94b..40377c7 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -345,11 +345,11 @@ bool intel_fbc_enabled(struct drm_device *dev)
 	return dev_priv->display.fbc_enabled(dev);
 }
 
-void gen8_fbc_sw_flush(struct drm_device *dev, u32 value)
+void bdw_fbc_sw_flush(struct drm_device *dev, u32 value)
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
 
-	if (!IS_GEN8(dev))
+	if (!IS_BROADWELL(dev))
 		return;
 
 	if (!intel_fbc_enabled(dev))
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 25795f2..c2ddf5f 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -2240,6 +2240,7 @@ static int gen6_ring_flush(struct intel_engine_cs *ring,
 			   u32 invalidate, u32 flush)
 {
 	struct drm_device *dev = ring->dev;
+	struct drm_i915_private *dev_priv = dev->dev_private;
 	uint32_t cmd;
 	int ret;
 
@@ -2270,8 +2271,12 @@ static int gen6_ring_flush(struct intel_engine_cs *ring,
 	}
 	intel_ring_advance(ring);
 
-	if (IS_GEN7(dev) && !invalidate && flush)
-		return gen7_ring_fbc_flush(ring, FBC_REND_CACHE_CLEAN);
+	if (!invalidate && flush) {
+		if (IS_GEN7(dev))
+			return gen7_ring_fbc_flush(ring, FBC_REND_CACHE_CLEAN);
+		else if (IS_BROADWELL(dev))
+			dev_priv->fbc.need_sw_cache_clean = true;
+	}
 
 	return 0;
 }
-- 
1.9.3

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH] drm/i915: Minimize the huge amount of unecessary fbc sw cache clean.
  2014-09-18 15:29       ` [PATCH] " Rodrigo Vivi
@ 2014-09-19 16:32         ` Paulo Zanoni
  2014-09-24 23:11           ` Rodrigo Vivi
  0 siblings, 1 reply; 13+ messages in thread
From: Paulo Zanoni @ 2014-09-19 16:32 UTC (permalink / raw)
  To: Rodrigo Vivi; +Cc: Intel Graphics Development, Paulo Zanoni

2014-09-18 12:29 GMT-03:00 Rodrigo Vivi <rodrigo.vivi@intel.com>:
> The sw cache clean on BDW is a tempoorary workaround because we cannot
> set cache clean on blt ring with risk of hungs. So we are doing the cache clean on sw.
> However we are doing much more than needed. Not only when using blt ring.
> So, with this extra w/a we minimize the ammount of cache cleans and call it only
> on same cases that it was being called on gen7.
>
> The traditional FBC Cache clean happens over LRI on BLT ring when there is a
> frontbuffer touch happening. frontbuffer tracking set fbc_dirty variable
> to let BLT flush that it must clean FBC cache.
>
> fbc.need_sw_cache_clean works in the opposite information direction
> of ring->fbc_dirty telling software on frontbuffer tracking to perform
> the cache clean on sw side.
>
> v2: Clean it a little bit and fully check for Broadwell instead of gen8.

This is the first time that I try to understand the buffer tracking
code, but as far as I could understand - and as far as you explained
stuff to me on IRC - the code looks good. But we could still use a
review from either Ville or Daniel. With that said:
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>

And the s/IS_GEN8/IS_BROADWELL/ thing ideally would be a separate
patch, but it's fine this way too.

>
> Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
> ---
>  drivers/gpu/drm/i915/i915_drv.h         | 10 +++++++++-
>  drivers/gpu/drm/i915/intel_display.c    |  6 ++++--
>  drivers/gpu/drm/i915/intel_pm.c         |  4 ++--
>  drivers/gpu/drm/i915/intel_ringbuffer.c |  9 +++++++--
>  4 files changed, 22 insertions(+), 7 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index e3ca8df..a8636e5 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -659,6 +659,14 @@ struct i915_fbc {
>
>         bool false_color;
>
> +       /* On gen8 some rings cannont perform fbc clean operation so for now
> +        * we are doing this on SW with mmio.
> +        * This variable works in the opposite information direction
> +        * of ring->fbc_dirty telling software on frontbuffer tracking
> +        * to perform the cache clean on sw side.
> +        */
> +       bool need_sw_cache_clean;
> +
>         struct intel_fbc_work {
>                 struct delayed_work work;
>                 struct drm_crtc *crtc;
> @@ -2820,7 +2828,7 @@ extern void intel_modeset_setup_hw_state(struct drm_device *dev,
>  extern void i915_redisable_vga(struct drm_device *dev);
>  extern void i915_redisable_vga_power_on(struct drm_device *dev);
>  extern bool intel_fbc_enabled(struct drm_device *dev);
> -extern void gen8_fbc_sw_flush(struct drm_device *dev, u32 value);
> +extern void bdw_fbc_sw_flush(struct drm_device *dev, u32 value);
>  extern void intel_disable_fbc(struct drm_device *dev);
>  extern bool ironlake_set_drps(struct drm_device *dev, u8 val);
>  extern void intel_init_pch_refclk(struct drm_device *dev);
> diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
> index b78f00a..20ba1eb 100644
> --- a/drivers/gpu/drm/i915/intel_display.c
> +++ b/drivers/gpu/drm/i915/intel_display.c
> @@ -9225,8 +9225,10 @@ void intel_frontbuffer_flush(struct drm_device *dev,
>          * needs to be reworked into a proper frontbuffer tracking scheme like
>          * psr employs.
>          */
> -       if (IS_BROADWELL(dev))
> -               gen8_fbc_sw_flush(dev, FBC_REND_CACHE_CLEAN);
> +       if (dev_priv->fbc.need_sw_cache_clean) {
> +               dev_priv->fbc.need_sw_cache_clean = false;
> +               bdw_fbc_sw_flush(dev, FBC_REND_CACHE_CLEAN);
> +       }
>  }
>
>  /**
> diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
> index 6f3b94b..40377c7 100644
> --- a/drivers/gpu/drm/i915/intel_pm.c
> +++ b/drivers/gpu/drm/i915/intel_pm.c
> @@ -345,11 +345,11 @@ bool intel_fbc_enabled(struct drm_device *dev)
>         return dev_priv->display.fbc_enabled(dev);
>  }
>
> -void gen8_fbc_sw_flush(struct drm_device *dev, u32 value)
> +void bdw_fbc_sw_flush(struct drm_device *dev, u32 value)
>  {
>         struct drm_i915_private *dev_priv = dev->dev_private;
>
> -       if (!IS_GEN8(dev))
> +       if (!IS_BROADWELL(dev))
>                 return;
>
>         if (!intel_fbc_enabled(dev))
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
> index 25795f2..c2ddf5f 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> @@ -2240,6 +2240,7 @@ static int gen6_ring_flush(struct intel_engine_cs *ring,
>                            u32 invalidate, u32 flush)
>  {
>         struct drm_device *dev = ring->dev;
> +       struct drm_i915_private *dev_priv = dev->dev_private;
>         uint32_t cmd;
>         int ret;
>
> @@ -2270,8 +2271,12 @@ static int gen6_ring_flush(struct intel_engine_cs *ring,
>         }
>         intel_ring_advance(ring);
>
> -       if (IS_GEN7(dev) && !invalidate && flush)
> -               return gen7_ring_fbc_flush(ring, FBC_REND_CACHE_CLEAN);
> +       if (!invalidate && flush) {
> +               if (IS_GEN7(dev))
> +                       return gen7_ring_fbc_flush(ring, FBC_REND_CACHE_CLEAN);
> +               else if (IS_BROADWELL(dev))
> +                       dev_priv->fbc.need_sw_cache_clean = true;
> +       }
>
>         return 0;
>  }
> --
> 1.9.3
>
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx



-- 
Paulo Zanoni

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH] drm/i915: Minimize the huge amount of unecessary fbc sw cache clean.
  2014-09-19 16:32         ` Paulo Zanoni
@ 2014-09-24 23:11           ` Rodrigo Vivi
  2014-09-24 23:50             ` Rodrigo Vivi
  0 siblings, 1 reply; 13+ messages in thread
From: Rodrigo Vivi @ 2014-09-24 23:11 UTC (permalink / raw)
  To: intel-gfx; +Cc: Daniel Vetter, Paulo Zanoni, Rodrigo Vivi

The sw cache clean on BDW is a temporary workaround because we cannot
set the cache clean on the blt ring without risking hangs, so we are doing the
cache clean in sw. However, we are doing it much more often than needed, not
only when using the blt ring. So, with this extra w/a we minimize the amount
of cache cleans and call it only in the same cases where it was being called
on gen7.

The traditional FBC cache clean happens via an LRI on the BLT ring when there
is a frontbuffer touch. Frontbuffer tracking sets the fbc_dirty variable to
tell the BLT flush that it must clean the FBC cache.

fbc.need_sw_cache_clean works in the opposite information direction
of ring->fbc_dirty, telling the software frontbuffer tracking to perform
the cache clean on the sw side.

v2: Clean it a little bit and fully check for Broadwell instead of gen8.

v3: Rebased.

Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.h         | 10 +++++++++-
 drivers/gpu/drm/i915/intel_display.c    |  5 +++++
 drivers/gpu/drm/i915/intel_ringbuffer.c |  9 +++++++--
 3 files changed, 21 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index da607ab..4cd2aa3 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -676,6 +676,14 @@ struct i915_fbc {
 	 * possible. */
 	bool enabled;
 
+	/* On gen8 some rings cannont perform fbc clean operation so for now
+	 * we are doing this on SW with mmio.
+	 * This variable works in the opposite information direction
+	 * of ring->fbc_dirty telling software on frontbuffer tracking
+	 * to perform the cache clean on sw side.
+	 */
+	bool need_sw_cache_clean;
+
 	struct intel_fbc_work {
 		struct delayed_work work;
 		struct drm_crtc *crtc;
@@ -2843,7 +2851,7 @@ extern void intel_modeset_setup_hw_state(struct drm_device *dev,
 extern void i915_redisable_vga(struct drm_device *dev);
 extern void i915_redisable_vga_power_on(struct drm_device *dev);
 extern bool intel_fbc_enabled(struct drm_device *dev);
-extern void gen8_fbc_sw_flush(struct drm_device *dev, u32 value);
+extern void bdw_fbc_sw_flush(struct drm_device *dev, u32 value);
 extern void intel_disable_fbc(struct drm_device *dev);
 extern bool ironlake_set_drps(struct drm_device *dev, u8 val);
 extern void intel_init_pch_refclk(struct drm_device *dev);
diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index b8488a8..3d0620e 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -2130,6 +2130,11 @@ static void intel_enable_primary_hw_plane(struct drm_plane *plane,
 	 */
 	if (IS_BROADWELL(dev))
 		intel_wait_for_vblank(dev, intel_crtc->pipe);
+
+	if (dev_priv->fbc.need_sw_cache_clean) {
+		dev_priv->fbc.need_sw_cache_clean = false;
+		bdw_fbc_sw_flush(dev, FBC_REND_CACHE_CLEAN);
+	}
 }
 
 /**
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 896f564..b56e031 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -2237,6 +2237,7 @@ static int gen6_ring_flush(struct intel_engine_cs *ring,
 			   u32 invalidate, u32 flush)
 {
 	struct drm_device *dev = ring->dev;
+	struct drm_i915_private *dev_priv = dev->dev_private;
 	uint32_t cmd;
 	int ret;
 
@@ -2267,8 +2268,12 @@ static int gen6_ring_flush(struct intel_engine_cs *ring,
 	}
 	intel_ring_advance(ring);
 
-	if (IS_GEN7(dev) && !invalidate && flush)
-		return gen7_ring_fbc_flush(ring, FBC_REND_CACHE_CLEAN);
+	if (!invalidate && flush) {
+		if (IS_GEN7(dev))
+			return gen7_ring_fbc_flush(ring, FBC_REND_CACHE_CLEAN);
+		else if (IS_BROADWELL(dev))
+			dev_priv->fbc.need_sw_cache_clean = true;
+	}
 
 	return 0;
 }
-- 
1.9.3

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH] drm/i915: Minimize the huge amount of unecessary fbc sw cache clean.
  2014-09-24 23:11           ` Rodrigo Vivi
@ 2014-09-24 23:50             ` Rodrigo Vivi
  2014-09-25 17:48               ` Paulo Zanoni
  2014-09-29 12:17               ` Daniel Vetter
  0 siblings, 2 replies; 13+ messages in thread
From: Rodrigo Vivi @ 2014-09-24 23:50 UTC (permalink / raw)
  To: intel-gfx; +Cc: Daniel Vetter, Paulo Zanoni, Rodrigo Vivi

The sw cache clean on BDW is a temporary workaround because we cannot
set the cache clean on the blt ring without risking hangs, so we are doing the
cache clean in sw. However, we are doing it much more often than needed, not
only when using the blt ring. So, with this extra w/a we minimize the amount
of cache cleans and call it only in the same cases where it was being called
on gen7.

The traditional FBC cache clean happens via an LRI on the BLT ring when there
is a frontbuffer touch. Frontbuffer tracking sets the fbc_dirty variable to
tell the BLT flush that it must clean the FBC cache.

fbc.need_sw_cache_clean works in the opposite information direction
of ring->fbc_dirty, telling the software frontbuffer tracking to perform
the cache clean on the sw side.

v2: Clean it a little bit and fully check for Broadwell instead of gen8.

v3: Rebase after frontbuffer organization.

v4: Wiggle confused me. So fixing v3!

Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.h          | 10 +++++++++-
 drivers/gpu/drm/i915/intel_frontbuffer.c |  6 ++++--
 drivers/gpu/drm/i915/intel_pm.c          |  2 +-
 drivers/gpu/drm/i915/intel_ringbuffer.c  |  9 +++++++--
 4 files changed, 21 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index da607ab..4cd2aa3 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -676,6 +676,14 @@ struct i915_fbc {
 	 * possible. */
 	bool enabled;
 
+	/* On gen8 some rings cannont perform fbc clean operation so for now
+	 * we are doing this on SW with mmio.
+	 * This variable works in the opposite information direction
+	 * of ring->fbc_dirty telling software on frontbuffer tracking
+	 * to perform the cache clean on sw side.
+	 */
+	bool need_sw_cache_clean;
+
 	struct intel_fbc_work {
 		struct delayed_work work;
 		struct drm_crtc *crtc;
@@ -2843,7 +2851,7 @@ extern void intel_modeset_setup_hw_state(struct drm_device *dev,
 extern void i915_redisable_vga(struct drm_device *dev);
 extern void i915_redisable_vga_power_on(struct drm_device *dev);
 extern bool intel_fbc_enabled(struct drm_device *dev);
-extern void gen8_fbc_sw_flush(struct drm_device *dev, u32 value);
+extern void bdw_fbc_sw_flush(struct drm_device *dev, u32 value);
 extern void intel_disable_fbc(struct drm_device *dev);
 extern bool ironlake_set_drps(struct drm_device *dev, u8 val);
 extern void intel_init_pch_refclk(struct drm_device *dev);
diff --git a/drivers/gpu/drm/i915/intel_frontbuffer.c b/drivers/gpu/drm/i915/intel_frontbuffer.c
index f74744c..7eb74a6 100644
--- a/drivers/gpu/drm/i915/intel_frontbuffer.c
+++ b/drivers/gpu/drm/i915/intel_frontbuffer.c
@@ -189,8 +189,10 @@ void intel_frontbuffer_flush(struct drm_device *dev,
 	 * needs to be reworked into a proper frontbuffer tracking scheme like
 	 * psr employs.
 	 */
-	if (IS_BROADWELL(dev))
-		gen8_fbc_sw_flush(dev, FBC_REND_CACHE_CLEAN);
+	if (dev_priv->fbc.need_sw_cache_clean) {
+		dev_priv->fbc.need_sw_cache_clean = false;
+		bdw_fbc_sw_flush(dev, FBC_REND_CACHE_CLEAN);
+	}
 }
 
 /**
diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index aaab056..36842c4 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -380,7 +380,7 @@ bool intel_fbc_enabled(struct drm_device *dev)
 	return dev_priv->fbc.enabled;
 }
 
-void gen8_fbc_sw_flush(struct drm_device *dev, u32 value)
+void bdw_fbc_sw_flush(struct drm_device *dev, u32 value)
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
 
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 896f564..b56e031 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -2237,6 +2237,7 @@ static int gen6_ring_flush(struct intel_engine_cs *ring,
 			   u32 invalidate, u32 flush)
 {
 	struct drm_device *dev = ring->dev;
+	struct drm_i915_private *dev_priv = dev->dev_private;
 	uint32_t cmd;
 	int ret;
 
@@ -2267,8 +2268,12 @@ static int gen6_ring_flush(struct intel_engine_cs *ring,
 	}
 	intel_ring_advance(ring);
 
-	if (IS_GEN7(dev) && !invalidate && flush)
-		return gen7_ring_fbc_flush(ring, FBC_REND_CACHE_CLEAN);
+	if (!invalidate && flush) {
+		if (IS_GEN7(dev))
+			return gen7_ring_fbc_flush(ring, FBC_REND_CACHE_CLEAN);
+		else if (IS_BROADWELL(dev))
+			dev_priv->fbc.need_sw_cache_clean = true;
+	}
 
 	return 0;
 }
-- 
1.9.3

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH] drm/i915: Minimize the huge amount of unecessary fbc sw cache clean.
  2014-09-24 23:50             ` Rodrigo Vivi
@ 2014-09-25 17:48               ` Paulo Zanoni
  2014-09-29 12:17               ` Daniel Vetter
  1 sibling, 0 replies; 13+ messages in thread
From: Paulo Zanoni @ 2014-09-25 17:48 UTC (permalink / raw)
  To: Rodrigo Vivi; +Cc: Daniel Vetter, Intel Graphics Development, Paulo Zanoni

2014-09-24 20:50 GMT-03:00 Rodrigo Vivi <rodrigo.vivi@intel.com>:
> The sw cache clean on BDW is a tempoorary workaround because we cannot
> set cache clean on blt ring with risk of hungs. So we are doing the cache clean on sw.
> However we are doing much more than needed. Not only when using blt ring.
> So, with this extra w/a we minimize the ammount of cache cleans and call it only
> on same cases that it was being called on gen7.
>
> The traditional FBC Cache clean happens over LRI on BLT ring when there is a
> frontbuffer touch happening. frontbuffer tracking set fbc_dirty variable
> to let BLT flush that it must clean FBC cache.
>
> fbc.need_sw_cache_clean works in the opposite information direction
> of ring->fbc_dirty telling software on frontbuffer tracking to perform
> the cache clean on sw side.
>
> v2: Clean it a little bit and fully check for Broadwell instead of gen8.
>
> v3: Rebase after frontbuffer organization.
>
> v4: Wiggle confused me. So fixing v3!
>
> Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
> Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
> Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
> ---
>  drivers/gpu/drm/i915/i915_drv.h          | 10 +++++++++-
>  drivers/gpu/drm/i915/intel_frontbuffer.c |  6 ++++--
>  drivers/gpu/drm/i915/intel_pm.c          |  2 +-
>  drivers/gpu/drm/i915/intel_ringbuffer.c  |  9 +++++++--
>  4 files changed, 21 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index da607ab..4cd2aa3 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -676,6 +676,14 @@ struct i915_fbc {
>          * possible. */
>         bool enabled;
>
> +       /* On gen8 some rings cannont perform fbc clean operation so for now
> +        * we are doing this on SW with mmio.
> +        * This variable works in the opposite information direction
> +        * of ring->fbc_dirty telling software on frontbuffer tracking
> +        * to perform the cache clean on sw side.
> +        */
> +       bool need_sw_cache_clean;
> +
>         struct intel_fbc_work {
>                 struct delayed_work work;
>                 struct drm_crtc *crtc;
> @@ -2843,7 +2851,7 @@ extern void intel_modeset_setup_hw_state(struct drm_device *dev,
>  extern void i915_redisable_vga(struct drm_device *dev);
>  extern void i915_redisable_vga_power_on(struct drm_device *dev);
>  extern bool intel_fbc_enabled(struct drm_device *dev);
> -extern void gen8_fbc_sw_flush(struct drm_device *dev, u32 value);
> +extern void bdw_fbc_sw_flush(struct drm_device *dev, u32 value);
>  extern void intel_disable_fbc(struct drm_device *dev);
>  extern bool ironlake_set_drps(struct drm_device *dev, u8 val);
>  extern void intel_init_pch_refclk(struct drm_device *dev);
> diff --git a/drivers/gpu/drm/i915/intel_frontbuffer.c b/drivers/gpu/drm/i915/intel_frontbuffer.c
> index f74744c..7eb74a6 100644
> --- a/drivers/gpu/drm/i915/intel_frontbuffer.c
> +++ b/drivers/gpu/drm/i915/intel_frontbuffer.c
> @@ -189,8 +189,10 @@ void intel_frontbuffer_flush(struct drm_device *dev,
>          * needs to be reworked into a proper frontbuffer tracking scheme like
>          * psr employs.
>          */
> -       if (IS_BROADWELL(dev))
> -               gen8_fbc_sw_flush(dev, FBC_REND_CACHE_CLEAN);
> +       if (dev_priv->fbc.need_sw_cache_clean) {

In your first version, you had removed this IS_BROADWELL check, but
you replaced the check inside bdw_fbc_sw_flush() from IS_GEN8 to
IS_BROADWELL.


> +               dev_priv->fbc.need_sw_cache_clean = false;
> +               bdw_fbc_sw_flush(dev, FBC_REND_CACHE_CLEAN);
> +       }
>  }
>
>  /**
> diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
> index aaab056..36842c4 100644
> --- a/drivers/gpu/drm/i915/intel_pm.c
> +++ b/drivers/gpu/drm/i915/intel_pm.c
> @@ -380,7 +380,7 @@ bool intel_fbc_enabled(struct drm_device *dev)
>         return dev_priv->fbc.enabled;
>  }
>
> -void gen8_fbc_sw_flush(struct drm_device *dev, u32 value)
> +void bdw_fbc_sw_flush(struct drm_device *dev, u32 value)
>  {
>         struct drm_i915_private *dev_priv = dev->dev_private;
>
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
> index 896f564..b56e031 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> @@ -2237,6 +2237,7 @@ static int gen6_ring_flush(struct intel_engine_cs *ring,
>                            u32 invalidate, u32 flush)
>  {
>         struct drm_device *dev = ring->dev;
> +       struct drm_i915_private *dev_priv = dev->dev_private;
>         uint32_t cmd;
>         int ret;
>
> @@ -2267,8 +2268,12 @@ static int gen6_ring_flush(struct intel_engine_cs *ring,
>         }
>         intel_ring_advance(ring);
>
> -       if (IS_GEN7(dev) && !invalidate && flush)
> -               return gen7_ring_fbc_flush(ring, FBC_REND_CACHE_CLEAN);
> +       if (!invalidate && flush) {
> +               if (IS_GEN7(dev))
> +                       return gen7_ring_fbc_flush(ring, FBC_REND_CACHE_CLEAN);
> +               else if (IS_BROADWELL(dev))
> +                       dev_priv->fbc.need_sw_cache_clean = true;

OTOH, since we'll only ever set need_sw_cache_clean on BDW because of
the check above, this version cannot be considered wrong, just
different from the previous one.

If I were to write this patch, I would unconditionally set
need_sw_cache_clean on all platforms, and then change the check inside
bdw_fbc_sw_flush() to return if the platform is not BDW. But your
patch is correct, so:

Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
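
Just to spell out that alternative with an untested sketch (not what the
patch does, only what I would have tried):

	/* untested sketch: set the flag in gen6_ring_flush() without
	 * any platform check... */
	if (!invalidate && flush) {
		if (IS_GEN7(dev))
			return gen7_ring_fbc_flush(ring, FBC_REND_CACHE_CLEAN);
		dev_priv->fbc.need_sw_cache_clean = true;
	}

	/* ...and let bdw_fbc_sw_flush() filter by platform: */
	void bdw_fbc_sw_flush(struct drm_device *dev, u32 value)
	{
		struct drm_i915_private *dev_priv = dev->dev_private;

		if (!IS_BROADWELL(dev))
			return;

		if (!intel_fbc_enabled(dev))
			return;

		I915_WRITE(MSG_FBC_REND_STATE, value);
	}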

> +       }
>
>         return 0;
>  }
> --
> 1.9.3
>
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx



-- 
Paulo Zanoni

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] drm/i915: Minimize the huge amount of unecessary fbc sw cache clean.
  2014-09-24 23:50             ` Rodrigo Vivi
  2014-09-25 17:48               ` Paulo Zanoni
@ 2014-09-29 12:17               ` Daniel Vetter
  1 sibling, 0 replies; 13+ messages in thread
From: Daniel Vetter @ 2014-09-29 12:17 UTC (permalink / raw)
  To: Rodrigo Vivi; +Cc: Daniel Vetter, intel-gfx, Paulo Zanoni

On Wed, Sep 24, 2014 at 07:50:59PM -0400, Rodrigo Vivi wrote:
> The sw cache clean on BDW is a tempoorary workaround because we cannot
> set cache clean on blt ring with risk of hungs. So we are doing the cache clean on sw.
> However we are doing much more than needed. Not only when using blt ring.
> So, with this extra w/a we minimize the ammount of cache cleans and call it only
> on same cases that it was being called on gen7.
> 
> The traditional FBC Cache clean happens over LRI on BLT ring when there is a
> frontbuffer touch happening. frontbuffer tracking set fbc_dirty variable
> to let BLT flush that it must clean FBC cache.
> 
> fbc.need_sw_cache_clean works in the opposite information direction
> of ring->fbc_dirty telling software on frontbuffer tracking to perform
> the cache clean on sw side.
> 
> v2: Clean it a little bit and fully check for Broadwell instead of gen8.
> 
> v3: Rebase after frontbuffer organization.
> 
> v4: Wiggle confused me. So fixing v3!
> 
> Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
> Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
> Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

Queued for -next, thanks for the patch.
-Daniel

> ---
>  drivers/gpu/drm/i915/i915_drv.h          | 10 +++++++++-
>  drivers/gpu/drm/i915/intel_frontbuffer.c |  6 ++++--
>  drivers/gpu/drm/i915/intel_pm.c          |  2 +-
>  drivers/gpu/drm/i915/intel_ringbuffer.c  |  9 +++++++--
>  4 files changed, 21 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index da607ab..4cd2aa3 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -676,6 +676,14 @@ struct i915_fbc {
>  	 * possible. */
>  	bool enabled;
>  
> +	/* On gen8 some rings cannont perform fbc clean operation so for now
> +	 * we are doing this on SW with mmio.
> +	 * This variable works in the opposite information direction
> +	 * of ring->fbc_dirty telling software on frontbuffer tracking
> +	 * to perform the cache clean on sw side.
> +	 */
> +	bool need_sw_cache_clean;
> +
>  	struct intel_fbc_work {
>  		struct delayed_work work;
>  		struct drm_crtc *crtc;
> @@ -2843,7 +2851,7 @@ extern void intel_modeset_setup_hw_state(struct drm_device *dev,
>  extern void i915_redisable_vga(struct drm_device *dev);
>  extern void i915_redisable_vga_power_on(struct drm_device *dev);
>  extern bool intel_fbc_enabled(struct drm_device *dev);
> -extern void gen8_fbc_sw_flush(struct drm_device *dev, u32 value);
> +extern void bdw_fbc_sw_flush(struct drm_device *dev, u32 value);
>  extern void intel_disable_fbc(struct drm_device *dev);
>  extern bool ironlake_set_drps(struct drm_device *dev, u8 val);
>  extern void intel_init_pch_refclk(struct drm_device *dev);
> diff --git a/drivers/gpu/drm/i915/intel_frontbuffer.c b/drivers/gpu/drm/i915/intel_frontbuffer.c
> index f74744c..7eb74a6 100644
> --- a/drivers/gpu/drm/i915/intel_frontbuffer.c
> +++ b/drivers/gpu/drm/i915/intel_frontbuffer.c
> @@ -189,8 +189,10 @@ void intel_frontbuffer_flush(struct drm_device *dev,
>  	 * needs to be reworked into a proper frontbuffer tracking scheme like
>  	 * psr employs.
>  	 */
> -	if (IS_BROADWELL(dev))
> -		gen8_fbc_sw_flush(dev, FBC_REND_CACHE_CLEAN);
> +	if (dev_priv->fbc.need_sw_cache_clean) {
> +		dev_priv->fbc.need_sw_cache_clean = false;
> +		bdw_fbc_sw_flush(dev, FBC_REND_CACHE_CLEAN);
> +	}
>  }
>  
>  /**
> diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
> index aaab056..36842c4 100644
> --- a/drivers/gpu/drm/i915/intel_pm.c
> +++ b/drivers/gpu/drm/i915/intel_pm.c
> @@ -380,7 +380,7 @@ bool intel_fbc_enabled(struct drm_device *dev)
>  	return dev_priv->fbc.enabled;
>  }
>  
> -void gen8_fbc_sw_flush(struct drm_device *dev, u32 value)
> +void bdw_fbc_sw_flush(struct drm_device *dev, u32 value)
>  {
>  	struct drm_i915_private *dev_priv = dev->dev_private;
>  
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
> index 896f564..b56e031 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> @@ -2237,6 +2237,7 @@ static int gen6_ring_flush(struct intel_engine_cs *ring,
>  			   u32 invalidate, u32 flush)
>  {
>  	struct drm_device *dev = ring->dev;
> +	struct drm_i915_private *dev_priv = dev->dev_private;
>  	uint32_t cmd;
>  	int ret;
>  
> @@ -2267,8 +2268,12 @@ static int gen6_ring_flush(struct intel_engine_cs *ring,
>  	}
>  	intel_ring_advance(ring);
>  
> -	if (IS_GEN7(dev) && !invalidate && flush)
> -		return gen7_ring_fbc_flush(ring, FBC_REND_CACHE_CLEAN);
> +	if (!invalidate && flush) {
> +		if (IS_GEN7(dev))
> +			return gen7_ring_fbc_flush(ring, FBC_REND_CACHE_CLEAN);
> +		else if (IS_BROADWELL(dev))
> +			dev_priv->fbc.need_sw_cache_clean = true;
> +	}
>  
>  	return 0;
>  }
> -- 
> 1.9.3
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2014-09-29 12:17 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-09-05 20:57 [PATCH 1/2] drm/i915: Only flush fbc on sw when fbc is enabled Rodrigo Vivi
2014-09-05 20:57 ` [PATCH 2/2] drm/i915: Minimize the huge amount of unecessary fbc sw cache clean Rodrigo Vivi
2014-09-05 22:14   ` Rodrigo Vivi
2014-09-17 20:36     ` Rodrigo Vivi
2014-09-17 20:49   ` Paulo Zanoni
2014-09-17 21:16     ` Rodrigo Vivi
2014-09-18 15:29       ` [PATCH] " Rodrigo Vivi
2014-09-19 16:32         ` Paulo Zanoni
2014-09-24 23:11           ` Rodrigo Vivi
2014-09-24 23:50             ` Rodrigo Vivi
2014-09-25 17:48               ` Paulo Zanoni
2014-09-29 12:17               ` Daniel Vetter
2014-09-17 19:55 ` [PATCH 1/2] drm/i915: Only flush fbc on sw when fbc is enabled Paulo Zanoni
