All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] drm/i915: Disable caches for Global GTT.
@ 2014-10-30 16:18 Rodrigo Vivi
  2014-10-31  7:44 ` Chris Wilson
  2014-11-05 11:11 ` Daniel Vetter
  0 siblings, 2 replies; 16+ messages in thread
From: Rodrigo Vivi @ 2014-10-30 16:18 UTC (permalink / raw)
  To: intel-gfx; +Cc: Rodrigo Vivi

Global GTT doesn't have pat_sel[2:0] so it always point to pat_sel = 000;
So the only way to avoid screen corruptions is setting PAT 0 to Uncached.

MOCS can still be used though. But if userspace is trusting PTE for
cache selection the safest thing to do is to let caches disabled.

BSpec: "For GGTT, there is NO pat_sel[2:0] from the entry,
so RTL will always use the value corresponding to pat_sel = 000"

Reference: https://bugs.freedesktop.org/show_bug.cgi?id=85576
Cc: James Ausmus <james.ausmus@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
---
 drivers/gpu/drm/i915/i915_gem_gtt.c | 29 +++++++++++++++++++++--------
 1 file changed, 21 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index cb7adab..24b4f27 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -1911,14 +1911,27 @@ static void bdw_setup_private_ppat(struct drm_i915_private *dev_priv)
 {
 	uint64_t pat;
 
-	pat = GEN8_PPAT(0, GEN8_PPAT_WB | GEN8_PPAT_LLC)     | /* for normal objects, no eLLC */
-	      GEN8_PPAT(1, GEN8_PPAT_WC | GEN8_PPAT_LLCELLC) | /* for something pointing to ptes? */
-	      GEN8_PPAT(2, GEN8_PPAT_WT | GEN8_PPAT_LLCELLC) | /* for scanout with eLLC */
-	      GEN8_PPAT(3, GEN8_PPAT_UC)                     | /* Uncached objects, mostly for scanout */
-	      GEN8_PPAT(4, GEN8_PPAT_WB | GEN8_PPAT_LLCELLC | GEN8_PPAT_AGE(0)) |
-	      GEN8_PPAT(5, GEN8_PPAT_WB | GEN8_PPAT_LLCELLC | GEN8_PPAT_AGE(1)) |
-	      GEN8_PPAT(6, GEN8_PPAT_WB | GEN8_PPAT_LLCELLC | GEN8_PPAT_AGE(2)) |
-	      GEN8_PPAT(7, GEN8_PPAT_WB | GEN8_PPAT_LLCELLC | GEN8_PPAT_AGE(3));
+	if (!USES_PPGTT(dev_priv->dev))
+		/* Spec: "For GGTT, there is NO pat_sel[2:0] from the entry,
+		 * so RTL will always use the value corresponding to
+		 * pat_sel = 000".
+		 * So let's disable cache for GGTT to avoid screen corruptions.
+		 * MOCS still can be used though.
+		 */
+		pat = GEN8_PPAT(0, GEN8_PPAT_UC);
+	else
+		pat = GEN8_PPAT(0, GEN8_PPAT_WB | GEN8_PPAT_LLC)     | /* for normal objects, no eLLC */
+		      GEN8_PPAT(1, GEN8_PPAT_WC | GEN8_PPAT_LLCELLC) | /* for something pointing to ptes? */
+		      GEN8_PPAT(2, GEN8_PPAT_WT | GEN8_PPAT_LLCELLC) | /* for scanout with eLLC */
+		      GEN8_PPAT(3, GEN8_PPAT_UC) | /* Uncached objects, mostly for scanout */
+		      GEN8_PPAT(4, GEN8_PPAT_WB |
+				GEN8_PPAT_LLCELLC | GEN8_PPAT_AGE(0)) |
+		      GEN8_PPAT(5, GEN8_PPAT_WB |
+				GEN8_PPAT_LLCELLC | GEN8_PPAT_AGE(1)) |
+		      GEN8_PPAT(6, GEN8_PPAT_WB |
+				GEN8_PPAT_LLCELLC | GEN8_PPAT_AGE(2)) |
+		      GEN8_PPAT(7, GEN8_PPAT_WB |
+				GEN8_PPAT_LLCELLC | GEN8_PPAT_AGE(3));
 
 	/* XXX: spec defines this as 2 distinct registers. It's unclear if a 64b
 	 * write would work. */
-- 
1.9.3

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH] drm/i915: Disable caches for Global GTT.
  2014-10-30 16:18 [PATCH] drm/i915: Disable caches for Global GTT Rodrigo Vivi
@ 2014-10-31  7:44 ` Chris Wilson
  2014-10-31 10:14   ` Ville Syrjälä
  2014-11-05 11:11 ` Daniel Vetter
  1 sibling, 1 reply; 16+ messages in thread
From: Chris Wilson @ 2014-10-31  7:44 UTC (permalink / raw)
  To: Rodrigo Vivi; +Cc: intel-gfx

On Thu, Oct 30, 2014 at 09:18:18AM -0700, Rodrigo Vivi wrote:
> Global GTT doesn't have pat_sel[2:0] so it always point to pat_sel = 000;
> So the only way to avoid screen corruptions is setting PAT 0 to Uncached.
> 
> MOCS can still be used though. But if userspace is trusting PTE for
> cache selection the safest thing to do is to let caches disabled.
> 
> BSpec: "For GGTT, there is NO pat_sel[2:0] from the entry,
> so RTL will always use the value corresponding to pat_sel = 000"
> 
> Reference: https://bugs.freedesktop.org/show_bug.cgi?id=85576
> Cc: James Ausmus <james.ausmus@intel.com>
> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
> ---
>  drivers/gpu/drm/i915/i915_gem_gtt.c | 29 +++++++++++++++++++++--------
>  1 file changed, 21 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
> index cb7adab..24b4f27 100644
> --- a/drivers/gpu/drm/i915/i915_gem_gtt.c
> +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
> @@ -1911,14 +1911,27 @@ static void bdw_setup_private_ppat(struct drm_i915_private *dev_priv)
>  {
>  	uint64_t pat;
>  
> -	pat = GEN8_PPAT(0, GEN8_PPAT_WB | GEN8_PPAT_LLC)     | /* for normal objects, no eLLC */
> -	      GEN8_PPAT(1, GEN8_PPAT_WC | GEN8_PPAT_LLCELLC) | /* for something pointing to ptes? */
> -	      GEN8_PPAT(2, GEN8_PPAT_WT | GEN8_PPAT_LLCELLC) | /* for scanout with eLLC */
> -	      GEN8_PPAT(3, GEN8_PPAT_UC)                     | /* Uncached objects, mostly for scanout */
> -	      GEN8_PPAT(4, GEN8_PPAT_WB | GEN8_PPAT_LLCELLC | GEN8_PPAT_AGE(0)) |
> -	      GEN8_PPAT(5, GEN8_PPAT_WB | GEN8_PPAT_LLCELLC | GEN8_PPAT_AGE(1)) |
> -	      GEN8_PPAT(6, GEN8_PPAT_WB | GEN8_PPAT_LLCELLC | GEN8_PPAT_AGE(2)) |
> -	      GEN8_PPAT(7, GEN8_PPAT_WB | GEN8_PPAT_LLCELLC | GEN8_PPAT_AGE(3));

Instead of deleting this and making a rather ugly if,

Just
> +	if (!USES_PPGTT(dev_priv))
> +		/* Spec: "For GGTT, there is NO pat_sel[2:0] from the entry,
> +		 * so RTL will always use the value corresponding to
> +		 * pat_sel = 000".
> +		 * So let's disable cache for GGTT to avoid screen corruptions.
> +		 * MOCS still can be used though.
> +		 */
> +		pat = GEN8_PPAT(0, GEN8_PPAT_UC);

to override in the unlikely case of disabling ppgtt.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] drm/i915: Disable caches for Global GTT.
  2014-10-31  7:44 ` Chris Wilson
@ 2014-10-31 10:14   ` Ville Syrjälä
  2014-10-31 10:20     ` Chris Wilson
  0 siblings, 1 reply; 16+ messages in thread
From: Ville Syrjälä @ 2014-10-31 10:14 UTC (permalink / raw)
  To: Chris Wilson, Rodrigo Vivi, intel-gfx

On Fri, Oct 31, 2014 at 07:44:34AM +0000, Chris Wilson wrote:
> On Thu, Oct 30, 2014 at 09:18:18AM -0700, Rodrigo Vivi wrote:
> > Global GTT doesn't have pat_sel[2:0] so it always point to pat_sel = 000;
> > So the only way to avoid screen corruptions is setting PAT 0 to Uncached.
> > 
> > MOCS can still be used though. But if userspace is trusting PTE for
> > cache selection the safest thing to do is to let caches disabled.
> > 
> > BSpec: "For GGTT, there is NO pat_sel[2:0] from the entry,
> > so RTL will always use the value corresponding to pat_sel = 000"
> > 
> > Reference: https://bugs.freedesktop.org/show_bug.cgi?id=85576
> > Cc: James Ausmus <james.ausmus@intel.com>
> > Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
> > ---
> >  drivers/gpu/drm/i915/i915_gem_gtt.c | 29 +++++++++++++++++++++--------
> >  1 file changed, 21 insertions(+), 8 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
> > index cb7adab..24b4f27 100644
> > --- a/drivers/gpu/drm/i915/i915_gem_gtt.c
> > +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
> > @@ -1911,14 +1911,27 @@ static void bdw_setup_private_ppat(struct drm_i915_private *dev_priv)
> >  {
> >  	uint64_t pat;
> >  
> > -	pat = GEN8_PPAT(0, GEN8_PPAT_WB | GEN8_PPAT_LLC)     | /* for normal objects, no eLLC */
> > -	      GEN8_PPAT(1, GEN8_PPAT_WC | GEN8_PPAT_LLCELLC) | /* for something pointing to ptes? */
> > -	      GEN8_PPAT(2, GEN8_PPAT_WT | GEN8_PPAT_LLCELLC) | /* for scanout with eLLC */
> > -	      GEN8_PPAT(3, GEN8_PPAT_UC)                     | /* Uncached objects, mostly for scanout */
> > -	      GEN8_PPAT(4, GEN8_PPAT_WB | GEN8_PPAT_LLCELLC | GEN8_PPAT_AGE(0)) |
> > -	      GEN8_PPAT(5, GEN8_PPAT_WB | GEN8_PPAT_LLCELLC | GEN8_PPAT_AGE(1)) |
> > -	      GEN8_PPAT(6, GEN8_PPAT_WB | GEN8_PPAT_LLCELLC | GEN8_PPAT_AGE(2)) |
> > -	      GEN8_PPAT(7, GEN8_PPAT_WB | GEN8_PPAT_LLCELLC | GEN8_PPAT_AGE(3));
> 
> Instead of deleting this and making a rather ugly if,
> 
> Just
> > +	if (!USES_PPGTT(dev_priv))
> > +		/* Spec: "For GGTT, there is NO pat_sel[2:0] from the entry,
> > +		 * so RTL will always use the value corresponding to
> > +		 * pat_sel = 000".
> > +		 * So let's disable cache for GGTT to avoid screen corruptions.
> > +		 * MOCS still can be used though.
> > +		 */
> > +		pat = GEN8_PPAT(0, GEN8_PPAT_UC);
> 
> to override in the unlikely case of disabling ppgtt.

Well GGTT is broken whether or not PPGTT is enabled, so I belive we should
just swap two of the entires around WB<->UC, and then update the index
defines to match, and CHV needs the same change as well.

-- 
Ville Syrjälä
Intel OTC
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] drm/i915: Disable caches for Global GTT.
  2014-10-31 10:14   ` Ville Syrjälä
@ 2014-10-31 10:20     ` Chris Wilson
  2014-10-31 12:27       ` Rodrigo Vivi
  0 siblings, 1 reply; 16+ messages in thread
From: Chris Wilson @ 2014-10-31 10:20 UTC (permalink / raw)
  To: Ville Syrjälä; +Cc: intel-gfx, Rodrigo Vivi

On Fri, Oct 31, 2014 at 12:14:25PM +0200, Ville Syrjälä wrote:
> On Fri, Oct 31, 2014 at 07:44:34AM +0000, Chris Wilson wrote:
> > On Thu, Oct 30, 2014 at 09:18:18AM -0700, Rodrigo Vivi wrote:
> > > Global GTT doesn't have pat_sel[2:0] so it always point to pat_sel = 000;
> > > So the only way to avoid screen corruptions is setting PAT 0 to Uncached.
> > > 
> > > MOCS can still be used though. But if userspace is trusting PTE for
> > > cache selection the safest thing to do is to let caches disabled.
> > > 
> > > BSpec: "For GGTT, there is NO pat_sel[2:0] from the entry,
> > > so RTL will always use the value corresponding to pat_sel = 000"
> > > 
> > > Reference: https://bugs.freedesktop.org/show_bug.cgi?id=85576
> > > Cc: James Ausmus <james.ausmus@intel.com>
> > > Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
> > > ---
> > >  drivers/gpu/drm/i915/i915_gem_gtt.c | 29 +++++++++++++++++++++--------
> > >  1 file changed, 21 insertions(+), 8 deletions(-)
> > > 
> > > diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
> > > index cb7adab..24b4f27 100644
> > > --- a/drivers/gpu/drm/i915/i915_gem_gtt.c
> > > +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
> > > @@ -1911,14 +1911,27 @@ static void bdw_setup_private_ppat(struct drm_i915_private *dev_priv)
> > >  {
> > >  	uint64_t pat;
> > >  
> > > -	pat = GEN8_PPAT(0, GEN8_PPAT_WB | GEN8_PPAT_LLC)     | /* for normal objects, no eLLC */
> > > -	      GEN8_PPAT(1, GEN8_PPAT_WC | GEN8_PPAT_LLCELLC) | /* for something pointing to ptes? */
> > > -	      GEN8_PPAT(2, GEN8_PPAT_WT | GEN8_PPAT_LLCELLC) | /* for scanout with eLLC */
> > > -	      GEN8_PPAT(3, GEN8_PPAT_UC)                     | /* Uncached objects, mostly for scanout */
> > > -	      GEN8_PPAT(4, GEN8_PPAT_WB | GEN8_PPAT_LLCELLC | GEN8_PPAT_AGE(0)) |
> > > -	      GEN8_PPAT(5, GEN8_PPAT_WB | GEN8_PPAT_LLCELLC | GEN8_PPAT_AGE(1)) |
> > > -	      GEN8_PPAT(6, GEN8_PPAT_WB | GEN8_PPAT_LLCELLC | GEN8_PPAT_AGE(2)) |
> > > -	      GEN8_PPAT(7, GEN8_PPAT_WB | GEN8_PPAT_LLCELLC | GEN8_PPAT_AGE(3));
> > 
> > Instead of deleting this and making a rather ugly if,
> > 
> > Just
> > > +	if (!USES_PPGTT(dev_priv))
> > > +		/* Spec: "For GGTT, there is NO pat_sel[2:0] from the entry,
> > > +		 * so RTL will always use the value corresponding to
> > > +		 * pat_sel = 000".
> > > +		 * So let's disable cache for GGTT to avoid screen corruptions.
> > > +		 * MOCS still can be used though.
> > > +		 */
> > > +		pat = GEN8_PPAT(0, GEN8_PPAT_UC);
> > 
> > to override in the unlikely case of disabling ppgtt.
> 
> Well GGTT is broken whether or not PPGTT is enabled, so I belive we should
> just swap two of the entires around WB<->UC, and then update the index
> defines to match, and CHV needs the same change as well.

I had pondered that. My conclusion was that the only access that
actively uses the PTE for cache bits are the render/blitter engines. We
impose caching on CPU access by alternative methods. So all we need to
force the GPU to disable write caching (WB/WT) when accessing the pages.
I had forgotten that we defined the PAT tables, and wondered about a
chicken bit that usually exists to force cache modes.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH] drm/i915: Disable caches for Global GTT.
  2014-10-31 10:20     ` Chris Wilson
@ 2014-10-31 12:27       ` Rodrigo Vivi
  2014-10-31 21:38         ` Ausmus, James
  2014-11-04 18:29         ` Ausmus, James
  0 siblings, 2 replies; 16+ messages in thread
From: Rodrigo Vivi @ 2014-10-31 12:27 UTC (permalink / raw)
  To: intel-gfx; +Cc: Rodrigo Vivi

Global GTT doesn't have pat_sel[2:0] so it always point to pat_sel = 000;
So the only way to avoid screen corruptions is setting PAT 0 to Uncached.

MOCS can still be used though. But if userspace is trusting PTE for
cache selection the safest thing to do is to let caches disabled.

BSpec: "For GGTT, there is NO pat_sel[2:0] from the entry,
so RTL will always use the value corresponding to pat_sel = 000"

v2: Cleaner patch as suggested by Chris.

Reference: https://bugs.freedesktop.org/show_bug.cgi?id=85576
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: James Ausmus <james.ausmus@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
---
 drivers/gpu/drm/i915/i915_gem_gtt.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index cb7adab..ae568a2 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -1920,6 +1920,15 @@ static void bdw_setup_private_ppat(struct drm_i915_private *dev_priv)
 	      GEN8_PPAT(6, GEN8_PPAT_WB | GEN8_PPAT_LLCELLC | GEN8_PPAT_AGE(2)) |
 	      GEN8_PPAT(7, GEN8_PPAT_WB | GEN8_PPAT_LLCELLC | GEN8_PPAT_AGE(3));
 
+	if (!USES_PPGTT(dev_priv->dev))
+		/* Spec: "For GGTT, there is NO pat_sel[2:0] from the entry,
+		 * so RTL will always use the value corresponding to
+		 * pat_sel = 000".
+		 * So let's disable cache for GGTT to avoid screen corruptions.
+		 * MOCS still can be used though.
+		 */
+		pat = GEN8_PPAT(0, GEN8_PPAT_UC);
+
 	/* XXX: spec defines this as 2 distinct registers. It's unclear if a 64b
 	 * write would work. */
 	I915_WRITE(GEN8_PRIVATE_PAT, pat);
-- 
1.9.3

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH] drm/i915: Disable caches for Global GTT.
  2014-10-31 12:27       ` Rodrigo Vivi
@ 2014-10-31 21:38         ` Ausmus, James
  2014-11-04 18:29         ` Ausmus, James
  1 sibling, 0 replies; 16+ messages in thread
From: Ausmus, James @ 2014-10-31 21:38 UTC (permalink / raw)
  To: Rodrigo Vivi; +Cc: Intel GFX


[-- Attachment #1.1: Type: text/plain, Size: 2066 bytes --]

On Fri, Oct 31, 2014 at 5:27 AM, Rodrigo Vivi <rodrigo.vivi@intel.com>
wrote:
>
> Global GTT doesn't have pat_sel[2:0] so it always point to pat_sel = 000;
> So the only way to avoid screen corruptions is setting PAT 0 to Uncached.
>
> MOCS can still be used though. But if userspace is trusting PTE for
> cache selection the safest thing to do is to let caches disabled.
>
> BSpec: "For GGTT, there is NO pat_sel[2:0] from the entry,
> so RTL will always use the value corresponding to pat_sel = 000"
>
> v2: Cleaner patch as suggested by Chris.
>
> Reference: https://bugs.freedesktop.org/show_bug.cgi?id=85576
> Cc: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: James Ausmus <james.ausmus@intel.com>
> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
> ---
>  drivers/gpu/drm/i915/i915_gem_gtt.c | 9 +++++++++
>  1 file changed, 9 insertions(+)
>
> diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c
b/drivers/gpu/drm/i915/i915_gem_gtt.c
> index cb7adab..ae568a2 100644
> --- a/drivers/gpu/drm/i915/i915_gem_gtt.c
> +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
> @@ -1920,6 +1920,15 @@ static void bdw_setup_private_ppat(struct
drm_i915_private *dev_priv)
>               GEN8_PPAT(6, GEN8_PPAT_WB | GEN8_PPAT_LLCELLC |
GEN8_PPAT_AGE(2)) |
>               GEN8_PPAT(7, GEN8_PPAT_WB | GEN8_PPAT_LLCELLC |
GEN8_PPAT_AGE(3));
>
> +       if (!USES_PPGTT(dev_priv->dev))
> +               /* Spec: "For GGTT, there is NO pat_sel[2:0] from the
entry,
> +                * so RTL will always use the value corresponding to
> +                * pat_sel = 000".
> +                * So let's disable cache for GGTT to avoid screen
corruptions.
> +                * MOCS still can be used though.
> +                */
> +               pat = GEN8_PPAT(0, GEN8_PPAT_UC);
> +
>         /* XXX: spec defines this as 2 distinct registers. It's unclear
if a 64b
>          * write would work. */
>         I915_WRITE(GEN8_PRIVATE_PAT, pat);
> --
> 1.9.3
>

Tested-by: James Ausmus <james.ausmus@intel.com>


--


James Ausmus
Sr. Software Engineer
SSG-OTC ChromeOS Integration

[-- Attachment #1.2: Type: text/html, Size: 3100 bytes --]

[-- Attachment #2: Type: text/plain, Size: 159 bytes --]

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] drm/i915: Disable caches for Global GTT.
  2014-10-31 12:27       ` Rodrigo Vivi
  2014-10-31 21:38         ` Ausmus, James
@ 2014-11-04 18:29         ` Ausmus, James
  1 sibling, 0 replies; 16+ messages in thread
From: Ausmus, James @ 2014-11-04 18:29 UTC (permalink / raw)
  To: Rodrigo Vivi; +Cc: Intel GFX


[-- Attachment #1.1: Type: text/plain, Size: 2098 bytes --]

On Fri, Oct 31, 2014 at 5:27 AM, Rodrigo Vivi <rodrigo.vivi@intel.com>
wrote:
>
> Global GTT doesn't have pat_sel[2:0] so it always point to pat_sel = 000;
> So the only way to avoid screen corruptions is setting PAT 0 to Uncached.
>
> MOCS can still be used though. But if userspace is trusting PTE for
> cache selection the safest thing to do is to let caches disabled.
>
> BSpec: "For GGTT, there is NO pat_sel[2:0] from the entry,
> so RTL will always use the value corresponding to pat_sel = 000"
>
> v2: Cleaner patch as suggested by Chris.
>
> Reference: https://bugs.freedesktop.org/show_bug.cgi?id=85576
> Cc: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: James Ausmus <james.ausmus@intel.com>
> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

Matches my read of BSpec.

Reviewed-by: James Ausmus <james.ausmus@intel.com>


> ---
>  drivers/gpu/drm/i915/i915_gem_gtt.c | 9 +++++++++
>  1 file changed, 9 insertions(+)
>
> diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c
b/drivers/gpu/drm/i915/i915_gem_gtt.c
> index cb7adab..ae568a2 100644
> --- a/drivers/gpu/drm/i915/i915_gem_gtt.c
> +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
> @@ -1920,6 +1920,15 @@ static void bdw_setup_private_ppat(struct
drm_i915_private *dev_priv)
>               GEN8_PPAT(6, GEN8_PPAT_WB | GEN8_PPAT_LLCELLC |
GEN8_PPAT_AGE(2)) |
>               GEN8_PPAT(7, GEN8_PPAT_WB | GEN8_PPAT_LLCELLC |
GEN8_PPAT_AGE(3));
>
> +       if (!USES_PPGTT(dev_priv->dev))
> +               /* Spec: "For GGTT, there is NO pat_sel[2:0] from the
entry,
> +                * so RTL will always use the value corresponding to
> +                * pat_sel = 000".
> +                * So let's disable cache for GGTT to avoid screen
corruptions.
> +                * MOCS still can be used though.
> +                */
> +               pat = GEN8_PPAT(0, GEN8_PPAT_UC);
> +
>         /* XXX: spec defines this as 2 distinct registers. It's unclear
if a 64b
>          * write would work. */
>         I915_WRITE(GEN8_PRIVATE_PAT, pat);
> --
> 1.9.3
>



--


James Ausmus
Sr. Software Engineer
SSG-OTC ChromeOS Integration

[-- Attachment #1.2: Type: text/html, Size: 3003 bytes --]

[-- Attachment #2: Type: text/plain, Size: 159 bytes --]

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] drm/i915: Disable caches for Global GTT.
  2014-10-30 16:18 [PATCH] drm/i915: Disable caches for Global GTT Rodrigo Vivi
  2014-10-31  7:44 ` Chris Wilson
@ 2014-11-05 11:11 ` Daniel Vetter
  2014-11-06  0:56   ` Rodrigo Vivi
  2014-11-06 10:55   ` Chris Wilson
  1 sibling, 2 replies; 16+ messages in thread
From: Daniel Vetter @ 2014-11-05 11:11 UTC (permalink / raw)
  To: Rodrigo Vivi; +Cc: intel-gfx

On Thu, Oct 30, 2014 at 5:18 PM, Rodrigo Vivi <rodrigo.vivi@intel.com> wrote:
> Global GTT doesn't have pat_sel[2:0] so it always point to pat_sel = 000;
> So the only way to avoid screen corruptions is setting PAT 0 to Uncached.
>
> MOCS can still be used though. But if userspace is trusting PTE for
> cache selection the safest thing to do is to let caches disabled.
>
> BSpec: "For GGTT, there is NO pat_sel[2:0] from the entry,
> so RTL will always use the value corresponding to pat_sel = 000"
>
> Reference: https://bugs.freedesktop.org/show_bug.cgi?id=85576
> Cc: James Ausmus <james.ausmus@intel.com>
> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

Ok, I think this is -fixes + cc: stable material, but we need to be a
bit more elaborate on the commit message:

- System agent ggtt writes (i.e. cpu gtt mmaps) already work before
this patch, i.e. the same uncached + snooping access like on gen6/7
seems to be in effect.
- So this just fixes blitter/render access. Again it looks like it's
not just uncached access, but uncached + snooping. So we can still
hold onto all our assumptions wrt cpu clflushing on LLC machines.

I think this should be both in the commit message and code.

Chris, please correct if I've summarized this wrongly.

Thanks, Daniel

-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH] drm/i915: Disable caches for Global GTT.
  2014-11-05 11:11 ` Daniel Vetter
@ 2014-11-06  0:56   ` Rodrigo Vivi
  2014-11-06 16:40     ` Jani Nikula
  2014-11-06 10:55   ` Chris Wilson
  1 sibling, 1 reply; 16+ messages in thread
From: Rodrigo Vivi @ 2014-11-06  0:56 UTC (permalink / raw)
  To: intel-gfx; +Cc: Jani Nikula, Daniel Vetter, Stable, Rodrigo Vivi

Global GTT doesn't have pat_sel[2:0] so it always point to pat_sel = 000;
So the only way to avoid screen corruptions is setting PAT 0 to Uncached.

MOCS can still be used though. But if userspace is trusting PTE for
cache selection the safest thing to do is to let caches disabled.

BSpec: "For GGTT, there is NO pat_sel[2:0] from the entry,
so RTL will always use the value corresponding to pat_sel = 000"

- System agent ggtt writes (i.e. cpu gtt mmaps) already work before
this patch, i.e. the same uncached + snooping access like on gen6/7
seems to be in effect.
- So this just fixes blitter/render access. Again it looks like it's
not just uncached access, but uncached + snooping. So we can still
hold onto all our assumptions wrt cpu clflushing on LLC machines.

v2: Cleaner patch as suggested by Chris.
v3: Add Daniel's comment

Reference: https://bugs.freedesktop.org/show_bug.cgi?id=85576
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: James Ausmus <james.ausmus@intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Jani Nikula <jani.nikula@intel.com>
Cc: Stable@vger.kernel.org
Tested-by: James Ausmus <james.ausmus@intel.com>
Reviewed-by: James Ausmus <james.ausmus@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
---
 drivers/gpu/drm/i915/i915_gem_gtt.c | 16 ++++++++++++++++
 1 file changed, 16 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index cb7adab..6d3fb3c 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -1920,6 +1920,22 @@ static void bdw_setup_private_ppat(struct drm_i915_private *dev_priv)
 	      GEN8_PPAT(6, GEN8_PPAT_WB | GEN8_PPAT_LLCELLC | GEN8_PPAT_AGE(2)) |
 	      GEN8_PPAT(7, GEN8_PPAT_WB | GEN8_PPAT_LLCELLC | GEN8_PPAT_AGE(3));
 
+	if (!USES_PPGTT(dev_priv->dev))
+		/* Spec: "For GGTT, there is NO pat_sel[2:0] from the entry,
+		 * so RTL will always use the value corresponding to
+		 * pat_sel = 000".
+		 * So let's disable cache for GGTT to avoid screen corruptions.
+		 * MOCS still can be used though.
+		 * - System agent ggtt writes (i.e. cpu gtt mmaps) already work
+		 * before this patch, i.e. the same uncached + snooping access
+		 * like on gen6/7 seems to be in effect.
+		 * - So this just fixes blitter/render access. Again it looks
+		 * like it's not just uncached access, but uncached + snooping.
+		 * So we can still hold onto all our assumptions wrt cpu
+		 * clflushing on LLC machines.
+		 */
+		pat = GEN8_PPAT(0, GEN8_PPAT_UC);
+
 	/* XXX: spec defines this as 2 distinct registers. It's unclear if a 64b
 	 * write would work. */
 	I915_WRITE(GEN8_PRIVATE_PAT, pat);
-- 
1.9.3

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH] drm/i915: Disable caches for Global GTT.
  2014-11-05 11:11 ` Daniel Vetter
  2014-11-06  0:56   ` Rodrigo Vivi
@ 2014-11-06 10:55   ` Chris Wilson
  2014-11-06 19:09     ` Ville Syrjälä
  1 sibling, 1 reply; 16+ messages in thread
From: Chris Wilson @ 2014-11-06 10:55 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: intel-gfx, Rodrigo Vivi

On Wed, Nov 05, 2014 at 12:11:41PM +0100, Daniel Vetter wrote:
> On Thu, Oct 30, 2014 at 5:18 PM, Rodrigo Vivi <rodrigo.vivi@intel.com> wrote:
> > Global GTT doesn't have pat_sel[2:0] so it always point to pat_sel = 000;
> > So the only way to avoid screen corruptions is setting PAT 0 to Uncached.
> >
> > MOCS can still be used though. But if userspace is trusting PTE for
> > cache selection the safest thing to do is to let caches disabled.
> >
> > BSpec: "For GGTT, there is NO pat_sel[2:0] from the entry,
> > so RTL will always use the value corresponding to pat_sel = 000"
> >
> > Reference: https://bugs.freedesktop.org/show_bug.cgi?id=85576
> > Cc: James Ausmus <james.ausmus@intel.com>
> > Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
> 
> Ok, I think this is -fixes + cc: stable material, but we need to be a
> bit more elaborate on the commit message:
> 
> - System agent ggtt writes (i.e. cpu gtt mmaps) already work before
> this patch, i.e. the same uncached + snooping access like on gen6/7
> seems to be in effect.
> - So this just fixes blitter/render access. Again it looks like it's
> not just uncached access, but uncached + snooping. So we can still
> hold onto all our assumptions wrt cpu clflushing on LLC machines.
> 
> I think this should be both in the commit message and code.
> 
> Chris, please correct if I've summarized this wrongly.

Me, I just get confused by the statement that writes through the System
Agent to main memory are snooped.

But the statement that GTT + WC is coherent with the display as it was
on SNB+ is certainly true.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] drm/i915: Disable caches for Global GTT.
  2014-11-06  0:56   ` Rodrigo Vivi
@ 2014-11-06 16:40     ` Jani Nikula
  0 siblings, 0 replies; 16+ messages in thread
From: Jani Nikula @ 2014-11-06 16:40 UTC (permalink / raw)
  To: intel-gfx; +Cc: Rodrigo Vivi, Chris Wilson, James Ausmus, Daniel Vetter, Stable

On Thu, 06 Nov 2014, Rodrigo Vivi <rodrigo.vivi@intel.com> wrote:
> Global GTT doesn't have pat_sel[2:0] so it always point to pat_sel = 000;
> So the only way to avoid screen corruptions is setting PAT 0 to Uncached.
>
> MOCS can still be used though. But if userspace is trusting PTE for
> cache selection the safest thing to do is to let caches disabled.
>
> BSpec: "For GGTT, there is NO pat_sel[2:0] from the entry,
> so RTL will always use the value corresponding to pat_sel = 000"
>
> - System agent ggtt writes (i.e. cpu gtt mmaps) already work before
> this patch, i.e. the same uncached + snooping access like on gen6/7
> seems to be in effect.
> - So this just fixes blitter/render access. Again it looks like it's
> not just uncached access, but uncached + snooping. So we can still
> hold onto all our assumptions wrt cpu clflushing on LLC machines.
>
> v2: Cleaner patch as suggested by Chris.
> v3: Add Daniel's comment
>
> Reference: https://bugs.freedesktop.org/show_bug.cgi?id=85576
> Cc: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: James Ausmus <james.ausmus@intel.com>
> Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
> Cc: Jani Nikula <jani.nikula@intel.com>
> Cc: Stable@vger.kernel.org
> Tested-by: James Ausmus <james.ausmus@intel.com>
> Reviewed-by: James Ausmus <james.ausmus@intel.com>
> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

Pushed this one to drm-intel-fixes as-is, thanks for the patch, review,
and bikeshedding. :p

BR,
Jani.


> ---
>  drivers/gpu/drm/i915/i915_gem_gtt.c | 16 ++++++++++++++++
>  1 file changed, 16 insertions(+)
>
> diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
> index cb7adab..6d3fb3c 100644
> --- a/drivers/gpu/drm/i915/i915_gem_gtt.c
> +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
> @@ -1920,6 +1920,22 @@ static void bdw_setup_private_ppat(struct drm_i915_private *dev_priv)
>  	      GEN8_PPAT(6, GEN8_PPAT_WB | GEN8_PPAT_LLCELLC | GEN8_PPAT_AGE(2)) |
>  	      GEN8_PPAT(7, GEN8_PPAT_WB | GEN8_PPAT_LLCELLC | GEN8_PPAT_AGE(3));
>  
> +	if (!USES_PPGTT(dev_priv->dev))
> +		/* Spec: "For GGTT, there is NO pat_sel[2:0] from the entry,
> +		 * so RTL will always use the value corresponding to
> +		 * pat_sel = 000".
> +		 * So let's disable cache for GGTT to avoid screen corruptions.
> +		 * MOCS still can be used though.
> +		 * - System agent ggtt writes (i.e. cpu gtt mmaps) already work
> +		 * before this patch, i.e. the same uncached + snooping access
> +		 * like on gen6/7 seems to be in effect.
> +		 * - So this just fixes blitter/render access. Again it looks
> +		 * like it's not just uncached access, but uncached + snooping.
> +		 * So we can still hold onto all our assumptions wrt cpu
> +		 * clflushing on LLC machines.
> +		 */
> +		pat = GEN8_PPAT(0, GEN8_PPAT_UC);
> +
>  	/* XXX: spec defines this as 2 distinct registers. It's unclear if a 64b
>  	 * write would work. */
>  	I915_WRITE(GEN8_PRIVATE_PAT, pat);
> -- 
> 1.9.3
>

-- 
Jani Nikula, Intel Open Source Technology Center

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] drm/i915: Disable caches for Global GTT.
  2014-11-06 10:55   ` Chris Wilson
@ 2014-11-06 19:09     ` Ville Syrjälä
  2014-11-06 20:53       ` Ville Syrjälä
  0 siblings, 1 reply; 16+ messages in thread
From: Ville Syrjälä @ 2014-11-06 19:09 UTC (permalink / raw)
  To: Chris Wilson, Daniel Vetter, Rodrigo Vivi, intel-gfx, Nikula, Jani

On Thu, Nov 06, 2014 at 10:55:12AM +0000, Chris Wilson wrote:
> On Wed, Nov 05, 2014 at 12:11:41PM +0100, Daniel Vetter wrote:
> > On Thu, Oct 30, 2014 at 5:18 PM, Rodrigo Vivi <rodrigo.vivi@intel.com> wrote:
> > > Global GTT doesn't have pat_sel[2:0] so it always point to pat_sel = 000;
> > > So the only way to avoid screen corruptions is setting PAT 0 to Uncached.
> > >
> > > MOCS can still be used though. But if userspace is trusting PTE for
> > > cache selection the safest thing to do is to let caches disabled.
> > >
> > > BSpec: "For GGTT, there is NO pat_sel[2:0] from the entry,
> > > so RTL will always use the value corresponding to pat_sel = 000"
> > >
> > > Reference: https://bugs.freedesktop.org/show_bug.cgi?id=85576
> > > Cc: James Ausmus <james.ausmus@intel.com>
> > > Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
> > 
> > Ok, I think this is -fixes + cc: stable material, but we need to be a
> > bit more elaborate on the commit message:
> > 
> > - System agent ggtt writes (i.e. cpu gtt mmaps) already work before
> > this patch, i.e. the same uncached + snooping access like on gen6/7
> > seems to be in effect.
> > - So this just fixes blitter/render access. Again it looks like it's
> > not just uncached access, but uncached + snooping. So we can still
> > hold onto all our assumptions wrt cpu clflushing on LLC machines.
> > 
> > I think this should be both in the commit message and code.
> > 
> > Chris, please correct if I've summarized this wrongly.
> 
> Me, I just get confused by the statement that writes through the System
> Agent to main memory are snooped.
> 
> But the statement that GTT + WC is coherent with the display as it was
> on SNB+ is certainly true.

I have a supicious mind so I simply had to go and verify this on my
IVB. Writes to a LLC cacheable scanout buffer through a GTT mmap did
not result in any cache dirt on the screen. So I must admit defeat.

I also did another experiment where I wrote the data through a cpu
mmap, which obviously resulted in plenty of cache dirt. But I then
followed it with a read through the GTT mmap (obj still mapped as
cacheable in GTT), and the data matched what I wrote, and it also
cleared the cache dirt from the screen. So any CPU GTT access will
snoop the caches and also causes a writeback of the cacheline. I
suspect it also invalidates the cacheline, but I didn't verify that
in any way.

So I guess I can rest easier now having witnessed it all with mine
own eyes :)

-- 
Ville Syrjälä
Intel OTC
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] drm/i915: Disable caches for Global GTT.
  2014-11-06 19:09     ` Ville Syrjälä
@ 2014-11-06 20:53       ` Ville Syrjälä
  2014-11-06 21:55         ` Rodrigo Vivi
  2014-11-07  9:33         ` Daniel Vetter
  0 siblings, 2 replies; 16+ messages in thread
From: Ville Syrjälä @ 2014-11-06 20:53 UTC (permalink / raw)
  To: Chris Wilson, Daniel Vetter, Rodrigo Vivi, intel-gfx, Nikula, Jani

On Thu, Nov 06, 2014 at 09:09:52PM +0200, Ville Syrjälä wrote:
> On Thu, Nov 06, 2014 at 10:55:12AM +0000, Chris Wilson wrote:
> > On Wed, Nov 05, 2014 at 12:11:41PM +0100, Daniel Vetter wrote:
> > > On Thu, Oct 30, 2014 at 5:18 PM, Rodrigo Vivi <rodrigo.vivi@intel.com> wrote:
> > > > Global GTT doesn't have pat_sel[2:0] so it always point to pat_sel = 000;
> > > > So the only way to avoid screen corruptions is setting PAT 0 to Uncached.
> > > >
> > > > MOCS can still be used though. But if userspace is trusting PTE for
> > > > cache selection the safest thing to do is to let caches disabled.
> > > >
> > > > BSpec: "For GGTT, there is NO pat_sel[2:0] from the entry,
> > > > so RTL will always use the value corresponding to pat_sel = 000"
> > > >
> > > > Reference: https://bugs.freedesktop.org/show_bug.cgi?id=85576
> > > > Cc: James Ausmus <james.ausmus@intel.com>
> > > > Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
> > > 
> > > Ok, I think this is -fixes + cc: stable material, but we need to be a
> > > bit more elaborate on the commit message:
> > > 
> > > - System agent ggtt writes (i.e. cpu gtt mmaps) already work before
> > > this patch, i.e. the same uncached + snooping access like on gen6/7
> > > seems to be in effect.
> > > - So this just fixes blitter/render access. Again it looks like it's
> > > not just uncached access, but uncached + snooping. So we can still
> > > hold onto all our assumptions wrt cpu clflushing on LLC machines.
> > > 
> > > I think this should be both in the commit message and code.
> > > 
> > > Chris, please correct if I've summarized this wrongly.
> > 
> > Me, I just get confused by the statement that writes through the System
> > Agent to main memory are snooped.
> > 
> > But the statement that GTT + WC is coherent with the display as it was
> > on SNB+ is certainly true.
> 
> I have a supicious mind so I simply had to go and verify this on my
> IVB. Writes to a LLC cacheable scanout buffer through a GTT mmap did
> not result in any cache dirt on the screen. So I must admit defeat.
> 
> I also did another experiment where I wrote the data through a cpu
> mmap, which obviously resulted in plenty of cache dirt. But I then
> followed it with a read through the GTT mmap (obj still mapped as
> cacheable in GTT), and the data matched what I wrote, and it also
> cleared the cache dirt from the screen. So any CPU GTT access will
> snoop the caches and also causes a writeback of the cacheline. I
> suspect it also invalidates the cacheline, but I didn't verify that
> in any way.
> 
> So I guess I can rest easier now having witnessed it all with mine
> own eyes :)

I also repeated the same tests on CHV, and it behaves in a manner
consistent with its non-LLC heritage ie. CPU GTT access never snoops
no matter what the PTE says.

I also frobbed with the PAT entries a bit, and that at least confirms
that status page writes go through PAT[0] even though we set up the
PTEs to use PAT[4]. When PTE[0] doesn't have the snoop bit, all we get
is a GPU "hang", and with the snoop bit set things work fine.

So I think that for CHV the ppgtt=0 case is perfectly fine with the
current code because:
1) CPU GTT access is fine as it never snoops and we expect it not to,
   in fact we hand a SIGBUS to anyone who tries this
2) GPU access to snooped bo will work since PAT[0] says to snoop
3) GPU access to non-snooped bo will work even if PAT[0] says to snoop,
   but will perhaps be a bit slow due to the extra snoops. Should be
   able to override with MOCS though and get the speed back.
4) Status page must be snooped, so things pretty much fall apart when
   PAT[0] doesn't have the snoop bit. Well, unless we decide to add
   explicit cache flushes to status page reads

I'm not entirely sure if I should be coserned about the color_adjust
stuff, especially in the case where MOCS is used. We might end up
mixing snooped and non-snooped accesses in a way that the color
adjustment wasn't able to anticipate. I guess I could just try to
see if I can anger the hardware doing that. But that's a job or
another day.

-- 
Ville Syrjälä
Intel OTC
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] drm/i915: Disable caches for Global GTT.
  2014-11-06 20:53       ` Ville Syrjälä
@ 2014-11-06 21:55         ` Rodrigo Vivi
  2014-11-07  9:33         ` Daniel Vetter
  1 sibling, 0 replies; 16+ messages in thread
From: Rodrigo Vivi @ 2014-11-06 21:55 UTC (permalink / raw)
  To: Ville Syrjälä; +Cc: intel-gfx, Rodrigo Vivi

On Thu, Nov 6, 2014 at 12:53 PM, Ville Syrjälä
<ville.syrjala@linux.intel.com> wrote:
> On Thu, Nov 06, 2014 at 09:09:52PM +0200, Ville Syrjälä wrote:
>> On Thu, Nov 06, 2014 at 10:55:12AM +0000, Chris Wilson wrote:
>> > On Wed, Nov 05, 2014 at 12:11:41PM +0100, Daniel Vetter wrote:
>> > > On Thu, Oct 30, 2014 at 5:18 PM, Rodrigo Vivi <rodrigo.vivi@intel.com> wrote:
>> > > > Global GTT doesn't have pat_sel[2:0] so it always point to pat_sel = 000;
>> > > > So the only way to avoid screen corruptions is setting PAT 0 to Uncached.
>> > > >
>> > > > MOCS can still be used though. But if userspace is trusting PTE for
>> > > > cache selection the safest thing to do is to let caches disabled.
>> > > >
>> > > > BSpec: "For GGTT, there is NO pat_sel[2:0] from the entry,
>> > > > so RTL will always use the value corresponding to pat_sel = 000"
>> > > >
>> > > > Reference: https://bugs.freedesktop.org/show_bug.cgi?id=85576
>> > > > Cc: James Ausmus <james.ausmus@intel.com>
>> > > > Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
>> > >
>> > > Ok, I think this is -fixes + cc: stable material, but we need to be a
>> > > bit more elaborate on the commit message:
>> > >
>> > > - System agent ggtt writes (i.e. cpu gtt mmaps) already work before
>> > > this patch, i.e. the same uncached + snooping access like on gen6/7
>> > > seems to be in effect.
>> > > - So this just fixes blitter/render access. Again it looks like it's
>> > > not just uncached access, but uncached + snooping. So we can still
>> > > hold onto all our assumptions wrt cpu clflushing on LLC machines.
>> > >
>> > > I think this should be both in the commit message and code.
>> > >
>> > > Chris, please correct if I've summarized this wrongly.
>> >
>> > Me, I just get confused by the statement that writes through the System
>> > Agent to main memory are snooped.
>> >
>> > But the statement that GTT + WC is coherent with the display as it was
>> > on SNB+ is certainly true.
>>
>> I have a supicious mind so I simply had to go and verify this on my
>> IVB. Writes to a LLC cacheable scanout buffer through a GTT mmap did
>> not result in any cache dirt on the screen. So I must admit defeat.
>>
>> I also did another experiment where I wrote the data through a cpu
>> mmap, which obviously resulted in plenty of cache dirt. But I then
>> followed it with a read through the GTT mmap (obj still mapped as
>> cacheable in GTT), and the data matched what I wrote, and it also
>> cleared the cache dirt from the screen. So any CPU GTT access will
>> snoop the caches and also causes a writeback of the cacheline. I
>> suspect it also invalidates the cacheline, but I didn't verify that
>> in any way.
>>
>> So I guess I can rest easier now having witnessed it all with mine
>> own eyes :)
>
> I also repeated the same tests on CHV, and it behaves in a manner
> consistent with its non-LLC heritage ie. CPU GTT access never snoops
> no matter what the PTE says.
>
> I also frobbed with the PAT entries a bit, and that at least confirms
> that status page writes go through PAT[0] even though we set up the
> PTEs to use PAT[4]. When PTE[0] doesn't have the snoop bit, all we get
> is a GPU "hang", and with the snoop bit set things work fine.
>
> So I think that for CHV the ppgtt=0 case is perfectly fine with the
> current code because:
> 1) CPU GTT access is fine as it never snoops and we expect it not to,
>    in fact we hand a SIGBUS to anyone who tries this
> 2) GPU access to snooped bo will work since PAT[0] says to snoop
> 3) GPU access to non-snooped bo will work even if PAT[0] says to snoop,
>    but will perhaps be a bit slow due to the extra snoops. Should be
>    able to override with MOCS though and get the speed back.
> 4) Status page must be snooped, so things pretty much fall apart when
>    PAT[0] doesn't have the snoop bit. Well, unless we decide to add
>    explicit cache flushes to status page reads

Great job! Thanks for verifying that deeply.
I wasn't really sure about how we are setting the entire PAT table,
but with your tests I'm confident we are in the right direction.

>
> I'm not entirely sure if I should be coserned about the color_adjust
> stuff, especially in the case where MOCS is used. We might end up
> mixing snooped and non-snooped accesses in a way that the color
> adjustment wasn't able to anticipate.

uhm... interesting point.

> I guess I could just try to
> see if I can anger the hardware doing that. But that's a job or
> another day.

indeed! :)

>
> --
> Ville Syrjälä
> Intel OTC
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx



-- 
Rodrigo Vivi
Blog: http://blog.vivi.eng.br
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] drm/i915: Disable caches for Global GTT.
  2014-11-06 20:53       ` Ville Syrjälä
  2014-11-06 21:55         ` Rodrigo Vivi
@ 2014-11-07  9:33         ` Daniel Vetter
  2014-11-07  9:51           ` Chris Wilson
  1 sibling, 1 reply; 16+ messages in thread
From: Daniel Vetter @ 2014-11-07  9:33 UTC (permalink / raw)
  To: Ville Syrjälä; +Cc: intel-gfx, Rodrigo Vivi

On Thu, Nov 06, 2014 at 10:53:00PM +0200, Ville Syrjälä wrote:
> On Thu, Nov 06, 2014 at 09:09:52PM +0200, Ville Syrjälä wrote:
> > On Thu, Nov 06, 2014 at 10:55:12AM +0000, Chris Wilson wrote:
> > > On Wed, Nov 05, 2014 at 12:11:41PM +0100, Daniel Vetter wrote:
> > > > On Thu, Oct 30, 2014 at 5:18 PM, Rodrigo Vivi <rodrigo.vivi@intel.com> wrote:
> > > > > Global GTT doesn't have pat_sel[2:0] so it always point to pat_sel = 000;
> > > > > So the only way to avoid screen corruptions is setting PAT 0 to Uncached.
> > > > >
> > > > > MOCS can still be used though. But if userspace is trusting PTE for
> > > > > cache selection the safest thing to do is to let caches disabled.
> > > > >
> > > > > BSpec: "For GGTT, there is NO pat_sel[2:0] from the entry,
> > > > > so RTL will always use the value corresponding to pat_sel = 000"
> > > > >
> > > > > Reference: https://bugs.freedesktop.org/show_bug.cgi?id=85576
> > > > > Cc: James Ausmus <james.ausmus@intel.com>
> > > > > Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
> > > > 
> > > > Ok, I think this is -fixes + cc: stable material, but we need to be a
> > > > bit more elaborate on the commit message:
> > > > 
> > > > - System agent ggtt writes (i.e. cpu gtt mmaps) already work before
> > > > this patch, i.e. the same uncached + snooping access like on gen6/7
> > > > seems to be in effect.
> > > > - So this just fixes blitter/render access. Again it looks like it's
> > > > not just uncached access, but uncached + snooping. So we can still
> > > > hold onto all our assumptions wrt cpu clflushing on LLC machines.
> > > > 
> > > > I think this should be both in the commit message and code.
> > > > 
> > > > Chris, please correct if I've summarized this wrongly.
> > > 
> > > Me, I just get confused by the statement that writes through the System
> > > Agent to main memory are snooped.
> > > 
> > > But the statement that GTT + WC is coherent with the display as it was
> > > on SNB+ is certainly true.
> > 
> > I have a supicious mind so I simply had to go and verify this on my
> > IVB. Writes to a LLC cacheable scanout buffer through a GTT mmap did
> > not result in any cache dirt on the screen. So I must admit defeat.
> > 
> > I also did another experiment where I wrote the data through a cpu
> > mmap, which obviously resulted in plenty of cache dirt. But I then
> > followed it with a read through the GTT mmap (obj still mapped as
> > cacheable in GTT), and the data matched what I wrote, and it also
> > cleared the cache dirt from the screen. So any CPU GTT access will
> > snoop the caches and also causes a writeback of the cacheline. I
> > suspect it also invalidates the cacheline, but I didn't verify that
> > in any way.
> > 
> > So I guess I can rest easier now having witnessed it all with mine
> > own eyes :)
> 
> I also repeated the same tests on CHV, and it behaves in a manner
> consistent with its non-LLC heritage ie. CPU GTT access never snoops
> no matter what the PTE says.
> 
> I also frobbed with the PAT entries a bit, and that at least confirms
> that status page writes go through PAT[0] even though we set up the
> PTEs to use PAT[4]. When PTE[0] doesn't have the snoop bit, all we get
> is a GPU "hang", and with the snoop bit set things work fine.
> 
> So I think that for CHV the ppgtt=0 case is perfectly fine with the
> current code because:
> 1) CPU GTT access is fine as it never snoops and we expect it not to,
>    in fact we hand a SIGBUS to anyone who tries this
> 2) GPU access to snooped bo will work since PAT[0] says to snoop
> 3) GPU access to non-snooped bo will work even if PAT[0] says to snoop,
>    but will perhaps be a bit slow due to the extra snoops. Should be
>    able to override with MOCS though and get the speed back.
> 4) Status page must be snooped, so things pretty much fall apart when
>    PAT[0] doesn't have the snoop bit. Well, unless we decide to add
>    explicit cache flushes to status page reads

I think a patch which adds a comment to the chv pat code stating that
pat[0] must be cached would be great. Otherwise someone will go and
"clean" it up I fear ...

> I'm not entirely sure if I should be coserned about the color_adjust
> stuff, especially in the case where MOCS is used. We might end up
> mixing snooped and non-snooped accesses in a way that the color
> adjustment wasn't able to anticipate. I guess I could just try to
> see if I can anger the hardware doing that. But that's a job or
> another day.

tbh I'm not sure any more whether we still need this on vlv/chv - they did
change the snooping logic a lot. Iirc pre-gen5 only could snoop with the
blitter, vlv/chv can also snoop with the render engine. So maybe this has
improved enough that we don't need the color stuff any more.

I guess a good test would be to disable the color stuff on vlv/chv and
see what happens ...
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] drm/i915: Disable caches for Global GTT.
  2014-11-07  9:33         ` Daniel Vetter
@ 2014-11-07  9:51           ` Chris Wilson
  0 siblings, 0 replies; 16+ messages in thread
From: Chris Wilson @ 2014-11-07  9:51 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: intel-gfx, Rodrigo Vivi

On Fri, Nov 07, 2014 at 10:33:10AM +0100, Daniel Vetter wrote:
> tbh I'm not sure any more whether we still need this on vlv/chv - they did
> change the snooping logic a lot. Iirc pre-gen5 only could snoop with the
> blitter, vlv/chv can also snoop with the render engine. So maybe this has
> improved enough that we don't need the color stuff any more.

gen2+ snoops with the sampler as well.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2014-11-07  9:51 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-10-30 16:18 [PATCH] drm/i915: Disable caches for Global GTT Rodrigo Vivi
2014-10-31  7:44 ` Chris Wilson
2014-10-31 10:14   ` Ville Syrjälä
2014-10-31 10:20     ` Chris Wilson
2014-10-31 12:27       ` Rodrigo Vivi
2014-10-31 21:38         ` Ausmus, James
2014-11-04 18:29         ` Ausmus, James
2014-11-05 11:11 ` Daniel Vetter
2014-11-06  0:56   ` Rodrigo Vivi
2014-11-06 16:40     ` Jani Nikula
2014-11-06 10:55   ` Chris Wilson
2014-11-06 19:09     ` Ville Syrjälä
2014-11-06 20:53       ` Ville Syrjälä
2014-11-06 21:55         ` Rodrigo Vivi
2014-11-07  9:33         ` Daniel Vetter
2014-11-07  9:51           ` Chris Wilson

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.