intel-gfx.lists.freedesktop.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] drm/i915: fix forcewake related hangs on snb
@ 2012-07-26 14:24 Daniel Vetter
  2012-07-26 14:50 ` Chris Wilson
  0 siblings, 1 reply; 3+ messages in thread
From: Daniel Vetter @ 2012-07-26 14:24 UTC (permalink / raw)
  To: Intel Graphics Development; +Cc: Daniel Vetter

... by adding seemingly redudant posting reads.

This little dragon lair exploded the first time around when we've
refactored the code a bit to use the common wait_for_atomic_us in
"drm/i915: Group the GT routines together in both code and vtable",
which caused QA to file fdo bug #51738.

Chris Wilson entertained a few approaches to fixing #51738: Replacing
the udelay(1) with the previously-used udelay(10) (or any other
"sufficiently larger" delay), adding a posting read, or ditching the
delay completely and using cpu_relax. We went with the cpu_relax and
"915: Workaround hang with BSD and forcewake on SandyBridge". Which
blew up in fdo bug #52424, but adding the posting read while still
using cpu_relax seems to also fix that, it looks like the
posting read is the important ingriedient to fix these rc6 related
hangs on snb.

Popular theories as to why this is like it is include:
- A herd of pink elephants got royally angered somehow.

- The gpu has internally different functional units and judging by the
  register offsets, the forcewake request register and the forcewake
  ack registers are _not_ in the same functional unit (or at least
  aren't reached through the same routes). Hence the posting read
  syncs up with the wrong block and gets the entire gpu confused.

- ...

As a minimal ducttape fix for 3.6, let's just put these posting reads
into place again. We can try fancier approaches (like adding back the
cpu_relax instead of the udelay) in -next.

This (re-)fixes a regression introduced in

commit 990bbdadabaa51828e475eda86ee5720a4910cc3
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Mon Jul 2 11:51:02 2012 -0300

    drm/i915: Group the GT routines together in both code and vtable

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=52424
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=51738u
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
---
 drivers/gpu/drm/i915/intel_pm.c |    6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index 94aabca..58c07cd 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -3963,6 +3963,7 @@ static void __gen6_gt_force_wake_get(struct drm_i915_private *dev_priv)
 		DRM_ERROR("Force wake wait timed out\n");
 
 	I915_WRITE_NOTRACE(FORCEWAKE, 1);
+	POSTING_READ(FORCEWAKE);
 
 	if (wait_for_atomic_us((I915_READ_NOTRACE(forcewake_ack) & 1), 500))
 		DRM_ERROR("Force wake wait timed out\n");
@@ -3983,6 +3984,7 @@ static void __gen6_gt_force_wake_mt_get(struct drm_i915_private *dev_priv)
 		DRM_ERROR("Force wake wait timed out\n");
 
 	I915_WRITE_NOTRACE(FORCEWAKE_MT, _MASKED_BIT_ENABLE(1));
+	POSTING_READ(FORCEWAKE_MT);
 
 	if (wait_for_atomic_us((I915_READ_NOTRACE(forcewake_ack) & 1), 500))
 		DRM_ERROR("Force wake wait timed out\n");
@@ -4018,14 +4020,14 @@ void gen6_gt_check_fifodbg(struct drm_i915_private *dev_priv)
 static void __gen6_gt_force_wake_put(struct drm_i915_private *dev_priv)
 {
 	I915_WRITE_NOTRACE(FORCEWAKE, 0);
-	/* The below doubles as a POSTING_READ */
+	POSTING_READ(FORCEWAKE);
 	gen6_gt_check_fifodbg(dev_priv);
 }
 
 static void __gen6_gt_force_wake_mt_put(struct drm_i915_private *dev_priv)
 {
 	I915_WRITE_NOTRACE(FORCEWAKE_MT, _MASKED_BIT_DISABLE(1));
-	/* The below doubles as a POSTING_READ */
+	POSTING_READ(FORCEWAKE_MT);
 	gen6_gt_check_fifodbg(dev_priv);
 }
 
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 3+ messages in thread

* Re: [PATCH] drm/i915: fix forcewake related hangs on snb
  2012-07-26 14:24 [PATCH] drm/i915: fix forcewake related hangs on snb Daniel Vetter
@ 2012-07-26 14:50 ` Chris Wilson
  2012-07-26 16:53   ` Daniel Vetter
  0 siblings, 1 reply; 3+ messages in thread
From: Chris Wilson @ 2012-07-26 14:50 UTC (permalink / raw)
  To: Intel Graphics Development; +Cc: Daniel Vetter

On Thu, 26 Jul 2012 16:24:50 +0200, Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
> ... by adding seemingly redudant posting reads.
> 
> This little dragon lair exploded the first time around when we've
> refactored the code a bit to use the common wait_for_atomic_us in
> "drm/i915: Group the GT routines together in both code and vtable",
> which caused QA to file fdo bug #51738.
> 
> Chris Wilson entertained a few approaches to fixing #51738: Replacing
> the udelay(1) with the previously-used udelay(10) (or any other
> "sufficiently larger" delay), adding a posting read, or ditching the
> delay completely and using cpu_relax. We went with the cpu_relax and
> "915: Workaround hang with BSD and forcewake on SandyBridge". Which
> blew up in fdo bug #52424, but adding the posting read while still
> using cpu_relax seems to also fix that, it looks like the
> posting read is the important ingriedient to fix these rc6 related
> hangs on snb.
> 
> Popular theories as to why this is like it is include:
> - A herd of pink elephants got royally angered somehow.
> 
> - The gpu has internally different functional units and judging by the
>   register offsets, the forcewake request register and the forcewake
>   ack registers are _not_ in the same functional unit (or at least
>   aren't reached through the same routes). Hence the posting read
>   syncs up with the wrong block and gets the entire gpu confused.
> 
> - ...
> 
> As a minimal ducttape fix for 3.6, let's just put these posting reads
> into place again. We can try fancier approaches (like adding back the
> cpu_relax instead of the udelay) in -next.
> 
> This (re-)fixes a regression introduced in
> 
> commit 990bbdadabaa51828e475eda86ee5720a4910cc3
> Author: Chris Wilson <chris@chris-wilson.co.uk>
> Date:   Mon Jul 2 11:51:02 2012 -0300
> 
>     drm/i915: Group the GT routines together in both code and vtable
> 
> Cc: Chris Wilson <chris@chris-wilson.co.uk>
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=52424
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=51738u
> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

No change on IVB, fixes the dummy_reloc_loop hang on SNB.

Tested-by: Chris Wilson <chris@chris-wilson.co.uk>

udelay() is winning the award for most popular function. Almost, it got
pipped at the last second by read_hpet() on earlier chipsets.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [PATCH] drm/i915: fix forcewake related hangs on snb
  2012-07-26 14:50 ` Chris Wilson
@ 2012-07-26 16:53   ` Daniel Vetter
  0 siblings, 0 replies; 3+ messages in thread
From: Daniel Vetter @ 2012-07-26 16:53 UTC (permalink / raw)
  To: Chris Wilson; +Cc: Daniel Vetter, Intel Graphics Development

On Thu, Jul 26, 2012 at 03:50:02PM +0100, Chris Wilson wrote:
> On Thu, 26 Jul 2012 16:24:50 +0200, Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
> > ... by adding seemingly redudant posting reads.
> > 
> > This little dragon lair exploded the first time around when we've
> > refactored the code a bit to use the common wait_for_atomic_us in
> > "drm/i915: Group the GT routines together in both code and vtable",
> > which caused QA to file fdo bug #51738.
> > 
> > Chris Wilson entertained a few approaches to fixing #51738: Replacing
> > the udelay(1) with the previously-used udelay(10) (or any other
> > "sufficiently larger" delay), adding a posting read, or ditching the
> > delay completely and using cpu_relax. We went with the cpu_relax and
> > "915: Workaround hang with BSD and forcewake on SandyBridge". Which
> > blew up in fdo bug #52424, but adding the posting read while still
> > using cpu_relax seems to also fix that, it looks like the
> > posting read is the important ingriedient to fix these rc6 related
> > hangs on snb.
> > 
> > Popular theories as to why this is like it is include:
> > - A herd of pink elephants got royally angered somehow.
> > 
> > - The gpu has internally different functional units and judging by the
> >   register offsets, the forcewake request register and the forcewake
> >   ack registers are _not_ in the same functional unit (or at least
> >   aren't reached through the same routes). Hence the posting read
> >   syncs up with the wrong block and gets the entire gpu confused.
> > 
> > - ...
> > 
> > As a minimal ducttape fix for 3.6, let's just put these posting reads
> > into place again. We can try fancier approaches (like adding back the
> > cpu_relax instead of the udelay) in -next.
> > 
> > This (re-)fixes a regression introduced in
> > 
> > commit 990bbdadabaa51828e475eda86ee5720a4910cc3
> > Author: Chris Wilson <chris@chris-wilson.co.uk>
> > Date:   Mon Jul 2 11:51:02 2012 -0300
> > 
> >     drm/i915: Group the GT routines together in both code and vtable
> > 
> > Cc: Chris Wilson <chris@chris-wilson.co.uk>
> > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=52424
> > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=51738u
> > Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
> 
> No change on IVB, fixes the dummy_reloc_loop hang on SNB.
> 
> Tested-by: Chris Wilson <chris@chris-wilson.co.uk>
I've picked this up for -fixes, thanks for testing. I'll send a pull to
Dave tomorrow, assuming QA doesn't complain about things any more, too.
-Daniel
-- 
Daniel Vetter
Mail: daniel@ffwll.ch
Mobile: +41 (0)79 365 57 48

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2012-07-26 16:53 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-07-26 14:24 [PATCH] drm/i915: fix forcewake related hangs on snb Daniel Vetter
2012-07-26 14:50 ` Chris Wilson
2012-07-26 16:53   ` Daniel Vetter

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).