All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] drm/i915: Avoid promoting a simulated hang to 'wedged'
@ 2013-05-28  9:38 Chris Wilson
  2013-05-29 11:33 ` Mika Kuoppala
  0 siblings, 1 reply; 3+ messages in thread
From: Chris Wilson @ 2013-05-28  9:38 UTC (permalink / raw)
  To: intel-gfx

It appears that a beneficial side-effect of Mika's more accurate hangman
work is to speed up hang detection and execution. This exposes a bug in
the reset code that then treats repeated simulated hangs as an
indication that the machine is wedged. Jiggle the code around so that we
only do the simulation processing from the hangcheck and avoid confusing
it with a real hang.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=65060
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/i915_drv.c |   55 ++++++++++++++++-----------------------
 1 file changed, 23 insertions(+), 32 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index a1a936f..af22450 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -839,37 +839,14 @@ static int gen6_do_reset(struct drm_device *dev)
 
 int intel_gpu_reset(struct drm_device *dev)
 {
-	struct drm_i915_private *dev_priv = dev->dev_private;
-	int ret = -ENODEV;
-
 	switch (INTEL_INFO(dev)->gen) {
 	case 7:
-	case 6:
-		ret = gen6_do_reset(dev);
-		break;
-	case 5:
-		ret = ironlake_do_reset(dev);
-		break;
-	case 4:
-		ret = i965_do_reset(dev);
-		break;
-	case 2:
-		ret = i8xx_do_reset(dev);
-		break;
-	}
-
-	/* Also reset the gpu hangman. */
-	if (dev_priv->gpu_error.stop_rings) {
-		DRM_INFO("Simulated gpu hang, resetting stop_rings\n");
-		dev_priv->gpu_error.stop_rings = 0;
-		if (ret == -ENODEV) {
-			DRM_ERROR("Reset not implemented, but ignoring "
-				  "error for simulated gpu hangs\n");
-			ret = 0;
-		}
+	case 6: return gen6_do_reset(dev);
+	case 5: return ironlake_do_reset(dev);
+	case 4: return i965_do_reset(dev);
+	case 2: return i8xx_do_reset(dev);
+	default: return -ENODEV;
 	}
-
-	return ret;
 }
 
 /**
@@ -890,6 +867,7 @@ int intel_gpu_reset(struct drm_device *dev)
 int i915_reset(struct drm_device *dev)
 {
 	drm_i915_private_t *dev_priv = dev->dev_private;
+	bool simulated;
 	int ret;
 
 	if (!i915_try_reset)
@@ -899,13 +877,26 @@ int i915_reset(struct drm_device *dev)
 
 	i915_gem_reset(dev);
 
-	ret = -ENODEV;
-	if (get_seconds() - dev_priv->gpu_error.last_reset < 5)
+	simulated = dev_priv->gpu_error.stop_rings != 0;
+
+	if (!simulated && get_seconds() - dev_priv->gpu_error.last_reset < 5) {
 		DRM_ERROR("GPU hanging too fast, declaring wedged!\n");
-	else
+		ret = -ENODEV;
+	} else {
 		ret = intel_gpu_reset(dev);
 
-	dev_priv->gpu_error.last_reset = get_seconds();
+		/* Also reset the gpu hangman. */
+		if (simulated) {
+			DRM_INFO("Simulated gpu hang, resetting stop_rings\n");
+			dev_priv->gpu_error.stop_rings = 0;
+			if (ret == -ENODEV) {
+				DRM_ERROR("Reset not implemented, but ignoring "
+					  "error for simulated gpu hangs\n");
+				ret = 0;
+			}
+		} else
+			dev_priv->gpu_error.last_reset = get_seconds();
+	}
 	if (ret) {
 		DRM_ERROR("Failed to reset chip.\n");
 		mutex_unlock(&dev->struct_mutex);
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 3+ messages in thread

* Re: [PATCH] drm/i915: Avoid promoting a simulated hang to 'wedged'
  2013-05-28  9:38 [PATCH] drm/i915: Avoid promoting a simulated hang to 'wedged' Chris Wilson
@ 2013-05-29 11:33 ` Mika Kuoppala
  2013-05-29 12:13   ` Daniel Vetter
  0 siblings, 1 reply; 3+ messages in thread
From: Mika Kuoppala @ 2013-05-29 11:33 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx

Chris Wilson <chris@chris-wilson.co.uk> writes:

> It appears that a beneficial side-effect of Mika's more accurate hangman
> work is to speed up hang detection and execution. This exposes a bug in
> the reset code that then treats repeated simulated hangs as an
> indication that the machine is wedged. Jiggle the code around so that we
> only do the simulation processing from the hangcheck and avoid confusing
> it with a real hang.
>
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=65060
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>

Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [PATCH] drm/i915: Avoid promoting a simulated hang to 'wedged'
  2013-05-29 11:33 ` Mika Kuoppala
@ 2013-05-29 12:13   ` Daniel Vetter
  0 siblings, 0 replies; 3+ messages in thread
From: Daniel Vetter @ 2013-05-29 12:13 UTC (permalink / raw)
  To: Mika Kuoppala; +Cc: intel-gfx

On Wed, May 29, 2013 at 02:33:07PM +0300, Mika Kuoppala wrote:
> Chris Wilson <chris@chris-wilson.co.uk> writes:
> 
> > It appears that a beneficial side-effect of Mika's more accurate hangman
> > work is to speed up hang detection and execution. This exposes a bug in
> > the reset code that then treats repeated simulated hangs as an
> > indication that the machine is wedged. Jiggle the code around so that we
> > only do the simulation processing from the hangcheck and avoid confusing
> > it with a real hang.
> >
> > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=65060
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> 
> Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>

Queued for -next, thanks for the patch.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2013-05-29 12:13 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-05-28  9:38 [PATCH] drm/i915: Avoid promoting a simulated hang to 'wedged' Chris Wilson
2013-05-29 11:33 ` Mika Kuoppala
2013-05-29 12:13   ` Daniel Vetter

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.