All of lore.kernel.org
 help / color / mirror / Atom feed
From: Chris Wilson <chris@chris-wilson.co.uk>
To: intel-gfx@lists.freedesktop.org
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Subject: [PATCH] drm/i915: Detect a failed GPU reset+recovery
Date: Mon, 16 Jan 2017 15:00:36 +0000	[thread overview]
Message-ID: <20170116150036.27917-1-chris@chris-wilson.co.uk> (raw)
In-Reply-To: <20170116094252.26429-2-chris@chris-wilson.co.uk>

If we can't recover the GPU after the reset, mark it as wedged to cancel
the outstanding tasks and to prevent new users from trying to use the
broken GPU.

v2: Check the same ring is hung again before declaring the reset broken.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.c |  7 ++++++-
 drivers/gpu/drm/i915/i915_drv.h |  2 +-
 drivers/gpu/drm/i915/i915_gem.c | 16 ++++++++++++++--
 3 files changed, 21 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index bb747aeb29aa..5ee62976ec98 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -1761,7 +1761,12 @@ void i915_reset(struct drm_i915_private *dev_priv)
 
 	pr_notice("drm/i915: Resetting chip after gpu hang\n");
 	disable_irq(dev_priv->drm.irq);
-	i915_gem_reset_prepare(dev_priv);
+	ret = i915_gem_reset_prepare(dev_priv);
+	if (ret) {
+		DRM_ERROR("GPU recovery failed\n");
+		intel_gpu_reset(dev_priv, ALL_ENGINES);
+		goto error;
+	}
 
 	ret = intel_gpu_reset(dev_priv, ALL_ENGINES);
 	if (ret) {
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index f861418122ef..38509505424d 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -3328,7 +3328,7 @@ static inline u32 i915_reset_count(struct i915_gpu_error *error)
 	return READ_ONCE(error->reset_count);
 }
 
-void i915_gem_reset_prepare(struct drm_i915_private *dev_priv);
+int i915_gem_reset_prepare(struct drm_i915_private *dev_priv);
 void i915_gem_reset_finish(struct drm_i915_private *dev_priv);
 void i915_gem_set_wedged(struct drm_i915_private *dev_priv);
 void i915_gem_clflush_object(struct drm_i915_gem_object *obj, bool force);
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 021d1e97cc15..e9dfd320a0e3 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2629,16 +2629,28 @@ static void reset_request(struct drm_i915_gem_request *request)
 	dma_fence_set_error(&request->fence, -EIO);
 }
 
-void i915_gem_reset_prepare(struct drm_i915_private *dev_priv)
+int i915_gem_reset_prepare(struct drm_i915_private *dev_priv)
 {
 	struct intel_engine_cs *engine;
 	enum intel_engine_id id;
+	int err = 0;
 
 	/* Ensure irq handler finishes, and not run again. */
-	for_each_engine(engine, dev_priv, id)
+	for_each_engine(engine, dev_priv, id) {
+		struct drm_i915_gem_request *request;
+
 		tasklet_kill(&engine->irq_tasklet);
 
+		if (engine->hangcheck.stalled) {
+			request = i915_gem_find_active_request(engine);
+			if (request && request->fence.error == -EIO)
+				err = -EIO; /* Previous reset failed! */
+		}
+	}
+
 	i915_gem_revoke_fences(dev_priv);
+
+	return err;
 }
 
 static void i915_gem_reset_engine(struct intel_engine_cs *engine)
-- 
2.11.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

  parent reply	other threads:[~2017-01-16 15:00 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-01-16  9:42 [PATCH 1/2] drm/i915: Move engine reset preparation to i915_gem_reset_prepare() Chris Wilson
2017-01-16  9:42 ` [PATCH 2/2] drm/i915: Detect a failed GPU reset+recovery Chris Wilson
2017-01-16 11:18   ` Chris Wilson
2017-01-16 15:00   ` Chris Wilson [this message]
2017-01-16 11:54 ` ✓ Fi.CI.BAT: success for series starting with [1/2] drm/i915: Move engine reset preparation to i915_gem_reset_prepare() Patchwork
2017-01-16 14:03 ` [PATCH 1/2] " Mika Kuoppala
2017-01-16 14:18   ` Chris Wilson
2017-01-16 14:37     ` Mika Kuoppala
2017-01-16 15:54 ` ✓ Fi.CI.BAT: success for series starting with [1/2] drm/i915: Move engine reset preparation to i915_gem_reset_prepare() (rev2) Patchwork

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20170116150036.27917-1-chris@chris-wilson.co.uk \
    --to=chris@chris-wilson.co.uk \
    --cc=intel-gfx@lists.freedesktop.org \
    --cc=mika.kuoppala@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.