All of lore.kernel.org
 help / color / mirror / Atom feed
From: Chris Wilson <chris@chris-wilson.co.uk>
To: intel-gfx@lists.freedesktop.org
Subject: [CI 12/21] drm/i915: Replace wait-on-mutex with wait-on-bit in reset worker
Date: Fri,  9 Sep 2016 14:11:52 +0100	[thread overview]
Message-ID: <20160909131201.16673-12-chris@chris-wilson.co.uk> (raw)
In-Reply-To: <20160909131201.16673-1-chris@chris-wilson.co.uk>

Since we have a cooperative mode now with a direct reset, we can avoid
the contention on struct_mutex and instead try then sleep on the
I915_RESET_IN_PROGRESS bit. If the mutex is held and that bit is
cleared, all is fine. Otherwise, we sleep for a bit and try again. In
the worst case we sleep for an extra second waiting for the mutex to be
released (no one touching the GPU is allowed the struct_mutex whilst the
I915_RESET_IN_PROGRESS bit is set). But when we have a direct reset,
this allows us to clean up the reset worker faster.

v2: Remember to call wake_up_bit() after changing (for the faster wakeup
as promised)

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.c | 14 ++++++++------
 drivers/gpu/drm/i915/i915_drv.h |  2 +-
 drivers/gpu/drm/i915/i915_irq.c | 31 ++++++++++++++++++-------------
 3 files changed, 27 insertions(+), 20 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index ff4173e6e298..f2614b2f59f7 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -1726,8 +1726,8 @@ int i915_resume_switcheroo(struct drm_device *dev)
  * i915_reset - reset chip after a hang
  * @dev: drm device to reset
  *
- * Reset the chip.  Useful if a hang is detected. Returns zero on successful
- * reset or otherwise an error code.
+ * Reset the chip.  Useful if a hang is detected. Marks the device as wedged
+ * on failure.
  *
  * Caller must hold the struct_mutex.
  *
@@ -1739,7 +1739,7 @@ int i915_resume_switcheroo(struct drm_device *dev)
  *   - re-init interrupt state
  *   - re-init display
  */
-int i915_reset(struct drm_i915_private *dev_priv)
+void i915_reset(struct drm_i915_private *dev_priv)
 {
 	struct drm_device *dev = &dev_priv->drm;
 	struct i915_gpu_error *error = &dev_priv->gpu_error;
@@ -1748,7 +1748,7 @@ int i915_reset(struct drm_i915_private *dev_priv)
 	lockdep_assert_held(&dev->struct_mutex);
 
 	if (!test_and_clear_bit(I915_RESET_IN_PROGRESS, &error->flags))
-		return test_bit(I915_WEDGED, &error->flags) ? -EIO : 0;
+		return;
 
 	/* Clear any previous failed attempts at recovery. Time to try again. */
 	__clear_bit(I915_WEDGED, &error->flags);
@@ -1798,11 +1798,13 @@ int i915_reset(struct drm_i915_private *dev_priv)
 	intel_sanitize_gt_powersave(dev_priv);
 	intel_autoenable_gt_powersave(dev_priv);
 
-	return 0;
+wakeup:
+	wake_up_bit(&error->flags, I915_RESET_IN_PROGRESS);
+	return;
 
 error:
 	set_bit(I915_WEDGED, &error->flags);
-	return ret;
+	goto wakeup;
 }
 
 static int i915_pm_suspend(struct device *kdev)
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 15f1977e356a..9a9f07f3574c 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2884,7 +2884,7 @@ extern long i915_compat_ioctl(struct file *filp, unsigned int cmd,
 #endif
 extern int intel_gpu_reset(struct drm_i915_private *dev_priv, u32 engine_mask);
 extern bool intel_has_gpu_reset(struct drm_i915_private *dev_priv);
-extern int i915_reset(struct drm_i915_private *dev_priv);
+extern void i915_reset(struct drm_i915_private *dev_priv);
 extern int intel_guc_reset(struct drm_i915_private *dev_priv);
 extern void intel_engine_init_hangcheck(struct intel_engine_cs *engine);
 extern unsigned long i915_chipset_val(struct drm_i915_private *dev_priv);
diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index 2c7cb5041511..ef2d40278191 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -2497,7 +2497,6 @@ static void i915_reset_and_wakeup(struct drm_i915_private *dev_priv)
 	char *error_event[] = { I915_ERROR_UEVENT "=1", NULL };
 	char *reset_event[] = { I915_RESET_UEVENT "=1", NULL };
 	char *reset_done_event[] = { I915_ERROR_UEVENT "=0", NULL };
-	int ret;
 
 	kobject_uevent_env(kobj, KOBJ_CHANGE, error_event);
 
@@ -2512,24 +2511,30 @@ static void i915_reset_and_wakeup(struct drm_i915_private *dev_priv)
 	 * simulated reset via debugs, so get an RPM reference.
 	 */
 	intel_runtime_pm_get(dev_priv);
-
 	intel_prepare_reset(dev_priv);
 
-	/*
-	 * All state reset _must_ be completed before we update the
-	 * reset counter, for otherwise waiters might miss the reset
-	 * pending state and not properly drop locks, resulting in
-	 * deadlocks with the reset work.
-	 */
-	mutex_lock(&dev_priv->drm.struct_mutex);
-	ret = i915_reset(dev_priv);
-	mutex_unlock(&dev_priv->drm.struct_mutex);
+	do {
+		/*
+		 * All state reset _must_ be completed before we update the
+		 * reset counter, for otherwise waiters might miss the reset
+		 * pending state and not properly drop locks, resulting in
+		 * deadlocks with the reset work.
+		 */
+		if (mutex_trylock(&dev_priv->drm.struct_mutex)) {
+			i915_reset(dev_priv);
+			mutex_unlock(&dev_priv->drm.struct_mutex);
+		}
 
-	intel_finish_reset(dev_priv);
+		/* We need to wait for anyone holding the lock to wakeup */
+	} while (wait_on_bit_timeout(&dev_priv->gpu_error.flags,
+				     I915_RESET_IN_PROGRESS,
+				     TASK_UNINTERRUPTIBLE,
+				     HZ));
 
+	intel_finish_reset(dev_priv);
 	intel_runtime_pm_put(dev_priv);
 
-	if (ret == 0)
+	if (!test_bit(I915_WEDGED, &dev_priv->gpu_error.flags))
 		kobject_uevent_env(kobj,
 				   KOBJ_CHANGE, reset_done_event);
 
-- 
2.9.3

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

  parent reply	other threads:[~2016-09-09 13:12 UTC|newest]

Thread overview: 30+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-09-09 13:11 [CI 01/21] drm/i915: Add a sw fence for collecting up dma fences Chris Wilson
2016-09-09 13:11 ` [CI 02/21] drm/i915: Only queue requests during execlists submission Chris Wilson
2016-09-09 13:11 ` [CI 03/21] drm/i915: Record the position of the workarounds in the tail of the request Chris Wilson
2016-09-09 13:11 ` [CI 04/21] drm/i915: Compute the ELSP register location once Chris Wilson
2016-09-09 13:11 ` [CI 05/21] drm/i915: Reorder submitting the requests to ELSP Chris Wilson
2016-09-09 13:11 ` [CI 06/21] drm/i915: Simplify ELSP queue request tracking Chris Wilson
2016-09-09 13:11 ` [CI 07/21] drm/i915: Separate out reset flags from the reset counter Chris Wilson
2016-09-09 13:11 ` [CI 08/21] drm/i915: Drop local struct_mutex around intel_init_emon[ilk] Chris Wilson
2016-09-09 13:11 ` [CI 09/21] drm/i915: Expand bool interruptible to pass flags to i915_wait_request() Chris Wilson
2016-09-09 13:11 ` [CI 10/21] drm/i915: Mark up all locked waiters Chris Wilson
2016-09-12 19:17   ` Zanoni, Paulo R
2016-09-13 10:59     ` Mika Kuoppala
2016-09-13 12:21       ` Mika Kuoppala
2016-09-14  8:40         ` chris
2016-09-22 19:05           ` Paulo Zanoni
2016-09-23  7:45             ` chris
2016-09-09 13:11 ` [CI 11/21] drm/i915: Perform a direct reset of the GPU from the waiter Chris Wilson
2016-09-09 13:11 ` Chris Wilson [this message]
2016-09-09 13:11 ` [CI 13/21] drm/i915: Update reset path to fix incomplete requests Chris Wilson
2016-09-09 13:11 ` [CI 14/21] drm/i915: Drive request submission through fence callbacks Chris Wilson
2016-09-09 13:11 ` [CI 15/21] drm/i915: Reorder i915_add_request to separate the phases better Chris Wilson
2016-09-09 13:11 ` [CI 16/21] drm/i915: Prepare object synchronisation for asynchronicity Chris Wilson
2016-09-09 13:11 ` [CI 17/21] drm/i915/guc: Prepare for nonblocking execbuf submission Chris Wilson
2016-09-09 13:11 ` [CI 18/21] drm/i915: Ignore valid but unknown semaphores Chris Wilson
2016-09-09 13:11 ` [CI 19/21] drm/i915: Avoid incrementing hangcheck whilst waiting for external fence Chris Wilson
2016-09-09 13:12 ` [CI 20/21] drm/i915: Nonblocking request submission Chris Wilson
2016-09-09 13:12 ` [CI 21/21] drm/i915: Serialise execbuf operation after a dma-buf reservation object Chris Wilson
2016-09-12  8:23 ` ✓ Fi.CI.BAT: success for series starting with [CI,01/21] drm/i915: Add a sw fence for collecting up dma fences Patchwork
  -- strict thread matches above, loose matches on Subject: below --
2016-09-09 11:00 [CI 01/21] " Chris Wilson
2016-09-09 11:00 ` [CI 12/21] drm/i915: Replace wait-on-mutex with wait-on-bit in reset worker Chris Wilson
2016-09-09  7:20 [CI 01/21] drm/i915: Add a sw fence for collecting up dma fences Chris Wilson
2016-09-09  7:20 ` [CI 12/21] drm/i915: Replace wait-on-mutex with wait-on-bit in reset worker Chris Wilson

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20160909131201.16673-12-chris@chris-wilson.co.uk \
    --to=chris@chris-wilson.co.uk \
    --cc=intel-gfx@lists.freedesktop.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.