Intel-GFX Archive on lore.kernel.org
From: Daniel Vetter <daniel.vetter@ffwll.ch>
To: Intel Graphics Development <intel-gfx@lists.freedesktop.org>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Subject: [PATCH] drm/i915: clarify concurrent hang detect/gpu reset consistency
Date: Thu,  6 Dec 2012 16:23:37 +0100
Message-ID: <1354807417-16318-1-git-send-email-daniel.vetter@ffwll.ch>
In-Reply-To: <CAPX-8+8UDSpi=WDyR5M2eAbvYQ59YvUb_pjJE+_ZJMZV6hjeZQ@mail.gmail.com>

Damien Lespiau wondered how racy the gpu reset/hang detection code is
against concurrent gpu resets/hang detections or combinations thereof.
Luckily the single work item is guaranteed to never run concurrently
with itself, so reset handling is already single-threaded.

Hence we only have to worry about concurrent hang detections, or a
hang detection firing off while we're still processing an older gpu
reset request. Due to the new mechanism of setting the reset-in-progress
flag and the ordering guaranteed by schedule_work(), there's nothing to
do but add a comment explaining why we're safe.
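
To make the ordering argument concrete, here is a minimal single-threaded
userspace model of it. It is only a sketch, not the driver code: the struct,
the function names, and the encoding of the in-progress flag as the low bit
of the counter are assumptions made for illustration. The one kernel
behaviour it relies on is that schedule_work() does not queue a work item a
second time while it is still pending.

```c
#include <stdbool.h>

/* Assumed encoding for this sketch: the low bit marks a pending reset. */
#define RESET_IN_PROGRESS_FLAG 1u

/* Hypothetical stand-in for the driver's error state plus its work item. */
struct gpu_error_model {
	unsigned int reset_counter; /* odd while a reset is pending */
	bool work_pending;          /* models the workqueue's pending bit */
	int resets_performed;
};

/*
 * Hang detection: set the flag first, then queue the work. Returns true
 * if the work was newly queued, false if it was already pending.
 */
static bool detect_hang(struct gpu_error_model *e)
{
	e->reset_counter |= RESET_IN_PROGRESS_FLAG;
	if (e->work_pending)
		return false;
	e->work_pending = true;
	return true;
}

/* The single work item: runs at most once per queueing. */
static void run_error_work(struct gpu_error_model *e)
{
	if (!e->work_pending)
		return;
	/* The pending bit is cleared before the body runs. */
	e->work_pending = false;

	if (e->reset_counter & RESET_IN_PROGRESS_FLAG) {
		e->resets_performed++;
		e->reset_counter++; /* odd -> even: reset finished */
	}
}
```

In this model, two hang detections that arrive before the work runs collapse
into one reset, and a detection that arrives afterwards queues the work again
and triggers a fresh one.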

The only thing I've noticed is that we still try to reset the gpu,
even when it has been declared terminally wedged. Add a check for that
to avoid continuous warnings about failed resets, in case the hangcheck
timer ever gets stuck.
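
Why the extra check matters can be shown with a small userspace sketch. The
constants and helper names below are assumptions for illustration, not
quoted from the driver: the model supposes that the in-progress flag is the
low bit of the reset counter and that the terminally wedged state is an
all-ones counter value, so a wedged gpu also looks like a reset in progress.

```c
#include <stdbool.h>

/* Assumed encoding for this sketch only. */
#define RESET_IN_PROGRESS_FLAG 1u
#define WEDGED                 0xffffffffu

/* True if the low bit of the counter marks a pending reset. */
static bool model_reset_in_progress(unsigned int reset_counter)
{
	return (reset_counter & RESET_IN_PROGRESS_FLAG) != 0;
}

/* True only for the special all-ones "gpu is dead" value. */
static bool model_terminally_wedged(unsigned int reset_counter)
{
	return reset_counter == WEDGED;
}

/* Condition before the patch: a wedged gpu still triggers a reset. */
static bool should_reset_old(unsigned int reset_counter)
{
	return model_reset_in_progress(reset_counter);
}

/* Condition after the patch: skip the reset once terminally wedged. */
static bool should_reset_new(unsigned int reset_counter)
{
	return model_reset_in_progress(reset_counter) &&
	       !model_terminally_wedged(reset_counter);
}
```

Under these assumptions the old condition keeps attempting a reset for the
wedged value, while the new one still resets on an ordinary pending reset
but leaves a wedged gpu alone.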

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
---
 drivers/gpu/drm/i915/i915_irq.c | 15 +++++++++++++--
 1 file changed, 13 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index f7f9c21..09c33ac 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -851,9 +851,20 @@ static void i915_error_work_func(struct work_struct *work)
 
 	kobject_uevent_env(&dev->primary->kdev.kobj, KOBJ_CHANGE, error_event);
 
-	if (i915_reset_in_progress(error)) {
+	/*
+	 * Note that there's only one work item which does gpu resets, so we
+	 * need not worry about concurrent gpu resets potentially incrementing
+	 * error->reset_counter twice. We only need to take care of another
+	 * racing irq/hangcheck declaring the gpu dead for a second time. A
+	 * quick check for that is good enough: schedule_work ensures the
+	 * correct ordering between hang detection and this work item, and since
+	 * the reset in-progress bit is only ever set by code outside of this
+	 * work we don't need to worry about any other races.
+	 */
+	if (i915_reset_in_progress(error) && !i915_terminally_wedged(error)) {
 		DRM_DEBUG_DRIVER("resetting chip\n");
-		kobject_uevent_env(&dev->primary->kdev.kobj, KOBJ_CHANGE, reset_event);
+		kobject_uevent_env(&dev->primary->kdev.kobj, KOBJ_CHANGE,
+				   reset_event);
 
 		ret = i915_reset(dev);
 
-- 
1.7.11.7

Thread overview: 25+ messages
2012-11-14 16:14 [PATCH 0/6] robustify reset transitions Daniel Vetter
2012-11-14 16:14 ` [PATCH 1/6] drm/i915: move dev_priv->mm out of line Daniel Vetter
2012-12-04 16:31   ` Damien Lespiau
2012-11-14 16:14 ` [PATCH 2/6] drm/i915: extract hangcheck/reset/error_state state into substruct Daniel Vetter
2012-12-04 17:20   ` Damien Lespiau
2012-11-14 16:14 ` [PATCH 3/6] drm/i915: move wedged to the other gpu error handling stuff Daniel Vetter
2012-12-04 17:24   ` Damien Lespiau
2012-11-14 16:14 ` [PATCH 4/6] drm/i915: fix reset handling in the throttle ioctl Daniel Vetter
2012-12-05 14:08   ` Damien Lespiau
2012-11-14 16:14 ` [PATCH 5/6] drm/i915: clear up wedged transitions Daniel Vetter
2012-11-14 16:14 ` [PATCH 6/6] drm/i915: create a race-free reset detection Daniel Vetter
2012-11-15 16:17 ` [PATCH 1/2] drm/i915: clear up wedged transitions Daniel Vetter
2012-11-15 16:17   ` [PATCH 2/2] drm/i915: create a race-free reset detection Daniel Vetter
2012-12-05 16:35     ` Damien Lespiau
2012-12-06  8:01       ` [PATCH] " Daniel Vetter
2012-12-06 15:23       ` Daniel Vetter [this message]
2013-01-18 20:48       ` [PATCH 2/2] " Daniel Vetter
2013-01-21 12:06         ` Damien Lespiau
2013-01-21 19:15           ` Daniel Vetter
2012-12-05 14:54   ` [PATCH 1/2] drm/i915: clear up wedged transitions Damien Lespiau
2012-12-05 16:38   ` Damien Lespiau
2012-12-05 17:14     ` Daniel Vetter
2014-09-03 20:26   ` Chris Wilson
2014-09-04  6:03     ` Daniel Vetter
2014-09-04  6:11       ` Chris Wilson
