From: John.C.Harrison@Intel.com
To: Intel-GFX@Lists.FreeDesktop.Org
Cc: Matthew Brost <matthew.brost@intel.com>,
	Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>,
	Chris Wilson <chris.p.wilson@intel.com>,
	Michael Cheng <michael.cheng@intel.com>,
	Alan Previn <alan.previn.teres.alexis@intel.com>,
	Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>,
	Matthew Auld <matthew.auld@intel.com>,
	Lucas De Marchi <lucas.demarchi@intel.com>,
	Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>,
	DRI-Devel@Lists.FreeDesktop.Org,
	Rodrigo Vivi <rodrigo.vivi@intel.com>,
	Tejas Upadhyay <tejaskumarx.surendrakumar.upadhyay@intel.com>,
	intel-gfx@lists.freedesktop.org,
	John Harrison <John.C.Harrison@Intel.com>,
	Bruce Chang <yu.bruce.chang@intel.com>
Subject: [PATCH v5 1/8] drm/i915/guc: Fix locking when searching for a hung request
Date: Wed, 25 Jan 2023 16:54:13 -0800
Message-ID: <20230126005420.160070-2-John.C.Harrison@Intel.com>
In-Reply-To: <20230126005420.160070-1-John.C.Harrison@Intel.com>

From: John Harrison <John.C.Harrison@Intel.com>

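intel_guc_find_hung_context() was walking ce->guc_state.requests
without holding ce->guc_state.lock, the spinlock that protects that
list. Take the lock around the search and record the result in a local
flag, so that intel_engine_set_hung_context() is only called after the
lock has been dropped. While at it, add some blank lines for
readability.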

Fixes: dc0dad365c5e ("drm/i915/guc: Fix for error capture after full GPU reset with GuC")
Cc: John Harrison <John.C.Harrison@Intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Cc: Michael Cheng <michael.cheng@intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Tejas Upadhyay <tejaskumarx.surendrakumar.upadhyay@intel.com>
Cc: Chris Wilson <chris.p.wilson@intel.com>
Cc: Bruce Chang <yu.bruce.chang@intel.com>
Cc: Alan Previn <alan.previn.teres.alexis@intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: intel-gfx@lists.freedesktop.org
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
---
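Note for readers: the shape of the fix (search the list under its lock,
remember the result in a flag, act on it only after unlocking) can be
sketched in plain userspace C. This is an illustrative sketch only, not
the driver code: struct request, struct context, enum req_state and both
helper functions are hypothetical stand-ins, and a pthread mutex is
assumed in place of the kernel spinlock.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the driver's request and context types. */
enum req_state { REQ_PENDING, REQ_ACTIVE, REQ_COMPLETE };

struct request {
	enum req_state state;
	struct request *next;
};

struct context {
	pthread_mutex_t lock;		/* protects 'requests' (assumed analogue of ce->guc_state.lock) */
	struct request *requests;	/* singly linked list of requests */
	bool hung;			/* set once the context is flagged as hung */
};

/* Walk the request list with the lock held and report whether any is active. */
static bool context_has_active_request(struct context *ce)
{
	bool found = false;
	struct request *rq;

	pthread_mutex_lock(&ce->lock);
	for (rq = ce->requests; rq; rq = rq->next) {
		if (rq->state != REQ_ACTIVE)
			continue;

		found = true;
		break;
	}
	pthread_mutex_unlock(&ce->lock);

	return found;
}

/* Act on the search result only after the list lock has been released. */
static bool mark_if_hung(struct context *ce)
{
	if (!context_has_active_request(ce))
		return false;

	ce->hung = true;
	return true;
}
```

Keeping the follow-up work outside the locked region means the lock is
held only for the list walk itself, which mirrors the 'found' flag
introduced by the diff below.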
 drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
index b436dd7f12e42..3b34a82d692be 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
@@ -4820,6 +4820,8 @@ void intel_guc_find_hung_context(struct intel_engine_cs *engine)
 
 	xa_lock_irqsave(&guc->context_lookup, flags);
 	xa_for_each(&guc->context_lookup, index, ce) {
+		bool found;
+
 		if (!kref_get_unless_zero(&ce->ref))
 			continue;
 
@@ -4836,10 +4838,18 @@ void intel_guc_find_hung_context(struct intel_engine_cs *engine)
 				goto next;
 		}
 
+		found = false;
+		spin_lock(&ce->guc_state.lock);
 		list_for_each_entry(rq, &ce->guc_state.requests, sched.link) {
 			if (i915_test_request_state(rq) != I915_REQUEST_ACTIVE)
 				continue;
 
+			found = true;
+			break;
+		}
+		spin_unlock(&ce->guc_state.lock);
+
+		if (found) {
 			intel_engine_set_hung_context(engine, ce);
 
 			/* Can only cope with one hang at a time... */
@@ -4847,6 +4857,7 @@ void intel_guc_find_hung_context(struct intel_engine_cs *engine)
 			xa_lock(&guc->context_lookup);
 			goto done;
 		}
+
 next:
 		intel_context_put(ce);
 		xa_lock(&guc->context_lookup);
-- 
2.39.1


Thread overview: 25+ messages
2023-01-26  0:54 [PATCH v5 0/8] Allow error capture without a request & fix locking issues John.C.Harrison
2023-01-26  0:54 ` [Intel-gfx] " John.C.Harrison
2023-01-26  0:54 ` John.C.Harrison [this message]
2023-01-26  0:54   ` [Intel-gfx] [PATCH v5 1/8] drm/i915/guc: Fix locking when searching for a hung request John.C.Harrison
2023-01-26 23:45   ` Ceraolo Spurio, Daniele
2023-01-26 23:45     ` [Intel-gfx] " Ceraolo Spurio, Daniele
2023-01-26  0:54 ` [PATCH v5 2/8] drm/i915: Fix request locking during error capture & debugfs dump John.C.Harrison
2023-01-26  0:54   ` [Intel-gfx] " John.C.Harrison
2023-01-26  9:06   ` Tvrtko Ursulin
2023-01-26  9:06     ` [Intel-gfx] " Tvrtko Ursulin
2023-01-26  0:54 ` [PATCH v5 3/8] drm/i915: Fix up locking around dumping requests lists John.C.Harrison
2023-01-26  0:54   ` [Intel-gfx] " John.C.Harrison
2023-01-26  0:54 ` [PATCH v5 4/8] drm/i915: Allow error capture without a request John.C.Harrison
2023-01-26  0:54   ` [Intel-gfx] " John.C.Harrison
2023-01-26  0:54 ` [PATCH v5 5/8] drm/i915: Allow error capture of a pending request John.C.Harrison
2023-01-26  0:54   ` [Intel-gfx] " John.C.Harrison
2023-01-26  0:54 ` [PATCH v5 6/8] drm/i915/guc: Look for a guilty context when an engine reset fails John.C.Harrison
2023-01-26  0:54   ` [Intel-gfx] " John.C.Harrison
2023-01-26  0:54 ` [PATCH v5 7/8] drm/i915/guc: Add a debug print on GuC triggered reset John.C.Harrison
2023-01-26  0:54   ` [Intel-gfx] " John.C.Harrison
2023-01-26  0:54 ` [PATCH v5 8/8] drm/i915/guc: Rename GuC register state capture node to be more obvious John.C.Harrison
2023-01-26  0:54   ` [Intel-gfx] " John.C.Harrison
2023-01-26  4:24 ` [Intel-gfx] ✗ Fi.CI.SPARSE: warning for Allow error capture without a request & fix locking issues (rev3) Patchwork
2023-01-26  4:50 ` [Intel-gfx] ✓ Fi.CI.BAT: success " Patchwork
2023-01-26 15:21 ` [Intel-gfx] ✓ Fi.CI.IGT: " Patchwork
