dri-devel.lists.freedesktop.org archive mirror
* [PATCH v4 0/7] Allow error capture without a request & fix locking issues
@ 2023-01-20 23:28 John.C.Harrison
From: John.C.Harrison @ 2023-01-20 23:28 UTC
  To: Intel-GFX; +Cc: John Harrison, DRI-Devel

From: John Harrison <John.C.Harrison@Intel.com>

It is technically possible to get a hung context without a valid
request. In such a situation, try to provide as much information in
the error capture as possible rather than just aborting and capturing
nothing.
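
As an illustration only, the idea of capturing what is available when no
request exists can be sketched as below. The toy_* types and function are
hypothetical stand-ins invented for this example; they are not the i915
structures or the code in this series.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not the i915 data structures. */
struct toy_request { unsigned int seqno; };
struct toy_context { const char *name; };

struct toy_capture {
	const char *context_name;
	bool has_request;
	unsigned int seqno;	/* only meaningful when has_request is true */
};

/*
 * Record whatever is known about the hang. A missing request is noted in
 * the capture rather than treated as a reason to abort the whole capture.
 */
static bool toy_capture_hang(const struct toy_context *ce,
			     const struct toy_request *rq,
			     struct toy_capture *out)
{
	if (!ce || !out)
		return false;	/* nothing useful can be recorded */

	out->context_name = ce->name;
	out->has_request = rq != NULL;
	out->seqno = rq ? rq->seqno : 0;
	return true;
}
```

The capture still succeeds with a NULL request and simply marks the
request-specific fields as absent.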

Similarly, in the case of an engine reset failure, the GuC is not able
to report the guilty context. So try a manual search instead of
reporting nothing.

While doing all this, it was noticed that the locking was broken in a
number of places when searching for hung requests and dumping request
info. So fix all that up as well.
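
The general pattern behind this kind of fix can be sketched as follows:
walk the list only while holding the lock that protects it, and take a
reference on the request that was found before dropping that lock. This is
a minimal user-space sketch under stated assumptions. The toy_* names and
the simple C11 spinlock are hypothetical and do not reflect the actual
i915 locks or helpers.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not the i915 data structures. */
struct toy_request {
	struct toy_request *next;
	int refcount;		/* protected by toy_engine.lock in this sketch */
	bool completed;
	unsigned int seqno;
};

struct toy_engine {
	atomic_flag lock;	/* toy spinlock guarding the request list */
	struct toy_request *requests;
};

static void toy_lock(atomic_flag *lock)
{
	while (atomic_flag_test_and_set_explicit(lock, memory_order_acquire))
		;	/* spin until the flag was previously clear */
}

static void toy_unlock(atomic_flag *lock)
{
	atomic_flag_clear_explicit(lock, memory_order_release);
}

/*
 * Return the first incomplete request with an extra reference held, or
 * NULL if every request has completed. The reference is taken before the
 * lock is released so the request cannot go away under the caller.
 */
static struct toy_request *toy_find_hung_request(struct toy_engine *engine)
{
	struct toy_request *rq, *hung = NULL;

	toy_lock(&engine->lock);
	for (rq = engine->requests; rq; rq = rq->next) {
		if (!rq->completed) {
			rq->refcount++;
			hung = rq;
			break;
		}
	}
	toy_unlock(&engine->lock);

	return hung;
}

/* Drop the reference taken by toy_find_hung_request(). */
static void toy_request_put(struct toy_engine *engine, struct toy_request *rq)
{
	toy_lock(&engine->lock);
	assert(rq->refcount > 0);
	rq->refcount--;
	toy_unlock(&engine->lock);
}
```

A caller would dump or capture the returned request and then call
toy_request_put() once it is done with it.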

v2: Tidy up code flow in error capture. Reword some comments/messages.
(review feedback from Tvrtko)
Also fix up request locking issues from earlier changes noticed during
code review of this change.
v3: Fix some potential null pointer derefs and a reference leak.
Add new patch to refactor the duplicated hung request search code into
a common backend-agnostic wrapper function and use the correct
spinlocks for the correct lists. Also tweak some of the patch
descriptions for better accuracy.
v4: Shuffle some code around to more appropriate source files. Fix
potential leak of GuC capture object after code flow re-org and pull
improved info message earlier (Daniele). Also rename the GuC capture
object to be more consistent.

Signed-off-by: John Harrison <John.C.Harrison@Intel.com>


John Harrison (7):
  drm/i915: Fix request locking during error capture & debugfs dump
  drm/i915: Fix up locking around dumping requests lists
  drm/i915: Allow error capture without a request
  drm/i915: Allow error capture of a pending request
  drm/i915/guc: Look for a guilty context when an engine reset fails
  drm/i915/guc: Add a debug print on GuC triggered reset
  drm/i915/guc: Rename GuC register state capture node to be more
    obvious

 drivers/gpu/drm/i915/gt/intel_context.c       |  4 +-
 drivers/gpu/drm/i915/gt/intel_context.h       |  3 +-
 drivers/gpu/drm/i915/gt/intel_engine.h        |  4 +-
 drivers/gpu/drm/i915/gt/intel_engine_cs.c     | 74 ++++++++-------
 .../drm/i915/gt/intel_execlists_submission.c  | 27 ++++++
 .../drm/i915/gt/intel_execlists_submission.h  |  4 +
 .../gpu/drm/i915/gt/uc/intel_guc_capture.c    |  8 +-
 .../gpu/drm/i915/gt/uc/intel_guc_submission.c | 35 ++++++-
 drivers/gpu/drm/i915/i915_gpu_error.c         | 92 ++++++++++---------
 drivers/gpu/drm/i915/i915_gpu_error.h         |  2 +-
 10 files changed, 160 insertions(+), 93 deletions(-)

-- 
2.39.0



Thread overview: 20+ messages
2023-01-20 23:28 [PATCH v4 0/7] Allow error capture without a request & fix locking issues John.C.Harrison
2023-01-20 23:28 ` [PATCH v4 1/7] drm/i915: Fix request locking during error capture & debugfs dump John.C.Harrison
2023-01-23 17:51   ` Tvrtko Ursulin
2023-01-23 20:35     ` John Harrison
2023-01-25 22:04     ` John Harrison
2023-01-20 23:28 ` [PATCH v4 2/7] drm/i915: Fix up locking around dumping requests lists John.C.Harrison
2023-01-20 23:40   ` John Harrison
2023-01-24 14:40   ` [Intel-gfx] " Tvrtko Ursulin
2023-01-25 18:00     ` John Harrison
2023-01-25 18:12       ` Tvrtko Ursulin
2023-01-25 18:17         ` John Harrison
2023-01-25  0:31   ` Ceraolo Spurio, Daniele
2023-01-20 23:28 ` [PATCH v4 3/7] drm/i915: Allow error capture without a request John.C.Harrison
2023-01-25  0:39   ` [Intel-gfx] " Ceraolo Spurio, Daniele
2023-01-25  0:56     ` John Harrison
2023-01-20 23:28 ` [PATCH v4 4/7] drm/i915: Allow error capture of a pending request John.C.Harrison
2023-01-20 23:28 ` [PATCH v4 5/7] drm/i915/guc: Look for a guilty context when an engine reset fails John.C.Harrison
2023-01-20 23:28 ` [PATCH v4 6/7] drm/i915/guc: Add a debug print on GuC triggered reset John.C.Harrison
2023-01-20 23:28 ` [PATCH v4 7/7] drm/i915/guc: Rename GuC register state capture node to be more obvious John.C.Harrison
2023-01-25  0:44   ` [Intel-gfx] " Ceraolo Spurio, Daniele
