From: Chris Wilson <chris@chris-wilson.co.uk>
To: intel-gfx@lists.freedesktop.org
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Subject: [Intel-gfx] [PATCH 02/11] drm/i915/gt: Don't declare hangs if engine is stalled
Date: Thu, 28 May 2020 08:41:00 +0100
Message-ID: <20200528074109.28235-2-chris@chris-wilson.co.uk>
In-Reply-To: <20200528074109.28235-1-chris@chris-wilson.co.uk>

If ring submission is stalled on an external request, nothing can be
submitted, not even the heartbeat in the kernel context. Since nothing
is running, resetting the engine/device does not unblock the system and
is pointless. Before declaring a hang, check whether the heartbeat
request has actually been submitted.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 .../gpu/drm/i915/gt/intel_engine_heartbeat.c  | 19 ++++++++++++++++---
 1 file changed, 16 insertions(+), 3 deletions(-)
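
Note: the decision described in the commit message can be modelled as a
small standalone sketch. It is not driver code. Every name below
(hb_model, hb_action, heartbeat_action, MODEL_PRIORITY_BARRIER) is
hypothetical, the barrier value is a placeholder, and the final
"declare hang" branch is an assumption about what follows the hunk
shown in this patch.

```c
#include <stdbool.h>

/* Placeholder threshold; NOT the real value of I915_PRIORITY_BARRIER. */
#define MODEL_PRIORITY_BARRIER 1000

/* What the heartbeat worker decides to do on a tick (hypothetical enum). */
enum hb_action {
	HB_WAIT,           /* request never submitted: engine stalled, a reset would not help */
	HB_RAISE_PRIORITY, /* submitted, and its priority can still be raised */
	HB_DECLARE_HANG,   /* submitted, cannot be boosted further, still not completed */
};

/* Simplified view of an outstanding heartbeat request (hypothetical struct). */
struct hb_model {
	bool submit_signaled; /* stands in for i915_sw_fence_signaled(&rq->submit) */
	bool has_scheduler;   /* stands in for engine->schedule being non-NULL */
	int priority;         /* stands in for rq->sched.attr.priority */
};

/* Mirror the order of the checks in the patched if/else-if chain. */
static enum hb_action heartbeat_action(const struct hb_model *hb)
{
	if (!hb->submit_signaled)
		return HB_WAIT;

	if (hb->has_scheduler && hb->priority < MODEL_PRIORITY_BARRIER)
		return HB_RAISE_PRIORITY;

	/* Assumed fallthrough: report the stopped heartbeat and reset. */
	return HB_DECLARE_HANG;
}
```

The point of the ordering is that the submit check comes first: an
unsubmitted heartbeat is never escalated or treated as a hang, however
long it has been outstanding.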

diff --git a/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c b/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c
index 5136c8bf112d..f67ad937eefb 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c
@@ -48,8 +48,10 @@ static void show_heartbeat(const struct i915_request *rq,
 	struct drm_printer p = drm_debug_printer("heartbeat");
 
 	intel_engine_dump(engine, &p,
-			  "%s heartbeat {prio:%d} not ticking\n",
+			  "%s heartbeat {seqno:%llx:%lld, prio:%d} not ticking\n",
 			  engine->name,
+			  rq->fence.context,
+			  rq->fence.seqno,
 			  rq->sched.attr.priority);
 }
 
@@ -76,8 +78,19 @@ static void heartbeat(struct work_struct *wrk)
 		goto out;
 
 	if (engine->heartbeat.systole) {
-		if (engine->schedule &&
-		    rq->sched.attr.priority < I915_PRIORITY_BARRIER) {
+		if (!i915_sw_fence_signaled(&rq->submit)) {
+			/*
+			 * Not yet submitted, system is stalled.
+			 *
+			 * This more often happens for ring submission,
+			 * where all contexts are funnelled into a common
+			 * ringbuffer. If one context is blocked on an
+			 * external fence, not only is it not submitted,
+			 * but all other contexts, including the kernel
+			 * context are stuck waiting for the signal.
+			 */
+		} else if (engine->schedule &&
+			   rq->sched.attr.priority < I915_PRIORITY_BARRIER) {
 			/*
 			 * Gradually raise the priority of the heartbeat to
 			 * give high priority work [which presumably desires
-- 
2.20.1