From: Mika Kuoppala <mika.kuoppala@linux.intel.com>
To: Chris Wilson <chris@chris-wilson.co.uk>, intel-gfx@lists.freedesktop.org
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Subject: Re: [Intel-gfx] [PATCH 2/3] drm/i915/gt: Don't declare hangs if engine is stalled
Date: Thu, 28 May 2020 19:23:18 +0300
Message-ID: <871rn4jafd.fsf@gaia.fi.intel.com>
In-Reply-To: <20200528074324.5765-2-chris@chris-wilson.co.uk>

Chris Wilson <chris@chris-wilson.co.uk> writes:

> If the ring submission is stalled on an external request, nothing can be
> submitted, not even the heartbeat in the kernel context. Since nothing
> is running, resetting the engine/device does not unblock the system and
> is pointless. We can see if the heartbeat is supposed to be running
> before declaring foul.
>
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> ---
>  .../gpu/drm/i915/gt/intel_engine_heartbeat.c  | 19 ++++++++++++++++---
>  1 file changed, 16 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c b/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c
> index 5136c8bf112d..f67ad937eefb 100644
> --- a/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c
> +++ b/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c
> @@ -48,8 +48,10 @@ static void show_heartbeat(const struct i915_request *rq,
>  	struct drm_printer p = drm_debug_printer("heartbeat");
>  
>  	intel_engine_dump(engine, &p,
> -			  "%s heartbeat {prio:%d} not ticking\n",
> +			  "%s heartbeat {seqno:%llx:%lld, prio:%d} not ticking\n",
>  			  engine->name,
> +			  rq->fence.context,
> +			  rq->fence.seqno,
>  			  rq->sched.attr.priority);
>  }
>  
> @@ -76,8 +78,19 @@ static void heartbeat(struct work_struct *wrk)
>  		goto out;
>  
>  	if (engine->heartbeat.systole) {
> -		if (engine->schedule &&
> -		    rq->sched.attr.priority < I915_PRIORITY_BARRIER) {
> +		if (!i915_sw_fence_signaled(&rq->submit)) {
> +			/*
> +			 * Not yet submitted, system is stalled.
> +			 *
> +			 * This more often happens for ring submission,
> +			 * where all contexts are funnelled into a common
> +			 * ringbuffer. If one context is blocked on an
> +			 * external fence, not only is it not submitted,
> +			 * but all other contexts, including the kernel
> +			 * context are stuck waiting for the signal.
> +			 */

How to actually rescue the system in this state eludes me.
But piling the heartbeat on top of the stall does not help
in any case.
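
For anyone following along, the decision order this hunk introduces
can be modelled outside the driver. The sketch below is only an
illustration, not i915 code: struct hb_request, enum hb_action,
hb_check() and the HB_PRIORITY_BARRIER value are hypothetical
stand-ins, and the final "declare hang" branch is an assumption taken
from the commit message, since the quoted hunk does not show it.

```c
#include <stdbool.h>

/* Hypothetical outcomes of one heartbeat check; not driver names. */
enum hb_action {
	HB_WAIT_STALLED,	/* never submitted: a reset cannot unblock it */
	HB_BUMP_PRIORITY,	/* submitted but low priority: raise and retry */
	HB_DECLARE_HANG,	/* assumed fallthrough: report hang and reset */
};

/* Hypothetical stand-in for the fields the hunk inspects. */
struct hb_request {
	bool submit_signaled;	/* models i915_sw_fence_signaled(&rq->submit) */
	int priority;		/* models rq->sched.attr.priority */
};

/* Placeholder value; the real I915_PRIORITY_BARRIER differs. */
#define HB_PRIORITY_BARRIER 1000

/*
 * Mirror the order of the checks in the patch: the stall test comes
 * first, so an unsubmitted heartbeat is never treated as a hang.
 */
static enum hb_action hb_check(const struct hb_request *rq, bool has_scheduler)
{
	if (!rq->submit_signaled)
		return HB_WAIT_STALLED;

	if (has_scheduler && rq->priority < HB_PRIORITY_BARRIER)
		return HB_BUMP_PRIORITY;

	return HB_DECLARE_HANG;
}
```

With this model, a request whose submit fence has not signaled always
yields HB_WAIT_STALLED, regardless of its priority or of whether a
scheduler is present, which is the behaviour the commit message asks
for.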

Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>

> +		} else if (engine->schedule &&
> +			   rq->sched.attr.priority < I915_PRIORITY_BARRIER) {
>  			/*
>  			 * Gradually raise the priority of the heartbeat to
>  			 * give high priority work [which presumably desires
> -- 
> 2.20.1
>
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/intel-gfx
