stable.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 02/24] drm/i915/gt: Ignore repeated attempts to suspend request flow across reset
       [not found] <20201204140315.24341-1-chris@chris-wilson.co.uk>
@ 2020-12-04 14:02 ` Chris Wilson
  2020-12-04 15:03   ` [Intel-gfx] " Mika Kuoppala
  2020-12-04 14:02 ` [PATCH 03/24] drm/i915/gt: Cancel the preemption timeout on responding to it Chris Wilson
  1 sibling, 1 reply; 4+ messages in thread
From: Chris Wilson @ 2020-12-04 14:02 UTC (permalink / raw)
  To: intel-gfx; +Cc: tvrtko.ursulin, Chris Wilson, stable

Before reseting the engine, we suspend the execution of the guilty
request, so that we can continue execution with a new context while we
slowly compress the captured error state for the guilty context. However,
if the reset fails, we will promptly attempt to reset the same request
again, and discover the ongoing capture. Ignore the second attempt to
suspend and capture the same request.

Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/1168
Fixes: 32ff621fd744 ("drm/i915/gt: Allow temporary suspension of inflight requests")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: <stable@vger.kernel.org> # v5.7+
---
 drivers/gpu/drm/i915/gt/intel_lrc.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
index 43703efb36d1..1d209a8a95e8 100644
--- a/drivers/gpu/drm/i915/gt/intel_lrc.c
+++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
@@ -2823,6 +2823,9 @@ static void __execlists_hold(struct i915_request *rq)
 static bool execlists_hold(struct intel_engine_cs *engine,
 			   struct i915_request *rq)
 {
+	if (i915_request_on_hold(rq))
+		return false;
+
 	spin_lock_irq(&engine->active.lock);
 
 	if (i915_request_completed(rq)) { /* too late! */
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 4+ messages in thread

* [PATCH 03/24] drm/i915/gt: Cancel the preemption timeout on responding to it
       [not found] <20201204140315.24341-1-chris@chris-wilson.co.uk>
  2020-12-04 14:02 ` [PATCH 02/24] drm/i915/gt: Ignore repeated attempts to suspend request flow across reset Chris Wilson
@ 2020-12-04 14:02 ` Chris Wilson
  2020-12-04 15:04   ` [Intel-gfx] " Mika Kuoppala
  1 sibling, 1 reply; 4+ messages in thread
From: Chris Wilson @ 2020-12-04 14:02 UTC (permalink / raw)
  To: intel-gfx; +Cc: tvrtko.ursulin, Chris Wilson, stable

We currently presume that the engine reset is successful, cancelling the
expired preemption timer in the process. However, engine resets can
fail, leaving the timeout still pending and we will then respond to the
timeout again next time the tasklet fires. What we want is for the
failed engine reset to be promoted to a full device reset, which is
kicked by the heartbeat once the engine stops processing events.

Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/1168
Fixes: 3a7a92aba8fb ("drm/i915/execlists: Force preemption")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: <stable@vger.kernel.org> # v5.5+
---
 drivers/gpu/drm/i915/gt/intel_lrc.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
index 1d209a8a95e8..7f25894e41d5 100644
--- a/drivers/gpu/drm/i915/gt/intel_lrc.c
+++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
@@ -3209,8 +3209,10 @@ static void execlists_submission_tasklet(unsigned long data)
 		spin_unlock_irqrestore(&engine->active.lock, flags);
 
 		/* Recheck after serialising with direct-submission */
-		if (unlikely(timeout && preempt_timeout(engine)))
+		if (unlikely(timeout && preempt_timeout(engine))) {
+			cancel_timer(&engine->execlists.preempt);
 			execlists_reset(engine, "preemption time out");
+		}
 	}
 }
 
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [Intel-gfx] [PATCH 02/24] drm/i915/gt: Ignore repeated attempts to suspend request flow across reset
  2020-12-04 14:02 ` [PATCH 02/24] drm/i915/gt: Ignore repeated attempts to suspend request flow across reset Chris Wilson
@ 2020-12-04 15:03   ` Mika Kuoppala
  0 siblings, 0 replies; 4+ messages in thread
From: Mika Kuoppala @ 2020-12-04 15:03 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx; +Cc: Chris Wilson, stable

Chris Wilson <chris@chris-wilson.co.uk> writes:

> Before reseting the engine, we suspend the execution of the guilty
> request, so that we can continue execution with a new context while we
> slowly compress the captured error state for the guilty context. However,
> if the reset fails, we will promptly attempt to reset the same request
> again, and discover the ongoing capture. Ignore the second attempt to
> suspend and capture the same request.
>
> Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/1168
> Fixes: 32ff621fd744 ("drm/i915/gt: Allow temporary suspension of inflight requests")
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: <stable@vger.kernel.org> # v5.7+

Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>

> ---
>  drivers/gpu/drm/i915/gt/intel_lrc.c | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
> index 43703efb36d1..1d209a8a95e8 100644
> --- a/drivers/gpu/drm/i915/gt/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
> @@ -2823,6 +2823,9 @@ static void __execlists_hold(struct i915_request *rq)
>  static bool execlists_hold(struct intel_engine_cs *engine,
>  			   struct i915_request *rq)
>  {
> +	if (i915_request_on_hold(rq))
> +		return false;
> +
>  	spin_lock_irq(&engine->active.lock);
>  
>  	if (i915_request_completed(rq)) { /* too late! */
> -- 
> 2.20.1
>
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [Intel-gfx] [PATCH 03/24] drm/i915/gt: Cancel the preemption timeout on responding to it
  2020-12-04 14:02 ` [PATCH 03/24] drm/i915/gt: Cancel the preemption timeout on responding to it Chris Wilson
@ 2020-12-04 15:04   ` Mika Kuoppala
  0 siblings, 0 replies; 4+ messages in thread
From: Mika Kuoppala @ 2020-12-04 15:04 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx; +Cc: Chris Wilson, stable

Chris Wilson <chris@chris-wilson.co.uk> writes:

> We currently presume that the engine reset is successful, cancelling the
> expired preemption timer in the process. However, engine resets can
> fail, leaving the timeout still pending and we will then respond to the
> timeout again next time the tasklet fires. What we want is for the
> failed engine reset to be promoted to a full device reset, which is
> kicked by the heartbeat once the engine stops processing events.
>
> Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/1168
> Fixes: 3a7a92aba8fb ("drm/i915/execlists: Force preemption")
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: <stable@vger.kernel.org> # v5.5+

Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>

> ---
>  drivers/gpu/drm/i915/gt/intel_lrc.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
> index 1d209a8a95e8..7f25894e41d5 100644
> --- a/drivers/gpu/drm/i915/gt/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
> @@ -3209,8 +3209,10 @@ static void execlists_submission_tasklet(unsigned long data)
>  		spin_unlock_irqrestore(&engine->active.lock, flags);
>  
>  		/* Recheck after serialising with direct-submission */
> -		if (unlikely(timeout && preempt_timeout(engine)))
> +		if (unlikely(timeout && preempt_timeout(engine))) {
> +			cancel_timer(&engine->execlists.preempt);
>  			execlists_reset(engine, "preemption time out");
> +		}
>  	}
>  }
>  
> -- 
> 2.20.1
>
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2020-12-04 15:08 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <20201204140315.24341-1-chris@chris-wilson.co.uk>
2020-12-04 14:02 ` [PATCH 02/24] drm/i915/gt: Ignore repeated attempts to suspend request flow across reset Chris Wilson
2020-12-04 15:03   ` [Intel-gfx] " Mika Kuoppala
2020-12-04 14:02 ` [PATCH 03/24] drm/i915/gt: Cancel the preemption timeout on responding to it Chris Wilson
2020-12-04 15:04   ` [Intel-gfx] " Mika Kuoppala

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).