[Intel-gfx] [PATCH 1/3] drm/i915/gt: Cancel submitted requests upon context reset

All of lore.kernel.org
 help / color / mirror / Atom feed

* [Intel-gfx] [PATCH 1/3] drm/i915/gt: Cancel submitted requests upon context reset
@ 2020-12-24 23:01 Chris Wilson
  2020-12-24 23:01 ` [Intel-gfx] [PATCH 2/3] drm/i915/gt: Pull context closure check from request submit to schedule-in Chris Wilson
                   ` (3 more replies)
  0 siblings, 4 replies; 6+ messages in thread
From: Chris Wilson @ 2020-12-24 23:01 UTC (permalink / raw)
  To: intel-gfx; +Cc: Chris Wilson

Since we process schedule-in of a context after submitting the request,
if we decide to reset the context at that time, we also have to cancel
the requets we have marked for submission.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 .../drm/i915/gt/intel_execlists_submission.c  | 22 +++++++++++++++++--
 drivers/gpu/drm/i915/i915_request.c           |  2 ++
 2 files changed, 22 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
index 1fae6c6f3868..2123d9566061 100644
--- a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
+++ b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
@@ -466,6 +466,23 @@ static void intel_engine_context_out(struct intel_engine_cs *engine)
 	write_sequnlock_irqrestore(&engine->stats.lock, flags);
 }
 
+static struct i915_request *
+cancel_requests(const struct intel_timeline * const tl, struct i915_request *rq)
+{
+	struct i915_request *active = rq;
+
+	list_for_each_entry_from_reverse(rq, &tl->requests, link) {
+		if (__i915_request_is_complete(rq))
+			break;
+
+		i915_request_set_error_once(rq, -EIO);
+		__i915_request_skip(rq);
+		active = rq;
+	}
+
+	return active;
+}
+
 static void reset_active(struct i915_request *rq,
 			 struct intel_engine_cs *engine)
 {
@@ -487,14 +504,15 @@ static void reset_active(struct i915_request *rq,
 	 * remain correctly ordered. And we defer to __i915_request_submit()
 	 * so that all asynchronous waits are correctly handled.
 	 */
-	ENGINE_TRACE(engine, "{ rq=%llx:%lld }\n",
+	rq = cancel_requests(ce->timeline, rq);
+	ENGINE_TRACE(engine, "{ reset rq=%llx:%lld }\n",
 		     rq->fence.context, rq->fence.seqno);
 
 	/* On resubmission of the active request, payload will be scrubbed */
 	if (__i915_request_is_complete(rq))
 		head = rq->tail;
 	else
-		head = active_request(ce->timeline, rq)->head;
+		head = rq->head;
 	head = intel_ring_wrap(ce->ring, head);
 
 	/* Scrub the context image to prevent replaying the previous batch */
diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
index 6578faf6eed8..ad3b6a4f424f 100644
--- a/drivers/gpu/drm/i915/i915_request.c
+++ b/drivers/gpu/drm/i915/i915_request.c
@@ -490,6 +490,8 @@ void __i915_request_skip(struct i915_request *rq)
 	if (rq->infix == rq->postfix)
 		return;
 
+	RQ_TRACE(rq, "error: %d\n", rq->fence.error);
+
 	/*
 	 * As this request likely depends on state from the lost
 	 * context, clear out all the user operations leaving the
-- 
2.20.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [Intel-gfx] [PATCH 2/3] drm/i915/gt: Pull context closure check from request submit to schedule-in
  2020-12-24 23:01 [Intel-gfx] [PATCH 1/3] drm/i915/gt: Cancel submitted requests upon context reset Chris Wilson
@ 2020-12-24 23:01 ` Chris Wilson
  2020-12-24 23:01 ` [Intel-gfx] [PATCH 3/3] drm/i915/gem: Peek at the inflight context Chris Wilson
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 6+ messages in thread
From: Chris Wilson @ 2020-12-24 23:01 UTC (permalink / raw)
  To: intel-gfx; +Cc: Chris Wilson

We only need to evaluate the current status of the context when it is
scheduled in, we will force a reschedule when the context is closed
propagating the change to inflight contexts.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Brost <matthew.brost@intel.com>
---
 drivers/gpu/drm/i915/gt/intel_execlists_submission.c | 4 ++++
 drivers/gpu/drm/i915/i915_request.c                  | 4 ----
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
index 2123d9566061..908a7cb98746 100644
--- a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
+++ b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
@@ -530,6 +530,10 @@ __execlists_schedule_in(struct i915_request *rq)
 
 	intel_context_get(ce);
 
+	if (unlikely(intel_context_is_closed(ce) &&
+		     !intel_engine_has_heartbeat(engine)))
+		intel_context_set_banned(ce);
+
 	if (unlikely(intel_context_is_banned(ce)))
 		reset_active(rq, engine);
 
diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
index ad3b6a4f424f..3a9820a9e521 100644
--- a/drivers/gpu/drm/i915/i915_request.c
+++ b/drivers/gpu/drm/i915/i915_request.c
@@ -546,10 +546,6 @@ bool __i915_request_submit(struct i915_request *request)
 	if (i915_request_completed(request))
 		goto xfer;
 
-	if (unlikely(intel_context_is_closed(request->context) &&
-		     !intel_engine_has_heartbeat(engine)))
-		intel_context_set_banned(request->context);
-
 	if (unlikely(intel_context_is_banned(request->context)))
 		i915_request_set_error_once(request, -EIO);
 
-- 
2.20.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [Intel-gfx] [PATCH 3/3] drm/i915/gem: Peek at the inflight context
  2020-12-24 23:01 [Intel-gfx] [PATCH 1/3] drm/i915/gt: Cancel submitted requests upon context reset Chris Wilson
  2020-12-24 23:01 ` [Intel-gfx] [PATCH 2/3] drm/i915/gt: Pull context closure check from request submit to schedule-in Chris Wilson
@ 2020-12-24 23:01 ` Chris Wilson
  2020-12-24 23:14 ` [Intel-gfx] [PATCH] drm/i915/gt: Cancel submitted requests upon context reset Chris Wilson
  2020-12-24 23:56 ` [Intel-gfx] ✗ Fi.CI.BAT: failure for series starting with drm/i915/gt: Cancel submitted requests upon context reset (rev2) Patchwork
  3 siblings, 0 replies; 6+ messages in thread
From: Chris Wilson @ 2020-12-24 23:01 UTC (permalink / raw)
  To: intel-gfx; +Cc: Chris Wilson

If supported by the backend, we can quickly look at the context's
inflight engine rather than search along the active list to confirm.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/gem/i915_gem_context.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c
index c7363036765a..e53a9d40f28b 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
@@ -426,6 +426,10 @@ static struct intel_engine_cs *active_engine(struct intel_context *ce)
 	if (!ce->timeline)
 		return NULL;
 
+	engine = intel_context_inflight(ce);
+	if (engine)
+		return engine;
+
 	/*
 	 * rq->link is only SLAB_TYPESAFE_BY_RCU, we need to hold a reference
 	 * to the request to prevent it being transferred to a new timeline
-- 
2.20.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [Intel-gfx] [PATCH] drm/i915/gt: Cancel submitted requests upon context reset
  2020-12-24 23:01 [Intel-gfx] [PATCH 1/3] drm/i915/gt: Cancel submitted requests upon context reset Chris Wilson
  2020-12-24 23:01 ` [Intel-gfx] [PATCH 2/3] drm/i915/gt: Pull context closure check from request submit to schedule-in Chris Wilson
  2020-12-24 23:01 ` [Intel-gfx] [PATCH 3/3] drm/i915/gem: Peek at the inflight context Chris Wilson
@ 2020-12-24 23:14 ` Chris Wilson
  2020-12-24 23:56 ` [Intel-gfx] ✗ Fi.CI.BAT: failure for series starting with drm/i915/gt: Cancel submitted requests upon context reset (rev2) Patchwork
  3 siblings, 0 replies; 6+ messages in thread
From: Chris Wilson @ 2020-12-24 23:14 UTC (permalink / raw)
  To: intel-gfx; +Cc: Chris Wilson

Since we process schedule-in of a context after submitting the request,
if we decide to reset the context at that time, we also have to cancel
the requets we have marked for submission.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/gt/intel_execlists_submission.c | 8 ++++----
 drivers/gpu/drm/i915/i915_request.c                  | 2 ++
 2 files changed, 6 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
index 1fae6c6f3868..fd7ac25605b2 100644
--- a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
+++ b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
@@ -219,14 +219,14 @@ active_request(const struct intel_timeline * const tl, struct i915_request *rq)
 {
 	struct i915_request *active = rq;
 
-	rcu_read_lock();
-	list_for_each_entry_continue_reverse(rq, &tl->requests, link) {
+	list_for_each_entry_from_reverse(rq, &tl->requests, link) {
 		if (__i915_request_is_complete(rq))
 			break;
 
+		i915_request_set_error_once(rq, -EIO);
+		__i915_request_skip(rq);
 		active = rq;
 	}
-	rcu_read_unlock();
 
 	return active;
 }
@@ -487,7 +487,7 @@ static void reset_active(struct i915_request *rq,
 	 * remain correctly ordered. And we defer to __i915_request_submit()
 	 * so that all asynchronous waits are correctly handled.
 	 */
-	ENGINE_TRACE(engine, "{ rq=%llx:%lld }\n",
+	ENGINE_TRACE(engine, "{ reset rq=%llx:%lld }\n",
 		     rq->fence.context, rq->fence.seqno);
 
 	/* On resubmission of the active request, payload will be scrubbed */
diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
index 6578faf6eed8..ad3b6a4f424f 100644
--- a/drivers/gpu/drm/i915/i915_request.c
+++ b/drivers/gpu/drm/i915/i915_request.c
@@ -490,6 +490,8 @@ void __i915_request_skip(struct i915_request *rq)
 	if (rq->infix == rq->postfix)
 		return;
 
+	RQ_TRACE(rq, "error: %d\n", rq->fence.error);
+
 	/*
 	 * As this request likely depends on state from the lost
 	 * context, clear out all the user operations leaving the
-- 
2.20.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [Intel-gfx] ✗ Fi.CI.BAT: failure for series starting with drm/i915/gt: Cancel submitted requests upon context reset (rev2)
  2020-12-24 23:01 [Intel-gfx] [PATCH 1/3] drm/i915/gt: Cancel submitted requests upon context reset Chris Wilson
                   ` (2 preceding siblings ...)
  2020-12-24 23:14 ` [Intel-gfx] [PATCH] drm/i915/gt: Cancel submitted requests upon context reset Chris Wilson
@ 2020-12-24 23:56 ` Patchwork
  3 siblings, 0 replies; 6+ messages in thread
From: Patchwork @ 2020-12-24 23:56 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx


[-- Attachment #1.1: Type: text/plain, Size: 11549 bytes --]

== Series Details ==

Series: series starting with drm/i915/gt: Cancel submitted requests upon context reset (rev2)
URL   : https://patchwork.freedesktop.org/series/85209/
State : failure

== Summary ==

CI Bug Log - changes from CI_DRM_9522 -> Patchwork_19215
====================================================

Summary
-------

  **FAILURE**

  Serious unknown changes coming with Patchwork_19215 absolutely need to be
  verified manually.
  
  If you think the reported changes have nothing to do with the changes
  introduced in Patchwork_19215, please notify your bug team to allow them
  to document this new failure mode, which will reduce false positives in CI.

  External URL: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19215/index.html

Possible new issues
-------------------

  Here are the unknown changes that may have been introduced in Patchwork_19215:

### IGT changes ###

#### Possible regressions ####

  * igt@i915_selftest@live@hangcheck:
    - fi-cml-u2:          [PASS][1] -> [DMESG-FAIL][2]
   [1]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9522/fi-cml-u2/igt@i915_selftest@live@hangcheck.html
   [2]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19215/fi-cml-u2/igt@i915_selftest@live@hangcheck.html
    - fi-cfl-8700k:       [PASS][3] -> [DMESG-FAIL][4]
   [3]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9522/fi-cfl-8700k/igt@i915_selftest@live@hangcheck.html
   [4]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19215/fi-cfl-8700k/igt@i915_selftest@live@hangcheck.html
    - fi-skl-6700k2:      [PASS][5] -> [DMESG-FAIL][6]
   [5]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9522/fi-skl-6700k2/igt@i915_selftest@live@hangcheck.html
   [6]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19215/fi-skl-6700k2/igt@i915_selftest@live@hangcheck.html
    - fi-icl-u2:          [PASS][7] -> [DMESG-FAIL][8]
   [7]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9522/fi-icl-u2/igt@i915_selftest@live@hangcheck.html
   [8]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19215/fi-icl-u2/igt@i915_selftest@live@hangcheck.html
    - fi-skl-guc:         [PASS][9] -> [DMESG-FAIL][10]
   [9]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9522/fi-skl-guc/igt@i915_selftest@live@hangcheck.html
   [10]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19215/fi-skl-guc/igt@i915_selftest@live@hangcheck.html
    - fi-skl-6600u:       [PASS][11] -> [DMESG-FAIL][12]
   [11]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9522/fi-skl-6600u/igt@i915_selftest@live@hangcheck.html
   [12]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19215/fi-skl-6600u/igt@i915_selftest@live@hangcheck.html
    - fi-cfl-8109u:       [PASS][13] -> [DMESG-FAIL][14]
   [13]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9522/fi-cfl-8109u/igt@i915_selftest@live@hangcheck.html
   [14]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19215/fi-cfl-8109u/igt@i915_selftest@live@hangcheck.html
    - fi-kbl-7500u:       [PASS][15] -> [DMESG-FAIL][16]
   [15]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9522/fi-kbl-7500u/igt@i915_selftest@live@hangcheck.html
   [16]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19215/fi-kbl-7500u/igt@i915_selftest@live@hangcheck.html
    - fi-bsw-nick:        [PASS][17] -> [DMESG-FAIL][18]
   [17]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9522/fi-bsw-nick/igt@i915_selftest@live@hangcheck.html
   [18]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19215/fi-bsw-nick/igt@i915_selftest@live@hangcheck.html
    - fi-kbl-guc:         [PASS][19] -> [DMESG-FAIL][20]
   [19]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9522/fi-kbl-guc/igt@i915_selftest@live@hangcheck.html
   [20]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19215/fi-kbl-guc/igt@i915_selftest@live@hangcheck.html
    - fi-icl-y:           [PASS][21] -> [DMESG-FAIL][22]
   [21]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9522/fi-icl-y/igt@i915_selftest@live@hangcheck.html
   [22]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19215/fi-icl-y/igt@i915_selftest@live@hangcheck.html
    - fi-kbl-r:           [PASS][23] -> [DMESG-FAIL][24]
   [23]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9522/fi-kbl-r/igt@i915_selftest@live@hangcheck.html
   [24]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19215/fi-kbl-r/igt@i915_selftest@live@hangcheck.html
    - fi-glk-dsi:         [PASS][25] -> [DMESG-FAIL][26]
   [25]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9522/fi-glk-dsi/igt@i915_selftest@live@hangcheck.html
   [26]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19215/fi-glk-dsi/igt@i915_selftest@live@hangcheck.html
    - fi-kbl-x1275:       [PASS][27] -> [DMESG-FAIL][28]
   [27]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9522/fi-kbl-x1275/igt@i915_selftest@live@hangcheck.html
   [28]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19215/fi-kbl-x1275/igt@i915_selftest@live@hangcheck.html
    - fi-bsw-kefka:       [PASS][29] -> [DMESG-FAIL][30]
   [29]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9522/fi-bsw-kefka/igt@i915_selftest@live@hangcheck.html
   [30]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19215/fi-bsw-kefka/igt@i915_selftest@live@hangcheck.html
    - fi-cml-s:           [PASS][31] -> [DMESG-FAIL][32]
   [31]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9522/fi-cml-s/igt@i915_selftest@live@hangcheck.html
   [32]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19215/fi-cml-s/igt@i915_selftest@live@hangcheck.html
    - fi-tgl-y:           [PASS][33] -> [DMESG-FAIL][34]
   [33]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9522/fi-tgl-y/igt@i915_selftest@live@hangcheck.html
   [34]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19215/fi-tgl-y/igt@i915_selftest@live@hangcheck.html
    - fi-kbl-soraka:      [PASS][35] -> [DMESG-FAIL][36]
   [35]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9522/fi-kbl-soraka/igt@i915_selftest@live@hangcheck.html
   [36]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19215/fi-kbl-soraka/igt@i915_selftest@live@hangcheck.html
    - fi-cfl-guc:         [PASS][37] -> [DMESG-FAIL][38]
   [37]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9522/fi-cfl-guc/igt@i915_selftest@live@hangcheck.html
   [38]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19215/fi-cfl-guc/igt@i915_selftest@live@hangcheck.html
    - fi-bsw-n3050:       [PASS][39] -> [DMESG-FAIL][40]
   [39]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9522/fi-bsw-n3050/igt@i915_selftest@live@hangcheck.html
   [40]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19215/fi-bsw-n3050/igt@i915_selftest@live@hangcheck.html
    - fi-tgl-u2:          [PASS][41] -> [DMESG-FAIL][42]
   [41]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9522/fi-tgl-u2/igt@i915_selftest@live@hangcheck.html
   [42]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19215/fi-tgl-u2/igt@i915_selftest@live@hangcheck.html

  
#### Warnings ####

  * igt@i915_selftest@live@hangcheck:
    - fi-apl-guc:         [DMESG-WARN][43] ([i915#203]) -> [DMESG-FAIL][44]
   [43]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9522/fi-apl-guc/igt@i915_selftest@live@hangcheck.html
   [44]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19215/fi-apl-guc/igt@i915_selftest@live@hangcheck.html

  
#### Suppressed ####

  The following results come from untrusted machines, tests, or statuses.
  They do not affect the overall result.

  * igt@i915_selftest@live@hangcheck:
    - {fi-tgl-dsi}:       [PASS][45] -> [DMESG-FAIL][46]
   [45]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9522/fi-tgl-dsi/igt@i915_selftest@live@hangcheck.html
   [46]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19215/fi-tgl-dsi/igt@i915_selftest@live@hangcheck.html
    - {fi-ehl-1}:         [PASS][47] -> [DMESG-FAIL][48]
   [47]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9522/fi-ehl-1/igt@i915_selftest@live@hangcheck.html
   [48]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19215/fi-ehl-1/igt@i915_selftest@live@hangcheck.html

  
Known issues
------------

  Here are the changes found in Patchwork_19215 that come from known issues:

### IGT changes ###

#### Issues hit ####

  * igt@gem_flink_basic@bad-flink:
    - fi-tgl-y:           [PASS][49] -> [DMESG-WARN][50] ([i915#402])
   [49]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9522/fi-tgl-y/igt@gem_flink_basic@bad-flink.html
   [50]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19215/fi-tgl-y/igt@gem_flink_basic@bad-flink.html

  * igt@i915_selftest@live@gt_heartbeat:
    - fi-tgl-y:           [PASS][51] -> [DMESG-FAIL][52] ([i915#2601])
   [51]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9522/fi-tgl-y/igt@i915_selftest@live@gt_heartbeat.html
   [52]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19215/fi-tgl-y/igt@i915_selftest@live@gt_heartbeat.html

  
#### Possible fixes ####

  * igt@gem_flink_basic@basic:
    - fi-tgl-y:           [DMESG-WARN][53] ([i915#402]) -> [PASS][54]
   [53]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9522/fi-tgl-y/igt@gem_flink_basic@basic.html
   [54]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19215/fi-tgl-y/igt@gem_flink_basic@basic.html

  * igt@i915_selftest@live@execlists:
    - fi-apl-guc:         [DMESG-WARN][55] ([i915#1037]) -> [PASS][56]
   [55]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9522/fi-apl-guc/igt@i915_selftest@live@execlists.html
   [56]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19215/fi-apl-guc/igt@i915_selftest@live@execlists.html

  * igt@i915_selftest@live@gt_contexts:
    - fi-apl-guc:         [DMESG-WARN][57] -> [PASS][58]
   [57]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9522/fi-apl-guc/igt@i915_selftest@live@gt_contexts.html
   [58]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19215/fi-apl-guc/igt@i915_selftest@live@gt_contexts.html

  * igt@i915_selftest@live@ring_submission:
    - fi-apl-guc:         [DMESG-WARN][59] ([i915#203]) -> [PASS][60] +23 similar issues
   [59]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9522/fi-apl-guc/igt@i915_selftest@live@ring_submission.html
   [60]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19215/fi-apl-guc/igt@i915_selftest@live@ring_submission.html

  
  {name}: This element is suppressed. This means it is ignored when computing
          the status of the difference (SUCCESS, WARNING, or FAILURE).

  [i915#1037]: https://gitlab.freedesktop.org/drm/intel/issues/1037
  [i915#203]: https://gitlab.freedesktop.org/drm/intel/issues/203
  [i915#2601]: https://gitlab.freedesktop.org/drm/intel/issues/2601
  [i915#402]: https://gitlab.freedesktop.org/drm/intel/issues/402


Participating hosts (43 -> 37)
------------------------------

  Missing    (6): fi-ilk-m540 fi-bdw-5557u fi-hsw-4200u fi-bsw-cyan fi-ctg-p8600 fi-bdw-samus 


Build changes
-------------

  * Linux: CI_DRM_9522 -> Patchwork_19215

  CI-20190529: 20190529
  CI_DRM_9522: 2384222e087b158e3cb9dd3b74a0561ed9be6dc6 @ git://anongit.freedesktop.org/gfx-ci/linux
  IGT_5922: 334f0c326bb2812e7a2764dc63ff83c83b6daf58 @ git://anongit.freedesktop.org/xorg/app/intel-gpu-tools
  Patchwork_19215: 13f70ef110c3b3a54776b44cc3a176ed849a4b56 @ git://anongit.freedesktop.org/gfx-ci/linux


== Linux commits ==

13f70ef110c3 drm/i915/gem: Peek at the inflight context
63b7e925c1c2 drm/i915/gt: Pull context closure check from request submit to schedule-in
7df04a055695 drm/i915/gt: Cancel submitted requests upon context reset

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19215/index.html

[-- Attachment #1.2: Type: text/html, Size: 12859 bytes --]

[-- Attachment #2: Type: text/plain, Size: 160 bytes --]

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [Intel-gfx] [PATCH 1/3] drm/i915/gt: Cancel submitted requests upon context reset
@ 2020-12-25 20:49 Chris Wilson
  2020-12-25 20:49 ` [Intel-gfx] [PATCH 2/3] drm/i915/gt: Pull context closure check from request submit to schedule-in Chris Wilson
  0 siblings, 1 reply; 6+ messages in thread
From: Chris Wilson @ 2020-12-25 20:49 UTC (permalink / raw)
  To: intel-gfx; +Cc: Chris Wilson

Since we process schedule-in of a context after submitting the request,
if we decide to reset the context at that time, we also have to cancel
the requets we have marked for submission.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 .../drm/i915/gt/intel_execlists_submission.c  | 22 ++++++++++++++-----
 drivers/gpu/drm/i915/i915_request.c           |  2 ++
 2 files changed, 18 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
index 1fae6c6f3868..eb2c086dbce6 100644
--- a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
+++ b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
@@ -215,22 +215,32 @@ static void mark_eio(struct i915_request *rq)
 }
 
 static struct i915_request *
-active_request(const struct intel_timeline * const tl, struct i915_request *rq)
+__active_request(const struct intel_timeline * const tl,
+		 struct i915_request *rq,
+		 int error)
 {
 	struct i915_request *active = rq;
 
-	rcu_read_lock();
-	list_for_each_entry_continue_reverse(rq, &tl->requests, link) {
+	list_for_each_entry_from_reverse(rq, &tl->requests, link) {
 		if (__i915_request_is_complete(rq))
 			break;
 
+		if (error) {
+			i915_request_set_error_once(rq, error);
+			__i915_request_skip(rq);
+		}
 		active = rq;
 	}
-	rcu_read_unlock();
 
 	return active;
 }
 
+static struct i915_request *
+active_request(const struct intel_timeline * const tl, struct i915_request *rq)
+{
+	return __active_request(tl, rq, 0);
+}
+
 static inline void
 ring_set_paused(const struct intel_engine_cs *engine, int state)
 {
@@ -487,14 +497,14 @@ static void reset_active(struct i915_request *rq,
 	 * remain correctly ordered. And we defer to __i915_request_submit()
 	 * so that all asynchronous waits are correctly handled.
 	 */
-	ENGINE_TRACE(engine, "{ rq=%llx:%lld }\n",
+	ENGINE_TRACE(engine, "{ reset rq=%llx:%lld }\n",
 		     rq->fence.context, rq->fence.seqno);
 
 	/* On resubmission of the active request, payload will be scrubbed */
 	if (__i915_request_is_complete(rq))
 		head = rq->tail;
 	else
-		head = active_request(ce->timeline, rq)->head;
+		head = __active_request(ce->timeline, rq, -EIO)->head;
 	head = intel_ring_wrap(ce->ring, head);
 
 	/* Scrub the context image to prevent replaying the previous batch */
diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
index 6578faf6eed8..ad3b6a4f424f 100644
--- a/drivers/gpu/drm/i915/i915_request.c
+++ b/drivers/gpu/drm/i915/i915_request.c
@@ -490,6 +490,8 @@ void __i915_request_skip(struct i915_request *rq)
 	if (rq->infix == rq->postfix)
 		return;
 
+	RQ_TRACE(rq, "error: %d\n", rq->fence.error);
+
 	/*
 	 * As this request likely depends on state from the lost
 	 * context, clear out all the user operations leaving the
-- 
2.20.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [Intel-gfx] [PATCH 2/3] drm/i915/gt: Pull context closure check from request submit to schedule-in
  2020-12-25 20:49 [Intel-gfx] [PATCH 1/3] drm/i915/gt: Cancel submitted requests upon context reset Chris Wilson
@ 2020-12-25 20:49 ` Chris Wilson
  0 siblings, 0 replies; 6+ messages in thread
From: Chris Wilson @ 2020-12-25 20:49 UTC (permalink / raw)
  To: intel-gfx; +Cc: Chris Wilson

We only need to evaluate the current status of the context when it is
scheduled in, we will force a reschedule when the context is closed
propagating the change to inflight contexts.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Brost <matthew.brost@intel.com>
---
 drivers/gpu/drm/i915/gt/intel_execlists_submission.c | 4 ++++
 drivers/gpu/drm/i915/i915_request.c                  | 4 ----
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
index eb2c086dbce6..cdd7606a65d4 100644
--- a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
+++ b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
@@ -522,6 +522,10 @@ __execlists_schedule_in(struct i915_request *rq)
 
 	intel_context_get(ce);
 
+	if (unlikely(intel_context_is_closed(ce) &&
+		     !intel_engine_has_heartbeat(engine)))
+		intel_context_set_banned(ce);
+
 	if (unlikely(intel_context_is_banned(ce)))
 		reset_active(rq, engine);
 
diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
index ad3b6a4f424f..3a9820a9e521 100644
--- a/drivers/gpu/drm/i915/i915_request.c
+++ b/drivers/gpu/drm/i915/i915_request.c
@@ -546,10 +546,6 @@ bool __i915_request_submit(struct i915_request *request)
 	if (i915_request_completed(request))
 		goto xfer;
 
-	if (unlikely(intel_context_is_closed(request->context) &&
-		     !intel_engine_has_heartbeat(engine)))
-		intel_context_set_banned(request->context);
-
 	if (unlikely(intel_context_is_banned(request->context)))
 		i915_request_set_error_once(request, -EIO);
 
-- 
2.20.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2020-12-25 20:49 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-12-24 23:01 [Intel-gfx] [PATCH 1/3] drm/i915/gt: Cancel submitted requests upon context reset Chris Wilson
2020-12-24 23:01 ` [Intel-gfx] [PATCH 2/3] drm/i915/gt: Pull context closure check from request submit to schedule-in Chris Wilson
2020-12-24 23:01 ` [Intel-gfx] [PATCH 3/3] drm/i915/gem: Peek at the inflight context Chris Wilson
2020-12-24 23:14 ` [Intel-gfx] [PATCH] drm/i915/gt: Cancel submitted requests upon context reset Chris Wilson
2020-12-24 23:56 ` [Intel-gfx] ✗ Fi.CI.BAT: failure for series starting with drm/i915/gt: Cancel submitted requests upon context reset (rev2) Patchwork
2020-12-25 20:49 [Intel-gfx] [PATCH 1/3] drm/i915/gt: Cancel submitted requests upon context reset Chris Wilson
2020-12-25 20:49 ` [Intel-gfx] [PATCH 2/3] drm/i915/gt: Pull context closure check from request submit to schedule-in Chris Wilson

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.