* [Intel-gfx] [PATCH 01/10] drm/i915/selftests: Disable preemptive heartbeats over preemption tests
From: Chris Wilson @ 2020-06-15 12:39 UTC
  To: intel-gfx; +Cc: Chris Wilson

Since the heartbeat may cause a preemption event, disable it over the
preemption suppression tests.
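
As a reference for the reader, the helpers this patch switches to are
defined locally in selftest_lrc.c. A minimal sketch of their expected
shape (the exact bodies may differ; on older trees the original
interval was saved into a caller-provided local rather than taken from
engine->defaults):

static void engine_heartbeat_disable(struct intel_engine_cs *engine)
{
	/* A zero interval stops the heartbeat worker from rearming */
	engine->props.heartbeat_interval_ms = 0;

	/* Keep the engine awake ourselves, as the heartbeat no longer will */
	intel_engine_pm_get(engine);
	intel_engine_park_heartbeat(engine);
}

static void engine_heartbeat_enable(struct intel_engine_cs *engine)
{
	intel_engine_pm_put(engine);

	engine->props.heartbeat_interval_ms =
		engine->defaults.heartbeat_interval_ms;
}

Assuming the helpers take the engine-pm reference themselves, as
sketched above, that is also why the explicit intel_engine_pm_get/put
calls can be dropped from the test below.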

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/gt/selftest_lrc.c | 14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/selftest_lrc.c b/drivers/gpu/drm/i915/gt/selftest_lrc.c
index f651bdf7f191..91543494f595 100644
--- a/drivers/gpu/drm/i915/gt/selftest_lrc.c
+++ b/drivers/gpu/drm/i915/gt/selftest_lrc.c
@@ -2282,7 +2282,7 @@ static int live_suppress_self_preempt(void *arg)
 		if (igt_flush_test(gt->i915))
 			goto err_wedged;
 
-		intel_engine_pm_get(engine);
+		engine_heartbeat_disable(engine);
 		engine->execlists.preempt_hang.count = 0;
 
 		rq_a = spinner_create_request(&a.spin,
@@ -2290,14 +2290,14 @@ static int live_suppress_self_preempt(void *arg)
 					      MI_NOOP);
 		if (IS_ERR(rq_a)) {
 			err = PTR_ERR(rq_a);
-			intel_engine_pm_put(engine);
+			engine_heartbeat_enable(engine);
 			goto err_client_b;
 		}
 
 		i915_request_add(rq_a);
 		if (!igt_wait_for_spinner(&a.spin, rq_a)) {
 			pr_err("First client failed to start\n");
-			intel_engine_pm_put(engine);
+			engine_heartbeat_enable(engine);
 			goto err_wedged;
 		}
 
@@ -2309,7 +2309,7 @@ static int live_suppress_self_preempt(void *arg)
 						      MI_NOOP);
 			if (IS_ERR(rq_b)) {
 				err = PTR_ERR(rq_b);
-				intel_engine_pm_put(engine);
+				engine_heartbeat_enable(engine);
 				goto err_client_b;
 			}
 			i915_request_add(rq_b);
@@ -2320,7 +2320,7 @@ static int live_suppress_self_preempt(void *arg)
 
 			if (!igt_wait_for_spinner(&b.spin, rq_b)) {
 				pr_err("Second client failed to start\n");
-				intel_engine_pm_put(engine);
+				engine_heartbeat_enable(engine);
 				goto err_wedged;
 			}
 
@@ -2334,12 +2334,12 @@ static int live_suppress_self_preempt(void *arg)
 			       engine->name,
 			       engine->execlists.preempt_hang.count,
 			       depth);
-			intel_engine_pm_put(engine);
+			engine_heartbeat_enable(engine);
 			err = -EINVAL;
 			goto err_client_b;
 		}
 
-		intel_engine_pm_put(engine);
+		engine_heartbeat_enable(engine);
 		if (igt_flush_test(gt->i915))
 			goto err_wedged;
 	}
-- 
2.20.1


* [Intel-gfx] [PATCH 02/10] drm/i915/selftests: Dump engine state and trace upon hanging after reset
From: Chris Wilson @ 2020-06-15 12:39 UTC
  To: intel-gfx; +Cc: Chris Wilson

If the engine dies after a reset, such that we fail to create a request
and have to be rescued by the CI runner, dump the engine state and GEM
trace, and wedge the GPU so the test fails fast.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/gt/selftest_hangcheck.c | 17 +++++++++++++++++
 1 file changed, 17 insertions(+)

diff --git a/drivers/gpu/drm/i915/gt/selftest_hangcheck.c b/drivers/gpu/drm/i915/gt/selftest_hangcheck.c
index 2af66f8ffbd2..0f1be62cdc6f 100644
--- a/drivers/gpu/drm/i915/gt/selftest_hangcheck.c
+++ b/drivers/gpu/drm/i915/gt/selftest_hangcheck.c
@@ -499,6 +499,23 @@ static int igt_reset_nop_engine(void *arg)
 
 				rq = intel_context_create_request(ce);
 				if (IS_ERR(rq)) {
+					struct drm_printer p =
+						drm_info_printer(gt->i915->drm.dev);
+
+					intel_engine_dump(engine, &p,
+							  "%s(%s): failed to create request, err=%ld\n",
+							  __func__,
+							  engine->name,
+							  PTR_ERR(rq));
+
+					GEM_TRACE("%s(%s): failed to create request, err=%ld\n",
+						  __func__,
+						  engine->name,
+						  PTR_ERR(rq));
+					GEM_TRACE_DUMP();
+
+					intel_gt_set_wedged(gt);
+
 					err = PTR_ERR(rq);
 					break;
 				}
-- 
2.20.1


* [Intel-gfx] [PATCH 03/10] drm/i915/gt: Add a safety submission flush in the heartbeat
From: Chris Wilson @ 2020-06-15 12:39 UTC
  To: intel-gfx; +Cc: Chris Wilson

Just in case everything else fails (for example, the "missed interrupt
syndrome" seen on Sandybridge), always flush the submission tasklet
from the heartbeat. This papers over such issues: the hang still shows
up as a second-long glitch, but we lose the ability to detect it unless
we happen to be running a timed test.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/gt/intel_engine_cs.c     | 19 +++++++------------
 .../gpu/drm/i915/gt/intel_engine_heartbeat.c  |  3 +++
 2 files changed, 10 insertions(+), 12 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
index d613cf31970c..81884303bf6d 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
@@ -1094,19 +1094,14 @@ void intel_engine_flush_submission(struct intel_engine_cs *engine)
 {
 	struct tasklet_struct *t = &engine->execlists.tasklet;
 
-	if (__tasklet_is_scheduled(t)) {
-		local_bh_disable();
-		if (tasklet_trylock(t)) {
-			/* Must wait for any GPU reset in progress. */
-			if (__tasklet_is_enabled(t))
-				t->func(t->data);
-			tasklet_unlock(t);
-		}
-		local_bh_enable();
+	local_bh_disable();
+	if (tasklet_trylock(t)) {
+		/* Must wait for any GPU reset in progress. */
+		if (__tasklet_is_enabled(t))
+			t->func(t->data);
+		tasklet_unlock(t);
 	}
-
-	/* Otherwise flush the tasklet if it was running on another cpu */
-	tasklet_unlock_wait(t);
+	local_bh_enable();
 }
 
 /**
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c b/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c
index f67ad937eefb..cd20fb549b38 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c
@@ -65,6 +65,9 @@ static void heartbeat(struct work_struct *wrk)
 	struct intel_context *ce = engine->kernel_context;
 	struct i915_request *rq;
 
+	/* Just in case everything has gone horribly wrong, give it a kick */
+	intel_engine_flush_submission(engine);
+
 	rq = engine->heartbeat.systole;
 	if (rq && i915_request_completed(rq)) {
 		i915_request_put(rq);
-- 
2.20.1


* [Intel-gfx] [PATCH 04/10] drm/i915/execlists: Replace direct submit with direct call to tasklet
From: Chris Wilson @ 2020-06-15 12:39 UTC
  To: intel-gfx; +Cc: Chris Wilson

Rather than having special-case code for opportunistically calling
process_csb() and performing a direct submit while holding the engine
spinlock, simply call the tasklet directly. This allows us to retain
the direct-submission path, including the CS draining to allow
fast/immediate submissions, without requiring any duplicated code
paths.
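
The property being relied upon is that tasklet_trylock() acts as a
bitlock around the callback: a direct call and a softirq invocation can
never overlap, and whoever loses the race defers to the holder. A
standalone C11 sketch of that pattern (hypothetical names, stdatomic
standing in for the kernel's TASKLET_STATE_RUN bit):

#include <stdatomic.h>
#include <stdbool.h>

struct tasklet {
	atomic_flag running;		/* ~TASKLET_STATE_RUN */
	void (*func)(unsigned long data);
	unsigned long data;
};

static bool tasklet_trylock_(struct tasklet *t)
{
	/* true only if we transitioned clear -> set */
	return !atomic_flag_test_and_set(&t->running);
}

static void tasklet_unlock_(struct tasklet *t)
{
	atomic_flag_clear(&t->running);
}

/* Direct submission: run the callback inline if we win the bitlock,
 * otherwise leave it to whoever already holds it. */
static void flush_direct(struct tasklet *t,
			 void (*reschedule)(struct tasklet *))
{
	if (!tasklet_trylock_(t)) {
		reschedule(t);		/* cf. tasklet_hi_schedule() */
		return;
	}

	t->func(t->data);
	tasklet_unlock_(t);
}

flush_csb() in the diff below follows this shape, with the additional
check that no reset is in progress before invoking the tasklet body.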

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/gt/intel_engine.h       |  1 +
 drivers/gpu/drm/i915/gt/intel_engine_cs.c    | 27 +++----
 drivers/gpu/drm/i915/gt/intel_lrc.c          | 78 ++++++++------------
 drivers/gpu/drm/i915/gt/selftest_hangcheck.c |  1 +
 4 files changed, 45 insertions(+), 62 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_engine.h b/drivers/gpu/drm/i915/gt/intel_engine.h
index 791897f8d847..c77b3c0d2b3b 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine.h
+++ b/drivers/gpu/drm/i915/gt/intel_engine.h
@@ -210,6 +210,7 @@ int intel_engine_resume(struct intel_engine_cs *engine);
 
 int intel_ring_submission_setup(struct intel_engine_cs *engine);
 
+void __intel_engine_stop_cs(struct intel_engine_cs *engine);
 int intel_engine_stop_cs(struct intel_engine_cs *engine);
 void intel_engine_cancel_stop_cs(struct intel_engine_cs *engine);
 
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
index 81884303bf6d..667ee52448a0 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
@@ -903,33 +903,34 @@ static unsigned long stop_timeout(const struct intel_engine_cs *engine)
 	return READ_ONCE(engine->props.stop_timeout_ms);
 }
 
-int intel_engine_stop_cs(struct intel_engine_cs *engine)
+void __intel_engine_stop_cs(struct intel_engine_cs *engine)
 {
 	struct intel_uncore *uncore = engine->uncore;
-	const u32 base = engine->mmio_base;
-	const i915_reg_t mode = RING_MI_MODE(base);
-	int err;
+	const i915_reg_t mode = RING_MI_MODE(engine->mmio_base);
 
+	intel_uncore_write_fw(uncore, mode, _MASKED_BIT_ENABLE(STOP_RING));
+	intel_uncore_posting_read_fw(uncore, mode);
+}
+
+int intel_engine_stop_cs(struct intel_engine_cs *engine)
+{
 	if (INTEL_GEN(engine->i915) < 3)
 		return -ENODEV;
 
 	ENGINE_TRACE(engine, "\n");
 
-	intel_uncore_write_fw(uncore, mode, _MASKED_BIT_ENABLE(STOP_RING));
+	__intel_engine_stop_cs(engine);
 
-	err = 0;
-	if (__intel_wait_for_register_fw(uncore,
-					 mode, MODE_IDLE, MODE_IDLE,
+	if (__intel_wait_for_register_fw(engine->uncore,
+					 RING_MI_MODE(engine->mmio_base),
+					 MODE_IDLE, MODE_IDLE,
 					 1000, stop_timeout(engine),
 					 NULL)) {
 		ENGINE_TRACE(engine, "timed out on STOP_RING -> IDLE\n");
-		err = -ETIMEDOUT;
+		return -ETIMEDOUT;
 	}
 
-	/* A final mmio read to let GPU writes be hopefully flushed to memory */
-	intel_uncore_posting_read_fw(uncore, mode);
-
-	return err;
+	return 0;
 }
 
 void intel_engine_cancel_stop_cs(struct intel_engine_cs *engine)
diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
index e866b8d721ed..40c5085765da 100644
--- a/drivers/gpu/drm/i915/gt/intel_lrc.c
+++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
@@ -2703,16 +2703,6 @@ static void process_csb(struct intel_engine_cs *engine)
 	invalidate_csb_entries(&buf[0], &buf[num_entries - 1]);
 }
 
-static void __execlists_submission_tasklet(struct intel_engine_cs *const engine)
-{
-	lockdep_assert_held(&engine->active.lock);
-	if (!READ_ONCE(engine->execlists.pending[0])) {
-		rcu_read_lock(); /* protect peeking at execlists->active */
-		execlists_dequeue(engine);
-		rcu_read_unlock();
-	}
-}
-
 static void __execlists_hold(struct i915_request *rq)
 {
 	LIST_HEAD(list);
@@ -3102,7 +3092,7 @@ static bool preempt_timeout(const struct intel_engine_cs *const engine)
 	if (!timer_expired(t))
 		return false;
 
-	return READ_ONCE(engine->execlists.pending[0]);
+	return engine->execlists.pending[0];
 }
 
 /*
@@ -3112,7 +3102,6 @@ static bool preempt_timeout(const struct intel_engine_cs *const engine)
 static void execlists_submission_tasklet(unsigned long data)
 {
 	struct intel_engine_cs * const engine = (struct intel_engine_cs *)data;
-	bool timeout = preempt_timeout(engine);
 
 	process_csb(engine);
 
@@ -3122,16 +3111,17 @@ static void execlists_submission_tasklet(unsigned long data)
 			execlists_reset(engine, "CS error");
 	}
 
-	if (!READ_ONCE(engine->execlists.pending[0]) || timeout) {
+	if (unlikely(preempt_timeout(engine)))
+		execlists_reset(engine, "preemption time out");
+
+	if (!engine->execlists.pending[0]) {
 		unsigned long flags;
 
+		rcu_read_lock(); /* protect peeking at execlists->active */
 		spin_lock_irqsave(&engine->active.lock, flags);
-		__execlists_submission_tasklet(engine);
+		execlists_dequeue(engine);
 		spin_unlock_irqrestore(&engine->active.lock, flags);
-
-		/* Recheck after serialising with direct-submission */
-		if (unlikely(timeout && preempt_timeout(engine)))
-			execlists_reset(engine, "preemption time out");
+		rcu_read_unlock();
 	}
 }
 
@@ -3163,26 +3153,16 @@ static void queue_request(struct intel_engine_cs *engine,
 	set_bit(I915_FENCE_FLAG_PQUEUE, &rq->fence.flags);
 }
 
-static void __submit_queue_imm(struct intel_engine_cs *engine)
-{
-	struct intel_engine_execlists * const execlists = &engine->execlists;
-
-	if (reset_in_progress(execlists))
-		return; /* defer until we restart the engine following reset */
-
-	__execlists_submission_tasklet(engine);
-}
-
-static void submit_queue(struct intel_engine_cs *engine,
+static bool submit_queue(struct intel_engine_cs *engine,
 			 const struct i915_request *rq)
 {
 	struct intel_engine_execlists *execlists = &engine->execlists;
 
 	if (rq_prio(rq) <= execlists->queue_priority_hint)
-		return;
+		return false;
 
 	execlists->queue_priority_hint = rq_prio(rq);
-	__submit_queue_imm(engine);
+	return true;
 }
 
 static bool ancestor_on_hold(const struct intel_engine_cs *engine,
@@ -3196,20 +3176,22 @@ static void flush_csb(struct intel_engine_cs *engine)
 {
 	struct intel_engine_execlists *el = &engine->execlists;
 
-	if (READ_ONCE(el->pending[0]) && tasklet_trylock(&el->tasklet)) {
-		if (!reset_in_progress(el))
-			process_csb(engine);
-		tasklet_unlock(&el->tasklet);
+	if (!tasklet_trylock(&el->tasklet)) {
+		tasklet_hi_schedule(&el->tasklet);
+		return;
 	}
+
+	if (!reset_in_progress(el))
+		execlists_submission_tasklet((unsigned long)engine);
+
+	tasklet_unlock(&el->tasklet);
 }
 
 static void execlists_submit_request(struct i915_request *request)
 {
 	struct intel_engine_cs *engine = request->engine;
 	unsigned long flags;
-
-	/* Hopefully we clear execlists->pending[] to let us through */
-	flush_csb(engine);
+	bool submit = false;
 
 	/* Will be called from irq-context when using foreign fences. */
 	spin_lock_irqsave(&engine->active.lock, flags);
@@ -3224,10 +3206,13 @@ static void execlists_submit_request(struct i915_request *request)
 		GEM_BUG_ON(RB_EMPTY_ROOT(&engine->execlists.queue.rb_root));
 		GEM_BUG_ON(list_empty(&request->sched.link));
 
-		submit_queue(engine, request);
+		submit = submit_queue(engine, request);
 	}
 
 	spin_unlock_irqrestore(&engine->active.lock, flags);
+
+	if (submit)
+		flush_csb(engine);
 }
 
 static void __execlists_context_fini(struct intel_context *ce)
@@ -4113,7 +4098,6 @@ static int execlists_resume(struct intel_engine_cs *engine)
 static void execlists_reset_prepare(struct intel_engine_cs *engine)
 {
 	struct intel_engine_execlists * const execlists = &engine->execlists;
-	unsigned long flags;
 
 	ENGINE_TRACE(engine, "depth<-%d\n",
 		     atomic_read(&execlists->tasklet.count));
@@ -4130,10 +4114,6 @@ static void execlists_reset_prepare(struct intel_engine_cs *engine)
 	__tasklet_disable_sync_once(&execlists->tasklet);
 	GEM_BUG_ON(!reset_in_progress(execlists));
 
-	/* And flush any current direct submission. */
-	spin_lock_irqsave(&engine->active.lock, flags);
-	spin_unlock_irqrestore(&engine->active.lock, flags);
-
 	/*
 	 * We stop engines, otherwise we might get failed reset and a
 	 * dead gpu (on elk). Also as modern gpu as kbl can suffer
@@ -4147,7 +4127,7 @@ static void execlists_reset_prepare(struct intel_engine_cs *engine)
 	 * FIXME: Wa for more modern gens needs to be validated
 	 */
 	ring_set_paused(engine, 1);
-	intel_engine_stop_cs(engine);
+	__intel_engine_stop_cs(engine);
 
 	engine->execlists.reset_ccid = active_ccid(engine);
 }
@@ -4377,12 +4357,12 @@ static void execlists_reset_finish(struct intel_engine_cs *engine)
 	 * to sleep before we restart and reload a context.
 	 */
 	GEM_BUG_ON(!reset_in_progress(execlists));
-	if (!RB_EMPTY_ROOT(&execlists->queue.rb_root))
-		execlists->tasklet.func(execlists->tasklet.data);
+	GEM_BUG_ON(engine->execlists.pending[0]);
 
+	/* And kick in case we missed a new request submission. */
 	if (__tasklet_enable(&execlists->tasklet))
-		/* And kick in case we missed a new request submission. */
-		tasklet_hi_schedule(&execlists->tasklet);
+		flush_csb(engine);
+
 	ENGINE_TRACE(engine, "depth->%d\n",
 		     atomic_read(&execlists->tasklet.count));
 }
diff --git a/drivers/gpu/drm/i915/gt/selftest_hangcheck.c b/drivers/gpu/drm/i915/gt/selftest_hangcheck.c
index 0f1be62cdc6f..1f26db4fc998 100644
--- a/drivers/gpu/drm/i915/gt/selftest_hangcheck.c
+++ b/drivers/gpu/drm/i915/gt/selftest_hangcheck.c
@@ -1601,6 +1601,7 @@ static int __igt_atomic_reset_engine(struct intel_engine_cs *engine,
 
 	p->critical_section_end();
 	tasklet_enable(t);
+	tasklet_hi_schedule(t);
 
 	if (err)
 		pr_err("i915_reset_engine(%s:%s) failed under %s\n",
-- 
2.20.1


* [Intel-gfx] [PATCH 05/10] drm/i915/execlists: Defer schedule_out until after the next dequeue
From: Chris Wilson @ 2020-06-15 12:39 UTC
  To: intel-gfx; +Cc: Chris Wilson

Inside schedule_out, we do extra work upon idling the context, such as
updating the runtime, kicking off retires, and kicking virtual engines.
However, if we are processing a stream of single requests per context,
we may find ourselves scheduling out the context only to immediately
schedule it back in during dequeue. This is just extra work that we can
avoid if we keep the context marked as inflight across the dequeue.
This becomes more significant later on for minimising virtual engine
misses.
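
In outline, the tasklet flow after this change (a condensed sketch of
the code below, not a literal quote):

	execlists_submission_tasklet:
		process_csb(engine);	   /* completed requests -> el->post[] */
		execlists_dequeue(engine); /* may reuse those same contexts */
		post_process_csb(engine);  /* now schedule_out el->post[] */

Note the sizing of post[]: a single pass may retire both the inflight[]
and pending[] sets in their entirety, hence 2 * EXECLIST_MAX_PORTS
entries plus a NULL sentinel.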

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/gt/intel_context_types.h |  4 +-
 drivers/gpu/drm/i915/gt/intel_engine_cs.c     |  2 +
 drivers/gpu/drm/i915/gt/intel_engine_types.h  | 13 +++++
 drivers/gpu/drm/i915/gt/intel_lrc.c           | 47 ++++++++++++++-----
 4 files changed, 51 insertions(+), 15 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_context_types.h b/drivers/gpu/drm/i915/gt/intel_context_types.h
index 4954b0df4864..b63db45bab7b 100644
--- a/drivers/gpu/drm/i915/gt/intel_context_types.h
+++ b/drivers/gpu/drm/i915/gt/intel_context_types.h
@@ -45,8 +45,8 @@ struct intel_context {
 
 	struct intel_engine_cs *engine;
 	struct intel_engine_cs *inflight;
-#define intel_context_inflight(ce) ptr_mask_bits(READ_ONCE((ce)->inflight), 2)
-#define intel_context_inflight_count(ce) ptr_unmask_bits(READ_ONCE((ce)->inflight), 2)
+#define intel_context_inflight(ce) ptr_mask_bits(READ_ONCE((ce)->inflight), 3)
+#define intel_context_inflight_count(ce) ptr_unmask_bits(READ_ONCE((ce)->inflight), 3)
 
 	struct i915_address_space *vm;
 	struct i915_gem_context __rcu *gem_context;
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
index 667ee52448a0..0e94e52ee760 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
@@ -515,6 +515,8 @@ void intel_engine_init_execlists(struct intel_engine_cs *engine)
 	memset(execlists->pending, 0, sizeof(execlists->pending));
 	execlists->active =
 		memset(execlists->inflight, 0, sizeof(execlists->inflight));
+	execlists->inactive =
+		memset(execlists->post, 0, sizeof(execlists->post));
 
 	execlists->queue_priority_hint = INT_MIN;
 	execlists->queue = RB_ROOT_CACHED;
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_types.h b/drivers/gpu/drm/i915/gt/intel_engine_types.h
index 073c3769e8cc..31cf60cef5a8 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_types.h
+++ b/drivers/gpu/drm/i915/gt/intel_engine_types.h
@@ -208,6 +208,10 @@ struct intel_engine_execlists {
 	 * @active: the currently known context executing on HW
 	 */
 	struct i915_request * const *active;
+	/**
+	 * @inactive: next free slot in @post for completed context switches
+	 */
+	struct i915_request **inactive;
 	/**
 	 * @inflight: the set of contexts submitted and acknowleged by HW
 	 *
@@ -225,6 +229,15 @@ struct intel_engine_execlists {
 	 * preemption or idle-to-active event.
 	 */
 	struct i915_request *pending[EXECLIST_MAX_PORTS + 1];
+	/**
+	 * @post: the set of completed context switches
+	 *
+	 * Since we may want to stagger the processing of the CS switches
+	 * with the next submission, so that the contexts are notionally
+	 * kept in flight across the dequeue, we defer scheduling out of
+	 * the completed context switches.
+	 */
+	struct i915_request *post[2 * EXECLIST_MAX_PORTS + 1];
 
 	/**
 	 * @port_mask: number of execlist ports - 1
diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
index 40c5085765da..adc14adfa89c 100644
--- a/drivers/gpu/drm/i915/gt/intel_lrc.c
+++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
@@ -1385,6 +1385,8 @@ __execlists_schedule_in(struct i915_request *rq)
 	execlists_context_status_change(rq, INTEL_CONTEXT_SCHEDULE_IN);
 	intel_engine_context_in(engine);
 
+	CE_TRACE(ce, "schedule-in, ccid:%x\n", ce->lrc.ccid);
+
 	return engine;
 }
 
@@ -1431,6 +1433,8 @@ __execlists_schedule_out(struct i915_request *rq,
 	 * refrain from doing non-trivial work here.
 	 */
 
+	CE_TRACE(ce, "schedule-out, ccid:%x\n", ccid);
+
 	/*
 	 * If we have just completed this context, the engine may now be
 	 * idle and we want to re-enter powersaving.
@@ -2055,9 +2059,10 @@ static void set_preempt_timeout(struct intel_engine_cs *engine,
 		     active_preempt_timeout(engine, rq));
 }
 
-static inline void clear_ports(struct i915_request **ports, int count)
+static inline struct i915_request **
+clear_ports(struct i915_request **ports, int count)
 {
-	memset_p((void **)ports, NULL, count);
+	return memset_p((void **)ports, NULL, count);
 }
 
 static void execlists_dequeue(struct intel_engine_cs *engine)
@@ -2433,7 +2438,7 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
 		if (!memcmp(active, execlists->pending,
 			    (port - execlists->pending + 1) * sizeof(*port))) {
 			do
-				execlists_schedule_out(fetch_and_zero(port));
+				*execlists->inactive++ = *port;
 			while (port-- != execlists->pending);
 
 			goto skip_submit;
@@ -2447,6 +2452,7 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
 		start_timeslice(engine, execlists->queue_priority_hint);
 skip_submit:
 		ring_set_paused(engine, 0);
+		execlists->pending[0] = NULL;
 	}
 }
 
@@ -2456,12 +2462,12 @@ cancel_port_requests(struct intel_engine_execlists * const execlists)
 	struct i915_request * const *port;
 
 	for (port = execlists->pending; *port; port++)
-		execlists_schedule_out(*port);
+		*execlists->inactive++ = *port;
 	clear_ports(execlists->pending, ARRAY_SIZE(execlists->pending));
 
 	/* Mark the end of active before we overwrite *active */
 	for (port = xchg(&execlists->active, execlists->pending); *port; port++)
-		execlists_schedule_out(*port);
+		*execlists->inactive++ = *port;
 	clear_ports(execlists->inflight, ARRAY_SIZE(execlists->inflight));
 
 	smp_wmb(); /* complete the seqlock for execlists_active() */
@@ -2622,7 +2628,7 @@ static void process_csb(struct intel_engine_cs *engine)
 			/* cancel old inflight, prepare for switch */
 			trace_ports(execlists, "preempted", old);
 			while (*old)
-				execlists_schedule_out(*old++);
+				*execlists->inactive++ = *old++;
 
 			/* switch pending to inflight */
 			GEM_BUG_ON(!assert_pending_valid(execlists, "promote"));
@@ -2679,7 +2685,7 @@ static void process_csb(struct intel_engine_cs *engine)
 					     regs[CTX_RING_TAIL]);
 			}
 
-			execlists_schedule_out(*execlists->active++);
+			*execlists->inactive++ = *execlists->active++;
 
 			GEM_BUG_ON(execlists->active - execlists->inflight >
 				   execlists_num_ports(execlists));
@@ -2703,6 +2709,20 @@ static void process_csb(struct intel_engine_cs *engine)
 	invalidate_csb_entries(&buf[0], &buf[num_entries - 1]);
 }
 
+static void post_process_csb(struct intel_engine_cs *engine)
+{
+	struct intel_engine_execlists * const el = &engine->execlists;
+	struct i915_request **port;
+
+	if (!el->post[0])
+		return;
+
+	GEM_BUG_ON(el->post[2 * EXECLIST_MAX_PORTS]);
+	for (port = el->post; *port; port++)
+		execlists_schedule_out(*port);
+	el->inactive = clear_ports(el->post, port - el->post);
+}
+
 static void __execlists_hold(struct i915_request *rq)
 {
 	LIST_HEAD(list);
@@ -2971,8 +2991,8 @@ active_context(struct intel_engine_cs *engine, u32 ccid)
 	for (port = el->active; (rq = *port); port++) {
 		if (rq->context->lrc.ccid == ccid) {
 			ENGINE_TRACE(engine,
-				     "ccid found at active:%zd\n",
-				     port - el->active);
+				     "ccid:%x found at active:%zd\n",
+				     ccid, port - el->active);
 			return rq;
 		}
 	}
@@ -2980,8 +3000,8 @@ active_context(struct intel_engine_cs *engine, u32 ccid)
 	for (port = el->pending; (rq = *port); port++) {
 		if (rq->context->lrc.ccid == ccid) {
 			ENGINE_TRACE(engine,
-				     "ccid found at pending:%zd\n",
-				     port - el->pending);
+				     "ccid:%x found at pending:%zd\n",
+				     ccid, port - el->pending);
 			return rq;
 		}
 	}
@@ -3123,6 +3143,8 @@ static void execlists_submission_tasklet(unsigned long data)
 		spin_unlock_irqrestore(&engine->active.lock, flags);
 		rcu_read_unlock();
 	}
+
+	post_process_csb(engine);
 }
 
 static void __execlists_kick(struct intel_engine_execlists *execlists)
@@ -4061,8 +4083,6 @@ static void enable_execlists(struct intel_engine_cs *engine)
 	ENGINE_POSTING_READ(engine, RING_HWS_PGA);
 
 	enable_error_interrupt(engine);
-
-	engine->context_tag = GENMASK(BITS_PER_LONG - 2, 0);
 }
 
 static bool unexpected_starting_state(struct intel_engine_cs *engine)
@@ -5104,6 +5124,7 @@ int intel_execlists_submission_setup(struct intel_engine_cs *engine)
 	else
 		execlists->csb_size = GEN11_CSB_ENTRIES;
 
+	engine->context_tag = GENMASK(BITS_PER_LONG - 2, 0);
 	if (INTEL_GEN(engine->i915) >= 11) {
 		execlists->ccid |= engine->instance << (GEN11_ENGINE_INSTANCE_SHIFT - 32);
 		execlists->ccid |= engine->class << (GEN11_ENGINE_CLASS_SHIFT - 32);
-- 
2.20.1


* [Intel-gfx] [PATCH 06/10] drm/i915/gt: ce->inflight updates are now serialised
From: Chris Wilson @ 2020-06-15 12:39 UTC
  To: intel-gfx; +Cc: Chris Wilson

Since schedule-in and schedule-out are now both always under the tasklet
bitlock, we can reduce the individual atomic operations to simple
instructions and worry less.
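
As a standalone C11 illustration of the simplification (not the driver
code itself): when increments may race, a compare-exchange loop is
required; once an external lock serialises all writers, a plain
read-modify-write is enough, keeping only an atomic store so that
lock-free readers never observe a torn value:

#include <stdatomic.h>

/* Racy writers need the cmpxchg loop (the old scheme): */
static void get_racy(_Atomic(unsigned long) *inflight)
{
	unsigned long old = atomic_load(inflight);

	while (!atomic_compare_exchange_weak(inflight, &old, old + 1))
		;	/* old was refreshed; retry with the new snapshot */
}

/* Writers serialised by an external lock (here, the tasklet bitlock)
 * can use plain operations, publishing with a single store: */
static void get_serialised(_Atomic(unsigned long) *inflight)
{
	unsigned long v = atomic_load_explicit(inflight, memory_order_relaxed);

	atomic_store_explicit(inflight, v + 1, memory_order_relaxed);
}

The same reasoning lets clear_bit/set_bit become __clear_bit/__set_bit
in the diff below.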

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/gt/intel_lrc.c | 43 ++++++++++++-----------------
 1 file changed, 18 insertions(+), 25 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
index adc14adfa89c..8b3959207c02 100644
--- a/drivers/gpu/drm/i915/gt/intel_lrc.c
+++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
@@ -1371,7 +1371,7 @@ __execlists_schedule_in(struct i915_request *rq)
 		unsigned int tag = ffs(READ_ONCE(engine->context_tag));
 
 		GEM_BUG_ON(tag == 0 || tag >= BITS_PER_LONG);
-		clear_bit(tag - 1, &engine->context_tag);
+		__clear_bit(tag - 1, &engine->context_tag);
 		ce->lrc.ccid = tag << (GEN11_SW_CTX_ID_SHIFT - 32);
 
 		BUILD_BUG_ON(BITS_PER_LONG > GEN12_MAX_CONTEXT_HW_ID);
@@ -1399,13 +1399,10 @@ execlists_schedule_in(struct i915_request *rq, int idx)
 	GEM_BUG_ON(!intel_engine_pm_is_awake(rq->engine));
 	trace_i915_request_in(rq, idx);
 
-	old = READ_ONCE(ce->inflight);
-	do {
-		if (!old) {
-			WRITE_ONCE(ce->inflight, __execlists_schedule_in(rq));
-			break;
-		}
-	} while (!try_cmpxchg(&ce->inflight, &old, ptr_inc(old)));
+	old = ce->inflight;
+	if (!old)
+		old = __execlists_schedule_in(rq);
+	WRITE_ONCE(ce->inflight, ptr_inc(old));
 
 	GEM_BUG_ON(intel_context_inflight(ce) != rq->engine);
 	return i915_request_get(rq);
@@ -1420,12 +1417,11 @@ static void kick_siblings(struct i915_request *rq, struct intel_context *ce)
 		tasklet_hi_schedule(&ve->base.execlists.tasklet);
 }
 
-static inline void
-__execlists_schedule_out(struct i915_request *rq,
-			 struct intel_engine_cs * const engine,
-			 unsigned int ccid)
+static inline void __execlists_schedule_out(struct i915_request *rq)
 {
 	struct intel_context * const ce = rq->context;
+	struct intel_engine_cs * const engine = ce->inflight;
+	unsigned int ccid;
 
 	/*
 	 * NB process_csb() is not under the engine->active.lock and hence
@@ -1433,7 +1429,7 @@ __execlists_schedule_out(struct i915_request *rq,
 	 * refrain from doing non-trivial work here.
 	 */
 
-	CE_TRACE(ce, "schedule-out, ccid:%x\n", ccid);
+	CE_TRACE(ce, "schedule-out, ccid:%x\n", ce->lrc.ccid);
 
 	/*
 	 * If we have just completed this context, the engine may now be
@@ -1443,12 +1439,13 @@ __execlists_schedule_out(struct i915_request *rq,
 	    i915_request_completed(rq))
 		intel_engine_add_retire(engine, ce->timeline);
 
+	ccid = ce->lrc.ccid;
 	ccid >>= GEN11_SW_CTX_ID_SHIFT - 32;
 	ccid &= GEN12_MAX_CONTEXT_HW_ID;
 	if (ccid < BITS_PER_LONG) {
 		GEM_BUG_ON(ccid == 0);
 		GEM_BUG_ON(test_bit(ccid - 1, &engine->context_tag));
-		set_bit(ccid - 1, &engine->context_tag);
+		__set_bit(ccid - 1, &engine->context_tag);
 	}
 
 	intel_context_update_runtime(ce);
@@ -1469,26 +1466,22 @@ __execlists_schedule_out(struct i915_request *rq,
 	 */
 	if (ce->engine != engine)
 		kick_siblings(rq, ce);
-
-	intel_context_put(ce);
 }
 
 static inline void
 execlists_schedule_out(struct i915_request *rq)
 {
 	struct intel_context * const ce = rq->context;
-	struct intel_engine_cs *cur, *old;
-	u32 ccid;
 
 	trace_i915_request_out(rq);
 
-	ccid = rq->context->lrc.ccid;
-	old = READ_ONCE(ce->inflight);
-	do
-		cur = ptr_unmask_bits(old, 2) ? ptr_dec(old) : NULL;
-	while (!try_cmpxchg(&ce->inflight, &old, cur));
-	if (!cur)
-		__execlists_schedule_out(rq, old, ccid);
+	GEM_BUG_ON(!ce->inflight);
+	ce->inflight = ptr_dec(ce->inflight);
+	if (!intel_context_inflight_count(ce)) {
+		__execlists_schedule_out(rq);
+		WRITE_ONCE(ce->inflight, NULL);
+		intel_context_put(ce);
+	}
 
 	i915_request_put(rq);
 }
-- 
2.20.1


* [Intel-gfx] [PATCH 07/10] drm/i915/gt: Drop atomic for engine->fw_active tracking
From: Chris Wilson @ 2020-06-15 12:39 UTC
  To: intel-gfx; +Cc: Chris Wilson

Since schedule-in/out is now entirely serialised by the tasklet bitlock,
we do not need to worry about concurrent in/out operations and so reduce
the atomic operations to plain instructions.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/gt/intel_engine_cs.c    | 2 +-
 drivers/gpu/drm/i915/gt/intel_engine_types.h | 2 +-
 drivers/gpu/drm/i915/gt/intel_lrc.c          | 4 ++--
 3 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
index 0e94e52ee760..c91d18b384e7 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
@@ -1524,7 +1524,7 @@ void intel_engine_dump(struct intel_engine_cs *engine,
 	drm_printf(m, "\tLatency: %luus\n",
 		   ewma__engine_latency_read(&engine->latency));
 	drm_printf(m, "\tForcewake: %x domains, %d active\n",
-		   engine->fw_domain, atomic_read(&engine->fw_active));
+		   engine->fw_domain, READ_ONCE(engine->fw_active));
 
 	rcu_read_lock();
 	rq = READ_ONCE(engine->heartbeat.systole);
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_types.h b/drivers/gpu/drm/i915/gt/intel_engine_types.h
index 31cf60cef5a8..ca124f229f65 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_types.h
+++ b/drivers/gpu/drm/i915/gt/intel_engine_types.h
@@ -335,7 +335,7 @@ struct intel_engine_cs {
 	 * as possible.
 	 */
 	enum forcewake_domains fw_domain;
-	atomic_t fw_active;
+	unsigned int fw_active;
 
 	unsigned long context_tag;
 
diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
index 8b3959207c02..09ec7242fbcb 100644
--- a/drivers/gpu/drm/i915/gt/intel_lrc.c
+++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
@@ -1380,7 +1380,7 @@ __execlists_schedule_in(struct i915_request *rq)
 	ce->lrc.ccid |= engine->execlists.ccid;
 
 	__intel_gt_pm_get(engine->gt);
-	if (engine->fw_domain && !atomic_fetch_inc(&engine->fw_active))
+	if (engine->fw_domain && !engine->fw_active++)
 		intel_uncore_forcewake_get(engine->uncore, engine->fw_domain);
 	execlists_context_status_change(rq, INTEL_CONTEXT_SCHEDULE_IN);
 	intel_engine_context_in(engine);
@@ -1451,7 +1451,7 @@ static inline void __execlists_schedule_out(struct i915_request *rq)
 	intel_context_update_runtime(ce);
 	intel_engine_context_out(engine);
 	execlists_context_status_change(rq, INTEL_CONTEXT_SCHEDULE_OUT);
-	if (engine->fw_domain && !atomic_dec_return(&engine->fw_active))
+	if (engine->fw_domain && !--engine->fw_active)
 		intel_uncore_forcewake_put(engine->uncore, engine->fw_domain);
 	intel_gt_pm_put_async(engine->gt);
 
-- 
2.20.1


* [Intel-gfx] [PATCH 08/10] drm/i915/gt: Use virtual_engine during execlists_dequeue
From: Chris Wilson @ 2020-06-15 12:39 UTC
  To: intel-gfx; +Cc: Chris Wilson

Rather than going back and forth between the rb_node entry and the
virtual_engine type, store the virtual_engine in a local and reuse it.
As the container_of() conversion from rb_node to virtual_engine
requires a variable offset, performing that conversion just once shaves
off a bit of code.

v2: Keep a single virtual engine lookup, for typical use.
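
To illustrate the "variable offset" cost (a standalone sketch with
hypothetical types, not the driver structures): the rb_node lives in
ve->nodes[engine->id].rb, so recovering the virtual_engine requires an
offset computed from engine->id at runtime, a multiply+add on every
conversion rather than a constant subtraction:

#include <stddef.h>

struct rb_node { struct rb_node *left, *right; };

struct ve_node { struct rb_node rb; };

struct virtual_engine {
	struct ve_node nodes[8];	/* one node per physical engine */
};

static struct virtual_engine *
to_ve(struct rb_node *rb, unsigned int id)
{
	/* the offset of nodes[id].rb depends on id */
	size_t offset = offsetof(struct virtual_engine, nodes) +
			id * sizeof(struct ve_node) +
			offsetof(struct ve_node, rb);

	return (struct virtual_engine *)((char *)rb - offset);
}

Performing that conversion once in first_virtual_engine() and passing
the ve pointer around is what shaves the code below.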

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/gt/intel_lrc.c | 214 +++++++++++++---------------
 1 file changed, 101 insertions(+), 113 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
index 09ec7242fbcb..34a8eadc2de3 100644
--- a/drivers/gpu/drm/i915/gt/intel_lrc.c
+++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
@@ -454,7 +454,7 @@ static int queue_prio(const struct intel_engine_execlists *execlists)
 
 static inline bool need_preempt(const struct intel_engine_cs *engine,
 				const struct i915_request *rq,
-				struct rb_node *rb)
+				struct virtual_engine *ve)
 {
 	int last_prio;
 
@@ -491,9 +491,7 @@ static inline bool need_preempt(const struct intel_engine_cs *engine,
 	    rq_prio(list_next_entry(rq, sched.link)) > last_prio)
 		return true;
 
-	if (rb) {
-		struct virtual_engine *ve =
-			rb_entry(rb, typeof(*ve), nodes[engine->id].rb);
+	if (ve) {
 		bool preempt = false;
 
 		if (engine == ve->siblings[0]) { /* only preempt one sibling */
@@ -1816,6 +1814,35 @@ static bool virtual_matches(const struct virtual_engine *ve,
 	return true;
 }
 
+static struct virtual_engine *
+first_virtual_engine(struct intel_engine_cs *engine)
+{
+	struct intel_engine_execlists *el = &engine->execlists;
+	struct rb_node *rb = rb_first_cached(&el->virtual);
+
+	while (rb) {
+		struct virtual_engine *ve =
+			rb_entry(rb, typeof(*ve), nodes[engine->id].rb);
+		struct i915_request *rq = READ_ONCE(ve->request);
+
+		if (!rq) { /* lazily cleanup after another engine handled rq */
+			rb_erase_cached(rb, &el->virtual);
+			RB_CLEAR_NODE(rb);
+			rb = rb_first_cached(&el->virtual);
+			continue;
+		}
+
+		if (!virtual_matches(ve, rq, engine)) {
+			rb = rb_next(rb);
+			continue;
+		}
+
+		return ve;
+	}
+
+	return NULL;
+}
+
 static void virtual_xfer_breadcrumbs(struct virtual_engine *ve)
 {
 	/*
@@ -1900,7 +1927,7 @@ static void defer_active(struct intel_engine_cs *engine)
 static bool
 need_timeslice(const struct intel_engine_cs *engine,
 	       const struct i915_request *rq,
-	       const struct rb_node *rb)
+	       struct virtual_engine *ve)
 {
 	int hint;
 
@@ -1909,9 +1936,7 @@ need_timeslice(const struct intel_engine_cs *engine,
 
 	hint = engine->execlists.queue_priority_hint;
 
-	if (rb) {
-		const struct virtual_engine *ve =
-			rb_entry(rb, typeof(*ve), nodes[engine->id].rb);
+	if (ve) {
 		const struct intel_engine_cs *inflight =
 			intel_context_inflight(&ve->context);
 
@@ -2063,7 +2088,8 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
 	struct intel_engine_execlists * const execlists = &engine->execlists;
 	struct i915_request **port = execlists->pending;
 	struct i915_request ** const last_port = port + execlists->port_mask;
-	struct i915_request * const *active;
+	struct i915_request * const *active = READ_ONCE(execlists->active);
+	struct virtual_engine *ve = first_virtual_engine(engine);
 	struct i915_request *last;
 	struct rb_node *rb;
 	bool submit = false;
@@ -2090,26 +2116,6 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
 	 * and context switches) submission.
 	 */
 
-	for (rb = rb_first_cached(&execlists->virtual); rb; ) {
-		struct virtual_engine *ve =
-			rb_entry(rb, typeof(*ve), nodes[engine->id].rb);
-		struct i915_request *rq = READ_ONCE(ve->request);
-
-		if (!rq) { /* lazily cleanup after another engine handled rq */
-			rb_erase_cached(rb, &execlists->virtual);
-			RB_CLEAR_NODE(rb);
-			rb = rb_first_cached(&execlists->virtual);
-			continue;
-		}
-
-		if (!virtual_matches(ve, rq, engine)) {
-			rb = rb_next(rb);
-			continue;
-		}
-
-		break;
-	}
-
 	/*
 	 * If the queue is higher priority than the last
 	 * request in the currently active context, submit afresh.
@@ -2117,10 +2123,7 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
 	 * the active context to interject the preemption request,
 	 * i.e. we will retrigger preemption following the ack in case
 	 * of trouble.
-	 */
-	active = READ_ONCE(execlists->active);
-
-	/*
+	 *
 	 * In theory we can skip over completed contexts that have not
 	 * yet been processed by events (as those events are in flight):
 	 *
@@ -2131,9 +2134,8 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
 	 * find itself trying to jump back into a context it has just
 	 * completed and barf.
 	 */
-
 	if ((last = *active)) {
-		if (need_preempt(engine, last, rb)) {
+		if (need_preempt(engine, last, ve)) {
 			if (i915_request_completed(last)) {
 				tasklet_hi_schedule(&execlists->tasklet);
 				return;
@@ -2164,7 +2166,7 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
 			__unwind_incomplete_requests(engine);
 
 			last = NULL;
-		} else if (need_timeslice(engine, last, rb) &&
+		} else if (need_timeslice(engine, last, ve) &&
 			   timeslice_expired(execlists, last)) {
 			if (i915_request_completed(last)) {
 				tasklet_hi_schedule(&execlists->tasklet);
@@ -2218,110 +2220,96 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
 		}
 	}
 
-	while (rb) { /* XXX virtual is always taking precedence */
-		struct virtual_engine *ve =
-			rb_entry(rb, typeof(*ve), nodes[engine->id].rb);
+	while (ve) { /* XXX virtual is always taking precedence */
 		struct i915_request *rq;
 
 		spin_lock(&ve->base.active.lock);
 
 		rq = ve->request;
-		if (unlikely(!rq)) { /* lost the race to a sibling */
-			spin_unlock(&ve->base.active.lock);
-			rb_erase_cached(rb, &execlists->virtual);
-			RB_CLEAR_NODE(rb);
-			rb = rb_first_cached(&execlists->virtual);
-			continue;
-		}
+		if (unlikely(!rq)) /* lost the race to a sibling */
+			goto unlock;
 
 		GEM_BUG_ON(rq != ve->request);
 		GEM_BUG_ON(rq->engine != &ve->base);
 		GEM_BUG_ON(rq->context != &ve->context);
 
-		if (rq_prio(rq) >= queue_prio(execlists)) {
-			if (!virtual_matches(ve, rq, engine)) {
-				spin_unlock(&ve->base.active.lock);
-				rb = rb_next(rb);
-				continue;
-			}
+		if (unlikely(rq_prio(rq) < queue_prio(execlists))) {
+			spin_unlock(&ve->base.active.lock);
+			break;
+		}
 
-			if (last && !can_merge_rq(last, rq)) {
-				spin_unlock(&ve->base.active.lock);
-				start_timeslice(engine, rq_prio(rq));
-				return; /* leave this for another sibling */
-			}
+		GEM_BUG_ON(!virtual_matches(ve, rq, engine));
 
-			ENGINE_TRACE(engine,
-				     "virtual rq=%llx:%lld%s, new engine? %s\n",
-				     rq->fence.context,
-				     rq->fence.seqno,
-				     i915_request_completed(rq) ? "!" :
-				     i915_request_started(rq) ? "*" :
-				     "",
-				     yesno(engine != ve->siblings[0]));
-
-			WRITE_ONCE(ve->request, NULL);
-			WRITE_ONCE(ve->base.execlists.queue_priority_hint,
-				   INT_MIN);
-			rb_erase_cached(rb, &execlists->virtual);
-			RB_CLEAR_NODE(rb);
+		if (last && !can_merge_rq(last, rq)) {
+			spin_unlock(&ve->base.active.lock);
+			start_timeslice(engine, rq_prio(rq));
+			return; /* leave this for another sibling */
+		}
 
-			GEM_BUG_ON(!(rq->execution_mask & engine->mask));
-			WRITE_ONCE(rq->engine, engine);
+		ENGINE_TRACE(engine,
+			     "virtual rq=%llx:%lld%s, new engine? %s\n",
+			     rq->fence.context,
+			     rq->fence.seqno,
+			     i915_request_completed(rq) ? "!" :
+			     i915_request_started(rq) ? "*" :
+			     "",
+			     yesno(engine != ve->siblings[0]));
 
-			if (engine != ve->siblings[0]) {
-				u32 *regs = ve->context.lrc_reg_state;
-				unsigned int n;
+		WRITE_ONCE(ve->request, NULL);
+		WRITE_ONCE(ve->base.execlists.queue_priority_hint, INT_MIN);
 
-				GEM_BUG_ON(READ_ONCE(ve->context.inflight));
+		rb = &ve->nodes[engine->id].rb;
+		rb_erase_cached(rb, &execlists->virtual);
+		RB_CLEAR_NODE(rb);
 
-				if (!intel_engine_has_relative_mmio(engine))
-					virtual_update_register_offsets(regs,
-									engine);
+		GEM_BUG_ON(!(rq->execution_mask & engine->mask));
+		WRITE_ONCE(rq->engine, engine);
 
-				if (!list_empty(&ve->context.signals))
-					virtual_xfer_breadcrumbs(ve);
+		if (engine != ve->siblings[0]) {
+			u32 *regs = ve->context.lrc_reg_state;
+			unsigned int n;
 
-				/*
-				 * Move the bound engine to the top of the list
-				 * for future execution. We then kick this
-				 * tasklet first before checking others, so that
-				 * we preferentially reuse this set of bound
-				 * registers.
-				 */
-				for (n = 1; n < ve->num_siblings; n++) {
-					if (ve->siblings[n] == engine) {
-						swap(ve->siblings[n],
-						     ve->siblings[0]);
-						break;
-					}
-				}
+			GEM_BUG_ON(READ_ONCE(ve->context.inflight));
 
-				GEM_BUG_ON(ve->siblings[0] != engine);
-			}
+			if (!intel_engine_has_relative_mmio(engine))
+				virtual_update_register_offsets(regs, engine);
 
-			if (__i915_request_submit(rq)) {
-				submit = true;
-				last = rq;
-			}
-			i915_request_put(rq);
+			if (!list_empty(&ve->context.signals))
+				virtual_xfer_breadcrumbs(ve);
 
 			/*
-			 * Hmm, we have a bunch of virtual engine requests,
-			 * but the first one was already completed (thanks
-			 * preempt-to-busy!). Keep looking at the veng queue
-			 * until we have no more relevant requests (i.e.
-			 * the normal submit queue has higher priority).
+			 * Move the bound engine to the top of the list for
+			 * future execution. We then kick this tasklet first
+			 * before checking others, so that we preferentially
+			 * reuse this set of bound registers.
 			 */
-			if (!submit) {
-				spin_unlock(&ve->base.active.lock);
-				rb = rb_first_cached(&execlists->virtual);
-				continue;
+			for (n = 1; n < ve->num_siblings; n++) {
+				if (ve->siblings[n] == engine) {
+					swap(ve->siblings[n], ve->siblings[0]);
+					break;
+				}
 			}
+
+			GEM_BUG_ON(ve->siblings[0] != engine);
+		}
+
+		if (__i915_request_submit(rq)) {
+			submit = true;
+			last = rq;
 		}
 
+		i915_request_put(rq);
+unlock:
 		spin_unlock(&ve->base.active.lock);
-		break;
+
+		/*
+		 * Hmm, we have a bunch of virtual engine requests,
+		 * but the first one was already completed (thanks
+		 * preempt-to-busy!). Keep looking at the veng queue
+		 * until we have no more relevant requests (i.e.
+		 * the normal submit queue has higher priority).
+		 */
+		ve = submit ? NULL : first_virtual_engine(engine);
 	}
 
 	while ((rb = rb_first_cached(&execlists->queue))) {
-- 
2.20.1


* [Intel-gfx] [PATCH 09/10] drm/i915/gt: Decouple inflight virtual engines
From: Chris Wilson @ 2020-06-15 12:39 UTC
  To: intel-gfx; +Cc: Chris Wilson

Once a virtual engine has been bound to a sibling, it will remain bound
until we finally schedule out the last active request. We cannot rebind
the context to a new sibling while it is inflight, as the context save
would conflict, hence we wait. And since we cannot use any other
sibling while the context is inflight, only kick the bound sibling
while it is inflight, and upon scheduling out kick the rest (so that we
can swap engines on timeslicing if the previously bound engine becomes
oversubscribed).
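
The resulting tasklet loop, condensed from the diff below: each sibling
is handled entirely under its own active.lock, and the walk now stops
as soon as the context is found to be inflight on its bound engine:

	for (n = 0; n < ve->num_siblings; n++) {
		spin_lock_irq(&sibling->active.lock);
		... /* queue or reposition ve->nodes[sibling->id] */
		spin_unlock_irq(&sibling->active.lock);

		if (intel_context_inflight(&ve->context))
			break;	/* stay bound until schedule-out kicks us */
	}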

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/gt/intel_lrc.c | 30 +++++++++++++----------------
 1 file changed, 13 insertions(+), 17 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
index 34a8eadc2de3..4a1e23e51314 100644
--- a/drivers/gpu/drm/i915/gt/intel_lrc.c
+++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
@@ -1409,9 +1409,8 @@ execlists_schedule_in(struct i915_request *rq, int idx)
 static void kick_siblings(struct i915_request *rq, struct intel_context *ce)
 {
 	struct virtual_engine *ve = container_of(ce, typeof(*ve), context);
-	struct i915_request *next = READ_ONCE(ve->request);
 
-	if (next == rq || (next && next->execution_mask & ~rq->execution_mask))
+	if (READ_ONCE(ve->request))
 		tasklet_hi_schedule(&ve->base.execlists.tasklet);
 }
 
@@ -1825,18 +1824,14 @@ first_virtual_engine(struct intel_engine_cs *engine)
 			rb_entry(rb, typeof(*ve), nodes[engine->id].rb);
 		struct i915_request *rq = READ_ONCE(ve->request);
 
-		if (!rq) { /* lazily cleanup after another engine handled rq */
+		/* lazily cleanup after another engine handled rq */
+		if (!rq || !virtual_matches(ve, rq, engine)) {
 			rb_erase_cached(rb, &el->virtual);
 			RB_CLEAR_NODE(rb);
 			rb = rb_first_cached(&el->virtual);
 			continue;
 		}
 
-		if (!virtual_matches(ve, rq, engine)) {
-			rb = rb_next(rb);
-			continue;
-		}
-
 		return ve;
 	}
 
@@ -5486,7 +5481,6 @@ static void virtual_submission_tasklet(unsigned long data)
 	if (unlikely(!mask))
 		return;
 
-	local_irq_disable();
 	for (n = 0; n < ve->num_siblings; n++) {
 		struct intel_engine_cs *sibling = READ_ONCE(ve->siblings[n]);
 		struct ve_node * const node = &ve->nodes[sibling->id];
@@ -5496,20 +5490,19 @@ static void virtual_submission_tasklet(unsigned long data)
 		if (!READ_ONCE(ve->request))
 			break; /* already handled by a sibling's tasklet */
 
+		spin_lock_irq(&sibling->active.lock);
+
 		if (unlikely(!(mask & sibling->mask))) {
 			if (!RB_EMPTY_NODE(&node->rb)) {
-				spin_lock(&sibling->active.lock);
 				rb_erase_cached(&node->rb,
 						&sibling->execlists.virtual);
 				RB_CLEAR_NODE(&node->rb);
-				spin_unlock(&sibling->active.lock);
 			}
-			continue;
-		}
 
-		spin_lock(&sibling->active.lock);
+			goto unlock_engine;
+		}
 
-		if (!RB_EMPTY_NODE(&node->rb)) {
+		if (unlikely(!RB_EMPTY_NODE(&node->rb))) {
 			/*
 			 * Cheat and avoid rebalancing the tree if we can
 			 * reuse this node in situ.
@@ -5549,9 +5542,12 @@ static void virtual_submission_tasklet(unsigned long data)
 		if (first && prio > sibling->execlists.queue_priority_hint)
 			tasklet_hi_schedule(&sibling->execlists.tasklet);
 
-		spin_unlock(&sibling->active.lock);
+unlock_engine:
+		spin_unlock_irq(&sibling->active.lock);
+
+		if (intel_context_inflight(&ve->context))
+			break;
 	}
-	local_irq_enable();
 }
 
 static void virtual_submit_request(struct i915_request *rq)
-- 
2.20.1


* [Intel-gfx] [PATCH 10/10] drm/i915/gt: Resubmit the virtual engine on schedule-out
From: Chris Wilson @ 2020-06-15 12:39 UTC
  To: intel-gfx; +Cc: Chris Wilson

Having recognised that we do not change the sibling until we schedule
out, we can then defer the decision to resubmit the virtual engine from
the unwind of the active queue to scheduling out of the virtual context.

By keeping the unwind order intact on the local engine, we can preserve
data dependency ordering while doing a preempt-to-busy pass until we
have determined the new ELSP. This means that if we try to timeslice
between a virtual engine and a data-dependent ordinary request, the pair
will maintain their relative ordering and we will avoid the
resubmission, cancelling the timeslicing until further change.

The dilemma, though, is that we may then end up in a situation where
the 'demotion' of a virtual request to an ordinary request in the
engine queue results in filling the ELSP[] with virtual requests
instead of spreading the load across the engines. To compensate, we
mark each virtual request and refuse to submit it into the secondary
ELSP slots, forcing subsequent virtual requests to wait until the first
is scheduled out after timeslicing. By delaying the decision until we
schedule out, we avoid unnecessary resubmission.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/gt/intel_lrc.c    | 118 ++++++++++++++++---------
 drivers/gpu/drm/i915/gt/selftest_lrc.c |   2 +-
 2 files changed, 75 insertions(+), 45 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
index 4a1e23e51314..45029eb86b6e 100644
--- a/drivers/gpu/drm/i915/gt/intel_lrc.c
+++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
@@ -1117,53 +1117,23 @@ __unwind_incomplete_requests(struct intel_engine_cs *engine)
 
 		__i915_request_unsubmit(rq);
 
-		/*
-		 * Push the request back into the queue for later resubmission.
-		 * If this request is not native to this physical engine (i.e.
-		 * it came from a virtual source), push it back onto the virtual
-		 * engine so that it can be moved across onto another physical
-		 * engine as load dictates.
-		 */
-		if (likely(rq->execution_mask == engine->mask)) {
-			GEM_BUG_ON(rq_prio(rq) == I915_PRIORITY_INVALID);
-			if (rq_prio(rq) != prio) {
-				prio = rq_prio(rq);
-				pl = i915_sched_lookup_priolist(engine, prio);
-			}
-			GEM_BUG_ON(RB_EMPTY_ROOT(&engine->execlists.queue.rb_root));
-
-			list_move(&rq->sched.link, pl);
-			set_bit(I915_FENCE_FLAG_PQUEUE, &rq->fence.flags);
+		GEM_BUG_ON(rq_prio(rq) == I915_PRIORITY_INVALID);
+		if (rq_prio(rq) != prio) {
+			prio = rq_prio(rq);
+			pl = i915_sched_lookup_priolist(engine, prio);
+		}
+		GEM_BUG_ON(RB_EMPTY_ROOT(&engine->execlists.queue.rb_root));
 
-			/* Check in case we rollback so far we wrap [size/2] */
-			if (intel_ring_direction(rq->ring,
-						 intel_ring_wrap(rq->ring,
-								 rq->tail),
-						 rq->ring->tail) > 0)
-				rq->context->lrc.desc |= CTX_DESC_FORCE_RESTORE;
+		list_move(&rq->sched.link, pl);
+		set_bit(I915_FENCE_FLAG_PQUEUE, &rq->fence.flags);
 
-			active = rq;
-		} else {
-			struct intel_engine_cs *owner = rq->context->engine;
+		/* Check in case we rollback so far we wrap [size/2] */
+		if (intel_ring_direction(rq->ring,
+					 intel_ring_wrap(rq->ring, rq->tail),
+					 rq->ring->tail) > 0)
+			rq->context->lrc.desc |= CTX_DESC_FORCE_RESTORE;
 
-			/*
-			 * Decouple the virtual breadcrumb before moving it
-			 * back to the virtual engine -- we don't want the
-			 * request to complete in the background and try
-			 * and cancel the breadcrumb on the virtual engine
-			 * (instead of the old engine where it is linked)!
-			 */
-			if (test_bit(DMA_FENCE_FLAG_ENABLE_SIGNAL_BIT,
-				     &rq->fence.flags)) {
-				spin_lock_nested(&rq->lock,
-						 SINGLE_DEPTH_NESTING);
-				i915_request_cancel_breadcrumb(rq);
-				spin_unlock(&rq->lock);
-			}
-			WRITE_ONCE(rq->engine, owner);
-			owner->submit_request(rq);
-			active = NULL;
-		}
+		active = rq;
 	}
 
 	return active;
@@ -1406,12 +1376,49 @@ execlists_schedule_in(struct i915_request *rq, int idx)
 	return i915_request_get(rq);
 }
 
+static void
+resubmit_virtual_request(struct i915_request *rq, struct virtual_engine *ve)
+{
+	struct intel_engine_cs *engine = rq->engine;
+	unsigned long flags;
+
+	spin_lock_irqsave(&engine->active.lock, flags);
+
+	/*
+	 * Decouple the virtual breadcrumb before moving it back to the virtual
+	 * engine -- we don't want the request to complete in the background
+	 * and then try and cancel the breadcrumb on the virtual engine
+	 * (instead of the old engine where it is linked)!
+	 */
+	if (test_bit(DMA_FENCE_FLAG_ENABLE_SIGNAL_BIT, &rq->fence.flags)) {
+		spin_lock_nested(&rq->lock, SINGLE_DEPTH_NESTING);
+		i915_request_cancel_breadcrumb(rq);
+		spin_unlock(&rq->lock);
+	}
+
+	clear_bit(I915_FENCE_FLAG_PQUEUE, &rq->fence.flags);
+	WRITE_ONCE(rq->engine, &ve->base);
+	ve->base.submit_request(rq);
+
+	spin_unlock_irqrestore(&engine->active.lock, flags);
+}
+
 static void kick_siblings(struct i915_request *rq, struct intel_context *ce)
 {
 	struct virtual_engine *ve = container_of(ce, typeof(*ve), context);
 
 	if (READ_ONCE(ve->request))
 		tasklet_hi_schedule(&ve->base.execlists.tasklet);
+
+	/*
+	 * This engine is now too busy to run this virtual request, so
+	 * see if we can find an alternative engine for it to execute on.
+	 * Once a request has become bonded to this engine, we treat it the
+	 * same as any other native request.
+	 */
+	if (i915_request_in_priority_queue(rq) &&
+	    rq->execution_mask != rq->engine->mask)
+		resubmit_virtual_request(rq, ve);
 }
 
 static inline void __execlists_schedule_out(struct i915_request *rq)
@@ -1650,6 +1657,20 @@ assert_pending_valid(const struct intel_engine_execlists *execlists,
 		}
 		sentinel = i915_request_has_sentinel(rq);
 
+		/*
+		 * We want virtual requests to only be in the first slot so
+		 * that they are never stuck behind a hog and can be immediately
+		 * transferred onto the next idle engine.
+		 */
+		if (rq->execution_mask != engine->mask &&
+		    port != execlists->pending) {
+			GEM_TRACE_ERR("%s: virtual engine:%llx not in prime position[%zd]\n",
+				      engine->name,
+				      ce->timeline->fence_context,
+				      port - execlists->pending);
+			return false;
+		}
+
 		/* Hold tightly onto the lock to prevent concurrent retires! */
 		if (!spin_trylock_irqsave(&rq->lock, flags))
 			continue;
@@ -2346,6 +2367,15 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
 				if (i915_request_has_sentinel(last))
 					goto done;
 
+				/*
+				 * We avoid submitting virtual requests into
+				 * the secondary ports so that we can migrate
+				 * the request immediately to another engine
+				 * rather than wait for the primary request.
+				 */
+				if (rq->execution_mask != engine->mask)
+					goto done;
+
 				/*
 				 * If GVT overrides us we only ever submit
 				 * port[0], leaving port[1] empty. Note that we
diff --git a/drivers/gpu/drm/i915/gt/selftest_lrc.c b/drivers/gpu/drm/i915/gt/selftest_lrc.c
index 91543494f595..a8d62faa3a88 100644
--- a/drivers/gpu/drm/i915/gt/selftest_lrc.c
+++ b/drivers/gpu/drm/i915/gt/selftest_lrc.c
@@ -4289,7 +4289,7 @@ static int reset_virtual_engine(struct intel_gt *gt,
 	spin_lock_irq(&engine->active.lock);
 	__unwind_incomplete_requests(engine);
 	spin_unlock_irq(&engine->active.lock);
-	GEM_BUG_ON(rq->engine != ve->engine);
+	GEM_BUG_ON(rq->engine != engine);
 
 	/* Reset the engine while keeping our active request on hold */
 	execlists_hold(engine, rq);
-- 
2.20.1

* [Intel-gfx] [PATCH] drm/i915/gt: Add a safety submission flush in the heartbeat
  2020-06-15 12:39 ` [Intel-gfx] [PATCH 03/10] drm/i915/gt: Add a safety submission flush in the heartbeat Chris Wilson
@ 2020-06-15 13:11   ` Chris Wilson
  2020-06-15 16:16     ` Mika Kuoppala
  0 siblings, 1 reply; 21+ messages in thread
From: Chris Wilson @ 2020-06-15 13:11 UTC (permalink / raw)
  To: intel-gfx; +Cc: Chris Wilson

Just in case everything fails (for example, "missed interrupt
syndrome" on Sandybridge), always flush the submission tasklet from the
heartbeat. This papers over such issues, so the failure appears only as
a second-long glitch and goes undetected unless we happen to be
performing a timed test.

v2: We rely on flush_submission() synchronizing with the tasklet on
another CPU.
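
The resulting flush, condensed from the hunk below: first synchronise
with a tasklet already running on another CPU, then run the handler
inline unless a GPU reset has disabled it.

	tasklet_kill(t);		/* wait out a run on another CPU */

	local_bh_disable();
	if (tasklet_trylock(t)) {
		if (__tasklet_is_enabled(t))	/* no reset in progress */
			t->func(t->data);
		tasklet_unlock(t);
	}
	local_bh_enable();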

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/gt/intel_engine_cs.c     | 23 +++++++++----------
 .../gpu/drm/i915/gt/intel_engine_heartbeat.c  |  3 +++
 2 files changed, 14 insertions(+), 12 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
index d613cf31970c..31049e0bdb57 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
@@ -1094,19 +1094,18 @@ void intel_engine_flush_submission(struct intel_engine_cs *engine)
 {
 	struct tasklet_struct *t = &engine->execlists.tasklet;
 
-	if (__tasklet_is_scheduled(t)) {
-		local_bh_disable();
-		if (tasklet_trylock(t)) {
-			/* Must wait for any GPU reset in progress. */
-			if (__tasklet_is_enabled(t))
-				t->func(t->data);
-			tasklet_unlock(t);
-		}
-		local_bh_enable();
+	/* Synchronise and wait for the tasklet on another CPU */
+	tasklet_kill(t);
+
+	/* Having cancelled the tasklet, ensure that it is run */
+	local_bh_disable();
+	if (tasklet_trylock(t)) {
+		/* Must wait for any GPU reset in progress. */
+		if (__tasklet_is_enabled(t))
+			t->func(t->data);
+		tasklet_unlock(t);
 	}
-
-	/* Otherwise flush the tasklet if it was running on another cpu */
-	tasklet_unlock_wait(t);
+	local_bh_enable();
 }
 
 /**
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c b/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c
index f67ad937eefb..cd20fb549b38 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c
@@ -65,6 +65,9 @@ static void heartbeat(struct work_struct *wrk)
 	struct intel_context *ce = engine->kernel_context;
 	struct i915_request *rq;
 
+	/* Just in case everything has gone horribly wrong, give it a kick */
+	intel_engine_flush_submission(engine);
+
 	rq = engine->heartbeat.systole;
 	if (rq && i915_request_completed(rq)) {
 		i915_request_put(rq);
-- 
2.20.1

* [Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for series starting with [01/10] drm/i915/selftests: Disable preemptive heartbeats over preemption tests (rev2)
  2020-06-15 12:39 [Intel-gfx] [PATCH 01/10] drm/i915/selftests: Disable preemptive heartbeats over preemption tests Chris Wilson
                   ` (8 preceding siblings ...)
  2020-06-15 12:39 ` [Intel-gfx] [PATCH 10/10] drm/i915/gt: Resubmit the virtual engine on schedule-out Chris Wilson
@ 2020-06-15 14:03 ` Patchwork
  2020-06-15 14:05 ` [Intel-gfx] ✗ Fi.CI.SPARSE: " Patchwork
                   ` (5 subsequent siblings)
  15 siblings, 0 replies; 21+ messages in thread
From: Patchwork @ 2020-06-15 14:03 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

== Series Details ==

Series: series starting with [01/10] drm/i915/selftests: Disable preemptive heartbeats over preemption tests (rev2)
URL   : https://patchwork.freedesktop.org/series/78371/
State : warning

== Summary ==

$ dim checkpatch origin/drm-tip
4a185d115cdb drm/i915/selftests: Disable preemptive heartbeats over preemption tests
7188eaf51b40 drm/i915/selftests: Dump engine state and trace upon hanging after reset
-:29: CHECK:LINE_SPACING: Please don't use multiple blank lines
#29: FILE: drivers/gpu/drm/i915/gt/selftest_hangcheck.c:511:
+
+

total: 0 errors, 0 warnings, 1 checks, 24 lines checked
8ca01417d140 drm/i915/gt: Add a safety submission flush in the heartbeat
211c9c5a677c drm/i915/execlists: Replace direct submit with direct call to tasklet
502d7d962450 drm/i915/execlists: Defer schedule_out until after the next dequeue
-:117: CHECK:SPACING: spaces preferred around that '*' (ctx:ExV)
#117: FILE: drivers/gpu/drm/i915/gt/intel_lrc.c:2441:
+				*execlists->inactive++ = *port;
 				^

total: 0 errors, 0 warnings, 1 checks, 179 lines checked
ab23cefb7c81 drm/i915/gt: ce->inflight updates are now serialised
ffe345ebb5eb drm/i915/gt: Drop atomic for engine->fw_active tracking
d8fe93841b53 drm/i915/gt: Use virtual_engine during execlists_dequeue
db4fcfc79839 drm/i915/gt: Decouple inflight virtual engines
2d03c2830ca3 drm/i915/gt: Resubmit the virtual engine on schedule-out

* [Intel-gfx] ✗ Fi.CI.SPARSE: warning for series starting with [01/10] drm/i915/selftests: Disable preemptive heartbeats over preemption tests (rev2)
  2020-06-15 12:39 [Intel-gfx] [PATCH 01/10] drm/i915/selftests: Disable preemptive heartbeats over preemption tests Chris Wilson
                   ` (9 preceding siblings ...)
  2020-06-15 14:03 ` [Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for series starting with [01/10] drm/i915/selftests: Disable preemptive heartbeats over preemption tests (rev2) Patchwork
@ 2020-06-15 14:05 ` Patchwork
  2020-06-15 14:26 ` [Intel-gfx] ✗ Fi.CI.BAT: failure " Patchwork
                   ` (4 subsequent siblings)
  15 siblings, 0 replies; 21+ messages in thread
From: Patchwork @ 2020-06-15 14:05 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

== Series Details ==

Series: series starting with [01/10] drm/i915/selftests: Disable preemptive heartbeats over preemption tests (rev2)
URL   : https://patchwork.freedesktop.org/series/78371/
State : warning

== Summary ==

$ dim sparse --fast origin/drm-tip
Sparse version: v0.6.0
Fast mode used, each commit won't be checked separately.
+drivers/gpu/drm/i915/display/intel_display.c:1222:22: error: Expected constant expression in case statement
+drivers/gpu/drm/i915/display/intel_display.c:1225:22: error: Expected constant expression in case statement
+drivers/gpu/drm/i915/display/intel_display.c:1228:22: error: Expected constant expression in case statement
+drivers/gpu/drm/i915/display/intel_display.c:1231:22: error: Expected constant expression in case statement
+drivers/gpu/drm/i915/gem/i915_gem_context.c:2269:17: error: bad integer constant expression
+drivers/gpu/drm/i915/gem/i915_gem_context.c:2270:17: error: bad integer constant expression
+drivers/gpu/drm/i915/gem/i915_gem_context.c:2271:17: error: bad integer constant expression
+drivers/gpu/drm/i915/gem/i915_gem_context.c:2272:17: error: bad integer constant expression
+drivers/gpu/drm/i915/gem/i915_gem_context.c:2273:17: error: bad integer constant expression
+drivers/gpu/drm/i915/gem/i915_gem_context.c:2274:17: error: bad integer constant expression
+drivers/gpu/drm/i915/gt/intel_reset.c:1310:5: warning: context imbalance in 'intel_gt_reset_trylock' - different lock contexts for basic block
+drivers/gpu/drm/i915/gt/sysfs_engines.c:61:10: error: bad integer constant expression
+drivers/gpu/drm/i915/gt/sysfs_engines.c:62:10: error: bad integer constant expression
+drivers/gpu/drm/i915/gt/sysfs_engines.c:66:10: error: bad integer constant expression
+drivers/gpu/drm/i915/gvt/mmio.c:287:23: warning: memcpy with byte count of 279040
+drivers/gpu/drm/i915/i915_perf.c:1425:15: warning: memset with byte count of 16777216
+drivers/gpu/drm/i915/i915_perf.c:1479:15: warning: memset with byte count of 16777216
+./include/linux/spinlock.h:408:9: warning: context imbalance in 'fwtable_read16' - different lock contexts for basic block
+./include/linux/spinlock.h:408:9: warning: context imbalance in 'fwtable_read32' - different lock contexts for basic block
+./include/linux/spinlock.h:408:9: warning: context imbalance in 'fwtable_read64' - different lock contexts for basic block
+./include/linux/spinlock.h:408:9: warning: context imbalance in 'fwtable_read8' - different lock contexts for basic block
+./include/linux/spinlock.h:408:9: warning: context imbalance in 'fwtable_write16' - different lock contexts for basic block
+./include/linux/spinlock.h:408:9: warning: context imbalance in 'fwtable_write32' - different lock contexts for basic block
+./include/linux/spinlock.h:408:9: warning: context imbalance in 'fwtable_write8' - different lock contexts for basic block
+./include/linux/spinlock.h:408:9: warning: context imbalance in 'gen11_fwtable_read16' - different lock contexts for basic block
+./include/linux/spinlock.h:408:9: warning: context imbalance in 'gen11_fwtable_read32' - different lock contexts for basic block
+./include/linux/spinlock.h:408:9: warning: context imbalance in 'gen11_fwtable_read64' - different lock contexts for basic block
+./include/linux/spinlock.h:408:9: warning: context imbalance in 'gen11_fwtable_read8' - different lock contexts for basic block
+./include/linux/spinlock.h:408:9: warning: context imbalance in 'gen11_fwtable_write16' - different lock contexts for basic block
+./include/linux/spinlock.h:408:9: warning: context imbalance in 'gen11_fwtable_write32' - different lock contexts for basic block
+./include/linux/spinlock.h:408:9: warning: context imbalance in 'gen11_fwtable_write8' - different lock contexts for basic block
+./include/linux/spinlock.h:408:9: warning: context imbalance in 'gen12_fwtable_read16' - different lock contexts for basic block
+./include/linux/spinlock.h:408:9: warning: context imbalance in 'gen12_fwtable_read32' - different lock contexts for basic block
+./include/linux/spinlock.h:408:9: warning: context imbalance in 'gen12_fwtable_read64' - different lock contexts for basic block
+./include/linux/spinlock.h:408:9: warning: context imbalance in 'gen12_fwtable_read8' - different lock contexts for basic block
+./include/linux/spinlock.h:408:9: warning: context imbalance in 'gen12_fwtable_write16' - different lock contexts for basic block
+./include/linux/spinlock.h:408:9: warning: context imbalance in 'gen12_fwtable_write32' - different lock contexts for basic block
+./include/linux/spinlock.h:408:9: warning: context imbalance in 'gen12_fwtable_write8' - different lock contexts for basic block
+./include/linux/spinlock.h:408:9: warning: context imbalance in 'gen6_read16' - different lock contexts for basic block
+./include/linux/spinlock.h:408:9: warning: context imbalance in 'gen6_read32' - different lock contexts for basic block
+./include/linux/spinlock.h:408:9: warning: context imbalance in 'gen6_read64' - different lock contexts for basic block
+./include/linux/spinlock.h:408:9: warning: context imbalance in 'gen6_read8' - different lock contexts for basic block
+./include/linux/spinlock.h:408:9: warning: context imbalance in 'gen6_write16' - different lock contexts for basic block
+./include/linux/spinlock.h:408:9: warning: context imbalance in 'gen6_write32' - different lock contexts for basic block
+./include/linux/spinlock.h:408:9: warning: context imbalance in 'gen6_write8' - different lock contexts for basic block
+./include/linux/spinlock.h:408:9: warning: context imbalance in 'gen8_write16' - different lock contexts for basic block
+./include/linux/spinlock.h:408:9: warning: context imbalance in 'gen8_write32' - different lock contexts for basic block
+./include/linux/spinlock.h:408:9: warning: context imbalance in 'gen8_write8' - different lock contexts for basic block

* [Intel-gfx] ✗ Fi.CI.BAT: failure for series starting with [01/10] drm/i915/selftests: Disable preemptive heartbeats over preemption tests (rev2)
  2020-06-15 12:39 [Intel-gfx] [PATCH 01/10] drm/i915/selftests: Disable preemptive heartbeats over preemption tests Chris Wilson
                   ` (10 preceding siblings ...)
  2020-06-15 14:05 ` [Intel-gfx] ✗ Fi.CI.SPARSE: " Patchwork
@ 2020-06-15 14:26 ` Patchwork
  2020-06-15 14:55 ` [Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for series starting with [01/10] drm/i915/selftests: Disable preemptive heartbeats over preemption tests (rev3) Patchwork
                   ` (3 subsequent siblings)
  15 siblings, 0 replies; 21+ messages in thread
From: Patchwork @ 2020-06-15 14:26 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

== Series Details ==

Series: series starting with [01/10] drm/i915/selftests: Disable preemptive heartbeats over preemption tests (rev2)
URL   : https://patchwork.freedesktop.org/series/78371/
State : failure

== Summary ==

CI Bug Log - changes from CI_DRM_8627 -> Patchwork_17948
====================================================

Summary
-------

  **FAILURE**

  Serious unknown changes coming with Patchwork_17948 absolutely need to be
  verified manually.
  
  If you think the reported changes have nothing to do with the changes
  introduced in Patchwork_17948, please notify your bug team to allow them
  to document this new failure mode, which will reduce false positives in CI.

  External URL: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17948/index.html

Possible new issues
-------------------

  Here are the unknown changes that may have been introduced in Patchwork_17948:

### IGT changes ###

#### Possible regressions ####

  * igt@i915_selftest@live@hangcheck:
    - fi-tgl-u2:          [PASS][1] -> [INCOMPLETE][2]
   [1]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8627/fi-tgl-u2/igt@i915_selftest@live@hangcheck.html
   [2]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17948/fi-tgl-u2/igt@i915_selftest@live@hangcheck.html

  
Known issues
------------

  Here are the changes found in Patchwork_17948 that come from known issues:

### IGT changes ###

#### Issues hit ####

  * igt@gem_exec_suspend@basic-s0:
    - fi-apl-guc:         [PASS][3] -> [INCOMPLETE][4] ([i915#1242])
   [3]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8627/fi-apl-guc/igt@gem_exec_suspend@basic-s0.html
   [4]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17948/fi-apl-guc/igt@gem_exec_suspend@basic-s0.html

  * igt@i915_module_load@reload:
    - fi-tgl-u2:          [PASS][5] -> [DMESG-WARN][6] ([i915#1982])
   [5]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8627/fi-tgl-u2/igt@i915_module_load@reload.html
   [6]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17948/fi-tgl-u2/igt@i915_module_load@reload.html

  * igt@i915_pm_backlight@basic-brightness:
    - fi-whl-u:           [PASS][7] -> [DMESG-WARN][8] ([i915#95])
   [7]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8627/fi-whl-u/igt@i915_pm_backlight@basic-brightness.html
   [8]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17948/fi-whl-u/igt@i915_pm_backlight@basic-brightness.html

  * igt@vgem_basic@dmabuf-mmap:
    - fi-tgl-y:           [PASS][9] -> [DMESG-WARN][10] ([i915#402])
   [9]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8627/fi-tgl-y/igt@vgem_basic@dmabuf-mmap.html
   [10]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17948/fi-tgl-y/igt@vgem_basic@dmabuf-mmap.html

  
#### Possible fixes ####

  * igt@gem_ctx_create@basic:
    - fi-tgl-y:           [DMESG-WARN][11] ([i915#402]) -> [PASS][12] +1 similar issue
   [11]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8627/fi-tgl-y/igt@gem_ctx_create@basic.html
   [12]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17948/fi-tgl-y/igt@gem_ctx_create@basic.html

  * igt@i915_pm_rpm@basic-pci-d3-state:
    - fi-bsw-kefka:       [DMESG-WARN][13] ([i915#1982]) -> [PASS][14]
   [13]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8627/fi-bsw-kefka/igt@i915_pm_rpm@basic-pci-d3-state.html
   [14]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17948/fi-bsw-kefka/igt@i915_pm_rpm@basic-pci-d3-state.html

  * igt@i915_pm_rpm@module-reload:
    - fi-glk-dsi:         [DMESG-WARN][15] ([i915#1982]) -> [PASS][16]
   [15]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8627/fi-glk-dsi/igt@i915_pm_rpm@module-reload.html
   [16]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17948/fi-glk-dsi/igt@i915_pm_rpm@module-reload.html

  * igt@kms_busy@basic@modeset:
    - {fi-tgl-dsi}:       [DMESG-WARN][17] ([i915#1982]) -> [PASS][18] +2 similar issues
   [17]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8627/fi-tgl-dsi/igt@kms_busy@basic@modeset.html
   [18]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17948/fi-tgl-dsi/igt@kms_busy@basic@modeset.html

  * igt@kms_cursor_legacy@basic-busy-flip-before-cursor-atomic:
    - fi-tgl-y:           [DMESG-WARN][19] ([i915#1982]) -> [PASS][20]
   [19]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8627/fi-tgl-y/igt@kms_cursor_legacy@basic-busy-flip-before-cursor-atomic.html
   [20]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17948/fi-tgl-y/igt@kms_cursor_legacy@basic-busy-flip-before-cursor-atomic.html

  * igt@kms_flip@basic-flip-vs-wf_vblank@c-edp1:
    - fi-icl-u2:          [DMESG-WARN][21] ([i915#1982]) -> [PASS][22] +3 similar issues
   [21]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8627/fi-icl-u2/igt@kms_flip@basic-flip-vs-wf_vblank@c-edp1.html
   [22]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17948/fi-icl-u2/igt@kms_flip@basic-flip-vs-wf_vblank@c-edp1.html

  
#### Warnings ####

  * igt@i915_pm_rpm@module-reload:
    - fi-kbl-x1275:       [DMESG-FAIL][23] ([i915#62]) -> [SKIP][24] ([fdo#109271])
   [23]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8627/fi-kbl-x1275/igt@i915_pm_rpm@module-reload.html
   [24]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17948/fi-kbl-x1275/igt@i915_pm_rpm@module-reload.html

  * igt@kms_force_connector_basic@force-edid:
    - fi-kbl-x1275:       [DMESG-WARN][25] ([i915#62] / [i915#92] / [i915#95]) -> [DMESG-WARN][26] ([i915#62] / [i915#92]) +5 similar issues
   [25]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8627/fi-kbl-x1275/igt@kms_force_connector_basic@force-edid.html
   [26]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17948/fi-kbl-x1275/igt@kms_force_connector_basic@force-edid.html

  * igt@kms_pipe_crc_basic@read-crc-pipe-b:
    - fi-kbl-x1275:       [DMESG-WARN][27] ([i915#62] / [i915#92]) -> [DMESG-WARN][28] ([i915#62] / [i915#92] / [i915#95])
   [27]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8627/fi-kbl-x1275/igt@kms_pipe_crc_basic@read-crc-pipe-b.html
   [28]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17948/fi-kbl-x1275/igt@kms_pipe_crc_basic@read-crc-pipe-b.html

  
  {name}: This element is suppressed. This means it is ignored when computing
          the status of the difference (SUCCESS, WARNING, or FAILURE).

  [fdo#109271]: https://bugs.freedesktop.org/show_bug.cgi?id=109271
  [i915#1242]: https://gitlab.freedesktop.org/drm/intel/issues/1242
  [i915#1982]: https://gitlab.freedesktop.org/drm/intel/issues/1982
  [i915#402]: https://gitlab.freedesktop.org/drm/intel/issues/402
  [i915#62]: https://gitlab.freedesktop.org/drm/intel/issues/62
  [i915#92]: https://gitlab.freedesktop.org/drm/intel/issues/92
  [i915#95]: https://gitlab.freedesktop.org/drm/intel/issues/95


Participating hosts (50 -> 32)
------------------------------

  Missing    (18): fi-ilk-m540 fi-hsw-4200u fi-byt-j1900 fi-byt-squawks fi-bsw-cyan fi-bwr-2160 fi-snb-2520m fi-ilk-650 fi-ctg-p8600 fi-hsw-4770 fi-ivb-3770 fi-elk-e7500 fi-pnv-d510 fi-blb-e6850 fi-byt-n2820 fi-byt-clapper fi-bdw-samus fi-snb-2600 


Build changes
-------------

  * Linux: CI_DRM_8627 -> Patchwork_17948

  CI-20190529: 20190529
  CI_DRM_8627: 593c112156feb0f6159814f2276a32c90f243823 @ git://anongit.freedesktop.org/gfx-ci/linux
  IGT_5710: f524eee47930601ad7b4cba9d40c26d68dc7d250 @ git://anongit.freedesktop.org/xorg/app/intel-gpu-tools
  Patchwork_17948: 2d03c2830ca379b6bc73c3e25e323c58ac6955c8 @ git://anongit.freedesktop.org/gfx-ci/linux


== Linux commits ==

2d03c2830ca3 drm/i915/gt: Resubmit the virtual engine on schedule-out
db4fcfc79839 drm/i915/gt: Decouple inflight virtual engines
d8fe93841b53 drm/i915/gt: Use virtual_engine during execlists_dequeue
ffe345ebb5eb drm/i915/gt: Drop atomic for engine->fw_active tracking
ab23cefb7c81 drm/i915/gt: ce->inflight updates are now serialised
502d7d962450 drm/i915/execlists: Defer schedule_out until after the next dequeue
211c9c5a677c drm/i915/execlists: Replace direct submit with direct call to tasklet
8ca01417d140 drm/i915/gt: Add a safety submission flush in the heartbeat
7188eaf51b40 drm/i915/selftests: Dump engine state and trace upon hanging after reset
4a185d115cdb drm/i915/selftests: Disable preemptive heartbeats over preemption tests

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17948/index.html

* [Intel-gfx] [PATCH] drm/i915/selftests: Dump engine state and trace upon hanging after reset
  2020-06-15 12:39 ` [Intel-gfx] [PATCH 02/10] drm/i915/selftests: Dump engine state and trace upon hanging after reset Chris Wilson
@ 2020-06-15 14:33   ` Chris Wilson
  2020-06-15 15:56     ` Mika Kuoppala
  0 siblings, 1 reply; 21+ messages in thread
From: Chris Wilson @ 2020-06-15 14:33 UTC (permalink / raw)
  To: intel-gfx; +Cc: Chris Wilson

If the engine dies after a reset, so that we fail to submit a request
and instead have to be interrupted by the CI runner, dump the engine
state.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/gt/selftest_hangcheck.c | 15 +++++++++++++++
 1 file changed, 15 insertions(+)

diff --git a/drivers/gpu/drm/i915/gt/selftest_hangcheck.c b/drivers/gpu/drm/i915/gt/selftest_hangcheck.c
index 2af66f8ffbd2..1ea4935e0eeb 100644
--- a/drivers/gpu/drm/i915/gt/selftest_hangcheck.c
+++ b/drivers/gpu/drm/i915/gt/selftest_hangcheck.c
@@ -499,6 +499,21 @@ static int igt_reset_nop_engine(void *arg)
 
 				rq = intel_context_create_request(ce);
 				if (IS_ERR(rq)) {
+					struct drm_printer p =
+						drm_info_printer(gt->i915->drm.dev);
+					intel_engine_dump(engine, &p,
+							  "%s(%s): failed to submit request\n",
+							  __func__,
+							  engine->name);
+
+
+					GEM_TRACE("%s(%s): failed to submit request\n",
+						  __func__,
+						  engine->name);
+					GEM_TRACE_DUMP();
+
+					intel_gt_set_wedged(gt);
+
 					err = PTR_ERR(rq);
 					break;
 				}
-- 
2.20.1

* [Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for series starting with [01/10] drm/i915/selftests: Disable preemptive heartbeats over preemption tests (rev3)
  2020-06-15 12:39 [Intel-gfx] [PATCH 01/10] drm/i915/selftests: Disable preemptive heartbeats over preemption tests Chris Wilson
                   ` (11 preceding siblings ...)
  2020-06-15 14:26 ` [Intel-gfx] ✗ Fi.CI.BAT: failure " Patchwork
@ 2020-06-15 14:55 ` Patchwork
  2020-06-15 14:56 ` [Intel-gfx] ✗ Fi.CI.SPARSE: " Patchwork
                   ` (2 subsequent siblings)
  15 siblings, 0 replies; 21+ messages in thread
From: Patchwork @ 2020-06-15 14:55 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

== Series Details ==

Series: series starting with [01/10] drm/i915/selftests: Disable preemptive heartbeats over preemption tests (rev3)
URL   : https://patchwork.freedesktop.org/series/78371/
State : warning

== Summary ==

$ dim checkpatch origin/drm-tip
5fa39ee36f3b drm/i915/selftests: Disable preemptive heartbeats over preemption tests
00b8905ebbd9 drm/i915/selftests: Dump engine state and trace upon hanging after reset
-:27: CHECK:LINE_SPACING: Please don't use multiple blank lines
#27: FILE: drivers/gpu/drm/i915/gt/selftest_hangcheck.c:509:
+
+

total: 0 errors, 0 warnings, 1 checks, 21 lines checked
4dc0044cb492 drm/i915/gt: Add a safety submission flush in the heartbeat
6f4c470b5862 drm/i915/execlists: Replace direct submit with direct call to tasklet
a2f814c038a3 drm/i915/execlists: Defer schedule_out until after the next dequeue
-:117: CHECK:SPACING: spaces preferred around that '*' (ctx:ExV)
#117: FILE: drivers/gpu/drm/i915/gt/intel_lrc.c:2441:
+				*execlists->inactive++ = *port;
 				^

total: 0 errors, 0 warnings, 1 checks, 179 lines checked
cd5b50c90e8b drm/i915/gt: ce->inflight updates are now serialised
db8e1905bf15 drm/i915/gt: Drop atomic for engine->fw_active tracking
fd7fb36f7457 drm/i915/gt: Use virtual_engine during execlists_dequeue
0d9311cd5488 drm/i915/gt: Decouple inflight virtual engines
11a16dbbd0c4 drm/i915/gt: Resubmit the virtual engine on schedule-out

* [Intel-gfx] ✗ Fi.CI.SPARSE: warning for series starting with [01/10] drm/i915/selftests: Disable preemptive heartbeats over preemption tests (rev3)
  2020-06-15 12:39 [Intel-gfx] [PATCH 01/10] drm/i915/selftests: Disable preemptive heartbeats over preemption tests Chris Wilson
                   ` (12 preceding siblings ...)
  2020-06-15 14:55 ` [Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for series starting with [01/10] drm/i915/selftests: Disable preemptive heartbeats over preemption tests (rev3) Patchwork
@ 2020-06-15 14:56 ` Patchwork
  2020-06-15 15:20 ` [Intel-gfx] ✗ Fi.CI.BAT: failure " Patchwork
  2020-06-15 15:29 ` [Intel-gfx] [PATCH 01/10] drm/i915/selftests: Disable preemptive heartbeats over preemption tests Mika Kuoppala
  15 siblings, 0 replies; 21+ messages in thread
From: Patchwork @ 2020-06-15 14:56 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

== Series Details ==

Series: series starting with [01/10] drm/i915/selftests: Disable preemptive heartbeats over preemption tests (rev3)
URL   : https://patchwork.freedesktop.org/series/78371/
State : warning

== Summary ==

$ dim sparse --fast origin/drm-tip
Sparse version: v0.6.0
Fast mode used, each commit won't be checked separately.
+drivers/gpu/drm/i915/display/intel_display.c:1222:22: error: Expected constant expression in case statement
+drivers/gpu/drm/i915/display/intel_display.c:1225:22: error: Expected constant expression in case statement
+drivers/gpu/drm/i915/display/intel_display.c:1228:22: error: Expected constant expression in case statement
+drivers/gpu/drm/i915/display/intel_display.c:1231:22: error: Expected constant expression in case statement
+drivers/gpu/drm/i915/gem/i915_gem_context.c:2269:17: error: bad integer constant expression
+drivers/gpu/drm/i915/gem/i915_gem_context.c:2270:17: error: bad integer constant expression
+drivers/gpu/drm/i915/gem/i915_gem_context.c:2271:17: error: bad integer constant expression
+drivers/gpu/drm/i915/gem/i915_gem_context.c:2272:17: error: bad integer constant expression
+drivers/gpu/drm/i915/gem/i915_gem_context.c:2273:17: error: bad integer constant expression
+drivers/gpu/drm/i915/gem/i915_gem_context.c:2274:17: error: bad integer constant expression
+drivers/gpu/drm/i915/gt/intel_reset.c:1310:5: warning: context imbalance in 'intel_gt_reset_trylock' - different lock contexts for basic block
+drivers/gpu/drm/i915/gt/sysfs_engines.c:61:10: error: bad integer constant expression
+drivers/gpu/drm/i915/gt/sysfs_engines.c:62:10: error: bad integer constant expression
+drivers/gpu/drm/i915/gt/sysfs_engines.c:66:10: error: bad integer constant expression
+drivers/gpu/drm/i915/gvt/mmio.c:287:23: warning: memcpy with byte count of 279040
+drivers/gpu/drm/i915/i915_perf.c:1425:15: warning: memset with byte count of 16777216
+drivers/gpu/drm/i915/i915_perf.c:1479:15: warning: memset with byte count of 16777216
+./include/linux/spinlock.h:408:9: warning: context imbalance in 'fwtable_read16' - different lock contexts for basic block
+./include/linux/spinlock.h:408:9: warning: context imbalance in 'fwtable_read32' - different lock contexts for basic block
+./include/linux/spinlock.h:408:9: warning: context imbalance in 'fwtable_read64' - different lock contexts for basic block
+./include/linux/spinlock.h:408:9: warning: context imbalance in 'fwtable_read8' - different lock contexts for basic block
+./include/linux/spinlock.h:408:9: warning: context imbalance in 'fwtable_write16' - different lock contexts for basic block
+./include/linux/spinlock.h:408:9: warning: context imbalance in 'fwtable_write32' - different lock contexts for basic block
+./include/linux/spinlock.h:408:9: warning: context imbalance in 'fwtable_write8' - different lock contexts for basic block
+./include/linux/spinlock.h:408:9: warning: context imbalance in 'gen11_fwtable_read16' - different lock contexts for basic block
+./include/linux/spinlock.h:408:9: warning: context imbalance in 'gen11_fwtable_read32' - different lock contexts for basic block
+./include/linux/spinlock.h:408:9: warning: context imbalance in 'gen11_fwtable_read64' - different lock contexts for basic block
+./include/linux/spinlock.h:408:9: warning: context imbalance in 'gen11_fwtable_read8' - different lock contexts for basic block
+./include/linux/spinlock.h:408:9: warning: context imbalance in 'gen11_fwtable_write16' - different lock contexts for basic block
+./include/linux/spinlock.h:408:9: warning: context imbalance in 'gen11_fwtable_write32' - different lock contexts for basic block
+./include/linux/spinlock.h:408:9: warning: context imbalance in 'gen11_fwtable_write8' - different lock contexts for basic block
+./include/linux/spinlock.h:408:9: warning: context imbalance in 'gen12_fwtable_read16' - different lock contexts for basic block
+./include/linux/spinlock.h:408:9: warning: context imbalance in 'gen12_fwtable_read32' - different lock contexts for basic block
+./include/linux/spinlock.h:408:9: warning: context imbalance in 'gen12_fwtable_read64' - different lock contexts for basic block
+./include/linux/spinlock.h:408:9: warning: context imbalance in 'gen12_fwtable_read8' - different lock contexts for basic block
+./include/linux/spinlock.h:408:9: warning: context imbalance in 'gen12_fwtable_write16' - different lock contexts for basic block
+./include/linux/spinlock.h:408:9: warning: context imbalance in 'gen12_fwtable_write32' - different lock contexts for basic block
+./include/linux/spinlock.h:408:9: warning: context imbalance in 'gen12_fwtable_write8' - different lock contexts for basic block
+./include/linux/spinlock.h:408:9: warning: context imbalance in 'gen6_read16' - different lock contexts for basic block
+./include/linux/spinlock.h:408:9: warning: context imbalance in 'gen6_read32' - different lock contexts for basic block
+./include/linux/spinlock.h:408:9: warning: context imbalance in 'gen6_read64' - different lock contexts for basic block
+./include/linux/spinlock.h:408:9: warning: context imbalance in 'gen6_read8' - different lock contexts for basic block
+./include/linux/spinlock.h:408:9: warning: context imbalance in 'gen6_write16' - different lock contexts for basic block
+./include/linux/spinlock.h:408:9: warning: context imbalance in 'gen6_write32' - different lock contexts for basic block
+./include/linux/spinlock.h:408:9: warning: context imbalance in 'gen6_write8' - different lock contexts for basic block
+./include/linux/spinlock.h:408:9: warning: context imbalance in 'gen8_write16' - different lock contexts for basic block
+./include/linux/spinlock.h:408:9: warning: context imbalance in 'gen8_write32' - different lock contexts for basic block
+./include/linux/spinlock.h:408:9: warning: context imbalance in 'gen8_write8' - different lock contexts for basic block

* [Intel-gfx] ✗ Fi.CI.BAT: failure for series starting with [01/10] drm/i915/selftests: Disable preemptive heartbeats over preemption tests (rev3)
  2020-06-15 12:39 [Intel-gfx] [PATCH 01/10] drm/i915/selftests: Disable preemptive heartbeats over preemption tests Chris Wilson
                   ` (13 preceding siblings ...)
  2020-06-15 14:56 ` [Intel-gfx] ✗ Fi.CI.SPARSE: " Patchwork
@ 2020-06-15 15:20 ` Patchwork
  2020-06-15 15:29 ` [Intel-gfx] [PATCH 01/10] drm/i915/selftests: Disable preemptive heartbeats over preemption tests Mika Kuoppala
  15 siblings, 0 replies; 21+ messages in thread
From: Patchwork @ 2020-06-15 15:20 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

== Series Details ==

Series: series starting with [01/10] drm/i915/selftests: Disable preemptive heartbeats over preemption tests (rev3)
URL   : https://patchwork.freedesktop.org/series/78371/
State : failure

== Summary ==

CI Bug Log - changes from CI_DRM_8627 -> Patchwork_17950
====================================================

Summary
-------

  **FAILURE**

  Serious unknown changes coming with Patchwork_17950 absolutely need to be
  verified manually.
  
  If you think the reported changes have nothing to do with the changes
  introduced in Patchwork_17950, please notify your bug team to allow them
  to document this new failure mode, which will reduce false positives in CI.

  External URL: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17950/index.html

Possible new issues
-------------------

  Here are the unknown changes that may have been introduced in Patchwork_17950:

### IGT changes ###

#### Possible regressions ####

  * igt@i915_selftest@live@hangcheck:
    - fi-tgl-u2:          [PASS][1] -> [INCOMPLETE][2]
   [1]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8627/fi-tgl-u2/igt@i915_selftest@live@hangcheck.html
   [2]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17950/fi-tgl-u2/igt@i915_selftest@live@hangcheck.html

  
Known issues
------------

  Here are the changes found in Patchwork_17950 that come from known issues:

### IGT changes ###

#### Issues hit ####

  * igt@i915_getparams_basic@basic-subslice-total:
    - fi-tgl-y:           [PASS][3] -> [DMESG-WARN][4] ([i915#402])
   [3]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8627/fi-tgl-y/igt@i915_getparams_basic@basic-subslice-total.html
   [4]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17950/fi-tgl-y/igt@i915_getparams_basic@basic-subslice-total.html

  * igt@kms_cursor_legacy@basic-busy-flip-before-cursor-atomic:
    - fi-bsw-kefka:       [PASS][5] -> [DMESG-WARN][6] ([i915#1982])
   [5]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8627/fi-bsw-kefka/igt@kms_cursor_legacy@basic-busy-flip-before-cursor-atomic.html
   [6]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17950/fi-bsw-kefka/igt@kms_cursor_legacy@basic-busy-flip-before-cursor-atomic.html

  
#### Possible fixes ####

  * igt@gem_exec_suspend@basic-s3:
    - fi-tgl-u2:          [FAIL][7] ([i915#1888]) -> [PASS][8]
   [7]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8627/fi-tgl-u2/igt@gem_exec_suspend@basic-s3.html
   [8]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17950/fi-tgl-u2/igt@gem_exec_suspend@basic-s3.html

  * igt@gem_flink_basic@flink-lifetime:
    - fi-tgl-y:           [DMESG-WARN][9] ([i915#402]) -> [PASS][10] +1 similar issue
   [9]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8627/fi-tgl-y/igt@gem_flink_basic@flink-lifetime.html
   [10]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17950/fi-tgl-y/igt@gem_flink_basic@flink-lifetime.html

  * igt@i915_pm_rpm@basic-pci-d3-state:
    - fi-bsw-kefka:       [DMESG-WARN][11] ([i915#1982]) -> [PASS][12]
   [11]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8627/fi-bsw-kefka/igt@i915_pm_rpm@basic-pci-d3-state.html
   [12]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17950/fi-bsw-kefka/igt@i915_pm_rpm@basic-pci-d3-state.html

  * igt@i915_pm_rpm@module-reload:
    - fi-glk-dsi:         [DMESG-WARN][13] ([i915#1982]) -> [PASS][14]
   [13]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8627/fi-glk-dsi/igt@i915_pm_rpm@module-reload.html
   [14]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17950/fi-glk-dsi/igt@i915_pm_rpm@module-reload.html

  * igt@kms_busy@basic@modeset:
    - {fi-tgl-dsi}:       [DMESG-WARN][15] ([i915#1982]) -> [PASS][16] +1 similar issue
   [15]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8627/fi-tgl-dsi/igt@kms_busy@basic@modeset.html
   [16]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17950/fi-tgl-dsi/igt@kms_busy@basic@modeset.html

  * igt@kms_flip@basic-flip-vs-wf_vblank@c-edp1:
    - fi-icl-u2:          [DMESG-WARN][17] ([i915#1982]) -> [PASS][18] +3 similar issues
   [17]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8627/fi-icl-u2/igt@kms_flip@basic-flip-vs-wf_vblank@c-edp1.html
   [18]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17950/fi-icl-u2/igt@kms_flip@basic-flip-vs-wf_vblank@c-edp1.html

  * igt@kms_pipe_crc_basic@read-crc-pipe-a-frame-sequence:
    - fi-tgl-u2:          [DMESG-WARN][19] ([i915#402]) -> [PASS][20]
   [19]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8627/fi-tgl-u2/igt@kms_pipe_crc_basic@read-crc-pipe-a-frame-sequence.html
   [20]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17950/fi-tgl-u2/igt@kms_pipe_crc_basic@read-crc-pipe-a-frame-sequence.html

  
#### Warnings ####

  * igt@i915_pm_rpm@module-reload:
    - fi-kbl-x1275:       [DMESG-FAIL][21] ([i915#62]) -> [SKIP][22] ([fdo#109271])
   [21]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8627/fi-kbl-x1275/igt@i915_pm_rpm@module-reload.html
   [22]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17950/fi-kbl-x1275/igt@i915_pm_rpm@module-reload.html

  * igt@kms_flip@basic-flip-vs-modeset@a-dp1:
    - fi-kbl-x1275:       [DMESG-WARN][23] ([i915#62] / [i915#92]) -> [DMESG-WARN][24] ([i915#62] / [i915#92] / [i915#95]) +2 similar issues
   [23]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8627/fi-kbl-x1275/igt@kms_flip@basic-flip-vs-modeset@a-dp1.html
   [24]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17950/fi-kbl-x1275/igt@kms_flip@basic-flip-vs-modeset@a-dp1.html

  * igt@kms_force_connector_basic@force-edid:
    - fi-kbl-x1275:       [DMESG-WARN][25] ([i915#62] / [i915#92] / [i915#95]) -> [DMESG-WARN][26] ([i915#62] / [i915#92]) +4 similar issues
   [25]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8627/fi-kbl-x1275/igt@kms_force_connector_basic@force-edid.html
   [26]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17950/fi-kbl-x1275/igt@kms_force_connector_basic@force-edid.html

  
  {name}: This element is suppressed. This means it is ignored when computing
          the status of the difference (SUCCESS, WARNING, or FAILURE).

  [fdo#109271]: https://bugs.freedesktop.org/show_bug.cgi?id=109271
  [i915#1569]: https://gitlab.freedesktop.org/drm/intel/issues/1569
  [i915#1888]: https://gitlab.freedesktop.org/drm/intel/issues/1888
  [i915#192]: https://gitlab.freedesktop.org/drm/intel/issues/192
  [i915#193]: https://gitlab.freedesktop.org/drm/intel/issues/193
  [i915#194]: https://gitlab.freedesktop.org/drm/intel/issues/194
  [i915#1982]: https://gitlab.freedesktop.org/drm/intel/issues/1982
  [i915#2029]: https://gitlab.freedesktop.org/drm/intel/issues/2029
  [i915#402]: https://gitlab.freedesktop.org/drm/intel/issues/402
  [i915#62]: https://gitlab.freedesktop.org/drm/intel/issues/62
  [i915#92]: https://gitlab.freedesktop.org/drm/intel/issues/92
  [i915#95]: https://gitlab.freedesktop.org/drm/intel/issues/95


Participating hosts (50 -> 32)
------------------------------

  Missing    (18): fi-ilk-m540 fi-hsw-4200u fi-byt-j1900 fi-byt-squawks fi-bsw-cyan fi-bwr-2160 fi-snb-2520m fi-ilk-650 fi-ctg-p8600 fi-hsw-4770 fi-ivb-3770 fi-elk-e7500 fi-pnv-d510 fi-blb-e6850 fi-byt-n2820 fi-byt-clapper fi-bdw-samus fi-snb-2600 


Build changes
-------------

  * Linux: CI_DRM_8627 -> Patchwork_17950

  CI-20190529: 20190529
  CI_DRM_8627: 593c112156feb0f6159814f2276a32c90f243823 @ git://anongit.freedesktop.org/gfx-ci/linux
  IGT_5710: f524eee47930601ad7b4cba9d40c26d68dc7d250 @ git://anongit.freedesktop.org/xorg/app/intel-gpu-tools
  Patchwork_17950: 11a16dbbd0c41889dccf4207b4bac2a9f08d3f7c @ git://anongit.freedesktop.org/gfx-ci/linux


== Linux commits ==

11a16dbbd0c4 drm/i915/gt: Resubmit the virtual engine on schedule-out
0d9311cd5488 drm/i915/gt: Decouple inflight virtual engines
fd7fb36f7457 drm/i915/gt: Use virtual_engine during execlists_dequeue
db8e1905bf15 drm/i915/gt: Drop atomic for engine->fw_active tracking
cd5b50c90e8b drm/i915/gt: ce->inflight updates are now serialised
a2f814c038a3 drm/i915/execlists: Defer schedule_out until after the next dequeue
6f4c470b5862 drm/i915/execlists: Replace direct submit with direct call to tasklet
4dc0044cb492 drm/i915/gt: Add a safety submission flush in the heartbeat
00b8905ebbd9 drm/i915/selftests: Dump engine state and trace upon hanging after reset
5fa39ee36f3b drm/i915/selftests: Disable preemptive heartbeats over preemption tests

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17950/index.html

* Re: [Intel-gfx] [PATCH 01/10] drm/i915/selftests: Disable preemptive heartbeats over preemption tests
  2020-06-15 12:39 [Intel-gfx] [PATCH 01/10] drm/i915/selftests: Disable preemptive heartbeats over preemption tests Chris Wilson
                   ` (14 preceding siblings ...)
  2020-06-15 15:20 ` [Intel-gfx] ✗ Fi.CI.BAT: failure " Patchwork
@ 2020-06-15 15:29 ` Mika Kuoppala
  15 siblings, 0 replies; 21+ messages in thread
From: Mika Kuoppala @ 2020-06-15 15:29 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx; +Cc: Chris Wilson

Chris Wilson <chris@chris-wilson.co.uk> writes:

> Since the heartbeat may cause a preemption event, disable it over the
> preemption suppression tests.
>
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>

Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>

> ---
>  drivers/gpu/drm/i915/gt/selftest_lrc.c | 14 +++++++-------
>  1 file changed, 7 insertions(+), 7 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/gt/selftest_lrc.c b/drivers/gpu/drm/i915/gt/selftest_lrc.c
> index f651bdf7f191..91543494f595 100644
> --- a/drivers/gpu/drm/i915/gt/selftest_lrc.c
> +++ b/drivers/gpu/drm/i915/gt/selftest_lrc.c
> @@ -2282,7 +2282,7 @@ static int live_suppress_self_preempt(void *arg)
>  		if (igt_flush_test(gt->i915))
>  			goto err_wedged;
>  
> -		intel_engine_pm_get(engine);
> +		engine_heartbeat_disable(engine);
>  		engine->execlists.preempt_hang.count = 0;
>  
>  		rq_a = spinner_create_request(&a.spin,
> @@ -2290,14 +2290,14 @@ static int live_suppress_self_preempt(void *arg)
>  					      MI_NOOP);
>  		if (IS_ERR(rq_a)) {
>  			err = PTR_ERR(rq_a);
> -			intel_engine_pm_put(engine);
> +			engine_heartbeat_enable(engine);
>  			goto err_client_b;
>  		}
>  
>  		i915_request_add(rq_a);
>  		if (!igt_wait_for_spinner(&a.spin, rq_a)) {
>  			pr_err("First client failed to start\n");
> -			intel_engine_pm_put(engine);
> +			engine_heartbeat_enable(engine);
>  			goto err_wedged;
>  		}
>  
> @@ -2309,7 +2309,7 @@ static int live_suppress_self_preempt(void *arg)
>  						      MI_NOOP);
>  			if (IS_ERR(rq_b)) {
>  				err = PTR_ERR(rq_b);
> -				intel_engine_pm_put(engine);
> +				engine_heartbeat_enable(engine);
>  				goto err_client_b;
>  			}
>  			i915_request_add(rq_b);
> @@ -2320,7 +2320,7 @@ static int live_suppress_self_preempt(void *arg)
>  
>  			if (!igt_wait_for_spinner(&b.spin, rq_b)) {
>  				pr_err("Second client failed to start\n");
> -				intel_engine_pm_put(engine);
> +				engine_heartbeat_enable(engine);
>  				goto err_wedged;
>  			}
>  
> @@ -2334,12 +2334,12 @@ static int live_suppress_self_preempt(void *arg)
>  			       engine->name,
>  			       engine->execlists.preempt_hang.count,
>  			       depth);
> -			intel_engine_pm_put(engine);
> +			engine_heartbeat_enable(engine);
>  			err = -EINVAL;
>  			goto err_client_b;
>  		}
>  
> -		intel_engine_pm_put(engine);
> +		engine_heartbeat_enable(engine);
>  		if (igt_flush_test(gt->i915))
>  			goto err_wedged;
>  	}
> -- 
> 2.20.1
>

* Re: [Intel-gfx] [PATCH] drm/i915/selftests: Dump engine state and trace upon hanging after reset
  2020-06-15 14:33   ` [Intel-gfx] [PATCH] " Chris Wilson
@ 2020-06-15 15:56     ` Mika Kuoppala
  0 siblings, 0 replies; 21+ messages in thread
From: Mika Kuoppala @ 2020-06-15 15:56 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx; +Cc: Chris Wilson

Chris Wilson <chris@chris-wilson.co.uk> writes:

> If the engine dies after a reset, so that we fail to submit a request
> and instead have to be interrupted by the CI runner, dump the engine
> state.
>
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>

Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>

> ---
>  drivers/gpu/drm/i915/gt/selftest_hangcheck.c | 15 +++++++++++++++
>  1 file changed, 15 insertions(+)
>
> diff --git a/drivers/gpu/drm/i915/gt/selftest_hangcheck.c b/drivers/gpu/drm/i915/gt/selftest_hangcheck.c
> index 2af66f8ffbd2..1ea4935e0eeb 100644
> --- a/drivers/gpu/drm/i915/gt/selftest_hangcheck.c
> +++ b/drivers/gpu/drm/i915/gt/selftest_hangcheck.c
> @@ -499,6 +499,21 @@ static int igt_reset_nop_engine(void *arg)
>  
>  				rq = intel_context_create_request(ce);
>  				if (IS_ERR(rq)) {
> +					struct drm_printer p =
> +						drm_info_printer(gt->i915->drm.dev);
> +					intel_engine_dump(engine, &p,
> +							  "%s(%s): failed to submit request\n",
> +							  __func__,
> +							  engine->name);
> +
> +
> +					GEM_TRACE("%s(%s): failed to submit request\n",
> +						  __func__,
> +						  engine->name);
> +					GEM_TRACE_DUMP();
> +
> +					intel_gt_set_wedged(gt);
> +
>  					err = PTR_ERR(rq);
>  					break;
>  				}
> -- 
> 2.20.1
>

* Re: [Intel-gfx] [PATCH] drm/i915/gt: Add a safety submission flush in the heartbeat
  2020-06-15 13:11   ` [Intel-gfx] [PATCH] " Chris Wilson
@ 2020-06-15 16:16     ` Mika Kuoppala
  0 siblings, 0 replies; 21+ messages in thread
From: Mika Kuoppala @ 2020-06-15 16:16 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx; +Cc: Chris Wilson

Chris Wilson <chris@chris-wilson.co.uk> writes:

> Just in case everything fails (for example, "missed interrupt
> syndrome" on Sandybridge), always flush the submission tasklet from the
> heartbeat. This papers over such issues, so the failure appears only as
> a second-long glitch and goes undetected unless we happen to be
> performing a timed test.
>
> v2: We rely on flush_submission() synchronizing with the tasklet on
> another CPU.
>
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> ---
>  drivers/gpu/drm/i915/gt/intel_engine_cs.c     | 23 +++++++++----------
>  .../gpu/drm/i915/gt/intel_engine_heartbeat.c  |  3 +++
>  2 files changed, 14 insertions(+), 12 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> index d613cf31970c..31049e0bdb57 100644
> --- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> +++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> @@ -1094,19 +1094,18 @@ void intel_engine_flush_submission(struct intel_engine_cs *engine)
>  {
>  	struct tasklet_struct *t = &engine->execlists.tasklet;
>  
> -	if (__tasklet_is_scheduled(t)) {
> -		local_bh_disable();
> -		if (tasklet_trylock(t)) {
> -			/* Must wait for any GPU reset in progress. */
> -			if (__tasklet_is_enabled(t))
> -				t->func(t->data);
> -			tasklet_unlock(t);
> -		}
> -		local_bh_enable();
> +	/* Synchronise and wait for the tasklet on another CPU */
> +	tasklet_kill(t);
> +
> +	/* Having cancelled the tasklet, ensure that it is run */
> +	local_bh_disable();
> +	if (tasklet_trylock(t)) {
> +		/* Must wait for any GPU reset in progress. */
> +		if (__tasklet_is_enabled(t))

In the heartbeat context this could be an assertion, I think.

But it is difficult to enforce hw sanity and still please the user.

They will curse at the glitch, even though they are saved!
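
A sketch of what that assertion might look like (hypothetical, not part
of the patch): in the heartbeat path the tasklet is expected to be
enabled, so the guard could harden into

	if (tasklet_trylock(t)) {
		GEM_BUG_ON(!__tasklet_is_enabled(t));
		t->func(t->data);
		tasklet_unlock(t);
	}

at the cost of no longer tolerating a concurrent GPU reset.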

Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>

-Mika

> +			t->func(t->data);
> +		tasklet_unlock(t);
>  	}
> -
> -	/* Otherwise flush the tasklet if it was running on another cpu */
> -	tasklet_unlock_wait(t);
> +	local_bh_enable();
>  }
>  
>  /**
> diff --git a/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c b/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c
> index f67ad937eefb..cd20fb549b38 100644
> --- a/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c
> +++ b/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c
> @@ -65,6 +65,9 @@ static void heartbeat(struct work_struct *wrk)
>  	struct intel_context *ce = engine->kernel_context;
>  	struct i915_request *rq;
>  
> +	/* Just in case everything has gone horribly wrong, give it a kick */
> +	intel_engine_flush_submission(engine);
> +
>  	rq = engine->heartbeat.systole;
>  	if (rq && i915_request_completed(rq)) {
>  		i915_request_put(rq);
> -- 
> 2.20.1
>

end of thread

Thread overview: 21+ messages
2020-06-15 12:39 [Intel-gfx] [PATCH 01/10] drm/i915/selftests: Disable preemptive heartbeats over preemption tests Chris Wilson
2020-06-15 12:39 ` [Intel-gfx] [PATCH 02/10] drm/i915/selftests: Dump engine state and trace upon hanging after reset Chris Wilson
2020-06-15 14:33   ` [Intel-gfx] [PATCH] " Chris Wilson
2020-06-15 15:56     ` Mika Kuoppala
2020-06-15 12:39 ` [Intel-gfx] [PATCH 03/10] drm/i915/gt: Add a safety submission flush in the heartbeat Chris Wilson
2020-06-15 13:11   ` [Intel-gfx] [PATCH] " Chris Wilson
2020-06-15 16:16     ` Mika Kuoppala
2020-06-15 12:39 ` [Intel-gfx] [PATCH 04/10] drm/i915/execlists: Replace direct submit with direct call to tasklet Chris Wilson
2020-06-15 12:39 ` [Intel-gfx] [PATCH 05/10] drm/i915/execlists: Defer schedule_out until after the next dequeue Chris Wilson
2020-06-15 12:39 ` [Intel-gfx] [PATCH 06/10] drm/i915/gt: ce->inflight updates are now serialised Chris Wilson
2020-06-15 12:39 ` [Intel-gfx] [PATCH 07/10] drm/i915/gt: Drop atomic for engine->fw_active tracking Chris Wilson
2020-06-15 12:39 ` [Intel-gfx] [PATCH 08/10] drm/i915/gt: Use virtual_engine during execlists_dequeue Chris Wilson
2020-06-15 12:39 ` [Intel-gfx] [PATCH 09/10] drm/i915/gt: Decouple inflight virtual engines Chris Wilson
2020-06-15 12:39 ` [Intel-gfx] [PATCH 10/10] drm/i915/gt: Resubmit the virtual engine on schedule-out Chris Wilson
2020-06-15 14:03 ` [Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for series starting with [01/10] drm/i915/selftests: Disable preemptive heartbeats over preemption tests (rev2) Patchwork
2020-06-15 14:05 ` [Intel-gfx] ✗ Fi.CI.SPARSE: " Patchwork
2020-06-15 14:26 ` [Intel-gfx] ✗ Fi.CI.BAT: failure " Patchwork
2020-06-15 14:55 ` [Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for series starting with [01/10] drm/i915/selftests: Disable preemptive heartbeats over preemption tests (rev3) Patchwork
2020-06-15 14:56 ` [Intel-gfx] ✗ Fi.CI.SPARSE: " Patchwork
2020-06-15 15:20 ` [Intel-gfx] ✗ Fi.CI.BAT: failure " Patchwork
2020-06-15 15:29 ` [Intel-gfx] [PATCH 01/10] drm/i915/selftests: Disable preemptive heartbeats over preemption tests Mika Kuoppala
