* Forced preemption gedankenexperiment
@ 2018-03-26 11:50 Chris Wilson
  2018-03-26 11:50 ` [PATCH 01/11] drm/i915/execlists: Avoid kicking the submission too early for rescheduling Chris Wilson
                   ` (16 more replies)
  0 siblings, 17 replies; 40+ messages in thread
From: Chris Wilson @ 2018-03-26 11:50 UTC (permalink / raw)
  To: intel-gfx

Following the discussion of how the timer should be controlled, I wired
up one option, which ended up passing control to the caller. That lets
us use forced preemption for interactive uses (delay by more than a
frame and you get a fast reset!) as well as for whatever userspace
desires.
-Chris

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

* [PATCH 01/11] drm/i915/execlists: Avoid kicking the submission too early for rescheduling
  2018-03-26 11:50 Forced preemption gedankenexperiment Chris Wilson
@ 2018-03-26 11:50 ` Chris Wilson
  2018-03-27 11:34   ` Mika Kuoppala
  2018-03-27 11:42   ` Mika Kuoppala
  2018-03-26 11:50 ` [PATCH 02/11] drm/i915/execlists: Clear user-active flag on preemption completion Chris Wilson
                   ` (15 subsequent siblings)
  16 siblings, 2 replies; 40+ messages in thread
From: Chris Wilson @ 2018-03-26 11:50 UTC (permalink / raw)
  To: intel-gfx

If the request is still waiting on external fences, it has not yet been
submitted to the HW queue and so we can forgo kicking the submission
tasklet when re-evaluating its priority.

This should have no impact other than reducing the number of tasklet
wakeups under signal heavy workloads (e.g. switching between engines).

v2: Use prebaked container_of()

References: f6322eddaff7 ("drm/i915/preemption: Allow preemption between submission ports")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Michał Winiarski <michal.winiarski@intel.com>
Cc: Michel Thierry <michel.thierry@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/intel_lrc.c | 15 +++++++++++----
 1 file changed, 11 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index b4ab06b05e58..104b69e0494f 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -1051,12 +1051,16 @@ static void queue_request(struct intel_engine_cs *engine,
 	list_add_tail(&pt->link, &lookup_priolist(engine, pt, prio)->requests);
 }
 
-static void submit_queue(struct intel_engine_cs *engine, int prio)
+static void __submit_queue(struct intel_engine_cs *engine, int prio)
 {
-	if (prio > engine->execlists.queue_priority) {
 		engine->execlists.queue_priority = prio;
 		tasklet_hi_schedule(&engine->execlists.tasklet);
-	}
+}
+
+static void submit_queue(struct intel_engine_cs *engine, int prio)
+{
+	if (prio > engine->execlists.queue_priority)
+		__submit_queue(engine, prio);
 }
 
 static void execlists_submit_request(struct i915_request *request)
@@ -1189,7 +1193,10 @@ static void execlists_schedule(struct i915_request *request, int prio)
 			__list_del_entry(&pt->link);
 			queue_request(engine, pt, prio);
 		}
-		submit_queue(engine, prio);
+
+		if (prio > engine->execlists.queue_priority &&
+		    i915_sw_fence_done(&pt_to_request(pt)->submit))
+			__submit_queue(engine, prio);
 	}
 
 	spin_unlock_irq(&engine->timeline->lock);
-- 
2.16.3

* [PATCH 02/11] drm/i915/execlists: Clear user-active flag on preemption completion
  2018-03-26 11:50 Forced preemption gedankenexperiment Chris Wilson
  2018-03-26 11:50 ` [PATCH 01/11] drm/i915/execlists: Avoid kicking the submission too early for rescheduling Chris Wilson
@ 2018-03-26 11:50 ` Chris Wilson
  2018-03-27 10:00   ` Chris Wilson
  2018-03-26 11:50 ` [PATCH 03/11] drm/i915: Include submission tasklet state in engine dump Chris Wilson
                   ` (14 subsequent siblings)
  16 siblings, 1 reply; 40+ messages in thread
From: Chris Wilson @ 2018-03-26 11:50 UTC (permalink / raw)
  To: intel-gfx

When cancelling the requests and clearing out the ports following a
successful preemption completion, also clear the active flag. I had
assumed that all preemptions would be followed by an immediate dequeue
(preserving the active user flag), but under rare circumstances we may
trigger a preemption for the second port only for it to have completed
before the preemption kicks in, leaving execlists->active set even
though the system is now idle.

We can clear the flag inside the common execlists_cancel_port_requests()
as the other users also expect the semantics of active being cleared.

Fixes: f6322eddaff7 ("drm/i915/preemption: Allow preemption between submission ports")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Michał Winiarski <michal.winiarski@intel.com>
Cc: Michel Thierry <michel.thierry@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/intel_lrc.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 104b69e0494f..c302952ab476 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -577,6 +577,8 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
 		 * know the next preemption status we see corresponds
 		 * to this ELSP update.
 		 */
+		GEM_BUG_ON(!execlists_is_active(execlists,
+						EXECLISTS_ACTIVE_USER));
 		GEM_BUG_ON(!port_count(&port[0]));
 		if (port_count(&port[0]) > 1)
 			goto unlock;
@@ -738,6 +740,8 @@ execlists_cancel_port_requests(struct intel_engine_execlists * const execlists)
 		memset(port, 0, sizeof(*port));
 		port++;
 	}
+
+	execlists_clear_active(execlists, EXECLISTS_ACTIVE_USER);
 }
 
 static void clear_gtiir(struct intel_engine_cs *engine)
@@ -1042,6 +1046,11 @@ static void execlists_submission_tasklet(unsigned long data)
 
 	if (fw)
 		intel_uncore_forcewake_put(dev_priv, execlists->fw_domains);
+
+	/* If the engine is now idle, so should be the flag; and vice versa. */
+	GEM_BUG_ON(execlists_is_active(&engine->execlists,
+				       EXECLISTS_ACTIVE_USER) ==
+		   !port_isset(engine->execlists.port));
 }
 
 static void queue_request(struct intel_engine_cs *engine,
-- 
2.16.3

* [PATCH 03/11] drm/i915: Include submission tasklet state in engine dump
  2018-03-26 11:50 Forced preemption gedankenexperiment Chris Wilson
  2018-03-26 11:50 ` [PATCH 01/11] drm/i915/execlists: Avoid kicking the submission too early for rescheduling Chris Wilson
  2018-03-26 11:50 ` [PATCH 02/11] drm/i915/execlists: Clear user-active flag on preemption completion Chris Wilson
@ 2018-03-26 11:50 ` Chris Wilson
  2018-03-27  8:37   ` Mika Kuoppala
  2018-03-26 11:50 ` [PATCH 04/11] drm/i915/execlists: Refactor out complete_preempt_context() Chris Wilson
                   ` (13 subsequent siblings)
  16 siblings, 1 reply; 40+ messages in thread
From: Chris Wilson @ 2018-03-26 11:50 UTC (permalink / raw)
  To: intel-gfx

For the off-chance that we have an interrupt posted and haven't yet
processed the CSB.

v2: Include tasklet enable/disable state for good measure.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/intel_engine_cs.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_engine_cs.c b/drivers/gpu/drm/i915/intel_engine_cs.c
index de09fa42a509..12486d8f534b 100644
--- a/drivers/gpu/drm/i915/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/intel_engine_cs.c
@@ -1859,12 +1859,15 @@ static void intel_engine_print_registers(const struct intel_engine_cs *engine,
 		ptr = I915_READ(RING_CONTEXT_STATUS_PTR(engine));
 		read = GEN8_CSB_READ_PTR(ptr);
 		write = GEN8_CSB_WRITE_PTR(ptr);
-		drm_printf(m, "\tExeclist CSB read %d [%d cached], write %d [%d from hws], interrupt posted? %s\n",
+		drm_printf(m, "\tExeclist CSB read %d [%d cached], write %d [%d from hws], interrupt posted? %s, tasklet queued? %s (%s)\n",
 			   read, execlists->csb_head,
 			   write,
 			   intel_read_status_page(engine, intel_hws_csb_write_index(engine->i915)),
 			   yesno(test_bit(ENGINE_IRQ_EXECLIST,
-					  &engine->irq_posted)));
+					  &engine->irq_posted)),
+			   yesno(test_bit(TASKLET_STATE_SCHED,
+					  &engine->execlists.tasklet.state)),
+			   enableddisabled(!atomic_read(&engine->execlists.tasklet.count)));
 		if (read >= GEN8_CSB_ENTRIES)
 			read = 0;
 		if (write >= GEN8_CSB_ENTRIES)
-- 
2.16.3

* [PATCH 04/11] drm/i915/execlists: Refactor out complete_preempt_context()
  2018-03-26 11:50 Forced preemption gedankenexperiment Chris Wilson
                   ` (2 preceding siblings ...)
  2018-03-26 11:50 ` [PATCH 03/11] drm/i915: Include submission tasklet state in engine dump Chris Wilson
@ 2018-03-26 11:50 ` Chris Wilson
  2018-03-26 11:50 ` [PATCH 05/11] drm/i915: Move engine reset prepare/finish to backends Chris Wilson
                   ` (12 subsequent siblings)
  16 siblings, 0 replies; 40+ messages in thread
From: Chris Wilson @ 2018-03-26 11:50 UTC (permalink / raw)
  To: intel-gfx

As a complement to inject_preempt_context(), follow up with the function
to handle its completion. This will be useful should we wish to extend
the duties of the preempt-context for execlists.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Jeff McGee <jeff.mcgee@intel.com>
Cc: Michał Winiarski <michal.winiarski@intel.com>
Reviewed-by: Jeff McGee <jeff.mcgee@intel.com>
---
 drivers/gpu/drm/i915/intel_lrc.c | 23 +++++++++++++----------
 1 file changed, 13 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index c302952ab476..fb120da2c54f 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -531,8 +531,18 @@ static void inject_preempt_context(struct intel_engine_cs *engine)
 	if (execlists->ctrl_reg)
 		writel(EL_CTRL_LOAD, execlists->ctrl_reg);
 
-	execlists_clear_active(&engine->execlists, EXECLISTS_ACTIVE_HWACK);
-	execlists_set_active(&engine->execlists, EXECLISTS_ACTIVE_PREEMPT);
+	execlists_clear_active(execlists, EXECLISTS_ACTIVE_HWACK);
+	execlists_set_active(execlists, EXECLISTS_ACTIVE_PREEMPT);
+}
+
+static void complete_preempt_context(struct intel_engine_execlists *execlists)
+{
+	GEM_BUG_ON(!execlists_is_active(execlists, EXECLISTS_ACTIVE_PREEMPT));
+
+	execlists_cancel_port_requests(execlists);
+	execlists_unwind_incomplete_requests(execlists);
+
+	execlists_clear_active(execlists, EXECLISTS_ACTIVE_PREEMPT);
 }
 
 static void execlists_dequeue(struct intel_engine_cs *engine)
@@ -979,14 +989,7 @@ static void execlists_submission_tasklet(unsigned long data)
 			if (status & GEN8_CTX_STATUS_COMPLETE &&
 			    buf[2*head + 1] == execlists->preempt_complete_status) {
 				GEM_TRACE("%s preempt-idle\n", engine->name);
-
-				execlists_cancel_port_requests(execlists);
-				execlists_unwind_incomplete_requests(execlists);
-
-				GEM_BUG_ON(!execlists_is_active(execlists,
-								EXECLISTS_ACTIVE_PREEMPT));
-				execlists_clear_active(execlists,
-						       EXECLISTS_ACTIVE_PREEMPT);
+				complete_preempt_context(execlists);
 				continue;
 			}
 
-- 
2.16.3

* [PATCH 05/11] drm/i915: Move engine reset prepare/finish to backends
  2018-03-26 11:50 Forced preemption gedankenexperiment Chris Wilson
                   ` (3 preceding siblings ...)
  2018-03-26 11:50 ` [PATCH 04/11] drm/i915/execlists: Refactor out complete_preempt_context() Chris Wilson
@ 2018-03-26 11:50 ` Chris Wilson
  2018-03-26 11:50 ` [PATCH 06/11] drm/i915: Split execlists/guc reset preparations Chris Wilson
                   ` (11 subsequent siblings)
  16 siblings, 0 replies; 40+ messages in thread
From: Chris Wilson @ 2018-03-26 11:50 UTC (permalink / raw)
  To: intel-gfx

In preparation for more careful handling of incomplete preemption
during reset by execlists, move the existing code wholesale to the
backends under a couple of new reset vfuncs.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Michał Winiarski <michal.winiarski@intel.com>
CC: Michel Thierry <michel.thierry@intel.com>
Cc: Jeff McGee <jeff.mcgee@intel.com>
Reviewed-by: Jeff McGee <jeff.mcgee@intel.com>
---
 drivers/gpu/drm/i915/i915_gem.c         | 47 +++-----------------------
 drivers/gpu/drm/i915/intel_lrc.c        | 59 +++++++++++++++++++++++++++++++--
 drivers/gpu/drm/i915/intel_ringbuffer.c | 23 +++++++++++--
 drivers/gpu/drm/i915/intel_ringbuffer.h |  9 +++--
 4 files changed, 88 insertions(+), 50 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 9650a7b10c5f..038867c96809 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2917,7 +2917,7 @@ static bool engine_stalled(struct intel_engine_cs *engine)
 struct i915_request *
 i915_gem_reset_prepare_engine(struct intel_engine_cs *engine)
 {
-	struct i915_request *request = NULL;
+	struct i915_request *request;
 
 	/*
 	 * During the reset sequence, we must prevent the engine from
@@ -2940,40 +2940,7 @@ i915_gem_reset_prepare_engine(struct intel_engine_cs *engine)
 	 */
 	kthread_park(engine->breadcrumbs.signaler);
 
-	/*
-	 * Prevent request submission to the hardware until we have
-	 * completed the reset in i915_gem_reset_finish(). If a request
-	 * is completed by one engine, it may then queue a request
-	 * to a second via its execlists->tasklet *just* as we are
-	 * calling engine->init_hw() and also writing the ELSP.
-	 * Turning off the execlists->tasklet until the reset is over
-	 * prevents the race.
-	 *
-	 * Note that this needs to be a single atomic operation on the
-	 * tasklet (flush existing tasks, prevent new tasks) to prevent
-	 * a race between reset and set-wedged. It is not, so we do the best
-	 * we can atm and make sure we don't lock the machine up in the more
-	 * common case of recursively being called from set-wedged from inside
-	 * i915_reset.
-	 */
-	if (!atomic_read(&engine->execlists.tasklet.count))
-		tasklet_kill(&engine->execlists.tasklet);
-	tasklet_disable(&engine->execlists.tasklet);
-
-	/*
-	 * We're using worker to queue preemption requests from the tasklet in
-	 * GuC submission mode.
-	 * Even though tasklet was disabled, we may still have a worker queued.
-	 * Let's make sure that all workers scheduled before disabling the
-	 * tasklet are completed before continuing with the reset.
-	 */
-	if (engine->i915->guc.preempt_wq)
-		flush_workqueue(engine->i915->guc.preempt_wq);
-
-	if (engine->irq_seqno_barrier)
-		engine->irq_seqno_barrier(engine);
-
-	request = i915_gem_find_active_request(engine);
+	request = engine->reset.prepare(engine);
 	if (request && request->fence.error == -EIO)
 		request = ERR_PTR(-EIO); /* Previous reset failed! */
 
@@ -3114,13 +3081,8 @@ void i915_gem_reset_engine(struct intel_engine_cs *engine,
 	if (request)
 		request = i915_gem_reset_request(engine, request);
 
-	if (request) {
-		DRM_DEBUG_DRIVER("resetting %s to restart from tail of request 0x%x\n",
-				 engine->name, request->global_seqno);
-	}
-
 	/* Setup the CS to resume from the breadcrumb of the hung request */
-	engine->reset_hw(engine, request);
+	engine->reset.reset(engine, request);
 }
 
 void i915_gem_reset(struct drm_i915_private *dev_priv)
@@ -3172,7 +3134,8 @@ void i915_gem_reset(struct drm_i915_private *dev_priv)
 
 void i915_gem_reset_finish_engine(struct intel_engine_cs *engine)
 {
-	tasklet_enable(&engine->execlists.tasklet);
+	engine->reset.finish(engine);
+
 	kthread_unpark(engine->breadcrumbs.signaler);
 
 	intel_uncore_forcewake_put(engine->i915, FORCEWAKE_ALL);
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index fb120da2c54f..f66d4c88b929 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -1708,8 +1708,48 @@ static int gen9_init_render_ring(struct intel_engine_cs *engine)
 	return init_workarounds_ring(engine);
 }
 
-static void reset_common_ring(struct intel_engine_cs *engine,
-			      struct i915_request *request)
+static struct i915_request *
+execlists_reset_prepare(struct intel_engine_cs *engine)
+{
+	struct intel_engine_execlists * const execlists = &engine->execlists;
+
+	GEM_TRACE("%s\n", engine->name);
+
+	/*
+	 * Prevent request submission to the hardware until we have
+	 * completed the reset in i915_gem_reset_finish(). If a request
+	 * is completed by one engine, it may then queue a request
+	 * to a second via its execlists->tasklet *just* as we are
+	 * calling engine->init_hw() and also writing the ELSP.
+	 * Turning off the execlists->tasklet until the reset is over
+	 * prevents the race.
+	 *
+	 * Note that this needs to be a single atomic operation on the
+	 * tasklet (flush existing tasks, prevent new tasks) to prevent
+	 * a race between reset and set-wedged. It is not, so we do the best
+	 * we can atm and make sure we don't lock the machine up in the more
+	 * common case of recursively being called from set-wedged from inside
+	 * i915_reset.
+	 */
+	if (!atomic_read(&execlists->tasklet.count))
+		tasklet_kill(&execlists->tasklet);
+	tasklet_disable(&execlists->tasklet);
+
+	/*
+	 * We're using worker to queue preemption requests from the tasklet in
+	 * GuC submission mode.
+	 * Even though tasklet was disabled, we may still have a worker queued.
+	 * Let's make sure that all workers scheduled before disabling the
+	 * tasklet are completed before continuing with the reset.
+	 */
+	if (engine->i915->guc.preempt_wq)
+		flush_workqueue(engine->i915->guc.preempt_wq);
+
+	return i915_gem_find_active_request(engine);
+}
+
+static void execlists_reset(struct intel_engine_cs *engine,
+			    struct i915_request *request)
 {
 	struct intel_engine_execlists * const execlists = &engine->execlists;
 	struct intel_context *ce;
@@ -1718,6 +1758,9 @@ static void reset_common_ring(struct intel_engine_cs *engine,
 	GEM_TRACE("%s seqno=%x\n",
 		  engine->name, request ? request->global_seqno : 0);
 
+	/* The submission tasklet must be disabled, engine->reset.prepare(). */
+	GEM_BUG_ON(!atomic_read(&execlists->tasklet.count));
+
 	/* See execlists_cancel_requests() for the irq/spinlock split. */
 	local_irq_save(flags);
 
@@ -1778,6 +1821,13 @@ static void reset_common_ring(struct intel_engine_cs *engine,
 	unwind_wa_tail(request);
 }
 
+static void execlists_reset_finish(struct intel_engine_cs *engine)
+{
+	tasklet_enable(&engine->execlists.tasklet);
+
+	GEM_TRACE("%s\n", engine->name);
+}
+
 static int intel_logical_ring_emit_pdps(struct i915_request *rq)
 {
 	struct i915_hw_ppgtt *ppgtt = rq->ctx->ppgtt;
@@ -2102,7 +2152,10 @@ logical_ring_default_vfuncs(struct intel_engine_cs *engine)
 {
 	/* Default vfuncs which can be overriden by each engine. */
 	engine->init_hw = gen8_init_common_ring;
-	engine->reset_hw = reset_common_ring;
+
+	engine->reset.prepare = execlists_reset_prepare;
+	engine->reset.reset = execlists_reset;
+	engine->reset.finish = execlists_reset_finish;
 
 	engine->context_pin = execlists_context_pin;
 	engine->context_unpin = execlists_context_unpin;
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 04d9d9a946a7..5dadbc435c0e 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -530,9 +530,20 @@ static int init_ring_common(struct intel_engine_cs *engine)
 	return ret;
 }
 
-static void reset_ring_common(struct intel_engine_cs *engine,
-			      struct i915_request *request)
+static struct i915_request *reset_prepare(struct intel_engine_cs *engine)
 {
+	if (engine->irq_seqno_barrier)
+		engine->irq_seqno_barrier(engine);
+
+	return i915_gem_find_active_request(engine);
+}
+
+static void reset_ring(struct intel_engine_cs *engine,
+		       struct i915_request *request)
+{
+	GEM_TRACE("%s seqno=%x\n",
+		  engine->name, request ? request->global_seqno : 0);
+
 	/*
 	 * RC6 must be prevented until the reset is complete and the engine
 	 * reinitialised. If it occurs in the middle of this sequence, the
@@ -595,6 +606,10 @@ static void reset_ring_common(struct intel_engine_cs *engine,
 	}
 }
 
+static void reset_finish(struct intel_engine_cs *engine)
+{
+}
+
 static int intel_rcs_ctx_init(struct i915_request *rq)
 {
 	int ret;
@@ -1987,7 +2002,9 @@ static void intel_ring_default_vfuncs(struct drm_i915_private *dev_priv,
 	intel_ring_init_semaphores(dev_priv, engine);
 
 	engine->init_hw = init_ring_common;
-	engine->reset_hw = reset_ring_common;
+	engine->reset.prepare = reset_prepare;
+	engine->reset.reset = reset_ring;
+	engine->reset.finish = reset_finish;
 
 	engine->context_pin = intel_ring_context_pin;
 	engine->context_unpin = intel_ring_context_unpin;
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index a02c7b3b9d55..4c71dcdc722b 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -418,8 +418,13 @@ struct intel_engine_cs {
 	void		(*irq_disable)(struct intel_engine_cs *engine);
 
 	int		(*init_hw)(struct intel_engine_cs *engine);
-	void		(*reset_hw)(struct intel_engine_cs *engine,
-				    struct i915_request *rq);
+
+	struct {
+		struct i915_request *(*prepare)(struct intel_engine_cs *engine);
+		void (*reset)(struct intel_engine_cs *engine,
+			      struct i915_request *rq);
+		void (*finish)(struct intel_engine_cs *engine);
+	} reset;
 
 	void		(*park)(struct intel_engine_cs *engine);
 	void		(*unpark)(struct intel_engine_cs *engine);
-- 
2.16.3

* [PATCH 06/11] drm/i915: Split execlists/guc reset preparations
  2018-03-26 11:50 Forced preemption gedankenexperiment Chris Wilson
                   ` (4 preceding siblings ...)
  2018-03-26 11:50 ` [PATCH 05/11] drm/i915: Move engine reset prepare/finish to backends Chris Wilson
@ 2018-03-26 11:50 ` Chris Wilson
  2018-03-26 11:50 ` [PATCH 07/11] drm/i915/execlists: Flush pending preemption events during reset Chris Wilson
                   ` (10 subsequent siblings)
  16 siblings, 0 replies; 40+ messages in thread
From: Chris Wilson @ 2018-03-26 11:50 UTC (permalink / raw)
  To: intel-gfx

In the next patch, we will make the execlists reset prepare callback
take into account preemption by flushing the context-switch handler.
This is not applicable to the GuC submission backend, so split the two
into their own backend callbacks.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Michał Winiarski <michal.winiarski@intel.com>
CC: Michel Thierry <michel.thierry@intel.com>
Cc: Jeff McGee <jeff.mcgee@intel.com>
Reviewed-by: Jeff McGee <jeff.mcgee@intel.com>
---
 drivers/gpu/drm/i915/intel_guc_submission.c | 41 +++++++++++++++++++++++++++++
 drivers/gpu/drm/i915/intel_lrc.c            | 11 +-------
 2 files changed, 42 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_guc_submission.c b/drivers/gpu/drm/i915/intel_guc_submission.c
index 207cda062626..2d0f1a04efea 100644
--- a/drivers/gpu/drm/i915/intel_guc_submission.c
+++ b/drivers/gpu/drm/i915/intel_guc_submission.c
@@ -776,6 +776,46 @@ static void guc_submission_tasklet(unsigned long data)
 		guc_dequeue(engine);
 }
 
+static struct i915_request *
+guc_reset_prepare(struct intel_engine_cs *engine)
+{
+	struct intel_engine_execlists * const execlists = &engine->execlists;
+
+	GEM_TRACE("%s\n", engine->name);
+
+	/*
+	 * Prevent request submission to the hardware until we have
+	 * completed the reset in i915_gem_reset_finish(). If a request
+	 * is completed by one engine, it may then queue a request
+	 * to a second via its execlists->tasklet *just* as we are
+	 * calling engine->init_hw() and also writing the ELSP.
+	 * Turning off the execlists->tasklet until the reset is over
+	 * prevents the race.
+	 *
+	 * Note that this needs to be a single atomic operation on the
+	 * tasklet (flush existing tasks, prevent new tasks) to prevent
+	 * a race between reset and set-wedged. It is not, so we do the best
+	 * we can atm and make sure we don't lock the machine up in the more
+	 * common case of recursively being called from set-wedged from inside
+	 * i915_reset.
+	 */
+	if (!atomic_read(&execlists->tasklet.count))
+		tasklet_kill(&execlists->tasklet);
+	tasklet_disable(&execlists->tasklet);
+
+	/*
+	 * We're using worker to queue preemption requests from the tasklet in
+	 * GuC submission mode.
+	 * Even though tasklet was disabled, we may still have a worker queued.
+	 * Let's make sure that all workers scheduled before disabling the
+	 * tasklet are completed before continuing with the reset.
+	 */
+	if (engine->i915->guc.preempt_wq)
+		flush_workqueue(engine->i915->guc.preempt_wq);
+
+	return i915_gem_find_active_request(engine);
+}
+
 /*
  * Everything below here is concerned with setup & teardown, and is
  * therefore not part of the somewhat time-critical batch-submission
@@ -1235,6 +1275,7 @@ int intel_guc_submission_enable(struct intel_guc *guc)
 			&engine->execlists;
 
 		execlists->tasklet.func = guc_submission_tasklet;
+		engine->reset.prepare = guc_reset_prepare;
 		engine->park = guc_submission_park;
 		engine->unpark = guc_submission_unpark;
 
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index f66d4c88b929..cf31b0749023 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -1735,16 +1735,6 @@ execlists_reset_prepare(struct intel_engine_cs *engine)
 		tasklet_kill(&execlists->tasklet);
 	tasklet_disable(&execlists->tasklet);
 
-	/*
-	 * We're using worker to queue preemption requests from the tasklet in
-	 * GuC submission mode.
-	 * Even though tasklet was disabled, we may still have a worker queued.
-	 * Let's make sure that all workers scheduled before disabling the
-	 * tasklet are completed before continuing with the reset.
-	 */
-	if (engine->i915->guc.preempt_wq)
-		flush_workqueue(engine->i915->guc.preempt_wq);
-
 	return i915_gem_find_active_request(engine);
 }
 
@@ -2134,6 +2124,7 @@ static void execlists_set_default_submission(struct intel_engine_cs *engine)
 	engine->cancel_requests = execlists_cancel_requests;
 	engine->schedule = execlists_schedule;
 	engine->execlists.tasklet.func = execlists_submission_tasklet;
+	engine->reset.prepare = execlists_reset_prepare;
 
 	engine->park = NULL;
 	engine->unpark = NULL;
-- 
2.16.3

* [PATCH 07/11] drm/i915/execlists: Flush pending preemption events during reset
  2018-03-26 11:50 Forced preemption gedankenexperiment Chris Wilson
                   ` (5 preceding siblings ...)
  2018-03-26 11:50 ` [PATCH 06/11] drm/i915: Split execlists/guc reset preparations Chris Wilson
@ 2018-03-26 11:50 ` Chris Wilson
  2018-03-27 11:44   ` [PATCH v2] " Chris Wilson
  2018-03-26 11:50 ` [PATCH 08/11] drm/i915/execlists: Force preemption via reset on timeout Chris Wilson
                   ` (9 subsequent siblings)
  16 siblings, 1 reply; 40+ messages in thread
From: Chris Wilson @ 2018-03-26 11:50 UTC (permalink / raw)
  To: intel-gfx

Catch up with the in-flight CSB events after disabling the tasklet,
before deciding which request was truly guilty of hanging the GPU.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Michał Winiarski <michal.winiarski@intel.com>
CC: Michel Thierry <michel.thierry@intel.com>
Cc: Jeff McGee <jeff.mcgee@intel.com>
---
 drivers/gpu/drm/i915/intel_lrc.c | 140 ++++++++++++++++++++++++++-------------
 1 file changed, 94 insertions(+), 46 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index cf31b0749023..50688fc889d9 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -874,63 +874,43 @@ static void execlists_cancel_requests(struct intel_engine_cs *engine)
 	local_irq_restore(flags);
 }
 
-/*
- * Check the unread Context Status Buffers and manage the submission of new
- * contexts to the ELSP accordingly.
- */
-static void execlists_submission_tasklet(unsigned long data)
+static void process_csb(struct intel_engine_cs *engine)
 {
-	struct intel_engine_cs * const engine = (struct intel_engine_cs *)data;
 	struct intel_engine_execlists * const execlists = &engine->execlists;
 	struct execlist_port * const port = execlists->port;
-	struct drm_i915_private *dev_priv = engine->i915;
+	struct drm_i915_private *i915 = engine->i915;
+	const u32 *buf;
 	bool fw = false;
 
-	/*
-	 * We can skip acquiring intel_runtime_pm_get() here as it was taken
-	 * on our behalf by the request (see i915_gem_mark_busy()) and it will
-	 * not be relinquished until the device is idle (see
-	 * i915_gem_idle_work_handler()). As a precaution, we make sure
-	 * that all ELSP are drained i.e. we have processed the CSB,
-	 * before allowing ourselves to idle and calling intel_runtime_pm_put().
-	 */
-	GEM_BUG_ON(!dev_priv->gt.awake);
-
-	/*
-	 * Prefer doing test_and_clear_bit() as a two stage operation to avoid
-	 * imposing the cost of a locked atomic transaction when submitting a
-	 * new request (outside of the context-switch interrupt).
-	 */
-	while (test_bit(ENGINE_IRQ_EXECLIST, &engine->irq_posted)) {
+	if (unlikely(execlists->csb_use_mmio)) {
+		buf = (u32 * __force)
+			(i915->regs + i915_mmio_reg_offset(RING_CONTEXT_STATUS_BUF_LO(engine, 0)));
+		execlists->csb_head = -1; /* force mmio read of CSB ptrs */
+	} else {
 		/* The HWSP contains a (cacheable) mirror of the CSB */
-		const u32 *buf =
-			&engine->status_page.page_addr[I915_HWS_CSB_BUF0_INDEX];
-		unsigned int head, tail;
+		buf = &engine->status_page.page_addr[I915_HWS_CSB_BUF0_INDEX];
+	}
 
-		if (unlikely(execlists->csb_use_mmio)) {
-			buf = (u32 * __force)
-				(dev_priv->regs + i915_mmio_reg_offset(RING_CONTEXT_STATUS_BUF_LO(engine, 0)));
-			execlists->csb_head = -1; /* force mmio read of CSB ptrs */
-		}
+	do {
+		unsigned int head, tail;
 
 		/* Clear before reading to catch new interrupts */
 		clear_bit(ENGINE_IRQ_EXECLIST, &engine->irq_posted);
 		smp_mb__after_atomic();
 
-		if (unlikely(execlists->csb_head == -1)) { /* following a reset */
+		if (unlikely(execlists->csb_head == -1)) { /* after a reset */
 			if (!fw) {
-				intel_uncore_forcewake_get(dev_priv,
-							   execlists->fw_domains);
+				intel_uncore_forcewake_get(i915, execlists->fw_domains);
 				fw = true;
 			}
 
-			head = readl(dev_priv->regs + i915_mmio_reg_offset(RING_CONTEXT_STATUS_PTR(engine)));
+			head = readl(i915->regs + i915_mmio_reg_offset(RING_CONTEXT_STATUS_PTR(engine)));
 			tail = GEN8_CSB_WRITE_PTR(head);
 			head = GEN8_CSB_READ_PTR(head);
 			execlists->csb_head = head;
 		} else {
 			const int write_idx =
-				intel_hws_csb_write_index(dev_priv) -
+				intel_hws_csb_write_index(i915) -
 				I915_HWS_CSB_BUF0_INDEX;
 
 			head = execlists->csb_head;
@@ -938,8 +918,8 @@ static void execlists_submission_tasklet(unsigned long data)
 		}
 		GEM_TRACE("%s cs-irq head=%d [%d%s], tail=%d [%d%s]\n",
 			  engine->name,
-			  head, GEN8_CSB_READ_PTR(readl(dev_priv->regs + i915_mmio_reg_offset(RING_CONTEXT_STATUS_PTR(engine)))), fw ? "" : "?",
-			  tail, GEN8_CSB_WRITE_PTR(readl(dev_priv->regs + i915_mmio_reg_offset(RING_CONTEXT_STATUS_PTR(engine)))), fw ? "" : "?");
+			  head, GEN8_CSB_READ_PTR(readl(i915->regs + i915_mmio_reg_offset(RING_CONTEXT_STATUS_PTR(engine)))), fw ? "" : "?",
+			  tail, GEN8_CSB_WRITE_PTR(readl(i915->regs + i915_mmio_reg_offset(RING_CONTEXT_STATUS_PTR(engine)))), fw ? "" : "?");
 
 		while (head != tail) {
 			struct i915_request *rq;
@@ -949,7 +929,8 @@ static void execlists_submission_tasklet(unsigned long data)
 			if (++head == GEN8_CSB_ENTRIES)
 				head = 0;
 
-			/* We are flying near dragons again.
+			/*
+			 * We are flying near dragons again.
 			 *
 			 * We hold a reference to the request in execlist_port[]
 			 * but no more than that. We are operating in softirq
@@ -1040,15 +1021,48 @@ static void execlists_submission_tasklet(unsigned long data)
 		if (head != execlists->csb_head) {
 			execlists->csb_head = head;
 			writel(_MASKED_FIELD(GEN8_CSB_READ_PTR_MASK, head << 8),
-			       dev_priv->regs + i915_mmio_reg_offset(RING_CONTEXT_STATUS_PTR(engine)));
+			       i915->regs + i915_mmio_reg_offset(RING_CONTEXT_STATUS_PTR(engine)));
 		}
-	}
+	} while (test_bit(ENGINE_IRQ_EXECLIST, &engine->irq_posted));
 
-	if (!execlists_is_active(execlists, EXECLISTS_ACTIVE_PREEMPT))
-		execlists_dequeue(engine);
+	if (unlikely(fw))
+		intel_uncore_forcewake_put(i915, execlists->fw_domains);
+}
 
-	if (fw)
-		intel_uncore_forcewake_put(dev_priv, execlists->fw_domains);
+/*
+ * Check the unread Context Status Buffers and manage the submission of new
+ * contexts to the ELSP accordingly.
+ */
+static void execlists_submission_tasklet(unsigned long data)
+{
+	struct intel_engine_cs * const engine = (struct intel_engine_cs *)data;
+
+	GEM_TRACE("%s awake?=%d, active=%x, irq-posted?=%d\n",
+		  engine->name,
+		  engine->i915->gt.awake,
+		  engine->execlists.active,
+		  test_bit(ENGINE_IRQ_EXECLIST, &engine->irq_posted));
+
+	/*
+	 * We can skip acquiring intel_runtime_pm_get() here as it was taken
+	 * on our behalf by the request (see i915_gem_mark_busy()) and it will
+	 * not be relinquished until the device is idle (see
+	 * i915_gem_idle_work_handler()). As a precaution, we make sure
+	 * that all ELSP are drained i.e. we have processed the CSB,
+	 * before allowing ourselves to idle and calling intel_runtime_pm_put().
+	 */
+	GEM_BUG_ON(!engine->i915->gt.awake);
+
+	/*
+	 * Prefer doing test_and_clear_bit() as a two stage operation to avoid
+	 * imposing the cost of a locked atomic transaction when submitting a
+	 * new request (outside of the context-switch interrupt).
+	 */
+	if (test_bit(ENGINE_IRQ_EXECLIST, &engine->irq_posted))
+		process_csb(engine);
+
+	if (!execlists_is_active(&engine->execlists, EXECLISTS_ACTIVE_PREEMPT))
+		execlists_dequeue(engine);
 
 	/* If the engine is now idle, so should be the flag; and vice versa. */
 	GEM_BUG_ON(execlists_is_active(&engine->execlists,
@@ -1712,6 +1726,7 @@ static struct i915_request *
 execlists_reset_prepare(struct intel_engine_cs *engine)
 {
 	struct intel_engine_execlists * const execlists = &engine->execlists;
+	struct i915_request *request, *active;
 
 	GEM_TRACE("%s\n", engine->name);
 
@@ -1735,7 +1750,40 @@ execlists_reset_prepare(struct intel_engine_cs *engine)
 		tasklet_kill(&execlists->tasklet);
 	tasklet_disable(&execlists->tasklet);
 
-	return i915_gem_find_active_request(engine);
+	/*
+	 * We want to flush the pending context switches; having disabled
+	 * the tasklet above, we can assume exclusive access to the execlists.
+	 * This allows us to catch up with an in-flight preemption event, and
+	 * avoid blaming an innocent request if the stall was due to the
+	 * preemption itself.
+	 */
+	if (test_bit(ENGINE_IRQ_EXECLIST, &engine->irq_posted))
+		process_csb(engine);
+
+	/*
+	 * The last active request can then be no later than the last request
+	 * now in ELSP[0]. So search backwards from there, so that if the GPU
+	 * has advanced beyond the last CSB update, it will be pardoned.
+	 */
+	active = NULL;
+	request = port_request(execlists->port);
+	if (request) {
+		unsigned long flags;
+
+		spin_lock_irqsave(&engine->timeline->lock, flags);
+		list_for_each_entry_from_reverse(request,
+						 &engine->timeline->requests,
+						 link) {
+			if (__i915_request_completed(request,
+						     request->global_seqno))
+				break;
+
+			active = request;
+		}
+		spin_unlock_irqrestore(&engine->timeline->lock, flags);
+	}
+
+	return active;
 }
 
 static void execlists_reset(struct intel_engine_cs *engine,
-- 
2.16.3

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH 08/11] drm/i915/execlists: Force preemption via reset on timeout
  2018-03-26 11:50 Forced preemption gedankenexperiment Chris Wilson
                   ` (6 preceding siblings ...)
  2018-03-26 11:50 ` [PATCH 07/11] drm/i915/execlists: Flush pending preemption events during reset Chris Wilson
@ 2018-03-26 11:50 ` Chris Wilson
  2018-03-27  8:51   ` Tvrtko Ursulin
  2018-03-27 15:50   ` Jeff McGee
  2018-03-26 11:50 ` [PATCH 09/11] drm/i915/preemption: Select timeout when scheduling Chris Wilson
                   ` (8 subsequent siblings)
  16 siblings, 2 replies; 40+ messages in thread
From: Chris Wilson @ 2018-03-26 11:50 UTC (permalink / raw)
  To: intel-gfx

Install a timer when trying to preempt on behalf of an important
context: if the active context does not honour the preemption request
within the desired timeout, we reset the GPU to allow the important
context to run.

(Open: should the timer not instead be started upon receipt of the high
priority request?)
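As a sketch of the mechanism, the decision that the hrtimer callback and the
highpri reset worker make together reduces to the check below. This is an
illustrative userspace model with hypothetical names (`engine_model`,
`should_force_reset`), not the driver code:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Illustrative model of forced preemption: on injecting the preempt
 * context we record the time and the caller-chosen timeout. If the
 * timer fires while the preemption is still outstanding, the active
 * context is declared hostile and the engine would be reset.
 */
struct engine_model {
	bool preempt_active;	/* preemption still outstanding? */
	uint64_t injected_ns;	/* when the preempt context was injected */
	uint64_t timeout_ns;	/* 0 disables forced preemption */
};

static bool should_force_reset(const struct engine_model *e, uint64_t now_ns)
{
	if (!e->timeout_ns)		/* timer was never armed */
		return false;
	if (!e->preempt_active)		/* preemption completed in time */
		return false;
	return now_ns - e->injected_ns >= e->timeout_ns;
}
```

In the patch itself the first condition corresponds to skipping
hrtimer_start(), and the second to hrtimer_try_to_cancel() racing with
complete_preempt_context().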

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/intel_lrc.c        | 53 +++++++++++++++++++++++++++++++++
 drivers/gpu/drm/i915/intel_ringbuffer.h |  8 +++++
 2 files changed, 61 insertions(+)

diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 50688fc889d9..6da816d23cb3 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -533,6 +533,47 @@ static void inject_preempt_context(struct intel_engine_cs *engine)
 
 	execlists_clear_active(execlists, EXECLISTS_ACTIVE_HWACK);
 	execlists_set_active(execlists, EXECLISTS_ACTIVE_PREEMPT);
+
+	/* Set a timer to force preemption vs hostile userspace */
+	if (execlists->queue_preempt_timeout) {
+		GEM_TRACE("%s timeout=%uns\n",
+			  engine->name, execlists->queue_preempt_timeout);
+		hrtimer_start(&execlists->preempt_timer,
+			      ktime_set(0, execlists->queue_preempt_timeout),
+			      HRTIMER_MODE_REL);
+	}
+}
+
+static enum hrtimer_restart preempt_timeout(struct hrtimer *hrtimer)
+{
+	struct intel_engine_execlists *execlists =
+		container_of(hrtimer, typeof(*execlists), preempt_timer);
+
+	GEM_TRACE("%s\n",
+		  container_of(execlists,
+			       struct intel_engine_cs,
+			       execlists)->name);
+
+	queue_work(system_highpri_wq, &execlists->preempt_reset);
+	return HRTIMER_NORESTART;
+}
+
+static void preempt_reset(struct work_struct *work)
+{
+	struct intel_engine_cs *engine =
+		container_of(work, typeof(*engine), execlists.preempt_reset);
+
+	GEM_TRACE("%s\n", engine->name);
+
+	tasklet_disable(&engine->execlists.tasklet);
+
+	engine->execlists.tasklet.func(engine->execlists.tasklet.data);
+
+	if (execlists_is_active(&engine->execlists, EXECLISTS_ACTIVE_PREEMPT))
+		i915_handle_error(engine->i915, BIT(engine->id), 0,
+				  "preemption timed out on %s", engine->name);
+
+	tasklet_enable(&engine->execlists.tasklet);
 }
 
 static void complete_preempt_context(struct intel_engine_execlists *execlists)
@@ -542,6 +583,10 @@ static void complete_preempt_context(struct intel_engine_execlists *execlists)
 	execlists_cancel_port_requests(execlists);
 	execlists_unwind_incomplete_requests(execlists);
 
+	/* If the timer already fired, complete the reset */
+	if (hrtimer_try_to_cancel(&execlists->preempt_timer) < 0)
+		return;
+
 	execlists_clear_active(execlists, EXECLISTS_ACTIVE_PREEMPT);
 }
 
@@ -708,6 +753,7 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
 			kmem_cache_free(engine->i915->priorities, p);
 	}
 done:
+	execlists->queue_preempt_timeout = 0; /* preemption point passed */
 	execlists->queue_priority = rb ? to_priolist(rb)->priority : INT_MIN;
 	execlists->first = rb;
 	if (submit)
@@ -864,6 +910,7 @@ static void execlists_cancel_requests(struct intel_engine_cs *engine)
 
 	/* Remaining _unready_ requests will be nop'ed when submitted */
 
+	execlists->queue_preempt_timeout = 0;
 	execlists->queue_priority = INT_MIN;
 	execlists->queue = RB_ROOT;
 	execlists->first = NULL;
@@ -1080,6 +1127,7 @@ static void queue_request(struct intel_engine_cs *engine,
 static void __submit_queue(struct intel_engine_cs *engine, int prio)
 {
 		engine->execlists.queue_priority = prio;
+		engine->execlists.queue_preempt_timeout = 0;
 		tasklet_hi_schedule(&engine->execlists.tasklet);
 }
 
@@ -2270,6 +2318,11 @@ logical_ring_setup(struct intel_engine_cs *engine)
 	tasklet_init(&engine->execlists.tasklet,
 		     execlists_submission_tasklet, (unsigned long)engine);
 
+	INIT_WORK(&engine->execlists.preempt_reset, preempt_reset);
+	hrtimer_init(&engine->execlists.preempt_timer,
+		     CLOCK_MONOTONIC, HRTIMER_MODE_REL);
+	engine->execlists.preempt_timer.function = preempt_timeout;
+
 	logical_ring_default_vfuncs(engine);
 	logical_ring_default_irqs(engine);
 }
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index 4c71dcdc722b..7166f47c8489 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -284,6 +284,11 @@ struct intel_engine_execlists {
 	 */
 	int queue_priority;
 
+	/**
+	 * @queue_preempt_timeout: Timeout in ns before forcing preemption.
+	 */
+	unsigned int queue_preempt_timeout;
+
 	/**
 	 * @queue: queue of requests, in priority lists
 	 */
@@ -313,6 +318,9 @@ struct intel_engine_execlists {
 	 * @preempt_complete_status: expected CSB upon completing preemption
 	 */
 	u32 preempt_complete_status;
+
+	struct hrtimer preempt_timer;
+	struct work_struct preempt_reset;
 };
 
 #define INTEL_ENGINE_CS_MAX_NAME 8
-- 
2.16.3

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH 09/11] drm/i915/preemption: Select timeout when scheduling
  2018-03-26 11:50 Forced preemption gedankenexperiment Chris Wilson
                   ` (7 preceding siblings ...)
  2018-03-26 11:50 ` [PATCH 08/11] drm/i915/execlists: Force preemption via reset on timeout Chris Wilson
@ 2018-03-26 11:50 ` Chris Wilson
  2018-03-26 11:50 ` [PATCH 10/11] drm/i915: Use a preemption timeout to enforce interactivity Chris Wilson
                   ` (7 subsequent siblings)
  16 siblings, 0 replies; 40+ messages in thread
From: Chris Wilson @ 2018-03-26 11:50 UTC (permalink / raw)
  To: intel-gfx

The choice of preemption timeout is determined by the context from which
we trigger the preemption, so allow the caller to specify the desired
timeout.

Effectively the other choice would be to use the shortest timeout along
the dependency chain. However, given that we would have already
triggered preemption for the dependency chain, we can assume that no
preemption along that chain is more important than the current request,
ergo we need only consider the current timeout. Realising this, we can
then pass control of the preemption timeout to the caller for greater
flexibility.
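The rule above, that only the current request's timeout matters and that it
only takes effect when the new priority actually wins, can be modelled as
follows. A minimal sketch with a stand-in struct, mirroring the
submit_queue()/__submit_queue() split in this patch:

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Stand-in for intel_engine_execlists: the caller-supplied timeout only
 * replaces the queued one when the new priority exceeds what is already
 * queued, i.e. exactly when we would kick the tasklet anyway.
 */
struct execlists_model {
	int queue_priority;
	unsigned int queue_preempt_timeout;	/* ns, 0 = no forced preempt */
};

static bool submit_queue_model(struct execlists_model *el,
			       int prio, unsigned int timeout)
{
	if (prio <= el->queue_priority)
		return false;		/* no kick, timeout ignored */

	el->queue_priority = prio;
	el->queue_preempt_timeout = timeout;
	return true;			/* would tasklet_hi_schedule() */
}
```

Note that a later, higher-priority submission with timeout 0 overwrites a
pending timeout, matching the "preemption point passed" reset in
execlists_dequeue().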

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/i915_gem.c         |  2 +-
 drivers/gpu/drm/i915/i915_request.c     |  2 +-
 drivers/gpu/drm/i915/intel_lrc.c        | 17 ++++++++++-------
 drivers/gpu/drm/i915/intel_ringbuffer.h |  6 ++++--
 4 files changed, 16 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 038867c96809..675d6bb59337 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -482,7 +482,7 @@ static void __fence_set_priority(struct dma_fence *fence, int prio)
 
 	rcu_read_lock();
 	if (engine->schedule)
-		engine->schedule(rq, prio);
+		engine->schedule(rq, prio, 0);
 	rcu_read_unlock();
 }
 
diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
index 2314a26cd7f8..9d8dcebd9649 100644
--- a/drivers/gpu/drm/i915/i915_request.c
+++ b/drivers/gpu/drm/i915/i915_request.c
@@ -1102,7 +1102,7 @@ void __i915_request_add(struct i915_request *request, bool flush_caches)
 	 */
 	rcu_read_lock();
 	if (engine->schedule)
-		engine->schedule(request, request->ctx->priority);
+		engine->schedule(request, request->ctx->priority, 0);
 	rcu_read_unlock();
 
 	local_bh_disable();
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 6da816d23cb3..e266657851e1 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -1124,17 +1124,19 @@ static void queue_request(struct intel_engine_cs *engine,
 	list_add_tail(&pt->link, &lookup_priolist(engine, pt, prio)->requests);
 }
 
-static void __submit_queue(struct intel_engine_cs *engine, int prio)
+static void __submit_queue(struct intel_engine_cs *engine,
+			   int prio, unsigned int timeout)
 {
 		engine->execlists.queue_priority = prio;
-		engine->execlists.queue_preempt_timeout = 0;
+		engine->execlists.queue_preempt_timeout = timeout;
 		tasklet_hi_schedule(&engine->execlists.tasklet);
 }
 
-static void submit_queue(struct intel_engine_cs *engine, int prio)
+static void submit_queue(struct intel_engine_cs *engine,
+			 int prio, unsigned int timeout)
 {
 	if (prio > engine->execlists.queue_priority)
-		__submit_queue(engine, prio);
+		__submit_queue(engine, prio, timeout);
 }
 
 static void execlists_submit_request(struct i915_request *request)
@@ -1146,7 +1148,7 @@ static void execlists_submit_request(struct i915_request *request)
 	spin_lock_irqsave(&engine->timeline->lock, flags);
 
 	queue_request(engine, &request->priotree, rq_prio(request));
-	submit_queue(engine, rq_prio(request));
+	submit_queue(engine, rq_prio(request), 0);
 
 	GEM_BUG_ON(!engine->execlists.first);
 	GEM_BUG_ON(list_empty(&request->priotree.link));
@@ -1174,7 +1176,8 @@ pt_lock_engine(struct i915_priotree *pt, struct intel_engine_cs *locked)
 	return engine;
 }
 
-static void execlists_schedule(struct i915_request *request, int prio)
+static void execlists_schedule(struct i915_request *request,
+			       int prio, unsigned int timeout)
 {
 	struct intel_engine_cs *engine;
 	struct i915_dependency *dep, *p;
@@ -1270,7 +1273,7 @@ static void execlists_schedule(struct i915_request *request, int prio)
 
 		if (prio > engine->execlists.queue_priority &&
 		    i915_sw_fence_done(&pt_to_request(pt)->submit))
-			__submit_queue(engine, prio);
+			__submit_queue(engine, prio, timeout);
 	}
 
 	spin_unlock_irq(&engine->timeline->lock);
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index 7166f47c8489..39df5a39af12 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -467,13 +467,15 @@ struct intel_engine_cs {
 	 */
 	void		(*submit_request)(struct i915_request *rq);
 
-	/* Call when the priority on a request has changed and it and its
+	/*
+	 * Call when the priority on a request has changed and it and its
 	 * dependencies may need rescheduling. Note the request itself may
 	 * not be ready to run!
 	 *
 	 * Called under the struct_mutex.
 	 */
-	void		(*schedule)(struct i915_request *request, int priority);
+	void		(*schedule)(struct i915_request *request,
+				    int priority, unsigned int timeout);
 
 	/*
 	 * Cancel all requests on the hardware, or queued for execution.
-- 
2.16.3

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH 10/11] drm/i915: Use a preemption timeout to enforce interactivity
  2018-03-26 11:50 Forced preemption gedankenexperiment Chris Wilson
                   ` (8 preceding siblings ...)
  2018-03-26 11:50 ` [PATCH 09/11] drm/i915/preemption: Select timeout when scheduling Chris Wilson
@ 2018-03-26 11:50 ` Chris Wilson
  2018-03-27  8:35   ` Tvrtko Ursulin
  2018-03-26 11:50 ` [PATCH 11/11] drm/i915: Allow user control over preempt timeout on the important context Chris Wilson
                   ` (6 subsequent siblings)
  16 siblings, 1 reply; 40+ messages in thread
From: Chris Wilson @ 2018-03-26 11:50 UTC (permalink / raw)
  To: intel-gfx

Use a liberal timeout of 20ms to ensure that the rendering for an
interactive pageflip is started in a timely fashion, and that user
interaction is not blocked by GPU or CPU hogs. This comes at the cost
of resetting whoever was blocking the preemption, likely leading to that
context/process being banned from submitting future requests.
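A back-of-envelope check on the 20ms figure (my arithmetic, not from the
patch): at a 60Hz refresh a frame lasts ~16.7ms, so the timeout only fires
once preemption has been blocked for more than a full frame, yet well before
a second frame would be dropped.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical userspace mirror of I915_PREEMPTION_TIMEOUT_DISPLAY. */
#define MODEL_NSEC_PER_SEC 1000000000ULL
#define MODEL_PREEMPT_TIMEOUT_DISPLAY_NS (20ULL * 1000 * 1000) /* 20 ms */

static uint64_t frame_period_ns(unsigned int refresh_hz)
{
	return MODEL_NSEC_PER_SEC / refresh_hz;
}
```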

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/i915_drv.h      |  3 ++-
 drivers/gpu/drm/i915/i915_gem.c      | 18 ++++++++++--------
 drivers/gpu/drm/i915/intel_display.c | 18 +++++++++++++++++-
 3 files changed, 29 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 800230ba1c3b..87388feb973d 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -3150,8 +3150,9 @@ int i915_gem_object_wait(struct drm_i915_gem_object *obj,
 			 struct intel_rps_client *rps);
 int i915_gem_object_wait_priority(struct drm_i915_gem_object *obj,
 				  unsigned int flags,
-				  int priority);
+				  int priority, unsigned int timeout);
 #define I915_PRIORITY_DISPLAY I915_PRIORITY_MAX
+#define I915_PREEMPTION_TIMEOUT_DISPLAY (20 * 1000 * 1000) /* 20 ms */
 
 int __must_check
 i915_gem_object_set_to_wc_domain(struct drm_i915_gem_object *obj, bool write);
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 675d6bb59337..252c6b58e739 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -469,7 +469,8 @@ i915_gem_object_wait_reservation(struct reservation_object *resv,
 	return timeout;
 }
 
-static void __fence_set_priority(struct dma_fence *fence, int prio)
+static void __fence_set_priority(struct dma_fence *fence,
+				 int prio, unsigned int timeout)
 {
 	struct i915_request *rq;
 	struct intel_engine_cs *engine;
@@ -482,11 +483,12 @@ static void __fence_set_priority(struct dma_fence *fence, int prio)
 
 	rcu_read_lock();
 	if (engine->schedule)
-		engine->schedule(rq, prio, 0);
+		engine->schedule(rq, prio, timeout);
 	rcu_read_unlock();
 }
 
-static void fence_set_priority(struct dma_fence *fence, int prio)
+static void fence_set_priority(struct dma_fence *fence,
+			       int prio, unsigned int timeout)
 {
 	/* Recurse once into a fence-array */
 	if (dma_fence_is_array(fence)) {
@@ -494,16 +496,16 @@ static void fence_set_priority(struct dma_fence *fence, int prio)
 		int i;
 
 		for (i = 0; i < array->num_fences; i++)
-			__fence_set_priority(array->fences[i], prio);
+			__fence_set_priority(array->fences[i], prio, timeout);
 	} else {
-		__fence_set_priority(fence, prio);
+		__fence_set_priority(fence, prio, timeout);
 	}
 }
 
 int
 i915_gem_object_wait_priority(struct drm_i915_gem_object *obj,
 			      unsigned int flags,
-			      int prio)
+			      int prio, unsigned int timeout)
 {
 	struct dma_fence *excl;
 
@@ -518,7 +520,7 @@ i915_gem_object_wait_priority(struct drm_i915_gem_object *obj,
 			return ret;
 
 		for (i = 0; i < count; i++) {
-			fence_set_priority(shared[i], prio);
+			fence_set_priority(shared[i], prio, timeout);
 			dma_fence_put(shared[i]);
 		}
 
@@ -528,7 +530,7 @@ i915_gem_object_wait_priority(struct drm_i915_gem_object *obj,
 	}
 
 	if (excl) {
-		fence_set_priority(excl, prio);
+		fence_set_priority(excl, prio, timeout);
 		dma_fence_put(excl);
 	}
 	return 0;
diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index 4c30c7c04f9c..830f2d4e540f 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -12786,7 +12786,23 @@ intel_prepare_plane_fb(struct drm_plane *plane,
 
 	ret = intel_plane_pin_fb(to_intel_plane_state(new_state));
 
-	i915_gem_object_wait_priority(obj, 0, I915_PRIORITY_DISPLAY);
+	/*
+	 * Reschedule our dependencies, and ensure we run within a timeout.
+	 *
+	 * Note that if the timeout is exceeded, then whoever was running that
+	 * prevented us from acquiring the GPU is declared rogue and reset. An
+	 * unresponsive process will then be banned in order to preserve
+	 * interactivity. Since this can be seen as a bit heavy-handed, we
+	 * select a timeout for when the dropped frames start to become a
+	 * noticeable nuisance for the user (20 ms, i.e. preemption was blocked
+	 * for more than a frame). Note, this is only a timeout for a delay in
+	 * preempting the current request in order to run our dependency chain,
+	 * our dependency chain may itself take a long time to run to completion
+	 * before we can present the framebuffer.
+	 */
+	i915_gem_object_wait_priority(obj, 0,
+				      I915_PRIORITY_DISPLAY,
+				      I915_PREEMPTION_TIMEOUT_DISPLAY);
 
 	mutex_unlock(&dev_priv->drm.struct_mutex);
 	i915_gem_object_unpin_pages(obj);
-- 
2.16.3

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH 11/11] drm/i915: Allow user control over preempt timeout on the important context
  2018-03-26 11:50 Forced preemption gedankenexperiment Chris Wilson
                   ` (9 preceding siblings ...)
  2018-03-26 11:50 ` [PATCH 10/11] drm/i915: Use a preemption timeout to enforce interactivity Chris Wilson
@ 2018-03-26 11:50 ` Chris Wilson
  2018-03-26 17:09   ` Tvrtko Ursulin
  2018-03-26 12:08 ` ✗ Fi.CI.CHECKPATCH: warning for series starting with [01/11] drm/i915/execlists: Avoid kicking the submission too early for rescheduling Patchwork
                   ` (5 subsequent siblings)
  16 siblings, 1 reply; 40+ messages in thread
From: Chris Wilson @ 2018-03-26 11:50 UTC (permalink / raw)
  To: intel-gfx

EGL_NV_realtime_priority?
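For reference, the setparam hunk below checks size, range, hardware
capability and privilege, in that order. A userspace model of that
validation (hypothetical helper name; the errno values match the patch):

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

#define MODEL_NSEC_PER_SEC 1000000000ULL

/* Mirrors the I915_CONTEXT_PARAM_PREEMPT_TIMEOUT setparam checks:
 * no payload size, value strictly below 1s, preemption supported,
 * and CAP_SYS_ADMIN required for any non-zero timeout. */
static int check_preempt_timeout_param(uint32_t size, uint64_t value,
				       bool has_preemption, bool is_admin)
{
	if (size)
		return -EINVAL;
	if (value >= MODEL_NSEC_PER_SEC)
		return -EINVAL;
	if (!has_preemption)
		return -ENODEV;
	if (value && !is_admin)
		return -EPERM;
	return 0;
}
```

One consequence of the ordering: an unprivileged client may still clear the
timeout (value 0), since the EPERM check only applies to non-zero values.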

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/i915_gem_context.c | 22 ++++++++++++++++++++++
 drivers/gpu/drm/i915/i915_gem_context.h | 13 +++++++++++++
 drivers/gpu/drm/i915/i915_request.c     |  8 ++++++--
 drivers/gpu/drm/i915/intel_lrc.c        |  2 +-
 include/uapi/drm/i915_drm.h             | 12 ++++++++++++
 5 files changed, 54 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index 5cfac0255758..dfb0a2b698c3 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -749,6 +749,15 @@ int i915_gem_context_getparam_ioctl(struct drm_device *dev, void *data,
 	case I915_CONTEXT_PARAM_PRIORITY:
 		args->value = ctx->priority;
 		break;
+	case I915_CONTEXT_PARAM_PREEMPT_TIMEOUT:
+		if (!(to_i915(dev)->caps.scheduler & I915_SCHEDULER_CAP_PREEMPTION))
+			ret = -ENODEV;
+		else if (args->size)
+			ret = -EINVAL;
+		else
+			args->value = ctx->preempt_timeout;
+		break;
+
 	default:
 		ret = -EINVAL;
 		break;
@@ -824,6 +833,19 @@ int i915_gem_context_setparam_ioctl(struct drm_device *dev, void *data,
 		}
 		break;
 
+	case I915_CONTEXT_PARAM_PREEMPT_TIMEOUT:
+		if (args->size)
+			ret = -EINVAL;
+		else if (args->value >= NSEC_PER_SEC)
+			ret = -EINVAL;
+		else if (!(to_i915(dev)->caps.scheduler & I915_SCHEDULER_CAP_PREEMPTION))
+			ret = -ENODEV;
+		else if (args->value && !capable(CAP_SYS_ADMIN))
+			ret = -EPERM;
+		else
+			ctx->preempt_timeout = args->value;
+		break;
+
 	default:
 		ret = -EINVAL;
 		break;
diff --git a/drivers/gpu/drm/i915/i915_gem_context.h b/drivers/gpu/drm/i915/i915_gem_context.h
index 7854262ddfd9..74d4cadd729e 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.h
+++ b/drivers/gpu/drm/i915/i915_gem_context.h
@@ -150,6 +150,19 @@ struct i915_gem_context {
 	 */
 	int priority;
 
+	/**
+	 * @preempt_timeout: QoS guarantee for the high priority context
+	 *
+	 * Some clients need a guarantee that they will start executing
+	 * within a certain window, even at the expense of others. This entails
+	 * that if a preemption request is not honoured by the active context
+	 * within the timeout, we will reset the GPU to evict the hog and
+	 * run the high priority context instead.
+	 *
+	 * Timeout is stored in nanoseconds; exclusive max of 1s.
+	 */
+	u32 preempt_timeout;
+
 	/** ggtt_offset_bias: placement restriction for context objects */
 	u32 ggtt_offset_bias;
 
diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
index 9d8dcebd9649..2cd4ea75d127 100644
--- a/drivers/gpu/drm/i915/i915_request.c
+++ b/drivers/gpu/drm/i915/i915_request.c
@@ -1101,8 +1101,12 @@ void __i915_request_add(struct i915_request *request, bool flush_caches)
 	 * run at the earliest possible convenience.
 	 */
 	rcu_read_lock();
-	if (engine->schedule)
-		engine->schedule(request, request->ctx->priority, 0);
+	if (engine->schedule) {
+		unsigned int timeout = request->ctx->preempt_timeout;
+		int priority = request->ctx->priority;
+
+		engine->schedule(request, priority, timeout);
+	}
 	rcu_read_unlock();
 
 	local_bh_disable();
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index e266657851e1..e782a621b40b 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -1148,7 +1148,7 @@ static void execlists_submit_request(struct i915_request *request)
 	spin_lock_irqsave(&engine->timeline->lock, flags);
 
 	queue_request(engine, &request->priotree, rq_prio(request));
-	submit_queue(engine, rq_prio(request), 0);
+	submit_queue(engine, rq_prio(request), request->ctx->preempt_timeout);
 
 	GEM_BUG_ON(!engine->execlists.first);
 	GEM_BUG_ON(list_empty(&request->priotree.link));
diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
index 7f5634ce8e88..6f10bbe90304 100644
--- a/include/uapi/drm/i915_drm.h
+++ b/include/uapi/drm/i915_drm.h
@@ -1456,6 +1456,18 @@ struct drm_i915_gem_context_param {
 #define   I915_CONTEXT_MAX_USER_PRIORITY	1023 /* inclusive */
 #define   I915_CONTEXT_DEFAULT_PRIORITY		0
 #define   I915_CONTEXT_MIN_USER_PRIORITY	-1023 /* inclusive */
+
+/*
+ * I915_CONTEXT_PARAM_PREEMPT_TIMEOUT:
+ *
+ * Preemption timeout, given in nanoseconds.
+ *
+ * Only allowed for privileged clients (CAP_SYS_ADMIN), this property allows
+ * the preempting context to kick out a GPU hog with a GPU reset if it does
+ * not honour the preemption request in time.
+ */
+#define I915_CONTEXT_PARAM_PREEMPT_TIMEOUT	0x7
+
 	__u64 value;
 };
 
-- 
2.16.3

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* ✗ Fi.CI.CHECKPATCH: warning for series starting with [01/11] drm/i915/execlists: Avoid kicking the submission too early for rescheduling
  2018-03-26 11:50 Forced preemption gedankenexperiment Chris Wilson
                   ` (10 preceding siblings ...)
  2018-03-26 11:50 ` [PATCH 11/11] drm/i915: Allow user control over preempt timeout on the important context Chris Wilson
@ 2018-03-26 12:08 ` Patchwork
  2018-03-26 12:23 ` ✓ Fi.CI.BAT: success " Patchwork
                   ` (4 subsequent siblings)
  16 siblings, 0 replies; 40+ messages in thread
From: Patchwork @ 2018-03-26 12:08 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

== Series Details ==

Series: series starting with [01/11] drm/i915/execlists: Avoid kicking the submission too early for rescheduling
URL   : https://patchwork.freedesktop.org/series/40665/
State : warning

== Summary ==

$ dim checkpatch origin/drm-tip
6cd6e46a6275 drm/i915/execlists: Avoid kicking the submission too early for rescheduling
-:19: WARNING:COMMIT_LOG_LONG_LINE: Possible unwrapped commit description (prefer a maximum 75 chars per line)
#19: 
References: f6322eddaff7 ("drm/i915/preemption: Allow preemption between submission ports")

-:19: ERROR:GIT_COMMIT_ID: Please use git commit description style 'commit <12+ chars of sha1> ("<title line>")' - ie: 'commit f6322eddaff7 ("drm/i915/preemption: Allow preemption between submission ports")'
#19: 
References: f6322eddaff7 ("drm/i915/preemption: Allow preemption between submission ports")

total: 1 errors, 1 warnings, 0 checks, 30 lines checked
571ee98c9870 drm/i915/execlists: Clear user-active flag on preemption completion
d3c6c6e49941 drm/i915: Include submission tasklet state in engine dump
84c375a58791 drm/i915/execlists: Refactor out complete_preempt_context()
e0fa2379926b drm/i915: Move engine reset prepare/finish to backends
b9c4fb9ea980 drm/i915: Split execlists/guc reset prepartions
17bce80a463b drm/i915/execlists: Flush pending preemption events during reset
-:91: WARNING:LONG_LINE: line over 100 characters
#91: FILE: drivers/gpu/drm/i915/intel_lrc.c:907:
+			head = readl(i915->regs + i915_mmio_reg_offset(RING_CONTEXT_STATUS_PTR(engine)));

-:108: WARNING:LONG_LINE: line over 100 characters
#108: FILE: drivers/gpu/drm/i915/intel_lrc.c:921:
+			  head, GEN8_CSB_READ_PTR(readl(i915->regs + i915_mmio_reg_offset(RING_CONTEXT_STATUS_PTR(engine)))), fw ? "" : "?",

-:109: WARNING:LONG_LINE: line over 100 characters
#109: FILE: drivers/gpu/drm/i915/intel_lrc.c:922:
+			  tail, GEN8_CSB_WRITE_PTR(readl(i915->regs + i915_mmio_reg_offset(RING_CONTEXT_STATUS_PTR(engine)))), fw ? "" : "?");

total: 0 errors, 3 warnings, 0 checks, 200 lines checked
ebbe79e61164 drm/i915/execlists: Force preemption via reset on timeout
1992a69b17b3 drm/i915/preemption: Select timeout when scheduling
b72df86f0c84 drm/i915: Use a preemption timeout to enforce interactivity
ae480c117350 drm/i915: Allow user control over preempt timeout on the important context

^ permalink raw reply	[flat|nested] 40+ messages in thread

* ✓ Fi.CI.BAT: success for series starting with [01/11] drm/i915/execlists: Avoid kicking the submission too early for rescheduling
  2018-03-26 11:50 Forced preemption gedankenexperiment Chris Wilson
                   ` (11 preceding siblings ...)
  2018-03-26 12:08 ` ✗ Fi.CI.CHECKPATCH: warning for series starting with [01/11] drm/i915/execlists: Avoid kicking the submission too early for rescheduling Patchwork
@ 2018-03-26 12:23 ` Patchwork
  2018-03-26 14:56 ` ✗ Fi.CI.IGT: failure " Patchwork
                   ` (3 subsequent siblings)
  16 siblings, 0 replies; 40+ messages in thread
From: Patchwork @ 2018-03-26 12:23 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

== Series Details ==

Series: series starting with [01/11] drm/i915/execlists: Avoid kicking the submission too early for rescheduling
URL   : https://patchwork.freedesktop.org/series/40665/
State : success

== Summary ==

Series 40665v1 series starting with [01/11] drm/i915/execlists: Avoid kicking the submission too early for rescheduling
https://patchwork.freedesktop.org/api/1.0/series/40665/revisions/1/mbox/

---- Known issues:

Test gem_mmap_gtt:
        Subgroup basic-small-bo-tiledx:
                pass       -> FAIL       (fi-gdg-551) fdo#102575
Test kms_pipe_crc_basic:
        Subgroup suspend-read-crc-pipe-c:
                incomplete -> PASS       (fi-hsw-4770) fdo#104944

fdo#102575 https://bugs.freedesktop.org/show_bug.cgi?id=102575
fdo#104944 https://bugs.freedesktop.org/show_bug.cgi?id=104944

fi-bdw-5557u     total:285  pass:264  dwarn:0   dfail:0   fail:0   skip:21  time:439s
fi-bdw-gvtdvm    total:285  pass:261  dwarn:0   dfail:0   fail:0   skip:24  time:439s
fi-blb-e6850     total:285  pass:220  dwarn:1   dfail:0   fail:0   skip:64  time:379s
fi-bsw-n3050     total:285  pass:239  dwarn:0   dfail:0   fail:0   skip:46  time:546s
fi-bwr-2160      total:285  pass:180  dwarn:0   dfail:0   fail:0   skip:105 time:295s
fi-bxt-dsi       total:243  pass:216  dwarn:0   dfail:0   fail:0   skip:26 
fi-bxt-j4205     total:285  pass:256  dwarn:0   dfail:0   fail:0   skip:29  time:511s
fi-byt-j1900     total:285  pass:250  dwarn:0   dfail:0   fail:0   skip:35  time:521s
fi-byt-n2820     total:285  pass:246  dwarn:0   dfail:0   fail:0   skip:39  time:507s
fi-cfl-8700k     total:285  pass:257  dwarn:0   dfail:0   fail:0   skip:28  time:407s
fi-cfl-u         total:285  pass:259  dwarn:0   dfail:0   fail:0   skip:26  time:514s
fi-elk-e7500     total:285  pass:225  dwarn:1   dfail:0   fail:0   skip:59  time:428s
fi-gdg-551       total:285  pass:176  dwarn:0   dfail:0   fail:1   skip:108 time:316s
fi-glk-1         total:285  pass:257  dwarn:0   dfail:0   fail:0   skip:28  time:537s
fi-hsw-4770      total:285  pass:258  dwarn:0   dfail:0   fail:0   skip:27  time:408s
fi-ilk-650       total:285  pass:225  dwarn:0   dfail:0   fail:0   skip:60  time:423s
fi-ivb-3520m     total:285  pass:256  dwarn:0   dfail:0   fail:0   skip:29  time:472s
fi-ivb-3770      total:285  pass:252  dwarn:0   dfail:0   fail:0   skip:33  time:434s
fi-kbl-7500u     total:285  pass:260  dwarn:1   dfail:0   fail:0   skip:24  time:473s
fi-kbl-7567u     total:285  pass:265  dwarn:0   dfail:0   fail:0   skip:20  time:472s
fi-kbl-r         total:285  pass:258  dwarn:0   dfail:0   fail:0   skip:27  time:515s
fi-pnv-d510      total:285  pass:219  dwarn:1   dfail:0   fail:0   skip:65  time:652s
fi-skl-6260u     total:285  pass:265  dwarn:0   dfail:0   fail:0   skip:20  time:438s
fi-skl-6600u     total:285  pass:258  dwarn:0   dfail:0   fail:0   skip:27  time:541s
fi-skl-6700k2    total:285  pass:261  dwarn:0   dfail:0   fail:0   skip:24  time:509s
fi-skl-6770hq    total:285  pass:265  dwarn:0   dfail:0   fail:0   skip:20  time:493s
fi-skl-guc       total:285  pass:257  dwarn:0   dfail:0   fail:0   skip:28  time:429s
fi-skl-gvtdvm    total:285  pass:262  dwarn:0   dfail:0   fail:0   skip:23  time:446s
fi-snb-2520m     total:285  pass:245  dwarn:0   dfail:0   fail:0   skip:40  time:572s
fi-snb-2600      total:285  pass:245  dwarn:0   dfail:0   fail:0   skip:40  time:403s
Blacklisted hosts:
fi-cfl-s3        total:285  pass:259  dwarn:0   dfail:0   fail:0   skip:26  time:564s
fi-cnl-psr       total:224  pass:198  dwarn:0   dfail:0   fail:1   skip:24 
fi-glk-j4005     total:285  pass:256  dwarn:0   dfail:0   fail:0   skip:29  time:490s

94f5d9189e61055e246c31106b3810dc17ddee9c drm-tip: 2018y-03m-23d-23h-41m-40s UTC integration manifest
ae480c117350 drm/i915: Allow user control over preempt timeout on the important context
b72df86f0c84 drm/i915: Use a preemption timeout to enforce interactivity
1992a69b17b3 drm/i915/preemption: Select timeout when scheduling
ebbe79e61164 drm/i915/execlists: Force preemption via reset on timeout
17bce80a463b drm/i915/execlists: Flush pending preemption events during reset
b9c4fb9ea980 drm/i915: Split execlists/guc reset prepartions
e0fa2379926b drm/i915: Move engine reset prepare/finish to backends
84c375a58791 drm/i915/execlists: Refactor out complete_preempt_context()
d3c6c6e49941 drm/i915: Include submission tasklet state in engine dump
571ee98c9870 drm/i915/execlists: Clear user-active flag on preemption completion
6cd6e46a6275 drm/i915/execlists: Avoid kicking the submission too early for rescheduling

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_8489/issues.html

* ✗ Fi.CI.IGT: failure for series starting with [01/11] drm/i915/execlists: Avoid kicking the submission too early for rescheduling
  2018-03-26 11:50 Forced preemption gedankenexperiment Chris Wilson
                   ` (12 preceding siblings ...)
  2018-03-26 12:23 ` ✓ Fi.CI.BAT: success " Patchwork
@ 2018-03-26 14:56 ` Patchwork
  2018-03-27 12:28 ` ✗ Fi.CI.CHECKPATCH: warning for series starting with [01/11] drm/i915/execlists: Avoid kicking the submission too early for rescheduling (rev2) Patchwork
                   ` (2 subsequent siblings)
  16 siblings, 0 replies; 40+ messages in thread
From: Patchwork @ 2018-03-26 14:56 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

== Series Details ==

Series: series starting with [01/11] drm/i915/execlists: Avoid kicking the submission too early for rescheduling
URL   : https://patchwork.freedesktop.org/series/40665/
State : failure

== Summary ==

---- Possible new issues:

Test gem_ctx_param:
        Subgroup invalid-param-get:
                pass       -> FAIL       (shard-apl)
                pass       -> FAIL       (shard-hsw)
                pass       -> FAIL       (shard-snb)
        Subgroup invalid-param-set:
                pass       -> FAIL       (shard-apl)
                pass       -> FAIL       (shard-hsw)
                pass       -> FAIL       (shard-snb)

---- Known issues:

Test kms_flip:
        Subgroup 2x-flip-vs-expired-vblank-interruptible:
                fail       -> PASS       (shard-hsw) fdo#102887
        Subgroup 2x-modeset-vs-vblank-race-interruptible:
                pass       -> FAIL       (shard-hsw) fdo#103060 +2
        Subgroup 2x-plain-flip-ts-check-interruptible:
                fail       -> PASS       (shard-hsw) fdo#100368
Test kms_rotation_crc:
        Subgroup sprite-rotation-180:
                fail       -> PASS       (shard-snb) fdo#103925
Test kms_setmode:
        Subgroup basic:
                pass       -> FAIL       (shard-apl) fdo#99912

fdo#102887 https://bugs.freedesktop.org/show_bug.cgi?id=102887
fdo#103060 https://bugs.freedesktop.org/show_bug.cgi?id=103060
fdo#100368 https://bugs.freedesktop.org/show_bug.cgi?id=100368
fdo#103925 https://bugs.freedesktop.org/show_bug.cgi?id=103925
fdo#99912 https://bugs.freedesktop.org/show_bug.cgi?id=99912

shard-apl        total:3484 pass:1818 dwarn:1   dfail:0   fail:9   skip:1655 time:12974s
shard-hsw        total:3484 pass:1770 dwarn:1   dfail:0   fail:5   skip:1707 time:11768s
shard-snb        total:3484 pass:1361 dwarn:1   dfail:0   fail:5   skip:2117 time:7006s
Blacklisted hosts:
shard-kbl        total:3484 pass:1942 dwarn:1   dfail:1   fail:11  skip:1529 time:9857s

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_8489/shards.html

* Re: [PATCH 11/11] drm/i915: Allow user control over preempt timeout on the important context
  2018-03-26 11:50 ` [PATCH 11/11] drm/i915: Allow user control over preempt timeout on the important context Chris Wilson
@ 2018-03-26 17:09   ` Tvrtko Ursulin
  2018-03-26 19:52     ` Chris Wilson
  0 siblings, 1 reply; 40+ messages in thread
From: Tvrtko Ursulin @ 2018-03-26 17:09 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx


On 26/03/2018 12:50, Chris Wilson wrote:
> EGL_NV_realtime_priority?
> 
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> ---
>   drivers/gpu/drm/i915/i915_gem_context.c | 22 ++++++++++++++++++++++
>   drivers/gpu/drm/i915/i915_gem_context.h | 13 +++++++++++++
>   drivers/gpu/drm/i915/i915_request.c     |  8 ++++++--
>   drivers/gpu/drm/i915/intel_lrc.c        |  2 +-
>   include/uapi/drm/i915_drm.h             | 12 ++++++++++++
>   5 files changed, 54 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
> index 5cfac0255758..dfb0a2b698c3 100644
> --- a/drivers/gpu/drm/i915/i915_gem_context.c
> +++ b/drivers/gpu/drm/i915/i915_gem_context.c
> @@ -749,6 +749,15 @@ int i915_gem_context_getparam_ioctl(struct drm_device *dev, void *data,
>   	case I915_CONTEXT_PARAM_PRIORITY:
>   		args->value = ctx->priority;
>   		break;
> +	case I915_CONTEXT_PARAM_PREEMPT_TIMEOUT:
> +		if (!(to_i915(dev)->caps.scheduler & I915_SCHEDULER_CAP_PREEMPTION))
> +			ret = -ENODEV;
> +		else if (args->size)
> +			ret = -EINVAL;
> +		else
> +			args->value = ctx->preempt_timeout;
> +		break;
> +
>   	default:
>   		ret = -EINVAL;
>   		break;
> @@ -824,6 +833,19 @@ int i915_gem_context_setparam_ioctl(struct drm_device *dev, void *data,
>   		}
>   		break;
>   
> +	case I915_CONTEXT_PARAM_PREEMPT_TIMEOUT:
> +		if (args->size)
> +			ret = -EINVAL;
> +		else if (args->value >= NSEC_PER_SEC)
> +			ret = -EINVAL;
> +		else if (!(to_i915(dev)->caps.scheduler & I915_SCHEDULER_CAP_PREEMPTION))
> +			ret = -ENODEV;
> +		else if (args->value && !capable(CAP_SYS_ADMIN))
> +			ret = -EPERM;
> +		else
> +			ctx->preempt_timeout = args->value;
> +		break;
> +
>   	default:
>   		ret = -EINVAL;
>   		break;
> diff --git a/drivers/gpu/drm/i915/i915_gem_context.h b/drivers/gpu/drm/i915/i915_gem_context.h
> index 7854262ddfd9..74d4cadd729e 100644
> --- a/drivers/gpu/drm/i915/i915_gem_context.h
> +++ b/drivers/gpu/drm/i915/i915_gem_context.h
> @@ -150,6 +150,19 @@ struct i915_gem_context {
>   	 */
>   	int priority;
>   
> +	/**
> +	 * @preempt_timeout: QoS guarantee for the high priority context
> +	 *
> +	 * Some clients need a guarantee that they will start executing
> +	 * within a certain window, even at the expense of others. This entails
> +	 * that if a preemption request is not honoured by the active context
> +	 * within the timeout, we will reset the GPU to evict the hog and
> +	 * run the high priority context instead.
> +	 *
> +	 * Timeout is stored in nanoseconds; exclusive max of 1s.

Why did you think we would want to limit it to 1s?

> +	 */
> +	u32 preempt_timeout;
> +
>   	/** ggtt_offset_bias: placement restriction for context objects */
>   	u32 ggtt_offset_bias;
>   
> diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
> index 9d8dcebd9649..2cd4ea75d127 100644
> --- a/drivers/gpu/drm/i915/i915_request.c
> +++ b/drivers/gpu/drm/i915/i915_request.c
> @@ -1101,8 +1101,12 @@ void __i915_request_add(struct i915_request *request, bool flush_caches)
>   	 * run at the earliest possible convenience.
>   	 */
>   	rcu_read_lock();
> -	if (engine->schedule)
> -		engine->schedule(request, request->ctx->priority, 0);
> +	if (engine->schedule) {
> +		unsigned int timeout = request->ctx->preempt_timeout;
> +		int priority = request->ctx->priority;
> +
> +		engine->schedule(request, priority, timeout);
> +	}
>   	rcu_read_unlock();
>   
>   	local_bh_disable();
> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
> index e266657851e1..e782a621b40b 100644
> --- a/drivers/gpu/drm/i915/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/intel_lrc.c
> @@ -1148,7 +1148,7 @@ static void execlists_submit_request(struct i915_request *request)
>   	spin_lock_irqsave(&engine->timeline->lock, flags);
>   
>   	queue_request(engine, &request->priotree, rq_prio(request));
> -	submit_queue(engine, rq_prio(request), 0);
> +	submit_queue(engine, rq_prio(request), request->ctx->preempt_timeout);
>   
>   	GEM_BUG_ON(!engine->execlists.first);
>   	GEM_BUG_ON(list_empty(&request->priotree.link));
> diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
> index 7f5634ce8e88..6f10bbe90304 100644
> --- a/include/uapi/drm/i915_drm.h
> +++ b/include/uapi/drm/i915_drm.h
> @@ -1456,6 +1456,18 @@ struct drm_i915_gem_context_param {
>   #define   I915_CONTEXT_MAX_USER_PRIORITY	1023 /* inclusive */
>   #define   I915_CONTEXT_DEFAULT_PRIORITY		0
>   #define   I915_CONTEXT_MIN_USER_PRIORITY	-1023 /* inclusive */
> +
> +/*
> + * I915_CONTEXT_PARAM_PREEMPT_TIMEOUT:
> + *
> + * Preemption timeout given in nanoseconds.
> + *
> + * Only allowed for privileged clients (CAP_SYS_ADMIN), this property allows
> + * the preempting context to kick out a GPU hog using a GPU reset if they do
> + * not honour the preemption request in time.
> + */
> +#define I915_CONTEXT_PARAM_PREEMPT_TIMEOUT	0x7
> +

Since I expressed a concern on the "force preemption" naming in the 
other thread, I feel obliged to say that I am OK with PREEMPT_TIMEOUT. :)

Regards,

Tvrtko

>   	__u64 value;
>   };
>   
> 


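The setparam hunk quoted above validates in a fixed order: reject a non-zero size, reject timeouts of one second or more, report -ENODEV when the scheduler lacks preemption support, and require CAP_SYS_ADMIN for any non-zero timeout. A standalone model of that ordering (hypothetical helper name; not the driver code itself):

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL

/*
 * Toy model of the I915_CONTEXT_PARAM_PREEMPT_TIMEOUT setparam checks:
 * the order matters, since the first failing test picks the errno seen
 * by userspace.
 */
static int check_preempt_timeout(uint64_t size, uint64_t value,
				 bool has_preemption, bool is_admin)
{
	if (size)
		return -EINVAL;		/* param carries no extra payload */
	if (value >= NSEC_PER_SEC)
		return -EINVAL;		/* exclusive 1s cap */
	if (!has_preemption)
		return -ENODEV;		/* scheduler cannot preempt at all */
	if (value && !is_admin)
		return -EPERM;		/* non-zero timeout is privileged */
	return 0;			/* would be stored as ctx->preempt_timeout */
}
```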

* Re: [PATCH 11/11] drm/i915: Allow user control over preempt timeout on the important context
  2018-03-26 17:09   ` Tvrtko Ursulin
@ 2018-03-26 19:52     ` Chris Wilson
  2018-03-26 20:49       ` Chris Wilson
  0 siblings, 1 reply; 40+ messages in thread
From: Chris Wilson @ 2018-03-26 19:52 UTC (permalink / raw)
  To: Tvrtko Ursulin, intel-gfx

Quoting Tvrtko Ursulin (2018-03-26 18:09:29)
> 
> On 26/03/2018 12:50, Chris Wilson wrote:
> > diff --git a/drivers/gpu/drm/i915/i915_gem_context.h b/drivers/gpu/drm/i915/i915_gem_context.h
> > index 7854262ddfd9..74d4cadd729e 100644
> > --- a/drivers/gpu/drm/i915/i915_gem_context.h
> > +++ b/drivers/gpu/drm/i915/i915_gem_context.h
> > @@ -150,6 +150,19 @@ struct i915_gem_context {
> >        */
> >       int priority;
> >   
> > +     /**
> > +      * @preempt_timeout: QoS guarantee for the high priority context
> > +      *
> > +      * Some clients need a guarantee that they will start executing
> > +      * within a certain window, even at the expense of others. This entails
> > +      * that if a preemption request is not honoured by the active context
> > +      * within the timeout, we will reset the GPU to evict the hog and
> > +      * run the high priority context instead.
> > +      *
> > +      * Timeout is stored in nanoseconds; exclusive max of 1s.
> 
> Why did you think we would want to limit it to 1s?

There's a realistic upper bound of hangcheck interval, say 6s. But
that's completely internal and so irrelevant to the API. 1s was just in
case we used any struct timespec and so could completely ignore the
division, but it really stems from forgetting about ns_to_ktime()...

Entirely arbitrary. I just couldn't believe setting a hrtimer for longer
than smallval made sense, so plumped for something safe to fit in 32b.
-Chris

* Re: [PATCH 11/11] drm/i915: Allow user control over preempt timeout on the important context
  2018-03-26 19:52     ` Chris Wilson
@ 2018-03-26 20:49       ` Chris Wilson
  0 siblings, 0 replies; 40+ messages in thread
From: Chris Wilson @ 2018-03-26 20:49 UTC (permalink / raw)
  To: Tvrtko Ursulin, intel-gfx

Quoting Chris Wilson (2018-03-26 20:52:06)
> Quoting Tvrtko Ursulin (2018-03-26 18:09:29)
> > 
> > On 26/03/2018 12:50, Chris Wilson wrote:
> > > diff --git a/drivers/gpu/drm/i915/i915_gem_context.h b/drivers/gpu/drm/i915/i915_gem_context.h
> > > index 7854262ddfd9..74d4cadd729e 100644
> > > --- a/drivers/gpu/drm/i915/i915_gem_context.h
> > > +++ b/drivers/gpu/drm/i915/i915_gem_context.h
> > > @@ -150,6 +150,19 @@ struct i915_gem_context {
> > >        */
> > >       int priority;
> > >   
> > > +     /**
> > > +      * @preempt_timeout: QoS guarantee for the high priority context
> > > +      *
> > > +      * Some clients need a guarantee that they will start executing
> > > +      * within a certain window, even at the expense of others. This entails
> > > +      * that if a preemption request is not honoured by the active context
> > > +      * within the timeout, we will reset the GPU to evict the hog and
> > > +      * run the high priority context instead.
> > > +      *
> > > +      * Timeout is stored in nanoseconds; exclusive max of 1s.
> > 
> > Why did you think we would want to limit it to 1s?
> 
> There's a realistic upper bound of hangcheck interval, say 6s. But
> that's completely internal and so irrelevant to the API. 1s was just in
> case we used any struct timespec and so could completely ignore the
> division, but it really stems from forgetting about ns_to_ktime()...
> 
> Entirely arbitrary. I just couldn't believe setting a hrtimer for longer
> than smallval made sense, so plumped for something safe to fit in 32b.

Also it ties into using hrtimer instead of jiffies. My expectation was
that timeout would be on the order of a millisecond (on the high side);
if we are talking whole seconds we should switch to jiffies.
-Chris
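The 32-bit reasoning here checks out: a u32 holds up to roughly 4.29 seconds of nanoseconds, so the exclusive 1s cap is a policy choice rather than a storage limit. A quick sanity check of the arithmetic (illustrative names only, not kernel code):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NSEC_PER_SEC  1000000000ULL
#define NSEC_PER_MSEC 1000000ULL

/* Would a nanosecond timeout fit a hypothetical u32 context field? */
static bool timeout_fits_u32(uint64_t ns)
{
	return ns <= UINT32_MAX;	/* UINT32_MAX ns is ~4.29s */
}
```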

* Re: [PATCH 10/11] drm/i915: Use a preemption timeout to enforce interactivity
  2018-03-26 11:50 ` [PATCH 10/11] drm/i915: Use a preemption timeout to enforce interactivity Chris Wilson
@ 2018-03-27  8:35   ` Tvrtko Ursulin
  2018-03-27  8:39     ` Chris Wilson
  0 siblings, 1 reply; 40+ messages in thread
From: Tvrtko Ursulin @ 2018-03-27  8:35 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx


On 26/03/2018 12:50, Chris Wilson wrote:
> Use a liberal timeout of 20ms to ensure that the rendering for an
> interactive pageflip is started in a timely fashion, and that
> user interaction is not blocked by GPU, or CPU, hogs. This is at the cost
> of resetting whoever was blocking the preemption, likely leading to that
> context/process being banned from submitting future requests.
> 
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> ---
>   drivers/gpu/drm/i915/i915_drv.h      |  3 ++-
>   drivers/gpu/drm/i915/i915_gem.c      | 18 ++++++++++--------
>   drivers/gpu/drm/i915/intel_display.c | 18 +++++++++++++++++-
>   3 files changed, 29 insertions(+), 10 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index 800230ba1c3b..87388feb973d 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -3150,8 +3150,9 @@ int i915_gem_object_wait(struct drm_i915_gem_object *obj,
>   			 struct intel_rps_client *rps);
>   int i915_gem_object_wait_priority(struct drm_i915_gem_object *obj,
>   				  unsigned int flags,
> -				  int priority);
> +				  int priority, unsigned int timeout);
>   #define I915_PRIORITY_DISPLAY I915_PRIORITY_MAX
> +#define I915_PREEMPTION_TIMEOUT_DISPLAY (20 * 1000 * 1000) /* 20 ms */
>   
>   int __must_check
>   i915_gem_object_set_to_wc_domain(struct drm_i915_gem_object *obj, bool write);
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index 675d6bb59337..252c6b58e739 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -469,7 +469,8 @@ i915_gem_object_wait_reservation(struct reservation_object *resv,
>   	return timeout;
>   }
>   
> -static void __fence_set_priority(struct dma_fence *fence, int prio)
> +static void __fence_set_priority(struct dma_fence *fence,
> +				 int prio, unsigned int timeout)
>   {
>   	struct i915_request *rq;
>   	struct intel_engine_cs *engine;
> @@ -482,11 +483,12 @@ static void __fence_set_priority(struct dma_fence *fence, int prio)
>   
>   	rcu_read_lock();
>   	if (engine->schedule)
> -		engine->schedule(rq, prio, 0);
> +		engine->schedule(rq, prio, timeout);
>   	rcu_read_unlock();
>   }
>   
> -static void fence_set_priority(struct dma_fence *fence, int prio)
> +static void fence_set_priority(struct dma_fence *fence,
> +			       int prio, unsigned int timeout)
>   {
>   	/* Recurse once into a fence-array */
>   	if (dma_fence_is_array(fence)) {
> @@ -494,16 +496,16 @@ static void fence_set_priority(struct dma_fence *fence, int prio)
>   		int i;
>   
>   		for (i = 0; i < array->num_fences; i++)
> -			__fence_set_priority(array->fences[i], prio);
> +			__fence_set_priority(array->fences[i], prio, timeout);
>   	} else {
> -		__fence_set_priority(fence, prio);
> +		__fence_set_priority(fence, prio, timeout);
>   	}
>   }
>   
>   int
>   i915_gem_object_wait_priority(struct drm_i915_gem_object *obj,
>   			      unsigned int flags,
> -			      int prio)
> +			      int prio, unsigned int timeout)
>   {
>   	struct dma_fence *excl;
>   
> @@ -518,7 +520,7 @@ i915_gem_object_wait_priority(struct drm_i915_gem_object *obj,
>   			return ret;
>   
>   		for (i = 0; i < count; i++) {
> -			fence_set_priority(shared[i], prio);
> +			fence_set_priority(shared[i], prio, timeout);
>   			dma_fence_put(shared[i]);
>   		}
>   
> @@ -528,7 +530,7 @@ i915_gem_object_wait_priority(struct drm_i915_gem_object *obj,
>   	}
>   
>   	if (excl) {
> -		fence_set_priority(excl, prio);
> +		fence_set_priority(excl, prio, timeout);
>   		dma_fence_put(excl);
>   	}
>   	return 0;
> diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
> index 4c30c7c04f9c..830f2d4e540f 100644
> --- a/drivers/gpu/drm/i915/intel_display.c
> +++ b/drivers/gpu/drm/i915/intel_display.c
> @@ -12786,7 +12786,23 @@ intel_prepare_plane_fb(struct drm_plane *plane,
>   
>   	ret = intel_plane_pin_fb(to_intel_plane_state(new_state));
>   
> -	i915_gem_object_wait_priority(obj, 0, I915_PRIORITY_DISPLAY);
> +	/*
> +	 * Reschedule our dependencies, and ensure we run within a timeout.
> +	 *
> +	 * Note that if the timeout is exceeded, then whoever was running that
> +	 * prevented us from acquiring the GPU is declared rogue and reset. An
> +	 * unresponsive process will then be banned in order to preserve
> +	 * interactivity. Since this can be seen as a bit heavy-handed, we
> +	 * select a timeout for when the dropped frames start to become a
> +	 * noticeable nuisance for the user (20 ms, i.e. preemption was blocked
> +	 * for more than a frame). Note, this is only a timeout for a delay in
> +	 * preempting the current request in order to run our dependency chain,
> +	 * our dependency chain may itself take a long time to run to completion
> +	 * before we can present the framebuffer.
> +	 */
> +	i915_gem_object_wait_priority(obj, 0,
> +				      I915_PRIORITY_DISPLAY,
> +				      I915_PREEMPTION_TIMEOUT_DISPLAY);

API signature changes to allow timeouts are fine, but I think this hunk 
should be a separate patch at the end of the series. (Since it has the 
potential to make things which used to work, albeit stutteringly (?), 
start crashing.)

Regards,

Tvrtko

>   
>   	mutex_unlock(&dev_priv->drm.struct_mutex);
>   	i915_gem_object_unpin_pages(obj);
> 
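The fence_set_priority() hunk quoted above recurses exactly one level into a dma_fence_array, applying the same (prio, timeout) pair to every child fence. A toy model of that recurse-once shape (the struct and names are stand-ins, not the real dma-fence API):

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for a fence; children is non-NULL for an "array" fence. */
struct toy_fence {
	int prio;
	unsigned int timeout;
	struct toy_fence **children;
	size_t num_children;
};

static void __toy_set_priority(struct toy_fence *f, int prio,
			       unsigned int timeout)
{
	f->prio = prio;
	f->timeout = timeout;
}

/* Recurse once: an array fence updates its children, never itself. */
static void toy_set_priority(struct toy_fence *f, int prio,
			     unsigned int timeout)
{
	size_t i;

	if (f->children) {
		for (i = 0; i < f->num_children; i++)
			__toy_set_priority(f->children[i], prio, timeout);
	} else {
		__toy_set_priority(f, prio, timeout);
	}
}
```

Note that the array container itself is left untouched, mirroring the kernel hunk, where only actual requests carry schedulable priority.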

* Re: [PATCH 03/11] drm/i915: Include submission tasklet state in engine dump
  2018-03-26 11:50 ` [PATCH 03/11] drm/i915: Include submission tasklet state in engine dump Chris Wilson
@ 2018-03-27  8:37   ` Mika Kuoppala
  0 siblings, 0 replies; 40+ messages in thread
From: Mika Kuoppala @ 2018-03-27  8:37 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx

Chris Wilson <chris@chris-wilson.co.uk> writes:

> For the off-chance we have an interrupt posted and haven't processed the
> CSB.
>
> v2: Include tasklet enable/disable state for good measure.

Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>

>
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> ---
>  drivers/gpu/drm/i915/intel_engine_cs.c | 7 +++++--
>  1 file changed, 5 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/intel_engine_cs.c b/drivers/gpu/drm/i915/intel_engine_cs.c
> index de09fa42a509..12486d8f534b 100644
> --- a/drivers/gpu/drm/i915/intel_engine_cs.c
> +++ b/drivers/gpu/drm/i915/intel_engine_cs.c
> @@ -1859,12 +1859,15 @@ static void intel_engine_print_registers(const struct intel_engine_cs *engine,
>  		ptr = I915_READ(RING_CONTEXT_STATUS_PTR(engine));
>  		read = GEN8_CSB_READ_PTR(ptr);
>  		write = GEN8_CSB_WRITE_PTR(ptr);
> -		drm_printf(m, "\tExeclist CSB read %d [%d cached], write %d [%d from hws], interrupt posted? %s\n",
> +		drm_printf(m, "\tExeclist CSB read %d [%d cached], write %d [%d from hws], interrupt posted? %s, tasklet queued? %s (%s)\n",
>  			   read, execlists->csb_head,
>  			   write,
>  			   intel_read_status_page(engine, intel_hws_csb_write_index(engine->i915)),
>  			   yesno(test_bit(ENGINE_IRQ_EXECLIST,
> -					  &engine->irq_posted)));
> +					  &engine->irq_posted)),
> +			   yesno(test_bit(TASKLET_STATE_SCHED,
> +					  &engine->execlists.tasklet.state)),
> +			   enableddisabled(!atomic_read(&engine->execlists.tasklet.count)));
>  		if (read >= GEN8_CSB_ENTRIES)
>  			read = 0;
>  		if (write >= GEN8_CSB_ENTRIES)
> -- 
> 2.16.3
>

* Re: [PATCH 10/11] drm/i915: Use a preemption timeout to enforce interactivity
  2018-03-27  8:35   ` Tvrtko Ursulin
@ 2018-03-27  8:39     ` Chris Wilson
  2018-03-27  8:57       ` Tvrtko Ursulin
  0 siblings, 1 reply; 40+ messages in thread
From: Chris Wilson @ 2018-03-27  8:39 UTC (permalink / raw)
  To: Tvrtko Ursulin, intel-gfx

Quoting Tvrtko Ursulin (2018-03-27 09:35:17)
> 
> On 26/03/2018 12:50, Chris Wilson wrote:
> > Use a liberal timeout of 20ms to ensure that the rendering for an
> > interactive pageflip is started in a timely fashion, and that
> > user interaction is not blocked by GPU, or CPU, hogs. This is at the cost
> > of resetting whoever was blocking the preemption, likely leading to that
> > context/process being banned from submitting future requests.
> > 
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > ---
> >   drivers/gpu/drm/i915/i915_drv.h      |  3 ++-
> >   drivers/gpu/drm/i915/i915_gem.c      | 18 ++++++++++--------
> >   drivers/gpu/drm/i915/intel_display.c | 18 +++++++++++++++++-
> >   3 files changed, 29 insertions(+), 10 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> > index 800230ba1c3b..87388feb973d 100644
> > --- a/drivers/gpu/drm/i915/i915_drv.h
> > +++ b/drivers/gpu/drm/i915/i915_drv.h
> > @@ -3150,8 +3150,9 @@ int i915_gem_object_wait(struct drm_i915_gem_object *obj,
> >                        struct intel_rps_client *rps);
> >   int i915_gem_object_wait_priority(struct drm_i915_gem_object *obj,
> >                                 unsigned int flags,
> > -                               int priority);
> > +                               int priority, unsigned int timeout);
> >   #define I915_PRIORITY_DISPLAY I915_PRIORITY_MAX
> > +#define I915_PREEMPTION_TIMEOUT_DISPLAY (20 * 1000 * 1000) /* 20 ms */
> >   
> >   int __must_check
> >   i915_gem_object_set_to_wc_domain(struct drm_i915_gem_object *obj, bool write);
> > diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> > index 675d6bb59337..252c6b58e739 100644
> > --- a/drivers/gpu/drm/i915/i915_gem.c
> > +++ b/drivers/gpu/drm/i915/i915_gem.c
> > @@ -469,7 +469,8 @@ i915_gem_object_wait_reservation(struct reservation_object *resv,
> >       return timeout;
> >   }
> >   
> > -static void __fence_set_priority(struct dma_fence *fence, int prio)
> > +static void __fence_set_priority(struct dma_fence *fence,
> > +                              int prio, unsigned int timeout)
> >   {
> >       struct i915_request *rq;
> >       struct intel_engine_cs *engine;
> > @@ -482,11 +483,12 @@ static void __fence_set_priority(struct dma_fence *fence, int prio)
> >   
> >       rcu_read_lock();
> >       if (engine->schedule)
> > -             engine->schedule(rq, prio, 0);
> > +             engine->schedule(rq, prio, timeout);
> >       rcu_read_unlock();
> >   }
> >   
> > -static void fence_set_priority(struct dma_fence *fence, int prio)
> > +static void fence_set_priority(struct dma_fence *fence,
> > +                            int prio, unsigned int timeout)
> >   {
> >       /* Recurse once into a fence-array */
> >       if (dma_fence_is_array(fence)) {
> > @@ -494,16 +496,16 @@ static void fence_set_priority(struct dma_fence *fence, int prio)
> >               int i;
> >   
> >               for (i = 0; i < array->num_fences; i++)
> > -                     __fence_set_priority(array->fences[i], prio);
> > +                     __fence_set_priority(array->fences[i], prio, timeout);
> >       } else {
> > -             __fence_set_priority(fence, prio);
> > +             __fence_set_priority(fence, prio, timeout);
> >       }
> >   }
> >   
> >   int
> >   i915_gem_object_wait_priority(struct drm_i915_gem_object *obj,
> >                             unsigned int flags,
> > -                           int prio)
> > +                           int prio, unsigned int timeout)
> >   {
> >       struct dma_fence *excl;
> >   
> > @@ -518,7 +520,7 @@ i915_gem_object_wait_priority(struct drm_i915_gem_object *obj,
> >                       return ret;
> >   
> >               for (i = 0; i < count; i++) {
> > -                     fence_set_priority(shared[i], prio);
> > +                     fence_set_priority(shared[i], prio, timeout);
> >                       dma_fence_put(shared[i]);
> >               }
> >   
> > @@ -528,7 +530,7 @@ i915_gem_object_wait_priority(struct drm_i915_gem_object *obj,
> >       }
> >   
> >       if (excl) {
> > -             fence_set_priority(excl, prio);
> > +             fence_set_priority(excl, prio, timeout);
> >               dma_fence_put(excl);
> >       }
> >       return 0;
> > diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
> > index 4c30c7c04f9c..830f2d4e540f 100644
> > --- a/drivers/gpu/drm/i915/intel_display.c
> > +++ b/drivers/gpu/drm/i915/intel_display.c
> > @@ -12786,7 +12786,23 @@ intel_prepare_plane_fb(struct drm_plane *plane,
> >   
> >       ret = intel_plane_pin_fb(to_intel_plane_state(new_state));
> >   
> > -     i915_gem_object_wait_priority(obj, 0, I915_PRIORITY_DISPLAY);
> > +     /*
> > +      * Reschedule our dependencies, and ensure we run within a timeout.
> > +      *
> > +      * Note that if the timeout is exceeded, then whoever was running that
> > +      * prevented us from acquiring the GPU is declared rogue and reset. An
> > +      * unresponsive process will then be banned in order to preserve
> > +      * interactivity. Since this can be seen as a bit heavy-handed, we
> > +      * select a timeout for when the dropped frames start to become a
> > +      * noticeable nuisance for the user (20 ms, i.e. preemption was blocked
> > +      * for more than a frame). Note, this is only a timeout for a delay in
> > +      * preempting the current request in order to run our dependency chain;
> > +      * the dependency chain itself may take a long time to run to completion
> > +      * before we can present the framebuffer.
> > +      */
> > +     i915_gem_object_wait_priority(obj, 0,
> > +                                   I915_PRIORITY_DISPLAY,
> > +                                   I915_PREEMPTION_TIMEOUT_DISPLAY);
> 
> API signature changes to allow timeouts are fine, but I think this hunk 
> should be a separate patch at the end of the series. (Since it has the 
> potential to make things which used to work, albeit stutteringly (?), 
> start crashing.)

It is at the end as a separate patch. What am I missing?
(The only thing after it is exposing a param to userspace.)
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 08/11] drm/i915/execlists: Force preemption via reset on timeout
  2018-03-26 11:50 ` [PATCH 08/11] drm/i915/execlists: Force preemption via reset on timeout Chris Wilson
@ 2018-03-27  8:51   ` Tvrtko Ursulin
  2018-03-27  9:10     ` Chris Wilson
                       ` (2 more replies)
  2018-03-27 15:50   ` Jeff McGee
  1 sibling, 3 replies; 40+ messages in thread
From: Tvrtko Ursulin @ 2018-03-27  8:51 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx


On 26/03/2018 12:50, Chris Wilson wrote:
> Install a timer when trying to preempt on behalf of an important
> context such that if the active context does not honour the preemption
> request within the desired timeout, then we reset the GPU to allow the
> important context to run.

I suggest renaming the patch title to "Implement optional preemption delay 
timeout", or "upper bound", or something, as long as it is not "force 
preemption". :)

> (Open: should not the timer be started from receiving the high priority
> request...)

If by receiving you mean at execbuf time, then I think not - that would 
be something else and not a preempt timeout.

> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> ---
>   drivers/gpu/drm/i915/intel_lrc.c        | 53 +++++++++++++++++++++++++++++++++
>   drivers/gpu/drm/i915/intel_ringbuffer.h |  8 +++++
>   2 files changed, 61 insertions(+)
> 
> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
> index 50688fc889d9..6da816d23cb3 100644
> --- a/drivers/gpu/drm/i915/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/intel_lrc.c
> @@ -533,6 +533,47 @@ static void inject_preempt_context(struct intel_engine_cs *engine)
>   
>   	execlists_clear_active(execlists, EXECLISTS_ACTIVE_HWACK);
>   	execlists_set_active(execlists, EXECLISTS_ACTIVE_PREEMPT);
> +
> +	/* Set a timer to force preemption vs hostile userspace */
> +	if (execlists->queue_preempt_timeout) {
> +		GEM_TRACE("%s timeout=%uns\n",

preempt-timeout ?

> +			  engine->name, execlists->queue_preempt_timeout);
> +		hrtimer_start(&execlists->preempt_timer,
> +			      ktime_set(0, execlists->queue_preempt_timeout),
> +			      HRTIMER_MODE_REL);
> +	}
> +}
> +
> +static enum hrtimer_restart preempt_timeout(struct hrtimer *hrtimer)
> +{
> +	struct intel_engine_execlists *execlists =
> +		container_of(hrtimer, typeof(*execlists), preempt_timer);
> +
> +	GEM_TRACE("%s\n",
> +		  container_of(execlists,
> +			       struct intel_engine_cs,
> +			       execlists)->name);
> +
> +	queue_work(system_highpri_wq, &execlists->preempt_reset);

I suppose the indirection from hrtimer to worker is for better-than-jiffy 
timeout granularity? But then queue_work might introduce some delay to 
defeat that.

I am wondering if simply calling schedule_delayed_work directly wouldn't 
be good enough. I suppose it is a question for the product group. But it 
is also an implementation detail.

> +	return HRTIMER_NORESTART;
> +}
> +
> +static void preempt_reset(struct work_struct *work)
> +{
> +	struct intel_engine_cs *engine =
> +		container_of(work, typeof(*engine), execlists.preempt_reset);
> +
> +	GEM_TRACE("%s\n", engine->name);
> +
> +	tasklet_disable(&engine->execlists.tasklet);
> +
> +	engine->execlists.tasklet.func(engine->execlists.tasklet.data);

Comment on why calling the tasklet directly.

> +
> +	if (execlists_is_active(&engine->execlists, EXECLISTS_ACTIVE_PREEMPT))
> +		i915_handle_error(engine->i915, BIT(engine->id), 0,
> +				  "preemption timed out on %s", engine->name);

Can this race with the normal reset, so that we end up with 
i915_handle_error twice simultaneously?

> +
> +	tasklet_enable(&engine->execlists.tasklet);
>   }
>   
>   static void complete_preempt_context(struct intel_engine_execlists *execlists)
> @@ -542,6 +583,10 @@ static void complete_preempt_context(struct intel_engine_execlists *execlists)
>   	execlists_cancel_port_requests(execlists);
>   	execlists_unwind_incomplete_requests(execlists);
>   
> +	/* If the timer already fired, complete the reset */
> +	if (hrtimer_try_to_cancel(&execlists->preempt_timer) < 0)
> +		return;

What about a timer which has already fired and queued the worker? 
hrtimer_try_to_cancel will return zero for that case I think.

> +
>   	execlists_clear_active(execlists, EXECLISTS_ACTIVE_PREEMPT);
>   }
>   
> @@ -708,6 +753,7 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
>   			kmem_cache_free(engine->i915->priorities, p);
>   	}
>   done:
> +	execlists->queue_preempt_timeout = 0; /* preemption point passed */
>   	execlists->queue_priority = rb ? to_priolist(rb)->priority : INT_MIN;
>   	execlists->first = rb;
>   	if (submit)
> @@ -864,6 +910,7 @@ static void execlists_cancel_requests(struct intel_engine_cs *engine)
>   
>   	/* Remaining _unready_ requests will be nop'ed when submitted */
>   
> +	execlists->queue_preempt_timeout = 0;
>   	execlists->queue_priority = INT_MIN;
>   	execlists->queue = RB_ROOT;
>   	execlists->first = NULL;
> @@ -1080,6 +1127,7 @@ static void queue_request(struct intel_engine_cs *engine,
>   static void __submit_queue(struct intel_engine_cs *engine, int prio)
>   {
>   		engine->execlists.queue_priority = prio;
> +		engine->execlists.queue_preempt_timeout = 0;
>   		tasklet_hi_schedule(&engine->execlists.tasklet);
>   }
>   
> @@ -2270,6 +2318,11 @@ logical_ring_setup(struct intel_engine_cs *engine)
>   	tasklet_init(&engine->execlists.tasklet,
>   		     execlists_submission_tasklet, (unsigned long)engine);
>   
> +	INIT_WORK(&engine->execlists.preempt_reset, preempt_reset);
> +	hrtimer_init(&engine->execlists.preempt_timer,
> +		     CLOCK_MONOTONIC, HRTIMER_MODE_REL);
> +	engine->execlists.preempt_timer.function = preempt_timeout;
> +
>   	logical_ring_default_vfuncs(engine);
>   	logical_ring_default_irqs(engine);
>   }
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
> index 4c71dcdc722b..7166f47c8489 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.h
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
> @@ -284,6 +284,11 @@ struct intel_engine_execlists {
>   	 */
>   	int queue_priority;
>   
> +	/**
> +	 * @queue_preempt_timeout: Timeout in ns before forcing preemption.
> +	 */
> +	unsigned int queue_preempt_timeout;
> +
>   	/**
>   	 * @queue: queue of requests, in priority lists
>   	 */
> @@ -313,6 +318,9 @@ struct intel_engine_execlists {
>   	 * @preempt_complete_status: expected CSB upon completing preemption
>   	 */
>   	u32 preempt_complete_status;
> +
> +	struct hrtimer preempt_timer;
> +	struct work_struct preempt_reset;
>   };
>   
>   #define INTEL_ENGINE_CS_MAX_NAME 8
> 

Regards,

Tvrtko

* Re: [PATCH 10/11] drm/i915: Use a preemption timeout to enforce interactivity
  2018-03-27  8:39     ` Chris Wilson
@ 2018-03-27  8:57       ` Tvrtko Ursulin
  0 siblings, 0 replies; 40+ messages in thread
From: Tvrtko Ursulin @ 2018-03-27  8:57 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx


On 27/03/2018 09:39, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2018-03-27 09:35:17)
>>
>> On 26/03/2018 12:50, Chris Wilson wrote:
>>> Use a liberal timeout of 20ms to ensure that the rendering for an
>>> interactive pageflip is started in a timely fashion, and that
>>> user interaction is not blocked by GPU or CPU hogs. This is at the cost
>>> of resetting whoever was blocking the preemption, likely leading to that
>>> context/process being banned from submitting future requests.
>>>
>>> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
>>> ---
>>>    drivers/gpu/drm/i915/i915_drv.h      |  3 ++-
>>>    drivers/gpu/drm/i915/i915_gem.c      | 18 ++++++++++--------
>>>    drivers/gpu/drm/i915/intel_display.c | 18 +++++++++++++++++-
>>>    3 files changed, 29 insertions(+), 10 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
>>> index 800230ba1c3b..87388feb973d 100644
>>> --- a/drivers/gpu/drm/i915/i915_drv.h
>>> +++ b/drivers/gpu/drm/i915/i915_drv.h
>>> @@ -3150,8 +3150,9 @@ int i915_gem_object_wait(struct drm_i915_gem_object *obj,
>>>                         struct intel_rps_client *rps);
>>>    int i915_gem_object_wait_priority(struct drm_i915_gem_object *obj,
>>>                                  unsigned int flags,
>>> -                               int priority);
>>> +                               int priority, unsigned int timeout);
>>>    #define I915_PRIORITY_DISPLAY I915_PRIORITY_MAX
>>> +#define I915_PREEMPTION_TIMEOUT_DISPLAY (20 * 1000 * 1000) /* 20 ms */
>>>    
>>>    int __must_check
>>>    i915_gem_object_set_to_wc_domain(struct drm_i915_gem_object *obj, bool write);
>>> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
>>> index 675d6bb59337..252c6b58e739 100644
>>> --- a/drivers/gpu/drm/i915/i915_gem.c
>>> +++ b/drivers/gpu/drm/i915/i915_gem.c
>>> @@ -469,7 +469,8 @@ i915_gem_object_wait_reservation(struct reservation_object *resv,
>>>        return timeout;
>>>    }
>>>    
>>> -static void __fence_set_priority(struct dma_fence *fence, int prio)
>>> +static void __fence_set_priority(struct dma_fence *fence,
>>> +                              int prio, unsigned int timeout)
>>>    {
>>>        struct i915_request *rq;
>>>        struct intel_engine_cs *engine;
>>> @@ -482,11 +483,12 @@ static void __fence_set_priority(struct dma_fence *fence, int prio)
>>>    
>>>        rcu_read_lock();
>>>        if (engine->schedule)
>>> -             engine->schedule(rq, prio, 0);
>>> +             engine->schedule(rq, prio, timeout);
>>>        rcu_read_unlock();
>>>    }
>>>    
>>> -static void fence_set_priority(struct dma_fence *fence, int prio)
>>> +static void fence_set_priority(struct dma_fence *fence,
>>> +                            int prio, unsigned int timeout)
>>>    {
>>>        /* Recurse once into a fence-array */
>>>        if (dma_fence_is_array(fence)) {
>>> @@ -494,16 +496,16 @@ static void fence_set_priority(struct dma_fence *fence, int prio)
>>>                int i;
>>>    
>>>                for (i = 0; i < array->num_fences; i++)
>>> -                     __fence_set_priority(array->fences[i], prio);
>>> +                     __fence_set_priority(array->fences[i], prio, timeout);
>>>        } else {
>>> -             __fence_set_priority(fence, prio);
>>> +             __fence_set_priority(fence, prio, timeout);
>>>        }
>>>    }
>>>    
>>>    int
>>>    i915_gem_object_wait_priority(struct drm_i915_gem_object *obj,
>>>                              unsigned int flags,
>>> -                           int prio)
>>> +                           int prio, unsigned int timeout)
>>>    {
>>>        struct dma_fence *excl;
>>>    
>>> @@ -518,7 +520,7 @@ i915_gem_object_wait_priority(struct drm_i915_gem_object *obj,
>>>                        return ret;
>>>    
>>>                for (i = 0; i < count; i++) {
>>> -                     fence_set_priority(shared[i], prio);
>>> +                     fence_set_priority(shared[i], prio, timeout);
>>>                        dma_fence_put(shared[i]);
>>>                }
>>>    
>>> @@ -528,7 +530,7 @@ i915_gem_object_wait_priority(struct drm_i915_gem_object *obj,
>>>        }
>>>    
>>>        if (excl) {
>>> -             fence_set_priority(excl, prio);
>>> +             fence_set_priority(excl, prio, timeout);
>>>                dma_fence_put(excl);
>>>        }
>>>        return 0;
>>> diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
>>> index 4c30c7c04f9c..830f2d4e540f 100644
>>> --- a/drivers/gpu/drm/i915/intel_display.c
>>> +++ b/drivers/gpu/drm/i915/intel_display.c
>>> @@ -12786,7 +12786,23 @@ intel_prepare_plane_fb(struct drm_plane *plane,
>>>    
>>>        ret = intel_plane_pin_fb(to_intel_plane_state(new_state));
>>>    
>>> -     i915_gem_object_wait_priority(obj, 0, I915_PRIORITY_DISPLAY);
>>> +     /*
>>> +      * Reschedule our dependencies, and ensure we run within a timeout.
>>> +      *
>>> +      * Note that if the timeout is exceeded, then the context that was
>>> +      * running and prevented us from acquiring the GPU is declared rogue
>>> +      * and reset. An
>>> +      * unresponsive process will then be banned in order to preserve
>>> +      * interactivity. Since this can be seen as a bit heavy-handed, we
>>> +      * select a timeout for when the dropped frames start to become a
>>> +      * noticeable nuisance for the user (20 ms, i.e. preemption was blocked
>>> +      * for more than a frame). Note, this is only a timeout for a delay in
>>> +      * preempting the current request in order to run our dependency chain;
>>> +      * the dependency chain itself may take a long time to run to completion
>>> +      * before we can present the framebuffer.
>>> +      */
>>> +     i915_gem_object_wait_priority(obj, 0,
>>> +                                   I915_PRIORITY_DISPLAY,
>>> +                                   I915_PREEMPTION_TIMEOUT_DISPLAY);
>>
>> API signature changes to allow timeouts are fine, but I think this hunk
>> should be a separate patch at the end of the series. (Since it has the
>> potential to make things which used to work, albeit stutteringly (?),
>> start crashing.)
> 
> It is at the end as a separate patch. What am I missing?
> (The only thing after it is exposing a param to userspace.)

My bad, I assumed that last one depends on API signature changes from 
this one.

Regards,

Tvrtko




* Re: [PATCH 08/11] drm/i915/execlists: Force preemption via reset on timeout
  2018-03-27  8:51   ` Tvrtko Ursulin
@ 2018-03-27  9:10     ` Chris Wilson
  2018-03-27  9:23     ` Chris Wilson
  2018-03-27 15:40     ` Jeff McGee
  2 siblings, 0 replies; 40+ messages in thread
From: Chris Wilson @ 2018-03-27  9:10 UTC (permalink / raw)
  To: Tvrtko Ursulin, intel-gfx

Quoting Tvrtko Ursulin (2018-03-27 09:51:20)
> 
> On 26/03/2018 12:50, Chris Wilson wrote:
> > Install a timer when trying to preempt on behalf of an important
> > context such that if the active context does not honour the preemption
> > request within the desired timeout, then we reset the GPU to allow the
> > important context to run.
> 
> I suggest renaming patch title to "Implement optional preemption delay 
> timeout", or "upper bound", or something, as long as it is not "force 
> preemption". :)
> 
> > (Open: should not the timer be started from receiving the high priority
> > request...)
> 
> If you think receiving as in execbuf I think not - that would be 
> something else and not preempt timeout.

I'm thinking of submit_request -> [insert huge delay] -> ksoftirqd.

Actually I think it's better just to solve the problem of ksoftirqd adding
random delays, since it affects us all over the shop.

The tricky part is that I expect any attempt to improve on (i.e. avoid)
ksoftirqd for our use case is likely to end up as "use threaded-irq",
with which we always incur a scheduler delay as we then do not have our
irq_exit hook. I currently hack kernel/softirq.c to always run at least
one loop for HI_SOFTIRQ before deferring to ksoftirqd.
-Chris

* Re: [PATCH 08/11] drm/i915/execlists: Force preemption via reset on timeout
  2018-03-27  8:51   ` Tvrtko Ursulin
  2018-03-27  9:10     ` Chris Wilson
@ 2018-03-27  9:23     ` Chris Wilson
  2018-03-28  8:59       ` Chris Wilson
  2018-03-27 15:40     ` Jeff McGee
  2 siblings, 1 reply; 40+ messages in thread
From: Chris Wilson @ 2018-03-27  9:23 UTC (permalink / raw)
  To: Tvrtko Ursulin, intel-gfx

Quoting Tvrtko Ursulin (2018-03-27 09:51:20)
> 
> On 26/03/2018 12:50, Chris Wilson wrote:
> > +static enum hrtimer_restart preempt_timeout(struct hrtimer *hrtimer)
> > +{
> > +     struct intel_engine_execlists *execlists =
> > +             container_of(hrtimer, typeof(*execlists), preempt_timer);
> > +
> > +     GEM_TRACE("%s\n",
> > +               container_of(execlists,
> > +                            struct intel_engine_cs,
> > +                            execlists)->name);
> > +
> > +     queue_work(system_highpri_wq, &execlists->preempt_reset);
> 
> I suppose indirection from hrtimer to worker is for better than jiffie 
> timeout granularity? But then queue_work might introduce some delay to 
> defeat that.

Yes. It's betting on highpri_wq being just that. We can do better with
our own RT kthread and a wakeup from the hrtimer if required.

Hmm, the original plan (watchdog TDR) was to avoid using the global
reset entirely. The per-engine reset (although it needs serialising with
itself) doesn't need struct_mutex, so we should be able to do that from
the timer directly, and just kick off the global reset on failure.

That sounds a whole lot better, let's dust off that code and see what
breaks.

> I am wondering if simply schedule_delayed_work directly wouldn't be good 
> enough. I suppose it is a question for the product group. But it is also 
> implementation detail.
> 
> > +     return HRTIMER_NORESTART;
> > +}
> > +
> > +static void preempt_reset(struct work_struct *work)
> > +{
> > +     struct intel_engine_cs *engine =
> > +             container_of(work, typeof(*engine), execlists.preempt_reset);
> > +
> > +     GEM_TRACE("%s\n", engine->name);
> > +
> > +     tasklet_disable(&engine->execlists.tasklet);
> > +
> > +     engine->execlists.tasklet.func(engine->execlists.tasklet.data);
> 
> Comment on why calling the tasklet directly.

/* ksoftirqd hates me; I hate ksoftirqd. */
 
> > +
> > +     if (execlists_is_active(&engine->execlists, EXECLISTS_ACTIVE_PREEMPT))
> > +             i915_handle_error(engine->i915, BIT(engine->id), 0,
> > +                               "preemption timed out on %s", engine->name);
> 
> Can this race with the normal reset, so that we end up with 
> i915_handle_error twice simultaneously?

i915_handle_error() serialises itself against multiple calls to the same
set of engines and against a global reset.

There's a window for us to try and reset the same request twice (in
quick succession) though. That's not good.

Also I think we need some more hand holding here to provoke the right
reset. Hmm. We may need to break i915_handle_error() down quite a bit.

(Ok decided to try and not use i915_handle_error() here, at least not on
the immediate path.)

Hmm, going back to that discussion, we will have to dig through the
details on when exactly hrtimer comes from softirq, or else we may
deadlock with waiting on the tasklet. Hmm indeed.

> > +     tasklet_enable(&engine->execlists.tasklet);
> >   }
> >   
> >   static void complete_preempt_context(struct intel_engine_execlists *execlists)
> > @@ -542,6 +583,10 @@ static void complete_preempt_context(struct intel_engine_execlists *execlists)
> >       execlists_cancel_port_requests(execlists);
> >       execlists_unwind_incomplete_requests(execlists);
> >   
> > +     /* If the timer already fired, complete the reset */
> > +     if (hrtimer_try_to_cancel(&execlists->preempt_timer) < 0)
> > +             return;
> 
> What about a timer which has already fired and queued the worker? 
> hrtimer_try_to_cancel will return zero for that case I think.

That's -1. 0 is the timer already fired and executed. 1 is pending.

If the timer fired and executed its callback before we run, we
clear the pending flag and proceed as normal (likewise when we are called
from inside the worker after the hrtimer). The worker sees the cleared
flag in that case and skips the reset.
-Chris

* Re: [PATCH 02/11] drm/i915/execlists: Clear user-active flag on preemption completion
  2018-03-26 11:50 ` [PATCH 02/11] drm/i915/execlists: Clear user-active flag on preemption completion Chris Wilson
@ 2018-03-27 10:00   ` Chris Wilson
  2018-03-27 10:01     ` Chris Wilson
  0 siblings, 1 reply; 40+ messages in thread
From: Chris Wilson @ 2018-03-27 10:00 UTC (permalink / raw)
  To: intel-gfx

Quoting Chris Wilson (2018-03-26 12:50:35)
> When cancelling the requests and clearing out the ports following a
> successful preemption completion, also clear the active flag. I had
> assumed that all preemptions would be followed by an immediate dequeue
> (preserving the active user flag), but under rare circumstances we may
> be triggering a preemption for the second port only for it to have
> completed before the preemption kicks in, leaving execlists->active set
> even though the system is now idle.
> 
> We can clear the flag inside the common execlists_cancel_port_requests()
> as the other users also expect the semantics of active being cleared.
> 
> Fixes: f6322eddaff7 ("drm/i915/preemption: Allow preemption between submission ports")
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Michał Winiarski <michal.winiarski@intel.com>
> Cc: Michel Thierry <michel.thierry@intel.com>
> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

From the earlier posting,
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>

Mika, any chance you want to complete the hat check and review the first
patch as well? :)
-Chris

* Re: [PATCH 02/11] drm/i915/execlists: Clear user-active flag on preemption completion
  2018-03-27 10:00   ` Chris Wilson
@ 2018-03-27 10:01     ` Chris Wilson
  0 siblings, 0 replies; 40+ messages in thread
From: Chris Wilson @ 2018-03-27 10:01 UTC (permalink / raw)
  To: intel-gfx

Quoting Chris Wilson (2018-03-27 11:00:32)
> Quoting Chris Wilson (2018-03-26 12:50:35)
> > When cancelling the requests and clearing out the ports following a
> > successful preemption completion, also clear the active flag. I had
> > assumed that all preemptions would be followed by an immediate dequeue
> > (preserving the active user flag), but under rare circumstances we may
> > be triggering a preemption for the second port only for it to have
> > completed before the preemption kicks in, leaving execlists->active set
> > even though the system is now idle.
> > 
> > We can clear the flag inside the common execlists_cancel_port_requests()
> > as the other users also expect the semantics of active being cleared.
> > 
> > Fixes: f6322eddaff7 ("drm/i915/preemption: Allow preemption between submission ports")
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > Cc: Michał Winiarski <michal.winiarski@intel.com>
> > Cc: Michel Thierry <michel.thierry@intel.com>
> > Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
> > Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> 
> From the earlier posting,
> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
> 
> Mika, any chance you want to complete the hat check and review the first
> patch as well? :)

s/hat check/hat trick/
-Chris

* Re: [PATCH 01/11] drm/i915/execlists: Avoid kicking the submission too early for rescheduling
  2018-03-26 11:50 ` [PATCH 01/11] drm/i915/execlists: Avoid kicking the submission too early for rescheduling Chris Wilson
@ 2018-03-27 11:34   ` Mika Kuoppala
  2018-03-27 11:47     ` Chris Wilson
  2018-03-27 11:42   ` Mika Kuoppala
  1 sibling, 1 reply; 40+ messages in thread
From: Mika Kuoppala @ 2018-03-27 11:34 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx

Chris Wilson <chris@chris-wilson.co.uk> writes:

> If the request is still waiting on external fences, it has not yet been
> submitted to the HW queue and so we can forgo kicking the submission
> tasklet when re-evaluating its priority.
>
> This should have no impact other than reducing the number of tasklet
> wakeups under signal heavy workloads (e.g. switching between engines).
>
> v2: Use prebaked container_of()
>
> References: f6322eddaff7 ("drm/i915/preemption: Allow preemption between submission ports")
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Michał Winiarski <michal.winiarski@intel.com>
> Cc: Michel Thierry <michel.thierry@intel.com>
> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> ---
>  drivers/gpu/drm/i915/intel_lrc.c | 15 +++++++++++----
>  1 file changed, 11 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
> index b4ab06b05e58..104b69e0494f 100644
> --- a/drivers/gpu/drm/i915/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/intel_lrc.c
> @@ -1051,12 +1051,16 @@ static void queue_request(struct intel_engine_cs *engine,
>  	list_add_tail(&pt->link, &lookup_priolist(engine, pt, prio)->requests);
>  }
>  
> -static void submit_queue(struct intel_engine_cs *engine, int prio)
> +static void __submit_queue(struct intel_engine_cs *engine, int prio)
>  {
> -	if (prio > engine->execlists.queue_priority) {
>  		engine->execlists.queue_priority = prio;
>  		tasklet_hi_schedule(&engine->execlists.tasklet);
> -	}
> +}
> +
> +static void submit_queue(struct intel_engine_cs *engine, int prio)
> +{
> +	if (prio > engine->execlists.queue_priority)
> +		__submit_queue(engine, prio);

You did this...

>  }
>  
>  static void execlists_submit_request(struct i915_request *request)
> @@ -1189,7 +1193,10 @@ static void execlists_schedule(struct i915_request *request, int prio)
>  			__list_del_entry(&pt->link);
>  			queue_request(engine, pt, prio);
>  		}
> -		submit_queue(engine, prio);
> +
> +		if (prio > engine->execlists.queue_priority &&
> +		    i915_sw_fence_done(&pt_to_request(pt)->submit))
> +			__submit_queue(engine, prio);

..to have an explicit priority comparison at the submit callsite, I gather.
Or is there some other reason?

-Mika

>  	}
>  
>  	spin_unlock_irq(&engine->timeline->lock);
> -- 
> 2.16.3

* Re: [PATCH 01/11] drm/i915/execlists: Avoid kicking the submission too early for rescheduling
  2018-03-26 11:50 ` [PATCH 01/11] drm/i915/execlists: Avoid kicking the submission too early for rescheduling Chris Wilson
  2018-03-27 11:34   ` Mika Kuoppala
@ 2018-03-27 11:42   ` Mika Kuoppala
  1 sibling, 0 replies; 40+ messages in thread
From: Mika Kuoppala @ 2018-03-27 11:42 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx

Chris Wilson <chris@chris-wilson.co.uk> writes:

> If the request is still waiting on external fences, it has not yet been
> submitted to the HW queue and so we can forgo kicking the submission
> tasklet when re-evaluating its priority.
>
> This should have no impact other than reducing the number of tasklet
> wakeups under signal heavy workloads (e.g. switching between engines).
>
> v2: Use prebaked container_of()
>
> References: f6322eddaff7 ("drm/i915/preemption: Allow preemption between submission ports")
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Michał Winiarski <michal.winiarski@intel.com>
> Cc: Michel Thierry <michel.thierry@intel.com>
> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> ---
>  drivers/gpu/drm/i915/intel_lrc.c | 15 +++++++++++----
>  1 file changed, 11 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
> index b4ab06b05e58..104b69e0494f 100644
> --- a/drivers/gpu/drm/i915/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/intel_lrc.c
> @@ -1051,12 +1051,16 @@ static void queue_request(struct intel_engine_cs *engine,
>  	list_add_tail(&pt->link, &lookup_priolist(engine, pt, prio)->requests);
>  }
>  
> -static void submit_queue(struct intel_engine_cs *engine, int prio)
> +static void __submit_queue(struct intel_engine_cs *engine, int prio)
>  {
> -	if (prio > engine->execlists.queue_priority) {
>  		engine->execlists.queue_priority = prio;
>  		tasklet_hi_schedule(&engine->execlists.tasklet);

You need to fix the indentation here.
-Mika


> -	}
> +}
> +
> +static void submit_queue(struct intel_engine_cs *engine, int prio)
> +{
> +	if (prio > engine->execlists.queue_priority)
> +		__submit_queue(engine, prio);
>  }
>  
>  static void execlists_submit_request(struct i915_request *request)
> @@ -1189,7 +1193,10 @@ static void execlists_schedule(struct i915_request *request, int prio)
>  			__list_del_entry(&pt->link);
>  			queue_request(engine, pt, prio);
>  		}
> -		submit_queue(engine, prio);
> +
> +		if (prio > engine->execlists.queue_priority &&
> +		    i915_sw_fence_done(&pt_to_request(pt)->submit))
> +			__submit_queue(engine, prio);
>  	}
>  
>  	spin_unlock_irq(&engine->timeline->lock);
> -- 
> 2.16.3

* [PATCH v2] drm/i915/execlists: Flush pending preemption events during reset
  2018-03-26 11:50 ` [PATCH 07/11] drm/i915/execlists: Flush pending preemption events during reset Chris Wilson
@ 2018-03-27 11:44   ` Chris Wilson
  2018-03-27 15:33     ` Jeff McGee
  0 siblings, 1 reply; 40+ messages in thread
From: Chris Wilson @ 2018-03-27 11:44 UTC (permalink / raw)
  To: intel-gfx

Catch up with the inflight CSB events, after disabling the tasklet
before deciding which request was truly guilty of hanging the GPU.

v2: Restore checking of use_csb_mmio on every loop, don't forget old
vgpu.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Michał Winiarski <michal.winiarski@intel.com>
CC: Michel Thierry <michel.thierry@intel.com>
Cc: Jeff McGee <jeff.mcgee@intel.com>
---
 drivers/gpu/drm/i915/intel_lrc.c | 127 +++++++++++++++++++++++++++------------
 1 file changed, 87 insertions(+), 40 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index cf31b0749023..75dbdedde8b0 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -874,34 +874,14 @@ static void execlists_cancel_requests(struct intel_engine_cs *engine)
 	local_irq_restore(flags);
 }
 
-/*
- * Check the unread Context Status Buffers and manage the submission of new
- * contexts to the ELSP accordingly.
- */
-static void execlists_submission_tasklet(unsigned long data)
+static void process_csb(struct intel_engine_cs *engine)
 {
-	struct intel_engine_cs * const engine = (struct intel_engine_cs *)data;
 	struct intel_engine_execlists * const execlists = &engine->execlists;
 	struct execlist_port * const port = execlists->port;
-	struct drm_i915_private *dev_priv = engine->i915;
+	struct drm_i915_private *i915 = engine->i915;
 	bool fw = false;
 
-	/*
-	 * We can skip acquiring intel_runtime_pm_get() here as it was taken
-	 * on our behalf by the request (see i915_gem_mark_busy()) and it will
-	 * not be relinquished until the device is idle (see
-	 * i915_gem_idle_work_handler()). As a precaution, we make sure
-	 * that all ELSP are drained i.e. we have processed the CSB,
-	 * before allowing ourselves to idle and calling intel_runtime_pm_put().
-	 */
-	GEM_BUG_ON(!dev_priv->gt.awake);
-
-	/*
-	 * Prefer doing test_and_clear_bit() as a two stage operation to avoid
-	 * imposing the cost of a locked atomic transaction when submitting a
-	 * new request (outside of the context-switch interrupt).
-	 */
-	while (test_bit(ENGINE_IRQ_EXECLIST, &engine->irq_posted)) {
+	do {
 		/* The HWSP contains a (cacheable) mirror of the CSB */
 		const u32 *buf =
 			&engine->status_page.page_addr[I915_HWS_CSB_BUF0_INDEX];
@@ -909,28 +889,27 @@ static void execlists_submission_tasklet(unsigned long data)
 
 		if (unlikely(execlists->csb_use_mmio)) {
 			buf = (u32 * __force)
-				(dev_priv->regs + i915_mmio_reg_offset(RING_CONTEXT_STATUS_BUF_LO(engine, 0)));
-			execlists->csb_head = -1; /* force mmio read of CSB ptrs */
+				(i915->regs + i915_mmio_reg_offset(RING_CONTEXT_STATUS_BUF_LO(engine, 0)));
+			execlists->csb_head = -1; /* force mmio read of CSB */
 		}
 
 		/* Clear before reading to catch new interrupts */
 		clear_bit(ENGINE_IRQ_EXECLIST, &engine->irq_posted);
 		smp_mb__after_atomic();
 
-		if (unlikely(execlists->csb_head == -1)) { /* following a reset */
+		if (unlikely(execlists->csb_head == -1)) { /* after a reset */
 			if (!fw) {
-				intel_uncore_forcewake_get(dev_priv,
-							   execlists->fw_domains);
+				intel_uncore_forcewake_get(i915, execlists->fw_domains);
 				fw = true;
 			}
 
-			head = readl(dev_priv->regs + i915_mmio_reg_offset(RING_CONTEXT_STATUS_PTR(engine)));
+			head = readl(i915->regs + i915_mmio_reg_offset(RING_CONTEXT_STATUS_PTR(engine)));
 			tail = GEN8_CSB_WRITE_PTR(head);
 			head = GEN8_CSB_READ_PTR(head);
 			execlists->csb_head = head;
 		} else {
 			const int write_idx =
-				intel_hws_csb_write_index(dev_priv) -
+				intel_hws_csb_write_index(i915) -
 				I915_HWS_CSB_BUF0_INDEX;
 
 			head = execlists->csb_head;
@@ -938,8 +917,8 @@ static void execlists_submission_tasklet(unsigned long data)
 		}
 		GEM_TRACE("%s cs-irq head=%d [%d%s], tail=%d [%d%s]\n",
 			  engine->name,
-			  head, GEN8_CSB_READ_PTR(readl(dev_priv->regs + i915_mmio_reg_offset(RING_CONTEXT_STATUS_PTR(engine)))), fw ? "" : "?",
-			  tail, GEN8_CSB_WRITE_PTR(readl(dev_priv->regs + i915_mmio_reg_offset(RING_CONTEXT_STATUS_PTR(engine)))), fw ? "" : "?");
+			  head, GEN8_CSB_READ_PTR(readl(i915->regs + i915_mmio_reg_offset(RING_CONTEXT_STATUS_PTR(engine)))), fw ? "" : "?",
+			  tail, GEN8_CSB_WRITE_PTR(readl(i915->regs + i915_mmio_reg_offset(RING_CONTEXT_STATUS_PTR(engine)))), fw ? "" : "?");
 
 		while (head != tail) {
 			struct i915_request *rq;
@@ -949,7 +928,8 @@ static void execlists_submission_tasklet(unsigned long data)
 			if (++head == GEN8_CSB_ENTRIES)
 				head = 0;
 
-			/* We are flying near dragons again.
+			/*
+			 * We are flying near dragons again.
 			 *
 			 * We hold a reference to the request in execlist_port[]
 			 * but no more than that. We are operating in softirq
@@ -1040,15 +1020,48 @@ static void execlists_submission_tasklet(unsigned long data)
 		if (head != execlists->csb_head) {
 			execlists->csb_head = head;
 			writel(_MASKED_FIELD(GEN8_CSB_READ_PTR_MASK, head << 8),
-			       dev_priv->regs + i915_mmio_reg_offset(RING_CONTEXT_STATUS_PTR(engine)));
+			       i915->regs + i915_mmio_reg_offset(RING_CONTEXT_STATUS_PTR(engine)));
 		}
-	}
+	} while (test_bit(ENGINE_IRQ_EXECLIST, &engine->irq_posted));
 
-	if (!execlists_is_active(execlists, EXECLISTS_ACTIVE_PREEMPT))
-		execlists_dequeue(engine);
+	if (unlikely(fw))
+		intel_uncore_forcewake_put(i915, execlists->fw_domains);
+}
+
+/*
+ * Check the unread Context Status Buffers and manage the submission of new
+ * contexts to the ELSP accordingly.
+ */
+static void execlists_submission_tasklet(unsigned long data)
+{
+	struct intel_engine_cs * const engine = (struct intel_engine_cs *)data;
 
-	if (fw)
-		intel_uncore_forcewake_put(dev_priv, execlists->fw_domains);
+	GEM_TRACE("%s awake?=%d, active=%x, irq-posted?=%d\n",
+		  engine->name,
+		  engine->i915->gt.awake,
+		  engine->execlists.active,
+		  test_bit(ENGINE_IRQ_EXECLIST, &engine->irq_posted));
+
+	/*
+	 * We can skip acquiring intel_runtime_pm_get() here as it was taken
+	 * on our behalf by the request (see i915_gem_mark_busy()) and it will
+	 * not be relinquished until the device is idle (see
+	 * i915_gem_idle_work_handler()). As a precaution, we make sure
+	 * that all ELSP are drained i.e. we have processed the CSB,
+	 * before allowing ourselves to idle and calling intel_runtime_pm_put().
+	 */
+	GEM_BUG_ON(!engine->i915->gt.awake);
+
+	/*
+	 * Prefer doing test_and_clear_bit() as a two stage operation to avoid
+	 * imposing the cost of a locked atomic transaction when submitting a
+	 * new request (outside of the context-switch interrupt).
+	 */
+	if (test_bit(ENGINE_IRQ_EXECLIST, &engine->irq_posted))
+		process_csb(engine);
+
+	if (!execlists_is_active(&engine->execlists, EXECLISTS_ACTIVE_PREEMPT))
+		execlists_dequeue(engine);
 
 	/* If the engine is now idle, so should be the flag; and vice versa. */
 	GEM_BUG_ON(execlists_is_active(&engine->execlists,
@@ -1712,6 +1725,7 @@ static struct i915_request *
 execlists_reset_prepare(struct intel_engine_cs *engine)
 {
 	struct intel_engine_execlists * const execlists = &engine->execlists;
+	struct i915_request *request, *active;
 
 	GEM_TRACE("%s\n", engine->name);
 
@@ -1735,7 +1749,40 @@ execlists_reset_prepare(struct intel_engine_cs *engine)
 		tasklet_kill(&execlists->tasklet);
 	tasklet_disable(&execlists->tasklet);
 
-	return i915_gem_find_active_request(engine);
+	/*
+	 * We want to flush the pending context switches; having disabled
+	 * the tasklet above, we can assume exclusive access to the execlists.
+	 * This allows us to catch up with an inflight preemption event,
+	 * and avoid blaming an innocent request if the stall was due to the
+	 * preemption itself.
+	 */
+	if (test_bit(ENGINE_IRQ_EXECLIST, &engine->irq_posted))
+		process_csb(engine);
+
+	/*
+	 * The last active request can then be no later than the last request
+	 * now in ELSP[0]. So search backwards from there, so that if the GPU
+	 * has advanced beyond the last CSB update, it will be pardoned.
+	 */
+	active = NULL;
+	request = port_request(execlists->port);
+	if (request) {
+		unsigned long flags;
+
+		spin_lock_irqsave(&engine->timeline->lock, flags);
+		list_for_each_entry_from_reverse(request,
+						 &engine->timeline->requests,
+						 link) {
+			if (__i915_request_completed(request,
+						     request->global_seqno))
+				break;
+
+			active = request;
+		}
+		spin_unlock_irqrestore(&engine->timeline->lock, flags);
+	}
+
+	return active;
 }
 
 static void execlists_reset(struct intel_engine_cs *engine,
-- 
2.16.3
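The execlists_reset_prepare() hunk above pardons any request the GPU managed to complete after the last CSB update, by searching backwards from the request in ELSP[0]. A minimal standalone sketch of that backward walk, using a toy completed-flags array in place of the engine timeline (the name find_guilty and the array layout are illustrative only, not the driver's):

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Walk backwards from the request currently in ELSP[0] (index elsp).
 * Requests the GPU already completed are pardoned; the oldest request
 * that has NOT completed is the one to blame for the hang.
 * Returns its index, or -1 if everything up to ELSP[0] completed,
 * i.e. the GPU advanced beyond the last CSB update.
 */
static int find_guilty(const bool *completed, int elsp)
{
	int active = -1;
	int i;

	for (i = elsp; i >= 0; i--) {
		if (completed[i])
			break;	/* in-order HW: everything older completed too */
		active = i;
	}
	return active;
}
```

The early break relies on the same property the driver does: execution on a single engine retires in submission order, so the first completed request seen walking backwards implies all older ones completed as well.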


* Re: [PATCH 01/11] drm/i915/execlists: Avoid kicking the submission too early for rescheduling
  2018-03-27 11:34   ` Mika Kuoppala
@ 2018-03-27 11:47     ` Chris Wilson
  2018-03-27 12:18       ` Mika Kuoppala
  0 siblings, 1 reply; 40+ messages in thread
From: Chris Wilson @ 2018-03-27 11:47 UTC (permalink / raw)
  To: Mika Kuoppala, intel-gfx

Quoting Mika Kuoppala (2018-03-27 12:34:31)
> Chris Wilson <chris@chris-wilson.co.uk> writes:
> 
> > If the request is still waiting on external fences, it has not yet been
> > submitted to the HW queue and so we can forgo kicking the submission
> > tasklet when re-evaluating its priority.
> >
> > This should have no impact other than reducing the number of tasklet
> > wakeups under signal heavy workloads (e.g. switching between engines).
> >
> > v2: Use prebaked container_of()
> >
> > References: f6322eddaff7 ("drm/i915/preemption: Allow preemption between submission ports")
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > Cc: Michał Winiarski <michal.winiarski@intel.com>
> > Cc: Michel Thierry <michel.thierry@intel.com>
> > Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
> > Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> > ---
> >  drivers/gpu/drm/i915/intel_lrc.c | 15 +++++++++++----
> >  1 file changed, 11 insertions(+), 4 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
> > index b4ab06b05e58..104b69e0494f 100644
> > --- a/drivers/gpu/drm/i915/intel_lrc.c
> > +++ b/drivers/gpu/drm/i915/intel_lrc.c
> > @@ -1051,12 +1051,16 @@ static void queue_request(struct intel_engine_cs *engine,
> >       list_add_tail(&pt->link, &lookup_priolist(engine, pt, prio)->requests);
> >  }
> >  
> > -static void submit_queue(struct intel_engine_cs *engine, int prio)
> > +static void __submit_queue(struct intel_engine_cs *engine, int prio)
> >  {
> > -     if (prio > engine->execlists.queue_priority) {
> >               engine->execlists.queue_priority = prio;
> >               tasklet_hi_schedule(&engine->execlists.tasklet);
> > -     }
> > +}
> > +
> > +static void submit_queue(struct intel_engine_cs *engine, int prio)
> > +{
> > +     if (prio > engine->execlists.queue_priority)
> > +             __submit_queue(engine, prio);
> 
> You did this...
> 
> >  }
> >  
> >  static void execlists_submit_request(struct i915_request *request)
> > @@ -1189,7 +1193,10 @@ static void execlists_schedule(struct i915_request *request, int prio)
> >                       __list_del_entry(&pt->link);
> >                       queue_request(engine, pt, prio);
> >               }
> > -             submit_queue(engine, prio);
> > +
> > +             if (prio > engine->execlists.queue_priority &&
> > +                 i915_sw_fence_done(&pt_to_request(pt)->submit))
> > +                     __submit_queue(engine, prio);
> 
> ..to have explicit priority comparison on submit callsite I gather.
> Or is there some other reason?

No, just because I wanted both checks in this case. On the other path
i915_sw_fence_done() isn't technically true as we are in the process of
performing the i915_sw_fence callback, so just i915_sw_fence_signaled().
But we don't want to use i915_sw_fence_signaled() here as I don't want
to think about the race. :)

So since prio > engine.queue_priority should be cheaper than loading the
cacheline for the request->submit.flags, I wanted that tested first as
it will only fire, at most, once per engine.
-Chris
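The check-ordering argument above can be seen in a standalone sketch. The types and names below are toy stand-ins, not the driver's structures; the point is only that &&'s short-circuit keeps the request's cacheline (modelled by submit_done) untouched unless the cheap priority comparison passes:

```c
#include <assert.h>
#include <stdbool.h>

struct toy_execlists { int queue_priority; };
struct toy_request { bool submit_done; }; /* i915_sw_fence_done() stand-in */

static int tasklet_kicks; /* counts how often we would schedule the tasklet */

static void __submit_queue(struct toy_execlists *el, int prio)
{
	el->queue_priority = prio;
	tasklet_kicks++;
}

/*
 * Cheap test first: the priority comparison uses data we already hold,
 * so the request's state is only loaded when the priority actually
 * rises, which fires at most once per engine per reschedule.
 */
static void reschedule(struct toy_execlists *el, struct toy_request *rq,
		       int prio)
{
	if (prio > el->queue_priority && rq->submit_done)
		__submit_queue(el, prio);
}
```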

* Re: [PATCH 01/11] drm/i915/execlists: Avoid kicking the submission too early for rescheduling
  2018-03-27 11:47     ` Chris Wilson
@ 2018-03-27 12:18       ` Mika Kuoppala
  2018-03-27 13:34         ` Chris Wilson
  0 siblings, 1 reply; 40+ messages in thread
From: Mika Kuoppala @ 2018-03-27 12:18 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx

Chris Wilson <chris@chris-wilson.co.uk> writes:

> Quoting Mika Kuoppala (2018-03-27 12:34:31)
>> Chris Wilson <chris@chris-wilson.co.uk> writes:
>> 
>> > If the request is still waiting on external fences, it has not yet been
>> > submitted to the HW queue and so we can forgo kicking the submission
>> > tasklet when re-evaluating its priority.
>> >
>> > This should have no impact other than reducing the number of tasklet
>> > wakeups under signal heavy workloads (e.g. switching between engines).
>> >
>> > v2: Use prebaked container_of()
>> >
>> > References: f6322eddaff7 ("drm/i915/preemption: Allow preemption between submission ports")
>> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
>> > Cc: Michał Winiarski <michal.winiarski@intel.com>
>> > Cc: Michel Thierry <michel.thierry@intel.com>
>> > Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
>> > Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>> > ---
>> >  drivers/gpu/drm/i915/intel_lrc.c | 15 +++++++++++----
>> >  1 file changed, 11 insertions(+), 4 deletions(-)
>> >
>> > diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
>> > index b4ab06b05e58..104b69e0494f 100644
>> > --- a/drivers/gpu/drm/i915/intel_lrc.c
>> > +++ b/drivers/gpu/drm/i915/intel_lrc.c
>> > @@ -1051,12 +1051,16 @@ static void queue_request(struct intel_engine_cs *engine,
>> >       list_add_tail(&pt->link, &lookup_priolist(engine, pt, prio)->requests);
>> >  }
>> >  
>> > -static void submit_queue(struct intel_engine_cs *engine, int prio)
>> > +static void __submit_queue(struct intel_engine_cs *engine, int prio)
>> >  {
>> > -     if (prio > engine->execlists.queue_priority) {
>> >               engine->execlists.queue_priority = prio;
>> >               tasklet_hi_schedule(&engine->execlists.tasklet);
>> > -     }
>> > +}
>> > +
>> > +static void submit_queue(struct intel_engine_cs *engine, int prio)
>> > +{
>> > +     if (prio > engine->execlists.queue_priority)
>> > +             __submit_queue(engine, prio);
>> 
>> You did this...
>> 
>> >  }
>> >  
>> >  static void execlists_submit_request(struct i915_request *request)
>> > @@ -1189,7 +1193,10 @@ static void execlists_schedule(struct i915_request *request, int prio)
>> >                       __list_del_entry(&pt->link);
>> >                       queue_request(engine, pt, prio);
>> >               }
>> > -             submit_queue(engine, prio);
>> > +
>> > +             if (prio > engine->execlists.queue_priority &&
>> > +                 i915_sw_fence_done(&pt_to_request(pt)->submit))
>> > +                     __submit_queue(engine, prio);
>> 
>> ..to have explicit priority comparison on submit callsite I gather.
>> Or is there some other reason?
>
> No, just because I wanted both checks in this case. On the other path
> i915_sw_fence_done() isn't technically true as we are in process of
> performing the i915_sw_fence callback, so just i915_sw_fence_signaled().
> But we don't want to use i915_sw_fence_signaled() here as I don't want
> to think about the race. :)
>
> So since prio > engine.queue_priority should be cheaper than loading the
> cacheline for the request->submit.flags, I wanted that tested first as
> it will only fire, at most, once per engine.

Ok, didn't see the perf angle of it but makes sense.

Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>

* ✗ Fi.CI.CHECKPATCH: warning for series starting with [01/11] drm/i915/execlists: Avoid kicking the submission too early for rescheduling (rev2)
  2018-03-26 11:50 Forced preemption gedankenexperiment Chris Wilson
                   ` (13 preceding siblings ...)
  2018-03-26 14:56 ` ✗ Fi.CI.IGT: failure " Patchwork
@ 2018-03-27 12:28 ` Patchwork
  2018-03-27 12:44 ` ✓ Fi.CI.BAT: success " Patchwork
  2018-03-27 15:30 ` ✗ Fi.CI.IGT: failure " Patchwork
  16 siblings, 0 replies; 40+ messages in thread
From: Patchwork @ 2018-03-27 12:28 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

== Series Details ==

Series: series starting with [01/11] drm/i915/execlists: Avoid kicking the submission too early for rescheduling (rev2)
URL   : https://patchwork.freedesktop.org/series/40665/
State : warning

== Summary ==

$ dim checkpatch origin/drm-tip
faa8da6b86be drm/i915/execlists: Avoid kicking the submission too early for rescheduling
-:19: WARNING:COMMIT_LOG_LONG_LINE: Possible unwrapped commit description (prefer a maximum 75 chars per line)
#19: 
References: f6322eddaff7 ("drm/i915/preemption: Allow preemption between submission ports")

-:19: ERROR:GIT_COMMIT_ID: Please use git commit description style 'commit <12+ chars of sha1> ("<title line>")' - ie: 'commit f6322eddaff7 ("drm/i915/preemption: Allow preemption between submission ports")'
#19: 
References: f6322eddaff7 ("drm/i915/preemption: Allow preemption between submission ports")

total: 1 errors, 1 warnings, 0 checks, 30 lines checked
4a804e342975 drm/i915/execlists: Refactor out complete_preempt_context()
e422869adae2 drm/i915: Move engine reset prepare/finish to backends
a891524a59ea drm/i915: Split execlists/guc reset prepartions
803a762fcd5a drm/i915/execlists: Flush pending preemption events during reset
-:69: WARNING:LONG_LINE: line over 100 characters
#69: FILE: drivers/gpu/drm/i915/intel_lrc.c:892:
+				(i915->regs + i915_mmio_reg_offset(RING_CONTEXT_STATUS_BUF_LO(engine, 0)));

-:87: WARNING:LONG_LINE: line over 100 characters
#87: FILE: drivers/gpu/drm/i915/intel_lrc.c:906:
+			head = readl(i915->regs + i915_mmio_reg_offset(RING_CONTEXT_STATUS_PTR(engine)));

-:104: WARNING:LONG_LINE: line over 100 characters
#104: FILE: drivers/gpu/drm/i915/intel_lrc.c:920:
+			  head, GEN8_CSB_READ_PTR(readl(i915->regs + i915_mmio_reg_offset(RING_CONTEXT_STATUS_PTR(engine)))), fw ? "" : "?",

-:105: WARNING:LONG_LINE: line over 100 characters
#105: FILE: drivers/gpu/drm/i915/intel_lrc.c:921:
+			  tail, GEN8_CSB_WRITE_PTR(readl(i915->regs + i915_mmio_reg_offset(RING_CONTEXT_STATUS_PTR(engine)))), fw ? "" : "?");

total: 0 errors, 4 warnings, 0 checks, 192 lines checked
ca960a24beab drm/i915/execlists: Force preemption via reset on timeout
1521bc101f62 drm/i915/preemption: Select timeout when scheduling
e411e7437e76 drm/i915: Use a preemption timeout to enforce interactivity
3d4caa26d478 drm/i915: Allow user control over preempt timeout on the important context


* ✓ Fi.CI.BAT: success for series starting with [01/11] drm/i915/execlists: Avoid kicking the submission too early for rescheduling (rev2)
  2018-03-26 11:50 Forced preemption gedankenexperiment Chris Wilson
                   ` (14 preceding siblings ...)
  2018-03-27 12:28 ` ✗ Fi.CI.CHECKPATCH: warning for series starting with [01/11] drm/i915/execlists: Avoid kicking the submission too early for rescheduling (rev2) Patchwork
@ 2018-03-27 12:44 ` Patchwork
  2018-03-27 15:30 ` ✗ Fi.CI.IGT: failure " Patchwork
  16 siblings, 0 replies; 40+ messages in thread
From: Patchwork @ 2018-03-27 12:44 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

== Series Details ==

Series: series starting with [01/11] drm/i915/execlists: Avoid kicking the submission too early for rescheduling (rev2)
URL   : https://patchwork.freedesktop.org/series/40665/
State : success

== Summary ==

Series 40665v2 series starting with [01/11] drm/i915/execlists: Avoid kicking the submission too early for rescheduling
https://patchwork.freedesktop.org/api/1.0/series/40665/revisions/2/mbox/

---- Known issues:

Test gem_mmap_gtt:
        Subgroup basic-small-bo-tiledx:
                fail       -> PASS       (fi-gdg-551) fdo#102575
Test kms_flip:
        Subgroup basic-flip-vs-wf_vblank:
                fail       -> PASS       (fi-skl-6770hq) fdo#100368

fdo#102575 https://bugs.freedesktop.org/show_bug.cgi?id=102575
fdo#100368 https://bugs.freedesktop.org/show_bug.cgi?id=100368

fi-bdw-5557u     total:285  pass:264  dwarn:0   dfail:0   fail:0   skip:21  time:434s
fi-bdw-gvtdvm    total:285  pass:261  dwarn:0   dfail:0   fail:0   skip:24  time:441s
fi-blb-e6850     total:285  pass:220  dwarn:1   dfail:0   fail:0   skip:64  time:381s
fi-bsw-n3050     total:285  pass:239  dwarn:0   dfail:0   fail:0   skip:46  time:542s
fi-bwr-2160      total:285  pass:180  dwarn:0   dfail:0   fail:0   skip:105 time:298s
fi-bxt-j4205     total:285  pass:256  dwarn:0   dfail:0   fail:0   skip:29  time:512s
fi-byt-j1900     total:285  pass:250  dwarn:0   dfail:0   fail:0   skip:35  time:525s
fi-byt-n2820     total:285  pass:246  dwarn:0   dfail:0   fail:0   skip:39  time:512s
fi-cfl-8700k     total:285  pass:257  dwarn:0   dfail:0   fail:0   skip:28  time:412s
fi-cfl-s3        total:285  pass:259  dwarn:0   dfail:0   fail:0   skip:26  time:571s
fi-cfl-u         total:285  pass:259  dwarn:0   dfail:0   fail:0   skip:26  time:511s
fi-cnl-y3        total:285  pass:259  dwarn:0   dfail:0   fail:0   skip:26  time:579s
fi-elk-e7500     total:285  pass:225  dwarn:1   dfail:0   fail:0   skip:59  time:424s
fi-gdg-551       total:285  pass:177  dwarn:0   dfail:0   fail:0   skip:108 time:323s
fi-glk-1         total:285  pass:257  dwarn:0   dfail:0   fail:0   skip:28  time:535s
fi-hsw-4770      total:285  pass:258  dwarn:0   dfail:0   fail:0   skip:27  time:410s
fi-ilk-650       total:285  pass:225  dwarn:0   dfail:0   fail:0   skip:60  time:429s
fi-ivb-3520m     total:285  pass:256  dwarn:0   dfail:0   fail:0   skip:29  time:477s
fi-ivb-3770      total:285  pass:252  dwarn:0   dfail:0   fail:0   skip:33  time:437s
fi-kbl-7500u     total:285  pass:260  dwarn:1   dfail:0   fail:0   skip:24  time:476s
fi-kbl-7567u     total:285  pass:265  dwarn:0   dfail:0   fail:0   skip:20  time:463s
fi-kbl-r         total:285  pass:258  dwarn:0   dfail:0   fail:0   skip:27  time:516s
fi-pnv-d510      total:285  pass:219  dwarn:1   dfail:0   fail:0   skip:65  time:670s
fi-skl-6260u     total:285  pass:265  dwarn:0   dfail:0   fail:0   skip:20  time:446s
fi-skl-6600u     total:285  pass:258  dwarn:0   dfail:0   fail:0   skip:27  time:532s
fi-skl-6700k2    total:285  pass:261  dwarn:0   dfail:0   fail:0   skip:24  time:503s
fi-skl-6770hq    total:285  pass:265  dwarn:0   dfail:0   fail:0   skip:20  time:499s
fi-skl-guc       total:285  pass:257  dwarn:0   dfail:0   fail:0   skip:28  time:432s
fi-skl-gvtdvm    total:285  pass:262  dwarn:0   dfail:0   fail:0   skip:23  time:445s
fi-snb-2520m     total:285  pass:245  dwarn:0   dfail:0   fail:0   skip:40  time:594s
fi-snb-2600      total:285  pass:245  dwarn:0   dfail:0   fail:0   skip:40  time:404s
Blacklisted hosts:
fi-cnl-psr       total:285  pass:256  dwarn:3   dfail:0   fail:0   skip:26  time:534s
fi-glk-j4005     total:285  pass:256  dwarn:0   dfail:0   fail:0   skip:29  time:485s

fc022080c6c72a9b96fe1b730066c182bed98ac5 drm-tip: 2018y-03m-27d-11h-15m-40s UTC integration manifest
3d4caa26d478 drm/i915: Allow user control over preempt timeout on the important context
e411e7437e76 drm/i915: Use a preemption timeout to enforce interactivity
1521bc101f62 drm/i915/preemption: Select timeout when scheduling
ca960a24beab drm/i915/execlists: Force preemption via reset on timeout
803a762fcd5a drm/i915/execlists: Flush pending preemption events during reset
a891524a59ea drm/i915: Split execlists/guc reset prepartions
e422869adae2 drm/i915: Move engine reset prepare/finish to backends
4a804e342975 drm/i915/execlists: Refactor out complete_preempt_context()
faa8da6b86be drm/i915/execlists: Avoid kicking the submission too early for rescheduling

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_8503/issues.html

* Re: [PATCH 01/11] drm/i915/execlists: Avoid kicking the submission too early for rescheduling
  2018-03-27 12:18       ` Mika Kuoppala
@ 2018-03-27 13:34         ` Chris Wilson
  0 siblings, 0 replies; 40+ messages in thread
From: Chris Wilson @ 2018-03-27 13:34 UTC (permalink / raw)
  To: Mika Kuoppala, intel-gfx

Quoting Mika Kuoppala (2018-03-27 13:18:06)
> Chris Wilson <chris@chris-wilson.co.uk> writes:
> 
> > Quoting Mika Kuoppala (2018-03-27 12:34:31)
> >> Chris Wilson <chris@chris-wilson.co.uk> writes:
> >> 
> >> > If the request is still waiting on external fences, it has not yet been
> >> > submitted to the HW queue and so we can forgo kicking the submission
> >> > tasklet when re-evaluating its priority.
> >> >
> >> > This should have no impact other than reducing the number of tasklet
> >> > wakeups under signal heavy workloads (e.g. switching between engines).
> >> >
> >> > v2: Use prebaked container_of()
> >> >
> >> > References: f6322eddaff7 ("drm/i915/preemption: Allow preemption between submission ports")
> >> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> >> > Cc: Michał Winiarski <michal.winiarski@intel.com>
> >> > Cc: Michel Thierry <michel.thierry@intel.com>
> >> > Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
> >> > Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> >> > ---
> >> >  drivers/gpu/drm/i915/intel_lrc.c | 15 +++++++++++----
> >> >  1 file changed, 11 insertions(+), 4 deletions(-)
> >> >
> >> > diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
> >> > index b4ab06b05e58..104b69e0494f 100644
> >> > --- a/drivers/gpu/drm/i915/intel_lrc.c
> >> > +++ b/drivers/gpu/drm/i915/intel_lrc.c
> >> > @@ -1051,12 +1051,16 @@ static void queue_request(struct intel_engine_cs *engine,
> >> >       list_add_tail(&pt->link, &lookup_priolist(engine, pt, prio)->requests);
> >> >  }
> >> >  
> >> > -static void submit_queue(struct intel_engine_cs *engine, int prio)
> >> > +static void __submit_queue(struct intel_engine_cs *engine, int prio)
> >> >  {
> >> > -     if (prio > engine->execlists.queue_priority) {
> >> >               engine->execlists.queue_priority = prio;
> >> >               tasklet_hi_schedule(&engine->execlists.tasklet);
> >> > -     }
> >> > +}
> >> > +
> >> > +static void submit_queue(struct intel_engine_cs *engine, int prio)
> >> > +{
> >> > +     if (prio > engine->execlists.queue_priority)
> >> > +             __submit_queue(engine, prio);
> >> 
> >> You did this...
> >> 
> >> >  }
> >> >  
> >> >  static void execlists_submit_request(struct i915_request *request)
> >> > @@ -1189,7 +1193,10 @@ static void execlists_schedule(struct i915_request *request, int prio)
> >> >                       __list_del_entry(&pt->link);
> >> >                       queue_request(engine, pt, prio);
> >> >               }
> >> > -             submit_queue(engine, prio);
> >> > +
> >> > +             if (prio > engine->execlists.queue_priority &&
> >> > +                 i915_sw_fence_done(&pt_to_request(pt)->submit))
> >> > +                     __submit_queue(engine, prio);
> >> 
> >> ..to have explicit priority comparison on submit callsite I gather.
> >> Or is there some other reason?
> >
> > No, just because I wanted both checks in this case. On the other path
> > i915_sw_fence_done() isn't technically true as we are in process of
> > performing the i915_sw_fence callback, so just i915_sw_fence_signaled().
> > But we don't want to use i915_sw_fence_signaled() here as I don't want
> > to think about the race. :)
> >
> > So since prio > engine.queue_priority should be cheaper than loading the
> > cacheline for the request->submit.flags, I wanted that tested first as
> > it will only fire, at most, once per engine.
> 
> Ok, didn't see the perf angle of it but makes sense.
> 
> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>

Fixed up the double indent and pushed. Thanks for the review, and
digging into i915_sw_fence :)
-Chris

* ✗ Fi.CI.IGT: failure for series starting with [01/11] drm/i915/execlists: Avoid kicking the submission too early for rescheduling (rev2)
  2018-03-26 11:50 Forced preemption gedankenexperiment Chris Wilson
                   ` (15 preceding siblings ...)
  2018-03-27 12:44 ` ✓ Fi.CI.BAT: success " Patchwork
@ 2018-03-27 15:30 ` Patchwork
  16 siblings, 0 replies; 40+ messages in thread
From: Patchwork @ 2018-03-27 15:30 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

== Series Details ==

Series: series starting with [01/11] drm/i915/execlists: Avoid kicking the submission too early for rescheduling (rev2)
URL   : https://patchwork.freedesktop.org/series/40665/
State : failure

== Summary ==

---- Possible new issues:

Test gem_ctx_param:
        Subgroup invalid-param-get:
                pass       -> FAIL       (shard-apl)
                pass       -> FAIL       (shard-hsw)
                pass       -> FAIL       (shard-snb)
        Subgroup invalid-param-set:
                pass       -> FAIL       (shard-apl)
                pass       -> FAIL       (shard-hsw)
                pass       -> FAIL       (shard-snb)
Test kms_cursor_legacy:
        Subgroup flip-vs-cursor-crc-atomic:
                fail       -> PASS       (shard-hsw)
        Subgroup short-flip-before-cursor-atomic-transitions:
                skip       -> PASS       (shard-snb)

---- Known issues:

Test kms_flip:
        Subgroup 2x-plain-flip-fb-recreate:
                fail       -> PASS       (shard-hsw) fdo#100368
        Subgroup flip-vs-expired-vblank:
                fail       -> PASS       (shard-hsw) fdo#102887 +1
        Subgroup modeset-vs-vblank-race:
                fail       -> PASS       (shard-hsw) fdo#103060 +2
Test kms_sysfs_edid_timing:
                warn       -> PASS       (shard-apl) fdo#100047
Test perf:
        Subgroup polling:
                pass       -> FAIL       (shard-hsw) fdo#102252

fdo#100368 https://bugs.freedesktop.org/show_bug.cgi?id=100368
fdo#102887 https://bugs.freedesktop.org/show_bug.cgi?id=102887
fdo#103060 https://bugs.freedesktop.org/show_bug.cgi?id=103060
fdo#100047 https://bugs.freedesktop.org/show_bug.cgi?id=100047
fdo#102252 https://bugs.freedesktop.org/show_bug.cgi?id=102252

shard-apl        total:3495 pass:1830 dwarn:1   dfail:0   fail:9   skip:1655 time:12892s
shard-hsw        total:3495 pass:1778 dwarn:1   dfail:0   fail:6   skip:1709 time:11628s
shard-snb        total:3495 pass:1372 dwarn:1   dfail:0   fail:5   skip:2117 time:6987s
Blacklisted hosts:
shard-kbl        total:3495 pass:1951 dwarn:1   dfail:0   fail:13  skip:1530 time:9722s

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_8503/shards.html

* Re: [PATCH v2] drm/i915/execlists: Flush pending preemption events during reset
  2018-03-27 11:44   ` [PATCH v2] " Chris Wilson
@ 2018-03-27 15:33     ` Jeff McGee
  0 siblings, 0 replies; 40+ messages in thread
From: Jeff McGee @ 2018-03-27 15:33 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

On Tue, Mar 27, 2018 at 12:44:02PM +0100, Chris Wilson wrote:
> Catch up with the inflight CSB events, after disabling the tasklet
> before deciding which request was truly guilty of hanging the GPU.
> 
> v2: Restore checking of use_csb_mmio on every loop, don't forget old
> vgpu.
> 
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Michał Winiarski <michal.winiarski@intel.com>
> CC: Michel Thierry <michel.thierry@intel.com>
> Cc: Jeff McGee <jeff.mcgee@intel.com>
> ---
>  drivers/gpu/drm/i915/intel_lrc.c | 127 +++++++++++++++++++++++++++------------
>  1 file changed, 87 insertions(+), 40 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
> index cf31b0749023..75dbdedde8b0 100644
> --- a/drivers/gpu/drm/i915/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/intel_lrc.c
> @@ -874,34 +874,14 @@ static void execlists_cancel_requests(struct intel_engine_cs *engine)
>  	local_irq_restore(flags);
>  }
>  
> -/*
> - * Check the unread Context Status Buffers and manage the submission of new
> - * contexts to the ELSP accordingly.
> - */
> -static void execlists_submission_tasklet(unsigned long data)
> +static void process_csb(struct intel_engine_cs *engine)
>  {
> -	struct intel_engine_cs * const engine = (struct intel_engine_cs *)data;
>  	struct intel_engine_execlists * const execlists = &engine->execlists;
>  	struct execlist_port * const port = execlists->port;
> -	struct drm_i915_private *dev_priv = engine->i915;
> +	struct drm_i915_private *i915 = engine->i915;
>  	bool fw = false;
>  
> -	/*
> -	 * We can skip acquiring intel_runtime_pm_get() here as it was taken
> -	 * on our behalf by the request (see i915_gem_mark_busy()) and it will
> -	 * not be relinquished until the device is idle (see
> -	 * i915_gem_idle_work_handler()). As a precaution, we make sure
> -	 * that all ELSP are drained i.e. we have processed the CSB,
> -	 * before allowing ourselves to idle and calling intel_runtime_pm_put().
> -	 */
> -	GEM_BUG_ON(!dev_priv->gt.awake);
> -
> -	/*
> -	 * Prefer doing test_and_clear_bit() as a two stage operation to avoid
> -	 * imposing the cost of a locked atomic transaction when submitting a
> -	 * new request (outside of the context-switch interrupt).
> -	 */
> -	while (test_bit(ENGINE_IRQ_EXECLIST, &engine->irq_posted)) {
> +	do {
Wouldn't it be simpler for process_csb to use a while loop here, and for
callers that need a CSB processing point to just call it without themselves
checking irq_posted? Are we really saving much with the if-do-while approach?
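For reference, the simplification being suggested would look something like
this (an untested sketch; names as in the diff above):

	static void process_csb(struct intel_engine_cs *engine)
	{
		/* ... locals as in the patch ... */

		while (test_bit(ENGINE_IRQ_EXECLIST, &engine->irq_posted)) {
			/* loop body unchanged from the do-while above */
		}
	}

with execlists_submission_tasklet() and execlists_reset_prepare() then
calling process_csb(engine) unconditionally, instead of each testing
irq_posted before the call.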

>  		/* The HWSP contains a (cacheable) mirror of the CSB */
>  		const u32 *buf =
>  			&engine->status_page.page_addr[I915_HWS_CSB_BUF0_INDEX];
> @@ -909,28 +889,27 @@ static void execlists_submission_tasklet(unsigned long data)
>  
>  		if (unlikely(execlists->csb_use_mmio)) {
>  			buf = (u32 * __force)
> -				(dev_priv->regs + i915_mmio_reg_offset(RING_CONTEXT_STATUS_BUF_LO(engine, 0)));
> -			execlists->csb_head = -1; /* force mmio read of CSB ptrs */
> +				(i915->regs + i915_mmio_reg_offset(RING_CONTEXT_STATUS_BUF_LO(engine, 0)));
> +			execlists->csb_head = -1; /* force mmio read of CSB */
>  		}
>  
>  		/* Clear before reading to catch new interrupts */
>  		clear_bit(ENGINE_IRQ_EXECLIST, &engine->irq_posted);
>  		smp_mb__after_atomic();
>  
> -		if (unlikely(execlists->csb_head == -1)) { /* following a reset */
> +		if (unlikely(execlists->csb_head == -1)) { /* after a reset */
>  			if (!fw) {
> -				intel_uncore_forcewake_get(dev_priv,
> -							   execlists->fw_domains);
> +				intel_uncore_forcewake_get(i915, execlists->fw_domains);
>  				fw = true;
>  			}
>  
> -			head = readl(dev_priv->regs + i915_mmio_reg_offset(RING_CONTEXT_STATUS_PTR(engine)));
> +			head = readl(i915->regs + i915_mmio_reg_offset(RING_CONTEXT_STATUS_PTR(engine)));
>  			tail = GEN8_CSB_WRITE_PTR(head);
>  			head = GEN8_CSB_READ_PTR(head);
>  			execlists->csb_head = head;
>  		} else {
>  			const int write_idx =
> -				intel_hws_csb_write_index(dev_priv) -
> +				intel_hws_csb_write_index(i915) -
>  				I915_HWS_CSB_BUF0_INDEX;
>  
>  			head = execlists->csb_head;
> @@ -938,8 +917,8 @@ static void execlists_submission_tasklet(unsigned long data)
>  		}
>  		GEM_TRACE("%s cs-irq head=%d [%d%s], tail=%d [%d%s]\n",
>  			  engine->name,
> -			  head, GEN8_CSB_READ_PTR(readl(dev_priv->regs + i915_mmio_reg_offset(RING_CONTEXT_STATUS_PTR(engine)))), fw ? "" : "?",
> -			  tail, GEN8_CSB_WRITE_PTR(readl(dev_priv->regs + i915_mmio_reg_offset(RING_CONTEXT_STATUS_PTR(engine)))), fw ? "" : "?");
> +			  head, GEN8_CSB_READ_PTR(readl(i915->regs + i915_mmio_reg_offset(RING_CONTEXT_STATUS_PTR(engine)))), fw ? "" : "?",
> +			  tail, GEN8_CSB_WRITE_PTR(readl(i915->regs + i915_mmio_reg_offset(RING_CONTEXT_STATUS_PTR(engine)))), fw ? "" : "?");
>  
>  		while (head != tail) {
>  			struct i915_request *rq;
> @@ -949,7 +928,8 @@ static void execlists_submission_tasklet(unsigned long data)
>  			if (++head == GEN8_CSB_ENTRIES)
>  				head = 0;
>  
> -			/* We are flying near dragons again.
> +			/*
> +			 * We are flying near dragons again.
>  			 *
>  			 * We hold a reference to the request in execlist_port[]
>  			 * but no more than that. We are operating in softirq
> @@ -1040,15 +1020,48 @@ static void execlists_submission_tasklet(unsigned long data)
>  		if (head != execlists->csb_head) {
>  			execlists->csb_head = head;
>  			writel(_MASKED_FIELD(GEN8_CSB_READ_PTR_MASK, head << 8),
> -			       dev_priv->regs + i915_mmio_reg_offset(RING_CONTEXT_STATUS_PTR(engine)));
> +			       i915->regs + i915_mmio_reg_offset(RING_CONTEXT_STATUS_PTR(engine)));
>  		}
> -	}
> +	} while (test_bit(ENGINE_IRQ_EXECLIST, &engine->irq_posted));
>  
> -	if (!execlists_is_active(execlists, EXECLISTS_ACTIVE_PREEMPT))
> -		execlists_dequeue(engine);
> +	if (unlikely(fw))
> +		intel_uncore_forcewake_put(i915, execlists->fw_domains);
> +}
> +
> +/*
> + * Check the unread Context Status Buffers and manage the submission of new
> + * contexts to the ELSP accordingly.
> + */
> +static void execlists_submission_tasklet(unsigned long data)
> +{
> +	struct intel_engine_cs * const engine = (struct intel_engine_cs *)data;
>  
> -	if (fw)
> -		intel_uncore_forcewake_put(dev_priv, execlists->fw_domains);
> +	GEM_TRACE("%s awake?=%d, active=%x, irq-posted?=%d\n",
> +		  engine->name,
> +		  engine->i915->gt.awake,
> +		  engine->execlists.active,
> +		  test_bit(ENGINE_IRQ_EXECLIST, &engine->irq_posted));
> +
> +	/*
> +	 * We can skip acquiring intel_runtime_pm_get() here as it was taken
> +	 * on our behalf by the request (see i915_gem_mark_busy()) and it will
> +	 * not be relinquished until the device is idle (see
> +	 * i915_gem_idle_work_handler()). As a precaution, we make sure
> +	 * that all ELSP are drained i.e. we have processed the CSB,
> +	 * before allowing ourselves to idle and calling intel_runtime_pm_put().
> +	 */
> +	GEM_BUG_ON(!engine->i915->gt.awake);
> +
> +	/*
> +	 * Prefer doing test_and_clear_bit() as a two stage operation to avoid
> +	 * imposing the cost of a locked atomic transaction when submitting a
> +	 * new request (outside of the context-switch interrupt).
> +	 */
> +	if (test_bit(ENGINE_IRQ_EXECLIST, &engine->irq_posted))
> +		process_csb(engine);
> +
> +	if (!execlists_is_active(&engine->execlists, EXECLISTS_ACTIVE_PREEMPT))
> +		execlists_dequeue(engine);
>  
>  	/* If the engine is now idle, so should be the flag; and vice versa. */
>  	GEM_BUG_ON(execlists_is_active(&engine->execlists,
> @@ -1712,6 +1725,7 @@ static struct i915_request *
>  execlists_reset_prepare(struct intel_engine_cs *engine)
>  {
>  	struct intel_engine_execlists * const execlists = &engine->execlists;
> +	struct i915_request *request, *active;
>  
>  	GEM_TRACE("%s\n", engine->name);
>  
> @@ -1735,7 +1749,40 @@ execlists_reset_prepare(struct intel_engine_cs *engine)
>  		tasklet_kill(&execlists->tasklet);
>  	tasklet_disable(&execlists->tasklet);
>  
> -	return i915_gem_find_active_request(engine);
> +	/*
> +	 * We want to flush the pending context switches, having disabled
> +	 * the tasklet above, we can assume exclusive access to the execlists.
> +	 * For this allows us to catch up with an inflight preemption event,
> +	 * and avoid blaming an innocent request if the stall was due to the
> +	 * preemption itself.
> +	 */
> +	if (test_bit(ENGINE_IRQ_EXECLIST, &engine->irq_posted))
> +		process_csb(engine);
> +
> +	/*
> +	 * The last active request can then be no later than the last request
> +	 * now in ELSP[0]. So search backwards from there, so that if the GPU
> +	 * has advanced beyond the last CSB update, it will be pardoned.
> +	 */
> +	active = NULL;
> +	request = port_request(execlists->port);
> +	if (request) {
> +		unsigned long flags;
> +
> +		spin_lock_irqsave(&engine->timeline->lock, flags);
> +		list_for_each_entry_from_reverse(request,
> +						 &engine->timeline->requests,
> +						 link) {
> +			if (__i915_request_completed(request,
> +						     request->global_seqno))
> +				break;
> +
> +			active = request;
> +		}
> +		spin_unlock_irqrestore(&engine->timeline->lock, flags);
> +	}
> +
> +	return active;
If we return NULL here because the preemption completed, the reset will be
aborted, but no one will kick the tasklet to dequeue the waiting request(s).
My force preemption test confirmed this by hitting a GPU "hang". We need to
add the following (or maybe move it from gen8_init_common_ring) here, or
maybe to execlists_reset_finish...

if (engine->execlists.first)
	tasklet_schedule(&engine->execlists.tasklet);

>  }
>  
>  static void execlists_reset(struct intel_engine_cs *engine,
> -- 
> 2.16.3
> 

* Re: [PATCH 08/11] drm/i915/execlists: Force preemption via reset on timeout
  2018-03-27  8:51   ` Tvrtko Ursulin
  2018-03-27  9:10     ` Chris Wilson
  2018-03-27  9:23     ` Chris Wilson
@ 2018-03-27 15:40     ` Jeff McGee
  2 siblings, 0 replies; 40+ messages in thread
From: Jeff McGee @ 2018-03-27 15:40 UTC (permalink / raw)
  To: Tvrtko Ursulin; +Cc: intel-gfx

On Tue, Mar 27, 2018 at 09:51:20AM +0100, Tvrtko Ursulin wrote:
> 
> On 26/03/2018 12:50, Chris Wilson wrote:
> >Install a timer when trying to preempt on behalf of an important
> >context such that if the active context does not honour the preemption
> >request within the desired timeout, then we reset the GPU to allow the
> >important context to run.
> 
> I suggest renaming patch title to "Implement optional preemption
> delay timeout", or "upper bound", or something, as long as it is not
> "force preemption". :)
> 
> >(Open: should not the timer be started from receiving the high priority
> >request...)
> 
> If you think receiving as in execbuf I think not - that would be
> something else and not preempt timeout.
> 
> >Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> >---
> >  drivers/gpu/drm/i915/intel_lrc.c        | 53 +++++++++++++++++++++++++++++++++
> >  drivers/gpu/drm/i915/intel_ringbuffer.h |  8 +++++
> >  2 files changed, 61 insertions(+)
> >
> >diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
> >index 50688fc889d9..6da816d23cb3 100644
> >--- a/drivers/gpu/drm/i915/intel_lrc.c
> >+++ b/drivers/gpu/drm/i915/intel_lrc.c
> >@@ -533,6 +533,47 @@ static void inject_preempt_context(struct intel_engine_cs *engine)
> >  	execlists_clear_active(execlists, EXECLISTS_ACTIVE_HWACK);
> >  	execlists_set_active(execlists, EXECLISTS_ACTIVE_PREEMPT);
> >+
> >+	/* Set a timer to force preemption vs hostile userspace */
> >+	if (execlists->queue_preempt_timeout) {
> >+		GEM_TRACE("%s timeout=%uns\n",
> 
> preempt-timeout ?
> 
> >+			  engine->name, execlists->queue_preempt_timeout);
> >+		hrtimer_start(&execlists->preempt_timer,
> >+			      ktime_set(0, execlists->queue_preempt_timeout),
> >+			      HRTIMER_MODE_REL);
> >+	}
> >+}
> >+
> >+static enum hrtimer_restart preempt_timeout(struct hrtimer *hrtimer)
> >+{
> >+	struct intel_engine_execlists *execlists =
> >+		container_of(hrtimer, typeof(*execlists), preempt_timer);
> >+
> >+	GEM_TRACE("%s\n",
> >+		  container_of(execlists,
> >+			       struct intel_engine_cs,
> >+			       execlists)->name);
> >+
> >+	queue_work(system_highpri_wq, &execlists->preempt_reset);
> 
> I suppose indirection from hrtimer to worker is for better than
> jiffie timeout granularity? But then queue_work might introduce some
> delay to defeat that.
> 
> I am wondering if simply schedule_delayed_work directly wouldn't be
> good enough. I suppose it is a question for the product group. But
> it is also implementation detail.
> 
I started with schedule_delayed_work in my implementation hoping for
at least consistent msec accuracy, but it was all over the place.
We need msec granularity for the automotive use cases.
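For context, the granularity gap between the two APIs (illustrative only,
not from the patch; timeout_ns stands in for the configured timeout):

	/*
	 * Delayed work expires in jiffies, i.e. 1-10ms granularity
	 * depending on CONFIG_HZ, before any scheduling latency is added.
	 */
	schedule_delayed_work(&execlists->preempt_reset,
			      nsecs_to_jiffies(timeout_ns));

	/* An hrtimer expiry is specified with nanosecond resolution. */
	hrtimer_start(&execlists->preempt_timer,
		      ktime_set(0, timeout_ns),
		      HRTIMER_MODE_REL);

so even before worker scheduling latency, a delayed work item can only
round the timeout up to the next jiffy.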
-Jeff

> >+	return HRTIMER_NORESTART;
> >+}
> >+
> >+static void preempt_reset(struct work_struct *work)
> >+{
> >+	struct intel_engine_cs *engine =
> >+		container_of(work, typeof(*engine), execlists.preempt_reset);
> >+
> >+	GEM_TRACE("%s\n", engine->name);
> >+
> >+	tasklet_disable(&engine->execlists.tasklet);
> >+
> >+	engine->execlists.tasklet.func(engine->execlists.tasklet.data);
> 
> Comment on why calling the tasklet directly.
> 
> >+
> >+	if (execlists_is_active(&engine->execlists, EXECLISTS_ACTIVE_PREEMPT))
> >+		i915_handle_error(engine->i915, BIT(engine->id), 0,
> >+				  "preemption timed out on %s", engine->name);
> 
> Can this race with the normal reset and we end up with
> i915_handle_error twice simultaneously?
> 
> >+
> >+	tasklet_enable(&engine->execlists.tasklet);
> >  }
> >  static void complete_preempt_context(struct intel_engine_execlists *execlists)
> >@@ -542,6 +583,10 @@ static void complete_preempt_context(struct intel_engine_execlists *execlists)
> >  	execlists_cancel_port_requests(execlists);
> >  	execlists_unwind_incomplete_requests(execlists);
> >+	/* If the timer already fired, complete the reset */
> >+	if (hrtimer_try_to_cancel(&execlists->preempt_timer) < 0)
> >+		return;
> 
> What about timer which had already fired and queued the worker?
> hrtimer_try_to_cancel will return zero for that case I think.
> 
> >+
> >  	execlists_clear_active(execlists, EXECLISTS_ACTIVE_PREEMPT);
> >  }
> >@@ -708,6 +753,7 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
> >  			kmem_cache_free(engine->i915->priorities, p);
> >  	}
> >  done:
> >+	execlists->queue_preempt_timeout = 0; /* preemption point passed */
> >  	execlists->queue_priority = rb ? to_priolist(rb)->priority : INT_MIN;
> >  	execlists->first = rb;
> >  	if (submit)
> >@@ -864,6 +910,7 @@ static void execlists_cancel_requests(struct intel_engine_cs *engine)
> >  	/* Remaining _unready_ requests will be nop'ed when submitted */
> >+	execlists->queue_preempt_timeout = 0;
> >  	execlists->queue_priority = INT_MIN;
> >  	execlists->queue = RB_ROOT;
> >  	execlists->first = NULL;
> >@@ -1080,6 +1127,7 @@ static void queue_request(struct intel_engine_cs *engine,
> >  static void __submit_queue(struct intel_engine_cs *engine, int prio)
> >  {
> >  		engine->execlists.queue_priority = prio;
> >+		engine->execlists.queue_preempt_timeout = 0;
> >  		tasklet_hi_schedule(&engine->execlists.tasklet);
> >  }
> >@@ -2270,6 +2318,11 @@ logical_ring_setup(struct intel_engine_cs *engine)
> >  	tasklet_init(&engine->execlists.tasklet,
> >  		     execlists_submission_tasklet, (unsigned long)engine);
> >+	INIT_WORK(&engine->execlists.preempt_reset, preempt_reset);
> >+	hrtimer_init(&engine->execlists.preempt_timer,
> >+		     CLOCK_MONOTONIC, HRTIMER_MODE_REL);
> >+	engine->execlists.preempt_timer.function = preempt_timeout;
> >+
> >  	logical_ring_default_vfuncs(engine);
> >  	logical_ring_default_irqs(engine);
> >  }
> >diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
> >index 4c71dcdc722b..7166f47c8489 100644
> >--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
> >+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
> >@@ -284,6 +284,11 @@ struct intel_engine_execlists {
> >  	 */
> >  	int queue_priority;
> >+	/**
> >+	 * @queue_preempt_timeout: Timeout in ns before forcing preemption.
> >+	 */
> >+	unsigned int queue_preempt_timeout;
> >+
> >  	/**
> >  	 * @queue: queue of requests, in priority lists
> >  	 */
> >@@ -313,6 +318,9 @@ struct intel_engine_execlists {
> >  	 * @preempt_complete_status: expected CSB upon completing preemption
> >  	 */
> >  	u32 preempt_complete_status;
> >+
> >+	struct hrtimer preempt_timer;
> >+	struct work_struct preempt_reset;
> >  };
> >  #define INTEL_ENGINE_CS_MAX_NAME 8
> >
> 
> Regards,
> 
> Tvrtko

* Re: [PATCH 08/11] drm/i915/execlists: Force preemption via reset on timeout
  2018-03-26 11:50 ` [PATCH 08/11] drm/i915/execlists: Force preemption via reset on timeout Chris Wilson
  2018-03-27  8:51   ` Tvrtko Ursulin
@ 2018-03-27 15:50   ` Jeff McGee
  1 sibling, 0 replies; 40+ messages in thread
From: Jeff McGee @ 2018-03-27 15:50 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

On Mon, Mar 26, 2018 at 12:50:41PM +0100, Chris Wilson wrote:
> Install a timer when trying to preempt on behalf of an important
> context such that if the active context does not honour the preemption
> request within the desired timeout, then we reset the GPU to allow the
> important context to run.
> 
> (Open: should not the timer be started from receiving the high priority
> request...)
> 
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> ---
>  drivers/gpu/drm/i915/intel_lrc.c        | 53 +++++++++++++++++++++++++++++++++
>  drivers/gpu/drm/i915/intel_ringbuffer.h |  8 +++++
>  2 files changed, 61 insertions(+)
> 
> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
> index 50688fc889d9..6da816d23cb3 100644
> --- a/drivers/gpu/drm/i915/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/intel_lrc.c
> @@ -533,6 +533,47 @@ static void inject_preempt_context(struct intel_engine_cs *engine)
>  
>  	execlists_clear_active(execlists, EXECLISTS_ACTIVE_HWACK);
>  	execlists_set_active(execlists, EXECLISTS_ACTIVE_PREEMPT);
> +
> +	/* Set a timer to force preemption vs hostile userspace */
> +	if (execlists->queue_preempt_timeout) {
> +		GEM_TRACE("%s timeout=%uns\n",
> +			  engine->name, execlists->queue_preempt_timeout);
> +		hrtimer_start(&execlists->preempt_timer,
> +			      ktime_set(0, execlists->queue_preempt_timeout),
> +			      HRTIMER_MODE_REL);
> +	}
> +}
> +
> +static enum hrtimer_restart preempt_timeout(struct hrtimer *hrtimer)
> +{
> +	struct intel_engine_execlists *execlists =
> +		container_of(hrtimer, typeof(*execlists), preempt_timer);
> +
> +	GEM_TRACE("%s\n",
> +		  container_of(execlists,
> +			       struct intel_engine_cs,
> +			       execlists)->name);
> +
> +	queue_work(system_highpri_wq, &execlists->preempt_reset);
> +	return HRTIMER_NORESTART;
> +}
> +
> +static void preempt_reset(struct work_struct *work)
> +{
> +	struct intel_engine_cs *engine =
> +		container_of(work, typeof(*engine), execlists.preempt_reset);
> +
> +	GEM_TRACE("%s\n", engine->name);
> +
> +	tasklet_disable(&engine->execlists.tasklet);
> +
> +	engine->execlists.tasklet.func(engine->execlists.tasklet.data);
This seems redundant with execlists_reset_prepare, which already does a
process_csb later to decide whether to execute or abort the reset. In your
previous proposal you suggested tasklet disable and kill here.
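That earlier approach would be along these lines (an untested sketch),
draining the tasklet rather than invoking tasklet.func() by hand:

	/* Wait for a scheduled run of the tasklet to complete... */
	tasklet_kill(&engine->execlists.tasklet);
	/* ...then keep it from running again until we are done. */
	tasklet_disable(&engine->execlists.tasklet);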

> +
> +	if (execlists_is_active(&engine->execlists, EXECLISTS_ACTIVE_PREEMPT))
> +		i915_handle_error(engine->i915, BIT(engine->id), 0,
> +				  "preemption timed out on %s", engine->name);
> +
> +	tasklet_enable(&engine->execlists.tasklet);
>  }
>  
>  static void complete_preempt_context(struct intel_engine_execlists *execlists)
> @@ -542,6 +583,10 @@ static void complete_preempt_context(struct intel_engine_execlists *execlists)
>  	execlists_cancel_port_requests(execlists);
>  	execlists_unwind_incomplete_requests(execlists);
>  
> +	/* If the timer already fired, complete the reset */
> +	if (hrtimer_try_to_cancel(&execlists->preempt_timer) < 0)
> +		return;
> +
>  	execlists_clear_active(execlists, EXECLISTS_ACTIVE_PREEMPT);
>  }
>  
> @@ -708,6 +753,7 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
>  			kmem_cache_free(engine->i915->priorities, p);
>  	}
>  done:
> +	execlists->queue_preempt_timeout = 0; /* preemption point passed */
>  	execlists->queue_priority = rb ? to_priolist(rb)->priority : INT_MIN;
>  	execlists->first = rb;
>  	if (submit)
> @@ -864,6 +910,7 @@ static void execlists_cancel_requests(struct intel_engine_cs *engine)
>  
>  	/* Remaining _unready_ requests will be nop'ed when submitted */
>  
> +	execlists->queue_preempt_timeout = 0;
>  	execlists->queue_priority = INT_MIN;
>  	execlists->queue = RB_ROOT;
>  	execlists->first = NULL;
> @@ -1080,6 +1127,7 @@ static void queue_request(struct intel_engine_cs *engine,
>  static void __submit_queue(struct intel_engine_cs *engine, int prio)
>  {
>  		engine->execlists.queue_priority = prio;
> +		engine->execlists.queue_preempt_timeout = 0;
>  		tasklet_hi_schedule(&engine->execlists.tasklet);
>  }
>  
> @@ -2270,6 +2318,11 @@ logical_ring_setup(struct intel_engine_cs *engine)
>  	tasklet_init(&engine->execlists.tasklet,
>  		     execlists_submission_tasklet, (unsigned long)engine);
>  
> +	INIT_WORK(&engine->execlists.preempt_reset, preempt_reset);
> +	hrtimer_init(&engine->execlists.preempt_timer,
> +		     CLOCK_MONOTONIC, HRTIMER_MODE_REL);
> +	engine->execlists.preempt_timer.function = preempt_timeout;
> +
>  	logical_ring_default_vfuncs(engine);
>  	logical_ring_default_irqs(engine);
>  }
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
> index 4c71dcdc722b..7166f47c8489 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.h
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
> @@ -284,6 +284,11 @@ struct intel_engine_execlists {
>  	 */
>  	int queue_priority;
>  
> +	/**
> +	 * @queue_preempt_timeout: Timeout in ns before forcing preemption.
> +	 */
> +	unsigned int queue_preempt_timeout;
> +
>  	/**
>  	 * @queue: queue of requests, in priority lists
>  	 */
> @@ -313,6 +318,9 @@ struct intel_engine_execlists {
>  	 * @preempt_complete_status: expected CSB upon completing preemption
>  	 */
>  	u32 preempt_complete_status;
> +
> +	struct hrtimer preempt_timer;
> +	struct work_struct preempt_reset;
>  };
>  
>  #define INTEL_ENGINE_CS_MAX_NAME 8
> -- 
> 2.16.3
> 

* Re: [PATCH 08/11] drm/i915/execlists: Force preemption via reset on timeout
  2018-03-27  9:23     ` Chris Wilson
@ 2018-03-28  8:59       ` Chris Wilson
  0 siblings, 0 replies; 40+ messages in thread
From: Chris Wilson @ 2018-03-28  8:59 UTC (permalink / raw)
  To: Tvrtko Ursulin, intel-gfx

Quoting Chris Wilson (2018-03-27 10:23:26)
> Quoting Tvrtko Ursulin (2018-03-27 09:51:20)
> > 
> > On 26/03/2018 12:50, Chris Wilson wrote:
> > > +static enum hrtimer_restart preempt_timeout(struct hrtimer *hrtimer)
> > > +{
> > > +     struct intel_engine_execlists *execlists =
> > > +             container_of(hrtimer, typeof(*execlists), preempt_timer);
> > > +
> > > +     GEM_TRACE("%s\n",
> > > +               container_of(execlists,
> > > +                            struct intel_engine_cs,
> > > +                            execlists)->name);
> > > +
> > > +     queue_work(system_highpri_wq, &execlists->preempt_reset);
> > 
> > I suppose indirection from hrtimer to worker is for better than jiffie 
> > timeout granularity? But then queue_work might introduce some delay to 
> > defeat that.
> 
> Yes. It's betting on highpri_wq being just that. We can do better with
> our own RT kthread and a wakeup from the hrtimer if required.
> 
> Hmm, the original plan (watchdog TDR) was to avoid using the global
> reset entirely. The per-engine reset (although needs serialising with
> itself) doesn't need struct_mutex, so we should be able to do that from
> the timer directly, and just kick off the global reset on failure.
> 
> That sounds a whole lot better, let's dust off that code and see what
> breaks.

So I think something along the lines of

+static int try_execlists_reset(struct intel_engine_execlists *execlists)
+{
+       struct intel_engine_cs *engine =
+               container_of(execlists, typeof(*engine), execlists);
+       int err = -EBUSY;
+
+       if (!test_bit(ENGINE_IRQ_EXECLIST, &engine->irq_posted) &&
+           tasklet_trylock(&execlists->tasklet)) {
+               unsigned long *lock = &engine->i915->gpu_error.flags;
+               unsigned int bit = I915_RESET_ENGINE + engine->id;
+
+               if (!test_and_set_bit(bit, lock)) {
+                       tasklet_disable(&engine->execlists.tasklet);
+                       err = i915_reset_engine(engine,
+                                               "preemption timeout");
+                       tasklet_enable(&engine->execlists.tasklet);
+
+                       clear_bit(bit, lock);
+                       wake_up_bit(lock, bit);
+               }
+
+               tasklet_unlock(&execlists->tasklet);
+       }
+
+       return err;
+}

should do the trick, with a fallback to worker+i915_handle_error for the
cases where we can't ensure softirq safety.
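With that in place, the hrtimer callback could attempt the direct
per-engine reset first and only fall back to the worker when it fails
(an untested sketch; error handling elided):

	static enum hrtimer_restart preempt_timeout(struct hrtimer *hrtimer)
	{
		struct intel_engine_execlists *execlists =
			container_of(hrtimer, typeof(*execlists), preempt_timer);

		/* Non-zero (-EBUSY): couldn't reset from here, punt to process context */
		if (try_execlists_reset(execlists))
			queue_work(system_highpri_wq, &execlists->preempt_reset);

		return HRTIMER_NORESTART;
	}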

There's still the issue of two resets in quick succession being treated
as a failure. That's also an issue for the current per-engine failover
to whole-device reset.
-Chris

end of thread, other threads:[~2018-03-28  8:59 UTC | newest]

Thread overview: 40+ messages
2018-03-26 11:50 Forced preemption gedankenexperiment Chris Wilson
2018-03-26 11:50 ` [PATCH 01/11] drm/i915/execlists: Avoid kicking the submission too early for rescheduling Chris Wilson
2018-03-27 11:34   ` Mika Kuoppala
2018-03-27 11:47     ` Chris Wilson
2018-03-27 12:18       ` Mika Kuoppala
2018-03-27 13:34         ` Chris Wilson
2018-03-27 11:42   ` Mika Kuoppala
2018-03-26 11:50 ` [PATCH 02/11] drm/i915/execlists: Clear user-active flag on preemption completion Chris Wilson
2018-03-27 10:00   ` Chris Wilson
2018-03-27 10:01     ` Chris Wilson
2018-03-26 11:50 ` [PATCH 03/11] drm/i915: Include submission tasklet state in engine dump Chris Wilson
2018-03-27  8:37   ` Mika Kuoppala
2018-03-26 11:50 ` [PATCH 04/11] drm/i915/execlists: Refactor out complete_preempt_context() Chris Wilson
2018-03-26 11:50 ` [PATCH 05/11] drm/i915: Move engine reset prepare/finish to backends Chris Wilson
2018-03-26 11:50 ` [PATCH 06/11] drm/i915: Split execlists/guc reset prepartions Chris Wilson
2018-03-26 11:50 ` [PATCH 07/11] drm/i915/execlists: Flush pending preemption events during reset Chris Wilson
2018-03-27 11:44   ` [PATCH v2] " Chris Wilson
2018-03-27 15:33     ` Jeff McGee
2018-03-26 11:50 ` [PATCH 08/11] drm/i915/execlists: Force preemption via reset on timeout Chris Wilson
2018-03-27  8:51   ` Tvrtko Ursulin
2018-03-27  9:10     ` Chris Wilson
2018-03-27  9:23     ` Chris Wilson
2018-03-28  8:59       ` Chris Wilson
2018-03-27 15:40     ` Jeff McGee
2018-03-27 15:50   ` Jeff McGee
2018-03-26 11:50 ` [PATCH 09/11] drm/i915/preemption: Select timeout when scheduling Chris Wilson
2018-03-26 11:50 ` [PATCH 10/11] drm/i915: Use a preemption timeout to enforce interactivity Chris Wilson
2018-03-27  8:35   ` Tvrtko Ursulin
2018-03-27  8:39     ` Chris Wilson
2018-03-27  8:57       ` Tvrtko Ursulin
2018-03-26 11:50 ` [PATCH 11/11] drm/i915: Allow user control over preempt timeout on the important context Chris Wilson
2018-03-26 17:09   ` Tvrtko Ursulin
2018-03-26 19:52     ` Chris Wilson
2018-03-26 20:49       ` Chris Wilson
2018-03-26 12:08 ` ✗ Fi.CI.CHECKPATCH: warning for series starting with [01/11] drm/i915/execlists: Avoid kicking the submission too early for rescheduling Patchwork
2018-03-26 12:23 ` ✓ Fi.CI.BAT: success " Patchwork
2018-03-26 14:56 ` ✗ Fi.CI.IGT: failure " Patchwork
2018-03-27 12:28 ` ✗ Fi.CI.CHECKPATCH: warning for series starting with [01/11] drm/i915/execlists: Avoid kicking the submission too early for rescheduling (rev2) Patchwork
2018-03-27 12:44 ` ✓ Fi.CI.BAT: success " Patchwork
2018-03-27 15:30 ` ✗ Fi.CI.IGT: failure " Patchwork
