* Still trying for context->preempt_timeout
@ 2018-04-09 11:13 Chris Wilson
  2018-04-09 11:13 ` [PATCH 01/18] drm/i915/execlists: Set queue priority from secondary port Chris Wilson
                   ` (20 more replies)
  0 siblings, 21 replies; 36+ messages in thread
From: Chris Wilson @ 2018-04-09 11:13 UTC (permalink / raw)
  To: intel-gfx

Now with i915_reset_engine() marking the stalled request as guilty,
preemption timeout doesn't lead to a GPU hang death spiral, at the cost
of potentially resetting a context, supposedly with no harm done (in
practice, that didn't work out!).

I'm a bit ambivalent about the flip forcing a reset; both the stutter and
the glitch can be seen under load. I'm not sold on the UX; it's bad either way.
If we don't break the deadlock, the user can't interact with the system;
if we do, they may see a glitch in one app.
-Chris



* [PATCH 01/18] drm/i915/execlists: Set queue priority from secondary port
  2018-04-09 11:13 Still trying for context->preempt_timeout Chris Wilson
@ 2018-04-09 11:13 ` Chris Wilson
  2018-04-10 14:05   ` Tvrtko Ursulin
  2018-04-09 11:13 ` [PATCH 02/18] drm/i915/execlists: Refactor out complete_preempt_context() Chris Wilson
                   ` (19 subsequent siblings)
  20 siblings, 1 reply; 36+ messages in thread
From: Chris Wilson @ 2018-04-09 11:13 UTC (permalink / raw)
  To: intel-gfx

We can refine our current execlists->queue_priority if we inspect
ELSP[1] rather than the head of the unsubmitted queue. Currently, we use
the unsubmitted queue and say that if a subsequent request is more
important than the current queue, we will rerun the submission tasklet
to evaluate the need for preemption. However, we only want to preempt if
we need to jump ahead of a currently executing request in ELSP. The
second reason for rerunning the submission tasklet is to amalgamate
requests into the active context on ELSP[0] to avoid a stall when
ELSP[0] drains. (Though repeatedly amalgamating requests into the active
context and triggering many lite-restores is of questionable gain, the
goal really is to put a context into ELSP[1] to cover the interrupt.) So
if, instead of looking at the head of the queue, we look at the context
in ELSP[1], we can answer both questions more accurately -- we don't
need to rerun the submission tasklet unless our new request is important
enough to feed into, at least, ELSP[1].
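
For reference, the consumer of execlists->queue_priority is the
submission path, which only kicks the tasklet when a new request
outranks the current value. A minimal sketch of that check (the helper
name and shape here are assumed from the surrounding code, not part of
this patch):

	static void submit_queue(struct intel_engine_cs *engine, int prio)
	{
		/*
		 * With this patch, queue_priority reflects what is needed
		 * to displace ELSP[1], so requests that cannot preempt (or
		 * usefully fill ELSP[1]) skip the tasklet kick entirely.
		 */
		if (prio > engine->execlists.queue_priority) {
			engine->execlists.queue_priority = prio;
			tasklet_hi_schedule(&engine->execlists.tasklet);
		}
	}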

References: f6322eddaff7 ("drm/i915/preemption: Allow preemption between submission ports")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Michał Winiarski <michal.winiarski@intel.com>
Cc: Michel Thierry <michel.thierry@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/intel_lrc.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 3592288e4696..b47d53d284ca 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -713,7 +713,8 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
 			kmem_cache_free(engine->i915->priorities, p);
 	}
 done:
-	execlists->queue_priority = rb ? to_priolist(rb)->priority : INT_MIN;
+	execlists->queue_priority =
+		port != execlists->port ? rq_prio(last) : INT_MIN;
 	execlists->first = rb;
 	if (submit)
 		port_assign(port, last);
-- 
2.17.0


* [PATCH 02/18] drm/i915/execlists: Refactor out complete_preempt_context()
  2018-04-09 11:13 Still trying for context->preempt_timeout Chris Wilson
  2018-04-09 11:13 ` [PATCH 01/18] drm/i915/execlists: Set queue priority from secondary port Chris Wilson
@ 2018-04-09 11:13 ` Chris Wilson
  2018-04-09 11:13 ` [PATCH 03/18] drm/i915: Move engine reset prepare/finish to backends Chris Wilson
                   ` (18 subsequent siblings)
  20 siblings, 0 replies; 36+ messages in thread
From: Chris Wilson @ 2018-04-09 11:13 UTC (permalink / raw)
  To: intel-gfx

As a complement to inject_preempt_context(), follow up with the function
to handle its completion. This will be useful should we wish to extend
the duties of the preempt-context for execlists.

v2: And do the same for the guc.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Jeff McGee <jeff.mcgee@intel.com>
Cc: Michał Winiarski <michal.winiarski@intel.com>
Reviewed-by: Jeff McGee <jeff.mcgee@intel.com> #v1
---
 drivers/gpu/drm/i915/intel_guc_submission.c | 26 ++++++++++++++-------
 drivers/gpu/drm/i915/intel_lrc.c            | 23 ++++++++++--------
 2 files changed, 30 insertions(+), 19 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_guc_submission.c b/drivers/gpu/drm/i915/intel_guc_submission.c
index 97121230656c..04125e8cec6b 100644
--- a/drivers/gpu/drm/i915/intel_guc_submission.c
+++ b/drivers/gpu/drm/i915/intel_guc_submission.c
@@ -621,6 +621,21 @@ static void wait_for_guc_preempt_report(struct intel_engine_cs *engine)
 	report->report_return_status = INTEL_GUC_REPORT_STATUS_UNKNOWN;
 }
 
+static void complete_preempt_context(struct intel_engine_cs *engine)
+{
+	struct intel_engine_execlists *execlists = &engine->execlists;
+
+	GEM_BUG_ON(!execlists_is_active(execlists, EXECLISTS_ACTIVE_PREEMPT));
+
+	execlists_cancel_port_requests(execlists);
+	execlists_unwind_incomplete_requests(execlists);
+
+	wait_for_guc_preempt_report(engine);
+	intel_write_status_page(engine, I915_GEM_HWS_PREEMPT_INDEX, 0);
+
+	execlists_clear_active(execlists, EXECLISTS_ACTIVE_PREEMPT);
+}
+
 /**
  * guc_submit() - Submit commands through GuC
  * @engine: engine associated with the commands
@@ -775,15 +790,8 @@ static void guc_submission_tasklet(unsigned long data)
 
 	if (execlists_is_active(execlists, EXECLISTS_ACTIVE_PREEMPT) &&
 	    intel_read_status_page(engine, I915_GEM_HWS_PREEMPT_INDEX) ==
-	    GUC_PREEMPT_FINISHED) {
-		execlists_cancel_port_requests(&engine->execlists);
-		execlists_unwind_incomplete_requests(execlists);
-
-		wait_for_guc_preempt_report(engine);
-
-		execlists_clear_active(execlists, EXECLISTS_ACTIVE_PREEMPT);
-		intel_write_status_page(engine, I915_GEM_HWS_PREEMPT_INDEX, 0);
-	}
+	    GUC_PREEMPT_FINISHED)
+		complete_preempt_context(engine);
 
 	if (!execlists_is_active(execlists, EXECLISTS_ACTIVE_PREEMPT))
 		guc_dequeue(engine);
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index b47d53d284ca..5773a0e8dc51 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -546,8 +546,18 @@ static void inject_preempt_context(struct intel_engine_cs *engine)
 	if (execlists->ctrl_reg)
 		writel(EL_CTRL_LOAD, execlists->ctrl_reg);
 
-	execlists_clear_active(&engine->execlists, EXECLISTS_ACTIVE_HWACK);
-	execlists_set_active(&engine->execlists, EXECLISTS_ACTIVE_PREEMPT);
+	execlists_clear_active(execlists, EXECLISTS_ACTIVE_HWACK);
+	execlists_set_active(execlists, EXECLISTS_ACTIVE_PREEMPT);
+}
+
+static void complete_preempt_context(struct intel_engine_execlists *execlists)
+{
+	GEM_BUG_ON(!execlists_is_active(execlists, EXECLISTS_ACTIVE_PREEMPT));
+
+	execlists_cancel_port_requests(execlists);
+	execlists_unwind_incomplete_requests(execlists);
+
+	execlists_clear_active(execlists, EXECLISTS_ACTIVE_PREEMPT);
 }
 
 static void execlists_dequeue(struct intel_engine_cs *engine)
@@ -995,14 +1005,7 @@ static void execlists_submission_tasklet(unsigned long data)
 			if (status & GEN8_CTX_STATUS_COMPLETE &&
 			    buf[2*head + 1] == execlists->preempt_complete_status) {
 				GEM_TRACE("%s preempt-idle\n", engine->name);
-
-				execlists_cancel_port_requests(execlists);
-				execlists_unwind_incomplete_requests(execlists);
-
-				GEM_BUG_ON(!execlists_is_active(execlists,
-								EXECLISTS_ACTIVE_PREEMPT));
-				execlists_clear_active(execlists,
-						       EXECLISTS_ACTIVE_PREEMPT);
+				complete_preempt_context(execlists);
 				continue;
 			}
 
-- 
2.17.0


* [PATCH 03/18] drm/i915: Move engine reset prepare/finish to backends
  2018-04-09 11:13 Still trying for context->preempt_timeout Chris Wilson
  2018-04-09 11:13 ` [PATCH 01/18] drm/i915/execlists: Set queue priority from secondary port Chris Wilson
  2018-04-09 11:13 ` [PATCH 02/18] drm/i915/execlists: Refactor out complete_preempt_context() Chris Wilson
@ 2018-04-09 11:13 ` Chris Wilson
  2018-04-09 11:13 ` [PATCH 04/18] drm/i915: Split execlists/guc reset preparations Chris Wilson
                   ` (17 subsequent siblings)
  20 siblings, 0 replies; 36+ messages in thread
From: Chris Wilson @ 2018-04-09 11:13 UTC (permalink / raw)
  To: intel-gfx

In preparation for handling incomplete preemption during reset more
carefully in execlists, move the existing code wholesale to the backends
under a couple of new reset vfuncs.
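
To make the resulting control flow concrete, here is a rough sketch of
how the core reset path drives the new vfuncs, assembled from the hunks
below (error handling elided):

	/* i915_gem_reset_prepare_engine() */
	request = engine->reset.prepare(engine);  /* quiesce submission, find active request */

	/* i915_gem_reset_engine() */
	engine->reset.reset(engine, request);     /* set up the CS to resume from the breadcrumb */

	/* i915_gem_reset_finish_engine() */
	engine->reset.finish(engine);             /* re-enable submission */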

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Michał Winiarski <michal.winiarski@intel.com>
Cc: Michel Thierry <michel.thierry@intel.com>
Cc: Jeff McGee <jeff.mcgee@intel.com>
Reviewed-by: Jeff McGee <jeff.mcgee@intel.com>
---
 drivers/gpu/drm/i915/i915_gem.c         | 47 +++-----------------
 drivers/gpu/drm/i915/intel_lrc.c        | 59 +++++++++++++++++++++++--
 drivers/gpu/drm/i915/intel_ringbuffer.c | 23 ++++++++--
 drivers/gpu/drm/i915/intel_ringbuffer.h |  9 +++-
 4 files changed, 88 insertions(+), 50 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 28ab0beff86c..61aca602c3a9 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2997,7 +2997,7 @@ i915_gem_find_active_request(struct intel_engine_cs *engine)
 struct i915_request *
 i915_gem_reset_prepare_engine(struct intel_engine_cs *engine)
 {
-	struct i915_request *request = NULL;
+	struct i915_request *request;
 
 	/*
 	 * During the reset sequence, we must prevent the engine from
@@ -3020,40 +3020,7 @@ i915_gem_reset_prepare_engine(struct intel_engine_cs *engine)
 	 */
 	kthread_park(engine->breadcrumbs.signaler);
 
-	/*
-	 * Prevent request submission to the hardware until we have
-	 * completed the reset in i915_gem_reset_finish(). If a request
-	 * is completed by one engine, it may then queue a request
-	 * to a second via its execlists->tasklet *just* as we are
-	 * calling engine->init_hw() and also writing the ELSP.
-	 * Turning off the execlists->tasklet until the reset is over
-	 * prevents the race.
-	 *
-	 * Note that this needs to be a single atomic operation on the
-	 * tasklet (flush existing tasks, prevent new tasks) to prevent
-	 * a race between reset and set-wedged. It is not, so we do the best
-	 * we can atm and make sure we don't lock the machine up in the more
-	 * common case of recursively being called from set-wedged from inside
-	 * i915_reset.
-	 */
-	if (!atomic_read(&engine->execlists.tasklet.count))
-		tasklet_kill(&engine->execlists.tasklet);
-	tasklet_disable(&engine->execlists.tasklet);
-
-	/*
-	 * We're using worker to queue preemption requests from the tasklet in
-	 * GuC submission mode.
-	 * Even though tasklet was disabled, we may still have a worker queued.
-	 * Let's make sure that all workers scheduled before disabling the
-	 * tasklet are completed before continuing with the reset.
-	 */
-	if (engine->i915->guc.preempt_wq)
-		flush_workqueue(engine->i915->guc.preempt_wq);
-
-	if (engine->irq_seqno_barrier)
-		engine->irq_seqno_barrier(engine);
-
-	request = i915_gem_find_active_request(engine);
+	request = engine->reset.prepare(engine);
 	if (request && request->fence.error == -EIO)
 		request = ERR_PTR(-EIO); /* Previous reset failed! */
 
@@ -3204,13 +3171,8 @@ void i915_gem_reset_engine(struct intel_engine_cs *engine,
 	if (request)
 		request = i915_gem_reset_request(engine, request, stalled);
 
-	if (request) {
-		DRM_DEBUG_DRIVER("resetting %s to restart from tail of request 0x%x\n",
-				 engine->name, request->global_seqno);
-	}
-
 	/* Setup the CS to resume from the breadcrumb of the hung request */
-	engine->reset_hw(engine, request);
+	engine->reset.reset(engine, request);
 }
 
 void i915_gem_reset(struct drm_i915_private *dev_priv,
@@ -3265,7 +3227,8 @@ void i915_gem_reset(struct drm_i915_private *dev_priv,
 
 void i915_gem_reset_finish_engine(struct intel_engine_cs *engine)
 {
-	tasklet_enable(&engine->execlists.tasklet);
+	engine->reset.finish(engine);
+
 	kthread_unpark(engine->breadcrumbs.signaler);
 
 	intel_uncore_forcewake_put(engine->i915, FORCEWAKE_ALL);
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 5773a0e8dc51..0fe872f8b6a3 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -1741,8 +1741,48 @@ static int gen9_init_render_ring(struct intel_engine_cs *engine)
 	return init_workarounds_ring(engine);
 }
 
-static void reset_common_ring(struct intel_engine_cs *engine,
-			      struct i915_request *request)
+static struct i915_request *
+execlists_reset_prepare(struct intel_engine_cs *engine)
+{
+	struct intel_engine_execlists * const execlists = &engine->execlists;
+
+	GEM_TRACE("%s\n", engine->name);
+
+	/*
+	 * Prevent request submission to the hardware until we have
+	 * completed the reset in i915_gem_reset_finish(). If a request
+	 * is completed by one engine, it may then queue a request
+	 * to a second via its execlists->tasklet *just* as we are
+	 * calling engine->init_hw() and also writing the ELSP.
+	 * Turning off the execlists->tasklet until the reset is over
+	 * prevents the race.
+	 *
+	 * Note that this needs to be a single atomic operation on the
+	 * tasklet (flush existing tasks, prevent new tasks) to prevent
+	 * a race between reset and set-wedged. It is not, so we do the best
+	 * we can atm and make sure we don't lock the machine up in the more
+	 * common case of recursively being called from set-wedged from inside
+	 * i915_reset.
+	 */
+	if (!atomic_read(&execlists->tasklet.count))
+		tasklet_kill(&execlists->tasklet);
+	tasklet_disable(&execlists->tasklet);
+
+	/*
+	 * We're using worker to queue preemption requests from the tasklet in
+	 * GuC submission mode.
+	 * Even though tasklet was disabled, we may still have a worker queued.
+	 * Let's make sure that all workers scheduled before disabling the
+	 * tasklet are completed before continuing with the reset.
+	 */
+	if (engine->i915->guc.preempt_wq)
+		flush_workqueue(engine->i915->guc.preempt_wq);
+
+	return i915_gem_find_active_request(engine);
+}
+
+static void execlists_reset(struct intel_engine_cs *engine,
+			    struct i915_request *request)
 {
 	struct intel_engine_execlists * const execlists = &engine->execlists;
 	struct intel_context *ce;
@@ -1751,6 +1791,9 @@ static void reset_common_ring(struct intel_engine_cs *engine,
 	GEM_TRACE("%s seqno=%x\n",
 		  engine->name, request ? request->global_seqno : 0);
 
+	/* The submission tasklet must be disabled, engine->reset.prepare(). */
+	GEM_BUG_ON(!atomic_read(&execlists->tasklet.count));
+
 	/* See execlists_cancel_requests() for the irq/spinlock split. */
 	local_irq_save(flags);
 
@@ -1811,6 +1854,13 @@ static void reset_common_ring(struct intel_engine_cs *engine,
 	unwind_wa_tail(request);
 }
 
+static void execlists_reset_finish(struct intel_engine_cs *engine)
+{
+	tasklet_enable(&engine->execlists.tasklet);
+
+	GEM_TRACE("%s\n", engine->name);
+}
+
 static int intel_logical_ring_emit_pdps(struct i915_request *rq)
 {
 	struct i915_hw_ppgtt *ppgtt = rq->ctx->ppgtt;
@@ -2137,7 +2187,10 @@ logical_ring_default_vfuncs(struct intel_engine_cs *engine)
 {
 	/* Default vfuncs which can be overriden by each engine. */
 	engine->init_hw = gen8_init_common_ring;
-	engine->reset_hw = reset_common_ring;
+
+	engine->reset.prepare = execlists_reset_prepare;
+	engine->reset.reset = execlists_reset;
+	engine->reset.finish = execlists_reset_finish;
 
 	engine->context_pin = execlists_context_pin;
 	engine->context_unpin = execlists_context_unpin;
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 04d9d9a946a7..5dadbc435c0e 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -530,9 +530,20 @@ static int init_ring_common(struct intel_engine_cs *engine)
 	return ret;
 }
 
-static void reset_ring_common(struct intel_engine_cs *engine,
-			      struct i915_request *request)
+static struct i915_request *reset_prepare(struct intel_engine_cs *engine)
 {
+	if (engine->irq_seqno_barrier)
+		engine->irq_seqno_barrier(engine);
+
+	return i915_gem_find_active_request(engine);
+}
+
+static void reset_ring(struct intel_engine_cs *engine,
+		       struct i915_request *request)
+{
+	GEM_TRACE("%s seqno=%x\n",
+		  engine->name, request ? request->global_seqno : 0);
+
 	/*
 	 * RC6 must be prevented until the reset is complete and the engine
 	 * reinitialised. If it occurs in the middle of this sequence, the
@@ -595,6 +606,10 @@ static void reset_ring_common(struct intel_engine_cs *engine,
 	}
 }
 
+static void reset_finish(struct intel_engine_cs *engine)
+{
+}
+
 static int intel_rcs_ctx_init(struct i915_request *rq)
 {
 	int ret;
@@ -1987,7 +2002,9 @@ static void intel_ring_default_vfuncs(struct drm_i915_private *dev_priv,
 	intel_ring_init_semaphores(dev_priv, engine);
 
 	engine->init_hw = init_ring_common;
-	engine->reset_hw = reset_ring_common;
+	engine->reset.prepare = reset_prepare;
+	engine->reset.reset = reset_ring;
+	engine->reset.finish = reset_finish;
 
 	engine->context_pin = intel_ring_context_pin;
 	engine->context_unpin = intel_ring_context_unpin;
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index 256d58487559..62f9aeefae30 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -418,8 +418,13 @@ struct intel_engine_cs {
 	void		(*irq_disable)(struct intel_engine_cs *engine);
 
 	int		(*init_hw)(struct intel_engine_cs *engine);
-	void		(*reset_hw)(struct intel_engine_cs *engine,
-				    struct i915_request *rq);
+
+	struct {
+		struct i915_request *(*prepare)(struct intel_engine_cs *engine);
+		void (*reset)(struct intel_engine_cs *engine,
+			      struct i915_request *rq);
+		void (*finish)(struct intel_engine_cs *engine);
+	} reset;
 
 	void		(*park)(struct intel_engine_cs *engine);
 	void		(*unpark)(struct intel_engine_cs *engine);
-- 
2.17.0


* [PATCH 04/18] drm/i915: Split execlists/guc reset preparations
  2018-04-09 11:13 Still trying for context->preempt_timeout Chris Wilson
                   ` (2 preceding siblings ...)
  2018-04-09 11:13 ` [PATCH 03/18] drm/i915: Move engine reset prepare/finish to backends Chris Wilson
@ 2018-04-09 11:13 ` Chris Wilson
  2018-04-09 11:14 ` [PATCH 05/18] drm/i915/execlists: Flush pending preemption events during reset Chris Wilson
                   ` (16 subsequent siblings)
  20 siblings, 0 replies; 36+ messages in thread
From: Chris Wilson @ 2018-04-09 11:13 UTC (permalink / raw)
  To: intel-gfx

In the next patch, we will make the execlists reset prepare callback
take preemption into account by flushing the context-switch handler.
This is not applicable to the GuC submission backend, so split the two
into their own backend callbacks.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Michał Winiarski <michal.winiarski@intel.com>
Cc: Michel Thierry <michel.thierry@intel.com>
Cc: Jeff McGee <jeff.mcgee@intel.com>
Reviewed-by: Jeff McGee <jeff.mcgee@intel.com>
---
 drivers/gpu/drm/i915/intel_guc_submission.c | 41 +++++++++++++++++++++
 drivers/gpu/drm/i915/intel_lrc.c            | 11 +-----
 2 files changed, 42 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_guc_submission.c b/drivers/gpu/drm/i915/intel_guc_submission.c
index 04125e8cec6b..dc6782391f9f 100644
--- a/drivers/gpu/drm/i915/intel_guc_submission.c
+++ b/drivers/gpu/drm/i915/intel_guc_submission.c
@@ -797,6 +797,46 @@ static void guc_submission_tasklet(unsigned long data)
 		guc_dequeue(engine);
 }
 
+static struct i915_request *
+guc_reset_prepare(struct intel_engine_cs *engine)
+{
+	struct intel_engine_execlists * const execlists = &engine->execlists;
+
+	GEM_TRACE("%s\n", engine->name);
+
+	/*
+	 * Prevent request submission to the hardware until we have
+	 * completed the reset in i915_gem_reset_finish(). If a request
+	 * is completed by one engine, it may then queue a request
+	 * to a second via its execlists->tasklet *just* as we are
+	 * calling engine->init_hw() and also writing the ELSP.
+	 * Turning off the execlists->tasklet until the reset is over
+	 * prevents the race.
+	 *
+	 * Note that this needs to be a single atomic operation on the
+	 * tasklet (flush existing tasks, prevent new tasks) to prevent
+	 * a race between reset and set-wedged. It is not, so we do the best
+	 * we can atm and make sure we don't lock the machine up in the more
+	 * common case of recursively being called from set-wedged from inside
+	 * i915_reset.
+	 */
+	if (!atomic_read(&execlists->tasklet.count))
+		tasklet_kill(&execlists->tasklet);
+	tasklet_disable(&execlists->tasklet);
+
+	/*
+	 * We're using worker to queue preemption requests from the tasklet in
+	 * GuC submission mode.
+	 * Even though tasklet was disabled, we may still have a worker queued.
+	 * Let's make sure that all workers scheduled before disabling the
+	 * tasklet are completed before continuing with the reset.
+	 */
+	if (engine->i915->guc.preempt_wq)
+		flush_workqueue(engine->i915->guc.preempt_wq);
+
+	return i915_gem_find_active_request(engine);
+}
+
 /*
  * Everything below here is concerned with setup & teardown, and is
  * therefore not part of the somewhat time-critical batch-submission
@@ -1256,6 +1296,7 @@ int intel_guc_submission_enable(struct intel_guc *guc)
 			&engine->execlists;
 
 		execlists->tasklet.func = guc_submission_tasklet;
+		engine->reset.prepare = guc_reset_prepare;
 		engine->park = guc_submission_park;
 		engine->unpark = guc_submission_unpark;
 
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 0fe872f8b6a3..12f5abbadd3c 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -1768,16 +1768,6 @@ execlists_reset_prepare(struct intel_engine_cs *engine)
 		tasklet_kill(&execlists->tasklet);
 	tasklet_disable(&execlists->tasklet);
 
-	/*
-	 * We're using worker to queue preemption requests from the tasklet in
-	 * GuC submission mode.
-	 * Even though tasklet was disabled, we may still have a worker queued.
-	 * Let's make sure that all workers scheduled before disabling the
-	 * tasklet are completed before continuing with the reset.
-	 */
-	if (engine->i915->guc.preempt_wq)
-		flush_workqueue(engine->i915->guc.preempt_wq);
-
 	return i915_gem_find_active_request(engine);
 }
 
@@ -2167,6 +2157,7 @@ static void execlists_set_default_submission(struct intel_engine_cs *engine)
 	engine->cancel_requests = execlists_cancel_requests;
 	engine->schedule = execlists_schedule;
 	engine->execlists.tasklet.func = execlists_submission_tasklet;
+	engine->reset.prepare = execlists_reset_prepare;
 
 	engine->park = NULL;
 	engine->unpark = NULL;
-- 
2.17.0


* [PATCH 05/18] drm/i915/execlists: Flush pending preemption events during reset
  2018-04-09 11:13 Still trying for context->preempt_timeout Chris Wilson
                   ` (3 preceding siblings ...)
  2018-04-09 11:13 ` [PATCH 04/18] drm/i915: Split execlists/guc reset preparations Chris Wilson
@ 2018-04-09 11:14 ` Chris Wilson
  2018-04-09 11:14 ` [PATCH 06/18] drm/i915/breadcrumbs: Keep the fake irq armed across reset Chris Wilson
                   ` (15 subsequent siblings)
  20 siblings, 0 replies; 36+ messages in thread
From: Chris Wilson @ 2018-04-09 11:14 UTC (permalink / raw)
  To: intel-gfx

Catch up with the in-flight CSB events after disabling the tasklet, but
before deciding which request was truly guilty of hanging the GPU.

v2: Restore checking of csb_use_mmio on every loop; don't forget old
vgpu.
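
A condensed sketch of the ordering this establishes in
execlists_reset_prepare(), summarising the hunks below:

	tasklet_disable(&execlists->tasklet);	/* exclusive access to execlists state */
	if (test_bit(ENGINE_IRQ_EXECLIST, &engine->irq_posted))
		process_csb(engine);		/* drain in-flight preemption events */

	/*
	 * Then walk engine->timeline->requests backwards from the request
	 * in ELSP[0]: anything the GPU completed beyond the last CSB
	 * update is pardoned, and the oldest incomplete request is the
	 * one returned for blame.
	 */
	active = NULL;
	request = port_request(execlists->port);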

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Michał Winiarski <michal.winiarski@intel.com>
Cc: Michel Thierry <michel.thierry@intel.com>
Cc: Jeff McGee <jeff.mcgee@intel.com>
---
 drivers/gpu/drm/i915/intel_lrc.c | 127 +++++++++++++++++++++----------
 1 file changed, 87 insertions(+), 40 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 12f5abbadd3c..bbcc6439a2a1 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -890,34 +890,14 @@ static void execlists_cancel_requests(struct intel_engine_cs *engine)
 	local_irq_restore(flags);
 }
 
-/*
- * Check the unread Context Status Buffers and manage the submission of new
- * contexts to the ELSP accordingly.
- */
-static void execlists_submission_tasklet(unsigned long data)
+static void process_csb(struct intel_engine_cs *engine)
 {
-	struct intel_engine_cs * const engine = (struct intel_engine_cs *)data;
 	struct intel_engine_execlists * const execlists = &engine->execlists;
 	struct execlist_port *port = execlists->port;
-	struct drm_i915_private *dev_priv = engine->i915;
+	struct drm_i915_private *i915 = engine->i915;
 	bool fw = false;
 
-	/*
-	 * We can skip acquiring intel_runtime_pm_get() here as it was taken
-	 * on our behalf by the request (see i915_gem_mark_busy()) and it will
-	 * not be relinquished until the device is idle (see
-	 * i915_gem_idle_work_handler()). As a precaution, we make sure
-	 * that all ELSP are drained i.e. we have processed the CSB,
-	 * before allowing ourselves to idle and calling intel_runtime_pm_put().
-	 */
-	GEM_BUG_ON(!dev_priv->gt.awake);
-
-	/*
-	 * Prefer doing test_and_clear_bit() as a two stage operation to avoid
-	 * imposing the cost of a locked atomic transaction when submitting a
-	 * new request (outside of the context-switch interrupt).
-	 */
-	while (test_bit(ENGINE_IRQ_EXECLIST, &engine->irq_posted)) {
+	do {
 		/* The HWSP contains a (cacheable) mirror of the CSB */
 		const u32 *buf =
 			&engine->status_page.page_addr[I915_HWS_CSB_BUF0_INDEX];
@@ -925,28 +905,27 @@ static void execlists_submission_tasklet(unsigned long data)
 
 		if (unlikely(execlists->csb_use_mmio)) {
 			buf = (u32 * __force)
-				(dev_priv->regs + i915_mmio_reg_offset(RING_CONTEXT_STATUS_BUF_LO(engine, 0)));
-			execlists->csb_head = -1; /* force mmio read of CSB ptrs */
+				(i915->regs + i915_mmio_reg_offset(RING_CONTEXT_STATUS_BUF_LO(engine, 0)));
+			execlists->csb_head = -1; /* force mmio read of CSB */
 		}
 
 		/* Clear before reading to catch new interrupts */
 		clear_bit(ENGINE_IRQ_EXECLIST, &engine->irq_posted);
 		smp_mb__after_atomic();
 
-		if (unlikely(execlists->csb_head == -1)) { /* following a reset */
+		if (unlikely(execlists->csb_head == -1)) { /* after a reset */
 			if (!fw) {
-				intel_uncore_forcewake_get(dev_priv,
-							   execlists->fw_domains);
+				intel_uncore_forcewake_get(i915, execlists->fw_domains);
 				fw = true;
 			}
 
-			head = readl(dev_priv->regs + i915_mmio_reg_offset(RING_CONTEXT_STATUS_PTR(engine)));
+			head = readl(i915->regs + i915_mmio_reg_offset(RING_CONTEXT_STATUS_PTR(engine)));
 			tail = GEN8_CSB_WRITE_PTR(head);
 			head = GEN8_CSB_READ_PTR(head);
 			execlists->csb_head = head;
 		} else {
 			const int write_idx =
-				intel_hws_csb_write_index(dev_priv) -
+				intel_hws_csb_write_index(i915) -
 				I915_HWS_CSB_BUF0_INDEX;
 
 			head = execlists->csb_head;
@@ -954,8 +933,8 @@ static void execlists_submission_tasklet(unsigned long data)
 		}
 		GEM_TRACE("%s cs-irq head=%d [%d%s], tail=%d [%d%s]\n",
 			  engine->name,
-			  head, GEN8_CSB_READ_PTR(readl(dev_priv->regs + i915_mmio_reg_offset(RING_CONTEXT_STATUS_PTR(engine)))), fw ? "" : "?",
-			  tail, GEN8_CSB_WRITE_PTR(readl(dev_priv->regs + i915_mmio_reg_offset(RING_CONTEXT_STATUS_PTR(engine)))), fw ? "" : "?");
+			  head, GEN8_CSB_READ_PTR(readl(i915->regs + i915_mmio_reg_offset(RING_CONTEXT_STATUS_PTR(engine)))), fw ? "" : "?",
+			  tail, GEN8_CSB_WRITE_PTR(readl(i915->regs + i915_mmio_reg_offset(RING_CONTEXT_STATUS_PTR(engine)))), fw ? "" : "?");
 
 		while (head != tail) {
 			struct i915_request *rq;
@@ -965,7 +944,8 @@ static void execlists_submission_tasklet(unsigned long data)
 			if (++head == GEN8_CSB_ENTRIES)
 				head = 0;
 
-			/* We are flying near dragons again.
+			/*
+			 * We are flying near dragons again.
 			 *
 			 * We hold a reference to the request in execlist_port[]
 			 * but no more than that. We are operating in softirq
@@ -1072,15 +1052,48 @@ static void execlists_submission_tasklet(unsigned long data)
 		if (head != execlists->csb_head) {
 			execlists->csb_head = head;
 			writel(_MASKED_FIELD(GEN8_CSB_READ_PTR_MASK, head << 8),
-			       dev_priv->regs + i915_mmio_reg_offset(RING_CONTEXT_STATUS_PTR(engine)));
+			       i915->regs + i915_mmio_reg_offset(RING_CONTEXT_STATUS_PTR(engine)));
 		}
-	}
+	} while (test_bit(ENGINE_IRQ_EXECLIST, &engine->irq_posted));
 
-	if (!execlists_is_active(execlists, EXECLISTS_ACTIVE_PREEMPT))
-		execlists_dequeue(engine);
+	if (unlikely(fw))
+		intel_uncore_forcewake_put(i915, execlists->fw_domains);
+}
+
+/*
+ * Check the unread Context Status Buffers and manage the submission of new
+ * contexts to the ELSP accordingly.
+ */
+static void execlists_submission_tasklet(unsigned long data)
+{
+	struct intel_engine_cs * const engine = (struct intel_engine_cs *)data;
 
-	if (fw)
-		intel_uncore_forcewake_put(dev_priv, execlists->fw_domains);
+	GEM_TRACE("%s awake?=%d, active=%x, irq-posted?=%d\n",
+		  engine->name,
+		  engine->i915->gt.awake,
+		  engine->execlists.active,
+		  test_bit(ENGINE_IRQ_EXECLIST, &engine->irq_posted));
+
+	/*
+	 * We can skip acquiring intel_runtime_pm_get() here as it was taken
+	 * on our behalf by the request (see i915_gem_mark_busy()) and it will
+	 * not be relinquished until the device is idle (see
+	 * i915_gem_idle_work_handler()). As a precaution, we make sure
+	 * that all ELSP are drained i.e. we have processed the CSB,
+	 * before allowing ourselves to idle and calling intel_runtime_pm_put().
+	 */
+	GEM_BUG_ON(!engine->i915->gt.awake);
+
+	/*
+	 * Prefer doing test_and_clear_bit() as a two stage operation to avoid
+	 * imposing the cost of a locked atomic transaction when submitting a
+	 * new request (outside of the context-switch interrupt).
+	 */
+	if (test_bit(ENGINE_IRQ_EXECLIST, &engine->irq_posted))
+		process_csb(engine);
+
+	if (!execlists_is_active(&engine->execlists, EXECLISTS_ACTIVE_PREEMPT))
+		execlists_dequeue(engine);
 
 	/* If the engine is now idle, so should be the flag; and vice versa. */
 	GEM_BUG_ON(execlists_is_active(&engine->execlists,
@@ -1745,6 +1758,7 @@ static struct i915_request *
 execlists_reset_prepare(struct intel_engine_cs *engine)
 {
 	struct intel_engine_execlists * const execlists = &engine->execlists;
+	struct i915_request *request, *active;
 
 	GEM_TRACE("%s\n", engine->name);
 
@@ -1768,7 +1782,40 @@ execlists_reset_prepare(struct intel_engine_cs *engine)
 		tasklet_kill(&execlists->tasklet);
 	tasklet_disable(&execlists->tasklet);
 
-	return i915_gem_find_active_request(engine);
+	/*
+	 * We want to flush the pending context switches, having disabled
+	 * the tasklet above, we can assume exclusive access to the execlists.
+	 * For this allows us to catch up with an inflight preemption event,
+	 * and avoid blaming an innocent request if the stall was due to the
+	 * preemption itself.
+	 */
+	if (test_bit(ENGINE_IRQ_EXECLIST, &engine->irq_posted))
+		process_csb(engine);
+
+	/*
+	 * The last active request can then be no later than the last request
+	 * now in ELSP[0]. So search backwards from there, so that if the GPU
+	 * has advanced beyond the last CSB update, it will be pardoned.
+	 */
+	active = NULL;
+	request = port_request(execlists->port);
+	if (request) {
+		unsigned long flags;
+
+		spin_lock_irqsave(&engine->timeline->lock, flags);
+		list_for_each_entry_from_reverse(request,
+						 &engine->timeline->requests,
+						 link) {
+			if (__i915_request_completed(request,
+						     request->global_seqno))
+				break;
+
+			active = request;
+		}
+		spin_unlock_irqrestore(&engine->timeline->lock, flags);
+	}
+
+	return active;
 }
 
 static void execlists_reset(struct intel_engine_cs *engine,
-- 
2.17.0


* [PATCH 06/18] drm/i915/breadcrumbs: Keep the fake irq armed across reset
  2018-04-09 11:13 Still trying for context->preempt_timeout Chris Wilson
                   ` (4 preceding siblings ...)
  2018-04-09 11:14 ` [PATCH 05/18] drm/i915/execlists: Flush pending preemption events during reset Chris Wilson
@ 2018-04-09 11:14 ` Chris Wilson
  2018-04-23 13:03   ` Mika Kuoppala
  2018-04-09 11:14 ` [PATCH 07/18] drm/i915: Combine tasklet_kill and tasklet_disable Chris Wilson
                   ` (14 subsequent siblings)
  20 siblings, 1 reply; 36+ messages in thread
From: Chris Wilson @ 2018-04-09 11:14 UTC (permalink / raw)
  To: intel-gfx

Instead of synchronously cancelling the timer and re-enabling it inside
the reset callbacks, keep the timer enabled and let it die on its next
wakeup if no longer required. This allows
intel_engine_reset_breadcrumbs() to be used from an atomic
(timer/softirq) context, such as is required for resetting an engine.

It also allows us to react better to the user poking around debugfs for
testing missed irqs.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
---
 drivers/gpu/drm/i915/intel_breadcrumbs.c | 27 +++++++++++++++++-------
 1 file changed, 19 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_breadcrumbs.c b/drivers/gpu/drm/i915/intel_breadcrumbs.c
index 671a6d61e29d..0a2128c6b418 100644
--- a/drivers/gpu/drm/i915/intel_breadcrumbs.c
+++ b/drivers/gpu/drm/i915/intel_breadcrumbs.c
@@ -130,11 +130,12 @@ static void intel_breadcrumbs_hangcheck(struct timer_list *t)
 
 static void intel_breadcrumbs_fake_irq(struct timer_list *t)
 {
-	struct intel_engine_cs *engine = from_timer(engine, t,
-						    breadcrumbs.fake_irq);
+	struct intel_engine_cs *engine =
+		from_timer(engine, t, breadcrumbs.fake_irq);
 	struct intel_breadcrumbs *b = &engine->breadcrumbs;
 
-	/* The timer persists in case we cannot enable interrupts,
+	/*
+	 * The timer persists in case we cannot enable interrupts,
 	 * or if we have previously seen seqno/interrupt incoherency
 	 * ("missed interrupt" syndrome, better known as a "missed breadcrumb").
 	 * Here the worker will wake up every jiffie in order to kick the
@@ -148,6 +149,12 @@ static void intel_breadcrumbs_fake_irq(struct timer_list *t)
 	if (!b->irq_armed)
 		return;
 
+	/* If the user has disabled the fake-irq, restore the hangchecking */
+	if (!test_bit(engine->id, &engine->i915->gpu_error.missed_irq_rings)) {
+		mod_timer(&b->hangcheck, wait_timeout());
+		return;
+	}
+
 	mod_timer(&b->fake_irq, jiffies + 1);
 }
 
@@ -840,15 +847,22 @@ void intel_engine_reset_breadcrumbs(struct intel_engine_cs *engine)
 {
 	struct intel_breadcrumbs *b = &engine->breadcrumbs;
 
-	cancel_fake_irq(engine);
 	spin_lock_irq(&b->irq_lock);
 
+	/*
+	 * Leave the fake_irq timer enabled (if it is running), but clear the
+	 * bit so that it turns itself off on its next wake up and goes back
+	 * to the long hangcheck interval if still required.
+	 */
+	clear_bit(engine->id, &engine->i915->gpu_error.missed_irq_rings);
+
 	if (b->irq_enabled)
 		irq_enable(engine);
 	else
 		irq_disable(engine);
 
-	/* We set the IRQ_BREADCRUMB bit when we enable the irq presuming the
+	/*
+	 * We set the IRQ_BREADCRUMB bit when we enable the irq presuming the
 	 * GPU is active and may have already executed the MI_USER_INTERRUPT
 	 * before the CPU is ready to receive. However, the engine is currently
 	 * idle (we haven't started it yet), there is no possibility for a
@@ -857,9 +871,6 @@ void intel_engine_reset_breadcrumbs(struct intel_engine_cs *engine)
 	 */
 	clear_bit(ENGINE_IRQ_BREADCRUMB, &engine->irq_posted);
 
-	if (b->irq_armed)
-		enable_fake_irq(b);
-
 	spin_unlock_irq(&b->irq_lock);
 }
 
-- 
2.17.0


* [PATCH 07/18] drm/i915: Combine tasklet_kill and tasklet_disable
  2018-04-09 11:13 Still trying for context->preempt_timeout Chris Wilson
                   ` (5 preceding siblings ...)
  2018-04-09 11:14 ` [PATCH 06/18] drm/i915/breadcrumbs: Keep the fake irq armed across reset Chris Wilson
@ 2018-04-09 11:14 ` Chris Wilson
  2018-04-24 12:26   ` Mika Kuoppala
  2018-04-09 11:14 ` [PATCH 08/18] drm/i915: Stop parking the signaler around reset Chris Wilson
                   ` (13 subsequent siblings)
  20 siblings, 1 reply; 36+ messages in thread
From: Chris Wilson @ 2018-04-09 11:14 UTC (permalink / raw)
  To: intel-gfx

Ideally, we want to atomically flush and disable the tasklet before
resetting the GPU. At present, we rely on being the only party to touch
our tasklet and on the serialisation of the reset process to ensure that
we can suspend the tasklet from the mix of reset/wedge pathways. In this
patch, we move the tasklet abuse into its own function and tweak it such
that we only do a synchronous operation the first time it is disabled
around the reset. This allows us to avoid the sync inside a softirq
context in subsequent patches.
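
In other words, tasklet_disable() is open-coded so that only the first
disabler pays for synchronisation. An annotated restatement of the
helper added below (the comments are mine, not part of the patch):

	if (!atomic_read(&t->count))		/* not yet disabled: */
		tasklet_kill(t);		/* flush a queued-but-not-run tasklet */

	if (atomic_inc_return(&t->count) == 1)	/* only the first disable... */
		tasklet_unlock_wait(t);		/* ...waits for a concurrent run to finish */
	smp_mb();				/* order the disable against later reads */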

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Michał Winiarski <michal.winiarski@intel.com>
Cc: Michel Thierry <michel.thierry@intel.com>
Cc: Jeff McGee <jeff.mcgee@intel.com>
---
 drivers/gpu/drm/i915/intel_lrc.c | 14 +++++++++++---
 1 file changed, 11 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index bbcc6439a2a1..d5640f3d5276 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -1754,6 +1754,16 @@ static int gen9_init_render_ring(struct intel_engine_cs *engine)
 	return init_workarounds_ring(engine);
 }
 
+static void tasklet_kill_and_disable(struct tasklet_struct *t)
+{
+	if (!atomic_read(&t->count))
+		tasklet_kill(t);
+
+	if (atomic_inc_return(&t->count) == 1)
+		tasklet_unlock_wait(t);
+	smp_mb();
+}
+
 static struct i915_request *
 execlists_reset_prepare(struct intel_engine_cs *engine)
 {
@@ -1778,9 +1788,7 @@ execlists_reset_prepare(struct intel_engine_cs *engine)
 	 * common case of recursively being called from set-wedged from inside
 	 * i915_reset.
 	 */
-	if (!atomic_read(&execlists->tasklet.count))
-		tasklet_kill(&execlists->tasklet);
-	tasklet_disable(&execlists->tasklet);
+	tasklet_kill_and_disable(&execlists->tasklet);
 
 	/*
 	 * We want to flush the pending context switches, having disabled
-- 
2.17.0


* [PATCH 08/18] drm/i915: Stop parking the signaler around reset
  2018-04-09 11:13 Still trying for context->preempt_timeout Chris Wilson
                   ` (6 preceding siblings ...)
  2018-04-09 11:14 ` [PATCH 07/18] drm/i915: Combine tasklet_kill and tasklet_disable Chris Wilson
@ 2018-04-09 11:14 ` Chris Wilson
  2018-04-09 11:14 ` [PATCH 09/18] drm/i915: Be irqsafe inside reset Chris Wilson
                   ` (12 subsequent siblings)
  20 siblings, 0 replies; 36+ messages in thread
From: Chris Wilson @ 2018-04-09 11:14 UTC (permalink / raw)
  To: intel-gfx

We cannot call kthread_park() from softirq context, so let's avoid it
entirely during the reset. We wanted to suspend the signaler so that it
would not mark a request as complete at the same time as we marked it as
being in error. Instead of parking the signaler, stop the engine from
advancing so that the GPU doesn't emit the breadcrumb for our chosen
"guilty" request.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Michał Winiarski <michal.winiarski@intel.com>
Cc: Michel Thierry <michel.thierry@intel.com>
Cc: Jeff McGee <jeff.mcgee@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/i915_gem.c         | 14 --------------
 drivers/gpu/drm/i915/intel_lrc.c        | 21 +++++++++++++++++++++
 drivers/gpu/drm/i915/intel_ringbuffer.c | 18 ++++++++++++++++++
 3 files changed, 39 insertions(+), 14 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 61aca602c3a9..0a3f6a41525b 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -3008,18 +3008,6 @@ i915_gem_reset_prepare_engine(struct intel_engine_cs *engine)
 	 */
 	intel_uncore_forcewake_get(engine->i915, FORCEWAKE_ALL);
 
-	/*
-	 * Prevent the signaler thread from updating the request
-	 * state (by calling dma_fence_signal) as we are processing
-	 * the reset. The write from the GPU of the seqno is
-	 * asynchronous and the signaler thread may see a different
-	 * value to us and declare the request complete, even though
-	 * the reset routine have picked that request as the active
-	 * (incomplete) request. This conflict is not handled
-	 * gracefully!
-	 */
-	kthread_park(engine->breadcrumbs.signaler);
-
 	request = engine->reset.prepare(engine);
 	if (request && request->fence.error == -EIO)
 		request = ERR_PTR(-EIO); /* Previous reset failed! */
@@ -3229,8 +3217,6 @@ void i915_gem_reset_finish_engine(struct intel_engine_cs *engine)
 {
 	engine->reset.finish(engine);
 
-	kthread_unpark(engine->breadcrumbs.signaler);
-
 	intel_uncore_forcewake_put(engine->i915, FORCEWAKE_ALL);
 }
 
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index d5640f3d5276..bfafa2e4bbfc 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -1764,6 +1764,21 @@ static void tasklet_kill_and_disable(struct tasklet_struct *t)
 	smp_mb();
 }
 
+static void set_stop_engine(struct intel_engine_cs *engine)
+{
+	struct drm_i915_private *dev_priv = engine->i915;
+	const u32 base = engine->mmio_base;
+	const i915_reg_t mode = RING_MI_MODE(base);
+
+	GEM_TRACE("%s\n", engine->name);
+	I915_WRITE_FW(mode, _MASKED_BIT_ENABLE(STOP_RING));
+	if (__intel_wait_for_register_fw(dev_priv,
+					 mode, MODE_IDLE, MODE_IDLE,
+					 1000, 0,
+					 NULL))
+		GEM_TRACE("%s: timed out on STOP_RING -> IDLE\n", engine->name);
+}
+
 static struct i915_request *
 execlists_reset_prepare(struct intel_engine_cs *engine)
 {
@@ -1810,6 +1825,12 @@ execlists_reset_prepare(struct intel_engine_cs *engine)
 	if (request) {
 		unsigned long flags;
 
+		/*
+		 * Prevent the breadcrumb from advancing before we decide
+		 * which request is currently active.
+		 */
+		set_stop_engine(engine);
+
 		spin_lock_irqsave(&engine->timeline->lock, flags);
 		list_for_each_entry_from_reverse(request,
 						 &engine->timeline->requests,
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 5dadbc435c0e..226c00aecd8a 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -530,8 +530,26 @@ static int init_ring_common(struct intel_engine_cs *engine)
 	return ret;
 }
 
+static void set_stop_engine(struct intel_engine_cs *engine)
+{
+	struct drm_i915_private *dev_priv = engine->i915;
+	const u32 base = engine->mmio_base;
+	const i915_reg_t mode = RING_MI_MODE(base);
+
+	I915_WRITE_FW(mode, _MASKED_BIT_ENABLE(STOP_RING));
+	if (__intel_wait_for_register_fw(dev_priv,
+					 mode, MODE_IDLE, MODE_IDLE,
+					 1000, 0,
+					 NULL))
+		DRM_DEBUG_DRIVER("%s: timed out on STOP_RING\n",
+				 engine->name);
+}
+
 static struct i915_request *reset_prepare(struct intel_engine_cs *engine)
 {
+	if (INTEL_GEN(engine->i915) >= 3)
+		set_stop_engine(engine);
+
 	if (engine->irq_seqno_barrier)
 		engine->irq_seqno_barrier(engine);
 
-- 
2.17.0


* [PATCH 09/18] drm/i915: Be irqsafe inside reset
  2018-04-09 11:13 Still trying for context->preempt_timeout Chris Wilson
                   ` (7 preceding siblings ...)
  2018-04-09 11:14 ` [PATCH 08/18] drm/i915: Stop parking the signaler around reset Chris Wilson
@ 2018-04-09 11:14 ` Chris Wilson
  2018-04-09 11:14 ` [PATCH 10/18] drm/i915/execlists: Make submission tasklet hardirq safe Chris Wilson
                   ` (11 subsequent siblings)
  20 siblings, 0 replies; 36+ messages in thread
From: Chris Wilson @ 2018-04-09 11:14 UTC (permalink / raw)
  To: intel-gfx

As we want to be able to call i915_reset_engine() and co. from a softirq
or timer context, we need to be irqsafe at all times. So we have to forgo
the simple spin_lock_irq for the full spin_lock_irqsave.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/i915_gem.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 0a3f6a41525b..42a387ff0eaa 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -3130,15 +3130,17 @@ i915_gem_reset_request(struct intel_engine_cs *engine,
 		 */
 		request = i915_gem_find_active_request(engine);
 		if (request) {
+			unsigned long flags;
+
 			i915_gem_context_mark_innocent(request->ctx);
 			dma_fence_set_error(&request->fence, -EAGAIN);
 
 			/* Rewind the engine to replay the incomplete rq */
-			spin_lock_irq(&engine->timeline->lock);
+			spin_lock_irqsave(&engine->timeline->lock, flags);
 			request = list_prev_entry(request, link);
 			if (&request->link == &engine->timeline->requests)
 				request = NULL;
-			spin_unlock_irq(&engine->timeline->lock);
+			spin_unlock_irqrestore(&engine->timeline->lock, flags);
 		}
 	}
 
-- 
2.17.0


* [PATCH 10/18] drm/i915/execlists: Make submission tasklet hardirq safe
  2018-04-09 11:13 Still trying for context->preempt_timeout Chris Wilson
                   ` (8 preceding siblings ...)
  2018-04-09 11:14 ` [PATCH 09/18] drm/i915: Be irqsafe inside reset Chris Wilson
@ 2018-04-09 11:14 ` Chris Wilson
  2018-04-09 11:14 ` [PATCH 11/18] drm/i915/guc: " Chris Wilson
                   ` (10 subsequent siblings)
  20 siblings, 0 replies; 36+ messages in thread
From: Chris Wilson @ 2018-04-09 11:14 UTC (permalink / raw)
  To: intel-gfx

Prepare to allow the execlists submission to be run from underneath a
hardirq timer context (and not just the current softirq context) as is
required for fast preemption resets.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/intel_lrc.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index bfafa2e4bbfc..73fb941a675e 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -568,6 +568,7 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
 		&execlists->port[execlists->port_mask];
 	struct i915_request *last = port_request(port);
 	struct rb_node *rb;
+	unsigned long flags;
 	bool submit = false;
 
 	/* Hardware submission is through 2 ports. Conceptually each port
@@ -591,7 +592,7 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
 	 * and context switches) submission.
 	 */
 
-	spin_lock_irq(&engine->timeline->lock);
+	spin_lock_irqsave(&engine->timeline->lock, flags);
 	rb = execlists->first;
 	GEM_BUG_ON(rb_first(&execlists->queue) != rb);
 
@@ -733,7 +734,7 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
 	GEM_BUG_ON(execlists->first && !port_isset(execlists->port));
 
 unlock:
-	spin_unlock_irq(&engine->timeline->lock);
+	spin_unlock_irqrestore(&engine->timeline->lock, flags);
 
 	if (submit) {
 		execlists_user_begin(execlists, execlists->port);
-- 
2.17.0


* [PATCH 11/18] drm/i915/guc: Make submission tasklet hardirq safe
  2018-04-09 11:13 Still trying for context->preempt_timeout Chris Wilson
                   ` (9 preceding siblings ...)
  2018-04-09 11:14 ` [PATCH 10/18] drm/i915/execlists: Make submission tasklet hardirq safe Chris Wilson
@ 2018-04-09 11:14 ` Chris Wilson
  2018-04-09 11:14 ` [PATCH 12/18] drm/i915: Allow init_breadcrumbs to be used from irq context Chris Wilson
                   ` (9 subsequent siblings)
  20 siblings, 0 replies; 36+ messages in thread
From: Chris Wilson @ 2018-04-09 11:14 UTC (permalink / raw)
  To: intel-gfx

Prepare to allow the GuC submission to be run from underneath a
hardirq timer context (and not just the current softirq context) as is
required for fast preemption resets.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/intel_guc_submission.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_guc_submission.c b/drivers/gpu/drm/i915/intel_guc_submission.c
index dc6782391f9f..994082712181 100644
--- a/drivers/gpu/drm/i915/intel_guc_submission.c
+++ b/drivers/gpu/drm/i915/intel_guc_submission.c
@@ -689,10 +689,11 @@ static void guc_dequeue(struct intel_engine_cs *engine)
 	struct i915_request *last = NULL;
 	const struct execlist_port * const last_port =
 		&execlists->port[execlists->port_mask];
+	unsigned long flags;
 	bool submit = false;
 	struct rb_node *rb;
 
-	spin_lock_irq(&engine->timeline->lock);
+	spin_lock_irqsave(&engine->timeline->lock, flags);
 	rb = execlists->first;
 	GEM_BUG_ON(rb_first(&execlists->queue) != rb);
 
@@ -763,7 +764,7 @@ static void guc_dequeue(struct intel_engine_cs *engine)
 	GEM_BUG_ON(execlists->first && !port_isset(execlists->port));
 
 unlock:
-	spin_unlock_irq(&engine->timeline->lock);
+	spin_unlock_irqrestore(&engine->timeline->lock, flags);
 }
 
 static void guc_submission_tasklet(unsigned long data)
-- 
2.17.0


* [PATCH 12/18] drm/i915: Allow init_breadcrumbs to be used from irq context
  2018-04-09 11:13 Still trying for context->preempt_timeout Chris Wilson
                   ` (10 preceding siblings ...)
  2018-04-09 11:14 ` [PATCH 11/18] drm/i915/guc: " Chris Wilson
@ 2018-04-09 11:14 ` Chris Wilson
  2018-04-09 11:14 ` [PATCH 13/18] drm/i915: Compile out engine debug for release Chris Wilson
                   ` (8 subsequent siblings)
  20 siblings, 0 replies; 36+ messages in thread
From: Chris Wilson @ 2018-04-09 11:14 UTC (permalink / raw)
  To: intel-gfx

In order to support engine reset from irq (timer) context, we need to be
able to re-initialise the breadcrumbs. So we need to promote the plain
spin_lock_irq to a safe spin_lock_irqsave.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/intel_breadcrumbs.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_breadcrumbs.c b/drivers/gpu/drm/i915/intel_breadcrumbs.c
index 0a2128c6b418..ca0b04d9747c 100644
--- a/drivers/gpu/drm/i915/intel_breadcrumbs.c
+++ b/drivers/gpu/drm/i915/intel_breadcrumbs.c
@@ -846,8 +846,9 @@ static void cancel_fake_irq(struct intel_engine_cs *engine)
 void intel_engine_reset_breadcrumbs(struct intel_engine_cs *engine)
 {
 	struct intel_breadcrumbs *b = &engine->breadcrumbs;
+	unsigned long flags;
 
-	spin_lock_irq(&b->irq_lock);
+	spin_lock_irqsave(&b->irq_lock, flags);
 
 	/*
 	 * Leave the fake_irq timer enabled (if it is running), but clear the
@@ -871,7 +872,7 @@ void intel_engine_reset_breadcrumbs(struct intel_engine_cs *engine)
 	 */
 	clear_bit(ENGINE_IRQ_BREADCRUMB, &engine->irq_posted);
 
-	spin_unlock_irq(&b->irq_lock);
+	spin_unlock_irqrestore(&b->irq_lock, flags);
 }
 
 void intel_engine_fini_breadcrumbs(struct intel_engine_cs *engine)
-- 
2.17.0


* [PATCH 13/18] drm/i915: Compile out engine debug for release
  2018-04-09 11:13 Still trying for context->preempt_timeout Chris Wilson
                   ` (11 preceding siblings ...)
  2018-04-09 11:14 ` [PATCH 12/18] drm/i915: Allow init_breadcrumbs to be used from irq context Chris Wilson
@ 2018-04-09 11:14 ` Chris Wilson
  2018-04-26 10:15   ` Mika Kuoppala
  2018-04-09 11:14 ` [PATCH 14/18] drm/i915/execlists: Force preemption via reset on timeout Chris Wilson
                   ` (7 subsequent siblings)
  20 siblings, 1 reply; 36+ messages in thread
From: Chris Wilson @ 2018-04-09 11:14 UTC (permalink / raw)
  To: intel-gfx

The majority of the engine state dumping is too voluminous to be useful
outside of a controlled setup, though a few do accompany severe errors.
Keep the debug dumps next to the errors, but hide the others behind a CI
compile flag. This becomes more useful when adding more dumps to latency
sensitive paths.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/i915_gem.c          | 2 +-
 drivers/gpu/drm/i915/i915_gem.h          | 6 ++++++
 drivers/gpu/drm/i915/intel_breadcrumbs.c | 2 +-
 drivers/gpu/drm/i915/intel_hangcheck.c   | 2 +-
 4 files changed, 9 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 42a387ff0eaa..cec52bbd1b41 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -3267,7 +3267,7 @@ void i915_gem_set_wedged(struct drm_i915_private *i915)
 
 	GEM_TRACE("start\n");
 
-	if (drm_debug & DRM_UT_DRIVER) {
+	if (GEM_SHOW_DEBUG()) {
 		struct drm_printer p = drm_debug_printer(__func__);
 
 		for_each_engine(engine, i915, id)
diff --git a/drivers/gpu/drm/i915/i915_gem.h b/drivers/gpu/drm/i915/i915_gem.h
index deaf78d2ae8b..525920404ede 100644
--- a/drivers/gpu/drm/i915/i915_gem.h
+++ b/drivers/gpu/drm/i915/i915_gem.h
@@ -30,6 +30,9 @@
 struct drm_i915_private;
 
 #ifdef CONFIG_DRM_I915_DEBUG_GEM
+
+#define GEM_SHOW_DEBUG() (drm_debug & DRM_UT_DRIVER)
+
 #define GEM_BUG_ON(condition) do { if (unlikely((condition))) {	\
 		pr_err("%s:%d GEM_BUG_ON(%s)\n", \
 		       __func__, __LINE__, __stringify(condition)); \
@@ -45,6 +48,9 @@ struct drm_i915_private;
 #define GEM_DEBUG_BUG_ON(expr) GEM_BUG_ON(expr)
 
 #else
+
+#define GEM_SHOW_DEBUG() (0)
+
 #define GEM_BUG_ON(expr) BUILD_BUG_ON_INVALID(expr)
 #define GEM_WARN_ON(expr) (BUILD_BUG_ON_INVALID(expr), 0)
 
diff --git a/drivers/gpu/drm/i915/intel_breadcrumbs.c b/drivers/gpu/drm/i915/intel_breadcrumbs.c
index ca0b04d9747c..ad761b8d843d 100644
--- a/drivers/gpu/drm/i915/intel_breadcrumbs.c
+++ b/drivers/gpu/drm/i915/intel_breadcrumbs.c
@@ -82,7 +82,7 @@ static unsigned long wait_timeout(void)
 
 static noinline void missed_breadcrumb(struct intel_engine_cs *engine)
 {
-	if (drm_debug & DRM_UT_DRIVER) {
+	if (GEM_SHOW_DEBUG()) {
 		struct drm_printer p = drm_debug_printer(__func__);
 
 		intel_engine_dump(engine, &p,
diff --git a/drivers/gpu/drm/i915/intel_hangcheck.c b/drivers/gpu/drm/i915/intel_hangcheck.c
index fd0ffb8328d0..309e38b00e95 100644
--- a/drivers/gpu/drm/i915/intel_hangcheck.c
+++ b/drivers/gpu/drm/i915/intel_hangcheck.c
@@ -356,7 +356,7 @@ static void hangcheck_accumulate_sample(struct intel_engine_cs *engine,
 		break;
 
 	case ENGINE_DEAD:
-		if (drm_debug & DRM_UT_DRIVER) {
+		if (GEM_SHOW_DEBUG()) {
 			struct drm_printer p = drm_debug_printer("hangcheck");
 			intel_engine_dump(engine, &p, "%s\n", engine->name);
 		}
-- 
2.17.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH 14/18] drm/i915/execlists: Force preemption via reset on timeout
  2018-04-09 11:13 Still trying for context->preempt_timeout Chris Wilson
                   ` (12 preceding siblings ...)
  2018-04-09 11:14 ` [PATCH 13/18] drm/i915: Compile out engine debug for release Chris Wilson
@ 2018-04-09 11:14 ` Chris Wilson
  2018-04-09 11:14 ` [PATCH 15/18] drm/i915/execlists: Try preempt-reset from hardirq timer context Chris Wilson
                   ` (6 subsequent siblings)
  20 siblings, 0 replies; 36+ messages in thread
From: Chris Wilson @ 2018-04-09 11:14 UTC (permalink / raw)
  To: intel-gfx

Install a timer when trying to preempt on behalf of an important
context, so that if the active context does not honour the preemption
request within the desired timeout, we reset the GPU and allow the
important context to run.

v2: Install the timer on scheduling the preempt request; long before we
even try to inject preemption into the ELSP, as the tasklet/injection
may itself be blocked.
v3: Update the guc to handle the preemption/tasklet timer.
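
For reference, a minimal self-contained sketch of the hrtimer-plus-worker
pattern this patch builds on (generic kernel code with invented names;
the real hook points are in the diff below):

#include <linux/hrtimer.h>
#include <linux/kernel.h>
#include <linux/ktime.h>
#include <linux/workqueue.h>

struct preempt_watchdog {
	struct hrtimer timer;
	struct work_struct work;
};

static enum hrtimer_restart watchdog_fired(struct hrtimer *t)
{
	struct preempt_watchdog *w = container_of(t, typeof(*w), timer);

	/* Timer callbacks run in hardirq context; punt the heavy work. */
	queue_work(system_highpri_wq, &w->work);
	return HRTIMER_NORESTART;
}

static void watchdog_init(struct preempt_watchdog *w, work_func_t fn)
{
	INIT_WORK(&w->work, fn);
	hrtimer_init(&w->timer, CLOCK_MONOTONIC, HRTIMER_MODE_REL);
	w->timer.function = watchdog_fired;
}

static void watchdog_arm(struct preempt_watchdog *w, u64 timeout_ns)
{
	/* Cancel/rearm on each preemption request, as __submit_queue does. */
	hrtimer_start(&w->timer, ns_to_ktime(timeout_ns), HRTIMER_MODE_REL);
}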

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/intel_guc_submission.c |  1 +
 drivers/gpu/drm/i915/intel_lrc.c            | 88 +++++++++++++++++++--
 drivers/gpu/drm/i915/intel_ringbuffer.h     |  8 +-
 drivers/gpu/drm/i915/selftests/intel_lrc.c  | 60 ++++++++++++++
 4 files changed, 148 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_guc_submission.c b/drivers/gpu/drm/i915/intel_guc_submission.c
index 994082712181..5577d6f717e3 100644
--- a/drivers/gpu/drm/i915/intel_guc_submission.c
+++ b/drivers/gpu/drm/i915/intel_guc_submission.c
@@ -750,6 +750,7 @@ static void guc_dequeue(struct intel_engine_cs *engine)
 			kmem_cache_free(engine->i915->priorities, p);
 	}
 done:
+	execlists_clear_active(execlists, EXECLISTS_ACTIVE_PREEMPT_TIMEOUT);
 	execlists->queue_priority = rb ? to_priolist(rb)->priority : INT_MIN;
 	execlists->first = rb;
 	if (submit) {
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 73fb941a675e..ca1c54af2877 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -550,6 +550,52 @@ static void inject_preempt_context(struct intel_engine_cs *engine)
 	execlists_set_active(execlists, EXECLISTS_ACTIVE_PREEMPT);
 }
 
+static enum hrtimer_restart preempt_timeout(struct hrtimer *hrtimer)
+{
+	struct intel_engine_execlists *execlists =
+		container_of(hrtimer, typeof(*execlists), preempt_timer);
+
+	GEM_TRACE("%s\n",
+		  container_of(execlists,
+			       struct intel_engine_cs,
+			       execlists)->name);
+
+	if (!execlists_is_active(execlists, EXECLISTS_ACTIVE_PREEMPT_TIMEOUT))
+		return HRTIMER_NORESTART;
+
+	if (GEM_SHOW_DEBUG()) {
+		struct intel_engine_cs *engine =
+			container_of(execlists, typeof(*engine), execlists);
+		struct drm_printer p = drm_debug_printer(__func__);
+
+		intel_engine_dump( engine, &p, "%s\n", engine->name);
+	}
+
+	queue_work(system_highpri_wq, &execlists->preempt_reset);
+
+	return HRTIMER_NORESTART;
+}
+
+static void preempt_reset(struct work_struct *work)
+{
+	struct intel_engine_execlists *execlists =
+		container_of(work, typeof(*execlists), preempt_reset);
+	struct intel_engine_cs *engine =
+		  container_of(execlists, struct intel_engine_cs, execlists);
+
+	GEM_TRACE("%s\n", engine->name);
+
+	tasklet_disable(&execlists->tasklet);
+
+	execlists->tasklet.func(execlists->tasklet.data);
+
+	if (execlists_is_active(execlists, EXECLISTS_ACTIVE_PREEMPT_TIMEOUT))
+		i915_handle_error(engine->i915, BIT(engine->id), 0,
+				  "preemption time out on %s", engine->name);
+
+	tasklet_enable(&execlists->tasklet);
+}
+
 static void complete_preempt_context(struct intel_engine_execlists *execlists)
 {
 	GEM_BUG_ON(!execlists_is_active(execlists, EXECLISTS_ACTIVE_PREEMPT));
@@ -724,6 +770,7 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
 			kmem_cache_free(engine->i915->priorities, p);
 	}
 done:
+	execlists_clear_active(execlists, EXECLISTS_ACTIVE_PREEMPT_TIMEOUT);
 	execlists->queue_priority =
 		port != execlists->port ? rq_prio(last) : INT_MIN;
 	execlists->first = rb;
@@ -1109,16 +1156,38 @@ static void queue_request(struct intel_engine_cs *engine,
 	list_add_tail(&pt->link, &lookup_priolist(engine, pt, prio)->requests);
 }
 
-static void __submit_queue(struct intel_engine_cs *engine, int prio)
+static void __submit_queue(struct intel_engine_cs *engine,
+			   int prio, unsigned int timeout)
 {
-	engine->execlists.queue_priority = prio;
-	tasklet_hi_schedule(&engine->execlists.tasklet);
+	struct intel_engine_execlists * const execlists = &engine->execlists;
+	int old = execlists->queue_priority;
+
+	GEM_TRACE("%s prio=%d (previous=%d)\n", engine->name, prio, old);
+
+	if (unlikely(execlists_is_active(execlists,
+					 EXECLISTS_ACTIVE_PREEMPT_TIMEOUT)))
+		hrtimer_cancel(&execlists->preempt_timer);
+
+	execlists->queue_priority = prio;
+	tasklet_hi_schedule(&execlists->tasklet);
+
+	/* Set a timer to force preemption vs hostile userspace */
+	if (timeout && __execlists_need_preempt(prio, old)) {
+		GEM_TRACE("%s preempt timeout=%uns\n", engine->name, timeout);
+
+		execlists_set_active(execlists,
+				     EXECLISTS_ACTIVE_PREEMPT_TIMEOUT);
+		hrtimer_start(&execlists->preempt_timer,
+			      ns_to_ktime(timeout),
+			      HRTIMER_MODE_REL);
+	}
 }
 
-static void submit_queue(struct intel_engine_cs *engine, int prio)
+static void submit_queue(struct intel_engine_cs *engine,
+			 int prio, unsigned int timeout)
 {
 	if (prio > engine->execlists.queue_priority)
-		__submit_queue(engine, prio);
+		__submit_queue(engine, prio, timeout);
 }
 
 static void execlists_submit_request(struct i915_request *request)
@@ -1130,7 +1199,7 @@ static void execlists_submit_request(struct i915_request *request)
 	spin_lock_irqsave(&engine->timeline->lock, flags);
 
 	queue_request(engine, &request->priotree, rq_prio(request));
-	submit_queue(engine, rq_prio(request));
+	submit_queue(engine, rq_prio(request), 0);
 
 	GEM_BUG_ON(!engine->execlists.first);
 	GEM_BUG_ON(list_empty(&request->priotree.link));
@@ -1254,7 +1323,7 @@ static void execlists_schedule(struct i915_request *request, int prio)
 
 		if (prio > engine->execlists.queue_priority &&
 		    i915_sw_fence_done(&pt_to_request(pt)->submit))
-			__submit_queue(engine, prio);
+			__submit_queue(engine, prio, 0);
 	}
 
 	spin_unlock_irq(&engine->timeline->lock);
@@ -2334,6 +2403,11 @@ logical_ring_setup(struct intel_engine_cs *engine)
 	tasklet_init(&engine->execlists.tasklet,
 		     execlists_submission_tasklet, (unsigned long)engine);
 
+	INIT_WORK(&engine->execlists.preempt_reset, preempt_reset);
+	hrtimer_init(&engine->execlists.preempt_timer,
+		     CLOCK_MONOTONIC, HRTIMER_MODE_REL);
+	engine->execlists.preempt_timer.function = preempt_timeout;
+
 	logical_ring_default_vfuncs(engine);
 	logical_ring_default_irqs(engine);
 }
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index 62f9aeefae30..25147a877518 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -266,8 +266,9 @@ struct intel_engine_execlists {
 	 */
 	unsigned int active;
 #define EXECLISTS_ACTIVE_USER 0
-#define EXECLISTS_ACTIVE_PREEMPT 1
-#define EXECLISTS_ACTIVE_HWACK 2
+#define EXECLISTS_ACTIVE_HWACK 1
+#define EXECLISTS_ACTIVE_PREEMPT 2
+#define EXECLISTS_ACTIVE_PREEMPT_TIMEOUT 3
 
 	/**
 	 * @port_mask: number of execlist ports - 1
@@ -313,6 +314,9 @@ struct intel_engine_execlists {
 	 * @preempt_complete_status: expected CSB upon completing preemption
 	 */
 	u32 preempt_complete_status;
+
+	struct hrtimer preempt_timer;
+	struct work_struct preempt_reset;
 };
 
 #define INTEL_ENGINE_CS_MAX_NAME 8
diff --git a/drivers/gpu/drm/i915/selftests/intel_lrc.c b/drivers/gpu/drm/i915/selftests/intel_lrc.c
index 0481e2e01146..aed2502419ee 100644
--- a/drivers/gpu/drm/i915/selftests/intel_lrc.c
+++ b/drivers/gpu/drm/i915/selftests/intel_lrc.c
@@ -496,12 +496,72 @@ static int live_late_preempt(void *arg)
 	goto err_ctx_lo;
 }
 
+static int live_preempt_timeout(void *arg)
+{
+	struct drm_i915_private *i915 = arg;
+	struct intel_engine_cs *engine;
+	struct i915_gem_context *ctx;
+	enum intel_engine_id id;
+	struct spinner spin;
+	int err = -ENOMEM;
+
+	if (!HAS_LOGICAL_RING_PREEMPTION(i915))
+		return 0;
+
+	mutex_lock(&i915->drm.struct_mutex);
+
+	if (spinner_init(&spin, i915))
+		goto err_unlock;
+
+	ctx = kernel_context(i915);
+	if (!ctx)
+		goto err_spin;
+
+	for_each_engine(engine, i915, id) {
+		struct i915_request *rq;
+
+		rq = spinner_create_request(&spin, ctx, engine, MI_NOOP);
+		if (IS_ERR(rq)) {
+			err = PTR_ERR(rq);
+			goto err_ctx;
+		}
+
+		i915_request_add(rq);
+		if (!wait_for_spinner(&spin, rq)) {
+			i915_gem_set_wedged(i915);
+			err = -EIO;
+			goto err_ctx;
+		}
+
+		GEM_TRACE("%s triggering reset\n", engine->name);
+		execlists_set_active(&engine->execlists,
+				     EXECLISTS_ACTIVE_PREEMPT_TIMEOUT);
+		preempt_reset(&engine->execlists.preempt_reset);
+
+		if (flush_test(i915, I915_WAIT_LOCKED)) {
+			err = -EIO;
+			goto err_ctx;
+		}
+	}
+
+	err = 0;
+err_ctx:
+	kernel_context_close(ctx);
+err_spin:
+	spinner_fini(&spin);
+err_unlock:
+	flush_test(i915, I915_WAIT_LOCKED);
+	mutex_unlock(&i915->drm.struct_mutex);
+	return err;
+}
+
 int intel_execlists_live_selftests(struct drm_i915_private *i915)
 {
 	static const struct i915_subtest tests[] = {
 		SUBTEST(live_sanitycheck),
 		SUBTEST(live_preempt),
 		SUBTEST(live_late_preempt),
+		SUBTEST(live_preempt_timeout),
 	};
 	return i915_subtests(tests, i915);
 }
-- 
2.17.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH 15/18] drm/i915/execlists: Try preempt-reset from hardirq timer context
  2018-04-09 11:13 Still trying for context->preempt_timeout Chris Wilson
                   ` (13 preceding siblings ...)
  2018-04-09 11:14 ` [PATCH 14/18] drm/i915/execlists: Force preemption via reset on timeout Chris Wilson
@ 2018-04-09 11:14 ` Chris Wilson
  2018-04-09 11:14 ` [PATCH 16/18] drm/i915/preemption: Select timeout when scheduling Chris Wilson
                   ` (5 subsequent siblings)
  20 siblings, 0 replies; 36+ messages in thread
From: Chris Wilson @ 2018-04-09 11:14 UTC (permalink / raw)
  To: intel-gfx

When circumstances allow, try resetting the engine directly from the
preemption timeout handler. As this runs from hardirq (timer) context,
we have to be careful both not to sleep and not to spin on anything we
may be interrupting (e.g. the submission tasklet).
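
The crux is that the timer may fire on the very CPU that is midway
through the submission tasklet, so a blocking tasklet_disable() would
spin on ourselves forever. A rough sketch of the trylock-or-defer shape
this uses (illustrative names, not from the diff below):

#include <linux/interrupt.h>

static bool try_reset_from_timer(struct tasklet_struct *t,
				 void (*do_reset)(void))
{
	/* Fails if the tasklet is running, possibly beneath us. */
	if (!tasklet_trylock(t))
		return false;	/* caller defers to process context */

	t->func(t->data);	/* flush pending submission work ourselves */
	do_reset();		/* the tasklet cannot run concurrently now */

	tasklet_unlock(t);
	return true;
}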

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Michał Winiarski <michal.winiarski@intel.com>
CC: Michel Thierry <michel.thierry@intel.com>
Cc: Jeff McGee <jeff.mcgee@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/intel_lrc.c           |  35 +++++-
 drivers/gpu/drm/i915/selftests/intel_lrc.c | 123 +++++++++++++++++++++
 2 files changed, 157 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index ca1c54af2877..4f1e985b3cdb 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -550,6 +550,38 @@ static void inject_preempt_context(struct intel_engine_cs *engine)
 	execlists_set_active(execlists, EXECLISTS_ACTIVE_PREEMPT);
 }
 
+static int try_preempt_reset(struct intel_engine_execlists *execlists)
+{
+	int err = -EBUSY;
+
+	if (tasklet_trylock(&execlists->tasklet)) {
+		struct intel_engine_cs *engine =
+			container_of(execlists, typeof(*engine), execlists);
+		const unsigned int bit = I915_RESET_ENGINE + engine->id;
+		unsigned long *lock = &engine->i915->gpu_error.flags;
+
+		execlists->tasklet.func(execlists->tasklet.data);
+
+		if (!execlists_is_active(execlists,
+					 EXECLISTS_ACTIVE_PREEMPT_TIMEOUT)) {
+			/* Nothing to do; the tasklet was just delayed. */
+			err = 0;
+		} else if (!test_and_set_bit(bit, lock)) {
+			tasklet_disable_nosync(&execlists->tasklet);
+			err = i915_reset_engine(engine,
+						"preemption time out");
+			tasklet_enable(&execlists->tasklet);
+
+			clear_bit(bit, lock);
+			wake_up_bit(lock, bit);
+		}
+
+		tasklet_unlock(&execlists->tasklet);
+	}
+
+	return err;
+}
+
 static enum hrtimer_restart preempt_timeout(struct hrtimer *hrtimer)
 {
 	struct intel_engine_execlists *execlists =
@@ -571,7 +603,8 @@ static enum hrtimer_restart preempt_timeout(struct hrtimer *hrtimer)
 		intel_engine_dump( engine, &p, "%s\n", engine->name);
 	}
 
-	queue_work(system_highpri_wq, &execlists->preempt_reset);
+	if (try_preempt_reset(execlists))
+		queue_work(system_highpri_wq, &execlists->preempt_reset);
 
 	return HRTIMER_NORESTART;
 }
diff --git a/drivers/gpu/drm/i915/selftests/intel_lrc.c b/drivers/gpu/drm/i915/selftests/intel_lrc.c
index aed2502419ee..72d2770e5d71 100644
--- a/drivers/gpu/drm/i915/selftests/intel_lrc.c
+++ b/drivers/gpu/drm/i915/selftests/intel_lrc.c
@@ -555,6 +555,128 @@ static int live_preempt_timeout(void *arg)
 	return err;
 }
 
+static void __softirq_begin(void)
+{
+	local_bh_disable();
+}
+
+static void __softirq_end(void)
+{
+	local_bh_enable();
+}
+
+static void __hardirq_begin(void)
+{
+	local_irq_disable();
+}
+
+static void __hardirq_end(void)
+{
+	local_irq_enable();
+}
+
+static int live_preempt_reset(void *arg)
+{
+	struct drm_i915_private *i915 = arg;
+	struct intel_engine_cs *engine;
+	struct i915_gem_context *ctx;
+	enum intel_engine_id id;
+	struct spinner spin;
+	int err = -ENOMEM;
+
+	if (!HAS_LOGICAL_RING_PREEMPTION(i915))
+		return 0;
+
+	mutex_lock(&i915->drm.struct_mutex);
+
+	if (spinner_init(&spin, i915))
+		goto err_unlock;
+
+	ctx = kernel_context(i915);
+	if (!ctx)
+		goto err_spin;
+
+	for_each_engine(engine, i915, id) {
+		static const struct {
+			const char *name;
+			void (*critical_section_begin)(void);
+			void (*critical_section_end)(void);
+		} phases[] = {
+			{ "softirq", __softirq_begin, __softirq_end },
+			{ "hardirq", __hardirq_begin, __hardirq_end },
+			{ }
+		};
+		const typeof(*phases) *p;
+
+		for (p = phases; p->name; p++) {
+			struct i915_request *rq;
+
+			rq = spinner_create_request(&spin, ctx, engine,
+						    MI_NOOP);
+			if (IS_ERR(rq)) {
+				err = PTR_ERR(rq);
+				goto err_ctx;
+			}
+
+			i915_request_add(rq);
+			if (!wait_for_spinner(&spin, rq)) {
+				i915_gem_set_wedged(i915);
+				err = -EIO;
+				goto err_ctx;
+			}
+
+			/* Flush to give try_preempt_reset a chance */
+			do {
+				tasklet_schedule(&engine->execlists.tasklet);
+				usleep_range(100, 1000);
+				tasklet_kill(&engine->execlists.tasklet);
+			} while (test_bit(ENGINE_IRQ_EXECLIST,
+					  &engine->irq_posted));
+			GEM_BUG_ON(i915_request_completed(rq));
+
+			GEM_TRACE("%s triggering %s reset\n",
+				  engine->name, p->name);
+			p->critical_section_begin();
+
+			/* Trick execution of the tasklet from within reset */
+			set_bit(ENGINE_IRQ_EXECLIST, &engine->irq_posted);
+			execlists_set_active(&engine->execlists,
+					     EXECLISTS_ACTIVE_PREEMPT_TIMEOUT);
+
+			err = try_preempt_reset(&engine->execlists);
+
+			p->critical_section_end();
+			if (err) {
+				pr_err("Preempt softirq reset failed on %s, irq_posted? %d, tasklet state %lx\n",
+				       engine->name,
+				       test_bit(ENGINE_IRQ_EXECLIST,
+						&engine->irq_posted),
+				       engine->execlists.tasklet.state);
+				spinner_end(&spin);
+				i915_gem_set_wedged(i915);
+				goto err_ctx;
+			}
+			GEM_BUG_ON(test_bit(ENGINE_IRQ_EXECLIST,
+					    &engine->irq_posted));
+
+			if (flush_test(i915, I915_WAIT_LOCKED)) {
+				err = -EIO;
+				goto err_ctx;
+			}
+		}
+	}
+
+	err = 0;
+err_ctx:
+	kernel_context_close(ctx);
+err_spin:
+	spinner_fini(&spin);
+err_unlock:
+	flush_test(i915, I915_WAIT_LOCKED);
+	mutex_unlock(&i915->drm.struct_mutex);
+	return err;
+}
+
 int intel_execlists_live_selftests(struct drm_i915_private *i915)
 {
 	static const struct i915_subtest tests[] = {
@@ -562,6 +684,7 @@ int intel_execlists_live_selftests(struct drm_i915_private *i915)
 		SUBTEST(live_preempt),
 		SUBTEST(live_late_preempt),
 		SUBTEST(live_preempt_timeout),
+		SUBTEST(live_preempt_reset),
 	};
 	return i915_subtests(tests, i915);
 }
-- 
2.17.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH 16/18] drm/i915/preemption: Select timeout when scheduling
  2018-04-09 11:13 Still trying for context->preempt_timeout Chris Wilson
                   ` (14 preceding siblings ...)
  2018-04-09 11:14 ` [PATCH 15/18] drm/i915/execlists: Try preempt-reset from hardirq timer context Chris Wilson
@ 2018-04-09 11:14 ` Chris Wilson
  2018-04-09 11:14 ` [PATCH 17/18] drm/i915: Use a preemption timeout to enforce interactivity Chris Wilson
                   ` (4 subsequent siblings)
  20 siblings, 0 replies; 36+ messages in thread
From: Chris Wilson @ 2018-04-09 11:14 UTC (permalink / raw)
  To: intel-gfx

The choice of preemption timeout is determined by the context from which
we trigger the preemption; as such, allow the caller to specify the
desired timeout.

Effectively the other choice would be to use the shortest timeout along
the dependency chain. However, given that we would have already
triggered preemption for the dependency chain, we can assume that no
preemption along that chain is more important than the current request,
ergo we need only consider the current timeout. Realising this, we can
then pass control of the preemption timeout to the caller for greater
flexibility.
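
A hedged usage sketch (the timeout value here is invented): the third
parameter bounds only this preemption attempt, while dependencies bumped
along the chain inherit the priority but arm no timer of their own:

	rcu_read_lock();
	if (engine->schedule)
		engine->schedule(rq, I915_PRIORITY_MAX,
				 20 * 1000 * 1000); /* 20ms, in ns */
	rcu_read_unlock();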

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/i915_gem.c            |   2 +-
 drivers/gpu/drm/i915/i915_request.c        |   2 +-
 drivers/gpu/drm/i915/intel_lrc.c           |   5 +-
 drivers/gpu/drm/i915/intel_ringbuffer.h    |   6 +-
 drivers/gpu/drm/i915/selftests/intel_lrc.c | 106 ++++++++++++++++++++-
 5 files changed, 114 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index cec52bbd1b41..e8c1a077e223 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -576,7 +576,7 @@ static void __fence_set_priority(struct dma_fence *fence, int prio)
 
 	rcu_read_lock();
 	if (engine->schedule)
-		engine->schedule(rq, prio);
+		engine->schedule(rq, prio, 0);
 	rcu_read_unlock();
 }
 
diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
index 629f3e860592..ddffffb829ef 100644
--- a/drivers/gpu/drm/i915/i915_request.c
+++ b/drivers/gpu/drm/i915/i915_request.c
@@ -1062,7 +1062,7 @@ void __i915_request_add(struct i915_request *request, bool flush_caches)
 	 */
 	rcu_read_lock();
 	if (engine->schedule)
-		engine->schedule(request, request->ctx->priority);
+		engine->schedule(request, request->ctx->priority, 0);
 	rcu_read_unlock();
 
 	local_bh_disable();
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 4f1e985b3cdb..877340afb63d 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -1260,7 +1260,8 @@ pt_lock_engine(struct i915_priotree *pt, struct intel_engine_cs *locked)
 	return engine;
 }
 
-static void execlists_schedule(struct i915_request *request, int prio)
+static void execlists_schedule(struct i915_request *request,
+			       int prio, unsigned int timeout)
 {
 	struct intel_engine_cs *engine;
 	struct i915_dependency *dep, *p;
@@ -1356,7 +1357,7 @@ static void execlists_schedule(struct i915_request *request, int prio)
 
 		if (prio > engine->execlists.queue_priority &&
 		    i915_sw_fence_done(&pt_to_request(pt)->submit))
-			__submit_queue(engine, prio, 0);
+			__submit_queue(engine, prio, timeout);
 	}
 
 	spin_unlock_irq(&engine->timeline->lock);
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index 25147a877518..04d25d1d4c2f 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -463,13 +463,15 @@ struct intel_engine_cs {
 	 */
 	void		(*submit_request)(struct i915_request *rq);
 
-	/* Call when the priority on a request has changed and it and its
+	/*
+	 * Call when the priority on a request has changed and it and its
 	 * dependencies may need rescheduling. Note the request itself may
 	 * not be ready to run!
 	 *
 	 * Called under the struct_mutex.
 	 */
-	void		(*schedule)(struct i915_request *request, int priority);
+	void		(*schedule)(struct i915_request *request,
+				    int priority, unsigned int timeout);
 
 	/*
 	 * Cancel all requests on the hardware, or queued for execution.
diff --git a/drivers/gpu/drm/i915/selftests/intel_lrc.c b/drivers/gpu/drm/i915/selftests/intel_lrc.c
index 72d2770e5d71..5e4d6d07fff5 100644
--- a/drivers/gpu/drm/i915/selftests/intel_lrc.c
+++ b/drivers/gpu/drm/i915/selftests/intel_lrc.c
@@ -458,7 +458,7 @@ static int live_late_preempt(void *arg)
 			goto err_wedged;
 		}
 
-		engine->schedule(rq, I915_PRIORITY_MAX);
+		engine->schedule(rq, I915_PRIORITY_MAX, 0);
 
 		if (!wait_for_spinner(&spin_hi, rq)) {
 			pr_err("High priority context failed to preempt the low priority context\n");
@@ -677,6 +677,109 @@ static int live_preempt_reset(void *arg)
 	return err;
 }
 
+static int live_late_preempt_timeout(void *arg)
+{
+	struct drm_i915_private *i915 = arg;
+	struct i915_gem_context *ctx_hi, *ctx_lo;
+	struct spinner spin_hi, spin_lo;
+	struct intel_engine_cs *engine;
+	enum intel_engine_id id;
+	int err = -ENOMEM;
+
+	if (!HAS_LOGICAL_RING_PREEMPTION(i915))
+		return 0;
+
+	mutex_lock(&i915->drm.struct_mutex);
+
+	if (spinner_init(&spin_hi, i915))
+		goto err_unlock;
+
+	if (spinner_init(&spin_lo, i915))
+		goto err_spin_hi;
+
+	ctx_hi = kernel_context(i915);
+	if (!ctx_hi)
+		goto err_spin_lo;
+
+	ctx_lo = kernel_context(i915);
+	if (!ctx_lo)
+		goto err_ctx_hi;
+
+	for_each_engine(engine, i915, id) {
+		struct i915_request *rq;
+
+		rq = spinner_create_request(&spin_lo, ctx_lo, engine, MI_NOOP);
+		if (IS_ERR(rq)) {
+			err = PTR_ERR(rq);
+			goto err_ctx_lo;
+		}
+
+		i915_request_add(rq);
+		if (!wait_for_spinner(&spin_lo, rq)) {
+			pr_err("First context failed to start\n");
+			goto err_wedged;
+		}
+
+		rq = spinner_create_request(&spin_hi, ctx_hi, engine, MI_NOOP);
+		if (IS_ERR(rq)) {
+			spinner_end(&spin_lo);
+			err = PTR_ERR(rq);
+			goto err_ctx_lo;
+		}
+
+		i915_request_add(rq);
+		if (wait_for_spinner(&spin_hi, rq)) {
+			pr_err("Second context overtook first?\n");
+			goto err_wedged;
+		}
+
+		GEM_TRACE("%s rescheduling (no timeout)\n", engine->name);
+		engine->schedule(rq, 1, 0);
+
+		if (wait_for_spinner(&spin_hi, rq)) {
+			pr_err("High priority context overtook first without an arbitration point?\n");
+			goto err_wedged;
+		}
+
+		GEM_TRACE("%s rescheduling (with timeout)\n", engine->name);
+		engine->schedule(rq, 2, 10 * 1000 /* 10us */);
+
+		if (!wait_for_spinner(&spin_hi, rq)) {
+			pr_err("High priority context failed to force itself in front of the low priority context\n");
+			GEM_TRACE_DUMP();
+			goto err_wedged;
+		}
+
+		spinner_end(&spin_hi);
+		spinner_end(&spin_lo);
+		if (flush_test(i915, I915_WAIT_LOCKED)) {
+			err = -EIO;
+			goto err_ctx_lo;
+		}
+	}
+
+	err = 0;
+err_ctx_lo:
+	kernel_context_close(ctx_lo);
+err_ctx_hi:
+	kernel_context_close(ctx_hi);
+err_spin_lo:
+	spinner_fini(&spin_lo);
+err_spin_hi:
+	spinner_fini(&spin_hi);
+err_unlock:
+	flush_test(i915, I915_WAIT_LOCKED);
+	mutex_unlock(&i915->drm.struct_mutex);
+	return err;
+
+err_wedged:
+	spinner_end(&spin_hi);
+	spinner_end(&spin_lo);
+	i915_gem_set_wedged(i915);
+	err = -EIO;
+	goto err_ctx_lo;
+}
+
 int intel_execlists_live_selftests(struct drm_i915_private *i915)
 {
 	static const struct i915_subtest tests[] = {
@@ -685,6 +788,7 @@ int intel_execlists_live_selftests(struct drm_i915_private *i915)
 		SUBTEST(live_late_preempt),
 		SUBTEST(live_preempt_timeout),
 		SUBTEST(live_preempt_reset),
+		SUBTEST(live_late_preempt_timeout),
 	};
 	return i915_subtests(tests, i915);
 }
-- 
2.17.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH 17/18] drm/i915: Use a preemption timeout to enforce interactivity
  2018-04-09 11:13 Still trying for context->preempt_timeout Chris Wilson
                   ` (15 preceding siblings ...)
  2018-04-09 11:14 ` [PATCH 16/18] drm/i915/preemption: Select timeout when scheduling Chris Wilson
@ 2018-04-09 11:14 ` Chris Wilson
  2018-04-09 11:14 ` [PATCH 18/18] drm/i915: Allow user control over preempt timeout on their important context Chris Wilson
                   ` (3 subsequent siblings)
  20 siblings, 0 replies; 36+ messages in thread
From: Chris Wilson @ 2018-04-09 11:14 UTC (permalink / raw)
  To: intel-gfx

Use a liberal timeout of 100ms to ensure that the rendering for an
interactive pageflip is started in a timely fashion, and that user
interaction is not blocked by GPU or CPU hogs. This comes at the cost of
resetting whoever was blocking the preemption, likely leading to that
context/process being banned from submitting future requests.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/i915_drv.h      |  3 ++-
 drivers/gpu/drm/i915/i915_gem.c      | 18 ++++++++++--------
 drivers/gpu/drm/i915/intel_display.c | 18 +++++++++++++++++-
 3 files changed, 29 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 9bca104c409e..732d375ab330 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -3155,8 +3155,9 @@ int i915_gem_object_wait(struct drm_i915_gem_object *obj,
 			 struct intel_rps_client *rps);
 int i915_gem_object_wait_priority(struct drm_i915_gem_object *obj,
 				  unsigned int flags,
-				  int priority);
+				  int priority, unsigned int timeout);
 #define I915_PRIORITY_DISPLAY I915_PRIORITY_MAX
+#define I915_PREEMPTION_TIMEOUT_DISPLAY (100 * 1000 * 1000) /* 100 ms / 10Hz */
 
 int __must_check
 i915_gem_object_set_to_wc_domain(struct drm_i915_gem_object *obj, bool write);
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index e8c1a077e223..38d8f6aebfbb 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -563,7 +563,8 @@ i915_gem_object_wait_reservation(struct reservation_object *resv,
 	return timeout;
 }
 
-static void __fence_set_priority(struct dma_fence *fence, int prio)
+static void __fence_set_priority(struct dma_fence *fence,
+				 int prio, unsigned int timeout)
 {
 	struct i915_request *rq;
 	struct intel_engine_cs *engine;
@@ -576,11 +577,12 @@ static void __fence_set_priority(struct dma_fence *fence, int prio)
 
 	rcu_read_lock();
 	if (engine->schedule)
-		engine->schedule(rq, prio, 0);
+		engine->schedule(rq, prio, timeout);
 	rcu_read_unlock();
 }
 
-static void fence_set_priority(struct dma_fence *fence, int prio)
+static void fence_set_priority(struct dma_fence *fence,
+			       int prio, unsigned int timeout)
 {
 	/* Recurse once into a fence-array */
 	if (dma_fence_is_array(fence)) {
@@ -588,16 +590,16 @@ static void fence_set_priority(struct dma_fence *fence, int prio)
 		int i;
 
 		for (i = 0; i < array->num_fences; i++)
-			__fence_set_priority(array->fences[i], prio);
+			__fence_set_priority(array->fences[i], prio, timeout);
 	} else {
-		__fence_set_priority(fence, prio);
+		__fence_set_priority(fence, prio, timeout);
 	}
 }
 
 int
 i915_gem_object_wait_priority(struct drm_i915_gem_object *obj,
 			      unsigned int flags,
-			      int prio)
+			      int prio, unsigned int timeout)
 {
 	struct dma_fence *excl;
 
@@ -612,7 +614,7 @@ i915_gem_object_wait_priority(struct drm_i915_gem_object *obj,
 			return ret;
 
 		for (i = 0; i < count; i++) {
-			fence_set_priority(shared[i], prio);
+			fence_set_priority(shared[i], prio, timeout);
 			dma_fence_put(shared[i]);
 		}
 
@@ -622,7 +624,7 @@ i915_gem_object_wait_priority(struct drm_i915_gem_object *obj,
 	}
 
 	if (excl) {
-		fence_set_priority(excl, prio);
+		fence_set_priority(excl, prio, timeout);
 		dma_fence_put(excl);
 	}
 	return 0;
diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index 9c6156216e5e..ba850b59530f 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -12785,7 +12785,23 @@ intel_prepare_plane_fb(struct drm_plane *plane,
 
 	ret = intel_plane_pin_fb(to_intel_plane_state(new_state));
 
-	i915_gem_object_wait_priority(obj, 0, I915_PRIORITY_DISPLAY);
+	/*
+	 * Reschedule our dependencies, and ensure we run within a timeout.
+	 *
+	 * Note that if the timeout is exceeded, then whatever was running and
+	 * preventing us from acquiring the GPU is declared rogue and reset. An
+	 * unresponsive process will then be banned in order to preserve
+	 * interactivity. Since this can be seen as a bit heavy-handed, we
+	 * select a timeout for when the dropped frames start to become a
+	 * noticeable nuisance for the user (100 ms, i.e. preemption was
+	 * blocked for more than a few frames). Note, this is only a timeout
+	 * for a delay in preempting the current request in order to run our
+	 * dependency chain, our dependency chain may itself take a long time
+	 * to run to completion before we can present the framebuffer.
+	 */
+	i915_gem_object_wait_priority(obj, 0,
+				      I915_PRIORITY_DISPLAY,
+				      I915_PREEMPTION_TIMEOUT_DISPLAY);
 
 	mutex_unlock(&dev_priv->drm.struct_mutex);
 	i915_gem_object_unpin_pages(obj);
-- 
2.17.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH 18/18] drm/i915: Allow user control over preempt timeout on their important context
  2018-04-09 11:13 Still trying for context->preempt_timeout Chris Wilson
                   ` (16 preceding siblings ...)
  2018-04-09 11:14 ` [PATCH 17/18] drm/i915: Use a preemption timeout to enforce interactivity Chris Wilson
@ 2018-04-09 11:14 ` Chris Wilson
  2018-04-09 11:29 ` ✗ Fi.CI.CHECKPATCH: warning for series starting with [01/18] drm/i915/execlists: Set queue priority from secondary port Patchwork
                   ` (2 subsequent siblings)
  20 siblings, 0 replies; 36+ messages in thread
From: Chris Wilson @ 2018-04-09 11:14 UTC (permalink / raw)
  To: intel-gfx

One use case would be to couple this in via
EGL_NV_context_priority_realtime in userspace to provide some QoS
guarantees in conjunction with setting the highest priority.
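
As a hedged illustration of the new uapi (introduced by this patch, so
subject to change before landing), a privileged client could opt in with
something like:

#include <stdint.h>
#include <sys/ioctl.h>
#include <xf86drm.h>
#include <i915_drm.h>

/* Ask for a QoS guarantee: anyone blocking our preemption for longer
 * than timeout_ns is reset. Non-zero values require CAP_SYS_ADMIN.
 */
static int set_preempt_timeout(int fd, uint32_t ctx_id, uint64_t timeout_ns)
{
	struct drm_i915_gem_context_param p = {
		.ctx_id = ctx_id,
		.size = 0,	/* must be zero for this param */
		.param = I915_CONTEXT_PARAM_PREEMPT_TIMEOUT,
		.value = timeout_ns,	/* e.g. 1000000 for 1ms */
	};

	return drmIoctl(fd, DRM_IOCTL_I915_GEM_CONTEXT_SETPARAM, &p);
}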

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/i915_gem_context.c    | 22 ++++++
 drivers/gpu/drm/i915/i915_gem_context.h    | 13 ++++
 drivers/gpu/drm/i915/i915_request.c        |  8 +-
 drivers/gpu/drm/i915/intel_lrc.c           |  2 +-
 drivers/gpu/drm/i915/selftests/intel_lrc.c | 85 ++++++++++++++++++++++
 include/uapi/drm/i915_drm.h                | 12 +++
 6 files changed, 139 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index 5cfac0255758..932ca1082b26 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -749,6 +749,15 @@ int i915_gem_context_getparam_ioctl(struct drm_device *dev, void *data,
 	case I915_CONTEXT_PARAM_PRIORITY:
 		args->value = ctx->priority;
 		break;
+	case I915_CONTEXT_PARAM_PREEMPT_TIMEOUT:
+		if (!(to_i915(dev)->caps.scheduler & I915_SCHEDULER_CAP_PREEMPTION))
+			ret = -ENODEV;
+		else if (args->size)
+			ret = -EINVAL;
+		else
+			args->value = ctx->preempt_timeout;
+		break;
+
 	default:
 		ret = -EINVAL;
 		break;
@@ -824,6 +833,19 @@ int i915_gem_context_setparam_ioctl(struct drm_device *dev, void *data,
 		}
 		break;
 
+	case I915_CONTEXT_PARAM_PREEMPT_TIMEOUT:
+		if (args->size)
+			ret = -EINVAL;
+		else if (args->value > U32_MAX)
+			ret = -EINVAL;
+		else if (!(to_i915(dev)->caps.scheduler & I915_SCHEDULER_CAP_PREEMPTION))
+			ret = -ENODEV;
+		else if (args->value && !capable(CAP_SYS_ADMIN))
+			ret = -EPERM;
+		else
+			ctx->preempt_timeout = args->value;
+		break;
+
 	default:
 		ret = -EINVAL;
 		break;
diff --git a/drivers/gpu/drm/i915/i915_gem_context.h b/drivers/gpu/drm/i915/i915_gem_context.h
index 7854262ddfd9..1fc7181edd2d 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.h
+++ b/drivers/gpu/drm/i915/i915_gem_context.h
@@ -150,6 +150,19 @@ struct i915_gem_context {
 	 */
 	int priority;
 
+	/**
+	 * @preempt_timeout: QoS guarantee for the high priority context
+	 *
+	 * Some clients need a guarantee that they will start executing
+	 * within a certain window, even at the expense of others. This entails
+	 * that if a preemption request is not honoured by the active context
+	 * within the timeout, we will reset the GPU to evict the hog and
+	 * run the high priority context instead.
+	 *
+	 * Timeout is stored in nanoseconds.
+	 */
+	u32 preempt_timeout;
+
 	/** ggtt_offset_bias: placement restriction for context objects */
 	u32 ggtt_offset_bias;
 
diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
index ddffffb829ef..b9dfd49a173e 100644
--- a/drivers/gpu/drm/i915/i915_request.c
+++ b/drivers/gpu/drm/i915/i915_request.c
@@ -1061,8 +1061,12 @@ void __i915_request_add(struct i915_request *request, bool flush_caches)
 	 * run at the earliest possible convenience.
 	 */
 	rcu_read_lock();
-	if (engine->schedule)
-		engine->schedule(request, request->ctx->priority, 0);
+	if (engine->schedule) {
+		unsigned int timeout = request->ctx->preempt_timeout;
+		int priority = request->ctx->priority;
+
+		engine->schedule(request, priority, timeout);
+	}
 	rcu_read_unlock();
 
 	local_bh_disable();
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 877340afb63d..18cc83520e95 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -1232,7 +1232,7 @@ static void execlists_submit_request(struct i915_request *request)
 	spin_lock_irqsave(&engine->timeline->lock, flags);
 
 	queue_request(engine, &request->priotree, rq_prio(request));
-	submit_queue(engine, rq_prio(request), 0);
+	submit_queue(engine, rq_prio(request), request->ctx->preempt_timeout);
 
 	GEM_BUG_ON(!engine->execlists.first);
 	GEM_BUG_ON(list_empty(&request->priotree.link));
diff --git a/drivers/gpu/drm/i915/selftests/intel_lrc.c b/drivers/gpu/drm/i915/selftests/intel_lrc.c
index 5e4d6d07fff5..4fac9c552595 100644
--- a/drivers/gpu/drm/i915/selftests/intel_lrc.c
+++ b/drivers/gpu/drm/i915/selftests/intel_lrc.c
@@ -780,6 +780,90 @@ static int live_late_preempt_timeout(void *arg)
 	goto err_ctx_lo;
 }
 
+static int live_context_preempt_timeout(void *arg)
+{
+	struct drm_i915_private *i915 = arg;
+	struct i915_gem_context *ctx_hi, *ctx_lo;
+	struct spinner spin_hi, spin_lo;
+	struct intel_engine_cs *engine;
+	enum intel_engine_id id;
+	int err = -ENOMEM;
+
+	if (!HAS_LOGICAL_RING_PREEMPTION(i915))
+		return 0;
+
+	mutex_lock(&i915->drm.struct_mutex);
+
+	if (spinner_init(&spin_hi, i915))
+		goto err_unlock;
+
+	if (spinner_init(&spin_lo, i915))
+		goto err_spin_hi;
+
+	ctx_hi = kernel_context(i915);
+	if (!ctx_hi)
+		goto err_spin_lo;
+	ctx_hi->priority = I915_CONTEXT_MAX_USER_PRIORITY;
+	ctx_hi->preempt_timeout = 50 * 1000; /* 50us */
+
+	ctx_lo = kernel_context(i915);
+	if (!ctx_lo)
+		goto err_ctx_hi;
+	ctx_lo->priority = I915_CONTEXT_MIN_USER_PRIORITY;
+
+	for_each_engine(engine, i915, id) {
+		struct i915_request *rq;
+
+		rq = spinner_create_request(&spin_lo, ctx_lo, engine, MI_NOOP);
+		if (IS_ERR(rq)) {
+			err = PTR_ERR(rq);
+			goto err_ctx_lo;
+		}
+
+		i915_request_add(rq);
+		if (!wait_for_spinner(&spin_lo, rq)) {
+			i915_gem_set_wedged(i915);
+			err = -EIO;
+			goto err_ctx_lo;
+		}
+
+		rq = spinner_create_request(&spin_hi, ctx_hi, engine, MI_NOOP);
+		if (IS_ERR(rq)) {
+			spinner_end(&spin_lo);
+			err = PTR_ERR(rq);
+			goto err_ctx_lo;
+		}
+
+		i915_request_add(rq);
+		if (!wait_for_spinner(&spin_hi, rq)) {
+			i915_gem_set_wedged(i915);
+			err = -EIO;
+			goto err_ctx_lo;
+		}
+
+		spinner_end(&spin_hi);
+		spinner_end(&spin_lo);
+		if (flush_test(i915, I915_WAIT_LOCKED)) {
+			err = -EIO;
+			goto err_ctx_lo;
+		}
+	}
+
+	err = 0;
+err_ctx_lo:
+	kernel_context_close(ctx_lo);
+err_ctx_hi:
+	kernel_context_close(ctx_hi);
+err_spin_lo:
+	spinner_fini(&spin_lo);
+err_spin_hi:
+	spinner_fini(&spin_hi);
+err_unlock:
+	flush_test(i915, I915_WAIT_LOCKED);
+	mutex_unlock(&i915->drm.struct_mutex);
+	return err;
+}
+
 int intel_execlists_live_selftests(struct drm_i915_private *i915)
 {
 	static const struct i915_subtest tests[] = {
@@ -789,6 +873,7 @@ int intel_execlists_live_selftests(struct drm_i915_private *i915)
 		SUBTEST(live_preempt_timeout),
 		SUBTEST(live_preempt_reset),
 		SUBTEST(live_late_preempt_timeout),
+		SUBTEST(live_context_preempt_timeout),
 	};
 	return i915_subtests(tests, i915);
 }
diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
index 7f5634ce8e88..853e0c7e0e85 100644
--- a/include/uapi/drm/i915_drm.h
+++ b/include/uapi/drm/i915_drm.h
@@ -1456,6 +1456,18 @@ struct drm_i915_gem_context_param {
 #define   I915_CONTEXT_MAX_USER_PRIORITY	1023 /* inclusive */
 #define   I915_CONTEXT_DEFAULT_PRIORITY		0
 #define   I915_CONTEXT_MIN_USER_PRIORITY	-1023 /* inclusive */
+
+/*
+ * I915_CONTEXT_PARAM_PREEMPT_TIMEOUT:
+ *
+ * Preemption timeout given in nanoseconds.
+ *
+ * Only allowed for privileged clients (CAP_SYS_ADMIN), this property allows
+ * the preempting context to kick out a GPU hog using a GPU reset if it does
+ * not honour our preemption request in time.
+ */
+#define I915_CONTEXT_PARAM_PREEMPT_TIMEOUT	0x7
+
 	__u64 value;
 };
 
-- 
2.17.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* ✗ Fi.CI.CHECKPATCH: warning for series starting with [01/18] drm/i915/execlists: Set queue priority from secondary port
  2018-04-09 11:13 Still trying for context->preempt_timeout Chris Wilson
                   ` (17 preceding siblings ...)
  2018-04-09 11:14 ` [PATCH 18/18] drm/i915: Allow user control over preempt timeout on their important context Chris Wilson
@ 2018-04-09 11:29 ` Patchwork
  2018-04-09 11:44 ` ✓ Fi.CI.BAT: success " Patchwork
  2018-04-09 13:08 ` ✗ Fi.CI.IGT: failure " Patchwork
  20 siblings, 0 replies; 36+ messages in thread
From: Patchwork @ 2018-04-09 11:29 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

== Series Details ==

Series: series starting with [01/18] drm/i915/execlists: Set queue priority from secondary port
URL   : https://patchwork.freedesktop.org/series/41357/
State : warning

== Summary ==

$ dim checkpatch origin/drm-tip
fe34e7d49f28 drm/i915/execlists: Set queue priority from secondary port
-:25: WARNING:COMMIT_LOG_LONG_LINE: Possible unwrapped commit description (prefer a maximum 75 chars per line)
#25: 
References: f6322eddaff7 ("drm/i915/preemption: Allow preemption between submission ports")

-:25: ERROR:GIT_COMMIT_ID: Please use git commit description style 'commit <12+ chars of sha1> ("<title line>")' - ie: 'commit f6322eddaff7 ("drm/i915/preemption: Allow preemption between submission ports")'
#25: 
References: f6322eddaff7 ("drm/i915/preemption: Allow preemption between submission ports")

total: 1 errors, 1 warnings, 0 checks, 9 lines checked
7c05a3c646d6 drm/i915/execlists: Refactor out complete_preempt_context()
b7ecf299a4f9 drm/i915: Move engine reset prepare/finish to backends
2b9aff94cf13 drm/i915: Split execlists/guc reset preparations
69d92c7b2ef7 drm/i915/execlists: Flush pending preemption events during reset
-:69: WARNING:LONG_LINE: line over 100 characters
#69: FILE: drivers/gpu/drm/i915/intel_lrc.c:908:
+				(i915->regs + i915_mmio_reg_offset(RING_CONTEXT_STATUS_BUF_LO(engine, 0)));

-:87: WARNING:LONG_LINE: line over 100 characters
#87: FILE: drivers/gpu/drm/i915/intel_lrc.c:922:
+			head = readl(i915->regs + i915_mmio_reg_offset(RING_CONTEXT_STATUS_PTR(engine)));

-:104: WARNING:LONG_LINE: line over 100 characters
#104: FILE: drivers/gpu/drm/i915/intel_lrc.c:936:
+			  head, GEN8_CSB_READ_PTR(readl(i915->regs + i915_mmio_reg_offset(RING_CONTEXT_STATUS_PTR(engine)))), fw ? "" : "?",

-:105: WARNING:LONG_LINE: line over 100 characters
#105: FILE: drivers/gpu/drm/i915/intel_lrc.c:937:
+			  tail, GEN8_CSB_WRITE_PTR(readl(i915->regs + i915_mmio_reg_offset(RING_CONTEXT_STATUS_PTR(engine)))), fw ? "" : "?");

total: 0 errors, 4 warnings, 0 checks, 192 lines checked
8584674c645e drm/i915/breadcrumbs: Keep the fake irq armed across reset
5333a7b17913 drm/i915: Combine tasklet_kill and tasklet_disable
-:39: WARNING:MEMORY_BARRIER: memory barrier without comment
#39: FILE: drivers/gpu/drm/i915/intel_lrc.c:1764:
+	smp_mb();

total: 0 errors, 1 warnings, 0 checks, 26 lines checked
69c64b744f61 drm/i915: Stop parking the signaler around reset
3b598171e856 drm/i915: Be irqsafe inside reset
26a3324ce9c0 drm/i915/execlists: Make submission tasklet hardirq safe
277f79ddc081 drm/i915/guc: Make submission tasklet hardirq safe
c3667f5a4ad6 drm/i915: Allow init_breadcrumbs to be used from irq context
c9c18eae9f33 drm/i915: Compile out engine debug for release
68bac469bf86 drm/i915/execlists: Force preemption via reset on timeout
-:56: ERROR:SPACING: space prohibited after that open parenthesis '('
#56: FILE: drivers/gpu/drm/i915/intel_lrc.c:571:
+		intel_engine_dump( engine, &p, "%s\n", engine->name);

total: 1 errors, 0 warnings, 0 checks, 228 lines checked
df4b8d8c35f6 drm/i915/execlists: Try preempt-reset from hardirq timer context
d8144187ca0e drm/i915/preemption: Select timeout when scheduling
0aeab6cbe712 drm/i915: Use a preemption timeout to enforce interactivity
cd1c93c17baf drm/i915: Allow user control over preempt timeout on their important context

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 36+ messages in thread

* ✓ Fi.CI.BAT: success for series starting with [01/18] drm/i915/execlists: Set queue priority from secondary port
  2018-04-09 11:13 Still trying for context->preempt_timeout Chris Wilson
                   ` (18 preceding siblings ...)
  2018-04-09 11:29 ` ✗ Fi.CI.CHECKPATCH: warning for series starting with [01/18] drm/i915/execlists: Set queue priority from secondary port Patchwork
@ 2018-04-09 11:44 ` Patchwork
  2018-04-09 13:08 ` ✗ Fi.CI.IGT: failure " Patchwork
  20 siblings, 0 replies; 36+ messages in thread
From: Patchwork @ 2018-04-09 11:44 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

== Series Details ==

Series: series starting with [01/18] drm/i915/execlists: Set queue priority from secondary port
URL   : https://patchwork.freedesktop.org/series/41357/
State : success

== Summary ==

Series 41357v1 series starting with [01/18] drm/i915/execlists: Set queue priority from secondary port
https://patchwork.freedesktop.org/api/1.0/series/41357/revisions/1/mbox/

---- Known issues:

Test debugfs_test:
        Subgroup read_all_entries:
                incomplete -> PASS       (fi-snb-2520m) fdo#103713
Test gem_ctx_param:
        Subgroup basic-default:
                incomplete -> PASS       (fi-cnl-y3) fdo#105086
Test kms_chamelium:
        Subgroup dp-edid-read:
                fail       -> PASS       (fi-kbl-7500u) fdo#102505
Test kms_flip:
        Subgroup basic-flip-vs-wf_vblank:
                fail       -> PASS       (fi-cfl-s3) fdo#100368
Test kms_pipe_crc_basic:
        Subgroup read-crc-pipe-b-frame-sequence:
                pass       -> FAIL       (fi-skl-6700k2) fdo#103191
        Subgroup suspend-read-crc-pipe-c:
                pass       -> INCOMPLETE (fi-bxt-dsi) fdo#103927
Test prime_vgem:
        Subgroup basic-fence-flip:
                pass       -> FAIL       (fi-ilk-650) fdo#104008

fdo#103713 https://bugs.freedesktop.org/show_bug.cgi?id=103713
fdo#105086 https://bugs.freedesktop.org/show_bug.cgi?id=105086
fdo#102505 https://bugs.freedesktop.org/show_bug.cgi?id=102505
fdo#100368 https://bugs.freedesktop.org/show_bug.cgi?id=100368
fdo#103191 https://bugs.freedesktop.org/show_bug.cgi?id=103191
fdo#103927 https://bugs.freedesktop.org/show_bug.cgi?id=103927
fdo#104008 https://bugs.freedesktop.org/show_bug.cgi?id=104008

fi-bdw-5557u     total:285  pass:264  dwarn:0   dfail:0   fail:0   skip:21  time:430s
fi-bdw-gvtdvm    total:285  pass:261  dwarn:0   dfail:0   fail:0   skip:24  time:445s
fi-blb-e6850     total:285  pass:220  dwarn:1   dfail:0   fail:0   skip:64  time:385s
fi-bsw-n3050     total:285  pass:239  dwarn:0   dfail:0   fail:0   skip:46  time:546s
fi-bwr-2160      total:285  pass:180  dwarn:0   dfail:0   fail:0   skip:105 time:297s
fi-bxt-dsi       total:243  pass:216  dwarn:0   dfail:0   fail:0   skip:26 
fi-bxt-j4205     total:285  pass:256  dwarn:0   dfail:0   fail:0   skip:29  time:515s
fi-byt-j1900     total:285  pass:250  dwarn:0   dfail:0   fail:0   skip:35  time:524s
fi-byt-n2820     total:285  pass:246  dwarn:0   dfail:0   fail:0   skip:39  time:513s
fi-cfl-8700k     total:285  pass:257  dwarn:0   dfail:0   fail:0   skip:28  time:413s
fi-cfl-s3        total:285  pass:259  dwarn:0   dfail:0   fail:0   skip:26  time:559s
fi-cfl-u         total:285  pass:259  dwarn:0   dfail:0   fail:0   skip:26  time:512s
fi-cnl-y3        total:285  pass:259  dwarn:0   dfail:0   fail:0   skip:26  time:584s
fi-elk-e7500     total:285  pass:226  dwarn:0   dfail:0   fail:0   skip:59  time:422s
fi-gdg-551       total:285  pass:176  dwarn:0   dfail:0   fail:1   skip:108 time:314s
fi-glk-1         total:285  pass:257  dwarn:0   dfail:0   fail:0   skip:28  time:534s
fi-glk-j4005     total:285  pass:256  dwarn:0   dfail:0   fail:0   skip:29  time:491s
fi-hsw-4770      total:285  pass:258  dwarn:0   dfail:0   fail:0   skip:27  time:409s
fi-ilk-650       total:285  pass:224  dwarn:0   dfail:0   fail:1   skip:60  time:423s
fi-ivb-3520m     total:285  pass:256  dwarn:0   dfail:0   fail:0   skip:29  time:469s
fi-ivb-3770      total:285  pass:252  dwarn:0   dfail:0   fail:0   skip:33  time:434s
fi-kbl-7500u     total:285  pass:260  dwarn:1   dfail:0   fail:0   skip:24  time:472s
fi-kbl-7567u     total:285  pass:265  dwarn:0   dfail:0   fail:0   skip:20  time:465s
fi-kbl-r         total:285  pass:258  dwarn:0   dfail:0   fail:0   skip:27  time:510s
fi-pnv-d510      total:285  pass:220  dwarn:1   dfail:0   fail:0   skip:64  time:672s
fi-skl-6260u     total:285  pass:265  dwarn:0   dfail:0   fail:0   skip:20  time:440s
fi-skl-6600u     total:285  pass:258  dwarn:0   dfail:0   fail:0   skip:27  time:539s
fi-skl-6700k2    total:285  pass:260  dwarn:0   dfail:0   fail:1   skip:24  time:511s
fi-skl-6770hq    total:285  pass:265  dwarn:0   dfail:0   fail:0   skip:20  time:497s
fi-skl-guc       total:285  pass:257  dwarn:0   dfail:0   fail:0   skip:28  time:427s
fi-skl-gvtdvm    total:285  pass:262  dwarn:0   dfail:0   fail:0   skip:23  time:446s
fi-snb-2520m     total:285  pass:245  dwarn:0   dfail:0   fail:0   skip:40  time:555s
fi-snb-2600      total:285  pass:245  dwarn:0   dfail:0   fail:0   skip:40  time:395s

1be073153147c5c39cdcbdfdeb4e2595ba595bf7 drm-tip: 2018y-04m-07d-22h-26m-31s UTC integration manifest
cd1c93c17baf drm/i915: Allow user control over preempt timeout on their important context
0aeab6cbe712 drm/i915: Use a preemption timeout to enforce interactivity
d8144187ca0e drm/i915/preemption: Select timeout when scheduling
df4b8d8c35f6 drm/i915/execlists: Try preempt-reset from hardirq timer context
68bac469bf86 drm/i915/execlists: Force preemption via reset on timeout
c9c18eae9f33 drm/i915: Compile out engine debug for release
c3667f5a4ad6 drm/i915: Allow init_breadcrumbs to be used from irq context
277f79ddc081 drm/i915/guc: Make submission tasklet hardirq safe
26a3324ce9c0 drm/i915/execlists: Make submission tasklet hardirq safe
3b598171e856 drm/i915: Be irqsafe inside reset
69c64b744f61 drm/i915: Stop parking the signaler around reset
5333a7b17913 drm/i915: Combine tasklet_kill and tasklet_disable
8584674c645e drm/i915/breadcrumbs: Keep the fake irq armed across reset
69d92c7b2ef7 drm/i915/execlists: Flush pending preemption events during reset
2b9aff94cf13 drm/i915: Split execlists/guc reset preparations
b7ecf299a4f9 drm/i915: Move engine reset prepare/finish to backends
7c05a3c646d6 drm/i915/execlists: Refactor out complete_preempt_context()
fe34e7d49f28 drm/i915/execlists: Set queue priority from secondary port

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_8638/issues.html
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 36+ messages in thread

* ✗ Fi.CI.IGT: failure for series starting with [01/18] drm/i915/execlists: Set queue priority from secondary port
  2018-04-09 11:13 Still trying for context->preempt_timeout Chris Wilson
                   ` (19 preceding siblings ...)
  2018-04-09 11:44 ` ✓ Fi.CI.BAT: success " Patchwork
@ 2018-04-09 13:08 ` Patchwork
  20 siblings, 0 replies; 36+ messages in thread
From: Patchwork @ 2018-04-09 13:08 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

== Series Details ==

Series: series starting with [01/18] drm/i915/execlists: Set queue priority from secondary port
URL   : https://patchwork.freedesktop.org/series/41357/
State : failure

== Summary ==

---- Possible new issues:

Test gem_ctx_param:
        Subgroup invalid-param-get:
                pass       -> FAIL       (shard-apl)
                pass       -> FAIL       (shard-hsw)
        Subgroup invalid-param-set:
                pass       -> FAIL       (shard-apl)
                pass       -> FAIL       (shard-hsw)
Test gem_pwrite:
        Subgroup big-gtt-backwards:
                skip       -> PASS       (shard-apl)

---- Known issues:

Test kms_flip:
        Subgroup 2x-flip-vs-expired-vblank-interruptible:
                fail       -> PASS       (shard-hsw) fdo#102887 +1
        Subgroup plain-flip-fb-recreate-interruptible:
                pass       -> FAIL       (shard-hsw) fdo#100368 +2

fdo#102887 https://bugs.freedesktop.org/show_bug.cgi?id=102887
fdo#100368 https://bugs.freedesktop.org/show_bug.cgi?id=100368

shard-apl        total:2680 pass:1833 dwarn:1   dfail:1   fail:9   skip:836 time:12708s
shard-hsw        total:2680 pass:1781 dwarn:1   dfail:0   fail:6   skip:891 time:11341s
Blacklisted hosts:
shard-kbl        total:2680 pass:1960 dwarn:1   dfail:1   fail:9   skip:709 time:9215s
shard-snb        total:2680 pass:1375 dwarn:1   dfail:0   fail:5   skip:1299 time:6916s

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_8638/shards.html
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 01/18] drm/i915/execlists: Set queue priority from secondary port
  2018-04-09 11:13 ` [PATCH 01/18] drm/i915/execlists: Set queue priority from secondary port Chris Wilson
@ 2018-04-10 14:05   ` Tvrtko Ursulin
       [not found]     ` <152337025293.3167.10189866034675290387@mail.alporthouse.com>
  0 siblings, 1 reply; 36+ messages in thread
From: Tvrtko Ursulin @ 2018-04-10 14:05 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx


On 09/04/2018 12:13, Chris Wilson wrote:
> We can refine our current execlists->queue_priority if we inspect
> ELSP[1] rather than the head of the unsubmitted queue. Currently, we use
> the unsubmitted queue and say that if a subsequent request is more than
> important than the current queue, we will rerun the submission tasklet

s/more than important/more important/

> to evaluate the need for preemption. However, we only want to preempt if
> we need to jump ahead of a currently executing request in ELSP. The
> second reason for running the submission tasklet is amalgamate requests
> into the active context on ELSP[0] to avoid a stall when ELSP[0] drains.
> (Though repeatedly amalgamating requests into the active context and
> triggering many lite-restore is off question gain, the goal really is to
> put a context into ELSP[1] to cover the interrupt.) So if instead of
> looking at the head of the queue, we look at the context in ELSP[1] we
> can answer both of the questions more accurately -- we don't need to
> rerun the submission tasklet unless our new request is important enough
> to feed into, at least, ELSP[1].
> 
> References: f6322eddaff7 ("drm/i915/preemption: Allow preemption between submission ports")
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Michał Winiarski <michal.winiarski@intel.com>
> Cc: Michel Thierry <michel.thierry@intel.com>
> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> ---
>   drivers/gpu/drm/i915/intel_lrc.c | 3 ++-
>   1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
> index 3592288e4696..b47d53d284ca 100644
> --- a/drivers/gpu/drm/i915/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/intel_lrc.c
> @@ -713,7 +713,8 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
>   			kmem_cache_free(engine->i915->priorities, p);
>   	}
>   done:
> -	execlists->queue_priority = rb ? to_priolist(rb)->priority : INT_MIN;
> +	execlists->queue_priority =
> +		port != execlists->port ? rq_prio(last) : INT_MIN;

Please bear with my questions since I am not 100% up to date with 
preemption.

Should this be rq_prio("port[1]") for future proofing? Or you really 
mean last port? In which case it wouldn't be the highest pending 
priority as kerneldoc for execlists->queue_priority says.

So this patch changes the meaning of "pending". From pending == "not 
submitted to ELSP" to pending == "not submitted to ELSP[0]". Which seems 
to make sense, although it is not the easiest job to figure out the 
consequences.

It even feels like a bugfix since it prevents tasklet scheduling when 
all ports are filled with higher priority requests than the new one.

Although I failed to understand what we do in both cases if a new 
request arrives with higher prio than the one in ELSP[1]. Looks like 
nothing? Wait until the GPU moves it to ELSP[0] and preempt then? Is 
this because we can't do so safely, or did I misread something?

Also, don't you need to manage execlists->queue_priority after CSB 
processing now? So that it correctly reflects the priority of the request 
in ELSP[1] after ELSP[0] completes? It seems that without this it would get 
stuck at the previous value and then submission could decide to skip 
scheduling the tasklet if new priority is lower than what was in ELSP[1] 
before, and so would fail to fill ELSP[1].

>   	execlists->first = rb;
>   	if (submit)
>   		port_assign(port, last);
> 

Regards,

Tvrtko

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 01/18] drm/i915/execlists: Set queue priority from secondary port
       [not found]     ` <152337025293.3167.10189866034675290387@mail.alporthouse.com>
@ 2018-04-10 14:42       ` Tvrtko Ursulin
  2018-04-10 14:56         ` Chris Wilson
  0 siblings, 1 reply; 36+ messages in thread
From: Tvrtko Ursulin @ 2018-04-10 14:42 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx


On 10/04/2018 15:24, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2018-04-10 15:05:33)
>>
>> On 09/04/2018 12:13, Chris Wilson wrote:
>>> We can refine our current execlists->queue_priority if we inspect
>>> ELSP[1] rather than the head of the unsubmitted queue. Currently, we use
>>> the unsubmitted queue and say that if a subsequent request is more than
>>> important than the current queue, we will rerun the submission tasklet
>>
>> s/more than important/more important/
>>
>>> to evaluate the need for preemption. However, we only want to preempt if
>>> we need to jump ahead of a currently executing request in ELSP. The
>>> second reason for running the submission tasklet is amalgamate requests
>>> into the active context on ELSP[0] to avoid a stall when ELSP[0] drains.
>>> (Though repeatedly amalgamating requests into the active context and
>>> triggering many lite-restore is off question gain, the goal really is to
>>> put a context into ELSP[1] to cover the interrupt.) So if instead of
>>> looking at the head of the queue, we look at the context in ELSP[1] we
>>> can answer both of the questions more accurately -- we don't need to
>>> rerun the submission tasklet unless our new request is important enough
>>> to feed into, at least, ELSP[1].
>>>
>>> References: f6322eddaff7 ("drm/i915/preemption: Allow preemption between submission ports")
>>> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
>>> Cc: Michał Winiarski <michal.winiarski@intel.com>
>>> Cc: Michel Thierry <michel.thierry@intel.com>
>>> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
>>> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>> ---
>>>    drivers/gpu/drm/i915/intel_lrc.c | 3 ++-
>>>    1 file changed, 2 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
>>> index 3592288e4696..b47d53d284ca 100644
>>> --- a/drivers/gpu/drm/i915/intel_lrc.c
>>> +++ b/drivers/gpu/drm/i915/intel_lrc.c
>>> @@ -713,7 +713,8 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
>>>                        kmem_cache_free(engine->i915->priorities, p);
>>>        }
>>>    done:
>>> -     execlists->queue_priority = rb ? to_priolist(rb)->priority : INT_MIN;
>>> +     execlists->queue_priority =
>>> +             port != execlists->port ? rq_prio(last) : INT_MIN;
>>
>> Please bear with my questions since I am not 100% up to date with
>> preemption.
>>
>> Should this be rq_prio("port[1]") for future proofing? Or you really
>> mean last port? In which case it wouldn't be the highest pending
>> priority as kerneldoc for execlists->queue_priority says.
> 
> I mean "secondary" port, so yes using last executing port under the
> assumption that we grow into a ring of many useless submission ports.
> The kerneldoc is no more or no less accurate. :)

"That we _don't_ grow"?

> 
>> So this patch changes the meaning of "pending". From pending == "not
>> submitted to ELSP" to pending == "not submitted to ELSP[0]". Which seems
>> to make sense, although it is not the easiest job to figure out the
>> consequences.
> 
> Yes.
>   
>> It even feels like a bugfix since it prevents tasklet scheduling when
>> all ports are filled with higher priority requests than the new one.
> 
> It won't fix any bugs, since we just reduce the number of kicks. Kicking
> the tasklet only to find we have nothing to do is just wasted work. So yes I do
> agree that it is a bug fix, just not enough to merit a Fixes.

Yeah that's fine.

>   
>> Although I failed to understand what do we do in both cases if a new
>> request arrives of higher prio than the one in ELSP[1]. Looks like
>> nothing? Wait until GPU moves it to ELSP[0] and preempt then? Is this so
>> because we can't safely or I misread something?
> 
> This is covered by igt/gem_exec_schedule/preempt-queue*. If we receive a
> request higher than ELSP[1], we start a preemption as
> 
> 	if (need_preempt(engine, last, execlists->queue_priority)) {
> 
> will evaluate to true. It's either the lowest executing priority (new
> code), or lowest pending priority (old code). In either case, if the new
> request is more important than the queue_priority, we preempt.
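
(For anyone following along, need_preempt() here is roughly the below --
reconstructed from memory of f6322eddaff7, so treat the exact guards as
a sketch:)

	static inline bool need_preempt(const struct intel_engine_cs *engine,
					const struct i915_request *last,
					int prio)
	{
		/* never preempt on behalf of below-default priorities */
		return engine->i915->preempt_context &&
		       prio > max(rq_prio(last), 0);
	}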

How when "last" is request from ELSP[0]? And also 
execlists->queue_priority has not yet been updated to reflect the new 
priority?

Then there is also "if (port_count(port0)) goto unlock;" suggesting that 
if there were any appends to ELSP[0] we will also fail to act in this 
situation?

> We won't evaluate preemption if we are still awaiting the HWACK from the
> last ELSP write, or if we are still preempting. In both of those cases,
> we expect to receive an interrupt promptly, upon which we then redo our
> evaluations.
> 
>> Also, don't you need to manage execlists->queue_priority after CSB
>> processing now? So that it correctly reflects the priority of request in
>> ELSP[1] after ELSP[0] gets completed? It seems that without it, it would get
>> stuck at the previous value and then submission could decide to skip
>> scheduling the tasklet if new priority is lower than what was in ELSP[1]
>> before, and so would fail to fill ELSP[1].
> 
> Yes, that is also done here. Since we are always looking one request
> ahead, we either update the priority based on the queue following the
> resubmission on interrupt, or it is left as INT_MIN on completion.
> Indeed, making sure we reset back to INT_MIN is essential so that we
> don't miss any future submissions from idle.

Okay I see it - because execlists_dequeue is called and runs to the 
queue_priority update bit even when there is nothing in the queue.

> We can add GEM_BUG_ON(queue_priority != INT_MIN) in engines_park to
> check this condition.

Looks like we don't have these hooks set for execlists so it's probably 
more hassle than it would be worth.

Regards,

Tvrtko
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 01/18] drm/i915/execlists: Set queue priority from secondary port
  2018-04-10 14:42       ` Tvrtko Ursulin
@ 2018-04-10 14:56         ` Chris Wilson
  2018-04-10 15:08           ` Chris Wilson
       [not found]           ` <87af1b35-ba87-f560-c911-0e758a164909@linux.intel.com>
  0 siblings, 2 replies; 36+ messages in thread
From: Chris Wilson @ 2018-04-10 14:56 UTC (permalink / raw)
  To: Tvrtko Ursulin, intel-gfx

Quoting Tvrtko Ursulin (2018-04-10 15:42:07)
> 
> On 10/04/2018 15:24, Chris Wilson wrote:
> > Quoting Tvrtko Ursulin (2018-04-10 15:05:33)
> >>
> >> On 09/04/2018 12:13, Chris Wilson wrote:
> >>> We can refine our current execlists->queue_priority if we inspect
> >>> ELSP[1] rather than the head of the unsubmitted queue. Currently, we use
> >>> the unsubmitted queue and say that if a subsequent request is more than
> >>> important than the current queue, we will rerun the submission tasklet
> >>
> >> s/more than important/more important/
> >>
> >>> to evaluate the need for preemption. However, we only want to preempt if
> >>> we need to jump ahead of a currently executing request in ELSP. The
> >>> second reason for running the submission tasklet is amalgamate requests
> >>> into the active context on ELSP[0] to avoid a stall when ELSP[0] drains.
> >>> (Though repeatedly amalgamating requests into the active context and
> >>> triggering many lite-restore is off question gain, the goal really is to
> >>> put a context into ELSP[1] to cover the interrupt.) So if instead of
> >>> looking at the head of the queue, we look at the context in ELSP[1] we
> >>> can answer both of the questions more accurately -- we don't need to
> >>> rerun the submission tasklet unless our new request is important enough
> >>> to feed into, at least, ELSP[1].
> >>>
> >>> References: f6322eddaff7 ("drm/i915/preemption: Allow preemption between submission ports")
> >>> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> >>> Cc: Michał Winiarski <michal.winiarski@intel.com>
> >>> Cc: Michel Thierry <michel.thierry@intel.com>
> >>> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
> >>> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> >>> ---
> >>>    drivers/gpu/drm/i915/intel_lrc.c | 3 ++-
> >>>    1 file changed, 2 insertions(+), 1 deletion(-)
> >>>
> >>> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
> >>> index 3592288e4696..b47d53d284ca 100644
> >>> --- a/drivers/gpu/drm/i915/intel_lrc.c
> >>> +++ b/drivers/gpu/drm/i915/intel_lrc.c
> >>> @@ -713,7 +713,8 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
> >>>                        kmem_cache_free(engine->i915->priorities, p);
> >>>        }
> >>>    done:
> >>> -     execlists->queue_priority = rb ? to_priolist(rb)->priority : INT_MIN;
> >>> +     execlists->queue_priority =
> >>> +             port != execlists->port ? rq_prio(last) : INT_MIN;
> >>
> >> Please bear with my questions since I am not 100% up to date with
> >> preemption.
> >>
> >> Should this be rq_prio("port[1]") for future proofing? Or you really
> >> mean last port? In which case it wouldn't be the highest pending
> >> priority as kerneldoc for execlists->queue_priority says.
> > 
> > I mean "secondary" port, so yes using last executing port under the
> > assumption that we grow into a ring of many useless submission ports.
> > The kerneldoc is no more or no less accurate. :)
> 
> "That we _don't_ grow"?

Hmm, no, when we get "last_port" it slots right into here. I just don't
have the future facing code to prevent Mika from having to think a
little. The intent is that when there is an ELSP slot available,
queue_priority is INT_MIN, when there are none, then rq_prio(last).

My bad for remembering what I want the code to be without remembering
what the code says.
 
> >> Although I failed to understand what do we do in both cases if a new
> >> request arrives of higher prio than the one in ELSP[1]. Looks like
> >> nothing? Wait until GPU moves it to ELSP[0] and preempt then? Is this so
> >> because we can't safely or I misread something?
> > 
> > This is covered by igt/gem_exec_schedule/preempt-queue*. If we receive a
> > request higher than ELSP[1], we start a preemption as
> > 
> >       if (need_preempt(engine, last, execlists->queue_priority)) {
> > 
> > will evaluate to true. It's either the lowest executing priority (new
> > code), or lowest pending priority (old code). In either case, if the new
> > request is more important than the queue_priority, we preempt.
> 
> How when "last" is request from ELSP[0]? And also 
> execlists->queue_priority has not yet been updated to reflect the new 
> priority?

When we start executing last on ELSP[0] there will have been another
execlists_dequeue() where we see an empty slot (or fill it) and update
queue_priority. If we are down to the last request, it will be set to
INT_MIN. Upon its completion, it will remain INT_MIN.

> Then there is also "if (port_count(port0)) goto unlock;" suggesting that 
> if there were any appends to ELSP[0] we will also fail to act in this 
> situation?

If we only write into ELSP[0], then ELSP[1] remains empty and the
queue_priority still needs to be INT_MIN so that we service any new
i915_request_add immediately.
 
> > We won't evaluate preemption if we are still awaiting the HWACK from the
> > last ELSP write, or if we are still preempting. In both of those cases,
> > we expect to receive an interrupt promptly, upon which we then redo our
> > evaluations.
> > 
> >> Also, don't you need to manage execlists->queue_priority after CSB
> >> processing now? So that it correctly reflects the priority of request in
> >> ELSP[1] after ELSP[0] gets completed? It seems that without it, it would get
> >> stuck at the previous value and then submission could decide to skip
> >> scheduling the tasklet if new priority is lower than what was in ELSP[1]
> >> before, and so would fail to fill ELSP[1].
> > 
> > Yes, that is also done here. Since we are always looking one request
> > ahead, we either update the priority based on the queue following the
> > resubmission on interrupt, or it is left as INT_MIN on completion.
> > Indeed, making sure we reset back to INT_MIN is essential so that we
> > don't miss any future submissions from idle.
> 
> Okay I see it - because execlists_dequeue is called and runs to the 
> queue_priority update bit even when there is nothing in the queue.

Phew, I can get away from having to draw ascii diagrams. I'll leave that
to Mika as he figures out how to hook up N ports ;)

> > We can add GEM_BUG_ON(queue_priority != INT_MIN) in engines_park to
> > check this condition.
> 
> Looks like we don't have these hooks set for execlists so it's probably 
> more hassle than it would be worth.

For ringbuffer, it's permanently INT_MIN and guc is hooked up to the
same logic as execlists.
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 01/18] drm/i915/execlists: Set queue priority from secondary port
  2018-04-10 14:56         ` Chris Wilson
@ 2018-04-10 15:08           ` Chris Wilson
       [not found]           ` <87af1b35-ba87-f560-c911-0e758a164909@linux.intel.com>
  1 sibling, 0 replies; 36+ messages in thread
From: Chris Wilson @ 2018-04-10 15:08 UTC (permalink / raw)
  To: Tvrtko Ursulin, intel-gfx

Quoting Chris Wilson (2018-04-10 15:56:03)
> Quoting Tvrtko Ursulin (2018-04-10 15:42:07)
> > 
> > On 10/04/2018 15:24, Chris Wilson wrote:
> > > Quoting Tvrtko Ursulin (2018-04-10 15:05:33)
> > >> Although I failed to understand what do we do in both cases if a new
> > >> request arrives of higher prio than the one in ELSP[1]. Looks like
> > >> nothing? Wait until GPU moves it to ELSP[0] and preempt then? Is this so
> > >> because we can't safely or I misread something?
> > > 
> > > This is covered by igt/gem_exec_schedule/preempt-queue*. If we receive a
> > > request higher than ELSP[1], we start a preemption as
> > > 
> > >       if (need_preempt(engine, last, execlists->queue_priority)) {
> > > 
> > > will evaluate to true. It's either the lowest executing priority (new
> > > code), or lowest pending priority (old code). In either case, if the new
> > > request is more important than the queue_priority, we preempt.
> > 
> > How when "last" is request from ELSP[0]? And also 
> > execlists->queue_priority has not yet been updated to reflect the new 
> > priority?
> 
> When we start executing last on ELSP[0] there will have been another
> execlists_dequeue() where we see an empty slot (or fill it) and update
> queue_priority. If we are down to the last request, it will be set to
> INT_MIN. Upon its completion, it will remain INT_MIN.
> 
> > Then there is also "if (port_count(port0)) goto unlock;" suggesting that 
> > if there were any appends to ELSP[0] we will also fail to act in this 
> > situation?
> 
> If we only write into ELSP[0], then ELSP[1] remains empty and the
> queue_priority still needs to be INT_MIN so that we service any new
> i915_request_add immediately.
>  
> > > We won't evaluate preemption if we are still awaiting the HWACK from the
> > > last ELSP write, or if we are still preempting. In both of those cases,
> > > we expect to receive an interrupt promptly, upon which we then redo our
> > > evaluations.
> > > 
> > >> Also, don't you need to manage execlists->queue_priority after CSB
> > >> processing now? So that it correctly reflects the priority of request in
> > >> ELSP[1] after ELSP[0] gets completed? It seems that without it, it would get
> > >> stuck at the previous value and then submission could decide to skip
> > >> scheduling the tasklet if new priority is lower than what was in ELSP[1]
> > >> before, and so would fail to fill ELSP[1].
> > > 
> > > Yes, that is also done here. Since we are always looking one request
> > > ahead, we either update the priority based on the queue following the
> > > resubmission on interrupt, or it is left as INT_MIN on completion.
> > > Indeed, making sure we reset back to INT_MIN is essential so that we
> > > don't miss any future submissions from idle.
> > 
> > Okay I see it - because execlists_dequeue is called and runs to the 
> > queue_priority update bit even when there is nothing in the queue.
> 
> Phew, I can get away from having to draw ascii diagrams. I'll leave that
> to Mika as he figures out how to hook up N ports ;)

        /*
         * Here be a bit of magic! Or sleight-of-hand, whichever you prefer.
         *
         * We choose queue_priority such that if we add a request of greater
         * priority than this, we kick the submission tasklet to decide on
         * the right order of submitting the requests to hardware. We must
         * also be prepared to reorder requests as they are in-flight on the
         * HW. We derive the queue_priority then as the first "hole" in
         * the HW submission ports and if there are no available slots,
         * it is the priority of the lowest executing request, i.e. the last one.
         */
        execlists->queue_priority =
                port != execlists->port ? rq_prio(last) : INT_MIN;

Does that help, or do I need ASCII art?
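
In case it does, a compact version of the art, using the priorities from
the walkthrough upthread (each row is the state after the named event):

	event                        ELSP[0]  ELSP[1]  queue_priority
	submit A (prio 2)            A        -        INT_MIN
	submit B (prio 0)            A        B        0  (= rq_prio(last))
	A completes, dequeue runs    B        -        INT_MIN
	B completes, dequeue runs    -        -        INT_MIN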
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 01/18] drm/i915/execlists: Set queue priority from secondary port
       [not found]               ` <aa0e24bb-045c-e1c9-24bc-6dba0b4a28b8@linux.intel.com>
@ 2018-04-11 11:33                 ` Chris Wilson
  0 siblings, 0 replies; 36+ messages in thread
From: Chris Wilson @ 2018-04-11 11:33 UTC (permalink / raw)
  To: Tvrtko Ursulin, intel-gfx

Quoting Tvrtko Ursulin (2018-04-11 11:47:00)
> 
> On 11/04/2018 11:36, Chris Wilson wrote:
> > Quoting Tvrtko Ursulin (2018-04-11 11:23:01)
> >>
> >> On 10/04/2018 15:56, Chris Wilson wrote:
> >>> Quoting Tvrtko Ursulin (2018-04-10 15:42:07)
> >>>>
> >>>> On 10/04/2018 15:24, Chris Wilson wrote:
> >>>>> Quoting Tvrtko Ursulin (2018-04-10 15:05:33)
> >>>>>>
> >>>>>> On 09/04/2018 12:13, Chris Wilson wrote:
> >>>>>>> We can refine our current execlists->queue_priority if we inspect
> >>>>>>> ELSP[1] rather than the head of the unsubmitted queue. Currently, we use
> >>>>>>> the unsubmitted queue and say that if a subsequent request is more than
> >>>>>>> important than the current queue, we will rerun the submission tasklet
> >>>>>>
> >>>>>> s/more than important/more important/
> >>>>>>
> >>>>>>> to evaluate the need for preemption. However, we only want to preempt if
> >>>>>>> we need to jump ahead of a currently executing request in ELSP. The
> >>>>>>> second reason for running the submission tasklet is amalgamate requests
> >>>>>>> into the active context on ELSP[0] to avoid a stall when ELSP[0] drains.
> >>>>>>> (Though repeatedly amalgamating requests into the active context and
> >>>>>>> triggering many lite-restore is off question gain, the goal really is to
> >>>>>>> put a context into ELSP[1] to cover the interrupt.) So if instead of
> >>>>>>> looking at the head of the queue, we look at the context in ELSP[1] we
> >>>>>>> can answer both of the questions more accurately -- we don't need to
> >>>>>>> rerun the submission tasklet unless our new request is important enough
> >>>>>>> to feed into, at least, ELSP[1].
> >>>>>>>
> >>>>>>> References: f6322eddaff7 ("drm/i915/preemption: Allow preemption between submission ports")
> >>>>>>> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> >>>>>>> Cc: Michał Winiarski <michal.winiarski@intel.com>
> >>>>>>> Cc: Michel Thierry <michel.thierry@intel.com>
> >>>>>>> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
> >>>>>>> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> >>>>>>> ---
> >>>>>>>      drivers/gpu/drm/i915/intel_lrc.c | 3 ++-
> >>>>>>>      1 file changed, 2 insertions(+), 1 deletion(-)
> >>>>>>>
> >>>>>>> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
> >>>>>>> index 3592288e4696..b47d53d284ca 100644
> >>>>>>> --- a/drivers/gpu/drm/i915/intel_lrc.c
> >>>>>>> +++ b/drivers/gpu/drm/i915/intel_lrc.c
> >>>>>>> @@ -713,7 +713,8 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
> >>>>>>>                          kmem_cache_free(engine->i915->priorities, p);
> >>>>>>>          }
> >>>>>>>      done:
> >>>>>>> -     execlists->queue_priority = rb ? to_priolist(rb)->priority : INT_MIN;
> >>>>>>> +     execlists->queue_priority =
> >>>>>>> +             port != execlists->port ? rq_prio(last) : INT_MIN;
> >>>>>>
> >>>>>> Please bear with my questions since I am not 100% up to date with
> >>>>>> preemption.
> >>>>>>
> >>>>>> Should this be rq_prio("port[1]") for future proofing? Or you really
> >>>>>> mean last port? In which case it wouldn't be the highest pending
> >>>>>> priority as kerneldoc for execlists->queue_priority says.
> >>>>>
> >>>>> I mean "secondary" port, so yes using last executing port under the
> >>>>> assumption that we grow into a ring of many useless submission ports.
> >>>>> The kerneldoc is no more or no less accurate. :)
> >>>>
> >>>> "That we _don't_ grow"?
> >>>
> >>> Hmm, no, when we get "last_port" it slots right into here. I just don't
> >>> have the future facing code to prevent Mika from having to think a
> >>> little. The intent is that when there is an ELSP slot available,
> >>> queue_priority is INT_MIN, when there are none, then rq_prio(last).
> >>>
> >>> My bad for remembering what I want the code to be without remembering
> >>> what the code says.
> >>>    
> >>>>>> Although I failed to understand what do we do in both cases if a new
> >>>>>> request arrives of higher prio than the one in ELSP[1]. Looks like
> >>>>>> nothing? Wait until GPU moves it to ELSP[0] and preempt then? Is this so
> >>>>>> because we can't safely or I misread something?
> >>>>>
> >>>>> This is covered by igt/gem_exec_schedule/preempt-queue*. If we receive a
> >>>>> request higher than ELSP[1], we start a preemption as
> >>>>>
> >>>>>         if (need_preempt(engine, last, execlists->queue_priority)) {
> >>>>>
> >>>>> will evaluate to true. It's either the lowest executing priority (new
> >>>>> code), or lowest pending priority (old code). In either case, if the new
> >>>>> request is more important than the queue_priority, we preempt.
> >>>>
> >>>> How when "last" is request from ELSP[0]? And also
> >>>> execlists->queue_priority has not yet been updated to reflect the new
> >>>> priority?
> >>>
> >>> When we start executing last on ELSP[0] there will have been another
> >>> execlists_dequeue() where we see an empty slot (or fill it) and update
> >>> queue_priority. If we are down to the last request, it will be set to
> >>> INT_MIN. Upon its completion, it will remain INT_MIN.
> >>
> >> I don't see it yet, let me walk through it:
> >>
> >> 0. Initial situation, GPU busy with two requests, no outstanding ones:
> >>
> >> ELSP[0] = prio 2
> >> ELSP[1] = prio 0
> >>
> >> 1. queue_priority = 0
> >> 2. New execbuf comes along with prio=1.
> >> 3. We choose to schedule the tasklet - good.
> >> 4. execlists_dequeue runs
> >>
> >> last = prio 2
> >>
> >> if (need_preempt(engine, last, execlists->queue_priority))
> >>
> >> queue_priority = 0, so will not preempt last which is prio 2 - so no
> >> preemption - good.
> >>
> >> queue_priority remains at zero since we "goto unlock" with both ports
> >> busy and no preemption is triggered.
> >>
> >> 5. ELSP[0] completes, new ELSP[0] with prio 0.
> >>
> >> (Before we missed the opportunity to replace ELSP[1] with higher prio 1
> >> waiting request before ELSP[0] completed - perhaps we have no choice?
> >> But ok.. carrying on..)
> > 
> > We don't want to interrupt the higher priority task in ELSP[0] to sort
> > out ELSP[1].
> 
> I'll assume that means no safe way to just replace ELSP[1] without 
> preempting ELSP[0].

It doesn't easily fall out of the current tracking as we are not expecting
a lite-restore on ELSP[1], and would need to shadow the existing pair of
ELSP as well as the new pair, then figure out just which one we
preempted. It didn't look pretty, but I did try that approach early on
while trying to avoid the preempt-to-idle. (When the dust finally
settles, we should give that another go but it's probably already moot
if gen11 has zero extra preempt latency...)
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 06/18] drm/i915/breadcrumbs: Keep the fake irq armed across reset
  2018-04-09 11:14 ` [PATCH 06/18] drm/i915/breadcrumbs: Keep the fake irq armed across reset Chris Wilson
@ 2018-04-23 13:03   ` Mika Kuoppala
  2018-04-23 14:53     ` Chris Wilson
  0 siblings, 1 reply; 36+ messages in thread
From: Mika Kuoppala @ 2018-04-23 13:03 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx

Chris Wilson <chris@chris-wilson.co.uk> writes:

> Instead of synchronously cancelling the timer and re-enabling it inside
> the reset callbacks, keep the timer enabled and let it die on its next
> wakeup if no longer required. This allows
> intel_engine_reset_breadcrumbs() to be used from an atomic
> (timer/softirq) context such as required for resetting an engine.
>
> It also allows us to react better to the user poking around debugfs for
> testing missed irqs.
>
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
> ---
>  drivers/gpu/drm/i915/intel_breadcrumbs.c | 27 +++++++++++++++++-------
>  1 file changed, 19 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/intel_breadcrumbs.c b/drivers/gpu/drm/i915/intel_breadcrumbs.c
> index 671a6d61e29d..0a2128c6b418 100644
> --- a/drivers/gpu/drm/i915/intel_breadcrumbs.c
> +++ b/drivers/gpu/drm/i915/intel_breadcrumbs.c
> @@ -130,11 +130,12 @@ static void intel_breadcrumbs_hangcheck(struct timer_list *t)
>  
>  static void intel_breadcrumbs_fake_irq(struct timer_list *t)
>  {
> -	struct intel_engine_cs *engine = from_timer(engine, t,
> -						    breadcrumbs.fake_irq);
> +	struct intel_engine_cs *engine =
> +		from_timer(engine, t, breadcrumbs.fake_irq);
>  	struct intel_breadcrumbs *b = &engine->breadcrumbs;
>  
> -	/* The timer persists in case we cannot enable interrupts,
> +	/*
> +	 * The timer persists in case we cannot enable interrupts,
>  	 * or if we have previously seen seqno/interrupt incoherency
>  	 * ("missed interrupt" syndrome, better known as a "missed breadcrumb").
>  	 * Here the worker will wake up every jiffie in order to kick the
> @@ -148,6 +149,12 @@ static void intel_breadcrumbs_fake_irq(struct timer_list *t)
>  	if (!b->irq_armed)
>  		return;
>  
> +	/* If the user has disabled the fake-irq, restore the hangchecking */
> +	if (!test_bit(engine->id, &engine->i915->gpu_error.missed_irq_rings)) {
> +		mod_timer(&b->hangcheck, wait_timeout());
> +		return;
> +	}
> +

Looking at the cancel_fake_irq() now, which we still need to keep as
sync, I think there is a race introduced now as this can queue itself.

I think we want to also change the cancel_fake_irq() to do
the bit clearing first, not last after the del_timer_syncs().

-Mika

>  	mod_timer(&b->fake_irq, jiffies + 1);
>  }
>  
> @@ -840,15 +847,22 @@ void intel_engine_reset_breadcrumbs(struct intel_engine_cs *engine)
>  {
>  	struct intel_breadcrumbs *b = &engine->breadcrumbs;
>  
> -	cancel_fake_irq(engine);
>  	spin_lock_irq(&b->irq_lock);
>  
> +	/*
> +	 * Leave the fake_irq timer enabled (if it is running), but clear the
> +	 * bit so that it turns itself off on its next wake up and goes back
> +	 * to the long hangcheck interval if still required.
> +	 */
> +	clear_bit(engine->id, &engine->i915->gpu_error.missed_irq_rings);
> +
>  	if (b->irq_enabled)
>  		irq_enable(engine);
>  	else
>  		irq_disable(engine);
>  
> -	/* We set the IRQ_BREADCRUMB bit when we enable the irq presuming the
> +	/*
> +	 * We set the IRQ_BREADCRUMB bit when we enable the irq presuming the
>  	 * GPU is active and may have already executed the MI_USER_INTERRUPT
>  	 * before the CPU is ready to receive. However, the engine is currently
>  	 * idle (we haven't started it yet), there is no possibility for a
> @@ -857,9 +871,6 @@ void intel_engine_reset_breadcrumbs(struct intel_engine_cs *engine)
>  	 */
>  	clear_bit(ENGINE_IRQ_BREADCRUMB, &engine->irq_posted);
>  
> -	if (b->irq_armed)
> -		enable_fake_irq(b);
> -
>  	spin_unlock_irq(&b->irq_lock);
>  }
>  
> -- 
> 2.17.0
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 06/18] drm/i915/breadcrumbs: Keep the fake irq armed across reset
  2018-04-23 13:03   ` Mika Kuoppala
@ 2018-04-23 14:53     ` Chris Wilson
  2018-04-24 11:06       ` Mika Kuoppala
  0 siblings, 1 reply; 36+ messages in thread
From: Chris Wilson @ 2018-04-23 14:53 UTC (permalink / raw)
  To: Mika Kuoppala, intel-gfx

Quoting Mika Kuoppala (2018-04-23 14:03:02)
> Chris Wilson <chris@chris-wilson.co.uk> writes:
> > @@ -148,6 +149,12 @@ static void intel_breadcrumbs_fake_irq(struct timer_list *t)
> >       if (!b->irq_armed)
> >               return;
> >  
> > +     /* If the user has disabled the fake-irq, restore the hangchecking */
> > +     if (!test_bit(engine->id, &engine->i915->gpu_error.missed_irq_rings)) {
> > +             mod_timer(&b->hangcheck, wait_timeout());
> > +             return;
> > +     }
> > +
> 
> Looking at the cancel_fake_irq() now, which we still need to keep as
> sync, I think there is a race introduced now as this can queue itself.
> 
> I think we want to also change the cancel_fake_irq() to do
> the bit clearing first, not last after the del_timer_syncs().

I see what you mean, but I think we want

	del_timer_sync(&b->fake_irq); // may queue b->hangcheck
	del_timer_sync(&b->hangcheck);
	clear_bit(engine->id, missed_irq_rings);
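
i.e. a sketch of cancel_fake_irq() with that ordering (fields as in the
patch above):

	static void cancel_fake_irq(struct intel_engine_cs *engine)
	{
		struct intel_breadcrumbs *b = &engine->breadcrumbs;

		del_timer_sync(&b->fake_irq); /* may queue b->hangcheck */
		del_timer_sync(&b->hangcheck);
		clear_bit(engine->id, &engine->i915->gpu_error.missed_irq_rings);
	}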

-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 06/18] drm/i915/breadcrumbs: Keep the fake irq armed across reset
  2018-04-23 14:53     ` Chris Wilson
@ 2018-04-24 11:06       ` Mika Kuoppala
  0 siblings, 0 replies; 36+ messages in thread
From: Mika Kuoppala @ 2018-04-24 11:06 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx

Chris Wilson <chris@chris-wilson.co.uk> writes:

> Quoting Mika Kuoppala (2018-04-23 14:03:02)
>> Chris Wilson <chris@chris-wilson.co.uk> writes:
>> > @@ -148,6 +149,12 @@ static void intel_breadcrumbs_fake_irq(struct timer_list *t)
>> >       if (!b->irq_armed)
>> >               return;
>> >  
>> > +     /* If the user has disabled the fake-irq, restore the hangchecking */
>> > +     if (!test_bit(engine->id, &engine->i915->gpu_error.missed_irq_rings)) {
>> > +             mod_timer(&b->hangcheck, wait_timeout());
>> > +             return;
>> > +     }
>> > +
>> 
>> Looking at the cancel_fake_irq() now, which we still need to keep as
>> sync, I think there is a race introduced now as this can queue itself.
>> 
>> I think we want to also change the cancel_fake_irq() to do
>> the bit clearing first, not last after the del_timer_syncs().
>
> I see what you mean, but I think we want
>
> 	del_timer_sync(&b->fake_irq); // may queue b->hangcheck
> 	del_timer_sync(&b->hangcheck);
> 	clear_bit(engine->id, missed_irq_rings);
>

Ok got it. Swapping the order means the hangcheck newly queued by
fake_irq gets cancelled, and the clearing comes last.

But now that the cancel is done only in the fini path, consider adding
an assertion for !irq_armed at the start of cancel_fake_irq().

Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>

-Mika
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 07/18] drm/i915: Combine tasklet_kill and tasklet_disable
  2018-04-09 11:14 ` [PATCH 07/18] drm/i915: Combine tasklet_kill and tasklet_disable Chris Wilson
@ 2018-04-24 12:26   ` Mika Kuoppala
  2018-04-24 12:28     ` Chris Wilson
  0 siblings, 1 reply; 36+ messages in thread
From: Mika Kuoppala @ 2018-04-24 12:26 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx

Chris Wilson <chris@chris-wilson.co.uk> writes:

> Ideally, we want to atomically flush and disable the tasklet before
> resetting the GPU. At present, we rely on being the only part to touch
> our tasklet and serialisation of the reset process to ensure that we can
> suspend the tasklet from the mix of reset/wedge pathways. In this patch,
> we move the tasklet abuse into its own function and tweak it such that
> we only do a synchronous operation the first time it is disabled around
> the reset. This allows us to avoid the sync inside a softirq context in
> subsequent patches.
>
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
> Cc: Michał Winiarski <michal.winiarski@intel.com>
> CC: Michel Thierry <michel.thierry@intel.com>
> Cc: Jeff McGee <jeff.mcgee@intel.com>
> ---
>  drivers/gpu/drm/i915/intel_lrc.c | 14 +++++++++++---
>  1 file changed, 11 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
> index bbcc6439a2a1..d5640f3d5276 100644
> --- a/drivers/gpu/drm/i915/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/intel_lrc.c
> @@ -1754,6 +1754,16 @@ static int gen9_init_render_ring(struct intel_engine_cs *engine)
>  	return init_workarounds_ring(engine);
>  }
>  
> +static void tasklet_kill_and_disable(struct tasklet_struct *t)
> +{
> +	if (!atomic_read(&t->count))
> +		tasklet_kill(t);
> +
> +	if (atomic_inc_return(&t->count) == 1)
> +		tasklet_unlock_wait(t);

You would spin only on the first try regardless. Is this
just to prevent one extra spinlock on reset path?

-Mika

> +	smp_mb();
> +}
> +
>  static struct i915_request *
>  execlists_reset_prepare(struct intel_engine_cs *engine)
>  {
> @@ -1778,9 +1788,7 @@ execlists_reset_prepare(struct intel_engine_cs *engine)
>  	 * common case of recursively being called from set-wedged from inside
>  	 * i915_reset.
>  	 */
> -	if (!atomic_read(&execlists->tasklet.count))
> -		tasklet_kill(&execlists->tasklet);
> -	tasklet_disable(&execlists->tasklet);
> +	tasklet_kill_and_disable(&execlists->tasklet);
>  
>  	/*
>  	 * We want to flush the pending context switches, having disabled
> -- 
> 2.17.0
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 07/18] drm/i915: Combine tasklet_kill and tasklet_disable
  2018-04-24 12:26   ` Mika Kuoppala
@ 2018-04-24 12:28     ` Chris Wilson
  2018-04-26 10:19       ` Mika Kuoppala
  0 siblings, 1 reply; 36+ messages in thread
From: Chris Wilson @ 2018-04-24 12:28 UTC (permalink / raw)
  To: Mika Kuoppala, intel-gfx

Quoting Mika Kuoppala (2018-04-24 13:26:11)
> Chris Wilson <chris@chris-wilson.co.uk> writes:
> 
> > Ideally, we want to atomically flush and disable the tasklet before
> > resetting the GPU. At present, we rely on being the only part to touch
> > our tasklet and serialisation of the reset process to ensure that we can
> > suspend the tasklet from the mix of reset/wedge pathways. In this patch,
> > we move the tasklet abuse into its own function and tweak it such that
> > we only do a synchronous operation the first time it is disabled around
> > the reset. This allows us to avoid the sync inside a softirq context in
> > subsequent patches.
> >
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
> > Cc: Michał Winiarski <michal.winiarski@intel.com>
> > CC: Michel Thierry <michel.thierry@intel.com>
> > Cc: Jeff McGee <jeff.mcgee@intel.com>
> > ---
> >  drivers/gpu/drm/i915/intel_lrc.c | 14 +++++++++++---
> >  1 file changed, 11 insertions(+), 3 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
> > index bbcc6439a2a1..d5640f3d5276 100644
> > --- a/drivers/gpu/drm/i915/intel_lrc.c
> > +++ b/drivers/gpu/drm/i915/intel_lrc.c
> > @@ -1754,6 +1754,16 @@ static int gen9_init_render_ring(struct intel_engine_cs *engine)
> >       return init_workarounds_ring(engine);
> >  }
> >  
> > +static void tasklet_kill_and_disable(struct tasklet_struct *t)
> > +{
> > +     if (!atomic_read(&t->count))
> > +             tasklet_kill(t);
> > +
> > +     if (atomic_inc_return(&t->count) == 1)
> > +             tasklet_unlock_wait(t);
> 
> You would spin only on the first try regardless. Is this
> just to prevent one extra spinlock on reset path?

No, the end goal is to prevent a recursive lock.
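
For reference, plain tasklet_disable() is, simplified from
include/linux/interrupt.h:

	static inline void tasklet_disable(struct tasklet_struct *t)
	{
		tasklet_disable_nosync(t); /* atomic_inc(&t->count) */
		tasklet_unlock_wait(t);    /* spins while TASKLET_STATE_RUN */
		smp_mb();
	}

Every disable pays that tasklet_unlock_wait() spin; the combined helper
makes only the first disable synchronous, so the nested disable taken
when set-wedged is called from inside i915_reset does not spin again.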
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 13/18] drm/i915: Compile out engine debug for release
  2018-04-09 11:14 ` [PATCH 13/18] drm/i915: Compile out engine debug for release Chris Wilson
@ 2018-04-26 10:15   ` Mika Kuoppala
  0 siblings, 0 replies; 36+ messages in thread
From: Mika Kuoppala @ 2018-04-26 10:15 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx

Chris Wilson <chris@chris-wilson.co.uk> writes:

> The majority of the engine state dumping is too voluminous to be useful
> outside of a controlled setup, though a few do accompany severe errors.
> Keep the debug dumps next to the errors, but hide the others behind a CI
> compile flag. This becomes more useful when adding more dumps to latency
> sensitive paths.
>
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>

Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>

> ---
>  drivers/gpu/drm/i915/i915_gem.c          | 2 +-
>  drivers/gpu/drm/i915/i915_gem.h          | 6 ++++++
>  drivers/gpu/drm/i915/intel_breadcrumbs.c | 2 +-
>  drivers/gpu/drm/i915/intel_hangcheck.c   | 2 +-
>  4 files changed, 9 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index 42a387ff0eaa..cec52bbd1b41 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -3267,7 +3267,7 @@ void i915_gem_set_wedged(struct drm_i915_private *i915)
>  
>  	GEM_TRACE("start\n");
>  
> -	if (drm_debug & DRM_UT_DRIVER) {
> +	if (GEM_SHOW_DEBUG()) {
>  		struct drm_printer p = drm_debug_printer(__func__);
>  
>  		for_each_engine(engine, i915, id)
> diff --git a/drivers/gpu/drm/i915/i915_gem.h b/drivers/gpu/drm/i915/i915_gem.h
> index deaf78d2ae8b..525920404ede 100644
> --- a/drivers/gpu/drm/i915/i915_gem.h
> +++ b/drivers/gpu/drm/i915/i915_gem.h
> @@ -30,6 +30,9 @@
>  struct drm_i915_private;
>  
>  #ifdef CONFIG_DRM_I915_DEBUG_GEM
> +
> +#define GEM_SHOW_DEBUG() (drm_debug & DRM_UT_DRIVER)
> +
>  #define GEM_BUG_ON(condition) do { if (unlikely((condition))) {	\
>  		pr_err("%s:%d GEM_BUG_ON(%s)\n", \
>  		       __func__, __LINE__, __stringify(condition)); \
> @@ -45,6 +48,9 @@ struct drm_i915_private;
>  #define GEM_DEBUG_BUG_ON(expr) GEM_BUG_ON(expr)
>  
>  #else
> +
> +#define GEM_SHOW_DEBUG() (0)
> +
>  #define GEM_BUG_ON(expr) BUILD_BUG_ON_INVALID(expr)
>  #define GEM_WARN_ON(expr) (BUILD_BUG_ON_INVALID(expr), 0)
>  
> diff --git a/drivers/gpu/drm/i915/intel_breadcrumbs.c b/drivers/gpu/drm/i915/intel_breadcrumbs.c
> index ca0b04d9747c..ad761b8d843d 100644
> --- a/drivers/gpu/drm/i915/intel_breadcrumbs.c
> +++ b/drivers/gpu/drm/i915/intel_breadcrumbs.c
> @@ -82,7 +82,7 @@ static unsigned long wait_timeout(void)
>  
>  static noinline void missed_breadcrumb(struct intel_engine_cs *engine)
>  {
> -	if (drm_debug & DRM_UT_DRIVER) {
> +	if (GEM_SHOW_DEBUG()) {
>  		struct drm_printer p = drm_debug_printer(__func__);
>  
>  		intel_engine_dump(engine, &p,
> diff --git a/drivers/gpu/drm/i915/intel_hangcheck.c b/drivers/gpu/drm/i915/intel_hangcheck.c
> index fd0ffb8328d0..309e38b00e95 100644
> --- a/drivers/gpu/drm/i915/intel_hangcheck.c
> +++ b/drivers/gpu/drm/i915/intel_hangcheck.c
> @@ -356,7 +356,7 @@ static void hangcheck_accumulate_sample(struct intel_engine_cs *engine,
>  		break;
>  
>  	case ENGINE_DEAD:
> -		if (drm_debug & DRM_UT_DRIVER) {
> +		if (GEM_SHOW_DEBUG()) {
>  			struct drm_printer p = drm_debug_printer("hangcheck");
>  			intel_engine_dump(engine, &p, "%s\n", engine->name);
>  		}
> -- 
> 2.17.0
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 07/18] drm/i915: Combine tasklet_kill and tasklet_disable
  2018-04-24 12:28     ` Chris Wilson
@ 2018-04-26 10:19       ` Mika Kuoppala
  2018-04-26 10:26         ` Chris Wilson
  0 siblings, 1 reply; 36+ messages in thread
From: Mika Kuoppala @ 2018-04-26 10:19 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx

Chris Wilson <chris@chris-wilson.co.uk> writes:

> Quoting Mika Kuoppala (2018-04-24 13:26:11)
>> Chris Wilson <chris@chris-wilson.co.uk> writes:
>> 
>> > Ideally, we want to atomically flush and disable the tasklet before
>> > resetting the GPU. At present, we rely on being the only part to touch
>> > our tasklet and serialisation of the reset process to ensure that we can
>> > suspend the tasklet from the mix of reset/wedge pathways. In this patch,
>> > we move the tasklet abuse into its own function and tweak it such that
>> > we only do a synchronous operation the first time it is disabled around
>> > the reset. This allows us to avoid the sync inside a softirq context in
>> > subsequent patches.
>> >
>> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
>> > Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
>> > Cc: Michał Winiarski <michal.winiarski@intel.com>
>> > CC: Michel Thierry <michel.thierry@intel.com>
>> > Cc: Jeff McGee <jeff.mcgee@intel.com>
>> > ---
>> >  drivers/gpu/drm/i915/intel_lrc.c | 14 +++++++++++---
>> >  1 file changed, 11 insertions(+), 3 deletions(-)
>> >
>> > diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
>> > index bbcc6439a2a1..d5640f3d5276 100644
>> > --- a/drivers/gpu/drm/i915/intel_lrc.c
>> > +++ b/drivers/gpu/drm/i915/intel_lrc.c
>> > @@ -1754,6 +1754,16 @@ static int gen9_init_render_ring(struct intel_engine_cs *engine)
>> >       return init_workarounds_ring(engine);
>> >  }
>> >  
>> > +static void tasklet_kill_and_disable(struct tasklet_struct *t)
>> > +{
>> > +     if (!atomic_read(&t->count))
>> > +             tasklet_kill(t);
>> > +
>> > +     if (atomic_inc_return(&t->count) == 1)
>> > +             tasklet_unlock_wait(t);
>> 
>> You would spin only on the first try regardless. Is this
>> just to prevent one extra spinlock on reset path?
>
> No, the end goal is to prevent a recursive lock.

Ok so you want to be able to call this from inside the
tasklet itself.

On the bigger picture, if the preempt has already timed out and we want
to reset, does it matter whether we reset from the wq or from the timer
irq? In other words, what do we gain for this much added complexity?
-Mika


_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 07/18] drm/i915: Combine tasklet_kill and tasklet_disable
  2018-04-26 10:19       ` Mika Kuoppala
@ 2018-04-26 10:26         ` Chris Wilson
  0 siblings, 0 replies; 36+ messages in thread
From: Chris Wilson @ 2018-04-26 10:26 UTC (permalink / raw)
  To: Mika Kuoppala, intel-gfx

Quoting Mika Kuoppala (2018-04-26 11:19:14)
> Chris Wilson <chris@chris-wilson.co.uk> writes:
> 
> > Quoting Mika Kuoppala (2018-04-24 13:26:11)
> >> Chris Wilson <chris@chris-wilson.co.uk> writes:
> >> 
> >> > Ideally, we want to atomically flush and disable the tasklet before
> >> > resetting the GPU. At present, we rely on being the only part to touch
> >> > our tasklet and serialisation of the reset process to ensure that we can
> >> > suspend the tasklet from the mix of reset/wedge pathways. In this patch,
> >> > we move the tasklet abuse into its own function and tweak it such that
> >> > we only do a synchronous operation the first time it is disabled around
> >> > the reset. This allows us to avoid the sync inside a softirq context in
> >> > subsequent patches.
> >> >
> >> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> >> > Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
> >> > Cc: Michał Winiarski <michal.winiarski@intel.com>
> >> > CC: Michel Thierry <michel.thierry@intel.com>
> >> > Cc: Jeff McGee <jeff.mcgee@intel.com>
> >> > ---
> >> >  drivers/gpu/drm/i915/intel_lrc.c | 14 +++++++++++---
> >> >  1 file changed, 11 insertions(+), 3 deletions(-)
> >> >
> >> > diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
> >> > index bbcc6439a2a1..d5640f3d5276 100644
> >> > --- a/drivers/gpu/drm/i915/intel_lrc.c
> >> > +++ b/drivers/gpu/drm/i915/intel_lrc.c
> >> > @@ -1754,6 +1754,16 @@ static int gen9_init_render_ring(struct intel_engine_cs *engine)
> >> >       return init_workarounds_ring(engine);
> >> >  }
> >> >  
> >> > +static void tasklet_kill_and_disable(struct tasklet_struct *t)
> >> > +{
> >> > +     if (!atomic_read(&t->count))
> >> > +             tasklet_kill(t);
> >> > +
> >> > +     if (atomic_inc_return(&t->count) == 1)
> >> > +             tasklet_unlock_wait(t);
> >> 
> >> You would spin only on the first try regardless. Is this
> >> just to prevent one extra spinlock on reset path?
> >
> > No, the end goal is to prevent a recursive lock.
> 
> Ok so you want to be able to call this from inside the
> tasklet itself.
> 
> On the bigger picture, if the preempt has already timed out and we want
> to reset, does it matter whether we reset from the wq or from the timer
> irq? In other words, what do we gain for this much added complexity?

Yes. A reset and recovery is microseconds. Scheduling the wq adds orders
of magnitude worse, and indeterminate, latency. (Calling schedule() itself
will take as long as the reset.)
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 36+ messages in thread

* [PATCH 18/18] drm/i915: Allow user control over preempt timeout on their important context
  2018-03-30 15:55 Preemption reset from timer; CI now 110% happier Chris Wilson
@ 2018-03-30 15:55 ` Chris Wilson
  0 siblings, 0 replies; 36+ messages in thread
From: Chris Wilson @ 2018-03-30 15:55 UTC (permalink / raw)
  To: intel-gfx

EGL_NV_realtime_priority?

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/i915_gem_context.c    | 22 ++++++++
 drivers/gpu/drm/i915/i915_gem_context.h    | 13 +++++
 drivers/gpu/drm/i915/i915_request.c        |  8 ++-
 drivers/gpu/drm/i915/intel_lrc.c           |  2 +-
 drivers/gpu/drm/i915/selftests/intel_lrc.c | 85 ++++++++++++++++++++++++++++++
 include/uapi/drm/i915_drm.h                | 12 +++++
 6 files changed, 139 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index 5cfac0255758..dfb0a2b698c3 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -749,6 +749,15 @@ int i915_gem_context_getparam_ioctl(struct drm_device *dev, void *data,
 	case I915_CONTEXT_PARAM_PRIORITY:
 		args->value = ctx->priority;
 		break;
+	case I915_CONTEXT_PARAM_PREEMPT_TIMEOUT:
+		if (!(to_i915(dev)->caps.scheduler & I915_SCHEDULER_CAP_PREEMPTION))
+			ret = -ENODEV;
+		else if (args->size)
+			ret = -EINVAL;
+		else
+			args->value = ctx->preempt_timeout;
+		break;
+
 	default:
 		ret = -EINVAL;
 		break;
@@ -824,6 +833,19 @@ int i915_gem_context_setparam_ioctl(struct drm_device *dev, void *data,
 		}
 		break;
 
+	case I915_CONTEXT_PARAM_PREEMPT_TIMEOUT:
+		if (args->size)
+			ret = -EINVAL;
+		else if (args->value >= NSEC_PER_SEC)
+			ret = -EINVAL;
+		else if (!(to_i915(dev)->caps.scheduler & I915_SCHEDULER_CAP_PREEMPTION))
+			ret = -ENODEV;
+		else if (args->value && !capable(CAP_SYS_ADMIN))
+			ret = -EPERM;
+		else
+			ctx->preempt_timeout = args->value;
+		break;
+
 	default:
 		ret = -EINVAL;
 		break;
diff --git a/drivers/gpu/drm/i915/i915_gem_context.h b/drivers/gpu/drm/i915/i915_gem_context.h
index 7854262ddfd9..74d4cadd729e 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.h
+++ b/drivers/gpu/drm/i915/i915_gem_context.h
@@ -150,6 +150,19 @@ struct i915_gem_context {
 	 */
 	int priority;
 
+	/**
+	 * @preempt_timeout: QoS guarantee for the high priority context
+	 *
+	 * Some clients need a guarantee that they will start executing
+	 * within a certain window, even at the expense of others. This entails
+	 * that if a preemption request is not honoured by the active context
+	 * within the timeout, we will reset the GPU to evict the hog and
+	 * run the high priority context instead.
+	 *
+	 * Timeout is stored in nanoseconds; exclusive max of 1s.
+	 */
+	u32 preempt_timeout;
+
 	/** ggtt_offset_bias: placement restriction for context objects */
 	u32 ggtt_offset_bias;
 
diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
index cdda3ebd51e2..eae807e56723 100644
--- a/drivers/gpu/drm/i915/i915_request.c
+++ b/drivers/gpu/drm/i915/i915_request.c
@@ -1107,8 +1107,12 @@ void __i915_request_add(struct i915_request *request, bool flush_caches)
 	 * run at the earliest possible convenience.
 	 */
 	rcu_read_lock();
-	if (engine->schedule)
-		engine->schedule(request, request->ctx->priority, 0);
+	if (engine->schedule) {
+		unsigned int timeout = request->ctx->preempt_timeout;
+		int priority = request->ctx->priority;
+
+		engine->schedule(request, priority, timeout);
+	}
 	rcu_read_unlock();
 
 	local_bh_disable();
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 06a4815f5f31..e87181f88f96 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -1212,7 +1212,7 @@ static void execlists_submit_request(struct i915_request *request)
 	spin_lock_irqsave(&engine->timeline->lock, flags);
 
 	queue_request(engine, &request->priotree, rq_prio(request));
-	submit_queue(engine, rq_prio(request), 0);
+	submit_queue(engine, rq_prio(request), request->ctx->preempt_timeout);
 
 	GEM_BUG_ON(!engine->execlists.first);
 	GEM_BUG_ON(list_empty(&request->priotree.link));
diff --git a/drivers/gpu/drm/i915/selftests/intel_lrc.c b/drivers/gpu/drm/i915/selftests/intel_lrc.c
index 1970a4658c5c..121180b87fd7 100644
--- a/drivers/gpu/drm/i915/selftests/intel_lrc.c
+++ b/drivers/gpu/drm/i915/selftests/intel_lrc.c
@@ -783,6 +783,90 @@ static int live_late_preempt_timeout(void *arg)
 	goto err_ctx_lo;
 }
 
+static int live_context_preempt_timeout(void *arg)
+{
+	struct drm_i915_private *i915 = arg;
+	struct i915_gem_context *ctx_hi, *ctx_lo;
+	struct spinner spin_hi, spin_lo;
+	struct intel_engine_cs *engine;
+	enum intel_engine_id id;
+	int err = -ENOMEM;
+
+	if (!HAS_LOGICAL_RING_PREEMPTION(i915))
+		return 0;
+
+	mutex_lock(&i915->drm.struct_mutex);
+
+	if (spinner_init(&spin_hi, i915))
+		goto err_unlock;
+
+	if (spinner_init(&spin_lo, i915))
+		goto err_spin_hi;
+
+	ctx_hi = kernel_context(i915);
+	if (!ctx_hi)
+		goto err_spin_lo;
+	ctx_hi->priority = I915_CONTEXT_MAX_USER_PRIORITY;
+	ctx_hi->preempt_timeout = 50 * 1000; /* 50us */
+
+	ctx_lo = kernel_context(i915);
+	if (!ctx_lo)
+		goto err_ctx_hi;
+	ctx_lo->priority = I915_CONTEXT_MIN_USER_PRIORITY;
+
+	for_each_engine(engine, i915, id) {
+		struct i915_request *rq;
+
+		rq = spinner_create_request(&spin_lo, ctx_lo, engine, MI_NOOP);
+		if (IS_ERR(rq)) {
+			err = PTR_ERR(rq);
+			goto err_ctx_lo;
+		}
+
+		i915_request_add(rq);
+		if (!wait_for_spinner(&spin_lo, rq)) {
+			i915_gem_set_wedged(i915);
+			err = -EIO;
+			goto err_ctx_lo;
+		}
+
+		rq = spinner_create_request(&spin_hi, ctx_hi, engine, MI_NOOP);
+		if (IS_ERR(rq)) {
+			spinner_end(&spin_lo);
+			err = PTR_ERR(rq);
+			goto err_ctx_lo;
+		}
+
+		i915_request_add(rq);
+		if (!wait_for_spinner(&spin_hi, rq)) {
+			i915_gem_set_wedged(i915);
+			err = -EIO;
+			goto err_ctx_lo;
+		}
+
+		spinner_end(&spin_hi);
+		spinner_end(&spin_lo);
+		if (flush_test(i915, I915_WAIT_LOCKED)) {
+			err = -EIO;
+			goto err_ctx_lo;
+		}
+	}
+
+	err = 0;
+err_ctx_lo:
+	kernel_context_close(ctx_lo);
+err_ctx_hi:
+	kernel_context_close(ctx_hi);
+err_spin_lo:
+	spinner_fini(&spin_lo);
+err_spin_hi:
+	spinner_fini(&spin_hi);
+err_unlock:
+	flush_test(i915, I915_WAIT_LOCKED);
+	mutex_unlock(&i915->drm.struct_mutex);
+	return err;
+}
+
 int intel_execlists_live_selftests(struct drm_i915_private *i915)
 {
 	static const struct i915_subtest tests[] = {
@@ -792,6 +876,7 @@ int intel_execlists_live_selftests(struct drm_i915_private *i915)
 		SUBTEST(live_preempt_timeout),
 		SUBTEST(live_preempt_reset),
 		SUBTEST(live_late_preempt_timeout),
+		SUBTEST(live_context_preempt_timeout),
 	};
 	return i915_subtests(tests, i915);
 }
diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
index 7f5634ce8e88..6f10bbe90304 100644
--- a/include/uapi/drm/i915_drm.h
+++ b/include/uapi/drm/i915_drm.h
@@ -1456,6 +1456,18 @@ struct drm_i915_gem_context_param {
 #define   I915_CONTEXT_MAX_USER_PRIORITY	1023 /* inclusive */
 #define   I915_CONTEXT_DEFAULT_PRIORITY		0
 #define   I915_CONTEXT_MIN_USER_PRIORITY	-1023 /* inclusive */
+
+/*
+ * I915_CONTEXT_PARAM_PREEMPT_TIMEOUT:
+ *
+ * Preemption timeout given in nanoseconds.
+ *
+ * Only allowed for privileged clients (CAP_SYS_ADMIN), this property allows
+ * the preempting context to kick out a GPU hog using a GPU reset if they do
+ * not honour the preemption request in time.
+ */
+#define I915_CONTEXT_PARAM_PREEMPT_TIMEOUT	0x7
+
 	__u64 value;
 };
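
For illustration, a hypothetical userspace consumer -- the ioctl and the
struct are existing uapi, everything else is made up and assumes the new
param lands unchanged:

	#include <errno.h>
	#include <stdint.h>
	#include <sys/ioctl.h>
	#include <drm/i915_drm.h>

	/* e.g. ask for a 50us preemption QoS on an important context */
	static int set_preempt_timeout(int drm_fd, uint32_t ctx_id,
				       uint64_t timeout_ns)
	{
		struct drm_i915_gem_context_param p = {
			.ctx_id = ctx_id,
			.size = 0, /* must be 0 for this param */
			.param = I915_CONTEXT_PARAM_PREEMPT_TIMEOUT,
			/* < NSEC_PER_SEC; non-zero needs CAP_SYS_ADMIN */
			.value = timeout_ns,
		};

		if (ioctl(drm_fd, DRM_IOCTL_I915_GEM_CONTEXT_SETPARAM, &p))
			return -errno;
		return 0;
	}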
 
-- 
2.16.3

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 36+ messages in thread

end of thread, other threads:[~2018-04-26 10:26 UTC | newest]

Thread overview: 36+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-04-09 11:13 Still trying for context->preempt_timeout Chris Wilson
2018-04-09 11:13 ` [PATCH 01/18] drm/i915/execlists: Set queue priority from secondary port Chris Wilson
2018-04-10 14:05   ` Tvrtko Ursulin
     [not found]     ` <152337025293.3167.10189866034675290387@mail.alporthouse.com>
2018-04-10 14:42       ` Tvrtko Ursulin
2018-04-10 14:56         ` Chris Wilson
2018-04-10 15:08           ` Chris Wilson
     [not found]           ` <87af1b35-ba87-f560-c911-0e758a164909@linux.intel.com>
     [not found]             ` <152344296759.13225.4187833354912190018@mail.alporthouse.com>
     [not found]               ` <aa0e24bb-045c-e1c9-24bc-6dba0b4a28b8@linux.intel.com>
2018-04-11 11:33                 ` Chris Wilson
2018-04-09 11:13 ` [PATCH 02/18] drm/i915/execlists: Refactor out complete_preempt_context() Chris Wilson
2018-04-09 11:13 ` [PATCH 03/18] drm/i915: Move engine reset prepare/finish to backends Chris Wilson
2018-04-09 11:13 ` [PATCH 04/18] drm/i915: Split execlists/guc reset preparations Chris Wilson
2018-04-09 11:14 ` [PATCH 05/18] drm/i915/execlists: Flush pending preemption events during reset Chris Wilson
2018-04-09 11:14 ` [PATCH 06/18] drm/i915/breadcrumbs: Keep the fake irq armed across reset Chris Wilson
2018-04-23 13:03   ` Mika Kuoppala
2018-04-23 14:53     ` Chris Wilson
2018-04-24 11:06       ` Mika Kuoppala
2018-04-09 11:14 ` [PATCH 07/18] drm/i915: Combine tasklet_kill and tasklet_disable Chris Wilson
2018-04-24 12:26   ` Mika Kuoppala
2018-04-24 12:28     ` Chris Wilson
2018-04-26 10:19       ` Mika Kuoppala
2018-04-26 10:26         ` Chris Wilson
2018-04-09 11:14 ` [PATCH 08/18] drm/i915: Stop parking the signaler around reset Chris Wilson
2018-04-09 11:14 ` [PATCH 09/18] drm/i915: Be irqsafe inside reset Chris Wilson
2018-04-09 11:14 ` [PATCH 10/18] drm/i915/execlists: Make submission tasklet hardirq safe Chris Wilson
2018-04-09 11:14 ` [PATCH 11/18] drm/i915/guc: " Chris Wilson
2018-04-09 11:14 ` [PATCH 12/18] drm/i915: Allow init_breadcrumbs to be used from irq context Chris Wilson
2018-04-09 11:14 ` [PATCH 13/18] drm/i915: Compile out engine debug for release Chris Wilson
2018-04-26 10:15   ` Mika Kuoppala
2018-04-09 11:14 ` [PATCH 14/18] drm/i915/execlists: Force preemption via reset on timeout Chris Wilson
2018-04-09 11:14 ` [PATCH 15/18] drm/i915/execlists: Try preempt-reset from hardirq timer context Chris Wilson
2018-04-09 11:14 ` [PATCH 16/18] drm/i915/preemption: Select timeout when scheduling Chris Wilson
2018-04-09 11:14 ` [PATCH 17/18] drm/i915: Use a preemption timeout to enforce interactivity Chris Wilson
2018-04-09 11:14 ` [PATCH 18/18] drm/i915: Allow user control over preempt timeout on their important context Chris Wilson
2018-04-09 11:29 ` ✗ Fi.CI.CHECKPATCH: warning for series starting with [01/18] drm/i915/execlists: Set queue priority from secondary port Patchwork
2018-04-09 11:44 ` ✓ Fi.CI.BAT: success " Patchwork
2018-04-09 13:08 ` ✗ Fi.CI.IGT: failure " Patchwork
  -- strict thread matches above, loose matches on Subject: below --
2018-03-30 15:55 Preemption reset from timer; CI now 110% happier Chris Wilson
2018-03-30 15:55 ` [PATCH 18/18] drm/i915: Allow user control over preempt timeout on their important context Chris Wilson

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.