* [PATCH 01/12] drm/i915: Allow normal clients to always preempt idle priority clients
From: Chris Wilson @ 2019-02-04  8:41 UTC (permalink / raw)
  To: intel-gfx

When first enabling preemption, we hesitated to make it a free-for-all
where every higher priority client would force a preempt-to-idle cycle
and take over from all lower priority clients. We hesitated because we
were uncertain just how well preemption would work in practice, whether
the preemption latency itself would detract from the latency gains for
higher priority tasks, and whether it would work at all. Since
introducing preemption, we have been enabling it for more common tasks,
even giving normal clients a small preemptive boost when they first
start (to aid fairness and improve interactivity). Now let's take one
step further and give permission for all normal (priority:0) clients to
preempt any idle (priority:<0) task, so that users running long compute
jobs do not overly impact other jobs (i.e. their desktop) and the system
remains responsive under such idle loads.
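
As a point of reference for how a long-running compute job would opt
into being preempted by everyone else, userspace can demote its context
below the default priority. A minimal sketch, assuming a render-node fd
and an arbitrary negative value (any priority:<0 counts as idle here):

    #include <sys/ioctl.h>
    #include <drm/i915_drm.h>

    /* Demote a context so that priority:0 clients always preempt it. */
    static int set_idle_priority(int fd, __u32 ctx_id)
    {
        struct drm_i915_gem_context_param p = {
            .ctx_id = ctx_id,
            .param = I915_CONTEXT_PARAM_PRIORITY,
            .value = (__u64)-512, /* illustrative; valid range is -1023..1023 */
        };

        return ioctl(fd, DRM_IOCTL_I915_GEM_CONTEXT_SETPARAM, &p);
    }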

References: f6322eddaff7 ("drm/i915/preemption: Allow preemption between submission ports")
References: b16c765122f9 ("drm/i915: Priority boost for new clients")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Michał Winiarski <michal.winiarski@intel.com>
Cc: "Bloomfield, Jon" <jon.bloomfield@intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
---
 drivers/gpu/drm/i915/intel_ringbuffer.h | 15 ++++++++++++++-
 1 file changed, 14 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index 34d0a148e664..983ad1e7914d 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -592,7 +592,20 @@ intel_engine_has_preemption(const struct intel_engine_cs *engine)
 
 static inline bool __execlists_need_preempt(int prio, int last)
 {
-	return prio > max(0, last);
+	/*
+	 * Allow preemption of low -> normal -> high, but we do
+	 * not allow low priority tasks to preempt other low priority
+	 * tasks under the impression that latency for low priority
+	 * tasks does not matter (as much as background throughput),
+	 * so kiss.
+	 *
+	 * More naturally we would write
+	 * 	prio >= max(0, last);
+	 * except that we wish to prevent triggering preemption at the same
+	 * priority level: the task that is running should remain running
+	 * to preserve FIFO ordering of dependencies.
+	 */
+	return prio > max(I915_PRIORITY_NORMAL - 1, last);
 }
 
 static inline void
-- 
2.20.1


* [PATCH 02/12] drm/i915/execlists: Suppress mere WAIT preemption
From: Chris Wilson @ 2019-02-04  8:41 UTC (permalink / raw)
  To: intel-gfx

WAIT is occasionally suppressed by virtue of preempted requests being
promoted to NEWCLIENT iff they have not already received that boost.
Make this consistent for all WAIT boosts: they are not allowed to
preempt executing contexts and are merely granted the right to be at the
front of the queue for the next execution slot. This is in keeping with
the desire that the WAIT boost be a minor tweak that does not give
excessive promotion to its user, nor open ourselves up to trivial abuse.

The problem with the inconsistent WAIT preemption becomes more apparent
as the preemption is propagated across the engines, where one engine may
preempt and the other not, while we rely on the exact execution order
being consistent across engines (e.g. using HW semaphores to coordinate
parallel execution).

v2: Also protect GuC submission from false preemption loops.
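
To make the arithmetic concrete: WAIT is the lowest internal priority
bit, and effective_prio() ORs that same bit into the running request's
priority, so a queued request whose only advantage is the WAIT boost can
never satisfy the strict greater-than preemption test. A toy,
userspace-compilable sketch of just that comparison (the real
need_preempt() also clamps against I915_PRIORITY_NORMAL):

    #include <stdio.h>

    #define I915_PRIORITY_WAIT (1 << 0)
    #define __NO_PREEMPTION I915_PRIORITY_WAIT

    /* hint is the priority of the best queued request */
    static int need_preempt(int hint, int running)
    {
        return hint > (running | __NO_PREEMPTION);
    }

    int main(void)
    {
        /* same user priority; the waiter holds only the WAIT boost */
        printf("%d\n", need_preempt(0 | I915_PRIORITY_WAIT, 0)); /* 0 */

        /* a genuinely higher user priority, above the internal bits */
        printf("%d\n", need_preempt(1 << 2, 0)); /* 1 */
        return 0;
    }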

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/i915_request.c          |  11 ++
 drivers/gpu/drm/i915/i915_scheduler.h        |   2 +
 drivers/gpu/drm/i915/intel_guc_submission.c  |   2 +-
 drivers/gpu/drm/i915/intel_lrc.c             |   9 +-
 drivers/gpu/drm/i915/selftests/igt_spinner.c |   9 +-
 drivers/gpu/drm/i915/selftests/intel_lrc.c   | 156 +++++++++++++++++++
 6 files changed, 186 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
index 9ed5baf157a3..7968875d0bed 100644
--- a/drivers/gpu/drm/i915/i915_request.c
+++ b/drivers/gpu/drm/i915/i915_request.c
@@ -377,12 +377,23 @@ void __i915_request_submit(struct i915_request *request)
 
 	/* We may be recursing from the signal callback of another i915 fence */
 	spin_lock_nested(&request->lock, SINGLE_DEPTH_NESTING);
+
 	GEM_BUG_ON(test_bit(I915_FENCE_FLAG_ACTIVE, &request->fence.flags));
 	set_bit(I915_FENCE_FLAG_ACTIVE, &request->fence.flags);
+
 	request->global_seqno = seqno;
 	if (test_bit(DMA_FENCE_FLAG_ENABLE_SIGNAL_BIT, &request->fence.flags) &&
 	    !i915_request_enable_breadcrumb(request))
 		intel_engine_queue_breadcrumbs(engine);
+
+	/*
+	 * As we do not allow WAIT to preempt inflight requests,
+	 * once we have executed a request, along with triggering
+	 * any execution callbacks, we must preserve its ordering
+	 * within the non-preemptible FIFO.
+	 */
+	request->sched.attr.priority |= __NO_PREEMPTION;
+
 	spin_unlock(&request->lock);
 
 	engine->emit_fini_breadcrumb(request,
diff --git a/drivers/gpu/drm/i915/i915_scheduler.h b/drivers/gpu/drm/i915/i915_scheduler.h
index dbe9cb7ecd82..54bd6c89817e 100644
--- a/drivers/gpu/drm/i915/i915_scheduler.h
+++ b/drivers/gpu/drm/i915/i915_scheduler.h
@@ -33,6 +33,8 @@ enum {
 #define I915_PRIORITY_WAIT	((u8)BIT(0))
 #define I915_PRIORITY_NEWCLIENT	((u8)BIT(1))
 
+#define __NO_PREEMPTION (I915_PRIORITY_WAIT)
+
 struct i915_sched_attr {
 	/**
 	 * @priority: execution and service priority
diff --git a/drivers/gpu/drm/i915/intel_guc_submission.c b/drivers/gpu/drm/i915/intel_guc_submission.c
index 8bc8aa54aa35..94695eb819c2 100644
--- a/drivers/gpu/drm/i915/intel_guc_submission.c
+++ b/drivers/gpu/drm/i915/intel_guc_submission.c
@@ -720,7 +720,7 @@ static inline int rq_prio(const struct i915_request *rq)
 
 static inline int port_prio(const struct execlist_port *port)
 {
-	return rq_prio(port_request(port));
+	return rq_prio(port_request(port)) | __NO_PREEMPTION;
 }
 
 static bool __guc_dequeue(struct intel_engine_cs *engine)
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index a9eb0211ce77..773df0bd685b 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -188,6 +188,12 @@ static inline int rq_prio(const struct i915_request *rq)
 	return rq->sched.attr.priority;
 }
 
+static int effective_prio(const struct i915_request *rq)
+{
+	/* Restrict mere WAIT boosts from triggering preemption */
+	return rq_prio(rq) | __NO_PREEMPTION;
+}
+
 static int queue_prio(const struct intel_engine_execlists *execlists)
 {
 	struct i915_priolist *p;
@@ -208,7 +214,7 @@ static int queue_prio(const struct intel_engine_execlists *execlists)
 static inline bool need_preempt(const struct intel_engine_cs *engine,
 				const struct i915_request *rq)
 {
-	const int last_prio = rq_prio(rq);
+	int last_prio;
 
 	if (!intel_engine_has_preemption(engine))
 		return false;
@@ -228,6 +234,7 @@ static inline bool need_preempt(const struct intel_engine_cs *engine,
 	 * preempt. If that hint is stale or we may be trying to preempt
 	 * ourselves, ignore the request.
 	 */
+	last_prio = effective_prio(rq);
 	if (!__execlists_need_preempt(engine->execlists.queue_priority_hint,
 				      last_prio))
 		return false;
diff --git a/drivers/gpu/drm/i915/selftests/igt_spinner.c b/drivers/gpu/drm/i915/selftests/igt_spinner.c
index 9ebd9225684e..86354e51bdd3 100644
--- a/drivers/gpu/drm/i915/selftests/igt_spinner.c
+++ b/drivers/gpu/drm/i915/selftests/igt_spinner.c
@@ -142,10 +142,17 @@ igt_spinner_create_request(struct igt_spinner *spin,
 	*batch++ = upper_32_bits(vma->node.start);
 	*batch++ = MI_BATCH_BUFFER_END; /* not reached */
 
-	i915_gem_chipset_flush(spin->i915);
+	if (engine->emit_init_breadcrumb &&
+	    rq->timeline->has_initial_breadcrumb) {
+		err = engine->emit_init_breadcrumb(rq);
+		if (err)
+			goto cancel_rq;
+	}
 
 	err = engine->emit_bb_start(rq, vma->node.start, PAGE_SIZE, 0);
 
+	i915_gem_chipset_flush(spin->i915);
+
 cancel_rq:
 	if (err) {
 		i915_request_skip(rq, err);
diff --git a/drivers/gpu/drm/i915/selftests/intel_lrc.c b/drivers/gpu/drm/i915/selftests/intel_lrc.c
index fb35f53c9ce3..86fd4589f5f0 100644
--- a/drivers/gpu/drm/i915/selftests/intel_lrc.c
+++ b/drivers/gpu/drm/i915/selftests/intel_lrc.c
@@ -405,6 +405,161 @@ static int live_suppress_self_preempt(void *arg)
 	goto err_client_b;
 }
 
+static int __i915_sw_fence_call
+dummy_notify(struct i915_sw_fence *fence, enum i915_sw_fence_notify state)
+{
+	return NOTIFY_DONE;
+}
+
+static struct i915_request *dummy_request(struct intel_engine_cs *engine)
+{
+	struct i915_request *rq;
+
+	rq = kmalloc(sizeof(*rq), GFP_KERNEL | __GFP_ZERO);
+	if (!rq)
+		return NULL;
+
+	INIT_LIST_HEAD(&rq->active_list);
+	rq->engine = engine;
+
+	i915_sched_node_init(&rq->sched);
+
+	/* mark this request as permanently incomplete */
+	rq->fence.seqno = 1;
+	rq->hwsp_seqno = (u32 *)&rq->fence.seqno + 1;
+
+	i915_sw_fence_init(&rq->submit, dummy_notify);
+	i915_sw_fence_commit(&rq->submit);
+
+	return rq;
+}
+
+static void dummy_request_free(struct i915_request *dummy)
+{
+	i915_request_mark_complete(dummy);
+	i915_sched_node_fini(dummy->engine->i915, &dummy->sched);
+	kfree(dummy);
+}
+
+static int live_suppress_wait_preempt(void *arg)
+{
+	struct drm_i915_private *i915 = arg;
+	struct preempt_client client[4];
+	struct intel_engine_cs *engine;
+	enum intel_engine_id id;
+	intel_wakeref_t wakeref;
+	int err = -ENOMEM;
+	int i;
+
+	/*
+	 * Waiters are given a little priority nudge, but not enough
+	 * to actually cause any preemption. Double check that we do
+	 * not needlessly generate preempt-to-idle cycles.
+	 */
+
+	if (!HAS_LOGICAL_RING_PREEMPTION(i915))
+		return 0;
+
+	mutex_lock(&i915->drm.struct_mutex);
+	wakeref = intel_runtime_pm_get(i915);
+
+	if (preempt_client_init(i915, &client[0])) /* ELSP[0] */
+		goto err_unlock;
+	if (preempt_client_init(i915, &client[1])) /* ELSP[1] */
+		goto err_client_0;
+	if (preempt_client_init(i915, &client[2])) /* head of queue */
+		goto err_client_1;
+	if (preempt_client_init(i915, &client[3])) /* bystander */
+		goto err_client_2;
+
+	for_each_engine(engine, i915, id) {
+		int depth;
+
+		if (!engine->emit_init_breadcrumb)
+			continue;
+
+		for (depth = 0; depth < ARRAY_SIZE(client); depth++) {
+			struct i915_request *rq[ARRAY_SIZE(client)];
+			struct i915_request *dummy;
+
+			engine->execlists.preempt_hang.count = 0;
+
+			dummy = dummy_request(engine);
+			if (!dummy)
+				goto err_client_3;
+
+			for (i = 0; i < ARRAY_SIZE(client); i++) {
+				rq[i] = igt_spinner_create_request(&client[i].spin,
+								   client[i].ctx, engine,
+								   MI_NOOP);
+				if (IS_ERR(rq[i])) {
+					err = PTR_ERR(rq[i]);
+					goto err_wedged;
+				}
+
+				/* Disable NEWCLIENT promotion */
+				i915_gem_active_set(&rq[i]->timeline->last_request,
+						    dummy);
+				i915_request_add(rq[i]);
+			}
+
+			dummy_request_free(dummy);
+
+			GEM_BUG_ON(i915_request_completed(rq[0]));
+			if (!igt_wait_for_spinner(&client[0].spin, rq[0])) {
+				pr_err("First client failed to start\n");
+				goto err_wedged;
+			}
+			GEM_BUG_ON(!i915_request_started(rq[0]));
+
+			if (i915_request_wait(rq[depth],
+					      I915_WAIT_LOCKED |
+					      I915_WAIT_PRIORITY,
+					      1) != -ETIME) {
+				pr_err("Waiter depth:%d completed!\n", depth);
+				goto err_wedged;
+			}
+
+			for (i = 0; i < ARRAY_SIZE(client); i++)
+				igt_spinner_end(&client[i].spin);
+
+			if (igt_flush_test(i915, I915_WAIT_LOCKED))
+				goto err_wedged;
+
+			if (engine->execlists.preempt_hang.count) {
+				pr_err("Preemption recorded x%d, depth %d; should have been suppressed!\n",
+				       engine->execlists.preempt_hang.count,
+				       depth);
+				err = -EINVAL;
+				goto err_client_3;
+			}
+		}
+	}
+
+	err = 0;
+err_client_3:
+	preempt_client_fini(&client[3]);
+err_client_2:
+	preempt_client_fini(&client[2]);
+err_client_1:
+	preempt_client_fini(&client[1]);
+err_client_0:
+	preempt_client_fini(&client[0]);
+err_unlock:
+	if (igt_flush_test(i915, I915_WAIT_LOCKED))
+		err = -EIO;
+	intel_runtime_pm_put(i915, wakeref);
+	mutex_unlock(&i915->drm.struct_mutex);
+	return err;
+
+err_wedged:
+	for (i = 0; i < ARRAY_SIZE(client); i++)
+		igt_spinner_end(&client[i].spin);
+	i915_gem_set_wedged(i915);
+	err = -EIO;
+	goto err_client_3;
+}
+
 static int live_preempt_hang(void *arg)
 {
 	struct drm_i915_private *i915 = arg;
@@ -785,6 +940,7 @@ int intel_execlists_live_selftests(struct drm_i915_private *i915)
 		SUBTEST(live_preempt),
 		SUBTEST(live_late_preempt),
 		SUBTEST(live_suppress_self_preempt),
+		SUBTEST(live_suppress_wait_preempt),
 		SUBTEST(live_preempt_hang),
 		SUBTEST(live_preempt_smoke),
 	};
-- 
2.20.1


* [PATCH 03/12] drm/i915/execlists: Suppress redundant preemption
From: Chris Wilson @ 2019-02-04  8:41 UTC (permalink / raw)
  To: intel-gfx

On unwinding the active request we give it a small (limited to internal
priority levels) boost to prevent it from being gazumped a second time.
However, this means that it can be promoted above the request that
triggered the preemption, causing a preempt-to-idle cycle for no
change. We can avoid this if we take the boost into account when
checking whether the preemption request is valid.

v2: After preemption the active request will be after the preemptee if
they end up with equal priority.

v3: Tvrtko pointed out that this, the existing logic, makes
I915_PRIORITY_WAIT non-preemptible. Document this interesting quirk!

v4: Prove Tvrtko was right about WAIT being non-preemptible and test it.
v5: Except that not all priorities were made equal: the WAIT boost only
fails to preempt if we start off without the NEWCLIENT boost.
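
A worked example of the adjusted comparison, using the internal bits
from i915_scheduler.h (WAIT = BIT(0), NEWCLIENT = BIT(1), so
ACTIVE_PRIORITY = 2): a fresh client boosted to NEWCLIENT (priority 2)
asks to preempt the running request at plain priority 0. The running
request has started but has not yet been unwound, so effective_prio()
values it as if it had already received the boost and been requeued
behind its equal-priority peers:

    effective_prio = ((0 | ACTIVE_PRIORITY) - 1) | __NO_PREEMPTION = 1

The hint of 2 exceeds 1, so this first preemption is valid. After the
unwind the old request really is at priority 2, while the new runner
already carries NEWCLIENT and needs no adjustment:

    effective_prio = 2 | __NO_PREEMPTION = 3

so the hint of 2 can no longer trigger a second, redundant
preempt-to-idle cycle.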

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/intel_lrc.c | 38 ++++++++++++++++++++++++++++----
 1 file changed, 34 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 773df0bd685b..9b6b3acb9070 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -164,6 +164,8 @@
 #define WA_TAIL_DWORDS 2
 #define WA_TAIL_BYTES (sizeof(u32) * WA_TAIL_DWORDS)
 
+#define ACTIVE_PRIORITY (I915_PRIORITY_NEWCLIENT)
+
 static int execlists_context_deferred_alloc(struct i915_gem_context *ctx,
 					    struct intel_engine_cs *engine,
 					    struct intel_context *ce);
@@ -190,8 +192,30 @@ static inline int rq_prio(const struct i915_request *rq)
 
 static int effective_prio(const struct i915_request *rq)
 {
+	int prio = rq_prio(rq);
+
+	/*
+	 * On unwinding the active request, we give it a priority bump
+	 * equivalent to a freshly submitted request. This protects it from
+	 * being gazumped again, but it would be preferable if we didn't
+	 * let it be gazumped in the first place!
+	 *
+	 * See __unwind_incomplete_requests()
+	 */
+	if (~prio & ACTIVE_PRIORITY && __i915_request_has_started(rq)) {
+		/*
+		 * After preemption, we insert the active request at the
+		 * end of the new priority level. This means that we will be
+		 * _lower_ priority than the preemptee all things equal (and
+		 * so the preemption is valid), so adjust our comparison
+		 * accordingly.
+		 */
+		prio |= ACTIVE_PRIORITY;
+		prio--;
+	}
+
 	/* Restrict mere WAIT boosts from triggering preemption */
-	return rq_prio(rq) | __NO_PREEMPTION;
+	return prio | __NO_PREEMPTION;
 }
 
 static int queue_prio(const struct intel_engine_execlists *execlists)
@@ -360,7 +384,7 @@ __unwind_incomplete_requests(struct intel_engine_cs *engine)
 {
 	struct i915_request *rq, *rn, *active = NULL;
 	struct list_head *uninitialized_var(pl);
-	int prio = I915_PRIORITY_INVALID | I915_PRIORITY_NEWCLIENT;
+	int prio = I915_PRIORITY_INVALID | ACTIVE_PRIORITY;
 
 	lockdep_assert_held(&engine->timeline.lock);
 
@@ -391,9 +415,15 @@ __unwind_incomplete_requests(struct intel_engine_cs *engine)
 	 * The active request is now effectively the start of a new client
 	 * stream, so give it the equivalent small priority bump to prevent
 	 * it being gazumped a second time by another peer.
+	 *
+	 * One consequence of this preemption boost is that we may jump
+	 * over lesser priorities (such as I915_PRIORITY_WAIT), effectively
+	 * making those priorities non-preemptible. They will be moved forward
+	 * in the priority queue, but they will not gain immediate access to
+	 * the GPU.
 	 */
-	if (!(prio & I915_PRIORITY_NEWCLIENT)) {
-		prio |= I915_PRIORITY_NEWCLIENT;
+	if (~prio & ACTIVE_PRIORITY && __i915_request_has_started(active)) {
+		prio |= ACTIVE_PRIORITY;
 		active->sched.attr.priority = prio;
 		list_move_tail(&active->sched.link,
 			       i915_sched_lookup_priolist(engine, prio));
-- 
2.20.1


* [PATCH 04/12] drm/i915/selftests: Exercise some AB...BA preemption chains
From: Chris Wilson @ 2019-02-04  8:41 UTC (permalink / raw)
  To: intel-gfx

Build a chain using 2 contexts (A, B) then request a preemption such
that a later A request runs before the spinner in B.
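
The expected flow on each engine, sketched (count is stepped over small
primes so that the chain still fits in the ring):

    submit: A:spinner, B:spinner, B:request x count, A:request
    boost:  A:request raised to I915_PRIORITY_MAX via engine->schedule()
    expect: A:request preempts past the whole B chain and completes
            within the timeout while B:spinner is still spinning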

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/selftests/intel_lrc.c | 103 +++++++++++++++++++++
 1 file changed, 103 insertions(+)

diff --git a/drivers/gpu/drm/i915/selftests/intel_lrc.c b/drivers/gpu/drm/i915/selftests/intel_lrc.c
index 86fd4589f5f0..bb81e462a0c8 100644
--- a/drivers/gpu/drm/i915/selftests/intel_lrc.c
+++ b/drivers/gpu/drm/i915/selftests/intel_lrc.c
@@ -4,6 +4,8 @@
  * Copyright © 2018 Intel Corporation
  */
 
+#include <linux/prime_numbers.h>
+
 #include "../i915_reset.h"
 
 #include "../i915_selftest.h"
@@ -560,6 +562,106 @@ static int live_suppress_wait_preempt(void *arg)
 	goto err_client_3;
 }
 
+static int live_chain_preempt(void *arg)
+{
+	struct drm_i915_private *i915 = arg;
+	struct intel_engine_cs *engine;
+	struct preempt_client hi, lo;
+	enum intel_engine_id id;
+	intel_wakeref_t wakeref;
+	int err = -ENOMEM;
+
+	/*
+	 * Build a chain AB...BA between two contexts (A, B) and request
+	 * preemption of the last request. It should then complete before
+	 * the previously submitted spinner in B.
+	 */
+
+	if (!HAS_LOGICAL_RING_PREEMPTION(i915))
+		return 0;
+
+	mutex_lock(&i915->drm.struct_mutex);
+	wakeref = intel_runtime_pm_get(i915);
+
+	if (preempt_client_init(i915, &hi))
+		goto err_unlock;
+
+	if (preempt_client_init(i915, &lo))
+		goto err_client_hi;
+
+	for_each_engine(engine, i915, id) {
+		struct i915_sched_attr attr = {
+			.priority = I915_USER_PRIORITY(I915_PRIORITY_MAX),
+		};
+		int count, i;
+
+		for_each_prime_number_from(count, 1, 32) { /* must fit ring! */
+			struct i915_request *rq;
+
+			rq = igt_spinner_create_request(&hi.spin,
+							hi.ctx, engine,
+							MI_ARB_CHECK);
+			if (IS_ERR(rq))
+				goto err_wedged;
+			i915_request_add(rq);
+			if (!igt_wait_for_spinner(&hi.spin, rq))
+				goto err_wedged;
+
+			rq = igt_spinner_create_request(&lo.spin,
+							lo.ctx, engine,
+							MI_ARB_CHECK);
+			if (IS_ERR(rq))
+				goto err_wedged;
+			i915_request_add(rq);
+
+			for (i = 0; i < count; i++) {
+				rq = i915_request_alloc(engine, lo.ctx);
+				if (IS_ERR(rq))
+					goto err_wedged;
+				i915_request_add(rq);
+			}
+
+			rq = i915_request_alloc(engine, hi.ctx);
+			if (IS_ERR(rq))
+				goto err_wedged;
+			i915_request_add(rq);
+			engine->schedule(rq, &attr);
+
+			igt_spinner_end(&hi.spin);
+			if (i915_request_wait(rq, I915_WAIT_LOCKED, HZ / 5) < 0) {
+				struct drm_printer p =
+					drm_info_printer(i915->drm.dev);
+
+				pr_err("Failed to preempt over chain of %d\n",
+				       count);
+				intel_engine_dump(engine, &p,
+						  "%s\n", engine->name);
+				goto err_wedged;
+			}
+			igt_spinner_end(&lo.spin);
+		}
+	}
+
+	err = 0;
+err_client_lo:
+	preempt_client_fini(&lo);
+err_client_hi:
+	preempt_client_fini(&hi);
+err_unlock:
+	if (igt_flush_test(i915, I915_WAIT_LOCKED))
+		err = -EIO;
+	intel_runtime_pm_put(i915, wakeref);
+	mutex_unlock(&i915->drm.struct_mutex);
+	return err;
+
+err_wedged:
+	igt_spinner_end(&hi.spin);
+	igt_spinner_end(&lo.spin);
+	i915_gem_set_wedged(i915);
+	err = -EIO;
+	goto err_client_lo;
+}
+
 static int live_preempt_hang(void *arg)
 {
 	struct drm_i915_private *i915 = arg;
@@ -941,6 +1043,7 @@ int intel_execlists_live_selftests(struct drm_i915_private *i915)
 		SUBTEST(live_late_preempt),
 		SUBTEST(live_suppress_self_preempt),
 		SUBTEST(live_suppress_wait_preempt),
+		SUBTEST(live_chain_preempt),
 		SUBTEST(live_preempt_hang),
 		SUBTEST(live_preempt_smoke),
 	};
-- 
2.20.1


* [PATCH 05/12] drm/i915: Trim NEWCLIENT boosting
From: Chris Wilson @ 2019-02-04  8:41 UTC (permalink / raw)
  To: intel-gfx

Limit the NEWCLIENT boost so that its small priority bump is given only
to fresh clients that have no dependencies, i.e. to requests with an
empty signalers list, rather than to any request that happens to start
an idle timeline.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/i915_request.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
index 7968875d0bed..69bc549e95d8 100644
--- a/drivers/gpu/drm/i915/i915_request.c
+++ b/drivers/gpu/drm/i915/i915_request.c
@@ -979,7 +979,7 @@ void i915_request_add(struct i915_request *request)
 		 * Allow interactive/synchronous clients to jump ahead of
 		 * the bulk clients. (FQ_CODEL)
 		 */
-		if (!prev || i915_request_completed(prev))
+		if (list_empty(&request->sched.signalers_list))
 			attr.priority |= I915_PRIORITY_NEWCLIENT;
 
 		engine->schedule(request, &attr);
-- 
2.20.1


* [PATCH 06/12] drm/i915: Show support for accurate sw PMU busyness tracking
From: Chris Wilson @ 2019-02-04  8:41 UTC (permalink / raw)
  To: intel-gfx

Expose whether or not we support accurate software PMU busyness tracking
in our scheduler capabilities, so that userspace can query for it at
runtime.
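
A minimal sketch of the userspace side, reading the capability mask
through the existing I915_PARAM_HAS_SCHEDULER getparam (the device path
is an assumption, and I915_SCHEDULER_CAP_PMU only exists on kernels
carrying this patch):

    #include <stdio.h>
    #include <fcntl.h>
    #include <sys/ioctl.h>
    #include <drm/i915_drm.h>

    int main(void)
    {
        int fd = open("/dev/dri/renderD128", O_RDWR);
        int caps = 0;
        struct drm_i915_getparam gp = {
            .param = I915_PARAM_HAS_SCHEDULER,
            .value = &caps,
        };

        if (fd < 0 || ioctl(fd, DRM_IOCTL_I915_GETPARAM, &gp) < 0)
            return 1;

        printf("scheduler caps %#x, PMU busyness tracking: %s\n", caps,
               caps & I915_SCHEDULER_CAP_PMU ? "yes" : "no");
        return 0;
    }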

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/i915_gem.c         |  2 ++
 drivers/gpu/drm/i915/intel_engine_cs.c  | 38 +++++++++++++++++++++++++
 drivers/gpu/drm/i915/intel_lrc.c        |  6 ----
 drivers/gpu/drm/i915/intel_ringbuffer.h |  2 ++
 include/uapi/drm/i915_drm.h             |  1 +
 5 files changed, 43 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index e802af64d628..bc7d1338b69a 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -4757,6 +4757,8 @@ static int __i915_gem_restart_engines(void *data)
 		}
 	}
 
+	intel_engines_set_scheduler_caps(i915);
+
 	return 0;
 }
 
diff --git a/drivers/gpu/drm/i915/intel_engine_cs.c b/drivers/gpu/drm/i915/intel_engine_cs.c
index 71c01eb13af1..ec2cbbe070a4 100644
--- a/drivers/gpu/drm/i915/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/intel_engine_cs.c
@@ -614,6 +614,44 @@ int intel_engine_setup_common(struct intel_engine_cs *engine)
 	return err;
 }
 
+void intel_engines_set_scheduler_caps(struct drm_i915_private *i915)
+{
+	static const struct {
+		u32 engine_flag;
+		u32 sched_cap;
+	} map[] = {
+		{ I915_ENGINE_HAS_PREEMPTION, I915_SCHEDULER_CAP_PREEMPTION },
+		{ I915_ENGINE_SUPPORTS_STATS, I915_SCHEDULER_CAP_PMU },
+	};
+	struct intel_engine_cs *engine;
+	enum intel_engine_id id;
+	u32 enabled, disabled;
+
+	enabled = 0;
+	disabled = 0;
+	for_each_engine(engine, i915, id) { /* all engines must agree! */
+		int i;
+
+		if (engine->schedule)
+			enabled |= (I915_SCHEDULER_CAP_ENABLED |
+				    I915_SCHEDULER_CAP_PRIORITY);
+		else
+			disabled |= (I915_SCHEDULER_CAP_ENABLED |
+				     I915_SCHEDULER_CAP_PRIORITY);
+
+		for (i = 0; i < ARRAY_SIZE(map); i++) {
+			if (engine->flags & map[i].engine_flag)
+				enabled |= map[i].sched_cap;
+			else
+				disabled |= map[i].sched_cap;
+		}
+	}
+
+	i915->caps.scheduler = enabled & ~disabled;
+	if (!(i915->caps.scheduler & I915_SCHEDULER_CAP_ENABLED))
+		i915->caps.scheduler = 0;
+}
+
 static void __intel_context_unpin(struct i915_gem_context *ctx,
 				  struct intel_engine_cs *engine)
 {
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 9b6b3acb9070..0869a4fd20c7 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -2299,12 +2299,6 @@ void intel_execlists_set_default_submission(struct intel_engine_cs *engine)
 	engine->flags |= I915_ENGINE_SUPPORTS_STATS;
 	if (engine->i915->preempt_context)
 		engine->flags |= I915_ENGINE_HAS_PREEMPTION;
-
-	engine->i915->caps.scheduler =
-		I915_SCHEDULER_CAP_ENABLED |
-		I915_SCHEDULER_CAP_PRIORITY;
-	if (intel_engine_has_preemption(engine))
-		engine->i915->caps.scheduler |= I915_SCHEDULER_CAP_PREEMPTION;
 }
 
 static void
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index 983ad1e7914d..610ee351ebee 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -590,6 +590,8 @@ intel_engine_has_preemption(const struct intel_engine_cs *engine)
 	return engine->flags & I915_ENGINE_HAS_PREEMPTION;
 }
 
+void intel_engines_set_scheduler_caps(struct drm_i915_private *i915);
+
 static inline bool __execlists_need_preempt(int prio, int last)
 {
 	/*
diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
index 298b2e197744..d8ac7f105734 100644
--- a/include/uapi/drm/i915_drm.h
+++ b/include/uapi/drm/i915_drm.h
@@ -476,6 +476,7 @@ typedef struct drm_i915_irq_wait {
 #define   I915_SCHEDULER_CAP_ENABLED	(1ul << 0)
 #define   I915_SCHEDULER_CAP_PRIORITY	(1ul << 1)
 #define   I915_SCHEDULER_CAP_PREEMPTION	(1ul << 2)
+#define   I915_SCHEDULER_CAP_PMU	(1ul << 3)
 
 #define I915_PARAM_HUC_STATUS		 42
 
-- 
2.20.1


* [PATCH 07/12] drm/i915: Revoke mmaps and prevent access to fence registers across reset
From: Chris Wilson @ 2019-02-04  8:41 UTC (permalink / raw)
  To: intel-gfx; +Cc: Mika Kuoppala

Previously, we were able to rely on the recursive properties of
struct_mutex to allow us to serialise revoking mmaps and reacquiring the
FENCE registers with them being clobbered over a global device reset.
I then proceeded to throw out the baby with the bath water in order to
pursue a struct_mutex-less reset.

Perusing LWN for alternative strategies, the dilemma of how to serialise
many readers of a global resource against its occasional clobbering was
answered by https://lwn.net/Articles/202847/ -- Sleepable RCU:

    int readside(void)
    {
        int idx;

        rcu_read_lock();
        if (nomoresrcu) {
            rcu_read_unlock();
            return -EINVAL;
        }
        idx = srcu_read_lock(&ss);
        rcu_read_unlock();

        /* SRCU read-side critical section. */

        srcu_read_unlock(&ss, idx);
        return 0;
    }

    void cleanup(void)
    {
        nomoresrcu = 1;
        synchronize_rcu();
        synchronize_srcu(&ss);
        cleanup_srcu_struct(&ss);
    }

No more worrying about stop_machine, just an uber-complex mutex,
optimised for reads, with the overhead pushed to the rare reset path.

However, we do run the risk of a deadlock as we allocate underneath the
SRCU read lock, and the allocation may require a GPU reset, causing a
dependency cycle via the in-flight requests. We resolve that by declaring
the driver wedged and cancelling all in-flight rendering.
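
The shape of the read side as adopted here, sketched with error
handling trimmed (both helpers are introduced by this patch):

    int srcu;

    srcu = i915_reset_trylock(i915); /* waits out I915_RESET_BACKOFF */
    if (srcu < 0)
        return srcu; /* -EINTR: signalled while a reset was pending */

    /* fence registers and GGTT mmaps are stable until we unlock */

    i915_reset_unlock(i915, srcu);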

v2: Use expedited rcu barriers to match our earlier timing
characteristics.
v3: Try to annotate locking contexts for sparse
v4: Reduce selftest lock duration to avoid a reset deadlock with fences

Testcase: igt/gem_mmap_gtt/hang
Fixes: eb8d0f5af4ec ("drm/i915: Remove GPU reset dependence on struct_mutex")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
---
 drivers/gpu/drm/i915/i915_debugfs.c           |  12 +-
 drivers/gpu/drm/i915/i915_drv.h               |  18 +--
 drivers/gpu/drm/i915/i915_gem.c               |  56 +++------
 drivers/gpu/drm/i915/i915_gem_fence_reg.c     |  31 +----
 drivers/gpu/drm/i915/i915_gpu_error.h         |  12 +-
 drivers/gpu/drm/i915/i915_reset.c             | 107 +++++++++++-------
 drivers/gpu/drm/i915/i915_reset.h             |   4 +
 .../gpu/drm/i915/selftests/intel_hangcheck.c  |   5 +-
 .../gpu/drm/i915/selftests/mock_gem_device.c  |   1 +
 9 files changed, 109 insertions(+), 137 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index fa2c226fc779..2cea263b4d79 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -1281,14 +1281,11 @@ static int i915_hangcheck_info(struct seq_file *m, void *unused)
 	intel_wakeref_t wakeref;
 	enum intel_engine_id id;
 
+	seq_printf(m, "Reset flags: %lx\n", dev_priv->gpu_error.flags);
 	if (test_bit(I915_WEDGED, &dev_priv->gpu_error.flags))
-		seq_puts(m, "Wedged\n");
+		seq_puts(m, "\tWedged\n");
 	if (test_bit(I915_RESET_BACKOFF, &dev_priv->gpu_error.flags))
-		seq_puts(m, "Reset in progress: struct_mutex backoff\n");
-	if (waitqueue_active(&dev_priv->gpu_error.wait_queue))
-		seq_puts(m, "Waiter holding struct mutex\n");
-	if (waitqueue_active(&dev_priv->gpu_error.reset_queue))
-		seq_puts(m, "struct_mutex blocked for reset\n");
+		seq_puts(m, "\tDevice (global) reset in progress\n");
 
 	if (!i915_modparams.enable_hangcheck) {
 		seq_puts(m, "Hangcheck disabled\n");
@@ -3885,9 +3882,6 @@ i915_wedged_set(void *data, u64 val)
 	 * while it is writing to 'i915_wedged'
 	 */
 
-	if (i915_reset_backoff(&i915->gpu_error))
-		return -EAGAIN;
-
 	i915_handle_error(i915, val, I915_ERROR_CAPTURE,
 			  "Manually set wedged engine mask = %llx", val);
 	return 0;
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 534e52e3a8da..3e4538ce5276 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2989,7 +2989,12 @@ i915_gem_obj_finish_shmem_access(struct drm_i915_gem_object *obj)
 	i915_gem_object_unpin_pages(obj);
 }
 
-int __must_check i915_mutex_lock_interruptible(struct drm_device *dev);
+static inline int __must_check
+i915_mutex_lock_interruptible(struct drm_device *dev)
+{
+	return mutex_lock_interruptible(&dev->struct_mutex);
+}
+
 int i915_gem_dumb_create(struct drm_file *file_priv,
 			 struct drm_device *dev,
 			 struct drm_mode_create_dumb *args);
@@ -3006,21 +3011,11 @@ int __must_check i915_gem_set_global_seqno(struct drm_device *dev, u32 seqno);
 struct i915_request *
 i915_gem_find_active_request(struct intel_engine_cs *engine);
 
-static inline bool i915_reset_backoff(struct i915_gpu_error *error)
-{
-	return unlikely(test_bit(I915_RESET_BACKOFF, &error->flags));
-}
-
 static inline bool i915_terminally_wedged(struct i915_gpu_error *error)
 {
 	return unlikely(test_bit(I915_WEDGED, &error->flags));
 }
 
-static inline bool i915_reset_backoff_or_wedged(struct i915_gpu_error *error)
-{
-	return i915_reset_backoff(error) | i915_terminally_wedged(error);
-}
-
 static inline u32 i915_reset_count(struct i915_gpu_error *error)
 {
 	return READ_ONCE(error->reset_count);
@@ -3093,7 +3088,6 @@ struct drm_i915_fence_reg *
 i915_reserve_fence(struct drm_i915_private *dev_priv);
 void i915_unreserve_fence(struct drm_i915_fence_reg *fence);
 
-void i915_gem_revoke_fences(struct drm_i915_private *dev_priv);
 void i915_gem_restore_fences(struct drm_i915_private *dev_priv);
 
 void i915_gem_detect_bit_6_swizzle(struct drm_i915_private *dev_priv);
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index bc7d1338b69a..2c6161c89cc7 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -100,47 +100,6 @@ static void i915_gem_info_remove_obj(struct drm_i915_private *dev_priv,
 	spin_unlock(&dev_priv->mm.object_stat_lock);
 }
 
-static int
-i915_gem_wait_for_error(struct i915_gpu_error *error)
-{
-	int ret;
-
-	might_sleep();
-
-	/*
-	 * Only wait 10 seconds for the gpu reset to complete to avoid hanging
-	 * userspace. If it takes that long something really bad is going on and
-	 * we should simply try to bail out and fail as gracefully as possible.
-	 */
-	ret = wait_event_interruptible_timeout(error->reset_queue,
-					       !i915_reset_backoff(error),
-					       I915_RESET_TIMEOUT);
-	if (ret == 0) {
-		DRM_ERROR("Timed out waiting for the gpu reset to complete\n");
-		return -EIO;
-	} else if (ret < 0) {
-		return ret;
-	} else {
-		return 0;
-	}
-}
-
-int i915_mutex_lock_interruptible(struct drm_device *dev)
-{
-	struct drm_i915_private *dev_priv = to_i915(dev);
-	int ret;
-
-	ret = i915_gem_wait_for_error(&dev_priv->gpu_error);
-	if (ret)
-		return ret;
-
-	ret = mutex_lock_interruptible(&dev->struct_mutex);
-	if (ret)
-		return ret;
-
-	return 0;
-}
-
 static u32 __i915_gem_park(struct drm_i915_private *i915)
 {
 	intel_wakeref_t wakeref;
@@ -1869,6 +1828,7 @@ vm_fault_t i915_gem_fault(struct vm_fault *vmf)
 	intel_wakeref_t wakeref;
 	struct i915_vma *vma;
 	pgoff_t page_offset;
+	int srcu;
 	int ret;
 
 	/* Sanity check that we allow writing into this object */
@@ -1908,7 +1868,6 @@ vm_fault_t i915_gem_fault(struct vm_fault *vmf)
 		goto err_unlock;
 	}
 
-
 	/* Now pin it into the GTT as needed */
 	vma = i915_gem_object_ggtt_pin(obj, NULL, 0, 0,
 				       PIN_MAPPABLE |
@@ -1946,9 +1905,15 @@ vm_fault_t i915_gem_fault(struct vm_fault *vmf)
 	if (ret)
 		goto err_unpin;
 
+	srcu = i915_reset_trylock(dev_priv);
+	if (srcu < 0) {
+		ret = srcu;
+		goto err_unpin;
+	}
+
 	ret = i915_vma_pin_fence(vma);
 	if (ret)
-		goto err_unpin;
+		goto err_reset;
 
 	/* Finally, remap it using the new GTT offset */
 	ret = remap_io_mapping(area,
@@ -1969,6 +1934,8 @@ vm_fault_t i915_gem_fault(struct vm_fault *vmf)
 
 err_fence:
 	i915_vma_unpin_fence(vma);
+err_reset:
+	i915_reset_unlock(dev_priv, srcu);
 err_unpin:
 	__i915_vma_unpin(vma);
 err_unlock:
@@ -5326,6 +5293,7 @@ int i915_gem_init_early(struct drm_i915_private *dev_priv)
 	init_waitqueue_head(&dev_priv->gpu_error.wait_queue);
 	init_waitqueue_head(&dev_priv->gpu_error.reset_queue);
 	mutex_init(&dev_priv->gpu_error.wedge_mutex);
+	init_srcu_struct(&dev_priv->gpu_error.srcu);
 
 	atomic_set(&dev_priv->mm.bsd_engine_dispatch_index, 0);
 
@@ -5358,6 +5326,8 @@ void i915_gem_cleanup_early(struct drm_i915_private *dev_priv)
 	GEM_BUG_ON(atomic_read(&dev_priv->mm.free_count));
 	WARN_ON(dev_priv->mm.object_count);
 
+	cleanup_srcu_struct(&dev_priv->gpu_error.srcu);
+
 	kmem_cache_destroy(dev_priv->priorities);
 	kmem_cache_destroy(dev_priv->dependencies);
 	kmem_cache_destroy(dev_priv->requests);
diff --git a/drivers/gpu/drm/i915/i915_gem_fence_reg.c b/drivers/gpu/drm/i915/i915_gem_fence_reg.c
index 46e259661294..bd0d5b8d6c96 100644
--- a/drivers/gpu/drm/i915/i915_gem_fence_reg.c
+++ b/drivers/gpu/drm/i915/i915_gem_fence_reg.c
@@ -240,6 +240,10 @@ static int fence_update(struct drm_i915_fence_reg *fence,
 		i915_vma_flush_writes(old);
 	}
 
+	ret = i915_reset_trylock(fence->i915);
+	if (ret < 0)
+		return ret;
+
 	if (fence->vma && fence->vma != vma) {
 		/* Ensure that all userspace CPU access is completed before
 		 * stealing the fence.
@@ -272,6 +276,7 @@ static int fence_update(struct drm_i915_fence_reg *fence,
 		list_move_tail(&fence->link, &fence->i915->mm.fence_list);
 	}
 
+	i915_reset_unlock(fence->i915, ret);
 	return 0;
 }
 
@@ -435,32 +440,6 @@ void i915_unreserve_fence(struct drm_i915_fence_reg *fence)
 	list_add(&fence->link, &fence->i915->mm.fence_list);
 }
 
-/**
- * i915_gem_revoke_fences - revoke fence state
- * @dev_priv: i915 device private
- *
- * Removes all GTT mmappings via the fence registers. This forces any user
- * of the fence to reacquire that fence before continuing with their access.
- * One use is during GPU reset where the fence register is lost and we need to
- * revoke concurrent userspace access via GTT mmaps until the hardware has been
- * reset and the fence registers have been restored.
- */
-void i915_gem_revoke_fences(struct drm_i915_private *dev_priv)
-{
-	int i;
-
-	lockdep_assert_held(&dev_priv->drm.struct_mutex);
-
-	for (i = 0; i < dev_priv->num_fence_regs; i++) {
-		struct drm_i915_fence_reg *fence = &dev_priv->fence_regs[i];
-
-		GEM_BUG_ON(fence->vma && fence->vma->fence != fence);
-
-		if (fence->vma)
-			i915_vma_revoke_mmap(fence->vma);
-	}
-}
-
 /**
  * i915_gem_restore_fences - restore fence state
  * @dev_priv: i915 device private
diff --git a/drivers/gpu/drm/i915/i915_gpu_error.h b/drivers/gpu/drm/i915/i915_gpu_error.h
index 53b1f22dd365..4e797c552b96 100644
--- a/drivers/gpu/drm/i915/i915_gpu_error.h
+++ b/drivers/gpu/drm/i915/i915_gpu_error.h
@@ -231,12 +231,10 @@ struct i915_gpu_error {
 	/**
 	 * flags: Control various stages of the GPU reset
 	 *
-	 * #I915_RESET_BACKOFF - When we start a reset, we want to stop any
-	 * other users acquiring the struct_mutex. To do this we set the
-	 * #I915_RESET_BACKOFF bit in the error flags when we detect a reset
-	 * and then check for that bit before acquiring the struct_mutex (in
-	 * i915_mutex_lock_interruptible()?). I915_RESET_BACKOFF serves a
-	 * secondary role in preventing two concurrent global reset attempts.
+	 * #I915_RESET_BACKOFF - When we start a global reset, we need to
+	 * serialise with any other users attempting to do the same, and
+	 * any global resources that may be clobbered by the reset (such as
+	 * FENCE registers).
 	 *
 	 * #I915_RESET_ENGINE[num_engines] - Since the driver doesn't need to
 	 * acquire the struct_mutex to reset an engine, we need an explicit
@@ -272,6 +270,8 @@ struct i915_gpu_error {
 	 */
 	wait_queue_head_t reset_queue;
 
+	struct srcu_struct srcu;
+
 	struct i915_gpu_restart *restart;
 };
 
diff --git a/drivers/gpu/drm/i915/i915_reset.c b/drivers/gpu/drm/i915/i915_reset.c
index 4462007a681c..f58fae457ec6 100644
--- a/drivers/gpu/drm/i915/i915_reset.c
+++ b/drivers/gpu/drm/i915/i915_reset.c
@@ -639,6 +639,31 @@ static void reset_prepare_engine(struct intel_engine_cs *engine)
 	engine->reset.prepare(engine);
 }
 
+static void revoke_mmaps(struct drm_i915_private *i915)
+{
+	int i;
+
+	for (i = 0; i < i915->num_fence_regs; i++) {
+		struct i915_vma *vma = i915->fence_regs[i].vma;
+		struct drm_vma_offset_node *node;
+		u64 vma_offset;
+
+		if (!vma)
+			continue;
+
+		GEM_BUG_ON(vma->fence != &i915->fence_regs[i]);
+		if (!i915_vma_has_userfault(vma))
+			continue;
+
+		node = &vma->obj->base.vma_node;
+		vma_offset = vma->ggtt_view.partial.offset << PAGE_SHIFT;
+		unmap_mapping_range(i915->drm.anon_inode->i_mapping,
+				    drm_vma_node_offset_addr(node) + vma_offset,
+				    vma->size,
+				    1);
+	}
+}
+
 static void reset_prepare(struct drm_i915_private *i915)
 {
 	struct intel_engine_cs *engine;
@@ -648,6 +673,7 @@ static void reset_prepare(struct drm_i915_private *i915)
 		reset_prepare_engine(engine);
 
 	intel_uc_sanitize(i915);
+	revoke_mmaps(i915);
 }
 
 static int gt_reset(struct drm_i915_private *i915, unsigned int stalled_mask)
@@ -911,50 +937,22 @@ bool i915_gem_unset_wedged(struct drm_i915_private *i915)
 	return ret;
 }
 
-struct __i915_reset {
-	struct drm_i915_private *i915;
-	unsigned int stalled_mask;
-};
-
-static int __i915_reset__BKL(void *data)
-{
-	struct __i915_reset *arg = data;
-	int err;
-
-	err = intel_gpu_reset(arg->i915, ALL_ENGINES);
-	if (err)
-		return err;
-
-	return gt_reset(arg->i915, arg->stalled_mask);
-}
-
-#if RESET_UNDER_STOP_MACHINE
-/*
- * XXX An alternative to using stop_machine would be to park only the
- * processes that have a GGTT mmap. By remote parking the threads (SIGSTOP)
- * we should be able to prevent their memmory accesses via the lost fence
- * registers over the course of the reset without the potential recursive
- * of mutexes between the pagefault handler and reset.
- *
- * See igt/gem_mmap_gtt/hang
- */
-#define __do_reset(fn, arg) stop_machine(fn, arg, NULL)
-#else
-#define __do_reset(fn, arg) fn(arg)
-#endif
-
 static int do_reset(struct drm_i915_private *i915, unsigned int stalled_mask)
 {
-	struct __i915_reset arg = { i915, stalled_mask };
 	int err, i;
 
-	err = __do_reset(__i915_reset__BKL, &arg);
+	/* Flush everyone currently using a resource about to be clobbered */
+	synchronize_srcu(&i915->gpu_error.srcu);
+
+	err = intel_gpu_reset(i915, ALL_ENGINES);
 	for (i = 0; err && i < RESET_MAX_RETRIES; i++) {
-		msleep(100);
-		err = __do_reset(__i915_reset__BKL, &arg);
+		msleep(10 * (i + 1));
+		err = intel_gpu_reset(i915, ALL_ENGINES);
 	}
+	if (err)
+		return err;
 
-	return err;
+	return gt_reset(i915, stalled_mask);
 }
 
 /**
@@ -1274,9 +1272,12 @@ void i915_handle_error(struct drm_i915_private *i915,
 		wait_event(i915->gpu_error.reset_queue,
 			   !test_bit(I915_RESET_BACKOFF,
 				     &i915->gpu_error.flags));
-		goto out;
+		goto out; /* piggy-back on the other reset */
 	}
 
+	/* Make sure i915_reset_trylock() sees the I915_RESET_BACKOFF */
+	synchronize_rcu_expedited();
+
 	/* Prevent any other reset-engine attempt. */
 	for_each_engine(engine, i915, tmp) {
 		while (test_and_set_bit(I915_RESET_ENGINE + engine->id,
@@ -1300,6 +1301,36 @@ void i915_handle_error(struct drm_i915_private *i915,
 	intel_runtime_pm_put(i915, wakeref);
 }
 
+int i915_reset_trylock(struct drm_i915_private *i915)
+{
+	struct i915_gpu_error *error = &i915->gpu_error;
+	int srcu;
+
+	rcu_read_lock();
+	while (test_bit(I915_RESET_BACKOFF, &error->flags)) {
+		rcu_read_unlock();
+
+		if (wait_event_interruptible(error->reset_queue,
+					     !test_bit(I915_RESET_BACKOFF,
+						       &error->flags)))
+			return -EINTR;
+
+		rcu_read_lock();
+	}
+	srcu = srcu_read_lock(&error->srcu);
+	rcu_read_unlock();
+
+	return srcu;
+}
+
+void i915_reset_unlock(struct drm_i915_private *i915, int tag)
+__releases(&i915->gpu_error.srcu)
+{
+	struct i915_gpu_error *error = &i915->gpu_error;
+
+	srcu_read_unlock(&error->srcu, tag);
+}
+
 bool i915_reset_flush(struct drm_i915_private *i915)
 {
 	int err;
diff --git a/drivers/gpu/drm/i915/i915_reset.h b/drivers/gpu/drm/i915/i915_reset.h
index f2d347f319df..893c5d1c2eb8 100644
--- a/drivers/gpu/drm/i915/i915_reset.h
+++ b/drivers/gpu/drm/i915/i915_reset.h
@@ -9,6 +9,7 @@
 
 #include <linux/compiler.h>
 #include <linux/types.h>
+#include <linux/srcu.h>
 
 struct drm_i915_private;
 struct intel_engine_cs;
@@ -32,6 +33,9 @@ int i915_reset_engine(struct intel_engine_cs *engine,
 void i915_reset_request(struct i915_request *rq, bool guilty);
 bool i915_reset_flush(struct drm_i915_private *i915);
 
+int __must_check i915_reset_trylock(struct drm_i915_private *i915);
+void i915_reset_unlock(struct drm_i915_private *i915, int tag);
+
 bool intel_has_gpu_reset(struct drm_i915_private *i915);
 bool intel_has_reset_engine(struct drm_i915_private *i915);
 
diff --git a/drivers/gpu/drm/i915/selftests/intel_hangcheck.c b/drivers/gpu/drm/i915/selftests/intel_hangcheck.c
index 7b6f3bea9ef8..4886fac12628 100644
--- a/drivers/gpu/drm/i915/selftests/intel_hangcheck.c
+++ b/drivers/gpu/drm/i915/selftests/intel_hangcheck.c
@@ -1039,8 +1039,6 @@ static int __igt_reset_evict_vma(struct drm_i915_private *i915,
 
 	/* Check that we can recover an unbind stuck on a hanging request */
 
-	igt_global_reset_lock(i915);
-
 	mutex_lock(&i915->drm.struct_mutex);
 	err = hang_init(&h, i915);
 	if (err)
@@ -1138,7 +1136,9 @@ static int __igt_reset_evict_vma(struct drm_i915_private *i915,
 	}
 
 out_reset:
+	igt_global_reset_lock(i915);
 	fake_hangcheck(rq->i915, intel_engine_flag(rq->engine));
+	igt_global_reset_unlock(i915);
 
 	if (tsk) {
 		struct igt_wedge_me w;
@@ -1159,7 +1159,6 @@ static int __igt_reset_evict_vma(struct drm_i915_private *i915,
 	hang_fini(&h);
 unlock:
 	mutex_unlock(&i915->drm.struct_mutex);
-	igt_global_reset_unlock(i915);
 
 	if (i915_terminally_wedged(&i915->gpu_error))
 		return -EIO;
diff --git a/drivers/gpu/drm/i915/selftests/mock_gem_device.c b/drivers/gpu/drm/i915/selftests/mock_gem_device.c
index 14ae46fda49f..074a0d9cbf26 100644
--- a/drivers/gpu/drm/i915/selftests/mock_gem_device.c
+++ b/drivers/gpu/drm/i915/selftests/mock_gem_device.c
@@ -189,6 +189,7 @@ struct drm_i915_private *mock_gem_device(void)
 
 	init_waitqueue_head(&i915->gpu_error.wait_queue);
 	init_waitqueue_head(&i915->gpu_error.reset_queue);
+	init_srcu_struct(&i915->gpu_error.srcu);
 	mutex_init(&i915->gpu_error.wedge_mutex);
 
 	i915->wq = alloc_ordered_workqueue("mock", 0);
-- 
2.20.1


* [PATCH 08/12] drm/i915: Force the GPU reset upon wedging
From: Chris Wilson @ 2019-02-04  8:41 UTC (permalink / raw)
  To: intel-gfx; +Cc: Mika Kuoppala

When declaring the GPU wedged, we do need to hit the GPU with the reset
hammer so that its state matches our presumed state during cleanup. If
the reset fails, it fails, and we may be unhappy but wedged. However, if
we are testing our wedge/unwedged handling, the desync carries over into
the next test and promptly explodes.

References: https://bugs.freedesktop.org/show_bug.cgi?id=106702
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
---
 drivers/gpu/drm/i915/i915_reset.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_reset.c b/drivers/gpu/drm/i915/i915_reset.c
index f58fae457ec6..c6f6400f95b4 100644
--- a/drivers/gpu/drm/i915/i915_reset.c
+++ b/drivers/gpu/drm/i915/i915_reset.c
@@ -532,9 +532,6 @@ typedef int (*reset_func)(struct drm_i915_private *,
 
 static reset_func intel_get_gpu_reset(struct drm_i915_private *i915)
 {
-	if (!i915_modparams.reset)
-		return NULL;
-
 	if (INTEL_GEN(i915) >= 8)
 		return gen8_reset_engines;
 	else if (INTEL_GEN(i915) >= 6)
@@ -599,6 +596,9 @@ bool intel_has_gpu_reset(struct drm_i915_private *i915)
 	if (USES_GUC(i915))
 		return false;
 
+	if (!i915_modparams.reset)
+		return false;
+
 	return intel_get_gpu_reset(i915);
 }
 
@@ -823,7 +823,7 @@ void i915_gem_set_wedged(struct drm_i915_private *i915)
 		reset_prepare_engine(engine);
 
 	/* Even if the GPU reset fails, it should still stop the engines */
-	if (INTEL_GEN(i915) >= 5)
+	if (!INTEL_INFO(i915)->gpu_reset_clobbers_display)
 		intel_gpu_reset(i915, ALL_ENGINES);
 
 	for_each_engine(engine, i915, id) {
-- 
2.20.1


* [PATCH 09/12] drm/i915: Uninterruptibly drain the timelines on unwedging
From: Chris Wilson @ 2019-02-04  8:41 UTC (permalink / raw)
  To: intel-gfx

On wedging, we mark all executing requests as complete and complete all
pending requests as soon as they are ready. Before unwedging though we
wish to flush those pending requests prior to restoring default
execution, and so we must wait. Do so uninterruptibly, as we do not
provide the EINTR gracefully back to userspace in this case but would
persist in the permanently wedged state without restarting the syscall.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/i915_reset.c | 28 ++++++++--------------------
 1 file changed, 8 insertions(+), 20 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_reset.c b/drivers/gpu/drm/i915/i915_reset.c
index c6f6400f95b4..7fc86b44d872 100644
--- a/drivers/gpu/drm/i915/i915_reset.c
+++ b/drivers/gpu/drm/i915/i915_reset.c
@@ -861,7 +861,6 @@ bool i915_gem_unset_wedged(struct drm_i915_private *i915)
 {
 	struct i915_gpu_error *error = &i915->gpu_error;
 	struct i915_timeline *tl;
-	bool ret = false;
 
 	if (!test_bit(I915_WEDGED, &error->flags))
 		return true;
@@ -886,30 +885,20 @@ bool i915_gem_unset_wedged(struct drm_i915_private *i915)
 	mutex_lock(&i915->gt.timelines.mutex);
 	list_for_each_entry(tl, &i915->gt.timelines.active_list, link) {
 		struct i915_request *rq;
-		long timeout;
 
 		rq = i915_gem_active_get_unlocked(&tl->last_request);
 		if (!rq)
 			continue;
 
 		/*
-		 * We can't use our normal waiter as we want to
-		 * avoid recursively trying to handle the current
-		 * reset. The basic dma_fence_default_wait() installs
-		 * a callback for dma_fence_signal(), which is
-		 * triggered by our nop handler (indirectly, the
-		 * callback enables the signaler thread which is
-		 * woken by the nop_submit_request() advancing the seqno
-		 * and when the seqno passes the fence, the signaler
-		 * then signals the fence waking us up).
+		 * All internal dependencies (i915_requests) will have
+		 * been flushed by the set-wedge, but we may be stuck waiting
+		 * for external fences. These should all be capped to 10s
+		 * (I915_FENCE_TIMEOUT) so this wait should not be unbounded
+		 * in the worst case.
 		 */
-		timeout = dma_fence_default_wait(&rq->fence, true,
-						 MAX_SCHEDULE_TIMEOUT);
+		dma_fence_default_wait(&rq->fence, false, MAX_SCHEDULE_TIMEOUT);
 		i915_request_put(rq);
-		if (timeout < 0) {
-			mutex_unlock(&i915->gt.timelines.mutex);
-			goto unlock;
-		}
 	}
 	mutex_unlock(&i915->gt.timelines.mutex);
 
@@ -930,11 +919,10 @@ bool i915_gem_unset_wedged(struct drm_i915_private *i915)
 
 	smp_mb__before_atomic(); /* complete takeover before enabling execbuf */
 	clear_bit(I915_WEDGED, &i915->gpu_error.flags);
-	ret = true;
-unlock:
+
 	mutex_unlock(&i915->gpu_error.wedge_mutex);
 
-	return ret;
+	return true;
 }
 
 static int do_reset(struct drm_i915_private *i915, unsigned int stalled_mask)
-- 
2.20.1


* [PATCH 10/12] drm/i915: Wait for old resets before applying debugfs/i915_wedged
From: Chris Wilson @ 2019-02-04  8:41 UTC (permalink / raw)
  To: intel-gfx

Since we use the debugfs to recover the device after modifying the
i915.reset parameter, we need to be sure that we apply a fresh reset and
do not piggy-back onto a concurrent one, in order for the parameter to
take effect.
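
For reference, a sketch of the harness-side usage this serialises (the
debugfs path assumes card0; writing -1 marks every engine as hung):

    #include <fcntl.h>
    #include <unistd.h>

    static void trigger_reset(void)
    {
        int fd = open("/sys/kernel/debug/dri/0/i915_wedged", O_WRONLY);

        if (fd < 0)
            return;
        write(fd, "-1", 2); /* i915_wedged_set() with all engines guilty */
        close(fd);
    }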

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/i915_debugfs.c | 10 +++-------
 1 file changed, 3 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index 2cea263b4d79..54e426883529 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -3874,13 +3874,9 @@ i915_wedged_set(void *data, u64 val)
 {
 	struct drm_i915_private *i915 = data;
 
-	/*
-	 * There is no safeguard against this debugfs entry colliding
-	 * with the hangcheck calling same i915_handle_error() in
-	 * parallel, causing an explosion. For now we assume that the
-	 * test harness is responsible enough not to inject gpu hangs
-	 * while it is writing to 'i915_wedged'
-	 */
+	/* Flush any previous reset before applying for a new one */
+	wait_event(i915->gpu_error.reset_queue,
+		   !test_bit(I915_RESET_BACKOFF, &i915->gpu_error.flags));
 
 	i915_handle_error(i915, val, I915_ERROR_CAPTURE,
 			  "Manually set wedged engine mask = %llx", val);
-- 
2.20.1


* [PATCH 11/12] drm/i915: Serialise resets with wedging
  2019-02-04  8:41 [PATCH 01/12] drm/i915: Allow normal clients to always preempt idle priority clients Chris Wilson
                   ` (8 preceding siblings ...)
  2019-02-04  8:41 ` [PATCH 10/12] drm/i915: Wait for old resets before applying debugfs/i915_wedged Chris Wilson
@ 2019-02-04  8:41 ` Chris Wilson
  2019-02-04  8:41 ` [PATCH 12/12] drm/i915: Don't claim an unstarted request was guilty Chris Wilson
                   ` (5 subsequent siblings)
  15 siblings, 0 replies; 36+ messages in thread
From: Chris Wilson @ 2019-02-04  8:41 UTC (permalink / raw)
  To: intel-gfx

Prevent set-wedge from running concurrently with an ongoing reset (and
vice versa) by taking the same wedge_mutex around both operations.
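
The shape is the usual split into a __locked body plus a locking
wrapper (sketch; the real hunks follow):

        void i915_gem_set_wedged(struct drm_i915_private *i915)
        {
                mutex_lock(&i915->gpu_error.wedge_mutex);
                __i915_gem_set_wedged(i915);
                mutex_unlock(&i915->gpu_error.wedge_mutex);
        }

with i915_reset_device() now holding wedge_mutex across i915_reset(),
which in turn calls the __unlocked variants underneath the same lock.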

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/i915_reset.c | 68 ++++++++++++++++++-------------
 1 file changed, 40 insertions(+), 28 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_reset.c b/drivers/gpu/drm/i915/i915_reset.c
index 7fc86b44d872..ca19fcf29c5b 100644
--- a/drivers/gpu/drm/i915/i915_reset.c
+++ b/drivers/gpu/drm/i915/i915_reset.c
@@ -793,17 +793,14 @@ static void nop_submit_request(struct i915_request *request)
 	intel_engine_queue_breadcrumbs(engine);
 }
 
-void i915_gem_set_wedged(struct drm_i915_private *i915)
+static void __i915_gem_set_wedged(struct drm_i915_private *i915)
 {
 	struct i915_gpu_error *error = &i915->gpu_error;
 	struct intel_engine_cs *engine;
 	enum intel_engine_id id;
 
-	mutex_lock(&error->wedge_mutex);
-	if (test_bit(I915_WEDGED, &error->flags)) {
-		mutex_unlock(&error->wedge_mutex);
+	if (test_bit(I915_WEDGED, &error->flags))
 		return;
-	}
 
 	if (GEM_SHOW_DEBUG() && !intel_engines_are_idle(i915)) {
 		struct drm_printer p = drm_debug_printer(__func__);
@@ -852,12 +849,18 @@ void i915_gem_set_wedged(struct drm_i915_private *i915)
 	set_bit(I915_WEDGED, &error->flags);
 
 	GEM_TRACE("end\n");
-	mutex_unlock(&error->wedge_mutex);
+}
 
-	wake_up_all(&error->reset_queue);
+void i915_gem_set_wedged(struct drm_i915_private *i915)
+{
+	struct i915_gpu_error *error = &i915->gpu_error;
+
+	mutex_lock(&error->wedge_mutex);
+	__i915_gem_set_wedged(i915);
+	mutex_unlock(&error->wedge_mutex);
 }
 
-bool i915_gem_unset_wedged(struct drm_i915_private *i915)
+static bool __i915_gem_unset_wedged(struct drm_i915_private *i915)
 {
 	struct i915_gpu_error *error = &i915->gpu_error;
 	struct i915_timeline *tl;
@@ -868,8 +871,6 @@ bool i915_gem_unset_wedged(struct drm_i915_private *i915)
 	if (!i915->gt.scratch) /* Never full initialised, recovery impossible */
 		return false;
 
-	mutex_lock(&error->wedge_mutex);
-
 	GEM_TRACE("start\n");
 
 	/*
@@ -920,11 +921,21 @@ bool i915_gem_unset_wedged(struct drm_i915_private *i915)
 	smp_mb__before_atomic(); /* complete takeover before enabling execbuf */
 	clear_bit(I915_WEDGED, &i915->gpu_error.flags);
 
-	mutex_unlock(&i915->gpu_error.wedge_mutex);
-
 	return true;
 }
 
+bool i915_gem_unset_wedged(struct drm_i915_private *i915)
+{
+	struct i915_gpu_error *error = &i915->gpu_error;
+	bool result;
+
+	mutex_lock(&error->wedge_mutex);
+	result = __i915_gem_unset_wedged(i915);
+	mutex_unlock(&error->wedge_mutex);
+
+	return result;
+}
+
 static int do_reset(struct drm_i915_private *i915, unsigned int stalled_mask)
 {
 	int err, i;
@@ -976,7 +987,7 @@ void i915_reset(struct drm_i915_private *i915,
 	GEM_BUG_ON(!test_bit(I915_RESET_BACKOFF, &error->flags));
 
 	/* Clear any previous failed attempts at recovery. Time to try again. */
-	if (!i915_gem_unset_wedged(i915))
+	if (!__i915_gem_unset_wedged(i915))
 		return;
 
 	if (reason)
@@ -1038,7 +1049,7 @@ void i915_reset(struct drm_i915_private *i915,
 	 */
 	add_taint(TAINT_WARN, LOCKDEP_STILL_OK);
 error:
-	i915_gem_set_wedged(i915);
+	__i915_gem_set_wedged(i915);
 	goto finish;
 }
 
@@ -1130,7 +1141,9 @@ static void i915_reset_device(struct drm_i915_private *i915,
 	i915_wedge_on_timeout(&w, i915, 5 * HZ) {
 		intel_prepare_reset(i915);
 
+		mutex_lock(&error->wedge_mutex);
 		i915_reset(i915, engine_mask, reason);
+		mutex_unlock(&error->wedge_mutex);
 
 		intel_finish_reset(i915);
 	}
@@ -1198,6 +1211,7 @@ void i915_handle_error(struct drm_i915_private *i915,
 		       unsigned long flags,
 		       const char *fmt, ...)
 {
+	struct i915_gpu_error *error = &i915->gpu_error;
 	struct intel_engine_cs *engine;
 	intel_wakeref_t wakeref;
 	unsigned int tmp;
@@ -1234,20 +1248,19 @@ void i915_handle_error(struct drm_i915_private *i915,
 	 * Try engine reset when available. We fall back to full reset if
 	 * single reset fails.
 	 */
-	if (intel_has_reset_engine(i915) &&
-	    !i915_terminally_wedged(&i915->gpu_error)) {
+	if (intel_has_reset_engine(i915) && !i915_terminally_wedged(error)) {
 		for_each_engine_masked(engine, i915, engine_mask, tmp) {
 			BUILD_BUG_ON(I915_RESET_MODESET >= I915_RESET_ENGINE);
 			if (test_and_set_bit(I915_RESET_ENGINE + engine->id,
-					     &i915->gpu_error.flags))
+					     &error->flags))
 				continue;
 
 			if (i915_reset_engine(engine, msg) == 0)
 				engine_mask &= ~intel_engine_flag(engine);
 
 			clear_bit(I915_RESET_ENGINE + engine->id,
-				  &i915->gpu_error.flags);
-			wake_up_bit(&i915->gpu_error.flags,
+				  &error->flags);
+			wake_up_bit(&error->flags,
 				    I915_RESET_ENGINE + engine->id);
 		}
 	}
@@ -1256,10 +1269,9 @@ void i915_handle_error(struct drm_i915_private *i915,
 		goto out;
 
 	/* Full reset needs the mutex, stop any other user trying to do so. */
-	if (test_and_set_bit(I915_RESET_BACKOFF, &i915->gpu_error.flags)) {
-		wait_event(i915->gpu_error.reset_queue,
-			   !test_bit(I915_RESET_BACKOFF,
-				     &i915->gpu_error.flags));
+	if (test_and_set_bit(I915_RESET_BACKOFF, &error->flags)) {
+		wait_event(error->reset_queue,
+			   !test_bit(I915_RESET_BACKOFF, &error->flags));
 		goto out; /* piggy-back on the other reset */
 	}
 
@@ -1269,8 +1281,8 @@ void i915_handle_error(struct drm_i915_private *i915,
 	/* Prevent any other reset-engine attempt. */
 	for_each_engine(engine, i915, tmp) {
 		while (test_and_set_bit(I915_RESET_ENGINE + engine->id,
-					&i915->gpu_error.flags))
-			wait_on_bit(&i915->gpu_error.flags,
+					&error->flags))
+			wait_on_bit(&error->flags,
 				    I915_RESET_ENGINE + engine->id,
 				    TASK_UNINTERRUPTIBLE);
 	}
@@ -1279,11 +1291,11 @@ void i915_handle_error(struct drm_i915_private *i915,
 
 	for_each_engine(engine, i915, tmp) {
 		clear_bit(I915_RESET_ENGINE + engine->id,
-			  &i915->gpu_error.flags);
+			  &error->flags);
 	}
 
-	clear_bit(I915_RESET_BACKOFF, &i915->gpu_error.flags);
-	wake_up_all(&i915->gpu_error.reset_queue);
+	clear_bit(I915_RESET_BACKOFF, &error->flags);
+	wake_up_all(&error->reset_queue);
 
 out:
 	intel_runtime_pm_put(i915, wakeref);
-- 
2.20.1


* [PATCH 12/12] drm/i915: Don't claim an unstarted request was guilty
  2019-02-04  8:41 [PATCH 01/12] drm/i915: Allow normal clients to always preempt idle priority clients Chris Wilson
                   ` (9 preceding siblings ...)
  2019-02-04  8:41 ` [PATCH 11/12] drm/i915: Serialise resets with wedging Chris Wilson
@ 2019-02-04  8:41 ` Chris Wilson
  2019-02-04  9:20 ` [PATCH 01/12] drm/i915: Allow normal clients to always preempt idle priority clients Tvrtko Ursulin
                   ` (4 subsequent siblings)
  15 siblings, 0 replies; 36+ messages in thread
From: Chris Wilson @ 2019-02-04  8:41 UTC (permalink / raw)
  To: intel-gfx

If we haven't even begun executing the payload of the stalled request,
then we should not claim that its userspace context was guilty of
submitting a hanging batch.
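
For reference, "started" means the CS has executed the request's
initial breadcrumb. A sketch of the check, written from the shape of
the driver at this point (not necessarily the exact helper):

        static inline bool __request_started(const struct i915_request *rq)
        {
                /*
                 * emit_init_breadcrumb() writes rq->fence.seqno - 1 into
                 * the per-timeline HWSP ahead of the payload, so once the
                 * HWSP passes that value the payload has begun.
                 */
                return i915_seqno_passed(hwsp_seqno(rq), rq->fence.seqno - 1);
        }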

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/intel_lrc.c                 | 2 +-
 drivers/gpu/drm/i915/selftests/intel_hangcheck.c | 6 ++++++
 2 files changed, 7 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 0869a4fd20c7..8e301f19036b 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -1947,7 +1947,7 @@ static void execlists_reset(struct intel_engine_cs *engine, bool stalled)
 		  rq ? rq->global_seqno : 0,
 		  intel_engine_get_seqno(engine),
 		  yesno(stalled));
-	if (!rq)
+	if (!rq || !i915_request_started(rq))
 		goto out_unlock;
 
 	/*
diff --git a/drivers/gpu/drm/i915/selftests/intel_hangcheck.c b/drivers/gpu/drm/i915/selftests/intel_hangcheck.c
index 4886fac12628..36c17bfe05a7 100644
--- a/drivers/gpu/drm/i915/selftests/intel_hangcheck.c
+++ b/drivers/gpu/drm/i915/selftests/intel_hangcheck.c
@@ -246,6 +246,12 @@ hang_create_request(struct hang *h, struct intel_engine_cs *engine)
 	if (INTEL_GEN(vm->i915) <= 5)
 		flags |= I915_DISPATCH_SECURE;
 
+	if (rq->engine->emit_init_breadcrumb) {
+		err = rq->engine->emit_init_breadcrumb(rq);
+		if (err)
+			goto cancel_rq;
+	}
+
 	err = rq->engine->emit_bb_start(rq, vma->node.start, PAGE_SIZE, flags);
 
 cancel_rq:
-- 
2.20.1


* Re: [PATCH 01/12] drm/i915: Allow normal clients to always preempt idle priority clients
  2019-02-04  8:41 [PATCH 01/12] drm/i915: Allow normal clients to always preempt idle priority clients Chris Wilson
                   ` (10 preceding siblings ...)
  2019-02-04  8:41 ` [PATCH 12/12] drm/i915: Don't claim an unstarted request was guilty Chris Wilson
@ 2019-02-04  9:20 ` Tvrtko Ursulin
  2019-02-04 10:19 ` ✗ Fi.CI.CHECKPATCH: warning for series starting with [01/12] " Patchwork
                   ` (3 subsequent siblings)
  15 siblings, 0 replies; 36+ messages in thread
From: Tvrtko Ursulin @ 2019-02-04  9:20 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx


On 04/02/2019 08:41, Chris Wilson wrote:
> When first enabling preemption, we hesitated from making it a free-for-all
> where every higher priority client would force a preempt-to-idle cycle
> and take over from all lower priority clients. We hesitated because we
> were uncertain just how well preemption would work in practice, whether
> the preemption latency itself would detract from the latency gains for
> higher priority tasks and whether it would work at all. Since
> introducing preemption, we have been enabling it for more common tasks,
> even giving normal clients a small preemptive boost when they first
> start (to aide fairness and improve interactivity). Now lets take one
> step further and give permission for all normal (priority:0) clients to
> preempt any idle (priority:<0) task so that users running long compute
> jobs do not overly impact other jobs (i.e. their desktop) and the system
> remains responsive under such idle loads.
> 
> References: f6322eddaff7 ("drm/i915/preemption: Allow preemption between submission ports")
> References: b16c765122f9 ("drm/i915: Priority boost for new clients")
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
> Cc: Michał Winiarski <michal.winiarski@intel.com>
> Cc: "Bloomfield, Jon" <jon.bloomfield@intel.com>
> Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
> ---
>   drivers/gpu/drm/i915/intel_ringbuffer.h | 15 ++++++++++++++-
>   1 file changed, 14 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
> index 34d0a148e664..983ad1e7914d 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.h
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
> @@ -592,7 +592,20 @@ intel_engine_has_preemption(const struct intel_engine_cs *engine)
>   
>   static inline bool __execlists_need_preempt(int prio, int last)
>   {
> -	return prio > max(0, last);
> +	/*
> +	 * Allow preemption of low -> normal -> high, but we do
> +	 * not allow low priority tasks to preempt other low priority
> +	 * tasks under the impression that latency for low priority
> +	 * tasks does not matter (as much as background throughput),
> +	 * so kiss.
> +	 *
> +	 * More naturally we would write
> +	 * 	prio >= max(0, last);
> +	 * except that we wish to prevent triggering preemption at the same
> +	 * priority level: the task that is running should remain running
> +	 * to preserve FIFO ordering of dependencies.
> +	 */
> +	return prio > max(I915_PRIORITY_NORMAL - 1, last);
>   }
>   
>   static inline void
> 

Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Regards,

Tvrtko

* Re: [PATCH 02/12] drm/i915/execlists: Suppress mere WAIT preemption
  2019-02-04  8:41 ` [PATCH 02/12] drm/i915/execlists: Suppress mere WAIT preemption Chris Wilson
@ 2019-02-04 10:06   ` Tvrtko Ursulin
  2019-02-04 10:18     ` Chris Wilson
  2019-02-04 10:49   ` [PATCH] " Chris Wilson
  1 sibling, 1 reply; 36+ messages in thread
From: Tvrtko Ursulin @ 2019-02-04 10:06 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx


On 04/02/2019 08:41, Chris Wilson wrote:
> WAIT is occasionally suppressed by virtue of preempted requests being
> promoted to NEWCLIENT iff they have not already received that boost.

s/iff/if/

> Make this consistent for all WAIT boosts that they are not allowed to
> preempt executing contexts and are merely granted the right to be at the
> front of the queue for the next execution slot. This is in keeping with
> the desire that the WAIT boost be a minor tweak that does not give
> excessive promotion to its user and open ourselves to trivial abuse.
> 
> The problem with the inconsistent WAIT preemption becomes more apparent
> as the preemption is propagated across the engines, where one engine may
> preempt and the other not, and we may be relying on the exact execution
> order being consistent across engines (e.g. using HW semaphores to
> coordinate parallel execution).
> 
> v2: Also protect GuC submission from false preemption loops.
> 
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> ---
>   drivers/gpu/drm/i915/i915_request.c          |  11 ++
>   drivers/gpu/drm/i915/i915_scheduler.h        |   2 +
>   drivers/gpu/drm/i915/intel_guc_submission.c  |   2 +-
>   drivers/gpu/drm/i915/intel_lrc.c             |   9 +-
>   drivers/gpu/drm/i915/selftests/igt_spinner.c |   9 +-
>   drivers/gpu/drm/i915/selftests/intel_lrc.c   | 156 +++++++++++++++++++
>   6 files changed, 186 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
> index 9ed5baf157a3..7968875d0bed 100644
> --- a/drivers/gpu/drm/i915/i915_request.c
> +++ b/drivers/gpu/drm/i915/i915_request.c
> @@ -377,12 +377,23 @@ void __i915_request_submit(struct i915_request *request)
>   
>   	/* We may be recursing from the signal callback of another i915 fence */
>   	spin_lock_nested(&request->lock, SINGLE_DEPTH_NESTING);
> +
>   	GEM_BUG_ON(test_bit(I915_FENCE_FLAG_ACTIVE, &request->fence.flags));
>   	set_bit(I915_FENCE_FLAG_ACTIVE, &request->fence.flags);
> +
>   	request->global_seqno = seqno;
>   	if (test_bit(DMA_FENCE_FLAG_ENABLE_SIGNAL_BIT, &request->fence.flags) &&
>   	    !i915_request_enable_breadcrumb(request))
>   		intel_engine_queue_breadcrumbs(engine);
> +
> +	/*
> +	 * As we do not allow WAIT to preempt inflight requests,
> +	 * once we have executed a request, along with triggering
> +	 * any execution callbacks, we must preserve its ordering
> +	 * within the non-preemptible FIFO.
> +	 */
> +	request->sched.attr.priority |= __NO_PREEMPTION;
> +
>   	spin_unlock(&request->lock);
>   
>   	engine->emit_fini_breadcrumb(request,
> diff --git a/drivers/gpu/drm/i915/i915_scheduler.h b/drivers/gpu/drm/i915/i915_scheduler.h
> index dbe9cb7ecd82..54bd6c89817e 100644
> --- a/drivers/gpu/drm/i915/i915_scheduler.h
> +++ b/drivers/gpu/drm/i915/i915_scheduler.h
> @@ -33,6 +33,8 @@ enum {
>   #define I915_PRIORITY_WAIT	((u8)BIT(0))
>   #define I915_PRIORITY_NEWCLIENT	((u8)BIT(1))
>   
> +#define __NO_PREEMPTION (I915_PRIORITY_WAIT)
> +
>   struct i915_sched_attr {
>   	/**
>   	 * @priority: execution and service priority
> diff --git a/drivers/gpu/drm/i915/intel_guc_submission.c b/drivers/gpu/drm/i915/intel_guc_submission.c
> index 8bc8aa54aa35..94695eb819c2 100644
> --- a/drivers/gpu/drm/i915/intel_guc_submission.c
> +++ b/drivers/gpu/drm/i915/intel_guc_submission.c
> @@ -720,7 +720,7 @@ static inline int rq_prio(const struct i915_request *rq)
>   
>   static inline int port_prio(const struct execlist_port *port)
>   {
> -	return rq_prio(port_request(port));
> +	return rq_prio(port_request(port)) | __NO_PREEMPTION;
>   }
>   
>   static bool __guc_dequeue(struct intel_engine_cs *engine)
> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
> index a9eb0211ce77..773df0bd685b 100644
> --- a/drivers/gpu/drm/i915/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/intel_lrc.c
> @@ -188,6 +188,12 @@ static inline int rq_prio(const struct i915_request *rq)
>   	return rq->sched.attr.priority;
>   }
>   
> +static int effective_prio(const struct i915_request *rq)
> +{
> +	/* Restrict mere WAIT boosts from triggering preemption */
> +	return rq_prio(rq) | __NO_PREEMPTION;
> +}

I suggest adding i915_request_effective_prio to i915_request.h - it is 
verbose but avoids two implementations.

BUILD_BUG_ON(hweight32(__NO_PREEMPTION) != 1); as well? To ensure it is 
defined within the internal priority levels.

> +
>   static int queue_prio(const struct intel_engine_execlists *execlists)
>   {
>   	struct i915_priolist *p;
> @@ -208,7 +214,7 @@ static int queue_prio(const struct intel_engine_execlists *execlists)
>   static inline bool need_preempt(const struct intel_engine_cs *engine,
>   				const struct i915_request *rq)
>   {
> -	const int last_prio = rq_prio(rq);
> +	int last_prio;
>   
>   	if (!intel_engine_has_preemption(engine))
>   		return false;
> @@ -228,6 +234,7 @@ static inline bool need_preempt(const struct intel_engine_cs *engine,
>   	 * preempt. If that hint is stale or we may be trying to preempt
>   	 * ourselves, ignore the request.
>   	 */
> +	last_prio = effective_prio(rq);

Isn't this redundant? Every submitted request already had 
__NO_PREEMPTION applied in __i915_request_submit.

>   	if (!__execlists_need_preempt(engine->execlists.queue_priority_hint,
>   				      last_prio))
>   		return false;
> diff --git a/drivers/gpu/drm/i915/selftests/igt_spinner.c b/drivers/gpu/drm/i915/selftests/igt_spinner.c
> index 9ebd9225684e..86354e51bdd3 100644
> --- a/drivers/gpu/drm/i915/selftests/igt_spinner.c
> +++ b/drivers/gpu/drm/i915/selftests/igt_spinner.c
> @@ -142,10 +142,17 @@ igt_spinner_create_request(struct igt_spinner *spin,
>   	*batch++ = upper_32_bits(vma->node.start);
>   	*batch++ = MI_BATCH_BUFFER_END; /* not reached */
>   
> -	i915_gem_chipset_flush(spin->i915);
> +	if (engine->emit_init_breadcrumb &&
> +	    rq->timeline->has_initial_breadcrumb) {
> +		err = engine->emit_init_breadcrumb(rq);
> +		if (err)
> +			goto cancel_rq;
> +	}
>   
>   	err = engine->emit_bb_start(rq, vma->node.start, PAGE_SIZE, 0);
>   
> +	i915_gem_chipset_flush(spin->i915);
> +
>   cancel_rq:
>   	if (err) {
>   		i915_request_skip(rq, err);
> diff --git a/drivers/gpu/drm/i915/selftests/intel_lrc.c b/drivers/gpu/drm/i915/selftests/intel_lrc.c
> index fb35f53c9ce3..86fd4589f5f0 100644
> --- a/drivers/gpu/drm/i915/selftests/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/selftests/intel_lrc.c
> @@ -405,6 +405,161 @@ static int live_suppress_self_preempt(void *arg)
>   	goto err_client_b;
>   }
>   
> +static int __i915_sw_fence_call
> +dummy_notify(struct i915_sw_fence *fence, enum i915_sw_fence_notify state)
> +{
> +	return NOTIFY_DONE;
> +}
> +
> +static struct i915_request *dummy_request(struct intel_engine_cs *engine)
> +{
> +	struct i915_request *rq;
> +
> +	rq = kmalloc(sizeof(*rq), GFP_KERNEL | __GFP_ZERO);
> +	if (!rq)
> +		return NULL;
> +
> +	INIT_LIST_HEAD(&rq->active_list);
> +	rq->engine = engine;
> +
> +	i915_sched_node_init(&rq->sched);
> +
> +	/* mark this request as permanently incomplete */
> +	rq->fence.seqno = 1;
> +	rq->hwsp_seqno = (u32 *)&rq->fence.seqno + 1;

Hackery level 10 unlocked! :)

Add to comment "..by pointing the hwsp_seqno to high (unused) 32-bits of 
seqno".

Also sounds like a good idea to add 
BUILD_BUG_ON(typeof(req->fence.seqno) == typeof(u64))?

> +
> +	i915_sw_fence_init(&rq->submit, dummy_notify);
> +	i915_sw_fence_commit(&rq->submit);
> +
> +	return rq;
> +}
> +
> +static void dummy_request_free(struct i915_request *dummy)
> +{
> +	i915_request_mark_complete(dummy);
> +	i915_sched_node_fini(dummy->engine->i915, &dummy->sched);
> +	kfree(dummy);
> +}
> +
> +static int live_suppress_wait_preempt(void *arg)
> +{
> +	struct drm_i915_private *i915 = arg;
> +	struct preempt_client client[4];
> +	struct intel_engine_cs *engine;
> +	enum intel_engine_id id;
> +	intel_wakeref_t wakeref;
> +	int err = -ENOMEM;
> +	int i;
> +
> +	/*
> +	 * Waiters are given a little priority nudge, but not enough
> +	 * to actually cause any preemption. Double check that we do
> +	 * not needlessly generate preempt-to-idle cycles.
> +	 */
> +
> +	if (!HAS_LOGICAL_RING_PREEMPTION(i915))
> +		return 0;
> +
> +	mutex_lock(&i915->drm.struct_mutex);
> +	wakeref = intel_runtime_pm_get(i915);
> +
> +	if (preempt_client_init(i915, &client[0])) /* ELSP[0] */
> +		goto err_unlock;
> +	if (preempt_client_init(i915, &client[1])) /* ELSP[1] */
> +		goto err_client_0;
> +	if (preempt_client_init(i915, &client[2])) /* head of queue */
> +		goto err_client_1;
> +	if (preempt_client_init(i915, &client[3])) /* bystander */
> +		goto err_client_2;
> +
> +	for_each_engine(engine, i915, id) {
> +		int depth;
> +
> +		if (!engine->emit_init_breadcrumb)
> +			continue;
> +
> +		for (depth = 0; depth < ARRAY_SIZE(client); depth++) {
> +			struct i915_request *rq[ARRAY_SIZE(client)];
> +			struct i915_request *dummy;
> +
> +			engine->execlists.preempt_hang.count = 0;
> +
> +			dummy = dummy_request(engine);
> +			if (!dummy)
> +				goto err_client_3;
> +
> +			for (i = 0; i < ARRAY_SIZE(client); i++) {
> +				rq[i] = igt_spinner_create_request(&client[i].spin,
> +								   client[i].ctx, engine,
> +								   MI_NOOP);
> +				if (IS_ERR(rq[i])) {
> +					err = PTR_ERR(rq[i]);
> +					goto err_wedged;
> +				}
> +
> +				/* Disable NEWCLIENT promotion */
> +				i915_gem_active_set(&rq[i]->timeline->last_request,
> +						    dummy);
> +				i915_request_add(rq[i]);
> +			}
> +
> +			dummy_request_free(dummy);
> +
> +			GEM_BUG_ON(i915_request_completed(rq[0]));
> +			if (!igt_wait_for_spinner(&client[0].spin, rq[0])) {
> +				pr_err("First client failed to start\n");
> +				goto err_wedged;
> +			}
> +			GEM_BUG_ON(!i915_request_started(rq[0]));
> +
> +			if (i915_request_wait(rq[depth],
> +					      I915_WAIT_LOCKED |
> +					      I915_WAIT_PRIORITY,
> +					      1) != -ETIME) {
> +				pr_err("Waiter depth:%d completed!\n", depth);
> +				goto err_wedged;
> +			}
> +
> +			for (i = 0; i < ARRAY_SIZE(client); i++)
> +				igt_spinner_end(&client[i].spin);
> +
> +			if (igt_flush_test(i915, I915_WAIT_LOCKED))
> +				goto err_wedged;
> +
> +			if (engine->execlists.preempt_hang.count) {
> +				pr_err("Preemption recorded x%d, depth %d; should have been suppressed!\n",
> +				       engine->execlists.preempt_hang.count,
> +				       depth);

Worth logging engine names with all the above error messages?

> +				err = -EINVAL;
> +				goto err_client_3;
> +			}
> +		}
> +	}
> +
> +	err = 0;
> +err_client_3:
> +	preempt_client_fini(&client[3]);
> +err_client_2:
> +	preempt_client_fini(&client[2]);
> +err_client_1:
> +	preempt_client_fini(&client[1]);
> +err_client_0:
> +	preempt_client_fini(&client[0]);
> +err_unlock:
> +	if (igt_flush_test(i915, I915_WAIT_LOCKED))
> +		err = -EIO;
> +	intel_runtime_pm_put(i915, wakeref);
> +	mutex_unlock(&i915->drm.struct_mutex);
> +	return err;
> +
> +err_wedged:
> +	for (i = 0; i < ARRAY_SIZE(client); i++)
> +		igt_spinner_end(&client[i].spin);
> +	i915_gem_set_wedged(i915);
> +	err = -EIO;
> +	goto err_client_3;
> +}
> +
>   static int live_preempt_hang(void *arg)
>   {
>   	struct drm_i915_private *i915 = arg;
> @@ -785,6 +940,7 @@ int intel_execlists_live_selftests(struct drm_i915_private *i915)
>   		SUBTEST(live_preempt),
>   		SUBTEST(live_late_preempt),
>   		SUBTEST(live_suppress_self_preempt),
> +		SUBTEST(live_suppress_wait_preempt),
>   		SUBTEST(live_preempt_hang),
>   		SUBTEST(live_preempt_smoke),
>   	};
> 

Regards,

Tvrtko

* Re: [PATCH 02/12] drm/i915/execlists: Suppress mere WAIT preemption
  2019-02-04 10:06   ` Tvrtko Ursulin
@ 2019-02-04 10:18     ` Chris Wilson
  2019-02-04 12:08       ` Tvrtko Ursulin
  0 siblings, 1 reply; 36+ messages in thread
From: Chris Wilson @ 2019-02-04 10:18 UTC (permalink / raw)
  To: Tvrtko Ursulin, intel-gfx

Quoting Tvrtko Ursulin (2019-02-04 10:06:35)
> 
> On 04/02/2019 08:41, Chris Wilson wrote:
> > WAIT is occasionally suppressed by virtue of preempted requests being
> > promoted to NEWCLIENT iff they have not already received that boost.
> 
> s/iff/if/

iff == if, and only if

But it was probably a typo.
 
> > Make this consistent for all WAIT boosts that they are not allowed to
> > preempt executing contexts and are merely granted the right to be at the
> > front of the queue for the next execution slot. This is in keeping with
> > the desire that the WAIT boost be a minor tweak that does not give
> > excessive promotion to its user and open ourselves to trivial abuse.
> > 
> > The problem with the inconsistent WAIT preemption becomes more apparent
> > as the preemption is propagated across the engines, where one engine may
> > preempt and the other not, and we may be relying on the exact execution
> > order being consistent across engines (e.g. using HW semaphores to
> > coordinate parallel execution).
> > 
> > v2: Also protect GuC submission from false preemption loops.
> > 
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> > ---
> >   drivers/gpu/drm/i915/i915_request.c          |  11 ++
> >   drivers/gpu/drm/i915/i915_scheduler.h        |   2 +
> >   drivers/gpu/drm/i915/intel_guc_submission.c  |   2 +-
> >   drivers/gpu/drm/i915/intel_lrc.c             |   9 +-
> >   drivers/gpu/drm/i915/selftests/igt_spinner.c |   9 +-
> >   drivers/gpu/drm/i915/selftests/intel_lrc.c   | 156 +++++++++++++++++++
> >   6 files changed, 186 insertions(+), 3 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
> > index 9ed5baf157a3..7968875d0bed 100644
> > --- a/drivers/gpu/drm/i915/i915_request.c
> > +++ b/drivers/gpu/drm/i915/i915_request.c
> > @@ -377,12 +377,23 @@ void __i915_request_submit(struct i915_request *request)
> >   
> >       /* We may be recursing from the signal callback of another i915 fence */
> >       spin_lock_nested(&request->lock, SINGLE_DEPTH_NESTING);
> > +
> >       GEM_BUG_ON(test_bit(I915_FENCE_FLAG_ACTIVE, &request->fence.flags));
> >       set_bit(I915_FENCE_FLAG_ACTIVE, &request->fence.flags);
> > +
> >       request->global_seqno = seqno;
> >       if (test_bit(DMA_FENCE_FLAG_ENABLE_SIGNAL_BIT, &request->fence.flags) &&
> >           !i915_request_enable_breadcrumb(request))
> >               intel_engine_queue_breadcrumbs(engine);
> > +
> > +     /*
> > +      * As we do not allow WAIT to preempt inflight requests,
> > +      * once we have executed a request, along with triggering
> > +      * any execution callbacks, we must preserve its ordering
> > +      * within the non-preemptible FIFO.
> > +      */
> > +     request->sched.attr.priority |= __NO_PREEMPTION;
> > +
> >       spin_unlock(&request->lock);
> >   
> >       engine->emit_fini_breadcrumb(request,
> > diff --git a/drivers/gpu/drm/i915/i915_scheduler.h b/drivers/gpu/drm/i915/i915_scheduler.h
> > index dbe9cb7ecd82..54bd6c89817e 100644
> > --- a/drivers/gpu/drm/i915/i915_scheduler.h
> > +++ b/drivers/gpu/drm/i915/i915_scheduler.h
> > @@ -33,6 +33,8 @@ enum {
> >   #define I915_PRIORITY_WAIT  ((u8)BIT(0))
> >   #define I915_PRIORITY_NEWCLIENT     ((u8)BIT(1))
> >   
> > +#define __NO_PREEMPTION (I915_PRIORITY_WAIT)
> > +
> >   struct i915_sched_attr {
> >       /**
> >        * @priority: execution and service priority
> > diff --git a/drivers/gpu/drm/i915/intel_guc_submission.c b/drivers/gpu/drm/i915/intel_guc_submission.c
> > index 8bc8aa54aa35..94695eb819c2 100644
> > --- a/drivers/gpu/drm/i915/intel_guc_submission.c
> > +++ b/drivers/gpu/drm/i915/intel_guc_submission.c
> > @@ -720,7 +720,7 @@ static inline int rq_prio(const struct i915_request *rq)
> >   
> >   static inline int port_prio(const struct execlist_port *port)
> >   {
> > -     return rq_prio(port_request(port));
> > +     return rq_prio(port_request(port)) | __NO_PREEMPTION;
> >   }
> >   
> >   static bool __guc_dequeue(struct intel_engine_cs *engine)
> > diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
> > index a9eb0211ce77..773df0bd685b 100644
> > --- a/drivers/gpu/drm/i915/intel_lrc.c
> > +++ b/drivers/gpu/drm/i915/intel_lrc.c
> > @@ -188,6 +188,12 @@ static inline int rq_prio(const struct i915_request *rq)
> >       return rq->sched.attr.priority;
> >   }
> >   
> > +static int effective_prio(const struct i915_request *rq)
> > +{
> > +     /* Restrict mere WAIT boosts from triggering preemption */
> > +     return rq_prio(rq) | __NO_PREEMPTION;
> > +}
> 
> I suggest adding i915_request_effective_prio to i915_request.h - it is 
> verbose but avoids two implementation.

Too verbose... And it may differ depending on backend details...

We don't even need to OR in no-preemption until later...
 
> BUILD_BUG_ON(hweight32(__NO_PREEMPTION) != 1); as well? To ensure it is 
> defined to internal priority levels.

You mean __NO_PREEMPTION & ~I915_PRIORITY_MASK ?

> >   static int queue_prio(const struct intel_engine_execlists *execlists)
> >   {
> >       struct i915_priolist *p;
> > @@ -208,7 +214,7 @@ static int queue_prio(const struct intel_engine_execlists *execlists)
> >   static inline bool need_preempt(const struct intel_engine_cs *engine,
> >                               const struct i915_request *rq)
> >   {
> > -     const int last_prio = rq_prio(rq);
> > +     int last_prio;
> >   
> >       if (!intel_engine_has_preemption(engine))
> >               return false;
> > @@ -228,6 +234,7 @@ static inline bool need_preempt(const struct intel_engine_cs *engine,
> >        * preempt. If that hint is stale or we may be trying to preempt
> >        * ourselves, ignore the request.
> >        */
> > +     last_prio = effective_prio(rq);
> 
> Isn't this redundant? Every submitted request already had 
> __NO_PREEMPTION applied in __i915_request_submit.

But not in the next patch which expands upon this.

> >       if (!__execlists_need_preempt(engine->execlists.queue_priority_hint,
> >                                     last_prio))
> >               return false;
> > diff --git a/drivers/gpu/drm/i915/selftests/igt_spinner.c b/drivers/gpu/drm/i915/selftests/igt_spinner.c
> > index 9ebd9225684e..86354e51bdd3 100644
> > --- a/drivers/gpu/drm/i915/selftests/igt_spinner.c
> > +++ b/drivers/gpu/drm/i915/selftests/igt_spinner.c
> > @@ -142,10 +142,17 @@ igt_spinner_create_request(struct igt_spinner *spin,
> >       *batch++ = upper_32_bits(vma->node.start);
> >       *batch++ = MI_BATCH_BUFFER_END; /* not reached */
> >   
> > -     i915_gem_chipset_flush(spin->i915);
> > +     if (engine->emit_init_breadcrumb &&
> > +         rq->timeline->has_initial_breadcrumb) {
> > +             err = engine->emit_init_breadcrumb(rq);
> > +             if (err)
> > +                     goto cancel_rq;
> > +     }
> >   
> >       err = engine->emit_bb_start(rq, vma->node.start, PAGE_SIZE, 0);
> >   
> > +     i915_gem_chipset_flush(spin->i915);
> > +
> >   cancel_rq:
> >       if (err) {
> >               i915_request_skip(rq, err);
> > diff --git a/drivers/gpu/drm/i915/selftests/intel_lrc.c b/drivers/gpu/drm/i915/selftests/intel_lrc.c
> > index fb35f53c9ce3..86fd4589f5f0 100644
> > --- a/drivers/gpu/drm/i915/selftests/intel_lrc.c
> > +++ b/drivers/gpu/drm/i915/selftests/intel_lrc.c
> > @@ -405,6 +405,161 @@ static int live_suppress_self_preempt(void *arg)
> >       goto err_client_b;
> >   }
> >   
> > +static int __i915_sw_fence_call
> > +dummy_notify(struct i915_sw_fence *fence, enum i915_sw_fence_notify state)
> > +{
> > +     return NOTIFY_DONE;
> > +}
> > +
> > +static struct i915_request *dummy_request(struct intel_engine_cs *engine)
> > +{
> > +     struct i915_request *rq;
> > +
> > +     rq = kmalloc(sizeof(*rq), GFP_KERNEL | __GFP_ZERO);
> > +     if (!rq)
> > +             return NULL;
> > +
> > +     INIT_LIST_HEAD(&rq->active_list);
> > +     rq->engine = engine;
> > +
> > +     i915_sched_node_init(&rq->sched);
> > +
> > +     /* mark this request as permanently incomplete */
> > +     rq->fence.seqno = 1;
> > +     rq->hwsp_seqno = (u32 *)&rq->fence.seqno + 1;
> 
> Hackery level 10 unlocked! :)
> 
> Add to comment "..by pointing the hwsp_seqno to high (unused) 32-bits of 
> seqno".

I put the why... How that is actually accomplished is mere
implementation, and you can read the code. :-p
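
For the record, on a little-endian build (which is all CI exercises)
the aliasing works out as:

        u64 seqno = 1;                  /* bytes: 01 00 00 00 | 00 00 00 00 */
        u32 *hwsp = (u32 *)&seqno + 1;  /* the high word, permanently 0 */

so the fake HWSP reads 0 and never catches up with fence.seqno == 1;
the dummy only reports complete once dummy_request_free() marks it so.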

> Also sounds like a good idea to add 
> BUILD_BUG_ON(typeof(req->fence.seqno) == typeof(u64))?
> 
> > +
> > +     i915_sw_fence_init(&rq->submit, dummy_notify);
> > +     i915_sw_fence_commit(&rq->submit);
> > +
> > +     return rq;
> > +}
> > +
> > +static void dummy_request_free(struct i915_request *dummy)
> > +{
> > +     i915_request_mark_complete(dummy);
> > +     i915_sched_node_fini(dummy->engine->i915, &dummy->sched);
> > +     kfree(dummy);
> > +}
> > +
> > +static int live_suppress_wait_preempt(void *arg)
> > +{
> > +     struct drm_i915_private *i915 = arg;
> > +     struct preempt_client client[4];
> > +     struct intel_engine_cs *engine;
> > +     enum intel_engine_id id;
> > +     intel_wakeref_t wakeref;
> > +     int err = -ENOMEM;
> > +     int i;
> > +
> > +     /*
> > +      * Waiters are given a little priority nudge, but not enough
> > +      * to actually cause any preemption. Double check that we do
> > +      * not needlessly generate preempt-to-idle cycles.
> > +      */
> > +
> > +     if (!HAS_LOGICAL_RING_PREEMPTION(i915))
> > +             return 0;
> > +
> > +     mutex_lock(&i915->drm.struct_mutex);
> > +     wakeref = intel_runtime_pm_get(i915);
> > +
> > +     if (preempt_client_init(i915, &client[0])) /* ELSP[0] */
> > +             goto err_unlock;
> > +     if (preempt_client_init(i915, &client[1])) /* ELSP[1] */
> > +             goto err_client_0;
> > +     if (preempt_client_init(i915, &client[2])) /* head of queue */
> > +             goto err_client_1;
> > +     if (preempt_client_init(i915, &client[3])) /* bystander */
> > +             goto err_client_2;
> > +
> > +     for_each_engine(engine, i915, id) {
> > +             int depth;
> > +
> > +             if (!engine->emit_init_breadcrumb)
> > +                     continue;
> > +
> > +             for (depth = 0; depth < ARRAY_SIZE(client); depth++) {
> > +                     struct i915_request *rq[ARRAY_SIZE(client)];
> > +                     struct i915_request *dummy;
> > +
> > +                     engine->execlists.preempt_hang.count = 0;
> > +
> > +                     dummy = dummy_request(engine);
> > +                     if (!dummy)
> > +                             goto err_client_3;
> > +
> > +                     for (i = 0; i < ARRAY_SIZE(client); i++) {
> > +                             rq[i] = igt_spinner_create_request(&client[i].spin,
> > +                                                                client[i].ctx, engine,
> > +                                                                MI_NOOP);
> > +                             if (IS_ERR(rq[i])) {
> > +                                     err = PTR_ERR(rq[i]);
> > +                                     goto err_wedged;
> > +                             }
> > +
> > +                             /* Disable NEWCLIENT promotion */
> > +                             i915_gem_active_set(&rq[i]->timeline->last_request,
> > +                                                 dummy);
> > +                             i915_request_add(rq[i]);
> > +                     }
> > +
> > +                     dummy_request_free(dummy);
> > +
> > +                     GEM_BUG_ON(i915_request_completed(rq[0]));
> > +                     if (!igt_wait_for_spinner(&client[0].spin, rq[0])) {
> > +                             pr_err("First client failed to start\n");
> > +                             goto err_wedged;
> > +                     }
> > +                     GEM_BUG_ON(!i915_request_started(rq[0]));
> > +
> > +                     if (i915_request_wait(rq[depth],
> > +                                           I915_WAIT_LOCKED |
> > +                                           I915_WAIT_PRIORITY,
> > +                                           1) != -ETIME) {
> > +                             pr_err("Waiter depth:%d completed!\n", depth);
> > +                             goto err_wedged;
> > +                     }
> > +
> > +                     for (i = 0; i < ARRAY_SIZE(client); i++)
> > +                             igt_spinner_end(&client[i].spin);
> > +
> > +                     if (igt_flush_test(i915, I915_WAIT_LOCKED))
> > +                             goto err_wedged;
> > +
> > +                     if (engine->execlists.preempt_hang.count) {
> > +                             pr_err("Preemption recorded x%d, depth %d; should have been suppressed!\n",
> > +                                    engine->execlists.preempt_hang.count,
> > +                                    depth);
> 
> Worth logging engine names with all the above error messages?

I honestly doubt we need per-engine since this is just about observing SW.
But as we are exercising across engines, it may as well include that
information.
-Chris

* ✗ Fi.CI.CHECKPATCH: warning for series starting with [01/12] drm/i915: Allow normal clients to always preempt idle priority clients
  2019-02-04  8:41 [PATCH 01/12] drm/i915: Allow normal clients to always preempt idle priority clients Chris Wilson
                   ` (11 preceding siblings ...)
  2019-02-04  9:20 ` [PATCH 01/12] drm/i915: Allow normal clients to always preempt idle priority clients Tvrtko Ursulin
@ 2019-02-04 10:19 ` Patchwork
  2019-02-04 10:23 ` ✗ Fi.CI.SPARSE: " Patchwork
                   ` (2 subsequent siblings)
  15 siblings, 0 replies; 36+ messages in thread
From: Patchwork @ 2019-02-04 10:19 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

== Series Details ==

Series: series starting with [01/12] drm/i915: Allow normal clients to always preempt idle priority clients
URL   : https://patchwork.freedesktop.org/series/56166/
State : warning

== Summary ==

$ dim checkpatch origin/drm-tip
08acdf160cfa drm/i915: Allow normal clients to always preempt idle priority clients
-:24: WARNING:COMMIT_LOG_LONG_LINE: Possible unwrapped commit description (prefer a maximum 75 chars per line)
#24: 
References: f6322eddaff7 ("drm/i915/preemption: Allow preemption between submission ports")

-:24: ERROR:GIT_COMMIT_ID: Please use git commit description style 'commit <12+ chars of sha1> ("<title line>")' - ie: 'commit f6322eddaff7 ("drm/i915/preemption: Allow preemption between submission ports")'
#24: 
References: f6322eddaff7 ("drm/i915/preemption: Allow preemption between submission ports")

-:25: ERROR:GIT_COMMIT_ID: Please use git commit description style 'commit <12+ chars of sha1> ("<title line>")' - ie: 'commit b16c765122f9 ("drm/i915: Priority boost for new clients")'
#25: 
References: b16c765122f9 ("drm/i915: Priority boost for new clients")

-:51: WARNING:SPACE_BEFORE_TAB: please, no space before tabs
#51: FILE: drivers/gpu/drm/i915/intel_ringbuffer.h:603:
+^I * ^Iprio >= max(0, last);$

total: 2 errors, 2 warnings, 0 checks, 21 lines checked
ce2fa789e441 drm/i915/execlists: Suppress mere WAIT preemption
cd4def238fe6 drm/i915/execlists: Suppress redundant preemption
54ed4585344b drm/i915/selftests: Exercise some AB...BA preemption chains
f84993b34dac drm/i915: Trim NEWCLIENT boosting
98ec3a89ed44 drm/i915: Show support for accurate sw PMU busyness tracking
46a673b165ae drm/i915: Revoke mmaps and prevent access to fence registers across reset
c2a4f9415d54 drm/i915: Force the GPU reset upon wedging
fa2991ab10f0 drm/i915: Uninterruptibly drain the timelines on unwedging
b97913409822 drm/i915: Wait for old resets before applying debugfs/i915_wedged
e3ebb5cf4797 drm/i915: Serialise resets with wedging
8a869de6fad0 drm/i915: Don't claim an unstarted request was guilty


* ✗ Fi.CI.SPARSE: warning for series starting with [01/12] drm/i915: Allow normal clients to always preempt idle priority clients
  2019-02-04  8:41 [PATCH 01/12] drm/i915: Allow normal clients to always preempt idle priority clients Chris Wilson
                   ` (12 preceding siblings ...)
  2019-02-04 10:19 ` ✗ Fi.CI.CHECKPATCH: warning for series starting with [01/12] " Patchwork
@ 2019-02-04 10:23 ` Patchwork
  2019-02-04 10:48 ` ✓ Fi.CI.BAT: success " Patchwork
  2019-02-04 11:27 ` ✗ Fi.CI.BAT: failure for series starting with [01/12] drm/i915: Allow normal clients to always preempt idle priority clients (rev2) Patchwork
  15 siblings, 0 replies; 36+ messages in thread
From: Patchwork @ 2019-02-04 10:23 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

== Series Details ==

Series: series starting with [01/12] drm/i915: Allow normal clients to always preempt idle priority clients
URL   : https://patchwork.freedesktop.org/series/56166/
State : warning

== Summary ==

$ dim sparse origin/drm-tip
Sparse version: v0.5.2
Commit: drm/i915: Allow normal clients to always preempt idle priority clients
-O:drivers/gpu/drm/i915/intel_ringbuffer.h:595:23: warning: expression using sizeof(void)
-O:drivers/gpu/drm/i915/intel_ringbuffer.h:595:23: warning: expression using sizeof(void)
+drivers/gpu/drm/i915/intel_ringbuffer.h:608:23: warning: expression using sizeof(void)
+drivers/gpu/drm/i915/intel_ringbuffer.h:608:23: warning: expression using sizeof(void)

Commit: drm/i915/execlists: Suppress mere WAIT preemption
Okay!

Commit: drm/i915/execlists: Suppress redundant preemption
Okay!

Commit: drm/i915/selftests: Exercise some AB...BA preemption chains
Okay!

Commit: drm/i915: Trim NEWCLIENT boosting
Okay!

Commit: drm/i915: Show support for accurate sw PMU busyness tracking
Okay!

Commit: drm/i915: Revoke mmaps and prevent access to fence registers across reset
-drivers/gpu/drm/i915/i915_gem.c:986:39: warning: expression using sizeof(void)
-drivers/gpu/drm/i915/i915_gem.c:986:39: warning: expression using sizeof(void)
+drivers/gpu/drm/i915/i915_gem.c:986:39: warning: expression using sizeof(void)
+drivers/gpu/drm/i915/i915_gem.c:986:39: warning: expression using sizeof(void)
+drivers/gpu/drm/i915/i915_reset.c:1304:5: warning: context imbalance in 'i915_reset_trylock' - different lock contexts for basic block
-drivers/gpu/drm/i915/selftests/../i915_drv.h:3551:16: warning: expression using sizeof(void)
+drivers/gpu/drm/i915/selftests/../i915_drv.h:3545:16: warning: expression using sizeof(void)

Commit: drm/i915: Force the GPU reset upon wedging
Okay!

Commit: drm/i915: Uninterruptibly drain the timelines on unwedging
Okay!

Commit: drm/i915: Wait for old resets before applying debugfs/i915_wedged
Okay!

Commit: drm/i915: Serialise resets with wedging
Okay!

Commit: drm/i915: Don't claim an unstarted request was guilty
Okay!


* ✓ Fi.CI.BAT: success for series starting with [01/12] drm/i915: Allow normal clients to always preempt idle priority clients
  2019-02-04  8:41 [PATCH 01/12] drm/i915: Allow normal clients to always preempt idle priority clients Chris Wilson
                   ` (13 preceding siblings ...)
  2019-02-04 10:23 ` ✗ Fi.CI.SPARSE: " Patchwork
@ 2019-02-04 10:48 ` Patchwork
  2019-02-04 11:27 ` ✗ Fi.CI.BAT: failure for series starting with [01/12] drm/i915: Allow normal clients to always preempt idle priority clients (rev2) Patchwork
  15 siblings, 0 replies; 36+ messages in thread
From: Patchwork @ 2019-02-04 10:48 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

== Series Details ==

Series: series starting with [01/12] drm/i915: Allow normal clients to always preempt idle priority clients
URL   : https://patchwork.freedesktop.org/series/56166/
State : success

== Summary ==

CI Bug Log - changes from CI_DRM_5535 -> Patchwork_12123
====================================================

Summary
-------

  **SUCCESS**

  No regressions found.

  External URL: https://patchwork.freedesktop.org/api/1.0/series/56166/revisions/1/mbox/

Known issues
------------

  Here are the changes found in Patchwork_12123 that come from known issues:

### IGT changes ###

#### Issues hit ####

  * igt@gem_exec_suspend@basic-s3:
    - fi-blb-e6850:       PASS -> INCOMPLETE [fdo#107718]

  * igt@i915_selftest@live_execlists:
    - fi-apl-guc:         PASS -> INCOMPLETE [fdo#103927]

  * igt@kms_chamelium@hdmi-hpd-fast:
    - fi-kbl-7500u:       PASS -> FAIL [fdo#109485]

  * igt@pm_rpm@basic-rte:
    - fi-byt-j1900:       PASS -> FAIL [fdo#108800]

  
#### Possible fixes ####

  * igt@kms_pipe_crc_basic@read-crc-pipe-b:
    - fi-byt-clapper:     FAIL [fdo#107362] -> PASS

  * igt@pm_rpm@basic-pci-d3-state:
    - fi-bsw-kefka:       {SKIP} [fdo#109271] -> PASS

  * igt@pm_rpm@basic-rte:
    - fi-bsw-kefka:       FAIL [fdo#108800] -> PASS

  * igt@prime_vgem@basic-fence-flip:
    - fi-ilk-650:         FAIL [fdo#104008] -> PASS
    - fi-gdg-551:         DMESG-FAIL [fdo#103182] -> PASS

  
  {name}: This element is suppressed. This means it is ignored when computing
          the status of the difference (SUCCESS, WARNING, or FAILURE).

  [fdo#103182]: https://bugs.freedesktop.org/show_bug.cgi?id=103182
  [fdo#103927]: https://bugs.freedesktop.org/show_bug.cgi?id=103927
  [fdo#104008]: https://bugs.freedesktop.org/show_bug.cgi?id=104008
  [fdo#107362]: https://bugs.freedesktop.org/show_bug.cgi?id=107362
  [fdo#107718]: https://bugs.freedesktop.org/show_bug.cgi?id=107718
  [fdo#108800]: https://bugs.freedesktop.org/show_bug.cgi?id=108800
  [fdo#109271]: https://bugs.freedesktop.org/show_bug.cgi?id=109271
  [fdo#109485]: https://bugs.freedesktop.org/show_bug.cgi?id=109485


Participating hosts (48 -> 44)
------------------------------

  Missing    (4): fi-kbl-soraka fi-ilk-m540 fi-byt-squawks fi-bsw-cyan 


Build changes
-------------

    * Linux: CI_DRM_5535 -> Patchwork_12123

  CI_DRM_5535: 96b56f434555ee108d816668642299e58e421050 @ git://anongit.freedesktop.org/gfx-ci/linux
  IGT_4804: 0b9ac934a6aad9ed5c1fdfd48d2b0faa10bfbbc4 @ git://anongit.freedesktop.org/xorg/app/intel-gpu-tools
  Patchwork_12123: 8a869de6fad0c1741e262ae7cb9939eaabfa73a5 @ git://anongit.freedesktop.org/gfx-ci/linux


== Linux commits ==

8a869de6fad0 drm/i915: Don't claim an unstarted request was guilty
e3ebb5cf4797 drm/i915: Serialise resets with wedging
b97913409822 drm/i915: Wait for old resets before applying debugfs/i915_wedged
fa2991ab10f0 drm/i915: Uninterruptibly drain the timelines on unwedging
c2a4f9415d54 drm/i915: Force the GPU reset upon wedging
46a673b165ae drm/i915: Revoke mmaps and prevent access to fence registers across reset
98ec3a89ed44 drm/i915: Show support for accurate sw PMU busyness tracking
f84993b34dac drm/i915: Trim NEWCLIENT boosting
54ed4585344b drm/i915/selftests: Exercise some AB...BA preemption chains
cd4def238fe6 drm/i915/execlists: Suppress redundant preemption
ce2fa789e441 drm/i915/execlists: Suppress mere WAIT preemption
08acdf160cfa drm/i915: Allow normal clients to always preempt idle priority clients

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_12123/

* [PATCH] drm/i915/execlists: Suppress mere WAIT preemption
  2019-02-04  8:41 ` [PATCH 02/12] drm/i915/execlists: Suppress mere WAIT preemption Chris Wilson
  2019-02-04 10:06   ` Tvrtko Ursulin
@ 2019-02-04 10:49   ` Chris Wilson
  1 sibling, 0 replies; 36+ messages in thread
From: Chris Wilson @ 2019-02-04 10:49 UTC (permalink / raw)
  To: intel-gfx

WAIT is occasionally suppressed by virtue of preempted requests being
promoted to NEWCLIENT if they have not already received that boost.
Make this consistent for all WAIT boosts: they are not allowed to
preempt executing contexts and are merely granted the right to be at the
front of the queue for the next execution slot. This is in keeping with
the desire that the WAIT boost be a minor tweak that does not give
excessive promotion to its user or open us up to trivial abuse.

The problem with the inconsistent WAIT preemption becomes more apparent
as the preemption is propagated across the engines, where one engine may
preempt and the other not, and we may be relying on the exact execution
order being consistent across engines (e.g. using HW semaphores to
coordinate parallel execution).
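
To put numbers on it (assuming just the two internal priority bits
defined at this point, with user levels shifted above them):

        /*
         * running rq at user prio 0:  rq_prio() == 0
         *                             effective_prio() == 0 | WAIT == 1
         * boosted waiter:             queue_priority_hint == 0 | WAIT == 1
         * need_preempt():             1 > 1 fails, so no preempt-to-idle;
         *                             the waiter merely heads the queue
         *
         * genuine bump to user prio 1: hint == 1 << 2 == 4 > 1, preempts
         */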

v2: Also protect GuC submission from false preemption loops.
v3: Build bug safeguards and better debug messages for st.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/i915_request.c          |  12 ++
 drivers/gpu/drm/i915/i915_scheduler.h        |   2 +
 drivers/gpu/drm/i915/intel_lrc.c             |   9 +-
 drivers/gpu/drm/i915/selftests/igt_spinner.c |   9 +-
 drivers/gpu/drm/i915/selftests/intel_lrc.c   | 160 +++++++++++++++++++
 5 files changed, 190 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
index 9ed5baf157a3..d14a1b225f47 100644
--- a/drivers/gpu/drm/i915/i915_request.c
+++ b/drivers/gpu/drm/i915/i915_request.c
@@ -377,12 +377,24 @@ void __i915_request_submit(struct i915_request *request)
 
 	/* We may be recursing from the signal callback of another i915 fence */
 	spin_lock_nested(&request->lock, SINGLE_DEPTH_NESTING);
+
 	GEM_BUG_ON(test_bit(I915_FENCE_FLAG_ACTIVE, &request->fence.flags));
 	set_bit(I915_FENCE_FLAG_ACTIVE, &request->fence.flags);
+
 	request->global_seqno = seqno;
 	if (test_bit(DMA_FENCE_FLAG_ENABLE_SIGNAL_BIT, &request->fence.flags) &&
 	    !i915_request_enable_breadcrumb(request))
 		intel_engine_queue_breadcrumbs(engine);
+
+	/*
+	 * As we do not allow WAIT to preempt inflight requests,
+	 * once we have executed a request, along with triggering
+	 * any execution callbacks, we must preserve its ordering
+	 * within the non-preemptible FIFO.
+	 */
+	BUILD_BUG_ON(__NO_PREEMPTION & ~I915_PRIORITY_MASK); /* only internal */
+	request->sched.attr.priority |= __NO_PREEMPTION;
+
 	spin_unlock(&request->lock);
 
 	engine->emit_fini_breadcrumb(request,
diff --git a/drivers/gpu/drm/i915/i915_scheduler.h b/drivers/gpu/drm/i915/i915_scheduler.h
index dbe9cb7ecd82..54bd6c89817e 100644
--- a/drivers/gpu/drm/i915/i915_scheduler.h
+++ b/drivers/gpu/drm/i915/i915_scheduler.h
@@ -33,6 +33,8 @@ enum {
 #define I915_PRIORITY_WAIT	((u8)BIT(0))
 #define I915_PRIORITY_NEWCLIENT	((u8)BIT(1))
 
+#define __NO_PREEMPTION (I915_PRIORITY_WAIT)
+
 struct i915_sched_attr {
 	/**
 	 * @priority: execution and service priority
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index a9eb0211ce77..773df0bd685b 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -188,6 +188,12 @@ static inline int rq_prio(const struct i915_request *rq)
 	return rq->sched.attr.priority;
 }
 
+static int effective_prio(const struct i915_request *rq)
+{
+	/* Restrict mere WAIT boosts from triggering preemption */
+	return rq_prio(rq) | __NO_PREEMPTION;
+}
+
 static int queue_prio(const struct intel_engine_execlists *execlists)
 {
 	struct i915_priolist *p;
@@ -208,7 +214,7 @@ static int queue_prio(const struct intel_engine_execlists *execlists)
 static inline bool need_preempt(const struct intel_engine_cs *engine,
 				const struct i915_request *rq)
 {
-	const int last_prio = rq_prio(rq);
+	int last_prio;
 
 	if (!intel_engine_has_preemption(engine))
 		return false;
@@ -228,6 +234,7 @@ static inline bool need_preempt(const struct intel_engine_cs *engine,
 	 * preempt. If that hint is stale or we may be trying to preempt
 	 * ourselves, ignore the request.
 	 */
+	last_prio = effective_prio(rq);
 	if (!__execlists_need_preempt(engine->execlists.queue_priority_hint,
 				      last_prio))
 		return false;
diff --git a/drivers/gpu/drm/i915/selftests/igt_spinner.c b/drivers/gpu/drm/i915/selftests/igt_spinner.c
index 9ebd9225684e..86354e51bdd3 100644
--- a/drivers/gpu/drm/i915/selftests/igt_spinner.c
+++ b/drivers/gpu/drm/i915/selftests/igt_spinner.c
@@ -142,10 +142,17 @@ igt_spinner_create_request(struct igt_spinner *spin,
 	*batch++ = upper_32_bits(vma->node.start);
 	*batch++ = MI_BATCH_BUFFER_END; /* not reached */
 
-	i915_gem_chipset_flush(spin->i915);
+	if (engine->emit_init_breadcrumb &&
+	    rq->timeline->has_initial_breadcrumb) {
+		err = engine->emit_init_breadcrumb(rq);
+		if (err)
+			goto cancel_rq;
+	}
 
 	err = engine->emit_bb_start(rq, vma->node.start, PAGE_SIZE, 0);
 
+	i915_gem_chipset_flush(spin->i915);
+
 cancel_rq:
 	if (err) {
 		i915_request_skip(rq, err);
diff --git a/drivers/gpu/drm/i915/selftests/intel_lrc.c b/drivers/gpu/drm/i915/selftests/intel_lrc.c
index fb35f53c9ce3..16037a841146 100644
--- a/drivers/gpu/drm/i915/selftests/intel_lrc.c
+++ b/drivers/gpu/drm/i915/selftests/intel_lrc.c
@@ -405,6 +405,165 @@ static int live_suppress_self_preempt(void *arg)
 	goto err_client_b;
 }
 
+static int __i915_sw_fence_call
+dummy_notify(struct i915_sw_fence *fence, enum i915_sw_fence_notify state)
+{
+	return NOTIFY_DONE;
+}
+
+static struct i915_request *dummy_request(struct intel_engine_cs *engine)
+{
+	struct i915_request *rq;
+
+	rq = kmalloc(sizeof(*rq), GFP_KERNEL | __GFP_ZERO);
+	if (!rq)
+		return NULL;
+
+	INIT_LIST_HEAD(&rq->active_list);
+	rq->engine = engine;
+
+	i915_sched_node_init(&rq->sched);
+
+	/* mark this request as permanently incomplete */
+	rq->fence.seqno = 1;
+	BUILD_BUG_ON(sizeof(rq->fence.seqno) != 8); /* upper 32b == 0 */
+	rq->hwsp_seqno = (u32 *)&rq->fence.seqno + 1;
+
+	i915_sw_fence_init(&rq->submit, dummy_notify);
+	i915_sw_fence_commit(&rq->submit);
+
+	return rq;
+}
+
+static void dummy_request_free(struct i915_request *dummy)
+{
+	i915_request_mark_complete(dummy);
+	i915_sched_node_fini(dummy->engine->i915, &dummy->sched);
+	kfree(dummy);
+}
+
+static int live_suppress_wait_preempt(void *arg)
+{
+	struct drm_i915_private *i915 = arg;
+	struct preempt_client client[4];
+	struct intel_engine_cs *engine;
+	enum intel_engine_id id;
+	intel_wakeref_t wakeref;
+	int err = -ENOMEM;
+	int i;
+
+	/*
+	 * Waiters are given a little priority nudge, but not enough
+	 * to actually cause any preemption. Double check that we do
+	 * not needlessly generate preempt-to-idle cycles.
+	 */
+
+	if (!HAS_LOGICAL_RING_PREEMPTION(i915))
+		return 0;
+
+	mutex_lock(&i915->drm.struct_mutex);
+	wakeref = intel_runtime_pm_get(i915);
+
+	if (preempt_client_init(i915, &client[0])) /* ELSP[0] */
+		goto err_unlock;
+	if (preempt_client_init(i915, &client[1])) /* ELSP[1] */
+		goto err_client_0;
+	if (preempt_client_init(i915, &client[2])) /* head of queue */
+		goto err_client_1;
+	if (preempt_client_init(i915, &client[3])) /* bystander */
+		goto err_client_2;
+
+	for_each_engine(engine, i915, id) {
+		int depth;
+
+		if (!engine->emit_init_breadcrumb)
+			continue;
+
+		for (depth = 0; depth < ARRAY_SIZE(client); depth++) {
+			struct i915_request *rq[ARRAY_SIZE(client)];
+			struct i915_request *dummy;
+
+			engine->execlists.preempt_hang.count = 0;
+
+			dummy = dummy_request(engine);
+			if (!dummy)
+				goto err_client_3;
+
+			for (i = 0; i < ARRAY_SIZE(client); i++) {
+				rq[i] = igt_spinner_create_request(&client[i].spin,
+								   client[i].ctx, engine,
+								   MI_NOOP);
+				if (IS_ERR(rq[i])) {
+					err = PTR_ERR(rq[i]);
+					goto err_wedged;
+				}
+
+				/* Disable NEWCLIENT promotion */
+				i915_gem_active_set(&rq[i]->timeline->last_request,
+						    dummy);
+				i915_request_add(rq[i]);
+			}
+
+			dummy_request_free(dummy);
+
+			GEM_BUG_ON(i915_request_completed(rq[0]));
+			if (!igt_wait_for_spinner(&client[0].spin, rq[0])) {
+				pr_err("%s: First client failed to start\n",
+				       engine->name);
+				goto err_wedged;
+			}
+			GEM_BUG_ON(!i915_request_started(rq[0]));
+
+			if (i915_request_wait(rq[depth],
+					      I915_WAIT_LOCKED |
+					      I915_WAIT_PRIORITY,
+					      1) != -ETIME) {
+				pr_err("%s: Waiter depth:%d completed!\n",
+				       engine->name, depth);
+				goto err_wedged;
+			}
+
+			for (i = 0; i < ARRAY_SIZE(client); i++)
+				igt_spinner_end(&client[i].spin);
+
+			if (igt_flush_test(i915, I915_WAIT_LOCKED))
+				goto err_wedged;
+
+			if (engine->execlists.preempt_hang.count) {
+				pr_err("%s: Preemption recorded x%d, depth %d; should have been suppressed!\n",
+				       engine->name,
+				       engine->execlists.preempt_hang.count,
+				       depth);
+				err = -EINVAL;
+				goto err_client_3;
+			}
+		}
+	}
+
+	err = 0;
+err_client_3:
+	preempt_client_fini(&client[3]);
+err_client_2:
+	preempt_client_fini(&client[2]);
+err_client_1:
+	preempt_client_fini(&client[1]);
+err_client_0:
+	preempt_client_fini(&client[0]);
+err_unlock:
+	if (igt_flush_test(i915, I915_WAIT_LOCKED))
+		err = -EIO;
+	intel_runtime_pm_put(i915, wakeref);
+	mutex_unlock(&i915->drm.struct_mutex);
+	return err;
+
+err_wedged:
+	for (i = 0; i < ARRAY_SIZE(client); i++)
+		igt_spinner_end(&client[i].spin);
+	i915_gem_set_wedged(i915);
+	err = -EIO;
+	goto err_client_3;
+}
+
 static int live_preempt_hang(void *arg)
 {
 	struct drm_i915_private *i915 = arg;
@@ -785,6 +944,7 @@ int intel_execlists_live_selftests(struct drm_i915_private *i915)
 		SUBTEST(live_preempt),
 		SUBTEST(live_late_preempt),
 		SUBTEST(live_suppress_self_preempt),
+		SUBTEST(live_suppress_wait_preempt),
 		SUBTEST(live_preempt_hang),
 		SUBTEST(live_preempt_smoke),
 	};
-- 
2.20.1
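
For illustration, the suppression boils down to a single comparison. A
minimal user-space sketch (the bit values mirror i915_scheduler.h; the
priorities themselves are invented for the example):

#include <assert.h>

#define I915_USER_PRIORITY_SHIFT 2
#define I915_PRIORITY_WAIT	(1 << 0)
#define __NO_PREEMPTION		(I915_PRIORITY_WAIT)

/* simplified __execlists_need_preempt() */
static int need_preempt(int hint, int last)
{
	return hint > last;
}

int main(void)
{
	int running = 0;			/* priority:0 request on the GPU */
	int waiter = I915_PRIORITY_WAIT;	/* priority:0 + WAIT boost, queued */
	int bumped = 1 << I915_USER_PRIORITY_SHIFT; /* user priority:1 */

	/* The mere WAIT boost no longer triggers preemption... */
	assert(!need_preempt(waiter, running | __NO_PREEMPTION));

	/* ...but a genuine user priority difference still does. */
	assert(need_preempt(bumped, running | __NO_PREEMPTION));

	return 0;
}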


* ✗ Fi.CI.BAT: failure for series starting with [01/12] drm/i915: Allow normal clients to always preempt idle priority clients (rev2)
  2019-02-04  8:41 [PATCH 01/12] drm/i915: Allow normal clients to always preempt idle priority clients Chris Wilson
                   ` (14 preceding siblings ...)
  2019-02-04 10:48 ` ✓ Fi.CI.BAT: success " Patchwork
@ 2019-02-04 11:27 ` Patchwork
  15 siblings, 0 replies; 36+ messages in thread
From: Patchwork @ 2019-02-04 11:27 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

== Series Details ==

Series: series starting with [01/12] drm/i915: Allow normal clients to always preempt idle priority clients (rev2)
URL   : https://patchwork.freedesktop.org/series/56166/
State : failure

== Summary ==

Applying: drm/i915: Allow normal clients to always preempt idle priority clients
Using index info to reconstruct a base tree...
M	drivers/gpu/drm/i915/intel_ringbuffer.h
Falling back to patching base and 3-way merge...
Auto-merging drivers/gpu/drm/i915/intel_ringbuffer.h
CONFLICT (content): Merge conflict in drivers/gpu/drm/i915/intel_ringbuffer.h
error: Failed to merge in the changes.
hint: Use 'git am --show-current-patch' to see the failed patch
Patch failed at 0001 drm/i915: Allow normal clients to always preempt idle priority clients
When you have resolved this problem, run "git am --continue".
If you prefer to skip this patch, run "git am --skip" instead.
To restore the original branch and stop patching, run "git am --abort".


* Re: [PATCH 03/12] drm/i915/execlists: Suppress redundant preemption
  2019-02-04  8:41 ` [PATCH 03/12] drm/i915/execlists: Suppress redundant preemption Chris Wilson
@ 2019-02-04 12:05   ` Tvrtko Ursulin
  2019-02-04 12:25     ` Chris Wilson
  0 siblings, 1 reply; 36+ messages in thread
From: Tvrtko Ursulin @ 2019-02-04 12:05 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx


On 04/02/2019 08:41, Chris Wilson wrote:
> On unwinding the active request we give it a small (limited to internal
> priority levels) boost to prevent it from being gazumped a second time.
> However, this means that it can be promoted to above the request that
> triggered the preemption request, causing a preempt-to-idle cycle for no
> change. We can avoid this if we take the boost into account when
> checking if the preemption request is valid.

After the previous patch to not preempt on wait, this is not true any 
more, right?

> 
> v2: After preemption the active request will be after the preemptee if
> they end up with equal priority.
> 
> v3: Tvrtko pointed out that this, the existing logic, makes
> I915_PRIORITY_WAIT non-preemptible. Document this interesting quirk!
> 
> v4: Prove Tvrtko was right about WAIT being non-preemptible and test it.
> v5: Except not all priorities were made equal, and the WAIT not preempting
> is only if we start off as !NEWCLIENT.
> 
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> ---
>   drivers/gpu/drm/i915/intel_lrc.c | 38 ++++++++++++++++++++++++++++----
>   1 file changed, 34 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
> index 773df0bd685b..9b6b3acb9070 100644
> --- a/drivers/gpu/drm/i915/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/intel_lrc.c
> @@ -164,6 +164,8 @@
>   #define WA_TAIL_DWORDS 2
>   #define WA_TAIL_BYTES (sizeof(u32) * WA_TAIL_DWORDS)
>   
> +#define ACTIVE_PRIORITY (I915_PRIORITY_NEWCLIENT)
> +
>   static int execlists_context_deferred_alloc(struct i915_gem_context *ctx,
>   					    struct intel_engine_cs *engine,
>   					    struct intel_context *ce);
> @@ -190,8 +192,30 @@ static inline int rq_prio(const struct i915_request *rq)
>   
>   static int effective_prio(const struct i915_request *rq)
>   {
> +	int prio = rq_prio(rq);
> +
> +	/*
> +	 * On unwinding the active request, we give it a priority bump
> +	 * equivalent to a freshly submitted request. This protects it from
> +	 * being gazumped again, but it would be preferable if we didn't
> +	 * let it be gazumped in the first place!
> +	 *
> +	 * See __unwind_incomplete_requests()
> +	 */
> +	if (~prio & ACTIVE_PRIORITY && __i915_request_has_started(rq)) {

So effectively a started request cannot be preempted by a new client at 
the same base priority level.

I am still not sure that we should give special meaning to a started 
request. We don't know how long it would execute or anything. It may be 
the only request, or it may be a last in a coalesced context. And in 
both cases it can be either short or long.

> +		/*
> +		 * After preemption, we insert the active request at the
> +		 * end of the new priority level. This means that we will be
> +		 * _lower_ priority than the preemptee all things equal (and
> +		 * so the preemption is valid), so adjust our comparison
> +		 * accordingly.
> +		 */
> +		prio |= ACTIVE_PRIORITY;
> +		prio--;
> +	}
> +
>   	/* Restrict mere WAIT boosts from triggering preemption */
> -	return rq_prio(rq) | __NO_PREEMPTION;
> +	return prio | __NO_PREEMPTION;
>   }
>   
>   static int queue_prio(const struct intel_engine_execlists *execlists)
> @@ -360,7 +384,7 @@ __unwind_incomplete_requests(struct intel_engine_cs *engine)
>   {
>   	struct i915_request *rq, *rn, *active = NULL;
>   	struct list_head *uninitialized_var(pl);
> -	int prio = I915_PRIORITY_INVALID | I915_PRIORITY_NEWCLIENT;
> +	int prio = I915_PRIORITY_INVALID | ACTIVE_PRIORITY;
>   
>   	lockdep_assert_held(&engine->timeline.lock);
>   
> @@ -391,9 +415,15 @@ __unwind_incomplete_requests(struct intel_engine_cs *engine)
>   	 * The active request is now effectively the start of a new client
>   	 * stream, so give it the equivalent small priority bump to prevent
>   	 * it being gazumped a second time by another peer.
> +	 *
> +	 * One consequence of this preemption boost is that we may jump
> +	 * over lesser priorities (such as I915_PRIORITY_WAIT), effectively
> +	 * making those priorities non-preemptible. They will be moved forward

Not fully preemptible, just in the same user prio bucket.

> +	 * in the priority queue, but they will not gain immediate access to
> +	 * the GPU.
>   	 */
> -	if (!(prio & I915_PRIORITY_NEWCLIENT)) {
> -		prio |= I915_PRIORITY_NEWCLIENT;
> +	if (~prio & ACTIVE_PRIORITY && __i915_request_has_started(active)) {
> +		prio |= ACTIVE_PRIORITY;
>   		active->sched.attr.priority = prio;
>   		list_move_tail(&active->sched.link,
>   			       i915_sched_lookup_priolist(engine, prio));
> 

And this is a big change as well. We would stop giving a boost to 
preempted requests if they haven't started executing yet. And we have no 
accounting of how many times something is preempted, to maybe keep 
bumping their priorities in those cases. Which would probably move 
towards a different priority management scheme. Like:

	if (preempted || new_client || wait)
		rq->effective_prio++;

#define USER_PRIO_SHIFT 16 // 65536 prio bumps before overflow?
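
Expanded slightly, that could be (a rough sketch only; the helper name
and the saturation rule here are invented, not merged code):

#define USER_PRIO(x)	((x) << USER_PRIO_SHIFT)
#define BOOST_MASK	(USER_PRIO(1) - 1)

/* preempted, new client or wait: every event counts the same */
static void sched_boost(struct i915_request *rq)
{
	/* saturate instead of spilling into the next user level */
	if ((rq->effective_prio & BOOST_MASK) != BOOST_MASK)
		rq->effective_prio++;
}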

I don't know, I find the total logic (all these patches combined) quite 
hard to understand, so I started to wonder whether we could have 
something more generic and simpler.

Regards,

Tvrtko



* Re: [PATCH 02/12] drm/i915/execlists: Suppress mere WAIT preemption
  2019-02-04 10:18     ` Chris Wilson
@ 2019-02-04 12:08       ` Tvrtko Ursulin
  2019-02-04 12:19         ` Chris Wilson
  0 siblings, 1 reply; 36+ messages in thread
From: Tvrtko Ursulin @ 2019-02-04 12:08 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx


On 04/02/2019 10:18, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2019-02-04 10:06:35)
>>
>> On 04/02/2019 08:41, Chris Wilson wrote:
>>> WAIT is occasionally suppressed by virtue of preempted requests being
>>> promoted to NEWCLIENT iff they have not already received that boost.
>>
>> s/iff/if/
> 
> iff == if, and only if
> 
> But it was probably a typo.
>   
>>> Make this consistent for all WAIT boosts that they are not allowed to
>>> preempt executing contexts and are merely granted the right to be at the
>>> front of the queue for the next execution slot. This is in keeping with
>>> the desire that the WAIT boost be a minor tweak that does not give
>>> excessive promotion to its user and open ourselves to trivial abuse.
>>>
>>> The problem with the inconsistent WAIT preemption becomes more apparent
>>> as the preemption is propagated across the engines, where one engine may
>>> preempt and the other not, and we be relying on the exact execution
>>> order being consistent across engines (e.g. using HW semaphores to
>>> coordinate parallel execution).
>>>
>>> v2: Also protect GuC submission from false preemption loops.
>>>
>>> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
>>> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>> ---
>>>    drivers/gpu/drm/i915/i915_request.c          |  11 ++
>>>    drivers/gpu/drm/i915/i915_scheduler.h        |   2 +
>>>    drivers/gpu/drm/i915/intel_guc_submission.c  |   2 +-
>>>    drivers/gpu/drm/i915/intel_lrc.c             |   9 +-
>>>    drivers/gpu/drm/i915/selftests/igt_spinner.c |   9 +-
>>>    drivers/gpu/drm/i915/selftests/intel_lrc.c   | 156 +++++++++++++++++++
>>>    6 files changed, 186 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
>>> index 9ed5baf157a3..7968875d0bed 100644
>>> --- a/drivers/gpu/drm/i915/i915_request.c
>>> +++ b/drivers/gpu/drm/i915/i915_request.c
>>> @@ -377,12 +377,23 @@ void __i915_request_submit(struct i915_request *request)
>>>    
>>>        /* We may be recursing from the signal callback of another i915 fence */
>>>        spin_lock_nested(&request->lock, SINGLE_DEPTH_NESTING);
>>> +
>>>        GEM_BUG_ON(test_bit(I915_FENCE_FLAG_ACTIVE, &request->fence.flags));
>>>        set_bit(I915_FENCE_FLAG_ACTIVE, &request->fence.flags);
>>> +
>>>        request->global_seqno = seqno;
>>>        if (test_bit(DMA_FENCE_FLAG_ENABLE_SIGNAL_BIT, &request->fence.flags) &&
>>>            !i915_request_enable_breadcrumb(request))
>>>                intel_engine_queue_breadcrumbs(engine);
>>> +
>>> +     /*
>>> +      * As we do not allow WAIT to preempt inflight requests,
>>> +      * once we have executed a request, along with triggering
>>> +      * any execution callbacks, we must preserve its ordering
>>> +      * within the non-preemptible FIFO.
>>> +      */
>>> +     request->sched.attr.priority |= __NO_PREEMPTION;
>>> +
>>>        spin_unlock(&request->lock);
>>>    
>>>        engine->emit_fini_breadcrumb(request,
>>> diff --git a/drivers/gpu/drm/i915/i915_scheduler.h b/drivers/gpu/drm/i915/i915_scheduler.h
>>> index dbe9cb7ecd82..54bd6c89817e 100644
>>> --- a/drivers/gpu/drm/i915/i915_scheduler.h
>>> +++ b/drivers/gpu/drm/i915/i915_scheduler.h
>>> @@ -33,6 +33,8 @@ enum {
>>>    #define I915_PRIORITY_WAIT  ((u8)BIT(0))
>>>    #define I915_PRIORITY_NEWCLIENT     ((u8)BIT(1))
>>>    
>>> +#define __NO_PREEMPTION (I915_PRIORITY_WAIT)
>>> +
>>>    struct i915_sched_attr {
>>>        /**
>>>         * @priority: execution and service priority
>>> diff --git a/drivers/gpu/drm/i915/intel_guc_submission.c b/drivers/gpu/drm/i915/intel_guc_submission.c
>>> index 8bc8aa54aa35..94695eb819c2 100644
>>> --- a/drivers/gpu/drm/i915/intel_guc_submission.c
>>> +++ b/drivers/gpu/drm/i915/intel_guc_submission.c
>>> @@ -720,7 +720,7 @@ static inline int rq_prio(const struct i915_request *rq)
>>>    
>>>    static inline int port_prio(const struct execlist_port *port)
>>>    {
>>> -     return rq_prio(port_request(port));
>>> +     return rq_prio(port_request(port)) | __NO_PREEMPTION;
>>>    }
>>>    
>>>    static bool __guc_dequeue(struct intel_engine_cs *engine)
>>> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
>>> index a9eb0211ce77..773df0bd685b 100644
>>> --- a/drivers/gpu/drm/i915/intel_lrc.c
>>> +++ b/drivers/gpu/drm/i915/intel_lrc.c
>>> @@ -188,6 +188,12 @@ static inline int rq_prio(const struct i915_request *rq)
>>>        return rq->sched.attr.priority;
>>>    }
>>>    
>>> +static int effective_prio(const struct i915_request *rq)
>>> +{
>>> +     /* Restrict mere WAIT boosts from triggering preemption */
>>> +     return rq_prio(rq) | __NO_PREEMPTION;
>>> +}
>>
>> I suggest adding i915_request_effective_prio to i915_request.h - it is
>> verbose but avoids two implementations.
> 
> Too verbose... And it may differ depending on backend details...
> 
> We don't even need to or in no-preemption until later...

Hmm.. I would hope it wouldn't depend on the backend. I think we should 
at least try to make things decoupled at this level.
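
Concretely, the shared helper suggested earlier could be as small as
(hypothetical; the name and header placement follow the suggestion above,
this is not merged code):

static inline int
i915_request_effective_prio(const struct i915_request *rq)
{
	/* Restrict mere WAIT boosts from triggering preemption */
	return rq->sched.attr.priority | __NO_PREEMPTION;
}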

Regards,

Tvrtko

>   
>> BUILD_BUG_ON(hweight32(__NO_PREEMPTION) != 1); as well? To ensure it is
>> defined to internal priority levels.
> 
> You mean __NO_PREEMPTION & ~I915_PRIORITY_MASK ?
> 
>>>    static int queue_prio(const struct intel_engine_execlists *execlists)
>>>    {
>>>        struct i915_priolist *p;
>>> @@ -208,7 +214,7 @@ static int queue_prio(const struct intel_engine_execlists *execlists)
>>>    static inline bool need_preempt(const struct intel_engine_cs *engine,
>>>                                const struct i915_request *rq)
>>>    {
>>> -     const int last_prio = rq_prio(rq);
>>> +     int last_prio;
>>>    
>>>        if (!intel_engine_has_preemption(engine))
>>>                return false;
>>> @@ -228,6 +234,7 @@ static inline bool need_preempt(const struct intel_engine_cs *engine,
>>>         * preempt. If that hint is stale or we may be trying to preempt
>>>         * ourselves, ignore the request.
>>>         */
>>> +     last_prio = effective_prio(rq);
>>
>> Isn't this redundant? Every submitted request already had
>> __NO_PREEMPTION applied in __i915_request_submit.
> 
> But not in the next patch which expands upon this.
> 
>>>        if (!__execlists_need_preempt(engine->execlists.queue_priority_hint,
>>>                                      last_prio))
>>>                return false;
>>> diff --git a/drivers/gpu/drm/i915/selftests/igt_spinner.c b/drivers/gpu/drm/i915/selftests/igt_spinner.c
>>> index 9ebd9225684e..86354e51bdd3 100644
>>> --- a/drivers/gpu/drm/i915/selftests/igt_spinner.c
>>> +++ b/drivers/gpu/drm/i915/selftests/igt_spinner.c
>>> @@ -142,10 +142,17 @@ igt_spinner_create_request(struct igt_spinner *spin,
>>>        *batch++ = upper_32_bits(vma->node.start);
>>>        *batch++ = MI_BATCH_BUFFER_END; /* not reached */
>>>    
>>> -     i915_gem_chipset_flush(spin->i915);
>>> +     if (engine->emit_init_breadcrumb &&
>>> +         rq->timeline->has_initial_breadcrumb) {
>>> +             err = engine->emit_init_breadcrumb(rq);
>>> +             if (err)
>>> +                     goto cancel_rq;
>>> +     }
>>>    
>>>        err = engine->emit_bb_start(rq, vma->node.start, PAGE_SIZE, 0);
>>>    
>>> +     i915_gem_chipset_flush(spin->i915);
>>> +
>>>    cancel_rq:
>>>        if (err) {
>>>                i915_request_skip(rq, err);
>>> diff --git a/drivers/gpu/drm/i915/selftests/intel_lrc.c b/drivers/gpu/drm/i915/selftests/intel_lrc.c
>>> index fb35f53c9ce3..86fd4589f5f0 100644
>>> --- a/drivers/gpu/drm/i915/selftests/intel_lrc.c
>>> +++ b/drivers/gpu/drm/i915/selftests/intel_lrc.c
>>> @@ -405,6 +405,161 @@ static int live_suppress_self_preempt(void *arg)
>>>        goto err_client_b;
>>>    }
>>>    
>>> +static int __i915_sw_fence_call
>>> +dummy_notify(struct i915_sw_fence *fence, enum i915_sw_fence_notify state)
>>> +{
>>> +     return NOTIFY_DONE;
>>> +}
>>> +
>>> +static struct i915_request *dummy_request(struct intel_engine_cs *engine)
>>> +{
>>> +     struct i915_request *rq;
>>> +
>>> +     rq = kmalloc(sizeof(*rq), GFP_KERNEL | __GFP_ZERO);
>>> +     if (!rq)
>>> +             return NULL;
>>> +
>>> +     INIT_LIST_HEAD(&rq->active_list);
>>> +     rq->engine = engine;
>>> +
>>> +     i915_sched_node_init(&rq->sched);
>>> +
>>> +     /* mark this request as permanently incomplete */
>>> +     rq->fence.seqno = 1;
>>> +     rq->hwsp_seqno = (u32 *)&rq->fence.seqno + 1;
>>
>> Hackery level 10 unlocked! :)
>>
>> Add to comment "..by pointing the hwsp_seqno to high (unused) 32-bits of
>> seqno".
> 
> I put the why... How that is actually accomplished is mere
> implementation, and you can read the code. :-p
> 
>> Also sounds like a good idea to add
>> BUILD_BUG_ON(sizeof(req->fence.seqno) != sizeof(u64))?
>>
>>> +
>>> +     i915_sw_fence_init(&rq->submit, dummy_notify);
>>> +     i915_sw_fence_commit(&rq->submit);
>>> +
>>> +     return rq;
>>> +}
>>> +
>>> +static void dummy_request_free(struct i915_request *dummy)
>>> +{
>>> +     i915_request_mark_complete(dummy);
>>> +     i915_sched_node_fini(dummy->engine->i915, &dummy->sched);
>>> +     kfree(dummy);
>>> +}
>>> +
>>> +static int live_suppress_wait_preempt(void *arg)
>>> +{
>>> +     struct drm_i915_private *i915 = arg;
>>> +     struct preempt_client client[4];
>>> +     struct intel_engine_cs *engine;
>>> +     enum intel_engine_id id;
>>> +     intel_wakeref_t wakeref;
>>> +     int err = -ENOMEM;
>>> +     int i;
>>> +
>>> +     /*
>>> +      * Waiters are given a little priority nudge, but not enough
>>> +      * to actually cause any preemption. Double check that we do
>>> +      * not needlessly generate preempt-to-idle cycles.
>>> +      */
>>> +
>>> +     if (!HAS_LOGICAL_RING_PREEMPTION(i915))
>>> +             return 0;
>>> +
>>> +     mutex_lock(&i915->drm.struct_mutex);
>>> +     wakeref = intel_runtime_pm_get(i915);
>>> +
>>> +     if (preempt_client_init(i915, &client[0])) /* ELSP[0] */
>>> +             goto err_unlock;
>>> +     if (preempt_client_init(i915, &client[1])) /* ELSP[1] */
>>> +             goto err_client_0;
>>> +     if (preempt_client_init(i915, &client[2])) /* head of queue */
>>> +             goto err_client_1;
>>> +     if (preempt_client_init(i915, &client[3])) /* bystander */
>>> +             goto err_client_2;
>>> +
>>> +     for_each_engine(engine, i915, id) {
>>> +             int depth;
>>> +
>>> +             if (!engine->emit_init_breadcrumb)
>>> +                     continue;
>>> +
>>> +             for (depth = 0; depth < ARRAY_SIZE(client); depth++) {
>>> +                     struct i915_request *rq[ARRAY_SIZE(client)];
>>> +                     struct i915_request *dummy;
>>> +
>>> +                     engine->execlists.preempt_hang.count = 0;
>>> +
>>> +                     dummy = dummy_request(engine);
>>> +                     if (!dummy)
>>> +                             goto err_client_3;
>>> +
>>> +                     for (i = 0; i < ARRAY_SIZE(client); i++) {
>>> +                             rq[i] = igt_spinner_create_request(&client[i].spin,
>>> +                                                                client[i].ctx, engine,
>>> +                                                                MI_NOOP);
>>> +                             if (IS_ERR(rq[i])) {
>>> +                                     err = PTR_ERR(rq[i]);
>>> +                                     goto err_wedged;
>>> +                             }
>>> +
>>> +                             /* Disable NEWCLIENT promotion */
>>> +                             i915_gem_active_set(&rq[i]->timeline->last_request,
>>> +                                                 dummy);
>>> +                             i915_request_add(rq[i]);
>>> +                     }
>>> +
>>> +                     dummy_request_free(dummy);
>>> +
>>> +                     GEM_BUG_ON(i915_request_completed(rq[0]));
>>> +                     if (!igt_wait_for_spinner(&client[0].spin, rq[0])) {
>>> +                             pr_err("First client failed to start\n");
>>> +                             goto err_wedged;
>>> +                     }
>>> +                     GEM_BUG_ON(!i915_request_started(rq[0]));
>>> +
>>> +                     if (i915_request_wait(rq[depth],
>>> +                                           I915_WAIT_LOCKED |
>>> +                                           I915_WAIT_PRIORITY,
>>> +                                           1) != -ETIME) {
>>> +                             pr_err("Waiter depth:%d completed!\n", depth);
>>> +                             goto err_wedged;
>>> +                     }
>>> +
>>> +                     for (i = 0; i < ARRAY_SIZE(client); i++)
>>> +                             igt_spinner_end(&client[i].spin);
>>> +
>>> +                     if (igt_flush_test(i915, I915_WAIT_LOCKED))
>>> +                             goto err_wedged;
>>> +
>>> +                     if (engine->execlists.preempt_hang.count) {
>>> +                             pr_err("Preemption recorded x%d, depth %d; should have been suppressed!\n",
>>> +                                    engine->execlists.preempt_hang.count,
>>> +                                    depth);
>>
>> Worth logging engine names with all the above error messages?
> 
> I honestly doubt we need per-engine since this is just about observing SW.
> But as we are exercising across engines, it may as well include that
> information.
> -Chris
> 

* Re: [PATCH 05/12] drm/i915: Trim NEWCLIENT boosting
  2019-02-04  8:41 ` [PATCH 05/12] drm/i915: Trim NEWCLIENT boosting Chris Wilson
@ 2019-02-04 12:11   ` Tvrtko Ursulin
  2019-02-04 12:26     ` Chris Wilson
  2019-02-04 12:27     ` Chris Wilson
  0 siblings, 2 replies; 36+ messages in thread
From: Tvrtko Ursulin @ 2019-02-04 12:11 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx


On 04/02/2019 08:41, Chris Wilson wrote:
> Limit the NEWCLIENT boost to only give its small priority boost to fresh
> clients only that have no dependencies.

Needs some actual explanation/documentation of why we would want this.

> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> ---
>   drivers/gpu/drm/i915/i915_request.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
> index 7968875d0bed..69bc549e95d8 100644
> --- a/drivers/gpu/drm/i915/i915_request.c
> +++ b/drivers/gpu/drm/i915/i915_request.c
> @@ -979,7 +979,7 @@ void i915_request_add(struct i915_request *request)
>   		 * Allow interactive/synchronous clients to jump ahead of
>   		 * the bulk clients. (FQ_CODEL)
>   		 */
> -		if (!prev || i915_request_completed(prev))
> +		if (list_empty(&request->sched.signalers_list))

Why are the !prev and completed checks gone? Seems in disagreement with 
the commit message text - they wouldn't have to be new clients any more, 
just have no dependencies.

>   			attr.priority |= I915_PRIORITY_NEWCLIENT;
>   
>   		engine->schedule(request, &attr);
> 

Regards,

Tvrtko

* Re: [PATCH 06/12] drm/i915: Show support for accurate sw PMU busyness tracking
  2019-02-04  8:41 ` [PATCH 06/12] drm/i915: Show support for accurate sw PMU busyness tracking Chris Wilson
@ 2019-02-04 12:14   ` Tvrtko Ursulin
  2019-02-04 12:28     ` Chris Wilson
  0 siblings, 1 reply; 36+ messages in thread
From: Tvrtko Ursulin @ 2019-02-04 12:14 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx


On 04/02/2019 08:41, Chris Wilson wrote:
> Expose whether or not we support the PMU software tracking in our
> scheduler capabilities, so userspace can query at runtime.

I am leaning towards thinking PMU is a backend and not the scheduler 
feature. We could export it via engine discovery for instance.

Do you think the "all must agree" change in logic is interesting enough 
on its own?

Regards,

Tvrtko


> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> ---
>   drivers/gpu/drm/i915/i915_gem.c         |  2 ++
>   drivers/gpu/drm/i915/intel_engine_cs.c  | 38 +++++++++++++++++++++++++
>   drivers/gpu/drm/i915/intel_lrc.c        |  6 ----
>   drivers/gpu/drm/i915/intel_ringbuffer.h |  2 ++
>   include/uapi/drm/i915_drm.h             |  1 +
>   5 files changed, 43 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index e802af64d628..bc7d1338b69a 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -4757,6 +4757,8 @@ static int __i915_gem_restart_engines(void *data)
>   		}
>   	}
>   
> +	intel_engines_set_scheduler_caps(i915);
> +
>   	return 0;
>   }
>   
> diff --git a/drivers/gpu/drm/i915/intel_engine_cs.c b/drivers/gpu/drm/i915/intel_engine_cs.c
> index 71c01eb13af1..ec2cbbe070a4 100644
> --- a/drivers/gpu/drm/i915/intel_engine_cs.c
> +++ b/drivers/gpu/drm/i915/intel_engine_cs.c
> @@ -614,6 +614,44 @@ int intel_engine_setup_common(struct intel_engine_cs *engine)
>   	return err;
>   }
>   
> +void intel_engines_set_scheduler_caps(struct drm_i915_private *i915)
> +{
> +	static const struct {
> +		u32 engine_flag;
> +		u32 sched_cap;
> +	} map[] = {
> +		{ I915_ENGINE_HAS_PREEMPTION, I915_SCHEDULER_CAP_PREEMPTION },
> +		{ I915_ENGINE_SUPPORTS_STATS, I915_SCHEDULER_CAP_PMU },
> +	};
> +	struct intel_engine_cs *engine;
> +	enum intel_engine_id id;
> +	u32 enabled, disabled;
> +
> +	enabled = 0;
> +	disabled = 0;
> +	for_each_engine(engine, i915, id) { /* all engines must agree! */
> +		int i;
> +
> +		if (engine->schedule)
> +			enabled |= (I915_SCHEDULER_CAP_ENABLED |
> +				    I915_SCHEDULER_CAP_PRIORITY);
> +		else
> +			disabled |= (I915_SCHEDULER_CAP_ENABLED |
> +				     I915_SCHEDULER_CAP_PRIORITY);
> +
> +		for (i = 0; i < ARRAY_SIZE(map); i++) {
> +			if (engine->flags & map[i].engine_flag)
> +				enabled |= map[i].sched_cap;
> +			else
> +				disabled |= map[i].sched_cap;
> +		}
> +	}
> +
> +	i915->caps.scheduler = enabled & ~disabled;
> +	if (!(i915->caps.scheduler & I915_SCHEDULER_CAP_ENABLED))
> +		i915->caps.scheduler = 0;
> +}
> +
>   static void __intel_context_unpin(struct i915_gem_context *ctx,
>   				  struct intel_engine_cs *engine)
>   {
> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
> index 9b6b3acb9070..0869a4fd20c7 100644
> --- a/drivers/gpu/drm/i915/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/intel_lrc.c
> @@ -2299,12 +2299,6 @@ void intel_execlists_set_default_submission(struct intel_engine_cs *engine)
>   	engine->flags |= I915_ENGINE_SUPPORTS_STATS;
>   	if (engine->i915->preempt_context)
>   		engine->flags |= I915_ENGINE_HAS_PREEMPTION;
> -
> -	engine->i915->caps.scheduler =
> -		I915_SCHEDULER_CAP_ENABLED |
> -		I915_SCHEDULER_CAP_PRIORITY;
> -	if (intel_engine_has_preemption(engine))
> -		engine->i915->caps.scheduler |= I915_SCHEDULER_CAP_PREEMPTION;
>   }
>   
>   static void
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
> index 983ad1e7914d..610ee351ebee 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.h
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
> @@ -590,6 +590,8 @@ intel_engine_has_preemption(const struct intel_engine_cs *engine)
>   	return engine->flags & I915_ENGINE_HAS_PREEMPTION;
>   }
>   
> +void intel_engines_set_scheduler_caps(struct drm_i915_private *i915);
> +
>   static inline bool __execlists_need_preempt(int prio, int last)
>   {
>   	/*
> diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
> index 298b2e197744..d8ac7f105734 100644
> --- a/include/uapi/drm/i915_drm.h
> +++ b/include/uapi/drm/i915_drm.h
> @@ -476,6 +476,7 @@ typedef struct drm_i915_irq_wait {
>   #define   I915_SCHEDULER_CAP_ENABLED	(1ul << 0)
>   #define   I915_SCHEDULER_CAP_PRIORITY	(1ul << 1)
>   #define   I915_SCHEDULER_CAP_PREEMPTION	(1ul << 2)
> +#define   I915_SCHEDULER_CAP_PMU	(1ul << 3)
>   
>   #define I915_PARAM_HUC_STATUS		 42
>   
> 

* Re: [PATCH 02/12] drm/i915/execlists: Suppress mere WAIT preemption
  2019-02-04 12:08       ` Tvrtko Ursulin
@ 2019-02-04 12:19         ` Chris Wilson
  2019-02-04 12:29           ` Tvrtko Ursulin
  0 siblings, 1 reply; 36+ messages in thread
From: Chris Wilson @ 2019-02-04 12:19 UTC (permalink / raw)
  To: Tvrtko Ursulin, intel-gfx

Quoting Tvrtko Ursulin (2019-02-04 12:08:50)
> 
> On 04/02/2019 10:18, Chris Wilson wrote:
> >>> +static int effective_prio(const struct i915_request *rq)
> >>> +{
> >>> +     /* Restrict mere WAIT boosts from triggering preemption */
> >>> +     return rq_prio(rq) | __NO_PREEMPTION;
> >>> +}
> >>
> >> I suggest adding i915_request_effective_prio to i915_request.h - it is
> >> verbose but avoids two implementations.
> > 
> > Too verbose... And it may differ depending on backend details...
> > 
> > We don't even need to or in no-preemption until later...
> 
> Hmm.. I would hope it wouldn't depend on the backend. I think we should 
> at least try to make things decoupled at this level.

I'm speculating about what the long term interface will be. If they can
only handle static priorities on a context level and take all
dependencies as semaphores, guc submission is a mere conduit and very
hands off.
-Chris

* Re: [PATCH 03/12] drm/i915/execlists: Suppress redundant preemption
  2019-02-04 12:05   ` Tvrtko Ursulin
@ 2019-02-04 12:25     ` Chris Wilson
  0 siblings, 0 replies; 36+ messages in thread
From: Chris Wilson @ 2019-02-04 12:25 UTC (permalink / raw)
  To: Tvrtko Ursulin, intel-gfx

Quoting Tvrtko Ursulin (2019-02-04 12:05:34)
> 
> On 04/02/2019 08:41, Chris Wilson wrote:
> > On unwinding the active request we give it a small (limited to internal
> > priority levels) boost to prevent it from being gazumped a second time.
> > However, this means that it can be promoted to above the request that
> > triggered the preemption request, causing a preempt-to-idle cycle for no
> > change. We can avoid this if we take the boost into account when
> > checking if the preemption request is valid.
> 
> After the previous patch to not preempt on wait, this is not true any 
> more, right?
> 
> > 
> > v2: After preemption the active request will be after the preemptee if
> > they end up with equal priority.
> > 
> > v3: Tvrtko pointed out that this, the existing logic, makes
> > I915_PRIORITY_WAIT non-preemptible. Document this interesting quirk!
> > 
> > v4: Prove Tvrtko was right about WAIT being non-preemptible and test it.
> > v5: Except not all priorities were made equal, and the WAIT not preempting
> > is only if we start off as !NEWCLIENT.
> > 
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> > ---
> >   drivers/gpu/drm/i915/intel_lrc.c | 38 ++++++++++++++++++++++++++++----
> >   1 file changed, 34 insertions(+), 4 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
> > index 773df0bd685b..9b6b3acb9070 100644
> > --- a/drivers/gpu/drm/i915/intel_lrc.c
> > +++ b/drivers/gpu/drm/i915/intel_lrc.c
> > @@ -164,6 +164,8 @@
> >   #define WA_TAIL_DWORDS 2
> >   #define WA_TAIL_BYTES (sizeof(u32) * WA_TAIL_DWORDS)
> >   
> > +#define ACTIVE_PRIORITY (I915_PRIORITY_NEWCLIENT)
> > +
> >   static int execlists_context_deferred_alloc(struct i915_gem_context *ctx,
> >                                           struct intel_engine_cs *engine,
> >                                           struct intel_context *ce);
> > @@ -190,8 +192,30 @@ static inline int rq_prio(const struct i915_request *rq)
> >   
> >   static int effective_prio(const struct i915_request *rq)
> >   {
> > +     int prio = rq_prio(rq);
> > +
> > +     /*
> > +      * On unwinding the active request, we give it a priority bump
> > +      * equivalent to a freshly submitted request. This protects it from
> > +      * being gazumped again, but it would be preferable if we didn't
> > +      * let it be gazumped in the first place!
> > +      *
> > +      * See __unwind_incomplete_requests()
> > +      */
> > +     if (~prio & ACTIVE_PRIORITY && __i915_request_has_started(rq)) {
> 
> So effectively a started request cannot be preempted by a new client at 
> the same base priority level.
> 
> I am still not sure that we should give special meaning to a started 
> request. We don't know how long it would execute or anything. It may be 
> the only request, or it may be the last in a coalesced context. And in 
> both cases it can be either short or long.

It's a mere reflection of unwind_incomplete_requests. We don't do the
preemption in this case, so why even start.

> > +             /*
> > +              * After preemption, we insert the active request at the
> > +              * end of the new priority level. This means that we will be
> > +              * _lower_ priority than the preemptee all things equal (and
> > +              * so the preemption is valid), so adjust our comparison
> > +              * accordingly.
> > +              */
> > +             prio |= ACTIVE_PRIORITY;
> > +             prio--;
> > +     }
> > +
> >       /* Restrict mere WAIT boosts from triggering preemption */
> > -     return rq_prio(rq) | __NO_PREEMPTION;
> > +     return prio | __NO_PREEMPTION;
> >   }
> >   
> >   static int queue_prio(const struct intel_engine_execlists *execlists)
> > @@ -360,7 +384,7 @@ __unwind_incomplete_requests(struct intel_engine_cs *engine)
> >   {
> >       struct i915_request *rq, *rn, *active = NULL;
> >       struct list_head *uninitialized_var(pl);
> > -     int prio = I915_PRIORITY_INVALID | I915_PRIORITY_NEWCLIENT;
> > +     int prio = I915_PRIORITY_INVALID | ACTIVE_PRIORITY;
> >   
> >       lockdep_assert_held(&engine->timeline.lock);
> >   
> > @@ -391,9 +415,15 @@ __unwind_incomplete_requests(struct intel_engine_cs *engine)
> >        * The active request is now effectively the start of a new client
> >        * stream, so give it the equivalent small priority bump to prevent
> >        * it being gazumped a second time by another peer.
> > +      *
> > +      * One consequence of this preemption boost is that we may jump
> > +      * over lesser priorities (such as I915_PRIORITY_WAIT), effectively
> > +      * making those priorities non-preemptible. They will be moved forward
> 
> Not fully preemptible, just in the same user prio bucket.

It means that the preemption request was ignored; it was unable to
preempt, non-preemptible.
 
> > +      * in the priority queue, but they will not gain immediate access to
> > +      * the GPU.
> >        */
> > -     if (!(prio & I915_PRIORITY_NEWCLIENT)) {
> > -             prio |= I915_PRIORITY_NEWCLIENT;
> > +     if (~prio & ACTIVE_PRIORITY && __i915_request_has_started(active)) {
> > +             prio |= ACTIVE_PRIORITY;
> >               active->sched.attr.priority = prio;
> >               list_move_tail(&active->sched.link,
> >                              i915_sched_lookup_priolist(engine, prio));
> > 
> 
> And this is a big change as well. We would stop giving a boost to 
> preempted requests if they haven't started executing yet. And we have no 
> accounting of how many times something is preempted, to maybe keep 
> bumping their priorities in those cases. Which would probably move 
> towards a different priority management scheme. Like:

What? There's no change here. The logic is important though, as it can
only apply if the request wasn't waiting.
-Chris

* Re: [PATCH 05/12] drm/i915: Trim NEWCLIENT boosting
  2019-02-04 12:11   ` Tvrtko Ursulin
@ 2019-02-04 12:26     ` Chris Wilson
  2019-02-04 12:42       ` Tvrtko Ursulin
  2019-02-04 12:27     ` Chris Wilson
  1 sibling, 1 reply; 36+ messages in thread
From: Chris Wilson @ 2019-02-04 12:26 UTC (permalink / raw)
  To: Tvrtko Ursulin, intel-gfx

Quoting Tvrtko Ursulin (2019-02-04 12:11:41)
> 
> On 04/02/2019 08:41, Chris Wilson wrote:
> > Limit the NEWCLIENT boost to only give its small priority boost to fresh
> > clients only that have no dependencies.
> 
> Needs some actual explanation/documentation of why we would want this.
> 
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > ---
> >   drivers/gpu/drm/i915/i915_request.c | 2 +-
> >   1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
> > index 7968875d0bed..69bc549e95d8 100644
> > --- a/drivers/gpu/drm/i915/i915_request.c
> > +++ b/drivers/gpu/drm/i915/i915_request.c
> > @@ -979,7 +979,7 @@ void i915_request_add(struct i915_request *request)
> >                * Allow interactive/synchronous clients to jump ahead of
> >                * the bulk clients. (FQ_CODEL)
> >                */
> > -             if (!prev || i915_request_completed(prev))
> > +             if (list_empty(&request->sched.signalers_list))
> 
> Why are the !prev and completed checks gone? Seems in disagreement with 
> the commit message text - they wouldn't have to be new clients any more, 
> just have no dependencies.

What? signalers_list is a superset of !prev || i915_request_completed.
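
In outline (trimmed from this era's i915_request_add(), not verbatim):

	prev = i915_gem_active_raw(&request->timeline->last_request,
				   &request->i915->drm.struct_mutex);
	if (prev && !i915_request_completed(prev))
		/* the incomplete prev is itself recorded as a dependency */
		__i915_sched_node_add_dependency(&request->sched,
						 &prev->sched,
						 &request->dep, 0);

so an empty signalers_list implies !prev || i915_request_completed(prev),
and in addition that there are no other dependencies either.
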
-Chris

* Re: [PATCH 05/12] drm/i915: Trim NEWCLIENT boosting
  2019-02-04 12:11   ` Tvrtko Ursulin
  2019-02-04 12:26     ` Chris Wilson
@ 2019-02-04 12:27     ` Chris Wilson
  1 sibling, 0 replies; 36+ messages in thread
From: Chris Wilson @ 2019-02-04 12:27 UTC (permalink / raw)
  To: Tvrtko Ursulin, intel-gfx

Quoting Tvrtko Ursulin (2019-02-04 12:11:41)
> 
> On 04/02/2019 08:41, Chris Wilson wrote:
> > Limit the NEWCLIENT boost to only give its small priority boost to fresh
> > clients only that have no dependencies.
> 
> Needs some actual explanation/documentation of why we would want this.

Exactly the same reasoning as NEWCLIENT.
-Chris

* Re: [PATCH 06/12] drm/i915: Show support for accurate sw PMU busyness tracking
  2019-02-04 12:14   ` Tvrtko Ursulin
@ 2019-02-04 12:28     ` Chris Wilson
  2019-02-04 12:29       ` Chris Wilson
  2019-02-04 12:37       ` Tvrtko Ursulin
  0 siblings, 2 replies; 36+ messages in thread
From: Chris Wilson @ 2019-02-04 12:28 UTC (permalink / raw)
  To: Tvrtko Ursulin, intel-gfx

Quoting Tvrtko Ursulin (2019-02-04 12:14:38)
> 
> On 04/02/2019 08:41, Chris Wilson wrote:
> > Expose whether or not we support the PMU software tracking in our
> > scheduler capabilities, so userspace can query at runtime.
> 
> I am leaning towards thinking PMU is a backend and not the scheduler 
> feature. We could export it via engine discovery for instance.

The sw metrics are buggy. They include semaphore time on top of busy,
but historically that has always been separate (and how it's measured by
the HW).
-Chris

* Re: [PATCH 02/12] drm/i915/execlists: Suppress mere WAIT preemption
  2019-02-04 12:19         ` Chris Wilson
@ 2019-02-04 12:29           ` Tvrtko Ursulin
  0 siblings, 0 replies; 36+ messages in thread
From: Tvrtko Ursulin @ 2019-02-04 12:29 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx


On 04/02/2019 12:19, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2019-02-04 12:08:50)
>>
>> On 04/02/2019 10:18, Chris Wilson wrote:
>>>>> +static int effective_prio(const struct i915_request *rq)
>>>>> +{
>>>>> +     /* Restrict mere WAIT boosts from triggering preemption */
>>>>> +     return rq_prio(rq) | __NO_PREEMPTION;
>>>>> +}
>>>>
>>>> I suggest adding i915_request_effective_prio to i915_request.h - it is
>>>> verbose but avoids two implementations.
>>>
>>> Too verbose... And it may differ depending on backend details...
>>>
>>> We don't even need to or in no-preemption until later...
>>
>> Hmm.. I would hope it wouldn't depend on the backend. I think we should
>> at least try to make things decoupled at this level.
> 
> I'm speculating about what the long term interface will be. If they can
> only handle static priorities on a context level and take all
> dependencies as semaphores, guc submission is a mere conduit and very
> hands off.

Point taken, the force of GuC influencing the i915 design will be too 
strong.

Regards,

Tvrtko

* Re: [PATCH 06/12] drm/i915: Show support for accurate sw PMU busyness tracking
  2019-02-04 12:28     ` Chris Wilson
@ 2019-02-04 12:29       ` Chris Wilson
  2019-02-04 12:37       ` Tvrtko Ursulin
  1 sibling, 0 replies; 36+ messages in thread
From: Chris Wilson @ 2019-02-04 12:29 UTC (permalink / raw)
  To: Tvrtko Ursulin, intel-gfx

Quoting Chris Wilson (2019-02-04 12:28:04)
> Quoting Tvrtko Ursulin (2019-02-04 12:14:38)
> > 
> > On 04/02/2019 08:41, Chris Wilson wrote:
> > > Expose whether or not we support the PMU software tracking in our
> > > scheduler capabilities, so userspace can query at runtime.
> > 
> > I am leaning towards thinking PMU is a backend and not the scheduler 
> > feature. We could export it via engine discovery for instance.
> 
> The sw metrics are buggy. They include semaphore time on top of busy,
> but historically that has always been separate (and how it's measured by
> the HW).

Furthermore they aren't even universally available today, breaking the
current assumption of HAS_EXECLISTS -> has SW PMU.
-Chris

* Re: [PATCH 06/12] drm/i915: Show support for accurate sw PMU busyness tracking
  2019-02-04 12:28     ` Chris Wilson
  2019-02-04 12:29       ` Chris Wilson
@ 2019-02-04 12:37       ` Tvrtko Ursulin
  2019-02-04 12:43         ` Chris Wilson
  1 sibling, 1 reply; 36+ messages in thread
From: Tvrtko Ursulin @ 2019-02-04 12:37 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx


On 04/02/2019 12:28, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2019-02-04 12:14:38)
>>
>> On 04/02/2019 08:41, Chris Wilson wrote:
>>> Expose whether or not we support the PMU software tracking in our
>>> scheduler capabilities, so userspace can query at runtime.
>>
>> I am leaning towards thinking PMU is a backend and not the scheduler
>> feature. We could export it via engine discovery for instance.
> 
> The sw metrics are buggy. They include semaphore time on top of busy,
> but historically that has always been separate (and how it's measured by
> the HW).

Time to resurrect the LRCA context runtime patches and see if that is 
consistent in wait vs poll mode.

But, why are the semantics of busy time related to the question of 
whether to expose this flag at engine or scheduler level?

Regards,

Tvrtko

* Re: [PATCH 05/12] drm/i915: Trim NEWCLIENT boosting
  2019-02-04 12:26     ` Chris Wilson
@ 2019-02-04 12:42       ` Tvrtko Ursulin
  0 siblings, 0 replies; 36+ messages in thread
From: Tvrtko Ursulin @ 2019-02-04 12:42 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx


On 04/02/2019 12:26, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2019-02-04 12:11:41)
>>
>> On 04/02/2019 08:41, Chris Wilson wrote:
>>> Limit the NEWCLIENT boost to only give its small priority boost to fresh
>>> clients only that have no dependencies.
>>
>> Needs some actual explanation/documentation of why we would want this.
>>
>>> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
>>> ---
>>>    drivers/gpu/drm/i915/i915_request.c | 2 +-
>>>    1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
>>> index 7968875d0bed..69bc549e95d8 100644
>>> --- a/drivers/gpu/drm/i915/i915_request.c
>>> +++ b/drivers/gpu/drm/i915/i915_request.c
>>> @@ -979,7 +979,7 @@ void i915_request_add(struct i915_request *request)
>>>                 * Allow interactive/synchronous clients to jump ahead of
>>>                 * the bulk clients. (FQ_CODEL)
>>>                 */
>>> -             if (!prev || i915_request_completed(prev))
>>> +             if (list_empty(&request->sched.signalers_list))
>>
>> Why are the !prev and completed checks gone? Seems in disagreement with
>> the commit message text - they wouldn't have to be new clients any more,
>> just have no dependencies.
> 
> What? signalers_list is a superset of !prev || i915_request_completed.

Ok my bad - I forgot about adding the previous unfinished request as a 
dependency.

Regards,

Tvrtko

* Re: [PATCH 06/12] drm/i915: Show support for accurate sw PMU busyness tracking
  2019-02-04 12:37       ` Tvrtko Ursulin
@ 2019-02-04 12:43         ` Chris Wilson
  0 siblings, 0 replies; 36+ messages in thread
From: Chris Wilson @ 2019-02-04 12:43 UTC (permalink / raw)
  To: Tvrtko Ursulin, intel-gfx

Quoting Tvrtko Ursulin (2019-02-04 12:37:00)
> 
> On 04/02/2019 12:28, Chris Wilson wrote:
> > Quoting Tvrtko Ursulin (2019-02-04 12:14:38)
> >>
> >> On 04/02/2019 08:41, Chris Wilson wrote:
> >>> Expose whether or not we support the PMU software tracking in our
> >>> scheduler capabilities, so userspace can query at runtime.
> >>
> >> I am leaning towards thinking PMU is a backend and not the scheduler
> >> feature. We could export it via engine discovery for instance.
> > 
> > The sw metrics are buggy. They include semaphore time on top of busy,
> > but historically that has always been separate (and how it's measured by
> > the HW).
> 
> Time to resurrect the LRCA context runtime patches and see if that is 
> consistent in wait vs poll mode.
> 
> But, why are the semantics of busy time related to the question of 
> whether to expose this flag at engine or scheduler level?

The accuracy (and meaning) presented to the user currently depends on
internal details that are not exposed. I just piggybacked caps.scheduler 
as it was adjacent to the code and already being used to determine
implementation details from igt / mesa.
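
The runtime check on the userspace side would then be roughly (a sketch;
error handling trimmed, fd assumed to be an open DRM fd):

#include <stdbool.h>
#include <sys/ioctl.h>
#include <drm/i915_drm.h>

static bool has_sw_pmu(int fd)
{
	int caps = 0;
	struct drm_i915_getparam gp = {
		.param = I915_PARAM_HAS_SCHEDULER,
		.value = &caps,
	};

	if (ioctl(fd, DRM_IOCTL_I915_GETPARAM, &gp))
		return false;

	return caps & I915_SCHEDULER_CAP_PMU;
}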
-Chris

* Re: [PATCH 07/12] drm/i915: Revoke mmaps and prevent access to fence registers across reset
  2019-02-04  8:41 ` [PATCH 07/12] drm/i915: Revoke mmaps and prevent access to fence registers across reset Chris Wilson
@ 2019-02-04 13:33   ` Mika Kuoppala
  2019-02-04 13:47     ` Chris Wilson
  0 siblings, 1 reply; 36+ messages in thread
From: Mika Kuoppala @ 2019-02-04 13:33 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx

Chris Wilson <chris@chris-wilson.co.uk> writes:

> Previously, we were able to rely on the recursive properties of
> struct_mutex to allow us to serialise revoking mmaps and reacquiring the
> FENCE registers with them being clobbered over a global device reset.
> I then proceeded to throw out the baby with the bath water in order to
> pursue a struct_mutex-less reset.
>
> Perusing LWN for alternative strategies, the dilemma on how to serialise
> access to a global resource on one side was answered by
> https://lwn.net/Articles/202847/ -- Sleepable RCU:
>
>     int readside(void)
>     {
>         int idx;
>
>         rcu_read_lock();
>         if (nomoresrcu) {
>             rcu_read_unlock();
>             return -EINVAL;
>         }
>         idx = srcu_read_lock(&ss);
>         rcu_read_unlock();
>         /* SRCU read-side critical section. */
>         srcu_read_unlock(&ss, idx);
>         return 0;
>     }
>
>     void cleanup(void)
>     {
>         nomoresrcu = 1;
>         synchronize_rcu();
>         synchronize_srcu(&ss);
>         cleanup_srcu_struct(&ss);
>     }
>
> No more worrying about stop_machine, just an uber-complex mutex,
> optimised for reads, with the overhead pushed to the rare reset path.
>
> However, we do run the risk of a deadlock as we allocate underneath the
> SRCU read lock, and the allocation may require a GPU reset, causing a
> dependency cycle via the in-flight requests. We resolve that by declaring
> the driver wedged and cancelling all in-flight rendering.
>
> v2: Use expedited rcu barriers to match our earlier timing
> characteristics.
> v3: Try to annotate locking contexts for sparse
> v4: Reduce selftest lock duration to avoid a reset deadlock with fences
>
> Testcase: igt/gem_mmap_gtt/hang
> Fixes: eb8d0f5af4ec ("drm/i915: Remove GPU reset dependence on struct_mutex")
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Mika Kuoppala <mika.kuoppala@intel.com>
> ---
>  drivers/gpu/drm/i915/i915_debugfs.c           |  12 +-
>  drivers/gpu/drm/i915/i915_drv.h               |  18 +--
>  drivers/gpu/drm/i915/i915_gem.c               |  56 +++------
>  drivers/gpu/drm/i915/i915_gem_fence_reg.c     |  31 +----
>  drivers/gpu/drm/i915/i915_gpu_error.h         |  12 +-
>  drivers/gpu/drm/i915/i915_reset.c             | 107 +++++++++++-------
>  drivers/gpu/drm/i915/i915_reset.h             |   4 +
>  .../gpu/drm/i915/selftests/intel_hangcheck.c  |   5 +-
>  .../gpu/drm/i915/selftests/mock_gem_device.c  |   1 +
>  9 files changed, 109 insertions(+), 137 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
> index fa2c226fc779..2cea263b4d79 100644
> --- a/drivers/gpu/drm/i915/i915_debugfs.c
> +++ b/drivers/gpu/drm/i915/i915_debugfs.c
> @@ -1281,14 +1281,11 @@ static int i915_hangcheck_info(struct seq_file *m, void *unused)
>  	intel_wakeref_t wakeref;
>  	enum intel_engine_id id;
>  
> +	seq_printf(m, "Reset flags: %lx\n", dev_priv->gpu_error.flags);
>  	if (test_bit(I915_WEDGED, &dev_priv->gpu_error.flags))
> -		seq_puts(m, "Wedged\n");
> +		seq_puts(m, "\tWedged\n");
>  	if (test_bit(I915_RESET_BACKOFF, &dev_priv->gpu_error.flags))
> -		seq_puts(m, "Reset in progress: struct_mutex backoff\n");
> -	if (waitqueue_active(&dev_priv->gpu_error.wait_queue))
> -		seq_puts(m, "Waiter holding struct mutex\n");
> -	if (waitqueue_active(&dev_priv->gpu_error.reset_queue))
> -		seq_puts(m, "struct_mutex blocked for reset\n");
> +		seq_puts(m, "\tDevice (global) reset in progress\n");
>  
>  	if (!i915_modparams.enable_hangcheck) {
>  		seq_puts(m, "Hangcheck disabled\n");
> @@ -3885,9 +3882,6 @@ i915_wedged_set(void *data, u64 val)
>  	 * while it is writing to 'i915_wedged'
>  	 */
>  
> -	if (i915_reset_backoff(&i915->gpu_error))
> -		return -EAGAIN;
> -

On checking the set side, I noticed there is a stale
comment about full reset needing the mutex in i915_handle_error().

>  	i915_handle_error(i915, val, I915_ERROR_CAPTURE,
>  			  "Manually set wedged engine mask = %llx", val);
>  	return 0;
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index 534e52e3a8da..3e4538ce5276 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -2989,7 +2989,12 @@ i915_gem_obj_finish_shmem_access(struct drm_i915_gem_object *obj)
>  	i915_gem_object_unpin_pages(obj);
>  }
>  
> -int __must_check i915_mutex_lock_interruptible(struct drm_device *dev);
> +static inline int __must_check
> +i915_mutex_lock_interruptible(struct drm_device *dev)
> +{
> +	return mutex_lock_interruptible(&dev->struct_mutex);
> +}
> +
>  int i915_gem_dumb_create(struct drm_file *file_priv,
>  			 struct drm_device *dev,
>  			 struct drm_mode_create_dumb *args);
> @@ -3006,21 +3011,11 @@ int __must_check i915_gem_set_global_seqno(struct drm_device *dev, u32 seqno);
>  struct i915_request *
>  i915_gem_find_active_request(struct intel_engine_cs *engine);
>  
> -static inline bool i915_reset_backoff(struct i915_gpu_error *error)
> -{
> -	return unlikely(test_bit(I915_RESET_BACKOFF, &error->flags));
> -}
> -
>  static inline bool i915_terminally_wedged(struct i915_gpu_error *error)
>  {
>  	return unlikely(test_bit(I915_WEDGED, &error->flags));
>  }
>  
> -static inline bool i915_reset_backoff_or_wedged(struct i915_gpu_error *error)
> -{
> -	return i915_reset_backoff(error) | i915_terminally_wedged(error);
> -}
> -
>  static inline u32 i915_reset_count(struct i915_gpu_error *error)
>  {
>  	return READ_ONCE(error->reset_count);
> @@ -3093,7 +3088,6 @@ struct drm_i915_fence_reg *
>  i915_reserve_fence(struct drm_i915_private *dev_priv);
>  void i915_unreserve_fence(struct drm_i915_fence_reg *fence);
>  
> -void i915_gem_revoke_fences(struct drm_i915_private *dev_priv);
>  void i915_gem_restore_fences(struct drm_i915_private *dev_priv);
>  
>  void i915_gem_detect_bit_6_swizzle(struct drm_i915_private *dev_priv);
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index bc7d1338b69a..2c6161c89cc7 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -100,47 +100,6 @@ static void i915_gem_info_remove_obj(struct drm_i915_private *dev_priv,
>  	spin_unlock(&dev_priv->mm.object_stat_lock);
>  }
>  
> -static int
> -i915_gem_wait_for_error(struct i915_gpu_error *error)
> -{
> -	int ret;
> -
> -	might_sleep();
> -
> -	/*
> -	 * Only wait 10 seconds for the gpu reset to complete to avoid hanging
> -	 * userspace. If it takes that long something really bad is going on and
> -	 * we should simply try to bail out and fail as gracefully as possible.
> -	 */
> -	ret = wait_event_interruptible_timeout(error->reset_queue,
> -					       !i915_reset_backoff(error),
> -					       I915_RESET_TIMEOUT);
> -	if (ret == 0) {
> -		DRM_ERROR("Timed out waiting for the gpu reset to complete\n");
> -		return -EIO;
> -	} else if (ret < 0) {
> -		return ret;
> -	} else {
> -		return 0;
> -	}
> -}
> -
> -int i915_mutex_lock_interruptible(struct drm_device *dev)
> -{
> -	struct drm_i915_private *dev_priv = to_i915(dev);
> -	int ret;
> -
> -	ret = i915_gem_wait_for_error(&dev_priv->gpu_error);
> -	if (ret)
> -		return ret;
> -
> -	ret = mutex_lock_interruptible(&dev->struct_mutex);
> -	if (ret)
> -		return ret;
> -
> -	return 0;
> -}
> -
>  static u32 __i915_gem_park(struct drm_i915_private *i915)
>  {
>  	intel_wakeref_t wakeref;
> @@ -1869,6 +1828,7 @@ vm_fault_t i915_gem_fault(struct vm_fault *vmf)
>  	intel_wakeref_t wakeref;
>  	struct i915_vma *vma;
>  	pgoff_t page_offset;
> +	int srcu;
>  	int ret;
>  
>  	/* Sanity check that we allow writing into this object */
> @@ -1908,7 +1868,6 @@ vm_fault_t i915_gem_fault(struct vm_fault *vmf)
>  		goto err_unlock;
>  	}
>  
> -
>  	/* Now pin it into the GTT as needed */
>  	vma = i915_gem_object_ggtt_pin(obj, NULL, 0, 0,
>  				       PIN_MAPPABLE |
> @@ -1946,9 +1905,15 @@ vm_fault_t i915_gem_fault(struct vm_fault *vmf)
>  	if (ret)
>  		goto err_unpin;
>  
> +	srcu = i915_reset_trylock(dev_priv);
> +	if (srcu < 0) {
> +		ret = srcu;
> +		goto err_unpin;
> +	}
> +
>  	ret = i915_vma_pin_fence(vma);
>  	if (ret)
> -		goto err_unpin;
> +		goto err_reset;
>  
>  	/* Finally, remap it using the new GTT offset */
>  	ret = remap_io_mapping(area,
> @@ -1969,6 +1934,8 @@ vm_fault_t i915_gem_fault(struct vm_fault *vmf)
>  
>  err_fence:
>  	i915_vma_unpin_fence(vma);
> +err_reset:
> +	i915_reset_unlock(dev_priv, srcu);
>  err_unpin:
>  	__i915_vma_unpin(vma);
>  err_unlock:
> @@ -5326,6 +5293,7 @@ int i915_gem_init_early(struct drm_i915_private *dev_priv)
>  	init_waitqueue_head(&dev_priv->gpu_error.wait_queue);
>  	init_waitqueue_head(&dev_priv->gpu_error.reset_queue);
>  	mutex_init(&dev_priv->gpu_error.wedge_mutex);
> +	init_srcu_struct(&dev_priv->gpu_error.srcu);
>  
>  	atomic_set(&dev_priv->mm.bsd_engine_dispatch_index, 0);
>  
> @@ -5358,6 +5326,8 @@ void i915_gem_cleanup_early(struct drm_i915_private *dev_priv)
>  	GEM_BUG_ON(atomic_read(&dev_priv->mm.free_count));
>  	WARN_ON(dev_priv->mm.object_count);
>  
> +	cleanup_srcu_struct(&dev_priv->gpu_error.srcu);
> +
>  	kmem_cache_destroy(dev_priv->priorities);
>  	kmem_cache_destroy(dev_priv->dependencies);
>  	kmem_cache_destroy(dev_priv->requests);
> diff --git a/drivers/gpu/drm/i915/i915_gem_fence_reg.c b/drivers/gpu/drm/i915/i915_gem_fence_reg.c
> index 46e259661294..bd0d5b8d6c96 100644
> --- a/drivers/gpu/drm/i915/i915_gem_fence_reg.c
> +++ b/drivers/gpu/drm/i915/i915_gem_fence_reg.c
> @@ -240,6 +240,10 @@ static int fence_update(struct drm_i915_fence_reg *fence,
>  		i915_vma_flush_writes(old);
>  	}
>  
> +	ret = i915_reset_trylock(fence->i915);
> +	if (ret < 0)
> +		return ret;
> +
>  	if (fence->vma && fence->vma != vma) {
>  		/* Ensure that all userspace CPU access is completed before
>  		 * stealing the fence.
> @@ -272,6 +276,7 @@ static int fence_update(struct drm_i915_fence_reg *fence,
>  		list_move_tail(&fence->link, &fence->i915->mm.fence_list);
>  	}
>  
> +	i915_reset_unlock(fence->i915, ret);
>  	return 0;
>  }
>  
> @@ -435,32 +440,6 @@ void i915_unreserve_fence(struct drm_i915_fence_reg *fence)
>  	list_add(&fence->link, &fence->i915->mm.fence_list);
>  }
>  
> -/**
> - * i915_gem_revoke_fences - revoke fence state
> - * @dev_priv: i915 device private
> - *
> - * Removes all GTT mmappings via the fence registers. This forces any user
> - * of the fence to reacquire that fence before continuing with their access.
> - * One use is during GPU reset where the fence register is lost and we need to
> - * revoke concurrent userspace access via GTT mmaps until the hardware has been
> - * reset and the fence registers have been restored.
> - */
> -void i915_gem_revoke_fences(struct drm_i915_private *dev_priv)
> -{
> -	int i;
> -
> -	lockdep_assert_held(&dev_priv->drm.struct_mutex);
> -
> -	for (i = 0; i < dev_priv->num_fence_regs; i++) {
> -		struct drm_i915_fence_reg *fence = &dev_priv->fence_regs[i];
> -
> -		GEM_BUG_ON(fence->vma && fence->vma->fence != fence);
> -
> -		if (fence->vma)
> -			i915_vma_revoke_mmap(fence->vma);
> -	}
> -}
> -
>  /**
>   * i915_gem_restore_fences - restore fence state
>   * @dev_priv: i915 device private
> diff --git a/drivers/gpu/drm/i915/i915_gpu_error.h b/drivers/gpu/drm/i915/i915_gpu_error.h
> index 53b1f22dd365..4e797c552b96 100644
> --- a/drivers/gpu/drm/i915/i915_gpu_error.h
> +++ b/drivers/gpu/drm/i915/i915_gpu_error.h
> @@ -231,12 +231,10 @@ struct i915_gpu_error {
>  	/**
>  	 * flags: Control various stages of the GPU reset
>  	 *
> -	 * #I915_RESET_BACKOFF - When we start a reset, we want to stop any
> -	 * other users acquiring the struct_mutex. To do this we set the
> -	 * #I915_RESET_BACKOFF bit in the error flags when we detect a reset
> -	 * and then check for that bit before acquiring the struct_mutex (in
> -	 * i915_mutex_lock_interruptible()?). I915_RESET_BACKOFF serves a
> -	 * secondary role in preventing two concurrent global reset attempts.
> +	 * #I915_RESET_BACKOFF - When we start a global reset, we need to
> +	 * serialise with any other users attempting to do the same, and
> +	 * any global resources that may be clobbered by the reset (such as
> +	 * FENCE registers).
>  	 *
>  	 * #I915_RESET_ENGINE[num_engines] - Since the driver doesn't need to
>  	 * acquire the struct_mutex to reset an engine, we need an explicit
> @@ -272,6 +270,8 @@ struct i915_gpu_error {
>  	 */
>  	wait_queue_head_t reset_queue;
>  
> +	struct srcu_struct srcu;
> +

It is the only one in here, so it's not causing confusion,
but it could have been named reset_backoff_srcu.

>  	struct i915_gpu_restart *restart;
>  };
>  
> diff --git a/drivers/gpu/drm/i915/i915_reset.c b/drivers/gpu/drm/i915/i915_reset.c
> index 4462007a681c..f58fae457ec6 100644
> --- a/drivers/gpu/drm/i915/i915_reset.c
> +++ b/drivers/gpu/drm/i915/i915_reset.c
> @@ -639,6 +639,31 @@ static void reset_prepare_engine(struct intel_engine_cs *engine)
>  	engine->reset.prepare(engine);
>  }
>  
> +static void revoke_mmaps(struct drm_i915_private *i915)
> +{
> +	int i;
> +
> +	for (i = 0; i < i915->num_fence_regs; i++) {
> +		struct i915_vma *vma = i915->fence_regs[i].vma;
> +		struct drm_vma_offset_node *node;
> +		u64 vma_offset;
> +
> +		if (!vma)
> +			continue;
> +
> +		GEM_BUG_ON(vma->fence != &i915->fence_regs[i]);
> +		if (!i915_vma_has_userfault(vma))
> +			continue;
> +
> +		node = &vma->obj->base.vma_node;
> +		vma_offset = vma->ggtt_view.partial.offset << PAGE_SHIFT;
> +		unmap_mapping_range(i915->drm.anon_inode->i_mapping,
> +				    drm_vma_node_offset_addr(node) + vma_offset,
> +				    vma->size,
> +				    1);
> +	}
> +}
> +
>  static void reset_prepare(struct drm_i915_private *i915)
>  {
>  	struct intel_engine_cs *engine;
> @@ -648,6 +673,7 @@ static void reset_prepare(struct drm_i915_private *i915)
>  		reset_prepare_engine(engine);
>  
>  	intel_uc_sanitize(i915);
> +	revoke_mmaps(i915);
>  }
>  
>  static int gt_reset(struct drm_i915_private *i915, unsigned int stalled_mask)
> @@ -911,50 +937,22 @@ bool i915_gem_unset_wedged(struct drm_i915_private *i915)
>  	return ret;
>  }
>  
> -struct __i915_reset {
> -	struct drm_i915_private *i915;
> -	unsigned int stalled_mask;
> -};
> -
> -static int __i915_reset__BKL(void *data)
> -{
> -	struct __i915_reset *arg = data;
> -	int err;
> -
> -	err = intel_gpu_reset(arg->i915, ALL_ENGINES);
> -	if (err)
> -		return err;
> -
> -	return gt_reset(arg->i915, arg->stalled_mask);
> -}
> -
> -#if RESET_UNDER_STOP_MACHINE
> -/*
> - * XXX An alternative to using stop_machine would be to park only the
> - * processes that have a GGTT mmap. By remote parking the threads (SIGSTOP)
> - * we should be able to prevent their memmory accesses via the lost fence
> - * registers over the course of the reset without the potential recursive
> - * of mutexes between the pagefault handler and reset.
> - *
> - * See igt/gem_mmap_gtt/hang
> - */
> -#define __do_reset(fn, arg) stop_machine(fn, arg, NULL)
> -#else
> -#define __do_reset(fn, arg) fn(arg)
> -#endif
> -
>  static int do_reset(struct drm_i915_private *i915, unsigned int stalled_mask)
>  {
> -	struct __i915_reset arg = { i915, stalled_mask };
>  	int err, i;
>  
> -	err = __do_reset(__i915_reset__BKL, &arg);
> +	/* Flush everyone currently using a resource about to be clobbered */
> +	synchronize_srcu(&i915->gpu_error.srcu);
> +
> +	err = intel_gpu_reset(i915, ALL_ENGINES);
>  	for (i = 0; err && i < RESET_MAX_RETRIES; i++) {
> -		msleep(100);
> -		err = __do_reset(__i915_reset__BKL, &arg);
> +		msleep(10 * (i + 1));
> +		err = intel_gpu_reset(i915, ALL_ENGINES);
>  	}
> +	if (err)
> +		return err;
>  
> -	return err;
> +	return gt_reset(i915, stalled_mask);
>  }
>  
>  /**
> @@ -1274,9 +1272,12 @@ void i915_handle_error(struct drm_i915_private *i915,
>  		wait_event(i915->gpu_error.reset_queue,
>  			   !test_bit(I915_RESET_BACKOFF,
>  				     &i915->gpu_error.flags));
> -		goto out;
> +		goto out; /* piggy-back on the other reset */
>  	}
>  
> +	/* Make sure i915_reset_trylock() sees the I915_RESET_BACKOFF */
> +	synchronize_rcu_expedited();

Is the expedited variant here to minimize the time before the
faulted client can try to reacquire?

> +
>  	/* Prevent any other reset-engine attempt. */
>  	for_each_engine(engine, i915, tmp) {
>  		while (test_and_set_bit(I915_RESET_ENGINE + engine->id,
> @@ -1300,6 +1301,36 @@ void i915_handle_error(struct drm_i915_private *i915,
>  	intel_runtime_pm_put(i915, wakeref);
>  }
>  
> +int i915_reset_trylock(struct drm_i915_private *i915)
> +{
> +	struct i915_gpu_error *error = &i915->gpu_error;
> +	int srcu;
> +
> +	rcu_read_lock();
> +	while (test_bit(I915_RESET_BACKOFF, &error->flags)) {
> +		rcu_read_unlock();
> +
> +		if (wait_event_interruptible(error->reset_queue,
> +					     !test_bit(I915_RESET_BACKOFF,
> +						       &error->flags)))
> +			return -EINTR;
> +
> +		rcu_read_lock();
> +	}
> +	srcu = srcu_read_lock(&error->srcu);

In here, are we piggybacking the grace period from srcu
into the rcu domain by nesting?

The srcu is ours, but the rcu is everyone's. So this
bothers me.

-Mika

> +	rcu_read_unlock();
> +
> +	return srcu;
> +}
> +
> +void i915_reset_unlock(struct drm_i915_private *i915, int tag)
> +__releases(&i915->gpu_error.srcu)
> +{
> +	struct i915_gpu_error *error = &i915->gpu_error;
> +
> +	srcu_read_unlock(&error->srcu, tag);
> +}
> +
>  bool i915_reset_flush(struct drm_i915_private *i915)
>  {
>  	int err;
> diff --git a/drivers/gpu/drm/i915/i915_reset.h b/drivers/gpu/drm/i915/i915_reset.h
> index f2d347f319df..893c5d1c2eb8 100644
> --- a/drivers/gpu/drm/i915/i915_reset.h
> +++ b/drivers/gpu/drm/i915/i915_reset.h
> @@ -9,6 +9,7 @@
>  
>  #include <linux/compiler.h>
>  #include <linux/types.h>
> +#include <linux/srcu.h>
>  
>  struct drm_i915_private;
>  struct intel_engine_cs;
> @@ -32,6 +33,9 @@ int i915_reset_engine(struct intel_engine_cs *engine,
>  void i915_reset_request(struct i915_request *rq, bool guilty);
>  bool i915_reset_flush(struct drm_i915_private *i915);
>  
> +int __must_check i915_reset_trylock(struct drm_i915_private *i915);
> +void i915_reset_unlock(struct drm_i915_private *i915, int tag);
> +
>  bool intel_has_gpu_reset(struct drm_i915_private *i915);
>  bool intel_has_reset_engine(struct drm_i915_private *i915);
>  
> diff --git a/drivers/gpu/drm/i915/selftests/intel_hangcheck.c b/drivers/gpu/drm/i915/selftests/intel_hangcheck.c
> index 7b6f3bea9ef8..4886fac12628 100644
> --- a/drivers/gpu/drm/i915/selftests/intel_hangcheck.c
> +++ b/drivers/gpu/drm/i915/selftests/intel_hangcheck.c
> @@ -1039,8 +1039,6 @@ static int __igt_reset_evict_vma(struct drm_i915_private *i915,
>  
>  	/* Check that we can recover an unbind stuck on a hanging request */
>  
> -	igt_global_reset_lock(i915);
> -
>  	mutex_lock(&i915->drm.struct_mutex);
>  	err = hang_init(&h, i915);
>  	if (err)
> @@ -1138,7 +1136,9 @@ static int __igt_reset_evict_vma(struct drm_i915_private *i915,
>  	}
>  
>  out_reset:
> +	igt_global_reset_lock(i915);
>  	fake_hangcheck(rq->i915, intel_engine_flag(rq->engine));
> +	igt_global_reset_unlock(i915);
>  
>  	if (tsk) {
>  		struct igt_wedge_me w;
> @@ -1159,7 +1159,6 @@ static int __igt_reset_evict_vma(struct drm_i915_private *i915,
>  	hang_fini(&h);
>  unlock:
>  	mutex_unlock(&i915->drm.struct_mutex);
> -	igt_global_reset_unlock(i915);
>  
>  	if (i915_terminally_wedged(&i915->gpu_error))
>  		return -EIO;
> diff --git a/drivers/gpu/drm/i915/selftests/mock_gem_device.c b/drivers/gpu/drm/i915/selftests/mock_gem_device.c
> index 14ae46fda49f..074a0d9cbf26 100644
> --- a/drivers/gpu/drm/i915/selftests/mock_gem_device.c
> +++ b/drivers/gpu/drm/i915/selftests/mock_gem_device.c
> @@ -189,6 +189,7 @@ struct drm_i915_private *mock_gem_device(void)
>  
>  	init_waitqueue_head(&i915->gpu_error.wait_queue);
>  	init_waitqueue_head(&i915->gpu_error.reset_queue);
> +	init_srcu_struct(&i915->gpu_error.srcu);
>  	mutex_init(&i915->gpu_error.wedge_mutex);
>  
>  	i915->wq = alloc_ordered_workqueue("mock", 0);
> -- 
> 2.20.1
>
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 07/12] drm/i915: Revoke mmaps and prevent access to fence registers across reset
  2019-02-04 13:33   ` Mika Kuoppala
@ 2019-02-04 13:47     ` Chris Wilson
  0 siblings, 0 replies; 36+ messages in thread
From: Chris Wilson @ 2019-02-04 13:47 UTC (permalink / raw)
  To: Mika Kuoppala, intel-gfx

Quoting Mika Kuoppala (2019-02-04 13:33:21)
> Chris Wilson <chris@chris-wilson.co.uk> writes:
> > @@ -272,6 +270,8 @@ struct i915_gpu_error {
> >        */
> >       wait_queue_head_t reset_queue;
> >  
> > +     struct srcu_struct srcu;
> > +
> 
> It is the only one in here, so it's not causing confusion,
> but it could have been named reset_backoff_srcu.

worksforme.

> > @@ -1274,9 +1272,12 @@ void i915_handle_error(struct drm_i915_private *i915,
> >               wait_event(i915->gpu_error.reset_queue,
> >                          !test_bit(I915_RESET_BACKOFF,
> >                                    &i915->gpu_error.flags));
> > -             goto out;
> > +             goto out; /* piggy-back on the other reset */
> >       }
> >  
> > +     /* Make sure i915_reset_trylock() sees the I915_RESET_BACKOFF */
> > +     synchronize_rcu_expedited();
> 
> Is the expedited variant here to minimize the time before the
> faulted client can try to reacquire?

Simply to try and cap the amount of time it takes to issue a reset.
Without this we would regularly fail our assertion that userspace can do
a (full) reset in less than 250ms.
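
To make the ordering concrete, the reset side as posted reduces to
roughly the following (a sketch only: wakeups, retries and the
engine-reset serialisation are elided, and 'error' stands for
&i915->gpu_error):

    if (test_and_set_bit(I915_RESET_BACKOFF, &error->flags))
            return; /* piggy-back on the reset already in flight */

    /* cap how long every i915_reset_trylock() takes to see the bit */
    synchronize_rcu_expedited();

    /* later, in do_reset(): flush readers already inside the srcu */
    synchronize_srcu(&error->srcu);
    intel_gpu_reset(i915, ALL_ENGINES);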

> >       /* Prevent any other reset-engine attempt. */
> >       for_each_engine(engine, i915, tmp) {
> >               while (test_and_set_bit(I915_RESET_ENGINE + engine->id,
> > @@ -1300,6 +1301,36 @@ void i915_handle_error(struct drm_i915_private *i915,
> >       intel_runtime_pm_put(i915, wakeref);
> >  }
> >  
> > +int i915_reset_trylock(struct drm_i915_private *i915)
> > +{
> > +     struct i915_gpu_error *error = &i915->gpu_error;
> > +     int srcu;
> > +
> > +     rcu_read_lock();
> > +     while (test_bit(I915_RESET_BACKOFF, &error->flags)) {
> > +             rcu_read_unlock();
> > +
> > +             if (wait_event_interruptible(error->reset_queue,
> > +                                          !test_bit(I915_RESET_BACKOFF,
> > +                                                    &error->flags)))
> > +                     return -EINTR;
> > +
> > +             rcu_read_lock();
> > +     }
> > +     srcu = srcu_read_lock(&error->srcu);
> 
> In here, are we piggybacking the grace period from srcu
> into the rcu domain by nesting?
> 
> The srcu is ours, but the rcu is everyone's. So this
> bothers me.

rcu serialises the update of the I915_RESET_BACKOFF bit. That
coordinates with the reset to say nothing is in use at this moment, and
the reset is not allowed to begin until all rcu reads are complete.

The srcu then coordinates with the actual reset; it is the
mutex/semaphore that prevents the readers and the reset from running at
the same time.

We acquire it under the rcu read lock, while we know that
I915_RESET_BACKOFF is not set and cannot be set. Then, as we release the
rcu read lock and let the reset start, it sets I915_RESET_BACKOFF,
putting the next faulter to sleep, but the reset itself is forced to
sleep on the active srcu readers. As soon as they are all complete,
synchronize_srcu() returns and we can do the reset uncontended.
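
As a schematic of that handshake (time flows downwards; a sketch
condensed from the patch as posted):

    /*
     *  faulting client                 reset path
     *  ---------------                 ----------
     *  rcu_read_lock()
     *  test_bit(BACKOFF) => clear
     *  srcu_read_lock()
     *  rcu_read_unlock()
     *                                  test_and_set_bit(BACKOFF)
     *                                  synchronize_rcu_expedited()
     *                                  synchronize_srcu() ... sleeps
     *  ... touch fence registers ...
     *  srcu_read_unlock()  --------->  synchronize_srcu() returns
     *                                  intel_gpu_reset()
     */
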
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 36+ messages in thread

end of thread, other threads:[~2019-02-04 13:47 UTC | newest]

Thread overview: 36+ messages
2019-02-04  8:41 [PATCH 01/12] drm/i915: Allow normal clients to always preempt idle priority clients Chris Wilson
2019-02-04  8:41 ` [PATCH 02/12] drm/i915/execlists: Suppress mere WAIT preemption Chris Wilson
2019-02-04 10:06   ` Tvrtko Ursulin
2019-02-04 10:18     ` Chris Wilson
2019-02-04 12:08       ` Tvrtko Ursulin
2019-02-04 12:19         ` Chris Wilson
2019-02-04 12:29           ` Tvrtko Ursulin
2019-02-04 10:49   ` [PATCH] " Chris Wilson
2019-02-04  8:41 ` [PATCH 03/12] drm/i915/execlists: Suppress redundant preemption Chris Wilson
2019-02-04 12:05   ` Tvrtko Ursulin
2019-02-04 12:25     ` Chris Wilson
2019-02-04  8:41 ` [PATCH 04/12] drm/i915/selftests: Exercise some AB...BA preemption chains Chris Wilson
2019-02-04  8:41 ` [PATCH 05/12] drm/i915: Trim NEWCLIENT boosting Chris Wilson
2019-02-04 12:11   ` Tvrtko Ursulin
2019-02-04 12:26     ` Chris Wilson
2019-02-04 12:42       ` Tvrtko Ursulin
2019-02-04 12:27     ` Chris Wilson
2019-02-04  8:41 ` [PATCH 06/12] drm/i915: Show support for accurate sw PMU busyness tracking Chris Wilson
2019-02-04 12:14   ` Tvrtko Ursulin
2019-02-04 12:28     ` Chris Wilson
2019-02-04 12:29       ` Chris Wilson
2019-02-04 12:37       ` Tvrtko Ursulin
2019-02-04 12:43         ` Chris Wilson
2019-02-04  8:41 ` [PATCH 07/12] drm/i915: Revoke mmaps and prevent access to fence registers across reset Chris Wilson
2019-02-04 13:33   ` Mika Kuoppala
2019-02-04 13:47     ` Chris Wilson
2019-02-04  8:41 ` [PATCH 08/12] drm/i915: Force the GPU reset upon wedging Chris Wilson
2019-02-04  8:41 ` [PATCH 09/12] drm/i915: Uninterruptibly drain the timelines on unwedging Chris Wilson
2019-02-04  8:41 ` [PATCH 10/12] drm/i915: Wait for old resets before applying debugfs/i915_wedged Chris Wilson
2019-02-04  8:41 ` [PATCH 11/12] drm/i915: Serialise resets with wedging Chris Wilson
2019-02-04  8:41 ` [PATCH 12/12] drm/i915: Don't claim an unstarted request was guilty Chris Wilson
2019-02-04  9:20 ` [PATCH 01/12] drm/i915: Allow normal clients to always preempt idle priority clients Tvrtko Ursulin
2019-02-04 10:19 ` ✗ Fi.CI.CHECKPATCH: warning for series starting with [01/12] " Patchwork
2019-02-04 10:23 ` ✗ Fi.CI.SPARSE: " Patchwork
2019-02-04 10:48 ` ✓ Fi.CI.BAT: success " Patchwork
2019-02-04 11:27 ` ✗ Fi.CI.BAT: failure for series starting with [01/12] drm/i915: Allow normal clients to always preempt idle priority clients (rev2) Patchwork
