* [Intel-gfx] [PATCH 01/41] drm/i915/selftests: Check for engine-reset errors in the middle of workarounds
From: Chris Wilson @ 2021-01-25 14:00 UTC (permalink / raw)
  To: intel-gfx; +Cc: thomas.hellstrom, Chris Wilson

As we reset the engine between checks that the workarounds remain intact,
report any engine reset failure instead of ignoring the return value.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/gt/selftest_workarounds.c | 16 +++++++++++++---
 1 file changed, 13 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/selftest_workarounds.c b/drivers/gpu/drm/i915/gt/selftest_workarounds.c
index 37ea46907a7d..af33a720dbf8 100644
--- a/drivers/gpu/drm/i915/gt/selftest_workarounds.c
+++ b/drivers/gpu/drm/i915/gt/selftest_workarounds.c
@@ -1219,7 +1219,11 @@ live_engine_reset_workarounds(void *arg)
 			goto err;
 		}
 
-		intel_engine_reset(engine, "live_workarounds:idle");
+		ret = intel_engine_reset(engine, "live_workarounds:idle");
+		if (ret) {
+			pr_err("%s: Reset failed while idle\n", engine->name);
+			goto err;
+		}
 
 		ok = verify_wa_lists(gt, &lists, "after idle reset");
 		if (!ok) {
@@ -1240,12 +1244,18 @@ live_engine_reset_workarounds(void *arg)
 
 		ret = request_add_spin(rq, &spin);
 		if (ret) {
-			pr_err("Spinner failed to start\n");
+			pr_err("%s: Spinner failed to start\n", engine->name);
 			igt_spinner_fini(&spin);
 			goto err;
 		}
 
-		intel_engine_reset(engine, "live_workarounds:active");
+		ret = intel_engine_reset(engine, "live_workarounds:active");
+		if (ret) {
+			pr_err("%s: Reset failed on an active spinner\n",
+			       engine->name);
+			igt_spinner_fini(&spin);
+			goto err;
+		}
 
 		igt_spinner_end(&spin);
 		igt_spinner_fini(&spin);
-- 
2.20.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


* [Intel-gfx] [PATCH 02/41] drm/i915/gt: Move the defer_request waiter active assertion
From: Chris Wilson @ 2021-01-25 14:00 UTC (permalink / raw)
  To: intel-gfx; +Cc: thomas.hellstrom, Chris Wilson

In defer_request() we start with the request we just unsubmitted (that
should be the active request on the gpu) and then defer all of its
waiters. No waiter should be ahead of the active request, so none should
be marked as active. That assert failed.

Of particular note, this machine was undergoing persistent GPU resets due
to underlying HW issues, so that may be a clue. A request is also marked
as active when it is retired, regardless of current queue status, and so
this assertion failure may be a result of the queue being completed by
the reset and then subsequently processed by the tasklet.

We can filter out retired requests here by doing the assertion check
after the is-ready check (active is a subset of being ready).

Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/2978
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
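Note for readers: the reordering described above can be modelled with a
small standalone sketch. Everything in it (struct toy_waiter and the two
helper functions) is hypothetical and is not driver code; it only assumes,
as the commit message states, that a retired request is flagged active
while no longer being ready. Instead of asserting, the helpers count how
many waiters would trip the assertion under each ordering.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy stand-in for the waiter state seen by defer_request(). */
struct toy_waiter {
	bool ready;	/* still queued and eligible to be deferred */
	bool active;	/* flagged active; also set once retired */
};

/* Old ordering: test "active" before filtering on "ready". */
size_t early_check_failures(const struct toy_waiter *w, size_t n)
{
	size_t i, failures = 0;

	for (i = 0; i < n; i++)
		if (w[i].active)
			failures++;

	return failures;
}

/* New ordering: skip waiters that are not ready, then test "active". */
size_t late_check_failures(const struct toy_waiter *w, size_t n)
{
	size_t i, failures = 0;

	for (i = 0; i < n; i++) {
		if (!w[i].ready)
			continue; /* retired waiters are filtered out here */
		if (w[i].active)
			failures++;
	}

	return failures;
}
```

A retired waiter (active but not ready) is counted by the early check and
ignored by the late one, while a waiter that is both ready and active is
still caught by either ordering.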
 drivers/gpu/drm/i915/gt/intel_execlists_submission.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
index 24731be6e462..56e36d938851 100644
--- a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
+++ b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
@@ -1061,7 +1061,6 @@ static void defer_request(struct i915_request *rq, struct list_head * const pl)
 				   __i915_request_has_started(w) &&
 				   !__i915_request_is_complete(rq));
 
-			GEM_BUG_ON(i915_request_is_active(w));
 			if (!i915_request_is_ready(w))
 				continue;
 
@@ -1069,6 +1068,7 @@ static void defer_request(struct i915_request *rq, struct list_head * const pl)
 				continue;
 
 			GEM_BUG_ON(rq_prio(w) > rq_prio(rq));
+			GEM_BUG_ON(i915_request_is_active(w));
 			list_move_tail(&w->sched.link, &list);
 		}
 
-- 
2.20.1


* [Intel-gfx] [PATCH 03/41] drm/i915: Replace engine->schedule() with a known request operation
From: Chris Wilson @ 2021-01-25 14:00 UTC (permalink / raw)
  To: intel-gfx; +Cc: thomas.hellstrom, Chris Wilson

Looking to the future, we want to set the scheduling attributes
explicitly and so replace the generic engine->schedule() with the more
direct i915_request_set_priority().

What it loses in removing the 'schedule' name from the function, it
gains in having an explicit entry point with a stated goal.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
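Note for readers: the shape of the new entry point can be illustrated
with a standalone sketch. The toy_* names and types below are hypothetical
stand-ins, not the driver's definitions. The sketch assumes only what the
diff shows: a capability flag replaces the NULL test on the function
pointer, callers pass a plain integer priority, and the requested value is
combined with the current one via max() so that a priority is never
lowered. The dependency walk done by the real scheduler is left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag bit, mirroring the role of I915_ENGINE_HAS_SCHEDULER. */
#define TOY_ENGINE_HAS_SCHEDULER (1u << 2)

struct toy_engine {
	unsigned int flags;
};

struct toy_rq {
	const struct toy_engine *engine;
	int priority;
};

/* Capability query that replaces testing a function pointer for NULL. */
bool toy_engine_has_scheduler(const struct toy_engine *engine)
{
	return engine->flags & TOY_ENGINE_HAS_SCHEDULER;
}

/* Single entry point: a no-op without a scheduler, otherwise only raise. */
void toy_request_set_priority(struct toy_rq *rq, int prio)
{
	if (!toy_engine_has_scheduler(rq->engine))
		return;

	if (prio > rq->priority)
		rq->priority = prio;
}
```

Folding the capability check into the callee is what lets callers such as
fence_set_priority() drop their own engine->schedule tests.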
 drivers/gpu/drm/i915/display/intel_display.c  |  5 ++-
 drivers/gpu/drm/i915/gem/i915_gem_object.h    |  5 ++-
 drivers/gpu/drm/i915/gem/i915_gem_wait.c      | 29 +++++-----------
 drivers/gpu/drm/i915/gt/intel_engine_cs.c     |  3 --
 .../gpu/drm/i915/gt/intel_engine_heartbeat.c  |  4 +--
 drivers/gpu/drm/i915/gt/intel_engine_types.h  | 29 ++++++++--------
 drivers/gpu/drm/i915/gt/intel_engine_user.c   |  2 +-
 .../drm/i915/gt/intel_execlists_submission.c  |  3 +-
 drivers/gpu/drm/i915/gt/selftest_execlists.c  | 33 +++++--------------
 drivers/gpu/drm/i915/gt/selftest_hangcheck.c  | 11 +++----
 .../gpu/drm/i915/gt/uc/intel_guc_submission.c |  1 -
 drivers/gpu/drm/i915/i915_request.c           | 10 +++---
 drivers/gpu/drm/i915/i915_scheduler.c         | 15 +++++----
 drivers/gpu/drm/i915/i915_scheduler.h         |  3 +-
 14 files changed, 59 insertions(+), 94 deletions(-)

diff --git a/drivers/gpu/drm/i915/display/intel_display.c b/drivers/gpu/drm/i915/display/intel_display.c
index 7ec7d94b8cdb..2e80babd1f66 100644
--- a/drivers/gpu/drm/i915/display/intel_display.c
+++ b/drivers/gpu/drm/i915/display/intel_display.c
@@ -13632,7 +13632,6 @@ int
 intel_prepare_plane_fb(struct drm_plane *_plane,
 		       struct drm_plane_state *_new_plane_state)
 {
-	struct i915_sched_attr attr = { .priority = I915_PRIORITY_DISPLAY };
 	struct intel_plane *plane = to_intel_plane(_plane);
 	struct intel_plane_state *new_plane_state =
 		to_intel_plane_state(_new_plane_state);
@@ -13673,7 +13672,7 @@ intel_prepare_plane_fb(struct drm_plane *_plane,
 
 	if (new_plane_state->uapi.fence) { /* explicit fencing */
 		i915_gem_fence_wait_priority(new_plane_state->uapi.fence,
-					     &attr);
+					     I915_PRIORITY_DISPLAY);
 		ret = i915_sw_fence_await_dma_fence(&state->commit_ready,
 						    new_plane_state->uapi.fence,
 						    i915_fence_timeout(dev_priv),
@@ -13695,7 +13694,7 @@ intel_prepare_plane_fb(struct drm_plane *_plane,
 	if (ret)
 		return ret;
 
-	i915_gem_object_wait_priority(obj, 0, &attr);
+	i915_gem_object_wait_priority(obj, 0, I915_PRIORITY_DISPLAY);
 	i915_gem_object_flush_frontbuffer(obj, ORIGIN_DIRTYFB);
 
 	if (!new_plane_state->uapi.fence) { /* implicit fencing */
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.h b/drivers/gpu/drm/i915/gem/i915_gem_object.h
index 3411ad197fa6..325766abca21 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object.h
@@ -549,15 +549,14 @@ static inline void __start_cpu_write(struct drm_i915_gem_object *obj)
 		obj->cache_dirty = true;
 }
 
-void i915_gem_fence_wait_priority(struct dma_fence *fence,
-				  const struct i915_sched_attr *attr);
+void i915_gem_fence_wait_priority(struct dma_fence *fence, int prio);
 
 int i915_gem_object_wait(struct drm_i915_gem_object *obj,
 			 unsigned int flags,
 			 long timeout);
 int i915_gem_object_wait_priority(struct drm_i915_gem_object *obj,
 				  unsigned int flags,
-				  const struct i915_sched_attr *attr);
+				  int prio);
 
 void __i915_gem_object_flush_frontbuffer(struct drm_i915_gem_object *obj,
 					 enum fb_op_origin origin);
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_wait.c b/drivers/gpu/drm/i915/gem/i915_gem_wait.c
index 4b9856d5ba14..d79bf16083bd 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_wait.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_wait.c
@@ -91,22 +91,12 @@ i915_gem_object_wait_reservation(struct dma_resv *resv,
 	return timeout;
 }
 
-static void fence_set_priority(struct dma_fence *fence,
-			       const struct i915_sched_attr *attr)
+static void fence_set_priority(struct dma_fence *fence, int prio)
 {
-	struct i915_request *rq;
-	struct intel_engine_cs *engine;
-
 	if (dma_fence_is_signaled(fence) || !dma_fence_is_i915(fence))
 		return;
 
-	rq = to_request(fence);
-	engine = rq->engine;
-
-	rcu_read_lock(); /* RCU serialisation for set-wedged protection */
-	if (engine->schedule)
-		engine->schedule(rq, attr);
-	rcu_read_unlock();
+	i915_request_set_priority(to_request(fence), prio);
 }
 
 static inline bool __dma_fence_is_chain(const struct dma_fence *fence)
@@ -114,8 +104,7 @@ static inline bool __dma_fence_is_chain(const struct dma_fence *fence)
 	return fence->ops == &dma_fence_chain_ops;
 }
 
-void i915_gem_fence_wait_priority(struct dma_fence *fence,
-				  const struct i915_sched_attr *attr)
+void i915_gem_fence_wait_priority(struct dma_fence *fence, int prio)
 {
 	if (dma_fence_is_signaled(fence))
 		return;
@@ -128,19 +117,19 @@ void i915_gem_fence_wait_priority(struct dma_fence *fence,
 		int i;
 
 		for (i = 0; i < array->num_fences; i++)
-			fence_set_priority(array->fences[i], attr);
+			fence_set_priority(array->fences[i], prio);
 	} else if (__dma_fence_is_chain(fence)) {
 		struct dma_fence *iter;
 
 		/* The chain is ordered; if we boost the last, we boost all */
 		dma_fence_chain_for_each(iter, fence) {
 			fence_set_priority(to_dma_fence_chain(iter)->fence,
-					   attr);
+					   prio);
 			break;
 		}
 		dma_fence_put(iter);
 	} else {
-		fence_set_priority(fence, attr);
+		fence_set_priority(fence, prio);
 	}
 
 	local_bh_enable(); /* kick the tasklets if queues were reprioritised */
@@ -149,7 +138,7 @@ void i915_gem_fence_wait_priority(struct dma_fence *fence,
 int
 i915_gem_object_wait_priority(struct drm_i915_gem_object *obj,
 			      unsigned int flags,
-			      const struct i915_sched_attr *attr)
+			      int prio)
 {
 	struct dma_fence *excl;
 
@@ -164,7 +153,7 @@ i915_gem_object_wait_priority(struct drm_i915_gem_object *obj,
 			return ret;
 
 		for (i = 0; i < count; i++) {
-			i915_gem_fence_wait_priority(shared[i], attr);
+			i915_gem_fence_wait_priority(shared[i], prio);
 			dma_fence_put(shared[i]);
 		}
 
@@ -174,7 +163,7 @@ i915_gem_object_wait_priority(struct drm_i915_gem_object *obj,
 	}
 
 	if (excl) {
-		i915_gem_fence_wait_priority(excl, attr);
+		i915_gem_fence_wait_priority(excl, prio);
 		dma_fence_put(excl);
 	}
 	return 0;
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
index ac9e020dbc9e..7e580d3ac58f 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
@@ -319,9 +319,6 @@ static int intel_engine_setup(struct intel_gt *gt, enum intel_engine_id id)
 	if (engine->context_size)
 		DRIVER_CAPS(i915)->has_logical_contexts = true;
 
-	/* Nothing to do here, execute in order of dependencies */
-	engine->schedule = NULL;
-
 	ewma__engine_latency_init(&engine->latency);
 	seqcount_init(&engine->stats.lock);
 
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c b/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c
index 778bcae5ef2c..0b026cde9f09 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c
@@ -114,7 +114,7 @@ static void heartbeat(struct work_struct *wrk)
 			 * but all other contexts, including the kernel
 			 * context are stuck waiting for the signal.
 			 */
-		} else if (engine->schedule &&
+		} else if (intel_engine_has_scheduler(engine) &&
 			   rq->sched.attr.priority < I915_PRIORITY_BARRIER) {
 			/*
 			 * Gradually raise the priority of the heartbeat to
@@ -129,7 +129,7 @@ static void heartbeat(struct work_struct *wrk)
 				attr.priority = I915_PRIORITY_BARRIER;
 
 			local_bh_disable();
-			engine->schedule(rq, &attr);
+			i915_request_set_priority(rq, attr.priority);
 			local_bh_enable();
 		} else {
 			if (IS_ENABLED(CONFIG_DRM_I915_DEBUG_GEM))
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_types.h b/drivers/gpu/drm/i915/gt/intel_engine_types.h
index 883bafc44902..27cb3dc0233b 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_types.h
+++ b/drivers/gpu/drm/i915/gt/intel_engine_types.h
@@ -453,14 +453,6 @@ struct intel_engine_cs {
 	void            (*bond_execute)(struct i915_request *rq,
 					struct dma_fence *signal);
 
-	/*
-	 * Call when the priority on a request has changed and it and its
-	 * dependencies may need rescheduling. Note the request itself may
-	 * not be ready to run!
-	 */
-	void		(*schedule)(struct i915_request *request,
-				    const struct i915_sched_attr *attr);
-
 	void		(*release)(struct intel_engine_cs *engine);
 
 	struct intel_engine_execlists execlists;
@@ -478,13 +470,14 @@ struct intel_engine_cs {
 
 #define I915_ENGINE_USING_CMD_PARSER BIT(0)
 #define I915_ENGINE_SUPPORTS_STATS   BIT(1)
-#define I915_ENGINE_HAS_PREEMPTION   BIT(2)
-#define I915_ENGINE_HAS_SEMAPHORES   BIT(3)
-#define I915_ENGINE_HAS_TIMESLICES   BIT(4)
-#define I915_ENGINE_NEEDS_BREADCRUMB_TASKLET BIT(5)
-#define I915_ENGINE_IS_VIRTUAL       BIT(6)
-#define I915_ENGINE_HAS_RELATIVE_MMIO BIT(7)
-#define I915_ENGINE_REQUIRES_CMD_PARSER BIT(8)
+#define I915_ENGINE_HAS_SCHEDULER    BIT(2)
+#define I915_ENGINE_HAS_PREEMPTION   BIT(3)
+#define I915_ENGINE_HAS_SEMAPHORES   BIT(4)
+#define I915_ENGINE_HAS_TIMESLICES   BIT(5)
+#define I915_ENGINE_NEEDS_BREADCRUMB_TASKLET BIT(6)
+#define I915_ENGINE_IS_VIRTUAL       BIT(7)
+#define I915_ENGINE_HAS_RELATIVE_MMIO BIT(8)
+#define I915_ENGINE_REQUIRES_CMD_PARSER BIT(9)
 	unsigned int flags;
 
 	/*
@@ -572,6 +565,12 @@ intel_engine_supports_stats(const struct intel_engine_cs *engine)
 	return engine->flags & I915_ENGINE_SUPPORTS_STATS;
 }
 
+static inline bool
+intel_engine_has_scheduler(const struct intel_engine_cs *engine)
+{
+	return engine->flags & I915_ENGINE_HAS_SCHEDULER;
+}
+
 static inline bool
 intel_engine_has_preemption(const struct intel_engine_cs *engine)
 {
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_user.c b/drivers/gpu/drm/i915/gt/intel_engine_user.c
index 1cbd84eb24e4..64eccdf32a22 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_user.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_user.c
@@ -107,7 +107,7 @@ static void set_scheduler_caps(struct drm_i915_private *i915)
 	for_each_uabi_engine(engine, i915) { /* all engines must agree! */
 		int i;
 
-		if (engine->schedule)
+		if (intel_engine_has_scheduler(engine))
 			enabled |= (I915_SCHEDULER_CAP_ENABLED |
 				    I915_SCHEDULER_CAP_PRIORITY);
 		else
diff --git a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
index 56e36d938851..309fb421ff5c 100644
--- a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
+++ b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
@@ -3073,7 +3073,6 @@ static bool can_preempt(struct intel_engine_cs *engine)
 static void execlists_set_default_submission(struct intel_engine_cs *engine)
 {
 	engine->submit_request = execlists_submit_request;
-	engine->schedule = i915_schedule;
 	engine->execlists.tasklet.func = execlists_submission_tasklet;
 
 	engine->reset.prepare = execlists_reset_prepare;
@@ -3084,6 +3083,7 @@ static void execlists_set_default_submission(struct intel_engine_cs *engine)
 	engine->park = execlists_park;
 	engine->unpark = NULL;
 
+	engine->flags |= I915_ENGINE_HAS_SCHEDULER;
 	engine->flags |= I915_ENGINE_SUPPORTS_STATS;
 	if (!intel_vgpu_active(engine->i915)) {
 		engine->flags |= I915_ENGINE_HAS_SEMAPHORES;
@@ -3646,7 +3646,6 @@ intel_execlists_create_virtual(struct intel_engine_cs **siblings,
 	ve->base.cops = &virtual_context_ops;
 	ve->base.request_alloc = execlists_request_alloc;
 
-	ve->base.schedule = i915_schedule;
 	ve->base.submit_request = virtual_submit_request;
 	ve->base.bond_execute = virtual_bond_execute;
 
diff --git a/drivers/gpu/drm/i915/gt/selftest_execlists.c b/drivers/gpu/drm/i915/gt/selftest_execlists.c
index 6cfa9a89d891..acb7c089d05b 100644
--- a/drivers/gpu/drm/i915/gt/selftest_execlists.c
+++ b/drivers/gpu/drm/i915/gt/selftest_execlists.c
@@ -268,12 +268,8 @@ static int live_unlite_restore(struct intel_gt *gt, int prio)
 		i915_request_put(rq[0]);
 
 		if (prio) {
-			struct i915_sched_attr attr = {
-				.priority = prio,
-			};
-
 			/* Alternatively preempt the spinner with ce[1] */
-			engine->schedule(rq[1], &attr);
+			i915_request_set_priority(rq[1], prio);
 		}
 
 		/* And switch back to ce[0] for good measure */
@@ -873,9 +869,6 @@ release_queue(struct intel_engine_cs *engine,
 	      struct i915_vma *vma,
 	      int idx, int prio)
 {
-	struct i915_sched_attr attr = {
-		.priority = prio,
-	};
 	struct i915_request *rq;
 	u32 *cs;
 
@@ -900,7 +893,7 @@ release_queue(struct intel_engine_cs *engine,
 	i915_request_add(rq);
 
 	local_bh_disable();
-	engine->schedule(rq, &attr);
+	i915_request_set_priority(rq, prio);
 	local_bh_enable(); /* kick tasklet */
 
 	i915_request_put(rq);
@@ -1310,7 +1303,6 @@ static int live_timeslice_queue(void *arg)
 		goto err_pin;
 
 	for_each_engine(engine, gt, id) {
-		struct i915_sched_attr attr = { .priority = I915_PRIORITY_MAX };
 		struct i915_request *rq, *nop;
 
 		if (!intel_engine_has_preemption(engine))
@@ -1325,7 +1317,7 @@ static int live_timeslice_queue(void *arg)
 			err = PTR_ERR(rq);
 			goto err_heartbeat;
 		}
-		engine->schedule(rq, &attr);
+		i915_request_set_priority(rq, I915_PRIORITY_MAX);
 		err = wait_for_submit(engine, rq, HZ / 2);
 		if (err) {
 			pr_err("%s: Timed out trying to submit semaphores\n",
@@ -1806,7 +1798,6 @@ static int live_late_preempt(void *arg)
 	struct i915_gem_context *ctx_hi, *ctx_lo;
 	struct igt_spinner spin_hi, spin_lo;
 	struct intel_engine_cs *engine;
-	struct i915_sched_attr attr = {};
 	enum intel_engine_id id;
 	int err = -ENOMEM;
 
@@ -1866,8 +1857,7 @@ static int live_late_preempt(void *arg)
 			goto err_wedged;
 		}
 
-		attr.priority = I915_PRIORITY_MAX;
-		engine->schedule(rq, &attr);
+		i915_request_set_priority(rq, I915_PRIORITY_MAX);
 
 		if (!igt_wait_for_spinner(&spin_hi, rq)) {
 			pr_err("High priority context failed to preempt the low priority context\n");
@@ -2412,7 +2402,6 @@ static int live_preempt_cancel(void *arg)
 
 static int live_suppress_self_preempt(void *arg)
 {
-	struct i915_sched_attr attr = { .priority = I915_PRIORITY_MAX };
 	struct intel_gt *gt = arg;
 	struct intel_engine_cs *engine;
 	struct preempt_client a, b;
@@ -2480,7 +2469,7 @@ static int live_suppress_self_preempt(void *arg)
 			i915_request_add(rq_b);
 
 			GEM_BUG_ON(i915_request_completed(rq_a));
-			engine->schedule(rq_a, &attr);
+			i915_request_set_priority(rq_a, I915_PRIORITY_MAX);
 			igt_spinner_end(&a.spin);
 
 			if (!igt_wait_for_spinner(&b.spin, rq_b)) {
@@ -2545,7 +2534,6 @@ static int live_chain_preempt(void *arg)
 		goto err_client_hi;
 
 	for_each_engine(engine, gt, id) {
-		struct i915_sched_attr attr = { .priority = I915_PRIORITY_MAX };
 		struct igt_live_test t;
 		struct i915_request *rq;
 		int ring_size, count, i;
@@ -2612,7 +2600,7 @@ static int live_chain_preempt(void *arg)
 
 			i915_request_get(rq);
 			i915_request_add(rq);
-			engine->schedule(rq, &attr);
+			i915_request_set_priority(rq, I915_PRIORITY_MAX);
 
 			igt_spinner_end(&hi.spin);
 			if (i915_request_wait(rq, 0, HZ / 5) < 0) {
@@ -2964,14 +2952,12 @@ static int live_preempt_gang(void *arg)
 			return -EIO;
 
 		do {
-			struct i915_sched_attr attr = { .priority = prio++ };
-
 			err = create_gang(engine, &rq);
 			if (err)
 				break;
 
 			/* Submit each spinner at increasing priority */
-			engine->schedule(rq, &attr);
+			i915_request_set_priority(rq, prio++);
 		} while (prio <= I915_PRIORITY_MAX &&
 			 !__igt_timeout(end_time, NULL));
 		pr_debug("%s: Preempt chain of %d requests\n",
@@ -3192,9 +3178,6 @@ static int preempt_user(struct intel_engine_cs *engine,
 			struct i915_vma *global,
 			int id)
 {
-	struct i915_sched_attr attr = {
-		.priority = I915_PRIORITY_MAX
-	};
 	struct i915_request *rq;
 	int err = 0;
 	u32 *cs;
@@ -3219,7 +3202,7 @@ static int preempt_user(struct intel_engine_cs *engine,
 	i915_request_get(rq);
 	i915_request_add(rq);
 
-	engine->schedule(rq, &attr);
+	i915_request_set_priority(rq, I915_PRIORITY_MAX);
 
 	if (i915_request_wait(rq, 0, HZ / 2) < 0)
 		err = -ETIME;
diff --git a/drivers/gpu/drm/i915/gt/selftest_hangcheck.c b/drivers/gpu/drm/i915/gt/selftest_hangcheck.c
index d6ce4075602c..8cad102922e7 100644
--- a/drivers/gpu/drm/i915/gt/selftest_hangcheck.c
+++ b/drivers/gpu/drm/i915/gt/selftest_hangcheck.c
@@ -858,12 +858,11 @@ static int active_engine(void *data)
 		rq[idx] = i915_request_get(new);
 		i915_request_add(new);
 
-		if (engine->schedule && arg->flags & TEST_PRIORITY) {
-			struct i915_sched_attr attr = {
-				.priority =
-					i915_prandom_u32_max_state(512, &prng),
-			};
-			engine->schedule(rq[idx], &attr);
+		if (intel_engine_has_scheduler(engine) &&
+		    arg->flags & TEST_PRIORITY) {
+			int prio = i915_prandom_u32_max_state(512, &prng);
+
+			i915_request_set_priority(rq[idx], prio);
 		}
 
 		err = active_request_put(old);
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
index 3124d8794d87..53cf68e240c3 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
@@ -607,7 +607,6 @@ static int guc_resume(struct intel_engine_cs *engine)
 static void guc_set_default_submission(struct intel_engine_cs *engine)
 {
 	engine->submit_request = guc_submit_request;
-	engine->schedule = i915_schedule;
 	engine->execlists.tasklet.func = guc_submission_tasklet;
 
 	engine->reset.prepare = guc_reset_prepare;
diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
index 22e39d938f17..abda565dfe62 100644
--- a/drivers/gpu/drm/i915/i915_request.c
+++ b/drivers/gpu/drm/i915/i915_request.c
@@ -1218,7 +1218,7 @@ __i915_request_await_execution(struct i915_request *to,
 	}
 
 	/* Couple the dependency tree for PI on this exposed to->fence */
-	if (to->engine->schedule) {
+	if (intel_engine_has_scheduler(to->engine)) {
 		err = i915_sched_node_add_dependency(&to->sched,
 						     &from->sched,
 						     I915_DEPENDENCY_WEAK);
@@ -1359,7 +1359,7 @@ i915_request_await_request(struct i915_request *to, struct i915_request *from)
 		return 0;
 	}
 
-	if (to->engine->schedule) {
+	if (intel_engine_has_scheduler(to->engine)) {
 		ret = i915_sched_node_add_dependency(&to->sched,
 						     &from->sched,
 						     I915_DEPENDENCY_EXTERNAL);
@@ -1546,7 +1546,7 @@ __i915_request_add_to_timeline(struct i915_request *rq)
 			__i915_sw_fence_await_dma_fence(&rq->submit,
 							&prev->fence,
 							&rq->dmaq);
-		if (rq->engine->schedule)
+		if (intel_engine_has_scheduler(rq->engine))
 			__i915_sched_node_add_dependency(&rq->sched,
 							 &prev->sched,
 							 &rq->dep,
@@ -1618,8 +1618,8 @@ void __i915_request_queue(struct i915_request *rq,
 	 * decide whether to preempt the entire chain so that it is ready to
 	 * run at the earliest possible convenience.
 	 */
-	if (attr && rq->engine->schedule)
-		rq->engine->schedule(rq, attr);
+	if (attr)
+		i915_request_set_priority(rq, attr->priority);
 
 	local_bh_disable();
 	__i915_request_queue_bh(rq);
diff --git a/drivers/gpu/drm/i915/i915_scheduler.c b/drivers/gpu/drm/i915/i915_scheduler.c
index efa638c3acc7..dbdd4128f13d 100644
--- a/drivers/gpu/drm/i915/i915_scheduler.c
+++ b/drivers/gpu/drm/i915/i915_scheduler.c
@@ -216,10 +216,8 @@ static void kick_submission(struct intel_engine_cs *engine,
 	rcu_read_unlock();
 }
 
-static void __i915_schedule(struct i915_sched_node *node,
-			    const struct i915_sched_attr *attr)
+static void __i915_schedule(struct i915_sched_node *node, int prio)
 {
-	const int prio = max(attr->priority, node->attr.priority);
 	struct intel_engine_cs *engine;
 	struct i915_dependency *dep, *p;
 	struct i915_dependency stack;
@@ -233,6 +231,8 @@ static void __i915_schedule(struct i915_sched_node *node,
 	if (node_signaled(node))
 		return;
 
+	prio = max(prio, node->attr.priority);
+
 	stack.signaler = node;
 	list_add(&stack.dfs_link, &dfs);
 
@@ -286,7 +286,7 @@ static void __i915_schedule(struct i915_sched_node *node,
 	 */
 	if (node->attr.priority == I915_PRIORITY_INVALID) {
 		GEM_BUG_ON(!list_empty(&node->link));
-		node->attr = *attr;
+		node->attr.priority = prio;
 
 		if (stack.dfs_link.next == stack.dfs_link.prev)
 			return;
@@ -341,10 +341,13 @@ static void __i915_schedule(struct i915_sched_node *node,
 	spin_unlock(&engine->active.lock);
 }
 
-void i915_schedule(struct i915_request *rq, const struct i915_sched_attr *attr)
+void i915_request_set_priority(struct i915_request *rq, int prio)
 {
+	if (!intel_engine_has_scheduler(rq->engine))
+		return;
+
 	spin_lock_irq(&schedule_lock);
-	__i915_schedule(&rq->sched, attr);
+	__i915_schedule(&rq->sched, prio);
 	spin_unlock_irq(&schedule_lock);
 }
 
diff --git a/drivers/gpu/drm/i915/i915_scheduler.h b/drivers/gpu/drm/i915/i915_scheduler.h
index 858a0938f47a..ccee506c9a26 100644
--- a/drivers/gpu/drm/i915/i915_scheduler.h
+++ b/drivers/gpu/drm/i915/i915_scheduler.h
@@ -35,8 +35,7 @@ int i915_sched_node_add_dependency(struct i915_sched_node *node,
 
 void i915_sched_node_fini(struct i915_sched_node *node);
 
-void i915_schedule(struct i915_request *request,
-		   const struct i915_sched_attr *attr);
+void i915_request_set_priority(struct i915_request *request, int prio);
 
 struct list_head *
 i915_sched_lookup_priolist(struct intel_engine_cs *engine, int prio);
-- 
2.20.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 90+ messages in thread

* [Intel-gfx] [PATCH 04/41] drm/i915: Teach the i915_dependency to use a double-lock
From: Chris Wilson @ 2021-01-25 14:00 UTC (permalink / raw)
  To: intel-gfx; +Cc: thomas.hellstrom, Chris Wilson

Currently, we construct and tear down the i915_dependency chains using a
global spinlock. As the lists are entirely local, it should be possible
to use a double-lock with an explicit nesting [signaler -> waiter,
always] and so avoid the costly convenience of a global spinlock.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
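Note for readers: the signaler -> waiter nesting can be sketched in
userspace with pthread mutexes standing in for the per-node spinlocks.
All toy_* names are hypothetical, and the list manipulation is reduced to
counters. The sketch assumes that dependencies never form a cycle, which
is what makes a fixed outer/inner order safe; it does not model the RCU
publication or the lockdep nesting annotation used by the patch.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical toy node: one lock per node instead of one global lock. */
struct toy_node {
	pthread_mutex_t lock;
	int nr_signalers;	/* nodes this node waits upon */
	int nr_waiters;		/* nodes waiting upon this node */
	int signaled;		/* set once the node has completed */
};

void toy_node_init(struct toy_node *node)
{
	pthread_mutex_init(&node->lock, NULL);
	node->nr_signalers = 0;
	node->nr_waiters = 0;
	node->signaled = 0;
}

/* Returns 1 if the dependency was recorded, 0 if the signaler is done. */
int toy_add_dependency(struct toy_node *waiter, struct toy_node *signaler)
{
	int added = 0;

	/* The signaler's lock is always the outer lock. */
	pthread_mutex_lock(&signaler->lock);
	if (!signaler->signaled) {
		/* The waiter's lock is always the inner lock. */
		pthread_mutex_lock(&waiter->lock);
		waiter->nr_signalers++;
		signaler->nr_waiters++;
		pthread_mutex_unlock(&waiter->lock);
		added = 1;
	}
	pthread_mutex_unlock(&signaler->lock);

	return added;
}
```

Because every path takes the signaler's lock before the waiter's, two
threads linking unrelated pairs of nodes no longer contend on a shared
lock.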
 drivers/gpu/drm/i915/i915_request.c         |  2 +-
 drivers/gpu/drm/i915/i915_scheduler.c       | 65 +++++++++++++--------
 drivers/gpu/drm/i915/i915_scheduler.h       |  2 +-
 drivers/gpu/drm/i915/i915_scheduler_types.h |  2 +
 4 files changed, 46 insertions(+), 25 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
index abda565dfe62..df2ab39b394d 100644
--- a/drivers/gpu/drm/i915/i915_request.c
+++ b/drivers/gpu/drm/i915/i915_request.c
@@ -330,7 +330,7 @@ bool i915_request_retire(struct i915_request *rq)
 	intel_context_unpin(rq->context);
 
 	free_capture_list(rq);
-	i915_sched_node_fini(&rq->sched);
+	i915_sched_node_retire(&rq->sched);
 	i915_request_put(rq);
 
 	return true;
diff --git a/drivers/gpu/drm/i915/i915_scheduler.c b/drivers/gpu/drm/i915/i915_scheduler.c
index dbdd4128f13d..96fe1e22dad7 100644
--- a/drivers/gpu/drm/i915/i915_scheduler.c
+++ b/drivers/gpu/drm/i915/i915_scheduler.c
@@ -19,6 +19,17 @@ static struct i915_global_scheduler {
 
 static DEFINE_SPINLOCK(schedule_lock);
 
+static struct i915_sched_node *node_get(struct i915_sched_node *node)
+{
+	i915_request_get(container_of(node, struct i915_request, sched));
+	return node;
+}
+
+static void node_put(struct i915_sched_node *node)
+{
+	i915_request_put(container_of(node, struct i915_request, sched));
+}
+
 static const struct i915_request *
 node_to_request(const struct i915_sched_node *node)
 {
@@ -353,6 +364,8 @@ void i915_request_set_priority(struct i915_request *rq, int prio)
 
 void i915_sched_node_init(struct i915_sched_node *node)
 {
+	spin_lock_init(&node->lock);
+
 	INIT_LIST_HEAD(&node->signalers_list);
 	INIT_LIST_HEAD(&node->waiters_list);
 	INIT_LIST_HEAD(&node->link);
@@ -377,10 +390,17 @@ i915_dependency_alloc(void)
 	return kmem_cache_alloc(global.slab_dependencies, GFP_KERNEL);
 }
 
+static void
+rcu_dependency_free(struct rcu_head *rcu)
+{
+	kmem_cache_free(global.slab_dependencies,
+			container_of(rcu, typeof(struct i915_dependency), rcu));
+}
+
 static void
 i915_dependency_free(struct i915_dependency *dep)
 {
-	kmem_cache_free(global.slab_dependencies, dep);
+	call_rcu(&dep->rcu, rcu_dependency_free);
 }
 
 bool __i915_sched_node_add_dependency(struct i915_sched_node *node,
@@ -390,24 +410,27 @@ bool __i915_sched_node_add_dependency(struct i915_sched_node *node,
 {
 	bool ret = false;
 
-	spin_lock_irq(&schedule_lock);
+	/* The signal->lock is always the outer lock in this double-lock. */
+	spin_lock(&signal->lock);
 
 	if (!node_signaled(signal)) {
 		INIT_LIST_HEAD(&dep->dfs_link);
 		dep->signaler = signal;
-		dep->waiter = node;
+		dep->waiter = node_get(node);
 		dep->flags = flags;
 
 		/* All set, now publish. Beware the lockless walkers. */
+		spin_lock_nested(&node->lock, SINGLE_DEPTH_NESTING);
 		list_add_rcu(&dep->signal_link, &node->signalers_list);
 		list_add_rcu(&dep->wait_link, &signal->waiters_list);
+		spin_unlock(&node->lock);
 
 		/* Propagate the chains */
 		node->flags |= signal->flags;
 		ret = true;
 	}
 
-	spin_unlock_irq(&schedule_lock);
+	spin_unlock(&signal->lock);
 
 	return ret;
 }
@@ -429,39 +452,36 @@ int i915_sched_node_add_dependency(struct i915_sched_node *node,
 	return 0;
 }
 
-void i915_sched_node_fini(struct i915_sched_node *node)
+void i915_sched_node_retire(struct i915_sched_node *node)
 {
 	struct i915_dependency *dep, *tmp;
 
-	spin_lock_irq(&schedule_lock);
-
 	/*
 	 * Everyone we depended upon (the fences we wait to be signaled)
 	 * should retire before us and remove themselves from our list.
 	 * However, retirement is run independently on each timeline and
-	 * so we may be called out-of-order.
+	 * so we may be called out-of-order. As we need to avoid taking
+	 * the signaler's lock, just mark up our completion and be wary
+	 * in traversing the signalers->waiters_list.
 	 */
-	list_for_each_entry_safe(dep, tmp, &node->signalers_list, signal_link) {
-		GEM_BUG_ON(!list_empty(&dep->dfs_link));
-
-		list_del_rcu(&dep->wait_link);
-		if (dep->flags & I915_DEPENDENCY_ALLOC)
-			i915_dependency_free(dep);
-	}
-	INIT_LIST_HEAD(&node->signalers_list);
 
 	/* Remove ourselves from everyone who depends upon us */
+	spin_lock(&node->lock);
 	list_for_each_entry_safe(dep, tmp, &node->waiters_list, wait_link) {
-		GEM_BUG_ON(dep->signaler != node);
-		GEM_BUG_ON(!list_empty(&dep->dfs_link));
+		struct i915_sched_node *w = dep->waiter;
 
+		GEM_BUG_ON(dep->signaler != node);
+
+		spin_lock_nested(&w->lock, SINGLE_DEPTH_NESTING);
 		list_del_rcu(&dep->signal_link);
+		spin_unlock(&w->lock);
+		node_put(w);
+
 		if (dep->flags & I915_DEPENDENCY_ALLOC)
 			i915_dependency_free(dep);
 	}
-	INIT_LIST_HEAD(&node->waiters_list);
-
-	spin_unlock_irq(&schedule_lock);
+	INIT_LIST_HEAD_RCU(&node->waiters_list);
+	spin_unlock(&node->lock);
 }
 
 void i915_request_show_with_schedule(struct drm_printer *m,
@@ -512,8 +532,7 @@ static struct i915_global_scheduler global = { {
 int __init i915_global_scheduler_init(void)
 {
 	global.slab_dependencies = KMEM_CACHE(i915_dependency,
-					      SLAB_HWCACHE_ALIGN |
-					      SLAB_TYPESAFE_BY_RCU);
+					      SLAB_HWCACHE_ALIGN);
 	if (!global.slab_dependencies)
 		return -ENOMEM;
 
diff --git a/drivers/gpu/drm/i915/i915_scheduler.h b/drivers/gpu/drm/i915/i915_scheduler.h
index ccee506c9a26..a045be784c67 100644
--- a/drivers/gpu/drm/i915/i915_scheduler.h
+++ b/drivers/gpu/drm/i915/i915_scheduler.h
@@ -33,7 +33,7 @@ int i915_sched_node_add_dependency(struct i915_sched_node *node,
 				   struct i915_sched_node *signal,
 				   unsigned long flags);
 
-void i915_sched_node_fini(struct i915_sched_node *node);
+void i915_sched_node_retire(struct i915_sched_node *node);
 
 void i915_request_set_priority(struct i915_request *request, int prio);
 
diff --git a/drivers/gpu/drm/i915/i915_scheduler_types.h b/drivers/gpu/drm/i915/i915_scheduler_types.h
index 343ed44d5ed4..623bf41fcf35 100644
--- a/drivers/gpu/drm/i915/i915_scheduler_types.h
+++ b/drivers/gpu/drm/i915/i915_scheduler_types.h
@@ -60,6 +60,7 @@ struct i915_sched_attr {
  * others.
  */
 struct i915_sched_node {
+	spinlock_t lock; /* protect the lists */
 	struct list_head signalers_list; /* those before us, we depend upon */
 	struct list_head waiters_list; /* those after us, they depend upon us */
 	struct list_head link;
@@ -75,6 +76,7 @@ struct i915_dependency {
 	struct list_head signal_link;
 	struct list_head wait_link;
 	struct list_head dfs_link;
+	struct rcu_head rcu;
 	unsigned long flags;
 #define I915_DEPENDENCY_ALLOC		BIT(0)
 #define I915_DEPENDENCY_EXTERNAL	BIT(1)
-- 
2.20.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


* [Intel-gfx] [PATCH 05/41] drm/i915: Restructure priority inheritance
  2021-01-25 14:00 [Intel-gfx] [PATCH 01/41] drm/i915/selftests: Check for engine-reset errors in the middle of workarounds Chris Wilson
                   ` (2 preceding siblings ...)
  2021-01-25 14:00 ` [Intel-gfx] [PATCH 04/41] drm/i915: Teach the i915_dependency to use a double-lock Chris Wilson
@ 2021-01-25 14:01 ` Chris Wilson
  2021-01-26 11:12   ` Tvrtko Ursulin
  2021-01-25 14:01 ` [Intel-gfx] [PATCH 06/41] drm/i915/selftests: Measure set-priority duration Chris Wilson
                   ` (40 subsequent siblings)
  44 siblings, 1 reply; 90+ messages in thread
From: Chris Wilson @ 2021-01-25 14:01 UTC (permalink / raw)
  To: intel-gfx; +Cc: thomas.hellstrom, Chris Wilson

In anticipation of wanting to be able to call priority inheritance (PI)
from underneath an engine's active.lock, rework PI to primarily work
along an engine's priority queue, delegating any other engine that the
chain may traverse to a worker. This reduces the global spinlock from
governing the entire multi-engine priority inheritance depth-first
search, to a smaller lock on each engine around a single list on that
engine.
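
The hand-off to a worker can be sketched in userspace C11 as below. This
is only an illustration under stated assumptions, not the driver code:
struct req, struct ipi, ipi_raise() and ipi_drain() are hypothetical
names, PRIO_INVALID stands in for I915_PRIORITY_INVALID, and request
reference counting and the actual workqueue are left out. A pending
priority is only ever raised, a request is pushed onto a lock-free list
at most once, and the worker detaches the whole list in one exchange.

```c
#include <assert.h>
#include <limits.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define PRIO_INVALID INT_MIN	/* assumption: stand-in for I915_PRIORITY_INVALID */

struct req {
	struct req *next;		/* link in the pending list */
	atomic_int pending_prio;	/* highest priority requested so far */
	atomic_bool queued;		/* already on the pending list? */
	int prio;			/* applied priority, owned by the worker */
};

struct ipi {
	_Atomic(struct req *) list;	/* lock-free LIFO of pending requests */
};

/*
 * Record prio if it raises the pending value and queue the request.
 * Returns true when the list was empty, i.e. the worker needs a kick.
 */
static bool ipi_raise(struct ipi *ipi, struct req *rq, int prio)
{
	int old = atomic_load(&rq->pending_prio);
	struct req *first;

	assert(prio != PRIO_INVALID);

	do {
		if (prio <= old)
			return false;	/* never lower a pending priority */
	} while (!atomic_compare_exchange_weak(&rq->pending_prio, &old, prio));

	if (atomic_exchange(&rq->queued, true))
		return false;	/* already queued, worker sees the new value */

	first = atomic_load(&ipi->list);
	do
		rq->next = first;
	while (!atomic_compare_exchange_weak(&ipi->list, &first, rq));

	return first == NULL;
}

/* Worker: detach the whole list, then apply each pending priority once. */
static int ipi_drain(struct ipi *ipi)
{
	struct req *rq = atomic_exchange(&ipi->list, NULL);
	int count = 0;

	while (rq) {
		struct req *next = rq->next;
		int prio;

		rq->next = NULL;
		atomic_store(&rq->queued, false);
		prio = atomic_exchange(&rq->pending_prio, PRIO_INVALID);
		if (prio > rq->prio)
			rq->prio = prio;

		count++;
		rq = next;
	}

	return count;
}
```

Only the caller that turns the list from empty to non-empty needs to
schedule the worker, which keeps the cross-engine path to a couple of
atomic operations.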

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/gt/intel_engine_cs.c    |   2 +
 drivers/gpu/drm/i915/gt/intel_engine_types.h |   3 +
 drivers/gpu/drm/i915/i915_scheduler.c        | 346 ++++++++++++-------
 drivers/gpu/drm/i915/i915_scheduler.h        |   2 +
 drivers/gpu/drm/i915/i915_scheduler_types.h  |  19 +-
 5 files changed, 234 insertions(+), 138 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
index 7e580d3ac58f..3bfd3853c0e9 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
@@ -576,6 +576,8 @@ void intel_engine_init_execlists(struct intel_engine_cs *engine)
 
 	execlists->queue_priority_hint = INT_MIN;
 	execlists->queue = RB_ROOT_CACHED;
+
+	i915_sched_init_ipi(&execlists->ipi);
 }
 
 static void cleanup_status_page(struct intel_engine_cs *engine)
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_types.h b/drivers/gpu/drm/i915/gt/intel_engine_types.h
index 27cb3dc0233b..9105b7769635 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_types.h
+++ b/drivers/gpu/drm/i915/gt/intel_engine_types.h
@@ -20,6 +20,7 @@
 #include "i915_gem.h"
 #include "i915_pmu.h"
 #include "i915_priolist_types.h"
+#include "i915_scheduler_types.h"
 #include "i915_selftest.h"
 #include "intel_breadcrumbs_types.h"
 #include "intel_sseu.h"
@@ -257,6 +258,8 @@ struct intel_engine_execlists {
 	struct rb_root_cached queue;
 	struct rb_root_cached virtual;
 
+	struct i915_sched_ipi ipi;
+
 	/**
 	 * @csb_write: control register for Context Switch buffer
 	 *
diff --git a/drivers/gpu/drm/i915/i915_scheduler.c b/drivers/gpu/drm/i915/i915_scheduler.c
index 96fe1e22dad7..0ecf71a6afd4 100644
--- a/drivers/gpu/drm/i915/i915_scheduler.c
+++ b/drivers/gpu/drm/i915/i915_scheduler.c
@@ -17,8 +17,6 @@ static struct i915_global_scheduler {
 	struct kmem_cache *slab_priorities;
 } global;
 
-static DEFINE_SPINLOCK(schedule_lock);
-
 static struct i915_sched_node *node_get(struct i915_sched_node *node)
 {
 	i915_request_get(container_of(node, struct i915_request, sched));
@@ -30,17 +28,116 @@ static void node_put(struct i915_sched_node *node)
 	i915_request_put(container_of(node, struct i915_request, sched));
 }
 
+static inline int rq_prio(const struct i915_request *rq)
+{
+	return READ_ONCE(rq->sched.attr.priority);
+}
+
+static int ipi_get_prio(struct i915_request *rq)
+{
+	if (READ_ONCE(rq->sched.ipi_priority) == I915_PRIORITY_INVALID)
+		return I915_PRIORITY_INVALID;
+
+	return xchg(&rq->sched.ipi_priority, I915_PRIORITY_INVALID);
+}
+
+static void ipi_schedule(struct work_struct *wrk)
+{
+	struct i915_sched_ipi *ipi = container_of(wrk, typeof(*ipi), work);
+	struct i915_request *rq = xchg(&ipi->list, NULL);
+
+	do {
+		struct i915_request *rn = xchg(&rq->sched.ipi_link, NULL);
+		int prio;
+
+		prio = ipi_get_prio(rq);
+
+		/*
+		 * For cross-engine scheduling to work we rely on one of two
+		 * things:
+		 *
+		 * a) The requests are using dma-fence fences and so will not
+		 * be scheduled until the previous engine is completed, and
+		 * so we cannot cross back onto the original engine and end up
+		 * queuing an earlier request after the first (due to the
+		 * interrupted DFS).
+		 *
+		 * b) The requests are using semaphores and so may already
+		 * be in flight, in which case if we cross back onto the same
+		 * engine, we will already have put the interrupted DFS into
+		 * the priolist, and the continuation will now be queued
+		 * afterwards [out-of-order]. However, since we are using
+		 * semaphores in this case, we also perform yield on semaphore
+		 * waits and so will reorder the requests back into the correct
+		 * sequence. This occurrence (of promoting a request chain
+		 * that crosses the engines using semaphores back onto itself)
+		 * should be unlikely enough that it probably does not matter...
+		 */
+		local_bh_disable();
+		i915_request_set_priority(rq, prio);
+		local_bh_enable();
+
+		i915_request_put(rq);
+		rq = ptr_mask_bits(rn, 1);
+	} while (rq);
+}
+
+void i915_sched_init_ipi(struct i915_sched_ipi *ipi)
+{
+	INIT_WORK(&ipi->work, ipi_schedule);
+	ipi->list = NULL;
+}
+
+static void __ipi_add(struct i915_request *rq)
+{
+#define STUB ((struct i915_request *)1)
+	struct intel_engine_cs *engine = READ_ONCE(rq->engine);
+	struct i915_request *first;
+
+	if (!i915_request_get_rcu(rq))
+		return;
+
+	if (__i915_request_is_complete(rq) ||
+	    cmpxchg(&rq->sched.ipi_link, NULL, STUB)) { /* already queued */
+		i915_request_put(rq);
+		return;
+	}
+
+	first = READ_ONCE(engine->execlists.ipi.list);
+	do
+		rq->sched.ipi_link = ptr_pack_bits(first, 1, 1);
+	while (!try_cmpxchg(&engine->execlists.ipi.list, &first, rq));
+
+	if (!first)
+		queue_work(system_unbound_wq, &engine->execlists.ipi.work);
+}
+
+/*
+ * Virtual engines complicate acquiring the engine timeline lock,
+ * as their rq->engine pointer is not stable until under that
+ * engine lock. The simple ploy we use is to take the lock then
+ * check that the rq still belongs to the newly locked engine.
+ */
+#define lock_engine_irqsave(rq, flags) ({ \
+	struct i915_request * const rq__ = (rq); \
+	struct intel_engine_cs *engine__ = READ_ONCE(rq__->engine); \
+\
+	spin_lock_irqsave(&engine__->active.lock, (flags)); \
+	while (engine__ != READ_ONCE((rq__)->engine)) { \
+		spin_unlock(&engine__->active.lock); \
+		engine__ = READ_ONCE(rq__->engine); \
+		spin_lock(&engine__->active.lock); \
+	} \
+\
+	engine__; \
+})
+
 static const struct i915_request *
 node_to_request(const struct i915_sched_node *node)
 {
 	return container_of(node, const struct i915_request, sched);
 }
 
-static inline bool node_started(const struct i915_sched_node *node)
-{
-	return i915_request_started(node_to_request(node));
-}
-
 static inline bool node_signaled(const struct i915_sched_node *node)
 {
 	return i915_request_completed(node_to_request(node));
@@ -137,42 +234,6 @@ void __i915_priolist_free(struct i915_priolist *p)
 	kmem_cache_free(global.slab_priorities, p);
 }
 
-struct sched_cache {
-	struct list_head *priolist;
-};
-
-static struct intel_engine_cs *
-sched_lock_engine(const struct i915_sched_node *node,
-		  struct intel_engine_cs *locked,
-		  struct sched_cache *cache)
-{
-	const struct i915_request *rq = node_to_request(node);
-	struct intel_engine_cs *engine;
-
-	GEM_BUG_ON(!locked);
-
-	/*
-	 * Virtual engines complicate acquiring the engine timeline lock,
-	 * as their rq->engine pointer is not stable until under that
-	 * engine lock. The simple ploy we use is to take the lock then
-	 * check that the rq still belongs to the newly locked engine.
-	 */
-	while (locked != (engine = READ_ONCE(rq->engine))) {
-		spin_unlock(&locked->active.lock);
-		memset(cache, 0, sizeof(*cache));
-		spin_lock(&engine->active.lock);
-		locked = engine;
-	}
-
-	GEM_BUG_ON(locked != engine);
-	return locked;
-}
-
-static inline int rq_prio(const struct i915_request *rq)
-{
-	return rq->sched.attr.priority;
-}
-
 static inline bool need_preempt(int prio, int active)
 {
 	/*
@@ -198,19 +259,17 @@ static void kick_submission(struct intel_engine_cs *engine,
 	if (prio <= engine->execlists.queue_priority_hint)
 		return;
 
-	rcu_read_lock();
-
 	/* Nothing currently active? We're overdue for a submission! */
 	inflight = execlists_active(&engine->execlists);
 	if (!inflight)
-		goto unlock;
+		return;
 
 	/*
 	 * If we are already the currently executing context, don't
 	 * bother evaluating if we should preempt ourselves.
 	 */
 	if (inflight->context == rq->context)
-		goto unlock;
+		return;
 
 	ENGINE_TRACE(engine,
 		     "bumping queue-priority-hint:%d for rq:%llx:%lld, inflight:%llx:%lld prio %d\n",
@@ -222,30 +281,28 @@ static void kick_submission(struct intel_engine_cs *engine,
 	engine->execlists.queue_priority_hint = prio;
 	if (need_preempt(prio, rq_prio(inflight)))
 		tasklet_hi_schedule(&engine->execlists.tasklet);
-
-unlock:
-	rcu_read_unlock();
 }
 
-static void __i915_schedule(struct i915_sched_node *node, int prio)
+static void ipi_priority(struct i915_request *rq, int prio)
 {
-	struct intel_engine_cs *engine;
-	struct i915_dependency *dep, *p;
-	struct i915_dependency stack;
-	struct sched_cache cache;
+	int old = READ_ONCE(rq->sched.ipi_priority);
+
+	do {
+		if (prio <= old)
+			return;
+	} while (!try_cmpxchg(&rq->sched.ipi_priority, &old, prio));
+
+	__ipi_add(rq);
+}
+
+static void __i915_request_set_priority(struct i915_request *rq, int prio)
+{
+	struct intel_engine_cs *engine = rq->engine;
+	struct i915_request *rn;
+	struct list_head *plist;
 	LIST_HEAD(dfs);
 
-	/* Needed in order to use the temporary link inside i915_dependency */
-	lockdep_assert_held(&schedule_lock);
-	GEM_BUG_ON(prio == I915_PRIORITY_INVALID);
-
-	if (node_signaled(node))
-		return;
-
-	prio = max(prio, node->attr.priority);
-
-	stack.signaler = node;
-	list_add(&stack.dfs_link, &dfs);
+	list_add(&rq->sched.dfs, &dfs);
 
 	/*
 	 * Recursively bump all dependent priorities to match the new request.
@@ -265,66 +322,41 @@ static void __i915_schedule(struct i915_sched_node *node, int prio)
 	 * end result is a topological list of requests in reverse order, the
 	 * last element in the list is the request we must execute first.
 	 */
-	list_for_each_entry(dep, &dfs, dfs_link) {
-		struct i915_sched_node *node = dep->signaler;
+	list_for_each_entry(rq, &dfs, sched.dfs) {
+		struct i915_dependency *p;
 
-		/* If we are already flying, we know we have no signalers */
-		if (node_started(node))
-			continue;
+		/* Also release any children on this engine that are ready */
+		GEM_BUG_ON(rq->engine != engine);
 
-		/*
-		 * Within an engine, there can be no cycle, but we may
-		 * refer to the same dependency chain multiple times
-		 * (redundant dependencies are not eliminated) and across
-		 * engines.
-		 */
-		list_for_each_entry(p, &node->signalers_list, signal_link) {
-			GEM_BUG_ON(p == dep); /* no cycles! */
+		for_each_signaler(p, rq) {
+			struct i915_request *s =
+				container_of(p->signaler, typeof(*s), sched);
 
-			if (node_signaled(p->signaler))
+			GEM_BUG_ON(s == rq);
+
+			if (rq_prio(s) >= prio)
 				continue;
 
-			if (prio > READ_ONCE(p->signaler->attr.priority))
-				list_move_tail(&p->dfs_link, &dfs);
+			if (__i915_request_is_complete(s))
+				continue;
+
+			if (s->engine != rq->engine) {
+				ipi_priority(s, prio);
+				continue;
+			}
+
+			list_move_tail(&s->sched.dfs, &dfs);
 		}
 	}
 
-	/*
-	 * If we didn't need to bump any existing priorities, and we haven't
-	 * yet submitted this request (i.e. there is no potential race with
-	 * execlists_submit_request()), we can set our own priority and skip
-	 * acquiring the engine locks.
-	 */
-	if (node->attr.priority == I915_PRIORITY_INVALID) {
-		GEM_BUG_ON(!list_empty(&node->link));
-		node->attr.priority = prio;
+	plist = i915_sched_lookup_priolist(engine, prio);
 
-		if (stack.dfs_link.next == stack.dfs_link.prev)
-			return;
+	/* Fifo and depth-first replacement ensure our deps execute first */
+	list_for_each_entry_safe_reverse(rq, rn, &dfs, sched.dfs) {
+		GEM_BUG_ON(rq->engine != engine);
 
-		__list_del_entry(&stack.dfs_link);
-	}
-
-	memset(&cache, 0, sizeof(cache));
-	engine = node_to_request(node)->engine;
-	spin_lock(&engine->active.lock);
-
-	/* Fifo and depth-first replacement ensure our deps execute before us */
-	engine = sched_lock_engine(node, engine, &cache);
-	list_for_each_entry_safe_reverse(dep, p, &dfs, dfs_link) {
-		INIT_LIST_HEAD(&dep->dfs_link);
-
-		node = dep->signaler;
-		engine = sched_lock_engine(node, engine, &cache);
-		lockdep_assert_held(&engine->active.lock);
-
-		/* Recheck after acquiring the engine->timeline.lock */
-		if (prio <= node->attr.priority || node_signaled(node))
-			continue;
-
-		GEM_BUG_ON(node_to_request(node)->engine != engine);
-
-		WRITE_ONCE(node->attr.priority, prio);
+		INIT_LIST_HEAD(&rq->sched.dfs);
+		WRITE_ONCE(rq->sched.attr.priority, prio);
 
 		/*
 		 * Once the request is ready, it will be placed into the
@@ -334,32 +366,75 @@ static void __i915_schedule(struct i915_sched_node *node, int prio)
 		 * any preemption required, be dealt with upon submission.
 		 * See engine->submit_request()
 		 */
-		if (list_empty(&node->link))
+		if (!i915_request_is_ready(rq))
 			continue;
 
-		if (i915_request_in_priority_queue(node_to_request(node))) {
-			if (!cache.priolist)
-				cache.priolist =
-					i915_sched_lookup_priolist(engine,
-								   prio);
-			list_move_tail(&node->link, cache.priolist);
-		}
+		if (i915_request_in_priority_queue(rq))
+			list_move_tail(&rq->sched.link, plist);
 
-		/* Defer (tasklet) submission until after all of our updates. */
-		kick_submission(engine, node_to_request(node), prio);
+		/* Defer (tasklet) submission until after all updates. */
+		kick_submission(engine, rq, prio);
 	}
-
-	spin_unlock(&engine->active.lock);
 }
 
 void i915_request_set_priority(struct i915_request *rq, int prio)
 {
-	if (!intel_engine_has_scheduler(rq->engine))
+	struct intel_engine_cs *engine;
+	unsigned long flags;
+
+	if (prio <= rq_prio(rq))
 		return;
 
-	spin_lock_irq(&schedule_lock);
-	__i915_schedule(&rq->sched, prio);
-	spin_unlock_irq(&schedule_lock);
+	/*
+	 * If we are setting the priority before being submitted, see if we
+	 * can quickly adjust our own priority in-situ and avoid taking
+	 * the contended engine->active.lock. If we need priority inheritance,
+	 * take the slow route.
+	 */
+	if (rq_prio(rq) == I915_PRIORITY_INVALID) {
+		struct i915_dependency *p;
+
+		rcu_read_lock();
+		for_each_signaler(p, rq) {
+			struct i915_request *s =
+				container_of(p->signaler, typeof(*s), sched);
+
+			if (rq_prio(s) >= prio)
+				continue;
+
+			if (__i915_request_is_complete(s))
+				continue;
+
+			break;
+		}
+		rcu_read_unlock();
+
+		if (&p->signal_link == &rq->sched.signalers_list &&
+		    cmpxchg(&rq->sched.attr.priority,
+			    I915_PRIORITY_INVALID,
+			    prio) == I915_PRIORITY_INVALID)
+			return;
+	}
+
+	engine = lock_engine_irqsave(rq, flags);
+	if (prio <= rq_prio(rq))
+		goto unlock;
+
+	if (__i915_request_is_complete(rq))
+		goto unlock;
+
+	if (!intel_engine_has_scheduler(engine)) {
+		rq->sched.attr.priority = prio;
+		goto unlock;
+	}
+
+	rcu_read_lock();
+	__i915_request_set_priority(rq, prio);
+	rcu_read_unlock();
+	GEM_BUG_ON(rq_prio(rq) != prio);
+
+unlock:
+	spin_unlock_irqrestore(&engine->active.lock, flags);
 }
 
 void i915_sched_node_init(struct i915_sched_node *node)
@@ -369,6 +444,9 @@ void i915_sched_node_init(struct i915_sched_node *node)
 	INIT_LIST_HEAD(&node->signalers_list);
 	INIT_LIST_HEAD(&node->waiters_list);
 	INIT_LIST_HEAD(&node->link);
+	INIT_LIST_HEAD(&node->dfs);
+
+	node->ipi_link = NULL;
 
 	i915_sched_node_reinit(node);
 }
@@ -379,6 +457,9 @@ void i915_sched_node_reinit(struct i915_sched_node *node)
 	node->semaphores = 0;
 	node->flags = 0;
 
+	GEM_BUG_ON(node->ipi_link);
+	node->ipi_priority = I915_PRIORITY_INVALID;
+
 	GEM_BUG_ON(!list_empty(&node->signalers_list));
 	GEM_BUG_ON(!list_empty(&node->waiters_list));
 	GEM_BUG_ON(!list_empty(&node->link));
@@ -414,7 +495,6 @@ bool __i915_sched_node_add_dependency(struct i915_sched_node *node,
 	spin_lock(&signal->lock);
 
 	if (!node_signaled(signal)) {
-		INIT_LIST_HEAD(&dep->dfs_link);
 		dep->signaler = signal;
 		dep->waiter = node_get(node);
 		dep->flags = flags;
diff --git a/drivers/gpu/drm/i915/i915_scheduler.h b/drivers/gpu/drm/i915/i915_scheduler.h
index a045be784c67..5be7f90e7896 100644
--- a/drivers/gpu/drm/i915/i915_scheduler.h
+++ b/drivers/gpu/drm/i915/i915_scheduler.h
@@ -35,6 +35,8 @@ int i915_sched_node_add_dependency(struct i915_sched_node *node,
 
 void i915_sched_node_retire(struct i915_sched_node *node);
 
+void i915_sched_init_ipi(struct i915_sched_ipi *ipi);
+
 void i915_request_set_priority(struct i915_request *request, int prio);
 
 struct list_head *
diff --git a/drivers/gpu/drm/i915/i915_scheduler_types.h b/drivers/gpu/drm/i915/i915_scheduler_types.h
index 623bf41fcf35..5a84d59134ee 100644
--- a/drivers/gpu/drm/i915/i915_scheduler_types.h
+++ b/drivers/gpu/drm/i915/i915_scheduler_types.h
@@ -8,8 +8,8 @@
 #define _I915_SCHEDULER_TYPES_H_
 
 #include <linux/list.h>
+#include <linux/workqueue.h>
 
-#include "gt/intel_engine_types.h"
 #include "i915_priolist_types.h"
 
 struct drm_i915_private;
@@ -61,13 +61,23 @@ struct i915_sched_attr {
  */
 struct i915_sched_node {
 	spinlock_t lock; /* protect the lists */
+
 	struct list_head signalers_list; /* those before us, we depend upon */
 	struct list_head waiters_list; /* those after us, they depend upon us */
-	struct list_head link;
+	struct list_head link; /* guarded by engine->active.lock */
+	struct list_head dfs; /* guarded by engine->active.lock */
 	struct i915_sched_attr attr;
-	unsigned int flags;
+	unsigned long flags;
 #define I915_SCHED_HAS_EXTERNAL_CHAIN	BIT(0)
-	intel_engine_mask_t semaphores;
+	unsigned long semaphores;
+
+	struct i915_request *ipi_link;
+	int ipi_priority;
+};
+
+struct i915_sched_ipi {
+	struct i915_request *list;
+	struct work_struct work;
 };
 
 struct i915_dependency {
@@ -75,7 +85,6 @@ struct i915_dependency {
 	struct i915_sched_node *waiter;
 	struct list_head signal_link;
 	struct list_head wait_link;
-	struct list_head dfs_link;
 	struct rcu_head rcu;
 	unsigned long flags;
 #define I915_DEPENDENCY_ALLOC		BIT(0)
-- 
2.20.1


* [Intel-gfx] [PATCH 06/41] drm/i915/selftests: Measure set-priority duration
  2021-01-25 14:00 [Intel-gfx] [PATCH 01/41] drm/i915/selftests: Check for engine-reset errors in the middle of workarounds Chris Wilson
                   ` (3 preceding siblings ...)
  2021-01-25 14:01 ` [Intel-gfx] [PATCH 05/41] drm/i915: Restructure priority inheritance Chris Wilson
@ 2021-01-25 14:01 ` Chris Wilson
  2021-01-25 14:01 ` [Intel-gfx] [PATCH 07/41] drm/i915/selftests: Exercise priority inheritance around an engine loop Chris Wilson
                   ` (39 subsequent siblings)
  44 siblings, 0 replies; 90+ messages in thread
From: Chris Wilson @ 2021-01-25 14:01 UTC (permalink / raw)
  To: intel-gfx; +Cc: thomas.hellstrom, Chris Wilson

As a topological sort, we expect priority inheritance to run in linear
graph time, O(V+E). In removing the recursion, it is no longer a DFS but
rather a BFS, and performs as O(VE). Let's demonstrate how bad this is
with a few examples, and build a few test cases to verify a potential
fix.
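
For reference, the O(V+E) bound can be illustrated with a small
userspace sketch. It assumes a dense graph shaped like a wide chain, in
which request i waits on every earlier request j < i, and it uses
hypothetical names (wide_chain_edges(), topo_sort()) that are not part
of the driver. Each node is marked as seen when first pushed, so every
edge is examined exactly once and each node is emitted only after all of
its signalers.

```c
#include <assert.h>
#include <stdlib.h>

/* Number of dependencies when request i waits on every request j < i. */
static unsigned long wide_chain_edges(unsigned long width)
{
	return width * (width - 1) / 2;
}

/*
 * Iterative post-order DFS starting from the last request (width >= 1).
 * out[] receives the nodes with every signaler ahead of its waiters.
 * Returns the number of edges examined.
 */
static unsigned long topo_sort(unsigned long width, unsigned long *out)
{
	unsigned long *stack, *next_edge;
	unsigned char *seen;
	unsigned long sp = 0, nout = 0, edges = 0;

	stack = calloc(width, sizeof(*stack));
	next_edge = calloc(width, sizeof(*next_edge));
	seen = calloc(width, sizeof(*seen));
	assert(stack && next_edge && seen);

	/* Start from the request whose priority is being bumped. */
	stack[sp++] = width - 1;
	seen[width - 1] = 1;

	while (sp) {
		unsigned long v = stack[sp - 1];

		if (next_edge[v] < v) {	/* signalers of v are 0..v-1 */
			unsigned long s = next_edge[v]++;

			edges++;
			if (!seen[s]) {	/* push each node at most once */
				seen[s] = 1;
				stack[sp++] = s;
			}
		} else {
			out[nout++] = v; /* all signalers already emitted */
			sp--;
		}
	}

	free(seen);
	free(next_edge);
	free(stack);

	return edges;
}
```

For a width of 5 there are 10 dependencies, so a linear walk should
examine 10 edges; a walk that requeues nodes each time they are reached
will examine noticeably more.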

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/i915_scheduler.c         |   4 +
 .../drm/i915/selftests/i915_live_selftests.h  |   1 +
 .../drm/i915/selftests/i915_perf_selftests.h  |   1 +
 .../gpu/drm/i915/selftests/i915_scheduler.c   | 679 ++++++++++++++++++
 4 files changed, 685 insertions(+)
 create mode 100644 drivers/gpu/drm/i915/selftests/i915_scheduler.c

diff --git a/drivers/gpu/drm/i915/i915_scheduler.c b/drivers/gpu/drm/i915/i915_scheduler.c
index 0ecf71a6afd4..4802c9b1081d 100644
--- a/drivers/gpu/drm/i915/i915_scheduler.c
+++ b/drivers/gpu/drm/i915/i915_scheduler.c
@@ -592,6 +592,10 @@ void i915_request_show_with_schedule(struct drm_printer *m,
 	rcu_read_unlock();
 }
 
+#if IS_ENABLED(CONFIG_DRM_I915_SELFTEST)
+#include "selftests/i915_scheduler.c"
+#endif
+
 static void i915_global_scheduler_shrink(void)
 {
 	kmem_cache_shrink(global.slab_dependencies);
diff --git a/drivers/gpu/drm/i915/selftests/i915_live_selftests.h b/drivers/gpu/drm/i915/selftests/i915_live_selftests.h
index a92c0e9b7e6b..2200a5baa68e 100644
--- a/drivers/gpu/drm/i915/selftests/i915_live_selftests.h
+++ b/drivers/gpu/drm/i915/selftests/i915_live_selftests.h
@@ -26,6 +26,7 @@ selftest(gt_mocs, intel_mocs_live_selftests)
 selftest(gt_pm, intel_gt_pm_live_selftests)
 selftest(gt_heartbeat, intel_heartbeat_live_selftests)
 selftest(requests, i915_request_live_selftests)
+selftest(scheduler, i915_scheduler_live_selftests)
 selftest(active, i915_active_live_selftests)
 selftest(objects, i915_gem_object_live_selftests)
 selftest(mman, i915_gem_mman_live_selftests)
diff --git a/drivers/gpu/drm/i915/selftests/i915_perf_selftests.h b/drivers/gpu/drm/i915/selftests/i915_perf_selftests.h
index c2389f8a257d..137e35283fee 100644
--- a/drivers/gpu/drm/i915/selftests/i915_perf_selftests.h
+++ b/drivers/gpu/drm/i915/selftests/i915_perf_selftests.h
@@ -17,5 +17,6 @@
  */
 selftest(engine_cs, intel_engine_cs_perf_selftests)
 selftest(request, i915_request_perf_selftests)
+selftest(scheduler, i915_scheduler_perf_selftests)
 selftest(blt, i915_gem_object_blt_perf_selftests)
 selftest(region, intel_memory_region_perf_selftests)
diff --git a/drivers/gpu/drm/i915/selftests/i915_scheduler.c b/drivers/gpu/drm/i915/selftests/i915_scheduler.c
new file mode 100644
index 000000000000..cb67de304aeb
--- /dev/null
+++ b/drivers/gpu/drm/i915/selftests/i915_scheduler.c
@@ -0,0 +1,679 @@
+// SPDX-License-Identifier: MIT
+/*
+ * Copyright © 2020 Intel Corporation
+ */
+
+#include "i915_selftest.h"
+
+#include "gt/intel_context.h"
+#include "gt/intel_gpu_commands.h"
+#include "gt/selftest_engine_heartbeat.h"
+#include "selftests/igt_spinner.h"
+#include "selftests/i915_random.h"
+
+static void scheduling_disable(struct intel_engine_cs *engine)
+{
+	engine->props.preempt_timeout_ms = 0;
+	engine->props.timeslice_duration_ms = 0;
+
+	st_engine_heartbeat_disable(engine);
+}
+
+static void scheduling_enable(struct intel_engine_cs *engine)
+{
+	st_engine_heartbeat_enable(engine);
+
+	engine->props.preempt_timeout_ms =
+		engine->defaults.preempt_timeout_ms;
+	engine->props.timeslice_duration_ms =
+		engine->defaults.timeslice_duration_ms;
+}
+
+static int first_engine(struct drm_i915_private *i915,
+			int (*chain)(struct intel_engine_cs *engine,
+				     unsigned long param,
+				     bool (*fn)(struct i915_request *rq,
+						unsigned long v,
+						unsigned long e)),
+			unsigned long param,
+			bool (*fn)(struct i915_request *rq,
+				   unsigned long v, unsigned long e))
+{
+	struct intel_engine_cs *engine;
+
+	for_each_uabi_engine(engine, i915) {
+		if (!intel_engine_has_scheduler(engine))
+			continue;
+
+		return chain(engine, param, fn);
+	}
+
+	return 0;
+}
+
+static int all_engines(struct drm_i915_private *i915,
+		       int (*chain)(struct intel_engine_cs *engine,
+				    unsigned long param,
+				    bool (*fn)(struct i915_request *rq,
+					       unsigned long v,
+					       unsigned long e)),
+		       unsigned long param,
+		       bool (*fn)(struct i915_request *rq,
+				  unsigned long v, unsigned long e))
+{
+	struct intel_engine_cs *engine;
+	int err;
+
+	for_each_uabi_engine(engine, i915) {
+		if (!intel_engine_has_scheduler(engine))
+			continue;
+
+		err = chain(engine, param, fn);
+		if (err)
+			return err;
+	}
+
+	return 0;
+}
+
+static bool check_context_order(struct intel_engine_cs *engine)
+{
+	u64 last_seqno, last_context;
+	unsigned long count;
+	bool result = false;
+	struct rb_node *rb;
+	int last_prio;
+
+	/* We expect the execution order to follow ascending fence-context */
+	spin_lock_irq(&engine->active.lock);
+
+	count = 0;
+	last_context = 0;
+	last_seqno = 0;
+	last_prio = 0;
+	for (rb = rb_first_cached(&engine->execlists.queue); rb; rb = rb_next(rb)) {
+		struct i915_priolist *p = rb_entry(rb, typeof(*p), node);
+		struct i915_request *rq;
+
+		priolist_for_each_request(rq, p) {
+			if (rq->fence.context < last_context ||
+			    (rq->fence.context == last_context &&
+			     rq->fence.seqno < last_seqno)) {
+				pr_err("[%lu] %llx:%lld [prio:%d] after %llx:%lld [prio:%d]\n",
+				       count,
+				       rq->fence.context,
+				       rq->fence.seqno,
+				       rq_prio(rq),
+				       last_context,
+				       last_seqno,
+				       last_prio);
+				goto out_unlock;
+			}
+
+			last_context = rq->fence.context;
+			last_seqno = rq->fence.seqno;
+			last_prio = rq_prio(rq);
+			count++;
+		}
+	}
+	result = true;
+out_unlock:
+	spin_unlock_irq(&engine->active.lock);
+
+	return result;
+}
+
+static int __single_chain(struct intel_engine_cs *engine, unsigned long length,
+			  bool (*fn)(struct i915_request *rq,
+				     unsigned long v, unsigned long e))
+{
+	struct intel_context *ce;
+	struct igt_spinner spin;
+	struct i915_request *rq;
+	unsigned long count;
+	unsigned long min;
+	int err = 0;
+
+	if (!intel_engine_can_store_dword(engine))
+		return 0;
+
+	scheduling_disable(engine);
+
+	if (igt_spinner_init(&spin, engine->gt)) {
+		err = -ENOMEM;
+		goto err_heartbeat;
+	}
+
+	ce = intel_context_create(engine);
+	if (IS_ERR(ce)) {
+		err = PTR_ERR(ce);
+		goto err_spin;
+	}
+	ce->ring = __intel_context_ring_size(SZ_512K);
+
+	rq = igt_spinner_create_request(&spin, ce, MI_NOOP);
+	if (IS_ERR(rq)) {
+		err = PTR_ERR(rq);
+		goto err_context;
+	}
+	i915_request_add(rq);
+	min = ce->ring->size - ce->ring->space;
+
+	count = 1;
+	while (count < length && ce->ring->space > min) {
+		rq = intel_context_create_request(ce);
+		if (IS_ERR(rq)) {
+			err = PTR_ERR(rq);
+			break;
+		}
+		i915_request_add(rq);
+		count++;
+	}
+	intel_engine_flush_submission(engine);
+
+	tasklet_disable(&engine->execlists.tasklet);
+	local_bh_disable();
+	if (fn(rq, count, count - 1) && !check_context_order(engine))
+		err = -EINVAL;
+	local_bh_enable();
+	tasklet_enable(&engine->execlists.tasklet);
+
+	igt_spinner_end(&spin);
+err_context:
+	intel_context_put(ce);
+err_spin:
+	igt_spinner_fini(&spin);
+err_heartbeat:
+	scheduling_enable(engine);
+	return err;
+}
+
+static int __wide_chain(struct intel_engine_cs *engine, unsigned long width,
+			bool (*fn)(struct i915_request *rq,
+				   unsigned long v, unsigned long e))
+{
+	struct intel_context **ce;
+	struct i915_request **rq;
+	struct igt_spinner spin;
+	unsigned long count;
+	unsigned long i, j;
+	int err = 0;
+
+	if (!intel_engine_can_store_dword(engine))
+		return 0;
+
+	scheduling_disable(engine);
+
+	if (igt_spinner_init(&spin, engine->gt)) {
+		err = -ENOMEM;
+		goto err_heartbeat;
+	}
+
+	ce = kmalloc_array(width, sizeof(*ce), GFP_KERNEL);
+	if (!ce) {
+		err = -ENOMEM;
+		goto err_spin;
+	}
+
+	for (i = 0; i < width; i++) {
+		ce[i] = intel_context_create(engine);
+		if (IS_ERR(ce[i])) {
+			err = PTR_ERR(ce[i]);
+			width = i;
+			goto err_context;
+		}
+	}
+
+	rq = kmalloc_array(width, sizeof(*rq), GFP_KERNEL);
+	if (!rq) {
+		err = -ENOMEM;
+		goto err_context;
+	}
+
+	rq[0] = igt_spinner_create_request(&spin, ce[0], MI_NOOP);
+	if (IS_ERR(rq[0])) {
+		err = PTR_ERR(rq[0]);
+		goto err_free;
+	}
+	i915_request_add(rq[0]);
+
+	count = 0;
+	for (i = 1; i < width; i++) {
+		GEM_BUG_ON(i915_request_completed(rq[0]));
+
+		rq[i] = intel_context_create_request(ce[i]);
+		if (IS_ERR(rq[i])) {
+			err = PTR_ERR(rq[i]);
+			break;
+		}
+		for (j = 0; j < i; j++) {
+			err = i915_request_await_dma_fence(rq[i],
+							   &rq[j]->fence);
+			if (err)
+				break;
+			count++;
+		}
+		i915_request_add(rq[i]);
+	}
+	intel_engine_flush_submission(engine);
+
+	tasklet_disable(&engine->execlists.tasklet);
+	local_bh_disable();
+	if (fn(rq[i - 1], i, count) && !check_context_order(engine))
+		err = -EINVAL;
+	local_bh_enable();
+	tasklet_enable(&engine->execlists.tasklet);
+
+	igt_spinner_end(&spin);
+err_free:
+	kfree(rq);
+err_context:
+	for (i = 0; i < width; i++)
+		intel_context_put(ce[i]);
+	kfree(ce);
+err_spin:
+	igt_spinner_fini(&spin);
+err_heartbeat:
+	scheduling_enable(engine);
+	return err;
+}
+
+static int __inv_chain(struct intel_engine_cs *engine, unsigned long width,
+		       bool (*fn)(struct i915_request *rq,
+				  unsigned long v, unsigned long e))
+{
+	struct intel_context **ce;
+	struct i915_request **rq;
+	struct igt_spinner spin;
+	unsigned long count;
+	unsigned long i, j;
+	int err = 0;
+
+	if (!intel_engine_can_store_dword(engine))
+		return 0;
+
+	scheduling_disable(engine);
+
+	if (igt_spinner_init(&spin, engine->gt)) {
+		err = -ENOMEM;
+		goto err_heartbeat;
+	}
+
+	ce = kmalloc_array(width, sizeof(*ce), GFP_KERNEL);
+	if (!ce) {
+		err = -ENOMEM;
+		goto err_spin;
+	}
+
+	for (i = 0; i < width; i++) {
+		ce[i] = intel_context_create(engine);
+		if (IS_ERR(ce[i])) {
+			err = PTR_ERR(ce[i]);
+			width = i;
+			goto err_context;
+		}
+	}
+
+	rq = kmalloc_array(width, sizeof(*rq), GFP_KERNEL);
+	if (!rq) {
+		err = -ENOMEM;
+		goto err_context;
+	}
+
+	rq[0] = igt_spinner_create_request(&spin, ce[0], MI_NOOP);
+	if (IS_ERR(rq[0])) {
+		err = PTR_ERR(rq[0]);
+		goto err_free;
+	}
+	i915_request_add(rq[0]);
+
+	count = 0;
+	for (i = 1; i < width; i++) {
+		GEM_BUG_ON(i915_request_completed(rq[0]));
+
+		rq[i] = intel_context_create_request(ce[i]);
+		if (IS_ERR(rq[i])) {
+			err = PTR_ERR(rq[i]);
+			break;
+		}
+		for (j = i; j > 0; j--) {
+			err = i915_request_await_dma_fence(rq[i],
+							   &rq[j - 1]->fence);
+			if (err)
+				break;
+			count++;
+		}
+		i915_request_add(rq[i]);
+	}
+	intel_engine_flush_submission(engine);
+
+	tasklet_disable(&engine->execlists.tasklet);
+	local_bh_disable();
+	if (fn(rq[i - 1], i, count) && !check_context_order(engine))
+		err = -EINVAL;
+	local_bh_enable();
+	tasklet_enable(&engine->execlists.tasklet);
+
+	igt_spinner_end(&spin);
+err_free:
+	kfree(rq);
+err_context:
+	for (i = 0; i < width; i++)
+		intel_context_put(ce[i]);
+	kfree(ce);
+err_spin:
+	igt_spinner_fini(&spin);
+err_heartbeat:
+	scheduling_enable(engine);
+	return err;
+}
+
+static int __sparse_chain(struct intel_engine_cs *engine, unsigned long width,
+			  bool (*fn)(struct i915_request *rq,
+				     unsigned long v, unsigned long e))
+{
+	struct intel_context **ce;
+	struct i915_request **rq;
+	struct igt_spinner spin;
+	I915_RND_STATE(prng);
+	unsigned long count;
+	unsigned long i, j;
+	int err = 0;
+
+	if (!intel_engine_can_store_dword(engine))
+		return 0;
+
+	scheduling_disable(engine);
+
+	if (igt_spinner_init(&spin, engine->gt)) {
+		err = -ENOMEM;
+		goto err_heartbeat;
+	}
+
+	ce = kmalloc_array(width, sizeof(*ce), GFP_KERNEL);
+	if (!ce) {
+		err = -ENOMEM;
+		goto err_spin;
+	}
+
+	for (i = 0; i < width; i++) {
+		ce[i] = intel_context_create(engine);
+		if (IS_ERR(ce[i])) {
+			err = PTR_ERR(ce[i]);
+			width = i;
+			goto err_context;
+		}
+	}
+
+	rq = kmalloc_array(width, sizeof(*rq), GFP_KERNEL);
+	if (!rq) {
+		err = -ENOMEM;
+		goto err_context;
+	}
+
+	rq[0] = igt_spinner_create_request(&spin, ce[0], MI_NOOP);
+	if (IS_ERR(rq[0])) {
+		err = PTR_ERR(rq[0]);
+		goto err_free;
+	}
+	i915_request_add(rq[0]);
+
+	count = 0;
+	for (i = 1; i < width; i++) {
+		GEM_BUG_ON(i915_request_completed(rq[0]));
+
+		rq[i] = intel_context_create_request(ce[i]);
+		if (IS_ERR(rq[i])) {
+			err = PTR_ERR(rq[i]);
+			break;
+		}
+
+		if (err == 0 && i > 1) {
+			j = i915_prandom_u32_max_state(i - 1, &prng);
+			err = i915_request_await_dma_fence(rq[i],
+							   &rq[j]->fence);
+			count++;
+		}
+
+		if (err == 0) {
+			err = i915_request_await_dma_fence(rq[i],
+							   &rq[i - 1]->fence);
+			count++;
+		}
+
+		if (err == 0 && i > 2) {
+			j = i915_prandom_u32_max_state(i - 2, &prng);
+			err = i915_request_await_dma_fence(rq[i],
+							   &rq[j]->fence);
+			count++;
+		}
+
+		i915_request_add(rq[i]);
+		if (err)
+			break;
+	}
+	intel_engine_flush_submission(engine);
+
+	tasklet_disable(&engine->execlists.tasklet);
+	local_bh_disable();
+	if (fn(rq[i - 1], i, count) && !check_context_order(engine))
+		err = -EINVAL;
+	local_bh_enable();
+	tasklet_enable(&engine->execlists.tasklet);
+
+	igt_spinner_end(&spin);
+err_free:
+	kfree(rq);
+err_context:
+	for (i = 0; i < width; i++)
+		intel_context_put(ce[i]);
+	kfree(ce);
+err_spin:
+	igt_spinner_fini(&spin);
+err_heartbeat:
+	scheduling_enable(engine);
+	return err;
+}
+
+static int igt_schedule_chains(struct drm_i915_private *i915,
+			       bool (*fn)(struct i915_request *rq,
+					  unsigned long v, unsigned long e))
+{
+	static int (* const chains[])(struct intel_engine_cs *engine,
+				      unsigned long length,
+				      bool (*fn)(struct i915_request *rq,
+						 unsigned long v, unsigned long e)) = {
+		__single_chain,
+		__wide_chain,
+		__inv_chain,
+		__sparse_chain,
+	};
+	int n, err;
+
+	for (n = 0; n < ARRAY_SIZE(chains); n++) {
+		err = all_engines(i915, chains[n], 17, fn);
+		if (err)
+			return err;
+	}
+
+	return 0;
+}
+
+static bool igt_priority(struct i915_request *rq,
+			 unsigned long v, unsigned long e)
+{
+	i915_request_set_priority(rq, I915_PRIORITY_BARRIER);
+	GEM_BUG_ON(rq_prio(rq) != I915_PRIORITY_BARRIER);
+	return true;
+}
+
+static int igt_priority_chains(void *arg)
+{
+	return igt_schedule_chains(arg, igt_priority);
+}
+
+int i915_scheduler_live_selftests(struct drm_i915_private *i915)
+{
+	static const struct i915_subtest tests[] = {
+		SUBTEST(igt_priority_chains),
+	};
+
+	return i915_subtests(tests, i915);
+}
+
+static int chains(struct drm_i915_private *i915,
+		  int (*chain)(struct drm_i915_private *i915,
+			       unsigned long length,
+			       bool (*fn)(struct i915_request *rq,
+					  unsigned long v, unsigned long e)),
+		  bool (*fn)(struct i915_request *rq,
+			     unsigned long v, unsigned long e))
+{
+	unsigned long x[] = { 1, 4, 16, 64, 128, 256, 512, 1024, 4096 };
+	int i, err;
+
+	for (i = 0; i < ARRAY_SIZE(x); i++) {
+		IGT_TIMEOUT(end_time);
+
+		err = chain(i915, x[i], fn);
+		if (err)
+			return err;
+
+		if (__igt_timeout(end_time, NULL))
+			break;
+	}
+
+	return 0;
+}
+
+static int single_chain(struct drm_i915_private *i915,
+			unsigned long length,
+			bool (*fn)(struct i915_request *rq,
+				   unsigned long v, unsigned long e))
+{
+	return first_engine(i915, __single_chain, length, fn);
+}
+
+static int single(struct drm_i915_private *i915,
+		  bool (*fn)(struct i915_request *rq,
+			     unsigned long v, unsigned long e))
+{
+	return chains(i915, single_chain, fn);
+}
+
+static int wide_chain(struct drm_i915_private *i915,
+		      unsigned long width,
+		      bool (*fn)(struct i915_request *rq,
+				 unsigned long v, unsigned long e))
+{
+	return first_engine(i915, __wide_chain, width, fn);
+}
+
+static int wide(struct drm_i915_private *i915,
+		bool (*fn)(struct i915_request *rq,
+			   unsigned long v, unsigned long e))
+{
+	return chains(i915, wide_chain, fn);
+}
+
+static int inv_chain(struct drm_i915_private *i915,
+		     unsigned long width,
+		     bool (*fn)(struct i915_request *rq,
+				unsigned long v, unsigned long e))
+{
+	return first_engine(i915, __inv_chain, width, fn);
+}
+
+static int inv(struct drm_i915_private *i915,
+	       bool (*fn)(struct i915_request *rq,
+			  unsigned long v, unsigned long e))
+{
+	return chains(i915, inv_chain, fn);
+}
+
+static int sparse_chain(struct drm_i915_private *i915,
+			unsigned long width,
+			bool (*fn)(struct i915_request *rq,
+				   unsigned long v, unsigned long e))
+{
+	return first_engine(i915, __sparse_chain, width, fn);
+}
+
+static int sparse(struct drm_i915_private *i915,
+		  bool (*fn)(struct i915_request *rq,
+			     unsigned long v, unsigned long e))
+{
+	return chains(i915, sparse_chain, fn);
+}
+
+static void report(const char *what, unsigned long v, unsigned long e, u64 dt)
+{
+	pr_info("(%4lu, %7lu), %s:%10lluns\n", v, e, what, dt);
+}
+
+static u64 __set_priority(struct i915_request *rq, int prio)
+{
+	u64 dt;
+
+	preempt_disable();
+	dt = ktime_get_raw_fast_ns();
+	i915_request_set_priority(rq, prio);
+	dt = ktime_get_raw_fast_ns() - dt;
+	preempt_enable();
+
+	return dt;
+}
+
+static bool set_priority(struct i915_request *rq,
+			 unsigned long v, unsigned long e)
+{
+	report("set-priority", v, e, __set_priority(rq, I915_PRIORITY_BARRIER));
+	return true;
+}
+
+static int single_priority(void *arg)
+{
+	return single(arg, set_priority);
+}
+
+static int wide_priority(void *arg)
+{
+	return wide(arg, set_priority);
+}
+
+static int inv_priority(void *arg)
+{
+	return inv(arg, set_priority);
+}
+
+static int sparse_priority(void *arg)
+{
+	return sparse(arg, set_priority);
+}
+
+int i915_scheduler_perf_selftests(struct drm_i915_private *i915)
+{
+	static const struct i915_subtest tests[] = {
+		SUBTEST(single_priority),
+		SUBTEST(wide_priority),
+		SUBTEST(inv_priority),
+		SUBTEST(sparse_priority),
+	};
+	static const struct {
+		const char *name;
+		size_t sz;
+	} types[] = {
+#define T(t) { #t, sizeof(struct t) }
+		T(i915_priolist),
+		T(i915_sched_attr),
+		T(i915_sched_node),
+#undef T
+		{}
+	};
+	typeof(*types) *t;
+
+	for (t = types; t->name; t++)
+		pr_info("sizeof(%s): %zu\n", t->name, t->sz);
+
+	return i915_subtests(tests, i915);
+}
-- 
2.20.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

* [Intel-gfx] [PATCH 07/41] drm/i915/selftests: Exercise priority inheritance around an engine loop
  2021-01-25 14:00 [Intel-gfx] [PATCH 01/41] drm/i915/selftests: Check for engine-reset errors in the middle of workarounds Chris Wilson
                   ` (4 preceding siblings ...)
  2021-01-25 14:01 ` [Intel-gfx] [PATCH 06/41] drm/i915/selftests: Measure set-priority duration Chris Wilson
@ 2021-01-25 14:01 ` Chris Wilson
  2021-01-25 14:01 ` [Intel-gfx] [PATCH 08/41] drm/i915: Improve DFS for priority inheritance Chris Wilson
                   ` (38 subsequent siblings)
  44 siblings, 0 replies; 90+ messages in thread
From: Chris Wilson @ 2021-01-25 14:01 UTC (permalink / raw)
  To: intel-gfx; +Cc: thomas.hellstrom, Chris Wilson

Exercise rescheduling priority inheritance around a sequence of requests
that wrap around all the engines.
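
For illustration, the ordering check at the end of the test can be
sketched in standalone C. This is a minimal sketch, not the selftest
itself: stamp_passed() and first_misordered() are hypothetical names,
and the wrap-safe comparison is assumed to be a signed difference of
32-bit counters.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: wrap-safe "a is at or after b" for 32-bit counters. */
static bool stamp_passed(uint32_t a, uint32_t b)
{
	return (int32_t)(a - b) >= 0;
}

/*
 * Walk the recorded timestamps in submission order and return the first
 * slot that does not strictly advance on its predecessor, or -1 if every
 * slot is in order. Equal stamps are flagged as well.
 */
static long first_misordered(const uint32_t *time, unsigned long count)
{
	unsigned long n;
	uint32_t last;

	assert(time != NULL);
	if (count == 0)
		return -1;

	last = time[0];
	for (n = 1; n < count; n++) {
		if (stamp_passed(last, time[n]))
			return (long)n;
		last = time[n];
	}

	return -1;
}
```

With samples { 0xfffffff0, 0xfffffffe, 0x4, 0x10 } the helper reports no
misordering even though the counter wraps; with { 10, 20, 15, 30 } it
flags slot 2.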

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 .../gpu/drm/i915/selftests/i915_scheduler.c   | 225 ++++++++++++++++++
 1 file changed, 225 insertions(+)

diff --git a/drivers/gpu/drm/i915/selftests/i915_scheduler.c b/drivers/gpu/drm/i915/selftests/i915_scheduler.c
index cb67de304aeb..ad2a44449c44 100644
--- a/drivers/gpu/drm/i915/selftests/i915_scheduler.c
+++ b/drivers/gpu/drm/i915/selftests/i915_scheduler.c
@@ -7,6 +7,7 @@
 
 #include "gt/intel_context.h"
 #include "gt/intel_gpu_commands.h"
+#include "gt/intel_ring.h"
 #include "gt/selftest_engine_heartbeat.h"
 #include "selftests/igt_spinner.h"
 #include "selftests/i915_random.h"
@@ -512,10 +513,234 @@ static int igt_priority_chains(void *arg)
 	return igt_schedule_chains(arg, igt_priority);
 }
 
+static struct i915_request *
+__write_timestamp(struct intel_engine_cs *engine,
+		  struct drm_i915_gem_object *obj,
+		  int slot,
+		  struct i915_request *prev)
+{
+	struct i915_request *rq = ERR_PTR(-EINVAL);
+	bool use_64b = INTEL_GEN(engine->i915) >= 8;
+	struct intel_context *ce;
+	struct i915_vma *vma;
+	int err = 0;
+	u32 *cs;
+
+	ce = intel_context_create(engine);
+	if (IS_ERR(ce))
+		return ERR_CAST(ce);
+
+	vma = i915_vma_instance(obj, ce->vm, NULL);
+	if (IS_ERR(vma)) {
+		err = PTR_ERR(vma);
+		goto out_ce;
+	}
+
+	err = i915_vma_pin(vma, 0, 0, PIN_USER);
+	if (err)
+		goto out_ce;
+
+	rq = intel_context_create_request(ce);
+	if (IS_ERR(rq)) {
+		err = PTR_ERR(rq);
+		goto out_unpin;
+	}
+
+	i915_vma_lock(vma);
+	err = i915_vma_move_to_active(vma, rq, EXEC_OBJECT_WRITE);
+	i915_vma_unlock(vma);
+	if (err)
+		goto out_request;
+
+	if (prev) {
+		err = i915_request_await_dma_fence(rq, &prev->fence);
+		if (err)
+			goto out_request;
+	}
+
+	if (engine->emit_init_breadcrumb) {
+		err = engine->emit_init_breadcrumb(rq);
+		if (err)
+			goto out_request;
+	}
+
+	cs = intel_ring_begin(rq, 4);
+	if (IS_ERR(cs)) {
+		err = PTR_ERR(cs);
+		goto out_request;
+	}
+
+	*cs++ = MI_STORE_REGISTER_MEM + use_64b;
+	*cs++ = i915_mmio_reg_offset(RING_TIMESTAMP(engine->mmio_base));
+	*cs++ = lower_32_bits(vma->node.start) + sizeof(u32) * slot;
+	*cs++ = upper_32_bits(vma->node.start);
+	intel_ring_advance(rq, cs);
+
+	i915_request_get(rq);
+out_request:
+	i915_request_add(rq);
+out_unpin:
+	i915_vma_unpin(vma);
+out_ce:
+	intel_context_put(ce);
+	i915_request_put(prev);
+	return err ? ERR_PTR(err) : rq;
+}
+
+static struct i915_request *create_spinner(struct drm_i915_private *i915,
+					   struct igt_spinner *spin)
+{
+	struct intel_engine_cs *engine;
+
+	for_each_uabi_engine(engine, i915) {
+		struct intel_context *ce;
+		struct i915_request *rq;
+
+		if (igt_spinner_init(spin, engine->gt))
+			return ERR_PTR(-ENOMEM);
+
+		ce = intel_context_create(engine);
+		if (IS_ERR(ce))
+			return ERR_CAST(ce);
+
+		rq = igt_spinner_create_request(spin, ce, MI_NOOP);
+		intel_context_put(ce);
+		if (rq == ERR_PTR(-ENODEV))
+			continue;
+		if (IS_ERR(rq))
+			return rq;
+
+		i915_request_get(rq);
+		i915_request_add(rq);
+		return rq;
+	}
+
+	return ERR_PTR(-ENODEV);
+}
+
+static bool has_timestamp(const struct drm_i915_private *i915)
+{
+	return INTEL_GEN(i915) >= 6;
+}
+
+static int __igt_schedule_cycle(struct drm_i915_private *i915,
+				bool (*fn)(struct i915_request *rq,
+					   unsigned long v, unsigned long e))
+{
+	struct intel_engine_cs *engine;
+	struct drm_i915_gem_object *obj;
+	struct igt_spinner spin;
+	struct i915_request *rq;
+	unsigned long count, n;
+	u32 *time, last;
+	int err;
+
+	/*
+	 * Queue a bunch of ordered requests (each waiting on the previous)
+	 * around the engines a couple of times. Each request will write
+	 * the timestamp it executes at into the scratch, with the expectation
+	 * that the timestamp will be in our desired execution order.
+	 */
+
+	if (!i915->caps.scheduler || !has_timestamp(i915))
+		return 0;
+
+	obj = i915_gem_object_create_internal(i915, SZ_64K);
+	if (IS_ERR(obj))
+		return PTR_ERR(obj);
+
+	time = i915_gem_object_pin_map(obj, I915_MAP_WC);
+	if (IS_ERR(time)) {
+		err = PTR_ERR(time);
+		goto out_obj;
+	}
+
+	rq = create_spinner(i915, &spin);
+	if (IS_ERR(rq)) {
+		err = PTR_ERR(rq);
+		goto out_obj;
+	}
+
+	err = 0;
+	count = 0;
+	for_each_uabi_engine(engine, i915) {
+		if (!intel_engine_has_scheduler(engine))
+			continue;
+
+		rq = __write_timestamp(engine, obj, count, rq);
+		if (IS_ERR(rq)) {
+			err = PTR_ERR(rq);
+			break;
+		}
+
+		count++;
+	}
+	for_each_uabi_engine(engine, i915) {
+		if (!intel_engine_has_scheduler(engine))
+			continue;
+
+		rq = __write_timestamp(engine, obj, count, rq);
+		if (IS_ERR(rq)) {
+			err = PTR_ERR(rq);
+			break;
+		}
+
+		count++;
+	}
+	GEM_BUG_ON(count * sizeof(u32) > obj->base.size);
+	if (err || !count)
+		goto out_spin;
+
+	fn(rq, count + 1, count);
+	igt_spinner_end(&spin);
+
+	if (i915_request_wait(rq, 0, HZ / 2) < 0) {
+		err = -ETIME;
+		goto out_request;
+	}
+
+	last = time[0];
+	for (n = 1; n < count; n++) {
+		if (i915_seqno_passed(last, time[n])) {
+			pr_err("Timestamp[%lu] %x before previous %x\n",
+			       n, time[n], last);
+			err = -EINVAL;
+			break;
+		}
+		last = time[n];
+	}
+
+out_request:
+	i915_request_put(rq);
+out_spin:
+	igt_spinner_fini(&spin);
+out_obj:
+	i915_gem_object_put(obj);
+	return err;
+}
+
+static bool noop(struct i915_request *rq, unsigned long v, unsigned long e)
+{
+	return true;
+}
+
+static int igt_schedule_cycle(void *arg)
+{
+	return __igt_schedule_cycle(arg, noop);
+}
+
+static int igt_priority_cycle(void *arg)
+{
+	return __igt_schedule_cycle(arg, igt_priority);
+}
+
 int i915_scheduler_live_selftests(struct drm_i915_private *i915)
 {
 	static const struct i915_subtest tests[] = {
 		SUBTEST(igt_priority_chains),
+
+		SUBTEST(igt_schedule_cycle),
+		SUBTEST(igt_priority_cycle),
 	};
 
 	return i915_subtests(tests, i915);
-- 
2.20.1

* [Intel-gfx] [PATCH 08/41] drm/i915: Improve DFS for priority inheritance
  2021-01-25 14:00 [Intel-gfx] [PATCH 01/41] drm/i915/selftests: Check for engine-reset errors in the middle of workarounds Chris Wilson
                   ` (5 preceding siblings ...)
  2021-01-25 14:01 ` [Intel-gfx] [PATCH 07/41] drm/i915/selftests: Exercise priority inheritance around an engine loop Chris Wilson
@ 2021-01-25 14:01 ` Chris Wilson
  2021-01-26 16:22   ` Tvrtko Ursulin
  2021-01-25 14:01 ` [Intel-gfx] [PATCH 09/41] drm/i915/selftests: Exercise relative mmio paths to non-privileged registers Chris Wilson
                   ` (37 subsequent siblings)
  44 siblings, 1 reply; 90+ messages in thread
From: Chris Wilson @ 2021-01-25 14:01 UTC (permalink / raw)
  To: intel-gfx; +Cc: thomas.hellstrom, Chris Wilson

The core of the scheduling algorithm is that we compute the topological
order of the fence DAG. Knowing that we have a DAG, we should be able to
use a DFS to compute the topological sort in linear time. However,
during the conversion of the recursive algorithm into an iterative one,
the memoization of how far we had progressed down a branch was
forgotten. As a result, instead of running in linear time, the walk ran
in geometric time and could easily take a few hundred milliseconds given
a wide enough graph, rather than the microseconds required.
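
The idea can be sketched outside the driver as follows. This is a
simplified model under stated assumptions, not the code in this patch:
struct node, bump_priority() and the parent/resume fields are
hypothetical stand-ins for the request, its signalers list and the
intrusive stack. The point is that each node remembers where it stopped
in its signaler list, so every edge is inspected exactly once.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_SIGNALERS 4

struct node {
	int prio;
	size_t nr_signalers;
	struct node *signalers[MAX_SIGNALERS];
	struct node *parent;	/* intrusive stack: who pushed us */
	size_t resume;		/* next signaler to inspect on return */
};

/*
 * Raise the priority of root and of everything it waits upon, writing
 * the nodes to out[] with signalers ahead of their waiters. Returns the
 * number of nodes updated; *edges counts how many edges were inspected.
 * Assumes the graph is acyclic and root->prio < prio.
 */
static size_t bump_priority(struct node *root, int prio,
			    struct node **out, size_t *edges)
{
	struct node *n = root;
	size_t pos = 0, count = 0;

	assert(root->prio < prio);
	n->parent = NULL;
	for (;;) {
		while (pos < n->nr_signalers) {
			struct node *s = n->signalers[pos++];

			(*edges)++;
			if (s->prio >= prio)
				continue; /* already high enough, or done */

			/* Remember our position along this branch */
			n->resume = pos;
			s->parent = n;
			n = s;
			pos = 0;
		}

		/* All signalers handled: update this node, then pop */
		n->prio = prio;
		out[count++] = n;

		n = n->parent;
		if (!n)
			break;
		pos = n->resume;
	}

	return count;
}
```

For a diamond (d waits on b and c, which both wait on a) this visits
four nodes and inspects four edges, emitting a, b, c, d.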

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/i915_scheduler.c | 58 ++++++++++++++++-----------
 1 file changed, 34 insertions(+), 24 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_scheduler.c b/drivers/gpu/drm/i915/i915_scheduler.c
index 4802c9b1081d..9139a91f0aa3 100644
--- a/drivers/gpu/drm/i915/i915_scheduler.c
+++ b/drivers/gpu/drm/i915/i915_scheduler.c
@@ -234,6 +234,26 @@ void __i915_priolist_free(struct i915_priolist *p)
 	kmem_cache_free(global.slab_priorities, p);
 }
 
+static struct i915_request *
+stack_push(struct i915_request *rq,
+	   struct i915_request *stack,
+	   struct list_head *pos)
+{
+	stack->sched.dfs.prev = pos;
+	rq->sched.dfs.next = (struct list_head *)stack;
+	return rq;
+}
+
+static struct i915_request *
+stack_pop(struct i915_request *rq,
+	  struct list_head **pos)
+{
+	rq = (struct i915_request *)rq->sched.dfs.next;
+	if (rq)
+		*pos = rq->sched.dfs.prev;
+	return rq;
+}
+
 static inline bool need_preempt(int prio, int active)
 {
 	/*
@@ -298,11 +318,10 @@ static void ipi_priority(struct i915_request *rq, int prio)
 static void __i915_request_set_priority(struct i915_request *rq, int prio)
 {
 	struct intel_engine_cs *engine = rq->engine;
-	struct i915_request *rn;
+	struct list_head *pos = &rq->sched.signalers_list;
 	struct list_head *plist;
-	LIST_HEAD(dfs);
 
-	list_add(&rq->sched.dfs, &dfs);
+	plist = i915_sched_lookup_priolist(engine, prio);
 
 	/*
 	 * Recursively bump all dependent priorities to match the new request.
@@ -322,40 +341,31 @@ static void __i915_request_set_priority(struct i915_request *rq, int prio)
 	 * end result is a topological list of requests in reverse order, the
 	 * last element in the list is the request we must execute first.
 	 */
-	list_for_each_entry(rq, &dfs, sched.dfs) {
-		struct i915_dependency *p;
-
-		/* Also release any children on this engine that are ready */
-		GEM_BUG_ON(rq->engine != engine);
-
-		for_each_signaler(p, rq) {
+	rq->sched.dfs.next = NULL;
+	do {
+		list_for_each_continue(pos, &rq->sched.signalers_list) {
+			struct i915_dependency *p =
+				list_entry(pos, typeof(*p), signal_link);
 			struct i915_request *s =
 				container_of(p->signaler, typeof(*s), sched);
 
-			GEM_BUG_ON(s == rq);
-
 			if (rq_prio(s) >= prio)
 				continue;
 
 			if (__i915_request_is_complete(s))
 				continue;
 
-			if (s->engine != rq->engine) {
+			if (s->engine != engine) {
 				ipi_priority(s, prio);
 				continue;
 			}
 
-			list_move_tail(&s->sched.dfs, &dfs);
+			/* Remember our position along this branch */
+			rq = stack_push(s, rq, pos);
+			pos = &rq->sched.signalers_list;
 		}
-	}
 
-	plist = i915_sched_lookup_priolist(engine, prio);
-
-	/* Fifo and depth-first replacement ensure our deps execute first */
-	list_for_each_entry_safe_reverse(rq, rn, &dfs, sched.dfs) {
-		GEM_BUG_ON(rq->engine != engine);
-
-		INIT_LIST_HEAD(&rq->sched.dfs);
+		RQ_TRACE(rq, "set-priority:%d\n", prio);
 		WRITE_ONCE(rq->sched.attr.priority, prio);
 
 		/*
@@ -369,12 +379,13 @@ static void __i915_request_set_priority(struct i915_request *rq, int prio)
 		if (!i915_request_is_ready(rq))
 			continue;
 
+		GEM_BUG_ON(rq->engine != engine);
 		if (i915_request_in_priority_queue(rq))
 			list_move_tail(&rq->sched.link, plist);
 
 		/* Defer (tasklet) submission until after all updates. */
 		kick_submission(engine, rq, prio);
-	}
+	} while ((rq = stack_pop(rq, &pos)));
 }
 
 void i915_request_set_priority(struct i915_request *rq, int prio)
@@ -444,7 +455,6 @@ void i915_sched_node_init(struct i915_sched_node *node)
 	INIT_LIST_HEAD(&node->signalers_list);
 	INIT_LIST_HEAD(&node->waiters_list);
 	INIT_LIST_HEAD(&node->link);
-	INIT_LIST_HEAD(&node->dfs);
 
 	node->ipi_link = NULL;
 
-- 
2.20.1

* [Intel-gfx] [PATCH 09/41] drm/i915/selftests: Exercise relative mmio paths to non-privileged registers
  2021-01-25 14:00 [Intel-gfx] [PATCH 01/41] drm/i915/selftests: Check for engine-reset errors in the middle of workarounds Chris Wilson
                   ` (6 preceding siblings ...)
  2021-01-25 14:01 ` [Intel-gfx] [PATCH 08/41] drm/i915: Improve DFS for priority inheritance Chris Wilson
@ 2021-01-25 14:01 ` Chris Wilson
  2021-01-25 14:01 ` [Intel-gfx] [PATCH 10/41] drm/i915/selftests: Exercise cross-process context isolation Chris Wilson
                   ` (36 subsequent siblings)
  44 siblings, 0 replies; 90+ messages in thread
From: Chris Wilson @ 2021-01-25 14:01 UTC (permalink / raw)
  To: intel-gfx; +Cc: thomas.hellstrom, Chris Wilson

Verify that context isolation is also preserved when accessing
context-local registers with relative-mmio commands.
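
As a rough sketch of the address handling this adds, assuming a
hypothetical resolve_access() helper and an illustrative value for the
relative-addressing flag bit (CS_MMIO_FLAG below is a placeholder, not
taken from the driver headers): when the default context image marks a
register block as engine-relative, only the low 12 bits of the offset
are kept, and the test either sets the flag on its own command or adds
the engine's mmio base by hand.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: bit selecting engine-relative addressing (illustrative). */
#define CS_MMIO_FLAG (UINT32_C(1) << 19)

struct reg_access {
	uint32_t cmd;	/* command dword to emit */
	uint32_t addr;	/* register offset to emit */
};

/*
 * Decide how to address one register taken from the context image.
 * header:    the load-register header found in the image
 * reg:       the register offset that follows it
 * base_cmd:  the plain (absolute) opcode the caller wants to emit
 * mmio_base: the engine's mmio base, used for absolute addressing
 */
static struct reg_access
resolve_access(uint32_t header, uint32_t reg, uint32_t base_cmd,
	       uint32_t mmio_base, bool relative)
{
	struct reg_access out = { .cmd = base_cmd, .addr = reg };

	/* The caller passes the opcode without the relative flag set. */
	assert(!(base_cmd & CS_MMIO_FLAG));

	if (header & CS_MMIO_FLAG) {
		out.addr = reg & 0xfff;
		if (relative)
			out.cmd |= CS_MMIO_FLAG;	/* HW adds the base */
		else
			out.addr += mmio_base;		/* we add the base */
	}

	return out;
}
```

Both modes should name the same physical register, which is what lets
the isolation test compare absolute and relative accesses.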

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/gt/selftest_lrc.c | 88 ++++++++++++++++++++------
 1 file changed, 67 insertions(+), 21 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/selftest_lrc.c b/drivers/gpu/drm/i915/gt/selftest_lrc.c
index 7bf34c439876..0524232378e4 100644
--- a/drivers/gpu/drm/i915/gt/selftest_lrc.c
+++ b/drivers/gpu/drm/i915/gt/selftest_lrc.c
@@ -910,7 +910,9 @@ create_user_vma(struct i915_address_space *vm, unsigned long size)
 }
 
 static struct i915_vma *
-store_context(struct intel_context *ce, struct i915_vma *scratch)
+store_context(struct intel_context *ce,
+	      struct i915_vma *scratch,
+	      bool relative)
 {
 	struct i915_vma *batch;
 	u32 dw, x, *cs, *hw;
@@ -939,6 +941,9 @@ store_context(struct intel_context *ce, struct i915_vma *scratch)
 	hw += LRC_STATE_OFFSET / sizeof(*hw);
 	do {
 		u32 len = hw[dw] & 0x7f;
+		u32 cmd = MI_STORE_REGISTER_MEM_GEN8;
+		u32 offset = 0;
+		u32 mask = ~0;
 
 		if (hw[dw] == 0) {
 			dw++;
@@ -950,11 +955,19 @@ store_context(struct intel_context *ce, struct i915_vma *scratch)
 			continue;
 		}
 
+		if (hw[dw] & MI_LRI_LRM_CS_MMIO) {
+			mask = 0xfff;
+			if (relative)
+				cmd |= MI_LRI_LRM_CS_MMIO;
+			else
+				offset = ce->engine->mmio_base;
+		}
+
 		dw++;
 		len = (len + 1) / 2;
 		while (len--) {
-			*cs++ = MI_STORE_REGISTER_MEM_GEN8;
-			*cs++ = hw[dw];
+			*cs++ = cmd;
+			*cs++ = (hw[dw] & mask) + offset;
 			*cs++ = lower_32_bits(scratch->node.start + x);
 			*cs++ = upper_32_bits(scratch->node.start + x);
 
@@ -993,6 +1006,7 @@ static struct i915_request *
 record_registers(struct intel_context *ce,
 		 struct i915_vma *before,
 		 struct i915_vma *after,
+		 bool relative,
 		 u32 *sema)
 {
 	struct i915_vma *b_before, *b_after;
@@ -1000,11 +1014,11 @@ record_registers(struct intel_context *ce,
 	u32 *cs;
 	int err;
 
-	b_before = store_context(ce, before);
+	b_before = store_context(ce, before, relative);
 	if (IS_ERR(b_before))
 		return ERR_CAST(b_before);
 
-	b_after = store_context(ce, after);
+	b_after = store_context(ce, after, relative);
 	if (IS_ERR(b_after)) {
 		rq = ERR_CAST(b_after);
 		goto err_before;
@@ -1074,7 +1088,8 @@ record_registers(struct intel_context *ce,
 	goto err_after;
 }
 
-static struct i915_vma *load_context(struct intel_context *ce, u32 poison)
+static struct i915_vma *
+load_context(struct intel_context *ce, u32 poison, bool relative)
 {
 	struct i915_vma *batch;
 	u32 dw, *cs, *hw;
@@ -1101,7 +1116,10 @@ static struct i915_vma *load_context(struct intel_context *ce, u32 poison)
 	hw = defaults;
 	hw += LRC_STATE_OFFSET / sizeof(*hw);
 	do {
+		u32 cmd = MI_INSTR(0x22, 0);
 		u32 len = hw[dw] & 0x7f;
+		u32 offset = 0;
+		u32 mask = ~0;
 
 		if (hw[dw] == 0) {
 			dw++;
@@ -1113,11 +1131,19 @@ static struct i915_vma *load_context(struct intel_context *ce, u32 poison)
 			continue;
 		}
 
+		if (hw[dw] & MI_LRI_LRM_CS_MMIO) {
+			mask = 0xfff;
+			if (relative)
+				cmd |= MI_LRI_LRM_CS_MMIO;
+			else
+				offset = ce->engine->mmio_base;
+		}
+
 		dw++;
+		*cs++ = cmd | len;
 		len = (len + 1) / 2;
-		*cs++ = MI_LOAD_REGISTER_IMM(len);
 		while (len--) {
-			*cs++ = hw[dw];
+			*cs++ = (hw[dw] & mask) + offset;
 			*cs++ = poison;
 			dw += 2;
 		}
@@ -1134,14 +1160,18 @@ static struct i915_vma *load_context(struct intel_context *ce, u32 poison)
 	return batch;
 }
 
-static int poison_registers(struct intel_context *ce, u32 poison, u32 *sema)
+static int
+poison_registers(struct intel_context *ce,
+		 u32 poison,
+		 bool relative,
+		 u32 *sema)
 {
 	struct i915_request *rq;
 	struct i915_vma *batch;
 	u32 *cs;
 	int err;
 
-	batch = load_context(ce, poison);
+	batch = load_context(ce, poison, relative);
 	if (IS_ERR(batch))
 		return PTR_ERR(batch);
 
@@ -1191,7 +1221,7 @@ static int compare_isolation(struct intel_engine_cs *engine,
 			     struct i915_vma *ref[2],
 			     struct i915_vma *result[2],
 			     struct intel_context *ce,
-			     u32 poison)
+			     u32 poison, bool relative)
 {
 	u32 x, dw, *hw, *lrc;
 	u32 *A[2], *B[2];
@@ -1240,6 +1270,7 @@ static int compare_isolation(struct intel_engine_cs *engine,
 	hw += LRC_STATE_OFFSET / sizeof(*hw);
 	do {
 		u32 len = hw[dw] & 0x7f;
+		bool is_relative = relative;
 
 		if (hw[dw] == 0) {
 			dw++;
@@ -1251,6 +1282,9 @@ static int compare_isolation(struct intel_engine_cs *engine,
 			continue;
 		}
 
+		if (!(hw[dw] & MI_LRI_LRM_CS_MMIO))
+			is_relative = false;
+
 		dw++;
 		len = (len + 1) / 2;
 		while (len--) {
@@ -1262,9 +1296,10 @@ static int compare_isolation(struct intel_engine_cs *engine,
 					break;
 
 				default:
-					pr_err("%s[%d]: Mismatch for register %4x, default %08x, reference %08x, result (%08x, %08x), poison %08x, context %08x\n",
-					       engine->name, dw,
-					       hw[dw], hw[dw + 1],
+					pr_err("%s[%d]: Mismatch for register %4x [using relative? %s], default %08x, reference %08x, result (%08x, %08x), poison %08x, context %08x\n",
+					       engine->name, dw, hw[dw],
+					       yesno(is_relative),
+					       hw[dw + 1],
 					       A[0][x], B[0][x], B[1][x],
 					       poison, lrc[dw + 1]);
 					err = -EINVAL;
@@ -1290,7 +1325,8 @@ static int compare_isolation(struct intel_engine_cs *engine,
 	return err;
 }
 
-static int __lrc_isolation(struct intel_engine_cs *engine, u32 poison)
+static int
+__lrc_isolation(struct intel_engine_cs *engine, u32 poison, bool relative)
 {
 	u32 *sema = memset32(engine->status_page.addr + 1000, 0, 1);
 	struct i915_vma *ref[2], *result[2];
@@ -1320,7 +1356,7 @@ static int __lrc_isolation(struct intel_engine_cs *engine, u32 poison)
 		goto err_ref0;
 	}
 
-	rq = record_registers(A, ref[0], ref[1], sema);
+	rq = record_registers(A, ref[0], ref[1], relative, sema);
 	if (IS_ERR(rq)) {
 		err = PTR_ERR(rq);
 		goto err_ref1;
@@ -1348,13 +1384,13 @@ static int __lrc_isolation(struct intel_engine_cs *engine, u32 poison)
 		goto err_result0;
 	}
 
-	rq = record_registers(A, result[0], result[1], sema);
+	rq = record_registers(A, result[0], result[1], relative, sema);
 	if (IS_ERR(rq)) {
 		err = PTR_ERR(rq);
 		goto err_result1;
 	}
 
-	err = poison_registers(B, poison, sema);
+	err = poison_registers(B, poison, relative, sema);
 	if (err) {
 		WRITE_ONCE(*sema, -1);
 		i915_request_put(rq);
@@ -1368,7 +1404,7 @@ static int __lrc_isolation(struct intel_engine_cs *engine, u32 poison)
 	}
 	i915_request_put(rq);
 
-	err = compare_isolation(engine, ref, result, A, poison);
+	err = compare_isolation(engine, ref, result, A, poison, relative);
 
 err_result1:
 	i915_vma_put(result[1]);
@@ -1430,13 +1466,23 @@ static int live_lrc_isolation(void *arg)
 		for (i = 0; i < ARRAY_SIZE(poison); i++) {
 			int result;
 
-			result = __lrc_isolation(engine, poison[i]);
+			result = __lrc_isolation(engine, poison[i], false);
 			if (result && !err)
 				err = result;
 
-			result = __lrc_isolation(engine, ~poison[i]);
+			result = __lrc_isolation(engine, ~poison[i], false);
 			if (result && !err)
 				err = result;
+
+			if (intel_engine_has_relative_mmio(engine)) {
+				result = __lrc_isolation(engine, poison[i], true);
+				if (result && !err)
+					err = result;
+
+				result = __lrc_isolation(engine, ~poison[i], true);
+				if (result && !err)
+					err = result;
+			}
 		}
 		intel_engine_pm_put(engine);
 		if (igt_flush_test(gt->i915)) {
-- 
2.20.1

* [Intel-gfx] [PATCH 10/41] drm/i915/selftests: Exercise cross-process context isolation
  2021-01-25 14:00 [Intel-gfx] [PATCH 01/41] drm/i915/selftests: Check for engine-reset errors in the middle of workarounds Chris Wilson
                   ` (7 preceding siblings ...)
  2021-01-25 14:01 ` [Intel-gfx] [PATCH 09/41] drm/i915/selftests: Exercise relative mmio paths to non-privileged registers Chris Wilson
@ 2021-01-25 14:01 ` Chris Wilson
  2021-01-25 14:01 ` [Intel-gfx] [PATCH 11/41] drm/i915: Extract request submission from execlists Chris Wilson
                   ` (35 subsequent siblings)
  44 siblings, 0 replies; 90+ messages in thread
From: Chris Wilson @ 2021-01-25 14:01 UTC (permalink / raw)
  To: intel-gfx; +Cc: thomas.hellstrom, Chris Wilson

Verify that one context running on engine A cannot manipulate another
client's context concurrently running on engine B using unprivileged
access.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/gt/selftest_lrc.c | 275 +++++++++++++++++++++----
 1 file changed, 238 insertions(+), 37 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/selftest_lrc.c b/drivers/gpu/drm/i915/gt/selftest_lrc.c
index 0524232378e4..e97adf1b7729 100644
--- a/drivers/gpu/drm/i915/gt/selftest_lrc.c
+++ b/drivers/gpu/drm/i915/gt/selftest_lrc.c
@@ -911,6 +911,7 @@ create_user_vma(struct i915_address_space *vm, unsigned long size)
 
 static struct i915_vma *
 store_context(struct intel_context *ce,
+	      struct intel_engine_cs *engine,
 	      struct i915_vma *scratch,
 	      bool relative)
 {
@@ -928,7 +929,7 @@ store_context(struct intel_context *ce,
 		return ERR_CAST(cs);
 	}
 
-	defaults = shmem_pin_map(ce->engine->default_state);
+	defaults = shmem_pin_map(engine->default_state);
 	if (!defaults) {
 		i915_gem_object_unpin_map(batch->obj);
 		i915_vma_put(batch);
@@ -960,7 +961,7 @@ store_context(struct intel_context *ce,
 			if (relative)
 				cmd |= MI_LRI_LRM_CS_MMIO;
 			else
-				offset = ce->engine->mmio_base;
+				offset = engine->mmio_base;
 		}
 
 		dw++;
@@ -979,7 +980,7 @@ store_context(struct intel_context *ce,
 
 	*cs++ = MI_BATCH_BUFFER_END;
 
-	shmem_unpin_map(ce->engine->default_state, defaults);
+	shmem_unpin_map(engine->default_state, defaults);
 
 	i915_gem_object_flush_map(batch->obj);
 	i915_gem_object_unpin_map(batch->obj);
@@ -1002,23 +1003,48 @@ static int move_to_active(struct i915_request *rq,
 	return err;
 }
 
+struct hwsp_semaphore {
+	u32 ggtt;
+	u32 *va;
+};
+
+static struct hwsp_semaphore hwsp_semaphore(struct intel_engine_cs *engine)
+{
+	struct hwsp_semaphore s;
+
+	s.va = memset32(engine->status_page.addr + 1000, 0, 1);
+	s.ggtt = (i915_ggtt_offset(engine->status_page.vma) +
+		  offset_in_page(s.va));
+
+	return s;
+}
+
+static u32 *emit_noops(u32 *cs, int count)
+{
+	while (count--)
+		*cs++ = MI_NOOP;
+
+	return cs;
+}
+
 static struct i915_request *
 record_registers(struct intel_context *ce,
+		 struct intel_engine_cs *engine,
 		 struct i915_vma *before,
 		 struct i915_vma *after,
 		 bool relative,
-		 u32 *sema)
+		 const struct hwsp_semaphore *sema)
 {
 	struct i915_vma *b_before, *b_after;
 	struct i915_request *rq;
 	u32 *cs;
 	int err;
 
-	b_before = store_context(ce, before, relative);
+	b_before = store_context(ce, engine, before, relative);
 	if (IS_ERR(b_before))
 		return ERR_CAST(b_before);
 
-	b_after = store_context(ce, after, relative);
+	b_after = store_context(ce, engine, after, relative);
 	if (IS_ERR(b_after)) {
 		rq = ERR_CAST(b_after);
 		goto err_before;
@@ -1044,7 +1070,7 @@ record_registers(struct intel_context *ce,
 	if (err)
 		goto err_rq;
 
-	cs = intel_ring_begin(rq, 14);
+	cs = intel_ring_begin(rq, 18);
 	if (IS_ERR(cs)) {
 		err = PTR_ERR(cs);
 		goto err_rq;
@@ -1055,16 +1081,28 @@ record_registers(struct intel_context *ce,
 	*cs++ = lower_32_bits(b_before->node.start);
 	*cs++ = upper_32_bits(b_before->node.start);
 
-	*cs++ = MI_ARB_ON_OFF | MI_ARB_ENABLE;
-	*cs++ = MI_SEMAPHORE_WAIT |
-		MI_SEMAPHORE_GLOBAL_GTT |
-		MI_SEMAPHORE_POLL |
-		MI_SEMAPHORE_SAD_NEQ_SDD;
-	*cs++ = 0;
-	*cs++ = i915_ggtt_offset(ce->engine->status_page.vma) +
-		offset_in_page(sema);
-	*cs++ = 0;
-	*cs++ = MI_NOOP;
+	if (sema) {
+		WRITE_ONCE(*sema->va, -1);
+
+		/* Signal the poisoner */
+		*cs++ = MI_STORE_DWORD_IMM_GEN4 | MI_USE_GGTT;
+		*cs++ = sema->ggtt;
+		*cs++ = 0;
+		*cs++ = 0;
+
+		/* Then wait for the poison to settle */
+		*cs++ = MI_ARB_ON_OFF | MI_ARB_ENABLE;
+		*cs++ = MI_SEMAPHORE_WAIT |
+			MI_SEMAPHORE_GLOBAL_GTT |
+			MI_SEMAPHORE_POLL |
+			MI_SEMAPHORE_SAD_NEQ_SDD;
+		*cs++ = 0;
+		*cs++ = sema->ggtt;
+		*cs++ = 0;
+		*cs++ = MI_NOOP;
+	} else {
+		cs = emit_noops(cs, 10);
+	}
 
 	*cs++ = MI_ARB_ON_OFF | MI_ARB_DISABLE;
 	*cs++ = MI_BATCH_BUFFER_START_GEN8 | BIT(8);
@@ -1073,7 +1111,6 @@ record_registers(struct intel_context *ce,
 
 	intel_ring_advance(rq, cs);
 
-	WRITE_ONCE(*sema, 0);
 	i915_request_get(rq);
 	i915_request_add(rq);
 err_after:
@@ -1089,7 +1126,9 @@ record_registers(struct intel_context *ce,
 }
 
 static struct i915_vma *
-load_context(struct intel_context *ce, u32 poison, bool relative)
+load_context(struct intel_context *ce,
+	     struct intel_engine_cs *engine,
+	     u32 poison, bool relative)
 {
 	struct i915_vma *batch;
 	u32 dw, *cs, *hw;
@@ -1105,7 +1144,7 @@ load_context(struct intel_context *ce, u32 poison, bool relative)
 		return ERR_CAST(cs);
 	}
 
-	defaults = shmem_pin_map(ce->engine->default_state);
+	defaults = shmem_pin_map(engine->default_state);
 	if (!defaults) {
 		i915_gem_object_unpin_map(batch->obj);
 		i915_vma_put(batch);
@@ -1136,7 +1175,7 @@ load_context(struct intel_context *ce, u32 poison, bool relative)
 			if (relative)
 				cmd |= MI_LRI_LRM_CS_MMIO;
 			else
-				offset = ce->engine->mmio_base;
+				offset = engine->mmio_base;
 		}
 
 		dw++;
@@ -1152,7 +1191,7 @@ load_context(struct intel_context *ce, u32 poison, bool relative)
 
 	*cs++ = MI_BATCH_BUFFER_END;
 
-	shmem_unpin_map(ce->engine->default_state, defaults);
+	shmem_unpin_map(engine->default_state, defaults);
 
 	i915_gem_object_flush_map(batch->obj);
 	i915_gem_object_unpin_map(batch->obj);
@@ -1162,16 +1201,17 @@ load_context(struct intel_context *ce, u32 poison, bool relative)
 
 static int
 poison_registers(struct intel_context *ce,
+		 struct intel_engine_cs *engine,
 		 u32 poison,
 		 bool relative,
-		 u32 *sema)
+		 const struct hwsp_semaphore *sema)
 {
 	struct i915_request *rq;
 	struct i915_vma *batch;
 	u32 *cs;
 	int err;
 
-	batch = load_context(ce, poison, relative);
+	batch = load_context(ce, engine, poison, relative);
 	if (IS_ERR(batch))
 		return PTR_ERR(batch);
 
@@ -1185,20 +1225,29 @@ poison_registers(struct intel_context *ce,
 	if (err)
 		goto err_rq;
 
-	cs = intel_ring_begin(rq, 8);
+	cs = intel_ring_begin(rq, 14);
 	if (IS_ERR(cs)) {
 		err = PTR_ERR(cs);
 		goto err_rq;
 	}
 
+	*cs++ = MI_ARB_ON_OFF | MI_ARB_ENABLE;
+	*cs++ = MI_SEMAPHORE_WAIT |
+		MI_SEMAPHORE_GLOBAL_GTT |
+		MI_SEMAPHORE_POLL |
+		MI_SEMAPHORE_SAD_EQ_SDD;
+	*cs++ = 0;
+	*cs++ = sema->ggtt;
+	*cs++ = 0;
+	*cs++ = MI_NOOP;
+
 	*cs++ = MI_ARB_ON_OFF | MI_ARB_DISABLE;
 	*cs++ = MI_BATCH_BUFFER_START_GEN8 | BIT(8);
 	*cs++ = lower_32_bits(batch->node.start);
 	*cs++ = upper_32_bits(batch->node.start);
 
 	*cs++ = MI_STORE_DWORD_IMM_GEN4 | MI_USE_GGTT;
-	*cs++ = i915_ggtt_offset(ce->engine->status_page.vma) +
-		offset_in_page(sema);
+	*cs++ = sema->ggtt;
 	*cs++ = 0;
 	*cs++ = 1;
 
@@ -1258,7 +1307,7 @@ static int compare_isolation(struct intel_engine_cs *engine,
 	}
 	lrc += LRC_STATE_OFFSET / sizeof(*hw);
 
-	defaults = shmem_pin_map(ce->engine->default_state);
+	defaults = shmem_pin_map(engine->default_state);
 	if (!defaults) {
 		err = -ENOMEM;
 		goto err_lrc;
@@ -1311,7 +1360,7 @@ static int compare_isolation(struct intel_engine_cs *engine,
 	} while (dw < PAGE_SIZE / sizeof(u32) &&
 		 (hw[dw] & ~BIT(0)) != MI_BATCH_BUFFER_END);
 
-	shmem_unpin_map(ce->engine->default_state, defaults);
+	shmem_unpin_map(engine->default_state, defaults);
 err_lrc:
 	i915_gem_object_unpin_map(ce->state->obj);
 err_B1:
@@ -1328,7 +1377,7 @@ static int compare_isolation(struct intel_engine_cs *engine,
 static int
 __lrc_isolation(struct intel_engine_cs *engine, u32 poison, bool relative)
 {
-	u32 *sema = memset32(engine->status_page.addr + 1000, 0, 1);
+	struct hwsp_semaphore sema = hwsp_semaphore(engine);
 	struct i915_vma *ref[2], *result[2];
 	struct intel_context *A, *B;
 	struct i915_request *rq;
@@ -1356,15 +1405,12 @@ __lrc_isolation(struct intel_engine_cs *engine, u32 poison, bool relative)
 		goto err_ref0;
 	}
 
-	rq = record_registers(A, ref[0], ref[1], relative, sema);
+	rq = record_registers(A, engine, ref[0], ref[1], relative, NULL);
 	if (IS_ERR(rq)) {
 		err = PTR_ERR(rq);
 		goto err_ref1;
 	}
 
-	WRITE_ONCE(*sema, 1);
-	wmb();
-
 	if (i915_request_wait(rq, 0, HZ / 2) < 0) {
 		i915_request_put(rq);
 		err = -ETIME;
@@ -1384,15 +1430,15 @@ __lrc_isolation(struct intel_engine_cs *engine, u32 poison, bool relative)
 		goto err_result0;
 	}
 
-	rq = record_registers(A, result[0], result[1], relative, sema);
+	rq = record_registers(A, engine, result[0], result[1], relative, &sema);
 	if (IS_ERR(rq)) {
 		err = PTR_ERR(rq);
 		goto err_result1;
 	}
 
-	err = poison_registers(B, poison, relative, sema);
+	err = poison_registers(B, engine, poison, relative, &sema);
 	if (err) {
-		WRITE_ONCE(*sema, -1);
+		WRITE_ONCE(*sema.va, -1);
 		i915_request_put(rq);
 		goto err_result1;
 	}
@@ -1494,6 +1540,160 @@ static int live_lrc_isolation(void *arg)
 	return err;
 }
 
+static int __lrc_cross(struct intel_engine_cs *a,
+		       struct intel_engine_cs *b,
+		       u32 poison)
+{
+	struct hwsp_semaphore sema = hwsp_semaphore(a);
+	struct i915_vma *ref[2], *result[2];
+	struct intel_context *A, *B;
+	struct i915_request *rq;
+	int err;
+
+	GEM_BUG_ON(a->gt->ggtt != b->gt->ggtt);
+
+	pr_debug("Context on %s, poisoning from %s with %08x\n",
+		 a->name, b->name, poison);
+
+	A = intel_context_create(a);
+	if (IS_ERR(A))
+		return PTR_ERR(A);
+
+	B = intel_context_create(b);
+	if (IS_ERR(B)) {
+		err = PTR_ERR(B);
+		goto err_A;
+	}
+
+	ref[0] = create_user_vma(A->vm, SZ_64K);
+	if (IS_ERR(ref[0])) {
+		err = PTR_ERR(ref[0]);
+		goto err_B;
+	}
+
+	ref[1] = create_user_vma(A->vm, SZ_64K);
+	if (IS_ERR(ref[1])) {
+		err = PTR_ERR(ref[1]);
+		goto err_ref0;
+	}
+
+	rq = record_registers(A, a, ref[0], ref[1], false, NULL);
+	if (IS_ERR(rq)) {
+		err = PTR_ERR(rq);
+		goto err_ref1;
+	}
+
+	if (i915_request_wait(rq, 0, HZ / 2) < 0) {
+		i915_request_put(rq);
+		err = -ETIME;
+		goto err_ref1;
+	}
+	i915_request_put(rq);
+
+	result[0] = create_user_vma(A->vm, SZ_64K);
+	if (IS_ERR(result[0])) {
+		err = PTR_ERR(result[0]);
+		goto err_ref1;
+	}
+
+	result[1] = create_user_vma(A->vm, SZ_64K);
+	if (IS_ERR(result[1])) {
+		err = PTR_ERR(result[1]);
+		goto err_result0;
+	}
+
+	rq = record_registers(A, a, result[0], result[1], false, &sema);
+	if (IS_ERR(rq)) {
+		err = PTR_ERR(rq);
+		goto err_result1;
+	}
+
+	err = poison_registers(B, a, poison, false, &sema);
+	if (err) {
+		WRITE_ONCE(*sema.va, -1);
+		i915_request_put(rq);
+		goto err_result1;
+	}
+
+	if (i915_request_wait(rq, 0, HZ / 2) < 0) {
+		i915_request_put(rq);
+		err = -ETIME;
+		goto err_result1;
+	}
+	i915_request_put(rq);
+
+	err = compare_isolation(a, ref, result, A, poison, false);
+
+err_result1:
+	i915_vma_put(result[1]);
+err_result0:
+	i915_vma_put(result[0]);
+err_ref1:
+	i915_vma_put(ref[1]);
+err_ref0:
+	i915_vma_put(ref[0]);
+err_B:
+	intel_context_put(B);
+err_A:
+	intel_context_put(A);
+	return err;
+}
+
+static int live_lrc_cross(void *arg)
+{
+	struct intel_gt *gt = arg;
+	struct intel_engine_cs *a, *b;
+	enum intel_engine_id a_id, b_id;
+	const u32 poison[] = {
+		STACK_MAGIC,
+		0x3a3a3a3a,
+		0x5c5c5c5c,
+		0xffffffff,
+		0xffff0000,
+	};
+	int err = 0;
+	int i;
+
+	/*
+	 * Our goal is to try and tamper with another client's context
+	 * running concurrently. The HW's goal is to stop us.
+	 */
+
+	for_each_engine(a, gt, a_id) {
+		if (!IS_ENABLED(CONFIG_DRM_I915_SELFTEST_BROKEN) &&
+		    skip_isolation(a))
+			continue;
+
+		intel_engine_pm_get(a);
+		for_each_engine(b, gt, b_id) {
+			if (a == b)
+				continue;
+
+			intel_engine_pm_get(b);
+			for (i = 0; i < ARRAY_SIZE(poison); i++) {
+				int result;
+
+				result = __lrc_cross(a, b, poison[i]);
+				if (result && !err)
+					err = result;
+
+				result = __lrc_cross(a, b, ~poison[i]);
+				if (result && !err)
+					err = result;
+			}
+			intel_engine_pm_put(b);
+		}
+		intel_engine_pm_put(a);
+
+		if (igt_flush_test(gt->i915)) {
+			err = -EIO;
+			break;
+		}
+	}
+
+	return err;
+}
+
 static int indirect_ctx_submit_req(struct intel_context *ce)
 {
 	struct i915_request *rq;
@@ -1884,6 +2084,7 @@ int intel_lrc_live_selftests(struct drm_i915_private *i915)
 		SUBTEST(live_lrc_isolation),
 		SUBTEST(live_lrc_timestamp),
 		SUBTEST(live_lrc_garbage),
+		SUBTEST(live_lrc_cross),
 		SUBTEST(live_pphwsp_runtime),
 		SUBTEST(live_lrc_indirect_ctx_bb),
 	};
-- 
2.20.1


* [Intel-gfx] [PATCH 11/41] drm/i915: Extract request submission from execlists
  2021-01-25 14:00 [Intel-gfx] [PATCH 01/41] drm/i915/selftests: Check for engine-reset errors in the middle of workarounds Chris Wilson
                   ` (8 preceding siblings ...)
  2021-01-25 14:01 ` [Intel-gfx] [PATCH 10/41] drm/i915/selftests: Exercise cross-process context isolation Chris Wilson
@ 2021-01-25 14:01 ` Chris Wilson
  2021-01-26 16:28   ` Tvrtko Ursulin
  2021-01-25 14:01 ` [Intel-gfx] [PATCH 12/41] drm/i915: Extract request rewinding " Chris Wilson
                   ` (34 subsequent siblings)
  44 siblings, 1 reply; 90+ messages in thread
From: Chris Wilson @ 2021-01-25 14:01 UTC (permalink / raw)
  To: intel-gfx; +Cc: thomas.hellstrom, Chris Wilson

In the process of preparing to reuse the request submission logic for
other backends, lift it out of the execlists backend. It already
operates on the common structs, so it is just a matter of moving and
renaming.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
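Reader's note, not part of the patch: a minimal sketch of the decision
made by the shared enqueue path, assuming counters stand in for the
priority queue, the hold list and the tasklet. The enq_* names are
hypothetical, and starting the hint at INT_MIN is an assumption of this
sketch rather than something shown in the diff.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Hypothetical model; none of these names exist in i915. */
struct enq_engine {
	int queue_priority_hint; /* highest priority queued so far */
	int queued;              /* requests placed on the priority queue */
	int held;                /* requests parked on the hold list */
	int kicks;               /* times the submission tasklet was scheduled */
};

struct enq_request {
	int prio;
	bool ancestor_on_hold;
};

static void enq_engine_init(struct enq_engine *e)
{
	e->queue_priority_hint = INT_MIN; /* assumption: first request kicks */
	e->queued = 0;
	e->held = 0;
	e->kicks = 0;
}

/* Mirrors submit_queue(): only a strictly higher priority needs a kick. */
static bool enq_submit_queue(struct enq_engine *e,
			     const struct enq_request *rq)
{
	if (rq->prio <= e->queue_priority_hint)
		return false;

	e->queue_priority_hint = rq->prio;
	return true;
}

static void enq_request_enqueue(struct enq_engine *e,
				const struct enq_request *rq)
{
	bool kick = false;

	/* In the driver this part runs under engine->active.lock. */
	if (rq->ancestor_on_hold) {
		e->held++;
	} else {
		e->queued++;
		kick = enq_submit_queue(e, rq);
	}

	/* A kick implies the hint now equals this request's priority. */
	assert(!kick || e->queue_priority_hint == rq->prio);

	/* The lock is dropped before the tasklet is scheduled. */
	if (kick)
		e->kicks++;
}
```

In this model a second request at the same priority does not schedule
the tasklet again, and a request with an ancestor on hold never does.
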
 .../drm/i915/gt/intel_execlists_submission.c  | 55 +------------
 .../gpu/drm/i915/gt/uc/intel_guc_submission.c | 30 +------
 drivers/gpu/drm/i915/i915_scheduler.c         | 82 +++++++++++++++++++
 drivers/gpu/drm/i915/i915_scheduler.h         |  2 +
 4 files changed, 86 insertions(+), 83 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
index 309fb421ff5c..e6acdd8dc361 100644
--- a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
+++ b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
@@ -2404,59 +2404,6 @@ static void execlists_preempt(struct timer_list *timer)
 	execlists_kick(timer, preempt);
 }
 
-static void queue_request(struct intel_engine_cs *engine,
-			  struct i915_request *rq)
-{
-	GEM_BUG_ON(!list_empty(&rq->sched.link));
-	list_add_tail(&rq->sched.link,
-		      i915_sched_lookup_priolist(engine, rq_prio(rq)));
-	set_bit(I915_FENCE_FLAG_PQUEUE, &rq->fence.flags);
-}
-
-static bool submit_queue(struct intel_engine_cs *engine,
-			 const struct i915_request *rq)
-{
-	struct intel_engine_execlists *execlists = &engine->execlists;
-
-	if (rq_prio(rq) <= execlists->queue_priority_hint)
-		return false;
-
-	execlists->queue_priority_hint = rq_prio(rq);
-	return true;
-}
-
-static bool ancestor_on_hold(const struct intel_engine_cs *engine,
-			     const struct i915_request *rq)
-{
-	GEM_BUG_ON(i915_request_on_hold(rq));
-	return !list_empty(&engine->active.hold) && hold_request(rq);
-}
-
-static void execlists_submit_request(struct i915_request *request)
-{
-	struct intel_engine_cs *engine = request->engine;
-	unsigned long flags;
-
-	/* Will be called from irq-context when using foreign fences. */
-	spin_lock_irqsave(&engine->active.lock, flags);
-
-	if (unlikely(ancestor_on_hold(engine, request))) {
-		RQ_TRACE(request, "ancestor on hold\n");
-		list_add_tail(&request->sched.link, &engine->active.hold);
-		i915_request_set_hold(request);
-	} else {
-		queue_request(engine, request);
-
-		GEM_BUG_ON(RB_EMPTY_ROOT(&engine->execlists.queue.rb_root));
-		GEM_BUG_ON(list_empty(&request->sched.link));
-
-		if (submit_queue(engine, request))
-			__execlists_kick(&engine->execlists);
-	}
-
-	spin_unlock_irqrestore(&engine->active.lock, flags);
-}
-
 static int execlists_context_pre_pin(struct intel_context *ce,
 				     struct i915_gem_ww_ctx *ww,
 				     void **vaddr)
@@ -3072,7 +3019,7 @@ static bool can_preempt(struct intel_engine_cs *engine)
 
 static void execlists_set_default_submission(struct intel_engine_cs *engine)
 {
-	engine->submit_request = execlists_submit_request;
+	engine->submit_request = i915_request_enqueue;
 	engine->execlists.tasklet.func = execlists_submission_tasklet;
 
 	engine->reset.prepare = execlists_reset_prepare;
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
index 53cf68e240c3..4f1eee4fbfb2 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
@@ -510,34 +510,6 @@ static int guc_request_alloc(struct i915_request *request)
 	return 0;
 }
 
-static inline void queue_request(struct intel_engine_cs *engine,
-				 struct i915_request *rq,
-				 int prio)
-{
-	GEM_BUG_ON(!list_empty(&rq->sched.link));
-	list_add_tail(&rq->sched.link,
-		      i915_sched_lookup_priolist(engine, prio));
-	set_bit(I915_FENCE_FLAG_PQUEUE, &rq->fence.flags);
-}
-
-static void guc_submit_request(struct i915_request *rq)
-{
-	struct intel_engine_cs *engine = rq->engine;
-	unsigned long flags;
-
-	/* Will be called from irq-context when using foreign fences. */
-	spin_lock_irqsave(&engine->active.lock, flags);
-
-	queue_request(engine, rq, rq_prio(rq));
-
-	GEM_BUG_ON(RB_EMPTY_ROOT(&engine->execlists.queue.rb_root));
-	GEM_BUG_ON(list_empty(&rq->sched.link));
-
-	tasklet_hi_schedule(&engine->execlists.tasklet);
-
-	spin_unlock_irqrestore(&engine->active.lock, flags);
-}
-
 static void sanitize_hwsp(struct intel_engine_cs *engine)
 {
 	struct intel_timeline *tl;
@@ -606,7 +578,7 @@ static int guc_resume(struct intel_engine_cs *engine)
 
 static void guc_set_default_submission(struct intel_engine_cs *engine)
 {
-	engine->submit_request = guc_submit_request;
+	engine->submit_request = i915_request_enqueue;
 	engine->execlists.tasklet.func = guc_submission_tasklet;
 
 	engine->reset.prepare = guc_reset_prepare;
diff --git a/drivers/gpu/drm/i915/i915_scheduler.c b/drivers/gpu/drm/i915/i915_scheduler.c
index 9139a91f0aa3..3f5fc03908dc 100644
--- a/drivers/gpu/drm/i915/i915_scheduler.c
+++ b/drivers/gpu/drm/i915/i915_scheduler.c
@@ -448,6 +448,88 @@ void i915_request_set_priority(struct i915_request *rq, int prio)
 	spin_unlock_irqrestore(&engine->active.lock, flags);
 }
 
+static void queue_request(struct intel_engine_cs *engine,
+			  struct i915_request *rq)
+{
+	GEM_BUG_ON(!list_empty(&rq->sched.link));
+	list_add_tail(&rq->sched.link,
+		      i915_sched_lookup_priolist(engine, rq_prio(rq)));
+	set_bit(I915_FENCE_FLAG_PQUEUE, &rq->fence.flags);
+}
+
+static bool submit_queue(struct intel_engine_cs *engine,
+			 const struct i915_request *rq)
+{
+	struct intel_engine_execlists *execlists = &engine->execlists;
+
+	if (rq_prio(rq) <= execlists->queue_priority_hint)
+		return false;
+
+	execlists->queue_priority_hint = rq_prio(rq);
+	return true;
+}
+
+static bool hold_request(const struct i915_request *rq)
+{
+	struct i915_dependency *p;
+	bool result = false;
+
+	/*
+	 * If one of our ancestors is on hold, we must also be put on hold,
+	 * otherwise we will bypass it and execute before it.
+	 */
+	rcu_read_lock();
+	for_each_signaler(p, rq) {
+		const struct i915_request *s =
+			container_of(p->signaler, typeof(*s), sched);
+
+		if (s->engine != rq->engine)
+			continue;
+
+		result = i915_request_on_hold(s);
+		if (result)
+			break;
+	}
+	rcu_read_unlock();
+
+	return result;
+}
+
+static bool ancestor_on_hold(const struct intel_engine_cs *engine,
+			     const struct i915_request *rq)
+{
+	GEM_BUG_ON(i915_request_on_hold(rq));
+	return unlikely(!list_empty(&engine->active.hold)) && hold_request(rq);
+}
+
+void i915_request_enqueue(struct i915_request *rq)
+{
+	struct intel_engine_cs *engine = rq->engine;
+	unsigned long flags;
+	bool kick = false;
+
+	/* Will be called from irq-context when using foreign fences. */
+	spin_lock_irqsave(&engine->active.lock, flags);
+	GEM_BUG_ON(test_bit(I915_FENCE_FLAG_PQUEUE, &rq->fence.flags));
+
+	if (unlikely(ancestor_on_hold(engine, rq))) {
+		RQ_TRACE(rq, "ancestor on hold\n");
+		list_add_tail(&rq->sched.link, &engine->active.hold);
+		i915_request_set_hold(rq);
+	} else {
+		queue_request(engine, rq);
+
+		GEM_BUG_ON(RB_EMPTY_ROOT(&engine->execlists.queue.rb_root));
+
+		kick = submit_queue(engine, rq);
+	}
+
+	GEM_BUG_ON(list_empty(&rq->sched.link));
+	spin_unlock_irqrestore(&engine->active.lock, flags);
+	if (kick)
+		tasklet_hi_schedule(&engine->execlists.tasklet);
+}
+
 void i915_sched_node_init(struct i915_sched_node *node)
 {
 	spin_lock_init(&node->lock);
diff --git a/drivers/gpu/drm/i915/i915_scheduler.h b/drivers/gpu/drm/i915/i915_scheduler.h
index 5be7f90e7896..c4c086d56f81 100644
--- a/drivers/gpu/drm/i915/i915_scheduler.h
+++ b/drivers/gpu/drm/i915/i915_scheduler.h
@@ -39,6 +39,8 @@ void i915_sched_init_ipi(struct i915_sched_ipi *ipi);
 
 void i915_request_set_priority(struct i915_request *request, int prio);
 
+void i915_request_enqueue(struct i915_request *request);
+
 struct list_head *
 i915_sched_lookup_priolist(struct intel_engine_cs *engine, int prio);
 
-- 
2.20.1


* [Intel-gfx] [PATCH 12/41] drm/i915: Extract request rewinding from execlists
  2021-01-25 14:00 [Intel-gfx] [PATCH 01/41] drm/i915/selftests: Check for engine-reset errors in the middle of workarounds Chris Wilson
                   ` (9 preceding siblings ...)
  2021-01-25 14:01 ` [Intel-gfx] [PATCH 11/41] drm/i915: Extract request submission from execlists Chris Wilson
@ 2021-01-25 14:01 ` Chris Wilson
  2021-01-25 14:01 ` [Intel-gfx] [PATCH 13/41] drm/i915: Extract request suspension from the execlists Chris Wilson
                   ` (33 subsequent siblings)
  44 siblings, 0 replies; 90+ messages in thread
From: Chris Wilson @ 2021-01-25 14:01 UTC (permalink / raw)
  To: intel-gfx; +Cc: thomas.hellstrom, Chris Wilson

In the process of preparing to reuse the request rewinding logic for
other backends, lift it out of the execlists backend.

While this operates on the common structs, it retains a bit of backend
knowledge (setting CTX_DESC_FORCE_RESTORE in the context's lrc.desc),
which is harmless for !lrc but still unsightly.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
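Reader's note, not part of the patch: a minimal sketch of the rewind
walk, assuming an array ordered oldest to newest stands in for
engine->active.requests and a flag stands in for moving a request back
onto its priority list. The rewind_* names are hypothetical; unsubmit,
priority lookup and the ring-wrap check are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model; none of these names exist in i915. */
struct rewind_request {
	int seqno;
	bool complete; /* already finished on the hardware */
	bool queued;   /* returned to the priority queue for replay */
};

/*
 * Walk the in-flight requests from newest to oldest. Completed requests
 * are skipped; every incomplete one is returned to the queue. The last
 * request visited that is still incomplete is the oldest one, and that
 * is what the caller gets back (NULL if everything had completed).
 */
static struct rewind_request *
rewind_requests(struct rewind_request *inflight, size_t count)
{
	struct rewind_request *active = NULL;

	for (size_t i = count; i-- > 0; ) {
		struct rewind_request *rq = &inflight[i];

		if (rq->complete)
			continue;

		rq->queued = true;
		active = rq;
	}

	assert(!active || !active->complete);
	return active;
}
```

Callers such as a reset path can then treat the returned request as the
point from which execution has to be replayed.
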
 drivers/gpu/drm/i915/gt/intel_engine.h        |  3 -
 .../drm/i915/gt/intel_execlists_submission.c  | 58 ++-----------------
 drivers/gpu/drm/i915/gt/intel_lrc_reg.h       |  3 +
 drivers/gpu/drm/i915/gt/selftest_execlists.c  |  2 +-
 .../gpu/drm/i915/gt/uc/intel_guc_submission.c |  3 +-
 drivers/gpu/drm/i915/i915_scheduler.c         | 44 ++++++++++++++
 drivers/gpu/drm/i915/i915_scheduler.h         |  3 +
 7 files changed, 56 insertions(+), 60 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_engine.h b/drivers/gpu/drm/i915/gt/intel_engine.h
index 47ee8578e511..20974415e7d8 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine.h
+++ b/drivers/gpu/drm/i915/gt/intel_engine.h
@@ -136,9 +136,6 @@ execlists_active_unlock_bh(struct intel_engine_execlists *execlists)
 	local_bh_enable(); /* restore softirq, and kick ksoftirqd! */
 }
 
-struct i915_request *
-execlists_unwind_incomplete_requests(struct intel_engine_execlists *execlists);
-
 static inline u32
 intel_read_status_page(const struct intel_engine_cs *engine, int reg)
 {
diff --git a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
index e6acdd8dc361..f7d383dd5144 100644
--- a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
+++ b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
@@ -359,56 +359,6 @@ assert_priority_queue(const struct i915_request *prev,
 	return rq_prio(prev) >= rq_prio(next);
 }
 
-static struct i915_request *
-__unwind_incomplete_requests(struct intel_engine_cs *engine)
-{
-	struct i915_request *rq, *rn, *active = NULL;
-	struct list_head *pl;
-	int prio = I915_PRIORITY_INVALID;
-
-	lockdep_assert_held(&engine->active.lock);
-
-	list_for_each_entry_safe_reverse(rq, rn,
-					 &engine->active.requests,
-					 sched.link) {
-		if (__i915_request_is_complete(rq)) {
-			list_del_init(&rq->sched.link);
-			continue;
-		}
-
-		__i915_request_unsubmit(rq);
-
-		GEM_BUG_ON(rq_prio(rq) == I915_PRIORITY_INVALID);
-		if (rq_prio(rq) != prio) {
-			prio = rq_prio(rq);
-			pl = i915_sched_lookup_priolist(engine, prio);
-		}
-		GEM_BUG_ON(RB_EMPTY_ROOT(&engine->execlists.queue.rb_root));
-
-		list_move(&rq->sched.link, pl);
-		set_bit(I915_FENCE_FLAG_PQUEUE, &rq->fence.flags);
-
-		/* Check in case we rollback so far we wrap [size/2] */
-		if (intel_ring_direction(rq->ring,
-					 rq->tail,
-					 rq->ring->tail + 8) > 0)
-			rq->context->lrc.desc |= CTX_DESC_FORCE_RESTORE;
-
-		active = rq;
-	}
-
-	return active;
-}
-
-struct i915_request *
-execlists_unwind_incomplete_requests(struct intel_engine_execlists *execlists)
-{
-	struct intel_engine_cs *engine =
-		container_of(execlists, typeof(*engine), execlists);
-
-	return __unwind_incomplete_requests(engine);
-}
-
 static void
 execlists_context_status_change(struct i915_request *rq, unsigned long status)
 {
@@ -1080,7 +1030,7 @@ static void defer_active(struct intel_engine_cs *engine)
 {
 	struct i915_request *rq;
 
-	rq = __unwind_incomplete_requests(engine);
+	rq = __intel_engine_rewind_requests(engine);
 	if (!rq)
 		return;
 
@@ -1292,7 +1242,7 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
 			 * the preemption, some of the unwound requests may
 			 * complete!
 			 */
-			__unwind_incomplete_requests(engine);
+			__intel_engine_rewind_requests(engine);
 
 			last = NULL;
 		} else if (timeslice_expired(engine, last)) {
@@ -2279,7 +2229,7 @@ static void execlists_capture(struct intel_engine_cs *engine)
 	 * which we return it to the queue for signaling.
 	 *
 	 * By removing them from the execlists queue, we also remove the
-	 * requests from being processed by __unwind_incomplete_requests()
+	 * requests from being processed by __intel_engine_rewind_requests()
 	 * during the intel_engine_reset(), and so they will *not* be replayed
 	 * afterwards.
 	 *
@@ -2869,7 +2819,7 @@ static void execlists_reset_rewind(struct intel_engine_cs *engine, bool stalled)
 	/* Push back any incomplete requests for replay after the reset. */
 	rcu_read_lock();
 	spin_lock_irqsave(&engine->active.lock, flags);
-	__unwind_incomplete_requests(engine);
+	__intel_engine_rewind_requests(engine);
 	spin_unlock_irqrestore(&engine->active.lock, flags);
 	rcu_read_unlock();
 }
diff --git a/drivers/gpu/drm/i915/gt/intel_lrc_reg.h b/drivers/gpu/drm/i915/gt/intel_lrc_reg.h
index 41e5350a7a05..364656bedec7 100644
--- a/drivers/gpu/drm/i915/gt/intel_lrc_reg.h
+++ b/drivers/gpu/drm/i915/gt/intel_lrc_reg.h
@@ -92,4 +92,7 @@
 /* in Gen12 ID 0x7FF is reserved to indicate idle */
 #define GEN12_MAX_CONTEXT_HW_ID	(GEN11_MAX_CONTEXT_HW_ID - 1)
 
+#define CTX_DESC_RELOAD_PD BIT_ULL(1)
+#define CTX_DESC_FORCE_RESTORE BIT_ULL(2)
+
 #endif /* _INTEL_LRC_REG_H_ */
diff --git a/drivers/gpu/drm/i915/gt/selftest_execlists.c b/drivers/gpu/drm/i915/gt/selftest_execlists.c
index acb7c089d05b..84017eb9dd8b 100644
--- a/drivers/gpu/drm/i915/gt/selftest_execlists.c
+++ b/drivers/gpu/drm/i915/gt/selftest_execlists.c
@@ -4582,7 +4582,7 @@ static int reset_virtual_engine(struct intel_gt *gt,
 
 	/* Fake a preemption event; failed of course */
 	spin_lock_irq(&engine->active.lock);
-	__unwind_incomplete_requests(engine);
+	__intel_engine_rewind_requests(engine);
 	spin_unlock_irq(&engine->active.lock);
 	GEM_BUG_ON(rq->engine != engine);
 
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
index 4f1eee4fbfb2..56ec738a9ce7 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
@@ -305,14 +305,13 @@ static void guc_reset_state(struct intel_context *ce,
 
 static void guc_reset_rewind(struct intel_engine_cs *engine, bool stalled)
 {
-	struct intel_engine_execlists * const execlists = &engine->execlists;
 	struct i915_request *rq;
 	unsigned long flags;
 
 	spin_lock_irqsave(&engine->active.lock, flags);
 
 	/* Push back any incomplete requests for replay after the reset. */
-	rq = execlists_unwind_incomplete_requests(execlists);
+	rq = __intel_engine_rewind_requests(engine);
 	if (!rq)
 		goto out_unlock;
 
diff --git a/drivers/gpu/drm/i915/i915_scheduler.c b/drivers/gpu/drm/i915/i915_scheduler.c
index 3f5fc03908dc..bd687c891ab6 100644
--- a/drivers/gpu/drm/i915/i915_scheduler.c
+++ b/drivers/gpu/drm/i915/i915_scheduler.c
@@ -6,6 +6,9 @@
 
 #include <linux/mutex.h>
 
+#include "gt/intel_ring.h"
+#include "gt/intel_lrc_reg.h"
+
 #include "i915_drv.h"
 #include "i915_globals.h"
 #include "i915_request.h"
@@ -530,6 +533,47 @@ void i915_request_enqueue(struct i915_request *rq)
 		tasklet_hi_schedule(&engine->execlists.tasklet);
 }
 
+struct i915_request *
+__intel_engine_rewind_requests(struct intel_engine_cs *engine)
+{
+	struct i915_request *rq, *rn, *active = NULL;
+	struct list_head *pl;
+	int prio = I915_PRIORITY_INVALID;
+
+	lockdep_assert_held(&engine->active.lock);
+
+	list_for_each_entry_safe_reverse(rq, rn,
+					 &engine->active.requests,
+					 sched.link) {
+		if (__i915_request_is_complete(rq)) {
+			list_del_init(&rq->sched.link);
+			continue;
+		}
+
+		__i915_request_unsubmit(rq);
+
+		GEM_BUG_ON(rq_prio(rq) == I915_PRIORITY_INVALID);
+		if (rq_prio(rq) != prio) {
+			prio = rq_prio(rq);
+			pl = i915_sched_lookup_priolist(engine, prio);
+		}
+		GEM_BUG_ON(RB_EMPTY_ROOT(&engine->execlists.queue.rb_root));
+
+		list_move(&rq->sched.link, pl);
+		set_bit(I915_FENCE_FLAG_PQUEUE, &rq->fence.flags);
+
+		/* Check in case we rollback so far we wrap [size/2] */
+		if (intel_ring_direction(rq->ring,
+					 rq->tail,
+					 rq->ring->tail + 8) > 0)
+			rq->context->lrc.desc |= CTX_DESC_FORCE_RESTORE;
+
+		active = rq;
+	}
+
+	return active;
+}
+
 void i915_sched_node_init(struct i915_sched_node *node)
 {
 	spin_lock_init(&node->lock);
diff --git a/drivers/gpu/drm/i915/i915_scheduler.h b/drivers/gpu/drm/i915/i915_scheduler.h
index c4c086d56f81..50fdc7168d38 100644
--- a/drivers/gpu/drm/i915/i915_scheduler.h
+++ b/drivers/gpu/drm/i915/i915_scheduler.h
@@ -41,6 +41,9 @@ void i915_request_set_priority(struct i915_request *request, int prio);
 
 void i915_request_enqueue(struct i915_request *request);
 
+struct i915_request *
+__intel_engine_rewind_requests(struct intel_engine_cs *engine);
+
 struct list_head *
 i915_sched_lookup_priolist(struct intel_engine_cs *engine, int prio);
 
-- 
2.20.1


* [Intel-gfx] [PATCH 13/41] drm/i915: Extract request suspension from the execlists
  2021-01-25 14:00 [Intel-gfx] [PATCH 01/41] drm/i915/selftests: Check for engine-reset errors in the middle of workarounds Chris Wilson
                   ` (10 preceding siblings ...)
  2021-01-25 14:01 ` [Intel-gfx] [PATCH 12/41] drm/i915: Extract request rewinding " Chris Wilson
@ 2021-01-25 14:01 ` Chris Wilson
  2021-01-25 14:01 ` [Intel-gfx] [PATCH 14/41] drm/i915: Extract the ability to defer and rerun a request later Chris Wilson
                   ` (32 subsequent siblings)
  44 siblings, 0 replies; 90+ messages in thread
From: Chris Wilson @ 2021-01-25 14:01 UTC (permalink / raw)
  To: intel-gfx; +Cc: thomas.hellstrom, Chris Wilson

Make the ability to suspend and resume a request and its dependents
generic.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
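Reader's note, not part of the patch: a minimal sketch of how putting
one request on hold spreads to its dependents, assuming requests live in
a small array and name their waiters by index. The hold_* names and the
fixed HOLD_MAX_REQUESTS bound are hypothetical; weak dependencies and
the list manipulation done by the driver are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define HOLD_MAX_REQUESTS 8 /* hypothetical bound for the sketch */

/* Hypothetical model; none of these names exist in i915. */
struct hold_request {
	int engine;
	bool ready;
	bool complete;
	bool on_hold;
	size_t nwaiters;
	size_t waiters[HOLD_MAX_REQUESTS]; /* indices of requests waiting on us */
};

/*
 * Put rqs[first] on hold, then keep pulling in waiters that are on the
 * same engine, ready to run, not yet complete and not already held.
 * Returns how many requests ended up on hold.
 */
static size_t hold_with_dependents(struct hold_request *rqs, size_t count,
				   size_t first)
{
	size_t worklist[HOLD_MAX_REQUESTS];
	bool listed[HOLD_MAX_REQUESTS] = { false };
	size_t head = 0, tail = 0, held = 0;

	assert(count <= HOLD_MAX_REQUESTS && first < count);

	worklist[tail++] = first;
	listed[first] = true;

	while (head < tail) {
		struct hold_request *rq = &rqs[worklist[head++]];

		rq->on_hold = true;
		held++;

		for (size_t i = 0; i < rq->nwaiters; i++) {
			size_t w = rq->waiters[i];

			assert(w < count);
			if (listed[w])
				continue;

			/* Leave waiters on other engines alone. */
			if (rqs[w].engine != rq->engine)
				continue;

			if (!rqs[w].ready || rqs[w].complete || rqs[w].on_hold)
				continue;

			listed[w] = true;
			worklist[tail++] = w;
		}
	}

	return held;
}
```

Each index is listed at most once, so the worklist cannot outgrow the
array; a waiter on another engine stays runnable in this model, much as
the removed code leaves semaphores spinning on other engines.
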
 .../drm/i915/gt/intel_execlists_submission.c  | 167 +-----------------
 drivers/gpu/drm/i915/gt/selftest_execlists.c  |   8 +-
 drivers/gpu/drm/i915/i915_scheduler.c         | 153 ++++++++++++++++
 drivers/gpu/drm/i915/i915_scheduler.h         |  10 ++
 4 files changed, 169 insertions(+), 169 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
index f7d383dd5144..f2967c31dd70 100644
--- a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
+++ b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
@@ -1922,169 +1922,6 @@ static void post_process_csb(struct i915_request **port,
 		execlists_schedule_out(*port++);
 }
 
-static void __execlists_hold(struct i915_request *rq)
-{
-	LIST_HEAD(list);
-
-	do {
-		struct i915_dependency *p;
-
-		if (i915_request_is_active(rq))
-			__i915_request_unsubmit(rq);
-
-		clear_bit(I915_FENCE_FLAG_PQUEUE, &rq->fence.flags);
-		list_move_tail(&rq->sched.link, &rq->engine->active.hold);
-		i915_request_set_hold(rq);
-		RQ_TRACE(rq, "on hold\n");
-
-		for_each_waiter(p, rq) {
-			struct i915_request *w =
-				container_of(p->waiter, typeof(*w), sched);
-
-			if (p->flags & I915_DEPENDENCY_WEAK)
-				continue;
-
-			/* Leave semaphores spinning on the other engines */
-			if (w->engine != rq->engine)
-				continue;
-
-			if (!i915_request_is_ready(w))
-				continue;
-
-			if (__i915_request_is_complete(w))
-				continue;
-
-			if (i915_request_on_hold(w))
-				continue;
-
-			list_move_tail(&w->sched.link, &list);
-		}
-
-		rq = list_first_entry_or_null(&list, typeof(*rq), sched.link);
-	} while (rq);
-}
-
-static bool execlists_hold(struct intel_engine_cs *engine,
-			   struct i915_request *rq)
-{
-	if (i915_request_on_hold(rq))
-		return false;
-
-	spin_lock_irq(&engine->active.lock);
-
-	if (__i915_request_is_complete(rq)) { /* too late! */
-		rq = NULL;
-		goto unlock;
-	}
-
-	/*
-	 * Transfer this request onto the hold queue to prevent it
-	 * being resumbitted to HW (and potentially completed) before we have
-	 * released it. Since we may have already submitted following
-	 * requests, we need to remove those as well.
-	 */
-	GEM_BUG_ON(i915_request_on_hold(rq));
-	GEM_BUG_ON(rq->engine != engine);
-	__execlists_hold(rq);
-	GEM_BUG_ON(list_empty(&engine->active.hold));
-
-unlock:
-	spin_unlock_irq(&engine->active.lock);
-	return rq;
-}
-
-static bool hold_request(const struct i915_request *rq)
-{
-	struct i915_dependency *p;
-	bool result = false;
-
-	/*
-	 * If one of our ancestors is on hold, we must also be on hold,
-	 * otherwise we will bypass it and execute before it.
-	 */
-	rcu_read_lock();
-	for_each_signaler(p, rq) {
-		const struct i915_request *s =
-			container_of(p->signaler, typeof(*s), sched);
-
-		if (s->engine != rq->engine)
-			continue;
-
-		result = i915_request_on_hold(s);
-		if (result)
-			break;
-	}
-	rcu_read_unlock();
-
-	return result;
-}
-
-static void __execlists_unhold(struct i915_request *rq)
-{
-	LIST_HEAD(list);
-
-	do {
-		struct i915_dependency *p;
-
-		RQ_TRACE(rq, "hold release\n");
-
-		GEM_BUG_ON(!i915_request_on_hold(rq));
-		GEM_BUG_ON(!i915_sw_fence_signaled(&rq->submit));
-
-		i915_request_clear_hold(rq);
-		list_move_tail(&rq->sched.link,
-			       i915_sched_lookup_priolist(rq->engine,
-							  rq_prio(rq)));
-		set_bit(I915_FENCE_FLAG_PQUEUE, &rq->fence.flags);
-
-		/* Also release any children on this engine that are ready */
-		for_each_waiter(p, rq) {
-			struct i915_request *w =
-				container_of(p->waiter, typeof(*w), sched);
-
-			if (p->flags & I915_DEPENDENCY_WEAK)
-				continue;
-
-			/* Propagate any change in error status */
-			if (rq->fence.error)
-				i915_request_set_error_once(w, rq->fence.error);
-
-			if (w->engine != rq->engine)
-				continue;
-
-			if (!i915_request_on_hold(w))
-				continue;
-
-			/* Check that no other parents are also on hold */
-			if (hold_request(w))
-				continue;
-
-			list_move_tail(&w->sched.link, &list);
-		}
-
-		rq = list_first_entry_or_null(&list, typeof(*rq), sched.link);
-	} while (rq);
-}
-
-static void execlists_unhold(struct intel_engine_cs *engine,
-			     struct i915_request *rq)
-{
-	spin_lock_irq(&engine->active.lock);
-
-	/*
-	 * Move this request back to the priority queue, and all of its
-	 * children and grandchildren that were suspended along with it.
-	 */
-	__execlists_unhold(rq);
-
-	if (rq_prio(rq) > engine->execlists.queue_priority_hint) {
-		engine->execlists.queue_priority_hint = rq_prio(rq);
-		tasklet_hi_schedule(&engine->execlists.tasklet);
-	}
-
-	spin_unlock_irq(&engine->active.lock);
-}
-
 struct execlists_capture {
 	struct work_struct work;
 	struct i915_request *rq;
@@ -2117,7 +1954,7 @@ static void execlists_capture_work(struct work_struct *work)
 	i915_gpu_coredump_put(cap->error);
 
 	/* Return this request and all that depend upon it for signaling */
-	execlists_unhold(engine, cap->rq);
+	intel_engine_resume_request(engine, cap->rq);
 	i915_request_put(cap->rq);
 
 	kfree(cap);
@@ -2242,7 +2079,7 @@ static void execlists_capture(struct intel_engine_cs *engine)
 	 * simply hold that request accountable for being non-preemptible
 	 * long enough to force the reset.
 	 */
-	if (!execlists_hold(engine, cap->rq))
+	if (!intel_engine_suspend_request(engine, cap->rq))
 		goto err_rq;
 
 	INIT_WORK(&cap->work, execlists_capture_work);
diff --git a/drivers/gpu/drm/i915/gt/selftest_execlists.c b/drivers/gpu/drm/i915/gt/selftest_execlists.c
index 84017eb9dd8b..e34858b111c8 100644
--- a/drivers/gpu/drm/i915/gt/selftest_execlists.c
+++ b/drivers/gpu/drm/i915/gt/selftest_execlists.c
@@ -608,7 +608,7 @@ static int live_hold_reset(void *arg)
 		GEM_BUG_ON(execlists_active(&engine->execlists) != rq);
 
 		i915_request_get(rq);
-		execlists_hold(engine, rq);
+		intel_engine_suspend_request(engine, rq);
 		GEM_BUG_ON(!i915_request_on_hold(rq));
 
 		__intel_engine_reset_bh(engine, NULL);
@@ -630,7 +630,7 @@ static int live_hold_reset(void *arg)
 		GEM_BUG_ON(!i915_request_on_hold(rq));
 
 		/* But is resubmitted on release */
-		execlists_unhold(engine, rq);
+		intel_engine_resume_request(engine, rq);
 		if (i915_request_wait(rq, 0, HZ / 5) < 0) {
 			pr_err("%s: held request did not complete!\n",
 			       engine->name);
@@ -4587,7 +4587,7 @@ static int reset_virtual_engine(struct intel_gt *gt,
 	GEM_BUG_ON(rq->engine != engine);
 
 	/* Reset the engine while keeping our active request on hold */
-	execlists_hold(engine, rq);
+	intel_engine_suspend_request(engine, rq);
 	GEM_BUG_ON(!i915_request_on_hold(rq));
 
 	__intel_engine_reset_bh(engine, NULL);
@@ -4610,7 +4610,7 @@ static int reset_virtual_engine(struct intel_gt *gt,
 	GEM_BUG_ON(!i915_request_on_hold(rq));
 
 	/* But is resubmitted on release */
-	execlists_unhold(engine, rq);
+	intel_engine_resume_request(engine, rq);
 	if (i915_request_wait(rq, 0, HZ / 5) < 0) {
 		pr_err("%s: held request did not complete!\n",
 		       engine->name);
diff --git a/drivers/gpu/drm/i915/i915_scheduler.c b/drivers/gpu/drm/i915/i915_scheduler.c
index bd687c891ab6..1f8c647d59d6 100644
--- a/drivers/gpu/drm/i915/i915_scheduler.c
+++ b/drivers/gpu/drm/i915/i915_scheduler.c
@@ -574,6 +574,159 @@ __intel_engine_rewind_requests(struct intel_engine_cs *engine)
 	return active;
 }
 
+bool __intel_engine_suspend_request(struct intel_engine_cs *engine,
+				    struct i915_request *rq)
+{
+	LIST_HEAD(list);
+
+	lockdep_assert_held(&engine->active.lock);
+	GEM_BUG_ON(rq->engine != engine);
+
+	if (__i915_request_is_complete(rq)) /* too late! */
+		return false;
+
+	if (i915_request_on_hold(rq))
+		return false;
+
+	ENGINE_TRACE(engine, "suspending request %llx:%lld\n",
+		     rq->fence.context, rq->fence.seqno);
+
+	/*
+	 * Transfer this request onto the hold queue to prevent it
+	 * being resubmitted to HW (and potentially completed) before we have
+	 * released it. Since we may have already submitted following
+	 * requests, we need to remove those as well.
+	 */
+	do {
+		struct i915_dependency *p;
+
+		if (i915_request_is_active(rq))
+			__i915_request_unsubmit(rq);
+
+		clear_bit(I915_FENCE_FLAG_PQUEUE, &rq->fence.flags);
+		list_move_tail(&rq->sched.link, &rq->engine->active.hold);
+		i915_request_set_hold(rq);
+		RQ_TRACE(rq, "on hold\n");
+
+		for_each_waiter(p, rq) {
+			struct i915_request *w =
+				container_of(p->waiter, typeof(*w), sched);
+
+			if (p->flags & I915_DEPENDENCY_WEAK)
+				continue;
+
+			/* Leave semaphores spinning on the other engines */
+			if (w->engine != engine)
+				continue;
+
+			if (!i915_request_is_ready(w))
+				continue;
+
+			if (__i915_request_is_complete(w))
+				continue;
+
+			if (i915_request_on_hold(w)) /* acts as a visited bit */
+				continue;
+
+			list_move_tail(&w->sched.link, &list);
+		}
+
+		rq = list_first_entry_or_null(&list, typeof(*rq), sched.link);
+	} while (rq);
+
+	GEM_BUG_ON(list_empty(&engine->active.hold));
+
+	return true;
+}
+
+bool intel_engine_suspend_request(struct intel_engine_cs *engine,
+				  struct i915_request *rq)
+{
+	bool result;
+
+	if (i915_request_on_hold(rq))
+		return false;
+
+	spin_lock_irq(&engine->active.lock);
+	result = __intel_engine_suspend_request(engine, rq);
+	spin_unlock_irq(&engine->active.lock);
+
+	return result;
+}
+
+void __intel_engine_resume_request(struct intel_engine_cs *engine,
+				   struct i915_request *rq)
+{
+	LIST_HEAD(list);
+
+	lockdep_assert_held(&engine->active.lock);
+
+	if (rq_prio(rq) > engine->execlists.queue_priority_hint) {
+		engine->execlists.queue_priority_hint = rq_prio(rq);
+		tasklet_hi_schedule(&engine->execlists.tasklet);
+	}
+
+	if (!i915_request_on_hold(rq))
+		return;
+
+	ENGINE_TRACE(engine, "resuming request %llx:%lld\n",
+		     rq->fence.context, rq->fence.seqno);
+
+	/*
+	 * Move this request back to the priority queue, and all of its
+	 * children and grandchildren that were suspended along with it.
+	 */
+	do {
+		struct i915_dependency *p;
+
+		RQ_TRACE(rq, "hold release\n");
+
+		GEM_BUG_ON(!i915_request_on_hold(rq));
+		GEM_BUG_ON(!i915_sw_fence_signaled(&rq->submit));
+
+		i915_request_clear_hold(rq);
+		list_del_init(&rq->sched.link);
+
+		queue_request(engine, rq);
+
+		/* Also release any children on this engine that are ready */
+		for_each_waiter(p, rq) {
+			struct i915_request *w =
+				container_of(p->waiter, typeof(*w), sched);
+
+			if (p->flags & I915_DEPENDENCY_WEAK)
+				continue;
+
+			/* Propagate any change in error status */
+			if (rq->fence.error)
+				i915_request_set_error_once(w, rq->fence.error);
+
+			if (w->engine != engine)
+				continue;
+
+			/* We also treat the on-hold status as a visited bit */
+			if (!i915_request_on_hold(w))
+				continue;
+
+			/* Check that no other parents are also on hold [BFS] */
+			if (hold_request(w))
+				continue;
+
+			list_move_tail(&w->sched.link, &list);
+		}
+
+		rq = list_first_entry_or_null(&list, typeof(*rq), sched.link);
+	} while (rq);
+}
+
+void intel_engine_resume_request(struct intel_engine_cs *engine,
+				 struct i915_request *rq)
+{
+	spin_lock_irq(&engine->active.lock);
+	__intel_engine_resume_request(engine, rq);
+	spin_unlock_irq(&engine->active.lock);
+}
+
 void i915_sched_node_init(struct i915_sched_node *node)
 {
 	spin_lock_init(&node->lock);
diff --git a/drivers/gpu/drm/i915/i915_scheduler.h b/drivers/gpu/drm/i915/i915_scheduler.h
index 50fdc7168d38..421254cb8e8c 100644
--- a/drivers/gpu/drm/i915/i915_scheduler.h
+++ b/drivers/gpu/drm/i915/i915_scheduler.h
@@ -44,6 +44,16 @@ void i915_request_enqueue(struct i915_request *request);
 struct i915_request *
 __intel_engine_rewind_requests(struct intel_engine_cs *engine);
 
+bool __intel_engine_suspend_request(struct intel_engine_cs *engine,
+				    struct i915_request *rq);
+void __intel_engine_resume_request(struct intel_engine_cs *engine,
+				   struct i915_request *request);
+
+bool intel_engine_suspend_request(struct intel_engine_cs *engine,
+				  struct i915_request *request);
+void intel_engine_resume_request(struct intel_engine_cs *engine,
+				 struct i915_request *rq);
+
 struct list_head *
 i915_sched_lookup_priolist(struct intel_engine_cs *engine, int prio);
 
-- 
2.20.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


* [Intel-gfx] [PATCH 14/41] drm/i915: Extract the ability to defer and rerun a request later
  2021-01-25 14:00 [Intel-gfx] [PATCH 01/41] drm/i915/selftests: Check for engine-reset errors in the middle of workarounds Chris Wilson
                   ` (11 preceding siblings ...)
  2021-01-25 14:01 ` [Intel-gfx] [PATCH 13/41] drm/i915: Extract request suspension from the execlists Chris Wilson
@ 2021-01-25 14:01 ` Chris Wilson
  2021-01-25 14:01 ` [Intel-gfx] [PATCH 15/41] drm/i915: Fix the iterative dfs for defering requests Chris Wilson
                   ` (31 subsequent siblings)
  44 siblings, 0 replies; 90+ messages in thread
From: Chris Wilson @ 2021-01-25 14:01 UTC (permalink / raw)
  To: intel-gfx; +Cc: thomas.hellstrom, Chris Wilson

Lift the ability to defer a request until later from execlists into the
common layer.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
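As a rough illustration only (not part of the patch): deferring a
request means moving it to the back of its priority list and then
moving its waiters behind it, so that no waiter ends up ahead of its
signaler. The sketch below assumes a single priority level and that
every request is already on the queue; struct list, struct req,
add_waiter() and defer_request() are hypothetical userspace names, not
the driver's.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal circular doubly-linked list, in the spirit of list_head. */
struct list {
	struct list *prev, *next;
};

static void list_init(struct list *l)
{
	l->prev = l->next = l;
}

static int list_empty(const struct list *l)
{
	return l->next == l;
}

static void list_del(struct list *e)
{
	e->prev->next = e->next;
	e->next->prev = e->prev;
}

static void list_add_tail(struct list *e, struct list *head)
{
	e->prev = head->prev;
	e->next = head;
	head->prev->next = e;
	head->prev = e;
}

static void list_move_tail(struct list *e, struct list *head)
{
	list_del(e);
	list_add_tail(e, head);
}

#define MAX_WAITERS 4

/* Hypothetical request; link is first so a list pointer casts back. */
struct req {
	struct list link;
	struct req *waiters[MAX_WAITERS];
	int nr_waiters;
};

static void add_waiter(struct req *signaler, struct req *waiter)
{
	assert(signaler->nr_waiters < MAX_WAITERS);
	signaler->waiters[signaler->nr_waiters++] = waiter;
}

/* Move rq to the back of queue, followed by all of its waiters. */
static void defer_request(struct req *rq, struct list *queue)
{
	struct list todo;
	int i;

	list_init(&todo);
	do {
		list_move_tail(&rq->link, queue);

		/* Collect the waiters; each is re-queued after rq in turn. */
		for (i = 0; i < rq->nr_waiters; i++)
			list_move_tail(&rq->waiters[i]->link, &todo);

		rq = list_empty(&todo) ? NULL : (struct req *)todo.next;
	} while (rq);
}
```

With a queue of A, B, C, X where B waits on A and C waits on B,
defer_request(&a, &queue) leaves X, A, B, C: the unrelated request X now
runs first and the dependency chain keeps its relative order.
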
 .../drm/i915/gt/intel_execlists_submission.c  | 57 +++--------------
 drivers/gpu/drm/i915/i915_scheduler.c         | 63 +++++++++++++++++--
 drivers/gpu/drm/i915/i915_scheduler.h         |  5 +-
 3 files changed, 67 insertions(+), 58 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
index f2967c31dd70..abed9dbef124 100644
--- a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
+++ b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
@@ -978,54 +978,6 @@ static void virtual_xfer_context(struct virtual_engine *ve,
 	}
 }
 
-static void defer_request(struct i915_request *rq, struct list_head * const pl)
-{
-	LIST_HEAD(list);
-
-	/*
-	 * We want to move the interrupted request to the back of
-	 * the round-robin list (i.e. its priority level), but
-	 * in doing so, we must then move all requests that were in
-	 * flight and were waiting for the interrupted request to
-	 * be run after it again.
-	 */
-	do {
-		struct i915_dependency *p;
-
-		GEM_BUG_ON(i915_request_is_active(rq));
-		list_move_tail(&rq->sched.link, pl);
-
-		for_each_waiter(p, rq) {
-			struct i915_request *w =
-				container_of(p->waiter, typeof(*w), sched);
-
-			if (p->flags & I915_DEPENDENCY_WEAK)
-				continue;
-
-			/* Leave semaphores spinning on the other engines */
-			if (w->engine != rq->engine)
-				continue;
-
-			/* No waiter should start before its signaler */
-			GEM_BUG_ON(i915_request_has_initial_breadcrumb(w) &&
-				   __i915_request_has_started(w) &&
-				   !__i915_request_is_complete(rq));
-
-			if (!i915_request_is_ready(w))
-				continue;
-
-			if (rq_prio(w) < rq_prio(rq))
-				continue;
-
-			GEM_BUG_ON(rq_prio(w) > rq_prio(rq));
-			GEM_BUG_ON(i915_request_is_active(w));
-			list_move_tail(&w->sched.link, &list);
-		}
-
-		rq = list_first_entry_or_null(&list, typeof(*rq), sched.link);
-	} while (rq);
-}
-
 static void defer_active(struct intel_engine_cs *engine)
 {
 	struct i915_request *rq;
@@ -1034,7 +986,14 @@ static void defer_active(struct intel_engine_cs *engine)
 	if (!rq)
 		return;
 
-	defer_request(rq, i915_sched_lookup_priolist(engine, rq_prio(rq)));
+	/*
+	 * We want to move the interrupted request to the back of
+	 * the round-robin list (i.e. its priority level), but
+	 * in doing so, we must then move all requests that were in
+	 * flight and were waiting for the interrupted request to
+	 * be run after it again.
+	 */
+	__intel_engine_defer_request(engine, rq);
 }
 
 static bool
diff --git a/drivers/gpu/drm/i915/i915_scheduler.c b/drivers/gpu/drm/i915/i915_scheduler.c
index 1f8c647d59d6..4d648c2d603a 100644
--- a/drivers/gpu/drm/i915/i915_scheduler.c
+++ b/drivers/gpu/drm/i915/i915_scheduler.c
@@ -171,8 +171,8 @@ static void assert_priolists(struct intel_engine_execlists * const execlists)
 	}
 }
 
-struct list_head *
-i915_sched_lookup_priolist(struct intel_engine_cs *engine, int prio)
+static struct list_head *
+lookup_priolist(struct intel_engine_cs *engine, int prio)
 {
 	struct intel_engine_execlists * const execlists = &engine->execlists;
 	struct i915_priolist *p;
@@ -324,7 +324,7 @@ static void __i915_request_set_priority(struct i915_request *rq, int prio)
 	struct list_head *pos = &rq->sched.signalers_list;
 	struct list_head *plist;
 
-	plist = i915_sched_lookup_priolist(engine, prio);
+	plist = lookup_priolist(engine, prio);
 
 	/*
 	 * Recursively bump all dependent priorities to match the new request.
@@ -451,12 +451,63 @@ void i915_request_set_priority(struct i915_request *rq, int prio)
 	spin_unlock_irqrestore(&engine->active.lock, flags);
 }
 
+void __intel_engine_defer_request(struct intel_engine_cs *engine,
+				  struct i915_request *rq)
+{
+	struct list_head *pl;
+	LIST_HEAD(list);
+
+	lockdep_assert_held(&engine->active.lock);
+	GEM_BUG_ON(!test_bit(I915_FENCE_FLAG_PQUEUE, &rq->fence.flags));
+
+	/*
+	 * When we defer a request, we must maintain its order with respect
+	 * to those that are waiting upon it. So we traverse its chain of
+	 * waiters and move any that are earlier than the request to after it.
+	 */
+	pl = lookup_priolist(engine, rq_prio(rq));
+	do {
+		struct i915_dependency *p;
+
+		GEM_BUG_ON(i915_request_is_active(rq));
+		list_move_tail(&rq->sched.link, pl);
+
+		for_each_waiter(p, rq) {
+			struct i915_request *w =
+				container_of(p->waiter, typeof(*w), sched);
+
+			if (p->flags & I915_DEPENDENCY_WEAK)
+				continue;
+
+			/* Leave semaphores spinning on the other engines */
+			if (w->engine != engine)
+				continue;
+
+			/* No waiter should start before its signaler */
+			GEM_BUG_ON(i915_request_has_initial_breadcrumb(w) &&
+				   __i915_request_has_started(w) &&
+				   !__i915_request_is_complete(rq));
+
+			if (!i915_request_is_ready(w))
+				continue;
+
+			if (rq_prio(w) < rq_prio(rq))
+				continue;
+
+			GEM_BUG_ON(rq_prio(w) > rq_prio(rq));
+			GEM_BUG_ON(i915_request_is_active(w));
+			list_move_tail(&w->sched.link, &list);
+		}
+
+		rq = list_first_entry_or_null(&list, typeof(*rq), sched.link);
+	} while (rq);
+}
+
 static void queue_request(struct intel_engine_cs *engine,
 			  struct i915_request *rq)
 {
 	GEM_BUG_ON(!list_empty(&rq->sched.link));
-	list_add_tail(&rq->sched.link,
-		      i915_sched_lookup_priolist(engine, rq_prio(rq)));
+	list_add_tail(&rq->sched.link, lookup_priolist(engine, rq_prio(rq)));
 	set_bit(I915_FENCE_FLAG_PQUEUE, &rq->fence.flags);
 }
 
@@ -555,7 +606,7 @@ __intel_engine_rewind_requests(struct intel_engine_cs *engine)
 		GEM_BUG_ON(rq_prio(rq) == I915_PRIORITY_INVALID);
 		if (rq_prio(rq) != prio) {
 			prio = rq_prio(rq);
-			pl = i915_sched_lookup_priolist(engine, prio);
+			pl = lookup_priolist(engine, prio);
 		}
 		GEM_BUG_ON(RB_EMPTY_ROOT(&engine->execlists.queue.rb_root));
 
diff --git a/drivers/gpu/drm/i915/i915_scheduler.h b/drivers/gpu/drm/i915/i915_scheduler.h
index 421254cb8e8c..6eafda957f64 100644
--- a/drivers/gpu/drm/i915/i915_scheduler.h
+++ b/drivers/gpu/drm/i915/i915_scheduler.h
@@ -43,6 +43,8 @@ void i915_request_enqueue(struct i915_request *request);
 
 struct i915_request *
 __intel_engine_rewind_requests(struct intel_engine_cs *engine);
+void __intel_engine_defer_request(struct intel_engine_cs *engine,
+				  struct i915_request *request);
 
 bool __intel_engine_suspend_request(struct intel_engine_cs *engine,
 				    struct i915_request *rq);
@@ -54,9 +56,6 @@ bool intel_engine_suspend_request(struct intel_engine_cs *engine,
 void intel_engine_resume_request(struct intel_engine_cs *engine,
 				 struct i915_request *rq);
 
-struct list_head *
-i915_sched_lookup_priolist(struct intel_engine_cs *engine, int prio);
-
 void __i915_priolist_free(struct i915_priolist *p);
 static inline void i915_priolist_free(struct i915_priolist *p)
 {
-- 
2.20.1


* [Intel-gfx] [PATCH 15/41] drm/i915: Fix the iterative dfs for defering requests
  2021-01-25 14:00 [Intel-gfx] [PATCH 01/41] drm/i915/selftests: Check for engine-reset errors in the middle of workarounds Chris Wilson
                   ` (12 preceding siblings ...)
  2021-01-25 14:01 ` [Intel-gfx] [PATCH 14/41] drm/i915: Extract the ability to defer and rerun a request later Chris Wilson
@ 2021-01-25 14:01 ` Chris Wilson
  2021-01-25 14:01 ` [Intel-gfx] [PATCH 16/41] drm/i915: Move common active lists from engine to i915_scheduler Chris Wilson
                   ` (30 subsequent siblings)
  44 siblings, 0 replies; 90+ messages in thread
From: Chris Wilson @ 2021-01-25 14:01 UTC (permalink / raw)
  To: intel-gfx; +Cc: thomas.hellstrom, Chris Wilson

The current implementation of walking the children of a deferred
request lacks the backtracking required to reduce the dfs to linear
time. Having pulled it from execlists into the common layer, we can
reuse the dfs code already used for priority inheritance.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
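As a rough illustration only (not part of the patch): an iterative dfs
becomes linear once each node remembers how far along its waiter list
it got before descending, so backtracking resumes from that point
instead of rescanning. The sketch below is a userspace model under
that assumption; struct node, defer_order(), parent and pos are
hypothetical names and are not claimed to match the stack_push() and
stack_pop() helpers, whose bodies are not shown in this message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_WAITERS 4

/* Hypothetical graph node; not the driver's scheduling node. */
struct node {
	struct node *waiters[MAX_WAITERS];
	int nr_waiters;
	bool visited;
	struct node *parent; /* intrusive stack link: who descended into us */
	int pos;             /* next waiter to examine when we are resumed */
};

/*
 * Walk everything reachable from root through the waiter edges and
 * fill out[] so that every node appears before all of its waiters.
 * Each node is emitted once and each edge is examined once.
 * Returns the number of nodes written.
 */
static int defer_order(struct node *root, struct node **out, int max)
{
	struct node *n = root;
	int count = 0, i, j;

	n->visited = true;
	n->parent = NULL;
	n->pos = 0;

	while (n) {
		if (n->pos < n->nr_waiters) {
			struct node *w = n->waiters[n->pos++];

			if (w->visited)
				continue;

			/* Push: remember where we came from, then descend. */
			w->visited = true;
			w->parent = n;
			w->pos = 0;
			n = w;
			continue;
		}

		/* All waiters handled: emit in post-order, then pop. */
		assert(count < max);
		out[count++] = n;
		n = n->parent;
	}

	/* Reverse the post-order so signalers come before waiters. */
	for (i = 0, j = count - 1; i < j; i++, j--) {
		struct node *tmp = out[i];

		out[i] = out[j];
		out[j] = tmp;
	}

	return count;
}
```

For a diamond (B and C wait on A, D waits on both B and C) the walk
from A emits four nodes, with A first and D last, and D is visited only
once even though two paths lead to it.
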
 drivers/gpu/drm/i915/i915_scheduler.c | 58 +++++++++++++++++++--------
 1 file changed, 42 insertions(+), 16 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_scheduler.c b/drivers/gpu/drm/i915/i915_scheduler.c
index 4d648c2d603a..7c93c2a8309b 100644
--- a/drivers/gpu/drm/i915/i915_scheduler.c
+++ b/drivers/gpu/drm/i915/i915_scheduler.c
@@ -454,25 +454,26 @@ void i915_request_set_priority(struct i915_request *rq, int prio)
 void __intel_engine_defer_request(struct intel_engine_cs *engine,
 				  struct i915_request *rq)
 {
-	struct list_head *pl;
-	LIST_HEAD(list);
+	struct list_head *pos = &rq->sched.waiters_list;
+	struct i915_request *rn;
+	LIST_HEAD(dfs);
+	int prio;
 
 	lockdep_assert_held(&engine->active.lock);
 	GEM_BUG_ON(!test_bit(I915_FENCE_FLAG_PQUEUE, &rq->fence.flags));
 
+	prio = rq_prio(rq);
+
 	/*
 	 * When we defer a request, we must maintain its order with respect
 	 * to those that are waiting upon it. So we traverse its chain of
 	 * waiters and move any that are earlier than the request to after it.
 	 */
-	pl = lookup_priolist(engine, rq_prio(rq));
+	rq->sched.dfs.next = NULL;
 	do {
-		struct i915_dependency *p;
-
-		GEM_BUG_ON(i915_request_is_active(rq));
-		list_move_tail(&rq->sched.link, pl);
-
-		for_each_waiter(p, rq) {
+		list_for_each_continue(pos, &rq->sched.waiters_list) {
+			struct i915_dependency *p =
+				list_entry(pos, typeof(*p), wait_link);
 			struct i915_request *w =
 				container_of(p->waiter, typeof(*w), sched);
 
@@ -488,19 +489,44 @@ void __intel_engine_defer_request(struct intel_engine_cs *engine,
 				   __i915_request_has_started(w) &&
 				   !__i915_request_is_complete(rq));
 
-			if (!i915_request_is_ready(w))
+			if (!i915_request_in_priority_queue(w))
 				continue;
 
-			if (rq_prio(w) < rq_prio(rq))
+			/*
+			 * We also need to reorder within the same priority.
+			 *
+			 * This is unlike priority-inheritance, where if the
+			 * signaler already has a higher priority [earlier
+			 * deadline] than us, we can ignore as it will be
+			 * scheduled first. If a waiter already has the
+			 * same priority, we still have to push it to the end
+			 * of the list. This unfortunately means we cannot
+			 * use the rq_deadline() itself as a 'visited' bit.
+			 */
+			if (rq_prio(w) < prio)
 				continue;
 
-			GEM_BUG_ON(rq_prio(w) > rq_prio(rq));
-			GEM_BUG_ON(i915_request_is_active(w));
-			list_move_tail(&w->sched.link, &list);
+			GEM_BUG_ON(rq_prio(w) != prio);
+
+			/* Remember our position along this branch */
+			rq = stack_push(w, rq, pos);
+			pos = &rq->sched.waiters_list;
 		}
 
-		rq = list_first_entry_or_null(&list, typeof(*rq), sched.link);
-	} while (rq);
+		/* Note list is reversed for waiters wrt signal hierarchy */
+		GEM_BUG_ON(rq->engine != engine);
+		GEM_BUG_ON(!i915_request_in_priority_queue(rq));
+		list_move(&rq->sched.link, &dfs);
+
+		/* Track our visit, and prevent duplicate processing */
+		clear_bit(I915_FENCE_FLAG_PQUEUE, &rq->fence.flags);
+	} while ((rq = stack_pop(rq, &pos)));
+
+	pos = lookup_priolist(engine, prio);
+	list_for_each_entry_safe(rq, rn, &dfs, sched.link) {
+		set_bit(I915_FENCE_FLAG_PQUEUE, &rq->fence.flags);
+		list_add_tail(&rq->sched.link, pos);
+	}
 }
 
 static void queue_request(struct intel_engine_cs *engine,
-- 
2.20.1


* [Intel-gfx] [PATCH 16/41] drm/i915: Move common active lists from engine to i915_scheduler
  2021-01-25 14:00 [Intel-gfx] [PATCH 01/41] drm/i915/selftests: Check for engine-reset errors in the middle of workarounds Chris Wilson
                   ` (13 preceding siblings ...)
  2021-01-25 14:01 ` [Intel-gfx] [PATCH 15/41] drm/i915: Fix the iterative dfs for defering requests Chris Wilson
@ 2021-01-25 14:01 ` Chris Wilson
  2021-01-25 14:01 ` [Intel-gfx] [PATCH 17/41] drm/i915: Move scheduler queue Chris Wilson
                   ` (29 subsequent siblings)
  44 siblings, 0 replies; 90+ messages in thread
From: Chris Wilson @ 2021-01-25 14:01 UTC (permalink / raw)
  To: intel-gfx; +Cc: thomas.hellstrom, Chris Wilson

Extract the scheduler lists into a related structure, so that they stop
sprawling over struct intel_engine_cs.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/gt/intel_engine_cs.c     | 26 +-------------
 drivers/gpu/drm/i915/gt/intel_engine_types.h  |  8 +----
 .../drm/i915/gt/intel_execlists_submission.c  |  2 +-
 drivers/gpu/drm/i915/gt/mock_engine.c         |  2 +-
 drivers/gpu/drm/i915/i915_scheduler.c         | 34 ++++++++++++++++---
 drivers/gpu/drm/i915/i915_scheduler.h         |  3 +-
 drivers/gpu/drm/i915/i915_scheduler_types.h   |  9 +++++
 7 files changed, 44 insertions(+), 40 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
index 3bfd3853c0e9..54b394bd9429 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
@@ -576,8 +576,6 @@ void intel_engine_init_execlists(struct intel_engine_cs *engine)
 
 	execlists->queue_priority_hint = INT_MIN;
 	execlists->queue = RB_ROOT_CACHED;
-
-	i915_sched_init_ipi(&execlists->ipi);
 }
 
 static void cleanup_status_page(struct intel_engine_cs *engine)
@@ -693,7 +691,7 @@ static int engine_setup_common(struct intel_engine_cs *engine)
 		goto err_status;
 	}
 
-	intel_engine_init_active(engine, ENGINE_PHYSICAL);
+	i915_sched_init_engine(&engine->active, ENGINE_PHYSICAL);
 	intel_engine_init_execlists(engine);
 	intel_engine_init_cmd_parser(engine);
 	intel_engine_init__pm(engine);
@@ -761,28 +759,6 @@ static int measure_breadcrumb_dw(struct intel_context *ce)
 	return dw;
 }
 
-void
-intel_engine_init_active(struct intel_engine_cs *engine, unsigned int subclass)
-{
-	INIT_LIST_HEAD(&engine->active.requests);
-	INIT_LIST_HEAD(&engine->active.hold);
-
-	spin_lock_init(&engine->active.lock);
-	lockdep_set_subclass(&engine->active.lock, subclass);
-
-	/*
-	 * Due to an interesting quirk in lockdep's internal debug tracking,
-	 * after setting a subclass we must ensure the lock is used. Otherwise,
-	 * nr_unused_locks is incremented once too often.
-	 */
-#ifdef CONFIG_DEBUG_LOCK_ALLOC
-	local_irq_disable();
-	lock_map_acquire(&engine->active.lock.dep_map);
-	lock_map_release(&engine->active.lock.dep_map);
-	local_irq_enable();
-#endif
-}
-
 static struct intel_context *
 create_pinned_context(struct intel_engine_cs *engine,
 		      unsigned int hwsp,
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_types.h b/drivers/gpu/drm/i915/gt/intel_engine_types.h
index 9105b7769635..36bcd85cc73b 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_types.h
+++ b/drivers/gpu/drm/i915/gt/intel_engine_types.h
@@ -258,8 +258,6 @@ struct intel_engine_execlists {
 	struct rb_root_cached queue;
 	struct rb_root_cached virtual;
 
-	struct i915_sched_ipi ipi;
-
 	/**
 	 * @csb_write: control register for Context Switch buffer
 	 *
@@ -329,11 +327,7 @@ struct intel_engine_cs {
 
 	struct intel_sseu sseu;
 
-	struct {
-		spinlock_t lock;
-		struct list_head requests;
-		struct list_head hold; /* ready requests, but on hold */
-	} active;
+	struct i915_sched_engine active;
 
 	/* keep a request in reserve for a [pm] barrier under oom */
 	struct i915_request *request_pool;
diff --git a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
index abed9dbef124..ff5025f18560 100644
--- a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
+++ b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
@@ -3333,7 +3333,7 @@ intel_execlists_create_virtual(struct intel_engine_cs **siblings,
 
 	snprintf(ve->base.name, sizeof(ve->base.name), "virtual");
 
-	intel_engine_init_active(&ve->base, ENGINE_VIRTUAL);
+	i915_sched_init_engine(&ve->base.active, ENGINE_VIRTUAL);
 	intel_engine_init_execlists(&ve->base);
 
 	ve->base.cops = &virtual_context_ops;
diff --git a/drivers/gpu/drm/i915/gt/mock_engine.c b/drivers/gpu/drm/i915/gt/mock_engine.c
index df7c1b1acc32..217f217200df 100644
--- a/drivers/gpu/drm/i915/gt/mock_engine.c
+++ b/drivers/gpu/drm/i915/gt/mock_engine.c
@@ -325,7 +325,7 @@ int mock_engine_init(struct intel_engine_cs *engine)
 {
 	struct intel_context *ce;
 
-	intel_engine_init_active(engine, ENGINE_MOCK);
+	i915_sched_init_engine(&engine->active, ENGINE_MOCK);
 	intel_engine_init_execlists(engine);
 	intel_engine_init__pm(engine);
 	intel_engine_init_retire(engine);
diff --git a/drivers/gpu/drm/i915/i915_scheduler.c b/drivers/gpu/drm/i915/i915_scheduler.c
index 7c93c2a8309b..14c69543a98d 100644
--- a/drivers/gpu/drm/i915/i915_scheduler.c
+++ b/drivers/gpu/drm/i915/i915_scheduler.c
@@ -85,12 +85,36 @@ static void ipi_schedule(struct work_struct *wrk)
 	} while (rq);
 }
 
-void i915_sched_init_ipi(struct i915_sched_ipi *ipi)
+static void i915_sched_init_ipi(struct i915_sched_ipi *ipi)
 {
 	INIT_WORK(&ipi->work, ipi_schedule);
 	ipi->list = NULL;
 }
 
+void i915_sched_init_engine(struct i915_sched_engine *se,
+			    unsigned int subclass)
+{
+	spin_lock_init(&se->lock);
+	lockdep_set_subclass(&se->lock, subclass);
+
+	INIT_LIST_HEAD(&se->requests);
+	INIT_LIST_HEAD(&se->hold);
+
+	i915_sched_init_ipi(&se->ipi);
+
+	/*
+	 * Due to an interesting quirk in lockdep's internal debug tracking,
+	 * after setting a subclass we must ensure the lock is used. Otherwise,
+	 * nr_unused_locks is incremented once too often.
+	 */
+#ifdef CONFIG_DEBUG_LOCK_ALLOC
+	local_irq_disable();
+	lock_map_acquire(&se->lock.dep_map);
+	lock_map_release(&se->lock.dep_map);
+	local_irq_enable();
+#endif
+}
+
 static void __ipi_add(struct i915_request *rq)
 {
 #define STUB ((struct i915_request *)1)
@@ -106,13 +130,13 @@ static void __ipi_add(struct i915_request *rq)
 		return;
 	}
 
-	first = READ_ONCE(engine->execlists.ipi.list);
-	do
+	first = READ_ONCE(engine->active.ipi.list);
+	do {
 		rq->sched.ipi_link = ptr_pack_bits(first, 1, 1);
-	while (!try_cmpxchg(&engine->execlists.ipi.list, &first, rq));
+	} while (!try_cmpxchg(&engine->active.ipi.list, &first, rq));
 
 	if (!first)
-		queue_work(system_unbound_wq, &engine->execlists.ipi.work);
+		queue_work(system_unbound_wq, &engine->active.ipi.work);
 }
 
 /*
diff --git a/drivers/gpu/drm/i915/i915_scheduler.h b/drivers/gpu/drm/i915/i915_scheduler.h
index 6eafda957f64..5c543124bdb9 100644
--- a/drivers/gpu/drm/i915/i915_scheduler.h
+++ b/drivers/gpu/drm/i915/i915_scheduler.h
@@ -35,7 +35,8 @@ int i915_sched_node_add_dependency(struct i915_sched_node *node,
 
 void i915_sched_node_retire(struct i915_sched_node *node);
 
-void i915_sched_init_ipi(struct i915_sched_ipi *ipi);
+void i915_sched_init_engine(struct i915_sched_engine *se,
+			    unsigned int subclass);
 
 void i915_request_set_priority(struct i915_request *request, int prio);
 
diff --git a/drivers/gpu/drm/i915/i915_scheduler_types.h b/drivers/gpu/drm/i915/i915_scheduler_types.h
index 5a84d59134ee..f1e9bfa662e2 100644
--- a/drivers/gpu/drm/i915/i915_scheduler_types.h
+++ b/drivers/gpu/drm/i915/i915_scheduler_types.h
@@ -80,6 +80,15 @@ struct i915_sched_ipi {
 	struct work_struct work;
 };
 
+struct i915_sched_engine {
+	spinlock_t lock; /* protects the scheduling lists and queue */
+
+	struct list_head requests;
+	struct list_head hold; /* ready requests, but on hold */
+
+	struct i915_sched_ipi ipi;
+};
+
 struct i915_dependency {
 	struct i915_sched_node *signaler;
 	struct i915_sched_node *waiter;
-- 
2.20.1


* [Intel-gfx] [PATCH 17/41] drm/i915: Move scheduler queue
  2021-01-25 14:00 [Intel-gfx] [PATCH 01/41] drm/i915/selftests: Check for engine-reset errors in the middle of workarounds Chris Wilson
                   ` (14 preceding siblings ...)
  2021-01-25 14:01 ` [Intel-gfx] [PATCH 16/41] drm/i915: Move common active lists from engine to i915_scheduler Chris Wilson
@ 2021-01-25 14:01 ` Chris Wilson
  2021-01-25 14:01 ` [Intel-gfx] [PATCH 18/41] drm/i915: Move tasklet from execlists to sched Chris Wilson
                   ` (28 subsequent siblings)
  44 siblings, 0 replies; 90+ messages in thread
From: Chris Wilson @ 2021-01-25 14:01 UTC (permalink / raw)
  To: intel-gfx; +Cc: thomas.hellstrom, Chris Wilson

Extract the scheduling queue from "execlists" into the per-engine
scheduling structs, for reuse by other backends.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
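Reader's note, not part of the patch: a minimal userspace sketch of the idea in
this change, namely that the priority queue lives in the per-engine scheduler
struct and callers ask a small helper whether it is empty instead of peeking at
a backend-private rbtree. Everything here is an assumption made for
illustration: `struct sched_engine`, `struct priolist`, `sched_is_idle()`,
`queue_prio()` and `queue_insert()` are hypothetical stand-ins, and a sorted
singly linked list is swapped in for the kernel's `rb_root_cached`.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct i915_priolist (one node per priority). */
struct priolist {
	int priority;
	struct priolist *next;
};

/* Hypothetical stand-in for struct i915_sched_engine. */
struct sched_engine {
	struct priolist *queue; /* sorted, most positive priority first */
};

/* Mirrors the role of i915_sched_is_idle(): is anything queued? */
static bool sched_is_idle(const struct sched_engine *se)
{
	return se->queue == NULL;
}

/* Mirrors the role of queue_prio(): INT_MIN when the queue is empty. */
static int queue_prio(const struct sched_engine *se)
{
	return se->queue ? se->queue->priority : INT_MIN;
}

/* Insert so that higher priorities come first and equal priorities are FIFO. */
static void queue_insert(struct sched_engine *se, struct priolist *p)
{
	struct priolist **pos = &se->queue;

	assert(p != NULL);

	/* Walk past every entry of higher or equal priority. */
	while (*pos && (*pos)->priority >= p->priority)
		pos = &(*pos)->next;

	p->next = *pos;
	*pos = p;
}
```

With an empty engine, `sched_is_idle()` is true and `queue_prio()` reports
`INT_MIN`; after inserting priorities 0 and 10, the engine is no longer idle
and `queue_prio()` reports 10. Backends only need the helpers, which is what
lets the queue be shared.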
 .../gpu/drm/i915/gem/i915_gem_context_types.h |  2 +-
 drivers/gpu/drm/i915/gem/i915_gem_wait.c      |  1 +
 drivers/gpu/drm/i915/gt/intel_engine_cs.c     |  5 +--
 drivers/gpu/drm/i915/gt/intel_engine_pm.c     |  3 +-
 drivers/gpu/drm/i915/gt/intel_engine_types.h  | 14 -------
 .../drm/i915/gt/intel_execlists_submission.c  | 29 +++++++-------
 .../gpu/drm/i915/gt/uc/intel_guc_submission.c | 11 +++---
 drivers/gpu/drm/i915/i915_drv.h               |  1 -
 drivers/gpu/drm/i915/i915_request.h           |  2 +-
 drivers/gpu/drm/i915/i915_scheduler.c         | 38 ++++++++++++-------
 drivers/gpu/drm/i915/i915_scheduler.h         | 15 ++++++++
 drivers/gpu/drm/i915/i915_scheduler_types.h   | 14 +++++++
 .../gpu/drm/i915/selftests/i915_scheduler.c   |  2 +-
 13 files changed, 83 insertions(+), 54 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context_types.h b/drivers/gpu/drm/i915/gem/i915_gem_context_types.h
index 085f6a3735e8..d5bc75508048 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context_types.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context_types.h
@@ -19,7 +19,7 @@
 
 #include "gt/intel_context_types.h"
 
-#include "i915_scheduler.h"
+#include "i915_scheduler_types.h"
 #include "i915_sw_fence.h"
 
 struct pid;
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_wait.c b/drivers/gpu/drm/i915/gem/i915_gem_wait.c
index d79bf16083bd..4d1897c347b9 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_wait.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_wait.c
@@ -13,6 +13,7 @@
 #include "dma_resv_utils.h"
 #include "i915_gem_ioctls.h"
 #include "i915_gem_object.h"
+#include "i915_scheduler.h"
 
 static long
 i915_gem_object_wait_fence(struct dma_fence *fence,
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
index 54b394bd9429..ef225da35399 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
@@ -575,7 +575,6 @@ void intel_engine_init_execlists(struct intel_engine_cs *engine)
 		memset(execlists->inflight, 0, sizeof(execlists->inflight));
 
 	execlists->queue_priority_hint = INT_MIN;
-	execlists->queue = RB_ROOT_CACHED;
 }
 
 static void cleanup_status_page(struct intel_engine_cs *engine)
@@ -902,7 +901,7 @@ int intel_engines_init(struct intel_gt *gt)
  */
 void intel_engine_cleanup_common(struct intel_engine_cs *engine)
 {
-	GEM_BUG_ON(!list_empty(&engine->active.requests));
+	i915_sched_fini_engine(&engine->active);
 	tasklet_kill(&engine->execlists.tasklet); /* flush the callback */
 
 	intel_breadcrumbs_free(engine->breadcrumbs);
@@ -1234,7 +1233,7 @@ bool intel_engine_is_idle(struct intel_engine_cs *engine)
 	}
 
 	/* ELSP is empty, but there are ready requests? E.g. after reset */
-	if (!RB_EMPTY_ROOT(&engine->execlists.queue.rb_root))
+	if (!i915_sched_is_idle(&engine->active))
 		return false;
 
 	/* Ring stopped? */
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_pm.c b/drivers/gpu/drm/i915/gt/intel_engine_pm.c
index 6372d7826bc9..205feeaf0e76 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_pm.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_pm.c
@@ -4,6 +4,7 @@
  */
 
 #include "i915_drv.h"
+#include "i915_scheduler.h"
 
 #include "intel_breadcrumbs.h"
 #include "intel_context.h"
@@ -276,7 +277,7 @@ static int __engine_park(struct intel_wakeref *wf)
 	if (engine->park)
 		engine->park(engine);
 
-	engine->execlists.no_priolist = false;
+	i915_sched_park_engine(&engine->active);
 
 	/* While gt calls i915_vma_parked(), we have to break the lock cycle */
 	intel_gt_pm_put_async(engine->gt);
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_types.h b/drivers/gpu/drm/i915/gt/intel_engine_types.h
index 36bcd85cc73b..c46d70b7e484 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_types.h
+++ b/drivers/gpu/drm/i915/gt/intel_engine_types.h
@@ -153,11 +153,6 @@ struct intel_engine_execlists {
 	 */
 	struct timer_list preempt;
 
-	/**
-	 * @default_priolist: priority list for I915_PRIORITY_NORMAL
-	 */
-	struct i915_priolist default_priolist;
-
 	/**
 	 * @ccid: identifier for contexts submitted to this engine
 	 */
@@ -192,11 +187,6 @@ struct intel_engine_execlists {
 	 */
 	u32 reset_ccid;
 
-	/**
-	 * @no_priolist: priority lists disabled
-	 */
-	bool no_priolist;
-
 	/**
 	 * @submit_reg: gen-specific execlist submission register
 	 * set to the ExecList Submission Port (elsp) register pre-Gen11 and to
@@ -252,10 +242,6 @@ struct intel_engine_execlists {
 	 */
 	int queue_priority_hint;
 
-	/**
-	 * @queue: queue of requests, in priority lists
-	 */
-	struct rb_root_cached queue;
 	struct rb_root_cached virtual;
 
 	/**
diff --git a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
index ff5025f18560..756ac388a4a8 100644
--- a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
+++ b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
@@ -272,11 +272,11 @@ static int effective_prio(const struct i915_request *rq)
 	return prio;
 }
 
-static int queue_prio(const struct intel_engine_execlists *execlists)
+static int queue_prio(const struct i915_sched_engine *se)
 {
 	struct rb_node *rb;
 
-	rb = rb_first_cached(&execlists->queue);
+	rb = rb_first_cached(&se->queue);
 	if (!rb)
 		return INT_MIN;
 
@@ -339,7 +339,7 @@ static bool need_preempt(const struct intel_engine_cs *engine,
 	 * context, it's priority would not exceed ELSP[0] aka last_prio.
 	 */
 	return max(virtual_prio(&engine->execlists),
-		   queue_prio(&engine->execlists)) > last_prio;
+		   queue_prio(&engine->active)) > last_prio;
 }
 
 __maybe_unused static bool
@@ -1030,13 +1030,13 @@ static bool needs_timeslice(const struct intel_engine_cs *engine,
 		return false;
 
 	/* If ELSP[1] is occupied, always check to see if worth slicing */
-	if (!list_is_last_rcu(&rq->sched.link, &engine->active.requests)) {
+	if (!i915_sched_is_last_request(&engine->active, rq)) {
 		ENGINE_TRACE(engine, "timeslice required for second inflight context\n");
 		return true;
 	}
 
 	/* Otherwise, ELSP[0] is by itself, but may be waiting in the queue */
-	if (!RB_EMPTY_ROOT(&engine->execlists.queue.rb_root)) {
+	if (!i915_sched_is_idle(&engine->active)) {
 		ENGINE_TRACE(engine, "timeslice required for queue\n");
 		return true;
 	}
@@ -1281,7 +1281,7 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
 		GEM_BUG_ON(rq->engine != &ve->base);
 		GEM_BUG_ON(rq->context != &ve->context);
 
-		if (unlikely(rq_prio(rq) < queue_prio(execlists))) {
+		if (unlikely(rq_prio(rq) < queue_prio(&engine->active))) {
 			spin_unlock(&ve->base.active.lock);
 			break;
 		}
@@ -1347,7 +1347,7 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
 			break;
 	}
 
-	while ((rb = rb_first_cached(&execlists->queue))) {
+	while ((rb = rb_first_cached(&engine->active.queue))) {
 		struct i915_priolist *p = to_priolist(rb);
 		struct i915_request *rq, *rn;
 
@@ -1426,7 +1426,7 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
 			}
 		}
 
-		rb_erase_cached(&p->node, &execlists->queue);
+		rb_erase_cached(&p->node, &engine->active.queue);
 		i915_priolist_free(p);
 	}
 done:
@@ -1448,7 +1448,7 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
 	 * request triggering preemption on the next dequeue (or subsequent
 	 * interrupt for secondary ports).
 	 */
-	execlists->queue_priority_hint = queue_prio(execlists);
+	execlists->queue_priority_hint = queue_prio(&engine->active);
 	spin_unlock(&engine->active.lock);
 
 	/*
@@ -2662,7 +2662,7 @@ static void execlists_reset_cancel(struct intel_engine_cs *engine)
 	intel_engine_signal_breadcrumbs(engine);
 
 	/* Flush the queued requests to the timeline list (for retiring). */
-	while ((rb = rb_first_cached(&execlists->queue))) {
+	while ((rb = rb_first_cached(&engine->active.queue))) {
 		struct i915_priolist *p = to_priolist(rb);
 
 		priolist_for_each_request_consume(rq, rn, p) {
@@ -2670,9 +2670,10 @@ static void execlists_reset_cancel(struct intel_engine_cs *engine)
 			__i915_request_submit(rq);
 		}
 
-		rb_erase_cached(&p->node, &execlists->queue);
+		rb_erase_cached(&p->node, &engine->active.queue);
 		i915_priolist_free(p);
 	}
+	GEM_BUG_ON(!i915_sched_is_idle(&engine->active));
 
 	/* On-hold requests will be flushed to timeline upon their release */
 	list_for_each_entry(rq, &engine->active.hold, sched.link)
@@ -2703,7 +2704,7 @@ static void execlists_reset_cancel(struct intel_engine_cs *engine)
 	/* Remaining _unready_ requests will be nop'ed when submitted */
 
 	execlists->queue_priority_hint = INT_MIN;
-	execlists->queue = RB_ROOT_CACHED;
+	engine->active.queue = RB_ROOT_CACHED;
 
 	GEM_BUG_ON(__tasklet_is_enabled(&execlists->tasklet));
 	execlists->tasklet.func = nop_submission_tasklet;
@@ -2939,7 +2940,7 @@ int intel_execlists_submission_setup(struct intel_engine_cs *engine)
 
 static struct list_head *virtual_queue(struct virtual_engine *ve)
 {
-	return &ve->base.execlists.default_priolist.requests;
+	return &ve->base.active.default_priolist.requests;
 }
 
 static void rcu_virtual_context_destroy(struct work_struct *wrk)
@@ -3532,7 +3533,7 @@ void intel_execlists_show_requests(struct intel_engine_cs *engine,
 
 	last = NULL;
 	count = 0;
-	for (rb = rb_first_cached(&execlists->queue); rb; rb = rb_next(rb)) {
+	for (rb = rb_first_cached(&engine->active.queue); rb; rb = rb_next(rb)) {
 		struct i915_priolist *p = rb_entry(rb, typeof(*p), node);
 
 		priolist_for_each_request(rq, p) {
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
index 56ec738a9ce7..70b2e23a9644 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
@@ -203,7 +203,7 @@ static void __guc_dequeue(struct intel_engine_cs *engine)
 	 * event.
 	 */
 	port = first;
-	while ((rb = rb_first_cached(&execlists->queue))) {
+	while ((rb = rb_first_cached(&engine->active.queue))) {
 		struct i915_priolist *p = to_priolist(rb);
 		struct i915_request *rq, *rn;
 
@@ -223,7 +223,7 @@ static void __guc_dequeue(struct intel_engine_cs *engine)
 			last = rq;
 		}
 
-		rb_erase_cached(&p->node, &execlists->queue);
+		rb_erase_cached(&p->node, &engine->active.queue);
 		i915_priolist_free(p);
 	}
 done:
@@ -357,7 +357,7 @@ static void guc_reset_cancel(struct intel_engine_cs *engine)
 	}
 
 	/* Flush the queued requests to the timeline list (for retiring). */
-	while ((rb = rb_first_cached(&execlists->queue))) {
+	while ((rb = rb_first_cached(&engine->active.queue))) {
 		struct i915_priolist *p = to_priolist(rb);
 
 		priolist_for_each_request_consume(rq, rn, p) {
@@ -367,14 +367,15 @@ static void guc_reset_cancel(struct intel_engine_cs *engine)
 			i915_request_mark_complete(rq);
 		}
 
-		rb_erase_cached(&p->node, &execlists->queue);
+		rb_erase_cached(&p->node, &engine->active.queue);
 		i915_priolist_free(p);
 	}
+	GEM_BUG_ON(!i915_sched_is_idle(&engine->active));
 
 	/* Remaining _unready_ requests will be nop'ed when submitted */
 
 	execlists->queue_priority_hint = INT_MIN;
-	execlists->queue = RB_ROOT_CACHED;
+	engine->active.queue = RB_ROOT_CACHED;
 
 	spin_unlock_irqrestore(&engine->active.lock, flags);
 }
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index e0fcfc5590d9..f281c5799133 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -99,7 +99,6 @@
 #include "i915_gpu_error.h"
 #include "i915_perf_types.h"
 #include "i915_request.h"
-#include "i915_scheduler.h"
 #include "gt/intel_timeline.h"
 #include "i915_vma.h"
 #include "i915_irq.h"
diff --git a/drivers/gpu/drm/i915/i915_request.h b/drivers/gpu/drm/i915/i915_request.h
index 1bfe214a47e9..199dffea28ec 100644
--- a/drivers/gpu/drm/i915/i915_request.h
+++ b/drivers/gpu/drm/i915/i915_request.h
@@ -35,7 +35,7 @@
 #include "gt/intel_timeline_types.h"
 
 #include "i915_gem.h"
-#include "i915_scheduler.h"
+#include "i915_scheduler_types.h"
 #include "i915_selftest.h"
 #include "i915_sw_fence.h"
 
diff --git a/drivers/gpu/drm/i915/i915_scheduler.c b/drivers/gpu/drm/i915/i915_scheduler.c
index 14c69543a98d..76a11452c361 100644
--- a/drivers/gpu/drm/i915/i915_scheduler.c
+++ b/drivers/gpu/drm/i915/i915_scheduler.c
@@ -99,6 +99,7 @@ void i915_sched_init_engine(struct i915_sched_engine *se,
 
 	INIT_LIST_HEAD(&se->requests);
 	INIT_LIST_HEAD(&se->hold);
+	se->queue = RB_ROOT_CACHED;
 
 	i915_sched_init_ipi(&se->ipi);
 
@@ -115,6 +116,17 @@ void i915_sched_init_engine(struct i915_sched_engine *se,
 #endif
 }
 
+void i915_sched_park_engine(struct i915_sched_engine *se)
+{
+	GEM_BUG_ON(!i915_sched_is_idle(se));
+	se->no_priolist = false;
+}
+
+void i915_sched_fini_engine(struct i915_sched_engine *se)
+{
+	GEM_BUG_ON(!list_empty(&se->requests));
+}
+
 static void __ipi_add(struct i915_request *rq)
 {
 #define STUB ((struct i915_request *)1)
@@ -175,7 +187,7 @@ static inline struct i915_priolist *to_priolist(struct rb_node *rb)
 	return rb_entry(rb, struct i915_priolist, node);
 }
 
-static void assert_priolists(struct intel_engine_execlists * const execlists)
+static void assert_priolists(struct i915_sched_engine * const se)
 {
 	struct rb_node *rb;
 	long last_prio;
@@ -183,11 +195,11 @@ static void assert_priolists(struct intel_engine_execlists * const execlists)
 	if (!IS_ENABLED(CONFIG_DRM_I915_DEBUG_GEM))
 		return;
 
-	GEM_BUG_ON(rb_first_cached(&execlists->queue) !=
-		   rb_first(&execlists->queue.rb_root));
+	GEM_BUG_ON(rb_first_cached(&se->queue) !=
+		   rb_first(&se->queue.rb_root));
 
 	last_prio = INT_MAX;
-	for (rb = rb_first_cached(&execlists->queue); rb; rb = rb_next(rb)) {
+	for (rb = rb_first_cached(&se->queue); rb; rb = rb_next(rb)) {
 		const struct i915_priolist *p = to_priolist(rb);
 
 		GEM_BUG_ON(p->priority > last_prio);
@@ -198,21 +210,21 @@ static void assert_priolists(struct intel_engine_execlists * const execlists)
 static struct list_head *
 lookup_priolist(struct intel_engine_cs *engine, int prio)
 {
-	struct intel_engine_execlists * const execlists = &engine->execlists;
+	struct i915_sched_engine * const se = &engine->active;
 	struct i915_priolist *p;
 	struct rb_node **parent, *rb;
 	bool first = true;
 
 	lockdep_assert_held(&engine->active.lock);
-	assert_priolists(execlists);
+	assert_priolists(se);
 
-	if (unlikely(execlists->no_priolist))
+	if (unlikely(se->no_priolist))
 		prio = I915_PRIORITY_NORMAL;
 
 find_priolist:
 	/* most positive priority is scheduled first, equal priorities fifo */
 	rb = NULL;
-	parent = &execlists->queue.rb_root.rb_node;
+	parent = &se->queue.rb_root.rb_node;
 	while (*parent) {
 		rb = *parent;
 		p = to_priolist(rb);
@@ -227,7 +239,7 @@ lookup_priolist(struct intel_engine_cs *engine, int prio)
 	}
 
 	if (prio == I915_PRIORITY_NORMAL) {
-		p = &execlists->default_priolist;
+		p = &se->default_priolist;
 	} else {
 		p = kmem_cache_alloc(global.slab_priorities, GFP_ATOMIC);
 		/* Convert an allocation failure to a priority bump */
@@ -242,7 +254,7 @@ lookup_priolist(struct intel_engine_cs *engine, int prio)
 			 * requests, so if userspace lied about their
 			 * dependencies that reordering may be visible.
 			 */
-			execlists->no_priolist = true;
+			se->no_priolist = true;
 			goto find_priolist;
 		}
 	}
@@ -251,7 +263,7 @@ lookup_priolist(struct intel_engine_cs *engine, int prio)
 	INIT_LIST_HEAD(&p->requests);
 
 	rb_link_node(&p->node, rb, parent);
-	rb_insert_color_cached(&p->node, &execlists->queue, first);
+	rb_insert_color_cached(&p->node, &se->queue, first);
 
 	return &p->requests;
 }
@@ -623,7 +635,7 @@ void i915_request_enqueue(struct i915_request *rq)
 	} else {
 		queue_request(engine, rq);
 
-		GEM_BUG_ON(RB_EMPTY_ROOT(&engine->execlists.queue.rb_root));
+		GEM_BUG_ON(i915_sched_is_idle(&engine->active));
 
 		kick = submit_queue(engine, rq);
 	}
@@ -658,7 +670,7 @@ __intel_engine_rewind_requests(struct intel_engine_cs *engine)
 			prio = rq_prio(rq);
 			pl = lookup_priolist(engine, prio);
 		}
-		GEM_BUG_ON(RB_EMPTY_ROOT(&engine->execlists.queue.rb_root));
+		GEM_BUG_ON(i915_sched_is_idle(&engine->active));
 
 		list_move(&rq->sched.link, pl);
 		set_bit(I915_FENCE_FLAG_PQUEUE, &rq->fence.flags);
diff --git a/drivers/gpu/drm/i915/i915_scheduler.h b/drivers/gpu/drm/i915/i915_scheduler.h
index 5c543124bdb9..e89bf5ed61b3 100644
--- a/drivers/gpu/drm/i915/i915_scheduler.h
+++ b/drivers/gpu/drm/i915/i915_scheduler.h
@@ -12,6 +12,7 @@
 #include <linux/kernel.h>
 
 #include "i915_scheduler_types.h"
+#include "i915_request.h"
 
 struct drm_printer;
 
@@ -37,6 +38,8 @@ void i915_sched_node_retire(struct i915_sched_node *node);
 
 void i915_sched_init_engine(struct i915_sched_engine *se,
 			    unsigned int subclass);
+void i915_sched_park_engine(struct i915_sched_engine *se);
+void i915_sched_fini_engine(struct i915_sched_engine *se);
 
 void i915_request_set_priority(struct i915_request *request, int prio);
 
@@ -64,6 +67,18 @@ static inline void i915_priolist_free(struct i915_priolist *p)
 		__i915_priolist_free(p);
 }
 
+static inline bool i915_sched_is_idle(const struct i915_sched_engine *se)
+{
+	return RB_EMPTY_ROOT(&se->queue.rb_root);
+}
+
+static inline bool
+i915_sched_is_last_request(const struct i915_sched_engine *se,
+			   const struct i915_request *rq)
+{
+	return list_is_last_rcu(&rq->sched.link, &se->requests);
+}
+
 void i915_request_show_with_schedule(struct drm_printer *m,
 				     const struct i915_request *rq,
 				     const char *prefix,
diff --git a/drivers/gpu/drm/i915/i915_scheduler_types.h b/drivers/gpu/drm/i915/i915_scheduler_types.h
index f1e9bfa662e2..aa8a2d1e890a 100644
--- a/drivers/gpu/drm/i915/i915_scheduler_types.h
+++ b/drivers/gpu/drm/i915/i915_scheduler_types.h
@@ -85,8 +85,22 @@ struct i915_sched_engine {
 
 	struct list_head requests;
 	struct list_head hold; /* ready requests, but on hold */
+	/**
+	 * @queue: queue of requests, in priority lists
+	 */
+	struct rb_root_cached queue;
 
 	struct i915_sched_ipi ipi;
+
+	/**
+	 * @default_priolist: priority list for I915_PRIORITY_NORMAL
+	 */
+	struct i915_priolist default_priolist;
+
+	/**
+	 * @no_priolist: priority lists disabled
+	 */
+	bool no_priolist;
 };
 
 struct i915_dependency {
diff --git a/drivers/gpu/drm/i915/selftests/i915_scheduler.c b/drivers/gpu/drm/i915/selftests/i915_scheduler.c
index ad2a44449c44..000b68a40d4c 100644
--- a/drivers/gpu/drm/i915/selftests/i915_scheduler.c
+++ b/drivers/gpu/drm/i915/selftests/i915_scheduler.c
@@ -92,7 +92,7 @@ static bool check_context_order(struct intel_engine_cs *engine)
 	last_context = 0;
 	last_seqno = 0;
 	last_prio = 0;
-	for (rb = rb_first_cached(&engine->execlists.queue); rb; rb = rb_next(rb)) {
+	for (rb = rb_first_cached(&engine->active.queue); rb; rb = rb_next(rb)) {
 		struct i915_priolist *p = rb_entry(rb, typeof(*p), node);
 		struct i915_request *rq;
 
-- 
2.20.1


* [Intel-gfx] [PATCH 18/41] drm/i915: Move tasklet from execlists to sched
  2021-01-25 14:00 [Intel-gfx] [PATCH 01/41] drm/i915/selftests: Check for engine-reset errors in the middle of workarounds Chris Wilson
                   ` (15 preceding siblings ...)
  2021-01-25 14:01 ` [Intel-gfx] [PATCH 17/41] drm/i915: Move scheduler queue Chris Wilson
@ 2021-01-25 14:01 ` Chris Wilson
  2021-01-27 14:10   ` Tvrtko Ursulin
  2021-01-25 14:01 ` [Intel-gfx] [PATCH 19/41] drm/i915/gt: Show scheduler queues when dumping state Chris Wilson
                   ` (27 subsequent siblings)
  44 siblings, 1 reply; 90+ messages in thread
From: Chris Wilson @ 2021-01-25 14:01 UTC (permalink / raw)
  To: intel-gfx; +Cc: thomas.hellstrom, Chris Wilson

Move the scheduling tasklet out of the execlists backend into the
per-engine scheduling bookkeeping.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
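Reader's note, not part of the patch: once the tasklet belongs to the scheduler
struct, code that only holds a pointer to the embedded execlists state has to
find its way back to the enclosing engine before it can kick the scheduler. The
sketch below shows that `container_of()` step in plain userspace C. All names
(`struct engine`, `struct sched_engine`, `struct execlists_state`,
`sched_kick()`, `execlists_kick()`) are hypothetical stand-ins, the macro is a
simplified version without the kernel's type checking, and a counter replaces
the real tasklet scheduling.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified userspace container_of(): member pointer -> enclosing struct. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical scheduler state; 'kicks' stands in for scheduling a tasklet. */
struct sched_engine {
	unsigned int kicks;
};

/* Hypothetical backend-private state that no longer owns the tasklet. */
struct execlists_state {
	int queue_priority_hint;
};

/* Hypothetical engine embedding both pieces of state by value. */
struct engine {
	struct sched_engine active;
	struct execlists_state execlists;
};

/* Stand-in for the scheduler kick helper: record that a kick happened. */
static void sched_kick(struct sched_engine *se)
{
	assert(se != NULL);
	se->kicks++;
}

/* Recover the engine from its embedded member, then kick its scheduler. */
static void execlists_kick(struct execlists_state *execlists)
{
	struct engine *engine =
		container_of(execlists, struct engine, execlists);

	sched_kick(&engine->active);
}
```

Calling `execlists_kick(&e.execlists)` on a zero-initialised `struct engine e`
leaves `e.active.kicks` at 1. The pointer arithmetic is only valid because the
member is embedded by value in the enclosing struct.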
 drivers/gpu/drm/i915/gt/intel_engine.h        | 14 ----
 drivers/gpu/drm/i915/gt/intel_engine_cs.c     | 11 ++--
 drivers/gpu/drm/i915/gt/intel_engine_types.h  |  5 --
 .../drm/i915/gt/intel_execlists_submission.c  | 65 +++++++++----------
 drivers/gpu/drm/i915/gt/intel_gt_irq.c        |  2 +-
 drivers/gpu/drm/i915/gt/selftest_execlists.c  | 16 ++---
 drivers/gpu/drm/i915/gt/selftest_hangcheck.c  |  2 +-
 drivers/gpu/drm/i915/gt/selftest_lrc.c        |  6 +-
 drivers/gpu/drm/i915/gt/selftest_reset.c      |  2 +-
 .../gpu/drm/i915/gt/uc/intel_guc_submission.c | 18 ++---
 drivers/gpu/drm/i915/i915_scheduler.c         | 14 ++--
 drivers/gpu/drm/i915/i915_scheduler.h         | 20 ++++++
 drivers/gpu/drm/i915/i915_scheduler_types.h   |  6 ++
 .../gpu/drm/i915/selftests/i915_scheduler.c   | 16 ++---
 14 files changed, 99 insertions(+), 98 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_engine.h b/drivers/gpu/drm/i915/gt/intel_engine.h
index 20974415e7d8..801ae54cf60d 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine.h
+++ b/drivers/gpu/drm/i915/gt/intel_engine.h
@@ -122,20 +122,6 @@ execlists_active(const struct intel_engine_execlists *execlists)
 	return active;
 }
 
-static inline void
-execlists_active_lock_bh(struct intel_engine_execlists *execlists)
-{
-	local_bh_disable(); /* prevent local softirq and lock recursion */
-	tasklet_lock(&execlists->tasklet);
-}
-
-static inline void
-execlists_active_unlock_bh(struct intel_engine_execlists *execlists)
-{
-	tasklet_unlock(&execlists->tasklet);
-	local_bh_enable(); /* restore softirq, and kick ksoftirqd! */
-}
-
 static inline u32
 intel_read_status_page(const struct intel_engine_cs *engine, int reg)
 {
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
index ef225da35399..cdd07aeada05 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
@@ -902,7 +902,6 @@ int intel_engines_init(struct intel_gt *gt)
 void intel_engine_cleanup_common(struct intel_engine_cs *engine)
 {
 	i915_sched_fini_engine(&engine->active);
-	tasklet_kill(&engine->execlists.tasklet); /* flush the callback */
 
 	intel_breadcrumbs_free(engine->breadcrumbs);
 
@@ -1187,7 +1186,7 @@ static bool ring_is_idle(struct intel_engine_cs *engine)
 
 void __intel_engine_flush_submission(struct intel_engine_cs *engine, bool sync)
 {
-	struct tasklet_struct *t = &engine->execlists.tasklet;
+	struct tasklet_struct *t = &engine->active.tasklet;
 
 	if (!t->func)
 		return;
@@ -1454,8 +1453,8 @@ static void intel_engine_print_registers(struct intel_engine_cs *engine,
 
 		drm_printf(m, "\tExeclist tasklet queued? %s (%s), preempt? %s, timeslice? %s\n",
 			   yesno(test_bit(TASKLET_STATE_SCHED,
-					  &engine->execlists.tasklet.state)),
-			   enableddisabled(!atomic_read(&engine->execlists.tasklet.count)),
+					  &engine->active.tasklet.state)),
+			   enableddisabled(!atomic_read(&engine->active.tasklet.count)),
 			   repr_timer(&engine->execlists.preempt),
 			   repr_timer(&engine->execlists.timer));
 
@@ -1479,7 +1478,7 @@ static void intel_engine_print_registers(struct intel_engine_cs *engine,
 				   idx, hws[idx * 2], hws[idx * 2 + 1]);
 		}
 
-		execlists_active_lock_bh(execlists);
+		i915_sched_lock_bh(&engine->active);
 		rcu_read_lock();
 		for (port = execlists->active; (rq = *port); port++) {
 			char hdr[160];
@@ -1510,7 +1509,7 @@ static void intel_engine_print_registers(struct intel_engine_cs *engine,
 			i915_request_show(m, rq, hdr, 0);
 		}
 		rcu_read_unlock();
-		execlists_active_unlock_bh(execlists);
+		i915_sched_unlock_bh(&engine->active);
 	} else if (INTEL_GEN(dev_priv) > 6) {
 		drm_printf(m, "\tPP_DIR_BASE: 0x%08x\n",
 			   ENGINE_READ(engine, RING_PP_DIR_BASE));
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_types.h b/drivers/gpu/drm/i915/gt/intel_engine_types.h
index c46d70b7e484..76d561c2c6aa 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_types.h
+++ b/drivers/gpu/drm/i915/gt/intel_engine_types.h
@@ -138,11 +138,6 @@ struct st_preempt_hang {
  * driver and the hardware state for execlist mode of submission.
  */
 struct intel_engine_execlists {
-	/**
-	 * @tasklet: softirq tasklet for bottom handler
-	 */
-	struct tasklet_struct tasklet;
-
 	/**
 	 * @timer: kick the current context if its timeslice expires
 	 */
diff --git a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
index 756ac388a4a8..1103c8a00af1 100644
--- a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
+++ b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
@@ -513,7 +513,7 @@ static void kick_siblings(struct i915_request *rq, struct intel_context *ce)
 		resubmit_virtual_request(rq, ve);
 
 	if (READ_ONCE(ve->request))
-		tasklet_hi_schedule(&ve->base.execlists.tasklet);
+		i915_sched_kick(&ve->base.active);
 }
 
 static void __execlists_schedule_out(struct i915_request * const rq,
@@ -679,10 +679,9 @@ trace_ports(const struct intel_engine_execlists *execlists,
 		     dump_port(p1, sizeof(p1), ", ", ports[1]));
 }
 
-static bool
-reset_in_progress(const struct intel_engine_execlists *execlists)
+static bool reset_in_progress(const struct intel_engine_cs *engine)
 {
-	return unlikely(!__tasklet_is_enabled(&execlists->tasklet));
+	return unlikely(!__tasklet_is_enabled(&engine->active.tasklet));
 }
 
 static __maybe_unused noinline bool
@@ -699,7 +698,7 @@ assert_pending_valid(const struct intel_engine_execlists *execlists,
 	trace_ports(execlists, msg, execlists->pending);
 
 	/* We may be messing around with the lists during reset, lalala */
-	if (reset_in_progress(execlists))
+	if (reset_in_progress(engine))
 		return true;
 
 	if (!execlists->pending[0]) {
@@ -1084,7 +1083,7 @@ static void start_timeslice(struct intel_engine_cs *engine)
 			 * its timeslice, so recheck.
 			 */
 			if (!timer_pending(&el->timer))
-				tasklet_hi_schedule(&el->tasklet);
+				i915_sched_kick(&engine->active);
 			return;
 		}
 
@@ -1664,8 +1663,8 @@ process_csb(struct intel_engine_cs *engine, struct i915_request **inactive)
 	 * access. Either we are inside the tasklet, or the tasklet is disabled
 	 * and we assume that is only inside the reset paths and so serialised.
 	 */
-	GEM_BUG_ON(!tasklet_is_locked(&execlists->tasklet) &&
-		   !reset_in_progress(execlists));
+	GEM_BUG_ON(!tasklet_is_locked(&engine->active.tasklet) &&
+		   !reset_in_progress(engine));
 	GEM_BUG_ON(!intel_engine_in_execlists_submission_mode(engine));
 
 	/*
@@ -2077,13 +2076,13 @@ static noinline void execlists_reset(struct intel_engine_cs *engine)
 	ENGINE_TRACE(engine, "reset for %s\n", msg);
 
 	/* Mark this tasklet as disabled to avoid waiting for it to complete */
-	tasklet_disable_nosync(&engine->execlists.tasklet);
+	tasklet_disable_nosync(&engine->active.tasklet);
 
 	ring_set_paused(engine, 1); /* Freeze the current request in place */
 	execlists_capture(engine);
 	intel_engine_reset(engine, msg);
 
-	tasklet_enable(&engine->execlists.tasklet);
+	tasklet_enable(&engine->active.tasklet);
 	clear_and_wake_up_bit(bit, lock);
 }
 
@@ -2133,8 +2132,10 @@ static void execlists_submission_tasklet(unsigned long data)
 
 static void __execlists_kick(struct intel_engine_execlists *execlists)
 {
-	/* Kick the tasklet for some interrupt coalescing and reset handling */
-	tasklet_hi_schedule(&execlists->tasklet);
+	struct intel_engine_cs *engine =
+		container_of(execlists, typeof(*engine), execlists);
+
+	i915_sched_kick(&engine->active);
 }
 
 #define execlists_kick(t, member) \
@@ -2457,10 +2458,8 @@ static int execlists_resume(struct intel_engine_cs *engine)
 
 static void execlists_reset_prepare(struct intel_engine_cs *engine)
 {
-	struct intel_engine_execlists * const execlists = &engine->execlists;
-
 	ENGINE_TRACE(engine, "depth<-%d\n",
-		     atomic_read(&execlists->tasklet.count));
+		     atomic_read(&engine->active.tasklet.count));
 
 	/*
 	 * Prevent request submission to the hardware until we have
@@ -2471,8 +2470,8 @@ static void execlists_reset_prepare(struct intel_engine_cs *engine)
 	 * Turning off the execlists->tasklet until the reset is over
 	 * prevents the race.
 	 */
-	__tasklet_disable_sync_once(&execlists->tasklet);
-	GEM_BUG_ON(!reset_in_progress(execlists));
+	__tasklet_disable_sync_once(&engine->active.tasklet);
+	GEM_BUG_ON(!reset_in_progress(engine));
 
 	/*
 	 * We stop engines, otherwise we might get failed reset and a
@@ -2706,8 +2705,8 @@ static void execlists_reset_cancel(struct intel_engine_cs *engine)
 	execlists->queue_priority_hint = INT_MIN;
 	engine->active.queue = RB_ROOT_CACHED;
 
-	GEM_BUG_ON(__tasklet_is_enabled(&execlists->tasklet));
-	execlists->tasklet.func = nop_submission_tasklet;
+	GEM_BUG_ON(__tasklet_is_enabled(&engine->active.tasklet));
+	engine->active.tasklet.func = nop_submission_tasklet;
 
 	spin_unlock_irqrestore(&engine->active.lock, flags);
 	rcu_read_unlock();
@@ -2715,8 +2714,6 @@ static void execlists_reset_cancel(struct intel_engine_cs *engine)
 
 static void execlists_reset_finish(struct intel_engine_cs *engine)
 {
-	struct intel_engine_execlists * const execlists = &engine->execlists;
-
 	/*
 	 * After a GPU reset, we may have requests to replay. Do so now while
 	 * we still have the forcewake to be sure that the GPU is not allowed
@@ -2727,14 +2724,14 @@ static void execlists_reset_finish(struct intel_engine_cs *engine)
 	 * reset as the next level of recovery, and as a final resort we
 	 * will declare the device wedged.
 	 */
-	GEM_BUG_ON(!reset_in_progress(execlists));
+	GEM_BUG_ON(!reset_in_progress(engine));
 
 	/* And kick in case we missed a new request submission. */
-	if (__tasklet_enable(&execlists->tasklet))
-		__execlists_kick(execlists);
+	if (__tasklet_enable(&engine->active.tasklet))
+		i915_sched_kick(&engine->active);
 
 	ENGINE_TRACE(engine, "depth->%d\n",
-		     atomic_read(&execlists->tasklet.count));
+		     atomic_read(&engine->active.tasklet.count));
 }
 
 static void gen8_logical_ring_enable_irq(struct intel_engine_cs *engine)
@@ -2767,7 +2764,7 @@ static bool can_preempt(struct intel_engine_cs *engine)
 static void execlists_set_default_submission(struct intel_engine_cs *engine)
 {
 	engine->submit_request = i915_request_enqueue;
-	engine->execlists.tasklet.func = execlists_submission_tasklet;
+	engine->active.tasklet.func = execlists_submission_tasklet;
 
 	engine->reset.prepare = execlists_reset_prepare;
 	engine->reset.rewind = execlists_reset_rewind;
@@ -2799,7 +2796,7 @@ static void execlists_shutdown(struct intel_engine_cs *engine)
 	/* Synchronise with residual timers and any softirq they raise */
 	del_timer_sync(&engine->execlists.timer);
 	del_timer_sync(&engine->execlists.preempt);
-	tasklet_kill(&engine->execlists.tasklet);
+	tasklet_kill(&engine->active.tasklet);
 }
 
 static void execlists_release(struct intel_engine_cs *engine)
@@ -2891,7 +2888,7 @@ int intel_execlists_submission_setup(struct intel_engine_cs *engine)
 	struct intel_uncore *uncore = engine->uncore;
 	u32 base = engine->mmio_base;
 
-	tasklet_init(&engine->execlists.tasklet,
+	tasklet_init(&engine->active.tasklet,
 		     execlists_submission_tasklet, (unsigned long)engine);
 	timer_setup(&engine->execlists.timer, execlists_timeslice, 0);
 	timer_setup(&engine->execlists.preempt, execlists_preempt, 0);
@@ -2974,7 +2971,7 @@ static void rcu_virtual_context_destroy(struct work_struct *wrk)
 	 * rbtrees as in the case it is running in parallel, it may reinsert
 	 * the rb_node into a sibling.
 	 */
-	tasklet_kill(&ve->base.execlists.tasklet);
+	tasklet_kill(&ve->base.active.tasklet);
 
 	/* Decouple ourselves from the siblings, no more access allowed. */
 	for (n = 0; n < ve->num_siblings; n++) {
@@ -2992,7 +2989,7 @@ static void rcu_virtual_context_destroy(struct work_struct *wrk)
 
 		spin_unlock_irq(&sibling->active.lock);
 	}
-	GEM_BUG_ON(__tasklet_is_scheduled(&ve->base.execlists.tasklet));
+	GEM_BUG_ON(__tasklet_is_scheduled(&ve->base.active.tasklet));
 	GEM_BUG_ON(!list_empty(virtual_queue(ve)));
 
 	lrc_fini(&ve->context);
@@ -3206,7 +3203,7 @@ static void virtual_submission_tasklet(unsigned long data)
 		GEM_BUG_ON(RB_EMPTY_NODE(&node->rb));
 		node->prio = prio;
 		if (first && prio > sibling->execlists.queue_priority_hint)
-			tasklet_hi_schedule(&sibling->execlists.tasklet);
+			i915_sched_kick(&sibling->active);
 
 unlock_engine:
 		spin_unlock_irq(&sibling->active.lock);
@@ -3247,7 +3244,7 @@ static void virtual_submit_request(struct i915_request *rq)
 	GEM_BUG_ON(!list_empty(virtual_queue(ve)));
 	list_move_tail(&rq->sched.link, virtual_queue(ve));
 
-	tasklet_hi_schedule(&ve->base.execlists.tasklet);
+	i915_sched_kick(&ve->base.active);
 
 unlock:
 	spin_unlock_irqrestore(&ve->base.active.lock, flags);
@@ -3345,7 +3342,7 @@ intel_execlists_create_virtual(struct intel_engine_cs **siblings,
 
 	INIT_LIST_HEAD(virtual_queue(ve));
 	ve->base.execlists.queue_priority_hint = INT_MIN;
-	tasklet_init(&ve->base.execlists.tasklet,
+	tasklet_init(&ve->base.active.tasklet,
 		     virtual_submission_tasklet,
 		     (unsigned long)ve);
 
@@ -3375,7 +3372,7 @@ intel_execlists_create_virtual(struct intel_engine_cs **siblings,
 		 * layering if we handle cloning of the requests and
 		 * submitting a copy into each backend.
 		 */
-		if (sibling->execlists.tasklet.func !=
+		if (sibling->active.tasklet.func !=
 		    execlists_submission_tasklet) {
 			err = -ENODEV;
 			goto err_put;
diff --git a/drivers/gpu/drm/i915/gt/intel_gt_irq.c b/drivers/gpu/drm/i915/gt/intel_gt_irq.c
index 9fc6c912a4e5..5f5e96da09b0 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_irq.c
+++ b/drivers/gpu/drm/i915/gt/intel_gt_irq.c
@@ -59,7 +59,7 @@ cs_irq_handler(struct intel_engine_cs *engine, u32 iir)
 	}
 
 	if (tasklet)
-		tasklet_hi_schedule(&engine->execlists.tasklet);
+		i915_sched_kick(&engine->active);
 }
 
 static u32
diff --git a/drivers/gpu/drm/i915/gt/selftest_execlists.c b/drivers/gpu/drm/i915/gt/selftest_execlists.c
index e34858b111c8..d2036e16274d 100644
--- a/drivers/gpu/drm/i915/gt/selftest_execlists.c
+++ b/drivers/gpu/drm/i915/gt/selftest_execlists.c
@@ -43,7 +43,7 @@ static int wait_for_submit(struct intel_engine_cs *engine,
 			   unsigned long timeout)
 {
 	/* Ignore our own attempts to suppress excess tasklets */
-	tasklet_hi_schedule(&engine->execlists.tasklet);
+	i915_sched_kick(&engine->active);
 
 	timeout += jiffies;
 	do {
@@ -602,9 +602,9 @@ static int live_hold_reset(void *arg)
 			err = -EBUSY;
 			goto out;
 		}
-		tasklet_disable(&engine->execlists.tasklet);
+		tasklet_disable(&engine->active.tasklet);
 
-		engine->execlists.tasklet.func(engine->execlists.tasklet.data);
+		engine->active.tasklet.func(engine->active.tasklet.data);
 		GEM_BUG_ON(execlists_active(&engine->execlists) != rq);
 
 		i915_request_get(rq);
@@ -614,7 +614,7 @@ static int live_hold_reset(void *arg)
 		__intel_engine_reset_bh(engine, NULL);
 		GEM_BUG_ON(rq->fence.error != -EIO);
 
-		tasklet_enable(&engine->execlists.tasklet);
+		tasklet_enable(&engine->active.tasklet);
 		clear_and_wake_up_bit(I915_RESET_ENGINE + id,
 				      &gt->reset.flags);
 		local_bh_enable();
@@ -1176,7 +1176,7 @@ static int live_timeslice_rewind(void *arg)
 		while (i915_request_is_active(rq[A2])) { /* semaphore yield! */
 			/* Wait for the timeslice to kick in */
 			del_timer(&engine->execlists.timer);
-			tasklet_hi_schedule(&engine->execlists.tasklet);
+			i915_sched_kick(&engine->active);
 			intel_engine_flush_submission(engine);
 		}
 		/* -> ELSP[] = { { A:rq1 }, { B:rq1 } } */
@@ -4575,9 +4575,9 @@ static int reset_virtual_engine(struct intel_gt *gt,
 		err = -EBUSY;
 		goto out_heartbeat;
 	}
-	tasklet_disable(&engine->execlists.tasklet);
+	tasklet_disable(&engine->active.tasklet);
 
-	engine->execlists.tasklet.func(engine->execlists.tasklet.data);
+	engine->active.tasklet.func(engine->active.tasklet.data);
 	GEM_BUG_ON(execlists_active(&engine->execlists) != rq);
 
 	/* Fake a preemption event; failed of course */
@@ -4594,7 +4594,7 @@ static int reset_virtual_engine(struct intel_gt *gt,
 	GEM_BUG_ON(rq->fence.error != -EIO);
 
 	/* Release our grasp on the engine, letting CS flow again */
-	tasklet_enable(&engine->execlists.tasklet);
+	tasklet_enable(&engine->active.tasklet);
 	clear_and_wake_up_bit(I915_RESET_ENGINE + engine->id, &gt->reset.flags);
 	local_bh_enable();
 
diff --git a/drivers/gpu/drm/i915/gt/selftest_hangcheck.c b/drivers/gpu/drm/i915/gt/selftest_hangcheck.c
index 8cad102922e7..3d3f41b1271a 100644
--- a/drivers/gpu/drm/i915/gt/selftest_hangcheck.c
+++ b/drivers/gpu/drm/i915/gt/selftest_hangcheck.c
@@ -1701,7 +1701,7 @@ static int __igt_atomic_reset_engine(struct intel_engine_cs *engine,
 				     const struct igt_atomic_section *p,
 				     const char *mode)
 {
-	struct tasklet_struct * const t = &engine->execlists.tasklet;
+	struct tasklet_struct * const t = &engine->active.tasklet;
 	int err;
 
 	GEM_TRACE("i915_reset_engine(%s:%s) under %s\n",
diff --git a/drivers/gpu/drm/i915/gt/selftest_lrc.c b/drivers/gpu/drm/i915/gt/selftest_lrc.c
index e97adf1b7729..0a40f6b574f8 100644
--- a/drivers/gpu/drm/i915/gt/selftest_lrc.c
+++ b/drivers/gpu/drm/i915/gt/selftest_lrc.c
@@ -49,7 +49,7 @@ static int wait_for_submit(struct intel_engine_cs *engine,
 			   unsigned long timeout)
 {
 	/* Ignore our own attempts to suppress excess tasklets */
-	tasklet_hi_schedule(&engine->execlists.tasklet);
+	i915_sched_kick(&engine->active);
 
 	timeout += jiffies;
 	do {
@@ -1857,12 +1857,12 @@ static void garbage_reset(struct intel_engine_cs *engine,
 
 	local_bh_disable();
 	if (!test_and_set_bit(bit, lock)) {
-		tasklet_disable(&engine->execlists.tasklet);
+		tasklet_disable(&engine->active.tasklet);
 
 		if (!rq->fence.error)
 			__intel_engine_reset_bh(engine, NULL);
 
-		tasklet_enable(&engine->execlists.tasklet);
+		tasklet_enable(&engine->active.tasklet);
 		clear_and_wake_up_bit(bit, lock);
 	}
 	local_bh_enable();
diff --git a/drivers/gpu/drm/i915/gt/selftest_reset.c b/drivers/gpu/drm/i915/gt/selftest_reset.c
index 8784257ec808..154a09ef075a 100644
--- a/drivers/gpu/drm/i915/gt/selftest_reset.c
+++ b/drivers/gpu/drm/i915/gt/selftest_reset.c
@@ -321,7 +321,7 @@ static int igt_atomic_engine_reset(void *arg)
 		goto out_unlock;
 
 	for_each_engine(engine, gt, id) {
-		struct tasklet_struct *t = &engine->execlists.tasklet;
+		struct tasklet_struct *t = &engine->active.tasklet;
 
 		if (t->func)
 			tasklet_disable(t);
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
index 70b2e23a9644..2d7339ef3b4c 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
@@ -265,8 +265,6 @@ static void guc_submission_tasklet(unsigned long data)
 
 static void guc_reset_prepare(struct intel_engine_cs *engine)
 {
-	struct intel_engine_execlists * const execlists = &engine->execlists;
-
 	ENGINE_TRACE(engine, "\n");
 
 	/*
@@ -278,7 +276,7 @@ static void guc_reset_prepare(struct intel_engine_cs *engine)
 	 * Turning off the execlists->tasklet until the reset is over
 	 * prevents the race.
 	 */
-	__tasklet_disable_sync_once(&execlists->tasklet);
+	__tasklet_disable_sync_once(&engine->active.tasklet);
 }
 
 static void guc_reset_state(struct intel_context *ce,
@@ -382,14 +380,12 @@ static void guc_reset_cancel(struct intel_engine_cs *engine)
 
 static void guc_reset_finish(struct intel_engine_cs *engine)
 {
-	struct intel_engine_execlists * const execlists = &engine->execlists;
-
-	if (__tasklet_enable(&execlists->tasklet))
+	if (__tasklet_enable(&engine->active.tasklet))
 		/* And kick in case we missed a new request submission. */
-		tasklet_hi_schedule(&execlists->tasklet);
+		tasklet_hi_schedule(&engine->active.tasklet);
 
 	ENGINE_TRACE(engine, "depth->%d\n",
-		     atomic_read(&execlists->tasklet.count));
+		     atomic_read(&engine->active.tasklet.count));
 }
 
 /*
@@ -579,7 +575,7 @@ static int guc_resume(struct intel_engine_cs *engine)
 static void guc_set_default_submission(struct intel_engine_cs *engine)
 {
 	engine->submit_request = i915_request_enqueue;
-	engine->execlists.tasklet.func = guc_submission_tasklet;
+	engine->active.tasklet.func = guc_submission_tasklet;
 
 	engine->reset.prepare = guc_reset_prepare;
 	engine->reset.rewind = guc_reset_rewind;
@@ -613,7 +609,7 @@ static void guc_release(struct intel_engine_cs *engine)
 {
 	engine->sanitize = NULL; /* no longer in control, nothing to sanitize */
 
-	tasklet_kill(&engine->execlists.tasklet);
+	tasklet_kill(&engine->active.tasklet);
 
 	intel_engine_cleanup_common(engine);
 	lrc_fini_wa_ctx(engine);
@@ -671,7 +667,7 @@ int intel_guc_submission_setup(struct intel_engine_cs *engine)
 	 */
 	GEM_BUG_ON(INTEL_GEN(i915) < 11);
 
-	tasklet_init(&engine->execlists.tasklet,
+	tasklet_init(&engine->active.tasklet,
 		     guc_submission_tasklet, (unsigned long)engine);
 
 	guc_default_vfuncs(engine);
diff --git a/drivers/gpu/drm/i915/i915_scheduler.c b/drivers/gpu/drm/i915/i915_scheduler.c
index 76a11452c361..a3ee06cb66d7 100644
--- a/drivers/gpu/drm/i915/i915_scheduler.c
+++ b/drivers/gpu/drm/i915/i915_scheduler.c
@@ -125,6 +125,7 @@ void i915_sched_park_engine(struct i915_sched_engine *se)
 void i915_sched_fini_engine(struct i915_sched_engine *se)
 {
 	GEM_BUG_ON(!list_empty(&se->requests));
+	tasklet_kill(&se->tasklet); /* flush the callback */
 }
 
 static void __ipi_add(struct i915_request *rq)
@@ -339,7 +340,7 @@ static void kick_submission(struct intel_engine_cs *engine,
 
 	engine->execlists.queue_priority_hint = prio;
 	if (need_preempt(prio, rq_prio(inflight)))
-		tasklet_hi_schedule(&engine->execlists.tasklet);
+		i915_sched_kick(&engine->active);
 }
 
 static void ipi_priority(struct i915_request *rq, int prio)
@@ -621,16 +622,17 @@ static bool ancestor_on_hold(const struct intel_engine_cs *engine,
 void i915_request_enqueue(struct i915_request *rq)
 {
 	struct intel_engine_cs *engine = rq->engine;
+	struct i915_sched_engine *se = &engine->active;
 	unsigned long flags;
 	bool kick = false;
 
 	/* Will be called from irq-context when using foreign fences. */
-	spin_lock_irqsave(&engine->active.lock, flags);
+	spin_lock_irqsave(&se->lock, flags);
 	GEM_BUG_ON(test_bit(I915_FENCE_FLAG_PQUEUE, &rq->fence.flags));
 
 	if (unlikely(ancestor_on_hold(engine, rq))) {
 		RQ_TRACE(rq, "ancestor on hold\n");
-		list_add_tail(&rq->sched.link, &engine->active.hold);
+		list_add_tail(&rq->sched.link, &se->hold);
 		i915_request_set_hold(rq);
 	} else {
 		queue_request(engine, rq);
@@ -641,9 +643,9 @@ void i915_request_enqueue(struct i915_request *rq)
 	}
 
 	GEM_BUG_ON(list_empty(&rq->sched.link));
-	spin_unlock_irqrestore(&engine->active.lock, flags);
+	spin_unlock_irqrestore(&se->lock, flags);
 	if (kick)
-		tasklet_hi_schedule(&engine->execlists.tasklet);
+		i915_sched_kick(se);
 }
 
 struct i915_request *
@@ -776,7 +778,7 @@ void __intel_engine_resume_request(struct intel_engine_cs *engine,
 
 	if (rq_prio(rq) > engine->execlists.queue_priority_hint) {
 		engine->execlists.queue_priority_hint = rq_prio(rq);
-		tasklet_hi_schedule(&engine->execlists.tasklet);
+		i915_sched_kick(&engine->active);
 	}
 
 	if (!i915_request_on_hold(rq))
diff --git a/drivers/gpu/drm/i915/i915_scheduler.h b/drivers/gpu/drm/i915/i915_scheduler.h
index e89bf5ed61b3..0ab47cbf0e9c 100644
--- a/drivers/gpu/drm/i915/i915_scheduler.h
+++ b/drivers/gpu/drm/i915/i915_scheduler.h
@@ -79,6 +79,26 @@ i915_sched_is_last_request(const struct i915_sched_engine *se,
 	return list_is_last_rcu(&rq->sched.link, &se->requests);
 }
 
+static inline void
+i915_sched_lock_bh(struct i915_sched_engine *se)
+{
+	local_bh_disable(); /* prevent local softirq and lock recursion */
+	tasklet_lock(&se->tasklet);
+}
+
+static inline void
+i915_sched_unlock_bh(struct i915_sched_engine *se)
+{
+	tasklet_unlock(&se->tasklet);
+	local_bh_enable(); /* restore softirq, and kick ksoftirqd! */
+}
+
+static inline void i915_sched_kick(struct i915_sched_engine *se)
+{
+	/* Kick the tasklet for some interrupt coalescing and reset handling */
+	tasklet_hi_schedule(&se->tasklet);
+}
+
 void i915_request_show_with_schedule(struct drm_printer *m,
 				     const struct i915_request *rq,
 				     const char *prefix,
diff --git a/drivers/gpu/drm/i915/i915_scheduler_types.h b/drivers/gpu/drm/i915/i915_scheduler_types.h
index aa8a2d1e890a..f668c680d290 100644
--- a/drivers/gpu/drm/i915/i915_scheduler_types.h
+++ b/drivers/gpu/drm/i915/i915_scheduler_types.h
@@ -7,6 +7,7 @@
 #ifndef _I915_SCHEDULER_TYPES_H_
 #define _I915_SCHEDULER_TYPES_H_
 
+#include <linux/interrupt.h>
 #include <linux/list.h>
 #include <linux/workqueue.h>
 
@@ -101,6 +102,11 @@ struct i915_sched_engine {
 	 * @no_priolist: priority lists disabled
 	 */
 	bool no_priolist;
+
+	/**
+	 * @tasklet: softirq tasklet for bottom half
+	 */
+	struct tasklet_struct tasklet;
 };
 
 struct i915_dependency {
diff --git a/drivers/gpu/drm/i915/selftests/i915_scheduler.c b/drivers/gpu/drm/i915/selftests/i915_scheduler.c
index 000b68a40d4c..de44a66210b7 100644
--- a/drivers/gpu/drm/i915/selftests/i915_scheduler.c
+++ b/drivers/gpu/drm/i915/selftests/i915_scheduler.c
@@ -172,12 +172,12 @@ static int __single_chain(struct intel_engine_cs *engine, unsigned long length,
 	}
 	intel_engine_flush_submission(engine);
 
-	tasklet_disable(&engine->execlists.tasklet);
+	tasklet_disable(&engine->active.tasklet);
 	local_bh_disable();
 	if (fn(rq, count, count - 1) && !check_context_order(engine))
 		err = -EINVAL;
 	local_bh_enable();
-	tasklet_enable(&engine->execlists.tasklet);
+	tasklet_enable(&engine->active.tasklet);
 
 	igt_spinner_end(&spin);
 err_context:
@@ -258,12 +258,12 @@ static int __wide_chain(struct intel_engine_cs *engine, unsigned long width,
 	}
 	intel_engine_flush_submission(engine);
 
-	tasklet_disable(&engine->execlists.tasklet);
+	tasklet_disable(&engine->active.tasklet);
 	local_bh_disable();
 	if (fn(rq[i - 1], i, count) && !check_context_order(engine))
 		err = -EINVAL;
 	local_bh_enable();
-	tasklet_enable(&engine->execlists.tasklet);
+	tasklet_enable(&engine->active.tasklet);
 
 	igt_spinner_end(&spin);
 err_free:
@@ -348,12 +348,12 @@ static int __inv_chain(struct intel_engine_cs *engine, unsigned long width,
 	}
 	intel_engine_flush_submission(engine);
 
-	tasklet_disable(&engine->execlists.tasklet);
+	tasklet_disable(&engine->active.tasklet);
 	local_bh_disable();
 	if (fn(rq[i - 1], i, count) && !check_context_order(engine))
 		err = -EINVAL;
 	local_bh_enable();
-	tasklet_enable(&engine->execlists.tasklet);
+	tasklet_enable(&engine->active.tasklet);
 
 	igt_spinner_end(&spin);
 err_free:
@@ -455,12 +455,12 @@ static int __sparse_chain(struct intel_engine_cs *engine, unsigned long width,
 	}
 	intel_engine_flush_submission(engine);
 
-	tasklet_disable(&engine->execlists.tasklet);
+	tasklet_disable(&engine->active.tasklet);
 	local_bh_disable();
 	if (fn(rq[i - 1], i, count) && !check_context_order(engine))
 		err = -EINVAL;
 	local_bh_enable();
-	tasklet_enable(&engine->execlists.tasklet);
+	tasklet_enable(&engine->active.tasklet);
 
 	igt_spinner_end(&spin);
 err_free:
-- 
2.20.1

^ permalink raw reply related	[flat|nested] 90+ messages in thread

* [Intel-gfx] [PATCH 19/41] drm/i915/gt: Show scheduler queues when dumping state
  2021-01-25 14:00 [Intel-gfx] [PATCH 01/41] drm/i915/selftests: Check for engine-reset errors in the middle of workarounds Chris Wilson
                   ` (16 preceding siblings ...)
  2021-01-25 14:01 ` [Intel-gfx] [PATCH 18/41] drm/i915: Move tasklet from execlists to sched Chris Wilson
@ 2021-01-25 14:01 ` Chris Wilson
  2021-01-27 14:13   ` Tvrtko Ursulin
  2021-01-25 14:01 ` [Intel-gfx] [PATCH 20/41] drm/i915: Replace priolist rbtree with a skiplist Chris Wilson
                   ` (26 subsequent siblings)
  44 siblings, 1 reply; 90+ messages in thread
From: Chris Wilson @ 2021-01-25 14:01 UTC (permalink / raw)
  To: intel-gfx; +Cc: thomas.hellstrom, Chris Wilson

Move the scheduler pretty printer out of the execlists state to match
its more common location.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/gt/intel_engine_cs.c | 34 +++++++++++++----------
 1 file changed, 19 insertions(+), 15 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
index cdd07aeada05..2f9a8960144b 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
@@ -1443,20 +1443,15 @@ static void intel_engine_print_registers(struct intel_engine_cs *engine,
 
 	if (intel_engine_in_guc_submission_mode(engine)) {
 		/* nothing to print yet */
-	} else if (HAS_EXECLISTS(dev_priv)) {
-		struct i915_request * const *port, *rq;
 		const u32 *hws =
 			&engine->status_page.addr[I915_HWS_CSB_BUF0_INDEX];
 		const u8 num_entries = execlists->csb_size;
 		unsigned int idx;
 		u8 read, write;
 
-		drm_printf(m, "\tExeclist tasklet queued? %s (%s), preempt? %s, timeslice? %s\n",
-			   yesno(test_bit(TASKLET_STATE_SCHED,
-					  &engine->active.tasklet.state)),
-			   enableddisabled(!atomic_read(&engine->active.tasklet.count)),
-			   repr_timer(&engine->execlists.preempt),
-			   repr_timer(&engine->execlists.timer));
+		drm_printf(m, "\tExeclists preempt? %s, timeslice? %s\n",
+			   repr_timer(&execlists->preempt),
+			   repr_timer(&execlists->timer));
 
 		read = execlists->csb_head;
 		write = READ_ONCE(*execlists->csb_write);
@@ -1477,6 +1472,22 @@ static void intel_engine_print_registers(struct intel_engine_cs *engine,
 			drm_printf(m, "\tExeclist CSB[%d]: 0x%08x, context: %d\n",
 				   idx, hws[idx * 2], hws[idx * 2 + 1]);
 		}
+	} else if (INTEL_GEN(dev_priv) > 6) {
+		drm_printf(m, "\tPP_DIR_BASE: 0x%08x\n",
+			   ENGINE_READ(engine, RING_PP_DIR_BASE));
+		drm_printf(m, "\tPP_DIR_BASE_READ: 0x%08x\n",
+			   ENGINE_READ(engine, RING_PP_DIR_BASE_READ));
+		drm_printf(m, "\tPP_DIR_DCLV: 0x%08x\n",
+			   ENGINE_READ(engine, RING_PP_DIR_DCLV));
+	}
+
+	if (engine->active.tasklet.func) {
+		struct i915_request * const *port, *rq;
+
+		drm_printf(m, "\tTasklet queued? %s (%s)\n",
+			   yesno(test_bit(TASKLET_STATE_SCHED,
+					  &engine->active.tasklet.state)),
+			   enableddisabled(!atomic_read(&engine->active.tasklet.count)));
 
 		i915_sched_lock_bh(&engine->active);
 		rcu_read_lock();
@@ -1510,13 +1521,6 @@ static void intel_engine_print_registers(struct intel_engine_cs *engine,
 		}
 		rcu_read_unlock();
 		i915_sched_unlock_bh(&engine->active);
-	} else if (INTEL_GEN(dev_priv) > 6) {
-		drm_printf(m, "\tPP_DIR_BASE: 0x%08x\n",
-			   ENGINE_READ(engine, RING_PP_DIR_BASE));
-		drm_printf(m, "\tPP_DIR_BASE_READ: 0x%08x\n",
-			   ENGINE_READ(engine, RING_PP_DIR_BASE_READ));
-		drm_printf(m, "\tPP_DIR_DCLV: 0x%08x\n",
-			   ENGINE_READ(engine, RING_PP_DIR_DCLV));
 	}
 }
 
-- 
2.20.1

^ permalink raw reply related	[flat|nested] 90+ messages in thread

* [Intel-gfx] [PATCH 20/41] drm/i915: Replace priolist rbtree with a skiplist
  2021-01-25 14:00 [Intel-gfx] [PATCH 01/41] drm/i915/selftests: Check for engine-reset errors in the middle of workarounds Chris Wilson
                   ` (17 preceding siblings ...)
  2021-01-25 14:01 ` [Intel-gfx] [PATCH 19/41] drm/i915/gt: Show scheduler queues when dumping state Chris Wilson
@ 2021-01-25 14:01 ` Chris Wilson
  2021-01-27 15:10   ` Tvrtko Ursulin
                     ` (3 more replies)
  2021-01-25 14:01 ` [Intel-gfx] [PATCH 21/41] drm/i915: Wrap cmpxchg64 with try_cmpxchg64() helper Chris Wilson
                   ` (25 subsequent siblings)
  44 siblings, 4 replies; 90+ messages in thread
From: Chris Wilson @ 2021-01-25 14:01 UTC (permalink / raw)
  To: intel-gfx; +Cc: thomas.hellstrom, Chris Wilson

Replace the priolist rbtree with a skiplist. The crucial difference is
that walking and removing the first element of a skiplist is O(1), but
O(lgN) for an rbtree, as we need to rebalance on remove. The rebalance
is a hindrance for submission latency as it occurs between picking a
request from the priolist and submitting it to hardware, as well as
effectively tripling the number of O(lgN) operations required under the
irqoff lock. Avoiding it is critical to reducing the latency jitter
with multiple clients.

The downsides to skiplists are that lookup/insertion is only
probabilistically O(lgN) and there is a significant memory penalty, as
each skip node is larger than the rbtree equivalent. Furthermore, we
don't use dynamic arrays for the skiplist, so the allocation is fixed,
and imposes an upper bound on the scalability with respect to the number
of inflight requests.
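
For illustration only, the trade-off can be seen in a minimal userspace
sketch of a fixed-height skiplist. This is not the code in this patch:
the names skip_node, skip_root, skip_insert and skip_pop_first, the
xorshift PRNG and the use of one node per inserted priority (rather than
a list of requests per priority level) are all assumptions made for the
example. Insertion has to search from the top level down, whereas the
first node is by construction the head of every level it occupies, so
removing it needs no search and no rebalancing.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>
#include <stdlib.h>

#define SKIP_HEIGHT 12 /* fixed height: bounds how well lookups scale */

struct skip_node {
	int priority;
	int level; /* highest index of next[] that this node uses */
	struct skip_node *next[SKIP_HEIGHT];
};

struct skip_root {
	struct skip_node sentinel; /* circular: empty levels point back here */
	uint32_t prng;
};

static void skip_init(struct skip_root *root)
{
	struct skip_node *s = &root->sentinel;
	int lvl;

	for (lvl = 0; lvl < SKIP_HEIGHT; lvl++)
		s->next[lvl] = s;
	s->priority = INT_MIN;
	s->level = SKIP_HEIGHT - 1;
	root->prng = 0x9e3779b9u; /* any non-zero seed works for xorshift32 */
}

/* Each additional level is taken with probability 1/2. */
static int random_level(struct skip_root *root)
{
	uint32_t x = root->prng;
	int lvl = 0;

	x ^= x << 13;
	x ^= x >> 17;
	x ^= x << 5;
	root->prng = x;

	while (lvl < SKIP_HEIGHT - 1 && (x & 1)) {
		lvl++;
		x >>= 1;
	}
	return lvl;
}

/* Insert in descending priority order; O(lgN) only on average. */
static struct skip_node *skip_insert(struct skip_root *root, int prio)
{
	struct skip_node *update[SKIP_HEIGHT];
	struct skip_node *s = &root->sentinel, *p = s, *node;
	int lvl;

	/* Record the last node before the insertion point on each level. */
	for (lvl = SKIP_HEIGHT - 1; lvl >= 0; lvl--) {
		while (p->next[lvl] != s && p->next[lvl]->priority >= prio)
			p = p->next[lvl];
		update[lvl] = p;
	}

	node = malloc(sizeof(*node));
	if (!node)
		return NULL;

	node->priority = prio;
	node->level = random_level(root);
	for (lvl = 0; lvl <= node->level; lvl++) {
		node->next[lvl] = update[lvl]->next[lvl];
		update[lvl]->next[lvl] = node;
	}
	return node;
}

/* Remove the highest priority node; returns 0 if the list is empty. */
static int skip_pop_first(struct skip_root *root, int *prio)
{
	struct skip_node *s = &root->sentinel, *node = s->next[0];
	int lvl;

	if (node == s)
		return 0;

	/* The first node heads every level it is on: just unlink it. */
	for (lvl = 0; lvl <= node->level; lvl++) {
		assert(s->next[lvl] == node);
		s->next[lvl] = node->next[lvl];
	}

	*prio = node->priority;
	free(node);
	return 1;
}
```

The removal cost is bounded by the fixed SKIP_HEIGHT rather than by the
number of queued nodes, which is the property the dequeue path relies
on; the price is the next[] array carried by every node.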

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 .../drm/i915/gt/intel_execlists_submission.c  |  63 +++--
 .../gpu/drm/i915/gt/uc/intel_guc_submission.c |  30 +--
 drivers/gpu/drm/i915/i915_priolist_types.h    |  28 +-
 drivers/gpu/drm/i915/i915_scheduler.c         | 244 ++++++++++++++----
 drivers/gpu/drm/i915/i915_scheduler.h         |  11 +-
 drivers/gpu/drm/i915/i915_scheduler_types.h   |   2 +-
 .../drm/i915/selftests/i915_mock_selftests.h  |   1 +
 .../gpu/drm/i915/selftests/i915_scheduler.c   |  53 +++-
 8 files changed, 316 insertions(+), 116 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
index 1103c8a00af1..129144dd86b0 100644
--- a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
+++ b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
@@ -244,11 +244,6 @@ static void ring_set_paused(const struct intel_engine_cs *engine, int state)
 		wmb();
 }
 
-static struct i915_priolist *to_priolist(struct rb_node *rb)
-{
-	return rb_entry(rb, struct i915_priolist, node);
-}
-
 static int rq_prio(const struct i915_request *rq)
 {
 	return READ_ONCE(rq->sched.attr.priority);
@@ -272,15 +267,31 @@ static int effective_prio(const struct i915_request *rq)
 	return prio;
 }
 
-static int queue_prio(const struct i915_sched_engine *se)
+static struct i915_request *first_request(struct i915_sched_engine *se)
 {
-	struct rb_node *rb;
+	struct i915_priolist *pl;
 
-	rb = rb_first_cached(&se->queue);
-	if (!rb)
+	for_each_priolist(pl, &se->queue) {
+		if (likely(!list_empty(&pl->requests)))
+			return list_first_entry(&pl->requests,
+						struct i915_request,
+						sched.link);
+
+		i915_priolist_advance(&se->queue, pl);
+	}
+
+	return NULL;
+}
+
+static int queue_prio(struct i915_sched_engine *se)
+{
+	struct i915_request *rq;
+
+	rq = first_request(se);
+	if (!rq)
 		return INT_MIN;
 
-	return to_priolist(rb)->priority;
+	return rq_prio(rq);
 }
 
 static int virtual_prio(const struct intel_engine_execlists *el)
@@ -290,7 +301,7 @@ static int virtual_prio(const struct intel_engine_execlists *el)
 	return rb ? rb_entry(rb, struct ve_node, rb)->prio : INT_MIN;
 }
 
-static bool need_preempt(const struct intel_engine_cs *engine,
+static bool need_preempt(struct intel_engine_cs *engine,
 			 const struct i915_request *rq)
 {
 	int last_prio;
@@ -1136,6 +1147,7 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
 	struct i915_request ** const last_port = port + execlists->port_mask;
 	struct i915_request *last, * const *active;
 	struct virtual_engine *ve;
+	struct i915_priolist *pl;
 	struct rb_node *rb;
 	bool submit = false;
 
@@ -1346,11 +1358,10 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
 			break;
 	}
 
-	while ((rb = rb_first_cached(&engine->active.queue))) {
-		struct i915_priolist *p = to_priolist(rb);
+	for_each_priolist(pl, &engine->active.queue) {
 		struct i915_request *rq, *rn;
 
-		priolist_for_each_request_consume(rq, rn, p) {
+		priolist_for_each_request_safe(rq, rn, pl) {
 			bool merge = true;
 
 			/*
@@ -1425,8 +1436,7 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
 			}
 		}
 
-		rb_erase_cached(&p->node, &engine->active.queue);
-		i915_priolist_free(p);
+		i915_priolist_advance(&engine->active.queue, pl);
 	}
 done:
 	*port++ = i915_request_get(last);
@@ -2631,6 +2641,7 @@ static void execlists_reset_cancel(struct intel_engine_cs *engine)
 {
 	struct intel_engine_execlists * const execlists = &engine->execlists;
 	struct i915_request *rq, *rn;
+	struct i915_priolist *pl;
 	struct rb_node *rb;
 	unsigned long flags;
 
@@ -2661,16 +2672,12 @@ static void execlists_reset_cancel(struct intel_engine_cs *engine)
 	intel_engine_signal_breadcrumbs(engine);
 
 	/* Flush the queued requests to the timeline list (for retiring). */
-	while ((rb = rb_first_cached(&engine->active.queue))) {
-		struct i915_priolist *p = to_priolist(rb);
-
-		priolist_for_each_request_consume(rq, rn, p) {
+	for_each_priolist(pl, &engine->active.queue) {
+		priolist_for_each_request_safe(rq, rn, pl) {
 			i915_request_mark_eio(rq);
 			__i915_request_submit(rq);
 		}
-
-		rb_erase_cached(&p->node, &engine->active.queue);
-		i915_priolist_free(p);
+		i915_priolist_advance(&engine->active.queue, pl);
 	}
 	GEM_BUG_ON(!i915_sched_is_idle(&engine->active));
 
@@ -2703,7 +2710,6 @@ static void execlists_reset_cancel(struct intel_engine_cs *engine)
 	/* Remaining _unready_ requests will be nop'ed when submitted */
 
 	execlists->queue_priority_hint = INT_MIN;
-	engine->active.queue = RB_ROOT_CACHED;
 
 	GEM_BUG_ON(__tasklet_is_enabled(&engine->active.tasklet));
 	engine->active.tasklet.func = nop_submission_tasklet;
@@ -3089,6 +3095,8 @@ static void virtual_context_exit(struct intel_context *ce)
 
 	for (n = 0; n < ve->num_siblings; n++)
 		intel_engine_pm_put(ve->siblings[n]);
+
+	i915_sched_park_engine(&ve->base.active);
 }
 
 static const struct intel_context_ops virtual_context_ops = {
@@ -3501,6 +3509,7 @@ void intel_execlists_show_requests(struct intel_engine_cs *engine,
 {
 	const struct intel_engine_execlists *execlists = &engine->execlists;
 	struct i915_request *rq, *last;
+	struct i915_priolist *pl;
 	unsigned long flags;
 	unsigned int count;
 	struct rb_node *rb;
@@ -3530,10 +3539,8 @@ void intel_execlists_show_requests(struct intel_engine_cs *engine,
 
 	last = NULL;
 	count = 0;
-	for (rb = rb_first_cached(&engine->active.queue); rb; rb = rb_next(rb)) {
-		struct i915_priolist *p = rb_entry(rb, typeof(*p), node);
-
-		priolist_for_each_request(rq, p) {
+	for_each_priolist(pl, &engine->active.queue) {
+		priolist_for_each_request(rq, pl) {
 			if (count++ < max - 1)
 				show_request(m, rq, "\t\t", 0);
 			else
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
index 2d7339ef3b4c..8d0c6cd277b3 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
@@ -59,11 +59,6 @@
 
 #define GUC_REQUEST_SIZE 64 /* bytes */
 
-static inline struct i915_priolist *to_priolist(struct rb_node *rb)
-{
-	return rb_entry(rb, struct i915_priolist, node);
-}
-
 static struct guc_stage_desc *__get_stage_desc(struct intel_guc *guc, u32 id)
 {
 	struct guc_stage_desc *base = guc->stage_desc_pool_vaddr;
@@ -185,8 +180,8 @@ static void __guc_dequeue(struct intel_engine_cs *engine)
 	struct i915_request ** const last_port = first + execlists->port_mask;
 	struct i915_request *last = first[0];
 	struct i915_request **port;
+	struct i915_priolist *pl;
 	bool submit = false;
-	struct rb_node *rb;
 
 	lockdep_assert_held(&engine->active.lock);
 
@@ -203,11 +198,10 @@ static void __guc_dequeue(struct intel_engine_cs *engine)
 	 * event.
 	 */
 	port = first;
-	while ((rb = rb_first_cached(&engine->active.queue))) {
-		struct i915_priolist *p = to_priolist(rb);
+	for_each_priolist(pl, &engine->active.queue) {
 		struct i915_request *rq, *rn;
 
-		priolist_for_each_request_consume(rq, rn, p) {
+		priolist_for_each_request_safe(rq, rn, pl) {
 			if (last && rq->context != last->context) {
 				if (port == last_port)
 					goto done;
@@ -223,12 +217,11 @@ static void __guc_dequeue(struct intel_engine_cs *engine)
 			last = rq;
 		}
 
-		rb_erase_cached(&p->node, &engine->active.queue);
-		i915_priolist_free(p);
+		i915_priolist_advance(&engine->active.queue, pl);
 	}
 done:
 	execlists->queue_priority_hint =
-		rb ? to_priolist(rb)->priority : INT_MIN;
+		pl != &engine->active.queue.sentinel ? pl->priority : INT_MIN;
 	if (submit) {
 		*port = schedule_in(last, port - execlists->inflight);
 		*++port = NULL;
@@ -327,7 +320,7 @@ static void guc_reset_cancel(struct intel_engine_cs *engine)
 {
 	struct intel_engine_execlists * const execlists = &engine->execlists;
 	struct i915_request *rq, *rn;
-	struct rb_node *rb;
+	struct i915_priolist *p;
 	unsigned long flags;
 
 	ENGINE_TRACE(engine, "\n");
@@ -355,25 +348,20 @@ static void guc_reset_cancel(struct intel_engine_cs *engine)
 	}
 
 	/* Flush the queued requests to the timeline list (for retiring). */
-	while ((rb = rb_first_cached(&engine->active.queue))) {
-		struct i915_priolist *p = to_priolist(rb);
-
-		priolist_for_each_request_consume(rq, rn, p) {
+	for_each_priolist(p, &engine->active.queue) {
+		priolist_for_each_request_safe(rq, rn, p) {
 			list_del_init(&rq->sched.link);
 			__i915_request_submit(rq);
 			dma_fence_set_error(&rq->fence, -EIO);
 			i915_request_mark_complete(rq);
 		}
-
-		rb_erase_cached(&p->node, &engine->active.queue);
-		i915_priolist_free(p);
+		i915_priolist_advance(&engine->active.queue, p);
 	}
 	GEM_BUG_ON(!i915_sched_is_idle(&engine->active));
 
 	/* Remaining _unready_ requests will be nop'ed when submitted */
 
 	execlists->queue_priority_hint = INT_MIN;
-	engine->active.queue = RB_ROOT_CACHED;
 
 	spin_unlock_irqrestore(&engine->active.lock, flags);
 }
diff --git a/drivers/gpu/drm/i915/i915_priolist_types.h b/drivers/gpu/drm/i915/i915_priolist_types.h
index bc2fa84f98a8..1200c3df6a4a 100644
--- a/drivers/gpu/drm/i915/i915_priolist_types.h
+++ b/drivers/gpu/drm/i915/i915_priolist_types.h
@@ -38,10 +38,36 @@ enum {
 #define I915_PRIORITY_UNPREEMPTABLE INT_MAX
 #define I915_PRIORITY_BARRIER (I915_PRIORITY_UNPREEMPTABLE - 1)
 
+#ifdef CONFIG_64BIT
+#define I915_PRIOLIST_HEIGHT 12
+#else
+#define I915_PRIOLIST_HEIGHT 11
+#endif
+
 struct i915_priolist {
 	struct list_head requests;
-	struct rb_node node;
 	int priority;
+
+	int level;
+	struct i915_priolist *next[I915_PRIOLIST_HEIGHT];
 };
 
+struct i915_priolist_root {
+	struct i915_priolist sentinel;
+	u32 prng;
+};
+
+#define i915_priolist_is_empty(root) ((root)->sentinel.level < 0)
+
+#define for_each_priolist(p, root) \
+	for ((p) = (root)->sentinel.next[0]; \
+	     (p) != &(root)->sentinel; \
+	     (p) = (p)->next[0])
+
+#define priolist_for_each_request(it, plist) \
+	list_for_each_entry(it, &(plist)->requests, sched.link)
+
+#define priolist_for_each_request_safe(it, n, plist) \
+	list_for_each_entry_safe(it, n, &(plist)->requests, sched.link)
+
 #endif /* _I915_PRIOLIST_TYPES_H_ */
diff --git a/drivers/gpu/drm/i915/i915_scheduler.c b/drivers/gpu/drm/i915/i915_scheduler.c
index a3ee06cb66d7..74000d3eebb1 100644
--- a/drivers/gpu/drm/i915/i915_scheduler.c
+++ b/drivers/gpu/drm/i915/i915_scheduler.c
@@ -4,7 +4,9 @@
  * Copyright © 2018 Intel Corporation
  */
 
+#include <linux/bitops.h>
 #include <linux/mutex.h>
+#include <linux/prandom.h>
 
 #include "gt/intel_ring.h"
 #include "gt/intel_lrc_reg.h"
@@ -91,15 +93,24 @@ static void i915_sched_init_ipi(struct i915_sched_ipi *ipi)
 	ipi->list = NULL;
 }
 
+static void init_priolist(struct i915_priolist_root *const root)
+{
+	struct i915_priolist *pl = &root->sentinel;
+
+	memset_p((void **)pl->next, pl, ARRAY_SIZE(pl->next));
+	pl->priority = INT_MIN;
+	pl->level = -1;
+}
+
 void i915_sched_init_engine(struct i915_sched_engine *se,
 			    unsigned int subclass)
 {
 	spin_lock_init(&se->lock);
 	lockdep_set_subclass(&se->lock, subclass);
 
+	init_priolist(&se->queue);
 	INIT_LIST_HEAD(&se->requests);
 	INIT_LIST_HEAD(&se->hold);
-	se->queue = RB_ROOT_CACHED;
 
 	i915_sched_init_ipi(&se->ipi);
 
@@ -116,8 +127,57 @@ void i915_sched_init_engine(struct i915_sched_engine *se,
 #endif
 }
 
+__maybe_unused static bool priolist_idle(struct i915_priolist_root *root)
+{
+	struct i915_priolist *pl = &root->sentinel;
+	int lvl;
+
+	for (lvl = 0; lvl < ARRAY_SIZE(pl->next); lvl++) {
+		if (pl->next[lvl] != pl) {
+			GEM_TRACE_ERR("root[%d] is not empty\n", lvl);
+			return false;
+		}
+	}
+
+	if (pl->level != -1) {
+		GEM_TRACE_ERR("root is not clear: %d\n", pl->level);
+		return false;
+	}
+
+	return true;
+}
+
+static void pl_push(struct i915_priolist *pl, struct list_head *head)
+{
+	pl->requests.next = head->next;
+	head->next = &pl->requests;
+}
+
+static struct i915_priolist *pl_pop(struct list_head *head)
+{
+	struct i915_priolist *pl;
+
+	pl = container_of(head->next, typeof(*pl), requests);
+	head->next = pl->requests.next;
+
+	return pl;
+}
+
+static bool pl_empty(struct list_head *head)
+{
+	return !head->next;
+}
+
 void i915_sched_park_engine(struct i915_sched_engine *se)
 {
+	struct i915_priolist_root *root = &se->queue;
+	struct list_head *list = &root->sentinel.requests;
+
+	GEM_BUG_ON(!priolist_idle(root));
+
+	while (!pl_empty(list))
+		kmem_cache_free(global.slab_priorities, pl_pop(list));
+
 	GEM_BUG_ON(!i915_sched_is_idle(se));
 	se->no_priolist = false;
 }
@@ -183,71 +243,55 @@ static inline bool node_signaled(const struct i915_sched_node *node)
 	return i915_request_completed(node_to_request(node));
 }
 
-static inline struct i915_priolist *to_priolist(struct rb_node *rb)
+static inline unsigned int random_level(struct i915_priolist_root *root)
 {
-	return rb_entry(rb, struct i915_priolist, node);
-}
-
-static void assert_priolists(struct i915_sched_engine * const se)
-{
-	struct rb_node *rb;
-	long last_prio;
-
-	if (!IS_ENABLED(CONFIG_DRM_I915_DEBUG_GEM))
-		return;
-
-	GEM_BUG_ON(rb_first_cached(&se->queue) !=
-		   rb_first(&se->queue.rb_root));
-
-	last_prio = INT_MAX;
-	for (rb = rb_first_cached(&se->queue); rb; rb = rb_next(rb)) {
-		const struct i915_priolist *p = to_priolist(rb);
-
-		GEM_BUG_ON(p->priority > last_prio);
-		last_prio = p->priority;
-	}
+	root->prng = next_pseudo_random32(root->prng);
+	return  __ffs(root->prng) / 2;
 }
 
 static struct list_head *
 lookup_priolist(struct intel_engine_cs *engine, int prio)
 {
+	struct i915_priolist *update[I915_PRIOLIST_HEIGHT];
 	struct i915_sched_engine * const se = &engine->active;
-	struct i915_priolist *p;
-	struct rb_node **parent, *rb;
-	bool first = true;
-
-	lockdep_assert_held(&engine->active.lock);
-	assert_priolists(se);
+	struct i915_priolist_root *root = &se->queue;
+	struct i915_priolist *pl, *tmp;
+	int lvl;
 
+	lockdep_assert_held(&se->lock);
 	if (unlikely(se->no_priolist))
 		prio = I915_PRIORITY_NORMAL;
 
+	for_each_priolist(pl, root) { /* recycle any empty elements before us */
+		if (pl->priority >= prio || !list_empty(&pl->requests))
+			break;
+
+		i915_priolist_advance(root, pl);
+	}
+
 find_priolist:
-	/* most positive priority is scheduled first, equal priorities fifo */
-	rb = NULL;
-	parent = &se->queue.rb_root.rb_node;
-	while (*parent) {
-		rb = *parent;
-		p = to_priolist(rb);
-		if (prio > p->priority) {
-			parent = &rb->rb_left;
-		} else if (prio < p->priority) {
-			parent = &rb->rb_right;
-			first = false;
-		} else {
-			return &p->requests;
-		}
+	pl = &root->sentinel;
+	lvl = pl->level;
+	while (lvl >= 0) {
+		while (tmp = pl->next[lvl], tmp->priority >= prio)
+			pl = tmp;
+		if (pl->priority == prio)
+			goto out;
+		update[lvl--] = pl;
 	}
 
 	if (prio == I915_PRIORITY_NORMAL) {
-		p = &se->default_priolist;
+		pl = &se->default_priolist;
+	} else if (!pl_empty(&root->sentinel.requests)) {
+		pl = pl_pop(&root->sentinel.requests);
 	} else {
-		p = kmem_cache_alloc(global.slab_priorities, GFP_ATOMIC);
+		pl = kmem_cache_alloc(global.slab_priorities, GFP_ATOMIC);
 		/* Convert an allocation failure to a priority bump */
-		if (unlikely(!p)) {
+		if (unlikely(!pl)) {
 			prio = I915_PRIORITY_NORMAL; /* recurses just once */
 
-			/* To maintain ordering with all rendering, after an
+			/*
+			 * To maintain ordering with all rendering, after an
 			 * allocation failure we have to disable all scheduling.
 			 * Requests will then be executed in fifo, and schedule
 			 * will ensure that dependencies are emitted in fifo.
@@ -260,18 +304,103 @@ lookup_priolist(struct intel_engine_cs *engine, int prio)
 		}
 	}
 
-	p->priority = prio;
-	INIT_LIST_HEAD(&p->requests);
+	pl->priority = prio;
+	INIT_LIST_HEAD(&pl->requests);
 
-	rb_link_node(&p->node, rb, parent);
-	rb_insert_color_cached(&p->node, &se->queue, first);
+	lvl = random_level(root);
+	if (lvl > root->sentinel.level) {
+		if (root->sentinel.level < I915_PRIOLIST_HEIGHT - 1) {
+			lvl = ++root->sentinel.level;
+			update[lvl] = &root->sentinel;
+		} else {
+			lvl = I915_PRIOLIST_HEIGHT - 1;
+		}
+	}
+	GEM_BUG_ON(lvl < 0);
+	GEM_BUG_ON(lvl >= ARRAY_SIZE(pl->next));
 
-	return &p->requests;
+	pl->level = lvl;
+	do {
+		tmp = update[lvl];
+		pl->next[lvl] = update[lvl]->next[lvl];
+		tmp->next[lvl] = pl;
+	} while (--lvl >= 0);
+
+	if (IS_ENABLED(CONFIG_DRM_I915_DEBUG_GEM)) {
+		struct i915_priolist *chk;
+
+		chk = &root->sentinel;
+		lvl = chk->level;
+		do {
+			while (tmp = chk->next[lvl], tmp->priority >= prio)
+				chk = tmp;
+		} while (--lvl >= 0);
+
+		GEM_BUG_ON(chk != pl);
+	}
+
+out:
+	GEM_BUG_ON(pl == &root->sentinel);
+	return &pl->requests;
 }
 
-void __i915_priolist_free(struct i915_priolist *p)
+static void remove_priolist(struct intel_engine_cs *engine,
+			    struct list_head *plist)
 {
-	kmem_cache_free(global.slab_priorities, p);
+	struct i915_sched_engine * const se = &engine->active;
+	struct i915_priolist_root *root = &se->queue;
+	struct i915_priolist *pl, *tmp;
+	struct i915_priolist *old =
+		container_of(plist, struct i915_priolist, requests);
+	int prio = old->priority;
+	int lvl;
+
+	lockdep_assert_held(&se->lock);
+	GEM_BUG_ON(!list_empty(plist));
+
+	pl = &root->sentinel;
+	lvl = pl->level;
+	GEM_BUG_ON(lvl < 0);
+
+	if (prio != I915_PRIORITY_NORMAL)
+		pl_push(old, &pl->requests);
+
+	do {
+		while (tmp = pl->next[lvl], tmp->priority > prio)
+			pl = tmp;
+		if (lvl <= old->level) {
+			pl->next[lvl] = old->next[lvl];
+			if (pl == &root->sentinel && old->next[lvl] == pl) {
+				GEM_BUG_ON(pl->level != lvl);
+				pl->level--;
+			}
+		}
+	} while (--lvl >= 0);
+	GEM_BUG_ON(tmp != old);
+}
+
+void i915_priolist_advance(struct i915_priolist_root *root,
+			   struct i915_priolist *pl)
+{
+	struct i915_priolist * const s = &root->sentinel;
+	int lvl;
+
+	GEM_BUG_ON(!list_empty(&pl->requests));
+	GEM_BUG_ON(pl != s->next[0]);
+	GEM_BUG_ON(pl == s);
+
+	if (pl->priority != I915_PRIORITY_NORMAL)
+		pl_push(pl, &s->requests);
+
+	lvl = pl->level;
+	GEM_BUG_ON(lvl < 0);
+	do {
+		s->next[lvl] = pl->next[lvl];
+		if (pl->next[lvl] == s) {
+			GEM_BUG_ON(s->level != lvl);
+			s->level--;
+		}
+	} while (--lvl >= 0);
 }
 
 static struct i915_request *
@@ -420,8 +549,13 @@ static void __i915_request_set_priority(struct i915_request *rq, int prio)
 			continue;
 
 		GEM_BUG_ON(rq->engine != engine);
-		if (i915_request_in_priority_queue(rq))
+		if (i915_request_in_priority_queue(rq)) {
+			struct list_head *prev = rq->sched.link.prev;
+
 			list_move_tail(&rq->sched.link, plist);
+			if (list_empty(prev))
+				remove_priolist(engine, prev);
+		}
 
 		/* Defer (tasklet) submission until after all updates. */
 		kick_submission(engine, rq, prio);
diff --git a/drivers/gpu/drm/i915/i915_scheduler.h b/drivers/gpu/drm/i915/i915_scheduler.h
index 0ab47cbf0e9c..bca89a58d953 100644
--- a/drivers/gpu/drm/i915/i915_scheduler.h
+++ b/drivers/gpu/drm/i915/i915_scheduler.h
@@ -16,12 +16,6 @@
 
 struct drm_printer;
 
-#define priolist_for_each_request(it, plist) \
-	list_for_each_entry(it, &(plist)->requests, sched.link)
-
-#define priolist_for_each_request_consume(it, n, plist) \
-	list_for_each_entry_safe(it, n, &(plist)->requests, sched.link)
-
 void i915_sched_node_init(struct i915_sched_node *node);
 void i915_sched_node_reinit(struct i915_sched_node *node);
 
@@ -69,7 +63,7 @@ static inline void i915_priolist_free(struct i915_priolist *p)
 
 static inline bool i915_sched_is_idle(const struct i915_sched_engine *se)
 {
-	return RB_EMPTY_ROOT(&se->queue.rb_root);
+	return i915_priolist_is_empty(&se->queue);
 }
 
 static inline bool
@@ -99,6 +93,9 @@ static inline void i915_sched_kick(struct i915_sched_engine *se)
 	tasklet_hi_schedule(&se->tasklet);
 }
 
+void i915_priolist_advance(struct i915_priolist_root *root,
+			   struct i915_priolist *old);
+
 void i915_request_show_with_schedule(struct drm_printer *m,
 				     const struct i915_request *rq,
 				     const char *prefix,
diff --git a/drivers/gpu/drm/i915/i915_scheduler_types.h b/drivers/gpu/drm/i915/i915_scheduler_types.h
index f668c680d290..e64750be4e77 100644
--- a/drivers/gpu/drm/i915/i915_scheduler_types.h
+++ b/drivers/gpu/drm/i915/i915_scheduler_types.h
@@ -89,7 +89,7 @@ struct i915_sched_engine {
 	/**
 	 * @queue: queue of requests, in priority lists
 	 */
-	struct rb_root_cached queue;
+	struct i915_priolist_root queue;
 
 	struct i915_sched_ipi ipi;
 
diff --git a/drivers/gpu/drm/i915/selftests/i915_mock_selftests.h b/drivers/gpu/drm/i915/selftests/i915_mock_selftests.h
index 3db34d3eea58..946c93441c1f 100644
--- a/drivers/gpu/drm/i915/selftests/i915_mock_selftests.h
+++ b/drivers/gpu/drm/i915/selftests/i915_mock_selftests.h
@@ -25,6 +25,7 @@ selftest(ring, intel_ring_mock_selftests)
 selftest(engine, intel_engine_cs_mock_selftests)
 selftest(timelines, intel_timeline_mock_selftests)
 selftest(requests, i915_request_mock_selftests)
+selftest(scheduler, i915_scheduler_mock_selftests)
 selftest(objects, i915_gem_object_mock_selftests)
 selftest(phys, i915_gem_phys_mock_selftests)
 selftest(dmabuf, i915_gem_dmabuf_mock_selftests)
diff --git a/drivers/gpu/drm/i915/selftests/i915_scheduler.c b/drivers/gpu/drm/i915/selftests/i915_scheduler.c
index de44a66210b7..de5b1443129b 100644
--- a/drivers/gpu/drm/i915/selftests/i915_scheduler.c
+++ b/drivers/gpu/drm/i915/selftests/i915_scheduler.c
@@ -12,6 +12,54 @@
 #include "selftests/igt_spinner.h"
 #include "selftests/i915_random.h"
 
+static int mock_skiplist_levels(void *dummy)
+{
+	struct i915_priolist_root root = {};
+	struct i915_priolist *pl = &root.sentinel;
+	IGT_TIMEOUT(end_time);
+	unsigned long total;
+	int count, lvl;
+
+	total = 0;
+	do {
+		for (count = 0; count < 16384; count++) {
+			lvl = random_level(&root);
+			if (lvl > pl->level) {
+				if (lvl < I915_PRIOLIST_HEIGHT - 1)
+					lvl = ++pl->level;
+				else
+					lvl = I915_PRIOLIST_HEIGHT - 1;
+			}
+
+			pl->next[lvl] = ptr_inc(pl->next[lvl]);
+		}
+		total += count;
+	} while (!__igt_timeout(end_time, NULL));
+
+	pr_info("Total %9lu\n", total);
+	for (lvl = 0; lvl <= pl->level; lvl++) {
+		int x = ilog2((unsigned long)pl->next[lvl]);
+		char row[80];
+
+		memset(row, '*', x);
+		row[x] = '\0';
+
+		pr_info(" [%2d] %9lu %s\n",
+			lvl, (unsigned long)pl->next[lvl], row);
+	}
+
+	return 0;
+}
+
+int i915_scheduler_mock_selftests(void)
+{
+	static const struct i915_subtest tests[] = {
+		SUBTEST(mock_skiplist_levels),
+	};
+
+	return i915_subtests(tests, NULL);
+}
+
 static void scheduling_disable(struct intel_engine_cs *engine)
 {
 	engine->props.preempt_timeout_ms = 0;
@@ -80,9 +128,9 @@ static int all_engines(struct drm_i915_private *i915,
 static bool check_context_order(struct intel_engine_cs *engine)
 {
 	u64 last_seqno, last_context;
+	struct i915_priolist *p;
 	unsigned long count;
 	bool result = false;
-	struct rb_node *rb;
 	int last_prio;
 
 	/* We expect the execution order to follow ascending fence-context */
@@ -92,8 +140,7 @@ static bool check_context_order(struct intel_engine_cs *engine)
 	last_context = 0;
 	last_seqno = 0;
 	last_prio = 0;
-	for (rb = rb_first_cached(&engine->active.queue); rb; rb = rb_next(rb)) {
-		struct i915_priolist *p = rb_entry(rb, typeof(*p), node);
+	for_each_priolist(p, &engine->active.queue) {
 		struct i915_request *rq;
 
 		priolist_for_each_request(rq, p) {
-- 
2.20.1
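
A note on the level distribution used by random_level() in the patch
above: halving __ffs() of a uniformly distributed 32-bit value selects
level l with probability 3/4 * 4^-l, i.e. each entry is promoted to the
next level with probability 1/4. The userspace sketch below is not part
of the patch. lcg_next() copies the constants of the kernel's
next_pseudo_random32(); lowest_set_bit(), level_histogram() and
SKIPLIST_HEIGHT are names made up for illustration, and the guard for a
zero input is an assumption, since __ffs(0) is undefined in the kernel.

```c
#include <assert.h>
#include <stdint.h>

#define SKIPLIST_HEIGHT 12 /* mirrors I915_PRIOLIST_HEIGHT on 64-bit */

/* Same LCG step as the kernel's next_pseudo_random32(). */
static uint32_t lcg_next(uint32_t seed)
{
	return seed * 1664525u + 1013904223u;
}

/* Index of the lowest set bit; returns 32 for zero (assumption). */
static unsigned int lowest_set_bit(uint32_t x)
{
	unsigned int n = 0;

	if (!x)
		return 32;

	while (!(x & 1)) {
		x >>= 1;
		n++;
	}

	return n;
}

/* Two bits of the random word are consumed per level. */
static unsigned int random_level(uint32_t *prng)
{
	*prng = lcg_next(*prng);
	return lowest_set_bit(*prng) / 2;
}

/*
 * Count how often each level is drawn. Unlike lookup_priolist(), which
 * grows the list by at most one level per insertion, this sketch simply
 * clamps to the top level.
 */
void level_histogram(uint32_t seed, unsigned long draws,
		     unsigned long hist[SKIPLIST_HEIGHT])
{
	unsigned long i;

	for (i = 0; i < draws; i++) {
		unsigned int lvl = random_level(&seed);

		if (lvl > SKIPLIST_HEIGHT - 1)
			lvl = SKIPLIST_HEIGHT - 1;
		hist[lvl]++;
	}
}
```

Because the low 16 bits of this generator cycle through every value
once per 65536 steps, any run of 65536 draws picks level 0 exactly
49152 times and level 1 exactly 12288 times, whatever the seed.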


* [Intel-gfx] [PATCH 21/41] drm/i915: Wrap cmpxchg64 with try_cmpxchg64() helper
  2021-01-25 14:00 [Intel-gfx] [PATCH 01/41] drm/i915/selftests: Check for engine-reset errors in the middle of workarounds Chris Wilson
                   ` (18 preceding siblings ...)
  2021-01-25 14:01 ` [Intel-gfx] [PATCH 20/41] drm/i915: Replace priolist rbtree with a skiplist Chris Wilson
@ 2021-01-25 14:01 ` Chris Wilson
  2021-01-27 15:28   ` Tvrtko Ursulin
  2021-01-25 14:01 ` [Intel-gfx] [PATCH 22/41] drm/i915: Fair low-latency scheduling Chris Wilson
                   ` (24 subsequent siblings)
  44 siblings, 1 reply; 90+ messages in thread
From: Chris Wilson @ 2021-01-25 14:01 UTC (permalink / raw)
  To: intel-gfx; +Cc: thomas.hellstrom, Chris Wilson

Wrap cmpxchg64 with a try_cmpxchg()-esque helper. Hiding the old-value
dance in the helper allows for cleaner code.
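
To illustrate the "old-value dance" being hidden, here is a minimal
userspace model, assuming C11 atomics as a stand-in for the kernel
primitives. cmpxchg64_model(), try_cmpxchg64_model(), xchg64_model()
and add_capped() are hypothetical names for this sketch only; the real
helpers are the macros in the diff below.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for cmpxchg64(): returns the value that was found at *ptr. */
static uint64_t cmpxchg64_model(_Atomic uint64_t *ptr,
				uint64_t old, uint64_t new_val)
{
	/* On failure, C11 writes the current value back into 'old'. */
	atomic_compare_exchange_strong(ptr, &old, new_val);
	return old;
}

/* try_ form: report success and refresh *pold when the swap fails. */
static bool try_cmpxchg64_model(_Atomic uint64_t *ptr,
				uint64_t *pold, uint64_t new_val)
{
	uint64_t old = *pold;
	uint64_t cur = cmpxchg64_model(ptr, old, new_val);

	if (cur != old)
		*pold = cur;

	return cur == old;
}

/* An exchange built from the try_ helper, like the 32-bit fallback. */
uint64_t xchg64_model(_Atomic uint64_t *ptr, uint64_t new_val)
{
	uint64_t old = atomic_load(ptr);

	while (!try_cmpxchg64_model(ptr, &old, new_val))
		;

	return old;
}

/* Typical caller: the retry loop never has to reload *ptr by hand. */
bool add_capped(_Atomic uint64_t *ptr, uint64_t delta, uint64_t limit)
{
	uint64_t old = atomic_load(ptr);

	do {
		if (old + delta > limit)
			return false;
	} while (!try_cmpxchg64_model(ptr, &old, old + delta));

	return true;
}
```

With a plain cmpxchg64() the caller would have to compare the returned
value against its expectation and copy it back before retrying; the
try_ form folds both steps into the loop condition.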

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/i915_utils.h | 32 +++++++++++++++++++++++++++++++
 1 file changed, 32 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_utils.h b/drivers/gpu/drm/i915/i915_utils.h
index abd4dcd9f79c..95ead6bb1ba6 100644
--- a/drivers/gpu/drm/i915/i915_utils.h
+++ b/drivers/gpu/drm/i915/i915_utils.h
@@ -461,4 +461,36 @@ static inline bool timer_expired(const struct timer_list *t)
  */
 #define IS_ACTIVE(config) ((config) != 0)
 
+#ifndef try_cmpxchg64
+#if IS_ENABLED(CONFIG_64BIT)
+#define try_cmpxchg64(_ptr, _pold, _new) try_cmpxchg(_ptr, _pold, _new)
+#else
+#define try_cmpxchg64(_ptr, _pold, _new)				\
+({									\
+	__typeof__(_ptr) _old = (__typeof__(_ptr))(_pold);		\
+	__typeof__(*(_ptr)) __old = *_old;				\
+	__typeof__(*(_ptr)) __cur = cmpxchg64(_ptr, __old, _new);	\
+	bool success = __cur == __old;					\
+	if (unlikely(!success))						\
+		*_old = __cur;						\
+	likely(success);						\
+})
+#endif
+#endif
+
+#ifndef xchg64
+#if IS_ENABLED(CONFIG_64BIT)
+#define xchg64(_ptr, _new) xchg(_ptr, _new)
+#else
+#define xchg64(_ptr, _new)						\
+({									\
+	__typeof__(_ptr) __ptr = (_ptr);				\
+	__typeof__(*(_ptr)) __old = *__ptr;				\
+	while (!try_cmpxchg64(__ptr, &__old, (_new)))			\
+		;							\
+	__old;								\
+})
+#endif
+#endif
+
 #endif /* !__I915_UTILS_H */
-- 
2.20.1


* [Intel-gfx] [PATCH 22/41] drm/i915: Fair low-latency scheduling
  2021-01-25 14:00 [Intel-gfx] [PATCH 01/41] drm/i915/selftests: Check for engine-reset errors in the middle of workarounds Chris Wilson
                   ` (19 preceding siblings ...)
  2021-01-25 14:01 ` [Intel-gfx] [PATCH 21/41] drm/i915: Wrap cmpxchg64 with try_cmpxchg64() helper Chris Wilson
@ 2021-01-25 14:01 ` Chris Wilson
  2021-01-28 11:35   ` Tvrtko Ursulin
  2021-01-25 14:01 ` [Intel-gfx] [PATCH 23/41] drm/i915/gt: Specify a deadline for the heartbeat Chris Wilson
                   ` (23 subsequent siblings)
  44 siblings, 1 reply; 90+ messages in thread
From: Chris Wilson @ 2021-01-25 14:01 UTC (permalink / raw)
  To: intel-gfx; +Cc: thomas.hellstrom, Chris Wilson

The first "scheduler" was a topological sorting of requests into
priority order. The execution order was deterministic: the earliest
submitted, highest priority request would be executed first. Priority
inheritance ensured that inversions were kept at bay, and allowed us to
dynamically boost priorities (e.g. for interactive pageflips).

The minimalistic timeslicing scheme was an attempt to introduce fairness
between long running requests, by evicting the active request at the end
of a timeslice and moving it to the back of its priority queue (while
ensuring that dependencies were kept in order). For short running
requests from many clients of equal priority, the scheme is still very
much FIFO submission ordering, and as unfair as before.

To impose fairness, we need an external metric that ensures that clients
are interspersed, so we don't execute one long chain from client A before
executing any of client B. This could be imposed by the clients
themselves by using fences based on an external clock, that is they only
submit work for a "frame" at frame-intervals, instead of submitting as
much work as they are able to. The standard SwapBuffers approach is akin
to double buffering, where as one frame is being executed, the next is
being submitted, such that there is always a maximum of two frames per
client in the pipeline and so ideally maintains consistent input-output
latency. Even this scheme exhibits unfairness under load as a single
client will execute two frames back to back before the next, and with
enough clients, deadlines will be missed.

The idea introduced by BFS/MuQSS is that fairness is introduced by
metering with an external clock. Every request, when it becomes ready to
execute is assigned a virtual deadline, and execution order is then
determined by earliest deadline. Priority is used as a hint, rather than
strict ordering, where high priority requests have earlier deadlines,
but not necessarily earlier than outstanding work. Thus work is executed
in order of 'readiness', with timeslicing to demote long running work.
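
As a rough sketch of that idea (not the heuristic implemented by this
patch), assume each request is stamped with "ready time plus a slice
that shrinks as priority rises", and the queue is then served in
earliest-deadline order. The struct layout, the priority range of -2 to
2, the 1ms base slice and the function names are all assumptions made
for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct request {
	const char *name;
	int prio;          /* hypothetical range: -2 (low) .. 2 (high) */
	uint64_t deadline; /* virtual deadline, in ns */
};

/* Higher priority gives a shorter offset, hence an earlier deadline. */
uint64_t virtual_deadline(uint64_t ready_ns, int prio)
{
	const uint64_t base_ns = 1000000; /* 1ms, an arbitrary choice */
	int shift = 2 - prio;

	if (shift < 0)
		shift = 0;
	if (shift > 4)
		shift = 4;

	return ready_ns + (base_ns << shift);
}

/* Pick the earliest deadline; ties keep submission (FIFO) order. */
const struct request *earliest_deadline(const struct request *rq,
					size_t count)
{
	const struct request *first = NULL;
	size_t i;

	for (i = 0; i < count; i++) {
		if (!first || rq[i].deadline < first->deadline)
			first = &rq[i];
	}

	return first;
}
```

In this model a normal-priority request that became ready at t=0 gets a
deadline of 4ms. A high-priority request that becomes ready at t=3.5ms
gets 4.5ms and so still runs second, whereas one that becomes ready at
t=2ms gets 3ms and runs first: priority is a hint, not a strict order.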

The Achilles' heel of this scheduler is its strong preference for
low-latency and favouring of new queues. Whereas it was easy to dominate
the old scheduler by flooding it with many requests over a short period
of time, the new scheduler can be dominated by a 'synchronous' client
that waits for each of its requests to complete before submitting the
next. As such a client has no history, it is always considered
ready-to-run and receives an earlier deadline than the long running
requests. This is compensated for by refreshing the current execution's
deadline and by disallowing preemption for timeslice shuffling.
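
The resulting preemption rule can be modelled in userspace roughly as
follows. This is a simplified sketch of need_preempt() and dl_before()
from the diff below: the candidates are passed as a flat array rather
than gathered from ELSP[1], the priority queue and the virtual engines,
the semaphore capability check is omitted, and struct rq and
PRIO_NORMAL (taken to be 0) are stand-ins for the driver's types.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PRIO_NORMAL 0 /* assumed value of I915_PRIORITY_NORMAL */

struct rq {
	int prio;
	uint64_t deadline;
	bool nopreempt;
};

/* True if 'next' should run before 'prev'; a NULL 'prev' always loses. */
static bool dl_before(const struct rq *next, const struct rq *prev)
{
	return !prev || (next && next->deadline < prev->deadline);
}

bool need_preempt(const struct rq *running,
		  const struct rq *queued, size_t count)
{
	const struct rq *first = NULL;
	int floor_prio;
	size_t i;

	if (running->nopreempt)
		return false;

	/* Find the queued request with the earliest deadline. */
	for (i = 0; i < count; i++) {
		if (dl_before(&queued[i], first))
			first = &queued[i];
	}

	/* Nothing queued, or nothing due before the running request. */
	if (!dl_before(first, running))
		return false;

	/*
	 * An earlier deadline alone waits for the timeslice boundary;
	 * an urgent preemption also requires a higher priority.
	 */
	floor_prio = running->prio;
	if (floor_prio < PRIO_NORMAL - 1)
		floor_prio = PRIO_NORMAL - 1;

	return first->prio > floor_prio;
}
```

A queued request of equal priority with an earlier deadline therefore
does not interrupt the running request; it is only picked up when the
timeslice expires.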

In contrast, one key advantage of disconnecting the sort key from the
priority value is that we can freely adjust the deadline to compensate
for other factors. This is used in conjunction with submitting requests
ahead-of-schedule that then busywait on the GPU using semaphores. Since
we don't want to spend a timeslice busywaiting instead of doing real
work when available, we deprioritise work by giving the semaphore waits
a later virtual deadline. The priority deboost is applied to semaphore
workloads after they miss a semaphore wait and a new context is pending.
The request is then restored to its normal priority once the semaphores
are signaled so that it is not unfairly penalised under contention by
remaining at a far future deadline. This is a much improved and cleaner
version of commit f9e9e9de58c7 ("drm/i915: Prioritise non-busywait
semaphore workloads").

To check the impact on throughput (often the downfall of latency
sensitive schedulers), we used gem_wsim to simulate various transcode
workloads with different load balancers, and varying the number of
competing [heterogeneous] clients. On Kabylake gt3e running at fixed
clocks,

+delta%------------------------------------------------------------------+
|       a                                                                |
|       a                                                                |
|       a                                                                |
|       a                                                                |
|       aa                                                               |
|      aaa                                                               |
|      aaaa                                                              |
|     aaaaaa                                                             |
|     aaaaaa                                                             |
|     aaaaaa   a                a                                        |
| aa  aaaaaa a a      a  a   aa a       a         a       a             a|
||______M__A__________|                                                  |
+------------------------------------------------------------------------+
    N           Min           Max        Median          Avg       Stddev
  108    -4.6326643     47.797855 -0.00069639128     2.116185   7.6764049

Reviewing the same workloads on Tigerlake,

+delta%------------------------------------------------------------------+
|       a                                                                |
|       a                                                                |
|       a                                                                |
|       aa a                                                             |
|       aaaa                                                             |
|       aaaa                                                             |
|    aaaaaaa                                                             |
|    aaaaaaa                                                             |
|    aaaaaaa      a   a   aa  a         a                         a      |
| aaaaaaaaaa a aa a a a aaaa aa   a     a        aa               a     a|
||_______M____A_____________|                                            |
+------------------------------------------------------------------------+
    N           Min           Max        Median          Avg       Stddev
  108     -4.258712      46.83081    0.36853159    4.1415662     9.461689

The expectation is that by deliberately increasing the number of context
switches to improve fairness between clients, throughput will be
diminished. What we do see are small fluctuations around no change,
with the median result being improved throughput. The dramatic
improvement is from reintroducing the improved no-semaphore boosting,
which avoids accidentally preventing scheduling of ready workloads due
to busy spinners.

This scheduler is based on MuQSS by Dr Con Kolivas.

Testcase: igt/gem_exec_fair
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/gt/intel_engine_cs.c     |   2 -
 .../gpu/drm/i915/gt/intel_engine_heartbeat.c  |   1 +
 drivers/gpu/drm/i915/gt/intel_engine_pm.c     |   4 +-
 drivers/gpu/drm/i915/gt/intel_engine_types.h  |  14 -
 .../drm/i915/gt/intel_execlists_submission.c  | 205 ++++-----
 drivers/gpu/drm/i915/gt/selftest_execlists.c  |  30 +-
 drivers/gpu/drm/i915/gt/selftest_hangcheck.c  |   5 +-
 drivers/gpu/drm/i915/gt/selftest_lrc.c        |   1 +
 .../gpu/drm/i915/gt/uc/intel_guc_submission.c |   5 -
 drivers/gpu/drm/i915/i915_priolist_types.h    |   7 +-
 drivers/gpu/drm/i915/i915_request.c           |  14 +-
 drivers/gpu/drm/i915/i915_scheduler.c         | 433 +++++++++++++-----
 drivers/gpu/drm/i915/i915_scheduler.h         |  16 +-
 drivers/gpu/drm/i915/i915_scheduler_types.h   |  23 +
 drivers/gpu/drm/i915/selftests/i915_request.c |   1 +
 .../gpu/drm/i915/selftests/i915_scheduler.c   | 136 ++++++
 16 files changed, 630 insertions(+), 267 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
index 2f9a8960144b..8372c8bc4ca5 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
@@ -573,8 +573,6 @@ void intel_engine_init_execlists(struct intel_engine_cs *engine)
 	memset(execlists->pending, 0, sizeof(execlists->pending));
 	execlists->active =
 		memset(execlists->inflight, 0, sizeof(execlists->inflight));
-
-	execlists->queue_priority_hint = INT_MIN;
 }
 
 static void cleanup_status_page(struct intel_engine_cs *engine)
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c b/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c
index 0b026cde9f09..2d1f0a4da13c 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c
@@ -204,6 +204,7 @@ static int __intel_engine_pulse(struct intel_engine_cs *engine)
 	if (IS_ERR(rq))
 		return PTR_ERR(rq);
 
+	rq->sched.deadline = 0;
 	__set_bit(I915_FENCE_FLAG_SENTINEL, &rq->fence.flags);
 
 	heartbeat_commit(rq, &attr);
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_pm.c b/drivers/gpu/drm/i915/gt/intel_engine_pm.c
index 205feeaf0e76..2427d9e01be9 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_pm.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_pm.c
@@ -211,6 +211,7 @@ static bool switch_to_kernel_context(struct intel_engine_cs *engine)
 	i915_request_add_active_barriers(rq);
 
 	/* Install ourselves as a preemption barrier */
+	rq->sched.deadline = 0;
 	rq->sched.attr.priority = I915_PRIORITY_BARRIER;
 	if (likely(!__i915_request_commit(rq))) { /* engine should be idle! */
 		/*
@@ -271,9 +272,6 @@ static int __engine_park(struct intel_wakeref *wf)
 	intel_engine_park_heartbeat(engine);
 	intel_breadcrumbs_park(engine->breadcrumbs);
 
-	/* Must be reset upon idling, or we may miss the busy wakeup. */
-	GEM_BUG_ON(engine->execlists.queue_priority_hint != INT_MIN);
-
 	if (engine->park)
 		engine->park(engine);
 
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_types.h b/drivers/gpu/drm/i915/gt/intel_engine_types.h
index 76d561c2c6aa..b5bef848a2d5 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_types.h
+++ b/drivers/gpu/drm/i915/gt/intel_engine_types.h
@@ -223,20 +223,6 @@ struct intel_engine_execlists {
 	 */
 	unsigned int port_mask;
 
-	/**
-	 * @queue_priority_hint: Highest pending priority.
-	 *
-	 * When we add requests into the queue, or adjust the priority of
-	 * executing requests, we compute the maximum priority of those
-	 * pending requests. We can then use this value to determine if
-	 * we need to preempt the executing requests to service the queue.
-	 * However, since the we may have recorded the priority of an inflight
-	 * request we wanted to preempt but since completed, at the time of
-	 * dequeuing the priority hint may no longer may match the highest
-	 * available request priority.
-	 */
-	int queue_priority_hint;
-
 	struct rb_root_cached virtual;
 
 	/**
diff --git a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
index 129144dd86b0..8f12068465bd 100644
--- a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
+++ b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
@@ -178,7 +178,7 @@ struct virtual_engine {
 	 */
 	struct ve_node {
 		struct rb_node rb;
-		int prio;
+		u64 deadline;
 	} nodes[I915_NUM_ENGINES];
 
 	/*
@@ -246,25 +246,12 @@ static void ring_set_paused(const struct intel_engine_cs *engine, int state)
 
 static int rq_prio(const struct i915_request *rq)
 {
-	return READ_ONCE(rq->sched.attr.priority);
+	return rq->sched.attr.priority;
 }
 
-static int effective_prio(const struct i915_request *rq)
+static u64 rq_deadline(const struct i915_request *rq)
 {
-	int prio = rq_prio(rq);
-
-	/*
-	 * If this request is special and must not be interrupted at any
-	 * cost, so be it. Note we are only checking the most recent request
-	 * in the context and so may be masking an earlier vip request. It
-	 * is hoped that under the conditions where nopreempt is used, this
-	 * will not matter (i.e. all requests to that context will be
-	 * nopreempt for as long as desired).
-	 */
-	if (i915_request_has_nopreempt(rq))
-		prio = I915_PRIORITY_UNPREEMPTABLE;
-
-	return prio;
+	return rq->sched.deadline;
 }
 
 static struct i915_request *first_request(struct i915_sched_engine *se)
@@ -283,61 +270,61 @@ static struct i915_request *first_request(struct i915_sched_engine *se)
 	return NULL;
 }
 
-static int queue_prio(struct i915_sched_engine *se)
+static struct i915_request *first_virtual(const struct intel_engine_cs *engine)
 {
-	struct i915_request *rq;
+	struct rb_node *rb;
 
-	rq = first_request(se);
-	if (!rq)
-		return INT_MIN;
+	rb = rb_first_cached(&engine->execlists.virtual);
+	if (!rb)
+		return NULL;
 
-	return rq_prio(rq);
+	return READ_ONCE(rb_entry(rb,
+				  struct virtual_engine,
+				  nodes[engine->id].rb)->request);
 }
 
-static int virtual_prio(const struct intel_engine_execlists *el)
+static const struct i915_request *
+next_elsp_request(struct intel_engine_cs *engine, const struct i915_request *rq)
 {
-	struct rb_node *rb = rb_first_cached(&el->virtual);
+	if (list_is_last(&rq->sched.link, &engine->active.requests))
+		return NULL;
 
-	return rb ? rb_entry(rb, struct ve_node, rb)->prio : INT_MIN;
+	return list_next_entry(rq, sched.link);
+}
+
+static bool
+dl_before(const struct i915_request *next, const struct i915_request *prev)
+{
+	return !prev || (next && rq_deadline(next) < rq_deadline(prev));
 }
 
 static bool need_preempt(struct intel_engine_cs *engine,
 			 const struct i915_request *rq)
 {
-	int last_prio;
+	const struct i915_request *first = NULL;
+	const struct i915_request *next;
 
 	if (!intel_engine_has_semaphores(engine))
 		return false;
 
 	/*
-	 * Check if the current priority hint merits a preemption attempt.
-	 *
-	 * We record the highest value priority we saw during rescheduling
-	 * prior to this dequeue, therefore we know that if it is strictly
-	 * less than the current tail of ESLP[0], we do not need to force
-	 * a preempt-to-idle cycle.
-	 *
-	 * However, the priority hint is a mere hint that we may need to
-	 * preempt. If that hint is stale or we may be trying to preempt
-	 * ourselves, ignore the request.
-	 *
-	 * More naturally we would write
-	 *      prio >= max(0, last);
-	 * except that we wish to prevent triggering preemption at the same
-	 * priority level: the task that is running should remain running
-	 * to preserve FIFO ordering of dependencies.
+	 * If this request is special and must not be interrupted at any
+	 * cost, so be it. Note we are only checking the most recent request
+	 * in the context and so may be masking an earlier vip request. It
+	 * is hoped that under the conditions where nopreempt is used, this
+	 * will not matter (i.e. all requests to that context will be
+	 * nopreempt for as long as desired).
 	 */
-	last_prio = max(effective_prio(rq), I915_PRIORITY_NORMAL - 1);
-	if (engine->execlists.queue_priority_hint <= last_prio)
+	if (i915_request_has_nopreempt(rq))
 		return false;
 
 	/*
 	 * Check against the first request in ELSP[1], it will, thanks to the
 	 * power of PI, be the highest priority of that context.
 	 */
-	if (!list_is_last(&rq->sched.link, &engine->active.requests) &&
-	    rq_prio(list_next_entry(rq, sched.link)) > last_prio)
-		return true;
+	next = next_elsp_request(engine, rq);
+	if (dl_before(next, first))
+		first = next;
 
 	/*
 	 * If the inflight context did not trigger the preemption, then maybe
@@ -349,8 +336,31 @@ static bool need_preempt(struct intel_engine_cs *engine,
 	 * ELSP[0] or ELSP[1] as, thanks again to PI, if it was the same
 	 * context, it's priority would not exceed ELSP[0] aka last_prio.
 	 */
-	return max(virtual_prio(&engine->execlists),
-		   queue_prio(&engine->active)) > last_prio;
+	next = first_request(&engine->active);
+	if (dl_before(next, first))
+		first = next;
+
+	next = first_virtual(engine);
+	if (dl_before(next, first))
+		first = next;
+
+	if (!dl_before(first, rq))
+		return false;
+
+	/*
+	 * While a request may have been queued with an earlier deadline
+	 * than the one currently running, we only allow it to perform an
+	 * urgent preemption if it also has higher priority. The cost of
+	 * frequently switching between contexts is noticeable, so we try
+	 * to confine the deadline shuffling to timeslice boundaries.
+	 */
+	ENGINE_TRACE(engine,
+		     "preempt for first=%llx:%llu, dl=%llu, prio=%d?\n",
+		     first->fence.context,
+		     first->fence.seqno,
+		     rq_deadline(first),
+		     rq_prio(first));
+	return rq_prio(first) > max(rq_prio(rq), I915_PRIORITY_NORMAL - 1);
 }
 
 __maybe_unused static bool
@@ -367,7 +377,7 @@ assert_priority_queue(const struct i915_request *prev,
 	if (i915_request_is_active(prev))
 		return true;
 
-	return rq_prio(prev) >= rq_prio(next);
+	return rq_deadline(prev) <= rq_deadline(next);
 }
 
 static void
@@ -549,9 +559,12 @@ static void __execlists_schedule_out(struct i915_request * const rq,
 	 * If we have just completed this context, the engine may now be
 	 * idle and we want to re-enter powersaving.
 	 */
-	if (intel_timeline_is_last(ce->timeline, rq) &&
-	    __i915_request_is_complete(rq))
-		intel_engine_add_retire(engine, ce->timeline);
+	if (__i915_request_is_complete(rq)) {
+		if (!intel_timeline_is_last(ce->timeline, rq))
+			i915_request_update_deadline(list_next_entry(rq, link));
+		else
+			intel_engine_add_retire(engine, ce->timeline);
+	}
 
 	ccid = ce->lrc.ccid;
 	ccid >>= GEN11_SW_CTX_ID_SHIFT - 32;
@@ -661,14 +674,14 @@ dump_port(char *buf, int buflen, const char *prefix, struct i915_request *rq)
 	if (!rq)
 		return "";
 
-	snprintf(buf, buflen, "%sccid:%x %llx:%lld%s prio %d",
+	snprintf(buf, buflen, "%sccid:%x %llx:%lld%s dl:%llu",
 		 prefix,
 		 rq->context->lrc.ccid,
 		 rq->fence.context, rq->fence.seqno,
 		 __i915_request_is_complete(rq) ? "!" :
 		 __i915_request_has_started(rq) ? "*" :
 		 "",
-		 rq_prio(rq));
+		 rq_deadline(rq));
 
 	return buf;
 }
@@ -1191,11 +1204,11 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
 	if (last) {
 		if (need_preempt(engine, last)) {
 			ENGINE_TRACE(engine,
-				     "preempting last=%llx:%lld, prio=%d, hint=%d\n",
+				     "preempting last=%llx:%llu, dl=%llu, prio=%d\n",
 				     last->fence.context,
 				     last->fence.seqno,
-				     last->sched.attr.priority,
-				     execlists->queue_priority_hint);
+				     rq_deadline(last),
+				     rq_prio(last));
 			record_preemption(execlists);
 
 			/*
@@ -1217,11 +1230,11 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
 			last = NULL;
 		} else if (timeslice_expired(engine, last)) {
 			ENGINE_TRACE(engine,
-				     "expired:%s last=%llx:%lld, prio=%d, hint=%d, yield?=%s\n",
+				     "expired:%s last=%llx:%llu, deadline=%llu, now=%llu, yield?=%s\n",
 				     yesno(timer_expired(&execlists->timer)),
 				     last->fence.context, last->fence.seqno,
-				     rq_prio(last),
-				     execlists->queue_priority_hint,
+				     rq_deadline(last),
+				     i915_sched_to_ticks(ktime_get()),
 				     yesno(timeslice_yield(execlists, last)));
 
 			/*
@@ -1292,7 +1305,7 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
 		GEM_BUG_ON(rq->engine != &ve->base);
 		GEM_BUG_ON(rq->context != &ve->context);
 
-		if (unlikely(rq_prio(rq) < queue_prio(&engine->active))) {
+		if (!dl_before(rq, first_request(&engine->active))) {
 			spin_unlock(&ve->base.active.lock);
 			break;
 		}
@@ -1304,16 +1317,15 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
 		}
 
 		ENGINE_TRACE(engine,
-			     "virtual rq=%llx:%lld%s, new engine? %s\n",
+			     "virtual rq=%llx:%lld%s, dl %llx, new engine? %s\n",
 			     rq->fence.context,
 			     rq->fence.seqno,
 			     __i915_request_is_complete(rq) ? "!" :
 			     __i915_request_has_started(rq) ? "*" :
 			     "",
+			     rq_deadline(rq),
 			     yesno(engine != ve->siblings[0]));
-
 		WRITE_ONCE(ve->request, NULL);
-		WRITE_ONCE(ve->base.execlists.queue_priority_hint, INT_MIN);
 
 		rb = &ve->nodes[engine->id].rb;
 		rb_erase_cached(rb, &execlists->virtual);
@@ -1404,6 +1416,10 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
 				if (rq->execution_mask != engine->mask)
 					goto done;
 
+				if (unlikely(dl_before(first_virtual(engine),
+						       rq)))
+					goto done;
+
 				/*
 				 * If GVT overrides us we only ever submit
 				 * port[0], leaving port[1] empty. Note that we
@@ -1440,24 +1456,6 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
 	}
 done:
 	*port++ = i915_request_get(last);
-
-	/*
-	 * Here be a bit of magic! Or sleight-of-hand, whichever you prefer.
-	 *
-	 * We choose the priority hint such that if we add a request of greater
-	 * priority than this, we kick the submission tasklet to decide on
-	 * the right order of submitting the requests to hardware. We must
-	 * also be prepared to reorder requests as they are in-flight on the
-	 * HW. We derive the priority hint then as the first "hole" in
-	 * the HW submission ports and if there are no available slots,
-	 * the priority of the lowest executing request, i.e. last.
-	 *
-	 * When we do receive a higher priority request ready to run from the
-	 * user, see queue_request(), the priority hint is bumped to that
-	 * request triggering preemption on the next dequeue (or subsequent
-	 * interrupt for secondary ports).
-	 */
-	execlists->queue_priority_hint = queue_prio(&engine->active);
 	spin_unlock(&engine->active.lock);
 
 	/*
@@ -2631,10 +2629,6 @@ static void execlists_reset_rewind(struct intel_engine_cs *engine, bool stalled)
 
 static void nop_submission_tasklet(unsigned long data)
 {
-	struct intel_engine_cs * const engine = (struct intel_engine_cs *)data;
-
-	/* The driver is wedged; don't process any more events. */
-	WRITE_ONCE(engine->execlists.queue_priority_hint, INT_MIN);
 }
 
 static void execlists_reset_cancel(struct intel_engine_cs *engine)
@@ -2701,16 +2695,12 @@ static void execlists_reset_cancel(struct intel_engine_cs *engine)
 			rq->engine = engine;
 			__i915_request_submit(rq);
 			i915_request_put(rq);
-
-			ve->base.execlists.queue_priority_hint = INT_MIN;
 		}
 		spin_unlock(&ve->base.active.lock);
 	}
 
 	/* Remaining _unready_ requests will be nop'ed when submitted */
 
-	execlists->queue_priority_hint = INT_MIN;
-
 	GEM_BUG_ON(__tasklet_is_enabled(&engine->active.tasklet));
 	engine->active.tasklet.func = nop_submission_tasklet;
 
@@ -3115,7 +3105,8 @@ static const struct intel_context_ops virtual_context_ops = {
 	.destroy = virtual_context_destroy,
 };
 
-static intel_engine_mask_t virtual_submission_mask(struct virtual_engine *ve)
+static intel_engine_mask_t
+virtual_submission_mask(struct virtual_engine *ve, u64 *deadline)
 {
 	struct i915_request *rq;
 	intel_engine_mask_t mask;
@@ -3132,9 +3123,11 @@ static intel_engine_mask_t virtual_submission_mask(struct virtual_engine *ve)
 		mask = ve->siblings[0]->mask;
 	}
 
-	ENGINE_TRACE(&ve->base, "rq=%llx:%lld, mask=%x, prio=%d\n",
+	*deadline = rq_deadline(rq);
+
+	ENGINE_TRACE(&ve->base, "rq=%llx:%llu, mask=%x, dl=%llu\n",
 		     rq->fence.context, rq->fence.seqno,
-		     mask, ve->base.execlists.queue_priority_hint);
+		     mask, *deadline);
 
 	return mask;
 }
@@ -3142,12 +3135,12 @@ static intel_engine_mask_t virtual_submission_mask(struct virtual_engine *ve)
 static void virtual_submission_tasklet(unsigned long data)
 {
 	struct virtual_engine * const ve = (struct virtual_engine *)data;
-	const int prio = READ_ONCE(ve->base.execlists.queue_priority_hint);
 	intel_engine_mask_t mask;
 	unsigned int n;
+	u64 deadline;
 
 	rcu_read_lock();
-	mask = virtual_submission_mask(ve);
+	mask = virtual_submission_mask(ve, &deadline);
 	rcu_read_unlock();
 	if (unlikely(!mask))
 		return;
@@ -3180,7 +3173,8 @@ static void virtual_submission_tasklet(unsigned long data)
 			 */
 			first = rb_first_cached(&sibling->execlists.virtual) ==
 				&node->rb;
-			if (prio == node->prio || (prio > node->prio && first))
+			if (deadline == node->deadline ||
+			    (deadline < node->deadline && first))
 				goto submit_engine;
 
 			rb_erase_cached(&node->rb, &sibling->execlists.virtual);
@@ -3194,7 +3188,7 @@ static void virtual_submission_tasklet(unsigned long data)
 
 			rb = *parent;
 			other = rb_entry(rb, typeof(*other), rb);
-			if (prio > other->prio) {
+			if (deadline < other->deadline) {
 				parent = &rb->rb_left;
 			} else {
 				parent = &rb->rb_right;
@@ -3209,8 +3203,8 @@ static void virtual_submission_tasklet(unsigned long data)
 
 submit_engine:
 		GEM_BUG_ON(RB_EMPTY_NODE(&node->rb));
-		node->prio = prio;
-		if (first && prio > sibling->execlists.queue_priority_hint)
+		node->deadline = deadline;
+		if (first)
 			i915_sched_kick(&sibling->active);
 
 unlock_engine:
@@ -3246,7 +3240,9 @@ static void virtual_submit_request(struct i915_request *rq)
 		i915_request_put(ve->request);
 	}
 
-	ve->base.execlists.queue_priority_hint = rq_prio(rq);
+	rq->sched.deadline =
+		min(rq->sched.deadline,
+		    i915_scheduler_next_virtual_deadline(rq_prio(rq)));
 	ve->request = i915_request_get(rq);
 
 	GEM_BUG_ON(!list_empty(virtual_queue(ve)));
@@ -3349,7 +3345,6 @@ intel_execlists_create_virtual(struct intel_engine_cs **siblings,
 	ve->base.bond_execute = virtual_bond_execute;
 
 	INIT_LIST_HEAD(virtual_queue(ve));
-	ve->base.execlists.queue_priority_hint = INT_MIN;
 	tasklet_init(&ve->base.active.tasklet,
 		     virtual_submission_tasklet,
 		     (unsigned long)ve);
@@ -3533,10 +3528,6 @@ void intel_execlists_show_requests(struct intel_engine_cs *engine,
 		show_request(m, last, "\t\t", 0);
 	}
 
-	if (execlists->queue_priority_hint != INT_MIN)
-		drm_printf(m, "\t\tQueue priority hint: %d\n",
-			   READ_ONCE(execlists->queue_priority_hint));
-
 	last = NULL;
 	count = 0;
 	for_each_priolist(pl, &engine->active.queue) {
diff --git a/drivers/gpu/drm/i915/gt/selftest_execlists.c b/drivers/gpu/drm/i915/gt/selftest_execlists.c
index d2036e16274d..730b7ea920ec 100644
--- a/drivers/gpu/drm/i915/gt/selftest_execlists.c
+++ b/drivers/gpu/drm/i915/gt/selftest_execlists.c
@@ -867,7 +867,7 @@ semaphore_queue(struct intel_engine_cs *engine, struct i915_vma *vma, int idx)
 static int
 release_queue(struct intel_engine_cs *engine,
 	      struct i915_vma *vma,
-	      int idx, int prio)
+	      int idx, u64 deadline)
 {
 	struct i915_request *rq;
 	u32 *cs;
@@ -892,10 +892,7 @@ release_queue(struct intel_engine_cs *engine,
 	i915_request_get(rq);
 	i915_request_add(rq);
 
-	local_bh_disable();
-	i915_request_set_priority(rq, prio);
-	local_bh_enable(); /* kick tasklet */
-
+	i915_request_set_deadline(rq, deadline);
 	i915_request_put(rq);
 
 	return 0;
@@ -909,6 +906,7 @@ slice_semaphore_queue(struct intel_engine_cs *outer,
 	struct intel_engine_cs *engine;
 	struct i915_request *head;
 	enum intel_engine_id id;
+	long timeout;
 	int err, i, n = 0;
 
 	head = semaphore_queue(outer, vma, n++);
@@ -932,12 +930,16 @@ slice_semaphore_queue(struct intel_engine_cs *outer,
 		}
 	}
 
-	err = release_queue(outer, vma, n, I915_PRIORITY_BARRIER);
+	err = release_queue(outer, vma, n, 0);
 	if (err)
 		goto out;
 
-	if (i915_request_wait(head, 0,
-			      2 * outer->gt->info.num_engines * (count + 2) * (count + 3)) < 0) {
+	/* Expected number of pessimal slices required */
+	timeout = outer->gt->info.num_engines * (count + 2) * (count + 3);
+	timeout *= 4; /* safety factor, including bucketing */
+	timeout += HZ / 2; /* and include the request completion */
+
+	if (i915_request_wait(head, 0, timeout) < 0) {
 		pr_err("%s: Failed to slice along semaphore chain of length (%d, %d)!\n",
 		       outer->name, count, n);
 		GEM_TRACE_DUMP();
@@ -1042,6 +1044,8 @@ create_rewinder(struct intel_context *ce,
 		err = i915_request_await_dma_fence(rq, &wait->fence);
 		if (err)
 			goto err;
+
+		i915_request_set_deadline(rq, rq_deadline(wait));
 	}
 
 	cs = intel_ring_begin(rq, 14);
@@ -1318,6 +1322,7 @@ static int live_timeslice_queue(void *arg)
 			goto err_heartbeat;
 		}
 		i915_request_set_priority(rq, I915_PRIORITY_MAX);
+		i915_request_set_deadline(rq, 0);
 		err = wait_for_submit(engine, rq, HZ / 2);
 		if (err) {
 			pr_err("%s: Timed out trying to submit semaphores\n",
@@ -1340,10 +1345,9 @@ static int live_timeslice_queue(void *arg)
 		}
 
 		GEM_BUG_ON(i915_request_completed(rq));
-		GEM_BUG_ON(execlists_active(&engine->execlists) != rq);
 
 		/* Queue: semaphore signal, matching priority as semaphore */
-		err = release_queue(engine, vma, 1, effective_prio(rq));
+		err = release_queue(engine, vma, 1, rq_deadline(rq));
 		if (err)
 			goto err_rq;
 
@@ -1454,6 +1458,7 @@ static int live_timeslice_nopreempt(void *arg)
 			goto out_spin;
 		}
 
+		rq->sched.deadline = 0;
 		rq->sched.attr.priority = I915_PRIORITY_BARRIER;
 		i915_request_get(rq);
 		i915_request_add(rq);
@@ -1817,6 +1822,7 @@ static int live_late_preempt(void *arg)
 
 	/* Make sure ctx_lo stays before ctx_hi until we trigger preemption. */
 	ctx_lo->sched.priority = 1;
+	ctx_hi->sched.priority = I915_PRIORITY_MIN;
 
 	for_each_engine(engine, gt, id) {
 		struct igt_live_test t;
@@ -2981,6 +2987,9 @@ static int live_preempt_gang(void *arg)
 		while (rq) { /* wait for each rq from highest to lowest prio */
 			struct i915_request *n = list_next_entry(rq, mock.link);
 
+			/* With deadlines, no strict priority ordering */
+			i915_request_set_deadline(rq, 0);
+
 			if (err == 0 && i915_request_wait(rq, 0, HZ / 5) < 0) {
 				struct drm_printer p =
 					drm_info_printer(engine->i915->drm.dev);
@@ -3203,6 +3212,7 @@ static int preempt_user(struct intel_engine_cs *engine,
 	i915_request_add(rq);
 
 	i915_request_set_priority(rq, I915_PRIORITY_MAX);
+	i915_request_set_deadline(rq, 0);
 
 	if (i915_request_wait(rq, 0, HZ / 2) < 0)
 		err = -ETIME;
diff --git a/drivers/gpu/drm/i915/gt/selftest_hangcheck.c b/drivers/gpu/drm/i915/gt/selftest_hangcheck.c
index 3d3f41b1271a..df799379333f 100644
--- a/drivers/gpu/drm/i915/gt/selftest_hangcheck.c
+++ b/drivers/gpu/drm/i915/gt/selftest_hangcheck.c
@@ -1010,7 +1010,10 @@ static int __igt_reset_engines(struct intel_gt *gt,
 					break;
 				}
 
-				if (i915_request_wait(rq, 0, HZ / 5) < 0) {
+				/* With deadlines, no strict priority */
+				i915_request_set_deadline(rq, 0);
+
+				if (i915_request_wait(rq, 0, HZ / 2) < 0) {
 					struct drm_printer p =
 						drm_info_printer(gt->i915->drm.dev);
 
diff --git a/drivers/gpu/drm/i915/gt/selftest_lrc.c b/drivers/gpu/drm/i915/gt/selftest_lrc.c
index 0a40f6b574f8..62916bf2cde9 100644
--- a/drivers/gpu/drm/i915/gt/selftest_lrc.c
+++ b/drivers/gpu/drm/i915/gt/selftest_lrc.c
@@ -1253,6 +1253,7 @@ poison_registers(struct intel_context *ce,
 
 	intel_ring_advance(rq, cs);
 
+	rq->sched.deadline = 0;
 	rq->sched.attr.priority = I915_PRIORITY_BARRIER;
 err_rq:
 	i915_request_add(rq);
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
index 8d0c6cd277b3..4de9c459eb75 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
@@ -220,8 +220,6 @@ static void __guc_dequeue(struct intel_engine_cs *engine)
 		i915_priolist_advance(&engine->active.queue, pl);
 	}
 done:
-	execlists->queue_priority_hint =
-		pl != &engine->active.queue.sentinel ? pl->priority : INT_MIN;
 	if (submit) {
 		*port = schedule_in(last, port - execlists->inflight);
 		*++port = NULL;
@@ -318,7 +316,6 @@ static void guc_reset_rewind(struct intel_engine_cs *engine, bool stalled)
 
 static void guc_reset_cancel(struct intel_engine_cs *engine)
 {
-	struct intel_engine_execlists * const execlists = &engine->execlists;
 	struct i915_request *rq, *rn;
 	struct i915_priolist *p;
 	unsigned long flags;
@@ -361,8 +358,6 @@ static void guc_reset_cancel(struct intel_engine_cs *engine)
 
 	/* Remaining _unready_ requests will be nop'ed when submitted */
 
-	execlists->queue_priority_hint = INT_MIN;
-
 	spin_unlock_irqrestore(&engine->active.lock, flags);
 }
 
diff --git a/drivers/gpu/drm/i915/i915_priolist_types.h b/drivers/gpu/drm/i915/i915_priolist_types.h
index 1200c3df6a4a..8a8cef0fcb48 100644
--- a/drivers/gpu/drm/i915/i915_priolist_types.h
+++ b/drivers/gpu/drm/i915/i915_priolist_types.h
@@ -22,6 +22,8 @@ enum {
 
 	/* Interactive workload, scheduled for immediate pageflipping */
 	I915_PRIORITY_DISPLAY,
+
+	__I915_PRIORITY_KERNEL__
 };
 
 /* Smallest priority value that cannot be bumped. */
@@ -35,8 +37,7 @@ enum {
  * i.e. nothing can have higher priority and force us to usurp the
  * active request.
  */
-#define I915_PRIORITY_UNPREEMPTABLE INT_MAX
-#define I915_PRIORITY_BARRIER (I915_PRIORITY_UNPREEMPTABLE - 1)
+#define I915_PRIORITY_BARRIER INT_MAX
 
 #ifdef CONFIG_64BIT
 #define I915_PRIOLIST_HEIGHT 12
@@ -46,7 +47,7 @@ enum {
 
 struct i915_priolist {
 	struct list_head requests;
-	int priority;
+	u64 deadline;
 
 	int level;
 	struct i915_priolist *next[I915_PRIOLIST_HEIGHT];
diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
index df2ab39b394d..e4c0c810b77e 100644
--- a/drivers/gpu/drm/i915/i915_request.c
+++ b/drivers/gpu/drm/i915/i915_request.c
@@ -530,7 +530,7 @@ bool __i915_request_submit(struct i915_request *request)
 	struct intel_engine_cs *engine = request->engine;
 	bool result = false;
 
-	RQ_TRACE(request, "\n");
+	RQ_TRACE(request, "dl %llu\n", request->sched.deadline);
 
 	GEM_BUG_ON(!irqs_disabled());
 	lockdep_assert_held(&engine->active.lock);
@@ -719,6 +719,7 @@ semaphore_notify(struct i915_sw_fence *fence, enum i915_sw_fence_notify state)
 
 	switch (state) {
 	case FENCE_COMPLETE:
+		i915_request_update_deadline(rq);
 		break;
 
 	case FENCE_FREE:
@@ -1879,14 +1880,15 @@ long i915_request_wait(struct i915_request *rq,
 	return timeout;
 }
 
-static int print_sched_attr(const struct i915_sched_attr *attr,
-			    char *buf, int x, int len)
+static int print_sched(const struct i915_sched_node *node,
+		       char *buf, int x, int len)
 {
-	if (attr->priority == I915_PRIORITY_INVALID)
+	if (node->attr.priority == I915_PRIORITY_INVALID)
 		return x;
 
 	x += snprintf(buf + x, len - x,
-		      " prio=%d", attr->priority);
+		      " prio=%d, dl=%llu",
+		      node->attr.priority, node->deadline);
 
 	return x;
 }
@@ -1966,7 +1968,7 @@ void i915_request_show(struct drm_printer *m,
 	 *      from the lists
 	 */
 
-	x = print_sched_attr(&rq->sched.attr, buf, x, sizeof(buf));
+	x = print_sched(&rq->sched, buf, x, sizeof(buf));
 
 	drm_printf(m, "%s%.*s%c %llx:%lld%s%s %s @ %dms: %s\n",
 		   prefix, indent, "                ",
diff --git a/drivers/gpu/drm/i915/i915_scheduler.c b/drivers/gpu/drm/i915/i915_scheduler.c
index 74000d3eebb1..7ba816e83b55 100644
--- a/drivers/gpu/drm/i915/i915_scheduler.c
+++ b/drivers/gpu/drm/i915/i915_scheduler.c
@@ -33,6 +33,11 @@ static void node_put(struct i915_sched_node *node)
 	i915_request_put(container_of(node, struct i915_request, sched));
 }
 
+static inline u64 rq_deadline(const struct i915_request *rq)
+{
+	return READ_ONCE(rq->sched.deadline);
+}
+
 static inline int rq_prio(const struct i915_request *rq)
 {
 	return READ_ONCE(rq->sched.attr.priority);
@@ -46,6 +51,14 @@ static int ipi_get_prio(struct i915_request *rq)
 	return xchg(&rq->sched.ipi_priority, I915_PRIORITY_INVALID);
 }
 
+static u64 ipi_get_deadline(struct i915_request *rq)
+{
+	if (READ_ONCE(rq->sched.ipi_deadline) == I915_DEADLINE_NEVER)
+		return I915_DEADLINE_NEVER;
+
+	return xchg64(&rq->sched.ipi_deadline, I915_DEADLINE_NEVER);
+}
+
 static void ipi_schedule(struct work_struct *wrk)
 {
 	struct i915_sched_ipi *ipi = container_of(wrk, typeof(*ipi), work);
@@ -53,9 +66,11 @@ static void ipi_schedule(struct work_struct *wrk)
 
 	do {
 		struct i915_request *rn = xchg(&rq->sched.ipi_link, NULL);
+		u64 deadline;
 		int prio;
 
 		prio = ipi_get_prio(rq);
+		deadline = ipi_get_deadline(rq);
 
 		/*
 		 * For cross-engine scheduling to work we rely on one of two
@@ -80,6 +95,7 @@ static void ipi_schedule(struct work_struct *wrk)
 		 */
 		local_bh_disable();
 		i915_request_set_priority(rq, prio);
+		i915_request_set_deadline(rq, deadline);
 		local_bh_enable();
 
 		i915_request_put(rq);
@@ -98,7 +114,7 @@ static void init_priolist(struct i915_priolist_root *const root)
 	struct i915_priolist *pl = &root->sentinel;
 
 	memset_p((void **)pl->next, pl, ARRAY_SIZE(pl->next));
-	pl->priority = INT_MIN;
+	pl->deadline = I915_DEADLINE_NEVER;
 	pl->level = -1;
 }
 
@@ -250,7 +266,7 @@ static inline unsigned int random_level(struct i915_priolist_root *root)
 }
 
 static struct list_head *
-lookup_priolist(struct intel_engine_cs *engine, int prio)
+lookup_priolist(struct intel_engine_cs *engine, u64 deadline)
 {
 	struct i915_priolist *update[I915_PRIOLIST_HEIGHT];
 	struct i915_sched_engine * const se = &engine->active;
@@ -258,12 +274,13 @@ lookup_priolist(struct intel_engine_cs *engine, int prio)
 	struct i915_priolist *pl, *tmp;
 	int lvl;
 
+	GEM_BUG_ON(deadline == I915_DEADLINE_NEVER);
 	lockdep_assert_held(&se->lock);
 	if (unlikely(se->no_priolist))
-		prio = I915_PRIORITY_NORMAL;
+		deadline = 0;
 
 	for_each_priolist(pl, root) { /* recycle any empty elements before us */
-		if (pl->priority >= prio || !list_empty(&pl->requests))
+		if (pl->deadline >= deadline || !list_empty(&pl->requests))
 			break;
 
 		i915_priolist_advance(root, pl);
@@ -273,14 +290,14 @@ lookup_priolist(struct intel_engine_cs *engine, int prio)
 	pl = &root->sentinel;
 	lvl = pl->level;
 	while (lvl >= 0) {
-		while (tmp = pl->next[lvl], tmp->priority >= prio)
+		while (tmp = pl->next[lvl], tmp->deadline <= deadline)
 			pl = tmp;
-		if (pl->priority == prio)
+		if (pl->deadline == deadline)
 			goto out;
 		update[lvl--] = pl;
 	}
 
-	if (prio == I915_PRIORITY_NORMAL) {
+	if (!deadline) {
 		pl = &se->default_priolist;
 	} else if (!pl_empty(&root->sentinel.requests)) {
 		pl = pl_pop(&root->sentinel.requests);
@@ -288,7 +305,7 @@ lookup_priolist(struct intel_engine_cs *engine, int prio)
 		pl = kmem_cache_alloc(global.slab_priorities, GFP_ATOMIC);
 		/* Convert an allocation failure to a priority bump */
 		if (unlikely(!pl)) {
-			prio = I915_PRIORITY_NORMAL; /* recurses just once */
+			deadline = 0; /* recurses just once */
 
 			/*
 			 * To maintain ordering with all rendering, after an
@@ -304,7 +321,7 @@ lookup_priolist(struct intel_engine_cs *engine, int prio)
 		}
 	}
 
-	pl->priority = prio;
+	pl->deadline = deadline;
 	INIT_LIST_HEAD(&pl->requests);
 
 	lvl = random_level(root);
@@ -332,7 +349,7 @@ lookup_priolist(struct intel_engine_cs *engine, int prio)
 		chk = &root->sentinel;
 		lvl = chk->level;
 		do {
-			while (tmp = chk->next[lvl], tmp->priority >= prio)
+			while (tmp = chk->next[lvl], tmp->deadline <= deadline)
 				chk = tmp;
 		} while (--lvl >= 0);
 
@@ -352,7 +369,7 @@ static void remove_priolist(struct intel_engine_cs *engine,
 	struct i915_priolist *pl, *tmp;
 	struct i915_priolist *old =
 		container_of(plist, struct i915_priolist, requests);
-	int prio = old->priority;
+	u64 deadline = old->deadline;
 	int lvl;
 
 	lockdep_assert_held(&se->lock);
@@ -362,11 +379,11 @@ static void remove_priolist(struct intel_engine_cs *engine,
 	lvl = pl->level;
 	GEM_BUG_ON(lvl < 0);
 
-	if (prio != I915_PRIORITY_NORMAL)
+	if (deadline)
 		pl_push(old, &pl->requests);
 
 	do {
-		while (tmp = pl->next[lvl], tmp->priority > prio)
+		while (tmp = pl->next[lvl], tmp->deadline < deadline)
 			pl = tmp;
 		if (lvl <= old->level) {
 			pl->next[lvl] = old->next[lvl];
@@ -389,7 +406,7 @@ void i915_priolist_advance(struct i915_priolist_root *root,
 	GEM_BUG_ON(pl != s->next[0]);
 	GEM_BUG_ON(pl == s);
 
-	if (pl->priority != I915_PRIORITY_NORMAL)
+	if (pl->deadline)
 		pl_push(pl, &s->requests);
 
 	lvl = pl->level;
@@ -423,53 +440,245 @@ stack_pop(struct i915_request *rq,
 	return rq;
 }
 
-static inline bool need_preempt(int prio, int active)
+static void ipi_deadline(struct i915_request *rq, u64 deadline)
 {
-	/*
-	 * Allow preemption of low -> normal -> high, but we do
-	 * not allow low priority tasks to preempt other low priority
-	 * tasks under the impression that latency for low priority
-	 * tasks does not matter (as much as background throughput),
-	 * so kiss.
-	 */
-	return prio >= max(I915_PRIORITY_NORMAL, active);
+	u64 old = READ_ONCE(rq->sched.ipi_deadline);
+
+	do {
+		if (deadline >= old)
+			return;
+	} while (!try_cmpxchg64(&rq->sched.ipi_deadline, &old, deadline));
+
+	__ipi_add(rq);
 }
 
-static void kick_submission(struct intel_engine_cs *engine,
-			    const struct i915_request *rq,
-			    int prio)
+static bool is_first_priolist(const struct intel_engine_cs *engine,
+			      const struct list_head *requests)
 {
-	const struct i915_request *inflight;
+	return requests == &engine->active.queue.sentinel.next[0]->requests;
+}
 
-	/*
-	 * We only need to kick the tasklet once for the high priority
-	 * new context we add into the queue.
-	 */
-	if (prio <= engine->execlists.queue_priority_hint)
+static bool __i915_request_set_deadline(struct i915_request *rq, u64 deadline)
+{
+	struct intel_engine_cs *engine = rq->engine;
+	struct list_head *pos = &rq->sched.signalers_list;
+	struct list_head *plist;
+
+	if (unlikely(!i915_request_in_priority_queue(rq))) {
+		rq->sched.deadline = deadline;
+		return false;
+	}
+
+	/* FIFO and depth-first replacement ensure our deps execute first */
+	plist = lookup_priolist(engine, deadline);
+
+	rq->sched.dfs.next = NULL;
+	do {
+		list_for_each_continue(pos, &rq->sched.signalers_list) {
+			struct i915_dependency *p =
+				list_entry(pos, typeof(*p), signal_link);
+			struct i915_request *s =
+				container_of(p->signaler, typeof(*s), sched);
+
+			if (rq_deadline(s) <= deadline)
+				continue;
+
+			if (__i915_request_is_complete(s))
+				continue;
+
+			if (s->engine != engine) {
+				ipi_deadline(s, deadline);
+				continue;
+			}
+
+			/* Remember our position along this branch */
+			rq = stack_push(s, rq, pos);
+			pos = &rq->sched.signalers_list;
+		}
+
+		RQ_TRACE(rq, "set-deadline:%llu\n", deadline);
+		WRITE_ONCE(rq->sched.deadline, deadline);
+
+		/*
+		 * Once the request is ready, it will be placed into the
+		 * priority lists and then onto the HW runlist. Before the
+		 * request is ready, it does not contribute to our preemption
+		 * decisions and we can safely ignore it, as it will, and
+		 * any preemption required, be dealt with upon submission.
+		 * See engine->submit_request()
+		 */
+		GEM_BUG_ON(rq->engine != engine);
+		if (i915_request_in_priority_queue(rq)) {
+			struct list_head *prev = rq->sched.link.prev;
+
+			list_move_tail(&rq->sched.link, plist);
+			if (list_empty(prev))
+				remove_priolist(engine, prev);
+		}
+	} while ((rq = stack_pop(rq, &pos)));
+
+	return is_first_priolist(engine, plist);
+}
+
+void i915_request_set_deadline(struct i915_request *rq, u64 deadline)
+{
+	struct intel_engine_cs *engine;
+	unsigned long flags;
+
+	if (deadline >= rq_deadline(rq))
 		return;
 
-	/* Nothing currently active? We're overdue for a submission! */
-	inflight = execlists_active(&engine->execlists);
-	if (!inflight)
-		return;
+	engine = lock_engine_irqsave(rq, flags);
+	if (!intel_engine_has_scheduler(engine))
+		goto unlock;
 
-	/*
-	 * If we are already the currently executing context, don't
-	 * bother evaluating if we should preempt ourselves.
-	 */
-	if (inflight->context == rq->context)
-		return;
+	if (deadline >= rq_deadline(rq))
+		goto unlock;
 
-	ENGINE_TRACE(engine,
-		     "bumping queue-priority-hint:%d for rq:%llx:%lld, inflight:%llx:%lld prio %d\n",
-		     prio,
-		     rq->fence.context, rq->fence.seqno,
-		     inflight->fence.context, inflight->fence.seqno,
-		     inflight->sched.attr.priority);
+	if (__i915_request_is_complete(rq))
+		goto unlock;
 
-	engine->execlists.queue_priority_hint = prio;
-	if (need_preempt(prio, rq_prio(inflight)))
+	rcu_read_lock();
+	if (__i915_request_set_deadline(rq, deadline))
 		i915_sched_kick(&engine->active);
+	rcu_read_unlock();
+	GEM_BUG_ON(rq_deadline(rq) != deadline);
+
+unlock:
+	spin_unlock_irqrestore(&engine->active.lock, flags);
+}
+
+static u64 prio_slice(int prio)
+{
+	u64 slice;
+	int sf;
+
+	/*
+	 * This is the central heuristic of the virtual deadlines. By
+	 * imposing that each task takes an equal amount of time, we
+	 * let each client have an equal slice of the GPU time. By
+	 * bringing its virtual deadline forward, a higher priority client
+	 * will receive more GPU time, and conversely a lower priority client
+	 * will have a later deadline and receive less GPU time.
+	 *
+	 * In BFS/MuQSS, the prio_ratios[] are based on the task nice range of
+	 * [-20, 19], with each lower priority having a ~10% longer deadline,
+	 * with the note that the proportion of CPU time between two clients
+	 * of different priority will be the square of the relative prio_slice.
+	 *
+	 * In contrast, this prio_slice() curve was chosen because it gave good
+	 * results with igt/gem_exec_schedule. It may not be the best choice!
+	 *
+	 * With a 1ms scheduling quantum:
+	 *
+	 *   MAX_USER:  ~32us deadline
+	 *   0:         ~16ms deadline
+	 *   MIN_USER: 1000ms deadline
+	 */
+
+	if (prio >= __I915_PRIORITY_KERNEL__)
+		return INT_MAX - prio;
+
+	slice = __I915_PRIORITY_KERNEL__ - prio;
+	if (prio >= 0)
+		sf = 20 - 6;
+	else
+		sf = 20 - 1;
+
+	return slice << sf;
+}
+
+static u64 virtual_deadline(u64 kt, int priority)
+{
+	return i915_sched_to_ticks(kt + prio_slice(priority));
+}
+
+u64 i915_scheduler_next_virtual_deadline(int priority)
+{
+	return virtual_deadline(ktime_get_mono_fast_ns(), priority);
+}
+
+static u64 signal_deadline(const struct i915_request *rq)
+{
+	u64 last = ktime_get_mono_fast_ns();
+	const struct i915_dependency *p;
+
+	/*
+	 * Find the earliest point at which we will become 'ready',
+	 * which we infer from the deadline of all active signalers.
+	 * We will position ourselves at the end of that chain of work.
+	 */
+
+	rcu_read_lock();
+	for_each_signaler(p, rq) {
+		const struct i915_request *s =
+			container_of(p->signaler, typeof(*s), sched);
+		u64 deadline;
+		int prio;
+
+		if (__i915_request_is_complete(s))
+			continue;
+
+		if (s->timeline == rq->timeline &&
+		    __i915_request_has_started(s))
+			continue;
+
+		prio = rq_prio(s);
+		if (prio < rq_prio(rq))
+			continue;
+
+		deadline = rq_deadline(s);
+		if (deadline == I915_DEADLINE_NEVER) /* retired & reused */
+			continue;
+
+		deadline = i915_sched_to_ns(deadline);
+		if (p->flags & I915_DEPENDENCY_WEAK)
+			deadline -= prio_slice(prio);
+
+		last = max(last, deadline);
+	}
+	rcu_read_unlock();
+
+	return last;
+}
+
+static int adj_prio(const struct i915_request *rq)
+{
+	int prio = rq_prio(rq);
+
+	/*
+	 * Deprioritize semaphore waiters. We only want to run these if there
+	 * is nothing ready to run first.
+	 *
+	 * Note by giving a more distant deadline (due to a lower priority)
+	 * we do not prevent them from having a slice of the GPU, and if there
+	 * is still contention at that point, we expect to immediately yield
+	 * on the semaphore.
+	 *
+	 * When all semaphores are signaled, we will update the request
+	 * to remove the semaphore penalty.
+	 */
+	if (!i915_sw_fence_signaled(&rq->semaphore))
+		prio -= __I915_PRIORITY_KERNEL__;
+
+	return prio;
+}
+
+static u64 earliest_deadline(const struct i915_request *rq)
+{
+	return virtual_deadline(signal_deadline(rq), rq_prio(rq));
+}
+
+static bool set_earliest_deadline(struct i915_request *rq, u64 old)
+{
+	u64 dl;
+
+	/* Recompute our deadlines and promote after a priority change */
+	dl = min(earliest_deadline(rq), rq_deadline(rq));
+	if (dl >= old)
+		return false;
+
+	return __i915_request_set_deadline(rq, dl);
 }
 
 static void ipi_priority(struct i915_request *rq, int prio)
@@ -484,13 +693,11 @@ static void ipi_priority(struct i915_request *rq, int prio)
 	__ipi_add(rq);
 }
 
-static void __i915_request_set_priority(struct i915_request *rq, int prio)
+static bool __i915_request_set_priority(struct i915_request *rq, int prio)
 {
 	struct intel_engine_cs *engine = rq->engine;
 	struct list_head *pos = &rq->sched.signalers_list;
-	struct list_head *plist;
-
-	plist = lookup_priolist(engine, prio);
+	bool kick = false;
 
 	/*
 	 * Recursively bump all dependent priorities to match the new request.
@@ -512,6 +719,8 @@ static void __i915_request_set_priority(struct i915_request *rq, int prio)
 	 */
 	rq->sched.dfs.next = NULL;
 	do {
+		struct i915_request *next;
+
 		list_for_each_continue(pos, &rq->sched.signalers_list) {
 			struct i915_dependency *p =
 				list_entry(pos, typeof(*p), signal_link);
@@ -537,6 +746,8 @@ static void __i915_request_set_priority(struct i915_request *rq, int prio)
 		RQ_TRACE(rq, "set-priority:%d\n", prio);
 		WRITE_ONCE(rq->sched.attr.priority, prio);
 
+		next = stack_pop(rq, &pos);
+
 		/*
 		 * Once the request is ready, it will be placed into the
 		 * priority lists and then onto the HW runlist. Before the
@@ -545,21 +756,15 @@ static void __i915_request_set_priority(struct i915_request *rq, int prio)
 		 * any preemption required, be dealt with upon submission.
 		 * See engine->submit_request()
 		 */
-		if (!i915_request_is_ready(rq))
-			continue;
-
 		GEM_BUG_ON(rq->engine != engine);
-		if (i915_request_in_priority_queue(rq)) {
-			struct list_head *prev = rq->sched.link.prev;
+		if (i915_request_is_ready(rq) &&
+		    set_earliest_deadline(rq, rq_deadline(rq)))
+			kick = true;
 
-			list_move_tail(&rq->sched.link, plist);
-			if (list_empty(prev))
-				remove_priolist(engine, prev);
-		}
+		rq = next;
+	} while (rq);
 
-		/* Defer (tasklet) submission until after all updates. */
-		kick_submission(engine, rq, prio);
-	} while ((rq = stack_pop(rq, &pos)));
+	return kick;
 }
 
 void i915_request_set_priority(struct i915_request *rq, int prio)
@@ -608,13 +813,9 @@ void i915_request_set_priority(struct i915_request *rq, int prio)
 	if (__i915_request_is_complete(rq))
 		goto unlock;
 
-	if (!intel_engine_has_scheduler(engine)) {
-		rq->sched.attr.priority = prio;
-		goto unlock;
-	}
-
 	rcu_read_lock();
-	__i915_request_set_priority(rq, prio);
+	if (__i915_request_set_priority(rq, prio))
+		i915_sched_kick(&engine->active);
 	rcu_read_unlock();
 	GEM_BUG_ON(rq_prio(rq) != prio);
 
@@ -628,12 +829,13 @@ void __intel_engine_defer_request(struct intel_engine_cs *engine,
 	struct list_head *pos = &rq->sched.waiters_list;
 	struct i915_request *rn;
 	LIST_HEAD(dfs);
-	int prio;
+	u64 deadline;
 
 	lockdep_assert_held(&engine->active.lock);
 	GEM_BUG_ON(!test_bit(I915_FENCE_FLAG_PQUEUE, &rq->fence.flags));
 
-	prio = rq_prio(rq);
+	deadline = max(rq_deadline(rq),
+		       i915_scheduler_next_virtual_deadline(adj_prio(rq)));
 
 	/*
 	 * When we defer a request, we must maintain its order with respect
@@ -660,30 +862,32 @@ void __intel_engine_defer_request(struct intel_engine_cs *engine,
 				   __i915_request_has_started(w) &&
 				   !__i915_request_is_complete(rq));
 
+			/* An unready waiter imposes no deadline */
 			if (!i915_request_in_priority_queue(w))
 				continue;
 
 			/*
-			 * We also need to reorder within the same priority.
+			 * We also need to reorder within the same deadline.
 			 *
 			 * This is unlike priority-inheritance, where if the
 			 * signaler already has a higher priority [earlier
 			 * deadline] than us, we can ignore as it will be
 			 * scheduled first. If a waiter already has the
-			 * same priority, we still have to push it to the end
+			 * same deadline, we still have to push it to the end
 			 * of the list. This unfortunately means we cannot
 			 * use the rq_deadline() itself as a 'visited' bit.
 			 */
-			if (rq_prio(w) < prio)
+			if (rq_deadline(w) > deadline)
 				continue;
 
-			GEM_BUG_ON(rq_prio(w) != prio);
-
 			/* Remember our position along this branch */
 			rq = stack_push(w, rq, pos);
 			pos = &rq->sched.waiters_list;
 		}
 
+		RQ_TRACE(rq, "set-deadline:%llu\n", deadline);
+		WRITE_ONCE(rq->sched.deadline, deadline);
+
 		/* Note list is reversed for waiters wrt signal hierarchy */
 		GEM_BUG_ON(rq->engine != engine);
 		GEM_BUG_ON(!i915_request_in_priority_queue(rq));
@@ -693,31 +897,18 @@ void __intel_engine_defer_request(struct intel_engine_cs *engine,
 		clear_bit(I915_FENCE_FLAG_PQUEUE, &rq->fence.flags);
 	} while ((rq = stack_pop(rq, &pos)));
 
-	pos = lookup_priolist(engine, prio);
+	pos = lookup_priolist(engine, deadline);
 	list_for_each_entry_safe(rq, rn, &dfs, sched.link) {
 		set_bit(I915_FENCE_FLAG_PQUEUE, &rq->fence.flags);
 		list_add_tail(&rq->sched.link, pos);
 	}
 }
 
-static void queue_request(struct intel_engine_cs *engine,
-			  struct i915_request *rq)
+static bool
+queue_request(struct intel_engine_cs *engine, struct i915_request *rq)
 {
-	GEM_BUG_ON(!list_empty(&rq->sched.link));
-	list_add_tail(&rq->sched.link, lookup_priolist(engine, rq_prio(rq)));
 	set_bit(I915_FENCE_FLAG_PQUEUE, &rq->fence.flags);
-}
-
-static bool submit_queue(struct intel_engine_cs *engine,
-			 const struct i915_request *rq)
-{
-	struct intel_engine_execlists *execlists = &engine->execlists;
-
-	if (rq_prio(rq) <= execlists->queue_priority_hint)
-		return false;
-
-	execlists->queue_priority_hint = rq_prio(rq);
-	return true;
+	return set_earliest_deadline(rq, I915_DEADLINE_NEVER);
 }
 
 static bool hold_request(const struct i915_request *rq)
@@ -757,6 +948,7 @@ void i915_request_enqueue(struct i915_request *rq)
 {
 	struct intel_engine_cs *engine = rq->engine;
 	struct i915_sched_engine *se = &engine->active;
+	u64 dl = earliest_deadline(rq);
 	unsigned long flags;
 	bool kick = false;
 
@@ -769,11 +961,11 @@ void i915_request_enqueue(struct i915_request *rq)
 		list_add_tail(&rq->sched.link, &se->hold);
 		i915_request_set_hold(rq);
 	} else {
-		queue_request(engine, rq);
-
-		GEM_BUG_ON(i915_sched_is_idle(&engine->active));
-
-		kick = submit_queue(engine, rq);
+		set_bit(I915_FENCE_FLAG_PQUEUE, &rq->fence.flags);
+		kick = __i915_request_set_deadline(rq,
+						   min(dl, rq_deadline(rq)));
+		GEM_BUG_ON(rq_deadline(rq) == I915_DEADLINE_NEVER);
+		GEM_BUG_ON(i915_sched_is_idle(se));
 	}
 
 	GEM_BUG_ON(list_empty(&rq->sched.link));
@@ -786,8 +978,8 @@ struct i915_request *
 __intel_engine_rewind_requests(struct intel_engine_cs *engine)
 {
 	struct i915_request *rq, *rn, *active = NULL;
+	u64 deadline = I915_DEADLINE_NEVER;
 	struct list_head *pl;
-	int prio = I915_PRIORITY_INVALID;
 
 	lockdep_assert_held(&engine->active.lock);
 
@@ -801,13 +993,20 @@ __intel_engine_rewind_requests(struct intel_engine_cs *engine)
 
 		__i915_request_unsubmit(rq);
 
-		GEM_BUG_ON(rq_prio(rq) == I915_PRIORITY_INVALID);
-		if (rq_prio(rq) != prio) {
-			prio = rq_prio(rq);
-			pl = lookup_priolist(engine, prio);
+		if (__i915_request_has_started(rq)) {
+			u64 deadline =
+				i915_scheduler_next_virtual_deadline(rq_prio(rq));
+			rq->sched.deadline = min(rq_deadline(rq), deadline);
+		}
+		GEM_BUG_ON(rq_deadline(rq) == I915_DEADLINE_NEVER);
+
+		if (rq_deadline(rq) != deadline) {
+			deadline = rq_deadline(rq);
+			pl = lookup_priolist(engine, deadline);
 		}
 		GEM_BUG_ON(i915_sched_is_idle(&engine->active));
 
+		GEM_BUG_ON(i915_request_in_priority_queue(rq));
 		list_move(&rq->sched.link, pl);
 		set_bit(I915_FENCE_FLAG_PQUEUE, &rq->fence.flags);
 
@@ -907,14 +1106,10 @@ void __intel_engine_resume_request(struct intel_engine_cs *engine,
 				   struct i915_request *rq)
 {
 	LIST_HEAD(list);
+	bool submit = false;
 
 	lockdep_assert_held(&engine->active.lock);
 
-	if (rq_prio(rq) > engine->execlists.queue_priority_hint) {
-		engine->execlists.queue_priority_hint = rq_prio(rq);
-		i915_sched_kick(&engine->active);
-	}
-
 	if (!i915_request_on_hold(rq))
 		return;
 
@@ -936,7 +1131,7 @@ void __intel_engine_resume_request(struct intel_engine_cs *engine,
 		i915_request_clear_hold(rq);
 		list_del_init(&rq->sched.link);
 
-		queue_request(engine, rq);
+		submit |= queue_request(engine, rq);
 
 		/* Also release any children on this engine that are ready */
 		for_each_waiter(p, rq) {
@@ -966,6 +1161,18 @@ void __intel_engine_resume_request(struct intel_engine_cs *engine,
 
 		rq = list_first_entry_or_null(&list, typeof(*rq), sched.link);
 	} while (rq);
+
+	if (submit)
+		i915_sched_kick(&engine->active);
+}
+
+void i915_request_update_deadline(struct i915_request *rq)
+{
+	if (!i915_request_in_priority_queue(rq))
+		return;
+
+	/* Recompute our deadlines and promote after a priority change */
+	i915_request_set_deadline(rq, earliest_deadline(rq));
 }
 
 void intel_engine_resume_request(struct intel_engine_cs *engine,
@@ -992,10 +1199,12 @@ void i915_sched_node_init(struct i915_sched_node *node)
 void i915_sched_node_reinit(struct i915_sched_node *node)
 {
 	node->attr.priority = I915_PRIORITY_INVALID;
+	node->deadline = I915_DEADLINE_NEVER;
 	node->semaphores = 0;
 	node->flags = 0;
 
 	GEM_BUG_ON(node->ipi_link);
+	node->ipi_deadline = I915_DEADLINE_NEVER;
 	node->ipi_priority = I915_PRIORITY_INVALID;
 
 	GEM_BUG_ON(!list_empty(&node->signalers_list));
diff --git a/drivers/gpu/drm/i915/i915_scheduler.h b/drivers/gpu/drm/i915/i915_scheduler.h
index bca89a58d953..e04d7eeb1b36 100644
--- a/drivers/gpu/drm/i915/i915_scheduler.h
+++ b/drivers/gpu/drm/i915/i915_scheduler.h
@@ -36,6 +36,11 @@ void i915_sched_park_engine(struct i915_sched_engine *se);
 void i915_sched_fini_engine(struct i915_sched_engine *se);
 
 void i915_request_set_priority(struct i915_request *request, int prio);
+void i915_request_set_deadline(struct i915_request *request, u64 deadline);
+
+void i915_request_update_deadline(struct i915_request *request);
+
+u64 i915_scheduler_next_virtual_deadline(int priority);
 
 void i915_request_enqueue(struct i915_request *request);
 
@@ -54,11 +59,14 @@ bool intel_engine_suspend_request(struct intel_engine_cs *engine,
 void intel_engine_resume_request(struct intel_engine_cs *engine,
 				 struct i915_request *rq);
 
-void __i915_priolist_free(struct i915_priolist *p);
-static inline void i915_priolist_free(struct i915_priolist *p)
+static inline u64 i915_sched_to_ticks(ktime_t kt)
 {
-	if (p->priority != I915_PRIORITY_NORMAL)
-		__i915_priolist_free(p);
+	return ktime_to_ns(kt) >> I915_SCHED_DEADLINE_SHIFT;
+}
+
+static inline u64 i915_sched_to_ns(u64 deadline)
+{
+	return deadline << I915_SCHED_DEADLINE_SHIFT;
 }
 
 static inline bool i915_sched_is_idle(const struct i915_sched_engine *se)
diff --git a/drivers/gpu/drm/i915/i915_scheduler_types.h b/drivers/gpu/drm/i915/i915_scheduler_types.h
index e64750be4e77..72cb726ad75e 100644
--- a/drivers/gpu/drm/i915/i915_scheduler_types.h
+++ b/drivers/gpu/drm/i915/i915_scheduler_types.h
@@ -72,7 +72,30 @@ struct i915_sched_node {
 #define I915_SCHED_HAS_EXTERNAL_CHAIN	BIT(0)
 	unsigned long semaphores;
 
+	/**
+	 * @deadline: [virtual] deadline
+	 *
+	 * When the request is ready for execution, it is given a quota
+	 * (the engine's timeslice) and a virtual deadline. The virtual
+	 * deadline is derived from the current time:
+	 *     ktime_get() + (prio_ratio * timeslice)
+	 *
+	 * Requests are then executed in order of deadline completion.
+	 * Requests with earlier deadlines than currently executing on
+	 * the engine will preempt the active requests.
+	 *
+	 * By treating it as a virtual deadline, we use it as a hint for
+	 * when it is appropriate for a request to start with respect to
+	 * all other requests in the system. It is not a hard deadline, as
+	 * we allow requests to miss them, and we do not account for the
+	 * request runtime.
+	 */
+	u64 deadline;
+#define I915_SCHED_DEADLINE_SHIFT 19 /* i.e. roughly 500us buckets */
+#define I915_DEADLINE_NEVER U64_MAX
+
 	struct i915_request *ipi_link;
+	u64 ipi_deadline;
 	int ipi_priority;
 };
 
diff --git a/drivers/gpu/drm/i915/selftests/i915_request.c b/drivers/gpu/drm/i915/selftests/i915_request.c
index d2a678a2497e..382f2d490959 100644
--- a/drivers/gpu/drm/i915/selftests/i915_request.c
+++ b/drivers/gpu/drm/i915/selftests/i915_request.c
@@ -2130,6 +2130,7 @@ static int measure_preemption(struct intel_context *ce)
 
 		intel_ring_advance(rq, cs);
 		rq->sched.attr.priority = I915_PRIORITY_BARRIER;
+		rq->sched.deadline = 0;
 
 		elapsed[i - 1] = ENGINE_READ_FW(ce->engine, RING_TIMESTAMP);
 		i915_request_add(rq);
diff --git a/drivers/gpu/drm/i915/selftests/i915_scheduler.c b/drivers/gpu/drm/i915/selftests/i915_scheduler.c
index de5b1443129b..8ea6763bf6a6 100644
--- a/drivers/gpu/drm/i915/selftests/i915_scheduler.c
+++ b/drivers/gpu/drm/i915/selftests/i915_scheduler.c
@@ -12,6 +12,40 @@
 #include "selftests/igt_spinner.h"
 #include "selftests/i915_random.h"
 
+static int mock_scheduler_slices(void *dummy)
+{
+	u64 min, max, normal, kernel;
+
+	min = prio_slice(I915_PRIORITY_MIN);
+	pr_info("%8s slice: %lluus\n", "min", min >> 10);
+
+	normal = prio_slice(0);
+	pr_info("%8s slice: %lluus\n", "normal", normal >> 10);
+
+	max = prio_slice(I915_PRIORITY_MAX);
+	pr_info("%8s slice: %lluus\n", "max", max >> 10);
+
+	kernel = prio_slice(I915_PRIORITY_BARRIER);
+	pr_info("%8s slice: %lluus\n", "kernel", kernel >> 10);
+
+	if (kernel != 0) {
+		pr_err("kernel prio slice should be 0\n");
+		return -EINVAL;
+	}
+
+	if (max >= normal) {
+		pr_err("maximum prio slice should be shorter than normal\n");
+		return -EINVAL;
+	}
+
+	if (min <= normal) {
+		pr_err("minimum prio slice should be longer than normal\n");
+		return -EINVAL;
+	}
+
+	return 0;
+}
+
 static int mock_skiplist_levels(void *dummy)
 {
 	struct i915_priolist_root root = {};
@@ -54,6 +88,7 @@ static int mock_skiplist_levels(void *dummy)
 int i915_scheduler_mock_selftests(void)
 {
 	static const struct i915_subtest tests[] = {
+		SUBTEST(mock_scheduler_slices),
 		SUBTEST(mock_skiplist_levels),
 	};
 
@@ -560,6 +595,53 @@ static int igt_priority_chains(void *arg)
 	return igt_schedule_chains(arg, igt_priority);
 }
 
+static bool igt_deadline(struct i915_request *rq,
+			 unsigned long v, unsigned long e)
+{
+	i915_request_set_deadline(rq, 0);
+	GEM_BUG_ON(rq_deadline(rq) != 0);
+	return true;
+}
+
+static int igt_deadline_chains(void *arg)
+{
+	return igt_schedule_chains(arg, igt_deadline);
+}
+
+static bool igt_defer(struct i915_request *rq, unsigned long v, unsigned long e)
+{
+	struct intel_engine_cs *engine = rq->engine;
+
+	/* XXX No generic means to unwind incomplete requests yet */
+	if (!i915_request_in_priority_queue(rq))
+		return false;
+
+	if (!intel_engine_has_preemption(engine))
+		return false;
+
+	spin_lock_irq(&engine->active.lock);
+
+	/* Push all the requests to the same deadline */
+	__i915_request_set_deadline(rq, 0);
+	GEM_BUG_ON(rq_deadline(rq) != 0);
+
+	/* Then the very first request must be the one everyone depends on */
+	rq = list_first_entry(lookup_priolist(engine, 0),
+			      typeof(*rq), sched.link);
+	GEM_BUG_ON(rq->engine != engine);
+
+	/* Deferring the first request will then have to defer all requests */
+	__intel_engine_defer_request(engine, rq);
+
+	spin_unlock_irq(&engine->active.lock);
+	return true;
+}
+
+static int igt_deadline_defer(void *arg)
+{
+	return igt_schedule_chains(arg, igt_defer);
+}
+
 static struct i915_request *
 __write_timestamp(struct intel_engine_cs *engine,
 		  struct drm_i915_gem_object *obj,
@@ -781,13 +863,22 @@ static int igt_priority_cycle(void *arg)
 	return __igt_schedule_cycle(arg, igt_priority);
 }
 
+static int igt_deadline_cycle(void *arg)
+{
+	return __igt_schedule_cycle(arg, igt_deadline);
+}
+
 int i915_scheduler_live_selftests(struct drm_i915_private *i915)
 {
 	static const struct i915_subtest tests[] = {
+		SUBTEST(igt_deadline_chains),
 		SUBTEST(igt_priority_chains),
 
 		SUBTEST(igt_schedule_cycle),
+		SUBTEST(igt_deadline_cycle),
 		SUBTEST(igt_priority_cycle),
+
+		SUBTEST(igt_deadline_defer),
 	};
 
 	return i915_subtests(tests, i915);
@@ -923,9 +1014,54 @@ static int sparse_priority(void *arg)
 	return sparse(arg, set_priority);
 }
 
+static u64 __set_deadline(struct i915_request *rq, u64 deadline)
+{
+	u64 dt;
+
+	preempt_disable();
+	dt = ktime_get_raw_fast_ns();
+	i915_request_set_deadline(rq, deadline);
+	dt = ktime_get_raw_fast_ns() - dt;
+	preempt_enable();
+
+	return dt;
+}
+
+static bool set_deadline(struct i915_request *rq,
+			 unsigned long v, unsigned long e)
+{
+	report("set-deadline", v, e, __set_deadline(rq, 0));
+	return true;
+}
+
+static int single_deadline(void *arg)
+{
+	return single(arg, set_deadline);
+}
+
+static int wide_deadline(void *arg)
+{
+	return wide(arg, set_deadline);
+}
+
+static int inv_deadline(void *arg)
+{
+	return inv(arg, set_deadline);
+}
+
+static int sparse_deadline(void *arg)
+{
+	return sparse(arg, set_deadline);
+}
+
 int i915_scheduler_perf_selftests(struct drm_i915_private *i915)
 {
 	static const struct i915_subtest tests[] = {
+		SUBTEST(single_deadline),
+		SUBTEST(wide_deadline),
+		SUBTEST(inv_deadline),
+		SUBTEST(sparse_deadline),
+
 		SUBTEST(single_priority),
 		SUBTEST(wide_priority),
 		SUBTEST(inv_priority),
-- 
2.20.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

* [Intel-gfx] [PATCH 23/41] drm/i915/gt: Specify a deadline for the heartbeat
  2021-01-25 14:00 [Intel-gfx] [PATCH 01/41] drm/i915/selftests: Check for engine-reset errors in the middle of workarounds Chris Wilson
                   ` (20 preceding siblings ...)
  2021-01-25 14:01 ` [Intel-gfx] [PATCH 22/41] drm/i915: Fair low-latency scheduling Chris Wilson
@ 2021-01-25 14:01 ` Chris Wilson
  2021-01-25 14:01 ` [Intel-gfx] [PATCH 24/41] drm/i915: Extend the priority boosting for the display with a deadline Chris Wilson
                   ` (22 subsequent siblings)
  44 siblings, 0 replies; 90+ messages in thread
From: Chris Wilson @ 2021-01-25 14:01 UTC (permalink / raw)
  To: intel-gfx; +Cc: thomas.hellstrom, Chris Wilson

As we know when we expect the heartbeat to be checked for completion,
pass this information along as its deadline. We still do not complain if
the deadline is missed, at least until we have tried a few times, but it
will allow for quicker hang detection on systems where deadlines are
adhered to.
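
For illustration, a minimal userspace sketch of the arithmetic behind the
heartbeat deadline: the interval in milliseconds is shifted left by 20,
which approximates the conversion to nanoseconds (1 << 20 = 1048576, about
4.9% more than 1000000), and the result is added to the current time. The
names approx_ms_to_ns() and heartbeat_deadline_ns(), and the plain uint64_t
timestamps standing in for ktime_t, are assumptions made for this example
and are not part of the driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: approximate ms -> ns with a shift instead of a multiply. */
static uint64_t approx_ms_to_ns(uint64_t ms)
{
	return ms << 20; /* treats 1 ms as 1048576 ns */
}

/*
 * Hypothetical userspace model of set_heartbeat_deadline().
 *
 * Returns false and leaves *deadline untouched when the heartbeat is
 * disabled (interval_ms == 0), mirroring the patch, which sets no
 * deadline in that case. Otherwise stores now + interval in *deadline.
 */
static bool heartbeat_deadline_ns(uint64_t now_ns, uint64_t interval_ms,
				  uint64_t *deadline)
{
	assert(deadline != NULL);

	if (!interval_ms)
		return false;

	*deadline = now_ns + approx_ms_to_ns(interval_ms);
	return true;
}
```

With an interval of 2500ms, for example, the offset is 2500 << 20 =
2621440000ns, roughly 2.6s rather than exactly 2.5s. The sketch assumes
that this slack is acceptable because the deadline is only a hint.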

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c | 16 ++++++++++++++--
 1 file changed, 14 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c b/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c
index 2d1f0a4da13c..defc71d4c729 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c
@@ -65,6 +65,16 @@ static void heartbeat_commit(struct i915_request *rq,
 	__i915_request_queue(rq, attr);
 }
 
+static void set_heartbeat_deadline(struct intel_engine_cs *engine,
+				   struct i915_request *rq)
+{
+	unsigned long interval;
+
+	interval = READ_ONCE(engine->props.heartbeat_interval_ms);
+	if (interval)
+		i915_request_set_deadline(rq, ktime_get() + (interval << 20));
+}
+
 static void show_heartbeat(const struct i915_request *rq,
 			   struct intel_engine_cs *engine)
 {
@@ -114,8 +124,7 @@ static void heartbeat(struct work_struct *wrk)
 			 * but all other contexts, including the kernel
 			 * context are stuck waiting for the signal.
 			 */
-		} else if (intel_engine_has_scheduler(engine) &&
-			   rq->sched.attr.priority < I915_PRIORITY_BARRIER) {
+		} else if (rq->sched.attr.priority < I915_PRIORITY_BARRIER) {
 			/*
 			 * Gradually raise the priority of the heartbeat to
 			 * give high priority work [which presumably desires
@@ -129,6 +138,8 @@ static void heartbeat(struct work_struct *wrk)
 				attr.priority = I915_PRIORITY_BARRIER;
 
 			local_bh_disable();
+			if (attr.priority == I915_PRIORITY_BARRIER)
+				i915_request_set_deadline(rq, 0);
 			i915_request_set_priority(rq, attr.priority);
 			local_bh_enable();
 		} else {
@@ -161,6 +172,7 @@ static void heartbeat(struct work_struct *wrk)
 	if (IS_ERR(rq))
 		goto unlock;
 
+	set_heartbeat_deadline(engine, rq);
 	heartbeat_commit(rq, &attr);
 
 unlock:
-- 
2.20.1

* [Intel-gfx] [PATCH 24/41] drm/i915: Extend the priority boosting for the display with a deadline
  2021-01-25 14:00 [Intel-gfx] [PATCH 01/41] drm/i915/selftests: Check for engine-reset errors in the middle of workarounds Chris Wilson
                   ` (21 preceding siblings ...)
  2021-01-25 14:01 ` [Intel-gfx] [PATCH 23/41] drm/i915/gt: Specify a deadline for the heartbeat Chris Wilson
@ 2021-01-25 14:01 ` Chris Wilson
  2021-01-25 14:01 ` [Intel-gfx] [PATCH 25/41] drm/i915/gt: Support virtual engine queues Chris Wilson
                   ` (21 subsequent siblings)
  44 siblings, 0 replies; 90+ messages in thread
From: Chris Wilson @ 2021-01-25 14:01 UTC (permalink / raw)
  To: intel-gfx; +Cc: thomas.hellstrom, Chris Wilson

For a modeset/pageflip, there is a very precise deadline by which the
frame must be completed in order to hit the vblank and be shown. While
we don't pass along that exact information, we can at least inform the
scheduler that this request-chain needs to be completed asap.
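
As a rough illustration of the unit conversion applied to the ktime_t
deadline before it reaches the scheduler, the sketch below mirrors
i915_sched_to_ticks() and i915_sched_to_ns() from earlier in this series
in plain userspace C. The shift of 19 is taken from
I915_SCHED_DEADLINE_SHIFT; the names sched_to_ticks() and sched_to_ns(),
and the int64_t nanosecond count standing in for ktime_t, are assumptions
made for this example.

```c
#include <assert.h>
#include <stdint.h>

#define SCHED_DEADLINE_SHIFT 19 /* 2^19 ns = 524288 ns, roughly 500us */

/* Convert a nanosecond timestamp into scheduler ticks (rounds down). */
static uint64_t sched_to_ticks(int64_t ns)
{
	assert(ns >= 0); /* the sketch only models non-negative timestamps */
	return (uint64_t)ns >> SCHED_DEADLINE_SHIFT;
}

/* Convert scheduler ticks back to the start of their nanosecond bucket. */
static uint64_t sched_to_ns(uint64_t ticks)
{
	return ticks << SCHED_DEADLINE_SHIFT;
}
```

The round trip is lossy: sched_to_ns(sched_to_ticks(1000000)) yields
524288, the start of the bucket, so any two deadlines that fall within
the same bucket of about 524us compare equal.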

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/display/intel_display.c |  7 +++++--
 drivers/gpu/drm/i915/gem/i915_gem_object.h   |  5 +++--
 drivers/gpu/drm/i915/gem/i915_gem_wait.c     | 21 ++++++++++++--------
 3 files changed, 21 insertions(+), 12 deletions(-)

diff --git a/drivers/gpu/drm/i915/display/intel_display.c b/drivers/gpu/drm/i915/display/intel_display.c
index 2e80babd1f66..eb5260591d57 100644
--- a/drivers/gpu/drm/i915/display/intel_display.c
+++ b/drivers/gpu/drm/i915/display/intel_display.c
@@ -13672,7 +13672,8 @@ intel_prepare_plane_fb(struct drm_plane *_plane,
 
 	if (new_plane_state->uapi.fence) { /* explicit fencing */
 		i915_gem_fence_wait_priority(new_plane_state->uapi.fence,
-					     I915_PRIORITY_DISPLAY);
+					     I915_PRIORITY_DISPLAY,
+					     ktime_get() /* next vblank? */);
 		ret = i915_sw_fence_await_dma_fence(&state->commit_ready,
 						    new_plane_state->uapi.fence,
 						    i915_fence_timeout(dev_priv),
@@ -13694,7 +13695,9 @@ intel_prepare_plane_fb(struct drm_plane *_plane,
 	if (ret)
 		return ret;
 
-	i915_gem_object_wait_priority(obj, 0, I915_PRIORITY_DISPLAY);
+	i915_gem_object_wait_priority(obj, 0,
+				      I915_PRIORITY_DISPLAY,
+				      ktime_get() /* next vblank? */);
 	i915_gem_object_flush_frontbuffer(obj, ORIGIN_DIRTYFB);
 
 	if (!new_plane_state->uapi.fence) { /* implicit fencing */
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.h b/drivers/gpu/drm/i915/gem/i915_gem_object.h
index 325766abca21..9935a2e59df0 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object.h
@@ -549,14 +549,15 @@ static inline void __start_cpu_write(struct drm_i915_gem_object *obj)
 		obj->cache_dirty = true;
 }
 
-void i915_gem_fence_wait_priority(struct dma_fence *fence, int prio);
+void i915_gem_fence_wait_priority(struct dma_fence *fence,
+				  int prio, ktime_t deadline);
 
 int i915_gem_object_wait(struct drm_i915_gem_object *obj,
 			 unsigned int flags,
 			 long timeout);
 int i915_gem_object_wait_priority(struct drm_i915_gem_object *obj,
 				  unsigned int flags,
-				  int prio);
+				  int prio, ktime_t deadline);
 
 void __i915_gem_object_flush_frontbuffer(struct drm_i915_gem_object *obj,
 					 enum fb_op_origin origin);
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_wait.c b/drivers/gpu/drm/i915/gem/i915_gem_wait.c
index 4d1897c347b9..162f9737965f 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_wait.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_wait.c
@@ -92,11 +92,14 @@ i915_gem_object_wait_reservation(struct dma_resv *resv,
 	return timeout;
 }
 
-static void fence_set_priority(struct dma_fence *fence, int prio)
+static void
+fence_set_priority(struct dma_fence *fence, int prio, ktime_t deadline)
 {
 	if (dma_fence_is_signaled(fence) || !dma_fence_is_i915(fence))
 		return;
 
+	i915_request_set_deadline(to_request(fence),
+				  i915_sched_to_ticks(deadline));
 	i915_request_set_priority(to_request(fence), prio);
 }
 
@@ -105,7 +108,8 @@ static inline bool __dma_fence_is_chain(const struct dma_fence *fence)
 	return fence->ops == &dma_fence_chain_ops;
 }
 
-void i915_gem_fence_wait_priority(struct dma_fence *fence, int prio)
+void i915_gem_fence_wait_priority(struct dma_fence *fence,
+				  int prio, ktime_t deadline)
 {
 	if (dma_fence_is_signaled(fence))
 		return;
@@ -118,19 +122,19 @@ void i915_gem_fence_wait_priority(struct dma_fence *fence, int prio)
 		int i;
 
 		for (i = 0; i < array->num_fences; i++)
-			fence_set_priority(array->fences[i], prio);
+			fence_set_priority(array->fences[i], prio, deadline);
 	} else if (__dma_fence_is_chain(fence)) {
 		struct dma_fence *iter;
 
 		/* The chain is ordered; if we boost the last, we boost all */
 		dma_fence_chain_for_each(iter, fence) {
 			fence_set_priority(to_dma_fence_chain(iter)->fence,
-					   prio);
+					   prio, deadline);
 			break;
 		}
 		dma_fence_put(iter);
 	} else {
-		fence_set_priority(fence, prio);
+		fence_set_priority(fence, prio, deadline);
 	}
 
 	local_bh_enable(); /* kick the tasklets if queues were reprioritised */
@@ -139,7 +143,8 @@ void i915_gem_fence_wait_priority(struct dma_fence *fence, int prio)
 int
 i915_gem_object_wait_priority(struct drm_i915_gem_object *obj,
 			      unsigned int flags,
-			      int prio)
+			      int prio,
+			      ktime_t deadline)
 {
 	struct dma_fence *excl;
 
@@ -154,7 +159,7 @@ i915_gem_object_wait_priority(struct drm_i915_gem_object *obj,
 			return ret;
 
 		for (i = 0; i < count; i++) {
-			i915_gem_fence_wait_priority(shared[i], prio);
+			i915_gem_fence_wait_priority(shared[i], prio, deadline);
 			dma_fence_put(shared[i]);
 		}
 
@@ -164,7 +169,7 @@ i915_gem_object_wait_priority(struct drm_i915_gem_object *obj,
 	}
 
 	if (excl) {
-		i915_gem_fence_wait_priority(excl, prio);
+		i915_gem_fence_wait_priority(excl, prio, deadline);
 		dma_fence_put(excl);
 	}
 	return 0;
-- 
2.20.1

* [Intel-gfx] [PATCH 25/41] drm/i915/gt: Support virtual engine queues
  2021-01-25 14:00 [Intel-gfx] [PATCH 01/41] drm/i915/selftests: Check for engine-reset errors in the middle of workarounds Chris Wilson
                   ` (22 preceding siblings ...)
  2021-01-25 14:01 ` [Intel-gfx] [PATCH 24/41] drm/i915: Extend the priority boosting for the display with a deadline Chris Wilson
@ 2021-01-25 14:01 ` Chris Wilson
  2021-01-25 14:01 ` [Intel-gfx] [PATCH 26/41] drm/i915: Move saturated workload detection back to the context Chris Wilson
                   ` (20 subsequent siblings)
  44 siblings, 0 replies; 90+ messages in thread
From: Chris Wilson @ 2021-01-25 14:01 UTC (permalink / raw)
  To: intel-gfx; +Cc: thomas.hellstrom, Chris Wilson

Allow multiple requests to be queued onto a virtual engine, whereas
before we only allowed a single request to be queued at a time. The
advantage of keeping just one request in the queue was to ensure that we
always decided late which engine to use. However, with the introduction
of the virtual deadline, we throttle submission and still only drip one
request into the sibling at a time (unless it is truly empty, but then a
second request will have an earlier deadline than the queued virtual
engine and force itself in front). This also takes advantage of the fact
that a virtual engine will remain bound while it is active, i.e. we
cannot switch to a second engine until the context is completed -- such
that we cannot be as lazy as lazy can be.

By allowing a full queue, we avoid having to synchronize via the
breadcrumb interrupt every time, letting the virtual engine reach the
full throughput of the siblings.
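
The deadline ordering that the description relies on can be sketched as a
toy model: given the head request of each queue feeding a physical engine
(its own queue and those of any virtual engines), the one with the
earliest deadline runs first. This is only an illustration under stated
assumptions; struct toy_request and pick_next() are hypothetical names,
the tie-break in favour of the first queue is an arbitrary choice, and the
driver itself works with rbtrees and priority lists rather than a flat
array.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a request: only the virtual deadline matters. */
struct toy_request {
	uint64_t deadline; /* smaller value means more urgent */
};

/*
 * Pick the most urgent request among the heads of several queues.
 * A NULL entry denotes an empty queue. Returns NULL if all are empty.
 * On equal deadlines the earliest entry in the array wins.
 */
static const struct toy_request *
pick_next(const struct toy_request *const *heads, size_t count)
{
	const struct toy_request *best = NULL;
	size_t i;

	assert(heads != NULL || count == 0);

	for (i = 0; i < count; i++) {
		if (!heads[i])
			continue; /* nothing queued here */

		if (!best || heads[i]->deadline < best->deadline)
			best = heads[i];
	}

	return best;
}
```

In this model, a request arriving with an earlier deadline than the head
of a queued virtual engine is selected ahead of it, which is the
behaviour the description above calls forcing itself in front.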

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 .../drm/i915/gt/intel_execlists_submission.c  | 421 +++++++++---------
 drivers/gpu/drm/i915/i915_request.c           |  12 +-
 drivers/gpu/drm/i915/i915_scheduler.c         |  65 ++-
 drivers/gpu/drm/i915/i915_scheduler.h         |   4 +-
 4 files changed, 273 insertions(+), 229 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
index 8f12068465bd..ecbc0538e155 100644
--- a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
+++ b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
@@ -160,17 +160,6 @@ struct virtual_engine {
 	struct intel_context context;
 	struct rcu_work rcu;
 
-	/*
-	 * We allow only a single request through the virtual engine at a time
-	 * (each request in the timeline waits for the completion fence of
-	 * the previous before being submitted). By restricting ourselves to
-	 * only submitting a single request, each request is placed on to a
-	 * physical to maximise load spreading (by virtue of the late greedy
-	 * scheduling -- each real engine takes the next available request
-	 * upon idling).
-	 */
-	struct i915_request *request;
-
 	/*
 	 * We keep a rbtree of available virtual engines inside each physical
 	 * engine, sorted by priority. Here we preallocate the nodes we need
@@ -270,17 +259,27 @@ static struct i915_request *first_request(struct i915_sched_engine *se)
 	return NULL;
 }
 
-static struct i915_request *first_virtual(const struct intel_engine_cs *engine)
+static struct virtual_engine *
+first_virtual_engine(struct intel_engine_cs *engine)
 {
-	struct rb_node *rb;
+	return rb_entry_safe(rb_first_cached(&engine->execlists.virtual),
+			     struct virtual_engine,
+			     nodes[engine->id].rb);
+}
 
-	rb = rb_first_cached(&engine->execlists.virtual);
-	if (!rb)
-		return NULL;
+static const struct i915_request *first_virtual(struct intel_engine_cs *engine)
+{
+	struct i915_request *rq = NULL;
+	struct virtual_engine *ve;
 
-	return READ_ONCE(rb_entry(rb,
-				  struct virtual_engine,
-				  nodes[engine->id].rb)->request);
+	ve = first_virtual_engine(engine);
+	if (ve) {
+		spin_lock(&ve->base.active.lock);
+		rq = first_request(&ve->base.active);
+		spin_unlock(&ve->base.active.lock);
+	}
+
+	return rq;
 }
 
 static const struct i915_request *
@@ -377,7 +376,15 @@ assert_priority_queue(const struct i915_request *prev,
 	if (i915_request_is_active(prev))
 		return true;
 
-	return rq_deadline(prev) <= rq_deadline(next);
+	if (rq_deadline(prev) <= rq_deadline(next))
+		return true;
+
+	ENGINE_TRACE(prev->engine,
+		     "next %llx:%lld dl %lld is before prev %llx:%lld dl %lld\n",
+		     next->fence.context, next->fence.seqno, rq_deadline(next),
+		     prev->fence.context, prev->fence.seqno, rq_deadline(prev));
+
+	return false;
 }
 
 static void
@@ -487,7 +494,7 @@ static void execlists_schedule_in(struct i915_request *rq, int idx)
 	trace_i915_request_in(rq, idx);
 
 	old = ce->inflight;
-	if (!old)
+	if (!__intel_context_inflight_count(old))
 		old = __execlists_schedule_in(rq);
 	WRITE_ONCE(ce->inflight, ptr_inc(old));
 
@@ -498,30 +505,41 @@ static void
 resubmit_virtual_request(struct i915_request *rq, struct virtual_engine *ve)
 {
 	struct intel_engine_cs *engine = rq->engine;
+	struct i915_request *pos = rq;
+	struct intel_timeline *tl;
 
 	spin_lock_irq(&engine->active.lock);
 
-	clear_bit(I915_FENCE_FLAG_PQUEUE, &rq->fence.flags);
-	WRITE_ONCE(rq->engine, &ve->base);
-	ve->base.submit_request(rq);
+	if (__i915_request_is_complete(rq))
+		goto unlock;
 
+	tl = i915_request_active_timeline(rq);
+
+	/* Rewind back to the start of this virtual engine queue */
+	list_for_each_entry_continue_reverse(rq, &tl->requests, link) {
+		if (!i915_request_in_priority_queue(rq))
+			break;
+
+		pos = rq;
+	}
+
+	/* Resubmit the queue in execution order */
+	spin_lock(&ve->base.active.lock);
+	list_for_each_entry_from(pos, &tl->requests, link) {
+		if (pos->engine != engine)
+			break;
+
+		__i915_request_requeue(pos, &ve->base);
+	}
+	spin_unlock(&ve->base.active.lock);
+
+unlock:
 	spin_unlock_irq(&engine->active.lock);
 }
 
 static void kick_siblings(struct i915_request *rq, struct intel_context *ce)
 {
 	struct virtual_engine *ve = container_of(ce, typeof(*ve), context);
-	struct intel_engine_cs *engine = rq->engine;
-
-	/*
-	 * After this point, the rq may be transferred to a new sibling, so
-	 * before we clear ce->inflight make sure that the context has been
-	 * removed from the b->signalers and furthermore we need to make sure
-	 * that the concurrent iterator in signal_irq_work is no longer
-	 * following ce->signal_link.
-	 */
-	if (!list_empty(&ce->signals))
-		intel_context_remove_breadcrumbs(ce, engine->breadcrumbs);
 
 	/*
 	 * This engine is now too busy to run this virtual request, so
@@ -530,10 +548,10 @@ static void kick_siblings(struct i915_request *rq, struct intel_context *ce)
 	 * same as other native request.
 	 */
 	if (i915_request_in_priority_queue(rq) &&
-	    rq->execution_mask != engine->mask)
+	    rq->execution_mask != rq->engine->mask)
 		resubmit_virtual_request(rq, ve);
 
-	if (READ_ONCE(ve->request))
+	if (!i915_sched_is_idle(&ve->base.active))
 		i915_sched_kick(&ve->base.active);
 }
 
@@ -876,10 +894,16 @@ static bool ctx_single_port_submission(const struct intel_context *ce)
 		intel_context_force_single_submission(ce));
 }
 
+static bool __can_merge_ctx(const struct intel_context *prev,
+			    const struct intel_context *next)
+{
+	return prev == next;
+}
+
 static bool can_merge_ctx(const struct intel_context *prev,
 			  const struct intel_context *next)
 {
-	if (prev != next)
+	if (!__can_merge_ctx(prev, next))
 		return false;
 
 	if (ctx_single_port_submission(prev))
@@ -950,31 +974,6 @@ static bool virtual_matches(const struct virtual_engine *ve,
 	return true;
 }
 
-static struct virtual_engine *
-first_virtual_engine(struct intel_engine_cs *engine)
-{
-	struct intel_engine_execlists *el = &engine->execlists;
-	struct rb_node *rb = rb_first_cached(&el->virtual);
-
-	while (rb) {
-		struct virtual_engine *ve =
-			rb_entry(rb, typeof(*ve), nodes[engine->id].rb);
-		struct i915_request *rq = READ_ONCE(ve->request);
-
-		/* lazily cleanup after another engine handled rq */
-		if (!rq || !virtual_matches(ve, rq, engine)) {
-			rb_erase_cached(rb, &el->virtual);
-			RB_CLEAR_NODE(rb);
-			rb = rb_first_cached(&el->virtual);
-			continue;
-		}
-
-		return ve;
-	}
-
-	return NULL;
-}
-
 static void virtual_xfer_context(struct virtual_engine *ve,
 				 struct intel_engine_cs *engine)
 {
@@ -983,6 +982,10 @@ static void virtual_xfer_context(struct virtual_engine *ve,
 	if (likely(engine == ve->siblings[0]))
 		return;
 
+	if (!list_empty(&ve->context.signals))
+		intel_context_remove_breadcrumbs(&ve->context,
+						 ve->siblings[0]->breadcrumbs);
+
 	GEM_BUG_ON(READ_ONCE(ve->context.inflight));
 	if (!intel_engine_has_relative_mmio(engine))
 		lrc_update_offsets(&ve->context, engine);
@@ -1153,15 +1156,124 @@ static bool completed(const struct i915_request *rq)
 	return __i915_request_is_complete(rq);
 }
 
+static void __virtual_dequeue(struct virtual_engine *ve,
+			      struct intel_engine_cs *sibling)
+{
+	struct ve_node * const node = &ve->nodes[sibling->id];
+	struct rb_node **parent, *rb;
+	struct i915_request *rq;
+	u64 deadline;
+	bool first;
+
+	rb_erase_cached(&node->rb, &sibling->execlists.virtual);
+	RB_CLEAR_NODE(&node->rb);
+
+	rq = first_request(&ve->base.active);
+	if (!virtual_matches(ve, rq, sibling))
+		return;
+
+	rb = NULL;
+	first = true;
+	parent = &sibling->execlists.virtual.rb_root.rb_node;
+	deadline = rq_deadline(rq);
+	while (*parent) {
+		struct ve_node *other;
+
+		rb = *parent;
+		other = rb_entry(rb, typeof(*other), rb);
+		if (deadline <= other->deadline) {
+			parent = &rb->rb_left;
+		} else {
+			parent = &rb->rb_right;
+			first = false;
+		}
+	}
+
+	rb_link_node(&node->rb, rb, parent);
+	rb_insert_color_cached(&node->rb, &sibling->execlists.virtual, first);
+}
+
+static void virtual_requeue(struct intel_engine_cs *engine,
+			    struct i915_request *last)
+{
+	const struct i915_request * const first =
+		first_request(&engine->active);
+	struct virtual_engine *ve;
+
+	while ((ve = first_virtual_engine(engine))) {
+		struct i915_request *rq;
+
+		spin_lock(&ve->base.active.lock);
+
+		rq = first_request(&ve->base.active);
+		if (unlikely(!virtual_matches(ve, rq, engine)))
+			/* lost the race to a sibling */
+			goto unlock;
+
+		GEM_BUG_ON(rq->engine != &ve->base);
+		GEM_BUG_ON(rq->context != &ve->context);
+
+		if (last && !__can_merge_ctx(last->context, rq->context)) {
+			spin_unlock(&ve->base.active.lock);
+			return; /* leave this for another sibling? */
+		}
+
+		if (!dl_before(rq, first)) {
+			spin_unlock(&ve->base.active.lock);
+			return;
+		}
+
+		ENGINE_TRACE(engine,
+			     "virtual rq=%llx:%lld%s, dl %lld, new engine? %s\n",
+			     rq->fence.context,
+			     rq->fence.seqno,
+			     __i915_request_is_complete(rq) ? "!" :
+			     __i915_request_has_started(rq) ? "*" :
+			     "",
+			     rq_deadline(rq),
+			     yesno(engine != ve->siblings[0]));
+
+		GEM_BUG_ON(!(rq->execution_mask & engine->mask));
+		if (__i915_request_requeue(rq, engine)) {
+			/*
+			 * Only after we confirm that we will submit
+			 * this request (i.e. it has not already
+			 * completed), do we want to update the context.
+			 *
+			 * This serves two purposes. It avoids
+			 * unnecessary work if we are resubmitting an
+			 * already completed request after timeslicing.
+			 * But more importantly, it prevents us altering
+			 * ve->siblings[] on an idle context, where
+			 * we may be using ve->siblings[] in
+			 * virtual_context_enter / virtual_context_exit.
+			 */
+			virtual_xfer_context(ve, engine);
+
+			/* Bind this ve before we release the lock */
+			if (!ve->context.inflight)
+				WRITE_ONCE(ve->context.inflight, engine);
+
+			GEM_BUG_ON(rq->engine != engine);
+			GEM_BUG_ON(ve->siblings[0] != engine);
+			GEM_BUG_ON(intel_context_inflight(rq->context) != engine);
+
+			last = rq;
+		}
+
+unlock:
+		__virtual_dequeue(ve, engine);
+		spin_unlock(&ve->base.active.lock);
+	}
+}
+
 static void execlists_dequeue(struct intel_engine_cs *engine)
 {
 	struct intel_engine_execlists * const execlists = &engine->execlists;
 	struct i915_request **port = execlists->pending;
 	struct i915_request ** const last_port = port + execlists->port_mask;
 	struct i915_request *last, * const *active;
-	struct virtual_engine *ve;
 	struct i915_priolist *pl;
-	struct rb_node *rb;
 	bool submit = false;
 
 	/*
@@ -1292,83 +1404,8 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
 		}
 	}
 
-	/* XXX virtual is always taking precedence */
-	while ((ve = first_virtual_engine(engine))) {
-		struct i915_request *rq;
-
-		spin_lock(&ve->base.active.lock);
-
-		rq = ve->request;
-		if (unlikely(!virtual_matches(ve, rq, engine)))
-			goto unlock; /* lost the race to a sibling */
-
-		GEM_BUG_ON(rq->engine != &ve->base);
-		GEM_BUG_ON(rq->context != &ve->context);
-
-		if (!dl_before(rq, first_request(&engine->active))) {
-			spin_unlock(&ve->base.active.lock);
-			break;
-		}
-
-		if (last && !can_merge_rq(last, rq)) {
-			spin_unlock(&ve->base.active.lock);
-			spin_unlock(&engine->active.lock);
-			return; /* leave this for another sibling */
-		}
-
-		ENGINE_TRACE(engine,
-			     "virtual rq=%llx:%lld%s, dl %llx, new engine? %s\n",
-			     rq->fence.context,
-			     rq->fence.seqno,
-			     __i915_request_is_complete(rq) ? "!" :
-			     __i915_request_has_started(rq) ? "*" :
-			     "",
-			     rq_deadline(rq),
-			     yesno(engine != ve->siblings[0]));
-		WRITE_ONCE(ve->request, NULL);
-
-		rb = &ve->nodes[engine->id].rb;
-		rb_erase_cached(rb, &execlists->virtual);
-		RB_CLEAR_NODE(rb);
-
-		GEM_BUG_ON(!(rq->execution_mask & engine->mask));
-		WRITE_ONCE(rq->engine, engine);
-
-		if (__i915_request_submit(rq)) {
-			/*
-			 * Only after we confirm that we will submit
-			 * this request (i.e. it has not already
-			 * completed), do we want to update the context.
-			 *
-			 * This serves two purposes. It avoids
-			 * unnecessary work if we are resubmitting an
-			 * already completed request after timeslicing.
-			 * But more importantly, it prevents us altering
-			 * ve->siblings[] on an idle context, where
-			 * we may be using ve->siblings[] in
-			 * virtual_context_enter / virtual_context_exit.
-			 */
-			virtual_xfer_context(ve, engine);
-			GEM_BUG_ON(ve->siblings[0] != engine);
-
-			submit = true;
-			last = rq;
-		}
-
-		i915_request_put(rq);
-unlock:
-		spin_unlock(&ve->base.active.lock);
-
-		/*
-		 * Hmm, we have a bunch of virtual engine requests,
-		 * but the first one was already completed (thanks
-		 * preempt-to-busy!). Keep looking at the veng queue
-		 * until we have no more relevant requests (i.e.
-		 * the normal submit queue has higher priority).
-		 */
-		if (submit)
-			break;
-	}
+	if (!RB_EMPTY_ROOT(&execlists->virtual.rb_root))
+		virtual_requeue(engine, last);
 
 	for_each_priolist(pl, &engine->active.queue) {
 		struct i915_request *rq, *rn;
@@ -1376,6 +1413,8 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
 		priolist_for_each_request_safe(rq, rn, pl) {
 			bool merge = true;
 
+			GEM_BUG_ON(rq->engine != engine);
+
 			/*
 			 * Can we combine this request with the current port?
 			 * It has to be the same context/ringbuffer and not
@@ -2688,13 +2727,11 @@ static void execlists_reset_cancel(struct intel_engine_cs *engine)
 		RB_CLEAR_NODE(rb);
 
 		spin_lock(&ve->base.active.lock);
-		rq = fetch_and_zero(&ve->request);
-		if (rq) {
+		while ((rq = first_request(&ve->base.active))) {
 			i915_request_mark_eio(rq);
 
 			rq->engine = engine;
 			__i915_request_submit(rq);
-			i915_request_put(rq);
 		}
 		spin_unlock(&ve->base.active.lock);
 	}
@@ -2931,11 +2968,6 @@ int intel_execlists_submission_setup(struct intel_engine_cs *engine)
 	return 0;
 }
 
-static struct list_head *virtual_queue(struct virtual_engine *ve)
-{
-	return &ve->base.active.default_priolist.requests;
-}
-
 static void rcu_virtual_context_destroy(struct work_struct *wrk)
 {
 	struct virtual_engine *ve =
@@ -2945,17 +2977,13 @@ static void rcu_virtual_context_destroy(struct work_struct *wrk)
 	GEM_BUG_ON(ve->context.inflight);
 
 	/* Preempt-to-busy may leave a stale request behind. */
-	if (unlikely(ve->request)) {
+	if (unlikely(!i915_sched_is_idle(&ve->base.active))) {
 		struct i915_request *old;
 
 		spin_lock_irq(&ve->base.active.lock);
 
-		old = fetch_and_zero(&ve->request);
-		if (old) {
-			GEM_BUG_ON(!__i915_request_is_complete(old));
+		while ((old = first_request(&ve->base.active)))
 			__i915_request_submit(old);
-			i915_request_put(old);
-		}
 
 		spin_unlock_irq(&ve->base.active.lock);
 	}
@@ -2986,7 +3014,6 @@ static void rcu_virtual_context_destroy(struct work_struct *wrk)
 		spin_unlock_irq(&sibling->active.lock);
 	}
 	GEM_BUG_ON(__tasklet_is_scheduled(&ve->base.active.tasklet));
-	GEM_BUG_ON(!list_empty(virtual_queue(ve)));
 
 	lrc_fini(&ve->context);
 	intel_context_fini(&ve->context);
@@ -3105,45 +3132,44 @@ static const struct intel_context_ops virtual_context_ops = {
 	.destroy = virtual_context_destroy,
 };
 
-static intel_engine_mask_t
+static struct i915_request *
 virtual_submission_mask(struct virtual_engine *ve, u64 *deadline)
 {
 	struct i915_request *rq;
-	intel_engine_mask_t mask;
 
-	rq = READ_ONCE(ve->request);
+	spin_lock_irq(&ve->base.active.lock);
+	rq = first_request(&ve->base.active);
+	spin_unlock_irq(&ve->base.active.lock);
 	if (!rq)
-		return 0;
+		return NULL;
 
 	/* The rq is ready for submission; rq->execution_mask is now stable. */
-	mask = rq->execution_mask;
-	if (unlikely(!mask)) {
+	if (unlikely(!rq->execution_mask)) {
 		/* Invalid selection, submit to a random engine in error */
 		i915_request_set_error_once(rq, -ENODEV);
-		mask = ve->siblings[0]->mask;
+		WRITE_ONCE(rq->execution_mask, ALL_ENGINES);
 	}
 
 	*deadline = rq_deadline(rq);
 
 	ENGINE_TRACE(&ve->base, "rq=%llx:%llu, mask=%x, dl=%llu\n",
 		     rq->fence.context, rq->fence.seqno,
-		     mask, *deadline);
+		     rq->execution_mask, *deadline);
 
-	return mask;
+	return rq;
 }
 
 static void virtual_submission_tasklet(unsigned long data)
 {
 	struct virtual_engine * const ve = (struct virtual_engine *)data;
-	intel_engine_mask_t mask;
+	struct i915_request *rq;
 	unsigned int n;
 	u64 deadline;
 
 	rcu_read_lock();
-	mask = virtual_submission_mask(ve, &deadline);
-	rcu_read_unlock();
-	if (unlikely(!mask))
-		return;
+	rq = virtual_submission_mask(ve, &deadline);
+	if (unlikely(!rq))
+		goto out;
 
 	for (n = 0; n < ve->num_siblings; n++) {
 		struct intel_engine_cs *sibling = READ_ONCE(ve->siblings[n]);
@@ -3151,12 +3177,9 @@ static void virtual_submission_tasklet(unsigned long data)
 		struct rb_node **parent, *rb;
 		bool first;
 
-		if (!READ_ONCE(ve->request))
-			break; /* already handled by a sibling's tasklet */
-
 		spin_lock_irq(&sibling->active.lock);
 
-		if (unlikely(!(mask & sibling->mask))) {
+		if (unlikely(!virtual_matches(ve, rq, sibling))) {
 			if (!RB_EMPTY_NODE(&node->rb)) {
 				rb_erase_cached(&node->rb,
 						&sibling->execlists.virtual);
@@ -3213,45 +3236,9 @@ static void virtual_submission_tasklet(unsigned long data)
 		if (intel_context_inflight(&ve->context))
 			break;
 	}
-}
 
-static void virtual_submit_request(struct i915_request *rq)
-{
-	struct virtual_engine *ve = to_virtual_engine(rq->engine);
-	unsigned long flags;
-
-	ENGINE_TRACE(&ve->base, "rq=%llx:%lld\n",
-		     rq->fence.context,
-		     rq->fence.seqno);
-
-	GEM_BUG_ON(ve->base.submit_request != virtual_submit_request);
-
-	spin_lock_irqsave(&ve->base.active.lock, flags);
-
-	/* By the time we resubmit a request, it may be completed */
-	if (__i915_request_is_complete(rq)) {
-		__i915_request_submit(rq);
-		goto unlock;
-	}
-
-	if (ve->request) { /* background completion from preempt-to-busy */
-		GEM_BUG_ON(!__i915_request_is_complete(ve->request));
-		__i915_request_submit(ve->request);
-		i915_request_put(ve->request);
-	}
-
-	rq->sched.deadline =
-		min(rq->sched.deadline,
-		    i915_scheduler_next_virtual_deadline(rq_prio(rq)));
-	ve->request = i915_request_get(rq);
-
-	GEM_BUG_ON(!list_empty(virtual_queue(ve)));
-	list_move_tail(&rq->sched.link, virtual_queue(ve));
-
-	i915_sched_kick(&ve->base.active);
-
-unlock:
-	spin_unlock_irqrestore(&ve->base.active.lock, flags);
+out:
+	rcu_read_unlock();
 }
 
 static struct ve_bond *
@@ -3341,10 +3328,9 @@ intel_execlists_create_virtual(struct intel_engine_cs **siblings,
 	ve->base.cops = &virtual_context_ops;
 	ve->base.request_alloc = execlists_request_alloc;
 
-	ve->base.submit_request = virtual_submit_request;
+	ve->base.submit_request = i915_request_enqueue;
 	ve->base.bond_execute = virtual_bond_execute;
 
-	INIT_LIST_HEAD(virtual_queue(ve));
 	tasklet_init(&ve->base.active.tasklet,
 		     virtual_submission_tasklet,
 		     (unsigned long)ve);
@@ -3552,14 +3538,17 @@ void intel_execlists_show_requests(struct intel_engine_cs *engine,
 	for (rb = rb_first_cached(&execlists->virtual); rb; rb = rb_next(rb)) {
 		struct virtual_engine *ve =
 			rb_entry(rb, typeof(*ve), nodes[engine->id].rb);
-		struct i915_request *rq = READ_ONCE(ve->request);
+		struct i915_request *rq;
 
+		spin_lock(&ve->base.active.lock);
+		rq = first_request(&ve->base.active);
 		if (rq) {
 			if (count++ < max - 1)
 				show_request(m, rq, "\t\t", 0);
 			else
 				last = rq;
 		}
+		spin_unlock(&ve->base.active.lock);
 	}
 	if (last) {
 		if (count > max) {
diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
index e4c0c810b77e..0254c190f690 100644
--- a/drivers/gpu/drm/i915/i915_request.c
+++ b/drivers/gpu/drm/i915/i915_request.c
@@ -1354,6 +1354,7 @@ i915_request_await_request(struct i915_request *to, struct i915_request *from)
 
 	GEM_BUG_ON(to == from);
 	GEM_BUG_ON(to->timeline == from->timeline);
+	GEM_BUG_ON(to->context == from->context);
 
 	if (i915_request_completed(from)) {
 		i915_sw_fence_set_error_once(&to->submit, from->fence.error);
@@ -1500,6 +1501,15 @@ i915_request_await_object(struct i915_request *to,
 	return ret;
 }
 
+static bool in_order_submission(const struct i915_request *prev,
+				const struct i915_request *rq)
+{
+	if (likely(prev->context == rq->context))
+		return true;
+
+	return is_power_of_2(READ_ONCE(prev->engine)->mask | rq->engine->mask);
+}
+
 static struct i915_request *
 __i915_request_add_to_timeline(struct i915_request *rq)
 {
@@ -1539,7 +1549,7 @@ __i915_request_add_to_timeline(struct i915_request *rq)
 			   i915_seqno_passed(prev->fence.seqno,
 					     rq->fence.seqno));
 
-		if (is_power_of_2(READ_ONCE(prev->engine)->mask | rq->engine->mask))
+		if (in_order_submission(prev, rq))
 			i915_sw_fence_await_sw_fence(&rq->submit,
 						     &prev->submit,
 						     &rq->submitq);
diff --git a/drivers/gpu/drm/i915/i915_scheduler.c b/drivers/gpu/drm/i915/i915_scheduler.c
index 7ba816e83b55..9678cabf88cf 100644
--- a/drivers/gpu/drm/i915/i915_scheduler.c
+++ b/drivers/gpu/drm/i915/i915_scheduler.c
@@ -593,7 +593,7 @@ static u64 virtual_deadline(u64 kt, int priority)
 	return i915_sched_to_ticks(kt + prio_slice(priority));
 }
 
-u64 i915_scheduler_next_virtual_deadline(int priority)
+static u64 next_virtual_deadline(int priority)
 {
 	return virtual_deadline(ktime_get_mono_fast_ns(), priority);
 }
@@ -823,20 +823,17 @@ void i915_request_set_priority(struct i915_request *rq, int prio)
 	spin_unlock_irqrestore(&engine->active.lock, flags);
 }
 
-void __intel_engine_defer_request(struct intel_engine_cs *engine,
-				  struct i915_request *rq)
+static void __defer_request(struct intel_engine_cs *engine,
+			    struct i915_request *rq,
+			    u64 deadline)
 {
 	struct list_head *pos = &rq->sched.waiters_list;
 	struct i915_request *rn;
 	LIST_HEAD(dfs);
-	u64 deadline;
 
 	lockdep_assert_held(&engine->active.lock);
 	GEM_BUG_ON(!test_bit(I915_FENCE_FLAG_PQUEUE, &rq->fence.flags));
 
-	deadline = max(rq_deadline(rq),
-		       i915_scheduler_next_virtual_deadline(adj_prio(rq)));
-
 	/*
 	 * When we defer a request, we must maintain its order with respect
 	 * to those that are waiting upon it. So we traverse its chain of
@@ -904,6 +901,14 @@ void __intel_engine_defer_request(struct intel_engine_cs *engine,
 	}
 }
 
+void __intel_engine_defer_request(struct intel_engine_cs *engine,
+				  struct i915_request *rq)
+{
+	__defer_request(engine, rq,
+			max(rq_deadline(rq),
+			    next_virtual_deadline(adj_prio(rq))));
+}
+
 static bool
 queue_request(struct intel_engine_cs *engine, struct i915_request *rq)
 {
@@ -944,6 +949,46 @@ static bool ancestor_on_hold(const struct intel_engine_cs *engine,
 	return unlikely(!list_empty(&engine->active.hold)) && hold_request(rq);
 }
 
+bool __i915_request_requeue(struct i915_request *rq,
+			    struct intel_engine_cs *engine)
+{
+	RQ_TRACE(rq, "transfer from %s to %s\n",
+		 rq->engine->name, engine->name);
+
+	lockdep_assert_held(&engine->active.lock);
+	lockdep_assert_held(&rq->engine->active.lock);
+	GEM_BUG_ON(!test_bit(I915_FENCE_FLAG_PQUEUE, &rq->fence.flags));
+	GEM_BUG_ON(rq->engine == engine);
+
+	list_del_init(&rq->sched.link);
+	WRITE_ONCE(rq->engine, engine);
+
+	if (__i915_request_is_complete(rq)) {
+		clear_bit(I915_FENCE_FLAG_PQUEUE, &rq->fence.flags);
+		set_bit(I915_FENCE_FLAG_ACTIVE, &rq->fence.flags);
+		return false;
+	}
+
+	if (unlikely(ancestor_on_hold(engine, rq))) {
+		RQ_TRACE(rq, "ancestor on hold\n");
+		clear_bit(I915_FENCE_FLAG_PQUEUE, &rq->fence.flags);
+		list_add_tail(&rq->sched.link, &engine->active.hold);
+		i915_request_set_hold(rq);
+	} else {
+		u64 deadline = min(earliest_deadline(rq), rq_deadline(rq));
+
+		/* Maintain request ordering wrt to existing on target */
+		__i915_request_set_deadline(rq, deadline);
+		if (!list_empty(&rq->sched.waiters_list))
+			__defer_request(engine, rq, deadline);
+
+		GEM_BUG_ON(rq_deadline(rq) == I915_DEADLINE_NEVER);
+	}
+
+	GEM_BUG_ON(list_empty(&rq->sched.link));
+	return true;
+}
+
 void i915_request_enqueue(struct i915_request *rq)
 {
 	struct intel_engine_cs *engine = rq->engine;
@@ -994,9 +1039,9 @@ __intel_engine_rewind_requests(struct intel_engine_cs *engine)
 		__i915_request_unsubmit(rq);
 
 		if (__i915_request_has_started(rq)) {
-			u64 deadline =
-				i915_scheduler_next_virtual_deadline(rq_prio(rq));
-			rq->sched.deadline = min(rq_deadline(rq), deadline);
+			rq->sched.deadline =
+				min(rq_deadline(rq),
+				    next_virtual_deadline(rq_prio(rq)));
 		}
 		GEM_BUG_ON(rq_deadline(rq) == I915_DEADLINE_NEVER);
 
diff --git a/drivers/gpu/drm/i915/i915_scheduler.h b/drivers/gpu/drm/i915/i915_scheduler.h
index e04d7eeb1b36..4a562befaf3e 100644
--- a/drivers/gpu/drm/i915/i915_scheduler.h
+++ b/drivers/gpu/drm/i915/i915_scheduler.h
@@ -40,9 +40,9 @@ void i915_request_set_deadline(struct i915_request *request, u64 deadline);
 
 void i915_request_update_deadline(struct i915_request *request);
 
-u64 i915_scheduler_next_virtual_deadline(int priority);
-
 void i915_request_enqueue(struct i915_request *request);
+bool __i915_request_requeue(struct i915_request *rq,
+			    struct intel_engine_cs *engine);
 
 struct i915_request *
 __intel_engine_rewind_requests(struct intel_engine_cs *engine);
-- 
2.20.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 90+ messages in thread

* [Intel-gfx] [PATCH 26/41] drm/i915: Move saturated workload detection back to the context
  2021-01-25 14:00 [Intel-gfx] [PATCH 01/41] drm/i915/selftests: Check for engine-reset errors in the middle of workarounds Chris Wilson
                   ` (23 preceding siblings ...)
  2021-01-25 14:01 ` [Intel-gfx] [PATCH 25/41] drm/i915/gt: Support virtual engine queues Chris Wilson
@ 2021-01-25 14:01 ` Chris Wilson
  2021-01-25 14:01 ` [Intel-gfx] [PATCH 27/41] drm/i915: Bump default timeslicing quantum to 5ms Chris Wilson
                   ` (19 subsequent siblings)
  44 siblings, 0 replies; 90+ messages in thread
From: Chris Wilson @ 2021-01-25 14:01 UTC (permalink / raw)
  To: intel-gfx; +Cc: thomas.hellstrom, Chris Wilson

When we introduced the saturated workload detection to tell us to back
off from semaphore usage [semaphores have a noticeable impact on
contended bus cycles with the CPU for some heavy workloads], we first
introduced it as a per-context tracker. This allows individual contexts
to try and optimise their own usage, but we found that with the local
tracking and the no-semaphore boosting, the first context to disable
semaphores got a massive priority boost and so would starve the rest and
all new contexts (as they started with semaphores enabled and lower
priority). Hence we moved the saturated workload detection to the
engine, and as a consequence had to disable semaphores on virtual engines.

Now that we do not have semaphore priority boosting, and try to fairly
schedule irrespective of semaphore usage, we can move the tracking back
to the context and virtual engines can now utilise the faster inter-engine
synchronisation. If we see that any context fails to use the semaphore,
because the system is oversubscribed and was busy doing something else
instead of spinning on the semaphore, we disable further usage of
semaphores with that context until it idles again. This should restrict
the semaphores to lightly utilised systems where the latency between
requests is more noticeable, and curtail the bus-contention from checking
for signaled semaphores.
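
As a rough illustration of the per-context tracking described above, here
is a minimal standalone sketch. It is not the driver code: every sketch_*
name and the engine_mask_t typedef are hypothetical stand-ins for
intel_context.saturated, i915_request.sched.semaphores and the checks in
__i915_request_submit() and already_busywaiting().

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for intel_engine_mask_t: one bit per engine. */
typedef uint32_t engine_mask_t;

struct sketch_context {
	/* Engines on which this context emitted a semaphore too late. */
	engine_mask_t saturated;
};

struct sketch_request {
	struct sketch_context *context;
	engine_mask_t engine_mask; /* engine this request executes on */
	engine_mask_t semaphores;  /* engines this request busywaits on */
};

/* The context leaves idle: forget any earlier saturation. */
static void sketch_context_active(struct sketch_context *ce)
{
	ce->saturated = 0;
}

/* Semaphores are allowed only on engines not already waited upon or saturated. */
static bool sketch_may_use_semaphore(const struct sketch_request *rq,
				     engine_mask_t mask)
{
	return !((rq->semaphores | rq->context->saturated) & mask);
}

/* Record the semaphore wait, ignoring the request's own engine. */
static void sketch_emit_semaphore(struct sketch_request *rq,
				  engine_mask_t mask)
{
	rq->semaphores |= mask & ~rq->engine_mask;
}

/*
 * At submission to HW: if the semaphore had already signaled, the wait
 * was pointless, so stop this context using semaphores on those engines.
 */
static void sketch_submit(struct sketch_request *rq, bool already_signaled)
{
	if (rq->semaphores && already_signaled)
		rq->context->saturated |= rq->semaphores;
}
```

The point of the sketch is only that the saturation mask lives in the
context rather than the engine, so one late context does not disable
semaphores for every other client of that engine.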

References: 44d89409a12e ("drm/i915: Make the semaphore saturation mask global")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/gt/intel_context.c           |  3 +++
 drivers/gpu/drm/i915/gt/intel_context_types.h     |  2 ++
 drivers/gpu/drm/i915/gt/intel_engine_pm.c         |  2 --
 drivers/gpu/drm/i915/gt/intel_engine_types.h      |  2 --
 .../gpu/drm/i915/gt/intel_execlists_submission.c  | 15 ---------------
 drivers/gpu/drm/i915/i915_request.c               |  6 +++---
 6 files changed, 8 insertions(+), 22 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_context.c b/drivers/gpu/drm/i915/gt/intel_context.c
index daf537d1e415..57b6bde2b736 100644
--- a/drivers/gpu/drm/i915/gt/intel_context.c
+++ b/drivers/gpu/drm/i915/gt/intel_context.c
@@ -344,6 +344,9 @@ static int __intel_context_active(struct i915_active *active)
 {
 	struct intel_context *ce = container_of(active, typeof(*ce), active);
 
+	CE_TRACE(ce, "active\n");
+	ce->saturated = 0;
+
 	intel_context_get(ce);
 
 	/* everything should already be activated by intel_context_pre_pin() */
diff --git a/drivers/gpu/drm/i915/gt/intel_context_types.h b/drivers/gpu/drm/i915/gt/intel_context_types.h
index 0ea18c9e2aca..d1a35c3055a7 100644
--- a/drivers/gpu/drm/i915/gt/intel_context_types.h
+++ b/drivers/gpu/drm/i915/gt/intel_context_types.h
@@ -109,6 +109,8 @@ struct intel_context {
 	} lrc;
 	u32 tag; /* cookie passed to HW to track this context on submission */
 
+	intel_engine_mask_t saturated; /* submitting semaphores too late? */
+
 	/** stats: Context GPU engine busyness tracking. */
 	struct intel_context_stats {
 		u64 active;
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_pm.c b/drivers/gpu/drm/i915/gt/intel_engine_pm.c
index 2427d9e01be9..ac18bbd24450 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_pm.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_pm.c
@@ -253,8 +253,6 @@ static int __engine_park(struct intel_wakeref *wf)
 	struct intel_engine_cs *engine =
 		container_of(wf, typeof(*engine), wakeref);
 
-	engine->saturated = 0;
-
 	/*
 	 * If one and only one request is completed between pm events,
 	 * we know that we are inside the kernel context and it is
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_types.h b/drivers/gpu/drm/i915/gt/intel_engine_types.h
index b5bef848a2d5..06a2582dc32f 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_types.h
+++ b/drivers/gpu/drm/i915/gt/intel_engine_types.h
@@ -303,8 +303,6 @@ struct intel_engine_cs {
 
 	struct intel_context *kernel_context; /* pinned */
 
-	intel_engine_mask_t saturated; /* submitting semaphores too late? */
-
 	struct {
 		struct delayed_work work;
 		struct i915_request *systole;
diff --git a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
index ecbc0538e155..99bdab2dc254 100644
--- a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
+++ b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
@@ -3305,21 +3305,6 @@ intel_execlists_create_virtual(struct intel_engine_cs **siblings,
 	ve->base.instance = I915_ENGINE_CLASS_INVALID_VIRTUAL;
 	ve->base.uabi_instance = I915_ENGINE_CLASS_INVALID_VIRTUAL;
 
-	/*
-	 * The decision on whether to submit a request using semaphores
-	 * depends on the saturated state of the engine. We only compute
-	 * this during HW submission of the request, and we need for this
-	 * state to be globally applied to all requests being submitted
-	 * to this engine. Virtual engines encompass more than one physical
-	 * engine and so we cannot accurately tell in advance if one of those
-	 * engines is already saturated and so cannot afford to use a semaphore
-	 * and be pessimized in priority for doing so -- if we are the only
-	 * context using semaphores after all other clients have stopped, we
-	 * will be starved on the saturated system. Such a global switch for
-	 * semaphores is less than ideal, but alas is the current compromise.
-	 */
-	ve->base.saturated = ALL_ENGINES;
-
 	snprintf(ve->base.name, sizeof(ve->base.name), "virtual");
 
 	i915_sched_init_engine(&ve->base.active, ENGINE_VIRTUAL);
diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
index 0254c190f690..9e622b5733fd 100644
--- a/drivers/gpu/drm/i915/i915_request.c
+++ b/drivers/gpu/drm/i915/i915_request.c
@@ -580,7 +580,7 @@ bool __i915_request_submit(struct i915_request *request)
 	 */
 	if (request->sched.semaphores &&
 	    i915_sw_fence_signaled(&request->semaphore))
-		engine->saturated |= request->sched.semaphores;
+		request->context->saturated |= request->sched.semaphores;
 
 	engine->emit_fini_breadcrumb(request,
 				     request->ring->vaddr + request->postfix);
@@ -1041,7 +1041,7 @@ already_busywaiting(struct i915_request *rq)
 	 *
 	 * See the are-we-too-late? check in __i915_request_submit().
 	 */
-	return rq->sched.semaphores | READ_ONCE(rq->engine->saturated);
+	return rq->sched.semaphores | READ_ONCE(rq->context->saturated);
 }
 
 static int
@@ -1135,7 +1135,7 @@ emit_semaphore_wait(struct i915_request *to,
 	if (__emit_semaphore_wait(to, from, from->fence.seqno))
 		goto await_fence;
 
-	to->sched.semaphores |= mask;
+	to->sched.semaphores |= mask & ~to->engine->mask;
 	wait = &to->semaphore;
 
 await_fence:
-- 
2.20.1

^ permalink raw reply related	[flat|nested] 90+ messages in thread

* [Intel-gfx] [PATCH 27/41] drm/i915: Bump default timeslicing quantum to 5ms
  2021-01-25 14:00 [Intel-gfx] [PATCH 01/41] drm/i915/selftests: Check for engine-reset errors in the middle of workarounds Chris Wilson
                   ` (24 preceding siblings ...)
  2021-01-25 14:01 ` [Intel-gfx] [PATCH 26/41] drm/i915: Move saturated workload detection back to the context Chris Wilson
@ 2021-01-25 14:01 ` Chris Wilson
  2021-01-25 14:01 ` [Intel-gfx] [PATCH 28/41] drm/i915/gt: Wrap intel_timeline.has_initial_breadcrumb Chris Wilson
                   ` (18 subsequent siblings)
  44 siblings, 0 replies; 90+ messages in thread
From: Chris Wilson @ 2021-01-25 14:01 UTC (permalink / raw)
  To: intel-gfx; +Cc: thomas.hellstrom, Chris Wilson

Primarily to smooth over differences with the GuC backend, which struggles
with smaller quanta, bump the default timeslicing to 5ms from 1ms.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/Kconfig.profile | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/Kconfig.profile b/drivers/gpu/drm/i915/Kconfig.profile
index 35bbe2b80596..3eacea42b19f 100644
--- a/drivers/gpu/drm/i915/Kconfig.profile
+++ b/drivers/gpu/drm/i915/Kconfig.profile
@@ -90,7 +90,7 @@ config DRM_I915_STOP_TIMEOUT
 
 config DRM_I915_TIMESLICE_DURATION
 	int "Scheduling quantum for userspace batches (ms, jiffy granularity)"
-	default 1 # milliseconds
+	default 5 # milliseconds
 	help
 	  When two user batches of equal priority are executing, we will
 	  alternate execution of each batch to ensure forward progress of
-- 
2.20.1

^ permalink raw reply related	[flat|nested] 90+ messages in thread

* [Intel-gfx] [PATCH 28/41] drm/i915/gt: Wrap intel_timeline.has_initial_breadcrumb
  2021-01-25 14:00 [Intel-gfx] [PATCH 01/41] drm/i915/selftests: Check for engine-reset errors in the middle of workarounds Chris Wilson
                   ` (25 preceding siblings ...)
  2021-01-25 14:01 ` [Intel-gfx] [PATCH 27/41] drm/i915: Bump default timeslicing quantum to 5ms Chris Wilson
@ 2021-01-25 14:01 ` Chris Wilson
  2021-01-25 14:01 ` [Intel-gfx] [PATCH 29/41] drm/i915/gt: Track timeline GGTT offset separately from subpage offset Chris Wilson
                   ` (17 subsequent siblings)
  44 siblings, 0 replies; 90+ messages in thread
From: Chris Wilson @ 2021-01-25 14:01 UTC (permalink / raw)
  To: intel-gfx; +Cc: thomas.hellstrom, Chris Wilson

In preparation for removing the has_initial_breadcrumb field, add a
helper function for the existing callers.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
---
 drivers/gpu/drm/i915/gt/gen8_engine_cs.c        | 2 +-
 drivers/gpu/drm/i915/gt/intel_ring_submission.c | 4 ++--
 drivers/gpu/drm/i915/gt/intel_timeline.c        | 6 +++---
 drivers/gpu/drm/i915/gt/intel_timeline.h        | 6 ++++++
 drivers/gpu/drm/i915/gt/selftest_timeline.c     | 5 +++--
 5 files changed, 15 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/gen8_engine_cs.c b/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
index 07ba524da90b..449633371de6 100644
--- a/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
@@ -354,7 +354,7 @@ int gen8_emit_init_breadcrumb(struct i915_request *rq)
 	u32 *cs;
 
 	GEM_BUG_ON(i915_request_has_initial_breadcrumb(rq));
-	if (!i915_request_timeline(rq)->has_initial_breadcrumb)
+	if (!intel_timeline_has_initial_breadcrumb(i915_request_timeline(rq)))
 		return 0;
 
 	cs = intel_ring_begin(rq, 6);
diff --git a/drivers/gpu/drm/i915/gt/intel_ring_submission.c b/drivers/gpu/drm/i915/gt/intel_ring_submission.c
index 8b7cc637c432..9467228a392f 100644
--- a/drivers/gpu/drm/i915/gt/intel_ring_submission.c
+++ b/drivers/gpu/drm/i915/gt/intel_ring_submission.c
@@ -906,7 +906,7 @@ static int ring_request_alloc(struct i915_request *request)
 	int ret;
 
 	GEM_BUG_ON(!intel_context_is_pinned(request->context));
-	GEM_BUG_ON(i915_request_timeline(request)->has_initial_breadcrumb);
+	GEM_BUG_ON(intel_timeline_has_initial_breadcrumb(i915_request_timeline(request)));
 
 	/*
 	 * Flush enough space to reduce the likelihood of waiting after
@@ -1231,7 +1231,7 @@ int intel_ring_submission_setup(struct intel_engine_cs *engine)
 		err = PTR_ERR(timeline);
 		goto err;
 	}
-	GEM_BUG_ON(timeline->has_initial_breadcrumb);
+	GEM_BUG_ON(intel_timeline_has_initial_breadcrumb(timeline));
 
 	err = intel_timeline_pin(timeline, NULL);
 	if (err)
diff --git a/drivers/gpu/drm/i915/gt/intel_timeline.c b/drivers/gpu/drm/i915/gt/intel_timeline.c
index 491b8df174c2..1505dffbaba9 100644
--- a/drivers/gpu/drm/i915/gt/intel_timeline.c
+++ b/drivers/gpu/drm/i915/gt/intel_timeline.c
@@ -444,14 +444,14 @@ void intel_timeline_exit(struct intel_timeline *tl)
 static u32 timeline_advance(struct intel_timeline *tl)
 {
 	GEM_BUG_ON(!atomic_read(&tl->pin_count));
-	GEM_BUG_ON(tl->seqno & tl->has_initial_breadcrumb);
+	GEM_BUG_ON(tl->seqno & intel_timeline_has_initial_breadcrumb(tl));
 
-	return tl->seqno += 1 + tl->has_initial_breadcrumb;
+	return tl->seqno += 1 + intel_timeline_has_initial_breadcrumb(tl);
 }
 
 static void timeline_rollback(struct intel_timeline *tl)
 {
-	tl->seqno -= 1 + tl->has_initial_breadcrumb;
+	tl->seqno -= 1 + intel_timeline_has_initial_breadcrumb(tl);
 }
 
 static noinline int
diff --git a/drivers/gpu/drm/i915/gt/intel_timeline.h b/drivers/gpu/drm/i915/gt/intel_timeline.h
index b1f81d947f8d..7d6218b55df6 100644
--- a/drivers/gpu/drm/i915/gt/intel_timeline.h
+++ b/drivers/gpu/drm/i915/gt/intel_timeline.h
@@ -42,6 +42,12 @@ static inline void intel_timeline_put(struct intel_timeline *timeline)
 	kref_put(&timeline->kref, __intel_timeline_free);
 }
 
+static inline bool
+intel_timeline_has_initial_breadcrumb(const struct intel_timeline *tl)
+{
+	return tl->has_initial_breadcrumb;
+}
+
 static inline int __intel_timeline_sync_set(struct intel_timeline *tl,
 					    u64 context, u32 seqno)
 {
diff --git a/drivers/gpu/drm/i915/gt/selftest_timeline.c b/drivers/gpu/drm/i915/gt/selftest_timeline.c
index d283dce5b4ac..562a450d2832 100644
--- a/drivers/gpu/drm/i915/gt/selftest_timeline.c
+++ b/drivers/gpu/drm/i915/gt/selftest_timeline.c
@@ -665,7 +665,7 @@ static int live_hwsp_wrap(void *arg)
 	if (IS_ERR(tl))
 		return PTR_ERR(tl);
 
-	if (!tl->has_initial_breadcrumb || !tl->hwsp_cacheline)
+	if (!intel_timeline_has_initial_breadcrumb(tl) || !tl->hwsp_cacheline)
 		goto out_free;
 
 	err = intel_timeline_pin(tl, NULL);
@@ -1234,7 +1234,8 @@ static int live_hwsp_rollover_user(void *arg)
 			goto out;
 
 		tl = ce->timeline;
-		if (!tl->has_initial_breadcrumb || !tl->hwsp_cacheline)
+		if (!intel_timeline_has_initial_breadcrumb(tl) ||
+		    !tl->hwsp_cacheline)
 			goto out;
 
 		timeline_rollback(tl);
-- 
2.20.1

* [Intel-gfx] [PATCH 29/41] drm/i915/gt: Track timeline GGTT offset separately from subpage offset
  2021-01-25 14:00 [Intel-gfx] [PATCH 01/41] drm/i915/selftests: Check for engine-reset errors in the middle of workarounds Chris Wilson
                   ` (26 preceding siblings ...)
  2021-01-25 14:01 ` [Intel-gfx] [PATCH 28/41] drm/i915/gt: Wrap intel_timeline.has_initial_breadcrumb Chris Wilson
@ 2021-01-25 14:01 ` Chris Wilson
  2021-01-25 14:01 ` [Intel-gfx] [PATCH 30/41] drm/i915/gt: Add timeline "mode" Chris Wilson
                   ` (16 subsequent siblings)
  44 siblings, 0 replies; 90+ messages in thread
From: Chris Wilson @ 2021-01-25 14:01 UTC (permalink / raw)
  To: intel-gfx; +Cc: thomas.hellstrom, Chris Wilson

Currently we know that the timeline status page is at most a page in
size, and so we can preserve the lower 12 bits of the offset when
relocating the status page in the GGTT. If we want to use a larger
object, such as the context state, we may not necessarily use a position
within the first page and so need more than 12 bits. Keep the offset
within the object in hwsp_offset and track the resulting GGTT address
separately in ggtt_offset.
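
A minimal sketch of the arithmetic, assuming 4KiB pages. The function
names and the addresses used are illustrative only; the first function
mirrors the old offset_in_page() recomputation, the second the separate
hwsp_offset/ggtt_offset bookkeeping.

```c
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u /* assumed page size */

/*
 * Old scheme: a single field holds the GGTT address, and only its low
 * 12 bits are carried over when the page is pinned at a new base.
 */
static inline uint32_t sketch_repin_old(uint32_t new_base, uint32_t stored)
{
	return new_base + (stored & (SKETCH_PAGE_SIZE - 1));
}

/*
 * New scheme: the offset within the object is kept intact and simply
 * added to whatever base the object is pinned at.
 */
static inline uint32_t sketch_repin_new(uint32_t new_base, uint32_t hwsp_offset)
{
	return new_base + hwsp_offset;
}
```

For an offset inside the first page both give the same answer. For an
offset of 0x1400 into a larger object, the old calculation would fold it
back to 0x400, which is the case this patch avoids.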

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
---
 drivers/gpu/drm/i915/gt/gen6_engine_cs.c       |  4 ++--
 drivers/gpu/drm/i915/gt/gen8_engine_cs.c       |  2 +-
 drivers/gpu/drm/i915/gt/intel_engine_cs.c      |  4 ++--
 drivers/gpu/drm/i915/gt/intel_timeline.c       | 17 +++++++----------
 drivers/gpu/drm/i915/gt/intel_timeline_types.h |  1 +
 drivers/gpu/drm/i915/gt/selftest_engine_cs.c   |  2 +-
 drivers/gpu/drm/i915/gt/selftest_rc6.c         |  2 +-
 drivers/gpu/drm/i915/gt/selftest_timeline.c    | 16 ++++++++--------
 8 files changed, 23 insertions(+), 25 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/gen6_engine_cs.c b/drivers/gpu/drm/i915/gt/gen6_engine_cs.c
index ce38d1bcaba3..2f59dd3bdc18 100644
--- a/drivers/gpu/drm/i915/gt/gen6_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/gen6_engine_cs.c
@@ -161,7 +161,7 @@ u32 *gen6_emit_breadcrumb_rcs(struct i915_request *rq, u32 *cs)
 		 PIPE_CONTROL_DC_FLUSH_ENABLE |
 		 PIPE_CONTROL_QW_WRITE |
 		 PIPE_CONTROL_CS_STALL);
-	*cs++ = i915_request_active_timeline(rq)->hwsp_offset |
+	*cs++ = i915_request_active_timeline(rq)->ggtt_offset |
 		PIPE_CONTROL_GLOBAL_GTT;
 	*cs++ = rq->fence.seqno;
 
@@ -359,7 +359,7 @@ u32 *gen7_emit_breadcrumb_rcs(struct i915_request *rq, u32 *cs)
 		 PIPE_CONTROL_QW_WRITE |
 		 PIPE_CONTROL_GLOBAL_GTT_IVB |
 		 PIPE_CONTROL_CS_STALL);
-	*cs++ = i915_request_active_timeline(rq)->hwsp_offset;
+	*cs++ = i915_request_active_timeline(rq)->ggtt_offset;
 	*cs++ = rq->fence.seqno;
 
 	*cs++ = MI_USER_INTERRUPT;
diff --git a/drivers/gpu/drm/i915/gt/gen8_engine_cs.c b/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
index 449633371de6..80784c5e43e3 100644
--- a/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
@@ -346,7 +346,7 @@ static u32 hwsp_offset(const struct i915_request *rq)
 	if (cl)
 		return cl->ggtt_offset;
 
-	return rcu_dereference_protected(rq->timeline, 1)->hwsp_offset;
+	return rcu_dereference_protected(rq->timeline, 1)->ggtt_offset;
 }
 
 int gen8_emit_init_breadcrumb(struct i915_request *rq)
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
index 8372c8bc4ca5..a1cd511223a2 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
@@ -1322,7 +1322,7 @@ static int print_ring(char *buf, int sz, struct i915_request *rq)
 		len = scnprintf(buf, sz,
 				"ring:{start:%08x, hwsp:%08x, seqno:%08x, runtime:%llums}, ",
 				i915_ggtt_offset(rq->ring->vma),
-				tl ? tl->hwsp_offset : 0,
+				tl ? tl->ggtt_offset : 0,
 				hwsp_seqno(rq),
 				DIV_ROUND_CLOSEST_ULL(intel_context_get_total_runtime_ns(rq->context),
 						      1000 * 1000));
@@ -1667,7 +1667,7 @@ void intel_engine_dump(struct intel_engine_cs *engine,
 
 		if (tl) {
 			drm_printf(m, "\t\tring->hwsp:   0x%08x\n",
-				   tl->hwsp_offset);
+				   tl->ggtt_offset);
 			intel_timeline_put(tl);
 		}
 
diff --git a/drivers/gpu/drm/i915/gt/intel_timeline.c b/drivers/gpu/drm/i915/gt/intel_timeline.c
index 1505dffbaba9..b684322c879c 100644
--- a/drivers/gpu/drm/i915/gt/intel_timeline.c
+++ b/drivers/gpu/drm/i915/gt/intel_timeline.c
@@ -354,13 +354,11 @@ int intel_timeline_pin(struct intel_timeline *tl, struct i915_gem_ww_ctx *ww)
 	if (err)
 		return err;
 
-	tl->hwsp_offset =
-		i915_ggtt_offset(tl->hwsp_ggtt) +
-		offset_in_page(tl->hwsp_offset);
+	tl->ggtt_offset = i915_ggtt_offset(tl->hwsp_ggtt) + tl->hwsp_offset;
 	GT_TRACE(tl->gt, "timeline:%llx using HWSP offset:%x\n",
-		 tl->fence_context, tl->hwsp_offset);
+		 tl->fence_context, tl->ggtt_offset);
 
-	cacheline_acquire(tl->hwsp_cacheline, tl->hwsp_offset);
+	cacheline_acquire(tl->hwsp_cacheline, tl->ggtt_offset);
 	if (atomic_fetch_inc(&tl->pin_count)) {
 		cacheline_release(tl->hwsp_cacheline);
 		__i915_vma_unpin(tl->hwsp_ggtt);
@@ -528,14 +526,13 @@ __intel_timeline_get_seqno(struct intel_timeline *tl,
 
 	vaddr = page_mask_bits(cl->vaddr);
 	tl->hwsp_offset = cacheline * CACHELINE_BYTES;
-	tl->hwsp_seqno =
-		memset(vaddr + tl->hwsp_offset, 0, CACHELINE_BYTES);
+	tl->hwsp_seqno = memset(vaddr + tl->hwsp_offset, 0, CACHELINE_BYTES);
 
-	tl->hwsp_offset += i915_ggtt_offset(vma);
+	tl->ggtt_offset = i915_ggtt_offset(vma) + tl->hwsp_offset;
 	GT_TRACE(tl->gt, "timeline:%llx using HWSP offset:%x\n",
-		 tl->fence_context, tl->hwsp_offset);
+		 tl->fence_context, tl->ggtt_offset);
 
-	cacheline_acquire(cl, tl->hwsp_offset);
+	cacheline_acquire(cl, tl->ggtt_offset);
 	tl->hwsp_cacheline = cl;
 
 	*seqno = timeline_advance(tl);
diff --git a/drivers/gpu/drm/i915/gt/intel_timeline_types.h b/drivers/gpu/drm/i915/gt/intel_timeline_types.h
index 9f677c9b7d06..c5995cc290a0 100644
--- a/drivers/gpu/drm/i915/gt/intel_timeline_types.h
+++ b/drivers/gpu/drm/i915/gt/intel_timeline_types.h
@@ -47,6 +47,7 @@ struct intel_timeline {
 	const u32 *hwsp_seqno;
 	struct i915_vma *hwsp_ggtt;
 	u32 hwsp_offset;
+	u32 ggtt_offset;
 
 	struct intel_timeline_cacheline *hwsp_cacheline;
 
diff --git a/drivers/gpu/drm/i915/gt/selftest_engine_cs.c b/drivers/gpu/drm/i915/gt/selftest_engine_cs.c
index 84d883de30ee..e33ec4e3b35d 100644
--- a/drivers/gpu/drm/i915/gt/selftest_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/selftest_engine_cs.c
@@ -53,7 +53,7 @@ static int write_timestamp(struct i915_request *rq, int slot)
 		cmd++;
 	*cs++ = cmd;
 	*cs++ = i915_mmio_reg_offset(RING_TIMESTAMP(rq->engine->mmio_base));
-	*cs++ = i915_request_timeline(rq)->hwsp_offset + slot * sizeof(u32);
+	*cs++ = i915_request_timeline(rq)->ggtt_offset + slot * sizeof(u32);
 	*cs++ = 0;
 
 	intel_ring_advance(rq, cs);
diff --git a/drivers/gpu/drm/i915/gt/selftest_rc6.c b/drivers/gpu/drm/i915/gt/selftest_rc6.c
index f097e420ac45..285cead849dd 100644
--- a/drivers/gpu/drm/i915/gt/selftest_rc6.c
+++ b/drivers/gpu/drm/i915/gt/selftest_rc6.c
@@ -137,7 +137,7 @@ static const u32 *__live_rc6_ctx(struct intel_context *ce)
 
 	*cs++ = cmd;
 	*cs++ = i915_mmio_reg_offset(GEN8_RC6_CTX_INFO);
-	*cs++ = ce->timeline->hwsp_offset + 8;
+	*cs++ = ce->timeline->ggtt_offset + 8;
 	*cs++ = 0;
 	intel_ring_advance(rq, cs);
 
diff --git a/drivers/gpu/drm/i915/gt/selftest_timeline.c b/drivers/gpu/drm/i915/gt/selftest_timeline.c
index 562a450d2832..6b412228a6fd 100644
--- a/drivers/gpu/drm/i915/gt/selftest_timeline.c
+++ b/drivers/gpu/drm/i915/gt/selftest_timeline.c
@@ -468,7 +468,7 @@ tl_write(struct intel_timeline *tl, struct intel_engine_cs *engine, u32 value)
 
 	i915_request_get(rq);
 
-	err = emit_ggtt_store_dw(rq, tl->hwsp_offset, value);
+	err = emit_ggtt_store_dw(rq, tl->ggtt_offset, value);
 	i915_request_add(rq);
 	if (err) {
 		i915_request_put(rq);
@@ -564,7 +564,7 @@ static int live_hwsp_engine(void *arg)
 
 		if (!err && READ_ONCE(*tl->hwsp_seqno) != n) {
 			GEM_TRACE_ERR("Invalid seqno:%lu stored in timeline %llu @ %x, found 0x%x\n",
-				      n, tl->fence_context, tl->hwsp_offset, *tl->hwsp_seqno);
+				      n, tl->fence_context, tl->ggtt_offset, *tl->hwsp_seqno);
 			GEM_TRACE_DUMP();
 			err = -EINVAL;
 		}
@@ -636,7 +636,7 @@ static int live_hwsp_alternate(void *arg)
 
 		if (!err && READ_ONCE(*tl->hwsp_seqno) != n) {
 			GEM_TRACE_ERR("Invalid seqno:%lu stored in timeline %llu @ %x, found 0x%x\n",
-				      n, tl->fence_context, tl->hwsp_offset, *tl->hwsp_seqno);
+				      n, tl->fence_context, tl->ggtt_offset, *tl->hwsp_seqno);
 			GEM_TRACE_DUMP();
 			err = -EINVAL;
 		}
@@ -696,9 +696,9 @@ static int live_hwsp_wrap(void *arg)
 			goto out;
 		}
 		pr_debug("seqno[0]:%08x, hwsp_offset:%08x\n",
-			 seqno[0], tl->hwsp_offset);
+			 seqno[0], tl->ggtt_offset);
 
-		err = emit_ggtt_store_dw(rq, tl->hwsp_offset, seqno[0]);
+		err = emit_ggtt_store_dw(rq, tl->ggtt_offset, seqno[0]);
 		if (err) {
 			i915_request_add(rq);
 			goto out;
@@ -713,9 +713,9 @@ static int live_hwsp_wrap(void *arg)
 			goto out;
 		}
 		pr_debug("seqno[1]:%08x, hwsp_offset:%08x\n",
-			 seqno[1], tl->hwsp_offset);
+			 seqno[1], tl->ggtt_offset);
 
-		err = emit_ggtt_store_dw(rq, tl->hwsp_offset, seqno[1]);
+		err = emit_ggtt_store_dw(rq, tl->ggtt_offset, seqno[1]);
 		if (err) {
 			i915_request_add(rq);
 			goto out;
@@ -1343,7 +1343,7 @@ static int live_hwsp_recycle(void *arg)
 			if (READ_ONCE(*tl->hwsp_seqno) != count) {
 				GEM_TRACE_ERR("Invalid seqno:%lu stored in timeline %llu @ %x found 0x%x\n",
 					      count, tl->fence_context,
-					      tl->hwsp_offset, *tl->hwsp_seqno);
+					      tl->ggtt_offset, *tl->hwsp_seqno);
 				GEM_TRACE_DUMP();
 				err = -EINVAL;
 			}
-- 
2.20.1

* [Intel-gfx] [PATCH 30/41] drm/i915/gt: Add timeline "mode"
  2021-01-25 14:00 [Intel-gfx] [PATCH 01/41] drm/i915/selftests: Check for engine-reset errors in the middle of workarounds Chris Wilson
                   ` (27 preceding siblings ...)
  2021-01-25 14:01 ` [Intel-gfx] [PATCH 29/41] drm/i915/gt: Track timeline GGTT offset separately from subpage offset Chris Wilson
@ 2021-01-25 14:01 ` Chris Wilson
  2021-01-25 14:01 ` [Intel-gfx] [PATCH 31/41] drm/i915/gt: Use indices for writing into relative timelines Chris Wilson
                   ` (15 subsequent siblings)
  44 siblings, 0 replies; 90+ messages in thread
From: Chris Wilson @ 2021-01-25 14:01 UTC (permalink / raw)
  To: intel-gfx; +Cc: thomas.hellstrom, Chris Wilson

Explicitly differentiate between the absolute and relative timelines,
and the global HWSP and ppHWSP relative offsets. When using a timeline
that is relative to a known status page, we can replace the absolute
addressing in the commands with indexed variants.
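
The mode selection can be sketched in isolation as below. This is a
simplified stand-in, not the driver code: the struct, the has_page
parameter and the sketch_ names are hypothetical, and the absolute case
uses a placeholder offset where the patch picks a cacheline.

```c
#include <stdint.h>

enum sketch_timeline_mode {
	SKETCH_ABSOLUTE = 0,
	SKETCH_RELATIVE_CONTEXT = 1u << 0,
	SKETCH_RELATIVE_ENGINE = 1u << 1,
};

struct sketch_timeline {
	enum sketch_timeline_mode mode;
	uint32_t hwsp_offset;
};

/* has_page: nonzero when the caller supplies an existing status page. */
static inline void sketch_timeline_init(struct sketch_timeline *tl,
					int has_page, uint32_t offset)
{
	if (!has_page) {
		/* Own cacheline; the patch uses cacheline * CACHELINE_BYTES. */
		tl->mode = SKETCH_ABSOLUTE;
		tl->hwsp_offset = 0;
	} else if (offset & SKETCH_RELATIVE_CONTEXT) {
		/* The mode flag rides in the low bit of the offset; strip it. */
		tl->mode = SKETCH_RELATIVE_CONTEXT;
		tl->hwsp_offset = offset & ~(uint32_t)SKETCH_RELATIVE_CONTEXT;
	} else {
		tl->mode = SKETCH_RELATIVE_ENGINE;
		tl->hwsp_offset = offset;
	}
}

/* Only absolute timelines keep the initial breadcrumb. */
static inline int sketch_has_initial_breadcrumb(const struct sketch_timeline *tl)
{
	return tl->mode == SKETCH_ABSOLUTE;
}
```

Encoding the mode in the low bit works because the seqno slot is at
least dword aligned, so bit 0 of a valid offset is otherwise unused.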

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
---
 drivers/gpu/drm/i915/gt/intel_timeline.c      | 21 ++++++++++++++++---
 drivers/gpu/drm/i915/gt/intel_timeline.h      |  2 +-
 .../gpu/drm/i915/gt/intel_timeline_types.h    | 10 +++++++--
 3 files changed, 27 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_timeline.c b/drivers/gpu/drm/i915/gt/intel_timeline.c
index b684322c879c..69052495c64a 100644
--- a/drivers/gpu/drm/i915/gt/intel_timeline.c
+++ b/drivers/gpu/drm/i915/gt/intel_timeline.c
@@ -226,7 +226,6 @@ static int intel_timeline_init(struct intel_timeline *timeline,
 
 	timeline->gt = gt;
 
-	timeline->has_initial_breadcrumb = !hwsp;
 	timeline->hwsp_cacheline = NULL;
 
 	if (!hwsp) {
@@ -243,13 +242,29 @@ static int intel_timeline_init(struct intel_timeline *timeline,
 			return PTR_ERR(cl);
 		}
 
+		timeline->mode = INTEL_TIMELINE_ABSOLUTE;
 		timeline->hwsp_cacheline = cl;
 		timeline->hwsp_offset = cacheline * CACHELINE_BYTES;
 
 		vaddr = page_mask_bits(cl->vaddr);
 	} else {
-		timeline->hwsp_offset = offset;
-		vaddr = i915_gem_object_pin_map(hwsp->obj, I915_MAP_WB);
+		int preferred;
+
+		if (offset & INTEL_TIMELINE_RELATIVE_CONTEXT) {
+			timeline->mode = INTEL_TIMELINE_RELATIVE_CONTEXT;
+			timeline->hwsp_offset =
+				offset & ~INTEL_TIMELINE_RELATIVE_CONTEXT;
+			preferred = i915_coherent_map_type(gt->i915);
+		} else {
+			timeline->mode = INTEL_TIMELINE_RELATIVE_ENGINE;
+			timeline->hwsp_offset = offset;
+			preferred = I915_MAP_WB;
+		}
+
+		vaddr = i915_gem_object_pin_map(hwsp->obj,
+						preferred | I915_MAP_OVERRIDE);
+		if (IS_ERR(vaddr))
+			vaddr = i915_gem_object_pin_map(hwsp->obj, I915_MAP_WC);
 		if (IS_ERR(vaddr))
 			return PTR_ERR(vaddr);
 	}
diff --git a/drivers/gpu/drm/i915/gt/intel_timeline.h b/drivers/gpu/drm/i915/gt/intel_timeline.h
index 7d6218b55df6..e1d522329757 100644
--- a/drivers/gpu/drm/i915/gt/intel_timeline.h
+++ b/drivers/gpu/drm/i915/gt/intel_timeline.h
@@ -45,7 +45,7 @@ static inline void intel_timeline_put(struct intel_timeline *timeline)
 static inline bool
 intel_timeline_has_initial_breadcrumb(const struct intel_timeline *tl)
 {
-	return tl->has_initial_breadcrumb;
+	return tl->mode == INTEL_TIMELINE_ABSOLUTE;
 }
 
 static inline int __intel_timeline_sync_set(struct intel_timeline *tl,
diff --git a/drivers/gpu/drm/i915/gt/intel_timeline_types.h b/drivers/gpu/drm/i915/gt/intel_timeline_types.h
index c5995cc290a0..61938d103a13 100644
--- a/drivers/gpu/drm/i915/gt/intel_timeline_types.h
+++ b/drivers/gpu/drm/i915/gt/intel_timeline_types.h
@@ -19,6 +19,12 @@ struct i915_syncmap;
 struct intel_gt;
 struct intel_timeline_hwsp;
 
+enum intel_timeline_mode {
+	INTEL_TIMELINE_ABSOLUTE = 0,
+	INTEL_TIMELINE_RELATIVE_CONTEXT = BIT(0),
+	INTEL_TIMELINE_RELATIVE_ENGINE  = BIT(1),
+};
+
 struct intel_timeline {
 	u64 fence_context;
 	u32 seqno;
@@ -44,6 +50,8 @@ struct intel_timeline {
 	atomic_t pin_count;
 	atomic_t active_count;
 
+	enum intel_timeline_mode mode;
+
 	const u32 *hwsp_seqno;
 	struct i915_vma *hwsp_ggtt;
 	u32 hwsp_offset;
@@ -51,8 +59,6 @@ struct intel_timeline {
 
 	struct intel_timeline_cacheline *hwsp_cacheline;
 
-	bool has_initial_breadcrumb;
-
 	/**
 	 * List of breadcrumbs associated with GPU requests currently
 	 * outstanding.
-- 
2.20.1

* [Intel-gfx] [PATCH 31/41] drm/i915/gt: Use indices for writing into relative timelines
  2021-01-25 14:00 [Intel-gfx] [PATCH 01/41] drm/i915/selftests: Check for engine-reset errors in the middle of workarounds Chris Wilson
                   ` (28 preceding siblings ...)
  2021-01-25 14:01 ` [Intel-gfx] [PATCH 30/41] drm/i915/gt: Add timeline "mode" Chris Wilson
@ 2021-01-25 14:01 ` Chris Wilson
  2021-01-25 14:01 ` [Intel-gfx] [PATCH 32/41] drm/i915/selftests: Exercise relative timeline modes Chris Wilson
                   ` (14 subsequent siblings)
  44 siblings, 0 replies; 90+ messages in thread
From: Chris Wilson @ 2021-01-25 14:01 UTC (permalink / raw)
  To: intel-gfx; +Cc: thomas.hellstrom, Chris Wilson

Relative timelines are relative to either the global or per-process
HWSP, and so we can replace the absolute addressing with store-index
variants for position invariance.
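
The per-mode choice made in each breadcrumb emitter can be summarised by
the sketch below. The flag values and names are placeholders rather than
the hardware encodings, and the "use GTT" bit is kept in the flags word
for simplicity (the MI_FLUSH_DW path in the patch ORs it into the offset
instead).

```c
#include <stdint.h>

#define SKETCH_PAGE_MASK   0xfffu
#define SKETCH_STORE_INDEX (1u << 0) /* placeholder, not the hardware bit */
#define SKETCH_USE_GTT     (1u << 1) /* placeholder, not the hardware bit */

enum sketch_mode {
	SKETCH_ABSOLUTE = 0,
	SKETCH_RELATIVE_CONTEXT = 1u << 0,
	SKETCH_RELATIVE_ENGINE = 1u << 1,
};

struct sketch_write {
	uint32_t offset;
	uint32_t flags;
};

static inline struct sketch_write
sketch_pick_write(enum sketch_mode mode, uint32_t ggtt_offset)
{
	struct sketch_write w = { ggtt_offset, 0 };

	/* Relative timelines address the slot as an index into a status page. */
	if (mode != SKETCH_ABSOLUTE) {
		w.offset &= SKETCH_PAGE_MASK;
		w.flags |= SKETCH_STORE_INDEX;
	}

	/* Anything not relative to the context's own ppHWSP is a GGTT write. */
	if (mode != SKETCH_RELATIVE_CONTEXT)
		w.flags |= SKETCH_USE_GTT;

	return w;
}
```

Only the in-page part of the offset is emitted for relative timelines,
which is what makes the command independent of where the page is bound.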

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
---
 drivers/gpu/drm/i915/gt/gen8_engine_cs.c | 98 +++++++++++++++++-------
 drivers/gpu/drm/i915/gt/intel_timeline.h | 12 +++
 2 files changed, 82 insertions(+), 28 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/gen8_engine_cs.c b/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
index 80784c5e43e3..8b3a96b1afe0 100644
--- a/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
@@ -518,7 +518,19 @@ gen8_emit_fini_breadcrumb_tail(struct i915_request *rq, u32 *cs)
 
 static u32 *emit_xcs_breadcrumb(struct i915_request *rq, u32 *cs)
 {
-	return gen8_emit_ggtt_write(cs, rq->fence.seqno, hwsp_offset(rq), 0);
+	struct intel_timeline *tl = rcu_dereference_protected(rq->timeline, 1);
+	unsigned int flags = MI_FLUSH_DW_OP_STOREDW;
+	u32 offset = hwsp_offset(rq);
+
+	if (intel_timeline_is_relative(tl)) {
+		offset = offset_in_page(offset);
+		flags |= MI_FLUSH_DW_STORE_INDEX;
+	}
+	GEM_BUG_ON(offset & 7);
+	if (!intel_timeline_in_context(tl))
+		offset |= MI_FLUSH_DW_USE_GTT;
+
+	return __gen8_emit_flush_dw(cs, rq->fence.seqno, offset, flags);
 }
 
 u32 *gen8_emit_fini_breadcrumb_xcs(struct i915_request *rq, u32 *cs)
@@ -528,6 +540,18 @@ u32 *gen8_emit_fini_breadcrumb_xcs(struct i915_request *rq, u32 *cs)
 
 u32 *gen8_emit_fini_breadcrumb_rcs(struct i915_request *rq, u32 *cs)
 {
+	struct intel_timeline *tl = rcu_dereference_protected(rq->timeline, 1);
+	unsigned int flags = PIPE_CONTROL_FLUSH_ENABLE | PIPE_CONTROL_CS_STALL;
+	u32 offset = hwsp_offset(rq);
+
+	if (intel_timeline_is_relative(tl)) {
+		offset = offset_in_page(offset);
+		flags |= PIPE_CONTROL_STORE_DATA_INDEX;
+	}
+	GEM_BUG_ON(offset & 7);
+	if (!intel_timeline_in_context(tl))
+		flags |= PIPE_CONTROL_GLOBAL_GTT_IVB;
+
 	cs = gen8_emit_pipe_control(cs,
 				    PIPE_CONTROL_RENDER_TARGET_CACHE_FLUSH |
 				    PIPE_CONTROL_DEPTH_CACHE_FLUSH |
@@ -535,26 +559,33 @@ u32 *gen8_emit_fini_breadcrumb_rcs(struct i915_request *rq, u32 *cs)
 				    0);
 
 	/* XXX flush+write+CS_STALL all in one upsets gem_concurrent_blt:kbl */
-	cs = gen8_emit_ggtt_write_rcs(cs,
-				      rq->fence.seqno,
-				      hwsp_offset(rq),
-				      PIPE_CONTROL_FLUSH_ENABLE |
-				      PIPE_CONTROL_CS_STALL);
+	cs = __gen8_emit_write_rcs(cs, rq->fence.seqno, offset, 0, flags);
 
 	return gen8_emit_fini_breadcrumb_tail(rq, cs);
 }
 
 u32 *gen11_emit_fini_breadcrumb_rcs(struct i915_request *rq, u32 *cs)
 {
-	cs = gen8_emit_ggtt_write_rcs(cs,
-				      rq->fence.seqno,
-				      hwsp_offset(rq),
-				      PIPE_CONTROL_CS_STALL |
-				      PIPE_CONTROL_TILE_CACHE_FLUSH |
-				      PIPE_CONTROL_RENDER_TARGET_CACHE_FLUSH |
-				      PIPE_CONTROL_DEPTH_CACHE_FLUSH |
-				      PIPE_CONTROL_DC_FLUSH_ENABLE |
-				      PIPE_CONTROL_FLUSH_ENABLE);
+	struct intel_timeline *tl = rcu_dereference_protected(rq->timeline, 1);
+	u32 offset = hwsp_offset(rq);
+	unsigned int flags;
+
+	flags = (PIPE_CONTROL_CS_STALL |
+		 PIPE_CONTROL_TILE_CACHE_FLUSH |
+		 PIPE_CONTROL_RENDER_TARGET_CACHE_FLUSH |
+		 PIPE_CONTROL_DEPTH_CACHE_FLUSH |
+		 PIPE_CONTROL_DC_FLUSH_ENABLE |
+		 PIPE_CONTROL_FLUSH_ENABLE);
+
+	if (intel_timeline_is_relative(tl)) {
+		offset = offset_in_page(offset);
+		flags |= PIPE_CONTROL_STORE_DATA_INDEX;
+	}
+	GEM_BUG_ON(offset & 7);
+	if (!intel_timeline_in_context(tl))
+		flags |= PIPE_CONTROL_GLOBAL_GTT_IVB;
+
+	cs = __gen8_emit_write_rcs(cs, rq->fence.seqno, offset, 0, flags);
 
 	return gen8_emit_fini_breadcrumb_tail(rq, cs);
 }
@@ -617,19 +648,30 @@ u32 *gen12_emit_fini_breadcrumb_xcs(struct i915_request *rq, u32 *cs)
 
 u32 *gen12_emit_fini_breadcrumb_rcs(struct i915_request *rq, u32 *cs)
 {
-	cs = gen12_emit_ggtt_write_rcs(cs,
-				       rq->fence.seqno,
-				       hwsp_offset(rq),
-				       PIPE_CONTROL0_HDC_PIPELINE_FLUSH,
-				       PIPE_CONTROL_CS_STALL |
-				       PIPE_CONTROL_TILE_CACHE_FLUSH |
-				       PIPE_CONTROL_FLUSH_L3 |
-				       PIPE_CONTROL_RENDER_TARGET_CACHE_FLUSH |
-				       PIPE_CONTROL_DEPTH_CACHE_FLUSH |
-				       /* Wa_1409600907:tgl */
-				       PIPE_CONTROL_DEPTH_STALL |
-				       PIPE_CONTROL_DC_FLUSH_ENABLE |
-				       PIPE_CONTROL_FLUSH_ENABLE);
+	struct intel_timeline *tl = rcu_dereference_protected(rq->timeline, 1);
+	u32 offset = hwsp_offset(rq);
+	unsigned int flags;
+
+	flags = (PIPE_CONTROL_CS_STALL |
+		 PIPE_CONTROL_TILE_CACHE_FLUSH |
+		 PIPE_CONTROL_FLUSH_L3 |
+		 PIPE_CONTROL_RENDER_TARGET_CACHE_FLUSH |
+		 PIPE_CONTROL_DEPTH_CACHE_FLUSH |
+		 /* Wa_1409600907:tgl */
+		 PIPE_CONTROL_DEPTH_STALL |
+		 PIPE_CONTROL_DC_FLUSH_ENABLE |
+		 PIPE_CONTROL_FLUSH_ENABLE);
+
+	if (intel_timeline_is_relative(tl)) {
+		offset = offset_in_page(offset);
+		flags |= PIPE_CONTROL_STORE_DATA_INDEX;
+	}
+	GEM_BUG_ON(offset & 7);
+	if (!intel_timeline_in_context(tl))
+		flags |= PIPE_CONTROL_GLOBAL_GTT_IVB;
+
+	cs = __gen8_emit_write_rcs(cs, rq->fence.seqno, offset,
+				   PIPE_CONTROL0_HDC_PIPELINE_FLUSH, flags);
 
 	return gen12_emit_fini_breadcrumb_tail(rq, cs);
 }
diff --git a/drivers/gpu/drm/i915/gt/intel_timeline.h b/drivers/gpu/drm/i915/gt/intel_timeline.h
index e1d522329757..9859a77a6f54 100644
--- a/drivers/gpu/drm/i915/gt/intel_timeline.h
+++ b/drivers/gpu/drm/i915/gt/intel_timeline.h
@@ -48,6 +48,18 @@ intel_timeline_has_initial_breadcrumb(const struct intel_timeline *tl)
 	return tl->mode == INTEL_TIMELINE_ABSOLUTE;
 }
 
+static inline bool
+intel_timeline_is_relative(const struct intel_timeline *tl)
+{
+	return tl->mode != INTEL_TIMELINE_ABSOLUTE;
+}
+
+static inline bool
+intel_timeline_in_context(const struct intel_timeline *tl)
+{
+	return tl->mode == INTEL_TIMELINE_RELATIVE_CONTEXT;
+}
+
 static inline int __intel_timeline_sync_set(struct intel_timeline *tl,
 					    u64 context, u32 seqno)
 {
-- 
2.20.1

* [Intel-gfx] [PATCH 32/41] drm/i915/selftests: Exercise relative timeline modes
  2021-01-25 14:00 [Intel-gfx] [PATCH 01/41] drm/i915/selftests: Check for engine-reset errors in the middle of workarounds Chris Wilson
                   ` (29 preceding siblings ...)
  2021-01-25 14:01 ` [Intel-gfx] [PATCH 31/41] drm/i915/gt: Use indices for writing into relative timelines Chris Wilson
@ 2021-01-25 14:01 ` Chris Wilson
  2021-01-25 14:01 ` [Intel-gfx] [PATCH 33/41] drm/i915/gt: Use ppHWSP for unshared non-semaphore related timelines Chris Wilson
                   ` (13 subsequent siblings)
  44 siblings, 0 replies; 90+ messages in thread
From: Chris Wilson @ 2021-01-25 14:01 UTC (permalink / raw)
  To: intel-gfx; +Cc: thomas.hellstrom, Chris Wilson

A quick test to verify that the backend accepts each type of timeline
and can use them to track and control request emission.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/gt/selftest_timeline.c | 105 ++++++++++++++++++++
 1 file changed, 105 insertions(+)

diff --git a/drivers/gpu/drm/i915/gt/selftest_timeline.c b/drivers/gpu/drm/i915/gt/selftest_timeline.c
index 6b412228a6fd..dcc03522b277 100644
--- a/drivers/gpu/drm/i915/gt/selftest_timeline.c
+++ b/drivers/gpu/drm/i915/gt/selftest_timeline.c
@@ -1364,9 +1364,114 @@ static int live_hwsp_recycle(void *arg)
 	return err;
 }
 
+static int live_hwsp_relative(void *arg)
+{
+	struct intel_gt *gt = arg;
+	struct intel_engine_cs *engine;
+	enum intel_engine_id id;
+
+	/*
+	 * Check backend support for different timeline modes.
+	 */
+
+	for_each_engine(engine, gt, id) {
+		enum intel_timeline_mode mode;
+
+		if (!intel_engine_has_scheduler(engine))
+			continue;
+
+		for (mode = INTEL_TIMELINE_ABSOLUTE;
+		     mode <= INTEL_TIMELINE_RELATIVE_ENGINE;
+		     mode++) {
+			struct intel_timeline *tl;
+			struct i915_request *rq;
+			struct intel_context *ce;
+			const char *msg;
+			int err;
+
+			if (mode == INTEL_TIMELINE_RELATIVE_CONTEXT &&
+			    !HAS_EXECLISTS(gt->i915))
+				continue;
+
+			ce = intel_context_create(engine);
+			if (IS_ERR(ce))
+				return PTR_ERR(ce);
+
+			err = intel_context_alloc_state(ce);
+			if (err) {
+				intel_context_put(ce);
+				return err;
+			}
+
+			switch (mode) {
+			case INTEL_TIMELINE_ABSOLUTE:
+				tl = intel_timeline_create(gt);
+				msg = "local";
+				break;
+
+			case INTEL_TIMELINE_RELATIVE_CONTEXT:
+				tl = __intel_timeline_create(gt,
+							     ce->state,
+							     INTEL_TIMELINE_RELATIVE_CONTEXT |
+							     0x400);
+				msg = "ppHWSP";
+				break;
+
+			case INTEL_TIMELINE_RELATIVE_ENGINE:
+				tl = __intel_timeline_create(gt,
+							     engine->status_page.vma,
+							     0x400);
+				msg = "HWSP";
+				break;
+			default:
+				continue;
+			}
+			if (IS_ERR(tl)) {
+				intel_context_put(ce);
+				return PTR_ERR(tl);
+			}
+
+			pr_info("Testing %s timeline on %s\n",
+				msg, engine->name);
+
+			intel_timeline_put(ce->timeline);
+			ce->timeline = tl;
+
+			err = intel_timeline_pin(tl, NULL);
+			if (err) {
+				intel_context_put(ce);
+				return err;
+			}
+			tl->seqno = 0xc0000000;
+			WRITE_ONCE(*(u32 *)tl->hwsp_seqno, tl->seqno);
+			intel_timeline_unpin(tl);
+
+			rq = intel_context_create_request(ce);
+			intel_context_put(ce);
+			if (IS_ERR(rq))
+				return PTR_ERR(rq);
+
+			GEM_BUG_ON(rcu_access_pointer(rq->timeline) != tl);
+
+			i915_request_get(rq);
+			i915_request_add(rq);
+
+			if (i915_request_wait(rq, 0, HZ / 5) < 0) {
+				i915_request_put(rq);
+				return -EIO;
+			}
+
+			i915_request_put(rq);
+		}
+	}
+
+	return 0;
+}
+
 int intel_timeline_live_selftests(struct drm_i915_private *i915)
 {
 	static const struct i915_subtest tests[] = {
+		SUBTEST(live_hwsp_relative),
 		SUBTEST(live_hwsp_recycle),
 		SUBTEST(live_hwsp_engine),
 		SUBTEST(live_hwsp_alternate),
-- 
2.20.1

* [Intel-gfx] [PATCH 33/41] drm/i915/gt: Use ppHWSP for unshared non-semaphore related timelines
  2021-01-25 14:00 [Intel-gfx] [PATCH 01/41] drm/i915/selftests: Check for engine-reset errors in the middle of workarounds Chris Wilson
                   ` (30 preceding siblings ...)
  2021-01-25 14:01 ` [Intel-gfx] [PATCH 32/41] drm/i915/selftests: Exercise relative timeline modes Chris Wilson
@ 2021-01-25 14:01 ` Chris Wilson
  2021-01-25 14:01 ` [Intel-gfx] [PATCH 34/41] Restore "drm/i915: drop engine_pin/unpin_breadcrumbs_irq" Chris Wilson
                   ` (12 subsequent siblings)
  44 siblings, 0 replies; 90+ messages in thread
From: Chris Wilson @ 2021-01-25 14:01 UTC (permalink / raw)
  To: intel-gfx; +Cc: thomas.hellstrom, Chris Wilson

When we are not using semaphores with a context/engine, we can simply
reuse the same seqno location across wraps, but we still require each
timeline to have its own address. For LRC submission, each context is
prefixed by a per-process HWSP, which provides us with a unique location
for each context-local timeline. A shared timeline that is common to
multiple contexts will continue to use a separate page.

This enables us to create position-invariant contexts, should we feel
the need to relocate them.

Initially, ppHWSP timelines are used automatically on Broadwell/Braswell,
as those platforms do not require independent timelines.
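
The resulting three-way choice in lrc_alloc() can be read as the
following decision function. It is a sketch with hypothetical names and
boolean inputs standing in for the driver's checks, not the actual
allocation code.

```c
enum sketch_timeline_choice {
	SKETCH_CHOICE_PINNED_ENGINE, /* pre-assigned timeline in the engine HWSP */
	SKETCH_CHOICE_OWN_PAGE,      /* separate page, absolute addressing */
	SKETCH_CHOICE_PPHWSP,        /* slot in the context's per-process HWSP */
};

static inline enum sketch_timeline_choice
sketch_choose_timeline(int has_preset_timeline, int engine_has_semaphores)
{
	/* A timeline set up before allocation takes precedence. */
	if (has_preset_timeline)
		return SKETCH_CHOICE_PINNED_ENGINE;

	/* Semaphore users still need an independent seqno location. */
	if (engine_has_semaphores)
		return SKETCH_CHOICE_OWN_PAGE;

	/* Otherwise reuse the ppHWSP that prefixes the context image. */
	return SKETCH_CHOICE_PPHWSP;
}
```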

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
---
 drivers/gpu/drm/i915/gt/intel_lrc.c | 12 +++++++++++-
 1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
index 33b529dcb05f..6208a3d5a93d 100644
--- a/drivers/gpu/drm/i915/gt/intel_lrc.c
+++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
@@ -829,6 +829,14 @@ pinned_timeline(struct intel_context *ce, struct intel_engine_cs *engine)
 	return intel_timeline_create_from_engine(engine, page_unmask_bits(tl));
 }
 
+static struct intel_timeline *
+pphwsp_timeline(struct intel_context *ce, struct i915_vma *state)
+{
+	return __intel_timeline_create(ce->engine->gt, state,
+				       I915_GEM_HWS_SEQNO_ADDR |
+				       INTEL_TIMELINE_RELATIVE_CONTEXT);
+}
+
 int lrc_alloc(struct intel_context *ce, struct intel_engine_cs *engine)
 {
 	struct intel_ring *ring;
@@ -856,8 +864,10 @@ int lrc_alloc(struct intel_context *ce, struct intel_engine_cs *engine)
 		 */
 		if (unlikely(ce->timeline))
 			tl = pinned_timeline(ce, engine);
-		else
+		else if (intel_engine_has_semaphores(engine))
 			tl = intel_timeline_create(engine->gt);
+		else
+			tl = pphwsp_timeline(ce, vma);
 		if (IS_ERR(tl)) {
 			err = PTR_ERR(tl);
 			goto err_ring;
-- 
2.20.1


* [Intel-gfx] [PATCH 34/41] Restore "drm/i915: drop engine_pin/unpin_breadcrumbs_irq"
  2021-01-25 14:00 [Intel-gfx] [PATCH 01/41] drm/i915/selftests: Check for engine-reset errors in the middle of workarounds Chris Wilson
                   ` (31 preceding siblings ...)
  2021-01-25 14:01 ` [Intel-gfx] [PATCH 33/41] drm/i915/gt: Use ppHWSP for unshared non-semaphore related timelines Chris Wilson
@ 2021-01-25 14:01 ` Chris Wilson
  2021-01-25 14:01 ` [Intel-gfx] [PATCH 35/41] drm/i915/gt: Couple tasklet scheduling for all CS interrupts Chris Wilson
                   ` (11 subsequent siblings)
  44 siblings, 0 replies; 90+ messages in thread
From: Chris Wilson @ 2021-01-25 14:01 UTC (permalink / raw)
  To: intel-gfx; +Cc: thomas.hellstrom, Chris Wilson

This was removed in commit 478ffad6d690 ("drm/i915: drop
engine_pin/unpin_breadcrumbs_irq") as the last user had been removed,
but now there is a promise of a new user in the next patch.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
---
 drivers/gpu/drm/i915/gt/intel_breadcrumbs.c | 24 +++++++++++++++++++++
 drivers/gpu/drm/i915/gt/intel_breadcrumbs.h |  3 +++
 2 files changed, 27 insertions(+)

diff --git a/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c b/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c
index 38cc42783dfb..9e67810c7767 100644
--- a/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c
+++ b/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c
@@ -310,6 +310,30 @@ void intel_breadcrumbs_reset(struct intel_breadcrumbs *b)
 	spin_unlock_irqrestore(&b->irq_lock, flags);
 }
 
+void intel_breadcrumbs_pin_irq(struct intel_breadcrumbs *b)
+{
+	if (GEM_DEBUG_WARN_ON(!b->irq_engine))
+		return;
+
+	spin_lock_irq(&b->irq_lock);
+	if (!b->irq_enabled++)
+		irq_enable(b->irq_engine);
+	GEM_BUG_ON(!b->irq_enabled); /* no overflow! */
+	spin_unlock_irq(&b->irq_lock);
+}
+
+void intel_breadcrumbs_unpin_irq(struct intel_breadcrumbs *b)
+{
+	if (GEM_DEBUG_WARN_ON(!b->irq_engine))
+		return;
+
+	spin_lock_irq(&b->irq_lock);
+	GEM_BUG_ON(!b->irq_enabled); /* no underflow! */
+	if (!--b->irq_enabled)
+		irq_disable(b->irq_engine);
+	spin_unlock_irq(&b->irq_lock);
+}
+
 void __intel_breadcrumbs_park(struct intel_breadcrumbs *b)
 {
 	if (!READ_ONCE(b->irq_armed))
diff --git a/drivers/gpu/drm/i915/gt/intel_breadcrumbs.h b/drivers/gpu/drm/i915/gt/intel_breadcrumbs.h
index 3ce5ce270b04..c2bb3a79ca9f 100644
--- a/drivers/gpu/drm/i915/gt/intel_breadcrumbs.h
+++ b/drivers/gpu/drm/i915/gt/intel_breadcrumbs.h
@@ -19,6 +19,9 @@ struct intel_breadcrumbs *
 intel_breadcrumbs_create(struct intel_engine_cs *irq_engine);
 void intel_breadcrumbs_free(struct intel_breadcrumbs *b);
 
+void intel_breadcrumbs_pin_irq(struct intel_breadcrumbs *b);
+void intel_breadcrumbs_unpin_irq(struct intel_breadcrumbs *b);
+
 void intel_breadcrumbs_reset(struct intel_breadcrumbs *b);
 void __intel_breadcrumbs_park(struct intel_breadcrumbs *b);
 
-- 
2.20.1


* [Intel-gfx] [PATCH 35/41] drm/i915/gt: Couple tasklet scheduling for all CS interrupts
  2021-01-25 14:00 [Intel-gfx] [PATCH 01/41] drm/i915/selftests: Check for engine-reset errors in the middle of workarounds Chris Wilson
                   ` (32 preceding siblings ...)
  2021-01-25 14:01 ` [Intel-gfx] [PATCH 34/41] Restore "drm/i915: drop engine_pin/unpin_breadcrumbs_irq" Chris Wilson
@ 2021-01-25 14:01 ` Chris Wilson
  2021-01-25 14:01 ` [Intel-gfx] [PATCH 36/41] drm/i915/gt: Support creation of 'internal' rings Chris Wilson
                   ` (10 subsequent siblings)
  44 siblings, 0 replies; 90+ messages in thread
From: Chris Wilson @ 2021-01-25 14:01 UTC (permalink / raw)
  To: intel-gfx; +Cc: thomas.hellstrom, Chris Wilson

If any engine asks for the tasklet to be kicked from the CS interrupt,
do so. Currently, this is used by the execlists scheduler backends to
feed in the next request to the HW, and similarly could be used by a
ring scheduler, as will be seen in the next patch.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
---
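As a rough illustration of the behaviour described above, the following
standalone sketch models what the new gen2_engine_cs_irq() helper does:
breadcrumbs are always signalled, and the tasklet is kicked only when the
engine has asked for it. The struct and function names are hypothetical,
and counters stand in for the real side effects.

```c
#include <stdbool.h>

/* Hypothetical, simplified engine: counters replace the real side effects. */
struct model_engine {
	bool needs_breadcrumb_tasklet; /* set by backends that submit from a tasklet */
	unsigned int breadcrumbs_signaled;
	unsigned int tasklet_kicks;
};

/* Models gen2_engine_cs_irq(): always signal, kick only on request. */
static void model_cs_irq(struct model_engine *engine)
{
	engine->breadcrumbs_signaled++;
	if (engine->needs_breadcrumb_tasklet)
		engine->tasklet_kicks++;
}
```

Engines that never set the flag see the same behaviour as before the
patch, which is why every legacy interrupt handler can be switched over
to the common helper.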
 drivers/gpu/drm/i915/gt/intel_gt_irq.c | 17 ++++++++++++-----
 drivers/gpu/drm/i915/gt/intel_gt_irq.h |  3 +++
 drivers/gpu/drm/i915/gt/intel_rps.c    |  2 +-
 drivers/gpu/drm/i915/i915_irq.c        |  8 ++++----
 4 files changed, 20 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_gt_irq.c b/drivers/gpu/drm/i915/gt/intel_gt_irq.c
index 5f5e96da09b0..a2d1aacfd464 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_irq.c
+++ b/drivers/gpu/drm/i915/gt/intel_gt_irq.c
@@ -62,6 +62,13 @@ cs_irq_handler(struct intel_engine_cs *engine, u32 iir)
 		i915_sched_kick(&engine->active);
 }
 
+void gen2_engine_cs_irq(struct intel_engine_cs *engine)
+{
+	intel_engine_signal_breadcrumbs(engine);
+	if (intel_engine_needs_breadcrumb_tasklet(engine))
+		i915_sched_kick(&engine->active);
+}
+
 static u32
 gen11_gt_engine_identity(struct intel_gt *gt,
 			 const unsigned int bank, const unsigned int bit)
@@ -275,9 +282,9 @@ void gen11_gt_irq_postinstall(struct intel_gt *gt)
 void gen5_gt_irq_handler(struct intel_gt *gt, u32 gt_iir)
 {
 	if (gt_iir & GT_RENDER_USER_INTERRUPT)
-		intel_engine_signal_breadcrumbs(gt->engine_class[RENDER_CLASS][0]);
+		gen2_engine_cs_irq(gt->engine_class[RENDER_CLASS][0]);
 	if (gt_iir & ILK_BSD_USER_INTERRUPT)
-		intel_engine_signal_breadcrumbs(gt->engine_class[VIDEO_DECODE_CLASS][0]);
+		gen2_engine_cs_irq(gt->engine_class[VIDEO_DECODE_CLASS][0]);
 }
 
 static void gen7_parity_error_irq_handler(struct intel_gt *gt, u32 iir)
@@ -301,11 +308,11 @@ static void gen7_parity_error_irq_handler(struct intel_gt *gt, u32 iir)
 void gen6_gt_irq_handler(struct intel_gt *gt, u32 gt_iir)
 {
 	if (gt_iir & GT_RENDER_USER_INTERRUPT)
-		intel_engine_signal_breadcrumbs(gt->engine_class[RENDER_CLASS][0]);
+		gen2_engine_cs_irq(gt->engine_class[RENDER_CLASS][0]);
 	if (gt_iir & GT_BSD_USER_INTERRUPT)
-		intel_engine_signal_breadcrumbs(gt->engine_class[VIDEO_DECODE_CLASS][0]);
+		gen2_engine_cs_irq(gt->engine_class[VIDEO_DECODE_CLASS][0]);
 	if (gt_iir & GT_BLT_USER_INTERRUPT)
-		intel_engine_signal_breadcrumbs(gt->engine_class[COPY_ENGINE_CLASS][0]);
+		gen2_engine_cs_irq(gt->engine_class[COPY_ENGINE_CLASS][0]);
 
 	if (gt_iir & (GT_BLT_CS_ERROR_INTERRUPT |
 		      GT_BSD_CS_ERROR_INTERRUPT |
diff --git a/drivers/gpu/drm/i915/gt/intel_gt_irq.h b/drivers/gpu/drm/i915/gt/intel_gt_irq.h
index f667e976fb2b..26c2a5ea3b23 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_irq.h
+++ b/drivers/gpu/drm/i915/gt/intel_gt_irq.h
@@ -8,6 +8,7 @@
 
 #include <linux/types.h>
 
+struct intel_engine_cs;
 struct intel_gt;
 
 #define GEN8_GT_IRQS (GEN8_GT_RCS_IRQ | \
@@ -18,6 +19,8 @@ struct intel_gt;
 		      GEN8_GT_PM_IRQ | \
 		      GEN8_GT_GUC_IRQ)
 
+void gen2_engine_cs_irq(struct intel_engine_cs *engine);
+
 void gen11_gt_irq_reset(struct intel_gt *gt);
 void gen11_gt_irq_postinstall(struct intel_gt *gt);
 void gen11_gt_irq_handler(struct intel_gt *gt, const u32 master_ctl);
diff --git a/drivers/gpu/drm/i915/gt/intel_rps.c b/drivers/gpu/drm/i915/gt/intel_rps.c
index 405d814e9040..900c20a6d073 100644
--- a/drivers/gpu/drm/i915/gt/intel_rps.c
+++ b/drivers/gpu/drm/i915/gt/intel_rps.c
@@ -1774,7 +1774,7 @@ void gen6_rps_irq_handler(struct intel_rps *rps, u32 pm_iir)
 		return;
 
 	if (pm_iir & PM_VEBOX_USER_INTERRUPT)
-		intel_engine_signal_breadcrumbs(gt->engine[VECS0]);
+		gen2_engine_cs_irq(gt->engine[VECS0]);
 
 	if (pm_iir & PM_VEBOX_CS_ERROR_INTERRUPT)
 		DRM_DEBUG("Command parser error, pm_iir 0x%08x\n", pm_iir);
diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index 33019cf0e630..484512f9fb93 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -3915,7 +3915,7 @@ static irqreturn_t i8xx_irq_handler(int irq, void *arg)
 		intel_uncore_write16(&dev_priv->uncore, GEN2_IIR, iir);
 
 		if (iir & I915_USER_INTERRUPT)
-			intel_engine_signal_breadcrumbs(dev_priv->gt.engine[RCS0]);
+			gen2_engine_cs_irq(dev_priv->gt.engine[RCS0]);
 
 		if (iir & I915_MASTER_ERROR_INTERRUPT)
 			i8xx_error_irq_handler(dev_priv, eir, eir_stuck);
@@ -4023,7 +4023,7 @@ static irqreturn_t i915_irq_handler(int irq, void *arg)
 		intel_uncore_write(&dev_priv->uncore, GEN2_IIR, iir);
 
 		if (iir & I915_USER_INTERRUPT)
-			intel_engine_signal_breadcrumbs(dev_priv->gt.engine[RCS0]);
+			gen2_engine_cs_irq(dev_priv->gt.engine[RCS0]);
 
 		if (iir & I915_MASTER_ERROR_INTERRUPT)
 			i9xx_error_irq_handler(dev_priv, eir, eir_stuck);
@@ -4168,10 +4168,10 @@ static irqreturn_t i965_irq_handler(int irq, void *arg)
 		intel_uncore_write(&dev_priv->uncore, GEN2_IIR, iir);
 
 		if (iir & I915_USER_INTERRUPT)
-			intel_engine_signal_breadcrumbs(dev_priv->gt.engine[RCS0]);
+			gen2_engine_cs_irq(dev_priv->gt.engine[RCS0]);
 
 		if (iir & I915_BSD_USER_INTERRUPT)
-			intel_engine_signal_breadcrumbs(dev_priv->gt.engine[VCS0]);
+			gen2_engine_cs_irq(dev_priv->gt.engine[VCS0]);
 
 		if (iir & I915_MASTER_ERROR_INTERRUPT)
 			i9xx_error_irq_handler(dev_priv, eir, eir_stuck);
-- 
2.20.1


* [Intel-gfx] [PATCH 36/41] drm/i915/gt: Support creation of 'internal' rings
  2021-01-25 14:00 [Intel-gfx] [PATCH 01/41] drm/i915/selftests: Check for engine-reset errors in the middle of workarounds Chris Wilson
                   ` (33 preceding siblings ...)
  2021-01-25 14:01 ` [Intel-gfx] [PATCH 35/41] drm/i915/gt: Couple tasklet scheduling for all CS interrupts Chris Wilson
@ 2021-01-25 14:01 ` Chris Wilson
  2021-01-25 14:01 ` [Intel-gfx] [PATCH 37/41] drm/i915/gt: Use client timeline address for seqno writes Chris Wilson
                   ` (9 subsequent siblings)
  44 siblings, 0 replies; 90+ messages in thread
From: Chris Wilson @ 2021-01-25 14:01 UTC (permalink / raw)
  To: intel-gfx; +Cc: thomas.hellstrom, Chris Wilson

To support legacy ring buffer scheduling, we want a virtual ringbuffer
for each client. These rings are purely for holding the requests as they
are being constructed on the CPU and never accessed by the GPU, so they
should not be bound into the GGTT, and we can use plain old WB mapped
pages.

As they are not bound, we need to nerf a few assumptions that a rq->ring
is in the GGTT.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
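For readers following the interface change: the creation flag is passed
in the low bits of the size argument and split off again inside
intel_engine_create_ring(). A minimal standalone sketch of that split is
shown below. The struct and helper names are hypothetical; the mask and
flag values match GENMASK(11, 0) and BIT(0) from the patch. It assumes
ring sizes are powers of two of at least 4 KiB, so the low 12 bits are
free to carry flags.

```c
#include <assert.h>
#include <stdint.h>

#define RING_FLAG_MASK       0xfffu     /* same bits as GENMASK(11, 0) */
#define RING_CREATE_INTERNAL (1u << 0)  /* same value as INTEL_RING_CREATE_INTERNAL */

/* Hypothetical container for the two halves of the combined argument. */
struct ring_request {
	uint32_t size;   /* power-of-two ring size in bytes */
	uint32_t flags;  /* creation flags recovered from the low bits */
};

/* Split a combined "size | flags" argument the way the patch does. */
static struct ring_request split_size_and_flags(uint32_t arg)
{
	struct ring_request req;

	req.flags = arg & RING_FLAG_MASK;
	req.size = arg ^ req.flags;

	/* The remaining size must be a non-zero power of two. */
	assert(req.size && !(req.size & (req.size - 1)));
	return req;
}
```

A caller wanting an internal ring would then pass something like
(SZ_16K | INTEL_RING_CREATE_INTERNAL), while existing callers that pass a
bare size keep their current behaviour.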
 drivers/gpu/drm/i915/gt/intel_context.c    |  2 +-
 drivers/gpu/drm/i915/gt/intel_engine_cs.c  | 17 +-----
 drivers/gpu/drm/i915/gt/intel_ring.c       | 66 ++++++++++++++--------
 drivers/gpu/drm/i915/gt/intel_ring.h       | 12 +++-
 drivers/gpu/drm/i915/gt/intel_ring_types.h |  2 +
 5 files changed, 59 insertions(+), 40 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_context.c b/drivers/gpu/drm/i915/gt/intel_context.c
index 57b6bde2b736..c7ab4ed92da4 100644
--- a/drivers/gpu/drm/i915/gt/intel_context.c
+++ b/drivers/gpu/drm/i915/gt/intel_context.c
@@ -258,7 +258,7 @@ int __intel_context_do_pin_ww(struct intel_context *ce,
 		}
 
 		CE_TRACE(ce, "pin ring:{start:%08x, head:%04x, tail:%04x}\n",
-			 i915_ggtt_offset(ce->ring->vma),
+			 intel_ring_address(ce->ring),
 			 ce->ring->head, ce->ring->tail);
 
 		handoff = true;
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
index a1cd511223a2..936820b240dd 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
@@ -1321,7 +1321,7 @@ static int print_ring(char *buf, int sz, struct i915_request *rq)
 
 		len = scnprintf(buf, sz,
 				"ring:{start:%08x, hwsp:%08x, seqno:%08x, runtime:%llums}, ",
-				i915_ggtt_offset(rq->ring->vma),
+				intel_ring_address(rq->ring),
 				tl ? tl->ggtt_offset : 0,
 				hwsp_seqno(rq),
 				DIV_ROUND_CLOSEST_ULL(intel_context_get_total_runtime_ns(rq->context),
@@ -1655,7 +1655,7 @@ void intel_engine_dump(struct intel_engine_cs *engine,
 		i915_request_show(m, rq, "\t\tactive ", 0);
 
 		drm_printf(m, "\t\tring->start:  0x%08x\n",
-			   i915_ggtt_offset(rq->ring->vma));
+			   intel_ring_address(rq->ring));
 		drm_printf(m, "\t\tring->head:   0x%08x\n",
 			   rq->ring->head);
 		drm_printf(m, "\t\tring->tail:   0x%08x\n",
@@ -1736,13 +1736,6 @@ ktime_t intel_engine_get_busy_time(struct intel_engine_cs *engine, ktime_t *now)
 	return total;
 }
 
-static bool match_ring(struct i915_request *rq)
-{
-	u32 ring = ENGINE_READ(rq->engine, RING_START);
-
-	return ring == i915_ggtt_offset(rq->ring->vma);
-}
-
 struct i915_request *
 intel_engine_find_active_request(struct intel_engine_cs *engine)
 {
@@ -1782,11 +1775,7 @@ intel_engine_find_active_request(struct intel_engine_cs *engine)
 			continue;
 
 		if (!__i915_request_has_started(request))
-			continue;
-
-		/* More than one preemptible request may match! */
-		if (!match_ring(request))
-			continue;
+			break;
 
 		active = request;
 		break;
diff --git a/drivers/gpu/drm/i915/gt/intel_ring.c b/drivers/gpu/drm/i915/gt/intel_ring.c
index 29c87b3c23bc..3ece56251dac 100644
--- a/drivers/gpu/drm/i915/gt/intel_ring.c
+++ b/drivers/gpu/drm/i915/gt/intel_ring.c
@@ -31,33 +31,42 @@ void __intel_ring_pin(struct intel_ring *ring)
 int intel_ring_pin(struct intel_ring *ring, struct i915_gem_ww_ctx *ww)
 {
 	struct i915_vma *vma = ring->vma;
-	unsigned int flags;
 	void *addr;
 	int ret;
 
 	if (atomic_fetch_inc(&ring->pin_count))
 		return 0;
 
-	/* Ring wraparound at offset 0 sometimes hangs. No idea why. */
-	flags = PIN_OFFSET_BIAS | i915_ggtt_pin_bias(vma);
+	if (!(ring->flags & INTEL_RING_CREATE_INTERNAL)) {
+		int type = i915_coherent_map_type(vma->vm->i915);
+		unsigned int pin;
 
-	if (i915_gem_object_is_stolen(vma->obj))
-		flags |= PIN_MAPPABLE;
-	else
-		flags |= PIN_HIGH;
+		/* Ring wraparound at offset 0 sometimes hangs. No idea why. */
+		pin = PIN_OFFSET_BIAS | i915_ggtt_pin_bias(vma);
 
-	ret = i915_ggtt_pin(vma, ww, 0, flags);
-	if (unlikely(ret))
-		goto err_unpin;
+		if (i915_gem_object_is_stolen(vma->obj))
+			pin |= PIN_MAPPABLE;
+		else
+			pin |= PIN_HIGH;
 
-	if (i915_vma_is_map_and_fenceable(vma))
-		addr = (void __force *)i915_vma_pin_iomap(vma);
-	else
-		addr = i915_gem_object_pin_map(vma->obj,
-					       i915_coherent_map_type(vma->vm->i915));
-	if (IS_ERR(addr)) {
-		ret = PTR_ERR(addr);
-		goto err_ring;
+		ret = i915_ggtt_pin(vma, ww, 0, pin);
+		if (unlikely(ret))
+			goto err_unpin;
+
+		if (i915_vma_is_map_and_fenceable(vma))
+			addr = (void __force *)i915_vma_pin_iomap(vma);
+		else
+			addr = i915_gem_object_pin_map(vma->obj, type);
+		if (IS_ERR(addr)) {
+			ret = PTR_ERR(addr);
+			goto err_ring;
+		}
+	} else {
+		addr = i915_gem_object_pin_map(vma->obj, I915_MAP_WB);
+		if (IS_ERR(addr)) {
+			ret = PTR_ERR(addr);
+			goto err_ring;
+		}
 	}
 
 	i915_vma_make_unshrinkable(vma);
@@ -98,10 +107,12 @@ void intel_ring_unpin(struct intel_ring *ring)
 		i915_gem_object_unpin_map(vma->obj);
 
 	i915_vma_make_purgeable(vma);
-	i915_vma_unpin(vma);
+	if (!(ring->flags & INTEL_RING_CREATE_INTERNAL))
+		i915_vma_unpin(vma);
 }
 
-static struct i915_vma *create_ring_vma(struct i915_ggtt *ggtt, int size)
+static struct i915_vma *
+create_ring_vma(struct i915_ggtt *ggtt, int size, unsigned int flags)
 {
 	struct i915_address_space *vm = &ggtt->vm;
 	struct drm_i915_private *i915 = vm->i915;
@@ -109,8 +120,10 @@ static struct i915_vma *create_ring_vma(struct i915_ggtt *ggtt, int size)
 	struct i915_vma *vma;
 
 	obj = ERR_PTR(-ENODEV);
-	if (i915_ggtt_has_aperture(ggtt))
-		obj = i915_gem_object_create_stolen(i915, size);
+	if (!(flags & INTEL_RING_CREATE_INTERNAL)) {
+		if (i915_ggtt_has_aperture(ggtt))
+			obj = i915_gem_object_create_stolen(i915, size);
+	}
 	if (IS_ERR(obj))
 		obj = i915_gem_object_create_internal(i915, size);
 	if (IS_ERR(obj))
@@ -135,12 +148,14 @@ static struct i915_vma *create_ring_vma(struct i915_ggtt *ggtt, int size)
 }
 
 struct intel_ring *
-intel_engine_create_ring(struct intel_engine_cs *engine, int size)
+intel_engine_create_ring(struct intel_engine_cs *engine, unsigned int size)
 {
 	struct drm_i915_private *i915 = engine->i915;
+	unsigned int flags = size & GENMASK(11, 0);
 	struct intel_ring *ring;
 	struct i915_vma *vma;
 
+	size ^= flags;
 	GEM_BUG_ON(!is_power_of_2(size));
 	GEM_BUG_ON(RING_CTL_SIZE(size) & ~RING_NR_PAGES);
 
@@ -149,8 +164,10 @@ intel_engine_create_ring(struct intel_engine_cs *engine, int size)
 		return ERR_PTR(-ENOMEM);
 
 	kref_init(&ring->ref);
+
 	ring->size = size;
 	ring->wrap = BITS_PER_TYPE(ring->size) - ilog2(size);
+	ring->flags = flags;
 
 	/*
 	 * Workaround an erratum on the i830 which causes a hang if
@@ -163,11 +180,12 @@ intel_engine_create_ring(struct intel_engine_cs *engine, int size)
 
 	intel_ring_update_space(ring);
 
-	vma = create_ring_vma(engine->gt->ggtt, size);
+	vma = create_ring_vma(engine->gt->ggtt, size, flags);
 	if (IS_ERR(vma)) {
 		kfree(ring);
 		return ERR_CAST(vma);
 	}
+
 	ring->vma = vma;
 
 	return ring;
diff --git a/drivers/gpu/drm/i915/gt/intel_ring.h b/drivers/gpu/drm/i915/gt/intel_ring.h
index dbf5f14a136f..cbff78cf5508 100644
--- a/drivers/gpu/drm/i915/gt/intel_ring.h
+++ b/drivers/gpu/drm/i915/gt/intel_ring.h
@@ -8,12 +8,14 @@
 
 #include "i915_gem.h" /* GEM_BUG_ON */
 #include "i915_request.h"
+#include "i915_vma.h"
 #include "intel_ring_types.h"
 
 struct intel_engine_cs;
 
 struct intel_ring *
-intel_engine_create_ring(struct intel_engine_cs *engine, int size);
+intel_engine_create_ring(struct intel_engine_cs *engine, unsigned int size);
+#define INTEL_RING_CREATE_INTERNAL BIT(0)
 
 u32 *intel_ring_begin(struct i915_request *rq, unsigned int num_dwords);
 int intel_ring_cacheline_align(struct i915_request *rq);
@@ -138,4 +140,12 @@ __intel_ring_space(unsigned int head, unsigned int tail, unsigned int size)
 	return (head - tail - CACHELINE_BYTES) & (size - 1);
 }
 
+static inline u32 intel_ring_address(const struct intel_ring *ring)
+{
+	if (ring->flags & INTEL_RING_CREATE_INTERNAL)
+		return -1;
+
+	return i915_ggtt_offset(ring->vma);
+}
+
 #endif /* INTEL_RING_H */
diff --git a/drivers/gpu/drm/i915/gt/intel_ring_types.h b/drivers/gpu/drm/i915/gt/intel_ring_types.h
index 49ccb76dda3b..3d091c699110 100644
--- a/drivers/gpu/drm/i915/gt/intel_ring_types.h
+++ b/drivers/gpu/drm/i915/gt/intel_ring_types.h
@@ -46,6 +46,8 @@ struct intel_ring {
 	u32 size;
 	u32 wrap;
 	u32 effective_size;
+
+	unsigned long flags;
 };
 
 #endif /* INTEL_RING_TYPES_H */
-- 
2.20.1


* [Intel-gfx] [PATCH 37/41] drm/i915/gt: Use client timeline address for seqno writes
  2021-01-25 14:00 [Intel-gfx] [PATCH 01/41] drm/i915/selftests: Check for engine-reset errors in the middle of workarounds Chris Wilson
                   ` (34 preceding siblings ...)
  2021-01-25 14:01 ` [Intel-gfx] [PATCH 36/41] drm/i915/gt: Support creation of 'internal' rings Chris Wilson
@ 2021-01-25 14:01 ` Chris Wilson
  2021-01-25 14:01 ` [Intel-gfx] [PATCH 38/41] drm/i915/gt: Infrastructure for ring scheduling Chris Wilson
                   ` (8 subsequent siblings)
  44 siblings, 0 replies; 90+ messages in thread
From: Chris Wilson @ 2021-01-25 14:01 UTC (permalink / raw)
  To: intel-gfx; +Cc: thomas.hellstrom, Chris Wilson

If we allow for per-client timelines, even with legacy ring submission,
we open the door to a world full of possibilities [scheduling and
semaphores].

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
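The gen6+ breadcrumb emitters in this patch share one pattern for turning
the timeline's HWSP offset into a seqno write. The standalone sketch below
models that pattern under stated assumptions: the struct, helper, and
MODEL_* names are hypothetical, the bit values are placeholders rather
than a statement of the hardware encoding, and the two booleans stand in
for intel_timeline_is_relative() and intel_timeline_in_context().

```c
#include <stdbool.h>
#include <stdint.h>

#define MODEL_PAGE_SIZE   4096u
#define MODEL_STORE_INDEX (1u << 21) /* placeholder for MI_FLUSH_DW_STORE_INDEX */
#define MODEL_USE_GTT     (1u << 2)  /* placeholder for MI_FLUSH_DW_USE_GTT */

/* Hypothetical result: the two values the emitter writes into the ring. */
struct seqno_write {
	uint32_t cmd_flags; /* extra bits OR'ed into the command dword */
	uint32_t address;   /* address dword, including any GTT selector bit */
};

static struct seqno_write
encode_seqno_write(uint32_t hwsp_offset, bool relative, bool in_context)
{
	struct seqno_write w = { .cmd_flags = 0, .address = hwsp_offset };

	/* Relative timelines are addressed as an index within the page. */
	if (relative) {
		w.address = hwsp_offset & (MODEL_PAGE_SIZE - 1);
		w.cmd_flags |= MODEL_STORE_INDEX;
	}

	/* Anything not living in the context image is addressed via the GGTT. */
	if (!in_context)
		w.address |= MODEL_USE_GTT;

	return w;
}
```

An absolute timeline keeps its full GGTT offset, whereas a relative one
is reduced to its offset within the status page and written with the
store-index form of the command.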
 drivers/gpu/drm/i915/gt/gen2_engine_cs.c      | 72 ++++++++++++++-
 drivers/gpu/drm/i915/gt/gen2_engine_cs.h      |  5 +-
 drivers/gpu/drm/i915/gt/gen6_engine_cs.c      | 89 +++++++++++++------
 drivers/gpu/drm/i915/gt/gen8_engine_cs.c      | 23 ++---
 .../gpu/drm/i915/gt/intel_ring_submission.c   | 30 +++----
 drivers/gpu/drm/i915/i915_request.h           | 13 +++
 6 files changed, 169 insertions(+), 63 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/gen2_engine_cs.c b/drivers/gpu/drm/i915/gt/gen2_engine_cs.c
index b491a64919c8..b3fff7a955f2 100644
--- a/drivers/gpu/drm/i915/gt/gen2_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/gen2_engine_cs.c
@@ -172,9 +172,77 @@ u32 *gen3_emit_breadcrumb(struct i915_request *rq, u32 *cs)
 	return __gen2_emit_breadcrumb(rq, cs, 16, 8);
 }
 
-u32 *gen5_emit_breadcrumb(struct i915_request *rq, u32 *cs)
+static u32 *__gen4_emit_breadcrumb(struct i915_request *rq, u32 *cs,
+				   int flush, int post)
 {
-	return __gen2_emit_breadcrumb(rq, cs, 8, 8);
+	struct intel_timeline *tl = rcu_dereference_protected(rq->timeline, 1);
+	u32 offset = __i915_request_hwsp_offset(rq);
+
+	GEM_BUG_ON(tl->mode == INTEL_TIMELINE_RELATIVE_CONTEXT);
+
+	*cs++ = MI_FLUSH;
+
+	while (flush--) {
+		*cs++ = MI_STORE_DWORD_INDEX;
+		*cs++ = I915_GEM_HWS_SCRATCH * sizeof(u32);
+		*cs++ = rq->fence.seqno;
+	}
+
+	if (intel_timeline_is_relative(tl)) {
+		offset = offset_in_page(offset);
+		while (post--) {
+			*cs++ = MI_STORE_DWORD_INDEX;
+			*cs++ = offset;
+			*cs++ = rq->fence.seqno;
+			*cs++ = MI_NOOP;
+		}
+	} else {
+		while (post--) {
+			*cs++ = MI_STORE_DWORD_IMM_GEN4 | MI_USE_GGTT;
+			*cs++ = 0;
+			*cs++ = offset;
+			*cs++ = rq->fence.seqno;
+		}
+	}
+
+	*cs++ = MI_USER_INTERRUPT;
+
+	rq->tail = intel_ring_offset(rq, cs);
+	assert_ring_tail_valid(rq->ring, rq->tail);
+
+	return cs;
+}
+
+u32 *gen4_emit_breadcrumb_xcs(struct i915_request *rq, u32 *cs)
+{
+	return __gen4_emit_breadcrumb(rq, cs, 8, 8);
+}
+
+int gen4_emit_init_breadcrumb_xcs(struct i915_request *rq)
+{
+	struct intel_timeline *tl = i915_request_timeline(rq);
+	u32 *cs;
+
+	GEM_BUG_ON(i915_request_has_initial_breadcrumb(rq));
+	if (!intel_timeline_has_initial_breadcrumb(tl))
+		return 0;
+
+	cs = intel_ring_begin(rq, 4);
+	if (IS_ERR(cs))
+		return PTR_ERR(cs);
+
+	*cs++ = MI_STORE_DWORD_IMM_GEN4 | MI_USE_GGTT;
+	*cs++ = 0;
+	*cs++ = __i915_request_hwsp_offset(rq);
+	*cs++ = rq->fence.seqno - 1;
+
+	intel_ring_advance(rq, cs);
+
+	/* Record the updated position of the request's payload */
+	rq->infix = intel_ring_offset(rq, cs);
+
+	__set_bit(I915_FENCE_FLAG_INITIAL_BREADCRUMB, &rq->fence.flags);
+	return 0;
 }
 
 /* Just userspace ABI convention to limit the wa batch bo to a resonable size */
diff --git a/drivers/gpu/drm/i915/gt/gen2_engine_cs.h b/drivers/gpu/drm/i915/gt/gen2_engine_cs.h
index a5cd64a65c9e..ba7567b15229 100644
--- a/drivers/gpu/drm/i915/gt/gen2_engine_cs.h
+++ b/drivers/gpu/drm/i915/gt/gen2_engine_cs.h
@@ -16,7 +16,10 @@ int gen4_emit_flush_rcs(struct i915_request *rq, u32 mode);
 int gen4_emit_flush_vcs(struct i915_request *rq, u32 mode);
 
 u32 *gen3_emit_breadcrumb(struct i915_request *rq, u32 *cs);
-u32 *gen5_emit_breadcrumb(struct i915_request *rq, u32 *cs);
+u32 *gen4_emit_breadcrumb_xcs(struct i915_request *rq, u32 *cs);
+
+u32 *gen4_emit_breadcrumb_xcs(struct i915_request *rq, u32 *cs);
+int gen4_emit_init_breadcrumb_xcs(struct i915_request *rq);
 
 int i830_emit_bb_start(struct i915_request *rq,
 		       u64 offset, u32 len,
diff --git a/drivers/gpu/drm/i915/gt/gen6_engine_cs.c b/drivers/gpu/drm/i915/gt/gen6_engine_cs.c
index 2f59dd3bdc18..14cab4c726ce 100644
--- a/drivers/gpu/drm/i915/gt/gen6_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/gen6_engine_cs.c
@@ -141,6 +141,12 @@ int gen6_emit_flush_rcs(struct i915_request *rq, u32 mode)
 
 u32 *gen6_emit_breadcrumb_rcs(struct i915_request *rq, u32 *cs)
 {
+	struct intel_timeline *tl = rcu_dereference_protected(rq->timeline, 1);
+	u32 offset = __i915_request_hwsp_offset(rq);
+	unsigned int flags;
+
+	GEM_BUG_ON(tl->mode == INTEL_TIMELINE_RELATIVE_CONTEXT);
+
 	/* First we do the gen6_emit_post_sync_nonzero_flush w/a */
 	*cs++ = GFX_OP_PIPE_CONTROL(4);
 	*cs++ = PIPE_CONTROL_CS_STALL | PIPE_CONTROL_STALL_AT_SCOREBOARD;
@@ -154,15 +160,22 @@ u32 *gen6_emit_breadcrumb_rcs(struct i915_request *rq, u32 *cs)
 		PIPE_CONTROL_GLOBAL_GTT;
 	*cs++ = 0;
 
-	/* Finally we can flush and with it emit the breadcrumb */
-	*cs++ = GFX_OP_PIPE_CONTROL(4);
-	*cs++ = (PIPE_CONTROL_RENDER_TARGET_CACHE_FLUSH |
+	flags = (PIPE_CONTROL_RENDER_TARGET_CACHE_FLUSH |
 		 PIPE_CONTROL_DEPTH_CACHE_FLUSH |
 		 PIPE_CONTROL_DC_FLUSH_ENABLE |
 		 PIPE_CONTROL_QW_WRITE |
 		 PIPE_CONTROL_CS_STALL);
-	*cs++ = i915_request_active_timeline(rq)->ggtt_offset |
-		PIPE_CONTROL_GLOBAL_GTT;
+	if (intel_timeline_is_relative(tl)) {
+		offset = offset_in_page(offset);
+		flags |= PIPE_CONTROL_STORE_DATA_INDEX;
+	}
+	if (!intel_timeline_in_context(tl))
+		offset |= PIPE_CONTROL_GLOBAL_GTT;
+
+	/* Finally we can flush and with it emit the breadcrumb */
+	*cs++ = GFX_OP_PIPE_CONTROL(4);
+	*cs++ = flags;
+	*cs++ = offset;
 	*cs++ = rq->fence.seqno;
 
 	*cs++ = MI_USER_INTERRUPT;
@@ -351,15 +364,28 @@ int gen7_emit_flush_rcs(struct i915_request *rq, u32 mode)
 
 u32 *gen7_emit_breadcrumb_rcs(struct i915_request *rq, u32 *cs)
 {
-	*cs++ = GFX_OP_PIPE_CONTROL(4);
-	*cs++ = (PIPE_CONTROL_RENDER_TARGET_CACHE_FLUSH |
+	struct intel_timeline *tl = rcu_dereference_protected(rq->timeline, 1);
+	u32 offset = __i915_request_hwsp_offset(rq);
+	unsigned int flags;
+
+	GEM_BUG_ON(tl->mode == INTEL_TIMELINE_RELATIVE_CONTEXT);
+
+	flags = (PIPE_CONTROL_RENDER_TARGET_CACHE_FLUSH |
 		 PIPE_CONTROL_DEPTH_CACHE_FLUSH |
 		 PIPE_CONTROL_DC_FLUSH_ENABLE |
 		 PIPE_CONTROL_FLUSH_ENABLE |
 		 PIPE_CONTROL_QW_WRITE |
-		 PIPE_CONTROL_GLOBAL_GTT_IVB |
 		 PIPE_CONTROL_CS_STALL);
-	*cs++ = i915_request_active_timeline(rq)->ggtt_offset;
+	if (intel_timeline_is_relative(tl)) {
+		offset = offset_in_page(offset);
+		flags |= PIPE_CONTROL_STORE_DATA_INDEX;
+	}
+	if (!intel_timeline_in_context(tl))
+		flags |= PIPE_CONTROL_GLOBAL_GTT_IVB;
+
+	*cs++ = GFX_OP_PIPE_CONTROL(4);
+	*cs++ = flags;
+	*cs++ = offset;
 	*cs++ = rq->fence.seqno;
 
 	*cs++ = MI_USER_INTERRUPT;
@@ -373,11 +399,21 @@ u32 *gen7_emit_breadcrumb_rcs(struct i915_request *rq, u32 *cs)
 
 u32 *gen6_emit_breadcrumb_xcs(struct i915_request *rq, u32 *cs)
 {
-	GEM_BUG_ON(i915_request_active_timeline(rq)->hwsp_ggtt != rq->engine->status_page.vma);
-	GEM_BUG_ON(offset_in_page(i915_request_active_timeline(rq)->hwsp_offset) != I915_GEM_HWS_SEQNO_ADDR);
+	struct intel_timeline *tl = rcu_dereference_protected(rq->timeline, 1);
+	u32 offset = __i915_request_hwsp_offset(rq);
+	unsigned int flags = 0;
 
-	*cs++ = MI_FLUSH_DW | MI_FLUSH_DW_OP_STOREDW | MI_FLUSH_DW_STORE_INDEX;
-	*cs++ = I915_GEM_HWS_SEQNO_ADDR | MI_FLUSH_DW_USE_GTT;
+	GEM_BUG_ON(tl->mode == INTEL_TIMELINE_RELATIVE_CONTEXT);
+
+	if (intel_timeline_is_relative(tl)) {
+		offset = offset_in_page(offset);
+		flags |= MI_FLUSH_DW_STORE_INDEX;
+	}
+	if (!intel_timeline_in_context(tl))
+		offset |= MI_FLUSH_DW_USE_GTT;
+
+	*cs++ = MI_FLUSH_DW | MI_FLUSH_DW_OP_STOREDW | flags;
+	*cs++ = offset;
 	*cs++ = rq->fence.seqno;
 
 	*cs++ = MI_USER_INTERRUPT;
@@ -391,28 +427,31 @@ u32 *gen6_emit_breadcrumb_xcs(struct i915_request *rq, u32 *cs)
 #define GEN7_XCS_WA 32
 u32 *gen7_emit_breadcrumb_xcs(struct i915_request *rq, u32 *cs)
 {
+	struct intel_timeline *tl = rcu_dereference_protected(rq->timeline, 1);
+	u32 offset = __i915_request_hwsp_offset(rq);
+	u32 cmd = MI_FLUSH_DW | MI_FLUSH_DW_OP_STOREDW;
 	int i;
 
-	GEM_BUG_ON(i915_request_active_timeline(rq)->hwsp_ggtt != rq->engine->status_page.vma);
-	GEM_BUG_ON(offset_in_page(i915_request_active_timeline(rq)->hwsp_offset) != I915_GEM_HWS_SEQNO_ADDR);
+	GEM_BUG_ON(tl->mode == INTEL_TIMELINE_RELATIVE_CONTEXT);
 
-	*cs++ = MI_FLUSH_DW | MI_INVALIDATE_TLB |
-		MI_FLUSH_DW_OP_STOREDW | MI_FLUSH_DW_STORE_INDEX;
-	*cs++ = I915_GEM_HWS_SEQNO_ADDR | MI_FLUSH_DW_USE_GTT;
+	if (intel_timeline_is_relative(tl)) {
+		offset = offset_in_page(offset);
+		cmd |= MI_FLUSH_DW_STORE_INDEX;
+	}
+	if (!intel_timeline_in_context(tl))
+		offset |= MI_FLUSH_DW_USE_GTT;
+
+	*cs++ = cmd;
+	*cs++ = offset;
 	*cs++ = rq->fence.seqno;
 
 	for (i = 0; i < GEN7_XCS_WA; i++) {
-		*cs++ = MI_STORE_DWORD_INDEX;
-		*cs++ = I915_GEM_HWS_SEQNO_ADDR;
+		*cs++ = cmd;
+		*cs++ = offset;
 		*cs++ = rq->fence.seqno;
 	}
 
-	*cs++ = MI_FLUSH_DW;
-	*cs++ = 0;
-	*cs++ = 0;
-
 	*cs++ = MI_USER_INTERRUPT;
-	*cs++ = MI_NOOP;
 
 	rq->tail = intel_ring_offset(rq, cs);
 	assert_ring_tail_valid(rq->ring, rq->tail);
diff --git a/drivers/gpu/drm/i915/gt/gen8_engine_cs.c b/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
index 8b3a96b1afe0..e041c5b10b54 100644
--- a/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
@@ -336,19 +336,6 @@ static u32 preempt_address(struct intel_engine_cs *engine)
 		I915_GEM_HWS_PREEMPT_ADDR);
 }
 
-static u32 hwsp_offset(const struct i915_request *rq)
-{
-	const struct intel_timeline_cacheline *cl;
-
-	/* Before the request is executed, the timeline/cachline is fixed */
-
-	cl = rcu_dereference_protected(rq->hwsp_cacheline, 1);
-	if (cl)
-		return cl->ggtt_offset;
-
-	return rcu_dereference_protected(rq->timeline, 1)->ggtt_offset;
-}
-
 int gen8_emit_init_breadcrumb(struct i915_request *rq)
 {
 	u32 *cs;
@@ -362,7 +349,7 @@ int gen8_emit_init_breadcrumb(struct i915_request *rq)
 		return PTR_ERR(cs);
 
 	*cs++ = MI_STORE_DWORD_IMM_GEN4 | MI_USE_GGTT;
-	*cs++ = hwsp_offset(rq);
+	*cs++ = __i915_request_hwsp_offset(rq);
 	*cs++ = 0;
 	*cs++ = rq->fence.seqno - 1;
 
@@ -520,7 +507,7 @@ static u32 *emit_xcs_breadcrumb(struct i915_request *rq, u32 *cs)
 {
 	struct intel_timeline *tl = rcu_dereference_protected(rq->timeline, 1);
 	unsigned int flags = MI_FLUSH_DW_OP_STOREDW;
-	u32 offset = hwsp_offset(rq);
+	u32 offset = __i915_request_hwsp_offset(rq);
 
 	if (intel_timeline_is_relative(tl)) {
 		offset = offset_in_page(offset);
@@ -542,7 +529,7 @@ u32 *gen8_emit_fini_breadcrumb_rcs(struct i915_request *rq, u32 *cs)
 {
 	struct intel_timeline *tl = rcu_dereference_protected(rq->timeline, 1);
 	unsigned int flags = PIPE_CONTROL_FLUSH_ENABLE | PIPE_CONTROL_CS_STALL;
-	u32 offset = hwsp_offset(rq);
+	u32 offset = __i915_request_hwsp_offset(rq);
 
 	if (intel_timeline_is_relative(tl)) {
 		offset = offset_in_page(offset);
@@ -567,7 +554,7 @@ u32 *gen8_emit_fini_breadcrumb_rcs(struct i915_request *rq, u32 *cs)
 u32 *gen11_emit_fini_breadcrumb_rcs(struct i915_request *rq, u32 *cs)
 {
 	struct intel_timeline *tl = rcu_dereference_protected(rq->timeline, 1);
-	u32 offset = hwsp_offset(rq);
+	u32 offset = __i915_request_hwsp_offset(rq);
 	unsigned int flags;
 
 	flags = (PIPE_CONTROL_CS_STALL |
@@ -649,7 +636,7 @@ u32 *gen12_emit_fini_breadcrumb_xcs(struct i915_request *rq, u32 *cs)
 u32 *gen12_emit_fini_breadcrumb_rcs(struct i915_request *rq, u32 *cs)
 {
 	struct intel_timeline *tl = rcu_dereference_protected(rq->timeline, 1);
-	u32 offset = hwsp_offset(rq);
+	u32 offset = __i915_request_hwsp_offset(rq);
 	unsigned int flags;
 
 	flags = (PIPE_CONTROL_CS_STALL |
diff --git a/drivers/gpu/drm/i915/gt/intel_ring_submission.c b/drivers/gpu/drm/i915/gt/intel_ring_submission.c
index 9467228a392f..963804f22ab7 100644
--- a/drivers/gpu/drm/i915/gt/intel_ring_submission.c
+++ b/drivers/gpu/drm/i915/gt/intel_ring_submission.c
@@ -1045,11 +1045,6 @@ static void setup_common(struct intel_engine_cs *engine)
 	 * equivalent to our next initial bread so we can elide
 	 * engine->emit_init_breadcrumb().
 	 */
-	engine->emit_fini_breadcrumb = gen3_emit_breadcrumb;
-	if (IS_GEN(i915, 5))
-		engine->emit_fini_breadcrumb = gen5_emit_breadcrumb;
-
-	engine->set_default_submission = i9xx_set_default_submission;
 
 	if (INTEL_GEN(i915) >= 6)
 		engine->emit_bb_start = gen6_emit_bb_start;
@@ -1059,6 +1054,17 @@ static void setup_common(struct intel_engine_cs *engine)
 		engine->emit_bb_start = i830_emit_bb_start;
 	else
 		engine->emit_bb_start = gen3_emit_bb_start;
+
+	if (INTEL_GEN(i915) >= 7)
+		engine->emit_fini_breadcrumb = gen7_emit_breadcrumb_xcs;
+	else if (INTEL_GEN(i915) >= 6)
+		engine->emit_fini_breadcrumb = gen6_emit_breadcrumb_xcs;
+	else if (INTEL_GEN(i915) >= 4)
+		engine->emit_fini_breadcrumb = gen4_emit_breadcrumb_xcs;
+	else
+		engine->emit_fini_breadcrumb = gen3_emit_breadcrumb;
+
+	engine->set_default_submission = i9xx_set_default_submission;
 }
 
 static void setup_rcs(struct intel_engine_cs *engine)
@@ -1100,11 +1106,6 @@ static void setup_vcs(struct intel_engine_cs *engine)
 			engine->set_default_submission = gen6_bsd_set_default_submission;
 		engine->emit_flush = gen6_emit_flush_vcs;
 		engine->irq_enable_mask = GT_BSD_USER_INTERRUPT;
-
-		if (IS_GEN(i915, 6))
-			engine->emit_fini_breadcrumb = gen6_emit_breadcrumb_xcs;
-		else
-			engine->emit_fini_breadcrumb = gen7_emit_breadcrumb_xcs;
 	} else {
 		engine->emit_flush = gen4_emit_flush_vcs;
 		if (IS_GEN(i915, 5))
@@ -1118,13 +1119,10 @@ static void setup_bcs(struct intel_engine_cs *engine)
 {
 	struct drm_i915_private *i915 = engine->i915;
 
+	GEM_BUG_ON(INTEL_GEN(i915) < 6);
+
 	engine->emit_flush = gen6_emit_flush_xcs;
 	engine->irq_enable_mask = GT_BLT_USER_INTERRUPT;
-
-	if (IS_GEN(i915, 6))
-		engine->emit_fini_breadcrumb = gen6_emit_breadcrumb_xcs;
-	else
-		engine->emit_fini_breadcrumb = gen7_emit_breadcrumb_xcs;
 }
 
 static void setup_vecs(struct intel_engine_cs *engine)
@@ -1137,8 +1135,6 @@ static void setup_vecs(struct intel_engine_cs *engine)
 	engine->irq_enable_mask = PM_VEBOX_USER_INTERRUPT;
 	engine->irq_enable = hsw_irq_enable_vecs;
 	engine->irq_disable = hsw_irq_disable_vecs;
-
-	engine->emit_fini_breadcrumb = gen7_emit_breadcrumb_xcs;
 }
 
 static int gen7_ctx_switch_bb_setup(struct intel_engine_cs * const engine,
diff --git a/drivers/gpu/drm/i915/i915_request.h b/drivers/gpu/drm/i915/i915_request.h
index 199dffea28ec..cf718333c5de 100644
--- a/drivers/gpu/drm/i915/i915_request.h
+++ b/drivers/gpu/drm/i915/i915_request.h
@@ -616,4 +616,17 @@ i915_request_active_timeline(const struct i915_request *rq)
 					 lockdep_is_held(&rq->engine->active.lock));
 }
 
+static inline u32 __i915_request_hwsp_offset(const struct i915_request *rq)
+{
+	const struct intel_timeline_cacheline *cl;
+
+	/* Before the request is executed, the timeline/cacheline is fixed */
+
+	cl = rcu_dereference_protected(rq->hwsp_cacheline, 1);
+	if (cl)
+		return cl->ggtt_offset;
+
+	return rcu_dereference_protected(rq->timeline, 1)->ggtt_offset;
+}
+
 #endif /* I915_REQUEST_H */
-- 
2.20.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

* [Intel-gfx] [PATCH 38/41] drm/i915/gt: Infrastructure for ring scheduling
  2021-01-25 14:00 [Intel-gfx] [PATCH 01/41] drm/i915/selftests: Check for engine-reset errors in the middle of workarounds Chris Wilson
                   ` (35 preceding siblings ...)
  2021-01-25 14:01 ` [Intel-gfx] [PATCH 37/41] drm/i915/gt: Use client timeline address for seqno writes Chris Wilson
@ 2021-01-25 14:01 ` Chris Wilson
  2021-01-25 14:01 ` [Intel-gfx] [PATCH 39/41] drm/i915/gt: Implement ring scheduler for gen4-7 Chris Wilson
                   ` (7 subsequent siblings)
  44 siblings, 0 replies; 90+ messages in thread
From: Chris Wilson @ 2021-01-25 14:01 UTC (permalink / raw)
  To: intel-gfx; +Cc: thomas.hellstrom, Chris Wilson

Build a bare bones scheduler to sit on top of the global legacy ringbuffer
submission. This virtual execlists scheme should be applicable to all
older platforms.

A key problem we have with the legacy ring buffer submission is that it
only allows for FIFO queuing. All clients share the global request queue
and must contend for its lock when submitting. As any client may need to
wait for external events, all clients must then wait. However, if we
stage each client into their own virtual ringbuffer with their own
timelines, we can copy the client requests into the global ringbuffer
only when they are ready, reordering the submission around stalls.
Furthermore, the ability to reorder gives us rudimentary priority
sorting -- although without preemption support, once something is on the
GPU it stays on the GPU, and so it is still possible for a hog to delay
a high priority request (such as updating the display). However, it does
mean that by keeping a short submission queue, the high priority
request will be next. This design resembles the old guc submission
scheduler, for reordering requests onto a global workqueue.
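
As a rough illustration of the staging step, the standalone userspace
sketch below copies one request's span out of a per-client ring into a
shared ring, handling wrap-around in the source. The toy_ring type and
toy_ring_copy() helper are hypothetical names, only loosely modelled on
ring_copy() in this patch; wrapping of the destination is not handled.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified ring: a power-of-two byte buffer plus a tail. */
struct toy_ring {
	uint8_t *vaddr;
	uint32_t size;	/* in bytes, must be a power of two */
	uint32_t tail;	/* offset of the next free byte */
};

/*
 * Copy the bytes in [start, end) of src, which may wrap past the end of
 * the source buffer, to the tail of dst. Returns the number of bytes
 * copied. The caller must leave enough room in dst.
 */
static uint32_t toy_ring_copy(struct toy_ring *dst,
			      const struct toy_ring *src,
			      uint32_t start, uint32_t end)
{
	uint32_t len = (end - start) & (src->size - 1);
	uint8_t *out = dst->vaddr + dst->tail;

	assert(len <= dst->size - dst->tail);

	if (end < start) {
		/* First the chunk up to the end of the source buffer... */
		uint32_t first = src->size - start;

		memcpy(out, src->vaddr + start, first);
		out += first;
		start = 0;
	}

	/* ...then the remainder (or the whole span if it did not wrap). */
	memcpy(out, src->vaddr + start, end - start);

	dst->tail += len;
	return len;
}
```

Copying only on submission is what lets each client build requests in
its own ring without holding the global queue.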

The implementation uses the MI_USER_INTERRUPT at the end of every
request to track completion, so is more interrupt happy than execlists
[which has an interrupt for each context event, albeit two]. Our
interrupts on these systems are relatively heavy, and in the past we have
been able to completely starve Sandybridge by the interrupt traffic. Our
interrupt handlers are now much better (in part offloading the work to
bottom halves leaving the interrupt itself only dealing with acking the
registers) but we can still see the impact of starvation in the uneven
submission latency on a saturated system.
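
A minimal sketch of the completion tracking, assuming each in-flight
request is identified by the seqno it writes on completion: after an
interrupt, walk the in-flight list in order and stop at the first
request the hardware has not yet passed. The names toy_seqno_passed()
and toy_count_completed() are hypothetical; the signed-difference
comparison tolerates the 32-bit seqno wrapping around.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* True if the hardware seqno has reached or passed the request's seqno. */
static int toy_seqno_passed(uint32_t hw, uint32_t rq)
{
	return (int32_t)(hw - rq) >= 0;
}

/*
 * Requests complete in submission order, so count from the head of the
 * in-flight array until the first request that is still executing.
 */
static size_t toy_count_completed(const uint32_t *inflight, size_t count,
				  uint32_t hw_seqno)
{
	size_t n = 0;

	assert(inflight || !count);

	while (n < count && toy_seqno_passed(hw_seqno, inflight[n]))
		n++;

	return n;
}
```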

Overall though, the short submission queues and extra interrupts do not
appear to be affecting throughput (+-10%, some tasks even improve due to
the reduced request overheads) and improve latency. [Which is a massive
improvement since the introduction of Sandybridge!]

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/Makefile                 |   1 +
 drivers/gpu/drm/i915/gt/intel_engine.h        |   1 +
 drivers/gpu/drm/i915/gt/intel_engine_types.h  |   1 +
 .../gpu/drm/i915/gt/intel_ring_scheduler.c    | 797 ++++++++++++++++++
 .../gpu/drm/i915/gt/intel_ring_submission.c   |  17 +-
 .../gpu/drm/i915/gt/intel_ring_submission.h   |  17 +
 6 files changed, 826 insertions(+), 8 deletions(-)
 create mode 100644 drivers/gpu/drm/i915/gt/intel_ring_scheduler.c
 create mode 100644 drivers/gpu/drm/i915/gt/intel_ring_submission.h

diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
index 026d500443d7..6096915f5607 100644
--- a/drivers/gpu/drm/i915/Makefile
+++ b/drivers/gpu/drm/i915/Makefile
@@ -115,6 +115,7 @@ gt-y += \
 	gt/intel_renderstate.o \
 	gt/intel_reset.o \
 	gt/intel_ring.o \
+	gt/intel_ring_scheduler.o \
 	gt/intel_ring_submission.o \
 	gt/intel_rps.o \
 	gt/intel_sseu.o \
diff --git a/drivers/gpu/drm/i915/gt/intel_engine.h b/drivers/gpu/drm/i915/gt/intel_engine.h
index 801ae54cf60d..fa257a305143 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine.h
+++ b/drivers/gpu/drm/i915/gt/intel_engine.h
@@ -191,6 +191,7 @@ void intel_engine_cleanup_common(struct intel_engine_cs *engine);
 int intel_engine_resume(struct intel_engine_cs *engine);
 
 int intel_ring_submission_setup(struct intel_engine_cs *engine);
+int intel_ring_scheduler_setup(struct intel_engine_cs *engine);
 
 int intel_engine_stop_cs(struct intel_engine_cs *engine);
 void intel_engine_cancel_stop_cs(struct intel_engine_cs *engine);
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_types.h b/drivers/gpu/drm/i915/gt/intel_engine_types.h
index 06a2582dc32f..b2bcb9a5b44c 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_types.h
+++ b/drivers/gpu/drm/i915/gt/intel_engine_types.h
@@ -318,6 +318,7 @@ struct intel_engine_cs {
 	struct {
 		struct intel_ring *ring;
 		struct intel_timeline *timeline;
+		struct intel_context *context;
 	} legacy;
 
 	/*
diff --git a/drivers/gpu/drm/i915/gt/intel_ring_scheduler.c b/drivers/gpu/drm/i915/gt/intel_ring_scheduler.c
new file mode 100644
index 000000000000..68a08a84d18b
--- /dev/null
+++ b/drivers/gpu/drm/i915/gt/intel_ring_scheduler.c
@@ -0,0 +1,797 @@
+// SPDX-License-Identifier: MIT
+/*
+ * Copyright © 2020 Intel Corporation
+ */
+
+#include <linux/log2.h>
+
+#include <drm/i915_drm.h>
+
+#include "i915_drv.h"
+#include "intel_breadcrumbs.h"
+#include "intel_context.h"
+#include "intel_engine_pm.h"
+#include "intel_engine_stats.h"
+#include "intel_gt.h"
+#include "intel_gt_pm.h"
+#include "intel_gt_requests.h"
+#include "intel_reset.h"
+#include "intel_ring.h"
+#include "intel_ring_submission.h"
+#include "shmem_utils.h"
+
+/*
+ * Rough estimate of the typical request size, performing a flush,
+ * set-context and then emitting the batch.
+ */
+#define LEGACY_REQUEST_SIZE 200
+
+__maybe_unused
+static inline bool reset_in_progress(const struct intel_engine_cs *engine)
+{
+	return unlikely(!__tasklet_is_enabled(&engine->active.tasklet));
+}
+
+static void
+set_current_context(struct intel_context **ptr, struct intel_context *ce)
+{
+	if (ce)
+		intel_context_get(ce);
+
+	ce = xchg(ptr, ce);
+
+	if (ce)
+		intel_context_put(ce);
+}
+
+static inline void runtime_start(struct intel_context *ce)
+{
+	struct intel_context_stats *stats = &ce->stats;
+
+	if (intel_context_is_barrier(ce))
+		return;
+
+	if (stats->active)
+		return;
+
+	WRITE_ONCE(stats->active, intel_context_clock());
+}
+
+static inline void runtime_stop(struct intel_context *ce)
+{
+	struct intel_context_stats *stats = &ce->stats;
+	ktime_t dt;
+
+	if (!stats->active)
+		return;
+
+	dt = ktime_sub(intel_context_clock(), stats->active);
+	ewma_runtime_add(&stats->runtime.avg, dt);
+	stats->runtime.total += dt;
+
+	WRITE_ONCE(stats->active, 0);
+}
+
+static struct intel_engine_cs *__schedule_in(struct i915_request *rq)
+{
+	struct intel_context *ce = rq->context;
+	struct intel_engine_cs *engine = rq->engine;
+
+	intel_context_get(ce);
+
+	__intel_gt_pm_get(engine->gt);
+	if (engine->fw_domain && !engine->fw_active++)
+		intel_uncore_forcewake_get(engine->uncore, engine->fw_domain);
+
+	intel_engine_context_in(engine);
+
+	CE_TRACE(ce, "schedule-in\n");
+
+	return engine;
+}
+
+static void schedule_in(struct i915_request *rq)
+{
+	struct intel_context * const ce = rq->context;
+	struct intel_engine_cs *old;
+
+	GEM_BUG_ON(!intel_engine_pm_is_awake(rq->engine));
+
+	old = ce->inflight;
+	if (!old)
+		old = __schedule_in(rq);
+	WRITE_ONCE(ce->inflight, ptr_inc(old));
+
+	GEM_BUG_ON(intel_context_inflight(ce) != rq->engine);
+	GEM_BUG_ON(!intel_context_inflight_count(ce));
+}
+
+static void __schedule_out(struct i915_request *rq)
+{
+	struct intel_context *ce = rq->context;
+	struct intel_engine_cs *engine = rq->engine;
+
+	CE_TRACE(ce, "schedule-out\n");
+
+	if (intel_timeline_is_last(ce->timeline, rq))
+		intel_engine_add_retire(engine, ce->timeline);
+	else
+		i915_request_update_deadline(list_next_entry(rq, link));
+
+	intel_engine_context_out(engine);
+
+	if (engine->fw_domain && !--engine->fw_active)
+		intel_uncore_forcewake_put(engine->uncore, engine->fw_domain);
+	intel_gt_pm_put_async(engine->gt);
+}
+
+static void schedule_out(struct i915_request *rq)
+{
+	struct intel_context *ce = rq->context;
+
+	GEM_BUG_ON(!ce->inflight);
+	ce->inflight = ptr_dec(ce->inflight);
+	if (!intel_context_inflight_count(ce)) {
+		GEM_BUG_ON(ce->inflight != rq->engine);
+		__schedule_out(rq);
+		WRITE_ONCE(ce->inflight, NULL);
+		intel_context_put(ce);
+	}
+
+	i915_request_put(rq);
+}
+
+static u32 *ring_map(struct intel_ring *ring, u32 len)
+{
+	u32 *va;
+
+	if (unlikely(ring->tail + len > ring->effective_size)) {
+		memset(ring->vaddr + ring->tail, 0, ring->size - ring->tail);
+		ring->tail = 0;
+	}
+
+	va = ring->vaddr + ring->tail;
+	ring->tail = intel_ring_wrap(ring, ring->tail + len);
+
+	return va;
+}
+
+static inline u32 *ring_map_dw(struct intel_ring *ring, u32 len)
+{
+	return ring_map(ring, len * sizeof(u32));
+}
+
+static inline void ring_advance(struct intel_ring *ring, void *map)
+{
+	GEM_BUG_ON(intel_ring_wrap(ring, map - ring->vaddr) != ring->tail);
+}
+
+static void ring_copy(struct intel_ring *dst,
+		      const struct intel_ring *src,
+		      u32 start, u32 end)
+{
+	unsigned int len;
+	void *out;
+
+	len = end - start;
+	if (end < start)
+		len += src->size;
+	out = ring_map(dst, len);
+
+	if (end < start) {
+		len = src->size - start;
+		memcpy(out, src->vaddr + start, len);
+		out += len;
+		start = 0;
+	}
+
+	memcpy(out, src->vaddr + start, end - start);
+}
+
+static void switch_context(struct intel_ring *ring, struct i915_request *rq)
+{
+}
+
+static struct i915_request *ring_submit(struct i915_request *rq)
+{
+	struct intel_ring *ring = rq->engine->legacy.ring;
+
+	__i915_request_submit(rq);
+
+	if (rq->engine->legacy.context != rq->context) {
+		switch_context(ring, rq);
+		set_current_context(&rq->engine->legacy.context, rq->context);
+	}
+
+	ring_copy(ring, rq->ring, rq->head, rq->tail);
+	return rq;
+}
+
+static struct i915_request **
+copy_active(struct i915_request **port, struct i915_request * const *active)
+{
+	while (*active)
+		*port++ = *active++;
+
+	return port;
+}
+
+static inline void
+copy_ports(struct i915_request **dst, struct i915_request **src, int count)
+{
+	/* A memcpy_p() would be very useful here! */
+	while (count--)
+		WRITE_ONCE(*dst++, *src++); /* avoid write tearing */
+}
+
+static inline void write_tail(const struct intel_engine_cs *engine)
+{
+	ENGINE_WRITE(engine, RING_TAIL, engine->legacy.ring->tail);
+}
+
+static void dequeue(struct intel_engine_cs *engine)
+{
+	struct intel_engine_execlists * const el = &engine->execlists;
+	struct i915_request ** const last_port = el->pending + el->port_mask;
+	struct i915_request **port, **first, *last;
+	struct i915_priolist *p;
+
+	first = copy_active(el->pending, el->active);
+	if (first > last_port)
+		return;
+
+	local_irq_disable();
+
+	last = NULL;
+	port = first;
+	spin_lock(&engine->active.lock);
+	for_each_priolist(p, &engine->active.queue) {
+		struct i915_request *rq, *rn;
+
+		priolist_for_each_request_safe(rq, rn, p) {
+			GEM_BUG_ON(rq == last);
+			if (last && rq->context != last->context) {
+				if (port == last_port)
+					goto done;
+
+				*port++ = i915_request_get(last);
+			}
+
+			last = ring_submit(rq);
+		}
+
+		i915_priolist_advance(&engine->active.queue, p);
+	}
+done:
+	spin_unlock(&engine->active.lock);
+
+	if (last) {
+		*port++ = i915_request_get(last);
+		*port = NULL;
+
+		if (!*el->active)
+			runtime_start((*el->pending)->context);
+		WRITE_ONCE(el->active, el->pending);
+
+		copy_ports(el->inflight, el->pending, port - el->pending + 1);
+		while (port-- != first)
+			schedule_in(*port);
+
+		wmb(); /* paranoid flush of WCB before RING_TAIL write */
+		write_tail(engine);
+
+		WRITE_ONCE(el->active, el->inflight);
+		GEM_BUG_ON(!*el->active);
+	}
+
+	WRITE_ONCE(el->pending[0], NULL);
+
+	local_irq_enable(); /* flush irq_work *after* RING_TAIL write */
+}
+
+static void post_process_csb(struct i915_request **port,
+			     struct i915_request **last)
+{
+	while (port != last)
+		schedule_out(*port++);
+}
+
+static struct i915_request **
+process_csb(struct intel_engine_execlists *el, struct i915_request **inactive)
+{
+	struct i915_request *rq;
+
+	while ((rq = *el->active)) {
+		if (!__i915_request_is_complete(rq)) {
+			runtime_start(rq->context);
+			break;
+		}
+
+		*inactive++ = rq;
+		el->active++;
+
+		runtime_stop(rq->context);
+	}
+
+	return inactive;
+}
+
+static void submission_tasklet(unsigned long data)
+{
+	struct intel_engine_cs * const engine = (struct intel_engine_cs *)data;
+	struct i915_request *post[2 * EXECLIST_MAX_PORTS];
+	struct i915_request **inactive;
+
+	rcu_read_lock();
+	inactive = process_csb(&engine->execlists, post);
+	GEM_BUG_ON(inactive - post > ARRAY_SIZE(post));
+
+	if (!i915_sched_is_idle(&engine->active))
+		dequeue(engine);
+
+	post_process_csb(post, inactive);
+	rcu_read_unlock();
+}
+
+static void reset_prepare(struct intel_engine_cs *engine)
+{
+	GEM_TRACE("%s\n", engine->name);
+
+	__tasklet_disable_sync_once(&engine->active.tasklet);
+	GEM_BUG_ON(!reset_in_progress(engine));
+
+	intel_ring_submission_reset_prepare(engine);
+}
+
+static inline void clear_ports(struct i915_request **ports, int count)
+{
+	memset_p((void **)ports, NULL, count);
+}
+
+static struct i915_request **
+cancel_port_requests(struct intel_engine_execlists * const el,
+		     struct i915_request **inactive)
+{
+	struct i915_request * const *port;
+
+	clear_ports(el->pending, ARRAY_SIZE(el->pending));
+
+	/* Mark the end of active before we overwrite *active */
+	for (port = xchg(&el->active, el->pending); *port; port++)
+		*inactive++ = *port;
+	clear_ports(el->inflight, ARRAY_SIZE(el->inflight));
+
+	smp_wmb(); /* complete the seqlock for execlists_active() */
+	WRITE_ONCE(el->active, el->inflight);
+
+	return inactive;
+}
+
+static void __ring_rewind(struct intel_engine_cs *engine, bool stalled)
+{
+	struct i915_request *rq;
+	unsigned long flags;
+
+	rcu_read_lock();
+	spin_lock_irqsave(&engine->active.lock, flags);
+	rq = __intel_engine_rewind_requests(engine);
+	spin_unlock_irqrestore(&engine->active.lock, flags);
+	if (rq && __i915_request_has_started(rq))
+		__i915_request_reset(rq, stalled);
+	rcu_read_unlock();
+}
+
+static void ring_reset_csb(struct intel_engine_cs *engine)
+{
+	struct intel_engine_execlists * const el = &engine->execlists;
+	struct i915_request *post[2 * EXECLIST_MAX_PORTS];
+	struct i915_request **inactive;
+
+	rcu_read_lock();
+	inactive = cancel_port_requests(el, post);
+
+	/* Clear the global submission state, we will submit from scratch */
+	intel_ring_reset(engine->legacy.ring, 0);
+	set_current_context(&engine->legacy.context, NULL);
+
+	post_process_csb(post, inactive);
+	rcu_read_unlock();
+}
+
+static void ring_reset_rewind(struct intel_engine_cs *engine, bool stalled)
+{
+	ring_reset_csb(engine);
+	__ring_rewind(engine, stalled);
+}
+
+static void nop_submission_tasklet(unsigned long data)
+{
+}
+
+static void ring_reset_cancel(struct intel_engine_cs *engine)
+{
+	struct i915_request *rq, *rn;
+	struct i915_priolist *p;
+	unsigned long flags;
+
+	ring_reset_csb(engine);
+
+	rcu_read_lock();
+	spin_lock_irqsave(&engine->active.lock, flags);
+
+	/* Mark all submitted requests as skipped. */
+	list_for_each_entry(rq, &engine->active.requests, sched.link)
+		i915_request_mark_eio(rq);
+	intel_engine_signal_breadcrumbs(engine);
+
+	/* Flush the queued requests to the timeline list (for retiring). */
+	for_each_priolist(p, &engine->active.queue) {
+		priolist_for_each_request_safe(rq, rn, p) {
+			i915_request_mark_eio(rq);
+			__i915_request_submit(rq);
+		}
+		i915_priolist_advance(&engine->active.queue, p);
+	}
+	GEM_BUG_ON(!i915_sched_is_idle(&engine->active));
+
+	/* Remaining _unready_ requests will be nop'ed when submitted */
+
+	GEM_BUG_ON(__tasklet_is_enabled(&engine->active.tasklet));
+	engine->active.tasklet.func = nop_submission_tasklet;
+
+	spin_unlock_irqrestore(&engine->active.lock, flags);
+	rcu_read_unlock();
+}
+
+static void reset_finish(struct intel_engine_cs *engine)
+{
+	intel_ring_submission_reset_finish(engine);
+
+	if (__tasklet_enable(&engine->active.tasklet))
+		i915_sched_kick(&engine->active);
+}
+
+static void submission_park(struct intel_engine_cs *engine)
+{
+	/* drain the submit queue */
+	intel_breadcrumbs_unpin_irq(engine->breadcrumbs);
+	i915_sched_kick(&engine->active);
+}
+
+static void submission_unpark(struct intel_engine_cs *engine)
+{
+	intel_breadcrumbs_pin_irq(engine->breadcrumbs);
+}
+
+static void ring_context_destroy(struct kref *ref)
+{
+	struct intel_context *ce = container_of(ref, typeof(*ce), ref);
+
+	GEM_BUG_ON(intel_context_is_pinned(ce));
+
+	if (ce->state)
+		i915_vma_put(ce->state);
+	if (test_bit(CONTEXT_ALLOC_BIT, &ce->flags))
+		intel_ring_put(ce->ring);
+
+	intel_context_fini(ce);
+	intel_context_free(ce);
+}
+
+static int alloc_context_vma(struct intel_context *ce)
+
+{
+	struct intel_engine_cs *engine = ce->engine;
+	struct drm_i915_gem_object *obj;
+	struct i915_vma *vma;
+	int err;
+
+	obj = i915_gem_object_create_shmem(engine->i915, engine->context_size);
+	if (IS_ERR(obj))
+		return PTR_ERR(obj);
+
+	/*
+	 * Try to make the context utilize L3 as well as LLC.
+	 *
+	 * On VLV we don't have L3 controls in the PTEs so we
+	 * shouldn't touch the cache level, especially as that
+	 * would make the object snooped which might have a
+	 * negative performance impact.
+	 *
+	 * Snooping is required on non-llc platforms in execlist
+	 * mode, but since all GGTT accesses use PAT entry 0 we
+	 * get snooping anyway regardless of cache_level.
+	 *
+	 * This is only applicable for Ivy Bridge devices since
+	 * later platforms don't have L3 control bits in the PTE.
+	 */
+	if (IS_IVYBRIDGE(engine->i915))
+		i915_gem_object_set_cache_coherency(obj, I915_CACHE_L3_LLC);
+
+	if (engine->default_state) {
+		void *vaddr;
+
+		vaddr = i915_gem_object_pin_map(obj, I915_MAP_WB);
+		if (IS_ERR(vaddr)) {
+			err = PTR_ERR(vaddr);
+			goto err_obj;
+		}
+
+		shmem_read(engine->default_state, 0,
+			   vaddr, engine->context_size);
+		__set_bit(CONTEXT_VALID_BIT, &ce->flags);
+
+		i915_gem_object_flush_map(obj);
+		i915_gem_object_unpin_map(obj);
+	}
+
+	vma = i915_vma_instance(obj, &engine->gt->ggtt->vm, NULL);
+	if (IS_ERR(vma)) {
+		err = PTR_ERR(vma);
+		goto err_obj;
+	}
+
+	ce->state = vma;
+	return 0;
+
+err_obj:
+	i915_gem_object_put(obj);
+	return err;
+}
+
+static struct intel_timeline *pinned_timeline(struct intel_context *ce)
+{
+	struct intel_timeline *tl = fetch_and_zero(&ce->timeline);
+
+	return intel_timeline_create_from_engine(ce->engine,
+						 page_unmask_bits(tl));
+}
+
+static int alloc_timeline(struct intel_context *ce)
+{
+	struct intel_engine_cs *engine = ce->engine;
+	struct intel_timeline *tl;
+
+	if (unlikely(ce->timeline))
+		tl = pinned_timeline(ce);
+	else
+		tl = intel_timeline_create(engine->gt);
+	if (IS_ERR(tl))
+		return PTR_ERR(tl);
+
+	ce->timeline = tl;
+	return 0;
+}
+
+static int ring_context_alloc(struct intel_context *ce)
+{
+	struct intel_engine_cs *engine = ce->engine;
+	struct intel_ring *ring;
+	int err;
+
+	GEM_BUG_ON(ce->state);
+	if (engine->context_size) {
+		err = alloc_context_vma(ce);
+		if (err)
+			return err;
+	}
+
+	if (!page_mask_bits(ce->timeline)) {
+		err = alloc_timeline(ce);
+		if (err)
+			goto err_vma;
+	}
+
+	ring = intel_engine_create_ring(engine,
+					(unsigned long)ce->ring |
+					INTEL_RING_CREATE_INTERNAL);
+	if (IS_ERR(ring)) {
+		err = PTR_ERR(ring);
+		goto err_timeline;
+	}
+	ce->ring = ring;
+
+	return 0;
+
+err_timeline:
+	intel_timeline_put(ce->timeline);
+err_vma:
+	if (ce->state) {
+		i915_vma_put(ce->state);
+		ce->state = NULL;
+	}
+	return err;
+}
+
+static int ring_context_pre_pin(struct intel_context *ce,
+				struct i915_gem_ww_ctx *ww,
+				void **unused)
+{
+	return 0;
+}
+
+static int ring_context_pin(struct intel_context *ce, void *unused)
+{
+	return 0;
+}
+
+static void ring_context_unpin(struct intel_context *ce)
+{
+}
+
+static void ring_context_post_unpin(struct intel_context *ce)
+{
+}
+
+static void ring_context_reset(struct intel_context *ce)
+{
+	intel_ring_reset(ce->ring, 0);
+	clear_bit(CONTEXT_VALID_BIT, &ce->flags);
+}
+
+static const struct intel_context_ops ring_context_ops = {
+	.flags = COPS_HAS_INFLIGHT,
+
+	.alloc = ring_context_alloc,
+
+	.pre_pin = ring_context_pre_pin,
+	.pin = ring_context_pin,
+	.unpin = ring_context_unpin,
+	.post_unpin = ring_context_post_unpin,
+
+	.enter = intel_context_enter_engine,
+	.exit = intel_context_exit_engine,
+
+	.reset = ring_context_reset,
+	.destroy = ring_context_destroy,
+};
+
+static int ring_request_alloc(struct i915_request *rq)
+{
+	int ret;
+
+	GEM_BUG_ON(!intel_context_is_pinned(rq->context));
+
+	/*
+	 * Flush enough space to reduce the likelihood of waiting after
+	 * we start building the request - in which case we will just
+	 * have to repeat work.
+	 */
+	rq->reserved_space += LEGACY_REQUEST_SIZE;
+
+	/* Unconditionally invalidate GPU caches and TLBs. */
+	ret = rq->engine->emit_flush(rq, EMIT_INVALIDATE);
+	if (ret)
+		return ret;
+
+	rq->reserved_space -= LEGACY_REQUEST_SIZE;
+	return 0;
+}
+
+static void set_default_submission(struct intel_engine_cs *engine)
+{
+	engine->submit_request = i915_request_enqueue;
+	engine->active.tasklet.func = submission_tasklet;
+}
+
+static void ring_release(struct intel_engine_cs *engine)
+{
+	intel_engine_cleanup_common(engine);
+
+	set_current_context(&engine->legacy.context, NULL);
+
+	intel_ring_unpin(engine->legacy.ring);
+	intel_ring_put(engine->legacy.ring);
+}
+
+static void setup_irq(struct intel_engine_cs *engine)
+{
+}
+
+static void setup_common(struct intel_engine_cs *engine)
+{
+	struct drm_i915_private *i915 = engine->i915;
+
+	/* gen8+ are only supported with execlists */
+	GEM_BUG_ON(INTEL_GEN(i915) >= 8);
+	GEM_BUG_ON(INTEL_GEN(i915) < 8);
+
+	setup_irq(engine);
+
+	engine->park = submission_park;
+	engine->unpark = submission_unpark;
+
+	engine->resume = intel_ring_submission_resume;
+	engine->sanitize = intel_ring_submission_sanitize;
+
+	engine->reset.prepare = reset_prepare;
+	engine->reset.rewind = ring_reset_rewind;
+	engine->reset.cancel = ring_reset_cancel;
+	engine->reset.finish = reset_finish;
+
+	engine->cops = &ring_context_ops;
+	engine->request_alloc = ring_request_alloc;
+
+	engine->set_default_submission = set_default_submission;
+}
+
+static void setup_rcs(struct intel_engine_cs *engine)
+{
+}
+
+static void setup_vcs(struct intel_engine_cs *engine)
+{
+}
+
+static void setup_bcs(struct intel_engine_cs *engine)
+{
+}
+
+static void setup_vecs(struct intel_engine_cs *engine)
+{
+	GEM_BUG_ON(!IS_HASWELL(engine->i915));
+}
+
+static unsigned int global_ring_size(void)
+{
+	/* Enough space to hold 2 clients and the context switch */
+	return roundup_pow_of_two(EXECLIST_MAX_PORTS * SZ_16K + SZ_4K);
+}
+
+int intel_ring_scheduler_setup(struct intel_engine_cs *engine)
+{
+	struct intel_ring *ring;
+	int err;
+
+	GEM_BUG_ON(HAS_EXECLISTS(engine->i915));
+
+	tasklet_init(&engine->active.tasklet,
+		     submission_tasklet, (unsigned long)engine);
+
+	setup_common(engine);
+
+	switch (engine->class) {
+	case RENDER_CLASS:
+		setup_rcs(engine);
+		break;
+	case VIDEO_DECODE_CLASS:
+		setup_vcs(engine);
+		break;
+	case COPY_ENGINE_CLASS:
+		setup_bcs(engine);
+		break;
+	case VIDEO_ENHANCEMENT_CLASS:
+		setup_vecs(engine);
+		break;
+	default:
+		MISSING_CASE(engine->class);
+		return -ENODEV;
+	}
+
+	ring = intel_engine_create_ring(engine, global_ring_size());
+	if (IS_ERR(ring)) {
+		err = PTR_ERR(ring);
+		goto err;
+	}
+
+	err = intel_ring_pin(ring, NULL);
+	if (err)
+		goto err_ring;
+
+	GEM_BUG_ON(engine->legacy.ring);
+	engine->legacy.ring = ring;
+
+	engine->flags |= I915_ENGINE_HAS_SCHEDULER;
+	engine->flags |= I915_ENGINE_NEEDS_BREADCRUMB_TASKLET;
+	engine->flags |= I915_ENGINE_SUPPORTS_STATS;
+
+	/* Finally, take ownership and responsibility for cleanup! */
+	engine->release = ring_release;
+	return 0;
+
+err_ring:
+	intel_ring_put(ring);
+err:
+	intel_engine_cleanup_common(engine);
+	return err;
+}
diff --git a/drivers/gpu/drm/i915/gt/intel_ring_submission.c b/drivers/gpu/drm/i915/gt/intel_ring_submission.c
index 963804f22ab7..c10038b004e6 100644
--- a/drivers/gpu/drm/i915/gt/intel_ring_submission.c
+++ b/drivers/gpu/drm/i915/gt/intel_ring_submission.c
@@ -14,6 +14,7 @@
 #include "intel_gt.h"
 #include "intel_reset.h"
 #include "intel_ring.h"
+#include "intel_ring_submission.h"
 #include "shmem_utils.h"
 
 /* Rough estimate of the typical request size, performing a flush,
@@ -176,7 +177,7 @@ static bool stop_ring(struct intel_engine_cs *engine)
 	return (ENGINE_READ_FW(engine, RING_HEAD) & HEAD_ADDR) == 0;
 }
 
-static int xcs_resume(struct intel_engine_cs *engine)
+int intel_ring_submission_resume(struct intel_engine_cs *engine)
 {
 	struct intel_ring *ring = engine->legacy.ring;
 
@@ -264,7 +265,7 @@ static void sanitize_hwsp(struct intel_engine_cs *engine)
 		intel_timeline_reset_seqno(tl);
 }
 
-static void xcs_sanitize(struct intel_engine_cs *engine)
+void intel_ring_submission_sanitize(struct intel_engine_cs *engine)
 {
 	/*
 	 * Poison residual state on resume, in case the suspend didn't!
@@ -289,7 +290,7 @@ static void xcs_sanitize(struct intel_engine_cs *engine)
 	clflush_cache_range(engine->status_page.addr, PAGE_SIZE);
 }
 
-static void reset_prepare(struct intel_engine_cs *engine)
+void intel_ring_submission_reset_prepare(struct intel_engine_cs *engine)
 {
 	/*
 	 * We stop engines, otherwise we might get failed reset and a
@@ -387,7 +388,7 @@ static void reset_rewind(struct intel_engine_cs *engine, bool stalled)
 	spin_unlock_irqrestore(&engine->active.lock, flags);
 }
 
-static void reset_finish(struct intel_engine_cs *engine)
+void intel_ring_submission_reset_finish(struct intel_engine_cs *engine)
 {
 }
 
@@ -1029,13 +1030,13 @@ static void setup_common(struct intel_engine_cs *engine)
 
 	setup_irq(engine);
 
-	engine->resume = xcs_resume;
-	engine->sanitize = xcs_sanitize;
+	engine->resume = intel_ring_submission_resume;
+	engine->sanitize = intel_ring_submission_sanitize;
 
-	engine->reset.prepare = reset_prepare;
+	engine->reset.prepare = intel_ring_submission_reset_prepare;
 	engine->reset.rewind = reset_rewind;
 	engine->reset.cancel = reset_cancel;
-	engine->reset.finish = reset_finish;
+	engine->reset.finish = intel_ring_submission_reset_finish;
 
 	engine->cops = &ring_context_ops;
 	engine->request_alloc = ring_request_alloc;
diff --git a/drivers/gpu/drm/i915/gt/intel_ring_submission.h b/drivers/gpu/drm/i915/gt/intel_ring_submission.h
new file mode 100644
index 000000000000..59a43c221748
--- /dev/null
+++ b/drivers/gpu/drm/i915/gt/intel_ring_submission.h
@@ -0,0 +1,17 @@
+/* SPDX-License-Identifier: MIT */
+/*
+ * Copyright © 2020 Intel Corporation
+ */
+
+#ifndef __INTEL_RING_SUBMISSION_H__
+#define __INTEL_RING_SUBMISSION_H__
+
+struct intel_engine_cs;
+
+void intel_ring_submission_reset_prepare(struct intel_engine_cs *engine);
+void intel_ring_submission_reset_finish(struct intel_engine_cs *engine);
+
+int intel_ring_submission_resume(struct intel_engine_cs *engine);
+void intel_ring_submission_sanitize(struct intel_engine_cs *engine);
+
+#endif /* __INTEL_RING_SUBMISSION_H__ */
-- 
2.20.1

* [Intel-gfx] [PATCH 39/41] drm/i915/gt: Implement ring scheduler for gen4-7
  2021-01-25 14:00 [Intel-gfx] [PATCH 01/41] drm/i915/selftests: Check for engine-reset errors in the middle of workarounds Chris Wilson
                   ` (36 preceding siblings ...)
  2021-01-25 14:01 ` [Intel-gfx] [PATCH 38/41] drm/i915/gt: Infrastructure for ring scheduling Chris Wilson
@ 2021-01-25 14:01 ` Chris Wilson
  2021-01-25 14:01 ` [Intel-gfx] [PATCH 40/41] drm/i915/gt: Enable ring scheduling for gen5-7 Chris Wilson
                   ` (6 subsequent siblings)
  44 siblings, 0 replies; 90+ messages in thread
From: Chris Wilson @ 2021-01-25 14:01 UTC (permalink / raw)
  To: intel-gfx; +Cc: thomas.hellstrom, Chris Wilson

A key problem with legacy ring buffer submission is that it is inherently
a FIFO queue across all clients; if one blocks, they all block. A
scheduler allows us to avoid that limitation, and ensures that all
clients can submit in parallel, removing the resource contention of the
global ringbuffer.

Having built the ring scheduler infrastructure on top of the global
ringbuffer submission, we now need to provide the HW knowledge required
to build command packets and implement context switching.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
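Note for readers: the head-of-line blocking described above can be sketched
in a few lines of standalone C. This is only an illustration under stated
assumptions; struct sketch_request, fifo_pick() and sched_pick() are
hypothetical names invented here and are not part of i915.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical request model; not the i915 structures. */
struct sketch_request {
	int client;	/* which client submitted the request */
	int prio;	/* larger value should run first */
	bool ready;	/* false while waiting on a dependency */
};

/*
 * FIFO submission: only the oldest request may run. If it is not
 * ready, nothing runs, whatever is queued behind it.
 */
static int fifo_pick(const struct sketch_request *q, size_t count)
{
	assert(q || !count);

	if (!count || !q[0].ready)
		return -1;

	return 0;
}

/*
 * Scheduler: pick the highest-priority ready request, skipping over
 * blocked ones. Ties are resolved in submission order.
 */
static int sched_pick(const struct sketch_request *q, size_t count)
{
	int best = -1;
	size_t i;

	assert(q || !count);

	for (i = 0; i < count; i++) {
		if (!q[i].ready)
			continue;
		if (best < 0 || q[i].prio > q[best].prio)
			best = (int)i;
	}

	return best;
}
```

With a blocked request at the head of the queue, fifo_pick() returns -1
while sched_pick() still returns the index of a runnable request from
another client.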
 .../gpu/drm/i915/gt/intel_ring_scheduler.c    | 454 +++++++++++++++++-
 drivers/gpu/drm/i915/i915_reg.h               |  10 +
 2 files changed, 461 insertions(+), 3 deletions(-)
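
Note for readers: remap_l3() in this patch walks the set bits of a slice
mask with ffs(), whose result is 1-based, which is why the running index
starts at -1. The standalone sketch below mirrors that loop shape.
sketch_ffs() and walk_slices() are hypothetical helpers written for this
note, and the 8-bit mask width is an assumption made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Local stand-in for ffs(): 1-based index of the lowest set bit, 0 if none. */
static int sketch_ffs(unsigned int x)
{
	int bit = 1;

	if (!x)
		return 0;

	while (!(x & 1)) {
		x >>= 1;
		bit++;
	}

	return bit;
}

/*
 * Record the 0-based index of every set bit in mask, lowest first.
 * Each step shifts the consumed bits out of the mask, so idx
 * accumulates the absolute position of the bit just found.
 */
static size_t walk_slices(uint8_t mask, int slices[8])
{
	int bit, idx = -1;
	size_t n = 0;

	if (!mask)
		return 0;

	do {
		bit = sketch_ffs(mask);
		idx += bit;
		assert(n < 8);
		slices[n++] = idx;
	} while (mask >>= bit);

	return n;
}
```

A mask of 0x05 yields slices 0 and 2. As in the patch, the loop consumes
the mask as it goes, so the caller is left with no pending slices.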

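Note for readers: wa_write_tail() and mi_set_context() toggle
PSMI_SLEEP_MSG_DISABLE through masked writes, where the upper 16 bits of
the written value select which of the lower 16 bits may change. The sketch
below models that convention in standalone C. The function names are
hypothetical stand-ins for the driver macros, and apply_masked_write() is
a simplified model of the hardware assumed for this note.

```c
#include <assert.h>
#include <stdint.h>

/* Build a masked write: mask in the upper 16 bits, value in the lower 16. */
static uint32_t masked_field(uint32_t mask, uint32_t value)
{
	assert(!(mask & ~0xffffu));	/* mask must fit in 16 bits */
	assert(!(value & ~mask));	/* value may only touch masked bits */

	return mask << 16 | value;
}

static uint32_t masked_bit_enable(uint32_t bit)
{
	return masked_field(bit, bit);
}

static uint32_t masked_bit_disable(uint32_t bit)
{
	return masked_field(bit, 0);
}

/* Simplified model of a 16-bit masked register absorbing a write. */
static uint16_t apply_masked_write(uint16_t reg, uint32_t write)
{
	uint16_t mask = (uint16_t)(write >> 16);
	uint16_t value = (uint16_t)(write & 0xffff);

	return (uint16_t)((reg & ~mask) | (value & mask));
}
```

The point of the convention is that a single write can set or clear one
bit without a read-modify-write cycle, leaving the other bits of the
register untouched.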
diff --git a/drivers/gpu/drm/i915/gt/intel_ring_scheduler.c b/drivers/gpu/drm/i915/gt/intel_ring_scheduler.c
index 68a08a84d18b..57a266f1eab7 100644
--- a/drivers/gpu/drm/i915/gt/intel_ring_scheduler.c
+++ b/drivers/gpu/drm/i915/gt/intel_ring_scheduler.c
@@ -7,7 +7,12 @@
 
 #include <drm/i915_drm.h>
 
+#include "gen2_engine_cs.h"
+#include "gen6_engine_cs.h"
+#include "gen6_ppgtt.h"
+#include "gen7_renderclear.h"
 #include "i915_drv.h"
+#include "i915_mitigations.h"
 #include "intel_breadcrumbs.h"
 #include "intel_context.h"
 #include "intel_engine_pm.h"
@@ -188,8 +193,270 @@ static void ring_copy(struct intel_ring *dst,
 	memcpy(out, src->vaddr + start, end - start);
 }
 
+static void mi_set_context(struct intel_ring *ring,
+			   struct intel_engine_cs *engine,
+			   struct intel_context *ce,
+			   u32 flags)
+{
+	struct drm_i915_private *i915 = engine->i915;
+	enum intel_engine_id id;
+	const int num_engines =
+		IS_HASWELL(i915) ? engine->gt->info.num_engines - 1 : 0;
+	int len;
+	u32 *cs;
+
+	len = 4;
+	if (IS_GEN(i915, 7))
+		len += 2 + (num_engines ? 4 * num_engines + 6 : 0);
+	else if (IS_GEN(i915, 5))
+		len += 2;
+
+	cs = ring_map_dw(ring, len);
+
+	/* WaProgramMiArbOnOffAroundMiSetContext:ivb,vlv,hsw,bdw,chv */
+	if (IS_GEN(i915, 7)) {
+		*cs++ = MI_ARB_ON_OFF | MI_ARB_DISABLE;
+		if (num_engines) {
+			struct intel_engine_cs *signaller;
+
+			*cs++ = MI_LOAD_REGISTER_IMM(num_engines);
+			for_each_engine(signaller, engine->gt, id) {
+				if (signaller == engine)
+					continue;
+
+				*cs++ = i915_mmio_reg_offset(
+					   RING_PSMI_CTL(signaller->mmio_base));
+				*cs++ = _MASKED_BIT_ENABLE(
+						GEN6_PSMI_SLEEP_MSG_DISABLE);
+			}
+		}
+	} else if (IS_GEN(i915, 5)) {
+		/*
+		 * This w/a is only listed for pre-production ilk a/b steppings,
+		 * but is also mentioned for programming the powerctx. To be
+		 * safe, just apply the workaround; we do not use SyncFlush so
+		 * this should never take effect and so be a no-op!
+		 */
+		*cs++ = MI_SUSPEND_FLUSH | MI_SUSPEND_FLUSH_EN;
+	}
+
+	*cs++ = MI_NOOP;
+	*cs++ = MI_SET_CONTEXT;
+	*cs++ = i915_ggtt_offset(ce->state) | flags;
+	/*
+	 * w/a: MI_SET_CONTEXT must always be followed by MI_NOOP
+	 * WaMiSetContext_Hang:snb,ivb,vlv
+	 */
+	*cs++ = MI_NOOP;
+
+	if (IS_GEN(i915, 7)) {
+		if (num_engines) {
+			struct intel_engine_cs *signaller;
+			i915_reg_t last_reg = {}; /* keep gcc quiet */
+
+			*cs++ = MI_LOAD_REGISTER_IMM(num_engines);
+			for_each_engine(signaller, engine->gt, id) {
+				if (signaller == engine)
+					continue;
+
+				last_reg = RING_PSMI_CTL(signaller->mmio_base);
+				*cs++ = i915_mmio_reg_offset(last_reg);
+				*cs++ = _MASKED_BIT_DISABLE(
+						GEN6_PSMI_SLEEP_MSG_DISABLE);
+			}
+
+			/* Insert a delay before the next switch! */
+			*cs++ = MI_STORE_REGISTER_MEM | MI_SRM_LRM_GLOBAL_GTT;
+			*cs++ = i915_mmio_reg_offset(last_reg);
+			*cs++ = intel_gt_scratch_offset(engine->gt,
+							INTEL_GT_SCRATCH_FIELD_DEFAULT);
+			*cs++ = MI_NOOP;
+		}
+		*cs++ = MI_ARB_ON_OFF | MI_ARB_ENABLE;
+	} else if (IS_GEN(i915, 5)) {
+		*cs++ = MI_SUSPEND_FLUSH;
+	}
+
+	ring_advance(ring, cs);
+}
+
+static struct i915_address_space *vm_alias(struct i915_address_space *vm)
+{
+	if (i915_is_ggtt(vm))
+		vm = &i915_vm_to_ggtt(vm)->alias->vm;
+
+	return vm;
+}
+
+static u32 pp_dir(const struct i915_ppgtt *ppgtt)
+{
+	return container_of(ppgtt, const struct gen6_ppgtt, base)->pp_dir;
+}
+
+static void load_pd_dir(struct intel_ring *ring,
+			struct intel_engine_cs *engine,
+			const struct i915_ppgtt *ppgtt)
+{
+	u32 *cs = ring_map_dw(ring, 10);
+
+	*cs++ = MI_LOAD_REGISTER_IMM(1);
+	*cs++ = i915_mmio_reg_offset(RING_PP_DIR_DCLV(engine->mmio_base));
+	*cs++ = PP_DIR_DCLV_2G;
+
+	*cs++ = MI_LOAD_REGISTER_IMM(1);
+	*cs++ = i915_mmio_reg_offset(RING_PP_DIR_BASE(engine->mmio_base));
+	*cs++ = pp_dir(ppgtt);
+
+	/* Stall until the page table load is complete? */
+	*cs++ = MI_STORE_REGISTER_MEM | MI_SRM_LRM_GLOBAL_GTT;
+	*cs++ = i915_mmio_reg_offset(RING_PP_DIR_BASE(engine->mmio_base));
+	*cs++ = intel_gt_scratch_offset(engine->gt,
+					INTEL_GT_SCRATCH_FIELD_DEFAULT);
+	*cs++ = MI_NOOP;
+
+	ring_advance(ring, cs);
+}
+
+static struct i915_address_space *current_vm(struct intel_engine_cs *engine)
+{
+	struct intel_context *old = engine->legacy.context;
+
+	return old ? vm_alias(old->vm) : NULL;
+}
+
+static void gen4_emit_invalidate_rcs(struct intel_ring *ring,
+				     struct intel_engine_cs *engine)
+{
+	u32 addr, flags;
+	u32 *cs;
+
+	addr = intel_gt_scratch_offset(engine->gt,
+				       INTEL_GT_SCRATCH_FIELD_RENDER_FLUSH);
+
+	flags = PIPE_CONTROL_QW_WRITE | PIPE_CONTROL_CS_STALL;
+	flags |= PIPE_CONTROL_TLB_INVALIDATE;
+
+	if (INTEL_GEN(engine->i915) >= 7)
+		flags |= PIPE_CONTROL_GLOBAL_GTT_IVB;
+	else
+		addr |= PIPE_CONTROL_GLOBAL_GTT;
+
+	cs = ring_map_dw(ring, 4);
+	*cs++ = GFX_OP_PIPE_CONTROL(4);
+	*cs++ = flags;
+	*cs++ = addr;
+	*cs++ = 0;
+	ring_advance(ring, cs);
+}
+
+static struct i915_address_space *
+clear_residuals(struct intel_ring *ring, struct intel_engine_cs *engine)
+{
+	struct intel_context *ce = engine->kernel_context;
+	struct i915_address_space *vm = vm_alias(engine->gt->vm);
+	u32 flags;
+
+	if (vm != current_vm(engine))
+		load_pd_dir(ring, engine, i915_vm_to_ppgtt(vm));
+
+	if (ce->state)
+		mi_set_context(ring, engine, ce,
+			       MI_MM_SPACE_GTT | MI_RESTORE_INHIBIT);
+
+	if (IS_HASWELL(engine->i915))
+		flags = MI_BATCH_PPGTT_HSW | MI_BATCH_NON_SECURE_HSW;
+	else
+		flags = MI_BATCH_NON_SECURE_I965;
+
+	__gen6_emit_bb_start(ring_map_dw(ring, 2),
+			     engine->wa_ctx.vma->node.start, flags);
+
+	return vm;
+}
+
+static void remap_l3_slice(struct intel_ring *ring,
+			   struct intel_engine_cs *engine,
+			   int slice)
+{
+	u32 *cs, *remap_info = engine->i915->l3_parity.remap_info[slice];
+	int i;
+
+	if (!remap_info)
+		return;
+
+	/*
+	 * Note: We do not worry about the concurrent register cacheline hang
+	 * here because no other code should access these registers other than
+	 * at initialization time.
+	 */
+	cs = ring_map_dw(ring, GEN7_L3LOG_SIZE / 4 * 2 + 2);
+	*cs++ = MI_LOAD_REGISTER_IMM(GEN7_L3LOG_SIZE / 4);
+	for (i = 0; i < GEN7_L3LOG_SIZE / 4; i++) {
+		*cs++ = i915_mmio_reg_offset(GEN7_L3LOG(slice, i));
+		*cs++ = remap_info[i];
+	}
+	*cs++ = MI_NOOP;
+	ring_advance(ring, cs);
+}
+
+static void remap_l3(struct intel_ring *ring,
+		     struct intel_engine_cs *engine,
+		     struct intel_context *ce)
+{
+	struct i915_gem_context *ctx =
+		rcu_dereference_protected(ce->gem_context, true);
+	int bit, idx = -1;
+
+	if (!ctx || !ctx->remap_slice)
+		return;
+
+	do {
+		bit = ffs(ctx->remap_slice);
+		remap_l3_slice(ring, engine, idx += bit);
+	} while (ctx->remap_slice >>= bit);
+}
+
 static void switch_context(struct intel_ring *ring, struct i915_request *rq)
 {
+	struct intel_engine_cs *engine = rq->engine;
+	struct i915_address_space *cvm = current_vm(engine);
+	struct intel_context *ce = rq->context;
+	struct i915_address_space *vm;
+
+	if (engine->wa_ctx.vma && ce != engine->kernel_context) {
+		if (engine->wa_ctx.vma->private != ce &&
+		    i915_mitigate_clear_residuals()) {
+			cvm = clear_residuals(ring, engine);
+			intel_context_put(engine->wa_ctx.vma->private);
+			engine->wa_ctx.vma->private = intel_context_get(ce);
+		}
+	}
+
+	vm = vm_alias(ce->vm);
+	if (vm != cvm)
+		load_pd_dir(ring, engine, i915_vm_to_ppgtt(vm));
+
+	if (ce->state) {
+		u32 flags;
+
+		GEM_BUG_ON(engine->id != RCS0);
+
+		/* For resource streamer on HSW+ and power context elsewhere */
+		BUILD_BUG_ON(HSW_MI_RS_SAVE_STATE_EN != MI_SAVE_EXT_STATE_EN);
+		BUILD_BUG_ON(HSW_MI_RS_RESTORE_STATE_EN != MI_RESTORE_EXT_STATE_EN);
+
+		flags = MI_SAVE_EXT_STATE_EN | MI_MM_SPACE_GTT;
+		if (test_bit(CONTEXT_VALID_BIT, &ce->flags)) {
+			gen4_emit_invalidate_rcs(ring, engine);
+			flags |= MI_RESTORE_EXT_STATE_EN;
+		} else {
+			flags |= MI_RESTORE_INHIBIT;
+		}
+
+		mi_set_context(ring, engine, ce, flags);
+	}
+
+	remap_l3(ring, engine, ce);
 }
 
 static struct i915_request *ring_submit(struct i915_request *rq)
@@ -229,6 +496,36 @@ static inline void write_tail(const struct intel_engine_cs *engine)
 	ENGINE_WRITE(engine, RING_TAIL, engine->legacy.ring->tail);
 }
 
+static void wa_write_tail(const struct intel_engine_cs *engine)
+{
+	const i915_reg_t psmi = RING_PSMI_CTL(engine->mmio_base);
+	struct intel_uncore *uncore = engine->uncore;
+
+	intel_uncore_write_fw(uncore, psmi,
+			      _MASKED_BIT_ENABLE(PSMI_SLEEP_MSG_DISABLE));
+
+	/* Clear the context id. Here be magic! */
+	intel_uncore_write64_fw(uncore, RING_RNCID(engine->mmio_base), 0x0);
+
+	/* Wait for the ring not to be idle, i.e. for it to wake up. */
+	if (__intel_wait_for_register_fw(uncore, psmi,
+					 PSMI_SLEEP_INDICATOR, 0,
+					 1000, 0, NULL))
+		drm_err(&uncore->i915->drm,
+			"timed out waiting for %s to wake up\n",
+			engine->name);
+
+	/* Now that the ring is fully powered up, update the tail */
+	write_tail(engine);
+
+	/*
+	 * Let the ring send IDLE messages to the GT again,
+	 * and so let it sleep to conserve power when idle.
+	 */
+	intel_uncore_write_fw(uncore, psmi,
+			      _MASKED_BIT_DISABLE(PSMI_SLEEP_MSG_DISABLE));
+}
+
 static void dequeue(struct intel_engine_cs *engine)
 {
 	struct intel_engine_execlists * const el = &engine->execlists;
@@ -278,7 +575,10 @@ static void dequeue(struct intel_engine_cs *engine)
 			schedule_in(*port);
 
 		wmb(); /* paranoid flush of WCB before RING_TAIL write */
-		write_tail(engine);
+		if (!engine->fw_active)
+			write_tail(engine);
+		else
+			wa_write_tail(engine);
 
 		WRITE_ONCE(el->active, el->inflight);
 		GEM_BUG_ON(!*el->active);
@@ -607,7 +907,14 @@ static int ring_context_pre_pin(struct intel_context *ce,
 				struct i915_gem_ww_ctx *ww,
 				void **unused)
 {
-	return 0;
+	struct i915_address_space *vm;
+	int err = 0;
+
+	vm = vm_alias(ce->vm);
+	if (vm)
+		err = gen6_ppgtt_pin(i915_vm_to_ppgtt((vm)), ww);
+
+	return err;
 }
 
 static int ring_context_pin(struct intel_context *ce, void *unused)
@@ -615,12 +922,22 @@ static int ring_context_pin(struct intel_context *ce, void *unused)
 	return 0;
 }
 
+static void __context_unpin_ppgtt(struct intel_context *ce)
+{
+	struct i915_address_space *vm;
+
+	vm = vm_alias(ce->vm);
+	if (vm)
+		gen6_ppgtt_unpin(i915_vm_to_ppgtt(vm));
+}
+
 static void ring_context_unpin(struct intel_context *ce)
 {
 }
 
 static void ring_context_post_unpin(struct intel_context *ce)
 {
+	__context_unpin_ppgtt(ce);
 }
 
 static void ring_context_reset(struct intel_context *ce)
@@ -680,12 +997,27 @@ static void ring_release(struct intel_engine_cs *engine)
 
 	set_current_context(&engine->legacy.context, NULL);
 
+	if (engine->wa_ctx.vma) {
+		intel_context_put(engine->wa_ctx.vma->private);
+		i915_vma_unpin_and_release(&engine->wa_ctx.vma, 0);
+	}
+
 	intel_ring_unpin(engine->legacy.ring);
 	intel_ring_put(engine->legacy.ring);
 }
 
 static void setup_irq(struct intel_engine_cs *engine)
 {
+	if (INTEL_GEN(engine->i915) >= 6) {
+		engine->irq_enable = gen6_irq_enable;
+		engine->irq_disable = gen6_irq_disable;
+	} else if (INTEL_GEN(engine->i915) >= 5) {
+		engine->irq_enable = gen5_irq_enable;
+		engine->irq_disable = gen5_irq_disable;
+	} else {
+		engine->irq_enable = gen3_irq_enable;
+		engine->irq_disable = gen3_irq_disable;
+	}
 }
 
 static void setup_common(struct intel_engine_cs *engine)
@@ -694,7 +1026,7 @@ static void setup_common(struct intel_engine_cs *engine)
 
 	/* gen8+ are only supported with execlists */
 	GEM_BUG_ON(INTEL_GEN(i915) >= 8);
-	GEM_BUG_ON(INTEL_GEN(i915) < 8);
+	GEM_BUG_ON(INTEL_GEN(i915) < 4);
 
 	setup_irq(engine);
 
@@ -712,24 +1044,80 @@ static void setup_common(struct intel_engine_cs *engine)
 	engine->cops = &ring_context_ops;
 	engine->request_alloc = ring_request_alloc;
 
+	engine->emit_init_breadcrumb = gen4_emit_init_breadcrumb_xcs;
+
+	if (INTEL_GEN(i915) >= 6)
+		engine->emit_bb_start = gen6_emit_bb_start;
+	else
+		engine->emit_bb_start = gen4_emit_bb_start;
+
+	if (INTEL_GEN(i915) >= 7)
+		engine->emit_fini_breadcrumb = gen7_emit_breadcrumb_xcs;
+	else if (INTEL_GEN(i915) >= 6)
+		engine->emit_fini_breadcrumb = gen6_emit_breadcrumb_xcs;
+	else
+		engine->emit_fini_breadcrumb = gen4_emit_breadcrumb_xcs;
+
 	engine->set_default_submission = set_default_submission;
 }
 
 static void setup_rcs(struct intel_engine_cs *engine)
 {
+	struct drm_i915_private *i915 = engine->i915;
+
+	if (HAS_L3_DPF(i915))
+		engine->irq_keep_mask = GT_RENDER_L3_PARITY_ERROR_INTERRUPT;
+
+	engine->irq_enable_mask = GT_RENDER_USER_INTERRUPT;
+
+	if (INTEL_GEN(i915) >= 7) {
+		engine->emit_flush = gen7_emit_flush_rcs;
+		engine->emit_fini_breadcrumb = gen7_emit_breadcrumb_rcs;
+		if (IS_HASWELL(i915))
+			engine->emit_bb_start = hsw_emit_bb_start;
+	} else if (INTEL_GEN(i915) >= 6) {
+		engine->emit_flush = gen6_emit_flush_rcs;
+		engine->emit_fini_breadcrumb = gen6_emit_breadcrumb_rcs;
+	} else if (INTEL_GEN(i915) >= 5) {
+		engine->emit_flush = gen4_emit_flush_rcs;
+	} else {
+		engine->emit_flush = gen4_emit_flush_rcs;
+		engine->irq_enable_mask = I915_USER_INTERRUPT;
+	}
 }
 
 static void setup_vcs(struct intel_engine_cs *engine)
 {
+	if (INTEL_GEN(engine->i915) >= 6) {
+		if (IS_GEN(engine->i915, 6))
+			engine->fw_domain = FORCEWAKE_ALL;
+		engine->emit_flush = gen6_emit_flush_vcs;
+		engine->irq_enable_mask = GT_BSD_USER_INTERRUPT;
+	} else if (INTEL_GEN(engine->i915) >= 5) {
+		engine->emit_flush = gen4_emit_flush_vcs;
+		engine->irq_enable_mask = ILK_BSD_USER_INTERRUPT;
+	} else {
+		engine->emit_flush = gen4_emit_flush_vcs;
+		engine->irq_enable_mask = I915_BSD_USER_INTERRUPT;
+	}
 }
 
 static void setup_bcs(struct intel_engine_cs *engine)
 {
+	GEM_BUG_ON(INTEL_GEN(engine->i915) < 6);
+
+	engine->emit_flush = gen6_emit_flush_xcs;
+	engine->irq_enable_mask = GT_BLT_USER_INTERRUPT;
 }
 
 static void setup_vecs(struct intel_engine_cs *engine)
 {
 	GEM_BUG_ON(!IS_HASWELL(engine->i915));
+
+	engine->emit_flush = gen6_emit_flush_xcs;
+	engine->irq_enable_mask = PM_VEBOX_USER_INTERRUPT;
+	engine->irq_enable = hsw_irq_enable_vecs;
+	engine->irq_disable = hsw_irq_disable_vecs;
 }
 
 static unsigned int global_ring_size(void)
@@ -738,6 +1126,58 @@ static unsigned int global_ring_size(void)
 	return roundup_pow_of_two(EXECLIST_MAX_PORTS * SZ_16K + SZ_4K);
 }
 
+static int gen7_ctx_switch_bb_init(struct intel_engine_cs *engine)
+{
+	struct drm_i915_gem_object *obj;
+	struct i915_vma *vma;
+	int size;
+	int err;
+
+	size = gen7_setup_clear_gpr_bb(engine, NULL /* probe size */);
+	if (size <= 0)
+		return size;
+
+	size = ALIGN(size, PAGE_SIZE);
+	obj = i915_gem_object_create_internal(engine->i915, size);
+	if (IS_ERR(obj))
+		return PTR_ERR(obj);
+
+	vma = i915_vma_instance(obj, engine->gt->vm, NULL);
+	if (IS_ERR(vma)) {
+		err = PTR_ERR(vma);
+		goto err_obj;
+	}
+
+	vma->private = intel_context_create(engine); /* dummy residuals */
+	if (IS_ERR(vma->private)) {
+		err = PTR_ERR(vma->private);
+		goto err_obj;
+	}
+
+	err = i915_vma_pin(vma, 0, 0, PIN_USER | PIN_HIGH);
+	if (err)
+		goto err_private;
+
+	err = i915_vma_sync(vma);
+	if (err)
+		goto err_unpin;
+
+	err = gen7_setup_clear_gpr_bb(engine, vma);
+	if (err)
+		goto err_unpin;
+
+	engine->wa_ctx.vma = vma;
+	return 0;
+
+err_unpin:
+	i915_vma_unpin(vma);
+err_private:
+	intel_context_put(vma->private);
+err_obj:
+	i915_gem_object_put(obj);
+	return err;
+}
+
 int intel_ring_scheduler_setup(struct intel_engine_cs *engine)
 {
 	struct intel_ring *ring;
@@ -781,6 +1221,12 @@ int intel_ring_scheduler_setup(struct intel_engine_cs *engine)
 	GEM_BUG_ON(engine->legacy.ring);
 	engine->legacy.ring = ring;
 
+	if (IS_GEN(engine->i915, 7) && engine->class == RENDER_CLASS) {
+		err = gen7_ctx_switch_bb_init(engine);
+		if (err)
+			goto err_ring_unpin;
+	}
+
 	engine->flags |= I915_ENGINE_HAS_SCHEDULER;
 	engine->flags |= I915_ENGINE_NEEDS_BREADCRUMB_TASKLET;
 	engine->flags |= I915_ENGINE_SUPPORTS_STATS;
@@ -789,6 +1235,8 @@ int intel_ring_scheduler_setup(struct intel_engine_cs *engine)
 	engine->release = ring_release;
 	return 0;
 
+err_ring_unpin:
+	intel_ring_unpin(ring);
 err_ring:
 	intel_ring_put(ring);
 err:
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index 8b9bbc6bacb1..c4cbdb420530 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -2529,7 +2529,16 @@ static inline bool i915_mmio_reg_valid(i915_reg_t reg)
 #define GEN6_VERSYNC	(RING_SYNC_1(VEBOX_RING_BASE))
 #define GEN6_VEVSYNC	(RING_SYNC_2(VEBOX_RING_BASE))
 #define GEN6_NOSYNC	INVALID_MMIO_REG
+
 #define RING_PSMI_CTL(base)	_MMIO((base) + 0x50)
+#define   PSMI_SLEEP_MSG_DISABLE		REG_BIT(0)
+#define   PSMI_SLEEP_FLUSH_DISABLE		REG_BIT(2)
+#define   PSMI_SLEEP_INDICATOR			REG_BIT(3)
+#define   PSMI_GO_INDICATOR			REG_BIT(4)
+#define   GEN12_PSMI_WAIT_FOR_EVENT_POWER_DOWN_DISABLE REG_BIT(7)
+#define   GEN8_PSMI_FF_DOP_CLOCK_GATE_DISABLE	REG_BIT(10)
+#define   GEN8_PSMI_RC_SEMA_IDLE_MSG_DISABLE	REG_BIT(12)
+
 #define RING_MAX_IDLE(base)	_MMIO((base) + 0x54)
 #define RING_HWS_PGA(base)	_MMIO((base) + 0x80)
 #define RING_ID(base)		_MMIO((base) + 0x8c)
@@ -2539,6 +2548,7 @@ static inline bool i915_mmio_reg_valid(i915_reg_t reg)
 #define   RESET_CTL_READY_TO_RESET REG_BIT(1)
 #define   RESET_CTL_REQUEST_RESET  REG_BIT(0)
 
+#define RING_RNCID(base)	_MMIO((base) + 0x198)
 #define RING_SEMA_WAIT_POLL(base) _MMIO((base) + 0x24c)
 
 #define HSW_GTT_CACHE_EN	_MMIO(0x4024)
-- 
2.20.1

* [Intel-gfx] [PATCH 40/41] drm/i915/gt: Enable ring scheduling for gen5-7
  2021-01-25 14:00 [Intel-gfx] [PATCH 01/41] drm/i915/selftests: Check for engine-reset errors in the middle of workarounds Chris Wilson
                   ` (37 preceding siblings ...)
  2021-01-25 14:01 ` [Intel-gfx] [PATCH 39/41] drm/i915/gt: Implement ring scheduler for gen4-7 Chris Wilson
@ 2021-01-25 14:01 ` Chris Wilson
  2021-01-25 14:01 ` [Intel-gfx] [PATCH 41/41] drm/i915: Support secure dispatch on gen6/gen7 Chris Wilson
                   ` (5 subsequent siblings)
  44 siblings, 0 replies; 90+ messages in thread
From: Chris Wilson @ 2021-01-25 14:01 UTC (permalink / raw)
  To: intel-gfx; +Cc: thomas.hellstrom, Chris Wilson

Switch over from FIFO global submission to the priority-sorted
topological scheduler. At the cost of more busy work on the CPU to
keep the GPU supplied with the next packet of requests, this allows us
to reorder requests around submission stalls and so allow low latency
under load while maintaining fairness between clients.

The downside is that we enable interrupts on all requests (unlike with
execlists, where we have an interrupt for context switches). This means
that instead of receiving an interrupt only when we are waiting for
completion, we are processing them all the time, with noticeable
overhead of CPU time absorbed by the interrupt handler. The effect is
most pronounced on CPU-throughput limited renderers like uxa, where
performance can be degraded by 20% in the worst case. Nevertheless, this
is a pathological example of an obsolete userspace driver. (There are
also cases where uxa performs better by 20%, which is an interesting
quirk...) The glxgears-not-a-benchmark (CPU throughput bound) is one
such example of a performance hit, only affecting uxa.

The expectation is that allowing request reordering will allow much
smoother UX that greatly compensates for reduced throughput under high
submission load (but low GPU load).

This also enables the timer based RPS for better powersaving, with the
exception of Valleyview whose PCU doesn't take kindly to our
interference.

References: 0f46832fab77 ("drm/i915: Mask USER interrupts on gen6 (until required)")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
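Note for readers: the selection order that the intel_engines_init() hunk
below produces can be sketched as a standalone function. The struct,
enum and the uses_guc flag are hypothetical stand-ins invented for this
note; the first branch of the real chain lies outside the hunk, so its
condition is an assumption here.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical labels for the four submission backends. */
enum sketch_backend {
	SKETCH_GUC,
	SKETCH_EXECLISTS,
	SKETCH_RING_SCHEDULER,
	SKETCH_LEGACY_RING,
};

/* Hypothetical device description; not the i915 structures. */
struct sketch_device {
	int gen;		/* hardware generation */
	bool uses_guc;		/* assumption: GuC submission selected */
	bool has_execlists;	/* execlists available */
};

/* First match wins, mirroring the if/else-if chain in the hunk. */
static enum sketch_backend pick_backend(const struct sketch_device *dev)
{
	assert(dev);

	if (dev->uses_guc)
		return SKETCH_GUC;
	else if (dev->has_execlists)
		return SKETCH_EXECLISTS;
	else if (dev->gen >= 5)
		return SKETCH_RING_SCHEDULER;
	else
		return SKETCH_LEGACY_RING;
}
```

Under this ordering a gen4 device still falls through to the legacy ring
backend, which matches the gen5-7 range in the subject line.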
 drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c | 2 +-
 drivers/gpu/drm/i915/gt/intel_engine_cs.c             | 2 ++
 drivers/gpu/drm/i915/gt/intel_rps.c                   | 6 ++----
 3 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c b/drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c
index d3f87dc4eda3..2246b5c308dc 100644
--- a/drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c
@@ -94,7 +94,7 @@ static int live_nop_switch(void *arg)
 			rq = i915_request_get(this);
 			i915_request_add(this);
 		}
-		if (i915_request_wait(rq, 0, HZ / 5) < 0) {
+		if (i915_request_wait(rq, 0, HZ) < 0) {
 			pr_err("Failed to populated %d contexts\n", nctx);
 			intel_gt_set_wedged(&i915->gt);
 			i915_request_put(rq);
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
index 936820b240dd..99d910f2c172 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
@@ -868,6 +868,8 @@ int intel_engines_init(struct intel_gt *gt)
 		setup = intel_guc_submission_setup;
 	else if (HAS_EXECLISTS(gt->i915))
 		setup = intel_execlists_submission_setup;
+	else if (INTEL_GEN(gt->i915) >= 5)
+		setup = intel_ring_scheduler_setup;
 	else
 		setup = intel_ring_submission_setup;
 
diff --git a/drivers/gpu/drm/i915/gt/intel_rps.c b/drivers/gpu/drm/i915/gt/intel_rps.c
index 900c20a6d073..2c78d61e7ea9 100644
--- a/drivers/gpu/drm/i915/gt/intel_rps.c
+++ b/drivers/gpu/drm/i915/gt/intel_rps.c
@@ -1081,9 +1081,7 @@ static bool gen6_rps_enable(struct intel_rps *rps)
 	intel_uncore_write_fw(uncore, GEN6_RP_DOWN_TIMEOUT, 50000);
 	intel_uncore_write_fw(uncore, GEN6_RP_IDLE_HYSTERSIS, 10);
 
-	rps->pm_events = (GEN6_PM_RP_UP_THRESHOLD |
-			  GEN6_PM_RP_DOWN_THRESHOLD |
-			  GEN6_PM_RP_DOWN_TIMEOUT);
+	rps->pm_events = GEN6_PM_RP_UP_THRESHOLD | GEN6_PM_RP_DOWN_THRESHOLD;
 
 	return rps_reset(rps);
 }
@@ -1391,7 +1389,7 @@ void intel_rps_enable(struct intel_rps *rps)
 	GEM_BUG_ON(rps->efficient_freq < rps->min_freq);
 	GEM_BUG_ON(rps->efficient_freq > rps->max_freq);
 
-	if (has_busy_stats(rps))
+	if (has_busy_stats(rps) && !IS_VALLEYVIEW(i915))
 		intel_rps_set_timer(rps);
 	else if (INTEL_GEN(i915) >= 6)
 		intel_rps_set_interrupts(rps);
-- 
2.20.1

* [Intel-gfx] [PATCH 41/41] drm/i915: Support secure dispatch on gen6/gen7
  2021-01-25 14:00 [Intel-gfx] [PATCH 01/41] drm/i915/selftests: Check for engine-reset errors in the middle of workarounds Chris Wilson
                   ` (38 preceding siblings ...)
  2021-01-25 14:01 ` [Intel-gfx] [PATCH 40/41] drm/i915/gt: Enable ring scheduling for gen5-7 Chris Wilson
@ 2021-01-25 14:01 ` Chris Wilson
  2021-01-25 14:40 ` [Intel-gfx] [PATCH 01/41] drm/i915/selftests: Check for engine-reset errors in the middle of workarounds Tvrtko Ursulin
                   ` (4 subsequent siblings)
  44 siblings, 0 replies; 90+ messages in thread
From: Chris Wilson @ 2021-01-25 14:01 UTC (permalink / raw)
  To: intel-gfx; +Cc: thomas.hellstrom, Chris Wilson

Re-enable secure dispatch for gen6/gen7, primarily to work around the
command parser and overly zealous command validation on Haswell.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/i915_drv.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index f281c5799133..b5a530cd7122 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1675,7 +1675,7 @@ tgl_stepping_get(struct drm_i915_private *dev_priv)
 #define HAS_LLC(dev_priv)	(INTEL_INFO(dev_priv)->has_llc)
 #define HAS_SNOOP(dev_priv)	(INTEL_INFO(dev_priv)->has_snoop)
 #define HAS_EDRAM(dev_priv)	((dev_priv)->edram_size_mb)
-#define HAS_SECURE_BATCHES(dev_priv) (INTEL_GEN(dev_priv) < 6)
+#define HAS_SECURE_BATCHES(dev_priv) (INTEL_GEN(dev_priv) < 8)
 #define HAS_WT(dev_priv)	HAS_EDRAM(dev_priv)
 
 #define HWS_NEEDS_PHYSICAL(dev_priv)	(INTEL_INFO(dev_priv)->hws_needs_physical)
-- 
2.20.1

* Re: [Intel-gfx] [PATCH 01/41] drm/i915/selftests: Check for engine-reset errors in the middle of workarounds
  2021-01-25 14:00 [Intel-gfx] [PATCH 01/41] drm/i915/selftests: Check for engine-reset errors in the middle of workarounds Chris Wilson
                   ` (39 preceding siblings ...)
  2021-01-25 14:01 ` [Intel-gfx] [PATCH 41/41] drm/i915: Support secure dispatch on gen6/gen7 Chris Wilson
@ 2021-01-25 14:40 ` Tvrtko Ursulin
  2021-01-25 17:08 ` [Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for series starting with [01/41] " Patchwork
                   ` (3 subsequent siblings)
  44 siblings, 0 replies; 90+ messages in thread
From: Tvrtko Ursulin @ 2021-01-25 14:40 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx; +Cc: thomas.hellstrom


On 25/01/2021 14:00, Chris Wilson wrote:
> As we reset the engine between verifying the workarounds remain intact,
> report an engine reset failure.
> 
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> ---
>   drivers/gpu/drm/i915/gt/selftest_workarounds.c | 16 +++++++++++++---
>   1 file changed, 13 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/selftest_workarounds.c b/drivers/gpu/drm/i915/gt/selftest_workarounds.c
> index 37ea46907a7d..af33a720dbf8 100644
> --- a/drivers/gpu/drm/i915/gt/selftest_workarounds.c
> +++ b/drivers/gpu/drm/i915/gt/selftest_workarounds.c
> @@ -1219,7 +1219,11 @@ live_engine_reset_workarounds(void *arg)
>   			goto err;
>   		}
>   
> -		intel_engine_reset(engine, "live_workarounds:idle");
> +		ret = intel_engine_reset(engine, "live_workarounds:idle");
> +		if (ret) {
> +			pr_err("%s: Reset failed while idle\n", engine->name);
> +			goto err;
> +		}
>   
>   		ok = verify_wa_lists(gt, &lists, "after idle reset");
>   		if (!ok) {
> @@ -1240,12 +1244,18 @@ live_engine_reset_workarounds(void *arg)
>   
>   		ret = request_add_spin(rq, &spin);
>   		if (ret) {
> -			pr_err("Spinner failed to start\n");
> +			pr_err("%s: Spinner failed to start\n", engine->name);
>   			igt_spinner_fini(&spin);
>   			goto err;
>   		}
>   
> -		intel_engine_reset(engine, "live_workarounds:active");
> +		ret = intel_engine_reset(engine, "live_workarounds:active");
> +		if (ret) {
> +			pr_err("%s: Reset failed on an active spinner\n",
> +			       engine->name);
> +			igt_spinner_fini(&spin);
> +			goto err;
> +		}
>   
>   		igt_spinner_end(&spin);
>   		igt_spinner_fini(&spin);
> 

Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Regards,

Tvrtko

* Re: [Intel-gfx] [PATCH 02/41] drm/i915/gt: Move the defer_request waiter active assertion
  2021-01-25 14:00 ` [Intel-gfx] [PATCH 02/41] drm/i915/gt: Move the defer_request waiter active assertion Chris Wilson
@ 2021-01-25 14:53   ` Tvrtko Ursulin
  0 siblings, 0 replies; 90+ messages in thread
From: Tvrtko Ursulin @ 2021-01-25 14:53 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx; +Cc: thomas.hellstrom


On 25/01/2021 14:00, Chris Wilson wrote:
> In defer_request() we start with the request we just unsubmitted (that
> should be the active request on the gpu) and then defer all of its
> waiters. No waiter should be ahead of the active request, so none should
> be marked as active. That assert failed.
> 
> Of particular note this machine was undergoing persistent GPU result due

s/result/reset/

> to underlying HW issues, so that may be a clue. A request is also marked
> as active when it is retired, regardless of current queue status, and so
> this assertion failure may be a result of the queue being completed by
> the reset and then subsequently processed by the tasklet.
> 
> We can filter out retired requests here by doing the assertion check
> after the is-ready check (active is a subset of being ready).
> 
> Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/2978
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> ---
>   drivers/gpu/drm/i915/gt/intel_execlists_submission.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
> index 24731be6e462..56e36d938851 100644
> --- a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
> +++ b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
> @@ -1061,7 +1061,6 @@ static void defer_request(struct i915_request *rq, struct list_head * const pl)
>   				   __i915_request_has_started(w) &&
>   				   !__i915_request_is_complete(rq));
>   
> -			GEM_BUG_ON(i915_request_is_active(w));
>   			if (!i915_request_is_ready(w))
>   				continue;
>   
> @@ -1069,6 +1068,7 @@ static void defer_request(struct i915_request *rq, struct list_head * const pl)
>   				continue;
>   
>   			GEM_BUG_ON(rq_prio(w) > rq_prio(rq));
> +			GEM_BUG_ON(i915_request_is_active(w));
>   			list_move_tail(&w->sched.link, &list);
>   		}
>   
> 

Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Regards,

Tvrtko

* Re: [Intel-gfx] [PATCH 03/41] drm/i915: Replace engine->schedule() with a known request operation
  2021-01-25 14:00 ` [Intel-gfx] [PATCH 03/41] drm/i915: Replace engine->schedule() with a known request operation Chris Wilson
@ 2021-01-25 15:14   ` Tvrtko Ursulin
  0 siblings, 0 replies; 90+ messages in thread
From: Tvrtko Ursulin @ 2021-01-25 15:14 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx; +Cc: thomas.hellstrom


On 25/01/2021 14:00, Chris Wilson wrote:
> Looking to the future, we want to set the scheduling attributes
> explicitly and so replace the generic engine->schedule() with the more
> direct i915_request_set_priority()
> 
> What it loses in removing the 'schedule' name from the function, it
> gains in having an explicit entry point with a stated goal.
> 
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> ---
>   drivers/gpu/drm/i915/display/intel_display.c  |  5 ++-
>   drivers/gpu/drm/i915/gem/i915_gem_object.h    |  5 ++-
>   drivers/gpu/drm/i915/gem/i915_gem_wait.c      | 29 +++++-----------
>   drivers/gpu/drm/i915/gt/intel_engine_cs.c     |  3 --
>   .../gpu/drm/i915/gt/intel_engine_heartbeat.c  |  4 +--
>   drivers/gpu/drm/i915/gt/intel_engine_types.h  | 29 ++++++++--------
>   drivers/gpu/drm/i915/gt/intel_engine_user.c   |  2 +-
>   .../drm/i915/gt/intel_execlists_submission.c  |  3 +-
>   drivers/gpu/drm/i915/gt/selftest_execlists.c  | 33 +++++--------------
>   drivers/gpu/drm/i915/gt/selftest_hangcheck.c  | 11 +++----
>   .../gpu/drm/i915/gt/uc/intel_guc_submission.c |  1 -
>   drivers/gpu/drm/i915/i915_request.c           | 10 +++---
>   drivers/gpu/drm/i915/i915_scheduler.c         | 15 +++++----
>   drivers/gpu/drm/i915/i915_scheduler.h         |  3 +-
>   14 files changed, 59 insertions(+), 94 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/display/intel_display.c b/drivers/gpu/drm/i915/display/intel_display.c
> index 7ec7d94b8cdb..2e80babd1f66 100644
> --- a/drivers/gpu/drm/i915/display/intel_display.c
> +++ b/drivers/gpu/drm/i915/display/intel_display.c
> @@ -13632,7 +13632,6 @@ int
>   intel_prepare_plane_fb(struct drm_plane *_plane,
>   		       struct drm_plane_state *_new_plane_state)
>   {
> -	struct i915_sched_attr attr = { .priority = I915_PRIORITY_DISPLAY };
>   	struct intel_plane *plane = to_intel_plane(_plane);
>   	struct intel_plane_state *new_plane_state =
>   		to_intel_plane_state(_new_plane_state);
> @@ -13673,7 +13672,7 @@ intel_prepare_plane_fb(struct drm_plane *_plane,
>   
>   	if (new_plane_state->uapi.fence) { /* explicit fencing */
>   		i915_gem_fence_wait_priority(new_plane_state->uapi.fence,
> -					     &attr);
> +					     I915_PRIORITY_DISPLAY);
>   		ret = i915_sw_fence_await_dma_fence(&state->commit_ready,
>   						    new_plane_state->uapi.fence,
>   						    i915_fence_timeout(dev_priv),
> @@ -13695,7 +13694,7 @@ intel_prepare_plane_fb(struct drm_plane *_plane,
>   	if (ret)
>   		return ret;
>   
> -	i915_gem_object_wait_priority(obj, 0, &attr);
> +	i915_gem_object_wait_priority(obj, 0, I915_PRIORITY_DISPLAY);
>   	i915_gem_object_flush_frontbuffer(obj, ORIGIN_DIRTYFB);
>   
>   	if (!new_plane_state->uapi.fence) { /* implicit fencing */
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.h b/drivers/gpu/drm/i915/gem/i915_gem_object.h
> index 3411ad197fa6..325766abca21 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_object.h
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_object.h
> @@ -549,15 +549,14 @@ static inline void __start_cpu_write(struct drm_i915_gem_object *obj)
>   		obj->cache_dirty = true;
>   }
>   
> -void i915_gem_fence_wait_priority(struct dma_fence *fence,
> -				  const struct i915_sched_attr *attr);
> +void i915_gem_fence_wait_priority(struct dma_fence *fence, int prio);
>   
>   int i915_gem_object_wait(struct drm_i915_gem_object *obj,
>   			 unsigned int flags,
>   			 long timeout);
>   int i915_gem_object_wait_priority(struct drm_i915_gem_object *obj,
>   				  unsigned int flags,
> -				  const struct i915_sched_attr *attr);
> +				  int prio);
>   
>   void __i915_gem_object_flush_frontbuffer(struct drm_i915_gem_object *obj,
>   					 enum fb_op_origin origin);
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_wait.c b/drivers/gpu/drm/i915/gem/i915_gem_wait.c
> index 4b9856d5ba14..d79bf16083bd 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_wait.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_wait.c
> @@ -91,22 +91,12 @@ i915_gem_object_wait_reservation(struct dma_resv *resv,
>   	return timeout;
>   }
>   
> -static void fence_set_priority(struct dma_fence *fence,
> -			       const struct i915_sched_attr *attr)
> +static void fence_set_priority(struct dma_fence *fence, int prio)
>   {
> -	struct i915_request *rq;
> -	struct intel_engine_cs *engine;
> -
>   	if (dma_fence_is_signaled(fence) || !dma_fence_is_i915(fence))
>   		return;
>   
> -	rq = to_request(fence);
> -	engine = rq->engine;
> -
> -	rcu_read_lock(); /* RCU serialisation for set-wedged protection */
> -	if (engine->schedule)
> -		engine->schedule(rq, attr);
> -	rcu_read_unlock();
> +	i915_request_set_priority(to_request(fence), prio);

Do we still need the dma_fence_is_i915 check here, or could it at least
become an assert?

>   }
>   
>   static inline bool __dma_fence_is_chain(const struct dma_fence *fence)
> @@ -114,8 +104,7 @@ static inline bool __dma_fence_is_chain(const struct dma_fence *fence)
>   	return fence->ops == &dma_fence_chain_ops;
>   }
>   
> -void i915_gem_fence_wait_priority(struct dma_fence *fence,
> -				  const struct i915_sched_attr *attr)
> +void i915_gem_fence_wait_priority(struct dma_fence *fence, int prio)
>   {
>   	if (dma_fence_is_signaled(fence))
>   		return;
> @@ -128,19 +117,19 @@ void i915_gem_fence_wait_priority(struct dma_fence *fence,
>   		int i;
>   
>   		for (i = 0; i < array->num_fences; i++)
> -			fence_set_priority(array->fences[i], attr);
> +			fence_set_priority(array->fences[i], prio);
>   	} else if (__dma_fence_is_chain(fence)) {
>   		struct dma_fence *iter;
>   
>   		/* The chain is ordered; if we boost the last, we boost all */
>   		dma_fence_chain_for_each(iter, fence) {
>   			fence_set_priority(to_dma_fence_chain(iter)->fence,
> -					   attr);
> +					   prio);
>   			break;
>   		}
>   		dma_fence_put(iter);
>   	} else {
> -		fence_set_priority(fence, attr);
> +		fence_set_priority(fence, prio);
>   	}
>   
>   	local_bh_enable(); /* kick the tasklets if queues were reprioritised */
> @@ -149,7 +138,7 @@ void i915_gem_fence_wait_priority(struct dma_fence *fence,
>   int
>   i915_gem_object_wait_priority(struct drm_i915_gem_object *obj,
>   			      unsigned int flags,
> -			      const struct i915_sched_attr *attr)
> +			      int prio)
>   {
>   	struct dma_fence *excl;
>   
> @@ -164,7 +153,7 @@ i915_gem_object_wait_priority(struct drm_i915_gem_object *obj,
>   			return ret;
>   
>   		for (i = 0; i < count; i++) {
> -			i915_gem_fence_wait_priority(shared[i], attr);
> +			i915_gem_fence_wait_priority(shared[i], prio);
>   			dma_fence_put(shared[i]);
>   		}
>   
> @@ -174,7 +163,7 @@ i915_gem_object_wait_priority(struct drm_i915_gem_object *obj,
>   	}
>   
>   	if (excl) {
> -		i915_gem_fence_wait_priority(excl, attr);
> +		i915_gem_fence_wait_priority(excl, prio);
>   		dma_fence_put(excl);
>   	}
>   	return 0;
> diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> index ac9e020dbc9e..7e580d3ac58f 100644
> --- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> +++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> @@ -319,9 +319,6 @@ static int intel_engine_setup(struct intel_gt *gt, enum intel_engine_id id)
>   	if (engine->context_size)
>   		DRIVER_CAPS(i915)->has_logical_contexts = true;
>   
> -	/* Nothing to do here, execute in order of dependencies */
> -	engine->schedule = NULL;
> -
>   	ewma__engine_latency_init(&engine->latency);
>   	seqcount_init(&engine->stats.lock);
>   
> diff --git a/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c b/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c
> index 778bcae5ef2c..0b026cde9f09 100644
> --- a/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c
> +++ b/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c
> @@ -114,7 +114,7 @@ static void heartbeat(struct work_struct *wrk)
>   			 * but all other contexts, including the kernel
>   			 * context are stuck waiting for the signal.
>   			 */
> -		} else if (engine->schedule &&
> +		} else if (intel_engine_has_scheduler(engine) &&
>   			   rq->sched.attr.priority < I915_PRIORITY_BARRIER) {
>   			/*
>   			 * Gradually raise the priority of the heartbeat to
> @@ -129,7 +129,7 @@ static void heartbeat(struct work_struct *wrk)
>   				attr.priority = I915_PRIORITY_BARRIER;
>   
>   			local_bh_disable();
> -			engine->schedule(rq, &attr);
> +			i915_request_set_priority(rq, attr.priority);
>   			local_bh_enable();
>   		} else {
>   			if (IS_ENABLED(CONFIG_DRM_I915_DEBUG_GEM))
> diff --git a/drivers/gpu/drm/i915/gt/intel_engine_types.h b/drivers/gpu/drm/i915/gt/intel_engine_types.h
> index 883bafc44902..27cb3dc0233b 100644
> --- a/drivers/gpu/drm/i915/gt/intel_engine_types.h
> +++ b/drivers/gpu/drm/i915/gt/intel_engine_types.h
> @@ -453,14 +453,6 @@ struct intel_engine_cs {
>   	void            (*bond_execute)(struct i915_request *rq,
>   					struct dma_fence *signal);
>   
> -	/*
> -	 * Call when the priority on a request has changed and it and its
> -	 * dependencies may need rescheduling. Note the request itself may
> -	 * not be ready to run!
> -	 */
> -	void		(*schedule)(struct i915_request *request,
> -				    const struct i915_sched_attr *attr);
> -
>   	void		(*release)(struct intel_engine_cs *engine);
>   
>   	struct intel_engine_execlists execlists;
> @@ -478,13 +470,14 @@ struct intel_engine_cs {
>   
>   #define I915_ENGINE_USING_CMD_PARSER BIT(0)
>   #define I915_ENGINE_SUPPORTS_STATS   BIT(1)
> -#define I915_ENGINE_HAS_PREEMPTION   BIT(2)
> -#define I915_ENGINE_HAS_SEMAPHORES   BIT(3)
> -#define I915_ENGINE_HAS_TIMESLICES   BIT(4)
> -#define I915_ENGINE_NEEDS_BREADCRUMB_TASKLET BIT(5)
> -#define I915_ENGINE_IS_VIRTUAL       BIT(6)
> -#define I915_ENGINE_HAS_RELATIVE_MMIO BIT(7)
> -#define I915_ENGINE_REQUIRES_CMD_PARSER BIT(8)
> +#define I915_ENGINE_HAS_SCHEDULER    BIT(2)
> +#define I915_ENGINE_HAS_PREEMPTION   BIT(3)
> +#define I915_ENGINE_HAS_SEMAPHORES   BIT(4)
> +#define I915_ENGINE_HAS_TIMESLICES   BIT(5)
> +#define I915_ENGINE_NEEDS_BREADCRUMB_TASKLET BIT(6)
> +#define I915_ENGINE_IS_VIRTUAL       BIT(7)
> +#define I915_ENGINE_HAS_RELATIVE_MMIO BIT(8)
> +#define I915_ENGINE_REQUIRES_CMD_PARSER BIT(9)
>   	unsigned int flags;
>   
>   	/*
> @@ -572,6 +565,12 @@ intel_engine_supports_stats(const struct intel_engine_cs *engine)
>   	return engine->flags & I915_ENGINE_SUPPORTS_STATS;
>   }
>   
> +static inline bool
> +intel_engine_has_scheduler(const struct intel_engine_cs *engine)
> +{
> +	return engine->flags & I915_ENGINE_HAS_SCHEDULER;
> +}
> +
>   static inline bool
>   intel_engine_has_preemption(const struct intel_engine_cs *engine)
>   {
> diff --git a/drivers/gpu/drm/i915/gt/intel_engine_user.c b/drivers/gpu/drm/i915/gt/intel_engine_user.c
> index 1cbd84eb24e4..64eccdf32a22 100644
> --- a/drivers/gpu/drm/i915/gt/intel_engine_user.c
> +++ b/drivers/gpu/drm/i915/gt/intel_engine_user.c
> @@ -107,7 +107,7 @@ static void set_scheduler_caps(struct drm_i915_private *i915)
>   	for_each_uabi_engine(engine, i915) { /* all engines must agree! */
>   		int i;
>   
> -		if (engine->schedule)
> +		if (intel_engine_has_scheduler(engine))
>   			enabled |= (I915_SCHEDULER_CAP_ENABLED |
>   				    I915_SCHEDULER_CAP_PRIORITY);
>   		else
> diff --git a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
> index 56e36d938851..309fb421ff5c 100644
> --- a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
> +++ b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
> @@ -3073,7 +3073,6 @@ static bool can_preempt(struct intel_engine_cs *engine)
>   static void execlists_set_default_submission(struct intel_engine_cs *engine)
>   {
>   	engine->submit_request = execlists_submit_request;
> -	engine->schedule = i915_schedule;
>   	engine->execlists.tasklet.func = execlists_submission_tasklet;
>   
>   	engine->reset.prepare = execlists_reset_prepare;
> @@ -3084,6 +3083,7 @@ static void execlists_set_default_submission(struct intel_engine_cs *engine)
>   	engine->park = execlists_park;
>   	engine->unpark = NULL;
>   
> +	engine->flags |= I915_ENGINE_HAS_SCHEDULER;
>   	engine->flags |= I915_ENGINE_SUPPORTS_STATS;
>   	if (!intel_vgpu_active(engine->i915)) {
>   		engine->flags |= I915_ENGINE_HAS_SEMAPHORES;
> @@ -3646,7 +3646,6 @@ intel_execlists_create_virtual(struct intel_engine_cs **siblings,
>   	ve->base.cops = &virtual_context_ops;
>   	ve->base.request_alloc = execlists_request_alloc;
>   
> -	ve->base.schedule = i915_schedule;
>   	ve->base.submit_request = virtual_submit_request;
>   	ve->base.bond_execute = virtual_bond_execute;
>   
> diff --git a/drivers/gpu/drm/i915/gt/selftest_execlists.c b/drivers/gpu/drm/i915/gt/selftest_execlists.c
> index 6cfa9a89d891..acb7c089d05b 100644
> --- a/drivers/gpu/drm/i915/gt/selftest_execlists.c
> +++ b/drivers/gpu/drm/i915/gt/selftest_execlists.c
> @@ -268,12 +268,8 @@ static int live_unlite_restore(struct intel_gt *gt, int prio)
>   		i915_request_put(rq[0]);
>   
>   		if (prio) {
> -			struct i915_sched_attr attr = {
> -				.priority = prio,
> -			};
> -
>   			/* Alternatively preempt the spinner with ce[1] */
> -			engine->schedule(rq[1], &attr);
> +			i915_request_set_priority(rq[1], prio);
>   		}
>   
>   		/* And switch back to ce[0] for good measure */
> @@ -873,9 +869,6 @@ release_queue(struct intel_engine_cs *engine,
>   	      struct i915_vma *vma,
>   	      int idx, int prio)
>   {
> -	struct i915_sched_attr attr = {
> -		.priority = prio,
> -	};
>   	struct i915_request *rq;
>   	u32 *cs;
>   
> @@ -900,7 +893,7 @@ release_queue(struct intel_engine_cs *engine,
>   	i915_request_add(rq);
>   
>   	local_bh_disable();
> -	engine->schedule(rq, &attr);
> +	i915_request_set_priority(rq, prio);
>   	local_bh_enable(); /* kick tasklet */
>   
>   	i915_request_put(rq);
> @@ -1310,7 +1303,6 @@ static int live_timeslice_queue(void *arg)
>   		goto err_pin;
>   
>   	for_each_engine(engine, gt, id) {
> -		struct i915_sched_attr attr = { .priority = I915_PRIORITY_MAX };
>   		struct i915_request *rq, *nop;
>   
>   		if (!intel_engine_has_preemption(engine))
> @@ -1325,7 +1317,7 @@ static int live_timeslice_queue(void *arg)
>   			err = PTR_ERR(rq);
>   			goto err_heartbeat;
>   		}
> -		engine->schedule(rq, &attr);
> +		i915_request_set_priority(rq, I915_PRIORITY_MAX);
>   		err = wait_for_submit(engine, rq, HZ / 2);
>   		if (err) {
>   			pr_err("%s: Timed out trying to submit semaphores\n",
> @@ -1806,7 +1798,6 @@ static int live_late_preempt(void *arg)
>   	struct i915_gem_context *ctx_hi, *ctx_lo;
>   	struct igt_spinner spin_hi, spin_lo;
>   	struct intel_engine_cs *engine;
> -	struct i915_sched_attr attr = {};
>   	enum intel_engine_id id;
>   	int err = -ENOMEM;
>   
> @@ -1866,8 +1857,7 @@ static int live_late_preempt(void *arg)
>   			goto err_wedged;
>   		}
>   
> -		attr.priority = I915_PRIORITY_MAX;
> -		engine->schedule(rq, &attr);
> +		i915_request_set_priority(rq, I915_PRIORITY_MAX);
>   
>   		if (!igt_wait_for_spinner(&spin_hi, rq)) {
>   			pr_err("High priority context failed to preempt the low priority context\n");
> @@ -2412,7 +2402,6 @@ static int live_preempt_cancel(void *arg)
>   
>   static int live_suppress_self_preempt(void *arg)
>   {
> -	struct i915_sched_attr attr = { .priority = I915_PRIORITY_MAX };
>   	struct intel_gt *gt = arg;
>   	struct intel_engine_cs *engine;
>   	struct preempt_client a, b;
> @@ -2480,7 +2469,7 @@ static int live_suppress_self_preempt(void *arg)
>   			i915_request_add(rq_b);
>   
>   			GEM_BUG_ON(i915_request_completed(rq_a));
> -			engine->schedule(rq_a, &attr);
> +			i915_request_set_priority(rq_a, I915_PRIORITY_MAX);
>   			igt_spinner_end(&a.spin);
>   
>   			if (!igt_wait_for_spinner(&b.spin, rq_b)) {
> @@ -2545,7 +2534,6 @@ static int live_chain_preempt(void *arg)
>   		goto err_client_hi;
>   
>   	for_each_engine(engine, gt, id) {
> -		struct i915_sched_attr attr = { .priority = I915_PRIORITY_MAX };
>   		struct igt_live_test t;
>   		struct i915_request *rq;
>   		int ring_size, count, i;
> @@ -2612,7 +2600,7 @@ static int live_chain_preempt(void *arg)
>   
>   			i915_request_get(rq);
>   			i915_request_add(rq);
> -			engine->schedule(rq, &attr);
> +			i915_request_set_priority(rq, I915_PRIORITY_MAX);
>   
>   			igt_spinner_end(&hi.spin);
>   			if (i915_request_wait(rq, 0, HZ / 5) < 0) {
> @@ -2964,14 +2952,12 @@ static int live_preempt_gang(void *arg)
>   			return -EIO;
>   
>   		do {
> -			struct i915_sched_attr attr = { .priority = prio++ };
> -
>   			err = create_gang(engine, &rq);
>   			if (err)
>   				break;
>   
>   			/* Submit each spinner at increasing priority */
> -			engine->schedule(rq, &attr);
> +			i915_request_set_priority(rq, prio++);
>   		} while (prio <= I915_PRIORITY_MAX &&
>   			 !__igt_timeout(end_time, NULL));
>   		pr_debug("%s: Preempt chain of %d requests\n",
> @@ -3192,9 +3178,6 @@ static int preempt_user(struct intel_engine_cs *engine,
>   			struct i915_vma *global,
>   			int id)
>   {
> -	struct i915_sched_attr attr = {
> -		.priority = I915_PRIORITY_MAX
> -	};
>   	struct i915_request *rq;
>   	int err = 0;
>   	u32 *cs;
> @@ -3219,7 +3202,7 @@ static int preempt_user(struct intel_engine_cs *engine,
>   	i915_request_get(rq);
>   	i915_request_add(rq);
>   
> -	engine->schedule(rq, &attr);
> +	i915_request_set_priority(rq, I915_PRIORITY_MAX);
>   
>   	if (i915_request_wait(rq, 0, HZ / 2) < 0)
>   		err = -ETIME;
> diff --git a/drivers/gpu/drm/i915/gt/selftest_hangcheck.c b/drivers/gpu/drm/i915/gt/selftest_hangcheck.c
> index d6ce4075602c..8cad102922e7 100644
> --- a/drivers/gpu/drm/i915/gt/selftest_hangcheck.c
> +++ b/drivers/gpu/drm/i915/gt/selftest_hangcheck.c
> @@ -858,12 +858,11 @@ static int active_engine(void *data)
>   		rq[idx] = i915_request_get(new);
>   		i915_request_add(new);
>   
> -		if (engine->schedule && arg->flags & TEST_PRIORITY) {
> -			struct i915_sched_attr attr = {
> -				.priority =
> -					i915_prandom_u32_max_state(512, &prng),
> -			};
> -			engine->schedule(rq[idx], &attr);
> +		if (intel_engine_has_scheduler(engine) &&
> +		    arg->flags & TEST_PRIORITY) {
> +			int prio = i915_prandom_u32_max_state(512, &prng);
> +
> +			i915_request_set_priority(rq[idx], prio);
>   		}
>   
>   		err = active_request_put(old);
> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> index 3124d8794d87..53cf68e240c3 100644
> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> @@ -607,7 +607,6 @@ static int guc_resume(struct intel_engine_cs *engine)
>   static void guc_set_default_submission(struct intel_engine_cs *engine)
>   {
>   	engine->submit_request = guc_submit_request;
> -	engine->schedule = i915_schedule;
>   	engine->execlists.tasklet.func = guc_submission_tasklet;
>   
>   	engine->reset.prepare = guc_reset_prepare;
> diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
> index 22e39d938f17..abda565dfe62 100644
> --- a/drivers/gpu/drm/i915/i915_request.c
> +++ b/drivers/gpu/drm/i915/i915_request.c
> @@ -1218,7 +1218,7 @@ __i915_request_await_execution(struct i915_request *to,
>   	}
>   
>   	/* Couple the dependency tree for PI on this exposed to->fence */
> -	if (to->engine->schedule) {
> +	if (intel_engine_has_scheduler(to->engine)) {
>   		err = i915_sched_node_add_dependency(&to->sched,
>   						     &from->sched,
>   						     I915_DEPENDENCY_WEAK);
> @@ -1359,7 +1359,7 @@ i915_request_await_request(struct i915_request *to, struct i915_request *from)
>   		return 0;
>   	}
>   
> -	if (to->engine->schedule) {
> +	if (intel_engine_has_scheduler(to->engine)) {
>   		ret = i915_sched_node_add_dependency(&to->sched,
>   						     &from->sched,
>   						     I915_DEPENDENCY_EXTERNAL);
> @@ -1546,7 +1546,7 @@ __i915_request_add_to_timeline(struct i915_request *rq)
>   			__i915_sw_fence_await_dma_fence(&rq->submit,
>   							&prev->fence,
>   							&rq->dmaq);
> -		if (rq->engine->schedule)
> +		if (intel_engine_has_scheduler(rq->engine))
>   			__i915_sched_node_add_dependency(&rq->sched,
>   							 &prev->sched,
>   							 &rq->dep,
> @@ -1618,8 +1618,8 @@ void __i915_request_queue(struct i915_request *rq,
>   	 * decide whether to preempt the entire chain so that it is ready to
>   	 * run at the earliest possible convenience.
>   	 */
> -	if (attr && rq->engine->schedule)
> -		rq->engine->schedule(rq, attr);
> +	if (attr)
> +		i915_request_set_priority(rq, attr->priority);
>   
>   	local_bh_disable();
>   	__i915_request_queue_bh(rq);
> diff --git a/drivers/gpu/drm/i915/i915_scheduler.c b/drivers/gpu/drm/i915/i915_scheduler.c
> index efa638c3acc7..dbdd4128f13d 100644
> --- a/drivers/gpu/drm/i915/i915_scheduler.c
> +++ b/drivers/gpu/drm/i915/i915_scheduler.c
> @@ -216,10 +216,8 @@ static void kick_submission(struct intel_engine_cs *engine,
>   	rcu_read_unlock();
>   }
>   
> -static void __i915_schedule(struct i915_sched_node *node,
> -			    const struct i915_sched_attr *attr)
> +static void __i915_schedule(struct i915_sched_node *node, int prio)
>   {
> -	const int prio = max(attr->priority, node->attr.priority);
>   	struct intel_engine_cs *engine;
>   	struct i915_dependency *dep, *p;
>   	struct i915_dependency stack;
> @@ -233,6 +231,8 @@ static void __i915_schedule(struct i915_sched_node *node,
>   	if (node_signaled(node))
>   		return;
>   
> +	prio = max(prio, node->attr.priority);
> +
>   	stack.signaler = node;
>   	list_add(&stack.dfs_link, &dfs);
>   
> @@ -286,7 +286,7 @@ static void __i915_schedule(struct i915_sched_node *node,
>   	 */
>   	if (node->attr.priority == I915_PRIORITY_INVALID) {
>   		GEM_BUG_ON(!list_empty(&node->link));
> -		node->attr = *attr;
> +		node->attr.priority = prio;
>   
>   		if (stack.dfs_link.next == stack.dfs_link.prev)
>   			return;
> @@ -341,10 +341,13 @@ static void __i915_schedule(struct i915_sched_node *node,
>   	spin_unlock(&engine->active.lock);
>   }
>   
> -void i915_schedule(struct i915_request *rq, const struct i915_sched_attr *attr)
> +void i915_request_set_priority(struct i915_request *rq, int prio)
>   {
> +	if (!intel_engine_has_scheduler(rq->engine))
> +		return;
> +
>   	spin_lock_irq(&schedule_lock);
> -	__i915_schedule(&rq->sched, attr);
> +	__i915_schedule(&rq->sched, prio);
>   	spin_unlock_irq(&schedule_lock);
>   }
>   
> diff --git a/drivers/gpu/drm/i915/i915_scheduler.h b/drivers/gpu/drm/i915/i915_scheduler.h
> index 858a0938f47a..ccee506c9a26 100644
> --- a/drivers/gpu/drm/i915/i915_scheduler.h
> +++ b/drivers/gpu/drm/i915/i915_scheduler.h
> @@ -35,8 +35,7 @@ int i915_sched_node_add_dependency(struct i915_sched_node *node,
>   
>   void i915_sched_node_fini(struct i915_sched_node *node);
>   
> -void i915_schedule(struct i915_request *request,
> -		   const struct i915_sched_attr *attr);
> +void i915_request_set_priority(struct i915_request *request, int prio);
>   
>   struct list_head *
>   i915_sched_lookup_priolist(struct intel_engine_cs *engine, int prio);
> 

Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
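
For reference, the shape of the change is: callers stop testing and calling
a per-engine callback, and instead call one entry point that checks a
capability flag itself and only ever raises the priority. Below is a minimal
standalone sketch of that pattern. It is not the driver code: struct engine,
struct request, ENGINE_HAS_SCHEDULER, engine_has_scheduler() and
request_set_priority() are hypothetical stand-ins, and the dependency walk
done by __i915_schedule() is left out entirely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not the real i915 definitions. */
#define ENGINE_HAS_SCHEDULER (1u << 2)

struct engine {
	unsigned int flags;
};

struct request {
	const struct engine *engine;
	int priority;
};

/* Capability test replacing the old "is the callback non-NULL?" check. */
static inline bool engine_has_scheduler(const struct engine *engine)
{
	return engine->flags & ENGINE_HAS_SCHEDULER;
}

/*
 * Single entry point: callers pass a plain priority and never look at a
 * function pointer. Engines without a scheduler ignore the call, and the
 * priority is only ever raised, mirroring prio = max(prio, current).
 */
static void request_set_priority(struct request *rq, int prio)
{
	assert(rq != NULL && rq->engine != NULL);

	if (!engine_has_scheduler(rq->engine))
		return;

	if (prio > rq->priority)
		rq->priority = prio;
}
```

With this shape a caller on an engine without a scheduler needs no special
case, which is why the "if (engine->schedule)" guards can be dropped at most
call sites in the patch.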

Regards,

Tvrtko

^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: [Intel-gfx] [PATCH 04/41] drm/i915: Teach the i915_dependency to use a double-lock
  2021-01-25 14:00 ` [Intel-gfx] [PATCH 04/41] drm/i915: Teach the i915_dependency to use a double-lock Chris Wilson
@ 2021-01-25 15:34   ` Tvrtko Ursulin
  2021-01-25 21:37     ` Chris Wilson
  0 siblings, 1 reply; 90+ messages in thread
From: Tvrtko Ursulin @ 2021-01-25 15:34 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx; +Cc: thomas.hellstrom


On 25/01/2021 14:00, Chris Wilson wrote:
> Currently, we construct and teardown the i915_dependency chains using a
> global spinlock. As the lists are entirely local, it should be possible
> to use a double-lock with an explicit nesting [signaler -> waiter,
> always] and so avoid the costly convenience of a global spinlock.
> 
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> ---
>   drivers/gpu/drm/i915/i915_request.c         |  2 +-
>   drivers/gpu/drm/i915/i915_scheduler.c       | 65 +++++++++++++--------
>   drivers/gpu/drm/i915/i915_scheduler.h       |  2 +-
>   drivers/gpu/drm/i915/i915_scheduler_types.h |  2 +
>   4 files changed, 46 insertions(+), 25 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
> index abda565dfe62..df2ab39b394d 100644
> --- a/drivers/gpu/drm/i915/i915_request.c
> +++ b/drivers/gpu/drm/i915/i915_request.c
> @@ -330,7 +330,7 @@ bool i915_request_retire(struct i915_request *rq)
>   	intel_context_unpin(rq->context);
>   
>   	free_capture_list(rq);
> -	i915_sched_node_fini(&rq->sched);
> +	i915_sched_node_retire(&rq->sched);
>   	i915_request_put(rq);
>   
>   	return true;
> diff --git a/drivers/gpu/drm/i915/i915_scheduler.c b/drivers/gpu/drm/i915/i915_scheduler.c
> index dbdd4128f13d..96fe1e22dad7 100644
> --- a/drivers/gpu/drm/i915/i915_scheduler.c
> +++ b/drivers/gpu/drm/i915/i915_scheduler.c
> @@ -19,6 +19,17 @@ static struct i915_global_scheduler {
>   
>   static DEFINE_SPINLOCK(schedule_lock);
>   
> +static struct i915_sched_node *node_get(struct i915_sched_node *node)
> +{
> +	i915_request_get(container_of(node, struct i915_request, sched));
> +	return node;
> +}
> +
> +static void node_put(struct i915_sched_node *node)
> +{
> +	i915_request_put(container_of(node, struct i915_request, sched));
> +}
> +
>   static const struct i915_request *
>   node_to_request(const struct i915_sched_node *node)
>   {
> @@ -353,6 +364,8 @@ void i915_request_set_priority(struct i915_request *rq, int prio)
>   
>   void i915_sched_node_init(struct i915_sched_node *node)
>   {
> +	spin_lock_init(&node->lock);
> +
>   	INIT_LIST_HEAD(&node->signalers_list);
>   	INIT_LIST_HEAD(&node->waiters_list);
>   	INIT_LIST_HEAD(&node->link);
> @@ -377,10 +390,17 @@ i915_dependency_alloc(void)
>   	return kmem_cache_alloc(global.slab_dependencies, GFP_KERNEL);
>   }
>   
> +static void
> +rcu_dependency_free(struct rcu_head *rcu)
> +{
> +	kmem_cache_free(global.slab_dependencies,
> +			container_of(rcu, typeof(struct i915_dependency), rcu));
> +}
> +
>   static void
>   i915_dependency_free(struct i915_dependency *dep)
>   {
> -	kmem_cache_free(global.slab_dependencies, dep);
> +	call_rcu(&dep->rcu, rcu_dependency_free);
>   }
>   
>   bool __i915_sched_node_add_dependency(struct i915_sched_node *node,
> @@ -390,24 +410,27 @@ bool __i915_sched_node_add_dependency(struct i915_sched_node *node,
>   {
>   	bool ret = false;
>   
> -	spin_lock_irq(&schedule_lock);
> +	/* The signal->lock is always the outer lock in this double-lock. */
> +	spin_lock(&signal->lock);
>   
>   	if (!node_signaled(signal)) {
>   		INIT_LIST_HEAD(&dep->dfs_link);
>   		dep->signaler = signal;
> -		dep->waiter = node;
> +		dep->waiter = node_get(node);
>   		dep->flags = flags;
>   
>   		/* All set, now publish. Beware the lockless walkers. */
> +		spin_lock_nested(&node->lock, SINGLE_DEPTH_NESTING);
>   		list_add_rcu(&dep->signal_link, &node->signalers_list);
>   		list_add_rcu(&dep->wait_link, &signal->waiters_list);
> +		spin_unlock(&node->lock);
>   
>   		/* Propagate the chains */
>   		node->flags |= signal->flags;
>   		ret = true;
>   	}
>   
> -	spin_unlock_irq(&schedule_lock);
> +	spin_unlock(&signal->lock);

So we have to be sure that no other entry point can try to lock the same
nodes in reverse, that is, with the roles swapped. A situation where two
nodes are simultaneously each other's waiter and signaler does indeed
sound impossible, so I think this is fine.

It would only be a problem if some entry point locked a waiter and then
went on to boost the priority of a signaler. But that path still runs
under the global lock. So the benefit of this patch is just to reduce
contention between adding dependencies and re-scheduling?

Also, __i915_schedule walks the list of signalers without holding this
new lock. What is the safety net there? RCU? If so, do we need
list_for_each_entry_rcu and an explicit rcu_read_lock/rcu_read_unlock
pair in there?
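
To make the ordering concern concrete, here is a minimal userspace sketch
of the rule I am assuming the patch relies on: the signaler's lock is
always the outer lock and the waiter's lock always the inner one. It is an
illustration only, not the driver code: struct node, node_lock(),
node_unlock() and add_dependency() are hypothetical names, a C11
atomic_flag spinlock stands in for spinlock_t, and counters stand in for
the RCU-protected lists.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for i915_sched_node; counters replace the lists. */
struct node {
	atomic_flag lock;	/* protects the counters below */
	bool signaled;
	int nr_signalers;	/* those before us */
	int nr_waiters;		/* those after us */
};

static void node_init(struct node *n)
{
	atomic_flag_clear(&n->lock);
	n->signaled = false;
	n->nr_signalers = 0;
	n->nr_waiters = 0;
}

static void node_lock(struct node *n)
{
	while (atomic_flag_test_and_set_explicit(&n->lock,
						 memory_order_acquire))
		; /* spin until the holder releases the flag */
}

static void node_unlock(struct node *n)
{
	atomic_flag_clear_explicit(&n->lock, memory_order_release);
}

/*
 * Lock order is fixed: signaler first (outer), waiter second (inner).
 * As long as every path that takes both locks follows this order, and no
 * two nodes are each other's signaler, two threads cannot deadlock here.
 */
static bool add_dependency(struct node *waiter, struct node *signal)
{
	bool ret = false;

	assert(waiter != signal); /* self-dependency would self-deadlock */

	node_lock(signal);
	if (!signal->signaled) {
		node_lock(waiter);
		waiter->nr_signalers++;
		signal->nr_waiters++;
		node_unlock(waiter);
		ret = true;
	}
	node_unlock(signal);

	return ret;
}
```

Any path that took the waiter's lock first and then the signaler's would
break this rule, which is the case I am asking about above.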

Regards,

Tvrtko

>   
>   	return ret;
>   }
> @@ -429,39 +452,36 @@ int i915_sched_node_add_dependency(struct i915_sched_node *node,
>   	return 0;
>   }
>   
> -void i915_sched_node_fini(struct i915_sched_node *node)
> +void i915_sched_node_retire(struct i915_sched_node *node)
>   {
>   	struct i915_dependency *dep, *tmp;
>   
> -	spin_lock_irq(&schedule_lock);
> -
>   	/*
>   	 * Everyone we depended upon (the fences we wait to be signaled)
>   	 * should retire before us and remove themselves from our list.
>   	 * However, retirement is run independently on each timeline and
> -	 * so we may be called out-of-order.
> +	 * so we may be called out-of-order. As we need to avoid taking
> +	 * the signaler's lock, just mark up our completion and be wary
> +	 * in traversing the signalers->waiters_list.
>   	 */
> -	list_for_each_entry_safe(dep, tmp, &node->signalers_list, signal_link) {
> -		GEM_BUG_ON(!list_empty(&dep->dfs_link));
> -
> -		list_del_rcu(&dep->wait_link);
> -		if (dep->flags & I915_DEPENDENCY_ALLOC)
> -			i915_dependency_free(dep);
> -	}
> -	INIT_LIST_HEAD(&node->signalers_list);
>   
>   	/* Remove ourselves from everyone who depends upon us */
> +	spin_lock(&node->lock);
>   	list_for_each_entry_safe(dep, tmp, &node->waiters_list, wait_link) {
> -		GEM_BUG_ON(dep->signaler != node);
> -		GEM_BUG_ON(!list_empty(&dep->dfs_link));
> +		struct i915_sched_node *w = dep->waiter;
>   
> +		GEM_BUG_ON(dep->signaler != node);
> +
> +		spin_lock_nested(&w->lock, SINGLE_DEPTH_NESTING);
>   		list_del_rcu(&dep->signal_link);
> +		spin_unlock(&w->lock);
> +		node_put(w);
> +
>   		if (dep->flags & I915_DEPENDENCY_ALLOC)
>   			i915_dependency_free(dep);
>   	}
> -	INIT_LIST_HEAD(&node->waiters_list);
> -
> -	spin_unlock_irq(&schedule_lock);
> +	INIT_LIST_HEAD_RCU(&node->waiters_list);
> +	spin_unlock(&node->lock);
>   }
>   
>   void i915_request_show_with_schedule(struct drm_printer *m,
> @@ -512,8 +532,7 @@ static struct i915_global_scheduler global = { {
>   int __init i915_global_scheduler_init(void)
>   {
>   	global.slab_dependencies = KMEM_CACHE(i915_dependency,
> -					      SLAB_HWCACHE_ALIGN |
> -					      SLAB_TYPESAFE_BY_RCU);
> +					      SLAB_HWCACHE_ALIGN);
>   	if (!global.slab_dependencies)
>   		return -ENOMEM;
>   
> diff --git a/drivers/gpu/drm/i915/i915_scheduler.h b/drivers/gpu/drm/i915/i915_scheduler.h
> index ccee506c9a26..a045be784c67 100644
> --- a/drivers/gpu/drm/i915/i915_scheduler.h
> +++ b/drivers/gpu/drm/i915/i915_scheduler.h
> @@ -33,7 +33,7 @@ int i915_sched_node_add_dependency(struct i915_sched_node *node,
>   				   struct i915_sched_node *signal,
>   				   unsigned long flags);
>   
> -void i915_sched_node_fini(struct i915_sched_node *node);
> +void i915_sched_node_retire(struct i915_sched_node *node);
>   
>   void i915_request_set_priority(struct i915_request *request, int prio);
>   
> diff --git a/drivers/gpu/drm/i915/i915_scheduler_types.h b/drivers/gpu/drm/i915/i915_scheduler_types.h
> index 343ed44d5ed4..623bf41fcf35 100644
> --- a/drivers/gpu/drm/i915/i915_scheduler_types.h
> +++ b/drivers/gpu/drm/i915/i915_scheduler_types.h
> @@ -60,6 +60,7 @@ struct i915_sched_attr {
>    * others.
>    */
>   struct i915_sched_node {
> +	spinlock_t lock; /* protect the lists */
>   	struct list_head signalers_list; /* those before us, we depend upon */
>   	struct list_head waiters_list; /* those after us, they depend upon us */
>   	struct list_head link;
> @@ -75,6 +76,7 @@ struct i915_dependency {
>   	struct list_head signal_link;
>   	struct list_head wait_link;
>   	struct list_head dfs_link;
> +	struct rcu_head rcu;
>   	unsigned long flags;
>   #define I915_DEPENDENCY_ALLOC		BIT(0)
>   #define I915_DEPENDENCY_EXTERNAL	BIT(1)
> 

^ permalink raw reply	[flat|nested] 90+ messages in thread

* [Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for series starting with [01/41] drm/i915/selftests: Check for engine-reset errors in the middle of workarounds
  2021-01-25 14:00 [Intel-gfx] [PATCH 01/41] drm/i915/selftests: Check for engine-reset errors in the middle of workarounds Chris Wilson
                   ` (40 preceding siblings ...)
  2021-01-25 14:40 ` [Intel-gfx] [PATCH 01/41] drm/i915/selftests: Check for engine-reset errors in the middle of workarounds Tvrtko Ursulin
@ 2021-01-25 17:08 ` Patchwork
  2021-01-25 17:10 ` [Intel-gfx] ✗ Fi.CI.SPARSE: " Patchwork
                   ` (2 subsequent siblings)
  44 siblings, 0 replies; 90+ messages in thread
From: Patchwork @ 2021-01-25 17:08 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

== Series Details ==

Series: series starting with [01/41] drm/i915/selftests: Check for engine-reset errors in the middle of workarounds
URL   : https://patchwork.freedesktop.org/series/86259/
State : warning

== Summary ==

$ dim checkpatch origin/drm-tip
e4e577627c26 drm/i915/selftests: Check for engine-reset errors in the middle of workarounds
367f5e603c5d drm/i915/gt: Move the defer_request waiter active assertion
0611644e4ddf drm/i915: Replace engine->schedule() with a known request operation
b5ca2c04b198 drm/i915: Teach the i915_dependency to use a double-lock
7b140da51266 drm/i915: Restructure priority inheritance
f1443908248f drm/i915/selftests: Measure set-priority duration
-:52: WARNING:FILE_PATH_CHANGES: added, moved or deleted file(s), does MAINTAINERS need updating?
#52: 
new file mode 100644

-:434: WARNING:LINE_SPACING: Missing a blank line after declarations
#434: FILE: drivers/gpu/drm/i915/selftests/i915_scheduler.c:378:
+	struct igt_spinner spin;
+	I915_RND_STATE(prng);

total: 0 errors, 2 warnings, 0 checks, 702 lines checked
22c3f3836779 drm/i915/selftests: Exercise priority inheritance around an engine loop
9b002fc62d1f drm/i915: Improve DFS for priority inheritance
ddf444ae329e drm/i915/selftests: Exercise relative mmio paths to non-privileged registers
f6b8b7362118 drm/i915/selftests: Exercise cross-process context isolation
1dac44d403d7 drm/i915: Extract request submission from execlists
db49df2e5ac7 drm/i915: Extract request rewinding from execlists
f04573d14fc7 drm/i915: Extract request suspension from the execlists
bf3ba20370f6 drm/i915: Extract the ability to defer and rerun a request later
d6eb9d01bae2 drm/i915: Fix the iterative dfs for defering requests
4ab371f1d39d drm/i915: Move common active lists from engine to i915_scheduler
0597492c1e44 drm/i915: Move scheduler queue
3744cfde0c1b drm/i915: Move tasklet from execlists to sched
3839efb97bae drm/i915/gt: Show scheduler queues when dumping state
4abff63c4cfe drm/i915: Replace priolist rbtree with a skiplist
-:306: CHECK:MACRO_ARG_REUSE: Macro argument reuse 'p' - possible side-effects?
#306: FILE: drivers/gpu/drm/i915/i915_priolist_types.h:62:
+#define for_each_priolist(p, root) \
+	for ((p) = (root)->sentinel.next[0]; \
+	     (p) != &(root)->sentinel; \
+	     (p) = (p)->next[0])

-:306: CHECK:MACRO_ARG_REUSE: Macro argument reuse 'root' - possible side-effects?
#306: FILE: drivers/gpu/drm/i915/i915_priolist_types.h:62:
+#define for_each_priolist(p, root) \
+	for ((p) = (root)->sentinel.next[0]; \
+	     (p) != &(root)->sentinel; \
+	     (p) = (p)->next[0])

-:717: WARNING:LINE_SPACING: Missing a blank line after declarations
#717: FILE: drivers/gpu/drm/i915/selftests/i915_scheduler.c:19:
+	struct i915_priolist *pl = &root.sentinel;
+	IGT_TIMEOUT(end_time);

total: 0 errors, 1 warnings, 2 checks, 697 lines checked
48a6729c987b drm/i915: Wrap cmpxchg64 with try_cmpxchg64() helper
-:23: CHECK:MACRO_ARG_REUSE: Macro argument reuse '_ptr' - possible side-effects?
#23: FILE: drivers/gpu/drm/i915/i915_utils.h:468:
+#define try_cmpxchg64(_ptr, _pold, _new)				\
+({									\
+	__typeof__(_ptr) _old = (__typeof__(_ptr))(_pold);		\
+	__typeof__(*(_ptr)) __old = *_old;				\
+	__typeof__(*(_ptr)) __cur = cmpxchg64(_ptr, __old, _new);	\
+	bool success = __cur == __old;					\
+	if (unlikely(!success))						\
+		*_old = __cur;						\
+	likely(success);						\
+})

-:40: CHECK:MACRO_ARG_REUSE: Macro argument reuse '_ptr' - possible side-effects?
#40: FILE: drivers/gpu/drm/i915/i915_utils.h:485:
+#define xchg64(_ptr, _new)						\
+({									\
+	__typeof__(_ptr) __ptr = (_ptr);				\
+	__typeof__(*(_ptr)) __old = *__ptr;				\
+	while (!try_cmpxchg64(__ptr, &__old, (_new)))			\
+		;							\
+	__old;								\
+})

total: 0 errors, 0 warnings, 2 checks, 36 lines checked
4cabd30226f5 drm/i915: Fair low-latency scheduling
f93f85ad482a drm/i915/gt: Specify a deadline for the heartbeat
010d86d0ecb1 drm/i915: Extend the priority boosting for the display with a deadline
e4a776414370 drm/i915/gt: Support virtual engine queues
4e61621ac116 drm/i915: Move saturated workload detection back to the context
-:29: WARNING:COMMIT_LOG_LONG_LINE: Possible unwrapped commit description (prefer a maximum 75 chars per line)
#29: 
References: 44d89409a12e ("drm/i915: Make the semaphore saturation mask global")

-:29: ERROR:GIT_COMMIT_ID: Please use git commit description style 'commit <12+ chars of sha1> ("<title line>")' - ie: 'commit 44d89409a12e ("drm/i915: Make the semaphore saturation mask global")'
#29: 
References: 44d89409a12e ("drm/i915: Make the semaphore saturation mask global")

total: 1 errors, 1 warnings, 0 checks, 78 lines checked
75ecbfad42e2 drm/i915: Bump default timeslicing quantum to 5ms
1edd11623282 drm/i915/gt: Wrap intel_timeline.has_initial_breadcrumb
fc7e917f62a1 drm/i915/gt: Track timeline GGTT offset separately from subpage offset
9bb8f60faea0 drm/i915/gt: Add timeline "mode"
0a3936960afc drm/i915/gt: Use indices for writing into relative timelines
b9b643d59b9d drm/i915/selftests: Exercise relative timeline modes
913207bf89cd drm/i915/gt: Use ppHWSP for unshared non-semaphore related timelines
cf05a88e82c0 Restore "drm/i915: drop engine_pin/unpin_breadcrumbs_irq"
593e08f61ffd drm/i915/gt: Couple tasklet scheduling for all CS interrupts
209430537935 drm/i915/gt: Support creation of 'internal' rings
0337feb5791a drm/i915/gt: Use client timeline address for seqno writes
a07f5ba0a10d drm/i915/gt: Infrastructure for ring scheduling
-:79: WARNING:FILE_PATH_CHANGES: added, moved or deleted file(s), does MAINTAINERS need updating?
#79: 
new file mode 100644

total: 0 errors, 1 warnings, 0 checks, 891 lines checked
bf0a549d2293 drm/i915/gt: Implement ring scheduler for gen4-7
-:70: CHECK:OPEN_ENDED_LINE: Lines should not end with a '('
#70: FILE: drivers/gpu/drm/i915/gt/intel_ring_scheduler.c:227:
+				*cs++ = i915_mmio_reg_offset(

-:72: CHECK:OPEN_ENDED_LINE: Lines should not end with a '('
#72: FILE: drivers/gpu/drm/i915/gt/intel_ring_scheduler.c:229:
+				*cs++ = _MASKED_BIT_ENABLE(

-:107: CHECK:OPEN_ENDED_LINE: Lines should not end with a '('
#107: FILE: drivers/gpu/drm/i915/gt/intel_ring_scheduler.c:264:
+				*cs++ = _MASKED_BIT_DISABLE(

total: 0 errors, 0 warnings, 3 checks, 582 lines checked
4743bbd27620 drm/i915/gt: Enable ring scheduling for gen5-7
-:32: WARNING:COMMIT_LOG_LONG_LINE: Possible unwrapped commit description (prefer a maximum 75 chars per line)
#32: 
References: 0f46832fab77 ("drm/i915: Mask USER interrupts on gen6 (until required)")

-:32: ERROR:GIT_COMMIT_ID: Please use git commit description style 'commit <12+ chars of sha1> ("<title line>")' - ie: 'commit 0f46832fab77 ("drm/i915: Mask USER interrupts on gen6 (until required)")'
#32: 
References: 0f46832fab77 ("drm/i915: Mask USER interrupts on gen6 (until required)")

total: 1 errors, 1 warnings, 0 checks, 34 lines checked
ac942d24c88f drm/i915: Support secure dispatch on gen6/gen7



* [Intel-gfx] ✗ Fi.CI.SPARSE: warning for series starting with [01/41] drm/i915/selftests: Check for engine-reset errors in the middle of workarounds
  2021-01-25 14:00 [Intel-gfx] [PATCH 01/41] drm/i915/selftests: Check for engine-reset errors in the middle of workarounds Chris Wilson
                   ` (41 preceding siblings ...)
  2021-01-25 17:08 ` [Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for series starting with [01/41] " Patchwork
@ 2021-01-25 17:10 ` Patchwork
  2021-01-25 17:38 ` [Intel-gfx] ✓ Fi.CI.BAT: success " Patchwork
  2021-01-25 22:45 ` [Intel-gfx] ✗ Fi.CI.IGT: failure " Patchwork
  44 siblings, 0 replies; 90+ messages in thread
From: Patchwork @ 2021-01-25 17:10 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

== Series Details ==

Series: series starting with [01/41] drm/i915/selftests: Check for engine-reset errors in the middle of workarounds
URL   : https://patchwork.freedesktop.org/series/86259/
State : warning

== Summary ==

$ dim sparse --fast origin/drm-tip
Sparse version: v0.6.2
Fast mode used, each commit won't be checked separately.
+drivers/gpu/drm/i915/gt/intel_engine_stats.h:27:9: warning: trying to copy expression type 31
+drivers/gpu/drm/i915/gt/intel_engine_stats.h:27:9: warning: trying to copy expression type 31
+drivers/gpu/drm/i915/gt/intel_engine_stats.h:27:9: warning: trying to copy expression type 31
+drivers/gpu/drm/i915/gt/intel_engine_stats.h:32:9: warning: trying to copy expression type 31
+drivers/gpu/drm/i915/gt/intel_engine_stats.h:32:9: warning: trying to copy expression type 31
+drivers/gpu/drm/i915/gt/intel_engine_stats.h:49:9: warning: trying to copy expression type 31
+drivers/gpu/drm/i915/gt/intel_engine_stats.h:49:9: warning: trying to copy expression type 31
+drivers/gpu/drm/i915/gt/intel_engine_stats.h:49:9: warning: trying to copy expression type 31
+drivers/gpu/drm/i915/gt/intel_engine_stats.h:56:9: warning: trying to copy expression type 31
+drivers/gpu/drm/i915/gt/intel_engine_stats.h:56:9: warning: trying to copy expression type 31
+drivers/gpu/drm/i915/intel_wakeref.c:137:19: warning: context imbalance in 'wakeref_auto_timeout' - unexpected unlock
+drivers/gpu/drm/i915/selftests/i915_syncmap.c:80:54: warning: dubious: x | !y



* [Intel-gfx] ✓ Fi.CI.BAT: success for series starting with [01/41] drm/i915/selftests: Check for engine-reset errors in the middle of workarounds
  2021-01-25 14:00 [Intel-gfx] [PATCH 01/41] drm/i915/selftests: Check for engine-reset errors in the middle of workarounds Chris Wilson
                   ` (42 preceding siblings ...)
  2021-01-25 17:10 ` [Intel-gfx] ✗ Fi.CI.SPARSE: " Patchwork
@ 2021-01-25 17:38 ` Patchwork
  2021-01-25 22:45 ` [Intel-gfx] ✗ Fi.CI.IGT: failure " Patchwork
  44 siblings, 0 replies; 90+ messages in thread
From: Patchwork @ 2021-01-25 17:38 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx


== Series Details ==

Series: series starting with [01/41] drm/i915/selftests: Check for engine-reset errors in the middle of workarounds
URL   : https://patchwork.freedesktop.org/series/86259/
State : success

== Summary ==

CI Bug Log - changes from CI_DRM_9680 -> Patchwork_19487
====================================================

Summary
-------

  **SUCCESS**

  No regressions found.

  External URL: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19487/index.html

Possible new issues
-------------------

  Here are the unknown changes that may have been introduced in Patchwork_19487:

### IGT changes ###

#### Possible regressions ####

  * {igt@i915_selftest@live@scheduler} (NEW):
    - fi-snb-2520m:       NOTRUN -> [DMESG-FAIL][1]
   [1]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19487/fi-snb-2520m/igt@i915_selftest@live@scheduler.html
    - fi-snb-2600:        NOTRUN -> [DMESG-FAIL][2]
   [2]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19487/fi-snb-2600/igt@i915_selftest@live@scheduler.html

  
New tests
---------

  New tests have been introduced between CI_DRM_9680 and Patchwork_19487:

### New IGT tests (1) ###

  * igt@i915_selftest@live@scheduler:
    - Statuses : 2 dmesg-fail(s) 29 pass(s)
    - Exec time: [0.72, 8.80] s

  

Known issues
------------

  Here are the changes found in Patchwork_19487 that come from known issues:

### IGT changes ###

#### Issues hit ####

  * igt@prime_vgem@basic-gtt:
    - fi-tgl-y:           [PASS][3] -> [DMESG-WARN][4] ([i915#402]) +1 similar issue
   [3]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9680/fi-tgl-y/igt@prime_vgem@basic-gtt.html
   [4]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19487/fi-tgl-y/igt@prime_vgem@basic-gtt.html

  
#### Possible fixes ####

  * igt@prime_vgem@basic-fence-flip:
    - fi-tgl-y:           [DMESG-WARN][5] ([i915#402]) -> [PASS][6] +1 similar issue
   [5]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9680/fi-tgl-y/igt@prime_vgem@basic-fence-flip.html
   [6]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19487/fi-tgl-y/igt@prime_vgem@basic-fence-flip.html

  
  {name}: This element is suppressed. This means it is ignored when computing
          the status of the difference (SUCCESS, WARNING, or FAILURE).

  [i915#402]: https://gitlab.freedesktop.org/drm/intel/issues/402


Participating hosts (39 -> 35)
------------------------------

  Missing    (4): fi-ctg-p8600 fi-jsl-1 fi-ilk-m540 fi-hsw-4200u 


Build changes
-------------

  * Linux: CI_DRM_9680 -> Patchwork_19487

  CI-20190529: 20190529
  CI_DRM_9680: 9e03236ed9687144929d42404341384cc1e501b7 @ git://anongit.freedesktop.org/gfx-ci/linux
  IGT_5971: abef2b7d6ff30f3b948b3e5d39653debb73083f3 @ git://anongit.freedesktop.org/xorg/app/intel-gpu-tools
  Patchwork_19487: ac942d24c88f5e0b7247d62f73b254f29a02145c @ git://anongit.freedesktop.org/gfx-ci/linux


== Linux commits ==

ac942d24c88f drm/i915: Support secure dispatch on gen6/gen7
4743bbd27620 drm/i915/gt: Enable ring scheduling for gen5-7
bf0a549d2293 drm/i915/gt: Implement ring scheduler for gen4-7
a07f5ba0a10d drm/i915/gt: Infrastructure for ring scheduling
0337feb5791a drm/i915/gt: Use client timeline address for seqno writes
209430537935 drm/i915/gt: Support creation of 'internal' rings
593e08f61ffd drm/i915/gt: Couple tasklet scheduling for all CS interrupts
cf05a88e82c0 Restore "drm/i915: drop engine_pin/unpin_breadcrumbs_irq"
913207bf89cd drm/i915/gt: Use ppHWSP for unshared non-semaphore related timelines
b9b643d59b9d drm/i915/selftests: Exercise relative timeline modes
0a3936960afc drm/i915/gt: Use indices for writing into relative timelines
9bb8f60faea0 drm/i915/gt: Add timeline "mode"
fc7e917f62a1 drm/i915/gt: Track timeline GGTT offset separately from subpage offset
1edd11623282 drm/i915/gt: Wrap intel_timeline.has_initial_breadcrumb
75ecbfad42e2 drm/i915: Bump default timeslicing quantum to 5ms
4e61621ac116 drm/i915: Move saturated workload detection back to the context
e4a776414370 drm/i915/gt: Support virtual engine queues
010d86d0ecb1 drm/i915: Extend the priority boosting for the display with a deadline
f93f85ad482a drm/i915/gt: Specify a deadline for the heartbeat
4cabd30226f5 drm/i915: Fair low-latency scheduling
48a6729c987b drm/i915: Wrap cmpxchg64 with try_cmpxchg64() helper
4abff63c4cfe drm/i915: Replace priolist rbtree with a skiplist
3839efb97bae drm/i915/gt: Show scheduler queues when dumping state
3744cfde0c1b drm/i915: Move tasklet from execlists to sched
0597492c1e44 drm/i915: Move scheduler queue
4ab371f1d39d drm/i915: Move common active lists from engine to i915_scheduler
d6eb9d01bae2 drm/i915: Fix the iterative dfs for defering requests
bf3ba20370f6 drm/i915: Extract the ability to defer and rerun a request later
f04573d14fc7 drm/i915: Extract request suspension from the execlists
db49df2e5ac7 drm/i915: Extract request rewinding from execlists
1dac44d403d7 drm/i915: Extract request submission from execlists
f6b8b7362118 drm/i915/selftests: Exercise cross-process context isolation
ddf444ae329e drm/i915/selftests: Exercise relative mmio paths to non-privileged registers
9b002fc62d1f drm/i915: Improve DFS for priority inheritance
22c3f3836779 drm/i915/selftests: Exercise priority inheritance around an engine loop
f1443908248f drm/i915/selftests: Measure set-priority duration
7b140da51266 drm/i915: Restructure priority inheritance
b5ca2c04b198 drm/i915: Teach the i915_dependency to use a double-lock
0611644e4ddf drm/i915: Replace engine->schedule() with a known request operation
367f5e603c5d drm/i915/gt: Move the defer_request waiter active assertion
e4e577627c26 drm/i915/selftests: Check for engine-reset errors in the middle of workarounds

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19487/index.html


* Re: [Intel-gfx] [PATCH 04/41] drm/i915: Teach the i915_dependency to use a double-lock
  2021-01-25 15:34   ` Tvrtko Ursulin
@ 2021-01-25 21:37     ` Chris Wilson
  2021-01-26  9:40       ` Tvrtko Ursulin
  0 siblings, 1 reply; 90+ messages in thread
From: Chris Wilson @ 2021-01-25 21:37 UTC (permalink / raw)
  To: Tvrtko Ursulin, intel-gfx; +Cc: thomas.hellstrom

Quoting Tvrtko Ursulin (2021-01-25 15:34:53)
> 
> On 25/01/2021 14:00, Chris Wilson wrote:
> > @@ -390,24 +410,27 @@ bool __i915_sched_node_add_dependency(struct i915_sched_node *node,
> >   {
> >       bool ret = false;
> >   
> > -     spin_lock_irq(&schedule_lock);
> > +     /* The signal->lock is always the outer lock in this double-lock. */
> > +     spin_lock(&signal->lock);
> >   
> >       if (!node_signaled(signal)) {
> >               INIT_LIST_HEAD(&dep->dfs_link);
> >               dep->signaler = signal;
> > -             dep->waiter = node;
> > +             dep->waiter = node_get(node);
> >               dep->flags = flags;
> >   
> >               /* All set, now publish. Beware the lockless walkers. */
> > +             spin_lock_nested(&node->lock, SINGLE_DEPTH_NESTING);
> >               list_add_rcu(&dep->signal_link, &node->signalers_list);
> >               list_add_rcu(&dep->wait_link, &signal->waiters_list);
> > +             spin_unlock(&node->lock);
> >   
> >               /* Propagate the chains */
> >               node->flags |= signal->flags;
> >               ret = true;
> >       }
> >   
> > -     spin_unlock_irq(&schedule_lock);
> > +     spin_unlock(&signal->lock);
> 
> So we have to be sure that no other entry point can lock the same nodes
> in reverse, that is, with the roles swapped. A situation where two nodes
> are simultaneously each other's waiter and signaler does indeed sound
> impossible, so I think this is fine.
> 
> The only exception would be an entry point that locks a waiter and then
> goes on to boost the priority of a signaler, but that path is still
> under the global lock. So the benefit of this patch is just to reduce
> contention between adding and re-scheduling?

We remove the global schedule_lock in the next patch. This patch tackles
the "simpler" list management by noting that the chains can always be
taken in the order (signaler, waiter), so we have strict nesting for a
local double lock.
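
For illustration only (this is not the i915 code): a minimal userspace
sketch of the (signaler, waiter) lock ordering, assuming pthread mutexes
in place of the kernel spinlocks and plain counters in place of the RCU
lists. struct node, node_init() and add_dependency() are hypothetical
names invented for this sketch.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for a scheduler node: one lock per node. */
struct node {
	pthread_mutex_t lock;	/* protects the two counters below */
	bool signaled;		/* set once the node has completed */
	int nr_signalers;	/* nodes before us, we depend upon */
	int nr_waiters;		/* nodes after us, they depend upon us */
};

static void node_init(struct node *n)
{
	pthread_mutex_init(&n->lock, NULL);
	n->signaled = false;
	n->nr_signalers = 0;
	n->nr_waiters = 0;
}

/*
 * Record that 'waiter' depends upon 'signal'.
 *
 * The signaler's lock is always the outer lock and the waiter's lock
 * the inner one. Returns false if the signaler has already completed,
 * in which case nothing is recorded.
 */
static bool add_dependency(struct node *waiter, struct node *signal)
{
	bool ret = false;

	pthread_mutex_lock(&signal->lock);	/* outer: signaler */
	if (!signal->signaled) {
		pthread_mutex_lock(&waiter->lock);	/* inner: waiter */
		waiter->nr_signalers++;
		signal->nr_waiters++;
		pthread_mutex_unlock(&waiter->lock);
		ret = true;
	}
	pthread_mutex_unlock(&signal->lock);

	return ret;
}
```

In this sketch, as long as the dependency graph stays acyclic, no two
callers can take the same pair of locks in opposite orders, which is the
property the strict nesting relies on.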

> And __i915_schedule does walk the list of signalers without holding this 
> new lock. What is the safety net there? RCU? Do we need 
> list_for_each_entry_rcu and explicit rcu_read_(un)lock in there then?

Yes, we are already supposedly RCU safe for the list of signalers, as
we've been depending on that for a while.

#define for_each_signaler(p__, rq__) \
        list_for_each_entry_rcu(p__, \
                                &(rq__)->sched.signalers_list, \
                                signal_link)

-Chris

* [Intel-gfx] ✗ Fi.CI.IGT: failure for series starting with [01/41] drm/i915/selftests: Check for engine-reset errors in the middle of workarounds
  2021-01-25 14:00 [Intel-gfx] [PATCH 01/41] drm/i915/selftests: Check for engine-reset errors in the middle of workarounds Chris Wilson
                   ` (43 preceding siblings ...)
  2021-01-25 17:38 ` [Intel-gfx] ✓ Fi.CI.BAT: success " Patchwork
@ 2021-01-25 22:45 ` Patchwork
  44 siblings, 0 replies; 90+ messages in thread
From: Patchwork @ 2021-01-25 22:45 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx


== Series Details ==

Series: series starting with [01/41] drm/i915/selftests: Check for engine-reset errors in the middle of workarounds
URL   : https://patchwork.freedesktop.org/series/86259/
State : failure

== Summary ==

CI Bug Log - changes from CI_DRM_9680_full -> Patchwork_19487_full
====================================================

Summary
-------

  **FAILURE**

  Serious unknown changes coming with Patchwork_19487_full absolutely need to be
  verified manually.
  
  If you think the reported changes have nothing to do with the changes
  introduced in Patchwork_19487_full, please notify your bug team to allow them
  to document this new failure mode, which will reduce false positives in CI.

  

Possible new issues
-------------------

  Here are the unknown changes that may have been introduced in Patchwork_19487_full:

### IGT changes ###

#### Possible regressions ####

  * {igt@i915_selftest@live@scheduler} (NEW):
    - shard-snb:          NOTRUN -> [DMESG-FAIL][1]
   [1]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19487/shard-snb5/igt@i915_selftest@live@scheduler.html

  * igt@i915_suspend@debugfs-reader (NEW):
    - shard-kbl:          [PASS][2] -> [INCOMPLETE][3]
   [2]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9680/shard-kbl7/igt@i915_suspend@debugfs-reader.html
   [3]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19487/shard-kbl2/igt@i915_suspend@debugfs-reader.html

  
#### Suppressed ####

  The following results come from untrusted machines, tests, or statuses.
  They do not affect the overall result.

  * {igt@sysfs_clients@busy@bcs0}:
    - shard-kbl:          [PASS][4] -> [FAIL][5]
   [4]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9680/shard-kbl1/igt@sysfs_clients@busy@bcs0.html
   [5]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19487/shard-kbl2/igt@sysfs_clients@busy@bcs0.html

  
New tests
---------

  New tests have been introduced between CI_DRM_9680_full and Patchwork_19487_full:

### New IGT tests (1884) ###

  * igt@core_auth@basic-auth:
    - Statuses : 7 pass(s)
    - Exec time: [0.0, 0.00] s

  * igt@core_auth@many-magics:
    - Statuses : 6 pass(s)
    - Exec time: [0.18, 0.86] s

  * igt@core_getclient:
    - Statuses : 7 pass(s)
    - Exec time: [0.07, 0.18] s

  * igt@core_getstats:
    - Statuses : 6 pass(s)
    - Exec time: [0.07, 0.19] s

  * igt@core_getversion:
    - Statuses : 7 pass(s)
    - Exec time: [0.08, 0.19] s

  * igt@core_setmaster_vs_auth:
    - Statuses : 7 pass(s)
    - Exec time: [0.07, 0.18] s

  * igt@debugfs_test@read_all_entries:
    - Statuses : 7 pass(s)
    - Exec time: [0.05, 0.28] s

  * igt@debugfs_test@read_all_entries_display_off:
    - Statuses : 7 pass(s)
    - Exec time: [0.09, 1.17] s

  * igt@debugfs_test@read_all_entries_display_on:
    - Statuses :
    - Exec time: [None] s

  * igt@drm_import_export@flink:
    - Statuses : 7 pass(s)
    - Exec time: [10.74, 10.75] s

  * igt@drm_import_export@import-close-race-flink:
    - Statuses : 7 pass(s)
    - Exec time: [10.74, 10.76] s

  * igt@drm_import_export@import-close-race-prime:
    - Statuses : 7 pass(s)
    - Exec time: [10.74] s

  * igt@drm_import_export@prime:
    - Statuses : 7 pass(s)
    - Exec time: [10.74, 10.75] s

  * igt@drm_read@empty-block:
    - Statuses : 2 pass(s)
    - Exec time: [1.0] s

  * igt@drm_read@empty-nonblock:
    - Statuses : 6 pass(s)
    - Exec time: [0.0, 0.00] s

  * igt@drm_read@fault-buffer:
    - Statuses : 7 pass(s)
    - Exec time: [0.0, 0.00] s

  * igt@drm_read@invalid-buffer:
    - Statuses : 7 pass(s)
    - Exec time: [0.0] s

  * igt@drm_read@short-buffer-block:
    - Statuses : 7 pass(s)
    - Exec time: [0.0] s

  * igt@drm_read@short-buffer-nonblock:
    - Statuses : 7 pass(s)
    - Exec time: [0.0] s

  * igt@dumb_buffer@create-clear:
    - Statuses : 7 pass(s)
    - Exec time: [37.49, 50.47] s

  * igt@dumb_buffer@create-valid-dumb:
    - Statuses : 7 pass(s)
    - Exec time: [0.0, 0.00] s

  * igt@dumb_buffer@invalid-bpp:
    - Statuses : 7 pass(s)
    - Exec time: [0.0] s

  * igt@dumb_buffer@map-invalid-size:
    - Statuses : 7 pass(s)
    - Exec time: [0.0, 0.00] s

  * igt@dumb_buffer@map-uaf:
    - Statuses : 7 pass(s)
    - Exec time: [0.02, 0.10] s

  * igt@dumb_buffer@map-valid:
    - Statuses : 7 pass(s)
    - Exec time: [0.0] s

  * igt@fbdev@info:
    - Statuses : 7 pass(s)
    - Exec time: [0.0] s

  * igt@gem_bad_reloc@negative-reloc-bltcopy:
    - Statuses : 7 pass(s)
    - Exec time: [0.35, 4.19] s

  * igt@gem_basic@bad-close:
    - Statuses : 7 pass(s)
    - Exec time: [0.0] s

  * igt@gem_basic@create-close:
    - Statuses : 7 pass(s)
    - Exec time: [0.0, 0.00] s

  * igt@gem_basic@create-fd-close:
    - Statuses : 7 pass(s)
    - Exec time: [0.0] s

  * igt@gem_blits@basic:
    - Statuses : 7 pass(s)
    - Exec time: [1.50, 14.88] s

  * igt@gem_busy@close-race:
    - Statuses : 6 pass(s)
    - Exec time: [21.78, 22.33] s

  * igt@gem_caching@read-writes:
    - Statuses : 7 pass(s)
    - Exec time: [4.38, 23.12] s

  * igt@gem_caching@reads:
    - Statuses : 7 pass(s)
    - Exec time: [0.68, 5.67] s

  * igt@gem_caching@writes:
    - Statuses : 7 pass(s)
    - Exec time: [2.56, 13.60] s

  * igt@gem_close@basic:
    - Statuses : 7 pass(s)
    - Exec time: [0.0] s

  * igt@gem_close@many-handles-one-vma:
    - Statuses : 7 pass(s)
    - Exec time: [0.02, 0.10] s

  * igt@gem_close_race@basic-process:
    - Statuses : 7 pass(s)
    - Exec time: [0.04, 0.12] s

  * igt@gem_close_race@basic-threads:
    - Statuses : 7 pass(s)
    - Exec time: [1.14, 1.25] s

  * igt@gem_create@create-invalid-size:
    - Statuses : 7 pass(s)
    - Exec time: [0.0] s

  * igt@gem_create@create-valid-nonaligned:
    - Statuses : 7 pass(s)
    - Exec time: [0.0, 0.00] s

  * igt@gem_ctx_bad_destroy@double-destroy:
    - Statuses : 7 pass(s)
    - Exec time: [0.0, 0.00] s

  * igt@gem_ctx_bad_destroy@invalid-ctx:
    - Statuses : 7 pass(s)
    - Exec time: [0.0] s

  * igt@gem_ctx_bad_destroy@invalid-default-ctx:
    - Statuses : 7 pass(s)
    - Exec time: [0.0] s

  * igt@gem_ctx_bad_destroy@invalid-pad:
    - Statuses : 7 pass(s)
    - Exec time: [0.0, 0.00] s

  * igt@gem_ctx_create@basic:
    - Statuses : 6 pass(s)
    - Exec time: [0.0, 0.00] s

  * igt@gem_ctx_create@basic-files:
    - Statuses : 7 pass(s)
    - Exec time: [2.01, 2.04] s

  * igt@gem_ctx_exec@basic:
    - Statuses : 7 pass(s)
    - Exec time: [0.00, 0.01] s

  * igt@gem_ctx_exec@basic-invalid-context:
    - Statuses : 7 pass(s)
    - Exec time: [0.00, 0.01] s

  * igt@gem_ctx_freq@sysfs:
    - Statuses : 7 pass(s)
    - Exec time: [4.81, 4.93] s

  * igt@gem_ctx_param@basic:
    - Statuses : 7 pass(s)
    - Exec time: [0.0] s

  * igt@gem_ctx_param@basic-default:
    - Statuses : 7 pass(s)
    - Exec time: [0.0] s

  * igt@gem_ctx_param@get-priority-new-ctx:
    - Statuses : 7 pass(s)
    - Exec time: [0.0, 0.00] s

  * igt@gem_ctx_param@invalid-ctx-get:
    - Statuses : 7 pass(s)
    - Exec time: [0.0] s

  * igt@gem_ctx_param@invalid-ctx-set:
    - Statuses : 7 pass(s)
    - Exec time: [0.0] s

  * igt@gem_ctx_param@invalid-param-get:
    - Statuses : 7 pass(s)
    - Exec time: [0.0, 0.01] s

  * igt@gem_ctx_param@invalid-param-set:
    - Statuses : 7 pass(s)
    - Exec time: [0.0] s

  * igt@gem_ctx_param@invalid-size-get:
    - Statuses : 7 pass(s)
    - Exec time: [0.0] s

  * igt@gem_ctx_param@invalid-size-set:
    - Statuses : 7 pass(s)
    - Exec time: [0.0] s

  * igt@gem_ctx_param@non-root-set:
    - Statuses : 6 pass(s)
    - Exec time: [0.01, 0.04] s

  * igt@gem_ctx_param@non-root-set-no-zeromap:
    - Statuses : 7 pass(s)
    - Exec time: [0.01, 0.04] s

  * igt@gem_ctx_param@root-set:
    - Statuses : 7 pass(s)
    - Exec time: [0.0] s

  * igt@gem_ctx_param@root-set-no-zeromap-disabled:
    - Statuses : 6 pass(s)
    - Exec time: [0.0] s

  * igt@gem_ctx_param@root-set-no-zeromap-enabled:
    - Statuses : 7 pass(s)
    - Exec time: [0.0] s

  * igt@gem_ctx_param@set-priority-invalid-size:
    - Statuses : 6 pass(s)
    - Exec time: [0.0] s

  * igt@gem_ctx_param@set-priority-not-supported:
    - Statuses : 7 skip(s)
    - Exec time: [0.0] s

  * igt@gem_ctx_param@set-priority-range:
    - Statuses : 7 pass(s)
    - Exec time: [0.02, 0.06] s

  * igt@gem_eio@banned:
    - Statuses : 7 pass(s)
    - Exec time: [0.05, 0.42] s

  * igt@gem_eio@execbuf:
    - Statuses : 7 pass(s)
    - Exec time: [0.02, 0.06] s

  * igt@gem_eio@hibernate:
    - Statuses : 7 pass(s)
    - Exec time: [12.66, 15.15] s

  * igt@gem_eio@in-flight-10ms:
    - Statuses : 7 pass(s)
    - Exec time: [0.46, 2.38] s

  * igt@gem_eio@in-flight-1us:
    - Statuses : 6 pass(s)
    - Exec time: [0.71, 2.64] s

  * igt@gem_eio@in-flight-contexts-10ms:
    - Statuses : 7 pass(s)
    - Exec time: [0.68, 2.46] s

  * igt@gem_eio@in-flight-contexts-1us:
    - Statuses : 7 pass(s)
    - Exec time: [0.47, 2.36] s

  * igt@gem_eio@in-flight-contexts-immediate:
    - Statuses : 6 pass(s)
    - Exec time: [0.46, 2.44] s

  * igt@gem_eio@in-flight-external:
    - Statuses : 7 pass(s)
    - Exec time: [0.03, 0.26] s

  * igt@gem_eio@in-flight-immediate:
    - Statuses : 7 pass(s)
    - Exec time: [0.48, 2.24] s

  * igt@gem_eio@in-flight-internal-10ms:
    - Statuses : 7 pass(s)
    - Exec time: [0.04, 0.28] s

  * igt@gem_eio@in-flight-internal-1us:
    - Statuses : 7 pass(s)
    - Exec time: [0.03, 0.26] s

  * igt@gem_eio@in-flight-internal-immediate:
    - Statuses : 7 pass(s)
    - Exec time: [0.03, 0.26] s

  * igt@gem_eio@in-flight-suspend:
    - Statuses : 6 pass(s)
    - Exec time: [1.33, 3.12] s

  * igt@gem_eio@reset-stress:
    - Statuses : 7 pass(s)
    - Exec time: [27.71, 38.79] s

  * igt@gem_eio@suspend:
    - Statuses : 7 pass(s)
    - Exec time: [10.98, 12.61] s

  * igt@gem_eio@throttle:
    - Statuses : 7 pass(s)
    - Exec time: [0.02, 0.07] s

  * igt@gem_eio@unwedge-stress:
    - Statuses : 7 pass(s)
    - Exec time: [27.54, 39.84] s

  * igt@gem_eio@wait-10ms:
    - Statuses : 6 pass(s)
    - Exec time: [0.04, 0.16] s

  * igt@gem_eio@wait-1us:
    - Statuses : 7 pass(s)
    - Exec time: [0.03, 0.16] s

  * igt@gem_eio@wait-immediate:
    - Statuses : 7 pass(s)
    - Exec time: [0.03, 0.17] s

  * igt@gem_eio@wait-wedge-10ms:
    - Statuses : 7 pass(s)
    - Exec time: [0.04, 0.27] s

  * igt@gem_eio@wait-wedge-1us:
    - Statuses : 7 pass(s)
    - Exec time: [0.03, 0.27] s

  * igt@gem_eio@wait-wedge-immediate:
    - Statuses : 7 pass(s)
    - Exec time: [0.03, 0.25] s

  * igt@gem_exec_alignment@single:
    - Statuses :
    - Exec time: [None] s

  * igt@gem_exec_await@wide-all:
    - Statuses : 7 pass(s)
    - Exec time: [21.81, 22.55] s

  * igt@gem_exec_await@wide-contexts:
    - Statuses : 7 pass(s)
    - Exec time: [21.63, 22.32] s

  * igt@gem_exec_balancer@bonded-chain:
    - Statuses : 4 pass(s) 2 skip(s)
    - Exec time: [0.0, 7.21] s

  * igt@gem_exec_balancer@bonded-semaphore:
    - Statuses : 5 pass(s) 2 skip(s)
    - Exec time: [0.0, 4.57] s

  * igt@gem_exec_balancer@hang:
    - Statuses : 5 pass(s) 2 skip(s)
    - Exec time: [0.0, 4.58] s

  * igt@gem_exec_basic@basic:
    - Statuses :
    - Exec time: [None] s

  * igt@gem_exec_basic@basic@bcs0:
    - Statuses : 7 pass(s)
    - Exec time: [0.0, 0.01] s

  * igt@gem_exec_basic@basic@rcs0:
    - Statuses : 7 pass(s)
    - Exec time: [0.00] s

  * igt@gem_exec_basic@basic@vcs0:
    - Statuses : 7 pass(s)
    - Exec time: [0.0, 0.00] s

  * igt@gem_exec_basic@basic@vcs1:
    - Statuses : 2 pass(s)
    - Exec time: [0.0, 0.00] s

  * igt@gem_exec_basic@basic@vecs0:
    - Statuses : 6 pass(s)
    - Exec time: [0.0, 0.00] s

  * igt@gem_exec_capture@userptr:
    - Statuses : 6 pass(s)
    - Exec time: [0.01, 0.04] s

  * igt@gem_exec_create@basic:
    - Statuses : 7 pass(s)
    - Exec time: [2.03, 2.07] s

  * igt@gem_exec_create@forked:
    - Statuses : 6 pass(s)
    - Exec time: [20.06, 20.12] s

  * igt@gem_exec_create@madvise:
    - Statuses : 7 pass(s)
    - Exec time: [20.03, 20.77] s

  * igt@gem_exec_fence@basic-busy-all:
    - Statuses : 7 pass(s)
    - Exec time: [0.01, 0.03] s

  * igt@gem_exec_fence@basic-wait-all:
    - Statuses : 6 pass(s)
    - Exec time: [0.01, 0.03] s

  * igt@gem_exec_flush@basic-batch-kernel-default-cmd:
    - Statuses : 3 pass(s) 3 skip(s)
    - Exec time: [0.0, 5.81] s

  * igt@gem_exec_flush@basic-batch-kernel-default-uc:
    - Statuses : 7 pass(s)
    - Exec time: [5.48, 6.13] s

  * igt@gem_exec_flush@basic-batch-kernel-default-wb:
    - Statuses : 7 pass(s)
    - Exec time: [5.57, 5.99] s

  * igt@gem_exec_flush@basic-uc-pro-default:
    - Statuses : 2 pass(s)
    - Exec time: [5.41, 5.43] s

  * igt@gem_exec_flush@basic-uc-prw-default:
    - Statuses :
    - Exec time: [None] s

  * igt@gem_exec_flush@basic-uc-ro-default:
    - Statuses : 7 pass(s)
    - Exec time: [5.40, 5.46] s

  * igt@gem_exec_flush@basic-uc-rw-default:
    - Statuses : 7 pass(s)
    - Exec time: [5.41, 5.48] s

  * igt@gem_exec_flush@basic-uc-set-default:
    - Statuses : 7 pass(s)
    - Exec time: [5.41, 5.46] s

  * igt@gem_exec_flush@basic-wb-pro-default:
    - Statuses : 7 pass(s)
    - Exec time: [5.41, 5.52] s

  * igt@gem_exec_flush@basic-wb-prw-default:
    - Statuses : 7 pass(s)
    - Exec time: [5.42, 5.46] s

  * igt@gem_exec_flush@basic-wb-ro-before-default:
    - Statuses : 7 pass(s)
    - Exec time: [5.41, 5.48] s

  * igt@gem_exec_flush@basic-wb-ro-default:
    - Statuses : 6 pass(s)
    - Exec time: [5.41, 5.47] s

  * igt@gem_exec_flush@basic-wb-rw-before-default:
    - Statuses : 7 pass(s)
    - Exec time: [5.41, 5.46] s

  * igt@gem_exec_flush@basic-wb-rw-default:
    - Statuses : 7 pass(s)
    - Exec time: [5.41, 5.46] s

  * igt@gem_exec_flush@basic-wb-set-default:
    - Statuses : 7 pass(s)
    - Exec time: [5.41, 5.47] s

  * igt@gem_exec_gttfill@basic:
    - Statuses : 7 pass(s)
    - Exec time: [3.48, 31.86] s

  * igt@gem_exec_nop@basic-parallel:
    - Statuses : 7 pass(s)
    - Exec time: [2.78, 3.34] s

  * igt@gem_exec_nop@basic-sequential:
    - Statuses : 7 pass(s)
    - Exec time: [2.77, 3.31] s

  * igt@gem_exec_nop@basic-series:
    - Statuses : 7 pass(s)
    - Exec time: [2.76, 3.30] s

  * igt@gem_exec_parallel@basic:
    - Statuses :
    - Exec time: [None] s

  * igt@gem_exec_parallel@contexts:
    - Statuses :
    - Exec time: [None] s

  * igt@gem_exec_parallel@fds:
    - Statuses :
    - Exec time: [None] s

  * igt@gem_exec_params@batch-first:
    - Statuses : 7 pass(s)
    - Exec time: [0.00, 0.01] s

  * igt@gem_exec_params@cliprects-invalid:
    - Statuses : 7 pass(s)
    - Exec time: [0.0, 0.00] s

  * igt@gem_exec_params@cliprects_ptr-dirt:
    - Statuses : 7 pass(s)
    - Exec time: [0.0, 0.00] s

  * igt@gem_exec_params@dr1-dirt:
    - Statuses : 7 pass(s)
    - Exec time: [0.0, 0.00] s

  * igt@gem_exec_params@dr4-dirt:
    - Statuses : 6 pass(s)
    - Exec time: [0.0] s

  * igt@gem_exec_params@invalid-bsd-ring:
    - Statuses : 7 pass(s)
    - Exec time: [0.0, 0.01] s

  * igt@gem_exec_params@invalid-bsd1-flag-on-blt:
    - Statuses : 7 pass(s)
    - Exec time: [0.0, 0.00] s

  * igt@gem_exec_params@invalid-bsd1-flag-on-render:
    - Statuses : 7 pass(s)
    - Exec time: [0.0, 0.00] s

  * igt@gem_exec_params@invalid-bsd1-flag-on-vebox:
    - Statuses : 5 pass(s) 1 skip(s)
    - Exec time: [0.0] s

  * igt@gem_exec_params@invalid-bsd2-flag-on-blt:
    - Statuses : 7 pass(s)
    - Exec time: [0.0, 0.00] s

  * igt@gem_exec_params@invalid-bsd2-flag-on-render:
    - Statuses : 7 pass(s)
    - Exec time: [0.0] s

  * igt@gem_exec_params@invalid-bsd2-flag-on-vebox:
    - Statuses :
    - Exec time: [None] s

  * igt@gem_exec_params@invalid-fence-in:
    - Statuses : 7 pass(s)
    - Exec time: [0.0, 0.00] s

  * igt@gem_exec_params@invalid-flag:
    - Statuses : 6 pass(s)
    - Exec time: [0.0] s

  * igt@gem_exec_params@invalid-ring:
    - Statuses : 7 pass(s)
    - Exec time: [0.0, 0.00] s

  * igt@gem_exec_params@invalid-ring2:
    - Statuses : 7 pass(s)
    - Exec time: [0.0, 0.00] s

  * igt@gem_exec_params@no-blt:
    - Statuses :
    - Exec time: [None] s

  * igt@gem_exec_params@no-bsd:
    - Statuses : 7 skip(s)
    - Exec time: [0.0] s

  * igt@gem_exec_params@no-vebox:
    - Statuses : 1 pass(s) 5 skip(s)
    - Exec time: [0.0] s

  * igt@gem_exec_params@rel-constants-invalid:
    - Statuses : 7 pass(s)
    - Exec time: [0.0, 0.00] s

  * igt@gem_exec_params@rel-constants-invalid-rel-gen5:
    - Statuses : 7 pass(s)
    - Exec time: [0.0] s

  * igt@gem_exec_params@rel-constants-invalid-ring:
    - Statuses : 7 pass(s)
    - Exec time: [0.0, 0.00] s

  * igt@gem_exec_params@rs-invalid:
    - Statuses : 7 pass(s)
    - Exec time: [0.00, 0.01] s

  * igt@gem_exec_params@rsvd2-dirt:
    - Statuses : 7 skip(s)
    - Exec time: [0.0] s

  * igt@gem_exec_params@secure-non-master:
    - Statuses : 2 pass(s) 5 skip(s)
    - Exec time: [0.0, 0.00] s

  * igt@gem_exec_params@secure-non-root:
    - Statuses : 2 pass(s) 5 skip(s)
    - Exec time: [0.0, 0.03] s

  * igt@gem_exec_params@sol-reset-invalid:
    - Statuses : 7 pass(s)
    - Exec time: [0.00] s

  * igt@gem_exec_params@sol-reset-not-gen7:
    - Statuses : 6 pass(s) 1 skip(s)
    - Exec time: [0.0, 0.01] s

  * igt@gem_exec_reloc@basic-active:
    - Statuses :
    - Exec time: [None] s

  * igt@gem_exec_reloc@basic-cpu:
    - Statuses : 7 pass(s)
    - Exec time: [0.00, 0.03] s

  * igt@gem_exec_reloc@basic-cpu-active:
    - Statuses : 7 pass(s)
    - Exec time: [0.11, 0.16] s

  * igt@gem_exec_reloc@basic-cpu-gtt:
    - Statuses :
    - Exec time: [None] s

  * igt@gem_exec_reloc@basic-cpu-gtt-active:
    - Statuses : 7 pass(s)
    - Exec time: [0.11, 0.16] s

  * igt@gem_exec_reloc@basic-cpu-gtt-noreloc:
    - Statuses : 7 pass(s)
    - Exec time: [0.00, 0.01] s

  * igt@gem_exec_reloc@basic-cpu-noreloc:
    - Statuses : 7 pass(s)
    - Exec time: [0.00, 0.01] s

  * igt@gem_exec_reloc@basic-cpu-read:
    - Statuses : 6 pass(s)
    - Exec time: [0.00, 0.03] s

  * igt@gem_exec_reloc@basic-cpu-read-active:
    - Statuses : 7 pass(s)
    - Exec time: [0.11, 0.16] s

  * igt@gem_exec_reloc@basic-cpu-read-noreloc:
    - Statuses : 7 pass(s)
    - Exec time: [0.00, 0.01] s

  * igt@gem_exec_reloc@basic-cpu-wc:
    - Statuses : 7 pass(s)
    - Exec time: [0.00, 0.03] s

  * igt@gem_exec_reloc@basic-cpu-wc-active:
    - Statuses : 7 pass(s)
    - Exec time: [0.11, 0.16] s

  * igt@gem_exec_reloc@basic-cpu-wc-noreloc:
    - Statuses : 7 pass(s)
    - Exec time: [0.00, 0.02] s

  * igt@gem_exec_reloc@basic-gtt:
    - Statuses : 7 pass(s)
    - Exec time: [0.00, 0.03] s

  * igt@gem_exec_reloc@basic-gtt-active:
    - Statuses : 7 pass(s)
    - Exec time: [0.11, 0.18] s

  * igt@gem_exec_reloc@basic-gtt-cpu:
    - Statuses : 7 pass(s)
    - Exec time: [0.00, 0.02] s

  * igt@gem_exec_reloc@basic-gtt-cpu-active:
    - Statuses : 6 pass(s)
    - Exec time: [0.11, 0.17] s

  * igt@gem_exec_reloc@basic-gtt-cpu-noreloc:
    - Statuses : 7 pass(s)
    - Exec time: [0.00, 0.01] s

  * igt@gem_exec_reloc@basic-gtt-noreloc:
    - Statuses : 7 pass(s)
    - Exec time: [0.00, 0.01] s

  * igt@gem_exec_reloc@basic-gtt-read:
    - Statuses : 7 pass(s)
    - Exec time: [0.00, 0.02] s

  * igt@gem_exec_reloc@basic-gtt-read-active:
    - Statuses : 7 pass(s)
    - Exec time: [0.11, 0.16] s

  * igt@gem_exec_reloc@basic-gtt-read-noreloc:
    - Statuses : 7 pass(s)
    - Exec time: [0.00, 0.02] s

  * igt@gem_exec_reloc@basic-gtt-wc:
    - Statuses : 7 pass(s)
    - Exec time: [0.00, 0.03] s

  * igt@gem_exec_reloc@basic-gtt-wc-active:
    - Statuses : 7 pass(s)
    - Exec time: [0.11, 0.16] s

  * igt@gem_exec_reloc@basic-gtt-wc-noreloc:
    - Statuses : 7 pass(s)
    - Exec time: [0.00, 0.02] s

  * igt@gem_exec_reloc@basic-range:
    - Statuses : 7 pass(s)
    - Exec time: [0.01, 0.10] s

  * igt@gem_exec_reloc@basic-range-active:
    - Statuses : 7 pass(s)
    - Exec time: [0.01, 0.18] s

  * igt@gem_exec_reloc@basic-softpin:
    - Statuses : 7 pass(s)
    - Exec time: [0.00, 0.01] s

  * igt@gem_exec_reloc@basic-wc:
    - Statuses : 6 pass(s)
    - Exec time: [0.00, 0.03] s

  * igt@gem_exec_reloc@basic-wc-active:
    - Statuses : 7 pass(s)
    - Exec time: [0.11, 0.17] s

  * igt@gem_exec_reloc@basic-wc-cpu:
    - Statuses : 7 pass(s)
    - Exec time: [0.01, 0.03] s

  * igt@gem_exec_reloc@basic-wc-cpu-active:
    - Statuses : 7 pass(s)
    - Exec time: [0.11, 0.17] s

  * igt@gem_exec_reloc@basic-wc-cpu-noreloc:
    - Statuses : 6 pass(s)
    - Exec time: [0.00, 0.01] s

  * igt@gem_exec_reloc@basic-wc-gtt:
    - Statuses : 7 pass(s)
    - Exec time: [0.01, 0.02] s

  * igt@gem_exec_reloc@basic-wc-gtt-active:
    - Statuses : 7 pass(s)
    - Exec time: [0.11, 0.17] s

  * igt@gem_exec_reloc@basic-wc-gtt-noreloc:
    - Statuses : 7 pass(s)
    - Exec time: [0.00, 0.02] s

  * igt@gem_exec_reloc@basic-wc-noreloc:
    - Statuses : 7 pass(s)
    - Exec time: [0.00, 0.01] s

  * igt@gem_exec_reloc@basic-wc-read:
    - Statuses : 7 pass(s)
    - Exec time: [0.00, 0.02] s

  * igt@gem_exec_reloc@basic-wc-read-active:
    - Statuses : 6 pass(s)
    - Exec time: [0.11, 0.16] s

  * igt@gem_exec_reloc@basic-wc-read-noreloc:
    - Statuses : 7 pass(s)
    - Exec time: [0.00, 0.01] s

  * igt@gem_exec_reloc@basic-write-cpu:
    - Statuses : 7 pass(s)
    - Exec time: [0.00, 0.02] s

  * igt@gem_exec_reloc@basic-write-cpu-active:
    - Statuses : 6 pass(s)
    - Exec time: [0.11, 0.17] s

  * igt@gem_exec_reloc@basic-write-cpu-noreloc:
    - Statuses : 7 pass(s)
    - Exec time: [0.00, 0.01] s

  * igt@gem_exec_reloc@basic-write-gtt:
    - Statuses : 7 pass(s)
    - Exec time: [0.00, 0.02] s

  * igt@gem_exec_reloc@basic-write-gtt-active:
    - Statuses : 7 pass(s)
    - Exec time: [0.11, 0.16] s

  * igt@gem_exec_reloc@basic-write-gtt-noreloc:
    - Statuses : 7 pass(s)
    - Exec time: [0.01, 0.02] s

  * igt@gem_exec_reloc@basic-write-read:
    - Statuses : 7 pass(s)
    - Exec time: [0.00, 0.02] s

  * igt@gem_exec_reloc@basic-write-read-active:
    - Statuses : 7 pass(s)
    - Exec time: [0.11, 0.17] s

  * igt@gem_exec_reloc@basic-write-read-noreloc:
    - Statuses : 7 pass(s)
    - Exec time: [0.00, 0.01] s

  * igt@gem_exec_reloc@basic-write-wc:
    - Statuses : 7 pass(s)
    - Exec time: [0.00, 0.02] s

  * igt@gem_exec_reloc@basic-write-wc-active:
    - Statuses : 7 pass(s)
    - Exec time: [0.11, 0.16] s

  * igt@gem_exec_reloc@basic-write-wc-noreloc:
    - Statuses : 6 pass(s)
    - Exec time: [0.00, 0.01] s

  * igt@gem_exec_schedule@smoketest-all:
    - Statuses : 7 pass(s)
    - Exec time: [32.26, 32.32] s

  * igt@gem_exec_suspend@basic:
    - Statuses : 7 pass(s)
    - Exec time: [0.19, 1.54] s

  * igt@gem_exec_suspend@basic-s3:
    - Statuses :
    - Exec time: [None] s

  * igt@gem_exec_suspend@basic-s3-devices:
    - Statuses : 7 pass(s)
    - Exec time: [6.07, 10.34] s

  * igt@gem_exec_suspend@basic-s4-devices:
    - Statuses : 7 pass(s)
    - Exec time: [7.10, 11.96] s

  * igt@gem_fence_thrash@bo-copy:
    - Statuses : 7 pass(s)
    - Exec time: [1.11, 1.76] s

  * igt@gem_fence_thrash@bo-write-verify-none:
    - Statuses : 7 pass(s)
    - Exec time: [1.11, 1.19] s

  * igt@gem_fence_thrash@bo-write-verify-threaded-none:
    - Statuses : 7 pass(s)
    - Exec time: [1.22, 3.17] s

  * igt@gem_fence_thrash@bo-write-verify-threaded-x:
    - Statuses :
    - Exec time: [None] s

  * igt@gem_fence_thrash@bo-write-verify-threaded-y:
    - Statuses :
    - Exec time: [None] s

  * igt@gem_fence_thrash@bo-write-verify-x:
    - Statuses : 7 pass(s)
    - Exec time: [1.10, 1.28] s

  * igt@gem_fence_thrash@bo-write-verify-y:
    - Statuses : 7 pass(s)
    - Exec time: [1.09, 1.31] s

  * igt@gem_fenced_exec_thrash@2-spare-fences:
    - Statuses : 7 pass(s)
    - Exec time: [2.15, 2.17] s

  * igt@gem_fenced_exec_thrash@no-spare-fences:
    - Statuses : 6 pass(s)
    - Exec time: [2.15, 2.17] s

  * igt@gem_fenced_exec_thrash@no-spare-fences-busy:
    - Statuses : 7 pass(s)
    - Exec time: [2.15, 2.19] s

  * igt@gem_fenced_exec_thrash@no-spare-fences-busy-interruptible:
    - Statuses : 7 pass(s)
    - Exec time: [2.15, 2.18] s

  * igt@gem_fenced_exec_thrash@no-spare-fences-interruptible:
    - Statuses : 7 pass(s)
    - Exec time: [2.15, 2.17] s

  * igt@gem_fenced_exec_thrash@too-many-fences:
    - Statuses : 7 pass(s)
    - Exec time: [2.15, 2.17] s

  * igt@gem_flink_basic@bad-flink:
    - Statuses : 7 pass(s)
    - Exec time: [0.0, 0.00] s

  * igt@gem_flink_basic@bad-open:
    - Statuses : 7 pass(s)
    - Exec time: [0.0] s

  * igt@gem_flink_basic@basic:
    - Statuses : 7 pass(s)
    - Exec time: [0.0] s

  * igt@gem_flink_basic@double-flink:
    - Statuses : 7 pass(s)
    - Exec time: [0.0] s

  * igt@gem_flink_basic@flink-lifetime:
    - Statuses : 7 pass(s)
    - Exec time: [0.0, 0.01] s

  * igt@gem_flink_race@flink_close:
    - Statuses : 7 pass(s)
    - Exec time: [5.01, 5.03] s

  * igt@gem_flink_race@flink_name:
    - Statuses : 7 pass(s)
    - Exec time: [5.37] s

  * igt@gem_gpgpu_fill:
    - Statuses : 1 pass(s) 1 skip(s)
    - Exec time: [0.09, 0.10] s

  * igt@gem_gtt_cpu_tlb:
    - Statuses : 7 pass(s)
    - Exec time: [0.09, 0.27] s

  * igt@gem_linear_blits@basic:
    - Statuses : 7 pass(s)
    - Exec time: [0.01, 0.03] s

  * igt@gem_linear_blits@interruptible:
    - Statuses : 7 pass(s)
    - Exec time: [1.54, 23.48] s

  * igt@gem_linear_blits@normal:
    - Statuses : 7 pass(s)
    - Exec time: [1.61, 20.18] s

  * igt@gem_madvise@dontneed-after-mmap:
    - Statuses : 7 pass(s)
    - Exec time: [0.00, 0.01] s

  * igt@gem_madvise@dontneed-before-exec:
    - Statuses : 7 pass(s)
    - Exec time: [0.00, 0.01] s

  * igt@gem_madvise@dontneed-before-mmap:
    - Statuses : 7 pass(s)
    - Exec time: [0.00, 0.01] s

  * igt@gem_madvise@dontneed-before-pwrite:
    - Statuses : 7 pass(s)
    - Exec time: [0.00] s

  * igt@gem_media_fill:
    - Statuses : 6 pass(s) 1 skip(s)
    - Exec time: [0.08, 0.21] s

  * igt@gem_mmap@bad-object:
    - Statuses : 7 pass(s)
    - Exec time: [0.0, 0.00] s

  * igt@gem_mmap@basic:
    - Statuses : 7 pass(s)
    - Exec time: [0.0, 0.00] s

  * igt@gem_mmap@basic-small-bo:
    - Statuses : 7 pass(s)
    - Exec time: [0.60, 3.10] s

  * igt@gem_mmap@big-bo:
    - Statuses : 7 pass(s)
    - Exec time: [0.64, 2.49] s

  * igt@gem_mmap@short-mmap:
    - Statuses : 7 pass(s)
    - Exec time: [0.0, 0.00] s

  * igt@gem_mmap_gtt@basic:
    - Statuses : 7 pass(s)
    - Exec time: [0.0, 0.00] s

  * igt@gem_mmap_gtt@basic-copy:
    - Statuses : 7 pass(s)
    - Exec time: [0.17, 0.90] s

  * igt@gem_mmap_gtt@basic-read:
    - Statuses : 7 pass(s)
    - Exec time: [0.03, 0.13] s

  * igt@gem_mmap_gtt@basic-read-write:
    - Statuses : 7 pass(s)
    - Exec time: [0.01, 0.05] s

  * igt@gem_mmap_gtt@basic-read-write-distinct:
    - Statuses : 7 pass(s)
    - Exec time: [0.01, 0.05] s

  * igt@gem_mmap_gtt@basic-short:
    - Statuses : 7 pass(s)
    - Exec time: [0.03, 0.07] s

  * igt@gem_mmap_gtt@basic-small-bo:
    - Statuses :
    - Exec time: [None] s

  * igt@gem_mmap_gtt@basic-small-bo-tiledx:
    - Statuses : 7 pass(s)
    - Exec time: [0.23, 0.89] s

  * igt@gem_mmap_gtt@basic-small-bo-tiledy:
    - Statuses : 7 pass(s)
    - Exec time: [0.23, 0.84] s

  * igt@gem_mmap_gtt@basic-small-copy:
    - Statuses : 7 pass(s)
    - Exec time: [0.45, 2.91] s

  * igt@gem_mmap_gtt@basic-small-copy-odd:
    - Statuses : 7 pass(s)
    - Exec time: [0.68, 4.08] s

  * igt@gem_mmap_gtt@basic-small-copy-xy:
    - Statuses : 7 pass(s)
    - Exec time: [0.78, 4.55] s

  * igt@gem_mmap_gtt@basic-wc:
    - Statuses : 6 pass(s)
    - Exec time: [0.64, 0.65] s

  * igt@gem_mmap_gtt@basic-write:
    - Statuses : 7 pass(s)
    - Exec time: [0.11, 0.47] s

  * igt@gem_mmap_gtt@basic-write-cpu-read-gtt:
    - Statuses : 5 pass(s) 2 skip(s)
    - Exec time: [0.0, 0.31] s

  * igt@gem_mmap_gtt@basic-write-gtt:
    - Statuses : 7 pass(s)
    - Exec time: [0.11, 0.81] s

  * igt@gem_mmap_gtt@basic-write-read:
    - Statuses : 7 pass(s)
    - Exec time: [0.01, 0.05] s

  * igt@gem_mmap_gtt@basic-write-read-distinct:
    - Statuses : 7 pass(s)
    - Exec time: [0.01, 0.07] s

  * igt@gem_mmap_gtt@big-bo:
    - Statuses : 7 pass(s)
    - Exec time: [0.26, 1.03] s

  * igt@gem_mmap_gtt@big-bo-tiledx:
    - Statuses : 7 pass(s)
    - Exec time: [0.48, 1.91] s

  * igt@gem_mmap_gtt@big-bo-tiledy:
    - Statuses : 7 pass(s)
    - Exec time: [0.31, 1.20] s

  * igt@gem_mmap_gtt@big-copy:
    - Statuses : 7 pass(s)
    - Exec time: [1.34, 11.33] s

  * igt@gem_mmap_gtt@big-copy-odd:
    - Statuses : 7 pass(s)
    - Exec time: [1.68, 12.22] s

  * igt@gem_mmap_gtt@big-copy-xy:
    - Statuses : 7 pass(s)
    - Exec time: [1.77, 16.57] s

  * igt@gem_mmap_gtt@coherency:
    - Statuses : 3 pass(s) 4 skip(s)
    - Exec time: [0.0, 0.18] s

  * igt@gem_mmap_gtt@fault-concurrent:
    - Statuses : 7 pass(s)
    - Exec time: [2.62, 3.88] s

  * igt@gem_mmap_gtt@hang:
    - Statuses : 7 pass(s)
    - Exec time: [5.42, 5.50] s

  * igt@gem_mmap_gtt@medium-copy:
    - Statuses : 7 pass(s)
    - Exec time: [1.17, 6.77] s

  * igt@gem_mmap_gtt@medium-copy-odd:
    - Statuses : 7 pass(s)
    - Exec time: [0.93, 6.19] s

  * igt@gem_mmap_gtt@medium-copy-xy:
    - Statuses : 7 pass(s)
    - Exec time: [0.83, 8.26] s

  * igt@gem_mmap_gtt@zero-extend:
    - Statuses : 7 pass(s)
    - Exec time: [0.0] s

  * igt@gem_mmap_offset@bad-extensions:
    - Statuses : 7 pass(s)
    - Exec time: [0.0] s

  * igt@gem_mmap_offset@bad-flags:
    - Statuses :
    - Exec time: [None] s

  * igt@gem_mmap_offset@bad-object:
    - Statuses : 7 pass(s)
    - Exec time: [0.0] s

  * igt@gem_mmap_offset@basic-uaf:
    - Statuses : 7 pass(s)
    - Exec time: [0.00] s

  * igt@gem_mmap_offset@clear:
    - Statuses : 7 pass(s)
    - Exec time: [23.42, 40.21] s

  * igt@gem_mmap_offset@close-race:
    - Statuses : 6 pass(s)
    - Exec time: [20.07, 20.11] s

  * igt@gem_mmap_offset@isolation:
    - Statuses : 7 pass(s)
    - Exec time: [0.00, 0.01] s

  * igt@gem_mmap_offset@open-flood:
    - Statuses : 7 pass(s)
    - Exec time: [21.

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19487/index.html

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

* Re: [Intel-gfx] [PATCH 04/41] drm/i915: Teach the i915_dependency to use a double-lock
  2021-01-25 21:37     ` Chris Wilson
@ 2021-01-26  9:40       ` Tvrtko Ursulin
  0 siblings, 0 replies; 90+ messages in thread
From: Tvrtko Ursulin @ 2021-01-26  9:40 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx; +Cc: thomas.hellstrom


On 25/01/2021 21:37, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2021-01-25 15:34:53)
>>
>> On 25/01/2021 14:00, Chris Wilson wrote:
>>> @@ -390,24 +410,27 @@ bool __i915_sched_node_add_dependency(struct i915_sched_node *node,
>>>    {
>>>        bool ret = false;
>>>    
>>> -     spin_lock_irq(&schedule_lock);
>>> +     /* The signal->lock is always the outer lock in this double-lock. */
>>> +     spin_lock(&signal->lock);
>>>    
>>>        if (!node_signaled(signal)) {
>>>                INIT_LIST_HEAD(&dep->dfs_link);
>>>                dep->signaler = signal;
>>> -             dep->waiter = node;
>>> +             dep->waiter = node_get(node);
>>>                dep->flags = flags;
>>>    
>>>                /* All set, now publish. Beware the lockless walkers. */
>>> +             spin_lock_nested(&node->lock, SINGLE_DEPTH_NESTING);
>>>                list_add_rcu(&dep->signal_link, &node->signalers_list);
>>>                list_add_rcu(&dep->wait_link, &signal->waiters_list);
>>> +             spin_unlock(&node->lock);
>>>    
>>>                /* Propagate the chains */
>>>                node->flags |= signal->flags;
>>>                ret = true;
>>>        }
>>>    
>>> -     spin_unlock_irq(&schedule_lock);
>>> +     spin_unlock(&signal->lock);
>>
>> So we have to be sure another entry point cannot try to lock the same
>> nodes in reverse, that is, with reversed roles. A situation where nodes are
>> simultaneously each other's waiters and signalers does indeed sound
>> impossible, so I think this is fine.
>>
>> Only if some entry point would lock something which is a waiter, and
>> then went to boost the priority of a signaler. That is still one with a
>> global lock. So the benefit of this patch is just to reduce contention
>> between adding and re-scheduling?
> 
> We remove the global schedule_lock in the next patch. This patch tackles
> the "simpler" list management by noting that the chains can always be
> taken in order of (signaler, waiter) so we have strict nesting for a
> local double lock.
> 
>> And __i915_schedule does walk the list of signalers without holding this
>> new lock. What is the safety net there? RCU? Do we need
>> list_for_each_entry_rcu and explicit rcu_read_(un)lock in there then?
> 
> Yes, we are already supposedly RCU safe for the list of signalers, as
> we've been depending on that for a while.
> 
> #define for_each_signaler(p__, rq__) \
>          list_for_each_entry_rcu(p__, \
>                                  &(rq__)->sched.signalers_list, \
>                                  signal_link)

Yeah, it's fine. I hadn't spotted that it's for_each_signaler, and for some
reason confused it with list_for_each_entry elsewhere in the function.
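
For reference, the (signaler, waiter) lock ordering discussed above can be
sketched in userspace. This is only an illustration under stated assumptions:
struct node, node_init() and add_dependency() are hypothetical stand-ins, the
pthread mutexes replace the kernel spinlocks, and plain counters replace the
RCU lists. It is not the i915 code.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for a scheduler node; not the kernel type. */
struct node {
	pthread_mutex_t lock;
	bool signaled;		/* true once the node has completed */
	int nr_signalers;	/* number of nodes this node waits on */
	int nr_waiters;		/* number of nodes waiting on this node */
};

static void node_init(struct node *n)
{
	pthread_mutex_init(&n->lock, NULL);
	n->signaled = false;
	n->nr_signalers = 0;
	n->nr_waiters = 0;
}

/*
 * Record that @waiter depends on @signal.
 *
 * The signaler's lock is always the outer lock and the waiter's lock the
 * inner one. As long as the dependency graph has no cycles, every caller
 * nests the two locks in the same direction, so no ABBA deadlock can occur.
 */
static bool add_dependency(struct node *waiter, struct node *signal)
{
	bool ret = false;

	assert(waiter != signal);

	pthread_mutex_lock(&signal->lock);
	if (!signal->signaled) {
		pthread_mutex_lock(&waiter->lock);
		waiter->nr_signalers++;
		signal->nr_waiters++;
		pthread_mutex_unlock(&waiter->lock);
		ret = true;
	}
	pthread_mutex_unlock(&signal->lock);

	return ret;
}
```

Calling add_dependency(&b, &a) and then add_dependency(&c, &b) takes the locks
as (a, b) and then (b, c), never in the reverse direction. Once a node is
marked signaled, new dependencies on it are refused and nothing is recorded.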

Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Regards,

Tvrtko


* Re: [Intel-gfx] [PATCH 05/41] drm/i915: Restructure priority inheritance
  2021-01-25 14:01 ` [Intel-gfx] [PATCH 05/41] drm/i915: Restructure priority inheritance Chris Wilson
@ 2021-01-26 11:12   ` Tvrtko Ursulin
  2021-01-26 11:30     ` Chris Wilson
  0 siblings, 1 reply; 90+ messages in thread
From: Tvrtko Ursulin @ 2021-01-26 11:12 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx; +Cc: thomas.hellstrom



On 25/01/2021 14:01, Chris Wilson wrote:
> In anticipation of wanting to be able to call pi from underneath an
> engine's active.lock, rework the priority inheritance to primarily work
> along an engine's priority queue, delegating any other engine that the
> chain may traverse to a worker. This reduces the global spinlock from
> governing the multi-entire priority inheritance depth-first search, to a
> smaller lock on each engine around a single list on that engine.
> 
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> ---
>   drivers/gpu/drm/i915/gt/intel_engine_cs.c    |   2 +
>   drivers/gpu/drm/i915/gt/intel_engine_types.h |   3 +
>   drivers/gpu/drm/i915/i915_scheduler.c        | 346 ++++++++++++-------
>   drivers/gpu/drm/i915/i915_scheduler.h        |   2 +
>   drivers/gpu/drm/i915/i915_scheduler_types.h  |  19 +-
>   5 files changed, 234 insertions(+), 138 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> index 7e580d3ac58f..3bfd3853c0e9 100644
> --- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> +++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> @@ -576,6 +576,8 @@ void intel_engine_init_execlists(struct intel_engine_cs *engine)
>   
>   	execlists->queue_priority_hint = INT_MIN;
>   	execlists->queue = RB_ROOT_CACHED;
> +
> +	i915_sched_init_ipi(&execlists->ipi);
>   }
>   
>   static void cleanup_status_page(struct intel_engine_cs *engine)
> diff --git a/drivers/gpu/drm/i915/gt/intel_engine_types.h b/drivers/gpu/drm/i915/gt/intel_engine_types.h
> index 27cb3dc0233b..9105b7769635 100644
> --- a/drivers/gpu/drm/i915/gt/intel_engine_types.h
> +++ b/drivers/gpu/drm/i915/gt/intel_engine_types.h
> @@ -20,6 +20,7 @@
>   #include "i915_gem.h"
>   #include "i915_pmu.h"
>   #include "i915_priolist_types.h"
> +#include "i915_scheduler_types.h"
>   #include "i915_selftest.h"
>   #include "intel_breadcrumbs_types.h"
>   #include "intel_sseu.h"
> @@ -257,6 +258,8 @@ struct intel_engine_execlists {
>   	struct rb_root_cached queue;
>   	struct rb_root_cached virtual;
>   
> +	struct i915_sched_ipi ipi;
> +
>   	/**
>   	 * @csb_write: control register for Context Switch buffer
>   	 *
> diff --git a/drivers/gpu/drm/i915/i915_scheduler.c b/drivers/gpu/drm/i915/i915_scheduler.c
> index 96fe1e22dad7..0ecf71a6afd4 100644
> --- a/drivers/gpu/drm/i915/i915_scheduler.c
> +++ b/drivers/gpu/drm/i915/i915_scheduler.c
> @@ -17,8 +17,6 @@ static struct i915_global_scheduler {
>   	struct kmem_cache *slab_priorities;
>   } global;
>   
> -static DEFINE_SPINLOCK(schedule_lock);
> -
>   static struct i915_sched_node *node_get(struct i915_sched_node *node)
>   {
>   	i915_request_get(container_of(node, struct i915_request, sched));
> @@ -30,17 +28,116 @@ static void node_put(struct i915_sched_node *node)
>   	i915_request_put(container_of(node, struct i915_request, sched));
>   }
>   
> +static inline int rq_prio(const struct i915_request *rq)
> +{
> +	return READ_ONCE(rq->sched.attr.priority);
> +}
> +
> +static int ipi_get_prio(struct i915_request *rq)
> +{
> +	if (READ_ONCE(rq->sched.ipi_priority) == I915_PRIORITY_INVALID)
> +		return I915_PRIORITY_INVALID;
> +
> +	return xchg(&rq->sched.ipi_priority, I915_PRIORITY_INVALID);
> +}
> +
> +static void ipi_schedule(struct work_struct *wrk)
> +{
> +	struct i915_sched_ipi *ipi = container_of(wrk, typeof(*ipi), work);
> +	struct i915_request *rq = xchg(&ipi->list, NULL);
> +
> +	do {
> +		struct i915_request *rn = xchg(&rq->sched.ipi_link, NULL);
> +		int prio;
> +
> +		prio = ipi_get_prio(rq);
> +
> +		/*
> +		 * For cross-engine scheduling to work we rely on one of two
> +		 * things:
> +		 *
> +		 * a) The requests are using dma-fence fences and so will not
> +		 * be scheduled until the previous engine is completed, and
> +		 * so we cannot cross back onto the original engine and end up
> +		 * queuing an earlier request after the first (due to the
> +		 * interrupted DFS).
> +		 *
> +		 * b) The requests are using semaphores and so may be already
> +		 * be in flight, in which case if we cross back onto the same
> +		 * engine, we will already have put the interrupted DFS into
> +		 * the priolist, and the continuation will now be queued
> +		 * afterwards [out-of-order]. However, since we are using
> +		 * semaphores in this case, we also perform yield on semaphore
> +		 * waits and so will reorder the requests back into the correct
> +		 * sequence. This occurrence (of promoting a request chain
> +		 * that crosses the engines using semaphores back unto itself)
> +		 * should be unlikely enough that it probably does not matter...
> +		 */
> +		local_bh_disable();
> +		i915_request_set_priority(rq, prio);
> +		local_bh_enable();

Is it that important, and wouldn't the priority order eventually be restored
by timeslicing?
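
As an aside on the ipi_get_prio() helper quoted above: the pending priority is
handed over with an atomic exchange against a sentinel, so each posted value
is consumed once. A minimal userspace sketch of that handover, assuming C11
atomics in place of the kernel's xchg()/try_cmpxchg(), and with struct req,
PRIO_INVALID, req_init(), post_prio() and take_prio() as hypothetical names:

```c
#include <assert.h>
#include <limits.h>
#include <stdatomic.h>

/* Hypothetical sentinel meaning "no priority change pending". */
#define PRIO_INVALID INT_MIN

/* Hypothetical stand-in for the per-request pending priority slot. */
struct req {
	_Atomic int pending_prio;
};

static void req_init(struct req *rq)
{
	atomic_init(&rq->pending_prio, PRIO_INVALID);
}

/* Producer: only ever raise the pending priority, never lower it. */
static void post_prio(struct req *rq, int prio)
{
	int old = atomic_load(&rq->pending_prio);

	do {
		if (prio <= old)
			return;	/* an equal or higher value is already pending */
	} while (!atomic_compare_exchange_weak(&rq->pending_prio, &old, prio));
}

/* Consumer: take the pending priority and reset the slot to the sentinel. */
static int take_prio(struct req *rq)
{
	/* Cheap read first, to skip the exchange when nothing is pending. */
	if (atomic_load(&rq->pending_prio) == PRIO_INVALID)
		return PRIO_INVALID;

	return atomic_exchange(&rq->pending_prio, PRIO_INVALID);
}
```

If several callers post before the consumer runs, only the highest value
survives, and a second take_prio() sees the sentinel again.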

> +
> +		i915_request_put(rq);
> +		rq = ptr_mask_bits(rn, 1);
> +	} while (rq);
> +}
> +
> +void i915_sched_init_ipi(struct i915_sched_ipi *ipi)
> +{
> +	INIT_WORK(&ipi->work, ipi_schedule);
> +	ipi->list = NULL;
> +}
> +
> +static void __ipi_add(struct i915_request *rq)
> +{
> +#define STUB ((struct i915_request *)1)
> +	struct intel_engine_cs *engine = READ_ONCE(rq->engine);
> +	struct i915_request *first;
> +
> +	if (!i915_request_get_rcu(rq))
> +		return;
> +
> +	if (__i915_request_is_complete(rq) ||
> +	    cmpxchg(&rq->sched.ipi_link, NULL, STUB)) { /* already queued */
> +		i915_request_put(rq);
> +		return;
> +	}
> +
> +	first = READ_ONCE(engine->execlists.ipi.list);
> +	do
> +		rq->sched.ipi_link = ptr_pack_bits(first, 1, 1);
> +	while (!try_cmpxchg(&engine->execlists.ipi.list, &first, rq));
> +
> +	if (!first)
> +		queue_work(system_unbound_wq, &engine->execlists.ipi.work);
> +}
> +
> +/*
> + * Virtual engines complicate acquiring the engine timeline lock,
> + * as their rq->engine pointer is not stable until under that
> + * engine lock. The simple ploy we use is to take the lock then
> + * check that the rq still belongs to the newly locked engine.
> + */
> +#define lock_engine_irqsave(rq, flags) ({ \
> +	struct i915_request * const rq__ = (rq); \
> +	struct intel_engine_cs *engine__ = READ_ONCE(rq__->engine); \
> +\
> +	spin_lock_irqsave(&engine__->active.lock, (flags)); \
> +	while (engine__ != READ_ONCE((rq__)->engine)) { \
> +		spin_unlock(&engine__->active.lock); \
> +		engine__ = READ_ONCE(rq__->engine); \
> +		spin_lock(&engine__->active.lock); \
> +	} \
> +\
> +	engine__; \
> +})
> +
>   static const struct i915_request *
>   node_to_request(const struct i915_sched_node *node)
>   {
>   	return container_of(node, const struct i915_request, sched);
>   }
>   
> -static inline bool node_started(const struct i915_sched_node *node)
> -{
> -	return i915_request_started(node_to_request(node));
> -}
> -
>   static inline bool node_signaled(const struct i915_sched_node *node)
>   {
>   	return i915_request_completed(node_to_request(node));
> @@ -137,42 +234,6 @@ void __i915_priolist_free(struct i915_priolist *p)
>   	kmem_cache_free(global.slab_priorities, p);
>   }
>   
> -struct sched_cache {
> -	struct list_head *priolist;
> -};
> -
> -static struct intel_engine_cs *
> -sched_lock_engine(const struct i915_sched_node *node,
> -		  struct intel_engine_cs *locked,
> -		  struct sched_cache *cache)
> -{
> -	const struct i915_request *rq = node_to_request(node);
> -	struct intel_engine_cs *engine;
> -
> -	GEM_BUG_ON(!locked);
> -
> -	/*
> -	 * Virtual engines complicate acquiring the engine timeline lock,
> -	 * as their rq->engine pointer is not stable until under that
> -	 * engine lock. The simple ploy we use is to take the lock then
> -	 * check that the rq still belongs to the newly locked engine.
> -	 */
> -	while (locked != (engine = READ_ONCE(rq->engine))) {
> -		spin_unlock(&locked->active.lock);
> -		memset(cache, 0, sizeof(*cache));
> -		spin_lock(&engine->active.lock);
> -		locked = engine;
> -	}
> -
> -	GEM_BUG_ON(locked != engine);
> -	return locked;
> -}
> -
> -static inline int rq_prio(const struct i915_request *rq)
> -{
> -	return rq->sched.attr.priority;
> -}
> -
>   static inline bool need_preempt(int prio, int active)
>   {
>   	/*
> @@ -198,19 +259,17 @@ static void kick_submission(struct intel_engine_cs *engine,
>   	if (prio <= engine->execlists.queue_priority_hint)
>   		return;
>   
> -	rcu_read_lock();
> -
>   	/* Nothing currently active? We're overdue for a submission! */
>   	inflight = execlists_active(&engine->execlists);
>   	if (!inflight)
> -		goto unlock;
> +		return;
>   
>   	/*
>   	 * If we are already the currently executing context, don't
>   	 * bother evaluating if we should preempt ourselves.
>   	 */
>   	if (inflight->context == rq->context)
> -		goto unlock;
> +		return;
>   
>   	ENGINE_TRACE(engine,
>   		     "bumping queue-priority-hint:%d for rq:%llx:%lld, inflight:%llx:%lld prio %d\n",
> @@ -222,30 +281,28 @@ static void kick_submission(struct intel_engine_cs *engine,
>   	engine->execlists.queue_priority_hint = prio;
>   	if (need_preempt(prio, rq_prio(inflight)))
>   		tasklet_hi_schedule(&engine->execlists.tasklet);
> -
> -unlock:
> -	rcu_read_unlock();
>   }
>   
> -static void __i915_schedule(struct i915_sched_node *node, int prio)
> +static void ipi_priority(struct i915_request *rq, int prio)
>   {
> -	struct intel_engine_cs *engine;
> -	struct i915_dependency *dep, *p;
> -	struct i915_dependency stack;
> -	struct sched_cache cache;
> +	int old = READ_ONCE(rq->sched.ipi_priority);
> +
> +	do {
> +		if (prio <= old)
> +			return;
> +	} while (!try_cmpxchg(&rq->sched.ipi_priority, &old, prio));
> +
> +	__ipi_add(rq);
> +}
> +
> +static void __i915_request_set_priority(struct i915_request *rq, int prio)
> +{
> +	struct intel_engine_cs *engine = rq->engine;
> +	struct i915_request *rn;
> +	struct list_head *plist;
>   	LIST_HEAD(dfs);
>   
> -	/* Needed in order to use the temporary link inside i915_dependency */
> -	lockdep_assert_held(&schedule_lock);
> -	GEM_BUG_ON(prio == I915_PRIORITY_INVALID);
> -
> -	if (node_signaled(node))
> -		return;
> -
> -	prio = max(prio, node->attr.priority);
> -
> -	stack.signaler = node;
> -	list_add(&stack.dfs_link, &dfs);
> +	list_add(&rq->sched.dfs, &dfs);
>   
>   	/*
>   	 * Recursively bump all dependent priorities to match the new request.
> @@ -265,66 +322,41 @@ static void __i915_schedule(struct i915_sched_node *node, int prio)
>   	 * end result is a topological list of requests in reverse order, the
>   	 * last element in the list is the request we must execute first.
>   	 */
> -	list_for_each_entry(dep, &dfs, dfs_link) {
> -		struct i915_sched_node *node = dep->signaler;
> +	list_for_each_entry(rq, &dfs, sched.dfs) {
> +		struct i915_dependency *p;
>   
> -		/* If we are already flying, we know we have no signalers */
> -		if (node_started(node))
> -			continue;
> +		/* Also release any children on this engine that are ready */
> +		GEM_BUG_ON(rq->engine != engine);
>   
> -		/*
> -		 * Within an engine, there can be no cycle, but we may
> -		 * refer to the same dependency chain multiple times
> -		 * (redundant dependencies are not eliminated) and across
> -		 * engines.
> -		 */
> -		list_for_each_entry(p, &node->signalers_list, signal_link) {
> -			GEM_BUG_ON(p == dep); /* no cycles! */
> +		for_each_signaler(p, rq) {
> +			struct i915_request *s =
> +				container_of(p->signaler, typeof(*s), sched);
>   
> -			if (node_signaled(p->signaler))
> +			GEM_BUG_ON(s == rq);
> +
> +			if (rq_prio(s) >= prio)
>   				continue;
>   
> -			if (prio > READ_ONCE(p->signaler->attr.priority))
> -				list_move_tail(&p->dfs_link, &dfs);
> +			if (__i915_request_is_complete(s))
> +				continue;
> +
> +			if (s->engine != rq->engine) {
> +				ipi_priority(s, prio);
> +				continue;
> +			}
> +
> +			list_move_tail(&s->sched.dfs, &dfs);
>   		}
>   	}
>   
> -	/*
> -	 * If we didn't need to bump any existing priorities, and we haven't
> -	 * yet submitted this request (i.e. there is no potential race with
> -	 * execlists_submit_request()), we can set our own priority and skip
> -	 * acquiring the engine locks.
> -	 */
> -	if (node->attr.priority == I915_PRIORITY_INVALID) {
> -		GEM_BUG_ON(!list_empty(&node->link));
> -		node->attr.priority = prio;
> +	plist = i915_sched_lookup_priolist(engine, prio);
>   
> -		if (stack.dfs_link.next == stack.dfs_link.prev)
> -			return;
> +	/* Fifo and depth-first replacement ensure our deps execute first */
> +	list_for_each_entry_safe_reverse(rq, rn, &dfs, sched.dfs) {
> +		GEM_BUG_ON(rq->engine != engine);
>   
> -		__list_del_entry(&stack.dfs_link);
> -	}
> -
> -	memset(&cache, 0, sizeof(cache));
> -	engine = node_to_request(node)->engine;
> -	spin_lock(&engine->active.lock);
> -
> -	/* Fifo and depth-first replacement ensure our deps execute before us */
> -	engine = sched_lock_engine(node, engine, &cache);
> -	list_for_each_entry_safe_reverse(dep, p, &dfs, dfs_link) {
> -		INIT_LIST_HEAD(&dep->dfs_link);
> -
> -		node = dep->signaler;
> -		engine = sched_lock_engine(node, engine, &cache);
> -		lockdep_assert_held(&engine->active.lock);
> -
> -		/* Recheck after acquiring the engine->timeline.lock */
> -		if (prio <= node->attr.priority || node_signaled(node))
> -			continue;
> -
> -		GEM_BUG_ON(node_to_request(node)->engine != engine);
> -
> -		WRITE_ONCE(node->attr.priority, prio);
> +		INIT_LIST_HEAD(&rq->sched.dfs);
> +		WRITE_ONCE(rq->sched.attr.priority, prio);
>   
>   		/*
>   		 * Once the request is ready, it will be placed into the
> @@ -334,32 +366,75 @@ static void __i915_schedule(struct i915_sched_node *node, int prio)
>   		 * any preemption required, be dealt with upon submission.
>   		 * See engine->submit_request()
>   		 */
> -		if (list_empty(&node->link))
> +		if (!i915_request_is_ready(rq))
>   			continue;
>   
> -		if (i915_request_in_priority_queue(node_to_request(node))) {
> -			if (!cache.priolist)
> -				cache.priolist =
> -					i915_sched_lookup_priolist(engine,
> -								   prio);
> -			list_move_tail(&node->link, cache.priolist);
> -		}
> +		if (i915_request_in_priority_queue(rq))
> +			list_move_tail(&rq->sched.link, plist);
>   
> -		/* Defer (tasklet) submission until after all of our updates. */
> -		kick_submission(engine, node_to_request(node), prio);
> +		/* Defer (tasklet) submission until after all updates. */
> +		kick_submission(engine, rq, prio);
>   	}
> -
> -	spin_unlock(&engine->active.lock);
>   }
>   
>   void i915_request_set_priority(struct i915_request *rq, int prio)
>   {
> -	if (!intel_engine_has_scheduler(rq->engine))
> +	struct intel_engine_cs *engine;
> +	unsigned long flags;
> +
> +	if (prio <= rq_prio(rq))
>   		return;
>   
> -	spin_lock_irq(&schedule_lock);
> -	__i915_schedule(&rq->sched, prio);
> -	spin_unlock_irq(&schedule_lock);
> +	/*
> +	 * If we are setting the priority before being submitted, see if we
> +	 * can quickly adjust our own priority in-situ and avoid taking
> +	 * the contended engine->active.lock. If we need priority inheritance,
> +	 * take the slow route.
> +	 */
> +	if (rq_prio(rq) == I915_PRIORITY_INVALID) {
> +		struct i915_dependency *p;
> +
> +		rcu_read_lock();
> +		for_each_signaler(p, rq) {
> +			struct i915_request *s =
> +				container_of(p->signaler, typeof(*s), sched);
> +
> +			if (rq_prio(s) >= prio)
> +				continue;
> +
> +			if (__i915_request_is_complete(s))
> +				continue;
> +
> +			break;
> +		}
> +		rcu_read_unlock();

We exit this loop at the first lower-priority incomplete signaler. What
does the block below then do? Feels like it needs a comment.

> +
> +		if (&p->signal_link == &rq->sched.signalers_list &&
> +		    cmpxchg(&rq->sched.attr.priority,
> +			    I915_PRIORITY_INVALID,
> +			    prio) == I915_PRIORITY_INVALID)
> +			return;
> +	}
> +
> +	engine = lock_engine_irqsave(rq, flags);
> +	if (prio <= rq_prio(rq))
> +		goto unlock;
> +
> +	if (__i915_request_is_complete(rq))
> +		goto unlock;
> +
> +	if (!intel_engine_has_scheduler(engine)) {
> +		rq->sched.attr.priority = prio;
> +		goto unlock;
> +	}
> +
> +	rcu_read_lock();
> +	__i915_request_set_priority(rq, prio);
> +	rcu_read_unlock();
> +	GEM_BUG_ON(rq_prio(rq) != prio);
> +
> +unlock:
> +	spin_unlock_irqrestore(&engine->active.lock, flags);
>   }
>   
>   void i915_sched_node_init(struct i915_sched_node *node)
> @@ -369,6 +444,9 @@ void i915_sched_node_init(struct i915_sched_node *node)
>   	INIT_LIST_HEAD(&node->signalers_list);
>   	INIT_LIST_HEAD(&node->waiters_list);
>   	INIT_LIST_HEAD(&node->link);
> +	INIT_LIST_HEAD(&node->dfs);
> +
> +	node->ipi_link = NULL;
>   
>   	i915_sched_node_reinit(node);
>   }
> @@ -379,6 +457,9 @@ void i915_sched_node_reinit(struct i915_sched_node *node)
>   	node->semaphores = 0;
>   	node->flags = 0;
>   
> +	GEM_BUG_ON(node->ipi_link);
> +	node->ipi_priority = I915_PRIORITY_INVALID;
> +
>   	GEM_BUG_ON(!list_empty(&node->signalers_list));
>   	GEM_BUG_ON(!list_empty(&node->waiters_list));
>   	GEM_BUG_ON(!list_empty(&node->link));
> @@ -414,7 +495,6 @@ bool __i915_sched_node_add_dependency(struct i915_sched_node *node,
>   	spin_lock(&signal->lock);
>   
>   	if (!node_signaled(signal)) {
> -		INIT_LIST_HEAD(&dep->dfs_link);
>   		dep->signaler = signal;
>   		dep->waiter = node_get(node);
>   		dep->flags = flags;
> diff --git a/drivers/gpu/drm/i915/i915_scheduler.h b/drivers/gpu/drm/i915/i915_scheduler.h
> index a045be784c67..5be7f90e7896 100644
> --- a/drivers/gpu/drm/i915/i915_scheduler.h
> +++ b/drivers/gpu/drm/i915/i915_scheduler.h
> @@ -35,6 +35,8 @@ int i915_sched_node_add_dependency(struct i915_sched_node *node,
>   
>   void i915_sched_node_retire(struct i915_sched_node *node);
>   
> +void i915_sched_init_ipi(struct i915_sched_ipi *ipi);
> +
>   void i915_request_set_priority(struct i915_request *request, int prio);
>   
>   struct list_head *
> diff --git a/drivers/gpu/drm/i915/i915_scheduler_types.h b/drivers/gpu/drm/i915/i915_scheduler_types.h
> index 623bf41fcf35..5a84d59134ee 100644
> --- a/drivers/gpu/drm/i915/i915_scheduler_types.h
> +++ b/drivers/gpu/drm/i915/i915_scheduler_types.h
> @@ -8,8 +8,8 @@
>   #define _I915_SCHEDULER_TYPES_H_
>   
>   #include <linux/list.h>
> +#include <linux/workqueue.h>
>   
> -#include "gt/intel_engine_types.h"
>   #include "i915_priolist_types.h"
>   
>   struct drm_i915_private;
> @@ -61,13 +61,23 @@ struct i915_sched_attr {
>    */
>   struct i915_sched_node {
>   	spinlock_t lock; /* protect the lists */
> +
>   	struct list_head signalers_list; /* those before us, we depend upon */
>   	struct list_head waiters_list; /* those after us, they depend upon us */
> -	struct list_head link;
> +	struct list_head link; /* guarded by engine->active.lock */
> +	struct list_head dfs; /* guarded by engine->active.lock */
>   	struct i915_sched_attr attr;
> -	unsigned int flags;
> +	unsigned long flags;
>   #define I915_SCHED_HAS_EXTERNAL_CHAIN	BIT(0)
> -	intel_engine_mask_t semaphores;
> +	unsigned long semaphores;
> +
> +	struct i915_request *ipi_link;
> +	int ipi_priority;
> +};
> +
> +struct i915_sched_ipi {
> +	struct i915_request *list;
> +	struct work_struct work;
>   };
>   
>   struct i915_dependency {
> @@ -75,7 +85,6 @@ struct i915_dependency {
>   	struct i915_sched_node *waiter;
>   	struct list_head signal_link;
>   	struct list_head wait_link;
> -	struct list_head dfs_link;
>   	struct rcu_head rcu;
>   	unsigned long flags;
>   #define I915_DEPENDENCY_ALLOC		BIT(0)
> 

Regards,

Tvrtko
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: [Intel-gfx] [PATCH 05/41] drm/i915: Restructure priority inheritance
  2021-01-26 11:12   ` Tvrtko Ursulin
@ 2021-01-26 11:30     ` Chris Wilson
  2021-01-26 11:40       ` Tvrtko Ursulin
  0 siblings, 1 reply; 90+ messages in thread
From: Chris Wilson @ 2021-01-26 11:30 UTC (permalink / raw)
  To: Tvrtko Ursulin, intel-gfx; +Cc: thomas.hellstrom

Quoting Tvrtko Ursulin (2021-01-26 11:12:53)
> 
> 
> On 25/01/2021 14:01, Chris Wilson wrote:
> > +static void ipi_schedule(struct work_struct *wrk)
> > +{
> > +     struct i915_sched_ipi *ipi = container_of(wrk, typeof(*ipi), work);
> > +     struct i915_request *rq = xchg(&ipi->list, NULL);
> > +
> > +     do {
> > +             struct i915_request *rn = xchg(&rq->sched.ipi_link, NULL);
> > +             int prio;
> > +
> > +             prio = ipi_get_prio(rq);
> > +
> > +             /*
> > +              * For cross-engine scheduling to work we rely on one of two
> > +              * things:
> > +              *
> > +              * a) The requests are using dma-fence fences and so will not
> > +              * be scheduled until the previous engine is completed, and
> > +              * so we cannot cross back onto the original engine and end up
> > +              * queuing an earlier request after the first (due to the
> > +              * interrupted DFS).
> > +              *
> > +              * b) The requests are using semaphores and so may be already
> > +              * be in flight, in which case if we cross back onto the same
> > +              * engine, we will already have put the interrupted DFS into
> > +              * the priolist, and the continuation will now be queued
> > +              * afterwards [out-of-order]. However, since we are using
> > +              * semaphores in this case, we also perform yield on semaphore
> > +              * waits and so will reorder the requests back into the correct
> > +              * sequence. This occurrence (of promoting a request chain
> > +              * that crosses the engines using semaphores back unto itself)
> > +              * should be unlikely enough that it probably does not matter...
> > +              */
> > +             local_bh_disable();
> > +             i915_request_set_priority(rq, prio);
> > +             local_bh_enable();
> 
> Is it that important and wouldn't the priority order restore eventually 
> due timeslicing?

There would be a window in which we executed userspace code
out-of-order. That's enough to scare me! However, for our PI dependency
chains it should not matter as the only time we do submit out-of-order,
we are stuck on _our_ semaphore that cannot be resolved until the
requests are back in-order.

I've tried to trick this into causing problems with the
i915_selftest/igt_schedule_cycle and gem_exec_schedule/noreorder.
Fortunately for my sanity, neither test has caught any problems.

This is the handwaving part of removing the global lock.

> > +     /*
> > +      * If we are setting the priority before being submitted, see if we
> > +      * can quickly adjust our own priority in-situ and avoid taking
> > +      * the contended engine->active.lock. If we need priority inheritance,
> > +      * take the slow route.
> > +      */
> > +     if (rq_prio(rq) == I915_PRIORITY_INVALID) {
> > +             struct i915_dependency *p;
> > +
> > +             rcu_read_lock();
> > +             for_each_signaler(p, rq) {
> > +                     struct i915_request *s =
> > +                             container_of(p->signaler, typeof(*s), sched);
> > +
> > +                     if (rq_prio(s) >= prio)
> > +                             continue;
> > +
> > +                     if (__i915_request_is_complete(s))
> > +                             continue;
> > +
> > +                     break;
> > +             }
> > +             rcu_read_unlock();
> 
> Exit this loop with a first lower priority incomplete signaler. What 
> does the block below then do? Feels like it needs a comment.

I thought I had sufficiently explained that in the comment above.

/* Update priority in place if no PI required */
> > +             if (&p->signal_link == &rq->sched.signalers_list &&
> > +                 cmpxchg(&rq->sched.attr.priority,
> > +                         I915_PRIORITY_INVALID,
> > +                         prio) == I915_PRIORITY_INVALID)
> > +                     return;

It could do a few more tricks to change the priority in-place a second
time, but I did not think that would be frequent enough to matter.
Whereas we always adjust the priority from INVALID once before
submission, and avoiding taking the lock then does make a difference to
the profiles.
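
As a rough userspace illustration of that lockless fast path (a sketch
only, not the driver code): C11 atomics stand in for the kernel's
cmpxchg(), and toy_request, toy_set_priority_fast(), PRIO_INVALID and the
needs_pi flag are hypothetical names. needs_pi represents the outcome of
the signaler walk, i.e. whether some incomplete signaler still has a
lower priority.

```c
#include <assert.h>
#include <limits.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for I915_PRIORITY_INVALID; the value is assumed. */
#define PRIO_INVALID INT_MIN

struct toy_request {
	atomic_int priority;
};

/*
 * Try to assign the first priority without taking any lock.
 * Returns true if the priority was set in place, false if the caller
 * has to fall back to the locked slow path.
 */
static bool toy_set_priority_fast(struct toy_request *rq, int prio,
				  bool needs_pi)
{
	int expected = PRIO_INVALID;

	assert(prio != PRIO_INVALID);

	/* A signaler must be bumped as well: priority inheritance needed. */
	if (needs_pi)
		return false;

	/* Succeeds only if nobody else assigned a priority in the meantime. */
	return atomic_compare_exchange_strong(&rq->priority, &expected, prio);
}
```

In this model only the first transition away from PRIO_INVALID can take
the fast path; any later change fails the compare-and-exchange and would
go through the locked route.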
-Chris

^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: [Intel-gfx] [PATCH 05/41] drm/i915: Restructure priority inheritance
  2021-01-26 11:30     ` Chris Wilson
@ 2021-01-26 11:40       ` Tvrtko Ursulin
  2021-01-26 11:55         ` Chris Wilson
  0 siblings, 1 reply; 90+ messages in thread
From: Tvrtko Ursulin @ 2021-01-26 11:40 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx; +Cc: thomas.hellstrom


On 26/01/2021 11:30, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2021-01-26 11:12:53)
>>
>>
>> On 25/01/2021 14:01, Chris Wilson wrote:
>>> +static void ipi_schedule(struct work_struct *wrk)
>>> +{
>>> +     struct i915_sched_ipi *ipi = container_of(wrk, typeof(*ipi), work);
>>> +     struct i915_request *rq = xchg(&ipi->list, NULL);
>>> +
>>> +     do {
>>> +             struct i915_request *rn = xchg(&rq->sched.ipi_link, NULL);
>>> +             int prio;
>>> +
>>> +             prio = ipi_get_prio(rq);
>>> +
>>> +             /*
>>> +              * For cross-engine scheduling to work we rely on one of two
>>> +              * things:
>>> +              *
>>> +              * a) The requests are using dma-fence fences and so will not
>>> +              * be scheduled until the previous engine is completed, and
>>> +              * so we cannot cross back onto the original engine and end up
>>> +              * queuing an earlier request after the first (due to the
>>> +              * interrupted DFS).
>>> +              *
>>> +              * b) The requests are using semaphores and so may be already
>>> +              * be in flight, in which case if we cross back onto the same
>>> +              * engine, we will already have put the interrupted DFS into
>>> +              * the priolist, and the continuation will now be queued
>>> +              * afterwards [out-of-order]. However, since we are using
>>> +              * semaphores in this case, we also perform yield on semaphore
>>> +              * waits and so will reorder the requests back into the correct
>>> +              * sequence. This occurrence (of promoting a request chain
>>> +              * that crosses the engines using semaphores back unto itself)
>>> +              * should be unlikely enough that it probably does not matter...
>>> +              */
>>> +             local_bh_disable();
>>> +             i915_request_set_priority(rq, prio);
>>> +             local_bh_enable();
>>
>> Is it that important and wouldn't the priority order restore eventually
>> due timeslicing?
> 
> There would be a window in which we executed userspace code
> out-of-order. That's enough to scare me! However, for our PI dependency
> chains it should not matter as the only time we do submit out-of-order,
> we are stuck on _our_ semaphore that cannot be resolved until the
> requests are back in-order.

Out of order how? Within a single timeline?! I thought only with an
incomplete view of priority inheritance, which in my mind could only 
cause deadlocks (if no timeslicing). But really really out of order?

> I've tried to trick this into causing problems with the
> i915_selftest/igt_schedule_cycle and gem_exec_schedule/noreorder.
> Fortunately for my sanity, neither test have caught any problems.
> 
> This is the handwaving part of removing the global lock.
> 
>>> +     /*
>>> +      * If we are setting the priority before being submitted, see if we
>>> +      * can quickly adjust our own priority in-situ and avoid taking
>>> +      * the contended engine->active.lock. If we need priority inheritance,
>>> +      * take the slow route.
>>> +      */
>>> +     if (rq_prio(rq) == I915_PRIORITY_INVALID) {
>>> +             struct i915_dependency *p;
>>> +
>>> +             rcu_read_lock();
>>> +             for_each_signaler(p, rq) {
>>> +                     struct i915_request *s =
>>> +                             container_of(p->signaler, typeof(*s), sched);
>>> +
>>> +                     if (rq_prio(s) >= prio)
>>> +                             continue;
>>> +
>>> +                     if (__i915_request_is_complete(s))
>>> +                             continue;
>>> +
>>> +                     break;
>>> +             }
>>> +             rcu_read_unlock();
>>
>> Exit this loop with a first lower priority incomplete signaler. What
>> does the block below then do? Feels like it needs a comment.
> 
> I thought I had sufficiently explained that in the comment above.
> 
> /* Update priority in place if no PI required */
>>> +             if (&p->signal_link == &rq->sched.signalers_list &&
>>> +                 cmpxchg(&rq->sched.attr.priority,
>>> +                         I915_PRIORITY_INVALID,
>>> +                         prio) == I915_PRIORITY_INVALID)
>>> +                     return;
> 
> It could do a few more tricks to change the priority in-place a second
> time, but I did not think that would be frequent enough to matter.
> Whereas we always adjust the priority from INVALID once before
> submission, and avoiding taking the lock then does make a difference to
> the profiles.

To start with, if p is NULL or un-initialized (can be, no?) then the
relationship of &p->signal_link to &rq->sched.signalers_list escapes me.

Regards,

Tvrtko

^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: [Intel-gfx] [PATCH 05/41] drm/i915: Restructure priority inheritance
  2021-01-26 11:40       ` Tvrtko Ursulin
@ 2021-01-26 11:55         ` Chris Wilson
  2021-01-26 13:15           ` Tvrtko Ursulin
  0 siblings, 1 reply; 90+ messages in thread
From: Chris Wilson @ 2021-01-26 11:55 UTC (permalink / raw)
  To: Tvrtko Ursulin, intel-gfx; +Cc: thomas.hellstrom

Quoting Tvrtko Ursulin (2021-01-26 11:40:24)
> 
> On 26/01/2021 11:30, Chris Wilson wrote:
> > Quoting Tvrtko Ursulin (2021-01-26 11:12:53)
> >>
> >>
> >> On 25/01/2021 14:01, Chris Wilson wrote:
> >>> +static void ipi_schedule(struct work_struct *wrk)
> >>> +{
> >>> +     struct i915_sched_ipi *ipi = container_of(wrk, typeof(*ipi), work);
> >>> +     struct i915_request *rq = xchg(&ipi->list, NULL);
> >>> +
> >>> +     do {
> >>> +             struct i915_request *rn = xchg(&rq->sched.ipi_link, NULL);
> >>> +             int prio;
> >>> +
> >>> +             prio = ipi_get_prio(rq);
> >>> +
> >>> +             /*
> >>> +              * For cross-engine scheduling to work we rely on one of two
> >>> +              * things:
> >>> +              *
> >>> +              * a) The requests are using dma-fence fences and so will not
> >>> +              * be scheduled until the previous engine is completed, and
> >>> +              * so we cannot cross back onto the original engine and end up
> >>> +              * queuing an earlier request after the first (due to the
> >>> +              * interrupted DFS).
> >>> +              *
> >>> +              * b) The requests are using semaphores and so may be already
> >>> +              * be in flight, in which case if we cross back onto the same
> >>> +              * engine, we will already have put the interrupted DFS into
> >>> +              * the priolist, and the continuation will now be queued
> >>> +              * afterwards [out-of-order]. However, since we are using
> >>> +              * semaphores in this case, we also perform yield on semaphore
> >>> +              * waits and so will reorder the requests back into the correct
> >>> +              * sequence. This occurrence (of promoting a request chain
> >>> +              * that crosses the engines using semaphores back unto itself)
> >>> +              * should be unlikely enough that it probably does not matter...
> >>> +              */
> >>> +             local_bh_disable();
> >>> +             i915_request_set_priority(rq, prio);
> >>> +             local_bh_enable();
> >>
> >> Is it that important and wouldn't the priority order restore eventually
> >> due timeslicing?
> > 
> > There would be a window in which we executed userspace code
> > out-of-order. That's enough to scare me! However, for our PI dependency
> > chains it should not matter as the only time we do submit out-of-order,
> > we are stuck on _our_ semaphore that cannot be resolved until the
> > requests are back in-order.
> 
> Out of order how? Within a single timeline?! I though only with 
> incomplete view of priority inheritance, which in my mind could only 
> cause deadlocks (if no timeslicing). But really really out of order?

Fences between timelines. Let's say we have 3 requests, A,B,C all with
sequential fencing (C depends on B depends on A), but B is on a
different engine to (A, C) and we are using semaphores to submit early.
If we bump the priority of C, we see it crosses the engine to B, and send
an ipi_priority, but set C to be higher priority than A. So we now
schedule C before A!

However, since C depends on B which depends on A, C is stuck on its
semaphore from B, and B is waiting for A. As soon as A is set to the
same priority as C (after a couple of ipi_priority()), we rerun the
scheduler, see that C has a semaphore-yield (or that its timeslice has
eventually expired) and so run A before C, and order is restored.
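
A toy model of the A, B, C example above, as a sketch under simplifying
assumptions: each request has at most one signaler, and toy_rq and
toy_bump() are hypothetical names, not anything from the driver. Each
call to toy_bump() models one pass confined to a single engine; the
return value is the request that would be handed to the other engine via
ipi_priority().

```c
#include <assert.h>
#include <stddef.h>

struct toy_rq {
	int engine;
	int prio;
	struct toy_rq *signaler; /* single dependency, NULL if none */
};

/*
 * Raise the priority of rq and of its signalers, but only while the
 * chain stays on rq's engine. Returns the first signaler on another
 * engine that still needs the bump, or NULL once the chain is done.
 */
static struct toy_rq *toy_bump(struct toy_rq *rq, int prio)
{
	int engine;

	assert(rq);
	engine = rq->engine;

	while (rq && rq->prio < prio) {
		if (rq->engine != engine)
			return rq; /* defer to that engine's own pass */

		rq->prio = prio;
		rq = rq->signaler;
	}

	return NULL;
}
```

With A and C on engine 0 and B on engine 1, the first pass raises only C
and returns B, so for a while C outranks A. The second pass raises B and
returns A, and the third raises A, after which A and C share the same
priority again.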

> > I've tried to trick this into causing problems with the
> > i915_selftest/igt_schedule_cycle and gem_exec_schedule/noreorder.
> > Fortunately for my sanity, neither test have caught any problems.
> > 
> > This is the handwaving part of removing the global lock.
> > 
> >>> +     /*
> >>> +      * If we are setting the priority before being submitted, see if we
> >>> +      * can quickly adjust our own priority in-situ and avoid taking
> >>> +      * the contended engine->active.lock. If we need priority inheritance,
> >>> +      * take the slow route.
> >>> +      */
> >>> +     if (rq_prio(rq) == I915_PRIORITY_INVALID) {
> >>> +             struct i915_dependency *p;
> >>> +
> >>> +             rcu_read_lock();
> >>> +             for_each_signaler(p, rq) {
> >>> +                     struct i915_request *s =
> >>> +                             container_of(p->signaler, typeof(*s), sched);
> >>> +
> >>> +                     if (rq_prio(s) >= prio)
> >>> +                             continue;
> >>> +
> >>> +                     if (__i915_request_is_complete(s))
> >>> +                             continue;
> >>> +
> >>> +                     break;
> >>> +             }
> >>> +             rcu_read_unlock();
> >>
> >> Exit this loop with a first lower priority incomplete signaler. What
> >> does the block below then do? Feels like it needs a comment.
> > 
> > I thought I had sufficiently explained that in the comment above.
> > 
> > /* Update priority in place if no PI required */
> >>> +             if (&p->signal_link == &rq->sched.signalers_list &&
> >>> +                 cmpxchg(&rq->sched.attr.priority,
> >>> +                         I915_PRIORITY_INVALID,
> >>> +                         prio) == I915_PRIORITY_INVALID)
> >>> +                     return;
> > 
> > It could do a few more tricks to change the priority in-place a second
> > time, but I did not think that would be frequent enough to matter.
> > Whereas we always adjust the priority from INVALID once before
> > submission, and avoiding taking the lock then does make a difference to
> > the profiles.
> 
> To start with, if p is NULL or un-initialized (can be, no?) then 
> relationship of &p->signal_link to &rq->sched.signalers_list escapes me.

p is constrained to be a member of the signalers_list or its head.
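
To illustrate with a userspace sketch (simplified stand-ins for the
kernel list primitives; toy_dep, toy_container_of() and
toy_no_pi_required() are hypothetical names): the cursor is assigned
before the first loop test, and a traversal that runs to completion
leaves &p->signal_link equal to the list head, including for an empty
list. Like the kernel idiom, this relies on computing a container
pointer from the head itself, which is never dereferenced.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified userspace stand-ins for the kernel's list primitives. */
struct list_head {
	struct list_head *next, *prev;
};

#define toy_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

struct toy_dep {
	int prio;
	struct list_head signal_link;
};

static void toy_list_init(struct list_head *head)
{
	head->next = head;
	head->prev = head;
}

static void toy_list_add_tail(struct list_head *node, struct list_head *head)
{
	node->prev = head->prev;
	node->next = head;
	head->prev->next = node;
	head->prev = node;
}

/* True if no entry has a priority below prio, i.e. the walk hit the head. */
static bool toy_no_pi_required(struct list_head *signalers, int prio)
{
	struct toy_dep *p;

	assert(signalers && signalers->next);

	for (p = toy_container_of(signalers->next, struct toy_dep, signal_link);
	     &p->signal_link != signalers;
	     p = toy_container_of(p->signal_link.next, struct toy_dep,
				  signal_link)) {
		if (p->prio < prio)
			break; /* this signaler would need a bump */
	}

	/* After a full traversal the cursor's link is the head itself. */
	return &p->signal_link == signalers;
}
```

Wrapping the comparison in a named helper like this is one possible way
to make the intent of the check explicit at the call site.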
-Chris

^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: [Intel-gfx] [PATCH 05/41] drm/i915: Restructure priority inheritance
  2021-01-26 11:55         ` Chris Wilson
@ 2021-01-26 13:15           ` Tvrtko Ursulin
  2021-01-26 13:24             ` Chris Wilson
  0 siblings, 1 reply; 90+ messages in thread
From: Tvrtko Ursulin @ 2021-01-26 13:15 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx; +Cc: thomas.hellstrom


On 26/01/2021 11:55, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2021-01-26 11:40:24)
>>
>> On 26/01/2021 11:30, Chris Wilson wrote:
>>> Quoting Tvrtko Ursulin (2021-01-26 11:12:53)
>>>>
>>>>
>>>> On 25/01/2021 14:01, Chris Wilson wrote:
>>>>> +static void ipi_schedule(struct work_struct *wrk)
>>>>> +{
>>>>> +     struct i915_sched_ipi *ipi = container_of(wrk, typeof(*ipi), work);
>>>>> +     struct i915_request *rq = xchg(&ipi->list, NULL);
>>>>> +
>>>>> +     do {
>>>>> +             struct i915_request *rn = xchg(&rq->sched.ipi_link, NULL);
>>>>> +             int prio;
>>>>> +
>>>>> +             prio = ipi_get_prio(rq);
>>>>> +
>>>>> +             /*
>>>>> +              * For cross-engine scheduling to work we rely on one of two
>>>>> +              * things:
>>>>> +              *
>>>>> +              * a) The requests are using dma-fence fences and so will not
>>>>> +              * be scheduled until the previous engine is completed, and
>>>>> +              * so we cannot cross back onto the original engine and end up
>>>>> +              * queuing an earlier request after the first (due to the
>>>>> +              * interrupted DFS).
>>>>> +              *
>>>>> +              * b) The requests are using semaphores and so may be already
>>>>> +              * be in flight, in which case if we cross back onto the same
>>>>> +              * engine, we will already have put the interrupted DFS into
>>>>> +              * the priolist, and the continuation will now be queued
>>>>> +              * afterwards [out-of-order]. However, since we are using
>>>>> +              * semaphores in this case, we also perform yield on semaphore
>>>>> +              * waits and so will reorder the requests back into the correct
>>>>> +              * sequence. This occurrence (of promoting a request chain
>>>>> +              * that crosses the engines using semaphores back unto itself)
>>>>> +              * should be unlikely enough that it probably does not matter...
>>>>> +              */
>>>>> +             local_bh_disable();
>>>>> +             i915_request_set_priority(rq, prio);
>>>>> +             local_bh_enable();
>>>>
>>>> Is it that important and wouldn't the priority order restore eventually
>>>> due timeslicing?
>>>
>>> There would be a window in which we executed userspace code
>>> out-of-order. That's enough to scare me! However, for our PI dependency
>>> chains it should not matter as the only time we do submit out-of-order,
>>> we are stuck on _our_ semaphore that cannot be resolved until the
>>> requests are back in-order.
>>
>> Out of order how? Within a single timeline?! I though only with
>> incomplete view of priority inheritance, which in my mind could only
>> cause deadlocks (if no timeslicing). But really really out of order?
> 
> Fences between timelines. Let's say we have 3 requests, A,B,C all with
> sequential fencing (C depends on B depends on A), but B is on a
> different engine to (A, C) and we are using semaphores to submit early.
> If we bump the priority of C, we see it crosses the engine to B, and send
> an ipi_priority, but set C to be higher priority than A. So we now
> schedule C before A!

Yeah so different timelines, I think that's not a huge problem to start 
with. Only if things were non-preemptable.

> However, since C depends on B which depends on A, C is stuck on its
> semaphore from B, and B is waiting for A. As soon as A is set to the
> same priority as C (after a couple of ipi_priority()), we rerun the
> scheduler see that C has a semaphore-yield (or eventually timeslice
> expired) and so run A before C, and order is restored.
> 
>>> I've tried to trick this into causing problems with the
>>> i915_selftest/igt_schedule_cycle and gem_exec_schedule/noreorder.
>>> Fortunately for my sanity, neither test have caught any problems.
>>>
>>> This is the handwaving part of removing the global lock.
>>>
>>>>> +     /*
>>>>> +      * If we are setting the priority before being submitted, see if we
>>>>> +      * can quickly adjust our own priority in-situ and avoid taking
>>>>> +      * the contended engine->active.lock. If we need priority inheritance,
>>>>> +      * take the slow route.
>>>>> +      */
>>>>> +     if (rq_prio(rq) == I915_PRIORITY_INVALID) {
>>>>> +             struct i915_dependency *p;
>>>>> +
>>>>> +             rcu_read_lock();
>>>>> +             for_each_signaler(p, rq) {
>>>>> +                     struct i915_request *s =
>>>>> +                             container_of(p->signaler, typeof(*s), sched);
>>>>> +
>>>>> +                     if (rq_prio(s) >= prio)
>>>>> +                             continue;
>>>>> +
>>>>> +                     if (__i915_request_is_complete(s))
>>>>> +                             continue;
>>>>> +
>>>>> +                     break;
>>>>> +             }
>>>>> +             rcu_read_unlock();
>>>>
>>>> Exit this loop with a first lower priority incomplete signaler. What
>>>> does the block below then do? Feels like it needs a comment.
>>>
>>> I thought I had sufficiently explained that in the comment above.
>>>
>>> /* Update priority in place if no PI required */
>>>>> +             if (&p->signal_link == &rq->sched.signalers_list &&
>>>>> +                 cmpxchg(&rq->sched.attr.priority,
>>>>> +                         I915_PRIORITY_INVALID,
>>>>> +                         prio) == I915_PRIORITY_INVALID)
>>>>> +                     return;
>>>
>>> It could do a few more tricks to change the priority in-place a second
>>> time, but I did not think that would be frequent enough to matter.
>>> Whereas we always adjust the priority from INVALID once before
>>> submission, and avoiding taking the lock then does make a difference to
>>> the profiles.
>>
>> To start with, if p is NULL or un-initialized (can be, no?) then
>> relationship of &p->signal_link to &rq->sched.signalers_list escapes me.
> 
> p is constrained to be a member of the signalers_list or its head.

Is it defined that list_for_each_entry exits with pos set? It is in the
implementation, but I don't know why it would have to be. Could you
change this to some form of list_empty or a descriptively named helper
for clarity?

Regards,

Tvrtko
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: [Intel-gfx] [PATCH 05/41] drm/i915: Restructure priority inheritance
  2021-01-26 13:15           ` Tvrtko Ursulin
@ 2021-01-26 13:24             ` Chris Wilson
  2021-01-26 13:45               ` Chris Wilson
  0 siblings, 1 reply; 90+ messages in thread
From: Chris Wilson @ 2021-01-26 13:24 UTC (permalink / raw)
  To: Tvrtko Ursulin, intel-gfx; +Cc: thomas.hellstrom

Quoting Tvrtko Ursulin (2021-01-26 13:15:29)
> 
> On 26/01/2021 11:55, Chris Wilson wrote:
> > Quoting Tvrtko Ursulin (2021-01-26 11:40:24)
> >>
> >> On 26/01/2021 11:30, Chris Wilson wrote:
> >>> Quoting Tvrtko Ursulin (2021-01-26 11:12:53)
> >>>>
> >>>>
> >>>> On 25/01/2021 14:01, Chris Wilson wrote:
> >>>>> +static void ipi_schedule(struct work_struct *wrk)
> >>>>> +{
> >>>>> +     struct i915_sched_ipi *ipi = container_of(wrk, typeof(*ipi), work);
> >>>>> +     struct i915_request *rq = xchg(&ipi->list, NULL);
> >>>>> +
> >>>>> +     do {
> >>>>> +             struct i915_request *rn = xchg(&rq->sched.ipi_link, NULL);
> >>>>> +             int prio;
> >>>>> +
> >>>>> +             prio = ipi_get_prio(rq);
> >>>>> +
> >>>>> +             /*
> >>>>> +              * For cross-engine scheduling to work we rely on one of two
> >>>>> +              * things:
> >>>>> +              *
> >>>>> +              * a) The requests are using dma-fence fences and so will not
> >>>>> +              * be scheduled until the previous engine is completed, and
> >>>>> +              * so we cannot cross back onto the original engine and end up
> >>>>> +              * queuing an earlier request after the first (due to the
> >>>>> +              * interrupted DFS).
> >>>>> +              *
> >>>>> +              * b) The requests are using semaphores and so may be already
> >>>>> +              * be in flight, in which case if we cross back onto the same
> >>>>> +              * engine, we will already have put the interrupted DFS into
> >>>>> +              * the priolist, and the continuation will now be queued
> >>>>> +              * afterwards [out-of-order]. However, since we are using
> >>>>> +              * semaphores in this case, we also perform yield on semaphore
> >>>>> +              * waits and so will reorder the requests back into the correct
> >>>>> +              * sequence. This occurrence (of promoting a request chain
> >>>>> +              * that crosses the engines using semaphores back unto itself)
> >>>>> +              * should be unlikely enough that it probably does not matter...
> >>>>> +              */
> >>>>> +             local_bh_disable();
> >>>>> +             i915_request_set_priority(rq, prio);
> >>>>> +             local_bh_enable();
> >>>>
> >>>> Is it that important and wouldn't the priority order restore eventually
> >>>> due to timeslicing?
> >>>
> >>> There would be a window in which we executed userspace code
> >>> out-of-order. That's enough to scare me! However, for our PI dependency
> >>> chains it should not matter as the only time we do submit out-of-order,
> >>> we are stuck on _our_ semaphore that cannot be resolved until the
> >>> requests are back in-order.
> >>
> >> Out of order how? Within a single timeline?! I thought only with
> >> incomplete view of priority inheritance, which in my mind could only
> >> cause deadlocks (if no timeslicing). But really really out of order?
> > 
> > Fences between timelines. Let's say we have 3 requests, A,B,C all with
> > sequential fencing (C depends on B depends on A), but B is on a
> > different engine to (A, C) and we are using semaphores to submit early.
> > If we bump the priority of C, we see it crosses the engine to B, and send
> > an ipi_priority, but set C to be higher priority than A. So we now
> > schedule C before A!
> 
> Yeah so different timelines, I think that's not a huge problem to start 
> with. Only if things were non-preemptable.

And for the special case where it may occur, it's inside a preemptible
section (under our control).

> > However, since C depends on B which depends on A, C is stuck on its
> > semaphore from B, and B is waiting for A. As soon as A is set to the
> > same priority as C (after a couple of ipi_priority()), we rerun the
> > scheduler, see that C has a semaphore-yield (or eventually timeslice
> > expired) and so run A before C, and order is restored.
> > 
> >>> I've tried to trick this into causing problems with the
> >>> i915_selftest/igt_schedule_cycle and gem_exec_schedule/noreorder.
> >>> Fortunately for my sanity, neither test has caught any problems.
> >>>
> >>> This is the handwaving part of removing the global lock.
> >>>
> >>>>> +     /*
> >>>>> +      * If we are setting the priority before being submitted, see if we
> >>>>> +      * can quickly adjust our own priority in-situ and avoid taking
> >>>>> +      * the contended engine->active.lock. If we need priority inheritance,
> >>>>> +      * take the slow route.
> >>>>> +      */
> >>>>> +     if (rq_prio(rq) == I915_PRIORITY_INVALID) {
> >>>>> +             struct i915_dependency *p;
> >>>>> +
> >>>>> +             rcu_read_lock();
> >>>>> +             for_each_signaler(p, rq) {
> >>>>> +                     struct i915_request *s =
> >>>>> +                             container_of(p->signaler, typeof(*s), sched);
> >>>>> +
> >>>>> +                     if (rq_prio(s) >= prio)
> >>>>> +                             continue;
> >>>>> +
> >>>>> +                     if (__i915_request_is_complete(s))
> >>>>> +                             continue;
> >>>>> +
> >>>>> +                     break;
> >>>>> +             }
> >>>>> +             rcu_read_unlock();
> >>>>
> >>>> Exit this loop with a first lower priority incomplete signaler. What
> >>>> does the block below then do? Feels like it needs a comment.
> >>>
> >>> I thought I had sufficiently explained that in the comment above.
> >>>
> >>> /* Update priority in place if no PI required */
> >>>>> +             if (&p->signal_link == &rq->sched.signalers_list &&
> >>>>> +                 cmpxchg(&rq->sched.attr.priority,
> >>>>> +                         I915_PRIORITY_INVALID,
> >>>>> +                         prio) == I915_PRIORITY_INVALID)
> >>>>> +                     return;
> >>>
> >>> It could do a few more tricks to change the priority in-place a second
> >>> time, but I did not think that would be frequent enough to matter.
> >>> Whereas we always adjust the priority from INVALID once before
> >>> submission, and avoiding taking the lock then does make a difference to
> >>> the profiles.
> >>
> >> To start with, if p is NULL or un-initialized (can be, no?) then
> >> relationship of &p->signal_link to &rq->sched.signalers_list escapes me.
> > 
> > p is constrained to be a member of the signalers_list or its head.
> 
> Is it defined that list_for_each_entry exits with pos set? It is in the
> implementation, but I don't know why it would have to be. Could you
> change this to some form of list_empty or a descriptively named helper
> for clarity?

It is as defined as the macro gets.

There's a list_entry_is_head(). That sounds new.

commit e130816164e244b692921de49771eeb28205152d
Author: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Date:   Thu Oct 15 20:11:31 2020 -0700

    include/linux/list.h: add a macro to test if entry is pointing to the head

-Chris

^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: [Intel-gfx] [PATCH 05/41] drm/i915: Restructure priority inheritance
  2021-01-26 13:24             ` Chris Wilson
@ 2021-01-26 13:45               ` Chris Wilson
  0 siblings, 0 replies; 90+ messages in thread
From: Chris Wilson @ 2021-01-26 13:45 UTC (permalink / raw)
  To: Tvrtko Ursulin, intel-gfx; +Cc: thomas.hellstrom

Quoting Chris Wilson (2021-01-26 13:24:07)
> Quoting Tvrtko Ursulin (2021-01-26 13:15:29)
> > 
> > On 26/01/2021 11:55, Chris Wilson wrote:
> > > Quoting Tvrtko Ursulin (2021-01-26 11:40:24)
> > >>
> > >> On 26/01/2021 11:30, Chris Wilson wrote:
> > >>> Quoting Tvrtko Ursulin (2021-01-26 11:12:53)
> > >>>>
> > >>>>
> > >>>> On 25/01/2021 14:01, Chris Wilson wrote:
> > >>>>> +static void ipi_schedule(struct work_struct *wrk)
> > >>>>> +{
> > >>>>> +     struct i915_sched_ipi *ipi = container_of(wrk, typeof(*ipi), work);
> > >>>>> +     struct i915_request *rq = xchg(&ipi->list, NULL);
> > >>>>> +
> > >>>>> +     do {
> > >>>>> +             struct i915_request *rn = xchg(&rq->sched.ipi_link, NULL);
> > >>>>> +             int prio;
> > >>>>> +
> > >>>>> +             prio = ipi_get_prio(rq);
> > >>>>> +
> > >>>>> +             /*
> > >>>>> +              * For cross-engine scheduling to work we rely on one of two
> > >>>>> +              * things:
> > >>>>> +              *
> > >>>>> +              * a) The requests are using dma-fence fences and so will not
> > >>>>> +              * be scheduled until the previous engine is completed, and
> > >>>>> +              * so we cannot cross back onto the original engine and end up
> > >>>>> +              * queuing an earlier request after the first (due to the
> > >>>>> +              * interrupted DFS).
> > >>>>> +              *
> > >>>>> +              * b) The requests are using semaphores and so may be already
> > >>>>> +              * be in flight, in which case if we cross back onto the same
> > >>>>> +              * engine, we will already have put the interrupted DFS into
> > >>>>> +              * the priolist, and the continuation will now be queued
> > >>>>> +              * afterwards [out-of-order]. However, since we are using
> > >>>>> +              * semaphores in this case, we also perform yield on semaphore
> > >>>>> +              * waits and so will reorder the requests back into the correct
> > >>>>> +              * sequence. This occurrence (of promoting a request chain
> > >>>>> +              * that crosses the engines using semaphores back unto itself)
> > >>>>> +              * should be unlikely enough that it probably does not matter...
> > >>>>> +              */
> > >>>>> +             local_bh_disable();
> > >>>>> +             i915_request_set_priority(rq, prio);
> > >>>>> +             local_bh_enable();
> > >>>>
> > >>>> Is it that important and wouldn't the priority order restore eventually
> > >>>> due to timeslicing?
> > >>>
> > >>> There would be a window in which we executed userspace code
> > >>> out-of-order. That's enough to scare me! However, for our PI dependency
> > >>> chains it should not matter as the only time we do submit out-of-order,
> > >>> we are stuck on _our_ semaphore that cannot be resolved until the
> > >>> requests are back in-order.
> > >>
> > >> Out of order how? Within a single timeline?! I thought only with
> > >> incomplete view of priority inheritance, which in my mind could only
> > >> cause deadlocks (if no timeslicing). But really really out of order?
> > > 
> > > Fences between timelines. Let's say we have 3 requests, A,B,C all with
> > > sequential fencing (C depends on B depends on A), but B is on a
> > > different engine to (A, C) and we are using semaphores to submit early.
> > > If we bump the priority of C, we see it crosses the engine to B, and send
> > > an ipi_priority, but set C to be higher priority than A. So we now
> > > schedule C before A!
> > 
> > Yeah so different timelines, I think that's not a huge problem to start 
> > with. Only if things were non-preemptable.
> 
> And for the special case where it may occur, it's inside a preemptible
> section (under our control).
> 
> > > However, since C depends on B which depends on A, C is stuck on its
> > > semaphore from B, and B is waiting for A. As soon as A is set to the
> > > same priority as C (after a couple of ipi_priority()), we rerun the
> > > scheduler, see that C has a semaphore-yield (or eventually timeslice
> > > expired) and so run A before C, and order is restored.
> > > 
> > >>> I've tried to trick this into causing problems with the
> > >>> i915_selftest/igt_schedule_cycle and gem_exec_schedule/noreorder.
> > >>> Fortunately for my sanity, neither test has caught any problems.
> > >>>
> > >>> This is the handwaving part of removing the global lock.
> > >>>
> > >>>>> +     /*
> > >>>>> +      * If we are setting the priority before being submitted, see if we
> > >>>>> +      * can quickly adjust our own priority in-situ and avoid taking
> > >>>>> +      * the contended engine->active.lock. If we need priority inheritance,
> > >>>>> +      * take the slow route.
> > >>>>> +      */
> > >>>>> +     if (rq_prio(rq) == I915_PRIORITY_INVALID) {
> > >>>>> +             struct i915_dependency *p;
> > >>>>> +
> > >>>>> +             rcu_read_lock();
> > >>>>> +             for_each_signaler(p, rq) {
> > >>>>> +                     struct i915_request *s =
> > >>>>> +                             container_of(p->signaler, typeof(*s), sched);
> > >>>>> +
> > >>>>> +                     if (rq_prio(s) >= prio)
> > >>>>> +                             continue;
> > >>>>> +
> > >>>>> +                     if (__i915_request_is_complete(s))
> > >>>>> +                             continue;
> > >>>>> +
> > >>>>> +                     break;
> > >>>>> +             }
> > >>>>> +             rcu_read_unlock();
> > >>>>
> > >>>> Exit this loop with a first lower priority incomplete signaler. What
> > >>>> does the block below then do? Feels like it needs a comment.
> > >>>
> > >>> I thought I had sufficiently explained that in the comment above.
> > >>>
> > >>> /* Update priority in place if no PI required */
> > >>>>> +             if (&p->signal_link == &rq->sched.signalers_list &&
> > >>>>> +                 cmpxchg(&rq->sched.attr.priority,
> > >>>>> +                         I915_PRIORITY_INVALID,
> > >>>>> +                         prio) == I915_PRIORITY_INVALID)
> > >>>>> +                     return;
> > >>>
> > >>> It could do a few more tricks to change the priority in-place a second
> > >>> time, but I did not think that would be frequent enough to matter.
> > >>> Whereas we always adjust the priority from INVALID once before
> > >>> submission, and avoiding taking the lock then does make a difference to
> > >>> the profiles.
> > >>
> > >> To start with, if p is NULL or un-initialized (can be, no?) then
> > >> relationship of &p->signal_link to &rq->sched.signalers_list escapes me.
> > > 
> > > p is constrained to be a member of the signalers_list or its head.
> > 
> > Is it defined that list_for_each_entry exits with pos set? It is in the
> > implementation, but I don't know why it would have to be. Could you
> > change this to some form of list_empty or a descriptively named helper
> > for clarity?
> 
> It is as defined as the macro gets.
> 
> There's a list_entry_is_head(). That sounds new.
> 
> commit e130816164e244b692921de49771eeb28205152d
> Author: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
> Date:   Thu Oct 15 20:11:31 2020 -0700
> 
>     include/linux/list.h: add a macro to test if entry is pointing to the head


#define all_dependencies_checked(p, rq) \
        list_entry_is_head(p, &(rq)->sched.signalers_list, signal_link)

/* Update priority in place if no PI required */
if (all_dependencies_checked(p, rq) &&
    cmpxchg(&rq->sched.attr.priority,
	    I915_PRIORITY_INVALID,
	    prio) == I915_PRIORITY_INVALID)
	return;

-Chris
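
For illustration only, the "did the walk run to completion?" check discussed
above can be modelled in userspace. This is a minimal sketch, assuming a
stripped-down list_head and a hypothetical struct dep; the names list_init,
dep_entry and all_signalers_checked are made up for the example and are not
the i915 code. It walks with a plain list_head cursor rather than an entry
pointer, so the final comparison is against the head itself, which is the
same test that list_entry_is_head() expresses in terms of the entry.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Minimal userspace stand-in for the kernel's circular doubly linked list. */
struct list_head {
	struct list_head *next, *prev;
};

static void list_init(struct list_head *head)
{
	head->next = head;
	head->prev = head;
}

static void list_add_tail(struct list_head *node, struct list_head *head)
{
	node->prev = head->prev;
	node->next = head;
	head->prev->next = node;
	head->prev = node;
}

/* Hypothetical dependency node; only the fields the walk needs. */
struct dep {
	int signaler_prio;
	bool signaler_complete;
	struct list_head signal_link;
};

/* Recover the containing struct dep from its embedded list link. */
#define dep_entry(ptr) \
	((struct dep *)((char *)(ptr) - offsetof(struct dep, signal_link)))

/*
 * Stop at the first incomplete signaler whose priority is below prio.
 * Returns true if no such signaler exists, i.e. the cursor went all the
 * way round to the list head and no priority inheritance is required.
 */
static bool all_signalers_checked(struct list_head *signalers, int prio)
{
	struct list_head *pos;

	for (pos = signalers->next; pos != signalers; pos = pos->next) {
		struct dep *p = dep_entry(pos);

		if (p->signaler_prio >= prio)
			continue;	/* already at or above the target */
		if (p->signaler_complete)
			continue;	/* finished, nothing to inherit */
		break;			/* lower priority and still running */
	}

	/* pos equals the head only if the loop was never cut short. */
	return pos == signalers;
}
```

An empty list trivially passes the check, since the cursor starts at the
head; a caller would then go on to attempt the in-place cmpxchg only when
this returns true.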

^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: [Intel-gfx] [PATCH 08/41] drm/i915: Improve DFS for priority inheritance
  2021-01-25 14:01 ` [Intel-gfx] [PATCH 08/41] drm/i915: Improve DFS for priority inheritance Chris Wilson
@ 2021-01-26 16:22   ` Tvrtko Ursulin
  2021-01-26 16:26     ` Chris Wilson
  0 siblings, 1 reply; 90+ messages in thread
From: Tvrtko Ursulin @ 2021-01-26 16:22 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx; +Cc: thomas.hellstrom



On 25/01/2021 14:01, Chris Wilson wrote:
> The core of the scheduling algorithm is that we compute the topological
> order of the fence DAG. Knowing that we have a DAG, we should be able to
> use a DFS to compute the topological sort in linear time. However,
> during the conversion of the recursive algorithm into an iterative one,
> the memoization of how far we had progressed down a branch was
> forgotten. The result was that instead of running in linear time, it was
> running in geometric time and could easily run for a few hundred
> milliseconds given a wide enough graph, not the microseconds as required.
> 
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> ---
>   drivers/gpu/drm/i915/i915_scheduler.c | 58 ++++++++++++++++-----------
>   1 file changed, 34 insertions(+), 24 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_scheduler.c b/drivers/gpu/drm/i915/i915_scheduler.c
> index 4802c9b1081d..9139a91f0aa3 100644
> --- a/drivers/gpu/drm/i915/i915_scheduler.c
> +++ b/drivers/gpu/drm/i915/i915_scheduler.c
> @@ -234,6 +234,26 @@ void __i915_priolist_free(struct i915_priolist *p)
>   	kmem_cache_free(global.slab_priorities, p);
>   }
>   
> +static struct i915_request *
> +stack_push(struct i915_request *rq,
> +	   struct i915_request *stack,
> +	   struct list_head *pos)
> +{
> +	stack->sched.dfs.prev = pos;
> +	rq->sched.dfs.next = (struct list_head *)stack;
> +	return rq;
> +}
> +
> +static struct i915_request *
> +stack_pop(struct i915_request *rq,
> +	  struct list_head **pos)
> +{
> +	rq = (struct i915_request *)rq->sched.dfs.next;
> +	if (rq)
> +		*pos = rq->sched.dfs.prev;
> +	return rq;
> +}
> +
>   static inline bool need_preempt(int prio, int active)
>   {
>   	/*
> @@ -298,11 +318,10 @@ static void ipi_priority(struct i915_request *rq, int prio)
>   static void __i915_request_set_priority(struct i915_request *rq, int prio)
>   {
>   	struct intel_engine_cs *engine = rq->engine;
> -	struct i915_request *rn;
> +	struct list_head *pos = &rq->sched.signalers_list;
>   	struct list_head *plist;
> -	LIST_HEAD(dfs);
>   
> -	list_add(&rq->sched.dfs, &dfs);
> +	plist = i915_sched_lookup_priolist(engine, prio);
>   
>   	/*
>   	 * Recursively bump all dependent priorities to match the new request.
> @@ -322,40 +341,31 @@ static void __i915_request_set_priority(struct i915_request *rq, int prio)
>   	 * end result is a topological list of requests in reverse order, the
>   	 * last element in the list is the request we must execute first.
>   	 */
> -	list_for_each_entry(rq, &dfs, sched.dfs) {
> -		struct i915_dependency *p;
> -
> -		/* Also release any children on this engine that are ready */
> -		GEM_BUG_ON(rq->engine != engine);
> -
> -		for_each_signaler(p, rq) {
> +	rq->sched.dfs.next = NULL;
> +	do {
> +		list_for_each_continue(pos, &rq->sched.signalers_list) {
> +			struct i915_dependency *p =
> +				list_entry(pos, typeof(*p), signal_link);
>   			struct i915_request *s =
>   				container_of(p->signaler, typeof(*s), sched);
>   
> -			GEM_BUG_ON(s == rq);
> -
>   			if (rq_prio(s) >= prio)
>   				continue;
>   
>   			if (__i915_request_is_complete(s))
>   				continue;
>   
> -			if (s->engine != rq->engine) {
> +			if (s->engine != engine) {
>   				ipi_priority(s, prio);
>   				continue;
>   			}
>   
> -			list_move_tail(&s->sched.dfs, &dfs);
> +			/* Remember our position along this branch */
> +			rq = stack_push(s, rq, pos);
> +			pos = &rq->sched.signalers_list;
>   		}
> -	}
>   
> -	plist = i915_sched_lookup_priolist(engine, prio);
> -
> -	/* Fifo and depth-first replacement ensure our deps execute first */
> -	list_for_each_entry_safe_reverse(rq, rn, &dfs, sched.dfs) {
> -		GEM_BUG_ON(rq->engine != engine);
> -
> -		INIT_LIST_HEAD(&rq->sched.dfs);
> +		RQ_TRACE(rq, "set-priority:%d\n", prio);
>   		WRITE_ONCE(rq->sched.attr.priority, prio);
>   
>   		/*
> @@ -369,12 +379,13 @@ static void __i915_request_set_priority(struct i915_request *rq, int prio)
>   		if (!i915_request_is_ready(rq))
>   			continue;
>   
> +		GEM_BUG_ON(rq->engine != engine);
>   		if (i915_request_in_priority_queue(rq))
>   			list_move_tail(&rq->sched.link, plist);
>   
>   		/* Defer (tasklet) submission until after all updates. */
>   		kick_submission(engine, rq, prio);
> -	}
> +	} while ((rq = stack_pop(rq, &pos)));
>   }
>   
>   void i915_request_set_priority(struct i915_request *rq, int prio)
> @@ -444,7 +455,6 @@ void i915_sched_node_init(struct i915_sched_node *node)
>   	INIT_LIST_HEAD(&node->signalers_list);
>   	INIT_LIST_HEAD(&node->waiters_list);
>   	INIT_LIST_HEAD(&node->link);
> -	INIT_LIST_HEAD(&node->dfs);
>   
>   	node->ipi_link = NULL;
>   
> 

Pen and paper was needed here but it looks good.

Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Regards,

Tvrtko


^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: [Intel-gfx] [PATCH 08/41] drm/i915: Improve DFS for priority inheritance
  2021-01-26 16:22   ` Tvrtko Ursulin
@ 2021-01-26 16:26     ` Chris Wilson
  2021-01-26 16:42       ` Tvrtko Ursulin
  0 siblings, 1 reply; 90+ messages in thread
From: Chris Wilson @ 2021-01-26 16:26 UTC (permalink / raw)
  To: Tvrtko Ursulin, intel-gfx; +Cc: thomas.hellstrom

Quoting Tvrtko Ursulin (2021-01-26 16:22:58)
> 
> 
> On 25/01/2021 14:01, Chris Wilson wrote:
> > The core of the scheduling algorithm is that we compute the topological
> > order of the fence DAG. Knowing that we have a DAG, we should be able to
> > use a DFS to compute the topological sort in linear time. However,
> > during the conversion of the recursive algorithm into an iterative one,
> > the memoization of how far we had progressed down a branch was
> > forgotten. The result was that instead of running in linear time, it was
> > running in geometric time and could easily run for a few hundred
> > milliseconds given a wide enough graph, not the microseconds as required.
> > 
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > ---
> >   drivers/gpu/drm/i915/i915_scheduler.c | 58 ++++++++++++++++-----------
> >   1 file changed, 34 insertions(+), 24 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/i915_scheduler.c b/drivers/gpu/drm/i915/i915_scheduler.c
> > index 4802c9b1081d..9139a91f0aa3 100644
> > --- a/drivers/gpu/drm/i915/i915_scheduler.c
> > +++ b/drivers/gpu/drm/i915/i915_scheduler.c
> > @@ -234,6 +234,26 @@ void __i915_priolist_free(struct i915_priolist *p)
> >       kmem_cache_free(global.slab_priorities, p);
> >   }
> >   
> > +static struct i915_request *
> > +stack_push(struct i915_request *rq,
> > +        struct i915_request *stack,
> > +        struct list_head *pos)
> > +{
> > +     stack->sched.dfs.prev = pos;
> > +     rq->sched.dfs.next = (struct list_head *)stack;
> > +     return rq;
> > +}
> > +
> > +static struct i915_request *
> > +stack_pop(struct i915_request *rq,
> > +       struct list_head **pos)
> > +{
> > +     rq = (struct i915_request *)rq->sched.dfs.next;
> > +     if (rq)
> > +             *pos = rq->sched.dfs.prev;
> > +     return rq;
> > +}
> > +
> >   static inline bool need_preempt(int prio, int active)
> >   {
> >       /*
> > @@ -298,11 +318,10 @@ static void ipi_priority(struct i915_request *rq, int prio)
> >   static void __i915_request_set_priority(struct i915_request *rq, int prio)
> >   {
> >       struct intel_engine_cs *engine = rq->engine;
> > -     struct i915_request *rn;
> > +     struct list_head *pos = &rq->sched.signalers_list;
> >       struct list_head *plist;
> > -     LIST_HEAD(dfs);
> >   
> > -     list_add(&rq->sched.dfs, &dfs);
> > +     plist = i915_sched_lookup_priolist(engine, prio);
> >   
> >       /*
> >        * Recursively bump all dependent priorities to match the new request.
> > @@ -322,40 +341,31 @@ static void __i915_request_set_priority(struct i915_request *rq, int prio)
> >        * end result is a topological list of requests in reverse order, the
> >        * last element in the list is the request we must execute first.
> >        */
> > -     list_for_each_entry(rq, &dfs, sched.dfs) {
> > -             struct i915_dependency *p;
> > -
> > -             /* Also release any children on this engine that are ready */
> > -             GEM_BUG_ON(rq->engine != engine);
> > -
> > -             for_each_signaler(p, rq) {
> > +     rq->sched.dfs.next = NULL;
> > +     do {
> > +             list_for_each_continue(pos, &rq->sched.signalers_list) {
> > +                     struct i915_dependency *p =
> > +                             list_entry(pos, typeof(*p), signal_link);
> >                       struct i915_request *s =
> >                               container_of(p->signaler, typeof(*s), sched);
> >   
> > -                     GEM_BUG_ON(s == rq);
> > -
> >                       if (rq_prio(s) >= prio)
> >                               continue;
> >   
> >                       if (__i915_request_is_complete(s))
> >                               continue;
> >   
> > -                     if (s->engine != rq->engine) {
> > +                     if (s->engine != engine) {
> >                               ipi_priority(s, prio);
> >                               continue;
> >                       }
> >   
> > -                     list_move_tail(&s->sched.dfs, &dfs);
> > +                     /* Remember our position along this branch */
> > +                     rq = stack_push(s, rq, pos);
> > +                     pos = &rq->sched.signalers_list;
> >               }
> > -     }
> >   
> > -     plist = i915_sched_lookup_priolist(engine, prio);
> > -
> > -     /* Fifo and depth-first replacement ensure our deps execute first */
> > -     list_for_each_entry_safe_reverse(rq, rn, &dfs, sched.dfs) {
> > -             GEM_BUG_ON(rq->engine != engine);
> > -
> > -             INIT_LIST_HEAD(&rq->sched.dfs);
> > +             RQ_TRACE(rq, "set-priority:%d\n", prio);
> >               WRITE_ONCE(rq->sched.attr.priority, prio);
> >   
> >               /*
> > @@ -369,12 +379,13 @@ static void __i915_request_set_priority(struct i915_request *rq, int prio)
> >               if (!i915_request_is_ready(rq))
> >                       continue;
> >   
> > +             GEM_BUG_ON(rq->engine != engine);
> >               if (i915_request_in_priority_queue(rq))
> >                       list_move_tail(&rq->sched.link, plist);
> >   
> >               /* Defer (tasklet) submission until after all updates. */
> >               kick_submission(engine, rq, prio);
> > -     }
> > +     } while ((rq = stack_pop(rq, &pos)));
> >   }
> >   
> >   void i915_request_set_priority(struct i915_request *rq, int prio)
> > @@ -444,7 +455,6 @@ void i915_sched_node_init(struct i915_sched_node *node)
> >       INIT_LIST_HEAD(&node->signalers_list);
> >       INIT_LIST_HEAD(&node->waiters_list);
> >       INIT_LIST_HEAD(&node->link);
> > -     INIT_LIST_HEAD(&node->dfs);
> >   
> >       node->ipi_link = NULL;
> >   
> > 
> 
> Pen and paper was needed here but it looks good.

Could you highlight the areas that need more commentary? I guess
a theory-of-operation for stack_push/stack_pop?
-Chris
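
As a rough theory-of-operation, the push/pop scheme can be modelled outside
the kernel. The sketch below is an assumption-laden toy, not the patch: it
uses a hypothetical struct node with a fixed array of dependencies instead
of signalers_list, a resume index instead of a list_head cursor, and it
omits the completion and cross-engine checks. What it keeps is the idea
from the patch that each parent remembers how far it had scanned before
descending, so that after popping back the scan continues rather than
restarting, and each node is finalised exactly once, after its dependencies.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_DEPS 4

/* Hypothetical request node; field names are illustrative only. */
struct node {
	int prio;
	size_t nr_deps;
	struct node *deps[MAX_DEPS];	/* signalers this node waits on */
	struct node *stack_next;	/* intrusive stack link to the parent */
	size_t resume;			/* next dependency index to examine */
};

/* Descend into child, recording in parent where its scan should resume. */
static struct node *
stack_push(struct node *child, struct node *parent, size_t resume)
{
	parent->resume = resume;
	child->stack_next = parent;
	return child;
}

/* Return to the parent (if any) and restore its saved scan position. */
static struct node *stack_pop(struct node *n, size_t *pos)
{
	n = n->stack_next;
	if (n)
		*pos = n->resume;
	return n;
}

/*
 * Raise rq and every lower-priority dependency to prio. Nodes are written
 * to order[] dependencies-first; the number of nodes updated is returned.
 * The graph is assumed to be acyclic.
 */
static size_t bump_priority(struct node *rq, int prio, struct node **order)
{
	size_t pos = 0, count = 0;

	rq->stack_next = NULL;
	do {
		while (pos < rq->nr_deps) {
			struct node *s = rq->deps[pos++];

			if (s->prio >= prio)
				continue;	/* already handled or high enough */

			/* Remember our position along this branch. */
			rq = stack_push(s, rq, pos);
			pos = 0;
		}

		/* All dependencies of rq are done; finalise rq itself. */
		rq->prio = prio;
		order[count++] = rq;
	} while ((rq = stack_pop(rq, &pos)));

	return count;
}
```

With a diamond (D waits on B and C, both of which wait on A), bumping D
updates A, B, C, D in that order: A is reached through B, and when the walk
later reaches A again through C it is skipped because its priority has
already been raised.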

^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: [Intel-gfx] [PATCH 11/41] drm/i915: Extract request submission from execlists
  2021-01-25 14:01 ` [Intel-gfx] [PATCH 11/41] drm/i915: Extract request submission from execlists Chris Wilson
@ 2021-01-26 16:28   ` Tvrtko Ursulin
  0 siblings, 0 replies; 90+ messages in thread
From: Tvrtko Ursulin @ 2021-01-26 16:28 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx; +Cc: thomas.hellstrom


On 25/01/2021 14:01, Chris Wilson wrote:
> In the process of preparing to reuse the request submission logic for
> other backends, lift it out of the execlists backend. It already
> operates on the common structs, so just a matter of moving and renaming.
> 
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> ---
>   .../drm/i915/gt/intel_execlists_submission.c  | 55 +------------
>   .../gpu/drm/i915/gt/uc/intel_guc_submission.c | 30 +------
>   drivers/gpu/drm/i915/i915_scheduler.c         | 82 +++++++++++++++++++
>   drivers/gpu/drm/i915/i915_scheduler.h         |  2 +
>   4 files changed, 86 insertions(+), 83 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
> index 309fb421ff5c..e6acdd8dc361 100644
> --- a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
> +++ b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
> @@ -2404,59 +2404,6 @@ static void execlists_preempt(struct timer_list *timer)
>   	execlists_kick(timer, preempt);
>   }
>   
> -static void queue_request(struct intel_engine_cs *engine,
> -			  struct i915_request *rq)
> -{
> -	GEM_BUG_ON(!list_empty(&rq->sched.link));
> -	list_add_tail(&rq->sched.link,
> -		      i915_sched_lookup_priolist(engine, rq_prio(rq)));
> -	set_bit(I915_FENCE_FLAG_PQUEUE, &rq->fence.flags);
> -}
> -
> -static bool submit_queue(struct intel_engine_cs *engine,
> -			 const struct i915_request *rq)
> -{
> -	struct intel_engine_execlists *execlists = &engine->execlists;
> -
> -	if (rq_prio(rq) <= execlists->queue_priority_hint)
> -		return false;
> -
> -	execlists->queue_priority_hint = rq_prio(rq);
> -	return true;
> -}
> -
> -static bool ancestor_on_hold(const struct intel_engine_cs *engine,
> -			     const struct i915_request *rq)
> -{
> -	GEM_BUG_ON(i915_request_on_hold(rq));
> -	return !list_empty(&engine->active.hold) && hold_request(rq);
> -}
> -
> -static void execlists_submit_request(struct i915_request *request)
> -{
> -	struct intel_engine_cs *engine = request->engine;
> -	unsigned long flags;
> -
> -	/* Will be called from irq-context when using foreign fences. */
> -	spin_lock_irqsave(&engine->active.lock, flags);
> -
> -	if (unlikely(ancestor_on_hold(engine, request))) {
> -		RQ_TRACE(request, "ancestor on hold\n");
> -		list_add_tail(&request->sched.link, &engine->active.hold);
> -		i915_request_set_hold(request);
> -	} else {
> -		queue_request(engine, request);
> -
> -		GEM_BUG_ON(RB_EMPTY_ROOT(&engine->execlists.queue.rb_root));
> -		GEM_BUG_ON(list_empty(&request->sched.link));
> -
> -		if (submit_queue(engine, request))
> -			__execlists_kick(&engine->execlists);
> -	}
> -
> -	spin_unlock_irqrestore(&engine->active.lock, flags);
> -}
> -
>   static int execlists_context_pre_pin(struct intel_context *ce,
>   				     struct i915_gem_ww_ctx *ww,
>   				     void **vaddr)
> @@ -3072,7 +3019,7 @@ static bool can_preempt(struct intel_engine_cs *engine)
>   
>   static void execlists_set_default_submission(struct intel_engine_cs *engine)
>   {
> -	engine->submit_request = execlists_submit_request;
> +	engine->submit_request = i915_request_enqueue;
>   	engine->execlists.tasklet.func = execlists_submission_tasklet;
>   
>   	engine->reset.prepare = execlists_reset_prepare;
> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> index 53cf68e240c3..4f1eee4fbfb2 100644
> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> @@ -510,34 +510,6 @@ static int guc_request_alloc(struct i915_request *request)
>   	return 0;
>   }
>   
> -static inline void queue_request(struct intel_engine_cs *engine,
> -				 struct i915_request *rq,
> -				 int prio)
> -{
> -	GEM_BUG_ON(!list_empty(&rq->sched.link));
> -	list_add_tail(&rq->sched.link,
> -		      i915_sched_lookup_priolist(engine, prio));
> -	set_bit(I915_FENCE_FLAG_PQUEUE, &rq->fence.flags);
> -}
> -
> -static void guc_submit_request(struct i915_request *rq)
> -{
> -	struct intel_engine_cs *engine = rq->engine;
> -	unsigned long flags;
> -
> -	/* Will be called from irq-context when using foreign fences. */
> -	spin_lock_irqsave(&engine->active.lock, flags);
> -
> -	queue_request(engine, rq, rq_prio(rq));
> -
> -	GEM_BUG_ON(RB_EMPTY_ROOT(&engine->execlists.queue.rb_root));
> -	GEM_BUG_ON(list_empty(&rq->sched.link));
> -
> -	tasklet_hi_schedule(&engine->execlists.tasklet);
> -
> -	spin_unlock_irqrestore(&engine->active.lock, flags);
> -}
> -
>   static void sanitize_hwsp(struct intel_engine_cs *engine)
>   {
>   	struct intel_timeline *tl;
> @@ -606,7 +578,7 @@ static int guc_resume(struct intel_engine_cs *engine)
>   
>   static void guc_set_default_submission(struct intel_engine_cs *engine)
>   {
> -	engine->submit_request = guc_submit_request;
> +	engine->submit_request = i915_request_enqueue;
>   	engine->execlists.tasklet.func = guc_submission_tasklet;
>   
>   	engine->reset.prepare = guc_reset_prepare;
> diff --git a/drivers/gpu/drm/i915/i915_scheduler.c b/drivers/gpu/drm/i915/i915_scheduler.c
> index 9139a91f0aa3..3f5fc03908dc 100644
> --- a/drivers/gpu/drm/i915/i915_scheduler.c
> +++ b/drivers/gpu/drm/i915/i915_scheduler.c
> @@ -448,6 +448,88 @@ void i915_request_set_priority(struct i915_request *rq, int prio)
>   	spin_unlock_irqrestore(&engine->active.lock, flags);
>   }
>   
> +static void queue_request(struct intel_engine_cs *engine,
> +			  struct i915_request *rq)
> +{
> +	GEM_BUG_ON(!list_empty(&rq->sched.link));
> +	list_add_tail(&rq->sched.link,
> +		      i915_sched_lookup_priolist(engine, rq_prio(rq)));
> +	set_bit(I915_FENCE_FLAG_PQUEUE, &rq->fence.flags);
> +}
> +
> +static bool submit_queue(struct intel_engine_cs *engine,
> +			 const struct i915_request *rq)
> +{
> +	struct intel_engine_execlists *execlists = &engine->execlists;
> +
> +	if (rq_prio(rq) <= execlists->queue_priority_hint)
> +		return false;
> +
> +	execlists->queue_priority_hint = rq_prio(rq);
> +	return true;
> +}
> +
> +static bool hold_request(const struct i915_request *rq)
> +{
> +	struct i915_dependency *p;
> +	bool result = false;
> +
> +	/*
> +	 * If one of our ancestors is on hold, we must also be put on hold,
> +	 * otherwise we will bypass it and execute before it.
> +	 */
> +	rcu_read_lock();
> +	for_each_signaler(p, rq) {
> +		const struct i915_request *s =
> +			container_of(p->signaler, typeof(*s), sched);
> +
> +		if (s->engine != rq->engine)
> +			continue;
> +
> +		result = i915_request_on_hold(s);
> +		if (result)
> +			break;
> +	}
> +	rcu_read_unlock();
> +
> +	return result;
> +}
> +
> +static bool ancestor_on_hold(const struct intel_engine_cs *engine,
> +			     const struct i915_request *rq)
> +{
> +	GEM_BUG_ON(i915_request_on_hold(rq));
> +	return unlikely(!list_empty(&engine->active.hold)) && hold_request(rq);
> +}
> +
> +void i915_request_enqueue(struct i915_request *rq)
> +{
> +	struct intel_engine_cs *engine = rq->engine;
> +	unsigned long flags;
> +	bool kick = false;
> +
> +	/* Will be called from irq-context when using foreign fences. */
> +	spin_lock_irqsave(&engine->active.lock, flags);
> +	GEM_BUG_ON(test_bit(I915_FENCE_FLAG_PQUEUE, &rq->fence.flags));
> +
> +	if (unlikely(ancestor_on_hold(engine, rq))) {
> +		RQ_TRACE(rq, "ancestor on hold\n");
> +		list_add_tail(&rq->sched.link, &engine->active.hold);
> +		i915_request_set_hold(rq);
> +	} else {
> +		queue_request(engine, rq);
> +
> +		GEM_BUG_ON(RB_EMPTY_ROOT(&engine->execlists.queue.rb_root));
> +
> +		kick = submit_queue(engine, rq);
> +	}
> +
> +	GEM_BUG_ON(list_empty(&rq->sched.link));
> +	spin_unlock_irqrestore(&engine->active.lock, flags);
> +	if (kick)
> +		tasklet_hi_schedule(&engine->execlists.tasklet);
> +}
> +
>   void i915_sched_node_init(struct i915_sched_node *node)
>   {
>   	spin_lock_init(&node->lock);
> diff --git a/drivers/gpu/drm/i915/i915_scheduler.h b/drivers/gpu/drm/i915/i915_scheduler.h
> index 5be7f90e7896..c4c086d56f81 100644
> --- a/drivers/gpu/drm/i915/i915_scheduler.h
> +++ b/drivers/gpu/drm/i915/i915_scheduler.h
> @@ -39,6 +39,8 @@ void i915_sched_init_ipi(struct i915_sched_ipi *ipi);
>   
>   void i915_request_set_priority(struct i915_request *request, int prio);
>   
> +void i915_request_enqueue(struct i915_request *request);
> +
>   struct list_head *
>   i915_sched_lookup_priolist(struct intel_engine_cs *engine, int prio);
>   
> 

Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Regards,

Tvrtko
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


* Re: [Intel-gfx] [PATCH 08/41] drm/i915: Improve DFS for priority inheritance
  2021-01-26 16:26     ` Chris Wilson
@ 2021-01-26 16:42       ` Tvrtko Ursulin
  2021-01-26 16:51         ` Tvrtko Ursulin
  2021-01-26 16:51         ` Chris Wilson
  0 siblings, 2 replies; 90+ messages in thread
From: Tvrtko Ursulin @ 2021-01-26 16:42 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx; +Cc: thomas.hellstrom


On 26/01/2021 16:26, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2021-01-26 16:22:58)
>>
>>
>> On 25/01/2021 14:01, Chris Wilson wrote:
>>> The core of the scheduling algorithm is that we compute the topological
>>> order of the fence DAG. Knowing that we have a DAG, we should be able to
>>> use a DFS to compute the topological sort in linear time. However,
>>> during the conversion of the recursive algorithm into an iterative one,
>>> the memoization of how far we had progressed down a branch was
>>> forgotten. The result was that instead of running in linear time, it was
>>> running in geometric time and could easily run for a few hundred
>>> milliseconds given a wide enough graph, not the microseconds as required.
>>>
>>> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
>>> ---
>>>    drivers/gpu/drm/i915/i915_scheduler.c | 58 ++++++++++++++++-----------
>>>    1 file changed, 34 insertions(+), 24 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/i915/i915_scheduler.c b/drivers/gpu/drm/i915/i915_scheduler.c
>>> index 4802c9b1081d..9139a91f0aa3 100644
>>> --- a/drivers/gpu/drm/i915/i915_scheduler.c
>>> +++ b/drivers/gpu/drm/i915/i915_scheduler.c
>>> @@ -234,6 +234,26 @@ void __i915_priolist_free(struct i915_priolist *p)
>>>        kmem_cache_free(global.slab_priorities, p);
>>>    }
>>>    
>>> +static struct i915_request *
>>> +stack_push(struct i915_request *rq,
>>> +        struct i915_request *stack,
>>> +        struct list_head *pos)
>>> +{
>>> +     stack->sched.dfs.prev = pos;
>>> +     rq->sched.dfs.next = (struct list_head *)stack;
>>> +     return rq;
>>> +}
>>> +
>>> +static struct i915_request *
>>> +stack_pop(struct i915_request *rq,
>>> +       struct list_head **pos)
>>> +{
>>> +     rq = (struct i915_request *)rq->sched.dfs.next;
>>> +     if (rq)
>>> +             *pos = rq->sched.dfs.prev;
>>> +     return rq;
>>> +}
>>> +
>>>    static inline bool need_preempt(int prio, int active)
>>>    {
>>>        /*
>>> @@ -298,11 +318,10 @@ static void ipi_priority(struct i915_request *rq, int prio)
>>>    static void __i915_request_set_priority(struct i915_request *rq, int prio)
>>>    {
>>>        struct intel_engine_cs *engine = rq->engine;
>>> -     struct i915_request *rn;
>>> +     struct list_head *pos = &rq->sched.signalers_list;
>>>        struct list_head *plist;
>>> -     LIST_HEAD(dfs);
>>>    
>>> -     list_add(&rq->sched.dfs, &dfs);
>>> +     plist = i915_sched_lookup_priolist(engine, prio);
>>>    
>>>        /*
>>>         * Recursively bump all dependent priorities to match the new request.
>>> @@ -322,40 +341,31 @@ static void __i915_request_set_priority(struct i915_request *rq, int prio)
>>>         * end result is a topological list of requests in reverse order, the
>>>         * last element in the list is the request we must execute first.
>>>         */
>>> -     list_for_each_entry(rq, &dfs, sched.dfs) {
>>> -             struct i915_dependency *p;
>>> -
>>> -             /* Also release any children on this engine that are ready */
>>> -             GEM_BUG_ON(rq->engine != engine);
>>> -
>>> -             for_each_signaler(p, rq) {
>>> +     rq->sched.dfs.next = NULL;
>>> +     do {
>>> +             list_for_each_continue(pos, &rq->sched.signalers_list) {
>>> +                     struct i915_dependency *p =
>>> +                             list_entry(pos, typeof(*p), signal_link);
>>>                        struct i915_request *s =
>>>                                container_of(p->signaler, typeof(*s), sched);
>>>    
>>> -                     GEM_BUG_ON(s == rq);
>>> -
>>>                        if (rq_prio(s) >= prio)
>>>                                continue;
>>>    
>>>                        if (__i915_request_is_complete(s))
>>>                                continue;
>>>    
>>> -                     if (s->engine != rq->engine) {
>>> +                     if (s->engine != engine) {
>>>                                ipi_priority(s, prio);
>>>                                continue;
>>>                        }
>>>    
>>> -                     list_move_tail(&s->sched.dfs, &dfs);
>>> +                     /* Remember our position along this branch */
>>> +                     rq = stack_push(s, rq, pos);
>>> +                     pos = &rq->sched.signalers_list;
>>>                }
>>> -     }
>>>    
>>> -     plist = i915_sched_lookup_priolist(engine, prio);
>>> -
>>> -     /* Fifo and depth-first replacement ensure our deps execute first */
>>> -     list_for_each_entry_safe_reverse(rq, rn, &dfs, sched.dfs) {
>>> -             GEM_BUG_ON(rq->engine != engine);
>>> -
>>> -             INIT_LIST_HEAD(&rq->sched.dfs);
>>> +             RQ_TRACE(rq, "set-priority:%d\n", prio);
>>>                WRITE_ONCE(rq->sched.attr.priority, prio);
>>>    
>>>                /*
>>> @@ -369,12 +379,13 @@ static void __i915_request_set_priority(struct i915_request *rq, int prio)
>>>                if (!i915_request_is_ready(rq))
>>>                        continue;
>>>    
>>> +             GEM_BUG_ON(rq->engine != engine);
>>>                if (i915_request_in_priority_queue(rq))
>>>                        list_move_tail(&rq->sched.link, plist);
>>>    
>>>                /* Defer (tasklet) submission until after all updates. */
>>>                kick_submission(engine, rq, prio);
>>> -     }
>>> +     } while ((rq = stack_pop(rq, &pos)));
>>>    }
>>>    
>>>    void i915_request_set_priority(struct i915_request *rq, int prio)
>>> @@ -444,7 +455,6 @@ void i915_sched_node_init(struct i915_sched_node *node)
>>>        INIT_LIST_HEAD(&node->signalers_list);
>>>        INIT_LIST_HEAD(&node->waiters_list);
>>>        INIT_LIST_HEAD(&node->link);
>>> -     INIT_LIST_HEAD(&node->dfs);
>>>    
>>>        node->ipi_link = NULL;
>>>    
>>>
>>
>> Pen and paper was needed here but it looks good.
> 
> If you highlight the areas that need more commentary, I guess
> a theory-of-operation for stack_push/stack_pop?

At some point I wanted to suggest replacing the dfs list_head abuse with 
an explicit request pointer and list_head pointer, to better represent 
that two pieces of information are tracked in there.

In terms of commentary, I don't really know. Perhaps it could be made 
clearer with some code restructuring alone; for instance, a new data 
structure like i915_request_stack might work:

struct i915_request_stack {
	struct i915_request *prev;
	struct list_head *pos;
};

Push and pop would then operate on three distinct data types for clarity, 
with the request stack embedded in the request. I haven't thought it 
through enough to be sure it works, so treat it as a maybe.
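
A minimal userspace sketch of that idea, in case it helps the discussion. 
Everything in it is an assumption for illustration: struct request, 
struct request_stack, struct dependency, set_priority() and run_diamond() 
are hypothetical stand-ins rather than the driver's types, and the list 
helpers are cut-down copies of the kernel ones. It only mirrors the shape 
of the loop in the patch, with the two stack fields given explicit types 
instead of reusing list_head.next/prev.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-ins for the kernel list primitives. */
struct list_head { struct list_head *next, *prev; };

#define LIST_HEAD_INIT(name) { &(name), &(name) }
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

static void list_add_tail(struct list_head *n, struct list_head *h)
{
	n->prev = h->prev;
	n->next = h;
	h->prev->next = n;
	h->prev = n;
}

/* Hypothetical explicit stack entry, replacing the reused list_head. */
struct request_stack {
	struct request *prev;	/* request that pushed us; NULL at the root */
	struct list_head *pos;	/* our own saved position in signalers_list */
};

struct request {
	const char *name;
	int prio;
	struct list_head signalers_list;
	struct request_stack stack;
};

struct dependency {
	struct list_head signal_link;	/* node in the waiter's signalers_list */
	struct request *signaler;
};

static struct request *
stack_push(struct request *rq, struct request *stack, struct list_head *pos)
{
	stack->stack.pos = pos;	/* parent records how far it has iterated */
	rq->stack.prev = stack;	/* child records which request to return to */
	return rq;
}

static struct request *
stack_pop(struct request *rq, struct list_head **pos)
{
	rq = rq->stack.prev;
	if (rq)
		*pos = rq->stack.pos;
	return rq;
}

/* Bump rq and its lower-priority signalers; record the visiting order. */
static int set_priority(struct request *rq, int prio, const char **order)
{
	struct list_head *pos = &rq->signalers_list;
	int n = 0;

	rq->stack.prev = NULL;
	do {
		/* Open-coded list_for_each_continue(): resume after pos. */
		for (pos = pos->next; pos != &rq->signalers_list; pos = pos->next) {
			struct dependency *p =
				container_of(pos, struct dependency, signal_link);
			struct request *s = p->signaler;

			assert(s != rq);
			if (s->prio >= prio)
				continue;	/* already bumped, acts as "visited" */

			rq = stack_push(s, rq, pos);
			pos = &rq->signalers_list;
		}

		rq->prio = prio;
		order[n++] = rq->name;
	} while ((rq = stack_pop(rq, &pos)));

	return n;
}

/* Diamond: A waits on B and C, which both wait on D. */
static int run_diamond(const char **order)
{
	struct request a = { "A", 0, LIST_HEAD_INIT(a.signalers_list) };
	struct request b = { "B", 0, LIST_HEAD_INIT(b.signalers_list) };
	struct request c = { "C", 0, LIST_HEAD_INIT(c.signalers_list) };
	struct request d = { "D", 0, LIST_HEAD_INIT(d.signalers_list) };
	struct dependency ab = { .signaler = &b }, ac = { .signaler = &c };
	struct dependency bd = { .signaler = &d }, cd = { .signaler = &d };

	list_add_tail(&ab.signal_link, &a.signalers_list);
	list_add_tail(&ac.signal_link, &a.signalers_list);
	list_add_tail(&bd.signal_link, &b.signalers_list);
	list_add_tail(&cd.signal_link, &c.signalers_list);
	return set_priority(&a, 1, order);
}
```

Calling run_diamond() with a four-entry array visits each request once, 
in the order D, B, C, A: every signaler is recorded before its waiters, 
and D is skipped the second time because its priority was already bumped.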

Regards,

Tvrtko

* Re: [Intel-gfx] [PATCH 08/41] drm/i915: Improve DFS for priority inheritance
  2021-01-26 16:42       ` Tvrtko Ursulin
@ 2021-01-26 16:51         ` Tvrtko Ursulin
  2021-01-26 16:51         ` Chris Wilson
  1 sibling, 0 replies; 90+ messages in thread
From: Tvrtko Ursulin @ 2021-01-26 16:51 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx; +Cc: thomas.hellstrom


On 26/01/2021 16:42, Tvrtko Ursulin wrote:
> 
> On 26/01/2021 16:26, Chris Wilson wrote:
>> Quoting Tvrtko Ursulin (2021-01-26 16:22:58)
>>>
>>>
>>> Pen and paper was needed here but it looks good.
>>
>> If you highlight the areas that need more commentary, I guess
>> a theory-of-operation for stack_push/stack_pop?
> 
> At some point I wanted to suggest you change dfs.list_head abuse to 
> explicit rq and list head pointer to better represent how there are two 
> pieces of information tracked in there.
> 
> In terms of commentary don't know really. Perhaps it could be made 
> clearer just with some code re-structure, for instance maybe a new data 
> structure like i915_request_stack would work like:
> 
> struct i915_request_stack {
>      struct i915_request *prev;
>      struct list_head *pos;
> };
> 
> And then push and pop operate on three distinct data types for clarity, 
> request stack being embedded in request. I haven't really thought it 
> through to be sure it works so just maybe.

Ah, I remember why I did not suggest this: to avoid wasting one pointer, 
because of:

struct list_head {
         struct list_head *next, *prev;
};

There isn't anything for just one.

Regards,

Tvrtko

* Re: [Intel-gfx] [PATCH 08/41] drm/i915: Improve DFS for priority inheritance
  2021-01-26 16:42       ` Tvrtko Ursulin
  2021-01-26 16:51         ` Tvrtko Ursulin
@ 2021-01-26 16:51         ` Chris Wilson
  1 sibling, 0 replies; 90+ messages in thread
From: Chris Wilson @ 2021-01-26 16:51 UTC (permalink / raw)
  To: Tvrtko Ursulin, intel-gfx; +Cc: thomas.hellstrom

Quoting Tvrtko Ursulin (2021-01-26 16:42:37)
> 
> On 26/01/2021 16:26, Chris Wilson wrote:
> > Quoting Tvrtko Ursulin (2021-01-26 16:22:58)
> >>
> >>
> >> Pen and paper was needed here but it looks good.
> > 
> > If you highlight the areas that need more commentary, I guess
> > a theory-of-operation for stack_push/stack_pop?
> 
> At some point I wanted to suggest you change dfs.list_head abuse to 
> explicit rq and list head pointer to better represent how there are two 
> pieces of information tracked in there.

Ok. While writing it I thought some places continued to use it as a
struct list_head, but it appears that this is the only user.
I'll give it a whirl.
-Chris

* Re: [Intel-gfx] [PATCH 18/41] drm/i915: Move tasklet from execlists to sched
  2021-01-25 14:01 ` [Intel-gfx] [PATCH 18/41] drm/i915: Move tasklet from execlists to sched Chris Wilson
@ 2021-01-27 14:10   ` Tvrtko Ursulin
  2021-01-27 14:24     ` Chris Wilson
  0 siblings, 1 reply; 90+ messages in thread
From: Tvrtko Ursulin @ 2021-01-27 14:10 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx; +Cc: thomas.hellstrom


+ Matt, to check how this patch, and a few before it in this series, 
fit with GuC.

The split between physical and scheduling engine (i915_sched_engine) 
makes sense to me. Gut feeling says it should work for GuC as well, in 
principle.

A small comment or two below:

On 25/01/2021 14:01, Chris Wilson wrote:
> Move the scheduling tasklets out of the execlists backend into the
> per-engine scheduling bookkeeping.
> 
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> ---
>   drivers/gpu/drm/i915/gt/intel_engine.h        | 14 ----
>   drivers/gpu/drm/i915/gt/intel_engine_cs.c     | 11 ++--
>   drivers/gpu/drm/i915/gt/intel_engine_types.h  |  5 --
>   .../drm/i915/gt/intel_execlists_submission.c  | 65 +++++++++----------
>   drivers/gpu/drm/i915/gt/intel_gt_irq.c        |  2 +-
>   drivers/gpu/drm/i915/gt/selftest_execlists.c  | 16 ++---
>   drivers/gpu/drm/i915/gt/selftest_hangcheck.c  |  2 +-
>   drivers/gpu/drm/i915/gt/selftest_lrc.c        |  6 +-
>   drivers/gpu/drm/i915/gt/selftest_reset.c      |  2 +-
>   .../gpu/drm/i915/gt/uc/intel_guc_submission.c | 18 ++---
>   drivers/gpu/drm/i915/i915_scheduler.c         | 14 ++--
>   drivers/gpu/drm/i915/i915_scheduler.h         | 20 ++++++
>   drivers/gpu/drm/i915/i915_scheduler_types.h   |  6 ++
>   .../gpu/drm/i915/selftests/i915_scheduler.c   | 16 ++---
>   14 files changed, 99 insertions(+), 98 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/intel_engine.h b/drivers/gpu/drm/i915/gt/intel_engine.h
> index 20974415e7d8..801ae54cf60d 100644
> --- a/drivers/gpu/drm/i915/gt/intel_engine.h
> +++ b/drivers/gpu/drm/i915/gt/intel_engine.h
> @@ -122,20 +122,6 @@ execlists_active(const struct intel_engine_execlists *execlists)
>   	return active;
>   }
>   
> -static inline void
> -execlists_active_lock_bh(struct intel_engine_execlists *execlists)
> -{
> -	local_bh_disable(); /* prevent local softirq and lock recursion */
> -	tasklet_lock(&execlists->tasklet);
> -}
> -
> -static inline void
> -execlists_active_unlock_bh(struct intel_engine_execlists *execlists)
> -{
> -	tasklet_unlock(&execlists->tasklet);
> -	local_bh_enable(); /* restore softirq, and kick ksoftirqd! */
> -}
> -
>   static inline u32
>   intel_read_status_page(const struct intel_engine_cs *engine, int reg)
>   {
> diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> index ef225da35399..cdd07aeada05 100644
> --- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> +++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> @@ -902,7 +902,6 @@ int intel_engines_init(struct intel_gt *gt)
>   void intel_engine_cleanup_common(struct intel_engine_cs *engine)
>   {
>   	i915_sched_fini_engine(&engine->active);
> -	tasklet_kill(&engine->execlists.tasklet); /* flush the callback */
>   
>   	intel_breadcrumbs_free(engine->breadcrumbs);
>   
> @@ -1187,7 +1186,7 @@ static bool ring_is_idle(struct intel_engine_cs *engine)
>   
>   void __intel_engine_flush_submission(struct intel_engine_cs *engine, bool sync)
>   {
> -	struct tasklet_struct *t = &engine->execlists.tasklet;
> +	struct tasklet_struct *t = &engine->active.tasklet;
>   
>   	if (!t->func)
>   		return;
> @@ -1454,8 +1453,8 @@ static void intel_engine_print_registers(struct intel_engine_cs *engine,
>   
>   		drm_printf(m, "\tExeclist tasklet queued? %s (%s), preempt? %s, timeslice? %s\n",
>   			   yesno(test_bit(TASKLET_STATE_SCHED,
> -					  &engine->execlists.tasklet.state)),
> -			   enableddisabled(!atomic_read(&engine->execlists.tasklet.count)),
> +					  &engine->active.tasklet.state)),
> +			   enableddisabled(!atomic_read(&engine->active.tasklet.count)),
>   			   repr_timer(&engine->execlists.preempt),
>   			   repr_timer(&engine->execlists.timer));
>   
> @@ -1479,7 +1478,7 @@ static void intel_engine_print_registers(struct intel_engine_cs *engine,
>   				   idx, hws[idx * 2], hws[idx * 2 + 1]);
>   		}
>   
> -		execlists_active_lock_bh(execlists);
> +		i915_sched_lock_bh(&engine->active);
>   		rcu_read_lock();
>   		for (port = execlists->active; (rq = *port); port++) {
>   			char hdr[160];
> @@ -1510,7 +1509,7 @@ static void intel_engine_print_registers(struct intel_engine_cs *engine,
>   			i915_request_show(m, rq, hdr, 0);
>   		}
>   		rcu_read_unlock();
> -		execlists_active_unlock_bh(execlists);
> +		i915_sched_unlock_bh(&engine->active);
>   	} else if (INTEL_GEN(dev_priv) > 6) {
>   		drm_printf(m, "\tPP_DIR_BASE: 0x%08x\n",
>   			   ENGINE_READ(engine, RING_PP_DIR_BASE));
> diff --git a/drivers/gpu/drm/i915/gt/intel_engine_types.h b/drivers/gpu/drm/i915/gt/intel_engine_types.h
> index c46d70b7e484..76d561c2c6aa 100644
> --- a/drivers/gpu/drm/i915/gt/intel_engine_types.h
> +++ b/drivers/gpu/drm/i915/gt/intel_engine_types.h
> @@ -138,11 +138,6 @@ struct st_preempt_hang {
>    * driver and the hardware state for execlist mode of submission.
>    */
>   struct intel_engine_execlists {
> -	/**
> -	 * @tasklet: softirq tasklet for bottom handler
> -	 */
> -	struct tasklet_struct tasklet;
> -
>   	/**
>   	 * @timer: kick the current context if its timeslice expires
>   	 */
> diff --git a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
> index 756ac388a4a8..1103c8a00af1 100644
> --- a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
> +++ b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
> @@ -513,7 +513,7 @@ static void kick_siblings(struct i915_request *rq, struct intel_context *ce)
>   		resubmit_virtual_request(rq, ve);
>   
>   	if (READ_ONCE(ve->request))
> -		tasklet_hi_schedule(&ve->base.execlists.tasklet);
> +		i915_sched_kick(&ve->base.active);

i915_sched_ or i915_sched_engine_ ?

>   }
>   
>   static void __execlists_schedule_out(struct i915_request * const rq,
> @@ -679,10 +679,9 @@ trace_ports(const struct intel_engine_execlists *execlists,
>   		     dump_port(p1, sizeof(p1), ", ", ports[1]));
>   }
>   
> -static bool
> -reset_in_progress(const struct intel_engine_execlists *execlists)
> +static bool reset_in_progress(const struct intel_engine_cs *engine)
>   {
> -	return unlikely(!__tasklet_is_enabled(&execlists->tasklet));
> +	return unlikely(!__tasklet_is_enabled(&engine->active.tasklet));
>   }
>   
>   static __maybe_unused noinline bool
> @@ -699,7 +698,7 @@ assert_pending_valid(const struct intel_engine_execlists *execlists,
>   	trace_ports(execlists, msg, execlists->pending);
>   
>   	/* We may be messing around with the lists during reset, lalala */
> -	if (reset_in_progress(execlists))
> +	if (reset_in_progress(engine))
>   		return true;
>   
>   	if (!execlists->pending[0]) {
> @@ -1084,7 +1083,7 @@ static void start_timeslice(struct intel_engine_cs *engine)
>   			 * its timeslice, so recheck.
>   			 */
>   			if (!timer_pending(&el->timer))
> -				tasklet_hi_schedule(&el->tasklet);
> +				i915_sched_kick(&engine->active);
>   			return;
>   		}
>   
> @@ -1664,8 +1663,8 @@ process_csb(struct intel_engine_cs *engine, struct i915_request **inactive)
>   	 * access. Either we are inside the tasklet, or the tasklet is disabled
>   	 * and we assume that is only inside the reset paths and so serialised.
>   	 */
> -	GEM_BUG_ON(!tasklet_is_locked(&execlists->tasklet) &&
> -		   !reset_in_progress(execlists));
> +	GEM_BUG_ON(!tasklet_is_locked(&engine->active.tasklet) &&
> +		   !reset_in_progress(engine));
>   	GEM_BUG_ON(!intel_engine_in_execlists_submission_mode(engine));
>   
>   	/*
> @@ -2077,13 +2076,13 @@ static noinline void execlists_reset(struct intel_engine_cs *engine)
>   	ENGINE_TRACE(engine, "reset for %s\n", msg);
>   
>   	/* Mark this tasklet as disabled to avoid waiting for it to complete */
> -	tasklet_disable_nosync(&engine->execlists.tasklet);
> +	tasklet_disable_nosync(&engine->active.tasklet);
>   
>   	ring_set_paused(engine, 1); /* Freeze the current request in place */
>   	execlists_capture(engine);
>   	intel_engine_reset(engine, msg);
>   
> -	tasklet_enable(&engine->execlists.tasklet);
> +	tasklet_enable(&engine->active.tasklet);

Maybe all access to the tasklet from the backend should go via 
i915_sched_ helpers to complete the separation? And with some generic 
naming, in case we don't want to trumpet that it is a tasklet but rather 
some higher-level concept. Something like schedule_enable/disable, I 
don't know. It also depends on how this plugs into the GuC.
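
To make the suggestion concrete, here is a rough userspace sketch of what
such wrappers might look like. Every name in it (mock_sched,
mock_sched_disable() and so on) is hypothetical and is not an existing
i915 helper; the counter only imitates the role of the tasklet's disable
count, so treat it as an illustration rather than a proposal for the
actual interface.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the per-engine scheduler bookkeeping. */
struct mock_sched {
	int disable_count; /* imitates the tasklet's disable count */
};

/* Backend-facing helpers that hide the fact that a tasklet sits underneath. */
static inline void mock_sched_disable(struct mock_sched *se)
{
	se->disable_count++;
}

static inline void mock_sched_enable(struct mock_sched *se)
{
	assert(se->disable_count > 0); /* enable must pair with a disable */
	se->disable_count--;
}

static inline bool mock_sched_is_enabled(const struct mock_sched *se)
{
	return se->disable_count == 0;
}

/* Same test as reset_in_progress() in the patch, minus the tasklet name. */
static inline bool mock_reset_in_progress(const struct mock_sched *se)
{
	return !mock_sched_is_enabled(se);
}
```

With wrappers of this shape the backend would only ever ask "is scheduling
enabled?", and whether that is implemented by a tasklet could change
without touching the callers.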

Just this really, code itself looks clean enough.

Regards,

Tvrtko
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: [Intel-gfx] [PATCH 19/41] drm/i915/gt: Show scheduler queues when dumping state
  2021-01-25 14:01 ` [Intel-gfx] [PATCH 19/41] drm/i915/gt: Show scheduler queues when dumping state Chris Wilson
@ 2021-01-27 14:13   ` Tvrtko Ursulin
  2021-01-27 14:35     ` Chris Wilson
  0 siblings, 1 reply; 90+ messages in thread
From: Tvrtko Ursulin @ 2021-01-27 14:13 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx; +Cc: thomas.hellstrom


On 25/01/2021 14:01, Chris Wilson wrote:
> Move the scheduler pretty printer from out of the execlists state to
> match its more common location.
> 
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> ---
>   drivers/gpu/drm/i915/gt/intel_engine_cs.c | 34 +++++++++++++----------
>   1 file changed, 19 insertions(+), 15 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> index cdd07aeada05..2f9a8960144b 100644
> --- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> +++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> @@ -1443,20 +1443,15 @@ static void intel_engine_print_registers(struct intel_engine_cs *engine,
>   
>   	if (intel_engine_in_guc_submission_mode(engine)) {
>   		/* nothing to print yet */
> -	} else if (HAS_EXECLISTS(dev_priv)) {
> -		struct i915_request * const *port, *rq;
>   		const u32 *hws =
>   			&engine->status_page.addr[I915_HWS_CSB_BUF0_INDEX];
>   		const u8 num_entries = execlists->csb_size;
>   		unsigned int idx;
>   		u8 read, write;
>   
> -		drm_printf(m, "\tExeclist tasklet queued? %s (%s), preempt? %s, timeslice? %s\n",
> -			   yesno(test_bit(TASKLET_STATE_SCHED,
> -					  &engine->active.tasklet.state)),
> -			   enableddisabled(!atomic_read(&engine->active.tasklet.count)),
> -			   repr_timer(&engine->execlists.preempt),
> -			   repr_timer(&engine->execlists.timer));
> +		drm_printf(m, "\tExeclists preempt? %s, timeslice? %s\n",
> +			   repr_timer(&execlists->preempt),
> +			   repr_timer(&execlists->timer));
>   
>   		read = execlists->csb_head;
>   		write = READ_ONCE(*execlists->csb_write);
> @@ -1477,6 +1472,22 @@ static void intel_engine_print_registers(struct intel_engine_cs *engine,
>   			drm_printf(m, "\tExeclist CSB[%d]: 0x%08x, context: %d\n",
>   				   idx, hws[idx * 2], hws[idx * 2 + 1]);
>   		}
> +	} else if (INTEL_GEN(dev_priv) > 6) {
> +		drm_printf(m, "\tPP_DIR_BASE: 0x%08x\n",
> +			   ENGINE_READ(engine, RING_PP_DIR_BASE));
> +		drm_printf(m, "\tPP_DIR_BASE_READ: 0x%08x\n",
> +			   ENGINE_READ(engine, RING_PP_DIR_BASE_READ));
> +		drm_printf(m, "\tPP_DIR_DCLV: 0x%08x\n",
> +			   ENGINE_READ(engine, RING_PP_DIR_DCLV));
> +	}
> +
> +	if (engine->active.tasklet.func) {
> +		struct i915_request * const *port, *rq;
> +
> +		drm_printf(m, "\tTasklet queued? %s (%s)\n",
> +			   yesno(test_bit(TASKLET_STATE_SCHED,
> +					  &engine->active.tasklet.state)),
> +			   enableddisabled(!atomic_read(&engine->active.tasklet.count)));

Or have i915_sched_print_state() exported? Again, it will depend on how 
clean a split is possible.

Regards,

Tvrtko

>   
>   		i915_sched_lock_bh(&engine->active);
>   		rcu_read_lock();
> @@ -1510,13 +1521,6 @@ static void intel_engine_print_registers(struct intel_engine_cs *engine,
>   		}
>   		rcu_read_unlock();
>   		i915_sched_unlock_bh(&engine->active);
> -	} else if (INTEL_GEN(dev_priv) > 6) {
> -		drm_printf(m, "\tPP_DIR_BASE: 0x%08x\n",
> -			   ENGINE_READ(engine, RING_PP_DIR_BASE));
> -		drm_printf(m, "\tPP_DIR_BASE_READ: 0x%08x\n",
> -			   ENGINE_READ(engine, RING_PP_DIR_BASE_READ));
> -		drm_printf(m, "\tPP_DIR_DCLV: 0x%08x\n",
> -			   ENGINE_READ(engine, RING_PP_DIR_DCLV));
>   	}
>   }
>   
> 

^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: [Intel-gfx] [PATCH 18/41] drm/i915: Move tasklet from execlists to sched
  2021-01-27 14:10   ` Tvrtko Ursulin
@ 2021-01-27 14:24     ` Chris Wilson
  0 siblings, 0 replies; 90+ messages in thread
From: Chris Wilson @ 2021-01-27 14:24 UTC (permalink / raw)
  To: Tvrtko Ursulin, intel-gfx; +Cc: thomas.hellstrom

Quoting Tvrtko Ursulin (2021-01-27 14:10:55)
> 
> + Matt to check on how this fits with GuC. This patch and a few before 
> it in this series.
> 
> The split between physical and scheduling engine (i915_sched_engine) 
> makes sense to me. Gut feeling says it should work for GuC as well, in 
> principle.
> 
> A small comment or two below:
> 
> On 25/01/2021 14:01, Chris Wilson wrote:
> > Move the scheduling tasklists out of the execlists backend into the
> > per-engine scheduling bookkeeping.
> > 
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > ---
> >   drivers/gpu/drm/i915/gt/intel_engine.h        | 14 ----
> >   drivers/gpu/drm/i915/gt/intel_engine_cs.c     | 11 ++--
> >   drivers/gpu/drm/i915/gt/intel_engine_types.h  |  5 --
> >   .../drm/i915/gt/intel_execlists_submission.c  | 65 +++++++++----------
> >   drivers/gpu/drm/i915/gt/intel_gt_irq.c        |  2 +-
> >   drivers/gpu/drm/i915/gt/selftest_execlists.c  | 16 ++---
> >   drivers/gpu/drm/i915/gt/selftest_hangcheck.c  |  2 +-
> >   drivers/gpu/drm/i915/gt/selftest_lrc.c        |  6 +-
> >   drivers/gpu/drm/i915/gt/selftest_reset.c      |  2 +-
> >   .../gpu/drm/i915/gt/uc/intel_guc_submission.c | 18 ++---
> >   drivers/gpu/drm/i915/i915_scheduler.c         | 14 ++--
> >   drivers/gpu/drm/i915/i915_scheduler.h         | 20 ++++++
> >   drivers/gpu/drm/i915/i915_scheduler_types.h   |  6 ++
> >   .../gpu/drm/i915/selftests/i915_scheduler.c   | 16 ++---
> >   14 files changed, 99 insertions(+), 98 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/gt/intel_engine.h b/drivers/gpu/drm/i915/gt/intel_engine.h
> > index 20974415e7d8..801ae54cf60d 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_engine.h
> > +++ b/drivers/gpu/drm/i915/gt/intel_engine.h
> > @@ -122,20 +122,6 @@ execlists_active(const struct intel_engine_execlists *execlists)
> >       return active;
> >   }
> >   
> > -static inline void
> > -execlists_active_lock_bh(struct intel_engine_execlists *execlists)
> > -{
> > -     local_bh_disable(); /* prevent local softirq and lock recursion */
> > -     tasklet_lock(&execlists->tasklet);
> > -}
> > -
> > -static inline void
> > -execlists_active_unlock_bh(struct intel_engine_execlists *execlists)
> > -{
> > -     tasklet_unlock(&execlists->tasklet);
> > -     local_bh_enable(); /* restore softirq, and kick ksoftirqd! */
> > -}
> > -
> >   static inline u32
> >   intel_read_status_page(const struct intel_engine_cs *engine, int reg)
> >   {
> > diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> > index ef225da35399..cdd07aeada05 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> > +++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> > @@ -902,7 +902,6 @@ int intel_engines_init(struct intel_gt *gt)
> >   void intel_engine_cleanup_common(struct intel_engine_cs *engine)
> >   {
> >       i915_sched_fini_engine(&engine->active);
> > -     tasklet_kill(&engine->execlists.tasklet); /* flush the callback */
> >   
> >       intel_breadcrumbs_free(engine->breadcrumbs);
> >   
> > @@ -1187,7 +1186,7 @@ static bool ring_is_idle(struct intel_engine_cs *engine)
> >   
> >   void __intel_engine_flush_submission(struct intel_engine_cs *engine, bool sync)
> >   {
> > -     struct tasklet_struct *t = &engine->execlists.tasklet;
> > +     struct tasklet_struct *t = &engine->active.tasklet;
> >   
> >       if (!t->func)
> >               return;
> > @@ -1454,8 +1453,8 @@ static void intel_engine_print_registers(struct intel_engine_cs *engine,
> >   
> >               drm_printf(m, "\tExeclist tasklet queued? %s (%s), preempt? %s, timeslice? %s\n",
> >                          yesno(test_bit(TASKLET_STATE_SCHED,
> > -                                       &engine->execlists.tasklet.state)),
> > -                        enableddisabled(!atomic_read(&engine->execlists.tasklet.count)),
> > +                                       &engine->active.tasklet.state)),
> > +                        enableddisabled(!atomic_read(&engine->active.tasklet.count)),
> >                          repr_timer(&engine->execlists.preempt),
> >                          repr_timer(&engine->execlists.timer));
> >   
> > @@ -1479,7 +1478,7 @@ static void intel_engine_print_registers(struct intel_engine_cs *engine,
> >                                  idx, hws[idx * 2], hws[idx * 2 + 1]);
> >               }
> >   
> > -             execlists_active_lock_bh(execlists);
> > +             i915_sched_lock_bh(&engine->active);
> >               rcu_read_lock();
> >               for (port = execlists->active; (rq = *port); port++) {
> >                       char hdr[160];
> > @@ -1510,7 +1509,7 @@ static void intel_engine_print_registers(struct intel_engine_cs *engine,
> >                       i915_request_show(m, rq, hdr, 0);
> >               }
> >               rcu_read_unlock();
> > -             execlists_active_unlock_bh(execlists);
> > +             i915_sched_unlock_bh(&engine->active);
> >       } else if (INTEL_GEN(dev_priv) > 6) {
> >               drm_printf(m, "\tPP_DIR_BASE: 0x%08x\n",
> >                          ENGINE_READ(engine, RING_PP_DIR_BASE));
> > diff --git a/drivers/gpu/drm/i915/gt/intel_engine_types.h b/drivers/gpu/drm/i915/gt/intel_engine_types.h
> > index c46d70b7e484..76d561c2c6aa 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_engine_types.h
> > +++ b/drivers/gpu/drm/i915/gt/intel_engine_types.h
> > @@ -138,11 +138,6 @@ struct st_preempt_hang {
> >    * driver and the hardware state for execlist mode of submission.
> >    */
> >   struct intel_engine_execlists {
> > -     /**
> > -      * @tasklet: softirq tasklet for bottom handler
> > -      */
> > -     struct tasklet_struct tasklet;
> > -
> >       /**
> >        * @timer: kick the current context if its timeslice expires
> >        */
> > diff --git a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
> > index 756ac388a4a8..1103c8a00af1 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
> > +++ b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
> > @@ -513,7 +513,7 @@ static void kick_siblings(struct i915_request *rq, struct intel_context *ce)
> >               resubmit_virtual_request(rq, ve);
> >   
> >       if (READ_ONCE(ve->request))
> > -             tasklet_hi_schedule(&ve->base.execlists.tasklet);
> > +             i915_sched_kick(&ve->base.active);
> 
> i915_sched_ or i915_sched_engine_ ?

struct i915_request *
__i915_sched_rewind_requests(struct i915_sched_engine *engine);
void __i915_sched_defer_request(struct i915_sched_engine *engine,
                                struct i915_request *request);

bool __i915_sched_suspend_request(struct i915_sched_engine *engine,
                                  struct i915_request *rq);
void __i915_sched_resume_request(struct i915_sched_engine *engine,
                                 struct i915_request *request);

bool i915_sched_suspend_request(struct i915_sched_engine *engine,
                                struct i915_request *request);
void i915_sched_resume_request(struct i915_sched_engine *engine,
                               struct i915_request *rq);

static inline bool i915_sched_is_idle(const struct i915_sched_engine *se)
{
        return i915_priolist_is_empty(&se->queue);
}

static inline bool
i915_sched_is_last_request(const struct i915_sched_engine *se,
                           const struct i915_request *rq)
{
        return list_is_last_rcu(&rq->sched.link, &se->requests);
}

and a few more. I know it should be object_action, but I wanted to avoid
all that typing. [I'm not even sure if i915_sched_engine is the best name
for aligning with the guc's requirement of a single scheduling entity.
And then the drm_sched uses entity, which roughly aligns with
i915_sched_engine.] Also, I have a patch to replace rq->engine with
rq->sched.engine, and that looks like a good step forward (with just a
small caveat that we will have to move the breadcrumbs again, I think to
an intel_context.breadcrumbs pointer).

Anyway, since this was the primary means by which I was interacting with
the scheduler from execlists/ringscheduler, I wanted conciseness that
avoided all the tautology of engines from within engines.

> >   }
> >   
> >   static void __execlists_schedule_out(struct i915_request * const rq,
> > @@ -679,10 +679,9 @@ trace_ports(const struct intel_engine_execlists *execlists,
> >                    dump_port(p1, sizeof(p1), ", ", ports[1]));
> >   }
> >   
> > -static bool
> > -reset_in_progress(const struct intel_engine_execlists *execlists)
> > +static bool reset_in_progress(const struct intel_engine_cs *engine)
> >   {
> > -     return unlikely(!__tasklet_is_enabled(&execlists->tasklet));
> > +     return unlikely(!__tasklet_is_enabled(&engine->active.tasklet));
> >   }
> >   
> >   static __maybe_unused noinline bool
> > @@ -699,7 +698,7 @@ assert_pending_valid(const struct intel_engine_execlists *execlists,
> >       trace_ports(execlists, msg, execlists->pending);
> >   
> >       /* We may be messing around with the lists during reset, lalala */
> > -     if (reset_in_progress(execlists))
> > +     if (reset_in_progress(engine))
> >               return true;
> >   
> >       if (!execlists->pending[0]) {
> > @@ -1084,7 +1083,7 @@ static void start_timeslice(struct intel_engine_cs *engine)
> >                        * its timeslice, so recheck.
> >                        */
> >                       if (!timer_pending(&el->timer))
> > -                             tasklet_hi_schedule(&el->tasklet);
> > +                             i915_sched_kick(&engine->active);
> >                       return;
> >               }
> >   
> > @@ -1664,8 +1663,8 @@ process_csb(struct intel_engine_cs *engine, struct i915_request **inactive)
> >        * access. Either we are inside the tasklet, or the tasklet is disabled
> >        * and we assume that is only inside the reset paths and so serialised.
> >        */
> > -     GEM_BUG_ON(!tasklet_is_locked(&execlists->tasklet) &&
> > -                !reset_in_progress(execlists));
> > +     GEM_BUG_ON(!tasklet_is_locked(&engine->active.tasklet) &&
> > +                !reset_in_progress(engine));
> >       GEM_BUG_ON(!intel_engine_in_execlists_submission_mode(engine));
> >   
> >       /*
> > @@ -2077,13 +2076,13 @@ static noinline void execlists_reset(struct intel_engine_cs *engine)
> >       ENGINE_TRACE(engine, "reset for %s\n", msg);
> >   
> >       /* Mark this tasklet as disabled to avoid waiting for it to complete */
> > -     tasklet_disable_nosync(&engine->execlists.tasklet);
> > +     tasklet_disable_nosync(&engine->active.tasklet);
> >   
> >       ring_set_paused(engine, 1); /* Freeze the current request in place */
> >       execlists_capture(engine);
> >       intel_engine_reset(engine, msg);
> >   
> > -     tasklet_enable(&engine->execlists.tasklet);
> > +     tasklet_enable(&engine->active.tasklet);
> 
> Maybe all access to the tasklet from the backend should go via 
> i915_sched_ helpers to complete the separation?

This is running inside the tasklet, so it could be excused for being
tightly coupled with the tasklet.

Since we now have the tasklet itself passed to the tasklet callback (new
tasklet API), a simple way to hide the coupling would be to pass that
local along.
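
As a rough illustration only (a userspace mock, not i915 code): the
callback receives a pointer to the tasklet, recovers its container in the
same way the kernel's container_of()/from_tasklet() do, and hands that
same pointer down to the reset path. All of the mock_* names and the
struct layout are assumptions made up for this sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-in for a tasklet; not the real struct tasklet_struct. */
struct mock_tasklet {
	void (*callback)(struct mock_tasklet *t);
	int disable_count; /* non-zero means disabled */
};

/* Hypothetical scheduler bookkeeping that embeds the tasklet. */
struct mock_sched_engine {
	int resets;
	struct mock_tasklet tasklet;
};

/* Same pointer arithmetic as the kernel's container_of(). */
#define mock_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* The reset path only touches the tasklet it was handed as a local. */
static void mock_reset(struct mock_sched_engine *se, struct mock_tasklet *t)
{
	t->disable_count++; /* cf. tasklet_disable_nosync() */
	se->resets++;       /* stand-in for the actual engine reset */
	t->disable_count--; /* cf. tasklet_enable() */
}

/* Callback in the style of the new tasklet API: it is passed the tasklet. */
static void mock_submission_tasklet(struct mock_tasklet *t)
{
	struct mock_sched_engine *se =
		mock_container_of(t, struct mock_sched_engine, tasklet);

	assert(&se->tasklet == t);
	mock_reset(se, t);
}
```

The point of the sketch is that mock_reset() never needs to know where the
tasklet is stored, only that it was given one.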

> And with some generic 
> naming in case we don't want to trumpet it is a tasklet but instead some 
> higher level concept. Like schedule_enable/disable I don't know.. 
> Depends also how this plugs in the GuC.

Which suggests to me that we should avoid overly generic naming, as the
namespace is already crowded as we start coupling in with guc actions.
-Chris

^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: [Intel-gfx] [PATCH 19/41] drm/i915/gt: Show scheduler queues when dumping state
  2021-01-27 14:13   ` Tvrtko Ursulin
@ 2021-01-27 14:35     ` Chris Wilson
  2021-01-27 14:50       ` Tvrtko Ursulin
  0 siblings, 1 reply; 90+ messages in thread
From: Chris Wilson @ 2021-01-27 14:35 UTC (permalink / raw)
  To: Tvrtko Ursulin, intel-gfx; +Cc: thomas.hellstrom

Quoting Tvrtko Ursulin (2021-01-27 14:13:11)
> 
> On 25/01/2021 14:01, Chris Wilson wrote:
> > Move the scheduler pretty printer from out of the execlists state to
> > match its more common location.
> > 
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > ---
> >   drivers/gpu/drm/i915/gt/intel_engine_cs.c | 34 +++++++++++++----------
> >   1 file changed, 19 insertions(+), 15 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> > index cdd07aeada05..2f9a8960144b 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> > +++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> > @@ -1443,20 +1443,15 @@ static void intel_engine_print_registers(struct intel_engine_cs *engine,
> >   
> >       if (intel_engine_in_guc_submission_mode(engine)) {
> >               /* nothing to print yet */
> > -     } else if (HAS_EXECLISTS(dev_priv)) {
> > -             struct i915_request * const *port, *rq;
> >               const u32 *hws =
> >                       &engine->status_page.addr[I915_HWS_CSB_BUF0_INDEX];
> >               const u8 num_entries = execlists->csb_size;
> >               unsigned int idx;
> >               u8 read, write;
> >   
> > -             drm_printf(m, "\tExeclist tasklet queued? %s (%s), preempt? %s, timeslice? %s\n",
> > -                        yesno(test_bit(TASKLET_STATE_SCHED,
> > -                                       &engine->active.tasklet.state)),
> > -                        enableddisabled(!atomic_read(&engine->active.tasklet.count)),
> > -                        repr_timer(&engine->execlists.preempt),
> > -                        repr_timer(&engine->execlists.timer));
> > +             drm_printf(m, "\tExeclists preempt? %s, timeslice? %s\n",
> > +                        repr_timer(&execlists->preempt),
> > +                        repr_timer(&execlists->timer));
> >   
> >               read = execlists->csb_head;
> >               write = READ_ONCE(*execlists->csb_write);
> > @@ -1477,6 +1472,22 @@ static void intel_engine_print_registers(struct intel_engine_cs *engine,
> >                       drm_printf(m, "\tExeclist CSB[%d]: 0x%08x, context: %d\n",
> >                                  idx, hws[idx * 2], hws[idx * 2 + 1]);
> >               }
> > +     } else if (INTEL_GEN(dev_priv) > 6) {
> > +             drm_printf(m, "\tPP_DIR_BASE: 0x%08x\n",
> > +                        ENGINE_READ(engine, RING_PP_DIR_BASE));
> > +             drm_printf(m, "\tPP_DIR_BASE_READ: 0x%08x\n",
> > +                        ENGINE_READ(engine, RING_PP_DIR_BASE_READ));
> > +             drm_printf(m, "\tPP_DIR_DCLV: 0x%08x\n",
> > +                        ENGINE_READ(engine, RING_PP_DIR_DCLV));
> > +     }
> > +
> > +     if (engine->active.tasklet.func) {
> > +             struct i915_request * const *port, *rq;
> > +
> > +             drm_printf(m, "\tTasklet queued? %s (%s)\n",
> > +                        yesno(test_bit(TASKLET_STATE_SCHED,
> > +                                       &engine->active.tasklet.state)),
> > +                        enableddisabled(!atomic_read(&engine->active.tasklet.count)));
> 
> Or have i915_sched_print_state() exported? Again it will depend on how 
> clean split will be possible.

Not quite; unfortunately, this is not dumping generic state but the
backend bookkeeping for execlists/ringscheduler. It is common to that
pair, but not so common with the guc.

I guess I oversold it.
-Chris

^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: [Intel-gfx] [PATCH 19/41] drm/i915/gt: Show scheduler queues when dumping state
  2021-01-27 14:35     ` Chris Wilson
@ 2021-01-27 14:50       ` Tvrtko Ursulin
  2021-01-27 14:55         ` Chris Wilson
  0 siblings, 1 reply; 90+ messages in thread
From: Tvrtko Ursulin @ 2021-01-27 14:50 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx; +Cc: thomas.hellstrom


On 27/01/2021 14:35, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2021-01-27 14:13:11)
>>
>> On 25/01/2021 14:01, Chris Wilson wrote:
>>> Move the scheduler pretty printer from out of the execlists state to
>>> match its more common location.
>>>
>>> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
>>> ---
>>>    drivers/gpu/drm/i915/gt/intel_engine_cs.c | 34 +++++++++++++----------
>>>    1 file changed, 19 insertions(+), 15 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
>>> index cdd07aeada05..2f9a8960144b 100644
>>> --- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
>>> +++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
>>> @@ -1443,20 +1443,15 @@ static void intel_engine_print_registers(struct intel_engine_cs *engine,
>>>    
>>>        if (intel_engine_in_guc_submission_mode(engine)) {
>>>                /* nothing to print yet */
>>> -     } else if (HAS_EXECLISTS(dev_priv)) {
>>> -             struct i915_request * const *port, *rq;
>>>                const u32 *hws =
>>>                        &engine->status_page.addr[I915_HWS_CSB_BUF0_INDEX];
>>>                const u8 num_entries = execlists->csb_size;
>>>                unsigned int idx;
>>>                u8 read, write;
>>>    
>>> -             drm_printf(m, "\tExeclist tasklet queued? %s (%s), preempt? %s, timeslice? %s\n",
>>> -                        yesno(test_bit(TASKLET_STATE_SCHED,
>>> -                                       &engine->active.tasklet.state)),
>>> -                        enableddisabled(!atomic_read(&engine->active.tasklet.count)),
>>> -                        repr_timer(&engine->execlists.preempt),
>>> -                        repr_timer(&engine->execlists.timer));
>>> +             drm_printf(m, "\tExeclists preempt? %s, timeslice? %s\n",
>>> +                        repr_timer(&execlists->preempt),
>>> +                        repr_timer(&execlists->timer));
>>>    
>>>                read = execlists->csb_head;
>>>                write = READ_ONCE(*execlists->csb_write);
>>> @@ -1477,6 +1472,22 @@ static void intel_engine_print_registers(struct intel_engine_cs *engine,
>>>                        drm_printf(m, "\tExeclist CSB[%d]: 0x%08x, context: %d\n",
>>>                                   idx, hws[idx * 2], hws[idx * 2 + 1]);
>>>                }
>>> +     } else if (INTEL_GEN(dev_priv) > 6) {
>>> +             drm_printf(m, "\tPP_DIR_BASE: 0x%08x\n",
>>> +                        ENGINE_READ(engine, RING_PP_DIR_BASE));
>>> +             drm_printf(m, "\tPP_DIR_BASE_READ: 0x%08x\n",
>>> +                        ENGINE_READ(engine, RING_PP_DIR_BASE_READ));
>>> +             drm_printf(m, "\tPP_DIR_DCLV: 0x%08x\n",
>>> +                        ENGINE_READ(engine, RING_PP_DIR_DCLV));
>>> +     }
>>> +
>>> +     if (engine->active.tasklet.func) {
>>> +             struct i915_request * const *port, *rq;
>>> +
>>> +             drm_printf(m, "\tTasklet queued? %s (%s)\n",
>>> +                        yesno(test_bit(TASKLET_STATE_SCHED,
>>> +                                       &engine->active.tasklet.state)),
>>> +                        enableddisabled(!atomic_read(&engine->active.tasklet.count)));
>>
>> Or have i915_sched_print_state() exported? Again it will depend on how
>> clean split will be possible.
> 
> Not quite, unfortunately this is not dumping generic state but the
> backend bookkeeping for execlists/ringscheduler. Common for that pair,
> not so common with the guc.
> 
> I guess I oversold it.

Okay, I see it after a less superficial look. I guess it's okay. It is too 
hard to get perfect separation, so I'll focus on the scheduling changes.

Regards,

Tvrtko

^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: [Intel-gfx] [PATCH 19/41] drm/i915/gt: Show scheduler queues when dumping state
  2021-01-27 14:50       ` Tvrtko Ursulin
@ 2021-01-27 14:55         ` Chris Wilson
  0 siblings, 0 replies; 90+ messages in thread
From: Chris Wilson @ 2021-01-27 14:55 UTC (permalink / raw)
  To: Tvrtko Ursulin, intel-gfx; +Cc: thomas.hellstrom

Quoting Tvrtko Ursulin (2021-01-27 14:50:19)
> 
> On 27/01/2021 14:35, Chris Wilson wrote:
> > Quoting Tvrtko Ursulin (2021-01-27 14:13:11)
> >>
> >> On 25/01/2021 14:01, Chris Wilson wrote:
> >>> Move the scheduler pretty printer from out of the execlists state to
> >>> match its more common location.
> >>>
> >>> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> >>> ---
> >>>    drivers/gpu/drm/i915/gt/intel_engine_cs.c | 34 +++++++++++++----------
> >>>    1 file changed, 19 insertions(+), 15 deletions(-)
> >>>
> >>> diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> >>> index cdd07aeada05..2f9a8960144b 100644
> >>> --- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> >>> +++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> >>> @@ -1443,20 +1443,15 @@ static void intel_engine_print_registers(struct intel_engine_cs *engine,
> >>>    
> >>>        if (intel_engine_in_guc_submission_mode(engine)) {
> >>>                /* nothing to print yet */
> >>> -     } else if (HAS_EXECLISTS(dev_priv)) {
> >>> -             struct i915_request * const *port, *rq;
> >>>                const u32 *hws =
> >>>                        &engine->status_page.addr[I915_HWS_CSB_BUF0_INDEX];
> >>>                const u8 num_entries = execlists->csb_size;
> >>>                unsigned int idx;
> >>>                u8 read, write;
> >>>    
> >>> -             drm_printf(m, "\tExeclist tasklet queued? %s (%s), preempt? %s, timeslice? %s\n",
> >>> -                        yesno(test_bit(TASKLET_STATE_SCHED,
> >>> -                                       &engine->active.tasklet.state)),
> >>> -                        enableddisabled(!atomic_read(&engine->active.tasklet.count)),
> >>> -                        repr_timer(&engine->execlists.preempt),
> >>> -                        repr_timer(&engine->execlists.timer));
> >>> +             drm_printf(m, "\tExeclists preempt? %s, timeslice? %s\n",
> >>> +                        repr_timer(&execlists->preempt),
> >>> +                        repr_timer(&execlists->timer));
> >>>    
> >>>                read = execlists->csb_head;
> >>>                write = READ_ONCE(*execlists->csb_write);
> >>> @@ -1477,6 +1472,22 @@ static void intel_engine_print_registers(struct intel_engine_cs *engine,
> >>>                        drm_printf(m, "\tExeclist CSB[%d]: 0x%08x, context: %d\n",
> >>>                                   idx, hws[idx * 2], hws[idx * 2 + 1]);
> >>>                }
> >>> +     } else if (INTEL_GEN(dev_priv) > 6) {
> >>> +             drm_printf(m, "\tPP_DIR_BASE: 0x%08x\n",
> >>> +                        ENGINE_READ(engine, RING_PP_DIR_BASE));
> >>> +             drm_printf(m, "\tPP_DIR_BASE_READ: 0x%08x\n",
> >>> +                        ENGINE_READ(engine, RING_PP_DIR_BASE_READ));
> >>> +             drm_printf(m, "\tPP_DIR_DCLV: 0x%08x\n",
> >>> +                        ENGINE_READ(engine, RING_PP_DIR_DCLV));
> >>> +     }
> >>> +
> >>> +     if (engine->active.tasklet.func) {
> >>> +             struct i915_request * const *port, *rq;
> >>> +
> >>> +             drm_printf(m, "\tTasklet queued? %s (%s)\n",
> >>> +                        yesno(test_bit(TASKLET_STATE_SCHED,
> >>> +                                       &engine->active.tasklet.state)),
> >>> +                        enableddisabled(!atomic_read(&engine->active.tasklet.count)));
> >>
> >> Or have i915_sched_print_state() exported? Again it will depend on how
> >> clean a split will be possible.
> > 
> > Not quite, unfortunately this is not dumping generic state but the
> > backend bookkeeping for execlists/ringscheduler. Common for that pair,
> > not so common with the guc.
> > 
> > I guess I oversold it.
> 
> Okay I see it after a less superficial look. I guess it's okay. Too hard 
> to get perfect separation so I'll focus on the scheduling changes.

Inside intel_execlists_show_requests, we have the scheduler list pretty
printer. Maybe something to salvage here after all.
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: [Intel-gfx] [PATCH 20/41] drm/i915: Replace priolist rbtree with a skiplist
  2021-01-25 14:01 ` [Intel-gfx] [PATCH 20/41] drm/i915: Replace priolist rbtree with a skiplist Chris Wilson
@ 2021-01-27 15:10   ` Tvrtko Ursulin
  2021-01-27 15:33     ` Chris Wilson
  2021-01-28 15:56   ` Tvrtko Ursulin
                     ` (2 subsequent siblings)
  3 siblings, 1 reply; 90+ messages in thread
From: Tvrtko Ursulin @ 2021-01-27 15:10 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx; +Cc: thomas.hellstrom


On 25/01/2021 14:01, Chris Wilson wrote:
> Replace the priolist rbtree with a skiplist. The crucial difference is
> that walking and removing the first element of a skiplist is O(1), but

I wasn't (and am not) familiar with them, but the Wikipedia page says
removal is O(logN) average case to O(N) worst case.

If I understand correctly O(1) could be ignoring the need to traverse 
from top to bottom level and removing the element from all. But since 
I915_PRIOLIST_HEIGHT is fixed maybe it is okay to call it O(1).

I wonder though why this wouldn't mean a skip list would be worse for both
lightly loaded and highly loaded scenarios? Presumably the height would
need to be balanced to compensate for that.

In summary I have no idea for what number of in-flight requests they
would be better.
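
As a rough way to reason about the height question, the sketch below
models the level selection used by random_level() in the quoted patch,
which returns __ffs(prng) / 2. Assuming the PRNG bits are uniformly
random and that __ffs() yields the zero-based index of the lowest set
bit, each level should hold about a quarter as many nodes as the level
below it. The names lowest_set_bit(), level_from_bits() and
count_level() are hypothetical helpers for illustration only, and the
patch's rule of growing the list by at most one level per insertion is
not modelled here.

```c
#include <assert.h>
#include <stdint.h>

#define HEIGHT 12 /* mirrors the CONFIG_64BIT I915_PRIOLIST_HEIGHT in the patch */

/* Zero-based index of the lowest set bit; x must be non-zero. */
static unsigned int lowest_set_bit(uint32_t x)
{
	unsigned int n = 0;

	assert(x != 0);
	while (!(x & 1)) {
		x >>= 1;
		n++;
	}
	return n;
}

/* Map one random word to a skiplist level, clamped to the fixed height. */
static unsigned int level_from_bits(uint32_t bits)
{
	unsigned int lvl = lowest_set_bit(bits) / 2;

	return lvl < HEIGHT ? lvl : HEIGHT - 1;
}

/*
 * Count how many non-zero nbits-wide words map to the given level.
 * Enumerating every word gives the exact distribution for uniformly
 * random input. nbits must be less than 32.
 */
static unsigned int count_level(unsigned int lvl, unsigned int nbits)
{
	unsigned int count = 0;
	uint32_t x;

	for (x = 1; x < (UINT32_C(1) << nbits); x++)
		count += level_from_bits(x) == lvl;

	return count;
}
```

Enumerating all non-zero 8-bit words with count_level() shows the 4:1
ratio between adjacent levels. Under the same uniformity assumption, N
queued priority levels would put roughly N / 4^L nodes on level L, which
suggests the fixed height only starts to matter for very large N.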

How about putting this patch aside for now since it doesn't sound like
it is critical for deadline scheduling per se?

Regards,

Tvrtko

> O(lgN) for an rbtree, as we need to rebalance on remove. This is a
> hindrance for submission latency as it occurs between picking a request
> for the priolist and submitting it to hardware, as well as effectively
> tripling the number of O(lgN) operations required under the irqoff lock.
> This is critical to reducing the latency jitter with multiple clients.
> 
> The downsides to skiplists are that lookup/insertion is only
> probabilistically O(lgN) and there is a significant memory penalty
> as each skip node is larger than the rbtree equivalent. Furthermore, we
> don't use dynamic arrays for the skiplist, so the allocation is fixed,
> and imposes an upper bound on the scalability wrt the number of
> inflight requests.
> 
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> ---
>   .../drm/i915/gt/intel_execlists_submission.c  |  63 +++--
>   .../gpu/drm/i915/gt/uc/intel_guc_submission.c |  30 +--
>   drivers/gpu/drm/i915/i915_priolist_types.h    |  28 +-
>   drivers/gpu/drm/i915/i915_scheduler.c         | 244 ++++++++++++++----
>   drivers/gpu/drm/i915/i915_scheduler.h         |  11 +-
>   drivers/gpu/drm/i915/i915_scheduler_types.h   |   2 +-
>   .../drm/i915/selftests/i915_mock_selftests.h  |   1 +
>   .../gpu/drm/i915/selftests/i915_scheduler.c   |  53 +++-
>   8 files changed, 316 insertions(+), 116 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
> index 1103c8a00af1..129144dd86b0 100644
> --- a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
> +++ b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
> @@ -244,11 +244,6 @@ static void ring_set_paused(const struct intel_engine_cs *engine, int state)
>   		wmb();
>   }
>   
> -static struct i915_priolist *to_priolist(struct rb_node *rb)
> -{
> -	return rb_entry(rb, struct i915_priolist, node);
> -}
> -
>   static int rq_prio(const struct i915_request *rq)
>   {
>   	return READ_ONCE(rq->sched.attr.priority);
> @@ -272,15 +267,31 @@ static int effective_prio(const struct i915_request *rq)
>   	return prio;
>   }
>   
> -static int queue_prio(const struct i915_sched_engine *se)
> +static struct i915_request *first_request(struct i915_sched_engine *se)
>   {
> -	struct rb_node *rb;
> +	struct i915_priolist *pl;
>   
> -	rb = rb_first_cached(&se->queue);
> -	if (!rb)
> +	for_each_priolist(pl, &se->queue) {
> +		if (likely(!list_empty(&pl->requests)))
> +			return list_first_entry(&pl->requests,
> +						struct i915_request,
> +						sched.link);
> +
> +		i915_priolist_advance(&se->queue, pl);
> +	}
> +
> +	return NULL;
> +}
> +
> +static int queue_prio(struct i915_sched_engine *se)
> +{
> +	struct i915_request *rq;
> +
> +	rq = first_request(se);
> +	if (!rq)
>   		return INT_MIN;
>   
> -	return to_priolist(rb)->priority;
> +	return rq_prio(rq);
>   }
>   
>   static int virtual_prio(const struct intel_engine_execlists *el)
> @@ -290,7 +301,7 @@ static int virtual_prio(const struct intel_engine_execlists *el)
>   	return rb ? rb_entry(rb, struct ve_node, rb)->prio : INT_MIN;
>   }
>   
> -static bool need_preempt(const struct intel_engine_cs *engine,
> +static bool need_preempt(struct intel_engine_cs *engine,
>   			 const struct i915_request *rq)
>   {
>   	int last_prio;
> @@ -1136,6 +1147,7 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
>   	struct i915_request ** const last_port = port + execlists->port_mask;
>   	struct i915_request *last, * const *active;
>   	struct virtual_engine *ve;
> +	struct i915_priolist *pl;
>   	struct rb_node *rb;
>   	bool submit = false;
>   
> @@ -1346,11 +1358,10 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
>   			break;
>   	}
>   
> -	while ((rb = rb_first_cached(&engine->active.queue))) {
> -		struct i915_priolist *p = to_priolist(rb);
> +	for_each_priolist(pl, &engine->active.queue) {
>   		struct i915_request *rq, *rn;
>   
> -		priolist_for_each_request_consume(rq, rn, p) {
> +		priolist_for_each_request_safe(rq, rn, pl) {
>   			bool merge = true;
>   
>   			/*
> @@ -1425,8 +1436,7 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
>   			}
>   		}
>   
> -		rb_erase_cached(&p->node, &engine->active.queue);
> -		i915_priolist_free(p);
> +		i915_priolist_advance(&engine->active.queue, pl);
>   	}
>   done:
>   	*port++ = i915_request_get(last);
> @@ -2631,6 +2641,7 @@ static void execlists_reset_cancel(struct intel_engine_cs *engine)
>   {
>   	struct intel_engine_execlists * const execlists = &engine->execlists;
>   	struct i915_request *rq, *rn;
> +	struct i915_priolist *pl;
>   	struct rb_node *rb;
>   	unsigned long flags;
>   
> @@ -2661,16 +2672,12 @@ static void execlists_reset_cancel(struct intel_engine_cs *engine)
>   	intel_engine_signal_breadcrumbs(engine);
>   
>   	/* Flush the queued requests to the timeline list (for retiring). */
> -	while ((rb = rb_first_cached(&engine->active.queue))) {
> -		struct i915_priolist *p = to_priolist(rb);
> -
> -		priolist_for_each_request_consume(rq, rn, p) {
> +	for_each_priolist(pl, &engine->active.queue) {
> +		priolist_for_each_request_safe(rq, rn, pl) {
>   			i915_request_mark_eio(rq);
>   			__i915_request_submit(rq);
>   		}
> -
> -		rb_erase_cached(&p->node, &engine->active.queue);
> -		i915_priolist_free(p);
> +		i915_priolist_advance(&engine->active.queue, pl);
>   	}
>   	GEM_BUG_ON(!i915_sched_is_idle(&engine->active));
>   
> @@ -2703,7 +2710,6 @@ static void execlists_reset_cancel(struct intel_engine_cs *engine)
>   	/* Remaining _unready_ requests will be nop'ed when submitted */
>   
>   	execlists->queue_priority_hint = INT_MIN;
> -	engine->active.queue = RB_ROOT_CACHED;
>   
>   	GEM_BUG_ON(__tasklet_is_enabled(&engine->active.tasklet));
>   	engine->active.tasklet.func = nop_submission_tasklet;
> @@ -3089,6 +3095,8 @@ static void virtual_context_exit(struct intel_context *ce)
>   
>   	for (n = 0; n < ve->num_siblings; n++)
>   		intel_engine_pm_put(ve->siblings[n]);
> +
> +	i915_sched_park_engine(&ve->base.active);
>   }
>   
>   static const struct intel_context_ops virtual_context_ops = {
> @@ -3501,6 +3509,7 @@ void intel_execlists_show_requests(struct intel_engine_cs *engine,
>   {
>   	const struct intel_engine_execlists *execlists = &engine->execlists;
>   	struct i915_request *rq, *last;
> +	struct i915_priolist *pl;
>   	unsigned long flags;
>   	unsigned int count;
>   	struct rb_node *rb;
> @@ -3530,10 +3539,8 @@ void intel_execlists_show_requests(struct intel_engine_cs *engine,
>   
>   	last = NULL;
>   	count = 0;
> -	for (rb = rb_first_cached(&engine->active.queue); rb; rb = rb_next(rb)) {
> -		struct i915_priolist *p = rb_entry(rb, typeof(*p), node);
> -
> -		priolist_for_each_request(rq, p) {
> +	for_each_priolist(pl, &engine->active.queue) {
> +		priolist_for_each_request(rq, pl) {
>   			if (count++ < max - 1)
>   				show_request(m, rq, "\t\t", 0);
>   			else
> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> index 2d7339ef3b4c..8d0c6cd277b3 100644
> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> @@ -59,11 +59,6 @@
>   
>   #define GUC_REQUEST_SIZE 64 /* bytes */
>   
> -static inline struct i915_priolist *to_priolist(struct rb_node *rb)
> -{
> -	return rb_entry(rb, struct i915_priolist, node);
> -}
> -
>   static struct guc_stage_desc *__get_stage_desc(struct intel_guc *guc, u32 id)
>   {
>   	struct guc_stage_desc *base = guc->stage_desc_pool_vaddr;
> @@ -185,8 +180,8 @@ static void __guc_dequeue(struct intel_engine_cs *engine)
>   	struct i915_request ** const last_port = first + execlists->port_mask;
>   	struct i915_request *last = first[0];
>   	struct i915_request **port;
> +	struct i915_priolist *pl;
>   	bool submit = false;
> -	struct rb_node *rb;
>   
>   	lockdep_assert_held(&engine->active.lock);
>   
> @@ -203,11 +198,10 @@ static void __guc_dequeue(struct intel_engine_cs *engine)
>   	 * event.
>   	 */
>   	port = first;
> -	while ((rb = rb_first_cached(&engine->active.queue))) {
> -		struct i915_priolist *p = to_priolist(rb);
> +	for_each_priolist(pl, &engine->active.queue) {
>   		struct i915_request *rq, *rn;
>   
> -		priolist_for_each_request_consume(rq, rn, p) {
> +		priolist_for_each_request_safe(rq, rn, pl) {
>   			if (last && rq->context != last->context) {
>   				if (port == last_port)
>   					goto done;
> @@ -223,12 +217,11 @@ static void __guc_dequeue(struct intel_engine_cs *engine)
>   			last = rq;
>   		}
>   
> -		rb_erase_cached(&p->node, &engine->active.queue);
> -		i915_priolist_free(p);
> +		i915_priolist_advance(&engine->active.queue, pl);
>   	}
>   done:
>   	execlists->queue_priority_hint =
> -		rb ? to_priolist(rb)->priority : INT_MIN;
> +		pl != &engine->active.queue.sentinel ? pl->priority : INT_MIN;
>   	if (submit) {
>   		*port = schedule_in(last, port - execlists->inflight);
>   		*++port = NULL;
> @@ -327,7 +320,7 @@ static void guc_reset_cancel(struct intel_engine_cs *engine)
>   {
>   	struct intel_engine_execlists * const execlists = &engine->execlists;
>   	struct i915_request *rq, *rn;
> -	struct rb_node *rb;
> +	struct i915_priolist *p;
>   	unsigned long flags;
>   
>   	ENGINE_TRACE(engine, "\n");
> @@ -355,25 +348,20 @@ static void guc_reset_cancel(struct intel_engine_cs *engine)
>   	}
>   
>   	/* Flush the queued requests to the timeline list (for retiring). */
> -	while ((rb = rb_first_cached(&engine->active.queue))) {
> -		struct i915_priolist *p = to_priolist(rb);
> -
> -		priolist_for_each_request_consume(rq, rn, p) {
> +	for_each_priolist(p, &engine->active.queue) {
> +		priolist_for_each_request_safe(rq, rn, p) {
>   			list_del_init(&rq->sched.link);
>   			__i915_request_submit(rq);
>   			dma_fence_set_error(&rq->fence, -EIO);
>   			i915_request_mark_complete(rq);
>   		}
> -
> -		rb_erase_cached(&p->node, &engine->active.queue);
> -		i915_priolist_free(p);
> +		i915_priolist_advance(&engine->active.queue, p);
>   	}
>   	GEM_BUG_ON(!i915_sched_is_idle(&engine->active));
>   
>   	/* Remaining _unready_ requests will be nop'ed when submitted */
>   
>   	execlists->queue_priority_hint = INT_MIN;
> -	engine->active.queue = RB_ROOT_CACHED;
>   
>   	spin_unlock_irqrestore(&engine->active.lock, flags);
>   }
> diff --git a/drivers/gpu/drm/i915/i915_priolist_types.h b/drivers/gpu/drm/i915/i915_priolist_types.h
> index bc2fa84f98a8..1200c3df6a4a 100644
> --- a/drivers/gpu/drm/i915/i915_priolist_types.h
> +++ b/drivers/gpu/drm/i915/i915_priolist_types.h
> @@ -38,10 +38,36 @@ enum {
>   #define I915_PRIORITY_UNPREEMPTABLE INT_MAX
>   #define I915_PRIORITY_BARRIER (I915_PRIORITY_UNPREEMPTABLE - 1)
>   
> +#ifdef CONFIG_64BIT
> +#define I915_PRIOLIST_HEIGHT 12
> +#else
> +#define I915_PRIOLIST_HEIGHT 11
> +#endif
> +
>   struct i915_priolist {
>   	struct list_head requests;
> -	struct rb_node node;
>   	int priority;
> +
> +	int level;
> +	struct i915_priolist *next[I915_PRIOLIST_HEIGHT];
>   };
>   
> +struct i915_priolist_root {
> +	struct i915_priolist sentinel;
> +	u32 prng;
> +};
> +
> +#define i915_priolist_is_empty(root) ((root)->sentinel.level < 0)
> +
> +#define for_each_priolist(p, root) \
> +	for ((p) = (root)->sentinel.next[0]; \
> +	     (p) != &(root)->sentinel; \
> +	     (p) = (p)->next[0])
> +
> +#define priolist_for_each_request(it, plist) \
> +	list_for_each_entry(it, &(plist)->requests, sched.link)
> +
> +#define priolist_for_each_request_safe(it, n, plist) \
> +	list_for_each_entry_safe(it, n, &(plist)->requests, sched.link)
> +
>   #endif /* _I915_PRIOLIST_TYPES_H_ */
> diff --git a/drivers/gpu/drm/i915/i915_scheduler.c b/drivers/gpu/drm/i915/i915_scheduler.c
> index a3ee06cb66d7..74000d3eebb1 100644
> --- a/drivers/gpu/drm/i915/i915_scheduler.c
> +++ b/drivers/gpu/drm/i915/i915_scheduler.c
> @@ -4,7 +4,9 @@
>    * Copyright © 2018 Intel Corporation
>    */
>   
> +#include <linux/bitops.h>
>   #include <linux/mutex.h>
> +#include <linux/prandom.h>
>   
>   #include "gt/intel_ring.h"
>   #include "gt/intel_lrc_reg.h"
> @@ -91,15 +93,24 @@ static void i915_sched_init_ipi(struct i915_sched_ipi *ipi)
>   	ipi->list = NULL;
>   }
>   
> +static void init_priolist(struct i915_priolist_root *const root)
> +{
> +	struct i915_priolist *pl = &root->sentinel;
> +
> +	memset_p((void **)pl->next, pl, ARRAY_SIZE(pl->next));
> +	pl->priority = INT_MIN;
> +	pl->level = -1;
> +}
> +
>   void i915_sched_init_engine(struct i915_sched_engine *se,
>   			    unsigned int subclass)
>   {
>   	spin_lock_init(&se->lock);
>   	lockdep_set_subclass(&se->lock, subclass);
>   
> +	init_priolist(&se->queue);
>   	INIT_LIST_HEAD(&se->requests);
>   	INIT_LIST_HEAD(&se->hold);
> -	se->queue = RB_ROOT_CACHED;
>   
>   	i915_sched_init_ipi(&se->ipi);
>   
> @@ -116,8 +127,57 @@ void i915_sched_init_engine(struct i915_sched_engine *se,
>   #endif
>   }
>   
> +__maybe_unused static bool priolist_idle(struct i915_priolist_root *root)
> +{
> +	struct i915_priolist *pl = &root->sentinel;
> +	int lvl;
> +
> +	for (lvl = 0; lvl < ARRAY_SIZE(pl->next); lvl++) {
> +		if (pl->next[lvl] != pl) {
> +			GEM_TRACE_ERR("root[%d] is not empty\n", lvl);
> +			return false;
> +		}
> +	}
> +
> +	if (pl->level != -1) {
> +		GEM_TRACE_ERR("root is not clear: %d\n", pl->level);
> +		return false;
> +	}
> +
> +	return true;
> +}
> +
> +static void pl_push(struct i915_priolist *pl, struct list_head *head)
> +{
> +	pl->requests.next = head->next;
> +	head->next = &pl->requests;
> +}
> +
> +static struct i915_priolist *pl_pop(struct list_head *head)
> +{
> +	struct i915_priolist *pl;
> +
> +	pl = container_of(head->next, typeof(*pl), requests);
> +	head->next = pl->requests.next;
> +
> +	return pl;
> +}
> +
> +static bool pl_empty(struct list_head *head)
> +{
> +	return !head->next;
> +}
> +
>   void i915_sched_park_engine(struct i915_sched_engine *se)
>   {
> +	struct i915_priolist_root *root = &se->queue;
> +	struct list_head *list = &root->sentinel.requests;
> +
> +	GEM_BUG_ON(!priolist_idle(root));
> +
> +	while (!pl_empty(list))
> +		kmem_cache_free(global.slab_priorities, pl_pop(list));
> +
>   	GEM_BUG_ON(!i915_sched_is_idle(se));
>   	se->no_priolist = false;
>   }
> @@ -183,71 +243,55 @@ static inline bool node_signaled(const struct i915_sched_node *node)
>   	return i915_request_completed(node_to_request(node));
>   }
>   
> -static inline struct i915_priolist *to_priolist(struct rb_node *rb)
> +static inline unsigned int random_level(struct i915_priolist_root *root)
>   {
> -	return rb_entry(rb, struct i915_priolist, node);
> -}
> -
> -static void assert_priolists(struct i915_sched_engine * const se)
> -{
> -	struct rb_node *rb;
> -	long last_prio;
> -
> -	if (!IS_ENABLED(CONFIG_DRM_I915_DEBUG_GEM))
> -		return;
> -
> -	GEM_BUG_ON(rb_first_cached(&se->queue) !=
> -		   rb_first(&se->queue.rb_root));
> -
> -	last_prio = INT_MAX;
> -	for (rb = rb_first_cached(&se->queue); rb; rb = rb_next(rb)) {
> -		const struct i915_priolist *p = to_priolist(rb);
> -
> -		GEM_BUG_ON(p->priority > last_prio);
> -		last_prio = p->priority;
> -	}
> +	root->prng = next_pseudo_random32(root->prng);
> +	return  __ffs(root->prng) / 2;
>   }
>   
>   static struct list_head *
>   lookup_priolist(struct intel_engine_cs *engine, int prio)
>   {
> +	struct i915_priolist *update[I915_PRIOLIST_HEIGHT];
>   	struct i915_sched_engine * const se = &engine->active;
> -	struct i915_priolist *p;
> -	struct rb_node **parent, *rb;
> -	bool first = true;
> -
> -	lockdep_assert_held(&engine->active.lock);
> -	assert_priolists(se);
> +	struct i915_priolist_root *root = &se->queue;
> +	struct i915_priolist *pl, *tmp;
> +	int lvl;
>   
> +	lockdep_assert_held(&se->lock);
>   	if (unlikely(se->no_priolist))
>   		prio = I915_PRIORITY_NORMAL;
>   
> +	for_each_priolist(pl, root) { /* recycle any empty elements before us */
> +		if (pl->priority >= prio || !list_empty(&pl->requests))
> +			break;
> +
> +		i915_priolist_advance(root, pl);
> +	}
> +
>   find_priolist:
> -	/* most positive priority is scheduled first, equal priorities fifo */
> -	rb = NULL;
> -	parent = &se->queue.rb_root.rb_node;
> -	while (*parent) {
> -		rb = *parent;
> -		p = to_priolist(rb);
> -		if (prio > p->priority) {
> -			parent = &rb->rb_left;
> -		} else if (prio < p->priority) {
> -			parent = &rb->rb_right;
> -			first = false;
> -		} else {
> -			return &p->requests;
> -		}
> +	pl = &root->sentinel;
> +	lvl = pl->level;
> +	while (lvl >= 0) {
> +		while (tmp = pl->next[lvl], tmp->priority >= prio)
> +			pl = tmp;
> +		if (pl->priority == prio)
> +			goto out;
> +		update[lvl--] = pl;
>   	}
>   
>   	if (prio == I915_PRIORITY_NORMAL) {
> -		p = &se->default_priolist;
> +		pl = &se->default_priolist;
> +	} else if (!pl_empty(&root->sentinel.requests)) {
> +		pl = pl_pop(&root->sentinel.requests);
>   	} else {
> -		p = kmem_cache_alloc(global.slab_priorities, GFP_ATOMIC);
> +		pl = kmem_cache_alloc(global.slab_priorities, GFP_ATOMIC);
>   		/* Convert an allocation failure to a priority bump */
> -		if (unlikely(!p)) {
> +		if (unlikely(!pl)) {
>   			prio = I915_PRIORITY_NORMAL; /* recurses just once */
>   
> -			/* To maintain ordering with all rendering, after an
> +			/*
> +			 * To maintain ordering with all rendering, after an
>   			 * allocation failure we have to disable all scheduling.
>   			 * Requests will then be executed in fifo, and schedule
>   			 * will ensure that dependencies are emitted in fifo.
> @@ -260,18 +304,103 @@ lookup_priolist(struct intel_engine_cs *engine, int prio)
>   		}
>   	}
>   
> -	p->priority = prio;
> -	INIT_LIST_HEAD(&p->requests);
> +	pl->priority = prio;
> +	INIT_LIST_HEAD(&pl->requests);
>   
> -	rb_link_node(&p->node, rb, parent);
> -	rb_insert_color_cached(&p->node, &se->queue, first);
> +	lvl = random_level(root);
> +	if (lvl > root->sentinel.level) {
> +		if (root->sentinel.level < I915_PRIOLIST_HEIGHT - 1) {
> +			lvl = ++root->sentinel.level;
> +			update[lvl] = &root->sentinel;
> +		} else {
> +			lvl = I915_PRIOLIST_HEIGHT - 1;
> +		}
> +	}
> +	GEM_BUG_ON(lvl < 0);
> +	GEM_BUG_ON(lvl >= ARRAY_SIZE(pl->next));
>   
> -	return &p->requests;
> +	pl->level = lvl;
> +	do {
> +		tmp = update[lvl];
> +		pl->next[lvl] = update[lvl]->next[lvl];
> +		tmp->next[lvl] = pl;
> +	} while (--lvl >= 0);
> +
> +	if (IS_ENABLED(CONFIG_DRM_I915_DEBUG_GEM)) {
> +		struct i915_priolist *chk;
> +
> +		chk = &root->sentinel;
> +		lvl = chk->level;
> +		do {
> +			while (tmp = chk->next[lvl], tmp->priority >= prio)
> +				chk = tmp;
> +		} while (--lvl >= 0);
> +
> +		GEM_BUG_ON(chk != pl);
> +	}
> +
> +out:
> +	GEM_BUG_ON(pl == &root->sentinel);
> +	return &pl->requests;
>   }
>   
> -void __i915_priolist_free(struct i915_priolist *p)
> +static void remove_priolist(struct intel_engine_cs *engine,
> +			    struct list_head *plist)
>   {
> -	kmem_cache_free(global.slab_priorities, p);
> +	struct i915_sched_engine * const se = &engine->active;
> +	struct i915_priolist_root *root = &se->queue;
> +	struct i915_priolist *pl, *tmp;
> +	struct i915_priolist *old =
> +		container_of(plist, struct i915_priolist, requests);
> +	int prio = old->priority;
> +	int lvl;
> +
> +	lockdep_assert_held(&se->lock);
> +	GEM_BUG_ON(!list_empty(plist));
> +
> +	pl = &root->sentinel;
> +	lvl = pl->level;
> +	GEM_BUG_ON(lvl < 0);
> +
> +	if (prio != I915_PRIORITY_NORMAL)
> +		pl_push(old, &pl->requests);
> +
> +	do {
> +		while (tmp = pl->next[lvl], tmp->priority > prio)
> +			pl = tmp;
> +		if (lvl <= old->level) {
> +			pl->next[lvl] = old->next[lvl];
> +			if (pl == &root->sentinel && old->next[lvl] == pl) {
> +				GEM_BUG_ON(pl->level != lvl);
> +				pl->level--;
> +			}
> +		}
> +	} while (--lvl >= 0);
> +	GEM_BUG_ON(tmp != old);
> +}
> +
> +void i915_priolist_advance(struct i915_priolist_root *root,
> +			   struct i915_priolist *pl)
> +{
> +	struct i915_priolist * const s = &root->sentinel;
> +	int lvl;
> +
> +	GEM_BUG_ON(!list_empty(&pl->requests));
> +	GEM_BUG_ON(pl != s->next[0]);
> +	GEM_BUG_ON(pl == s);
> +
> +	if (pl->priority != I915_PRIORITY_NORMAL)
> +		pl_push(pl, &s->requests);
> +
> +	lvl = pl->level;
> +	GEM_BUG_ON(lvl < 0);
> +	do {
> +		s->next[lvl] = pl->next[lvl];
> +		if (pl->next[lvl] == s) {
> +			GEM_BUG_ON(s->level != lvl);
> +			s->level--;
> +		}
> +	} while (--lvl >= 0);
>   }
>   
>   static struct i915_request *
> @@ -420,8 +549,13 @@ static void __i915_request_set_priority(struct i915_request *rq, int prio)
>   			continue;
>   
>   		GEM_BUG_ON(rq->engine != engine);
> -		if (i915_request_in_priority_queue(rq))
> +		if (i915_request_in_priority_queue(rq)) {
> +			struct list_head *prev = rq->sched.link.prev;
> +
>   			list_move_tail(&rq->sched.link, plist);
> +			if (list_empty(prev))
> +				remove_priolist(engine, prev);
> +		}
>   
>   		/* Defer (tasklet) submission until after all updates. */
>   		kick_submission(engine, rq, prio);
> diff --git a/drivers/gpu/drm/i915/i915_scheduler.h b/drivers/gpu/drm/i915/i915_scheduler.h
> index 0ab47cbf0e9c..bca89a58d953 100644
> --- a/drivers/gpu/drm/i915/i915_scheduler.h
> +++ b/drivers/gpu/drm/i915/i915_scheduler.h
> @@ -16,12 +16,6 @@
>   
>   struct drm_printer;
>   
> -#define priolist_for_each_request(it, plist) \
> -	list_for_each_entry(it, &(plist)->requests, sched.link)
> -
> -#define priolist_for_each_request_consume(it, n, plist) \
> -	list_for_each_entry_safe(it, n, &(plist)->requests, sched.link)
> -
>   void i915_sched_node_init(struct i915_sched_node *node);
>   void i915_sched_node_reinit(struct i915_sched_node *node);
>   
> @@ -69,7 +63,7 @@ static inline void i915_priolist_free(struct i915_priolist *p)
>   
>   static inline bool i915_sched_is_idle(const struct i915_sched_engine *se)
>   {
> -	return RB_EMPTY_ROOT(&se->queue.rb_root);
> +	return i915_priolist_is_empty(&se->queue);
>   }
>   
>   static inline bool
> @@ -99,6 +93,9 @@ static inline void i915_sched_kick(struct i915_sched_engine *se)
>   	tasklet_hi_schedule(&se->tasklet);
>   }
>   
> +void i915_priolist_advance(struct i915_priolist_root *root,
> +			   struct i915_priolist *old);
> +
>   void i915_request_show_with_schedule(struct drm_printer *m,
>   				     const struct i915_request *rq,
>   				     const char *prefix,
> diff --git a/drivers/gpu/drm/i915/i915_scheduler_types.h b/drivers/gpu/drm/i915/i915_scheduler_types.h
> index f668c680d290..e64750be4e77 100644
> --- a/drivers/gpu/drm/i915/i915_scheduler_types.h
> +++ b/drivers/gpu/drm/i915/i915_scheduler_types.h
> @@ -89,7 +89,7 @@ struct i915_sched_engine {
>   	/**
>   	 * @queue: queue of requests, in priority lists
>   	 */
> -	struct rb_root_cached queue;
> +	struct i915_priolist_root queue;
>   
>   	struct i915_sched_ipi ipi;
>   
> diff --git a/drivers/gpu/drm/i915/selftests/i915_mock_selftests.h b/drivers/gpu/drm/i915/selftests/i915_mock_selftests.h
> index 3db34d3eea58..946c93441c1f 100644
> --- a/drivers/gpu/drm/i915/selftests/i915_mock_selftests.h
> +++ b/drivers/gpu/drm/i915/selftests/i915_mock_selftests.h
> @@ -25,6 +25,7 @@ selftest(ring, intel_ring_mock_selftests)
>   selftest(engine, intel_engine_cs_mock_selftests)
>   selftest(timelines, intel_timeline_mock_selftests)
>   selftest(requests, i915_request_mock_selftests)
> +selftest(scheduler, i915_scheduler_mock_selftests)
>   selftest(objects, i915_gem_object_mock_selftests)
>   selftest(phys, i915_gem_phys_mock_selftests)
>   selftest(dmabuf, i915_gem_dmabuf_mock_selftests)
> diff --git a/drivers/gpu/drm/i915/selftests/i915_scheduler.c b/drivers/gpu/drm/i915/selftests/i915_scheduler.c
> index de44a66210b7..de5b1443129b 100644
> --- a/drivers/gpu/drm/i915/selftests/i915_scheduler.c
> +++ b/drivers/gpu/drm/i915/selftests/i915_scheduler.c
> @@ -12,6 +12,54 @@
>   #include "selftests/igt_spinner.h"
>   #include "selftests/i915_random.h"
>   
> +static int mock_skiplist_levels(void *dummy)
> +{
> +	struct i915_priolist_root root = {};
> +	struct i915_priolist *pl = &root.sentinel;
> +	IGT_TIMEOUT(end_time);
> +	unsigned long total;
> +	int count, lvl;
> +
> +	total = 0;
> +	do {
> +		for (count = 0; count < 16384; count++) {
> +			lvl = random_level(&root);
> +			if (lvl > pl->level) {
> +				if (lvl < I915_PRIOLIST_HEIGHT - 1)
> +					lvl = ++pl->level;
> +				else
> +					lvl = I915_PRIOLIST_HEIGHT - 1;
> +			}
> +
> +			pl->next[lvl] = ptr_inc(pl->next[lvl]);
> +		}
> +		total += count;
> +	} while (!__igt_timeout(end_time, NULL));
> +
> +	pr_info("Total %9lu\n", total);
> +	for (lvl = 0; lvl <= pl->level; lvl++) {
> +		int x = ilog2((unsigned long)pl->next[lvl]);
> +		char row[80];
> +
> +		memset(row, '*', x);
> +		row[x] = '\0';
> +
> +		pr_info(" [%2d] %9lu %s\n",
> +			lvl, (unsigned long)pl->next[lvl], row);
> +	}
> +
> +	return 0;
> +}
> +
> +int i915_scheduler_mock_selftests(void)
> +{
> +	static const struct i915_subtest tests[] = {
> +		SUBTEST(mock_skiplist_levels),
> +	};
> +
> +	return i915_subtests(tests, NULL);
> +}
> +
>   static void scheduling_disable(struct intel_engine_cs *engine)
>   {
>   	engine->props.preempt_timeout_ms = 0;
> @@ -80,9 +128,9 @@ static int all_engines(struct drm_i915_private *i915,
>   static bool check_context_order(struct intel_engine_cs *engine)
>   {
>   	u64 last_seqno, last_context;
> +	struct i915_priolist *p;
>   	unsigned long count;
>   	bool result = false;
> -	struct rb_node *rb;
>   	int last_prio;
>   
>   	/* We expect the execution order to follow ascending fence-context */
> @@ -92,8 +140,7 @@ static bool check_context_order(struct intel_engine_cs *engine)
>   	last_context = 0;
>   	last_seqno = 0;
>   	last_prio = 0;
> -	for (rb = rb_first_cached(&engine->active.queue); rb; rb = rb_next(rb)) {
> -		struct i915_priolist *p = rb_entry(rb, typeof(*p), node);
> +	for_each_priolist(p, &engine->active.queue) {
>   		struct i915_request *rq;
>   
>   		priolist_for_each_request(rq, p) {
> 

^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: [Intel-gfx] [PATCH 21/41] drm/i915: Wrap cmpxchg64 with try_cmpxchg64() helper
  2021-01-25 14:01 ` [Intel-gfx] [PATCH 21/41] drm/i915: Wrap cmpxchg64 with try_cmpxchg64() helper Chris Wilson
@ 2021-01-27 15:28   ` Tvrtko Ursulin
  0 siblings, 0 replies; 90+ messages in thread
From: Tvrtko Ursulin @ 2021-01-27 15:28 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx; +Cc: thomas.hellstrom


On 25/01/2021 14:01, Chris Wilson wrote:
> Wrap cmpxchg64 with a try_cmpxchg()-esque helper. Hiding the old-value
> dance in the helper allows for cleaner code.
> 
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> ---
>   drivers/gpu/drm/i915/i915_utils.h | 32 +++++++++++++++++++++++++++++++
>   1 file changed, 32 insertions(+)
> 
> diff --git a/drivers/gpu/drm/i915/i915_utils.h b/drivers/gpu/drm/i915/i915_utils.h
> index abd4dcd9f79c..95ead6bb1ba6 100644
> --- a/drivers/gpu/drm/i915/i915_utils.h
> +++ b/drivers/gpu/drm/i915/i915_utils.h
> @@ -461,4 +461,36 @@ static inline bool timer_expired(const struct timer_list *t)
>    */
>   #define IS_ACTIVE(config) ((config) != 0)
>   
> +#ifndef try_cmpxchg64
> +#if IS_ENABLED(CONFIG_64BIT)
> +#define try_cmpxchg64(_ptr, _pold, _new) try_cmpxchg(_ptr, _pold, _new)
> +#else
> +#define try_cmpxchg64(_ptr, _pold, _new)				\
> +({									\
> +	__typeof__(_ptr) _old = (__typeof__(_ptr))(_pold);		\
> +	__typeof__(*(_ptr)) __old = *_old;				\
> +	__typeof__(*(_ptr)) __cur = cmpxchg64(_ptr, __old, _new);	\
> +	bool success = __cur == __old;					\
> +	if (unlikely(!success))						\
> +		*_old = __cur;						\
> +	likely(success);						\
> +})
> +#endif
> +#endif
> +
> +#ifndef xchg64
> +#if IS_ENABLED(CONFIG_64BIT)
> +#define xchg64(_ptr, _new) xchg(_ptr, _new)
> +#else
> +#define xchg64(_ptr, _new)						\
> +({									\
> +	__typeof__(_ptr) __ptr = (_ptr);				\
> +	__typeof__(*(_ptr)) __old = *__ptr;				\
> +	while (!try_cmpxchg64(__ptr, &__old, (_new)))			\
> +		;							\
> +	__old;								\
> +})
> +#endif
> +#endif
> +
>   #endif /* !__I915_UTILS_H */
> 
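
For readers unfamiliar with the try_cmpxchg() convention, here is a minimal
userspace sketch of the pattern the patch wraps. It is not the kernel
implementation: demo_cmpxchg64(), demo_try_cmpxchg64() and demo_store_max()
are hypothetical names, and the GCC/Clang builtin
__sync_val_compare_and_swap stands in for cmpxchg64().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Userspace stand-in for cmpxchg64(): returns the value that was found
 * at *ptr. Uses a GCC/Clang builtin, not the kernel primitive.
 */
static inline uint64_t demo_cmpxchg64(uint64_t *ptr, uint64_t old,
				      uint64_t new_val)
{
	return __sync_val_compare_and_swap(ptr, old, new_val);
}

/*
 * try_ variant: returns true on success. On failure, *pold is refreshed
 * with the value currently stored so the caller can retry directly.
 */
static inline bool demo_try_cmpxchg64(uint64_t *ptr, uint64_t *pold,
				      uint64_t new_val)
{
	uint64_t old = *pold;
	uint64_t cur = demo_cmpxchg64(ptr, old, new_val);

	if (cur != old)
		*pold = cur;
	return cur == old;
}

/* Hypothetical caller: lock-free "store the maximum" loop. */
static inline void demo_store_max(uint64_t *ptr, uint64_t val)
{
	uint64_t old;

	assert(ptr);
	old = *ptr; /* plain read; kernel code would use READ_ONCE() */

	/* On failure, old already holds the fresh value: re-test, retry. */
	while (old < val && !demo_try_cmpxchg64(ptr, &old, val))
		;
}
```

The point of the helper is visible in demo_store_max(): the loop does not
need a separate re-read of *ptr after a failed exchange.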

Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Regards,

Tvrtko
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: [Intel-gfx] [PATCH 20/41] drm/i915: Replace priolist rbtree with a skiplist
  2021-01-27 15:10   ` Tvrtko Ursulin
@ 2021-01-27 15:33     ` Chris Wilson
  2021-01-27 15:44       ` Chris Wilson
  0 siblings, 1 reply; 90+ messages in thread
From: Chris Wilson @ 2021-01-27 15:33 UTC (permalink / raw)
  To: Tvrtko Ursulin, intel-gfx; +Cc: thomas.hellstrom

Quoting Tvrtko Ursulin (2021-01-27 15:10:43)
> 
> On 25/01/2021 14:01, Chris Wilson wrote:
> > Replace the priolist rbtree with a skiplist. The crucial difference is
> > that walking and removing the first element of a skiplist is O(1), but
> 
> I wasn't (and am not) familiar with them, but wikipedia page says 
> removal is O(logN) average case to O(N) worst case.
> 
> If I understand correctly O(1) could be ignoring the need to traverse 
> from top to bottom level and removing the element from all. But since 
> I915_PRIOLIST_HEIGHT is fixed maybe it is okay to call it O(1).

Correct, since we are removing the first element, we do not need to do the
lgN search and can just move the next[I915_PRIOLIST_HEIGHT] forwards.
(Although, I did start doing the lgN removal for timeslicing as
traversing the empty levels was showing up in worst case lock hold
times.) But the primary means of removing from the skiplist is as we
consume the first request during the dequeue.
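
A minimal sketch of why removing the first element needs no search,
assuming a fixed-height skiplist with one head pointer per level. The names
(demo_node, demo_skiplist, demo_pop_first, DEMO_HEIGHT) are hypothetical and
are not taken from the patch:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_HEIGHT 4	/* hypothetical stand-in for I915_PRIOLIST_HEIGHT */

struct demo_node {
	uint64_t key;
	int level;				/* levels this node is linked on */
	struct demo_node *next[DEMO_HEIGHT];
};

struct demo_skiplist {
	struct demo_node *first[DEMO_HEIGHT];	/* head pointer per level */
};

/*
 * Remove and return the smallest node. The smallest node is necessarily
 * the first node on every level it occupies, so there is nothing to
 * search for: at most DEMO_HEIGHT head pointers are moved forwards.
 */
static struct demo_node *demo_pop_first(struct demo_skiplist *sl)
{
	struct demo_node *node = sl->first[0];
	int lvl;

	if (!node)
		return NULL;

	assert(node->level >= 1 && node->level <= DEMO_HEIGHT);
	for (lvl = 0; lvl < node->level; lvl++) {
		assert(sl->first[lvl] == node);
		sl->first[lvl] = node->next[lvl];
	}
	return node;
}
```

With the height fixed at compile time, the loop bound is a constant, which
is the sense in which the removal is called O(1) above.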

> I wonder though why this wouldn't mean skip list would be worse for both 
> lightly loaded and highly-loaded scenarios? Presumably height would need 
> to be balanced to compensate for that.

I think the answer is yes. Skiplists use probabilistic balancing so they
are only from a bird's eye view as good as an rbtree. If you look at the
perf tests, the skiplists are generally faster, but it's close overall.

What sold me was lock_stat and that I could remove a few hacky patches
trying to hide some of the worst case behaviour of rbtree and how we had
frees within the critical submit path.
 
> In summary I have no idea for what number of in-flight requests would 
> they be better.
> 
> How about putting this patch aside for now since it doesn't sound it is 
> critical for deadline scheduling per se?

Oh no, we are not going back to the hacky patches like
https://patchwork.freedesktop.org/patch/407913/?series=84900&rev=1
https://patchwork.freedesktop.org/patch/407903/?series=84900&rev=1
-Chris

^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: [Intel-gfx] [PATCH 20/41] drm/i915: Replace priolist rbtree with a skiplist
  2021-01-27 15:33     ` Chris Wilson
@ 2021-01-27 15:44       ` Chris Wilson
  2021-01-27 15:58         ` Tvrtko Ursulin
  0 siblings, 1 reply; 90+ messages in thread
From: Chris Wilson @ 2021-01-27 15:44 UTC (permalink / raw)
  To: Tvrtko Ursulin, intel-gfx; +Cc: thomas.hellstrom

Quoting Chris Wilson (2021-01-27 15:33:05)
> Quoting Tvrtko Ursulin (2021-01-27 15:10:43)
> > 
> > On 25/01/2021 14:01, Chris Wilson wrote:
> > > Replace the priolist rbtree with a skiplist. The crucial difference is
> > > that walking and removing the first element of a skiplist is O(1), but
> > 
> > I wasn't (and am not) familiar with them, but wikipedia page says 
> > removal is O(logN) average case to O(N) worst case.
> > 
> > If I understand correctly O(1) could be ignoring the need to traverse 
> > from top to bottom level and removing the element from all. But since 
> > I915_PRIOLIST_HEIGHT is fixed maybe it is okay to call it O(1).
> 
> Correct, since we are removing the first element, we do not need to do the
> lgN search and can just move the next[I915_PRIOLIST_HEIGHT] forwards.
> (Although, I did start doing the lgN removal for timeslicing as
> traversing the empty levels was showing up in worst case lock hold
> times.) But the primary means of removing from the skiplist is as we
> consume the first request during the dequeue.
> 
> > I wonder though why this wouldn't mean skip list would be worse for both 
> > lightly loaded and highly-loaded scenarios? Presumably height would need 
> > to be balanced to compensate for that.
> 
> I think the answer is yes. Skiplists use probabilistic balancing so they
> are only from a bird's eye view as good as an rbtree. If you look at the
> perf tests, the skiplists are generally faster, but it's close overall.
> 
> What sold me was lock_stat and that I could remove a few hacky patches
> trying to hide some of the worst case behaviour of rbtree and how we had
> frees within the critical submit path.
>  
> > In summary I have no idea for what number of in-flight requests would 
> > they be better.
> > 
> > How about putting this patch aside for now since it doesn't sound it is 
> > critical for deadline scheduling per se?
> 
> Oh no, we are not going back to the hacky patches like
> https://patchwork.freedesktop.org/patch/407913/?series=84900&rev=1
> https://patchwork.freedesktop.org/patch/407903/?series=84900&rev=1

To be extra clear, the biggest drawback in using deadlines as the sort
key is that they are very, very sparse in comparison to priorities.
Where we would typically have only a single priority level for every
request, with deadlines we typically have a new deadline with every
request (and it's not until we get into priority bumping or timeslice
deferring that we start to see the deadlines coalesce). In this situation,
the lgN list traversal of rbtree during execlists_dequeue() was
abysmal, and so as the skiplists give similar lgN insertion but O(1)
list traversal, the difference is enough to completely negate the
overhead of having more levels to process. It is a dramatic improvement.
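
To make the "lgN insertion but O(1) list traversal" trade-off concrete,
here is a small sketch of a skiplist keyed by deadline. It is an
illustration only: sk_node, sk_insert, sk_walk and SK_HEIGHT are
hypothetical names, and the node level is passed in explicitly so the
example is deterministic (a real skiplist would draw it at random).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SK_HEIGHT 4	/* hypothetical fixed height */

struct sk_node {
	uint64_t deadline;
	int level;			/* 1..SK_HEIGHT */
	struct sk_node *next[SK_HEIGHT];
};

/*
 * Insert below a zero-initialised sentinel head, keeping every level
 * sorted by deadline (FIFO among equal deadlines). The search starts
 * on the sparse top level and drops down, which is where the expected
 * lgN cost comes from.
 */
static void sk_insert(struct sk_node *head, struct sk_node *node, int level)
{
	struct sk_node *pos = head;
	int lvl;

	assert(level >= 1 && level <= SK_HEIGHT);
	node->level = level;

	for (lvl = SK_HEIGHT - 1; lvl >= 0; lvl--) {
		while (pos->next[lvl] &&
		       pos->next[lvl]->deadline <= node->deadline)
			pos = pos->next[lvl];
		if (lvl < level) {
			node->next[lvl] = pos->next[lvl];
			pos->next[lvl] = node;
		}
	}
}

/* Walk level 0 in deadline order: one pointer chase per element. */
static size_t sk_walk(const struct sk_node *head, uint64_t *out, size_t max)
{
	const struct sk_node *it;
	size_t n = 0;

	for (it = head->next[0]; it && n < max; it = it->next[0])
		out[n++] = it->deadline;
	return n;
}
```

Even when every request carries a unique deadline, the dequeue side only
ever follows next[0]; the per-level bookkeeping is paid on insertion.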
-Chris

^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: [Intel-gfx] [PATCH 20/41] drm/i915: Replace priolist rbtree with a skiplist
  2021-01-27 15:44       ` Chris Wilson
@ 2021-01-27 15:58         ` Tvrtko Ursulin
  2021-01-28  9:50           ` Chris Wilson
  0 siblings, 1 reply; 90+ messages in thread
From: Tvrtko Ursulin @ 2021-01-27 15:58 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx; +Cc: thomas.hellstrom


On 27/01/2021 15:44, Chris Wilson wrote:
> Quoting Chris Wilson (2021-01-27 15:33:05)
>> Quoting Tvrtko Ursulin (2021-01-27 15:10:43)
>>>
>>> On 25/01/2021 14:01, Chris Wilson wrote:
>>>> Replace the priolist rbtree with a skiplist. The crucial difference is
>>>> that walking and removing the first element of a skiplist is O(1), but
>>>
>>> I wasn't (and am not) familiar with them, but wikipedia page says
>>> removal is O(logN) average case to O(N) worst case.
>>>
>>> If I understand correctly O(1) could be ignoring the need to traverse
>>> from top to bottom level and removing the element from all. But since
>>> I915_PRIOLIST_HEIGHT is fixed maybe it is okay to call it O(1).
>>
>> Correct, since we are removing the first element, we do not need to do the
>> lgN search and can just move the next[I915_PRIOLIST_HEIGHT] forwards.
>> (Although, I did start doing the lgN removal for timeslicing as
>> traversing the empty levels was showing up in worst case lock hold
>> times.) But the primary means of removing from the skiplist is as we
>> consume the first request during the dequeue.
>>
>>> I wonder though why this wouldn't mean skip list would be worse for both
>>> lightly loaded and highly-loaded scenarios? Presumably height would need
>>> to be balanced to compensate for that.
>>
>> I think the answer is yes. Skiplists use probabilistic balancing so they
>> are only from a bird's eye view as good as an rbtree. If you look at the
>> perf tests, the skiplists are generally faster, but it's close overall.
>>
>> What sold me was lock_stat and that I could remove a few hacky patches
>> trying to hide some of the worst case behaviour of rbtree and how we had
>> frees within the critical submit path.
>>   
>>> In summary I have no idea for what number of in-flight requests would
>>> they be better.
>>>
>>> How about putting this patch aside for now since it doesn't sound it is
>>> critical for deadline scheduling per se?
>>
>> Oh no, we are not going back to the hacky patches like
>> https://patchwork.freedesktop.org/patch/407913/?series=84900&rev=1
>> https://patchwork.freedesktop.org/patch/407903/?series=84900&rev=1
> 
> To be extra clear, the biggest drawback in using deadlines as the sort
> key is that they are very, very sparse in comparison to priorities.
> Where we would typically have only a single priority level for every
> request, with deadlines we typically have a new deadline with every
> request (and it's not until we get into priority bumping or timeslice
> deferring that we start to see the deadlines coalesce). In this situation,
> the lgN list traversal of rbtree during execlists_dequeue() was
> abysmal, and so as the skiplists give similar lgN insertion but O(1)
> list traversal, the difference is enough to completely negate the
> overhead of having more levels to process. It is a dramatic improvement.

Okay, makes sense. The change in key drives the requirement, so please
just mention that in the commit message and I'll tackle the skip list
mechanics in the meantime.

Regards,

Tvrtko

^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: [Intel-gfx] [PATCH 20/41] drm/i915: Replace priolist rbtree with a skiplist
  2021-01-27 15:58         ` Tvrtko Ursulin
@ 2021-01-28  9:50           ` Chris Wilson
  0 siblings, 0 replies; 90+ messages in thread
From: Chris Wilson @ 2021-01-28  9:50 UTC (permalink / raw)
  To: Tvrtko Ursulin, intel-gfx; +Cc: thomas.hellstrom

Quoting Tvrtko Ursulin (2021-01-27 15:58:12)
> Okay, makes sense. The change in key drives the requirement, so please
> just mention that in the commit message and I'll tackle the skip list
> mechanics in the meantime.

In the following patches, we introduce a new sort key to the scheduler,
a virtual deadline. This imposes a different structure on the tree.
Using a priority sort, we have very few priority levels active at any
time, most likely just the default priority and so the rbtree degenerates
to a single element containing the list of all ready requests. The
deadlines in contrast are very sparse, and typically each request has a
unique deadline. Instead of being able to simply walk the list during
dequeue, with the deadline scheduler we have to iterate through the bst
on the critical submission path. Skiplists are vastly superior in this
instance due to the O(1) iteration during dequeue, with very similar
characteristics [on average] to the rbtree for insertion.

This means that by using skiplists we can introduce a sparse sort key
without degrading latency on the critical submission path.

As an example, in one simple case where we try to do lots of
semi-independent work without any priority management (gem_exec_parallel),
the lock hold times were:
[worst]        [total]    [avg]
 973.05     6301584.84     0.35 # plain rbtree
 559.82     5424915.25     0.33 # best rbtree with pruning
 208.21     3898784.09     0.24 # skiplist
  34.05     5784106.01     0.32 # rbtree without deadlines
  23.35     4152999.80     0.24 # skiplist without deadlines

-Chris

^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: [Intel-gfx] [PATCH 22/41] drm/i915: Fair low-latency scheduling
  2021-01-25 14:01 ` [Intel-gfx] [PATCH 22/41] drm/i915: Fair low-latency scheduling Chris Wilson
@ 2021-01-28 11:35   ` Tvrtko Ursulin
  2021-01-28 12:32     ` Chris Wilson
  0 siblings, 1 reply; 90+ messages in thread
From: Tvrtko Ursulin @ 2021-01-28 11:35 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx; +Cc: thomas.hellstrom


On 25/01/2021 14:01, Chris Wilson wrote:
> The first "scheduler" was a topological sorting of requests into
> priority order. The execution order was deterministic, the earliest
> submitted, highest priority request would be executed first. Priority
> inheritance ensured that inversions were kept at bay, and allowed us to
> dynamically boost priorities (e.g. for interactive pageflips).
> 
> The minimalistic timeslicing scheme was an attempt to introduce fairness
> between long running requests, by evicting the active request at the end
> of a timeslice and moving it to the back of its priority queue (while
> ensuring that dependencies were kept in order). For short running
> requests from many clients of equal priority, the scheme is still very
> much FIFO submission ordering, and as unfair as before.
> 
> To impose fairness, we need an external metric that ensures that clients
> are interspersed, so we don't execute one long chain from client A before
> executing any of client B. This could be imposed by the clients
> themselves by using fences based on an external clock, that is they only
> submit work for a "frame" at frame-intervals, instead of submitting as
> much work as they are able to. The standard SwapBuffers approach is akin
> to double buffering, where as one frame is being executed, the next is
> being submitted, such that there is always a maximum of two frames per
> client in the pipeline and so ideally maintains consistent input-output
> latency. Even this scheme exhibits unfairness under load as a single
> client will execute two frames back to back before the next, and with
> enough clients, deadlines will be missed.
> 
> The idea introduced by BFS/MuQSS is that fairness is introduced by
> metering with an external clock. Every request, when it becomes ready to
> execute is assigned a virtual deadline, and execution order is then
> determined by earliest deadline. Priority is used as a hint, rather than
> strict ordering, where high priority requests have earlier deadlines,
> but not necessarily earlier than outstanding work. Thus work is executed
> in order of 'readiness', with timeslicing to demote long running work.
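
As an illustration of the idea in the paragraph above (priority shifts the
deadline but does not impose strict ordering), here is a sketch under
invented assumptions. The formula, the priority range and every name
(vd_request, vd_virtual_deadline, vd_pick_earliest, VD_SLICE_NS) are
hypothetical and are not the ones used by the patch:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define VD_PRIO_MIN (-2)		/* hypothetical priority range */
#define VD_PRIO_MAX 2
#define VD_SLICE_NS 1000000ull		/* hypothetical base interval */

struct vd_request {
	int prio;
	uint64_t deadline;
};

/*
 * Assumed formula: the deadline is the time the request became ready
 * plus an offset that shrinks as priority rises. Higher priority gives
 * an earlier deadline, but only relative to the same ready time.
 */
static uint64_t vd_virtual_deadline(uint64_t now_ns, int prio)
{
	assert(prio >= VD_PRIO_MIN && prio <= VD_PRIO_MAX);
	return now_ns + VD_SLICE_NS * (uint64_t)(1 + VD_PRIO_MAX - prio);
}

/* Execution order is earliest deadline first; ties keep array order. */
static struct vd_request *vd_pick_earliest(struct vd_request *rq, size_t count)
{
	struct vd_request *best = NULL;
	size_t i;

	for (i = 0; i < count; i++)
		if (!best || rq[i].deadline < best->deadline)
			best = &rq[i];
	return best;
}
```

Under this assumed formula, a low-priority request that became ready long
enough ago still runs before a freshly submitted high-priority one, which
is the "hint, rather than strict ordering" behaviour described above.
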
> 
> The Achilles' heel of this scheduler is its strong preference for
> low-latency and favouring of new queues. Whereas it was easy to dominate
> the old scheduler by flooding it with many requests over a short period
> of time, the new scheduler can be dominated by a 'synchronous' client
> that waits for each of its requests to complete before submitting the
> next. As such a client has no history, it is always considered
> ready-to-run and receives an earlier deadline than the long running
> requests. This is compensated for by refreshing the current execution's
> deadline and by disallowing preemption for timeslice shuffling.
> 
> In contrast, one key advantage of disconnecting the sort key from the
> priority value is that we can freely adjust the deadline to compensate
> for other factors. This is used in conjunction with submitting requests
> ahead-of-schedule that then busywait on the GPU using semaphores. Since
> we don't want to spend a timeslice busywaiting instead of doing real
> work when available, we deprioritise work by giving the semaphore waits
> a later virtual deadline. The priority deboost is applied to semaphore
> workloads after they miss a semaphore wait and a new context is pending.
> The request is then restored to its normal priority once the semaphores
> are signaled so that it is not unfairly penalised under contention by
> remaining at a far future deadline. This is a much improved and cleaner
> version of commit f9e9e9de58c7 ("drm/i915: Prioritise non-busywait
> semaphore workloads").
> 
> To check the impact on throughput (often the downfall of latency
> sensitive schedulers), we used gem_wsim to simulate various transcode
> workloads with different load balancers, and varying the number of
> competing [heterogeneous] clients. On Kabylake gt3e running at fixed
> clocks,
> 
> +delta%------------------------------------------------------------------+
> |       a                                                                |
> |       a                                                                |
> |       a                                                                |
> |       a                                                                |
> |       aa                                                               |
> |      aaa                                                               |
> |      aaaa                                                              |
> |     aaaaaa                                                             |
> |     aaaaaa                                                             |
> |     aaaaaa   a                a                                        |
> | aa  aaaaaa a a      a  a   aa a       a         a       a             a|
> ||______M__A__________|                                                  |
> +------------------------------------------------------------------------+
>      N           Min           Max        Median          Avg       Stddev
>    108    -4.6326643     47.797855 -0.00069639128     2.116185   7.6764049

Is +47% the aggregate throughput, or 47% less variance between the worst
and best clients in the group?

> 
> Reviewing the same workloads on Tigerlake,
> 
> +delta%------------------------------------------------------------------+
> |       a                                                                |
> |       a                                                                |
> |       a                                                                |
> |       aa a                                                             |
> |       aaaa                                                             |
> |       aaaa                                                             |
> |    aaaaaaa                                                             |
> |    aaaaaaa                                                             |
> |    aaaaaaa      a   a   aa  a         a                         a      |
> | aaaaaaaaaa a aa a a a aaaa aa   a     a        aa               a     a|
> ||_______M____A_____________|                                            |
> +------------------------------------------------------------------------+
>      N           Min           Max        Median          Avg       Stddev
>    108     -4.258712      46.83081    0.36853159    4.1415662     9.461689
> 
> The expectation is that by deliberately increasing the number of context
> switches to improve fairness between clients, throughput will be
> diminished. What we do see are small fluctuations around no change,
> with the median result being improved throughput. The dramatic
> improvement is from reintroducing the improved no-semaphore boosting,
> which avoids accidentally preventing scheduling of ready workloads due
> to busy spinners.
> 
> This scheduler is based on MuQSS by Dr Con Kolivas.
> 
> Testcase: igt/gem_exec_fair
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> ---
>   drivers/gpu/drm/i915/gt/intel_engine_cs.c     |   2 -
>   .../gpu/drm/i915/gt/intel_engine_heartbeat.c  |   1 +
>   drivers/gpu/drm/i915/gt/intel_engine_pm.c     |   4 +-
>   drivers/gpu/drm/i915/gt/intel_engine_types.h  |  14 -
>   .../drm/i915/gt/intel_execlists_submission.c  | 205 ++++-----
>   drivers/gpu/drm/i915/gt/selftest_execlists.c  |  30 +-
>   drivers/gpu/drm/i915/gt/selftest_hangcheck.c  |   5 +-
>   drivers/gpu/drm/i915/gt/selftest_lrc.c        |   1 +
>   .../gpu/drm/i915/gt/uc/intel_guc_submission.c |   5 -
>   drivers/gpu/drm/i915/i915_priolist_types.h    |   7 +-
>   drivers/gpu/drm/i915/i915_request.c           |  14 +-
>   drivers/gpu/drm/i915/i915_scheduler.c         | 433 +++++++++++++-----
>   drivers/gpu/drm/i915/i915_scheduler.h         |  16 +-
>   drivers/gpu/drm/i915/i915_scheduler_types.h   |  23 +
>   drivers/gpu/drm/i915/selftests/i915_request.c |   1 +
>   .../gpu/drm/i915/selftests/i915_scheduler.c   | 136 ++++++
>   16 files changed, 630 insertions(+), 267 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> index 2f9a8960144b..8372c8bc4ca5 100644
> --- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> +++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> @@ -573,8 +573,6 @@ void intel_engine_init_execlists(struct intel_engine_cs *engine)
>   	memset(execlists->pending, 0, sizeof(execlists->pending));
>   	execlists->active =
>   		memset(execlists->inflight, 0, sizeof(execlists->inflight));
> -
> -	execlists->queue_priority_hint = INT_MIN;
>   }
>   
>   static void cleanup_status_page(struct intel_engine_cs *engine)
> diff --git a/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c b/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c
> index 0b026cde9f09..2d1f0a4da13c 100644
> --- a/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c
> +++ b/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c
> @@ -204,6 +204,7 @@ static int __intel_engine_pulse(struct intel_engine_cs *engine)
>   	if (IS_ERR(rq))
>   		return PTR_ERR(rq);
>   
> +	rq->sched.deadline = 0;
>   	__set_bit(I915_FENCE_FLAG_SENTINEL, &rq->fence.flags);
>   
>   	heartbeat_commit(rq, &attr);
> diff --git a/drivers/gpu/drm/i915/gt/intel_engine_pm.c b/drivers/gpu/drm/i915/gt/intel_engine_pm.c
> index 205feeaf0e76..2427d9e01be9 100644
> --- a/drivers/gpu/drm/i915/gt/intel_engine_pm.c
> +++ b/drivers/gpu/drm/i915/gt/intel_engine_pm.c
> @@ -211,6 +211,7 @@ static bool switch_to_kernel_context(struct intel_engine_cs *engine)
>   	i915_request_add_active_barriers(rq);
>   
>   	/* Install ourselves as a preemption barrier */
> +	rq->sched.deadline = 0;
>   	rq->sched.attr.priority = I915_PRIORITY_BARRIER;
>   	if (likely(!__i915_request_commit(rq))) { /* engine should be idle! */
>   		/*
> @@ -271,9 +272,6 @@ static int __engine_park(struct intel_wakeref *wf)
>   	intel_engine_park_heartbeat(engine);
>   	intel_breadcrumbs_park(engine->breadcrumbs);
>   
> -	/* Must be reset upon idling, or we may miss the busy wakeup. */
> -	GEM_BUG_ON(engine->execlists.queue_priority_hint != INT_MIN);
> -
>   	if (engine->park)
>   		engine->park(engine);
>   
> diff --git a/drivers/gpu/drm/i915/gt/intel_engine_types.h b/drivers/gpu/drm/i915/gt/intel_engine_types.h
> index 76d561c2c6aa..b5bef848a2d5 100644
> --- a/drivers/gpu/drm/i915/gt/intel_engine_types.h
> +++ b/drivers/gpu/drm/i915/gt/intel_engine_types.h
> @@ -223,20 +223,6 @@ struct intel_engine_execlists {
>   	 */
>   	unsigned int port_mask;
>   
> -	/**
> -	 * @queue_priority_hint: Highest pending priority.
> -	 *
> -	 * When we add requests into the queue, or adjust the priority of
> -	 * executing requests, we compute the maximum priority of those
> -	 * pending requests. We can then use this value to determine if
> -	 * we need to preempt the executing requests to service the queue.
> -	 * However, since the we may have recorded the priority of an inflight
> -	 * request we wanted to preempt but since completed, at the time of
> -	 * dequeuing the priority hint may no longer may match the highest
> -	 * available request priority.
> -	 */
> -	int queue_priority_hint;
> -
>   	struct rb_root_cached virtual;
>   
>   	/**
> diff --git a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
> index 129144dd86b0..8f12068465bd 100644
> --- a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
> +++ b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
> @@ -178,7 +178,7 @@ struct virtual_engine {
>   	 */
>   	struct ve_node {
>   		struct rb_node rb;
> -		int prio;
> +		u64 deadline;
>   	} nodes[I915_NUM_ENGINES];
>   
>   	/*
> @@ -246,25 +246,12 @@ static void ring_set_paused(const struct intel_engine_cs *engine, int state)
>   
>   static int rq_prio(const struct i915_request *rq)
>   {
> -	return READ_ONCE(rq->sched.attr.priority);
> +	return rq->sched.attr.priority;
>   }
>   
> -static int effective_prio(const struct i915_request *rq)
> +static u64 rq_deadline(const struct i915_request *rq)
>   {
> -	int prio = rq_prio(rq);
> -
> -	/*
> -	 * If this request is special and must not be interrupted at any
> -	 * cost, so be it. Note we are only checking the most recent request
> -	 * in the context and so may be masking an earlier vip request. It
> -	 * is hoped that under the conditions where nopreempt is used, this
> -	 * will not matter (i.e. all requests to that context will be
> -	 * nopreempt for as long as desired).
> -	 */
> -	if (i915_request_has_nopreempt(rq))
> -		prio = I915_PRIORITY_UNPREEMPTABLE;
> -
> -	return prio;
> +	return rq->sched.deadline;
>   }
>   
>   static struct i915_request *first_request(struct i915_sched_engine *se)
> @@ -283,61 +270,61 @@ static struct i915_request *first_request(struct i915_sched_engine *se)
>   	return NULL;
>   }
>   
> -static int queue_prio(struct i915_sched_engine *se)
> +static struct i915_request *first_virtual(const struct intel_engine_cs *engine)
>   {
> -	struct i915_request *rq;
> +	struct rb_node *rb;
>   
> -	rq = first_request(se);
> -	if (!rq)
> -		return INT_MIN;
> +	rb = rb_first_cached(&engine->execlists.virtual);
> +	if (!rb)
> +		return NULL;
>   
> -	return rq_prio(rq);
> +	return READ_ONCE(rb_entry(rb,
> +				  struct virtual_engine,
> +				  nodes[engine->id].rb)->request);
>   }
>   
> -static int virtual_prio(const struct intel_engine_execlists *el)
> +static const struct i915_request *
> +next_elsp_request(struct intel_engine_cs *engine, const struct i915_request *rq)
>   {
> -	struct rb_node *rb = rb_first_cached(&el->virtual);
> +	if (list_is_last(&rq->sched.link, &engine->active.requests))
> +		return NULL;
>   
> -	return rb ? rb_entry(rb, struct ve_node, rb)->prio : INT_MIN;
> +	return list_next_entry(rq, sched.link);
> +}
> +
> +static bool
> +dl_before(const struct i915_request *next, const struct i915_request *prev)
> +{
> +	return !prev || (next && rq_deadline(next) < rq_deadline(prev));
>   }
>   
>   static bool need_preempt(struct intel_engine_cs *engine,
>   			 const struct i915_request *rq)
>   {
> -	int last_prio;
> +	const struct i915_request *first = NULL;
> +	const struct i915_request *next;
>   
>   	if (!intel_engine_has_semaphores(engine))
>   		return false;
>   
>   	/*
> -	 * Check if the current priority hint merits a preemption attempt.
> -	 *
> -	 * We record the highest value priority we saw during rescheduling
> -	 * prior to this dequeue, therefore we know that if it is strictly
> -	 * less than the current tail of ESLP[0], we do not need to force
> -	 * a preempt-to-idle cycle.
> -	 *
> -	 * However, the priority hint is a mere hint that we may need to
> -	 * preempt. If that hint is stale or we may be trying to preempt
> -	 * ourselves, ignore the request.
> -	 *
> -	 * More naturally we would write
> -	 *      prio >= max(0, last);
> -	 * except that we wish to prevent triggering preemption at the same
> -	 * priority level: the task that is running should remain running
> -	 * to preserve FIFO ordering of dependencies.
> +	 * If this request is special and must not be interrupted at any
> +	 * cost, so be it. Note we are only checking the most recent request
> +	 * in the context and so may be masking an earlier vip request. It
> +	 * is hoped that under the conditions where nopreempt is used, this
> +	 * will not matter (i.e. all requests to that context will be
> +	 * nopreempt for as long as desired).
>   	 */
> -	last_prio = max(effective_prio(rq), I915_PRIORITY_NORMAL - 1);
> -	if (engine->execlists.queue_priority_hint <= last_prio)
> +	if (i915_request_has_nopreempt(rq))
>   		return false;
>   
>   	/*
>   	 * Check against the first request in ELSP[1], it will, thanks to the
>   	 * power of PI, be the highest priority of that context.
>   	 */
> -	if (!list_is_last(&rq->sched.link, &engine->active.requests) &&
> -	    rq_prio(list_next_entry(rq, sched.link)) > last_prio)
> -		return true;
> +	next = next_elsp_request(engine, rq);
> +	if (dl_before(next, first))
> +		first = next;
>   
>   	/*
>   	 * If the inflight context did not trigger the preemption, then maybe
> @@ -349,8 +336,31 @@ static bool need_preempt(struct intel_engine_cs *engine,
>   	 * ELSP[0] or ELSP[1] as, thanks again to PI, if it was the same
>   	 * context, it's priority would not exceed ELSP[0] aka last_prio.
>   	 */
> -	return max(virtual_prio(&engine->execlists),
> -		   queue_prio(&engine->active)) > last_prio;
> +	next = first_request(&engine->active);
> +	if (dl_before(next, first))
> +		first = next;
> +
> +	next = first_virtual(engine);
> +	if (dl_before(next, first))
> +		first = next;
> +
> +	if (!dl_before(first, rq))
> +		return false;
> +
> +	/*
> +	 * While a request may have been queued that has an earlier deadline
> +	 * than is currently running, we only allow it to perform an urgent
> +	 * preemption if it also has higher priority. The cost of frequently
> +	 * switching between contexts is noticeable, so we try to keep
> +	 * the deadline shuffling only to timeslice boundaries.
> +	 */
> +	ENGINE_TRACE(engine,
> +		     "preempt for first=%llx:%llu, dl=%llu, prio=%d?\n",
> +		     first->fence.context,
> +		     first->fence.seqno,
> +		     rq_deadline(first),
> +		     rq_prio(first));
> +	return rq_prio(first) > max(rq_prio(rq), I915_PRIORITY_NORMAL - 1);
>   }
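
The decision made by the rewritten need_preempt() can be sketched
standalone as follows. This is a simplified model, not the driver code: the
pd_* names and the prio_floor parameter are hypothetical, and the
semaphore and nopreempt checks from the quoted function are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct pd_request {
	uint64_t deadline;
	int prio;
};

/* Same shape as the quoted dl_before(): a NULL prev loses to anything. */
static bool pd_dl_before(const struct pd_request *next,
			 const struct pd_request *prev)
{
	return !prev || (next && next->deadline < prev->deadline);
}

/*
 * Pick the earliest-deadline candidate (entries may be NULL), then
 * preempt only if it is both earlier than the running request and of
 * strictly higher priority than max(running->prio, prio_floor).
 */
static bool pd_need_preempt(const struct pd_request *running,
			    const struct pd_request *const *cand,
			    size_t count, int prio_floor)
{
	const struct pd_request *first = NULL;
	int last_prio;
	size_t i;

	assert(running);
	for (i = 0; i < count; i++)
		if (pd_dl_before(cand[i], first))
			first = cand[i];

	/* Also false when no candidate exists (first == NULL). */
	if (!pd_dl_before(first, running))
		return false;

	last_prio = running->prio > prio_floor ? running->prio : prio_floor;
	return first->prio > last_prio;
}
```

An earlier deadline on its own is therefore not enough for an urgent
preemption in this model; such requests wait for the timeslice boundary.
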
>   
>   __maybe_unused static bool
> @@ -367,7 +377,7 @@ assert_priority_queue(const struct i915_request *prev,
>   	if (i915_request_is_active(prev))
>   		return true;
>   
> -	return rq_prio(prev) >= rq_prio(next);
> +	return rq_deadline(prev) <= rq_deadline(next);
>   }
>   
>   static void
> @@ -549,9 +559,12 @@ static void __execlists_schedule_out(struct i915_request * const rq,
>   	 * If we have just completed this context, the engine may now be
>   	 * idle and we want to re-enter powersaving.
>   	 */
> -	if (intel_timeline_is_last(ce->timeline, rq) &&
> -	    __i915_request_is_complete(rq))
> -		intel_engine_add_retire(engine, ce->timeline);
> +	if (__i915_request_is_complete(rq)) {
> +		if (!intel_timeline_is_last(ce->timeline, rq))
> +			i915_request_update_deadline(list_next_entry(rq, link));

Could you add a comment here explaining why it is important to update the
deadline for the following request once the previous one completes?

And this is just for the last request of the coalesced bunch, right?

> +		else
> +			intel_engine_add_retire(engine, ce->timeline);
> +	}
>   
>   	ccid = ce->lrc.ccid;
>   	ccid >>= GEN11_SW_CTX_ID_SHIFT - 32;
> @@ -661,14 +674,14 @@ dump_port(char *buf, int buflen, const char *prefix, struct i915_request *rq)
>   	if (!rq)
>   		return "";
>   
> -	snprintf(buf, buflen, "%sccid:%x %llx:%lld%s prio %d",
> +	snprintf(buf, buflen, "%sccid:%x %llx:%lld%s dl:%llu",
>   		 prefix,
>   		 rq->context->lrc.ccid,
>   		 rq->fence.context, rq->fence.seqno,
>   		 __i915_request_is_complete(rq) ? "!" :
>   		 __i915_request_has_started(rq) ? "*" :
>   		 "",
> -		 rq_prio(rq));
> +		 rq_deadline(rq));
>   
>   	return buf;
>   }
> @@ -1191,11 +1204,11 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
>   	if (last) {
>   		if (need_preempt(engine, last)) {
>   			ENGINE_TRACE(engine,
> -				     "preempting last=%llx:%lld, prio=%d, hint=%d\n",
> +				     "preempting last=%llx:%llu, dl=%llu, prio=%d\n",
>   				     last->fence.context,
>   				     last->fence.seqno,
> -				     last->sched.attr.priority,
> -				     execlists->queue_priority_hint);
> +				     rq_deadline(last),
> +				     rq_prio(last));
>   			record_preemption(execlists);
>   
>   			/*
> @@ -1217,11 +1230,11 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
>   			last = NULL;
>   		} else if (timeslice_expired(engine, last)) {
>   			ENGINE_TRACE(engine,
> -				     "expired:%s last=%llx:%lld, prio=%d, hint=%d, yield?=%s\n",
> +				     "expired:%s last=%llx:%llu, deadline=%llu, now=%llu, yield?=%s\n",
>   				     yesno(timer_expired(&execlists->timer)),
>   				     last->fence.context, last->fence.seqno,
> -				     rq_prio(last),
> -				     execlists->queue_priority_hint,
> +				     rq_deadline(last),
> +				     i915_sched_to_ticks(ktime_get()),
>   				     yesno(timeslice_yield(execlists, last)));
>   
>   			/*
> @@ -1292,7 +1305,7 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
>   		GEM_BUG_ON(rq->engine != &ve->base);
>   		GEM_BUG_ON(rq->context != &ve->context);
>   
> -		if (unlikely(rq_prio(rq) < queue_prio(&engine->active))) {
> +		if (!dl_before(rq, first_request(&engine->active))) {
>   			spin_unlock(&ve->base.active.lock);
>   			break;
>   		}
> @@ -1304,16 +1317,15 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
>   		}
>   
>   		ENGINE_TRACE(engine,
> -			     "virtual rq=%llx:%lld%s, new engine? %s\n",
> +			     "virtual rq=%llx:%lld%s, dl %llx, new engine? %s\n",
>   			     rq->fence.context,
>   			     rq->fence.seqno,
>   			     __i915_request_is_complete(rq) ? "!" :
>   			     __i915_request_has_started(rq) ? "*" :
>   			     "",
> +			     rq_deadline(rq),
>   			     yesno(engine != ve->siblings[0]));
> -
>   		WRITE_ONCE(ve->request, NULL);
> -		WRITE_ONCE(ve->base.execlists.queue_priority_hint, INT_MIN);
>   
>   		rb = &ve->nodes[engine->id].rb;
>   		rb_erase_cached(rb, &execlists->virtual);
> @@ -1404,6 +1416,10 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
>   				if (rq->execution_mask != engine->mask)
>   					goto done;
>   
> +				if (unlikely(dl_before(first_virtual(engine),
> +						       rq)))
> +					goto done;
> +
>   				/*
>   				 * If GVT overrides us we only ever submit
>   				 * port[0], leaving port[1] empty. Note that we
> @@ -1440,24 +1456,6 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
>   	}
>   done:
>   	*port++ = i915_request_get(last);
> -
> -	/*
> -	 * Here be a bit of magic! Or sleight-of-hand, whichever you prefer.
> -	 *
> -	 * We choose the priority hint such that if we add a request of greater
> -	 * priority than this, we kick the submission tasklet to decide on
> -	 * the right order of submitting the requests to hardware. We must
> -	 * also be prepared to reorder requests as they are in-flight on the
> -	 * HW. We derive the priority hint then as the first "hole" in
> -	 * the HW submission ports and if there are no available slots,
> -	 * the priority of the lowest executing request, i.e. last.
> -	 *
> -	 * When we do receive a higher priority request ready to run from the
> -	 * user, see queue_request(), the priority hint is bumped to that
> -	 * request triggering preemption on the next dequeue (or subsequent
> -	 * interrupt for secondary ports).
> -	 */
> -	execlists->queue_priority_hint = queue_prio(&engine->active);
>   	spin_unlock(&engine->active.lock);
>   
>   	/*
> @@ -2631,10 +2629,6 @@ static void execlists_reset_rewind(struct intel_engine_cs *engine, bool stalled)
>   
>   static void nop_submission_tasklet(unsigned long data)
>   {
> -	struct intel_engine_cs * const engine = (struct intel_engine_cs *)data;
> -
> -	/* The driver is wedged; don't process any more events. */
> -	WRITE_ONCE(engine->execlists.queue_priority_hint, INT_MIN);
>   }
>   
>   static void execlists_reset_cancel(struct intel_engine_cs *engine)
> @@ -2701,16 +2695,12 @@ static void execlists_reset_cancel(struct intel_engine_cs *engine)
>   			rq->engine = engine;
>   			__i915_request_submit(rq);
>   			i915_request_put(rq);
> -
> -			ve->base.execlists.queue_priority_hint = INT_MIN;
>   		}
>   		spin_unlock(&ve->base.active.lock);
>   	}
>   
>   	/* Remaining _unready_ requests will be nop'ed when submitted */
>   
> -	execlists->queue_priority_hint = INT_MIN;
> -
>   	GEM_BUG_ON(__tasklet_is_enabled(&engine->active.tasklet));
>   	engine->active.tasklet.func = nop_submission_tasklet;
>   
> @@ -3115,7 +3105,8 @@ static const struct intel_context_ops virtual_context_ops = {
>   	.destroy = virtual_context_destroy,
>   };
>   
> -static intel_engine_mask_t virtual_submission_mask(struct virtual_engine *ve)
> +static intel_engine_mask_t
> +virtual_submission_mask(struct virtual_engine *ve, u64 *deadline)
>   {
>   	struct i915_request *rq;
>   	intel_engine_mask_t mask;
> @@ -3132,9 +3123,11 @@ static intel_engine_mask_t virtual_submission_mask(struct virtual_engine *ve)
>   		mask = ve->siblings[0]->mask;
>   	}
>   
> -	ENGINE_TRACE(&ve->base, "rq=%llx:%lld, mask=%x, prio=%d\n",
> +	*deadline = rq_deadline(rq);
> +
> +	ENGINE_TRACE(&ve->base, "rq=%llx:%llu, mask=%x, dl=%llu\n",
>   		     rq->fence.context, rq->fence.seqno,
> -		     mask, ve->base.execlists.queue_priority_hint);
> +		     mask, *deadline);
>   
>   	return mask;
>   }
> @@ -3142,12 +3135,12 @@ static intel_engine_mask_t virtual_submission_mask(struct virtual_engine *ve)
>   static void virtual_submission_tasklet(unsigned long data)
>   {
>   	struct virtual_engine * const ve = (struct virtual_engine *)data;
> -	const int prio = READ_ONCE(ve->base.execlists.queue_priority_hint);
>   	intel_engine_mask_t mask;
>   	unsigned int n;
> +	u64 deadline;
>   
>   	rcu_read_lock();
> -	mask = virtual_submission_mask(ve);
> +	mask = virtual_submission_mask(ve, &deadline);
>   	rcu_read_unlock();
>   	if (unlikely(!mask))
>   		return;
> @@ -3180,7 +3173,8 @@ static void virtual_submission_tasklet(unsigned long data)
>   			 */
>   			first = rb_first_cached(&sibling->execlists.virtual) ==
>   				&node->rb;
> -			if (prio == node->prio || (prio > node->prio && first))
> +			if (deadline == node->deadline ||
> +			    (deadline < node->deadline && first))
>   				goto submit_engine;
>   
>   			rb_erase_cached(&node->rb, &sibling->execlists.virtual);
> @@ -3194,7 +3188,7 @@ static void virtual_submission_tasklet(unsigned long data)
>   
>   			rb = *parent;
>   			other = rb_entry(rb, typeof(*other), rb);
> -			if (prio > other->prio) {
> +			if (deadline < other->deadline) {
>   				parent = &rb->rb_left;
>   			} else {
>   				parent = &rb->rb_right;
> @@ -3209,8 +3203,8 @@ static void virtual_submission_tasklet(unsigned long data)
>   
>   submit_engine:
>   		GEM_BUG_ON(RB_EMPTY_NODE(&node->rb));
> -		node->prio = prio;
> -		if (first && prio > sibling->execlists.queue_priority_hint)
> +		node->deadline = deadline;
> +		if (first)
>   			i915_sched_kick(&sibling->active);
>   
>   unlock_engine:
> @@ -3246,7 +3240,9 @@ static void virtual_submit_request(struct i915_request *rq)
>   		i915_request_put(ve->request);
>   	}
>   
> -	ve->base.execlists.queue_priority_hint = rq_prio(rq);
> +	rq->sched.deadline =
> +		min(rq->sched.deadline,
> +		    i915_scheduler_next_virtual_deadline(rq_prio(rq)));
>   	ve->request = i915_request_get(rq);
>   
>   	GEM_BUG_ON(!list_empty(virtual_queue(ve)));
> @@ -3349,7 +3345,6 @@ intel_execlists_create_virtual(struct intel_engine_cs **siblings,
>   	ve->base.bond_execute = virtual_bond_execute;
>   
>   	INIT_LIST_HEAD(virtual_queue(ve));
> -	ve->base.execlists.queue_priority_hint = INT_MIN;
>   	tasklet_init(&ve->base.active.tasklet,
>   		     virtual_submission_tasklet,
>   		     (unsigned long)ve);
> @@ -3533,10 +3528,6 @@ void intel_execlists_show_requests(struct intel_engine_cs *engine,
>   		show_request(m, last, "\t\t", 0);
>   	}
>   
> -	if (execlists->queue_priority_hint != INT_MIN)
> -		drm_printf(m, "\t\tQueue priority hint: %d\n",
> -			   READ_ONCE(execlists->queue_priority_hint));
> -
>   	last = NULL;
>   	count = 0;
>   	for_each_priolist(pl, &engine->active.queue) {
> diff --git a/drivers/gpu/drm/i915/gt/selftest_execlists.c b/drivers/gpu/drm/i915/gt/selftest_execlists.c
> index d2036e16274d..730b7ea920ec 100644
> --- a/drivers/gpu/drm/i915/gt/selftest_execlists.c
> +++ b/drivers/gpu/drm/i915/gt/selftest_execlists.c
> @@ -867,7 +867,7 @@ semaphore_queue(struct intel_engine_cs *engine, struct i915_vma *vma, int idx)
>   static int
>   release_queue(struct intel_engine_cs *engine,
>   	      struct i915_vma *vma,
> -	      int idx, int prio)
> +	      int idx, u64 deadline)
>   {
>   	struct i915_request *rq;
>   	u32 *cs;
> @@ -892,10 +892,7 @@ release_queue(struct intel_engine_cs *engine,
>   	i915_request_get(rq);
>   	i915_request_add(rq);
>   
> -	local_bh_disable();
> -	i915_request_set_priority(rq, prio);
> -	local_bh_enable(); /* kick tasklet */
> -
> +	i915_request_set_deadline(rq, deadline);

I think some leading underscores on this API would be beneficial, to
emphasise that high level callers should not use it on their requests.
I am thinking of things like tests and in-kernel clients; my
understanding is that this API is not meant for them.

>   	i915_request_put(rq);
>   
>   	return 0;
> @@ -909,6 +906,7 @@ slice_semaphore_queue(struct intel_engine_cs *outer,
>   	struct intel_engine_cs *engine;
>   	struct i915_request *head;
>   	enum intel_engine_id id;
> +	long timeout;
>   	int err, i, n = 0;
>   
>   	head = semaphore_queue(outer, vma, n++);
> @@ -932,12 +930,16 @@ slice_semaphore_queue(struct intel_engine_cs *outer,
>   		}
>   	}
>   
> -	err = release_queue(outer, vma, n, I915_PRIORITY_BARRIER);
> +	err = release_queue(outer, vma, n, 0);
>   	if (err)
>   		goto out;
>   
> -	if (i915_request_wait(head, 0,
> -			      2 * outer->gt->info.num_engines * (count + 2) * (count + 3)) < 0) {
> +	/* Expected number of pessimal slices required */
> +	timeout = outer->gt->info.num_engines * (count + 2) * (count + 3);
> +	timeout *= 4; /* safety factor, including bucketing */
> +	timeout += HZ / 2; /* and include the request completion */
> +
> +	if (i915_request_wait(head, 0, timeout) < 0) {
>   		pr_err("%s: Failed to slice along semaphore chain of length (%d, %d)!\n",
>   		       outer->name, count, n);
>   		GEM_TRACE_DUMP();
> @@ -1042,6 +1044,8 @@ create_rewinder(struct intel_context *ce,
>   		err = i915_request_await_dma_fence(rq, &wait->fence);
>   		if (err)
>   			goto err;
> +
> +		i915_request_set_deadline(rq, rq_deadline(wait));
>   	}
>   
>   	cs = intel_ring_begin(rq, 14);
> @@ -1318,6 +1322,7 @@ static int live_timeslice_queue(void *arg)
>   			goto err_heartbeat;
>   		}
>   		i915_request_set_priority(rq, I915_PRIORITY_MAX);
> +		i915_request_set_deadline(rq, 0);
>   		err = wait_for_submit(engine, rq, HZ / 2);
>   		if (err) {
>   			pr_err("%s: Timed out trying to submit semaphores\n",
> @@ -1340,10 +1345,9 @@ static int live_timeslice_queue(void *arg)
>   		}
>   
>   		GEM_BUG_ON(i915_request_completed(rq));
> -		GEM_BUG_ON(execlists_active(&engine->execlists) != rq);
>   
>   		/* Queue: semaphore signal, matching priority as semaphore */
> -		err = release_queue(engine, vma, 1, effective_prio(rq));
> +		err = release_queue(engine, vma, 1, rq_deadline(rq));
>   		if (err)
>   			goto err_rq;
>   
> @@ -1454,6 +1458,7 @@ static int live_timeslice_nopreempt(void *arg)
>   			goto out_spin;
>   		}
>   
> +		rq->sched.deadline = 0;
>   		rq->sched.attr.priority = I915_PRIORITY_BARRIER;
>   		i915_request_get(rq);
>   		i915_request_add(rq);
> @@ -1817,6 +1822,7 @@ static int live_late_preempt(void *arg)
>   
>   	/* Make sure ctx_lo stays before ctx_hi until we trigger preemption. */
>   	ctx_lo->sched.priority = 1;
> +	ctx_hi->sched.priority = I915_PRIORITY_MIN;
>   
>   	for_each_engine(engine, gt, id) {
>   		struct igt_live_test t;
> @@ -2981,6 +2987,9 @@ static int live_preempt_gang(void *arg)
>   		while (rq) { /* wait for each rq from highest to lowest prio */
>   			struct i915_request *n = list_next_entry(rq, mock.link);
>   
> +			/* With deadlines, no strict priority ordering */
> +			i915_request_set_deadline(rq, 0);
> +
>   			if (err == 0 && i915_request_wait(rq, 0, HZ / 5) < 0) {
>   				struct drm_printer p =
>   					drm_info_printer(engine->i915->drm.dev);
> @@ -3203,6 +3212,7 @@ static int preempt_user(struct intel_engine_cs *engine,
>   	i915_request_add(rq);
>   
>   	i915_request_set_priority(rq, I915_PRIORITY_MAX);
> +	i915_request_set_deadline(rq, 0);
>   
>   	if (i915_request_wait(rq, 0, HZ / 2) < 0)
>   		err = -ETIME;
> diff --git a/drivers/gpu/drm/i915/gt/selftest_hangcheck.c b/drivers/gpu/drm/i915/gt/selftest_hangcheck.c
> index 3d3f41b1271a..df799379333f 100644
> --- a/drivers/gpu/drm/i915/gt/selftest_hangcheck.c
> +++ b/drivers/gpu/drm/i915/gt/selftest_hangcheck.c
> @@ -1010,7 +1010,10 @@ static int __igt_reset_engines(struct intel_gt *gt,
>   					break;
>   				}
>   
> -				if (i915_request_wait(rq, 0, HZ / 5) < 0) {
> +				/* With deadlines, no strict priority */
> +				i915_request_set_deadline(rq, 0);
> +
> +				if (i915_request_wait(rq, 0, HZ / 2) < 0) {
>   					struct drm_printer p =
>   						drm_info_printer(gt->i915->drm.dev);
>   
> diff --git a/drivers/gpu/drm/i915/gt/selftest_lrc.c b/drivers/gpu/drm/i915/gt/selftest_lrc.c
> index 0a40f6b574f8..62916bf2cde9 100644
> --- a/drivers/gpu/drm/i915/gt/selftest_lrc.c
> +++ b/drivers/gpu/drm/i915/gt/selftest_lrc.c
> @@ -1253,6 +1253,7 @@ poison_registers(struct intel_context *ce,
>   
>   	intel_ring_advance(rq, cs);
>   
> +	rq->sched.deadline = 0;
>   	rq->sched.attr.priority = I915_PRIORITY_BARRIER;
>   err_rq:
>   	i915_request_add(rq);
> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> index 8d0c6cd277b3..4de9c459eb75 100644
> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> @@ -220,8 +220,6 @@ static void __guc_dequeue(struct intel_engine_cs *engine)
>   		i915_priolist_advance(&engine->active.queue, pl);
>   	}
>   done:
> -	execlists->queue_priority_hint =
> -		pl != &engine->active.queue.sentinel ? pl->priority : INT_MIN;
>   	if (submit) {
>   		*port = schedule_in(last, port - execlists->inflight);
>   		*++port = NULL;
> @@ -318,7 +316,6 @@ static void guc_reset_rewind(struct intel_engine_cs *engine, bool stalled)
>   
>   static void guc_reset_cancel(struct intel_engine_cs *engine)
>   {
> -	struct intel_engine_execlists * const execlists = &engine->execlists;
>   	struct i915_request *rq, *rn;
>   	struct i915_priolist *p;
>   	unsigned long flags;
> @@ -361,8 +358,6 @@ static void guc_reset_cancel(struct intel_engine_cs *engine)
>   
>   	/* Remaining _unready_ requests will be nop'ed when submitted */
>   
> -	execlists->queue_priority_hint = INT_MIN;
> -
>   	spin_unlock_irqrestore(&engine->active.lock, flags);
>   }
>   
> diff --git a/drivers/gpu/drm/i915/i915_priolist_types.h b/drivers/gpu/drm/i915/i915_priolist_types.h
> index 1200c3df6a4a..8a8cef0fcb48 100644
> --- a/drivers/gpu/drm/i915/i915_priolist_types.h
> +++ b/drivers/gpu/drm/i915/i915_priolist_types.h
> @@ -22,6 +22,8 @@ enum {
>   
>   	/* Interactive workload, scheduled for immediate pageflipping */
>   	I915_PRIORITY_DISPLAY,
> +
> +	__I915_PRIORITY_KERNEL__
>   };
>   
>   /* Smallest priority value that cannot be bumped. */
> @@ -35,8 +37,7 @@ enum {
>    * i.e. nothing can have higher priority and force us to usurp the
>    * active request.
>    */
> -#define I915_PRIORITY_UNPREEMPTABLE INT_MAX
> -#define I915_PRIORITY_BARRIER (I915_PRIORITY_UNPREEMPTABLE - 1)
> +#define I915_PRIORITY_BARRIER INT_MAX
>   
>   #ifdef CONFIG_64BIT
>   #define I915_PRIOLIST_HEIGHT 12
> @@ -46,7 +47,7 @@ enum {
>   
>   struct i915_priolist {
>   	struct list_head requests;
> -	int priority;
> +	u64 deadline;
>   
>   	int level;
>   	struct i915_priolist *next[I915_PRIOLIST_HEIGHT];
> diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
> index df2ab39b394d..e4c0c810b77e 100644
> --- a/drivers/gpu/drm/i915/i915_request.c
> +++ b/drivers/gpu/drm/i915/i915_request.c
> @@ -530,7 +530,7 @@ bool __i915_request_submit(struct i915_request *request)
>   	struct intel_engine_cs *engine = request->engine;
>   	bool result = false;
>   
> -	RQ_TRACE(request, "\n");
> +	RQ_TRACE(request, "dl %llu\n", request->sched.deadline);
>   
>   	GEM_BUG_ON(!irqs_disabled());
>   	lockdep_assert_held(&engine->active.lock);
> @@ -719,6 +719,7 @@ semaphore_notify(struct i915_sw_fence *fence, enum i915_sw_fence_notify state)
>   
>   	switch (state) {
>   	case FENCE_COMPLETE:
> +		i915_request_update_deadline(rq);

In practice, will this pull the deadline in or push it out?
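
For reference, my reading of set_earliest_deadline() later in this patch
is that it can only pull the deadline in, since it clamps with min() and
returns early if the result is not earlier than the old value. Below is a
toy model of that reading, assuming i915_request_update_deadline() ends up
there. The names sketch_rq and sketch_update_deadline are made up for
illustration and are not in the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for rq->sched.deadline; smaller means sooner. */
struct sketch_rq {
	uint64_t deadline;
};

/*
 * Mirrors the min()/early-return pattern of the quoted
 * set_earliest_deadline(): the stored deadline is only ever replaced
 * by an earlier value. Returns true if the deadline moved.
 */
static bool sketch_update_deadline(struct sketch_rq *rq, uint64_t earliest)
{
	uint64_t old = rq->deadline;
	uint64_t dl = earliest < old ? earliest : old;

	if (dl >= old)
		return false; /* a later candidate never pushes us out */

	rq->deadline = dl;
	return true;
}
```

If that reading is right, a later candidate is simply ignored and the
answer would be "pull in only", but please correct me if the real helper
does something else.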

>   		break;
>   
>   	case FENCE_FREE:
> @@ -1879,14 +1880,15 @@ long i915_request_wait(struct i915_request *rq,
>   	return timeout;
>   }
>   
> -static int print_sched_attr(const struct i915_sched_attr *attr,
> -			    char *buf, int x, int len)
> +static int print_sched(const struct i915_sched_node *node,
> +		       char *buf, int x, int len)
>   {
> -	if (attr->priority == I915_PRIORITY_INVALID)
> +	if (node->attr.priority == I915_PRIORITY_INVALID)
>   		return x;
>   
>   	x += snprintf(buf + x, len - x,
> -		      " prio=%d", attr->priority);
> +		      " prio=%d, dl=%llu",
> +		      node->attr.priority, node->deadline);
>   
>   	return x;
>   }
> @@ -1966,7 +1968,7 @@ void i915_request_show(struct drm_printer *m,
>   	 *      from the lists
>   	 */
>   
> -	x = print_sched_attr(&rq->sched.attr, buf, x, sizeof(buf));
> +	x = print_sched(&rq->sched, buf, x, sizeof(buf));
>   
>   	drm_printf(m, "%s%.*s%c %llx:%lld%s%s %s @ %dms: %s\n",
>   		   prefix, indent, "                ",
> diff --git a/drivers/gpu/drm/i915/i915_scheduler.c b/drivers/gpu/drm/i915/i915_scheduler.c
> index 74000d3eebb1..7ba816e83b55 100644
> --- a/drivers/gpu/drm/i915/i915_scheduler.c
> +++ b/drivers/gpu/drm/i915/i915_scheduler.c
> @@ -33,6 +33,11 @@ static void node_put(struct i915_sched_node *node)
>   	i915_request_put(container_of(node, struct i915_request, sched));
>   }
>   
> +static inline u64 rq_deadline(const struct i915_request *rq)
> +{
> +	return READ_ONCE(rq->sched.deadline);
> +}
> +
>   static inline int rq_prio(const struct i915_request *rq)
>   {
>   	return READ_ONCE(rq->sched.attr.priority);
> @@ -46,6 +51,14 @@ static int ipi_get_prio(struct i915_request *rq)
>   	return xchg(&rq->sched.ipi_priority, I915_PRIORITY_INVALID);
>   }
>   
> +static u64 ipi_get_deadline(struct i915_request *rq)
> +{
> +	if (READ_ONCE(rq->sched.ipi_deadline) == I915_DEADLINE_NEVER)
> +		return I915_DEADLINE_NEVER;
> +
> +	return xchg64(&rq->sched.ipi_deadline, I915_DEADLINE_NEVER);
> +}
> +
>   static void ipi_schedule(struct work_struct *wrk)
>   {
>   	struct i915_sched_ipi *ipi = container_of(wrk, typeof(*ipi), work);
> @@ -53,9 +66,11 @@ static void ipi_schedule(struct work_struct *wrk)
>   
>   	do {
>   		struct i915_request *rn = xchg(&rq->sched.ipi_link, NULL);
> +		u64 deadline;
>   		int prio;
>   
>   		prio = ipi_get_prio(rq);
> +		deadline = ipi_get_deadline(rq);
>   
>   		/*
>   		 * For cross-engine scheduling to work we rely on one of two
> @@ -80,6 +95,7 @@ static void ipi_schedule(struct work_struct *wrk)
>   		 */
>   		local_bh_disable();
>   		i915_request_set_priority(rq, prio);
> +		i915_request_set_deadline(rq, deadline);
>   		local_bh_enable();
>   
>   		i915_request_put(rq);
> @@ -98,7 +114,7 @@ static void init_priolist(struct i915_priolist_root *const root)
>   	struct i915_priolist *pl = &root->sentinel;
>   
>   	memset_p((void **)pl->next, pl, ARRAY_SIZE(pl->next));
> -	pl->priority = INT_MIN;
> +	pl->deadline = I915_DEADLINE_NEVER;
>   	pl->level = -1;
>   }
>   
> @@ -250,7 +266,7 @@ static inline unsigned int random_level(struct i915_priolist_root *root)
>   }
>   
>   static struct list_head *
> -lookup_priolist(struct intel_engine_cs *engine, int prio)
> +lookup_priolist(struct intel_engine_cs *engine, u64 deadline)
>   {
>   	struct i915_priolist *update[I915_PRIOLIST_HEIGHT];
>   	struct i915_sched_engine * const se = &engine->active;
> @@ -258,12 +274,13 @@ lookup_priolist(struct intel_engine_cs *engine, int prio)
>   	struct i915_priolist *pl, *tmp;
>   	int lvl;
>   
> +	GEM_BUG_ON(deadline == I915_DEADLINE_NEVER);
>   	lockdep_assert_held(&se->lock);
>   	if (unlikely(se->no_priolist))
> -		prio = I915_PRIORITY_NORMAL;
> +		deadline = 0;
>   
>   	for_each_priolist(pl, root) { /* recycle any empty elements before us */
> -		if (pl->priority >= prio || !list_empty(&pl->requests))
> +		if (pl->deadline >= deadline || !list_empty(&pl->requests))
>   			break;
>   
>   		i915_priolist_advance(root, pl);
> @@ -273,14 +290,14 @@ lookup_priolist(struct intel_engine_cs *engine, int prio)
>   	pl = &root->sentinel;
>   	lvl = pl->level;
>   	while (lvl >= 0) {
> -		while (tmp = pl->next[lvl], tmp->priority >= prio)
> +		while (tmp = pl->next[lvl], tmp->deadline <= deadline)
>   			pl = tmp;
> -		if (pl->priority == prio)
> +		if (pl->deadline == deadline)
>   			goto out;
>   		update[lvl--] = pl;
>   	}
>   
> -	if (prio == I915_PRIORITY_NORMAL) {
> +	if (!deadline) {
>   		pl = &se->default_priolist;
>   	} else if (!pl_empty(&root->sentinel.requests)) {
>   		pl = pl_pop(&root->sentinel.requests);
> @@ -288,7 +305,7 @@ lookup_priolist(struct intel_engine_cs *engine, int prio)
>   		pl = kmem_cache_alloc(global.slab_priorities, GFP_ATOMIC);
>   		/* Convert an allocation failure to a priority bump */
>   		if (unlikely(!pl)) {
> -			prio = I915_PRIORITY_NORMAL; /* recurses just once */
> +			deadline = 0; /* recurses just once */
>   
>   			/*
>   			 * To maintain ordering with all rendering, after an
> @@ -304,7 +321,7 @@ lookup_priolist(struct intel_engine_cs *engine, int prio)
>   		}
>   	}
>   
> -	pl->priority = prio;
> +	pl->deadline = deadline;
>   	INIT_LIST_HEAD(&pl->requests);
>   
>   	lvl = random_level(root);
> @@ -332,7 +349,7 @@ lookup_priolist(struct intel_engine_cs *engine, int prio)
>   		chk = &root->sentinel;
>   		lvl = chk->level;
>   		do {
> -			while (tmp = chk->next[lvl], tmp->priority >= prio)
> +			while (tmp = chk->next[lvl], tmp->deadline <= deadline)
>   				chk = tmp;
>   		} while (--lvl >= 0);
>   
> @@ -352,7 +369,7 @@ static void remove_priolist(struct intel_engine_cs *engine,
>   	struct i915_priolist *pl, *tmp;
>   	struct i915_priolist *old =
>   		container_of(plist, struct i915_priolist, requests);
> -	int prio = old->priority;
> +	u64 deadline = old->deadline;
>   	int lvl;
>   
>   	lockdep_assert_held(&se->lock);
> @@ -362,11 +379,11 @@ static void remove_priolist(struct intel_engine_cs *engine,
>   	lvl = pl->level;
>   	GEM_BUG_ON(lvl < 0);
>   
> -	if (prio != I915_PRIORITY_NORMAL)
> +	if (deadline)
>   		pl_push(old, &pl->requests);
>   
>   	do {
> -		while (tmp = pl->next[lvl], tmp->priority > prio)
> +		while (tmp = pl->next[lvl], tmp->deadline < deadline)
>   			pl = tmp;
>   		if (lvl <= old->level) {
>   			pl->next[lvl] = old->next[lvl];
> @@ -389,7 +406,7 @@ void i915_priolist_advance(struct i915_priolist_root *root,
>   	GEM_BUG_ON(pl != s->next[0]);
>   	GEM_BUG_ON(pl == s);
>   
> -	if (pl->priority != I915_PRIORITY_NORMAL)
> +	if (pl->deadline)
>   		pl_push(pl, &s->requests);
>   
>   	lvl = pl->level;
> @@ -423,53 +440,245 @@ stack_pop(struct i915_request *rq,
>   	return rq;
>   }
>   
> -static inline bool need_preempt(int prio, int active)
> +static void ipi_deadline(struct i915_request *rq, u64 deadline)
>   {
> -	/*
> -	 * Allow preemption of low -> normal -> high, but we do
> -	 * not allow low priority tasks to preempt other low priority
> -	 * tasks under the impression that latency for low priority
> -	 * tasks does not matter (as much as background throughput),
> -	 * so kiss.
> -	 */
> -	return prio >= max(I915_PRIORITY_NORMAL, active);
> +	u64 old = READ_ONCE(rq->sched.ipi_deadline);
> +
> +	do {
> +		if (deadline >= old)
> +			return;
> +	} while (!try_cmpxchg64(&rq->sched.ipi_deadline, &old, deadline));
> +
> +	__ipi_add(rq);
>   }
>   
> -static void kick_submission(struct intel_engine_cs *engine,
> -			    const struct i915_request *rq,
> -			    int prio)
> +static bool is_first_priolist(const struct intel_engine_cs *engine,
> +			      const struct list_head *requests)
>   {
> -	const struct i915_request *inflight;
> +	return requests == &engine->active.queue.sentinel.next[0]->requests;
> +}
>   
> -	/*
> -	 * We only need to kick the tasklet once for the high priority
> -	 * new context we add into the queue.
> -	 */
> -	if (prio <= engine->execlists.queue_priority_hint)
> +static bool __i915_request_set_deadline(struct i915_request *rq, u64 deadline)
> +{
> +	struct intel_engine_cs *engine = rq->engine;
> +	struct list_head *pos = &rq->sched.signalers_list;
> +	struct list_head *plist;
> +
> +	if (unlikely(!i915_request_in_priority_queue(rq))) {
> +		rq->sched.deadline = deadline;
> +		return false;
> +	}
> +
> +	/* Fifo and depth-first replacement ensure our deps execute first */
> +	plist = lookup_priolist(engine, deadline);
> +
> +	rq->sched.dfs.next = NULL;
> +	do {
> +		list_for_each_continue(pos, &rq->sched.signalers_list) {
> +			struct i915_dependency *p =
> +				list_entry(pos, typeof(*p), signal_link);
> +			struct i915_request *s =
> +				container_of(p->signaler, typeof(*s), sched);
> +
> +			if (rq_deadline(s) <= deadline)
> +				continue;
> +
> +			if (__i915_request_is_complete(s))
> +				continue;
> +
> +			if (s->engine != engine) {
> +				ipi_deadline(s, deadline);
> +				continue;
> +			}
> +
> +			/* Remember our position along this branch */
> +			rq = stack_push(s, rq, pos);
> +			pos = &rq->sched.signalers_list;
> +		}
> +
> +		RQ_TRACE(rq, "set-deadline:%llu\n", deadline);
> +		WRITE_ONCE(rq->sched.deadline, deadline);
> +
> +		/*
> +		 * Once the request is ready, it will be placed into the
> +		 * priority lists and then onto the HW runlist. Before the
> +		 * request is ready, it does not contribute to our preemption
> +		 * decisions and we can safely ignore it, as it will, and
> +		 * any preemption required, be dealt with upon submission.
> +		 * See engine->submit_request()
> +		 */
> +		GEM_BUG_ON(rq->engine != engine);
> +		if (i915_request_in_priority_queue(rq)) {
> +			struct list_head *prev = rq->sched.link.prev;
> +
> +			list_move_tail(&rq->sched.link, plist);
> +			if (list_empty(prev))
> +				remove_priolist(engine, prev);
> +		}
> +	} while ((rq = stack_pop(rq, &pos)));
> +
> +	return is_first_priolist(engine, plist);
> +}
> +
> +void i915_request_set_deadline(struct i915_request *rq, u64 deadline)
> +{
> +	struct intel_engine_cs *engine;
> +	unsigned long flags;
> +
> +	if (deadline >= rq_deadline(rq))
>   		return;
>   
> -	/* Nothing currently active? We're overdue for a submission! */
> -	inflight = execlists_active(&engine->execlists);
> -	if (!inflight)
> -		return;
> +	engine = lock_engine_irqsave(rq, flags);
> +	if (!intel_engine_has_scheduler(engine))
> +		goto unlock;
>   
> -	/*
> -	 * If we are already the currently executing context, don't
> -	 * bother evaluating if we should preempt ourselves.
> -	 */
> -	if (inflight->context == rq->context)
> -		return;
> +	if (deadline >= rq_deadline(rq))
> +		goto unlock;
>   
> -	ENGINE_TRACE(engine,
> -		     "bumping queue-priority-hint:%d for rq:%llx:%lld, inflight:%llx:%lld prio %d\n",
> -		     prio,
> -		     rq->fence.context, rq->fence.seqno,
> -		     inflight->fence.context, inflight->fence.seqno,
> -		     inflight->sched.attr.priority);
> +	if (__i915_request_is_complete(rq))
> +		goto unlock;
>   
> -	engine->execlists.queue_priority_hint = prio;
> -	if (need_preempt(prio, rq_prio(inflight)))
> +	rcu_read_lock();
> +	if (__i915_request_set_deadline(rq, deadline))
>   		i915_sched_kick(&engine->active);
> +	rcu_read_unlock();
> +	GEM_BUG_ON(rq_deadline(rq) != deadline);
> +
> +unlock:
> +	spin_unlock_irqrestore(&engine->active.lock, flags);
> +}
> +
> +static u64 prio_slice(int prio)
> +{
> +	u64 slice;
> +	int sf;
> +
> +	/*
> +	 * This is the central heuristic to the virtual deadlines. By
> +	 * imposing that each task takes an equal amount of time, we
> +	 * let each client have an equal slice of the GPU time. By
> +	 * bringing the virtual deadline forward, that client will then
> +	 * have more GPU time, and vice versa a lower priority client will
> +	 * have a later deadline and receive less GPU time.
> +	 *
> +	 * In BFS/MuQSS, the prio_ratios[] are based on the task nice range of
> +	 * [-20, 20], with each lower priority having a ~10% longer deadline,
> +	 * with the note that the proportion of CPU time between two clients
> +	 * of different priority will be the square of the relative prio_slice.
> +	 *
> +	 * In contrast, this prio_slice() curve was chosen because it gave good
> +	 * results with igt/gem_exec_schedule. It may not be the best choice!
> +	 *
> +	 * With a 1ms scheduling quantum:
> +	 *
> +	 *   MAX USER:  ~32us deadline
> +	 *   0:         ~16ms deadline

Interesting centre/default point. Does it relate to 60Hz? If so, how
about exporting some sysfs controls?
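
To check the numbers in the table I copied the function into userspace.
This is only a sketch: the value of __I915_PRIORITY_KERNEL__ is not
spelled out in this hunk, so 1027 below is my guess from its position in
the enum after I915_PRIORITY_DISPLAY, and the sketch_ names are mine.

```c
#include <limits.h>
#include <stdint.h>

/*
 * Assumption: guessed value of __I915_PRIORITY_KERNEL__, taken from its
 * position in the enum; not confirmed by this hunk.
 */
#define SKETCH_PRIORITY_KERNEL 1027

/*
 * Userspace copy of the quoted prio_slice(). The result is added to a
 * ktime in nanoseconds by virtual_deadline(), so it is in nanoseconds.
 */
static uint64_t sketch_prio_slice(int prio)
{
	uint64_t slice;
	int sf;

	if (prio >= SKETCH_PRIORITY_KERNEL)
		return (uint64_t)(INT_MAX - prio);

	slice = (uint64_t)(SKETCH_PRIORITY_KERNEL - prio);
	sf = prio >= 0 ? 20 - 6 : 20 - 1; /* steeper curve below zero */

	return slice << sf;
}
```

With that guess, prio 0 gives 1027 << 14 = 16826368ns (about 16.8ms) and
prio -1023 gives 2050 << 19 = 1074790400ns (about 1.07s), which is in the
ballpark of the 0 and MIN_USER rows above.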

> +	 *   MIN_USER: 1000ms deadline
> +	 */
> +
> +	if (prio >= __I915_PRIORITY_KERNEL__)
> +		return INT_MAX - prio;
> +
> +	slice = __I915_PRIORITY_KERNEL__ - prio;
> +	if (prio >= 0)
> +		sf = 20 - 6;
> +	else
> +		sf = 20 - 1;
> +
> +	return slice << sf;
> +}
> +
> +static u64 virtual_deadline(u64 kt, int priority)
> +{
> +	return i915_sched_to_ticks(kt + prio_slice(priority));
> +}
> +
> +u64 i915_scheduler_next_virtual_deadline(int priority)
> +{
> +	return virtual_deadline(ktime_get_mono_fast_ns(), priority);
> +}
> +
> +static u64 signal_deadline(const struct i915_request *rq)
> +{
> +	u64 last = ktime_get_mono_fast_ns();
> +	const struct i915_dependency *p;
> +
> +	/*
> +	 * Find the earliest point at which we will become 'ready',
> +	 * which we infer from the deadline of all active signalers.
> +	 * We will position ourselves at the end of that chain of work.
> +	 */
> +
> +	rcu_read_lock();
> +	for_each_signaler(p, rq) {
> +		const struct i915_request *s =
> +			container_of(p->signaler, typeof(*s), sched);
> +		u64 deadline;
> +		int prio;
> +
> +		if (__i915_request_is_complete(s))
> +			continue;
> +
> +		if (s->timeline == rq->timeline &&
> +		    __i915_request_has_started(s))
> +			continue;
> +
> +		prio = rq_prio(s);
> +		if (prio < rq_prio(rq))
> +			continue;
> +
> +		deadline = rq_deadline(s);
> +		if (deadline == I915_DEADLINE_NEVER) /* retired & reused */
> +			continue;
> +
> +		deadline = i915_sched_to_ns(deadline);
> +		if (p->flags & I915_DEPENDENCY_WEAK)
> +			deadline -= prio_slice(prio);
> +
> +		last = max(last, deadline);
> +	}
> +	rcu_read_unlock();
> +
> +	return last;
> +}
> +
> +static int adj_prio(const struct i915_request *rq)
> +{
> +	int prio = rq_prio(rq);
> +
> +	/*
> +	 * Deprioritize semaphore waiters. We only want to run these if there
> +	 * is nothing ready to run first.
> +	 *
> +	 * Note by giving a more distant deadline (due to a lower priority)
> +	 * we do not prevent them from having a slice of the GPU, and if there
> +	 * is still contention at that point, we expect to immediately yield
> +	 * on the semaphore.
> +	 *
> +	 * When all semaphores are signaled, we will update the request
> +	 * to remove the semaphore penalty.
> +	 */
> +	if (!i915_sw_fence_signaled(&rq->semaphore))
> +		prio -= __I915_PRIORITY_KERNEL__;
> +
> +	return prio;
> +}
> +
> +static u64 earliest_deadline(const struct i915_request *rq)
> +{
> +	return virtual_deadline(signal_deadline(rq), rq_prio(rq));
> +}
> +
> +static bool set_earliest_deadline(struct i915_request *rq, u64 old)
> +{
> +	u64 dl;
> +
> +	/* Recompute our deadlines and promote after a priority change */
> +	dl = min(earliest_deadline(rq), rq_deadline(rq));
> +	if (dl >= old)
> +		return false;
> +
> +	return __i915_request_set_deadline(rq, dl);
>   }
>   
>   static void ipi_priority(struct i915_request *rq, int prio)
> @@ -484,13 +693,11 @@ static void ipi_priority(struct i915_request *rq, int prio)
>   	__ipi_add(rq);
>   }
>   
> -static void __i915_request_set_priority(struct i915_request *rq, int prio)
> +static bool __i915_request_set_priority(struct i915_request *rq, int prio)
>   {
>   	struct intel_engine_cs *engine = rq->engine;
>   	struct list_head *pos = &rq->sched.signalers_list;
> -	struct list_head *plist;
> -
> -	plist = lookup_priolist(engine, prio);
> +	bool kick = false;
>   
>   	/*
>   	 * Recursively bump all dependent priorities to match the new request.
> @@ -512,6 +719,8 @@ static void __i915_request_set_priority(struct i915_request *rq, int prio)
>   	 */
>   	rq->sched.dfs.next = NULL;
>   	do {
> +		struct i915_request *next;
> +
>   		list_for_each_continue(pos, &rq->sched.signalers_list) {
>   			struct i915_dependency *p =
>   				list_entry(pos, typeof(*p), signal_link);
> @@ -537,6 +746,8 @@ static void __i915_request_set_priority(struct i915_request *rq, int prio)
>   		RQ_TRACE(rq, "set-priority:%d\n", prio);
>   		WRITE_ONCE(rq->sched.attr.priority, prio);
>   
> +		next = stack_pop(rq, &pos);
> +
>   		/*
>   		 * Once the request is ready, it will be placed into the
>   		 * priority lists and then onto the HW runlist. Before the
> @@ -545,21 +756,15 @@ static void __i915_request_set_priority(struct i915_request *rq, int prio)
>   		 * any preemption required, be dealt with upon submission.
>   		 * See engine->submit_request()
>   		 */
> -		if (!i915_request_is_ready(rq))
> -			continue;
> -
>   		GEM_BUG_ON(rq->engine != engine);
> -		if (i915_request_in_priority_queue(rq)) {
> -			struct list_head *prev = rq->sched.link.prev;
> +		if (i915_request_is_ready(rq) &&
> +		    set_earliest_deadline(rq, rq_deadline(rq)))

Inside here it walks the signalers list for rq, while this is inside the 
loop which already walks the whole signalers tree for each rq. I wonder 
if there is scope to somehow eliminate this extra sub-walk. But to be 
honest it makes my head spin how to do it, so probably best to leave it 
for later, if it is even possible.

> +			kick = true;
>   
> -			list_move_tail(&rq->sched.link, plist);
> -			if (list_empty(prev))
> -				remove_priolist(engine, prev);
> -		}
> +		rq = next;
> +	} while (rq);
>   
> -		/* Defer (tasklet) submission until after all updates. */
> -		kick_submission(engine, rq, prio);
> -	} while ((rq = stack_pop(rq, &pos)));
> +	return kick;
>   }
>   
>   void i915_request_set_priority(struct i915_request *rq, int prio)
> @@ -608,13 +813,9 @@ void i915_request_set_priority(struct i915_request *rq, int prio)
>   	if (__i915_request_is_complete(rq))
>   		goto unlock;
>   
> -	if (!intel_engine_has_scheduler(engine)) {
> -		rq->sched.attr.priority = prio;
> -		goto unlock;
> -	}
> -
>   	rcu_read_lock();
> -	__i915_request_set_priority(rq, prio);
> +	if (__i915_request_set_priority(rq, prio))
> +		i915_sched_kick(&engine->active);
>   	rcu_read_unlock();
>   	GEM_BUG_ON(rq_prio(rq) != prio);
>   
> @@ -628,12 +829,13 @@ void __intel_engine_defer_request(struct intel_engine_cs *engine,
>   	struct list_head *pos = &rq->sched.waiters_list;
>   	struct i915_request *rn;
>   	LIST_HEAD(dfs);
> -	int prio;
> +	u64 deadline;
>   
>   	lockdep_assert_held(&engine->active.lock);
>   	GEM_BUG_ON(!test_bit(I915_FENCE_FLAG_PQUEUE, &rq->fence.flags));
>   
> -	prio = rq_prio(rq);
> +	deadline = max(rq_deadline(rq),
> +		       i915_scheduler_next_virtual_deadline(adj_prio(rq)));
>   
>   	/*
>   	 * When we defer a request, we must maintain its order with respect
> @@ -660,30 +862,32 @@ void __intel_engine_defer_request(struct intel_engine_cs *engine,
>   				   __i915_request_has_started(w) &&
>   				   !__i915_request_is_complete(rq));
>   
> +			/* An unready waiter imposes no deadline */
>   			if (!i915_request_in_priority_queue(w))
>   				continue;
>   
>   			/*
> -			 * We also need to reorder within the same priority.
> +			 * We also need to reorder within the same deadline.
>   			 *
>   			 * This is unlike priority-inheritance, where if the
>   			 * signaler already has a higher priority [earlier
>   			 * deadline] than us, we can ignore as it will be
>   			 * scheduled first. If a waiter already has the
> -			 * same priority, we still have to push it to the end
> +			 * same deadline, we still have to push it to the end
>   			 * of the list. This unfortunately means we cannot
>   			 * use the rq_deadline() itself as a 'visited' bit.
>   			 */
> -			if (rq_prio(w) < prio)
> +			if (rq_deadline(w) > deadline)
>   				continue;
>   
> -			GEM_BUG_ON(rq_prio(w) != prio);
> -
>   			/* Remember our position along this branch */
>   			rq = stack_push(w, rq, pos);
>   			pos = &rq->sched.waiters_list;
>   		}
>   
> +		RQ_TRACE(rq, "set-deadline:%llu\n", deadline);
> +		WRITE_ONCE(rq->sched.deadline, deadline);
> +
>   		/* Note list is reversed for waiters wrt signal hierarchy */
>   		GEM_BUG_ON(rq->engine != engine);
>   		GEM_BUG_ON(!i915_request_in_priority_queue(rq));
> @@ -693,31 +897,18 @@ void __intel_engine_defer_request(struct intel_engine_cs *engine,
>   		clear_bit(I915_FENCE_FLAG_PQUEUE, &rq->fence.flags);
>   	} while ((rq = stack_pop(rq, &pos)));
>   
> -	pos = lookup_priolist(engine, prio);
> +	pos = lookup_priolist(engine, deadline);
>   	list_for_each_entry_safe(rq, rn, &dfs, sched.link) {
>   		set_bit(I915_FENCE_FLAG_PQUEUE, &rq->fence.flags);
>   		list_add_tail(&rq->sched.link, pos);
>   	}
>   }
>   
> -static void queue_request(struct intel_engine_cs *engine,
> -			  struct i915_request *rq)
> +static bool
> +queue_request(struct intel_engine_cs *engine, struct i915_request *rq)
>   {
> -	GEM_BUG_ON(!list_empty(&rq->sched.link));
> -	list_add_tail(&rq->sched.link, lookup_priolist(engine, rq_prio(rq)));
>   	set_bit(I915_FENCE_FLAG_PQUEUE, &rq->fence.flags);
> -}
> -
> -static bool submit_queue(struct intel_engine_cs *engine,
> -			 const struct i915_request *rq)
> -{
> -	struct intel_engine_execlists *execlists = &engine->execlists;
> -
> -	if (rq_prio(rq) <= execlists->queue_priority_hint)
> -		return false;
> -
> -	execlists->queue_priority_hint = rq_prio(rq);
> -	return true;
> +	return set_earliest_deadline(rq, I915_DEADLINE_NEVER);
>   }
>   
>   static bool hold_request(const struct i915_request *rq)
> @@ -757,6 +948,7 @@ void i915_request_enqueue(struct i915_request *rq)
>   {
>   	struct intel_engine_cs *engine = rq->engine;
>   	struct i915_sched_engine *se = &engine->active;
> +	u64 dl = earliest_deadline(rq);
>   	unsigned long flags;
>   	bool kick = false;
>   
> @@ -769,11 +961,11 @@ void i915_request_enqueue(struct i915_request *rq)
>   		list_add_tail(&rq->sched.link, &se->hold);
>   		i915_request_set_hold(rq);
>   	} else {
> -		queue_request(engine, rq);
> -
> -		GEM_BUG_ON(i915_sched_is_idle(&engine->active));
> -
> -		kick = submit_queue(engine, rq);
> +		set_bit(I915_FENCE_FLAG_PQUEUE, &rq->fence.flags);
> +		kick = __i915_request_set_deadline(rq,
> +						   min(dl, rq_deadline(rq)));
> +		GEM_BUG_ON(rq_deadline(rq) == I915_DEADLINE_NEVER);
> +		GEM_BUG_ON(i915_sched_is_idle(se));
>   	}
>   
>   	GEM_BUG_ON(list_empty(&rq->sched.link));
> @@ -786,8 +978,8 @@ struct i915_request *
>   __intel_engine_rewind_requests(struct intel_engine_cs *engine)
>   {
>   	struct i915_request *rq, *rn, *active = NULL;
> +	u64 deadline = I915_DEADLINE_NEVER;
>   	struct list_head *pl;
> -	int prio = I915_PRIORITY_INVALID;
>   
>   	lockdep_assert_held(&engine->active.lock);
>   
> @@ -801,13 +993,20 @@ __intel_engine_rewind_requests(struct intel_engine_cs *engine)
>   
>   		__i915_request_unsubmit(rq);
>   
> -		GEM_BUG_ON(rq_prio(rq) == I915_PRIORITY_INVALID);
> -		if (rq_prio(rq) != prio) {
> -			prio = rq_prio(rq);
> -			pl = lookup_priolist(engine, prio);
> +		if (__i915_request_has_started(rq)) {
> +			u64 deadline =
> +				i915_scheduler_next_virtual_deadline(rq_prio(rq));
> +			rq->sched.deadline = min(rq_deadline(rq), deadline);
> +		}
> +		GEM_BUG_ON(rq_deadline(rq) == I915_DEADLINE_NEVER);
> +
> +		if (rq_deadline(rq) != deadline) {
> +			deadline = rq_deadline(rq);
> +			pl = lookup_priolist(engine, deadline);
>   		}
>   		GEM_BUG_ON(i915_sched_is_idle(&engine->active));
>   
> +		GEM_BUG_ON(i915_request_in_priority_queue(rq));
>   		list_move(&rq->sched.link, pl);
>   		set_bit(I915_FENCE_FLAG_PQUEUE, &rq->fence.flags);
>   
> @@ -907,14 +1106,10 @@ void __intel_engine_resume_request(struct intel_engine_cs *engine,
>   				   struct i915_request *rq)
>   {
>   	LIST_HEAD(list);
> +	bool submit = false;
>   
>   	lockdep_assert_held(&engine->active.lock);
>   
> -	if (rq_prio(rq) > engine->execlists.queue_priority_hint) {
> -		engine->execlists.queue_priority_hint = rq_prio(rq);
> -		i915_sched_kick(&engine->active);
> -	}
> -
>   	if (!i915_request_on_hold(rq))
>   		return;
>   
> @@ -936,7 +1131,7 @@ void __intel_engine_resume_request(struct intel_engine_cs *engine,
>   		i915_request_clear_hold(rq);
>   		list_del_init(&rq->sched.link);
>   
> -		queue_request(engine, rq);
> +		submit |= queue_request(engine, rq);
>   
>   		/* Also release any children on this engine that are ready */
>   		for_each_waiter(p, rq) {
> @@ -966,6 +1161,18 @@ void __intel_engine_resume_request(struct intel_engine_cs *engine,
>   
>   		rq = list_first_entry_or_null(&list, typeof(*rq), sched.link);
>   	} while (rq);
> +
> +	if (submit)
> +		i915_sched_kick(&engine->active);
> +}
> +
> +void i915_request_update_deadline(struct i915_request *rq)
> +{
> +	if (!i915_request_in_priority_queue(rq))
> +		return;
> +
> +	/* Recompute our deadlines and promote after a priority change */
> +	i915_request_set_deadline(rq, earliest_deadline(rq));
>   }
>   
>   void intel_engine_resume_request(struct intel_engine_cs *engine,
> @@ -992,10 +1199,12 @@ void i915_sched_node_init(struct i915_sched_node *node)
>   void i915_sched_node_reinit(struct i915_sched_node *node)
>   {
>   	node->attr.priority = I915_PRIORITY_INVALID;
> +	node->deadline = I915_DEADLINE_NEVER;
>   	node->semaphores = 0;
>   	node->flags = 0;
>   
>   	GEM_BUG_ON(node->ipi_link);
> +	node->ipi_deadline = I915_DEADLINE_NEVER;
>   	node->ipi_priority = I915_PRIORITY_INVALID;
>   
>   	GEM_BUG_ON(!list_empty(&node->signalers_list));
> diff --git a/drivers/gpu/drm/i915/i915_scheduler.h b/drivers/gpu/drm/i915/i915_scheduler.h
> index bca89a58d953..e04d7eeb1b36 100644
> --- a/drivers/gpu/drm/i915/i915_scheduler.h
> +++ b/drivers/gpu/drm/i915/i915_scheduler.h
> @@ -36,6 +36,11 @@ void i915_sched_park_engine(struct i915_sched_engine *se);
>   void i915_sched_fini_engine(struct i915_sched_engine *se);
>   
>   void i915_request_set_priority(struct i915_request *request, int prio);
> +void i915_request_set_deadline(struct i915_request *request, u64 deadline);
> +
> +void i915_request_update_deadline(struct i915_request *request);
> +
> +u64 i915_scheduler_next_virtual_deadline(int priority);
>   
>   void i915_request_enqueue(struct i915_request *request);
>   
> @@ -54,11 +59,14 @@ bool intel_engine_suspend_request(struct intel_engine_cs *engine,
>   void intel_engine_resume_request(struct intel_engine_cs *engine,
>   				 struct i915_request *rq);
>   
> -void __i915_priolist_free(struct i915_priolist *p);
> -static inline void i915_priolist_free(struct i915_priolist *p)
> +static inline u64 i915_sched_to_ticks(ktime_t kt)
>   {
> -	if (p->priority != I915_PRIORITY_NORMAL)
> -		__i915_priolist_free(p);
> +	return ktime_to_ns(kt) >> I915_SCHED_DEADLINE_SHIFT;
> +}
> +
> +static inline u64 i915_sched_to_ns(u64 deadline)
> +{
> +	return deadline << I915_SCHED_DEADLINE_SHIFT;
>   }
>   
>   static inline bool i915_sched_is_idle(const struct i915_sched_engine *se)
> diff --git a/drivers/gpu/drm/i915/i915_scheduler_types.h b/drivers/gpu/drm/i915/i915_scheduler_types.h
> index e64750be4e77..72cb726ad75e 100644
> --- a/drivers/gpu/drm/i915/i915_scheduler_types.h
> +++ b/drivers/gpu/drm/i915/i915_scheduler_types.h
> @@ -72,7 +72,30 @@ struct i915_sched_node {
>   #define I915_SCHED_HAS_EXTERNAL_CHAIN	BIT(0)
>   	unsigned long semaphores;
>   
> +	/**
> +	 * @deadline: [virtual] deadline
> +	 *
> +	 * When the request is ready for execution, it is given a quota
> +	 * (the engine's timeslice) and a virtual deadline. The virtual
> +	 * deadline is derived from the current time:
> +	 *     ktime_get() + (prio_ratio * timeslice)
> +	 *
> +	 * Requests are then executed in order of deadline completion.
> +	 * Requests with earlier deadlines than currently executing on
> +	 * the engine will preempt the active requests.
> +	 *
> +	 * By treating it as a virtual deadline, we use it as a hint for
> +	 * when it is appropriate for a request to start with respect to
> +	 * all other requests in the system. It is not a hard deadline, as
> +	 * we allow requests to miss them, and we do not account for the
> +	 * request runtime.
> +	 */
> +	u64 deadline;
> +#define I915_SCHED_DEADLINE_SHIFT 19 /* i.e. roughly 500us buckets */
> +#define I915_DEADLINE_NEVER U64_MAX
> +
>   	struct i915_request *ipi_link;
> +	u64 ipi_deadline;
>   	int ipi_priority;
>   };
>   
> diff --git a/drivers/gpu/drm/i915/selftests/i915_request.c b/drivers/gpu/drm/i915/selftests/i915_request.c
> index d2a678a2497e..382f2d490959 100644
> --- a/drivers/gpu/drm/i915/selftests/i915_request.c
> +++ b/drivers/gpu/drm/i915/selftests/i915_request.c
> @@ -2130,6 +2130,7 @@ static int measure_preemption(struct intel_context *ce)
>   
>   		intel_ring_advance(rq, cs);
>   		rq->sched.attr.priority = I915_PRIORITY_BARRIER;
> +		rq->sched.deadline = 0;
>   
>   		elapsed[i - 1] = ENGINE_READ_FW(ce->engine, RING_TIMESTAMP);
>   		i915_request_add(rq);
> diff --git a/drivers/gpu/drm/i915/selftests/i915_scheduler.c b/drivers/gpu/drm/i915/selftests/i915_scheduler.c
> index de5b1443129b..8ea6763bf6a6 100644
> --- a/drivers/gpu/drm/i915/selftests/i915_scheduler.c
> +++ b/drivers/gpu/drm/i915/selftests/i915_scheduler.c
> @@ -12,6 +12,40 @@
>   #include "selftests/igt_spinner.h"
>   #include "selftests/i915_random.h"
>   
> +static int mock_scheduler_slices(void *dummy)
> +{
> +	u64 min, max, normal, kernel;
> +
> +	min = prio_slice(I915_PRIORITY_MIN);
> +	pr_info("%8s slice: %lluus\n", "min", min >> 10);
> +
> +	normal = prio_slice(0);
> +	pr_info("%8s slice: %lluus\n", "normal", normal >> 10);
> +
> +	max = prio_slice(I915_PRIORITY_MAX);
> +	pr_info("%8s slice: %lluus\n", "max", max >> 10);
> +
> +	kernel = prio_slice(I915_PRIORITY_BARRIER);
> +	pr_info("%8s slice: %lluus\n", "kernel", kernel >> 10);
> +
> +	if (kernel != 0) {
> +		pr_err("kernel prio slice should be 0\n");
> +		return -EINVAL;
> +	}
> +
> +	if (max >= normal) {
> +		pr_err("maximum prio slice should be shorter than normal\n");
> +		return -EINVAL;
> +	}
> +
> +	if (min <= normal) {
> +		pr_err("minimum prio slice should be longer than normal\n");
> +		return -EINVAL;
> +	}
> +
> +	return 0;
> +}
> +
>   static int mock_skiplist_levels(void *dummy)
>   {
>   	struct i915_priolist_root root = {};
> @@ -54,6 +88,7 @@ static int mock_skiplist_levels(void *dummy)
>   int i915_scheduler_mock_selftests(void)
>   {
>   	static const struct i915_subtest tests[] = {
> +		SUBTEST(mock_scheduler_slices),
>   		SUBTEST(mock_skiplist_levels),
>   	};
>   
> @@ -560,6 +595,53 @@ static int igt_priority_chains(void *arg)
>   	return igt_schedule_chains(arg, igt_priority);
>   }
>   
> +static bool igt_deadline(struct i915_request *rq,
> +			 unsigned long v, unsigned long e)
> +{
> +	i915_request_set_deadline(rq, 0);
> +	GEM_BUG_ON(rq_deadline(rq) != 0);
> +	return true;
> +}
> +
> +static int igt_deadline_chains(void *arg)
> +{
> +	return igt_schedule_chains(arg, igt_deadline);
> +}
> +
> +static bool igt_defer(struct i915_request *rq, unsigned long v, unsigned long e)
> +{
> +	struct intel_engine_cs *engine = rq->engine;
> +
> +	/* XXX No generic means to unwind incomplete requests yet */
> +	if (!i915_request_in_priority_queue(rq))
> +		return false;
> +
> +	if (!intel_engine_has_preemption(engine))
> +		return false;
> +
> +	spin_lock_irq(&engine->active.lock);
> +
> +	/* Push all the requests to the same deadline */
> +	__i915_request_set_deadline(rq, 0);
> +	GEM_BUG_ON(rq_deadline(rq) != 0);
> +
> +	/* Then the very first request must be the one everyone depends on */
> +	rq = list_first_entry(lookup_priolist(engine, 0),
> +			      typeof(*rq), sched.link);
> +	GEM_BUG_ON(rq->engine != engine);
> +
> +	/* Deferring the first request will then have to defer all requests */
> +	__intel_engine_defer_request(engine, rq);
> +
> +	spin_unlock_irq(&engine->active.lock);
> +	return true;
> +}
> +
> +static int igt_deadline_defer(void *arg)
> +{
> +	return igt_schedule_chains(arg, igt_defer);
> +}
> +
>   static struct i915_request *
>   __write_timestamp(struct intel_engine_cs *engine,
>   		  struct drm_i915_gem_object *obj,
> @@ -781,13 +863,22 @@ static int igt_priority_cycle(void *arg)
>   	return __igt_schedule_cycle(arg, igt_priority);
>   }
>   
> +static int igt_deadline_cycle(void *arg)
> +{
> +	return __igt_schedule_cycle(arg, igt_deadline);
> +}
> +
>   int i915_scheduler_live_selftests(struct drm_i915_private *i915)
>   {
>   	static const struct i915_subtest tests[] = {
> +		SUBTEST(igt_deadline_chains),
>   		SUBTEST(igt_priority_chains),
>   
>   		SUBTEST(igt_schedule_cycle),
> +		SUBTEST(igt_deadline_cycle),
>   		SUBTEST(igt_priority_cycle),
> +
> +		SUBTEST(igt_deadline_defer),
>   	};
>   
>   	return i915_subtests(tests, i915);
> @@ -923,9 +1014,54 @@ static int sparse_priority(void *arg)
>   	return sparse(arg, set_priority);
>   }
>   
> +static u64 __set_deadline(struct i915_request *rq, u64 deadline)
> +{
> +	u64 dt;
> +
> +	preempt_disable();
> +	dt = ktime_get_raw_fast_ns();
> +	i915_request_set_deadline(rq, deadline);
> +	dt = ktime_get_raw_fast_ns() - dt;
> +	preempt_enable();
> +
> +	return dt;
> +}
> +
> +static bool set_deadline(struct i915_request *rq,
> +			 unsigned long v, unsigned long e)
> +{
> +	report("set-deadline", v, e, __set_deadline(rq, 0));
> +	return true;
> +}
> +
> +static int single_deadline(void *arg)
> +{
> +	return single(arg, set_deadline);
> +}
> +
> +static int wide_deadline(void *arg)
> +{
> +	return wide(arg, set_deadline);
> +}
> +
> +static int inv_deadline(void *arg)
> +{
> +	return inv(arg, set_deadline);
> +}
> +
> +static int sparse_deadline(void *arg)
> +{
> +	return sparse(arg, set_deadline);
> +}
> +
>   int i915_scheduler_perf_selftests(struct drm_i915_private *i915)
>   {
>   	static const struct i915_subtest tests[] = {
> +		SUBTEST(single_deadline),
> +		SUBTEST(wide_deadline),
> +		SUBTEST(inv_deadline),
> +		SUBTEST(sparse_deadline),
> +
>   		SUBTEST(single_priority),
>   		SUBTEST(wide_priority),
>   		SUBTEST(inv_priority),
> 

The numbers speak for themselves (anyone who hasn't played with 
intel_gpu_top and client stats enough probably can't appreciate how 
badly the current code can schedule), the design looks elegant, and the 
code is tidy. I'd say go for it and tweak/fix in situ if something pops 
up. So effectively an r-b in waiting; I just want to finish the series.

Regards,

Tvrtko




^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: [Intel-gfx] [PATCH 22/41] drm/i915: Fair low-latency scheduling
  2021-01-28 11:35   ` Tvrtko Ursulin
@ 2021-01-28 12:32     ` Chris Wilson
  0 siblings, 0 replies; 90+ messages in thread
From: Chris Wilson @ 2021-01-28 12:32 UTC (permalink / raw)
  To: Tvrtko Ursulin, intel-gfx; +Cc: thomas.hellstrom

Quoting Tvrtko Ursulin (2021-01-28 11:35:59)
> 
> On 25/01/2021 14:01, Chris Wilson wrote:
> > The first "scheduler" was a topological sorting of requests into
> > priority order. The execution order was deterministic, the earliest
> > submitted, highest priority request would be executed first. Priority
> > inheritance ensured that inversions were kept at bay, and allowed us to
> > dynamically boost priorities (e.g. for interactive pageflips).
> > 
> > The minimalistic timeslicing scheme was an attempt to introduce fairness
> > between long running requests, by evicting the active request at the end
> > of a timeslice and moving it to the back of its priority queue (while
> > ensuring that dependencies were kept in order). For short running
> > requests from many clients of equal priority, the scheme is still very
> > much FIFO submission ordering, and as unfair as before.
> > 
> > To impose fairness, we need an external metric that ensures that clients
> > are interspersed, so we don't execute one long chain from client A before
> > executing any of client B. This could be imposed by the clients
> > themselves by using fences based on an external clock, that is they only
> > submit work for a "frame" at frame-intervals, instead of submitting as
> > much work as they are able to. The standard SwapBuffers approach is akin
> > to double bufferring, where as one frame is being executed, the next is
> > being submitted, such that there is always a maximum of two frames per
> > client in the pipeline and so ideally maintains consistent input-output
> > latency. Even this scheme exhibits unfairness under load as a single
> > client will execute two frames back to back before the next, and with
> > enough clients, deadlines will be missed.
> > 
> > The idea introduced by BFS/MuQSS is that fairness is introduced by
> > metering with an external clock. Every request, when it becomes ready to
> > execute is assigned a virtual deadline, and execution order is then
> > determined by earliest deadline. Priority is used as a hint, rather than
> > strict ordering, where high priority requests have earlier deadlines,
> > but not necessarily earlier than outstanding work. Thus work is executed
> > in order of 'readiness', with timeslicing to demote long running work.
> > 
> > The Achilles' heel of this scheduler is its strong preference for
> > low-latency and favouring of new queues. Whereas it was easy to dominate
> > the old scheduler by flooding it with many requests over a short period
> > of time, the new scheduler can be dominated by a 'synchronous' client
> > that waits for each of its requests to complete before submitting the
> > next. As such a client has no history, it is always considered
> > ready-to-run and receives an earlier deadline than the long running
> > requests. This is compensated for by refreshing the current execution's
> > deadline and by disallowing preemption for timeslice shuffling.
> > 
> > In contrast, one key advantage of disconnecting the sort key from the
> > priority value is that we can freely adjust the deadline to compensate
> > for other factors. This is used in conjunction with submitting requests
> > ahead-of-schedule that then busywait on the GPU using semaphores. Since
> > we don't want to spend a timeslice busywaiting instead of doing real
> > work when available, we deprioritise work by giving the semaphore waits
> > a later virtual deadline. The priority deboost is applied to semaphore
> > workloads after they miss a semaphore wait and a new context is pending.
> > The request is then restored to its normal priority once the semaphores
> > are signaled so that it is not unfairly penalised under contention by
> > remaining at a far future deadline. This is a much improved and cleaner
> > version of commit f9e9e9de58c7 ("drm/i915: Prioritise non-busywait
> > semaphore workloads").
> > 
> > To check the impact on throughput (often the downfall of latency
> > sensitive schedulers), we used gem_wsim to simulate various transcode
> > workloads with different load balancers, and varying the number of
> > competing [heterogenous] clients. On Kabylake gt3e running at fixed
> > clocks,
> > 
> > +delta%------------------------------------------------------------------+
> > |       a                                                                |
> > |       a                                                                |
> > |       a                                                                |
> > |       a                                                                |
> > |       aa                                                               |
> > |      aaa                                                               |
> > |      aaaa                                                              |
> > |     aaaaaa                                                             |
> > |     aaaaaa                                                             |
> > |     aaaaaa   a                a                                        |
> > | aa  aaaaaa a a      a  a   aa a       a         a       a             a|
> > ||______M__A__________|                                                  |
> > +------------------------------------------------------------------------+
> >      N           Min           Max        Median          Avg       Stddev
> >    108    -4.6326643     47.797855 -0.00069639128     2.116185   7.6764049
> 
> +47% is aggregate throughput or 47% less variance between worst-best 
> clients from the group?

Each point is the relative change in throughput, wsim work-per-second B/A.

That +47% is due to the improved semaphore deprioritisation.
If you look at earlier results, the range used to be more like -20% to
+20%, where sometimes we did better by avoiding the busywaits and
sometimes worse. The fix for the -20% was to apply the semaphore
deprioritisation after a miss rather than upfront (as we previously did).

> > @@ -549,9 +559,12 @@ static void __execlists_schedule_out(struct i915_request * const rq,
> >        * If we have just completed this context, the engine may now be
> >        * idle and we want to re-enter powersaving.
> >        */
> > -     if (intel_timeline_is_last(ce->timeline, rq) &&
> > -         __i915_request_is_complete(rq))
> > -             intel_engine_add_retire(engine, ce->timeline);
> > +     if (__i915_request_is_complete(rq)) {
> > +             if (!intel_timeline_is_last(ce->timeline, rq))
> > +                     i915_request_update_deadline(list_next_entry(rq, link));
> 
> Comment here explaining why it is important to update the deadline for 
> the following request once previous completes?
> 
> And this is just for the last request of the coalesced bunch right?

Yes. It follows on from the consideration that a deadline is set when
the request becomes ready. As we submit work ahead of the completion
signals, we may unfairly postpone further submissions along an active
context as the accumulated deadline far exceeds that of a new client,
even though both pieces of work are ready to be executed.

From a bandwidth pov, this is still a reasonable hack as the executing
context finished early and did not consume all of its timeslice/budget.
So we award the next request in the context with the remainder of the
budget, and a fresh client will have its full budget.

Without this quirk, we always favour new clients versus long running
work.

> > @@ -892,10 +892,7 @@ release_queue(struct intel_engine_cs *engine,
> >       i915_request_get(rq);
> >       i915_request_add(rq);
> >   
> > -     local_bh_disable();
> > -     i915_request_set_priority(rq, prio);
> > -     local_bh_enable(); /* kick tasklet */
> > -
> > +     i915_request_set_deadline(rq, deadline);
> 
> I am thinking some underscores to this API could be beneficial to 
> emphasise how high level callers should not use it on their requests. 
> Thinking about things like tests and in kernel clients - my 
> understanding is API is not for them.

Ah, this is intended to be used just like changing priority, e.g., in
the display we set a deadline for the pageflip. So although the deadline
is soft, it is still a meaningful ktime_t.

That extra information will, of course, only be carried as far as it is
understood.
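
As a rough illustration of how a nanosecond deadline maps onto the
scheduler's coarser units, here is a minimal standalone sketch of the two
conversion helpers added by the patch (i915_sched_to_ticks() and
i915_sched_to_ns()). The names sched_to_ticks(), sched_to_ns() and
SCHED_DEADLINE_SHIFT are stand-ins for illustration only, and they take
plain nanosecond counts rather than a ktime_t:

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for I915_SCHED_DEADLINE_SHIFT from the patch (value 19). */
#define SCHED_DEADLINE_SHIFT 19

/* Convert nanoseconds into scheduler ticks, discarding the low bits. */
static uint64_t sched_to_ticks(uint64_t ns)
{
	return ns >> SCHED_DEADLINE_SHIFT;
}

/* Convert scheduler ticks back into nanoseconds (start of the bucket). */
static uint64_t sched_to_ns(uint64_t ticks)
{
	/* Guard against shifting significant bits off the top. */
	assert(ticks <= (UINT64_MAX >> SCHED_DEADLINE_SHIFT));
	return ticks << SCHED_DEADLINE_SHIFT;
}
```

With a shift of 19 each tick spans 524288ns, so a deadline is only
resolved to roughly half a millisecond, and two deadlines that fall into
the same bucket compare as equal.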

> >       switch (state) {
> >       case FENCE_COMPLETE:
> > +             i915_request_update_deadline(rq);
> 
> This will pull the deadline in or push out in practice?

In, or be ignored.

This signal corresponds to when the request would normally be submitted
as being ready. So we re-evaluate the request afresh.

As it is also after the semaphore, the new deadline is not only computed
relative to the current time, but it is also without the semaphore
deboosting.
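
The "in, or be ignored" behaviour can be sketched as below. This is a
toy model of the min()-then-compare pattern seen in
set_earliest_deadline(), not the driver code: struct toy_request,
toy_update_deadline() and DEADLINE_NEVER are hypothetical names, and the
freshly computed deadline is simply passed in by the caller:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for I915_DEADLINE_NEVER (U64_MAX in the patch). */
#define DEADLINE_NEVER UINT64_MAX

/* Hypothetical request carrying only its virtual deadline. */
struct toy_request {
	uint64_t deadline;
};

/*
 * Apply a freshly computed deadline. The deadline may only move earlier;
 * an equal or later value is ignored. Returns true when the deadline was
 * pulled in, which is when a caller would want to kick the scheduler.
 */
static bool toy_update_deadline(struct toy_request *rq, uint64_t fresh)
{
	uint64_t old, dl;

	assert(rq != NULL);

	old = rq->deadline;
	dl = fresh < old ? fresh : old;
	if (dl >= old)
		return false; /* not earlier: keep the existing deadline */

	rq->deadline = dl;
	return true;
}
```

A request starting at DEADLINE_NEVER always accepts its first real
deadline, which matches how queue_request() in the patch passes
I915_DEADLINE_NEVER as the old value.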

> > +static u64 prio_slice(int prio)
> > +{
> > +     u64 slice;
> > +     int sf;
> > +
> > +     /*
> > +      * This is the central heuristic to the virtual deadlines. By
> > +      * imposing that each task takes an equal amount of time, we
> > +      * let each client have an equal slice of the GPU time. By
> > +      * bringing the virtual deadline forward, that client will then
> > +      * have more GPU time, and vice versa a lower priority client will
> > +      * have a later deadline and receive less GPU time.
> > +      *
> > +      * In BFS/MuQSS, the prio_ratios[] are based on the task nice range of
> > +      * [-20, 20], with each lower priority having a ~10% longer deadline,
> > +      * with the note that the proportion of CPU time between two clients
> > +      * of different priority will be the square of the relative prio_slice.
> > +      *
> > +      * In contrast, this prio_slice() curve was chosen because it gave good
> > +      * results with igt/gem_exec_schedule. It may not be the best choice!
> > +      *
> > +      * With a 1ms scheduling quantum:
> > +      *
> > +      *   MAX USER:  ~32us deadline
> > +      *   0:         ~16ms deadline
> 
> Interesting centre/default point. Relates to 60Hz? If so how about 
> exporting some sysfs controls?

It's expected that we will definitely have input from cgroup here to
determine relative bandwidth budgets. The nice thing about the deadline
design is that it directly translates into bandwidth budgets :)

(But it will definitely take many tests to prove we get the right
factors for relative workload distribution.)
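
To make the link between deadlines and bandwidth concrete, here is a
deliberately simplified earliest-deadline-first model. Everything in it
is an assumption for illustration: struct toy_client and toy_run_edf()
are hypothetical, every job takes the same fixed time, each client always
has another job ready, and there is no timeslicing, preemption or
deadline quantisation as in the real scheduler:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model: none of these names are i915 symbols. */
struct toy_client {
	uint64_t slice;    /* offset added to "now" to form the deadline */
	uint64_t deadline; /* current virtual deadline */
	unsigned int runs; /* number of jobs executed so far */
};

/*
 * Run 'steps' jobs, always picking the client with the earliest virtual
 * deadline (ties go to the lowest index). After a job completes, the
 * client's next job becomes ready and gets deadline = now + slice.
 */
static void toy_run_edf(struct toy_client *c, size_t count,
			uint64_t job_time, unsigned int steps)
{
	uint64_t now = 0;
	size_t i;

	assert(count > 0);

	for (i = 0; i < count; i++)
		c[i].deadline = now + c[i].slice;

	while (steps--) {
		size_t best = 0;

		for (i = 1; i < count; i++)
			if (c[i].deadline < c[best].deadline)
				best = i;

		c[best].runs++;
		now += job_time;
		c[best].deadline = now + c[best].slice;
	}
}
```

In this model the client with the shorter slice is picked more often, so
adjusting the slice is a direct way of adjusting its share of the engine.
The actual ratios in the driver will differ and, as noted above, would
need to be measured.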

sysfs is a possibility, but for the difficulty in naming the controls.
So mostly kept as an ace up the sleeve until Joonas asks "can we...?"

> > @@ -545,21 +756,15 @@ static void __i915_request_set_priority(struct i915_request *rq, int prio)
> >                * any preemption required, be dealt with upon submission.
> >                * See engine->submit_request()
> >                */
> > -             if (!i915_request_is_ready(rq))
> > -                     continue;
> > -
> >               GEM_BUG_ON(rq->engine != engine);
> > -             if (i915_request_in_priority_queue(rq)) {
> > -                     struct list_head *prev = rq->sched.link.prev;
> > +             if (i915_request_is_ready(rq) &&
> > +                 set_earliest_deadline(rq, rq_deadline(rq)))
> 
> Inside here it walks the signalers list for rq, while this is inside the 
> loop which already walks the whole signalers tree for each rq. I wonder 
> if there is scope to somehow eliminate this extra sub-walk. But to be 
> honest it makes my head spin how to do it, so probably best to leave it 
> for later, if it is even possible.

Yes. 'nuff said. :)

The inner dfs should be short as it should not have to descend into the
tree again. But there's some freedom as each set-priority may pick a
different deadline and so different subtrees may require re-traversing.

> >   int i915_scheduler_perf_selftests(struct drm_i915_private *i915)
> >   {
> >       static const struct i915_subtest tests[] = {
> > +             SUBTEST(single_deadline),
> > +             SUBTEST(wide_deadline),
> > +             SUBTEST(inv_deadline),
> > +             SUBTEST(sparse_deadline),
> > +
> >               SUBTEST(single_priority),
> >               SUBTEST(wide_priority),
> >               SUBTEST(inv_priority),
> > 
> 
> The numbers speak for themselves (anyone who hasn't played with 
> intel_gpu_top and client stats enough probably can't appreciate how 
> badly the current code can schedule), the design looks elegant, and the 
> code is tidy. I'd say go for it and tweak/fix in situ if something pops 
> up. So effectively an r-b in waiting; I just want to finish the series.

Aye. And wsim throughput/deadline modes proved invaluable.

I have not been able to measure any difference in game benchmarks (except
if you look at them in intel_gpu_top) as they are dominated by a single
client on a single engine, but the small sample of media transcode
benchmarks I have shows a very nice uptick.

Where this matters most will be in saturated multi-client systems,
especially when asked for more precise budgets. The interactive desktop
being a simple example, but since we always had very aggressive priority
boosting for flips, I doubt anyone would notice [if we couldn't maintain
vrefresh in the first place, the system will always feel laggy].
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: [Intel-gfx] [PATCH 20/41] drm/i915: Replace priolist rbtree with a skiplist
  2021-01-25 14:01 ` [Intel-gfx] [PATCH 20/41] drm/i915: Replace priolist rbtree with a skiplist Chris Wilson
  2021-01-27 15:10   ` Tvrtko Ursulin
@ 2021-01-28 15:56   ` Tvrtko Ursulin
  2021-01-28 16:26     ` Chris Wilson
  2021-01-28 22:56   ` Matthew Brost
  2021-01-29 10:22   ` Tvrtko Ursulin
  3 siblings, 1 reply; 90+ messages in thread
From: Tvrtko Ursulin @ 2021-01-28 15:56 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx; +Cc: thomas.hellstrom



On 25/01/2021 14:01, Chris Wilson wrote:
> Replace the priolist rbtree with a skiplist. The crucial difference is
> that walking and removing the first element of a skiplist is O(1), but
> O(lgN) for an rbtree, as we need to rebalance on remove. This is a
> hindrance for submission latency as it occurs between picking a request
> for the priolist and submitting it to hardware, as well as effectively
> tripling the number of O(lgN) operations required under the irqoff lock.
> This is critical to reducing the latency jitter with multiple clients.
> 
> The downsides to skiplists are that lookup/insertion is only
> probabilistically O(lgN) and there is a significant memory penalty
> as each skip node is larger than the rbtree equivalent. Furthermore, we
> don't use dynamic arrays for the skiplist, so the allocation is fixed,
> and imposes an upper bound on the scalability wrt the number of
> inflight requests.
> 
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> ---
>   .../drm/i915/gt/intel_execlists_submission.c  |  63 +++--
>   .../gpu/drm/i915/gt/uc/intel_guc_submission.c |  30 +--
>   drivers/gpu/drm/i915/i915_priolist_types.h    |  28 +-
>   drivers/gpu/drm/i915/i915_scheduler.c         | 244 ++++++++++++++----
>   drivers/gpu/drm/i915/i915_scheduler.h         |  11 +-
>   drivers/gpu/drm/i915/i915_scheduler_types.h   |   2 +-
>   .../drm/i915/selftests/i915_mock_selftests.h  |   1 +
>   .../gpu/drm/i915/selftests/i915_scheduler.c   |  53 +++-
>   8 files changed, 316 insertions(+), 116 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
> index 1103c8a00af1..129144dd86b0 100644
> --- a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
> +++ b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
> @@ -244,11 +244,6 @@ static void ring_set_paused(const struct intel_engine_cs *engine, int state)
>   		wmb();
>   }
>   
> -static struct i915_priolist *to_priolist(struct rb_node *rb)
> -{
> -	return rb_entry(rb, struct i915_priolist, node);
> -}
> -
>   static int rq_prio(const struct i915_request *rq)
>   {
>   	return READ_ONCE(rq->sched.attr.priority);
> @@ -272,15 +267,31 @@ static int effective_prio(const struct i915_request *rq)
>   	return prio;
>   }
>   
> -static int queue_prio(const struct i915_sched_engine *se)
> +static struct i915_request *first_request(struct i915_sched_engine *se)
>   {
> -	struct rb_node *rb;
> +	struct i915_priolist *pl;
>   
> -	rb = rb_first_cached(&se->queue);
> -	if (!rb)
> +	for_each_priolist(pl, &se->queue) {
> +		if (likely(!list_empty(&pl->requests)))
> +			return list_first_entry(&pl->requests,
> +						struct i915_request,
> +						sched.link);
> +
> +		i915_priolist_advance(&se->queue, pl);
> +	}
> +
> +	return NULL;
> +}
> +
> +static int queue_prio(struct i915_sched_engine *se)
> +{
> +	struct i915_request *rq;
> +
> +	rq = first_request(se);
> +	if (!rq)
>   		return INT_MIN;
>   
> -	return to_priolist(rb)->priority;
> +	return rq_prio(rq);
>   }
>   
>   static int virtual_prio(const struct intel_engine_execlists *el)
> @@ -290,7 +301,7 @@ static int virtual_prio(const struct intel_engine_execlists *el)
>   	return rb ? rb_entry(rb, struct ve_node, rb)->prio : INT_MIN;
>   }
>   
> -static bool need_preempt(const struct intel_engine_cs *engine,
> +static bool need_preempt(struct intel_engine_cs *engine,
>   			 const struct i915_request *rq)
>   {
>   	int last_prio;
> @@ -1136,6 +1147,7 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
>   	struct i915_request ** const last_port = port + execlists->port_mask;
>   	struct i915_request *last, * const *active;
>   	struct virtual_engine *ve;
> +	struct i915_priolist *pl;
>   	struct rb_node *rb;
>   	bool submit = false;
>   
> @@ -1346,11 +1358,10 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
>   			break;
>   	}
>   
> -	while ((rb = rb_first_cached(&engine->active.queue))) {
> -		struct i915_priolist *p = to_priolist(rb);
> +	for_each_priolist(pl, &engine->active.queue) {
>   		struct i915_request *rq, *rn;
>   
> -		priolist_for_each_request_consume(rq, rn, p) {
> +		priolist_for_each_request_safe(rq, rn, pl) {
>   			bool merge = true;
>   
>   			/*
> @@ -1425,8 +1436,7 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
>   			}
>   		}
>   
> -		rb_erase_cached(&p->node, &engine->active.queue);
> -		i915_priolist_free(p);
> +		i915_priolist_advance(&engine->active.queue, pl);
>   	}
>   done:
>   	*port++ = i915_request_get(last);
> @@ -2631,6 +2641,7 @@ static void execlists_reset_cancel(struct intel_engine_cs *engine)
>   {
>   	struct intel_engine_execlists * const execlists = &engine->execlists;
>   	struct i915_request *rq, *rn;
> +	struct i915_priolist *pl;
>   	struct rb_node *rb;
>   	unsigned long flags;
>   
> @@ -2661,16 +2672,12 @@ static void execlists_reset_cancel(struct intel_engine_cs *engine)
>   	intel_engine_signal_breadcrumbs(engine);
>   
>   	/* Flush the queued requests to the timeline list (for retiring). */
> -	while ((rb = rb_first_cached(&engine->active.queue))) {
> -		struct i915_priolist *p = to_priolist(rb);
> -
> -		priolist_for_each_request_consume(rq, rn, p) {
> +	for_each_priolist(pl, &engine->active.queue) {
> +		priolist_for_each_request_safe(rq, rn, pl) {
>   			i915_request_mark_eio(rq);
>   			__i915_request_submit(rq);
>   		}
> -
> -		rb_erase_cached(&p->node, &engine->active.queue);
> -		i915_priolist_free(p);
> +		i915_priolist_advance(&engine->active.queue, pl);
>   	}
>   	GEM_BUG_ON(!i915_sched_is_idle(&engine->active));
>   
> @@ -2703,7 +2710,6 @@ static void execlists_reset_cancel(struct intel_engine_cs *engine)
>   	/* Remaining _unready_ requests will be nop'ed when submitted */
>   
>   	execlists->queue_priority_hint = INT_MIN;
> -	engine->active.queue = RB_ROOT_CACHED;
>   
>   	GEM_BUG_ON(__tasklet_is_enabled(&engine->active.tasklet));
>   	engine->active.tasklet.func = nop_submission_tasklet;
> @@ -3089,6 +3095,8 @@ static void virtual_context_exit(struct intel_context *ce)
>   
>   	for (n = 0; n < ve->num_siblings; n++)
>   		intel_engine_pm_put(ve->siblings[n]);
> +
> +	i915_sched_park_engine(&ve->base.active);
>   }
>   
>   static const struct intel_context_ops virtual_context_ops = {
> @@ -3501,6 +3509,7 @@ void intel_execlists_show_requests(struct intel_engine_cs *engine,
>   {
>   	const struct intel_engine_execlists *execlists = &engine->execlists;
>   	struct i915_request *rq, *last;
> +	struct i915_priolist *pl;
>   	unsigned long flags;
>   	unsigned int count;
>   	struct rb_node *rb;
> @@ -3530,10 +3539,8 @@ void intel_execlists_show_requests(struct intel_engine_cs *engine,
>   
>   	last = NULL;
>   	count = 0;
> -	for (rb = rb_first_cached(&engine->active.queue); rb; rb = rb_next(rb)) {
> -		struct i915_priolist *p = rb_entry(rb, typeof(*p), node);
> -
> -		priolist_for_each_request(rq, p) {
> +	for_each_priolist(pl, &engine->active.queue) {
> +		priolist_for_each_request(rq, pl) {
>   			if (count++ < max - 1)
>   				show_request(m, rq, "\t\t", 0);
>   			else
> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> index 2d7339ef3b4c..8d0c6cd277b3 100644
> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> @@ -59,11 +59,6 @@
>   
>   #define GUC_REQUEST_SIZE 64 /* bytes */
>   
> -static inline struct i915_priolist *to_priolist(struct rb_node *rb)
> -{
> -	return rb_entry(rb, struct i915_priolist, node);
> -}
> -
>   static struct guc_stage_desc *__get_stage_desc(struct intel_guc *guc, u32 id)
>   {
>   	struct guc_stage_desc *base = guc->stage_desc_pool_vaddr;
> @@ -185,8 +180,8 @@ static void __guc_dequeue(struct intel_engine_cs *engine)
>   	struct i915_request ** const last_port = first + execlists->port_mask;
>   	struct i915_request *last = first[0];
>   	struct i915_request **port;
> +	struct i915_priolist *pl;
>   	bool submit = false;
> -	struct rb_node *rb;
>   
>   	lockdep_assert_held(&engine->active.lock);
>   
> @@ -203,11 +198,10 @@ static void __guc_dequeue(struct intel_engine_cs *engine)
>   	 * event.
>   	 */
>   	port = first;
> -	while ((rb = rb_first_cached(&engine->active.queue))) {
> -		struct i915_priolist *p = to_priolist(rb);
> +	for_each_priolist(pl, &engine->active.queue) {
>   		struct i915_request *rq, *rn;
>   
> -		priolist_for_each_request_consume(rq, rn, p) {
> +		priolist_for_each_request_safe(rq, rn, pl) {
>   			if (last && rq->context != last->context) {
>   				if (port == last_port)
>   					goto done;
> @@ -223,12 +217,11 @@ static void __guc_dequeue(struct intel_engine_cs *engine)
>   			last = rq;
>   		}
>   
> -		rb_erase_cached(&p->node, &engine->active.queue);
> -		i915_priolist_free(p);
> +		i915_priolist_advance(&engine->active.queue, pl);
>   	}
>   done:
>   	execlists->queue_priority_hint =
> -		rb ? to_priolist(rb)->priority : INT_MIN;
> +		pl != &engine->active.queue.sentinel ? pl->priority : INT_MIN;
>   	if (submit) {
>   		*port = schedule_in(last, port - execlists->inflight);
>   		*++port = NULL;
> @@ -327,7 +320,7 @@ static void guc_reset_cancel(struct intel_engine_cs *engine)
>   {
>   	struct intel_engine_execlists * const execlists = &engine->execlists;
>   	struct i915_request *rq, *rn;
> -	struct rb_node *rb;
> +	struct i915_priolist *p;
>   	unsigned long flags;
>   
>   	ENGINE_TRACE(engine, "\n");
> @@ -355,25 +348,20 @@ static void guc_reset_cancel(struct intel_engine_cs *engine)
>   	}
>   
>   	/* Flush the queued requests to the timeline list (for retiring). */
> -	while ((rb = rb_first_cached(&engine->active.queue))) {
> -		struct i915_priolist *p = to_priolist(rb);
> -
> -		priolist_for_each_request_consume(rq, rn, p) {
> +	for_each_priolist(p, &engine->active.queue) {
> +		priolist_for_each_request_safe(rq, rn, p) {
>   			list_del_init(&rq->sched.link);
>   			__i915_request_submit(rq);
>   			dma_fence_set_error(&rq->fence, -EIO);
>   			i915_request_mark_complete(rq);
>   		}
> -
> -		rb_erase_cached(&p->node, &engine->active.queue);
> -		i915_priolist_free(p);
> +		i915_priolist_advance(&engine->active.queue, p);
>   	}
>   	GEM_BUG_ON(!i915_sched_is_idle(&engine->active));
>   
>   	/* Remaining _unready_ requests will be nop'ed when submitted */
>   
>   	execlists->queue_priority_hint = INT_MIN;
> -	engine->active.queue = RB_ROOT_CACHED;
>   
>   	spin_unlock_irqrestore(&engine->active.lock, flags);
>   }
> diff --git a/drivers/gpu/drm/i915/i915_priolist_types.h b/drivers/gpu/drm/i915/i915_priolist_types.h
> index bc2fa84f98a8..1200c3df6a4a 100644
> --- a/drivers/gpu/drm/i915/i915_priolist_types.h
> +++ b/drivers/gpu/drm/i915/i915_priolist_types.h
> @@ -38,10 +38,36 @@ enum {
>   #define I915_PRIORITY_UNPREEMPTABLE INT_MAX
>   #define I915_PRIORITY_BARRIER (I915_PRIORITY_UNPREEMPTABLE - 1)
>   
> +#ifdef CONFIG_64BIT
> +#define I915_PRIOLIST_HEIGHT 12
> +#else
> +#define I915_PRIOLIST_HEIGHT 11
> +#endif

I did not get this. On one hand I could think pointers are larger on 
64-bit so go for fewer levels, if size was a concern. But on the other 
hand 32-bit is less important these days, definitely much less as a 
performance platform. So going for less memory use => worse performance 
on a less important platform, which typically could be more memory 
constrained? Not sure I see it as that important either way to be 
distinctive but a comment would satisfy me.

> +
>   struct i915_priolist {
>   	struct list_head requests;

What would be on this list? Request can only be on one at a time, so I 
was thinking these nodes would have pointers to list of that priority, 
rather than lists themselves. Assuming there can be multiple nodes of 
the same priority in the 2d hierarchy. Possibly I don't understand the 
layout.

> -	struct rb_node node;
>   	int priority;
> +
> +	int level;
> +	struct i915_priolist *next[I915_PRIOLIST_HEIGHT];

Does every node need the maximum height, or could you allocate depending on 
the current height?

>   };
>   
> +struct i915_priolist_root {
> +	struct i915_priolist sentinel;
> +	u32 prng;
> +};
> +
> +#define i915_priolist_is_empty(root) ((root)->sentinel.level < 0)
> +
> +#define for_each_priolist(p, root) \
> +	for ((p) = (root)->sentinel.next[0]; \
> +	     (p) != &(root)->sentinel; \
> +	     (p) = (p)->next[0])
> +
> +#define priolist_for_each_request(it, plist) \
> +	list_for_each_entry(it, &(plist)->requests, sched.link)
> +
> +#define priolist_for_each_request_safe(it, n, plist) \
> +	list_for_each_entry_safe(it, n, &(plist)->requests, sched.link)
> +
>   #endif /* _I915_PRIOLIST_TYPES_H_ */
> diff --git a/drivers/gpu/drm/i915/i915_scheduler.c b/drivers/gpu/drm/i915/i915_scheduler.c
> index a3ee06cb66d7..74000d3eebb1 100644
> --- a/drivers/gpu/drm/i915/i915_scheduler.c
> +++ b/drivers/gpu/drm/i915/i915_scheduler.c
> @@ -4,7 +4,9 @@
>    * Copyright © 2018 Intel Corporation
>    */
>   
> +#include <linux/bitops.h>
>   #include <linux/mutex.h>
> +#include <linux/prandom.h>
>   
>   #include "gt/intel_ring.h"
>   #include "gt/intel_lrc_reg.h"
> @@ -91,15 +93,24 @@ static void i915_sched_init_ipi(struct i915_sched_ipi *ipi)
>   	ipi->list = NULL;
>   }
>   
> +static void init_priolist(struct i915_priolist_root *const root)
> +{
> +	struct i915_priolist *pl = &root->sentinel;
> +
> +	memset_p((void **)pl->next, pl, ARRAY_SIZE(pl->next));
> +	pl->priority = INT_MIN;
> +	pl->level = -1;
> +}
> +
>   void i915_sched_init_engine(struct i915_sched_engine *se,
>   			    unsigned int subclass)
>   {
>   	spin_lock_init(&se->lock);
>   	lockdep_set_subclass(&se->lock, subclass);
>   
> +	init_priolist(&se->queue);
>   	INIT_LIST_HEAD(&se->requests);
>   	INIT_LIST_HEAD(&se->hold);
> -	se->queue = RB_ROOT_CACHED;
>   
>   	i915_sched_init_ipi(&se->ipi);
>   
> @@ -116,8 +127,57 @@ void i915_sched_init_engine(struct i915_sched_engine *se,
>   #endif
>   }
>   
> +__maybe_unused static bool priolist_idle(struct i915_priolist_root *root)
> +{
> +	struct i915_priolist *pl = &root->sentinel;
> +	int lvl;
> +
> +	for (lvl = 0; lvl < ARRAY_SIZE(pl->next); lvl++) {
> +		if (pl->next[lvl] != pl) {
> +			GEM_TRACE_ERR("root[%d] is not empty\n", lvl);
> +			return false;
> +		}
> +	}
> +
> +	if (pl->level != -1) {
> +		GEM_TRACE_ERR("root is not clear: %d\n", pl->level);
> +		return false;
> +	}
> +
> +	return true;
> +}
> +
> +static void pl_push(struct i915_priolist *pl, struct list_head *head)
> +{
> +	pl->requests.next = head->next;
> +	head->next = &pl->requests;
> +}
> +
> +static struct i915_priolist *pl_pop(struct list_head *head)
> +{
> +	struct i915_priolist *pl;
> +
> +	pl = container_of(head->next, typeof(*pl), requests);
> +	head->next = pl->requests.next;
> +
> +	return pl;
> +}
> +
> +static bool pl_empty(struct list_head *head)
> +{
> +	return !head->next;
> +}
> +
>   void i915_sched_park_engine(struct i915_sched_engine *se)
>   {
> +	struct i915_priolist_root *root = &se->queue;
> +	struct list_head *list = &root->sentinel.requests;
> +
> +	GEM_BUG_ON(!priolist_idle(root));
> +
> +	while (!pl_empty(list))
> +		kmem_cache_free(global.slab_priorities, pl_pop(list));
> +
>   	GEM_BUG_ON(!i915_sched_is_idle(se));
>   	se->no_priolist = false;
>   }
> @@ -183,71 +243,55 @@ static inline bool node_signaled(const struct i915_sched_node *node)
>   	return i915_request_completed(node_to_request(node));
>   }
>   
> -static inline struct i915_priolist *to_priolist(struct rb_node *rb)
> +static inline unsigned int random_level(struct i915_priolist_root *root)
>   {
> -	return rb_entry(rb, struct i915_priolist, node);
> -}
> -
> -static void assert_priolists(struct i915_sched_engine * const se)
> -{
> -	struct rb_node *rb;
> -	long last_prio;
> -
> -	if (!IS_ENABLED(CONFIG_DRM_I915_DEBUG_GEM))
> -		return;
> -
> -	GEM_BUG_ON(rb_first_cached(&se->queue) !=
> -		   rb_first(&se->queue.rb_root));
> -
> -	last_prio = INT_MAX;
> -	for (rb = rb_first_cached(&se->queue); rb; rb = rb_next(rb)) {
> -		const struct i915_priolist *p = to_priolist(rb);
> -
> -		GEM_BUG_ON(p->priority > last_prio);
> -		last_prio = p->priority;
> -	}
> +	root->prng = next_pseudo_random32(root->prng);
> +	return  __ffs(root->prng) / 2;

Where is the relationship to I915_PRIOLIST_HEIGHT? Feels root->prng % 
I915_PRIOLIST_HEIGHT would be more obvious here unless I am terribly 
mistaken. Or at least put a comment saying why the hack.

>   }
>   
>   static struct list_head *
>   lookup_priolist(struct intel_engine_cs *engine, int prio)
>   {
> +	struct i915_priolist *update[I915_PRIOLIST_HEIGHT];
>   	struct i915_sched_engine * const se = &engine->active;
> -	struct i915_priolist *p;
> -	struct rb_node **parent, *rb;
> -	bool first = true;
> -
> -	lockdep_assert_held(&engine->active.lock);
> -	assert_priolists(se);
> +	struct i915_priolist_root *root = &se->queue;
> +	struct i915_priolist *pl, *tmp;
> +	int lvl;
>   
> +	lockdep_assert_held(&se->lock);
>   	if (unlikely(se->no_priolist))
>   		prio = I915_PRIORITY_NORMAL;
>   
> +	for_each_priolist(pl, root) { /* recycle any empty elements before us */
> +		if (pl->priority >= prio || !list_empty(&pl->requests))
> +			break;
> +
> +		i915_priolist_advance(root, pl);
> +	}
> +
>   find_priolist:
> -	/* most positive priority is scheduled first, equal priorities fifo */
> -	rb = NULL;
> -	parent = &se->queue.rb_root.rb_node;
> -	while (*parent) {
> -		rb = *parent;
> -		p = to_priolist(rb);
> -		if (prio > p->priority) {
> -			parent = &rb->rb_left;
> -		} else if (prio < p->priority) {
> -			parent = &rb->rb_right;
> -			first = false;
> -		} else {
> -			return &p->requests;
> -		}
> +	pl = &root->sentinel;
> +	lvl = pl->level;
> +	while (lvl >= 0) {
> +		while (tmp = pl->next[lvl], tmp->priority >= prio)
> +			pl = tmp;
> +		if (pl->priority == prio)
> +			goto out;
> +		update[lvl--] = pl;
>   	}
>   
>   	if (prio == I915_PRIORITY_NORMAL) {
> -		p = &se->default_priolist;
> +		pl = &se->default_priolist;
> +	} else if (!pl_empty(&root->sentinel.requests)) {
> +		pl = pl_pop(&root->sentinel.requests);
>   	} else {
> -		p = kmem_cache_alloc(global.slab_priorities, GFP_ATOMIC);
> +		pl = kmem_cache_alloc(global.slab_priorities, GFP_ATOMIC);
>   		/* Convert an allocation failure to a priority bump */
> -		if (unlikely(!p)) {
> +		if (unlikely(!pl)) {
>   			prio = I915_PRIORITY_NORMAL; /* recurses just once */
>   
> -			/* To maintain ordering with all rendering, after an
> +			/*
> +			 * To maintain ordering with all rendering, after an
>   			 * allocation failure we have to disable all scheduling.
>   			 * Requests will then be executed in fifo, and schedule
>   			 * will ensure that dependencies are emitted in fifo.
> @@ -260,18 +304,103 @@ lookup_priolist(struct intel_engine_cs *engine, int prio)
>   		}
>   	}
>   
> -	p->priority = prio;
> -	INIT_LIST_HEAD(&p->requests);
> +	pl->priority = prio;
> +	INIT_LIST_HEAD(&pl->requests);
>   
> -	rb_link_node(&p->node, rb, parent);
> -	rb_insert_color_cached(&p->node, &se->queue, first);
> +	lvl = random_level(root);
> +	if (lvl > root->sentinel.level) {
> +		if (root->sentinel.level < I915_PRIOLIST_HEIGHT - 1) {
> +			lvl = ++root->sentinel.level;

root->sentinel.level is maximum currently populated height? So if 
random_level said insert at 4 but there are currently only 2 levels, 
height will grow by one only?

> +			update[lvl] = &root->sentinel;
> +		} else {
> +			lvl = I915_PRIOLIST_HEIGHT - 1;

But if the maximum level has already been reached then this branch does not 
set anything in update[], relying on the while loop earlier in the 
function having populated it? How should I think of the update array?

Regards,

Tvrtko

> +		}
> +	}
> +	GEM_BUG_ON(lvl < 0);
> +	GEM_BUG_ON(lvl >= ARRAY_SIZE(pl->next));
>   
> -	return &p->requests;
> +	pl->level = lvl;
> +	do {
> +		tmp = update[lvl];
> +		pl->next[lvl] = update[lvl]->next[lvl];
> +		tmp->next[lvl] = pl;
> +	} while (--lvl >= 0);
> +
> +	if (IS_ENABLED(CONFIG_DRM_I915_DEBUG_GEM)) {
> +		struct i915_priolist *chk;
> +
> +		chk = &root->sentinel;
> +		lvl = chk->level;
> +		do {
> +			while (tmp = chk->next[lvl], tmp->priority >= prio)
> +				chk = tmp;
> +		} while (--lvl >= 0);
> +
> +		GEM_BUG_ON(chk != pl);
> +	}
> +
> +out:
> +	GEM_BUG_ON(pl == &root->sentinel);
> +	return &pl->requests;
>   }
>   
> -void __i915_priolist_free(struct i915_priolist *p)
> +static void remove_priolist(struct intel_engine_cs *engine,
> +			    struct list_head *plist)
>   {
> -	kmem_cache_free(global.slab_priorities, p);
> +	struct i915_sched_engine * const se = &engine->active;
> +	struct i915_priolist_root *root = &se->queue;
> +	struct i915_priolist *pl, *tmp;
> +	struct i915_priolist *old =
> +		container_of(plist, struct i915_priolist, requests);
> +	int prio = old->priority;
> +	int lvl;
> +
> +	lockdep_assert_held(&se->lock);
> +	GEM_BUG_ON(!list_empty(plist));
> +
> +	pl = &root->sentinel;
> +	lvl = pl->level;
> +	GEM_BUG_ON(lvl < 0);
> +
> +	if (prio != I915_PRIORITY_NORMAL)
> +		pl_push(old, &pl->requests);
> +
> +	do {
> +		while (tmp = pl->next[lvl], tmp->priority > prio)
> +			pl = tmp;
> +		if (lvl <= old->level) {
> +			pl->next[lvl] = old->next[lvl];
> +			if (pl == &root->sentinel && old->next[lvl] == pl) {
> +				GEM_BUG_ON(pl->level != lvl);
> +				pl->level--;
> +			}
> +		}
> +	} while (--lvl >= 0);
> +	GEM_BUG_ON(tmp != old);
> +}
> +
> +void i915_priolist_advance(struct i915_priolist_root *root,
> +			   struct i915_priolist *pl)
> +{
> +	struct i915_priolist * const s = &root->sentinel;
> +	int lvl;
> +
> +	GEM_BUG_ON(!list_empty(&pl->requests));
> +	GEM_BUG_ON(pl != s->next[0]);
> +	GEM_BUG_ON(pl == s);
> +
> +	if (pl->priority != I915_PRIORITY_NORMAL)
> +		pl_push(pl, &s->requests);
> +
> +	lvl = pl->level;
> +	GEM_BUG_ON(lvl < 0);
> +	do {
> +		s->next[lvl] = pl->next[lvl];
> +		if (pl->next[lvl] == s) {
> +			GEM_BUG_ON(s->level != lvl);
> +			s->level--;
> +		}
> +	} while (--lvl >= 0);
>   }
>   
>   static struct i915_request *
> @@ -420,8 +549,13 @@ static void __i915_request_set_priority(struct i915_request *rq, int prio)
>   			continue;
>   
>   		GEM_BUG_ON(rq->engine != engine);
> -		if (i915_request_in_priority_queue(rq))
> +		if (i915_request_in_priority_queue(rq)) {
> +			struct list_head *prev = rq->sched.link.prev;
> +
>   			list_move_tail(&rq->sched.link, plist);
> +			if (list_empty(prev))
> +				remove_priolist(engine, prev);
> +		}
>   
>   		/* Defer (tasklet) submission until after all updates. */
>   		kick_submission(engine, rq, prio);
> diff --git a/drivers/gpu/drm/i915/i915_scheduler.h b/drivers/gpu/drm/i915/i915_scheduler.h
> index 0ab47cbf0e9c..bca89a58d953 100644
> --- a/drivers/gpu/drm/i915/i915_scheduler.h
> +++ b/drivers/gpu/drm/i915/i915_scheduler.h
> @@ -16,12 +16,6 @@
>   
>   struct drm_printer;
>   
> -#define priolist_for_each_request(it, plist) \
> -	list_for_each_entry(it, &(plist)->requests, sched.link)
> -
> -#define priolist_for_each_request_consume(it, n, plist) \
> -	list_for_each_entry_safe(it, n, &(plist)->requests, sched.link)
> -
>   void i915_sched_node_init(struct i915_sched_node *node);
>   void i915_sched_node_reinit(struct i915_sched_node *node);
>   
> @@ -69,7 +63,7 @@ static inline void i915_priolist_free(struct i915_priolist *p)
>   
>   static inline bool i915_sched_is_idle(const struct i915_sched_engine *se)
>   {
> -	return RB_EMPTY_ROOT(&se->queue.rb_root);
> +	return i915_priolist_is_empty(&se->queue);
>   }
>   
>   static inline bool
> @@ -99,6 +93,9 @@ static inline void i915_sched_kick(struct i915_sched_engine *se)
>   	tasklet_hi_schedule(&se->tasklet);
>   }
>   
> +void i915_priolist_advance(struct i915_priolist_root *root,
> +			   struct i915_priolist *old);
> +
>   void i915_request_show_with_schedule(struct drm_printer *m,
>   				     const struct i915_request *rq,
>   				     const char *prefix,
> diff --git a/drivers/gpu/drm/i915/i915_scheduler_types.h b/drivers/gpu/drm/i915/i915_scheduler_types.h
> index f668c680d290..e64750be4e77 100644
> --- a/drivers/gpu/drm/i915/i915_scheduler_types.h
> +++ b/drivers/gpu/drm/i915/i915_scheduler_types.h
> @@ -89,7 +89,7 @@ struct i915_sched_engine {
>   	/**
>   	 * @queue: queue of requests, in priority lists
>   	 */
> -	struct rb_root_cached queue;
> +	struct i915_priolist_root queue;
>   
>   	struct i915_sched_ipi ipi;
>   
> diff --git a/drivers/gpu/drm/i915/selftests/i915_mock_selftests.h b/drivers/gpu/drm/i915/selftests/i915_mock_selftests.h
> index 3db34d3eea58..946c93441c1f 100644
> --- a/drivers/gpu/drm/i915/selftests/i915_mock_selftests.h
> +++ b/drivers/gpu/drm/i915/selftests/i915_mock_selftests.h
> @@ -25,6 +25,7 @@ selftest(ring, intel_ring_mock_selftests)
>   selftest(engine, intel_engine_cs_mock_selftests)
>   selftest(timelines, intel_timeline_mock_selftests)
>   selftest(requests, i915_request_mock_selftests)
> +selftest(scheduler, i915_scheduler_mock_selftests)
>   selftest(objects, i915_gem_object_mock_selftests)
>   selftest(phys, i915_gem_phys_mock_selftests)
>   selftest(dmabuf, i915_gem_dmabuf_mock_selftests)
> diff --git a/drivers/gpu/drm/i915/selftests/i915_scheduler.c b/drivers/gpu/drm/i915/selftests/i915_scheduler.c
> index de44a66210b7..de5b1443129b 100644
> --- a/drivers/gpu/drm/i915/selftests/i915_scheduler.c
> +++ b/drivers/gpu/drm/i915/selftests/i915_scheduler.c
> @@ -12,6 +12,54 @@
>   #include "selftests/igt_spinner.h"
>   #include "selftests/i915_random.h"
>   
> +static int mock_skiplist_levels(void *dummy)
> +{
> +	struct i915_priolist_root root = {};
> +	struct i915_priolist *pl = &root.sentinel;
> +	IGT_TIMEOUT(end_time);
> +	unsigned long total;
> +	int count, lvl;
> +
> +	total = 0;
> +	do {
> +		for (count = 0; count < 16384; count++) {
> +			lvl = random_level(&root);
> +			if (lvl > pl->level) {
> +				if (lvl < I915_PRIOLIST_HEIGHT - 1)
> +					lvl = ++pl->level;
> +				else
> +					lvl = I915_PRIOLIST_HEIGHT - 1;
> +			}
> +
> +			pl->next[lvl] = ptr_inc(pl->next[lvl]);
> +		}
> +		total += count;
> +	} while (!__igt_timeout(end_time, NULL));
> +
> +	pr_info("Total %9lu\n", total);
> +	for (lvl = 0; lvl <= pl->level; lvl++) {
> +		int x = ilog2((unsigned long)pl->next[lvl]);
> +		char row[80];
> +
> +		memset(row, '*', x);
> +		row[x] = '\0';
> +
> +		pr_info(" [%2d] %9lu %s\n",
> +			lvl, (unsigned long)pl->next[lvl], row);
> +	}
> +
> +	return 0;
> +}
> +
> +int i915_scheduler_mock_selftests(void)
> +{
> +	static const struct i915_subtest tests[] = {
> +		SUBTEST(mock_skiplist_levels),
> +	};
> +
> +	return i915_subtests(tests, NULL);
> +}
> +
>   static void scheduling_disable(struct intel_engine_cs *engine)
>   {
>   	engine->props.preempt_timeout_ms = 0;
> @@ -80,9 +128,9 @@ static int all_engines(struct drm_i915_private *i915,
>   static bool check_context_order(struct intel_engine_cs *engine)
>   {
>   	u64 last_seqno, last_context;
> +	struct i915_priolist *p;
>   	unsigned long count;
>   	bool result = false;
> -	struct rb_node *rb;
>   	int last_prio;
>   
>   	/* We expect the execution order to follow ascending fence-context */
> @@ -92,8 +140,7 @@ static bool check_context_order(struct intel_engine_cs *engine)
>   	last_context = 0;
>   	last_seqno = 0;
>   	last_prio = 0;
> -	for (rb = rb_first_cached(&engine->active.queue); rb; rb = rb_next(rb)) {
> -		struct i915_priolist *p = rb_entry(rb, typeof(*p), node);
> +	for_each_priolist(p, &engine->active.queue) {
>   		struct i915_request *rq;
>   
>   		priolist_for_each_request(rq, p) {
> 

^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: [Intel-gfx] [PATCH 20/41] drm/i915: Replace priolist rbtree with a skiplist
  2021-01-28 15:56   ` Tvrtko Ursulin
@ 2021-01-28 16:26     ` Chris Wilson
  2021-01-28 16:42       ` Tvrtko Ursulin
  2021-01-29  9:37       ` Tvrtko Ursulin
  0 siblings, 2 replies; 90+ messages in thread
From: Chris Wilson @ 2021-01-28 16:26 UTC (permalink / raw)
  To: Tvrtko Ursulin, intel-gfx; +Cc: thomas.hellstrom

Quoting Tvrtko Ursulin (2021-01-28 15:56:19)
> On 25/01/2021 14:01, Chris Wilson wrote:
> > diff --git a/drivers/gpu/drm/i915/i915_priolist_types.h b/drivers/gpu/drm/i915/i915_priolist_types.h
> > index bc2fa84f98a8..1200c3df6a4a 100644
> > --- a/drivers/gpu/drm/i915/i915_priolist_types.h
> > +++ b/drivers/gpu/drm/i915/i915_priolist_types.h
> > @@ -38,10 +38,36 @@ enum {
> >   #define I915_PRIORITY_UNPREEMPTABLE INT_MAX
> >   #define I915_PRIORITY_BARRIER (I915_PRIORITY_UNPREEMPTABLE - 1)
> >   
> > +#ifdef CONFIG_64BIT
> > +#define I915_PRIOLIST_HEIGHT 12
> > +#else
> > +#define I915_PRIOLIST_HEIGHT 11
> > +#endif
> 
> I did not get this. On one hand I could think pointers are larger on 
> 64-bit so go for fewer levels, if size was a concern. But on the other 
> hand 32-bit is less important these days, definitely much less as a 
> performance platform. So going for less memory use => worse performance 
> on a less important platform, which typically could be more memory 
> constrained? Not sure I see it as that important either way to be 
> distinctive but a comment would satisfy me.

Just aligned to the cacheline. The struct is 128B on 64b and 64B on 32b.
On 64b, we will scale to around 16 million requests in flight and 4
million on 32b. Which should be enough.

If we shrank 64b to a 64B node, we would only scale to 256 requests,
a limit we will definitely exceed.
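
As a rough sketch of the arithmetic behind those figures: with the P=.25
per level described further down, a skiplist of height H indexes roughly
4^H nodes before the top level saturates. The helper below is made up for
illustration and is not part of the patch.

```c
#include <assert.h>

/*
 * Hypothetical helper (not in the patch): with P=1/4 per level, a skiplist
 * of the given height indexes roughly 4^height nodes efficiently.
 */
static unsigned long long skiplist_capacity(unsigned int height)
{
	assert(height < 32); /* keep the shift within 64 bits */
	return 1ull << (2 * height); /* 4^height == 2^(2 * height) */
}
```

Under that assumption, a height of 12 gives 16777216 (the ~16 million on
64b), a height of 11 gives 4194304 (32b), and a height of 4 gives only 256.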

> >   struct i915_priolist {
> >       struct list_head requests;
> 
> What would be on this list? Request can only be on one at a time, so I 
> was thinking these nodes would have pointers to list of that priority, 
> rather than lists themselves. Assuming there can be multiple nodes of 
> the same priority in the 2d hierarchy. Possibly I don't understand the 
> layout.

A request is only on one list (queue, active, hold). But we may still
have more than one request at the same deadline, though that will likely
be limited to priority-inheritance and timeslice deferrals.

Since we would need a pointer to the request, we could only reclaim a
single pointer here, which is not enough to warrant reducing the overall
node size. And while there is at least one user of request->sched.link,
the list maintenance will still be incurred. Using request->sched.link
remains a convenient interface.

> 
> > -     struct rb_node node;
> >       int priority;
> > +
> > +     int level;
> > +     struct i915_priolist *next[I915_PRIOLIST_HEIGHT];
> 
> Does every node need maximum height or you could allocated depending on 
> current height?

Every slab allocation here is a power of 2, so there are only a few
different options that are worthwhile (on 64b the only other choice is
[4], unless you want to go larger to [28]). It did not feel like enough
benefit to justify the extra code.
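
To make the power-of-2 argument concrete, here is a small sketch that adds up the fields shown in the quoted diff (a list_head, two ints and next[]) and rounds to the next power of two. The helpers node_bytes() and round_up_pow2() are hypothetical, padding is assumed to be nil, and the real slab behaviour may differ.

```c
#include <assert.h>
#include <stddef.h>

/* Smallest power of two that is >= n (n > 0), as a stand-in for slab size. */
static size_t round_up_pow2(size_t n)
{
	size_t p = 1;

	assert(n > 0);
	while (p < n)
		p <<= 1;
	return p;
}

/*
 * Bytes used by a node with the fields shown in the quoted diff:
 * a list_head (two pointers), two 32-bit ints and next[height].
 * ptr_bytes is 8 on 64b and 4 on 32b.
 */
static size_t node_bytes(size_t ptr_bytes, size_t height)
{
	return 2 * ptr_bytes + 2 * 4 + height * ptr_bytes;
}
```

With these assumptions, next[4], next[12] and next[28] on 64b land in the 64B, 128B and 256B size classes respectively, and next[11] on 32b lands in 64B.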

> > -static void assert_priolists(struct i915_sched_engine * const se)
> > -{
> > -     struct rb_node *rb;
> > -     long last_prio;
> > -
> > -     if (!IS_ENABLED(CONFIG_DRM_I915_DEBUG_GEM))
> > -             return;
> > -
> > -     GEM_BUG_ON(rb_first_cached(&se->queue) !=
> > -                rb_first(&se->queue.rb_root));
> > -
> > -     last_prio = INT_MAX;
> > -     for (rb = rb_first_cached(&se->queue); rb; rb = rb_next(rb)) {
> > -             const struct i915_priolist *p = to_priolist(rb);
> > -
> > -             GEM_BUG_ON(p->priority > last_prio);
> > -             last_prio = p->priority;
> > -     }
> > +     root->prng = next_pseudo_random32(root->prng);
> > +     return  __ffs(root->prng) / 2;
> 
> Where is the relationship to I915_PRIOLIST_HEIGHT? Feels root->prng % 
> I915_PRIOLIST_HEIGHT would be more obvious here unless I am terribly 
> mistaken. Or at least put a comment saying why the hack.

HEIGHT is the maximum possible for our struct. Skiplists only want to
increment the height of the tree one step at a time. So we choose a level
with decreasing probability, and then limit that to the maximum height of
the current tree + 1, clamped to HEIGHT.

You might notice that unlike traditional skiplists, this uses a
probability of 0.25 for each additional level. A neat trick discovered by
Con Kolivas (I haven't found it mentioned elsewhere), as the cost of the
extra level (using P=.5) is the same as the extra chain length with
P=.25. So you can scale to a higher number of requests by packing more
requests into each level.

So that is split between randomly choosing a level and then working out
the height of the node.
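
The two steps can be sketched in userspace C as follows. This is an illustration only: level_from_bits() and clamp_level() are hypothetical names, __builtin_ctz() (GCC/Clang) stands in for the kernel's __ffs(), and the zero guard is an addition because both are undefined for 0.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Map 32 random bits to a level: two trailing zero bits are needed per
 * extra level, so each additional level is reached with probability 1/4.
 */
static int level_from_bits(uint32_t bits)
{
	if (bits == 0)
		return 15; /* treat "no bit set" as the highest level */
	return __builtin_ctz(bits) / 2;
}

/*
 * Limit a freshly drawn level to one above the current top level,
 * and never beyond the last slot of next[max_height].
 * Returns the level to use; *top is bumped when the list grows.
 */
static int clamp_level(int lvl, int *top, int max_height)
{
	assert(*top >= 0 && *top < max_height);
	if (lvl > *top) {
		if (*top < max_height - 1)
			lvl = ++*top;
		else
			lvl = max_height - 1;
	}
	return lvl;
}
```

Compared with prng % HEIGHT, which would pick every level equally often, the trailing-zero count makes tall nodes exponentially rarer, and the clamp keeps the list from growing by more than one level per insertion.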

> >   static struct list_head *
> >   lookup_priolist(struct intel_engine_cs *engine, int prio)
> >   {
> > +     struct i915_priolist *update[I915_PRIOLIST_HEIGHT];
> >       struct i915_sched_engine * const se = &engine->active;
> > -     struct i915_priolist *p;
> > -     struct rb_node **parent, *rb;
> > -     bool first = true;
> > -
> > -     lockdep_assert_held(&engine->active.lock);
> > -     assert_priolists(se);
> > +     struct i915_priolist_root *root = &se->queue;
> > +     struct i915_priolist *pl, *tmp;
> > +     int lvl;
> >   
> > +     lockdep_assert_held(&se->lock);
> >       if (unlikely(se->no_priolist))
> >               prio = I915_PRIORITY_NORMAL;
> >   
> > +     for_each_priolist(pl, root) { /* recycle any empty elements before us */
> > +             if (pl->priority >= prio || !list_empty(&pl->requests))
> > +                     break;
> > +
> > +             i915_priolist_advance(root, pl);
> > +     }
> > +
> >   find_priolist:
> > -     /* most positive priority is scheduled first, equal priorities fifo */
> > -     rb = NULL;
> > -     parent = &se->queue.rb_root.rb_node;
> > -     while (*parent) {
> > -             rb = *parent;
> > -             p = to_priolist(rb);
> > -             if (prio > p->priority) {
> > -                     parent = &rb->rb_left;
> > -             } else if (prio < p->priority) {
> > -                     parent = &rb->rb_right;
> > -                     first = false;
> > -             } else {
> > -                     return &p->requests;
> > -             }
> > +     pl = &root->sentinel;
> > +     lvl = pl->level;
> > +     while (lvl >= 0) {
> > +             while (tmp = pl->next[lvl], tmp->priority >= prio)
> > +                     pl = tmp;
> > +             if (pl->priority == prio)
> > +                     goto out;
> > +             update[lvl--] = pl;
> >       }
> >   
> >       if (prio == I915_PRIORITY_NORMAL) {
> > -             p = &se->default_priolist;
> > +             pl = &se->default_priolist;
> > +     } else if (!pl_empty(&root->sentinel.requests)) {
> > +             pl = pl_pop(&root->sentinel.requests);
> >       } else {
> > -             p = kmem_cache_alloc(global.slab_priorities, GFP_ATOMIC);
> > +             pl = kmem_cache_alloc(global.slab_priorities, GFP_ATOMIC);
> >               /* Convert an allocation failure to a priority bump */
> > -             if (unlikely(!p)) {
> > +             if (unlikely(!pl)) {
> >                       prio = I915_PRIORITY_NORMAL; /* recurses just once */
> >   
> > -                     /* To maintain ordering with all rendering, after an
> > +                     /*
> > +                      * To maintain ordering with all rendering, after an
> >                        * allocation failure we have to disable all scheduling.
> >                        * Requests will then be executed in fifo, and schedule
> >                        * will ensure that dependencies are emitted in fifo.
> > @@ -260,18 +304,103 @@ lookup_priolist(struct intel_engine_cs *engine, int prio)
> >               }
> >       }
> >   
> > -     p->priority = prio;
> > -     INIT_LIST_HEAD(&p->requests);
> > +     pl->priority = prio;
> > +     INIT_LIST_HEAD(&pl->requests);
> >   
> > -     rb_link_node(&p->node, rb, parent);
> > -     rb_insert_color_cached(&p->node, &se->queue, first);
> > +     lvl = random_level(root);
> > +     if (lvl > root->sentinel.level) {
> > +             if (root->sentinel.level < I915_PRIOLIST_HEIGHT - 1) {
> > +                     lvl = ++root->sentinel.level;
> 
> root->sentinel.level is maximum currently populated height? So if 
> random_level said insert at 4 but there are currently only 2 levels, 
> height will grow by one only?

Yes. The idea is to keep the number of next[] as small as possible for
the number of nodes in the tree. (Since the height of the tree is the
constant overhead in list traversal.)

> > +                     update[lvl] = &root->sentinel;
> > +             } else {
> > +                     lvl = I915_PRIOLIST_HEIGHT - 1;
> 
> But if maximum level already has been reached then this branch does not 
> set anything to update[],

at the next level.

> relying on the while loop earlier in the 
> function has populated it? How should I think of the update array?

The update[] is the array of nodes just before the position we need to
insert. So update[] needs only be the height of the tree at that time,
and if we decide to grow the tree, update[height] will be the root node,
as we will be the first in that level.
-Chris
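
For readers following the update[] discussion, a minimal standalone sketch of the insertion step is given below. It is not the i915 code: struct node, struct skiplist, HEIGHT 4 and the function names are hypothetical, the caller supplies the node's level instead of drawing it at random, and there is no recycling or allocation.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define HEIGHT 4 /* hypothetical small height for the sketch */

struct node {
	int prio;
	int level;                 /* highest next[] slot in use */
	struct node *next[HEIGHT];
};

struct skiplist {
	struct node sentinel;      /* prio INT_MIN terminates every walk */
};

static void skiplist_init(struct skiplist *sl)
{
	sl->sentinel.prio = INT_MIN;
	sl->sentinel.level = 0;
	for (int i = 0; i < HEIGHT; i++)
		sl->sentinel.next[i] = &sl->sentinel; /* each level is circular */
}

/*
 * Insert n (prio and level preset by the caller), keeping each level
 * sorted by descending prio. Returns the existing node instead if the
 * priority is already present.
 */
static struct node *skiplist_insert(struct skiplist *sl, struct node *n)
{
	struct node *update[HEIGHT];
	struct node *pl = &sl->sentinel;
	int lvl;

	assert(n->level >= 0 && n->level < HEIGHT);
	assert(n->prio > INT_MIN);

	/* Descend from the top, remembering the last node before n per level. */
	for (lvl = sl->sentinel.level; lvl >= 0; lvl--) {
		while (pl->next[lvl]->prio >= n->prio)
			pl = pl->next[lvl];
		if (pl->prio == n->prio)
			return pl;
		update[lvl] = pl;
	}

	/* Grow by at most one level; the sentinel precedes us on the new level. */
	if (n->level > sl->sentinel.level) {
		n->level = ++sl->sentinel.level;
		update[n->level] = &sl->sentinel;
	}

	/* Splice n in after update[lvl] on every level it occupies. */
	for (lvl = 0; lvl <= n->level; lvl++) {
		n->next[lvl] = update[lvl]->next[lvl];
		update[lvl]->next[lvl] = n;
	}
	return n;
}
```

In this sketch the search loop fills update[0..top], and the only slot it cannot fill is the one for a newly opened level, which is why that slot is pointed at the sentinel by hand.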
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: [Intel-gfx] [PATCH 20/41] drm/i915: Replace priolist rbtree with a skiplist
  2021-01-28 16:26     ` Chris Wilson
@ 2021-01-28 16:42       ` Tvrtko Ursulin
  2021-01-28 22:20         ` Chris Wilson
  2021-01-28 22:44         ` Chris Wilson
  2021-01-29  9:37       ` Tvrtko Ursulin
  1 sibling, 2 replies; 90+ messages in thread
From: Tvrtko Ursulin @ 2021-01-28 16:42 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx; +Cc: thomas.hellstrom


On 28/01/2021 16:26, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2021-01-28 15:56:19)
>> On 25/01/2021 14:01, Chris Wilson wrote:
>>> diff --git a/drivers/gpu/drm/i915/i915_priolist_types.h b/drivers/gpu/drm/i915/i915_priolist_types.h
>>> index bc2fa84f98a8..1200c3df6a4a 100644
>>> --- a/drivers/gpu/drm/i915/i915_priolist_types.h
>>> +++ b/drivers/gpu/drm/i915/i915_priolist_types.h
>>> @@ -38,10 +38,36 @@ enum {
>>>    #define I915_PRIORITY_UNPREEMPTABLE INT_MAX
>>>    #define I915_PRIORITY_BARRIER (I915_PRIORITY_UNPREEMPTABLE - 1)
>>>    
>>> +#ifdef CONFIG_64BIT
>>> +#define I915_PRIOLIST_HEIGHT 12
>>> +#else
>>> +#define I915_PRIOLIST_HEIGHT 11
>>> +#endif
>>
>> I did not get this. On one hand I could think pointers are larger on
>> 64-bit so go for fewer levels, if size was a concern. But on the other
>> hand 32-bit is less important these days, definitely much less as a
>> performance platform. So going for less memory use => worse performance
>> on a less important platform, which typically could be more memory
>> constrained? Not sure I see it as that important either way to be
>> distinctive but a comment would satisfy me.
> 
> Just aligned to the cacheline. The struct is 128B on 64b and 64B on 32b.
> On 64B, we will scale to around 16 million requests in flight and 4
> million on 32b. Which should be enough.
> 
> If we shrunk 64b to a 64B node, we would only scale to 256 requests
> which limit we definitely will exceed.

Ok thanks, pouring it into a comment is implied.

> 
>>>    struct i915_priolist {
>>>        struct list_head requests;
>>
>> What would be on this list? Request can only be on one at a time, so I
>> was thinking these nodes would have pointers to list of that priority,
>> rather than lists themselves. Assuming there can be multiple nodes of
>> the same priority in the 2d hierarcy. Possibly I don't understand the
>> layout.
> 
> A request is only on one list (queue, active, hold). But we may still
> have more than one request at the same deadline, though that will likely
> be limited to priority-inheritance and timeslice deferrals.
> 
> Since we would need pointer to the request, we could only reclaim a
> single pointer here, which is not enough to warrant reducing the overall
> node size. And while there is at least one user of request->sched.link,
> the list maintenance will still be incurred. Using request->sched.link
> remains a convenient interface.

Lost you.

Is the data structure like this (I will limit it to priorities for 
simplicity):

    Level1:	[-1]------------->[1]
    Level0: 	[-1]---->[0]----->[1]
[SENTINEL]

Each of the boxes is struct i915_priolist?

Sentinel contains pointers to first i915_priolist for each level. Or 
maybe it could contain just a single pointer to highest level (most 
sparse) list.

And then each box is i915_priolist, single linked to next, in order.

But it should also have a single pointer for down, or up (or both)? I 
don't understand why you have up to "max levels" pointers in each.

And each box should then contain a pointer to a list of requests. They 
cannot each have their own list since there are duplicates.

But obviously I am understanding something way wrong.

> 
>>
>>> -     struct rb_node node;
>>>        int priority;
>>> +
>>> +     int level;
>>> +     struct i915_priolist *next[I915_PRIOLIST_HEIGHT];
>>
>> Does every node need maximum height or you could allocated depending on
>> current height?
> 
> Every slab allocation here is a power of 2, so there are only a few
> different options that are worthwhile (on 64b the only other choice is
> [4], unless you want to go larger to [28]). It did not feel like enough
> benefit to justify the extra code.
> 
>>> -static void assert_priolists(struct i915_sched_engine * const se)
>>> -{
>>> -     struct rb_node *rb;
>>> -     long last_prio;
>>> -
>>> -     if (!IS_ENABLED(CONFIG_DRM_I915_DEBUG_GEM))
>>> -             return;
>>> -
>>> -     GEM_BUG_ON(rb_first_cached(&se->queue) !=
>>> -                rb_first(&se->queue.rb_root));
>>> -
>>> -     last_prio = INT_MAX;
>>> -     for (rb = rb_first_cached(&se->queue); rb; rb = rb_next(rb)) {
>>> -             const struct i915_priolist *p = to_priolist(rb);
>>> -
>>> -             GEM_BUG_ON(p->priority > last_prio);
>>> -             last_prio = p->priority;
>>> -     }
>>> +     root->prng = next_pseudo_random32(root->prng);
>>> +     return  __ffs(root->prng) / 2;
>>
>> Where is the relationship to I915_PRIOLIST_HEIGHT? Feels root->prng %
>> I915_PRIOLIST_HEIGHT would be more obvious here unless I am terribly
>> mistaken. Or at least put a comment saying why the hack.
> 
> HEIGHT is the maximum possible for our struct. skiplists only want to
> increment the height of the tree one step at a time. So we choose a level
> with decreasing probability, and then limit that to the maximum height of
> the current tree + 1, clamped to HEIGHT.
> 
> You might notice that unlike traditional skiplists, this uses a

That's optimistic, that I would notice that. I'll stick to the basics 
for now. :)

Regards,

Tvrtko

> probability of 0.25 for each additional level. A neat trick discovered by
> Con Kolivas (I haven't found it mentioned elsewhere) as the cost of the
> extra level (using P=.5) is the same as the extra chain length with
> P=.25. So you can scale to higher number of requests by packing more
> requests into each level.
> 
> So that is split between randomly choosing a level and then working out
> the height of the node.
> 
>>>    static struct list_head *
>>>    lookup_priolist(struct intel_engine_cs *engine, int prio)
>>>    {
>>> +     struct i915_priolist *update[I915_PRIOLIST_HEIGHT];
>>>        struct i915_sched_engine * const se = &engine->active;
>>> -     struct i915_priolist *p;
>>> -     struct rb_node **parent, *rb;
>>> -     bool first = true;
>>> -
>>> -     lockdep_assert_held(&engine->active.lock);
>>> -     assert_priolists(se);
>>> +     struct i915_priolist_root *root = &se->queue;
>>> +     struct i915_priolist *pl, *tmp;
>>> +     int lvl;
>>>    
>>> +     lockdep_assert_held(&se->lock);
>>>        if (unlikely(se->no_priolist))
>>>                prio = I915_PRIORITY_NORMAL;
>>>    
>>> +     for_each_priolist(pl, root) { /* recycle any empty elements before us */
>>> +             if (pl->priority >= prio || !list_empty(&pl->requests))
>>> +                     break;
>>> +
>>> +             i915_priolist_advance(root, pl);
>>> +     }
>>> +
>>>    find_priolist:
>>> -     /* most positive priority is scheduled first, equal priorities fifo */
>>> -     rb = NULL;
>>> -     parent = &se->queue.rb_root.rb_node;
>>> -     while (*parent) {
>>> -             rb = *parent;
>>> -             p = to_priolist(rb);
>>> -             if (prio > p->priority) {
>>> -                     parent = &rb->rb_left;
>>> -             } else if (prio < p->priority) {
>>> -                     parent = &rb->rb_right;
>>> -                     first = false;
>>> -             } else {
>>> -                     return &p->requests;
>>> -             }
>>> +     pl = &root->sentinel;
>>> +     lvl = pl->level;
>>> +     while (lvl >= 0) {
>>> +             while (tmp = pl->next[lvl], tmp->priority >= prio)
>>> +                     pl = tmp;
>>> +             if (pl->priority == prio)
>>> +                     goto out;
>>> +             update[lvl--] = pl;
>>>        }
>>>    
>>>        if (prio == I915_PRIORITY_NORMAL) {
>>> -             p = &se->default_priolist;
>>> +             pl = &se->default_priolist;
>>> +     } else if (!pl_empty(&root->sentinel.requests)) {
>>> +             pl = pl_pop(&root->sentinel.requests);
>>>        } else {
>>> -             p = kmem_cache_alloc(global.slab_priorities, GFP_ATOMIC);
>>> +             pl = kmem_cache_alloc(global.slab_priorities, GFP_ATOMIC);
>>>                /* Convert an allocation failure to a priority bump */
>>> -             if (unlikely(!p)) {
>>> +             if (unlikely(!pl)) {
>>>                        prio = I915_PRIORITY_NORMAL; /* recurses just once */
>>>    
>>> -                     /* To maintain ordering with all rendering, after an
>>> +                     /*
>>> +                      * To maintain ordering with all rendering, after an
>>>                         * allocation failure we have to disable all scheduling.
>>>                         * Requests will then be executed in fifo, and schedule
>>>                         * will ensure that dependencies are emitted in fifo.
>>> @@ -260,18 +304,103 @@ lookup_priolist(struct intel_engine_cs *engine, int prio)
>>>                }
>>>        }
>>>    
>>> -     p->priority = prio;
>>> -     INIT_LIST_HEAD(&p->requests);
>>> +     pl->priority = prio;
>>> +     INIT_LIST_HEAD(&pl->requests);
>>>    
>>> -     rb_link_node(&p->node, rb, parent);
>>> -     rb_insert_color_cached(&p->node, &se->queue, first);
>>> +     lvl = random_level(root);
>>> +     if (lvl > root->sentinel.level) {
>>> +             if (root->sentinel.level < I915_PRIOLIST_HEIGHT - 1) {
>>> +                     lvl = ++root->sentinel.level;
>>
>> root->sentinel.level is maximum currently populated height? So if
>> random_level said insert at 4 but there are currently only 2 levels,
>> height will grow by one only?
> 
> Yes. The idea is keep the number of next[] as small as possible for the
> number of nodes in the tree. (Since the height of the tree is the
> constant overhead in list traversal.)
> 
>>> +                     update[lvl] = &root->sentinel;
>>> +             } else {
>>> +                     lvl = I915_PRIOLIST_HEIGHT - 1;
>>
>> But if maximum level already has been reached then this branch does not
>> set anything to update[],
> 
> at the next level.
> 
>> relying on the while loop earlier in the
>> function has populated it? How should I think of the update array?
> 
> The update[] is the array of nodes just before the position we need to
> insert. So update[] needs only be the height of the tree at that time,
> and if we decide to grow the tree, update[height] will be the root node,
> as we will be the first in that level.
> -Chris
> 

^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: [Intel-gfx] [PATCH 20/41] drm/i915: Replace priolist rbtree with a skiplist
  2021-01-28 16:42       ` Tvrtko Ursulin
@ 2021-01-28 22:20         ` Chris Wilson
  2021-01-28 22:44         ` Chris Wilson
  1 sibling, 0 replies; 90+ messages in thread
From: Chris Wilson @ 2021-01-28 22:20 UTC (permalink / raw)
  To: Tvrtko Ursulin, intel-gfx; +Cc: thomas.hellstrom

Quoting Tvrtko Ursulin (2021-01-28 16:42:44)
> 
> On 28/01/2021 16:26, Chris Wilson wrote:
> > Quoting Tvrtko Ursulin (2021-01-28 15:56:19)
> >> On 25/01/2021 14:01, Chris Wilson wrote:
> >>>    struct i915_priolist {
> >>>        struct list_head requests;
> >>
> >> What would be on this list? Request can only be on one at a time, so I
> >> was thinking these nodes would have pointers to list of that priority,
> >> rather than lists themselves. Assuming there can be multiple nodes of
> >> the same priority in the 2d hierarcy. Possibly I don't understand the
> >> layout.
> > 
> > A request is only on one list (queue, active, hold). But we may still
> > have more than one request at the same deadline, though that will likely
> > be limited to priority-inheritance and timeslice deferrals.
> > 
> > Since we would need pointer to the request, we could only reclaim a
> > single pointer here, which is not enough to warrant reducing the overall
> > node size. And while there is at least one user of request->sched.link,
> > the list maintenance will still be incurred. Using request->sched.link
> > remains a convenient interface.
> 
> Lost you.
> 
> Is the data structure like this and I will limit to priorities for 
> simplicity:
> 
>     Level1:     [-1]------------->[1]
>     Level0:     [-1]---->[0]----->[1]
> [SENTINEL]
> 
> Each of the boxes is struct i915_priolist?

Although each level is circular.

1: SENTINEL -> [-1] --------> [1] -> SENTINEL
0: SENTINEL -> [-1] -> [0] -> [1] -> SENTINEL

Ah. I think I see the cause of confusion here. Each column, not each
box, is an i915_priolist.

So the skiplist is really a set of [HEIGHT] singly linked lists, with
each list containing a sorted subset of the whole. And each descending
level includes every member from the level above, until we reach a
linked list of all i915_priolist in [0].

[skip, hopefully I caught the central point]

SENTINEL[2] is a list of all i915_priolist of level >= 2
SENTINEL[1] is a list of all i915_priolist of level >= 1
SENTINEL[0] is a list of all i915_priolist.

As we randomly assign i915_priolist.level, SENTINEL[1] should have a
fixed fraction of the elements of SENTINEL[0] (half with the textbook
P=.5, a quarter with the P=.25 used by random_level()), and SENTINEL[2]
should have the same fraction again of SENTINEL[1] (hence its ability to
do a binary/lgN search for a key, each level is a subdivision of the last).
-Chris
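
The "each column is a node" picture can be checked with a hand-built copy of the three-element diagram above. This is only a sketch: struct column, level_count() and build_example() are hypothetical names, and the nodes are linked in the order drawn in the diagram.

```c
#include <assert.h>
#include <stddef.h>

#define HEIGHT 2 /* just enough levels for the diagram */

struct column {
	int prio;
	int level;                    /* this column appears on lists 0..level */
	struct column *next[HEIGHT];
};

/* Count the members of the list at lvl by walking until we return to the sentinel. */
static size_t level_count(const struct column *sentinel, int lvl)
{
	size_t n = 0;

	assert(lvl >= 0 && lvl < HEIGHT);
	for (const struct column *c = sentinel->next[lvl]; c != sentinel; c = c->next[lvl])
		n++;
	return n;
}

/* Link the three columns from the diagram: [-1] and [1] are two levels tall, [0] is one. */
static void build_example(struct column *s, struct column *a,
			  struct column *b, struct column *c)
{
	s->level = 1;
	a->prio = -1; a->level = 1;
	b->prio = 0;  b->level = 0;
	c->prio = 1;  c->level = 1;

	s->next[0] = a; a->next[0] = b; b->next[0] = c; c->next[0] = s;
	s->next[1] = a; a->next[1] = c; c->next[1] = s;
	b->next[1] = NULL; /* unused: b is not on level 1 */
}
```

There are three structs plus the sentinel, not five boxes: the node for [-1] is reached through both next[0] and next[1], and level 1 simply skips over [0].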

^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: [Intel-gfx] [PATCH 20/41] drm/i915: Replace priolist rbtree with a skiplist
  2021-01-28 16:42       ` Tvrtko Ursulin
  2021-01-28 22:20         ` Chris Wilson
@ 2021-01-28 22:44         ` Chris Wilson
  2021-01-29  9:24           ` Tvrtko Ursulin
  1 sibling, 1 reply; 90+ messages in thread
From: Chris Wilson @ 2021-01-28 22:44 UTC (permalink / raw)
  To: Tvrtko Ursulin, intel-gfx; +Cc: thomas.hellstrom

Quoting Tvrtko Ursulin (2021-01-28 16:42:44)
> 
> On 28/01/2021 16:26, Chris Wilson wrote:
> > Quoting Tvrtko Ursulin (2021-01-28 15:56:19)
> >> On 25/01/2021 14:01, Chris Wilson wrote:
> >>> diff --git a/drivers/gpu/drm/i915/i915_priolist_types.h b/drivers/gpu/drm/i915/i915_priolist_types.h
> >>> index bc2fa84f98a8..1200c3df6a4a 100644
> >>> --- a/drivers/gpu/drm/i915/i915_priolist_types.h
> >>> +++ b/drivers/gpu/drm/i915/i915_priolist_types.h
> >>> @@ -38,10 +38,36 @@ enum {
> >>>    #define I915_PRIORITY_UNPREEMPTABLE INT_MAX
> >>>    #define I915_PRIORITY_BARRIER (I915_PRIORITY_UNPREEMPTABLE - 1)
> >>>    
> >>> +#ifdef CONFIG_64BIT
> >>> +#define I915_PRIOLIST_HEIGHT 12
> >>> +#else
> >>> +#define I915_PRIOLIST_HEIGHT 11
> >>> +#endif
> >>
> >> I did not get this. On one hand I could think pointers are larger on
> >> 64-bit so go for fewer levels, if size was a concern. But on the other
> >> hand 32-bit is less important these days, definitely much less as a
> >> performance platform. So going for less memory use => worse performance
> >> on a less important platform, which typically could be more memory
> >> constrained? Not sure I see it as that important either way to be
> >> distinctive but a comment would satisfy me.
> > 
> > Just aligned to the cacheline. The struct is 128B on 64b and 64B on 32b.
> > On 64B, we will scale to around 16 million requests in flight and 4
> > million on 32b. Which should be enough.
> > 
> > If we shrunk 64b to a 64B node, we would only scale to 256 requests
> > which limit we definitely will exceed.
> 
> Ok thanks, pouring it into a comment is implied.
> 
> > 
> >>>    struct i915_priolist {
> >>>        struct list_head requests;
> >>
> >> What would be on this list? Request can only be on one at a time, so I
> >> was thinking these nodes would have pointers to list of that priority,
> >> rather than lists themselves. Assuming there can be multiple nodes of
> >> the same priority in the 2d hierarcy. Possibly I don't understand the
> >> layout.
> > 
> > A request is only on one list (queue, active, hold). But we may still
> > have more than one request at the same deadline, though that will likely
> > be limited to priority-inheritance and timeslice deferrals.
> > 
> > Since we would need pointer to the request, we could only reclaim a
> > single pointer here, which is not enough to warrant reducing the overall
> > node size. And while there is at least one user of request->sched.link,
> > the list maintenance will still be incurred. Using request->sched.link
> > remains a convenient interface.
> 
> Lost you.

/*
 * i915_priolist forms a skiplist. The skiplist is built in layers,
 * starting at the base: [0] is a singly linked list of all i915_priolist.
 * Each higher layer contains a fraction of the i915_priolist from the
 * previous layer:
 *
 * S[0] 0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF S
 * E[1] >1>3>5>7>9>B>D>F>1>3>5>7>9>B>D>F>1>3>5>7>9>B>D>F>1>3>5>7>9>B>D>F E
 * N[2] -->3-->7-->B-->F-->3-->7-->B-->F-->3-->7-->B-->F-->3-->7-->B-->F N
 * T[3] ------>7------>F------>7------>F------>7------>F------>7------>F T
 * I[4] -------------->F-------------->F-------------->F-------------->F I
 * N[5] ------------------------------>F------------------------------>F N
 * E[6] -------------------------------------------------------------->F E
 * L[7] ---------------------------------------------------------------> L
 *
 * To iterate through all active i915_priolist, we only need to follow
 * the chain in i915_priolist.next[0] (see for_each_priolist).
 *
 * To quickly find a specific key (or insert point), we can perform a binary
 * search by starting at the highest level and following the linked list
 * at that level until we either find the node, or have gone past the key.
 * Then we descend a level, and start walking the list again starting from
 * the current position, until eventually we find our key, or we run out of
 * levels.
 */
-Chris
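
The top-down search described in the comment can be tried on an idealised 64-node list shaped like the diagram, where node i sits on levels 0..ctz(i + 1). This is a sketch under assumptions: the names are hypothetical, keys ascend as in the diagram (the driver code quoted earlier walks descending priorities), and __builtin_ctz() is a GCC/Clang builtin.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define N 64
#define HEIGHT 8

struct node {
	int key;
	struct node *next[HEIGHT];
};

static struct node sentinel, nodes[N];

/* Build the idealised layout: node i is linked into levels 0..ctz(i + 1). */
static void build(void)
{
	struct node *tail[HEIGHT];

	sentinel.key = INT_MAX; /* larger than any key, so every walk stops here */
	for (int l = 0; l < HEIGHT; l++)
		tail[l] = &sentinel;
	for (int i = 0; i < N; i++) {
		nodes[i].key = i;
		for (int l = 0; l <= __builtin_ctz(i + 1) && l < HEIGHT; l++) {
			tail[l]->next[l] = &nodes[i];
			tail[l] = &nodes[i];
		}
	}
	for (int l = 0; l < HEIGHT; l++)
		tail[l]->next[l] = &sentinel; /* close each level into a ring */
}

/* Top-down search; *hops counts how many links were followed. */
static struct node *find(int key, int *hops)
{
	struct node *pos = &sentinel;

	assert(key < INT_MAX);
	*hops = 0;
	for (int l = HEIGHT - 1; l >= 0; l--) {
		while (pos->next[l]->key <= key) {
			pos = pos->next[l];
			++*hops;
		}
		if (pos->key == key)
			return pos;
	}
	return NULL;
}
```

In this layout, looking up key 42 follows four links (31, 39, 41, 42) rather than the 43 a walk along level 0 alone would need. A randomly built list only approximates this shape.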

^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: [Intel-gfx] [PATCH 20/41] drm/i915: Replace priolist rbtree with a skiplist
  2021-01-25 14:01 ` [Intel-gfx] [PATCH 20/41] drm/i915: Replace priolist rbtree with a skiplist Chris Wilson
  2021-01-27 15:10   ` Tvrtko Ursulin
  2021-01-28 15:56   ` Tvrtko Ursulin
@ 2021-01-28 22:56   ` Matthew Brost
  2021-01-29 10:30     ` Chris Wilson
  2021-01-29 10:22   ` Tvrtko Ursulin
  3 siblings, 1 reply; 90+ messages in thread
From: Matthew Brost @ 2021-01-28 22:56 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx, thomas.hellstrom

On Mon, Jan 25, 2021 at 02:01:15PM +0000, Chris Wilson wrote:
> Replace the priolist rbtree with a skiplist. The crucial difference is
> that walking and removing the first element of a skiplist is O(1), but
> O(lgN) for an rbtree, as we need to rebalance on remove. This is a
> hindrance for submission latency as it occurs between picking a request
> from the priolist and submitting it to hardware, as well as effectively
> tripling the number of O(lgN) operations required under the irqoff lock.
> This is critical to reducing the latency jitter with multiple clients.
> 
> The downsides to skiplists are that lookup/insertion is only
> probabilistically O(lgN) and there is a significant memory penalty,
> as each skip node is larger than the rbtree equivalent. Furthermore, we
> don't use dynamic arrays for the skiplist, so the allocation is fixed,
> and imposes an upper bound on the scalability wrt the number of
> inflight requests.
> 

This is a fun data structure but IMO it might be overkill to maintain this
code in the i915. The UMDs have effectively agreed to use only 3 levels,
is O(lgN) where N == 3 really a big deal? With GuC submission we will
statically map all user levels into 3 buckets. If we are doing that, do
we even need a complex data structure? i.e. Could we just use an
array of linked lists?
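
For comparison, the array-of-lists alternative could look roughly like the sketch below. It is illustrative only: the types, the function names and the sign-based mapping from priority to bucket are all hypothetical, and no locking or i915 structures are involved.

```c
#include <assert.h>
#include <stddef.h>

enum { BUCKET_LOW, BUCKET_NORMAL, BUCKET_HIGH, NUM_BUCKETS };

struct request {
	int prio;
	struct request *next;
};

struct bucket_queue {
	struct request *head[NUM_BUCKETS];
	struct request *tail[NUM_BUCKETS];
};

/* Hypothetical mapping: negative -> low, zero -> normal, positive -> high. */
static int prio_to_bucket(int prio)
{
	return prio < 0 ? BUCKET_LOW : prio > 0 ? BUCKET_HIGH : BUCKET_NORMAL;
}

/* Append to the tail of the request's bucket, so each bucket stays fifo. */
static void bq_push(struct bucket_queue *q, struct request *rq)
{
	int b;

	assert(q && rq);
	b = prio_to_bucket(rq->prio);
	rq->next = NULL;
	if (q->tail[b])
		q->tail[b]->next = rq;
	else
		q->head[b] = rq;
	q->tail[b] = rq;
}

/* Remove and return the oldest request from the highest non-empty bucket. */
static struct request *bq_pop(struct bucket_queue *q)
{
	for (int b = NUM_BUCKETS - 1; b >= 0; b--) {
		struct request *rq = q->head[b];

		if (!rq)
			continue;
		q->head[b] = rq->next;
		if (!q->head[b])
			q->tail[b] = NULL;
		return rq;
	}
	return NULL;
}
```

Push and pop are O(1) here, at the cost of ordering only between buckets: any finer-grained key within a bucket is reduced to fifo order.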

Also BTW, it seems like people are having a hard time understanding what a
skip list is; we might have just started with the link below, which explains
it quite nicely:
https://en.wikipedia.org/wiki/Skip_list

Matt

> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> ---
>  .../drm/i915/gt/intel_execlists_submission.c  |  63 +++--
>  .../gpu/drm/i915/gt/uc/intel_guc_submission.c |  30 +--
>  drivers/gpu/drm/i915/i915_priolist_types.h    |  28 +-
>  drivers/gpu/drm/i915/i915_scheduler.c         | 244 ++++++++++++++----
>  drivers/gpu/drm/i915/i915_scheduler.h         |  11 +-
>  drivers/gpu/drm/i915/i915_scheduler_types.h   |   2 +-
>  .../drm/i915/selftests/i915_mock_selftests.h  |   1 +
>  .../gpu/drm/i915/selftests/i915_scheduler.c   |  53 +++-
>  8 files changed, 316 insertions(+), 116 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
> index 1103c8a00af1..129144dd86b0 100644
> --- a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
> +++ b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
> @@ -244,11 +244,6 @@ static void ring_set_paused(const struct intel_engine_cs *engine, int state)
>  		wmb();
>  }
>  
> -static struct i915_priolist *to_priolist(struct rb_node *rb)
> -{
> -	return rb_entry(rb, struct i915_priolist, node);
> -}
> -
>  static int rq_prio(const struct i915_request *rq)
>  {
>  	return READ_ONCE(rq->sched.attr.priority);
> @@ -272,15 +267,31 @@ static int effective_prio(const struct i915_request *rq)
>  	return prio;
>  }
>  
> -static int queue_prio(const struct i915_sched_engine *se)
> +static struct i915_request *first_request(struct i915_sched_engine *se)
>  {
> -	struct rb_node *rb;
> +	struct i915_priolist *pl;
>  
> -	rb = rb_first_cached(&se->queue);
> -	if (!rb)
> +	for_each_priolist(pl, &se->queue) {
> +		if (likely(!list_empty(&pl->requests)))
> +			return list_first_entry(&pl->requests,
> +						struct i915_request,
> +						sched.link);
> +
> +		i915_priolist_advance(&se->queue, pl);
> +	}
> +
> +	return NULL;
> +}
> +
> +static int queue_prio(struct i915_sched_engine *se)
> +{
> +	struct i915_request *rq;
> +
> +	rq = first_request(se);
> +	if (!rq)
>  		return INT_MIN;
>  
> -	return to_priolist(rb)->priority;
> +	return rq_prio(rq);
>  }
>  
>  static int virtual_prio(const struct intel_engine_execlists *el)
> @@ -290,7 +301,7 @@ static int virtual_prio(const struct intel_engine_execlists *el)
>  	return rb ? rb_entry(rb, struct ve_node, rb)->prio : INT_MIN;
>  }
>  
> -static bool need_preempt(const struct intel_engine_cs *engine,
> +static bool need_preempt(struct intel_engine_cs *engine,
>  			 const struct i915_request *rq)
>  {
>  	int last_prio;
> @@ -1136,6 +1147,7 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
>  	struct i915_request ** const last_port = port + execlists->port_mask;
>  	struct i915_request *last, * const *active;
>  	struct virtual_engine *ve;
> +	struct i915_priolist *pl;
>  	struct rb_node *rb;
>  	bool submit = false;
>  
> @@ -1346,11 +1358,10 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
>  			break;
>  	}
>  
> -	while ((rb = rb_first_cached(&engine->active.queue))) {
> -		struct i915_priolist *p = to_priolist(rb);
> +	for_each_priolist(pl, &engine->active.queue) {
>  		struct i915_request *rq, *rn;
>  
> -		priolist_for_each_request_consume(rq, rn, p) {
> +		priolist_for_each_request_safe(rq, rn, pl) {
>  			bool merge = true;
>  
>  			/*
> @@ -1425,8 +1436,7 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
>  			}
>  		}
>  
> -		rb_erase_cached(&p->node, &engine->active.queue);
> -		i915_priolist_free(p);
> +		i915_priolist_advance(&engine->active.queue, pl);
>  	}
>  done:
>  	*port++ = i915_request_get(last);
> @@ -2631,6 +2641,7 @@ static void execlists_reset_cancel(struct intel_engine_cs *engine)
>  {
>  	struct intel_engine_execlists * const execlists = &engine->execlists;
>  	struct i915_request *rq, *rn;
> +	struct i915_priolist *pl;
>  	struct rb_node *rb;
>  	unsigned long flags;
>  
> @@ -2661,16 +2672,12 @@ static void execlists_reset_cancel(struct intel_engine_cs *engine)
>  	intel_engine_signal_breadcrumbs(engine);
>  
>  	/* Flush the queued requests to the timeline list (for retiring). */
> -	while ((rb = rb_first_cached(&engine->active.queue))) {
> -		struct i915_priolist *p = to_priolist(rb);
> -
> -		priolist_for_each_request_consume(rq, rn, p) {
> +	for_each_priolist(pl, &engine->active.queue) {
> +		priolist_for_each_request_safe(rq, rn, pl) {
>  			i915_request_mark_eio(rq);
>  			__i915_request_submit(rq);
>  		}
> -
> -		rb_erase_cached(&p->node, &engine->active.queue);
> -		i915_priolist_free(p);
> +		i915_priolist_advance(&engine->active.queue, pl);
>  	}
>  	GEM_BUG_ON(!i915_sched_is_idle(&engine->active));
>  
> @@ -2703,7 +2710,6 @@ static void execlists_reset_cancel(struct intel_engine_cs *engine)
>  	/* Remaining _unready_ requests will be nop'ed when submitted */
>  
>  	execlists->queue_priority_hint = INT_MIN;
> -	engine->active.queue = RB_ROOT_CACHED;
>  
>  	GEM_BUG_ON(__tasklet_is_enabled(&engine->active.tasklet));
>  	engine->active.tasklet.func = nop_submission_tasklet;
> @@ -3089,6 +3095,8 @@ static void virtual_context_exit(struct intel_context *ce)
>  
>  	for (n = 0; n < ve->num_siblings; n++)
>  		intel_engine_pm_put(ve->siblings[n]);
> +
> +	i915_sched_park_engine(&ve->base.active);
>  }
>  
>  static const struct intel_context_ops virtual_context_ops = {
> @@ -3501,6 +3509,7 @@ void intel_execlists_show_requests(struct intel_engine_cs *engine,
>  {
>  	const struct intel_engine_execlists *execlists = &engine->execlists;
>  	struct i915_request *rq, *last;
> +	struct i915_priolist *pl;
>  	unsigned long flags;
>  	unsigned int count;
>  	struct rb_node *rb;
> @@ -3530,10 +3539,8 @@ void intel_execlists_show_requests(struct intel_engine_cs *engine,
>  
>  	last = NULL;
>  	count = 0;
> -	for (rb = rb_first_cached(&engine->active.queue); rb; rb = rb_next(rb)) {
> -		struct i915_priolist *p = rb_entry(rb, typeof(*p), node);
> -
> -		priolist_for_each_request(rq, p) {
> +	for_each_priolist(pl, &engine->active.queue) {
> +		priolist_for_each_request(rq, pl) {
>  			if (count++ < max - 1)
>  				show_request(m, rq, "\t\t", 0);
>  			else
> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> index 2d7339ef3b4c..8d0c6cd277b3 100644
> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> @@ -59,11 +59,6 @@
>  
>  #define GUC_REQUEST_SIZE 64 /* bytes */
>  
> -static inline struct i915_priolist *to_priolist(struct rb_node *rb)
> -{
> -	return rb_entry(rb, struct i915_priolist, node);
> -}
> -
>  static struct guc_stage_desc *__get_stage_desc(struct intel_guc *guc, u32 id)
>  {
>  	struct guc_stage_desc *base = guc->stage_desc_pool_vaddr;
> @@ -185,8 +180,8 @@ static void __guc_dequeue(struct intel_engine_cs *engine)
>  	struct i915_request ** const last_port = first + execlists->port_mask;
>  	struct i915_request *last = first[0];
>  	struct i915_request **port;
> +	struct i915_priolist *pl;
>  	bool submit = false;
> -	struct rb_node *rb;
>  
>  	lockdep_assert_held(&engine->active.lock);
>  
> @@ -203,11 +198,10 @@ static void __guc_dequeue(struct intel_engine_cs *engine)
>  	 * event.
>  	 */
>  	port = first;
> -	while ((rb = rb_first_cached(&engine->active.queue))) {
> -		struct i915_priolist *p = to_priolist(rb);
> +	for_each_priolist(pl, &engine->active.queue) {
>  		struct i915_request *rq, *rn;
>  
> -		priolist_for_each_request_consume(rq, rn, p) {
> +		priolist_for_each_request_safe(rq, rn, pl) {
>  			if (last && rq->context != last->context) {
>  				if (port == last_port)
>  					goto done;
> @@ -223,12 +217,11 @@ static void __guc_dequeue(struct intel_engine_cs *engine)
>  			last = rq;
>  		}
>  
> -		rb_erase_cached(&p->node, &engine->active.queue);
> -		i915_priolist_free(p);
> +		i915_priolist_advance(&engine->active.queue, pl);
>  	}
>  done:
>  	execlists->queue_priority_hint =
> -		rb ? to_priolist(rb)->priority : INT_MIN;
> +		pl != &engine->active.queue.sentinel ? pl->priority : INT_MIN;
>  	if (submit) {
>  		*port = schedule_in(last, port - execlists->inflight);
>  		*++port = NULL;
> @@ -327,7 +320,7 @@ static void guc_reset_cancel(struct intel_engine_cs *engine)
>  {
>  	struct intel_engine_execlists * const execlists = &engine->execlists;
>  	struct i915_request *rq, *rn;
> -	struct rb_node *rb;
> +	struct i915_priolist *p;
>  	unsigned long flags;
>  
>  	ENGINE_TRACE(engine, "\n");
> @@ -355,25 +348,20 @@ static void guc_reset_cancel(struct intel_engine_cs *engine)
>  	}
>  
>  	/* Flush the queued requests to the timeline list (for retiring). */
> -	while ((rb = rb_first_cached(&engine->active.queue))) {
> -		struct i915_priolist *p = to_priolist(rb);
> -
> -		priolist_for_each_request_consume(rq, rn, p) {
> +	for_each_priolist(p, &engine->active.queue) {
> +		priolist_for_each_request_safe(rq, rn, p) {
>  			list_del_init(&rq->sched.link);
>  			__i915_request_submit(rq);
>  			dma_fence_set_error(&rq->fence, -EIO);
>  			i915_request_mark_complete(rq);
>  		}
> -
> -		rb_erase_cached(&p->node, &engine->active.queue);
> -		i915_priolist_free(p);
> +		i915_priolist_advance(&engine->active.queue, p);
>  	}
>  	GEM_BUG_ON(!i915_sched_is_idle(&engine->active));
>  
>  	/* Remaining _unready_ requests will be nop'ed when submitted */
>  
>  	execlists->queue_priority_hint = INT_MIN;
> -	engine->active.queue = RB_ROOT_CACHED;
>  
>  	spin_unlock_irqrestore(&engine->active.lock, flags);
>  }
> diff --git a/drivers/gpu/drm/i915/i915_priolist_types.h b/drivers/gpu/drm/i915/i915_priolist_types.h
> index bc2fa84f98a8..1200c3df6a4a 100644
> --- a/drivers/gpu/drm/i915/i915_priolist_types.h
> +++ b/drivers/gpu/drm/i915/i915_priolist_types.h
> @@ -38,10 +38,36 @@ enum {
>  #define I915_PRIORITY_UNPREEMPTABLE INT_MAX
>  #define I915_PRIORITY_BARRIER (I915_PRIORITY_UNPREEMPTABLE - 1)
>  
> +#ifdef CONFIG_64BIT
> +#define I915_PRIOLIST_HEIGHT 12
> +#else
> +#define I915_PRIOLIST_HEIGHT 11
> +#endif
> +
>  struct i915_priolist {
>  	struct list_head requests;
> -	struct rb_node node;
>  	int priority;
> +
> +	int level;
> +	struct i915_priolist *next[I915_PRIOLIST_HEIGHT];
>  };
>  
> +struct i915_priolist_root {
> +	struct i915_priolist sentinel;
> +	u32 prng;
> +};
> +
> +#define i915_priolist_is_empty(root) ((root)->sentinel.level < 0)
> +
> +#define for_each_priolist(p, root) \
> +	for ((p) = (root)->sentinel.next[0]; \
> +	     (p) != &(root)->sentinel; \
> +	     (p) = (p)->next[0])
> +
> +#define priolist_for_each_request(it, plist) \
> +	list_for_each_entry(it, &(plist)->requests, sched.link)
> +
> +#define priolist_for_each_request_safe(it, n, plist) \
> +	list_for_each_entry_safe(it, n, &(plist)->requests, sched.link)
> +
>  #endif /* _I915_PRIOLIST_TYPES_H_ */
> diff --git a/drivers/gpu/drm/i915/i915_scheduler.c b/drivers/gpu/drm/i915/i915_scheduler.c
> index a3ee06cb66d7..74000d3eebb1 100644
> --- a/drivers/gpu/drm/i915/i915_scheduler.c
> +++ b/drivers/gpu/drm/i915/i915_scheduler.c
> @@ -4,7 +4,9 @@
>   * Copyright © 2018 Intel Corporation
>   */
>  
> +#include <linux/bitops.h>
>  #include <linux/mutex.h>
> +#include <linux/prandom.h>
>  
>  #include "gt/intel_ring.h"
>  #include "gt/intel_lrc_reg.h"
> @@ -91,15 +93,24 @@ static void i915_sched_init_ipi(struct i915_sched_ipi *ipi)
>  	ipi->list = NULL;
>  }
>  
> +static void init_priolist(struct i915_priolist_root *const root)
> +{
> +	struct i915_priolist *pl = &root->sentinel;
> +
> +	memset_p((void **)pl->next, pl, ARRAY_SIZE(pl->next));
> +	pl->priority = INT_MIN;
> +	pl->level = -1;
> +}
> +
>  void i915_sched_init_engine(struct i915_sched_engine *se,
>  			    unsigned int subclass)
>  {
>  	spin_lock_init(&se->lock);
>  	lockdep_set_subclass(&se->lock, subclass);
>  
> +	init_priolist(&se->queue);
>  	INIT_LIST_HEAD(&se->requests);
>  	INIT_LIST_HEAD(&se->hold);
> -	se->queue = RB_ROOT_CACHED;
>  
>  	i915_sched_init_ipi(&se->ipi);
>  
> @@ -116,8 +127,57 @@ void i915_sched_init_engine(struct i915_sched_engine *se,
>  #endif
>  }
>  
> +__maybe_unused static bool priolist_idle(struct i915_priolist_root *root)
> +{
> +	struct i915_priolist *pl = &root->sentinel;
> +	int lvl;
> +
> +	for (lvl = 0; lvl < ARRAY_SIZE(pl->next); lvl++) {
> +		if (pl->next[lvl] != pl) {
> +			GEM_TRACE_ERR("root[%d] is not empty\n", lvl);
> +			return false;
> +		}
> +	}
> +
> +	if (pl->level != -1) {
> +		GEM_TRACE_ERR("root is not clear: %d\n", pl->level);
> +		return false;
> +	}
> +
> +	return true;
> +}
> +
> +static void pl_push(struct i915_priolist *pl, struct list_head *head)
> +{
> +	pl->requests.next = head->next;
> +	head->next = &pl->requests;
> +}
> +
> +static struct i915_priolist *pl_pop(struct list_head *head)
> +{
> +	struct i915_priolist *pl;
> +
> +	pl = container_of(head->next, typeof(*pl), requests);
> +	head->next = pl->requests.next;
> +
> +	return pl;
> +}
> +
> +static bool pl_empty(struct list_head *head)
> +{
> +	return !head->next;
> +}
> +
>  void i915_sched_park_engine(struct i915_sched_engine *se)
>  {
> +	struct i915_priolist_root *root = &se->queue;
> +	struct list_head *list = &root->sentinel.requests;
> +
> +	GEM_BUG_ON(!priolist_idle(root));
> +
> +	while (!pl_empty(list))
> +		kmem_cache_free(global.slab_priorities, pl_pop(list));
> +
>  	GEM_BUG_ON(!i915_sched_is_idle(se));
>  	se->no_priolist = false;
>  }
> @@ -183,71 +243,55 @@ static inline bool node_signaled(const struct i915_sched_node *node)
>  	return i915_request_completed(node_to_request(node));
>  }
>  
> -static inline struct i915_priolist *to_priolist(struct rb_node *rb)
> +static inline unsigned int random_level(struct i915_priolist_root *root)
>  {
> -	return rb_entry(rb, struct i915_priolist, node);
> -}
> -
> -static void assert_priolists(struct i915_sched_engine * const se)
> -{
> -	struct rb_node *rb;
> -	long last_prio;
> -
> -	if (!IS_ENABLED(CONFIG_DRM_I915_DEBUG_GEM))
> -		return;
> -
> -	GEM_BUG_ON(rb_first_cached(&se->queue) !=
> -		   rb_first(&se->queue.rb_root));
> -
> -	last_prio = INT_MAX;
> -	for (rb = rb_first_cached(&se->queue); rb; rb = rb_next(rb)) {
> -		const struct i915_priolist *p = to_priolist(rb);
> -
> -		GEM_BUG_ON(p->priority > last_prio);
> -		last_prio = p->priority;
> -	}
> +	root->prng = next_pseudo_random32(root->prng);
> +	return  __ffs(root->prng) / 2;
>  }
>  
>  static struct list_head *
>  lookup_priolist(struct intel_engine_cs *engine, int prio)
>  {
> +	struct i915_priolist *update[I915_PRIOLIST_HEIGHT];
>  	struct i915_sched_engine * const se = &engine->active;
> -	struct i915_priolist *p;
> -	struct rb_node **parent, *rb;
> -	bool first = true;
> -
> -	lockdep_assert_held(&engine->active.lock);
> -	assert_priolists(se);
> +	struct i915_priolist_root *root = &se->queue;
> +	struct i915_priolist *pl, *tmp;
> +	int lvl;
>  
> +	lockdep_assert_held(&se->lock);
>  	if (unlikely(se->no_priolist))
>  		prio = I915_PRIORITY_NORMAL;
>  
> +	for_each_priolist(pl, root) { /* recycle any empty elements before us */
> +		if (pl->priority >= prio || !list_empty(&pl->requests))
> +			break;
> +
> +		i915_priolist_advance(root, pl);
> +	}
> +
>  find_priolist:
> -	/* most positive priority is scheduled first, equal priorities fifo */
> -	rb = NULL;
> -	parent = &se->queue.rb_root.rb_node;
> -	while (*parent) {
> -		rb = *parent;
> -		p = to_priolist(rb);
> -		if (prio > p->priority) {
> -			parent = &rb->rb_left;
> -		} else if (prio < p->priority) {
> -			parent = &rb->rb_right;
> -			first = false;
> -		} else {
> -			return &p->requests;
> -		}
> +	pl = &root->sentinel;
> +	lvl = pl->level;
> +	while (lvl >= 0) {
> +		while (tmp = pl->next[lvl], tmp->priority >= prio)
> +			pl = tmp;
> +		if (pl->priority == prio)
> +			goto out;
> +		update[lvl--] = pl;
>  	}
>  
>  	if (prio == I915_PRIORITY_NORMAL) {
> -		p = &se->default_priolist;
> +		pl = &se->default_priolist;
> +	} else if (!pl_empty(&root->sentinel.requests)) {
> +		pl = pl_pop(&root->sentinel.requests);
>  	} else {
> -		p = kmem_cache_alloc(global.slab_priorities, GFP_ATOMIC);
> +		pl = kmem_cache_alloc(global.slab_priorities, GFP_ATOMIC);
>  		/* Convert an allocation failure to a priority bump */
> -		if (unlikely(!p)) {
> +		if (unlikely(!pl)) {
>  			prio = I915_PRIORITY_NORMAL; /* recurses just once */
>  
> -			/* To maintain ordering with all rendering, after an
> +			/*
> +			 * To maintain ordering with all rendering, after an
>  			 * allocation failure we have to disable all scheduling.
>  			 * Requests will then be executed in fifo, and schedule
>  			 * will ensure that dependencies are emitted in fifo.
> @@ -260,18 +304,103 @@ lookup_priolist(struct intel_engine_cs *engine, int prio)
>  		}
>  	}
>  
> -	p->priority = prio;
> -	INIT_LIST_HEAD(&p->requests);
> +	pl->priority = prio;
> +	INIT_LIST_HEAD(&pl->requests);
>  
> -	rb_link_node(&p->node, rb, parent);
> -	rb_insert_color_cached(&p->node, &se->queue, first);
> +	lvl = random_level(root);
> +	if (lvl > root->sentinel.level) {
> +		if (root->sentinel.level < I915_PRIOLIST_HEIGHT - 1) {
> +			lvl = ++root->sentinel.level;
> +			update[lvl] = &root->sentinel;
> +		} else {
> +			lvl = I915_PRIOLIST_HEIGHT - 1;
> +		}
> +	}
> +	GEM_BUG_ON(lvl < 0);
> +	GEM_BUG_ON(lvl >= ARRAY_SIZE(pl->next));
>  
> -	return &p->requests;
> +	pl->level = lvl;
> +	do {
> +		tmp = update[lvl];
> +		pl->next[lvl] = update[lvl]->next[lvl];
> +		tmp->next[lvl] = pl;
> +	} while (--lvl >= 0);
> +
> +	if (IS_ENABLED(CONFIG_DRM_I915_DEBUG_GEM)) {
> +		struct i915_priolist *chk;
> +
> +		chk = &root->sentinel;
> +		lvl = chk->level;
> +		do {
> +			while (tmp = chk->next[lvl], tmp->priority >= prio)
> +				chk = tmp;
> +		} while (--lvl >= 0);
> +
> +		GEM_BUG_ON(chk != pl);
> +	}
> +
> +out:
> +	GEM_BUG_ON(pl == &root->sentinel);
> +	return &pl->requests;
>  }
>  
> -void __i915_priolist_free(struct i915_priolist *p)
> +static void remove_priolist(struct intel_engine_cs *engine,
> +			    struct list_head *plist)
>  {
> -	kmem_cache_free(global.slab_priorities, p);
> +	struct i915_sched_engine * const se = &engine->active;
> +	struct i915_priolist_root *root = &se->queue;
> +	struct i915_priolist *pl, *tmp;
> +	struct i915_priolist *old =
> +		container_of(plist, struct i915_priolist, requests);
> +	int prio = old->priority;
> +	int lvl;
> +
> +	lockdep_assert_held(&se->lock);
> +	GEM_BUG_ON(!list_empty(plist));
> +
> +	pl = &root->sentinel;
> +	lvl = pl->level;
> +	GEM_BUG_ON(lvl < 0);
> +
> +	if (prio != I915_PRIORITY_NORMAL)
> +		pl_push(old, &pl->requests);
> +
> +	do {
> +		while (tmp = pl->next[lvl], tmp->priority > prio)
> +			pl = tmp;
> +		if (lvl <= old->level) {
> +			pl->next[lvl] = old->next[lvl];
> +			if (pl == &root->sentinel && old->next[lvl] == pl) {
> +				GEM_BUG_ON(pl->level != lvl);
> +				pl->level--;
> +			}
> +		}
> +	} while (--lvl >= 0);
> +	GEM_BUG_ON(tmp != old);
> +}
> +
> +void i915_priolist_advance(struct i915_priolist_root *root,
> +			   struct i915_priolist *pl)
> +{
> +	struct i915_priolist * const s = &root->sentinel;
> +	int lvl;
> +
> +	GEM_BUG_ON(!list_empty(&pl->requests));
> +	GEM_BUG_ON(pl != s->next[0]);
> +	GEM_BUG_ON(pl == s);
> +
> +	if (pl->priority != I915_PRIORITY_NORMAL)
> +		pl_push(pl, &s->requests);
> +
> +	lvl = pl->level;
> +	GEM_BUG_ON(lvl < 0);
> +	do {
> +		s->next[lvl] = pl->next[lvl];
> +		if (pl->next[lvl] == s) {
> +			GEM_BUG_ON(s->level != lvl);
> +			s->level--;
> +		}
> +	} while (--lvl >= 0);
>  }
>  
>  static struct i915_request *
> @@ -420,8 +549,13 @@ static void __i915_request_set_priority(struct i915_request *rq, int prio)
>  			continue;
>  
>  		GEM_BUG_ON(rq->engine != engine);
> -		if (i915_request_in_priority_queue(rq))
> +		if (i915_request_in_priority_queue(rq)) {
> +			struct list_head *prev = rq->sched.link.prev;
> +
>  			list_move_tail(&rq->sched.link, plist);
> +			if (list_empty(prev))
> +				remove_priolist(engine, prev);
> +		}
>  
>  		/* Defer (tasklet) submission until after all updates. */
>  		kick_submission(engine, rq, prio);
> diff --git a/drivers/gpu/drm/i915/i915_scheduler.h b/drivers/gpu/drm/i915/i915_scheduler.h
> index 0ab47cbf0e9c..bca89a58d953 100644
> --- a/drivers/gpu/drm/i915/i915_scheduler.h
> +++ b/drivers/gpu/drm/i915/i915_scheduler.h
> @@ -16,12 +16,6 @@
>  
>  struct drm_printer;
>  
> -#define priolist_for_each_request(it, plist) \
> -	list_for_each_entry(it, &(plist)->requests, sched.link)
> -
> -#define priolist_for_each_request_consume(it, n, plist) \
> -	list_for_each_entry_safe(it, n, &(plist)->requests, sched.link)
> -
>  void i915_sched_node_init(struct i915_sched_node *node);
>  void i915_sched_node_reinit(struct i915_sched_node *node);
>  
> @@ -69,7 +63,7 @@ static inline void i915_priolist_free(struct i915_priolist *p)
>  
>  static inline bool i915_sched_is_idle(const struct i915_sched_engine *se)
>  {
> -	return RB_EMPTY_ROOT(&se->queue.rb_root);
> +	return i915_priolist_is_empty(&se->queue);
>  }
>  
>  static inline bool
> @@ -99,6 +93,9 @@ static inline void i915_sched_kick(struct i915_sched_engine *se)
>  	tasklet_hi_schedule(&se->tasklet);
>  }
>  
> +void i915_priolist_advance(struct i915_priolist_root *root,
> +			   struct i915_priolist *old);
> +
>  void i915_request_show_with_schedule(struct drm_printer *m,
>  				     const struct i915_request *rq,
>  				     const char *prefix,
> diff --git a/drivers/gpu/drm/i915/i915_scheduler_types.h b/drivers/gpu/drm/i915/i915_scheduler_types.h
> index f668c680d290..e64750be4e77 100644
> --- a/drivers/gpu/drm/i915/i915_scheduler_types.h
> +++ b/drivers/gpu/drm/i915/i915_scheduler_types.h
> @@ -89,7 +89,7 @@ struct i915_sched_engine {
>  	/**
>  	 * @queue: queue of requests, in priority lists
>  	 */
> -	struct rb_root_cached queue;
> +	struct i915_priolist_root queue;
>  
>  	struct i915_sched_ipi ipi;
>  
> diff --git a/drivers/gpu/drm/i915/selftests/i915_mock_selftests.h b/drivers/gpu/drm/i915/selftests/i915_mock_selftests.h
> index 3db34d3eea58..946c93441c1f 100644
> --- a/drivers/gpu/drm/i915/selftests/i915_mock_selftests.h
> +++ b/drivers/gpu/drm/i915/selftests/i915_mock_selftests.h
> @@ -25,6 +25,7 @@ selftest(ring, intel_ring_mock_selftests)
>  selftest(engine, intel_engine_cs_mock_selftests)
>  selftest(timelines, intel_timeline_mock_selftests)
>  selftest(requests, i915_request_mock_selftests)
> +selftest(scheduler, i915_scheduler_mock_selftests)
>  selftest(objects, i915_gem_object_mock_selftests)
>  selftest(phys, i915_gem_phys_mock_selftests)
>  selftest(dmabuf, i915_gem_dmabuf_mock_selftests)
> diff --git a/drivers/gpu/drm/i915/selftests/i915_scheduler.c b/drivers/gpu/drm/i915/selftests/i915_scheduler.c
> index de44a66210b7..de5b1443129b 100644
> --- a/drivers/gpu/drm/i915/selftests/i915_scheduler.c
> +++ b/drivers/gpu/drm/i915/selftests/i915_scheduler.c
> @@ -12,6 +12,54 @@
>  #include "selftests/igt_spinner.h"
>  #include "selftests/i915_random.h"
>  
> +static int mock_skiplist_levels(void *dummy)
> +{
> +	struct i915_priolist_root root = {};
> +	struct i915_priolist *pl = &root.sentinel;
> +	IGT_TIMEOUT(end_time);
> +	unsigned long total;
> +	int count, lvl;
> +
> +	total = 0;
> +	do {
> +		for (count = 0; count < 16384; count++) {
> +			lvl = random_level(&root);
> +			if (lvl > pl->level) {
> +				if (lvl < I915_PRIOLIST_HEIGHT - 1)
> +					lvl = ++pl->level;
> +				else
> +					lvl = I915_PRIOLIST_HEIGHT - 1;
> +			}
> +
> +			pl->next[lvl] = ptr_inc(pl->next[lvl]);
> +		}
> +		total += count;
> +	} while (!__igt_timeout(end_time, NULL));
> +
> +	pr_info("Total %9lu\n", total);
> +	for (lvl = 0; lvl <= pl->level; lvl++) {
> +		int x = ilog2((unsigned long)pl->next[lvl]);
> +		char row[80];
> +
> +		memset(row, '*', x);
> +		row[x] = '\0';
> +
> +		pr_info(" [%2d] %9lu %s\n",
> +			lvl, (unsigned long)pl->next[lvl], row);
> +	}
> +
> +	return 0;
> +}
> +
> +int i915_scheduler_mock_selftests(void)
> +{
> +	static const struct i915_subtest tests[] = {
> +		SUBTEST(mock_skiplist_levels),
> +	};
> +
> +	return i915_subtests(tests, NULL);
> +}
> +
>  static void scheduling_disable(struct intel_engine_cs *engine)
>  {
>  	engine->props.preempt_timeout_ms = 0;
> @@ -80,9 +128,9 @@ static int all_engines(struct drm_i915_private *i915,
>  static bool check_context_order(struct intel_engine_cs *engine)
>  {
>  	u64 last_seqno, last_context;
> +	struct i915_priolist *p;
>  	unsigned long count;
>  	bool result = false;
> -	struct rb_node *rb;
>  	int last_prio;
>  
>  	/* We expect the execution order to follow ascending fence-context */
> @@ -92,8 +140,7 @@ static bool check_context_order(struct intel_engine_cs *engine)
>  	last_context = 0;
>  	last_seqno = 0;
>  	last_prio = 0;
> -	for (rb = rb_first_cached(&engine->active.queue); rb; rb = rb_next(rb)) {
> -		struct i915_priolist *p = rb_entry(rb, typeof(*p), node);
> +	for_each_priolist(p, &engine->active.queue) {
>  		struct i915_request *rq;
>  
>  		priolist_for_each_request(rq, p) {
> -- 
> 2.20.1

^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: [Intel-gfx] [PATCH 20/41] drm/i915: Replace priolist rbtree with a skiplist
  2021-01-28 22:44         ` Chris Wilson
@ 2021-01-29  9:24           ` Tvrtko Ursulin
  0 siblings, 0 replies; 90+ messages in thread
From: Tvrtko Ursulin @ 2021-01-29  9:24 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx; +Cc: thomas.hellstrom


On 28/01/2021 22:44, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2021-01-28 16:42:44)
>>
>> On 28/01/2021 16:26, Chris Wilson wrote:
>>> Quoting Tvrtko Ursulin (2021-01-28 15:56:19)
>>>> On 25/01/2021 14:01, Chris Wilson wrote:
>>>>> diff --git a/drivers/gpu/drm/i915/i915_priolist_types.h b/drivers/gpu/drm/i915/i915_priolist_types.h
>>>>> index bc2fa84f98a8..1200c3df6a4a 100644
>>>>> --- a/drivers/gpu/drm/i915/i915_priolist_types.h
>>>>> +++ b/drivers/gpu/drm/i915/i915_priolist_types.h
>>>>> @@ -38,10 +38,36 @@ enum {
>>>>>     #define I915_PRIORITY_UNPREEMPTABLE INT_MAX
>>>>>     #define I915_PRIORITY_BARRIER (I915_PRIORITY_UNPREEMPTABLE - 1)
>>>>>     
>>>>> +#ifdef CONFIG_64BIT
>>>>> +#define I915_PRIOLIST_HEIGHT 12
>>>>> +#else
>>>>> +#define I915_PRIOLIST_HEIGHT 11
>>>>> +#endif
>>>>
>>>> I did not get this. On one hand I could think pointers are larger on
>>>> 64-bit so go for fewer levels, if size was a concern. But on the other
>>>> hand 32-bit is less important these days, definitely much less as a
>>>> performance platform. So going for less memory use => worse performance
>>>> on a less important platform, which typically could be more memory
>>>> constrained? Not sure I see it as that important either way to be
>>>> distinctive but a comment would satisfy me.
>>>
>>> Just aligned to the cacheline. The struct is 128B on 64b and 64B on 32b.
>>> On 64b, we will scale to around 16 million requests in flight and 4
>>> million on 32b. Which should be enough.
>>>
>>> If we shrunk 64b to a 64B node, we would only scale to 256 requests,
>>> a limit we will definitely exceed.
>>
>> Ok thanks, pouring it into a comment is implied.
>>
>>>
>>>>>     struct i915_priolist {
>>>>>         struct list_head requests;
>>>>
>>>> What would be on this list? Request can only be on one at a time, so I
>>>> was thinking these nodes would have pointers to list of that priority,
>>>> rather than lists themselves. Assuming there can be multiple nodes of
>>>> the same priority in the 2d hierarchy. Possibly I don't understand the
>>>> layout.
>>>
>>> A request is only on one list (queue, active, hold). But we may still
>>> have more than one request at the same deadline, though that will likely
>>> be limited to priority-inheritance and timeslice deferrals.
>>>
>>> Since we would need pointer to the request, we could only reclaim a
>>> single pointer here, which is not enough to warrant reducing the overall
>>> node size. And while there is at least one user of request->sched.link,
>>> the list maintenance will still be incurred. Using request->sched.link
>>> remains a convenient interface.
>>
>> Lost you.
> 
> /*
>   * i915_priolist forms a skiplist. The skiplist is built in layers,
>   * starting at the base [0] is a singly linked list of all i915_priolist.
>   * Each higher layer contains a fraction of the i915_priolist from the
>   * previous layer:
>   *
>   * S[0] 0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF S
>   * E[1] >1>3>5>7>9>B>D>F>1>3>5>7>9>B>D>F>1>3>5>7>9>B>D>F>1>3>5>7>9>B>D>F E
>   * N[2] -->3-->7-->B-->F-->3-->7-->B-->F-->3-->7-->B-->F-->3-->7-->B-->F N
>   * T[3] ------->7----->F-------7------>F------>7------>F------>7------>F T

Just align this first 7.

>   * I[4] -------------->F-------------->F-------------->F-------------->F I
>   * N[5] ------------------------------>F------------------------------>F N
>   * E[6] ------------------------------>F-------------------------------> E
>   * L[7] ---------------------------------------------------------------> L
>   *
>   * To iterate through all active i915_priolist, we only need to follow
>   * the chain in i915_priolist.next[0] (see for_each_priolist).
>   *
>   * To quickly find a specific key (or insert point), we can perform a binary
>   * search by starting at the highest level and following the linked list
>   * at that level until we either find the node, or have gone past the key.
>   * Then we descend a level, and start walking the list again starting from
>   * the current position, until eventually we find our key, or we run out of

From the previous node on the current level, not the current one, I
believe. So go back to the previous node before descending, right?

Very useful diagram, thank you.
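
For reference, the descent discussed above can be sketched in standalone C. This is only an illustration modelled on the lookup_priolist() loop quoted in the patch, not the driver code: struct node, HEIGHT, skiplist_init(), skiplist_find() and skiplist_insert() are hypothetical names, and the level is passed in by the caller instead of being chosen at random. The inner loop only steps forward while the next node is not past the key, so the walk never overshoots, and each descent starts from the last node at or before the key on the level just walked.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define HEIGHT 4 /* hypothetical; the patch uses I915_PRIOLIST_HEIGHT */

struct node {
	int priority;
	int level;			/* highest index in use in next[] */
	struct node *next[HEIGHT];
};

/* As in init_priolist(): every level of the sentinel points back at itself. */
static void skiplist_init(struct node *s)
{
	for (int i = 0; i < HEIGHT; i++)
		s->next[i] = s;
	s->priority = INT_MIN;
	s->level = -1;
}

/*
 * Search in descending priority order. Returns the node holding prio, or
 * NULL after recording in prev[] the last node visited on every level.
 */
static struct node *skiplist_find(struct node *s, int prio, struct node **prev)
{
	struct node *pl = s;

	assert(prio > INT_MIN); /* INT_MIN is reserved for the sentinel */
	for (int lvl = s->level; lvl >= 0; lvl--) {
		/* Peek at the next node; only advance if it is not past the key. */
		while (pl->next[lvl]->priority >= prio)
			pl = pl->next[lvl];
		if (pl->priority == prio)
			return pl;
		prev[lvl] = pl; /* the descent continues from here */
	}
	return NULL;
}

/* Link n at levels 0..lvl, or return the existing node for prio. */
static struct node *skiplist_insert(struct node *s, struct node *n,
				    int prio, int lvl)
{
	struct node *prev[HEIGHT];
	struct node *found;

	assert(lvl >= 0 && lvl < HEIGHT);
	found = skiplist_find(s, prio, prev);
	if (found)
		return found;

	/* Levels not yet in use start from the sentinel (simplified growth). */
	while (s->level < lvl)
		prev[++s->level] = s;

	n->priority = prio;
	n->level = lvl;
	for (int i = 0; i <= lvl; i++) {
		n->next[i] = prev[i]->next[i];
		prev[i]->next[i] = n;
	}
	return n;
}
```

Unlike the patch, this sketch lets the list grow by several levels in one insertion; the patch only ever adds one level at a time.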

Regards,

Tvrtko

^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: [Intel-gfx] [PATCH 20/41] drm/i915: Replace priolist rbtree with a skiplist
  2021-01-28 16:26     ` Chris Wilson
  2021-01-28 16:42       ` Tvrtko Ursulin
@ 2021-01-29  9:37       ` Tvrtko Ursulin
  2021-01-29 10:26         ` Chris Wilson
  1 sibling, 1 reply; 90+ messages in thread
From: Tvrtko Ursulin @ 2021-01-29  9:37 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx; +Cc: thomas.hellstrom


On 28/01/2021 16:26, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2021-01-28 15:56:19)

>>> -static void assert_priolists(struct i915_sched_engine * const se)
>>> -{
>>> -     struct rb_node *rb;
>>> -     long last_prio;
>>> -
>>> -     if (!IS_ENABLED(CONFIG_DRM_I915_DEBUG_GEM))
>>> -             return;
>>> -
>>> -     GEM_BUG_ON(rb_first_cached(&se->queue) !=
>>> -                rb_first(&se->queue.rb_root));
>>> -
>>> -     last_prio = INT_MAX;
>>> -     for (rb = rb_first_cached(&se->queue); rb; rb = rb_next(rb)) {
>>> -             const struct i915_priolist *p = to_priolist(rb);
>>> -
>>> -             GEM_BUG_ON(p->priority > last_prio);
>>> -             last_prio = p->priority;
>>> -     }
>>> +     root->prng = next_pseudo_random32(root->prng);
>>> +     return  __ffs(root->prng) / 2;
>>
>> Where is the relationship to I915_PRIOLIST_HEIGHT? Feels root->prng %
>> I915_PRIOLIST_HEIGHT would be more obvious here unless I am terribly
>> mistaken. Or at least put a comment saying why the hack.
> 
> HEIGHT is the maximum possible for our struct. skiplists only want to
> increment the height of the tree one step at a time. So we choose a level
> with decreasing probability, and then limit that to the maximum height of
> the current tree + 1, clamped to HEIGHT.
> 
> You might notice that unlike traditional skiplists, this uses a
> probability of 0.25 for each additional level. A neat trick discovered by
> Con Kolivas (I haven't found it mentioned elsewhere) as the cost of the
> extra level (using P=.5) is the same as the extra chain length with
> P=.25. So you can scale to higher number of requests by packing more
> requests into each level.
> 
> So that is split between randomly choosing a level and then working out
> the height of the node.

Choosing levels with decreasing probability by the virtue of using ffs 
on a random number? Or because (BITS_PER_TYPE(u32) / 2) is greater than 
I915_PRIOLIST_HEIGHT?
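
To make the question concrete, here is a small standalone C sketch of how halving the index of the lowest set bit yields levels with geometrically decreasing probability. It is an illustration under stated assumptions, not the driver code: xorshift32() is a stand-in for next_pseudo_random32(), __builtin_ctz() (a GCC/Clang builtin) stands in for the kernel's __ffs(), and level_from_bits(), random_level_sketch() and clamp_level() are hypothetical names. For uniformly random bits, the lowest set bit is at index 0 or 1 with probability 3/4, at index 2 or 3 with probability 3/16, and so on, so each additional level is reached with probability 1/4. The raw value can be as large as 15, which is why the result still has to be clamped against the current height and I915_PRIOLIST_HEIGHT, as lookup_priolist() does in the patch.

```c
#include <assert.h>
#include <stdint.h>

#define HEIGHT 12 /* mirrors I915_PRIOLIST_HEIGHT on 64-bit in the patch */

/* Stand-in PRNG (xorshift32); the state must be non-zero. */
static uint32_t xorshift32(uint32_t x)
{
	x ^= x << 13;
	x ^= x >> 17;
	x ^= x << 5;
	return x;
}

/* Index of the lowest set bit, halved: P(level >= k) = 4^-k. */
static unsigned int level_from_bits(uint32_t bits)
{
	assert(bits != 0); /* the bit index is undefined for zero */
	return (unsigned int)__builtin_ctz(bits) / 2;
}

/* Advance the PRNG state and derive an unclamped level from it. */
static unsigned int random_level_sketch(uint32_t *state)
{
	*state = xorshift32(*state);
	return level_from_bits(*state);
}

/*
 * Same clamping as in lookup_priolist(): grow the list by at most one
 * level per insertion, and never beyond HEIGHT - 1.
 */
static int clamp_level(int lvl, int *cur)
{
	if (lvl > *cur) {
		if (*cur < HEIGHT - 1)
			lvl = ++*cur;
		else
			lvl = HEIGHT - 1;
	}
	return lvl;
}
```

So the decreasing probability comes from the lowest-set-bit operation itself; the division by two only changes the ratio between successive levels from 1/2 to 1/4.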

Regards,

Tvrtko

^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: [Intel-gfx] [PATCH 20/41] drm/i915: Replace priolist rbtree with a skiplist
  2021-01-25 14:01 ` [Intel-gfx] [PATCH 20/41] drm/i915: Replace priolist rbtree with a skiplist Chris Wilson
                     ` (2 preceding siblings ...)
  2021-01-28 22:56   ` Matthew Brost
@ 2021-01-29 10:22   ` Tvrtko Ursulin
  3 siblings, 0 replies; 90+ messages in thread
From: Tvrtko Ursulin @ 2021-01-29 10:22 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx; +Cc: thomas.hellstrom


On 25/01/2021 14:01, Chris Wilson wrote:
> Replace the priolist rbtree with a skiplist. The crucial difference is
> that walking and removing the first element of a skiplist is O(1), but
> O(lgN) for an rbtree, as we need to rebalance on remove. This is a
> hindrance for submission latency as it occurs between picking a request
> from the priolist and submitting it to hardware, as well as effectively
> tripling the number of O(lgN) operations required under the irqoff lock.
> This is critical to reducing the latency jitter with multiple clients.
> 
> The downsides to skiplists are that lookup/insertion is only
> probabilistically O(lgN) and there is a significant memory penalty,
> as each skip node is larger than the rbtree equivalent. Furthermore, we
> don't use dynamic arrays for the skiplist, so the allocation is fixed,
> and imposes an upper bound on the scalability with respect to the number
> of inflight requests.
> 
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> ---
>   .../drm/i915/gt/intel_execlists_submission.c  |  63 +++--
>   .../gpu/drm/i915/gt/uc/intel_guc_submission.c |  30 +--
>   drivers/gpu/drm/i915/i915_priolist_types.h    |  28 +-
>   drivers/gpu/drm/i915/i915_scheduler.c         | 244 ++++++++++++++----
>   drivers/gpu/drm/i915/i915_scheduler.h         |  11 +-
>   drivers/gpu/drm/i915/i915_scheduler_types.h   |   2 +-
>   .../drm/i915/selftests/i915_mock_selftests.h  |   1 +
>   .../gpu/drm/i915/selftests/i915_scheduler.c   |  53 +++-
>   8 files changed, 316 insertions(+), 116 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
> index 1103c8a00af1..129144dd86b0 100644
> --- a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
> +++ b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
> @@ -244,11 +244,6 @@ static void ring_set_paused(const struct intel_engine_cs *engine, int state)
>   		wmb();
>   }
>   
> -static struct i915_priolist *to_priolist(struct rb_node *rb)
> -{
> -	return rb_entry(rb, struct i915_priolist, node);
> -}
> -
>   static int rq_prio(const struct i915_request *rq)
>   {
>   	return READ_ONCE(rq->sched.attr.priority);
> @@ -272,15 +267,31 @@ static int effective_prio(const struct i915_request *rq)
>   	return prio;
>   }
>   
> -static int queue_prio(const struct i915_sched_engine *se)
> +static struct i915_request *first_request(struct i915_sched_engine *se)
>   {
> -	struct rb_node *rb;
> +	struct i915_priolist *pl;
>   
> -	rb = rb_first_cached(&se->queue);
> -	if (!rb)
> +	for_each_priolist(pl, &se->queue) {
> +		if (likely(!list_empty(&pl->requests)))
> +			return list_first_entry(&pl->requests,
> +						struct i915_request,
> +						sched.link);
> +
> +		i915_priolist_advance(&se->queue, pl);

Why is a "peek" type call site doing tree modifications? Couldn't that 
be limited to places which add/remove?

> +	}
> +
> +	return NULL;
> +}
> +
> +static int queue_prio(struct i915_sched_engine *se)
> +{
> +	struct i915_request *rq;
> +
> +	rq = first_request(se);
> +	if (!rq)
>   		return INT_MIN;
>   
> -	return to_priolist(rb)->priority;
> +	return rq_prio(rq);
>   }
>   
>   static int virtual_prio(const struct intel_engine_execlists *el)
> @@ -290,7 +301,7 @@ static int virtual_prio(const struct intel_engine_execlists *el)
>   	return rb ? rb_entry(rb, struct ve_node, rb)->prio : INT_MIN;
>   }
>   
> -static bool need_preempt(const struct intel_engine_cs *engine,
> +static bool need_preempt(struct intel_engine_cs *engine,
>   			 const struct i915_request *rq)
>   {
>   	int last_prio;
> @@ -1136,6 +1147,7 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
>   	struct i915_request ** const last_port = port + execlists->port_mask;
>   	struct i915_request *last, * const *active;
>   	struct virtual_engine *ve;
> +	struct i915_priolist *pl;
>   	struct rb_node *rb;
>   	bool submit = false;
>   
> @@ -1346,11 +1358,10 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
>   			break;
>   	}
>   
> -	while ((rb = rb_first_cached(&engine->active.queue))) {
> -		struct i915_priolist *p = to_priolist(rb);
> +	for_each_priolist(pl, &engine->active.queue) {
>   		struct i915_request *rq, *rn;
>   
> -		priolist_for_each_request_consume(rq, rn, p) {
> +		priolist_for_each_request_safe(rq, rn, pl) {
>   			bool merge = true;
>   
>   			/*
> @@ -1425,8 +1436,7 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
>   			}
>   		}
>   
> -		rb_erase_cached(&p->node, &engine->active.queue);
> -		i915_priolist_free(p);
> +		i915_priolist_advance(&engine->active.queue, pl);

There must be someone doing a list_del on this request in this block so 
I suppose it is hidden somewhere in the chain, must be 
__i915_request_submit. I guess it was the same with rbtree so I just 
wonder if there is a way to document this better. Nothing comes to mind.

>   	}
>   done:
>   	*port++ = i915_request_get(last);
> @@ -2631,6 +2641,7 @@ static void execlists_reset_cancel(struct intel_engine_cs *engine)
>   {
>   	struct intel_engine_execlists * const execlists = &engine->execlists;
>   	struct i915_request *rq, *rn;
> +	struct i915_priolist *pl;
>   	struct rb_node *rb;
>   	unsigned long flags;
>   
> @@ -2661,16 +2672,12 @@ static void execlists_reset_cancel(struct intel_engine_cs *engine)
>   	intel_engine_signal_breadcrumbs(engine);
>   
>   	/* Flush the queued requests to the timeline list (for retiring). */
> -	while ((rb = rb_first_cached(&engine->active.queue))) {
> -		struct i915_priolist *p = to_priolist(rb);
> -
> -		priolist_for_each_request_consume(rq, rn, p) {
> +	for_each_priolist(pl, &engine->active.queue) {
> +		priolist_for_each_request_safe(rq, rn, pl) {
>   			i915_request_mark_eio(rq);
>   			__i915_request_submit(rq);
>   		}
> -
> -		rb_erase_cached(&p->node, &engine->active.queue);
> -		i915_priolist_free(p);
> +		i915_priolist_advance(&engine->active.queue, pl);
>   	}
>   	GEM_BUG_ON(!i915_sched_is_idle(&engine->active));
>   
> @@ -2703,7 +2710,6 @@ static void execlists_reset_cancel(struct intel_engine_cs *engine)
>   	/* Remaining _unready_ requests will be nop'ed when submitted */
>   
>   	execlists->queue_priority_hint = INT_MIN;
> -	engine->active.queue = RB_ROOT_CACHED;
>   
>   	GEM_BUG_ON(__tasklet_is_enabled(&engine->active.tasklet));
>   	engine->active.tasklet.func = nop_submission_tasklet;
> @@ -3089,6 +3095,8 @@ static void virtual_context_exit(struct intel_context *ce)
>   
>   	for (n = 0; n < ve->num_siblings; n++)
>   		intel_engine_pm_put(ve->siblings[n]);
> +
> +	i915_sched_park_engine(&ve->base.active);
>   }
>   
>   static const struct intel_context_ops virtual_context_ops = {
> @@ -3501,6 +3509,7 @@ void intel_execlists_show_requests(struct intel_engine_cs *engine,
>   {
>   	const struct intel_engine_execlists *execlists = &engine->execlists;
>   	struct i915_request *rq, *last;
> +	struct i915_priolist *pl;
>   	unsigned long flags;
>   	unsigned int count;
>   	struct rb_node *rb;
> @@ -3530,10 +3539,8 @@ void intel_execlists_show_requests(struct intel_engine_cs *engine,
>   
>   	last = NULL;
>   	count = 0;
> -	for (rb = rb_first_cached(&engine->active.queue); rb; rb = rb_next(rb)) {
> -		struct i915_priolist *p = rb_entry(rb, typeof(*p), node);
> -
> -		priolist_for_each_request(rq, p) {
> +	for_each_priolist(pl, &engine->active.queue) {
> +		priolist_for_each_request(rq, pl) {
>   			if (count++ < max - 1)
>   				show_request(m, rq, "\t\t", 0);
>   			else
> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> index 2d7339ef3b4c..8d0c6cd277b3 100644
> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> @@ -59,11 +59,6 @@
>   
>   #define GUC_REQUEST_SIZE 64 /* bytes */
>   
> -static inline struct i915_priolist *to_priolist(struct rb_node *rb)
> -{
> -	return rb_entry(rb, struct i915_priolist, node);
> -}
> -
>   static struct guc_stage_desc *__get_stage_desc(struct intel_guc *guc, u32 id)
>   {
>   	struct guc_stage_desc *base = guc->stage_desc_pool_vaddr;
> @@ -185,8 +180,8 @@ static void __guc_dequeue(struct intel_engine_cs *engine)
>   	struct i915_request ** const last_port = first + execlists->port_mask;
>   	struct i915_request *last = first[0];
>   	struct i915_request **port;
> +	struct i915_priolist *pl;
>   	bool submit = false;
> -	struct rb_node *rb;
>   
>   	lockdep_assert_held(&engine->active.lock);
>   
> @@ -203,11 +198,10 @@ static void __guc_dequeue(struct intel_engine_cs *engine)
>   	 * event.
>   	 */
>   	port = first;
> -	while ((rb = rb_first_cached(&engine->active.queue))) {
> -		struct i915_priolist *p = to_priolist(rb);
> +	for_each_priolist(pl, &engine->active.queue) {
>   		struct i915_request *rq, *rn;
>   
> -		priolist_for_each_request_consume(rq, rn, p) {
> +		priolist_for_each_request_safe(rq, rn, pl) {
>   			if (last && rq->context != last->context) {
>   				if (port == last_port)
>   					goto done;
> @@ -223,12 +217,11 @@ static void __guc_dequeue(struct intel_engine_cs *engine)
>   			last = rq;
>   		}
>   
> -		rb_erase_cached(&p->node, &engine->active.queue);
> -		i915_priolist_free(p);
> +		i915_priolist_advance(&engine->active.queue, pl);
>   	}
>   done:
>   	execlists->queue_priority_hint =
> -		rb ? to_priolist(rb)->priority : INT_MIN;
> +		pl != &engine->active.queue.sentinel ? pl->priority : INT_MIN;
>   	if (submit) {
>   		*port = schedule_in(last, port - execlists->inflight);
>   		*++port = NULL;
> @@ -327,7 +320,7 @@ static void guc_reset_cancel(struct intel_engine_cs *engine)
>   {
>   	struct intel_engine_execlists * const execlists = &engine->execlists;
>   	struct i915_request *rq, *rn;
> -	struct rb_node *rb;
> +	struct i915_priolist *p;
>   	unsigned long flags;
>   
>   	ENGINE_TRACE(engine, "\n");
> @@ -355,25 +348,20 @@ static void guc_reset_cancel(struct intel_engine_cs *engine)
>   	}
>   
>   	/* Flush the queued requests to the timeline list (for retiring). */
> -	while ((rb = rb_first_cached(&engine->active.queue))) {
> -		struct i915_priolist *p = to_priolist(rb);
> -
> -		priolist_for_each_request_consume(rq, rn, p) {
> +	for_each_priolist(p, &engine->active.queue) {
> +		priolist_for_each_request_safe(rq, rn, p) {
>   			list_del_init(&rq->sched.link);
>   			__i915_request_submit(rq);
>   			dma_fence_set_error(&rq->fence, -EIO);
>   			i915_request_mark_complete(rq);
>   		}
> -
> -		rb_erase_cached(&p->node, &engine->active.queue);
> -		i915_priolist_free(p);
> +		i915_priolist_advance(&engine->active.queue, p);
>   	}
>   	GEM_BUG_ON(!i915_sched_is_idle(&engine->active));
>   
>   	/* Remaining _unready_ requests will be nop'ed when submitted */
>   
>   	execlists->queue_priority_hint = INT_MIN;
> -	engine->active.queue = RB_ROOT_CACHED;
>   
>   	spin_unlock_irqrestore(&engine->active.lock, flags);
>   }
> diff --git a/drivers/gpu/drm/i915/i915_priolist_types.h b/drivers/gpu/drm/i915/i915_priolist_types.h
> index bc2fa84f98a8..1200c3df6a4a 100644
> --- a/drivers/gpu/drm/i915/i915_priolist_types.h
> +++ b/drivers/gpu/drm/i915/i915_priolist_types.h
> @@ -38,10 +38,36 @@ enum {
>   #define I915_PRIORITY_UNPREEMPTABLE INT_MAX
>   #define I915_PRIORITY_BARRIER (I915_PRIORITY_UNPREEMPTABLE - 1)
>   
> +#ifdef CONFIG_64BIT
> +#define I915_PRIOLIST_HEIGHT 12
> +#else
> +#define I915_PRIOLIST_HEIGHT 11
> +#endif
> +
>   struct i915_priolist {
>   	struct list_head requests;
> -	struct rb_node node;
>   	int priority;
> +
> +	int level;
> +	struct i915_priolist *next[I915_PRIOLIST_HEIGHT];
>   };
>   
> +struct i915_priolist_root {
> +	struct i915_priolist sentinel;
> +	u32 prng;
> +};
> +
> +#define i915_priolist_is_empty(root) ((root)->sentinel.level < 0)
> +
> +#define for_each_priolist(p, root) \
> +	for ((p) = (root)->sentinel.next[0]; \
> +	     (p) != &(root)->sentinel; \
> +	     (p) = (p)->next[0])
> +
> +#define priolist_for_each_request(it, plist) \
> +	list_for_each_entry(it, &(plist)->requests, sched.link)
> +
> +#define priolist_for_each_request_safe(it, n, plist) \
> +	list_for_each_entry_safe(it, n, &(plist)->requests, sched.link)
> +
>   #endif /* _I915_PRIOLIST_TYPES_H_ */
> diff --git a/drivers/gpu/drm/i915/i915_scheduler.c b/drivers/gpu/drm/i915/i915_scheduler.c
> index a3ee06cb66d7..74000d3eebb1 100644
> --- a/drivers/gpu/drm/i915/i915_scheduler.c
> +++ b/drivers/gpu/drm/i915/i915_scheduler.c
> @@ -4,7 +4,9 @@
>    * Copyright © 2018 Intel Corporation
>    */
>   
> +#include <linux/bitops.h>
>   #include <linux/mutex.h>
> +#include <linux/prandom.h>
>   
>   #include "gt/intel_ring.h"
>   #include "gt/intel_lrc_reg.h"
> @@ -91,15 +93,24 @@ static void i915_sched_init_ipi(struct i915_sched_ipi *ipi)
>   	ipi->list = NULL;
>   }
>   
> +static void init_priolist(struct i915_priolist_root *const root)
> +{
> +	struct i915_priolist *pl = &root->sentinel;
> +
> +	memset_p((void **)pl->next, pl, ARRAY_SIZE(pl->next));
> +	pl->priority = INT_MIN;
> +	pl->level = -1;
> +}
> +
>   void i915_sched_init_engine(struct i915_sched_engine *se,
>   			    unsigned int subclass)
>   {
>   	spin_lock_init(&se->lock);
>   	lockdep_set_subclass(&se->lock, subclass);
>   
> +	init_priolist(&se->queue);
>   	INIT_LIST_HEAD(&se->requests);
>   	INIT_LIST_HEAD(&se->hold);
> -	se->queue = RB_ROOT_CACHED;
>   
>   	i915_sched_init_ipi(&se->ipi);
>   
> @@ -116,8 +127,57 @@ void i915_sched_init_engine(struct i915_sched_engine *se,
>   #endif
>   }
>   
> +__maybe_unused static bool priolist_idle(struct i915_priolist_root *root)
> +{
> +	struct i915_priolist *pl = &root->sentinel;
> +	int lvl;
> +
> +	for (lvl = 0; lvl < ARRAY_SIZE(pl->next); lvl++) {
> +		if (pl->next[lvl] != pl) {
> +			GEM_TRACE_ERR("root[%d] is not empty\n", lvl);
> +			return false;
> +		}
> +	}
> +
> +	if (pl->level != -1) {
> +		GEM_TRACE_ERR("root is not clear: %d\n", pl->level);
> +		return false;
> +	}
> +
> +	return true;
> +}
> +
> +static void pl_push(struct i915_priolist *pl, struct list_head *head)
> +{
> +	pl->requests.next = head->next;
> +	head->next = &pl->requests;
> +}
> +
> +static struct i915_priolist *pl_pop(struct list_head *head)
> +{
> +	struct i915_priolist *pl;
> +
> +	pl = container_of(head->next, typeof(*pl), requests);
> +	head->next = pl->requests.next;
> +
> +	return pl;
> +}
> +
> +static bool pl_empty(struct list_head *head)
> +{
> +	return !head->next;
> +}
> +
>   void i915_sched_park_engine(struct i915_sched_engine *se)
>   {
> +	struct i915_priolist_root *root = &se->queue;
> +	struct list_head *list = &root->sentinel.requests;
> +
> +	GEM_BUG_ON(!priolist_idle(root));
> +
> +	while (!pl_empty(list))
> +		kmem_cache_free(global.slab_priorities, pl_pop(list));

If I follow correctly, you could just unlink the head and free the rest 
with just a walk, no need to pop along the way.

> +
>   	GEM_BUG_ON(!i915_sched_is_idle(se));
>   	se->no_priolist = false;
>   }
> @@ -183,71 +243,55 @@ static inline bool node_signaled(const struct i915_sched_node *node)
>   	return i915_request_completed(node_to_request(node));
>   }
>   
> -static inline struct i915_priolist *to_priolist(struct rb_node *rb)
> +static inline unsigned int random_level(struct i915_priolist_root *root)
>   {
> -	return rb_entry(rb, struct i915_priolist, node);
> -}
> -
> -static void assert_priolists(struct i915_sched_engine * const se)
> -{
> -	struct rb_node *rb;
> -	long last_prio;
> -
> -	if (!IS_ENABLED(CONFIG_DRM_I915_DEBUG_GEM))
> -		return;
> -
> -	GEM_BUG_ON(rb_first_cached(&se->queue) !=
> -		   rb_first(&se->queue.rb_root));
> -
> -	last_prio = INT_MAX;
> -	for (rb = rb_first_cached(&se->queue); rb; rb = rb_next(rb)) {
> -		const struct i915_priolist *p = to_priolist(rb);
> -
> -		GEM_BUG_ON(p->priority > last_prio);
> -		last_prio = p->priority;
> -	}
> +	root->prng = next_pseudo_random32(root->prng);
> +	return  __ffs(root->prng) / 2;
>   }
>   
>   static struct list_head *
>   lookup_priolist(struct intel_engine_cs *engine, int prio)
>   {
> +	struct i915_priolist *update[I915_PRIOLIST_HEIGHT];
>   	struct i915_sched_engine * const se = &engine->active;
> -	struct i915_priolist *p;
> -	struct rb_node **parent, *rb;
> -	bool first = true;
> -
> -	lockdep_assert_held(&engine->active.lock);
> -	assert_priolists(se);
> +	struct i915_priolist_root *root = &se->queue;
> +	struct i915_priolist *pl, *tmp;
> +	int lvl;
>   
> +	lockdep_assert_held(&se->lock);
>   	if (unlikely(se->no_priolist))
>   		prio = I915_PRIORITY_NORMAL;
>   
> +	for_each_priolist(pl, root) { /* recycle any empty elements before us */
> +		if (pl->priority >= prio || !list_empty(&pl->requests))
> +			break;
> +
> +		i915_priolist_advance(root, pl);
> +	}
> +
>   find_priolist:
> -	/* most positive priority is scheduled first, equal priorities fifo */
> -	rb = NULL;
> -	parent = &se->queue.rb_root.rb_node;
> -	while (*parent) {
> -		rb = *parent;
> -		p = to_priolist(rb);
> -		if (prio > p->priority) {
> -			parent = &rb->rb_left;
> -		} else if (prio < p->priority) {
> -			parent = &rb->rb_right;
> -			first = false;
> -		} else {
> -			return &p->requests;
> -		}
> +	pl = &root->sentinel;
> +	lvl = pl->level;
> +	while (lvl >= 0) {
> +		while (tmp = pl->next[lvl], tmp->priority >= prio)
> +			pl = tmp;
> +		if (pl->priority == prio)
> +			goto out;
> +		update[lvl--] = pl;
>   	}
>   
>   	if (prio == I915_PRIORITY_NORMAL) {
> -		p = &se->default_priolist;
> +		pl = &se->default_priolist;
> +	} else if (!pl_empty(&root->sentinel.requests)) {
> +		pl = pl_pop(&root->sentinel.requests);
>   	} else {
> -		p = kmem_cache_alloc(global.slab_priorities, GFP_ATOMIC);
> +		pl = kmem_cache_alloc(global.slab_priorities, GFP_ATOMIC);
>   		/* Convert an allocation failure to a priority bump */
> -		if (unlikely(!p)) {
> +		if (unlikely(!pl)) {
>   			prio = I915_PRIORITY_NORMAL; /* recurses just once */
>   
> -			/* To maintain ordering with all rendering, after an
> +			/*
> +			 * To maintain ordering with all rendering, after an
>   			 * allocation failure we have to disable all scheduling.
>   			 * Requests will then be executed in fifo, and schedule
>   			 * will ensure that dependencies are emitted in fifo.
> @@ -260,18 +304,103 @@ lookup_priolist(struct intel_engine_cs *engine, int prio)
>   		}
>   	}
>   
> -	p->priority = prio;
> -	INIT_LIST_HEAD(&p->requests);
> +	pl->priority = prio;
> +	INIT_LIST_HEAD(&pl->requests);
>   
> -	rb_link_node(&p->node, rb, parent);
> -	rb_insert_color_cached(&p->node, &se->queue, first);
> +	lvl = random_level(root);
> +	if (lvl > root->sentinel.level) {
> +		if (root->sentinel.level < I915_PRIOLIST_HEIGHT - 1) {
> +			lvl = ++root->sentinel.level;
> +			update[lvl] = &root->sentinel;
> +		} else {
> +			lvl = I915_PRIOLIST_HEIGHT - 1;
> +		}
> +	}
> +	GEM_BUG_ON(lvl < 0);
> +	GEM_BUG_ON(lvl >= ARRAY_SIZE(pl->next));
>   
> -	return &p->requests;
> +	pl->level = lvl;
> +	do {
> +		tmp = update[lvl];
> +		pl->next[lvl] = update[lvl]->next[lvl];
> +		tmp->next[lvl] = pl;
> +	} while (--lvl >= 0);
> +
> +	if (IS_ENABLED(CONFIG_DRM_I915_DEBUG_GEM)) {
> +		struct i915_priolist *chk;
> +
> +		chk = &root->sentinel;
> +		lvl = chk->level;
> +		do {
> +			while (tmp = chk->next[lvl], tmp->priority >= prio)
> +				chk = tmp;
> +		} while (--lvl >= 0);
> +
> +		GEM_BUG_ON(chk != pl);
> +	}
> +
> +out:
> +	GEM_BUG_ON(pl == &root->sentinel);
> +	return &pl->requests;
>   }
>   
> -void __i915_priolist_free(struct i915_priolist *p)
> +static void remove_priolist(struct intel_engine_cs *engine,
> +			    struct list_head *plist)
>   {
> -	kmem_cache_free(global.slab_priorities, p);
> +	struct i915_sched_engine * const se = &engine->active;
> +	struct i915_priolist_root *root = &se->queue;
> +	struct i915_priolist *pl, *tmp;
> +	struct i915_priolist *old =
> +		container_of(plist, struct i915_priolist, requests);
> +	int prio = old->priority;
> +	int lvl;
> +
> +	lockdep_assert_held(&se->lock);
> +	GEM_BUG_ON(!list_empty(plist));
> +
> +	pl = &root->sentinel;
> +	lvl = pl->level;
> +	GEM_BUG_ON(lvl < 0);
> +
> +	if (prio != I915_PRIORITY_NORMAL)
> +		pl_push(old, &pl->requests);
> +
> +	do {
> +		while (tmp = pl->next[lvl], tmp->priority > prio)
> +			pl = tmp;
> +		if (lvl <= old->level) {
> +			pl->next[lvl] = old->next[lvl];
> +			if (pl == &root->sentinel && old->next[lvl] == pl) {
> +				GEM_BUG_ON(pl->level != lvl);
> +				pl->level--;
> +			}
> +		}
> +	} while (--lvl >= 0);
> +	GEM_BUG_ON(tmp != old);
> +}

Any chance to extract some commonality between remove and advance? Not 
fully getting it yet but they both seem to be removing nodes and then 
decreasing the height.

> +
> +void i915_priolist_advance(struct i915_priolist_root *root,
> +			   struct i915_priolist *pl)
> +{

This seems to be called when lowest priority entry has no requests, 
right? Some sort of trim? Not getting the "advance" idea.

> +	struct i915_priolist * const s = &root->sentinel;
> +	int lvl;
> +
> +	GEM_BUG_ON(!list_empty(&pl->requests));
> +	GEM_BUG_ON(pl != s->next[0]);
> +	GEM_BUG_ON(pl == s);
> +
> +	if (pl->priority != I915_PRIORITY_NORMAL)
> +		pl_push(pl, &s->requests);

s->requests is just a store of empty priolist nodes? If so please put a 
comment, maybe in struct i915_priolist_root.

> +
> +	lvl = pl->level;
> +	GEM_BUG_ON(lvl < 0);
> +	do {
> +		s->next[lvl] = pl->next[lvl];

So it unlinks the empty pl from each layer starting from the top.

> +		if (pl->next[lvl] == s) {

If the layer is completely empty..

> +			GEM_BUG_ON(s->level != lvl);
> +			s->level--;

.. decreases the max height by one. But it also expects max height to be 
equal to the height of the empty level. Because the layer was empty it has 
to be, okay.

> +		}
> +	} while (--lvl >= 0);
>   }
>   
>   static struct i915_request *
> @@ -420,8 +549,13 @@ static void __i915_request_set_priority(struct i915_request *rq, int prio)
>   			continue;
>   
>   		GEM_BUG_ON(rq->engine != engine);
> -		if (i915_request_in_priority_queue(rq))
> +		if (i915_request_in_priority_queue(rq)) {
> +			struct list_head *prev = rq->sched.link.prev;
> +
>   			list_move_tail(&rq->sched.link, plist);
> +			if (list_empty(prev))
> +				remove_priolist(engine, prev);
> +		}
>   
>   		/* Defer (tasklet) submission until after all updates. */
>   		kick_submission(engine, rq, prio);
> diff --git a/drivers/gpu/drm/i915/i915_scheduler.h b/drivers/gpu/drm/i915/i915_scheduler.h
> index 0ab47cbf0e9c..bca89a58d953 100644
> --- a/drivers/gpu/drm/i915/i915_scheduler.h
> +++ b/drivers/gpu/drm/i915/i915_scheduler.h
> @@ -16,12 +16,6 @@
>   
>   struct drm_printer;
>   
> -#define priolist_for_each_request(it, plist) \
> -	list_for_each_entry(it, &(plist)->requests, sched.link)
> -
> -#define priolist_for_each_request_consume(it, n, plist) \
> -	list_for_each_entry_safe(it, n, &(plist)->requests, sched.link)
> -
>   void i915_sched_node_init(struct i915_sched_node *node);
>   void i915_sched_node_reinit(struct i915_sched_node *node);
>   
> @@ -69,7 +63,7 @@ static inline void i915_priolist_free(struct i915_priolist *p)
>   
>   static inline bool i915_sched_is_idle(const struct i915_sched_engine *se)
>   {
> -	return RB_EMPTY_ROOT(&se->queue.rb_root);
> +	return i915_priolist_is_empty(&se->queue);
>   }
>   
>   static inline bool
> @@ -99,6 +93,9 @@ static inline void i915_sched_kick(struct i915_sched_engine *se)
>   	tasklet_hi_schedule(&se->tasklet);
>   }
>   
> +void i915_priolist_advance(struct i915_priolist_root *root,
> +			   struct i915_priolist *old);
> +
>   void i915_request_show_with_schedule(struct drm_printer *m,
>   				     const struct i915_request *rq,
>   				     const char *prefix,
> diff --git a/drivers/gpu/drm/i915/i915_scheduler_types.h b/drivers/gpu/drm/i915/i915_scheduler_types.h
> index f668c680d290..e64750be4e77 100644
> --- a/drivers/gpu/drm/i915/i915_scheduler_types.h
> +++ b/drivers/gpu/drm/i915/i915_scheduler_types.h
> @@ -89,7 +89,7 @@ struct i915_sched_engine {
>   	/**
>   	 * @queue: queue of requests, in priority lists
>   	 */
> -	struct rb_root_cached queue;
> +	struct i915_priolist_root queue;
>   
>   	struct i915_sched_ipi ipi;
>   
> diff --git a/drivers/gpu/drm/i915/selftests/i915_mock_selftests.h b/drivers/gpu/drm/i915/selftests/i915_mock_selftests.h
> index 3db34d3eea58..946c93441c1f 100644
> --- a/drivers/gpu/drm/i915/selftests/i915_mock_selftests.h
> +++ b/drivers/gpu/drm/i915/selftests/i915_mock_selftests.h
> @@ -25,6 +25,7 @@ selftest(ring, intel_ring_mock_selftests)
>   selftest(engine, intel_engine_cs_mock_selftests)
>   selftest(timelines, intel_timeline_mock_selftests)
>   selftest(requests, i915_request_mock_selftests)
> +selftest(scheduler, i915_scheduler_mock_selftests)
>   selftest(objects, i915_gem_object_mock_selftests)
>   selftest(phys, i915_gem_phys_mock_selftests)
>   selftest(dmabuf, i915_gem_dmabuf_mock_selftests)
> diff --git a/drivers/gpu/drm/i915/selftests/i915_scheduler.c b/drivers/gpu/drm/i915/selftests/i915_scheduler.c
> index de44a66210b7..de5b1443129b 100644
> --- a/drivers/gpu/drm/i915/selftests/i915_scheduler.c
> +++ b/drivers/gpu/drm/i915/selftests/i915_scheduler.c
> @@ -12,6 +12,54 @@
>   #include "selftests/igt_spinner.h"
>   #include "selftests/i915_random.h"
>   
> +static int mock_skiplist_levels(void *dummy)
> +{
> +	struct i915_priolist_root root = {};
> +	struct i915_priolist *pl = &root.sentinel;
> +	IGT_TIMEOUT(end_time);
> +	unsigned long total;
> +	int count, lvl;
> +
> +	total = 0;
> +	do {
> +		for (count = 0; count < 16384; count++) {
> +			lvl = random_level(&root);
> +			if (lvl > pl->level) {
> +				if (lvl < I915_PRIOLIST_HEIGHT - 1)
> +					lvl = ++pl->level;
> +				else
> +					lvl = I915_PRIOLIST_HEIGHT - 1;
> +			}
> +
> +			pl->next[lvl] = ptr_inc(pl->next[lvl]);
> +		}
> +		total += count;
> +	} while (!__igt_timeout(end_time, NULL));
> +
> +	pr_info("Total %9lu\n", total);
> +	for (lvl = 0; lvl <= pl->level; lvl++) {
> +		int x = ilog2((unsigned long)pl->next[lvl]);
> +		char row[80];
> +
> +		memset(row, '*', x);
> +		row[x] = '\0';
> +
> +		pr_info(" [%2d] %9lu %s\n",
> +			lvl, (unsigned long)pl->next[lvl], row);
> +	}
> +
> +	return 0;
> +}
> +
> +int i915_scheduler_mock_selftests(void)
> +{
> +	static const struct i915_subtest tests[] = {
> +		SUBTEST(mock_skiplist_levels),
> +	};
> +
> +	return i915_subtests(tests, NULL);
> +}
> +
>   static void scheduling_disable(struct intel_engine_cs *engine)
>   {
>   	engine->props.preempt_timeout_ms = 0;
> @@ -80,9 +128,9 @@ static int all_engines(struct drm_i915_private *i915,
>   static bool check_context_order(struct intel_engine_cs *engine)
>   {
>   	u64 last_seqno, last_context;
> +	struct i915_priolist *p;
>   	unsigned long count;
>   	bool result = false;
> -	struct rb_node *rb;
>   	int last_prio;
>   
>   	/* We expect the execution order to follow ascending fence-context */
> @@ -92,8 +140,7 @@ static bool check_context_order(struct intel_engine_cs *engine)
>   	last_context = 0;
>   	last_seqno = 0;
>   	last_prio = 0;
> -	for (rb = rb_first_cached(&engine->active.queue); rb; rb = rb_next(rb)) {
> -		struct i915_priolist *p = rb_entry(rb, typeof(*p), node);
> +	for_each_priolist(p, &engine->active.queue) {
>   		struct i915_request *rq;
>   
>   		priolist_for_each_request(rq, p) {
> 
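
To check my reading of lookup_priolist() and i915_priolist_advance(), here is a minimal userspace sketch of the same idea. It is not code from the patch: the names (struct node, struct root, insert(), pop_first()) and the small HEIGHT are made up for illustration, and the caller passes in the level instead of drawing it from the PRNG. Priorities are assumed to be greater than INT_MIN, which is reserved for the sentinel.

```c
#include <limits.h>
#include <stddef.h>

#define HEIGHT 4 /* illustrative; the patch uses I915_PRIOLIST_HEIGHT */

struct node {
	int priority;
	int level;			/* highest level this node is linked on */
	struct node *next[HEIGHT];
};

struct root {
	struct node sentinel;		/* priority INT_MIN, level -1 when empty */
};

static void root_init(struct root *root)
{
	struct node *s = &root->sentinel;
	int lvl;

	for (lvl = 0; lvl < HEIGHT; lvl++)
		s->next[lvl] = s;
	s->priority = INT_MIN;
	s->level = -1;
}

/*
 * Link @n in descending priority order using the requested level @lvl
 * (lvl >= 0). Returns the node now holding @prio, which is an existing
 * node if that priority is already present.
 */
static struct node *insert(struct root *root, struct node *n, int prio, int lvl)
{
	struct node *update[HEIGHT];
	struct node *s = &root->sentinel, *pl = s;
	int i;

	/* Descend from the top level, remembering the predecessor per level. */
	for (i = s->level; i >= 0; i--) {
		while (pl->next[i]->priority >= prio)
			pl = pl->next[i];
		update[i] = pl;
	}
	if (pl != s && pl->priority == prio)
		return pl;

	/* Grow the list by at most one level at a time, clamped to HEIGHT. */
	if (lvl > s->level) {
		if (s->level < HEIGHT - 1) {
			lvl = ++s->level;
			update[lvl] = s;
		} else {
			lvl = HEIGHT - 1;
		}
	}

	n->priority = prio;
	n->level = lvl;
	for (i = lvl; i >= 0; i--) {
		n->next[i] = update[i]->next[i];
		update[i]->next[i] = n;
	}
	return n;
}

/* Unlink the first (highest priority) node without any search. */
static struct node *pop_first(struct root *root)
{
	struct node *s = &root->sentinel;
	struct node *pl = s->next[0];
	int lvl;

	if (pl == s)
		return NULL;

	/* The first node is also first on every level it is linked on. */
	for (lvl = pl->level; lvl >= 0; lvl--) {
		s->next[lvl] = pl->next[lvl];
		if (pl->next[lvl] == s)	/* this level is now empty */
			s->level--;
	}
	return pl;
}
```

If this matches the intent, then "advance" is effectively pop_first() on an already drained node: the work is bounded by the node's own level, with no rebalancing and no search from the root.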

Regards,

Tvrtko
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

* Re: [Intel-gfx] [PATCH 20/41] drm/i915: Replace priolist rbtree with a skiplist
  2021-01-29  9:37       ` Tvrtko Ursulin
@ 2021-01-29 10:26         ` Chris Wilson
  0 siblings, 0 replies; 90+ messages in thread
From: Chris Wilson @ 2021-01-29 10:26 UTC (permalink / raw)
  To: Tvrtko Ursulin, intel-gfx; +Cc: thomas.hellstrom

Quoting Tvrtko Ursulin (2021-01-29 09:37:27)
> 
> On 28/01/2021 16:26, Chris Wilson wrote:
> > Quoting Tvrtko Ursulin (2021-01-28 15:56:19)
> 
> >>> -static void assert_priolists(struct i915_sched_engine * const se)
> >>> -{
> >>> -     struct rb_node *rb;
> >>> -     long last_prio;
> >>> -
> >>> -     if (!IS_ENABLED(CONFIG_DRM_I915_DEBUG_GEM))
> >>> -             return;
> >>> -
> >>> -     GEM_BUG_ON(rb_first_cached(&se->queue) !=
> >>> -                rb_first(&se->queue.rb_root));
> >>> -
> >>> -     last_prio = INT_MAX;
> >>> -     for (rb = rb_first_cached(&se->queue); rb; rb = rb_next(rb)) {
> >>> -             const struct i915_priolist *p = to_priolist(rb);
> >>> -
> >>> -             GEM_BUG_ON(p->priority > last_prio);
> >>> -             last_prio = p->priority;
> >>> -     }
> >>> +     root->prng = next_pseudo_random32(root->prng);
> >>> +     return  __ffs(root->prng) / 2;
> >>
> >> Where is the relationship to I915_PRIOLIST_HEIGHT? Feels root->prng %
> >> I915_PRIOLIST_HEIGHT would be more obvious here unless I am terribly
> >> mistaken. Or at least put a comment saying why the hack.
> > 
> > HEIGHT is the maximum possible for our struct. skiplists only want to
> > increment the height of the tree one step at a time. So we choose a level
> > with decreasing probability, and then limit that to the maximum height of
> > the current tree + 1, clamped to HEIGHT.
> > 
> > You might notice that unlike traditional skiplists, this uses a
> > probability of 0.25 for each additional level. A neat trick discovered by
> > Con Kolivas (I haven't found it mentioned elsewhere) as the cost of the
> > extra level (using P=.5) is the same as the extra chain length with
> > P=.25. So you can scale to higher number of requests by packing more
> > requests into each level.
> > 
> > So that is split between randomly choosing a level and then working out
> > the height of the node.
> 
> Choosing levels with decreasing probability by the virtue of using ffs 
> on a random number? Or because (BITS_PER_TYPE(u32) / 2) is greater than 
> I915_PRIOLIST_HEIGHT?

        /*
         * Given a uniform distribution of random numbers over the u32, then
         * the probability each bit is unset is P=0.5. The probability of a
         * successive sequence of bits being unset is P(n) = 0.5^n [n > 0].
         *   P(level:1) = 0.5
         *   P(level:2) = 0.25
         *   P(level:3) = 0.125
         *   P(level:4) = 0.0625
         *   ...
         * So we can use ffs() on a good random number generator to pick our
         * level. We divide by two to reduce the probability of choosing a
         * level to .25, as the cost of descending a level is the same as
         * following an extra link in the chain at that level (so we can
         * pack more nodes into fewer levels without incurring extra cost,
         * and allow scaling to higher volumes of requests without expanding
         * the height of the skiplist).
         */
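
As a rough userspace illustration of the comment above (a sketch, not the patch itself): lowest_set_bit() stands in for the kernel's __ffs(), the recurrence in next_pseudo_random32() is the LCG from <linux/prandom.h> as I recall it, and level_histogram() plus the explicit guard for a zero input are additions made only for this example.

```c
#include <stdint.h>

/* LCG recurrence as used by next_pseudo_random32() in <linux/prandom.h>. */
static uint32_t next_pseudo_random32(uint32_t seed)
{
	return seed * 1664525u + 1013904223u;
}

/*
 * Userspace stand-in for __ffs(): index of the lowest set bit.
 * __ffs(0) is undefined in the kernel; this sketch returns 32 instead.
 */
static unsigned int lowest_set_bit(uint32_t x)
{
	unsigned int n = 0;

	if (!x)
		return 32;
	while (!(x & 1)) {
		x >>= 1;
		n++;
	}
	return n;
}

/* Two bits are consumed per level, so P(level >= n) = 0.25^n. */
static unsigned int random_level(uint32_t *prng)
{
	*prng = next_pseudo_random32(*prng);
	return lowest_set_bit(*prng) / 2;
}

/* Count how often each level is drawn; hist[] needs 17 entries (0..16). */
static void level_histogram(uint32_t seed, unsigned long draws,
			    unsigned long hist[17])
{
	unsigned long i;

	for (i = 0; i < 17; i++)
		hist[i] = 0;
	for (i = 0; i < draws; i++)
		hist[random_level(&seed)]++;
}
```

Calling level_histogram() with a few tens of thousands of draws should show about three quarters of the draws on level 0 and each higher level roughly four times rarer than the one below it, which is the P=.25 behaviour described above.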

-Chris

* Re: [Intel-gfx] [PATCH 20/41] drm/i915: Replace priolist rbtree with a skiplist
  2021-01-28 22:56   ` Matthew Brost
@ 2021-01-29 10:30     ` Chris Wilson
  2021-01-29 17:01       ` Matthew Brost
  0 siblings, 1 reply; 90+ messages in thread
From: Chris Wilson @ 2021-01-29 10:30 UTC (permalink / raw)
  To: Matthew Brost; +Cc: intel-gfx, thomas.hellstrom

Quoting Matthew Brost (2021-01-28 22:56:04)
> On Mon, Jan 25, 2021 at 02:01:15PM +0000, Chris Wilson wrote:
> > Replace the priolist rbtree with a skiplist. The crucial difference is
> > that walking and removing the first element of a skiplist is O(1), but
> > O(lgN) for an rbtree, as we need to rebalance on remove. This is a
> > hindrance for submission latency as it occurs between picking a request
> > for the priolist and submitting it to hardware, as well as effectively
> > tripling the number of O(lgN) operations required under the irqoff lock.
> > This is critical to reducing the latency jitter with multiple clients.
> > 
> > The downsides to skiplists are that lookup/insertion is only
> > probabilistically O(lgN) and there is a significant memory penalty
> > as each skip node is larger than the rbtree equivalent. Furthermore, we
> > don't use dynamic arrays for the skiplist, so the allocation is fixed,
> > and imposes an upper bound on the scalability wrt the number of
> > inflight requests.
> > 
> 
> This is a fun data structure but IMO might be overkill to maintain this
> code in the i915. The UMDs have effectively agreed to use only 3 levels,
> is O(lgN) where N == 3 really a big deal? With GuC submission we will
> statically map all user levels into 3 buckets. If we are doing that, do
> we even need a complex data structure? i.e. Could we just use an
> array of linked lists?

Because we need to scale the bst to handle a unique key per request with
thousands of requests [this is not only about priorities]. And as you
will see from the results, even with just a single priority in the system
(so one entry in either the skiplist or rbtree), the skiplist is beating 
the rbtree as measured by the lock hold time around insert/dequeue of
requests. That surprised me.
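
For comparison, the fixed-bucket scheme asked about above could look roughly like the sketch below. All names here (struct bucket_queue, enqueue(), dequeue(), NUM_BUCKETS) are hypothetical and this is not code from the series. Both operations are O(1), but only because the key is a small fixed index; with a unique key per request there is no bounded array to index, so an ordered structure is still needed.

```c
#include <stddef.h>

#define NUM_BUCKETS 3 /* hypothetical: index 0 is served first */

struct request {
	struct request *next;
	int id;
};

struct bucket {
	struct request *head;
	struct request **tail;	/* points at the last next pointer */
};

struct bucket_queue {
	struct bucket buckets[NUM_BUCKETS];
};

static void bucket_queue_init(struct bucket_queue *q)
{
	int i;

	for (i = 0; i < NUM_BUCKETS; i++) {
		q->buckets[i].head = NULL;
		q->buckets[i].tail = &q->buckets[i].head;
	}
}

/* Append to the chosen bucket: fifo within a bucket. */
static void enqueue(struct bucket_queue *q, int bucket, struct request *rq)
{
	struct bucket *b = &q->buckets[bucket];

	rq->next = NULL;
	*b->tail = rq;
	b->tail = &rq->next;
}

/* Take the oldest request from the first non-empty bucket. */
static struct request *dequeue(struct bucket_queue *q)
{
	int i;

	for (i = 0; i < NUM_BUCKETS; i++) {
		struct bucket *b = &q->buckets[i];
		struct request *rq = b->head;

		if (!rq)
			continue;

		b->head = rq->next;
		if (!b->head)
			b->tail = &b->head;
		return rq;
	}
	return NULL;
}
```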
-Chris

* Re: [Intel-gfx] [PATCH 20/41] drm/i915: Replace priolist rbtree with a skiplist
  2021-01-29 10:30     ` Chris Wilson
@ 2021-01-29 17:01       ` Matthew Brost
  0 siblings, 0 replies; 90+ messages in thread
From: Matthew Brost @ 2021-01-29 17:01 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx, thomas.hellstrom

On Fri, Jan 29, 2021 at 10:30:58AM +0000, Chris Wilson wrote:
> Quoting Matthew Brost (2021-01-28 22:56:04)
> > On Mon, Jan 25, 2021 at 02:01:15PM +0000, Chris Wilson wrote:
> > > Replace the priolist rbtree with a skiplist. The crucial difference is
> > > that walking and removing the first element of a skiplist is O(1), but
> > > O(lgN) for an rbtree, as we need to rebalance on remove. This is a
> > > hindrance for submission latency as it occurs between picking a request
> > > for the priolist and submitting it to hardware, as well as effectively
> > > tripling the number of O(lgN) operations required under the irqoff lock.
> > > This is critical to reducing the latency jitter with multiple clients.
> > > 
> > > The downsides to skiplists are that lookup/insertion is only
> > > probabilistically O(lgN) and there is a significant memory penalty
> > > as each skip node is larger than the rbtree equivalent. Furthermore, we
> > > don't use dynamic arrays for the skiplist, so the allocation is fixed,
> > > and imposes an upper bound on the scalability wrt the number of
> > > inflight requests.
> > > 
> > 
> > This is a fun data structure but IMO might be overkill to maintain this
> > code in the i915. The UMDs have effectively agreed to use only 3 levels,
> > is O(lgN) where N == 3 really a big deal? With GuC submission we will
> > statically map all user levels into 3 buckets. If we are doing that, do
> > we even need a complex data structure? i.e. Could we just use an
> > array of linked lists?
> 
> Because we need to scale the bst to handle a unique key per request with
> thousands of requests [this is not only about priorities]. And as you
> will see from the results, even with just a single priority in the system
> (so one entry in either the skiplist or rbtree), the skiplist is beating 
> the rbtree as measured by the lock hold time around insert/dequeue of
> requests. That surprised me.

Ok, seems reasonable. Skip lists are pretty cool; I wonder if at some
point we should make the skip list code a bit more generic so it can
possibly be leveraged in other parts of the i915 / kernel.

Matt

> -Chris
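
For readers following the O(1) dequeue argument above, here is a minimal
userspace sketch of a fixed-height skiplist. It is not the i915 code from
this series: the names skip_node, skip_list, skip_insert, skip_pop_first
and the height SKIP_MAX_LEVEL are hypothetical and chosen only for
illustration. It shows why removing the first element needs no search and
no rebalancing (the first node is always linked directly from the head at
every level it occupies), and why a fixed per-node link array costs memory.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define SKIP_MAX_LEVEL 8 /* hypothetical fixed height, not taken from the patch */

struct skip_node {
	int key;
	int level; /* number of links in use, 1..SKIP_MAX_LEVEL */
	struct skip_node *next[SKIP_MAX_LEVEL]; /* fixed array: the memory cost */
};

struct skip_list {
	struct skip_node head; /* sentinel, its key is never compared */
};

static void skip_init(struct skip_list *sl)
{
	int i;

	sl->head.level = SKIP_MAX_LEVEL;
	for (i = 0; i < SKIP_MAX_LEVEL; i++)
		sl->head.next[i] = NULL;
}

/* Geometric distribution: each extra level is taken with probability 1/2. */
static int skip_random_level(void)
{
	int level = 1;

	while (level < SKIP_MAX_LEVEL && (rand() & 1))
		level++;
	return level;
}

/* Expected O(lgN): descend from the top level, recording predecessors. */
static void skip_insert(struct skip_list *sl, struct skip_node *node, int key)
{
	struct skip_node *update[SKIP_MAX_LEVEL];
	struct skip_node *pos = &sl->head;
	int i;

	assert(node != NULL);

	for (i = SKIP_MAX_LEVEL - 1; i >= 0; i--) {
		/* '<=' keeps equal keys in insertion (FIFO) order */
		while (pos->next[i] && pos->next[i]->key <= key)
			pos = pos->next[i];
		update[i] = pos;
	}

	node->key = key;
	node->level = skip_random_level();
	for (i = 0; i < node->level; i++) {
		node->next[i] = update[i]->next[i];
		update[i]->next[i] = node;
	}
}

/*
 * Bounded by the node's own height, independent of N: nothing precedes
 * the first node, so the head is its predecessor at every level.
 */
static struct skip_node *skip_pop_first(struct skip_list *sl)
{
	struct skip_node *first = sl->head.next[0];
	int i;

	if (!first)
		return NULL;

	for (i = 0; i < first->level; i++)
		sl->head.next[i] = first->next[i];
	return first;
}
```

A caller would embed or allocate skip_node objects, call skip_insert()
for each with its key, and repeatedly call skip_pop_first() to drain them
in ascending key order. An rbtree doing the same dequeue has to walk to
the leftmost node and may rebalance on erase.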

end of thread, other threads:[~2021-01-29 17:09 UTC | newest]

Thread overview: 90+ messages (download: mbox.gz / follow: Atom feed)
2021-01-25 14:00 [Intel-gfx] [PATCH 01/41] drm/i915/selftests: Check for engine-reset errors in the middle of workarounds Chris Wilson
2021-01-25 14:00 ` [Intel-gfx] [PATCH 02/41] drm/i915/gt: Move the defer_request waiter active assertion Chris Wilson
2021-01-25 14:53   ` Tvrtko Ursulin
2021-01-25 14:00 ` [Intel-gfx] [PATCH 03/41] drm/i915: Replace engine->schedule() with a known request operation Chris Wilson
2021-01-25 15:14   ` Tvrtko Ursulin
2021-01-25 14:00 ` [Intel-gfx] [PATCH 04/41] drm/i915: Teach the i915_dependency to use a double-lock Chris Wilson
2021-01-25 15:34   ` Tvrtko Ursulin
2021-01-25 21:37     ` Chris Wilson
2021-01-26  9:40       ` Tvrtko Ursulin
2021-01-25 14:01 ` [Intel-gfx] [PATCH 05/41] drm/i915: Restructure priority inheritance Chris Wilson
2021-01-26 11:12   ` Tvrtko Ursulin
2021-01-26 11:30     ` Chris Wilson
2021-01-26 11:40       ` Tvrtko Ursulin
2021-01-26 11:55         ` Chris Wilson
2021-01-26 13:15           ` Tvrtko Ursulin
2021-01-26 13:24             ` Chris Wilson
2021-01-26 13:45               ` Chris Wilson
2021-01-25 14:01 ` [Intel-gfx] [PATCH 06/41] drm/i915/selftests: Measure set-priority duration Chris Wilson
2021-01-25 14:01 ` [Intel-gfx] [PATCH 07/41] drm/i915/selftests: Exercise priority inheritance around an engine loop Chris Wilson
2021-01-25 14:01 ` [Intel-gfx] [PATCH 08/41] drm/i915: Improve DFS for priority inheritance Chris Wilson
2021-01-26 16:22   ` Tvrtko Ursulin
2021-01-26 16:26     ` Chris Wilson
2021-01-26 16:42       ` Tvrtko Ursulin
2021-01-26 16:51         ` Tvrtko Ursulin
2021-01-26 16:51         ` Chris Wilson
2021-01-25 14:01 ` [Intel-gfx] [PATCH 09/41] drm/i915/selftests: Exercise relative mmio paths to non-privileged registers Chris Wilson
2021-01-25 14:01 ` [Intel-gfx] [PATCH 10/41] drm/i915/selftests: Exercise cross-process context isolation Chris Wilson
2021-01-25 14:01 ` [Intel-gfx] [PATCH 11/41] drm/i915: Extract request submission from execlists Chris Wilson
2021-01-26 16:28   ` Tvrtko Ursulin
2021-01-25 14:01 ` [Intel-gfx] [PATCH 12/41] drm/i915: Extract request rewinding " Chris Wilson
2021-01-25 14:01 ` [Intel-gfx] [PATCH 13/41] drm/i915: Extract request suspension from the execlists Chris Wilson
2021-01-25 14:01 ` [Intel-gfx] [PATCH 14/41] drm/i915: Extract the ability to defer and rerun a request later Chris Wilson
2021-01-25 14:01 ` [Intel-gfx] [PATCH 15/41] drm/i915: Fix the iterative dfs for defering requests Chris Wilson
2021-01-25 14:01 ` [Intel-gfx] [PATCH 16/41] drm/i915: Move common active lists from engine to i915_scheduler Chris Wilson
2021-01-25 14:01 ` [Intel-gfx] [PATCH 17/41] drm/i915: Move scheduler queue Chris Wilson
2021-01-25 14:01 ` [Intel-gfx] [PATCH 18/41] drm/i915: Move tasklet from execlists to sched Chris Wilson
2021-01-27 14:10   ` Tvrtko Ursulin
2021-01-27 14:24     ` Chris Wilson
2021-01-25 14:01 ` [Intel-gfx] [PATCH 19/41] drm/i915/gt: Show scheduler queues when dumping state Chris Wilson
2021-01-27 14:13   ` Tvrtko Ursulin
2021-01-27 14:35     ` Chris Wilson
2021-01-27 14:50       ` Tvrtko Ursulin
2021-01-27 14:55         ` Chris Wilson
2021-01-25 14:01 ` [Intel-gfx] [PATCH 20/41] drm/i915: Replace priolist rbtree with a skiplist Chris Wilson
2021-01-27 15:10   ` Tvrtko Ursulin
2021-01-27 15:33     ` Chris Wilson
2021-01-27 15:44       ` Chris Wilson
2021-01-27 15:58         ` Tvrtko Ursulin
2021-01-28  9:50           ` Chris Wilson
2021-01-28 15:56   ` Tvrtko Ursulin
2021-01-28 16:26     ` Chris Wilson
2021-01-28 16:42       ` Tvrtko Ursulin
2021-01-28 22:20         ` Chris Wilson
2021-01-28 22:44         ` Chris Wilson
2021-01-29  9:24           ` Tvrtko Ursulin
2021-01-29  9:37       ` Tvrtko Ursulin
2021-01-29 10:26         ` Chris Wilson
2021-01-28 22:56   ` Matthew Brost
2021-01-29 10:30     ` Chris Wilson
2021-01-29 17:01       ` Matthew Brost
2021-01-29 10:22   ` Tvrtko Ursulin
2021-01-25 14:01 ` [Intel-gfx] [PATCH 21/41] drm/i915: Wrap cmpxchg64 with try_cmpxchg64() helper Chris Wilson
2021-01-27 15:28   ` Tvrtko Ursulin
2021-01-25 14:01 ` [Intel-gfx] [PATCH 22/41] drm/i915: Fair low-latency scheduling Chris Wilson
2021-01-28 11:35   ` Tvrtko Ursulin
2021-01-28 12:32     ` Chris Wilson
2021-01-25 14:01 ` [Intel-gfx] [PATCH 23/41] drm/i915/gt: Specify a deadline for the heartbeat Chris Wilson
2021-01-25 14:01 ` [Intel-gfx] [PATCH 24/41] drm/i915: Extend the priority boosting for the display with a deadline Chris Wilson
2021-01-25 14:01 ` [Intel-gfx] [PATCH 25/41] drm/i915/gt: Support virtual engine queues Chris Wilson
2021-01-25 14:01 ` [Intel-gfx] [PATCH 26/41] drm/i915: Move saturated workload detection back to the context Chris Wilson
2021-01-25 14:01 ` [Intel-gfx] [PATCH 27/41] drm/i915: Bump default timeslicing quantum to 5ms Chris Wilson
2021-01-25 14:01 ` [Intel-gfx] [PATCH 28/41] drm/i915/gt: Wrap intel_timeline.has_initial_breadcrumb Chris Wilson
2021-01-25 14:01 ` [Intel-gfx] [PATCH 29/41] drm/i915/gt: Track timeline GGTT offset separately from subpage offset Chris Wilson
2021-01-25 14:01 ` [Intel-gfx] [PATCH 30/41] drm/i915/gt: Add timeline "mode" Chris Wilson
2021-01-25 14:01 ` [Intel-gfx] [PATCH 31/41] drm/i915/gt: Use indices for writing into relative timelines Chris Wilson
2021-01-25 14:01 ` [Intel-gfx] [PATCH 32/41] drm/i915/selftests: Exercise relative timeline modes Chris Wilson
2021-01-25 14:01 ` [Intel-gfx] [PATCH 33/41] drm/i915/gt: Use ppHWSP for unshared non-semaphore related timelines Chris Wilson
2021-01-25 14:01 ` [Intel-gfx] [PATCH 34/41] Restore "drm/i915: drop engine_pin/unpin_breadcrumbs_irq" Chris Wilson
2021-01-25 14:01 ` [Intel-gfx] [PATCH 35/41] drm/i915/gt: Couple tasklet scheduling for all CS interrupts Chris Wilson
2021-01-25 14:01 ` [Intel-gfx] [PATCH 36/41] drm/i915/gt: Support creation of 'internal' rings Chris Wilson
2021-01-25 14:01 ` [Intel-gfx] [PATCH 37/41] drm/i915/gt: Use client timeline address for seqno writes Chris Wilson
2021-01-25 14:01 ` [Intel-gfx] [PATCH 38/41] drm/i915/gt: Infrastructure for ring scheduling Chris Wilson
2021-01-25 14:01 ` [Intel-gfx] [PATCH 39/41] drm/i915/gt: Implement ring scheduler for gen4-7 Chris Wilson
2021-01-25 14:01 ` [Intel-gfx] [PATCH 40/41] drm/i915/gt: Enable ring scheduling for gen5-7 Chris Wilson
2021-01-25 14:01 ` [Intel-gfx] [PATCH 41/41] drm/i915: Support secure dispatch on gen6/gen7 Chris Wilson
2021-01-25 14:40 ` [Intel-gfx] [PATCH 01/41] drm/i915/selftests: Check for engine-reset errors in the middle of workarounds Tvrtko Ursulin
2021-01-25 17:08 ` [Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for series starting with [01/41] " Patchwork
2021-01-25 17:10 ` [Intel-gfx] ✗ Fi.CI.SPARSE: " Patchwork
2021-01-25 17:38 ` [Intel-gfx] ✓ Fi.CI.BAT: success " Patchwork
2021-01-25 22:45 ` [Intel-gfx] ✗ Fi.CI.IGT: failure " Patchwork
