* [Intel-gfx] [RFC 0/5] Split up intel_lrc.c
@ 2019-12-11 21:12 Daniele Ceraolo Spurio
  2019-12-11 21:12 ` [Intel-gfx] [RFC 1/5] drm/i915: introduce logical_ring and lr_context naming Daniele Ceraolo Spurio
                   ` (7 more replies)
  0 siblings, 8 replies; 21+ messages in thread
From: Daniele Ceraolo Spurio @ 2019-12-11 21:12 UTC (permalink / raw)
  To: intel-gfx

The new GuC submission code will get rid of the execlists emulation and
move towards being a more independent submission flow. However, given
that the HW underneath is still the same, the generic engine HW setup
and context handling can be shared between the GuC and execlists
submission paths. Currently, the execlists submission code and the more
generic handlers are mixed together in intel_lrc.c, which makes code
sharing trickier and makes it harder to keep the two submission
mechanisms isolated. This series therefore proposes splitting the
intel_lrc file to separate the common parts from the submission-specific
ones. Apart from execlists submission and context management, the
virtual engine code has also been split into a generic part, to be
re-used, and a backend-specific one.
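
For reference, the layering this series aims for is roughly the
following (file names as per the diffstat below; the GuC submission
backend itself is future work and not part of this series):

  gt/intel_lrc.c/.h                     common logical ring setup and LR
                                        context management, shared by the
                                        submission backends
  gt/intel_virtual_engine.c/.h/_types.h generic virtual engine handling
  gt/intel_execlists_submission.c/.h    execlists-specific submission code
  gt/selftest_execlists.c               execlists selftests, split out of
                                        gt/selftest_lrc.c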

The state of the intel_lrc.c file gets a bit more confusing after the
first patch, as execlists_* and lr_context_* functions end up mixed
together, but that is resolved by the end of the series, when all the
execlists_* code is moved away.

I'm not too sure where the functions that emit commands into the ring
belong, because most of them are common but there will be some small
differences between the GuC and execlists paths (e.g. no busywait with
the GuC). I've left them in intel_lrc.c for now, with a plan to
reconsider once the new GuC code lands and it becomes clearer what the
differences are.

Very lightly tested.

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: John Harrison <john.c.harrison@intel.com>

Daniele Ceraolo Spurio (4):
  drm/i915: introduce logical_ring and lr_context naming
  drm/i915: split out virtual engine code
  drm/i915: move execlists selftests to their own file
  drm/i915: introduce intel_execlists_submission.<c/h>

Matthew Brost (1):
  drm/i915: Move struct intel_virtual_engine to its own header

 drivers/gpu/drm/i915/Makefile                 |    2 +
 drivers/gpu/drm/i915/gem/i915_gem_context.c   |    5 +-
 .../drm/i915/gem/selftests/igt_gem_utils.c    |   27 +
 .../drm/i915/gem/selftests/igt_gem_utils.h    |    3 +
 drivers/gpu/drm/i915/gt/intel_engine_cs.c     |    1 +
 drivers/gpu/drm/i915/gt/intel_engine_pool.c   |    1 +
 .../drm/i915/gt/intel_execlists_submission.c  | 2485 ++++++++++++
 .../drm/i915/gt/intel_execlists_submission.h  |   58 +
 drivers/gpu/drm/i915/gt/intel_lrc.c           | 3132 +--------------
 drivers/gpu/drm/i915/gt/intel_lrc.h           |   59 +-
 .../gpu/drm/i915/gt/intel_virtual_engine.c    |  360 ++
 .../gpu/drm/i915/gt/intel_virtual_engine.h    |   48 +
 .../drm/i915/gt/intel_virtual_engine_types.h  |   57 +
 drivers/gpu/drm/i915/gt/selftest_execlists.c  | 3316 ++++++++++++++++
 drivers/gpu/drm/i915/gt/selftest_lrc.c        | 3346 +----------------
 drivers/gpu/drm/i915/gt/selftest_mocs.c       |   30 +-
 .../gpu/drm/i915/gt/uc/intel_guc_submission.c |    1 +
 drivers/gpu/drm/i915/gvt/scheduler.c          |    1 +
 drivers/gpu/drm/i915/i915_perf.c              |    1 +
 19 files changed, 6547 insertions(+), 6386 deletions(-)
 create mode 100644 drivers/gpu/drm/i915/gt/intel_execlists_submission.c
 create mode 100644 drivers/gpu/drm/i915/gt/intel_execlists_submission.h
 create mode 100644 drivers/gpu/drm/i915/gt/intel_virtual_engine.c
 create mode 100644 drivers/gpu/drm/i915/gt/intel_virtual_engine.h
 create mode 100644 drivers/gpu/drm/i915/gt/intel_virtual_engine_types.h
 create mode 100644 drivers/gpu/drm/i915/gt/selftest_execlists.c

-- 
2.23.0


* [Intel-gfx] [RFC 1/5] drm/i915: introduce logical_ring and lr_context naming
  2019-12-11 21:12 [Intel-gfx] [RFC 0/5] Split up intel_lrc.c Daniele Ceraolo Spurio
@ 2019-12-11 21:12 ` Daniele Ceraolo Spurio
  2019-12-11 21:20   ` Chris Wilson
  2019-12-11 21:12 ` [Intel-gfx] [RFC 2/5] drm/i915: Move struct intel_virtual_engine to its own header Daniele Ceraolo Spurio
                   ` (6 subsequent siblings)
  7 siblings, 1 reply; 21+ messages in thread
From: Daniele Ceraolo Spurio @ 2019-12-11 21:12 UTC (permalink / raw)
  To: intel-gfx

Ahead of splitting the code specific to execlists submission out to its
own file, we can re-organize the code within the intel_lrc file to make
that separation clearer. To achieve this, a number of functions have
been split and/or renamed to use the "logical_ring" and "lr_context"
prefixes, for engine-related setup and LRC management respectively.
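
For example (as per the diff below):

  __execlists_context_alloc()    -> lr_context_alloc()
  __execlists_context_pin()      -> lr_context_pin()
  __execlists_context_fini()     -> lr_context_fini()
  execlists_context_unpin()      -> intel_lr_context_unpin()
  execlists_init_reg_state()     -> lr_context_init_reg_state()
  __execlists_update_reg_state() -> lr_context_update_reg_state()
  __execlists_reset_reg_state()  -> lr_context_reset_reg_state()
  restore_default_state()        -> lr_context_restore_default_state()
  enable_execlists()             -> logical_ring_enable()
  execlists_resume()             -> logical_ring_resume()

with logical_ring_init(), logical_ring_setup() and logical_ring_destroy()
factored out of the intel_execlists_submission_* entry points.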

Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
---
 drivers/gpu/drm/i915/gt/intel_lrc.c    | 154 ++++++++++++++-----------
 drivers/gpu/drm/i915/gt/selftest_lrc.c |  12 +-
 2 files changed, 93 insertions(+), 73 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
index 929f6bae4eba..6d6148e11fd0 100644
--- a/drivers/gpu/drm/i915/gt/intel_lrc.c
+++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
@@ -228,17 +228,17 @@ static struct virtual_engine *to_virtual_engine(struct intel_engine_cs *engine)
 	return container_of(engine, struct virtual_engine, base);
 }
 
-static int __execlists_context_alloc(struct intel_context *ce,
-				     struct intel_engine_cs *engine);
-
-static void execlists_init_reg_state(u32 *reg_state,
-				     const struct intel_context *ce,
-				     const struct intel_engine_cs *engine,
-				     const struct intel_ring *ring,
-				     bool close);
+static int lr_context_alloc(struct intel_context *ce,
+			    struct intel_engine_cs *engine);
+
+static void lr_context_init_reg_state(u32 *reg_state,
+				      const struct intel_context *ce,
+				      const struct intel_engine_cs *engine,
+				      const struct intel_ring *ring,
+				      bool close);
 static void
-__execlists_update_reg_state(const struct intel_context *ce,
-			     const struct intel_engine_cs *engine);
+lr_context_update_reg_state(const struct intel_context *ce,
+			    const struct intel_engine_cs *engine);
 
 static void mark_eio(struct i915_request *rq)
 {
@@ -1035,8 +1035,8 @@ execlists_check_context(const struct intel_context *ce,
 	WARN_ONCE(!valid, "Invalid lrc state found before submission\n");
 }
 
-static void restore_default_state(struct intel_context *ce,
-				  struct intel_engine_cs *engine)
+static void lr_context_restore_default_state(struct intel_context *ce,
+					     struct intel_engine_cs *engine)
 {
 	u32 *regs = ce->lrc_reg_state;
 
@@ -1045,7 +1045,7 @@ static void restore_default_state(struct intel_context *ce,
 		       engine->pinned_default_state + LRC_STATE_PN * PAGE_SIZE,
 		       engine->context_size - PAGE_SIZE);
 
-	execlists_init_reg_state(regs, ce, engine, ce->ring, false);
+	lr_context_init_reg_state(regs, ce, engine, ce->ring, false);
 }
 
 static void reset_active(struct i915_request *rq,
@@ -1081,8 +1081,8 @@ static void reset_active(struct i915_request *rq,
 	intel_ring_update_space(ce->ring);
 
 	/* Scrub the context image to prevent replaying the previous batch */
-	restore_default_state(ce, engine);
-	__execlists_update_reg_state(ce, engine);
+	lr_context_restore_default_state(ce, engine);
+	lr_context_update_reg_state(ce, engine);
 
 	/* We've switched away, so this should be a no-op, but intent matters */
 	ce->lrc_desc |= CTX_DESC_FORCE_RESTORE;
@@ -2378,7 +2378,7 @@ static void execlists_submit_request(struct i915_request *request)
 	spin_unlock_irqrestore(&engine->active.lock, flags);
 }
 
-static void __execlists_context_fini(struct intel_context *ce)
+static void lr_context_fini(struct intel_context *ce)
 {
 	intel_ring_put(ce->ring);
 	i915_vma_put(ce->state);
@@ -2392,7 +2392,7 @@ static void execlists_context_destroy(struct kref *kref)
 	GEM_BUG_ON(intel_context_is_pinned(ce));
 
 	if (ce->state)
-		__execlists_context_fini(ce);
+		lr_context_fini(ce);
 
 	intel_context_fini(ce);
 	intel_context_free(ce);
@@ -2423,7 +2423,7 @@ check_redzone(const void *vaddr, const struct intel_engine_cs *engine)
 			     engine->name);
 }
 
-static void execlists_context_unpin(struct intel_context *ce)
+static void intel_lr_context_unpin(struct intel_context *ce)
 {
 	check_redzone((void *)ce->lrc_reg_state - LRC_STATE_PN * PAGE_SIZE,
 		      ce->engine);
@@ -2433,8 +2433,8 @@ static void execlists_context_unpin(struct intel_context *ce)
 }
 
 static void
-__execlists_update_reg_state(const struct intel_context *ce,
-			     const struct intel_engine_cs *engine)
+lr_context_update_reg_state(const struct intel_context *ce,
+			    const struct intel_engine_cs *engine)
 {
 	struct intel_ring *ring = ce->ring;
 	u32 *regs = ce->lrc_reg_state;
@@ -2456,8 +2456,7 @@ __execlists_update_reg_state(const struct intel_context *ce,
 }
 
 static int
-__execlists_context_pin(struct intel_context *ce,
-			struct intel_engine_cs *engine)
+lr_context_pin(struct intel_context *ce, struct intel_engine_cs *engine)
 {
 	void *vaddr;
 	int ret;
@@ -2479,7 +2478,7 @@ __execlists_context_pin(struct intel_context *ce,
 
 	ce->lrc_desc = lrc_descriptor(ce, engine);
 	ce->lrc_reg_state = vaddr + LRC_STATE_PN * PAGE_SIZE;
-	__execlists_update_reg_state(ce, engine);
+	lr_context_update_reg_state(ce, engine);
 
 	return 0;
 
@@ -2491,12 +2490,12 @@ __execlists_context_pin(struct intel_context *ce,
 
 static int execlists_context_pin(struct intel_context *ce)
 {
-	return __execlists_context_pin(ce, ce->engine);
+	return lr_context_pin(ce, ce->engine);
 }
 
 static int execlists_context_alloc(struct intel_context *ce)
 {
-	return __execlists_context_alloc(ce, ce->engine);
+	return lr_context_alloc(ce, ce->engine);
 }
 
 static void execlists_context_reset(struct intel_context *ce)
@@ -2518,14 +2517,14 @@ static void execlists_context_reset(struct intel_context *ce)
 	 * simplicity, we just zero everything out.
 	 */
 	intel_ring_reset(ce->ring, 0);
-	__execlists_update_reg_state(ce, ce->engine);
+	lr_context_update_reg_state(ce, ce->engine);
 }
 
 static const struct intel_context_ops execlists_context_ops = {
 	.alloc = execlists_context_alloc,
 
 	.pin = execlists_context_pin,
-	.unpin = execlists_context_unpin,
+	.unpin = intel_lr_context_unpin,
 
 	.enter = intel_context_enter_engine,
 	.exit = intel_context_exit_engine,
@@ -2912,7 +2911,33 @@ static int intel_init_workaround_bb(struct intel_engine_cs *engine)
 	return ret;
 }
 
-static void enable_execlists(struct intel_engine_cs *engine)
+static int logical_ring_init(struct intel_engine_cs *engine)
+{
+	int ret;
+
+	ret = intel_engine_init_common(engine);
+	if (ret)
+		return ret;
+
+	if (intel_init_workaround_bb(engine))
+		/*
+		 * We continue even if we fail to initialize WA batch
+		 * because we only expect rare glitches but nothing
+		 * critical to prevent us from using GPU
+		 */
+		DRM_ERROR("WA batch buffer initialization failed\n");
+
+	return 0;
+}
+
+static void logical_ring_destroy(struct intel_engine_cs *engine)
+{
+	intel_engine_cleanup_common(engine);
+	lrc_destroy_wa_ctx(engine);
+	kfree(engine);
+}
+
+static void logical_ring_enable(struct intel_engine_cs *engine)
 {
 	u32 mode;
 
@@ -2946,7 +2971,7 @@ static bool unexpected_starting_state(struct intel_engine_cs *engine)
 	return unexpected;
 }
 
-static int execlists_resume(struct intel_engine_cs *engine)
+static int logical_ring_resume(struct intel_engine_cs *engine)
 {
 	intel_engine_apply_workarounds(engine);
 	intel_engine_apply_whitelist(engine);
@@ -2961,7 +2986,7 @@ static int execlists_resume(struct intel_engine_cs *engine)
 		intel_engine_dump(engine, &p, NULL);
 	}
 
-	enable_execlists(engine);
+	logical_ring_enable(engine);
 
 	return 0;
 }
@@ -3037,8 +3062,8 @@ static void reset_csb_pointers(struct intel_engine_cs *engine)
 			       &execlists->csb_status[reset_value]);
 }
 
-static void __execlists_reset_reg_state(const struct intel_context *ce,
-					const struct intel_engine_cs *engine)
+static void lr_context_reset_reg_state(const struct intel_context *ce,
+				       const struct intel_engine_cs *engine)
 {
 	u32 *regs = ce->lrc_reg_state;
 	int x;
@@ -3131,14 +3156,14 @@ static void __execlists_reset(struct intel_engine_cs *engine, bool stalled)
 	 * to recreate its own state.
 	 */
 	GEM_BUG_ON(!intel_context_is_pinned(ce));
-	restore_default_state(ce, engine);
+	lr_context_restore_default_state(ce, engine);
 
 out_replay:
 	GEM_TRACE("%s replay {head:%04x, tail:%04x}\n",
 		  engine->name, ce->ring->head, ce->ring->tail);
 	intel_ring_update_space(ce->ring);
-	__execlists_reset_reg_state(ce, engine);
-	__execlists_update_reg_state(ce, engine);
+	lr_context_reset_reg_state(ce, engine);
+	lr_context_update_reg_state(ce, engine);
 	ce->lrc_desc |= CTX_DESC_FORCE_RESTORE; /* paranoid: GPU was reset! */
 
 unwind:
@@ -3788,9 +3813,7 @@ static void execlists_destroy(struct intel_engine_cs *engine)
 {
 	execlists_shutdown(engine);
 
-	intel_engine_cleanup_common(engine);
-	lrc_destroy_wa_ctx(engine);
-	kfree(engine);
+	logical_ring_destroy(engine);
 }
 
 static void
@@ -3799,7 +3822,7 @@ logical_ring_default_vfuncs(struct intel_engine_cs *engine)
 	/* Default vfuncs which can be overriden by each engine. */
 
 	engine->destroy = execlists_destroy;
-	engine->resume = execlists_resume;
+	engine->resume = logical_ring_resume;
 
 	engine->reset.prepare = execlists_reset_prepare;
 	engine->reset.reset = execlists_reset;
@@ -3872,6 +3895,15 @@ static void rcs_submission_override(struct intel_engine_cs *engine)
 	}
 }
 
+static void logical_ring_setup(struct intel_engine_cs *engine)
+{
+	logical_ring_default_vfuncs(engine);
+	logical_ring_default_irqs(engine);
+
+	if (engine->class == RENDER_CLASS)
+		rcs_submission_override(engine);
+}
+
 int intel_execlists_submission_setup(struct intel_engine_cs *engine)
 {
 	tasklet_init(&engine->execlists.tasklet,
@@ -3879,11 +3911,7 @@ int intel_execlists_submission_setup(struct intel_engine_cs *engine)
 	timer_setup(&engine->execlists.timer, execlists_timeslice, 0);
 	timer_setup(&engine->execlists.preempt, execlists_preempt, 0);
 
-	logical_ring_default_vfuncs(engine);
-	logical_ring_default_irqs(engine);
-
-	if (engine->class == RENDER_CLASS)
-		rcs_submission_override(engine);
+	logical_ring_setup(engine);
 
 	return 0;
 }
@@ -3896,18 +3924,10 @@ int intel_execlists_submission_init(struct intel_engine_cs *engine)
 	u32 base = engine->mmio_base;
 	int ret;
 
-	ret = intel_engine_init_common(engine);
+	ret = logical_ring_init(engine);
 	if (ret)
 		return ret;
 
-	if (intel_init_workaround_bb(engine))
-		/*
-		 * We continue even if we fail to initialize WA batch
-		 * because we only expect rare glitches but nothing
-		 * critical to prevent us from using GPU
-		 */
-		DRM_ERROR("WA batch buffer initialization failed\n");
-
 	if (HAS_LOGICAL_RING_ELSQ(i915)) {
 		execlists->submit_reg = uncore->regs +
 			i915_mmio_reg_offset(RING_EXECLIST_SQ_CONTENTS(base));
@@ -4033,11 +4053,11 @@ static struct i915_ppgtt *vm_alias(struct i915_address_space *vm)
 		return i915_vm_to_ppgtt(vm);
 }
 
-static void execlists_init_reg_state(u32 *regs,
-				     const struct intel_context *ce,
-				     const struct intel_engine_cs *engine,
-				     const struct intel_ring *ring,
-				     bool close)
+static void lr_context_init_reg_state(u32 *regs,
+				      const struct intel_context *ce,
+				      const struct intel_engine_cs *engine,
+				      const struct intel_ring *ring,
+				      bool close)
 {
 	/*
 	 * A context is actually a big batch buffer with several
@@ -4105,7 +4125,7 @@ populate_lr_context(struct intel_context *ce,
 	/* The second page of the context object contains some fields which must
 	 * be set up prior to the first execution. */
 	regs = vaddr + LRC_STATE_PN * PAGE_SIZE;
-	execlists_init_reg_state(regs, ce, engine, ring, inhibit);
+	lr_context_init_reg_state(regs, ce, engine, ring, inhibit);
 	if (inhibit)
 		regs[CTX_CONTEXT_CONTROL] |=
 			_MASKED_BIT_ENABLE(CTX_CTRL_ENGINE_CTX_RESTORE_INHIBIT);
@@ -4117,8 +4137,8 @@ populate_lr_context(struct intel_context *ce,
 	return ret;
 }
 
-static int __execlists_context_alloc(struct intel_context *ce,
-				     struct intel_engine_cs *engine)
+static int lr_context_alloc(struct intel_context *ce,
+			    struct intel_engine_cs *engine)
 {
 	struct drm_i915_gem_object *ctx_obj;
 	struct intel_ring *ring;
@@ -4212,7 +4232,7 @@ static void virtual_context_destroy(struct kref *kref)
 	GEM_BUG_ON(__tasklet_is_scheduled(&ve->base.execlists.tasklet));
 
 	if (ve->context.state)
-		__execlists_context_fini(&ve->context);
+		lr_context_fini(&ve->context);
 	intel_context_fini(&ve->context);
 
 	kfree(ve->bonds);
@@ -4252,7 +4272,7 @@ static int virtual_context_pin(struct intel_context *ce)
 	int err;
 
 	/* Note: we must use a real engine class for setting up reg state */
-	err = __execlists_context_pin(ce, ve->siblings[0]);
+	err = lr_context_pin(ce, ve->siblings[0]);
 	if (err)
 		return err;
 
@@ -4284,7 +4304,7 @@ static void virtual_context_exit(struct intel_context *ce)
 
 static const struct intel_context_ops virtual_context_ops = {
 	.pin = virtual_context_pin,
-	.unpin = execlists_context_unpin,
+	.unpin = intel_lr_context_unpin,
 
 	.enter = virtual_context_enter,
 	.exit = virtual_context_exit,
@@ -4602,7 +4622,7 @@ intel_execlists_create_virtual(struct i915_gem_context *ctx,
 
 	ve->base.flags |= I915_ENGINE_IS_VIRTUAL;
 
-	err = __execlists_context_alloc(&ve->context, siblings[0]);
+	err = lr_context_alloc(&ve->context, siblings[0]);
 	if (err)
 		goto err_put;
 
@@ -4792,13 +4812,13 @@ void intel_lr_context_reset(struct intel_engine_cs *engine,
 	 * to recreate its own state.
 	 */
 	if (scrub)
-		restore_default_state(ce, engine);
+		lr_context_restore_default_state(ce, engine);
 
 	/* Rerun the request; its payload has been neutered (if guilty). */
 	ce->ring->head = head;
 	intel_ring_update_space(ce->ring);
 
-	__execlists_update_reg_state(ce, engine);
+	lr_context_update_reg_state(ce, engine);
 }
 
 bool
diff --git a/drivers/gpu/drm/i915/gt/selftest_lrc.c b/drivers/gpu/drm/i915/gt/selftest_lrc.c
index ac8b9116d307..b4537497c3be 100644
--- a/drivers/gpu/drm/i915/gt/selftest_lrc.c
+++ b/drivers/gpu/drm/i915/gt/selftest_lrc.c
@@ -169,7 +169,7 @@ static int live_unlite_restore(struct intel_gt *gt, int prio)
 		}
 		GEM_BUG_ON(!ce[1]->ring->size);
 		intel_ring_reset(ce[1]->ring, ce[1]->ring->size / 2);
-		__execlists_update_reg_state(ce[1], engine);
+		lr_context_update_reg_state(ce[1], engine);
 
 		rq[0] = igt_spinner_create_request(&spin, ce[0], MI_ARB_CHECK);
 		if (IS_ERR(rq[0])) {
@@ -3406,11 +3406,11 @@ static int live_lrc_layout(void *arg)
 		hw += LRC_STATE_PN * PAGE_SIZE / sizeof(*hw);
 
 		lrc = memset(mem, 0, PAGE_SIZE);
-		execlists_init_reg_state(lrc,
-					 engine->kernel_context,
-					 engine,
-					 engine->kernel_context->ring,
-					 true);
+		lr_context_init_reg_state(lrc,
+					  engine->kernel_context,
+					  engine,
+					  engine->kernel_context->ring,
+					  true);
 
 		dw = 0;
 		do {
-- 
2.23.0


* [Intel-gfx] [RFC 2/5] drm/i915: Move struct intel_virtual_engine to its own header
  2019-12-11 21:12 [Intel-gfx] [RFC 0/5] Split up intel_lrc.c Daniele Ceraolo Spurio
  2019-12-11 21:12 ` [Intel-gfx] [RFC 1/5] drm/i915: introduce logical_ring and lr_context naming Daniele Ceraolo Spurio
@ 2019-12-11 21:12 ` Daniele Ceraolo Spurio
  2019-12-11 21:22   ` Chris Wilson
  2019-12-11 21:12 ` [Intel-gfx] [RFC 3/5] drm/i915: split out virtual engine code Daniele Ceraolo Spurio
                   ` (5 subsequent siblings)
  7 siblings, 1 reply; 21+ messages in thread
From: Daniele Ceraolo Spurio @ 2019-12-11 21:12 UTC (permalink / raw)
  To: intel-gfx

From: Matthew Brost <matthew.brost@intel.com>

The upcoming GuC submission code will need to use the structure, so
move it out to its own header.
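
As an illustration (a hypothetical sketch, not part of this patch), a
consumer such as the GuC submission code can then pull in just the types
header and resolve the containing virtual engine the same way
intel_lrc.c does:

  #include "intel_virtual_engine_types.h"

  static struct intel_virtual_engine *
  guc_to_virtual_engine(struct intel_engine_cs *engine)
  {
          GEM_BUG_ON(!intel_engine_is_virtual(engine));
          return container_of(engine, struct intel_virtual_engine, base);
  }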

Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: John Harrison <john.c.harrison@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
---
 drivers/gpu/drm/i915/gt/intel_lrc.c           | 103 ++++++------------
 .../drm/i915/gt/intel_virtual_engine_types.h  |  57 ++++++++++
 2 files changed, 92 insertions(+), 68 deletions(-)
 create mode 100644 drivers/gpu/drm/i915/gt/intel_virtual_engine_types.h

diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
index 6d6148e11fd0..e6dea2d3a5c0 100644
--- a/drivers/gpu/drm/i915/gt/intel_lrc.c
+++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
@@ -147,6 +147,7 @@
 #include "intel_mocs.h"
 #include "intel_reset.h"
 #include "intel_ring.h"
+#include "intel_virtual_engine_types.h"
 #include "intel_workarounds.h"
 
 #define RING_EXECLIST_QFULL		(1 << 0x2)
@@ -180,52 +181,11 @@
 #define WA_TAIL_DWORDS 2
 #define WA_TAIL_BYTES (sizeof(u32) * WA_TAIL_DWORDS)
 
-struct virtual_engine {
-	struct intel_engine_cs base;
-	struct intel_context context;
-
-	/*
-	 * We allow only a single request through the virtual engine at a time
-	 * (each request in the timeline waits for the completion fence of
-	 * the previous before being submitted). By restricting ourselves to
-	 * only submitting a single request, each request is placed on to a
-	 * physical to maximise load spreading (by virtue of the late greedy
-	 * scheduling -- each real engine takes the next available request
-	 * upon idling).
-	 */
-	struct i915_request *request;
-
-	/*
-	 * We keep a rbtree of available virtual engines inside each physical
-	 * engine, sorted by priority. Here we preallocate the nodes we need
-	 * for the virtual engine, indexed by physical_engine->id.
-	 */
-	struct ve_node {
-		struct rb_node rb;
-		int prio;
-	} nodes[I915_NUM_ENGINES];
-
-	/*
-	 * Keep track of bonded pairs -- restrictions upon on our selection
-	 * of physical engines any particular request may be submitted to.
-	 * If we receive a submit-fence from a master engine, we will only
-	 * use one of sibling_mask physical engines.
-	 */
-	struct ve_bond {
-		const struct intel_engine_cs *master;
-		intel_engine_mask_t sibling_mask;
-	} *bonds;
-	unsigned int num_bonds;
-
-	/* And finally, which physical engines this virtual engine maps onto. */
-	unsigned int num_siblings;
-	struct intel_engine_cs *siblings[0];
-};
-
-static struct virtual_engine *to_virtual_engine(struct intel_engine_cs *engine)
+static struct intel_virtual_engine *
+to_virtual_engine(struct intel_engine_cs *engine)
 {
 	GEM_BUG_ON(!intel_engine_is_virtual(engine));
-	return container_of(engine, struct virtual_engine, base);
+	return container_of(engine, struct intel_virtual_engine, base);
 }
 
 static int lr_context_alloc(struct intel_context *ce,
@@ -384,7 +344,7 @@ static inline bool need_preempt(const struct intel_engine_cs *engine,
 		return true;
 
 	if (rb) {
-		struct virtual_engine *ve =
+		struct intel_virtual_engine *ve =
 			rb_entry(rb, typeof(*ve), nodes[engine->id].rb);
 		bool preempt = false;
 
@@ -1144,7 +1104,8 @@ execlists_schedule_in(struct i915_request *rq, int idx)
 
 static void kick_siblings(struct i915_request *rq, struct intel_context *ce)
 {
-	struct virtual_engine *ve = container_of(ce, typeof(*ve), context);
+	struct intel_virtual_engine *ve =
+		container_of(ce, typeof(*ve), context);
 	struct i915_request *next = READ_ONCE(ve->request);
 
 	if (next && next->execution_mask & ~rq->execution_mask)
@@ -1448,7 +1409,7 @@ static void virtual_update_register_offsets(u32 *regs,
 	set_offsets(regs, reg_offsets(engine), engine);
 }
 
-static bool virtual_matches(const struct virtual_engine *ve,
+static bool virtual_matches(const struct intel_virtual_engine *ve,
 			    const struct i915_request *rq,
 			    const struct intel_engine_cs *engine)
 {
@@ -1473,7 +1434,7 @@ static bool virtual_matches(const struct virtual_engine *ve,
 	return true;
 }
 
-static void virtual_xfer_breadcrumbs(struct virtual_engine *ve,
+static void virtual_xfer_breadcrumbs(struct intel_virtual_engine *ve,
 				     struct intel_engine_cs *engine)
 {
 	struct intel_engine_cs *old = ve->siblings[0];
@@ -1670,7 +1631,7 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
 	 */
 
 	for (rb = rb_first_cached(&execlists->virtual); rb; ) {
-		struct virtual_engine *ve =
+		struct intel_virtual_engine *ve =
 			rb_entry(rb, typeof(*ve), nodes[engine->id].rb);
 		struct i915_request *rq = READ_ONCE(ve->request);
 
@@ -1786,7 +1747,7 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
 	}
 
 	while (rb) { /* XXX virtual is always taking precedence */
-		struct virtual_engine *ve =
+		struct intel_virtual_engine *ve =
 			rb_entry(rb, typeof(*ve), nodes[engine->id].rb);
 		struct i915_request *rq;
 
@@ -3237,7 +3198,7 @@ static void execlists_cancel_requests(struct intel_engine_cs *engine)
 
 	/* Cancel all attached virtual engines */
 	while ((rb = rb_first_cached(&execlists->virtual))) {
-		struct virtual_engine *ve =
+		struct intel_virtual_engine *ve =
 			rb_entry(rb, typeof(*ve), nodes[engine->id].rb);
 
 		rb_erase_cached(rb, &execlists->virtual);
@@ -4198,14 +4159,14 @@ static int lr_context_alloc(struct intel_context *ce,
 	return ret;
 }
 
-static struct list_head *virtual_queue(struct virtual_engine *ve)
+static struct list_head *virtual_queue(struct intel_virtual_engine *ve)
 {
 	return &ve->base.execlists.default_priolist.requests[0];
 }
 
 static void virtual_context_destroy(struct kref *kref)
 {
-	struct virtual_engine *ve =
+	struct intel_virtual_engine *ve =
 		container_of(kref, typeof(*ve), context.ref);
 	unsigned int n;
 
@@ -4239,7 +4200,7 @@ static void virtual_context_destroy(struct kref *kref)
 	kfree(ve);
 }
 
-static void virtual_engine_initial_hint(struct virtual_engine *ve)
+static void virtual_engine_initial_hint(struct intel_virtual_engine *ve)
 {
 	int swp;
 
@@ -4268,7 +4229,8 @@ static void virtual_engine_initial_hint(struct virtual_engine *ve)
 
 static int virtual_context_pin(struct intel_context *ce)
 {
-	struct virtual_engine *ve = container_of(ce, typeof(*ve), context);
+	struct intel_virtual_engine *ve =
+		container_of(ce, typeof(*ve), context);
 	int err;
 
 	/* Note: we must use a real engine class for setting up reg state */
@@ -4282,7 +4244,8 @@ static int virtual_context_pin(struct intel_context *ce)
 
 static void virtual_context_enter(struct intel_context *ce)
 {
-	struct virtual_engine *ve = container_of(ce, typeof(*ve), context);
+	struct intel_virtual_engine *ve =
+		container_of(ce, typeof(*ve), context);
 	unsigned int n;
 
 	for (n = 0; n < ve->num_siblings; n++)
@@ -4293,7 +4256,8 @@ static void virtual_context_enter(struct intel_context *ce)
 
 static void virtual_context_exit(struct intel_context *ce)
 {
-	struct virtual_engine *ve = container_of(ce, typeof(*ve), context);
+	struct intel_virtual_engine *ve =
+		container_of(ce, typeof(*ve), context);
 	unsigned int n;
 
 	intel_timeline_exit(ce->timeline);
@@ -4312,7 +4276,8 @@ static const struct intel_context_ops virtual_context_ops = {
 	.destroy = virtual_context_destroy,
 };
 
-static intel_engine_mask_t virtual_submission_mask(struct virtual_engine *ve)
+static intel_engine_mask_t
+virtual_submission_mask(struct intel_virtual_engine *ve)
 {
 	struct i915_request *rq;
 	intel_engine_mask_t mask;
@@ -4339,7 +4304,8 @@ static intel_engine_mask_t virtual_submission_mask(struct virtual_engine *ve)
 
 static void virtual_submission_tasklet(unsigned long data)
 {
-	struct virtual_engine * const ve = (struct virtual_engine *)data;
+	struct intel_virtual_engine * const ve =
+		(struct intel_virtual_engine *)data;
 	const int prio = ve->base.execlists.queue_priority_hint;
 	intel_engine_mask_t mask;
 	unsigned int n;
@@ -4419,7 +4385,7 @@ static void virtual_submission_tasklet(unsigned long data)
 
 static void virtual_submit_request(struct i915_request *rq)
 {
-	struct virtual_engine *ve = to_virtual_engine(rq->engine);
+	struct intel_virtual_engine *ve = to_virtual_engine(rq->engine);
 	struct i915_request *old;
 	unsigned long flags;
 
@@ -4458,7 +4424,7 @@ static void virtual_submit_request(struct i915_request *rq)
 }
 
 static struct ve_bond *
-virtual_find_bond(struct virtual_engine *ve,
+virtual_find_bond(struct intel_virtual_engine *ve,
 		  const struct intel_engine_cs *master)
 {
 	int i;
@@ -4474,7 +4440,7 @@ virtual_find_bond(struct virtual_engine *ve,
 static void
 virtual_bond_execute(struct i915_request *rq, struct dma_fence *signal)
 {
-	struct virtual_engine *ve = to_virtual_engine(rq->engine);
+	struct intel_virtual_engine *ve = to_virtual_engine(rq->engine);
 	intel_engine_mask_t allowed, exec;
 	struct ve_bond *bond;
 
@@ -4498,7 +4464,7 @@ intel_execlists_create_virtual(struct i915_gem_context *ctx,
 			       struct intel_engine_cs **siblings,
 			       unsigned int count)
 {
-	struct virtual_engine *ve;
+	struct intel_virtual_engine *ve;
 	unsigned int n;
 	int err;
 
@@ -4639,7 +4605,7 @@ struct intel_context *
 intel_execlists_clone_virtual(struct i915_gem_context *ctx,
 			      struct intel_engine_cs *src)
 {
-	struct virtual_engine *se = to_virtual_engine(src);
+	struct intel_virtual_engine *se = to_virtual_engine(src);
 	struct intel_context *dst;
 
 	dst = intel_execlists_create_virtual(ctx,
@@ -4649,7 +4615,8 @@ intel_execlists_clone_virtual(struct i915_gem_context *ctx,
 		return dst;
 
 	if (se->num_bonds) {
-		struct virtual_engine *de = to_virtual_engine(dst->engine);
+		struct intel_virtual_engine *de =
+			to_virtual_engine(dst->engine);
 
 		de->bonds = kmemdup(se->bonds,
 				    sizeof(*se->bonds) * se->num_bonds,
@@ -4669,7 +4636,7 @@ int intel_virtual_engine_attach_bond(struct intel_engine_cs *engine,
 				     const struct intel_engine_cs *master,
 				     const struct intel_engine_cs *sibling)
 {
-	struct virtual_engine *ve = to_virtual_engine(engine);
+	struct intel_virtual_engine *ve = to_virtual_engine(engine);
 	struct ve_bond *bond;
 	int n;
 
@@ -4705,7 +4672,7 @@ struct intel_engine_cs *
 intel_virtual_engine_get_sibling(struct intel_engine_cs *engine,
 				 unsigned int sibling)
 {
-	struct virtual_engine *ve = to_virtual_engine(engine);
+	struct intel_virtual_engine *ve = to_virtual_engine(engine);
 
 	if (sibling >= ve->num_siblings)
 		return NULL;
@@ -4773,7 +4740,7 @@ void intel_execlists_show_requests(struct intel_engine_cs *engine,
 	last = NULL;
 	count = 0;
 	for (rb = rb_first_cached(&execlists->virtual); rb; rb = rb_next(rb)) {
-		struct virtual_engine *ve =
+		struct intel_virtual_engine *ve =
 			rb_entry(rb, typeof(*ve), nodes[engine->id].rb);
 		struct i915_request *rq = READ_ONCE(ve->request);
 
diff --git a/drivers/gpu/drm/i915/gt/intel_virtual_engine_types.h b/drivers/gpu/drm/i915/gt/intel_virtual_engine_types.h
new file mode 100644
index 000000000000..9ba5f0e6e395
--- /dev/null
+++ b/drivers/gpu/drm/i915/gt/intel_virtual_engine_types.h
@@ -0,0 +1,57 @@
+/*
+ * SPDX-License-Identifier: MIT
+ *
+ * Copyright © 2019 Intel Corporation
+ */
+
+#ifndef __INTEL_VIRTUAL_ENGINE_TYPES__
+#define __INTEL_VIRTUAL_ENGINE_TYPES__
+
+#include "intel_context_types.h"
+#include "intel_engine_types.h"
+
+struct i915_request;
+
+struct intel_virtual_engine {
+	struct intel_engine_cs base;
+	struct intel_context context;
+
+	/*
+	 * We allow only a single request through the virtual engine at a time
+	 * (each request in the timeline waits for the completion fence of
+	 * the previous before being submitted). By restricting ourselves to
+	 * only submitting a single request, each request is placed on to a
+	 * physical to maximise load spreading (by virtue of the late greedy
+	 * scheduling -- each real engine takes the next available request
+	 * upon idling).
+	 */
+	struct i915_request *request;
+
+	/*
+	 * We keep a rbtree of available virtual engines inside each physical
+	 * engine, sorted by priority. Here we preallocate the nodes we need
+	 * for the virtual engine, indexed by physical_engine->id.
+	 */
+	struct ve_node {
+		struct rb_node rb;
+		int prio;
+	} nodes[I915_NUM_ENGINES];
+
+	/*
+	 * Keep track of bonded pairs -- restrictions upon on our selection
+	 * of physical engines any particular request may be submitted to.
+	 * If we receive a submit-fence from a master engine, we will only
+	 * use one of sibling_mask physical engines.
+	 */
+	struct ve_bond {
+		const struct intel_engine_cs *master;
+		intel_engine_mask_t sibling_mask;
+	} *bonds;
+	unsigned int num_bonds;
+
+	/* And finally, which physical engines this virtual engine maps onto. */
+	unsigned int num_siblings;
+	struct intel_engine_cs *siblings[0];
+};
+
+#endif /* __INTEL_VIRTUAL_ENGINE_TYPES__ */
-- 
2.23.0


* [Intel-gfx] [RFC 3/5] drm/i915: split out virtual engine code
  2019-12-11 21:12 [Intel-gfx] [RFC 0/5] Split up intel_lrc.c Daniele Ceraolo Spurio
  2019-12-11 21:12 ` [Intel-gfx] [RFC 1/5] drm/i915: introduce logical_ring and lr_context naming Daniele Ceraolo Spurio
  2019-12-11 21:12 ` [Intel-gfx] [RFC 2/5] drm/i915: Move struct intel_virtual_engine to its own header Daniele Ceraolo Spurio
@ 2019-12-11 21:12 ` Daniele Ceraolo Spurio
  2019-12-11 21:22   ` Chris Wilson
  2019-12-11 21:12 ` [Intel-gfx] [RFC 4/5] drm/i915: move execlists selftests to their own file Daniele Ceraolo Spurio
                   ` (4 subsequent siblings)
  7 siblings, 1 reply; 21+ messages in thread
From: Daniele Ceraolo Spurio @ 2019-12-11 21:12 UTC (permalink / raw)
  To: intel-gfx

Having the virtual engine handling in its own file will make it easier
to call it from, or modify it for, the GuC implementation without
leaking the changes into the context management or execlists submission
paths.
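
In summary (names as per the diff below), the split ends up as:

  gt/intel_virtual_engine.c/.h (generic):
    intel_virtual_engine_create()    (was intel_execlists_create_virtual())
    intel_virtual_engine_clone()     (was intel_execlists_clone_virtual())
    intel_virtual_engine_queue()     (was virtual_queue())
    intel_virtual_engine_find_bond() (was virtual_find_bond())
    virtual_context_ops and its pin/enter/exit/destroy callbacks

  gt/intel_lrc.c (execlists backend):
    virtual_submit_request(), virtual_bond_execute() and
    virtual_submission_tasklet(), wired up via the new
    intel_execlists_virtual_submission_init() hook

  shared LRC helpers, now exported from intel_lrc.h:
    intel_lr_context_alloc/pin/unpin/fini(),
    intel_lr_context_set_register_offsets()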

Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
---
 drivers/gpu/drm/i915/Makefile                 |   1 +
 drivers/gpu/drm/i915/gem/i915_gem_context.c   |   5 +-
 drivers/gpu/drm/i915/gt/intel_engine_pool.c   |   1 +
 drivers/gpu/drm/i915/gt/intel_lrc.c           | 403 ++----------------
 drivers/gpu/drm/i915/gt/intel_lrc.h           |  29 +-
 .../gpu/drm/i915/gt/intel_virtual_engine.c    | 359 ++++++++++++++++
 .../gpu/drm/i915/gt/intel_virtual_engine.h    |  48 +++
 drivers/gpu/drm/i915/gt/selftest_lrc.c        |  13 +-
 8 files changed, 457 insertions(+), 402 deletions(-)
 create mode 100644 drivers/gpu/drm/i915/gt/intel_virtual_engine.c
 create mode 100644 drivers/gpu/drm/i915/gt/intel_virtual_engine.h

diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
index e0fd10c0cfb8..79f5ef5acd4c 100644
--- a/drivers/gpu/drm/i915/Makefile
+++ b/drivers/gpu/drm/i915/Makefile
@@ -98,6 +98,7 @@ gt-y += \
 	gt/intel_rps.o \
 	gt/intel_sseu.o \
 	gt/intel_timeline.o \
+	gt/intel_virtual_engine.o \
 	gt/intel_workarounds.o
 # autogenerated null render state
 gt-y += \
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c
index 46b4d1d643f8..6461370223b8 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
@@ -74,6 +74,7 @@
 #include "gt/intel_engine_user.h"
 #include "gt/intel_lrc_reg.h"
 #include "gt/intel_ring.h"
+#include "gt/intel_virtual_engine.h"
 
 #include "i915_gem_context.h"
 #include "i915_globals.h"
@@ -1536,7 +1537,7 @@ set_engines__load_balance(struct i915_user_extension __user *base, void *data)
 		}
 	}
 
-	ce = intel_execlists_create_virtual(set->ctx, siblings, n);
+	ce = intel_virtual_engine_create(set->ctx, siblings, n);
 	if (IS_ERR(ce)) {
 		err = PTR_ERR(ce);
 		goto out_siblings;
@@ -1999,7 +2000,7 @@ static int clone_engines(struct i915_gem_context *dst,
 		 */
 		if (intel_engine_is_virtual(engine))
 			clone->engines[n] =
-				intel_execlists_clone_virtual(dst, engine);
+				intel_virtual_engine_clone(dst, engine);
 		else
 			clone->engines[n] = intel_context_create(dst, engine);
 		if (IS_ERR_OR_NULL(clone->engines[n])) {
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_pool.c b/drivers/gpu/drm/i915/gt/intel_engine_pool.c
index 397186818305..33ab0e5bfa41 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_pool.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_pool.c
@@ -9,6 +9,7 @@
 #include "i915_drv.h"
 #include "intel_engine_pm.h"
 #include "intel_engine_pool.h"
+#include "intel_virtual_engine.h"
 
 static struct intel_engine_cs *to_engine(struct intel_engine_pool *pool)
 {
diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
index e6dea2d3a5c0..3afae9a44911 100644
--- a/drivers/gpu/drm/i915/gt/intel_lrc.c
+++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
@@ -147,7 +147,7 @@
 #include "intel_mocs.h"
 #include "intel_reset.h"
 #include "intel_ring.h"
-#include "intel_virtual_engine_types.h"
+#include "intel_virtual_engine.h"
 #include "intel_workarounds.h"
 
 #define RING_EXECLIST_QFULL		(1 << 0x2)
@@ -181,16 +181,6 @@
 #define WA_TAIL_DWORDS 2
 #define WA_TAIL_BYTES (sizeof(u32) * WA_TAIL_DWORDS)
 
-static struct intel_virtual_engine *
-to_virtual_engine(struct intel_engine_cs *engine)
-{
-	GEM_BUG_ON(!intel_engine_is_virtual(engine));
-	return container_of(engine, struct intel_virtual_engine, base);
-}
-
-static int lr_context_alloc(struct intel_context *ce,
-			    struct intel_engine_cs *engine);
-
 static void lr_context_init_reg_state(u32 *reg_state,
 				      const struct intel_context *ce,
 				      const struct intel_engine_cs *engine,
@@ -805,6 +795,12 @@ static const u8 *reg_offsets(const struct intel_engine_cs *engine)
 	}
 }
 
+u32 *intel_lr_context_set_register_offsets(u32 *regs,
+					   const struct intel_engine_cs *engine)
+{
+	return set_offsets(regs, reg_offsets(engine), engine);
+}
+
 static struct i915_request *
 __unwind_incomplete_requests(struct intel_engine_cs *engine)
 {
@@ -1403,12 +1399,6 @@ static bool can_merge_rq(const struct i915_request *prev,
 	return true;
 }
 
-static void virtual_update_register_offsets(u32 *regs,
-					    struct intel_engine_cs *engine)
-{
-	set_offsets(regs, reg_offsets(engine), engine);
-}
-
 static bool virtual_matches(const struct intel_virtual_engine *ve,
 			    const struct i915_request *rq,
 			    const struct intel_engine_cs *engine)
@@ -1802,8 +1792,8 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
 				GEM_BUG_ON(READ_ONCE(ve->context.inflight));
 
 				if (!intel_engine_has_relative_mmio(engine))
-					virtual_update_register_offsets(regs,
-									engine);
+					intel_lr_context_set_register_offsets(regs,
+									      engine);
 
 				if (!list_empty(&ve->context.signals))
 					virtual_xfer_breadcrumbs(ve, engine);
@@ -2339,12 +2329,6 @@ static void execlists_submit_request(struct i915_request *request)
 	spin_unlock_irqrestore(&engine->active.lock, flags);
 }
 
-static void lr_context_fini(struct intel_context *ce)
-{
-	intel_ring_put(ce->ring);
-	i915_vma_put(ce->state);
-}
-
 static void execlists_context_destroy(struct kref *kref)
 {
 	struct intel_context *ce = container_of(kref, typeof(*ce), ref);
@@ -2353,7 +2337,7 @@ static void execlists_context_destroy(struct kref *kref)
 	GEM_BUG_ON(intel_context_is_pinned(ce));
 
 	if (ce->state)
-		lr_context_fini(ce);
+		intel_lr_context_fini(ce);
 
 	intel_context_fini(ce);
 	intel_context_free(ce);
@@ -2384,7 +2368,7 @@ check_redzone(const void *vaddr, const struct intel_engine_cs *engine)
 			     engine->name);
 }
 
-static void intel_lr_context_unpin(struct intel_context *ce)
+void intel_lr_context_unpin(struct intel_context *ce)
 {
 	check_redzone((void *)ce->lrc_reg_state - LRC_STATE_PN * PAGE_SIZE,
 		      ce->engine);
@@ -2416,8 +2400,9 @@ lr_context_update_reg_state(const struct intel_context *ce,
 	}
 }
 
-static int
-lr_context_pin(struct intel_context *ce, struct intel_engine_cs *engine)
+int
+intel_lr_context_pin(struct intel_context *ce,
+		     struct intel_engine_cs *engine)
 {
 	void *vaddr;
 	int ret;
@@ -2451,12 +2436,12 @@ lr_context_pin(struct intel_context *ce, struct intel_engine_cs *engine)
 
 static int execlists_context_pin(struct intel_context *ce)
 {
-	return lr_context_pin(ce, ce->engine);
+	return intel_lr_context_pin(ce, ce->engine);
 }
 
 static int execlists_context_alloc(struct intel_context *ce)
 {
-	return lr_context_alloc(ce, ce->engine);
+	return intel_lr_context_alloc(ce, ce->engine);
 }
 
 static void execlists_context_reset(struct intel_context *ce)
@@ -4030,7 +4015,7 @@ static void lr_context_init_reg_state(u32 *regs,
 	 *
 	 * Must keep consistent with virtual_update_register_offsets().
 	 */
-	u32 *bbe = set_offsets(regs, reg_offsets(engine), engine);
+	u32 *bbe = intel_lr_context_set_register_offsets(regs, engine);
 
 	if (close) { /* Close the batch; used mainly by live_lrc_layout() */
 		*bbe = MI_BATCH_BUFFER_END;
@@ -4098,8 +4083,8 @@ populate_lr_context(struct intel_context *ce,
 	return ret;
 }
 
-static int lr_context_alloc(struct intel_context *ce,
-			    struct intel_engine_cs *engine)
+int intel_lr_context_alloc(struct intel_context *ce,
+			   struct intel_engine_cs *engine)
 {
 	struct drm_i915_gem_object *ctx_obj;
 	struct intel_ring *ring;
@@ -4159,123 +4144,12 @@ static int lr_context_alloc(struct intel_context *ce,
 	return ret;
 }
 
-static struct list_head *virtual_queue(struct intel_virtual_engine *ve)
+void intel_lr_context_fini(struct intel_context *ce)
 {
-	return &ve->base.execlists.default_priolist.requests[0];
-}
-
-static void virtual_context_destroy(struct kref *kref)
-{
-	struct intel_virtual_engine *ve =
-		container_of(kref, typeof(*ve), context.ref);
-	unsigned int n;
-
-	GEM_BUG_ON(!list_empty(virtual_queue(ve)));
-	GEM_BUG_ON(ve->request);
-	GEM_BUG_ON(ve->context.inflight);
-
-	for (n = 0; n < ve->num_siblings; n++) {
-		struct intel_engine_cs *sibling = ve->siblings[n];
-		struct rb_node *node = &ve->nodes[sibling->id].rb;
-		unsigned long flags;
-
-		if (RB_EMPTY_NODE(node))
-			continue;
-
-		spin_lock_irqsave(&sibling->active.lock, flags);
-
-		/* Detachment is lazily performed in the execlists tasklet */
-		if (!RB_EMPTY_NODE(node))
-			rb_erase_cached(node, &sibling->execlists.virtual);
-
-		spin_unlock_irqrestore(&sibling->active.lock, flags);
-	}
-	GEM_BUG_ON(__tasklet_is_scheduled(&ve->base.execlists.tasklet));
-
-	if (ve->context.state)
-		lr_context_fini(&ve->context);
-	intel_context_fini(&ve->context);
-
-	kfree(ve->bonds);
-	kfree(ve);
-}
-
-static void virtual_engine_initial_hint(struct intel_virtual_engine *ve)
-{
-	int swp;
-
-	/*
-	 * Pick a random sibling on starting to help spread the load around.
-	 *
-	 * New contexts are typically created with exactly the same order
-	 * of siblings, and often started in batches. Due to the way we iterate
-	 * the array of sibling when submitting requests, sibling[0] is
-	 * prioritised for dequeuing. If we make sure that sibling[0] is fairly
-	 * randomised across the system, we also help spread the load by the
-	 * first engine we inspect being different each time.
-	 *
-	 * NB This does not force us to execute on this engine, it will just
-	 * typically be the first we inspect for submission.
-	 */
-	swp = prandom_u32_max(ve->num_siblings);
-	if (!swp)
-		return;
-
-	swap(ve->siblings[swp], ve->siblings[0]);
-	if (!intel_engine_has_relative_mmio(ve->siblings[0]))
-		virtual_update_register_offsets(ve->context.lrc_reg_state,
-						ve->siblings[0]);
-}
-
-static int virtual_context_pin(struct intel_context *ce)
-{
-	struct intel_virtual_engine *ve =
-		container_of(ce, typeof(*ve), context);
-	int err;
-
-	/* Note: we must use a real engine class for setting up reg state */
-	err = lr_context_pin(ce, ve->siblings[0]);
-	if (err)
-		return err;
-
-	virtual_engine_initial_hint(ve);
-	return 0;
-}
-
-static void virtual_context_enter(struct intel_context *ce)
-{
-	struct intel_virtual_engine *ve =
-		container_of(ce, typeof(*ve), context);
-	unsigned int n;
-
-	for (n = 0; n < ve->num_siblings; n++)
-		intel_engine_pm_get(ve->siblings[n]);
-
-	intel_timeline_enter(ce->timeline);
-}
-
-static void virtual_context_exit(struct intel_context *ce)
-{
-	struct intel_virtual_engine *ve =
-		container_of(ce, typeof(*ve), context);
-	unsigned int n;
-
-	intel_timeline_exit(ce->timeline);
-
-	for (n = 0; n < ve->num_siblings; n++)
-		intel_engine_pm_put(ve->siblings[n]);
+	intel_ring_put(ce->ring);
+	i915_vma_put(ce->state);
 }
 
-static const struct intel_context_ops virtual_context_ops = {
-	.pin = virtual_context_pin,
-	.unpin = intel_lr_context_unpin,
-
-	.enter = virtual_context_enter,
-	.exit = virtual_context_exit,
-
-	.destroy = virtual_context_destroy,
-};
-
 static intel_engine_mask_t
 virtual_submission_mask(struct intel_virtual_engine *ve)
 {
@@ -4414,8 +4288,8 @@ static void virtual_submit_request(struct i915_request *rq)
 		ve->base.execlists.queue_priority_hint = rq_prio(rq);
 		ve->request = i915_request_get(rq);
 
-		GEM_BUG_ON(!list_empty(virtual_queue(ve)));
-		list_move_tail(&rq->sched.link, virtual_queue(ve));
+		GEM_BUG_ON(!list_empty(intel_virtual_engine_queue(ve)));
+		list_move_tail(&rq->sched.link, intel_virtual_engine_queue(ve));
 
 		tasklet_schedule(&ve->base.execlists.tasklet);
 	}
@@ -4423,20 +4297,6 @@ static void virtual_submit_request(struct i915_request *rq)
 	spin_unlock_irqrestore(&ve->base.active.lock, flags);
 }
 
-static struct ve_bond *
-virtual_find_bond(struct intel_virtual_engine *ve,
-		  const struct intel_engine_cs *master)
-{
-	int i;
-
-	for (i = 0; i < ve->num_bonds; i++) {
-		if (ve->bonds[i].master == master)
-			return &ve->bonds[i];
-	}
-
-	return NULL;
-}
-
 static void
 virtual_bond_execute(struct i915_request *rq, struct dma_fence *signal)
 {
@@ -4446,7 +4306,7 @@ virtual_bond_execute(struct i915_request *rq, struct dma_fence *signal)
 
 	allowed = ~to_request(signal)->engine->mask;
 
-	bond = virtual_find_bond(ve, to_request(signal)->engine);
+	bond = intel_virtual_engine_find_bond(ve, to_request(signal)->engine);
 	if (bond)
 		allowed &= bond->sibling_mask;
 
@@ -4459,225 +4319,14 @@ virtual_bond_execute(struct i915_request *rq, struct dma_fence *signal)
 	to_request(signal)->execution_mask &= ~allowed;
 }
 
-struct intel_context *
-intel_execlists_create_virtual(struct i915_gem_context *ctx,
-			       struct intel_engine_cs **siblings,
-			       unsigned int count)
+void intel_execlists_virtual_submission_init(struct intel_virtual_engine *ve)
 {
-	struct intel_virtual_engine *ve;
-	unsigned int n;
-	int err;
-
-	if (count == 0)
-		return ERR_PTR(-EINVAL);
-
-	if (count == 1)
-		return intel_context_create(ctx, siblings[0]);
-
-	ve = kzalloc(struct_size(ve, siblings, count), GFP_KERNEL);
-	if (!ve)
-		return ERR_PTR(-ENOMEM);
-
-	ve->base.i915 = ctx->i915;
-	ve->base.gt = siblings[0]->gt;
-	ve->base.uncore = siblings[0]->uncore;
-	ve->base.id = -1;
-	ve->base.class = OTHER_CLASS;
-	ve->base.uabi_class = I915_ENGINE_CLASS_INVALID;
-	ve->base.instance = I915_ENGINE_CLASS_INVALID_VIRTUAL;
-
-	/*
-	 * The decision on whether to submit a request using semaphores
-	 * depends on the saturated state of the engine. We only compute
-	 * this during HW submission of the request, and we need for this
-	 * state to be globally applied to all requests being submitted
-	 * to this engine. Virtual engines encompass more than one physical
-	 * engine and so we cannot accurately tell in advance if one of those
-	 * engines is already saturated and so cannot afford to use a semaphore
-	 * and be pessimized in priority for doing so -- if we are the only
-	 * context using semaphores after all other clients have stopped, we
-	 * will be starved on the saturated system. Such a global switch for
-	 * semaphores is less than ideal, but alas is the current compromise.
-	 */
-	ve->base.saturated = ALL_ENGINES;
-
-	snprintf(ve->base.name, sizeof(ve->base.name), "virtual");
-
-	intel_engine_init_active(&ve->base, ENGINE_VIRTUAL);
-	intel_engine_init_breadcrumbs(&ve->base);
-
-	intel_engine_init_execlists(&ve->base);
-
-	ve->base.cops = &virtual_context_ops;
 	ve->base.request_alloc = execlists_request_alloc;
-
-	ve->base.schedule = i915_schedule;
 	ve->base.submit_request = virtual_submit_request;
 	ve->base.bond_execute = virtual_bond_execute;
-
-	INIT_LIST_HEAD(virtual_queue(ve));
-	ve->base.execlists.queue_priority_hint = INT_MIN;
 	tasklet_init(&ve->base.execlists.tasklet,
 		     virtual_submission_tasklet,
 		     (unsigned long)ve);
-
-	intel_context_init(&ve->context, ctx, &ve->base);
-
-	for (n = 0; n < count; n++) {
-		struct intel_engine_cs *sibling = siblings[n];
-
-		GEM_BUG_ON(!is_power_of_2(sibling->mask));
-		if (sibling->mask & ve->base.mask) {
-			DRM_DEBUG("duplicate %s entry in load balancer\n",
-				  sibling->name);
-			err = -EINVAL;
-			goto err_put;
-		}
-
-		/*
-		 * The virtual engine implementation is tightly coupled to
-		 * the execlists backend -- we push out request directly
-		 * into a tree inside each physical engine. We could support
-		 * layering if we handle cloning of the requests and
-		 * submitting a copy into each backend.
-		 */
-		if (sibling->execlists.tasklet.func !=
-		    execlists_submission_tasklet) {
-			err = -ENODEV;
-			goto err_put;
-		}
-
-		GEM_BUG_ON(RB_EMPTY_NODE(&ve->nodes[sibling->id].rb));
-		RB_CLEAR_NODE(&ve->nodes[sibling->id].rb);
-
-		ve->siblings[ve->num_siblings++] = sibling;
-		ve->base.mask |= sibling->mask;
-
-		/*
-		 * All physical engines must be compatible for their emission
-		 * functions (as we build the instructions during request
-		 * construction and do not alter them before submission
-		 * on the physical engine). We use the engine class as a guide
-		 * here, although that could be refined.
-		 */
-		if (ve->base.class != OTHER_CLASS) {
-			if (ve->base.class != sibling->class) {
-				DRM_DEBUG("invalid mixing of engine class, sibling %d, already %d\n",
-					  sibling->class, ve->base.class);
-				err = -EINVAL;
-				goto err_put;
-			}
-			continue;
-		}
-
-		ve->base.class = sibling->class;
-		ve->base.uabi_class = sibling->uabi_class;
-		snprintf(ve->base.name, sizeof(ve->base.name),
-			 "v%dx%d", ve->base.class, count);
-		ve->base.context_size = sibling->context_size;
-
-		ve->base.emit_bb_start = sibling->emit_bb_start;
-		ve->base.emit_flush = sibling->emit_flush;
-		ve->base.emit_init_breadcrumb = sibling->emit_init_breadcrumb;
-		ve->base.emit_fini_breadcrumb = sibling->emit_fini_breadcrumb;
-		ve->base.emit_fini_breadcrumb_dw =
-			sibling->emit_fini_breadcrumb_dw;
-
-		ve->base.flags = sibling->flags;
-	}
-
-	ve->base.flags |= I915_ENGINE_IS_VIRTUAL;
-
-	err = lr_context_alloc(&ve->context, siblings[0]);
-	if (err)
-		goto err_put;
-
-	__set_bit(CONTEXT_ALLOC_BIT, &ve->context.flags);
-
-	return &ve->context;
-
-err_put:
-	intel_context_put(&ve->context);
-	return ERR_PTR(err);
-}
-
-struct intel_context *
-intel_execlists_clone_virtual(struct i915_gem_context *ctx,
-			      struct intel_engine_cs *src)
-{
-	struct intel_virtual_engine *se = to_virtual_engine(src);
-	struct intel_context *dst;
-
-	dst = intel_execlists_create_virtual(ctx,
-					     se->siblings,
-					     se->num_siblings);
-	if (IS_ERR(dst))
-		return dst;
-
-	if (se->num_bonds) {
-		struct intel_virtual_engine *de =
-			to_virtual_engine(dst->engine);
-
-		de->bonds = kmemdup(se->bonds,
-				    sizeof(*se->bonds) * se->num_bonds,
-				    GFP_KERNEL);
-		if (!de->bonds) {
-			intel_context_put(dst);
-			return ERR_PTR(-ENOMEM);
-		}
-
-		de->num_bonds = se->num_bonds;
-	}
-
-	return dst;
-}
-
-int intel_virtual_engine_attach_bond(struct intel_engine_cs *engine,
-				     const struct intel_engine_cs *master,
-				     const struct intel_engine_cs *sibling)
-{
-	struct intel_virtual_engine *ve = to_virtual_engine(engine);
-	struct ve_bond *bond;
-	int n;
-
-	/* Sanity check the sibling is part of the virtual engine */
-	for (n = 0; n < ve->num_siblings; n++)
-		if (sibling == ve->siblings[n])
-			break;
-	if (n == ve->num_siblings)
-		return -EINVAL;
-
-	bond = virtual_find_bond(ve, master);
-	if (bond) {
-		bond->sibling_mask |= sibling->mask;
-		return 0;
-	}
-
-	bond = krealloc(ve->bonds,
-			sizeof(*bond) * (ve->num_bonds + 1),
-			GFP_KERNEL);
-	if (!bond)
-		return -ENOMEM;
-
-	bond[ve->num_bonds].master = master;
-	bond[ve->num_bonds].sibling_mask = sibling->mask;
-
-	ve->bonds = bond;
-	ve->num_bonds++;
-
-	return 0;
-}
-
-struct intel_engine_cs *
-intel_virtual_engine_get_sibling(struct intel_engine_cs *engine,
-				 unsigned int sibling)
-{
-	struct intel_virtual_engine *ve = to_virtual_engine(engine);
-
-	if (sibling >= ve->num_siblings)
-		return NULL;
-
-	return ve->siblings[sibling];
 }
 
 void intel_execlists_show_requests(struct intel_engine_cs *engine,
diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.h b/drivers/gpu/drm/i915/gt/intel_lrc.h
index 04511d8ebdc1..93f30b2deb7f 100644
--- a/drivers/gpu/drm/i915/gt/intel_lrc.h
+++ b/drivers/gpu/drm/i915/gt/intel_lrc.h
@@ -33,6 +33,7 @@ struct i915_gem_context;
 struct i915_request;
 struct intel_context;
 struct intel_engine_cs;
+struct intel_virtual_engine;
 
 /* Execlists regs */
 #define RING_ELSP(base)				_MMIO((base) + 0x230)
@@ -98,11 +99,22 @@ int intel_execlists_submission_init(struct intel_engine_cs *engine);
 
 void intel_execlists_set_default_submission(struct intel_engine_cs *engine);
 
+int intel_lr_context_alloc(struct intel_context *ce,
+			   struct intel_engine_cs *engine);
+void intel_lr_context_fini(struct intel_context *ce);
+
+u32 *intel_lr_context_set_register_offsets(u32 *regs,
+					   const struct intel_engine_cs *engine);
+
 void intel_lr_context_reset(struct intel_engine_cs *engine,
 			    struct intel_context *ce,
 			    u32 head,
 			    bool scrub);
 
+int intel_lr_context_pin(struct intel_context *ce,
+			 struct intel_engine_cs *engine);
+void intel_lr_context_unpin(struct intel_context *ce);
+
 void intel_execlists_show_requests(struct intel_engine_cs *engine,
 				   struct drm_printer *m,
 				   void (*show_request)(struct drm_printer *m,
@@ -110,22 +122,7 @@ void intel_execlists_show_requests(struct intel_engine_cs *engine,
 							const char *prefix),
 				   unsigned int max);
 
-struct intel_context *
-intel_execlists_create_virtual(struct i915_gem_context *ctx,
-			       struct intel_engine_cs **siblings,
-			       unsigned int count);
-
-struct intel_context *
-intel_execlists_clone_virtual(struct i915_gem_context *ctx,
-			      struct intel_engine_cs *src);
-
-int intel_virtual_engine_attach_bond(struct intel_engine_cs *engine,
-				     const struct intel_engine_cs *master,
-				     const struct intel_engine_cs *sibling);
-
-struct intel_engine_cs *
-intel_virtual_engine_get_sibling(struct intel_engine_cs *engine,
-				 unsigned int sibling);
+void intel_execlists_virtual_submission_init(struct intel_virtual_engine *ve);
 
 bool
 intel_engine_in_execlists_submission_mode(const struct intel_engine_cs *engine);
diff --git a/drivers/gpu/drm/i915/gt/intel_virtual_engine.c b/drivers/gpu/drm/i915/gt/intel_virtual_engine.c
new file mode 100644
index 000000000000..6ec3752132bc
--- /dev/null
+++ b/drivers/gpu/drm/i915/gt/intel_virtual_engine.c
@@ -0,0 +1,359 @@
+/*
+ * SPDX-License-Identifier: MIT
+ *
+ * Copyright © 2019 Intel Corporation
+ */
+
+#include <drm/drm_print.h>
+#include <linux/slab.h>
+
+#include "gem/i915_gem_context.h"
+
+#include "i915_gem.h"
+#include "intel_context.h"
+#include "intel_engine.h"
+#include "intel_engine_pm.h"
+#include "intel_lrc.h"
+#include "intel_timeline.h"
+#include "intel_virtual_engine.h"
+
+static void virtual_context_destroy(struct kref *kref)
+{
+	struct intel_virtual_engine *ve =
+		container_of(kref, typeof(*ve), context.ref);
+	unsigned int n;
+
+	GEM_BUG_ON(!list_empty(intel_virtual_engine_queue(ve)));
+	GEM_BUG_ON(ve->request);
+	GEM_BUG_ON(ve->context.inflight);
+
+	for (n = 0; n < ve->num_siblings; n++) {
+		struct intel_engine_cs *sibling = ve->siblings[n];
+		struct rb_node *node = &ve->nodes[sibling->id].rb;
+		unsigned long flags;
+
+		if (RB_EMPTY_NODE(node))
+			continue;
+
+		spin_lock_irqsave(&sibling->active.lock, flags);
+
+		/* Detachment is lazily performed in the execlists tasklet */
+		if (!RB_EMPTY_NODE(node))
+			rb_erase_cached(node, &sibling->execlists.virtual);
+
+		spin_unlock_irqrestore(&sibling->active.lock, flags);
+	}
+	GEM_BUG_ON(__tasklet_is_scheduled(&ve->base.execlists.tasklet));
+
+	if (ve->context.state)
+		intel_lr_context_fini(&ve->context);
+	intel_context_fini(&ve->context);
+
+	kfree(ve->bonds);
+	kfree(ve);
+}
+
+static void virtual_engine_initial_hint(struct intel_virtual_engine *ve)
+{
+	int swp;
+
+	/*
+	 * Pick a random sibling on starting to help spread the load around.
+	 *
+	 * New contexts are typically created with exactly the same order
+	 * of siblings, and often started in batches. Due to the way we iterate
+	 * the array of siblings when submitting requests, sibling[0] is
+	 * prioritised for dequeuing. If we make sure that sibling[0] is fairly
+	 * randomised across the system, we also help spread the load by the
+	 * first engine we inspect being different each time.
+	 *
+	 * NB This does not force us to execute on this engine, it will just
+	 * typically be the first we inspect for submission.
+	 */
+	swp = prandom_u32_max(ve->num_siblings);
+	if (!swp)
+		return;
+
+	swap(ve->siblings[swp], ve->siblings[0]);
+	if (!intel_engine_has_relative_mmio(ve->siblings[0]))
+		intel_lr_context_set_register_offsets(ve->context.lrc_reg_state,
+						      ve->siblings[0]);
+}
+
+static int virtual_context_pin(struct intel_context *ce)
+{
+	struct intel_virtual_engine *ve =
+		container_of(ce, typeof(*ve), context);
+	int err;
+
+	/* Note: we must use a real engine class for setting up reg state */
+	err = intel_lr_context_pin(ce, ve->siblings[0]);
+	if (err)
+		return err;
+
+	virtual_engine_initial_hint(ve);
+	return 0;
+}
+
+static void virtual_context_enter(struct intel_context *ce)
+{
+	struct intel_virtual_engine *ve =
+		container_of(ce, typeof(*ve), context);
+	unsigned int n;
+
+	for (n = 0; n < ve->num_siblings; n++)
+		intel_engine_pm_get(ve->siblings[n]);
+
+	intel_timeline_enter(ce->timeline);
+}
+
+static void virtual_context_exit(struct intel_context *ce)
+{
+	struct intel_virtual_engine *ve =
+		container_of(ce, typeof(*ve), context);
+	unsigned int n;
+
+	intel_timeline_exit(ce->timeline);
+
+	for (n = 0; n < ve->num_siblings; n++)
+		intel_engine_pm_put(ve->siblings[n]);
+}
+
+static const struct intel_context_ops virtual_context_ops = {
+	.pin = virtual_context_pin,
+	.unpin = intel_lr_context_unpin,
+
+	.enter = virtual_context_enter,
+	.exit = virtual_context_exit,
+
+	.destroy = virtual_context_destroy,
+};
+
+struct intel_context *
+intel_virtual_engine_create(struct i915_gem_context *ctx,
+			    struct intel_engine_cs **siblings,
+			    unsigned int count)
+{
+	struct intel_virtual_engine *ve;
+	unsigned int n;
+	int err;
+
+	if (count == 0)
+		return ERR_PTR(-EINVAL);
+
+	if (count == 1)
+		return intel_context_create(ctx, siblings[0]);
+
+	ve = kzalloc(struct_size(ve, siblings, count), GFP_KERNEL);
+	if (!ve)
+		return ERR_PTR(-ENOMEM);
+
+	ve->base.i915 = ctx->i915;
+	ve->base.gt = siblings[0]->gt;
+	ve->base.uncore = siblings[0]->uncore;
+	ve->base.id = -1;
+	ve->base.class = OTHER_CLASS;
+	ve->base.uabi_class = I915_ENGINE_CLASS_INVALID;
+	ve->base.instance = I915_ENGINE_CLASS_INVALID_VIRTUAL;
+
+	/*
+	 * The decision on whether to submit a request using semaphores
+	 * depends on the saturated state of the engine. We only compute
+	 * this during HW submission of the request, and we need this
+	 * state to be globally applied to all requests being submitted
+	 * to this engine. Virtual engines encompass more than one physical
+	 * engine and so we cannot accurately tell in advance if one of those
+	 * engines is already saturated and so cannot afford to use a semaphore
+	 * and be pessimized in priority for doing so -- if we are the only
+	 * context using semaphores after all other clients have stopped, we
+	 * will be starved on the saturated system. Such a global switch for
+	 * semaphores is less than ideal, but alas is the current compromise.
+	 */
+	ve->base.saturated = ALL_ENGINES;
+
+	snprintf(ve->base.name, sizeof(ve->base.name), "virtual");
+
+	intel_engine_init_active(&ve->base, ENGINE_VIRTUAL);
+	intel_engine_init_breadcrumbs(&ve->base);
+
+	intel_engine_init_execlists(&ve->base);
+
+	ve->base.cops = &virtual_context_ops;
+
+	intel_execlists_virtual_submission_init(ve);
+
+	ve->base.schedule = i915_schedule;
+
+	INIT_LIST_HEAD(intel_virtual_engine_queue(ve));
+	ve->base.execlists.queue_priority_hint = INT_MIN;
+
+	intel_context_init(&ve->context, ctx, &ve->base);
+
+	for (n = 0; n < count; n++) {
+		struct intel_engine_cs *sibling = siblings[n];
+
+		GEM_BUG_ON(!is_power_of_2(sibling->mask));
+		if (sibling->mask & ve->base.mask) {
+			DRM_DEBUG("duplicate %s entry in load balancer\n",
+				  sibling->name);
+			err = -EINVAL;
+			goto err_put;
+		}
+
+		/*
+		 * The virtual engine implementation is tightly coupled to
+		 * the execlists backend -- we push out requests directly
+		 * into a tree inside each physical engine. We could support
+		 * layering if we handle cloning of the requests and
+		 * submitting a copy into each backend.
+		 */
+		if (!intel_engine_in_execlists_submission_mode(sibling)) {
+			err = -ENODEV;
+			goto err_put;
+		}
+
+		GEM_BUG_ON(RB_EMPTY_NODE(&ve->nodes[sibling->id].rb));
+		RB_CLEAR_NODE(&ve->nodes[sibling->id].rb);
+
+		ve->siblings[ve->num_siblings++] = sibling;
+		ve->base.mask |= sibling->mask;
+
+		/*
+		 * All physical engines must be compatible for their emission
+		 * functions (as we build the instructions during request
+		 * construction and do not alter them before submission
+		 * on the physical engine). We use the engine class as a guide
+		 * here, although that could be refined.
+		 */
+		if (ve->base.class != OTHER_CLASS) {
+			if (ve->base.class != sibling->class) {
+				DRM_DEBUG("invalid mixing of engine class, sibling %d, already %d\n",
+					  sibling->class, ve->base.class);
+				err = -EINVAL;
+				goto err_put;
+			}
+			continue;
+		}
+
+		ve->base.class = sibling->class;
+		ve->base.uabi_class = sibling->uabi_class;
+		snprintf(ve->base.name, sizeof(ve->base.name),
+			 "v%dx%d", ve->base.class, count);
+		ve->base.context_size = sibling->context_size;
+
+		ve->base.emit_bb_start = sibling->emit_bb_start;
+		ve->base.emit_flush = sibling->emit_flush;
+		ve->base.emit_init_breadcrumb = sibling->emit_init_breadcrumb;
+		ve->base.emit_fini_breadcrumb = sibling->emit_fini_breadcrumb;
+		ve->base.emit_fini_breadcrumb_dw =
+			sibling->emit_fini_breadcrumb_dw;
+
+		ve->base.flags = sibling->flags;
+	}
+
+	ve->base.flags |= I915_ENGINE_IS_VIRTUAL;
+
+	err = intel_lr_context_alloc(&ve->context, siblings[0]);
+	if (err)
+		goto err_put;
+
+	__set_bit(CONTEXT_ALLOC_BIT, &ve->context.flags);
+
+	return &ve->context;
+
+err_put:
+	intel_context_put(&ve->context);
+	return ERR_PTR(err);
+}
+
+struct intel_context *
+intel_virtual_engine_clone(struct i915_gem_context *ctx,
+			   struct intel_engine_cs *src)
+{
+	struct intel_virtual_engine *se = to_virtual_engine(src);
+	struct intel_context *dst;
+
+	dst = intel_virtual_engine_create(ctx, se->siblings, se->num_siblings);
+	if (IS_ERR(dst))
+		return dst;
+
+	if (se->num_bonds) {
+		struct intel_virtual_engine *de =
+			to_virtual_engine(dst->engine);
+
+		de->bonds = kmemdup(se->bonds,
+				    sizeof(*se->bonds) * se->num_bonds,
+				    GFP_KERNEL);
+		if (!de->bonds) {
+			intel_context_put(dst);
+			return ERR_PTR(-ENOMEM);
+		}
+
+		de->num_bonds = se->num_bonds;
+	}
+
+	return dst;
+}
+
+struct ve_bond *
+intel_virtual_engine_find_bond(struct intel_virtual_engine *ve,
+			       const struct intel_engine_cs *master)
+{
+	int i;
+
+	for (i = 0; i < ve->num_bonds; i++) {
+		if (ve->bonds[i].master == master)
+			return &ve->bonds[i];
+	}
+
+	return NULL;
+}
+
+int intel_virtual_engine_attach_bond(struct intel_engine_cs *engine,
+				     const struct intel_engine_cs *master,
+				     const struct intel_engine_cs *sibling)
+{
+	struct intel_virtual_engine *ve = to_virtual_engine(engine);
+	struct ve_bond *bond;
+	int n;
+
+	/* Sanity check the sibling is part of the virtual engine */
+	for (n = 0; n < ve->num_siblings; n++)
+		if (sibling == ve->siblings[n])
+			break;
+	if (n == ve->num_siblings)
+		return -EINVAL;
+
+	bond = intel_virtual_engine_find_bond(ve, master);
+	if (bond) {
+		bond->sibling_mask |= sibling->mask;
+		return 0;
+	}
+
+	bond = krealloc(ve->bonds,
+			sizeof(*bond) * (ve->num_bonds + 1),
+			GFP_KERNEL);
+	if (!bond)
+		return -ENOMEM;
+
+	bond[ve->num_bonds].master = master;
+	bond[ve->num_bonds].sibling_mask = sibling->mask;
+
+	ve->bonds = bond;
+	ve->num_bonds++;
+
+	return 0;
+}
+
+struct intel_engine_cs *
+intel_virtual_engine_get_sibling(struct intel_engine_cs *engine,
+				 unsigned int sibling)
+{
+	struct intel_virtual_engine *ve = to_virtual_engine(engine);
+
+	if (sibling >= ve->num_siblings)
+		return NULL;
+
+	return ve->siblings[sibling];
+}
+
diff --git a/drivers/gpu/drm/i915/gt/intel_virtual_engine.h b/drivers/gpu/drm/i915/gt/intel_virtual_engine.h
new file mode 100644
index 000000000000..acda89ab3f99
--- /dev/null
+++ b/drivers/gpu/drm/i915/gt/intel_virtual_engine.h
@@ -0,0 +1,48 @@
+/*
+ * SPDX-License-Identifier: MIT
+ *
+ * Copyright © 2019 Intel Corporation
+ */
+
+#ifndef __INTEL_VIRTUAL_ENGINE__
+#define __INTEL_VIRTUAL_ENGINE__
+
+#include "i915_gem.h"
+#include "intel_virtual_engine_types.h"
+
+static inline struct intel_virtual_engine *
+to_virtual_engine(struct intel_engine_cs *engine)
+{
+	GEM_BUG_ON(!intel_engine_is_virtual(engine));
+	return container_of(engine, struct intel_virtual_engine, base);
+}
+
+static inline struct list_head *
+intel_virtual_engine_queue(struct intel_virtual_engine *ve)
+{
+	return &ve->base.execlists.default_priolist.requests[0];
+}
+
+struct intel_context *
+intel_virtual_engine_create(struct i915_gem_context *ctx,
+			    struct intel_engine_cs **siblings,
+			    unsigned int count);
+
+struct intel_context *
+intel_virtual_engine_clone(struct i915_gem_context *ctx,
+			   struct intel_engine_cs *src);
+
+struct ve_bond *
+intel_virtual_engine_find_bond(struct intel_virtual_engine *ve,
+			       const struct intel_engine_cs *master);
+
+int intel_virtual_engine_attach_bond(struct intel_engine_cs *engine,
+				     const struct intel_engine_cs *master,
+				     const struct intel_engine_cs *sibling);
+
+struct intel_engine_cs *
+intel_virtual_engine_get_sibling(struct intel_engine_cs *engine,
+				 unsigned int sibling);
+
+#endif /* __INTEL_VIRTUAL_ENGINE__ */
diff --git a/drivers/gpu/drm/i915/gt/selftest_lrc.c b/drivers/gpu/drm/i915/gt/selftest_lrc.c
index b4537497c3be..570c7891c62f 100644
--- a/drivers/gpu/drm/i915/gt/selftest_lrc.c
+++ b/drivers/gpu/drm/i915/gt/selftest_lrc.c
@@ -2635,8 +2635,7 @@ static int nop_virtual_engine(struct intel_gt *gt,
 			goto out;
 		}
 
-		ve[n] = intel_execlists_create_virtual(ctx[n],
-						       siblings, nsibling);
+		ve[n] = intel_virtual_engine_create(ctx[n], siblings, nsibling);
 		if (IS_ERR(ve[n])) {
 			kernel_context_close(ctx[n]);
 			err = PTR_ERR(ve[n]);
@@ -2816,7 +2815,7 @@ static int mask_virtual_engine(struct intel_gt *gt,
 	if (!ctx)
 		return -ENOMEM;
 
-	ve = intel_execlists_create_virtual(ctx, siblings, nsibling);
+	ve = intel_virtual_engine_create(ctx, siblings, nsibling);
 	if (IS_ERR(ve)) {
 		err = PTR_ERR(ve);
 		goto out_close;
@@ -2942,7 +2941,7 @@ static int preserved_virtual_engine(struct intel_gt *gt,
 		goto out_close;
 	}
 
-	ve = intel_execlists_create_virtual(ctx, siblings, nsibling);
+	ve = intel_virtual_engine_create(ctx, siblings, nsibling);
 	if (IS_ERR(ve)) {
 		err = PTR_ERR(ve);
 		goto out_scratch;
@@ -3172,9 +3171,9 @@ static int bond_virtual_engine(struct intel_gt *gt,
 		for (n = 0; n < nsibling; n++) {
 			struct intel_context *ve;
 
-			ve = intel_execlists_create_virtual(ctx,
-							    siblings,
-							    nsibling);
+			ve = intel_virtual_engine_create(ctx,
+							 siblings,
+							 nsibling);
 			if (IS_ERR(ve)) {
 				err = PTR_ERR(ve);
 				onstack_fence_fini(&fence);
-- 
2.23.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [Intel-gfx] [RFC 4/5] drm/i915: move execlists selftests to their own file
  2019-12-11 21:12 [Intel-gfx] [RFC 0/5] Split up intel_lrc.c Daniele Ceraolo Spurio
                   ` (2 preceding siblings ...)
  2019-12-11 21:12 ` [Intel-gfx] [RFC 3/5] drm/i915: split out virtual engine code Daniele Ceraolo Spurio
@ 2019-12-11 21:12 ` Daniele Ceraolo Spurio
  2019-12-11 21:26   ` Chris Wilson
  2019-12-11 21:12 ` [Intel-gfx] [RFC 5/5] drm/i915: introduce intel_execlists_submission.<c/h> Daniele Ceraolo Spurio
                   ` (3 subsequent siblings)
  7 siblings, 1 reply; 21+ messages in thread
From: Daniele Ceraolo Spurio @ 2019-12-11 21:12 UTC (permalink / raw)
  To: intel-gfx

This is done ahead of splitting the lrc file as well, to keep that patch
smaller. It is just a straight copy, with the exception of create_scratch(),
which has been made common to avoid having 3 instances of it.
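
As a rough usage sketch (not part of this patch; the gt pointer and the
cleanup calls are assumed from the surrounding selftests), a caller of the
shared helper is expected to do something like:

	struct i915_vma *scratch;

	scratch = igt_create_scratch(gt);
	if (IS_ERR(scratch))
		return PTR_ERR(scratch);

	/* ... use the pinned GGTT scratch page ... */

	i915_vma_unpin(scratch);
	i915_gem_object_put(scratch->obj);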

Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
---
 .../drm/i915/gem/selftests/igt_gem_utils.c    |   27 +
 .../drm/i915/gem/selftests/igt_gem_utils.h    |    3 +
 drivers/gpu/drm/i915/gt/intel_lrc.c           |    1 +
 drivers/gpu/drm/i915/gt/selftest_execlists.c  | 3316 ++++++++++++++++
 drivers/gpu/drm/i915/gt/selftest_lrc.c        | 3333 +----------------
 drivers/gpu/drm/i915/gt/selftest_mocs.c       |   30 +-
 6 files changed, 3351 insertions(+), 3359 deletions(-)
 create mode 100644 drivers/gpu/drm/i915/gt/selftest_execlists.c

diff --git a/drivers/gpu/drm/i915/gem/selftests/igt_gem_utils.c b/drivers/gpu/drm/i915/gem/selftests/igt_gem_utils.c
index 6718da20f35d..88109333cb79 100644
--- a/drivers/gpu/drm/i915/gem/selftests/igt_gem_utils.c
+++ b/drivers/gpu/drm/i915/gem/selftests/igt_gem_utils.c
@@ -15,6 +15,33 @@
 
 #include "i915_request.h"
 
+struct i915_vma *igt_create_scratch(struct intel_gt *gt)
+{
+	struct drm_i915_gem_object *obj;
+	struct i915_vma *vma;
+	int err;
+
+	obj = i915_gem_object_create_internal(gt->i915, PAGE_SIZE);
+	if (IS_ERR(obj))
+		return ERR_CAST(obj);
+
+	i915_gem_object_set_cache_coherency(obj, I915_CACHING_CACHED);
+
+	vma = i915_vma_instance(obj, &gt->ggtt->vm, NULL);
+	if (IS_ERR(vma)) {
+		i915_gem_object_put(obj);
+		return vma;
+	}
+
+	err = i915_vma_pin(vma, 0, 0, PIN_GLOBAL);
+	if (err) {
+		i915_gem_object_put(obj);
+		return ERR_PTR(err);
+	}
+
+	return vma;
+}
+
 struct i915_request *
 igt_request_alloc(struct i915_gem_context *ctx, struct intel_engine_cs *engine)
 {
diff --git a/drivers/gpu/drm/i915/gem/selftests/igt_gem_utils.h b/drivers/gpu/drm/i915/gem/selftests/igt_gem_utils.h
index 4221cf84d175..aae781f59cfc 100644
--- a/drivers/gpu/drm/i915/gem/selftests/igt_gem_utils.h
+++ b/drivers/gpu/drm/i915/gem/selftests/igt_gem_utils.h
@@ -15,6 +15,9 @@ struct i915_vma;
 
 struct intel_context;
 struct intel_engine_cs;
+struct intel_gt;
+
+struct i915_vma *igt_create_scratch(struct intel_gt *gt);
 
 struct i915_request *
 igt_request_alloc(struct i915_gem_context *ctx, struct intel_engine_cs *engine);
diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
index 3afae9a44911..fbdd3bdd06f1 100644
--- a/drivers/gpu/drm/i915/gt/intel_lrc.c
+++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
@@ -4446,4 +4446,5 @@ intel_engine_in_execlists_submission_mode(const struct intel_engine_cs *engine)
 
 #if IS_ENABLED(CONFIG_DRM_I915_SELFTEST)
 #include "selftest_lrc.c"
+#include "selftest_execlists.c"
 #endif
diff --git a/drivers/gpu/drm/i915/gt/selftest_execlists.c b/drivers/gpu/drm/i915/gt/selftest_execlists.c
new file mode 100644
index 000000000000..b58a4feb2ec4
--- /dev/null
+++ b/drivers/gpu/drm/i915/gt/selftest_execlists.c
@@ -0,0 +1,3316 @@
+/*
+ * SPDX-License-Identifier: MIT
+ *
+ * Copyright © 2018-2019 Intel Corporation
+ */
+
+#include <linux/prime_numbers.h>
+
+#include "gem/i915_gem_pm.h"
+#include "gt/intel_engine_heartbeat.h"
+#include "gt/intel_reset.h"
+
+#include "i915_selftest.h"
+#include "selftests/i915_random.h"
+#include "selftests/igt_flush_test.h"
+#include "selftests/igt_live_test.h"
+#include "selftests/igt_spinner.h"
+#include "selftests/lib_sw_fence.h"
+
+#include "gem/selftests/igt_gem_utils.h"
+#include "gem/selftests/mock_context.h"
+
+#define CS_GPR(engine, n) ((engine)->mmio_base + 0x600 + (n) * 4)
+#define NUM_GPR_DW (16 * 2) /* each GPR is 2 dwords */
+
+static int live_sanitycheck(void *arg)
+{
+	struct intel_gt *gt = arg;
+	struct i915_gem_engines_iter it;
+	struct i915_gem_context *ctx;
+	struct intel_context *ce;
+	struct igt_spinner spin;
+	int err = -ENOMEM;
+
+	if (!HAS_LOGICAL_RING_CONTEXTS(gt->i915))
+		return 0;
+
+	if (igt_spinner_init(&spin, gt))
+		return -ENOMEM;
+
+	ctx = kernel_context(gt->i915);
+	if (!ctx)
+		goto err_spin;
+
+	for_each_gem_engine(ce, i915_gem_context_lock_engines(ctx), it) {
+		struct i915_request *rq;
+
+		rq = igt_spinner_create_request(&spin, ce, MI_NOOP);
+		if (IS_ERR(rq)) {
+			err = PTR_ERR(rq);
+			goto err_ctx;
+		}
+
+		i915_request_add(rq);
+		if (!igt_wait_for_spinner(&spin, rq)) {
+			GEM_TRACE("spinner failed to start\n");
+			GEM_TRACE_DUMP();
+			intel_gt_set_wedged(gt);
+			err = -EIO;
+			goto err_ctx;
+		}
+
+		igt_spinner_end(&spin);
+		if (igt_flush_test(gt->i915)) {
+			err = -EIO;
+			goto err_ctx;
+		}
+	}
+
+	err = 0;
+err_ctx:
+	i915_gem_context_unlock_engines(ctx);
+	kernel_context_close(ctx);
+err_spin:
+	igt_spinner_fini(&spin);
+	return err;
+}
+
+static int live_unlite_restore(struct intel_gt *gt, int prio)
+{
+	struct intel_engine_cs *engine;
+	struct i915_gem_context *ctx;
+	enum intel_engine_id id;
+	struct igt_spinner spin;
+	int err = -ENOMEM;
+
+	/*
+	 * Check that we can correctly context switch between 2 instances
+	 * on the same engine from the same parent context.
+	 */
+
+	if (igt_spinner_init(&spin, gt))
+		return err;
+
+	ctx = kernel_context(gt->i915);
+	if (!ctx)
+		goto err_spin;
+
+	err = 0;
+	for_each_engine(engine, gt, id) {
+		struct intel_context *ce[2] = {};
+		struct i915_request *rq[2];
+		struct igt_live_test t;
+		int n;
+
+		if (prio && !intel_engine_has_preemption(engine))
+			continue;
+
+		if (!intel_engine_can_store_dword(engine))
+			continue;
+
+		if (igt_live_test_begin(&t, gt->i915, __func__, engine->name)) {
+			err = -EIO;
+			break;
+		}
+
+		for (n = 0; n < ARRAY_SIZE(ce); n++) {
+			struct intel_context *tmp;
+
+			tmp = intel_context_create(ctx, engine);
+			if (IS_ERR(tmp)) {
+				err = PTR_ERR(tmp);
+				goto err_ce;
+			}
+
+			err = intel_context_pin(tmp);
+			if (err) {
+				intel_context_put(tmp);
+				goto err_ce;
+			}
+
+			/*
+			 * Setup the pair of contexts such that if we
+			 * lite-restore using the RING_TAIL from ce[1] it
+			 * will execute garbage from ce[0]->ring.
+			 */
+			memset(tmp->ring->vaddr,
+			       POISON_INUSE, /* IPEHR: 0x5a5a5a5a [hung!] */
+			       tmp->ring->vma->size);
+
+			ce[n] = tmp;
+		}
+		GEM_BUG_ON(!ce[1]->ring->size);
+		intel_ring_reset(ce[1]->ring, ce[1]->ring->size / 2);
+		lr_context_update_reg_state(ce[1], engine);
+
+		rq[0] = igt_spinner_create_request(&spin, ce[0], MI_ARB_CHECK);
+		if (IS_ERR(rq[0])) {
+			err = PTR_ERR(rq[0]);
+			goto err_ce;
+		}
+
+		i915_request_get(rq[0]);
+		i915_request_add(rq[0]);
+		GEM_BUG_ON(rq[0]->postfix > ce[1]->ring->emit);
+
+		if (!igt_wait_for_spinner(&spin, rq[0])) {
+			i915_request_put(rq[0]);
+			goto err_ce;
+		}
+
+		rq[1] = i915_request_create(ce[1]);
+		if (IS_ERR(rq[1])) {
+			err = PTR_ERR(rq[1]);
+			i915_request_put(rq[0]);
+			goto err_ce;
+		}
+
+		if (!prio) {
+			/*
+			 * Ensure we do the switch to ce[1] on completion.
+			 *
+			 * rq[0] is already submitted, so this should reduce
+			 * to a no-op (a wait on a request on the same engine
+			 * uses the submit fence, not the completion fence),
+			 * but it will install a dependency on rq[1] for rq[0]
+			 * that will prevent the pair being reordered by
+			 * timeslicing.
+			 */
+			i915_request_await_dma_fence(rq[1], &rq[0]->fence);
+		}
+
+		i915_request_get(rq[1]);
+		i915_request_add(rq[1]);
+		GEM_BUG_ON(rq[1]->postfix <= rq[0]->postfix);
+		i915_request_put(rq[0]);
+
+		if (prio) {
+			struct i915_sched_attr attr = {
+				.priority = prio,
+			};
+
+			/* Alternatively preempt the spinner with ce[1] */
+			engine->schedule(rq[1], &attr);
+		}
+
+		/* And switch back to ce[0] for good measure */
+		rq[0] = i915_request_create(ce[0]);
+		if (IS_ERR(rq[0])) {
+			err = PTR_ERR(rq[0]);
+			i915_request_put(rq[1]);
+			goto err_ce;
+		}
+
+		i915_request_await_dma_fence(rq[0], &rq[1]->fence);
+		i915_request_get(rq[0]);
+		i915_request_add(rq[0]);
+		GEM_BUG_ON(rq[0]->postfix > rq[1]->postfix);
+		i915_request_put(rq[1]);
+		i915_request_put(rq[0]);
+
+err_ce:
+		tasklet_kill(&engine->execlists.tasklet); /* flush submission */
+		igt_spinner_end(&spin);
+		for (n = 0; n < ARRAY_SIZE(ce); n++) {
+			if (IS_ERR_OR_NULL(ce[n]))
+				break;
+
+			intel_context_unpin(ce[n]);
+			intel_context_put(ce[n]);
+		}
+
+		if (igt_live_test_end(&t))
+			err = -EIO;
+		if (err)
+			break;
+	}
+
+	kernel_context_close(ctx);
+err_spin:
+	igt_spinner_fini(&spin);
+	return err;
+}
+
+static int live_unlite_switch(void *arg)
+{
+	return live_unlite_restore(arg, 0);
+}
+
+static int live_unlite_preempt(void *arg)
+{
+	return live_unlite_restore(arg, I915_USER_PRIORITY(I915_PRIORITY_MAX));
+}
+
+static int
+emit_semaphore_chain(struct i915_request *rq, struct i915_vma *vma, int idx)
+{
+	u32 *cs;
+
+	cs = intel_ring_begin(rq, 10);
+	if (IS_ERR(cs))
+		return PTR_ERR(cs);
+
+	*cs++ = MI_ARB_ON_OFF | MI_ARB_ENABLE;
+
+	*cs++ = MI_SEMAPHORE_WAIT |
+		MI_SEMAPHORE_GLOBAL_GTT |
+		MI_SEMAPHORE_POLL |
+		MI_SEMAPHORE_SAD_NEQ_SDD;
+	*cs++ = 0;
+	*cs++ = i915_ggtt_offset(vma) + 4 * idx;
+	*cs++ = 0;
+
+	if (idx > 0) {
+		*cs++ = MI_STORE_DWORD_IMM_GEN4 | MI_USE_GGTT;
+		*cs++ = i915_ggtt_offset(vma) + 4 * (idx - 1);
+		*cs++ = 0;
+		*cs++ = 1;
+	} else {
+		*cs++ = MI_NOOP;
+		*cs++ = MI_NOOP;
+		*cs++ = MI_NOOP;
+		*cs++ = MI_NOOP;
+	}
+
+	*cs++ = MI_ARB_ON_OFF | MI_ARB_DISABLE;
+
+	intel_ring_advance(rq, cs);
+	return 0;
+}
+
+static struct i915_request *
+semaphore_queue(struct intel_engine_cs *engine, struct i915_vma *vma, int idx)
+{
+	struct i915_gem_context *ctx;
+	struct i915_request *rq;
+	int err;
+
+	ctx = kernel_context(engine->i915);
+	if (!ctx)
+		return ERR_PTR(-ENOMEM);
+
+	rq = igt_request_alloc(ctx, engine);
+	if (IS_ERR(rq))
+		goto out_ctx;
+
+	err = 0;
+	if (rq->engine->emit_init_breadcrumb)
+		err = rq->engine->emit_init_breadcrumb(rq);
+	if (err == 0)
+		err = emit_semaphore_chain(rq, vma, idx);
+	if (err == 0)
+		i915_request_get(rq);
+	i915_request_add(rq);
+	if (err)
+		rq = ERR_PTR(err);
+
+out_ctx:
+	kernel_context_close(ctx);
+	return rq;
+}
+
+static int
+release_queue(struct intel_engine_cs *engine,
+	      struct i915_vma *vma,
+	      int idx, int prio)
+{
+	struct i915_sched_attr attr = {
+		.priority = prio,
+	};
+	struct i915_request *rq;
+	u32 *cs;
+
+	rq = intel_engine_create_kernel_request(engine);
+	if (IS_ERR(rq))
+		return PTR_ERR(rq);
+
+	cs = intel_ring_begin(rq, 4);
+	if (IS_ERR(cs)) {
+		i915_request_add(rq);
+		return PTR_ERR(cs);
+	}
+
+	*cs++ = MI_STORE_DWORD_IMM_GEN4 | MI_USE_GGTT;
+	*cs++ = i915_ggtt_offset(vma) + 4 * (idx - 1);
+	*cs++ = 0;
+	*cs++ = 1;
+
+	intel_ring_advance(rq, cs);
+
+	i915_request_get(rq);
+	i915_request_add(rq);
+
+	local_bh_disable();
+	engine->schedule(rq, &attr);
+	local_bh_enable(); /* kick tasklet */
+
+	i915_request_put(rq);
+
+	return 0;
+}
+
+static int
+slice_semaphore_queue(struct intel_engine_cs *outer,
+		      struct i915_vma *vma,
+		      int count)
+{
+	struct intel_engine_cs *engine;
+	struct i915_request *head;
+	enum intel_engine_id id;
+	int err, i, n = 0;
+
+	head = semaphore_queue(outer, vma, n++);
+	if (IS_ERR(head))
+		return PTR_ERR(head);
+
+	for_each_engine(engine, outer->gt, id) {
+		for (i = 0; i < count; i++) {
+			struct i915_request *rq;
+
+			rq = semaphore_queue(engine, vma, n++);
+			if (IS_ERR(rq)) {
+				err = PTR_ERR(rq);
+				goto out;
+			}
+
+			i915_request_put(rq);
+		}
+	}
+
+	err = release_queue(outer, vma, n, INT_MAX);
+	if (err)
+		goto out;
+
+	if (i915_request_wait(head, 0,
+			      2 * RUNTIME_INFO(outer->i915)->num_engines * (count + 2) * (count + 3)) < 0) {
+		pr_err("Failed to slice along semaphore chain of length (%d, %d)!\n",
+		       count, n);
+		GEM_TRACE_DUMP();
+		intel_gt_set_wedged(outer->gt);
+		err = -EIO;
+	}
+
+out:
+	i915_request_put(head);
+	return err;
+}
+
+static int live_timeslice_preempt(void *arg)
+{
+	struct intel_gt *gt = arg;
+	struct drm_i915_gem_object *obj;
+	struct i915_vma *vma;
+	void *vaddr;
+	int err = 0;
+	int count;
+
+	/*
+	 * If a request takes too long, we would like to give other users
+	 * a fair go on the GPU. In particular, users may create batches
+	 * that wait upon external input, where that input may even be
+	 * supplied by another GPU job. To avoid blocking forever, we
+	 * need to preempt the current task and replace it with another
+	 * ready task.
+	 */
+	if (!IS_ACTIVE(CONFIG_DRM_I915_TIMESLICE_DURATION))
+		return 0;
+
+	obj = i915_gem_object_create_internal(gt->i915, PAGE_SIZE);
+	if (IS_ERR(obj))
+		return PTR_ERR(obj);
+
+	vma = i915_vma_instance(obj, &gt->ggtt->vm, NULL);
+	if (IS_ERR(vma)) {
+		err = PTR_ERR(vma);
+		goto err_obj;
+	}
+
+	vaddr = i915_gem_object_pin_map(obj, I915_MAP_WC);
+	if (IS_ERR(vaddr)) {
+		err = PTR_ERR(vaddr);
+		goto err_obj;
+	}
+
+	err = i915_vma_pin(vma, 0, 0, PIN_GLOBAL);
+	if (err)
+		goto err_map;
+
+	for_each_prime_number_from(count, 1, 16) {
+		struct intel_engine_cs *engine;
+		enum intel_engine_id id;
+
+		for_each_engine(engine, gt, id) {
+			if (!intel_engine_has_preemption(engine))
+				continue;
+
+			memset(vaddr, 0, PAGE_SIZE);
+
+			err = slice_semaphore_queue(engine, vma, count);
+			if (err)
+				goto err_pin;
+
+			if (igt_flush_test(gt->i915)) {
+				err = -EIO;
+				goto err_pin;
+			}
+		}
+	}
+
+err_pin:
+	i915_vma_unpin(vma);
+err_map:
+	i915_gem_object_unpin_map(obj);
+err_obj:
+	i915_gem_object_put(obj);
+	return err;
+}
+
+static struct i915_request *nop_request(struct intel_engine_cs *engine)
+{
+	struct i915_request *rq;
+
+	rq = intel_engine_create_kernel_request(engine);
+	if (IS_ERR(rq))
+		return rq;
+
+	i915_request_get(rq);
+	i915_request_add(rq);
+
+	return rq;
+}
+
+static void wait_for_submit(struct intel_engine_cs *engine,
+			    struct i915_request *rq)
+{
+	do {
+		cond_resched();
+		intel_engine_flush_submission(engine);
+	} while (!i915_request_is_active(rq));
+}
+
+static long timeslice_threshold(const struct intel_engine_cs *engine)
+{
+	return 2 * msecs_to_jiffies_timeout(timeslice(engine)) + 1;
+}
+
+static int live_timeslice_queue(void *arg)
+{
+	struct intel_gt *gt = arg;
+	struct drm_i915_gem_object *obj;
+	struct intel_engine_cs *engine;
+	enum intel_engine_id id;
+	struct i915_vma *vma;
+	void *vaddr;
+	int err = 0;
+
+	/*
+	 * Make sure that even if ELSP[0] and ELSP[1] are filled, with
+	 * timeslicing between them disabled, we *do* enable timeslicing
+	 * if the queue demands it. (Normally, we do not submit if
+	 * ELSP[1] is already occupied, so must rely on timeslicing to
+	 * eject ELSP[0] in favour of the queue.)
+	 */
+	if (!IS_ACTIVE(CONFIG_DRM_I915_TIMESLICE_DURATION))
+		return 0;
+
+	obj = i915_gem_object_create_internal(gt->i915, PAGE_SIZE);
+	if (IS_ERR(obj))
+		return PTR_ERR(obj);
+
+	vma = i915_vma_instance(obj, &gt->ggtt->vm, NULL);
+	if (IS_ERR(vma)) {
+		err = PTR_ERR(vma);
+		goto err_obj;
+	}
+
+	vaddr = i915_gem_object_pin_map(obj, I915_MAP_WC);
+	if (IS_ERR(vaddr)) {
+		err = PTR_ERR(vaddr);
+		goto err_obj;
+	}
+
+	err = i915_vma_pin(vma, 0, 0, PIN_GLOBAL);
+	if (err)
+		goto err_map;
+
+	for_each_engine(engine, gt, id) {
+		struct i915_sched_attr attr = {
+			.priority = I915_USER_PRIORITY(I915_PRIORITY_MAX),
+		};
+		struct i915_request *rq, *nop;
+
+		if (!intel_engine_has_preemption(engine))
+			continue;
+
+		memset(vaddr, 0, PAGE_SIZE);
+
+		/* ELSP[0]: semaphore wait */
+		rq = semaphore_queue(engine, vma, 0);
+		if (IS_ERR(rq)) {
+			err = PTR_ERR(rq);
+			goto err_pin;
+		}
+		engine->schedule(rq, &attr);
+		wait_for_submit(engine, rq);
+
+		/* ELSP[1]: nop request */
+		nop = nop_request(engine);
+		if (IS_ERR(nop)) {
+			err = PTR_ERR(nop);
+			i915_request_put(rq);
+			goto err_pin;
+		}
+		wait_for_submit(engine, nop);
+		i915_request_put(nop);
+
+		GEM_BUG_ON(i915_request_completed(rq));
+		GEM_BUG_ON(execlists_active(&engine->execlists) != rq);
+
+		/* Queue: semaphore signal, matching priority as semaphore */
+		err = release_queue(engine, vma, 1, effective_prio(rq));
+		if (err) {
+			i915_request_put(rq);
+			goto err_pin;
+		}
+
+		intel_engine_flush_submission(engine);
+		if (!READ_ONCE(engine->execlists.timer.expires) &&
+		    !i915_request_completed(rq)) {
+			struct drm_printer p =
+				drm_info_printer(gt->i915->drm.dev);
+
+			GEM_TRACE_ERR("%s: Failed to enable timeslicing!\n",
+				      engine->name);
+			intel_engine_dump(engine, &p,
+					  "%s\n", engine->name);
+			GEM_TRACE_DUMP();
+
+			memset(vaddr, 0xff, PAGE_SIZE);
+			err = -EINVAL;
+		}
+
+		/* Timeslice every jiffy, so within 2 we should signal */
+		if (i915_request_wait(rq, 0, timeslice_threshold(engine)) < 0) {
+			struct drm_printer p =
+				drm_info_printer(gt->i915->drm.dev);
+
+			pr_err("%s: Failed to timeslice into queue\n",
+			       engine->name);
+			intel_engine_dump(engine, &p,
+					  "%s\n", engine->name);
+
+			memset(vaddr, 0xff, PAGE_SIZE);
+			err = -EIO;
+		}
+		i915_request_put(rq);
+		if (err)
+			break;
+	}
+
+err_pin:
+	i915_vma_unpin(vma);
+err_map:
+	i915_gem_object_unpin_map(obj);
+err_obj:
+	i915_gem_object_put(obj);
+	return err;
+}
+
+static int live_busywait_preempt(void *arg)
+{
+	struct intel_gt *gt = arg;
+	struct i915_gem_context *ctx_hi, *ctx_lo;
+	struct intel_engine_cs *engine;
+	struct drm_i915_gem_object *obj;
+	struct i915_vma *vma;
+	enum intel_engine_id id;
+	int err = -ENOMEM;
+	u32 *map;
+
+	/*
+	 * Verify that even without HAS_LOGICAL_RING_PREEMPTION, we can
+	 * preempt the busywaits used to synchronise between rings.
+	 */
+
+	ctx_hi = kernel_context(gt->i915);
+	if (!ctx_hi)
+		return -ENOMEM;
+	ctx_hi->sched.priority =
+		I915_USER_PRIORITY(I915_CONTEXT_MAX_USER_PRIORITY);
+
+	ctx_lo = kernel_context(gt->i915);
+	if (!ctx_lo)
+		goto err_ctx_hi;
+	ctx_lo->sched.priority =
+		I915_USER_PRIORITY(I915_CONTEXT_MIN_USER_PRIORITY);
+
+	obj = i915_gem_object_create_internal(gt->i915, PAGE_SIZE);
+	if (IS_ERR(obj)) {
+		err = PTR_ERR(obj);
+		goto err_ctx_lo;
+	}
+
+	map = i915_gem_object_pin_map(obj, I915_MAP_WC);
+	if (IS_ERR(map)) {
+		err = PTR_ERR(map);
+		goto err_obj;
+	}
+
+	vma = i915_vma_instance(obj, &gt->ggtt->vm, NULL);
+	if (IS_ERR(vma)) {
+		err = PTR_ERR(vma);
+		goto err_map;
+	}
+
+	err = i915_vma_pin(vma, 0, 0, PIN_GLOBAL);
+	if (err)
+		goto err_map;
+
+	for_each_engine(engine, gt, id) {
+		struct i915_request *lo, *hi;
+		struct igt_live_test t;
+		u32 *cs;
+
+		if (!intel_engine_has_preemption(engine))
+			continue;
+
+		if (!intel_engine_can_store_dword(engine))
+			continue;
+
+		if (igt_live_test_begin(&t, gt->i915, __func__, engine->name)) {
+			err = -EIO;
+			goto err_vma;
+		}
+
+		/*
+		 * We create two requests. The low priority request
+		 * busywaits on a semaphore (inside the ringbuffer where
+		 * it should be preemptible) and the high priority request
+		 * uses a MI_STORE_DWORD_IMM to update the semaphore value
+		 * allowing the first request to complete. If preemption
+		 * fails, we hang instead.
+		 */
+
+		lo = igt_request_alloc(ctx_lo, engine);
+		if (IS_ERR(lo)) {
+			err = PTR_ERR(lo);
+			goto err_vma;
+		}
+
+		cs = intel_ring_begin(lo, 8);
+		if (IS_ERR(cs)) {
+			err = PTR_ERR(cs);
+			i915_request_add(lo);
+			goto err_vma;
+		}
+
+		*cs++ = MI_STORE_DWORD_IMM_GEN4 | MI_USE_GGTT;
+		*cs++ = i915_ggtt_offset(vma);
+		*cs++ = 0;
+		*cs++ = 1;
+
+		/* XXX Do we need a flush + invalidate here? */
+
+		*cs++ = MI_SEMAPHORE_WAIT |
+			MI_SEMAPHORE_GLOBAL_GTT |
+			MI_SEMAPHORE_POLL |
+			MI_SEMAPHORE_SAD_EQ_SDD;
+		*cs++ = 0;
+		*cs++ = i915_ggtt_offset(vma);
+		*cs++ = 0;
+
+		intel_ring_advance(lo, cs);
+
+		i915_request_get(lo);
+		i915_request_add(lo);
+
+		if (wait_for(READ_ONCE(*map), 10)) {
+			i915_request_put(lo);
+			err = -ETIMEDOUT;
+			goto err_vma;
+		}
+
+		/* Low priority request should be busywaiting now */
+		if (i915_request_wait(lo, 0, 1) != -ETIME) {
+			i915_request_put(lo);
+			pr_err("%s: Busywaiting request did not busywait!\n",
+			       engine->name);
+			err = -EIO;
+			goto err_vma;
+		}
+
+		hi = igt_request_alloc(ctx_hi, engine);
+		if (IS_ERR(hi)) {
+			err = PTR_ERR(hi);
+			i915_request_put(lo);
+			goto err_vma;
+		}
+
+		cs = intel_ring_begin(hi, 4);
+		if (IS_ERR(cs)) {
+			err = PTR_ERR(cs);
+			i915_request_add(hi);
+			i915_request_put(lo);
+			goto err_vma;
+		}
+
+		*cs++ = MI_STORE_DWORD_IMM_GEN4 | MI_USE_GGTT;
+		*cs++ = i915_ggtt_offset(vma);
+		*cs++ = 0;
+		*cs++ = 0;
+
+		intel_ring_advance(hi, cs);
+		i915_request_add(hi);
+
+		if (i915_request_wait(lo, 0, HZ / 5) < 0) {
+			struct drm_printer p = drm_info_printer(gt->i915->drm.dev);
+
+			pr_err("%s: Failed to preempt semaphore busywait!\n",
+			       engine->name);
+
+			intel_engine_dump(engine, &p, "%s\n", engine->name);
+			GEM_TRACE_DUMP();
+
+			i915_request_put(lo);
+			intel_gt_set_wedged(gt);
+			err = -EIO;
+			goto err_vma;
+		}
+		GEM_BUG_ON(READ_ONCE(*map));
+		i915_request_put(lo);
+
+		if (igt_live_test_end(&t)) {
+			err = -EIO;
+			goto err_vma;
+		}
+	}
+
+	err = 0;
+err_vma:
+	i915_vma_unpin(vma);
+err_map:
+	i915_gem_object_unpin_map(obj);
+err_obj:
+	i915_gem_object_put(obj);
+err_ctx_lo:
+	kernel_context_close(ctx_lo);
+err_ctx_hi:
+	kernel_context_close(ctx_hi);
+	return err;
+}
+
+static struct i915_request *
+spinner_create_request(struct igt_spinner *spin,
+		       struct i915_gem_context *ctx,
+		       struct intel_engine_cs *engine,
+		       u32 arb)
+{
+	struct intel_context *ce;
+	struct i915_request *rq;
+
+	ce = i915_gem_context_get_engine(ctx, engine->legacy_idx);
+	if (IS_ERR(ce))
+		return ERR_CAST(ce);
+
+	rq = igt_spinner_create_request(spin, ce, arb);
+	intel_context_put(ce);
+	return rq;
+}
+
+static int live_preempt(void *arg)
+{
+	struct intel_gt *gt = arg;
+	struct i915_gem_context *ctx_hi, *ctx_lo;
+	struct igt_spinner spin_hi, spin_lo;
+	struct intel_engine_cs *engine;
+	enum intel_engine_id id;
+	int err = -ENOMEM;
+
+	if (!HAS_LOGICAL_RING_PREEMPTION(gt->i915))
+		return 0;
+
+	if (!(gt->i915->caps.scheduler & I915_SCHEDULER_CAP_PREEMPTION))
+		pr_err("Logical preemption supported, but not exposed\n");
+
+	if (igt_spinner_init(&spin_hi, gt))
+		return -ENOMEM;
+
+	if (igt_spinner_init(&spin_lo, gt))
+		goto err_spin_hi;
+
+	ctx_hi = kernel_context(gt->i915);
+	if (!ctx_hi)
+		goto err_spin_lo;
+	ctx_hi->sched.priority =
+		I915_USER_PRIORITY(I915_CONTEXT_MAX_USER_PRIORITY);
+
+	ctx_lo = kernel_context(gt->i915);
+	if (!ctx_lo)
+		goto err_ctx_hi;
+	ctx_lo->sched.priority =
+		I915_USER_PRIORITY(I915_CONTEXT_MIN_USER_PRIORITY);
+
+	for_each_engine(engine, gt, id) {
+		struct igt_live_test t;
+		struct i915_request *rq;
+
+		if (!intel_engine_has_preemption(engine))
+			continue;
+
+		if (igt_live_test_begin(&t, gt->i915, __func__, engine->name)) {
+			err = -EIO;
+			goto err_ctx_lo;
+		}
+
+		rq = spinner_create_request(&spin_lo, ctx_lo, engine,
+					    MI_ARB_CHECK);
+		if (IS_ERR(rq)) {
+			err = PTR_ERR(rq);
+			goto err_ctx_lo;
+		}
+
+		i915_request_add(rq);
+		if (!igt_wait_for_spinner(&spin_lo, rq)) {
+			GEM_TRACE("lo spinner failed to start\n");
+			GEM_TRACE_DUMP();
+			intel_gt_set_wedged(gt);
+			err = -EIO;
+			goto err_ctx_lo;
+		}
+
+		rq = spinner_create_request(&spin_hi, ctx_hi, engine,
+					    MI_ARB_CHECK);
+		if (IS_ERR(rq)) {
+			igt_spinner_end(&spin_lo);
+			err = PTR_ERR(rq);
+			goto err_ctx_lo;
+		}
+
+		i915_request_add(rq);
+		if (!igt_wait_for_spinner(&spin_hi, rq)) {
+			GEM_TRACE("hi spinner failed to start\n");
+			GEM_TRACE_DUMP();
+			intel_gt_set_wedged(gt);
+			err = -EIO;
+			goto err_ctx_lo;
+		}
+
+		igt_spinner_end(&spin_hi);
+		igt_spinner_end(&spin_lo);
+
+		if (igt_live_test_end(&t)) {
+			err = -EIO;
+			goto err_ctx_lo;
+		}
+	}
+
+	err = 0;
+err_ctx_lo:
+	kernel_context_close(ctx_lo);
+err_ctx_hi:
+	kernel_context_close(ctx_hi);
+err_spin_lo:
+	igt_spinner_fini(&spin_lo);
+err_spin_hi:
+	igt_spinner_fini(&spin_hi);
+	return err;
+}
+
+static int live_late_preempt(void *arg)
+{
+	struct intel_gt *gt = arg;
+	struct i915_gem_context *ctx_hi, *ctx_lo;
+	struct igt_spinner spin_hi, spin_lo;
+	struct intel_engine_cs *engine;
+	struct i915_sched_attr attr = {};
+	enum intel_engine_id id;
+	int err = -ENOMEM;
+
+	if (!HAS_LOGICAL_RING_PREEMPTION(gt->i915))
+		return 0;
+
+	if (igt_spinner_init(&spin_hi, gt))
+		return -ENOMEM;
+
+	if (igt_spinner_init(&spin_lo, gt))
+		goto err_spin_hi;
+
+	ctx_hi = kernel_context(gt->i915);
+	if (!ctx_hi)
+		goto err_spin_lo;
+
+	ctx_lo = kernel_context(gt->i915);
+	if (!ctx_lo)
+		goto err_ctx_hi;
+
+	/* Make sure ctx_lo stays before ctx_hi until we trigger preemption. */
+	ctx_lo->sched.priority = I915_USER_PRIORITY(1);
+
+	for_each_engine(engine, gt, id) {
+		struct igt_live_test t;
+		struct i915_request *rq;
+
+		if (!intel_engine_has_preemption(engine))
+			continue;
+
+		if (igt_live_test_begin(&t, gt->i915, __func__, engine->name)) {
+			err = -EIO;
+			goto err_ctx_lo;
+		}
+
+		rq = spinner_create_request(&spin_lo, ctx_lo, engine,
+					    MI_ARB_CHECK);
+		if (IS_ERR(rq)) {
+			err = PTR_ERR(rq);
+			goto err_ctx_lo;
+		}
+
+		i915_request_add(rq);
+		if (!igt_wait_for_spinner(&spin_lo, rq)) {
+			pr_err("First context failed to start\n");
+			goto err_wedged;
+		}
+
+		rq = spinner_create_request(&spin_hi, ctx_hi, engine,
+					    MI_NOOP);
+		if (IS_ERR(rq)) {
+			igt_spinner_end(&spin_lo);
+			err = PTR_ERR(rq);
+			goto err_ctx_lo;
+		}
+
+		i915_request_add(rq);
+		if (igt_wait_for_spinner(&spin_hi, rq)) {
+			pr_err("Second context overtook first?\n");
+			goto err_wedged;
+		}
+
+		attr.priority = I915_USER_PRIORITY(I915_PRIORITY_MAX);
+		engine->schedule(rq, &attr);
+
+		if (!igt_wait_for_spinner(&spin_hi, rq)) {
+			pr_err("High priority context failed to preempt the low priority context\n");
+			GEM_TRACE_DUMP();
+			goto err_wedged;
+		}
+
+		igt_spinner_end(&spin_hi);
+		igt_spinner_end(&spin_lo);
+
+		if (igt_live_test_end(&t)) {
+			err = -EIO;
+			goto err_ctx_lo;
+		}
+	}
+
+	err = 0;
+err_ctx_lo:
+	kernel_context_close(ctx_lo);
+err_ctx_hi:
+	kernel_context_close(ctx_hi);
+err_spin_lo:
+	igt_spinner_fini(&spin_lo);
+err_spin_hi:
+	igt_spinner_fini(&spin_hi);
+	return err;
+
+err_wedged:
+	igt_spinner_end(&spin_hi);
+	igt_spinner_end(&spin_lo);
+	intel_gt_set_wedged(gt);
+	err = -EIO;
+	goto err_ctx_lo;
+}
+
+struct preempt_client {
+	struct igt_spinner spin;
+	struct i915_gem_context *ctx;
+};
+
+static int preempt_client_init(struct intel_gt *gt, struct preempt_client *c)
+{
+	c->ctx = kernel_context(gt->i915);
+	if (!c->ctx)
+		return -ENOMEM;
+
+	if (igt_spinner_init(&c->spin, gt))
+		goto err_ctx;
+
+	return 0;
+
+err_ctx:
+	kernel_context_close(c->ctx);
+	return -ENOMEM;
+}
+
+static void preempt_client_fini(struct preempt_client *c)
+{
+	igt_spinner_fini(&c->spin);
+	kernel_context_close(c->ctx);
+}
+
+static int live_nopreempt(void *arg)
+{
+	struct intel_gt *gt = arg;
+	struct intel_engine_cs *engine;
+	struct preempt_client a, b;
+	enum intel_engine_id id;
+	int err = -ENOMEM;
+
+	/*
+	 * Verify that we can disable preemption for an individual request
+	 * that may be being observed and not want to be interrupted.
+	 */
+
+	if (!HAS_LOGICAL_RING_PREEMPTION(gt->i915))
+		return 0;
+
+	if (preempt_client_init(gt, &a))
+		return -ENOMEM;
+	if (preempt_client_init(gt, &b))
+		goto err_client_a;
+	b.ctx->sched.priority = I915_USER_PRIORITY(I915_PRIORITY_MAX);
+
+	for_each_engine(engine, gt, id) {
+		struct i915_request *rq_a, *rq_b;
+
+		if (!intel_engine_has_preemption(engine))
+			continue;
+
+		engine->execlists.preempt_hang.count = 0;
+
+		rq_a = spinner_create_request(&a.spin,
+					      a.ctx, engine,
+					      MI_ARB_CHECK);
+		if (IS_ERR(rq_a)) {
+			err = PTR_ERR(rq_a);
+			goto err_client_b;
+		}
+
+		/* Low priority client, but unpreemptable! */
+		rq_a->flags |= I915_REQUEST_NOPREEMPT;
+
+		i915_request_add(rq_a);
+		if (!igt_wait_for_spinner(&a.spin, rq_a)) {
+			pr_err("First client failed to start\n");
+			goto err_wedged;
+		}
+
+		rq_b = spinner_create_request(&b.spin,
+					      b.ctx, engine,
+					      MI_ARB_CHECK);
+		if (IS_ERR(rq_b)) {
+			err = PTR_ERR(rq_b);
+			goto err_client_b;
+		}
+
+		i915_request_add(rq_b);
+
+		/* B is much more important than A! (But A is unpreemptable.) */
+		GEM_BUG_ON(rq_prio(rq_b) <= rq_prio(rq_a));
+
+		/* Wait long enough for preemption and timeslicing */
+		if (igt_wait_for_spinner(&b.spin, rq_b)) {
+			pr_err("Second client started too early!\n");
+			goto err_wedged;
+		}
+
+		igt_spinner_end(&a.spin);
+
+		if (!igt_wait_for_spinner(&b.spin, rq_b)) {
+			pr_err("Second client failed to start\n");
+			goto err_wedged;
+		}
+
+		igt_spinner_end(&b.spin);
+
+		if (engine->execlists.preempt_hang.count) {
+			pr_err("Preemption recorded x%d; should have been suppressed!\n",
+			       engine->execlists.preempt_hang.count);
+			err = -EINVAL;
+			goto err_wedged;
+		}
+
+		if (igt_flush_test(gt->i915))
+			goto err_wedged;
+	}
+
+	err = 0;
+err_client_b:
+	preempt_client_fini(&b);
+err_client_a:
+	preempt_client_fini(&a);
+	return err;
+
+err_wedged:
+	igt_spinner_end(&b.spin);
+	igt_spinner_end(&a.spin);
+	intel_gt_set_wedged(gt);
+	err = -EIO;
+	goto err_client_b;
+}
+
+struct live_preempt_cancel {
+	struct intel_engine_cs *engine;
+	struct preempt_client a, b;
+};
+
+static int __cancel_active0(struct live_preempt_cancel *arg)
+{
+	struct i915_request *rq;
+	struct igt_live_test t;
+	int err;
+
+	/* Preempt cancel of ELSP0 */
+	GEM_TRACE("%s(%s)\n", __func__, arg->engine->name);
+	if (igt_live_test_begin(&t, arg->engine->i915,
+				__func__, arg->engine->name))
+		return -EIO;
+
+	clear_bit(CONTEXT_BANNED, &arg->a.ctx->flags);
+	rq = spinner_create_request(&arg->a.spin,
+				    arg->a.ctx, arg->engine,
+				    MI_ARB_CHECK);
+	if (IS_ERR(rq))
+		return PTR_ERR(rq);
+
+	i915_request_get(rq);
+	i915_request_add(rq);
+	if (!igt_wait_for_spinner(&arg->a.spin, rq)) {
+		err = -EIO;
+		goto out;
+	}
+
+	i915_gem_context_set_banned(arg->a.ctx);
+	err = intel_engine_pulse(arg->engine);
+	if (err)
+		goto out;
+
+	if (i915_request_wait(rq, 0, HZ / 5) < 0) {
+		err = -EIO;
+		goto out;
+	}
+
+	if (rq->fence.error != -EIO) {
+		pr_err("Cancelled inflight0 request did not report -EIO\n");
+		err = -EINVAL;
+		goto out;
+	}
+
+out:
+	i915_request_put(rq);
+	if (igt_live_test_end(&t))
+		err = -EIO;
+	return err;
+}
+
+static int __cancel_active1(struct live_preempt_cancel *arg)
+{
+	struct i915_request *rq[2] = {};
+	struct igt_live_test t;
+	int err;
+
+	/* Preempt cancel of ELSP1 */
+	GEM_TRACE("%s(%s)\n", __func__, arg->engine->name);
+	if (igt_live_test_begin(&t, arg->engine->i915,
+				__func__, arg->engine->name))
+		return -EIO;
+
+	clear_bit(CONTEXT_BANNED, &arg->a.ctx->flags);
+	rq[0] = spinner_create_request(&arg->a.spin,
+				       arg->a.ctx, arg->engine,
+				       MI_NOOP); /* no preemption */
+	if (IS_ERR(rq[0]))
+		return PTR_ERR(rq[0]);
+
+	i915_request_get(rq[0]);
+	i915_request_add(rq[0]);
+	if (!igt_wait_for_spinner(&arg->a.spin, rq[0])) {
+		err = -EIO;
+		goto out;
+	}
+
+	clear_bit(CONTEXT_BANNED, &arg->b.ctx->flags);
+	rq[1] = spinner_create_request(&arg->b.spin,
+				       arg->b.ctx, arg->engine,
+				       MI_ARB_CHECK);
+	if (IS_ERR(rq[1])) {
+		err = PTR_ERR(rq[1]);
+		goto out;
+	}
+
+	i915_request_get(rq[1]);
+	err = i915_request_await_dma_fence(rq[1], &rq[0]->fence);
+	i915_request_add(rq[1]);
+	if (err)
+		goto out;
+
+	i915_gem_context_set_banned(arg->b.ctx);
+	err = intel_engine_pulse(arg->engine);
+	if (err)
+		goto out;
+
+	igt_spinner_end(&arg->a.spin);
+	if (i915_request_wait(rq[1], 0, HZ / 5) < 0) {
+		err = -EIO;
+		goto out;
+	}
+
+	if (rq[0]->fence.error != 0) {
+		pr_err("Normal inflight0 request did not complete\n");
+		err = -EINVAL;
+		goto out;
+	}
+
+	if (rq[1]->fence.error != -EIO) {
+		pr_err("Cancelled inflight1 request did not report -EIO\n");
+		err = -EINVAL;
+		goto out;
+	}
+
+out:
+	i915_request_put(rq[1]);
+	i915_request_put(rq[0]);
+	if (igt_live_test_end(&t))
+		err = -EIO;
+	return err;
+}
+
+static int __cancel_queued(struct live_preempt_cancel *arg)
+{
+	struct i915_request *rq[3] = {};
+	struct igt_live_test t;
+	int err;
+
+	/* Full ELSP and one in the wings */
+	GEM_TRACE("%s(%s)\n", __func__, arg->engine->name);
+	if (igt_live_test_begin(&t, arg->engine->i915,
+				__func__, arg->engine->name))
+		return -EIO;
+
+	clear_bit(CONTEXT_BANNED, &arg->a.ctx->flags);
+	rq[0] = spinner_create_request(&arg->a.spin,
+				       arg->a.ctx, arg->engine,
+				       MI_ARB_CHECK);
+	if (IS_ERR(rq[0]))
+		return PTR_ERR(rq[0]);
+
+	i915_request_get(rq[0]);
+	i915_request_add(rq[0]);
+	if (!igt_wait_for_spinner(&arg->a.spin, rq[0])) {
+		err = -EIO;
+		goto out;
+	}
+
+	clear_bit(CONTEXT_BANNED, &arg->b.ctx->flags);
+	rq[1] = igt_request_alloc(arg->b.ctx, arg->engine);
+	if (IS_ERR(rq[1])) {
+		err = PTR_ERR(rq[1]);
+		goto out;
+	}
+
+	i915_request_get(rq[1]);
+	err = i915_request_await_dma_fence(rq[1], &rq[0]->fence);
+	i915_request_add(rq[1]);
+	if (err)
+		goto out;
+
+	rq[2] = spinner_create_request(&arg->b.spin,
+				       arg->a.ctx, arg->engine,
+				       MI_ARB_CHECK);
+	if (IS_ERR(rq[2])) {
+		err = PTR_ERR(rq[2]);
+		goto out;
+	}
+
+	i915_request_get(rq[2]);
+	err = i915_request_await_dma_fence(rq[2], &rq[1]->fence);
+	i915_request_add(rq[2]);
+	if (err)
+		goto out;
+
+	i915_gem_context_set_banned(arg->a.ctx);
+	err = intel_engine_pulse(arg->engine);
+	if (err)
+		goto out;
+
+	if (i915_request_wait(rq[2], 0, HZ / 5) < 0) {
+		err = -EIO;
+		goto out;
+	}
+
+	if (rq[0]->fence.error != -EIO) {
+		pr_err("Cancelled inflight0 request did not report -EIO\n");
+		err = -EINVAL;
+		goto out;
+	}
+
+	if (rq[1]->fence.error != 0) {
+		pr_err("Normal inflight1 request did not complete\n");
+		err = -EINVAL;
+		goto out;
+	}
+
+	if (rq[2]->fence.error != -EIO) {
+		pr_err("Cancelled queued request did not report -EIO\n");
+		err = -EINVAL;
+		goto out;
+	}
+
+out:
+	i915_request_put(rq[2]);
+	i915_request_put(rq[1]);
+	i915_request_put(rq[0]);
+	if (igt_live_test_end(&t))
+		err = -EIO;
+	return err;
+}
+
+static int __cancel_hostile(struct live_preempt_cancel *arg)
+{
+	struct i915_request *rq;
+	int err;
+
+	/* Preempt cancel non-preemptible spinner in ELSP0 */
+	if (!IS_ACTIVE(CONFIG_DRM_I915_PREEMPT_TIMEOUT))
+		return 0;
+
+	GEM_TRACE("%s(%s)\n", __func__, arg->engine->name);
+	clear_bit(CONTEXT_BANNED, &arg->a.ctx->flags);
+	rq = spinner_create_request(&arg->a.spin,
+				    arg->a.ctx, arg->engine,
+				    MI_NOOP); /* preemption disabled */
+	if (IS_ERR(rq))
+		return PTR_ERR(rq);
+
+	i915_request_get(rq);
+	i915_request_add(rq);
+	if (!igt_wait_for_spinner(&arg->a.spin, rq)) {
+		err = -EIO;
+		goto out;
+	}
+
+	i915_gem_context_set_banned(arg->a.ctx);
+	err = intel_engine_pulse(arg->engine); /* force reset */
+	if (err)
+		goto out;
+
+	if (i915_request_wait(rq, 0, HZ / 5) < 0) {
+		err = -EIO;
+		goto out;
+	}
+
+	if (rq->fence.error != -EIO) {
+		pr_err("Cancelled inflight0 request did not report -EIO\n");
+		err = -EINVAL;
+		goto out;
+	}
+
+out:
+	i915_request_put(rq);
+	if (igt_flush_test(arg->engine->i915))
+		err = -EIO;
+	return err;
+}
+
+static int live_preempt_cancel(void *arg)
+{
+	struct intel_gt *gt = arg;
+	struct live_preempt_cancel data;
+	enum intel_engine_id id;
+	int err = -ENOMEM;
+
+	/*
+	 * To cancel an inflight context, we need to first remove it from the
+	 * GPU. That sounds like preemption! Plus a little bit of bookkeeping.
+	 */
+
+	if (!HAS_LOGICAL_RING_PREEMPTION(gt->i915))
+		return 0;
+
+	if (preempt_client_init(gt, &data.a))
+		return -ENOMEM;
+	if (preempt_client_init(gt, &data.b))
+		goto err_client_a;
+
+	for_each_engine(data.engine, gt, id) {
+		if (!intel_engine_has_preemption(data.engine))
+			continue;
+
+		err = __cancel_active0(&data);
+		if (err)
+			goto err_wedged;
+
+		err = __cancel_active1(&data);
+		if (err)
+			goto err_wedged;
+
+		err = __cancel_queued(&data);
+		if (err)
+			goto err_wedged;
+
+		err = __cancel_hostile(&data);
+		if (err)
+			goto err_wedged;
+	}
+
+	err = 0;
+err_client_b:
+	preempt_client_fini(&data.b);
+err_client_a:
+	preempt_client_fini(&data.a);
+	return err;
+
+err_wedged:
+	GEM_TRACE_DUMP();
+	igt_spinner_end(&data.b.spin);
+	igt_spinner_end(&data.a.spin);
+	intel_gt_set_wedged(gt);
+	goto err_client_b;
+}
+
+static int live_suppress_self_preempt(void *arg)
+{
+	struct intel_gt *gt = arg;
+	struct intel_engine_cs *engine;
+	struct i915_sched_attr attr = {
+		.priority = I915_USER_PRIORITY(I915_PRIORITY_MAX)
+	};
+	struct preempt_client a, b;
+	enum intel_engine_id id;
+	int err = -ENOMEM;
+
+	/*
+	 * Verify that if a preemption request does not cause a change in
+	 * the current execution order, the preempt-to-idle injection is
+	 * skipped and that we do not accidentally apply it after the CS
+	 * completion event.
+	 */
+
+	if (!HAS_LOGICAL_RING_PREEMPTION(gt->i915))
+		return 0;
+
+	if (USES_GUC_SUBMISSION(gt->i915))
+		return 0; /* presume black box */
+
+	if (intel_vgpu_active(gt->i915))
+		return 0; /* GVT forces single port & request submission */
+
+	if (preempt_client_init(gt, &a))
+		return -ENOMEM;
+	if (preempt_client_init(gt, &b))
+		goto err_client_a;
+
+	for_each_engine(engine, gt, id) {
+		struct i915_request *rq_a, *rq_b;
+		int depth;
+
+		if (!intel_engine_has_preemption(engine))
+			continue;
+
+		if (igt_flush_test(gt->i915))
+			goto err_wedged;
+
+		intel_engine_pm_get(engine);
+		engine->execlists.preempt_hang.count = 0;
+
+		rq_a = spinner_create_request(&a.spin,
+					      a.ctx, engine,
+					      MI_NOOP);
+		if (IS_ERR(rq_a)) {
+			err = PTR_ERR(rq_a);
+			intel_engine_pm_put(engine);
+			goto err_client_b;
+		}
+
+		i915_request_add(rq_a);
+		if (!igt_wait_for_spinner(&a.spin, rq_a)) {
+			pr_err("First client failed to start\n");
+			intel_engine_pm_put(engine);
+			goto err_wedged;
+		}
+
+		/* Keep postponing the timer to avoid premature slicing */
+		mod_timer(&engine->execlists.timer, jiffies + HZ);
+		for (depth = 0; depth < 8; depth++) {
+			rq_b = spinner_create_request(&b.spin,
+						      b.ctx, engine,
+						      MI_NOOP);
+			if (IS_ERR(rq_b)) {
+				err = PTR_ERR(rq_b);
+				intel_engine_pm_put(engine);
+				goto err_client_b;
+			}
+			i915_request_add(rq_b);
+
+			GEM_BUG_ON(i915_request_completed(rq_a));
+			engine->schedule(rq_a, &attr);
+			igt_spinner_end(&a.spin);
+
+			if (!igt_wait_for_spinner(&b.spin, rq_b)) {
+				pr_err("Second client failed to start\n");
+				intel_engine_pm_put(engine);
+				goto err_wedged;
+			}
+
+			swap(a, b);
+			rq_a = rq_b;
+		}
+		igt_spinner_end(&a.spin);
+
+		if (engine->execlists.preempt_hang.count) {
+			pr_err("Preemption on %s recorded x%d, depth %d; should have been suppressed!\n",
+			       engine->name,
+			       engine->execlists.preempt_hang.count,
+			       depth);
+			intel_engine_pm_put(engine);
+			err = -EINVAL;
+			goto err_client_b;
+		}
+
+		intel_engine_pm_put(engine);
+		if (igt_flush_test(gt->i915))
+			goto err_wedged;
+	}
+
+	err = 0;
+err_client_b:
+	preempt_client_fini(&b);
+err_client_a:
+	preempt_client_fini(&a);
+	return err;
+
+err_wedged:
+	igt_spinner_end(&b.spin);
+	igt_spinner_end(&a.spin);
+	intel_gt_set_wedged(gt);
+	err = -EIO;
+	goto err_client_b;
+}
+
+static int __i915_sw_fence_call
+dummy_notify(struct i915_sw_fence *fence, enum i915_sw_fence_notify state)
+{
+	return NOTIFY_DONE;
+}
+
+static struct i915_request *dummy_request(struct intel_engine_cs *engine)
+{
+	struct i915_request *rq;
+
+	rq = kzalloc(sizeof(*rq), GFP_KERNEL);
+	if (!rq)
+		return NULL;
+
+	rq->engine = engine;
+
+	spin_lock_init(&rq->lock);
+	INIT_LIST_HEAD(&rq->fence.cb_list);
+	rq->fence.lock = &rq->lock;
+	rq->fence.ops = &i915_fence_ops;
+
+	i915_sched_node_init(&rq->sched);
+
+	/* mark this request as permanently incomplete */
+	rq->fence.seqno = 1;
+	BUILD_BUG_ON(sizeof(rq->fence.seqno) != 8); /* upper 32b == 0 */
+	rq->hwsp_seqno = (u32 *)&rq->fence.seqno + 1;
+	GEM_BUG_ON(i915_request_completed(rq));
+
+	i915_sw_fence_init(&rq->submit, dummy_notify);
+	set_bit(I915_FENCE_FLAG_ACTIVE, &rq->fence.flags);
+
+	spin_lock_init(&rq->lock);
+	rq->fence.lock = &rq->lock;
+	INIT_LIST_HEAD(&rq->fence.cb_list);
+
+	return rq;
+}
+
+static void dummy_request_free(struct i915_request *dummy)
+{
+	/* We have to fake the CS interrupt to kick the next request */
+	i915_sw_fence_commit(&dummy->submit);
+
+	i915_request_mark_complete(dummy);
+	dma_fence_signal(&dummy->fence);
+
+	i915_sched_node_fini(&dummy->sched);
+	i915_sw_fence_fini(&dummy->submit);
+
+	dma_fence_free(&dummy->fence);
+}
+
+static int live_suppress_wait_preempt(void *arg)
+{
+	struct intel_gt *gt = arg;
+	struct preempt_client client[4];
+	struct i915_request *rq[ARRAY_SIZE(client)] = {};
+	struct intel_engine_cs *engine;
+	enum intel_engine_id id;
+	int err = -ENOMEM;
+	int i;
+
+	/*
+	 * Waiters are given a little priority nudge, but not enough
+	 * to actually cause any preemption. Double check that we do
+	 * not needlessly generate preempt-to-idle cycles.
+	 */
+
+	if (!HAS_LOGICAL_RING_PREEMPTION(gt->i915))
+		return 0;
+
+	if (preempt_client_init(gt, &client[0])) /* ELSP[0] */
+		return -ENOMEM;
+	if (preempt_client_init(gt, &client[1])) /* ELSP[1] */
+		goto err_client_0;
+	if (preempt_client_init(gt, &client[2])) /* head of queue */
+		goto err_client_1;
+	if (preempt_client_init(gt, &client[3])) /* bystander */
+		goto err_client_2;
+
+	for_each_engine(engine, gt, id) {
+		int depth;
+
+		if (!intel_engine_has_preemption(engine))
+			continue;
+
+		if (!engine->emit_init_breadcrumb)
+			continue;
+
+		for (depth = 0; depth < ARRAY_SIZE(client); depth++) {
+			struct i915_request *dummy;
+
+			engine->execlists.preempt_hang.count = 0;
+
+			dummy = dummy_request(engine);
+			if (!dummy)
+				goto err_client_3;
+
+			for (i = 0; i < ARRAY_SIZE(client); i++) {
+				struct i915_request *this;
+
+				this = spinner_create_request(&client[i].spin,
+							      client[i].ctx, engine,
+							      MI_NOOP);
+				if (IS_ERR(this)) {
+					err = PTR_ERR(this);
+					goto err_wedged;
+				}
+
+				/* Disable NEWCLIENT promotion */
+				__i915_active_fence_set(&i915_request_timeline(this)->last_request,
+							&dummy->fence);
+
+				rq[i] = i915_request_get(this);
+				i915_request_add(this);
+			}
+
+			dummy_request_free(dummy);
+
+			GEM_BUG_ON(i915_request_completed(rq[0]));
+			if (!igt_wait_for_spinner(&client[0].spin, rq[0])) {
+				pr_err("%s: First client failed to start\n",
+				       engine->name);
+				goto err_wedged;
+			}
+			GEM_BUG_ON(!i915_request_started(rq[0]));
+
+			if (i915_request_wait(rq[depth],
+					      I915_WAIT_PRIORITY,
+					      1) != -ETIME) {
+				pr_err("%s: Waiter depth:%d completed!\n",
+				       engine->name, depth);
+				goto err_wedged;
+			}
+
+			for (i = 0; i < ARRAY_SIZE(client); i++) {
+				igt_spinner_end(&client[i].spin);
+				i915_request_put(rq[i]);
+				rq[i] = NULL;
+			}
+
+			if (igt_flush_test(gt->i915))
+				goto err_wedged;
+
+			if (engine->execlists.preempt_hang.count) {
+				pr_err("%s: Preemption recorded x%d, depth %d; should have been suppressed!\n",
+				       engine->name,
+				       engine->execlists.preempt_hang.count,
+				       depth);
+				err = -EINVAL;
+				goto err_client_3;
+			}
+		}
+	}
+
+	err = 0;
+err_client_3:
+	preempt_client_fini(&client[3]);
+err_client_2:
+	preempt_client_fini(&client[2]);
+err_client_1:
+	preempt_client_fini(&client[1]);
+err_client_0:
+	preempt_client_fini(&client[0]);
+	return err;
+
+err_wedged:
+	for (i = 0; i < ARRAY_SIZE(client); i++) {
+		igt_spinner_end(&client[i].spin);
+		i915_request_put(rq[i]);
+	}
+	intel_gt_set_wedged(gt);
+	err = -EIO;
+	goto err_client_3;
+}
+
+static int live_chain_preempt(void *arg)
+{
+	struct intel_gt *gt = arg;
+	struct intel_engine_cs *engine;
+	struct preempt_client hi, lo;
+	enum intel_engine_id id;
+	int err = -ENOMEM;
+
+	/*
+	 * Build a chain AB...BA between two contexts (A, B) and request
+	 * preemption of the last request. It should then complete before
+	 * the previously submitted spinner in B.
+	 */
+
+	if (!HAS_LOGICAL_RING_PREEMPTION(gt->i915))
+		return 0;
+
+	if (preempt_client_init(gt, &hi))
+		return -ENOMEM;
+
+	if (preempt_client_init(gt, &lo))
+		goto err_client_hi;
+
+	for_each_engine(engine, gt, id) {
+		struct i915_sched_attr attr = {
+			.priority = I915_USER_PRIORITY(I915_PRIORITY_MAX),
+		};
+		struct igt_live_test t;
+		struct i915_request *rq;
+		int ring_size, count, i;
+
+		if (!intel_engine_has_preemption(engine))
+			continue;
+
+		rq = spinner_create_request(&lo.spin,
+					    lo.ctx, engine,
+					    MI_ARB_CHECK);
+		if (IS_ERR(rq))
+			goto err_wedged;
+
+		i915_request_get(rq);
+		i915_request_add(rq);
+
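+		/*
+		 * Estimate how many requests fit in the ring: a single
+		 * spinner request consumes (wa_tail - head) bytes of space.
+		 */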
+		ring_size = rq->wa_tail - rq->head;
+		if (ring_size < 0)
+			ring_size += rq->ring->size;
+		ring_size = rq->ring->size / ring_size;
+		pr_debug("%s(%s): Using maximum of %d requests\n",
+			 __func__, engine->name, ring_size);
+
+		igt_spinner_end(&lo.spin);
+		if (i915_request_wait(rq, 0, HZ / 2) < 0) {
+			pr_err("Timed out waiting to flush %s\n", engine->name);
+			i915_request_put(rq);
+			goto err_wedged;
+		}
+		i915_request_put(rq);
+
+		if (igt_live_test_begin(&t, gt->i915, __func__, engine->name)) {
+			err = -EIO;
+			goto err_wedged;
+		}
+
+		for_each_prime_number_from(count, 1, ring_size) {
+			rq = spinner_create_request(&hi.spin,
+						    hi.ctx, engine,
+						    MI_ARB_CHECK);
+			if (IS_ERR(rq))
+				goto err_wedged;
+			i915_request_add(rq);
+			if (!igt_wait_for_spinner(&hi.spin, rq))
+				goto err_wedged;
+
+			rq = spinner_create_request(&lo.spin,
+						    lo.ctx, engine,
+						    MI_ARB_CHECK);
+			if (IS_ERR(rq))
+				goto err_wedged;
+			i915_request_add(rq);
+
+			for (i = 0; i < count; i++) {
+				rq = igt_request_alloc(lo.ctx, engine);
+				if (IS_ERR(rq))
+					goto err_wedged;
+				i915_request_add(rq);
+			}
+
+			rq = igt_request_alloc(hi.ctx, engine);
+			if (IS_ERR(rq))
+				goto err_wedged;
+
+			i915_request_get(rq);
+			i915_request_add(rq);
+			engine->schedule(rq, &attr);
+
+			igt_spinner_end(&hi.spin);
+			if (i915_request_wait(rq, 0, HZ / 5) < 0) {
+				struct drm_printer p =
+					drm_info_printer(gt->i915->drm.dev);
+
+				pr_err("Failed to preempt over chain of %d\n",
+				       count);
+				intel_engine_dump(engine, &p,
+						  "%s\n", engine->name);
+				i915_request_put(rq);
+				goto err_wedged;
+			}
+			igt_spinner_end(&lo.spin);
+			i915_request_put(rq);
+
+			rq = igt_request_alloc(lo.ctx, engine);
+			if (IS_ERR(rq))
+				goto err_wedged;
+
+			i915_request_get(rq);
+			i915_request_add(rq);
+
+			if (i915_request_wait(rq, 0, HZ / 5) < 0) {
+				struct drm_printer p =
+					drm_info_printer(gt->i915->drm.dev);
+
+				pr_err("Failed to flush low priority chain of %d requests\n",
+				       count);
+				intel_engine_dump(engine, &p,
+						  "%s\n", engine->name);
+
+				i915_request_put(rq);
+				goto err_wedged;
+			}
+			i915_request_put(rq);
+		}
+
+		if (igt_live_test_end(&t)) {
+			err = -EIO;
+			goto err_wedged;
+		}
+	}
+
+	err = 0;
+err_client_lo:
+	preempt_client_fini(&lo);
+err_client_hi:
+	preempt_client_fini(&hi);
+	return err;
+
+err_wedged:
+	igt_spinner_end(&hi.spin);
+	igt_spinner_end(&lo.spin);
+	intel_gt_set_wedged(gt);
+	err = -EIO;
+	goto err_client_lo;
+}
+
+static int create_gang(struct intel_engine_cs *engine,
+		       struct i915_request **prev)
+{
+	struct drm_i915_gem_object *obj;
+	struct intel_context *ce;
+	struct i915_request *rq;
+	struct i915_vma *vma;
+	u32 *cs;
+	int err;
+
+	ce = intel_context_create(engine->kernel_context->gem_context, engine);
+	if (IS_ERR(ce))
+		return PTR_ERR(ce);
+
+	obj = i915_gem_object_create_internal(engine->i915, 4096);
+	if (IS_ERR(obj)) {
+		err = PTR_ERR(obj);
+		goto err_ce;
+	}
+
+	vma = i915_vma_instance(obj, ce->vm, NULL);
+	if (IS_ERR(vma)) {
+		err = PTR_ERR(vma);
+		goto err_obj;
+	}
+
+	err = i915_vma_pin(vma, 0, 0, PIN_USER);
+	if (err)
+		goto err_obj;
+
+	cs = i915_gem_object_pin_map(obj, I915_MAP_WC);
+	if (IS_ERR(cs)) {
+		err = PTR_ERR(cs);
+		goto err_obj;
+	}
+
+	/* Semaphore target: spin until zero */
+	*cs++ = MI_ARB_ON_OFF | MI_ARB_ENABLE;
+
+	*cs++ = MI_SEMAPHORE_WAIT |
+		MI_SEMAPHORE_POLL |
+		MI_SEMAPHORE_SAD_EQ_SDD;
+	*cs++ = 0;
+	*cs++ = lower_32_bits(vma->node.start);
+	*cs++ = upper_32_bits(vma->node.start);
+
+	if (*prev) {
+		u64 offset = (*prev)->batch->node.start;
+
+		/* Terminate the spinner in the next lower priority batch. */
+		*cs++ = MI_STORE_DWORD_IMM_GEN4;
+		*cs++ = lower_32_bits(offset);
+		*cs++ = upper_32_bits(offset);
+		*cs++ = 0;
+	}
+
+	*cs++ = MI_BATCH_BUFFER_END;
+	i915_gem_object_flush_map(obj);
+	i915_gem_object_unpin_map(obj);
+
+	rq = intel_context_create_request(ce);
+	if (IS_ERR(rq)) {
+		err = PTR_ERR(rq);
+		goto err_obj;
+	}
+
+	rq->batch = vma;
+	i915_request_get(rq);
+
+	i915_vma_lock(vma);
+	err = i915_request_await_object(rq, vma->obj, false);
+	if (!err)
+		err = i915_vma_move_to_active(vma, rq, 0);
+	if (!err)
+		err = rq->engine->emit_bb_start(rq,
+						vma->node.start,
+						PAGE_SIZE, 0);
+	i915_vma_unlock(vma);
+	i915_request_add(rq);
+	if (err)
+		goto err_rq;
+
+	i915_gem_object_put(obj);
+	intel_context_put(ce);
+
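+	/*
+	 * Link each request to its predecessor via client_link so the
+	 * caller can walk the gang from the highest priority spinner
+	 * down to the oldest.
+	 */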
+	rq->client_link.next = &(*prev)->client_link;
+	*prev = rq;
+	return 0;
+
+err_rq:
+	i915_request_put(rq);
+err_obj:
+	i915_gem_object_put(obj);
+err_ce:
+	intel_context_put(ce);
+	return err;
+}
+
+static int live_preempt_gang(void *arg)
+{
+	struct intel_gt *gt = arg;
+	struct intel_engine_cs *engine;
+	enum intel_engine_id id;
+
+	if (!HAS_LOGICAL_RING_PREEMPTION(gt->i915))
+		return 0;
+
+	/*
+	 * Build as long a chain of preempters as we can, with each
+	 * request higher priority than the last. Once we are ready, we release
+	 * the last batch which then percolates down the chain, each releasing
+	 * the next oldest in turn. The intent is to simply push as hard as we
+	 * can with the number of preemptions, trying to exceed narrow HW
+	 * limits. At a minimum, we insist that we can sort all the user
+	 * high priority levels into execution order.
+	 */
+
+	for_each_engine(engine, gt, id) {
+		struct i915_request *rq = NULL;
+		struct igt_live_test t;
+		IGT_TIMEOUT(end_time);
+		int prio = 0;
+		int err = 0;
+		u32 *cs;
+
+		if (!intel_engine_has_preemption(engine))
+			continue;
+
+		if (igt_live_test_begin(&t, gt->i915, __func__, engine->name))
+			return -EIO;
+
+		do {
+			struct i915_sched_attr attr = {
+				.priority = I915_USER_PRIORITY(prio++),
+			};
+
+			err = create_gang(engine, &rq);
+			if (err)
+				break;
+
+			/* Submit each spinner at increasing priority */
+			engine->schedule(rq, &attr);
+
+			if (prio <= I915_PRIORITY_MAX)
+				continue;
+
+			if (prio > (INT_MAX >> I915_USER_PRIORITY_SHIFT))
+				break;
+
+			if (__igt_timeout(end_time, NULL))
+				break;
+		} while (1);
+		pr_debug("%s: Preempt chain of %d requests\n",
+			 engine->name, prio);
+
+		/*
+		 * The last spinner submitted is the highest priority and
+		 * should execute first. When it completes, it terminates
+		 * the next lowest spinner, and so on down the chain until
+		 * there are no more spinners and the gang is complete.
+		 */
+		cs = i915_gem_object_pin_map(rq->batch->obj, I915_MAP_WC);
+		if (!IS_ERR(cs)) {
+			*cs = 0;
+			i915_gem_object_unpin_map(rq->batch->obj);
+		} else {
+			err = PTR_ERR(cs);
+			intel_gt_set_wedged(gt);
+		}
+
+		while (rq) { /* wait for each rq from highest to lowest prio */
+			struct i915_request *n =
+				list_next_entry(rq, client_link);
+
+			if (err == 0 && i915_request_wait(rq, 0, HZ / 5) < 0) {
+				struct drm_printer p =
+					drm_info_printer(engine->i915->drm.dev);
+
+				pr_err("Failed to flush chain of %d requests, at %d\n",
+				       prio, rq_prio(rq) >> I915_USER_PRIORITY_SHIFT);
+				intel_engine_dump(engine, &p,
+						  "%s\n", engine->name);
+
+				err = -ETIME;
+			}
+
+			i915_request_put(rq);
+			rq = n;
+		}
+
+		if (igt_live_test_end(&t))
+			err = -EIO;
+		if (err)
+			return err;
+	}
+
+	return 0;
+}
+
+static int live_preempt_hang(void *arg)
+{
+	struct intel_gt *gt = arg;
+	struct i915_gem_context *ctx_hi, *ctx_lo;
+	struct igt_spinner spin_hi, spin_lo;
+	struct intel_engine_cs *engine;
+	enum intel_engine_id id;
+	int err = -ENOMEM;
+
+	if (!HAS_LOGICAL_RING_PREEMPTION(gt->i915))
+		return 0;
+
+	if (!intel_has_reset_engine(gt))
+		return 0;
+
+	if (igt_spinner_init(&spin_hi, gt))
+		return -ENOMEM;
+
+	if (igt_spinner_init(&spin_lo, gt))
+		goto err_spin_hi;
+
+	ctx_hi = kernel_context(gt->i915);
+	if (!ctx_hi)
+		goto err_spin_lo;
+	ctx_hi->sched.priority =
+		I915_USER_PRIORITY(I915_CONTEXT_MAX_USER_PRIORITY);
+
+	ctx_lo = kernel_context(gt->i915);
+	if (!ctx_lo)
+		goto err_ctx_hi;
+	ctx_lo->sched.priority =
+		I915_USER_PRIORITY(I915_CONTEXT_MIN_USER_PRIORITY);
+
+	for_each_engine(engine, gt, id) {
+		struct i915_request *rq;
+
+		if (!intel_engine_has_preemption(engine))
+			continue;
+
+		rq = spinner_create_request(&spin_lo, ctx_lo, engine,
+					    MI_ARB_CHECK);
+		if (IS_ERR(rq)) {
+			err = PTR_ERR(rq);
+			goto err_ctx_lo;
+		}
+
+		i915_request_add(rq);
+		if (!igt_wait_for_spinner(&spin_lo, rq)) {
+			GEM_TRACE("lo spinner failed to start\n");
+			GEM_TRACE_DUMP();
+			intel_gt_set_wedged(gt);
+			err = -EIO;
+			goto err_ctx_lo;
+		}
+
+		rq = spinner_create_request(&spin_hi, ctx_hi, engine,
+					    MI_ARB_CHECK);
+		if (IS_ERR(rq)) {
+			igt_spinner_end(&spin_lo);
+			err = PTR_ERR(rq);
+			goto err_ctx_lo;
+		}
+
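+		/*
+		 * Arm the selftest hook so the attempted preemption is
+		 * reported as hung (inject_hang); once signalled, recover
+		 * the engine with a reset and check that the high priority
+		 * spinner then runs.
+		 */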
+		init_completion(&engine->execlists.preempt_hang.completion);
+		engine->execlists.preempt_hang.inject_hang = true;
+
+		i915_request_add(rq);
+
+		if (!wait_for_completion_timeout(&engine->execlists.preempt_hang.completion,
+						 HZ / 10)) {
+			pr_err("Preemption did not occur within timeout!");
+			GEM_TRACE_DUMP();
+			intel_gt_set_wedged(gt);
+			err = -EIO;
+			goto err_ctx_lo;
+		}
+
+		set_bit(I915_RESET_ENGINE + id, &gt->reset.flags);
+		intel_engine_reset(engine, NULL);
+		clear_bit(I915_RESET_ENGINE + id, &gt->reset.flags);
+
+		engine->execlists.preempt_hang.inject_hang = false;
+
+		if (!igt_wait_for_spinner(&spin_hi, rq)) {
+			GEM_TRACE("hi spinner failed to start\n");
+			GEM_TRACE_DUMP();
+			intel_gt_set_wedged(gt);
+			err = -EIO;
+			goto err_ctx_lo;
+		}
+
+		igt_spinner_end(&spin_hi);
+		igt_spinner_end(&spin_lo);
+		if (igt_flush_test(gt->i915)) {
+			err = -EIO;
+			goto err_ctx_lo;
+		}
+	}
+
+	err = 0;
+err_ctx_lo:
+	kernel_context_close(ctx_lo);
+err_ctx_hi:
+	kernel_context_close(ctx_hi);
+err_spin_lo:
+	igt_spinner_fini(&spin_lo);
+err_spin_hi:
+	igt_spinner_fini(&spin_hi);
+	return err;
+}
+
+static int live_preempt_timeout(void *arg)
+{
+	struct intel_gt *gt = arg;
+	struct i915_gem_context *ctx_hi, *ctx_lo;
+	struct igt_spinner spin_lo;
+	struct intel_engine_cs *engine;
+	enum intel_engine_id id;
+	int err = -ENOMEM;
+
+	/*
+	 * Check that we force preemption to occur by cancelling the previous
+	 * context if it refuses to yield the GPU.
+	 */
+	if (!IS_ACTIVE(CONFIG_DRM_I915_PREEMPT_TIMEOUT))
+		return 0;
+
+	if (!HAS_LOGICAL_RING_PREEMPTION(gt->i915))
+		return 0;
+
+	if (!intel_has_reset_engine(gt))
+		return 0;
+
+	if (igt_spinner_init(&spin_lo, gt))
+		return -ENOMEM;
+
+	ctx_hi = kernel_context(gt->i915);
+	if (!ctx_hi)
+		goto err_spin_lo;
+	ctx_hi->sched.priority =
+		I915_USER_PRIORITY(I915_CONTEXT_MAX_USER_PRIORITY);
+
+	ctx_lo = kernel_context(gt->i915);
+	if (!ctx_lo)
+		goto err_ctx_hi;
+	ctx_lo->sched.priority =
+		I915_USER_PRIORITY(I915_CONTEXT_MIN_USER_PRIORITY);
+
+	for_each_engine(engine, gt, id) {
+		unsigned long saved_timeout;
+		struct i915_request *rq;
+
+		if (!intel_engine_has_preemption(engine))
+			continue;
+
+		rq = spinner_create_request(&spin_lo, ctx_lo, engine,
+					    MI_NOOP); /* preemption disabled */
+		if (IS_ERR(rq)) {
+			err = PTR_ERR(rq);
+			goto err_ctx_lo;
+		}
+
+		i915_request_add(rq);
+		if (!igt_wait_for_spinner(&spin_lo, rq)) {
+			intel_gt_set_wedged(gt);
+			err = -EIO;
+			goto err_ctx_lo;
+		}
+
+		rq = igt_request_alloc(ctx_hi, engine);
+		if (IS_ERR(rq)) {
+			igt_spinner_end(&spin_lo);
+			err = PTR_ERR(rq);
+			goto err_ctx_lo;
+		}
+
+		/* Flush the previous CS ack before changing timeouts */
+		while (READ_ONCE(engine->execlists.pending[0]))
+			cpu_relax();
+
+		saved_timeout = engine->props.preempt_timeout_ms;
+		engine->props.preempt_timeout_ms = 1; /* in ms, -> 1 jiffie */
+
+		i915_request_get(rq);
+		i915_request_add(rq);
+
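+		/*
+		 * Flush the submission so the shortened preempt timeout is
+		 * sampled before we restore the saved value.
+		 */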
+		intel_engine_flush_submission(engine);
+		engine->props.preempt_timeout_ms = saved_timeout;
+
+		if (i915_request_wait(rq, 0, HZ / 10) < 0) {
+			intel_gt_set_wedged(gt);
+			i915_request_put(rq);
+			err = -ETIME;
+			goto err_ctx_lo;
+		}
+
+		igt_spinner_end(&spin_lo);
+		i915_request_put(rq);
+	}
+
+	err = 0;
+err_ctx_lo:
+	kernel_context_close(ctx_lo);
+err_ctx_hi:
+	kernel_context_close(ctx_hi);
+err_spin_lo:
+	igt_spinner_fini(&spin_lo);
+	return err;
+}
+
+static int random_range(struct rnd_state *rnd, int min, int max)
+{
+	return i915_prandom_u32_max_state(max - min, rnd) + min;
+}
+
+static int random_priority(struct rnd_state *rnd)
+{
+	return random_range(rnd, I915_PRIORITY_MIN, I915_PRIORITY_MAX);
+}
+
+struct preempt_smoke {
+	struct intel_gt *gt;
+	struct i915_gem_context **contexts;
+	struct intel_engine_cs *engine;
+	struct drm_i915_gem_object *batch;
+	unsigned int ncontext;
+	struct rnd_state prng;
+	unsigned long count;
+};
+
+static struct i915_gem_context *smoke_context(struct preempt_smoke *smoke)
+{
+	return smoke->contexts[i915_prandom_u32_max_state(smoke->ncontext,
+							  &smoke->prng)];
+}
+
+static int smoke_submit(struct preempt_smoke *smoke,
+			struct i915_gem_context *ctx, int prio,
+			struct drm_i915_gem_object *batch)
+{
+	struct i915_request *rq;
+	struct i915_vma *vma = NULL;
+	int err = 0;
+
+	if (batch) {
+		struct i915_address_space *vm;
+
+		vm = i915_gem_context_get_vm_rcu(ctx);
+		vma = i915_vma_instance(batch, vm, NULL);
+		i915_vm_put(vm);
+		if (IS_ERR(vma))
+			return PTR_ERR(vma);
+
+		err = i915_vma_pin(vma, 0, 0, PIN_USER);
+		if (err)
+			return err;
+	}
+
+	ctx->sched.priority = prio;
+
+	rq = igt_request_alloc(ctx, smoke->engine);
+	if (IS_ERR(rq)) {
+		err = PTR_ERR(rq);
+		goto unpin;
+	}
+
+	if (vma) {
+		i915_vma_lock(vma);
+		err = i915_request_await_object(rq, vma->obj, false);
+		if (!err)
+			err = i915_vma_move_to_active(vma, rq, 0);
+		if (!err)
+			err = rq->engine->emit_bb_start(rq,
+							vma->node.start,
+							PAGE_SIZE, 0);
+		i915_vma_unlock(vma);
+	}
+
+	i915_request_add(rq);
+
+unpin:
+	if (vma)
+		i915_vma_unpin(vma);
+
+	return err;
+}
+
+static int smoke_crescendo_thread(void *arg)
+{
+	struct preempt_smoke *smoke = arg;
+	IGT_TIMEOUT(end_time);
+	unsigned long count;
+
+	count = 0;
+	do {
+		struct i915_gem_context *ctx = smoke_context(smoke);
+		int err;
+
+		err = smoke_submit(smoke,
+				   ctx, count % I915_PRIORITY_MAX,
+				   smoke->batch);
+		if (err)
+			return err;
+
+		count++;
+	} while (!__igt_timeout(end_time, NULL));
+
+	smoke->count = count;
+	return 0;
+}
+
+static int smoke_crescendo(struct preempt_smoke *smoke, unsigned int flags)
+#define BATCH BIT(0)
+{
+	struct task_struct *tsk[I915_NUM_ENGINES] = {};
+	struct preempt_smoke arg[I915_NUM_ENGINES];
+	struct intel_engine_cs *engine;
+	enum intel_engine_id id;
+	unsigned long count;
+	int err = 0;
+
+	for_each_engine(engine, smoke->gt, id) {
+		arg[id] = *smoke;
+		arg[id].engine = engine;
+		if (!(flags & BATCH))
+			arg[id].batch = NULL;
+		arg[id].count = 0;
+
+		tsk[id] = kthread_run(smoke_crescendo_thread, &arg[id],
+				      "igt/smoke:%d", id);
+		if (IS_ERR(tsk[id])) {
+			err = PTR_ERR(tsk[id]);
+			break;
+		}
+		get_task_struct(tsk[id]);
+	}
+
+	yield(); /* start all threads before we kthread_stop() */
+
+	count = 0;
+	for_each_engine(engine, smoke->gt, id) {
+		int status;
+
+		if (IS_ERR_OR_NULL(tsk[id]))
+			continue;
+
+		status = kthread_stop(tsk[id]);
+		if (status && !err)
+			err = status;
+
+		count += arg[id].count;
+
+		put_task_struct(tsk[id]);
+	}
+
+	pr_info("Submitted %lu crescendo:%x requests across %d engines and %d contexts\n",
+		count, flags,
+		RUNTIME_INFO(smoke->gt->i915)->num_engines, smoke->ncontext);
+	return err;
+}
+
+static int smoke_random(struct preempt_smoke *smoke, unsigned int flags)
+{
+	enum intel_engine_id id;
+	IGT_TIMEOUT(end_time);
+	unsigned long count;
+
+	count = 0;
+	do {
+		for_each_engine(smoke->engine, smoke->gt, id) {
+			struct i915_gem_context *ctx = smoke_context(smoke);
+			int err;
+
+			err = smoke_submit(smoke,
+					   ctx, random_priority(&smoke->prng),
+					   flags & BATCH ? smoke->batch : NULL);
+			if (err)
+				return err;
+
+			count++;
+		}
+	} while (!__igt_timeout(end_time, NULL));
+
+	pr_info("Submitted %lu random:%x requests across %d engines and %d contexts\n",
+		count, flags,
+		RUNTIME_INFO(smoke->gt->i915)->num_engines, smoke->ncontext);
+	return 0;
+}
+
+static int live_preempt_smoke(void *arg)
+{
+	struct preempt_smoke smoke = {
+		.gt = arg,
+		.prng = I915_RND_STATE_INITIALIZER(i915_selftest.random_seed),
+		.ncontext = 1024,
+	};
+	const unsigned int phase[] = { 0, BATCH };
+	struct igt_live_test t;
+	int err = -ENOMEM;
+	u32 *cs;
+	int n;
+
+	if (!HAS_LOGICAL_RING_PREEMPTION(smoke.gt->i915))
+		return 0;
+
+	smoke.contexts = kmalloc_array(smoke.ncontext,
+				       sizeof(*smoke.contexts),
+				       GFP_KERNEL);
+	if (!smoke.contexts)
+		return -ENOMEM;
+
+	smoke.batch =
+		i915_gem_object_create_internal(smoke.gt->i915, PAGE_SIZE);
+	if (IS_ERR(smoke.batch)) {
+		err = PTR_ERR(smoke.batch);
+		goto err_free;
+	}
+
+	cs = i915_gem_object_pin_map(smoke.batch, I915_MAP_WB);
+	if (IS_ERR(cs)) {
+		err = PTR_ERR(cs);
+		goto err_batch;
+	}
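+	/* Fill the batch with arbitration points so it can be preempted */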
+	for (n = 0; n < PAGE_SIZE / sizeof(*cs) - 1; n++)
+		cs[n] = MI_ARB_CHECK;
+	cs[n] = MI_BATCH_BUFFER_END;
+	i915_gem_object_flush_map(smoke.batch);
+	i915_gem_object_unpin_map(smoke.batch);
+
+	if (igt_live_test_begin(&t, smoke.gt->i915, __func__, "all")) {
+		err = -EIO;
+		goto err_batch;
+	}
+
+	for (n = 0; n < smoke.ncontext; n++) {
+		smoke.contexts[n] = kernel_context(smoke.gt->i915);
+		if (!smoke.contexts[n])
+			goto err_ctx;
+	}
+
+	for (n = 0; n < ARRAY_SIZE(phase); n++) {
+		err = smoke_crescendo(&smoke, phase[n]);
+		if (err)
+			goto err_ctx;
+
+		err = smoke_random(&smoke, phase[n]);
+		if (err)
+			goto err_ctx;
+	}
+
+err_ctx:
+	if (igt_live_test_end(&t))
+		err = -EIO;
+
+	for (n = 0; n < smoke.ncontext; n++) {
+		if (!smoke.contexts[n])
+			break;
+		kernel_context_close(smoke.contexts[n]);
+	}
+
+err_batch:
+	i915_gem_object_put(smoke.batch);
+err_free:
+	kfree(smoke.contexts);
+
+	return err;
+}
+
+static int nop_virtual_engine(struct intel_gt *gt,
+			      struct intel_engine_cs **siblings,
+			      unsigned int nsibling,
+			      unsigned int nctx,
+			      unsigned int flags)
+#define CHAIN BIT(0)
+{
+	IGT_TIMEOUT(end_time);
+	struct i915_request *request[16] = {};
+	struct i915_gem_context *ctx[16];
+	struct intel_context *ve[16];
+	unsigned long n, prime, nc;
+	struct igt_live_test t;
+	ktime_t times[2] = {};
+	int err;
+
+	GEM_BUG_ON(!nctx || nctx > ARRAY_SIZE(ctx));
+
+	for (n = 0; n < nctx; n++) {
+		ctx[n] = kernel_context(gt->i915);
+		if (!ctx[n]) {
+			err = -ENOMEM;
+			nctx = n;
+			goto out;
+		}
+
+		ve[n] = intel_virtual_engine_create(ctx[n], siblings, nsibling);
+		if (IS_ERR(ve[n])) {
+			kernel_context_close(ctx[n]);
+			err = PTR_ERR(ve[n]);
+			nctx = n;
+			goto out;
+		}
+
+		err = intel_context_pin(ve[n]);
+		if (err) {
+			intel_context_put(ve[n]);
+			kernel_context_close(ctx[n]);
+			nctx = n;
+			goto out;
+		}
+	}
+
+	err = igt_live_test_begin(&t, gt->i915, __func__, ve[0]->engine->name);
+	if (err)
+		goto out;
+
+	for_each_prime_number_from(prime, 1, 8192) {
+		times[1] = ktime_get_raw();
+
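+		/*
+		 * CHAIN submits the whole run of requests on one virtual
+		 * engine context before moving on to the next; otherwise we
+		 * interleave one request per context on each pass.
+		 */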
+		if (flags & CHAIN) {
+			for (nc = 0; nc < nctx; nc++) {
+				for (n = 0; n < prime; n++) {
+					struct i915_request *rq;
+
+					rq = i915_request_create(ve[nc]);
+					if (IS_ERR(rq)) {
+						err = PTR_ERR(rq);
+						goto out;
+					}
+
+					if (request[nc])
+						i915_request_put(request[nc]);
+					request[nc] = i915_request_get(rq);
+					i915_request_add(rq);
+				}
+			}
+		} else {
+			for (n = 0; n < prime; n++) {
+				for (nc = 0; nc < nctx; nc++) {
+					struct i915_request *rq;
+
+					rq = i915_request_create(ve[nc]);
+					if (IS_ERR(rq)) {
+						err = PTR_ERR(rq);
+						goto out;
+					}
+
+					if (request[nc])
+						i915_request_put(request[nc]);
+					request[nc] = i915_request_get(rq);
+					i915_request_add(rq);
+				}
+			}
+		}
+
+		for (nc = 0; nc < nctx; nc++) {
+			if (i915_request_wait(request[nc], 0, HZ / 10) < 0) {
+				pr_err("%s(%s): wait for %llx:%lld timed out\n",
+				       __func__, ve[0]->engine->name,
+				       request[nc]->fence.context,
+				       request[nc]->fence.seqno);
+
+				GEM_TRACE("%s(%s) failed at request %llx:%lld\n",
+					  __func__, ve[0]->engine->name,
+					  request[nc]->fence.context,
+					  request[nc]->fence.seqno);
+				GEM_TRACE_DUMP();
+				intel_gt_set_wedged(gt);
+				break;
+			}
+		}
+
+		times[1] = ktime_sub(ktime_get_raw(), times[1]);
+		if (prime == 1)
+			times[0] = times[1];
+
+		for (nc = 0; nc < nctx; nc++) {
+			i915_request_put(request[nc]);
+			request[nc] = NULL;
+		}
+
+		if (__igt_timeout(end_time, NULL))
+			break;
+	}
+
+	err = igt_live_test_end(&t);
+	if (err)
+		goto out;
+
+	pr_info("Requestx%d latencies on %s: 1 = %lluns, %lu = %lluns\n",
+		nctx, ve[0]->engine->name, ktime_to_ns(times[0]),
+		prime, div64_u64(ktime_to_ns(times[1]), prime));
+
+out:
+	if (igt_flush_test(gt->i915))
+		err = -EIO;
+
+	for (nc = 0; nc < nctx; nc++) {
+		i915_request_put(request[nc]);
+		intel_context_unpin(ve[nc]);
+		intel_context_put(ve[nc]);
+		kernel_context_close(ctx[nc]);
+	}
+	return err;
+}
+
+static int live_virtual_engine(void *arg)
+{
+	struct intel_gt *gt = arg;
+	struct intel_engine_cs *siblings[MAX_ENGINE_INSTANCE + 1];
+	struct intel_engine_cs *engine;
+	enum intel_engine_id id;
+	unsigned int class, inst;
+	int err;
+
+	if (USES_GUC_SUBMISSION(gt->i915))
+		return 0;
+
+	for_each_engine(engine, gt, id) {
+		err = nop_virtual_engine(gt, &engine, 1, 1, 0);
+		if (err) {
+			pr_err("Failed to wrap engine %s: err=%d\n",
+			       engine->name, err);
+			return err;
+		}
+	}
+
+	for (class = 0; class <= MAX_ENGINE_CLASS; class++) {
+		int nsibling, n;
+
+		nsibling = 0;
+		for (inst = 0; inst <= MAX_ENGINE_INSTANCE; inst++) {
+			if (!gt->engine_class[class][inst])
+				continue;
+
+			siblings[nsibling++] = gt->engine_class[class][inst];
+		}
+		if (nsibling < 2)
+			continue;
+
+		for (n = 1; n <= nsibling + 1; n++) {
+			err = nop_virtual_engine(gt, siblings, nsibling,
+						 n, 0);
+			if (err)
+				return err;
+		}
+
+		err = nop_virtual_engine(gt, siblings, nsibling, n, CHAIN);
+		if (err)
+			return err;
+	}
+
+	return 0;
+}
+
+static int mask_virtual_engine(struct intel_gt *gt,
+			       struct intel_engine_cs **siblings,
+			       unsigned int nsibling)
+{
+	struct i915_request *request[MAX_ENGINE_INSTANCE + 1];
+	struct i915_gem_context *ctx;
+	struct intel_context *ve;
+	struct igt_live_test t;
+	unsigned int n;
+	int err;
+
+	/*
+	 * Check that by setting the execution mask on a request, we can
+	 * restrict it to our desired engine within the virtual engine.
+	 */
+
+	ctx = kernel_context(gt->i915);
+	if (!ctx)
+		return -ENOMEM;
+
+	ve = intel_virtual_engine_create(ctx, siblings, nsibling);
+	if (IS_ERR(ve)) {
+		err = PTR_ERR(ve);
+		goto out_close;
+	}
+
+	err = intel_context_pin(ve);
+	if (err)
+		goto out_put;
+
+	err = igt_live_test_begin(&t, gt->i915, __func__, ve->engine->name);
+	if (err)
+		goto out_unpin;
+
+	for (n = 0; n < nsibling; n++) {
+		request[n] = i915_request_create(ve);
+		if (IS_ERR(request[n])) {
+			err = PTR_ERR(request[n]);
+			nsibling = n;
+			goto out;
+		}
+
+		/* Reverse order as it's more likely to be unnatural */
+		request[n]->execution_mask = siblings[nsibling - n - 1]->mask;
+
+		i915_request_get(request[n]);
+		i915_request_add(request[n]);
+	}
+
+	for (n = 0; n < nsibling; n++) {
+		if (i915_request_wait(request[n], 0, HZ / 10) < 0) {
+			pr_err("%s(%s): wait for %llx:%lld timed out\n",
+			       __func__, ve->engine->name,
+			       request[n]->fence.context,
+			       request[n]->fence.seqno);
+
+			GEM_TRACE("%s(%s) failed at request %llx:%lld\n",
+				  __func__, ve->engine->name,
+				  request[n]->fence.context,
+				  request[n]->fence.seqno);
+			GEM_TRACE_DUMP();
+			intel_gt_set_wedged(gt);
+			err = -EIO;
+			goto out;
+		}
+
+		if (request[n]->engine != siblings[nsibling - n - 1]) {
+			pr_err("Executed on wrong sibling '%s', expected '%s'\n",
+			       request[n]->engine->name,
+			       siblings[nsibling - n - 1]->name);
+			err = -EINVAL;
+			goto out;
+		}
+	}
+
+	err = igt_live_test_end(&t);
+out:
+	if (igt_flush_test(gt->i915))
+		err = -EIO;
+
+	for (n = 0; n < nsibling; n++)
+		i915_request_put(request[n]);
+
+out_unpin:
+	intel_context_unpin(ve);
+out_put:
+	intel_context_put(ve);
+out_close:
+	kernel_context_close(ctx);
+	return err;
+}
+
+static int live_virtual_mask(void *arg)
+{
+	struct intel_gt *gt = arg;
+	struct intel_engine_cs *siblings[MAX_ENGINE_INSTANCE + 1];
+	unsigned int class, inst;
+	int err;
+
+	if (USES_GUC_SUBMISSION(gt->i915))
+		return 0;
+
+	for (class = 0; class <= MAX_ENGINE_CLASS; class++) {
+		unsigned int nsibling;
+
+		nsibling = 0;
+		for (inst = 0; inst <= MAX_ENGINE_INSTANCE; inst++) {
+			if (!gt->engine_class[class][inst])
+				break;
+
+			siblings[nsibling++] = gt->engine_class[class][inst];
+		}
+		if (nsibling < 2)
+			continue;
+
+		err = mask_virtual_engine(gt, siblings, nsibling);
+		if (err)
+			return err;
+	}
+
+	return 0;
+}
+
+static int preserved_virtual_engine(struct intel_gt *gt,
+				    struct intel_engine_cs **siblings,
+				    unsigned int nsibling)
+{
+	struct i915_request *last = NULL;
+	struct i915_gem_context *ctx;
+	struct intel_context *ve;
+	struct i915_vma *scratch;
+	struct igt_live_test t;
+	unsigned int n;
+	int err = 0;
+	u32 *cs;
+
+	ctx = kernel_context(gt->i915);
+	if (!ctx)
+		return -ENOMEM;
+
+	scratch = igt_create_scratch(siblings[0]->gt);
+	if (IS_ERR(scratch)) {
+		err = PTR_ERR(scratch);
+		goto out_close;
+	}
+
+	ve = intel_virtual_engine_create(ctx, siblings, nsibling);
+	if (IS_ERR(ve)) {
+		err = PTR_ERR(ve);
+		goto out_scratch;
+	}
+
+	err = intel_context_pin(ve);
+	if (err)
+		goto out_put;
+
+	err = igt_live_test_begin(&t, gt->i915, __func__, ve->engine->name);
+	if (err)
+		goto out_unpin;
+
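+	/*
+	 * Each request copies CS_GPR[n] (set to n by the previous request,
+	 * possibly on a different sibling) into the scratch page and then
+	 * writes n + 1 into the following GPR, so the readback below checks
+	 * that the GPRs follow the context from engine to engine.
+	 */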
+	for (n = 0; n < NUM_GPR_DW; n++) {
+		struct intel_engine_cs *engine = siblings[n % nsibling];
+		struct i915_request *rq;
+
+		rq = i915_request_create(ve);
+		if (IS_ERR(rq)) {
+			err = PTR_ERR(rq);
+			goto out_end;
+		}
+
+		i915_request_put(last);
+		last = i915_request_get(rq);
+
+		cs = intel_ring_begin(rq, 8);
+		if (IS_ERR(cs)) {
+			i915_request_add(rq);
+			err = PTR_ERR(cs);
+			goto out_end;
+		}
+
+		*cs++ = MI_STORE_REGISTER_MEM_GEN8 | MI_USE_GGTT;
+		*cs++ = CS_GPR(engine, n);
+		*cs++ = i915_ggtt_offset(scratch) + n * sizeof(u32);
+		*cs++ = 0;
+
+		*cs++ = MI_LOAD_REGISTER_IMM(1);
+		*cs++ = CS_GPR(engine, (n + 1) % NUM_GPR_DW);
+		*cs++ = n + 1;
+
+		*cs++ = MI_NOOP;
+		intel_ring_advance(rq, cs);
+
+		/* Restrict this request to run on a particular engine */
+		rq->execution_mask = engine->mask;
+		i915_request_add(rq);
+	}
+
+	if (i915_request_wait(last, 0, HZ / 5) < 0) {
+		err = -ETIME;
+		goto out_end;
+	}
+
+	cs = i915_gem_object_pin_map(scratch->obj, I915_MAP_WB);
+	if (IS_ERR(cs)) {
+		err = PTR_ERR(cs);
+		goto out_end;
+	}
+
+	for (n = 0; n < NUM_GPR_DW; n++) {
+		if (cs[n] != n) {
+			pr_err("Incorrect value[%d] found for GPR[%d]\n",
+			       cs[n], n);
+			err = -EINVAL;
+			break;
+		}
+	}
+
+	i915_gem_object_unpin_map(scratch->obj);
+
+out_end:
+	if (igt_live_test_end(&t))
+		err = -EIO;
+	i915_request_put(last);
+out_unpin:
+	intel_context_unpin(ve);
+out_put:
+	intel_context_put(ve);
+out_scratch:
+	i915_vma_unpin_and_release(&scratch, 0);
+out_close:
+	kernel_context_close(ctx);
+	return err;
+}
+
+static int live_virtual_preserved(void *arg)
+{
+	struct intel_gt *gt = arg;
+	struct intel_engine_cs *siblings[MAX_ENGINE_INSTANCE + 1];
+	unsigned int class, inst;
+
+	/*
+	 * Check that the context image retains non-privileged (user) registers
+	 * from one engine to the next. For this we check that the CS_GPR
+	 * are preserved.
+	 */
+
+	if (USES_GUC_SUBMISSION(gt->i915))
+		return 0;
+
+	/* As we use CS_GPR we cannot run before they existed on all engines. */
+	if (INTEL_GEN(gt->i915) < 9)
+		return 0;
+
+	for (class = 0; class <= MAX_ENGINE_CLASS; class++) {
+		int nsibling, err;
+
+		nsibling = 0;
+		for (inst = 0; inst <= MAX_ENGINE_INSTANCE; inst++) {
+			if (!gt->engine_class[class][inst])
+				continue;
+
+			siblings[nsibling++] = gt->engine_class[class][inst];
+		}
+		if (nsibling < 2)
+			continue;
+
+		err = preserved_virtual_engine(gt, siblings, nsibling);
+		if (err)
+			return err;
+	}
+
+	return 0;
+}
+
+static int bond_virtual_engine(struct intel_gt *gt,
+			       unsigned int class,
+			       struct intel_engine_cs **siblings,
+			       unsigned int nsibling,
+			       unsigned int flags)
+#define BOND_SCHEDULE BIT(0)
+{
+	struct intel_engine_cs *master;
+	struct i915_gem_context *ctx;
+	struct i915_request *rq[16];
+	enum intel_engine_id id;
+	struct igt_spinner spin;
+	unsigned long n;
+	int err;
+
+	/*
+	 * A set of bonded requests is intended to be run concurrently
+	 * across a number of engines. We use one request per-engine
+	 * and a magic fence to schedule each of the bonded requests
+	 * at the same time. A consequence of our current scheduler is that
+	 * we only move requests to the HW ready queue when the request
+	 * becomes ready, that is when all of its prerequisite fences have
+	 * been signaled. As one of those fences is the master submit fence,
+	 * there is a delay on all secondary fences as the HW may be
+	 * currently busy. Equally, as all the requests are independent,
+	 * they may have other fences that delay individual request
+	 * submission to HW. Ergo, we do not guarantee that all requests are
+	 * immediately submitted to HW at the same time, just that if the
+	 * rules are abided by, they are ready at the same time as the
+	 * first is submitted. Userspace can embed semaphores in its batch
+	 * to ensure parallel execution of its phases as it requires.
+	 * Though naturally it gets requested that perhaps the scheduler should
+	 * take care of parallel execution, even across preemption events on
+	 * different HW. (The proper answer is of course "lalalala".)
+	 *
+	 * With the submit-fence, we have identified three possible phases
+	 * of synchronisation depending on the master fence: queued (not
+	 * ready), executing, and signaled. The first two are quite simple
+	 * and checked below. However, the signaled master fence handling is
+	 * contentious. Currently we do not distinguish between a signaled
+	 * fence and an expired fence, as once signaled it does not convey
+	 * any information about the previous execution. It may even be freed
+	 * and hence checking later it may not exist at all. Ergo we currently
+	 * do not apply the bonding constraint for an already signaled fence,
+	 * as our expectation is that it should not constrain the secondaries
+	 * and is outside of the scope of the bonded request API (i.e. all
+	 * userspace requests are meant to be running in parallel). As
+	 * it imposes no constraint, and is effectively a no-op, we do not
+	 * check below as normal execution flows are checked extensively above.
+	 *
+	 * XXX Is the degenerate handling of signaled submit fences the
+	 * expected behaviour for userspace?
+	 */
+
+	GEM_BUG_ON(nsibling >= ARRAY_SIZE(rq) - 1);
+
+	if (igt_spinner_init(&spin, gt))
+		return -ENOMEM;
+
+	ctx = kernel_context(gt->i915);
+	if (!ctx) {
+		err = -ENOMEM;
+		goto err_spin;
+	}
+
+	err = 0;
+	rq[0] = ERR_PTR(-ENOMEM);
+	for_each_engine(master, gt, id) {
+		struct i915_sw_fence fence = {};
+
+		if (master->class == class)
+			continue;
+
+		memset_p((void *)rq, ERR_PTR(-EINVAL), ARRAY_SIZE(rq));
+
+		rq[0] = spinner_create_request(&spin, ctx, master, MI_NOOP);
+		if (IS_ERR(rq[0])) {
+			err = PTR_ERR(rq[0]);
+			goto out;
+		}
+		i915_request_get(rq[0]);
+
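+		/*
+		 * With BOND_SCHEDULE, an onstack fence holds back the
+		 * master's submission so the bonded requests are queued
+		 * while the master submit fence is still pending.
+		 */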
+		if (flags & BOND_SCHEDULE) {
+			onstack_fence_init(&fence);
+			err = i915_sw_fence_await_sw_fence_gfp(&rq[0]->submit,
+							       &fence,
+							       GFP_KERNEL);
+		}
+
+		i915_request_add(rq[0]);
+		if (err < 0)
+			goto out;
+
+		if (!(flags & BOND_SCHEDULE) &&
+		    !igt_wait_for_spinner(&spin, rq[0])) {
+			err = -EIO;
+			goto out;
+		}
+
+		for (n = 0; n < nsibling; n++) {
+			struct intel_context *ve;
+
+			ve = intel_virtual_engine_create(ctx,
+							 siblings,
+							 nsibling);
+			if (IS_ERR(ve)) {
+				err = PTR_ERR(ve);
+				onstack_fence_fini(&fence);
+				goto out;
+			}
+
+			err = intel_virtual_engine_attach_bond(ve->engine,
+							       master,
+							       siblings[n]);
+			if (err) {
+				intel_context_put(ve);
+				onstack_fence_fini(&fence);
+				goto out;
+			}
+
+			err = intel_context_pin(ve);
+			intel_context_put(ve);
+			if (err) {
+				onstack_fence_fini(&fence);
+				goto out;
+			}
+
+			rq[n + 1] = i915_request_create(ve);
+			intel_context_unpin(ve);
+			if (IS_ERR(rq[n + 1])) {
+				err = PTR_ERR(rq[n + 1]);
+				onstack_fence_fini(&fence);
+				goto out;
+			}
+			i915_request_get(rq[n + 1]);
+
+			err = i915_request_await_execution(rq[n + 1],
+							   &rq[0]->fence,
+							   ve->engine->bond_execute);
+			i915_request_add(rq[n + 1]);
+			if (err < 0) {
+				onstack_fence_fini(&fence);
+				goto out;
+			}
+		}
+		onstack_fence_fini(&fence);
+		intel_engine_flush_submission(master);
+		igt_spinner_end(&spin);
+
+		if (i915_request_wait(rq[0], 0, HZ / 10) < 0) {
+			pr_err("Master request did not execute (on %s)!\n",
+			       rq[0]->engine->name);
+			err = -EIO;
+			goto out;
+		}
+
+		for (n = 0; n < nsibling; n++) {
+			if (i915_request_wait(rq[n + 1], 0,
+					      MAX_SCHEDULE_TIMEOUT) < 0) {
+				err = -EIO;
+				goto out;
+			}
+
+			if (rq[n + 1]->engine != siblings[n]) {
+				pr_err("Bonded request did not execute on target engine: expected %s, used %s; master was %s\n",
+				       siblings[n]->name,
+				       rq[n + 1]->engine->name,
+				       rq[0]->engine->name);
+				err = -EINVAL;
+				goto out;
+			}
+		}
+
+		for (n = 0; !IS_ERR(rq[n]); n++)
+			i915_request_put(rq[n]);
+		rq[0] = ERR_PTR(-ENOMEM);
+	}
+
+out:
+	for (n = 0; !IS_ERR(rq[n]); n++)
+		i915_request_put(rq[n]);
+	if (igt_flush_test(gt->i915))
+		err = -EIO;
+
+	kernel_context_close(ctx);
+err_spin:
+	igt_spinner_fini(&spin);
+	return err;
+}
+
+static int live_virtual_bond(void *arg)
+{
+	static const struct phase {
+		const char *name;
+		unsigned int flags;
+	} phases[] = {
+		{ "", 0 },
+		{ "schedule", BOND_SCHEDULE },
+		{ },
+	};
+	struct intel_gt *gt = arg;
+	struct intel_engine_cs *siblings[MAX_ENGINE_INSTANCE + 1];
+	unsigned int class, inst;
+	int err;
+
+	if (USES_GUC_SUBMISSION(gt->i915))
+		return 0;
+
+	for (class = 0; class <= MAX_ENGINE_CLASS; class++) {
+		const struct phase *p;
+		int nsibling;
+
+		nsibling = 0;
+		for (inst = 0; inst <= MAX_ENGINE_INSTANCE; inst++) {
+			if (!gt->engine_class[class][inst])
+				break;
+
+			GEM_BUG_ON(nsibling == ARRAY_SIZE(siblings));
+			siblings[nsibling++] = gt->engine_class[class][inst];
+		}
+		if (nsibling < 2)
+			continue;
+
+		for (p = phases; p->name; p++) {
+			err = bond_virtual_engine(gt,
+						  class, siblings, nsibling,
+						  p->flags);
+			if (err) {
+				pr_err("%s(%s): failed class=%d, nsibling=%d, err=%d\n",
+				       __func__, p->name, class, nsibling, err);
+				return err;
+			}
+		}
+	}
+
+	return 0;
+}
+
+int intel_execlists_live_selftests(struct drm_i915_private *i915)
+{
+	static const struct i915_subtest tests[] = {
+		SUBTEST(live_sanitycheck),
+		SUBTEST(live_unlite_switch),
+		SUBTEST(live_unlite_preempt),
+		SUBTEST(live_timeslice_preempt),
+		SUBTEST(live_timeslice_queue),
+		SUBTEST(live_busywait_preempt),
+		SUBTEST(live_preempt),
+		SUBTEST(live_late_preempt),
+		SUBTEST(live_nopreempt),
+		SUBTEST(live_preempt_cancel),
+		SUBTEST(live_suppress_self_preempt),
+		SUBTEST(live_suppress_wait_preempt),
+		SUBTEST(live_chain_preempt),
+		SUBTEST(live_preempt_gang),
+		SUBTEST(live_preempt_hang),
+		SUBTEST(live_preempt_timeout),
+		SUBTEST(live_preempt_smoke),
+		SUBTEST(live_virtual_engine),
+		SUBTEST(live_virtual_mask),
+		SUBTEST(live_virtual_preserved),
+		SUBTEST(live_virtual_bond),
+	};
+
+	if (!HAS_EXECLISTS(i915))
+		return 0;
+
+	if (intel_gt_is_wedged(&i915->gt))
+		return 0;
+
+	return intel_gt_live_subtests(tests, &i915->gt);
+}
diff --git a/drivers/gpu/drm/i915/gt/selftest_lrc.c b/drivers/gpu/drm/i915/gt/selftest_lrc.c
index 570c7891c62f..c3f5f46ffcb4 100644
--- a/drivers/gpu/drm/i915/gt/selftest_lrc.c
+++ b/drivers/gpu/drm/i915/gt/selftest_lrc.c
@@ -4,18 +4,8 @@
  * Copyright © 2018 Intel Corporation
  */
 
-#include <linux/prime_numbers.h>
-
-#include "gem/i915_gem_pm.h"
-#include "gt/intel_engine_heartbeat.h"
-#include "gt/intel_reset.h"
-
 #include "i915_selftest.h"
-#include "selftests/i915_random.h"
 #include "selftests/igt_flush_test.h"
-#include "selftests/igt_live_test.h"
-#include "selftests/igt_spinner.h"
-#include "selftests/lib_sw_fence.h"
 
 #include "gem/selftests/igt_gem_utils.h"
 #include "gem/selftests/mock_context.h"
@@ -23,3325 +13,6 @@
 #define CS_GPR(engine, n) ((engine)->mmio_base + 0x600 + (n) * 4)
 #define NUM_GPR_DW (16 * 2) /* each GPR is 2 dwords */
 
-static struct i915_vma *create_scratch(struct intel_gt *gt)
-{
-	struct drm_i915_gem_object *obj;
-	struct i915_vma *vma;
-	int err;
-
-	obj = i915_gem_object_create_internal(gt->i915, PAGE_SIZE);
-	if (IS_ERR(obj))
-		return ERR_CAST(obj);
-
-	i915_gem_object_set_cache_coherency(obj, I915_CACHING_CACHED);
-
-	vma = i915_vma_instance(obj, &gt->ggtt->vm, NULL);
-	if (IS_ERR(vma)) {
-		i915_gem_object_put(obj);
-		return vma;
-	}
-
-	err = i915_vma_pin(vma, 0, 0, PIN_GLOBAL);
-	if (err) {
-		i915_gem_object_put(obj);
-		return ERR_PTR(err);
-	}
-
-	return vma;
-}
-
-static int live_sanitycheck(void *arg)
-{
-	struct intel_gt *gt = arg;
-	struct i915_gem_engines_iter it;
-	struct i915_gem_context *ctx;
-	struct intel_context *ce;
-	struct igt_spinner spin;
-	int err = -ENOMEM;
-
-	if (!HAS_LOGICAL_RING_CONTEXTS(gt->i915))
-		return 0;
-
-	if (igt_spinner_init(&spin, gt))
-		return -ENOMEM;
-
-	ctx = kernel_context(gt->i915);
-	if (!ctx)
-		goto err_spin;
-
-	for_each_gem_engine(ce, i915_gem_context_lock_engines(ctx), it) {
-		struct i915_request *rq;
-
-		rq = igt_spinner_create_request(&spin, ce, MI_NOOP);
-		if (IS_ERR(rq)) {
-			err = PTR_ERR(rq);
-			goto err_ctx;
-		}
-
-		i915_request_add(rq);
-		if (!igt_wait_for_spinner(&spin, rq)) {
-			GEM_TRACE("spinner failed to start\n");
-			GEM_TRACE_DUMP();
-			intel_gt_set_wedged(gt);
-			err = -EIO;
-			goto err_ctx;
-		}
-
-		igt_spinner_end(&spin);
-		if (igt_flush_test(gt->i915)) {
-			err = -EIO;
-			goto err_ctx;
-		}
-	}
-
-	err = 0;
-err_ctx:
-	i915_gem_context_unlock_engines(ctx);
-	kernel_context_close(ctx);
-err_spin:
-	igt_spinner_fini(&spin);
-	return err;
-}
-
-static int live_unlite_restore(struct intel_gt *gt, int prio)
-{
-	struct intel_engine_cs *engine;
-	struct i915_gem_context *ctx;
-	enum intel_engine_id id;
-	struct igt_spinner spin;
-	int err = -ENOMEM;
-
-	/*
-	 * Check that we can correctly context switch between 2 instances
-	 * on the same engine from the same parent context.
-	 */
-
-	if (igt_spinner_init(&spin, gt))
-		return err;
-
-	ctx = kernel_context(gt->i915);
-	if (!ctx)
-		goto err_spin;
-
-	err = 0;
-	for_each_engine(engine, gt, id) {
-		struct intel_context *ce[2] = {};
-		struct i915_request *rq[2];
-		struct igt_live_test t;
-		int n;
-
-		if (prio && !intel_engine_has_preemption(engine))
-			continue;
-
-		if (!intel_engine_can_store_dword(engine))
-			continue;
-
-		if (igt_live_test_begin(&t, gt->i915, __func__, engine->name)) {
-			err = -EIO;
-			break;
-		}
-
-		for (n = 0; n < ARRAY_SIZE(ce); n++) {
-			struct intel_context *tmp;
-
-			tmp = intel_context_create(ctx, engine);
-			if (IS_ERR(tmp)) {
-				err = PTR_ERR(tmp);
-				goto err_ce;
-			}
-
-			err = intel_context_pin(tmp);
-			if (err) {
-				intel_context_put(tmp);
-				goto err_ce;
-			}
-
-			/*
-			 * Setup the pair of contexts such that if we
-			 * lite-restore using the RING_TAIL from ce[1] it
-			 * will execute garbage from ce[0]->ring.
-			 */
-			memset(tmp->ring->vaddr,
-			       POISON_INUSE, /* IPEHR: 0x5a5a5a5a [hung!] */
-			       tmp->ring->vma->size);
-
-			ce[n] = tmp;
-		}
-		GEM_BUG_ON(!ce[1]->ring->size);
-		intel_ring_reset(ce[1]->ring, ce[1]->ring->size / 2);
-		lr_context_update_reg_state(ce[1], engine);
-
-		rq[0] = igt_spinner_create_request(&spin, ce[0], MI_ARB_CHECK);
-		if (IS_ERR(rq[0])) {
-			err = PTR_ERR(rq[0]);
-			goto err_ce;
-		}
-
-		i915_request_get(rq[0]);
-		i915_request_add(rq[0]);
-		GEM_BUG_ON(rq[0]->postfix > ce[1]->ring->emit);
-
-		if (!igt_wait_for_spinner(&spin, rq[0])) {
-			i915_request_put(rq[0]);
-			goto err_ce;
-		}
-
-		rq[1] = i915_request_create(ce[1]);
-		if (IS_ERR(rq[1])) {
-			err = PTR_ERR(rq[1]);
-			i915_request_put(rq[0]);
-			goto err_ce;
-		}
-
-		if (!prio) {
-			/*
-			 * Ensure we do the switch to ce[1] on completion.
-			 *
-			 * rq[0] is already submitted, so this should reduce
-			 * to a no-op (a wait on a request on the same engine
-			 * uses the submit fence, not the completion fence),
-			 * but it will install a dependency on rq[1] for rq[0]
-			 * that will prevent the pair being reordered by
-			 * timeslicing.
-			 */
-			i915_request_await_dma_fence(rq[1], &rq[0]->fence);
-		}
-
-		i915_request_get(rq[1]);
-		i915_request_add(rq[1]);
-		GEM_BUG_ON(rq[1]->postfix <= rq[0]->postfix);
-		i915_request_put(rq[0]);
-
-		if (prio) {
-			struct i915_sched_attr attr = {
-				.priority = prio,
-			};
-
-			/* Alternatively preempt the spinner with ce[1] */
-			engine->schedule(rq[1], &attr);
-		}
-
-		/* And switch back to ce[0] for good measure */
-		rq[0] = i915_request_create(ce[0]);
-		if (IS_ERR(rq[0])) {
-			err = PTR_ERR(rq[0]);
-			i915_request_put(rq[1]);
-			goto err_ce;
-		}
-
-		i915_request_await_dma_fence(rq[0], &rq[1]->fence);
-		i915_request_get(rq[0]);
-		i915_request_add(rq[0]);
-		GEM_BUG_ON(rq[0]->postfix > rq[1]->postfix);
-		i915_request_put(rq[1]);
-		i915_request_put(rq[0]);
-
-err_ce:
-		tasklet_kill(&engine->execlists.tasklet); /* flush submission */
-		igt_spinner_end(&spin);
-		for (n = 0; n < ARRAY_SIZE(ce); n++) {
-			if (IS_ERR_OR_NULL(ce[n]))
-				break;
-
-			intel_context_unpin(ce[n]);
-			intel_context_put(ce[n]);
-		}
-
-		if (igt_live_test_end(&t))
-			err = -EIO;
-		if (err)
-			break;
-	}
-
-	kernel_context_close(ctx);
-err_spin:
-	igt_spinner_fini(&spin);
-	return err;
-}
-
-static int live_unlite_switch(void *arg)
-{
-	return live_unlite_restore(arg, 0);
-}
-
-static int live_unlite_preempt(void *arg)
-{
-	return live_unlite_restore(arg, I915_USER_PRIORITY(I915_PRIORITY_MAX));
-}
-
-static int
-emit_semaphore_chain(struct i915_request *rq, struct i915_vma *vma, int idx)
-{
-	u32 *cs;
-
-	cs = intel_ring_begin(rq, 10);
-	if (IS_ERR(cs))
-		return PTR_ERR(cs);
-
-	*cs++ = MI_ARB_ON_OFF | MI_ARB_ENABLE;
-
-	*cs++ = MI_SEMAPHORE_WAIT |
-		MI_SEMAPHORE_GLOBAL_GTT |
-		MI_SEMAPHORE_POLL |
-		MI_SEMAPHORE_SAD_NEQ_SDD;
-	*cs++ = 0;
-	*cs++ = i915_ggtt_offset(vma) + 4 * idx;
-	*cs++ = 0;
-
-	if (idx > 0) {
-		*cs++ = MI_STORE_DWORD_IMM_GEN4 | MI_USE_GGTT;
-		*cs++ = i915_ggtt_offset(vma) + 4 * (idx - 1);
-		*cs++ = 0;
-		*cs++ = 1;
-	} else {
-		*cs++ = MI_NOOP;
-		*cs++ = MI_NOOP;
-		*cs++ = MI_NOOP;
-		*cs++ = MI_NOOP;
-	}
-
-	*cs++ = MI_ARB_ON_OFF | MI_ARB_DISABLE;
-
-	intel_ring_advance(rq, cs);
-	return 0;
-}
-
-static struct i915_request *
-semaphore_queue(struct intel_engine_cs *engine, struct i915_vma *vma, int idx)
-{
-	struct i915_gem_context *ctx;
-	struct i915_request *rq;
-	int err;
-
-	ctx = kernel_context(engine->i915);
-	if (!ctx)
-		return ERR_PTR(-ENOMEM);
-
-	rq = igt_request_alloc(ctx, engine);
-	if (IS_ERR(rq))
-		goto out_ctx;
-
-	err = 0;
-	if (rq->engine->emit_init_breadcrumb)
-		err = rq->engine->emit_init_breadcrumb(rq);
-	if (err == 0)
-		err = emit_semaphore_chain(rq, vma, idx);
-	if (err == 0)
-		i915_request_get(rq);
-	i915_request_add(rq);
-	if (err)
-		rq = ERR_PTR(err);
-
-out_ctx:
-	kernel_context_close(ctx);
-	return rq;
-}
-
-static int
-release_queue(struct intel_engine_cs *engine,
-	      struct i915_vma *vma,
-	      int idx, int prio)
-{
-	struct i915_sched_attr attr = {
-		.priority = prio,
-	};
-	struct i915_request *rq;
-	u32 *cs;
-
-	rq = intel_engine_create_kernel_request(engine);
-	if (IS_ERR(rq))
-		return PTR_ERR(rq);
-
-	cs = intel_ring_begin(rq, 4);
-	if (IS_ERR(cs)) {
-		i915_request_add(rq);
-		return PTR_ERR(cs);
-	}
-
-	*cs++ = MI_STORE_DWORD_IMM_GEN4 | MI_USE_GGTT;
-	*cs++ = i915_ggtt_offset(vma) + 4 * (idx - 1);
-	*cs++ = 0;
-	*cs++ = 1;
-
-	intel_ring_advance(rq, cs);
-
-	i915_request_get(rq);
-	i915_request_add(rq);
-
-	local_bh_disable();
-	engine->schedule(rq, &attr);
-	local_bh_enable(); /* kick tasklet */
-
-	i915_request_put(rq);
-
-	return 0;
-}
-
-static int
-slice_semaphore_queue(struct intel_engine_cs *outer,
-		      struct i915_vma *vma,
-		      int count)
-{
-	struct intel_engine_cs *engine;
-	struct i915_request *head;
-	enum intel_engine_id id;
-	int err, i, n = 0;
-
-	head = semaphore_queue(outer, vma, n++);
-	if (IS_ERR(head))
-		return PTR_ERR(head);
-
-	for_each_engine(engine, outer->gt, id) {
-		for (i = 0; i < count; i++) {
-			struct i915_request *rq;
-
-			rq = semaphore_queue(engine, vma, n++);
-			if (IS_ERR(rq)) {
-				err = PTR_ERR(rq);
-				goto out;
-			}
-
-			i915_request_put(rq);
-		}
-	}
-
-	err = release_queue(outer, vma, n, INT_MAX);
-	if (err)
-		goto out;
-
-	if (i915_request_wait(head, 0,
-			      2 * RUNTIME_INFO(outer->i915)->num_engines * (count + 2) * (count + 3)) < 0) {
-		pr_err("Failed to slice along semaphore chain of length (%d, %d)!\n",
-		       count, n);
-		GEM_TRACE_DUMP();
-		intel_gt_set_wedged(outer->gt);
-		err = -EIO;
-	}
-
-out:
-	i915_request_put(head);
-	return err;
-}
-
-static int live_timeslice_preempt(void *arg)
-{
-	struct intel_gt *gt = arg;
-	struct drm_i915_gem_object *obj;
-	struct i915_vma *vma;
-	void *vaddr;
-	int err = 0;
-	int count;
-
-	/*
-	 * If a request takes too long, we would like to give other users
-	 * a fair go on the GPU. In particular, users may create batches
-	 * that wait upon external input, where that input may even be
-	 * supplied by another GPU job. To avoid blocking forever, we
-	 * need to preempt the current task and replace it with another
-	 * ready task.
-	 */
-	if (!IS_ACTIVE(CONFIG_DRM_I915_TIMESLICE_DURATION))
-		return 0;
-
-	obj = i915_gem_object_create_internal(gt->i915, PAGE_SIZE);
-	if (IS_ERR(obj))
-		return PTR_ERR(obj);
-
-	vma = i915_vma_instance(obj, &gt->ggtt->vm, NULL);
-	if (IS_ERR(vma)) {
-		err = PTR_ERR(vma);
-		goto err_obj;
-	}
-
-	vaddr = i915_gem_object_pin_map(obj, I915_MAP_WC);
-	if (IS_ERR(vaddr)) {
-		err = PTR_ERR(vaddr);
-		goto err_obj;
-	}
-
-	err = i915_vma_pin(vma, 0, 0, PIN_GLOBAL);
-	if (err)
-		goto err_map;
-
-	for_each_prime_number_from(count, 1, 16) {
-		struct intel_engine_cs *engine;
-		enum intel_engine_id id;
-
-		for_each_engine(engine, gt, id) {
-			if (!intel_engine_has_preemption(engine))
-				continue;
-
-			memset(vaddr, 0, PAGE_SIZE);
-
-			err = slice_semaphore_queue(engine, vma, count);
-			if (err)
-				goto err_pin;
-
-			if (igt_flush_test(gt->i915)) {
-				err = -EIO;
-				goto err_pin;
-			}
-		}
-	}
-
-err_pin:
-	i915_vma_unpin(vma);
-err_map:
-	i915_gem_object_unpin_map(obj);
-err_obj:
-	i915_gem_object_put(obj);
-	return err;
-}
-
-static struct i915_request *nop_request(struct intel_engine_cs *engine)
-{
-	struct i915_request *rq;
-
-	rq = intel_engine_create_kernel_request(engine);
-	if (IS_ERR(rq))
-		return rq;
-
-	i915_request_get(rq);
-	i915_request_add(rq);
-
-	return rq;
-}
-
-static void wait_for_submit(struct intel_engine_cs *engine,
-			    struct i915_request *rq)
-{
-	do {
-		cond_resched();
-		intel_engine_flush_submission(engine);
-	} while (!i915_request_is_active(rq));
-}
-
-static long timeslice_threshold(const struct intel_engine_cs *engine)
-{
-	return 2 * msecs_to_jiffies_timeout(timeslice(engine)) + 1;
-}
-
-static int live_timeslice_queue(void *arg)
-{
-	struct intel_gt *gt = arg;
-	struct drm_i915_gem_object *obj;
-	struct intel_engine_cs *engine;
-	enum intel_engine_id id;
-	struct i915_vma *vma;
-	void *vaddr;
-	int err = 0;
-
-	/*
-	 * Make sure that even if ELSP[0] and ELSP[1] are filled with
-	 * timeslicing between them disabled, we *do* enable timeslicing
-	 * if the queue demands it. (Normally, we do not submit if
-	 * ELSP[1] is already occupied, so must rely on timeslicing to
-	 * eject ELSP[0] in favour of the queue.)
-	 */
-	if (!IS_ACTIVE(CONFIG_DRM_I915_TIMESLICE_DURATION))
-		return 0;
-
-	obj = i915_gem_object_create_internal(gt->i915, PAGE_SIZE);
-	if (IS_ERR(obj))
-		return PTR_ERR(obj);
-
-	vma = i915_vma_instance(obj, &gt->ggtt->vm, NULL);
-	if (IS_ERR(vma)) {
-		err = PTR_ERR(vma);
-		goto err_obj;
-	}
-
-	vaddr = i915_gem_object_pin_map(obj, I915_MAP_WC);
-	if (IS_ERR(vaddr)) {
-		err = PTR_ERR(vaddr);
-		goto err_obj;
-	}
-
-	err = i915_vma_pin(vma, 0, 0, PIN_GLOBAL);
-	if (err)
-		goto err_map;
-
-	for_each_engine(engine, gt, id) {
-		struct i915_sched_attr attr = {
-			.priority = I915_USER_PRIORITY(I915_PRIORITY_MAX),
-		};
-		struct i915_request *rq, *nop;
-
-		if (!intel_engine_has_preemption(engine))
-			continue;
-
-		memset(vaddr, 0, PAGE_SIZE);
-
-		/* ELSP[0]: semaphore wait */
-		rq = semaphore_queue(engine, vma, 0);
-		if (IS_ERR(rq)) {
-			err = PTR_ERR(rq);
-			goto err_pin;
-		}
-		engine->schedule(rq, &attr);
-		wait_for_submit(engine, rq);
-
-		/* ELSP[1]: nop request */
-		nop = nop_request(engine);
-		if (IS_ERR(nop)) {
-			err = PTR_ERR(nop);
-			i915_request_put(rq);
-			goto err_pin;
-		}
-		wait_for_submit(engine, nop);
-		i915_request_put(nop);
-
-		GEM_BUG_ON(i915_request_completed(rq));
-		GEM_BUG_ON(execlists_active(&engine->execlists) != rq);
-
-		/* Queue: semaphore signal, matching priority as semaphore */
-		err = release_queue(engine, vma, 1, effective_prio(rq));
-		if (err) {
-			i915_request_put(rq);
-			goto err_pin;
-		}
-
-		intel_engine_flush_submission(engine);
-		if (!READ_ONCE(engine->execlists.timer.expires) &&
-		    !i915_request_completed(rq)) {
-			struct drm_printer p =
-				drm_info_printer(gt->i915->drm.dev);
-
-			GEM_TRACE_ERR("%s: Failed to enable timeslicing!\n",
-				      engine->name);
-			intel_engine_dump(engine, &p,
-					  "%s\n", engine->name);
-			GEM_TRACE_DUMP();
-
-			memset(vaddr, 0xff, PAGE_SIZE);
-			err = -EINVAL;
-		}
-
-		/* Timeslice every jiffy, so within 2 we should signal */
-		if (i915_request_wait(rq, 0, timeslice_threshold(engine)) < 0) {
-			struct drm_printer p =
-				drm_info_printer(gt->i915->drm.dev);
-
-			pr_err("%s: Failed to timeslice into queue\n",
-			       engine->name);
-			intel_engine_dump(engine, &p,
-					  "%s\n", engine->name);
-
-			memset(vaddr, 0xff, PAGE_SIZE);
-			err = -EIO;
-		}
-		i915_request_put(rq);
-		if (err)
-			break;
-	}
-
-err_pin:
-	i915_vma_unpin(vma);
-err_map:
-	i915_gem_object_unpin_map(obj);
-err_obj:
-	i915_gem_object_put(obj);
-	return err;
-}
-
-static int live_busywait_preempt(void *arg)
-{
-	struct intel_gt *gt = arg;
-	struct i915_gem_context *ctx_hi, *ctx_lo;
-	struct intel_engine_cs *engine;
-	struct drm_i915_gem_object *obj;
-	struct i915_vma *vma;
-	enum intel_engine_id id;
-	int err = -ENOMEM;
-	u32 *map;
-
-	/*
-	 * Verify that even without HAS_LOGICAL_RING_PREEMPTION, we can
-	 * preempt the busywaits used to synchronise between rings.
-	 */
-
-	ctx_hi = kernel_context(gt->i915);
-	if (!ctx_hi)
-		return -ENOMEM;
-	ctx_hi->sched.priority =
-		I915_USER_PRIORITY(I915_CONTEXT_MAX_USER_PRIORITY);
-
-	ctx_lo = kernel_context(gt->i915);
-	if (!ctx_lo)
-		goto err_ctx_hi;
-	ctx_lo->sched.priority =
-		I915_USER_PRIORITY(I915_CONTEXT_MIN_USER_PRIORITY);
-
-	obj = i915_gem_object_create_internal(gt->i915, PAGE_SIZE);
-	if (IS_ERR(obj)) {
-		err = PTR_ERR(obj);
-		goto err_ctx_lo;
-	}
-
-	map = i915_gem_object_pin_map(obj, I915_MAP_WC);
-	if (IS_ERR(map)) {
-		err = PTR_ERR(map);
-		goto err_obj;
-	}
-
-	vma = i915_vma_instance(obj, &gt->ggtt->vm, NULL);
-	if (IS_ERR(vma)) {
-		err = PTR_ERR(vma);
-		goto err_map;
-	}
-
-	err = i915_vma_pin(vma, 0, 0, PIN_GLOBAL);
-	if (err)
-		goto err_map;
-
-	for_each_engine(engine, gt, id) {
-		struct i915_request *lo, *hi;
-		struct igt_live_test t;
-		u32 *cs;
-
-		if (!intel_engine_has_preemption(engine))
-			continue;
-
-		if (!intel_engine_can_store_dword(engine))
-			continue;
-
-		if (igt_live_test_begin(&t, gt->i915, __func__, engine->name)) {
-			err = -EIO;
-			goto err_vma;
-		}
-
-		/*
-		 * We create two requests. The low priority request
-		 * busywaits on a semaphore (inside the ringbuffer where
-		 * is should be preemptible) and the high priority requests
-		 * uses a MI_STORE_DWORD_IMM to update the semaphore value
-		 * allowing the first request to complete. If preemption
-		 * fails, we hang instead.
-		 */
-
-		lo = igt_request_alloc(ctx_lo, engine);
-		if (IS_ERR(lo)) {
-			err = PTR_ERR(lo);
-			goto err_vma;
-		}
-
-		cs = intel_ring_begin(lo, 8);
-		if (IS_ERR(cs)) {
-			err = PTR_ERR(cs);
-			i915_request_add(lo);
-			goto err_vma;
-		}
-
-		*cs++ = MI_STORE_DWORD_IMM_GEN4 | MI_USE_GGTT;
-		*cs++ = i915_ggtt_offset(vma);
-		*cs++ = 0;
-		*cs++ = 1;
-
-		/* XXX Do we need a flush + invalidate here? */
-
-		*cs++ = MI_SEMAPHORE_WAIT |
-			MI_SEMAPHORE_GLOBAL_GTT |
-			MI_SEMAPHORE_POLL |
-			MI_SEMAPHORE_SAD_EQ_SDD;
-		*cs++ = 0;
-		*cs++ = i915_ggtt_offset(vma);
-		*cs++ = 0;
-
-		intel_ring_advance(lo, cs);
-
-		i915_request_get(lo);
-		i915_request_add(lo);
-
-		if (wait_for(READ_ONCE(*map), 10)) {
-			i915_request_put(lo);
-			err = -ETIMEDOUT;
-			goto err_vma;
-		}
-
-		/* Low priority request should be busywaiting now */
-		if (i915_request_wait(lo, 0, 1) != -ETIME) {
-			i915_request_put(lo);
-			pr_err("%s: Busywaiting request did not!\n",
-			       engine->name);
-			err = -EIO;
-			goto err_vma;
-		}
-
-		hi = igt_request_alloc(ctx_hi, engine);
-		if (IS_ERR(hi)) {
-			err = PTR_ERR(hi);
-			i915_request_put(lo);
-			goto err_vma;
-		}
-
-		cs = intel_ring_begin(hi, 4);
-		if (IS_ERR(cs)) {
-			err = PTR_ERR(cs);
-			i915_request_add(hi);
-			i915_request_put(lo);
-			goto err_vma;
-		}
-
-		*cs++ = MI_STORE_DWORD_IMM_GEN4 | MI_USE_GGTT;
-		*cs++ = i915_ggtt_offset(vma);
-		*cs++ = 0;
-		*cs++ = 0;
-
-		intel_ring_advance(hi, cs);
-		i915_request_add(hi);
-
-		if (i915_request_wait(lo, 0, HZ / 5) < 0) {
-			struct drm_printer p = drm_info_printer(gt->i915->drm.dev);
-
-			pr_err("%s: Failed to preempt semaphore busywait!\n",
-			       engine->name);
-
-			intel_engine_dump(engine, &p, "%s\n", engine->name);
-			GEM_TRACE_DUMP();
-
-			i915_request_put(lo);
-			intel_gt_set_wedged(gt);
-			err = -EIO;
-			goto err_vma;
-		}
-		GEM_BUG_ON(READ_ONCE(*map));
-		i915_request_put(lo);
-
-		if (igt_live_test_end(&t)) {
-			err = -EIO;
-			goto err_vma;
-		}
-	}
-
-	err = 0;
-err_vma:
-	i915_vma_unpin(vma);
-err_map:
-	i915_gem_object_unpin_map(obj);
-err_obj:
-	i915_gem_object_put(obj);
-err_ctx_lo:
-	kernel_context_close(ctx_lo);
-err_ctx_hi:
-	kernel_context_close(ctx_hi);
-	return err;
-}
-
-static struct i915_request *
-spinner_create_request(struct igt_spinner *spin,
-		       struct i915_gem_context *ctx,
-		       struct intel_engine_cs *engine,
-		       u32 arb)
-{
-	struct intel_context *ce;
-	struct i915_request *rq;
-
-	ce = i915_gem_context_get_engine(ctx, engine->legacy_idx);
-	if (IS_ERR(ce))
-		return ERR_CAST(ce);
-
-	rq = igt_spinner_create_request(spin, ce, arb);
-	intel_context_put(ce);
-	return rq;
-}
-
-static int live_preempt(void *arg)
-{
-	struct intel_gt *gt = arg;
-	struct i915_gem_context *ctx_hi, *ctx_lo;
-	struct igt_spinner spin_hi, spin_lo;
-	struct intel_engine_cs *engine;
-	enum intel_engine_id id;
-	int err = -ENOMEM;
-
-	if (!HAS_LOGICAL_RING_PREEMPTION(gt->i915))
-		return 0;
-
-	if (!(gt->i915->caps.scheduler & I915_SCHEDULER_CAP_PREEMPTION))
-		pr_err("Logical preemption supported, but not exposed\n");
-
-	if (igt_spinner_init(&spin_hi, gt))
-		return -ENOMEM;
-
-	if (igt_spinner_init(&spin_lo, gt))
-		goto err_spin_hi;
-
-	ctx_hi = kernel_context(gt->i915);
-	if (!ctx_hi)
-		goto err_spin_lo;
-	ctx_hi->sched.priority =
-		I915_USER_PRIORITY(I915_CONTEXT_MAX_USER_PRIORITY);
-
-	ctx_lo = kernel_context(gt->i915);
-	if (!ctx_lo)
-		goto err_ctx_hi;
-	ctx_lo->sched.priority =
-		I915_USER_PRIORITY(I915_CONTEXT_MIN_USER_PRIORITY);
-
-	for_each_engine(engine, gt, id) {
-		struct igt_live_test t;
-		struct i915_request *rq;
-
-		if (!intel_engine_has_preemption(engine))
-			continue;
-
-		if (igt_live_test_begin(&t, gt->i915, __func__, engine->name)) {
-			err = -EIO;
-			goto err_ctx_lo;
-		}
-
-		rq = spinner_create_request(&spin_lo, ctx_lo, engine,
-					    MI_ARB_CHECK);
-		if (IS_ERR(rq)) {
-			err = PTR_ERR(rq);
-			goto err_ctx_lo;
-		}
-
-		i915_request_add(rq);
-		if (!igt_wait_for_spinner(&spin_lo, rq)) {
-			GEM_TRACE("lo spinner failed to start\n");
-			GEM_TRACE_DUMP();
-			intel_gt_set_wedged(gt);
-			err = -EIO;
-			goto err_ctx_lo;
-		}
-
-		rq = spinner_create_request(&spin_hi, ctx_hi, engine,
-					    MI_ARB_CHECK);
-		if (IS_ERR(rq)) {
-			igt_spinner_end(&spin_lo);
-			err = PTR_ERR(rq);
-			goto err_ctx_lo;
-		}
-
-		i915_request_add(rq);
-		if (!igt_wait_for_spinner(&spin_hi, rq)) {
-			GEM_TRACE("hi spinner failed to start\n");
-			GEM_TRACE_DUMP();
-			intel_gt_set_wedged(gt);
-			err = -EIO;
-			goto err_ctx_lo;
-		}
-
-		igt_spinner_end(&spin_hi);
-		igt_spinner_end(&spin_lo);
-
-		if (igt_live_test_end(&t)) {
-			err = -EIO;
-			goto err_ctx_lo;
-		}
-	}
-
-	err = 0;
-err_ctx_lo:
-	kernel_context_close(ctx_lo);
-err_ctx_hi:
-	kernel_context_close(ctx_hi);
-err_spin_lo:
-	igt_spinner_fini(&spin_lo);
-err_spin_hi:
-	igt_spinner_fini(&spin_hi);
-	return err;
-}
-
-static int live_late_preempt(void *arg)
-{
-	struct intel_gt *gt = arg;
-	struct i915_gem_context *ctx_hi, *ctx_lo;
-	struct igt_spinner spin_hi, spin_lo;
-	struct intel_engine_cs *engine;
-	struct i915_sched_attr attr = {};
-	enum intel_engine_id id;
-	int err = -ENOMEM;
-
-	if (!HAS_LOGICAL_RING_PREEMPTION(gt->i915))
-		return 0;
-
-	if (igt_spinner_init(&spin_hi, gt))
-		return -ENOMEM;
-
-	if (igt_spinner_init(&spin_lo, gt))
-		goto err_spin_hi;
-
-	ctx_hi = kernel_context(gt->i915);
-	if (!ctx_hi)
-		goto err_spin_lo;
-
-	ctx_lo = kernel_context(gt->i915);
-	if (!ctx_lo)
-		goto err_ctx_hi;
-
-	/* Make sure ctx_lo stays before ctx_hi until we trigger preemption. */
-	ctx_lo->sched.priority = I915_USER_PRIORITY(1);
-
-	for_each_engine(engine, gt, id) {
-		struct igt_live_test t;
-		struct i915_request *rq;
-
-		if (!intel_engine_has_preemption(engine))
-			continue;
-
-		if (igt_live_test_begin(&t, gt->i915, __func__, engine->name)) {
-			err = -EIO;
-			goto err_ctx_lo;
-		}
-
-		rq = spinner_create_request(&spin_lo, ctx_lo, engine,
-					    MI_ARB_CHECK);
-		if (IS_ERR(rq)) {
-			err = PTR_ERR(rq);
-			goto err_ctx_lo;
-		}
-
-		i915_request_add(rq);
-		if (!igt_wait_for_spinner(&spin_lo, rq)) {
-			pr_err("First context failed to start\n");
-			goto err_wedged;
-		}
-
-		rq = spinner_create_request(&spin_hi, ctx_hi, engine,
-					    MI_NOOP);
-		if (IS_ERR(rq)) {
-			igt_spinner_end(&spin_lo);
-			err = PTR_ERR(rq);
-			goto err_ctx_lo;
-		}
-
-		i915_request_add(rq);
-		if (igt_wait_for_spinner(&spin_hi, rq)) {
-			pr_err("Second context overtook first?\n");
-			goto err_wedged;
-		}
-
-		attr.priority = I915_USER_PRIORITY(I915_PRIORITY_MAX);
-		engine->schedule(rq, &attr);
-
-		if (!igt_wait_for_spinner(&spin_hi, rq)) {
-			pr_err("High priority context failed to preempt the low priority context\n");
-			GEM_TRACE_DUMP();
-			goto err_wedged;
-		}
-
-		igt_spinner_end(&spin_hi);
-		igt_spinner_end(&spin_lo);
-
-		if (igt_live_test_end(&t)) {
-			err = -EIO;
-			goto err_ctx_lo;
-		}
-	}
-
-	err = 0;
-err_ctx_lo:
-	kernel_context_close(ctx_lo);
-err_ctx_hi:
-	kernel_context_close(ctx_hi);
-err_spin_lo:
-	igt_spinner_fini(&spin_lo);
-err_spin_hi:
-	igt_spinner_fini(&spin_hi);
-	return err;
-
-err_wedged:
-	igt_spinner_end(&spin_hi);
-	igt_spinner_end(&spin_lo);
-	intel_gt_set_wedged(gt);
-	err = -EIO;
-	goto err_ctx_lo;
-}
-
-struct preempt_client {
-	struct igt_spinner spin;
-	struct i915_gem_context *ctx;
-};
-
-static int preempt_client_init(struct intel_gt *gt, struct preempt_client *c)
-{
-	c->ctx = kernel_context(gt->i915);
-	if (!c->ctx)
-		return -ENOMEM;
-
-	if (igt_spinner_init(&c->spin, gt))
-		goto err_ctx;
-
-	return 0;
-
-err_ctx:
-	kernel_context_close(c->ctx);
-	return -ENOMEM;
-}
-
-static void preempt_client_fini(struct preempt_client *c)
-{
-	igt_spinner_fini(&c->spin);
-	kernel_context_close(c->ctx);
-}
-
-static int live_nopreempt(void *arg)
-{
-	struct intel_gt *gt = arg;
-	struct intel_engine_cs *engine;
-	struct preempt_client a, b;
-	enum intel_engine_id id;
-	int err = -ENOMEM;
-
-	/*
-	 * Verify that we can disable preemption for an individual request
-	 * that may be under observation and must not be interrupted.
-	 */
-
-	if (!HAS_LOGICAL_RING_PREEMPTION(gt->i915))
-		return 0;
-
-	if (preempt_client_init(gt, &a))
-		return -ENOMEM;
-	if (preempt_client_init(gt, &b))
-		goto err_client_a;
-	b.ctx->sched.priority = I915_USER_PRIORITY(I915_PRIORITY_MAX);
-
-	for_each_engine(engine, gt, id) {
-		struct i915_request *rq_a, *rq_b;
-
-		if (!intel_engine_has_preemption(engine))
-			continue;
-
-		engine->execlists.preempt_hang.count = 0;
-
-		rq_a = spinner_create_request(&a.spin,
-					      a.ctx, engine,
-					      MI_ARB_CHECK);
-		if (IS_ERR(rq_a)) {
-			err = PTR_ERR(rq_a);
-			goto err_client_b;
-		}
-
-		/* Low priority client, but unpreemptable! */
-		rq_a->flags |= I915_REQUEST_NOPREEMPT;
-
-		i915_request_add(rq_a);
-		if (!igt_wait_for_spinner(&a.spin, rq_a)) {
-			pr_err("First client failed to start\n");
-			goto err_wedged;
-		}
-
-		rq_b = spinner_create_request(&b.spin,
-					      b.ctx, engine,
-					      MI_ARB_CHECK);
-		if (IS_ERR(rq_b)) {
-			err = PTR_ERR(rq_b);
-			goto err_client_b;
-		}
-
-		i915_request_add(rq_b);
-
-		/* B is much more important than A! (But A is unpreemptable.) */
-		GEM_BUG_ON(rq_prio(rq_b) <= rq_prio(rq_a));
-
-		/* Wait long enough for preemption and timeslicing */
-		if (igt_wait_for_spinner(&b.spin, rq_b)) {
-			pr_err("Second client started too early!\n");
-			goto err_wedged;
-		}
-
-		igt_spinner_end(&a.spin);
-
-		if (!igt_wait_for_spinner(&b.spin, rq_b)) {
-			pr_err("Second client failed to start\n");
-			goto err_wedged;
-		}
-
-		igt_spinner_end(&b.spin);
-
-		if (engine->execlists.preempt_hang.count) {
-			pr_err("Preemption recorded x%d; should have been suppressed!\n",
-			       engine->execlists.preempt_hang.count);
-			err = -EINVAL;
-			goto err_wedged;
-		}
-
-		if (igt_flush_test(gt->i915))
-			goto err_wedged;
-	}
-
-	err = 0;
-err_client_b:
-	preempt_client_fini(&b);
-err_client_a:
-	preempt_client_fini(&a);
-	return err;
-
-err_wedged:
-	igt_spinner_end(&b.spin);
-	igt_spinner_end(&a.spin);
-	intel_gt_set_wedged(gt);
-	err = -EIO;
-	goto err_client_b;
-}
-
-struct live_preempt_cancel {
-	struct intel_engine_cs *engine;
-	struct preempt_client a, b;
-};
-
-static int __cancel_active0(struct live_preempt_cancel *arg)
-{
-	struct i915_request *rq;
-	struct igt_live_test t;
-	int err;
-
-	/* Preempt cancel of ELSP0 */
-	GEM_TRACE("%s(%s)\n", __func__, arg->engine->name);
-	if (igt_live_test_begin(&t, arg->engine->i915,
-				__func__, arg->engine->name))
-		return -EIO;
-
-	clear_bit(CONTEXT_BANNED, &arg->a.ctx->flags);
-	rq = spinner_create_request(&arg->a.spin,
-				    arg->a.ctx, arg->engine,
-				    MI_ARB_CHECK);
-	if (IS_ERR(rq))
-		return PTR_ERR(rq);
-
-	i915_request_get(rq);
-	i915_request_add(rq);
-	if (!igt_wait_for_spinner(&arg->a.spin, rq)) {
-		err = -EIO;
-		goto out;
-	}
-
-	i915_gem_context_set_banned(arg->a.ctx);
-	err = intel_engine_pulse(arg->engine);
-	if (err)
-		goto out;
-
-	if (i915_request_wait(rq, 0, HZ / 5) < 0) {
-		err = -EIO;
-		goto out;
-	}
-
-	if (rq->fence.error != -EIO) {
-		pr_err("Cancelled inflight0 request did not report -EIO\n");
-		err = -EINVAL;
-		goto out;
-	}
-
-out:
-	i915_request_put(rq);
-	if (igt_live_test_end(&t))
-		err = -EIO;
-	return err;
-}
-
-static int __cancel_active1(struct live_preempt_cancel *arg)
-{
-	struct i915_request *rq[2] = {};
-	struct igt_live_test t;
-	int err;
-
-	/* Preempt cancel of ELSP1 */
-	GEM_TRACE("%s(%s)\n", __func__, arg->engine->name);
-	if (igt_live_test_begin(&t, arg->engine->i915,
-				__func__, arg->engine->name))
-		return -EIO;
-
-	clear_bit(CONTEXT_BANNED, &arg->a.ctx->flags);
-	rq[0] = spinner_create_request(&arg->a.spin,
-				       arg->a.ctx, arg->engine,
-				       MI_NOOP); /* no preemption */
-	if (IS_ERR(rq[0]))
-		return PTR_ERR(rq[0]);
-
-	i915_request_get(rq[0]);
-	i915_request_add(rq[0]);
-	if (!igt_wait_for_spinner(&arg->a.spin, rq[0])) {
-		err = -EIO;
-		goto out;
-	}
-
-	clear_bit(CONTEXT_BANNED, &arg->b.ctx->flags);
-	rq[1] = spinner_create_request(&arg->b.spin,
-				       arg->b.ctx, arg->engine,
-				       MI_ARB_CHECK);
-	if (IS_ERR(rq[1])) {
-		err = PTR_ERR(rq[1]);
-		goto out;
-	}
-
-	i915_request_get(rq[1]);
-	err = i915_request_await_dma_fence(rq[1], &rq[0]->fence);
-	i915_request_add(rq[1]);
-	if (err)
-		goto out;
-
-	i915_gem_context_set_banned(arg->b.ctx);
-	err = intel_engine_pulse(arg->engine);
-	if (err)
-		goto out;
-
-	igt_spinner_end(&arg->a.spin);
-	if (i915_request_wait(rq[1], 0, HZ / 5) < 0) {
-		err = -EIO;
-		goto out;
-	}
-
-	if (rq[0]->fence.error != 0) {
-		pr_err("Normal inflight0 request did not complete\n");
-		err = -EINVAL;
-		goto out;
-	}
-
-	if (rq[1]->fence.error != -EIO) {
-		pr_err("Cancelled inflight1 request did not report -EIO\n");
-		err = -EINVAL;
-		goto out;
-	}
-
-out:
-	i915_request_put(rq[1]);
-	i915_request_put(rq[0]);
-	if (igt_live_test_end(&t))
-		err = -EIO;
-	return err;
-}
-
-static int __cancel_queued(struct live_preempt_cancel *arg)
-{
-	struct i915_request *rq[3] = {};
-	struct igt_live_test t;
-	int err;
-
-	/* Full ELSP and one in the wings */
-	GEM_TRACE("%s(%s)\n", __func__, arg->engine->name);
-	if (igt_live_test_begin(&t, arg->engine->i915,
-				__func__, arg->engine->name))
-		return -EIO;
-
-	clear_bit(CONTEXT_BANNED, &arg->a.ctx->flags);
-	rq[0] = spinner_create_request(&arg->a.spin,
-				       arg->a.ctx, arg->engine,
-				       MI_ARB_CHECK);
-	if (IS_ERR(rq[0]))
-		return PTR_ERR(rq[0]);
-
-	i915_request_get(rq[0]);
-	i915_request_add(rq[0]);
-	if (!igt_wait_for_spinner(&arg->a.spin, rq[0])) {
-		err = -EIO;
-		goto out;
-	}
-
-	clear_bit(CONTEXT_BANNED, &arg->b.ctx->flags);
-	rq[1] = igt_request_alloc(arg->b.ctx, arg->engine);
-	if (IS_ERR(rq[1])) {
-		err = PTR_ERR(rq[1]);
-		goto out;
-	}
-
-	i915_request_get(rq[1]);
-	err = i915_request_await_dma_fence(rq[1], &rq[0]->fence);
-	i915_request_add(rq[1]);
-	if (err)
-		goto out;
-
-	rq[2] = spinner_create_request(&arg->b.spin,
-				       arg->a.ctx, arg->engine,
-				       MI_ARB_CHECK);
-	if (IS_ERR(rq[2])) {
-		err = PTR_ERR(rq[2]);
-		goto out;
-	}
-
-	i915_request_get(rq[2]);
-	err = i915_request_await_dma_fence(rq[2], &rq[1]->fence);
-	i915_request_add(rq[2]);
-	if (err)
-		goto out;
-
-	i915_gem_context_set_banned(arg->a.ctx);
-	err = intel_engine_pulse(arg->engine);
-	if (err)
-		goto out;
-
-	if (i915_request_wait(rq[2], 0, HZ / 5) < 0) {
-		err = -EIO;
-		goto out;
-	}
-
-	if (rq[0]->fence.error != -EIO) {
-		pr_err("Cancelled inflight0 request did not report -EIO\n");
-		err = -EINVAL;
-		goto out;
-	}
-
-	if (rq[1]->fence.error != 0) {
-		pr_err("Normal inflight1 request did not complete\n");
-		err = -EINVAL;
-		goto out;
-	}
-
-	if (rq[2]->fence.error != -EIO) {
-		pr_err("Cancelled queued request did not report -EIO\n");
-		err = -EINVAL;
-		goto out;
-	}
-
-out:
-	i915_request_put(rq[2]);
-	i915_request_put(rq[1]);
-	i915_request_put(rq[0]);
-	if (igt_live_test_end(&t))
-		err = -EIO;
-	return err;
-}
-
-static int __cancel_hostile(struct live_preempt_cancel *arg)
-{
-	struct i915_request *rq;
-	int err;
-
-	/* Preempt cancel non-preemptible spinner in ELSP0 */
-	if (!IS_ACTIVE(CONFIG_DRM_I915_PREEMPT_TIMEOUT))
-		return 0;
-
-	GEM_TRACE("%s(%s)\n", __func__, arg->engine->name);
-	clear_bit(CONTEXT_BANNED, &arg->a.ctx->flags);
-	rq = spinner_create_request(&arg->a.spin,
-				    arg->a.ctx, arg->engine,
-				    MI_NOOP); /* preemption disabled */
-	if (IS_ERR(rq))
-		return PTR_ERR(rq);
-
-	i915_request_get(rq);
-	i915_request_add(rq);
-	if (!igt_wait_for_spinner(&arg->a.spin, rq)) {
-		err = -EIO;
-		goto out;
-	}
-
-	i915_gem_context_set_banned(arg->a.ctx);
-	err = intel_engine_pulse(arg->engine); /* force reset */
-	if (err)
-		goto out;
-
-	if (i915_request_wait(rq, 0, HZ / 5) < 0) {
-		err = -EIO;
-		goto out;
-	}
-
-	if (rq->fence.error != -EIO) {
-		pr_err("Cancelled inflight0 request did not report -EIO\n");
-		err = -EINVAL;
-		goto out;
-	}
-
-out:
-	i915_request_put(rq);
-	if (igt_flush_test(arg->engine->i915))
-		err = -EIO;
-	return err;
-}
-
-static int live_preempt_cancel(void *arg)
-{
-	struct intel_gt *gt = arg;
-	struct live_preempt_cancel data;
-	enum intel_engine_id id;
-	int err = -ENOMEM;
-
-	/*
-	 * To cancel an inflight context, we need to first remove it from the
-	 * GPU. That sounds like preemption! Plus a little bit of bookkeeping.
-	 */
-
-	if (!HAS_LOGICAL_RING_PREEMPTION(gt->i915))
-		return 0;
-
-	if (preempt_client_init(gt, &data.a))
-		return -ENOMEM;
-	if (preempt_client_init(gt, &data.b))
-		goto err_client_a;
-
-	for_each_engine(data.engine, gt, id) {
-		if (!intel_engine_has_preemption(data.engine))
-			continue;
-
-		err = __cancel_active0(&data);
-		if (err)
-			goto err_wedged;
-
-		err = __cancel_active1(&data);
-		if (err)
-			goto err_wedged;
-
-		err = __cancel_queued(&data);
-		if (err)
-			goto err_wedged;
-
-		err = __cancel_hostile(&data);
-		if (err)
-			goto err_wedged;
-	}
-
-	err = 0;
-err_client_b:
-	preempt_client_fini(&data.b);
-err_client_a:
-	preempt_client_fini(&data.a);
-	return err;
-
-err_wedged:
-	GEM_TRACE_DUMP();
-	igt_spinner_end(&data.b.spin);
-	igt_spinner_end(&data.a.spin);
-	intel_gt_set_wedged(gt);
-	goto err_client_b;
-}
-
-static int live_suppress_self_preempt(void *arg)
-{
-	struct intel_gt *gt = arg;
-	struct intel_engine_cs *engine;
-	struct i915_sched_attr attr = {
-		.priority = I915_USER_PRIORITY(I915_PRIORITY_MAX)
-	};
-	struct preempt_client a, b;
-	enum intel_engine_id id;
-	int err = -ENOMEM;
-
-	/*
-	 * Verify that if a preemption request does not cause a change in
-	 * the current execution order, the preempt-to-idle injection is
-	 * skipped and that we do not accidentally apply it after the CS
-	 * completion event.
-	 */
-
-	if (!HAS_LOGICAL_RING_PREEMPTION(gt->i915))
-		return 0;
-
-	if (USES_GUC_SUBMISSION(gt->i915))
-		return 0; /* presume black box */
-
-	if (intel_vgpu_active(gt->i915))
-		return 0; /* GVT forces single port & request submission */
-
-	if (preempt_client_init(gt, &a))
-		return -ENOMEM;
-	if (preempt_client_init(gt, &b))
-		goto err_client_a;
-
-	for_each_engine(engine, gt, id) {
-		struct i915_request *rq_a, *rq_b;
-		int depth;
-
-		if (!intel_engine_has_preemption(engine))
-			continue;
-
-		if (igt_flush_test(gt->i915))
-			goto err_wedged;
-
-		intel_engine_pm_get(engine);
-		engine->execlists.preempt_hang.count = 0;
-
-		rq_a = spinner_create_request(&a.spin,
-					      a.ctx, engine,
-					      MI_NOOP);
-		if (IS_ERR(rq_a)) {
-			err = PTR_ERR(rq_a);
-			intel_engine_pm_put(engine);
-			goto err_client_b;
-		}
-
-		i915_request_add(rq_a);
-		if (!igt_wait_for_spinner(&a.spin, rq_a)) {
-			pr_err("First client failed to start\n");
-			intel_engine_pm_put(engine);
-			goto err_wedged;
-		}
-
-		/* Keep postponing the timer to avoid premature slicing */
-		mod_timer(&engine->execlists.timer, jiffies + HZ);
-		for (depth = 0; depth < 8; depth++) {
-			rq_b = spinner_create_request(&b.spin,
-						      b.ctx, engine,
-						      MI_NOOP);
-			if (IS_ERR(rq_b)) {
-				err = PTR_ERR(rq_b);
-				intel_engine_pm_put(engine);
-				goto err_client_b;
-			}
-			i915_request_add(rq_b);
-
-			GEM_BUG_ON(i915_request_completed(rq_a));
-			engine->schedule(rq_a, &attr);
-			igt_spinner_end(&a.spin);
-
-			if (!igt_wait_for_spinner(&b.spin, rq_b)) {
-				pr_err("Second client failed to start\n");
-				intel_engine_pm_put(engine);
-				goto err_wedged;
-			}
-
-			swap(a, b);
-			rq_a = rq_b;
-		}
-		igt_spinner_end(&a.spin);
-
-		if (engine->execlists.preempt_hang.count) {
-			pr_err("Preemption on %s recorded x%d, depth %d; should have been suppressed!\n",
-			       engine->name,
-			       engine->execlists.preempt_hang.count,
-			       depth);
-			intel_engine_pm_put(engine);
-			err = -EINVAL;
-			goto err_client_b;
-		}
-
-		intel_engine_pm_put(engine);
-		if (igt_flush_test(gt->i915))
-			goto err_wedged;
-	}
-
-	err = 0;
-err_client_b:
-	preempt_client_fini(&b);
-err_client_a:
-	preempt_client_fini(&a);
-	return err;
-
-err_wedged:
-	igt_spinner_end(&b.spin);
-	igt_spinner_end(&a.spin);
-	intel_gt_set_wedged(gt);
-	err = -EIO;
-	goto err_client_b;
-}
-
-static int __i915_sw_fence_call
-dummy_notify(struct i915_sw_fence *fence, enum i915_sw_fence_notify state)
-{
-	return NOTIFY_DONE;
-}
-
-static struct i915_request *dummy_request(struct intel_engine_cs *engine)
-{
-	struct i915_request *rq;
-
-	rq = kzalloc(sizeof(*rq), GFP_KERNEL);
-	if (!rq)
-		return NULL;
-
-	rq->engine = engine;
-
-	spin_lock_init(&rq->lock);
-	INIT_LIST_HEAD(&rq->fence.cb_list);
-	rq->fence.lock = &rq->lock;
-	rq->fence.ops = &i915_fence_ops;
-
-	i915_sched_node_init(&rq->sched);
-
-	/* mark this request as permanently incomplete */
-	rq->fence.seqno = 1;
-	BUILD_BUG_ON(sizeof(rq->fence.seqno) != 8); /* upper 32b == 0 */
-	rq->hwsp_seqno = (u32 *)&rq->fence.seqno + 1;
-	GEM_BUG_ON(i915_request_completed(rq));
-
-	i915_sw_fence_init(&rq->submit, dummy_notify);
-	set_bit(I915_FENCE_FLAG_ACTIVE, &rq->fence.flags);
-
-	return rq;
-}
-
-static void dummy_request_free(struct i915_request *dummy)
-{
-	/* We have to fake the CS interrupt to kick the next request */
-	i915_sw_fence_commit(&dummy->submit);
-
-	i915_request_mark_complete(dummy);
-	dma_fence_signal(&dummy->fence);
-
-	i915_sched_node_fini(&dummy->sched);
-	i915_sw_fence_fini(&dummy->submit);
-
-	dma_fence_free(&dummy->fence);
-}
-
-static int live_suppress_wait_preempt(void *arg)
-{
-	struct intel_gt *gt = arg;
-	struct preempt_client client[4];
-	struct i915_request *rq[ARRAY_SIZE(client)] = {};
-	struct intel_engine_cs *engine;
-	enum intel_engine_id id;
-	int err = -ENOMEM;
-	int i;
-
-	/*
-	 * Waiters are given a little priority nudge, but not enough
-	 * to actually cause any preemption. Double check that we do
-	 * not needlessly generate preempt-to-idle cycles.
-	 */
-
-	if (!HAS_LOGICAL_RING_PREEMPTION(gt->i915))
-		return 0;
-
-	if (preempt_client_init(gt, &client[0])) /* ELSP[0] */
-		return -ENOMEM;
-	if (preempt_client_init(gt, &client[1])) /* ELSP[1] */
-		goto err_client_0;
-	if (preempt_client_init(gt, &client[2])) /* head of queue */
-		goto err_client_1;
-	if (preempt_client_init(gt, &client[3])) /* bystander */
-		goto err_client_2;
-
-	for_each_engine(engine, gt, id) {
-		int depth;
-
-		if (!intel_engine_has_preemption(engine))
-			continue;
-
-		if (!engine->emit_init_breadcrumb)
-			continue;
-
-		for (depth = 0; depth < ARRAY_SIZE(client); depth++) {
-			struct i915_request *dummy;
-
-			engine->execlists.preempt_hang.count = 0;
-
-			dummy = dummy_request(engine);
-			if (!dummy)
-				goto err_client_3;
-
-			for (i = 0; i < ARRAY_SIZE(client); i++) {
-				struct i915_request *this;
-
-				this = spinner_create_request(&client[i].spin,
-							      client[i].ctx, engine,
-							      MI_NOOP);
-				if (IS_ERR(this)) {
-					err = PTR_ERR(this);
-					goto err_wedged;
-				}
-
-				/* Disable NEWCLIENT promotion */
-				__i915_active_fence_set(&i915_request_timeline(this)->last_request,
-							&dummy->fence);
-
-				rq[i] = i915_request_get(this);
-				i915_request_add(this);
-			}
-
-			dummy_request_free(dummy);
-
-			GEM_BUG_ON(i915_request_completed(rq[0]));
-			if (!igt_wait_for_spinner(&client[0].spin, rq[0])) {
-				pr_err("%s: First client failed to start\n",
-				       engine->name);
-				goto err_wedged;
-			}
-			GEM_BUG_ON(!i915_request_started(rq[0]));
-
-			if (i915_request_wait(rq[depth],
-					      I915_WAIT_PRIORITY,
-					      1) != -ETIME) {
-				pr_err("%s: Waiter depth:%d completed!\n",
-				       engine->name, depth);
-				goto err_wedged;
-			}
-
-			for (i = 0; i < ARRAY_SIZE(client); i++) {
-				igt_spinner_end(&client[i].spin);
-				i915_request_put(rq[i]);
-				rq[i] = NULL;
-			}
-
-			if (igt_flush_test(gt->i915))
-				goto err_wedged;
-
-			if (engine->execlists.preempt_hang.count) {
-				pr_err("%s: Preemption recorded x%d, depth %d; should have been suppressed!\n",
-				       engine->name,
-				       engine->execlists.preempt_hang.count,
-				       depth);
-				err = -EINVAL;
-				goto err_client_3;
-			}
-		}
-	}
-
-	err = 0;
-err_client_3:
-	preempt_client_fini(&client[3]);
-err_client_2:
-	preempt_client_fini(&client[2]);
-err_client_1:
-	preempt_client_fini(&client[1]);
-err_client_0:
-	preempt_client_fini(&client[0]);
-	return err;
-
-err_wedged:
-	for (i = 0; i < ARRAY_SIZE(client); i++) {
-		igt_spinner_end(&client[i].spin);
-		i915_request_put(rq[i]);
-	}
-	intel_gt_set_wedged(gt);
-	err = -EIO;
-	goto err_client_3;
-}
-
-static int live_chain_preempt(void *arg)
-{
-	struct intel_gt *gt = arg;
-	struct intel_engine_cs *engine;
-	struct preempt_client hi, lo;
-	enum intel_engine_id id;
-	int err = -ENOMEM;
-
-	/*
-	 * Build a chain AB...BA between two contexts (A, B) and request
-	 * preemption of the last request. It should then complete before
-	 * the previously submitted spinner in B.
-	 */
-
-	if (!HAS_LOGICAL_RING_PREEMPTION(gt->i915))
-		return 0;
-
-	if (preempt_client_init(gt, &hi))
-		return -ENOMEM;
-
-	if (preempt_client_init(gt, &lo))
-		goto err_client_hi;
-
-	for_each_engine(engine, gt, id) {
-		struct i915_sched_attr attr = {
-			.priority = I915_USER_PRIORITY(I915_PRIORITY_MAX),
-		};
-		struct igt_live_test t;
-		struct i915_request *rq;
-		int ring_size, count, i;
-
-		if (!intel_engine_has_preemption(engine))
-			continue;
-
-		rq = spinner_create_request(&lo.spin,
-					    lo.ctx, engine,
-					    MI_ARB_CHECK);
-		if (IS_ERR(rq))
-			goto err_wedged;
-
-		i915_request_get(rq);
-		i915_request_add(rq);
-
-		ring_size = rq->wa_tail - rq->head;
-		if (ring_size < 0)
-			ring_size += rq->ring->size;
-		ring_size = rq->ring->size / ring_size;
-		pr_debug("%s(%s): Using maximum of %d requests\n",
-			 __func__, engine->name, ring_size);
-
-		igt_spinner_end(&lo.spin);
-		if (i915_request_wait(rq, 0, HZ / 2) < 0) {
-			pr_err("Timed out waiting to flush %s\n", engine->name);
-			i915_request_put(rq);
-			goto err_wedged;
-		}
-		i915_request_put(rq);
-
-		if (igt_live_test_begin(&t, gt->i915, __func__, engine->name)) {
-			err = -EIO;
-			goto err_wedged;
-		}
-
-		for_each_prime_number_from(count, 1, ring_size) {
-			rq = spinner_create_request(&hi.spin,
-						    hi.ctx, engine,
-						    MI_ARB_CHECK);
-			if (IS_ERR(rq))
-				goto err_wedged;
-			i915_request_add(rq);
-			if (!igt_wait_for_spinner(&hi.spin, rq))
-				goto err_wedged;
-
-			rq = spinner_create_request(&lo.spin,
-						    lo.ctx, engine,
-						    MI_ARB_CHECK);
-			if (IS_ERR(rq))
-				goto err_wedged;
-			i915_request_add(rq);
-
-			for (i = 0; i < count; i++) {
-				rq = igt_request_alloc(lo.ctx, engine);
-				if (IS_ERR(rq))
-					goto err_wedged;
-				i915_request_add(rq);
-			}
-
-			rq = igt_request_alloc(hi.ctx, engine);
-			if (IS_ERR(rq))
-				goto err_wedged;
-
-			i915_request_get(rq);
-			i915_request_add(rq);
-			engine->schedule(rq, &attr);
-
-			igt_spinner_end(&hi.spin);
-			if (i915_request_wait(rq, 0, HZ / 5) < 0) {
-				struct drm_printer p =
-					drm_info_printer(gt->i915->drm.dev);
-
-				pr_err("Failed to preempt over chain of %d\n",
-				       count);
-				intel_engine_dump(engine, &p,
-						  "%s\n", engine->name);
-				i915_request_put(rq);
-				goto err_wedged;
-			}
-			igt_spinner_end(&lo.spin);
-			i915_request_put(rq);
-
-			rq = igt_request_alloc(lo.ctx, engine);
-			if (IS_ERR(rq))
-				goto err_wedged;
-
-			i915_request_get(rq);
-			i915_request_add(rq);
-
-			if (i915_request_wait(rq, 0, HZ / 5) < 0) {
-				struct drm_printer p =
-					drm_info_printer(gt->i915->drm.dev);
-
-				pr_err("Failed to flush low priority chain of %d requests\n",
-				       count);
-				intel_engine_dump(engine, &p,
-						  "%s\n", engine->name);
-
-				i915_request_put(rq);
-				goto err_wedged;
-			}
-			i915_request_put(rq);
-		}
-
-		if (igt_live_test_end(&t)) {
-			err = -EIO;
-			goto err_wedged;
-		}
-	}
-
-	err = 0;
-err_client_lo:
-	preempt_client_fini(&lo);
-err_client_hi:
-	preempt_client_fini(&hi);
-	return err;
-
-err_wedged:
-	igt_spinner_end(&hi.spin);
-	igt_spinner_end(&lo.spin);
-	intel_gt_set_wedged(gt);
-	err = -EIO;
-	goto err_client_lo;
-}
-
-static int create_gang(struct intel_engine_cs *engine,
-		       struct i915_request **prev)
-{
-	struct drm_i915_gem_object *obj;
-	struct intel_context *ce;
-	struct i915_request *rq;
-	struct i915_vma *vma;
-	u32 *cs;
-	int err;
-
-	ce = intel_context_create(engine->kernel_context->gem_context, engine);
-	if (IS_ERR(ce))
-		return PTR_ERR(ce);
-
-	obj = i915_gem_object_create_internal(engine->i915, 4096);
-	if (IS_ERR(obj)) {
-		err = PTR_ERR(obj);
-		goto err_ce;
-	}
-
-	vma = i915_vma_instance(obj, ce->vm, NULL);
-	if (IS_ERR(vma)) {
-		err = PTR_ERR(vma);
-		goto err_obj;
-	}
-
-	err = i915_vma_pin(vma, 0, 0, PIN_USER);
-	if (err)
-		goto err_obj;
-
-	cs = i915_gem_object_pin_map(obj, I915_MAP_WC);
-	if (IS_ERR(cs)) {
-		err = PTR_ERR(cs);
-		goto err_obj;
-	}
-
-	/* Semaphore target: spin until zero */
-	*cs++ = MI_ARB_ON_OFF | MI_ARB_ENABLE;
-
-	*cs++ = MI_SEMAPHORE_WAIT |
-		MI_SEMAPHORE_POLL |
-		MI_SEMAPHORE_SAD_EQ_SDD;
-	*cs++ = 0;
-	*cs++ = lower_32_bits(vma->node.start);
-	*cs++ = upper_32_bits(vma->node.start);
-
-	if (*prev) {
-		u64 offset = (*prev)->batch->node.start;
-
-		/* Terminate the spinner in the next lower priority batch. */
-		*cs++ = MI_STORE_DWORD_IMM_GEN4;
-		*cs++ = lower_32_bits(offset);
-		*cs++ = upper_32_bits(offset);
-		*cs++ = 0;
-	}
-
-	*cs++ = MI_BATCH_BUFFER_END;
-	i915_gem_object_flush_map(obj);
-	i915_gem_object_unpin_map(obj);
-
-	rq = intel_context_create_request(ce);
-	if (IS_ERR(rq)) {
-		err = PTR_ERR(rq);
-		goto err_obj;
-	}
-
-	rq->batch = vma;
-	i915_request_get(rq);
-
-	i915_vma_lock(vma);
-	err = i915_request_await_object(rq, vma->obj, false);
-	if (!err)
-		err = i915_vma_move_to_active(vma, rq, 0);
-	if (!err)
-		err = rq->engine->emit_bb_start(rq,
-						vma->node.start,
-						PAGE_SIZE, 0);
-	i915_vma_unlock(vma);
-	i915_request_add(rq);
-	if (err)
-		goto err_rq;
-
-	i915_gem_object_put(obj);
-	intel_context_put(ce);
-
-	rq->client_link.next = &(*prev)->client_link;
-	*prev = rq;
-	return 0;
-
-err_rq:
-	i915_request_put(rq);
-err_obj:
-	i915_gem_object_put(obj);
-err_ce:
-	intel_context_put(ce);
-	return err;
-}
-
-static int live_preempt_gang(void *arg)
-{
-	struct intel_gt *gt = arg;
-	struct intel_engine_cs *engine;
-	enum intel_engine_id id;
-
-	if (!HAS_LOGICAL_RING_PREEMPTION(gt->i915))
-		return 0;
-
-	/*
-	 * Build as long a chain of preempters as we can, with each
-	 * request higher priority than the last. Once we are ready, we release
-	 * the last batch which then percolates down the chain, each releasing
-	 * the next oldest in turn. The intent is to simply push as hard as we
-	 * can with the number of preemptions, trying to exceed narrow HW
-	 * limits. At a minimum, we insist that we can sort all the user
-	 * high priority levels into execution order.
-	 */
-
-	for_each_engine(engine, gt, id) {
-		struct i915_request *rq = NULL;
-		struct igt_live_test t;
-		IGT_TIMEOUT(end_time);
-		int prio = 0;
-		int err = 0;
-		u32 *cs;
-
-		if (!intel_engine_has_preemption(engine))
-			continue;
-
-		if (igt_live_test_begin(&t, gt->i915, __func__, engine->name))
-			return -EIO;
-
-		do {
-			struct i915_sched_attr attr = {
-				.priority = I915_USER_PRIORITY(prio++),
-			};
-
-			err = create_gang(engine, &rq);
-			if (err)
-				break;
-
-			/* Submit each spinner at increasing priority */
-			engine->schedule(rq, &attr);
-
-			if (prio <= I915_PRIORITY_MAX)
-				continue;
-
-			if (prio > (INT_MAX >> I915_USER_PRIORITY_SHIFT))
-				break;
-
-			if (__igt_timeout(end_time, NULL))
-				break;
-		} while (1);
-		pr_debug("%s: Preempt chain of %d requests\n",
-			 engine->name, prio);
-
-		/*
-		 * Such that the last spinner is the highest priority and
-		 * should execute first. When that spinner completes,
-		 * it will terminate the next lowest spinner until there
-		 * are no more spinners and the gang is complete.
-		 */
-		cs = i915_gem_object_pin_map(rq->batch->obj, I915_MAP_WC);
-		if (!IS_ERR(cs)) {
-			*cs = 0;
-			i915_gem_object_unpin_map(rq->batch->obj);
-		} else {
-			err = PTR_ERR(cs);
-			intel_gt_set_wedged(gt);
-		}
-
-		while (rq) { /* wait for each rq from highest to lowest prio */
-			struct i915_request *n =
-				list_next_entry(rq, client_link);
-
-			if (err == 0 && i915_request_wait(rq, 0, HZ / 5) < 0) {
-				struct drm_printer p =
-					drm_info_printer(engine->i915->drm.dev);
-
-				pr_err("Failed to flush chain of %d requests, at %d\n",
-				       prio, rq_prio(rq) >> I915_USER_PRIORITY_SHIFT);
-				intel_engine_dump(engine, &p,
-						  "%s\n", engine->name);
-
-				err = -ETIME;
-			}
-
-			i915_request_put(rq);
-			rq = n;
-		}
-
-		if (igt_live_test_end(&t))
-			err = -EIO;
-		if (err)
-			return err;
-	}
-
-	return 0;
-}
-
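A quick picture of the gang assembled above may help; this is only an illustration of how the batches chain together, not code from the patch:

	/*
	 * Gang layout built by create_gang() and released in live_preempt_gang():
	 *
	 *   batch[N]   (prio N, newest)   spins on its own semaphore until the
	 *                                 CPU writes 0 into it (the *cs = 0 above)
	 *   batch[N-1] (prio N-1)         spins until batch[N] stores 0 into its
	 *                                 semaphore just before ending
	 *   ...
	 *   batch[0]   (prio 0, oldest)   released last, by batch[1]
	 *
	 * Once kicked, the requests therefore retire strictly from highest to
	 * lowest priority, which is exactly the order live_preempt_gang() waits
	 * on them.
	 */
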
-static int live_preempt_hang(void *arg)
-{
-	struct intel_gt *gt = arg;
-	struct i915_gem_context *ctx_hi, *ctx_lo;
-	struct igt_spinner spin_hi, spin_lo;
-	struct intel_engine_cs *engine;
-	enum intel_engine_id id;
-	int err = -ENOMEM;
-
-	if (!HAS_LOGICAL_RING_PREEMPTION(gt->i915))
-		return 0;
-
-	if (!intel_has_reset_engine(gt))
-		return 0;
-
-	if (igt_spinner_init(&spin_hi, gt))
-		return -ENOMEM;
-
-	if (igt_spinner_init(&spin_lo, gt))
-		goto err_spin_hi;
-
-	ctx_hi = kernel_context(gt->i915);
-	if (!ctx_hi)
-		goto err_spin_lo;
-	ctx_hi->sched.priority =
-		I915_USER_PRIORITY(I915_CONTEXT_MAX_USER_PRIORITY);
-
-	ctx_lo = kernel_context(gt->i915);
-	if (!ctx_lo)
-		goto err_ctx_hi;
-	ctx_lo->sched.priority =
-		I915_USER_PRIORITY(I915_CONTEXT_MIN_USER_PRIORITY);
-
-	for_each_engine(engine, gt, id) {
-		struct i915_request *rq;
-
-		if (!intel_engine_has_preemption(engine))
-			continue;
-
-		rq = spinner_create_request(&spin_lo, ctx_lo, engine,
-					    MI_ARB_CHECK);
-		if (IS_ERR(rq)) {
-			err = PTR_ERR(rq);
-			goto err_ctx_lo;
-		}
-
-		i915_request_add(rq);
-		if (!igt_wait_for_spinner(&spin_lo, rq)) {
-			GEM_TRACE("lo spinner failed to start\n");
-			GEM_TRACE_DUMP();
-			intel_gt_set_wedged(gt);
-			err = -EIO;
-			goto err_ctx_lo;
-		}
-
-		rq = spinner_create_request(&spin_hi, ctx_hi, engine,
-					    MI_ARB_CHECK);
-		if (IS_ERR(rq)) {
-			igt_spinner_end(&spin_lo);
-			err = PTR_ERR(rq);
-			goto err_ctx_lo;
-		}
-
-		init_completion(&engine->execlists.preempt_hang.completion);
-		engine->execlists.preempt_hang.inject_hang = true;
-
-		i915_request_add(rq);
-
-		if (!wait_for_completion_timeout(&engine->execlists.preempt_hang.completion,
-						 HZ / 10)) {
-			pr_err("Preemption did not occur within timeout!");
-			GEM_TRACE_DUMP();
-			intel_gt_set_wedged(gt);
-			err = -EIO;
-			goto err_ctx_lo;
-		}
-
-		set_bit(I915_RESET_ENGINE + id, &gt->reset.flags);
-		intel_engine_reset(engine, NULL);
-		clear_bit(I915_RESET_ENGINE + id, &gt->reset.flags);
-
-		engine->execlists.preempt_hang.inject_hang = false;
-
-		if (!igt_wait_for_spinner(&spin_hi, rq)) {
-			GEM_TRACE("hi spinner failed to start\n");
-			GEM_TRACE_DUMP();
-			intel_gt_set_wedged(gt);
-			err = -EIO;
-			goto err_ctx_lo;
-		}
-
-		igt_spinner_end(&spin_hi);
-		igt_spinner_end(&spin_lo);
-		if (igt_flush_test(gt->i915)) {
-			err = -EIO;
-			goto err_ctx_lo;
-		}
-	}
-
-	err = 0;
-err_ctx_lo:
-	kernel_context_close(ctx_lo);
-err_ctx_hi:
-	kernel_context_close(ctx_hi);
-err_spin_lo:
-	igt_spinner_fini(&spin_lo);
-err_spin_hi:
-	igt_spinner_fini(&spin_hi);
-	return err;
-}
-
-static int live_preempt_timeout(void *arg)
-{
-	struct intel_gt *gt = arg;
-	struct i915_gem_context *ctx_hi, *ctx_lo;
-	struct igt_spinner spin_lo;
-	struct intel_engine_cs *engine;
-	enum intel_engine_id id;
-	int err = -ENOMEM;
-
-	/*
-	 * Check that we force preemption to occur by cancelling the previous
-	 * context if it refuses to yield the GPU.
-	 */
-	if (!IS_ACTIVE(CONFIG_DRM_I915_PREEMPT_TIMEOUT))
-		return 0;
-
-	if (!HAS_LOGICAL_RING_PREEMPTION(gt->i915))
-		return 0;
-
-	if (!intel_has_reset_engine(gt))
-		return 0;
-
-	if (igt_spinner_init(&spin_lo, gt))
-		return -ENOMEM;
-
-	ctx_hi = kernel_context(gt->i915);
-	if (!ctx_hi)
-		goto err_spin_lo;
-	ctx_hi->sched.priority =
-		I915_USER_PRIORITY(I915_CONTEXT_MAX_USER_PRIORITY);
-
-	ctx_lo = kernel_context(gt->i915);
-	if (!ctx_lo)
-		goto err_ctx_hi;
-	ctx_lo->sched.priority =
-		I915_USER_PRIORITY(I915_CONTEXT_MIN_USER_PRIORITY);
-
-	for_each_engine(engine, gt, id) {
-		unsigned long saved_timeout;
-		struct i915_request *rq;
-
-		if (!intel_engine_has_preemption(engine))
-			continue;
-
-		rq = spinner_create_request(&spin_lo, ctx_lo, engine,
-					    MI_NOOP); /* preemption disabled */
-		if (IS_ERR(rq)) {
-			err = PTR_ERR(rq);
-			goto err_ctx_lo;
-		}
-
-		i915_request_add(rq);
-		if (!igt_wait_for_spinner(&spin_lo, rq)) {
-			intel_gt_set_wedged(gt);
-			err = -EIO;
-			goto err_ctx_lo;
-		}
-
-		rq = igt_request_alloc(ctx_hi, engine);
-		if (IS_ERR(rq)) {
-			igt_spinner_end(&spin_lo);
-			err = PTR_ERR(rq);
-			goto err_ctx_lo;
-		}
-
-		/* Flush the previous CS ack before changing timeouts */
-		while (READ_ONCE(engine->execlists.pending[0]))
-			cpu_relax();
-
-		saved_timeout = engine->props.preempt_timeout_ms;
-		engine->props.preempt_timeout_ms = 1; /* in ms, -> 1 jiffie */
-
-		i915_request_get(rq);
-		i915_request_add(rq);
-
-		intel_engine_flush_submission(engine);
-		engine->props.preempt_timeout_ms = saved_timeout;
-
-		if (i915_request_wait(rq, 0, HZ / 10) < 0) {
-			intel_gt_set_wedged(gt);
-			i915_request_put(rq);
-			err = -ETIME;
-			goto err_ctx_lo;
-		}
-
-		igt_spinner_end(&spin_lo);
-		i915_request_put(rq);
-	}
-
-	err = 0;
-err_ctx_lo:
-	kernel_context_close(ctx_lo);
-err_ctx_hi:
-	kernel_context_close(ctx_hi);
-err_spin_lo:
-	igt_spinner_fini(&spin_lo);
-	return err;
-}
-
-static int random_range(struct rnd_state *rnd, int min, int max)
-{
-	return i915_prandom_u32_max_state(max - min, rnd) + min;
-}
-
-static int random_priority(struct rnd_state *rnd)
-{
-	return random_range(rnd, I915_PRIORITY_MIN, I915_PRIORITY_MAX);
-}
-
-struct preempt_smoke {
-	struct intel_gt *gt;
-	struct i915_gem_context **contexts;
-	struct intel_engine_cs *engine;
-	struct drm_i915_gem_object *batch;
-	unsigned int ncontext;
-	struct rnd_state prng;
-	unsigned long count;
-};
-
-static struct i915_gem_context *smoke_context(struct preempt_smoke *smoke)
-{
-	return smoke->contexts[i915_prandom_u32_max_state(smoke->ncontext,
-							  &smoke->prng)];
-}
-
-static int smoke_submit(struct preempt_smoke *smoke,
-			struct i915_gem_context *ctx, int prio,
-			struct drm_i915_gem_object *batch)
-{
-	struct i915_request *rq;
-	struct i915_vma *vma = NULL;
-	int err = 0;
-
-	if (batch) {
-		struct i915_address_space *vm;
-
-		vm = i915_gem_context_get_vm_rcu(ctx);
-		vma = i915_vma_instance(batch, vm, NULL);
-		i915_vm_put(vm);
-		if (IS_ERR(vma))
-			return PTR_ERR(vma);
-
-		err = i915_vma_pin(vma, 0, 0, PIN_USER);
-		if (err)
-			return err;
-	}
-
-	ctx->sched.priority = prio;
-
-	rq = igt_request_alloc(ctx, smoke->engine);
-	if (IS_ERR(rq)) {
-		err = PTR_ERR(rq);
-		goto unpin;
-	}
-
-	if (vma) {
-		i915_vma_lock(vma);
-		err = i915_request_await_object(rq, vma->obj, false);
-		if (!err)
-			err = i915_vma_move_to_active(vma, rq, 0);
-		if (!err)
-			err = rq->engine->emit_bb_start(rq,
-							vma->node.start,
-							PAGE_SIZE, 0);
-		i915_vma_unlock(vma);
-	}
-
-	i915_request_add(rq);
-
-unpin:
-	if (vma)
-		i915_vma_unpin(vma);
-
-	return err;
-}
-
-static int smoke_crescendo_thread(void *arg)
-{
-	struct preempt_smoke *smoke = arg;
-	IGT_TIMEOUT(end_time);
-	unsigned long count;
-
-	count = 0;
-	do {
-		struct i915_gem_context *ctx = smoke_context(smoke);
-		int err;
-
-		err = smoke_submit(smoke,
-				   ctx, count % I915_PRIORITY_MAX,
-				   smoke->batch);
-		if (err)
-			return err;
-
-		count++;
-	} while (!__igt_timeout(end_time, NULL));
-
-	smoke->count = count;
-	return 0;
-}
-
-static int smoke_crescendo(struct preempt_smoke *smoke, unsigned int flags)
-#define BATCH BIT(0)
-{
-	struct task_struct *tsk[I915_NUM_ENGINES] = {};
-	struct preempt_smoke arg[I915_NUM_ENGINES];
-	struct intel_engine_cs *engine;
-	enum intel_engine_id id;
-	unsigned long count;
-	int err = 0;
-
-	for_each_engine(engine, smoke->gt, id) {
-		arg[id] = *smoke;
-		arg[id].engine = engine;
-		if (!(flags & BATCH))
-			arg[id].batch = NULL;
-		arg[id].count = 0;
-
-		tsk[id] = kthread_run(smoke_crescendo_thread, &arg[id],
-				      "igt/smoke:%d", id);
-		if (IS_ERR(tsk[id])) {
-			err = PTR_ERR(tsk[id]);
-			break;
-		}
-		get_task_struct(tsk[id]);
-	}
-
-	yield(); /* start all threads before we kthread_stop() */
-
-	count = 0;
-	for_each_engine(engine, smoke->gt, id) {
-		int status;
-
-		if (IS_ERR_OR_NULL(tsk[id]))
-			continue;
-
-		status = kthread_stop(tsk[id]);
-		if (status && !err)
-			err = status;
-
-		count += arg[id].count;
-
-		put_task_struct(tsk[id]);
-	}
-
-	pr_info("Submitted %lu crescendo:%x requests across %d engines and %d contexts\n",
-		count, flags,
-		RUNTIME_INFO(smoke->gt->i915)->num_engines, smoke->ncontext);
-	return err;
-}
-
-static int smoke_random(struct preempt_smoke *smoke, unsigned int flags)
-{
-	enum intel_engine_id id;
-	IGT_TIMEOUT(end_time);
-	unsigned long count;
-
-	count = 0;
-	do {
-		for_each_engine(smoke->engine, smoke->gt, id) {
-			struct i915_gem_context *ctx = smoke_context(smoke);
-			int err;
-
-			err = smoke_submit(smoke,
-					   ctx, random_priority(&smoke->prng),
-					   flags & BATCH ? smoke->batch : NULL);
-			if (err)
-				return err;
-
-			count++;
-		}
-	} while (!__igt_timeout(end_time, NULL));
-
-	pr_info("Submitted %lu random:%x requests across %d engines and %d contexts\n",
-		count, flags,
-		RUNTIME_INFO(smoke->gt->i915)->num_engines, smoke->ncontext);
-	return 0;
-}
-
-static int live_preempt_smoke(void *arg)
-{
-	struct preempt_smoke smoke = {
-		.gt = arg,
-		.prng = I915_RND_STATE_INITIALIZER(i915_selftest.random_seed),
-		.ncontext = 1024,
-	};
-	const unsigned int phase[] = { 0, BATCH };
-	struct igt_live_test t;
-	int err = -ENOMEM;
-	u32 *cs;
-	int n;
-
-	if (!HAS_LOGICAL_RING_PREEMPTION(smoke.gt->i915))
-		return 0;
-
-	smoke.contexts = kmalloc_array(smoke.ncontext,
-				       sizeof(*smoke.contexts),
-				       GFP_KERNEL);
-	if (!smoke.contexts)
-		return -ENOMEM;
-
-	smoke.batch =
-		i915_gem_object_create_internal(smoke.gt->i915, PAGE_SIZE);
-	if (IS_ERR(smoke.batch)) {
-		err = PTR_ERR(smoke.batch);
-		goto err_free;
-	}
-
-	cs = i915_gem_object_pin_map(smoke.batch, I915_MAP_WB);
-	if (IS_ERR(cs)) {
-		err = PTR_ERR(cs);
-		goto err_batch;
-	}
-	for (n = 0; n < PAGE_SIZE / sizeof(*cs) - 1; n++)
-		cs[n] = MI_ARB_CHECK;
-	cs[n] = MI_BATCH_BUFFER_END;
-	i915_gem_object_flush_map(smoke.batch);
-	i915_gem_object_unpin_map(smoke.batch);
-
-	if (igt_live_test_begin(&t, smoke.gt->i915, __func__, "all")) {
-		err = -EIO;
-		goto err_batch;
-	}
-
-	for (n = 0; n < smoke.ncontext; n++) {
-		smoke.contexts[n] = kernel_context(smoke.gt->i915);
-		if (!smoke.contexts[n])
-			goto err_ctx;
-	}
-
-	for (n = 0; n < ARRAY_SIZE(phase); n++) {
-		err = smoke_crescendo(&smoke, phase[n]);
-		if (err)
-			goto err_ctx;
-
-		err = smoke_random(&smoke, phase[n]);
-		if (err)
-			goto err_ctx;
-	}
-
-err_ctx:
-	if (igt_live_test_end(&t))
-		err = -EIO;
-
-	for (n = 0; n < smoke.ncontext; n++) {
-		if (!smoke.contexts[n])
-			break;
-		kernel_context_close(smoke.contexts[n]);
-	}
-
-err_batch:
-	i915_gem_object_put(smoke.batch);
-err_free:
-	kfree(smoke.contexts);
-
-	return err;
-}
-
-static int nop_virtual_engine(struct intel_gt *gt,
-			      struct intel_engine_cs **siblings,
-			      unsigned int nsibling,
-			      unsigned int nctx,
-			      unsigned int flags)
-#define CHAIN BIT(0)
-{
-	IGT_TIMEOUT(end_time);
-	struct i915_request *request[16] = {};
-	struct i915_gem_context *ctx[16];
-	struct intel_context *ve[16];
-	unsigned long n, prime, nc;
-	struct igt_live_test t;
-	ktime_t times[2] = {};
-	int err;
-
-	GEM_BUG_ON(!nctx || nctx > ARRAY_SIZE(ctx));
-
-	for (n = 0; n < nctx; n++) {
-		ctx[n] = kernel_context(gt->i915);
-		if (!ctx[n]) {
-			err = -ENOMEM;
-			nctx = n;
-			goto out;
-		}
-
-		ve[n] = intel_virtual_engine_create(ctx[n], siblings, nsibling);
-		if (IS_ERR(ve[n])) {
-			kernel_context_close(ctx[n]);
-			err = PTR_ERR(ve[n]);
-			nctx = n;
-			goto out;
-		}
-
-		err = intel_context_pin(ve[n]);
-		if (err) {
-			intel_context_put(ve[n]);
-			kernel_context_close(ctx[n]);
-			nctx = n;
-			goto out;
-		}
-	}
-
-	err = igt_live_test_begin(&t, gt->i915, __func__, ve[0]->engine->name);
-	if (err)
-		goto out;
-
-	for_each_prime_number_from(prime, 1, 8192) {
-		times[1] = ktime_get_raw();
-
-		if (flags & CHAIN) {
-			for (nc = 0; nc < nctx; nc++) {
-				for (n = 0; n < prime; n++) {
-					struct i915_request *rq;
-
-					rq = i915_request_create(ve[nc]);
-					if (IS_ERR(rq)) {
-						err = PTR_ERR(rq);
-						goto out;
-					}
-
-					if (request[nc])
-						i915_request_put(request[nc]);
-					request[nc] = i915_request_get(rq);
-					i915_request_add(rq);
-				}
-			}
-		} else {
-			for (n = 0; n < prime; n++) {
-				for (nc = 0; nc < nctx; nc++) {
-					struct i915_request *rq;
-
-					rq = i915_request_create(ve[nc]);
-					if (IS_ERR(rq)) {
-						err = PTR_ERR(rq);
-						goto out;
-					}
-
-					if (request[nc])
-						i915_request_put(request[nc]);
-					request[nc] = i915_request_get(rq);
-					i915_request_add(rq);
-				}
-			}
-		}
-
-		for (nc = 0; nc < nctx; nc++) {
-			if (i915_request_wait(request[nc], 0, HZ / 10) < 0) {
-				pr_err("%s(%s): wait for %llx:%lld timed out\n",
-				       __func__, ve[0]->engine->name,
-				       request[nc]->fence.context,
-				       request[nc]->fence.seqno);
-
-				GEM_TRACE("%s(%s) failed at request %llx:%lld\n",
-					  __func__, ve[0]->engine->name,
-					  request[nc]->fence.context,
-					  request[nc]->fence.seqno);
-				GEM_TRACE_DUMP();
-				intel_gt_set_wedged(gt);
-				break;
-			}
-		}
-
-		times[1] = ktime_sub(ktime_get_raw(), times[1]);
-		if (prime == 1)
-			times[0] = times[1];
-
-		for (nc = 0; nc < nctx; nc++) {
-			i915_request_put(request[nc]);
-			request[nc] = NULL;
-		}
-
-		if (__igt_timeout(end_time, NULL))
-			break;
-	}
-
-	err = igt_live_test_end(&t);
-	if (err)
-		goto out;
-
-	pr_info("Requestx%d latencies on %s: 1 = %lluns, %lu = %lluns\n",
-		nctx, ve[0]->engine->name, ktime_to_ns(times[0]),
-		prime, div64_u64(ktime_to_ns(times[1]), prime));
-
-out:
-	if (igt_flush_test(gt->i915))
-		err = -EIO;
-
-	for (nc = 0; nc < nctx; nc++) {
-		i915_request_put(request[nc]);
-		intel_context_unpin(ve[nc]);
-		intel_context_put(ve[nc]);
-		kernel_context_close(ctx[nc]);
-	}
-	return err;
-}
-
-static int live_virtual_engine(void *arg)
-{
-	struct intel_gt *gt = arg;
-	struct intel_engine_cs *siblings[MAX_ENGINE_INSTANCE + 1];
-	struct intel_engine_cs *engine;
-	enum intel_engine_id id;
-	unsigned int class, inst;
-	int err;
-
-	if (USES_GUC_SUBMISSION(gt->i915))
-		return 0;
-
-	for_each_engine(engine, gt, id) {
-		err = nop_virtual_engine(gt, &engine, 1, 1, 0);
-		if (err) {
-			pr_err("Failed to wrap engine %s: err=%d\n",
-			       engine->name, err);
-			return err;
-		}
-	}
-
-	for (class = 0; class <= MAX_ENGINE_CLASS; class++) {
-		int nsibling, n;
-
-		nsibling = 0;
-		for (inst = 0; inst <= MAX_ENGINE_INSTANCE; inst++) {
-			if (!gt->engine_class[class][inst])
-				continue;
-
-			siblings[nsibling++] = gt->engine_class[class][inst];
-		}
-		if (nsibling < 2)
-			continue;
-
-		for (n = 1; n <= nsibling + 1; n++) {
-			err = nop_virtual_engine(gt, siblings, nsibling,
-						 n, 0);
-			if (err)
-				return err;
-		}
-
-		err = nop_virtual_engine(gt, siblings, nsibling, n, CHAIN);
-		if (err)
-			return err;
-	}
-
-	return 0;
-}
-
-static int mask_virtual_engine(struct intel_gt *gt,
-			       struct intel_engine_cs **siblings,
-			       unsigned int nsibling)
-{
-	struct i915_request *request[MAX_ENGINE_INSTANCE + 1];
-	struct i915_gem_context *ctx;
-	struct intel_context *ve;
-	struct igt_live_test t;
-	unsigned int n;
-	int err;
-
-	/*
-	 * Check that by setting the execution mask on a request, we can
-	 * restrict it to our desired engine within the virtual engine.
-	 */
-
-	ctx = kernel_context(gt->i915);
-	if (!ctx)
-		return -ENOMEM;
-
-	ve = intel_virtual_engine_create(ctx, siblings, nsibling);
-	if (IS_ERR(ve)) {
-		err = PTR_ERR(ve);
-		goto out_close;
-	}
-
-	err = intel_context_pin(ve);
-	if (err)
-		goto out_put;
-
-	err = igt_live_test_begin(&t, gt->i915, __func__, ve->engine->name);
-	if (err)
-		goto out_unpin;
-
-	for (n = 0; n < nsibling; n++) {
-		request[n] = i915_request_create(ve);
-		if (IS_ERR(request[n])) {
-			err = PTR_ERR(request[n]);
-			nsibling = n;
-			goto out;
-		}
-
-		/* Reverse order as it's more likely to be unnatural */
-		request[n]->execution_mask = siblings[nsibling - n - 1]->mask;
-
-		i915_request_get(request[n]);
-		i915_request_add(request[n]);
-	}
-
-	for (n = 0; n < nsibling; n++) {
-		if (i915_request_wait(request[n], 0, HZ / 10) < 0) {
-			pr_err("%s(%s): wait for %llx:%lld timed out\n",
-			       __func__, ve->engine->name,
-			       request[n]->fence.context,
-			       request[n]->fence.seqno);
-
-			GEM_TRACE("%s(%s) failed at request %llx:%lld\n",
-				  __func__, ve->engine->name,
-				  request[n]->fence.context,
-				  request[n]->fence.seqno);
-			GEM_TRACE_DUMP();
-			intel_gt_set_wedged(gt);
-			err = -EIO;
-			goto out;
-		}
-
-		if (request[n]->engine != siblings[nsibling - n - 1]) {
-			pr_err("Executed on wrong sibling '%s', expected '%s'\n",
-			       request[n]->engine->name,
-			       siblings[nsibling - n - 1]->name);
-			err = -EINVAL;
-			goto out;
-		}
-	}
-
-	err = igt_live_test_end(&t);
-out:
-	if (igt_flush_test(gt->i915))
-		err = -EIO;
-
-	for (n = 0; n < nsibling; n++)
-		i915_request_put(request[n]);
-
-out_unpin:
-	intel_context_unpin(ve);
-out_put:
-	intel_context_put(ve);
-out_close:
-	kernel_context_close(ctx);
-	return err;
-}
-
-static int live_virtual_mask(void *arg)
-{
-	struct intel_gt *gt = arg;
-	struct intel_engine_cs *siblings[MAX_ENGINE_INSTANCE + 1];
-	unsigned int class, inst;
-	int err;
-
-	if (USES_GUC_SUBMISSION(gt->i915))
-		return 0;
-
-	for (class = 0; class <= MAX_ENGINE_CLASS; class++) {
-		unsigned int nsibling;
-
-		nsibling = 0;
-		for (inst = 0; inst <= MAX_ENGINE_INSTANCE; inst++) {
-			if (!gt->engine_class[class][inst])
-				break;
-
-			siblings[nsibling++] = gt->engine_class[class][inst];
-		}
-		if (nsibling < 2)
-			continue;
-
-		err = mask_virtual_engine(gt, siblings, nsibling);
-		if (err)
-			return err;
-	}
-
-	return 0;
-}
-
-static int preserved_virtual_engine(struct intel_gt *gt,
-				    struct intel_engine_cs **siblings,
-				    unsigned int nsibling)
-{
-	struct i915_request *last = NULL;
-	struct i915_gem_context *ctx;
-	struct intel_context *ve;
-	struct i915_vma *scratch;
-	struct igt_live_test t;
-	unsigned int n;
-	int err = 0;
-	u32 *cs;
-
-	ctx = kernel_context(gt->i915);
-	if (!ctx)
-		return -ENOMEM;
-
-	scratch = create_scratch(siblings[0]->gt);
-	if (IS_ERR(scratch)) {
-		err = PTR_ERR(scratch);
-		goto out_close;
-	}
-
-	ve = intel_virtual_engine_create(ctx, siblings, nsibling);
-	if (IS_ERR(ve)) {
-		err = PTR_ERR(ve);
-		goto out_scratch;
-	}
-
-	err = intel_context_pin(ve);
-	if (err)
-		goto out_put;
-
-	err = igt_live_test_begin(&t, gt->i915, __func__, ve->engine->name);
-	if (err)
-		goto out_unpin;
-
-	for (n = 0; n < NUM_GPR_DW; n++) {
-		struct intel_engine_cs *engine = siblings[n % nsibling];
-		struct i915_request *rq;
-
-		rq = i915_request_create(ve);
-		if (IS_ERR(rq)) {
-			err = PTR_ERR(rq);
-			goto out_end;
-		}
-
-		i915_request_put(last);
-		last = i915_request_get(rq);
-
-		cs = intel_ring_begin(rq, 8);
-		if (IS_ERR(cs)) {
-			i915_request_add(rq);
-			err = PTR_ERR(cs);
-			goto out_end;
-		}
-
-		*cs++ = MI_STORE_REGISTER_MEM_GEN8 | MI_USE_GGTT;
-		*cs++ = CS_GPR(engine, n);
-		*cs++ = i915_ggtt_offset(scratch) + n * sizeof(u32);
-		*cs++ = 0;
-
-		*cs++ = MI_LOAD_REGISTER_IMM(1);
-		*cs++ = CS_GPR(engine, (n + 1) % NUM_GPR_DW);
-		*cs++ = n + 1;
-
-		*cs++ = MI_NOOP;
-		intel_ring_advance(rq, cs);
-
-		/* Restrict this request to run on a particular engine */
-		rq->execution_mask = engine->mask;
-		i915_request_add(rq);
-	}
-
-	if (i915_request_wait(last, 0, HZ / 5) < 0) {
-		err = -ETIME;
-		goto out_end;
-	}
-
-	cs = i915_gem_object_pin_map(scratch->obj, I915_MAP_WB);
-	if (IS_ERR(cs)) {
-		err = PTR_ERR(cs);
-		goto out_end;
-	}
-
-	for (n = 0; n < NUM_GPR_DW; n++) {
-		if (cs[n] != n) {
-			pr_err("Incorrect value[%d] found for GPR[%d]\n",
-			       cs[n], n);
-			err = -EINVAL;
-			break;
-		}
-	}
-
-	i915_gem_object_unpin_map(scratch->obj);
-
-out_end:
-	if (igt_live_test_end(&t))
-		err = -EIO;
-	i915_request_put(last);
-out_unpin:
-	intel_context_unpin(ve);
-out_put:
-	intel_context_put(ve);
-out_scratch:
-	i915_vma_unpin_and_release(&scratch, 0);
-out_close:
-	kernel_context_close(ctx);
-	return err;
-}
-
-static int live_virtual_preserved(void *arg)
-{
-	struct intel_gt *gt = arg;
-	struct intel_engine_cs *siblings[MAX_ENGINE_INSTANCE + 1];
-	unsigned int class, inst;
-
-	/*
-	 * Check that the context image retains non-privileged (user) registers
-	 * from one engine to the next. For this we check that the CS_GPR
-	 * are preserved.
-	 */
-
-	if (USES_GUC_SUBMISSION(gt->i915))
-		return 0;
-
-	/* As we use CS_GPR we cannot run before they existed on all engines. */
-	if (INTEL_GEN(gt->i915) < 9)
-		return 0;
-
-	for (class = 0; class <= MAX_ENGINE_CLASS; class++) {
-		int nsibling, err;
-
-		nsibling = 0;
-		for (inst = 0; inst <= MAX_ENGINE_INSTANCE; inst++) {
-			if (!gt->engine_class[class][inst])
-				continue;
-
-			siblings[nsibling++] = gt->engine_class[class][inst];
-		}
-		if (nsibling < 2)
-			continue;
-
-		err = preserved_virtual_engine(gt, siblings, nsibling);
-		if (err)
-			return err;
-	}
-
-	return 0;
-}
-
-static int bond_virtual_engine(struct intel_gt *gt,
-			       unsigned int class,
-			       struct intel_engine_cs **siblings,
-			       unsigned int nsibling,
-			       unsigned int flags)
-#define BOND_SCHEDULE BIT(0)
-{
-	struct intel_engine_cs *master;
-	struct i915_gem_context *ctx;
-	struct i915_request *rq[16];
-	enum intel_engine_id id;
-	struct igt_spinner spin;
-	unsigned long n;
-	int err;
-
-	/*
-	 * A set of bonded requests is intended to be run concurrently
-	 * across a number of engines. We use one request per-engine
-	 * and a magic fence to schedule each of the bonded requests
-	 * at the same time. A consequence of our current scheduler is that
-	 * we only move requests to the HW ready queue when the request
-	 * becomes ready, that is when all of its prerequisite fences have
-	 * been signaled. As one of those fences is the master submit fence,
-	 * there is a delay on all secondary fences as the HW may be
-	 * currently busy. Equally, as all the requests are independent,
-	 * they may have other fences that delay individual request
-	 * submission to HW. Ergo, we do not guarantee that all requests are
-	 * immediately submitted to HW at the same time, just that if the
-	 * rules are abided by, they are ready at the same time as the
-	 * first is submitted. Userspace can embed semaphores in its batch
-	 * to ensure parallel execution of its phases as it requires.
-	 * Though naturally it gets requested that perhaps the scheduler should
-	 * take care of parallel execution, even across preemption events on
-	 * different HW. (The proper answer is of course "lalalala".)
-	 *
-	 * With the submit-fence, we have identified three possible phases
-	 * of synchronisation depending on the master fence: queued (not
-	 * ready), executing, and signaled. The first two are quite simple
-	 * and checked below. However, the signaled master fence handling is
-	 * contentious. Currently we do not distinguish between a signaled
-	 * fence and an expired fence, as once signaled it does not convey
-	 * any information about the previous execution. It may even be freed
-	 * and hence checking later it may not exist at all. Ergo we currently
-	 * do not apply the bonding constraint for an already signaled fence,
-	 * as our expectation is that it should not constrain the secondaries
-	 * and is outside of the scope of the bonded request API (i.e. all
-	 * userspace requests are meant to be running in parallel). As
-	 * it imposes no constraint, and is effectively a no-op, we do not
-	 * check below as normal execution flows are checked extensively above.
-	 *
-	 * XXX Is the degenerate handling of signaled submit fences the
-	 * expected behaviour for userspace?
-	 */
-
-	GEM_BUG_ON(nsibling >= ARRAY_SIZE(rq) - 1);
-
-	if (igt_spinner_init(&spin, gt))
-		return -ENOMEM;
-
-	ctx = kernel_context(gt->i915);
-	if (!ctx) {
-		err = -ENOMEM;
-		goto err_spin;
-	}
-
-	err = 0;
-	rq[0] = ERR_PTR(-ENOMEM);
-	for_each_engine(master, gt, id) {
-		struct i915_sw_fence fence = {};
-
-		if (master->class == class)
-			continue;
-
-		memset_p((void *)rq, ERR_PTR(-EINVAL), ARRAY_SIZE(rq));
-
-		rq[0] = spinner_create_request(&spin, ctx, master, MI_NOOP);
-		if (IS_ERR(rq[0])) {
-			err = PTR_ERR(rq[0]);
-			goto out;
-		}
-		i915_request_get(rq[0]);
-
-		if (flags & BOND_SCHEDULE) {
-			onstack_fence_init(&fence);
-			err = i915_sw_fence_await_sw_fence_gfp(&rq[0]->submit,
-							       &fence,
-							       GFP_KERNEL);
-		}
-
-		i915_request_add(rq[0]);
-		if (err < 0)
-			goto out;
-
-		if (!(flags & BOND_SCHEDULE) &&
-		    !igt_wait_for_spinner(&spin, rq[0])) {
-			err = -EIO;
-			goto out;
-		}
-
-		for (n = 0; n < nsibling; n++) {
-			struct intel_context *ve;
-
-			ve = intel_virtual_engine_create(ctx,
-							 siblings,
-							 nsibling);
-			if (IS_ERR(ve)) {
-				err = PTR_ERR(ve);
-				onstack_fence_fini(&fence);
-				goto out;
-			}
-
-			err = intel_virtual_engine_attach_bond(ve->engine,
-							       master,
-							       siblings[n]);
-			if (err) {
-				intel_context_put(ve);
-				onstack_fence_fini(&fence);
-				goto out;
-			}
-
-			err = intel_context_pin(ve);
-			intel_context_put(ve);
-			if (err) {
-				onstack_fence_fini(&fence);
-				goto out;
-			}
-
-			rq[n + 1] = i915_request_create(ve);
-			intel_context_unpin(ve);
-			if (IS_ERR(rq[n + 1])) {
-				err = PTR_ERR(rq[n + 1]);
-				onstack_fence_fini(&fence);
-				goto out;
-			}
-			i915_request_get(rq[n + 1]);
-
-			err = i915_request_await_execution(rq[n + 1],
-							   &rq[0]->fence,
-							   ve->engine->bond_execute);
-			i915_request_add(rq[n + 1]);
-			if (err < 0) {
-				onstack_fence_fini(&fence);
-				goto out;
-			}
-		}
-		onstack_fence_fini(&fence);
-		intel_engine_flush_submission(master);
-		igt_spinner_end(&spin);
-
-		if (i915_request_wait(rq[0], 0, HZ / 10) < 0) {
-			pr_err("Master request did not execute (on %s)!\n",
-			       rq[0]->engine->name);
-			err = -EIO;
-			goto out;
-		}
-
-		for (n = 0; n < nsibling; n++) {
-			if (i915_request_wait(rq[n + 1], 0,
-					      MAX_SCHEDULE_TIMEOUT) < 0) {
-				err = -EIO;
-				goto out;
-			}
-
-			if (rq[n + 1]->engine != siblings[n]) {
-				pr_err("Bonded request did not execute on target engine: expected %s, used %s; master was %s\n",
-				       siblings[n]->name,
-				       rq[n + 1]->engine->name,
-				       rq[0]->engine->name);
-				err = -EINVAL;
-				goto out;
-			}
-		}
-
-		for (n = 0; !IS_ERR(rq[n]); n++)
-			i915_request_put(rq[n]);
-		rq[0] = ERR_PTR(-ENOMEM);
-	}
-
-out:
-	for (n = 0; !IS_ERR(rq[n]); n++)
-		i915_request_put(rq[n]);
-	if (igt_flush_test(gt->i915))
-		err = -EIO;
-
-	kernel_context_close(ctx);
-err_spin:
-	igt_spinner_fini(&spin);
-	return err;
-}
-
-static int live_virtual_bond(void *arg)
-{
-	static const struct phase {
-		const char *name;
-		unsigned int flags;
-	} phases[] = {
-		{ "", 0 },
-		{ "schedule", BOND_SCHEDULE },
-		{ },
-	};
-	struct intel_gt *gt = arg;
-	struct intel_engine_cs *siblings[MAX_ENGINE_INSTANCE + 1];
-	unsigned int class, inst;
-	int err;
-
-	if (USES_GUC_SUBMISSION(gt->i915))
-		return 0;
-
-	for (class = 0; class <= MAX_ENGINE_CLASS; class++) {
-		const struct phase *p;
-		int nsibling;
-
-		nsibling = 0;
-		for (inst = 0; inst <= MAX_ENGINE_INSTANCE; inst++) {
-			if (!gt->engine_class[class][inst])
-				break;
-
-			GEM_BUG_ON(nsibling == ARRAY_SIZE(siblings));
-			siblings[nsibling++] = gt->engine_class[class][inst];
-		}
-		if (nsibling < 2)
-			continue;
-
-		for (p = phases; p->name; p++) {
-			err = bond_virtual_engine(gt,
-						  class, siblings, nsibling,
-						  p->flags);
-			if (err) {
-				pr_err("%s(%s): failed class=%d, nsibling=%d, err=%d\n",
-				       __func__, p->name, class, nsibling, err);
-				return err;
-			}
-		}
-	}
-
-	return 0;
-}
-
-int intel_execlists_live_selftests(struct drm_i915_private *i915)
-{
-	static const struct i915_subtest tests[] = {
-		SUBTEST(live_sanitycheck),
-		SUBTEST(live_unlite_switch),
-		SUBTEST(live_unlite_preempt),
-		SUBTEST(live_timeslice_preempt),
-		SUBTEST(live_timeslice_queue),
-		SUBTEST(live_busywait_preempt),
-		SUBTEST(live_preempt),
-		SUBTEST(live_late_preempt),
-		SUBTEST(live_nopreempt),
-		SUBTEST(live_preempt_cancel),
-		SUBTEST(live_suppress_self_preempt),
-		SUBTEST(live_suppress_wait_preempt),
-		SUBTEST(live_chain_preempt),
-		SUBTEST(live_preempt_gang),
-		SUBTEST(live_preempt_hang),
-		SUBTEST(live_preempt_timeout),
-		SUBTEST(live_preempt_smoke),
-		SUBTEST(live_virtual_engine),
-		SUBTEST(live_virtual_mask),
-		SUBTEST(live_virtual_preserved),
-		SUBTEST(live_virtual_bond),
-	};
-
-	if (!HAS_EXECLISTS(i915))
-		return 0;
-
-	if (intel_gt_is_wedged(&i915->gt))
-		return 0;
-
-	return intel_gt_live_subtests(tests, &i915->gt);
-}
-
 static void hexdump(const void *buf, size_t len)
 {
 	const size_t rowsize = 8 * sizeof(u32);
@@ -3670,7 +341,7 @@ static int live_lrc_state(void *arg)
 	if (!fixme)
 		return -ENOMEM;
 
-	scratch = create_scratch(gt);
+	scratch = igt_create_scratch(gt);
 	if (IS_ERR(scratch)) {
 		err = PTR_ERR(scratch);
 		goto out_close;
@@ -3813,7 +484,7 @@ static int live_gpr_clear(void *arg)
 	if (!fixme)
 		return -ENOMEM;
 
-	scratch = create_scratch(gt);
+	scratch = igt_create_scratch(gt);
 	if (IS_ERR(scratch)) {
 		err = PTR_ERR(scratch);
 		goto out_close;
diff --git a/drivers/gpu/drm/i915/gt/selftest_mocs.c b/drivers/gpu/drm/i915/gt/selftest_mocs.c
index de010f527757..48057a7b6e52 100644
--- a/drivers/gpu/drm/i915/gt/selftest_mocs.c
+++ b/drivers/gpu/drm/i915/gt/selftest_mocs.c
@@ -7,6 +7,7 @@
 #include "gt/intel_engine_pm.h"
 #include "i915_selftest.h"
 
+#include "gem/selftests/igt_gem_utils.h"
 #include "gem/selftests/mock_context.h"
 #include "selftests/igt_reset.h"
 #include "selftests/igt_spinner.h"
@@ -41,33 +42,6 @@ static int request_add_spin(struct i915_request *rq, struct igt_spinner *spin)
 	return err;
 }
 
-static struct i915_vma *create_scratch(struct intel_gt *gt)
-{
-	struct drm_i915_gem_object *obj;
-	struct i915_vma *vma;
-	int err;
-
-	obj = i915_gem_object_create_internal(gt->i915, PAGE_SIZE);
-	if (IS_ERR(obj))
-		return ERR_CAST(obj);
-
-	i915_gem_object_set_cache_coherency(obj, I915_CACHING_CACHED);
-
-	vma = i915_vma_instance(obj, &gt->ggtt->vm, NULL);
-	if (IS_ERR(vma)) {
-		i915_gem_object_put(obj);
-		return vma;
-	}
-
-	err = i915_vma_pin(vma, 0, 0, PIN_GLOBAL);
-	if (err) {
-		i915_gem_object_put(obj);
-		return ERR_PTR(err);
-	}
-
-	return vma;
-}
-
 static int live_mocs_init(struct live_mocs *arg, struct intel_gt *gt)
 {
 	int err;
@@ -75,7 +49,7 @@ static int live_mocs_init(struct live_mocs *arg, struct intel_gt *gt)
 	if (!get_mocs_settings(gt->i915, &arg->table))
 		return -EINVAL;
 
-	arg->scratch = create_scratch(gt);
+	arg->scratch = igt_create_scratch(gt);
 	if (IS_ERR(arg->scratch))
 		return PTR_ERR(arg->scratch);
 
-- 
2.23.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [Intel-gfx] [RFC 5/5] drm/i915: introduce intel_execlists_submission.<c/h>
  2019-12-11 21:12 [Intel-gfx] [RFC 0/5] Split up intel_lrc.c Daniele Ceraolo Spurio
                   ` (3 preceding siblings ...)
  2019-12-11 21:12 ` [Intel-gfx] [RFC 4/5] drm/i915: move execlists selftests to their own file Daniele Ceraolo Spurio
@ 2019-12-11 21:12 ` Daniele Ceraolo Spurio
  2019-12-11 21:31   ` Chris Wilson
  2019-12-12  1:27 ` [Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for Split up intel_lrc.c Patchwork
                   ` (2 subsequent siblings)
  7 siblings, 1 reply; 21+ messages in thread
From: Daniele Ceraolo Spurio @ 2019-12-11 21:12 UTC (permalink / raw)
  To: intel-gfx

Split out all the code related to the execlists submission flow into its
own file, keeping it separate from the general context management code,
since the latter will be re-used by the GuC submission flow.

Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
---
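A minimal sketch of the boundary this split aims for, with a hypothetical
helper name but using the two entry points defined later in this file: the
backend only installs its engine hooks, while the lr_context helpers stay
in intel_lrc.c for any backend to share.

static void example_install_execlists_backend(struct intel_engine_cs *engine)
{
	/* scheduler/fence entry point for queuing ready requests */
	engine->submit_request = execlists_submit_request;

	/* CSB processing and ELSP dequeue run from the softirq tasklet */
	tasklet_init(&engine->execlists.tasklet,
		     execlists_submission_tasklet, (unsigned long)engine);
}

A GuC backend would provide its own equivalents of these two hooks while
reusing the same context management code.
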
 drivers/gpu/drm/i915/Makefile                 |    1 +
 drivers/gpu/drm/i915/gt/intel_engine_cs.c     |    1 +
 .../drm/i915/gt/intel_execlists_submission.c  | 2485 ++++++++++++++++
 .../drm/i915/gt/intel_execlists_submission.h  |   58 +
 drivers/gpu/drm/i915/gt/intel_lrc.c           | 2511 +----------------
 drivers/gpu/drm/i915/gt/intel_lrc.h           |   34 +-
 .../gpu/drm/i915/gt/intel_virtual_engine.c    |    1 +
 drivers/gpu/drm/i915/gt/selftest_execlists.c  |    2 +-
 drivers/gpu/drm/i915/gt/selftest_lrc.c        |    2 +-
 .../gpu/drm/i915/gt/uc/intel_guc_submission.c |    1 +
 drivers/gpu/drm/i915/gvt/scheduler.c          |    1 +
 drivers/gpu/drm/i915/i915_perf.c              |    1 +
 12 files changed, 2584 insertions(+), 2514 deletions(-)
 create mode 100644 drivers/gpu/drm/i915/gt/intel_execlists_submission.c
 create mode 100644 drivers/gpu/drm/i915/gt/intel_execlists_submission.h

diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
index 79f5ef5acd4c..3640e0436c97 100644
--- a/drivers/gpu/drm/i915/Makefile
+++ b/drivers/gpu/drm/i915/Makefile
@@ -82,6 +82,7 @@ gt-y += \
 	gt/intel_engine_pm.o \
 	gt/intel_engine_pool.o \
 	gt/intel_engine_user.o \
+	gt/intel_execlists_submission.o \
 	gt/intel_gt.o \
 	gt/intel_gt_irq.o \
 	gt/intel_gt_pm.o \
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
index 49473c25916c..0a23d01b7589 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
@@ -33,6 +33,7 @@
 #include "intel_engine_pm.h"
 #include "intel_engine_pool.h"
 #include "intel_engine_user.h"
+#include "intel_execlists_submission.h"
 #include "intel_gt.h"
 #include "intel_gt_requests.h"
 #include "intel_lrc.h"
diff --git a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
new file mode 100644
index 000000000000..76b878bf15ad
--- /dev/null
+++ b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
@@ -0,0 +1,2485 @@
+/*
+ * SPDX-License-Identifier: MIT
+ *
+ * Copyright © 2019 Intel Corporation
+ */
+
+#include <linux/interrupt.h>
+
+#include "gem/i915_gem_context.h"
+
+#include "i915_drv.h"
+#include "i915_perf.h"
+#include "i915_trace.h"
+#include "i915_vgpu.h"
+#include "intel_engine_pm.h"
+#include "intel_gt.h"
+#include "intel_gt_pm.h"
+#include "intel_gt_requests.h"
+#include "intel_lrc_reg.h"
+#include "intel_mocs.h"
+#include "intel_reset.h"
+#include "intel_ring.h"
+#include "intel_virtual_engine.h"
+#include "intel_workarounds.h"
+#include "intel_execlists_submission.h"
+
+#define RING_EXECLIST_QFULL		(1 << 0x2)
+#define RING_EXECLIST1_VALID		(1 << 0x3)
+#define RING_EXECLIST0_VALID		(1 << 0x4)
+#define RING_EXECLIST_ACTIVE_STATUS	(3 << 0xE)
+#define RING_EXECLIST1_ACTIVE		(1 << 0x11)
+#define RING_EXECLIST0_ACTIVE		(1 << 0x12)
+
+#define GEN8_CTX_STATUS_IDLE_ACTIVE	(1 << 0)
+#define GEN8_CTX_STATUS_PREEMPTED	(1 << 1)
+#define GEN8_CTX_STATUS_ELEMENT_SWITCH	(1 << 2)
+#define GEN8_CTX_STATUS_ACTIVE_IDLE	(1 << 3)
+#define GEN8_CTX_STATUS_COMPLETE	(1 << 4)
+#define GEN8_CTX_STATUS_LITE_RESTORE	(1 << 15)
+
+#define GEN8_CTX_STATUS_COMPLETED_MASK \
+	 (GEN8_CTX_STATUS_COMPLETE | GEN8_CTX_STATUS_PREEMPTED)
+
+#define CTX_DESC_FORCE_RESTORE BIT_ULL(2)
+
+#define GEN12_CTX_STATUS_SWITCHED_TO_NEW_QUEUE	(0x1) /* lower csb dword */
+#define GEN12_CTX_SWITCH_DETAIL(csb_dw)	((csb_dw) & 0xF) /* upper csb dword */
+#define GEN12_CSB_SW_CTX_ID_MASK		GENMASK(25, 15)
+#define GEN12_IDLE_CTX_ID		0x7FF
+#define GEN12_CSB_CTX_VALID(csb_dw) \
+	(FIELD_GET(GEN12_CSB_SW_CTX_ID_MASK, csb_dw) != GEN12_IDLE_CTX_ID)
+
+/* Typical size of the average request (2 pipecontrols and a MI_BB) */
+#define EXECLISTS_REQUEST_SIZE 64 /* bytes */
+
+static void mark_eio(struct i915_request *rq)
+{
+	if (i915_request_completed(rq))
+		return;
+
+	GEM_BUG_ON(i915_request_signaled(rq));
+
+	dma_fence_set_error(&rq->fence, -EIO);
+	i915_request_mark_complete(rq);
+}
+
+static struct i915_request *
+active_request(const struct intel_timeline * const tl, struct i915_request *rq)
+{
+	struct i915_request *active = rq;
+
+	rcu_read_lock();
+	list_for_each_entry_continue_reverse(rq, &tl->requests, link) {
+		if (i915_request_completed(rq))
+			break;
+
+		active = rq;
+	}
+	rcu_read_unlock();
+
+	return active;
+}
+
+static inline void
+ring_set_paused(const struct intel_engine_cs *engine, int state)
+{
+	/*
+	 * We inspect HWS_PREEMPT with a semaphore inside
+	 * engine->emit_fini_breadcrumb. If the dword is true,
+	 * the ring is paused as the semaphore will busywait
+	 * until the dword is false.
+	 */
+	engine->status_page.addr[I915_GEM_HWS_PREEMPT] = state;
+	if (state)
+		wmb();
+}
+
+static inline struct i915_priolist *to_priolist(struct rb_node *rb)
+{
+	return rb_entry(rb, struct i915_priolist, node);
+}
+
+static inline int rq_prio(const struct i915_request *rq)
+{
+	return rq->sched.attr.priority;
+}
+
+static int effective_prio(const struct i915_request *rq)
+{
+	int prio = rq_prio(rq);
+
+	/*
+	 * If this request is special and must not be interrupted at any
+	 * cost, so be it. Note we are only checking the most recent request
+	 * in the context and so may be masking an earlier vip request. It
+	 * is hoped that under the conditions where nopreempt is used, this
+	 * will not matter (i.e. all requests to that context will be
+	 * nopreempt for as long as desired).
+	 */
+	if (i915_request_has_nopreempt(rq))
+		prio = I915_PRIORITY_UNPREEMPTABLE;
+
+	/*
+	 * On unwinding the active request, we give it a priority bump
+	 * if it has completed waiting on any semaphore. If we know that
+	 * the request has already started, we can prevent an unwanted
+	 * preempt-to-idle cycle by taking that into account now.
+	 */
+	if (__i915_request_has_started(rq))
+		prio |= I915_PRIORITY_NOSEMAPHORE;
+
+	/* Restrict mere WAIT boosts from triggering preemption */
+	BUILD_BUG_ON(__NO_PREEMPTION & ~I915_PRIORITY_MASK); /* only internal */
+	return prio | __NO_PREEMPTION;
+}
+
+static int queue_prio(const struct intel_engine_execlists *execlists)
+{
+	struct i915_priolist *p;
+	struct rb_node *rb;
+
+	rb = rb_first_cached(&execlists->queue);
+	if (!rb)
+		return INT_MIN;
+
+	/*
+	 * As the priolist[] are inverted, with the highest priority in [0],
+	 * we have to flip the index value to become priority.
+	 */
+	p = to_priolist(rb);
+	return ((p->priority + 1) << I915_USER_PRIORITY_SHIFT) - ffs(p->used);
+}
+
+static inline bool need_preempt(const struct intel_engine_cs *engine,
+				const struct i915_request *rq,
+				struct rb_node *rb)
+{
+	int last_prio;
+
+	if (!intel_engine_has_semaphores(engine))
+		return false;
+
+	/*
+	 * Check if the current priority hint merits a preemption attempt.
+	 *
+	 * We record the highest value priority we saw during rescheduling
+	 * prior to this dequeue, therefore we know that if it is strictly
+	 * less than the current tail of ELSP[0], we do not need to force
+	 * a preempt-to-idle cycle.
+	 *
+	 * However, the priority hint is a mere hint that we may need to
+	 * preempt. If that hint is stale or we may be trying to preempt
+	 * ourselves, ignore the request.
+	 *
+	 * More naturally we would write
+	 *      prio >= max(0, last);
+	 * except that we wish to prevent triggering preemption at the same
+	 * priority level: the task that is running should remain running
+	 * to preserve FIFO ordering of dependencies.
+	 */
+	last_prio = max(effective_prio(rq), I915_PRIORITY_NORMAL - 1);
+	if (engine->execlists.queue_priority_hint <= last_prio)
+		return false;
+
+	/*
+	 * Check against the first request in ELSP[1], it will, thanks to the
+	 * power of PI, be the highest priority of that context.
+	 */
+	if (!list_is_last(&rq->sched.link, &engine->active.requests) &&
+	    rq_prio(list_next_entry(rq, sched.link)) > last_prio)
+		return true;
+
+	if (rb) {
+		struct intel_virtual_engine *ve =
+			rb_entry(rb, typeof(*ve), nodes[engine->id].rb);
+		bool preempt = false;
+
+		if (engine == ve->siblings[0]) { /* only preempt one sibling */
+			struct i915_request *next;
+
+			rcu_read_lock();
+			next = READ_ONCE(ve->request);
+			if (next)
+				preempt = rq_prio(next) > last_prio;
+			rcu_read_unlock();
+		}
+
+		if (preempt)
+			return preempt;
+	}
+
+	/*
+	 * If the inflight context did not trigger the preemption, then maybe
+	 * it was the set of queued requests? Pick the highest priority in
+	 * the queue (the first active priolist) and see if it deserves to be
+	 * running instead of ELSP[0].
+	 *
+	 * The highest priority request in the queue can not be either
+	 * ELSP[0] or ELSP[1] as, thanks again to PI, if it was the same
+	 * context, its priority would not exceed ELSP[0] aka last_prio.
+	 */
+	return queue_prio(&engine->execlists) > last_prio;
+}
+
+__maybe_unused static inline bool
+assert_priority_queue(const struct i915_request *prev,
+		      const struct i915_request *next)
+{
+	/*
+	 * Without preemption, the prev may refer to the still active element
+	 * which we refuse to let go.
+	 *
+	 * Even with preemption, there are times when we think it is better not
+	 * to preempt and leave an ostensibly lower priority request in flight.
+	 */
+	if (i915_request_is_active(prev))
+		return true;
+
+	return rq_prio(prev) >= rq_prio(next);
+}
+
+static struct i915_request *
+__unwind_incomplete_requests(struct intel_engine_cs *engine)
+{
+	struct i915_request *rq, *rn, *active = NULL;
+	struct list_head *uninitialized_var(pl);
+	int prio = I915_PRIORITY_INVALID;
+
+	lockdep_assert_held(&engine->active.lock);
+
+	list_for_each_entry_safe_reverse(rq, rn,
+					 &engine->active.requests,
+					 sched.link) {
+		if (i915_request_completed(rq))
+			continue; /* XXX */
+
+		__i915_request_unsubmit(rq);
+
+		/*
+		 * Push the request back into the queue for later resubmission.
+		 * If this request is not native to this physical engine (i.e.
+		 * it came from a virtual source), push it back onto the virtual
+		 * engine so that it can be moved across onto another physical
+		 * engine as load dictates.
+		 */
+		if (likely(rq->execution_mask == engine->mask)) {
+			GEM_BUG_ON(rq_prio(rq) == I915_PRIORITY_INVALID);
+			if (rq_prio(rq) != prio) {
+				prio = rq_prio(rq);
+				pl = i915_sched_lookup_priolist(engine, prio);
+			}
+			GEM_BUG_ON(RB_EMPTY_ROOT(&engine->execlists.queue.rb_root));
+
+			list_move(&rq->sched.link, pl);
+			active = rq;
+		} else {
+			struct intel_engine_cs *owner = rq->hw_context->engine;
+
+			/*
+			 * Decouple the virtual breadcrumb before moving it
+			 * back to the virtual engine -- we don't want the
+			 * request to complete in the background and try
+			 * and cancel the breadcrumb on the virtual engine
+			 * (instead of the old engine where it is linked)!
+			 */
+			if (test_bit(DMA_FENCE_FLAG_ENABLE_SIGNAL_BIT,
+				     &rq->fence.flags)) {
+				spin_lock_nested(&rq->lock,
+						 SINGLE_DEPTH_NESTING);
+				i915_request_cancel_breadcrumb(rq);
+				spin_unlock(&rq->lock);
+			}
+			rq->engine = owner;
+			owner->submit_request(rq);
+			active = NULL;
+		}
+	}
+
+	return active;
+}
+
+struct i915_request *
+execlists_unwind_incomplete_requests(struct intel_engine_execlists *execlists)
+{
+	struct intel_engine_cs *engine =
+		container_of(execlists, typeof(*engine), execlists);
+
+	return __unwind_incomplete_requests(engine);
+}
+
+static inline void
+execlists_context_status_change(struct i915_request *rq, unsigned long status)
+{
+	/*
+	 * the compiler should eliminate this function as dead-code.
+	 * The compiler should eliminate this function as dead-code.
+	 */
+	if (!IS_ENABLED(CONFIG_DRM_I915_GVT))
+		return;
+
+	atomic_notifier_call_chain(&rq->engine->context_status_notifier,
+				   status, rq);
+}
+
+static void intel_engine_context_in(struct intel_engine_cs *engine)
+{
+	unsigned long flags;
+
+	if (READ_ONCE(engine->stats.enabled) == 0)
+		return;
+
+	write_seqlock_irqsave(&engine->stats.lock, flags);
+
+	if (engine->stats.enabled > 0) {
+		if (engine->stats.active++ == 0)
+			engine->stats.start = ktime_get();
+		GEM_BUG_ON(engine->stats.active == 0);
+	}
+
+	write_sequnlock_irqrestore(&engine->stats.lock, flags);
+}
+
+static void intel_engine_context_out(struct intel_engine_cs *engine)
+{
+	unsigned long flags;
+
+	if (READ_ONCE(engine->stats.enabled) == 0)
+		return;
+
+	write_seqlock_irqsave(&engine->stats.lock, flags);
+
+	if (engine->stats.enabled > 0) {
+		ktime_t last;
+
+		if (engine->stats.active && --engine->stats.active == 0) {
+			/*
+			 * Decrement the active context count and, if the GPU is
+			 * now idle, add the elapsed time to the running total.
+			 */
+			last = ktime_sub(ktime_get(), engine->stats.start);
+
+			engine->stats.total = ktime_add(engine->stats.total,
+							last);
+		} else if (engine->stats.active == 0) {
+			/*
+			 * After turning on engine stats, context out might be
+			 * the first event in which case we account from the
+			 * time stats gathering was turned on.
+			 */
+			last = ktime_sub(ktime_get(), engine->stats.enabled_at);
+
+			engine->stats.total = ktime_add(engine->stats.total,
+							last);
+		}
+	}
+
+	write_sequnlock_irqrestore(&engine->stats.lock, flags);
+}
+
+static void
+execlists_check_context(const struct intel_context *ce,
+			const struct intel_engine_cs *engine)
+{
+	const struct intel_ring *ring = ce->ring;
+	u32 *regs = ce->lrc_reg_state;
+	bool valid = true;
+	int x;
+
+	if (regs[CTX_RING_START] != i915_ggtt_offset(ring->vma)) {
+		pr_err("%s: context submitted with incorrect RING_START [%08x], expected %08x\n",
+		       engine->name,
+		       regs[CTX_RING_START],
+		       i915_ggtt_offset(ring->vma));
+		regs[CTX_RING_START] = i915_ggtt_offset(ring->vma);
+		valid = false;
+	}
+
+	if ((regs[CTX_RING_CTL] & ~(RING_WAIT | RING_WAIT_SEMAPHORE)) !=
+	    (RING_CTL_SIZE(ring->size) | RING_VALID)) {
+		pr_err("%s: context submitted with incorrect RING_CTL [%08x], expected %08x\n",
+		       engine->name,
+		       regs[CTX_RING_CTL],
+		       (u32)(RING_CTL_SIZE(ring->size) | RING_VALID));
+		regs[CTX_RING_CTL] = RING_CTL_SIZE(ring->size) | RING_VALID;
+		valid = false;
+	}
+
+	x = intel_lrc_ring_mi_mode(engine);
+	if (x != -1 && regs[x + 1] & (regs[x + 1] >> 16) & STOP_RING) {
+		pr_err("%s: context submitted with STOP_RING [%08x] in RING_MI_MODE\n",
+		       engine->name, regs[x + 1]);
+		regs[x + 1] &= ~STOP_RING;
+		regs[x + 1] |= STOP_RING << 16;
+		valid = false;
+	}
+
+	WARN_ONCE(!valid, "Invalid lrc state found before submission\n");
+}
+
+static void reset_active(struct i915_request *rq,
+			 struct intel_engine_cs *engine)
+{
+	struct intel_context * const ce = rq->hw_context;
+	u32 head;
+
+	/*
+	 * The executing context has been cancelled. We want to prevent
+	 * further execution along this context and propagate the error on
+	 * to anything depending on its results.
+	 *
+	 * In __i915_request_submit(), we apply the -EIO and remove the
+	 * requests' payloads for any banned requests. But first, we must
+	 * rewind the context back to the start of the incomplete request so
+	 * that we do not jump back into the middle of the batch.
+	 *
+	 * We preserve the breadcrumbs and semaphores of the incomplete
+	 * requests so that inter-timeline dependencies (i.e other timelines)
+	 * remain correctly ordered. And we defer to __i915_request_submit()
+	 * so that all asynchronous waits are correctly handled.
+	 */
+	GEM_TRACE("%s(%s): { rq=%llx:%lld }\n",
+		  __func__, engine->name, rq->fence.context, rq->fence.seqno);
+
+	/* On resubmission of the active request, payload will be scrubbed */
+	if (i915_request_completed(rq))
+		head = rq->tail;
+	else
+		head = active_request(ce->timeline, rq)->head;
+	ce->ring->head = intel_ring_wrap(ce->ring, head);
+	intel_ring_update_space(ce->ring);
+
+	/* Scrub the context image to prevent replaying the previous batch */
+	intel_lr_context_restore_default_state(ce, engine);
+	intel_lr_context_update_reg_state(ce, engine);
+
+	/* We've switched away, so this should be a no-op, but intent matters */
+	ce->lrc_desc |= CTX_DESC_FORCE_RESTORE;
+}
+
+static inline struct intel_engine_cs *
+__execlists_schedule_in(struct i915_request *rq)
+{
+	struct intel_engine_cs * const engine = rq->engine;
+	struct intel_context * const ce = rq->hw_context;
+
+	intel_context_get(ce);
+
+	if (unlikely(i915_gem_context_is_banned(ce->gem_context)))
+		reset_active(rq, engine);
+
+	if (IS_ENABLED(CONFIG_DRM_I915_DEBUG_GEM))
+		execlists_check_context(ce, engine);
+
+	if (ce->tag) {
+		/* Use a fixed tag for OA and friends */
+		ce->lrc_desc |= (u64)ce->tag << 32;
+	} else {
+		/* We don't need a strict matching tag, just different values */
+		ce->lrc_desc &= ~GENMASK_ULL(47, 37);
+		ce->lrc_desc |=
+			(u64)(engine->context_tag++ % NUM_CONTEXT_TAG) <<
+			GEN11_SW_CTX_ID_SHIFT;
+		BUILD_BUG_ON(NUM_CONTEXT_TAG > GEN12_MAX_CONTEXT_HW_ID);
+	}
+
+	__intel_gt_pm_get(engine->gt);
+	execlists_context_status_change(rq, INTEL_CONTEXT_SCHEDULE_IN);
+	intel_engine_context_in(engine);
+
+	return engine;
+}
+
+static inline struct i915_request *
+execlists_schedule_in(struct i915_request *rq, int idx)
+{
+	struct intel_context * const ce = rq->hw_context;
+	struct intel_engine_cs *old;
+
+	GEM_BUG_ON(!intel_engine_pm_is_awake(rq->engine));
+	trace_i915_request_in(rq, idx);
+
+	old = READ_ONCE(ce->inflight);
+	do {
+		if (!old) {
+			WRITE_ONCE(ce->inflight, __execlists_schedule_in(rq));
+			break;
+		}
+	} while (!try_cmpxchg(&ce->inflight, &old, ptr_inc(old)));
+
+	GEM_BUG_ON(intel_context_inflight(ce) != rq->engine);
+	return i915_request_get(rq);
+}
+
+static void kick_siblings(struct i915_request *rq, struct intel_context *ce)
+{
+	struct intel_virtual_engine *ve =
+		container_of(ce, typeof(*ve), context);
+	struct i915_request *next = READ_ONCE(ve->request);
+
+	if (next && next->execution_mask & ~rq->execution_mask)
+		tasklet_schedule(&ve->base.execlists.tasklet);
+}
+
+static inline void
+__execlists_schedule_out(struct i915_request *rq,
+			 struct intel_engine_cs * const engine)
+{
+	struct intel_context * const ce = rq->hw_context;
+
+	/*
+	 * NB process_csb() is not under the engine->active.lock and hence
+	 * schedule_out can race with schedule_in meaning that we should
+	 * refrain from doing non-trivial work here.
+	 */
+
+	/*
+	 * If we have just completed this context, the engine may now be
+	 * idle and we want to re-enter powersaving.
+	 */
+	if (list_is_last(&rq->link, &ce->timeline->requests) &&
+	    i915_request_completed(rq))
+		intel_engine_add_retire(engine, ce->timeline);
+
+	intel_engine_context_out(engine);
+	execlists_context_status_change(rq, INTEL_CONTEXT_SCHEDULE_OUT);
+	intel_gt_pm_put_async(engine->gt);
+
+	/*
+	 * If this is part of a virtual engine, its next request may
+	 * have been blocked waiting for access to the active context.
+	 * We have to kick all the siblings again in case we need to
+	 * switch (e.g. the next request is not runnable on this
+	 * engine). Hopefully, we will already have submitted the next
+	 * request before the tasklet runs and do not need to rebuild
+	 * each virtual tree and kick everyone again.
+	 */
+	if (ce->engine != engine)
+		kick_siblings(rq, ce);
+
+	intel_context_put(ce);
+}
+
+static inline void
+execlists_schedule_out(struct i915_request *rq)
+{
+	struct intel_context * const ce = rq->hw_context;
+	struct intel_engine_cs *cur, *old;
+
+	trace_i915_request_out(rq);
+
+	old = READ_ONCE(ce->inflight);
+	do
+		cur = ptr_unmask_bits(old, 2) ? ptr_dec(old) : NULL;
+	while (!try_cmpxchg(&ce->inflight, &old, cur));
+	if (!cur)
+		__execlists_schedule_out(rq, old);
+
+	i915_request_put(rq);
+}
+
+static u64 execlists_update_context(struct i915_request *rq)
+{
+	struct intel_context *ce = rq->hw_context;
+	u64 desc = ce->lrc_desc;
+	u32 tail;
+
+	/*
+	 * WaIdleLiteRestore:bdw,skl
+	 *
+	 * We should never submit the context with the same RING_TAIL twice
+	 * just in case we submit an empty ring, which confuses the HW.
+	 *
+	 * We append a couple of NOOPs (gen8_emit_wa_tail) after the end of
+	 * the normal request to be able to always advance the RING_TAIL on
+	 * subsequent resubmissions (for lite restore). Should that fail us,
+	 * and we try and submit the same tail again, force the context
+	 * reload.
+	 */
+	tail = intel_ring_set_tail(rq->ring, rq->tail);
+	if (unlikely(ce->lrc_reg_state[CTX_RING_TAIL] == tail))
+		desc |= CTX_DESC_FORCE_RESTORE;
+	ce->lrc_reg_state[CTX_RING_TAIL] = tail;
+	rq->tail = rq->wa_tail;
+
+	/*
+	 * Make sure the context image is complete before we submit it to HW.
+	 *
+	 * Ostensibly, writes (including the WCB) should be flushed prior to
+	 * an uncached write such as our mmio register access, the empirical
+	 * an uncached write such as our mmio register access, but the empirical
+	 * may not be visible to the HW prior to the completion of the UC
+	 * register write and that we may begin execution from the context
+	 * before its image is complete leading to invalid PD chasing.
+	 */
+	wmb();
+
+	/* Wa_1607138340:tgl */
+	if (IS_TGL_REVID(rq->i915, TGL_REVID_A0, TGL_REVID_A0))
+		desc |= CTX_DESC_FORCE_RESTORE;
+
+	ce->lrc_desc &= ~CTX_DESC_FORCE_RESTORE;
+	return desc;
+}
+
+static inline void write_desc(struct intel_engine_execlists *execlists, u64 desc, u32 port)
+{
+	if (execlists->ctrl_reg) {
+		writel(lower_32_bits(desc), execlists->submit_reg + port * 2);
+		writel(upper_32_bits(desc), execlists->submit_reg + port * 2 + 1);
+	} else {
+		writel(upper_32_bits(desc), execlists->submit_reg);
+		writel(lower_32_bits(desc), execlists->submit_reg);
+	}
+}
+
+static __maybe_unused void
+trace_ports(const struct intel_engine_execlists *execlists,
+	    const char *msg,
+	    struct i915_request * const *ports)
+{
+	const struct intel_engine_cs *engine =
+		container_of(execlists, typeof(*engine), execlists);
+
+	if (!ports[0])
+		return;
+
+	GEM_TRACE("%s: %s { %llx:%lld%s, %llx:%lld }\n",
+		  engine->name, msg,
+		  ports[0]->fence.context,
+		  ports[0]->fence.seqno,
+		  i915_request_completed(ports[0]) ? "!" :
+		  i915_request_started(ports[0]) ? "*" :
+		  "",
+		  ports[1] ? ports[1]->fence.context : 0,
+		  ports[1] ? ports[1]->fence.seqno : 0);
+}
+
+static __maybe_unused bool
+assert_pending_valid(const struct intel_engine_execlists *execlists,
+		     const char *msg)
+{
+	struct i915_request * const *port, *rq;
+	struct intel_context *ce = NULL;
+
+	trace_ports(execlists, msg, execlists->pending);
+
+	if (!execlists->pending[0]) {
+		GEM_TRACE_ERR("Nothing pending for promotion!\n");
+		return false;
+	}
+
+	if (execlists->pending[execlists_num_ports(execlists)]) {
+		GEM_TRACE_ERR("Excess pending[%d] for promotion!\n",
+			      execlists_num_ports(execlists));
+		return false;
+	}
+
+	for (port = execlists->pending; (rq = *port); port++) {
+		unsigned long flags;
+		bool ok = true;
+
+		GEM_BUG_ON(!kref_read(&rq->fence.refcount));
+		GEM_BUG_ON(!i915_request_is_active(rq));
+
+		if (ce == rq->hw_context) {
+			GEM_TRACE_ERR("Dup context:%llx in pending[%zd]\n",
+				      ce->timeline->fence_context,
+				      port - execlists->pending);
+			return false;
+		}
+		ce = rq->hw_context;
+
+		/* Hold tightly onto the lock to prevent concurrent retires! */
+		if (!spin_trylock_irqsave(&rq->lock, flags))
+			continue;
+
+		if (i915_request_completed(rq))
+			goto unlock;
+
+		if (i915_active_is_idle(&ce->active) &&
+		    !i915_gem_context_is_kernel(ce->gem_context)) {
+			GEM_TRACE_ERR("Inactive context:%llx in pending[%zd]\n",
+				      ce->timeline->fence_context,
+				      port - execlists->pending);
+			ok = false;
+			goto unlock;
+		}
+
+		if (!i915_vma_is_pinned(ce->state)) {
+			GEM_TRACE_ERR("Unpinned context:%llx in pending[%zd]\n",
+				      ce->timeline->fence_context,
+				      port - execlists->pending);
+			ok = false;
+			goto unlock;
+		}
+
+		if (!i915_vma_is_pinned(ce->ring->vma)) {
+			GEM_TRACE_ERR("Unpinned ring:%llx in pending[%zd]\n",
+				      ce->timeline->fence_context,
+				      port - execlists->pending);
+			ok = false;
+			goto unlock;
+		}
+
+unlock:
+		spin_unlock_irqrestore(&rq->lock, flags);
+		if (!ok)
+			return false;
+	}
+
+	return ce;
+}
+
+static void execlists_submit_ports(struct intel_engine_cs *engine)
+{
+	struct intel_engine_execlists *execlists = &engine->execlists;
+	unsigned int n;
+
+	GEM_BUG_ON(!assert_pending_valid(execlists, "submit"));
+
+	/*
+	 * We can skip acquiring intel_runtime_pm_get() here as it was taken
+	 * on our behalf by the request (see i915_gem_mark_busy()) and it will
+	 * not be relinquished until the device is idle (see
+	 * i915_gem_idle_work_handler()). As a precaution, we make sure
+	 * that all ELSP are drained i.e. we have processed the CSB,
+	 * before allowing ourselves to idle and calling intel_runtime_pm_put().
+	 */
+	GEM_BUG_ON(!intel_engine_pm_is_awake(engine));
+
+	/*
+	 * ELSQ note: the submit queue is not cleared after being submitted
+	 * to the HW so we need to make sure we always clean it up. This is
+	 * currently ensured by the fact that we always write the same number
+	 * of elsq entries, keep this in mind before changing the loop below.
+	 */
+	for (n = execlists_num_ports(execlists); n--; ) {
+		struct i915_request *rq = execlists->pending[n];
+
+		write_desc(execlists,
+			   rq ? execlists_update_context(rq) : 0,
+			   n);
+	}
+
+	/* we need to manually load the submit queue */
+	if (execlists->ctrl_reg)
+		writel(EL_CTRL_LOAD, execlists->ctrl_reg);
+}
+
+static bool ctx_single_port_submission(const struct intel_context *ce)
+{
+	return (IS_ENABLED(CONFIG_DRM_I915_GVT) &&
+		i915_gem_context_force_single_submission(ce->gem_context));
+}
+
+static bool can_merge_ctx(const struct intel_context *prev,
+			  const struct intel_context *next)
+{
+	if (prev != next)
+		return false;
+
+	if (ctx_single_port_submission(prev))
+		return false;
+
+	return true;
+}
+
+static bool can_merge_rq(const struct i915_request *prev,
+			 const struct i915_request *next)
+{
+	GEM_BUG_ON(prev == next);
+	GEM_BUG_ON(!assert_priority_queue(prev, next));
+
+	/*
+	 * We do not submit known completed requests. Therefore if the next
+	 * request is already completed, we can pretend to merge it in
+	 * with the previous context (and we will skip updating the ELSP
+	 * and tracking). Thus hopefully keeping the ELSP full with active
+	 * contexts, despite the best efforts of preempt-to-busy to confuse
+	 * us.
+	 */
+	if (i915_request_completed(next))
+		return true;
+
+	if (unlikely((prev->flags ^ next->flags) &
+		     (I915_REQUEST_NOPREEMPT | I915_REQUEST_SENTINEL)))
+		return false;
+
+	if (!can_merge_ctx(prev->hw_context, next->hw_context))
+		return false;
+
+	return true;
+}
+
+static bool virtual_matches(const struct intel_virtual_engine *ve,
+			    const struct i915_request *rq,
+			    const struct intel_engine_cs *engine)
+{
+	const struct intel_engine_cs *inflight;
+
+	if (!(rq->execution_mask & engine->mask)) /* We peeked too soon! */
+		return false;
+
+	/*
+	 * We track when the HW has completed saving the context image
+	 * (i.e. when we have seen the final CS event switching out of
+	 * the context) and must not overwrite the context image before
+	 * then. This restricts us to only using the active engine
+	 * while the previous virtualized request is inflight (so
+	 * we reuse the register offsets). This is a very small
+	 * hysteresis on the greedy selection algorithm.
+	 */
+	inflight = intel_context_inflight(&ve->context);
+	if (inflight && inflight != engine)
+		return false;
+
+	return true;
+}
+
+static void virtual_xfer_breadcrumbs(struct intel_virtual_engine *ve,
+				     struct intel_engine_cs *engine)
+{
+	struct intel_engine_cs *old = ve->siblings[0];
+
+	/* All unattached (rq->engine == old) must already be completed */
+
+	spin_lock(&old->breadcrumbs.irq_lock);
+	if (!list_empty(&ve->context.signal_link)) {
+		list_move_tail(&ve->context.signal_link,
+			       &engine->breadcrumbs.signalers);
+		intel_engine_queue_breadcrumbs(engine);
+	}
+	spin_unlock(&old->breadcrumbs.irq_lock);
+}
+
+static struct i915_request *
+last_active(const struct intel_engine_execlists *execlists)
+{
+	struct i915_request * const *last = READ_ONCE(execlists->active);
+
+	while (*last && i915_request_completed(*last))
+		last++;
+
+	return *last;
+}
+
+static void defer_request(struct i915_request *rq, struct list_head * const pl)
+{
+	LIST_HEAD(list);
+
+	/*
+	 * We want to move the interrupted request to the back of
+	 * the round-robin list (i.e. its priority level), but
+	 * in doing so, we must then move all requests that were in
+	 * flight and were waiting for the interrupted request to
+	 * be run after it again.
+	 */
+	do {
+		struct i915_dependency *p;
+
+		GEM_BUG_ON(i915_request_is_active(rq));
+		list_move_tail(&rq->sched.link, pl);
+
+		list_for_each_entry(p, &rq->sched.waiters_list, wait_link) {
+			struct i915_request *w =
+				container_of(p->waiter, typeof(*w), sched);
+
+			/* Leave semaphores spinning on the other engines */
+			if (w->engine != rq->engine)
+				continue;
+
+			/* No waiter should start before its signaler */
+			GEM_BUG_ON(i915_request_started(w) &&
+				   !i915_request_completed(rq));
+
+			GEM_BUG_ON(i915_request_is_active(w));
+			if (list_empty(&w->sched.link))
+				continue; /* Not yet submitted; unready */
+
+			if (rq_prio(w) < rq_prio(rq))
+				continue;
+
+			GEM_BUG_ON(rq_prio(w) > rq_prio(rq));
+			list_move_tail(&w->sched.link, &list);
+		}
+
+		rq = list_first_entry_or_null(&list, typeof(*rq), sched.link);
+	} while (rq);
+}
+
+static void defer_active(struct intel_engine_cs *engine)
+{
+	struct i915_request *rq;
+
+	rq = __unwind_incomplete_requests(engine);
+	if (!rq)
+		return;
+
+	defer_request(rq, i915_sched_lookup_priolist(engine, rq_prio(rq)));
+}
+
+static bool
+need_timeslice(struct intel_engine_cs *engine, const struct i915_request *rq)
+{
+	int hint;
+
+	if (!intel_engine_has_timeslices(engine))
+		return false;
+
+	if (list_is_last(&rq->sched.link, &engine->active.requests))
+		return false;
+
+	hint = max(rq_prio(list_next_entry(rq, sched.link)),
+		   engine->execlists.queue_priority_hint);
+
+	return hint >= effective_prio(rq);
+}
+
+static int
+switch_prio(struct intel_engine_cs *engine, const struct i915_request *rq)
+{
+	if (list_is_last(&rq->sched.link, &engine->active.requests))
+		return INT_MIN;
+
+	return rq_prio(list_next_entry(rq, sched.link));
+}
+
+static inline unsigned long
+timeslice(const struct intel_engine_cs *engine)
+{
+	return READ_ONCE(engine->props.timeslice_duration_ms);
+}
+
+static unsigned long
+active_timeslice(const struct intel_engine_cs *engine)
+{
+	const struct i915_request *rq = *engine->execlists.active;
+
+	if (i915_request_completed(rq))
+		return 0;
+
+	if (engine->execlists.switch_priority_hint < effective_prio(rq))
+		return 0;
+
+	return timeslice(engine);
+}
+
+static void set_timeslice(struct intel_engine_cs *engine)
+{
+	if (!intel_engine_has_timeslices(engine))
+		return;
+
+	set_timer_ms(&engine->execlists.timer, active_timeslice(engine));
+}
+
+static void record_preemption(struct intel_engine_execlists *execlists)
+{
+	(void)I915_SELFTEST_ONLY(execlists->preempt_hang.count++);
+}
+
+static unsigned long active_preempt_timeout(struct intel_engine_cs *engine)
+{
+	struct i915_request *rq;
+
+	rq = last_active(&engine->execlists);
+	if (!rq)
+		return 0;
+
+	/* Force a fast reset for terminated contexts (ignoring sysfs!) */
+	if (unlikely(i915_gem_context_is_banned(rq->gem_context)))
+		return 1;
+
+	return READ_ONCE(engine->props.preempt_timeout_ms);
+}
+
+static void set_preempt_timeout(struct intel_engine_cs *engine)
+{
+	if (!intel_engine_has_preempt_reset(engine))
+		return;
+
+	set_timer_ms(&engine->execlists.preempt,
+		     active_preempt_timeout(engine));
+}
+
+static void execlists_dequeue(struct intel_engine_cs *engine)
+{
+	struct intel_engine_execlists * const execlists = &engine->execlists;
+	struct i915_request **port = execlists->pending;
+	struct i915_request ** const last_port = port + execlists->port_mask;
+	struct i915_request *last;
+	struct rb_node *rb;
+	bool submit = false;
+
+	/*
+	 * Hardware submission is through 2 ports. Conceptually each port
+	 * has a (RING_START, RING_HEAD, RING_TAIL) tuple. RING_START is
+	 * static for a context, and unique to each, so we only execute
+	 * requests belonging to a single context from each ring. RING_HEAD
+	 * is maintained by the CS in the context image, it marks the place
+	 * where it got up to last time, and through RING_TAIL we tell the CS
+	 * where we want to execute up to this time.
+	 *
+	 * In this list the requests are in order of execution. Consecutive
+	 * requests from the same context are adjacent in the ringbuffer. We
+	 * can combine these requests into a single RING_TAIL update:
+	 *
+	 *              RING_HEAD...req1...req2
+	 *                                    ^- RING_TAIL
+	 * since to execute req2 the CS must first execute req1.
+	 *
+	 * Our goal then is to point each port to the end of a consecutive
+	 * sequence of requests as being the most optimal (fewest wake ups
+	 * and context switches) submission.
+	 */
+
+	for (rb = rb_first_cached(&execlists->virtual); rb; ) {
+		struct intel_virtual_engine *ve =
+			rb_entry(rb, typeof(*ve), nodes[engine->id].rb);
+		struct i915_request *rq = READ_ONCE(ve->request);
+
+		if (!rq) { /* lazily cleanup after another engine handled rq */
+			rb_erase_cached(rb, &execlists->virtual);
+			RB_CLEAR_NODE(rb);
+			rb = rb_first_cached(&execlists->virtual);
+			continue;
+		}
+
+		if (!virtual_matches(ve, rq, engine)) {
+			rb = rb_next(rb);
+			continue;
+		}
+
+		break;
+	}
+
+	/*
+	 * If the queue is higher priority than the last
+	 * request in the currently active context, submit afresh.
+	 * We will resubmit again afterwards in case we need to split
+	 * the active context to interject the preemption request,
+	 * i.e. we will retrigger preemption following the ack in case
+	 * of trouble.
+	 */
+	last = last_active(execlists);
+	if (last) {
+		if (need_preempt(engine, last, rb)) {
+			GEM_TRACE("%s: preempting last=%llx:%lld, prio=%d, hint=%d\n",
+				  engine->name,
+				  last->fence.context,
+				  last->fence.seqno,
+				  last->sched.attr.priority,
+				  execlists->queue_priority_hint);
+			record_preemption(execlists);
+
+			/*
+			 * Don't let the RING_HEAD advance past the breadcrumb
+			 * as we unwind (and until we resubmit) so that we do
+			 * not accidentally tell it to go backwards.
+			 */
+			ring_set_paused(engine, 1);
+
+			/*
+			 * Note that we have not stopped the GPU at this point,
+			 * so we are unwinding the incomplete requests as they
+			 * remain inflight and so by the time we do complete
+			 * the preemption, some of the unwound requests may
+			 * complete!
+			 */
+			__unwind_incomplete_requests(engine);
+
+			/*
+			 * If we need to return to the preempted context, we
+			 * need to skip the lite-restore and force it to
+			 * reload the RING_TAIL. Otherwise, the HW has a
+			 * tendency to ignore us rewinding the TAIL to the
+			 * end of an earlier request.
+			 */
+			last->hw_context->lrc_desc |= CTX_DESC_FORCE_RESTORE;
+			last = NULL;
+		} else if (need_timeslice(engine, last) &&
+			   timer_expired(&engine->execlists.timer)) {
+			GEM_TRACE("%s: expired last=%llx:%lld, prio=%d, hint=%d\n",
+				  engine->name,
+				  last->fence.context,
+				  last->fence.seqno,
+				  last->sched.attr.priority,
+				  execlists->queue_priority_hint);
+
+			ring_set_paused(engine, 1);
+			defer_active(engine);
+
+			/*
+			 * Unlike for preemption, if we rewind and continue
+			 * executing the same context as previously active,
+			 * the order of execution will remain the same and
+			 * the tail will only advance. We do not need to
+			 * force a full context restore, as a lite-restore
+			 * is sufficient to resample the monotonic TAIL.
+			 *
+			 * If we switch to any other context, similarly we
+			 * will not rewind TAIL of current context, and
+			 * normal save/restore will preserve state and allow
+			 * us to later continue executing the same request.
+			 */
+			last = NULL;
+		} else {
+			/*
+			 * Otherwise if we already have a request pending
+			 * for execution after the current one, we can
+			 * just wait until the next CS event before
+			 * queuing more. In either case we will force a
+			 * lite-restore preemption event, but if we wait
+			 * we hopefully coalesce several updates into a single
+			 * submission.
+			 */
+			if (!list_is_last(&last->sched.link,
+					  &engine->active.requests)) {
+				/*
+				 * Even if ELSP[1] is occupied and not worthy
+				 * of timeslices, our queue might be.
+				 */
+				if (!execlists->timer.expires &&
+				    need_timeslice(engine, last))
+					set_timer_ms(&execlists->timer,
+						     timeslice(engine));
+
+				return;
+			}
+		}
+	}
+
+	while (rb) { /* XXX virtual is always taking precedence */
+		struct intel_virtual_engine *ve =
+			rb_entry(rb, typeof(*ve), nodes[engine->id].rb);
+		struct i915_request *rq;
+
+		spin_lock(&ve->base.active.lock);
+
+		rq = ve->request;
+		if (unlikely(!rq)) { /* lost the race to a sibling */
+			spin_unlock(&ve->base.active.lock);
+			rb_erase_cached(rb, &execlists->virtual);
+			RB_CLEAR_NODE(rb);
+			rb = rb_first_cached(&execlists->virtual);
+			continue;
+		}
+
+		GEM_BUG_ON(rq != ve->request);
+		GEM_BUG_ON(rq->engine != &ve->base);
+		GEM_BUG_ON(rq->hw_context != &ve->context);
+
+		if (rq_prio(rq) >= queue_prio(execlists)) {
+			if (!virtual_matches(ve, rq, engine)) {
+				spin_unlock(&ve->base.active.lock);
+				rb = rb_next(rb);
+				continue;
+			}
+
+			if (last && !can_merge_rq(last, rq)) {
+				spin_unlock(&ve->base.active.lock);
+				return; /* leave this for another */
+			}
+
+			GEM_TRACE("%s: virtual rq=%llx:%lld%s, new engine? %s\n",
+				  engine->name,
+				  rq->fence.context,
+				  rq->fence.seqno,
+				  i915_request_completed(rq) ? "!" :
+				  i915_request_started(rq) ? "*" :
+				  "",
+				  yesno(engine != ve->siblings[0]));
+
+			ve->request = NULL;
+			ve->base.execlists.queue_priority_hint = INT_MIN;
+			rb_erase_cached(rb, &execlists->virtual);
+			RB_CLEAR_NODE(rb);
+
+			GEM_BUG_ON(!(rq->execution_mask & engine->mask));
+			rq->engine = engine;
+
+			if (engine != ve->siblings[0]) {
+				u32 *regs = ve->context.lrc_reg_state;
+				unsigned int n;
+
+				GEM_BUG_ON(READ_ONCE(ve->context.inflight));
+
+				if (!intel_engine_has_relative_mmio(engine))
+					intel_lr_context_set_register_offsets(regs,
+									      engine);
+
+				if (!list_empty(&ve->context.signals))
+					virtual_xfer_breadcrumbs(ve, engine);
+
+				/*
+				 * Move the bound engine to the top of the list
+				 * for future execution. We then kick this
+				 * tasklet first before checking others, so that
+				 * we preferentially reuse this set of bound
+				 * registers.
+				 */
+				for (n = 1; n < ve->num_siblings; n++) {
+					if (ve->siblings[n] == engine) {
+						swap(ve->siblings[n],
+						     ve->siblings[0]);
+						break;
+					}
+				}
+
+				GEM_BUG_ON(ve->siblings[0] != engine);
+			}
+
+			if (__i915_request_submit(rq)) {
+				submit = true;
+				last = rq;
+			}
+			i915_request_put(rq);
+
+			/*
+			 * Hmm, we have a bunch of virtual engine requests,
+			 * but the first one was already completed (thanks
+			 * preempt-to-busy!). Keep looking at the veng queue
+			 * until we have no more relevant requests (i.e.
+			 * the normal submit queue has higher priority).
+			 */
+			if (!submit) {
+				spin_unlock(&ve->base.active.lock);
+				rb = rb_first_cached(&execlists->virtual);
+				continue;
+			}
+		}
+
+		spin_unlock(&ve->base.active.lock);
+		break;
+	}
+
+	while ((rb = rb_first_cached(&execlists->queue))) {
+		struct i915_priolist *p = to_priolist(rb);
+		struct i915_request *rq, *rn;
+		int i;
+
+		priolist_for_each_request_consume(rq, rn, p, i) {
+			bool merge = true;
+
+			/*
+			 * Can we combine this request with the current port?
+			 * It has to be the same context/ringbuffer and not
+			 * have any exceptions (e.g. GVT saying never to
+			 * combine contexts).
+			 *
+			 * If we can combine the requests, we can execute both
+			 * by updating the RING_TAIL to point to the end of the
+			 * second request, and so we never need to tell the
+			 * hardware about the first.
+			 */
+			if (last && !can_merge_rq(last, rq)) {
+				/*
+				 * If we are on the second port and cannot
+				 * combine this request with the last, then we
+				 * are done.
+				 */
+				if (port == last_port)
+					goto done;
+
+				/*
+				 * We must not populate both ELSP[] with the
+				 * same LRCA, i.e. we must submit 2 different
+				 * contexts if we submit 2 ELSP.
+				 */
+				if (last->hw_context == rq->hw_context)
+					goto done;
+
+				if (i915_request_has_sentinel(last))
+					goto done;
+
+				/*
+				 * If GVT overrides us we only ever submit
+				 * port[0], leaving port[1] empty. Note that we
+				 * also have to be careful that we don't queue
+				 * the same context (even though a different
+				 * request) to the second port.
+				 */
+				if (ctx_single_port_submission(last->hw_context) ||
+				    ctx_single_port_submission(rq->hw_context))
+					goto done;
+
+				merge = false;
+			}
+
+			if (__i915_request_submit(rq)) {
+				if (!merge) {
+					*port = execlists_schedule_in(last, port - execlists->pending);
+					port++;
+					last = NULL;
+				}
+
+				GEM_BUG_ON(last &&
+					   !can_merge_ctx(last->hw_context,
+							  rq->hw_context));
+
+				submit = true;
+				last = rq;
+			}
+		}
+
+		rb_erase_cached(&p->node, &execlists->queue);
+		i915_priolist_free(p);
+	}
+
+done:
+	/*
+	 * Here be a bit of magic! Or sleight-of-hand, whichever you prefer.
+	 *
+	 * We choose the priority hint such that if we add a request of greater
+	 * priority than this, we kick the submission tasklet to decide on
+	 * the right order of submitting the requests to hardware. We must
+	 * also be prepared to reorder requests as they are in-flight on the
+	 * HW. We derive the priority hint then as the first "hole" in
+	 * the HW submission ports and if there are no available slots,
+	 * the priority of the lowest executing request, i.e. last.
+	 *
+	 * When we do receive a higher priority request ready to run from the
+	 * user, see queue_request(), the priority hint is bumped to that
+	 * request triggering preemption on the next dequeue (or subsequent
+	 * interrupt for secondary ports).
+	 */
+	execlists->queue_priority_hint = queue_prio(execlists);
+	GEM_TRACE("%s: queue_priority_hint:%d, submit:%s\n",
+		  engine->name, execlists->queue_priority_hint,
+		  yesno(submit));
+
+	if (submit) {
+		*port = execlists_schedule_in(last, port - execlists->pending);
+		execlists->switch_priority_hint =
+			switch_prio(engine, *execlists->pending);
+
+		/*
+		 * Skip if we ended up with exactly the same set of requests,
+		 * e.g. trying to timeslice a pair of ordered contexts
+		 */
+		if (!memcmp(execlists->active, execlists->pending,
+			    (port - execlists->pending + 1) * sizeof(*port))) {
+			do
+				execlists_schedule_out(fetch_and_zero(port));
+			while (port-- != execlists->pending);
+
+			goto skip_submit;
+		}
+
+		memset(port + 1, 0, (last_port - port) * sizeof(*port));
+		execlists_submit_ports(engine);
+
+		set_preempt_timeout(engine);
+	} else {
+skip_submit:
+		ring_set_paused(engine, 0);
+	}
+}
+
+static void
+cancel_port_requests(struct intel_engine_execlists * const execlists)
+{
+	struct i915_request * const *port;
+
+	for (port = execlists->pending; *port; port++)
+		execlists_schedule_out(*port);
+	memset(execlists->pending, 0, sizeof(execlists->pending));
+
+	/* Mark the end of active before we overwrite *active */
+	for (port = xchg(&execlists->active, execlists->pending); *port; port++)
+		execlists_schedule_out(*port);
+	WRITE_ONCE(execlists->active,
+		   memset(execlists->inflight, 0, sizeof(execlists->inflight)));
+}
+
+static inline void
+invalidate_csb_entries(const u32 *first, const u32 *last)
+{
+	clflush((void *)first);
+	clflush((void *)last);
+}
+
+static inline bool
+reset_in_progress(const struct intel_engine_execlists *execlists)
+{
+	return unlikely(!__tasklet_is_enabled(&execlists->tasklet));
+}
+
+/*
+ * Starting with Gen12, the status has a new format:
+ *
+ *     bit  0:     switched to new queue
+ *     bit  1:     reserved
+ *     bit  2:     semaphore wait mode (poll or signal), only valid when
+ *                 switch detail is set to "wait on semaphore"
+ *     bits 3-5:   engine class
+ *     bits 6-11:  engine instance
+ *     bits 12-14: reserved
+ *     bits 15-25: sw context id of the lrc the GT switched to
+ *     bits 26-31: sw counter of the lrc the GT switched to
+ *     bits 32-35: context switch detail
+ *                  - 0: ctx complete
+ *                  - 1: wait on sync flip
+ *                  - 2: wait on vblank
+ *                  - 3: wait on scanline
+ *                  - 4: wait on semaphore
+ *                  - 5: context preempted (not on SEMAPHORE_WAIT or
+ *                       WAIT_FOR_EVENT)
+ *     bit  36:    reserved
+ *     bits 37-43: wait detail (for switch detail 1 to 4)
+ *     bits 44-46: reserved
+ *     bits 47-57: sw context id of the lrc the GT switched away from
+ *     bits 58-63: sw counter of the lrc the GT switched away from
+ */
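+
+/*
+ * Worked example (illustrative values, not from a captured trace): for a
+ * CSB pair with lower dword 0x00000001 and upper dword 0x00000000, bit 0
+ * of the lower dword is set ("switched to new queue"), so gen12_csb_parse()
+ * below reports a promotion of the pending ELSP submission; the sw context
+ * id in bits 15-25 of the lower dword is 0, i.e. not the idle id 0x7FF, so
+ * the incoming context is valid.
+ */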
+static inline bool
+gen12_csb_parse(const struct intel_engine_execlists *execlists, const u32 *csb)
+{
+	u32 lower_dw = csb[0];
+	u32 upper_dw = csb[1];
+	bool ctx_to_valid = GEN12_CSB_CTX_VALID(lower_dw);
+	bool ctx_away_valid = GEN12_CSB_CTX_VALID(upper_dw);
+	bool new_queue = lower_dw & GEN12_CTX_STATUS_SWITCHED_TO_NEW_QUEUE;
+
+	/*
+	 * The context switch detail is not guaranteed to be 5 when a preemption
+	 * occurs, so we can't just check for that. The check below works for
+	 * all the cases we care about, including preemptions of WAIT
+	 * instructions and lite-restore. Preempt-to-idle via the CTRL register
+	 * would require some extra handling, but we don't support that.
+	 */
+	if (!ctx_away_valid || new_queue) {
+		GEM_BUG_ON(!ctx_to_valid);
+		return true;
+	}
+
+	/*
+	 * switch detail = 5 is covered by the case above and we do not expect a
+	 * context switch on an unsuccessful wait instruction since we always
+	 * use polling mode.
+	 */
+	GEM_BUG_ON(GEN12_CTX_SWITCH_DETAIL(upper_dw));
+	return false;
+}
+
+static inline bool
+gen8_csb_parse(const struct intel_engine_execlists *execlists, const u32 *csb)
+{
+	return *csb & (GEN8_CTX_STATUS_IDLE_ACTIVE | GEN8_CTX_STATUS_PREEMPTED);
+}
+
+static void process_csb(struct intel_engine_cs *engine)
+{
+	struct intel_engine_execlists * const execlists = &engine->execlists;
+	const u32 * const buf = execlists->csb_status;
+	const u8 num_entries = execlists->csb_size;
+	u8 head, tail;
+
+	/*
+	 * As we modify our execlists state tracking we require exclusive
+	 * access. Either we are inside the tasklet, or the tasklet is disabled
+	 * and we assume that is only inside the reset paths and so serialised.
+	 */
+	GEM_BUG_ON(!tasklet_is_locked(&execlists->tasklet) &&
+		   !reset_in_progress(execlists));
+	GEM_BUG_ON(!intel_engine_in_execlists_submission_mode(engine));
+
+	/*
+	 * Note that csb_write, csb_status may be either in HWSP or mmio.
+	 * When reading from the csb_write mmio register, we have to be
+	 * careful to only use the GEN8_CSB_WRITE_PTR portion, which is
+	 * the low 4bits. As it happens we know the next 4bits are always
+	 * zero and so we can simply mask off the low u8 of the register
+	 * and treat it identically to reading from the HWSP (without having
+	 * to use explicit shifting and masking, and probably bifurcating
+	 * the code to handle the legacy mmio read).
+	 */
+	head = execlists->csb_head;
+	tail = READ_ONCE(*execlists->csb_write);
+	GEM_TRACE("%s cs-irq head=%d, tail=%d\n", engine->name, head, tail);
+	if (unlikely(head == tail))
+		return;
+
+	/*
+	 * Hopefully paired with a wmb() in HW!
+	 *
+	 * We must complete the read of the write pointer before any reads
+	 * from the CSB, so that we do not see stale values. Without an rmb
+	 * (lfence) the HW may speculatively perform the CSB[] reads *before*
+	 * we perform the READ_ONCE(*csb_write).
+	 */
+	rmb();
+
+	do {
+		bool promote;
+
+		if (++head == num_entries)
+			head = 0;
+
+		/*
+		 * We are flying near dragons again.
+		 *
+		 * We hold a reference to the request in execlist_port[]
+		 * but no more than that. We are operating in softirq
+		 * context and so cannot hold any mutex or sleep. That
+		 * prevents us stopping the requests we are processing
+		 * in port[] from being retired simultaneously (the
+		 * breadcrumb will be complete before we see the
+		 * context-switch). As we only hold the reference to the
+		 * request, any pointer chasing underneath the request
+		 * is subject to a potential use-after-free. Thus we
+		 * store all of the bookkeeping within port[] as
+		 * required, and avoid using unguarded pointers beneath
+		 * request itself. The same applies to the atomic
+		 * status notifier.
+		 */
+
+		GEM_TRACE("%s csb[%d]: status=0x%08x:0x%08x\n",
+			  engine->name, head,
+			  buf[2 * head + 0], buf[2 * head + 1]);
+
+		if (INTEL_GEN(engine->i915) >= 12)
+			promote = gen12_csb_parse(execlists, buf + 2 * head);
+		else
+			promote = gen8_csb_parse(execlists, buf + 2 * head);
+		if (promote) {
+			struct i915_request * const *old = execlists->active;
+
+			/* Point active to the new ELSP; prevent overwriting */
+			WRITE_ONCE(execlists->active, execlists->pending);
+			set_timeslice(engine);
+
+			if (!inject_preempt_hang(execlists))
+				ring_set_paused(engine, 0);
+
+			/* cancel old inflight, prepare for switch */
+			trace_ports(execlists, "preempted", old);
+			while (*old)
+				execlists_schedule_out(*old++);
+
+			/* switch pending to inflight */
+			GEM_BUG_ON(!assert_pending_valid(execlists, "promote"));
+			WRITE_ONCE(execlists->active,
+				   memcpy(execlists->inflight,
+					  execlists->pending,
+					  execlists_num_ports(execlists) *
+					  sizeof(*execlists->pending)));
+
+			WRITE_ONCE(execlists->pending[0], NULL);
+		} else {
+			GEM_BUG_ON(!*execlists->active);
+
+			/* port0 completed, advanced to port1 */
+			trace_ports(execlists, "completed", execlists->active);
+
+			/*
+			 * We rely on the hardware being strongly
+			 * ordered, that the breadcrumb write is
+			 * coherent (visible from the CPU) before the
+			 * user interrupt and CSB is processed.
+			 */
+			GEM_BUG_ON(!i915_request_completed(*execlists->active) &&
+				   !reset_in_progress(execlists));
+			execlists_schedule_out(*execlists->active++);
+
+			GEM_BUG_ON(execlists->active - execlists->inflight >
+				   execlists_num_ports(execlists));
+		}
+	} while (head != tail);
+
+	execlists->csb_head = head;
+
+	/*
+	 * Gen11 has proven to fail wrt global observation point between
+	 * entry and tail update, failing on the ordering and thus
+	 * we see an old entry in the context status buffer.
+	 *
+	 * Forcibly evict out entries for the next gpu csb update,
+	 * to increase the odds that we get fresh entries with non-working
+	 * hardware. The cost for doing so comes out mostly with
+	 * the wash as hardware, working or not, will need to do the
+	 * invalidation before.
+	 */
+	invalidate_csb_entries(&buf[0], &buf[num_entries - 1]);
+}
+
+static void __execlists_submission_tasklet(struct intel_engine_cs *const engine)
+{
+	lockdep_assert_held(&engine->active.lock);
+	if (!engine->execlists.pending[0]) {
+		rcu_read_lock(); /* protect peeking at execlists->active */
+		execlists_dequeue(engine);
+		rcu_read_unlock();
+	}
+}
+
+static noinline void preempt_reset(struct intel_engine_cs *engine)
+{
+	const unsigned int bit = I915_RESET_ENGINE + engine->id;
+	unsigned long *lock = &engine->gt->reset.flags;
+
+	if (i915_modparams.reset < 3)
+		return;
+
+	if (test_and_set_bit(bit, lock))
+		return;
+
+	/* Mark this tasklet as disabled to avoid waiting for it to complete */
+	tasklet_disable_nosync(&engine->execlists.tasklet);
+
+	GEM_TRACE("%s: preempt timeout %lu+%ums\n",
+		  engine->name,
+		  READ_ONCE(engine->props.preempt_timeout_ms),
+		  jiffies_to_msecs(jiffies - engine->execlists.preempt.expires));
+	intel_engine_reset(engine, "preemption time out");
+
+	tasklet_enable(&engine->execlists.tasklet);
+	clear_and_wake_up_bit(bit, lock);
+}
+
+static bool preempt_timeout(const struct intel_engine_cs *const engine)
+{
+	const struct timer_list *t = &engine->execlists.preempt;
+
+	if (!CONFIG_DRM_I915_PREEMPT_TIMEOUT)
+		return false;
+
+	if (!timer_expired(t))
+		return false;
+
+	return READ_ONCE(engine->execlists.pending[0]);
+}
+
+/*
+ * Check the unread Context Status Buffers and manage the submission of new
+ * contexts to the ELSP accordingly.
+ */
+static void execlists_submission_tasklet(unsigned long data)
+{
+	struct intel_engine_cs * const engine = (struct intel_engine_cs *)data;
+	bool timeout = preempt_timeout(engine);
+
+	process_csb(engine);
+	if (!READ_ONCE(engine->execlists.pending[0]) || timeout) {
+		unsigned long flags;
+
+		spin_lock_irqsave(&engine->active.lock, flags);
+		__execlists_submission_tasklet(engine);
+		spin_unlock_irqrestore(&engine->active.lock, flags);
+
+		/* Recheck after serialising with direct-submission */
+		if (timeout && preempt_timeout(engine))
+			preempt_reset(engine);
+	}
+}
+
+static void __execlists_kick(struct intel_engine_execlists *execlists)
+{
+	/* Kick the tasklet for some interrupt coalescing and reset handling */
+	tasklet_hi_schedule(&execlists->tasklet);
+}
+
+#define execlists_kick(t, member) \
+	__execlists_kick(container_of(t, struct intel_engine_execlists, member))
+
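+/*
+ * The timer callbacks below are handed a pointer to the embedded
+ * struct timer_list, and execlists_kick() recovers the parent
+ * intel_engine_execlists via container_of(); e.g. execlists_timeslice()
+ * expands to
+ * __execlists_kick(container_of(timer, struct intel_engine_execlists, timer)).
+ */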
+static void execlists_timeslice(struct timer_list *timer)
+{
+	execlists_kick(timer, timer);
+}
+
+static void execlists_preempt(struct timer_list *timer)
+{
+	execlists_kick(timer, preempt);
+}
+
+static void queue_request(struct intel_engine_cs *engine,
+			  struct i915_sched_node *node,
+			  int prio)
+{
+	GEM_BUG_ON(!list_empty(&node->link));
+	list_add_tail(&node->link, i915_sched_lookup_priolist(engine, prio));
+}
+
+static void __submit_queue_imm(struct intel_engine_cs *engine)
+{
+	struct intel_engine_execlists * const execlists = &engine->execlists;
+
+	if (reset_in_progress(execlists))
+		return; /* defer until we restart the engine following reset */
+
+	if (execlists->tasklet.func == execlists_submission_tasklet)
+		__execlists_submission_tasklet(engine);
+	else
+		tasklet_hi_schedule(&execlists->tasklet);
+}
+
+static void submit_queue(struct intel_engine_cs *engine,
+			 const struct i915_request *rq)
+{
+	struct intel_engine_execlists *execlists = &engine->execlists;
+
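+	/*
+	 * queue_priority_hint records the highest priority we have already
+	 * asked the submission path to consider; only re-kick it if this
+	 * request raises that bar.
+	 */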
+	if (rq_prio(rq) <= execlists->queue_priority_hint)
+		return;
+
+	execlists->queue_priority_hint = rq_prio(rq);
+	__submit_queue_imm(engine);
+}
+
+static void execlists_submit_request(struct i915_request *request)
+{
+	struct intel_engine_cs *engine = request->engine;
+	unsigned long flags;
+
+	/* Will be called from irq-context when using foreign fences. */
+	spin_lock_irqsave(&engine->active.lock, flags);
+
+	queue_request(engine, &request->sched, rq_prio(request));
+
+	GEM_BUG_ON(RB_EMPTY_ROOT(&engine->execlists.queue.rb_root));
+	GEM_BUG_ON(list_empty(&request->sched.link));
+
+	submit_queue(engine, request);
+
+	spin_unlock_irqrestore(&engine->active.lock, flags);
+}
+
+static void execlists_context_destroy(struct kref *kref)
+{
+	struct intel_context *ce = container_of(kref, typeof(*ce), ref);
+
+	GEM_BUG_ON(!i915_active_is_idle(&ce->active));
+	GEM_BUG_ON(intel_context_is_pinned(ce));
+
+	if (ce->state)
+		intel_lr_context_fini(ce);
+
+	intel_context_fini(ce);
+	intel_context_free(ce);
+}
+
+static int execlists_context_pin(struct intel_context *ce)
+{
+	return intel_lr_context_pin(ce, ce->engine);
+}
+
+static int execlists_context_alloc(struct intel_context *ce)
+{
+	return intel_lr_context_alloc(ce, ce->engine);
+}
+
+static void execlists_context_reset(struct intel_context *ce)
+{
+	/*
+	 * Because we emit WA_TAIL_DWORDS there may be a disparity
+	 * between our bookkeeping in ce->ring->head and ce->ring->tail and
+	 * that stored in context. As we only write new commands from
+	 * ce->ring->tail onwards, everything before that is junk. If the GPU
+	 * starts reading from its RING_HEAD from the context, it may try to
+	 * execute that junk and die.
+	 *
+	 * The contexts that are still pinned on resume belong to the
+	 * kernel, and are local to each engine. All other contexts will
+	 * have their head/tail sanitized upon pinning before use, so they
+	 * will never see garbage.
+	 *
+	 * So to avoid that we reset the context images upon resume. For
+	 * simplicity, we just zero everything out.
+	 */
+	intel_ring_reset(ce->ring, 0);
+	intel_lr_context_update_reg_state(ce, ce->engine);
+}
+
+static const struct intel_context_ops execlists_context_ops = {
+	.alloc = execlists_context_alloc,
+
+	.pin = execlists_context_pin,
+	.unpin = intel_lr_context_unpin,
+
+	.enter = intel_context_enter_engine,
+	.exit = intel_context_exit_engine,
+
+	.reset = execlists_context_reset,
+	.destroy = execlists_context_destroy,
+};
+
+static int execlists_request_alloc(struct i915_request *request)
+{
+	int ret;
+
+	GEM_BUG_ON(!intel_context_is_pinned(request->hw_context));
+
+	/*
+	 * Flush enough space to reduce the likelihood of waiting after
+	 * we start building the request - in which case we will just
+	 * have to repeat work.
+	 */
+	request->reserved_space += EXECLISTS_REQUEST_SIZE;
+
+	/*
+	 * Note that after this point, we have committed to using
+	 * this request as it is being used to both track the
+	 * state of engine initialisation and liveness of the
+	 * golden renderstate above. Think twice before you try
+	 * to cancel/unwind this request now.
+	 */
+
+	/* Unconditionally invalidate GPU caches and TLBs. */
+	ret = request->engine->emit_flush(request, EMIT_INVALIDATE);
+	if (ret)
+		return ret;
+
+	request->reserved_space -= EXECLISTS_REQUEST_SIZE;
+	return 0;
+}
+
+static void execlists_reset_prepare(struct intel_engine_cs *engine)
+{
+	struct intel_engine_execlists * const execlists = &engine->execlists;
+	unsigned long flags;
+
+	GEM_TRACE("%s: depth<-%d\n", engine->name,
+		  atomic_read(&execlists->tasklet.count));
+
+	/*
+	 * Prevent request submission to the hardware until we have
+	 * completed the reset in i915_gem_reset_finish(). If a request
+	 * is completed by one engine, it may then queue a request
+	 * to a second via its execlists->tasklet *just* as we are
+	 * calling engine->resume() and also writing the ELSP.
+	 * Turning off the execlists->tasklet until the reset is over
+	 * prevents the race.
+	 */
+	__tasklet_disable_sync_once(&execlists->tasklet);
+	GEM_BUG_ON(!reset_in_progress(execlists));
+
+	/* And flush any current direct submission. */
+	spin_lock_irqsave(&engine->active.lock, flags);
+	spin_unlock_irqrestore(&engine->active.lock, flags);
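+	/*
+	 * Taking and immediately releasing active.lock above acts purely
+	 * as a barrier: any direct submission already in progress holds
+	 * the lock, so it must have finished by the time we drop it again.
+	 */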
+
+	/*
+	 * We stop engines, otherwise we might get failed reset and a
+	 * dead gpu (on elk). Also as modern gpu as kbl can suffer
+	 * from system hang if batchbuffer is progressing when
+	 * the reset is issued, regardless of READY_TO_RESET ack.
+	 * Thus assume it is best to stop engines on all gens
+	 * where we have a gpu reset.
+	 *
+	 * WaKBLVECSSemaphoreWaitPoll:kbl (on ALL_ENGINES)
+	 *
+	 * FIXME: Wa for more modern gens needs to be validated
+	 */
+	intel_engine_stop_cs(engine);
+}
+
+static void reset_csb_pointers(struct intel_engine_cs *engine)
+{
+	struct intel_engine_execlists * const execlists = &engine->execlists;
+	const unsigned int reset_value = execlists->csb_size - 1;
+
+	ring_set_paused(engine, 0);
+
+	/*
+	 * After a reset, the HW starts writing into CSB entry [0]. We
+	 * therefore have to set our HEAD pointer back one entry so that
+	 * the *first* entry we check is entry 0. To complicate this further,
+	 * as we don't wait for the first interrupt after reset, we have to
+	 * fake the HW write to point back to the last entry so that our
+	 * inline comparison of our cached head position against the last HW
+	 * write works even before the first interrupt.
+	 */
+	execlists->csb_head = reset_value;
+	WRITE_ONCE(*execlists->csb_write, reset_value);
+	wmb(); /* Make sure this is visible to HW (paranoia?) */
+
+	/*
+	 * Sometimes Icelake forgets to reset its pointers on a GPU reset.
+	 * Bludgeon them with a mmio update to be sure.
+	 */
+	ENGINE_WRITE(engine, RING_CONTEXT_STATUS_PTR,
+		     reset_value << 8 | reset_value);
+	ENGINE_POSTING_READ(engine, RING_CONTEXT_STATUS_PTR);
+
+	invalidate_csb_entries(&execlists->csb_status[0],
+			       &execlists->csb_status[reset_value]);
+}
+
+static void __execlists_reset(struct intel_engine_cs *engine, bool stalled)
+{
+	struct intel_engine_execlists * const execlists = &engine->execlists;
+	struct intel_context *ce;
+	struct i915_request *rq;
+
+	mb(); /* paranoia: read the CSB pointers from after the reset */
+	clflush(execlists->csb_write);
+	mb();
+
+	process_csb(engine); /* drain preemption events */
+
+	/* Following the reset, we need to reload the CSB read/write pointers */
+	reset_csb_pointers(engine);
+
+	/*
+	 * Save the currently executing context, even if we completed
+	 * its request, it was still running at the time of the
+	 * reset and will have been clobbered.
+	 */
+	rq = execlists_active(execlists);
+	if (!rq)
+		goto unwind;
+
+	/* We still have requests in-flight; the engine should be active */
+	GEM_BUG_ON(!intel_engine_pm_is_awake(engine));
+
+	ce = rq->hw_context;
+	GEM_BUG_ON(!i915_vma_is_pinned(ce->state));
+
+	if (i915_request_completed(rq)) {
+		/* Idle context; tidy up the ring so we can restart afresh */
+		ce->ring->head = intel_ring_wrap(ce->ring, rq->tail);
+		goto out_replay;
+	}
+
+	/* Context has requests still in-flight; it should not be idle! */
+	GEM_BUG_ON(i915_active_is_idle(&ce->active));
+	rq = active_request(ce->timeline, rq);
+	ce->ring->head = intel_ring_wrap(ce->ring, rq->head);
+	GEM_BUG_ON(ce->ring->head == ce->ring->tail);
+
+	/*
+	 * If this request hasn't started yet, e.g. it is waiting on a
+	 * semaphore, we need to avoid skipping the request or else we
+	 * break the signaling chain. However, if the context is corrupt
+	 * the request will not restart and we will be stuck with a wedged
+	 * device. It is quite often the case that if we issue a reset
+	 * while the GPU is loading the context image, that the context
+	 * image becomes corrupt.
+	 *
+	 * Otherwise, if we have not started yet, the request should replay
+	 * perfectly and we do not need to flag the result as being erroneous.
+	 */
+	if (!i915_request_started(rq))
+		goto out_replay;
+
+	/*
+	 * If the request was innocent, we leave the request in the ELSP
+	 * and will try to replay it on restarting. The context image may
+	 * have been corrupted by the reset, in which case we may have
+	 * to service a new GPU hang, but more likely we can continue on
+	 * without impact.
+	 *
+	 * If the request was guilty, we presume the context is corrupt
+	 * and have to at least restore the RING register in the context
+	 * image back to the expected values to skip over the guilty request.
+	 */
+	__i915_request_reset(rq, stalled);
+	if (!stalled)
+		goto out_replay;
+
+	/*
+	 * We want a simple context + ring to execute the breadcrumb update.
+	 * We cannot rely on the context being intact across the GPU hang,
+	 * so clear it and rebuild just what we need for the breadcrumb.
+	 * All pending requests for this context will be zapped, and any
+	 * future request will be after userspace has had the opportunity
+	 * to recreate its own state.
+	 */
+	GEM_BUG_ON(!intel_context_is_pinned(ce));
+	intel_lr_context_restore_default_state(ce, engine);
+
+out_replay:
+	GEM_TRACE("%s replay {head:%04x, tail:%04x}\n",
+		  engine->name, ce->ring->head, ce->ring->tail);
+	intel_ring_update_space(ce->ring);
+	intel_lr_context_reset_reg_state(ce, engine);
+	intel_lr_context_update_reg_state(ce, engine);
+	ce->lrc_desc |= CTX_DESC_FORCE_RESTORE; /* paranoid: GPU was reset! */
+
+unwind:
+	/* Push back any incomplete requests for replay after the reset. */
+	cancel_port_requests(execlists);
+	__unwind_incomplete_requests(engine);
+}
+
+static void execlists_reset(struct intel_engine_cs *engine, bool stalled)
+{
+	unsigned long flags;
+
+	GEM_TRACE("%s\n", engine->name);
+
+	spin_lock_irqsave(&engine->active.lock, flags);
+
+	__execlists_reset(engine, stalled);
+
+	spin_unlock_irqrestore(&engine->active.lock, flags);
+}
+
+static void nop_submission_tasklet(unsigned long data)
+{
+	/* The driver is wedged; don't process any more events. */
+}
+
+static void execlists_cancel_requests(struct intel_engine_cs *engine)
+{
+	struct intel_engine_execlists * const execlists = &engine->execlists;
+	struct i915_request *rq, *rn;
+	struct rb_node *rb;
+	unsigned long flags;
+
+	GEM_TRACE("%s\n", engine->name);
+
+	/*
+	 * Before we call engine->cancel_requests(), we should have exclusive
+	 * access to the submission state. This is arranged for us by the
+	 * caller disabling the interrupt generation, the tasklet and other
+	 * threads that may then access the same state, giving us a free hand
+	 * to reset state. However, we still need to let lockdep be aware that
+	 * we know this state may be accessed in hardirq context, so we
+	 * disable the irq around this manipulation and we want to keep
+	 * the spinlock focused on its duties and not accidentally conflate
+	 * coverage to the submission's irq state. (Similarly, although we
+	 * shouldn't need to disable irq around the manipulation of the
+	 * submission's irq state, we also wish to remind ourselves that
+	 * it is irq state.)
+	 */
+	spin_lock_irqsave(&engine->active.lock, flags);
+
+	__execlists_reset(engine, true);
+
+	/* Mark all executing requests as skipped. */
+	list_for_each_entry(rq, &engine->active.requests, sched.link)
+		mark_eio(rq);
+
+	/* Flush the queued requests to the timeline list (for retiring). */
+	while ((rb = rb_first_cached(&execlists->queue))) {
+		struct i915_priolist *p = to_priolist(rb);
+		int i;
+
+		priolist_for_each_request_consume(rq, rn, p, i) {
+			mark_eio(rq);
+			__i915_request_submit(rq);
+		}
+
+		rb_erase_cached(&p->node, &execlists->queue);
+		i915_priolist_free(p);
+	}
+
+	/* Cancel all attached virtual engines */
+	while ((rb = rb_first_cached(&execlists->virtual))) {
+		struct intel_virtual_engine *ve =
+			rb_entry(rb, typeof(*ve), nodes[engine->id].rb);
+
+		rb_erase_cached(rb, &execlists->virtual);
+		RB_CLEAR_NODE(rb);
+
+		spin_lock(&ve->base.active.lock);
+		rq = fetch_and_zero(&ve->request);
+		if (rq) {
+			mark_eio(rq);
+
+			rq->engine = engine;
+			__i915_request_submit(rq);
+			i915_request_put(rq);
+
+			ve->base.execlists.queue_priority_hint = INT_MIN;
+		}
+		spin_unlock(&ve->base.active.lock);
+	}
+
+	/* Remaining _unready_ requests will be nop'ed when submitted */
+
+	execlists->queue_priority_hint = INT_MIN;
+	execlists->queue = RB_ROOT_CACHED;
+
+	GEM_BUG_ON(__tasklet_is_enabled(&execlists->tasklet));
+	execlists->tasklet.func = nop_submission_tasklet;
+
+	spin_unlock_irqrestore(&engine->active.lock, flags);
+}
+
+static void execlists_reset_finish(struct intel_engine_cs *engine)
+{
+	struct intel_engine_execlists * const execlists = &engine->execlists;
+
+	/*
+	 * After a GPU reset, we may have requests to replay. Do so now while
+	 * we still have the forcewake to be sure that the GPU is not allowed
+	 * to sleep before we restart and reload a context.
+	 */
+	GEM_BUG_ON(!reset_in_progress(execlists));
+	if (!RB_EMPTY_ROOT(&execlists->queue.rb_root))
+		execlists->tasklet.func(execlists->tasklet.data);
+
+	if (__tasklet_enable(&execlists->tasklet))
+		/* And kick in case we missed a new request submission. */
+		tasklet_hi_schedule(&execlists->tasklet);
+	GEM_TRACE("%s: depth->%d\n", engine->name,
+		  atomic_read(&execlists->tasklet.count));
+}
+
+static void execlists_park(struct intel_engine_cs *engine)
+{
+	cancel_timer(&engine->execlists.timer);
+	cancel_timer(&engine->execlists.preempt);
+}
+
+static void execlists_destroy(struct intel_engine_cs *engine)
+{
+	/* Synchronise with residual timers and any softirq they raise */
+	del_timer_sync(&engine->execlists.timer);
+	del_timer_sync(&engine->execlists.preempt);
+	tasklet_kill(&engine->execlists.tasklet);
+
+	intel_logical_ring_destroy(engine);
+}
+
+void intel_execlists_set_default_submission(struct intel_engine_cs *engine)
+{
+	engine->request_alloc = execlists_request_alloc;
+	engine->submit_request = execlists_submit_request;
+	engine->cancel_requests = execlists_cancel_requests;
+	engine->schedule = i915_schedule;
+	engine->execlists.tasklet.func = execlists_submission_tasklet;
+
+	engine->reset.prepare = execlists_reset_prepare;
+	engine->reset.reset = execlists_reset;
+	engine->reset.finish = execlists_reset_finish;
+
+	engine->destroy = execlists_destroy;
+	engine->park = execlists_park;
+	engine->unpark = NULL;
+
+	engine->flags |= I915_ENGINE_SUPPORTS_STATS;
+	if (!intel_vgpu_active(engine->i915)) {
+		engine->flags |= I915_ENGINE_HAS_SEMAPHORES;
+		if (HAS_LOGICAL_RING_PREEMPTION(engine->i915))
+			engine->flags |= I915_ENGINE_HAS_PREEMPTION;
+	}
+
+	if (INTEL_GEN(engine->i915) >= 12)
+		engine->flags |= I915_ENGINE_HAS_RELATIVE_MMIO;
+}
+
+int intel_execlists_submission_setup(struct intel_engine_cs *engine)
+{
+	tasklet_init(&engine->execlists.tasklet,
+		     execlists_submission_tasklet, (unsigned long)engine);
+	timer_setup(&engine->execlists.timer, execlists_timeslice, 0);
+	timer_setup(&engine->execlists.preempt, execlists_preempt, 0);
+
+	intel_logical_ring_setup(engine);
+
+	engine->set_default_submission = intel_execlists_set_default_submission;
+	engine->cops = &execlists_context_ops;
+
+	return 0;
+}
+
+int intel_execlists_submission_init(struct intel_engine_cs *engine)
+{
+	struct intel_engine_execlists * const execlists = &engine->execlists;
+	struct drm_i915_private *i915 = engine->i915;
+	struct intel_uncore *uncore = engine->uncore;
+	u32 base = engine->mmio_base;
+	int ret;
+
+	ret = intel_logical_ring_init(engine);
+	if (ret)
+		return ret;
+
+	if (HAS_LOGICAL_RING_ELSQ(i915)) {
+		execlists->submit_reg = uncore->regs +
+			i915_mmio_reg_offset(RING_EXECLIST_SQ_CONTENTS(base));
+		execlists->ctrl_reg = uncore->regs +
+			i915_mmio_reg_offset(RING_EXECLIST_CONTROL(base));
+	} else {
+		execlists->submit_reg = uncore->regs +
+			i915_mmio_reg_offset(RING_ELSP(base));
+	}
+
+	execlists->csb_status =
+		&engine->status_page.addr[I915_HWS_CSB_BUF0_INDEX];
+
+	execlists->csb_write =
+		&engine->status_page.addr[intel_hws_csb_write_index(i915)];
+
+	if (INTEL_GEN(i915) < 11)
+		execlists->csb_size = GEN8_CSB_ENTRIES;
+	else
+		execlists->csb_size = GEN11_CSB_ENTRIES;
+
+	reset_csb_pointers(engine);
+
+	return 0;
+}
+
+static intel_engine_mask_t
+virtual_submission_mask(struct intel_virtual_engine *ve)
+{
+	struct i915_request *rq;
+	intel_engine_mask_t mask;
+
+	rq = READ_ONCE(ve->request);
+	if (!rq)
+		return 0;
+
+	/* The rq is ready for submission; rq->execution_mask is now stable. */
+	mask = rq->execution_mask;
+	if (unlikely(!mask)) {
+		/* Invalid selection, submit to a random engine in error */
+		i915_request_skip(rq, -ENODEV);
+		mask = ve->siblings[0]->mask;
+	}
+
+	GEM_TRACE("%s: rq=%llx:%lld, mask=%x, prio=%d\n",
+		  ve->base.name,
+		  rq->fence.context, rq->fence.seqno,
+		  mask, ve->base.execlists.queue_priority_hint);
+
+	return mask;
+}
+
+static void virtual_submission_tasklet(unsigned long data)
+{
+	struct intel_virtual_engine * const ve =
+		(struct intel_virtual_engine *)data;
+	const int prio = ve->base.execlists.queue_priority_hint;
+	intel_engine_mask_t mask;
+	unsigned int n;
+
+	rcu_read_lock();
+	mask = virtual_submission_mask(ve);
+	rcu_read_unlock();
+	if (unlikely(!mask))
+		return;
+
+	local_irq_disable();
+	for (n = 0; READ_ONCE(ve->request) && n < ve->num_siblings; n++) {
+		struct intel_engine_cs *sibling = ve->siblings[n];
+		struct ve_node * const node = &ve->nodes[sibling->id];
+		struct rb_node **parent, *rb;
+		bool first;
+
+		if (unlikely(!(mask & sibling->mask))) {
+			if (!RB_EMPTY_NODE(&node->rb)) {
+				spin_lock(&sibling->active.lock);
+				rb_erase_cached(&node->rb,
+						&sibling->execlists.virtual);
+				RB_CLEAR_NODE(&node->rb);
+				spin_unlock(&sibling->active.lock);
+			}
+			continue;
+		}
+
+		spin_lock(&sibling->active.lock);
+
+		if (!RB_EMPTY_NODE(&node->rb)) {
+			/*
+			 * Cheat and avoid rebalancing the tree if we can
+			 * reuse this node in situ.
+			 */
+			first = rb_first_cached(&sibling->execlists.virtual) ==
+				&node->rb;
+			if (prio == node->prio || (prio > node->prio && first))
+				goto submit_engine;
+
+			rb_erase_cached(&node->rb, &sibling->execlists.virtual);
+		}
+
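+		/*
+		 * Open-coded rb-tree insertion: "first" stays true only
+		 * while we descend left branches, which is exactly the
+		 * leftmost hint that rb_insert_color_cached() expects.
+		 */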
+		rb = NULL;
+		first = true;
+		parent = &sibling->execlists.virtual.rb_root.rb_node;
+		while (*parent) {
+			struct ve_node *other;
+
+			rb = *parent;
+			other = rb_entry(rb, typeof(*other), rb);
+			if (prio > other->prio) {
+				parent = &rb->rb_left;
+			} else {
+				parent = &rb->rb_right;
+				first = false;
+			}
+		}
+
+		rb_link_node(&node->rb, rb, parent);
+		rb_insert_color_cached(&node->rb,
+				       &sibling->execlists.virtual,
+				       first);
+
+submit_engine:
+		GEM_BUG_ON(RB_EMPTY_NODE(&node->rb));
+		node->prio = prio;
+		if (first && prio > sibling->execlists.queue_priority_hint) {
+			sibling->execlists.queue_priority_hint = prio;
+			tasklet_hi_schedule(&sibling->execlists.tasklet);
+		}
+
+		spin_unlock(&sibling->active.lock);
+	}
+	local_irq_enable();
+}
+
+static void virtual_submit_request(struct i915_request *rq)
+{
+	struct intel_virtual_engine *ve = to_virtual_engine(rq->engine);
+	struct i915_request *old;
+	unsigned long flags;
+
+	GEM_TRACE("%s: rq=%llx:%lld\n",
+		  ve->base.name,
+		  rq->fence.context,
+		  rq->fence.seqno);
+
+	GEM_BUG_ON(ve->base.submit_request != virtual_submit_request);
+
+	spin_lock_irqsave(&ve->base.active.lock, flags);
+
+	old = ve->request;
+	if (old) { /* background completion event from preempt-to-busy */
+		GEM_BUG_ON(!i915_request_completed(old));
+		__i915_request_submit(old);
+		i915_request_put(old);
+	}
+
+	if (i915_request_completed(rq)) {
+		__i915_request_submit(rq);
+
+		ve->base.execlists.queue_priority_hint = INT_MIN;
+		ve->request = NULL;
+	} else {
+		ve->base.execlists.queue_priority_hint = rq_prio(rq);
+		ve->request = i915_request_get(rq);
+
+		GEM_BUG_ON(!list_empty(intel_virtual_engine_queue(ve)));
+		list_move_tail(&rq->sched.link, intel_virtual_engine_queue(ve));
+
+		tasklet_schedule(&ve->base.execlists.tasklet);
+	}
+
+	spin_unlock_irqrestore(&ve->base.active.lock, flags);
+}
+
+static void
+virtual_bond_execute(struct i915_request *rq, struct dma_fence *signal)
+{
+	struct intel_virtual_engine *ve = to_virtual_engine(rq->engine);
+	intel_engine_mask_t allowed, exec;
+	struct ve_bond *bond;
+
+	allowed = ~to_request(signal)->engine->mask;
+
+	bond = intel_virtual_engine_find_bond(ve, to_request(signal)->engine);
+	if (bond)
+		allowed &= bond->sibling_mask;
+
+	/* Restrict the bonded request to run on only the available engines */
+	exec = READ_ONCE(rq->execution_mask);
+	while (!try_cmpxchg(&rq->execution_mask, &exec, exec & allowed))
+		;
+
+	/* Prevent the master from being re-run on the bonded engines */
+	to_request(signal)->execution_mask &= ~allowed;
+}
+
+void intel_execlists_virtual_submission_init(struct intel_virtual_engine *ve)
+{
+	ve->base.request_alloc = execlists_request_alloc;
+	ve->base.submit_request = virtual_submit_request;
+	ve->base.bond_execute = virtual_bond_execute;
+	tasklet_init(&ve->base.execlists.tasklet,
+		     virtual_submission_tasklet,
+		     (unsigned long)ve);
+}
+
+void intel_execlists_show_requests(struct intel_engine_cs *engine,
+				   struct drm_printer *m,
+				   void (*show_request)(struct drm_printer *m,
+							struct i915_request *rq,
+							const char *prefix),
+				   unsigned int max)
+{
+	const struct intel_engine_execlists *execlists = &engine->execlists;
+	struct i915_request *rq, *last;
+	unsigned long flags;
+	unsigned int count;
+	struct rb_node *rb;
+
+	spin_lock_irqsave(&engine->active.lock, flags);
+
+	last = NULL;
+	count = 0;
+	list_for_each_entry(rq, &engine->active.requests, sched.link) {
+		if (count++ < max - 1)
+			show_request(m, rq, "\t\tE ");
+		else
+			last = rq;
+	}
+	if (last) {
+		if (count > max) {
+			drm_printf(m,
+				   "\t\t...skipping %d executing requests...\n",
+				   count - max);
+		}
+		show_request(m, last, "\t\tE ");
+	}
+
+	last = NULL;
+	count = 0;
+	if (execlists->queue_priority_hint != INT_MIN)
+		drm_printf(m, "\t\tQueue priority hint: %d\n",
+			   execlists->queue_priority_hint);
+	for (rb = rb_first_cached(&execlists->queue); rb; rb = rb_next(rb)) {
+		struct i915_priolist *p = rb_entry(rb, typeof(*p), node);
+		int i;
+
+		priolist_for_each_request(rq, p, i) {
+			if (count++ < max - 1)
+				show_request(m, rq, "\t\tQ ");
+			else
+				last = rq;
+		}
+	}
+	if (last) {
+		if (count > max) {
+			drm_printf(m,
+				   "\t\t...skipping %d queued requests...\n",
+				   count - max);
+		}
+		show_request(m, last, "\t\tQ ");
+	}
+
+	last = NULL;
+	count = 0;
+	for (rb = rb_first_cached(&execlists->virtual); rb; rb = rb_next(rb)) {
+		struct intel_virtual_engine *ve =
+			rb_entry(rb, typeof(*ve), nodes[engine->id].rb);
+		struct i915_request *rq = READ_ONCE(ve->request);
+
+		if (rq) {
+			if (count++ < max - 1)
+				show_request(m, rq, "\t\tV ");
+			else
+				last = rq;
+		}
+	}
+	if (last) {
+		if (count > max) {
+			drm_printf(m,
+				   "\t\t...skipping %d virtual requests...\n",
+				   count - max);
+		}
+		show_request(m, last, "\t\tV ");
+	}
+
+	spin_unlock_irqrestore(&engine->active.lock, flags);
+}
+
+bool
+intel_engine_in_execlists_submission_mode(const struct intel_engine_cs *engine)
+{
+	return engine->set_default_submission ==
+	       intel_execlists_set_default_submission;
+}
+
+#if IS_ENABLED(CONFIG_DRM_I915_SELFTEST)
+#include "selftest_execlists.c"
+#endif
diff --git a/drivers/gpu/drm/i915/gt/intel_execlists_submission.h b/drivers/gpu/drm/i915/gt/intel_execlists_submission.h
new file mode 100644
index 000000000000..b776bc60da52
--- /dev/null
+++ b/drivers/gpu/drm/i915/gt/intel_execlists_submission.h
@@ -0,0 +1,58 @@
+/*
+ * SPDX-License-Identifier: MIT
+ *
+ * Copyright © 2019 Intel Corporation
+ */
+
+#ifndef __INTEL_EXECLISTS_SUBMISSION__
+#define __INTEL_EXECLISTS_SUBMISSION__
+
+#include <linux/types.h>
+
+struct drm_printer;
+
+struct i915_request;
+struct intel_engine_cs;
+struct intel_virtual_engine;
+
+/* The docs specify that the write pointer wraps around after 5h, "After status
+ * is written out to the last available status QW at offset 5h, this pointer
+ * wraps to 0."
+ *
+ * Therefore, one must infer that even though there are 3 bits available, 6 and
+ * 7 appear to be reserved.
+ */
+#define GEN8_CSB_ENTRIES 6
+#define GEN8_CSB_PTR_MASK 0x7
+#define GEN8_CSB_READ_PTR_MASK (GEN8_CSB_PTR_MASK << 8)
+#define GEN8_CSB_WRITE_PTR_MASK (GEN8_CSB_PTR_MASK << 0)
+
+#define GEN11_CSB_ENTRIES 12
+#define GEN11_CSB_PTR_MASK 0xf
+#define GEN11_CSB_READ_PTR_MASK (GEN11_CSB_PTR_MASK << 8)
+#define GEN11_CSB_WRITE_PTR_MASK (GEN11_CSB_PTR_MASK << 0)
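+
+/*
+ * Decoding sketch (illustration only; at runtime the driver relies on the
+ * HWSP-cached CSB pointers rather than this register): given a raw
+ * RING_CONTEXT_STATUS_PTR value,
+ *
+ *	read  = (ptr & GEN8_CSB_READ_PTR_MASK) >> 8;
+ *	write = (ptr & GEN8_CSB_WRITE_PTR_MASK) >> 0;
+ *
+ * which matches the "reset_value << 8 | reset_value" written back by
+ * reset_csb_pointers().
+ */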
+
+enum {
+	INTEL_CONTEXT_SCHEDULE_IN = 0,
+	INTEL_CONTEXT_SCHEDULE_OUT,
+	INTEL_CONTEXT_SCHEDULE_PREEMPTED,
+};
+
+int intel_execlists_submission_setup(struct intel_engine_cs *engine);
+int intel_execlists_submission_init(struct intel_engine_cs *engine);
+
+void intel_execlists_set_default_submission(struct intel_engine_cs *engine);
+
+void intel_execlists_show_requests(struct intel_engine_cs *engine,
+				   struct drm_printer *m,
+				   void (*show_request)(struct drm_printer *m,
+							struct i915_request *rq,
+							const char *prefix),
+				   unsigned int max);
+
+void intel_execlists_virtual_submission_init(struct intel_virtual_engine *ve);
+
+bool
+intel_engine_in_execlists_submission_mode(const struct intel_engine_cs *engine);
+
+#endif /* __INTEL_EXECLISTS_SUBMISSION__ */
diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
index fbdd3bdd06f1..4f40cf64e996 100644
--- a/drivers/gpu/drm/i915/gt/intel_lrc.c
+++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
@@ -150,34 +150,6 @@
 #include "intel_virtual_engine.h"
 #include "intel_workarounds.h"
 
-#define RING_EXECLIST_QFULL		(1 << 0x2)
-#define RING_EXECLIST1_VALID		(1 << 0x3)
-#define RING_EXECLIST0_VALID		(1 << 0x4)
-#define RING_EXECLIST_ACTIVE_STATUS	(3 << 0xE)
-#define RING_EXECLIST1_ACTIVE		(1 << 0x11)
-#define RING_EXECLIST0_ACTIVE		(1 << 0x12)
-
-#define GEN8_CTX_STATUS_IDLE_ACTIVE	(1 << 0)
-#define GEN8_CTX_STATUS_PREEMPTED	(1 << 1)
-#define GEN8_CTX_STATUS_ELEMENT_SWITCH	(1 << 2)
-#define GEN8_CTX_STATUS_ACTIVE_IDLE	(1 << 3)
-#define GEN8_CTX_STATUS_COMPLETE	(1 << 4)
-#define GEN8_CTX_STATUS_LITE_RESTORE	(1 << 15)
-
-#define GEN8_CTX_STATUS_COMPLETED_MASK \
-	 (GEN8_CTX_STATUS_COMPLETE | GEN8_CTX_STATUS_PREEMPTED)
-
-#define CTX_DESC_FORCE_RESTORE BIT_ULL(2)
-
-#define GEN12_CTX_STATUS_SWITCHED_TO_NEW_QUEUE	(0x1) /* lower csb dword */
-#define GEN12_CTX_SWITCH_DETAIL(csb_dw)	((csb_dw) & 0xF) /* upper csb dword */
-#define GEN12_CSB_SW_CTX_ID_MASK		GENMASK(25, 15)
-#define GEN12_IDLE_CTX_ID		0x7FF
-#define GEN12_CSB_CTX_VALID(csb_dw) \
-	(FIELD_GET(GEN12_CSB_SW_CTX_ID_MASK, csb_dw) != GEN12_IDLE_CTX_ID)
-
-/* Typical size of the average request (2 pipecontrols and a MI_BB) */
-#define EXECLISTS_REQUEST_SIZE 64 /* bytes */
 #define WA_TAIL_DWORDS 2
 #define WA_TAIL_BYTES (sizeof(u32) * WA_TAIL_DWORDS)
 
@@ -186,37 +158,6 @@ static void lr_context_init_reg_state(u32 *reg_state,
 				      const struct intel_engine_cs *engine,
 				      const struct intel_ring *ring,
 				      bool close);
-static void
-lr_context_update_reg_state(const struct intel_context *ce,
-			    const struct intel_engine_cs *engine);
-
-static void mark_eio(struct i915_request *rq)
-{
-	if (i915_request_completed(rq))
-		return;
-
-	GEM_BUG_ON(i915_request_signaled(rq));
-
-	dma_fence_set_error(&rq->fence, -EIO);
-	i915_request_mark_complete(rq);
-}
-
-static struct i915_request *
-active_request(const struct intel_timeline * const tl, struct i915_request *rq)
-{
-	struct i915_request *active = rq;
-
-	rcu_read_lock();
-	list_for_each_entry_continue_reverse(rq, &tl->requests, link) {
-		if (i915_request_completed(rq))
-			break;
-
-		active = rq;
-	}
-	rcu_read_unlock();
-
-	return active;
-}
 
 static inline u32 intel_hws_preempt_address(struct intel_engine_cs *engine)
 {
@@ -224,164 +165,6 @@ static inline u32 intel_hws_preempt_address(struct intel_engine_cs *engine)
 		I915_GEM_HWS_PREEMPT_ADDR);
 }
 
-static inline void
-ring_set_paused(const struct intel_engine_cs *engine, int state)
-{
-	/*
-	 * We inspect HWS_PREEMPT with a semaphore inside
-	 * engine->emit_fini_breadcrumb. If the dword is true,
-	 * the ring is paused as the semaphore will busywait
-	 * until the dword is false.
-	 */
-	engine->status_page.addr[I915_GEM_HWS_PREEMPT] = state;
-	if (state)
-		wmb();
-}
-
-static inline struct i915_priolist *to_priolist(struct rb_node *rb)
-{
-	return rb_entry(rb, struct i915_priolist, node);
-}
-
-static inline int rq_prio(const struct i915_request *rq)
-{
-	return rq->sched.attr.priority;
-}
-
-static int effective_prio(const struct i915_request *rq)
-{
-	int prio = rq_prio(rq);
-
-	/*
-	 * If this request is special and must not be interrupted at any
-	 * cost, so be it. Note we are only checking the most recent request
-	 * in the context and so may be masking an earlier vip request. It
-	 * is hoped that under the conditions where nopreempt is used, this
-	 * will not matter (i.e. all requests to that context will be
-	 * nopreempt for as long as desired).
-	 */
-	if (i915_request_has_nopreempt(rq))
-		prio = I915_PRIORITY_UNPREEMPTABLE;
-
-	/*
-	 * On unwinding the active request, we give it a priority bump
-	 * if it has completed waiting on any semaphore. If we know that
-	 * the request has already started, we can prevent an unwanted
-	 * preempt-to-idle cycle by taking that into account now.
-	 */
-	if (__i915_request_has_started(rq))
-		prio |= I915_PRIORITY_NOSEMAPHORE;
-
-	/* Restrict mere WAIT boosts from triggering preemption */
-	BUILD_BUG_ON(__NO_PREEMPTION & ~I915_PRIORITY_MASK); /* only internal */
-	return prio | __NO_PREEMPTION;
-}
-
-static int queue_prio(const struct intel_engine_execlists *execlists)
-{
-	struct i915_priolist *p;
-	struct rb_node *rb;
-
-	rb = rb_first_cached(&execlists->queue);
-	if (!rb)
-		return INT_MIN;
-
-	/*
-	 * As the priolist[] are inverted, with the highest priority in [0],
-	 * we have to flip the index value to become priority.
-	 */
-	p = to_priolist(rb);
-	return ((p->priority + 1) << I915_USER_PRIORITY_SHIFT) - ffs(p->used);
-}
-
-static inline bool need_preempt(const struct intel_engine_cs *engine,
-				const struct i915_request *rq,
-				struct rb_node *rb)
-{
-	int last_prio;
-
-	if (!intel_engine_has_semaphores(engine))
-		return false;
-
-	/*
-	 * Check if the current priority hint merits a preemption attempt.
-	 *
-	 * We record the highest value priority we saw during rescheduling
-	 * prior to this dequeue, therefore we know that if it is strictly
-	 * less than the current tail of ESLP[0], we do not need to force
-	 * a preempt-to-idle cycle.
-	 *
-	 * However, the priority hint is a mere hint that we may need to
-	 * preempt. If that hint is stale or we may be trying to preempt
-	 * ourselves, ignore the request.
-	 *
-	 * More naturally we would write
-	 *      prio >= max(0, last);
-	 * except that we wish to prevent triggering preemption at the same
-	 * priority level: the task that is running should remain running
-	 * to preserve FIFO ordering of dependencies.
-	 */
-	last_prio = max(effective_prio(rq), I915_PRIORITY_NORMAL - 1);
-	if (engine->execlists.queue_priority_hint <= last_prio)
-		return false;
-
-	/*
-	 * Check against the first request in ELSP[1], it will, thanks to the
-	 * power of PI, be the highest priority of that context.
-	 */
-	if (!list_is_last(&rq->sched.link, &engine->active.requests) &&
-	    rq_prio(list_next_entry(rq, sched.link)) > last_prio)
-		return true;
-
-	if (rb) {
-		struct intel_virtual_engine *ve =
-			rb_entry(rb, typeof(*ve), nodes[engine->id].rb);
-		bool preempt = false;
-
-		if (engine == ve->siblings[0]) { /* only preempt one sibling */
-			struct i915_request *next;
-
-			rcu_read_lock();
-			next = READ_ONCE(ve->request);
-			if (next)
-				preempt = rq_prio(next) > last_prio;
-			rcu_read_unlock();
-		}
-
-		if (preempt)
-			return preempt;
-	}
-
-	/*
-	 * If the inflight context did not trigger the preemption, then maybe
-	 * it was the set of queued requests? Pick the highest priority in
-	 * the queue (the first active priolist) and see if it deserves to be
-	 * running instead of ELSP[0].
-	 *
-	 * The highest priority request in the queue can not be either
-	 * ELSP[0] or ELSP[1] as, thanks again to PI, if it was the same
-	 * context, it's priority would not exceed ELSP[0] aka last_prio.
-	 */
-	return queue_prio(&engine->execlists) > last_prio;
-}
-
-__maybe_unused static inline bool
-assert_priority_queue(const struct i915_request *prev,
-		      const struct i915_request *next)
-{
-	/*
-	 * Without preemption, the prev may refer to the still active element
-	 * which we refuse to let go.
-	 *
-	 * Even with preemption, there are times when we think it is better not
-	 * to preempt and leave an ostensibly lower priority request in flight.
-	 */
-	if (i915_request_is_active(prev))
-		return true;
-
-	return rq_prio(prev) >= rq_prio(next);
-}
-
 /*
  * The context descriptor encodes various attributes of a context,
  * including its GTT address and some flags. Because it's fairly
@@ -801,145 +584,7 @@ u32 *intel_lr_context_set_register_offsets(u32 *regs,
 	return set_offsets(regs, reg_offsets(engine), engine);
 }
 
-static struct i915_request *
-__unwind_incomplete_requests(struct intel_engine_cs *engine)
-{
-	struct i915_request *rq, *rn, *active = NULL;
-	struct list_head *uninitialized_var(pl);
-	int prio = I915_PRIORITY_INVALID;
-
-	lockdep_assert_held(&engine->active.lock);
-
-	list_for_each_entry_safe_reverse(rq, rn,
-					 &engine->active.requests,
-					 sched.link) {
-		if (i915_request_completed(rq))
-			continue; /* XXX */
-
-		__i915_request_unsubmit(rq);
-
-		/*
-		 * Push the request back into the queue for later resubmission.
-		 * If this request is not native to this physical engine (i.e.
-		 * it came from a virtual source), push it back onto the virtual
-		 * engine so that it can be moved across onto another physical
-		 * engine as load dictates.
-		 */
-		if (likely(rq->execution_mask == engine->mask)) {
-			GEM_BUG_ON(rq_prio(rq) == I915_PRIORITY_INVALID);
-			if (rq_prio(rq) != prio) {
-				prio = rq_prio(rq);
-				pl = i915_sched_lookup_priolist(engine, prio);
-			}
-			GEM_BUG_ON(RB_EMPTY_ROOT(&engine->execlists.queue.rb_root));
-
-			list_move(&rq->sched.link, pl);
-			active = rq;
-		} else {
-			struct intel_engine_cs *owner = rq->hw_context->engine;
-
-			/*
-			 * Decouple the virtual breadcrumb before moving it
-			 * back to the virtual engine -- we don't want the
-			 * request to complete in the background and try
-			 * and cancel the breadcrumb on the virtual engine
-			 * (instead of the old engine where it is linked)!
-			 */
-			if (test_bit(DMA_FENCE_FLAG_ENABLE_SIGNAL_BIT,
-				     &rq->fence.flags)) {
-				spin_lock_nested(&rq->lock,
-						 SINGLE_DEPTH_NESTING);
-				i915_request_cancel_breadcrumb(rq);
-				spin_unlock(&rq->lock);
-			}
-			rq->engine = owner;
-			owner->submit_request(rq);
-			active = NULL;
-		}
-	}
-
-	return active;
-}
-
-struct i915_request *
-execlists_unwind_incomplete_requests(struct intel_engine_execlists *execlists)
-{
-	struct intel_engine_cs *engine =
-		container_of(execlists, typeof(*engine), execlists);
-
-	return __unwind_incomplete_requests(engine);
-}
-
-static inline void
-execlists_context_status_change(struct i915_request *rq, unsigned long status)
-{
-	/*
-	 * Only used when GVT-g is enabled now. When GVT-g is disabled,
-	 * The compiler should eliminate this function as dead-code.
-	 */
-	if (!IS_ENABLED(CONFIG_DRM_I915_GVT))
-		return;
-
-	atomic_notifier_call_chain(&rq->engine->context_status_notifier,
-				   status, rq);
-}
-
-static void intel_engine_context_in(struct intel_engine_cs *engine)
-{
-	unsigned long flags;
-
-	if (READ_ONCE(engine->stats.enabled) == 0)
-		return;
-
-	write_seqlock_irqsave(&engine->stats.lock, flags);
-
-	if (engine->stats.enabled > 0) {
-		if (engine->stats.active++ == 0)
-			engine->stats.start = ktime_get();
-		GEM_BUG_ON(engine->stats.active == 0);
-	}
-
-	write_sequnlock_irqrestore(&engine->stats.lock, flags);
-}
-
-static void intel_engine_context_out(struct intel_engine_cs *engine)
-{
-	unsigned long flags;
-
-	if (READ_ONCE(engine->stats.enabled) == 0)
-		return;
-
-	write_seqlock_irqsave(&engine->stats.lock, flags);
-
-	if (engine->stats.enabled > 0) {
-		ktime_t last;
-
-		if (engine->stats.active && --engine->stats.active == 0) {
-			/*
-			 * Decrement the active context count and in case GPU
-			 * is now idle add up to the running total.
-			 */
-			last = ktime_sub(ktime_get(), engine->stats.start);
-
-			engine->stats.total = ktime_add(engine->stats.total,
-							last);
-		} else if (engine->stats.active == 0) {
-			/*
-			 * After turning on engine stats, context out might be
-			 * the first event in which case we account from the
-			 * time stats gathering was turned on.
-			 */
-			last = ktime_sub(ktime_get(), engine->stats.enabled_at);
-
-			engine->stats.total = ktime_add(engine->stats.total,
-							last);
-		}
-	}
-
-	write_sequnlock_irqrestore(&engine->stats.lock, flags);
-}
-
-static int lrc_ring_mi_mode(const struct intel_engine_cs *engine)
+int intel_lrc_ring_mi_mode(const struct intel_engine_cs *engine)
 {
 	if (INTEL_GEN(engine->i915) >= 12)
 		return 0x60;
@@ -951,1396 +596,17 @@ static int lrc_ring_mi_mode(const struct intel_engine_cs *engine)
 		return -1;
 }
 
-static void
-execlists_check_context(const struct intel_context *ce,
-			const struct intel_engine_cs *engine)
-{
-	const struct intel_ring *ring = ce->ring;
-	u32 *regs = ce->lrc_reg_state;
-	bool valid = true;
-	int x;
-
-	if (regs[CTX_RING_START] != i915_ggtt_offset(ring->vma)) {
-		pr_err("%s: context submitted with incorrect RING_START [%08x], expected %08x\n",
-		       engine->name,
-		       regs[CTX_RING_START],
-		       i915_ggtt_offset(ring->vma));
-		regs[CTX_RING_START] = i915_ggtt_offset(ring->vma);
-		valid = false;
-	}
-
-	if ((regs[CTX_RING_CTL] & ~(RING_WAIT | RING_WAIT_SEMAPHORE)) !=
-	    (RING_CTL_SIZE(ring->size) | RING_VALID)) {
-		pr_err("%s: context submitted with incorrect RING_CTL [%08x], expected %08x\n",
-		       engine->name,
-		       regs[CTX_RING_CTL],
-		       (u32)(RING_CTL_SIZE(ring->size) | RING_VALID));
-		regs[CTX_RING_CTL] = RING_CTL_SIZE(ring->size) | RING_VALID;
-		valid = false;
-	}
-
-	x = lrc_ring_mi_mode(engine);
-	if (x != -1 && regs[x + 1] & (regs[x + 1] >> 16) & STOP_RING) {
-		pr_err("%s: context submitted with STOP_RING [%08x] in RING_MI_MODE\n",
-		       engine->name, regs[x + 1]);
-		regs[x + 1] &= ~STOP_RING;
-		regs[x + 1] |= STOP_RING << 16;
-		valid = false;
-	}
-
-	WARN_ONCE(!valid, "Invalid lrc state found before submission\n");
-}
-
-static void lr_context_restore_default_state(struct intel_context *ce,
-					     struct intel_engine_cs *engine)
+void intel_lr_context_restore_default_state(struct intel_context *ce,
+					    struct intel_engine_cs *engine)
 {
 	u32 *regs = ce->lrc_reg_state;
 
-	if (engine->pinned_default_state)
-		memcpy(regs, /* skip restoring the vanilla PPHWSP */
-		       engine->pinned_default_state + LRC_STATE_PN * PAGE_SIZE,
-		       engine->context_size - PAGE_SIZE);
-
-	lr_context_init_reg_state(regs, ce, engine, ce->ring, false);
-}
-
-static void reset_active(struct i915_request *rq,
-			 struct intel_engine_cs *engine)
-{
-	struct intel_context * const ce = rq->hw_context;
-	u32 head;
-
-	/*
-	 * The executing context has been cancelled. We want to prevent
-	 * further execution along this context and propagate the error on
-	 * to anything depending on its results.
-	 *
-	 * In __i915_request_submit(), we apply the -EIO and remove the
-	 * requests' payloads for any banned requests. But first, we must
-	 * rewind the context back to the start of the incomplete request so
-	 * that we do not jump back into the middle of the batch.
-	 *
-	 * We preserve the breadcrumbs and semaphores of the incomplete
-	 * requests so that inter-timeline dependencies (i.e other timelines)
-	 * remain correctly ordered. And we defer to __i915_request_submit()
-	 * so that all asynchronous waits are correctly handled.
-	 */
-	GEM_TRACE("%s(%s): { rq=%llx:%lld }\n",
-		  __func__, engine->name, rq->fence.context, rq->fence.seqno);
-
-	/* On resubmission of the active request, payload will be scrubbed */
-	if (i915_request_completed(rq))
-		head = rq->tail;
-	else
-		head = active_request(ce->timeline, rq)->head;
-	ce->ring->head = intel_ring_wrap(ce->ring, head);
-	intel_ring_update_space(ce->ring);
-
-	/* Scrub the context image to prevent replaying the previous batch */
-	lr_context_restore_default_state(ce, engine);
-	lr_context_update_reg_state(ce, engine);
-
-	/* We've switched away, so this should be a no-op, but intent matters */
-	ce->lrc_desc |= CTX_DESC_FORCE_RESTORE;
-}
-
-static inline struct intel_engine_cs *
-__execlists_schedule_in(struct i915_request *rq)
-{
-	struct intel_engine_cs * const engine = rq->engine;
-	struct intel_context * const ce = rq->hw_context;
-
-	intel_context_get(ce);
-
-	if (unlikely(i915_gem_context_is_banned(ce->gem_context)))
-		reset_active(rq, engine);
-
-	if (IS_ENABLED(CONFIG_DRM_I915_DEBUG_GEM))
-		execlists_check_context(ce, engine);
-
-	if (ce->tag) {
-		/* Use a fixed tag for OA and friends */
-		ce->lrc_desc |= (u64)ce->tag << 32;
-	} else {
-		/* We don't need a strict matching tag, just different values */
-		ce->lrc_desc &= ~GENMASK_ULL(47, 37);
-		ce->lrc_desc |=
-			(u64)(engine->context_tag++ % NUM_CONTEXT_TAG) <<
-			GEN11_SW_CTX_ID_SHIFT;
-		BUILD_BUG_ON(NUM_CONTEXT_TAG > GEN12_MAX_CONTEXT_HW_ID);
-	}
-
-	__intel_gt_pm_get(engine->gt);
-	execlists_context_status_change(rq, INTEL_CONTEXT_SCHEDULE_IN);
-	intel_engine_context_in(engine);
-
-	return engine;
-}
-
-static inline struct i915_request *
-execlists_schedule_in(struct i915_request *rq, int idx)
-{
-	struct intel_context * const ce = rq->hw_context;
-	struct intel_engine_cs *old;
-
-	GEM_BUG_ON(!intel_engine_pm_is_awake(rq->engine));
-	trace_i915_request_in(rq, idx);
-
-	old = READ_ONCE(ce->inflight);
-	do {
-		if (!old) {
-			WRITE_ONCE(ce->inflight, __execlists_schedule_in(rq));
-			break;
-		}
-	} while (!try_cmpxchg(&ce->inflight, &old, ptr_inc(old)));
-
-	GEM_BUG_ON(intel_context_inflight(ce) != rq->engine);
-	return i915_request_get(rq);
-}
-
-static void kick_siblings(struct i915_request *rq, struct intel_context *ce)
-{
-	struct intel_virtual_engine *ve =
-		container_of(ce, typeof(*ve), context);
-	struct i915_request *next = READ_ONCE(ve->request);
-
-	if (next && next->execution_mask & ~rq->execution_mask)
-		tasklet_schedule(&ve->base.execlists.tasklet);
-}
-
-static inline void
-__execlists_schedule_out(struct i915_request *rq,
-			 struct intel_engine_cs * const engine)
-{
-	struct intel_context * const ce = rq->hw_context;
-
-	/*
-	 * NB process_csb() is not under the engine->active.lock and hence
-	 * schedule_out can race with schedule_in meaning that we should
-	 * refrain from doing non-trivial work here.
-	 */
-
-	/*
-	 * If we have just completed this context, the engine may now be
-	 * idle and we want to re-enter powersaving.
-	 */
-	if (list_is_last(&rq->link, &ce->timeline->requests) &&
-	    i915_request_completed(rq))
-		intel_engine_add_retire(engine, ce->timeline);
-
-	intel_engine_context_out(engine);
-	execlists_context_status_change(rq, INTEL_CONTEXT_SCHEDULE_OUT);
-	intel_gt_pm_put_async(engine->gt);
-
-	/*
-	 * If this is part of a virtual engine, its next request may
-	 * have been blocked waiting for access to the active context.
-	 * We have to kick all the siblings again in case we need to
-	 * switch (e.g. the next request is not runnable on this
-	 * engine). Hopefully, we will already have submitted the next
-	 * request before the tasklet runs and do not need to rebuild
-	 * each virtual tree and kick everyone again.
-	 */
-	if (ce->engine != engine)
-		kick_siblings(rq, ce);
-
-	intel_context_put(ce);
-}
-
-static inline void
-execlists_schedule_out(struct i915_request *rq)
-{
-	struct intel_context * const ce = rq->hw_context;
-	struct intel_engine_cs *cur, *old;
-
-	trace_i915_request_out(rq);
-
-	old = READ_ONCE(ce->inflight);
-	do
-		cur = ptr_unmask_bits(old, 2) ? ptr_dec(old) : NULL;
-	while (!try_cmpxchg(&ce->inflight, &old, cur));
-	if (!cur)
-		__execlists_schedule_out(rq, old);
-
-	i915_request_put(rq);
-}
-
-static u64 execlists_update_context(struct i915_request *rq)
-{
-	struct intel_context *ce = rq->hw_context;
-	u64 desc = ce->lrc_desc;
-	u32 tail;
-
-	/*
-	 * WaIdleLiteRestore:bdw,skl
-	 *
-	 * We should never submit the context with the same RING_TAIL twice
-	 * just in case we submit an empty ring, which confuses the HW.
-	 *
-	 * We append a couple of NOOPs (gen8_emit_wa_tail) after the end of
-	 * the normal request to be able to always advance the RING_TAIL on
-	 * subsequent resubmissions (for lite restore). Should that fail us,
-	 * and we try and submit the same tail again, force the context
-	 * reload.
-	 */
-	tail = intel_ring_set_tail(rq->ring, rq->tail);
-	if (unlikely(ce->lrc_reg_state[CTX_RING_TAIL] == tail))
-		desc |= CTX_DESC_FORCE_RESTORE;
-	ce->lrc_reg_state[CTX_RING_TAIL] = tail;
-	rq->tail = rq->wa_tail;
-
-	/*
-	 * Make sure the context image is complete before we submit it to HW.
-	 *
-	 * Ostensibly, writes (including the WCB) should be flushed prior to
-	 * an uncached write such as our mmio register access, the empirical
-	 * evidence (esp. on Braswell) suggests that the WC write into memory
-	 * may not be visible to the HW prior to the completion of the UC
-	 * register write and that we may begin execution from the context
-	 * before its image is complete leading to invalid PD chasing.
-	 */
-	wmb();
-
-	/* Wa_1607138340:tgl */
-	if (IS_TGL_REVID(rq->i915, TGL_REVID_A0, TGL_REVID_A0))
-		desc |= CTX_DESC_FORCE_RESTORE;
-
-	ce->lrc_desc &= ~CTX_DESC_FORCE_RESTORE;
-	return desc;
-}
-
-static inline void write_desc(struct intel_engine_execlists *execlists, u64 desc, u32 port)
-{
-	if (execlists->ctrl_reg) {
-		writel(lower_32_bits(desc), execlists->submit_reg + port * 2);
-		writel(upper_32_bits(desc), execlists->submit_reg + port * 2 + 1);
-	} else {
-		writel(upper_32_bits(desc), execlists->submit_reg);
-		writel(lower_32_bits(desc), execlists->submit_reg);
-	}
-}
-
-static __maybe_unused void
-trace_ports(const struct intel_engine_execlists *execlists,
-	    const char *msg,
-	    struct i915_request * const *ports)
-{
-	const struct intel_engine_cs *engine =
-		container_of(execlists, typeof(*engine), execlists);
-
-	if (!ports[0])
-		return;
-
-	GEM_TRACE("%s: %s { %llx:%lld%s, %llx:%lld }\n",
-		  engine->name, msg,
-		  ports[0]->fence.context,
-		  ports[0]->fence.seqno,
-		  i915_request_completed(ports[0]) ? "!" :
-		  i915_request_started(ports[0]) ? "*" :
-		  "",
-		  ports[1] ? ports[1]->fence.context : 0,
-		  ports[1] ? ports[1]->fence.seqno : 0);
-}
-
-static __maybe_unused bool
-assert_pending_valid(const struct intel_engine_execlists *execlists,
-		     const char *msg)
-{
-	struct i915_request * const *port, *rq;
-	struct intel_context *ce = NULL;
-
-	trace_ports(execlists, msg, execlists->pending);
-
-	if (!execlists->pending[0]) {
-		GEM_TRACE_ERR("Nothing pending for promotion!\n");
-		return false;
-	}
-
-	if (execlists->pending[execlists_num_ports(execlists)]) {
-		GEM_TRACE_ERR("Excess pending[%d] for promotion!\n",
-			      execlists_num_ports(execlists));
-		return false;
-	}
-
-	for (port = execlists->pending; (rq = *port); port++) {
-		unsigned long flags;
-		bool ok = true;
-
-		GEM_BUG_ON(!kref_read(&rq->fence.refcount));
-		GEM_BUG_ON(!i915_request_is_active(rq));
-
-		if (ce == rq->hw_context) {
-			GEM_TRACE_ERR("Dup context:%llx in pending[%zd]\n",
-				      ce->timeline->fence_context,
-				      port - execlists->pending);
-			return false;
-		}
-		ce = rq->hw_context;
-
-		/* Hold tightly onto the lock to prevent concurrent retires! */
-		if (!spin_trylock_irqsave(&rq->lock, flags))
-			continue;
-
-		if (i915_request_completed(rq))
-			goto unlock;
-
-		if (i915_active_is_idle(&ce->active) &&
-		    !i915_gem_context_is_kernel(ce->gem_context)) {
-			GEM_TRACE_ERR("Inactive context:%llx in pending[%zd]\n",
-				      ce->timeline->fence_context,
-				      port - execlists->pending);
-			ok = false;
-			goto unlock;
-		}
-
-		if (!i915_vma_is_pinned(ce->state)) {
-			GEM_TRACE_ERR("Unpinned context:%llx in pending[%zd]\n",
-				      ce->timeline->fence_context,
-				      port - execlists->pending);
-			ok = false;
-			goto unlock;
-		}
-
-		if (!i915_vma_is_pinned(ce->ring->vma)) {
-			GEM_TRACE_ERR("Unpinned ring:%llx in pending[%zd]\n",
-				      ce->timeline->fence_context,
-				      port - execlists->pending);
-			ok = false;
-			goto unlock;
-		}
-
-unlock:
-		spin_unlock_irqrestore(&rq->lock, flags);
-		if (!ok)
-			return false;
-	}
-
-	return ce;
-}
-
-static void execlists_submit_ports(struct intel_engine_cs *engine)
-{
-	struct intel_engine_execlists *execlists = &engine->execlists;
-	unsigned int n;
-
-	GEM_BUG_ON(!assert_pending_valid(execlists, "submit"));
-
-	/*
-	 * We can skip acquiring intel_runtime_pm_get() here as it was taken
-	 * on our behalf by the request (see i915_gem_mark_busy()) and it will
-	 * not be relinquished until the device is idle (see
-	 * i915_gem_idle_work_handler()). As a precaution, we make sure
-	 * that all ELSP are drained i.e. we have processed the CSB,
-	 * before allowing ourselves to idle and calling intel_runtime_pm_put().
-	 */
-	GEM_BUG_ON(!intel_engine_pm_is_awake(engine));
-
-	/*
-	 * ELSQ note: the submit queue is not cleared after being submitted
-	 * to the HW so we need to make sure we always clean it up. This is
-	 * currently ensured by the fact that we always write the same number
-	 * of elsq entries, keep this in mind before changing the loop below.
-	 */
-	for (n = execlists_num_ports(execlists); n--; ) {
-		struct i915_request *rq = execlists->pending[n];
-
-		write_desc(execlists,
-			   rq ? execlists_update_context(rq) : 0,
-			   n);
-	}
-
-	/* we need to manually load the submit queue */
-	if (execlists->ctrl_reg)
-		writel(EL_CTRL_LOAD, execlists->ctrl_reg);
-}
-
-static bool ctx_single_port_submission(const struct intel_context *ce)
-{
-	return (IS_ENABLED(CONFIG_DRM_I915_GVT) &&
-		i915_gem_context_force_single_submission(ce->gem_context));
-}
-
-static bool can_merge_ctx(const struct intel_context *prev,
-			  const struct intel_context *next)
-{
-	if (prev != next)
-		return false;
-
-	if (ctx_single_port_submission(prev))
-		return false;
-
-	return true;
-}
-
-static bool can_merge_rq(const struct i915_request *prev,
-			 const struct i915_request *next)
-{
-	GEM_BUG_ON(prev == next);
-	GEM_BUG_ON(!assert_priority_queue(prev, next));
-
-	/*
-	 * We do not submit known completed requests. Therefore if the next
-	 * request is already completed, we can pretend to merge it in
-	 * with the previous context (and we will skip updating the ELSP
-	 * and tracking). Thus hopefully keeping the ELSP full with active
-	 * contexts, despite the best efforts of preempt-to-busy to confuse
-	 * us.
-	 */
-	if (i915_request_completed(next))
-		return true;
-
-	if (unlikely((prev->flags ^ next->flags) &
-		     (I915_REQUEST_NOPREEMPT | I915_REQUEST_SENTINEL)))
-		return false;
-
-	if (!can_merge_ctx(prev->hw_context, next->hw_context))
-		return false;
-
-	return true;
-}
-
-static bool virtual_matches(const struct intel_virtual_engine *ve,
-			    const struct i915_request *rq,
-			    const struct intel_engine_cs *engine)
-{
-	const struct intel_engine_cs *inflight;
-
-	if (!(rq->execution_mask & engine->mask)) /* We peeked too soon! */
-		return false;
-
-	/*
-	 * We track when the HW has completed saving the context image
-	 * (i.e. when we have seen the final CS event switching out of
-	 * the context) and must not overwrite the context image before
-	 * then. This restricts us to only using the active engine
-	 * while the previous virtualized request is inflight (so
-	 * we reuse the register offsets). This is a very small
-	 * hystersis on the greedy seelction algorithm.
-	 */
-	inflight = intel_context_inflight(&ve->context);
-	if (inflight && inflight != engine)
-		return false;
-
-	return true;
-}
-
-static void virtual_xfer_breadcrumbs(struct intel_virtual_engine *ve,
-				     struct intel_engine_cs *engine)
-{
-	struct intel_engine_cs *old = ve->siblings[0];
-
-	/* All unattached (rq->engine == old) must already be completed */
-
-	spin_lock(&old->breadcrumbs.irq_lock);
-	if (!list_empty(&ve->context.signal_link)) {
-		list_move_tail(&ve->context.signal_link,
-			       &engine->breadcrumbs.signalers);
-		intel_engine_queue_breadcrumbs(engine);
-	}
-	spin_unlock(&old->breadcrumbs.irq_lock);
-}
-
-static struct i915_request *
-last_active(const struct intel_engine_execlists *execlists)
-{
-	struct i915_request * const *last = READ_ONCE(execlists->active);
-
-	while (*last && i915_request_completed(*last))
-		last++;
-
-	return *last;
-}
-
-static void defer_request(struct i915_request *rq, struct list_head * const pl)
-{
-	LIST_HEAD(list);
-
-	/*
-	 * We want to move the interrupted request to the back of
-	 * the round-robin list (i.e. its priority level), but
-	 * in doing so, we must then move all requests that were in
-	 * flight and were waiting for the interrupted request to
-	 * be run after it again.
-	 */
-	do {
-		struct i915_dependency *p;
-
-		GEM_BUG_ON(i915_request_is_active(rq));
-		list_move_tail(&rq->sched.link, pl);
-
-		list_for_each_entry(p, &rq->sched.waiters_list, wait_link) {
-			struct i915_request *w =
-				container_of(p->waiter, typeof(*w), sched);
-
-			/* Leave semaphores spinning on the other engines */
-			if (w->engine != rq->engine)
-				continue;
-
-			/* No waiter should start before its signaler */
-			GEM_BUG_ON(i915_request_started(w) &&
-				   !i915_request_completed(rq));
-
-			GEM_BUG_ON(i915_request_is_active(w));
-			if (list_empty(&w->sched.link))
-				continue; /* Not yet submitted; unready */
-
-			if (rq_prio(w) < rq_prio(rq))
-				continue;
-
-			GEM_BUG_ON(rq_prio(w) > rq_prio(rq));
-			list_move_tail(&w->sched.link, &list);
-		}
-
-		rq = list_first_entry_or_null(&list, typeof(*rq), sched.link);
-	} while (rq);
-}
-
-static void defer_active(struct intel_engine_cs *engine)
-{
-	struct i915_request *rq;
-
-	rq = __unwind_incomplete_requests(engine);
-	if (!rq)
-		return;
-
-	defer_request(rq, i915_sched_lookup_priolist(engine, rq_prio(rq)));
-}
-
-static bool
-need_timeslice(struct intel_engine_cs *engine, const struct i915_request *rq)
-{
-	int hint;
-
-	if (!intel_engine_has_timeslices(engine))
-		return false;
-
-	if (list_is_last(&rq->sched.link, &engine->active.requests))
-		return false;
-
-	hint = max(rq_prio(list_next_entry(rq, sched.link)),
-		   engine->execlists.queue_priority_hint);
-
-	return hint >= effective_prio(rq);
-}
-
-static int
-switch_prio(struct intel_engine_cs *engine, const struct i915_request *rq)
-{
-	if (list_is_last(&rq->sched.link, &engine->active.requests))
-		return INT_MIN;
-
-	return rq_prio(list_next_entry(rq, sched.link));
-}
-
-static inline unsigned long
-timeslice(const struct intel_engine_cs *engine)
-{
-	return READ_ONCE(engine->props.timeslice_duration_ms);
-}
-
-static unsigned long
-active_timeslice(const struct intel_engine_cs *engine)
-{
-	const struct i915_request *rq = *engine->execlists.active;
-
-	if (i915_request_completed(rq))
-		return 0;
-
-	if (engine->execlists.switch_priority_hint < effective_prio(rq))
-		return 0;
-
-	return timeslice(engine);
-}
-
-static void set_timeslice(struct intel_engine_cs *engine)
-{
-	if (!intel_engine_has_timeslices(engine))
-		return;
-
-	set_timer_ms(&engine->execlists.timer, active_timeslice(engine));
-}
-
-static void record_preemption(struct intel_engine_execlists *execlists)
-{
-	(void)I915_SELFTEST_ONLY(execlists->preempt_hang.count++);
-}
-
-static unsigned long active_preempt_timeout(struct intel_engine_cs *engine)
-{
-	struct i915_request *rq;
-
-	rq = last_active(&engine->execlists);
-	if (!rq)
-		return 0;
-
-	/* Force a fast reset for terminated contexts (ignoring sysfs!) */
-	if (unlikely(i915_gem_context_is_banned(rq->gem_context)))
-		return 1;
-
-	return READ_ONCE(engine->props.preempt_timeout_ms);
-}
-
-static void set_preempt_timeout(struct intel_engine_cs *engine)
-{
-	if (!intel_engine_has_preempt_reset(engine))
-		return;
-
-	set_timer_ms(&engine->execlists.preempt,
-		     active_preempt_timeout(engine));
-}
-
-static void execlists_dequeue(struct intel_engine_cs *engine)
-{
-	struct intel_engine_execlists * const execlists = &engine->execlists;
-	struct i915_request **port = execlists->pending;
-	struct i915_request ** const last_port = port + execlists->port_mask;
-	struct i915_request *last;
-	struct rb_node *rb;
-	bool submit = false;
-
-	/*
-	 * Hardware submission is through 2 ports. Conceptually each port
-	 * has a (RING_START, RING_HEAD, RING_TAIL) tuple. RING_START is
-	 * static for a context, and unique to each, so we only execute
-	 * requests belonging to a single context from each ring. RING_HEAD
-	 * is maintained by the CS in the context image, it marks the place
-	 * where it got up to last time, and through RING_TAIL we tell the CS
-	 * where we want to execute up to this time.
-	 *
-	 * In this list the requests are in order of execution. Consecutive
-	 * requests from the same context are adjacent in the ringbuffer. We
-	 * can combine these requests into a single RING_TAIL update:
-	 *
-	 *              RING_HEAD...req1...req2
-	 *                                    ^- RING_TAIL
-	 * since to execute req2 the CS must first execute req1.
-	 *
-	 * Our goal then is to point each port to the end of a consecutive
-	 * sequence of requests as being the most optimal (fewest wake ups
-	 * and context switches) submission.
-	 */
-
-	for (rb = rb_first_cached(&execlists->virtual); rb; ) {
-		struct intel_virtual_engine *ve =
-			rb_entry(rb, typeof(*ve), nodes[engine->id].rb);
-		struct i915_request *rq = READ_ONCE(ve->request);
-
-		if (!rq) { /* lazily cleanup after another engine handled rq */
-			rb_erase_cached(rb, &execlists->virtual);
-			RB_CLEAR_NODE(rb);
-			rb = rb_first_cached(&execlists->virtual);
-			continue;
-		}
-
-		if (!virtual_matches(ve, rq, engine)) {
-			rb = rb_next(rb);
-			continue;
-		}
-
-		break;
-	}
-
-	/*
-	 * If the queue is higher priority than the last
-	 * request in the currently active context, submit afresh.
-	 * We will resubmit again afterwards in case we need to split
-	 * the active context to interject the preemption request,
-	 * i.e. we will retrigger preemption following the ack in case
-	 * of trouble.
-	 */
-	last = last_active(execlists);
-	if (last) {
-		if (need_preempt(engine, last, rb)) {
-			GEM_TRACE("%s: preempting last=%llx:%lld, prio=%d, hint=%d\n",
-				  engine->name,
-				  last->fence.context,
-				  last->fence.seqno,
-				  last->sched.attr.priority,
-				  execlists->queue_priority_hint);
-			record_preemption(execlists);
-
-			/*
-			 * Don't let the RING_HEAD advance past the breadcrumb
-			 * as we unwind (and until we resubmit) so that we do
-			 * not accidentally tell it to go backwards.
-			 */
-			ring_set_paused(engine, 1);
-
-			/*
-			 * Note that we have not stopped the GPU at this point,
-			 * so we are unwinding the incomplete requests as they
-			 * remain inflight and so by the time we do complete
-			 * the preemption, some of the unwound requests may
-			 * complete!
-			 */
-			__unwind_incomplete_requests(engine);
-
-			/*
-			 * If we need to return to the preempted context, we
-			 * need to skip the lite-restore and force it to
-			 * reload the RING_TAIL. Otherwise, the HW has a
-			 * tendency to ignore us rewinding the TAIL to the
-			 * end of an earlier request.
-			 */
-			last->hw_context->lrc_desc |= CTX_DESC_FORCE_RESTORE;
-			last = NULL;
-		} else if (need_timeslice(engine, last) &&
-			   timer_expired(&engine->execlists.timer)) {
-			GEM_TRACE("%s: expired last=%llx:%lld, prio=%d, hint=%d\n",
-				  engine->name,
-				  last->fence.context,
-				  last->fence.seqno,
-				  last->sched.attr.priority,
-				  execlists->queue_priority_hint);
-
-			ring_set_paused(engine, 1);
-			defer_active(engine);
-
-			/*
-			 * Unlike for preemption, if we rewind and continue
-			 * executing the same context as previously active,
-			 * the order of execution will remain the same and
-			 * the tail will only advance. We do not need to
-			 * force a full context restore, as a lite-restore
-			 * is sufficient to resample the monotonic TAIL.
-			 *
-			 * If we switch to any other context, similarly we
-			 * will not rewind TAIL of current context, and
-			 * normal save/restore will preserve state and allow
-			 * us to later continue executing the same request.
-			 */
-			last = NULL;
-		} else {
-			/*
-			 * Otherwise if we already have a request pending
-			 * for execution after the current one, we can
-			 * just wait until the next CS event before
-			 * queuing more. In either case we will force a
-			 * lite-restore preemption event, but if we wait
-			 * we hopefully coalesce several updates into a single
-			 * submission.
-			 */
-			if (!list_is_last(&last->sched.link,
-					  &engine->active.requests)) {
-				/*
-				 * Even if ELSP[1] is occupied and not worthy
-				 * of timeslices, our queue might be.
-				 */
-				if (!execlists->timer.expires &&
-				    need_timeslice(engine, last))
-					set_timer_ms(&execlists->timer,
-						     timeslice(engine));
-
-				return;
-			}
-		}
-	}
-
-	while (rb) { /* XXX virtual is always taking precedence */
-		struct intel_virtual_engine *ve =
-			rb_entry(rb, typeof(*ve), nodes[engine->id].rb);
-		struct i915_request *rq;
-
-		spin_lock(&ve->base.active.lock);
-
-		rq = ve->request;
-		if (unlikely(!rq)) { /* lost the race to a sibling */
-			spin_unlock(&ve->base.active.lock);
-			rb_erase_cached(rb, &execlists->virtual);
-			RB_CLEAR_NODE(rb);
-			rb = rb_first_cached(&execlists->virtual);
-			continue;
-		}
-
-		GEM_BUG_ON(rq != ve->request);
-		GEM_BUG_ON(rq->engine != &ve->base);
-		GEM_BUG_ON(rq->hw_context != &ve->context);
-
-		if (rq_prio(rq) >= queue_prio(execlists)) {
-			if (!virtual_matches(ve, rq, engine)) {
-				spin_unlock(&ve->base.active.lock);
-				rb = rb_next(rb);
-				continue;
-			}
-
-			if (last && !can_merge_rq(last, rq)) {
-				spin_unlock(&ve->base.active.lock);
-				return; /* leave this for another */
-			}
-
-			GEM_TRACE("%s: virtual rq=%llx:%lld%s, new engine? %s\n",
-				  engine->name,
-				  rq->fence.context,
-				  rq->fence.seqno,
-				  i915_request_completed(rq) ? "!" :
-				  i915_request_started(rq) ? "*" :
-				  "",
-				  yesno(engine != ve->siblings[0]));
-
-			ve->request = NULL;
-			ve->base.execlists.queue_priority_hint = INT_MIN;
-			rb_erase_cached(rb, &execlists->virtual);
-			RB_CLEAR_NODE(rb);
-
-			GEM_BUG_ON(!(rq->execution_mask & engine->mask));
-			rq->engine = engine;
-
-			if (engine != ve->siblings[0]) {
-				u32 *regs = ve->context.lrc_reg_state;
-				unsigned int n;
-
-				GEM_BUG_ON(READ_ONCE(ve->context.inflight));
-
-				if (!intel_engine_has_relative_mmio(engine))
-					intel_lr_context_set_register_offsets(regs,
-									      engine);
-
-				if (!list_empty(&ve->context.signals))
-					virtual_xfer_breadcrumbs(ve, engine);
-
-				/*
-				 * Move the bound engine to the top of the list
-				 * for future execution. We then kick this
-				 * tasklet first before checking others, so that
-				 * we preferentially reuse this set of bound
-				 * registers.
-				 */
-				for (n = 1; n < ve->num_siblings; n++) {
-					if (ve->siblings[n] == engine) {
-						swap(ve->siblings[n],
-						     ve->siblings[0]);
-						break;
-					}
-				}
-
-				GEM_BUG_ON(ve->siblings[0] != engine);
-			}
-
-			if (__i915_request_submit(rq)) {
-				submit = true;
-				last = rq;
-			}
-			i915_request_put(rq);
-
-			/*
-			 * Hmm, we have a bunch of virtual engine requests,
-			 * but the first one was already completed (thanks
-			 * preempt-to-busy!). Keep looking at the veng queue
-			 * until we have no more relevant requests (i.e.
-			 * the normal submit queue has higher priority).
-			 */
-			if (!submit) {
-				spin_unlock(&ve->base.active.lock);
-				rb = rb_first_cached(&execlists->virtual);
-				continue;
-			}
-		}
-
-		spin_unlock(&ve->base.active.lock);
-		break;
-	}
-
-	while ((rb = rb_first_cached(&execlists->queue))) {
-		struct i915_priolist *p = to_priolist(rb);
-		struct i915_request *rq, *rn;
-		int i;
-
-		priolist_for_each_request_consume(rq, rn, p, i) {
-			bool merge = true;
-
-			/*
-			 * Can we combine this request with the current port?
-			 * It has to be the same context/ringbuffer and not
-			 * have any exceptions (e.g. GVT saying never to
-			 * combine contexts).
-			 *
-			 * If we can combine the requests, we can execute both
-			 * by updating the RING_TAIL to point to the end of the
-			 * second request, and so we never need to tell the
-			 * hardware about the first.
-			 */
-			if (last && !can_merge_rq(last, rq)) {
-				/*
-				 * If we are on the second port and cannot
-				 * combine this request with the last, then we
-				 * are done.
-				 */
-				if (port == last_port)
-					goto done;
-
-				/*
-				 * We must not populate both ELSP[] with the
-				 * same LRCA, i.e. we must submit 2 different
-				 * contexts if we submit 2 ELSP.
-				 */
-				if (last->hw_context == rq->hw_context)
-					goto done;
-
-				if (i915_request_has_sentinel(last))
-					goto done;
-
-				/*
-				 * If GVT overrides us we only ever submit
-				 * port[0], leaving port[1] empty. Note that we
-				 * also have to be careful that we don't queue
-				 * the same context (even though a different
-				 * request) to the second port.
-				 */
-				if (ctx_single_port_submission(last->hw_context) ||
-				    ctx_single_port_submission(rq->hw_context))
-					goto done;
-
-				merge = false;
-			}
-
-			if (__i915_request_submit(rq)) {
-				if (!merge) {
-					*port = execlists_schedule_in(last, port - execlists->pending);
-					port++;
-					last = NULL;
-				}
-
-				GEM_BUG_ON(last &&
-					   !can_merge_ctx(last->hw_context,
-							  rq->hw_context));
-
-				submit = true;
-				last = rq;
-			}
-		}
-
-		rb_erase_cached(&p->node, &execlists->queue);
-		i915_priolist_free(p);
-	}
-
-done:
-	/*
-	 * Here be a bit of magic! Or sleight-of-hand, whichever you prefer.
-	 *
-	 * We choose the priority hint such that if we add a request of greater
-	 * priority than this, we kick the submission tasklet to decide on
-	 * the right order of submitting the requests to hardware. We must
-	 * also be prepared to reorder requests as they are in-flight on the
-	 * HW. We derive the priority hint then as the first "hole" in
-	 * the HW submission ports and if there are no available slots,
-	 * the priority of the lowest executing request, i.e. last.
-	 *
-	 * When we do receive a higher priority request ready to run from the
-	 * user, see queue_request(), the priority hint is bumped to that
-	 * request triggering preemption on the next dequeue (or subsequent
-	 * interrupt for secondary ports).
-	 */
-	execlists->queue_priority_hint = queue_prio(execlists);
-	GEM_TRACE("%s: queue_priority_hint:%d, submit:%s\n",
-		  engine->name, execlists->queue_priority_hint,
-		  yesno(submit));
-
-	if (submit) {
-		*port = execlists_schedule_in(last, port - execlists->pending);
-		execlists->switch_priority_hint =
-			switch_prio(engine, *execlists->pending);
-
-		/*
-		 * Skip if we ended up with exactly the same set of requests,
-		 * e.g. trying to timeslice a pair of ordered contexts
-		 */
-		if (!memcmp(execlists->active, execlists->pending,
-			    (port - execlists->pending + 1) * sizeof(*port))) {
-			do
-				execlists_schedule_out(fetch_and_zero(port));
-			while (port-- != execlists->pending);
-
-			goto skip_submit;
-		}
-
-		memset(port + 1, 0, (last_port - port) * sizeof(*port));
-		execlists_submit_ports(engine);
-
-		set_preempt_timeout(engine);
-	} else {
-skip_submit:
-		ring_set_paused(engine, 0);
-	}
-}
-
-static void
-cancel_port_requests(struct intel_engine_execlists * const execlists)
-{
-	struct i915_request * const *port;
-
-	for (port = execlists->pending; *port; port++)
-		execlists_schedule_out(*port);
-	memset(execlists->pending, 0, sizeof(execlists->pending));
-
-	/* Mark the end of active before we overwrite *active */
-	for (port = xchg(&execlists->active, execlists->pending); *port; port++)
-		execlists_schedule_out(*port);
-	WRITE_ONCE(execlists->active,
-		   memset(execlists->inflight, 0, sizeof(execlists->inflight)));
-}
-
-static inline void
-invalidate_csb_entries(const u32 *first, const u32 *last)
-{
-	clflush((void *)first);
-	clflush((void *)last);
-}
-
-static inline bool
-reset_in_progress(const struct intel_engine_execlists *execlists)
-{
-	return unlikely(!__tasklet_is_enabled(&execlists->tasklet));
-}
-
-/*
- * Starting with Gen12, the status has a new format:
- *
- *     bit  0:     switched to new queue
- *     bit  1:     reserved
- *     bit  2:     semaphore wait mode (poll or signal), only valid when
- *                 switch detail is set to "wait on semaphore"
- *     bits 3-5:   engine class
- *     bits 6-11:  engine instance
- *     bits 12-14: reserved
- *     bits 15-25: sw context id of the lrc the GT switched to
- *     bits 26-31: sw counter of the lrc the GT switched to
- *     bits 32-35: context switch detail
- *                  - 0: ctx complete
- *                  - 1: wait on sync flip
- *                  - 2: wait on vblank
- *                  - 3: wait on scanline
- *                  - 4: wait on semaphore
- *                  - 5: context preempted (not on SEMAPHORE_WAIT or
- *                       WAIT_FOR_EVENT)
- *     bit  36:    reserved
- *     bits 37-43: wait detail (for switch detail 1 to 4)
- *     bits 44-46: reserved
- *     bits 47-57: sw context id of the lrc the GT switched away from
- *     bits 58-63: sw counter of the lrc the GT switched away from
- */
-static inline bool
-gen12_csb_parse(const struct intel_engine_execlists *execlists, const u32 *csb)
-{
-	u32 lower_dw = csb[0];
-	u32 upper_dw = csb[1];
-	bool ctx_to_valid = GEN12_CSB_CTX_VALID(lower_dw);
-	bool ctx_away_valid = GEN12_CSB_CTX_VALID(upper_dw);
-	bool new_queue = lower_dw & GEN12_CTX_STATUS_SWITCHED_TO_NEW_QUEUE;
-
-	/*
-	 * The context switch detail is not guaranteed to be 5 when a preemption
-	 * occurs, so we can't just check for that. The check below works for
-	 * all the cases we care about, including preemptions of WAIT
-	 * instructions and lite-restore. Preempt-to-idle via the CTRL register
-	 * would require some extra handling, but we don't support that.
-	 */
-	if (!ctx_away_valid || new_queue) {
-		GEM_BUG_ON(!ctx_to_valid);
-		return true;
-	}
-
-	/*
-	 * switch detail = 5 is covered by the case above and we do not expect a
-	 * context switch on an unsuccessful wait instruction since we always
-	 * use polling mode.
-	 */
-	GEM_BUG_ON(GEN12_CTX_SWITCH_DETAIL(upper_dw));
-	return false;
-}
-
-static inline bool
-gen8_csb_parse(const struct intel_engine_execlists *execlists, const u32 *csb)
-{
-	return *csb & (GEN8_CTX_STATUS_IDLE_ACTIVE | GEN8_CTX_STATUS_PREEMPTED);
-}
-
-static void process_csb(struct intel_engine_cs *engine)
-{
-	struct intel_engine_execlists * const execlists = &engine->execlists;
-	const u32 * const buf = execlists->csb_status;
-	const u8 num_entries = execlists->csb_size;
-	u8 head, tail;
-
-	/*
-	 * As we modify our execlists state tracking we require exclusive
-	 * access. Either we are inside the tasklet, or the tasklet is disabled
-	 * and we assume that is only inside the reset paths and so serialised.
-	 */
-	GEM_BUG_ON(!tasklet_is_locked(&execlists->tasklet) &&
-		   !reset_in_progress(execlists));
-	GEM_BUG_ON(!intel_engine_in_execlists_submission_mode(engine));
-
-	/*
-	 * Note that csb_write, csb_status may be either in HWSP or mmio.
-	 * When reading from the csb_write mmio register, we have to be
-	 * careful to only use the GEN8_CSB_WRITE_PTR portion, which is
-	 * the low 4bits. As it happens we know the next 4bits are always
-	 * zero and so we can simply mask off the low u8 of the register
-	 * and treat it identically to reading from the HWSP (without having
-	 * to use explicit shifting and masking, and probably bifurcating
-	 * the code to handle the legacy mmio read).
-	 */
-	head = execlists->csb_head;
-	tail = READ_ONCE(*execlists->csb_write);
-	GEM_TRACE("%s cs-irq head=%d, tail=%d\n", engine->name, head, tail);
-	if (unlikely(head == tail))
-		return;
-
-	/*
-	 * Hopefully paired with a wmb() in HW!
-	 *
-	 * We must complete the read of the write pointer before any reads
-	 * from the CSB, so that we do not see stale values. Without an rmb
-	 * (lfence) the HW may speculatively perform the CSB[] reads *before*
-	 * we perform the READ_ONCE(*csb_write).
-	 */
-	rmb();
-
-	do {
-		bool promote;
-
-		if (++head == num_entries)
-			head = 0;
-
-		/*
-		 * We are flying near dragons again.
-		 *
-		 * We hold a reference to the request in execlist_port[]
-		 * but no more than that. We are operating in softirq
-		 * context and so cannot hold any mutex or sleep. That
-		 * prevents us stopping the requests we are processing
-		 * in port[] from being retired simultaneously (the
-		 * breadcrumb will be complete before we see the
-		 * context-switch). As we only hold the reference to the
-		 * request, any pointer chasing underneath the request
-		 * is subject to a potential use-after-free. Thus we
-		 * store all of the bookkeeping within port[] as
-		 * required, and avoid using unguarded pointers beneath
-		 * request itself. The same applies to the atomic
-		 * status notifier.
-		 */
-
-		GEM_TRACE("%s csb[%d]: status=0x%08x:0x%08x\n",
-			  engine->name, head,
-			  buf[2 * head + 0], buf[2 * head + 1]);
-
-		if (INTEL_GEN(engine->i915) >= 12)
-			promote = gen12_csb_parse(execlists, buf + 2 * head);
-		else
-			promote = gen8_csb_parse(execlists, buf + 2 * head);
-		if (promote) {
-			struct i915_request * const *old = execlists->active;
-
-			/* Point active to the new ELSP; prevent overwriting */
-			WRITE_ONCE(execlists->active, execlists->pending);
-			set_timeslice(engine);
-
-			if (!inject_preempt_hang(execlists))
-				ring_set_paused(engine, 0);
-
-			/* cancel old inflight, prepare for switch */
-			trace_ports(execlists, "preempted", old);
-			while (*old)
-				execlists_schedule_out(*old++);
-
-			/* switch pending to inflight */
-			GEM_BUG_ON(!assert_pending_valid(execlists, "promote"));
-			WRITE_ONCE(execlists->active,
-				   memcpy(execlists->inflight,
-					  execlists->pending,
-					  execlists_num_ports(execlists) *
-					  sizeof(*execlists->pending)));
-
-			WRITE_ONCE(execlists->pending[0], NULL);
-		} else {
-			GEM_BUG_ON(!*execlists->active);
-
-			/* port0 completed, advanced to port1 */
-			trace_ports(execlists, "completed", execlists->active);
-
-			/*
-			 * We rely on the hardware being strongly
-			 * ordered, that the breadcrumb write is
-			 * coherent (visible from the CPU) before the
-			 * user interrupt and CSB is processed.
-			 */
-			GEM_BUG_ON(!i915_request_completed(*execlists->active) &&
-				   !reset_in_progress(execlists));
-			execlists_schedule_out(*execlists->active++);
-
-			GEM_BUG_ON(execlists->active - execlists->inflight >
-				   execlists_num_ports(execlists));
-		}
-	} while (head != tail);
-
-	execlists->csb_head = head;
-
-	/*
-	 * Gen11 has proven to fail wrt global observation point between
-	 * entry and tail update, failing on the ordering and thus
-	 * we see an old entry in the context status buffer.
-	 *
-	 * Forcibly evict entries for the next gpu csb update, to
-	 * increase the odds that we get fresh entries even with
-	 * non-working hardware. The cost of doing so mostly comes out
-	 * in the wash, as the hardware, working or not, will need to
-	 * do the invalidation beforehand.
-	 */
-	invalidate_csb_entries(&buf[0], &buf[num_entries - 1]);
-}
-
-static void __execlists_submission_tasklet(struct intel_engine_cs *const engine)
-{
-	lockdep_assert_held(&engine->active.lock);
-	if (!engine->execlists.pending[0]) {
-		rcu_read_lock(); /* protect peeking at execlists->active */
-		execlists_dequeue(engine);
-		rcu_read_unlock();
-	}
-}
-
-static noinline void preempt_reset(struct intel_engine_cs *engine)
-{
-	const unsigned int bit = I915_RESET_ENGINE + engine->id;
-	unsigned long *lock = &engine->gt->reset.flags;
-
-	if (i915_modparams.reset < 3)
-		return;
-
-	if (test_and_set_bit(bit, lock))
-		return;
-
-	/* Mark this tasklet as disabled to avoid waiting for it to complete */
-	tasklet_disable_nosync(&engine->execlists.tasklet);
-
-	GEM_TRACE("%s: preempt timeout %lu+%ums\n",
-		  engine->name,
-		  READ_ONCE(engine->props.preempt_timeout_ms),
-		  jiffies_to_msecs(jiffies - engine->execlists.preempt.expires));
-	intel_engine_reset(engine, "preemption time out");
-
-	tasklet_enable(&engine->execlists.tasklet);
-	clear_and_wake_up_bit(bit, lock);
-}
-
-static bool preempt_timeout(const struct intel_engine_cs *const engine)
-{
-	const struct timer_list *t = &engine->execlists.preempt;
-
-	if (!CONFIG_DRM_I915_PREEMPT_TIMEOUT)
-		return false;
-
-	if (!timer_expired(t))
-		return false;
-
-	return READ_ONCE(engine->execlists.pending[0]);
-}
-
-/*
- * Check the unread Context Status Buffers and manage the submission of new
- * contexts to the ELSP accordingly.
- */
-static void execlists_submission_tasklet(unsigned long data)
-{
-	struct intel_engine_cs * const engine = (struct intel_engine_cs *)data;
-	bool timeout = preempt_timeout(engine);
-
-	process_csb(engine);
-	if (!READ_ONCE(engine->execlists.pending[0]) || timeout) {
-		unsigned long flags;
-
-		spin_lock_irqsave(&engine->active.lock, flags);
-		__execlists_submission_tasklet(engine);
-		spin_unlock_irqrestore(&engine->active.lock, flags);
-
-		/* Recheck after serialising with direct-submission */
-		if (timeout && preempt_timeout(engine))
-			preempt_reset(engine);
-	}
-}
-
-static void __execlists_kick(struct intel_engine_execlists *execlists)
-{
-	/* Kick the tasklet for some interrupt coalescing and reset handling */
-	tasklet_hi_schedule(&execlists->tasklet);
-}
-
-#define execlists_kick(t, member) \
-	__execlists_kick(container_of(t, struct intel_engine_execlists, member))
-
-static void execlists_timeslice(struct timer_list *timer)
-{
-	execlists_kick(timer, timer);
-}
-
-static void execlists_preempt(struct timer_list *timer)
-{
-	execlists_kick(timer, preempt);
-}
-
-static void queue_request(struct intel_engine_cs *engine,
-			  struct i915_sched_node *node,
-			  int prio)
-{
-	GEM_BUG_ON(!list_empty(&node->link));
-	list_add_tail(&node->link, i915_sched_lookup_priolist(engine, prio));
-}
-
-static void __submit_queue_imm(struct intel_engine_cs *engine)
-{
-	struct intel_engine_execlists * const execlists = &engine->execlists;
-
-	if (reset_in_progress(execlists))
-		return; /* defer until we restart the engine following reset */
-
-	if (execlists->tasklet.func == execlists_submission_tasklet)
-		__execlists_submission_tasklet(engine);
-	else
-		tasklet_hi_schedule(&execlists->tasklet);
-}
-
-static void submit_queue(struct intel_engine_cs *engine,
-			 const struct i915_request *rq)
-{
-	struct intel_engine_execlists *execlists = &engine->execlists;
-
-	if (rq_prio(rq) <= execlists->queue_priority_hint)
-		return;
-
-	execlists->queue_priority_hint = rq_prio(rq);
-	__submit_queue_imm(engine);
-}
-
-static void execlists_submit_request(struct i915_request *request)
-{
-	struct intel_engine_cs *engine = request->engine;
-	unsigned long flags;
-
-	/* Will be called from irq-context when using foreign fences. */
-	spin_lock_irqsave(&engine->active.lock, flags);
-
-	queue_request(engine, &request->sched, rq_prio(request));
-
-	GEM_BUG_ON(RB_EMPTY_ROOT(&engine->execlists.queue.rb_root));
-	GEM_BUG_ON(list_empty(&request->sched.link));
-
-	submit_queue(engine, request);
-
-	spin_unlock_irqrestore(&engine->active.lock, flags);
-}
-
-static void execlists_context_destroy(struct kref *kref)
-{
-	struct intel_context *ce = container_of(kref, typeof(*ce), ref);
-
-	GEM_BUG_ON(!i915_active_is_idle(&ce->active));
-	GEM_BUG_ON(intel_context_is_pinned(ce));
-
-	if (ce->state)
-		intel_lr_context_fini(ce);
+	if (engine->pinned_default_state)
+		memcpy(regs, /* skip restoring the vanilla PPHWSP */
+		       engine->pinned_default_state + LRC_STATE_PN * PAGE_SIZE,
+		       engine->context_size - PAGE_SIZE);
 
-	intel_context_fini(ce);
-	intel_context_free(ce);
+	lr_context_init_reg_state(regs, ce, engine, ce->ring, false);
 }
 
 static void
@@ -2377,9 +643,9 @@ void intel_lr_context_unpin(struct intel_context *ce)
 	intel_ring_reset(ce->ring, ce->ring->tail);
 }
 
-static void
-lr_context_update_reg_state(const struct intel_context *ce,
-			    const struct intel_engine_cs *engine)
+void
+intel_lr_context_update_reg_state(const struct intel_context *ce,
+				  const struct intel_engine_cs *engine)
 {
 	struct intel_ring *ring = ce->ring;
 	u32 *regs = ce->lrc_reg_state;
@@ -2424,7 +690,7 @@ intel_lr_context_pin(struct intel_context *ce,
 
 	ce->lrc_desc = lrc_descriptor(ce, engine);
 	ce->lrc_reg_state = vaddr + LRC_STATE_PN * PAGE_SIZE;
-	lr_context_update_reg_state(ce, engine);
+	intel_lr_context_update_reg_state(ce, engine);
 
 	return 0;
 
@@ -2434,51 +700,6 @@ intel_lr_context_pin(struct intel_context *ce,
 	return ret;
 }
 
-static int execlists_context_pin(struct intel_context *ce)
-{
-	return intel_lr_context_pin(ce, ce->engine);
-}
-
-static int execlists_context_alloc(struct intel_context *ce)
-{
-	return intel_lr_context_alloc(ce, ce->engine);
-}
-
-static void execlists_context_reset(struct intel_context *ce)
-{
-	/*
-	 * Because we emit WA_TAIL_DWORDS there may be a disparity
-	 * between our bookkeeping in ce->ring->head and ce->ring->tail and
-	 * that stored in context. As we only write new commands from
-	 * ce->ring->tail onwards, everything before that is junk. If the GPU
-	 * starts reading from its RING_HEAD from the context, it may try to
-	 * execute that junk and die.
-	 *
-	 * The contexts that are still pinned on resume belong to the
-	 * kernel, and are local to each engine. All other contexts will
-	 * have their head/tail sanitized upon pinning before use, so they
-	 * will never see garbage.
-	 *
-	 * So to avoid that we reset the context images upon resume. For
-	 * simplicity, we just zero everything out.
-	 */
-	intel_ring_reset(ce->ring, 0);
-	lr_context_update_reg_state(ce, ce->engine);
-}
-
-static const struct intel_context_ops execlists_context_ops = {
-	.alloc = execlists_context_alloc,
-
-	.pin = execlists_context_pin,
-	.unpin = intel_lr_context_unpin,
-
-	.enter = intel_context_enter_engine,
-	.exit = intel_context_exit_engine,
-
-	.reset = execlists_context_reset,
-	.destroy = execlists_context_destroy,
-};
-
 static int gen8_emit_init_breadcrumb(struct i915_request *rq)
 {
 	u32 *cs;
@@ -2511,36 +732,6 @@ static int gen8_emit_init_breadcrumb(struct i915_request *rq)
 	return 0;
 }
 
-static int execlists_request_alloc(struct i915_request *request)
-{
-	int ret;
-
-	GEM_BUG_ON(!intel_context_is_pinned(request->hw_context));
-
-	/*
-	 * Flush enough space to reduce the likelihood of waiting after
-	 * we start building the request - in which case we will just
-	 * have to repeat work.
-	 */
-	request->reserved_space += EXECLISTS_REQUEST_SIZE;
-
-	/*
-	 * Note that after this point, we have committed to using
-	 * this request as it is being used to both track the
-	 * state of engine initialisation and liveness of the
-	 * golden renderstate above. Think twice before you try
-	 * to cancel/unwind this request now.
-	 */
-
-	/* Unconditionally invalidate GPU caches and TLBs. */
-	ret = request->engine->emit_flush(request, EMIT_INVALIDATE);
-	if (ret)
-		return ret;
-
-	request->reserved_space -= EXECLISTS_REQUEST_SIZE;
-	return 0;
-}
-
 /*
  * In this WA we need to set GEN8_L3SQCREG4[21:21] and reset it after
  * PIPE_CONTROL instruction. This is required for the flush to happen correctly
@@ -2857,7 +1048,7 @@ static int intel_init_workaround_bb(struct intel_engine_cs *engine)
 	return ret;
 }
 
-static int logical_ring_init(struct intel_engine_cs *engine)
+int intel_logical_ring_init(struct intel_engine_cs *engine)
 {
 	int ret;
 
@@ -2876,7 +1067,7 @@ static int logical_ring_init(struct intel_engine_cs *engine)
 	return 0;
 }
 
-static void logical_ring_destroy(struct intel_engine_cs *engine)
+void intel_logical_ring_destroy(struct intel_engine_cs *engine)
 {
 	intel_engine_cleanup_common(engine);
 	lrc_destroy_wa_ctx(engine);
@@ -2937,303 +1128,19 @@ static int logical_ring_resume(struct intel_engine_cs *engine)
 	return 0;
 }
 
-static void execlists_reset_prepare(struct intel_engine_cs *engine)
-{
-	struct intel_engine_execlists * const execlists = &engine->execlists;
-	unsigned long flags;
-
-	GEM_TRACE("%s: depth<-%d\n", engine->name,
-		  atomic_read(&execlists->tasklet.count));
-
-	/*
-	 * Prevent request submission to the hardware until we have
-	 * completed the reset in i915_gem_reset_finish(). If a request
-	 * is completed by one engine, it may then queue a request
-	 * to a second via its execlists->tasklet *just* as we are
-	 * calling engine->resume() and also writing the ELSP.
-	 * Turning off the execlists->tasklet until the reset is over
-	 * prevents the race.
-	 */
-	__tasklet_disable_sync_once(&execlists->tasklet);
-	GEM_BUG_ON(!reset_in_progress(execlists));
-
-	/* And flush any current direct submission. */
-	spin_lock_irqsave(&engine->active.lock, flags);
-	spin_unlock_irqrestore(&engine->active.lock, flags);
-
-	/*
-	 * We stop the engines, otherwise we might get a failed reset
-	 * and a dead gpu (on elk). Even a gpu as modern as kbl can
-	 * suffer a system hang if a batchbuffer is progressing when
-	 * the reset is issued, regardless of the READY_TO_RESET ack.
-	 * Thus assume it is best to stop engines on all gens
-	 * where we have a gpu reset.
-	 *
-	 * WaKBLVECSSemaphoreWaitPoll:kbl (on ALL_ENGINES)
-	 *
-	 * FIXME: Wa for more modern gens needs to be validated
-	 */
-	intel_engine_stop_cs(engine);
-}
-
-static void reset_csb_pointers(struct intel_engine_cs *engine)
-{
-	struct intel_engine_execlists * const execlists = &engine->execlists;
-	const unsigned int reset_value = execlists->csb_size - 1;
-
-	ring_set_paused(engine, 0);
-
-	/*
-	 * After a reset, the HW starts writing into CSB entry [0]. We
-	 * therefore have to set our HEAD pointer back one entry so that
-	 * the *first* entry we check is entry 0. To complicate this further,
-	 * as we don't wait for the first interrupt after reset, we have to
-	 * fake the HW write to point back to the last entry so that our
-	 * inline comparison of our cached head position against the last HW
-	 * write works even before the first interrupt.
-	 */
-	execlists->csb_head = reset_value;
-	WRITE_ONCE(*execlists->csb_write, reset_value);
-	wmb(); /* Make sure this is visible to HW (paranoia?) */
-
-	/*
-	 * Sometimes Icelake forgets to reset its pointers on a GPU reset.
-	 * Bludgeon them with a mmio update to be sure.
-	 */
-	ENGINE_WRITE(engine, RING_CONTEXT_STATUS_PTR,
-		     reset_value << 8 | reset_value);
-	ENGINE_POSTING_READ(engine, RING_CONTEXT_STATUS_PTR);
-
-	invalidate_csb_entries(&execlists->csb_status[0],
-			       &execlists->csb_status[reset_value]);
-}
-
-static void lr_context_reset_reg_state(const struct intel_context *ce,
-				       const struct intel_engine_cs *engine)
+void intel_lr_context_reset_reg_state(const struct intel_context *ce,
+				      const struct intel_engine_cs *engine)
 {
 	u32 *regs = ce->lrc_reg_state;
 	int x;
 
-	x = lrc_ring_mi_mode(engine);
+	x = intel_lrc_ring_mi_mode(engine);
 	if (x != -1) {
 		regs[x + 1] &= ~STOP_RING;
 		regs[x + 1] |= STOP_RING << 16;
 	}
 }
 
-static void __execlists_reset(struct intel_engine_cs *engine, bool stalled)
-{
-	struct intel_engine_execlists * const execlists = &engine->execlists;
-	struct intel_context *ce;
-	struct i915_request *rq;
-
-	mb(); /* paranoia: read the CSB pointers from after the reset */
-	clflush(execlists->csb_write);
-	mb();
-
-	process_csb(engine); /* drain preemption events */
-
-	/* Following the reset, we need to reload the CSB read/write pointers */
-	reset_csb_pointers(engine);
-
-	/*
-	 * Save the currently executing context, even if we completed
-	 * its request, it was still running at the time of the
-	 * reset and will have been clobbered.
-	 */
-	rq = execlists_active(execlists);
-	if (!rq)
-		goto unwind;
-
-	/* We still have requests in-flight; the engine should be active */
-	GEM_BUG_ON(!intel_engine_pm_is_awake(engine));
-
-	ce = rq->hw_context;
-	GEM_BUG_ON(!i915_vma_is_pinned(ce->state));
-
-	if (i915_request_completed(rq)) {
-		/* Idle context; tidy up the ring so we can restart afresh */
-		ce->ring->head = intel_ring_wrap(ce->ring, rq->tail);
-		goto out_replay;
-	}
-
-	/* Context has requests still in-flight; it should not be idle! */
-	GEM_BUG_ON(i915_active_is_idle(&ce->active));
-	rq = active_request(ce->timeline, rq);
-	ce->ring->head = intel_ring_wrap(ce->ring, rq->head);
-	GEM_BUG_ON(ce->ring->head == ce->ring->tail);
-
-	/*
-	 * If this request hasn't started yet, e.g. it is waiting on a
-	 * semaphore, we need to avoid skipping the request or else we
-	 * break the signaling chain. However, if the context is corrupt
-	 * the request will not restart and we will be stuck with a wedged
-	 * device. It is quite often the case that if we issue a reset
-	 * while the GPU is loading the context image, the context
-	 * image becomes corrupt.
-	 *
-	 * Otherwise, if we have not started yet, the request should replay
-	 * perfectly and we do not need to flag the result as being erroneous.
-	 */
-	if (!i915_request_started(rq))
-		goto out_replay;
-
-	/*
-	 * If the request was innocent, we leave the request in the ELSP
-	 * and will try to replay it on restarting. The context image may
-	 * have been corrupted by the reset, in which case we may have
-	 * to service a new GPU hang, but more likely we can continue on
-	 * without impact.
-	 *
-	 * If the request was guilty, we presume the context is corrupt
-	 * and have to at least restore the RING register in the context
-	 * image back to the expected values to skip over the guilty request.
-	 */
-	__i915_request_reset(rq, stalled);
-	if (!stalled)
-		goto out_replay;
-
-	/*
-	 * We want a simple context + ring to execute the breadcrumb update.
-	 * We cannot rely on the context being intact across the GPU hang,
-	 * so clear it and rebuild just what we need for the breadcrumb.
-	 * All pending requests for this context will be zapped, and any
-	 * future request will be after userspace has had the opportunity
-	 * to recreate its own state.
-	 */
-	GEM_BUG_ON(!intel_context_is_pinned(ce));
-	lr_context_restore_default_state(ce, engine);
-
-out_replay:
-	GEM_TRACE("%s replay {head:%04x, tail:%04x}\n",
-		  engine->name, ce->ring->head, ce->ring->tail);
-	intel_ring_update_space(ce->ring);
-	lr_context_reset_reg_state(ce, engine);
-	lr_context_update_reg_state(ce, engine);
-	ce->lrc_desc |= CTX_DESC_FORCE_RESTORE; /* paranoid: GPU was reset! */
-
-unwind:
-	/* Push back any incomplete requests for replay after the reset. */
-	cancel_port_requests(execlists);
-	__unwind_incomplete_requests(engine);
-}
-
-static void execlists_reset(struct intel_engine_cs *engine, bool stalled)
-{
-	unsigned long flags;
-
-	GEM_TRACE("%s\n", engine->name);
-
-	spin_lock_irqsave(&engine->active.lock, flags);
-
-	__execlists_reset(engine, stalled);
-
-	spin_unlock_irqrestore(&engine->active.lock, flags);
-}
-
-static void nop_submission_tasklet(unsigned long data)
-{
-	/* The driver is wedged; don't process any more events. */
-}
-
-static void execlists_cancel_requests(struct intel_engine_cs *engine)
-{
-	struct intel_engine_execlists * const execlists = &engine->execlists;
-	struct i915_request *rq, *rn;
-	struct rb_node *rb;
-	unsigned long flags;
-
-	GEM_TRACE("%s\n", engine->name);
-
-	/*
-	 * Before we call engine->cancel_requests(), we should have exclusive
-	 * access to the submission state. This is arranged for us by the
-	 * caller disabling the interrupt generation, the tasklet and other
-	 * threads that may then access the same state, giving us a free hand
-	 * to reset state. However, we still need to let lockdep be aware that
-	 * we know this state may be accessed in hardirq context, so we
-	 * disable the irq around this manipulation and we want to keep
-	 * the spinlock focused on its duties and not accidentally conflate
-	 * coverage to the submission's irq state. (Similarly, although we
-	 * shouldn't need to disable irq around the manipulation of the
-	 * submission's irq state, we also wish to remind ourselves that
-	 * it is irq state.)
-	 */
-	spin_lock_irqsave(&engine->active.lock, flags);
-
-	__execlists_reset(engine, true);
-
-	/* Mark all executing requests as skipped. */
-	list_for_each_entry(rq, &engine->active.requests, sched.link)
-		mark_eio(rq);
-
-	/* Flush the queued requests to the timeline list (for retiring). */
-	while ((rb = rb_first_cached(&execlists->queue))) {
-		struct i915_priolist *p = to_priolist(rb);
-		int i;
-
-		priolist_for_each_request_consume(rq, rn, p, i) {
-			mark_eio(rq);
-			__i915_request_submit(rq);
-		}
-
-		rb_erase_cached(&p->node, &execlists->queue);
-		i915_priolist_free(p);
-	}
-
-	/* Cancel all attached virtual engines */
-	while ((rb = rb_first_cached(&execlists->virtual))) {
-		struct intel_virtual_engine *ve =
-			rb_entry(rb, typeof(*ve), nodes[engine->id].rb);
-
-		rb_erase_cached(rb, &execlists->virtual);
-		RB_CLEAR_NODE(rb);
-
-		spin_lock(&ve->base.active.lock);
-		rq = fetch_and_zero(&ve->request);
-		if (rq) {
-			mark_eio(rq);
-
-			rq->engine = engine;
-			__i915_request_submit(rq);
-			i915_request_put(rq);
-
-			ve->base.execlists.queue_priority_hint = INT_MIN;
-		}
-		spin_unlock(&ve->base.active.lock);
-	}
-
-	/* Remaining _unready_ requests will be nop'ed when submitted */
-
-	execlists->queue_priority_hint = INT_MIN;
-	execlists->queue = RB_ROOT_CACHED;
-
-	GEM_BUG_ON(__tasklet_is_enabled(&execlists->tasklet));
-	execlists->tasklet.func = nop_submission_tasklet;
-
-	spin_unlock_irqrestore(&engine->active.lock, flags);
-}
-
-static void execlists_reset_finish(struct intel_engine_cs *engine)
-{
-	struct intel_engine_execlists * const execlists = &engine->execlists;
-
-	/*
-	 * After a GPU reset, we may have requests to replay. Do so now while
-	 * we still have the forcewake to be sure that the GPU is not allowed
-	 * to sleep before we restart and reload a context.
-	 */
-	GEM_BUG_ON(!reset_in_progress(execlists));
-	if (!RB_EMPTY_ROOT(&execlists->queue.rb_root))
-		execlists->tasklet.func(execlists->tasklet.data);
-
-	if (__tasklet_enable(&execlists->tasklet))
-		/* And kick in case we missed a new request submission. */
-		tasklet_hi_schedule(&execlists->tasklet);
-	GEM_TRACE("%s: depth->%d\n", engine->name,
-		  atomic_read(&execlists->tasklet.count));
-}
-
 static int gen8_emit_bb_start(struct i915_request *rq,
 			      u64 offset, u32 len,
 			      const unsigned int flags)
@@ -3716,75 +1623,20 @@ gen12_emit_fini_breadcrumb_rcs(struct i915_request *request, u32 *cs)
 	return gen12_emit_fini_breadcrumb_footer(request, cs);
 }
 
-static void execlists_park(struct intel_engine_cs *engine)
-{
-	cancel_timer(&engine->execlists.timer);
-	cancel_timer(&engine->execlists.preempt);
-}
-
-void intel_execlists_set_default_submission(struct intel_engine_cs *engine)
-{
-	engine->submit_request = execlists_submit_request;
-	engine->cancel_requests = execlists_cancel_requests;
-	engine->schedule = i915_schedule;
-	engine->execlists.tasklet.func = execlists_submission_tasklet;
-
-	engine->reset.prepare = execlists_reset_prepare;
-	engine->reset.reset = execlists_reset;
-	engine->reset.finish = execlists_reset_finish;
-
-	engine->park = execlists_park;
-	engine->unpark = NULL;
-
-	engine->flags |= I915_ENGINE_SUPPORTS_STATS;
-	if (!intel_vgpu_active(engine->i915)) {
-		engine->flags |= I915_ENGINE_HAS_SEMAPHORES;
-		if (HAS_LOGICAL_RING_PREEMPTION(engine->i915))
-			engine->flags |= I915_ENGINE_HAS_PREEMPTION;
-	}
-
-	if (INTEL_GEN(engine->i915) >= 12)
-		engine->flags |= I915_ENGINE_HAS_RELATIVE_MMIO;
-}
-
-static void execlists_shutdown(struct intel_engine_cs *engine)
-{
-	/* Synchronise with residual timers and any softirq they raise */
-	del_timer_sync(&engine->execlists.timer);
-	del_timer_sync(&engine->execlists.preempt);
-	tasklet_kill(&engine->execlists.tasklet);
-}
-
-static void execlists_destroy(struct intel_engine_cs *engine)
-{
-	execlists_shutdown(engine);
-
-	logical_ring_destroy(engine);
-}
-
 static void
 logical_ring_default_vfuncs(struct intel_engine_cs *engine)
 {
 	/* Default vfuncs which can be overriden by each engine. */
 
-	engine->destroy = execlists_destroy;
+	engine->destroy = intel_logical_ring_destroy;
 	engine->resume = logical_ring_resume;
 
-	engine->reset.prepare = execlists_reset_prepare;
-	engine->reset.reset = execlists_reset;
-	engine->reset.finish = execlists_reset_finish;
-
-	engine->cops = &execlists_context_ops;
-	engine->request_alloc = execlists_request_alloc;
-
 	engine->emit_flush = gen8_emit_flush;
 	engine->emit_init_breadcrumb = gen8_emit_init_breadcrumb;
 	engine->emit_fini_breadcrumb = gen8_emit_fini_breadcrumb;
 	if (INTEL_GEN(engine->i915) >= 12)
 		engine->emit_fini_breadcrumb = gen12_emit_fini_breadcrumb;
 
-	engine->set_default_submission = intel_execlists_set_default_submission;
-
 	if (INTEL_GEN(engine->i915) < 11) {
 		engine->irq_enable = gen8_logical_ring_enable_irq;
 		engine->irq_disable = gen8_logical_ring_disable_irq;
@@ -3841,7 +1693,7 @@ static void rcs_submission_override(struct intel_engine_cs *engine)
 	}
 }
 
-static void logical_ring_setup(struct intel_engine_cs *engine)
+void intel_logical_ring_setup(struct intel_engine_cs *engine)
 {
 	logical_ring_default_vfuncs(engine);
 	logical_ring_default_irqs(engine);
@@ -3850,56 +1702,6 @@ static void logical_ring_setup(struct intel_engine_cs *engine)
 		rcs_submission_override(engine);
 }
 
-int intel_execlists_submission_setup(struct intel_engine_cs *engine)
-{
-	tasklet_init(&engine->execlists.tasklet,
-		     execlists_submission_tasklet, (unsigned long)engine);
-	timer_setup(&engine->execlists.timer, execlists_timeslice, 0);
-	timer_setup(&engine->execlists.preempt, execlists_preempt, 0);
-
-	logical_ring_setup(engine);
-
-	return 0;
-}
-
-int intel_execlists_submission_init(struct intel_engine_cs *engine)
-{
-	struct intel_engine_execlists * const execlists = &engine->execlists;
-	struct drm_i915_private *i915 = engine->i915;
-	struct intel_uncore *uncore = engine->uncore;
-	u32 base = engine->mmio_base;
-	int ret;
-
-	ret = logical_ring_init(engine);
-	if (ret)
-		return ret;
-
-	if (HAS_LOGICAL_RING_ELSQ(i915)) {
-		execlists->submit_reg = uncore->regs +
-			i915_mmio_reg_offset(RING_EXECLIST_SQ_CONTENTS(base));
-		execlists->ctrl_reg = uncore->regs +
-			i915_mmio_reg_offset(RING_EXECLIST_CONTROL(base));
-	} else {
-		execlists->submit_reg = uncore->regs +
-			i915_mmio_reg_offset(RING_ELSP(base));
-	}
-
-	execlists->csb_status =
-		&engine->status_page.addr[I915_HWS_CSB_BUF0_INDEX];
-
-	execlists->csb_write =
-		&engine->status_page.addr[intel_hws_csb_write_index(i915)];
-
-	if (INTEL_GEN(i915) < 11)
-		execlists->csb_size = GEN8_CSB_ENTRIES;
-	else
-		execlists->csb_size = GEN11_CSB_ENTRIES;
-
-	reset_csb_pointers(engine);
-
-	return 0;
-}
-
 static u32 intel_lr_indirect_ctx_offset(const struct intel_engine_cs *engine)
 {
 	u32 indirect_ctx_offset;
@@ -3933,7 +1735,6 @@ static u32 intel_lr_indirect_ctx_offset(const struct intel_engine_cs *engine)
 	return indirect_ctx_offset;
 }
 
-
 static void init_common_reg_state(u32 * const regs,
 				  const struct intel_engine_cs *engine,
 				  const struct intel_ring *ring)
@@ -4150,268 +1951,6 @@ void intel_lr_context_fini(struct intel_context *ce)
 	i915_vma_put(ce->state);
 }
 
-static intel_engine_mask_t
-virtual_submission_mask(struct intel_virtual_engine *ve)
-{
-	struct i915_request *rq;
-	intel_engine_mask_t mask;
-
-	rq = READ_ONCE(ve->request);
-	if (!rq)
-		return 0;
-
-	/* The rq is ready for submission; rq->execution_mask is now stable. */
-	mask = rq->execution_mask;
-	if (unlikely(!mask)) {
-		/* Invalid selection, submit to a random engine in error */
-		i915_request_skip(rq, -ENODEV);
-		mask = ve->siblings[0]->mask;
-	}
-
-	GEM_TRACE("%s: rq=%llx:%lld, mask=%x, prio=%d\n",
-		  ve->base.name,
-		  rq->fence.context, rq->fence.seqno,
-		  mask, ve->base.execlists.queue_priority_hint);
-
-	return mask;
-}
-
-static void virtual_submission_tasklet(unsigned long data)
-{
-	struct intel_virtual_engine * const ve =
-		(struct intel_virtual_engine *)data;
-	const int prio = ve->base.execlists.queue_priority_hint;
-	intel_engine_mask_t mask;
-	unsigned int n;
-
-	rcu_read_lock();
-	mask = virtual_submission_mask(ve);
-	rcu_read_unlock();
-	if (unlikely(!mask))
-		return;
-
-	local_irq_disable();
-	for (n = 0; READ_ONCE(ve->request) && n < ve->num_siblings; n++) {
-		struct intel_engine_cs *sibling = ve->siblings[n];
-		struct ve_node * const node = &ve->nodes[sibling->id];
-		struct rb_node **parent, *rb;
-		bool first;
-
-		if (unlikely(!(mask & sibling->mask))) {
-			if (!RB_EMPTY_NODE(&node->rb)) {
-				spin_lock(&sibling->active.lock);
-				rb_erase_cached(&node->rb,
-						&sibling->execlists.virtual);
-				RB_CLEAR_NODE(&node->rb);
-				spin_unlock(&sibling->active.lock);
-			}
-			continue;
-		}
-
-		spin_lock(&sibling->active.lock);
-
-		if (!RB_EMPTY_NODE(&node->rb)) {
-			/*
-			 * Cheat and avoid rebalancing the tree if we can
-			 * reuse this node in situ.
-			 */
-			first = rb_first_cached(&sibling->execlists.virtual) ==
-				&node->rb;
-			if (prio == node->prio || (prio > node->prio && first))
-				goto submit_engine;
-
-			rb_erase_cached(&node->rb, &sibling->execlists.virtual);
-		}
-
-		rb = NULL;
-		first = true;
-		parent = &sibling->execlists.virtual.rb_root.rb_node;
-		while (*parent) {
-			struct ve_node *other;
-
-			rb = *parent;
-			other = rb_entry(rb, typeof(*other), rb);
-			if (prio > other->prio) {
-				parent = &rb->rb_left;
-			} else {
-				parent = &rb->rb_right;
-				first = false;
-			}
-		}
-
-		rb_link_node(&node->rb, rb, parent);
-		rb_insert_color_cached(&node->rb,
-				       &sibling->execlists.virtual,
-				       first);
-
-submit_engine:
-		GEM_BUG_ON(RB_EMPTY_NODE(&node->rb));
-		node->prio = prio;
-		if (first && prio > sibling->execlists.queue_priority_hint) {
-			sibling->execlists.queue_priority_hint = prio;
-			tasklet_hi_schedule(&sibling->execlists.tasklet);
-		}
-
-		spin_unlock(&sibling->active.lock);
-	}
-	local_irq_enable();
-}
-
-static void virtual_submit_request(struct i915_request *rq)
-{
-	struct intel_virtual_engine *ve = to_virtual_engine(rq->engine);
-	struct i915_request *old;
-	unsigned long flags;
-
-	GEM_TRACE("%s: rq=%llx:%lld\n",
-		  ve->base.name,
-		  rq->fence.context,
-		  rq->fence.seqno);
-
-	GEM_BUG_ON(ve->base.submit_request != virtual_submit_request);
-
-	spin_lock_irqsave(&ve->base.active.lock, flags);
-
-	old = ve->request;
-	if (old) { /* background completion event from preempt-to-busy */
-		GEM_BUG_ON(!i915_request_completed(old));
-		__i915_request_submit(old);
-		i915_request_put(old);
-	}
-
-	if (i915_request_completed(rq)) {
-		__i915_request_submit(rq);
-
-		ve->base.execlists.queue_priority_hint = INT_MIN;
-		ve->request = NULL;
-	} else {
-		ve->base.execlists.queue_priority_hint = rq_prio(rq);
-		ve->request = i915_request_get(rq);
-
-		GEM_BUG_ON(!list_empty(intel_virtual_engine_queue(ve)));
-		list_move_tail(&rq->sched.link, intel_virtual_engine_queue(ve));
-
-		tasklet_schedule(&ve->base.execlists.tasklet);
-	}
-
-	spin_unlock_irqrestore(&ve->base.active.lock, flags);
-}
-
-static void
-virtual_bond_execute(struct i915_request *rq, struct dma_fence *signal)
-{
-	struct intel_virtual_engine *ve = to_virtual_engine(rq->engine);
-	intel_engine_mask_t allowed, exec;
-	struct ve_bond *bond;
-
-	allowed = ~to_request(signal)->engine->mask;
-
-	bond = intel_virtual_engine_find_bond(ve, to_request(signal)->engine);
-	if (bond)
-		allowed &= bond->sibling_mask;
-
-	/* Restrict the bonded request to run on only the available engines */
-	exec = READ_ONCE(rq->execution_mask);
-	while (!try_cmpxchg(&rq->execution_mask, &exec, exec & allowed))
-		;
-
-	/* Prevent the master from being re-run on the bonded engines */
-	to_request(signal)->execution_mask &= ~allowed;
-}
-
-void intel_execlists_virtual_submission_init(struct intel_virtual_engine *ve)
-{
-	ve->base.request_alloc = execlists_request_alloc;
-	ve->base.submit_request = virtual_submit_request;
-	ve->base.bond_execute = virtual_bond_execute;
-	tasklet_init(&ve->base.execlists.tasklet,
-		     virtual_submission_tasklet,
-		     (unsigned long)ve);
-}
-
-void intel_execlists_show_requests(struct intel_engine_cs *engine,
-				   struct drm_printer *m,
-				   void (*show_request)(struct drm_printer *m,
-							struct i915_request *rq,
-							const char *prefix),
-				   unsigned int max)
-{
-	const struct intel_engine_execlists *execlists = &engine->execlists;
-	struct i915_request *rq, *last;
-	unsigned long flags;
-	unsigned int count;
-	struct rb_node *rb;
-
-	spin_lock_irqsave(&engine->active.lock, flags);
-
-	last = NULL;
-	count = 0;
-	list_for_each_entry(rq, &engine->active.requests, sched.link) {
-		if (count++ < max - 1)
-			show_request(m, rq, "\t\tE ");
-		else
-			last = rq;
-	}
-	if (last) {
-		if (count > max) {
-			drm_printf(m,
-				   "\t\t...skipping %d executing requests...\n",
-				   count - max);
-		}
-		show_request(m, last, "\t\tE ");
-	}
-
-	last = NULL;
-	count = 0;
-	if (execlists->queue_priority_hint != INT_MIN)
-		drm_printf(m, "\t\tQueue priority hint: %d\n",
-			   execlists->queue_priority_hint);
-	for (rb = rb_first_cached(&execlists->queue); rb; rb = rb_next(rb)) {
-		struct i915_priolist *p = rb_entry(rb, typeof(*p), node);
-		int i;
-
-		priolist_for_each_request(rq, p, i) {
-			if (count++ < max - 1)
-				show_request(m, rq, "\t\tQ ");
-			else
-				last = rq;
-		}
-	}
-	if (last) {
-		if (count > max) {
-			drm_printf(m,
-				   "\t\t...skipping %d queued requests...\n",
-				   count - max);
-		}
-		show_request(m, last, "\t\tQ ");
-	}
-
-	last = NULL;
-	count = 0;
-	for (rb = rb_first_cached(&execlists->virtual); rb; rb = rb_next(rb)) {
-		struct intel_virtual_engine *ve =
-			rb_entry(rb, typeof(*ve), nodes[engine->id].rb);
-		struct i915_request *rq = READ_ONCE(ve->request);
-
-		if (rq) {
-			if (count++ < max - 1)
-				show_request(m, rq, "\t\tV ");
-			else
-				last = rq;
-		}
-	}
-	if (last) {
-		if (count > max) {
-			drm_printf(m,
-				   "\t\t...skipping %d virtual requests...\n",
-				   count - max);
-		}
-		show_request(m, last, "\t\tV ");
-	}
-
-	spin_unlock_irqrestore(&engine->active.lock, flags);
-}
-
 void intel_lr_context_reset(struct intel_engine_cs *engine,
 			    struct intel_context *ce,
 			    u32 head,
@@ -4428,23 +1967,15 @@ void intel_lr_context_reset(struct intel_engine_cs *engine,
 	 * to recreate its own state.
 	 */
 	if (scrub)
-		lr_context_restore_default_state(ce, engine);
+		intel_lr_context_restore_default_state(ce, engine);
 
 	/* Rerun the request; its payload has been neutered (if guilty). */
 	ce->ring->head = head;
 	intel_ring_update_space(ce->ring);
 
-	lr_context_update_reg_state(ce, engine);
-}
-
-bool
-intel_engine_in_execlists_submission_mode(const struct intel_engine_cs *engine)
-{
-	return engine->set_default_submission ==
-	       intel_execlists_set_default_submission;
+	intel_lr_context_update_reg_state(ce, engine);
 }
 
 #if IS_ENABLED(CONFIG_DRM_I915_SELFTEST)
 #include "selftest_lrc.c"
-#include "selftest_execlists.c"
 #endif
diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.h b/drivers/gpu/drm/i915/gt/intel_lrc.h
index 93f30b2deb7f..6b3b8e4c230e 100644
--- a/drivers/gpu/drm/i915/gt/intel_lrc.h
+++ b/drivers/gpu/drm/i915/gt/intel_lrc.h
@@ -74,17 +74,11 @@ struct intel_virtual_engine;
 /* in Gen12 ID 0x7FF is reserved to indicate idle */
 #define GEN12_MAX_CONTEXT_HW_ID	(GEN11_MAX_CONTEXT_HW_ID - 1)
 
-enum {
-	INTEL_CONTEXT_SCHEDULE_IN = 0,
-	INTEL_CONTEXT_SCHEDULE_OUT,
-	INTEL_CONTEXT_SCHEDULE_PREEMPTED,
-};
-
 /* Logical Rings */
+int intel_logical_ring_init(struct intel_engine_cs *engine);
+void intel_logical_ring_setup(struct intel_engine_cs *engine);
 void intel_logical_ring_cleanup(struct intel_engine_cs *engine);
-
-int intel_execlists_submission_setup(struct intel_engine_cs *engine);
-int intel_execlists_submission_init(struct intel_engine_cs *engine);
+void intel_logical_ring_destroy(struct intel_engine_cs *engine);
 
 /* Logical Ring Contexts */
 /* At the start of the context image is its per-process HWS page */
@@ -97,7 +91,7 @@ int intel_execlists_submission_init(struct intel_engine_cs *engine);
 #define LRC_PPHWSP_SCRATCH		0x34
 #define LRC_PPHWSP_SCRATCH_ADDR		(LRC_PPHWSP_SCRATCH * sizeof(u32))
 
-void intel_execlists_set_default_submission(struct intel_engine_cs *engine);
+int intel_lrc_ring_mi_mode(const struct intel_engine_cs *engine);
 
 int intel_lr_context_alloc(struct intel_context *ce,
 			   struct intel_engine_cs *engine);
@@ -106,6 +100,14 @@ void intel_lr_context_fini(struct intel_context *ce);
 u32 *intel_lr_context_set_register_offsets(u32 *regs,
 					   const struct intel_engine_cs *engine);
 
+void intel_lr_context_restore_default_state(struct intel_context *ce,
+					    struct intel_engine_cs *engine);
+
+void intel_lr_context_update_reg_state(const struct intel_context *ce,
+				       const struct intel_engine_cs *engine);
+void intel_lr_context_reset_reg_state(const struct intel_context *ce,
+				      const struct intel_engine_cs *engine);
+
 void intel_lr_context_reset(struct intel_engine_cs *engine,
 			    struct intel_context *ce,
 			    u32 head,
@@ -115,16 +117,4 @@ int intel_lr_context_pin(struct intel_context *ce,
 			 struct intel_engine_cs *engine);
 void intel_lr_context_unpin(struct intel_context *ce);
 
-void intel_execlists_show_requests(struct intel_engine_cs *engine,
-				   struct drm_printer *m,
-				   void (*show_request)(struct drm_printer *m,
-							struct i915_request *rq,
-							const char *prefix),
-				   unsigned int max);
-
-void intel_execlists_virtual_submission_init(struct intel_virtual_engine *ve);
-
-bool
-intel_engine_in_execlists_submission_mode(const struct intel_engine_cs *engine);
-
 #endif /* _INTEL_LRC_H_ */
diff --git a/drivers/gpu/drm/i915/gt/intel_virtual_engine.c b/drivers/gpu/drm/i915/gt/intel_virtual_engine.c
index 6ec3752132bc..862865913fd3 100644
--- a/drivers/gpu/drm/i915/gt/intel_virtual_engine.c
+++ b/drivers/gpu/drm/i915/gt/intel_virtual_engine.c
@@ -13,6 +13,7 @@
 #include "intel_context.h"
 #include "intel_engine.h"
 #include "intel_engine_pm.h"
+#include "intel_execlists_submission.h"
 #include "intel_lrc.h"
 #include "intel_timeline.h"
 #include "intel_virtual_engine.h"
diff --git a/drivers/gpu/drm/i915/gt/selftest_execlists.c b/drivers/gpu/drm/i915/gt/selftest_execlists.c
index b58a4feb2ec4..86cadf32a096 100644
--- a/drivers/gpu/drm/i915/gt/selftest_execlists.c
+++ b/drivers/gpu/drm/i915/gt/selftest_execlists.c
@@ -142,7 +142,7 @@ static int live_unlite_restore(struct intel_gt *gt, int prio)
 		}
 		GEM_BUG_ON(!ce[1]->ring->size);
 		intel_ring_reset(ce[1]->ring, ce[1]->ring->size / 2);
-		lr_context_update_reg_state(ce[1], engine);
+		intel_lr_context_update_reg_state(ce[1], engine);
 
 		rq[0] = igt_spinner_create_request(&spin, ce[0], MI_ARB_CHECK);
 		if (IS_ERR(rq[0])) {
diff --git a/drivers/gpu/drm/i915/gt/selftest_lrc.c b/drivers/gpu/drm/i915/gt/selftest_lrc.c
index c3f5f46ffcb4..6b16ec113675 100644
--- a/drivers/gpu/drm/i915/gt/selftest_lrc.c
+++ b/drivers/gpu/drm/i915/gt/selftest_lrc.c
@@ -194,7 +194,7 @@ static int live_lrc_fixed(void *arg)
 			},
 			{
 				i915_mmio_reg_offset(RING_MI_MODE(engine->mmio_base)),
-				lrc_ring_mi_mode(engine),
+				intel_lrc_ring_mi_mode(engine),
 				"RING_MI_MODE"
 			},
 			{
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
index 172220e83079..097a504402a6 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
@@ -8,6 +8,7 @@
 #include "gem/i915_gem_context.h"
 #include "gt/intel_context.h"
 #include "gt/intel_engine_pm.h"
+#include "gt/intel_execlists_submission.h"
 #include "gt/intel_gt.h"
 #include "gt/intel_gt_pm.h"
 #include "gt/intel_lrc_reg.h"
diff --git a/drivers/gpu/drm/i915/gvt/scheduler.c b/drivers/gpu/drm/i915/gvt/scheduler.c
index 5b2a7d072ec9..60f922a4399a 100644
--- a/drivers/gpu/drm/i915/gvt/scheduler.c
+++ b/drivers/gpu/drm/i915/gvt/scheduler.c
@@ -38,6 +38,7 @@
 #include "gem/i915_gem_context.h"
 #include "gem/i915_gem_pm.h"
 #include "gt/intel_context.h"
+#include "gt/intel_execlists_submission.h"
 #include "gt/intel_ring.h"
 
 #include "i915_drv.h"
diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c
index 8d2e37949f46..2b5f8cbb3053 100644
--- a/drivers/gpu/drm/i915/i915_perf.c
+++ b/drivers/gpu/drm/i915/i915_perf.c
@@ -198,6 +198,7 @@
 #include "gem/i915_gem_context.h"
 #include "gt/intel_engine_pm.h"
 #include "gt/intel_engine_user.h"
+#include "gt/intel_execlists_submission.h"
 #include "gt/intel_gt.h"
 #include "gt/intel_lrc_reg.h"
 #include "gt/intel_ring.h"
-- 
2.23.0


* Re: [Intel-gfx] [RFC 1/5] drm/i915: introduce logical_ring and lr_context naming
  2019-12-11 21:12 ` [Intel-gfx] [RFC 1/5] drm/i915: introduce logical_ring and lr_context naming Daniele Ceraolo Spurio
@ 2019-12-11 21:20   ` Chris Wilson
  2019-12-11 21:33     ` Chris Wilson
  2019-12-11 22:04     ` Daniele Ceraolo Spurio
  0 siblings, 2 replies; 21+ messages in thread
From: Chris Wilson @ 2019-12-11 21:20 UTC (permalink / raw)
  To: Daniele Ceraolo Spurio, intel-gfx

Quoting Daniele Ceraolo Spurio (2019-12-11 21:12:40)
> Ahead of splitting out the code specific to execlists submission to its
> own file, we can re-organize the code within the intel_lrc file to make
> that separation clearer. To achieve this, a number of functions have
> been split/renamed using the "logical_ring" and "lr_context" naming,
> respectively for engine-related setup and lrc management.
> 
> Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
> Cc: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
> Cc: Matthew Brost <matthew.brost@intel.com>
> ---
>  drivers/gpu/drm/i915/gt/intel_lrc.c    | 154 ++++++++++++++-----------
>  drivers/gpu/drm/i915/gt/selftest_lrc.c |  12 +-
>  2 files changed, 93 insertions(+), 73 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
> index 929f6bae4eba..6d6148e11fd0 100644
> --- a/drivers/gpu/drm/i915/gt/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
> @@ -228,17 +228,17 @@ static struct virtual_engine *to_virtual_engine(struct intel_engine_cs *engine)
>         return container_of(engine, struct virtual_engine, base);
>  }
>  
> -static int __execlists_context_alloc(struct intel_context *ce,
> -                                    struct intel_engine_cs *engine);
> -
> -static void execlists_init_reg_state(u32 *reg_state,
> -                                    const struct intel_context *ce,
> -                                    const struct intel_engine_cs *engine,
> -                                    const struct intel_ring *ring,
> -                                    bool close);
> +static int lr_context_alloc(struct intel_context *ce,
> +                           struct intel_engine_cs *engine);

Execlists.

> +
> +static void lr_context_init_reg_state(u32 *reg_state,
> +                                     const struct intel_context *ce,
> +                                     const struct intel_engine_cs *engine,
> +                                     const struct intel_ring *ring,
> +                                     bool close);

lrc. lrc should just be the register offsets and default context image.

>  static void
> -__execlists_update_reg_state(const struct intel_context *ce,
> -                            const struct intel_engine_cs *engine);
> +lr_context_update_reg_state(const struct intel_context *ce,
> +                           const struct intel_engine_cs *engine);

lrc.

>  
>  static void mark_eio(struct i915_request *rq)
>  {
> @@ -1035,8 +1035,8 @@ execlists_check_context(const struct intel_context *ce,
>         WARN_ONCE(!valid, "Invalid lrc state found before submission\n");
>  }
>  
> -static void restore_default_state(struct intel_context *ce,
> -                                 struct intel_engine_cs *engine)
> +static void lr_context_restore_default_state(struct intel_context *ce,
> +                                            struct intel_engine_cs *engine)
>  {
>         u32 *regs = ce->lrc_reg_state;
>  
> @@ -1045,7 +1045,7 @@ static void restore_default_state(struct intel_context *ce,
>                        engine->pinned_default_state + LRC_STATE_PN * PAGE_SIZE,
>                        engine->context_size - PAGE_SIZE);
>  
> -       execlists_init_reg_state(regs, ce, engine, ce->ring, false);
> +       lr_context_init_reg_state(regs, ce, engine, ce->ring, false);
>  }
>  
>  static void reset_active(struct i915_request *rq,
> @@ -1081,8 +1081,8 @@ static void reset_active(struct i915_request *rq,
>         intel_ring_update_space(ce->ring);
>  
>         /* Scrub the context image to prevent replaying the previous batch */
> -       restore_default_state(ce, engine);
> -       __execlists_update_reg_state(ce, engine);
> +       lr_context_restore_default_state(ce, engine);
> +       lr_context_update_reg_state(ce, engine);
>  
>         /* We've switched away, so this should be a no-op, but intent matters */
>         ce->lrc_desc |= CTX_DESC_FORCE_RESTORE;
> @@ -2378,7 +2378,7 @@ static void execlists_submit_request(struct i915_request *request)
>         spin_unlock_irqrestore(&engine->active.lock, flags);
>  }
>  
> -static void __execlists_context_fini(struct intel_context *ce)
> +static void lr_context_fini(struct intel_context *ce)

execlists.

>  {
>         intel_ring_put(ce->ring);
>         i915_vma_put(ce->state);
> @@ -2392,7 +2392,7 @@ static void execlists_context_destroy(struct kref *kref)
>         GEM_BUG_ON(intel_context_is_pinned(ce));
>  
>         if (ce->state)
> -               __execlists_context_fini(ce);
> +               lr_context_fini(ce);
>  
>         intel_context_fini(ce);
>         intel_context_free(ce);
> @@ -2423,7 +2423,7 @@ check_redzone(const void *vaddr, const struct intel_engine_cs *engine)
>                              engine->name);
>  }
>  
> -static void execlists_context_unpin(struct intel_context *ce)
> +static void intel_lr_context_unpin(struct intel_context *ce)

execlists.

>  {
>         check_redzone((void *)ce->lrc_reg_state - LRC_STATE_PN * PAGE_SIZE,
>                       ce->engine);
> @@ -2433,8 +2433,8 @@ static void execlists_context_unpin(struct intel_context *ce)
>  }
>  
>  static void
> -__execlists_update_reg_state(const struct intel_context *ce,
> -                            const struct intel_engine_cs *engine)
> +lr_context_update_reg_state(const struct intel_context *ce,
> +                           const struct intel_engine_cs *engine)

lrc.

>  {
>         struct intel_ring *ring = ce->ring;
>         u32 *regs = ce->lrc_reg_state;
> @@ -2456,8 +2456,7 @@ __execlists_update_reg_state(const struct intel_context *ce,
>  }
>  
>  static int
> -__execlists_context_pin(struct intel_context *ce,
> -                       struct intel_engine_cs *engine)
> +lr_context_pin(struct intel_context *ce, struct intel_engine_cs *engine)

execlists.

>  {
>         void *vaddr;
>         int ret;
> @@ -2479,7 +2478,7 @@ __execlists_context_pin(struct intel_context *ce,
>  
>         ce->lrc_desc = lrc_descriptor(ce, engine);
>         ce->lrc_reg_state = vaddr + LRC_STATE_PN * PAGE_SIZE;
> -       __execlists_update_reg_state(ce, engine);
> +       lr_context_update_reg_state(ce, engine);
>  
>         return 0;
>  
> @@ -2491,12 +2490,12 @@ __execlists_context_pin(struct intel_context *ce,
>  
>  static int execlists_context_pin(struct intel_context *ce)
>  {
> -       return __execlists_context_pin(ce, ce->engine);
> +       return lr_context_pin(ce, ce->engine);
>  }
>  
>  static int execlists_context_alloc(struct intel_context *ce)
>  {
> -       return __execlists_context_alloc(ce, ce->engine);
> +       return lr_context_alloc(ce, ce->engine);
>  }
>  
>  static void execlists_context_reset(struct intel_context *ce)
> @@ -2518,14 +2517,14 @@ static void execlists_context_reset(struct intel_context *ce)
>          * simplicity, we just zero everything out.
>          */
>         intel_ring_reset(ce->ring, 0);
> -       __execlists_update_reg_state(ce, ce->engine);
> +       lr_context_update_reg_state(ce, ce->engine);
>  }
>  
>  static const struct intel_context_ops execlists_context_ops = {
>         .alloc = execlists_context_alloc,
>  
>         .pin = execlists_context_pin,
> -       .unpin = execlists_context_unpin,
> +       .unpin = intel_lr_context_unpin,

execlists.

>  
>         .enter = intel_context_enter_engine,
>         .exit = intel_context_exit_engine,
> @@ -2912,7 +2911,33 @@ static int intel_init_workaround_bb(struct intel_engine_cs *engine)
>         return ret;
>  }
>  
> -static void enable_execlists(struct intel_engine_cs *engine)
> +static int logical_ring_init(struct intel_engine_cs *engine)
> +{
> +       int ret;
> +
> +       ret = intel_engine_init_common(engine);
> +       if (ret)
> +               return ret;
> +
> +       if (intel_init_workaround_bb(engine))
> +               /*
> +                * We continue even if we fail to initialize WA batch
> +                * because we only expect rare glitches but nothing
> +                * critical to prevent us from using GPU
> +                */
> +               DRM_ERROR("WA batch buffer initialization failed\n");
> +
> +       return 0;
> +}
> +
> +static void logical_ring_destroy(struct intel_engine_cs *engine)
> +{
> +       intel_engine_cleanup_common(engine);
> +       lrc_destroy_wa_ctx(engine);
> +       kfree(engine);

> +}
> +
> +static void logical_ring_enable(struct intel_engine_cs *engine)
>  {
>         u32 mode;
>  
> @@ -2946,7 +2971,7 @@ static bool unexpected_starting_state(struct intel_engine_cs *engine)
>         return unexpected;
>  }
>  
> -static int execlists_resume(struct intel_engine_cs *engine)
> +static int logical_ring_resume(struct intel_engine_cs *engine)

execlists.

>  {
>         intel_engine_apply_workarounds(engine);
>         intel_engine_apply_whitelist(engine);
> @@ -2961,7 +2986,7 @@ static int execlists_resume(struct intel_engine_cs *engine)
>                 intel_engine_dump(engine, &p, NULL);
>         }
>  
> -       enable_execlists(engine);
> +       logical_ring_enable(engine);
>  
>         return 0;
>  }
> @@ -3037,8 +3062,8 @@ static void reset_csb_pointers(struct intel_engine_cs *engine)
>                                &execlists->csb_status[reset_value]);
>  }
>  
> -static void __execlists_reset_reg_state(const struct intel_context *ce,
> -                                       const struct intel_engine_cs *engine)
> +static void lr_context_reset_reg_state(const struct intel_context *ce,
> +                                      const struct intel_engine_cs *engine)

lrc.

>  {
>         u32 *regs = ce->lrc_reg_state;
>         int x;
> @@ -3131,14 +3156,14 @@ static void __execlists_reset(struct intel_engine_cs *engine, bool stalled)
>          * to recreate its own state.
>          */
>         GEM_BUG_ON(!intel_context_is_pinned(ce));
> -       restore_default_state(ce, engine);
> +       lr_context_restore_default_state(ce, engine);
>  
>  out_replay:
>         GEM_TRACE("%s replay {head:%04x, tail:%04x}\n",
>                   engine->name, ce->ring->head, ce->ring->tail);
>         intel_ring_update_space(ce->ring);
> -       __execlists_reset_reg_state(ce, engine);
> -       __execlists_update_reg_state(ce, engine);
> +       lr_context_reset_reg_state(ce, engine);
> +       lr_context_update_reg_state(ce, engine);
>         ce->lrc_desc |= CTX_DESC_FORCE_RESTORE; /* paranoid: GPU was reset! */
>  
>  unwind:
> @@ -3788,9 +3813,7 @@ static void execlists_destroy(struct intel_engine_cs *engine)
>  {
>         execlists_shutdown(engine);
>  
> -       intel_engine_cleanup_common(engine);
> -       lrc_destroy_wa_ctx(engine);
> -       kfree(engine);
> +       logical_ring_destroy(engine);
>  }
>  
>  static void
> @@ -3799,7 +3822,7 @@ logical_ring_default_vfuncs(struct intel_engine_cs *engine)
>         /* Default vfuncs which can be overriden by each engine. */
>  
>         engine->destroy = execlists_destroy;
> -       engine->resume = execlists_resume;
> +       engine->resume = logical_ring_resume;
>  
>         engine->reset.prepare = execlists_reset_prepare;
>         engine->reset.reset = execlists_reset;
> @@ -3872,6 +3895,15 @@ static void rcs_submission_override(struct intel_engine_cs *engine)
>         }
>  }
>  
> +static void logical_ring_setup(struct intel_engine_cs *engine)
> +{
> +       logical_ring_default_vfuncs(engine);
> +       logical_ring_default_irqs(engine);
> +
> +       if (engine->class == RENDER_CLASS)
> +               rcs_submission_override(engine);
> +}
> +
>  int intel_execlists_submission_setup(struct intel_engine_cs *engine)
>  {
>         tasklet_init(&engine->execlists.tasklet,
> @@ -3879,11 +3911,7 @@ int intel_execlists_submission_setup(struct intel_engine_cs *engine)
>         timer_setup(&engine->execlists.timer, execlists_timeslice, 0);
>         timer_setup(&engine->execlists.preempt, execlists_preempt, 0);
>  
> -       logical_ring_default_vfuncs(engine);
> -       logical_ring_default_irqs(engine);
> -
> -       if (engine->class == RENDER_CLASS)
> -               rcs_submission_override(engine);
> +       logical_ring_setup(engine);
>  
>         return 0;
>  }
> @@ -3896,18 +3924,10 @@ int intel_execlists_submission_init(struct intel_engine_cs *engine)
>         u32 base = engine->mmio_base;
>         int ret;
>  
> -       ret = intel_engine_init_common(engine);
> +       ret = logical_ring_init(engine);
>         if (ret)
>                 return ret;
>  
> -       if (intel_init_workaround_bb(engine))
> -               /*
> -                * We continue even if we fail to initialize WA batch
> -                * because we only expect rare glitches but nothing
> -                * critical to prevent us from using GPU
> -                */
> -               DRM_ERROR("WA batch buffer initialization failed\n");
> -
>         if (HAS_LOGICAL_RING_ELSQ(i915)) {
>                 execlists->submit_reg = uncore->regs +
>                         i915_mmio_reg_offset(RING_EXECLIST_SQ_CONTENTS(base));
> @@ -4033,11 +4053,11 @@ static struct i915_ppgtt *vm_alias(struct i915_address_space *vm)
>                 return i915_vm_to_ppgtt(vm);
>  }
>  
> -static void execlists_init_reg_state(u32 *regs,
> -                                    const struct intel_context *ce,
> -                                    const struct intel_engine_cs *engine,
> -                                    const struct intel_ring *ring,
> -                                    bool close)
> +static void lr_context_init_reg_state(u32 *regs,
> +                                     const struct intel_context *ce,
> +                                     const struct intel_engine_cs *engine,
> +                                     const struct intel_ring *ring,
> +                                     bool close)
>  {
>         /*
>          * A context is actually a big batch buffer with several
> @@ -4105,7 +4125,7 @@ populate_lr_context(struct intel_context *ce,
>         /* The second page of the context object contains some fields which must
>          * be set up prior to the first execution. */
>         regs = vaddr + LRC_STATE_PN * PAGE_SIZE;
> -       execlists_init_reg_state(regs, ce, engine, ring, inhibit);
> +       lr_context_init_reg_state(regs, ce, engine, ring, inhibit);
>         if (inhibit)
>                 regs[CTX_CONTEXT_CONTROL] |=
>                         _MASKED_BIT_ENABLE(CTX_CTRL_ENGINE_CTX_RESTORE_INHIBIT);

* Re: [Intel-gfx] [RFC 2/5] drm/i915: Move struct intel_virtual_engine to its own header
  2019-12-11 21:12 ` [Intel-gfx] [RFC 2/5] drm/i915: Move struct intel_virtual_engine to its own header Daniele Ceraolo Spurio
@ 2019-12-11 21:22   ` Chris Wilson
  0 siblings, 0 replies; 21+ messages in thread
From: Chris Wilson @ 2019-12-11 21:22 UTC (permalink / raw)
  To: Daniele Ceraolo Spurio, intel-gfx

Quoting Daniele Ceraolo Spurio (2019-12-11 21:12:41)
> From: Matthew Brost <matthew.brost@intel.com>
> 
> The upcoming GuC submission code will need to use the structure, so
> split it to its own file.

There is no way this struct belongs anywhere else.

You want to add a few vfuncs to the context_ops so we can
abstract creation and manipulation.
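
Purely as a sketch of that direction (the hook names and signatures
below are hypothetical, not the current API, and the existing members
are elided), something along these lines:

struct intel_context;
struct intel_engine_cs;

struct intel_context_ops {
	/* existing ops (alloc, pin, unpin, enter, exit, ...) elided */

	/* hypothetical additions to abstract virtual-engine handling */
	struct intel_context *(*create_virtual)(struct intel_engine_cs **siblings,
						unsigned int count);
	int (*attach_bond)(struct intel_context *ce,
			   const struct intel_engine_cs *master,
			   const struct intel_engine_cs *sibling);
};

The GuC backend could then supply its own hooks instead of reusing the
execlists ones.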
-Chris

* Re: [Intel-gfx] [RFC 3/5] drm/i915: split out virtual engine code
  2019-12-11 21:12 ` [Intel-gfx] [RFC 3/5] drm/i915: split out virtual engine code Daniele Ceraolo Spurio
@ 2019-12-11 21:22   ` Chris Wilson
  2019-12-11 21:34     ` Daniele Ceraolo Spurio
  0 siblings, 1 reply; 21+ messages in thread
From: Chris Wilson @ 2019-12-11 21:22 UTC (permalink / raw)
  To: Daniele Ceraolo Spurio, intel-gfx

Quoting Daniele Ceraolo Spurio (2019-12-11 21:12:42)
> Having the virtual engine handling in its own file will make it easier
> to call it from, or modify it for, the GuC implementation without
> leaking changes into the context management or execlists submission
> paths.

No. The virtual engine is tightly coupled into the execlists, it is not
the starting point for a general veng.
-Chris

* Re: [Intel-gfx] [RFC 4/5] drm/i915: move execlists selftests to their own file
  2019-12-11 21:12 ` [Intel-gfx] [RFC 4/5] drm/i915: move execlists selftests to their own file Daniele Ceraolo Spurio
@ 2019-12-11 21:26   ` Chris Wilson
  2019-12-11 22:07     ` Daniele Ceraolo Spurio
  0 siblings, 1 reply; 21+ messages in thread
From: Chris Wilson @ 2019-12-11 21:26 UTC (permalink / raw)
  To: Daniele Ceraolo Spurio, intel-gfx

Quoting Daniele Ceraolo Spurio (2019-12-11 21:12:43)
> Done ahead of splitting the lrc file as well, to keep that patch
> smaller. Just a straight copy, with the exception of create_scratch()
> that has been made common to avoid having 3 instances of it.
> 
> Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
> Cc: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
> Cc: Matthew Brost <matthew.brost@intel.com>
> ---
>  .../drm/i915/gem/selftests/igt_gem_utils.c    |   27 +
>  .../drm/i915/gem/selftests/igt_gem_utils.h    |    3 +
>  drivers/gpu/drm/i915/gt/intel_lrc.c           |    1 +
>  drivers/gpu/drm/i915/gt/selftest_execlists.c  | 3316 ++++++++++++++++
>  drivers/gpu/drm/i915/gt/selftest_lrc.c        | 3333 +----------------
>  drivers/gpu/drm/i915/gt/selftest_mocs.c       |   30 +-
>  6 files changed, 3351 insertions(+), 3359 deletions(-)
>  create mode 100644 drivers/gpu/drm/i915/gt/selftest_execlists.c
> 
> diff --git a/drivers/gpu/drm/i915/gem/selftests/igt_gem_utils.c b/drivers/gpu/drm/i915/gem/selftests/igt_gem_utils.c
> index 6718da20f35d..88109333cb79 100644
> --- a/drivers/gpu/drm/i915/gem/selftests/igt_gem_utils.c
> +++ b/drivers/gpu/drm/i915/gem/selftests/igt_gem_utils.c
> @@ -15,6 +15,33 @@
>  
>  #include "i915_request.h"
>  
> +struct i915_vma *igt_create_scratch(struct intel_gt *gt)

_ggtt_scratch(size, coherency, pin) ?

As it stands, it's not general enough...
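
Roughly, as a minimal sketch (the name and exact signature here are
hypothetical), the idea would be to parameterize what the helper below
hard-codes:

static struct i915_vma *
igt_create_ggtt_scratch(struct intel_gt *gt, size_t size,
			unsigned int coherency, unsigned int pin_flags)
{
	struct drm_i915_gem_object *obj;
	struct i915_vma *vma;
	int err;

	/* size, coherency and pin flags now come from the caller */
	obj = i915_gem_object_create_internal(gt->i915, size);
	if (IS_ERR(obj))
		return ERR_CAST(obj);

	i915_gem_object_set_cache_coherency(obj, coherency);

	vma = i915_vma_instance(obj, &gt->ggtt->vm, NULL);
	if (IS_ERR(vma)) {
		i915_gem_object_put(obj);
		return vma;
	}

	err = i915_vma_pin(vma, 0, 0, pin_flags);
	if (err) {
		i915_gem_object_put(obj);
		return ERR_PTR(err);
	}

	return vma;
}

so this caller becomes igt_create_ggtt_scratch(gt, PAGE_SIZE,
I915_CACHING_CACHED, PIN_GLOBAL) and other users can pick their own.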

> +{
> +       struct drm_i915_gem_object *obj;
> +       struct i915_vma *vma;
> +       int err;
> +
> +       obj = i915_gem_object_create_internal(gt->i915, PAGE_SIZE);
> +       if (IS_ERR(obj))
> +               return ERR_CAST(obj);
> +
> +       i915_gem_object_set_cache_coherency(obj, I915_CACHING_CACHED);
> +
> +       vma = i915_vma_instance(obj, &gt->ggtt->vm, NULL);
> +       if (IS_ERR(vma)) {
> +               i915_gem_object_put(obj);
> +               return vma;
> +       }
> +
> +       err = i915_vma_pin(vma, 0, 0, PIN_GLOBAL);
> +       if (err) {
> +               i915_gem_object_put(obj);
> +               return ERR_PTR(err);
> +       }
> +
> +       return vma;
> +}
> +
>  struct i915_request *
>  igt_request_alloc(struct i915_gem_context *ctx, struct intel_engine_cs *engine)
>  {
> diff --git a/drivers/gpu/drm/i915/gem/selftests/igt_gem_utils.h b/drivers/gpu/drm/i915/gem/selftests/igt_gem_utils.h
> index 4221cf84d175..aae781f59cfc 100644
> --- a/drivers/gpu/drm/i915/gem/selftests/igt_gem_utils.h
> +++ b/drivers/gpu/drm/i915/gem/selftests/igt_gem_utils.h
> @@ -15,6 +15,9 @@ struct i915_vma;
>  
>  struct intel_context;
>  struct intel_engine_cs;
> +struct intel_gt;
> +
> +struct i915_vma *igt_create_scratch(struct intel_gt *gt);
>  
>  struct i915_request *
>  igt_request_alloc(struct i915_gem_context *ctx, struct intel_engine_cs *engine);
> diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
> index 3afae9a44911..fbdd3bdd06f1 100644
> --- a/drivers/gpu/drm/i915/gt/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
> @@ -4446,4 +4446,5 @@ intel_engine_in_execlists_submission_mode(const struct intel_engine_cs *engine)
>  
>  #if IS_ENABLED(CONFIG_DRM_I915_SELFTEST)
>  #include "selftest_lrc.c"
> +#include "selftest_execlists.c"
>  #endif
> diff --git a/drivers/gpu/drm/i915/gt/selftest_execlists.c b/drivers/gpu/drm/i915/gt/selftest_execlists.c
> new file mode 100644
> index 000000000000..b58a4feb2ec4
> --- /dev/null
> +++ b/drivers/gpu/drm/i915/gt/selftest_execlists.c

Note that many, if not all, of these should also be used for GuC
submission as a BAT (there are a few where, the GuC being a black box,
we cannot poke at its internals).
-Chris

* Re: [Intel-gfx] [RFC 5/5] drm/i915: introduce intel_execlists_submission.<c/h>
  2019-12-11 21:12 ` [Intel-gfx] [RFC 5/5] drm/i915: introduce intel_execlists_submission.<c/h> Daniele Ceraolo Spurio
@ 2019-12-11 21:31   ` Chris Wilson
  2019-12-11 22:35     ` Daniele Ceraolo Spurio
  0 siblings, 1 reply; 21+ messages in thread
From: Chris Wilson @ 2019-12-11 21:31 UTC (permalink / raw)
  To: Daniele Ceraolo Spurio, intel-gfx

Quoting Daniele Ceraolo Spurio (2019-12-11 21:12:44)
> Split out all the code related to the execlists submission flow to its
> own file to keep it separate from the general context management,
> because the latter will be re-used by the GuC submission flow.
> 
> Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
> Cc: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
> Cc: Matthew Brost <matthew.brost@intel.com>
> ---
>  drivers/gpu/drm/i915/Makefile                 |    1 +
>  drivers/gpu/drm/i915/gt/intel_engine_cs.c     |    1 +
>  .../drm/i915/gt/intel_execlists_submission.c  | 2485 ++++++++++++++++
>  .../drm/i915/gt/intel_execlists_submission.h  |   58 +
>  drivers/gpu/drm/i915/gt/intel_lrc.c           | 2511 +----------------
>  drivers/gpu/drm/i915/gt/intel_lrc.h           |   34 +-
>  .../gpu/drm/i915/gt/intel_virtual_engine.c    |    1 +
>  drivers/gpu/drm/i915/gt/selftest_execlists.c  |    2 +-
>  drivers/gpu/drm/i915/gt/selftest_lrc.c        |    2 +-
>  .../gpu/drm/i915/gt/uc/intel_guc_submission.c |    1 +
>  drivers/gpu/drm/i915/gvt/scheduler.c          |    1 +
>  drivers/gpu/drm/i915/i915_perf.c              |    1 +
>  12 files changed, 2584 insertions(+), 2514 deletions(-)
>  create mode 100644 drivers/gpu/drm/i915/gt/intel_execlists_submission.c
>  create mode 100644 drivers/gpu/drm/i915/gt/intel_execlists_submission.h
> 
> diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
> index 79f5ef5acd4c..3640e0436c97 100644
> --- a/drivers/gpu/drm/i915/Makefile
> +++ b/drivers/gpu/drm/i915/Makefile
> @@ -82,6 +82,7 @@ gt-y += \
>         gt/intel_engine_pm.o \
>         gt/intel_engine_pool.o \
>         gt/intel_engine_user.o \
> +       gt/intel_execlists_submission.o \
>         gt/intel_gt.o \
>         gt/intel_gt_irq.o \
>         gt/intel_gt_pm.o \
> diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> index 49473c25916c..0a23d01b7589 100644
> --- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> +++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> @@ -33,6 +33,7 @@
>  #include "intel_engine_pm.h"
>  #include "intel_engine_pool.h"
>  #include "intel_engine_user.h"
> +#include "intel_execlists_submission.h"
>  #include "intel_gt.h"
>  #include "intel_gt_requests.h"
>  #include "intel_lrc.h"
> diff --git a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
> new file mode 100644
> index 000000000000..76b878bf15ad
> --- /dev/null
> +++ b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
> @@ -0,0 +1,2485 @@
> +/*
> + * SPDX-License-Identifier: MIT
> + *
> + * Copyright © 2019 Intel Corporation
> + */
> +
> +#include <linux/interrupt.h>
> +
> +#include "gem/i915_gem_context.h"
> +
> +#include "i915_drv.h"
> +#include "i915_perf.h"
> +#include "i915_trace.h"
> +#include "i915_vgpu.h"
> +#include "intel_engine_pm.h"
> +#include "intel_gt.h"
> +#include "intel_gt_pm.h"
> +#include "intel_gt_requests.h"
> +#include "intel_lrc_reg.h"
> +#include "intel_mocs.h"
> +#include "intel_reset.h"
> +#include "intel_ring.h"
> +#include "intel_virtual_engine.h"
> +#include "intel_workarounds.h"
> +#include "intel_execlists_submission.h"
> +
> +#define RING_EXECLIST_QFULL            (1 << 0x2)
> +#define RING_EXECLIST1_VALID           (1 << 0x3)
> +#define RING_EXECLIST0_VALID           (1 << 0x4)
> +#define RING_EXECLIST_ACTIVE_STATUS    (3 << 0xE)
> +#define RING_EXECLIST1_ACTIVE          (1 << 0x11)
> +#define RING_EXECLIST0_ACTIVE          (1 << 0x12)
> +
> +#define GEN8_CTX_STATUS_IDLE_ACTIVE    (1 << 0)
> +#define GEN8_CTX_STATUS_PREEMPTED      (1 << 1)
> +#define GEN8_CTX_STATUS_ELEMENT_SWITCH (1 << 2)
> +#define GEN8_CTX_STATUS_ACTIVE_IDLE    (1 << 3)
> +#define GEN8_CTX_STATUS_COMPLETE       (1 << 4)
> +#define GEN8_CTX_STATUS_LITE_RESTORE   (1 << 15)
> +
> +#define GEN8_CTX_STATUS_COMPLETED_MASK \
> +        (GEN8_CTX_STATUS_COMPLETE | GEN8_CTX_STATUS_PREEMPTED)
> +
> +#define CTX_DESC_FORCE_RESTORE BIT_ULL(2)
> +
> +#define GEN12_CTX_STATUS_SWITCHED_TO_NEW_QUEUE (0x1) /* lower csb dword */
> +#define GEN12_CTX_SWITCH_DETAIL(csb_dw)        ((csb_dw) & 0xF) /* upper csb dword */
> +#define GEN12_CSB_SW_CTX_ID_MASK               GENMASK(25, 15)
> +#define GEN12_IDLE_CTX_ID              0x7FF
> +#define GEN12_CSB_CTX_VALID(csb_dw) \
> +       (FIELD_GET(GEN12_CSB_SW_CTX_ID_MASK, csb_dw) != GEN12_IDLE_CTX_ID)
> +
> +/* Typical size of the average request (2 pipecontrols and a MI_BB) */
> +#define EXECLISTS_REQUEST_SIZE 64 /* bytes */
> +
> +static void mark_eio(struct i915_request *rq)
> +{
> +       if (i915_request_completed(rq))
> +               return;
> +
> +       GEM_BUG_ON(i915_request_signaled(rq));
> +
> +       dma_fence_set_error(&rq->fence, -EIO);
> +       i915_request_mark_complete(rq);
> +}
> +
> +static struct i915_request *
> +active_request(const struct intel_timeline * const tl, struct i915_request *rq)
> +{
> +       struct i915_request *active = rq;
> +
> +       rcu_read_lock();
> +       list_for_each_entry_continue_reverse(rq, &tl->requests, link) {
> +               if (i915_request_completed(rq))
> +                       break;
> +
> +               active = rq;
> +       }
> +       rcu_read_unlock();
> +
> +       return active;
> +}
> +
> +static inline void
> +ring_set_paused(const struct intel_engine_cs *engine, int state)
> +{
> +       /*
> +        * We inspect HWS_PREEMPT with a semaphore inside
> +        * engine->emit_fini_breadcrumb. If the dword is true,
> +        * the ring is paused as the semaphore will busywait
> +        * until the dword is false.
> +        */
> +       engine->status_page.addr[I915_GEM_HWS_PREEMPT] = state;
> +       if (state)
> +               wmb();
> +}
> +
> +static inline struct i915_priolist *to_priolist(struct rb_node *rb)
> +{
> +       return rb_entry(rb, struct i915_priolist, node);
> +}
> +
> +static inline int rq_prio(const struct i915_request *rq)
> +{
> +       return rq->sched.attr.priority;
> +}
> +
> +static int effective_prio(const struct i915_request *rq)
> +{
> +       int prio = rq_prio(rq);
> +
> +       /*
> +        * If this request is special and must not be interrupted at any
> +        * cost, so be it. Note we are only checking the most recent request
> +        * in the context and so may be masking an earlier vip request. It
> +        * is hoped that under the conditions where nopreempt is used, this
> +        * will not matter (i.e. all requests to that context will be
> +        * nopreempt for as long as desired).
> +        */
> +       if (i915_request_has_nopreempt(rq))
> +               prio = I915_PRIORITY_UNPREEMPTABLE;
> +
> +       /*
> +        * On unwinding the active request, we give it a priority bump
> +        * if it has completed waiting on any semaphore. If we know that
> +        * the request has already started, we can prevent an unwanted
> +        * preempt-to-idle cycle by taking that into account now.
> +        */
> +       if (__i915_request_has_started(rq))
> +               prio |= I915_PRIORITY_NOSEMAPHORE;
> +
> +       /* Restrict mere WAIT boosts from triggering preemption */
> +       BUILD_BUG_ON(__NO_PREEMPTION & ~I915_PRIORITY_MASK); /* only internal */
> +       return prio | __NO_PREEMPTION;
> +}
> +
> +static int queue_prio(const struct intel_engine_execlists *execlists)
> +{
> +       struct i915_priolist *p;
> +       struct rb_node *rb;
> +
> +       rb = rb_first_cached(&execlists->queue);
> +       if (!rb)
> +               return INT_MIN;
> +
> +       /*
> +        * As the priolist[] are inverted, with the highest priority in [0],
> +        * we have to flip the index value to become priority.
> +        */
> +       p = to_priolist(rb);
> +       return ((p->priority + 1) << I915_USER_PRIORITY_SHIFT) - ffs(p->used);
> +}
> +
> +static inline bool need_preempt(const struct intel_engine_cs *engine,
> +                               const struct i915_request *rq,
> +                               struct rb_node *rb)
> +{
> +       int last_prio;
> +
> +       if (!intel_engine_has_semaphores(engine))
> +               return false;
> +
> +       /*
> +        * Check if the current priority hint merits a preemption attempt.
> +        *
> +        * We record the highest value priority we saw during rescheduling
> +        * prior to this dequeue, therefore we know that if it is strictly
> +        * less than the current tail of ESLP[0], we do not need to force
> +        * a preempt-to-idle cycle.
> +        *
> +        * However, the priority hint is a mere hint that we may need to
> +        * preempt. If that hint is stale or we may be trying to preempt
> +        * ourselves, ignore the request.
> +        *
> +        * More naturally we would write
> +        *      prio >= max(0, last);
> +        * except that we wish to prevent triggering preemption at the same
> +        * priority level: the task that is running should remain running
> +        * to preserve FIFO ordering of dependencies.
> +        */
> +       last_prio = max(effective_prio(rq), I915_PRIORITY_NORMAL - 1);
> +       if (engine->execlists.queue_priority_hint <= last_prio)
> +               return false;
> +
> +       /*
> +        * Check against the first request in ELSP[1], it will, thanks to the
> +        * power of PI, be the highest priority of that context.
> +        */
> +       if (!list_is_last(&rq->sched.link, &engine->active.requests) &&
> +           rq_prio(list_next_entry(rq, sched.link)) > last_prio)
> +               return true;
> +
> +       if (rb) {
> +               struct intel_virtual_engine *ve =
> +                       rb_entry(rb, typeof(*ve), nodes[engine->id].rb);
> +               bool preempt = false;
> +
> +               if (engine == ve->siblings[0]) { /* only preempt one sibling */
> +                       struct i915_request *next;
> +
> +                       rcu_read_lock();
> +                       next = READ_ONCE(ve->request);
> +                       if (next)
> +                               preempt = rq_prio(next) > last_prio;
> +                       rcu_read_unlock();
> +               }
> +
> +               if (preempt)
> +                       return preempt;
> +       }
> +
> +       /*
> +        * If the inflight context did not trigger the preemption, then maybe
> +        * it was the set of queued requests? Pick the highest priority in
> +        * the queue (the first active priolist) and see if it deserves to be
> +        * running instead of ELSP[0].
> +        *
> +        * The highest priority request in the queue can not be either
> +        * ELSP[0] or ELSP[1] as, thanks again to PI, if it was the same
> +        * context, it's priority would not exceed ELSP[0] aka last_prio.
> +        */
> +       return queue_prio(&engine->execlists) > last_prio;
> +}
> +
> +__maybe_unused static inline bool
> +assert_priority_queue(const struct i915_request *prev,
> +                     const struct i915_request *next)
> +{
> +       /*
> +        * Without preemption, the prev may refer to the still active element
> +        * which we refuse to let go.
> +        *
> +        * Even with preemption, there are times when we think it is better not
> +        * to preempt and leave an ostensibly lower priority request in flight.
> +        */
> +       if (i915_request_is_active(prev))
> +               return true;
> +
> +       return rq_prio(prev) >= rq_prio(next);
> +}
> +
> +static struct i915_request *
> +__unwind_incomplete_requests(struct intel_engine_cs *engine)
> +{
> +       struct i915_request *rq, *rn, *active = NULL;
> +       struct list_head *uninitialized_var(pl);
> +       int prio = I915_PRIORITY_INVALID;
> +
> +       lockdep_assert_held(&engine->active.lock);
> +
> +       list_for_each_entry_safe_reverse(rq, rn,
> +                                        &engine->active.requests,
> +                                        sched.link) {
> +               if (i915_request_completed(rq))
> +                       continue; /* XXX */
> +
> +               __i915_request_unsubmit(rq);
> +
> +               /*
> +                * Push the request back into the queue for later resubmission.
> +                * If this request is not native to this physical engine (i.e.
> +                * it came from a virtual source), push it back onto the virtual
> +                * engine so that it can be moved across onto another physical
> +                * engine as load dictates.
> +                */
> +               if (likely(rq->execution_mask == engine->mask)) {
> +                       GEM_BUG_ON(rq_prio(rq) == I915_PRIORITY_INVALID);
> +                       if (rq_prio(rq) != prio) {
> +                               prio = rq_prio(rq);
> +                               pl = i915_sched_lookup_priolist(engine, prio);
> +                       }
> +                       GEM_BUG_ON(RB_EMPTY_ROOT(&engine->execlists.queue.rb_root));
> +
> +                       list_move(&rq->sched.link, pl);
> +                       active = rq;
> +               } else {
> +                       struct intel_engine_cs *owner = rq->hw_context->engine;
> +
> +                       /*
> +                        * Decouple the virtual breadcrumb before moving it
> +                        * back to the virtual engine -- we don't want the
> +                        * request to complete in the background and try
> +                        * and cancel the breadcrumb on the virtual engine
> +                        * (instead of the old engine where it is linked)!
> +                        */
> +                       if (test_bit(DMA_FENCE_FLAG_ENABLE_SIGNAL_BIT,
> +                                    &rq->fence.flags)) {
> +                               spin_lock_nested(&rq->lock,
> +                                                SINGLE_DEPTH_NESTING);
> +                               i915_request_cancel_breadcrumb(rq);
> +                               spin_unlock(&rq->lock);
> +                       }
> +                       rq->engine = owner;
> +                       owner->submit_request(rq);
> +                       active = NULL;
> +               }
> +       }
> +
> +       return active;
> +}
> +
> +struct i915_request *
> +execlists_unwind_incomplete_requests(struct intel_engine_execlists *execlists)

There should be no exports from this file... Did you not also make
guc_submission standalone?

> +{
> +       struct intel_engine_cs *engine =
> +               container_of(execlists, typeof(*engine), execlists);
> +
> +       return __unwind_incomplete_requests(engine);
> +}
> +
> +static inline void
> +execlists_context_status_change(struct i915_request *rq, unsigned long status)
> +{
> +       /*
> +        * Only used when GVT-g is enabled now. When GVT-g is disabled,
> +        * The compiler should eliminate this function as dead-code.
> +        */
> +       if (!IS_ENABLED(CONFIG_DRM_I915_GVT))
> +               return;
> +
> +       atomic_notifier_call_chain(&rq->engine->context_status_notifier,
> +                                  status, rq);
> +}
> +
> +static void intel_engine_context_in(struct intel_engine_cs *engine)
> +{
> +       unsigned long flags;
> +
> +       if (READ_ONCE(engine->stats.enabled) == 0)
> +               return;
> +
> +       write_seqlock_irqsave(&engine->stats.lock, flags);
> +
> +       if (engine->stats.enabled > 0) {
> +               if (engine->stats.active++ == 0)
> +                       engine->stats.start = ktime_get();
> +               GEM_BUG_ON(engine->stats.active == 0);
> +       }
> +
> +       write_sequnlock_irqrestore(&engine->stats.lock, flags);
> +}
> +
> +static void intel_engine_context_out(struct intel_engine_cs *engine)
> +{
> +       unsigned long flags;
> +
> +       if (READ_ONCE(engine->stats.enabled) == 0)
> +               return;
> +
> +       write_seqlock_irqsave(&engine->stats.lock, flags);
> +
> +       if (engine->stats.enabled > 0) {
> +               ktime_t last;
> +
> +               if (engine->stats.active && --engine->stats.active == 0) {
> +                       /*
> +                        * Decrement the active context count and in case GPU
> +                        * is now idle add up to the running total.
> +                        */
> +                       last = ktime_sub(ktime_get(), engine->stats.start);
> +
> +                       engine->stats.total = ktime_add(engine->stats.total,
> +                                                       last);
> +               } else if (engine->stats.active == 0) {
> +                       /*
> +                        * After turning on engine stats, context out might be
> +                        * the first event in which case we account from the
> +                        * time stats gathering was turned on.
> +                        */
> +                       last = ktime_sub(ktime_get(), engine->stats.enabled_at);
> +
> +                       engine->stats.total = ktime_add(engine->stats.total,
> +                                                       last);
> +               }
> +       }
> +
> +       write_sequnlock_irqrestore(&engine->stats.lock, flags);
> +}
> +
> +static void
> +execlists_check_context(const struct intel_context *ce,
> +                       const struct intel_engine_cs *engine)
> +{
> +       const struct intel_ring *ring = ce->ring;
> +       u32 *regs = ce->lrc_reg_state;
> +       bool valid = true;
> +       int x;
> +
> +       if (regs[CTX_RING_START] != i915_ggtt_offset(ring->vma)) {
> +               pr_err("%s: context submitted with incorrect RING_START [%08x], expected %08x\n",
> +                      engine->name,
> +                      regs[CTX_RING_START],
> +                      i915_ggtt_offset(ring->vma));
> +               regs[CTX_RING_START] = i915_ggtt_offset(ring->vma);
> +               valid = false;
> +       }
> +
> +       if ((regs[CTX_RING_CTL] & ~(RING_WAIT | RING_WAIT_SEMAPHORE)) !=
> +           (RING_CTL_SIZE(ring->size) | RING_VALID)) {
> +               pr_err("%s: context submitted with incorrect RING_CTL [%08x], expected %08x\n",
> +                      engine->name,
> +                      regs[CTX_RING_CTL],
> +                      (u32)(RING_CTL_SIZE(ring->size) | RING_VALID));
> +               regs[CTX_RING_CTL] = RING_CTL_SIZE(ring->size) | RING_VALID;
> +               valid = false;
> +       }
> +
> +       x = intel_lrc_ring_mi_mode(engine);
> +       if (x != -1 && regs[x + 1] & (regs[x + 1] >> 16) & STOP_RING) {
> +               pr_err("%s: context submitted with STOP_RING [%08x] in RING_MI_MODE\n",
> +                      engine->name, regs[x + 1]);
> +               regs[x + 1] &= ~STOP_RING;
> +               regs[x + 1] |= STOP_RING << 16;
> +               valid = false;
> +       }
> +
> +       WARN_ONCE(!valid, "Invalid lrc state found before submission\n");
> +}
> +
> +static void reset_active(struct i915_request *rq,
> +                        struct intel_engine_cs *engine)
> +{
> +       struct intel_context * const ce = rq->hw_context;
> +       u32 head;
> +
> +       /*
> +        * The executing context has been cancelled. We want to prevent
> +        * further execution along this context and propagate the error on
> +        * to anything depending on its results.
> +        *
> +        * In __i915_request_submit(), we apply the -EIO and remove the
> +        * requests' payloads for any banned requests. But first, we must
> +        * rewind the context back to the start of the incomplete request so
> +        * that we do not jump back into the middle of the batch.
> +        *
> +        * We preserve the breadcrumbs and semaphores of the incomplete
> +        * requests so that inter-timeline dependencies (i.e other timelines)
> +        * remain correctly ordered. And we defer to __i915_request_submit()
> +        * so that all asynchronous waits are correctly handled.
> +        */
> +       GEM_TRACE("%s(%s): { rq=%llx:%lld }\n",
> +                 __func__, engine->name, rq->fence.context, rq->fence.seqno);
> +
> +       /* On resubmission of the active request, payload will be scrubbed */
> +       if (i915_request_completed(rq))
> +               head = rq->tail;
> +       else
> +               head = active_request(ce->timeline, rq)->head;
> +       ce->ring->head = intel_ring_wrap(ce->ring, head);
> +       intel_ring_update_space(ce->ring);
> +
> +       /* Scrub the context image to prevent replaying the previous batch */
> +       intel_lr_context_restore_default_state(ce, engine);
> +       intel_lr_context_update_reg_state(ce, engine);
> +
> +       /* We've switched away, so this should be a no-op, but intent matters */
> +       ce->lrc_desc |= CTX_DESC_FORCE_RESTORE;
> +}
> +
> +static inline struct intel_engine_cs *
> +__execlists_schedule_in(struct i915_request *rq)
> +{
> +       struct intel_engine_cs * const engine = rq->engine;
> +       struct intel_context * const ce = rq->hw_context;
> +
> +       intel_context_get(ce);
> +
> +       if (unlikely(i915_gem_context_is_banned(ce->gem_context)))
> +               reset_active(rq, engine);
> +
> +       if (IS_ENABLED(CONFIG_DRM_I915_DEBUG_GEM))
> +               execlists_check_context(ce, engine);
> +
> +       if (ce->tag) {
> +               /* Use a fixed tag for OA and friends */
> +               ce->lrc_desc |= (u64)ce->tag << 32;
> +       } else {
> +               /* We don't need a strict matching tag, just different values */
> +               ce->lrc_desc &= ~GENMASK_ULL(47, 37);
> +               ce->lrc_desc |=
> +                       (u64)(engine->context_tag++ % NUM_CONTEXT_TAG) <<
> +                       GEN11_SW_CTX_ID_SHIFT;
> +               BUILD_BUG_ON(NUM_CONTEXT_TAG > GEN12_MAX_CONTEXT_HW_ID);
> +       }
> +
> +       __intel_gt_pm_get(engine->gt);
> +       execlists_context_status_change(rq, INTEL_CONTEXT_SCHEDULE_IN);
> +       intel_engine_context_in(engine);
> +
> +       return engine;
> +}
> +
> +static inline struct i915_request *
> +execlists_schedule_in(struct i915_request *rq, int idx)
> +{
> +       struct intel_context * const ce = rq->hw_context;
> +       struct intel_engine_cs *old;
> +
> +       GEM_BUG_ON(!intel_engine_pm_is_awake(rq->engine));
> +       trace_i915_request_in(rq, idx);
> +
> +       old = READ_ONCE(ce->inflight);
> +       do {
> +               if (!old) {
> +                       WRITE_ONCE(ce->inflight, __execlists_schedule_in(rq));
> +                       break;
> +               }
> +       } while (!try_cmpxchg(&ce->inflight, &old, ptr_inc(old)));
> +
> +       GEM_BUG_ON(intel_context_inflight(ce) != rq->engine);
> +       return i915_request_get(rq);
> +}
> +
> +static void kick_siblings(struct i915_request *rq, struct intel_context *ce)
> +{
> +       struct intel_virtual_engine *ve =
> +               container_of(ce, typeof(*ve), context);
> +       struct i915_request *next = READ_ONCE(ve->request);
> +
> +       if (next && next->execution_mask & ~rq->execution_mask)
> +               tasklet_schedule(&ve->base.execlists.tasklet);
> +}
> +
> +static inline void
> +__execlists_schedule_out(struct i915_request *rq,
> +                        struct intel_engine_cs * const engine)
> +{
> +       struct intel_context * const ce = rq->hw_context;
> +
> +       /*
> +        * NB process_csb() is not under the engine->active.lock and hence
> +        * schedule_out can race with schedule_in meaning that we should
> +        * refrain from doing non-trivial work here.
> +        */
> +
> +       /*
> +        * If we have just completed this context, the engine may now be
> +        * idle and we want to re-enter powersaving.
> +        */
> +       if (list_is_last(&rq->link, &ce->timeline->requests) &&
> +           i915_request_completed(rq))
> +               intel_engine_add_retire(engine, ce->timeline);
> +
> +       intel_engine_context_out(engine);
> +       execlists_context_status_change(rq, INTEL_CONTEXT_SCHEDULE_OUT);
> +       intel_gt_pm_put_async(engine->gt);
> +
> +       /*
> +        * If this is part of a virtual engine, its next request may
> +        * have been blocked waiting for access to the active context.
> +        * We have to kick all the siblings again in case we need to
> +        * switch (e.g. the next request is not runnable on this
> +        * engine). Hopefully, we will already have submitted the next
> +        * request before the tasklet runs and do not need to rebuild
> +        * each virtual tree and kick everyone again.
> +        */
> +       if (ce->engine != engine)
> +               kick_siblings(rq, ce);
> +
> +       intel_context_put(ce);
> +}
> +
> +static inline void
> +execlists_schedule_out(struct i915_request *rq)
> +{
> +       struct intel_context * const ce = rq->hw_context;
> +       struct intel_engine_cs *cur, *old;
> +
> +       trace_i915_request_out(rq);
> +
> +       old = READ_ONCE(ce->inflight);
> +       do
> +               cur = ptr_unmask_bits(old, 2) ? ptr_dec(old) : NULL;
> +       while (!try_cmpxchg(&ce->inflight, &old, cur));
> +       if (!cur)
> +               __execlists_schedule_out(rq, old);
> +
> +       i915_request_put(rq);
> +}
> +
> +static u64 execlists_update_context(struct i915_request *rq)
> +{
> +       struct intel_context *ce = rq->hw_context;
> +       u64 desc = ce->lrc_desc;
> +       u32 tail;
> +
> +       /*
> +        * WaIdleLiteRestore:bdw,skl
> +        *
> +        * We should never submit the context with the same RING_TAIL twice
> +        * just in case we submit an empty ring, which confuses the HW.
> +        *
> +        * We append a couple of NOOPs (gen8_emit_wa_tail) after the end of
> +        * the normal request to be able to always advance the RING_TAIL on
> +        * subsequent resubmissions (for lite restore). Should that fail us,
> +        * and we try and submit the same tail again, force the context
> +        * reload.
> +        */
> +       tail = intel_ring_set_tail(rq->ring, rq->tail);
> +       if (unlikely(ce->lrc_reg_state[CTX_RING_TAIL] == tail))
> +               desc |= CTX_DESC_FORCE_RESTORE;
> +       ce->lrc_reg_state[CTX_RING_TAIL] = tail;
> +       rq->tail = rq->wa_tail;
> +
> +       /*
> +        * Make sure the context image is complete before we submit it to HW.
> +        *
> +        * Ostensibly, writes (including the WCB) should be flushed prior to
> +        * an uncached write such as our mmio register access, the empirical
> +        * evidence (esp. on Braswell) suggests that the WC write into memory
> +        * may not be visible to the HW prior to the completion of the UC
> +        * register write and that we may begin execution from the context
> +        * before its image is complete leading to invalid PD chasing.
> +        */
> +       wmb();
> +
> +       /* Wa_1607138340:tgl */
> +       if (IS_TGL_REVID(rq->i915, TGL_REVID_A0, TGL_REVID_A0))
> +               desc |= CTX_DESC_FORCE_RESTORE;
> +
> +       ce->lrc_desc &= ~CTX_DESC_FORCE_RESTORE;
> +       return desc;
> +}
> +
> +static inline void write_desc(struct intel_engine_execlists *execlists, u64 desc, u32 port)
> +{
> +       if (execlists->ctrl_reg) {
> +               writel(lower_32_bits(desc), execlists->submit_reg + port * 2);
> +               writel(upper_32_bits(desc), execlists->submit_reg + port * 2 + 1);
> +       } else {
> +               writel(upper_32_bits(desc), execlists->submit_reg);
> +               writel(lower_32_bits(desc), execlists->submit_reg);
> +       }
> +}
> +
> +static __maybe_unused void
> +trace_ports(const struct intel_engine_execlists *execlists,
> +           const char *msg,
> +           struct i915_request * const *ports)
> +{
> +       const struct intel_engine_cs *engine =
> +               container_of(execlists, typeof(*engine), execlists);
> +
> +       if (!ports[0])
> +               return;
> +
> +       GEM_TRACE("%s: %s { %llx:%lld%s, %llx:%lld }\n",
> +                 engine->name, msg,
> +                 ports[0]->fence.context,
> +                 ports[0]->fence.seqno,
> +                 i915_request_completed(ports[0]) ? "!" :
> +                 i915_request_started(ports[0]) ? "*" :
> +                 "",
> +                 ports[1] ? ports[1]->fence.context : 0,
> +                 ports[1] ? ports[1]->fence.seqno : 0);
> +}
> +
> +static __maybe_unused bool
> +assert_pending_valid(const struct intel_engine_execlists *execlists,
> +                    const char *msg)
> +{
> +       struct i915_request * const *port, *rq;
> +       struct intel_context *ce = NULL;
> +
> +       trace_ports(execlists, msg, execlists->pending);
> +
> +       if (!execlists->pending[0]) {
> +               GEM_TRACE_ERR("Nothing pending for promotion!\n");
> +               return false;
> +       }
> +
> +       if (execlists->pending[execlists_num_ports(execlists)]) {
> +               GEM_TRACE_ERR("Excess pending[%d] for promotion!\n",
> +                             execlists_num_ports(execlists));
> +               return false;
> +       }
> +
> +       for (port = execlists->pending; (rq = *port); port++) {
> +               unsigned long flags;
> +               bool ok = true;
> +
> +               GEM_BUG_ON(!kref_read(&rq->fence.refcount));
> +               GEM_BUG_ON(!i915_request_is_active(rq));
> +
> +               if (ce == rq->hw_context) {
> +                       GEM_TRACE_ERR("Dup context:%llx in pending[%zd]\n",
> +                                     ce->timeline->fence_context,
> +                                     port - execlists->pending);
> +                       return false;
> +               }
> +               ce = rq->hw_context;
> +
> +               /* Hold tightly onto the lock to prevent concurrent retires! */
> +               if (!spin_trylock_irqsave(&rq->lock, flags))
> +                       continue;
> +
> +               if (i915_request_completed(rq))
> +                       goto unlock;
> +
> +               if (i915_active_is_idle(&ce->active) &&
> +                   !i915_gem_context_is_kernel(ce->gem_context)) {
> +                       GEM_TRACE_ERR("Inactive context:%llx in pending[%zd]\n",
> +                                     ce->timeline->fence_context,
> +                                     port - execlists->pending);
> +                       ok = false;
> +                       goto unlock;
> +               }
> +
> +               if (!i915_vma_is_pinned(ce->state)) {
> +                       GEM_TRACE_ERR("Unpinned context:%llx in pending[%zd]\n",
> +                                     ce->timeline->fence_context,
> +                                     port - execlists->pending);
> +                       ok = false;
> +                       goto unlock;
> +               }
> +
> +               if (!i915_vma_is_pinned(ce->ring->vma)) {
> +                       GEM_TRACE_ERR("Unpinned ring:%llx in pending[%zd]\n",
> +                                     ce->timeline->fence_context,
> +                                     port - execlists->pending);
> +                       ok = false;
> +                       goto unlock;
> +               }
> +
> +unlock:
> +               spin_unlock_irqrestore(&rq->lock, flags);
> +               if (!ok)
> +                       return false;
> +       }
> +
> +       return ce;
> +}
> +
> +static void execlists_submit_ports(struct intel_engine_cs *engine)
> +{
> +       struct intel_engine_execlists *execlists = &engine->execlists;
> +       unsigned int n;
> +
> +       GEM_BUG_ON(!assert_pending_valid(execlists, "submit"));
> +
> +       /*
> +        * We can skip acquiring intel_runtime_pm_get() here as it was taken
> +        * on our behalf by the request (see i915_gem_mark_busy()) and it will
> +        * not be relinquished until the device is idle (see
> +        * i915_gem_idle_work_handler()). As a precaution, we make sure
> +        * that all ELSP are drained, i.e. we have processed the CSB,
> +        * before allowing ourselves to idle and calling intel_runtime_pm_put().
> +        */
> +       GEM_BUG_ON(!intel_engine_pm_is_awake(engine));
> +
> +       /*
> +        * ELSQ note: the submit queue is not cleared after being submitted
> +        * to the HW so we need to make sure we always clean it up. This is
> +        * currently ensured by the fact that we always write the same number
> +        * of elsq entries; keep this in mind before changing the loop below.
> +        */
> +       for (n = execlists_num_ports(execlists); n--; ) {
> +               struct i915_request *rq = execlists->pending[n];
> +
> +               write_desc(execlists,
> +                          rq ? execlists_update_context(rq) : 0,
> +                          n);
> +       }
> +
> +       /* we need to manually load the submit queue */
> +       if (execlists->ctrl_reg)
> +               writel(EL_CTRL_LOAD, execlists->ctrl_reg);
> +}
> +
> +static bool ctx_single_port_submission(const struct intel_context *ce)
> +{
> +       return (IS_ENABLED(CONFIG_DRM_I915_GVT) &&
> +               i915_gem_context_force_single_submission(ce->gem_context));
> +}
> +
> +static bool can_merge_ctx(const struct intel_context *prev,
> +                         const struct intel_context *next)
> +{
> +       if (prev != next)
> +               return false;
> +
> +       if (ctx_single_port_submission(prev))
> +               return false;
> +
> +       return true;
> +}
> +
> +static bool can_merge_rq(const struct i915_request *prev,
> +                        const struct i915_request *next)
> +{
> +       GEM_BUG_ON(prev == next);
> +       GEM_BUG_ON(!assert_priority_queue(prev, next));
> +
> +       /*
> +        * We do not submit known completed requests. Therefore if the next
> +        * request is already completed, we can pretend to merge it in
> +        * with the previous context (and we will skip updating the ELSP
> +        * and tracking). Thus hopefully keeping the ELSP full with active
> +        * contexts, despite the best efforts of preempt-to-busy to confuse
> +        * us.
> +        */
> +       if (i915_request_completed(next))
> +               return true;
> +
> +       if (unlikely((prev->flags ^ next->flags) &
> +                    (I915_REQUEST_NOPREEMPT | I915_REQUEST_SENTINEL)))
> +               return false;
> +
> +       if (!can_merge_ctx(prev->hw_context, next->hw_context))
> +               return false;
> +
> +       return true;
> +}
> +
> +static bool virtual_matches(const struct intel_virtual_engine *ve,
> +                           const struct i915_request *rq,
> +                           const struct intel_engine_cs *engine)
> +{
> +       const struct intel_engine_cs *inflight;
> +
> +       if (!(rq->execution_mask & engine->mask)) /* We peeked too soon! */
> +               return false;
> +
> +       /*
> +        * We track when the HW has completed saving the context image
> +        * (i.e. when we have seen the final CS event switching out of
> +        * the context) and must not overwrite the context image before
> +        * then. This restricts us to only using the active engine
> +        * while the previous virtualized request is inflight (so
> +        * we reuse the register offsets). This is a very small
> +        * hysteresis on the greedy selection algorithm.
> +        */
> +       inflight = intel_context_inflight(&ve->context);
> +       if (inflight && inflight != engine)
> +               return false;
> +
> +       return true;
> +}
> +
> +static void virtual_xfer_breadcrumbs(struct intel_virtual_engine *ve,
> +                                    struct intel_engine_cs *engine)
> +{
> +       struct intel_engine_cs *old = ve->siblings[0];
> +
> +       /* All unattached (rq->engine == old) must already be completed */
> +
> +       spin_lock(&old->breadcrumbs.irq_lock);
> +       if (!list_empty(&ve->context.signal_link)) {
> +               list_move_tail(&ve->context.signal_link,
> +                              &engine->breadcrumbs.signalers);
> +               intel_engine_queue_breadcrumbs(engine);
> +       }
> +       spin_unlock(&old->breadcrumbs.irq_lock);
> +}
> +
> +static struct i915_request *
> +last_active(const struct intel_engine_execlists *execlists)
> +{
> +       struct i915_request * const *last = READ_ONCE(execlists->active);
> +
> +       while (*last && i915_request_completed(*last))
> +               last++;
> +
> +       return *last;
> +}
> +
> +static void defer_request(struct i915_request *rq, struct list_head * const pl)
> +{
> +       LIST_HEAD(list);
> +
> +       /*
> +        * We want to move the interrupted request to the back of
> +        * the round-robin list (i.e. its priority level), but
> +        * in doing so, we must then move all requests that were in
> +        * flight and were waiting for the interrupted request to
> +        * be run after it again.
> +        */
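> +       /*
> +        * As a purely invented example: if request A was interrupted
> +        * while B and C (same engine, same priority, waiting on A) had
> +        * already been submitted behind it, all three are moved to the
> +        * back of A's priority level, keeping A ahead of B and C.
> +        */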
> +       do {
> +               struct i915_dependency *p;
> +
> +               GEM_BUG_ON(i915_request_is_active(rq));
> +               list_move_tail(&rq->sched.link, pl);
> +
> +               list_for_each_entry(p, &rq->sched.waiters_list, wait_link) {
> +                       struct i915_request *w =
> +                               container_of(p->waiter, typeof(*w), sched);
> +
> +                       /* Leave semaphores spinning on the other engines */
> +                       if (w->engine != rq->engine)
> +                               continue;
> +
> +                       /* No waiter should start before its signaler */
> +                       GEM_BUG_ON(i915_request_started(w) &&
> +                                  !i915_request_completed(rq));
> +
> +                       GEM_BUG_ON(i915_request_is_active(w));
> +                       if (list_empty(&w->sched.link))
> +                               continue; /* Not yet submitted; unready */
> +
> +                       if (rq_prio(w) < rq_prio(rq))
> +                               continue;
> +
> +                       GEM_BUG_ON(rq_prio(w) > rq_prio(rq));
> +                       list_move_tail(&w->sched.link, &list);
> +               }
> +
> +               rq = list_first_entry_or_null(&list, typeof(*rq), sched.link);
> +       } while (rq);
> +}
> +
> +static void defer_active(struct intel_engine_cs *engine)
> +{
> +       struct i915_request *rq;
> +
> +       rq = __unwind_incomplete_requests(engine);
> +       if (!rq)
> +               return;
> +
> +       defer_request(rq, i915_sched_lookup_priolist(engine, rq_prio(rq)));
> +}
> +
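> +/*
> + * Purely as an illustration of the check below (priorities invented and
> + * internal priority boosts ignored): with the running request at effective
> + * priority 0 and another priority-0 request queued behind it on this
> + * engine, the hint is 0 >= 0 and we ask for a timeslice so the two
> + * contexts can share the engine.
> + */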
> +static bool
> +need_timeslice(struct intel_engine_cs *engine, const struct i915_request *rq)
> +{
> +       int hint;
> +
> +       if (!intel_engine_has_timeslices(engine))
> +               return false;
> +
> +       if (list_is_last(&rq->sched.link, &engine->active.requests))
> +               return false;
> +
> +       hint = max(rq_prio(list_next_entry(rq, sched.link)),
> +                  engine->execlists.queue_priority_hint);
> +
> +       return hint >= effective_prio(rq);
> +}
> +
> +static int
> +switch_prio(struct intel_engine_cs *engine, const struct i915_request *rq)
> +{
> +       if (list_is_last(&rq->sched.link, &engine->active.requests))
> +               return INT_MIN;
> +
> +       return rq_prio(list_next_entry(rq, sched.link));
> +}
> +
> +static inline unsigned long
> +timeslice(const struct intel_engine_cs *engine)
> +{
> +       return READ_ONCE(engine->props.timeslice_duration_ms);
> +}
> +
> +static unsigned long
> +active_timeslice(const struct intel_engine_cs *engine)
> +{
> +       const struct i915_request *rq = *engine->execlists.active;
> +
> +       if (i915_request_completed(rq))
> +               return 0;
> +
> +       if (engine->execlists.switch_priority_hint < effective_prio(rq))
> +               return 0;
> +
> +       return timeslice(engine);
> +}
> +
> +static void set_timeslice(struct intel_engine_cs *engine)
> +{
> +       if (!intel_engine_has_timeslices(engine))
> +               return;
> +
> +       set_timer_ms(&engine->execlists.timer, active_timeslice(engine));
> +}
> +
> +static void record_preemption(struct intel_engine_execlists *execlists)
> +{
> +       (void)I915_SELFTEST_ONLY(execlists->preempt_hang.count++);
> +}
> +
> +static unsigned long active_preempt_timeout(struct intel_engine_cs *engine)
> +{
> +       struct i915_request *rq;
> +
> +       rq = last_active(&engine->execlists);
> +       if (!rq)
> +               return 0;
> +
> +       /* Force a fast reset for terminated contexts (ignoring sysfs!) */
> +       if (unlikely(i915_gem_context_is_banned(rq->gem_context)))
> +               return 1;
> +
> +       return READ_ONCE(engine->props.preempt_timeout_ms);
> +}
> +
> +static void set_preempt_timeout(struct intel_engine_cs *engine)
> +{
> +       if (!intel_engine_has_preempt_reset(engine))
> +               return;
> +
> +       set_timer_ms(&engine->execlists.preempt,
> +                    active_preempt_timeout(engine));
> +}
> +
> +static void execlists_dequeue(struct intel_engine_cs *engine)
> +{
> +       struct intel_engine_execlists * const execlists = &engine->execlists;
> +       struct i915_request **port = execlists->pending;
> +       struct i915_request ** const last_port = port + execlists->port_mask;
> +       struct i915_request *last;
> +       struct rb_node *rb;
> +       bool submit = false;
> +
> +       /*
> +        * Hardware submission is through 2 ports. Conceptually each port
> +        * has a (RING_START, RING_HEAD, RING_TAIL) tuple. RING_START is
> +        * static for a context, and unique to each, so we only execute
> +        * requests belonging to a single context from each ring. RING_HEAD
> +        * is maintained by the CS in the context image; it marks the place
> +        * where it got up to last time, and through RING_TAIL we tell the CS
> +        * where we want to execute up to this time.
> +        *
> +        * In this list the requests are in order of execution. Consecutive
> +        * requests from the same context are adjacent in the ringbuffer. We
> +        * can combine these requests into a single RING_TAIL update:
> +        *
> +        *              RING_HEAD...req1...req2
> +        *                                    ^- RING_TAIL
> +        * since to execute req2 the CS must first execute req1.
> +        *
> +        * Our goal then is to point each port to the end of a consecutive
> +        * sequence of requests as being the most optimal (fewest wake ups
> +        * and context switches) submission.
> +        */
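> +       /*
> +        * As a purely illustrative example (offsets invented): with
> +        * RING_HEAD at 0x100, req1 ending at 0x140 and req2 ending at
> +        * 0x180 in the same context, a single submission with
> +        * RING_TAIL = 0x180 executes both requests back to back, with
> +        * no separate ELSP write needed for req1.
> +        */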
> +
> +       for (rb = rb_first_cached(&execlists->virtual); rb; ) {
> +               struct intel_virtual_engine *ve =
> +                       rb_entry(rb, typeof(*ve), nodes[engine->id].rb);
> +               struct i915_request *rq = READ_ONCE(ve->request);
> +
> +               if (!rq) { /* lazily cleanup after another engine handled rq */
> +                       rb_erase_cached(rb, &execlists->virtual);
> +                       RB_CLEAR_NODE(rb);
> +                       rb = rb_first_cached(&execlists->virtual);
> +                       continue;
> +               }
> +
> +               if (!virtual_matches(ve, rq, engine)) {
> +                       rb = rb_next(rb);
> +                       continue;
> +               }
> +
> +               break;
> +       }
> +
> +       /*
> +        * If the queue is higher priority than the last
> +        * request in the currently active context, submit afresh.
> +        * We will resubmit again afterwards in case we need to split
> +        * the active context to interject the preemption request,
> +        * i.e. we will retrigger preemption following the ack in case
> +        * of trouble.
> +        */
> +       last = last_active(execlists);
> +       if (last) {
> +               if (need_preempt(engine, last, rb)) {
> +                       GEM_TRACE("%s: preempting last=%llx:%lld, prio=%d, hint=%d\n",
> +                                 engine->name,
> +                                 last->fence.context,
> +                                 last->fence.seqno,
> +                                 last->sched.attr.priority,
> +                                 execlists->queue_priority_hint);
> +                       record_preemption(execlists);
> +
> +                       /*
> +                        * Don't let the RING_HEAD advance past the breadcrumb
> +                        * as we unwind (and until we resubmit) so that we do
> +                        * not accidentally tell it to go backwards.
> +                        */
> +                       ring_set_paused(engine, 1);
> +
> +                       /*
> +                        * Note that we have not stopped the GPU at this point,
> +                        * so we are unwinding the incomplete requests as they
> +                        * remain inflight and so by the time we do complete
> +                        * the preemption, some of the unwound requests may
> +                        * complete!
> +                        */
> +                       __unwind_incomplete_requests(engine);
> +
> +                       /*
> +                        * If we need to return to the preempted context, we
> +                        * need to skip the lite-restore and force it to
> +                        * reload the RING_TAIL. Otherwise, the HW has a
> +                        * tendency to ignore us rewinding the TAIL to the
> +                        * end of an earlier request.
> +                        */
> +                       last->hw_context->lrc_desc |= CTX_DESC_FORCE_RESTORE;
> +                       last = NULL;
> +               } else if (need_timeslice(engine, last) &&
> +                          timer_expired(&engine->execlists.timer)) {
> +                       GEM_TRACE("%s: expired last=%llx:%lld, prio=%d, hint=%d\n",
> +                                 engine->name,
> +                                 last->fence.context,
> +                                 last->fence.seqno,
> +                                 last->sched.attr.priority,
> +                                 execlists->queue_priority_hint);
> +
> +                       ring_set_paused(engine, 1);
> +                       defer_active(engine);
> +
> +                       /*
> +                        * Unlike for preemption, if we rewind and continue
> +                        * executing the same context as previously active,
> +                        * the order of execution will remain the same and
> +                        * the tail will only advance. We do not need to
> +                        * force a full context restore, as a lite-restore
> +                        * is sufficient to resample the monotonic TAIL.
> +                        *
> +                        * If we switch to any other context, similarly we
> +                        * will not rewind TAIL of current context, and
> +                        * normal save/restore will preserve state and allow
> +                        * us to later continue executing the same request.
> +                        */
> +                       last = NULL;
> +               } else {
> +                       /*
> +                        * Otherwise if we already have a request pending
> +                        * for execution after the current one, we can
> +                        * just wait until the next CS event before
> +                        * queuing more. In either case we will force a
> +                        * lite-restore preemption event, but if we wait
> +                        * we hopefully coalesce several updates into a single
> +                        * submission.
> +                        */
> +                       if (!list_is_last(&last->sched.link,
> +                                         &engine->active.requests)) {
> +                               /*
> +                                * Even if ELSP[1] is occupied and not worthy
> +                                * of timeslices, our queue might be.
> +                                */
> +                               if (!execlists->timer.expires &&
> +                                   need_timeslice(engine, last))
> +                                       set_timer_ms(&execlists->timer,
> +                                                    timeslice(engine));
> +
> +                               return;
> +                       }
> +               }
> +       }
> +
> +       while (rb) { /* XXX virtual is always taking precedence */
> +               struct intel_virtual_engine *ve =
> +                       rb_entry(rb, typeof(*ve), nodes[engine->id].rb);
> +               struct i915_request *rq;
> +
> +               spin_lock(&ve->base.active.lock);
> +
> +               rq = ve->request;
> +               if (unlikely(!rq)) { /* lost the race to a sibling */
> +                       spin_unlock(&ve->base.active.lock);
> +                       rb_erase_cached(rb, &execlists->virtual);
> +                       RB_CLEAR_NODE(rb);
> +                       rb = rb_first_cached(&execlists->virtual);
> +                       continue;
> +               }
> +
> +               GEM_BUG_ON(rq != ve->request);
> +               GEM_BUG_ON(rq->engine != &ve->base);
> +               GEM_BUG_ON(rq->hw_context != &ve->context);
> +
> +               if (rq_prio(rq) >= queue_prio(execlists)) {
> +                       if (!virtual_matches(ve, rq, engine)) {
> +                               spin_unlock(&ve->base.active.lock);
> +                               rb = rb_next(rb);
> +                               continue;
> +                       }
> +
> +                       if (last && !can_merge_rq(last, rq)) {
> +                               spin_unlock(&ve->base.active.lock);
> +                               return; /* leave this for another */
> +                       }
> +
> +                       GEM_TRACE("%s: virtual rq=%llx:%lld%s, new engine? %s\n",
> +                                 engine->name,
> +                                 rq->fence.context,
> +                                 rq->fence.seqno,
> +                                 i915_request_completed(rq) ? "!" :
> +                                 i915_request_started(rq) ? "*" :
> +                                 "",
> +                                 yesno(engine != ve->siblings[0]));
> +
> +                       ve->request = NULL;
> +                       ve->base.execlists.queue_priority_hint = INT_MIN;
> +                       rb_erase_cached(rb, &execlists->virtual);
> +                       RB_CLEAR_NODE(rb);
> +
> +                       GEM_BUG_ON(!(rq->execution_mask & engine->mask));
> +                       rq->engine = engine;
> +
> +                       if (engine != ve->siblings[0]) {
> +                               u32 *regs = ve->context.lrc_reg_state;
> +                               unsigned int n;
> +
> +                               GEM_BUG_ON(READ_ONCE(ve->context.inflight));
> +
> +                               if (!intel_engine_has_relative_mmio(engine))
> +                                       intel_lr_context_set_register_offsets(regs,
> +                                                                             engine);
> +
> +                               if (!list_empty(&ve->context.signals))
> +                                       virtual_xfer_breadcrumbs(ve, engine);
> +
> +                               /*
> +                                * Move the bound engine to the top of the list
> +                                * for future execution. We then kick this
> +                                * tasklet first before checking others, so that
> +                                * we preferentially reuse this set of bound
> +                                * registers.
> +                                */
> +                               for (n = 1; n < ve->num_siblings; n++) {
> +                                       if (ve->siblings[n] == engine) {
> +                                               swap(ve->siblings[n],
> +                                                    ve->siblings[0]);
> +                                               break;
> +                                       }
> +                               }
> +
> +                               GEM_BUG_ON(ve->siblings[0] != engine);
> +                       }
> +
> +                       if (__i915_request_submit(rq)) {
> +                               submit = true;
> +                               last = rq;
> +                       }
> +                       i915_request_put(rq);
> +
> +                       /*
> +                        * Hmm, we have a bunch of virtual engine requests,
> +                        * but the first one was already completed (thanks
> +                        * preempt-to-busy!). Keep looking at the veng queue
> +                        * until we have no more relevant requests (i.e.
> +                        * the normal submit queue has higher priority).
> +                        */
> +                       if (!submit) {
> +                               spin_unlock(&ve->base.active.lock);
> +                               rb = rb_first_cached(&execlists->virtual);
> +                               continue;
> +                       }
> +               }
> +
> +               spin_unlock(&ve->base.active.lock);
> +               break;
> +       }
> +
> +       while ((rb = rb_first_cached(&execlists->queue))) {
> +               struct i915_priolist *p = to_priolist(rb);
> +               struct i915_request *rq, *rn;
> +               int i;
> +
> +               priolist_for_each_request_consume(rq, rn, p, i) {
> +                       bool merge = true;
> +
> +                       /*
> +                        * Can we combine this request with the current port?
> +                        * It has to be the same context/ringbuffer and not
> +                        * have any exceptions (e.g. GVT saying never to
> +                        * combine contexts).
> +                        *
> +                        * If we can combine the requests, we can execute both
> +                        * by updating the RING_TAIL to point to the end of the
> +                        * second request, and so we never need to tell the
> +                        * hardware about the first.
> +                        */
> +                       if (last && !can_merge_rq(last, rq)) {
> +                               /*
> +                                * If we are on the second port and cannot
> +                                * combine this request with the last, then we
> +                                * are done.
> +                                */
> +                               if (port == last_port)
> +                                       goto done;
> +
> +                               /*
> +                                * We must not populate both ELSP[] with the
> +                                * same LRCA, i.e. we must submit 2 different
> +                                * contexts if we submit 2 ELSP.
> +                                */
> +                               if (last->hw_context == rq->hw_context)
> +                                       goto done;
> +
> +                               if (i915_request_has_sentinel(last))
> +                                       goto done;
> +
> +                               /*
> +                                * If GVT overrides us we only ever submit
> +                                * port[0], leaving port[1] empty. Note that we
> +                                * also have to be careful that we don't queue
> +                                * the same context (even though a different
> +                                * request) to the second port.
> +                                */
> +                               if (ctx_single_port_submission(last->hw_context) ||
> +                                   ctx_single_port_submission(rq->hw_context))
> +                                       goto done;
> +
> +                               merge = false;
> +                       }
> +
> +                       if (__i915_request_submit(rq)) {
> +                               if (!merge) {
> +                                       *port = execlists_schedule_in(last, port - execlists->pending);
> +                                       port++;
> +                                       last = NULL;
> +                               }
> +
> +                               GEM_BUG_ON(last &&
> +                                          !can_merge_ctx(last->hw_context,
> +                                                         rq->hw_context));
> +
> +                               submit = true;
> +                               last = rq;
> +                       }
> +               }
> +
> +               rb_erase_cached(&p->node, &execlists->queue);
> +               i915_priolist_free(p);
> +       }
> +
> +done:
> +       /*
> +        * Here be a bit of magic! Or sleight-of-hand, whichever you prefer.
> +        *
> +        * We choose the priority hint such that if we add a request of greater
> +        * priority than this, we kick the submission tasklet to decide on
> +        * the right order of submitting the requests to hardware. We must
> +        * also be prepared to reorder requests as they are in-flight on the
> +        * HW. We derive the priority hint then as the first "hole" in
> +        * the HW submission ports and if there are no available slots,
> +        * the priority of the lowest executing request, i.e. last.
> +        *
> +        * When we do receive a higher priority request ready to run from the
> +        * user, see queue_request(), the priority hint is bumped to that
> +        * request triggering preemption on the next dequeue (or subsequent
> +        * interrupt for secondary ports).
> +        */
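> +       /*
> +        * For instance (priorities invented): if only a prio -10 request
> +        * is left over in the queue after the loop above, the hint below
> +        * becomes -10; a later prio 0 submission then bumps the hint in
> +        * submit_queue() and kicks the tasklet to reconsider the ELSP.
> +        */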
> +       execlists->queue_priority_hint = queue_prio(execlists);
> +       GEM_TRACE("%s: queue_priority_hint:%d, submit:%s\n",
> +                 engine->name, execlists->queue_priority_hint,
> +                 yesno(submit));
> +
> +       if (submit) {
> +               *port = execlists_schedule_in(last, port - execlists->pending);
> +               execlists->switch_priority_hint =
> +                       switch_prio(engine, *execlists->pending);
> +
> +               /*
> +                * Skip if we ended up with exactly the same set of requests,
> +                * e.g. trying to timeslice a pair of ordered contexts
> +                */
> +               if (!memcmp(execlists->active, execlists->pending,
> +                           (port - execlists->pending + 1) * sizeof(*port))) {
> +                       do
> +                               execlists_schedule_out(fetch_and_zero(port));
> +                       while (port-- != execlists->pending);
> +
> +                       goto skip_submit;
> +               }
> +
> +               memset(port + 1, 0, (last_port - port) * sizeof(*port));
> +               execlists_submit_ports(engine);
> +
> +               set_preempt_timeout(engine);
> +       } else {
> +skip_submit:
> +               ring_set_paused(engine, 0);
> +       }
> +}
> +
> +static void
> +cancel_port_requests(struct intel_engine_execlists * const execlists)
> +{
> +       struct i915_request * const *port;
> +
> +       for (port = execlists->pending; *port; port++)
> +               execlists_schedule_out(*port);
> +       memset(execlists->pending, 0, sizeof(execlists->pending));
> +
> +       /* Mark the end of active before we overwrite *active */
> +       for (port = xchg(&execlists->active, execlists->pending); *port; port++)
> +               execlists_schedule_out(*port);
> +       WRITE_ONCE(execlists->active,
> +                  memset(execlists->inflight, 0, sizeof(execlists->inflight)));
> +}
> +
> +static inline void
> +invalidate_csb_entries(const u32 *first, const u32 *last)
> +{
> +       clflush((void *)first);
> +       clflush((void *)last);
> +}
> +
> +static inline bool
> +reset_in_progress(const struct intel_engine_execlists *execlists)
> +{
> +       return unlikely(!__tasklet_is_enabled(&execlists->tasklet));
> +}
> +
> +/*
> + * Starting with Gen12, the status has a new format:
> + *
> + *     bit  0:     switched to new queue
> + *     bit  1:     reserved
> + *     bit  2:     semaphore wait mode (poll or signal), only valid when
> + *                 switch detail is set to "wait on semaphore"
> + *     bits 3-5:   engine class
> + *     bits 6-11:  engine instance
> + *     bits 12-14: reserved
> + *     bits 15-25: sw context id of the lrc the GT switched to
> + *     bits 26-31: sw counter of the lrc the GT switched to
> + *     bits 32-35: context switch detail
> + *                  - 0: ctx complete
> + *                  - 1: wait on sync flip
> + *                  - 2: wait on vblank
> + *                  - 3: wait on scanline
> + *                  - 4: wait on semaphore
> + *                  - 5: context preempted (not on SEMAPHORE_WAIT or
> + *                       WAIT_FOR_EVENT)
> + *     bit  36:    reserved
> + *     bits 37-43: wait detail (for switch detail 1 to 4)
> + *     bits 44-46: reserved
> + *     bits 47-57: sw context id of the lrc the GT switched away from
> + *     bits 58-63: sw counter of the lrc the GT switched away from
> + */
> +static inline bool
> +gen12_csb_parse(const struct intel_engine_execlists *execlists, const u32 *csb)
> +{
> +       u32 lower_dw = csb[0];
> +       u32 upper_dw = csb[1];
> +       bool ctx_to_valid = GEN12_CSB_CTX_VALID(lower_dw);
> +       bool ctx_away_valid = GEN12_CSB_CTX_VALID(upper_dw);
> +       bool new_queue = lower_dw & GEN12_CTX_STATUS_SWITCHED_TO_NEW_QUEUE;
> +
> +       /*
> +        * The context switch detail is not guaranteed to be 5 when a preemption
> +        * occurs, so we can't just check for that. The check below works for
> +        * all the cases we care about, including preemptions of WAIT
> +        * instructions and lite-restore. Preempt-to-idle via the CTRL register
> +        * would require some extra handling, but we don't support that.
> +        */
> +       if (!ctx_away_valid || new_queue) {
> +               GEM_BUG_ON(!ctx_to_valid);
> +               return true;
> +       }
> +
> +       /*
> +        * switch detail = 5 is covered by the case above and we do not expect a
> +        * context switch on an unsuccessful wait instruction since we always
> +        * use polling mode.
> +        */
> +       GEM_BUG_ON(GEN12_CTX_SWITCH_DETAIL(upper_dw));
> +       return false;
> +}
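> +
> +/*
> + * Minimal illustrative helpers for the gen12 CSB layout documented above:
> + * the shifts and masks are read straight off that bit description, and the
> + * names are made up for illustration rather than being the driver's macros.
> + */
> +static inline u32 csb_sketch_sw_ctx_id_to(const u32 *csb)
> +{
> +       return (csb[0] >> 15) & GENMASK(10, 0); /* bits 15-25 */
> +}
> +
> +static inline u32 csb_sketch_switch_detail(const u32 *csb)
> +{
> +       return csb[1] & GENMASK(3, 0); /* bits 32-35 of the 64b entry */
> +}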
> +
> +static inline bool
> +gen8_csb_parse(const struct intel_engine_execlists *execlists, const u32 *csb)
> +{
> +       return *csb & (GEN8_CTX_STATUS_IDLE_ACTIVE | GEN8_CTX_STATUS_PREEMPTED);
> +}
> +
> +static void process_csb(struct intel_engine_cs *engine)
> +{
> +       struct intel_engine_execlists * const execlists = &engine->execlists;
> +       const u32 * const buf = execlists->csb_status;
> +       const u8 num_entries = execlists->csb_size;
> +       u8 head, tail;
> +
> +       /*
> +        * As we modify our execlists state tracking we require exclusive
> +        * access. Either we are inside the tasklet, or the tasklet is disabled
> +        * and we assume that is only inside the reset paths and so serialised.
> +        */
> +       GEM_BUG_ON(!tasklet_is_locked(&execlists->tasklet) &&
> +                  !reset_in_progress(execlists));
> +       GEM_BUG_ON(!intel_engine_in_execlists_submission_mode(engine));
> +
> +       /*
> +        * Note that csb_write, csb_status may be either in HWSP or mmio.
> +        * When reading from the csb_write mmio register, we have to be
> +        * careful to only use the GEN8_CSB_WRITE_PTR portion, which is
> +        * the low 4bits. As it happens we know the next 4bits are always
> +        * zero and so we can simply mask off the low u8 of the register
> +        * and treat it identically to reading from the HWSP (without having
> +        * to use explicit shifting and masking, and probably bifurcating
> +        * the code to handle the legacy mmio read).
> +        */
> +       head = execlists->csb_head;
> +       tail = READ_ONCE(*execlists->csb_write);
> +       GEM_TRACE("%s cs-irq head=%d, tail=%d\n", engine->name, head, tail);
> +       if (unlikely(head == tail))
> +               return;
> +
> +       /*
> +        * Hopefully paired with a wmb() in HW!
> +        *
> +        * We must complete the read of the write pointer before any reads
> +        * from the CSB, so that we do not see stale values. Without an rmb
> +        * (lfence) the HW may speculatively perform the CSB[] reads *before*
> +        * we perform the READ_ONCE(*csb_write).
> +        */
> +       rmb();
> +
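> +       /*
> +        * Purely as an illustration of the walk below (numbers invented):
> +        * with num_entries = 12, head = 11 and tail = 2, we consume CSB
> +        * entries 0, 1 and 2 and then stop once head == tail.
> +        */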
> +       do {
> +               bool promote;
> +
> +               if (++head == num_entries)
> +                       head = 0;
> +
> +               /*
> +                * We are flying near dragons again.
> +                *
> +                * We hold a reference to the request in execlist_port[]
> +                * but no more than that. We are operating in softirq
> +                * context and so cannot hold any mutex or sleep. That
> +                * prevents us from stopping the requests we are processing
> +                * in port[] from being retired simultaneously (the
> +                * breadcrumb will be complete before we see the
> +                * context-switch). As we only hold the reference to the
> +                * request, any pointer chasing underneath the request
> +                * is subject to a potential use-after-free. Thus we
> +                * store all of the bookkeeping within port[] as
> +                * required, and avoid using unguarded pointers beneath
> +                * request itself. The same applies to the atomic
> +                * status notifier.
> +                */
> +
> +               GEM_TRACE("%s csb[%d]: status=0x%08x:0x%08x\n",
> +                         engine->name, head,
> +                         buf[2 * head + 0], buf[2 * head + 1]);
> +
> +               if (INTEL_GEN(engine->i915) >= 12)
> +                       promote = gen12_csb_parse(execlists, buf + 2 * head);
> +               else
> +                       promote = gen8_csb_parse(execlists, buf + 2 * head);
> +               if (promote) {
> +                       struct i915_request * const *old = execlists->active;
> +
> +                       /* Point active to the new ELSP; prevent overwriting */
> +                       WRITE_ONCE(execlists->active, execlists->pending);
> +                       set_timeslice(engine);
> +
> +                       if (!inject_preempt_hang(execlists))
> +                               ring_set_paused(engine, 0);
> +
> +                       /* cancel old inflight, prepare for switch */
> +                       trace_ports(execlists, "preempted", old);
> +                       while (*old)
> +                               execlists_schedule_out(*old++);
> +
> +                       /* switch pending to inflight */
> +                       GEM_BUG_ON(!assert_pending_valid(execlists, "promote"));
> +                       WRITE_ONCE(execlists->active,
> +                                  memcpy(execlists->inflight,
> +                                         execlists->pending,
> +                                         execlists_num_ports(execlists) *
> +                                         sizeof(*execlists->pending)));
> +
> +                       WRITE_ONCE(execlists->pending[0], NULL);
> +               } else {
> +                       GEM_BUG_ON(!*execlists->active);
> +
> +                       /* port0 completed, advanced to port1 */
> +                       trace_ports(execlists, "completed", execlists->active);
> +
> +                       /*
> +                        * We rely on the hardware being strongly
> +                        * ordered, that the breadcrumb write is
> +                        * coherent (visible from the CPU) before the
> +                        * user interrupt and CSB is processed.
> +                        */
> +                       GEM_BUG_ON(!i915_request_completed(*execlists->active) &&
> +                                  !reset_in_progress(execlists));
> +                       execlists_schedule_out(*execlists->active++);
> +
> +                       GEM_BUG_ON(execlists->active - execlists->inflight >
> +                                  execlists_num_ports(execlists));
> +               }
> +       } while (head != tail);
> +
> +       execlists->csb_head = head;
> +
> +       /*
> +        * Gen11 has proven to fail wrt the global observation point
> +        * between entry and tail update, failing on the ordering and
> +        * thus we see an old entry in the context status buffer.
> +        *
> +        * Forcibly evict the entries ahead of the next gpu csb update,
> +        * to increase the odds that we get fresh entries even with
> +        * non-working hardware. The cost of doing so mostly comes out
> +        * in the wash, as the hardware, working or not, will need to
> +        * do the invalidation anyway.
> +        */
> +       invalidate_csb_entries(&buf[0], &buf[num_entries - 1]);
> +}
> +
> +static void __execlists_submission_tasklet(struct intel_engine_cs *const engine)
> +{
> +       lockdep_assert_held(&engine->active.lock);
> +       if (!engine->execlists.pending[0]) {
> +               rcu_read_lock(); /* protect peeking at execlists->active */
> +               execlists_dequeue(engine);
> +               rcu_read_unlock();
> +       }
> +}
> +
> +static noinline void preempt_reset(struct intel_engine_cs *engine)
> +{
> +       const unsigned int bit = I915_RESET_ENGINE + engine->id;
> +       unsigned long *lock = &engine->gt->reset.flags;
> +
> +       if (i915_modparams.reset < 3)
> +               return;
> +
> +       if (test_and_set_bit(bit, lock))
> +               return;
> +
> +       /* Mark this tasklet as disabled to avoid waiting for it to complete */
> +       tasklet_disable_nosync(&engine->execlists.tasklet);
> +
> +       GEM_TRACE("%s: preempt timeout %lu+%ums\n",
> +                 engine->name,
> +                 READ_ONCE(engine->props.preempt_timeout_ms),
> +                 jiffies_to_msecs(jiffies - engine->execlists.preempt.expires));
> +       intel_engine_reset(engine, "preemption time out");
> +
> +       tasklet_enable(&engine->execlists.tasklet);
> +       clear_and_wake_up_bit(bit, lock);
> +}
> +
> +static bool preempt_timeout(const struct intel_engine_cs *const engine)
> +{
> +       const struct timer_list *t = &engine->execlists.preempt;
> +
> +       if (!CONFIG_DRM_I915_PREEMPT_TIMEOUT)
> +               return false;
> +
> +       if (!timer_expired(t))
> +               return false;
> +
> +       return READ_ONCE(engine->execlists.pending[0]);
> +}
> +
> +/*
> + * Check the unread Context Status Buffers and manage the submission of new
> + * contexts to the ELSP accordingly.
> + */
> +static void execlists_submission_tasklet(unsigned long data)
> +{
> +       struct intel_engine_cs * const engine = (struct intel_engine_cs *)data;
> +       bool timeout = preempt_timeout(engine);
> +
> +       process_csb(engine);
> +       if (!READ_ONCE(engine->execlists.pending[0]) || timeout) {
> +               unsigned long flags;
> +
> +               spin_lock_irqsave(&engine->active.lock, flags);
> +               __execlists_submission_tasklet(engine);
> +               spin_unlock_irqrestore(&engine->active.lock, flags);
> +
> +               /* Recheck after serialising with direct-submission */
> +               if (timeout && preempt_timeout(engine))
> +                       preempt_reset(engine);
> +       }
> +}
> +
> +static void __execlists_kick(struct intel_engine_execlists *execlists)
> +{
> +       /* Kick the tasklet for some interrupt coalescing and reset handling */
> +       tasklet_hi_schedule(&execlists->tasklet);
> +}
> +
> +#define execlists_kick(t, member) \
> +       __execlists_kick(container_of(t, struct intel_engine_execlists, member))
> +
> +static void execlists_timeslice(struct timer_list *timer)
> +{
> +       execlists_kick(timer, timer);
> +}
> +
> +static void execlists_preempt(struct timer_list *timer)
> +{
> +       execlists_kick(timer, preempt);
> +}
> +
> +static void queue_request(struct intel_engine_cs *engine,
> +                         struct i915_sched_node *node,
> +                         int prio)
> +{
> +       GEM_BUG_ON(!list_empty(&node->link));
> +       list_add_tail(&node->link, i915_sched_lookup_priolist(engine, prio));
> +}
> +
> +static void __submit_queue_imm(struct intel_engine_cs *engine)
> +{
> +       struct intel_engine_execlists * const execlists = &engine->execlists;
> +
> +       if (reset_in_progress(execlists))
> +               return; /* defer until we restart the engine following reset */
> +
> +       if (execlists->tasklet.func == execlists_submission_tasklet)
> +               __execlists_submission_tasklet(engine);
> +       else
> +               tasklet_hi_schedule(&execlists->tasklet);
> +}
> +
> +static void submit_queue(struct intel_engine_cs *engine,
> +                        const struct i915_request *rq)
> +{
> +       struct intel_engine_execlists *execlists = &engine->execlists;
> +
> +       if (rq_prio(rq) <= execlists->queue_priority_hint)
> +               return;
> +
> +       execlists->queue_priority_hint = rq_prio(rq);
> +       __submit_queue_imm(engine);
> +}
> +
> +static void execlists_submit_request(struct i915_request *request)
> +{
> +       struct intel_engine_cs *engine = request->engine;
> +       unsigned long flags;
> +
> +       /* Will be called from irq-context when using foreign fences. */
> +       spin_lock_irqsave(&engine->active.lock, flags);
> +
> +       queue_request(engine, &request->sched, rq_prio(request));
> +
> +       GEM_BUG_ON(RB_EMPTY_ROOT(&engine->execlists.queue.rb_root));
> +       GEM_BUG_ON(list_empty(&request->sched.link));
> +
> +       submit_queue(engine, request);
> +
> +       spin_unlock_irqrestore(&engine->active.lock, flags);
> +}
> +
> +static void execlists_context_destroy(struct kref *kref)
> +{
> +       struct intel_context *ce = container_of(kref, typeof(*ce), ref);
> +
> +       GEM_BUG_ON(!i915_active_is_idle(&ce->active));
> +       GEM_BUG_ON(intel_context_is_pinned(ce));
> +
> +       if (ce->state)
> +               intel_lr_context_fini(ce);
> +
> +       intel_context_fini(ce);
> +       intel_context_free(ce);
> +}
> +
> +static int execlists_context_pin(struct intel_context *ce)
> +{
> +       return intel_lr_context_pin(ce, ce->engine);
> +}
> +
> +static int execlists_context_alloc(struct intel_context *ce)
> +{
> +       return intel_lr_context_alloc(ce, ce->engine);
> +}
> +
> +static void execlists_context_reset(struct intel_context *ce)
> +{
> +       /*
> +        * Because we emit WA_TAIL_DWORDS there may be a disparity
> +        * between our bookkeeping in ce->ring->head and ce->ring->tail and
> +        * that stored in context. As we only write new commands from
> +        * ce->ring->tail onwards, everything before that is junk. If the GPU
> +        * starts reading from the RING_HEAD stored in the context, it may
> +        * try to execute that junk and die.
> +        *
> +        * The contexts that are still pinned on resume belong to the
> +        * kernel, and are local to each engine. All other contexts will
> +        * have their head/tail sanitized upon pinning before use, so they
> +        * will never see garbage.
> +        *
> +        * So to avoid that we reset the context images upon resume. For
> +        * simplicity, we just zero everything out.
> +        */
> +       intel_ring_reset(ce->ring, 0);
> +       intel_lr_context_update_reg_state(ce, ce->engine);
> +}
> +
> +static const struct intel_context_ops execlists_context_ops = {
> +       .alloc = execlists_context_alloc,
> +
> +       .pin = execlists_context_pin,
> +       .unpin = intel_lr_context_unpin,
> +
> +       .enter = intel_context_enter_engine,
> +       .exit = intel_context_exit_engine,
> +
> +       .reset = execlists_context_reset,
> +       .destroy = execlists_context_destroy,
> +};
> +
> +static int execlists_request_alloc(struct i915_request *request)
> +{
> +       int ret;
> +
> +       GEM_BUG_ON(!intel_context_is_pinned(request->hw_context));
> +
> +       /*
> +        * Flush enough space to reduce the likelihood of waiting after
> +        * we start building the request - in which case we will just
> +        * have to repeat work.
> +        */
> +       request->reserved_space += EXECLISTS_REQUEST_SIZE;
> +
> +       /*
> +        * Note that after this point, we have committed to using
> +        * this request as it is being used to both track the
> +        * state of engine initialisation and liveness of the
> +        * golden renderstate above. Think twice before you try
> +        * to cancel/unwind this request now.
> +        */
> +
> +       /* Unconditionally invalidate GPU caches and TLBs. */
> +       ret = request->engine->emit_flush(request, EMIT_INVALIDATE);
> +       if (ret)
> +               return ret;
> +
> +       request->reserved_space -= EXECLISTS_REQUEST_SIZE;
> +       return 0;
> +}
> +
> +static void execlists_reset_prepare(struct intel_engine_cs *engine)
> +{
> +       struct intel_engine_execlists * const execlists = &engine->execlists;
> +       unsigned long flags;
> +
> +       GEM_TRACE("%s: depth<-%d\n", engine->name,
> +                 atomic_read(&execlists->tasklet.count));
> +
> +       /*
> +        * Prevent request submission to the hardware until we have
> +        * completed the reset in i915_gem_reset_finish(). If a request
> +        * is completed by one engine, it may then queue a request
> +        * to a second via its execlists->tasklet *just* as we are
> +        * calling engine->resume() and also writing the ELSP.
> +        * Turning off the execlists->tasklet until the reset is over
> +        * prevents the race.
> +        */
> +       __tasklet_disable_sync_once(&execlists->tasklet);
> +       GEM_BUG_ON(!reset_in_progress(execlists));
> +
> +       /* And flush any current direct submission. */
> +       spin_lock_irqsave(&engine->active.lock, flags);
> +       spin_unlock_irqrestore(&engine->active.lock, flags);
> +
> +       /*
> +        * We stop engines, otherwise we might get failed reset and a
> +        * dead gpu (on elk). Even a gpu as modern as kbl can suffer
> +        * from a system hang if a batchbuffer is progressing when
> +        * the reset is issued, regardless of the READY_TO_RESET ack.
> +        * Thus assume it is best to stop engines on all gens
> +        * where we have a gpu reset.
> +        *
> +        * WaKBLVECSSemaphoreWaitPoll:kbl (on ALL_ENGINES)
> +        *
> +        * FIXME: Wa for more modern gens needs to be validated
> +        */
> +       intel_engine_stop_cs(engine);
> +}
> +
> +static void reset_csb_pointers(struct intel_engine_cs *engine)
> +{
> +       struct intel_engine_execlists * const execlists = &engine->execlists;
> +       const unsigned int reset_value = execlists->csb_size - 1;
> +
> +       ring_set_paused(engine, 0);
> +
> +       /*
> +        * After a reset, the HW starts writing into CSB entry [0]. We
> +        * therefore have to set our HEAD pointer back one entry so that
> +        * the *first* entry we check is entry 0. To complicate this further,
> +        * as we don't wait for the first interrupt after reset, we have to
> +        * fake the HW write to point back to the last entry so that our
> +        * inline comparison of our cached head position against the last HW
> +        * write works even before the first interrupt.
> +        */
> +       execlists->csb_head = reset_value;
> +       WRITE_ONCE(*execlists->csb_write, reset_value);
> +       wmb(); /* Make sure this is visible to HW (paranoia?) */
> +
> +       /*
> +        * Sometimes Icelake forgets to reset its pointers on a GPU reset.
> +        * Bludgeon them with a mmio update to be sure.
> +        */
> +       ENGINE_WRITE(engine, RING_CONTEXT_STATUS_PTR,
> +                    reset_value << 8 | reset_value);
> +       ENGINE_POSTING_READ(engine, RING_CONTEXT_STATUS_PTR);
> +
> +       invalidate_csb_entries(&execlists->csb_status[0],
> +                              &execlists->csb_status[reset_value]);
> +}
> +
> +static void __execlists_reset(struct intel_engine_cs *engine, bool stalled)
> +{
> +       struct intel_engine_execlists * const execlists = &engine->execlists;
> +       struct intel_context *ce;
> +       struct i915_request *rq;
> +
> +       mb(); /* paranoia: read the CSB pointers from after the reset */
> +       clflush(execlists->csb_write);
> +       mb();
> +
> +       process_csb(engine); /* drain preemption events */
> +
> +       /* Following the reset, we need to reload the CSB read/write pointers */
> +       reset_csb_pointers(engine);
> +
> +       /*
> +        * Save the currently executing context; even if we completed
> +        * its request, it was still running at the time of the
> +        * reset and will have been clobbered.
> +        */
> +       rq = execlists_active(execlists);
> +       if (!rq)
> +               goto unwind;
> +
> +       /* We still have requests in-flight; the engine should be active */
> +       GEM_BUG_ON(!intel_engine_pm_is_awake(engine));
> +
> +       ce = rq->hw_context;
> +       GEM_BUG_ON(!i915_vma_is_pinned(ce->state));
> +
> +       if (i915_request_completed(rq)) {
> +               /* Idle context; tidy up the ring so we can restart afresh */
> +               ce->ring->head = intel_ring_wrap(ce->ring, rq->tail);
> +               goto out_replay;
> +       }
> +
> +       /* Context has requests still in-flight; it should not be idle! */
> +       GEM_BUG_ON(i915_active_is_idle(&ce->active));
> +       rq = active_request(ce->timeline, rq);
> +       ce->ring->head = intel_ring_wrap(ce->ring, rq->head);
> +       GEM_BUG_ON(ce->ring->head == ce->ring->tail);
> +
> +       /*
> +        * If this request hasn't started yet, e.g. it is waiting on a
> +        * semaphore, we need to avoid skipping the request or else we
> +        * break the signaling chain. However, if the context is corrupt
> +        * the request will not restart and we will be stuck with a wedged
> +        * device. It is quite often the case that if we issue a reset
> +        * while the GPU is loading the context image, the context
> +        * image becomes corrupt.
> +        *
> +        * Otherwise, if we have not started yet, the request should replay
> +        * perfectly and we do not need to flag the result as being erroneous.
> +        */
> +       if (!i915_request_started(rq))
> +               goto out_replay;
> +
> +       /*
> +        * If the request was innocent, we leave the request in the ELSP
> +        * and will try to replay it on restarting. The context image may
> +        * have been corrupted by the reset, in which case we may have
> +        * to service a new GPU hang, but more likely we can continue on
> +        * without impact.
> +        *
> +        * If the request was guilty, we presume the context is corrupt
> +        * and have to at least restore the RING register in the context
> +        * image back to the expected values to skip over the guilty request.
> +        */
> +       __i915_request_reset(rq, stalled);
> +       if (!stalled)
> +               goto out_replay;
> +
> +       /*
> +        * We want a simple context + ring to execute the breadcrumb update.
> +        * We cannot rely on the context being intact across the GPU hang,
> +        * so clear it and rebuild just what we need for the breadcrumb.
> +        * All pending requests for this context will be zapped, and any
> +        * future request will be after userspace has had the opportunity
> +        * to recreate its own state.
> +        */
> +       GEM_BUG_ON(!intel_context_is_pinned(ce));
> +       intel_lr_context_restore_default_state(ce, engine);
> +
> +out_replay:
> +       GEM_TRACE("%s replay {head:%04x, tail:%04x}\n",
> +                 engine->name, ce->ring->head, ce->ring->tail);
> +       intel_ring_update_space(ce->ring);
> +       intel_lr_context_reset_reg_state(ce, engine);
> +       intel_lr_context_update_reg_state(ce, engine);
> +       ce->lrc_desc |= CTX_DESC_FORCE_RESTORE; /* paranoid: GPU was reset! */
> +
> +unwind:
> +       /* Push back any incomplete requests for replay after the reset. */
> +       cancel_port_requests(execlists);
> +       __unwind_incomplete_requests(engine);
> +}
> +
> +static void execlists_reset(struct intel_engine_cs *engine, bool stalled)
> +{
> +       unsigned long flags;
> +
> +       GEM_TRACE("%s\n", engine->name);
> +
> +       spin_lock_irqsave(&engine->active.lock, flags);
> +
> +       __execlists_reset(engine, stalled);
> +
> +       spin_unlock_irqrestore(&engine->active.lock, flags);
> +}
> +
> +static void nop_submission_tasklet(unsigned long data)
> +{
> +       /* The driver is wedged; don't process any more events. */
> +}
> +
> +static void execlists_cancel_requests(struct intel_engine_cs *engine)
> +{
> +       struct intel_engine_execlists * const execlists = &engine->execlists;
> +       struct i915_request *rq, *rn;
> +       struct rb_node *rb;
> +       unsigned long flags;
> +
> +       GEM_TRACE("%s\n", engine->name);
> +
> +       /*
> +        * Before we call engine->cancel_requests(), we should have exclusive
> +        * access to the submission state. This is arranged for us by the
> +        * caller disabling the interrupt generation, the tasklet and other
> +        * threads that may then access the same state, giving us a free hand
> +        * to reset state. However, we still need to let lockdep be aware that
> +        * we know this state may be accessed in hardirq context, so we
> +        * disable the irq around this manipulation and we want to keep
> +        * the spinlock focused on its duties and not accidentally conflate
> +        * coverage to the submission's irq state. (Similarly, although we
> +        * shouldn't need to disable irq around the manipulation of the
> +        * submission's irq state, we also wish to remind ourselves that
> +        * it is irq state.)
> +        */
> +       spin_lock_irqsave(&engine->active.lock, flags);
> +
> +       __execlists_reset(engine, true);
> +
> +       /* Mark all executing requests as skipped. */
> +       list_for_each_entry(rq, &engine->active.requests, sched.link)
> +               mark_eio(rq);
> +
> +       /* Flush the queued requests to the timeline list (for retiring). */
> +       while ((rb = rb_first_cached(&execlists->queue))) {
> +               struct i915_priolist *p = to_priolist(rb);
> +               int i;
> +
> +               priolist_for_each_request_consume(rq, rn, p, i) {
> +                       mark_eio(rq);
> +                       __i915_request_submit(rq);
> +               }
> +
> +               rb_erase_cached(&p->node, &execlists->queue);
> +               i915_priolist_free(p);
> +       }
> +
> +       /* Cancel all attached virtual engines */
> +       while ((rb = rb_first_cached(&execlists->virtual))) {
> +               struct intel_virtual_engine *ve =
> +                       rb_entry(rb, typeof(*ve), nodes[engine->id].rb);
> +
> +               rb_erase_cached(rb, &execlists->virtual);
> +               RB_CLEAR_NODE(rb);
> +
> +               spin_lock(&ve->base.active.lock);
> +               rq = fetch_and_zero(&ve->request);
> +               if (rq) {
> +                       mark_eio(rq);
> +
> +                       rq->engine = engine;
> +                       __i915_request_submit(rq);
> +                       i915_request_put(rq);
> +
> +                       ve->base.execlists.queue_priority_hint = INT_MIN;
> +               }
> +               spin_unlock(&ve->base.active.lock);
> +       }
> +
> +       /* Remaining _unready_ requests will be nop'ed when submitted */
> +
> +       execlists->queue_priority_hint = INT_MIN;
> +       execlists->queue = RB_ROOT_CACHED;
> +
> +       GEM_BUG_ON(__tasklet_is_enabled(&execlists->tasklet));
> +       execlists->tasklet.func = nop_submission_tasklet;
> +
> +       spin_unlock_irqrestore(&engine->active.lock, flags);
> +}
> +
> +static void execlists_reset_finish(struct intel_engine_cs *engine)
> +{
> +       struct intel_engine_execlists * const execlists = &engine->execlists;
> +
> +       /*
> +        * After a GPU reset, we may have requests to replay. Do so now while
> +        * we still have the forcewake to be sure that the GPU is not allowed
> +        * to sleep before we restart and reload a context.
> +        */
> +       GEM_BUG_ON(!reset_in_progress(execlists));
> +       if (!RB_EMPTY_ROOT(&execlists->queue.rb_root))
> +               execlists->tasklet.func(execlists->tasklet.data);
> +
> +       if (__tasklet_enable(&execlists->tasklet))
> +               /* And kick in case we missed a new request submission. */
> +               tasklet_hi_schedule(&execlists->tasklet);
> +       GEM_TRACE("%s: depth->%d\n", engine->name,
> +                 atomic_read(&execlists->tasklet.count));
> +}
> +
> +static void execlists_park(struct intel_engine_cs *engine)
> +{
> +       cancel_timer(&engine->execlists.timer);
> +       cancel_timer(&engine->execlists.preempt);
> +}
> +
> +static void execlists_destroy(struct intel_engine_cs *engine)
> +{
> +       /* Synchronise with residual timers and any softirq they raise */
> +       del_timer_sync(&engine->execlists.timer);
> +       del_timer_sync(&engine->execlists.preempt);
> +       tasklet_kill(&engine->execlists.tasklet);
> +
> +       intel_logical_ring_destroy(engine);
> +}
> +
> +void intel_execlists_set_default_submission(struct intel_engine_cs *engine)
> +{
> +       engine->request_alloc = execlists_request_alloc;
> +       engine->submit_request = execlists_submit_request;
> +       engine->cancel_requests = execlists_cancel_requests;
> +       engine->schedule = i915_schedule;
> +       engine->execlists.tasklet.func = execlists_submission_tasklet;
> +
> +       engine->reset.prepare = execlists_reset_prepare;
> +       engine->reset.reset = execlists_reset;
> +       engine->reset.finish = execlists_reset_finish;
> +
> +       engine->destroy = execlists_destroy;
> +       engine->park = execlists_park;
> +       engine->unpark = NULL;
> +
> +       engine->flags |= I915_ENGINE_SUPPORTS_STATS;
> +       if (!intel_vgpu_active(engine->i915)) {
> +               engine->flags |= I915_ENGINE_HAS_SEMAPHORES;
> +               if (HAS_LOGICAL_RING_PREEMPTION(engine->i915))
> +                       engine->flags |= I915_ENGINE_HAS_PREEMPTION;
> +       }
> +
> +       if (INTEL_GEN(engine->i915) >= 12)
> +               engine->flags |= I915_ENGINE_HAS_RELATIVE_MMIO;
> +}
> +
> +int intel_execlists_submission_setup(struct intel_engine_cs *engine)
> +{
> +       tasklet_init(&engine->execlists.tasklet,
> +                    execlists_submission_tasklet, (unsigned long)engine);
> +       timer_setup(&engine->execlists.timer, execlists_timeslice, 0);
> +       timer_setup(&engine->execlists.preempt, execlists_preempt, 0);
> +
> +       intel_logical_ring_setup(engine);
> +
> +       engine->set_default_submission = intel_execlists_set_default_submission;
> +       engine->cops = &execlists_context_ops;
> +
> +       return 0;
> +}
> +
> +int intel_execlists_submission_init(struct intel_engine_cs *engine)
> +{
> +       struct intel_engine_execlists * const execlists = &engine->execlists;
> +       struct drm_i915_private *i915 = engine->i915;
> +       struct intel_uncore *uncore = engine->uncore;
> +       u32 base = engine->mmio_base;
> +       int ret;
> +
> +       ret = intel_logical_ring_init(engine);
> +       if (ret)
> +               return ret;
> +
> +       if (HAS_LOGICAL_RING_ELSQ(i915)) {
> +               execlists->submit_reg = uncore->regs +
> +                       i915_mmio_reg_offset(RING_EXECLIST_SQ_CONTENTS(base));
> +               execlists->ctrl_reg = uncore->regs +
> +                       i915_mmio_reg_offset(RING_EXECLIST_CONTROL(base));
> +       } else {
> +               execlists->submit_reg = uncore->regs +
> +                       i915_mmio_reg_offset(RING_ELSP(base));
> +       }
> +
> +       execlists->csb_status =
> +               &engine->status_page.addr[I915_HWS_CSB_BUF0_INDEX];
> +
> +       execlists->csb_write =
> +               &engine->status_page.addr[intel_hws_csb_write_index(i915)];
> +
> +       if (INTEL_GEN(i915) < 11)
> +               execlists->csb_size = GEN8_CSB_ENTRIES;
> +       else
> +               execlists->csb_size = GEN11_CSB_ENTRIES;
> +
> +       reset_csb_pointers(engine);
> +
> +       return 0;
> +}
> +
> +static intel_engine_mask_t
> +virtual_submission_mask(struct intel_virtual_engine *ve)
> +{
> +       struct i915_request *rq;
> +       intel_engine_mask_t mask;
> +
> +       rq = READ_ONCE(ve->request);
> +       if (!rq)
> +               return 0;
> +
> +       /* The rq is ready for submission; rq->execution_mask is now stable. */
> +       mask = rq->execution_mask;
> +       if (unlikely(!mask)) {
> +               /* Invalid selection, submit to a random engine in error */
> +               i915_request_skip(rq, -ENODEV);
> +               mask = ve->siblings[0]->mask;
> +       }
> +
> +       GEM_TRACE("%s: rq=%llx:%lld, mask=%x, prio=%d\n",
> +                 ve->base.name,
> +                 rq->fence.context, rq->fence.seqno,
> +                 mask, ve->base.execlists.queue_priority_hint);
> +
> +       return mask;
> +}
> +
> +static void virtual_submission_tasklet(unsigned long data)
> +{
> +       struct intel_virtual_engine * const ve =
> +               (struct intel_virtual_engine *)data;
> +       const int prio = ve->base.execlists.queue_priority_hint;
> +       intel_engine_mask_t mask;
> +       unsigned int n;
> +
> +       rcu_read_lock();
> +       mask = virtual_submission_mask(ve);
> +       rcu_read_unlock();
> +       if (unlikely(!mask))
> +               return;
> +
> +       local_irq_disable();
> +       for (n = 0; READ_ONCE(ve->request) && n < ve->num_siblings; n++) {
> +               struct intel_engine_cs *sibling = ve->siblings[n];
> +               struct ve_node * const node = &ve->nodes[sibling->id];
> +               struct rb_node **parent, *rb;
> +               bool first;
> +
> +               if (unlikely(!(mask & sibling->mask))) {
> +                       if (!RB_EMPTY_NODE(&node->rb)) {
> +                               spin_lock(&sibling->active.lock);
> +                               rb_erase_cached(&node->rb,
> +                                               &sibling->execlists.virtual);
> +                               RB_CLEAR_NODE(&node->rb);
> +                               spin_unlock(&sibling->active.lock);
> +                       }
> +                       continue;
> +               }
> +
> +               spin_lock(&sibling->active.lock);
> +
> +               if (!RB_EMPTY_NODE(&node->rb)) {
> +                       /*
> +                        * Cheat and avoid rebalancing the tree if we can
> +                        * reuse this node in situ.
> +                        */
> +                       first = rb_first_cached(&sibling->execlists.virtual) ==
> +                               &node->rb;
> +                       if (prio == node->prio || (prio > node->prio && first))
> +                               goto submit_engine;
> +
> +                       rb_erase_cached(&node->rb, &sibling->execlists.virtual);
> +               }
> +
> +               rb = NULL;
> +               first = true;
> +               parent = &sibling->execlists.virtual.rb_root.rb_node;
> +               while (*parent) {
> +                       struct ve_node *other;
> +
> +                       rb = *parent;
> +                       other = rb_entry(rb, typeof(*other), rb);
> +                       if (prio > other->prio) {
> +                               parent = &rb->rb_left;
> +                       } else {
> +                               parent = &rb->rb_right;
> +                               first = false;
> +                       }
> +               }
> +
> +               rb_link_node(&node->rb, rb, parent);
> +               rb_insert_color_cached(&node->rb,
> +                                      &sibling->execlists.virtual,
> +                                      first);
> +
> +submit_engine:
> +               GEM_BUG_ON(RB_EMPTY_NODE(&node->rb));
> +               node->prio = prio;
> +               if (first && prio > sibling->execlists.queue_priority_hint) {
> +                       sibling->execlists.queue_priority_hint = prio;
> +                       tasklet_hi_schedule(&sibling->execlists.tasklet);
> +               }
> +
> +               spin_unlock(&sibling->active.lock);
> +       }
> +       local_irq_enable();
> +}
> +
> +static void virtual_submit_request(struct i915_request *rq)
> +{
> +       struct intel_virtual_engine *ve = to_virtual_engine(rq->engine);
> +       struct i915_request *old;
> +       unsigned long flags;
> +
> +       GEM_TRACE("%s: rq=%llx:%lld\n",
> +                 ve->base.name,
> +                 rq->fence.context,
> +                 rq->fence.seqno);
> +
> +       GEM_BUG_ON(ve->base.submit_request != virtual_submit_request);
> +
> +       spin_lock_irqsave(&ve->base.active.lock, flags);
> +
> +       old = ve->request;
> +       if (old) { /* background completion event from preempt-to-busy */
> +               GEM_BUG_ON(!i915_request_completed(old));
> +               __i915_request_submit(old);
> +               i915_request_put(old);
> +       }
> +
> +       if (i915_request_completed(rq)) {
> +               __i915_request_submit(rq);
> +
> +               ve->base.execlists.queue_priority_hint = INT_MIN;
> +               ve->request = NULL;
> +       } else {
> +               ve->base.execlists.queue_priority_hint = rq_prio(rq);
> +               ve->request = i915_request_get(rq);
> +
> +               GEM_BUG_ON(!list_empty(intel_virtual_engine_queue(ve)));
> +               list_move_tail(&rq->sched.link, intel_virtual_engine_queue(ve));
> +
> +               tasklet_schedule(&ve->base.execlists.tasklet);
> +       }
> +
> +       spin_unlock_irqrestore(&ve->base.active.lock, flags);
> +}
> +
> +static void
> +virtual_bond_execute(struct i915_request *rq, struct dma_fence *signal)
> +{
> +       struct intel_virtual_engine *ve = to_virtual_engine(rq->engine);
> +       intel_engine_mask_t allowed, exec;
> +       struct ve_bond *bond;
> +
> +       allowed = ~to_request(signal)->engine->mask;
> +
> +       bond = intel_virtual_engine_find_bond(ve, to_request(signal)->engine);
> +       if (bond)
> +               allowed &= bond->sibling_mask;
> +
> +       /* Restrict the bonded request to run on only the available engines */
> +       exec = READ_ONCE(rq->execution_mask);
> +       while (!try_cmpxchg(&rq->execution_mask, &exec, exec & allowed))
> +               ;
> +
> +       /* Prevent the master from being re-run on the bonded engines */
> +       to_request(signal)->execution_mask &= ~allowed;
> +}
> +
> +void intel_execlists_virtual_submission_init(struct intel_virtual_engine *ve)
> +{
> +       ve->base.request_alloc = execlists_request_alloc;
> +       ve->base.submit_request = virtual_submit_request;
> +       ve->base.bond_execute = virtual_bond_execute;
> +       tasklet_init(&ve->base.execlists.tasklet,
> +                    virtual_submission_tasklet,
> +                    (unsigned long)ve);
> +}
> +
> +void intel_execlists_show_requests(struct intel_engine_cs *engine,
> +                                  struct drm_printer *m,
> +                                  void (*show_request)(struct drm_printer *m,
> +                                                       struct i915_request *rq,
> +                                                       const char *prefix),
> +                                  unsigned int max)
> +{
> +       const struct intel_engine_execlists *execlists = &engine->execlists;
> +       struct i915_request *rq, *last;
> +       unsigned long flags;
> +       unsigned int count;
> +       struct rb_node *rb;
> +
> +       spin_lock_irqsave(&engine->active.lock, flags);
> +
> +       last = NULL;
> +       count = 0;
> +       list_for_each_entry(rq, &engine->active.requests, sched.link) {
> +               if (count++ < max - 1)
> +                       show_request(m, rq, "\t\tE ");
> +               else
> +                       last = rq;
> +       }
> +       if (last) {
> +               if (count > max) {
> +                       drm_printf(m,
> +                                  "\t\t...skipping %d executing requests...\n",
> +                                  count - max);
> +               }
> +               show_request(m, last, "\t\tE ");
> +       }
> +
> +       last = NULL;
> +       count = 0;
> +       if (execlists->queue_priority_hint != INT_MIN)
> +               drm_printf(m, "\t\tQueue priority hint: %d\n",
> +                          execlists->queue_priority_hint);
> +       for (rb = rb_first_cached(&execlists->queue); rb; rb = rb_next(rb)) {
> +               struct i915_priolist *p = rb_entry(rb, typeof(*p), node);
> +               int i;
> +
> +               priolist_for_each_request(rq, p, i) {
> +                       if (count++ < max - 1)
> +                               show_request(m, rq, "\t\tQ ");
> +                       else
> +                               last = rq;
> +               }
> +       }
> +       if (last) {
> +               if (count > max) {
> +                       drm_printf(m,
> +                                  "\t\t...skipping %d queued requests...\n",
> +                                  count - max);
> +               }
> +               show_request(m, last, "\t\tQ ");
> +       }
> +
> +       last = NULL;
> +       count = 0;
> +       for (rb = rb_first_cached(&execlists->virtual); rb; rb = rb_next(rb)) {
> +               struct intel_virtual_engine *ve =
> +                       rb_entry(rb, typeof(*ve), nodes[engine->id].rb);
> +               struct i915_request *rq = READ_ONCE(ve->request);
> +
> +               if (rq) {
> +                       if (count++ < max - 1)
> +                               show_request(m, rq, "\t\tV ");
> +                       else
> +                               last = rq;
> +               }
> +       }
> +       if (last) {
> +               if (count > max) {
> +                       drm_printf(m,
> +                                  "\t\t...skipping %d virtual requests...\n",
> +                                  count - max);
> +               }
> +               show_request(m, last, "\t\tV ");
> +       }
> +
> +       spin_unlock_irqrestore(&engine->active.lock, flags);
> +}
> +
> +bool
> +intel_engine_in_execlists_submission_mode(const struct intel_engine_cs *engine)
> +{
> +       return engine->set_default_submission ==
> +              intel_execlists_set_default_submission;
> +}

The breadcrumb submission code is specialised to execlists and should
not be shared (leaving emit_flush and emit_bb_start as common code in
gen8_submission.c). The reset code is specialised to execlists and should
not be shared. The virtual engine is specialised to execlists and should
not be shared. Even submit_request should be distinct between guc and
execlists, especially request_alloc (which you may like to put on the
context_ops rather than the engine).
-Chris
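
A rough sketch of that last suggestion -- hanging request_alloc off the
context ops instead of the engine -- assuming a hypothetical
.request_alloc member in intel_context_ops (no such hook exists today);
the other members are the ones already used by execlists_context_ops in
this series:

static const struct intel_context_ops execlists_context_ops = {
	.alloc = execlists_context_alloc,

	/* hypothetical: back-end specific request setup, replacing
	 * engine->request_alloc */
	.request_alloc = execlists_request_alloc,

	.pin = execlists_context_pin,
	.unpin = execlists_context_unpin,

	.enter = intel_context_enter_engine,
	.exit = intel_context_exit_engine,

	.destroy = execlists_context_destroy,
};

The GuC back-end could then install its own ops table with a different
request_alloc without touching the engine vfuncs.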

* Re: [Intel-gfx] [RFC 1/5] drm/i915: introduce logical_ring and lr_context naming
  2019-12-11 21:20   ` Chris Wilson
@ 2019-12-11 21:33     ` Chris Wilson
  2019-12-11 22:04     ` Daniele Ceraolo Spurio
  1 sibling, 0 replies; 21+ messages in thread
From: Chris Wilson @ 2019-12-11 21:33 UTC (permalink / raw)
  To: Daniele Ceraolo Spurio, intel-gfx

Quoting Chris Wilson (2019-12-11 21:20:52)
> Quoting Daniele Ceraolo Spurio (2019-12-11 21:12:40)
> > +static void lr_context_init_reg_state(u32 *reg_state,
> > +                                     const struct intel_context *ce,
> > +                                     const struct intel_engine_cs *engine,
> > +                                     const struct intel_ring *ring,
> > +                                     bool close);
> 
> lrc. lrc should just be the register offsets and default context image.

Fwiw, I also put the w/a batch buffers into intel_lrc.c as they seemed
HW/lrc specific as opposed to being specialised for submission --
although we may want to do that at some point.
-Chris

* Re: [Intel-gfx] [RFC 3/5] drm/i915: split out virtual engine code
  2019-12-11 21:22   ` Chris Wilson
@ 2019-12-11 21:34     ` Daniele Ceraolo Spurio
  2019-12-11 23:09       ` Matthew Brost
  0 siblings, 1 reply; 21+ messages in thread
From: Daniele Ceraolo Spurio @ 2019-12-11 21:34 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx



On 12/11/19 1:22 PM, Chris Wilson wrote:
> Quoting Daniele Ceraolo Spurio (2019-12-11 21:12:42)
>> Having the virtual engine handling in its own file will make it easier
>> to call it from, or modify it for, the GuC implementation without leaking
>> the changes into the context management or execlists submission paths.
> 
> No. The virtual engine is tightly coupled into the execlists, it is not
> the starting point for a general veng.
> -Chris
> 

What's the issue from your POV? We've been using it with few changes
for GuC submission and IMO it flows relatively well, mainly just using a
different tasklet and slightly different cops (we need to call into the
GuC for pin/unpin).

Daniele
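
Purely as an illustration of the kind of reuse being described here (the
guc_* names below are hypothetical placeholders, not code from this
series; the lr_context helpers are the ones introduced in patch 1):

static int guc_context_pin(struct intel_context *ce)
{
	int err;

	/* shared lrc setup: context image, reg state, lrc descriptor */
	err = lr_context_pin(ce, ce->engine);
	if (err)
		return err;

	/* hypothetical: register the pinned context with the GuC */
	return guc_register_context(ce);
}

static void guc_context_unpin(struct intel_context *ce)
{
	/* hypothetical: tell the GuC to drop the context first */
	guc_unregister_context(ce);

	/* shared lrc teardown */
	intel_lr_context_unpin(ce);
}

static const struct intel_context_ops guc_context_ops = {
	.alloc = execlists_context_alloc,
	.pin = guc_context_pin,
	.unpin = guc_context_unpin,
	.enter = intel_context_enter_engine,
	.exit = intel_context_exit_engine,
};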

* Re: [Intel-gfx] [RFC 1/5] drm/i915: introduce logical_ring and lr_context naming
  2019-12-11 21:20   ` Chris Wilson
  2019-12-11 21:33     ` Chris Wilson
@ 2019-12-11 22:04     ` Daniele Ceraolo Spurio
  2019-12-11 23:35       ` Matthew Brost
  1 sibling, 1 reply; 21+ messages in thread
From: Daniele Ceraolo Spurio @ 2019-12-11 22:04 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx



On 12/11/19 1:20 PM, Chris Wilson wrote:
> Quoting Daniele Ceraolo Spurio (2019-12-11 21:12:40)
>> Ahead of splitting out the code specific to execlists submission to its
>> own file, we can re-organize the code within the intel_lrc file to make
>> that separation clearer. To achieve this, a number of functions have
>> been split/renamed using the "logical_ring" and "lr_context" naming,
>> respectively for engine-related setup and lrc management.
>>
>> Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
>> Cc: Chris Wilson <chris@chris-wilson.co.uk>
>> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
>> Cc: Matthew Brost <matthew.brost@intel.com>
>> ---
>>   drivers/gpu/drm/i915/gt/intel_lrc.c    | 154 ++++++++++++++-----------
>>   drivers/gpu/drm/i915/gt/selftest_lrc.c |  12 +-
>>   2 files changed, 93 insertions(+), 73 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
>> index 929f6bae4eba..6d6148e11fd0 100644
>> --- a/drivers/gpu/drm/i915/gt/intel_lrc.c
>> +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
>> @@ -228,17 +228,17 @@ static struct virtual_engine *to_virtual_engine(struct intel_engine_cs *engine)
>>          return container_of(engine, struct virtual_engine, base);
>>   }
>>   
>> -static int __execlists_context_alloc(struct intel_context *ce,
>> -                                    struct intel_engine_cs *engine);
>> -
>> -static void execlists_init_reg_state(u32 *reg_state,
>> -                                    const struct intel_context *ce,
>> -                                    const struct intel_engine_cs *engine,
>> -                                    const struct intel_ring *ring,
>> -                                    bool close);
>> +static int lr_context_alloc(struct intel_context *ce,
>> +                           struct intel_engine_cs *engine);
> 
> Execlists.
> 
>> +
>> +static void lr_context_init_reg_state(u32 *reg_state,
>> +                                     const struct intel_context *ce,
>> +                                     const struct intel_engine_cs *engine,
>> +                                     const struct intel_ring *ring,
>> +                                     bool close);
> 
> lrc. lrc should just be the register offsets and default context image.
> 

I've used "lrc" for anything related to managing the context
object (creation, population, pin, etc.) and "execlists" for anything
related to executing the context on the HW, with the aim of having the
GuC code call only into lrc functions and not into execlists ones.
If you prefer keeping the execlists naming for anything not related to
the context or the context image, should we go for execlists_submission_*
for anything that's specific to execlists submission, and avoid calling
those from the GuC side?

Daniele
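
To make the intended split a bit more concrete, an illustrative grouping
using prototypes that already appear in this series (nothing new here,
just the two buckets side by side):

/* lrc: context object/image management, meant to be shared with the GuC */
static int lr_context_alloc(struct intel_context *ce,
			    struct intel_engine_cs *engine);
static int lr_context_pin(struct intel_context *ce,
			  struct intel_engine_cs *engine);
static void lr_context_update_reg_state(const struct intel_context *ce,
					const struct intel_engine_cs *engine);

/* execlists: executing contexts on the HW, execlists back-end only */
static void execlists_submit_request(struct i915_request *request);
static void execlists_submission_tasklet(unsigned long data);
static void execlists_reset(struct intel_engine_cs *engine, bool stalled);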

>>   static void
>> -__execlists_update_reg_state(const struct intel_context *ce,
>> -                            const struct intel_engine_cs *engine);
>> +lr_context_update_reg_state(const struct intel_context *ce,
>> +                           const struct intel_engine_cs *engine);
> 
> lrc.
> 
>>   
>>   static void mark_eio(struct i915_request *rq)
>>   {
>> @@ -1035,8 +1035,8 @@ execlists_check_context(const struct intel_context *ce,
>>          WARN_ONCE(!valid, "Invalid lrc state found before submission\n");
>>   }
>>   
>> -static void restore_default_state(struct intel_context *ce,
>> -                                 struct intel_engine_cs *engine)
>> +static void lr_context_restore_default_state(struct intel_context *ce,
>> +                                            struct intel_engine_cs *engine)
>>   {
>>          u32 *regs = ce->lrc_reg_state;
>>   
>> @@ -1045,7 +1045,7 @@ static void restore_default_state(struct intel_context *ce,
>>                         engine->pinned_default_state + LRC_STATE_PN * PAGE_SIZE,
>>                         engine->context_size - PAGE_SIZE);
>>   
>> -       execlists_init_reg_state(regs, ce, engine, ce->ring, false);
>> +       lr_context_init_reg_state(regs, ce, engine, ce->ring, false);
>>   }
>>   
>>   static void reset_active(struct i915_request *rq,
>> @@ -1081,8 +1081,8 @@ static void reset_active(struct i915_request *rq,
>>          intel_ring_update_space(ce->ring);
>>   
>>          /* Scrub the context image to prevent replaying the previous batch */
>> -       restore_default_state(ce, engine);
>> -       __execlists_update_reg_state(ce, engine);
>> +       lr_context_restore_default_state(ce, engine);
>> +       lr_context_update_reg_state(ce, engine);
>>   
>>          /* We've switched away, so this should be a no-op, but intent matters */
>>          ce->lrc_desc |= CTX_DESC_FORCE_RESTORE;
>> @@ -2378,7 +2378,7 @@ static void execlists_submit_request(struct i915_request *request)
>>          spin_unlock_irqrestore(&engine->active.lock, flags);
>>   }
>>   
>> -static void __execlists_context_fini(struct intel_context *ce)
>> +static void lr_context_fini(struct intel_context *ce)
> 
> execlists.
> 
>>   {
>>          intel_ring_put(ce->ring);
>>          i915_vma_put(ce->state);
>> @@ -2392,7 +2392,7 @@ static void execlists_context_destroy(struct kref *kref)
>>          GEM_BUG_ON(intel_context_is_pinned(ce));
>>   
>>          if (ce->state)
>> -               __execlists_context_fini(ce);
>> +               lr_context_fini(ce);
>>   
>>          intel_context_fini(ce);
>>          intel_context_free(ce);
>> @@ -2423,7 +2423,7 @@ check_redzone(const void *vaddr, const struct intel_engine_cs *engine)
>>                               engine->name);
>>   }
>>   
>> -static void execlists_context_unpin(struct intel_context *ce)
>> +static void intel_lr_context_unpin(struct intel_context *ce)
> 
> execlists.
> 
>>   {
>>          check_redzone((void *)ce->lrc_reg_state - LRC_STATE_PN * PAGE_SIZE,
>>                        ce->engine);
>> @@ -2433,8 +2433,8 @@ static void execlists_context_unpin(struct intel_context *ce)
>>   }
>>   
>>   static void
>> -__execlists_update_reg_state(const struct intel_context *ce,
>> -                            const struct intel_engine_cs *engine)
>> +lr_context_update_reg_state(const struct intel_context *ce,
>> +                           const struct intel_engine_cs *engine)
> 
> lrc.
> 
>>   {
>>          struct intel_ring *ring = ce->ring;
>>          u32 *regs = ce->lrc_reg_state;
>> @@ -2456,8 +2456,7 @@ __execlists_update_reg_state(const struct intel_context *ce,
>>   }
>>   
>>   static int
>> -__execlists_context_pin(struct intel_context *ce,
>> -                       struct intel_engine_cs *engine)
>> +lr_context_pin(struct intel_context *ce, struct intel_engine_cs *engine)
> 
> execlists.
> 
>>   {
>>          void *vaddr;
>>          int ret;
>> @@ -2479,7 +2478,7 @@ __execlists_context_pin(struct intel_context *ce,
>>   
>>          ce->lrc_desc = lrc_descriptor(ce, engine);
>>          ce->lrc_reg_state = vaddr + LRC_STATE_PN * PAGE_SIZE;
>> -       __execlists_update_reg_state(ce, engine);
>> +       lr_context_update_reg_state(ce, engine);
>>   
>>          return 0;
>>   
>> @@ -2491,12 +2490,12 @@ __execlists_context_pin(struct intel_context *ce,
>>   
>>   static int execlists_context_pin(struct intel_context *ce)
>>   {
>> -       return __execlists_context_pin(ce, ce->engine);
>> +       return lr_context_pin(ce, ce->engine);
>>   }
>>   
>>   static int execlists_context_alloc(struct intel_context *ce)
>>   {
>> -       return __execlists_context_alloc(ce, ce->engine);
>> +       return lr_context_alloc(ce, ce->engine);
>>   }
>>   
>>   static void execlists_context_reset(struct intel_context *ce)
>> @@ -2518,14 +2517,14 @@ static void execlists_context_reset(struct intel_context *ce)
>>           * simplicity, we just zero everything out.
>>           */
>>          intel_ring_reset(ce->ring, 0);
>> -       __execlists_update_reg_state(ce, ce->engine);
>> +       lr_context_update_reg_state(ce, ce->engine);
>>   }
>>   
>>   static const struct intel_context_ops execlists_context_ops = {
>>          .alloc = execlists_context_alloc,
>>   
>>          .pin = execlists_context_pin,
>> -       .unpin = execlists_context_unpin,
>> +       .unpin = intel_lr_context_unpin,
> 
> execlists.
> 
>>   
>>          .enter = intel_context_enter_engine,
>>          .exit = intel_context_exit_engine,
>> @@ -2912,7 +2911,33 @@ static int intel_init_workaround_bb(struct intel_engine_cs *engine)
>>          return ret;
>>   }
>>   
>> -static void enable_execlists(struct intel_engine_cs *engine)
>> +static int logical_ring_init(struct intel_engine_cs *engine)
>> +{
>> +       int ret;
>> +
>> +       ret = intel_engine_init_common(engine);
>> +       if (ret)
>> +               return ret;
>> +
>> +       if (intel_init_workaround_bb(engine))
>> +               /*
>> +                * We continue even if we fail to initialize WA batch
>> +                * because we only expect rare glitches but nothing
>> +                * critical to prevent us from using GPU
>> +                */
>> +               DRM_ERROR("WA batch buffer initialization failed\n");
>> +
>> +       return 0;
>> +}
>> +
>> +static void logical_ring_destroy(struct intel_engine_cs *engine)
>> +{
>> +       intel_engine_cleanup_common(engine);
>> +       lrc_destroy_wa_ctx(engine);
>> +       kfree(engine);
> 
>> +}
>> +
>> +static void logical_ring_enable(struct intel_engine_cs *engine)
>>   {
>>          u32 mode;
>>   
>> @@ -2946,7 +2971,7 @@ static bool unexpected_starting_state(struct intel_engine_cs *engine)
>>          return unexpected;
>>   }
>>   
>> -static int execlists_resume(struct intel_engine_cs *engine)
>> +static int logical_ring_resume(struct intel_engine_cs *engine)
> 
> execlists.
> 
>>   {
>>          intel_engine_apply_workarounds(engine);
>>          intel_engine_apply_whitelist(engine);
>> @@ -2961,7 +2986,7 @@ static int execlists_resume(struct intel_engine_cs *engine)
>>                  intel_engine_dump(engine, &p, NULL);
>>          }
>>   
>> -       enable_execlists(engine);
>> +       logical_ring_enable(engine);
>>   
>>          return 0;
>>   }
>> @@ -3037,8 +3062,8 @@ static void reset_csb_pointers(struct intel_engine_cs *engine)
>>                                 &execlists->csb_status[reset_value]);
>>   }
>>   
>> -static void __execlists_reset_reg_state(const struct intel_context *ce,
>> -                                       const struct intel_engine_cs *engine)
>> +static void lr_context_reset_reg_state(const struct intel_context *ce,
>> +                                      const struct intel_engine_cs *engine)
> 
> lrc.
> 
>>   {
>>          u32 *regs = ce->lrc_reg_state;
>>          int x;
>> @@ -3131,14 +3156,14 @@ static void __execlists_reset(struct intel_engine_cs *engine, bool stalled)
>>           * to recreate its own state.
>>           */
>>          GEM_BUG_ON(!intel_context_is_pinned(ce));
>> -       restore_default_state(ce, engine);
>> +       lr_context_restore_default_state(ce, engine);
>>   
>>   out_replay:
>>          GEM_TRACE("%s replay {head:%04x, tail:%04x}\n",
>>                    engine->name, ce->ring->head, ce->ring->tail);
>>          intel_ring_update_space(ce->ring);
>> -       __execlists_reset_reg_state(ce, engine);
>> -       __execlists_update_reg_state(ce, engine);
>> +       lr_context_reset_reg_state(ce, engine);
>> +       lr_context_update_reg_state(ce, engine);
>>          ce->lrc_desc |= CTX_DESC_FORCE_RESTORE; /* paranoid: GPU was reset! */
>>   
>>   unwind:
>> @@ -3788,9 +3813,7 @@ static void execlists_destroy(struct intel_engine_cs *engine)
>>   {
>>          execlists_shutdown(engine);
>>   
>> -       intel_engine_cleanup_common(engine);
>> -       lrc_destroy_wa_ctx(engine);
>> -       kfree(engine);
>> +       logical_ring_destroy(engine);
>>   }
>>   
>>   static void
>> @@ -3799,7 +3822,7 @@ logical_ring_default_vfuncs(struct intel_engine_cs *engine)
>>          /* Default vfuncs which can be overriden by each engine. */
>>   
>>          engine->destroy = execlists_destroy;
>> -       engine->resume = execlists_resume;
>> +       engine->resume = logical_ring_resume;
>>   
>>          engine->reset.prepare = execlists_reset_prepare;
>>          engine->reset.reset = execlists_reset;
>> @@ -3872,6 +3895,15 @@ static void rcs_submission_override(struct intel_engine_cs *engine)
>>          }
>>   }
>>   
>> +static void logical_ring_setup(struct intel_engine_cs *engine)
>> +{
>> +       logical_ring_default_vfuncs(engine);
>> +       logical_ring_default_irqs(engine);
>> +
>> +       if (engine->class == RENDER_CLASS)
>> +               rcs_submission_override(engine);
>> +}
>> +
>>   int intel_execlists_submission_setup(struct intel_engine_cs *engine)
>>   {
>>          tasklet_init(&engine->execlists.tasklet,
>> @@ -3879,11 +3911,7 @@ int intel_execlists_submission_setup(struct intel_engine_cs *engine)
>>          timer_setup(&engine->execlists.timer, execlists_timeslice, 0);
>>          timer_setup(&engine->execlists.preempt, execlists_preempt, 0);
>>   
>> -       logical_ring_default_vfuncs(engine);
>> -       logical_ring_default_irqs(engine);
>> -
>> -       if (engine->class == RENDER_CLASS)
>> -               rcs_submission_override(engine);
>> +       logical_ring_setup(engine);
>>   
>>          return 0;
>>   }
>> @@ -3896,18 +3924,10 @@ int intel_execlists_submission_init(struct intel_engine_cs *engine)
>>          u32 base = engine->mmio_base;
>>          int ret;
>>   
>> -       ret = intel_engine_init_common(engine);
>> +       ret = logical_ring_init(engine);
>>          if (ret)
>>                  return ret;
>>   
>> -       if (intel_init_workaround_bb(engine))
>> -               /*
>> -                * We continue even if we fail to initialize WA batch
>> -                * because we only expect rare glitches but nothing
>> -                * critical to prevent us from using GPU
>> -                */
>> -               DRM_ERROR("WA batch buffer initialization failed\n");
>> -
>>          if (HAS_LOGICAL_RING_ELSQ(i915)) {
>>                  execlists->submit_reg = uncore->regs +
>>                          i915_mmio_reg_offset(RING_EXECLIST_SQ_CONTENTS(base));
>> @@ -4033,11 +4053,11 @@ static struct i915_ppgtt *vm_alias(struct i915_address_space *vm)
>>                  return i915_vm_to_ppgtt(vm);
>>   }
>>   
>> -static void execlists_init_reg_state(u32 *regs,
>> -                                    const struct intel_context *ce,
>> -                                    const struct intel_engine_cs *engine,
>> -                                    const struct intel_ring *ring,
>> -                                    bool close)
>> +static void lr_context_init_reg_state(u32 *regs,
>> +                                     const struct intel_context *ce,
>> +                                     const struct intel_engine_cs *engine,
>> +                                     const struct intel_ring *ring,
>> +                                     bool close)
>>   {
>>          /*
>>           * A context is actually a big batch buffer with several
>> @@ -4105,7 +4125,7 @@ populate_lr_context(struct intel_context *ce,
>>          /* The second page of the context object contains some fields which must
>>           * be set up prior to the first execution. */
>>          regs = vaddr + LRC_STATE_PN * PAGE_SIZE;
>> -       execlists_init_reg_state(regs, ce, engine, ring, inhibit);
>> +       lr_context_init_reg_state(regs, ce, engine, ring, inhibit);
>>          if (inhibit)
>>                  regs[CTX_CONTEXT_CONTROL] |=
>>                          _MASKED_BIT_ENABLE(CTX_CTRL_ENGINE_CTX_RESTORE_INHIBIT);

* Re: [Intel-gfx] [RFC 4/5] drm/i915: move execlists selftests to their own file
  2019-12-11 21:26   ` Chris Wilson
@ 2019-12-11 22:07     ` Daniele Ceraolo Spurio
  0 siblings, 0 replies; 21+ messages in thread
From: Daniele Ceraolo Spurio @ 2019-12-11 22:07 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx



On 12/11/19 1:26 PM, Chris Wilson wrote:
> Quoting Daniele Ceraolo Spurio (2019-12-11 21:12:43)
>> Done ahead of splitting the lrc file as well, to keep that patch
>> smaller. Just a straight copy, with the exception of create_scratch()
>> that has been made common to avoid having 3 instances of it.
>>
>> Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
>> Cc: Chris Wilson <chris@chris-wilson.co.uk>
>> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
>> Cc: Matthew Brost <matthew.brost@intel.com>
>> ---
>>   .../drm/i915/gem/selftests/igt_gem_utils.c    |   27 +
>>   .../drm/i915/gem/selftests/igt_gem_utils.h    |    3 +
>>   drivers/gpu/drm/i915/gt/intel_lrc.c           |    1 +
>>   drivers/gpu/drm/i915/gt/selftest_execlists.c  | 3316 ++++++++++++++++
>>   drivers/gpu/drm/i915/gt/selftest_lrc.c        | 3333 +----------------
>>   drivers/gpu/drm/i915/gt/selftest_mocs.c       |   30 +-
>>   6 files changed, 3351 insertions(+), 3359 deletions(-)
>>   create mode 100644 drivers/gpu/drm/i915/gt/selftest_execlists.c
>>
>> diff --git a/drivers/gpu/drm/i915/gem/selftests/igt_gem_utils.c b/drivers/gpu/drm/i915/gem/selftests/igt_gem_utils.c
>> index 6718da20f35d..88109333cb79 100644
>> --- a/drivers/gpu/drm/i915/gem/selftests/igt_gem_utils.c
>> +++ b/drivers/gpu/drm/i915/gem/selftests/igt_gem_utils.c
>> @@ -15,6 +15,33 @@
>>   
>>   #include "i915_request.h"
>>   
>> +struct i915_vma *igt_create_scratch(struct intel_gt *gt)
> 
> _ggtt_scratch(size, coherency, pin) ?
> 
> As it stands, it's not general enough...
> 

ack.
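
A possible shape for the more general helper, following the suggested
signature; the name and the parameter types are guesses, and the body is
just the igt_create_scratch() quoted below with the hard-coded values
turned into parameters:

struct i915_vma *
igt_create_ggtt_scratch(struct intel_gt *gt, resource_size_t size,
			unsigned int coherency, u64 pin_flags)
{
	struct drm_i915_gem_object *obj;
	struct i915_vma *vma;
	int err;

	obj = i915_gem_object_create_internal(gt->i915, size);
	if (IS_ERR(obj))
		return ERR_CAST(obj);

	i915_gem_object_set_cache_coherency(obj, coherency);

	vma = i915_vma_instance(obj, &gt->ggtt->vm, NULL);
	if (IS_ERR(vma)) {
		i915_gem_object_put(obj);
		return vma;
	}

	/* keep the GGTT pin implicit, callers add their own PIN_* bits */
	err = i915_vma_pin(vma, 0, 0, pin_flags | PIN_GLOBAL);
	if (err) {
		i915_gem_object_put(obj);
		return ERR_PTR(err);
	}

	return vma;
}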

>> +{
>> +       struct drm_i915_gem_object *obj;
>> +       struct i915_vma *vma;
>> +       int err;
>> +
>> +       obj = i915_gem_object_create_internal(gt->i915, PAGE_SIZE);
>> +       if (IS_ERR(obj))
>> +               return ERR_CAST(obj);
>> +
>> +       i915_gem_object_set_cache_coherency(obj, I915_CACHING_CACHED);
>> +
>> +       vma = i915_vma_instance(obj, &gt->ggtt->vm, NULL);
>> +       if (IS_ERR(vma)) {
>> +               i915_gem_object_put(obj);
>> +               return vma;
>> +       }
>> +
>> +       err = i915_vma_pin(vma, 0, 0, PIN_GLOBAL);
>> +       if (err) {
>> +               i915_gem_object_put(obj);
>> +               return ERR_PTR(err);
>> +       }
>> +
>> +       return vma;
>> +}
>> +
>>   struct i915_request *
>>   igt_request_alloc(struct i915_gem_context *ctx, struct intel_engine_cs *engine)
>>   {
>> diff --git a/drivers/gpu/drm/i915/gem/selftests/igt_gem_utils.h b/drivers/gpu/drm/i915/gem/selftests/igt_gem_utils.h
>> index 4221cf84d175..aae781f59cfc 100644
>> --- a/drivers/gpu/drm/i915/gem/selftests/igt_gem_utils.h
>> +++ b/drivers/gpu/drm/i915/gem/selftests/igt_gem_utils.h
>> @@ -15,6 +15,9 @@ struct i915_vma;
>>   
>>   struct intel_context;
>>   struct intel_engine_cs;
>> +struct intel_gt;
>> +
>> +struct i915_vma *igt_create_scratch(struct intel_gt *gt);
>>   
>>   struct i915_request *
>>   igt_request_alloc(struct i915_gem_context *ctx, struct intel_engine_cs *engine);
>> diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
>> index 3afae9a44911..fbdd3bdd06f1 100644
>> --- a/drivers/gpu/drm/i915/gt/intel_lrc.c
>> +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
>> @@ -4446,4 +4446,5 @@ intel_engine_in_execlists_submission_mode(const struct intel_engine_cs *engine)
>>   
>>   #if IS_ENABLED(CONFIG_DRM_I915_SELFTEST)
>>   #include "selftest_lrc.c"
>> +#include "selftest_execlists.c"
>>   #endif
>> diff --git a/drivers/gpu/drm/i915/gt/selftest_execlists.c b/drivers/gpu/drm/i915/gt/selftest_execlists.c
>> new file mode 100644
>> index 000000000000..b58a4feb2ec4
>> --- /dev/null
>> +++ b/drivers/gpu/drm/i915/gt/selftest_execlists.c
> 
> Note that many, if not all, of these (there are a few where, the guc
> being a black box, we cannot poke at its internals) should also be used
> for guc submission as a BAT.
> -Chris
> 

True, but several of them might also need minor updates due to the GuC
taking control of some scheduling decisions (e.g. timeslicing,
pre-emption), so IMO it's better to sort them out as we get the GuC flows
in place.

Daniele

* Re: [Intel-gfx] [RFC 5/5] drm/i915: introduce intel_execlists_submission.<c/h>
  2019-12-11 21:31   ` Chris Wilson
@ 2019-12-11 22:35     ` Daniele Ceraolo Spurio
  0 siblings, 0 replies; 21+ messages in thread
From: Daniele Ceraolo Spurio @ 2019-12-11 22:35 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx

<snip>


>> +
>> +struct i915_request *
>> +execlists_unwind_incomplete_requests(struct intel_engine_execlists *execlists)
> 
> There should be no exports from this file... Did you not also make
> guc_submission standalone?
> 

The new GuC submission code will have its own
_unwind_incomplete_requests function; it just didn't seem worth copying
this into the GuC file now to make it static, only to get rid of it later.
The current version of the GuC patches (being worked on by Matt) is also
not yet fully standalone, but we're moving in that direction.


<snip>

>> +bool
>> +intel_engine_in_execlists_submission_mode(const struct intel_engine_cs *engine)
>> +{
>> +       return engine->set_default_submission ==
>> +              intel_execlists_set_default_submission;
>> +}
> 
> The breadcrumb submission code is specialised to execlists and should
> not be shared (leaving emit_flush and emit_bb_start as common code in
> gen8_submission.c). The reset code is specialised to execlists and should
> not be shared. The virtual engine is specialised to execlists and should
> not be shared. Even submit_request should be distinct between guc and
> execlists, especially request_alloc (which you may like to put on the
> context_ops rather than the engine).
> -Chris
> 

engine->reset.*, request_alloc and submit_request have all been moved to 
execlists_submission.c in this patch, with the aim of not sharing them.

For the virtual engine, I've moved the submission-related chunks to
execlists_submission.c as well (see the new
intel_execlists_virtual_submission_init, although I could probably move
a few extra bits in there). As I mentioned in the other reply, the other
parts do seem quite generic to me, but let's keep this chunk of the
discussion on the other thread.

Regarding the breadcrumb code, IMO we do still want to share most of it 
(seqno writing, interrupt, MI_ARB_CHECK, wa_tail), but we most likely 
won't need the preempt_busywait. Given this, it didn't feel right to me 
to move the relevant code out of the file until we get some more mature 
GuC code to make a cleaner call.

Daniele
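
Sketching that split (hypothetical helper names; only MI_USER_INTERRUPT,
MI_ARB_CHECK and the general structure are taken from the discussion, and
the ordering is simplified):

static u32 *emit_fini_breadcrumb_common(struct i915_request *rq, u32 *cs)
{
	/* seqno write into the HWSP -- hypothetical shared helper */
	cs = emit_seqno_write(rq, cs);

	*cs++ = MI_USER_INTERRUPT;
	*cs++ = MI_ARB_CHECK; /* arbitration point: allow preemption here */

	return cs;
}

static u32 *execlists_emit_fini_breadcrumb(struct i915_request *rq, u32 *cs)
{
	cs = emit_fini_breadcrumb_common(rq, cs);

	/* execlists only: preempt-to-busy busywait, not needed with the GuC */
	cs = emit_preempt_busywait(rq, cs); /* hypothetical helper */

	return cs;
}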

* Re: [Intel-gfx] [RFC 3/5] drm/i915: split out virtual engine code
  2019-12-11 21:34     ` Daniele Ceraolo Spurio
@ 2019-12-11 23:09       ` Matthew Brost
  0 siblings, 0 replies; 21+ messages in thread
From: Matthew Brost @ 2019-12-11 23:09 UTC (permalink / raw)
  To: Daniele Ceraolo Spurio; +Cc: intel-gfx

On Wed, Dec 11, 2019 at 01:34:20PM -0800, Daniele Ceraolo Spurio wrote:
>
>
>On 12/11/19 1:22 PM, Chris Wilson wrote:
>>Quoting Daniele Ceraolo Spurio (2019-12-11 21:12:42)
>>>Having the virtual engine handling in its own file will make it easier
>>>to call it from, or modify it for, the GuC implementation without leaking
>>>the changes into the context management or execlists submission paths.
>>
>>No. The virtual engine is tightly coupled into the execlists, it is not
>>the starting point for a general veng.
>>-Chris
>>
>
>What's the issue from your POV? We've been using it with few
>changes for GuC submission and IMO it flows relatively well, mainly
>just using a different tasklet and slightly different cops (we need to
>call into the GuC for pin/unpin).
>
>Daniele

I agree with Daniele's approach here. The new GuC code can reuse
intel_execlists_create_virtual with a couple of GuC-specific branches in the
function. The new GuC code also reuses virtual_engine_enter / virtual_engine_exit
in the virtual GuC context operations. To me it makes more sense to have this
virtual engine code in its own file than to pollute an execlists-specific file
with references to the GuC.

Matt
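
Roughly what that could look like; only virtual_engine_enter/exit and
intel_execlists_create_virtual are names from this series, the
guc_virtual_* hooks are placeholders:

static const struct intel_context_ops guc_virtual_context_ops = {
	.pin = guc_virtual_context_pin,     /* placeholder: pin via the GuC */
	.unpin = guc_virtual_context_unpin, /* placeholder */

	/* shared with the execlists virtual engine */
	.enter = virtual_engine_enter,
	.exit = virtual_engine_exit,
};

intel_execlists_create_virtual() would then pick this table instead of the
execlists one in one of the GuC-specific branches mentioned above.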

* Re: [Intel-gfx] [RFC 1/5] drm/i915: introduce logical_ring and lr_context naming
  2019-12-11 22:04     ` Daniele Ceraolo Spurio
@ 2019-12-11 23:35       ` Matthew Brost
  0 siblings, 0 replies; 21+ messages in thread
From: Matthew Brost @ 2019-12-11 23:35 UTC (permalink / raw)
  To: Daniele Ceraolo Spurio; +Cc: intel-gfx

On Wed, Dec 11, 2019 at 02:04:48PM -0800, Daniele Ceraolo Spurio wrote:
>
>
>On 12/11/19 1:20 PM, Chris Wilson wrote:
>>Quoting Daniele Ceraolo Spurio (2019-12-11 21:12:40)
>>>Ahead of splitting out the code specific to execlists submission to its
>>>own file, we can re-organize the code within the intel_lrc file to make
>>>that separation clearer. To achieve this, a number of functions have
>>>been split/renamed using the "logical_ring" and "lr_context" naming,
>>>respectively for engine-related setup and lrc management.
>>>
>>>Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
>>>Cc: Chris Wilson <chris@chris-wilson.co.uk>
>>>Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
>>>Cc: Matthew Brost <matthew.brost@intel.com>
>>>---
>>>  drivers/gpu/drm/i915/gt/intel_lrc.c    | 154 ++++++++++++++-----------
>>>  drivers/gpu/drm/i915/gt/selftest_lrc.c |  12 +-
>>>  2 files changed, 93 insertions(+), 73 deletions(-)
>>>
>>>diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
>>>index 929f6bae4eba..6d6148e11fd0 100644
>>>--- a/drivers/gpu/drm/i915/gt/intel_lrc.c
>>>+++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
>>>@@ -228,17 +228,17 @@ static struct virtual_engine *to_virtual_engine(struct intel_engine_cs *engine)
>>>         return container_of(engine, struct virtual_engine, base);
>>>  }
>>>-static int __execlists_context_alloc(struct intel_context *ce,
>>>-                                    struct intel_engine_cs *engine);
>>>-
>>>-static void execlists_init_reg_state(u32 *reg_state,
>>>-                                    const struct intel_context *ce,
>>>-                                    const struct intel_engine_cs *engine,
>>>-                                    const struct intel_ring *ring,
>>>-                                    bool close);
>>>+static int lr_context_alloc(struct intel_context *ce,
>>>+                           struct intel_engine_cs *engine);
>>
>>Execlists.
>>
>>>+
>>>+static void lr_context_init_reg_state(u32 *reg_state,
>>>+                                     const struct intel_context *ce,
>>>+                                     const struct intel_engine_cs *engine,
>>>+                                     const struct intel_ring *ring,
>>>+                                     bool close);
>>
>>lrc. lrc should just be the register offsets and default context image.
>>
>
>I've used "lrc" for anything related to managing the context
>object (creation, population, pin, etc.) and "execlists" for anything
>related to executing the context on the HW, with the aim of having the
>GuC code call only into lrc functions and not into execlists ones.
>If you prefer keeping the execlists naming for anything not related to
>the context or the context image, should we go for
>execlists_submission_* for anything that's specific to execlists
>submission, and avoid calling those from the GuC side?
>
>Daniele
>

Again, I like this approach. The GuC is going to leverage quite a bit of the
existing code in intel_lrc.c. For example, intel_execlists_context_pin is used.
To me a better name would be intel_lr_context_pin, and for it to live in a
common file rather than an execlists-specific file.

Matt

>>>  static void
>>>-__execlists_update_reg_state(const struct intel_context *ce,
>>>-                            const struct intel_engine_cs *engine);
>>>+lr_context_update_reg_state(const struct intel_context *ce,
>>>+                           const struct intel_engine_cs *engine);
>>
>>lrc.
>>
>>>  static void mark_eio(struct i915_request *rq)
>>>  {
>>>@@ -1035,8 +1035,8 @@ execlists_check_context(const struct intel_context *ce,
>>>         WARN_ONCE(!valid, "Invalid lrc state found before submission\n");
>>>  }
>>>-static void restore_default_state(struct intel_context *ce,
>>>-                                 struct intel_engine_cs *engine)
>>>+static void lr_context_restore_default_state(struct intel_context *ce,
>>>+                                            struct intel_engine_cs *engine)
>>>  {
>>>         u32 *regs = ce->lrc_reg_state;
>>>@@ -1045,7 +1045,7 @@ static void restore_default_state(struct intel_context *ce,
>>>                        engine->pinned_default_state + LRC_STATE_PN * PAGE_SIZE,
>>>                        engine->context_size - PAGE_SIZE);
>>>-       execlists_init_reg_state(regs, ce, engine, ce->ring, false);
>>>+       lr_context_init_reg_state(regs, ce, engine, ce->ring, false);
>>>  }
>>>  static void reset_active(struct i915_request *rq,
>>>@@ -1081,8 +1081,8 @@ static void reset_active(struct i915_request *rq,
>>>         intel_ring_update_space(ce->ring);
>>>         /* Scrub the context image to prevent replaying the previous batch */
>>>-       restore_default_state(ce, engine);
>>>-       __execlists_update_reg_state(ce, engine);
>>>+       lr_context_restore_default_state(ce, engine);
>>>+       lr_context_update_reg_state(ce, engine);
>>>         /* We've switched away, so this should be a no-op, but intent matters */
>>>         ce->lrc_desc |= CTX_DESC_FORCE_RESTORE;
>>>@@ -2378,7 +2378,7 @@ static void execlists_submit_request(struct i915_request *request)
>>>         spin_unlock_irqrestore(&engine->active.lock, flags);
>>>  }
>>>-static void __execlists_context_fini(struct intel_context *ce)
>>>+static void lr_context_fini(struct intel_context *ce)
>>
>>execlists.
>>
>>>  {
>>>         intel_ring_put(ce->ring);
>>>         i915_vma_put(ce->state);
>>>@@ -2392,7 +2392,7 @@ static void execlists_context_destroy(struct kref *kref)
>>>         GEM_BUG_ON(intel_context_is_pinned(ce));
>>>         if (ce->state)
>>>-               __execlists_context_fini(ce);
>>>+               lr_context_fini(ce);
>>>         intel_context_fini(ce);
>>>         intel_context_free(ce);
>>>@@ -2423,7 +2423,7 @@ check_redzone(const void *vaddr, const struct intel_engine_cs *engine)
>>>                              engine->name);
>>>  }
>>>-static void execlists_context_unpin(struct intel_context *ce)
>>>+static void intel_lr_context_unpin(struct intel_context *ce)
>>
>>execlists.
>>
>>>  {
>>>         check_redzone((void *)ce->lrc_reg_state - LRC_STATE_PN * PAGE_SIZE,
>>>                       ce->engine);
>>>@@ -2433,8 +2433,8 @@ static void execlists_context_unpin(struct intel_context *ce)
>>>  }
>>>  static void
>>>-__execlists_update_reg_state(const struct intel_context *ce,
>>>-                            const struct intel_engine_cs *engine)
>>>+lr_context_update_reg_state(const struct intel_context *ce,
>>>+                           const struct intel_engine_cs *engine)
>>
>>lrc.
>>
>>>  {
>>>         struct intel_ring *ring = ce->ring;
>>>         u32 *regs = ce->lrc_reg_state;
>>>@@ -2456,8 +2456,7 @@ __execlists_update_reg_state(const struct intel_context *ce,
>>>  }
>>>  static int
>>>-__execlists_context_pin(struct intel_context *ce,
>>>-                       struct intel_engine_cs *engine)
>>>+lr_context_pin(struct intel_context *ce, struct intel_engine_cs *engine)
>>
>>execlists.
>>
>>>  {
>>>         void *vaddr;
>>>         int ret;
>>>@@ -2479,7 +2478,7 @@ __execlists_context_pin(struct intel_context *ce,
>>>         ce->lrc_desc = lrc_descriptor(ce, engine);
>>>         ce->lrc_reg_state = vaddr + LRC_STATE_PN * PAGE_SIZE;
>>>-       __execlists_update_reg_state(ce, engine);
>>>+       lr_context_update_reg_state(ce, engine);
>>>         return 0;
>>>@@ -2491,12 +2490,12 @@ __execlists_context_pin(struct intel_context *ce,
>>>  static int execlists_context_pin(struct intel_context *ce)
>>>  {
>>>-       return __execlists_context_pin(ce, ce->engine);
>>>+       return lr_context_pin(ce, ce->engine);
>>>  }
>>>  static int execlists_context_alloc(struct intel_context *ce)
>>>  {
>>>-       return __execlists_context_alloc(ce, ce->engine);
>>>+       return lr_context_alloc(ce, ce->engine);
>>>  }
>>>  static void execlists_context_reset(struct intel_context *ce)
>>>@@ -2518,14 +2517,14 @@ static void execlists_context_reset(struct intel_context *ce)
>>>          * simplicity, we just zero everything out.
>>>          */
>>>         intel_ring_reset(ce->ring, 0);
>>>-       __execlists_update_reg_state(ce, ce->engine);
>>>+       lr_context_update_reg_state(ce, ce->engine);
>>>  }
>>>  static const struct intel_context_ops execlists_context_ops = {
>>>         .alloc = execlists_context_alloc,
>>>         .pin = execlists_context_pin,
>>>-       .unpin = execlists_context_unpin,
>>>+       .unpin = intel_lr_context_unpin,
>>
>>execlists.
>>
>>>         .enter = intel_context_enter_engine,
>>>         .exit = intel_context_exit_engine,
>>>@@ -2912,7 +2911,33 @@ static int intel_init_workaround_bb(struct intel_engine_cs *engine)
>>>         return ret;
>>>  }
>>>-static void enable_execlists(struct intel_engine_cs *engine)
>>>+static int logical_ring_init(struct intel_engine_cs *engine)
>>>+{
>>>+       int ret;
>>>+
>>>+       ret = intel_engine_init_common(engine);
>>>+       if (ret)
>>>+               return ret;
>>>+
>>>+       if (intel_init_workaround_bb(engine))
>>>+               /*
>>>+                * We continue even if we fail to initialize WA batch
>>>+                * because we only expect rare glitches but nothing
>>>+                * critical to prevent us from using GPU
>>>+                */
>>>+               DRM_ERROR("WA batch buffer initialization failed\n");
>>>+
>>>+       return 0;
>>>+}
>>>+
>>>+static void logical_ring_destroy(struct intel_engine_cs *engine)
>>>+{
>>>+       intel_engine_cleanup_common(engine);
>>>+       lrc_destroy_wa_ctx(engine);
>>>+       kfree(engine);
>>
>>>+}
>>>+
>>>+static void logical_ring_enable(struct intel_engine_cs *engine)
>>>  {
>>>         u32 mode;
>>>@@ -2946,7 +2971,7 @@ static bool unexpected_starting_state(struct intel_engine_cs *engine)
>>>         return unexpected;
>>>  }
>>>-static int execlists_resume(struct intel_engine_cs *engine)
>>>+static int logical_ring_resume(struct intel_engine_cs *engine)
>>
>>execlists.
>>
>>>  {
>>>         intel_engine_apply_workarounds(engine);
>>>         intel_engine_apply_whitelist(engine);
>>>@@ -2961,7 +2986,7 @@ static int execlists_resume(struct intel_engine_cs *engine)
>>>                 intel_engine_dump(engine, &p, NULL);
>>>         }
>>>-       enable_execlists(engine);
>>>+       logical_ring_enable(engine);
>>>         return 0;
>>>  }
>>>@@ -3037,8 +3062,8 @@ static void reset_csb_pointers(struct intel_engine_cs *engine)
>>>                                &execlists->csb_status[reset_value]);
>>>  }
>>>-static void __execlists_reset_reg_state(const struct intel_context *ce,
>>>-                                       const struct intel_engine_cs *engine)
>>>+static void lr_context_reset_reg_state(const struct intel_context *ce,
>>>+                                      const struct intel_engine_cs *engine)
>>
>>lrc.
>>
>>>  {
>>>         u32 *regs = ce->lrc_reg_state;
>>>         int x;
>>>@@ -3131,14 +3156,14 @@ static void __execlists_reset(struct intel_engine_cs *engine, bool stalled)
>>>          * to recreate its own state.
>>>          */
>>>         GEM_BUG_ON(!intel_context_is_pinned(ce));
>>>-       restore_default_state(ce, engine);
>>>+       lr_context_restore_default_state(ce, engine);
>>>  out_replay:
>>>         GEM_TRACE("%s replay {head:%04x, tail:%04x}\n",
>>>                   engine->name, ce->ring->head, ce->ring->tail);
>>>         intel_ring_update_space(ce->ring);
>>>-       __execlists_reset_reg_state(ce, engine);
>>>-       __execlists_update_reg_state(ce, engine);
>>>+       lr_context_reset_reg_state(ce, engine);
>>>+       lr_context_update_reg_state(ce, engine);
>>>         ce->lrc_desc |= CTX_DESC_FORCE_RESTORE; /* paranoid: GPU was reset! */
>>>  unwind:
>>>@@ -3788,9 +3813,7 @@ static void execlists_destroy(struct intel_engine_cs *engine)
>>>  {
>>>         execlists_shutdown(engine);
>>>-       intel_engine_cleanup_common(engine);
>>>-       lrc_destroy_wa_ctx(engine);
>>>-       kfree(engine);
>>>+       logical_ring_destroy(engine);
>>>  }
>>>  static void
>>>@@ -3799,7 +3822,7 @@ logical_ring_default_vfuncs(struct intel_engine_cs *engine)
>>>         /* Default vfuncs which can be overriden by each engine. */
>>>         engine->destroy = execlists_destroy;
>>>-       engine->resume = execlists_resume;
>>>+       engine->resume = logical_ring_resume;
>>>         engine->reset.prepare = execlists_reset_prepare;
>>>         engine->reset.reset = execlists_reset;
>>>@@ -3872,6 +3895,15 @@ static void rcs_submission_override(struct intel_engine_cs *engine)
>>>         }
>>>  }
>>>+static void logical_ring_setup(struct intel_engine_cs *engine)
>>>+{
>>>+       logical_ring_default_vfuncs(engine);
>>>+       logical_ring_default_irqs(engine);
>>>+
>>>+       if (engine->class == RENDER_CLASS)
>>>+               rcs_submission_override(engine);
>>>+}
>>>+
>>>  int intel_execlists_submission_setup(struct intel_engine_cs *engine)
>>>  {
>>>         tasklet_init(&engine->execlists.tasklet,
>>>@@ -3879,11 +3911,7 @@ int intel_execlists_submission_setup(struct intel_engine_cs *engine)
>>>         timer_setup(&engine->execlists.timer, execlists_timeslice, 0);
>>>         timer_setup(&engine->execlists.preempt, execlists_preempt, 0);
>>>-       logical_ring_default_vfuncs(engine);
>>>-       logical_ring_default_irqs(engine);
>>>-
>>>-       if (engine->class == RENDER_CLASS)
>>>-               rcs_submission_override(engine);
>>>+       logical_ring_setup(engine);
>>>         return 0;
>>>  }
>>>@@ -3896,18 +3924,10 @@ int intel_execlists_submission_init(struct intel_engine_cs *engine)
>>>         u32 base = engine->mmio_base;
>>>         int ret;
>>>-       ret = intel_engine_init_common(engine);
>>>+       ret = logical_ring_init(engine);
>>>         if (ret)
>>>                 return ret;
>>>-       if (intel_init_workaround_bb(engine))
>>>-               /*
>>>-                * We continue even if we fail to initialize WA batch
>>>-                * because we only expect rare glitches but nothing
>>>-                * critical to prevent us from using GPU
>>>-                */
>>>-               DRM_ERROR("WA batch buffer initialization failed\n");
>>>-
>>>         if (HAS_LOGICAL_RING_ELSQ(i915)) {
>>>                 execlists->submit_reg = uncore->regs +
>>>                         i915_mmio_reg_offset(RING_EXECLIST_SQ_CONTENTS(base));
>>>@@ -4033,11 +4053,11 @@ static struct i915_ppgtt *vm_alias(struct i915_address_space *vm)
>>>                 return i915_vm_to_ppgtt(vm);
>>>  }
>>>-static void execlists_init_reg_state(u32 *regs,
>>>-                                    const struct intel_context *ce,
>>>-                                    const struct intel_engine_cs *engine,
>>>-                                    const struct intel_ring *ring,
>>>-                                    bool close)
>>>+static void lr_context_init_reg_state(u32 *regs,
>>>+                                     const struct intel_context *ce,
>>>+                                     const struct intel_engine_cs *engine,
>>>+                                     const struct intel_ring *ring,
>>>+                                     bool close)
>>>  {
>>>         /*
>>>          * A context is actually a big batch buffer with several
>>>@@ -4105,7 +4125,7 @@ populate_lr_context(struct intel_context *ce,
>>>         /* The second page of the context object contains some fields which must
>>>          * be set up prior to the first execution. */
>>>         regs = vaddr + LRC_STATE_PN * PAGE_SIZE;
>>>-       execlists_init_reg_state(regs, ce, engine, ring, inhibit);
>>>+       lr_context_init_reg_state(regs, ce, engine, ring, inhibit);
>>>         if (inhibit)
>>>                 regs[CTX_CONTEXT_CONTROL] |=
>>>                         _MASKED_BIT_ENABLE(CTX_CTRL_ENGINE_CTX_RESTORE_INHIBIT);

* [Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for Split up intel_lrc.c
  2019-12-11 21:12 [Intel-gfx] [RFC 0/5] Split up intel_lrc.c Daniele Ceraolo Spurio
                   ` (4 preceding siblings ...)
  2019-12-11 21:12 ` [Intel-gfx] [RFC 5/5] drm/i915: introduce intel_execlists_submission.<c/h> Daniele Ceraolo Spurio
@ 2019-12-12  1:27 ` Patchwork
  2019-12-12  1:49 ` [Intel-gfx] ✓ Fi.CI.BAT: success " Patchwork
  2019-12-12 12:51 ` [Intel-gfx] ✗ Fi.CI.IGT: failure " Patchwork
  7 siblings, 0 replies; 21+ messages in thread
From: Patchwork @ 2019-12-12  1:27 UTC (permalink / raw)
  To: Daniele Ceraolo Spurio; +Cc: intel-gfx

== Series Details ==

Series: Split up intel_lrc.c
URL   : https://patchwork.freedesktop.org/series/70787/
State : warning

== Summary ==

$ dim checkpatch origin/drm-tip
4fa85f63db3e drm/i915: introduce logical_ring and lr_context naming
e25d3a53a30a drm/i915: Move struct intel_virtual_engine to its own header
-:306: WARNING:FILE_PATH_CHANGES: added, moved or deleted file(s), does MAINTAINERS need updating?
#306: 
new file mode 100644

-:311: WARNING:SPDX_LICENSE_TAG: Missing or malformed SPDX-License-Identifier tag in line 1
#311: FILE: drivers/gpu/drm/i915/gt/intel_virtual_engine_types.h:1:
+/*

-:312: WARNING:SPDX_LICENSE_TAG: Misplaced SPDX-License-Identifier tag - use line 1 instead
#312: FILE: drivers/gpu/drm/i915/gt/intel_virtual_engine_types.h:2:
+ * SPDX-License-Identifier: MIT

total: 0 errors, 3 warnings, 0 checks, 318 lines checked
c172772b81db drm/i915: split out virtual engine code
-:669: WARNING:FILE_PATH_CHANGES: added, moved or deleted file(s), does MAINTAINERS need updating?
#669: 
new file mode 100644

-:674: WARNING:SPDX_LICENSE_TAG: Missing or malformed SPDX-License-Identifier tag in line 1
#674: FILE: drivers/gpu/drm/i915/gt/intel_virtual_engine.c:1:
+/*

-:675: WARNING:SPDX_LICENSE_TAG: Misplaced SPDX-License-Identifier tag - use line 1 instead
#675: FILE: drivers/gpu/drm/i915/gt/intel_virtual_engine.c:2:
+ * SPDX-License-Identifier: MIT

-:1039: WARNING:SPDX_LICENSE_TAG: Missing or malformed SPDX-License-Identifier tag in line 1
#1039: FILE: drivers/gpu/drm/i915/gt/intel_virtual_engine.h:1:
+/*

-:1040: WARNING:SPDX_LICENSE_TAG: Misplaced SPDX-License-Identifier tag - use line 1 instead
#1040: FILE: drivers/gpu/drm/i915/gt/intel_virtual_engine.h:2:
+ * SPDX-License-Identifier: MIT

-:1073: CHECK:LINE_SPACING: Please don't use multiple blank lines
#1073: FILE: drivers/gpu/drm/i915/gt/intel_virtual_engine.h:35:
+
+

total: 0 errors, 5 warnings, 1 checks, 1052 lines checked
331b2734b611 drm/i915: move execlists selftests to their own file
-:78: WARNING:FILE_PATH_CHANGES: added, moved or deleted file(s), does MAINTAINERS need updating?
#78: 
new file mode 100644

-:83: WARNING:SPDX_LICENSE_TAG: Missing or malformed SPDX-License-Identifier tag in line 1
#83: FILE: drivers/gpu/drm/i915/gt/selftest_execlists.c:1:
+/*

-:84: WARNING:SPDX_LICENSE_TAG: Misplaced SPDX-License-Identifier tag - use line 1 instead
#84: FILE: drivers/gpu/drm/i915/gt/selftest_execlists.c:2:
+ * SPDX-License-Identifier: MIT

-:468: WARNING:LONG_LINE: line over 100 characters
#468: FILE: drivers/gpu/drm/i915/gt/selftest_execlists.c:386:
+			      2 * RUNTIME_INFO(outer->i915)->num_engines * (count + 2) * (count + 3)) < 0) {

-:2120: WARNING:LINE_SPACING: Missing a blank line after declarations
#2120: FILE: drivers/gpu/drm/i915/gt/selftest_execlists.c:2038:
+		struct igt_live_test t;
+		IGT_TIMEOUT(end_time);

-:2489: WARNING:LINE_SPACING: Missing a blank line after declarations
#2489: FILE: drivers/gpu/drm/i915/gt/selftest_execlists.c:2407:
+	struct preempt_smoke *smoke = arg;
+	IGT_TIMEOUT(end_time);

-:2536: WARNING:YIELD: Using yield() is generally wrong. See yield() kernel-doc (sched/core.c)
#2536: FILE: drivers/gpu/drm/i915/gt/selftest_execlists.c:2454:
+	yield(); /* start all threads before we kthread_stop() */

-:2563: WARNING:LINE_SPACING: Missing a blank line after declarations
#2563: FILE: drivers/gpu/drm/i915/gt/selftest_execlists.c:2481:
+	enum intel_engine_id id;
+	IGT_TIMEOUT(end_time);

total: 0 errors, 8 warnings, 0 checks, 6770 lines checked
7865d27bced8 drm/i915: introduce intel_execlists_submission.<c/h>
-:40: WARNING:FILE_PATH_CHANGES: added, moved or deleted file(s), does MAINTAINERS need updating?
#40: 
new file mode 100644

-:45: WARNING:SPDX_LICENSE_TAG: Missing or malformed SPDX-License-Identifier tag in line 1
#45: FILE: drivers/gpu/drm/i915/gt/intel_execlists_submission.c:1:
+/*

-:46: WARNING:SPDX_LICENSE_TAG: Misplaced SPDX-License-Identifier tag - use line 1 instead
#46: FILE: drivers/gpu/drm/i915/gt/intel_execlists_submission.c:2:
+ * SPDX-License-Identifier: MIT

-:139: WARNING:MEMORY_BARRIER: memory barrier without comment
#139: FILE: drivers/gpu/drm/i915/gt/intel_execlists_submission.c:95:
+		wmb();

-:290: WARNING:FUNCTION_ARGUMENTS: function definition argument 'pl' should also have an identifier name
#290: FILE: drivers/gpu/drm/i915/gt/intel_execlists_submission.c:246:
+	struct list_head *uninitialized_var(pl);

-:1355: WARNING:LONG_LINE: line over 100 characters
#1355: FILE: drivers/gpu/drm/i915/gt/intel_execlists_submission.c:1311:
+					*port = execlists_schedule_in(last, port - execlists->pending);

-:1957: WARNING:MEMORY_BARRIER: memory barrier without comment
#1957: FILE: drivers/gpu/drm/i915/gt/intel_execlists_submission.c:1913:
+	mb();

-:2536: WARNING:SPDX_LICENSE_TAG: Missing or malformed SPDX-License-Identifier tag in line 1
#2536: FILE: drivers/gpu/drm/i915/gt/intel_execlists_submission.h:1:
+/*

-:2537: WARNING:SPDX_LICENSE_TAG: Misplaced SPDX-License-Identifier tag - use line 1 instead
#2537: FILE: drivers/gpu/drm/i915/gt/intel_execlists_submission.h:2:
+ * SPDX-License-Identifier: MIT

total: 0 errors, 9 warnings, 0 checks, 5312 lines checked
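
For anyone adding more new files on top of this series: the SPDX warnings above come from the kernel's license-rules convention that the identifier must be the very first line, written as a line comment in .c files and as a one-line block comment in headers. A minimal header for the new sources would look roughly like the following (the copyright line is only a placeholder):

For a new .c file:

// SPDX-License-Identifier: MIT
/*
 * Copyright © 2019 Intel Corporation
 */

For a new .h file:

/* SPDX-License-Identifier: MIT */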


* [Intel-gfx] ✓ Fi.CI.BAT: success for Split up intel_lrc.c
  2019-12-11 21:12 [Intel-gfx] [RFC 0/5] Split up intel_lrc.c Daniele Ceraolo Spurio
                   ` (5 preceding siblings ...)
  2019-12-12  1:27 ` [Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for Split up intel_lrc.c Patchwork
@ 2019-12-12  1:49 ` Patchwork
  2019-12-12 12:51 ` [Intel-gfx] ✗ Fi.CI.IGT: failure " Patchwork
  7 siblings, 0 replies; 21+ messages in thread
From: Patchwork @ 2019-12-12  1:49 UTC (permalink / raw)
  To: Daniele Ceraolo Spurio; +Cc: intel-gfx

== Series Details ==

Series: Split up intel_lrc.c
URL   : https://patchwork.freedesktop.org/series/70787/
State : success

== Summary ==

CI Bug Log - changes from CI_DRM_7545 -> Patchwork_15699
====================================================

Summary
-------

  **SUCCESS**

  No regressions found.

  External URL: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15699/index.html

Known issues
------------

  Here are the changes found in Patchwork_15699 that come from known issues:

### IGT changes ###

#### Issues hit ####

  * igt@i915_pm_rpm@module-reload:
    - fi-skl-6770hq:      [PASS][1] -> [FAIL][2] ([i915#178])
   [1]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7545/fi-skl-6770hq/igt@i915_pm_rpm@module-reload.html
   [2]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15699/fi-skl-6770hq/igt@i915_pm_rpm@module-reload.html

  * igt@i915_selftest@live_gem_contexts:
    - fi-byt-n2820:       [PASS][3] -> [DMESG-FAIL][4] ([i915#722])
   [3]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7545/fi-byt-n2820/igt@i915_selftest@live_gem_contexts.html
   [4]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15699/fi-byt-n2820/igt@i915_selftest@live_gem_contexts.html

  * igt@kms_busy@basic-flip-pipe-a:
    - fi-bxt-dsi:         [PASS][5] -> [DMESG-WARN][6] ([i915#109])
   [5]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7545/fi-bxt-dsi/igt@kms_busy@basic-flip-pipe-a.html
   [6]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15699/fi-bxt-dsi/igt@kms_busy@basic-flip-pipe-a.html

  
#### Possible fixes ####

  * igt@i915_selftest@live_blt:
    - fi-ivb-3770:        [DMESG-FAIL][7] ([i915#725]) -> [PASS][8]
   [7]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7545/fi-ivb-3770/igt@i915_selftest@live_blt.html
   [8]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15699/fi-ivb-3770/igt@i915_selftest@live_blt.html

  * igt@kms_chamelium@hdmi-hpd-fast:
    - fi-kbl-7500u:       [FAIL][9] ([fdo#111096] / [i915#323]) -> [PASS][10]
   [9]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7545/fi-kbl-7500u/igt@kms_chamelium@hdmi-hpd-fast.html
   [10]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15699/fi-kbl-7500u/igt@kms_chamelium@hdmi-hpd-fast.html

  
#### Warnings ####

  * igt@i915_module_load@reload:
    - fi-icl-u2:          [DMESG-WARN][11] ([i915#109] / [i915#289]) -> [DMESG-WARN][12] ([i915#289])
   [11]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7545/fi-icl-u2/igt@i915_module_load@reload.html
   [12]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15699/fi-icl-u2/igt@i915_module_load@reload.html

  * igt@i915_selftest@live_blt:
    - fi-hsw-4770r:       [DMESG-FAIL][13] ([i915#553] / [i915#725]) -> [DMESG-FAIL][14] ([i915#725])
   [13]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7545/fi-hsw-4770r/igt@i915_selftest@live_blt.html
   [14]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15699/fi-hsw-4770r/igt@i915_selftest@live_blt.html

  * igt@kms_flip@basic-flip-vs-modeset:
    - fi-kbl-x1275:       [DMESG-WARN][15] ([i915#62] / [i915#92] / [i915#95]) -> [DMESG-WARN][16] ([i915#62] / [i915#92]) +6 similar issues
   [15]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7545/fi-kbl-x1275/igt@kms_flip@basic-flip-vs-modeset.html
   [16]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15699/fi-kbl-x1275/igt@kms_flip@basic-flip-vs-modeset.html

  * igt@kms_pipe_crc_basic@suspend-read-crc-pipe-a:
    - fi-kbl-x1275:       [DMESG-WARN][17] ([i915#62] / [i915#92]) -> [DMESG-WARN][18] ([i915#62] / [i915#92] / [i915#95]) +5 similar issues
   [17]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7545/fi-kbl-x1275/igt@kms_pipe_crc_basic@suspend-read-crc-pipe-a.html
   [18]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15699/fi-kbl-x1275/igt@kms_pipe_crc_basic@suspend-read-crc-pipe-a.html

  
  {name}: This element is suppressed. This means it is ignored when computing
          the status of the difference (SUCCESS, WARNING, or FAILURE).

  [fdo#111096]: https://bugs.freedesktop.org/show_bug.cgi?id=111096
  [fdo#111593]: https://bugs.freedesktop.org/show_bug.cgi?id=111593
  [i915#109]: https://gitlab.freedesktop.org/drm/intel/issues/109
  [i915#178]: https://gitlab.freedesktop.org/drm/intel/issues/178
  [i915#289]: https://gitlab.freedesktop.org/drm/intel/issues/289
  [i915#323]: https://gitlab.freedesktop.org/drm/intel/issues/323
  [i915#472]: https://gitlab.freedesktop.org/drm/intel/issues/472
  [i915#553]: https://gitlab.freedesktop.org/drm/intel/issues/553
  [i915#62]: https://gitlab.freedesktop.org/drm/intel/issues/62
  [i915#707]: https://gitlab.freedesktop.org/drm/intel/issues/707
  [i915#722]: https://gitlab.freedesktop.org/drm/intel/issues/722
  [i915#725]: https://gitlab.freedesktop.org/drm/intel/issues/725
  [i915#92]: https://gitlab.freedesktop.org/drm/intel/issues/92
  [i915#95]: https://gitlab.freedesktop.org/drm/intel/issues/95


Participating hosts (52 -> 46)
------------------------------

  Missing    (6): fi-ilk-m540 fi-byt-squawks fi-bsw-cyan fi-ctg-p8600 fi-byt-clapper fi-bdw-samus 


Build changes
-------------

  * CI: CI-20190529 -> None
  * Linux: CI_DRM_7545 -> Patchwork_15699

  CI-20190529: 20190529
  CI_DRM_7545: b1b808dff985c3c2050b20771050453589a60ca3 @ git://anongit.freedesktop.org/gfx-ci/linux
  IGT_5346: 466b0e6cbcbaccff012b484d1fd7676364b37b93 @ git://anongit.freedesktop.org/xorg/app/intel-gpu-tools
  Patchwork_15699: 7865d27bced83f512520ef79fd6e658150123789 @ git://anongit.freedesktop.org/gfx-ci/linux


== Linux commits ==

7865d27bced8 drm/i915: introduce intel_execlists_submission.<c/h>
331b2734b611 drm/i915: move execlists selftests to their own file
c172772b81db drm/i915: split out virtual engine code
e25d3a53a30a drm/i915: Move struct intel_virtual_engine to its own header
4fa85f63db3e drm/i915: introduce logical_ring and lr_context naming

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15699/index.html

* [Intel-gfx] ✗ Fi.CI.IGT: failure for Split up intel_lrc.c
  2019-12-11 21:12 [Intel-gfx] [RFC 0/5] Split up intel_lrc.c Daniele Ceraolo Spurio
                   ` (6 preceding siblings ...)
  2019-12-12  1:49 ` [Intel-gfx] ✓ Fi.CI.BAT: success " Patchwork
@ 2019-12-12 12:51 ` Patchwork
  7 siblings, 0 replies; 21+ messages in thread
From: Patchwork @ 2019-12-12 12:51 UTC (permalink / raw)
  To: Daniele Ceraolo Spurio; +Cc: intel-gfx

== Series Details ==

Series: Split up intel_lrc.c
URL   : https://patchwork.freedesktop.org/series/70787/
State : failure

== Summary ==

CI Bug Log - changes from CI_DRM_7545_full -> Patchwork_15699_full
====================================================

Summary
-------

  **FAILURE**

  Serious unknown changes coming with Patchwork_15699_full absolutely need to be
  verified manually.
  
  If you think the reported changes have nothing to do with the changes
  introduced in Patchwork_15699_full, please notify your bug team to allow them
  to document this new failure mode, which will reduce false positives in CI.

  

Possible new issues
-------------------

  Here are the unknown changes that may have been introduced in Patchwork_15699_full:

### IGT changes ###

#### Possible regressions ####

  * igt@gem_exec_balancer@bonded-slice:
    - shard-iclb:         [PASS][1] -> [FAIL][2]
   [1]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7545/shard-iclb3/igt@gem_exec_balancer@bonded-slice.html
   [2]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15699/shard-iclb2/igt@gem_exec_balancer@bonded-slice.html

  
Known issues
------------

  Here are the changes found in Patchwork_15699_full that come from known issues:

### IGT changes ###

#### Issues hit ####

  * igt@gem_busy@extended-parallel-vcs1:
    - shard-iclb:         [PASS][3] -> [SKIP][4] ([fdo#112080]) +1 similar issue
   [3]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7545/shard-iclb2/igt@gem_busy@extended-parallel-vcs1.html
   [4]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15699/shard-iclb8/igt@gem_busy@extended-parallel-vcs1.html

  * igt@gem_ctx_persistence@bcs0-mixed-process:
    - shard-skl:          [PASS][5] -> [FAIL][6] ([i915#679])
   [5]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7545/shard-skl4/igt@gem_ctx_persistence@bcs0-mixed-process.html
   [6]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15699/shard-skl8/igt@gem_ctx_persistence@bcs0-mixed-process.html

  * igt@gem_exec_balancer@bonded-slice:
    - shard-kbl:          [PASS][7] -> [FAIL][8] ([i915#800])
   [7]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7545/shard-kbl2/igt@gem_exec_balancer@bonded-slice.html
   [8]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15699/shard-kbl4/igt@gem_exec_balancer@bonded-slice.html

  * igt@gem_exec_schedule@deep-bsd:
    - shard-iclb:         [PASS][9] -> [SKIP][10] ([fdo#112146]) +1 similar issue
   [9]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7545/shard-iclb5/igt@gem_exec_schedule@deep-bsd.html
   [10]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15699/shard-iclb1/igt@gem_exec_schedule@deep-bsd.html

  * igt@gem_exec_schedule@preempt-queue-bsd1:
    - shard-iclb:         [PASS][11] -> [SKIP][12] ([fdo#109276]) +3 similar issues
   [11]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7545/shard-iclb2/igt@gem_exec_schedule@preempt-queue-bsd1.html
   [12]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15699/shard-iclb8/igt@gem_exec_schedule@preempt-queue-bsd1.html

  * igt@gem_persistent_relocs@forked-interruptible-thrashing:
    - shard-tglb:         [PASS][13] -> [TIMEOUT][14] ([fdo#112126] / [i915#530])
   [13]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7545/shard-tglb4/igt@gem_persistent_relocs@forked-interruptible-thrashing.html
   [14]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15699/shard-tglb6/igt@gem_persistent_relocs@forked-interruptible-thrashing.html
    - shard-hsw:          [PASS][15] -> [FAIL][16] ([i915#520])
   [15]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7545/shard-hsw1/igt@gem_persistent_relocs@forked-interruptible-thrashing.html
   [16]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15699/shard-hsw6/igt@gem_persistent_relocs@forked-interruptible-thrashing.html

  * igt@gem_persistent_relocs@forked-thrash-inactive:
    - shard-apl:          [PASS][17] -> [INCOMPLETE][18] ([fdo#103927] / [i915#530])
   [17]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7545/shard-apl2/igt@gem_persistent_relocs@forked-thrash-inactive.html
   [18]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15699/shard-apl2/igt@gem_persistent_relocs@forked-thrash-inactive.html

  * igt@gem_userptr_blits@dmabuf-unsync:
    - shard-snb:          [PASS][19] -> [DMESG-WARN][20] ([fdo#111870])
   [19]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7545/shard-snb5/igt@gem_userptr_blits@dmabuf-unsync.html
   [20]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15699/shard-snb6/igt@gem_userptr_blits@dmabuf-unsync.html

  * igt@gem_workarounds@suspend-resume-context:
    - shard-apl:          [PASS][21] -> [DMESG-WARN][22] ([i915#180]) +2 similar issues
   [21]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7545/shard-apl1/igt@gem_workarounds@suspend-resume-context.html
   [22]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15699/shard-apl4/igt@gem_workarounds@suspend-resume-context.html

  * igt@i915_pm_rpm@drm-resources-equal:
    - shard-tglb:         [PASS][23] -> [DMESG-WARN][24] ([i915#766])
   [23]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7545/shard-tglb6/igt@i915_pm_rpm@drm-resources-equal.html
   [24]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15699/shard-tglb7/igt@i915_pm_rpm@drm-resources-equal.html

  * igt@i915_selftest@mock_sanitycheck:
    - shard-skl:          [PASS][25] -> [DMESG-WARN][26] ([i915#747])
   [25]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7545/shard-skl2/igt@i915_selftest@mock_sanitycheck.html
   [26]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15699/shard-skl5/igt@i915_selftest@mock_sanitycheck.html

  * igt@kms_cursor_crc@pipe-a-cursor-suspend:
    - shard-kbl:          [PASS][27] -> [DMESG-WARN][28] ([i915#180]) +9 similar issues
   [27]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7545/shard-kbl1/igt@kms_cursor_crc@pipe-a-cursor-suspend.html
   [28]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15699/shard-kbl3/igt@kms_cursor_crc@pipe-a-cursor-suspend.html

  * igt@kms_cursor_crc@pipe-c-cursor-256x85-sliding:
    - shard-skl:          [PASS][29] -> [FAIL][30] ([i915#54]) +1 similar issue
   [29]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7545/shard-skl1/igt@kms_cursor_crc@pipe-c-cursor-256x85-sliding.html
   [30]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15699/shard-skl6/igt@kms_cursor_crc@pipe-c-cursor-256x85-sliding.html

  * igt@kms_frontbuffer_tracking@fbc-1p-primscrn-cur-indfb-draw-blt:
    - shard-tglb:         [PASS][31] -> [INCOMPLETE][32] ([i915#474] / [i915#667])
   [31]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7545/shard-tglb4/igt@kms_frontbuffer_tracking@fbc-1p-primscrn-cur-indfb-draw-blt.html
   [32]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15699/shard-tglb3/igt@kms_frontbuffer_tracking@fbc-1p-primscrn-cur-indfb-draw-blt.html

  * igt@kms_frontbuffer_tracking@fbc-suspend:
    - shard-tglb:         [PASS][33] -> [INCOMPLETE][34] ([i915#456] / [i915#460] / [i915#474])
   [33]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7545/shard-tglb8/igt@kms_frontbuffer_tracking@fbc-suspend.html
   [34]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15699/shard-tglb3/igt@kms_frontbuffer_tracking@fbc-suspend.html

  * igt@kms_frontbuffer_tracking@fbcpsr-1p-primscrn-indfb-msflip-blt:
    - shard-tglb:         [PASS][35] -> [FAIL][36] ([i915#49])
   [35]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7545/shard-tglb4/igt@kms_frontbuffer_tracking@fbcpsr-1p-primscrn-indfb-msflip-blt.html
   [36]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15699/shard-tglb6/igt@kms_frontbuffer_tracking@fbcpsr-1p-primscrn-indfb-msflip-blt.html

  * igt@kms_plane@pixel-format-pipe-b-planes-source-clamping:
    - shard-kbl:          [PASS][37] -> [INCOMPLETE][38] ([fdo#103665] / [i915#435] / [i915#648] / [i915#667])
   [37]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7545/shard-kbl6/igt@kms_plane@pixel-format-pipe-b-planes-source-clamping.html
   [38]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15699/shard-kbl6/igt@kms_plane@pixel-format-pipe-b-planes-source-clamping.html

  * igt@kms_psr@psr2_cursor_render:
    - shard-iclb:         [PASS][39] -> [SKIP][40] ([fdo#109441]) +1 similar issue
   [39]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7545/shard-iclb2/igt@kms_psr@psr2_cursor_render.html
   [40]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15699/shard-iclb8/igt@kms_psr@psr2_cursor_render.html

  * igt@kms_setmode@basic:
    - shard-hsw:          [PASS][41] -> [FAIL][42] ([i915#31])
   [41]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7545/shard-hsw4/igt@kms_setmode@basic.html
   [42]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15699/shard-hsw6/igt@kms_setmode@basic.html

  
#### Possible fixes ####

  * igt@gem_ctx_isolation@vcs0-s3:
    - shard-skl:          [INCOMPLETE][43] ([i915#69]) -> [PASS][44]
   [43]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7545/shard-skl5/igt@gem_ctx_isolation@vcs0-s3.html
   [44]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15699/shard-skl3/igt@gem_ctx_isolation@vcs0-s3.html

  * igt@gem_ctx_isolation@vcs1-s3:
    - shard-iclb:         [SKIP][45] ([fdo#109276] / [fdo#112080]) -> [PASS][46]
   [45]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7545/shard-iclb5/igt@gem_ctx_isolation@vcs1-s3.html
   [46]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15699/shard-iclb1/igt@gem_ctx_isolation@vcs1-s3.html

  * igt@gem_ctx_shared@q-smoketest-all:
    - shard-tglb:         [INCOMPLETE][47] ([fdo#111735]) -> [PASS][48]
   [47]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7545/shard-tglb8/igt@gem_ctx_shared@q-smoketest-all.html
   [48]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15699/shard-tglb3/igt@gem_ctx_shared@q-smoketest-all.html

  * igt@gem_eio@unwedge-stress:
    - shard-snb:          [FAIL][49] ([i915#232]) -> [PASS][50]
   [49]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7545/shard-snb6/igt@gem_eio@unwedge-stress.html
   [50]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15699/shard-snb5/igt@gem_eio@unwedge-stress.html

  * igt@gem_exec_nop@basic-parallel:
    - shard-tglb:         [INCOMPLETE][51] ([i915#435]) -> [PASS][52]
   [51]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7545/shard-tglb7/igt@gem_exec_nop@basic-parallel.html
   [52]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15699/shard-tglb2/igt@gem_exec_nop@basic-parallel.html

  * igt@gem_exec_parallel@vcs1-fds:
    - shard-iclb:         [SKIP][53] ([fdo#112080]) -> [PASS][54] +6 similar issues
   [53]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7545/shard-iclb3/igt@gem_exec_parallel@vcs1-fds.html
   [54]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15699/shard-iclb2/igt@gem_exec_parallel@vcs1-fds.html

  * {igt@gem_exec_schedule@pi-shared-iova-bsd2}:
    - shard-iclb:         [SKIP][55] ([fdo#109276]) -> [PASS][56] +5 similar issues
   [55]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7545/shard-iclb5/igt@gem_exec_schedule@pi-shared-iova-bsd2.html
   [56]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15699/shard-iclb1/igt@gem_exec_schedule@pi-shared-iova-bsd2.html

  * igt@gem_exec_schedule@preempt-hang-bsd:
    - shard-iclb:         [SKIP][57] ([fdo#112146]) -> [PASS][58]
   [57]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7545/shard-iclb2/igt@gem_exec_schedule@preempt-hang-bsd.html
   [58]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15699/shard-iclb8/igt@gem_exec_schedule@preempt-hang-bsd.html

  * igt@gem_exec_schedule@preempt-queue-contexts-blt:
    - shard-tglb:         [INCOMPLETE][59] ([fdo#111606] / [fdo#111677]) -> [PASS][60]
   [59]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7545/shard-tglb6/igt@gem_exec_schedule@preempt-queue-contexts-blt.html
   [60]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15699/shard-tglb8/igt@gem_exec_schedule@preempt-queue-contexts-blt.html

  * igt@gem_sync@basic-store-each:
    - shard-tglb:         [INCOMPLETE][61] ([i915#435] / [i915#472]) -> [PASS][62]
   [61]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7545/shard-tglb7/igt@gem_sync@basic-store-each.html
   [62]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15699/shard-tglb1/igt@gem_sync@basic-store-each.html

  * igt@gem_userptr_blits@sync-unmap-after-close:
    - shard-snb:          [DMESG-WARN][63] ([fdo#111870]) -> [PASS][64]
   [63]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7545/shard-snb1/igt@gem_userptr_blits@sync-unmap-after-close.html
   [64]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15699/shard-snb5/igt@gem_userptr_blits@sync-unmap-after-close.html

  * igt@gem_workarounds@suspend-resume:
    - shard-tglb:         [INCOMPLETE][65] ([i915#456] / [i915#460]) -> [PASS][66] +1 similar issue
   [65]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7545/shard-tglb1/igt@gem_workarounds@suspend-resume.html
   [66]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15699/shard-tglb2/igt@gem_workarounds@suspend-resume.html

  * igt@i915_selftest@mock_sanitycheck:
    - shard-hsw:          [DMESG-WARN][67] ([i915#747]) -> [PASS][68]
   [67]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7545/shard-hsw4/igt@i915_selftest@mock_sanitycheck.html
   [68]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15699/shard-hsw4/igt@i915_selftest@mock_sanitycheck.html

  * igt@i915_suspend@fence-restore-tiled2untiled:
    - shard-apl:          [DMESG-WARN][69] ([i915#180]) -> [PASS][70] +2 similar issues
   [69]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7545/shard-apl1/igt@i915_suspend@fence-restore-tiled2untiled.html
   [70]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15699/shard-apl4/igt@i915_suspend@fence-restore-tiled2untiled.html

  * igt@kms_cursor_crc@pipe-a-cursor-128x42-sliding:
    - shard-skl:          [FAIL][71] ([i915#54]) -> [PASS][72] +1 similar issue
   [71]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7545/shard-skl5/igt@kms_cursor_crc@pipe-a-cursor-128x42-sliding.html
   [72]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15699/shard-skl9/igt@kms_cursor_crc@pipe-a-cursor-128x42-sliding.html

  * igt@kms_cursor_crc@pipe-c-cursor-256x256-offscreen:
    - shard-hsw:          [DMESG-WARN][73] ([IGT#6]) -> [PASS][74]
   [73]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7545/shard-hsw6/igt@kms_cursor_crc@pipe-c-cursor-256x256-offscreen.html
   [74]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15699/shard-hsw5/igt@kms_cursor_crc@pipe-c-cursor-256x256-offscreen.html

  * igt@kms_cursor_legacy@2x-long-cursor-vs-flip-legacy:
    - shard-hsw:          [FAIL][75] ([i915#96]) -> [PASS][76]
   [75]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7545/shard-hsw8/igt@kms_cursor_legacy@2x-long-cursor-vs-flip-legacy.html
   [76]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15699/shard-hsw8/igt@kms_cursor_legacy@2x-long-cursor-vs-flip-legacy.html

  * igt@kms_flip@2x-flip-vs-expired-vblank-interruptible:
    - shard-glk:          [FAIL][77] ([i915#79]) -> [PASS][78]
   [77]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7545/shard-glk7/igt@kms_flip@2x-flip-vs-expired-vblank-interruptible.html
   [78]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15699/shard-glk1/igt@kms_flip@2x-flip-vs-expired-vblank-interruptible.html

  * igt@kms_flip@flip-vs-suspend:
    - shard-snb:          [INCOMPLETE][79] ([i915#82]) -> [PASS][80]
   [79]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7545/shard-snb1/igt@kms_flip@flip-vs-suspend.html
   [80]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15699/shard-snb5/igt@kms_flip@flip-vs-suspend.html

  * igt@kms_flip@plain-flip-interruptible:
    - shard-hsw:          [INCOMPLETE][81] ([i915#61]) -> [PASS][82]
   [81]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7545/shard-hsw2/igt@kms_flip@plain-flip-interruptible.html
   [82]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15699/shard-hsw1/igt@kms_flip@plain-flip-interruptible.html

  * igt@kms_frontbuffer_tracking@fbc-suspend:
    - shard-kbl:          [DMESG-WARN][83] ([i915#180]) -> [PASS][84] +5 similar issues
   [83]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7545/shard-kbl4/igt@kms_frontbuffer_tracking@fbc-suspend.html
   [84]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15699/shard-kbl7/igt@kms_frontbuffer_tracking@fbc-suspend.html

  * igt@kms_frontbuffer_tracking@fbcpsr-rgb565-draw-pwrite:
    - shard-tglb:         [FAIL][85] ([i915#49]) -> [PASS][86] +1 similar issue
   [85]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7545/shard-tglb3/igt@kms_frontbuffer_tracking@fbcpsr-rgb565-draw-pwrite.html
   [86]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15699/shard-tglb7/igt@kms_frontbuffer_tracking@fbcpsr-rgb565-draw-pwrite.html

  * igt@kms_plane@pixel-format-pipe-b-planes:
    - shard-skl:          [INCOMPLETE][87] ([fdo#112391] / [i915#648] / [i915#667]) -> [PASS][88]
   [87]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7545/shard-skl10/igt@kms_plane@pixel-format-pipe-b-planes.html
   [88]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15699/shard-skl8/igt@kms_plane@pixel-format-pipe-b-planes.html

  * igt@kms_plane_alpha_blend@pipe-b-coverage-7efc:
    - shard-skl:          [FAIL][89] ([fdo#108145] / [i915#265]) -> [PASS][90]
   [89]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7545/shard-skl5/igt@kms_plane_alpha_blend@pipe-b-coverage-7efc.html
   [90]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15699/shard-skl2/igt@kms_plane_alpha_blend@pipe-b-coverage-7efc.html

  * igt@kms_psr@psr2_cursor_blt:
    - shard-iclb:         [SKIP][91] ([fdo#109441]) -> [PASS][92]
   [91]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7545/shard-iclb1/igt@kms_psr@psr2_cursor_blt.html
   [92]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15699/shard-iclb2/igt@kms_psr@psr2_cursor_blt.html

  * igt@kms_sequence@get-forked-busy:
    - shard-snb:          [SKIP][93] ([fdo#109271]) -> [PASS][94] +1 similar issue
   [93]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7545/shard-snb2/igt@kms_sequence@get-forked-busy.html
   [94]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15699/shard-snb7/igt@kms_sequence@get-forked-busy.html

  * igt@kms_setmode@basic:
    - shard-apl:          [FAIL][95] ([i915#31]) -> [PASS][96]
   [95]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7545/shard-apl1/igt@kms_setmode@basic.html
   [96]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15699/shard-apl4/igt@kms_setmode@basic.html
    - shard-glk:          [FAIL][97] ([i915#31]) -> [PASS][98]
   [97]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7545/shard-glk3/igt@kms_setmode@basic.html
   [98]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15699/shard-glk7/igt@kms_setmode@basic.html

  
#### Warnings ####

  * igt@gem_eio@kms:
    - shard-snb:          [DMESG-WARN][99] ([i915#444]) -> [INCOMPLETE][100] ([i915#82])
   [99]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7545/shard-snb6/igt@gem_eio@kms.html
   [100]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15699/shard-snb4/igt@gem_eio@kms.html

  * igt@kms_plane@pixel-format-pipe-a-planes:
    - shard-skl:          [INCOMPLETE][101] ([fdo#112347] / [i915#648] / [i915#667]) -> [INCOMPLETE][102] ([i915#648])
   [101]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7545/shard-skl7/igt@kms_plane@pixel-format-pipe-a-planes.html
   [102]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15699/shard-skl1/igt@kms_plane@pixel-format-pipe-a-planes.html

  
  {name}: This element is suppressed. This means it is ignored when computing
          the status of the difference (SUCCESS, WARNING, or FAILURE).

  [IGT#6]: https://gitlab.freedesktop.org/drm/igt-gpu-tools/issues/6
  [fdo#103665]: https://bugs.freedesktop.org/show_bug.cgi?id=103665
  [fdo#103927]: https://bugs.freedesktop.org/show_bug.cgi?id=103927
  [fdo#108145]: https://bugs.freedesktop.org/show_bug.cgi?id=108145
  [fdo#109271]: https://bugs.freedesktop.org/show_bug.cgi?id=109271
  [fdo#109276]: https://bugs.freedesktop.org/show_bug.cgi?id=109276
  [fdo#109441]: https://bugs.freedesktop.org/show_bug.cgi?id=109441
  [fdo#111606]: https://bugs.freedesktop.org/show_bug.cgi?id=111606
  [fdo#111677]: https://bugs.freedesktop.org/show_bug.cgi?id=111677
  [fdo#111735]: https://bugs.freedesktop.org/show_bug.cgi?id=111735
  [fdo#111870]: https://bugs.freedesktop.org/show_bug.cgi?id=111870
  [fdo#112080]: https://bugs.freedesktop.org/show_bug.cgi?id=112080
  [fdo#112126]: https://bugs.freedesktop.org/show_bug.cgi?id=112126
  [fdo#112146]: https://bugs.freedesktop.org/show_bug.cgi?id=112146
  [fdo#112347]: https://bugs.freedesktop.org/show_bug.cgi?id=112347
  [fdo#112391]: https://bugs.freedesktop.org/show_bug.cgi?id=112391
  [i915#180]: https://gitlab.freedesktop.org/drm/intel/issues/180
  [i915#232]: https://gitlab.freedesktop.org/drm/intel/issues/232
  [i915#265]: https://gitlab.freedesktop.org/drm/intel/issues/265
  [i915#31]: https://gitlab.freedesktop.org/drm/intel/issues/31
  [i915#435]: https://gitlab.freedesktop.org/drm/intel/issues/435
  [i915#444]: https://gitlab.freedesktop.org/drm/intel/issues/444
  [i915#456]: https://gitlab.freedesktop.org/drm/intel/issues/456
  [i915#460]: https://gitlab.freedesktop.org/drm/intel/issues/460
  [i915#472]: https://gitlab.freedesktop.org/drm/intel/issues/472
  [i915#474]: https://gitlab.freedesktop.org/drm/intel/issues/474
  [i915#49]: https://gitlab.freedesktop.org/drm/intel/issues/49
  [i915#520]: https://gitlab.freedesktop.org/drm/intel/issues/520
  [i915#530]: https://gitlab.freedesktop.org/drm/intel/issues/530
  [i915#54]: https://gitlab.freedesktop.org/drm/intel/issues/54
  [i915#61]: https://gitlab.freedesktop.org/drm/intel/issues/61
  [i915#648]: https://gitlab.freedesktop.org/drm/intel/issues/648
  [i915#667]: https://gitlab.freedesktop.org/drm/intel/issues/667
  [i915#679]: https://gitlab.freedesktop.org/drm/intel/issues/679
  [i915#69]: https://gitlab.freedesktop.org/drm/intel/issues/69
  [i915#747]: https://gitlab.freedesktop.org/drm/intel/issues/747
  [i915#766]: https://gitlab.freedesktop.org/drm/intel/issues/766
  [i915#79]: https://gitlab.freedesktop.org/drm/intel/issues/79
  [i915#800]: https://gitlab.freedesktop.org/drm/intel/issues/800
  [i915#82]: https://gitlab.freedesktop.org/drm/intel/issues/82
  [i915#96]: https://gitlab.freedesktop.org/drm/intel/issues/96


Participating hosts (11 -> 11)
------------------------------

  No changes in participating hosts


Build changes
-------------

  * CI: CI-20190529 -> None
  * Linux: CI_DRM_7545 -> Patchwork_15699

  CI-20190529: 20190529
  CI_DRM_7545: b1b808dff985c3c2050b20771050453589a60ca3 @ git://anongit.freedesktop.org/gfx-ci/linux
  IGT_5346: 466b0e6cbcbaccff012b484d1fd7676364b37b93 @ git://anongit.freedesktop.org/xorg/app/intel-gpu-tools
  Patchwork_15699: 7865d27bced83f512520ef79fd6e658150123789 @ git://anongit.freedesktop.org/gfx-ci/linux
  piglit_4509: fdc5a4ca11124ab8413c7988896eec4c97336694 @ git://anongit.freedesktop.org/piglit

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15699/index.html

Thread overview: 21+ messages
2019-12-11 21:12 [Intel-gfx] [RFC 0/5] Split up intel_lrc.c Daniele Ceraolo Spurio
2019-12-11 21:12 ` [Intel-gfx] [RFC 1/5] drm/i915: introduce logical_ring and lr_context naming Daniele Ceraolo Spurio
2019-12-11 21:20   ` Chris Wilson
2019-12-11 21:33     ` Chris Wilson
2019-12-11 22:04     ` Daniele Ceraolo Spurio
2019-12-11 23:35       ` Matthew Brost
2019-12-11 21:12 ` [Intel-gfx] [RFC 2/5] drm/i915: Move struct intel_virtual_engine to its own header Daniele Ceraolo Spurio
2019-12-11 21:22   ` Chris Wilson
2019-12-11 21:12 ` [Intel-gfx] [RFC 3/5] drm/i915: split out virtual engine code Daniele Ceraolo Spurio
2019-12-11 21:22   ` Chris Wilson
2019-12-11 21:34     ` Daniele Ceraolo Spurio
2019-12-11 23:09       ` Matthew Brost
2019-12-11 21:12 ` [Intel-gfx] [RFC 4/5] drm/i915: move execlists selftests to their own file Daniele Ceraolo Spurio
2019-12-11 21:26   ` Chris Wilson
2019-12-11 22:07     ` Daniele Ceraolo Spurio
2019-12-11 21:12 ` [Intel-gfx] [RFC 5/5] drm/i915: introduce intel_execlists_submission.<c/h> Daniele Ceraolo Spurio
2019-12-11 21:31   ` Chris Wilson
2019-12-11 22:35     ` Daniele Ceraolo Spurio
2019-12-12  1:27 ` [Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for Split up intel_lrc.c Patchwork
2019-12-12  1:49 ` [Intel-gfx] ✓ Fi.CI.BAT: success " Patchwork
2019-12-12 12:51 ` [Intel-gfx] ✗ Fi.CI.IGT: failure " Patchwork
