All of lore.kernel.org
 help / color / mirror / Atom feed
* [Intel-gfx] [PATCH 01/33] drm/i915: Use EAGAIN for trylock failures
@ 2019-12-12 14:04 Chris Wilson
  2019-12-12 14:04 ` [Intel-gfx] [PATCH 02/33] drm/i915: Eliminate the trylock for awaiting an earlier request Chris Wilson
                   ` (32 more replies)
  0 siblings, 33 replies; 46+ messages in thread
From: Chris Wilson @ 2019-12-12 14:04 UTC (permalink / raw)
  To: intel-gfx

While not good behaviour, it is, however, established behaviour that we
can punt EAGAIN to userspace if we need to retry the ioctl. When trying
to acquire a mutex, prefer to use EAGAIN to propagate losing the race
so that if it does end up back in userspace, we try again.

Closes: https://gitlab.freedesktop.org/drm/intel/issues/800
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/gt/intel_timeline.c | 2 +-
 drivers/gpu/drm/i915/i915_request.c      | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_timeline.c b/drivers/gpu/drm/i915/gt/intel_timeline.c
index 038e05a6336c..d71aafb66d6e 100644
--- a/drivers/gpu/drm/i915/gt/intel_timeline.c
+++ b/drivers/gpu/drm/i915/gt/intel_timeline.c
@@ -527,7 +527,7 @@ int intel_timeline_read_hwsp(struct i915_request *from,
 
 	GEM_BUG_ON(rcu_access_pointer(to->timeline) == tl);
 
-	err = -EBUSY;
+	err = -EAGAIN;
 	if (mutex_trylock(&tl->mutex)) {
 		struct intel_timeline_cacheline *cl = from->hwsp_cacheline;
 
diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
index 51bb8a0812a1..fb8738987aeb 100644
--- a/drivers/gpu/drm/i915/i915_request.c
+++ b/drivers/gpu/drm/i915/i915_request.c
@@ -783,7 +783,7 @@ i915_request_await_start(struct i915_request *rq, struct i915_request *signal)
 	if (!tl) /* already started or maybe even completed */
 		return 0;
 
-	fence = ERR_PTR(-EBUSY);
+	fence = ERR_PTR(-EAGAIN);
 	if (mutex_trylock(&tl->mutex)) {
 		fence = NULL;
 		if (!i915_request_started(signal) &&
-- 
2.24.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [Intel-gfx] [PATCH 02/33] drm/i915: Eliminate the trylock for awaiting an earlier request
  2019-12-12 14:04 [Intel-gfx] [PATCH 01/33] drm/i915: Use EAGAIN for trylock failures Chris Wilson
@ 2019-12-12 14:04 ` Chris Wilson
  2019-12-12 14:04 ` [Intel-gfx] [PATCH 03/33] drm/i915/gt: Eliminate the trylock for reading a timeline's hwsp Chris Wilson
                   ` (31 subsequent siblings)
  32 siblings, 0 replies; 46+ messages in thread
From: Chris Wilson @ 2019-12-12 14:04 UTC (permalink / raw)
  To: intel-gfx

We currently use an error-prone mutex_trylock to grab another timeline
to find an earlier request along it. However, with a bit of a
sleight-of-hand, we can reduce the mutex_trylock to a spin_lock on the
immediate request and careful pointer chasing to acquire a reference on
the previous request.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/i915_request.c | 39 ++++++++++++++++-------------
 1 file changed, 21 insertions(+), 18 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
index fb8738987aeb..f513747ff027 100644
--- a/drivers/gpu/drm/i915/i915_request.c
+++ b/drivers/gpu/drm/i915/i915_request.c
@@ -768,34 +768,37 @@ i915_request_create(struct intel_context *ce)
 static int
 i915_request_await_start(struct i915_request *rq, struct i915_request *signal)
 {
-	struct intel_timeline *tl;
 	struct dma_fence *fence;
 	int err;
 
 	GEM_BUG_ON(i915_request_timeline(rq) ==
 		   rcu_access_pointer(signal->timeline));
 
+	fence = NULL;
 	rcu_read_lock();
-	tl = rcu_dereference(signal->timeline);
-	if (i915_request_started(signal) || !kref_get_unless_zero(&tl->kref))
-		tl = NULL;
-	rcu_read_unlock();
-	if (!tl) /* already started or maybe even completed */
-		return 0;
+	spin_lock_irq(&signal->lock);
+	if (!i915_request_started(signal) &&
+	    !list_is_first(&signal->link,
+			   &rcu_dereference(signal->timeline)->requests)) {
+		struct i915_request *prev = list_prev_entry(signal, link);
 
-	fence = ERR_PTR(-EAGAIN);
-	if (mutex_trylock(&tl->mutex)) {
-		fence = NULL;
-		if (!i915_request_started(signal) &&
-		    !list_is_first(&signal->link, &tl->requests)) {
-			signal = list_prev_entry(signal, link);
-			fence = dma_fence_get(&signal->fence);
+		/*
+		 * Peek at the request before us in the timeline. That
+		 * request will only be valid before it is retired, so
+		 * after acquiring a reference to it, confirm that it is
+		 * still part of the signaler's timeline.
+		 */
+		if (i915_request_get_rcu(prev)) {
+			if (list_next_entry(prev, link) == signal)
+				fence = &prev->fence;
+			else
+				i915_request_put(prev);
 		}
-		mutex_unlock(&tl->mutex);
 	}
-	intel_timeline_put(tl);
-	if (IS_ERR_OR_NULL(fence))
-		return PTR_ERR_OR_ZERO(fence);
+	spin_unlock_irq(&signal->lock);
+	rcu_read_unlock();
+	if (!fence)
+		return 0;
 
 	err = 0;
 	if (intel_timeline_sync_is_later(i915_request_timeline(rq), fence))
-- 
2.24.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [Intel-gfx] [PATCH 03/33] drm/i915/gt: Eliminate the trylock for reading a timeline's hwsp
  2019-12-12 14:04 [Intel-gfx] [PATCH 01/33] drm/i915: Use EAGAIN for trylock failures Chris Wilson
  2019-12-12 14:04 ` [Intel-gfx] [PATCH 02/33] drm/i915: Eliminate the trylock for awaiting an earlier request Chris Wilson
@ 2019-12-12 14:04 ` Chris Wilson
  2019-12-12 14:04 ` [Intel-gfx] [PATCH 04/33] drm/i915/uc: Ignore maybe-unused debug-only i915 local Chris Wilson
                   ` (30 subsequent siblings)
  32 siblings, 0 replies; 46+ messages in thread
From: Chris Wilson @ 2019-12-12 14:04 UTC (permalink / raw)
  To: intel-gfx

As we stash a pointer to the HWSP cacheline on the request, when reading
it we only need confirm that the cacheline is still valid by checking
that the request and timeline are still intact.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/gt/intel_timeline.c | 38 ++++++++----------------
 1 file changed, 13 insertions(+), 25 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_timeline.c b/drivers/gpu/drm/i915/gt/intel_timeline.c
index d71aafb66d6e..e852bd142ddf 100644
--- a/drivers/gpu/drm/i915/gt/intel_timeline.c
+++ b/drivers/gpu/drm/i915/gt/intel_timeline.c
@@ -514,6 +514,7 @@ int intel_timeline_read_hwsp(struct i915_request *from,
 			     struct i915_request *to,
 			     u32 *hwsp)
 {
+	struct intel_timeline_cacheline *cl = from->hwsp_cacheline;
 	struct intel_timeline *tl;
 	int err;
 
@@ -526,33 +527,20 @@ int intel_timeline_read_hwsp(struct i915_request *from,
 		return 1;
 
 	GEM_BUG_ON(rcu_access_pointer(to->timeline) == tl);
-
-	err = -EAGAIN;
-	if (mutex_trylock(&tl->mutex)) {
-		struct intel_timeline_cacheline *cl = from->hwsp_cacheline;
-
-		if (i915_request_completed(from)) {
-			err = 1;
-			goto unlock;
-		}
-
-		err = cacheline_ref(cl, to);
-		if (err)
-			goto unlock;
-
-		if (likely(cl == tl->hwsp_cacheline)) {
-			*hwsp = tl->hwsp_offset;
-		} else { /* across a seqno wrap, recover the original offset */
-			*hwsp = i915_ggtt_offset(cl->hwsp->vma) +
-				ptr_unmask_bits(cl->vaddr, CACHELINE_BITS) *
-				CACHELINE_BYTES;
-		}
-
-unlock:
-		mutex_unlock(&tl->mutex);
+	err = cacheline_ref(cl, to);
+	if (err)
+		goto out;
+
+	*hwsp = tl->hwsp_offset;
+	if (unlikely(cl != READ_ONCE(tl->hwsp_cacheline))) {
+		/* across a seqno wrap, recover the original offset */
+		*hwsp = i915_ggtt_offset(cl->hwsp->vma) +
+			ptr_unmask_bits(cl->vaddr, CACHELINE_BITS) *
+			CACHELINE_BYTES;
 	}
-	intel_timeline_put(tl);
 
+out:
+	intel_timeline_put(tl);
 	return err;
 }
 
-- 
2.24.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [Intel-gfx] [PATCH 04/33] drm/i915/uc: Ignore maybe-unused debug-only i915 local
  2019-12-12 14:04 [Intel-gfx] [PATCH 01/33] drm/i915: Use EAGAIN for trylock failures Chris Wilson
  2019-12-12 14:04 ` [Intel-gfx] [PATCH 02/33] drm/i915: Eliminate the trylock for awaiting an earlier request Chris Wilson
  2019-12-12 14:04 ` [Intel-gfx] [PATCH 03/33] drm/i915/gt: Eliminate the trylock for reading a timeline's hwsp Chris Wilson
@ 2019-12-12 14:04 ` Chris Wilson
  2019-12-12 14:04 ` [Intel-gfx] [PATCH 05/33] drm/i915/gem: Prepare gen7 cmdparser for async execution Chris Wilson
                   ` (29 subsequent siblings)
  32 siblings, 0 replies; 46+ messages in thread
From: Chris Wilson @ 2019-12-12 14:04 UTC (permalink / raw)
  To: intel-gfx

As the i915 local in __force_fw_fetch_failures() may not be used in a
non-debug build, tell the compiler to ignore it and not throw waarning.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
---
 drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c b/drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c
index b6aedee46f9e..4d02e06480e5 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c
@@ -254,6 +254,8 @@ static void __force_fw_fetch_failures(struct intel_uc_fw *uc_fw, int e)
 		uc_fw->minor_ver_wanted = 0;
 		uc_fw->user_overridden = true;
 	}
+
+	(void)i915;
 }
 
 /**
-- 
2.24.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [Intel-gfx] [PATCH 05/33] drm/i915/gem: Prepare gen7 cmdparser for async execution
  2019-12-12 14:04 [Intel-gfx] [PATCH 01/33] drm/i915: Use EAGAIN for trylock failures Chris Wilson
                   ` (2 preceding siblings ...)
  2019-12-12 14:04 ` [Intel-gfx] [PATCH 04/33] drm/i915/uc: Ignore maybe-unused debug-only i915 local Chris Wilson
@ 2019-12-12 14:04 ` Chris Wilson
  2019-12-12 14:04 ` [Intel-gfx] [PATCH 06/33] drm/i915/gem: Asynchronous cmdparser Chris Wilson
                   ` (28 subsequent siblings)
  32 siblings, 0 replies; 46+ messages in thread
From: Chris Wilson @ 2019-12-12 14:04 UTC (permalink / raw)
  To: intel-gfx

The gen7 cmdparser is primarily a promotion-based system to allow access
to additional registers beyond the HW validation, and allows fallback to
normal execution of the user batch buffer if valid and requires
chaining. In the next patch, we will do the cmdparser validation in the
pipeline asynchronously and so at the point of request construction we
will not know if we want to execute the privileged and validated batch,
or the original user batch. The solution employed here is to execute
both batches, one with raised privileges and one as normal. This is
because the gen7 MI_BATCH_BUFFER_START command cannot change privilege
level within a batch and must strictly use the current privilege level
(or undefined behaviour kills the GPU). So in order to execute the
original batch, we need a second non-priviledged batch buffer chain from
the ring, i.e. we need to emit two batches for each user batch. Inside
the two batches we determine which one should actually execute, we
provide a conditional trampoline to call the original batch.

Implementation-wise, we create a single buffer and write the shadow and
the trampoline inside it at different offsets; and bind the buffer into
both the kernel GGTT for the privileged execution of the shadow and into
the user ppGTT for the non-privileged execution of the trampoline and
original batch. One buffer, two batches and two vma.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
---
 .../gpu/drm/i915/gem/i915_gem_execbuffer.c    | 117 ++++++++++--------
 drivers/gpu/drm/i915/gt/intel_gpu_commands.h  |   8 ++
 drivers/gpu/drm/i915/i915_cmd_parser.c        |  57 +++++++--
 drivers/gpu/drm/i915/i915_drv.h               |   4 +-
 4 files changed, 127 insertions(+), 59 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
index 13f88fc536c7..4e546b6fff8e 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
@@ -228,6 +228,7 @@ struct i915_execbuffer {
 
 	struct i915_request *request; /** our request to build */
 	struct i915_vma *batch; /** identity of the batch obj/vma */
+	struct i915_vma *trampoline; /** trampoline used for chaining */
 
 	/** actual size of execobj[] as we may extend it for the cmdparser */
 	unsigned int buffer_count;
@@ -1946,31 +1947,13 @@ static int i915_reset_gen7_sol_offsets(struct i915_request *rq)
 }
 
 static struct i915_vma *
-shadow_batch_pin(struct i915_execbuffer *eb, struct drm_i915_gem_object *obj)
+shadow_batch_pin(struct drm_i915_gem_object *obj,
+		 struct i915_address_space *vm,
+		 unsigned int flags)
 {
-	struct i915_address_space *vm;
 	struct i915_vma *vma;
-	u64 flags;
 	int err;
 
-	/*
-	 * PPGTT backed shadow buffers must be mapped RO, to prevent
-	 * post-scan tampering
-	 */
-	if (CMDPARSER_USES_GGTT(eb->i915)) {
-		vm = &eb->engine->gt->ggtt->vm;
-		flags = PIN_GLOBAL;
-	} else {
-		vm = eb->context->vm;
-		if (!vm->has_read_only) {
-			DRM_DEBUG("Cannot prevent post-scan tampering without RO capable vm\n");
-			return ERR_PTR(-EINVAL);
-		}
-
-		i915_gem_object_set_readonly(obj);
-		flags = PIN_USER;
-	}
-
 	vma = i915_vma_instance(obj, vm, NULL);
 	if (IS_ERR(vma))
 		return vma;
@@ -1985,59 +1968,80 @@ shadow_batch_pin(struct i915_execbuffer *eb, struct drm_i915_gem_object *obj)
 static int eb_parse(struct i915_execbuffer *eb)
 {
 	struct intel_engine_pool_node *pool;
-	struct i915_vma *vma;
+	struct i915_vma *shadow, *trampoline;
+	unsigned int len;
 	int err;
 
 	if (!eb_use_cmdparser(eb))
 		return 0;
 
-	pool = intel_engine_get_pool(eb->engine, eb->batch_len);
+	len = eb->batch_len;
+	if (!CMDPARSER_USES_GGTT(eb->i915)) {
+		/*
+		 * ppGTT backed shadow buffers must be mapped RO, to prevent
+		 * post-scan tampering
+		 */
+		if (!eb->context->vm->has_read_only) {
+			DRM_DEBUG("Cannot prevent post-scan tampering without RO capable vm\n");
+			return -EINVAL;
+		}
+	} else {
+		len += I915_CMD_PARSER_TRAMPOLINE_SIZE;
+	}
+
+	pool = intel_engine_get_pool(eb->engine, len);
 	if (IS_ERR(pool))
 		return PTR_ERR(pool);
 
-	vma = shadow_batch_pin(eb, pool->obj);
-	if (IS_ERR(vma)) {
-		err = PTR_ERR(vma);
+	shadow = shadow_batch_pin(pool->obj, eb->context->vm, PIN_USER);
+	if (IS_ERR(shadow)) {
+		err = PTR_ERR(shadow);
 		goto err;
 	}
+	i915_gem_object_set_readonly(shadow->obj);
+
+	trampoline = NULL;
+	if (CMDPARSER_USES_GGTT(eb->i915)) {
+		trampoline = shadow;
+
+		shadow = shadow_batch_pin(pool->obj,
+					  &eb->engine->gt->ggtt->vm,
+					  PIN_GLOBAL);
+		if (IS_ERR(shadow)) {
+			err = PTR_ERR(shadow);
+			shadow = trampoline;
+			goto err_shadow;
+		}
+
+		eb->batch_flags |= I915_DISPATCH_SECURE;
+	}
 
 	err = intel_engine_cmd_parser(eb->engine,
 				      eb->batch,
 				      eb->batch_start_offset,
 				      eb->batch_len,
-				      vma);
-	if (err) {
-		/*
-		 * Unsafe GGTT-backed buffers can still be submitted safely
-		 * as non-secure.
-		 * For PPGTT backing however, we have no choice but to forcibly
-		 * reject unsafe buffers
-		 */
-		if (i915_vma_is_ggtt(vma) && err == -EACCES)
-			err = 0;
-
-		goto err_unpin;
-	}
+				      shadow, trampoline);
+	if (err)
+		goto err_trampoline;
 
-	eb->vma[eb->buffer_count] = i915_vma_get(vma);
+	eb->vma[eb->buffer_count] = i915_vma_get(shadow);
 	eb->flags[eb->buffer_count] =
 		__EXEC_OBJECT_HAS_PIN | __EXEC_OBJECT_HAS_REF;
-	vma->exec_flags = &eb->flags[eb->buffer_count];
+	shadow->exec_flags = &eb->flags[eb->buffer_count];
 	eb->buffer_count++;
 
+	eb->trampoline = trampoline;
 	eb->batch_start_offset = 0;
-	eb->batch = vma;
-
-	if (i915_vma_is_ggtt(vma))
-		eb->batch_flags |= I915_DISPATCH_SECURE;
-
-	/* eb->batch_len unchanged */
+	eb->batch = shadow;
 
-	vma->private = pool;
+	shadow->private = pool;
 	return 0;
 
-err_unpin:
-	i915_vma_unpin(vma);
+err_trampoline:
+	if (trampoline)
+		i915_vma_unpin(trampoline);
+err_shadow:
+	i915_vma_unpin(shadow);
 err:
 	intel_engine_pool_put(pool);
 	return err;
@@ -2089,6 +2093,16 @@ static int eb_submit(struct i915_execbuffer *eb)
 	if (err)
 		return err;
 
+	if (eb->trampoline) {
+		GEM_BUG_ON(eb->batch_start_offset);
+		err = eb->engine->emit_bb_start(eb->request,
+						eb->trampoline->node.start +
+						eb->batch_len,
+						0, 0);
+		if (err)
+			return err;
+	}
+
 	if (i915_gem_context_nopreempt(eb->gem_context))
 		eb->request->flags |= I915_REQUEST_NOPREEMPT;
 
@@ -2470,6 +2484,7 @@ i915_gem_do_execbuffer(struct drm_device *dev,
 	eb.buffer_count = args->buffer_count;
 	eb.batch_start_offset = args->batch_start_offset;
 	eb.batch_len = args->batch_len;
+	eb.trampoline = NULL;
 
 	eb.batch_flags = 0;
 	if (args->flags & I915_EXEC_SECURE) {
@@ -2667,6 +2682,8 @@ i915_gem_do_execbuffer(struct drm_device *dev,
 err_vma:
 	if (eb.exec)
 		eb_release_vmas(&eb);
+	if (eb.trampoline)
+		i915_vma_unpin(eb.trampoline);
 	mutex_unlock(&dev->struct_mutex);
 err_engine:
 	eb_unpin_engine(&eb);
diff --git a/drivers/gpu/drm/i915/gt/intel_gpu_commands.h b/drivers/gpu/drm/i915/gt/intel_gpu_commands.h
index c68c0e033f30..51b8718513bc 100644
--- a/drivers/gpu/drm/i915/gt/intel_gpu_commands.h
+++ b/drivers/gpu/drm/i915/gt/intel_gpu_commands.h
@@ -340,4 +340,12 @@ static inline u64 gen8_noncanonical_addr(u64 address)
 	return address & GENMASK_ULL(GEN8_HIGH_ADDRESS_BIT, 0);
 }
 
+static inline u32 *__gen6_emit_bb_start(u32 *cs, u32 addr, unsigned int flags)
+{
+	*cs++ = MI_BATCH_BUFFER_START | flags;
+	*cs++ = addr;
+
+	return cs;
+}
+
 #endif /* _INTEL_GPU_COMMANDS_H_ */
diff --git a/drivers/gpu/drm/i915/i915_cmd_parser.c b/drivers/gpu/drm/i915/i915_cmd_parser.c
index ddc8bf5175d9..34b0ea403a96 100644
--- a/drivers/gpu/drm/i915/i915_cmd_parser.c
+++ b/drivers/gpu/drm/i915/i915_cmd_parser.c
@@ -1362,8 +1362,7 @@ static int check_bbstart(u32 *cmd, u32 offset, u32 length,
 	return 0;
 }
 
-static unsigned long *
-alloc_whitelist(struct drm_i915_private *i915, u32 batch_length)
+static unsigned long *alloc_whitelist(u32 batch_length)
 {
 	unsigned long *jmp;
 
@@ -1373,9 +1372,6 @@ alloc_whitelist(struct drm_i915_private *i915, u32 batch_length)
 	 * reasonably cheap due to kmalloc caches.
 	 */
 
-	if (CMDPARSER_USES_GGTT(i915))
-		return NULL;
-
 	/* Prefer to report transient allocation failure rather than hit oom */
 	jmp = bitmap_zalloc(DIV_ROUND_UP(batch_length, sizeof(u32)),
 			    GFP_KERNEL | __GFP_RETRY_MAYFAIL | __GFP_NOWARN);
@@ -1394,6 +1390,7 @@ alloc_whitelist(struct drm_i915_private *i915, u32 batch_length)
  * @batch_offset: byte offset in the batch at which execution starts
  * @batch_length: length of the commands in batch_obj
  * @shadow: validated copy of the batch buffer in question
+ * @trampoline: whether to emit a conditional trampoline at the end of the batch
  *
  * Parses the specified batch buffer looking for privilege violations as
  * described in the overview.
@@ -1406,7 +1403,8 @@ int intel_engine_cmd_parser(struct intel_engine_cs *engine,
 			    struct i915_vma *batch,
 			    u32 batch_offset,
 			    u32 batch_length,
-			    struct i915_vma *shadow)
+			    struct i915_vma *shadow,
+			    bool trampoline)
 {
 	u32 *cmd, *batch_end, offset = 0;
 	struct drm_i915_cmd_descriptor default_desc = noop_desc;
@@ -1430,8 +1428,10 @@ int intel_engine_cmd_parser(struct intel_engine_cs *engine,
 		return PTR_ERR(cmd);
 	}
 
-	/* Defer failure until attempted use */
-	jump_whitelist = alloc_whitelist(engine->i915, batch_length);
+	jump_whitelist = NULL;
+	if (!trampoline)
+		/* Defer failure until attempted use */
+		jump_whitelist = alloc_whitelist(batch_length);
 
 	shadow_addr = gen8_canonical_addr(shadow->node.start);
 	batch_addr = gen8_canonical_addr(batch->node.start + batch_offset);
@@ -1493,6 +1493,47 @@ int intel_engine_cmd_parser(struct intel_engine_cs *engine,
 		}
 	} while (1);
 
+	if (trampoline) {
+		/*
+		 * With the trampoline, the shadow is executed twice.
+		 *
+		 *   1 - starting at offset 0, in privileged mode
+		 *   2 - starting at offset batch_len, as non-privileged
+		 *
+		 * Only if the batch is valid and safe to execute, do we
+		 * allow the first privileged execution to proceed. If not,
+		 * we terminate the first batch and use the second batchbuffer
+		 * entry to chain to the original unsafe non-privileged batch,
+		 * leaving it to the HW to validate.
+		 */
+		*batch_end = MI_BATCH_BUFFER_END;
+
+		if (ret) {
+			/* Batch unsafe to execute with privileges, cancel! */
+			cmd = page_mask_bits(shadow->obj->mm.mapping);
+			*cmd = MI_BATCH_BUFFER_END;
+
+			/* If batch is unsafe but valid, jump to the original */
+			if (ret == -EACCES) {
+				unsigned int flags;
+
+				flags = MI_BATCH_NON_SECURE_I965;
+				if (IS_HASWELL(engine->i915))
+					flags = MI_BATCH_NON_SECURE_HSW;
+
+				GEM_BUG_ON(!IS_GEN_RANGE(engine->i915, 6, 7));
+				__gen6_emit_bb_start(batch_end,
+						     batch_addr,
+						     flags);
+
+				ret = 0; /* allow execution */
+			}
+		}
+
+		if (needs_clflush_after)
+			drm_clflush_virt_range(batch_end, 8);
+	}
+
 	if (needs_clflush_after) {
 		void *ptr = page_mask_bits(shadow->obj->mm.mapping);
 
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 45f31197b390..0781b6326b8c 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1949,7 +1949,9 @@ int intel_engine_cmd_parser(struct intel_engine_cs *engine,
 			    struct i915_vma *batch,
 			    u32 batch_offset,
 			    u32 batch_length,
-			    struct i915_vma *shadow);
+			    struct i915_vma *shadow,
+			    bool trampoline);
+#define I915_CMD_PARSER_TRAMPOLINE_SIZE 8
 
 /* intel_device_info.c */
 static inline struct intel_device_info *
-- 
2.24.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [Intel-gfx] [PATCH 06/33] drm/i915/gem: Asynchronous cmdparser
  2019-12-12 14:04 [Intel-gfx] [PATCH 01/33] drm/i915: Use EAGAIN for trylock failures Chris Wilson
                   ` (3 preceding siblings ...)
  2019-12-12 14:04 ` [Intel-gfx] [PATCH 05/33] drm/i915/gem: Prepare gen7 cmdparser for async execution Chris Wilson
@ 2019-12-12 14:04 ` Chris Wilson
  2019-12-12 14:04 ` [Intel-gfx] [PATCH 07/33] drm/i915/gt: Mark up ips_mchdev pointer access Chris Wilson
                   ` (27 subsequent siblings)
  32 siblings, 0 replies; 46+ messages in thread
From: Chris Wilson @ 2019-12-12 14:04 UTC (permalink / raw)
  To: intel-gfx

Execute the cmdparser asynchronously as part of the submission pipeline.
Using our dma-fences, we can schedule execution after an asynchronous
piece of work, so we move the cmdparser out from under the struct_mutex
inside execbuf as run it as part of the submission pipeline. The same
security rules apply, we copy the user batch before validation and
userspace cannot touch the validation shadow. The only caveat is that we
will do request construction before we complete cmdparsing and so we
cannot know the outcome of the validation step until later -- so the
execbuf ioctl does not report -EINVAL directly, but we must cancel
execution of the request and flag the error on the out-fence.

Closes: https://gitlab.freedesktop.org/drm/intel/issues/611
Closes: https://gitlab.freedesktop.org/drm/intel/issues/412
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
---
 .../gpu/drm/i915/gem/i915_gem_execbuffer.c    | 90 +++++++++++++++++--
 drivers/gpu/drm/i915/i915_cmd_parser.c        | 41 ++++-----
 2 files changed, 99 insertions(+), 32 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
index 4e546b6fff8e..81eaf812c9da 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
@@ -25,6 +25,7 @@
 #include "i915_gem_clflush.h"
 #include "i915_gem_context.h"
 #include "i915_gem_ioctls.h"
+#include "i915_sw_fence_work.h"
 #include "i915_trace.h"
 
 enum {
@@ -1223,10 +1224,6 @@ static u32 *reloc_gpu(struct i915_execbuffer *eb,
 	if (unlikely(!cache->rq)) {
 		int err;
 
-		/* If we need to copy for the cmdparser, we will stall anyway */
-		if (eb_use_cmdparser(eb))
-			return ERR_PTR(-EWOULDBLOCK);
-
 		if (!intel_engine_can_store_dword(eb->engine))
 			return ERR_PTR(-ENODEV);
 
@@ -1965,6 +1962,85 @@ shadow_batch_pin(struct drm_i915_gem_object *obj,
 	return vma;
 }
 
+struct eb_parse_work {
+	struct dma_fence_work base;
+	struct intel_engine_cs *engine;
+	struct i915_vma *batch;
+	struct i915_vma *shadow;
+	struct i915_vma *trampoline;
+	unsigned int batch_offset;
+	unsigned int batch_length;
+};
+
+static int __eb_parse(struct dma_fence_work *work)
+{
+	struct eb_parse_work *pw = container_of(work, typeof(*pw), base);
+
+	return intel_engine_cmd_parser(pw->engine,
+				       pw->batch,
+				       pw->batch_offset,
+				       pw->batch_length,
+				       pw->shadow,
+				       pw->trampoline);
+}
+
+static const struct dma_fence_work_ops eb_parse_ops = {
+	.name = "eb_parse",
+	.work = __eb_parse,
+};
+
+static int eb_parse_pipeline(struct i915_execbuffer *eb,
+			     struct i915_vma *shadow,
+			     struct i915_vma *trampoline)
+{
+	struct eb_parse_work *pw;
+	int err;
+
+	pw = kzalloc(sizeof(*pw), GFP_KERNEL);
+	if (!pw)
+		return -ENOMEM;
+
+	dma_fence_work_init(&pw->base, &eb_parse_ops);
+
+	pw->engine = eb->engine;
+	pw->batch = eb->batch;
+	pw->batch_offset = eb->batch_start_offset;
+	pw->batch_length = eb->batch_len;
+	pw->shadow = shadow;
+	pw->trampoline = trampoline;
+
+	dma_resv_lock(pw->batch->resv, NULL);
+
+	err = dma_resv_reserve_shared(pw->batch->resv, 1);
+	if (err)
+		goto err_batch_unlock;
+
+	/* Wait for all writes (and relocs) into the batch to complete */
+	err = i915_sw_fence_await_reservation(&pw->base.chain,
+					      pw->batch->resv, NULL, false,
+					      0, I915_FENCE_GFP);
+	if (err < 0)
+		goto err_batch_unlock;
+
+	/* Keep the batch alive and unwritten as we parse */
+	dma_resv_add_shared_fence(pw->batch->resv, &pw->base.dma);
+
+	dma_resv_unlock(pw->batch->resv);
+
+	/* Force execution to wait for completion of the parser */
+	dma_resv_lock(shadow->resv, NULL);
+	dma_resv_add_excl_fence(shadow->resv, &pw->base.dma);
+	dma_resv_unlock(shadow->resv);
+
+	dma_fence_work_commit(&pw->base);
+	return 0;
+
+err_batch_unlock:
+	dma_resv_unlock(pw->batch->resv);
+	kfree(pw);
+	return err;
+}
+
 static int eb_parse(struct i915_execbuffer *eb)
 {
 	struct intel_engine_pool_node *pool;
@@ -2016,11 +2092,7 @@ static int eb_parse(struct i915_execbuffer *eb)
 		eb->batch_flags |= I915_DISPATCH_SECURE;
 	}
 
-	err = intel_engine_cmd_parser(eb->engine,
-				      eb->batch,
-				      eb->batch_start_offset,
-				      eb->batch_len,
-				      shadow, trampoline);
+	err = eb_parse_pipeline(eb, shadow, trampoline);
 	if (err)
 		goto err_trampoline;
 
diff --git a/drivers/gpu/drm/i915/i915_cmd_parser.c b/drivers/gpu/drm/i915/i915_cmd_parser.c
index 34b0ea403a96..7fcdcf31dc76 100644
--- a/drivers/gpu/drm/i915/i915_cmd_parser.c
+++ b/drivers/gpu/drm/i915/i915_cmd_parser.c
@@ -1127,31 +1127,27 @@ find_reg(const struct intel_engine_cs *engine, u32 addr)
 /* Returns a vmap'd pointer to dst_obj, which the caller must unmap */
 static u32 *copy_batch(struct drm_i915_gem_object *dst_obj,
 		       struct drm_i915_gem_object *src_obj,
-		       u32 offset, u32 length,
-		       bool *needs_clflush_after)
+		       u32 offset, u32 length)
 {
-	unsigned int src_needs_clflush;
-	unsigned int dst_needs_clflush;
+	bool needs_clflush;
 	void *dst, *src;
 	int ret;
 
-	ret = i915_gem_object_prepare_write(dst_obj, &dst_needs_clflush);
-	if (ret)
-		return ERR_PTR(ret);
-
 	dst = i915_gem_object_pin_map(dst_obj, I915_MAP_FORCE_WB);
-	i915_gem_object_finish_access(dst_obj);
 	if (IS_ERR(dst))
 		return dst;
 
-	ret = i915_gem_object_prepare_read(src_obj, &src_needs_clflush);
+	ret = i915_gem_object_pin_pages(src_obj);
 	if (ret) {
 		i915_gem_object_unpin_map(dst_obj);
 		return ERR_PTR(ret);
 	}
 
+	needs_clflush =
+		!(src_obj->cache_coherent & I915_BO_CACHE_COHERENT_FOR_READ);
+
 	src = ERR_PTR(-ENODEV);
-	if (src_needs_clflush && i915_has_memcpy_from_wc()) {
+	if (needs_clflush && i915_has_memcpy_from_wc()) {
 		src = i915_gem_object_pin_map(src_obj, I915_MAP_WC);
 		if (!IS_ERR(src)) {
 			i915_unaligned_memcpy_from_wc(dst,
@@ -1172,7 +1168,7 @@ static u32 *copy_batch(struct drm_i915_gem_object *dst_obj,
 		 * We don't care about copying too much here as we only
 		 * validate up to the end of the batch.
 		 */
-		if (dst_needs_clflush & CLFLUSH_BEFORE)
+		if (!(dst_obj->cache_coherent & I915_BO_CACHE_COHERENT_FOR_READ))
 			length = round_up(length,
 					  boot_cpu_data.x86_clflush_size);
 
@@ -1182,7 +1178,7 @@ static u32 *copy_batch(struct drm_i915_gem_object *dst_obj,
 			int len = min_t(int, length, PAGE_SIZE - x);
 
 			src = kmap_atomic(i915_gem_object_get_page(src_obj, n));
-			if (src_needs_clflush)
+			if (needs_clflush)
 				drm_clflush_virt_range(src + x, len);
 			memcpy(ptr, src + x, len);
 			kunmap_atomic(src);
@@ -1193,11 +1189,9 @@ static u32 *copy_batch(struct drm_i915_gem_object *dst_obj,
 		}
 	}
 
-	i915_gem_object_finish_access(src_obj);
+	i915_gem_object_unpin_pages(src_obj);
 
 	/* dst_obj is returned with vmap pinned */
-	*needs_clflush_after = dst_needs_clflush & CLFLUSH_AFTER;
-
 	return dst;
 }
 
@@ -1383,6 +1377,11 @@ static unsigned long *alloc_whitelist(u32 batch_length)
 
 #define LENGTH_BIAS 2
 
+static bool shadow_needs_clflush(struct drm_i915_gem_object *obj)
+{
+	return !(obj->cache_coherent & I915_BO_CACHE_COHERENT_FOR_WRITE);
+}
+
 /**
  * intel_engine_cmd_parser() - parse a batch buffer for privilege violations
  * @engine: the engine on which the batch is to execute
@@ -1398,7 +1397,6 @@ static unsigned long *alloc_whitelist(u32 batch_length)
  * Return: non-zero if the parser finds violations or otherwise fails; -EACCES
  * if the batch appears legal but should use hardware parsing
  */
-
 int intel_engine_cmd_parser(struct intel_engine_cs *engine,
 			    struct i915_vma *batch,
 			    u32 batch_offset,
@@ -1409,7 +1407,6 @@ int intel_engine_cmd_parser(struct intel_engine_cs *engine,
 	u32 *cmd, *batch_end, offset = 0;
 	struct drm_i915_cmd_descriptor default_desc = noop_desc;
 	const struct drm_i915_cmd_descriptor *desc = &default_desc;
-	bool needs_clflush_after = false;
 	unsigned long *jump_whitelist;
 	u64 batch_addr, shadow_addr;
 	int ret = 0;
@@ -1420,9 +1417,7 @@ int intel_engine_cmd_parser(struct intel_engine_cs *engine,
 				     batch->size));
 	GEM_BUG_ON(!batch_length);
 
-	cmd = copy_batch(shadow->obj, batch->obj,
-			 batch_offset, batch_length,
-			 &needs_clflush_after);
+	cmd = copy_batch(shadow->obj, batch->obj, batch_offset, batch_length);
 	if (IS_ERR(cmd)) {
 		DRM_DEBUG("CMD: Failed to copy batch\n");
 		return PTR_ERR(cmd);
@@ -1530,11 +1525,11 @@ int intel_engine_cmd_parser(struct intel_engine_cs *engine,
 			}
 		}
 
-		if (needs_clflush_after)
+		if (shadow_needs_clflush(shadow->obj))
 			drm_clflush_virt_range(batch_end, 8);
 	}
 
-	if (needs_clflush_after) {
+	if (shadow_needs_clflush(shadow->obj)) {
 		void *ptr = page_mask_bits(shadow->obj->mm.mapping);
 
 		drm_clflush_virt_range(ptr, (void *)(cmd + 1) - ptr);
-- 
2.24.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [Intel-gfx] [PATCH 07/33] drm/i915/gt: Mark up ips_mchdev pointer access
  2019-12-12 14:04 [Intel-gfx] [PATCH 01/33] drm/i915: Use EAGAIN for trylock failures Chris Wilson
                   ` (4 preceding siblings ...)
  2019-12-12 14:04 ` [Intel-gfx] [PATCH 06/33] drm/i915/gem: Asynchronous cmdparser Chris Wilson
@ 2019-12-12 14:04 ` Chris Wilson
  2019-12-12 16:39   ` Venkata Sandeep Dhanalakota
  2019-12-12 14:04 ` [Intel-gfx] [PATCH 08/33] drm/i915/gt: Mark context->state vma as active while pinned Chris Wilson
                   ` (26 subsequent siblings)
  32 siblings, 1 reply; 46+ messages in thread
From: Chris Wilson @ 2019-12-12 14:04 UTC (permalink / raw)
  To: intel-gfx

drivers/gpu/drm/i915/gt/intel_rps.c:1726:24: error: incompatible types in comparison expression (different address spaces):
drivers/gpu/drm/i915/gt/intel_rps.c:1726:24:    struct drm_i915_private [noderef] <asn:4> *
drivers/gpu/drm/i915/gt/intel_rps.c:1726:24:    struct drm_i915_private *

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/gt/intel_rps.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_rps.c b/drivers/gpu/drm/i915/gt/intel_rps.c
index fd01e4100fc1..106c9fce9d6c 100644
--- a/drivers/gpu/drm/i915/gt/intel_rps.c
+++ b/drivers/gpu/drm/i915/gt/intel_rps.c
@@ -1723,7 +1723,7 @@ void intel_rps_driver_register(struct intel_rps *rps)
 
 void intel_rps_driver_unregister(struct intel_rps *rps)
 {
-	if (ips_mchdev == rps_to_i915(rps))
+	if (rcu_access_pointer(ips_mchdev) == rps_to_i915(rps))
 		rcu_assign_pointer(ips_mchdev, NULL);
 }
 
-- 
2.24.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [Intel-gfx] [PATCH 08/33] drm/i915/gt: Mark context->state vma as active while pinned
  2019-12-12 14:04 [Intel-gfx] [PATCH 01/33] drm/i915: Use EAGAIN for trylock failures Chris Wilson
                   ` (5 preceding siblings ...)
  2019-12-12 14:04 ` [Intel-gfx] [PATCH 07/33] drm/i915/gt: Mark up ips_mchdev pointer access Chris Wilson
@ 2019-12-12 14:04 ` Chris Wilson
  2019-12-12 19:55   ` Lionel Landwerlin
  2019-12-12 14:04 ` [Intel-gfx] [PATCH 09/33] drm/i915/gt: Mark ring->vma " Chris Wilson
                   ` (25 subsequent siblings)
  32 siblings, 1 reply; 46+ messages in thread
From: Chris Wilson @ 2019-12-12 14:04 UTC (permalink / raw)
  To: intel-gfx

As we use the active state to keep the vma alive while we are reading
its contents during GPU error capture, we need to mark the
context->state vma as active during execution if we want to include it
in the error state.

Reported-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: b1e3177bd1d8 ("drm/i915: Coordinate i915_active with its own mutex")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
---
 drivers/gpu/drm/i915/gt/intel_context.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/drivers/gpu/drm/i915/gt/intel_context.c b/drivers/gpu/drm/i915/gt/intel_context.c
index 61c39e943f69..f7e2f3af007a 100644
--- a/drivers/gpu/drm/i915/gt/intel_context.c
+++ b/drivers/gpu/drm/i915/gt/intel_context.c
@@ -120,6 +120,10 @@ static int __context_pin_state(struct i915_vma *vma)
 	if (err)
 		return err;
 
+	err = i915_active_acquire(&vma->active);
+	if (err)
+		goto err_unpin;
+
 	/*
 	 * And mark it as a globally pinned object to let the shrinker know
 	 * it cannot reclaim the object until we release it.
@@ -128,11 +132,16 @@ static int __context_pin_state(struct i915_vma *vma)
 	vma->obj->mm.dirty = true;
 
 	return 0;
+
+err_unpin:
+	i915_vma_unpin(vma);
+	return err;
 }
 
 static void __context_unpin_state(struct i915_vma *vma)
 {
 	i915_vma_make_shrinkable(vma);
+	i915_active_release(&vma->active);
 	__i915_vma_unpin(vma);
 }
 
-- 
2.24.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [Intel-gfx] [PATCH 09/33] drm/i915/gt: Mark ring->vma as active while pinned
  2019-12-12 14:04 [Intel-gfx] [PATCH 01/33] drm/i915: Use EAGAIN for trylock failures Chris Wilson
                   ` (6 preceding siblings ...)
  2019-12-12 14:04 ` [Intel-gfx] [PATCH 08/33] drm/i915/gt: Mark context->state vma as active while pinned Chris Wilson
@ 2019-12-12 14:04 ` Chris Wilson
  2019-12-12 19:55   ` Lionel Landwerlin
  2019-12-12 14:04 ` [Intel-gfx] [PATCH 10/33] drm/i915/selftests: Disable heartbeats around long queues Chris Wilson
                   ` (24 subsequent siblings)
  32 siblings, 1 reply; 46+ messages in thread
From: Chris Wilson @ 2019-12-12 14:04 UTC (permalink / raw)
  To: intel-gfx

As we use the active state to keep the vma alive while we are reading
its contents during GPU error capture, we need to mark the
ring->vma as active during execution if we want to include the rinbuffer
in the error state.

Reported-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: b1e3177bd1d8 ("drm/i915: Coordinate i915_active with its own mutex")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
---
 drivers/gpu/drm/i915/gt/intel_ring.c | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_ring.c b/drivers/gpu/drm/i915/gt/intel_ring.c
index 374b28f13ca0..7a27264150b9 100644
--- a/drivers/gpu/drm/i915/gt/intel_ring.c
+++ b/drivers/gpu/drm/i915/gt/intel_ring.c
@@ -45,6 +45,10 @@ int intel_ring_pin(struct intel_ring *ring)
 	if (unlikely(ret))
 		goto err_unpin;
 
+	ret = i915_active_acquire(&vma->active);
+	if (ret)
+		goto err_ring;
+
 	if (i915_vma_is_map_and_fenceable(vma))
 		addr = (void __force *)i915_vma_pin_iomap(vma);
 	else
@@ -52,7 +56,7 @@ int intel_ring_pin(struct intel_ring *ring)
 					       i915_coherent_map_type(vma->vm->i915));
 	if (IS_ERR(addr)) {
 		ret = PTR_ERR(addr);
-		goto err_ring;
+		goto err_active;
 	}
 
 	i915_vma_make_unshrinkable(vma);
@@ -63,6 +67,8 @@ int intel_ring_pin(struct intel_ring *ring)
 	ring->vaddr = addr;
 	return 0;
 
+err_active:
+	i915_active_release(&vma->active);
 err_ring:
 	i915_vma_unpin(vma);
 err_unpin:
@@ -93,6 +99,8 @@ void intel_ring_unpin(struct intel_ring *ring)
 		i915_gem_object_unpin_map(vma->obj);
 
 	i915_vma_make_purgeable(vma);
+
+	i915_active_release(&vma->active);
 	i915_vma_unpin(vma);
 }
 
-- 
2.24.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [Intel-gfx] [PATCH 10/33] drm/i915/selftests: Disable heartbeats around long queues
  2019-12-12 14:04 [Intel-gfx] [PATCH 01/33] drm/i915: Use EAGAIN for trylock failures Chris Wilson
                   ` (7 preceding siblings ...)
  2019-12-12 14:04 ` [Intel-gfx] [PATCH 09/33] drm/i915/gt: Mark ring->vma " Chris Wilson
@ 2019-12-12 14:04 ` Chris Wilson
  2019-12-12 14:04 ` [Intel-gfx] [PATCH 11/33] drm/i915/selftests: Impose a timeout for request submission Chris Wilson
                   ` (23 subsequent siblings)
  32 siblings, 0 replies; 46+ messages in thread
From: Chris Wilson @ 2019-12-12 14:04 UTC (permalink / raw)
  To: intel-gfx

For some execlists scheduler tests we assume very precise layout of the
inflight queue and become angry if the heartbeat interferes by
reprioritising our contexts (because we happen to be using the same
engine->kernel_context for our test).

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/gt/selftest_lrc.c | 42 +++++++++++++++++++++-----
 1 file changed, 34 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/selftest_lrc.c b/drivers/gpu/drm/i915/gt/selftest_lrc.c
index ac8b9116d307..54bce282717a 100644
--- a/drivers/gpu/drm/i915/gt/selftest_lrc.c
+++ b/drivers/gpu/drm/i915/gt/selftest_lrc.c
@@ -50,6 +50,24 @@ static struct i915_vma *create_scratch(struct intel_gt *gt)
 	return vma;
 }
 
+static void engine_heartbeat_disable(struct intel_engine_cs *engine,
+				     unsigned long *saved)
+{
+	*saved = engine->props.heartbeat_interval_ms;
+	engine->props.heartbeat_interval_ms = 0;
+
+	intel_engine_pm_get(engine);
+	intel_engine_park_heartbeat(engine);
+}
+
+static void engine_heartbeat_enable(struct intel_engine_cs *engine,
+				    unsigned long saved)
+{
+	intel_engine_pm_put(engine);
+
+	engine->props.heartbeat_interval_ms = saved;
+}
+
 static int live_sanitycheck(void *arg)
 {
 	struct intel_gt *gt = arg;
@@ -128,6 +146,7 @@ static int live_unlite_restore(struct intel_gt *gt, int prio)
 		struct intel_context *ce[2] = {};
 		struct i915_request *rq[2];
 		struct igt_live_test t;
+		unsigned long saved;
 		int n;
 
 		if (prio && !intel_engine_has_preemption(engine))
@@ -140,6 +159,7 @@ static int live_unlite_restore(struct intel_gt *gt, int prio)
 			err = -EIO;
 			break;
 		}
+		engine_heartbeat_disable(engine, &saved);
 
 		for (n = 0; n < ARRAY_SIZE(ce); n++) {
 			struct intel_context *tmp;
@@ -247,6 +267,7 @@ static int live_unlite_restore(struct intel_gt *gt, int prio)
 			intel_context_put(ce[n]);
 		}
 
+		engine_heartbeat_enable(engine, saved);
 		if (igt_live_test_end(&t))
 			err = -EIO;
 		if (err)
@@ -468,12 +489,16 @@ static int live_timeslice_preempt(void *arg)
 		enum intel_engine_id id;
 
 		for_each_engine(engine, gt, id) {
+			unsigned long saved;
+
 			if (!intel_engine_has_preemption(engine))
 				continue;
 
 			memset(vaddr, 0, PAGE_SIZE);
 
+			engine_heartbeat_disable(engine, &saved);
 			err = slice_semaphore_queue(engine, vma, count);
+			engine_heartbeat_enable(engine, saved);
 			if (err)
 				goto err_pin;
 
@@ -566,17 +591,19 @@ static int live_timeslice_queue(void *arg)
 			.priority = I915_USER_PRIORITY(I915_PRIORITY_MAX),
 		};
 		struct i915_request *rq, *nop;
+		unsigned long saved;
 
 		if (!intel_engine_has_preemption(engine))
 			continue;
 
+		engine_heartbeat_disable(engine, &saved);
 		memset(vaddr, 0, PAGE_SIZE);
 
 		/* ELSP[0]: semaphore wait */
 		rq = semaphore_queue(engine, vma, 0);
 		if (IS_ERR(rq)) {
 			err = PTR_ERR(rq);
-			goto err_pin;
+			goto err_heartbeat;
 		}
 		engine->schedule(rq, &attr);
 		wait_for_submit(engine, rq);
@@ -585,8 +612,7 @@ static int live_timeslice_queue(void *arg)
 		nop = nop_request(engine);
 		if (IS_ERR(nop)) {
 			err = PTR_ERR(nop);
-			i915_request_put(rq);
-			goto err_pin;
+			goto err_rq;
 		}
 		wait_for_submit(engine, nop);
 		i915_request_put(nop);
@@ -596,10 +622,8 @@ static int live_timeslice_queue(void *arg)
 
 		/* Queue: semaphore signal, matching priority as semaphore */
 		err = release_queue(engine, vma, 1, effective_prio(rq));
-		if (err) {
-			i915_request_put(rq);
-			goto err_pin;
-		}
+		if (err)
+			goto err_rq;
 
 		intel_engine_flush_submission(engine);
 		if (!READ_ONCE(engine->execlists.timer.expires) &&
@@ -630,12 +654,14 @@ static int live_timeslice_queue(void *arg)
 			memset(vaddr, 0xff, PAGE_SIZE);
 			err = -EIO;
 		}
+err_rq:
 		i915_request_put(rq);
+err_heartbeat:
+		engine_heartbeat_enable(engine, saved);
 		if (err)
 			break;
 	}
 
-err_pin:
 	i915_vma_unpin(vma);
 err_map:
 	i915_gem_object_unpin_map(obj);
-- 
2.24.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [Intel-gfx] [PATCH 11/33] drm/i915/selftests: Impose a timeout for request submission
  2019-12-12 14:04 [Intel-gfx] [PATCH 01/33] drm/i915: Use EAGAIN for trylock failures Chris Wilson
                   ` (8 preceding siblings ...)
  2019-12-12 14:04 ` [Intel-gfx] [PATCH 10/33] drm/i915/selftests: Disable heartbeats around long queues Chris Wilson
@ 2019-12-12 14:04 ` Chris Wilson
  2019-12-12 14:04 ` [Intel-gfx] [PATCH 12/33] drm/i915/gt: Remove direct invocation of breadcrumb signaling Chris Wilson
                   ` (22 subsequent siblings)
  32 siblings, 0 replies; 46+ messages in thread
From: Chris Wilson @ 2019-12-12 14:04 UTC (permalink / raw)
  To: intel-gfx

Avoid spinning indefinitely waiting for the request to be submitted, and
instead apply a timeout. A secondary benefit is that the error message
will show which suspect is blocked.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/gt/selftest_lrc.c | 26 +++++++++++++++++++++-----
 1 file changed, 21 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/selftest_lrc.c b/drivers/gpu/drm/i915/gt/selftest_lrc.c
index 54bce282717a..3976198eff37 100644
--- a/drivers/gpu/drm/i915/gt/selftest_lrc.c
+++ b/drivers/gpu/drm/i915/gt/selftest_lrc.c
@@ -532,13 +532,19 @@ static struct i915_request *nop_request(struct intel_engine_cs *engine)
 	return rq;
 }
 
-static void wait_for_submit(struct intel_engine_cs *engine,
-			    struct i915_request *rq)
+static int wait_for_submit(struct intel_engine_cs *engine,
+			   struct i915_request *rq,
+			   unsigned long timeout)
 {
+	timeout += jiffies;
 	do {
 		cond_resched();
 		intel_engine_flush_submission(engine);
-	} while (!i915_request_is_active(rq));
+		if (i915_request_is_active(rq))
+			return 0;
+	} while (time_before(jiffies, timeout));
+
+	return -ETIME;
 }
 
 static long timeslice_threshold(const struct intel_engine_cs *engine)
@@ -606,7 +612,12 @@ static int live_timeslice_queue(void *arg)
 			goto err_heartbeat;
 		}
 		engine->schedule(rq, &attr);
-		wait_for_submit(engine, rq);
+		err = wait_for_submit(engine, rq, HZ / 2);
+		if (err) {
+			pr_err("%s: Timed out trying to submit semaphores\n",
+			       engine->name);
+			goto err_rq;
+		}
 
 		/* ELSP[1]: nop request */
 		nop = nop_request(engine);
@@ -614,8 +625,13 @@ static int live_timeslice_queue(void *arg)
 			err = PTR_ERR(nop);
 			goto err_rq;
 		}
-		wait_for_submit(engine, nop);
+		err = wait_for_submit(engine, nop, HZ / 2);
 		i915_request_put(nop);
+		if (err) {
+			pr_err("%s: Timed out trying to submit nop\n",
+			       engine->name);
+			goto err_rq;
+		}
 
 		GEM_BUG_ON(i915_request_completed(rq));
 		GEM_BUG_ON(execlists_active(&engine->execlists) != rq);
-- 
2.24.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [Intel-gfx] [PATCH 12/33] drm/i915/gt: Remove direct invocation of breadcrumb signaling
  2019-12-12 14:04 [Intel-gfx] [PATCH 01/33] drm/i915: Use EAGAIN for trylock failures Chris Wilson
                   ` (9 preceding siblings ...)
  2019-12-12 14:04 ` [Intel-gfx] [PATCH 11/33] drm/i915/selftests: Impose a timeout for request submission Chris Wilson
@ 2019-12-12 14:04 ` Chris Wilson
  2019-12-12 14:04 ` [Intel-gfx] [PATCH 13/33] drm/i915: Tweak scheduler's kick_submission() Chris Wilson
                   ` (21 subsequent siblings)
  32 siblings, 0 replies; 46+ messages in thread
From: Chris Wilson @ 2019-12-12 14:04 UTC (permalink / raw)
  To: intel-gfx

Only signal the breadcrumbs from inside the irq_work, simplifying our
interface and calling conventions. The micro-optimisation here is that
by always using the irq_work interface, we know we are always inside an
irq-off critical section for the breadcrumb signaling and can ellide
save/restore of the irq flags.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/gt/intel_breadcrumbs.c   | 27 +++++++------------
 drivers/gpu/drm/i915/gt/intel_engine.h        |  4 +--
 drivers/gpu/drm/i915/gt/intel_gt_irq.c        | 12 ++++-----
 drivers/gpu/drm/i915/gt/intel_lrc.c           |  2 +-
 drivers/gpu/drm/i915/gt/intel_reset.c         |  4 +--
 .../gpu/drm/i915/gt/intel_ring_submission.c   |  2 +-
 drivers/gpu/drm/i915/gt/intel_rps.c           |  2 +-
 drivers/gpu/drm/i915/gt/mock_engine.c         |  2 +-
 drivers/gpu/drm/i915/i915_irq.c               |  8 +++---
 drivers/gpu/drm/i915/i915_request.c           |  2 +-
 10 files changed, 27 insertions(+), 38 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c b/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c
index 8a9facf4f3b6..5fa4d621528e 100644
--- a/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c
+++ b/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c
@@ -130,16 +130,15 @@ __dma_fence_signal__notify(struct dma_fence *fence,
 	}
 }
 
-void intel_engine_breadcrumbs_irq(struct intel_engine_cs *engine)
+static void signal_irq_work(struct irq_work *work)
 {
-	struct intel_breadcrumbs *b = &engine->breadcrumbs;
+	struct intel_breadcrumbs *b = container_of(work, typeof(*b), irq_work);
 	const ktime_t timestamp = ktime_get();
 	struct intel_context *ce, *cn;
 	struct list_head *pos, *next;
-	unsigned long flags;
 	LIST_HEAD(signal);
 
-	spin_lock_irqsave(&b->irq_lock, flags);
+	spin_lock(&b->irq_lock);
 
 	if (b->irq_armed && list_empty(&b->signalers))
 		__intel_breadcrumbs_disarm_irq(b);
@@ -185,31 +184,23 @@ void intel_engine_breadcrumbs_irq(struct intel_engine_cs *engine)
 		}
 	}
 
-	spin_unlock_irqrestore(&b->irq_lock, flags);
+	spin_unlock(&b->irq_lock);
 
 	list_for_each_safe(pos, next, &signal) {
 		struct i915_request *rq =
 			list_entry(pos, typeof(*rq), signal_link);
 		struct list_head cb_list;
 
-		spin_lock_irqsave(&rq->lock, flags);
+		spin_lock(&rq->lock);
 		list_replace(&rq->fence.cb_list, &cb_list);
 		__dma_fence_signal__timestamp(&rq->fence, timestamp);
 		__dma_fence_signal__notify(&rq->fence, &cb_list);
-		spin_unlock_irqrestore(&rq->lock, flags);
+		spin_unlock(&rq->lock);
 
 		i915_request_put(rq);
 	}
 }
 
-static void signal_irq_work(struct irq_work *work)
-{
-	struct intel_engine_cs *engine =
-		container_of(work, typeof(*engine), breadcrumbs.irq_work);
-
-	intel_engine_breadcrumbs_irq(engine);
-}
-
 static bool __intel_breadcrumbs_arm_irq(struct intel_breadcrumbs *b)
 {
 	struct intel_engine_cs *engine =
@@ -290,9 +281,9 @@ bool i915_request_enable_breadcrumb(struct i915_request *rq)
 
 		/*
 		 * We keep the seqno in retirement order, so we can break
-		 * inside intel_engine_breadcrumbs_irq as soon as we've passed
-		 * the last completed request (or seen a request that hasn't
-		 * event started). We could iterate the timeline->requests list,
+		 * inside intel_engine_signal_breadcrumbs as soon as we've
+		 * passed the last completed request (or seen a request that
+		 * hasn't event started). We could walk the timeline->requests,
 		 * but keeping a separate signalers_list has the advantage of
 		 * hopefully being much smaller than the full list and so
 		 * provides faster iteration and detection when there are no
diff --git a/drivers/gpu/drm/i915/gt/intel_engine.h b/drivers/gpu/drm/i915/gt/intel_engine.h
index c294ea80605e..d9f9ef24fbff 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine.h
+++ b/drivers/gpu/drm/i915/gt/intel_engine.h
@@ -206,13 +206,11 @@ void intel_engine_fini_breadcrumbs(struct intel_engine_cs *engine);
 void intel_engine_disarm_breadcrumbs(struct intel_engine_cs *engine);
 
 static inline void
-intel_engine_queue_breadcrumbs(struct intel_engine_cs *engine)
+intel_engine_signal_breadcrumbs(struct intel_engine_cs *engine)
 {
 	irq_work_queue(&engine->breadcrumbs.irq_work);
 }
 
-void intel_engine_breadcrumbs_irq(struct intel_engine_cs *engine);
-
 void intel_engine_reset_breadcrumbs(struct intel_engine_cs *engine);
 void intel_engine_fini_breadcrumbs(struct intel_engine_cs *engine);
 
diff --git a/drivers/gpu/drm/i915/gt/intel_gt_irq.c b/drivers/gpu/drm/i915/gt/intel_gt_irq.c
index 332b12a574fb..f796bdf1ed30 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_irq.c
+++ b/drivers/gpu/drm/i915/gt/intel_gt_irq.c
@@ -28,7 +28,7 @@ cs_irq_handler(struct intel_engine_cs *engine, u32 iir)
 		tasklet = true;
 
 	if (iir & GT_RENDER_USER_INTERRUPT) {
-		intel_engine_queue_breadcrumbs(engine);
+		intel_engine_signal_breadcrumbs(engine);
 		tasklet |= intel_engine_needs_breadcrumb_tasklet(engine);
 	}
 
@@ -245,9 +245,9 @@ void gen11_gt_irq_postinstall(struct intel_gt *gt)
 void gen5_gt_irq_handler(struct intel_gt *gt, u32 gt_iir)
 {
 	if (gt_iir & GT_RENDER_USER_INTERRUPT)
-		intel_engine_breadcrumbs_irq(gt->engine_class[RENDER_CLASS][0]);
+		intel_engine_signal_breadcrumbs(gt->engine_class[RENDER_CLASS][0]);
 	if (gt_iir & ILK_BSD_USER_INTERRUPT)
-		intel_engine_breadcrumbs_irq(gt->engine_class[VIDEO_DECODE_CLASS][0]);
+		intel_engine_signal_breadcrumbs(gt->engine_class[VIDEO_DECODE_CLASS][0]);
 }
 
 static void gen7_parity_error_irq_handler(struct intel_gt *gt, u32 iir)
@@ -271,11 +271,11 @@ static void gen7_parity_error_irq_handler(struct intel_gt *gt, u32 iir)
 void gen6_gt_irq_handler(struct intel_gt *gt, u32 gt_iir)
 {
 	if (gt_iir & GT_RENDER_USER_INTERRUPT)
-		intel_engine_breadcrumbs_irq(gt->engine_class[RENDER_CLASS][0]);
+		intel_engine_signal_breadcrumbs(gt->engine_class[RENDER_CLASS][0]);
 	if (gt_iir & GT_BSD_USER_INTERRUPT)
-		intel_engine_breadcrumbs_irq(gt->engine_class[VIDEO_DECODE_CLASS][0]);
+		intel_engine_signal_breadcrumbs(gt->engine_class[VIDEO_DECODE_CLASS][0]);
 	if (gt_iir & GT_BLT_USER_INTERRUPT)
-		intel_engine_breadcrumbs_irq(gt->engine_class[COPY_ENGINE_CLASS][0]);
+		intel_engine_signal_breadcrumbs(gt->engine_class[COPY_ENGINE_CLASS][0]);
 
 	if (gt_iir & (GT_BLT_CS_ERROR_INTERRUPT |
 		      GT_BSD_CS_ERROR_INTERRUPT |
diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
index 929f6bae4eba..088901484932 100644
--- a/drivers/gpu/drm/i915/gt/intel_lrc.c
+++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
@@ -1484,7 +1484,7 @@ static void virtual_xfer_breadcrumbs(struct virtual_engine *ve,
 	if (!list_empty(&ve->context.signal_link)) {
 		list_move_tail(&ve->context.signal_link,
 			       &engine->breadcrumbs.signalers);
-		intel_engine_queue_breadcrumbs(engine);
+		intel_engine_signal_breadcrumbs(engine);
 	}
 	spin_unlock(&old->breadcrumbs.irq_lock);
 }
diff --git a/drivers/gpu/drm/i915/gt/intel_reset.c b/drivers/gpu/drm/i915/gt/intel_reset.c
index 8408cb84e52c..6b2abe1657ad 100644
--- a/drivers/gpu/drm/i915/gt/intel_reset.c
+++ b/drivers/gpu/drm/i915/gt/intel_reset.c
@@ -742,7 +742,7 @@ static void reset_finish_engine(struct intel_engine_cs *engine)
 	engine->reset.finish(engine);
 	intel_uncore_forcewake_put(engine->uncore, FORCEWAKE_ALL);
 
-	intel_engine_breadcrumbs_irq(engine);
+	intel_engine_signal_breadcrumbs(engine);
 }
 
 static void reset_finish(struct intel_gt *gt, intel_engine_mask_t awake)
@@ -771,7 +771,7 @@ static void nop_submit_request(struct i915_request *request)
 	i915_request_mark_complete(request);
 	spin_unlock_irqrestore(&engine->active.lock, flags);
 
-	intel_engine_queue_breadcrumbs(engine);
+	intel_engine_signal_breadcrumbs(engine);
 }
 
 static void __intel_gt_set_wedged(struct intel_gt *gt)
diff --git a/drivers/gpu/drm/i915/gt/intel_ring_submission.c b/drivers/gpu/drm/i915/gt/intel_ring_submission.c
index 5c22ca6f998a..e01766c5c06d 100644
--- a/drivers/gpu/drm/i915/gt/intel_ring_submission.c
+++ b/drivers/gpu/drm/i915/gt/intel_ring_submission.c
@@ -720,7 +720,7 @@ static int xcs_resume(struct intel_engine_cs *engine)
 	}
 
 	/* Papering over lost _interrupts_ immediately following the restart */
-	intel_engine_queue_breadcrumbs(engine);
+	intel_engine_signal_breadcrumbs(engine);
 out:
 	intel_uncore_forcewake_put(engine->uncore, FORCEWAKE_ALL);
 
diff --git a/drivers/gpu/drm/i915/gt/intel_rps.c b/drivers/gpu/drm/i915/gt/intel_rps.c
index 106c9fce9d6c..5198c3185dae 100644
--- a/drivers/gpu/drm/i915/gt/intel_rps.c
+++ b/drivers/gpu/drm/i915/gt/intel_rps.c
@@ -1566,7 +1566,7 @@ void gen6_rps_irq_handler(struct intel_rps *rps, u32 pm_iir)
 		return;
 
 	if (pm_iir & PM_VEBOX_USER_INTERRUPT)
-		intel_engine_breadcrumbs_irq(gt->engine[VECS0]);
+		intel_engine_signal_breadcrumbs(gt->engine[VECS0]);
 
 	if (pm_iir & PM_VEBOX_CS_ERROR_INTERRUPT)
 		DRM_DEBUG("Command parser error, pm_iir 0x%08x\n", pm_iir);
diff --git a/drivers/gpu/drm/i915/gt/mock_engine.c b/drivers/gpu/drm/i915/gt/mock_engine.c
index 83f549d203a0..39df9d49a134 100644
--- a/drivers/gpu/drm/i915/gt/mock_engine.c
+++ b/drivers/gpu/drm/i915/gt/mock_engine.c
@@ -77,7 +77,7 @@ static void advance(struct i915_request *request)
 	i915_request_mark_complete(request);
 	GEM_BUG_ON(!i915_request_completed(request));
 
-	intel_engine_queue_breadcrumbs(request->engine);
+	intel_engine_signal_breadcrumbs(request->engine);
 }
 
 static void hw_delay_complete(struct timer_list *t)
diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index a5348f79114f..42b79f577500 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -3619,7 +3619,7 @@ static irqreturn_t i8xx_irq_handler(int irq, void *arg)
 		intel_uncore_write16(&dev_priv->uncore, GEN2_IIR, iir);
 
 		if (iir & I915_USER_INTERRUPT)
-			intel_engine_breadcrumbs_irq(dev_priv->engine[RCS0]);
+			intel_engine_signal_breadcrumbs(dev_priv->engine[RCS0]);
 
 		if (iir & I915_MASTER_ERROR_INTERRUPT)
 			i8xx_error_irq_handler(dev_priv, eir, eir_stuck);
@@ -3724,7 +3724,7 @@ static irqreturn_t i915_irq_handler(int irq, void *arg)
 		I915_WRITE(GEN2_IIR, iir);
 
 		if (iir & I915_USER_INTERRUPT)
-			intel_engine_breadcrumbs_irq(dev_priv->engine[RCS0]);
+			intel_engine_signal_breadcrumbs(dev_priv->engine[RCS0]);
 
 		if (iir & I915_MASTER_ERROR_INTERRUPT)
 			i9xx_error_irq_handler(dev_priv, eir, eir_stuck);
@@ -3866,10 +3866,10 @@ static irqreturn_t i965_irq_handler(int irq, void *arg)
 		I915_WRITE(GEN2_IIR, iir);
 
 		if (iir & I915_USER_INTERRUPT)
-			intel_engine_breadcrumbs_irq(dev_priv->engine[RCS0]);
+			intel_engine_signal_breadcrumbs(dev_priv->engine[RCS0]);
 
 		if (iir & I915_BSD_USER_INTERRUPT)
-			intel_engine_breadcrumbs_irq(dev_priv->engine[VCS0]);
+			intel_engine_signal_breadcrumbs(dev_priv->engine[VCS0]);
 
 		if (iir & I915_MASTER_ERROR_INTERRUPT)
 			i9xx_error_irq_handler(dev_priv, eir, eir_stuck);
diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
index f513747ff027..5d8692642c6b 100644
--- a/drivers/gpu/drm/i915/i915_request.c
+++ b/drivers/gpu/drm/i915/i915_request.c
@@ -417,7 +417,7 @@ bool __i915_request_submit(struct i915_request *request)
 	if (test_bit(DMA_FENCE_FLAG_ENABLE_SIGNAL_BIT, &request->fence.flags) &&
 	    !test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &request->fence.flags) &&
 	    !i915_request_enable_breadcrumb(request))
-		intel_engine_queue_breadcrumbs(engine);
+		intel_engine_signal_breadcrumbs(engine);
 
 	__notify_execute_cb(request);
 
-- 
2.24.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [Intel-gfx] [PATCH 13/33] drm/i915: Tweak scheduler's kick_submission()
  2019-12-12 14:04 [Intel-gfx] [PATCH 01/33] drm/i915: Use EAGAIN for trylock failures Chris Wilson
                   ` (10 preceding siblings ...)
  2019-12-12 14:04 ` [Intel-gfx] [PATCH 12/33] drm/i915/gt: Remove direct invocation of breadcrumb signaling Chris Wilson
@ 2019-12-12 14:04 ` Chris Wilson
  2019-12-12 14:04 ` [Intel-gfx] [PATCH 14/33] drm/i915: Flush idle barriers when waiting Chris Wilson
                   ` (20 subsequent siblings)
  32 siblings, 0 replies; 46+ messages in thread
From: Chris Wilson @ 2019-12-12 14:04 UTC (permalink / raw)
  To: intel-gfx

Skip useless priority bumping on adding a new dependency, but otherwise
prevent tasklet scheduling until we have completed all the potential
rescheduling.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/i915_scheduler.c | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_scheduler.c b/drivers/gpu/drm/i915/i915_scheduler.c
index 2bc2aa46a1b9..00f5d107f2b7 100644
--- a/drivers/gpu/drm/i915/i915_scheduler.c
+++ b/drivers/gpu/drm/i915/i915_scheduler.c
@@ -216,12 +216,13 @@ static void kick_submission(struct intel_engine_cs *engine,
 	if (inflight->hw_context == rq->hw_context)
 		goto unlock;
 
-	engine->execlists.queue_priority_hint = prio;
 	if (need_preempt(prio, rq_prio(inflight)))
 		tasklet_hi_schedule(&engine->execlists.tasklet);
 
 unlock:
 	rcu_read_unlock();
+
+	engine->execlists.queue_priority_hint = prio;
 }
 
 static void __i915_schedule(struct i915_sched_node *node,
@@ -365,6 +366,9 @@ static void __bump_priority(struct i915_sched_node *node, unsigned int bump)
 {
 	struct i915_sched_attr attr = node->attr;
 
+	if (attr.priority & bump)
+		return;
+
 	attr.priority |= bump;
 	__i915_schedule(node, &attr);
 }
@@ -463,11 +467,15 @@ int i915_sched_node_add_dependency(struct i915_sched_node *node,
 	if (!dep)
 		return -ENOMEM;
 
+	local_bh_disable();
+
 	if (!__i915_sched_node_add_dependency(node, signal, dep,
 					      I915_DEPENDENCY_EXTERNAL |
 					      I915_DEPENDENCY_ALLOC))
 		i915_dependency_free(dep);
 
+	local_bh_enable(); /* kick submission tasklet */
+
 	return 0;
 }
 
-- 
2.24.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [Intel-gfx] [PATCH 14/33] drm/i915: Flush idle barriers when waiting
  2019-12-12 14:04 [Intel-gfx] [PATCH 01/33] drm/i915: Use EAGAIN for trylock failures Chris Wilson
                   ` (11 preceding siblings ...)
  2019-12-12 14:04 ` [Intel-gfx] [PATCH 13/33] drm/i915: Tweak scheduler's kick_submission() Chris Wilson
@ 2019-12-12 14:04 ` Chris Wilson
  2019-12-12 14:04 ` [Intel-gfx] [PATCH 15/33] drm/i915: Allow userspace to specify ringsize on construction Chris Wilson
                   ` (19 subsequent siblings)
  32 siblings, 0 replies; 46+ messages in thread
From: Chris Wilson @ 2019-12-12 14:04 UTC (permalink / raw)
  To: intel-gfx

If we do find ourselves with an idle barrier inside our active while
waiting, attempt to flush it by emitting a pulse using the kernel
context.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/i915_active.c           | 21 ++++++++-
 drivers/gpu/drm/i915/selftests/i915_active.c | 46 ++++++++++++++++++++
 2 files changed, 65 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_active.c b/drivers/gpu/drm/i915/i915_active.c
index 3d0edde84705..eb5c219188a1 100644
--- a/drivers/gpu/drm/i915/i915_active.c
+++ b/drivers/gpu/drm/i915/i915_active.c
@@ -6,6 +6,7 @@
 
 #include <linux/debugobjects.h>
 
+#include "gt/intel_engine_heartbeat.h"
 #include "gt/intel_engine_pm.h"
 #include "gt/intel_ring.h"
 
@@ -447,6 +448,21 @@ static void enable_signaling(struct i915_active_fence *active)
 	dma_fence_put(fence);
 }
 
+static int flush_barrier(struct active_node *it)
+{
+	struct intel_engine_cs *engine;
+
+	if (!is_barrier(&it->base))
+		return 0;
+
+	engine = __barrier_to_engine(it);
+	smp_rmb(); /* serialise with add_active_barriers */
+	if (!is_barrier(&it->base))
+		return 0;
+
+	return intel_engine_flush_barriers(engine);
+}
+
 int i915_active_wait(struct i915_active *ref)
 {
 	struct active_node *it, *n;
@@ -460,8 +476,9 @@ int i915_active_wait(struct i915_active *ref)
 	/* Flush lazy signals */
 	enable_signaling(&ref->excl);
 	rbtree_postorder_for_each_entry_safe(it, n, &ref->tree, node) {
-		if (is_barrier(&it->base)) /* unconnected idle barrier */
-			continue;
+		err = flush_barrier(it); /* unconnected idle barrier? */
+		if (err)
+			break;
 
 		enable_signaling(&it->base);
 	}
diff --git a/drivers/gpu/drm/i915/selftests/i915_active.c b/drivers/gpu/drm/i915/selftests/i915_active.c
index ef572a0c2566..067e30b8927f 100644
--- a/drivers/gpu/drm/i915/selftests/i915_active.c
+++ b/drivers/gpu/drm/i915/selftests/i915_active.c
@@ -201,11 +201,57 @@ static int live_active_retire(void *arg)
 	return err;
 }
 
+static int live_active_barrier(void *arg)
+{
+	struct drm_i915_private *i915 = arg;
+	struct intel_engine_cs *engine;
+	struct live_active *active;
+	int err = 0;
+
+	/* Check that we get a callback when requests retire upon waiting */
+
+	active = __live_alloc(i915);
+	if (!active)
+		return -ENOMEM;
+
+	err = i915_active_acquire(&active->base);
+	if (err)
+		goto out;
+
+	for_each_uabi_engine(engine, i915) {
+		err = i915_active_acquire_preallocate_barrier(&active->base,
+							      engine);
+		if (err)
+			break;
+
+		i915_active_acquire_barrier(&active->base);
+	}
+
+	i915_active_release(&active->base);
+
+	if (err == 0)
+		err = i915_active_wait(&active->base);
+
+	if (err == 0 && !READ_ONCE(active->retired)) {
+		pr_err("i915_active not retired after flushing barriers!\n");
+		err = -EINVAL;
+	}
+
+out:
+	__live_put(active);
+
+	if (igt_flush_test(i915))
+		err = -EIO;
+
+	return err;
+}
+
 int i915_active_live_selftests(struct drm_i915_private *i915)
 {
 	static const struct i915_subtest tests[] = {
 		SUBTEST(live_active_wait),
 		SUBTEST(live_active_retire),
+		SUBTEST(live_active_barrier),
 	};
 
 	if (intel_gt_is_wedged(&i915->gt))
-- 
2.24.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [Intel-gfx] [PATCH 15/33] drm/i915: Allow userspace to specify ringsize on construction
  2019-12-12 14:04 [Intel-gfx] [PATCH 01/33] drm/i915: Use EAGAIN for trylock failures Chris Wilson
                   ` (12 preceding siblings ...)
  2019-12-12 14:04 ` [Intel-gfx] [PATCH 14/33] drm/i915: Flush idle barriers when waiting Chris Wilson
@ 2019-12-12 14:04 ` Chris Wilson
  2019-12-12 14:04 ` [Intel-gfx] [PATCH 16/33] drm/i915/gem: Honour O_NONBLOCK before throttling execbuf submissions Chris Wilson
                   ` (18 subsequent siblings)
  32 siblings, 0 replies; 46+ messages in thread
From: Chris Wilson @ 2019-12-12 14:04 UTC (permalink / raw)
  To: intel-gfx

No good reason why we must always use a static ringsize, so let
userspace select one during construction.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com>
---
 drivers/gpu/drm/i915/Makefile                 |   1 +
 drivers/gpu/drm/i915/gem/i915_gem_context.c   | 104 +++++++++++++++++-
 drivers/gpu/drm/i915/gt/intel_context_param.c |  64 +++++++++++
 drivers/gpu/drm/i915/gt/intel_context_param.h |  15 +++
 drivers/gpu/drm/i915/gt/intel_lrc.c           |   1 +
 include/uapi/drm/i915_drm.h                   |  19 ++++
 6 files changed, 198 insertions(+), 6 deletions(-)
 create mode 100644 drivers/gpu/drm/i915/gt/intel_context_param.c
 create mode 100644 drivers/gpu/drm/i915/gt/intel_context_param.h

diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
index e0fd10c0cfb8..2083496a1193 100644
--- a/drivers/gpu/drm/i915/Makefile
+++ b/drivers/gpu/drm/i915/Makefile
@@ -77,6 +77,7 @@ obj-y += gt/
 gt-y += \
 	gt/intel_breadcrumbs.o \
 	gt/intel_context.o \
+	gt/intel_context_param.o \
 	gt/intel_engine_cs.o \
 	gt/intel_engine_heartbeat.o \
 	gt/intel_engine_pm.o \
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c
index 46b4d1d643f8..fd59d040f483 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
@@ -69,6 +69,7 @@
 
 #include <drm/i915_drm.h>
 
+#include "gt/intel_context_param.h"
 #include "gt/intel_engine_heartbeat.h"
 #include "gt/intel_engine_pm.h"
 #include "gt/intel_engine_user.h"
@@ -593,23 +594,30 @@ __create_context(struct drm_i915_private *i915)
 	return ERR_PTR(err);
 }
 
-static void
+static int
 context_apply_all(struct i915_gem_context *ctx,
-		  void (*fn)(struct intel_context *ce, void *data),
+		  int (*fn)(struct intel_context *ce, void *data),
 		  void *data)
 {
 	struct i915_gem_engines_iter it;
 	struct intel_context *ce;
+	int err = 0;
 
-	for_each_gem_engine(ce, i915_gem_context_lock_engines(ctx), it)
-		fn(ce, data);
+	for_each_gem_engine(ce, i915_gem_context_lock_engines(ctx), it) {
+		err = fn(ce, data);
+		if (err)
+			break;
+	}
 	i915_gem_context_unlock_engines(ctx);
+
+	return err;
 }
 
-static void __apply_ppgtt(struct intel_context *ce, void *vm)
+static int __apply_ppgtt(struct intel_context *ce, void *vm)
 {
 	i915_vm_put(ce->vm);
 	ce->vm = i915_vm_get(vm);
+	return 0;
 }
 
 static struct i915_address_space *
@@ -647,9 +655,10 @@ static void __set_timeline(struct intel_timeline **dst,
 		intel_timeline_put(old);
 }
 
-static void __apply_timeline(struct intel_context *ce, void *timeline)
+static int __apply_timeline(struct intel_context *ce, void *timeline)
 {
 	__set_timeline(&ce->timeline, timeline);
+	return 0;
 }
 
 static void __assign_timeline(struct i915_gem_context *ctx,
@@ -1226,6 +1235,63 @@ static int set_ppgtt(struct drm_i915_file_private *file_priv,
 	return err;
 }
 
+static int __apply_ringsize(struct intel_context *ce, void *sz)
+{
+	return intel_context_set_ring_size(ce, (unsigned long)sz);
+}
+
+static int set_ringsize(struct i915_gem_context *ctx,
+			struct drm_i915_gem_context_param *args)
+{
+	if (!HAS_LOGICAL_RING_CONTEXTS(ctx->i915))
+		return -ENODEV;
+
+	if (args->size)
+		return -EINVAL;
+
+	if (!IS_ALIGNED(args->value, I915_GTT_PAGE_SIZE))
+		return -EINVAL;
+
+	if (args->value < I915_GTT_PAGE_SIZE)
+		return -EINVAL;
+
+	if (args->value > 128 * I915_GTT_PAGE_SIZE)
+		return -EINVAL;
+
+	return context_apply_all(ctx,
+				 __apply_ringsize,
+				 __intel_context_ring_size(args->value));
+}
+
+static int __get_ringsize(struct intel_context *ce, void *arg)
+{
+	long sz;
+
+	sz = intel_context_get_ring_size(ce);
+	GEM_BUG_ON(sz > INT_MAX);
+
+	return sz; /* stop on first engine */
+}
+
+static int get_ringsize(struct i915_gem_context *ctx,
+			struct drm_i915_gem_context_param *args)
+{
+	int sz;
+
+	if (!HAS_LOGICAL_RING_CONTEXTS(ctx->i915))
+		return -ENODEV;
+
+	if (args->size)
+		return -EINVAL;
+
+	sz = context_apply_all(ctx, __get_ringsize, NULL);
+	if (sz < 0)
+		return sz;
+
+	args->value = sz;
+	return 0;
+}
+
 static int gen8_emit_rpcs_config(struct i915_request *rq,
 				 struct intel_context *ce,
 				 struct intel_sseu sseu)
@@ -1938,6 +2004,10 @@ static int ctx_setparam(struct drm_i915_file_private *fpriv,
 		ret = set_persistence(ctx, args);
 		break;
 
+	case I915_CONTEXT_PARAM_RINGSIZE:
+		ret = set_ringsize(ctx, args);
+		break;
+
 	case I915_CONTEXT_PARAM_BAN_PERIOD:
 	default:
 		ret = -EINVAL;
@@ -1966,6 +2036,18 @@ static int create_setparam(struct i915_user_extension __user *ext, void *data)
 	return ctx_setparam(arg->fpriv, arg->ctx, &local.param);
 }
 
+static int copy_ring_size(struct intel_context *dst,
+			  struct intel_context *src)
+{
+	long sz;
+
+	sz = intel_context_get_ring_size(src);
+	if (sz < 0)
+		return sz;
+
+	return intel_context_set_ring_size(dst, sz);
+}
+
 static int clone_engines(struct i915_gem_context *dst,
 			 struct i915_gem_context *src)
 {
@@ -2006,6 +2088,12 @@ static int clone_engines(struct i915_gem_context *dst,
 			__free_engines(clone, n);
 			goto err_unlock;
 		}
+
+		/* Copy across the preferred ringsize */
+		if (copy_ring_size(clone->engines[n], e->engines[n])) {
+			__free_engines(clone, n + 1);
+			goto err_unlock;
+		}
 	}
 	clone->num_engines = n;
 
@@ -2371,6 +2459,10 @@ int i915_gem_context_getparam_ioctl(struct drm_device *dev, void *data,
 		args->value = i915_gem_context_is_persistent(ctx);
 		break;
 
+	case I915_CONTEXT_PARAM_RINGSIZE:
+		ret = get_ringsize(ctx, args);
+		break;
+
 	case I915_CONTEXT_PARAM_BAN_PERIOD:
 	default:
 		ret = -EINVAL;
diff --git a/drivers/gpu/drm/i915/gt/intel_context_param.c b/drivers/gpu/drm/i915/gt/intel_context_param.c
new file mode 100644
index 000000000000..222d2e9bee71
--- /dev/null
+++ b/drivers/gpu/drm/i915/gt/intel_context_param.c
@@ -0,0 +1,64 @@
+/*
+ * SPDX-License-Identifier: MIT
+ *
+ * Copyright © 2019 Intel Corporation
+ */
+
+#include "i915_active.h"
+#include "intel_context.h"
+#include "intel_context_param.h"
+#include "intel_ring.h"
+
+int intel_context_set_ring_size(struct intel_context *ce, long sz)
+{
+	int err;
+
+	err = i915_active_wait(&ce->active);
+	if (err < 0)
+		return err;
+
+	if (intel_context_lock_pinned(ce))
+		return -EINTR;
+
+	if (intel_context_is_pinned(ce)) {
+		err = -EBUSY; /* In active use, come back later! */
+		goto unlock;
+	}
+
+	if (test_bit(CONTEXT_ALLOC_BIT, &ce->flags)) {
+		struct intel_ring *ring;
+
+		/* Replace the existing ringbuffer */
+		ring = intel_engine_create_ring(ce->engine, sz);
+		if (IS_ERR(ring)) {
+			err = PTR_ERR(ring);
+			goto unlock;
+		}
+
+		intel_ring_put(ce->ring);
+		ce->ring = ring;
+
+		/* Context image will be updated on next pin */
+	} else {
+		ce->ring = __intel_context_ring_size(sz);
+	}
+
+unlock:
+	intel_context_unlock_pinned(ce);
+	return err;
+}
+
+long intel_context_get_ring_size(struct intel_context *ce)
+{
+	long sz = (unsigned long)READ_ONCE(ce->ring);
+
+	if (test_bit(CONTEXT_ALLOC_BIT, &ce->flags)) {
+		if (intel_context_lock_pinned(ce))
+			return -EINTR;
+
+		sz = ce->ring->size;
+		intel_context_unlock_pinned(ce);
+	}
+
+	return sz;
+}
diff --git a/drivers/gpu/drm/i915/gt/intel_context_param.h b/drivers/gpu/drm/i915/gt/intel_context_param.h
new file mode 100644
index 000000000000..0981a8399a68
--- /dev/null
+++ b/drivers/gpu/drm/i915/gt/intel_context_param.h
@@ -0,0 +1,15 @@
+/*
+ * SPDX-License-Identifier: MIT
+ *
+ * Copyright © 2019 Intel Corporation
+ */
+
+#ifndef INTEL_CONTEXT_PARAM_H
+#define INTEL_CONTEXT_PARAM_H
+
+struct intel_context;
+
+int intel_context_set_ring_size(struct intel_context *ce, long sz);
+long intel_context_get_ring_size(struct intel_context *ce);
+
+#endif /* INTEL_CONTEXT_PARAM_H */
diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
index 088901484932..095b93c88975 100644
--- a/drivers/gpu/drm/i915/gt/intel_lrc.c
+++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
@@ -2445,6 +2445,7 @@ __execlists_update_reg_state(const struct intel_context *ce,
 	regs[CTX_RING_START] = i915_ggtt_offset(ring->vma);
 	regs[CTX_RING_HEAD] = ring->head;
 	regs[CTX_RING_TAIL] = ring->tail;
+	regs[CTX_RING_CTL] = RING_CTL_SIZE(ring->size) | RING_VALID;
 
 	/* RPCS */
 	if (engine->class == RENDER_CLASS) {
diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
index 829c0a48577f..decdd08a02c7 100644
--- a/include/uapi/drm/i915_drm.h
+++ b/include/uapi/drm/i915_drm.h
@@ -1619,6 +1619,25 @@ struct drm_i915_gem_context_param {
  * By default, new contexts allow persistence.
  */
 #define I915_CONTEXT_PARAM_PERSISTENCE	0xb
+
+/*
+ * I915_CONTEXT_PARAM_RINGSIZE:
+ *
+ * Sets the size of the CS ringbuffer to use for logical ring contexts. This
+ * applies a limit of how many batches can be queued to HW before the caller
+ * is blocked due to lack of space for more commands.
+ *
+ * Only reliably possible to be set prior to first use, i.e. during
+ * construction. At any later point, the current execution must be flushed as
+ * the ring can only be changed while the context is idle.
+ *
+ * Only applies to the current set of engine and lost when those engines
+ * are replaced by a new mapping (see I915_CONTEXT_PARAM_ENGINES).
+ *
+ * Must be between 4 - 512 KiB, in intervals of page size [4 KiB].
+ * Default is 16 KiB.
+ */
+#define I915_CONTEXT_PARAM_RINGSIZE	0xc
 /* Must be kept compact -- no holes and well documented */
 
 	__u64 value;
-- 
2.24.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [Intel-gfx] [PATCH 16/33] drm/i915/gem: Honour O_NONBLOCK before throttling execbuf submissions
  2019-12-12 14:04 [Intel-gfx] [PATCH 01/33] drm/i915: Use EAGAIN for trylock failures Chris Wilson
                   ` (13 preceding siblings ...)
  2019-12-12 14:04 ` [Intel-gfx] [PATCH 15/33] drm/i915: Allow userspace to specify ringsize on construction Chris Wilson
@ 2019-12-12 14:04 ` Chris Wilson
  2019-12-12 14:04 ` [Intel-gfx] [PATCH 17/33] drm/i915/gt: Teach veng to defer the context allocation Chris Wilson
                   ` (17 subsequent siblings)
  32 siblings, 0 replies; 46+ messages in thread
From: Chris Wilson @ 2019-12-12 14:04 UTC (permalink / raw)
  To: intel-gfx

Check the user's flags on the struct file before deciding whether or not
to stall before submitting a request. This allows us to reasonably
cheaply honour O_NONBLOCK without checking at more critical phases
during request submission.

Suggested-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 .../gpu/drm/i915/gem/i915_gem_execbuffer.c    | 21 ++++++++++++-------
 1 file changed, 14 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
index 81eaf812c9da..f7f581ff8a29 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
@@ -2290,15 +2290,22 @@ static int __eb_pin_engine(struct i915_execbuffer *eb, struct intel_context *ce)
 	intel_context_timeline_unlock(tl);
 
 	if (rq) {
-		if (i915_request_wait(rq,
-				      I915_WAIT_INTERRUPTIBLE,
-				      MAX_SCHEDULE_TIMEOUT) < 0) {
-			i915_request_put(rq);
-			err = -EINTR;
-			goto err_exit;
-		}
+		bool nonblock = eb->file->filp->f_flags & O_NONBLOCK;
+		long timeout;
+
+		timeout = MAX_SCHEDULE_TIMEOUT;
+		if (nonblock)
+			timeout = 0;
 
+		timeout = i915_request_wait(rq,
+					    I915_WAIT_INTERRUPTIBLE,
+					    timeout);
 		i915_request_put(rq);
+
+		if (timeout < 0) {
+			err = nonblock ? -EWOULDBLOCK : timeout;
+			goto err_exit;
+		}
 	}
 
 	eb->engine = ce->engine;
-- 
2.24.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [Intel-gfx] [PATCH 17/33] drm/i915/gt: Teach veng to defer the context allocation
  2019-12-12 14:04 [Intel-gfx] [PATCH 01/33] drm/i915: Use EAGAIN for trylock failures Chris Wilson
                   ` (14 preceding siblings ...)
  2019-12-12 14:04 ` [Intel-gfx] [PATCH 16/33] drm/i915/gem: Honour O_NONBLOCK before throttling execbuf submissions Chris Wilson
@ 2019-12-12 14:04 ` Chris Wilson
  2019-12-12 14:04 ` [Intel-gfx] [PATCH 18/33] drm/i915: Drop GEM context as a direct link from i915_request Chris Wilson
                   ` (16 subsequent siblings)
  32 siblings, 0 replies; 46+ messages in thread
From: Chris Wilson @ 2019-12-12 14:04 UTC (permalink / raw)
  To: intel-gfx

Since we added the context_alloc callback to intel_context_ops, we can
safely install a custom hook for the deferred virtual context allocation.
This means that all new contexts behave the same upon creation,
simplifying later code.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/gt/intel_lrc.c | 15 +++++++++------
 1 file changed, 9 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
index 095b93c88975..0559a77b0c01 100644
--- a/drivers/gpu/drm/i915/gt/intel_lrc.c
+++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
@@ -4247,6 +4247,13 @@ static void virtual_engine_initial_hint(struct virtual_engine *ve)
 						ve->siblings[0]);
 }
 
+static int virtual_context_alloc(struct intel_context *ce)
+{
+	struct virtual_engine *ve = container_of(ce, typeof(*ve), context);
+
+	return __execlists_context_alloc(ce, ve->siblings[0]);
+}
+
 static int virtual_context_pin(struct intel_context *ce)
 {
 	struct virtual_engine *ve = container_of(ce, typeof(*ve), context);
@@ -4284,6 +4291,8 @@ static void virtual_context_exit(struct intel_context *ce)
 }
 
 static const struct intel_context_ops virtual_context_ops = {
+	.alloc = virtual_context_alloc,
+
 	.pin = virtual_context_pin,
 	.unpin = execlists_context_unpin,
 
@@ -4603,12 +4612,6 @@ intel_execlists_create_virtual(struct i915_gem_context *ctx,
 
 	ve->base.flags |= I915_ENGINE_IS_VIRTUAL;
 
-	err = __execlists_context_alloc(&ve->context, siblings[0]);
-	if (err)
-		goto err_put;
-
-	__set_bit(CONTEXT_ALLOC_BIT, &ve->context.flags);
-
 	return &ve->context;
 
 err_put:
-- 
2.24.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [Intel-gfx] [PATCH 18/33] drm/i915: Drop GEM context as a direct link from i915_request
  2019-12-12 14:04 [Intel-gfx] [PATCH 01/33] drm/i915: Use EAGAIN for trylock failures Chris Wilson
                   ` (15 preceding siblings ...)
  2019-12-12 14:04 ` [Intel-gfx] [PATCH 17/33] drm/i915/gt: Teach veng to defer the context allocation Chris Wilson
@ 2019-12-12 14:04 ` Chris Wilson
  2019-12-12 14:04 ` [Intel-gfx] [PATCH 19/33] drm/i915: Push the use-semaphore marker onto the intel_context Chris Wilson
                   ` (15 subsequent siblings)
  32 siblings, 0 replies; 46+ messages in thread
From: Chris Wilson @ 2019-12-12 14:04 UTC (permalink / raw)
  To: intel-gfx

Keep the intel_context as being the primary state for i915_request, with
the GEM context a backpointer from the low level state for the rarer
cases we need client information. Our goal is to remove such references
to clients from the backend, and leave the HW submission agnostic to
client interfaces and self-contained.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/gem/i915_gem_context.c   | 15 ++----
 drivers/gpu/drm/i915/gem/i915_gem_context.h   | 38 ---------------
 .../gpu/drm/i915/gem/i915_gem_context_types.h |  7 +--
 .../gpu/drm/i915/gem/i915_gem_execbuffer.c    |  8 ++--
 drivers/gpu/drm/i915/gt/intel_breadcrumbs.c   |  4 +-
 drivers/gpu/drm/i915/gt/intel_context.c       |  2 +-
 drivers/gpu/drm/i915/gt/intel_context.h       | 42 +++++++++++++++++
 drivers/gpu/drm/i915/gt/intel_context_types.h |  7 ++-
 drivers/gpu/drm/i915/gt/intel_engine_cs.c     |  6 +--
 drivers/gpu/drm/i915/gt/intel_lrc.c           | 47 +++++++++----------
 drivers/gpu/drm/i915/gt/intel_reset.c         | 37 ++++++++-------
 .../gpu/drm/i915/gt/intel_ring_submission.c   |  8 ++--
 drivers/gpu/drm/i915/gt/selftest_lrc.c        | 20 ++++----
 .../gpu/drm/i915/gt/uc/intel_guc_submission.c |  6 +--
 drivers/gpu/drm/i915/gvt/scheduler.c          | 27 +++++------
 drivers/gpu/drm/i915/i915_gem.c               | 11 ++---
 drivers/gpu/drm/i915/i915_gpu_error.c         | 11 +++--
 drivers/gpu/drm/i915/i915_perf.c              |  4 +-
 drivers/gpu/drm/i915/i915_request.c           | 21 +++++----
 drivers/gpu/drm/i915/i915_request.h           |  3 +-
 drivers/gpu/drm/i915/i915_scheduler.c         |  2 +-
 21 files changed, 165 insertions(+), 161 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c
index fd59d040f483..80278261a640 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
@@ -69,6 +69,7 @@
 
 #include <drm/i915_drm.h>
 
+#include "gt/intel_context.h"
 #include "gt/intel_context_param.h"
 #include "gt/intel_engine_heartbeat.h"
 #include "gt/intel_engine_pm.h"
@@ -424,15 +425,6 @@ static void kill_context(struct i915_gem_context *ctx)
 	struct i915_gem_engines_iter it;
 	struct intel_context *ce;
 
-	/*
-	 * If we are already banned, it was due to a guilty request causing
-	 * a reset and the entire context being evicted from the GPU.
-	 */
-	if (i915_gem_context_is_banned(ctx))
-		return;
-
-	i915_gem_context_set_banned(ctx);
-
 	/*
 	 * Map the user's engine back to the actual engines; one virtual
 	 * engine will be mapped to multiple engines, and using ctx->engine[]
@@ -443,6 +435,9 @@ static void kill_context(struct i915_gem_context *ctx)
 	for_each_gem_engine(ce, __context_engines_static(ctx), it) {
 		struct intel_engine_cs *engine;
 
+		if (intel_context_set_banned(ce))
+			continue;
+
 		/*
 		 * Check the current active state of this context; if we
 		 * are currently executing on the GPU we need to evict
@@ -1102,7 +1097,7 @@ static void set_ppgtt_barrier(void *data)
 
 static int emit_ppgtt_update(struct i915_request *rq, void *data)
 {
-	struct i915_address_space *vm = rq->hw_context->vm;
+	struct i915_address_space *vm = rq->context->vm;
 	struct intel_engine_cs *engine = rq->engine;
 	u32 base = engine->mmio_base;
 	u32 *cs;
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.h b/drivers/gpu/drm/i915/gem/i915_gem_context.h
index 18e50a769a6e..69932899803e 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.h
@@ -91,26 +91,6 @@ static inline void i915_gem_context_clear_persistence(struct i915_gem_context *c
 	clear_bit(UCONTEXT_PERSISTENCE, &ctx->user_flags);
 }
 
-static inline bool i915_gem_context_is_banned(const struct i915_gem_context *ctx)
-{
-	return test_bit(CONTEXT_BANNED, &ctx->flags);
-}
-
-static inline void i915_gem_context_set_banned(struct i915_gem_context *ctx)
-{
-	set_bit(CONTEXT_BANNED, &ctx->flags);
-}
-
-static inline bool i915_gem_context_force_single_submission(const struct i915_gem_context *ctx)
-{
-	return test_bit(CONTEXT_FORCE_SINGLE_SUBMISSION, &ctx->flags);
-}
-
-static inline void i915_gem_context_set_force_single_submission(struct i915_gem_context *ctx)
-{
-	__set_bit(CONTEXT_FORCE_SINGLE_SUBMISSION, &ctx->flags);
-}
-
 static inline bool
 i915_gem_context_user_engines(const struct i915_gem_context *ctx)
 {
@@ -129,24 +109,6 @@ i915_gem_context_clear_user_engines(struct i915_gem_context *ctx)
 	clear_bit(CONTEXT_USER_ENGINES, &ctx->flags);
 }
 
-static inline bool
-i915_gem_context_nopreempt(const struct i915_gem_context *ctx)
-{
-	return test_bit(CONTEXT_NOPREEMPT, &ctx->flags);
-}
-
-static inline void
-i915_gem_context_set_nopreempt(struct i915_gem_context *ctx)
-{
-	set_bit(CONTEXT_NOPREEMPT, &ctx->flags);
-}
-
-static inline void
-i915_gem_context_clear_nopreempt(struct i915_gem_context *ctx)
-{
-	clear_bit(CONTEXT_NOPREEMPT, &ctx->flags);
-}
-
 static inline bool i915_gem_context_is_kernel(struct i915_gem_context *ctx)
 {
 	return !ctx->file_priv;
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context_types.h b/drivers/gpu/drm/i915/gem/i915_gem_context_types.h
index 69df5459c350..017ca803ab47 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context_types.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context_types.h
@@ -134,11 +134,8 @@ struct i915_gem_context {
 	 * @flags: small set of booleans
 	 */
 	unsigned long flags;
-#define CONTEXT_BANNED			0
-#define CONTEXT_CLOSED			1
-#define CONTEXT_FORCE_SINGLE_SUBMISSION	2
-#define CONTEXT_USER_ENGINES		3
-#define CONTEXT_NOPREEMPT		4
+#define CONTEXT_CLOSED			0
+#define CONTEXT_USER_ENGINES		1
 
 	struct mutex mutex;
 
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
index f7f581ff8a29..0866a1ba9d9f 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
@@ -730,9 +730,6 @@ static int eb_lookup_vmas(struct i915_execbuffer *eb)
 	unsigned int i, batch;
 	int err;
 
-	if (unlikely(i915_gem_context_is_banned(eb->gem_context)))
-		return -EIO;
-
 	INIT_LIST_HEAD(&eb->relocs);
 	INIT_LIST_HEAD(&eb->unbound);
 
@@ -2175,7 +2172,7 @@ static int eb_submit(struct i915_execbuffer *eb)
 			return err;
 	}
 
-	if (i915_gem_context_nopreempt(eb->gem_context))
+	if (intel_context_nopreempt(eb->context))
 		eb->request->flags |= I915_REQUEST_NOPREEMPT;
 
 	return 0;
@@ -2261,6 +2258,9 @@ static int __eb_pin_engine(struct i915_execbuffer *eb, struct intel_context *ce)
 	if (err)
 		return err;
 
+	if (unlikely(intel_context_is_banned(ce)))
+		return -EIO;
+
 	/*
 	 * Pinning the contexts may generate requests in order to acquire
 	 * GGTT space, so do this first before we reserve a seqno for
diff --git a/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c b/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c
index 5fa4d621528e..7265ff88e7de 100644
--- a/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c
+++ b/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c
@@ -270,7 +270,7 @@ bool i915_request_enable_breadcrumb(struct i915_request *rq)
 
 	if (test_bit(I915_FENCE_FLAG_ACTIVE, &rq->fence.flags)) {
 		struct intel_breadcrumbs *b = &rq->engine->breadcrumbs;
-		struct intel_context *ce = rq->hw_context;
+		struct intel_context *ce = rq->context;
 		struct list_head *pos;
 
 		spin_lock(&b->irq_lock);
@@ -327,7 +327,7 @@ void i915_request_cancel_breadcrumb(struct i915_request *rq)
 	 */
 	spin_lock(&b->irq_lock);
 	if (test_bit(I915_FENCE_FLAG_SIGNAL, &rq->fence.flags)) {
-		struct intel_context *ce = rq->hw_context;
+		struct intel_context *ce = rq->context;
 
 		list_del(&rq->signal_link);
 		if (list_empty(&ce->signals))
diff --git a/drivers/gpu/drm/i915/gt/intel_context.c b/drivers/gpu/drm/i915/gt/intel_context.c
index f7e2f3af007a..4c310cee1f04 100644
--- a/drivers/gpu/drm/i915/gt/intel_context.c
+++ b/drivers/gpu/drm/i915/gt/intel_context.c
@@ -314,7 +314,7 @@ int intel_context_prepare_remote_request(struct intel_context *ce,
 	int err;
 
 	/* Only suitable for use in remotely modifying this context */
-	GEM_BUG_ON(rq->hw_context == ce);
+	GEM_BUG_ON(rq->context == ce);
 
 	if (rcu_access_pointer(rq->timeline) != tl) { /* timeline sharing! */
 		/* Queue this switch after current activity by this context. */
diff --git a/drivers/gpu/drm/i915/gt/intel_context.h b/drivers/gpu/drm/i915/gt/intel_context.h
index 68b3d317d959..1e607343d256 100644
--- a/drivers/gpu/drm/i915/gt/intel_context.h
+++ b/drivers/gpu/drm/i915/gt/intel_context.h
@@ -7,7 +7,9 @@
 #ifndef __INTEL_CONTEXT_H__
 #define __INTEL_CONTEXT_H__
 
+#include <linux/bitops.h>
 #include <linux/lockdep.h>
+#include <linux/types.h>
 
 #include "i915_active.h"
 #include "intel_context_types.h"
@@ -153,4 +155,44 @@ static inline struct intel_ring *__intel_context_ring_size(u64 sz)
 	return u64_to_ptr(struct intel_ring, sz);
 }
 
+static inline bool intel_context_is_banned(const struct intel_context *ce)
+{
+	return test_bit(CONTEXT_BANNED, &ce->flags);
+}
+
+static inline bool intel_context_set_banned(struct intel_context *ce)
+{
+	return test_and_set_bit(CONTEXT_BANNED, &ce->flags);
+}
+
+static inline bool
+intel_context_force_single_submission(const struct intel_context *ce)
+{
+	return test_bit(CONTEXT_FORCE_SINGLE_SUBMISSION, &ce->flags);
+}
+
+static inline void
+intel_context_set_single_submission(struct intel_context *ce)
+{
+	__set_bit(CONTEXT_FORCE_SINGLE_SUBMISSION, &ce->flags);
+}
+
+static inline bool
+intel_context_nopreempt(const struct intel_context *ce)
+{
+	return test_bit(CONTEXT_NOPREEMPT, &ce->flags);
+}
+
+static inline void
+intel_context_set_nopreempt(struct intel_context *ce)
+{
+	set_bit(CONTEXT_NOPREEMPT, &ce->flags);
+}
+
+static inline void
+intel_context_clear_nopreempt(struct intel_context *ce)
+{
+	clear_bit(CONTEXT_NOPREEMPT, &ce->flags);
+}
+
 #endif /* __INTEL_CONTEXT_H__ */
diff --git a/drivers/gpu/drm/i915/gt/intel_context_types.h b/drivers/gpu/drm/i915/gt/intel_context_types.h
index d1204cc899a3..597448f6e98b 100644
--- a/drivers/gpu/drm/i915/gt/intel_context_types.h
+++ b/drivers/gpu/drm/i915/gt/intel_context_types.h
@@ -54,8 +54,11 @@ struct intel_context {
 	struct intel_timeline *timeline;
 
 	unsigned long flags;
-#define CONTEXT_ALLOC_BIT 0
-#define CONTEXT_VALID_BIT 1
+#define CONTEXT_ALLOC_BIT		0
+#define CONTEXT_VALID_BIT		1
+#define CONTEXT_BANNED			2
+#define CONTEXT_FORCE_SINGLE_SUBMISSION	3
+#define CONTEXT_NOPREEMPT		4
 
 	u32 *lrc_reg_state;
 	u64 lrc_desc;
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
index 49473c25916c..0fe7be3cf21a 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
@@ -1520,9 +1520,9 @@ void intel_engine_dump(struct intel_engine_cs *engine,
 
 		print_request_ring(m, rq);
 
-		if (rq->hw_context->lrc_reg_state) {
+		if (rq->context->lrc_reg_state) {
 			drm_printf(m, "Logical Ring Context:\n");
-			hexdump(m, rq->hw_context->lrc_reg_state, PAGE_SIZE);
+			hexdump(m, rq->context->lrc_reg_state, PAGE_SIZE);
 		}
 	}
 	spin_unlock_irqrestore(&engine->active.lock, flags);
@@ -1583,7 +1583,7 @@ int intel_enable_engine_stats(struct intel_engine_cs *engine)
 
 		for (port = execlists->pending; (rq = *port); port++) {
 			/* Exclude any contexts already counted in active */
-			if (!intel_context_inflight_count(rq->hw_context))
+			if (!intel_context_inflight_count(rq->context))
 				engine->stats.active++;
 		}
 
diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
index 0559a77b0c01..c7bd271bf60f 100644
--- a/drivers/gpu/drm/i915/gt/intel_lrc.c
+++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
@@ -880,7 +880,7 @@ __unwind_incomplete_requests(struct intel_engine_cs *engine)
 			list_move(&rq->sched.link, pl);
 			active = rq;
 		} else {
-			struct intel_engine_cs *owner = rq->hw_context->engine;
+			struct intel_engine_cs *owner = rq->context->engine;
 
 			/*
 			 * Decouple the virtual breadcrumb before moving it
@@ -1051,7 +1051,7 @@ static void restore_default_state(struct intel_context *ce,
 static void reset_active(struct i915_request *rq,
 			 struct intel_engine_cs *engine)
 {
-	struct intel_context * const ce = rq->hw_context;
+	struct intel_context * const ce = rq->context;
 	u32 head;
 
 	/*
@@ -1092,11 +1092,11 @@ static inline struct intel_engine_cs *
 __execlists_schedule_in(struct i915_request *rq)
 {
 	struct intel_engine_cs * const engine = rq->engine;
-	struct intel_context * const ce = rq->hw_context;
+	struct intel_context * const ce = rq->context;
 
 	intel_context_get(ce);
 
-	if (unlikely(i915_gem_context_is_banned(ce->gem_context)))
+	if (unlikely(intel_context_is_banned(ce)))
 		reset_active(rq, engine);
 
 	if (IS_ENABLED(CONFIG_DRM_I915_DEBUG_GEM))
@@ -1124,7 +1124,7 @@ __execlists_schedule_in(struct i915_request *rq)
 static inline struct i915_request *
 execlists_schedule_in(struct i915_request *rq, int idx)
 {
-	struct intel_context * const ce = rq->hw_context;
+	struct intel_context * const ce = rq->context;
 	struct intel_engine_cs *old;
 
 	GEM_BUG_ON(!intel_engine_pm_is_awake(rq->engine));
@@ -1155,7 +1155,7 @@ static inline void
 __execlists_schedule_out(struct i915_request *rq,
 			 struct intel_engine_cs * const engine)
 {
-	struct intel_context * const ce = rq->hw_context;
+	struct intel_context * const ce = rq->context;
 
 	/*
 	 * NB process_csb() is not under the engine->active.lock and hence
@@ -1193,7 +1193,7 @@ __execlists_schedule_out(struct i915_request *rq,
 static inline void
 execlists_schedule_out(struct i915_request *rq)
 {
-	struct intel_context * const ce = rq->hw_context;
+	struct intel_context * const ce = rq->context;
 	struct intel_engine_cs *cur, *old;
 
 	trace_i915_request_out(rq);
@@ -1210,7 +1210,7 @@ execlists_schedule_out(struct i915_request *rq)
 
 static u64 execlists_update_context(struct i915_request *rq)
 {
-	struct intel_context *ce = rq->hw_context;
+	struct intel_context *ce = rq->context;
 	u64 desc = ce->lrc_desc;
 	u32 tail;
 
@@ -1312,13 +1312,13 @@ assert_pending_valid(const struct intel_engine_execlists *execlists,
 		GEM_BUG_ON(!kref_read(&rq->fence.refcount));
 		GEM_BUG_ON(!i915_request_is_active(rq));
 
-		if (ce == rq->hw_context) {
+		if (ce == rq->context) {
 			GEM_TRACE_ERR("Dup context:%llx in pending[%zd]\n",
 				      ce->timeline->fence_context,
 				      port - execlists->pending);
 			return false;
 		}
-		ce = rq->hw_context;
+		ce = rq->context;
 
 		/* Hold tightly onto the lock to prevent concurrent retires! */
 		if (!spin_trylock_irqsave(&rq->lock, flags))
@@ -1327,8 +1327,7 @@ assert_pending_valid(const struct intel_engine_execlists *execlists,
 		if (i915_request_completed(rq))
 			goto unlock;
 
-		if (i915_active_is_idle(&ce->active) &&
-		    !i915_gem_context_is_kernel(ce->gem_context)) {
+		if (i915_active_is_idle(&ce->active) && ce->gem_context) {
 			GEM_TRACE_ERR("Inactive context:%llx in pending[%zd]\n",
 				      ce->timeline->fence_context,
 				      port - execlists->pending);
@@ -1400,7 +1399,7 @@ static void execlists_submit_ports(struct intel_engine_cs *engine)
 static bool ctx_single_port_submission(const struct intel_context *ce)
 {
 	return (IS_ENABLED(CONFIG_DRM_I915_GVT) &&
-		i915_gem_context_force_single_submission(ce->gem_context));
+		intel_context_force_single_submission(ce));
 }
 
 static bool can_merge_ctx(const struct intel_context *prev,
@@ -1436,7 +1435,7 @@ static bool can_merge_rq(const struct i915_request *prev,
 		     (I915_REQUEST_NOPREEMPT | I915_REQUEST_SENTINEL)))
 		return false;
 
-	if (!can_merge_ctx(prev->hw_context, next->hw_context))
+	if (!can_merge_ctx(prev->context, next->context))
 		return false;
 
 	return true;
@@ -1623,7 +1622,7 @@ static unsigned long active_preempt_timeout(struct intel_engine_cs *engine)
 		return 0;
 
 	/* Force a fast reset for terminated contexts (ignoring sysfs!) */
-	if (unlikely(i915_gem_context_is_banned(rq->gem_context)))
+	if (unlikely(intel_context_is_banned(rq->context)))
 		return 1;
 
 	return READ_ONCE(engine->props.preempt_timeout_ms);
@@ -1731,7 +1730,7 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
 			 * tendency to ignore us rewinding the TAIL to the
 			 * end of an earlier request.
 			 */
-			last->hw_context->lrc_desc |= CTX_DESC_FORCE_RESTORE;
+			last->context->lrc_desc |= CTX_DESC_FORCE_RESTORE;
 			last = NULL;
 		} else if (need_timeslice(engine, last) &&
 			   timer_expired(&engine->execlists.timer)) {
@@ -1803,7 +1802,7 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
 
 		GEM_BUG_ON(rq != ve->request);
 		GEM_BUG_ON(rq->engine != &ve->base);
-		GEM_BUG_ON(rq->hw_context != &ve->context);
+		GEM_BUG_ON(rq->context != &ve->context);
 
 		if (rq_prio(rq) >= queue_prio(execlists)) {
 			if (!virtual_matches(ve, rq, engine)) {
@@ -1922,7 +1921,7 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
 				 * same LRCA, i.e. we must submit 2 different
 				 * contexts if we submit 2 ELSP.
 				 */
-				if (last->hw_context == rq->hw_context)
+				if (last->context == rq->context)
 					goto done;
 
 				if (i915_request_has_sentinel(last))
@@ -1935,8 +1934,8 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
 				 * the same context (even though a different
 				 * request) to the second port.
 				 */
-				if (ctx_single_port_submission(last->hw_context) ||
-				    ctx_single_port_submission(rq->hw_context))
+				if (ctx_single_port_submission(last->context) ||
+				    ctx_single_port_submission(rq->context))
 					goto done;
 
 				merge = false;
@@ -1950,8 +1949,8 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
 				}
 
 				GEM_BUG_ON(last &&
-					   !can_merge_ctx(last->hw_context,
-							  rq->hw_context));
+					   !can_merge_ctx(last->context,
+							  rq->context));
 
 				submit = true;
 				last = rq;
@@ -2571,7 +2570,7 @@ static int execlists_request_alloc(struct i915_request *request)
 {
 	int ret;
 
-	GEM_BUG_ON(!intel_context_is_pinned(request->hw_context));
+	GEM_BUG_ON(!intel_context_is_pinned(request->context));
 
 	/*
 	 * Flush enough space to reduce the likelihood of waiting after
@@ -3078,7 +3077,7 @@ static void __execlists_reset(struct intel_engine_cs *engine, bool stalled)
 	/* We still have requests in-flight; the engine should be active */
 	GEM_BUG_ON(!intel_engine_pm_is_awake(engine));
 
-	ce = rq->hw_context;
+	ce = rq->context;
 	GEM_BUG_ON(!i915_vma_is_pinned(ce->state));
 
 	if (i915_request_completed(rq)) {
diff --git a/drivers/gpu/drm/i915/gt/intel_reset.c b/drivers/gpu/drm/i915/gt/intel_reset.c
index 6b2abe1657ad..fa3fe23162a2 100644
--- a/drivers/gpu/drm/i915/gt/intel_reset.c
+++ b/drivers/gpu/drm/i915/gt/intel_reset.c
@@ -41,27 +41,30 @@ static void rmw_clear_fw(struct intel_uncore *uncore, i915_reg_t reg, u32 clr)
 static void engine_skip_context(struct i915_request *rq)
 {
 	struct intel_engine_cs *engine = rq->engine;
-	struct i915_gem_context *hung_ctx = rq->gem_context;
+	struct intel_context *hung_ctx = rq->context;
 
 	if (!i915_request_is_active(rq))
 		return;
 
 	lockdep_assert_held(&engine->active.lock);
 	list_for_each_entry_continue(rq, &engine->active.requests, sched.link)
-		if (rq->gem_context == hung_ctx)
+		if (rq->context == hung_ctx)
 			i915_request_skip(rq, -EIO);
 }
 
-static void client_mark_guilty(struct drm_i915_file_private *file_priv,
-			       const struct i915_gem_context *ctx)
+static void client_mark_guilty(struct i915_request *rq, bool banned)
 {
-	unsigned int score;
+	struct i915_gem_context *ctx = rq->context->gem_context;
+	struct drm_i915_file_private *file_priv = ctx->file_priv;
 	unsigned long prev_hang;
+	unsigned int score;
 
-	if (i915_gem_context_is_banned(ctx))
+	if (IS_ERR_OR_NULL(file_priv))
+		return;
+
+	score = 0;
+	if (banned)
 		score = I915_CLIENT_SCORE_CONTEXT_BAN;
-	else
-		score = 0;
 
 	prev_hang = xchg(&file_priv->hang_timestamp, jiffies);
 	if (time_before(jiffies, prev_hang + I915_CLIENT_FAST_HANG_JIFFIES))
@@ -76,14 +79,15 @@ static void client_mark_guilty(struct drm_i915_file_private *file_priv,
 	}
 }
 
-static bool context_mark_guilty(struct i915_gem_context *ctx)
+static bool mark_guilty(struct i915_request *rq)
 {
+	struct i915_gem_context *ctx = rq->context->gem_context;
 	unsigned long prev_hang;
 	bool banned;
 	int i;
 
 	if (i915_gem_context_is_closed(ctx)) {
-		i915_gem_context_set_banned(ctx);
+		intel_context_set_banned(rq->context);
 		return true;
 	}
 
@@ -110,18 +114,17 @@ static bool context_mark_guilty(struct i915_gem_context *ctx)
 	if (banned) {
 		DRM_DEBUG_DRIVER("context %s: guilty %d, banned\n",
 				 ctx->name, atomic_read(&ctx->guilty_count));
-		i915_gem_context_set_banned(ctx);
+		intel_context_set_banned(rq->context);
 	}
 
-	if (!IS_ERR_OR_NULL(ctx->file_priv))
-		client_mark_guilty(ctx->file_priv, ctx);
+	client_mark_guilty(rq, banned);
 
 	return banned;
 }
 
-static void context_mark_innocent(struct i915_gem_context *ctx)
+static void mark_innocent(struct i915_request *rq)
 {
-	atomic_inc(&ctx->active_count);
+	atomic_inc(&rq->context->gem_context->active_count);
 }
 
 void __i915_request_reset(struct i915_request *rq, bool guilty)
@@ -137,11 +140,11 @@ void __i915_request_reset(struct i915_request *rq, bool guilty)
 	rcu_read_lock(); /* protect the GEM context */
 	if (guilty) {
 		i915_request_skip(rq, -EIO);
-		if (context_mark_guilty(rq->gem_context))
+		if (mark_guilty(rq))
 			engine_skip_context(rq);
 	} else {
 		dma_fence_set_error(&rq->fence, -EAGAIN);
-		context_mark_innocent(rq->gem_context);
+		mark_innocent(rq);
 	}
 	rcu_read_unlock();
 }
diff --git a/drivers/gpu/drm/i915/gt/intel_ring_submission.c b/drivers/gpu/drm/i915/gt/intel_ring_submission.c
index e01766c5c06d..9897394b81e6 100644
--- a/drivers/gpu/drm/i915/gt/intel_ring_submission.c
+++ b/drivers/gpu/drm/i915/gt/intel_ring_submission.c
@@ -1502,7 +1502,7 @@ static inline int mi_set_context(struct i915_request *rq, u32 flags)
 
 	*cs++ = MI_NOOP;
 	*cs++ = MI_SET_CONTEXT;
-	*cs++ = i915_ggtt_offset(rq->hw_context->state) | flags;
+	*cs++ = i915_ggtt_offset(rq->context->state) | flags;
 	/*
 	 * w/a: MI_SET_CONTEXT must always be followed by MI_NOOP
 	 * WaMiSetContext_Hang:snb,ivb,vlv
@@ -1572,7 +1572,7 @@ static int remap_l3_slice(struct i915_request *rq, int slice)
 
 static int remap_l3(struct i915_request *rq)
 {
-	struct i915_gem_context *ctx = rq->gem_context;
+	struct i915_gem_context *ctx = rq->context->gem_context;
 	int i, err;
 
 	if (!ctx->remap_slice)
@@ -1593,7 +1593,7 @@ static int remap_l3(struct i915_request *rq)
 
 static int switch_context(struct i915_request *rq)
 {
-	struct intel_context *ce = rq->hw_context;
+	struct intel_context *ce = rq->context;
 	struct i915_address_space *vm = vm_alias(ce);
 	u32 hw_flags = 0;
 	int ret;
@@ -1656,7 +1656,7 @@ static int ring_request_alloc(struct i915_request *request)
 {
 	int ret;
 
-	GEM_BUG_ON(!intel_context_is_pinned(request->hw_context));
+	GEM_BUG_ON(!intel_context_is_pinned(request->context));
 	GEM_BUG_ON(i915_request_timeline(request)->has_initial_breadcrumb);
 
 	/*
diff --git a/drivers/gpu/drm/i915/gt/selftest_lrc.c b/drivers/gpu/drm/i915/gt/selftest_lrc.c
index 3976198eff37..a7f3dbe4a113 100644
--- a/drivers/gpu/drm/i915/gt/selftest_lrc.c
+++ b/drivers/gpu/drm/i915/gt/selftest_lrc.c
@@ -1237,13 +1237,13 @@ static int __cancel_active0(struct live_preempt_cancel *arg)
 				__func__, arg->engine->name))
 		return -EIO;
 
-	clear_bit(CONTEXT_BANNED, &arg->a.ctx->flags);
 	rq = spinner_create_request(&arg->a.spin,
 				    arg->a.ctx, arg->engine,
 				    MI_ARB_CHECK);
 	if (IS_ERR(rq))
 		return PTR_ERR(rq);
 
+	clear_bit(CONTEXT_BANNED, &rq->context->flags);
 	i915_request_get(rq);
 	i915_request_add(rq);
 	if (!igt_wait_for_spinner(&arg->a.spin, rq)) {
@@ -1251,7 +1251,7 @@ static int __cancel_active0(struct live_preempt_cancel *arg)
 		goto out;
 	}
 
-	i915_gem_context_set_banned(arg->a.ctx);
+	intel_context_set_banned(rq->context);
 	err = intel_engine_pulse(arg->engine);
 	if (err)
 		goto out;
@@ -1286,13 +1286,13 @@ static int __cancel_active1(struct live_preempt_cancel *arg)
 				__func__, arg->engine->name))
 		return -EIO;
 
-	clear_bit(CONTEXT_BANNED, &arg->a.ctx->flags);
 	rq[0] = spinner_create_request(&arg->a.spin,
 				       arg->a.ctx, arg->engine,
 				       MI_NOOP); /* no preemption */
 	if (IS_ERR(rq[0]))
 		return PTR_ERR(rq[0]);
 
+	clear_bit(CONTEXT_BANNED, &rq[0]->context->flags);
 	i915_request_get(rq[0]);
 	i915_request_add(rq[0]);
 	if (!igt_wait_for_spinner(&arg->a.spin, rq[0])) {
@@ -1300,7 +1300,6 @@ static int __cancel_active1(struct live_preempt_cancel *arg)
 		goto out;
 	}
 
-	clear_bit(CONTEXT_BANNED, &arg->b.ctx->flags);
 	rq[1] = spinner_create_request(&arg->b.spin,
 				       arg->b.ctx, arg->engine,
 				       MI_ARB_CHECK);
@@ -1309,13 +1308,14 @@ static int __cancel_active1(struct live_preempt_cancel *arg)
 		goto out;
 	}
 
+	clear_bit(CONTEXT_BANNED, &rq[1]->context->flags);
 	i915_request_get(rq[1]);
 	err = i915_request_await_dma_fence(rq[1], &rq[0]->fence);
 	i915_request_add(rq[1]);
 	if (err)
 		goto out;
 
-	i915_gem_context_set_banned(arg->b.ctx);
+	intel_context_set_banned(rq[1]->context);
 	err = intel_engine_pulse(arg->engine);
 	if (err)
 		goto out;
@@ -1358,13 +1358,13 @@ static int __cancel_queued(struct live_preempt_cancel *arg)
 				__func__, arg->engine->name))
 		return -EIO;
 
-	clear_bit(CONTEXT_BANNED, &arg->a.ctx->flags);
 	rq[0] = spinner_create_request(&arg->a.spin,
 				       arg->a.ctx, arg->engine,
 				       MI_ARB_CHECK);
 	if (IS_ERR(rq[0]))
 		return PTR_ERR(rq[0]);
 
+	clear_bit(CONTEXT_BANNED, &rq[0]->context->flags);
 	i915_request_get(rq[0]);
 	i915_request_add(rq[0]);
 	if (!igt_wait_for_spinner(&arg->a.spin, rq[0])) {
@@ -1372,13 +1372,13 @@ static int __cancel_queued(struct live_preempt_cancel *arg)
 		goto out;
 	}
 
-	clear_bit(CONTEXT_BANNED, &arg->b.ctx->flags);
 	rq[1] = igt_request_alloc(arg->b.ctx, arg->engine);
 	if (IS_ERR(rq[1])) {
 		err = PTR_ERR(rq[1]);
 		goto out;
 	}
 
+	clear_bit(CONTEXT_BANNED, &rq[1]->context->flags);
 	i915_request_get(rq[1]);
 	err = i915_request_await_dma_fence(rq[1], &rq[0]->fence);
 	i915_request_add(rq[1]);
@@ -1399,7 +1399,7 @@ static int __cancel_queued(struct live_preempt_cancel *arg)
 	if (err)
 		goto out;
 
-	i915_gem_context_set_banned(arg->a.ctx);
+	intel_context_set_banned(rq[2]->context);
 	err = intel_engine_pulse(arg->engine);
 	if (err)
 		goto out;
@@ -1446,13 +1446,13 @@ static int __cancel_hostile(struct live_preempt_cancel *arg)
 		return 0;
 
 	GEM_TRACE("%s(%s)\n", __func__, arg->engine->name);
-	clear_bit(CONTEXT_BANNED, &arg->a.ctx->flags);
 	rq = spinner_create_request(&arg->a.spin,
 				    arg->a.ctx, arg->engine,
 				    MI_NOOP); /* preemption disabled */
 	if (IS_ERR(rq))
 		return PTR_ERR(rq);
 
+	clear_bit(CONTEXT_BANNED, &rq->context->flags);
 	i915_request_get(rq);
 	i915_request_add(rq);
 	if (!igt_wait_for_spinner(&arg->a.spin, rq)) {
@@ -1460,7 +1460,7 @@ static int __cancel_hostile(struct live_preempt_cancel *arg)
 		goto out;
 	}
 
-	i915_gem_context_set_banned(arg->a.ctx);
+	intel_context_set_banned(rq->context);
 	err = intel_engine_pulse(arg->engine); /* force reset */
 	if (err)
 		goto out;
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
index 172220e83079..1d615d375bfa 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
@@ -218,7 +218,7 @@ static void guc_wq_item_append(struct intel_guc *guc,
 static void guc_add_request(struct intel_guc *guc, struct i915_request *rq)
 {
 	struct intel_engine_cs *engine = rq->engine;
-	u32 ctx_desc = lower_32_bits(rq->hw_context->lrc_desc);
+	u32 ctx_desc = lower_32_bits(rq->context->lrc_desc);
 	u32 ring_tail = intel_ring_set_tail(rq->ring, rq->tail) / sizeof(u64);
 
 	guc_wq_item_append(guc, engine->guc_id, ctx_desc,
@@ -316,7 +316,7 @@ static void __guc_dequeue(struct intel_engine_cs *engine)
 		int i;
 
 		priolist_for_each_request_consume(rq, rn, p, i) {
-			if (last && rq->hw_context != last->hw_context) {
+			if (last && rq->context != last->context) {
 				if (port == last_port)
 					goto done;
 
@@ -421,7 +421,7 @@ static void guc_reset(struct intel_engine_cs *engine, bool stalled)
 		stalled = false;
 
 	__i915_request_reset(rq, stalled);
-	intel_lr_context_reset(engine, rq->hw_context, rq->head, stalled);
+	intel_lr_context_reset(engine, rq->context, rq->head, stalled);
 
 out_unlock:
 	spin_unlock_irqrestore(&engine->active.lock, flags);
diff --git a/drivers/gpu/drm/i915/gvt/scheduler.c b/drivers/gpu/drm/i915/gvt/scheduler.c
index 5b2a7d072ec9..228c66534e21 100644
--- a/drivers/gpu/drm/i915/gvt/scheduler.c
+++ b/drivers/gpu/drm/i915/gvt/scheduler.c
@@ -59,7 +59,7 @@ static void set_context_pdp_root_pointer(
 static void update_shadow_pdps(struct intel_vgpu_workload *workload)
 {
 	struct drm_i915_gem_object *ctx_obj =
-		workload->req->hw_context->state->obj;
+		workload->req->context->state->obj;
 	struct execlist_ring_context *shadow_ring_context;
 	struct page *page;
 
@@ -130,7 +130,7 @@ static int populate_shadow_context(struct intel_vgpu_workload *workload)
 	struct intel_gvt *gvt = vgpu->gvt;
 	int ring_id = workload->ring_id;
 	struct drm_i915_gem_object *ctx_obj =
-		workload->req->hw_context->state->obj;
+		workload->req->context->state->obj;
 	struct execlist_ring_context *shadow_ring_context;
 	struct page *page;
 	void *dst;
@@ -205,9 +205,9 @@ static int populate_shadow_context(struct intel_vgpu_workload *workload)
 	return 0;
 }
 
-static inline bool is_gvt_request(struct i915_request *req)
+static inline bool is_gvt_request(struct i915_request *rq)
 {
-	return i915_gem_context_force_single_submission(req->gem_context);
+	return intel_context_force_single_submission(rq->context);
 }
 
 static void save_ring_hw_state(struct intel_vgpu *vgpu, int ring_id)
@@ -307,7 +307,7 @@ static int copy_workload_to_ring_buffer(struct intel_vgpu_workload *workload)
 	u32 *cs;
 	int err;
 
-	if (IS_GEN(req->i915, 9) && is_inhibit_context(req->hw_context))
+	if (IS_GEN(req->i915, 9) && is_inhibit_context(req->context))
 		intel_vgpu_restore_inhibit_context(vgpu, req);
 
 	/*
@@ -363,11 +363,10 @@ static void release_shadow_wa_ctx(struct intel_shadow_wa_ctx *wa_ctx)
 }
 
 static void set_context_ppgtt_from_shadow(struct intel_vgpu_workload *workload,
-					  struct i915_gem_context *ctx)
+					  struct intel_context *ce)
 {
 	struct intel_vgpu_mm *mm = workload->shadow_mm;
-	struct i915_ppgtt *ppgtt =
-		i915_vm_to_ppgtt(i915_gem_context_get_vm_rcu(ctx));
+	struct i915_ppgtt *ppgtt = i915_vm_to_ppgtt(ce->vm);
 	int i = 0;
 
 	if (mm->ppgtt_mm.root_entry_type == GTT_TYPE_PPGTT_ROOT_L4_ENTRY) {
@@ -380,8 +379,6 @@ static void set_context_ppgtt_from_shadow(struct intel_vgpu_workload *workload,
 			px_dma(pd) = mm->ppgtt_mm.shadow_pdps[i];
 		}
 	}
-
-	i915_vm_put(&ppgtt->vm);
 }
 
 static int
@@ -529,7 +526,7 @@ static void update_wa_ctx_2_shadow_ctx(struct intel_shadow_wa_ctx *wa_ctx)
 		container_of(wa_ctx, struct intel_vgpu_workload, wa_ctx);
 	struct i915_request *rq = workload->req;
 	struct execlist_ring_context *shadow_ring_context =
-		(struct execlist_ring_context *)rq->hw_context->lrc_reg_state;
+		(struct execlist_ring_context *)rq->context->lrc_reg_state;
 
 	shadow_ring_context->bb_per_ctx_ptr.val =
 		(shadow_ring_context->bb_per_ctx_ptr.val &
@@ -628,7 +625,7 @@ static int prepare_workload(struct intel_vgpu_workload *workload)
 
 	update_shadow_pdps(workload);
 
-	set_context_ppgtt_from_shadow(workload, s->shadow[ring]->gem_context);
+	set_context_ppgtt_from_shadow(workload, s->shadow[ring]);
 
 	ret = intel_vgpu_sync_oos_pages(workload->vgpu);
 	if (ret) {
@@ -787,7 +784,7 @@ static void update_guest_context(struct intel_vgpu_workload *workload)
 	struct i915_request *rq = workload->req;
 	struct intel_vgpu *vgpu = workload->vgpu;
 	struct intel_gvt *gvt = vgpu->gvt;
-	struct drm_i915_gem_object *ctx_obj = rq->hw_context->state->obj;
+	struct drm_i915_gem_object *ctx_obj = rq->context->state->obj;
 	struct execlist_ring_context *shadow_ring_context;
 	struct page *page;
 	void *src;
@@ -1232,8 +1229,6 @@ int intel_vgpu_setup_submission(struct intel_vgpu *vgpu)
 	if (IS_ERR(ctx))
 		return PTR_ERR(ctx);
 
-	i915_gem_context_set_force_single_submission(ctx);
-
 	ppgtt = i915_vm_to_ppgtt(i915_gem_context_get_vm_rcu(ctx));
 	i915_context_ppgtt_root_save(s, ppgtt);
 
@@ -1249,6 +1244,8 @@ int intel_vgpu_setup_submission(struct intel_vgpu *vgpu)
 			goto out_shadow_ctx;
 		}
 
+		intel_context_set_single_submission(ce);
+
 		if (!USES_GUC_SUBMISSION(i915)) { /* Max ring buffer size */
 			const unsigned int ring_size = 512 * SZ_4K;
 
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 5eeef1ef7448..fe5ba72f3ce4 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -1168,19 +1168,18 @@ static int __intel_engines_record_defaults(struct intel_gt *gt)
 		if (!rq)
 			continue;
 
-		GEM_BUG_ON(!test_bit(CONTEXT_ALLOC_BIT,
-				     &rq->hw_context->flags));
-		state = rq->hw_context->state;
+		GEM_BUG_ON(!test_bit(CONTEXT_ALLOC_BIT, &rq->context->flags));
+		state = rq->context->state;
 		if (!state)
 			continue;
 
 		/* Serialise with retirement on another CPU */
-		err = __intel_context_flush_retire(rq->hw_context);
+		err = __intel_context_flush_retire(rq->context);
 		if (err)
 			goto out;
 
 		/* We want to be able to unbind the state from the GGTT */
-		GEM_BUG_ON(intel_context_is_pinned(rq->hw_context));
+		GEM_BUG_ON(intel_context_is_pinned(rq->context));
 
 		/*
 		 * As we will hold a reference to the logical state, it will
@@ -1230,7 +1229,7 @@ static int __intel_engines_record_defaults(struct intel_gt *gt)
 		if (!rq)
 			continue;
 
-		ce = rq->hw_context;
+		ce = rq->context;
 		i915_request_put(rq);
 		intel_context_put(ce);
 	}
diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
index 8374d50c0770..7e2cb063110c 100644
--- a/drivers/gpu/drm/i915/i915_gpu_error.c
+++ b/drivers/gpu/drm/i915/i915_gpu_error.c
@@ -1221,7 +1221,7 @@ static void error_record_engine_registers(struct i915_gpu_state *error,
 static void record_request(const struct i915_request *request,
 			   struct drm_i915_error_request *erq)
 {
-	const struct i915_gem_context *ctx = request->gem_context;
+	const struct i915_gem_context *ctx = request->context->gem_context;
 
 	erq->flags = request->fence.flags;
 	erq->context = request->fence.context;
@@ -1231,7 +1231,7 @@ static void record_request(const struct i915_request *request,
 	erq->start = i915_ggtt_offset(request->ring->vma);
 	erq->head = request->head;
 	erq->tail = request->tail;
-	erq->pid = ctx->pid ? pid_nr(ctx->pid) : 0;
+	erq->pid = ctx && ctx->pid ? pid_nr(ctx->pid) : 0;
 }
 
 static void engine_record_requests(struct intel_engine_cs *engine,
@@ -1298,7 +1298,10 @@ static void error_record_engine_execlists(const struct intel_engine_cs *engine,
 static bool record_context(struct drm_i915_error_context *e,
 			   const struct i915_request *rq)
 {
-	const struct i915_gem_context *ctx = rq->gem_context;
+	const struct i915_gem_context *ctx = rq->context->gem_context;
+
+	if (!ctx)
+		return false;
 
 	if (ctx->pid) {
 		struct task_struct *task;
@@ -1452,7 +1455,7 @@ gem_record_rings(struct i915_gpu_state *error, struct compress *compress)
 		capture = request_record_user_bo(request, ee, capture);
 
 		capture = capture_vma(capture,
-				      request->hw_context->state,
+				      request->context->state,
 				      &ee->ctx);
 
 		capture = capture_vma(capture,
diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c
index 8d2e37949f46..4a89da21ae51 100644
--- a/drivers/gpu/drm/i915/i915_perf.c
+++ b/drivers/gpu/drm/i915/i915_perf.c
@@ -3110,7 +3110,7 @@ static void i915_perf_enable_locked(struct i915_perf_stream *stream)
 		stream->ops->enable(stream);
 
 	if (stream->hold_preemption)
-		i915_gem_context_set_nopreempt(stream->ctx);
+		intel_context_set_nopreempt(stream->pinned_ctx);
 }
 
 /**
@@ -3136,7 +3136,7 @@ static void i915_perf_disable_locked(struct i915_perf_stream *stream)
 	stream->enabled = false;
 
 	if (stream->hold_preemption)
-		i915_gem_context_clear_nopreempt(stream->ctx);
+		intel_context_clear_nopreempt(stream->pinned_ctx);
 
 	if (stream->ops->disable)
 		stream->ops->disable(stream);
diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
index 5d8692642c6b..627600dedcc7 100644
--- a/drivers/gpu/drm/i915/i915_request.c
+++ b/drivers/gpu/drm/i915/i915_request.c
@@ -62,6 +62,8 @@ static const char *i915_fence_get_driver_name(struct dma_fence *fence)
 
 static const char *i915_fence_get_timeline_name(struct dma_fence *fence)
 {
+	const struct i915_gem_context *ctx;
+
 	/*
 	 * The timeline struct (as part of the ppgtt underneath a context)
 	 * may be freed when the request is no longer in use by the GPU.
@@ -74,7 +76,11 @@ static const char *i915_fence_get_timeline_name(struct dma_fence *fence)
 	if (test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &fence->flags))
 		return "signaled";
 
-	return to_request(fence)->gem_context->name ?: "[" DRIVER_NAME "]";
+	ctx = to_request(fence)->context->gem_context;
+	if (!ctx)
+		return "[" DRIVER_NAME "]";
+
+	return ctx->name;
 }
 
 static bool i915_fence_signaled(struct dma_fence *fence)
@@ -272,8 +278,8 @@ bool i915_request_retire(struct i915_request *rq)
 	remove_from_client(rq);
 	list_del(&rq->link);
 
-	intel_context_exit(rq->hw_context);
-	intel_context_unpin(rq->hw_context);
+	intel_context_exit(rq->context);
+	intel_context_unpin(rq->context);
 
 	free_capture_list(rq);
 	i915_sched_node_fini(&rq->sched);
@@ -378,7 +384,7 @@ bool __i915_request_submit(struct i915_request *request)
 	if (i915_request_completed(request))
 		goto xfer;
 
-	if (i915_gem_context_is_banned(request->gem_context))
+	if (intel_context_is_banned(request->context))
 		i915_request_skip(request, -EIO);
 
 	/*
@@ -660,8 +666,7 @@ __i915_request_create(struct intel_context *ce, gfp_t gfp)
 		goto err_free;
 
 	rq->i915 = ce->engine->i915;
-	rq->hw_context = ce;
-	rq->gem_context = ce->gem_context;
+	rq->context = ce;
 	rq->engine = ce->engine;
 	rq->ring = ce->ring;
 	rq->execution_mask = ce->engine->mask;
@@ -929,7 +934,7 @@ i915_request_await_request(struct i915_request *to, struct i915_request *from)
 						       &from->submit,
 						       I915_FENCE_GFP);
 	} else if (intel_engine_has_semaphores(to->engine) &&
-		   to->gem_context->sched.priority >= I915_PRIORITY_NORMAL) {
+		   to->context->gem_context->sched.priority >= I915_PRIORITY_NORMAL) {
 		ret = emit_semaphore_wait(to, from, I915_FENCE_GFP);
 	} else {
 		ret = i915_sw_fence_await_dma_fence(&to->submit,
@@ -1311,7 +1316,7 @@ void __i915_request_queue(struct i915_request *rq,
 
 void i915_request_add(struct i915_request *rq)
 {
-	struct i915_sched_attr attr = rq->gem_context->sched;
+	struct i915_sched_attr attr = rq->context->gem_context->sched;
 	struct intel_timeline * const tl = i915_request_timeline(rq);
 	struct i915_request *prev;
 
diff --git a/drivers/gpu/drm/i915/i915_request.h b/drivers/gpu/drm/i915/i915_request.h
index 96991d64759c..b3b6534ae798 100644
--- a/drivers/gpu/drm/i915/i915_request.h
+++ b/drivers/gpu/drm/i915/i915_request.h
@@ -109,9 +109,8 @@ struct i915_request {
 	 * i915_request_free() will then decrement the refcount on the
 	 * context.
 	 */
-	struct i915_gem_context *gem_context;
 	struct intel_engine_cs *engine;
-	struct intel_context *hw_context;
+	struct intel_context *context;
 	struct intel_ring *ring;
 	struct intel_timeline __rcu *timeline;
 	struct list_head signal_link;
diff --git a/drivers/gpu/drm/i915/i915_scheduler.c b/drivers/gpu/drm/i915/i915_scheduler.c
index 00f5d107f2b7..416631985199 100644
--- a/drivers/gpu/drm/i915/i915_scheduler.c
+++ b/drivers/gpu/drm/i915/i915_scheduler.c
@@ -213,7 +213,7 @@ static void kick_submission(struct intel_engine_cs *engine,
 	 * If we are already the currently executing context, don't
 	 * bother evaluating if we should preempt ourselves.
 	 */
-	if (inflight->hw_context == rq->hw_context)
+	if (inflight->context == rq->context)
 		goto unlock;
 
 	if (need_preempt(prio, rq_prio(inflight)))
-- 
2.24.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [Intel-gfx] [PATCH 19/33] drm/i915: Push the use-semaphore marker onto the intel_context
  2019-12-12 14:04 [Intel-gfx] [PATCH 01/33] drm/i915: Use EAGAIN for trylock failures Chris Wilson
                   ` (16 preceding siblings ...)
  2019-12-12 14:04 ` [Intel-gfx] [PATCH 18/33] drm/i915: Drop GEM context as a direct link from i915_request Chris Wilson
@ 2019-12-12 14:04 ` Chris Wilson
  2019-12-12 14:04 ` [Intel-gfx] [PATCH 20/33] drm/i915: Remove i915->kernel_context Chris Wilson
                   ` (14 subsequent siblings)
  32 siblings, 0 replies; 46+ messages in thread
From: Chris Wilson @ 2019-12-12 14:04 UTC (permalink / raw)
  To: intel-gfx

Instead of rummaging through the intel_context to peek at the GEM
context in the middle of request submission to decide whether to use
semaphores, store that information on the intel_context itself.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/gem/i915_gem_context.c   | 54 +++++++++++++------
 drivers/gpu/drm/i915/gt/intel_context.c       |  3 ++
 drivers/gpu/drm/i915/gt/intel_context.h       | 15 ++++++
 drivers/gpu/drm/i915/gt/intel_context_types.h |  7 +--
 drivers/gpu/drm/i915/i915_request.c           |  8 ++-
 5 files changed, 62 insertions(+), 25 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c
index 80278261a640..101d221f1682 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
@@ -1918,6 +1918,42 @@ set_persistence(struct i915_gem_context *ctx,
 	return __context_set_persistence(ctx, args->value);
 }
 
+static int __apply_priority(struct intel_context *ce, void *arg)
+{
+	struct i915_gem_context *ctx = arg;
+
+	if (intel_context_use_semaphores(ce) &&
+	    ctx->sched.priority < I915_PRIORITY_NORMAL)
+		intel_context_clear_use_semaphores(ce);
+
+	return 0;
+}
+
+static int set_priority(struct i915_gem_context *ctx,
+			const struct drm_i915_gem_context_param *args)
+{
+	s64 priority = args->value;
+
+	if (args->size)
+		return -EINVAL;
+
+	if (!(ctx->i915->caps.scheduler & I915_SCHEDULER_CAP_PRIORITY))
+		return -ENODEV;
+
+	if (priority > I915_CONTEXT_MAX_USER_PRIORITY ||
+	    priority < I915_CONTEXT_MIN_USER_PRIORITY)
+		return -EINVAL;
+
+	if (priority > I915_CONTEXT_DEFAULT_PRIORITY &&
+	    !capable(CAP_SYS_NICE))
+		return -EPERM;
+
+	ctx->sched.priority = I915_USER_PRIORITY(priority);
+	context_apply_all(ctx, __apply_priority, ctx);
+
+	return 0;
+}
+
 static int ctx_setparam(struct drm_i915_file_private *fpriv,
 			struct i915_gem_context *ctx,
 			struct drm_i915_gem_context_param *args)
@@ -1964,23 +2000,7 @@ static int ctx_setparam(struct drm_i915_file_private *fpriv,
 		break;
 
 	case I915_CONTEXT_PARAM_PRIORITY:
-		{
-			s64 priority = args->value;
-
-			if (args->size)
-				ret = -EINVAL;
-			else if (!(ctx->i915->caps.scheduler & I915_SCHEDULER_CAP_PRIORITY))
-				ret = -ENODEV;
-			else if (priority > I915_CONTEXT_MAX_USER_PRIORITY ||
-				 priority < I915_CONTEXT_MIN_USER_PRIORITY)
-				ret = -EINVAL;
-			else if (priority > I915_CONTEXT_DEFAULT_PRIORITY &&
-				 !capable(CAP_SYS_NICE))
-				ret = -EPERM;
-			else
-				ctx->sched.priority =
-					I915_USER_PRIORITY(priority);
-		}
+		ret = set_priority(ctx, args);
 		break;
 
 	case I915_CONTEXT_PARAM_SSEU:
diff --git a/drivers/gpu/drm/i915/gt/intel_context.c b/drivers/gpu/drm/i915/gt/intel_context.c
index 4c310cee1f04..eb048cfd6b7e 100644
--- a/drivers/gpu/drm/i915/gt/intel_context.c
+++ b/drivers/gpu/drm/i915/gt/intel_context.c
@@ -245,6 +245,9 @@ intel_context_init(struct intel_context *ce,
 	rcu_read_unlock();
 	if (ctx->timeline)
 		ce->timeline = intel_timeline_get(ctx->timeline);
+	if (ctx->sched.priority >= I915_PRIORITY_NORMAL &&
+	    intel_engine_has_semaphores(engine))
+		__set_bit(CONTEXT_USE_SEMAPHORES, &ce->flags);
 
 	ce->engine = engine;
 	ce->ops = engine->cops;
diff --git a/drivers/gpu/drm/i915/gt/intel_context.h b/drivers/gpu/drm/i915/gt/intel_context.h
index 1e607343d256..d7b667a26e08 100644
--- a/drivers/gpu/drm/i915/gt/intel_context.h
+++ b/drivers/gpu/drm/i915/gt/intel_context.h
@@ -155,6 +155,21 @@ static inline struct intel_ring *__intel_context_ring_size(u64 sz)
 	return u64_to_ptr(struct intel_ring, sz);
 }
 
+static inline bool intel_context_use_semaphores(const struct intel_context *ce)
+{
+	return test_bit(CONTEXT_USE_SEMAPHORES, &ce->flags);
+}
+
+static inline void intel_context_set_use_semaphores(struct intel_context *ce)
+{
+	set_bit(CONTEXT_USE_SEMAPHORES, &ce->flags);
+}
+
+static inline void intel_context_clear_use_semaphores(struct intel_context *ce)
+{
+	clear_bit(CONTEXT_USE_SEMAPHORES, &ce->flags);
+}
+
 static inline bool intel_context_is_banned(const struct intel_context *ce)
 {
 	return test_bit(CONTEXT_BANNED, &ce->flags);
diff --git a/drivers/gpu/drm/i915/gt/intel_context_types.h b/drivers/gpu/drm/i915/gt/intel_context_types.h
index 597448f6e98b..af0d55b111f5 100644
--- a/drivers/gpu/drm/i915/gt/intel_context_types.h
+++ b/drivers/gpu/drm/i915/gt/intel_context_types.h
@@ -56,9 +56,10 @@ struct intel_context {
 	unsigned long flags;
 #define CONTEXT_ALLOC_BIT		0
 #define CONTEXT_VALID_BIT		1
-#define CONTEXT_BANNED			2
-#define CONTEXT_FORCE_SINGLE_SUBMISSION	3
-#define CONTEXT_NOPREEMPT		4
+#define CONTEXT_USE_SEMAPHORES		2
+#define CONTEXT_BANNED			3
+#define CONTEXT_FORCE_SINGLE_SUBMISSION	4
+#define CONTEXT_NOPREEMPT		5
 
 	u32 *lrc_reg_state;
 	u64 lrc_desc;
diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
index 627600dedcc7..8ac3ea65a105 100644
--- a/drivers/gpu/drm/i915/i915_request.c
+++ b/drivers/gpu/drm/i915/i915_request.c
@@ -929,18 +929,16 @@ i915_request_await_request(struct i915_request *to, struct i915_request *from)
 			return ret;
 	}
 
-	if (to->engine == from->engine) {
+	if (to->engine == from->engine)
 		ret = i915_sw_fence_await_sw_fence_gfp(&to->submit,
 						       &from->submit,
 						       I915_FENCE_GFP);
-	} else if (intel_engine_has_semaphores(to->engine) &&
-		   to->context->gem_context->sched.priority >= I915_PRIORITY_NORMAL) {
+	else if (intel_context_use_semaphores(to->context))
 		ret = emit_semaphore_wait(to, from, I915_FENCE_GFP);
-	} else {
+	else
 		ret = i915_sw_fence_await_dma_fence(&to->submit,
 						    &from->fence, 0,
 						    I915_FENCE_GFP);
-	}
 	if (ret < 0)
 		return ret;
 
-- 
2.24.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [Intel-gfx] [PATCH 20/33] drm/i915: Remove i915->kernel_context
  2019-12-12 14:04 [Intel-gfx] [PATCH 01/33] drm/i915: Use EAGAIN for trylock failures Chris Wilson
                   ` (17 preceding siblings ...)
  2019-12-12 14:04 ` [Intel-gfx] [PATCH 19/33] drm/i915: Push the use-semaphore marker onto the intel_context Chris Wilson
@ 2019-12-12 14:04 ` Chris Wilson
  2019-12-12 14:04 ` [Intel-gfx] [PATCH 21/33] drm/i915: Move i915_gem_init_contexts() earlier Chris Wilson
                   ` (13 subsequent siblings)
  32 siblings, 0 replies; 46+ messages in thread
From: Chris Wilson @ 2019-12-12 14:04 UTC (permalink / raw)
  To: intel-gfx

Allocate only an internal intel_context for the kernel_context, forgoing
a global GEM context for internal use as we only require a separate
address space (for our own protection).

Now having weaned GT from requiring ce->gem_context, we can stop
referencing it entirely. This also means we no longer have to create random
and unnecessary GEM contexts for internal use.

GEM contexts are now entirely for tracking GEM clients, and intel_context
the execution environment on the GPU.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/gem/i915_gem_context.c   |  96 +++++------
 drivers/gpu/drm/i915/gem/i915_gem_context.h   |   5 +-
 drivers/gpu/drm/i915/gem/i915_gem_userptr.c   |   6 +-
 .../drm/i915/gem/selftests/i915_gem_context.c |   5 +-
 .../gpu/drm/i915/gem/selftests/mock_context.c |  11 +-
 drivers/gpu/drm/i915/gt/intel_context.c       |  32 +---
 drivers/gpu/drm/i915/gt/intel_context.h       |   9 +-
 drivers/gpu/drm/i915/gt/intel_context_types.h |  13 +-
 drivers/gpu/drm/i915/gt/intel_engine_cs.c     |  21 +--
 drivers/gpu/drm/i915/gt/intel_gt.c            |  25 ++-
 drivers/gpu/drm/i915/gt/intel_gt_types.h      |   7 +
 drivers/gpu/drm/i915/gt/intel_lrc.c           |  21 +--
 drivers/gpu/drm/i915/gt/intel_lrc.h           |   6 +-
 drivers/gpu/drm/i915/gt/intel_reset.c         |  14 +-
 .../gpu/drm/i915/gt/intel_ring_submission.c   |   2 +-
 drivers/gpu/drm/i915/gt/selftest_context.c    |  57 ++-----
 .../drm/i915/gt/selftest_engine_heartbeat.c   |   3 +-
 drivers/gpu/drm/i915/gt/selftest_hangcheck.c  | 141 ++++++----------
 drivers/gpu/drm/i915/gt/selftest_lrc.c        | 159 ++++++------------
 drivers/gpu/drm/i915/gt/selftest_mocs.c       |   6 +-
 drivers/gpu/drm/i915/gt/selftest_rc6.c        |   3 +-
 .../gpu/drm/i915/gt/selftest_workarounds.c    |  72 +++-----
 drivers/gpu/drm/i915/gvt/scheduler.c          |  16 +-
 drivers/gpu/drm/i915/i915_active.c            |   2 +
 drivers/gpu/drm/i915/i915_drv.h               |   3 -
 drivers/gpu/drm/i915/i915_gem.c               |  18 +-
 drivers/gpu/drm/i915/i915_gem_gtt.c           |   8 +-
 drivers/gpu/drm/i915/i915_perf.c              |   3 -
 drivers/gpu/drm/i915/i915_request.c           |   5 +-
 drivers/gpu/drm/i915/selftests/i915_request.c |   6 +-
 .../gpu/drm/i915/selftests/mock_gem_device.c  |   8 +-
 31 files changed, 303 insertions(+), 480 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c
index 101d221f1682..df09f0ff87b6 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
@@ -210,6 +210,35 @@ context_get_vm_rcu(struct i915_gem_context *ctx)
 	} while (1);
 }
 
+static void intel_context_set_gem(struct intel_context *ce,
+				  struct i915_gem_context *ctx)
+{
+	GEM_BUG_ON(ce->gem_context);
+	ce->gem_context = ctx;
+
+	if (!test_bit(CONTEXT_ALLOC_BIT, &ce->flags))
+		ce->ring = __intel_context_ring_size(SZ_16K);
+
+	if (rcu_access_pointer(ctx->vm)) {
+		struct i915_address_space *vm;
+
+		rcu_read_lock();
+		vm = context_get_vm_rcu(ctx); /* hmm */
+		rcu_read_unlock();
+
+		i915_vm_put(ce->vm);
+		ce->vm = vm;
+	}
+
+	GEM_BUG_ON(ce->timeline);
+	if (ctx->timeline)
+		ce->timeline = intel_timeline_get(ctx->timeline);
+
+	if (ctx->sched.priority >= I915_PRIORITY_NORMAL &&
+	    intel_engine_has_semaphores(ce->engine))
+		__set_bit(CONTEXT_USE_SEMAPHORES, &ce->flags);
+}
+
 static void __free_engines(struct i915_gem_engines *e, unsigned int count)
 {
 	while (count--) {
@@ -252,12 +281,14 @@ static struct i915_gem_engines *default_engines(struct i915_gem_context *ctx)
 		GEM_BUG_ON(engine->legacy_idx >= I915_NUM_ENGINES);
 		GEM_BUG_ON(e->engines[engine->legacy_idx]);
 
-		ce = intel_context_create(ctx, engine);
+		ce = intel_context_create(engine);
 		if (IS_ERR(ce)) {
 			__free_engines(e, e->num_engines + 1);
 			return ERR_CAST(ce);
 		}
 
+		intel_context_set_gem(ce, ctx);
+
 		e->engines[engine->legacy_idx] = ce;
 		e->num_engines = max(e->num_engines, engine->legacy_idx);
 	}
@@ -715,37 +746,6 @@ i915_gem_create_context(struct drm_i915_private *i915, unsigned int flags)
 	return ctx;
 }
 
-static void
-destroy_kernel_context(struct i915_gem_context **ctxp)
-{
-	struct i915_gem_context *ctx;
-
-	/* Keep the context ref so that we can free it immediately ourselves */
-	ctx = i915_gem_context_get(fetch_and_zero(ctxp));
-	GEM_BUG_ON(!i915_gem_context_is_kernel(ctx));
-
-	context_close(ctx);
-	i915_gem_context_free(ctx);
-}
-
-struct i915_gem_context *
-i915_gem_context_create_kernel(struct drm_i915_private *i915, int prio)
-{
-	struct i915_gem_context *ctx;
-
-	ctx = i915_gem_create_context(i915, 0);
-	if (IS_ERR(ctx))
-		return ctx;
-
-	i915_gem_context_clear_bannable(ctx);
-	i915_gem_context_set_persistence(ctx);
-	ctx->sched.priority = I915_USER_PRIORITY(prio);
-
-	GEM_BUG_ON(!i915_gem_context_is_kernel(ctx));
-
-	return ctx;
-}
-
 static void init_contexts(struct i915_gem_contexts *gc)
 {
 	spin_lock_init(&gc->lock);
@@ -755,32 +755,16 @@ static void init_contexts(struct i915_gem_contexts *gc)
 	init_llist_head(&gc->free_list);
 }
 
-int i915_gem_init_contexts(struct drm_i915_private *i915)
+void i915_gem_init__contexts(struct drm_i915_private *i915)
 {
-	struct i915_gem_context *ctx;
-
-	/* Reassure ourselves we are only called once */
-	GEM_BUG_ON(i915->kernel_context);
-
 	init_contexts(&i915->gem.contexts);
-
-	/* lowest priority; idle task */
-	ctx = i915_gem_context_create_kernel(i915, I915_PRIORITY_MIN);
-	if (IS_ERR(ctx)) {
-		DRM_ERROR("Failed to create default global context\n");
-		return PTR_ERR(ctx);
-	}
-	i915->kernel_context = ctx;
-
 	DRM_DEBUG_DRIVER("%s context support initialized\n",
 			 DRIVER_CAPS(i915)->has_logical_contexts ?
 			 "logical" : "fake");
-	return 0;
 }
 
 void i915_gem_driver_release__contexts(struct drm_i915_private *i915)
 {
-	destroy_kernel_context(&i915->kernel_context);
 	flush_work(&i915->gem.contexts.free_work);
 }
 
@@ -1597,12 +1581,14 @@ set_engines__load_balance(struct i915_user_extension __user *base, void *data)
 		}
 	}
 
-	ce = intel_execlists_create_virtual(set->ctx, siblings, n);
+	ce = intel_execlists_create_virtual(siblings, n);
 	if (IS_ERR(ce)) {
 		err = PTR_ERR(ce);
 		goto out_siblings;
 	}
 
+	intel_context_set_gem(ce, set->ctx);
+
 	if (cmpxchg(&set->engines->engines[idx], NULL, ce)) {
 		intel_context_put(ce);
 		err = -EEXIST;
@@ -1772,12 +1758,14 @@ set_engines(struct i915_gem_context *ctx,
 			return -ENOENT;
 		}
 
-		ce = intel_context_create(ctx, engine);
+		ce = intel_context_create(engine);
 		if (IS_ERR(ce)) {
 			__free_engines(set.engines, n);
 			return PTR_ERR(ce);
 		}
 
+		intel_context_set_gem(ce, ctx);
+
 		set.engines->engines[n] = ce;
 	}
 	set.engines->num_engines = num_engines;
@@ -2096,14 +2084,16 @@ static int clone_engines(struct i915_gem_context *dst,
 		 */
 		if (intel_engine_is_virtual(engine))
 			clone->engines[n] =
-				intel_execlists_clone_virtual(dst, engine);
+				intel_execlists_clone_virtual(engine);
 		else
-			clone->engines[n] = intel_context_create(dst, engine);
+			clone->engines[n] = intel_context_create(engine);
 		if (IS_ERR_OR_NULL(clone->engines[n])) {
 			__free_engines(clone, n);
 			goto err_unlock;
 		}
 
+		intel_context_set_gem(clone->engines[n], dst);
+
 		/* Copy across the preferred ringsize */
 		if (copy_ring_size(clone->engines[n], e->engines[n])) {
 			__free_engines(clone, n + 1);
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.h b/drivers/gpu/drm/i915/gem/i915_gem_context.h
index 69932899803e..c519b9f8d83b 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.h
@@ -115,7 +115,7 @@ static inline bool i915_gem_context_is_kernel(struct i915_gem_context *ctx)
 }
 
 /* i915_gem_context.c */
-int __must_check i915_gem_init_contexts(struct drm_i915_private *i915);
+void i915_gem_init__contexts(struct drm_i915_private *i915);
 void i915_gem_driver_release__contexts(struct drm_i915_private *i915);
 
 int i915_gem_context_open(struct drm_i915_private *i915,
@@ -140,9 +140,6 @@ int i915_gem_context_setparam_ioctl(struct drm_device *dev, void *data,
 int i915_gem_context_reset_stats_ioctl(struct drm_device *dev, void *data,
 				       struct drm_file *file);
 
-struct i915_gem_context *
-i915_gem_context_create_kernel(struct drm_i915_private *i915, int prio);
-
 static inline struct i915_gem_context *
 i915_gem_context_get(struct i915_gem_context *ctx)
 {
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_userptr.c b/drivers/gpu/drm/i915/gem/i915_gem_userptr.c
index f7f66c62cf0e..e5558af111e2 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_userptr.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_userptr.c
@@ -779,15 +779,11 @@ i915_gem_userptr_ioctl(struct drm_device *dev,
 		return -EFAULT;
 
 	if (args->flags & I915_USERPTR_READ_ONLY) {
-		struct i915_address_space *vm;
-
 		/*
 		 * On almost all of the older hw, we cannot tell the GPU that
 		 * a page is readonly.
 		 */
-		vm = rcu_dereference_protected(dev_priv->kernel_context->vm,
-					       true); /* static vm */
-		if (!vm || !vm->has_read_only)
+		if (!dev_priv->gt.vm->has_read_only)
 			return -ENODEV;
 	}
 
diff --git a/drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c b/drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c
index 780e58fe5c64..7fc46861a54d 100644
--- a/drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c
@@ -337,7 +337,7 @@ static int live_parallel_switch(void *arg)
 			if (!data[m].ce[0])
 				continue;
 
-			ce = intel_context_create(ctx, data[m].ce[0]->engine);
+			ce = intel_context_create(data[m].ce[0]->engine);
 			if (IS_ERR(ce))
 				goto out;
 
@@ -1264,8 +1264,7 @@ __igt_ctx_sseu(struct drm_i915_private *i915,
 			hweight32(engine->sseu.slice_mask),
 			hweight32(pg_sseu.slice_mask));
 
-		ce = intel_context_create(engine->kernel_context->gem_context,
-					  engine);
+		ce = intel_context_create(engine);
 		if (IS_ERR(ce)) {
 			ret = PTR_ERR(ce);
 			goto out_put;
diff --git a/drivers/gpu/drm/i915/gem/selftests/mock_context.c b/drivers/gpu/drm/i915/gem/selftests/mock_context.c
index 53e89efb09c0..fdf2f120234c 100644
--- a/drivers/gpu/drm/i915/gem/selftests/mock_context.c
+++ b/drivers/gpu/drm/i915/gem/selftests/mock_context.c
@@ -96,7 +96,16 @@ live_context(struct drm_i915_private *i915, struct file *file)
 struct i915_gem_context *
 kernel_context(struct drm_i915_private *i915)
 {
-	return i915_gem_context_create_kernel(i915, I915_PRIORITY_NORMAL);
+	struct i915_gem_context *ctx;
+
+	ctx = i915_gem_create_context(i915, 0);
+	if (IS_ERR(ctx))
+		return ctx;
+
+	GEM_BUG_ON(!i915_gem_context_is_kernel(ctx));
+	i915_gem_context_clear_bannable(ctx);
+
+	return ctx;
 }
 
 void kernel_context_close(struct i915_gem_context *ctx)
diff --git a/drivers/gpu/drm/i915/gt/intel_context.c b/drivers/gpu/drm/i915/gt/intel_context.c
index eb048cfd6b7e..ce20874020b8 100644
--- a/drivers/gpu/drm/i915/gt/intel_context.c
+++ b/drivers/gpu/drm/i915/gt/intel_context.c
@@ -31,8 +31,7 @@ void intel_context_free(struct intel_context *ce)
 }
 
 struct intel_context *
-intel_context_create(struct i915_gem_context *ctx,
-		     struct intel_engine_cs *engine)
+intel_context_create(struct intel_engine_cs *engine)
 {
 	struct intel_context *ce;
 
@@ -40,7 +39,7 @@ intel_context_create(struct i915_gem_context *ctx,
 	if (!ce)
 		return ERR_PTR(-ENOMEM);
 
-	intel_context_init(ce, ctx, engine);
+	intel_context_init(ce, engine);
 	return ce;
 }
 
@@ -72,8 +71,6 @@ int __intel_context_do_pin(struct intel_context *ce)
 			  ce->engine->name, ce->timeline->fence_context,
 			  ce->ring->head, ce->ring->tail);
 
-		i915_gem_context_get(ce->gem_context); /* for ctx->ppgtt */
-
 		smp_mb__before_atomic(); /* flush pin before it is visible */
 	}
 
@@ -103,7 +100,6 @@ void intel_context_unpin(struct intel_context *ce)
 
 		ce->ops->unpin(ce);
 
-		i915_gem_context_put(ce->gem_context);
 		intel_context_active_release(ce);
 	}
 
@@ -205,7 +201,7 @@ int intel_context_active_acquire(struct intel_context *ce)
 		return err;
 
 	/* Preallocate tracking nodes */
-	if (!i915_gem_context_is_kernel(ce->gem_context)) {
+	if (!intel_context_is_barrier(ce)) {
 		err = i915_active_acquire_preallocate_barrier(&ce->active,
 							      ce->engine);
 		if (err) {
@@ -226,33 +222,19 @@ void intel_context_active_release(struct intel_context *ce)
 
 void
 intel_context_init(struct intel_context *ce,
-		   struct i915_gem_context *ctx,
 		   struct intel_engine_cs *engine)
 {
-	struct i915_address_space *vm;
-
 	GEM_BUG_ON(!engine->cops);
+	GEM_BUG_ON(!engine->gt->vm);
 
 	kref_init(&ce->ref);
 
-	ce->gem_context = ctx;
-	rcu_read_lock();
-	vm = rcu_dereference(ctx->vm);
-	if (vm)
-		ce->vm = i915_vm_get(vm);
-	else
-		ce->vm = i915_vm_get(&engine->gt->ggtt->vm);
-	rcu_read_unlock();
-	if (ctx->timeline)
-		ce->timeline = intel_timeline_get(ctx->timeline);
-	if (ctx->sched.priority >= I915_PRIORITY_NORMAL &&
-	    intel_engine_has_semaphores(engine))
-		__set_bit(CONTEXT_USE_SEMAPHORES, &ce->flags);
-
 	ce->engine = engine;
 	ce->ops = engine->cops;
 	ce->sseu = engine->sseu;
-	ce->ring = __intel_context_ring_size(SZ_16K);
+	ce->ring = __intel_context_ring_size(SZ_4K);
+
+	ce->vm = i915_vm_get(engine->gt->vm);
 
 	INIT_LIST_HEAD(&ce->signal_link);
 	INIT_LIST_HEAD(&ce->signals);
diff --git a/drivers/gpu/drm/i915/gt/intel_context.h b/drivers/gpu/drm/i915/gt/intel_context.h
index d7b667a26e08..80d4c2acc729 100644
--- a/drivers/gpu/drm/i915/gt/intel_context.h
+++ b/drivers/gpu/drm/i915/gt/intel_context.h
@@ -18,13 +18,11 @@
 #include "intel_timeline_types.h"
 
 void intel_context_init(struct intel_context *ce,
-			struct i915_gem_context *ctx,
 			struct intel_engine_cs *engine);
 void intel_context_fini(struct intel_context *ce);
 
 struct intel_context *
-intel_context_create(struct i915_gem_context *ctx,
-		     struct intel_engine_cs *engine);
+intel_context_create(struct intel_engine_cs *engine);
 
 void intel_context_free(struct intel_context *ce);
 
@@ -155,6 +153,11 @@ static inline struct intel_ring *__intel_context_ring_size(u64 sz)
 	return u64_to_ptr(struct intel_ring, sz);
 }
 
+static inline bool intel_context_is_barrier(const struct intel_context *ce)
+{
+	return test_bit(CONTEXT_BARRIER_BIT, &ce->flags);
+}
+
 static inline bool intel_context_use_semaphores(const struct intel_context *ce)
 {
 	return test_bit(CONTEXT_USE_SEMAPHORES, &ce->flags);
diff --git a/drivers/gpu/drm/i915/gt/intel_context_types.h b/drivers/gpu/drm/i915/gt/intel_context_types.h
index af0d55b111f5..7dd03ad9826c 100644
--- a/drivers/gpu/drm/i915/gt/intel_context_types.h
+++ b/drivers/gpu/drm/i915/gt/intel_context_types.h
@@ -54,12 +54,13 @@ struct intel_context {
 	struct intel_timeline *timeline;
 
 	unsigned long flags;
-#define CONTEXT_ALLOC_BIT		0
-#define CONTEXT_VALID_BIT		1
-#define CONTEXT_USE_SEMAPHORES		2
-#define CONTEXT_BANNED			3
-#define CONTEXT_FORCE_SINGLE_SUBMISSION	4
-#define CONTEXT_NOPREEMPT		5
+#define CONTEXT_BARRIER_BIT		0
+#define CONTEXT_ALLOC_BIT		1
+#define CONTEXT_VALID_BIT		2
+#define CONTEXT_USE_SEMAPHORES		3
+#define CONTEXT_BANNED			4
+#define CONTEXT_FORCE_SINGLE_SUBMISSION	5
+#define CONTEXT_NOPREEMPT		6
 
 	u32 *lrc_reg_state;
 	u64 lrc_desc;
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
index 0fe7be3cf21a..b72e37c2055e 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
@@ -758,11 +758,11 @@ create_kernel_context(struct intel_engine_cs *engine)
 	struct intel_context *ce;
 	int err;
 
-	ce = intel_context_create(engine->i915->kernel_context, engine);
+	ce = intel_context_create(engine);
 	if (IS_ERR(ce))
 		return ce;
 
-	ce->ring = __intel_context_ring_size(SZ_4K);
+	__set_bit(CONTEXT_BARRIER_BIT, &ce->flags);
 
 	err = intel_context_pin(ce);
 	if (err) {
@@ -799,6 +799,12 @@ int intel_engine_init_common(struct intel_engine_cs *engine)
 
 	engine->set_default_submission(engine);
 
+	ret = measure_breadcrumb_dw(engine);
+	if (ret < 0)
+		return ret;
+
+	engine->emit_fini_breadcrumb_dw = ret;
+
 	/*
 	 * We may need to do things with the shrinker which
 	 * require us to immediately switch back to the default
@@ -813,18 +819,7 @@ int intel_engine_init_common(struct intel_engine_cs *engine)
 
 	engine->kernel_context = ce;
 
-	ret = measure_breadcrumb_dw(engine);
-	if (ret < 0)
-		goto err_unpin;
-
-	engine->emit_fini_breadcrumb_dw = ret;
-
 	return 0;
-
-err_unpin:
-	intel_context_unpin(ce);
-	intel_context_put(ce);
-	return ret;
 }
 
 /**
diff --git a/drivers/gpu/drm/i915/gt/intel_gt.c b/drivers/gpu/drm/i915/gt/intel_gt.c
index c4fd8d65b8a3..af4f8c810009 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt.c
+++ b/drivers/gpu/drm/i915/gt/intel_gt.c
@@ -74,7 +74,6 @@ int intel_gt_init_hw(struct intel_gt *gt)
 	struct intel_uncore *uncore = gt->uncore;
 	int ret;
 
-	BUG_ON(!i915->kernel_context);
 	ret = intel_gt_terminally_wedged(gt);
 	if (ret)
 		return ret;
@@ -365,6 +364,14 @@ static void intel_gt_fini_scratch(struct intel_gt *gt)
 	i915_vma_unpin_and_release(&gt->scratch, 0);
 }
 
+static struct i915_address_space *kernel_vm(struct intel_gt *gt)
+{
+	if (INTEL_PPGTT(gt->i915) > INTEL_PPGTT_ALIASING)
+		return &i915_ppgtt_create(gt->i915)->vm;
+	else
+		return i915_vm_get(&gt->ggtt->vm);
+}
+
 int intel_gt_init(struct intel_gt *gt)
 {
 	int err;
@@ -375,7 +382,17 @@ int intel_gt_init(struct intel_gt *gt)
 
 	intel_gt_pm_init(gt);
 
+	gt->vm = kernel_vm(gt);
+	if (!gt->vm) {
+		err = -ENOMEM;
+		goto err_scratch;
+	}
+
 	return 0;
+
+err_scratch:
+	intel_gt_fini_scratch(gt);
+	return err;
 }
 
 void intel_gt_driver_remove(struct intel_gt *gt)
@@ -390,6 +407,12 @@ void intel_gt_driver_unregister(struct intel_gt *gt)
 
 void intel_gt_driver_release(struct intel_gt *gt)
 {
+	struct i915_address_space *vm;
+
+	vm = fetch_and_zero(&gt->vm);
+	if (vm) /* FIXME being called twice on error paths :( */
+		i915_vm_put(vm);
+
 	intel_gt_pm_fini(gt);
 	intel_gt_fini_scratch(gt);
 }
diff --git a/drivers/gpu/drm/i915/gt/intel_gt_types.h b/drivers/gpu/drm/i915/gt/intel_gt_types.h
index d4e14dbd172e..96890dd12b5f 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_types.h
+++ b/drivers/gpu/drm/i915/gt/intel_gt_types.h
@@ -90,6 +90,13 @@ struct intel_gt {
 	struct intel_engine_cs *engine[I915_NUM_ENGINES];
 	struct intel_engine_cs *engine_class[MAX_ENGINE_CLASS + 1]
 					    [MAX_ENGINE_INSTANCE + 1];
+
+	/*
+	 * Default address space (either GGTT or ppGTT depending on arch).
+	 *
+	 * Reserved for exclusive use by the kernel.
+	 */
+	struct i915_address_space *vm;
 };
 
 enum intel_gt_scratch_field {
diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
index c7bd271bf60f..e8d5518cdcad 100644
--- a/drivers/gpu/drm/i915/gt/intel_lrc.c
+++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
@@ -133,12 +133,11 @@
  */
 #include <linux/interrupt.h>
 
-#include "gem/i915_gem_context.h"
-
 #include "i915_drv.h"
 #include "i915_perf.h"
 #include "i915_trace.h"
 #include "i915_vgpu.h"
+#include "intel_context.h"
 #include "intel_engine_pm.h"
 #include "intel_gt.h"
 #include "intel_gt_pm.h"
@@ -1327,7 +1326,8 @@ assert_pending_valid(const struct intel_engine_execlists *execlists,
 		if (i915_request_completed(rq))
 			goto unlock;
 
-		if (i915_active_is_idle(&ce->active) && ce->gem_context) {
+		if (i915_active_is_idle(&ce->active) &&
+		    !intel_context_is_barrier(ce)) {
 			GEM_TRACE_ERR("Inactive context:%llx in pending[%zd]\n",
 				      ce->timeline->fence_context,
 				      port - execlists->pending);
@@ -4483,8 +4483,7 @@ virtual_bond_execute(struct i915_request *rq, struct dma_fence *signal)
 }
 
 struct intel_context *
-intel_execlists_create_virtual(struct i915_gem_context *ctx,
-			       struct intel_engine_cs **siblings,
+intel_execlists_create_virtual(struct intel_engine_cs **siblings,
 			       unsigned int count)
 {
 	struct virtual_engine *ve;
@@ -4495,13 +4494,13 @@ intel_execlists_create_virtual(struct i915_gem_context *ctx,
 		return ERR_PTR(-EINVAL);
 
 	if (count == 1)
-		return intel_context_create(ctx, siblings[0]);
+		return intel_context_create(siblings[0]);
 
 	ve = kzalloc(struct_size(ve, siblings, count), GFP_KERNEL);
 	if (!ve)
 		return ERR_PTR(-ENOMEM);
 
-	ve->base.i915 = ctx->i915;
+	ve->base.i915 = siblings[0]->i915;
 	ve->base.gt = siblings[0]->gt;
 	ve->base.uncore = siblings[0]->uncore;
 	ve->base.id = -1;
@@ -4544,7 +4543,7 @@ intel_execlists_create_virtual(struct i915_gem_context *ctx,
 		     virtual_submission_tasklet,
 		     (unsigned long)ve);
 
-	intel_context_init(&ve->context, ctx, &ve->base);
+	intel_context_init(&ve->context, &ve->base);
 
 	for (n = 0; n < count; n++) {
 		struct intel_engine_cs *sibling = siblings[n];
@@ -4619,14 +4618,12 @@ intel_execlists_create_virtual(struct i915_gem_context *ctx,
 }
 
 struct intel_context *
-intel_execlists_clone_virtual(struct i915_gem_context *ctx,
-			      struct intel_engine_cs *src)
+intel_execlists_clone_virtual(struct intel_engine_cs *src)
 {
 	struct virtual_engine *se = to_virtual_engine(src);
 	struct intel_context *dst;
 
-	dst = intel_execlists_create_virtual(ctx,
-					     se->siblings,
+	dst = intel_execlists_create_virtual(se->siblings,
 					     se->num_siblings);
 	if (IS_ERR(dst))
 		return dst;
diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.h b/drivers/gpu/drm/i915/gt/intel_lrc.h
index 04511d8ebdc1..081521f17c74 100644
--- a/drivers/gpu/drm/i915/gt/intel_lrc.h
+++ b/drivers/gpu/drm/i915/gt/intel_lrc.h
@@ -111,13 +111,11 @@ void intel_execlists_show_requests(struct intel_engine_cs *engine,
 				   unsigned int max);
 
 struct intel_context *
-intel_execlists_create_virtual(struct i915_gem_context *ctx,
-			       struct intel_engine_cs **siblings,
+intel_execlists_create_virtual(struct intel_engine_cs **siblings,
 			       unsigned int count);
 
 struct intel_context *
-intel_execlists_clone_virtual(struct i915_gem_context *ctx,
-			      struct intel_engine_cs *src);
+intel_execlists_clone_virtual(struct intel_engine_cs *src);
 
 int intel_virtual_engine_attach_bond(struct intel_engine_cs *engine,
 				     const struct intel_engine_cs *master,
diff --git a/drivers/gpu/drm/i915/gt/intel_reset.c b/drivers/gpu/drm/i915/gt/intel_reset.c
index fa3fe23162a2..b5770bcfbaec 100644
--- a/drivers/gpu/drm/i915/gt/intel_reset.c
+++ b/drivers/gpu/drm/i915/gt/intel_reset.c
@@ -52,9 +52,8 @@ static void engine_skip_context(struct i915_request *rq)
 			i915_request_skip(rq, -EIO);
 }
 
-static void client_mark_guilty(struct i915_request *rq, bool banned)
+static void client_mark_guilty(struct i915_gem_context *ctx, bool banned)
 {
-	struct i915_gem_context *ctx = rq->context->gem_context;
 	struct drm_i915_file_private *file_priv = ctx->file_priv;
 	unsigned long prev_hang;
 	unsigned int score;
@@ -81,11 +80,15 @@ static void client_mark_guilty(struct i915_request *rq, bool banned)
 
 static bool mark_guilty(struct i915_request *rq)
 {
-	struct i915_gem_context *ctx = rq->context->gem_context;
+	struct i915_gem_context *ctx;
 	unsigned long prev_hang;
 	bool banned;
 	int i;
 
+	ctx = rq->context->gem_context;
+	if (!ctx)
+		return false;
+
 	if (i915_gem_context_is_closed(ctx)) {
 		intel_context_set_banned(rq->context);
 		return true;
@@ -117,14 +120,15 @@ static bool mark_guilty(struct i915_request *rq)
 		intel_context_set_banned(rq->context);
 	}
 
-	client_mark_guilty(rq, banned);
+	client_mark_guilty(ctx, banned);
 
 	return banned;
 }
 
 static void mark_innocent(struct i915_request *rq)
 {
-	atomic_inc(&rq->context->gem_context->active_count);
+	if (rq->context->gem_context)
+		atomic_inc(&rq->context->gem_context->active_count);
 }
 
 void __i915_request_reset(struct i915_request *rq, bool guilty)
diff --git a/drivers/gpu/drm/i915/gt/intel_ring_submission.c b/drivers/gpu/drm/i915/gt/intel_ring_submission.c
index 9897394b81e6..792b2fad27f3 100644
--- a/drivers/gpu/drm/i915/gt/intel_ring_submission.c
+++ b/drivers/gpu/drm/i915/gt/intel_ring_submission.c
@@ -1575,7 +1575,7 @@ static int remap_l3(struct i915_request *rq)
 	struct i915_gem_context *ctx = rq->context->gem_context;
 	int i, err;
 
-	if (!ctx->remap_slice)
+	if (!ctx || !ctx->remap_slice)
 		return 0;
 
 	for (i = 0; i < MAX_L3_SLICES; i++) {
diff --git a/drivers/gpu/drm/i915/gt/selftest_context.c b/drivers/gpu/drm/i915/gt/selftest_context.c
index af354ccdbf40..e874dfaa5316 100644
--- a/drivers/gpu/drm/i915/gt/selftest_context.c
+++ b/drivers/gpu/drm/i915/gt/selftest_context.c
@@ -70,15 +70,14 @@ static int context_sync(struct intel_context *ce)
 	return err;
 }
 
-static int __live_context_size(struct intel_engine_cs *engine,
-			       struct i915_gem_context *fixme)
+static int __live_context_size(struct intel_engine_cs *engine)
 {
 	struct intel_context *ce;
 	struct i915_request *rq;
 	void *vaddr;
 	int err;
 
-	ce = intel_context_create(fixme, engine);
+	ce = intel_context_create(engine);
 	if (IS_ERR(ce))
 		return PTR_ERR(ce);
 
@@ -146,7 +145,6 @@ static int live_context_size(void *arg)
 {
 	struct intel_gt *gt = arg;
 	struct intel_engine_cs *engine;
-	struct i915_gem_context *fixme;
 	enum intel_engine_id id;
 	int err = 0;
 
@@ -155,10 +153,6 @@ static int live_context_size(void *arg)
 	 * HW tries to write past the end of one.
 	 */
 
-	fixme = kernel_context(gt->i915);
-	if (IS_ERR(fixme))
-		return PTR_ERR(fixme);
-
 	for_each_engine(engine, gt, id) {
 		struct {
 			struct drm_i915_gem_object *state;
@@ -183,7 +177,7 @@ static int live_context_size(void *arg)
 		/* Overlaps with the execlists redzone */
 		engine->context_size += I915_GTT_PAGE_SIZE;
 
-		err = __live_context_size(engine, fixme);
+		err = __live_context_size(engine);
 
 		engine->context_size -= I915_GTT_PAGE_SIZE;
 
@@ -196,12 +190,10 @@ static int live_context_size(void *arg)
 			break;
 	}
 
-	kernel_context_close(fixme);
 	return err;
 }
 
-static int __live_active_context(struct intel_engine_cs *engine,
-				 struct i915_gem_context *fixme)
+static int __live_active_context(struct intel_engine_cs *engine)
 {
 	unsigned long saved_heartbeat;
 	struct intel_context *ce;
@@ -227,7 +219,7 @@ static int __live_active_context(struct intel_engine_cs *engine,
 		return -EINVAL;
 	}
 
-	ce = intel_context_create(fixme, engine);
+	ce = intel_context_create(engine);
 	if (IS_ERR(ce))
 		return PTR_ERR(ce);
 
@@ -310,23 +302,11 @@ static int live_active_context(void *arg)
 {
 	struct intel_gt *gt = arg;
 	struct intel_engine_cs *engine;
-	struct i915_gem_context *fixme;
 	enum intel_engine_id id;
-	struct file *file;
 	int err = 0;
 
-	file = mock_file(gt->i915);
-	if (IS_ERR(file))
-		return PTR_ERR(file);
-
-	fixme = live_context(gt->i915, file);
-	if (IS_ERR(fixme)) {
-		err = PTR_ERR(fixme);
-		goto out_file;
-	}
-
 	for_each_engine(engine, gt, id) {
-		err = __live_active_context(engine, fixme);
+		err = __live_active_context(engine);
 		if (err)
 			break;
 
@@ -335,8 +315,6 @@ static int live_active_context(void *arg)
 			break;
 	}
 
-out_file:
-	fput(file);
 	return err;
 }
 
@@ -368,8 +346,7 @@ static int __remote_sync(struct intel_context *ce, struct intel_context *remote)
 	return err;
 }
 
-static int __live_remote_context(struct intel_engine_cs *engine,
-				 struct i915_gem_context *fixme)
+static int __live_remote_context(struct intel_engine_cs *engine)
 {
 	struct intel_context *local, *remote;
 	unsigned long saved_heartbeat;
@@ -390,11 +367,11 @@ static int __live_remote_context(struct intel_engine_cs *engine,
 		return -EINVAL;
 	}
 
-	remote = intel_context_create(fixme, engine);
+	remote = intel_context_create(engine);
 	if (IS_ERR(remote))
 		return PTR_ERR(remote);
 
-	local = intel_context_create(fixme, engine);
+	local = intel_context_create(engine);
 	if (IS_ERR(local)) {
 		err = PTR_ERR(local);
 		goto err_remote;
@@ -434,23 +411,11 @@ static int live_remote_context(void *arg)
 {
 	struct intel_gt *gt = arg;
 	struct intel_engine_cs *engine;
-	struct i915_gem_context *fixme;
 	enum intel_engine_id id;
-	struct file *file;
 	int err = 0;
 
-	file = mock_file(gt->i915);
-	if (IS_ERR(file))
-		return PTR_ERR(file);
-
-	fixme = live_context(gt->i915, file);
-	if (IS_ERR(fixme)) {
-		err = PTR_ERR(fixme);
-		goto out_file;
-	}
-
 	for_each_engine(engine, gt, id) {
-		err = __live_remote_context(engine, fixme);
+		err = __live_remote_context(engine);
 		if (err)
 			break;
 
@@ -459,8 +424,6 @@ static int live_remote_context(void *arg)
 			break;
 	}
 
-out_file:
-	fput(file);
 	return err;
 }
 
diff --git a/drivers/gpu/drm/i915/gt/selftest_engine_heartbeat.c b/drivers/gpu/drm/i915/gt/selftest_engine_heartbeat.c
index 5227e79204a5..43d4d589749f 100644
--- a/drivers/gpu/drm/i915/gt/selftest_engine_heartbeat.c
+++ b/drivers/gpu/drm/i915/gt/selftest_engine_heartbeat.c
@@ -200,8 +200,7 @@ static int __live_heartbeat_fast(struct intel_engine_cs *engine)
 	int err;
 	int i;
 
-	ce = intel_context_create(engine->kernel_context->gem_context,
-				  engine);
+	ce = intel_context_create(engine);
 	if (IS_ERR(ce))
 		return PTR_ERR(ce);
 
diff --git a/drivers/gpu/drm/i915/gt/selftest_hangcheck.c b/drivers/gpu/drm/i915/gt/selftest_hangcheck.c
index d155c9374453..ff2d8af282a6 100644
--- a/drivers/gpu/drm/i915/gt/selftest_hangcheck.c
+++ b/drivers/gpu/drm/i915/gt/selftest_hangcheck.c
@@ -377,36 +377,30 @@ static int igt_reset_nop(void *arg)
 	struct intel_gt *gt = arg;
 	struct i915_gpu_error *global = &gt->i915->gpu_error;
 	struct intel_engine_cs *engine;
-	struct i915_gem_context *ctx;
 	unsigned int reset_count, count;
 	enum intel_engine_id id;
 	IGT_TIMEOUT(end_time);
-	struct file *file;
 	int err = 0;
 
 	/* Check that we can reset during non-user portions of requests */
 
-	file = mock_file(gt->i915);
-	if (IS_ERR(file))
-		return PTR_ERR(file);
-
-	ctx = live_context(gt->i915, file);
-	if (IS_ERR(ctx)) {
-		err = PTR_ERR(ctx);
-		goto out;
-	}
-
-	i915_gem_context_clear_bannable(ctx);
 	reset_count = i915_reset_count(global);
 	count = 0;
 	do {
 		for_each_engine(engine, gt, id) {
+			struct intel_context *ce;
 			int i;
 
+			ce = intel_context_create(engine);
+			if (IS_ERR(ce)) {
+				err = PTR_ERR(ce);
+				break;
+			}
+
 			for (i = 0; i < 16; i++) {
 				struct i915_request *rq;
 
-				rq = igt_request_alloc(ctx, engine);
+				rq = intel_context_create_request(ce);
 				if (IS_ERR(rq)) {
 					err = PTR_ERR(rq);
 					break;
@@ -414,6 +408,8 @@ static int igt_reset_nop(void *arg)
 
 				i915_request_add(rq);
 			}
+
+			intel_context_put(ce);
 		}
 
 		igt_global_reset_lock(gt);
@@ -437,10 +433,7 @@ static int igt_reset_nop(void *arg)
 	} while (time_before(jiffies, end_time));
 	pr_info("%s: %d resets\n", __func__, count);
 
-	err = igt_flush_test(gt->i915);
-out:
-	fput(file);
-	if (intel_gt_is_wedged(gt))
+	if (igt_flush_test(gt->i915))
 		err = -EIO;
 	return err;
 }
@@ -450,31 +443,22 @@ static int igt_reset_nop_engine(void *arg)
 	struct intel_gt *gt = arg;
 	struct i915_gpu_error *global = &gt->i915->gpu_error;
 	struct intel_engine_cs *engine;
-	struct i915_gem_context *ctx;
 	enum intel_engine_id id;
-	struct file *file;
-	int err = 0;
 
 	/* Check that we can engine-reset during non-user portions */
 
 	if (!intel_has_reset_engine(gt))
 		return 0;
 
-	file = mock_file(gt->i915);
-	if (IS_ERR(file))
-		return PTR_ERR(file);
-
-	ctx = live_context(gt->i915, file);
-	if (IS_ERR(ctx)) {
-		err = PTR_ERR(ctx);
-		goto out;
-	}
-
-	i915_gem_context_clear_bannable(ctx);
 	for_each_engine(engine, gt, id) {
-		unsigned int reset_count, reset_engine_count;
-		unsigned int count;
+		unsigned int reset_count, reset_engine_count, count;
+		struct intel_context *ce;
 		IGT_TIMEOUT(end_time);
+		int err;
+
+		ce = intel_context_create(engine);
+		if (IS_ERR(ce))
+			return PTR_ERR(ce);
 
 		reset_count = i915_reset_count(global);
 		reset_engine_count = i915_reset_engine_count(global, engine);
@@ -494,7 +478,7 @@ static int igt_reset_nop_engine(void *arg)
 			for (i = 0; i < 16; i++) {
 				struct i915_request *rq;
 
-				rq = igt_request_alloc(ctx, engine);
+				rq = intel_context_create_request(ce);
 				if (IS_ERR(rq)) {
 					err = PTR_ERR(rq);
 					break;
@@ -525,20 +509,14 @@ static int igt_reset_nop_engine(void *arg)
 		clear_bit(I915_RESET_ENGINE + id, &gt->reset.flags);
 		pr_info("%s(%s): %d resets\n", __func__, engine->name, count);
 
+		intel_context_put(ce);
+		if (igt_flush_test(gt->i915))
+			err = -EIO;
 		if (err)
-			break;
-
-		err = igt_flush_test(gt->i915);
-		if (err)
-			break;
+			return err;
 	}
 
-	err = igt_flush_test(gt->i915);
-out:
-	fput(file);
-	if (intel_gt_is_wedged(gt))
-		err = -EIO;
-	return err;
+	return 0;
 }
 
 static int __igt_reset_engine(struct intel_gt *gt, bool active)
@@ -699,43 +677,43 @@ static int active_engine(void *data)
 	struct active_engine *arg = data;
 	struct intel_engine_cs *engine = arg->engine;
 	struct i915_request *rq[8] = {};
-	struct i915_gem_context *ctx[ARRAY_SIZE(rq)];
-	unsigned long count = 0;
-	struct file *file;
+	struct intel_context *ce[ARRAY_SIZE(rq)];
+	unsigned long count;
 	int err = 0;
 
-	file = mock_file(engine->i915);
-	if (IS_ERR(file))
-		return PTR_ERR(file);
-
-	for (count = 0; count < ARRAY_SIZE(ctx); count++) {
-		ctx[count] = live_context(engine->i915, file);
-		if (IS_ERR(ctx[count])) {
-			err = PTR_ERR(ctx[count]);
+	for (count = 0; count < ARRAY_SIZE(ce); count++) {
+		ce[count] = intel_context_create(engine);
+		if (IS_ERR(ce[count])) {
+			err = PTR_ERR(ce[count]);
 			while (--count)
-				i915_gem_context_put(ctx[count]);
-			goto err_file;
+				intel_context_put(ce[count]);
+			return err;
 		}
 	}
 
+	count = 0;
 	while (!kthread_should_stop()) {
 		unsigned int idx = count++ & (ARRAY_SIZE(rq) - 1);
 		struct i915_request *old = rq[idx];
 		struct i915_request *new;
 
-		new = igt_request_alloc(ctx[idx], engine);
+		new = intel_context_create_request(ce[idx]);
 		if (IS_ERR(new)) {
 			err = PTR_ERR(new);
 			break;
 		}
 
-		if (arg->flags & TEST_PRIORITY)
-			ctx[idx]->sched.priority =
-				i915_prandom_u32_max_state(512, &prng);
-
 		rq[idx] = i915_request_get(new);
 		i915_request_add(new);
 
+		if (engine->schedule && arg->flags & TEST_PRIORITY) {
+			struct i915_sched_attr attr = {
+				.priority =
+					i915_prandom_u32_max_state(512, &prng),
+			};
+			engine->schedule(rq[idx], &attr);
+		}
+
 		err = active_request_put(old);
 		if (err)
 			break;
@@ -749,10 +727,10 @@ static int active_engine(void *data)
 		/* Keep the first error */
 		if (!err)
 			err = err__;
+
+		intel_context_put(ce[count]);
 	}
 
-err_file:
-	fput(file);
 	return err;
 }
 
@@ -1300,32 +1278,21 @@ static int igt_reset_evict_ggtt(void *arg)
 static int igt_reset_evict_ppgtt(void *arg)
 {
 	struct intel_gt *gt = arg;
-	struct i915_gem_context *ctx;
-	struct i915_address_space *vm;
-	struct file *file;
+	struct i915_ppgtt *ppgtt;
 	int err;
 
-	file = mock_file(gt->i915);
-	if (IS_ERR(file))
-		return PTR_ERR(file);
+	/* aliasing == global gtt locking, covered above */
+	if (INTEL_PPGTT(gt->i915) < INTEL_PPGTT_FULL)
+		return 0;
 
-	ctx = live_context(gt->i915, file);
-	if (IS_ERR(ctx)) {
-		err = PTR_ERR(ctx);
-		goto out;
-	}
+	ppgtt = i915_ppgtt_create(gt->i915);
+	if (IS_ERR(ppgtt))
+		return PTR_ERR(ppgtt);
 
-	err = 0;
-	vm = i915_gem_context_get_vm_rcu(ctx);
-	if (!i915_is_ggtt(vm)) {
-		/* aliasing == global gtt locking, covered above */
-		err = __igt_reset_evict_vma(gt, vm,
-					    evict_vma, EXEC_OBJECT_WRITE);
-	}
-	i915_vm_put(vm);
+	err = __igt_reset_evict_vma(gt, &ppgtt->vm,
+				    evict_vma, EXEC_OBJECT_WRITE);
+	i915_vm_put(&ppgtt->vm);
 
-out:
-	fput(file);
 	return err;
 }
 
diff --git a/drivers/gpu/drm/i915/gt/selftest_lrc.c b/drivers/gpu/drm/i915/gt/selftest_lrc.c
index a7f3dbe4a113..b279b0d8aecc 100644
--- a/drivers/gpu/drm/i915/gt/selftest_lrc.c
+++ b/drivers/gpu/drm/i915/gt/selftest_lrc.c
@@ -71,11 +71,10 @@ static void engine_heartbeat_enable(struct intel_engine_cs *engine,
 static int live_sanitycheck(void *arg)
 {
 	struct intel_gt *gt = arg;
-	struct i915_gem_engines_iter it;
-	struct i915_gem_context *ctx;
-	struct intel_context *ce;
+	struct intel_engine_cs *engine;
+	enum intel_engine_id id;
 	struct igt_spinner spin;
-	int err = -ENOMEM;
+	int err = 0;
 
 	if (!HAS_LOGICAL_RING_CONTEXTS(gt->i915))
 		return 0;
@@ -83,17 +82,20 @@ static int live_sanitycheck(void *arg)
 	if (igt_spinner_init(&spin, gt))
 		return -ENOMEM;
 
-	ctx = kernel_context(gt->i915);
-	if (!ctx)
-		goto err_spin;
-
-	for_each_gem_engine(ce, i915_gem_context_lock_engines(ctx), it) {
+	for_each_engine(engine, gt, id) {
+		struct intel_context *ce;
 		struct i915_request *rq;
 
+		ce = intel_context_create(engine);
+		if (IS_ERR(ce)) {
+			err = PTR_ERR(ce);
+			break;
+		}
+
 		rq = igt_spinner_create_request(&spin, ce, MI_NOOP);
 		if (IS_ERR(rq)) {
 			err = PTR_ERR(rq);
-			goto err_ctx;
+			goto out_ctx;
 		}
 
 		i915_request_add(rq);
@@ -102,21 +104,21 @@ static int live_sanitycheck(void *arg)
 			GEM_TRACE_DUMP();
 			intel_gt_set_wedged(gt);
 			err = -EIO;
-			goto err_ctx;
+			goto out_ctx;
 		}
 
 		igt_spinner_end(&spin);
 		if (igt_flush_test(gt->i915)) {
 			err = -EIO;
-			goto err_ctx;
+			goto out_ctx;
 		}
+
+out_ctx:
+		intel_context_put(ce);
+		if (err)
+			break;
 	}
 
-	err = 0;
-err_ctx:
-	i915_gem_context_unlock_engines(ctx);
-	kernel_context_close(ctx);
-err_spin:
 	igt_spinner_fini(&spin);
 	return err;
 }
@@ -124,7 +126,6 @@ static int live_sanitycheck(void *arg)
 static int live_unlite_restore(struct intel_gt *gt, int prio)
 {
 	struct intel_engine_cs *engine;
-	struct i915_gem_context *ctx;
 	enum intel_engine_id id;
 	struct igt_spinner spin;
 	int err = -ENOMEM;
@@ -137,10 +138,6 @@ static int live_unlite_restore(struct intel_gt *gt, int prio)
 	if (igt_spinner_init(&spin, gt))
 		return err;
 
-	ctx = kernel_context(gt->i915);
-	if (!ctx)
-		goto err_spin;
-
 	err = 0;
 	for_each_engine(engine, gt, id) {
 		struct intel_context *ce[2] = {};
@@ -164,7 +161,7 @@ static int live_unlite_restore(struct intel_gt *gt, int prio)
 		for (n = 0; n < ARRAY_SIZE(ce); n++) {
 			struct intel_context *tmp;
 
-			tmp = intel_context_create(ctx, engine);
+			tmp = intel_context_create(engine);
 			if (IS_ERR(tmp)) {
 				err = PTR_ERR(tmp);
 				goto err_ce;
@@ -274,8 +271,6 @@ static int live_unlite_restore(struct intel_gt *gt, int prio)
 			break;
 	}
 
-	kernel_context_close(ctx);
-err_spin:
 	igt_spinner_fini(&spin);
 	return err;
 }
@@ -330,17 +325,17 @@ emit_semaphore_chain(struct i915_request *rq, struct i915_vma *vma, int idx)
 static struct i915_request *
 semaphore_queue(struct intel_engine_cs *engine, struct i915_vma *vma, int idx)
 {
-	struct i915_gem_context *ctx;
+	struct intel_context *ce;
 	struct i915_request *rq;
 	int err;
 
-	ctx = kernel_context(engine->i915);
-	if (!ctx)
-		return ERR_PTR(-ENOMEM);
+	ce = intel_context_create(engine);
+	if (IS_ERR(ce))
+		return ERR_CAST(ce);
 
-	rq = igt_request_alloc(ctx, engine);
+	rq = intel_context_create_request(ce);
 	if (IS_ERR(rq))
-		goto out_ctx;
+		goto out_ce;
 
 	err = 0;
 	if (rq->engine->emit_init_breadcrumb)
@@ -353,8 +348,8 @@ semaphore_queue(struct intel_engine_cs *engine, struct i915_vma *vma, int idx)
 	if (err)
 		rq = ERR_PTR(err);
 
-out_ctx:
-	kernel_context_close(ctx);
+out_ce:
+	intel_context_put(ce);
 	return rq;
 }
 
@@ -1998,7 +1993,7 @@ static int create_gang(struct intel_engine_cs *engine,
 	u32 *cs;
 	int err;
 
-	ce = intel_context_create(engine->kernel_context->gem_context, engine);
+	ce = intel_context_create(engine);
 	if (IS_ERR(ce))
 		return PTR_ERR(ce);
 
@@ -2660,27 +2655,17 @@ static int nop_virtual_engine(struct intel_gt *gt,
 {
 	IGT_TIMEOUT(end_time);
 	struct i915_request *request[16] = {};
-	struct i915_gem_context *ctx[16];
 	struct intel_context *ve[16];
 	unsigned long n, prime, nc;
 	struct igt_live_test t;
 	ktime_t times[2] = {};
 	int err;
 
-	GEM_BUG_ON(!nctx || nctx > ARRAY_SIZE(ctx));
+	GEM_BUG_ON(!nctx || nctx > ARRAY_SIZE(ve));
 
 	for (n = 0; n < nctx; n++) {
-		ctx[n] = kernel_context(gt->i915);
-		if (!ctx[n]) {
-			err = -ENOMEM;
-			nctx = n;
-			goto out;
-		}
-
-		ve[n] = intel_execlists_create_virtual(ctx[n],
-						       siblings, nsibling);
+		ve[n] = intel_execlists_create_virtual(siblings, nsibling);
 		if (IS_ERR(ve[n])) {
-			kernel_context_close(ctx[n]);
 			err = PTR_ERR(ve[n]);
 			nctx = n;
 			goto out;
@@ -2689,7 +2674,6 @@ static int nop_virtual_engine(struct intel_gt *gt,
 		err = intel_context_pin(ve[n]);
 		if (err) {
 			intel_context_put(ve[n]);
-			kernel_context_close(ctx[n]);
 			nctx = n;
 			goto out;
 		}
@@ -2784,7 +2768,6 @@ static int nop_virtual_engine(struct intel_gt *gt,
 		i915_request_put(request[nc]);
 		intel_context_unpin(ve[nc]);
 		intel_context_put(ve[nc]);
-		kernel_context_close(ctx[nc]);
 	}
 	return err;
 }
@@ -2843,7 +2826,6 @@ static int mask_virtual_engine(struct intel_gt *gt,
 			       unsigned int nsibling)
 {
 	struct i915_request *request[MAX_ENGINE_INSTANCE + 1];
-	struct i915_gem_context *ctx;
 	struct intel_context *ve;
 	struct igt_live_test t;
 	unsigned int n;
@@ -2854,11 +2836,7 @@ static int mask_virtual_engine(struct intel_gt *gt,
 	 * restrict it to our desired engine within the virtual engine.
 	 */
 
-	ctx = kernel_context(gt->i915);
-	if (!ctx)
-		return -ENOMEM;
-
-	ve = intel_execlists_create_virtual(ctx, siblings, nsibling);
+	ve = intel_execlists_create_virtual(siblings, nsibling);
 	if (IS_ERR(ve)) {
 		err = PTR_ERR(ve);
 		goto out_close;
@@ -2926,7 +2904,6 @@ static int mask_virtual_engine(struct intel_gt *gt,
 out_put:
 	intel_context_put(ve);
 out_close:
-	kernel_context_close(ctx);
 	return err;
 }
 
@@ -2966,7 +2943,6 @@ static int preserved_virtual_engine(struct intel_gt *gt,
 				    unsigned int nsibling)
 {
 	struct i915_request *last = NULL;
-	struct i915_gem_context *ctx;
 	struct intel_context *ve;
 	struct i915_vma *scratch;
 	struct igt_live_test t;
@@ -2974,17 +2950,11 @@ static int preserved_virtual_engine(struct intel_gt *gt,
 	int err = 0;
 	u32 *cs;
 
-	ctx = kernel_context(gt->i915);
-	if (!ctx)
-		return -ENOMEM;
-
 	scratch = create_scratch(siblings[0]->gt);
-	if (IS_ERR(scratch)) {
-		err = PTR_ERR(scratch);
-		goto out_close;
-	}
+	if (IS_ERR(scratch))
+		return PTR_ERR(scratch);
 
-	ve = intel_execlists_create_virtual(ctx, siblings, nsibling);
+	ve = intel_execlists_create_virtual(siblings, nsibling);
 	if (IS_ERR(ve)) {
 		err = PTR_ERR(ve);
 		goto out_scratch;
@@ -3067,8 +3037,6 @@ static int preserved_virtual_engine(struct intel_gt *gt,
 	intel_context_put(ve);
 out_scratch:
 	i915_vma_unpin_and_release(&scratch, 0);
-out_close:
-	kernel_context_close(ctx);
 	return err;
 }
 
@@ -3120,7 +3088,6 @@ static int bond_virtual_engine(struct intel_gt *gt,
 #define BOND_SCHEDULE BIT(0)
 {
 	struct intel_engine_cs *master;
-	struct i915_gem_context *ctx;
 	struct i915_request *rq[16];
 	enum intel_engine_id id;
 	struct igt_spinner spin;
@@ -3171,12 +3138,6 @@ static int bond_virtual_engine(struct intel_gt *gt,
 	if (igt_spinner_init(&spin, gt))
 		return -ENOMEM;
 
-	ctx = kernel_context(gt->i915);
-	if (!ctx) {
-		err = -ENOMEM;
-		goto err_spin;
-	}
-
 	err = 0;
 	rq[0] = ERR_PTR(-ENOMEM);
 	for_each_engine(master, gt, id) {
@@ -3187,7 +3148,9 @@ static int bond_virtual_engine(struct intel_gt *gt,
 
 		memset_p((void *)rq, ERR_PTR(-EINVAL), ARRAY_SIZE(rq));
 
-		rq[0] = spinner_create_request(&spin, ctx, master, MI_NOOP);
+		rq[0] = igt_spinner_create_request(&spin,
+						   master->kernel_context,
+						   MI_NOOP);
 		if (IS_ERR(rq[0])) {
 			err = PTR_ERR(rq[0]);
 			goto out;
@@ -3214,9 +3177,7 @@ static int bond_virtual_engine(struct intel_gt *gt,
 		for (n = 0; n < nsibling; n++) {
 			struct intel_context *ve;
 
-			ve = intel_execlists_create_virtual(ctx,
-							    siblings,
-							    nsibling);
+			ve = intel_execlists_create_virtual(siblings, nsibling);
 			if (IS_ERR(ve)) {
 				err = PTR_ERR(ve);
 				onstack_fence_fini(&fence);
@@ -3296,8 +3257,6 @@ static int bond_virtual_engine(struct intel_gt *gt,
 	if (igt_flush_test(gt->i915))
 		err = -EIO;
 
-	kernel_context_close(ctx);
-err_spin:
 	igt_spinner_fini(&spin);
 	return err;
 }
@@ -3609,8 +3568,7 @@ static int live_lrc_fixed(void *arg)
 	return err;
 }
 
-static int __live_lrc_state(struct i915_gem_context *fixme,
-			    struct intel_engine_cs *engine,
+static int __live_lrc_state(struct intel_engine_cs *engine,
 			    struct i915_vma *scratch)
 {
 	struct intel_context *ce;
@@ -3625,7 +3583,7 @@ static int __live_lrc_state(struct i915_gem_context *fixme,
 	int err;
 	int n;
 
-	ce = intel_context_create(fixme, engine);
+	ce = intel_context_create(engine);
 	if (IS_ERR(ce))
 		return PTR_ERR(ce);
 
@@ -3699,7 +3657,6 @@ static int live_lrc_state(void *arg)
 {
 	struct intel_gt *gt = arg;
 	struct intel_engine_cs *engine;
-	struct i915_gem_context *fixme;
 	struct i915_vma *scratch;
 	enum intel_engine_id id;
 	int err = 0;
@@ -3709,18 +3666,12 @@ static int live_lrc_state(void *arg)
 	 * intel_context.
 	 */
 
-	fixme = kernel_context(gt->i915);
-	if (!fixme)
-		return -ENOMEM;
-
 	scratch = create_scratch(gt);
-	if (IS_ERR(scratch)) {
-		err = PTR_ERR(scratch);
-		goto out_close;
-	}
+	if (IS_ERR(scratch))
+		return PTR_ERR(scratch);
 
 	for_each_engine(engine, gt, id) {
-		err = __live_lrc_state(fixme, engine, scratch);
+		err = __live_lrc_state(engine, scratch);
 		if (err)
 			break;
 	}
@@ -3729,8 +3680,6 @@ static int live_lrc_state(void *arg)
 		err = -EIO;
 
 	i915_vma_unpin_and_release(&scratch, 0);
-out_close:
-	kernel_context_close(fixme);
 	return err;
 }
 
@@ -3763,8 +3712,7 @@ static int gpr_make_dirty(struct intel_engine_cs *engine)
 	return 0;
 }
 
-static int __live_gpr_clear(struct i915_gem_context *fixme,
-			    struct intel_engine_cs *engine,
+static int __live_gpr_clear(struct intel_engine_cs *engine,
 			    struct i915_vma *scratch)
 {
 	struct intel_context *ce;
@@ -3780,7 +3728,7 @@ static int __live_gpr_clear(struct i915_gem_context *fixme,
 	if (err)
 		return err;
 
-	ce = intel_context_create(fixme, engine);
+	ce = intel_context_create(engine);
 	if (IS_ERR(ce))
 		return PTR_ERR(ce);
 
@@ -3842,7 +3790,6 @@ static int live_gpr_clear(void *arg)
 {
 	struct intel_gt *gt = arg;
 	struct intel_engine_cs *engine;
-	struct i915_gem_context *fixme;
 	struct i915_vma *scratch;
 	enum intel_engine_id id;
 	int err = 0;
@@ -3852,18 +3799,12 @@ static int live_gpr_clear(void *arg)
 	 * to avoid leaking any information from previous contexts.
 	 */
 
-	fixme = kernel_context(gt->i915);
-	if (!fixme)
-		return -ENOMEM;
-
 	scratch = create_scratch(gt);
-	if (IS_ERR(scratch)) {
-		err = PTR_ERR(scratch);
-		goto out_close;
-	}
+	if (IS_ERR(scratch))
+		return PTR_ERR(scratch);
 
 	for_each_engine(engine, gt, id) {
-		err = __live_gpr_clear(fixme, engine, scratch);
+		err = __live_gpr_clear(engine, scratch);
 		if (err)
 			break;
 	}
@@ -3872,8 +3813,6 @@ static int live_gpr_clear(void *arg)
 		err = -EIO;
 
 	i915_vma_unpin_and_release(&scratch, 0);
-out_close:
-	kernel_context_close(fixme);
 	return err;
 }
 
diff --git a/drivers/gpu/drm/i915/gt/selftest_mocs.c b/drivers/gpu/drm/i915/gt/selftest_mocs.c
index de010f527757..de1f83100fb6 100644
--- a/drivers/gpu/drm/i915/gt/selftest_mocs.c
+++ b/drivers/gpu/drm/i915/gt/selftest_mocs.c
@@ -289,8 +289,7 @@ static int live_mocs_clean(void *arg)
 	for_each_engine(engine, gt, id) {
 		struct intel_context *ce;
 
-		ce = intel_context_create(engine->kernel_context->gem_context,
-					  engine);
+		ce = intel_context_create(engine);
 		if (IS_ERR(ce)) {
 			err = PTR_ERR(ce);
 			break;
@@ -384,8 +383,7 @@ static int live_mocs_reset(void *arg)
 	for_each_engine(engine, gt, id) {
 		struct intel_context *ce;
 
-		ce = intel_context_create(engine->kernel_context->gem_context,
-					  engine);
+		ce = intel_context_create(engine);
 		if (IS_ERR(ce)) {
 			err = PTR_ERR(ce);
 			break;
diff --git a/drivers/gpu/drm/i915/gt/selftest_rc6.c b/drivers/gpu/drm/i915/gt/selftest_rc6.c
index f8b7691be4d1..8cc55a0e9e06 100644
--- a/drivers/gpu/drm/i915/gt/selftest_rc6.c
+++ b/drivers/gpu/drm/i915/gt/selftest_rc6.c
@@ -160,8 +160,7 @@ int live_rc6_ctx_wa(void *arg)
 			const u32 *res;
 
 			/* Use a sacrifical context */
-			ce = intel_context_create(engine->kernel_context->gem_context,
-						  engine);
+			ce = intel_context_create(engine);
 			if (IS_ERR(ce)) {
 				err = PTR_ERR(ce);
 				goto out;
diff --git a/drivers/gpu/drm/i915/gt/selftest_workarounds.c b/drivers/gpu/drm/i915/gt/selftest_workarounds.c
index d5d1e1a32187..ac1921854cbf 100644
--- a/drivers/gpu/drm/i915/gt/selftest_workarounds.c
+++ b/drivers/gpu/drm/i915/gt/selftest_workarounds.c
@@ -264,22 +264,15 @@ static int
 switch_to_scratch_context(struct intel_engine_cs *engine,
 			  struct igt_spinner *spin)
 {
-	struct i915_gem_context *ctx;
 	struct intel_context *ce;
 	struct i915_request *rq;
 	int err = 0;
 
-	ctx = kernel_context(engine->i915);
-	if (IS_ERR(ctx))
-		return PTR_ERR(ctx);
-
-	GEM_BUG_ON(i915_gem_context_is_bannable(ctx));
-
-	ce = i915_gem_context_get_engine(ctx, engine->legacy_idx);
-	GEM_BUG_ON(IS_ERR(ce));
+	ce = intel_context_create(engine);
+	if (IS_ERR(ce))
+		return PTR_ERR(ce);
 
 	rq = igt_spinner_create_request(spin, ce, MI_NOOP);
-
 	intel_context_put(ce);
 
 	if (IS_ERR(rq)) {
@@ -293,7 +286,6 @@ switch_to_scratch_context(struct intel_engine_cs *engine,
 	if (err && spin)
 		igt_spinner_end(spin);
 
-	kernel_context_close(ctx);
 	return err;
 }
 
@@ -367,20 +359,17 @@ static int check_whitelist_across_reset(struct intel_engine_cs *engine,
 	return err;
 }
 
-static struct i915_vma *create_batch(struct i915_gem_context *ctx)
+static struct i915_vma *create_batch(struct i915_address_space *vm)
 {
 	struct drm_i915_gem_object *obj;
-	struct i915_address_space *vm;
 	struct i915_vma *vma;
 	int err;
 
-	obj = i915_gem_object_create_internal(ctx->i915, 16 * PAGE_SIZE);
+	obj = i915_gem_object_create_internal(vm->i915, 16 * PAGE_SIZE);
 	if (IS_ERR(obj))
 		return ERR_CAST(obj);
 
-	vm = i915_gem_context_get_vm_rcu(ctx);
 	vma = i915_vma_instance(obj, vm, NULL);
-	i915_vm_put(vm);
 	if (IS_ERR(vma)) {
 		err = PTR_ERR(vma);
 		goto err_obj;
@@ -452,8 +441,7 @@ static int whitelist_writable_count(struct intel_engine_cs *engine)
 	return count;
 }
 
-static int check_dirty_whitelist(struct i915_gem_context *ctx,
-				 struct intel_engine_cs *engine)
+static int check_dirty_whitelist(struct intel_context *ce)
 {
 	const u32 values[] = {
 		0x00000000,
@@ -481,19 +469,17 @@ static int check_dirty_whitelist(struct i915_gem_context *ctx,
 		0xffff00ff,
 		0xffffffff,
 	};
-	struct i915_address_space *vm;
+	struct intel_engine_cs *engine = ce->engine;
 	struct i915_vma *scratch;
 	struct i915_vma *batch;
 	int err = 0, i, v;
 	u32 *cs, *results;
 
-	vm = i915_gem_context_get_vm_rcu(ctx);
-	scratch = create_scratch(vm, 2 * ARRAY_SIZE(values) + 1);
-	i915_vm_put(vm);
+	scratch = create_scratch(ce->vm, 2 * ARRAY_SIZE(values) + 1);
 	if (IS_ERR(scratch))
 		return PTR_ERR(scratch);
 
-	batch = create_batch(ctx);
+	batch = create_batch(ce->vm);
 	if (IS_ERR(batch)) {
 		err = PTR_ERR(batch);
 		goto out_scratch;
@@ -518,7 +504,7 @@ static int check_dirty_whitelist(struct i915_gem_context *ctx,
 
 		srm = MI_STORE_REGISTER_MEM;
 		lrm = MI_LOAD_REGISTER_MEM;
-		if (INTEL_GEN(ctx->i915) >= 8)
+		if (INTEL_GEN(engine->i915) >= 8)
 			lrm++, srm++;
 
 		pr_debug("%s: Writing garbage to %x\n",
@@ -577,7 +563,7 @@ static int check_dirty_whitelist(struct i915_gem_context *ctx,
 		i915_gem_object_unpin_map(batch->obj);
 		intel_gt_chipset_flush(engine->gt);
 
-		rq = igt_request_alloc(ctx, engine);
+		rq = intel_context_create_request(ce);
 		if (IS_ERR(rq)) {
 			err = PTR_ERR(rq);
 			goto out_batch;
@@ -696,7 +682,7 @@ static int check_dirty_whitelist(struct i915_gem_context *ctx,
 			break;
 	}
 
-	if (igt_flush_test(ctx->i915))
+	if (igt_flush_test(engine->i915))
 		err = -EIO;
 out_batch:
 	i915_vma_unpin_and_release(&batch, 0);
@@ -709,38 +695,31 @@ static int live_dirty_whitelist(void *arg)
 {
 	struct intel_gt *gt = arg;
 	struct intel_engine_cs *engine;
-	struct i915_gem_context *ctx;
 	enum intel_engine_id id;
-	struct file *file;
-	int err = 0;
 
 	/* Can the user write to the whitelisted registers? */
 
 	if (INTEL_GEN(gt->i915) < 7) /* minimum requirement for LRI, SRM, LRM */
 		return 0;
 
-	file = mock_file(gt->i915);
-	if (IS_ERR(file))
-		return PTR_ERR(file);
-
-	ctx = live_context(gt->i915, file);
-	if (IS_ERR(ctx)) {
-		err = PTR_ERR(ctx);
-		goto out_file;
-	}
-
 	for_each_engine(engine, gt, id) {
+		struct intel_context *ce;
+		int err;
+
 		if (engine->whitelist.count == 0)
 			continue;
 
-		err = check_dirty_whitelist(ctx, engine);
+		ce = intel_context_create(engine);
+		if (IS_ERR(ce))
+			return PTR_ERR(ce);
+
+		err = check_dirty_whitelist(ce);
+		intel_context_put(ce);
 		if (err)
-			goto out_file;
+			return err;
 	}
 
-out_file:
-	fput(file);
-	return err;
+	return 0;
 }
 
 static int live_reset_whitelist(void *arg)
@@ -830,12 +809,15 @@ static int read_whitelisted_registers(struct i915_gem_context *ctx,
 static int scrub_whitelisted_registers(struct i915_gem_context *ctx,
 				       struct intel_engine_cs *engine)
 {
+	struct i915_address_space *vm;
 	struct i915_request *rq;
 	struct i915_vma *batch;
 	int i, err = 0;
 	u32 *cs;
 
-	batch = create_batch(ctx);
+	vm = i915_gem_context_get_vm_rcu(ctx);
+	batch = create_batch(vm);
+	i915_vm_put(vm);
 	if (IS_ERR(batch))
 		return PTR_ERR(batch);
 
diff --git a/drivers/gpu/drm/i915/gvt/scheduler.c b/drivers/gpu/drm/i915/gvt/scheduler.c
index 228c66534e21..b3299f88e24e 100644
--- a/drivers/gpu/drm/i915/gvt/scheduler.c
+++ b/drivers/gpu/drm/i915/gvt/scheduler.c
@@ -35,12 +35,12 @@
 
 #include <linux/kthread.h>
 
-#include "gem/i915_gem_context.h"
 #include "gem/i915_gem_pm.h"
 #include "gt/intel_context.h"
 #include "gt/intel_ring.h"
 
 #include "i915_drv.h"
+#include "i915_gem_gtt.h"
 #include "gvt.h"
 
 #define RING_CTX_OFF(x) \
@@ -1220,16 +1220,14 @@ int intel_vgpu_setup_submission(struct intel_vgpu *vgpu)
 	struct drm_i915_private *i915 = vgpu->gvt->dev_priv;
 	struct intel_vgpu_submission *s = &vgpu->submission;
 	struct intel_engine_cs *engine;
-	struct i915_gem_context *ctx;
 	struct i915_ppgtt *ppgtt;
 	enum intel_engine_id i;
 	int ret;
 
-	ctx = i915_gem_context_create_kernel(i915, I915_PRIORITY_MAX);
-	if (IS_ERR(ctx))
-		return PTR_ERR(ctx);
+	ppgtt = i915_ppgtt_create(i915);
+	if (IS_ERR(ppgtt))
+		return PTR_ERR(ppgtt);
 
-	ppgtt = i915_vm_to_ppgtt(i915_gem_context_get_vm_rcu(ctx));
 	i915_context_ppgtt_root_save(s, ppgtt);
 
 	for_each_engine(engine, i915, i) {
@@ -1238,12 +1236,14 @@ int intel_vgpu_setup_submission(struct intel_vgpu *vgpu)
 		INIT_LIST_HEAD(&s->workload_q_head[i]);
 		s->shadow[i] = ERR_PTR(-EINVAL);
 
-		ce = intel_context_create(ctx, engine);
+		ce = intel_context_create(engine);
 		if (IS_ERR(ce)) {
 			ret = PTR_ERR(ce);
 			goto out_shadow_ctx;
 		}
 
+		i915_vm_put(ce->vm);
+		ce->vm = i915_vm_get(&ppgtt->vm);
 		intel_context_set_single_submission(ce);
 
 		if (!USES_GUC_SUBMISSION(i915)) { /* Max ring buffer size */
@@ -1278,7 +1278,6 @@ int intel_vgpu_setup_submission(struct intel_vgpu *vgpu)
 	bitmap_zero(s->tlb_handle_pending, I915_NUM_ENGINES);
 
 	i915_vm_put(&ppgtt->vm);
-	i915_gem_context_put(ctx);
 	return 0;
 
 out_shadow_ctx:
@@ -1291,7 +1290,6 @@ int intel_vgpu_setup_submission(struct intel_vgpu *vgpu)
 		intel_context_put(s->shadow[i]);
 	}
 	i915_vm_put(&ppgtt->vm);
-	i915_gem_context_put(ctx);
 	return ret;
 }
 
diff --git a/drivers/gpu/drm/i915/i915_active.c b/drivers/gpu/drm/i915/i915_active.c
index eb5c219188a1..8bf4ab874b2f 100644
--- a/drivers/gpu/drm/i915/i915_active.c
+++ b/drivers/gpu/drm/i915/i915_active.c
@@ -6,6 +6,7 @@
 
 #include <linux/debugobjects.h>
 
+#include "gt/intel_context.h"
 #include "gt/intel_engine_heartbeat.h"
 #include "gt/intel_engine_pm.h"
 #include "gt/intel_ring.h"
@@ -741,6 +742,7 @@ void i915_request_add_active_barriers(struct i915_request *rq)
 	struct llist_node *node, *next;
 	unsigned long flags;
 
+	GEM_BUG_ON(!intel_context_is_barrier(rq->context));
 	GEM_BUG_ON(intel_engine_is_virtual(engine));
 	GEM_BUG_ON(i915_request_timeline(rq) != engine->kernel_context->timeline);
 
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 0781b6326b8c..95db8017f138 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -955,9 +955,6 @@ struct drm_i915_private {
 
 	struct pci_dev *bridge_dev;
 
-	/* Context used internally to idle the GPU and setup initial state */
-	struct i915_gem_context *kernel_context;
-
 	struct intel_engine_cs *engine[I915_NUM_ENGINES];
 	struct rb_root uabi_engines;
 
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index fe5ba72f3ce4..d7a9cf474921 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -1123,8 +1123,7 @@ static int __intel_engines_record_defaults(struct intel_gt *gt)
 		GEM_BUG_ON(!engine->kernel_context);
 		engine->serial++; /* force the kernel context switch */
 
-		ce = intel_context_create(engine->kernel_context->gem_context,
-					  engine);
+		ce = intel_context_create(engine);
 		if (IS_ERR(ce)) {
 			err = PTR_ERR(ce);
 			goto out;
@@ -1284,6 +1283,7 @@ int i915_gem_init(struct drm_i915_private *dev_priv)
 	}
 
 	intel_gt_init(&dev_priv->gt);
+	i915_gem_init__contexts(dev_priv);
 
 	ret = intel_engines_setup(&dev_priv->gt);
 	if (ret) {
@@ -1291,16 +1291,10 @@ int i915_gem_init(struct drm_i915_private *dev_priv)
 		goto err_unlock;
 	}
 
-	ret = i915_gem_init_contexts(dev_priv);
-	if (ret) {
-		GEM_BUG_ON(ret == -EIO);
-		goto err_scratch;
-	}
-
 	ret = intel_engines_init(&dev_priv->gt);
 	if (ret) {
 		GEM_BUG_ON(ret == -EIO);
-		goto err_context;
+		goto err_scratch;
 	}
 
 	intel_uc_init(&dev_priv->gt.uc);
@@ -1364,9 +1358,6 @@ int i915_gem_init(struct drm_i915_private *dev_priv)
 		intel_uc_fini(&dev_priv->gt.uc);
 		intel_engines_cleanup(&dev_priv->gt);
 	}
-err_context:
-	if (ret != -EIO)
-		i915_gem_driver_release__contexts(dev_priv);
 err_scratch:
 	intel_gt_driver_release(&dev_priv->gt);
 err_unlock:
@@ -1431,7 +1422,6 @@ void i915_gem_driver_remove(struct drm_i915_private *dev_priv)
 void i915_gem_driver_release(struct drm_i915_private *dev_priv)
 {
 	intel_engines_cleanup(&dev_priv->gt);
-	i915_gem_driver_release__contexts(dev_priv);
 	intel_gt_driver_release(&dev_priv->gt);
 
 	intel_wa_list_free(&dev_priv->gt_wa_list);
@@ -1439,6 +1429,8 @@ void i915_gem_driver_release(struct drm_i915_private *dev_priv)
 	intel_uc_cleanup_firmwares(&dev_priv->gt.uc);
 	i915_gem_cleanup_userptr(dev_priv);
 
+	i915_gem_driver_release__contexts(dev_priv);
+
 	i915_gem_drain_freed_objects(dev_priv);
 
 	WARN_ON(!list_empty(&dev_priv->gem.contexts.list));
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index be36719e7987..066d8f7a4888 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -1375,12 +1375,8 @@ static int gen8_init_scratch(struct i915_address_space *vm)
 	 * If everybody agrees to not to write into the scratch page,
 	 * we can reuse it for all vm, keeping contexts and processes separate.
 	 */
-	if (vm->has_read_only &&
-	    vm->i915->kernel_context &&
-	    vm->i915->kernel_context->vm) {
-		struct i915_address_space *clone =
-			rcu_dereference_protected(vm->i915->kernel_context->vm,
-						  true); /* static */
+	if (vm->has_read_only && vm->gt->vm && !i915_is_ggtt(vm->gt->vm)) {
+		struct i915_address_space *clone = vm->gt->vm;
 
 		GEM_BUG_ON(!clone->has_read_only);
 
diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c
index 4a89da21ae51..1f388bccc74d 100644
--- a/drivers/gpu/drm/i915/i915_perf.c
+++ b/drivers/gpu/drm/i915/i915_perf.c
@@ -2321,9 +2321,6 @@ static int oa_configure_all_contexts(struct i915_perf_stream *stream,
 	 */
 	spin_lock(&i915->gem.contexts.lock);
 	list_for_each_entry_safe(ctx, cn, &i915->gem.contexts.list, link) {
-		if (ctx == i915->kernel_context)
-			continue;
-
 		if (!kref_get_unless_zero(&ctx->ref))
 			continue;
 
diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
index 8ac3ea65a105..15767c2d9ea1 100644
--- a/drivers/gpu/drm/i915/i915_request.c
+++ b/drivers/gpu/drm/i915/i915_request.c
@@ -1314,8 +1314,8 @@ void __i915_request_queue(struct i915_request *rq,
 
 void i915_request_add(struct i915_request *rq)
 {
-	struct i915_sched_attr attr = rq->context->gem_context->sched;
 	struct intel_timeline * const tl = i915_request_timeline(rq);
+	struct i915_sched_attr attr = {};
 	struct i915_request *prev;
 
 	lockdep_assert_held(&tl->mutex);
@@ -1325,6 +1325,9 @@ void i915_request_add(struct i915_request *rq)
 
 	prev = __i915_request_commit(rq);
 
+	if (rq->context->gem_context)
+		attr = rq->context->gem_context->sched;
+
 	/*
 	 * Boost actual workloads past semaphores!
 	 *
diff --git a/drivers/gpu/drm/i915/selftests/i915_request.c b/drivers/gpu/drm/i915/selftests/i915_request.c
index 99c94b4f69fb..c8fb63a8e650 100644
--- a/drivers/gpu/drm/i915/selftests/i915_request.c
+++ b/drivers/gpu/drm/i915/selftests/i915_request.c
@@ -749,10 +749,8 @@ static int live_empty_request(void *arg)
 
 static struct i915_vma *recursive_batch(struct drm_i915_private *i915)
 {
-	struct i915_gem_context *ctx = i915->kernel_context;
 	struct drm_i915_gem_object *obj;
 	const int gen = INTEL_GEN(i915);
-	struct i915_address_space *vm;
 	struct i915_vma *vma;
 	u32 *cmd;
 	int err;
@@ -761,9 +759,7 @@ static struct i915_vma *recursive_batch(struct drm_i915_private *i915)
 	if (IS_ERR(obj))
 		return ERR_CAST(obj);
 
-	vm = i915_gem_context_get_vm_rcu(ctx);
-	vma = i915_vma_instance(obj, vm, NULL);
-	i915_vm_put(vm);
+	vma = i915_vma_instance(obj, i915->gt.vm, NULL);
 	if (IS_ERR(vma)) {
 		err = PTR_ERR(vma);
 		goto err;
diff --git a/drivers/gpu/drm/i915/selftests/mock_gem_device.c b/drivers/gpu/drm/i915/selftests/mock_gem_device.c
index d14ba8498f57..a5e46a4739f9 100644
--- a/drivers/gpu/drm/i915/selftests/mock_gem_device.c
+++ b/drivers/gpu/drm/i915/selftests/mock_gem_device.c
@@ -63,7 +63,6 @@ static void mock_device_release(struct drm_device *dev)
 
 	for_each_engine(engine, &i915->gt, id)
 		mock_engine_free(engine);
-	i915_gem_driver_release__contexts(i915);
 
 	drain_workqueue(i915->wq);
 	i915_gem_drain_freed_objects(i915);
@@ -180,6 +179,7 @@ struct drm_i915_private *mock_gem_device(void)
 	mock_init_contexts(i915);
 
 	mock_init_ggtt(i915, &i915->ggtt);
+	i915->gt.vm = i915_vm_get(&i915->ggtt.vm);
 
 	mkwrite_device_info(i915)->engine_mask = BIT(0);
 
@@ -187,10 +187,6 @@ struct drm_i915_private *mock_gem_device(void)
 	if (!i915->engine[RCS0])
 		goto err_unlock;
 
-	i915->kernel_context = mock_context(i915, NULL);
-	if (!i915->kernel_context)
-		goto err_engine;
-
 	if (mock_engine_init(i915->engine[RCS0]))
 		goto err_context;
 
@@ -199,8 +195,6 @@ struct drm_i915_private *mock_gem_device(void)
 	return i915;
 
 err_context:
-	i915_gem_driver_release__contexts(i915);
-err_engine:
 	mock_engine_free(i915->engine[RCS0]);
 err_unlock:
 	destroy_workqueue(i915->wq);
-- 
2.24.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [Intel-gfx] [PATCH 21/33] drm/i915: Move i915_gem_init_contexts() earlier
  2019-12-12 14:04 [Intel-gfx] [PATCH 01/33] drm/i915: Use EAGAIN for trylock failures Chris Wilson
                   ` (18 preceding siblings ...)
  2019-12-12 14:04 ` [Intel-gfx] [PATCH 20/33] drm/i915: Remove i915->kernel_context Chris Wilson
@ 2019-12-12 14:04 ` Chris Wilson
  2019-12-12 14:04 ` [Intel-gfx] [PATCH 22/33] drm/i915/uc: Use an internal buffer for firmware images Chris Wilson
                   ` (12 subsequent siblings)
  32 siblings, 0 replies; 46+ messages in thread
From: Chris Wilson @ 2019-12-12 14:04 UTC (permalink / raw)
  To: intel-gfx

As the GEM global context setup is now independent of the GT state
(although GT does currently still depending upon the global
i915->kernel_context), we can move its init earlier, leaving the gt init
ready to extracted.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/i915_gem.c | 23 ++++++++++++-----------
 1 file changed, 12 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index d7a9cf474921..dcbfc436af68 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -1283,18 +1283,17 @@ int i915_gem_init(struct drm_i915_private *dev_priv)
 	}
 
 	intel_gt_init(&dev_priv->gt);
-	i915_gem_init__contexts(dev_priv);
 
 	ret = intel_engines_setup(&dev_priv->gt);
 	if (ret) {
 		GEM_BUG_ON(ret == -EIO);
-		goto err_unlock;
+		goto err_gt_early;
 	}
 
 	ret = intel_engines_init(&dev_priv->gt);
 	if (ret) {
 		GEM_BUG_ON(ret == -EIO);
-		goto err_scratch;
+		goto err_engines;
 	}
 
 	intel_uc_init(&dev_priv->gt.uc);
@@ -1321,19 +1320,19 @@ int i915_gem_init(struct drm_i915_private *dev_priv)
 
 	ret = intel_engines_verify_workarounds(&dev_priv->gt);
 	if (ret)
-		goto err_gt;
+		goto err_gt_late;
 
 	ret = __intel_engines_record_defaults(&dev_priv->gt);
 	if (ret)
-		goto err_gt;
+		goto err_gt_late;
 
 	ret = i915_inject_probe_error(dev_priv, -ENODEV);
 	if (ret)
-		goto err_gt;
+		goto err_gt_late;
 
 	ret = i915_inject_probe_error(dev_priv, -EIO);
 	if (ret)
-		goto err_gt;
+		goto err_gt_late;
 
 	intel_uncore_forcewake_put(&dev_priv->uncore, FORCEWAKE_ALL);
 
@@ -1345,7 +1344,7 @@ int i915_gem_init(struct drm_i915_private *dev_priv)
 	 * HW as irrevisibly wedged, but keep enough state around that the
 	 * driver doesn't explode during runtime.
 	 */
-err_gt:
+err_gt_late:
 	intel_gt_set_wedged_on_init(&dev_priv->gt);
 	i915_gem_suspend(dev_priv);
 	i915_gem_suspend_late(dev_priv);
@@ -1354,11 +1353,12 @@ int i915_gem_init(struct drm_i915_private *dev_priv)
 err_init_hw:
 	intel_uc_fini_hw(&dev_priv->gt.uc);
 err_uc_init:
-	if (ret != -EIO) {
+	if (ret != -EIO)
 		intel_uc_fini(&dev_priv->gt.uc);
+err_engines:
+	if (ret != -EIO)
 		intel_engines_cleanup(&dev_priv->gt);
-	}
-err_scratch:
+err_gt_early:
 	intel_gt_driver_release(&dev_priv->gt);
 err_unlock:
 	intel_uncore_forcewake_put(&dev_priv->uncore, FORCEWAKE_ALL);
@@ -1451,6 +1451,7 @@ static void i915_gem_init__mm(struct drm_i915_private *i915)
 void i915_gem_init_early(struct drm_i915_private *dev_priv)
 {
 	i915_gem_init__mm(dev_priv);
+	i915_gem_init__contexts(dev_priv);
 
 	spin_lock_init(&dev_priv->fb_tracking.lock);
 }
-- 
2.24.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [Intel-gfx] [PATCH 22/33] drm/i915/uc: Use an internal buffer for firmware images
  2019-12-12 14:04 [Intel-gfx] [PATCH 01/33] drm/i915: Use EAGAIN for trylock failures Chris Wilson
                   ` (19 preceding siblings ...)
  2019-12-12 14:04 ` [Intel-gfx] [PATCH 21/33] drm/i915: Move i915_gem_init_contexts() earlier Chris Wilson
@ 2019-12-12 14:04 ` Chris Wilson
  2019-12-12 14:04 ` [Intel-gfx] [PATCH 23/33] drm/i915/gt: Pull GT initialisation under intel_gt_init() Chris Wilson
                   ` (11 subsequent siblings)
  32 siblings, 0 replies; 46+ messages in thread
From: Chris Wilson @ 2019-12-12 14:04 UTC (permalink / raw)
  To: intel-gfx

Since the lifetime of the uc_fw is virtually identical to the current
pinned range, simplify the setup to avoid using a swappable shmem file,
and just use an internal bo. The immediate advantage is in removing the
extra pin/unpin stages during init that are very difficult to balance
along error paths.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/gem/i915_gem_internal.c | 43 +++++++++++++++++
 drivers/gpu/drm/i915/gem/i915_gem_object.h   |  4 +-
 drivers/gpu/drm/i915/gem/i915_gem_shmem.c    | 51 --------------------
 drivers/gpu/drm/i915/gt/uc/intel_guc.c       | 12 ++---
 drivers/gpu/drm/i915/gt/uc/intel_huc.c       | 12 ++---
 drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c     | 30 +-----------
 6 files changed, 56 insertions(+), 96 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_internal.c b/drivers/gpu/drm/i915/gem/i915_gem_internal.c
index 9cfb0e41ff06..a8eb0eb27390 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_internal.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_internal.c
@@ -198,3 +198,46 @@ i915_gem_object_create_internal(struct drm_i915_private *i915,
 
 	return obj;
 }
+
+/* Allocate a new GEM object and fill it with the supplied data */
+struct drm_i915_gem_object *
+i915_gem_object_create_from_data(struct drm_i915_private *i915,
+				 const void *data, resource_size_t size)
+{
+	struct drm_i915_gem_object *obj;
+	resource_size_t offset;
+	struct sgt_iter it;
+	struct page *page;
+	int err;
+
+	obj = i915_gem_object_create_internal(i915, round_up(size, PAGE_SIZE));
+	if (IS_ERR(obj))
+		return obj;
+
+	GEM_BUG_ON(obj->write_domain != I915_GEM_DOMAIN_CPU);
+
+	err = i915_gem_object_pin_pages(obj);
+	if (err)
+		goto err;
+
+	offset = 0;
+	for_each_sgt_page(page, it, obj->mm.pages) {
+		int len = min_t(typeof(size), size - offset, PAGE_SIZE);
+		void *ptr;
+
+		ptr = kmap(page);
+
+		memcpy(ptr, data + offset, len);
+		if (!(obj->cache_coherent & I915_BO_CACHE_COHERENT_FOR_WRITE))
+			drm_clflush_virt_range(ptr, len);
+
+		kunmap(page);
+		offset += len;
+	}
+
+	return obj; /* keep pages pinned */
+
+err:
+	i915_gem_object_put(obj);
+	return ERR_PTR(err);
+}
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.h b/drivers/gpu/drm/i915/gem/i915_gem_object.h
index a1eb7c0b23ac..994dd654b5c1 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object.h
@@ -29,8 +29,8 @@ struct drm_i915_gem_object *
 i915_gem_object_create_shmem(struct drm_i915_private *i915,
 			     resource_size_t size);
 struct drm_i915_gem_object *
-i915_gem_object_create_shmem_from_data(struct drm_i915_private *i915,
-				       const void *data, resource_size_t size);
+i915_gem_object_create_from_data(struct drm_i915_private *i915,
+				 const void *data, resource_size_t size);
 
 extern const struct drm_i915_gem_object_ops i915_gem_shmem_ops;
 void __i915_gem_object_release_shmem(struct drm_i915_gem_object *obj,
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_shmem.c b/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
index 4d69c3fc3439..0f4c8fc38bba 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
@@ -533,57 +533,6 @@ i915_gem_object_create_shmem(struct drm_i915_private *i915,
 					     size, 0);
 }
 
-/* Allocate a new GEM object and fill it with the supplied data */
-struct drm_i915_gem_object *
-i915_gem_object_create_shmem_from_data(struct drm_i915_private *dev_priv,
-				       const void *data, resource_size_t size)
-{
-	struct drm_i915_gem_object *obj;
-	struct file *file;
-	resource_size_t offset;
-	int err;
-
-	obj = i915_gem_object_create_shmem(dev_priv, round_up(size, PAGE_SIZE));
-	if (IS_ERR(obj))
-		return obj;
-
-	GEM_BUG_ON(obj->write_domain != I915_GEM_DOMAIN_CPU);
-
-	file = obj->base.filp;
-	offset = 0;
-	do {
-		unsigned int len = min_t(typeof(size), size, PAGE_SIZE);
-		struct page *page;
-		void *pgdata, *vaddr;
-
-		err = pagecache_write_begin(file, file->f_mapping,
-					    offset, len, 0,
-					    &page, &pgdata);
-		if (err < 0)
-			goto fail;
-
-		vaddr = kmap(page);
-		memcpy(vaddr, data, len);
-		kunmap(page);
-
-		err = pagecache_write_end(file, file->f_mapping,
-					  offset, len, len,
-					  page, pgdata);
-		if (err < 0)
-			goto fail;
-
-		size -= len;
-		data += len;
-		offset += len;
-	} while (size);
-
-	return obj;
-
-fail:
-	i915_gem_object_put(obj);
-	return ERR_PTR(err);
-}
-
 static int init_shmem(struct intel_memory_region *mem)
 {
 	int err;
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc.c b/drivers/gpu/drm/i915/gt/uc/intel_guc.c
index 922a19635d20..b5639548b8fc 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc.c
@@ -334,13 +334,14 @@ int intel_guc_init(struct intel_guc *guc)
 	struct intel_gt *gt = guc_to_gt(guc);
 	int ret;
 
-	ret = intel_uc_fw_init(&guc->fw);
-	if (ret)
-		goto err_fetch;
+	/* This should happen before the load! */
+	GEM_BUG_ON(intel_uc_fw_is_loaded(&guc->fw));
+	if (!intel_uc_fw_is_available(&guc->fw))
+		return -ENOEXEC;
 
 	ret = intel_guc_log_create(&guc->log);
 	if (ret)
-		goto err_fw;
+		goto err_fetch;
 
 	ret = intel_guc_ads_create(guc);
 	if (ret)
@@ -375,8 +376,6 @@ int intel_guc_init(struct intel_guc *guc)
 	intel_guc_ads_destroy(guc);
 err_log:
 	intel_guc_log_destroy(&guc->log);
-err_fw:
-	intel_uc_fw_fini(&guc->fw);
 err_fetch:
 	intel_uc_fw_cleanup_fetch(&guc->fw);
 	DRM_DEV_DEBUG_DRIVER(gt->i915->drm.dev, "failed with %d\n", ret);
@@ -399,7 +398,6 @@ void intel_guc_fini(struct intel_guc *guc)
 
 	intel_guc_ads_destroy(guc);
 	intel_guc_log_destroy(&guc->log);
-	intel_uc_fw_fini(&guc->fw);
 	intel_uc_fw_cleanup_fetch(&guc->fw);
 }
 
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_huc.c b/drivers/gpu/drm/i915/gt/uc/intel_huc.c
index 32a069841c14..f0b555fb6dd2 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_huc.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_huc.c
@@ -108,9 +108,10 @@ int intel_huc_init(struct intel_huc *huc)
 	struct drm_i915_private *i915 = huc_to_gt(huc)->i915;
 	int err;
 
-	err = intel_uc_fw_init(&huc->fw);
-	if (err)
-		goto out;
+	/* This should happen before the load! */
+	GEM_BUG_ON(intel_uc_fw_is_loaded(&huc->fw));
+	if (!intel_uc_fw_is_available(&huc->fw))
+		return -ENOEXEC;
 
 	/*
 	 * HuC firmware image is outside GuC accessible range.
@@ -119,12 +120,10 @@ int intel_huc_init(struct intel_huc *huc)
 	 */
 	err = intel_huc_rsa_data_create(huc);
 	if (err)
-		goto out_fini;
+		goto out;
 
 	return 0;
 
-out_fini:
-	intel_uc_fw_fini(&huc->fw);
 out:
 	intel_uc_fw_cleanup_fetch(&huc->fw);
 	DRM_DEV_DEBUG_DRIVER(i915->drm.dev, "failed with %d\n", err);
@@ -137,7 +136,6 @@ void intel_huc_fini(struct intel_huc *huc)
 		return;
 
 	intel_huc_rsa_data_destroy(huc);
-	intel_uc_fw_fini(&huc->fw);
 }
 
 /**
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c b/drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c
index 4d02e06480e5..f19636d4bed7 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c
@@ -364,7 +364,7 @@ int intel_uc_fw_fetch(struct intel_uc_fw *uc_fw)
 		}
 	}
 
-	obj = i915_gem_object_create_shmem_from_data(i915, fw->data, fw->size);
+	obj = i915_gem_object_create_from_data(i915, fw->data, fw->size);
 	if (IS_ERR(obj)) {
 		err = PTR_ERR(obj);
 		goto fail;
@@ -524,34 +524,6 @@ int intel_uc_fw_upload(struct intel_uc_fw *uc_fw, u32 dst_offset, u32 dma_flags)
 	return err;
 }
 
-int intel_uc_fw_init(struct intel_uc_fw *uc_fw)
-{
-	int err;
-
-	/* this should happen before the load! */
-	GEM_BUG_ON(intel_uc_fw_is_loaded(uc_fw));
-
-	if (!intel_uc_fw_is_available(uc_fw))
-		return -ENOEXEC;
-
-	err = i915_gem_object_pin_pages(uc_fw->obj);
-	if (err) {
-		DRM_DEBUG_DRIVER("%s fw pin-pages err=%d\n",
-				 intel_uc_fw_type_repr(uc_fw->type), err);
-		intel_uc_fw_change_status(uc_fw, INTEL_UC_FIRMWARE_FAIL);
-	}
-
-	return err;
-}
-
-void intel_uc_fw_fini(struct intel_uc_fw *uc_fw)
-{
-	if (!intel_uc_fw_is_available(uc_fw))
-		return;
-
-	i915_gem_object_unpin_pages(uc_fw->obj);
-}
-
 /**
  * intel_uc_fw_cleanup_fetch - cleanup uC firmware
  * @uc_fw: uC firmware
-- 
2.24.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [Intel-gfx] [PATCH 23/33] drm/i915/gt: Pull GT initialisation under intel_gt_init()
  2019-12-12 14:04 [Intel-gfx] [PATCH 01/33] drm/i915: Use EAGAIN for trylock failures Chris Wilson
                   ` (20 preceding siblings ...)
  2019-12-12 14:04 ` [Intel-gfx] [PATCH 22/33] drm/i915/uc: Use an internal buffer for firmware images Chris Wilson
@ 2019-12-12 14:04 ` Chris Wilson
  2019-12-12 14:04 ` [Intel-gfx] [PATCH 24/33] drm/i915/gt: Pull intel_gt_init_hw() into intel_gt_resume() Chris Wilson
                   ` (10 subsequent siblings)
  32 siblings, 0 replies; 46+ messages in thread
From: Chris Wilson @ 2019-12-12 14:04 UTC (permalink / raw)
  To: intel-gfx

Begin pulling the GT setup underneath a single GT umbrella; let intel_gt
take ownership of its engines! As hinted, the complication is the
lifetime of the probed engine versus the active lifetime of the GT
backends. We need to detect the engine layout early and keep it until
the end so that we can sanitize state on takeover and release.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/display/intel_overlay.c  |   6 +-
 drivers/gpu/drm/i915/gt/intel_engine.h        |   8 +-
 drivers/gpu/drm/i915/gt/intel_engine_cs.c     |  42 +--
 drivers/gpu/drm/i915/gt/intel_engine_types.h  |  15 +-
 drivers/gpu/drm/i915/gt/intel_gt.c            | 254 +++++++++++++++++-
 drivers/gpu/drm/i915/gt/intel_lrc.c           |  17 +-
 drivers/gpu/drm/i915/gt/intel_reset.c         |   9 +-
 .../gpu/drm/i915/gt/intel_ring_submission.c   |  14 +-
 .../gpu/drm/i915/gt/intel_timeline_types.h    |   4 +-
 drivers/gpu/drm/i915/gt/mock_engine.c         |  16 +-
 .../gpu/drm/i915/gt/uc/intel_guc_submission.c |   9 +-
 drivers/gpu/drm/i915/i915_drv.c               |   1 -
 drivers/gpu/drm/i915/i915_gem.c               | 252 +----------------
 drivers/gpu/drm/i915/selftests/i915_gem.c     |   1 +
 14 files changed, 327 insertions(+), 321 deletions(-)

diff --git a/drivers/gpu/drm/i915/display/intel_overlay.c b/drivers/gpu/drm/i915/display/intel_overlay.c
index 2a44b3be2600..94528ccc388f 100644
--- a/drivers/gpu/drm/i915/display/intel_overlay.c
+++ b/drivers/gpu/drm/i915/display/intel_overlay.c
@@ -1324,12 +1324,14 @@ static int get_registers(struct intel_overlay *overlay, bool use_phys)
 void intel_overlay_setup(struct drm_i915_private *dev_priv)
 {
 	struct intel_overlay *overlay;
+	struct intel_engine_cs *engine;
 	int ret;
 
 	if (!HAS_OVERLAY(dev_priv))
 		return;
 
-	if (!HAS_ENGINE(dev_priv, RCS0))
+	engine = dev_priv->engine[RCS0];
+	if (!engine || !engine->kernel_context)
 		return;
 
 	overlay = kzalloc(sizeof(*overlay), GFP_KERNEL);
@@ -1337,7 +1339,7 @@ void intel_overlay_setup(struct drm_i915_private *dev_priv)
 		return;
 
 	overlay->i915 = dev_priv;
-	overlay->context = dev_priv->engine[RCS0]->kernel_context;
+	overlay->context = engine->kernel_context;
 	GEM_BUG_ON(!overlay->context);
 
 	overlay->color_key = 0x0101fe;
diff --git a/drivers/gpu/drm/i915/gt/intel_engine.h b/drivers/gpu/drm/i915/gt/intel_engine.h
index d9f9ef24fbff..4172f2ab78e2 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine.h
+++ b/drivers/gpu/drm/i915/gt/intel_engine.h
@@ -179,7 +179,9 @@ void intel_engine_cleanup(struct intel_engine_cs *engine);
 int intel_engines_init_mmio(struct intel_gt *gt);
 int intel_engines_setup(struct intel_gt *gt);
 int intel_engines_init(struct intel_gt *gt);
-void intel_engines_cleanup(struct intel_gt *gt);
+
+void intel_engines_release(struct intel_gt *gt);
+void intel_engines_free(struct intel_gt *gt);
 
 int intel_engine_init_common(struct intel_engine_cs *engine);
 void intel_engine_cleanup_common(struct intel_engine_cs *engine);
@@ -268,8 +270,8 @@ gen8_emit_ggtt_write(u32 *cs, u32 value, u32 gtt_offset, u32 flags)
 static inline void __intel_engine_reset(struct intel_engine_cs *engine,
 					bool stalled)
 {
-	if (engine->reset.reset)
-		engine->reset.reset(engine, stalled);
+	if (engine->reset.rewind)
+		engine->reset.rewind(engine, stalled);
 	engine->serial++; /* contexts lost */
 }
 
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
index b72e37c2055e..5a5e75483bbb 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
@@ -319,12 +319,6 @@ static int intel_engine_setup(struct intel_gt *gt, enum intel_engine_id id)
 	engine->props.timeslice_duration_ms =
 		CONFIG_DRM_I915_TIMESLICE_DURATION;
 
-	/*
-	 * To be overridden by the backend on setup. However to facilitate
-	 * cleanup on error during setup, we always provide the destroy vfunc.
-	 */
-	engine->destroy = (typeof(engine->destroy))kfree;
-
 	engine->context_size = intel_engine_context_size(gt, engine->class);
 	if (WARN_ON(engine->context_size > BIT(20)))
 		engine->context_size = 0;
@@ -389,21 +383,39 @@ static void intel_setup_engine_capabilities(struct intel_gt *gt)
 }
 
 /**
- * intel_engines_cleanup() - free the resources allocated for Command Streamers
+ * intel_engines_release() - free the resources allocated for Command Streamers
  * @gt: pointer to struct intel_gt
  */
-void intel_engines_cleanup(struct intel_gt *gt)
+void intel_engines_release(struct intel_gt *gt)
 {
 	struct intel_engine_cs *engine;
 	enum intel_engine_id id;
 
+	/* Decouple the backend; but keep the layout for late GPU resets */
 	for_each_engine(engine, gt, id) {
-		engine->destroy(engine);
-		gt->engine[id] = NULL;
+		if (!engine->release)
+			continue;
+
+		engine->release(engine);
+		engine->release = NULL;
+
+		memset(&engine->reset, 0, sizeof(engine->reset));
+
 		gt->i915->engine[id] = NULL;
 	}
 }
 
+void intel_engines_free(struct intel_gt *gt)
+{
+	struct intel_engine_cs *engine;
+	enum intel_engine_id id;
+
+	for_each_engine(engine, gt, id) {
+		kfree(engine);
+		gt->engine[id] = NULL;
+	}
+}
+
 /**
  * intel_engines_init_mmio() - allocate and prepare the Engine Command Streamers
  * @gt: pointer to struct intel_gt
@@ -454,7 +466,7 @@ int intel_engines_init_mmio(struct intel_gt *gt)
 	return 0;
 
 cleanup:
-	intel_engines_cleanup(gt);
+	intel_engines_free(gt);
 	return err;
 }
 
@@ -487,7 +499,7 @@ int intel_engines_init(struct intel_gt *gt)
 	return 0;
 
 cleanup:
-	intel_engines_cleanup(gt);
+	intel_engines_release(gt);
 	return err;
 }
 
@@ -662,16 +674,13 @@ int intel_engines_setup(struct intel_gt *gt)
 		if (err)
 			goto cleanup;
 
-		/* We expect the backend to take control over its state */
-		GEM_BUG_ON(engine->destroy == (typeof(engine->destroy))kfree);
-
 		GEM_BUG_ON(!engine->cops);
 	}
 
 	return 0;
 
 cleanup:
-	intel_engines_cleanup(gt);
+	intel_engines_release(gt);
 	return err;
 }
 
@@ -832,6 +841,7 @@ int intel_engine_init_common(struct intel_engine_cs *engine)
 void intel_engine_cleanup_common(struct intel_engine_cs *engine)
 {
 	GEM_BUG_ON(!list_empty(&engine->active.requests));
+	tasklet_kill(&engine->execlists.tasklet); /* flush the callback */
 
 	cleanup_status_page(engine);
 
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_types.h b/drivers/gpu/drm/i915/gt/intel_engine_types.h
index 17f1f1441efc..ede7ce2695cd 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_types.h
+++ b/drivers/gpu/drm/i915/gt/intel_engine_types.h
@@ -389,7 +389,10 @@ struct intel_engine_cs {
 
 	struct {
 		void (*prepare)(struct intel_engine_cs *engine);
-		void (*reset)(struct intel_engine_cs *engine, bool stalled);
+
+		void (*rewind)(struct intel_engine_cs *engine, bool stalled);
+		void (*cancel)(struct intel_engine_cs *engine);
+
 		void (*finish)(struct intel_engine_cs *engine);
 	} reset;
 
@@ -439,15 +442,7 @@ struct intel_engine_cs {
 	void		(*schedule)(struct i915_request *request,
 				    const struct i915_sched_attr *attr);
 
-	/*
-	 * Cancel all requests on the hardware, or queued for execution.
-	 * This should only cancel the ready requests that have been
-	 * submitted to the engine (via the engine->submit_request callback).
-	 * This is called when marking the device as wedged.
-	 */
-	void		(*cancel_requests)(struct intel_engine_cs *engine);
-
-	void		(*destroy)(struct intel_engine_cs *engine);
+	void		(*release)(struct intel_engine_cs *engine);
 
 	struct intel_engine_execlists execlists;
 
diff --git a/drivers/gpu/drm/i915/gt/intel_gt.c b/drivers/gpu/drm/i915/gt/intel_gt.c
index af4f8c810009..e7f4c1beb3b6 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt.c
+++ b/drivers/gpu/drm/i915/gt/intel_gt.c
@@ -4,11 +4,13 @@
  */
 
 #include "i915_drv.h"
+#include "intel_context.h"
 #include "intel_gt.h"
 #include "intel_gt_pm.h"
 #include "intel_gt_requests.h"
 #include "intel_mocs.h"
 #include "intel_rc6.h"
+#include "intel_renderstate.h"
 #include "intel_rps.h"
 #include "intel_uncore.h"
 #include "intel_pm.h"
@@ -372,32 +374,275 @@ static struct i915_address_space *kernel_vm(struct intel_gt *gt)
 		return i915_vm_get(&gt->ggtt->vm);
 }
 
+static int __intel_context_flush_retire(struct intel_context *ce)
+{
+	struct intel_timeline *tl;
+
+	tl = intel_context_timeline_lock(ce);
+	if (IS_ERR(tl))
+		return PTR_ERR(tl);
+
+	intel_context_timeline_unlock(tl);
+	return 0;
+}
+
+static int __engines_record_defaults(struct intel_gt *gt)
+{
+	struct i915_request *requests[I915_NUM_ENGINES] = {};
+	struct intel_engine_cs *engine;
+	enum intel_engine_id id;
+	int err = 0;
+
+	/*
+	 * As we reset the gpu during very early sanitisation, the current
+	 * register state on the GPU should reflect its defaults values.
+	 * We load a context onto the hw (with restore-inhibit), then switch
+	 * over to a second context to save that default register state. We
+	 * can then prime every new context with that state so they all start
+	 * from the same default HW values.
+	 */
+
+	for_each_engine(engine, gt, id) {
+		struct intel_renderstate so;
+		struct intel_context *ce;
+		struct i915_request *rq;
+
+		err = intel_renderstate_init(&so, engine);
+		if (err)
+			goto out;
+
+		/* We must be able to switch to something! */
+		GEM_BUG_ON(!engine->kernel_context);
+		engine->serial++; /* force the kernel context switch */
+
+		ce = intel_context_create(engine);
+		if (IS_ERR(ce)) {
+			err = PTR_ERR(ce);
+			goto out;
+		}
+
+		rq = intel_context_create_request(ce);
+		if (IS_ERR(rq)) {
+			err = PTR_ERR(rq);
+			intel_context_put(ce);
+			goto out;
+		}
+
+		err = intel_engine_emit_ctx_wa(rq);
+		if (err)
+			goto err_rq;
+
+		err = intel_renderstate_emit(&so, rq);
+		if (err)
+			goto err_rq;
+
+err_rq:
+		requests[id] = i915_request_get(rq);
+		i915_request_add(rq);
+		intel_renderstate_fini(&so);
+		if (err)
+			goto out;
+	}
+
+	/* Flush the default context image to memory, and enable powersaving. */
+	if (intel_gt_wait_for_idle(gt, I915_GEM_IDLE_TIMEOUT) == -ETIME) {
+		err = -EIO;
+		goto out;
+	}
+
+	for (id = 0; id < ARRAY_SIZE(requests); id++) {
+		struct i915_request *rq;
+		struct i915_vma *state;
+		void *vaddr;
+
+		rq = requests[id];
+		if (!rq)
+			continue;
+
+		GEM_BUG_ON(!test_bit(CONTEXT_ALLOC_BIT, &rq->context->flags));
+		state = rq->context->state;
+		if (!state)
+			continue;
+
+		/* Serialise with retirement on another CPU */
+		err = __intel_context_flush_retire(rq->context);
+		if (err)
+			goto out;
+
+		/* We want to be able to unbind the state from the GGTT */
+		GEM_BUG_ON(intel_context_is_pinned(rq->context));
+
+		/*
+		 * As we will hold a reference to the logical state, it will
+		 * not be torn down with the context, and importantly the
+		 * object will hold onto its vma (making it possible for a
+		 * stray GTT write to corrupt our defaults). Unmap the vma
+		 * from the GTT to prevent such accidents and reclaim the
+		 * space.
+		 */
+		err = i915_vma_unbind(state);
+		if (err)
+			goto out;
+
+		i915_gem_object_lock(state->obj);
+		err = i915_gem_object_set_to_cpu_domain(state->obj, false);
+		i915_gem_object_unlock(state->obj);
+		if (err)
+			goto out;
+
+		i915_gem_object_set_cache_coherency(state->obj, I915_CACHE_LLC);
+
+		/* Check we can acquire the image of the context state */
+		vaddr = i915_gem_object_pin_map(state->obj, I915_MAP_FORCE_WB);
+		if (IS_ERR(vaddr)) {
+			err = PTR_ERR(vaddr);
+			goto out;
+		}
+
+		rq->engine->default_state = i915_gem_object_get(state->obj);
+		i915_gem_object_unpin_map(state->obj);
+	}
+
+out:
+	/*
+	 * If we have to abandon now, we expect the engines to be idle
+	 * and ready to be torn-down. The quickest way we can accomplish
+	 * this is by declaring ourselves wedged.
+	 */
+	if (err)
+		intel_gt_set_wedged(gt);
+
+	for (id = 0; id < ARRAY_SIZE(requests); id++) {
+		struct intel_context *ce;
+		struct i915_request *rq;
+
+		rq = requests[id];
+		if (!rq)
+			continue;
+
+		ce = rq->context;
+		i915_request_put(rq);
+		intel_context_put(ce);
+	}
+	return err;
+}
+
+static int __engines_verify_workarounds(struct intel_gt *gt)
+{
+	struct intel_engine_cs *engine;
+	enum intel_engine_id id;
+	int err = 0;
+
+	if (!IS_ENABLED(CONFIG_DRM_I915_DEBUG_GEM))
+		return 0;
+
+	for_each_engine(engine, gt, id) {
+		if (intel_engine_verify_workarounds(engine, "load"))
+			err = -EIO;
+	}
+
+	return err;
+}
+
+static void __intel_gt_disable(struct intel_gt *gt)
+{
+	intel_gt_set_wedged_on_init(gt);
+
+	intel_gt_suspend_prepare(gt);
+	intel_gt_suspend_late(gt);
+
+	GEM_BUG_ON(intel_gt_pm_is_awake(gt));
+}
+
 int intel_gt_init(struct intel_gt *gt)
 {
 	int err;
 
-	err = intel_gt_init_scratch(gt, IS_GEN(gt->i915, 2) ? SZ_256K : SZ_4K);
+	err = i915_inject_probe_error(gt->i915, -ENODEV);
 	if (err)
 		return err;
 
+	/*
+	 * This is just a security blanket to placate dragons.
+	 * On some systems, we very sporadically observe that the first TLBs
+	 * used by the CS may be stale, despite us poking the TLB reset. If
+	 * we hold the forcewake during initialisation these problems
+	 * just magically go away.
+	 */
+	intel_uncore_forcewake_get(gt->uncore, FORCEWAKE_ALL);
+
+	err = intel_gt_init_scratch(gt, IS_GEN(gt->i915, 2) ? SZ_256K : SZ_4K);
+	if (err)
+		goto out_fw;
+
 	intel_gt_pm_init(gt);
 
 	gt->vm = kernel_vm(gt);
 	if (!gt->vm) {
 		err = -ENOMEM;
-		goto err_scratch;
+		goto err_pm;
 	}
 
-	return 0;
+	err = intel_engines_setup(gt);
+	if (err)
+		goto err_vm;
+
+	err = intel_engines_init(gt);
+	if (err)
+		goto err_engines;
+
+	intel_uc_init(&gt->uc);
+
+	err = intel_gt_init_hw(gt);
+	if (err)
+		goto err_uc_init;
+
+	/* Only when the HW is re-initialised, can we replay the requests */
+	err = intel_gt_resume(gt);
+	if (err)
+		goto err_uc_init;
+
+	err = __engines_record_defaults(gt);
+	if (err)
+		goto err_gt;
 
-err_scratch:
+	err = __engines_verify_workarounds(gt);
+	if (err)
+		goto err_gt;
+
+	err = i915_inject_probe_error(gt->i915, -EIO);
+	if (err)
+		goto err_gt;
+
+	goto out_fw;
+err_gt:
+	__intel_gt_disable(gt);
+err_uc_init:
+	intel_uc_fini(&gt->uc);
+err_engines:
+	intel_engines_release(gt);
+err_vm:
+	i915_vm_put(fetch_and_zero(&gt->vm));
+err_pm:
+	intel_gt_pm_fini(gt);
 	intel_gt_fini_scratch(gt);
+out_fw:
+	if (err)
+		intel_gt_set_wedged_on_init(gt);
+	intel_uncore_forcewake_put(gt->uncore, FORCEWAKE_ALL);
 	return err;
 }
 
 void intel_gt_driver_remove(struct intel_gt *gt)
 {
 	GEM_BUG_ON(gt->awake);
+
+	__intel_gt_disable(gt);
+
+	intel_uc_fini_hw(&gt->uc);
+	intel_uc_fini(&gt->uc);
+
+	intel_engines_release(gt);
 }
 
 void intel_gt_driver_unregister(struct intel_gt *gt)
@@ -423,4 +668,5 @@ void intel_gt_driver_late_release(struct intel_gt *gt)
 	intel_gt_fini_requests(gt);
 	intel_gt_fini_reset(gt);
 	intel_gt_fini_timelines(gt);
+	intel_engines_free(gt);
 }
diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
index e8d5518cdcad..2922b5c4024e 100644
--- a/drivers/gpu/drm/i915/gt/intel_lrc.c
+++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
@@ -3147,7 +3147,7 @@ static void __execlists_reset(struct intel_engine_cs *engine, bool stalled)
 	__unwind_incomplete_requests(engine);
 }
 
-static void execlists_reset(struct intel_engine_cs *engine, bool stalled)
+static void execlists_reset_rewind(struct intel_engine_cs *engine, bool stalled)
 {
 	unsigned long flags;
 
@@ -3165,7 +3165,7 @@ static void nop_submission_tasklet(unsigned long data)
 	/* The driver is wedged; don't process any more events. */
 }
 
-static void execlists_cancel_requests(struct intel_engine_cs *engine)
+static void execlists_reset_cancel(struct intel_engine_cs *engine)
 {
 	struct intel_engine_execlists * const execlists = &engine->execlists;
 	struct i915_request *rq, *rn;
@@ -3754,12 +3754,12 @@ static void execlists_park(struct intel_engine_cs *engine)
 void intel_execlists_set_default_submission(struct intel_engine_cs *engine)
 {
 	engine->submit_request = execlists_submit_request;
-	engine->cancel_requests = execlists_cancel_requests;
 	engine->schedule = i915_schedule;
 	engine->execlists.tasklet.func = execlists_submission_tasklet;
 
 	engine->reset.prepare = execlists_reset_prepare;
-	engine->reset.reset = execlists_reset;
+	engine->reset.rewind = execlists_reset_rewind;
+	engine->reset.cancel = execlists_reset_cancel;
 	engine->reset.finish = execlists_reset_finish;
 
 	engine->park = execlists_park;
@@ -3784,13 +3784,12 @@ static void execlists_shutdown(struct intel_engine_cs *engine)
 	tasklet_kill(&engine->execlists.tasklet);
 }
 
-static void execlists_destroy(struct intel_engine_cs *engine)
+static void execlists_release(struct intel_engine_cs *engine)
 {
 	execlists_shutdown(engine);
 
 	intel_engine_cleanup_common(engine);
 	lrc_destroy_wa_ctx(engine);
-	kfree(engine);
 }
 
 static void
@@ -3798,13 +3797,9 @@ logical_ring_default_vfuncs(struct intel_engine_cs *engine)
 {
 	/* Default vfuncs which can be overriden by each engine. */
 
-	engine->destroy = execlists_destroy;
+	engine->release = execlists_release;
 	engine->resume = execlists_resume;
 
-	engine->reset.prepare = execlists_reset_prepare;
-	engine->reset.reset = execlists_reset;
-	engine->reset.finish = execlists_reset_finish;
-
 	engine->cops = &execlists_context_ops;
 	engine->request_alloc = execlists_request_alloc;
 
diff --git a/drivers/gpu/drm/i915/gt/intel_reset.c b/drivers/gpu/drm/i915/gt/intel_reset.c
index b5770bcfbaec..b7bb80c79227 100644
--- a/drivers/gpu/drm/i915/gt/intel_reset.c
+++ b/drivers/gpu/drm/i915/gt/intel_reset.c
@@ -666,7 +666,8 @@ static void reset_prepare_engine(struct intel_engine_cs *engine)
 	 * GPU state upon resume, i.e. fail to restart after a reset.
 	 */
 	intel_uncore_forcewake_get(engine->uncore, FORCEWAKE_ALL);
-	engine->reset.prepare(engine);
+	if (engine->reset.prepare)
+		engine->reset.prepare(engine);
 }
 
 static void revoke_mmaps(struct intel_gt *gt)
@@ -746,7 +747,8 @@ static int gt_reset(struct intel_gt *gt, intel_engine_mask_t stalled_mask)
 
 static void reset_finish_engine(struct intel_engine_cs *engine)
 {
-	engine->reset.finish(engine);
+	if (engine->reset.finish)
+		engine->reset.finish(engine);
 	intel_uncore_forcewake_put(engine->uncore, FORCEWAKE_ALL);
 
 	intel_engine_signal_breadcrumbs(engine);
@@ -823,7 +825,8 @@ static void __intel_gt_set_wedged(struct intel_gt *gt)
 
 	/* Mark all executing requests as skipped */
 	for_each_engine(engine, gt, id)
-		engine->cancel_requests(engine);
+		if (engine->reset.cancel)
+			engine->reset.cancel(engine);
 
 	reset_finish(gt, awake);
 
diff --git a/drivers/gpu/drm/i915/gt/intel_ring_submission.c b/drivers/gpu/drm/i915/gt/intel_ring_submission.c
index 792b2fad27f3..785442923a5c 100644
--- a/drivers/gpu/drm/i915/gt/intel_ring_submission.c
+++ b/drivers/gpu/drm/i915/gt/intel_ring_submission.c
@@ -770,7 +770,7 @@ static void reset_prepare(struct intel_engine_cs *engine)
 			  intel_uncore_read_fw(uncore, RING_HEAD(base)));
 }
 
-static void reset_ring(struct intel_engine_cs *engine, bool stalled)
+static void reset_rewind(struct intel_engine_cs *engine, bool stalled)
 {
 	struct i915_request *pos, *rq;
 	unsigned long flags;
@@ -902,7 +902,7 @@ static int rcs_resume(struct intel_engine_cs *engine)
 	return xcs_resume(engine);
 }
 
-static void cancel_requests(struct intel_engine_cs *engine)
+static void reset_cancel(struct intel_engine_cs *engine)
 {
 	struct i915_request *request;
 	unsigned long flags;
@@ -1812,7 +1812,6 @@ static int gen6_ring_flush(struct i915_request *rq, u32 mode)
 static void i9xx_set_default_submission(struct intel_engine_cs *engine)
 {
 	engine->submit_request = i9xx_submit_request;
-	engine->cancel_requests = cancel_requests;
 
 	engine->park = NULL;
 	engine->unpark = NULL;
@@ -1824,7 +1823,7 @@ static void gen6_bsd_set_default_submission(struct intel_engine_cs *engine)
 	engine->submit_request = gen6_bsd_submit_request;
 }
 
-static void ring_destroy(struct intel_engine_cs *engine)
+static void ring_release(struct intel_engine_cs *engine)
 {
 	struct drm_i915_private *dev_priv = engine->i915;
 
@@ -1838,8 +1837,6 @@ static void ring_destroy(struct intel_engine_cs *engine)
 
 	intel_timeline_unpin(engine->legacy.timeline);
 	intel_timeline_put(engine->legacy.timeline);
-
-	kfree(engine);
 }
 
 static void setup_irq(struct intel_engine_cs *engine)
@@ -1870,11 +1867,12 @@ static void setup_common(struct intel_engine_cs *engine)
 
 	setup_irq(engine);
 
-	engine->destroy = ring_destroy;
+	engine->release = ring_release;
 
 	engine->resume = xcs_resume;
 	engine->reset.prepare = reset_prepare;
-	engine->reset.reset = reset_ring;
+	engine->reset.rewind = reset_rewind;
+	engine->reset.cancel = reset_cancel;
 	engine->reset.finish = reset_finish;
 
 	engine->cops = &ring_context_ops;
diff --git a/drivers/gpu/drm/i915/gt/intel_timeline_types.h b/drivers/gpu/drm/i915/gt/intel_timeline_types.h
index aaf15cbe1ce1..84b4f8b0a7f9 100644
--- a/drivers/gpu/drm/i915/gt/intel_timeline_types.h
+++ b/drivers/gpu/drm/i915/gt/intel_timeline_types.h
@@ -14,10 +14,10 @@
 
 #include "i915_active_types.h"
 
-struct drm_i915_private;
 struct i915_vma;
-struct intel_timeline_cacheline;
 struct i915_syncmap;
+struct intel_gt;
+struct intel_timeline_cacheline;
 
 struct intel_timeline {
 	u64 fence_context;
diff --git a/drivers/gpu/drm/i915/gt/mock_engine.c b/drivers/gpu/drm/i915/gt/mock_engine.c
index 39df9d49a134..c5995f015357 100644
--- a/drivers/gpu/drm/i915/gt/mock_engine.c
+++ b/drivers/gpu/drm/i915/gt/mock_engine.c
@@ -207,16 +207,12 @@ static void mock_reset_prepare(struct intel_engine_cs *engine)
 {
 }
 
-static void mock_reset(struct intel_engine_cs *engine, bool stalled)
+static void mock_reset_rewind(struct intel_engine_cs *engine, bool stalled)
 {
 	GEM_BUG_ON(stalled);
 }
 
-static void mock_reset_finish(struct intel_engine_cs *engine)
-{
-}
-
-static void mock_cancel_requests(struct intel_engine_cs *engine)
+static void mock_reset_cancel(struct intel_engine_cs *engine)
 {
 	struct i915_request *request;
 	unsigned long flags;
@@ -234,6 +230,10 @@ static void mock_cancel_requests(struct intel_engine_cs *engine)
 	spin_unlock_irqrestore(&engine->active.lock, flags);
 }
 
+static void mock_reset_finish(struct intel_engine_cs *engine)
+{
+}
+
 struct intel_engine_cs *mock_engine(struct drm_i915_private *i915,
 				    const char *name,
 				    int id)
@@ -265,9 +265,9 @@ struct intel_engine_cs *mock_engine(struct drm_i915_private *i915,
 	engine->base.submit_request = mock_submit_request;
 
 	engine->base.reset.prepare = mock_reset_prepare;
-	engine->base.reset.reset = mock_reset;
+	engine->base.reset.rewind = mock_reset_rewind;
+	engine->base.reset.cancel = mock_reset_cancel;
 	engine->base.reset.finish = mock_reset_finish;
-	engine->base.cancel_requests = mock_cancel_requests;
 
 	i915->gt.engine[id] = &engine->base;
 	i915->gt.engine_class[0][id] = &engine->base;
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
index 1d615d375bfa..5aa87536ba04 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
@@ -402,7 +402,7 @@ cancel_port_requests(struct intel_engine_execlists * const execlists)
 		memset(execlists->inflight, 0, sizeof(execlists->inflight));
 }
 
-static void guc_reset(struct intel_engine_cs *engine, bool stalled)
+static void guc_reset_rewind(struct intel_engine_cs *engine, bool stalled)
 {
 	struct intel_engine_execlists * const execlists = &engine->execlists;
 	struct i915_request *rq;
@@ -427,7 +427,7 @@ static void guc_reset(struct intel_engine_cs *engine, bool stalled)
 	spin_unlock_irqrestore(&engine->active.lock, flags);
 }
 
-static void guc_cancel_requests(struct intel_engine_cs *engine)
+static void guc_reset_cancel(struct intel_engine_cs *engine)
 {
 	struct intel_engine_execlists * const execlists = &engine->execlists;
 	struct i915_request *rq, *rn;
@@ -600,11 +600,10 @@ static void guc_set_default_submission(struct intel_engine_cs *engine)
 	engine->park = engine->unpark = NULL;
 
 	engine->reset.prepare = guc_reset_prepare;
-	engine->reset.reset = guc_reset;
+	engine->reset.rewind = guc_reset_rewind;
+	engine->reset.cancel = guc_reset_cancel;
 	engine->reset.finish = guc_reset_finish;
 
-	engine->cancel_requests = guc_cancel_requests;
-
 	engine->flags &= ~I915_ENGINE_SUPPORTS_STATS;
 	engine->flags |= I915_ENGINE_NEEDS_BREADCRUMB_TASKLET;
 
diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index 8b08cfe30151..59525094d0e3 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -619,7 +619,6 @@ static int i915_driver_mmio_probe(struct drm_i915_private *dev_priv)
  */
 static void i915_driver_mmio_release(struct drm_i915_private *dev_priv)
 {
-	intel_engines_cleanup(&dev_priv->gt);
 	intel_teardown_mchbar(dev_priv);
 	intel_uncore_fini_mmio(&dev_priv->uncore);
 	pci_dev_put(dev_priv->bridge_dev);
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index dcbfc436af68..9b19b093ece7 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -45,20 +45,12 @@
 #include "gem/i915_gem_context.h"
 #include "gem/i915_gem_ioctls.h"
 #include "gem/i915_gem_mman.h"
-#include "gem/i915_gem_pm.h"
-#include "gt/intel_context.h"
 #include "gt/intel_engine_user.h"
 #include "gt/intel_gt.h"
 #include "gt/intel_gt_pm.h"
-#include "gt/intel_gt_requests.h"
-#include "gt/intel_mocs.h"
-#include "gt/intel_reset.h"
-#include "gt/intel_renderstate.h"
-#include "gt/intel_rps.h"
 #include "gt/intel_workarounds.h"
 
 #include "i915_drv.h"
-#include "i915_scatterlist.h"
 #include "i915_trace.h"
 #include "i915_vgpu.h"
 
@@ -1082,176 +1074,6 @@ i915_gem_madvise_ioctl(struct drm_device *dev, void *data,
 	return err;
 }
 
-static int __intel_context_flush_retire(struct intel_context *ce)
-{
-	struct intel_timeline *tl;
-
-	tl = intel_context_timeline_lock(ce);
-	if (IS_ERR(tl))
-		return PTR_ERR(tl);
-
-	intel_context_timeline_unlock(tl);
-	return 0;
-}
-
-static int __intel_engines_record_defaults(struct intel_gt *gt)
-{
-	struct i915_request *requests[I915_NUM_ENGINES] = {};
-	struct intel_engine_cs *engine;
-	enum intel_engine_id id;
-	int err = 0;
-
-	/*
-	 * As we reset the gpu during very early sanitisation, the current
-	 * register state on the GPU should reflect its defaults values.
-	 * We load a context onto the hw (with restore-inhibit), then switch
-	 * over to a second context to save that default register state. We
-	 * can then prime every new context with that state so they all start
-	 * from the same default HW values.
-	 */
-
-	for_each_engine(engine, gt, id) {
-		struct intel_renderstate so;
-		struct intel_context *ce;
-		struct i915_request *rq;
-
-		err = intel_renderstate_init(&so, engine);
-		if (err)
-			goto out;
-
-		/* We must be able to switch to something! */
-		GEM_BUG_ON(!engine->kernel_context);
-		engine->serial++; /* force the kernel context switch */
-
-		ce = intel_context_create(engine);
-		if (IS_ERR(ce)) {
-			err = PTR_ERR(ce);
-			goto out;
-		}
-
-		rq = intel_context_create_request(ce);
-		if (IS_ERR(rq)) {
-			err = PTR_ERR(rq);
-			intel_context_put(ce);
-			goto out;
-		}
-
-		err = intel_engine_emit_ctx_wa(rq);
-		if (err)
-			goto err_rq;
-
-		err = intel_renderstate_emit(&so, rq);
-		if (err)
-			goto err_rq;
-
-err_rq:
-		requests[id] = i915_request_get(rq);
-		i915_request_add(rq);
-		intel_renderstate_fini(&so);
-		if (err)
-			goto out;
-	}
-
-	/* Flush the default context image to memory, and enable powersaving. */
-	if (intel_gt_wait_for_idle(gt, I915_GEM_IDLE_TIMEOUT) == -ETIME) {
-		err = -EIO;
-		goto out;
-	}
-
-	for (id = 0; id < ARRAY_SIZE(requests); id++) {
-		struct i915_request *rq;
-		struct i915_vma *state;
-		void *vaddr;
-
-		rq = requests[id];
-		if (!rq)
-			continue;
-
-		GEM_BUG_ON(!test_bit(CONTEXT_ALLOC_BIT, &rq->context->flags));
-		state = rq->context->state;
-		if (!state)
-			continue;
-
-		/* Serialise with retirement on another CPU */
-		err = __intel_context_flush_retire(rq->context);
-		if (err)
-			goto out;
-
-		/* We want to be able to unbind the state from the GGTT */
-		GEM_BUG_ON(intel_context_is_pinned(rq->context));
-
-		/*
-		 * As we will hold a reference to the logical state, it will
-		 * not be torn down with the context, and importantly the
-		 * object will hold onto its vma (making it possible for a
-		 * stray GTT write to corrupt our defaults). Unmap the vma
-		 * from the GTT to prevent such accidents and reclaim the
-		 * space.
-		 */
-		err = i915_vma_unbind(state);
-		if (err)
-			goto out;
-
-		i915_gem_object_lock(state->obj);
-		err = i915_gem_object_set_to_cpu_domain(state->obj, false);
-		i915_gem_object_unlock(state->obj);
-		if (err)
-			goto out;
-
-		i915_gem_object_set_cache_coherency(state->obj, I915_CACHE_LLC);
-
-		/* Check we can acquire the image of the context state */
-		vaddr = i915_gem_object_pin_map(state->obj, I915_MAP_FORCE_WB);
-		if (IS_ERR(vaddr)) {
-			err = PTR_ERR(vaddr);
-			goto out;
-		}
-
-		rq->engine->default_state = i915_gem_object_get(state->obj);
-		i915_gem_object_unpin_map(state->obj);
-	}
-
-out:
-	/*
-	 * If we have to abandon now, we expect the engines to be idle
-	 * and ready to be torn-down. The quickest way we can accomplish
-	 * this is by declaring ourselves wedged.
-	 */
-	if (err)
-		intel_gt_set_wedged(gt);
-
-	for (id = 0; id < ARRAY_SIZE(requests); id++) {
-		struct intel_context *ce;
-		struct i915_request *rq;
-
-		rq = requests[id];
-		if (!rq)
-			continue;
-
-		ce = rq->context;
-		i915_request_put(rq);
-		intel_context_put(ce);
-	}
-	return err;
-}
-
-static int intel_engines_verify_workarounds(struct intel_gt *gt)
-{
-	struct intel_engine_cs *engine;
-	enum intel_engine_id id;
-	int err = 0;
-
-	if (!IS_ENABLED(CONFIG_DRM_I915_DEBUG_GEM))
-		return 0;
-
-	for_each_engine(engine, gt, id) {
-		if (intel_engine_verify_workarounds(engine, "load"))
-			err = -EIO;
-	}
-
-	return err;
-}
-
 int i915_gem_init(struct drm_i915_private *dev_priv)
 {
 	int ret;
@@ -1268,45 +1090,12 @@ int i915_gem_init(struct drm_i915_private *dev_priv)
 	intel_uc_fetch_firmwares(&dev_priv->gt.uc);
 	intel_wopcm_init(&dev_priv->wopcm);
 
-	/* This is just a security blanket to placate dragons.
-	 * On some systems, we very sporadically observe that the first TLBs
-	 * used by the CS may be stale, despite us poking the TLB reset. If
-	 * we hold the forcewake during initialisation these problems
-	 * just magically go away.
-	 */
-	intel_uncore_forcewake_get(&dev_priv->uncore, FORCEWAKE_ALL);
-
 	ret = i915_init_ggtt(dev_priv);
 	if (ret) {
 		GEM_BUG_ON(ret == -EIO);
 		goto err_unlock;
 	}
 
-	intel_gt_init(&dev_priv->gt);
-
-	ret = intel_engines_setup(&dev_priv->gt);
-	if (ret) {
-		GEM_BUG_ON(ret == -EIO);
-		goto err_gt_early;
-	}
-
-	ret = intel_engines_init(&dev_priv->gt);
-	if (ret) {
-		GEM_BUG_ON(ret == -EIO);
-		goto err_engines;
-	}
-
-	intel_uc_init(&dev_priv->gt.uc);
-
-	ret = intel_gt_init_hw(&dev_priv->gt);
-	if (ret)
-		goto err_uc_init;
-
-	/* Only when the HW is re-initialised, can we replay the requests */
-	ret = intel_gt_resume(&dev_priv->gt);
-	if (ret)
-		goto err_init_hw;
-
 	/*
 	 * Despite its name intel_init_clock_gating applies both display
 	 * clock gating workarounds; GT mmio workarounds and the occasional
@@ -1318,23 +1107,9 @@ int i915_gem_init(struct drm_i915_private *dev_priv)
 	 */
 	intel_init_clock_gating(dev_priv);
 
-	ret = intel_engines_verify_workarounds(&dev_priv->gt);
-	if (ret)
-		goto err_gt_late;
-
-	ret = __intel_engines_record_defaults(&dev_priv->gt);
-	if (ret)
-		goto err_gt_late;
-
-	ret = i915_inject_probe_error(dev_priv, -ENODEV);
-	if (ret)
-		goto err_gt_late;
-
-	ret = i915_inject_probe_error(dev_priv, -EIO);
+	ret = intel_gt_init(&dev_priv->gt);
 	if (ret)
-		goto err_gt_late;
-
-	intel_uncore_forcewake_put(&dev_priv->uncore, FORCEWAKE_ALL);
+		goto err_unlock;
 
 	return 0;
 
@@ -1344,24 +1119,8 @@ int i915_gem_init(struct drm_i915_private *dev_priv)
 	 * HW as irrevisibly wedged, but keep enough state around that the
 	 * driver doesn't explode during runtime.
 	 */
-err_gt_late:
-	intel_gt_set_wedged_on_init(&dev_priv->gt);
-	i915_gem_suspend(dev_priv);
-	i915_gem_suspend_late(dev_priv);
-
-	i915_gem_drain_workqueue(dev_priv);
-err_init_hw:
-	intel_uc_fini_hw(&dev_priv->gt.uc);
-err_uc_init:
-	if (ret != -EIO)
-		intel_uc_fini(&dev_priv->gt.uc);
-err_engines:
-	if (ret != -EIO)
-		intel_engines_cleanup(&dev_priv->gt);
-err_gt_early:
-	intel_gt_driver_release(&dev_priv->gt);
 err_unlock:
-	intel_uncore_forcewake_put(&dev_priv->uncore, FORCEWAKE_ALL);
+	i915_gem_drain_workqueue(dev_priv);
 
 	if (ret != -EIO) {
 		intel_uc_cleanup_firmwares(&dev_priv->gt.uc);
@@ -1409,19 +1168,16 @@ void i915_gem_driver_remove(struct drm_i915_private *dev_priv)
 
 	i915_gem_suspend_late(dev_priv);
 	intel_gt_driver_remove(&dev_priv->gt);
+	dev_priv->uabi_engines = RB_ROOT;
 
 	/* Flush any outstanding unpin_work. */
 	i915_gem_drain_workqueue(dev_priv);
 
-	intel_uc_fini_hw(&dev_priv->gt.uc);
-	intel_uc_fini(&dev_priv->gt.uc);
-
 	i915_gem_drain_freed_objects(dev_priv);
 }
 
 void i915_gem_driver_release(struct drm_i915_private *dev_priv)
 {
-	intel_engines_cleanup(&dev_priv->gt);
 	intel_gt_driver_release(&dev_priv->gt);
 
 	intel_wa_list_free(&dev_priv->gt_wa_list);
diff --git a/drivers/gpu/drm/i915/selftests/i915_gem.c b/drivers/gpu/drm/i915/selftests/i915_gem.c
index 657e23a8dd11..b37fc53973cc 100644
--- a/drivers/gpu/drm/i915/selftests/i915_gem.c
+++ b/drivers/gpu/drm/i915/selftests/i915_gem.c
@@ -9,6 +9,7 @@
 #include "gem/selftests/igt_gem_utils.h"
 #include "gem/selftests/mock_context.h"
 #include "gt/intel_gt.h"
+#include "gt/intel_gt_pm.h"
 
 #include "i915_selftest.h"
 
-- 
2.24.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [Intel-gfx] [PATCH 24/33] drm/i915/gt: Pull intel_gt_init_hw() into intel_gt_resume()
  2019-12-12 14:04 [Intel-gfx] [PATCH 01/33] drm/i915: Use EAGAIN for trylock failures Chris Wilson
                   ` (21 preceding siblings ...)
  2019-12-12 14:04 ` [Intel-gfx] [PATCH 23/33] drm/i915/gt: Pull GT initialisation under intel_gt_init() Chris Wilson
@ 2019-12-12 14:04 ` Chris Wilson
  2019-12-12 14:04 ` [Intel-gfx] [PATCH 25/33] drm/i915/gt: Merge engine init/setup loops Chris Wilson
                   ` (9 subsequent siblings)
  32 siblings, 0 replies; 46+ messages in thread
From: Chris Wilson @ 2019-12-12 14:04 UTC (permalink / raw)
  To: intel-gfx

Since intel_gt_resume() is always immediately proceeded by init_hw, pull
the call into intel_gt_resume, where we have the rpm and fw already
held.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/gem/i915_gem_pm.c | 20 +-------------------
 drivers/gpu/drm/i915/gt/intel_gt.c     |  5 -----
 drivers/gpu/drm/i915/gt/intel_gt_pm.c  | 12 +++++++++++-
 3 files changed, 12 insertions(+), 25 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_pm.c b/drivers/gpu/drm/i915/gem/i915_gem_pm.c
index f88ee1317bb4..544b99c4f70f 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_pm.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_pm.c
@@ -101,28 +101,10 @@ void i915_gem_resume(struct drm_i915_private *i915)
 {
 	GEM_TRACE("\n");
 
-	intel_uncore_forcewake_get(&i915->uncore, FORCEWAKE_ALL);
-
-	if (intel_gt_init_hw(&i915->gt))
-		goto err_wedged;
-
 	/*
 	 * As we didn't flush the kernel context before suspend, we cannot
 	 * guarantee that the context image is complete. So let's just reset
 	 * it and start again.
 	 */
-	if (intel_gt_resume(&i915->gt))
-		goto err_wedged;
-
-out_unlock:
-	intel_uncore_forcewake_put(&i915->uncore, FORCEWAKE_ALL);
-	return;
-
-err_wedged:
-	if (!intel_gt_is_wedged(&i915->gt)) {
-		dev_err(i915->drm.dev,
-			"Failed to re-initialize GPU, declaring it wedged!\n");
-		intel_gt_set_wedged(&i915->gt);
-	}
-	goto out_unlock;
+	intel_gt_resume(&i915->gt);
 }
diff --git a/drivers/gpu/drm/i915/gt/intel_gt.c b/drivers/gpu/drm/i915/gt/intel_gt.c
index e7f4c1beb3b6..fc15acee3984 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt.c
+++ b/drivers/gpu/drm/i915/gt/intel_gt.c
@@ -593,11 +593,6 @@ int intel_gt_init(struct intel_gt *gt)
 
 	intel_uc_init(&gt->uc);
 
-	err = intel_gt_init_hw(gt);
-	if (err)
-		goto err_uc_init;
-
-	/* Only when the HW is re-initialised, can we replay the requests */
 	err = intel_gt_resume(gt);
 	if (err)
 		goto err_uc_init;
diff --git a/drivers/gpu/drm/i915/gt/intel_gt_pm.c b/drivers/gpu/drm/i915/gt/intel_gt_pm.c
index ecde67a75e32..147b86538317 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_pm.c
+++ b/drivers/gpu/drm/i915/gt/intel_gt_pm.c
@@ -186,7 +186,7 @@ int intel_gt_resume(struct intel_gt *gt)
 {
 	struct intel_engine_cs *engine;
 	enum intel_engine_id id;
-	int err = 0;
+	int err;
 
 	GEM_TRACE("\n");
 
@@ -201,6 +201,15 @@ int intel_gt_resume(struct intel_gt *gt)
 	intel_uncore_forcewake_get(gt->uncore, FORCEWAKE_ALL);
 	intel_rc6_sanitize(&gt->rc6);
 
+	/* Only when the HW is re-initialised, can we replay the requests */
+	err = intel_gt_init_hw(gt);
+	if (err) {
+		dev_err(gt->i915->drm.dev,
+			"Failed to initialize GPU, declaring it wedged!\n");
+		intel_gt_set_wedged(gt);
+		goto err_fw;
+	}
+
 	intel_rps_enable(&gt->rps);
 	intel_llc_enable(&gt->llc);
 
@@ -233,6 +242,7 @@ int intel_gt_resume(struct intel_gt *gt)
 
 	user_forcewake(gt, false);
 
+err_fw:
 	intel_uncore_forcewake_put(gt->uncore, FORCEWAKE_ALL);
 	intel_gt_pm_put(gt);
 
-- 
2.24.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [Intel-gfx] [PATCH 25/33] drm/i915/gt: Merge engine init/setup loops
  2019-12-12 14:04 [Intel-gfx] [PATCH 01/33] drm/i915: Use EAGAIN for trylock failures Chris Wilson
                   ` (22 preceding siblings ...)
  2019-12-12 14:04 ` [Intel-gfx] [PATCH 24/33] drm/i915/gt: Pull intel_gt_init_hw() into intel_gt_resume() Chris Wilson
@ 2019-12-12 14:04 ` Chris Wilson
  2019-12-12 14:04 ` [Intel-gfx] [PATCH 26/33] drm/i915/gt: Expose engine properties via sysfs Chris Wilson
                   ` (8 subsequent siblings)
  32 siblings, 0 replies; 46+ messages in thread
From: Chris Wilson @ 2019-12-12 14:04 UTC (permalink / raw)
  To: intel-gfx

Now that we don't need to create GEM contexts in the middle of engine
construction, we can pull the engine init/setup loops together.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/gt/intel_engine_cs.c     | 118 +++++++-----------
 drivers/gpu/drm/i915/gt/intel_gt.c            |   5 -
 drivers/gpu/drm/i915/gt/intel_lrc.c           |   5 -
 .../gpu/drm/i915/gt/intel_ring_submission.c   |   6 -
 4 files changed, 43 insertions(+), 91 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
index 5a5e75483bbb..9e16d8a9bbf5 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
@@ -470,39 +470,6 @@ int intel_engines_init_mmio(struct intel_gt *gt)
 	return err;
 }
 
-/**
- * intel_engines_init() - init the Engine Command Streamers
- * @gt: pointer to struct intel_gt
- *
- * Return: non-zero if the initialization failed.
- */
-int intel_engines_init(struct intel_gt *gt)
-{
-	int (*init)(struct intel_engine_cs *engine);
-	struct intel_engine_cs *engine;
-	enum intel_engine_id id;
-	int err;
-
-	if (HAS_EXECLISTS(gt->i915))
-		init = intel_execlists_submission_init;
-	else
-		init = intel_ring_submission_init;
-
-	for_each_engine(engine, gt, id) {
-		err = init(engine);
-		if (err)
-			goto cleanup;
-
-		intel_engine_add_user(engine);
-	}
-
-	return 0;
-
-cleanup:
-	intel_engines_release(gt);
-	return err;
-}
-
 void intel_engine_init_execlists(struct intel_engine_cs *engine)
 {
 	struct intel_engine_execlists * const execlists = &engine->execlists;
@@ -614,7 +581,7 @@ static int init_status_page(struct intel_engine_cs *engine)
 	return ret;
 }
 
-static int intel_engine_setup_common(struct intel_engine_cs *engine)
+static int engine_setup_common(struct intel_engine_cs *engine)
 {
 	int err;
 
@@ -644,46 +611,6 @@ static int intel_engine_setup_common(struct intel_engine_cs *engine)
 	return 0;
 }
 
-/**
- * intel_engines_setup- setup engine state not requiring hw access
- * @gt: pointer to struct intel_gt
- *
- * Initializes engine structure members shared between legacy and execlists
- * submission modes which do not require hardware access.
- *
- * Typically done early in the submission mode specific engine setup stage.
- */
-int intel_engines_setup(struct intel_gt *gt)
-{
-	int (*setup)(struct intel_engine_cs *engine);
-	struct intel_engine_cs *engine;
-	enum intel_engine_id id;
-	int err;
-
-	if (HAS_EXECLISTS(gt->i915))
-		setup = intel_execlists_submission_setup;
-	else
-		setup = intel_ring_submission_setup;
-
-	for_each_engine(engine, gt, id) {
-		err = intel_engine_setup_common(engine);
-		if (err)
-			goto cleanup;
-
-		err = setup(engine);
-		if (err)
-			goto cleanup;
-
-		GEM_BUG_ON(!engine->cops);
-	}
-
-	return 0;
-
-cleanup:
-	intel_engines_release(gt);
-	return err;
-}
-
 struct measure_breadcrumb {
 	struct i915_request rq;
 	struct intel_timeline timeline;
@@ -801,7 +728,7 @@ create_kernel_context(struct intel_engine_cs *engine)
  *
  * Returns zero on success or an error code on failure.
  */
-int intel_engine_init_common(struct intel_engine_cs *engine)
+static int engine_init_common(struct intel_engine_cs *engine)
 {
 	struct intel_context *ce;
 	int ret;
@@ -831,6 +758,47 @@ int intel_engine_init_common(struct intel_engine_cs *engine)
 	return 0;
 }
 
+int intel_engines_init(struct intel_gt *gt)
+{
+	int (*init)(struct intel_engine_cs *engine);
+	int (*setup)(struct intel_engine_cs *engine);
+	struct intel_engine_cs *engine;
+	enum intel_engine_id id;
+	int err;
+
+	if (HAS_EXECLISTS(gt->i915)) {
+		setup = intel_execlists_submission_setup;
+		init = intel_execlists_submission_init;
+	} else {
+		setup = intel_ring_submission_setup;
+		init = intel_ring_submission_init;
+	}
+
+	for_each_engine(engine, gt, id) {
+		err = engine_setup_common(engine);
+		if (err)
+			return err;
+
+		err = setup(engine);
+		if (err)
+			return err;
+
+		GEM_BUG_ON(!engine->cops);
+
+		err = init(engine);
+		if (err)
+			return err;
+
+		err = engine_init_common(engine);
+		if (err)
+			return err;
+
+		intel_engine_add_user(engine);
+	}
+
+	return 0;
+}
+
 /**
  * intel_engines_cleanup_common - cleans up the engine state created by
  *                                the common initiailizers.
diff --git a/drivers/gpu/drm/i915/gt/intel_gt.c b/drivers/gpu/drm/i915/gt/intel_gt.c
index fc15acee3984..4a38ca5794d6 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt.c
+++ b/drivers/gpu/drm/i915/gt/intel_gt.c
@@ -583,10 +583,6 @@ int intel_gt_init(struct intel_gt *gt)
 		goto err_pm;
 	}
 
-	err = intel_engines_setup(gt);
-	if (err)
-		goto err_vm;
-
 	err = intel_engines_init(gt);
 	if (err)
 		goto err_engines;
@@ -616,7 +612,6 @@ int intel_gt_init(struct intel_gt *gt)
 	intel_uc_fini(&gt->uc);
 err_engines:
 	intel_engines_release(gt);
-err_vm:
 	i915_vm_put(fetch_and_zero(&gt->vm));
 err_pm:
 	intel_gt_pm_fini(gt);
diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
index 2922b5c4024e..024a639da787 100644
--- a/drivers/gpu/drm/i915/gt/intel_lrc.c
+++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
@@ -3889,11 +3889,6 @@ int intel_execlists_submission_init(struct intel_engine_cs *engine)
 	struct drm_i915_private *i915 = engine->i915;
 	struct intel_uncore *uncore = engine->uncore;
 	u32 base = engine->mmio_base;
-	int ret;
-
-	ret = intel_engine_init_common(engine);
-	if (ret)
-		return ret;
 
 	if (intel_init_workaround_bb(engine))
 		/*
diff --git a/drivers/gpu/drm/i915/gt/intel_ring_submission.c b/drivers/gpu/drm/i915/gt/intel_ring_submission.c
index 785442923a5c..50f172ece43f 100644
--- a/drivers/gpu/drm/i915/gt/intel_ring_submission.c
+++ b/drivers/gpu/drm/i915/gt/intel_ring_submission.c
@@ -2037,16 +2037,10 @@ int intel_ring_submission_init(struct intel_engine_cs *engine)
 	engine->legacy.ring = ring;
 	engine->legacy.timeline = timeline;
 
-	err = intel_engine_init_common(engine);
-	if (err)
-		goto err_ring_unpin;
-
 	GEM_BUG_ON(timeline->hwsp_ggtt != engine->status_page.vma);
 
 	return 0;
 
-err_ring_unpin:
-	intel_ring_unpin(ring);
 err_ring:
 	intel_ring_put(ring);
 err_timeline_unpin:
-- 
2.24.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [Intel-gfx] [PATCH 26/33] drm/i915/gt: Expose engine properties via sysfs
  2019-12-12 14:04 [Intel-gfx] [PATCH 01/33] drm/i915: Use EAGAIN for trylock failures Chris Wilson
                   ` (23 preceding siblings ...)
  2019-12-12 14:04 ` [Intel-gfx] [PATCH 25/33] drm/i915/gt: Merge engine init/setup loops Chris Wilson
@ 2019-12-12 14:04 ` Chris Wilson
  2020-01-22 21:13   ` [Intel-gfx] [26/33] " Steve Carbonari
  2019-12-12 14:04 ` [Intel-gfx] [PATCH 27/33] drm/i915/gt: Expose engine->mmio_base " Chris Wilson
                   ` (7 subsequent siblings)
  32 siblings, 1 reply; 46+ messages in thread
From: Chris Wilson @ 2019-12-12 14:04 UTC (permalink / raw)
  To: intel-gfx

Preliminary stub to add engines underneath /sys/class/drm/cardN/, so
that we can expose properties on each engine to the sysadmin.

To start with we have basic analogues of the i915_query ioctl so that we
can pretty print engine discovery from the shell, and flesh out the
directory structure. Later we will add writeable sysadmin properties such
as per-engine timeout controls.

An example tree of the engine properties on Braswell:
    /sys/class/drm/card0
    └── engine
        ├── bcs0
        │   ├── capabilities
        │   ├── class
        │   ├── instance
        │   ├── known_capabilities
        │   └── name
        ├── rcs0
        │   ├── capabilities
        │   ├── class
        │   ├── instance
        │   ├── known_capabilities
        │   └── name
        ├── vcs0
        │   ├── capabilities
        │   ├── class
        │   ├── instance
        │   ├── known_capabilities
        │   └── name
        └── vecs0
            ├── capabilities
            ├── class
            ├── instance
            ├── known_capabilities
            └── name

v2: Include stringified capabilities
v3: Include all known capabilities for futureproofing.
v4: Combine the two caps loops into one

v5: Hide underneath Kconfig.unstable for wider discussion

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/Kconfig.unstable        |   7 +
 drivers/gpu/drm/i915/Makefile                |   1 +
 drivers/gpu/drm/i915/gt/intel_engine_sysfs.c | 208 +++++++++++++++++++
 drivers/gpu/drm/i915/gt/intel_engine_sysfs.h |  14 ++
 drivers/gpu/drm/i915/i915_sysfs.c            |   3 +
 5 files changed, 233 insertions(+)
 create mode 100644 drivers/gpu/drm/i915/gt/intel_engine_sysfs.c
 create mode 100644 drivers/gpu/drm/i915/gt/intel_engine_sysfs.h

diff --git a/drivers/gpu/drm/i915/Kconfig.unstable b/drivers/gpu/drm/i915/Kconfig.unstable
index 0c2276155c2b..1f866cae943b 100644
--- a/drivers/gpu/drm/i915/Kconfig.unstable
+++ b/drivers/gpu/drm/i915/Kconfig.unstable
@@ -27,3 +27,10 @@ config DRM_I915_UNSTABLE_FAKE_LMEM
 	help
 	  Convert some system memory into a fake local memory region for
 	  testing.
+
+config DRM_I915_UNSTABLE_SYSFS
+	bool "Enable the experimental sysfs properties"
+	depends on DRM_I915_UNSTABLE
+	default n
+	help
+	  Explore the HW property space from the shell command line!
diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
index 2083496a1193..d80fe6bf9fea 100644
--- a/drivers/gpu/drm/i915/Makefile
+++ b/drivers/gpu/drm/i915/Makefile
@@ -82,6 +82,7 @@ gt-y += \
 	gt/intel_engine_heartbeat.o \
 	gt/intel_engine_pm.o \
 	gt/intel_engine_pool.o \
+	gt/intel_engine_sysfs.o \
 	gt/intel_engine_user.o \
 	gt/intel_gt.o \
 	gt/intel_gt_irq.o \
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_sysfs.c b/drivers/gpu/drm/i915/gt/intel_engine_sysfs.c
new file mode 100644
index 000000000000..df263af3a9ea
--- /dev/null
+++ b/drivers/gpu/drm/i915/gt/intel_engine_sysfs.c
@@ -0,0 +1,208 @@
+/*
+ * SPDX-License-Identifier: MIT
+ *
+ * Copyright © 2019 Intel Corporation
+ */
+
+#include <linux/kobject.h>
+#include <linux/sysfs.h>
+
+#include "i915_drv.h"
+#include "intel_engine.h"
+#include "intel_engine_sysfs.h"
+
+struct kobj_engine {
+	struct kobject base;
+	struct intel_engine_cs *engine;
+};
+
+static struct intel_engine_cs *kobj_to_engine(struct kobject *kobj)
+{
+	return container_of(kobj, struct kobj_engine, base)->engine;
+}
+
+static ssize_t
+name_show(struct kobject *kobj, struct kobj_attribute *attr, char *buf)
+{
+	return sprintf(buf, "%s\n", kobj_to_engine(kobj)->name);
+}
+
+static struct kobj_attribute name_attr =
+__ATTR(name, 0444, name_show, NULL);
+
+static ssize_t
+class_show(struct kobject *kobj, struct kobj_attribute *attr, char *buf)
+{
+	return sprintf(buf, "%d\n", kobj_to_engine(kobj)->uabi_class);
+}
+
+static struct kobj_attribute class_attr =
+__ATTR(class, 0444, class_show, NULL);
+
+static ssize_t
+inst_show(struct kobject *kobj, struct kobj_attribute *attr, char *buf)
+{
+	return sprintf(buf, "%d\n", kobj_to_engine(kobj)->uabi_instance);
+}
+
+static struct kobj_attribute inst_attr =
+__ATTR(instance, 0444, inst_show, NULL);
+
+static const char * const vcs_caps[] = {
+	[ilog2(I915_VIDEO_CLASS_CAPABILITY_HEVC)] = "hevc",
+	[ilog2(I915_VIDEO_AND_ENHANCE_CLASS_CAPABILITY_SFC)] = "sfc",
+};
+
+static const char * const vecs_caps[] = {
+	[ilog2(I915_VIDEO_AND_ENHANCE_CLASS_CAPABILITY_SFC)] = "sfc",
+};
+
+static ssize_t repr_trim(char *buf, ssize_t len)
+{
+	/* Trim off the trailing space and replace with a newline */
+	if (len > PAGE_SIZE)
+		len = PAGE_SIZE;
+	if (len > 0)
+		buf[len - 1] = '\n';
+
+	return len;
+}
+
+static ssize_t
+__caps_show(struct intel_engine_cs *engine,
+	    typeof(engine->uabi_capabilities) caps,
+	    char *buf, bool show_unknown)
+{
+	const char * const *repr;
+	int count, n;
+	ssize_t len;
+
+	switch (engine->class) {
+	case VIDEO_DECODE_CLASS:
+		repr = vcs_caps;
+		count = ARRAY_SIZE(vcs_caps);
+		break;
+
+	case VIDEO_ENHANCEMENT_CLASS:
+		repr = vecs_caps;
+		count = ARRAY_SIZE(vecs_caps);
+		break;
+
+	default:
+		repr = NULL;
+		count = 0;
+		break;
+	}
+	GEM_BUG_ON(count > BITS_PER_TYPE(typeof(caps)));
+
+	len = 0;
+	for_each_set_bit(n,
+			 (unsigned long *)&caps,
+			 show_unknown ? BITS_PER_TYPE(typeof(caps)) : count) {
+		if (n >= count || !repr[n]) {
+			if (GEM_WARN_ON(show_unknown))
+				len += snprintf(buf + len, PAGE_SIZE - len,
+						"[%x] ", n);
+		} else {
+			len += snprintf(buf + len, PAGE_SIZE - len,
+					"%s ", repr[n]);
+		}
+		if (GEM_WARN_ON(len >= PAGE_SIZE))
+			break;
+	}
+	return repr_trim(buf, len);
+}
+
+static ssize_t
+caps_show(struct kobject *kobj, struct kobj_attribute *attr, char *buf)
+{
+	struct intel_engine_cs *engine = kobj_to_engine(kobj);
+
+	return __caps_show(engine, engine->uabi_capabilities, buf, true);
+}
+
+static struct kobj_attribute caps_attr =
+__ATTR(capabilities, 0444, caps_show, NULL);
+
+static ssize_t
+all_caps_show(struct kobject *kobj, struct kobj_attribute *attr, char *buf)
+{
+	return __caps_show(kobj_to_engine(kobj), -1, buf, false);
+}
+
+static struct kobj_attribute all_caps_attr =
+__ATTR(known_capabilities, 0444, all_caps_show, NULL);
+
+static void kobj_engine_release(struct kobject *kobj)
+{
+	kfree(kobj);
+}
+
+static struct kobj_type kobj_engine_type = {
+	.release = kobj_engine_release,
+	.sysfs_ops = &kobj_sysfs_ops
+};
+
+static struct kobject *
+kobj_engine(struct kobject *dir, struct intel_engine_cs *engine)
+{
+	struct kobj_engine *ke;
+
+	ke = kzalloc(sizeof(*ke), GFP_KERNEL);
+	if (!ke)
+		return NULL;
+
+	kobject_init(&ke->base, &kobj_engine_type);
+	ke->engine = engine;
+
+	if (kobject_add(&ke->base, dir, "%s", engine->name)) {
+		kobject_put(&ke->base);
+		return NULL;
+	}
+
+	/* xfer ownership to sysfs tree */
+	return &ke->base;
+}
+
+void intel_engines_add_sysfs(struct drm_i915_private *i915)
+{
+	static const struct attribute *files[] = {
+		&name_attr.attr,
+		&class_attr.attr,
+		&inst_attr.attr,
+		&caps_attr.attr,
+		&all_caps_attr.attr,
+		NULL
+	};
+
+	struct device *kdev = i915->drm.primary->kdev;
+	struct intel_engine_cs *engine;
+	struct kobject *dir;
+
+	if (!IS_ENABLED(CONFIG_DRM_I915_UNSTABLE_SYSFS))
+		return;
+
+	dir = kobject_create_and_add("engine", &kdev->kobj);
+	if (!dir)
+		return;
+
+	for_each_uabi_engine(engine, i915) {
+		struct kobject *kobj;
+
+		kobj = kobj_engine(dir, engine);
+		if (!kobj)
+			goto err_engine;
+
+		if (sysfs_create_files(kobj, files))
+			goto err_object;
+
+		if (0) {
+err_object:
+			kobject_put(kobj);
+err_engine:
+			dev_err(kdev, "Failed to add sysfs engine '%s'\n",
+				engine->name);
+			break;
+		}
+	}
+}
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_sysfs.h b/drivers/gpu/drm/i915/gt/intel_engine_sysfs.h
new file mode 100644
index 000000000000..ef44a745b70a
--- /dev/null
+++ b/drivers/gpu/drm/i915/gt/intel_engine_sysfs.h
@@ -0,0 +1,14 @@
+/*
+ * SPDX-License-Identifier: MIT
+ *
+ * Copyright © 2019 Intel Corporation
+ */
+
+#ifndef INTEL_ENGINE_SYSFS_H
+#define INTEL_ENGINE_SYSFS_H
+
+struct drm_i915_private;
+
+void intel_engines_add_sysfs(struct drm_i915_private *i915);
+
+#endif /* INTEL_ENGINE_SYSFS_H */
diff --git a/drivers/gpu/drm/i915/i915_sysfs.c b/drivers/gpu/drm/i915/i915_sysfs.c
index 65476909d1bf..26738c5c7d23 100644
--- a/drivers/gpu/drm/i915/i915_sysfs.c
+++ b/drivers/gpu/drm/i915/i915_sysfs.c
@@ -30,6 +30,7 @@
 #include <linux/stat.h>
 #include <linux/sysfs.h>
 
+#include "gt/intel_engine_sysfs.h"
 #include "gt/intel_rc6.h"
 #include "gt/intel_rps.h"
 
@@ -616,6 +617,8 @@ void i915_setup_sysfs(struct drm_i915_private *dev_priv)
 		DRM_ERROR("RPS sysfs setup failed\n");
 
 	i915_setup_error_capture(kdev);
+
+	intel_engines_add_sysfs(dev_priv);
 }
 
 void i915_teardown_sysfs(struct drm_i915_private *dev_priv)
-- 
2.24.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [Intel-gfx] [PATCH 27/33] drm/i915/gt: Expose engine->mmio_base via sysfs
  2019-12-12 14:04 [Intel-gfx] [PATCH 01/33] drm/i915: Use EAGAIN for trylock failures Chris Wilson
                   ` (24 preceding siblings ...)
  2019-12-12 14:04 ` [Intel-gfx] [PATCH 26/33] drm/i915/gt: Expose engine properties via sysfs Chris Wilson
@ 2019-12-12 14:04 ` Chris Wilson
  2020-01-22 21:18   ` [Intel-gfx] [27/33] " Steve Carbonari
  2019-12-12 14:04 ` [Intel-gfx] [PATCH 28/33] drm/i915/gt: Expose timeslice duration to sysfs Chris Wilson
                   ` (6 subsequent siblings)
  32 siblings, 1 reply; 46+ messages in thread
From: Chris Wilson @ 2019-12-12 14:04 UTC (permalink / raw)
  To: intel-gfx

Use the per-engine sysfs directory to let userspace discover the
mmio_base of each engine. Prior to recent generations, the user
accessible registers on each engine are at a fixed offset relative to
each engine -- but require absolute addressing. As the absolute address
depends on the actual physical engine, this is not always possible to
determine from userspace (for example icl may expose vcs1 or vcs2 as the
second vcs engine). Make this easy for userspace to discover by
providing the mmio_base in sysfs.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
---
 drivers/gpu/drm/i915/gt/intel_engine_sysfs.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/drivers/gpu/drm/i915/gt/intel_engine_sysfs.c b/drivers/gpu/drm/i915/gt/intel_engine_sysfs.c
index df263af3a9ea..abddd8d0f9ae 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_sysfs.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_sysfs.c
@@ -48,6 +48,15 @@ inst_show(struct kobject *kobj, struct kobj_attribute *attr, char *buf)
 static struct kobj_attribute inst_attr =
 __ATTR(instance, 0444, inst_show, NULL);
 
+static ssize_t
+mmio_show(struct kobject *kobj, struct kobj_attribute *attr, char *buf)
+{
+	return sprintf(buf, "0x%x\n", kobj_to_engine(kobj)->mmio_base);
+}
+
+static struct kobj_attribute mmio_attr =
+__ATTR(mmio_base, 0444, mmio_show, NULL);
+
 static const char * const vcs_caps[] = {
 	[ilog2(I915_VIDEO_CLASS_CAPABILITY_HEVC)] = "hevc",
 	[ilog2(I915_VIDEO_AND_ENHANCE_CLASS_CAPABILITY_SFC)] = "sfc",
@@ -170,6 +179,7 @@ void intel_engines_add_sysfs(struct drm_i915_private *i915)
 		&name_attr.attr,
 		&class_attr.attr,
 		&inst_attr.attr,
+		&mmio_attr.attr,
 		&caps_attr.attr,
 		&all_caps_attr.attr,
 		NULL
-- 
2.24.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [Intel-gfx] [PATCH 28/33] drm/i915/gt: Expose timeslice duration to sysfs
  2019-12-12 14:04 [Intel-gfx] [PATCH 01/33] drm/i915: Use EAGAIN for trylock failures Chris Wilson
                   ` (25 preceding siblings ...)
  2019-12-12 14:04 ` [Intel-gfx] [PATCH 27/33] drm/i915/gt: Expose engine->mmio_base " Chris Wilson
@ 2019-12-12 14:04 ` Chris Wilson
  2020-01-22 21:25   ` [Intel-gfx] [28/33] " Steve Carbonari
  2019-12-12 14:04 ` [Intel-gfx] [PATCH 29/33] drm/i915/gt: Expose busywait " Chris Wilson
                   ` (5 subsequent siblings)
  32 siblings, 1 reply; 46+ messages in thread
From: Chris Wilson @ 2019-12-12 14:04 UTC (permalink / raw)
  To: intel-gfx

Execlists uses a scheduling quantum (a timeslice) to alternate execution
between ready-to-run contexts of equal priority. This ensures that all
users (though only if they of equal importance) have the opportunity to
run and prevents livelocks where contexts may have implicit ordering due
to userspace semaphores.

The timeslicing mechanism can be compiled out with

	./scripts/config --set-val DRM_I915_TIMESLICE_DURATION 0

The timeslice duration can be adjusted per-engine using,

	/sys/class/drm/card?/engine/*/timeslice_duration_ms

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
---
 drivers/gpu/drm/i915/Kconfig.profile         |  3 ++
 drivers/gpu/drm/i915/gt/intel_engine_sysfs.c | 46 ++++++++++++++++++++
 2 files changed, 49 insertions(+)

diff --git a/drivers/gpu/drm/i915/Kconfig.profile b/drivers/gpu/drm/i915/Kconfig.profile
index c280b6ae38eb..d8d4a16179bd 100644
--- a/drivers/gpu/drm/i915/Kconfig.profile
+++ b/drivers/gpu/drm/i915/Kconfig.profile
@@ -73,4 +73,7 @@ config DRM_I915_TIMESLICE_DURATION
 	  is scheduled for execution for the timeslice duration, before
 	  switching to the next context.
 
+	  This is adjustable via
+	  /sys/class/drm/card?/engine/*/timeslice_duration_ms
+
 	  May be 0 to disable timeslicing.
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_sysfs.c b/drivers/gpu/drm/i915/gt/intel_engine_sysfs.c
index abddd8d0f9ae..b1bd768b13d7 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_sysfs.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_sysfs.c
@@ -142,6 +142,48 @@ all_caps_show(struct kobject *kobj, struct kobj_attribute *attr, char *buf)
 static struct kobj_attribute all_caps_attr =
 __ATTR(known_capabilities, 0444, all_caps_show, NULL);
 
+static ssize_t
+timeslice_store(struct kobject *kobj, struct kobj_attribute *attr,
+		const char *buf, size_t count)
+{
+	struct intel_engine_cs *engine = kobj_to_engine(kobj);
+	unsigned long long duration;
+	int err;
+
+	/*
+	 * Execlists uses a scheduling quantum (a timeslice) to alternate
+	 * execution between ready-to-run contexts of equal priority. This
+	 * ensures that all users (though only if they of equal importance)
+	 * have the opportunity to run and prevents livelocks where contexts
+	 * may have implicit ordering due to userspace semaphores.
+	 */
+
+	err = kstrtoull(buf, 0, &duration);
+	if (err)
+		return err;
+
+	if (duration > jiffies_to_msecs(MAX_SCHEDULE_TIMEOUT))
+		return -EINVAL;
+
+	WRITE_ONCE(engine->props.timeslice_duration_ms, duration);
+
+	if (execlists_active(&engine->execlists))
+		set_timer_ms(&engine->execlists.timer, duration);
+
+	return count;
+}
+
+static ssize_t
+timeslice_show(struct kobject *kobj, struct kobj_attribute *attr, char *buf)
+{
+	struct intel_engine_cs *engine = kobj_to_engine(kobj);
+
+	return sprintf(buf, "%lu\n", engine->props.timeslice_duration_ms);
+}
+
+static struct kobj_attribute timeslice_duration_attr =
+__ATTR(timeslice_duration_ms, 0644, timeslice_show, timeslice_store);
+
 static void kobj_engine_release(struct kobject *kobj)
 {
 	kfree(kobj);
@@ -206,6 +248,10 @@ void intel_engines_add_sysfs(struct drm_i915_private *i915)
 		if (sysfs_create_files(kobj, files))
 			goto err_object;
 
+		if (intel_engine_has_timeslices(engine) &&
+		    sysfs_create_file(kobj, &timeslice_duration_attr.attr))
+			goto err_engine;
+
 		if (0) {
 err_object:
 			kobject_put(kobj);
-- 
2.24.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [Intel-gfx] [PATCH 29/33] drm/i915/gt: Expose busywait duration to sysfs
  2019-12-12 14:04 [Intel-gfx] [PATCH 01/33] drm/i915: Use EAGAIN for trylock failures Chris Wilson
                   ` (26 preceding siblings ...)
  2019-12-12 14:04 ` [Intel-gfx] [PATCH 28/33] drm/i915/gt: Expose timeslice duration to sysfs Chris Wilson
@ 2019-12-12 14:04 ` Chris Wilson
  2020-01-22 21:29   ` [Intel-gfx] [29/33] " Steve Carbonari
  2019-12-12 14:04 ` [Intel-gfx] [PATCH 30/33] drm/i915/gt: Expose reset stop timeout via sysfs Chris Wilson
                   ` (4 subsequent siblings)
  32 siblings, 1 reply; 46+ messages in thread
From: Chris Wilson @ 2019-12-12 14:04 UTC (permalink / raw)
  To: intel-gfx

We busywait on an inflight request (one that is currently executing on
HW, and so might complete quickly) prior to setting up an interrupt and
sleeping. The trade off is that we keep an expensive CPU core busy in
order to avoid wake up latency: where that trade off should lie is best
left to the sysadmin.

The busywait mechanism can be compiled out with

	./scripts/config --set-val DRM_I915_SPIN_REQUEST 0

The maximum busywait duration can be adjusted per-engine using,

	/sys/class/drm/card?/engine/*/ms_busywait_duration_ns

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
---
 drivers/gpu/drm/i915/Kconfig.profile         |  9 ++--
 drivers/gpu/drm/i915/gt/intel_engine_cs.c    |  2 +
 drivers/gpu/drm/i915/gt/intel_engine_sysfs.c | 49 ++++++++++++++++++++
 drivers/gpu/drm/i915/gt/intel_engine_types.h |  1 +
 drivers/gpu/drm/i915/i915_request.c          | 19 ++++----
 5 files changed, 68 insertions(+), 12 deletions(-)

diff --git a/drivers/gpu/drm/i915/Kconfig.profile b/drivers/gpu/drm/i915/Kconfig.profile
index d8d4a16179bd..9ee3b59685b9 100644
--- a/drivers/gpu/drm/i915/Kconfig.profile
+++ b/drivers/gpu/drm/i915/Kconfig.profile
@@ -35,9 +35,9 @@ config DRM_I915_PREEMPT_TIMEOUT
 
 	  May be 0 to disable the timeout.
 
-config DRM_I915_SPIN_REQUEST
-	int "Busywait for request completion (us)"
-	default 5 # microseconds
+config DRM_I915_MAX_REQUEST_BUSYWAIT
+	int "Busywait for request completion limit (ns)"
+	default 8000 # nanoseconds
 	help
 	  Before sleeping waiting for a request (GPU operation) to complete,
 	  we may spend some time polling for its completion. As the IRQ may
@@ -45,6 +45,9 @@ config DRM_I915_SPIN_REQUEST
 	  check if the request will complete in the time it would have taken
 	  us to enable the interrupt.
 
+	  This is adjustable via
+	  /sys/class/drm/card?/engine/*/max_busywait_duration_ns
+
 	  May be 0 to disable the initial spin. In practice, we estimate
 	  the cost of enabling the interrupt (if currently disabled) to be
 	  a few microseconds.
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
index 9e16d8a9bbf5..9162b94a7b80 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
@@ -312,6 +312,8 @@ static int intel_engine_setup(struct intel_gt *gt, enum intel_engine_id id)
 
 	engine->props.heartbeat_interval_ms =
 		CONFIG_DRM_I915_HEARTBEAT_INTERVAL;
+	engine->props.max_busywait_duration_ns =
+		CONFIG_DRM_I915_MAX_REQUEST_BUSYWAIT;
 	engine->props.preempt_timeout_ms =
 		CONFIG_DRM_I915_PREEMPT_TIMEOUT;
 	engine->props.stop_timeout_ms =
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_sysfs.c b/drivers/gpu/drm/i915/gt/intel_engine_sysfs.c
index b1bd768b13d7..6d87529c64a7 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_sysfs.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_sysfs.c
@@ -142,6 +142,54 @@ all_caps_show(struct kobject *kobj, struct kobj_attribute *attr, char *buf)
 static struct kobj_attribute all_caps_attr =
 __ATTR(known_capabilities, 0444, all_caps_show, NULL);
 
+static ssize_t
+max_spin_store(struct kobject *kobj, struct kobj_attribute *attr,
+	       const char *buf, size_t count)
+{
+	struct intel_engine_cs *engine = kobj_to_engine(kobj);
+	unsigned long long duration;
+	int err;
+
+	/*
+	 * When waiting for a request, if is it currently being executed
+	 * on the GPU, we busywait for a short while before sleeping. The
+	 * premise is that most requests are short, and if it is already
+	 * executing then there is a good chance that it will complete
+	 * before we can setup the interrupt handler and go to sleep.
+	 * We try to offset the cost of going to sleep, by first spinning
+	 * on the request -- if it completed in less time than it would take
+	 * to go sleep, process the interrupt and return back to the client,
+	 * then we have saved the client some latency, albeit at the cost
+	 * of spinning on an expensive CPU core.
+	 *
+	 * While we try to avoid waiting at all for a request that is unlikely
+	 * to complete, deciding how long it is worth spinning is for is an
+	 * arbitrary decision: trading off power vs latency.
+	 */
+
+	err = kstrtoull(buf, 0, &duration);
+	if (err)
+		return err;
+
+	if (duration > jiffies_to_nsecs(2))
+		return -EINVAL;
+
+	WRITE_ONCE(engine->props.max_busywait_duration_ns, duration);
+
+	return count;
+}
+
+static ssize_t
+max_spin_show(struct kobject *kobj, struct kobj_attribute *attr, char *buf)
+{
+	struct intel_engine_cs *engine = kobj_to_engine(kobj);
+
+	return sprintf(buf, "%lu\n", engine->props.max_busywait_duration_ns);
+}
+
+static struct kobj_attribute max_spin_attr =
+__ATTR(max_busywait_duration_ns, 0644, max_spin_show, max_spin_store);
+
 static ssize_t
 timeslice_store(struct kobject *kobj, struct kobj_attribute *attr,
 		const char *buf, size_t count)
@@ -224,6 +272,7 @@ void intel_engines_add_sysfs(struct drm_i915_private *i915)
 		&mmio_attr.attr,
 		&caps_attr.attr,
 		&all_caps_attr.attr,
+		&max_spin_attr.attr,
 		NULL
 	};
 
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_types.h b/drivers/gpu/drm/i915/gt/intel_engine_types.h
index ede7ce2695cd..807796d619e9 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_types.h
+++ b/drivers/gpu/drm/i915/gt/intel_engine_types.h
@@ -525,6 +525,7 @@ struct intel_engine_cs {
 
 	struct {
 		unsigned long heartbeat_interval_ms;
+		unsigned long max_busywait_duration_ns;
 		unsigned long preempt_timeout_ms;
 		unsigned long stop_timeout_ms;
 		unsigned long timeslice_duration_ms;
diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
index 15767c2d9ea1..2f6979009c8f 100644
--- a/drivers/gpu/drm/i915/i915_request.c
+++ b/drivers/gpu/drm/i915/i915_request.c
@@ -1381,7 +1381,7 @@ void i915_request_add(struct i915_request *rq)
 	mutex_unlock(&tl->mutex);
 }
 
-static unsigned long local_clock_us(unsigned int *cpu)
+static unsigned long local_clock_ns(unsigned int *cpu)
 {
 	unsigned long t;
 
@@ -1398,7 +1398,7 @@ static unsigned long local_clock_us(unsigned int *cpu)
 	 * stop busywaiting, see busywait_stop().
 	 */
 	*cpu = get_cpu();
-	t = local_clock() >> 10;
+	t = local_clock();
 	put_cpu();
 
 	return t;
@@ -1408,15 +1408,15 @@ static bool busywait_stop(unsigned long timeout, unsigned int cpu)
 {
 	unsigned int this_cpu;
 
-	if (time_after(local_clock_us(&this_cpu), timeout))
+	if (time_after(local_clock_ns(&this_cpu), timeout))
 		return true;
 
 	return this_cpu != cpu;
 }
 
-static bool __i915_spin_request(const struct i915_request * const rq,
-				int state, unsigned long timeout_us)
+static bool __i915_spin_request(const struct i915_request * const rq, int state)
 {
+	unsigned long timeout_ns;
 	unsigned int cpu;
 
 	/*
@@ -1444,7 +1444,8 @@ static bool __i915_spin_request(const struct i915_request * const rq,
 	 * takes to sleep on a request, on the order of a microsecond.
 	 */
 
-	timeout_us += local_clock_us(&cpu);
+	timeout_ns = READ_ONCE(rq->engine->props.max_busywait_duration_ns);
+	timeout_ns += local_clock_ns(&cpu);
 	do {
 		if (i915_request_completed(rq))
 			return true;
@@ -1452,7 +1453,7 @@ static bool __i915_spin_request(const struct i915_request * const rq,
 		if (signal_pending_state(state, current))
 			break;
 
-		if (busywait_stop(timeout_us, cpu))
+		if (busywait_stop(timeout_ns, cpu))
 			break;
 
 		cpu_relax();
@@ -1538,8 +1539,8 @@ long i915_request_wait(struct i915_request *rq,
 	 * completion. That requires having a good predictor for the request
 	 * duration, which we currently lack.
 	 */
-	if (IS_ACTIVE(CONFIG_DRM_I915_SPIN_REQUEST) &&
-	    __i915_spin_request(rq, state, CONFIG_DRM_I915_SPIN_REQUEST)) {
+	if (IS_ACTIVE(CONFIG_DRM_I915_MAX_REQUEST_BUSYWAIT) &&
+	    __i915_spin_request(rq, state)) {
 		dma_fence_signal(&rq->fence);
 		goto out;
 	}
-- 
2.24.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [Intel-gfx] [PATCH 30/33] drm/i915/gt: Expose reset stop timeout via sysfs
  2019-12-12 14:04 [Intel-gfx] [PATCH 01/33] drm/i915: Use EAGAIN for trylock failures Chris Wilson
                   ` (27 preceding siblings ...)
  2019-12-12 14:04 ` [Intel-gfx] [PATCH 29/33] drm/i915/gt: Expose busywait " Chris Wilson
@ 2019-12-12 14:04 ` Chris Wilson
  2020-01-22 21:34   ` [Intel-gfx] [30/33] " Steve Carbonari
  2019-12-12 14:04 ` [Intel-gfx] [PATCH 31/33] drm/i915/gt: Expose preempt reset " Chris Wilson
                   ` (3 subsequent siblings)
  32 siblings, 1 reply; 46+ messages in thread
From: Chris Wilson @ 2019-12-12 14:04 UTC (permalink / raw)
  To: intel-gfx

When we allow ourselves to sleep before a GPU reset after disabling
submission, even for a few milliseconds, gives an innocent context the
opportunity to clear the GPU before the reset occurs. However, how long
to sleep depends on the typical non-preemptible duration (a similar
problem to determining the ideal preempt-reset timeout or even the
heartbeat interval). As this seems of a hard policy decision, punt it to
userspace.

The timeout can be adjusted using

	/sys/class/drm/card?/engine/*/stop_timeout_ms

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Jon Bloomfield <jon.bloomfield@intel.com>
---
 drivers/gpu/drm/i915/Kconfig.profile         |  3 ++
 drivers/gpu/drm/i915/gt/intel_engine_sysfs.c | 40 ++++++++++++++++++++
 2 files changed, 43 insertions(+)

diff --git a/drivers/gpu/drm/i915/Kconfig.profile b/drivers/gpu/drm/i915/Kconfig.profile
index 9ee3b59685b9..5f4ec3aec1d2 100644
--- a/drivers/gpu/drm/i915/Kconfig.profile
+++ b/drivers/gpu/drm/i915/Kconfig.profile
@@ -63,6 +63,9 @@ config DRM_I915_STOP_TIMEOUT
 	  that the reset itself may take longer and so be more disruptive to
 	  interactive or low latency workloads.
 
+	  This is adjustable via
+	  /sys/class/drm/card?/engine/*/stop_timeout_ms
+
 config DRM_I915_TIMESLICE_DURATION
 	int "Scheduling quantum for userspace batches (ms, jiffy granularity)"
 	default 1 # milliseconds
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_sysfs.c b/drivers/gpu/drm/i915/gt/intel_engine_sysfs.c
index 6d87529c64a7..2b65fed76435 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_sysfs.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_sysfs.c
@@ -232,6 +232,45 @@ timeslice_show(struct kobject *kobj, struct kobj_attribute *attr, char *buf)
 static struct kobj_attribute timeslice_duration_attr =
 __ATTR(timeslice_duration_ms, 0644, timeslice_show, timeslice_store);
 
+static ssize_t
+stop_store(struct kobject *kobj, struct kobj_attribute *attr,
+	   const char *buf, size_t count)
+{
+	struct intel_engine_cs *engine = kobj_to_engine(kobj);
+	unsigned long long duration;
+	int err;
+
+	/*
+	 * When we allow ourselves to sleep before a GPU reset after disabling
+	 * submission, even for a few milliseconds, gives an innocent context
+	 * the opportunity to clear the GPU before the reset occurs. However,
+	 * how long to sleep depends on the typical non-preemptible duration
+	 * (a similar problem to determining the ideal preempt-reset timeout
+	 * or even the heartbeat interval).
+	 */
+
+	err = kstrtoull(buf, 0, &duration);
+	if (err)
+		return err;
+
+	if (duration > jiffies_to_msecs(MAX_SCHEDULE_TIMEOUT))
+		return -EINVAL;
+
+	WRITE_ONCE(engine->props.stop_timeout_ms, duration);
+	return count;
+}
+
+static ssize_t
+stop_show(struct kobject *kobj, struct kobj_attribute *attr, char *buf)
+{
+	struct intel_engine_cs *engine = kobj_to_engine(kobj);
+
+	return sprintf(buf, "%lu\n", engine->props.stop_timeout_ms);
+}
+
+static struct kobj_attribute stop_timeout_attr =
+__ATTR(stop_timeout_ms, 0644, stop_show, stop_store);
+
 static void kobj_engine_release(struct kobject *kobj)
 {
 	kfree(kobj);
@@ -273,6 +312,7 @@ void intel_engines_add_sysfs(struct drm_i915_private *i915)
 		&caps_attr.attr,
 		&all_caps_attr.attr,
 		&max_spin_attr.attr,
+		&stop_timeout_attr.attr,
 		NULL
 	};
 
-- 
2.24.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [Intel-gfx] [PATCH 31/33] drm/i915/gt: Expose preempt reset timeout via sysfs
  2019-12-12 14:04 [Intel-gfx] [PATCH 01/33] drm/i915: Use EAGAIN for trylock failures Chris Wilson
                   ` (28 preceding siblings ...)
  2019-12-12 14:04 ` [Intel-gfx] [PATCH 30/33] drm/i915/gt: Expose reset stop timeout via sysfs Chris Wilson
@ 2019-12-12 14:04 ` Chris Wilson
  2020-01-22 21:36   ` [Intel-gfx] [31/33] " Steve Carbonari
  2019-12-12 14:04 ` [Intel-gfx] [PATCH 32/33] drm/i915/gt: Expose heartbeat interval " Chris Wilson
                   ` (2 subsequent siblings)
  32 siblings, 1 reply; 46+ messages in thread
From: Chris Wilson @ 2019-12-12 14:04 UTC (permalink / raw)
  To: intel-gfx

After initialising a preemption request, we give the current resident a
small amount of time to vacate the GPU. The preemption request is for a
higher priority context and should be immediate to maintain high
quality of service (and avoid priority inversion). However, the
preemption granularity of the GPU can be quite coarse and so we need a
compromise.

The preempt timeout can be adjusted per-engine using,

	/sys/class/drm/card?/engine/*/preempt_timeout_ms

and can be disabled by setting it to 0.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/Kconfig.profile         |  3 ++
 drivers/gpu/drm/i915/gt/intel_engine_sysfs.c | 48 ++++++++++++++++++++
 drivers/gpu/drm/i915/gt/intel_lrc.c          |  2 +-
 3 files changed, 52 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/Kconfig.profile b/drivers/gpu/drm/i915/Kconfig.profile
index 5f4ec3aec1d2..1f4e98a8532f 100644
--- a/drivers/gpu/drm/i915/Kconfig.profile
+++ b/drivers/gpu/drm/i915/Kconfig.profile
@@ -33,6 +33,9 @@ config DRM_I915_PREEMPT_TIMEOUT
 	  expires, the HW will be reset to allow the more important context
 	  to execute.
 
+	  This is adjustable via
+	  /sys/class/drm/card?/engine/*/preempt_timeout_ms
+
 	  May be 0 to disable the timeout.
 
 config DRM_I915_MAX_REQUEST_BUSYWAIT
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_sysfs.c b/drivers/gpu/drm/i915/gt/intel_engine_sysfs.c
index 2b65fed76435..d299c66cf7ec 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_sysfs.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_sysfs.c
@@ -271,6 +271,50 @@ stop_show(struct kobject *kobj, struct kobj_attribute *attr, char *buf)
 static struct kobj_attribute stop_timeout_attr =
 __ATTR(stop_timeout_ms, 0644, stop_show, stop_store);
 
+static ssize_t
+preempt_timeout_store(struct kobject *kobj, struct kobj_attribute *attr,
+		      const char *buf, size_t count)
+{
+	struct intel_engine_cs *engine = kobj_to_engine(kobj);
+	unsigned long long timeout;
+	int err;
+
+	/*
+	 * After initialising a preemption request, we give the current
+	 * resident a small amount of time to vacate the GPU. The preemption
+	 * request is for a higher priority context and should be immediate to
+	 * maintain high quality of service (and avoid priority inversion).
+	 * However, the preemption granularity of the GPU can be quite coarse
+	 * and so we need a compromise.
+	 */
+
+	err = kstrtoull(buf, 0, &timeout);
+	if (err)
+		return err;
+
+	if (timeout > jiffies_to_msecs(MAX_SCHEDULE_TIMEOUT))
+		return -EINVAL;
+
+	WRITE_ONCE(engine->props.preempt_timeout_ms, timeout);
+
+	if (READ_ONCE(engine->execlists.pending[0]))
+		set_timer_ms(&engine->execlists.preempt, timeout);
+
+	return count;
+}
+
+static ssize_t
+preempt_timeout_show(struct kobject *kobj, struct kobj_attribute *attr,
+		     char *buf)
+{
+	struct intel_engine_cs *engine = kobj_to_engine(kobj);
+
+	return sprintf(buf, "%lu\n", engine->props.preempt_timeout_ms);
+}
+
+static struct kobj_attribute preempt_timeout_attr =
+__ATTR(preempt_timeout_ms, 0644, preempt_timeout_show, preempt_timeout_store);
+
 static void kobj_engine_release(struct kobject *kobj)
 {
 	kfree(kobj);
@@ -341,6 +385,10 @@ void intel_engines_add_sysfs(struct drm_i915_private *i915)
 		    sysfs_create_file(kobj, &timeslice_duration_attr.attr))
 			goto err_engine;
 
+		if (intel_engine_has_preempt_reset(engine) &&
+		    sysfs_create_file(kobj, &preempt_timeout_attr.attr))
+			goto err_engine;
+
 		if (0) {
 err_object:
 			kobject_put(kobj);
diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
index 024a639da787..9a9b23e3d0a3 100644
--- a/drivers/gpu/drm/i915/gt/intel_lrc.c
+++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
@@ -2252,7 +2252,7 @@ static noinline void preempt_reset(struct intel_engine_cs *engine)
 	const unsigned int bit = I915_RESET_ENGINE + engine->id;
 	unsigned long *lock = &engine->gt->reset.flags;
 
-	if (i915_modparams.reset < 3)
+	if (!intel_has_reset_engine(engine->gt))
 		return;
 
 	if (test_and_set_bit(bit, lock))
-- 
2.24.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [Intel-gfx] [PATCH 32/33] drm/i915/gt: Expose heartbeat interval via sysfs
  2019-12-12 14:04 [Intel-gfx] [PATCH 01/33] drm/i915: Use EAGAIN for trylock failures Chris Wilson
                   ` (29 preceding siblings ...)
  2019-12-12 14:04 ` [Intel-gfx] [PATCH 31/33] drm/i915/gt: Expose preempt reset " Chris Wilson
@ 2019-12-12 14:04 ` Chris Wilson
  2020-01-22 21:38   ` [Intel-gfx] [32/33] " Steve Carbonari
  2019-12-12 14:04 ` [Intel-gfx] [PATCH 33/33] HAX: Use aliasing-ppgtt for gen7 Chris Wilson
  2019-12-12 18:09 ` [Intel-gfx] ✗ Fi.CI.BUILD: failure for series starting with [01/33] drm/i915: Use EAGAIN for trylock failures Patchwork
  32 siblings, 1 reply; 46+ messages in thread
From: Chris Wilson @ 2019-12-12 14:04 UTC (permalink / raw)
  To: intel-gfx

We monitor the health of the system via periodic heartbeat pulses. The
pulses also provide the opportunity to perform garbage collection.
However, we interpret an incomplete pulse (a missed heartbeat) as an
indication that the system is no longer responsive, i.e. hung, and
perform an engine or full GPU reset. Given that the preemption
granularity can be very coarse on a system, we let the sysadmin override
our legacy timeouts which were "optimised" for desktop applications.

The heartbeat interval can be adjusted per-engine using,

	/sys/class/drm/card?/engine/*/heartbeat_interval_ms

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/Kconfig.profile         |  3 ++
 drivers/gpu/drm/i915/gt/intel_engine_sysfs.c | 47 ++++++++++++++++++++
 2 files changed, 50 insertions(+)

diff --git a/drivers/gpu/drm/i915/Kconfig.profile b/drivers/gpu/drm/i915/Kconfig.profile
index 1f4e98a8532f..ba8767fc0d6e 100644
--- a/drivers/gpu/drm/i915/Kconfig.profile
+++ b/drivers/gpu/drm/i915/Kconfig.profile
@@ -20,6 +20,9 @@ config DRM_I915_HEARTBEAT_INTERVAL
 	  check the health of the GPU and undertake regular house-keeping of
 	  internal driver state.
 
+	  This is adjustable via
+	  /sys/class/drm/card?/engine/*/heartbeat_interval_ms
+
 	  May be 0 to disable heartbeats and therefore disable automatic GPU
 	  hang detection.
 
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_sysfs.c b/drivers/gpu/drm/i915/gt/intel_engine_sysfs.c
index d299c66cf7ec..33b4c00b93f2 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_sysfs.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_sysfs.c
@@ -9,6 +9,7 @@
 
 #include "i915_drv.h"
 #include "intel_engine.h"
+#include "intel_engine_heartbeat.h"
 #include "intel_engine_sysfs.h"
 
 struct kobj_engine {
@@ -315,6 +316,49 @@ preempt_timeout_show(struct kobject *kobj, struct kobj_attribute *attr,
 static struct kobj_attribute preempt_timeout_attr =
 __ATTR(preempt_timeout_ms, 0644, preempt_timeout_show, preempt_timeout_store);
 
+static ssize_t
+heartbeat_store(struct kobject *kobj, struct kobj_attribute *attr,
+		const char *buf, size_t count)
+{
+	struct intel_engine_cs *engine = kobj_to_engine(kobj);
+	unsigned long long delay;
+	int err;
+
+	/*
+	 * We monitor the health of the system via periodic heartbeat pulses.
+	 * The pulses also provide the opportunity to perform garbage
+	 * collection.  However, we interpret an incomplete pulse (a missed
+	 * heartbeat) as an indication that the system is no longer responsive,
+	 * i.e. hung, and perform an engine or full GPU reset. Given that the
+	 * preemption granularity can be very coarse on a system, the optimal
+	 * value for any workload is unknowable!
+	 */
+
+	err = kstrtoull(buf, 0, &delay);
+	if (err)
+		return err;
+
+	if (delay >= jiffies_to_msecs(MAX_SCHEDULE_TIMEOUT))
+		return -EINVAL;
+
+	err = intel_engine_set_heartbeat(engine, delay);
+	if (err)
+		return err;
+
+	return count;
+}
+
+static ssize_t
+heartbeat_show(struct kobject *kobj, struct kobj_attribute *attr, char *buf)
+{
+	struct intel_engine_cs *engine = kobj_to_engine(kobj);
+
+	return sprintf(buf, "%lu\n", engine->props.heartbeat_interval_ms);
+}
+
+static struct kobj_attribute heartbeat_interval_attr =
+__ATTR(heartbeat_interval_ms, 0644, heartbeat_show, heartbeat_store);
+
 static void kobj_engine_release(struct kobject *kobj)
 {
 	kfree(kobj);
@@ -357,6 +401,9 @@ void intel_engines_add_sysfs(struct drm_i915_private *i915)
 		&all_caps_attr.attr,
 		&max_spin_attr.attr,
 		&stop_timeout_attr.attr,
+#if CONFIG_DRM_I915_HEARTBEAT_INTERVAL
+		&heartbeat_interval_attr.attr,
+#endif
 		NULL
 	};
 
-- 
2.24.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [Intel-gfx] [PATCH 33/33] HAX: Use aliasing-ppgtt for gen7
  2019-12-12 14:04 [Intel-gfx] [PATCH 01/33] drm/i915: Use EAGAIN for trylock failures Chris Wilson
                   ` (30 preceding siblings ...)
  2019-12-12 14:04 ` [Intel-gfx] [PATCH 32/33] drm/i915/gt: Expose heartbeat interval " Chris Wilson
@ 2019-12-12 14:04 ` Chris Wilson
  2019-12-12 18:09 ` [Intel-gfx] ✗ Fi.CI.BUILD: failure for series starting with [01/33] drm/i915: Use EAGAIN for trylock failures Patchwork
  32 siblings, 0 replies; 46+ messages in thread
From: Chris Wilson @ 2019-12-12 14:04 UTC (permalink / raw)
  To: intel-gfx

---
 drivers/gpu/drm/i915/i915_pci.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_pci.c b/drivers/gpu/drm/i915/i915_pci.c
index bba6b50e6beb..da3e9b5752ac 100644
--- a/drivers/gpu/drm/i915/i915_pci.c
+++ b/drivers/gpu/drm/i915/i915_pci.c
@@ -436,7 +436,7 @@ static const struct intel_device_info intel_sandybridge_m_gt2_info = {
 	.has_rc6 = 1, \
 	.has_rc6p = 1, \
 	.has_rps = true, \
-	.ppgtt_type = INTEL_PPGTT_FULL, \
+	.ppgtt_type = INTEL_PPGTT_ALIASING, \
 	.ppgtt_size = 31, \
 	IVB_PIPE_OFFSETS, \
 	IVB_CURSOR_OFFSETS, \
@@ -493,7 +493,7 @@ static const struct intel_device_info intel_valleyview_info = {
 	.has_rps = true,
 	.display.has_gmch = 1,
 	.display.has_hotplug = 1,
-	.ppgtt_type = INTEL_PPGTT_FULL,
+	.ppgtt_type = INTEL_PPGTT_ALIASING,
 	.ppgtt_size = 31,
 	.has_snoop = true,
 	.has_coherent_ggtt = false,
-- 
2.24.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* Re: [Intel-gfx] [PATCH 07/33] drm/i915/gt: Mark up ips_mchdev pointer access
  2019-12-12 14:04 ` [Intel-gfx] [PATCH 07/33] drm/i915/gt: Mark up ips_mchdev pointer access Chris Wilson
@ 2019-12-12 16:39   ` Venkata Sandeep Dhanalakota
  0 siblings, 0 replies; 46+ messages in thread
From: Venkata Sandeep Dhanalakota @ 2019-12-12 16:39 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

On 19/12/12 02:04, Chris Wilson wrote:
> drivers/gpu/drm/i915/gt/intel_rps.c:1726:24: error: incompatible types in comparison expression (different address spaces):
> drivers/gpu/drm/i915/gt/intel_rps.c:1726:24:    struct drm_i915_private [noderef] <asn:4> *
> drivers/gpu/drm/i915/gt/intel_rps.c:1726:24:    struct drm_i915_private *
> 
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> ---
>  drivers/gpu/drm/i915/gt/intel_rps.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/intel_rps.c b/drivers/gpu/drm/i915/gt/intel_rps.c
> index fd01e4100fc1..106c9fce9d6c 100644
> --- a/drivers/gpu/drm/i915/gt/intel_rps.c
> +++ b/drivers/gpu/drm/i915/gt/intel_rps.c
> @@ -1723,7 +1723,7 @@ void intel_rps_driver_register(struct intel_rps *rps)
>  
>  void intel_rps_driver_unregister(struct intel_rps *rps)
>  {
> -	if (ips_mchdev == rps_to_i915(rps))
> +	if (rcu_access_pointer(ips_mchdev) == rps_to_i915(rps))
>  		rcu_assign_pointer(ips_mchdev, NULL);
>  }
Right way to access rcu pointer.

Reviewed-by: Venkata Sandeep Dhanalakota <venkata.s.dhanalakota@intel.com>
>  
> -- 
> 2.24.0
> 
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/intel-gfx
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 46+ messages in thread

* [Intel-gfx] ✗ Fi.CI.BUILD: failure for series starting with [01/33] drm/i915: Use EAGAIN for trylock failures
  2019-12-12 14:04 [Intel-gfx] [PATCH 01/33] drm/i915: Use EAGAIN for trylock failures Chris Wilson
                   ` (31 preceding siblings ...)
  2019-12-12 14:04 ` [Intel-gfx] [PATCH 33/33] HAX: Use aliasing-ppgtt for gen7 Chris Wilson
@ 2019-12-12 18:09 ` Patchwork
  32 siblings, 0 replies; 46+ messages in thread
From: Patchwork @ 2019-12-12 18:09 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

== Series Details ==

Series: series starting with [01/33] drm/i915: Use EAGAIN for trylock failures
URL   : https://patchwork.freedesktop.org/series/70833/
State : failure

== Summary ==

Applying: drm/i915: Use EAGAIN for trylock failures
Applying: drm/i915: Eliminate the trylock for awaiting an earlier request
Applying: drm/i915/gt: Eliminate the trylock for reading a timeline's hwsp
Applying: drm/i915/uc: Ignore maybe-unused debug-only i915 local
Applying: drm/i915/gem: Prepare gen7 cmdparser for async execution
Using index info to reconstruct a base tree...
M	drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
M	drivers/gpu/drm/i915/gt/intel_gpu_commands.h
M	drivers/gpu/drm/i915/i915_cmd_parser.c
M	drivers/gpu/drm/i915/i915_drv.h
Falling back to patching base and 3-way merge...
Auto-merging drivers/gpu/drm/i915/i915_cmd_parser.c
CONFLICT (content): Merge conflict in drivers/gpu/drm/i915/i915_cmd_parser.c
Auto-merging drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
CONFLICT (content): Merge conflict in drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
error: Failed to merge in the changes.
hint: Use 'git am --show-current-patch' to see the failed patch
Patch failed at 0005 drm/i915/gem: Prepare gen7 cmdparser for async execution
When you have resolved this problem, run "git am --continue".
If you prefer to skip this patch, run "git am --skip" instead.
To restore the original branch and stop patching, run "git am --abort".

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [Intel-gfx] [PATCH 08/33] drm/i915/gt: Mark context->state vma as active while pinned
  2019-12-12 14:04 ` [Intel-gfx] [PATCH 08/33] drm/i915/gt: Mark context->state vma as active while pinned Chris Wilson
@ 2019-12-12 19:55   ` Lionel Landwerlin
  2019-12-12 21:35     ` Lionel Landwerlin
  0 siblings, 1 reply; 46+ messages in thread
From: Lionel Landwerlin @ 2019-12-12 19:55 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx

On 12/12/2019 16:04, Chris Wilson wrote:
> As we use the active state to keep the vma alive while we are reading
> its contents during GPU error capture, we need to mark the
> context->state vma as active during execution if we want to include it
> in the error state.
>
> Reported-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
> Fixes: b1e3177bd1d8 ("drm/i915: Coordinate i915_active with its own mutex")
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
> ---
>   drivers/gpu/drm/i915/gt/intel_context.c | 9 +++++++++
>   1 file changed, 9 insertions(+)
>
> diff --git a/drivers/gpu/drm/i915/gt/intel_context.c b/drivers/gpu/drm/i915/gt/intel_context.c
> index 61c39e943f69..f7e2f3af007a 100644
> --- a/drivers/gpu/drm/i915/gt/intel_context.c
> +++ b/drivers/gpu/drm/i915/gt/intel_context.c
> @@ -120,6 +120,10 @@ static int __context_pin_state(struct i915_vma *vma)
>   	if (err)
>   		return err;
>   
> +	err = i915_active_acquire(&vma->active);
> +	if (err)
> +		goto err_unpin;
> +
>   	/*
>   	 * And mark it as a globally pinned object to let the shrinker know
>   	 * it cannot reclaim the object until we release it.
> @@ -128,11 +132,16 @@ static int __context_pin_state(struct i915_vma *vma)
>   	vma->obj->mm.dirty = true;
>   
>   	return 0;
> +
> +err_unpin:
> +	i915_vma_unpin(vma);
> +	return err;
>   }
>   
>   static void __context_unpin_state(struct i915_vma *vma)
>   {
>   	i915_vma_make_shrinkable(vma);
> +	i915_active_release(&vma->active);
>   	__i915_vma_unpin(vma);
>   }
>   


_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [Intel-gfx] [PATCH 09/33] drm/i915/gt: Mark ring->vma as active while pinned
  2019-12-12 14:04 ` [Intel-gfx] [PATCH 09/33] drm/i915/gt: Mark ring->vma " Chris Wilson
@ 2019-12-12 19:55   ` Lionel Landwerlin
  0 siblings, 0 replies; 46+ messages in thread
From: Lionel Landwerlin @ 2019-12-12 19:55 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx

On 12/12/2019 16:04, Chris Wilson wrote:
> As we use the active state to keep the vma alive while we are reading
> its contents during GPU error capture, we need to mark the
> ring->vma as active during execution if we want to include the rinbuffer
> in the error state.
>
> Reported-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
> Fixes: b1e3177bd1d8 ("drm/i915: Coordinate i915_active with its own mutex")
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
> ---
>   drivers/gpu/drm/i915/gt/intel_ring.c | 10 +++++++++-
>   1 file changed, 9 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/i915/gt/intel_ring.c b/drivers/gpu/drm/i915/gt/intel_ring.c
> index 374b28f13ca0..7a27264150b9 100644
> --- a/drivers/gpu/drm/i915/gt/intel_ring.c
> +++ b/drivers/gpu/drm/i915/gt/intel_ring.c
> @@ -45,6 +45,10 @@ int intel_ring_pin(struct intel_ring *ring)
>   	if (unlikely(ret))
>   		goto err_unpin;
>   
> +	ret = i915_active_acquire(&vma->active);
> +	if (ret)
> +		goto err_ring;
> +
>   	if (i915_vma_is_map_and_fenceable(vma))
>   		addr = (void __force *)i915_vma_pin_iomap(vma);
>   	else
> @@ -52,7 +56,7 @@ int intel_ring_pin(struct intel_ring *ring)
>   					       i915_coherent_map_type(vma->vm->i915));
>   	if (IS_ERR(addr)) {
>   		ret = PTR_ERR(addr);
> -		goto err_ring;
> +		goto err_active;
>   	}
>   
>   	i915_vma_make_unshrinkable(vma);
> @@ -63,6 +67,8 @@ int intel_ring_pin(struct intel_ring *ring)
>   	ring->vaddr = addr;
>   	return 0;
>   
> +err_active:
> +	i915_active_release(&vma->active);
>   err_ring:
>   	i915_vma_unpin(vma);
>   err_unpin:
> @@ -93,6 +99,8 @@ void intel_ring_unpin(struct intel_ring *ring)
>   		i915_gem_object_unpin_map(vma->obj);
>   
>   	i915_vma_make_purgeable(vma);
> +
> +	i915_active_release(&vma->active);
>   	i915_vma_unpin(vma);
>   }
>   


_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [Intel-gfx] [PATCH 08/33] drm/i915/gt: Mark context->state vma as active while pinned
  2019-12-12 19:55   ` Lionel Landwerlin
@ 2019-12-12 21:35     ` Lionel Landwerlin
  2019-12-12 21:46       ` Chris Wilson
  0 siblings, 1 reply; 46+ messages in thread
From: Lionel Landwerlin @ 2019-12-12 21:35 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx

On 12/12/2019 21:55, Lionel Landwerlin wrote:
> On 12/12/2019 16:04, Chris Wilson wrote:
>> As we use the active state to keep the vma alive while we are reading
>> its contents during GPU error capture, we need to mark the
>> context->state vma as active during execution if we want to include it
>> in the error state.
>>
>> Reported-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
>> Fixes: b1e3177bd1d8 ("drm/i915: Coordinate i915_active with its own 
>> mutex")
>> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
>> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>> Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
> Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> 


I'm wondering whether we want a test for this or some kind of assert in 
the error capture.

If there is a batch, there must be a context/ring kind of assert.


-Lionel

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [Intel-gfx] [PATCH 08/33] drm/i915/gt: Mark context->state vma as active while pinned
  2019-12-12 21:35     ` Lionel Landwerlin
@ 2019-12-12 21:46       ` Chris Wilson
  0 siblings, 0 replies; 46+ messages in thread
From: Chris Wilson @ 2019-12-12 21:46 UTC (permalink / raw)
  To: Lionel Landwerlin, intel-gfx

Quoting Lionel Landwerlin (2019-12-12 21:35:00)
> On 12/12/2019 21:55, Lionel Landwerlin wrote:
> > On 12/12/2019 16:04, Chris Wilson wrote:
> >> As we use the active state to keep the vma alive while we are reading
> >> its contents during GPU error capture, we need to mark the
> >> context->state vma as active during execution if we want to include it
> >> in the error state.
> >>
> >> Reported-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
> >> Fixes: b1e3177bd1d8 ("drm/i915: Coordinate i915_active with its own 
> >> mutex")
> >> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> >> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> >> Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
> > Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> 
> 
> 
> I'm wondering whether we want a test for this or some kind of assert in 
> the error capture.

Considered it, but it's not a fundamental part of the ABI, it's just
useful to have until we have something better. There are a few valid
reasons why we might not be able to capture it as well, it's best
effort.
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [Intel-gfx] [26/33] drm/i915/gt: Expose engine properties via sysfs
  2019-12-12 14:04 ` [Intel-gfx] [PATCH 26/33] drm/i915/gt: Expose engine properties via sysfs Chris Wilson
@ 2020-01-22 21:13   ` Steve Carbonari
  0 siblings, 0 replies; 46+ messages in thread
From: Steve Carbonari @ 2020-01-22 21:13 UTC (permalink / raw)
  To: intel-gfx

On Thu, Dec 12, 2019 at 02:04:52PM +0000, Chris Wilson wrote:
> Preliminary stub to add engines underneath /sys/class/drm/cardN/, so
> that we can expose properties on each engine to the sysadmin.
> 
> To start with we have basic analogues of the i915_query ioctl so that we
> can pretty print engine discovery from the shell, and flesh out the
> directory structure. Later we will add writeable sysadmin properties such
> as per-engine timeout controls.
> 
> An example tree of the engine properties on Braswell:
>     /sys/class/drm/card0
>     └── engine
>         ├── bcs0
>         │   ├── capabilities
>         │   ├── class
>         │   ├── instance
>         │   ├── known_capabilities
>         │   └── name
>         ├── rcs0
>         │   ├── capabilities
>         │   ├── class
>         │   ├── instance
>         │   ├── known_capabilities
>         │   └── name
>         ├── vcs0
>         │   ├── capabilities
>         │   ├── class
>         │   ├── instance
>         │   ├── known_capabilities
>         │   └── name
>         └── vecs0
>             ├── capabilities
>             ├── class
>             ├── instance
>             ├── known_capabilities
>             └── name
> 
> v2: Include stringified capabilities
> v3: Include all known capabilities for futureproofing.
> v4: Combine the two caps loops into one
> 
> v5: Hide underneath Kconfig.unstable for wider discussion
> 
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
> Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Code is correct and satisfies the interface requirements.
In addition, downloaded and tested the patch.
The directories and files show up with expected values.

Reviewed-by: Steve Carbonari <steven.carbonari@intel.com>
Tested-by: Steve Carbonari <stenve.carbonari@intel.com>

> ---
>  drivers/gpu/drm/i915/Kconfig.unstable        |   7 +
>  drivers/gpu/drm/i915/Makefile                |   1 +
>  drivers/gpu/drm/i915/gt/intel_engine_sysfs.c | 208 +++++++++++++++++++
>  drivers/gpu/drm/i915/gt/intel_engine_sysfs.h |  14 ++
>  drivers/gpu/drm/i915/i915_sysfs.c            |   3 +
>  5 files changed, 233 insertions(+)
>  create mode 100644 drivers/gpu/drm/i915/gt/intel_engine_sysfs.c
>  create mode 100644 drivers/gpu/drm/i915/gt/intel_engine_sysfs.h
> 
> diff --git a/drivers/gpu/drm/i915/Kconfig.unstable b/drivers/gpu/drm/i915/Kconfig.unstable
> index 0c2276155c2b..1f866cae943b 100644
> --- a/drivers/gpu/drm/i915/Kconfig.unstable
> +++ b/drivers/gpu/drm/i915/Kconfig.unstable
> @@ -27,3 +27,10 @@ config DRM_I915_UNSTABLE_FAKE_LMEM
>  	help
>  	  Convert some system memory into a fake local memory region for
>  	  testing.
> +
> +config DRM_I915_UNSTABLE_SYSFS
> +	bool "Enable the experimental sysfs properties"
> +	depends on DRM_I915_UNSTABLE
> +	default n
> +	help
> +	  Explore the HW property space from the shell command line!
> diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
> index 2083496a1193..d80fe6bf9fea 100644
> --- a/drivers/gpu/drm/i915/Makefile
> +++ b/drivers/gpu/drm/i915/Makefile
> @@ -82,6 +82,7 @@ gt-y += \
>  	gt/intel_engine_heartbeat.o \
>  	gt/intel_engine_pm.o \
>  	gt/intel_engine_pool.o \
> +	gt/intel_engine_sysfs.o \
>  	gt/intel_engine_user.o \
>  	gt/intel_gt.o \
>  	gt/intel_gt_irq.o \
> diff --git a/drivers/gpu/drm/i915/gt/intel_engine_sysfs.c b/drivers/gpu/drm/i915/gt/intel_engine_sysfs.c
> new file mode 100644
> index 000000000000..df263af3a9ea
> --- /dev/null
> +++ b/drivers/gpu/drm/i915/gt/intel_engine_sysfs.c
> @@ -0,0 +1,208 @@
> +/*
> + * SPDX-License-Identifier: MIT
> + *
> + * Copyright © 2019 Intel Corporation
> + */
> +
> +#include <linux/kobject.h>
> +#include <linux/sysfs.h>
> +
> +#include "i915_drv.h"
> +#include "intel_engine.h"
> +#include "intel_engine_sysfs.h"
> +
> +struct kobj_engine {
> +	struct kobject base;
> +	struct intel_engine_cs *engine;
> +};
> +
> +static struct intel_engine_cs *kobj_to_engine(struct kobject *kobj)
> +{
> +	return container_of(kobj, struct kobj_engine, base)->engine;
> +}
> +
> +static ssize_t
> +name_show(struct kobject *kobj, struct kobj_attribute *attr, char *buf)
> +{
> +	return sprintf(buf, "%s\n", kobj_to_engine(kobj)->name);
> +}
> +
> +static struct kobj_attribute name_attr =
> +__ATTR(name, 0444, name_show, NULL);
> +
> +static ssize_t
> +class_show(struct kobject *kobj, struct kobj_attribute *attr, char *buf)
> +{
> +	return sprintf(buf, "%d\n", kobj_to_engine(kobj)->uabi_class);
> +}
> +
> +static struct kobj_attribute class_attr =
> +__ATTR(class, 0444, class_show, NULL);
> +
> +static ssize_t
> +inst_show(struct kobject *kobj, struct kobj_attribute *attr, char *buf)
> +{
> +	return sprintf(buf, "%d\n", kobj_to_engine(kobj)->uabi_instance);
> +}
> +
> +static struct kobj_attribute inst_attr =
> +__ATTR(instance, 0444, inst_show, NULL);
> +
> +static const char * const vcs_caps[] = {
> +	[ilog2(I915_VIDEO_CLASS_CAPABILITY_HEVC)] = "hevc",
> +	[ilog2(I915_VIDEO_AND_ENHANCE_CLASS_CAPABILITY_SFC)] = "sfc",
> +};
> +
> +static const char * const vecs_caps[] = {
> +	[ilog2(I915_VIDEO_AND_ENHANCE_CLASS_CAPABILITY_SFC)] = "sfc",
> +};
> +
> +static ssize_t repr_trim(char *buf, ssize_t len)
> +{
> +	/* Trim off the trailing space and replace with a newline */
> +	if (len > PAGE_SIZE)
> +		len = PAGE_SIZE;
> +	if (len > 0)
> +		buf[len - 1] = '\n';
> +
> +	return len;
> +}
> +
> +static ssize_t
> +__caps_show(struct intel_engine_cs *engine,
> +	    typeof(engine->uabi_capabilities) caps,
> +	    char *buf, bool show_unknown)
> +{
> +	const char * const *repr;
> +	int count, n;
> +	ssize_t len;
> +
> +	switch (engine->class) {
> +	case VIDEO_DECODE_CLASS:
> +		repr = vcs_caps;
> +		count = ARRAY_SIZE(vcs_caps);
> +		break;
> +
> +	case VIDEO_ENHANCEMENT_CLASS:
> +		repr = vecs_caps;
> +		count = ARRAY_SIZE(vecs_caps);
> +		break;
> +
> +	default:
> +		repr = NULL;
> +		count = 0;
> +		break;
> +	}
> +	GEM_BUG_ON(count > BITS_PER_TYPE(typeof(caps)));
> +
> +	len = 0;
> +	for_each_set_bit(n,
> +			 (unsigned long *)&caps,
> +			 show_unknown ? BITS_PER_TYPE(typeof(caps)) : count) {
> +		if (n >= count || !repr[n]) {
> +			if (GEM_WARN_ON(show_unknown))
> +				len += snprintf(buf + len, PAGE_SIZE - len,
> +						"[%x] ", n);
> +		} else {
> +			len += snprintf(buf + len, PAGE_SIZE - len,
> +					"%s ", repr[n]);
> +		}
> +		if (GEM_WARN_ON(len >= PAGE_SIZE))
> +			break;
> +	}
> +	return repr_trim(buf, len);
> +}
> +
> +static ssize_t
> +caps_show(struct kobject *kobj, struct kobj_attribute *attr, char *buf)
> +{
> +	struct intel_engine_cs *engine = kobj_to_engine(kobj);
> +
> +	return __caps_show(engine, engine->uabi_capabilities, buf, true);
> +}
> +
> +static struct kobj_attribute caps_attr =
> +__ATTR(capabilities, 0444, caps_show, NULL);
> +
> +static ssize_t
> +all_caps_show(struct kobject *kobj, struct kobj_attribute *attr, char *buf)
> +{
> +	return __caps_show(kobj_to_engine(kobj), -1, buf, false);
> +}
> +
> +static struct kobj_attribute all_caps_attr =
> +__ATTR(known_capabilities, 0444, all_caps_show, NULL);
> +
> +static void kobj_engine_release(struct kobject *kobj)
> +{
> +	kfree(kobj);
> +}
> +
> +static struct kobj_type kobj_engine_type = {
> +	.release = kobj_engine_release,
> +	.sysfs_ops = &kobj_sysfs_ops
> +};
> +
> +static struct kobject *
> +kobj_engine(struct kobject *dir, struct intel_engine_cs *engine)
> +{
> +	struct kobj_engine *ke;
> +
> +	ke = kzalloc(sizeof(*ke), GFP_KERNEL);
> +	if (!ke)
> +		return NULL;
> +
> +	kobject_init(&ke->base, &kobj_engine_type);
> +	ke->engine = engine;
> +
> +	if (kobject_add(&ke->base, dir, "%s", engine->name)) {
> +		kobject_put(&ke->base);
> +		return NULL;
> +	}
> +
> +	/* xfer ownership to sysfs tree */
> +	return &ke->base;
> +}
> +
> +void intel_engines_add_sysfs(struct drm_i915_private *i915)
> +{
> +	static const struct attribute *files[] = {
> +		&name_attr.attr,
> +		&class_attr.attr,
> +		&inst_attr.attr,
> +		&caps_attr.attr,
> +		&all_caps_attr.attr,
> +		NULL
> +	};
> +
> +	struct device *kdev = i915->drm.primary->kdev;
> +	struct intel_engine_cs *engine;
> +	struct kobject *dir;
> +
> +	if (!IS_ENABLED(CONFIG_DRM_I915_UNSTABLE_SYSFS))
> +		return;
> +
> +	dir = kobject_create_and_add("engine", &kdev->kobj);
> +	if (!dir)
> +		return;
> +
> +	for_each_uabi_engine(engine, i915) {
> +		struct kobject *kobj;
> +
> +		kobj = kobj_engine(dir, engine);
> +		if (!kobj)
> +			goto err_engine;
> +
> +		if (sysfs_create_files(kobj, files))
> +			goto err_object;
> +
> +		if (0) {
> +err_object:
> +			kobject_put(kobj);
> +err_engine:
> +			dev_err(kdev, "Failed to add sysfs engine '%s'\n",
> +				engine->name);
> +			break;
> +		}
> +	}
> +}
> diff --git a/drivers/gpu/drm/i915/gt/intel_engine_sysfs.h b/drivers/gpu/drm/i915/gt/intel_engine_sysfs.h
> new file mode 100644
> index 000000000000..ef44a745b70a
> --- /dev/null
> +++ b/drivers/gpu/drm/i915/gt/intel_engine_sysfs.h
> @@ -0,0 +1,14 @@
> +/*
> + * SPDX-License-Identifier: MIT
> + *
> + * Copyright © 2019 Intel Corporation
> + */
> +
> +#ifndef INTEL_ENGINE_SYSFS_H
> +#define INTEL_ENGINE_SYSFS_H
> +
> +struct drm_i915_private;
> +
> +void intel_engines_add_sysfs(struct drm_i915_private *i915);
> +
> +#endif /* INTEL_ENGINE_SYSFS_H */
> diff --git a/drivers/gpu/drm/i915/i915_sysfs.c b/drivers/gpu/drm/i915/i915_sysfs.c
> index 65476909d1bf..26738c5c7d23 100644
> --- a/drivers/gpu/drm/i915/i915_sysfs.c
> +++ b/drivers/gpu/drm/i915/i915_sysfs.c
> @@ -30,6 +30,7 @@
>  #include <linux/stat.h>
>  #include <linux/sysfs.h>
>  
> +#include "gt/intel_engine_sysfs.h"
>  #include "gt/intel_rc6.h"
>  #include "gt/intel_rps.h"
>  
> @@ -616,6 +617,8 @@ void i915_setup_sysfs(struct drm_i915_private *dev_priv)
>  		DRM_ERROR("RPS sysfs setup failed\n");
>  
>  	i915_setup_error_capture(kdev);
> +
> +	intel_engines_add_sysfs(dev_priv);
>  }
>  
>  void i915_teardown_sysfs(struct drm_i915_private *dev_priv)
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [Intel-gfx] [27/33] drm/i915/gt: Expose engine->mmio_base via sysfs
  2019-12-12 14:04 ` [Intel-gfx] [PATCH 27/33] drm/i915/gt: Expose engine->mmio_base " Chris Wilson
@ 2020-01-22 21:18   ` Steve Carbonari
  0 siblings, 0 replies; 46+ messages in thread
From: Steve Carbonari @ 2020-01-22 21:18 UTC (permalink / raw)
  To: intel-gfx

On Thu, Dec 12, 2019 at 02:04:53PM +0000, Chris Wilson wrote:
> Use the per-engine sysfs directory to let userspace discover the
> mmio_base of each engine. Prior to recent generations, the user
> accessible registers on each engine are at a fixed offset relative to
> each engine -- but require absolute addressing. As the absolute address
> depends on the actual physical engine, this is not always possible to
> determine from userspace (for example icl may expose vcs1 or vcs2 as the
> second vcs engine). Make this easy for userspace to discover by
> providing the mmio_base in sysfs.
> 
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>

Reviewed code.
Downloaded and tested the patch with dependent previous patch
[26/33] drm/i915/gt: Expose engine properties via sysfs.
The mmio_base file appears with correct value.

Reviewed-by: Steve Carbonari <steven.carbonari@intel.com>
Tested-by: Steve Carbonari <steven.carbonari@intel.com>

> ---
>  drivers/gpu/drm/i915/gt/intel_engine_sysfs.c | 10 ++++++++++
>  1 file changed, 10 insertions(+)
> 
> diff --git a/drivers/gpu/drm/i915/gt/intel_engine_sysfs.c b/drivers/gpu/drm/i915/gt/intel_engine_sysfs.c
> index df263af3a9ea..abddd8d0f9ae 100644
> --- a/drivers/gpu/drm/i915/gt/intel_engine_sysfs.c
> +++ b/drivers/gpu/drm/i915/gt/intel_engine_sysfs.c
> @@ -48,6 +48,15 @@ inst_show(struct kobject *kobj, struct kobj_attribute *attr, char *buf)
>  static struct kobj_attribute inst_attr =
>  __ATTR(instance, 0444, inst_show, NULL);
>  
> +static ssize_t
> +mmio_show(struct kobject *kobj, struct kobj_attribute *attr, char *buf)
> +{
> +	return sprintf(buf, "0x%x\n", kobj_to_engine(kobj)->mmio_base);
> +}
> +
> +static struct kobj_attribute mmio_attr =
> +__ATTR(mmio_base, 0444, mmio_show, NULL);
> +
>  static const char * const vcs_caps[] = {
>  	[ilog2(I915_VIDEO_CLASS_CAPABILITY_HEVC)] = "hevc",
>  	[ilog2(I915_VIDEO_AND_ENHANCE_CLASS_CAPABILITY_SFC)] = "sfc",
> @@ -170,6 +179,7 @@ void intel_engines_add_sysfs(struct drm_i915_private *i915)
>  		&name_attr.attr,
>  		&class_attr.attr,
>  		&inst_attr.attr,
> +		&mmio_attr.attr,
>  		&caps_attr.attr,
>  		&all_caps_attr.attr,
>  		NULL
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [Intel-gfx] [28/33] drm/i915/gt: Expose timeslice duration to sysfs
  2019-12-12 14:04 ` [Intel-gfx] [PATCH 28/33] drm/i915/gt: Expose timeslice duration to sysfs Chris Wilson
@ 2020-01-22 21:25   ` Steve Carbonari
  0 siblings, 0 replies; 46+ messages in thread
From: Steve Carbonari @ 2020-01-22 21:25 UTC (permalink / raw)
  To: intel-gfx

On Thu, Dec 12, 2019 at 02:04:54PM +0000, Chris Wilson wrote:
> Execlists uses a scheduling quantum (a timeslice) to alternate execution
> between ready-to-run contexts of equal priority. This ensures that all
> users (though only if they of equal importance) have the opportunity to
> run and prevents livelocks where contexts may have implicit ordering due
> to userspace semaphores.
> 
> The timeslicing mechanism can be compiled out with
> 
> 	./scripts/config --set-val DRM_I915_TIMESLICE_DURATION 0
> 
> The timeslice duration can be adjusted per-engine using,
> 
> 	/sys/class/drm/card?/engine/*/timeslice_duration_ms
> 
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>

Code looks good.
Tested the patch with previous patches in the series.
[26/33] drm/i915/gt: Expose engine properties via sysfs
[27/33] drm/i915/gt: Expose engine->mmio_base via sysfs
The directories and files show up with expected values.
Verified that if engine does not have timeslices the
sysfs entry is not created.

Reviewed-by: Steve Carbonari <steven.carbonari@intel.com>
Tested-by: Steve Carbonari <steven.carbonari@intel.com


> ---
>  drivers/gpu/drm/i915/Kconfig.profile         |  3 ++
>  drivers/gpu/drm/i915/gt/intel_engine_sysfs.c | 46 ++++++++++++++++++++
>  2 files changed, 49 insertions(+)
> 
> diff --git a/drivers/gpu/drm/i915/Kconfig.profile b/drivers/gpu/drm/i915/Kconfig.profile
> index c280b6ae38eb..d8d4a16179bd 100644
> --- a/drivers/gpu/drm/i915/Kconfig.profile
> +++ b/drivers/gpu/drm/i915/Kconfig.profile
> @@ -73,4 +73,7 @@ config DRM_I915_TIMESLICE_DURATION
>  	  is scheduled for execution for the timeslice duration, before
>  	  switching to the next context.
>  
> +	  This is adjustable via
> +	  /sys/class/drm/card?/engine/*/timeslice_duration_ms
> +
>  	  May be 0 to disable timeslicing.
> diff --git a/drivers/gpu/drm/i915/gt/intel_engine_sysfs.c b/drivers/gpu/drm/i915/gt/intel_engine_sysfs.c
> index abddd8d0f9ae..b1bd768b13d7 100644
> --- a/drivers/gpu/drm/i915/gt/intel_engine_sysfs.c
> +++ b/drivers/gpu/drm/i915/gt/intel_engine_sysfs.c
> @@ -142,6 +142,48 @@ all_caps_show(struct kobject *kobj, struct kobj_attribute *attr, char *buf)
>  static struct kobj_attribute all_caps_attr =
>  __ATTR(known_capabilities, 0444, all_caps_show, NULL);
>  
> +static ssize_t
> +timeslice_store(struct kobject *kobj, struct kobj_attribute *attr,
> +		const char *buf, size_t count)
> +{
> +	struct intel_engine_cs *engine = kobj_to_engine(kobj);
> +	unsigned long long duration;
> +	int err;
> +
> +	/*
> +	 * Execlists uses a scheduling quantum (a timeslice) to alternate
> +	 * execution between ready-to-run contexts of equal priority. This
> +	 * ensures that all users (though only if they of equal importance)
> +	 * have the opportunity to run and prevents livelocks where contexts
> +	 * may have implicit ordering due to userspace semaphores.
> +	 */
> +
> +	err = kstrtoull(buf, 0, &duration);
> +	if (err)
> +		return err;
> +
> +	if (duration > jiffies_to_msecs(MAX_SCHEDULE_TIMEOUT))
> +		return -EINVAL;
> +
> +	WRITE_ONCE(engine->props.timeslice_duration_ms, duration);
> +
> +	if (execlists_active(&engine->execlists))
> +		set_timer_ms(&engine->execlists.timer, duration);
> +
> +	return count;
> +}
> +
> +static ssize_t
> +timeslice_show(struct kobject *kobj, struct kobj_attribute *attr, char *buf)
> +{
> +	struct intel_engine_cs *engine = kobj_to_engine(kobj);
> +
> +	return sprintf(buf, "%lu\n", engine->props.timeslice_duration_ms);
> +}
> +
> +static struct kobj_attribute timeslice_duration_attr =
> +__ATTR(timeslice_duration_ms, 0644, timeslice_show, timeslice_store);
> +
>  static void kobj_engine_release(struct kobject *kobj)
>  {
>  	kfree(kobj);
> @@ -206,6 +248,10 @@ void intel_engines_add_sysfs(struct drm_i915_private *i915)
>  		if (sysfs_create_files(kobj, files))
>  			goto err_object;
>  
> +		if (intel_engine_has_timeslices(engine) &&
> +		    sysfs_create_file(kobj, &timeslice_duration_attr.attr))
> +			goto err_engine;
> +
>  		if (0) {
>  err_object:
>  			kobject_put(kobj);
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [Intel-gfx] [29/33] drm/i915/gt: Expose busywait duration to sysfs
  2019-12-12 14:04 ` [Intel-gfx] [PATCH 29/33] drm/i915/gt: Expose busywait " Chris Wilson
@ 2020-01-22 21:29   ` Steve Carbonari
  0 siblings, 0 replies; 46+ messages in thread
From: Steve Carbonari @ 2020-01-22 21:29 UTC (permalink / raw)
  To: intel-gfx

On Thu, Dec 12, 2019 at 02:04:55PM +0000, Chris Wilson wrote:
> We busywait on an inflight request (one that is currently executing on
> HW, and so might complete quickly) prior to setting up an interrupt and
> sleeping. The trade off is that we keep an expensive CPU core busy in
> order to avoid wake up latency: where that trade off should lie is best
> left to the sysadmin.
> 
> The busywait mechanism can be compiled out with
> 
> 	./scripts/config --set-val DRM_I915_SPIN_REQUEST 0
> 
> The maximum busywait duration can be adjusted per-engine using,
> 
> 	/sys/class/drm/card?/engine/*/ms_busywait_duration_ns
> 
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>

Code looks good.
Tested the patch with previous sysfs related patches in the series.
The directories and files show up with expected values.
Busywait can be changed.

Reviewed-by: Steve Carbonari <steven.carbonari@intel.com>
Tested-by: Steve Carbonari <steven.carbonari@intel.com


> ---
>  drivers/gpu/drm/i915/Kconfig.profile         |  9 ++--
>  drivers/gpu/drm/i915/gt/intel_engine_cs.c    |  2 +
>  drivers/gpu/drm/i915/gt/intel_engine_sysfs.c | 49 ++++++++++++++++++++
>  drivers/gpu/drm/i915/gt/intel_engine_types.h |  1 +
>  drivers/gpu/drm/i915/i915_request.c          | 19 ++++----
>  5 files changed, 68 insertions(+), 12 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/Kconfig.profile b/drivers/gpu/drm/i915/Kconfig.profile
> index d8d4a16179bd..9ee3b59685b9 100644
> --- a/drivers/gpu/drm/i915/Kconfig.profile
> +++ b/drivers/gpu/drm/i915/Kconfig.profile
> @@ -35,9 +35,9 @@ config DRM_I915_PREEMPT_TIMEOUT
>  
>  	  May be 0 to disable the timeout.
>  
> -config DRM_I915_SPIN_REQUEST
> -	int "Busywait for request completion (us)"
> -	default 5 # microseconds
> +config DRM_I915_MAX_REQUEST_BUSYWAIT
> +	int "Busywait for request completion limit (ns)"
> +	default 8000 # nanoseconds
>  	help
>  	  Before sleeping waiting for a request (GPU operation) to complete,
>  	  we may spend some time polling for its completion. As the IRQ may
> @@ -45,6 +45,9 @@ config DRM_I915_SPIN_REQUEST
>  	  check if the request will complete in the time it would have taken
>  	  us to enable the interrupt.
>  
> +	  This is adjustable via
> +	  /sys/class/drm/card?/engine/*/max_busywait_duration_ns
> +
>  	  May be 0 to disable the initial spin. In practice, we estimate
>  	  the cost of enabling the interrupt (if currently disabled) to be
>  	  a few microseconds.
> diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> index 9e16d8a9bbf5..9162b94a7b80 100644
> --- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> +++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> @@ -312,6 +312,8 @@ static int intel_engine_setup(struct intel_gt *gt, enum intel_engine_id id)
>  
>  	engine->props.heartbeat_interval_ms =
>  		CONFIG_DRM_I915_HEARTBEAT_INTERVAL;
> +	engine->props.max_busywait_duration_ns =
> +		CONFIG_DRM_I915_MAX_REQUEST_BUSYWAIT;
>  	engine->props.preempt_timeout_ms =
>  		CONFIG_DRM_I915_PREEMPT_TIMEOUT;
>  	engine->props.stop_timeout_ms =
> diff --git a/drivers/gpu/drm/i915/gt/intel_engine_sysfs.c b/drivers/gpu/drm/i915/gt/intel_engine_sysfs.c
> index b1bd768b13d7..6d87529c64a7 100644
> --- a/drivers/gpu/drm/i915/gt/intel_engine_sysfs.c
> +++ b/drivers/gpu/drm/i915/gt/intel_engine_sysfs.c
> @@ -142,6 +142,54 @@ all_caps_show(struct kobject *kobj, struct kobj_attribute *attr, char *buf)
>  static struct kobj_attribute all_caps_attr =
>  __ATTR(known_capabilities, 0444, all_caps_show, NULL);
>  
> +static ssize_t
> +max_spin_store(struct kobject *kobj, struct kobj_attribute *attr,
> +	       const char *buf, size_t count)
> +{
> +	struct intel_engine_cs *engine = kobj_to_engine(kobj);
> +	unsigned long long duration;
> +	int err;
> +
> +	/*
> +	 * When waiting for a request, if is it currently being executed
> +	 * on the GPU, we busywait for a short while before sleeping. The
> +	 * premise is that most requests are short, and if it is already
> +	 * executing then there is a good chance that it will complete
> +	 * before we can setup the interrupt handler and go to sleep.
> +	 * We try to offset the cost of going to sleep, by first spinning
> +	 * on the request -- if it completed in less time than it would take
> +	 * to go sleep, process the interrupt and return back to the client,
> +	 * then we have saved the client some latency, albeit at the cost
> +	 * of spinning on an expensive CPU core.
> +	 *
> +	 * While we try to avoid waiting at all for a request that is unlikely
> +	 * to complete, deciding how long it is worth spinning is for is an
> +	 * arbitrary decision: trading off power vs latency.
> +	 */
> +
> +	err = kstrtoull(buf, 0, &duration);
> +	if (err)
> +		return err;
> +
> +	if (duration > jiffies_to_nsecs(2))
> +		return -EINVAL;
> +
> +	WRITE_ONCE(engine->props.max_busywait_duration_ns, duration);
> +
> +	return count;
> +}
> +
> +static ssize_t
> +max_spin_show(struct kobject *kobj, struct kobj_attribute *attr, char *buf)
> +{
> +	struct intel_engine_cs *engine = kobj_to_engine(kobj);
> +
> +	return sprintf(buf, "%lu\n", engine->props.max_busywait_duration_ns);
> +}
> +
> +static struct kobj_attribute max_spin_attr =
> +__ATTR(max_busywait_duration_ns, 0644, max_spin_show, max_spin_store);
> +
>  static ssize_t
>  timeslice_store(struct kobject *kobj, struct kobj_attribute *attr,
>  		const char *buf, size_t count)
> @@ -224,6 +272,7 @@ void intel_engines_add_sysfs(struct drm_i915_private *i915)
>  		&mmio_attr.attr,
>  		&caps_attr.attr,
>  		&all_caps_attr.attr,
> +		&max_spin_attr.attr,
>  		NULL
>  	};
>  
> diff --git a/drivers/gpu/drm/i915/gt/intel_engine_types.h b/drivers/gpu/drm/i915/gt/intel_engine_types.h
> index ede7ce2695cd..807796d619e9 100644
> --- a/drivers/gpu/drm/i915/gt/intel_engine_types.h
> +++ b/drivers/gpu/drm/i915/gt/intel_engine_types.h
> @@ -525,6 +525,7 @@ struct intel_engine_cs {
>  
>  	struct {
>  		unsigned long heartbeat_interval_ms;
> +		unsigned long max_busywait_duration_ns;
>  		unsigned long preempt_timeout_ms;
>  		unsigned long stop_timeout_ms;
>  		unsigned long timeslice_duration_ms;
> diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
> index 15767c2d9ea1..2f6979009c8f 100644
> --- a/drivers/gpu/drm/i915/i915_request.c
> +++ b/drivers/gpu/drm/i915/i915_request.c
> @@ -1381,7 +1381,7 @@ void i915_request_add(struct i915_request *rq)
>  	mutex_unlock(&tl->mutex);
>  }
>  
> -static unsigned long local_clock_us(unsigned int *cpu)
> +static unsigned long local_clock_ns(unsigned int *cpu)
>  {
>  	unsigned long t;
>  
> @@ -1398,7 +1398,7 @@ static unsigned long local_clock_us(unsigned int *cpu)
>  	 * stop busywaiting, see busywait_stop().
>  	 */
>  	*cpu = get_cpu();
> -	t = local_clock() >> 10;
> +	t = local_clock();
>  	put_cpu();
>  
>  	return t;
> @@ -1408,15 +1408,15 @@ static bool busywait_stop(unsigned long timeout, unsigned int cpu)
>  {
>  	unsigned int this_cpu;
>  
> -	if (time_after(local_clock_us(&this_cpu), timeout))
> +	if (time_after(local_clock_ns(&this_cpu), timeout))
>  		return true;
>  
>  	return this_cpu != cpu;
>  }
>  
> -static bool __i915_spin_request(const struct i915_request * const rq,
> -				int state, unsigned long timeout_us)
> +static bool __i915_spin_request(const struct i915_request * const rq, int state)
>  {
> +	unsigned long timeout_ns;
>  	unsigned int cpu;
>  
>  	/*
> @@ -1444,7 +1444,8 @@ static bool __i915_spin_request(const struct i915_request * const rq,
>  	 * takes to sleep on a request, on the order of a microsecond.
>  	 */
>  
> -	timeout_us += local_clock_us(&cpu);
> +	timeout_ns = READ_ONCE(rq->engine->props.max_busywait_duration_ns);
> +	timeout_ns += local_clock_ns(&cpu);
>  	do {
>  		if (i915_request_completed(rq))
>  			return true;
> @@ -1452,7 +1453,7 @@ static bool __i915_spin_request(const struct i915_request * const rq,
>  		if (signal_pending_state(state, current))
>  			break;
>  
> -		if (busywait_stop(timeout_us, cpu))
> +		if (busywait_stop(timeout_ns, cpu))
>  			break;
>  
>  		cpu_relax();
> @@ -1538,8 +1539,8 @@ long i915_request_wait(struct i915_request *rq,
>  	 * completion. That requires having a good predictor for the request
>  	 * duration, which we currently lack.
>  	 */
> -	if (IS_ACTIVE(CONFIG_DRM_I915_SPIN_REQUEST) &&
> -	    __i915_spin_request(rq, state, CONFIG_DRM_I915_SPIN_REQUEST)) {
> +	if (IS_ACTIVE(CONFIG_DRM_I915_MAX_REQUEST_BUSYWAIT) &&
> +	    __i915_spin_request(rq, state)) {
>  		dma_fence_signal(&rq->fence);
>  		goto out;
>  	}
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [Intel-gfx] [30/33] drm/i915/gt: Expose reset stop timeout via sysfs
  2019-12-12 14:04 ` [Intel-gfx] [PATCH 30/33] drm/i915/gt: Expose reset stop timeout via sysfs Chris Wilson
@ 2020-01-22 21:34   ` Steve Carbonari
  0 siblings, 0 replies; 46+ messages in thread
From: Steve Carbonari @ 2020-01-22 21:34 UTC (permalink / raw)
  To: intel-gfx

On Thu, Dec 12, 2019 at 02:04:56PM +0000, Chris Wilson wrote:
> When we allow ourselves to sleep before a GPU reset after disabling
> submission, even for a few milliseconds, gives an innocent context the
> opportunity to clear the GPU before the reset occurs. However, how long
> to sleep depends on the typical non-preemptible duration (a similar
> problem to determining the ideal preempt-reset timeout or even the
> heartbeat interval). As this seems of a hard policy decision, punt it to
> userspace.
> 
> The timeout can be adjusted using
> 
> 	/sys/class/drm/card?/engine/*/stop_timeout_ms
> 
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> Cc: Jon Bloomfield <jon.bloomfield@intel.com>

Code looks good.
Tested with other sysfs related patches in the series.
The stop_timeout_ms file exists and can be modified.

Reviewed-by: Steve Carbonari <steven.carbonari@intel.com>
Tested-by: Steve Carbonari <steven.carbonari@intel.com

> ---
>  drivers/gpu/drm/i915/Kconfig.profile         |  3 ++
>  drivers/gpu/drm/i915/gt/intel_engine_sysfs.c | 40 ++++++++++++++++++++
>  2 files changed, 43 insertions(+)
> 
> diff --git a/drivers/gpu/drm/i915/Kconfig.profile b/drivers/gpu/drm/i915/Kconfig.profile
> index 9ee3b59685b9..5f4ec3aec1d2 100644
> --- a/drivers/gpu/drm/i915/Kconfig.profile
> +++ b/drivers/gpu/drm/i915/Kconfig.profile
> @@ -63,6 +63,9 @@ config DRM_I915_STOP_TIMEOUT
>  	  that the reset itself may take longer and so be more disruptive to
>  	  interactive or low latency workloads.
>  
> +	  This is adjustable via
> +	  /sys/class/drm/card?/engine/*/stop_timeout_ms
> +
>  config DRM_I915_TIMESLICE_DURATION
>  	int "Scheduling quantum for userspace batches (ms, jiffy granularity)"
>  	default 1 # milliseconds
> diff --git a/drivers/gpu/drm/i915/gt/intel_engine_sysfs.c b/drivers/gpu/drm/i915/gt/intel_engine_sysfs.c
> index 6d87529c64a7..2b65fed76435 100644
> --- a/drivers/gpu/drm/i915/gt/intel_engine_sysfs.c
> +++ b/drivers/gpu/drm/i915/gt/intel_engine_sysfs.c
> @@ -232,6 +232,45 @@ timeslice_show(struct kobject *kobj, struct kobj_attribute *attr, char *buf)
>  static struct kobj_attribute timeslice_duration_attr =
>  __ATTR(timeslice_duration_ms, 0644, timeslice_show, timeslice_store);
>  
> +static ssize_t
> +stop_store(struct kobject *kobj, struct kobj_attribute *attr,
> +	   const char *buf, size_t count)
> +{
> +	struct intel_engine_cs *engine = kobj_to_engine(kobj);
> +	unsigned long long duration;
> +	int err;
> +
> +	/*
> +	 * When we allow ourselves to sleep before a GPU reset after disabling
> +	 * submission, even for a few milliseconds, gives an innocent context
> +	 * the opportunity to clear the GPU before the reset occurs. However,
> +	 * how long to sleep depends on the typical non-preemptible duration
> +	 * (a similar problem to determining the ideal preempt-reset timeout
> +	 * or even the heartbeat interval).
> +	 */
> +
> +	err = kstrtoull(buf, 0, &duration);
> +	if (err)
> +		return err;
> +
> +	if (duration > jiffies_to_msecs(MAX_SCHEDULE_TIMEOUT))
> +		return -EINVAL;
> +
> +	WRITE_ONCE(engine->props.stop_timeout_ms, duration);
> +	return count;
> +}
> +
> +static ssize_t
> +stop_show(struct kobject *kobj, struct kobj_attribute *attr, char *buf)
> +{
> +	struct intel_engine_cs *engine = kobj_to_engine(kobj);
> +
> +	return sprintf(buf, "%lu\n", engine->props.stop_timeout_ms);
> +}
> +
> +static struct kobj_attribute stop_timeout_attr =
> +__ATTR(stop_timeout_ms, 0644, stop_show, stop_store);
> +
>  static void kobj_engine_release(struct kobject *kobj)
>  {
>  	kfree(kobj);
> @@ -273,6 +312,7 @@ void intel_engines_add_sysfs(struct drm_i915_private *i915)
>  		&caps_attr.attr,
>  		&all_caps_attr.attr,
>  		&max_spin_attr.attr,
> +		&stop_timeout_attr.attr,
>  		NULL
>  	};
>  
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [Intel-gfx] [31/33] drm/i915/gt: Expose preempt reset timeout via sysfs
  2019-12-12 14:04 ` [Intel-gfx] [PATCH 31/33] drm/i915/gt: Expose preempt reset " Chris Wilson
@ 2020-01-22 21:36   ` Steve Carbonari
  0 siblings, 0 replies; 46+ messages in thread
From: Steve Carbonari @ 2020-01-22 21:36 UTC (permalink / raw)
  To: intel-gfx

On Thu, Dec 12, 2019 at 02:04:57PM +0000, Chris Wilson wrote:
> After initialising a preemption request, we give the current resident a
> small amount of time to vacate the GPU. The preemption request is for a
> higher priority context and should be immediate to maintain high
> quality of service (and avoid priority inversion). However, the
> preemption granularity of the GPU can be quite coarse and so we need a
> compromise.
> 
> The preempt timeout can be adjusted per-engine using,
> 
> 	/sys/class/drm/card?/engine/*/preempt_timeout_ms
> 
> and can be disabled by setting it to 0.
> 
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>

Code looks good.
Tested with other sysfs related patches in the series.
The preempt_timeout_ms file exists and can be modified.
Tested when there is no reset engine and the sysfs file
is not created.

Reviewed-by: Steve Carbonari <steven.carbonari@intel.com>
Tested-by: Steve Carbonari <steven.carbonari@intel.com


> ---
>  drivers/gpu/drm/i915/Kconfig.profile         |  3 ++
>  drivers/gpu/drm/i915/gt/intel_engine_sysfs.c | 48 ++++++++++++++++++++
>  drivers/gpu/drm/i915/gt/intel_lrc.c          |  2 +-
>  3 files changed, 52 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/i915/Kconfig.profile b/drivers/gpu/drm/i915/Kconfig.profile
> index 5f4ec3aec1d2..1f4e98a8532f 100644
> --- a/drivers/gpu/drm/i915/Kconfig.profile
> +++ b/drivers/gpu/drm/i915/Kconfig.profile
> @@ -33,6 +33,9 @@ config DRM_I915_PREEMPT_TIMEOUT
>  	  expires, the HW will be reset to allow the more important context
>  	  to execute.
>  
> +	  This is adjustable via
> +	  /sys/class/drm/card?/engine/*/preempt_timeout_ms
> +
>  	  May be 0 to disable the timeout.
>  
>  config DRM_I915_MAX_REQUEST_BUSYWAIT
> diff --git a/drivers/gpu/drm/i915/gt/intel_engine_sysfs.c b/drivers/gpu/drm/i915/gt/intel_engine_sysfs.c
> index 2b65fed76435..d299c66cf7ec 100644
> --- a/drivers/gpu/drm/i915/gt/intel_engine_sysfs.c
> +++ b/drivers/gpu/drm/i915/gt/intel_engine_sysfs.c
> @@ -271,6 +271,50 @@ stop_show(struct kobject *kobj, struct kobj_attribute *attr, char *buf)
>  static struct kobj_attribute stop_timeout_attr =
>  __ATTR(stop_timeout_ms, 0644, stop_show, stop_store);
>  
> +static ssize_t
> +preempt_timeout_store(struct kobject *kobj, struct kobj_attribute *attr,
> +		      const char *buf, size_t count)
> +{
> +	struct intel_engine_cs *engine = kobj_to_engine(kobj);
> +	unsigned long long timeout;
> +	int err;
> +
> +	/*
> +	 * After initialising a preemption request, we give the current
> +	 * resident a small amount of time to vacate the GPU. The preemption
> +	 * request is for a higher priority context and should be immediate to
> +	 * maintain high quality of service (and avoid priority inversion).
> +	 * However, the preemption granularity of the GPU can be quite coarse
> +	 * and so we need a compromise.
> +	 */
> +
> +	err = kstrtoull(buf, 0, &timeout);
> +	if (err)
> +		return err;
> +
> +	if (timeout > jiffies_to_msecs(MAX_SCHEDULE_TIMEOUT))
> +		return -EINVAL;
> +
> +	WRITE_ONCE(engine->props.preempt_timeout_ms, timeout);
> +
> +	if (READ_ONCE(engine->execlists.pending[0]))
> +		set_timer_ms(&engine->execlists.preempt, timeout);
> +
> +	return count;
> +}
> +
> +static ssize_t
> +preempt_timeout_show(struct kobject *kobj, struct kobj_attribute *attr,
> +		     char *buf)
> +{
> +	struct intel_engine_cs *engine = kobj_to_engine(kobj);
> +
> +	return sprintf(buf, "%lu\n", engine->props.preempt_timeout_ms);
> +}
> +
> +static struct kobj_attribute preempt_timeout_attr =
> +__ATTR(preempt_timeout_ms, 0644, preempt_timeout_show, preempt_timeout_store);
> +
>  static void kobj_engine_release(struct kobject *kobj)
>  {
>  	kfree(kobj);
> @@ -341,6 +385,10 @@ void intel_engines_add_sysfs(struct drm_i915_private *i915)
>  		    sysfs_create_file(kobj, &timeslice_duration_attr.attr))
>  			goto err_engine;
>  
> +		if (intel_engine_has_preempt_reset(engine) &&
> +		    sysfs_create_file(kobj, &preempt_timeout_attr.attr))
> +			goto err_engine;
> +
>  		if (0) {
>  err_object:
>  			kobject_put(kobj);
> diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
> index 024a639da787..9a9b23e3d0a3 100644
> --- a/drivers/gpu/drm/i915/gt/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
> @@ -2252,7 +2252,7 @@ static noinline void preempt_reset(struct intel_engine_cs *engine)
>  	const unsigned int bit = I915_RESET_ENGINE + engine->id;
>  	unsigned long *lock = &engine->gt->reset.flags;
>  
> -	if (i915_modparams.reset < 3)
> +	if (!intel_has_reset_engine(engine->gt))
>  		return;
>  
>  	if (test_and_set_bit(bit, lock))
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [Intel-gfx] [32/33] drm/i915/gt: Expose heartbeat interval via sysfs
  2019-12-12 14:04 ` [Intel-gfx] [PATCH 32/33] drm/i915/gt: Expose heartbeat interval " Chris Wilson
@ 2020-01-22 21:38   ` Steve Carbonari
  0 siblings, 0 replies; 46+ messages in thread
From: Steve Carbonari @ 2020-01-22 21:38 UTC (permalink / raw)
  To: intel-gfx

On Thu, Dec 12, 2019 at 02:04:58PM +0000, Chris Wilson wrote:
> We monitor the health of the system via periodic heartbeat pulses. The
> pulses also provide the opportunity to perform garbage collection.
> However, we interpret an incomplete pulse (a missed heartbeat) as an
> indication that the system is no longer responsive, i.e. hung, and
> perform an engine or full GPU reset. Given that the preemption
> granularity can be very coarse on a system, we let the sysadmin override
> our legacy timeouts which were "optimised" for desktop applications.
> 
> The heartbeat interval can be adjusted per-engine using,
> 
> 	/sys/class/drm/card?/engine/*/heartbeat_interval_ms
> 
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>

Code looks good.
Performed light testing with other sysfs related patches in the series.
The heartbeat_interval_ms file exists and can be modified.

Reviewed-by: Steve Carbonari <steven.carbonari@intel.com>
Tested-by: Steve Carbonari <steven.carbonari@intel.com


> ---
>  drivers/gpu/drm/i915/Kconfig.profile         |  3 ++
>  drivers/gpu/drm/i915/gt/intel_engine_sysfs.c | 47 ++++++++++++++++++++
>  2 files changed, 50 insertions(+)
> 
> diff --git a/drivers/gpu/drm/i915/Kconfig.profile b/drivers/gpu/drm/i915/Kconfig.profile
> index 1f4e98a8532f..ba8767fc0d6e 100644
> --- a/drivers/gpu/drm/i915/Kconfig.profile
> +++ b/drivers/gpu/drm/i915/Kconfig.profile
> @@ -20,6 +20,9 @@ config DRM_I915_HEARTBEAT_INTERVAL
>  	  check the health of the GPU and undertake regular house-keeping of
>  	  internal driver state.
>  
> +	  This is adjustable via
> +	  /sys/class/drm/card?/engine/*/heartbeat_interval_ms
> +
>  	  May be 0 to disable heartbeats and therefore disable automatic GPU
>  	  hang detection.
>  
> diff --git a/drivers/gpu/drm/i915/gt/intel_engine_sysfs.c b/drivers/gpu/drm/i915/gt/intel_engine_sysfs.c
> index d299c66cf7ec..33b4c00b93f2 100644
> --- a/drivers/gpu/drm/i915/gt/intel_engine_sysfs.c
> +++ b/drivers/gpu/drm/i915/gt/intel_engine_sysfs.c
> @@ -9,6 +9,7 @@
>  
>  #include "i915_drv.h"
>  #include "intel_engine.h"
> +#include "intel_engine_heartbeat.h"
>  #include "intel_engine_sysfs.h"
>  
>  struct kobj_engine {
> @@ -315,6 +316,49 @@ preempt_timeout_show(struct kobject *kobj, struct kobj_attribute *attr,
>  static struct kobj_attribute preempt_timeout_attr =
>  __ATTR(preempt_timeout_ms, 0644, preempt_timeout_show, preempt_timeout_store);
>  
> +static ssize_t
> +heartbeat_store(struct kobject *kobj, struct kobj_attribute *attr,
> +		const char *buf, size_t count)
> +{
> +	struct intel_engine_cs *engine = kobj_to_engine(kobj);
> +	unsigned long long delay;
> +	int err;
> +
> +	/*
> +	 * We monitor the health of the system via periodic heartbeat pulses.
> +	 * The pulses also provide the opportunity to perform garbage
> +	 * collection.  However, we interpret an incomplete pulse (a missed
> +	 * heartbeat) as an indication that the system is no longer responsive,
> +	 * i.e. hung, and perform an engine or full GPU reset. Given that the
> +	 * preemption granularity can be very coarse on a system, the optimal
> +	 * value for any workload is unknowable!
> +	 */
> +
> +	err = kstrtoull(buf, 0, &delay);
> +	if (err)
> +		return err;
> +
> +	if (delay >= jiffies_to_msecs(MAX_SCHEDULE_TIMEOUT))
> +		return -EINVAL;
> +
> +	err = intel_engine_set_heartbeat(engine, delay);
> +	if (err)
> +		return err;
> +
> +	return count;
> +}
> +
> +static ssize_t
> +heartbeat_show(struct kobject *kobj, struct kobj_attribute *attr, char *buf)
> +{
> +	struct intel_engine_cs *engine = kobj_to_engine(kobj);
> +
> +	return sprintf(buf, "%lu\n", engine->props.heartbeat_interval_ms);
> +}
> +
> +static struct kobj_attribute heartbeat_interval_attr =
> +__ATTR(heartbeat_interval_ms, 0644, heartbeat_show, heartbeat_store);
> +
>  static void kobj_engine_release(struct kobject *kobj)
>  {
>  	kfree(kobj);
> @@ -357,6 +401,9 @@ void intel_engines_add_sysfs(struct drm_i915_private *i915)
>  		&all_caps_attr.attr,
>  		&max_spin_attr.attr,
>  		&stop_timeout_attr.attr,
> +#if CONFIG_DRM_I915_HEARTBEAT_INTERVAL
> +		&heartbeat_interval_attr.attr,
> +#endif
>  		NULL
>  	};
>  
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 46+ messages in thread

end of thread, other threads:[~2020-01-22 21:45 UTC | newest]

Thread overview: 46+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-12-12 14:04 [Intel-gfx] [PATCH 01/33] drm/i915: Use EAGAIN for trylock failures Chris Wilson
2019-12-12 14:04 ` [Intel-gfx] [PATCH 02/33] drm/i915: Eliminate the trylock for awaiting an earlier request Chris Wilson
2019-12-12 14:04 ` [Intel-gfx] [PATCH 03/33] drm/i915/gt: Eliminate the trylock for reading a timeline's hwsp Chris Wilson
2019-12-12 14:04 ` [Intel-gfx] [PATCH 04/33] drm/i915/uc: Ignore maybe-unused debug-only i915 local Chris Wilson
2019-12-12 14:04 ` [Intel-gfx] [PATCH 05/33] drm/i915/gem: Prepare gen7 cmdparser for async execution Chris Wilson
2019-12-12 14:04 ` [Intel-gfx] [PATCH 06/33] drm/i915/gem: Asynchronous cmdparser Chris Wilson
2019-12-12 14:04 ` [Intel-gfx] [PATCH 07/33] drm/i915/gt: Mark up ips_mchdev pointer access Chris Wilson
2019-12-12 16:39   ` Venkata Sandeep Dhanalakota
2019-12-12 14:04 ` [Intel-gfx] [PATCH 08/33] drm/i915/gt: Mark context->state vma as active while pinned Chris Wilson
2019-12-12 19:55   ` Lionel Landwerlin
2019-12-12 21:35     ` Lionel Landwerlin
2019-12-12 21:46       ` Chris Wilson
2019-12-12 14:04 ` [Intel-gfx] [PATCH 09/33] drm/i915/gt: Mark ring->vma " Chris Wilson
2019-12-12 19:55   ` Lionel Landwerlin
2019-12-12 14:04 ` [Intel-gfx] [PATCH 10/33] drm/i915/selftests: Disable heartbeats around long queues Chris Wilson
2019-12-12 14:04 ` [Intel-gfx] [PATCH 11/33] drm/i915/selftests: Impose a timeout for request submission Chris Wilson
2019-12-12 14:04 ` [Intel-gfx] [PATCH 12/33] drm/i915/gt: Remove direct invocation of breadcrumb signaling Chris Wilson
2019-12-12 14:04 ` [Intel-gfx] [PATCH 13/33] drm/i915: Tweak scheduler's kick_submission() Chris Wilson
2019-12-12 14:04 ` [Intel-gfx] [PATCH 14/33] drm/i915: Flush idle barriers when waiting Chris Wilson
2019-12-12 14:04 ` [Intel-gfx] [PATCH 15/33] drm/i915: Allow userspace to specify ringsize on construction Chris Wilson
2019-12-12 14:04 ` [Intel-gfx] [PATCH 16/33] drm/i915/gem: Honour O_NONBLOCK before throttling execbuf submissions Chris Wilson
2019-12-12 14:04 ` [Intel-gfx] [PATCH 17/33] drm/i915/gt: Teach veng to defer the context allocation Chris Wilson
2019-12-12 14:04 ` [Intel-gfx] [PATCH 18/33] drm/i915: Drop GEM context as a direct link from i915_request Chris Wilson
2019-12-12 14:04 ` [Intel-gfx] [PATCH 19/33] drm/i915: Push the use-semaphore marker onto the intel_context Chris Wilson
2019-12-12 14:04 ` [Intel-gfx] [PATCH 20/33] drm/i915: Remove i915->kernel_context Chris Wilson
2019-12-12 14:04 ` [Intel-gfx] [PATCH 21/33] drm/i915: Move i915_gem_init_contexts() earlier Chris Wilson
2019-12-12 14:04 ` [Intel-gfx] [PATCH 22/33] drm/i915/uc: Use an internal buffer for firmware images Chris Wilson
2019-12-12 14:04 ` [Intel-gfx] [PATCH 23/33] drm/i915/gt: Pull GT initialisation under intel_gt_init() Chris Wilson
2019-12-12 14:04 ` [Intel-gfx] [PATCH 24/33] drm/i915/gt: Pull intel_gt_init_hw() into intel_gt_resume() Chris Wilson
2019-12-12 14:04 ` [Intel-gfx] [PATCH 25/33] drm/i915/gt: Merge engine init/setup loops Chris Wilson
2019-12-12 14:04 ` [Intel-gfx] [PATCH 26/33] drm/i915/gt: Expose engine properties via sysfs Chris Wilson
2020-01-22 21:13   ` [Intel-gfx] [26/33] " Steve Carbonari
2019-12-12 14:04 ` [Intel-gfx] [PATCH 27/33] drm/i915/gt: Expose engine->mmio_base " Chris Wilson
2020-01-22 21:18   ` [Intel-gfx] [27/33] " Steve Carbonari
2019-12-12 14:04 ` [Intel-gfx] [PATCH 28/33] drm/i915/gt: Expose timeslice duration to sysfs Chris Wilson
2020-01-22 21:25   ` [Intel-gfx] [28/33] " Steve Carbonari
2019-12-12 14:04 ` [Intel-gfx] [PATCH 29/33] drm/i915/gt: Expose busywait " Chris Wilson
2020-01-22 21:29   ` [Intel-gfx] [29/33] " Steve Carbonari
2019-12-12 14:04 ` [Intel-gfx] [PATCH 30/33] drm/i915/gt: Expose reset stop timeout via sysfs Chris Wilson
2020-01-22 21:34   ` [Intel-gfx] [30/33] " Steve Carbonari
2019-12-12 14:04 ` [Intel-gfx] [PATCH 31/33] drm/i915/gt: Expose preempt reset " Chris Wilson
2020-01-22 21:36   ` [Intel-gfx] [31/33] " Steve Carbonari
2019-12-12 14:04 ` [Intel-gfx] [PATCH 32/33] drm/i915/gt: Expose heartbeat interval " Chris Wilson
2020-01-22 21:38   ` [Intel-gfx] [32/33] " Steve Carbonari
2019-12-12 14:04 ` [Intel-gfx] [PATCH 33/33] HAX: Use aliasing-ppgtt for gen7 Chris Wilson
2019-12-12 18:09 ` [Intel-gfx] ✗ Fi.CI.BUILD: failure for series starting with [01/33] drm/i915: Use EAGAIN for trylock failures Patchwork

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.