* [PATCH v10] drm/i915: Extend LRC pinning to cover GPU context writeback
@ 2016-01-13 16:19 Nick Hoath
  2016-01-13 19:28 ` Yu Dai
                   ` (2 more replies)
  0 siblings, 3 replies; 17+ messages in thread
From: Nick Hoath @ 2016-01-13 16:19 UTC (permalink / raw)
  To: intel-gfx; +Cc: Daniel Vetter

Use the first retired request on a new context to unpin
the old context. This ensures that the hw context remains
bound until it has been written back to by the GPU.
Now that the context is pinned until later in the request/context
lifecycle, it no longer needs to be pinned from context_queue to
retire_requests.
This fixes an issue with GuC submission where the GPU might not
have finished writing back the context before it is unpinned,
which results in a GPU hang.

v2: Moved the new pin to cover GuC submission (Alex Dai)
    Moved the new unpin to request_retire to fix coverage leak
v3: Added switch to default context if freeing a still pinned
    context just in case the hw was actually still using it
v4: Unwrapped context unpin to allow calling without a request
v5: Only create a switch to idle context if the ring doesn't
    already have a request pending on it (Alex Dai)
    Rename unsaved to dirty to avoid double negatives (Dave Gordon)
    Changed _no_req postfix to __ prefix for consistency (Dave Gordon)
    Split out per engine cleanup from context_free as it
    was getting unwieldy
    Corrected locking (Dave Gordon)
v6: Removed some bikeshedding (Mika Kuoppala)
    Added explanation of the GuC hang that this fixes (Daniel Vetter)
v7: Removed extra per request pinning from ring reset code (Alex Dai)
    Added forced ring unpin/clean in error case in context free (Alex Dai)
v8: Renamed lrc specific last_context to lrc_last_context as there
    were some reset cases where the codepaths leaked (Mika Kuoppala)
    NULL'd last_context in reset case - there was a pointer leak
    if someone did reset->close context.
v9: Rebase over "Fix context/engine cleanup order"
v10: Rebase over nightly, remove WARN_ON which caused the
    dependency on dev.

Signed-off-by: Nick Hoath <nicholas.hoath@intel.com>
Issue: VIZ-4277
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: David Gordon <david.s.gordon@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Alex Dai <yu.dai@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
---
 drivers/gpu/drm/i915/i915_drv.h         |   1 +
 drivers/gpu/drm/i915/i915_gem.c         |   3 +
 drivers/gpu/drm/i915/intel_lrc.c        | 138 ++++++++++++++++++++++++++------
 drivers/gpu/drm/i915/intel_lrc.h        |   1 +
 drivers/gpu/drm/i915/intel_ringbuffer.h |   1 +
 5 files changed, 121 insertions(+), 23 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 104bd18..d28e10a 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -882,6 +882,7 @@ struct intel_context {
 	struct {
 		struct drm_i915_gem_object *state;
 		struct intel_ringbuffer *ringbuf;
+		bool dirty;
 		int pin_count;
 	} engine[I915_NUM_RINGS];
 
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index ddc21d4..7b79405 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -1413,6 +1413,9 @@ static void i915_gem_request_retire(struct drm_i915_gem_request *request)
 {
 	trace_i915_gem_request_retire(request);
 
+	if (i915.enable_execlists)
+		intel_lr_context_complete_check(request);
+
 	/* We know the GPU must have read the request to have
 	 * sent us the seqno + interrupt, so use the position
 	 * of tail of the request to update the last known position
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 5027699..b661058 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -585,9 +585,6 @@ static int execlists_context_queue(struct drm_i915_gem_request *request)
 	struct drm_i915_gem_request *cursor;
 	int num_elements = 0;
 
-	if (request->ctx != ring->default_context)
-		intel_lr_context_pin(request);
-
 	i915_gem_request_reference(request);
 
 	spin_lock_irq(&ring->execlist_lock);
@@ -763,6 +760,13 @@ intel_logical_ring_advance_and_submit(struct drm_i915_gem_request *request)
 	if (intel_ring_stopped(ring))
 		return;
 
+	if (request->ctx != ring->default_context) {
+		if (!request->ctx->engine[ring->id].dirty) {
+			intel_lr_context_pin(request);
+			request->ctx->engine[ring->id].dirty = true;
+		}
+	}
+
 	if (dev_priv->guc.execbuf_client)
 		i915_guc_submit(dev_priv->guc.execbuf_client, request);
 	else
@@ -989,12 +993,6 @@ void intel_execlists_retire_requests(struct intel_engine_cs *ring)
 	spin_unlock_irq(&ring->execlist_lock);
 
 	list_for_each_entry_safe(req, tmp, &retired_list, execlist_link) {
-		struct intel_context *ctx = req->ctx;
-		struct drm_i915_gem_object *ctx_obj =
-				ctx->engine[ring->id].state;
-
-		if (ctx_obj && (ctx != ring->default_context))
-			intel_lr_context_unpin(req);
 		list_del(&req->execlist_link);
 		i915_gem_request_unreference(req);
 	}
@@ -1089,21 +1087,39 @@ reset_pin_count:
 	return ret;
 }
 
-void intel_lr_context_unpin(struct drm_i915_gem_request *rq)
+static void __intel_lr_context_unpin(struct intel_engine_cs *ring,
+		struct intel_context *ctx)
 {
-	struct intel_engine_cs *ring = rq->ring;
-	struct drm_i915_gem_object *ctx_obj = rq->ctx->engine[ring->id].state;
-	struct intel_ringbuffer *ringbuf = rq->ringbuf;
-
+	struct drm_i915_gem_object *ctx_obj = ctx->engine[ring->id].state;
+	struct intel_ringbuffer *ringbuf = ctx->engine[ring->id].ringbuf;
 	if (ctx_obj) {
 		WARN_ON(!mutex_is_locked(&ring->dev->struct_mutex));
-		if (--rq->ctx->engine[ring->id].pin_count == 0) {
+		if (--ctx->engine[ring->id].pin_count == 0) {
 			intel_unpin_ringbuffer_obj(ringbuf);
 			i915_gem_object_ggtt_unpin(ctx_obj);
 		}
 	}
 }
 
+void intel_lr_context_unpin(struct drm_i915_gem_request *rq)
+{
+	__intel_lr_context_unpin(rq->ring, rq->ctx);
+}
+
+void intel_lr_context_complete_check(struct drm_i915_gem_request *req)
+{
+	struct intel_engine_cs *ring = req->ring;
+
+	if (ring->lrc_last_context && ring->lrc_last_context != req->ctx &&
+			ring->lrc_last_context->engine[ring->id].dirty) {
+		__intel_lr_context_unpin(
+				ring,
+				ring->lrc_last_context);
+		ring->lrc_last_context->engine[ring->id].dirty = false;
+	}
+	ring->lrc_last_context = req->ctx;
+}
+
 static int intel_logical_ring_workarounds_emit(struct drm_i915_gem_request *req)
 {
 	int ret, i;
@@ -2347,6 +2363,74 @@ populate_lr_context(struct intel_context *ctx, struct drm_i915_gem_object *ctx_o
 }
 
 /**
+ * intel_lr_context_clean_ring() - clean the ring specific parts of an LRC
+ * @ctx: the LR context being freed.
+ * @ring: the engine being cleaned
+ * @ctx_obj: the hw context being unreferenced
+ * @ringbuf: the ringbuf being freed
+ *
+ * Take care of cleaning up the per-engine backing
+ * objects and the logical ringbuffer.
+ */
+static void
+intel_lr_context_clean_ring(struct intel_context *ctx,
+			    struct intel_engine_cs *ring,
+			    struct drm_i915_gem_object *ctx_obj,
+			    struct intel_ringbuffer *ringbuf)
+{
+	int ret;
+
+	if (ctx == ring->default_context) {
+		intel_unpin_ringbuffer_obj(ringbuf);
+		i915_gem_object_ggtt_unpin(ctx_obj);
+	}
+
+	if (ctx->engine[ring->id].dirty) {
+		struct drm_i915_gem_request *req = NULL;
+
+		/**
+		 * If there is already a request pending on
+		 * this ring, wait for that to complete,
+		 * otherwise create a switch to idle request
+		 */
+		if (list_empty(&ring->request_list)) {
+			int ret;
+
+			ret = i915_gem_request_alloc(
+					ring,
+					ring->default_context,
+					&req);
+			if (!ret)
+				i915_add_request(req);
+			else
+				DRM_DEBUG("Failed to ensure context saved");
+		} else {
+			req = list_first_entry(
+					&ring->request_list,
+					typeof(*req), list);
+		}
+		if (req) {
+			ret = i915_wait_request(req);
+			if (ret != 0) {
+				/**
+				 * If we get here, there's probably been a ring
+				 * reset, so we just clean up the dirty flag &
+				 * pin count.
+				 */
+				ctx->engine[ring->id].dirty = false;
+				__intel_lr_context_unpin(
+					ring,
+					ctx);
+			}
+		}
+	}
+
+	WARN_ON(ctx->engine[ring->id].pin_count);
+	intel_ringbuffer_free(ringbuf);
+	drm_gem_object_unreference(&ctx_obj->base);
+}
+
+/**
  * intel_lr_context_free() - free the LRC specific bits of a context
  * @ctx: the LR context to free.
  *
@@ -2358,7 +2444,7 @@ void intel_lr_context_free(struct intel_context *ctx)
 {
 	int i;
 
-	for (i = 0; i < I915_NUM_RINGS; i++) {
+	for (i = 0; i < I915_NUM_RINGS; ++i) {
 		struct drm_i915_gem_object *ctx_obj = ctx->engine[i].state;
 
 		if (ctx_obj) {
@@ -2366,13 +2452,10 @@ void intel_lr_context_free(struct intel_context *ctx)
 					ctx->engine[i].ringbuf;
 			struct intel_engine_cs *ring = ringbuf->ring;
 
-			if (ctx == ring->default_context) {
-				intel_unpin_ringbuffer_obj(ringbuf);
-				i915_gem_object_ggtt_unpin(ctx_obj);
-			}
-			WARN_ON(ctx->engine[ring->id].pin_count);
-			intel_ringbuffer_free(ringbuf);
-			drm_gem_object_unreference(&ctx_obj->base);
+			intel_lr_context_clean_ring(ctx,
+						    ring,
+						    ctx_obj,
+						    ringbuf);
 		}
 	}
 }
@@ -2548,5 +2631,14 @@ void intel_lr_context_reset(struct drm_device *dev,
 
 		ringbuf->head = 0;
 		ringbuf->tail = 0;
+
+		if (ctx->engine[ring->id].dirty) {
+			__intel_lr_context_unpin(
+					ring,
+					ctx);
+			ctx->engine[ring->id].dirty = false;
+			if (ring->lrc_last_context == ctx)
+				ring->lrc_last_context = NULL;
+		}
 	}
 }
diff --git a/drivers/gpu/drm/i915/intel_lrc.h b/drivers/gpu/drm/i915/intel_lrc.h
index de41ad6..48690f79 100644
--- a/drivers/gpu/drm/i915/intel_lrc.h
+++ b/drivers/gpu/drm/i915/intel_lrc.h
@@ -106,6 +106,7 @@ void intel_lr_context_reset(struct drm_device *dev,
 			struct intel_context *ctx);
 uint64_t intel_lr_context_descriptor(struct intel_context *ctx,
 				     struct intel_engine_cs *ring);
+void intel_lr_context_complete_check(struct drm_i915_gem_request *req);
 
 /* Execlists */
 int intel_sanitize_enable_execlists(struct drm_device *dev, int enable_execlists);
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index 7349d92..fb1df83 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -276,6 +276,7 @@ struct  intel_engine_cs {
 				      u32 flush_domains);
 	int		(*emit_bb_start)(struct drm_i915_gem_request *req,
 					 u64 offset, unsigned dispatch_flags);
+	struct intel_context *lrc_last_context;
 
 	/**
 	 * List of objects currently involved in rendering from the
-- 
1.9.1
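
In outline, the lifecycle the patch establishes (a condensed sketch of
the hunks above; all names are taken from the patch itself):

/*
 * Submit (intel_logical_ring_advance_and_submit): the first submission
 * of a non-default context on an engine takes an extra pin and marks
 * the context dirty:
 *
 *	if (!request->ctx->engine[ring->id].dirty) {
 *		intel_lr_context_pin(request);
 *		request->ctx->engine[ring->id].dirty = true;
 *	}
 *
 * Retire (i915_gem_request_retire -> intel_lr_context_complete_check):
 * the first retired request on a *different* context proves the hw has
 * switched away from, and written back, the old image, so only then is
 * the old context unpinned:
 *
 *	__intel_lr_context_unpin(ring, ring->lrc_last_context);
 *	ring->lrc_last_context->engine[ring->id].dirty = false;
 *	ring->lrc_last_context = req->ctx;
 */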



* Re: [PATCH v10] drm/i915: Extend LRC pinning to cover GPU context writeback
  2016-01-13 16:19 [PATCH v10] drm/i915: Extend LRC pinning to cover GPU context writeback Nick Hoath
@ 2016-01-13 19:28 ` Yu Dai
  2016-01-14  7:20 ` ✗ failure: Fi.CI.BAT Patchwork
  2016-01-14 11:36 ` [PATCH v10] drm/i915: Extend LRC pinning to cover GPU context writeback Chris Wilson
  2 siblings, 0 replies; 17+ messages in thread
From: Yu Dai @ 2016-01-13 19:28 UTC (permalink / raw)
  To: Nick Hoath, intel-gfx; +Cc: Daniel Vetter

This version resolved the issue (kernel bug check in 
intel_lr_context_clean_ring) I reported on previous versions. Verified 
by igt drv_module_reload_basic, gem_close_race and -t basic tests.

Reviewed-by: Alex Dai <yu.dai@intel.com>

On 01/13/2016 08:19 AM, Nick Hoath wrote:
> [full v10 patch quoted - snipped]


* ✗ failure: Fi.CI.BAT
  2016-01-13 16:19 [PATCH v10] drm/i915: Extend LRC pinning to cover GPU context writeback Nick Hoath
  2016-01-13 19:28 ` Yu Dai
@ 2016-01-14  7:20 ` Patchwork
  2016-01-14 11:31   ` Nick Hoath
  2016-01-14 11:36 ` [PATCH v10] drm/i915: Extend LRC pinning to cover GPU context writeback Chris Wilson
  2 siblings, 1 reply; 17+ messages in thread
From: Patchwork @ 2016-01-14  7:20 UTC (permalink / raw)
  To: Nick Hoath; +Cc: intel-gfx

== Summary ==

Built on 058740f8fced6851aeda34f366f5330322cd585f drm-intel-nightly: 2016y-01m-13d-17h-07m-44s UTC integration manifest

Test gem_ctx_basic:
                pass       -> FAIL       (bdw-ultra)
Test gem_ctx_param_basic:
        Subgroup non-root-set:
                pass       -> DMESG-WARN (bsw-nuc-2)
Test kms_flip:
        Subgroup basic-flip-vs-dpms:
                pass       -> SKIP       (bsw-nuc-2)
                dmesg-warn -> PASS       (ilk-hp8440p)

bdw-nuci7        total:138  pass:128  dwarn:1   dfail:0   fail:0   skip:9  
bdw-ultra        total:138  pass:131  dwarn:0   dfail:0   fail:1   skip:6  
bsw-nuc-2        total:141  pass:113  dwarn:3   dfail:0   fail:0   skip:25 
hsw-brixbox      total:141  pass:134  dwarn:0   dfail:0   fail:0   skip:7  
hsw-gt2          total:141  pass:137  dwarn:0   dfail:0   fail:0   skip:4  
ilk-hp8440p      total:141  pass:101  dwarn:3   dfail:0   fail:0   skip:37 
ivb-t430s        total:135  pass:122  dwarn:3   dfail:4   fail:0   skip:6  
skl-i5k-2        total:141  pass:131  dwarn:2   dfail:0   fail:0   skip:8  
skl-i7k-2        total:141  pass:131  dwarn:2   dfail:0   fail:0   skip:8  
snb-dellxps      total:141  pass:122  dwarn:5   dfail:0   fail:0   skip:14 
snb-x220t        total:141  pass:122  dwarn:5   dfail:0   fail:1   skip:13 

Results at /archive/results/CI_IGT_test/Patchwork_1174/



* Re: ✗ failure: Fi.CI.BAT
  2016-01-14  7:20 ` ✗ failure: Fi.CI.BAT Patchwork
@ 2016-01-14 11:31   ` Nick Hoath
  2016-01-19  9:08     ` Daniel Vetter
  0 siblings, 1 reply; 17+ messages in thread
From: Nick Hoath @ 2016-01-14 11:31 UTC (permalink / raw)
  To: Patchwork; +Cc: intel-gfx

On 14/01/2016 07:20, Patchwork wrote:
> == Summary ==
>
> Built on 058740f8fced6851aeda34f366f5330322cd585f drm-intel-nightly: 2016y-01m-13d-17h-07m-44s UTC integration manifest
>
> Test gem_ctx_basic:
>                  pass       -> FAIL       (bdw-ultra)

Test failed to load - not patch related

> Test gem_ctx_param_basic:
>          Subgroup non-root-set:
>                  pass       -> DMESG-WARN (bsw-nuc-2)

gem driver allocated a poisoned slab - not patch related

> Test kms_flip:
>          Subgroup basic-flip-vs-dpms:
>                  pass       -> SKIP       (bsw-nuc-2)

test reqs not met - not patch related

>                  dmesg-warn -> PASS       (ilk-hp8440p)

warn to PASS

>
> bdw-nuci7        total:138  pass:128  dwarn:1   dfail:0   fail:0   skip:9
> bdw-ultra        total:138  pass:131  dwarn:0   dfail:0   fail:1   skip:6
> bsw-nuc-2        total:141  pass:113  dwarn:3   dfail:0   fail:0   skip:25
> hsw-brixbox      total:141  pass:134  dwarn:0   dfail:0   fail:0   skip:7
> hsw-gt2          total:141  pass:137  dwarn:0   dfail:0   fail:0   skip:4
> ilk-hp8440p      total:141  pass:101  dwarn:3   dfail:0   fail:0   skip:37
> ivb-t430s        total:135  pass:122  dwarn:3   dfail:4   fail:0   skip:6
> skl-i5k-2        total:141  pass:131  dwarn:2   dfail:0   fail:0   skip:8
> skl-i7k-2        total:141  pass:131  dwarn:2   dfail:0   fail:0   skip:8
> snb-dellxps      total:141  pass:122  dwarn:5   dfail:0   fail:0   skip:14
> snb-x220t        total:141  pass:122  dwarn:5   dfail:0   fail:1   skip:13
>
> Results at /archive/results/CI_IGT_test/Patchwork_1174/
>



* Re: [PATCH v10] drm/i915: Extend LRC pinning to cover GPU context writeback
  2016-01-13 16:19 [PATCH v10] drm/i915: Extend LRC pinning to cover GPU context writeback Nick Hoath
  2016-01-13 19:28 ` Yu Dai
  2016-01-14  7:20 ` ✗ failure: Fi.CI.BAT Patchwork
@ 2016-01-14 11:36 ` Chris Wilson
  2016-01-14 11:56   ` Nick Hoath
  2 siblings, 1 reply; 17+ messages in thread
From: Chris Wilson @ 2016-01-14 11:36 UTC (permalink / raw)
  To: Nick Hoath; +Cc: Daniel Vetter, intel-gfx

On Wed, Jan 13, 2016 at 04:19:45PM +0000, Nick Hoath wrote:
> +	if (ctx->engine[ring->id].dirty) {
> +		struct drm_i915_gem_request *req = NULL;
> +
> +		/**
> +		 * If there is already a request pending on
> +		 * this ring, wait for that to complete,
> +		 * otherwise create a switch to idle request
> +		 */
> +		if (list_empty(&ring->request_list)) {
> +			int ret;
> +
> +			ret = i915_gem_request_alloc(
> +					ring,
> +					ring->default_context,
> +					&req);
> +			if (!ret)
> +				i915_add_request(req);
> +			else
> +				DRM_DEBUG("Failed to ensure context saved");
> +		} else {
> +			req = list_first_entry(
> +					&ring->request_list,
> +					typeof(*req), list);
> +		}
> +		if (req) {
> +			ret = i915_wait_request(req);
> +			if (ret != 0) {
> +				/**
> +				 * If we get here, there's probably been a ring
> +				 * reset, so we just clean up the dirty flag &
> +				 * pin count.
> +				 */
> +				ctx->engine[ring->id].dirty = false;
> +				__intel_lr_context_unpin(
> +					ring,
> +					ctx);
> +			}
> +		}

If you were to take a lr_context_pin on the last_context, and only
release that pin when you change to a new context, you do not need to
introduce a blocking context-close, nor do you need to introduce the
usage of default_context.

(lr_context_pin should take a reference on the ctx to prevent early
freeing ofc).

The code at that point starts to look v.v.similar to legacy, right down
to the need to use a GPU reset during shutdown to prevent writing back
the context image. (Which you still currently need to get rid of the
default context now.)
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
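
For concreteness, a minimal sketch of the alternative Chris outlines,
assuming the helpers from the patch above plus the existing
i915_gem_context_reference/unreference; the helper name and its call
site (request retirement, once a request on a different context has
completed) are hypothetical:

static void engine_complete_context_switch(struct intel_engine_cs *ring,
					   struct intel_context *new_ctx)
{
	struct intel_context *old = ring->lrc_last_context;

	if (old == new_ctx)
		return;

	if (old) {
		/* A request on new_ctx has completed, so the hw has
		 * switched away from old and written its image back:
		 * drop the engine's pin and reference.
		 */
		__intel_lr_context_unpin(ring, old);
		i915_gem_context_unreference(old);
	}

	/* With the engine holding its own reference, closing a context
	 * no longer needs to block waiting for the hw.
	 */
	i915_gem_context_reference(new_ctx);
	ring->lrc_last_context = new_ctx;
}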


* Re: [PATCH v10] drm/i915: Extend LRC pinning to cover GPU context writeback
  2016-01-14 11:36 ` [PATCH v10] drm/i915: Extend LRC pinning to cover GPU context writeback Chris Wilson
@ 2016-01-14 11:56   ` Nick Hoath
  2016-01-14 12:31     ` Chris Wilson
  0 siblings, 1 reply; 17+ messages in thread
From: Nick Hoath @ 2016-01-14 11:56 UTC (permalink / raw)
  To: Chris Wilson; +Cc: Daniel Vetter, intel-gfx

On 14/01/2016 11:36, Chris Wilson wrote:
> On Wed, Jan 13, 2016 at 04:19:45PM +0000, Nick Hoath wrote:
>> +	if (ctx->engine[ring->id].dirty) {
>> +		struct drm_i915_gem_request *req = NULL;
>> +
>> +		/**
>> +		 * If there is already a request pending on
>> +		 * this ring, wait for that to complete,
>> +		 * otherwise create a switch to idle request
>> +		 */
>> +		if (list_empty(&ring->request_list)) {
>> +			int ret;
>> +
>> +			ret = i915_gem_request_alloc(
>> +					ring,
>> +					ring->default_context,
>> +					&req);
>> +			if (!ret)
>> +				i915_add_request(req);
>> +			else
>> +				DRM_DEBUG("Failed to ensure context saved");
>> +		} else {
>> +			req = list_first_entry(
>> +					&ring->request_list,
>> +					typeof(*req), list);
>> +		}
>> +		if (req) {
>> +			ret = i915_wait_request(req);
>> +			if (ret != 0) {
>> +				/**
>> +				 * If we get here, there's probably been a ring
>> +				 * reset, so we just clean up the dirty flag &
>> +				 * pin count.
>> +				 */
>> +				ctx->engine[ring->id].dirty = false;
>> +				__intel_lr_context_unpin(
>> +					ring,
>> +					ctx);
>> +			}
>> +		}
>
> If you were to take a lr_context_pin on the last_context, and only
> release that pin when you change to a new context, you do not need to

That's what this patch does.

> introduce a blocking context-close, nor do you need to introduce the
> usage of default_context.

The use of default_context here is to stop a context hanging around 
after it is no longer needed.

>
> (lr_context_pin should take a reference on the ctx to prevent early
> freeing ofc).

You can't clear the reference on the ctx in an interrupt context.

>
> The code at that point starts to look v.v.similar to legacy, right down
> to the need to use a GPU reset during shutdown to prevent writing back
> the context image. (Which you still currently need to get rid of the
> default context now.)
> -Chris
>
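
For context, the usual way around the constraint Nick raises is to
defer the final unreference to process context instead of dropping it
in the interrupt handler. A hedged sketch using a work item; the
cleanup_work field, its initialisation at context creation, and the
dev backpointer are hypothetical additions, not existing i915 code:

/* Hypothetical: initialised once at context creation with
 *	INIT_WORK(&ctx->cleanup_work, i915_context_unref_work);
 */
static void i915_context_unref_work(struct work_struct *work)
{
	struct intel_context *ctx =
		container_of(work, struct intel_context, cleanup_work);
	struct drm_device *dev = ctx->dev;	/* hypothetical backpointer */

	/* Runs in process context, so taking struct_mutex is legal. */
	mutex_lock(&dev->struct_mutex);
	i915_gem_context_unreference(ctx);
	mutex_unlock(&dev->struct_mutex);
}

/* Called from the execlists irq handler in place of a direct
 * unreference; queue_work() is safe in hard-irq context.
 */
static void defer_context_unreference(struct drm_i915_private *dev_priv,
				      struct intel_context *ctx)
{
	queue_work(dev_priv->wq, &ctx->cleanup_work);
}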



* Re: [PATCH v10] drm/i915: Extend LRC pinning to cover GPU context writeback
  2016-01-14 11:56   ` Nick Hoath
@ 2016-01-14 12:31     ` Chris Wilson
  2016-01-14 12:37       ` Nick Hoath
  0 siblings, 1 reply; 17+ messages in thread
From: Chris Wilson @ 2016-01-14 12:31 UTC (permalink / raw)
  To: Nick Hoath; +Cc: Daniel Vetter, intel-gfx

On Thu, Jan 14, 2016 at 11:56:07AM +0000, Nick Hoath wrote:
> On 14/01/2016 11:36, Chris Wilson wrote:
> >On Wed, Jan 13, 2016 at 04:19:45PM +0000, Nick Hoath wrote:
> >>+	if (ctx->engine[ring->id].dirty) {
> >>+		struct drm_i915_gem_request *req = NULL;
> >>+
> >>+		/**
> >>+		 * If there is already a request pending on
> >>+		 * this ring, wait for that to complete,
> >>+		 * otherwise create a switch to idle request
> >>+		 */
> >>+		if (list_empty(&ring->request_list)) {
> >>+			int ret;
> >>+
> >>+			ret = i915_gem_request_alloc(
> >>+					ring,
> >>+					ring->default_context,
> >>+					&req);
> >>+			if (!ret)
> >>+				i915_add_request(req);
> >>+			else
> >>+				DRM_DEBUG("Failed to ensure context saved");
> >>+		} else {
> >>+			req = list_first_entry(
> >>+					&ring->request_list,
> >>+					typeof(*req), list);
> >>+		}
> >>+		if (req) {
> >>+			ret = i915_wait_request(req);
> >>+			if (ret != 0) {
> >>+				/**
> >>+				 * If we get here, there's probably been a ring
> >>+				 * reset, so we just clean up the dirty flag &
> >>+				 * pin count.
> >>+				 */
> >>+				ctx->engine[ring->id].dirty = false;
> >>+				__intel_lr_context_unpin(
> >>+					ring,
> >>+					ctx);
> >>+			}
> >>+		}
> >
> >If you were to take a lr_context_pin on the last_context, and only
> >release that pin when you change to a new context, you do not need to
> 
> That's what this patch does.
> 
> >introduce a blocking context-close, nor do you need to introduce the
> >usage of default_context.
> 
> The use of default_context here is to stop a context hanging around
> after it is no longer needed.

By blocking, which is not acceptable. Also, we can eliminate the
default_context, and so pinning that as opposed to the last_context serves
no purpose other than by chance having a more preferable position when
it comes to defragmentation. But you don't enable that anyway and we
have alternative strategies now that avoid the issue with fragmentation
of the mappable aperture.

> >(lr_context_pin should take a reference on the ctx to prevent early
> >freeing ofc).
> 
> You can't clear the reference on the ctx in an interrupt context.

The execlists submission should be moved out of the interrupt context,
for the very simple reason that it is causing machine panics. Userspace
submits a workload, machine lockups....
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre


* Re: [PATCH v10] drm/i915: Extend LRC pinning to cover GPU context writeback
  2016-01-14 12:31     ` Chris Wilson
@ 2016-01-14 12:37       ` Nick Hoath
  2016-01-15 10:59         ` Nick Hoath
  0 siblings, 1 reply; 17+ messages in thread
From: Nick Hoath @ 2016-01-14 12:37 UTC (permalink / raw)
  To: Chris Wilson; +Cc: Daniel Vetter, intel-gfx

On 14/01/2016 12:31, Chris Wilson wrote:
> On Thu, Jan 14, 2016 at 11:56:07AM +0000, Nick Hoath wrote:
>> On 14/01/2016 11:36, Chris Wilson wrote:
>>> On Wed, Jan 13, 2016 at 04:19:45PM +0000, Nick Hoath wrote:
>>>> +	if (ctx->engine[ring->id].dirty) {
>>>> +		struct drm_i915_gem_request *req = NULL;
>>>> +
>>>> +		/**
>>>> +		 * If there is already a request pending on
>>>> +		 * this ring, wait for that to complete,
>>>> +		 * otherwise create a switch to idle request
>>>> +		 */
>>>> +		if (list_empty(&ring->request_list)) {
>>>> +			int ret;
>>>> +
>>>> +			ret = i915_gem_request_alloc(
>>>> +					ring,
>>>> +					ring->default_context,
>>>> +					&req);
>>>> +			if (!ret)
>>>> +				i915_add_request(req);
>>>> +			else
>>>> +				DRM_DEBUG("Failed to ensure context saved");
>>>> +		} else {
>>>> +			req = list_first_entry(
>>>> +					&ring->request_list,
>>>> +					typeof(*req), list);
>>>> +		}
>>>> +		if (req) {
>>>> +			ret = i915_wait_request(req);
>>>> +			if (ret != 0) {
>>>> +				/**
>>>> +				 * If we get here, there's probably been a ring
>>>> +				 * reset, so we just clean up the dirty flag &
>>>> +				 * pin count.
>>>> +				 */
>>>> +				ctx->engine[ring->id].dirty = false;
>>>> +				__intel_lr_context_unpin(
>>>> +					ring,
>>>> +					ctx);
>>>> +			}
>>>> +		}
>>>
>>> If you were to take a lr_context_pin on the last_context, and only
>>> release that pin when you change to a new context, you do not need to
>>
>> That's what this patch does.
>>
>>> introduce a blocking context-close, nor do you need to introduce the
>>> usage of default_context.
>>
>> The use of default_context here is to stop a context hanging around
>> after it is no longer needed.
>
> By blocking, which is not acceptable. Also we can eliminate the
>> default_context and so pinning that as opposed to the last_context serves
>> no purpose other than by chance having a more preferable position when
> it comes to defragmentation. But you don't enable that anyway and we

Enabling the shrinker on execlists is something I'm working on, which is
predicated on this patch. Also, why is blocking on closing a context not
acceptable?

> have alternative strategies now that avoid the issue with fragmentation
> of the mappable aperture.
>
>>> (lr_context_pin should take a reference on the ctx to prevent early
>>> freeeing ofc).
>>
>> You can't clear the reference on the ctx in an interrupt context.
>
> The execlists submission should be moved out of the interrupt context, for
> the very simple reason that it is causing machine panics. userspace
> submits a workload, machine lockups....

Create a jira, and I'm sure we'll look at making that change.

> -Chris
>



* Re: [PATCH v10] drm/i915: Extend LRC pinning to cover GPU context writeback
  2016-01-14 12:37       ` Nick Hoath
@ 2016-01-15 10:59         ` Nick Hoath
  2016-01-18 15:02           ` Tvrtko Ursulin
  0 siblings, 1 reply; 17+ messages in thread
From: Nick Hoath @ 2016-01-15 10:59 UTC (permalink / raw)
  To: Chris Wilson; +Cc: Daniel Vetter, intel-gfx

On 14/01/2016 12:37, Nick Hoath wrote:
> On 14/01/2016 12:31, Chris Wilson wrote:
>> On Thu, Jan 14, 2016 at 11:56:07AM +0000, Nick Hoath wrote:
>>> On 14/01/2016 11:36, Chris Wilson wrote:
>>>> On Wed, Jan 13, 2016 at 04:19:45PM +0000, Nick Hoath wrote:
>>>>> +	if (ctx->engine[ring->id].dirty) {
>>>>> +		struct drm_i915_gem_request *req = NULL;
>>>>> +
>>>>> +		/**
>>>>> +		 * If there is already a request pending on
>>>>> +		 * this ring, wait for that to complete,
>>>>> +		 * otherwise create a switch to idle request
>>>>> +		 */
>>>>> +		if (list_empty(&ring->request_list)) {
>>>>> +			int ret;
>>>>> +
>>>>> +			ret = i915_gem_request_alloc(
>>>>> +					ring,
>>>>> +					ring->default_context,
>>>>> +					&req);
>>>>> +			if (!ret)
>>>>> +				i915_add_request(req);
>>>>> +			else
>>>>> +				DRM_DEBUG("Failed to ensure context saved");
>>>>> +		} else {
>>>>> +			req = list_first_entry(
>>>>> +					&ring->request_list,
>>>>> +					typeof(*req), list);
>>>>> +		}
>>>>> +		if (req) {
>>>>> +			ret = i915_wait_request(req);
>>>>> +			if (ret != 0) {
>>>>> +				/**
>>>>> +				 * If we get here, there's probably been a ring
>>>>> +				 * reset, so we just clean up the dirty flag &
>>>>> +				 * pin count.
>>>>> +				 */
>>>>> +				ctx->engine[ring->id].dirty = false;
>>>>> +				__intel_lr_context_unpin(
>>>>> +					ring,
>>>>> +					ctx);
>>>>> +			}
>>>>> +		}
>>>>
>>>> If you were to take a lr_context_pin on the last_context, and only
>>>> release that pin when you change to a new context, you do not need to
>>>
>>> That's what this patch does.
>>>
>>>> introduce a blocking context-close, nor do you need to introduce the
>>>> usage of default_context.
>>>
>>> The use of default_context here is to stop a context hanging around
>>> after it is no longer needed.
>>
>> By blocking, which is not acceptable. Also we can eliminate the
>> default_context and so pinning that as opposed to the last_context serves
>> no purpose other than by chance having a more preferable position when
>> it comes to defragmentation. But you don't enable that anyway and we
>
> Enabling the shrinker on execlists is something I'm working on which is
> predicated on this patch. Also why is blocking on closing a context not
> acceptable?
>

As a clarification: without rewriting the execlist code to not submit or
clean up from an interrupt handler, we can't use refcounting to allow
non-blocking closing.

>> have alternative strategies now that avoid the issue with fragmentation
>> of the mappable aperture.
>>
>>>> (lr_context_pin should take a reference on the ctx to prevent early
>>>> freeing ofc).
>>>
>>> You can't clear the reference on the ctx in an interrupt context.
>>
>> The execlists submission should be moved out of the interrupt context, for
>> the very simple reason that it is causing machine panics. userspace
>> submits a workload, machine lockups....
>
> Create a jira, and I'm sure we'll look at making that change.
>
>> -Chris
>>
>



* Re: [PATCH v10] drm/i915: Extend LRC pinning to cover GPU context writeback
  2016-01-15 10:59         ` Nick Hoath
@ 2016-01-18 15:02           ` Tvrtko Ursulin
  2016-01-18 16:53             ` Chris Wilson
  0 siblings, 1 reply; 17+ messages in thread
From: Tvrtko Ursulin @ 2016-01-18 15:02 UTC (permalink / raw)
  To: Nick Hoath, Chris Wilson; +Cc: Daniel Vetter, intel-gfx


Hi guys,

On 15/01/16 10:59, Nick Hoath wrote:
> On 14/01/2016 12:37, Nick Hoath wrote:
>> On 14/01/2016 12:31, Chris Wilson wrote:
>>> On Thu, Jan 14, 2016 at 11:56:07AM +0000, Nick Hoath wrote:
>>>> On 14/01/2016 11:36, Chris Wilson wrote:
>>>>> On Wed, Jan 13, 2016 at 04:19:45PM +0000, Nick Hoath wrote:
>>>>>> +    if (ctx->engine[ring->id].dirty) {
>>>>>> +        struct drm_i915_gem_request *req = NULL;
>>>>>> +
>>>>>> +        /**
>>>>>> +         * If there is already a request pending on
>>>>>> +         * this ring, wait for that to complete,
>>>>>> +         * otherwise create a switch to idle request
>>>>>> +         */
>>>>>> +        if (list_empty(&ring->request_list)) {
>>>>>> +            int ret;
>>>>>> +
>>>>>> +            ret = i915_gem_request_alloc(
>>>>>> +                    ring,
>>>>>> +                    ring->default_context,
>>>>>> +                    &req);
>>>>>> +            if (!ret)
>>>>>> +                i915_add_request(req);
>>>>>> +            else
>>>>>> +                DRM_DEBUG("Failed to ensure context saved");
>>>>>> +        } else {
>>>>>> +            req = list_first_entry(
>>>>>> +                    &ring->request_list,
>>>>>> +                    typeof(*req), list);
>>>>>> +        }
>>>>>> +        if (req) {
>>>>>> +            ret = i915_wait_request(req);
>>>>>> +            if (ret != 0) {
>>>>>> +                /**
>>>>>> +                 * If we get here, there's probably been a ring
>>>>>> +                 * reset, so we just clean up the dirty flag &
>>>>>> +                 * pin count.
>>>>>> +                 */
>>>>>> +                ctx->engine[ring->id].dirty = false;
>>>>>> +                __intel_lr_context_unpin(
>>>>>> +                    ring,
>>>>>> +                    ctx);
>>>>>> +            }
>>>>>> +        }
>>>>>
>>>>> If you were to take a lr_context_pin on the last_context, and only
>>>>> release that pin when you change to a new context, you do not need to
>>>>
>>>> That's what this patch does.
>>>>
>>>>> introduce a blocking context-close, nor do you need to introduce the
>>>>> usage of default_context.
>>>>
>>>> The use of default_context here is to stop a context hanging around
>>>> after it is no longer needed.
>>>
>>> By blocking, which is not acceptable. Also we can eliminate the
>>> default_context and so pinning that as opposed to the last_context serves
>>> no purpose other than by chance having a more preferable position when
>>> it comes to defragmentation. But you don't enable that anyway and we
>>
>> Enabling the shrinker on execlists is something I'm working on which is
>> predicated on this patch. Also why is blocking on closing a context not
>> acceptable?
>>
> 
> As a clarification: without rewriting the execlist code to not submit or
> clean up from an interrupt handler, we can't use refcounting to allow
> non-blocking closing.

I am trying to understand this issue, so please bear with me if I got it
wrong. Also, it is not tested since I don't have a suitable system.

But... would something like the below be an interesting step towards an
acceptable solution?

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 2cfcf9401971..63bb251edffd 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2927,6 +2927,8 @@ void i915_gem_reset(struct drm_device *dev)
 void
 i915_gem_retire_requests_ring(struct intel_engine_cs *ring)
 {
+       struct drm_i915_gem_request *prev_req, *next, *req;
+
        WARN_ON(i915_verify_lists(ring->dev));
 
        /* Retire requests first as we use it above for the early return.
@@ -2934,17 +2936,17 @@ i915_gem_retire_requests_ring(struct intel_engine_cs *ring)
         * the requests lists without clearing the active list, leading to
         * confusion.
         */
-       while (!list_empty(&ring->request_list)) {
-               struct drm_i915_gem_request *request;
-
-               request = list_first_entry(&ring->request_list,
-                                          struct drm_i915_gem_request,
-                                          list);
-
-               if (!i915_gem_request_completed(request, true))
+       list_for_each_entry_safe(req, next, &ring->request_list, list) {
+               if (!i915_gem_request_completed(req, true))
                        break;
 
-               i915_gem_request_retire(request);
+               if (!i915.enable_execlists || !i915.enable_guc_submission) {
+                       i915_gem_request_retire(req);
+               } else {
+                       prev_req = list_prev_entry(req, list);
+                       if (prev_req)
+                               i915_gem_request_retire(prev_req);
+               }
        }

To explain, this attempts to ensure that in GuC mode requests are only
unreferenced if there is a *following* *completed* request.

This way, regardless of whether they are using the same or different
contexts, we can be sure that the GPU has either completed writing the
context back, or that the unreference will not cause the final unpin
of the context.

With the above snippet it would leave the last context pinned, but
that could also be improved by either appending an empty default-context
request when we get to the end of the list, or perhaps only periodically
from the retire worker, to lessen the ctx switch traffic.

Thoughts, comments?

Regards,

Tvrtko
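
A sketch of the refinement floated above - queueing an otherwise empty
request on the default context so that every real request eventually
gains a completed successor. The helper name is hypothetical;
i915_gem_request_alloc() and i915_add_request() are used as in the
patch under discussion:

static void flush_last_context(struct intel_engine_cs *ring)
{
	struct drm_i915_gem_request *req;
	int ret;

	/* Nothing outstanding, so nothing is left pinned. */
	if (list_empty(&ring->request_list))
		return;

	ret = i915_gem_request_alloc(ring, ring->default_context, &req);
	if (ret) {
		DRM_DEBUG("Failed to queue idle context switch\n");
		return;
	}

	/* Once this empty request completes, every earlier request has
	 * a following completed request and can be retired (and its
	 * context unpinned) safely.
	 */
	i915_add_request(req);
}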


* Re: [PATCH v10] drm/i915: Extend LRC pinning to cover GPU context writeback
  2016-01-18 15:02           ` Tvrtko Ursulin
@ 2016-01-18 16:53             ` Chris Wilson
  2016-01-18 17:14               ` Tvrtko Ursulin
  0 siblings, 1 reply; 17+ messages in thread
From: Chris Wilson @ 2016-01-18 16:53 UTC (permalink / raw)
  To: Tvrtko Ursulin; +Cc: Daniel Vetter, intel-gfx

On Mon, Jan 18, 2016 at 03:02:25PM +0000, Tvrtko Ursulin wrote:
> -       while (!list_empty(&ring->request_list)) {
> -               struct drm_i915_gem_request *request;
> -
> -               request = list_first_entry(&ring->request_list,
> -                                          struct drm_i915_gem_request,
> -                                          list);
> -
> -               if (!i915_gem_request_completed(request, true))
> +       list_for_each_entry_safe(req, next, &ring->request_list, list) {
> +               if (!i915_gem_request_completed(req, true))
>                         break;
>  
> -               i915_gem_request_retire(request);
> +               if (!i915.enable_execlists || !i915.enable_guc_submission) {
> +                       i915_gem_request_retire(req);
> +               } else {
> +                       prev_req = list_prev_entry(req, list);
> +                       if (prev_req)
> +                               i915_gem_request_retire(prev_req);
> +               }
>         }
> 
> To explain, this attempts to ensure that in GuC mode requests are only
> unreferenced if there is a *following* *completed* request.
> 
> This way, regardless of whether they are using the same or different
> contexts, we can be sure that the GPU has either completed the 
> context writing, or that the unreference will not cause the final
> unpin of the context.

This is the first bogus step. Contexts have to be unreferenced from
request retire, not request free. As it stands today, this forces us to
hold the struct_mutex for the free (causing many foul-ups along the
line). The only reason why it is like that is because of execlists not
decoupling its context pinning inside request cancel.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre


* Re: [PATCH v10] drm/i915: Extend LRC pinning to cover GPU context writeback
  2016-01-18 16:53             ` Chris Wilson
@ 2016-01-18 17:14               ` Tvrtko Ursulin
  2016-01-18 20:47                 ` Chris Wilson
  0 siblings, 1 reply; 17+ messages in thread
From: Tvrtko Ursulin @ 2016-01-18 17:14 UTC (permalink / raw)
  To: Chris Wilson, Nick Hoath, Daniel Vetter, intel-gfx


On 18/01/16 16:53, Chris Wilson wrote:
> On Mon, Jan 18, 2016 at 03:02:25PM +0000, Tvrtko Ursulin wrote:
>> -       while (!list_empty(&ring->request_list)) {
>> -               struct drm_i915_gem_request *request;
>> -
>> -               request = list_first_entry(&ring->request_list,
>> -                                          struct drm_i915_gem_request,
>> -                                          list);
>> -
>> -               if (!i915_gem_request_completed(request, true))
>> +       list_for_each_entry_safe(req, next, &ring->request_list, list) {
>> +               if (!i915_gem_request_completed(req, true))
>>                          break;
>>
>> -               i915_gem_request_retire(request);
>> +               if (!i915.enable_execlists || !i915.enable_guc_submission) {
>> +                       i915_gem_request_retire(req);
>> +               } else {
>> +                       prev_req = list_prev_entry(req, list);
>> +                       if (prev_req)
>> +                               i915_gem_request_retire(prev_req);
>> +               }
>>          }
>>
>> To explain, this attempts to ensure that in GuC mode requests are only
>> unreferenced if there is a *following* *completed* request.
>>
>> This way, regardless of whether they are using the same or different
>> contexts, we can be sure that the GPU has either completed the
>> context writing, or that the unreference will not cause the final
>> unpin of the context.
>
> This is the first bogus step. contexts have to be unreferenced from
> request retire, not request free. As it stands today, this forces us to
> hold the struct_mutex for the free (causing many foul ups along the
> line).  The only reason why it is like that is because of execlists not
> decoupling its context pinning inside request cancel.

What is the first bogus step? My idea of how to fix the GuC issue, or 
the mention of final unreference in relation to GPU completing the 
submission?

Also I don't understand how you would decouple context and request lifetime?

Maybe we can ignore execlist mode for the moment and just consider the
GuC, which, as much as I understand it, has a simpler and fully aligned
request/context/lrc lifetime of:

* reference and pin at request creation
* unpin and unreference at retire

Where retire is decoupled from actual GPU activity, or maybe better
said, indirectly driven.

Execlists bolt another parallel instance of reference and pin on top,
with different lifetime rules, so maybe ignore that for the GuC
discussion. Just to figure out what you have in mind.

Regards,

Tvrtko


* Re: [PATCH v10] drm/i915: Extend LRC pinning to cover GPU context writeback
  2016-01-18 17:14               ` Tvrtko Ursulin
@ 2016-01-18 20:47                 ` Chris Wilson
  2016-01-19 10:24                   ` Tvrtko Ursulin
  0 siblings, 1 reply; 17+ messages in thread
From: Chris Wilson @ 2016-01-18 20:47 UTC (permalink / raw)
  To: Tvrtko Ursulin; +Cc: Daniel Vetter, intel-gfx

On Mon, Jan 18, 2016 at 05:14:26PM +0000, Tvrtko Ursulin wrote:
> 
> On 18/01/16 16:53, Chris Wilson wrote:
> >On Mon, Jan 18, 2016 at 03:02:25PM +0000, Tvrtko Ursulin wrote:
> >>-       while (!list_empty(&ring->request_list)) {
> >>-               struct drm_i915_gem_request *request;
> >>-
> >>-               request = list_first_entry(&ring->request_list,
> >>-                                          struct drm_i915_gem_request,
> >>-                                          list);
> >>-
> >>-               if (!i915_gem_request_completed(request, true))
> >>+       list_for_each_entry_safe(req, next, &ring->request_list, list) {
> >>+               if (!i915_gem_request_completed(req, true))
> >>                         break;
> >>
> >>-               i915_gem_request_retire(request);
> >>+               if (!i915.enable_execlists || !i915.enable_guc_submission) {
> >>+                       i915_gem_request_retire(req);
> >>+               } else {
> >>+                       prev_req = list_prev_entry(req, list);
> >>+                       if (prev_req)
> >>+                               i915_gem_request_retire(prev_req);
> >>+               }
> >>         }
> >>
> >>To explain, this attempts to ensure that in GuC mode requests are only
> >>unreferenced if there is a *following* *completed* request.
> >>
> >>This way, regardless of whether they are using the same or different
> >>contexts, we can be sure that the GPU has either completed the
> >>context writing, or that the unreference will not cause the final
> >>unpin of the context.
> >
> >This is the first bogus step. contexts have to be unreferenced from
> >request retire, not request free. As it stands today, this forces us to
> >hold the struct_mutex for the free (causing many foul ups along the
> >line).  The only reason why it is like that is because of execlists not
> >decoupling its context pinning inside request cancel.
> 
> What is the first bogus step? My idea of how to fix the GuC issue,
> or the mention of final unreference in relation to GPU completing
> the submission?

That we want to actually unreference the request. We want to unpin
the context at the appropriate juncture. At the moment, it looks like
you are conflating those two steps: "requests are only unreferenced".
Using the retirement mechanism would mean coupling the context
unpinning into a subsequent request rather than deferring the retiring
of a completed request; legacy, for example, uses active vma tracking
to accomplish the same thing. Aiui, the current claim is that we couldn't
do that since the guc may reorder contexts - except that we currently
use a global seqno so that would be bad on many levels.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
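
For comparison, the legacy path Chris refers to handles the write-back
hazard through active tracking rather than an explicit dirty flag. A
rough condensation modelled on the do_switch() code of this era (the
helper name is hypothetical, the calls are the existing ones):

static void legacy_track_context_save(struct drm_i915_gem_request *req,
				      struct intel_context *from)
{
	struct drm_i915_gem_object *obj = from->legacy_hw_ctx.rcs_state;

	/* The outgoing context object is marked active against the
	 * request that performs the switch, so the active-list
	 * machinery keeps it resident until that request retires,
	 * i.e. until the hw has saved the context image.
	 */
	obj->base.read_domains = I915_GEM_DOMAIN_INSTRUCTION;
	i915_vma_move_to_active(i915_gem_obj_to_ggtt(obj), req);
	obj->dirty = 1;

	/* The pin can be dropped immediately; the active reference
	 * defers eviction until retirement.
	 */
	i915_gem_object_ggtt_unpin(obj);
}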

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: ✗ failure:  Fi.CI.BAT
  2016-01-14 11:31   ` Nick Hoath
@ 2016-01-19  9:08     ` Daniel Vetter
  0 siblings, 0 replies; 17+ messages in thread
From: Daniel Vetter @ 2016-01-19  9:08 UTC (permalink / raw)
  To: Nick Hoath; +Cc: intel-gfx

On Thu, Jan 14, 2016 at 11:31:07AM +0000, Nick Hoath wrote:
> On 14/01/2016 07:20, Patchwork wrote:
> >== Summary ==

Our BKM is to link to bugzilla entries to make sure these are all real
failures which are tracked already. Otherwise stuff falls through the
cracks.

> >
> >Built on 058740f8fced6851aeda34f366f5330322cd585f drm-intel-nightly: 2016y-01m-13d-17h-07m-44s UTC integration manifest
> >
> >Test gem_ctx_basic:
> >                 pass       -> FAIL       (bdw-ultra)
> 
> Test failed to load - not patch related
> 
> >Test gem_ctx_param_basic:
> >         Subgroup non-root-set:
> >                 pass       -> DMESG-WARN (bsw-nuc-2)
> 
> gem driver allocated a poisoned slab - not patch related
> 
> >Test kms_flip:
> >         Subgroup basic-flip-vs-dpms:
> >                 pass       -> SKIP       (bsw-nuc-2)
> 
> test reqs not met - not patch related

basic-flip-vs-dpms MUST always work. Well, except if you have a GT-only
chip where the display is fused off. I expect more serious analysis
instead of casually shrugging issues away as "not my problem".

Same sloppy analysis with the others imo.
-Daniel


> 
> >                 dmesg-warn -> PASS       (ilk-hp8440p)
> 
> warn to PASS
> 
> >
> >bdw-nuci7        total:138  pass:128  dwarn:1   dfail:0   fail:0   skip:9
> >bdw-ultra        total:138  pass:131  dwarn:0   dfail:0   fail:1   skip:6
> >bsw-nuc-2        total:141  pass:113  dwarn:3   dfail:0   fail:0   skip:25
> >hsw-brixbox      total:141  pass:134  dwarn:0   dfail:0   fail:0   skip:7
> >hsw-gt2          total:141  pass:137  dwarn:0   dfail:0   fail:0   skip:4
> >ilk-hp8440p      total:141  pass:101  dwarn:3   dfail:0   fail:0   skip:37
> >ivb-t430s        total:135  pass:122  dwarn:3   dfail:4   fail:0   skip:6
> >skl-i5k-2        total:141  pass:131  dwarn:2   dfail:0   fail:0   skip:8
> >skl-i7k-2        total:141  pass:131  dwarn:2   dfail:0   fail:0   skip:8
> >snb-dellxps      total:141  pass:122  dwarn:5   dfail:0   fail:0   skip:14
> >snb-x220t        total:141  pass:122  dwarn:5   dfail:0   fail:1   skip:13
> >
> >Results at /archive/results/CI_IGT_test/Patchwork_1174/
> >
> 
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v10] drm/i915: Extend LRC pinning to cover GPU context writeback
  2016-01-18 20:47                 ` Chris Wilson
@ 2016-01-19 10:24                   ` Tvrtko Ursulin
  2016-01-19 17:18                     ` Tvrtko Ursulin
  0 siblings, 1 reply; 17+ messages in thread
From: Tvrtko Ursulin @ 2016-01-19 10:24 UTC (permalink / raw)
  To: Chris Wilson, Nick Hoath, Daniel Vetter, intel-gfx



On 18/01/16 20:47, Chris Wilson wrote:
> On Mon, Jan 18, 2016 at 05:14:26PM +0000, Tvrtko Ursulin wrote:
>>
>> On 18/01/16 16:53, Chris Wilson wrote:
>>> On Mon, Jan 18, 2016 at 03:02:25PM +0000, Tvrtko Ursulin wrote:
>>>> -       while (!list_empty(&ring->request_list)) {
>>>> -               struct drm_i915_gem_request *request;
>>>> -
>>>> -               request = list_first_entry(&ring->request_list,
>>>> -                                          struct drm_i915_gem_request,
>>>> -                                          list);
>>>> -
>>>> -               if (!i915_gem_request_completed(request, true))
>>>> +       list_for_each_entry_safe(req, next, &ring->request_list, list) {
>>>> +               if (!i915_gem_request_completed(req, true))
>>>>                          break;
>>>>
>>>> -               i915_gem_request_retire(request);
>>>> +               if (!i915.enable_execlists || !i915.enable_guc_submission) {
>>>> +                       i915_gem_request_retire(req);
>>>> +               } else {
>>>> +                       prev_req = list_prev_entry(req, list);
>>>> +                       if (prev_req)
>>>> +                               i915_gem_request_retire(prev_req);
>>>> +               }
>>>>          }
>>>>
>>>> To explain, this attempts to ensure that in GuC mode requests are only
>>>> unreferenced if there is a *following* *completed* request.
>>>>
>>>> This way, regardless of whether they are using the same or different
>>>> contexts, we can be sure that the GPU has either completed the
>>>> context writing, or that the unreference will not cause the final
>>>> unpin of the context.
>>>
>>> This is the first bogus step. contexts have to be unreferenced from
>>> request retire, not request free. As it stands today, this forces us to
>>> hold the struct_mutex for the free (causing many foul ups along the
>>> line).  The only reason why it is like that is because of execlists not
>>> decoupling its context pinning inside request cancel.
>>
>> What is the first bogus step? My idea of how to fix the GuC issue,
>> or the mention of final unreference in relation to GPU completing
>> the submission?
>
> That we want to actually unreference the request. We want to
> unpin the context at the appropriate juncture. At the moment, it looks

What would be the appropriate juncture? With GuC we don't have the 
equivalent of a context-complete irq.

> like you are conflating those two steps: "requests are only
> unreferenced". Using the retirement mechanism would mean coupling the
> context unpinning into a subsequent request rather than deferring the
> retirement of a completed request; for example, legacy uses active vma
> tracking to
> accomplish the same thing. Aiui, the current claim is that we couldn't
> do that since the guc may reorder contexts - except that we currently
> use a global seqno so that would be bad on many levels.

I don't know legacy. :( I can see that request/context lifetime is 
coupled there, spanning from request creation to retirement.

Does it have the same problem of seqno signaling completion before the 
GPU is done with writing out the context image and how does it solve that?

Regards,

Tvrtko



_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v10] drm/i915: Extend LRC pinning to cover GPU context writeback
  2016-01-19 10:24                   ` Tvrtko Ursulin
@ 2016-01-19 17:18                     ` Tvrtko Ursulin
  2016-01-19 17:31                       ` Tvrtko Ursulin
  0 siblings, 1 reply; 17+ messages in thread
From: Tvrtko Ursulin @ 2016-01-19 17:18 UTC (permalink / raw)
  To: Chris Wilson, Nick Hoath, Daniel Vetter, intel-gfx


On 19/01/16 10:24, Tvrtko Ursulin wrote:
>
>
> On 18/01/16 20:47, Chris Wilson wrote:
>> On Mon, Jan 18, 2016 at 05:14:26PM +0000, Tvrtko Ursulin wrote:
>>>
>>> On 18/01/16 16:53, Chris Wilson wrote:
>>>> On Mon, Jan 18, 2016 at 03:02:25PM +0000, Tvrtko Ursulin wrote:
>>>>> -       while (!list_empty(&ring->request_list)) {
>>>>> -               struct drm_i915_gem_request *request;
>>>>> -
>>>>> -               request = list_first_entry(&ring->request_list,
>>>>> -                                          struct
>>>>> drm_i915_gem_request,
>>>>> -                                          list);
>>>>> -
>>>>> -               if (!i915_gem_request_completed(request, true))
>>>>> +       list_for_each_entry_safe(req, next, &ring->request_list,
>>>>> list) {
>>>>> +               if (!i915_gem_request_completed(req, true))
>>>>>                          break;
>>>>>
>>>>> -               i915_gem_request_retire(request);
>>>>> +               if (!i915.enable_execlists ||
>>>>> !i915.enable_guc_submission) {
>>>>> +                       i915_gem_request_retire(req);
>>>>> +               } else {
>>>>> +                       prev_req = list_prev_entry(req, list);
>>>>> +                       if (prev_req)
>>>>> +                               i915_gem_request_retire(prev_req);
>>>>> +               }
>>>>>          }
>>>>>
>>>>> To explain, this attempts to ensure that in GuC mode requests are only
>>>>> unreferenced if there is a *following* *completed* request.
>>>>>
>>>>> This way, regardless of whether they are using the same or different
>>>>> contexts, we can be sure that the GPU has either completed the
>>>>> context writing, or that the unreference will not cause the final
>>>>> unpin of the context.
>>>>
>>>> This is the first bogus step. contexts have to be unreferenced from
>>>> request retire, not request free. As it stands today, this forces us to
>>>> hold the struct_mutex for the free (causing many foul ups along the
>>>> line).  The only reason why it is like that is because of execlists not
>>>> decoupling its context pinning inside request cancel.
>>>
>>> What is the first bogus step? My idea of how to fix the GuC issue,
>>> or the mention of final unreference in relation to GPU completing
>>> the submission?
>>
>> That we want to actually unreference the request. We want to
>> unpin the context at the appropriate juncture. At the moment, it looks
>
> What would be the appropriate juncture? With GuC we don't have the
> equivalent of a context-complete irq.
>
>> like you are conflating those two steps: "requests are only
>> unreferenced". Using the retirement mechanism would mean coupling the
>> context unpinning into a subsequent request rather than deferring the
>> retirement of a completed request; for example, legacy uses active vma
>> tracking to
>> accomplish the same thing. Aiui, the current claim is that we couldn't
>> do that since the guc may reorder contexts - except that we currently
>> use a global seqno so that would be bad on many levels.
>
> I don't know legacy. :( I can see that request/context lifetime is
> coupled there, spanning from request creation to retirement.
>
> Does it have the same problem of seqno signaling completion before the
> GPU is done with writing out the context image and how does it solve that?

Ok I think I am starting to see the legacy code paths.

Interesting areas are i915_switch_context + do_switch which do the 
ring->last_context tracking and make the ring/engine own one extra 
reference on the context.

Then, code paths which want to make sure no user contexts are active on 
the GPU call i915_gpu_idle and submit a dummy default context request.

The latter even explicitly avoids execlist mode.

So unless I am missing something, we could just unify the behaviour 
between the two. Make ring/engine->last_context do the same 
tracking as legacy context switching and let i915_gpu_idle idle the GPU 
in execlist mode as well?
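
A much-simplified sketch of the legacy tracking described above, for
reference; error handling, ppgtt state and the actual MI_SET_CONTEXT
emission are omitted, and the function shape is approximate:

        /*
         * Simplified legacy do_switch(): the engine keeps one reference
         * on whichever context it last switched to, so the old context
         * cannot be freed before the GPU has switched off it; the old
         * state object is additionally kept pinned via active tracking
         * until the switch request itself retires.
         */
        static int do_switch(struct drm_i915_gem_request *req)
        {
                struct intel_engine_cs *ring = req->ring;
                struct intel_context *to = req->ctx;
                struct intel_context *from = ring->last_context;

                if (from == to)
                        return 0;

                /* ... emit MI_SET_CONTEXT switching the GPU to 'to' ... */

                if (from)
                        i915_gem_context_unreference(from);
                i915_gem_context_reference(to);
                ring->last_context = to;

                return 0;
        }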

Regards,

Tvrtko
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v10] drm/i915: Extend LRC pinning to cover GPU context writeback
  2016-01-19 17:18                     ` Tvrtko Ursulin
@ 2016-01-19 17:31                       ` Tvrtko Ursulin
  0 siblings, 0 replies; 17+ messages in thread
From: Tvrtko Ursulin @ 2016-01-19 17:31 UTC (permalink / raw)
  To: Chris Wilson, Nick Hoath, Daniel Vetter, intel-gfx


On 19/01/16 17:18, Tvrtko Ursulin wrote:
>
> On 19/01/16 10:24, Tvrtko Ursulin wrote:
>>
>>
>> On 18/01/16 20:47, Chris Wilson wrote:
>>> On Mon, Jan 18, 2016 at 05:14:26PM +0000, Tvrtko Ursulin wrote:
>>>>
>>>> On 18/01/16 16:53, Chris Wilson wrote:
>>>>> On Mon, Jan 18, 2016 at 03:02:25PM +0000, Tvrtko Ursulin wrote:
>>>>>> -       while (!list_empty(&ring->request_list)) {
>>>>>> -               struct drm_i915_gem_request *request;
>>>>>> -
>>>>>> -               request = list_first_entry(&ring->request_list,
>>>>>> -                                          struct
>>>>>> drm_i915_gem_request,
>>>>>> -                                          list);
>>>>>> -
>>>>>> -               if (!i915_gem_request_completed(request, true))
>>>>>> +       list_for_each_entry_safe(req, next, &ring->request_list,
>>>>>> list) {
>>>>>> +               if (!i915_gem_request_completed(req, true))
>>>>>>                          break;
>>>>>>
>>>>>> -               i915_gem_request_retire(request);
>>>>>> +               if (!i915.enable_execlists ||
>>>>>> !i915.enable_guc_submission) {
>>>>>> +                       i915_gem_request_retire(req);
>>>>>> +               } else {
>>>>>> +                       prev_req = list_prev_entry(req, list);
>>>>>> +                       if (prev_req)
>>>>>> +                               i915_gem_request_retire(prev_req);
>>>>>> +               }
>>>>>>          }
>>>>>>
>>>>>> To explain, this attempts to ensure that in GuC mode requests are
>>>>>> only
>>>>>> unreferenced if there is a *following* *completed* request.
>>>>>>
>>>>>> This way, regardless of whether they are using the same or different
>>>>>> contexts, we can be sure that the GPU has either completed the
>>>>>> context writing, or that the unreference will not cause the final
>>>>>> unpin of the context.
>>>>>
>>>>> This is the first bogus step. contexts have to be unreferenced from
>>>>> request retire, not request free. As it stands today, this forces
>>>>> us to
>>>>> hold the struct_mutex for the free (causing many foul ups along the
>>>>> line).  The only reason why it is like that is because of execlists
>>>>> not
>>>>> decoupling its context pinning inside request cancel.
>>>>
>>>> What is the first bogus step? My idea of how to fix the GuC issue,
>>>> or the mention of final unreference in relation to GPU completing
>>>> the submission?
>>>
>>> That we want to actually unreference the request. We want to
>>> unpin the context at the appropriate juncture. At the moment, it looks
>>
>> What would be the appropriate juncture? With GuC we don't have the
>> equivalent of a context-complete irq.
>>
>>> like you are conflating those two steps: "requests are only
>>> unreferenced". Using the retirement mechanism would mean coupling the
>>> context unpinning into a subsequent request rather than deferring the
>>> retirement of a completed request; for example, legacy uses active vma
>>> tracking to
>>> accomplish the same thing. Aiui, the current claim is that we couldn't
>>> do that since the guc may reorder contexts - except that we currently
>>> use a global seqno so that would be bad on many levels.
>>
>> I don't know legacy. :( I can see that request/context lifetime is
>> coupled there, spanning from request creation to retirement.
>>
>> Does it have the same problem of seqno signaling completion before the
>> GPU is done with writing out the context image and how does it solve
>> that?
>
> Ok I think I am starting to see the legacy code paths.
>
> Interesting areas are i915_switch_context + do_switch which do the
> ring->last_context tracking and make the ring/engine own one extra
> reference on the context.
>
> Then, code paths which want to make sure no user contexts are active on
> the GPU call i915_gpu_idle and submit a dummy default context request.
>
> The latter even explicitly avoids execlist mode.
>
> So unless I am missing something, we could just unify the behaviour
> between the two. Make ring/engine->last_context do the same
> tracking as legacy context switching and let i915_gpu_idle idle the GPU
> in execlist mode as well?

Although I am not sure the engine->last_context concept works with LRC 
and GuC because of the multiple submission ports. Need to give it more 
thought.

Regards,

Tvrtko
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2016-01-19 17:31 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-01-13 16:19 [PATCH v10] drm/i915: Extend LRC pinning to cover GPU context writeback Nick Hoath
2016-01-13 19:28 ` Yu Dai
2016-01-14  7:20 ` ✗ failure: Fi.CI.BAT Patchwork
2016-01-14 11:31   ` Nick Hoath
2016-01-19  9:08     ` Daniel Vetter
2016-01-14 11:36 ` [PATCH v10] drm/i915: Extend LRC pinning to cover GPU context writeback Chris Wilson
2016-01-14 11:56   ` Nick Hoath
2016-01-14 12:31     ` Chris Wilson
2016-01-14 12:37       ` Nick Hoath
2016-01-15 10:59         ` Nick Hoath
2016-01-18 15:02           ` Tvrtko Ursulin
2016-01-18 16:53             ` Chris Wilson
2016-01-18 17:14               ` Tvrtko Ursulin
2016-01-18 20:47                 ` Chris Wilson
2016-01-19 10:24                   ` Tvrtko Ursulin
2016-01-19 17:18                     ` Tvrtko Ursulin
2016-01-19 17:31                       ` Tvrtko Ursulin
