All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v2] drm/i915: Fix context/engine cleanup order
@ 2015-12-11 14:59 Nick Hoath
  2015-12-11 16:26 ` Daniel Vetter
  2015-12-16 18:36 ` tidy up and fix init-fail and teardown paths Dave Gordon
  0 siblings, 2 replies; 21+ messages in thread
From: Nick Hoath @ 2015-12-11 14:59 UTC (permalink / raw)
  To: intel-gfx; +Cc: Daniel Vetter

Swap the order of context & engine cleanup, so that it is now
contexts, then engines.
This allows the context clean up code to do things like confirm
that ring->dev->struct_mutex is locked without a NULL pointer
dereference.
This came about as a result of the 'intel_ring_initialized() must
be simple and inline' patch now using ring->dev as an initialised
flag.
Rename the cleanup function to reflect what it actually does.
Also clean up some very annoying whitespace issues at the same time.

v2: Also make the fix in i915_load_modeset_init, not just
    in i915_driver_unload (Chris Wilson)

Signed-off-by: Nick Hoath <nicholas.hoath@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>

Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: David Gordon <david.s.gordon@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/i915_dma.c |  4 ++--
 drivers/gpu/drm/i915/i915_drv.h |  2 +-
 drivers/gpu/drm/i915/i915_gem.c | 23 ++++++++++++-----------
 3 files changed, 15 insertions(+), 14 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
index 84e2b20..4dad121 100644
--- a/drivers/gpu/drm/i915/i915_dma.c
+++ b/drivers/gpu/drm/i915/i915_dma.c
@@ -449,8 +449,8 @@ static int i915_load_modeset_init(struct drm_device *dev)
 
 cleanup_gem:
 	mutex_lock(&dev->struct_mutex);
-	i915_gem_cleanup_ringbuffer(dev);
 	i915_gem_context_fini(dev);
+	i915_gem_cleanup_engines(dev);
 	mutex_unlock(&dev->struct_mutex);
 cleanup_irq:
 	intel_guc_ucode_fini(dev);
@@ -1188,8 +1188,8 @@ int i915_driver_unload(struct drm_device *dev)
 
 	intel_guc_ucode_fini(dev);
 	mutex_lock(&dev->struct_mutex);
-	i915_gem_cleanup_ringbuffer(dev);
 	i915_gem_context_fini(dev);
+	i915_gem_cleanup_engines(dev);
 	mutex_unlock(&dev->struct_mutex);
 	intel_fbc_cleanup_cfb(dev_priv);
 	i915_gem_cleanup_stolen(dev);
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 5edd393..e317f88 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -3016,7 +3016,7 @@ int i915_gem_init_rings(struct drm_device *dev);
 int __must_check i915_gem_init_hw(struct drm_device *dev);
 int i915_gem_l3_remap(struct drm_i915_gem_request *req, int slice);
 void i915_gem_init_swizzling(struct drm_device *dev);
-void i915_gem_cleanup_ringbuffer(struct drm_device *dev);
+void i915_gem_cleanup_engines(struct drm_device *dev);
 int __must_check i915_gpu_idle(struct drm_device *dev);
 int __must_check i915_gem_suspend(struct drm_device *dev);
 void __i915_add_request(struct drm_i915_gem_request *req,
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 8e2acde..04a22db 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -4823,7 +4823,7 @@ i915_gem_init_hw(struct drm_device *dev)
 
 		ret = i915_gem_request_alloc(ring, ring->default_context, &req);
 		if (ret) {
-			i915_gem_cleanup_ringbuffer(dev);
+			i915_gem_cleanup_engines(dev);
 			goto out;
 		}
 
@@ -4836,7 +4836,7 @@ i915_gem_init_hw(struct drm_device *dev)
 		if (ret && ret != -EIO) {
 			DRM_ERROR("PPGTT enable ring #%d failed %d\n", i, ret);
 			i915_gem_request_cancel(req);
-			i915_gem_cleanup_ringbuffer(dev);
+			i915_gem_cleanup_engines(dev);
 			goto out;
 		}
 
@@ -4844,7 +4844,7 @@ i915_gem_init_hw(struct drm_device *dev)
 		if (ret && ret != -EIO) {
 			DRM_ERROR("Context enable ring #%d failed %d\n", i, ret);
 			i915_gem_request_cancel(req);
-			i915_gem_cleanup_ringbuffer(dev);
+			i915_gem_cleanup_engines(dev);
 			goto out;
 		}
 
@@ -4919,7 +4919,7 @@ out_unlock:
 }
 
 void
-i915_gem_cleanup_ringbuffer(struct drm_device *dev)
+i915_gem_cleanup_engines(struct drm_device *dev)
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
 	struct intel_engine_cs *ring;
@@ -4928,13 +4928,14 @@ i915_gem_cleanup_ringbuffer(struct drm_device *dev)
 	for_each_ring(ring, dev_priv, i)
 		dev_priv->gt.cleanup_ring(ring);
 
-    if (i915.enable_execlists)
-            /*
-             * Neither the BIOS, ourselves or any other kernel
-             * expects the system to be in execlists mode on startup,
-             * so we need to reset the GPU back to legacy mode.
-             */
-            intel_gpu_reset(dev);
+	if (i915.enable_execlists) {
+		/*
+		 * Neither the BIOS, ourselves or any other kernel
+		 * expects the system to be in execlists mode on startup,
+		 * so we need to reset the GPU back to legacy mode.
+		 */
+		intel_gpu_reset(dev);
+	}
 }
 
 static void
-- 
1.9.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* Re: [PATCH v2] drm/i915: Fix context/engine cleanup order
  2015-12-11 14:59 [PATCH v2] drm/i915: Fix context/engine cleanup order Nick Hoath
@ 2015-12-11 16:26 ` Daniel Vetter
  2015-12-11 18:58   ` Daniel Vetter
  2015-12-16 18:36 ` tidy up and fix init-fail and teardown paths Dave Gordon
  1 sibling, 1 reply; 21+ messages in thread
From: Daniel Vetter @ 2015-12-11 16:26 UTC (permalink / raw)
  To: Nick Hoath; +Cc: intel-gfx, Daniel Vetter

On Fri, Dec 11, 2015 at 02:59:09PM +0000, Nick Hoath wrote:
> Swap the order of context & engine cleanup, so that it is now
> contexts, then engines.
> This allows the context clean up code to do things like confirm
> that ring->dev->struct_mutex is locked without a NULL pointer
> dereference.
> This came about as a result of the 'intel_ring_initialized() must
> be simple and inline' patch now using ring->dev as an initialised
> flag.
> Rename the cleanup function to reflect what it actually does.
> Also clean up some very annoying whitespace issues at the same time.
> 
> v2: Also make the fix in i915_load_modeset_init, not just
>     in i915_driver_unload (Chris Wilson)
> 
> Signed-off-by: Nick Hoath <nicholas.hoath@intel.com>
> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
> 
> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
> Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
> Cc: David Gordon <david.s.gordon@intel.com>
> Cc: Chris Wilson <chris@chris-wilson.co.uk>

Queued for -next, thanks for the patch.
-Daniel

> ---
>  drivers/gpu/drm/i915/i915_dma.c |  4 ++--
>  drivers/gpu/drm/i915/i915_drv.h |  2 +-
>  drivers/gpu/drm/i915/i915_gem.c | 23 ++++++++++++-----------
>  3 files changed, 15 insertions(+), 14 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
> index 84e2b20..4dad121 100644
> --- a/drivers/gpu/drm/i915/i915_dma.c
> +++ b/drivers/gpu/drm/i915/i915_dma.c
> @@ -449,8 +449,8 @@ static int i915_load_modeset_init(struct drm_device *dev)
>  
>  cleanup_gem:
>  	mutex_lock(&dev->struct_mutex);
> -	i915_gem_cleanup_ringbuffer(dev);
>  	i915_gem_context_fini(dev);
> +	i915_gem_cleanup_engines(dev);
>  	mutex_unlock(&dev->struct_mutex);
>  cleanup_irq:
>  	intel_guc_ucode_fini(dev);
> @@ -1188,8 +1188,8 @@ int i915_driver_unload(struct drm_device *dev)
>  
>  	intel_guc_ucode_fini(dev);
>  	mutex_lock(&dev->struct_mutex);
> -	i915_gem_cleanup_ringbuffer(dev);
>  	i915_gem_context_fini(dev);
> +	i915_gem_cleanup_engines(dev);
>  	mutex_unlock(&dev->struct_mutex);
>  	intel_fbc_cleanup_cfb(dev_priv);
>  	i915_gem_cleanup_stolen(dev);
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index 5edd393..e317f88 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -3016,7 +3016,7 @@ int i915_gem_init_rings(struct drm_device *dev);
>  int __must_check i915_gem_init_hw(struct drm_device *dev);
>  int i915_gem_l3_remap(struct drm_i915_gem_request *req, int slice);
>  void i915_gem_init_swizzling(struct drm_device *dev);
> -void i915_gem_cleanup_ringbuffer(struct drm_device *dev);
> +void i915_gem_cleanup_engines(struct drm_device *dev);
>  int __must_check i915_gpu_idle(struct drm_device *dev);
>  int __must_check i915_gem_suspend(struct drm_device *dev);
>  void __i915_add_request(struct drm_i915_gem_request *req,
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index 8e2acde..04a22db 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -4823,7 +4823,7 @@ i915_gem_init_hw(struct drm_device *dev)
>  
>  		ret = i915_gem_request_alloc(ring, ring->default_context, &req);
>  		if (ret) {
> -			i915_gem_cleanup_ringbuffer(dev);
> +			i915_gem_cleanup_engines(dev);
>  			goto out;
>  		}
>  
> @@ -4836,7 +4836,7 @@ i915_gem_init_hw(struct drm_device *dev)
>  		if (ret && ret != -EIO) {
>  			DRM_ERROR("PPGTT enable ring #%d failed %d\n", i, ret);
>  			i915_gem_request_cancel(req);
> -			i915_gem_cleanup_ringbuffer(dev);
> +			i915_gem_cleanup_engines(dev);
>  			goto out;
>  		}
>  
> @@ -4844,7 +4844,7 @@ i915_gem_init_hw(struct drm_device *dev)
>  		if (ret && ret != -EIO) {
>  			DRM_ERROR("Context enable ring #%d failed %d\n", i, ret);
>  			i915_gem_request_cancel(req);
> -			i915_gem_cleanup_ringbuffer(dev);
> +			i915_gem_cleanup_engines(dev);
>  			goto out;
>  		}
>  
> @@ -4919,7 +4919,7 @@ out_unlock:
>  }
>  
>  void
> -i915_gem_cleanup_ringbuffer(struct drm_device *dev)
> +i915_gem_cleanup_engines(struct drm_device *dev)
>  {
>  	struct drm_i915_private *dev_priv = dev->dev_private;
>  	struct intel_engine_cs *ring;
> @@ -4928,13 +4928,14 @@ i915_gem_cleanup_ringbuffer(struct drm_device *dev)
>  	for_each_ring(ring, dev_priv, i)
>  		dev_priv->gt.cleanup_ring(ring);
>  
> -    if (i915.enable_execlists)
> -            /*
> -             * Neither the BIOS, ourselves or any other kernel
> -             * expects the system to be in execlists mode on startup,
> -             * so we need to reset the GPU back to legacy mode.
> -             */
> -            intel_gpu_reset(dev);
> +	if (i915.enable_execlists) {
> +		/*
> +		 * Neither the BIOS, ourselves or any other kernel
> +		 * expects the system to be in execlists mode on startup,
> +		 * so we need to reset the GPU back to legacy mode.
> +		 */
> +		intel_gpu_reset(dev);
> +	}
>  }
>  
>  static void
> -- 
> 1.9.1
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v2] drm/i915: Fix context/engine cleanup order
  2015-12-11 16:26 ` Daniel Vetter
@ 2015-12-11 18:58   ` Daniel Vetter
  0 siblings, 0 replies; 21+ messages in thread
From: Daniel Vetter @ 2015-12-11 18:58 UTC (permalink / raw)
  To: Nick Hoath; +Cc: intel-gfx, Daniel Vetter

On Fri, Dec 11, 2015 at 05:26:19PM +0100, Daniel Vetter wrote:
> On Fri, Dec 11, 2015 at 02:59:09PM +0000, Nick Hoath wrote:
> > Swap the order of context & engine cleanup, so that it is now
> > contexts, then engines.
> > This allows the context clean up code to do things like confirm
> > that ring->dev->struct_mutex is locked without a NULL pointer
> > dereference.
> > This came about as a result of the 'intel_ring_initialized() must
> > be simple and inline' patch now using ring->dev as an initialised
> > flag.
> > Rename the cleanup function to reflect what it actually does.
> > Also clean up some very annoying whitespace issues at the same time.
> > 
> > v2: Also make the fix in i915_load_modeset_init, not just
> >     in i915_driver_unload (Chris Wilson)
> > 
> > Signed-off-by: Nick Hoath <nicholas.hoath@intel.com>
> > Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
> > 
> > Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
> > Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
> > Cc: David Gordon <david.s.gordon@intel.com>
> > Cc: Chris Wilson <chris@chris-wilson.co.uk>
> 
> Queued for -next, thanks for the patch.

Well looked at BAT results still in time before pushing and dropped.
Sounds like you&Mika are looking into what's going on here.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 21+ messages in thread

* tidy up and fix init-fail and teardown paths
  2015-12-11 14:59 [PATCH v2] drm/i915: Fix context/engine cleanup order Nick Hoath
  2015-12-11 16:26 ` Daniel Vetter
@ 2015-12-16 18:36 ` Dave Gordon
  2015-12-16 18:36   ` [PATCH 1/4] drm/i915: teardown default context in reverse, update comments Dave Gordon
                     ` (3 more replies)
  1 sibling, 4 replies; 21+ messages in thread
From: Dave Gordon @ 2015-12-16 18:36 UTC (permalink / raw)
  To: intel-gfx

A collection of small patches to fix some incorrect failure paths and
generally tidy up the corresponding teardown code, mostly relating to
contexts, and in particular the global default context that's created
at startup.

These should make subsequent reorganisation of other startup/teardown
code easier and safer.

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 21+ messages in thread

* [PATCH 1/4] drm/i915: teardown default context in reverse, update comments
  2015-12-16 18:36 ` tidy up and fix init-fail and teardown paths Dave Gordon
@ 2015-12-16 18:36   ` Dave Gordon
  2015-12-17 10:43     ` Nick Hoath
  2015-12-21 10:48     ` Daniel Vetter
  2015-12-16 18:36   ` [PATCH 2/4] drm/i915: mark the global default (intel_)context as such Dave Gordon
                     ` (2 subsequent siblings)
  3 siblings, 2 replies; 21+ messages in thread
From: Dave Gordon @ 2015-12-16 18:36 UTC (permalink / raw)
  To: intel-gfx

We set up engines in forwards order, so some things (notably the
default context) are "owned" by engine 0 (the render engine, aka "RCS").
For symmetry and to make sure such shared objects don't disappear too
early, we should generally run teardown loops in the reverse order,
so that engine 0 is processed last.

This patch changes i915_gem_context_fini() to do that, and clarifies the
comments in i915_gem_context_{init,fini}() about the refcounting of the
default {struct intel_)context: the refcount is just ONE, no matter how
many rings exist or are active, and this refcount is nominally ascribed
to the render ring (RCS), which is set up first and now torn down last.

Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
---
 drivers/gpu/drm/i915/i915_gem_context.c | 21 +++++++++++++++++----
 1 file changed, 17 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index 900ffd0..e143ea5 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -391,7 +391,13 @@ int i915_gem_context_init(struct drm_device *dev)
 	for (i = 0; i < I915_NUM_RINGS; i++) {
 		struct intel_engine_cs *ring = &dev_priv->ring[i];
 
-		/* NB: RCS will hold a ref for all rings */
+		/*
+		 * Although each engine has a pointer to the global default
+		 * context, they don't contribute to the refcount on the
+		 * context. We consider that RCS (which is set up first and
+		 * torn down last) holds this reference on behalf of all the
+		 * other engines
+		 */
 		ring->default_context = ctx;
 	}
 
@@ -431,14 +437,21 @@ void i915_gem_context_fini(struct drm_device *dev)
 		i915_gem_object_ggtt_unpin(dctx->legacy_hw_ctx.rcs_state);
 	}
 
-	for (i = 0; i < I915_NUM_RINGS; i++) {
+	for (i = I915_NUM_RINGS; --i >= 0;) {
 		struct intel_engine_cs *ring = &dev_priv->ring[i];
 
-		if (ring->last_context)
+		if (ring->last_context) {
 			i915_gem_context_unreference(ring->last_context);
+			ring->last_context = NULL;
+		}
 
+		/*
+		 * These default_context pointers don't contribute to the
+		 * refcount on the context. We consider that RCS holds its
+		 * reference on behalf of all the other engines, so there's
+		 * just a single unreference() call below.
+		 */
 		ring->default_context = NULL;
-		ring->last_context = NULL;
 	}
 
 	i915_gem_context_unreference(dctx);
-- 
1.9.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH 2/4] drm/i915: mark the global default (intel_)context as such
  2015-12-16 18:36 ` tidy up and fix init-fail and teardown paths Dave Gordon
  2015-12-16 18:36   ` [PATCH 1/4] drm/i915: teardown default context in reverse, update comments Dave Gordon
@ 2015-12-16 18:36   ` Dave Gordon
  2015-12-16 18:57     ` Chris Wilson
  2015-12-16 18:36   ` [PATCH 3/4] drm/i915: tidy up initialisation failure paths (legacy) Dave Gordon
  2015-12-16 18:36   ` [PATCH 4/4] drm/i915: tidy up initialisation failure paths (GEM & LRC) Dave Gordon
  3 siblings, 1 reply; 21+ messages in thread
From: Dave Gordon @ 2015-12-16 18:36 UTC (permalink / raw)
  To: intel-gfx

Some of the LRC-specific context-destruction code has to special-case
the global default context, because the HWSP is part of that context. At
present it deduces it indirectly by checking for the backpointer from
the engine to the context, but that's an unsafe assumption if the setup
and teardown code is reorganised. (It could also test !ctx->file_priv,
but again that's a detail that might be subject to change).

So here we explicitly flag the default context at the point of creation,
and then reorganise the code in intel_lr_context_free() not to rely on
the ring->default_pointer (still) being set up; to iterate over engines
in reverse (as this is teardown code); and to reduce the nesting level
so it's easier to read.

Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.h         |  2 ++
 drivers/gpu/drm/i915/i915_gem_context.c | 12 +++++++-----
 drivers/gpu/drm/i915/intel_lrc.c        | 25 ++++++++++++-------------
 3 files changed, 21 insertions(+), 18 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 9124085..ea59843 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -851,6 +851,7 @@ struct i915_ctx_hang_stats {
  * @ref: reference count.
  * @user_handle: userspace tracking identity for this context.
  * @remap_slice: l3 row remapping information.
+ * @is_default: true iff this is the global default context
  * @flags: context specific flags:
  *         CONTEXT_NO_ZEROMAP: do not allow mapping things to page 0.
  * @file_priv: filp associated with this context (NULL for global default
@@ -869,6 +870,7 @@ struct intel_context {
 	struct kref ref;
 	int user_handle;
 	uint8_t remap_slice;
+	uint8_t is_global_default;
 	struct drm_i915_private *i915;
 	int flags;
 	struct drm_i915_file_private *file_priv;
diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index e143ea5..1f16880 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -241,8 +241,10 @@ __create_hw_context(struct drm_device *dev,
 				DEFAULT_CONTEXT_HANDLE, 0, GFP_KERNEL);
 		if (ret < 0)
 			goto err_out;
-	} else
+	} else {
+		ctx->is_global_default = true;
 		ret = DEFAULT_CONTEXT_HANDLE;
+	}
 
 	ctx->file_priv = file_priv;
 	ctx->user_handle = ret;
@@ -269,7 +271,6 @@ static struct intel_context *
 i915_gem_create_context(struct drm_device *dev,
 			struct drm_i915_file_private *file_priv)
 {
-	const bool is_global_default_ctx = file_priv == NULL;
 	struct intel_context *ctx;
 	int ret = 0;
 
@@ -279,8 +280,9 @@ i915_gem_create_context(struct drm_device *dev,
 	if (IS_ERR(ctx))
 		return ctx;
 
-	if (is_global_default_ctx && ctx->legacy_hw_ctx.rcs_state) {
-		/* We may need to do things with the shrinker which
+	if (ctx->is_global_default && ctx->legacy_hw_ctx.rcs_state) {
+		/*
+		 * We may need to do things with the shrinker which
 		 * require us to immediately switch back to the default
 		 * context. This can cause a problem as pinning the
 		 * default context also requires GTT space which may not
@@ -313,7 +315,7 @@ i915_gem_create_context(struct drm_device *dev,
 	return ctx;
 
 err_unpin:
-	if (is_global_default_ctx && ctx->legacy_hw_ctx.rcs_state)
+	if (ctx->is_global_default && ctx->legacy_hw_ctx.rcs_state)
 		i915_gem_object_ggtt_unpin(ctx->legacy_hw_ctx.rcs_state);
 err_destroy:
 	idr_remove(&file_priv->context_idr, ctx->user_handle);
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 3aa6147..23f90b2 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -2367,22 +2367,21 @@ void intel_lr_context_free(struct intel_context *ctx)
 {
 	int i;
 
-	for (i = 0; i < I915_NUM_RINGS; i++) {
+	for (i = I915_NUM_RINGS; --i >= 0; ) {
+		struct intel_ringbuffer *ringbuf = ctx->engine[i].ringbuf;
 		struct drm_i915_gem_object *ctx_obj = ctx->engine[i].state;
 
-		if (ctx_obj) {
-			struct intel_ringbuffer *ringbuf =
-					ctx->engine[i].ringbuf;
-			struct intel_engine_cs *ring = ringbuf->ring;
+		if (!ctx_obj)
+			continue;
 
-			if (ctx == ring->default_context) {
-				intel_unpin_ringbuffer_obj(ringbuf);
-				i915_gem_object_ggtt_unpin(ctx_obj);
-			}
-			WARN_ON(ctx->engine[ring->id].pin_count);
-			intel_ringbuffer_free(ringbuf);
-			drm_gem_object_unreference(&ctx_obj->base);
+		if (ctx->is_global_default) {
+			intel_unpin_ringbuffer_obj(ringbuf);
+			i915_gem_object_ggtt_unpin(ctx_obj);
 		}
+
+		WARN_ON(ctx->engine[i].pin_count);
+		intel_ringbuffer_free(ringbuf);
+		drm_gem_object_unreference(&ctx_obj->base);
 	}
 }
 
@@ -2443,7 +2442,7 @@ static void lrc_setup_hardware_status_page(struct intel_engine_cs *ring,
  */
 
 int intel_lr_context_deferred_alloc(struct intel_context *ctx,
-				     struct intel_engine_cs *ring)
+				    struct intel_engine_cs *ring)
 {
 	struct drm_device *dev = ring->dev;
 	struct drm_i915_gem_object *ctx_obj;
-- 
1.9.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH 3/4] drm/i915: tidy up initialisation failure paths (legacy)
  2015-12-16 18:36 ` tidy up and fix init-fail and teardown paths Dave Gordon
  2015-12-16 18:36   ` [PATCH 1/4] drm/i915: teardown default context in reverse, update comments Dave Gordon
  2015-12-16 18:36   ` [PATCH 2/4] drm/i915: mark the global default (intel_)context as such Dave Gordon
@ 2015-12-16 18:36   ` Dave Gordon
  2015-12-17 11:36     ` Nick Hoath
  2015-12-16 18:36   ` [PATCH 4/4] drm/i915: tidy up initialisation failure paths (GEM & LRC) Dave Gordon
  3 siblings, 1 reply; 21+ messages in thread
From: Dave Gordon @ 2015-12-16 18:36 UTC (permalink / raw)
  To: intel-gfx

1. Fix intel_cleanup_ring_buffer() to handle the error cleanup
   case where the ringbuffer has been allocated but map-and-pin
   failed. Unpin it iff it's previously been mapped-and-pinned.

2. Fix the error path in intel_init_ring_buffer(), which already
   called intel_destroy_ringbuffer_obj(), but failed to free the
   actual ringbuffer structure. Calling intel_ringbuffer_free()
   instead does both in one go.

3. With the above change, intel_destroy_ringbuffer_obj() is only
   called in one place (intel_ringbuffer_free()), so flatten it
   into that function.

4. move low-level register accesses from intel_cleanup_ring_buffer()
   (which calls intel_stop_ring_buffer(ring) which calls stop_ring())
   down into stop_ring() itself), which is already doing low-level
   register accesses. Then, intel_cleanup_ring_buffer() no longer
   needs 'dev_priv'.

Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
---
 drivers/gpu/drm/i915/intel_ringbuffer.c | 47 +++++++++++++++------------------
 1 file changed, 22 insertions(+), 25 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index eefce9a..2853754 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -549,6 +549,8 @@ static bool stop_ring(struct intel_engine_cs *ring)
 		I915_WRITE_MODE(ring, _MASKED_BIT_DISABLE(STOP_RING));
 	}
 
+	WARN_ON(!IS_GEN2(ring->dev) && (I915_READ_MODE(ring) & MODE_IDLE) == 0);
+
 	return (I915_READ_HEAD(ring) & HEAD_ADDR) == 0;
 }
 
@@ -2057,12 +2059,6 @@ int intel_pin_and_map_ringbuffer_obj(struct drm_device *dev,
 	return 0;
 }
 
-static void intel_destroy_ringbuffer_obj(struct intel_ringbuffer *ringbuf)
-{
-	drm_gem_object_unreference(&ringbuf->obj->base);
-	ringbuf->obj = NULL;
-}
-
 static int intel_alloc_ringbuffer_obj(struct drm_device *dev,
 				      struct intel_ringbuffer *ringbuf)
 {
@@ -2125,11 +2121,14 @@ intel_engine_create_ringbuffer(struct intel_engine_cs *engine, int size)
 }
 
 void
-intel_ringbuffer_free(struct intel_ringbuffer *ring)
+intel_ringbuffer_free(struct intel_ringbuffer *ringbuf)
 {
-	intel_destroy_ringbuffer_obj(ring);
-	list_del(&ring->link);
-	kfree(ring);
+	if (ringbuf->obj) {
+		drm_gem_object_unreference(&ringbuf->obj->base);
+		ringbuf->obj = NULL;
+	}
+	list_del(&ringbuf->link);
+	kfree(ringbuf);
 }
 
 static int intel_init_ring_buffer(struct drm_device *dev,
@@ -2157,6 +2156,13 @@ static int intel_init_ring_buffer(struct drm_device *dev,
 	}
 	ring->buffer = ringbuf;
 
+	ret = intel_pin_and_map_ringbuffer_obj(dev, ringbuf);
+	if (ret) {
+		DRM_ERROR("Failed to pin and map ringbuffer %s: %d\n",
+				ring->name, ret);
+		goto error;
+	}
+
 	if (I915_NEED_GFX_HWS(dev)) {
 		ret = init_status_page(ring);
 		if (ret)
@@ -2168,14 +2174,6 @@ static int intel_init_ring_buffer(struct drm_device *dev,
 			goto error;
 	}
 
-	ret = intel_pin_and_map_ringbuffer_obj(dev, ringbuf);
-	if (ret) {
-		DRM_ERROR("Failed to pin and map ringbuffer %s: %d\n",
-				ring->name, ret);
-		intel_destroy_ringbuffer_obj(ringbuf);
-		goto error;
-	}
-
 	ret = i915_cmd_parser_init_ring(ring);
 	if (ret)
 		goto error;
@@ -2189,19 +2187,18 @@ error:
 
 void intel_cleanup_ring_buffer(struct intel_engine_cs *ring)
 {
-	struct drm_i915_private *dev_priv;
+	struct intel_ringbuffer *ringbuf;
 
 	if (!intel_ring_initialized(ring))
 		return;
 
-	dev_priv = to_i915(ring->dev);
-
-	if (ring->buffer) {
+	ringbuf = ring->buffer;
+	if (ringbuf) {
 		intel_stop_ring_buffer(ring);
-		WARN_ON(!IS_GEN2(ring->dev) && (I915_READ_MODE(ring) & MODE_IDLE) == 0);
 
-		intel_unpin_ringbuffer_obj(ring->buffer);
-		intel_ringbuffer_free(ring->buffer);
+		if (ringbuf->virtual_start)
+			intel_unpin_ringbuffer_obj(ringbuf);
+		intel_ringbuffer_free(ringbuf);
 		ring->buffer = NULL;
 	}
 
-- 
1.9.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH 4/4] drm/i915: tidy up initialisation failure paths (GEM & LRC)
  2015-12-16 18:36 ` tidy up and fix init-fail and teardown paths Dave Gordon
                     ` (2 preceding siblings ...)
  2015-12-16 18:36   ` [PATCH 3/4] drm/i915: tidy up initialisation failure paths (legacy) Dave Gordon
@ 2015-12-16 18:36   ` Dave Gordon
  2015-12-17 11:37     ` Nick Hoath
  3 siblings, 1 reply; 21+ messages in thread
From: Dave Gordon @ 2015-12-16 18:36 UTC (permalink / raw)
  To: intel-gfx

1. add call to i915_gem_context_fini() to deallocate the default
   context(s) if the call to init_rings() fails, so that we don't
   leak the context in that situation.

2. remove useless code in intel_logical_ring_cleanup(), presumably
   copypasted from legacy ringbuffer version at creation.

Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
---
 drivers/gpu/drm/i915/i915_gem.c  |  5 ++++-
 drivers/gpu/drm/i915/intel_lrc.c | 10 ++--------
 2 files changed, 6 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 66b1705..15f8989 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -4890,8 +4890,11 @@ int i915_gem_init(struct drm_device *dev)
 		goto out_unlock;
 
 	ret = dev_priv->gt.init_rings(dev);
-	if (ret)
+	if (ret) {
+		i915_gem_context_fini(dev);
+		/* XXX: anything else to be undone here? */
 		goto out_unlock;
+	}
 
 	ret = i915_gem_init_hw(dev);
 	if (ret == -EIO) {
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 23f90b2..cdb65eb 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -1887,17 +1887,11 @@ static int gen8_init_rcs_context(struct drm_i915_gem_request *req)
  */
 void intel_logical_ring_cleanup(struct intel_engine_cs *ring)
 {
-	struct drm_i915_private *dev_priv;
-
 	if (!intel_ring_initialized(ring))
 		return;
 
-	dev_priv = ring->dev->dev_private;
-
-	if (ring->buffer) {
-		intel_logical_ring_stop(ring);
-		WARN_ON((I915_READ_MODE(ring) & MODE_IDLE) == 0);
-	}
+	/* should not be set in LRC mode */
+	WARN_ON(ring->buffer);
 
 	if (ring->cleanup)
 		ring->cleanup(ring);
-- 
1.9.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* Re: [PATCH 2/4] drm/i915: mark the global default (intel_)context as such
  2015-12-16 18:36   ` [PATCH 2/4] drm/i915: mark the global default (intel_)context as such Dave Gordon
@ 2015-12-16 18:57     ` Chris Wilson
  2015-12-16 19:22       ` Dave Gordon
  0 siblings, 1 reply; 21+ messages in thread
From: Chris Wilson @ 2015-12-16 18:57 UTC (permalink / raw)
  To: Dave Gordon; +Cc: intel-gfx

On Wed, Dec 16, 2015 at 06:36:49PM +0000, Dave Gordon wrote:
> Some of the LRC-specific context-destruction code has to special-case
> the global default context, because the HWSP is part of that context. At
> present it deduces it indirectly by checking for the backpointer from
> the engine to the context, but that's an unsafe assumption if the setup
> and teardown code is reorganised. (It could also test !ctx->file_priv,
> but again that's a detail that might be subject to change).
> 
> So here we explicitly flag the default context at the point of creation,
> and then reorganise the code in intel_lr_context_free() not to rely on
> the ring->default_pointer (still) being set up; to iterate over engines
> in reverse (as this is teardown code); and to reduce the nesting level
> so it's easier to read.
> 
> Signed-off-by: Dave Gordon <david.s.gordon@intel.com>

#define intel_context_is_global(ctx) ((ctx)->file_priv == NULL)

> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
> index 3aa6147..23f90b2 100644
> --- a/drivers/gpu/drm/i915/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/intel_lrc.c
> @@ -2367,22 +2367,21 @@ void intel_lr_context_free(struct intel_context *ctx)
>  {
>  	int i;
>  
> -	for (i = 0; i < I915_NUM_RINGS; i++) {
> +	for (i = I915_NUM_RINGS; --i >= 0; ) {
> +		struct intel_ringbuffer *ringbuf = ctx->engine[i].ringbuf;
>  		struct drm_i915_gem_object *ctx_obj = ctx->engine[i].state;
>  
> -		if (ctx_obj) {
> -			struct intel_ringbuffer *ringbuf =
> -					ctx->engine[i].ringbuf;
> -			struct intel_engine_cs *ring = ringbuf->ring;
> +		if (!ctx_obj)
> +			continue;
>  
> -			if (ctx == ring->default_context) {
> -				intel_unpin_ringbuffer_obj(ringbuf);
> -				i915_gem_object_ggtt_unpin(ctx_obj);
> -			}
> -			WARN_ON(ctx->engine[ring->id].pin_count);
> -			intel_ringbuffer_free(ringbuf);
> -			drm_gem_object_unreference(&ctx_obj->base);
> +		if (ctx->is_global_default) {
> +			intel_unpin_ringbuffer_obj(ringbuf);
> +			i915_gem_object_ggtt_unpin(ctx_obj);
>  		}
> +
> +		WARN_ON(ctx->engine[i].pin_count);
> +		intel_ringbuffer_free(ringbuf);
> +		drm_gem_object_unreference(&ctx_obj->base);

That looks much neater, indeed.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 2/4] drm/i915: mark the global default (intel_)context as such
  2015-12-16 18:57     ` Chris Wilson
@ 2015-12-16 19:22       ` Dave Gordon
  2015-12-16 19:30         ` Chris Wilson
  0 siblings, 1 reply; 21+ messages in thread
From: Dave Gordon @ 2015-12-16 19:22 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx

On 16/12/15 18:57, Chris Wilson wrote:
> On Wed, Dec 16, 2015 at 06:36:49PM +0000, Dave Gordon wrote:
>> Some of the LRC-specific context-destruction code has to special-case
>> the global default context, because the HWSP is part of that context. At
>> present it deduces it indirectly by checking for the backpointer from
>> the engine to the context, but that's an unsafe assumption if the setup
>> and teardown code is reorganised. (It could also test !ctx->file_priv,
>> but again that's a detail that might be subject to change).
>>
>> So here we explicitly flag the default context at the point of creation,
>> and then reorganise the code in intel_lr_context_free() not to rely on
>> the ring->default_pointer (still) being set up; to iterate over engines
>> in reverse (as this is teardown code); and to reduce the nesting level
>> so it's easier to read.
>>
>> Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
>
> #define intel_context_is_global(ctx) ((ctx)->file_priv == NULL)

The last sentence of the first paragraph of the commit message above 
notes that we *could* use that as a test, but I don't regard it as a 
safe test, in either direction. That is, it could give a false negative 
if we someday associate some (internal) fd with the default context, or 
(more likely) a false positive if the file association were broken and 
the pointer nulled in an earlier stage of the teardown of a non-global 
(user-created) context.

int i915_gem_context_destroy_ioctl(struct drm_device *dev, void *data,
                                    struct drm_file *file)
{
         struct drm_i915_gem_context_destroy *args = data;
         struct drm_i915_file_private *file_priv = file->driver_priv;
         struct intel_context *ctx;
         int ret;

         if (args->ctx_id == DEFAULT_CONTEXT_HANDLE)
                 return -ENOENT;

         ret = i915_mutex_lock_interruptible(dev);
         if (ret)
                 return ret;

         ctx = i915_gem_context_get(file_priv, args->ctx_id);
         if (IS_ERR(ctx)) {
                 mutex_unlock(&dev->struct_mutex);
                 return PTR_ERR(ctx);
         }

         idr_remove(&ctx->file_priv->context_idr, ctx->user_handle);
         i915_gem_context_unreference(ctx);
         mutex_unlock(&dev->struct_mutex);

         DRM_DEBUG_DRIVER("HW context %d destroyed\n", args->ctx_id);
         return 0;
}

At present, i915_gem_context_destroy_ioctl() above removes the context 
from the file's list-of-contexts but DOESN'T clear the ctx->file_priv, 
which means there's a somewhat inconsistent (but transient) state during 
which a soon-to-be-destroyed context links to a file, but the file 
doesn't have a link back. It probably doesn't matter, because the code 
holds the mutex across the two operations ...

... unless of course the context's refcount isn't 1 at this point, in 
which case I suppose someone else *might* go from the context to the 
file and then be mystified as to why the context isn't on the list ...

... and if we changed the code above, then file_priv would *always* be 
NULL by the time the destructor was called!

So it's surely safer to have a flag that explicitly says "I'm the global 
default context" than to guess based on some other contingent property.

.Dave.
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 2/4] drm/i915: mark the global default (intel_)context as such
  2015-12-16 19:22       ` Dave Gordon
@ 2015-12-16 19:30         ` Chris Wilson
  2015-12-17 11:09           ` Nick Hoath
  0 siblings, 1 reply; 21+ messages in thread
From: Chris Wilson @ 2015-12-16 19:30 UTC (permalink / raw)
  To: Dave Gordon; +Cc: intel-gfx

On Wed, Dec 16, 2015 at 07:22:52PM +0000, Dave Gordon wrote:
> On 16/12/15 18:57, Chris Wilson wrote:
> >On Wed, Dec 16, 2015 at 06:36:49PM +0000, Dave Gordon wrote:
> >>Some of the LRC-specific context-destruction code has to special-case
> >>the global default context, because the HWSP is part of that context. At
> >>present it deduces it indirectly by checking for the backpointer from
> >>the engine to the context, but that's an unsafe assumption if the setup
> >>and teardown code is reorganised. (It could also test !ctx->file_priv,
> >>but again that's a detail that might be subject to change).
> >>
> >>So here we explicitly flag the default context at the point of creation,
> >>and then reorganise the code in intel_lr_context_free() not to rely on
> >>the ring->default_pointer (still) being set up; to iterate over engines
> >>in reverse (as this is teardown code); and to reduce the nesting level
> >>so it's easier to read.
> >>
> >>Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
> >
> >#define intel_context_is_global(ctx) ((ctx)->file_priv == NULL)
> 
> The last sentence of the first paragraph of the commit message above
> notes that we *could* use that as a test, but I don't regard it as a
> safe test, in either direction. That is, it could give a false
> negative if we someday associate some (internal) fd with the default
> context, or (more likely) a false positive if the file association
> were broken and the pointer nulled in an earlier stage of the
> teardown of a non-global (user-created) context.
> 
> int i915_gem_context_destroy_ioctl(struct drm_device *dev, void *data,
>                                    struct drm_file *file)
> {
>         struct drm_i915_gem_context_destroy *args = data;
>         struct drm_i915_file_private *file_priv = file->driver_priv;
>         struct intel_context *ctx;
>         int ret;
> 
>         if (args->ctx_id == DEFAULT_CONTEXT_HANDLE)
>                 return -ENOENT;
> 
>         ret = i915_mutex_lock_interruptible(dev);
>         if (ret)
>                 return ret;
> 
>         ctx = i915_gem_context_get(file_priv, args->ctx_id);
>         if (IS_ERR(ctx)) {
>                 mutex_unlock(&dev->struct_mutex);
>                 return PTR_ERR(ctx);
>         }
> 
>         idr_remove(&ctx->file_priv->context_idr, ctx->user_handle);
>         i915_gem_context_unreference(ctx);
>         mutex_unlock(&dev->struct_mutex);
> 
>         DRM_DEBUG_DRIVER("HW context %d destroyed\n", args->ctx_id);
>         return 0;
> }
> 
> At present, i915_gem_context_destroy_ioctl() above removes the
> context from the file's list-of-contexts but DOESN'T clear the
> ctx->file_priv, which means there's a somewhat inconsistent (but
> transient) state during which a soon-to-be-destroyed context links
> to a file, but the file doesn't have a link back. It probably
> doesn't matter, because the code holds the mutex across the two
> operations ...

And that the ctx was created to belong to the file still holds true.
 
> ... unless of course the context's refcount isn't 1 at this point,
> in which case I suppose someone else *might* go from the context to
> the file and then be mystified as to why the context isn't on the
> list ...
> 
> ... and if we changed the code above, then file_priv would *always*
> be NULL by the time the destructor was called!
> 
> So it's surely safer to have a flag that explicitly says "I'm the
> global default context" than to guess based on some other contingent
> property.

No, we have a flag that says this context was created belonging to a
file, with the corollary that only one context doesn't belong to any
file.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 1/4] drm/i915: teardown default context in reverse, update comments
  2015-12-16 18:36   ` [PATCH 1/4] drm/i915: teardown default context in reverse, update comments Dave Gordon
@ 2015-12-17 10:43     ` Nick Hoath
  2015-12-21 10:48     ` Daniel Vetter
  1 sibling, 0 replies; 21+ messages in thread
From: Nick Hoath @ 2015-12-17 10:43 UTC (permalink / raw)
  To: Gordon, David S, intel-gfx

Reviewed-by: Nick Hoath <nicholas.hoath@intel.com>

On 16/12/2015 18:36, Gordon, David S wrote:
> We set up engines in forwards order, so some things (notably the
> default context) are "owned" by engine 0 (the render engine, aka "RCS").
> For symmetry and to make sure such shared objects don't disappear too
> early, we should generally run teardown loops in the reverse order,
> so that engine 0 is processed last.
>
> This patch changes i915_gem_context_fini() to do that, and clarifies the
> comments in i915_gem_context_{init,fini}() about the refcounting of the
> default {struct intel_)context: the refcount is just ONE, no matter how
> many rings exist or are active, and this refcount is nominally ascribed
> to the render ring (RCS), which is set up first and now torn down last.
>
> Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
> ---
>   drivers/gpu/drm/i915/i915_gem_context.c | 21 +++++++++++++++++----
>   1 file changed, 17 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
> index 900ffd0..e143ea5 100644
> --- a/drivers/gpu/drm/i915/i915_gem_context.c
> +++ b/drivers/gpu/drm/i915/i915_gem_context.c
> @@ -391,7 +391,13 @@ int i915_gem_context_init(struct drm_device *dev)
>   	for (i = 0; i < I915_NUM_RINGS; i++) {
>   		struct intel_engine_cs *ring = &dev_priv->ring[i];
>
> -		/* NB: RCS will hold a ref for all rings */
> +		/*
> +		 * Although each engine has a pointer to the global default
> +		 * context, they don't contribute to the refcount on the
> +		 * context. We consider that RCS (which is set up first and
> +		 * torn down last) holds this reference on behalf of all the
> +		 * other engines
> +		 */
>   		ring->default_context = ctx;
>   	}
>
> @@ -431,14 +437,21 @@ void i915_gem_context_fini(struct drm_device *dev)
>   		i915_gem_object_ggtt_unpin(dctx->legacy_hw_ctx.rcs_state);
>   	}
>
> -	for (i = 0; i < I915_NUM_RINGS; i++) {
> +	for (i = I915_NUM_RINGS; --i >= 0;) {
>   		struct intel_engine_cs *ring = &dev_priv->ring[i];
>
> -		if (ring->last_context)
> +		if (ring->last_context) {
>   			i915_gem_context_unreference(ring->last_context);
> +			ring->last_context = NULL;
> +		}
>
> +		/*
> +		 * These default_context pointers don't contribute to the
> +		 * refcount on the context. We consider that RCS holds its
> +		 * reference on behalf of all the other engines, so there's
> +		 * just a single unreference() call below.
> +		 */
>   		ring->default_context = NULL;
> -		ring->last_context = NULL;
>   	}
>
>   	i915_gem_context_unreference(dctx);
>

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 2/4] drm/i915: mark the global default (intel_)context as such
  2015-12-16 19:30         ` Chris Wilson
@ 2015-12-17 11:09           ` Nick Hoath
  2015-12-17 12:27             ` Chris Wilson
  0 siblings, 1 reply; 21+ messages in thread
From: Nick Hoath @ 2015-12-17 11:09 UTC (permalink / raw)
  To: Gordon, David S; +Cc: intel-gfx

On 16/12/2015 19:30, Chris Wilson wrote:
> On Wed, Dec 16, 2015 at 07:22:52PM +0000, Dave Gordon wrote:
>> On 16/12/15 18:57, Chris Wilson wrote:
>>> On Wed, Dec 16, 2015 at 06:36:49PM +0000, Dave Gordon wrote:
>>>> Some of the LRC-specific context-destruction code has to special-case
>>>> the global default context, because the HWSP is part of that context. At
>>>> present it deduces it indirectly by checking for the backpointer from
>>>> the engine to the context, but that's an unsafe assumption if the setup
>>>> and teardown code is reorganised. (It could also test !ctx->file_priv,
>>>> but again that's a detail that might be subject to change).
>>>>
>>>> So here we explicitly flag the default context at the point of creation,
>>>> and then reorganise the code in intel_lr_context_free() not to rely on
>>>> the ring->default_pointer (still) being set up; to iterate over engines
>>>> in reverse (as this is teardown code); and to reduce the nesting level
>>>> so it's easier to read.
>>>>
>>>> Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
>>>
>>> #define intel_context_is_global(ctx) ((ctx)->file_priv == NULL)
>>
>> The last sentence of the first paragraph of the commit message above
>> notes that we *could* use that as a test, but I don't regard it as a
>> safe test, in either direction. That is, it could give a false
>> negative if we someday associate some (internal) fd with the default
>> context, or (more likely) a false positive if the file association
>> were broken and the pointer nulled in an earlier stage of the
>> teardown of a non-global (user-created) context.
>>
>> int i915_gem_context_destroy_ioctl(struct drm_device *dev, void *data,
>>                                     struct drm_file *file)
>> {
>>          struct drm_i915_gem_context_destroy *args = data;
>>          struct drm_i915_file_private *file_priv = file->driver_priv;
>>          struct intel_context *ctx;
>>          int ret;
>>
>>          if (args->ctx_id == DEFAULT_CONTEXT_HANDLE)
>>                  return -ENOENT;
>>
>>          ret = i915_mutex_lock_interruptible(dev);
>>          if (ret)
>>                  return ret;
>>
>>          ctx = i915_gem_context_get(file_priv, args->ctx_id);
>>          if (IS_ERR(ctx)) {
>>                  mutex_unlock(&dev->struct_mutex);
>>                  return PTR_ERR(ctx);
>>          }
>>
>>          idr_remove(&ctx->file_priv->context_idr, ctx->user_handle);
>>          i915_gem_context_unreference(ctx);
>>          mutex_unlock(&dev->struct_mutex);
>>
>>          DRM_DEBUG_DRIVER("HW context %d destroyed\n", args->ctx_id);
>>          return 0;
>> }
>>
>> At present, i915_gem_context_destroy_ioctl() above removes the
>> context from the file's list-of-contexts but DOESN'T clear the
>> ctx->file_priv, which means there's a somewhat inconsistent (but
>> transient) state during which a soon-to-be-destroyed context links
>> to a file, but the file doesn't have a link back. It probably
>> doesn't matter, because the code holds the mutex across the two
>> operations ...
>
> And that the ctx was created to belong to the file still holds true.
>
>> ... unless of course the context's refcount isn't 1 at this point,
>> in which case I suppose someone else *might* go from the context to
>> the file and then be mystified as to why the context isn't on the
>> list ...
>>
>> ... and if we changed the code above, then file_priv would *always*
>> be NULL by the time the destructor was called!
>>
>> So it's surely safer to have a flag that explicitly says "I'm the
>> global default context" than to guess based on some other contingent
>> property.
>
> No, we have a flag that says this context was created belonging to a
> file, with the corollary that only one context doesn't belong to any
> file.
Using pointers like this to provide 'magic' secondary state information
just adds to the fragility of the driver.
So:
Reviewed-by: Nick Hoath <nicholas.hoath@intel.com>
to the original patch.
> -Chris
>

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 3/4] drm/i915: tidy up initialisation failure paths (legacy)
  2015-12-16 18:36   ` [PATCH 3/4] drm/i915: tidy up initialisation failure paths (legacy) Dave Gordon
@ 2015-12-17 11:36     ` Nick Hoath
  0 siblings, 0 replies; 21+ messages in thread
From: Nick Hoath @ 2015-12-17 11:36 UTC (permalink / raw)
  To: Gordon, David S, intel-gfx

On 16/12/2015 18:36, Gordon, David S wrote:
> 1. Fix intel_cleanup_ring_buffer() to handle the error cleanup
>     case where the ringbuffer has been allocated but map-and-pin
>     failed. Unpin it iff it's previously been mapped-and-pinned.
>
> 2. Fix the error path in intel_init_ring_buffer(), which already
>     called intel_destroy_ringbuffer_obj(), but failed to free the
>     actual ringbuffer structure. Calling intel_ringbuffer_free()
>     instead does both in one go.
>
> 3. With the above change, intel_destroy_ringbuffer_obj() is only
>     called in one place (intel_ringbuffer_free()), so flatten it
>     into that function.
>
> 4. move low-level register accesses from intel_cleanup_ring_buffer()
>     (which calls intel_stop_ring_buffer(ring) which calls stop_ring())
>     down into stop_ring() itself), which is already doing low-level
>     register accesses. Then, intel_cleanup_ring_buffer() no longer
>     needs 'dev_priv'.
>
Reviewed-by: Nick Hoath <nicholas.hoath@intel.com>

> Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
> ---
>   drivers/gpu/drm/i915/intel_ringbuffer.c | 47 +++++++++++++++------------------
>   1 file changed, 22 insertions(+), 25 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
> index eefce9a..2853754 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> @@ -549,6 +549,8 @@ static bool stop_ring(struct intel_engine_cs *ring)
>   		I915_WRITE_MODE(ring, _MASKED_BIT_DISABLE(STOP_RING));
>   	}
>
> +	WARN_ON(!IS_GEN2(ring->dev) && (I915_READ_MODE(ring) & MODE_IDLE) == 0);
> +
>   	return (I915_READ_HEAD(ring) & HEAD_ADDR) == 0;
>   }
>
> @@ -2057,12 +2059,6 @@ int intel_pin_and_map_ringbuffer_obj(struct drm_device *dev,
>   	return 0;
>   }
>
> -static void intel_destroy_ringbuffer_obj(struct intel_ringbuffer *ringbuf)
> -{
> -	drm_gem_object_unreference(&ringbuf->obj->base);
> -	ringbuf->obj = NULL;
> -}
> -
>   static int intel_alloc_ringbuffer_obj(struct drm_device *dev,
>   				      struct intel_ringbuffer *ringbuf)
>   {
> @@ -2125,11 +2121,14 @@ intel_engine_create_ringbuffer(struct intel_engine_cs *engine, int size)
>   }
>
>   void
> -intel_ringbuffer_free(struct intel_ringbuffer *ring)
> +intel_ringbuffer_free(struct intel_ringbuffer *ringbuf)
>   {
> -	intel_destroy_ringbuffer_obj(ring);
> -	list_del(&ring->link);
> -	kfree(ring);
> +	if (ringbuf->obj) {
> +		drm_gem_object_unreference(&ringbuf->obj->base);
> +		ringbuf->obj = NULL;
> +	}
> +	list_del(&ringbuf->link);
> +	kfree(ringbuf);
>   }
>
>   static int intel_init_ring_buffer(struct drm_device *dev,
> @@ -2157,6 +2156,13 @@ static int intel_init_ring_buffer(struct drm_device *dev,
>   	}
>   	ring->buffer = ringbuf;
>
> +	ret = intel_pin_and_map_ringbuffer_obj(dev, ringbuf);
> +	if (ret) {
> +		DRM_ERROR("Failed to pin and map ringbuffer %s: %d\n",
> +				ring->name, ret);
> +		goto error;
> +	}
> +
>   	if (I915_NEED_GFX_HWS(dev)) {
>   		ret = init_status_page(ring);
>   		if (ret)
> @@ -2168,14 +2174,6 @@ static int intel_init_ring_buffer(struct drm_device *dev,
>   			goto error;
>   	}
>
> -	ret = intel_pin_and_map_ringbuffer_obj(dev, ringbuf);
> -	if (ret) {
> -		DRM_ERROR("Failed to pin and map ringbuffer %s: %d\n",
> -				ring->name, ret);
> -		intel_destroy_ringbuffer_obj(ringbuf);
> -		goto error;
> -	}
> -
>   	ret = i915_cmd_parser_init_ring(ring);
>   	if (ret)
>   		goto error;
> @@ -2189,19 +2187,18 @@ error:
>
>   void intel_cleanup_ring_buffer(struct intel_engine_cs *ring)
>   {
> -	struct drm_i915_private *dev_priv;
> +	struct intel_ringbuffer *ringbuf;
>
>   	if (!intel_ring_initialized(ring))
>   		return;
>
> -	dev_priv = to_i915(ring->dev);
> -
> -	if (ring->buffer) {
> +	ringbuf = ring->buffer;
> +	if (ringbuf) {
>   		intel_stop_ring_buffer(ring);
> -		WARN_ON(!IS_GEN2(ring->dev) && (I915_READ_MODE(ring) & MODE_IDLE) == 0);
>
> -		intel_unpin_ringbuffer_obj(ring->buffer);
> -		intel_ringbuffer_free(ring->buffer);
> +		if (ringbuf->virtual_start)
> +			intel_unpin_ringbuffer_obj(ringbuf);
> +		intel_ringbuffer_free(ringbuf);
>   		ring->buffer = NULL;
>   	}
>
>

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 4/4] drm/i915: tidy up initialisation failure paths (GEM & LRC)
  2015-12-16 18:36   ` [PATCH 4/4] drm/i915: tidy up initialisation failure paths (GEM & LRC) Dave Gordon
@ 2015-12-17 11:37     ` Nick Hoath
  0 siblings, 0 replies; 21+ messages in thread
From: Nick Hoath @ 2015-12-17 11:37 UTC (permalink / raw)
  To: Gordon, David S, intel-gfx

On 16/12/2015 18:36, Gordon, David S wrote:
> 1. add call to i915_gem_context_fini() to deallocate the default
>     context(s) if the call to init_rings() fails, so that we don't
>     leak the context in that situation.
>
> 2. remove useless code in intel_logical_ring_cleanup(), presumably
>     copypasted from legacy ringbuffer version at creation.
>

Reviewed-by: Nick Hoath <nicholas.hoath@intel.com>

> Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
> ---
>   drivers/gpu/drm/i915/i915_gem.c  |  5 ++++-
>   drivers/gpu/drm/i915/intel_lrc.c | 10 ++--------
>   2 files changed, 6 insertions(+), 9 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index 66b1705..15f8989 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -4890,8 +4890,11 @@ int i915_gem_init(struct drm_device *dev)
>   		goto out_unlock;
>
>   	ret = dev_priv->gt.init_rings(dev);
> -	if (ret)
> +	if (ret) {
> +		i915_gem_context_fini(dev);
> +		/* XXX: anything else to be undone here? */
>   		goto out_unlock;
> +	}
>
>   	ret = i915_gem_init_hw(dev);
>   	if (ret == -EIO) {
> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
> index 23f90b2..cdb65eb 100644
> --- a/drivers/gpu/drm/i915/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/intel_lrc.c
> @@ -1887,17 +1887,11 @@ static int gen8_init_rcs_context(struct drm_i915_gem_request *req)
>    */
>   void intel_logical_ring_cleanup(struct intel_engine_cs *ring)
>   {
> -	struct drm_i915_private *dev_priv;
> -
>   	if (!intel_ring_initialized(ring))
>   		return;
>
> -	dev_priv = ring->dev->dev_private;
> -
> -	if (ring->buffer) {
> -		intel_logical_ring_stop(ring);
> -		WARN_ON((I915_READ_MODE(ring) & MODE_IDLE) == 0);
> -	}
> +	/* should not be set in LRC mode */
> +	WARN_ON(ring->buffer);
>
>   	if (ring->cleanup)
>   		ring->cleanup(ring);
>

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 2/4] drm/i915: mark the global default (intel_)context as such
  2015-12-17 11:09           ` Nick Hoath
@ 2015-12-17 12:27             ` Chris Wilson
  2015-12-17 19:00               ` Dave Gordon
  0 siblings, 1 reply; 21+ messages in thread
From: Chris Wilson @ 2015-12-17 12:27 UTC (permalink / raw)
  To: Nick Hoath; +Cc: intel-gfx

On Thu, Dec 17, 2015 at 11:09:54AM +0000, Nick Hoath wrote:
> On 16/12/2015 19:30, Chris Wilson wrote:
> >On Wed, Dec 16, 2015 at 07:22:52PM +0000, Dave Gordon wrote:
> >>On 16/12/15 18:57, Chris Wilson wrote:
> >>>On Wed, Dec 16, 2015 at 06:36:49PM +0000, Dave Gordon wrote:
> >>>>Some of the LRC-specific context-destruction code has to special-case
> >>>>the global default context, because the HWSP is part of that context. At
> >>>>present it deduces it indirectly by checking for the backpointer from
> >>>>the engine to the context, but that's an unsafe assumption if the setup
> >>>>and teardown code is reorganised. (It could also test !ctx->file_priv,
> >>>>but again that's a detail that might be subject to change).
> >>>>
> >>>>So here we explicitly flag the default context at the point of creation,
> >>>>and then reorganise the code in intel_lr_context_free() not to rely on
> >>>>the ring->default_pointer (still) being set up; to iterate over engines
> >>>>in reverse (as this is teardown code); and to reduce the nesting level
> >>>>so it's easier to read.
> >>>>
> >>>>Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
> >>>
> >>>#define intel_context_is_global(ctx) ((ctx)->file_priv == NULL)
> >>
> >>The last sentence of the first paragraph of the commit message above
> >>notes that we *could* use that as a test, but I don't regard it as a
> >>safe test, in either direction. That is, it could give a false
> >>negative if we someday associate some (internal) fd with the default
> >>context, or (more likely) a false positive if the file association
> >>were broken and the pointer nulled in an earlier stage of the
> >>teardown of a non-global (user-created) context.
> >>
> >>int i915_gem_context_destroy_ioctl(struct drm_device *dev, void *data,
> >>                                    struct drm_file *file)
> >>{
> >>         struct drm_i915_gem_context_destroy *args = data;
> >>         struct drm_i915_file_private *file_priv = file->driver_priv;
> >>         struct intel_context *ctx;
> >>         int ret;
> >>
> >>         if (args->ctx_id == DEFAULT_CONTEXT_HANDLE)
> >>                 return -ENOENT;
> >>
> >>         ret = i915_mutex_lock_interruptible(dev);
> >>         if (ret)
> >>                 return ret;
> >>
> >>         ctx = i915_gem_context_get(file_priv, args->ctx_id);
> >>         if (IS_ERR(ctx)) {
> >>                 mutex_unlock(&dev->struct_mutex);
> >>                 return PTR_ERR(ctx);
> >>         }
> >>
> >>         idr_remove(&ctx->file_priv->context_idr, ctx->user_handle);
> >>         i915_gem_context_unreference(ctx);
> >>         mutex_unlock(&dev->struct_mutex);
> >>
> >>         DRM_DEBUG_DRIVER("HW context %d destroyed\n", args->ctx_id);
> >>         return 0;
> >>}
> >>
> >>At present, i915_gem_context_destroy_ioctl() above removes the
> >>context from the file's list-of-contexts but DOESN'T clear the
> >>ctx->file_priv, which means there's a somewhat inconsistent (but
> >>transient) state during which a soon-to-be-destroyed context links
> >>to a file, but the file doesn't have a link back. It probably
> >>doesn't matter, because the code holds the mutex across the two
> >>operations ...
> >
> >And that the ctx was created to belong to the file still holds true.
> >
> >>... unless of course the context's refcount isn't 1 at this point,
> >>in which case I suppose someone else *might* go from the context to
> >>the file and then be mystified as to why the context isn't on the
> >>list ...
> >>
> >>... and if we changed the code above, then file_priv would *always*
> >>be NULL by the time the destructor was called!
> >>
> >>So it's surely safer to have a flag that explicitly says "I'm the
> >>global default context" than to guess based on some other contingent
> >>property.
> >
> >No, we have a flag that says this context was created belonging to a
> >file, with the corollary that only one context doesn't belong to any
> >file.
> Using pointers like this to provide 'magic' secondary state information
> just adds to the fragility of the driver.

It's primary. Ownership information is fundamental to the driver.
The change is irrelevant to the bugfix.

If you want to make such a big change, eliminate the default_ctx from
execlists.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 2/4] drm/i915: mark the global default (intel_)context as such
  2015-12-17 12:27             ` Chris Wilson
@ 2015-12-17 19:00               ` Dave Gordon
  2015-12-18 16:02                 ` Dave Gordon
  0 siblings, 1 reply; 21+ messages in thread
From: Dave Gordon @ 2015-12-17 19:00 UTC (permalink / raw)
  To: Chris Wilson, Nick Hoath, intel-gfx

On 17/12/15 12:27, Chris Wilson wrote:
> On Thu, Dec 17, 2015 at 11:09:54AM +0000, Nick Hoath wrote:
>> On 16/12/2015 19:30, Chris Wilson wrote:
>>> On Wed, Dec 16, 2015 at 07:22:52PM +0000, Dave Gordon wrote:
>>>> On 16/12/15 18:57, Chris Wilson wrote:
>>>>> On Wed, Dec 16, 2015 at 06:36:49PM +0000, Dave Gordon wrote:
>>>>>> Some of the LRC-specific context-destruction code has to special-case
>>>>>> the global default context, because the HWSP is part of that context. At
>>>>>> present it deduces it indirectly by checking for the backpointer from
>>>>>> the engine to the context, but that's an unsafe assumption if the setup
>>>>>> and teardown code is reorganised. (It could also test !ctx->file_priv,
>>>>>> but again that's a detail that might be subject to change).
>>>>>>
>>>>>> So here we explicitly flag the default context at the point of creation,
>>>>>> and then reorganise the code in intel_lr_context_free() not to rely on
>>>>>> the ring->default_pointer (still) being set up; to iterate over engines
>>>>>> in reverse (as this is teardown code); and to reduce the nesting level
>>>>>> so it's easier to read.
>>>>>>
>>>>>> Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
>>>>>
>>>>> #define intel_context_is_global(ctx) ((ctx)->file_priv == NULL)

Why not
#define intel_context_is_global(ctx)	((ctx)->is_global) /* ! */

>>>> The last sentence of the first paragraph of the commit message above
>>>> notes that we *could* use that as a test, but I don't regard it as a
>>>> safe test, in either direction. That is, it could give a false
>>>> negative if we someday associate some (internal) fd with the default
>>>> context, or (more likely) a false positive if the file association
>>>> were broken and the pointer nulled in an earlier stage of the
>>>> teardown of a non-global (user-created) context.
>>>>
>>>> int i915_gem_context_destroy_ioctl(struct drm_device *dev, void *data,
>>>>                                     struct drm_file *file)
>>>> {
>>>>          struct drm_i915_gem_context_destroy *args = data;
>>>>          struct drm_i915_file_private *file_priv = file->driver_priv;
>>>>          struct intel_context *ctx;
>>>>          int ret;
>>>>
>>>>          if (args->ctx_id == DEFAULT_CONTEXT_HANDLE)
>>>>                  return -ENOENT;
>>>>
>>>>          ret = i915_mutex_lock_interruptible(dev);
>>>>          if (ret)
>>>>                  return ret;
>>>>
>>>>          ctx = i915_gem_context_get(file_priv, args->ctx_id);
>>>>          if (IS_ERR(ctx)) {
>>>>                  mutex_unlock(&dev->struct_mutex);
>>>>                  return PTR_ERR(ctx);
>>>>          }
>>>>
>>>>          idr_remove(&ctx->file_priv->context_idr, ctx->user_handle);
>>>>          i915_gem_context_unreference(ctx);
>>>>          mutex_unlock(&dev->struct_mutex);
>>>>
>>>>          DRM_DEBUG_DRIVER("HW context %d destroyed\n", args->ctx_id);
>>>>          return 0;
>>>> }
>>>>
>>>> At present, i915_gem_context_destroy_ioctl() above removes the
>>>> context from the file's list-of-contexts but DOESN'T clear the
>>>> ctx->file_priv, which means there's a somewhat inconsistent (but
>>>> transient) state during which a soon-to-be-destroyed context links
>>>> to a file, but the file doesn't have a link back. It probably
>>>> doesn't matter, because the code holds the mutex across the two
>>>> operations ...
>>>
>>> And that the ctx was created to belong to the file still holds true.
>>>
>>>> ... unless of course the context's refcount isn't 1 at this point,
>>>> in which case I suppose someone else *might* go from the context to
>>>> the file and then be mystified as to why the context isn't on the
>>>> list ...
>>>>
>>>> ... and if we changed the code above, then file_priv would *always*
>>>> be NULL by the time the destructor was called!
>>>>
>>>> So it's surely safer to have a flag that explicitly says "I'm the
>>>> global default context" than to guess based on some other contingent
>>>> property.
>>>
>>> No, we have a flag that says this context was created belonging to a
>>> file, with the corollary that only one context doesn't belong to any
>>> file.
>> Using pointers like this to provide 'magic' secondary state information
>> just adds to the fragility of the driver.
>
> It's primary. Ownership information is fundamental to the driver.
> The change is irrelevant to the bugfix.
>
> If you want to make such a big change, eliminate the default_ctx from
> execlists.
> -Chris

No, we need the default (or global) context for idling the engines, as 
well as for sending initialisation commands during startup. We can't 
make the GPU stop using any given (user) context within a known bounded 
time except by telling it to switch to another context. Ergo, to stop 
using ANY user context, there must exist a non-user context that we can 
switch to.

And this is not a BIG change, it's a trivial change, to identify THE 
GLOBAL CONTEXT by means of *specific* information contained within that 
context, rather than by relying on information stored in a different 
structure.

Your suggested #define also does that much, but in a less obvious way 
that I consider lays a trap for a future maintainer who doesn't realise 
that the code relies on a subtle distinction between "a context created 
internally by the driver, without an owning fd" and "a context that has 
become detached from the fd that originally created it".

Eliminating such obscure and undocumented dependencies enables us to 
more safely restructure and simplify areas that currently have 
excessively high coupling, in terms of the nonlocal assumptions that 
each part makes about what other pieces of code have done/will do.
The delicate and fragile load/unload dance is a prime example.

.Dave.
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 2/4] drm/i915: mark the global default (intel_)context as such
  2015-12-17 19:00               ` Dave Gordon
@ 2015-12-18 16:02                 ` Dave Gordon
  0 siblings, 0 replies; 21+ messages in thread
From: Dave Gordon @ 2015-12-18 16:02 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

On 17/12/15 19:00, Dave Gordon wrote:
> On 17/12/15 12:27, Chris Wilson wrote:
>> On Thu, Dec 17, 2015 at 11:09:54AM +0000, Nick Hoath wrote:
[snip!]
>>
>> If you want to make such a big change, eliminate the default_ctx from
>> execlists.
>> -Chris
>
> No, we need the default (or global) context for idling the engines, as
> well as for sending initialisation commands during startup. We can't
> make the GPU stop using any given (user) context within a known bounded
> time except by telling it to switch to another context. Ergo, to stop
> using ANY user context, there must exist a non-user context that we can
> switch to.

After writing this last night, I realised this morning that maybe you 
didn't mean "eliminate the default context per se" but rather "eliminate 
the use of ring->default_context", which is a very different thing and 
much less problematic.

So if that's the case, then, yes, I'll be quite happy to provide a 
followup patch which eliminates most uses of ring->default_context and 
in particular all those in intel_lrc.c where it's compared against 
another context pointer. But I'm not going to do that extra work until 
and unless this is merged, as it would just be a waste of effort.

The reason we just hit this specific use of default_context first is 
because this is the one that was blocking the merge of Nick's "Fix 
context/engine cleanup order" patch that you'd already R-B'd. Once this 
is in, we can get Nick's patches in, and /then/ clean up all the other 
comparisons made against ring->default_context.

.Dave.

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 1/4] drm/i915: teardown default context in reverse, update comments
  2015-12-16 18:36   ` [PATCH 1/4] drm/i915: teardown default context in reverse, update comments Dave Gordon
  2015-12-17 10:43     ` Nick Hoath
@ 2015-12-21 10:48     ` Daniel Vetter
  2015-12-21 11:01       ` Chris Wilson
  1 sibling, 1 reply; 21+ messages in thread
From: Daniel Vetter @ 2015-12-21 10:48 UTC (permalink / raw)
  To: Dave Gordon; +Cc: intel-gfx

On Wed, Dec 16, 2015 at 06:36:48PM +0000, Dave Gordon wrote:
> We set up engines in forwards order, so some things (notably the
> default context) are "owned" by engine 0 (the render engine, aka "RCS").
> For symmetry and to make sure such shared objects don't disappear too
> early, we should generally run teardown loops in the reverse order,
> so that engine 0 is processed last.
> 
> This patch changes i915_gem_context_fini() to do that, and clarifies the
> comments in i915_gem_context_{init,fini}() about the refcounting of the
> default {struct intel_)context: the refcount is just ONE, no matter how
> many rings exist or are active, and this refcount is nominally ascribed
> to the render ring (RCS), which is set up first and now torn down last.
> 
> Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
> ---
>  drivers/gpu/drm/i915/i915_gem_context.c | 21 +++++++++++++++++----
>  1 file changed, 17 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
> index 900ffd0..e143ea5 100644
> --- a/drivers/gpu/drm/i915/i915_gem_context.c
> +++ b/drivers/gpu/drm/i915/i915_gem_context.c
> @@ -391,7 +391,13 @@ int i915_gem_context_init(struct drm_device *dev)
>  	for (i = 0; i < I915_NUM_RINGS; i++) {
>  		struct intel_engine_cs *ring = &dev_priv->ring[i];
>  
> -		/* NB: RCS will hold a ref for all rings */
> +		/*
> +		 * Although each engine has a pointer to the global default
> +		 * context, they don't contribute to the refcount on the
> +		 * context. We consider that RCS (which is set up first and
> +		 * torn down last) holds this reference on behalf of all the
> +		 * other engines
> +		 */

Instead of piles of comments, can't we just reference-count this pointer
properly? Pointers to reference-counted objects which don't hold a full
reference are just fraught with peril, and doing that should imo only be
done when there's really clear performance data justifying the
atomic_inc/dec overhead. Init/teardown code isn't such a place.

This misdesign goes back to the original execlist merge, which expanded
the default context from RCS to all engines.

Or do I miss something and we can't do this?

Thanks, Daniel

>  		ring->default_context = ctx;
>  	}
>  
> @@ -431,14 +437,21 @@ void i915_gem_context_fini(struct drm_device *dev)
>  		i915_gem_object_ggtt_unpin(dctx->legacy_hw_ctx.rcs_state);
>  	}
>  
> -	for (i = 0; i < I915_NUM_RINGS; i++) {
> +	for (i = I915_NUM_RINGS; --i >= 0;) {
>  		struct intel_engine_cs *ring = &dev_priv->ring[i];
>  
> -		if (ring->last_context)
> +		if (ring->last_context) {
>  			i915_gem_context_unreference(ring->last_context);
> +			ring->last_context = NULL;
> +		}
>  
> +		/*
> +		 * These default_context pointers don't contribute to the
> +		 * refcount on the context. We consider that RCS holds its
> +		 * reference on behalf of all the other engines, so there's
> +		 * just a single unreference() call below.
> +		 */
>  		ring->default_context = NULL;
> -		ring->last_context = NULL;
>  	}
>  
>  	i915_gem_context_unreference(dctx);
> -- 
> 1.9.1
> 
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 1/4] drm/i915: teardown default context in reverse, update comments
  2015-12-21 10:48     ` Daniel Vetter
@ 2015-12-21 11:01       ` Chris Wilson
  2015-12-21 11:38         ` Dave Gordon
  0 siblings, 1 reply; 21+ messages in thread
From: Chris Wilson @ 2015-12-21 11:01 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: intel-gfx

On Mon, Dec 21, 2015 at 11:48:35AM +0100, Daniel Vetter wrote:
> On Wed, Dec 16, 2015 at 06:36:48PM +0000, Dave Gordon wrote:
> > We set up engines in forwards order, so some things (notably the
> > default context) are "owned" by engine 0 (the render engine, aka "RCS").
> > For symmetry and to make sure such shared objects don't disappear too
> > early, we should generally run teardown loops in the reverse order,
> > so that engine 0 is processed last.
> > 
> > This patch changes i915_gem_context_fini() to do that, and clarifies the
> > comments in i915_gem_context_{init,fini}() about the refcounting of the
> > default {struct intel_)context: the refcount is just ONE, no matter how
> > many rings exist or are active, and this refcount is nominally ascribed
> > to the render ring (RCS), which is set up first and now torn down last.
> > 
> > Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
> > ---
> >  drivers/gpu/drm/i915/i915_gem_context.c | 21 +++++++++++++++++----
> >  1 file changed, 17 insertions(+), 4 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
> > index 900ffd0..e143ea5 100644
> > --- a/drivers/gpu/drm/i915/i915_gem_context.c
> > +++ b/drivers/gpu/drm/i915/i915_gem_context.c
> > @@ -391,7 +391,13 @@ int i915_gem_context_init(struct drm_device *dev)
> >  	for (i = 0; i < I915_NUM_RINGS; i++) {
> >  		struct intel_engine_cs *ring = &dev_priv->ring[i];
> >  
> > -		/* NB: RCS will hold a ref for all rings */
> > +		/*
> > +		 * Although each engine has a pointer to the global default
> > +		 * context, they don't contribute to the refcount on the
> > +		 * context. We consider that RCS (which is set up first and
> > +		 * torn down last) holds this reference on behalf of all the
> > +		 * other engines
> > +		 */
> 
> Instead of piles of comments, can't we just reference-count this pointer
> properly? Pointers to reference-counted objects which don't hold a full
> reference are just fraught with peril, and doing that should imo only be
> done when there's really clear performance data justifying the
> atomic_inc/dec overhead. Init/teardown code isn't such a place.
> 
> This misdesign goes back to the original execlist merge, which expanded
> the default context from RCS to all engines.
> 
> Or do I miss something and we can't do this?

We can. It's actually even easier to just do dev_priv->kernel_context.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 1/4] drm/i915: teardown default context in reverse, update comments
  2015-12-21 11:01       ` Chris Wilson
@ 2015-12-21 11:38         ` Dave Gordon
  0 siblings, 0 replies; 21+ messages in thread
From: Dave Gordon @ 2015-12-21 11:38 UTC (permalink / raw)
  To: Chris Wilson, Daniel Vetter, intel-gfx

On 21/12/15 11:01, Chris Wilson wrote:
> On Mon, Dec 21, 2015 at 11:48:35AM +0100, Daniel Vetter wrote:
>> On Wed, Dec 16, 2015 at 06:36:48PM +0000, Dave Gordon wrote:
>>> We set up engines in forwards order, so some things (notably the
>>> default context) are "owned" by engine 0 (the render engine, aka "RCS").
>>> For symmetry and to make sure such shared objects don't disappear too
>>> early, we should generally run teardown loops in the reverse order,
>>> so that engine 0 is processed last.
>>>
>>> This patch changes i915_gem_context_fini() to do that, and clarifies the
>>> comments in i915_gem_context_{init,fini}() about the refcounting of the
>>> default {struct intel_)context: the refcount is just ONE, no matter how
>>> many rings exist or are active, and this refcount is nominally ascribed
>>> to the render ring (RCS), which is set up first and now torn down last.
>>>
>>> Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
>>> ---
>>>   drivers/gpu/drm/i915/i915_gem_context.c | 21 +++++++++++++++++----
>>>   1 file changed, 17 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
>>> index 900ffd0..e143ea5 100644
>>> --- a/drivers/gpu/drm/i915/i915_gem_context.c
>>> +++ b/drivers/gpu/drm/i915/i915_gem_context.c
>>> @@ -391,7 +391,13 @@ int i915_gem_context_init(struct drm_device *dev)
>>>   	for (i = 0; i < I915_NUM_RINGS; i++) {
>>>   		struct intel_engine_cs *ring = &dev_priv->ring[i];
>>>
>>> -		/* NB: RCS will hold a ref for all rings */
>>> +		/*
>>> +		 * Although each engine has a pointer to the global default
>>> +		 * context, they don't contribute to the refcount on the
>>> +		 * context. We consider that RCS (which is set up first and
>>> +		 * torn down last) holds this reference on behalf of all the
>>> +		 * other engines
>>> +		 */
>>
>> Instead of piles of comments, can't we just reference-count this pointer
>> properly? Pointers to reference-counted objects which don't hold a full
>> reference are just fraught with peril, and doing that should imo only be
>> done when there's really clear performance data justifying the
>> atomic_inc/dec overhead. Init/teardown code isn't such a place.
>>
>> This misdesign goes back to the original execlist merge, which expanded
>> the default context from RCS to all engines.
>>
>> Or do I miss something and we can't do this?
>
> We can. It's actually even easier to just do dev_priv->kernel_context.
> -Chris

I'm sure it can be done, but Nick & I tried changing this section so 
that each engine held a reference, and it broke something else (because 
the last unreference then occured at a different point), hence my 
comment about the "the delicate and fragile load/unload dance".

This is why we're trying to tidy up and clarify, bit by bit, without 
breaking too many of the existing assumptions at once.

But ... I like Chris' idea of moving the pointer to the default context 
from the engine structure to dev_priv; that will simplify several of the 
complicated ...->engine[RCS].default_context constructs, and mean that 
we're change the pointers to match the lifecycle rather than vice versa. 
I'll try that ...

.Dave.
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 21+ messages in thread

end of thread, other threads:[~2015-12-21 11:38 UTC | newest]

Thread overview: 21+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-12-11 14:59 [PATCH v2] drm/i915: Fix context/engine cleanup order Nick Hoath
2015-12-11 16:26 ` Daniel Vetter
2015-12-11 18:58   ` Daniel Vetter
2015-12-16 18:36 ` tidy up and fix init-fail and teardown paths Dave Gordon
2015-12-16 18:36   ` [PATCH 1/4] drm/i915: teardown default context in reverse, update comments Dave Gordon
2015-12-17 10:43     ` Nick Hoath
2015-12-21 10:48     ` Daniel Vetter
2015-12-21 11:01       ` Chris Wilson
2015-12-21 11:38         ` Dave Gordon
2015-12-16 18:36   ` [PATCH 2/4] drm/i915: mark the global default (intel_)context as such Dave Gordon
2015-12-16 18:57     ` Chris Wilson
2015-12-16 19:22       ` Dave Gordon
2015-12-16 19:30         ` Chris Wilson
2015-12-17 11:09           ` Nick Hoath
2015-12-17 12:27             ` Chris Wilson
2015-12-17 19:00               ` Dave Gordon
2015-12-18 16:02                 ` Dave Gordon
2015-12-16 18:36   ` [PATCH 3/4] drm/i915: tidy up initialisation failure paths (legacy) Dave Gordon
2015-12-17 11:36     ` Nick Hoath
2015-12-16 18:36   ` [PATCH 4/4] drm/i915: tidy up initialisation failure paths (GEM & LRC) Dave Gordon
2015-12-17 11:37     ` Nick Hoath

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.