Using RCU requests, take 2

All of lore.kernel.org
 help / color / mirror / Atom feed

* Using RCU requests, take 2
@ 2016-08-04 19:52 Chris Wilson
  2016-08-04 19:52 ` [PATCH 01/19] drm/i915: Introduce i915_gem_active_wait_unlocked() Chris Wilson
                   ` (19 more replies)
  0 siblings, 20 replies; 33+ messages in thread
From: Chris Wilson @ 2016-08-04 19:52 UTC (permalink / raw)
  To: intel-gfx

Joonas has gone through this series once and now hopefully this captures
his feedback. Join in, have fun!
-Chris

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [PATCH 01/19] drm/i915: Introduce i915_gem_active_wait_unlocked()
  2016-08-04 19:52 Using RCU requests, take 2 Chris Wilson
@ 2016-08-04 19:52 ` Chris Wilson
  2016-08-04 19:52 ` [PATCH 02/19] drm/i915: Convert non-blocking waits for requests over to using RCU Chris Wilson
                   ` (18 subsequent siblings)
  19 siblings, 0 replies; 33+ messages in thread
From: Chris Wilson @ 2016-08-04 19:52 UTC (permalink / raw)
  To: intel-gfx

It is useful to be able to wait on pending rendering without grabbing
the struct_mutex. We can do this by using the i915_gem_active_get_rcu()
primitive to acquire a reference to the pending request without
requiring struct_mutex, just the RCU read lock, and then call
i915_wait_request().

v2: Rebase onto new i915_gem_active_get_unlocked() semantics that take
the RCU read lock on behalf of the caller.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
---
 drivers/gpu/drm/i915/i915_gem_request.h | 40 +++++++++++++++++++++++++++++++++
 1 file changed, 40 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_gem_request.h b/drivers/gpu/drm/i915/i915_gem_request.h
index 6002adc43523..15495d1e48e8 100644
--- a/drivers/gpu/drm/i915/i915_gem_request.h
+++ b/drivers/gpu/drm/i915/i915_gem_request.h
@@ -569,6 +569,46 @@ i915_gem_active_wait(const struct i915_gem_active *active, struct mutex *mutex)
 }
 
 /**
+ * i915_gem_active_wait_unlocked - waits until the request is completed
+ * @active - the active request on which to wait
+ * @interruptible - whether the wait can be woken by a userspace signal
+ * @timeout - how long to wait at most
+ * @rps - userspace client to charge for a waitboost
+ *
+ * i915_gem_active_wait_unlocked() waits until the request is completed before
+ * returning, without requiring any locks to be held. Note that it does not
+ * retire any requests before returning.
+ *
+ * This function relies on RCU in order to acquire the reference to the active
+ * request without holding any locks. See __i915_gem_active_get_rcu() for the
+ * glory details on how that is managed. Once the reference is acquired, we
+ * can then wait upon the request, and afterwards release our reference,
+ * free of any locking.
+ *
+ * This function wraps i915_wait_request(), see it for the full details on
+ * the arguments.
+ *
+ * Returns 0 if successful, or a negative error code.
+ */
+static inline int
+i915_gem_active_wait_unlocked(const struct i915_gem_active *active,
+			      bool interruptible,
+			      s64 *timeout,
+			      struct intel_rps_client *rps)
+{
+	struct drm_i915_gem_request *request;
+	int ret = 0;
+
+	request = i915_gem_active_get_unlocked(active);
+	if (request) {
+		ret = i915_wait_request(request, interruptible, timeout, rps);
+		i915_gem_request_put(request);
+	}
+
+	return ret;
+}
+
+/**
  * i915_gem_active_retire - waits until the request is retired
  * @active - the active request on which to wait
  *
-- 
2.8.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH 02/19] drm/i915: Convert non-blocking waits for requests over to using RCU
  2016-08-04 19:52 Using RCU requests, take 2 Chris Wilson
  2016-08-04 19:52 ` [PATCH 01/19] drm/i915: Introduce i915_gem_active_wait_unlocked() Chris Wilson
@ 2016-08-04 19:52 ` Chris Wilson
  2016-08-05  5:38   ` Joonas Lahtinen
  2016-08-04 19:52 ` [PATCH 03/19] drm/i915: Convert non-blocking userptr " Chris Wilson
                   ` (17 subsequent siblings)
  19 siblings, 1 reply; 33+ messages in thread
From: Chris Wilson @ 2016-08-04 19:52 UTC (permalink / raw)
  To: intel-gfx

We can completely avoid taking the struct_mutex around the non-blocking
waits by switching over to the RCU request management (trading the mutex
for a RCU read lock and some complex atomic operations). The improvement
is that we gain further contention reduction, and overall the code
become simpler due to the reduced mutex dancing.

v2: Move i915_gem_fault tracepoint to start of function, before the
unlocked wait.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
---
 drivers/gpu/drm/i915/i915_gem.c | 114 +++++++++++++++++-----------------------
 1 file changed, 48 insertions(+), 66 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index aa40d6c35c30..c600f2366d2c 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -349,24 +349,20 @@ i915_gem_object_wait_rendering(struct drm_i915_gem_object *obj,
 	return 0;
 }
 
-/* A nonblocking variant of the above wait. This is a highly dangerous routine
- * as the object state may change during this call.
+/* A nonblocking variant of the above wait. Must be called prior to
+ * acquiring the mutex for the object, as the object state may change
+ * during this call. A reference must be held by the caller for the object.
  */
 static __must_check int
-i915_gem_object_wait_rendering__nonblocking(struct drm_i915_gem_object *obj,
-					    struct intel_rps_client *rps,
-					    bool readonly)
+__unsafe_wait_rendering(struct drm_i915_gem_object *obj,
+			struct intel_rps_client *rps,
+			bool readonly)
 {
-	struct drm_device *dev = obj->base.dev;
-	struct drm_i915_gem_request *requests[I915_NUM_ENGINES];
 	struct i915_gem_active *active;
 	unsigned long active_mask;
-	int ret, i, n = 0;
-
-	lockdep_assert_held(&dev->struct_mutex);
-	GEM_BUG_ON(!to_i915(dev)->mm.interruptible);
+	int idx;
 
-	active_mask = i915_gem_object_get_active(obj);
+	active_mask = __I915_BO_ACTIVE(obj);
 	if (!active_mask)
 		return 0;
 
@@ -377,25 +373,16 @@ i915_gem_object_wait_rendering__nonblocking(struct drm_i915_gem_object *obj,
 		active = &obj->last_write;
 	}
 
-	for_each_active(active_mask, i) {
-		struct drm_i915_gem_request *req;
+	for_each_active(active_mask, idx) {
+		int ret;
 
-		req = i915_gem_active_get(&active[i],
-					  &obj->base.dev->struct_mutex);
-		if (req)
-			requests[n++] = req;
+		ret = i915_gem_active_wait_unlocked(&active[idx],
+						    true, NULL, rps);
+		if (ret)
+			return ret;
 	}
 
-	mutex_unlock(&dev->struct_mutex);
-	ret = 0;
-	for (i = 0; ret == 0 && i < n; i++)
-		ret = i915_wait_request(requests[i], true, NULL, rps);
-	mutex_lock(&dev->struct_mutex);
-
-	for (i = 0; i < n; i++)
-		i915_gem_request_put(requests[i]);
-
-	return ret;
+	return 0;
 }
 
 static struct intel_rps_client *to_rps_client(struct drm_file *file)
@@ -1467,10 +1454,7 @@ i915_gem_set_domain_ioctl(struct drm_device *dev, void *data,
 	int ret;
 
 	/* Only handle setting domains to types used by the CPU. */
-	if (write_domain & I915_GEM_GPU_DOMAINS)
-		return -EINVAL;
-
-	if (read_domains & I915_GEM_GPU_DOMAINS)
+	if ((write_domain | read_domains) & I915_GEM_GPU_DOMAINS)
 		return -EINVAL;
 
 	/* Having something in the write domain implies it's in the read
@@ -1479,25 +1463,21 @@ i915_gem_set_domain_ioctl(struct drm_device *dev, void *data,
 	if (write_domain != 0 && read_domains != write_domain)
 		return -EINVAL;
 
-	ret = i915_mutex_lock_interruptible(dev);
-	if (ret)
-		return ret;
-
 	obj = i915_gem_object_lookup(file, args->handle);
-	if (!obj) {
-		ret = -ENOENT;
-		goto unlock;
-	}
+	if (!obj)
+		return -ENOENT;
 
 	/* Try to flush the object off the GPU without holding the lock.
 	 * We will repeat the flush holding the lock in the normal manner
 	 * to catch cases where we are gazumped.
 	 */
-	ret = i915_gem_object_wait_rendering__nonblocking(obj,
-							  to_rps_client(file),
-							  !write_domain);
+	ret = __unsafe_wait_rendering(obj, to_rps_client(file), !write_domain);
+	if (ret)
+		goto err;
+
+	ret = i915_mutex_lock_interruptible(dev);
 	if (ret)
-		goto unref;
+		goto err;
 
 	if (read_domains & I915_GEM_DOMAIN_GTT)
 		ret = i915_gem_object_set_to_gtt_domain(obj, write_domain != 0);
@@ -1507,11 +1487,13 @@ i915_gem_set_domain_ioctl(struct drm_device *dev, void *data,
 	if (write_domain != 0)
 		intel_fb_obj_invalidate(obj, write_origin(obj, write_domain));
 
-unref:
 	i915_gem_object_put(obj);
-unlock:
 	mutex_unlock(&dev->struct_mutex);
 	return ret;
+
+err:
+	i915_gem_object_put_unlocked(obj);
+	return ret;
 }
 
 /**
@@ -1648,36 +1630,36 @@ int i915_gem_fault(struct vm_area_struct *vma, struct vm_fault *vmf)
 	struct drm_i915_private *dev_priv = to_i915(dev);
 	struct i915_ggtt *ggtt = &dev_priv->ggtt;
 	struct i915_ggtt_view view = i915_ggtt_view_normal;
+	bool write = !!(vmf->flags & FAULT_FLAG_WRITE);
 	pgoff_t page_offset;
 	unsigned long pfn;
-	int ret = 0;
-	bool write = !!(vmf->flags & FAULT_FLAG_WRITE);
-
-	intel_runtime_pm_get(dev_priv);
+	int ret;
 
 	/* We don't use vmf->pgoff since that has the fake offset */
 	page_offset = ((unsigned long)vmf->virtual_address - vma->vm_start) >>
 		PAGE_SHIFT;
 
-	ret = i915_mutex_lock_interruptible(dev);
-	if (ret)
-		goto out;
-
 	trace_i915_gem_object_fault(obj, page_offset, true, write);
 
 	/* Try to flush the object off the GPU first without holding the lock.
-	 * Upon reacquiring the lock, we will perform our sanity checks and then
+	 * Upon acquiring the lock, we will perform our sanity checks and then
 	 * repeat the flush holding the lock in the normal manner to catch cases
 	 * where we are gazumped.
 	 */
-	ret = i915_gem_object_wait_rendering__nonblocking(obj, NULL, !write);
+	ret = __unsafe_wait_rendering(obj, NULL, !write);
 	if (ret)
-		goto unlock;
+		goto err;
+
+	intel_runtime_pm_get(dev_priv);
+
+	ret = i915_mutex_lock_interruptible(dev);
+	if (ret)
+		goto err_rpm;
 
 	/* Access to snoopable pages through the GTT is incoherent. */
 	if (obj->cache_level != I915_CACHE_NONE && !HAS_LLC(dev)) {
 		ret = -EFAULT;
-		goto unlock;
+		goto err_unlock;
 	}
 
 	/* Use a partial view if the object is bigger than the aperture. */
@@ -1698,15 +1680,15 @@ int i915_gem_fault(struct vm_area_struct *vma, struct vm_fault *vmf)
 	/* Now pin it into the GTT if needed */
 	ret = i915_gem_object_ggtt_pin(obj, &view, 0, 0, PIN_MAPPABLE);
 	if (ret)
-		goto unlock;
+		goto err_unlock;
 
 	ret = i915_gem_object_set_to_gtt_domain(obj, write);
 	if (ret)
-		goto unpin;
+		goto err_unpin;
 
 	ret = i915_gem_object_get_fence(obj);
 	if (ret)
-		goto unpin;
+		goto err_unpin;
 
 	/* Finally, remap it using the new GTT offset */
 	pfn = ggtt->mappable_base +
@@ -1751,11 +1733,13 @@ int i915_gem_fault(struct vm_area_struct *vma, struct vm_fault *vmf)
 					    (unsigned long)vmf->virtual_address,
 					    pfn + page_offset);
 	}
-unpin:
+err_unpin:
 	i915_gem_object_ggtt_unpin_view(obj, &view);
-unlock:
+err_unlock:
 	mutex_unlock(&dev->struct_mutex);
-out:
+err_rpm:
+	intel_runtime_pm_put(dev_priv);
+err:
 	switch (ret) {
 	case -EIO:
 		/*
@@ -1796,8 +1780,6 @@ out:
 		ret = VM_FAULT_SIGBUS;
 		break;
 	}
-
-	intel_runtime_pm_put(dev_priv);
 	return ret;
 }
 
-- 
2.8.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH 03/19] drm/i915: Convert non-blocking userptr waits for requests over to using RCU
  2016-08-04 19:52 Using RCU requests, take 2 Chris Wilson
  2016-08-04 19:52 ` [PATCH 01/19] drm/i915: Introduce i915_gem_active_wait_unlocked() Chris Wilson
  2016-08-04 19:52 ` [PATCH 02/19] drm/i915: Convert non-blocking waits for requests over to using RCU Chris Wilson
@ 2016-08-04 19:52 ` Chris Wilson
  2016-08-04 19:52 ` [PATCH 04/19] drm/i915/userptr: Remove superfluous interruptible=false on waiting Chris Wilson
                   ` (16 subsequent siblings)
  19 siblings, 0 replies; 33+ messages in thread
From: Chris Wilson @ 2016-08-04 19:52 UTC (permalink / raw)
  To: intel-gfx

We can completely avoid taking the struct_mutex around the non-blocking
waits by switching over to the RCU request management (trading the mutex
for a RCU read lock and some complex atomic operations). The improvement
is that we gain further contention reduction, and overall the code
become simpler due to the reduced mutex dancing.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
---
 drivers/gpu/drm/i915/i915_gem_userptr.c | 34 +++++++--------------------------
 1 file changed, 7 insertions(+), 27 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_userptr.c b/drivers/gpu/drm/i915/i915_gem_userptr.c
index 53f64fcc89ef..96ab6161903a 100644
--- a/drivers/gpu/drm/i915/i915_gem_userptr.c
+++ b/drivers/gpu/drm/i915/i915_gem_userptr.c
@@ -63,32 +63,12 @@ struct i915_mmu_object {
 
 static void wait_rendering(struct drm_i915_gem_object *obj)
 {
-	struct drm_device *dev = obj->base.dev;
-	struct drm_i915_gem_request *requests[I915_NUM_ENGINES];
-	int i, n;
-
-	if (!i915_gem_object_is_active(obj))
-		return;
-
-	n = 0;
-	for (i = 0; i < I915_NUM_ENGINES; i++) {
-		struct drm_i915_gem_request *req;
+	unsigned long active = __I915_BO_ACTIVE(obj);
+	int idx;
 
-		req = i915_gem_active_get(&obj->last_read[i],
-					  &obj->base.dev->struct_mutex);
-		if (req)
-			requests[n++] = req;
-	}
-
-	mutex_unlock(&dev->struct_mutex);
-
-	for (i = 0; i < n; i++)
-		i915_wait_request(requests[i], false, NULL, NULL);
-
-	mutex_lock(&dev->struct_mutex);
-
-	for (i = 0; i < n; i++)
-		i915_gem_request_put(requests[i]);
+	for_each_active(active, idx)
+		i915_gem_active_wait_unlocked(&obj->last_read[idx],
+					      false, NULL, NULL);
 }
 
 static void cancel_userptr(struct work_struct *work)
@@ -97,6 +77,8 @@ static void cancel_userptr(struct work_struct *work)
 	struct drm_i915_gem_object *obj = mo->obj;
 	struct drm_device *dev = obj->base.dev;
 
+	wait_rendering(obj);
+
 	mutex_lock(&dev->struct_mutex);
 	/* Cancel any active worker and force us to re-evaluate gup */
 	obj->userptr.work = NULL;
@@ -105,8 +87,6 @@ static void cancel_userptr(struct work_struct *work)
 		struct drm_i915_private *dev_priv = to_i915(dev);
 		bool was_interruptible;
 
-		wait_rendering(obj);
-
 		was_interruptible = dev_priv->mm.interruptible;
 		dev_priv->mm.interruptible = false;
 
-- 
2.8.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH 04/19] drm/i915/userptr: Remove superfluous interruptible=false on waiting
  2016-08-04 19:52 Using RCU requests, take 2 Chris Wilson
                   ` (2 preceding siblings ...)
  2016-08-04 19:52 ` [PATCH 03/19] drm/i915: Convert non-blocking userptr " Chris Wilson
@ 2016-08-04 19:52 ` Chris Wilson
  2016-08-04 19:52 ` [PATCH 05/19] drm/i915: Remove forced stop ring on suspend/unload Chris Wilson
                   ` (15 subsequent siblings)
  19 siblings, 0 replies; 33+ messages in thread
From: Chris Wilson @ 2016-08-04 19:52 UTC (permalink / raw)
  To: intel-gfx

Inside the kthread context, we can't be interrupted by signals so
touching the mm.interruptible flag is pointless and wait-request now
consumes EIO itself.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
---
 drivers/gpu/drm/i915/i915_gem_userptr.c | 9 +--------
 1 file changed, 1 insertion(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_userptr.c b/drivers/gpu/drm/i915/i915_gem_userptr.c
index 96ab6161903a..57218cca7e05 100644
--- a/drivers/gpu/drm/i915/i915_gem_userptr.c
+++ b/drivers/gpu/drm/i915/i915_gem_userptr.c
@@ -84,16 +84,9 @@ static void cancel_userptr(struct work_struct *work)
 	obj->userptr.work = NULL;
 
 	if (obj->pages != NULL) {
-		struct drm_i915_private *dev_priv = to_i915(dev);
-		bool was_interruptible;
-
-		was_interruptible = dev_priv->mm.interruptible;
-		dev_priv->mm.interruptible = false;
-
+		/* We are inside a kthread context and can't be interrupted */
 		WARN_ON(i915_gem_object_unbind(obj));
 		WARN_ON(i915_gem_object_put_pages(obj));
-
-		dev_priv->mm.interruptible = was_interruptible;
 	}
 
 	i915_gem_object_put(obj);
-- 
2.8.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH 05/19] drm/i915: Remove forced stop ring on suspend/unload
  2016-08-04 19:52 Using RCU requests, take 2 Chris Wilson
                   ` (3 preceding siblings ...)
  2016-08-04 19:52 ` [PATCH 04/19] drm/i915/userptr: Remove superfluous interruptible=false on waiting Chris Wilson
@ 2016-08-04 19:52 ` Chris Wilson
  2016-08-05  5:46   ` Joonas Lahtinen
  2016-08-04 19:52 ` [PATCH 06/19] drm/i915: Enable i915_gem_wait_for_idle() without holding struct_mutex Chris Wilson
                   ` (14 subsequent siblings)
  19 siblings, 1 reply; 33+ messages in thread
From: Chris Wilson @ 2016-08-04 19:52 UTC (permalink / raw)
  To: intel-gfx

Before suspending (or unloading), we would first wait upon all rendering
to be completed and then disable the rings. This later step is a remanent
from DRI1 days when we did not use request tracking for all operations
upon the ring. Now that we are sure we are waiting upon the very last
operation by the engine, we can forgo clobbering the ring registers,
though we do keep the assert that the engine is indeed idle before
sleeping.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/i915_drv.h         |  1 -
 drivers/gpu/drm/i915/i915_gem.c         | 18 ------------------
 drivers/gpu/drm/i915/intel_lrc.c        | 26 --------------------------
 drivers/gpu/drm/i915/intel_ringbuffer.c | 16 ----------------
 4 files changed, 61 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index db5dc5bd78d8..abdfb97096e2 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2004,7 +2004,6 @@ struct drm_i915_private {
 	/* Abstract the submission mechanism (legacy ringbuffer or execlists) away */
 	struct {
 		void (*cleanup_engine)(struct intel_engine_cs *engine);
-		void (*stop_engine)(struct intel_engine_cs *engine);
 
 		/**
 		 * Is the GPU currently considered idle, or busy executing
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index c600f2366d2c..8946c52e09fb 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -4080,16 +4080,6 @@ struct i915_vma *i915_gem_obj_to_ggtt_view(struct drm_i915_gem_object *obj,
 	return NULL;
 }
 
-static void
-i915_gem_stop_engines(struct drm_device *dev)
-{
-	struct drm_i915_private *dev_priv = to_i915(dev);
-	struct intel_engine_cs *engine;
-
-	for_each_engine(engine, dev_priv)
-		dev_priv->gt.stop_engine(engine);
-}
-
 int
 i915_gem_suspend(struct drm_device *dev)
 {
@@ -4118,12 +4108,6 @@ i915_gem_suspend(struct drm_device *dev)
 
 	i915_gem_retire_requests(dev_priv);
 
-	/* Note that rather than stopping the engines, all we have to do
-	 * is assert that every RING_HEAD == RING_TAIL (all execution complete)
-	 * and similar for all logical context images (to ensure they are
-	 * all ready for hibernation).
-	 */
-	i915_gem_stop_engines(dev);
 	i915_gem_context_lost(dev_priv);
 	mutex_unlock(&dev->struct_mutex);
 
@@ -4308,10 +4292,8 @@ int i915_gem_init(struct drm_device *dev)
 
 	if (!i915.enable_execlists) {
 		dev_priv->gt.cleanup_engine = intel_engine_cleanup;
-		dev_priv->gt.stop_engine = intel_engine_stop;
 	} else {
 		dev_priv->gt.cleanup_engine = intel_logical_ring_cleanup;
-		dev_priv->gt.stop_engine = intel_logical_ring_stop;
 	}
 
 	/* This is just a security blanket to placate dragons.
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index a07da548ff49..309c5d9b1c57 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -760,31 +760,6 @@ void intel_execlists_cancel_requests(struct intel_engine_cs *engine)
 	}
 }
 
-void intel_logical_ring_stop(struct intel_engine_cs *engine)
-{
-	struct drm_i915_private *dev_priv = engine->i915;
-	int ret;
-
-	if (!intel_engine_initialized(engine))
-		return;
-
-	ret = intel_engine_idle(engine);
-	if (ret)
-		DRM_ERROR("failed to quiesce %s whilst cleaning up: %d\n",
-			  engine->name, ret);
-
-	/* TODO: Is this correct with Execlists enabled? */
-	I915_WRITE_MODE(engine, _MASKED_BIT_ENABLE(STOP_RING));
-	if (intel_wait_for_register(dev_priv,
-				    RING_MI_MODE(engine->mmio_base),
-				    MODE_IDLE, MODE_IDLE,
-				    1000)) {
-		DRM_ERROR("%s :timed out trying to stop ring\n", engine->name);
-		return;
-	}
-	I915_WRITE_MODE(engine, _MASKED_BIT_DISABLE(STOP_RING));
-}
-
 static int intel_lr_context_pin(struct i915_gem_context *ctx,
 				struct intel_engine_cs *engine)
 {
@@ -1717,7 +1692,6 @@ void intel_logical_ring_cleanup(struct intel_engine_cs *engine)
 	dev_priv = engine->i915;
 
 	if (engine->buffer) {
-		intel_logical_ring_stop(engine);
 		WARN_ON((I915_READ_MODE(engine) & MODE_IDLE) == 0);
 	}
 
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index a862234ccf18..4593a65cae84 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -2203,7 +2203,6 @@ void intel_engine_cleanup(struct intel_engine_cs *engine)
 	dev_priv = engine->i915;
 
 	if (engine->buffer) {
-		intel_engine_stop(engine);
 		WARN_ON(!IS_GEN2(dev_priv) && (I915_READ_MODE(engine) & MODE_IDLE) == 0);
 
 		intel_ring_unpin(engine->buffer);
@@ -2907,18 +2906,3 @@ int intel_init_vebox_ring_buffer(struct intel_engine_cs *engine)
 
 	return intel_init_ring_buffer(engine);
 }
-
-void intel_engine_stop(struct intel_engine_cs *engine)
-{
-	int ret;
-
-	if (!intel_engine_initialized(engine))
-		return;
-
-	ret = intel_engine_idle(engine);
-	if (ret)
-		DRM_ERROR("failed to quiesce %s whilst cleaning up: %d\n",
-			  engine->name, ret);
-
-	stop_ring(engine);
-}
-- 
2.8.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH 06/19] drm/i915: Enable i915_gem_wait_for_idle() without holding struct_mutex
  2016-08-04 19:52 Using RCU requests, take 2 Chris Wilson
                   ` (4 preceding siblings ...)
  2016-08-04 19:52 ` [PATCH 05/19] drm/i915: Remove forced stop ring on suspend/unload Chris Wilson
@ 2016-08-04 19:52 ` Chris Wilson
  2016-08-05  6:16   ` Joonas Lahtinen
  2016-08-04 19:52 ` [PATCH 07/19] drm/i915: Simplify do_idling() (Ironlake vt-d w/a) Chris Wilson
                   ` (13 subsequent siblings)
  19 siblings, 1 reply; 33+ messages in thread
From: Chris Wilson @ 2016-08-04 19:52 UTC (permalink / raw)
  To: intel-gfx

The principal motivation for this was to try and eliminate the
struct_mutex from i915_gem_suspend - but we still need to hold the mutex
current for the i915_gem_context_lost(). (The issue there is that there
may be an indirect lockdep cycle between cpu_hotplug (i.e. suspend) and
struct_mutex via the stop_machine().) For the moment, enabling last
request tracking for the engine, allows us to do busyness checking and
waiting without requiring the struct_mutex - which is useful in its own
right.

v2: Pass along "bool interruptible" as being unlocked we cannot rely on
i915->mm.interruptible being stable or even under our control.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/i915_debugfs.c      |  2 +-
 drivers/gpu/drm/i915/i915_drv.h          |  3 ++-
 drivers/gpu/drm/i915/i915_gem.c          | 33 +++++++++++++++-----------------
 drivers/gpu/drm/i915/i915_gem_evict.c    |  6 +++---
 drivers/gpu/drm/i915/i915_gem_gtt.c      |  2 +-
 drivers/gpu/drm/i915/i915_gem_request.c  |  7 ++++---
 drivers/gpu/drm/i915/i915_gem_request.h  | 11 +++++++++++
 drivers/gpu/drm/i915/i915_gem_shrinker.c |  2 +-
 drivers/gpu/drm/i915/i915_irq.c          |  3 +--
 drivers/gpu/drm/i915/intel_engine_cs.c   |  8 +++++++-
 drivers/gpu/drm/i915/intel_pm.c          |  2 +-
 drivers/gpu/drm/i915/intel_ringbuffer.c  | 18 -----------------
 drivers/gpu/drm/i915/intel_ringbuffer.h  | 33 ++++++++++++++++++++------------
 13 files changed, 68 insertions(+), 62 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index 24d63e271f4b..1faea382dfeb 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -4925,7 +4925,7 @@ i915_drop_caches_set(void *data, u64 val)
 		return ret;
 
 	if (val & DROP_ACTIVE) {
-		ret = i915_gem_wait_for_idle(dev_priv);
+		ret = i915_gem_wait_for_idle(dev_priv, true);
 		if (ret)
 			goto unlock;
 	}
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index abdfb97096e2..6eff31202336 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -3233,7 +3233,8 @@ int __must_check i915_gem_init(struct drm_device *dev);
 int __must_check i915_gem_init_hw(struct drm_device *dev);
 void i915_gem_init_swizzling(struct drm_device *dev);
 void i915_gem_cleanup_engines(struct drm_device *dev);
-int __must_check i915_gem_wait_for_idle(struct drm_i915_private *dev_priv);
+int __must_check i915_gem_wait_for_idle(struct drm_i915_private *dev_priv,
+					bool interruptible);
 int __must_check i915_gem_suspend(struct drm_device *dev);
 void i915_gem_resume(struct drm_device *dev);
 int i915_gem_fault(struct vm_area_struct *vma, struct vm_fault *vmf);
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 8946c52e09fb..0872de359bd7 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2438,13 +2438,18 @@ static void i915_gem_reset_engine_status(struct intel_engine_cs *engine)
 
 static void i915_gem_reset_engine_cleanup(struct intel_engine_cs *engine)
 {
+	struct drm_i915_gem_request *request;
 	struct intel_ring *ring;
 
+	request = i915_gem_active_peek(&engine->last_request,
+				       &engine->i915->drm.struct_mutex);
+
 	/* Mark all pending requests as complete so that any concurrent
 	 * (lockless) lookup doesn't try and wait upon the request as we
 	 * reset it.
 	 */
-	intel_engine_init_seqno(engine, engine->last_submitted_seqno);
+	if (request)
+		intel_engine_init_seqno(engine, request->fence.seqno);
 
 	/*
 	 * Clear the execlists queue up before freeing the requests, as those
@@ -2466,15 +2471,9 @@ static void i915_gem_reset_engine_cleanup(struct intel_engine_cs *engine)
 	 * implicit references on things like e.g. ppgtt address spaces through
 	 * the request.
 	 */
-	if (!list_empty(&engine->request_list)) {
-		struct drm_i915_gem_request *request;
-
-		request = list_last_entry(&engine->request_list,
-					  struct drm_i915_gem_request,
-					  link);
-
+	if (request)
 		i915_gem_request_retire_upto(request);
-	}
+	GEM_BUG_ON(intel_engine_is_active(engine));
 
 	/* Having flushed all requests from all queues, we know that all
 	 * ringbuffers must now be empty. However, since we do not reclaim
@@ -2897,18 +2896,17 @@ destroy:
 	return 0;
 }
 
-int i915_gem_wait_for_idle(struct drm_i915_private *dev_priv)
+int i915_gem_wait_for_idle(struct drm_i915_private *dev_priv,
+			   bool interruptible)
 {
 	struct intel_engine_cs *engine;
 	int ret;
 
-	lockdep_assert_held(&dev_priv->drm.struct_mutex);
-
 	for_each_engine(engine, dev_priv) {
 		if (engine->last_context == NULL)
 			continue;
 
-		ret = intel_engine_idle(engine);
+		ret = intel_engine_idle(engine, interruptible);
 		if (ret)
 			return ret;
 	}
@@ -4080,11 +4078,10 @@ struct i915_vma *i915_gem_obj_to_ggtt_view(struct drm_i915_gem_object *obj,
 	return NULL;
 }
 
-int
-i915_gem_suspend(struct drm_device *dev)
+int i915_gem_suspend(struct drm_device *dev)
 {
 	struct drm_i915_private *dev_priv = to_i915(dev);
-	int ret = 0;
+	int ret;
 
 	intel_suspend_gt_powersave(dev_priv);
 
@@ -4102,7 +4099,7 @@ i915_gem_suspend(struct drm_device *dev)
 	if (ret)
 		goto err;
 
-	ret = i915_gem_wait_for_idle(dev_priv);
+	ret = i915_gem_wait_for_idle(dev_priv, true);
 	if (ret)
 		goto err;
 
@@ -4123,7 +4120,7 @@ i915_gem_suspend(struct drm_device *dev)
 	return 0;
 
 err:
-	mutex_unlock(&dev->struct_mutex);
+	mutex_unlock(&dev_priv->drm.struct_mutex);
 	return ret;
 }
 
diff --git a/drivers/gpu/drm/i915/i915_gem_evict.c b/drivers/gpu/drm/i915/i915_gem_evict.c
index 7be425826539..f76c06e92677 100644
--- a/drivers/gpu/drm/i915/i915_gem_evict.c
+++ b/drivers/gpu/drm/i915/i915_gem_evict.c
@@ -39,7 +39,7 @@ gpu_is_idle(struct drm_i915_private *dev_priv)
 	struct intel_engine_cs *engine;
 
 	for_each_engine(engine, dev_priv) {
-		if (!list_empty(&engine->request_list))
+		if (intel_engine_is_active(engine))
 			return false;
 	}
 
@@ -167,7 +167,7 @@ search_again:
 	if (ret)
 		return ret;
 
-	ret = i915_gem_wait_for_idle(dev_priv);
+	ret = i915_gem_wait_for_idle(dev_priv, true);
 	if (ret)
 		return ret;
 
@@ -272,7 +272,7 @@ int i915_gem_evict_vm(struct i915_address_space *vm, bool do_idle)
 				return ret;
 		}
 
-		ret = i915_gem_wait_for_idle(dev_priv);
+		ret = i915_gem_wait_for_idle(dev_priv, true);
 		if (ret)
 			return ret;
 
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index db97155074d3..c1d79978f409 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -2248,7 +2248,7 @@ static bool do_idling(struct drm_i915_private *dev_priv)
 
 	if (unlikely(ggtt->do_idle_maps)) {
 		dev_priv->mm.interruptible = false;
-		if (i915_gem_wait_for_idle(dev_priv)) {
+		if (i915_gem_wait_for_idle(dev_priv, false)) {
 			DRM_ERROR("Failed to wait for idle; VT'd may hang.\n");
 			/* Wait a bit, in hopes it avoids the hang */
 			udelay(10);
diff --git a/drivers/gpu/drm/i915/i915_gem_request.c b/drivers/gpu/drm/i915/i915_gem_request.c
index 3fecb8f0e041..1f91dc8c4171 100644
--- a/drivers/gpu/drm/i915/i915_gem_request.c
+++ b/drivers/gpu/drm/i915/i915_gem_request.c
@@ -265,7 +265,7 @@ static int i915_gem_init_seqno(struct drm_i915_private *dev_priv, u32 seqno)
 
 	/* Carefully retire all requests without writing to the rings */
 	for_each_engine(engine, dev_priv) {
-		ret = intel_engine_idle(engine);
+		ret = intel_engine_idle(engine, true);
 		if (ret)
 			return ret;
 	}
@@ -486,7 +486,8 @@ void __i915_add_request(struct drm_i915_gem_request *request,
 	 */
 	request->emitted_jiffies = jiffies;
 	request->previous_seqno = engine->last_submitted_seqno;
-	smp_store_mb(engine->last_submitted_seqno, request->fence.seqno);
+	engine->last_submitted_seqno = request->fence.seqno;
+	i915_gem_active_set(&engine->last_request, request);
 	list_add_tail(&request->link, &engine->request_list);
 	list_add_tail(&request->ring_link, &ring->request_list);
 
@@ -757,7 +758,7 @@ void i915_gem_retire_requests(struct drm_i915_private *dev_priv)
 
 	for_each_engine(engine, dev_priv) {
 		engine_retire_requests(engine);
-		if (list_empty(&engine->request_list))
+		if (!intel_engine_is_active(engine))
 			dev_priv->gt.active_engines &= ~intel_engine_flag(engine);
 	}
 
diff --git a/drivers/gpu/drm/i915/i915_gem_request.h b/drivers/gpu/drm/i915/i915_gem_request.h
index 15495d1e48e8..3496e28785e7 100644
--- a/drivers/gpu/drm/i915/i915_gem_request.h
+++ b/drivers/gpu/drm/i915/i915_gem_request.h
@@ -29,6 +29,17 @@
 
 #include "i915_gem.h"
 
+struct intel_wait {
+	struct rb_node node;
+	struct task_struct *tsk;
+	u32 seqno;
+};
+
+struct intel_signal_node {
+	struct rb_node node;
+	struct intel_wait wait;
+};
+
 /**
  * Request queue structure.
  *
diff --git a/drivers/gpu/drm/i915/i915_gem_shrinker.c b/drivers/gpu/drm/i915/i915_gem_shrinker.c
index 1341cb55b6f1..23d70376b104 100644
--- a/drivers/gpu/drm/i915/i915_gem_shrinker.c
+++ b/drivers/gpu/drm/i915/i915_gem_shrinker.c
@@ -412,7 +412,7 @@ i915_gem_shrinker_vmap(struct notifier_block *nb, unsigned long event, void *ptr
 		return NOTIFY_DONE;
 
 	/* Force everything onto the inactive lists */
-	ret = i915_gem_wait_for_idle(dev_priv);
+	ret = i915_gem_wait_for_idle(dev_priv, false);
 	if (ret)
 		goto out;
 
diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index e58650096426..e35af9ba1546 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -2807,8 +2807,7 @@ static void gen8_disable_vblank(struct drm_device *dev, unsigned int pipe)
 static bool
 ring_idle(struct intel_engine_cs *engine, u32 seqno)
 {
-	return i915_seqno_passed(seqno,
-				 READ_ONCE(engine->last_submitted_seqno));
+	return !intel_engine_is_active(engine);
 }
 
 static bool
diff --git a/drivers/gpu/drm/i915/intel_engine_cs.c b/drivers/gpu/drm/i915/intel_engine_cs.c
index f495969f749b..e9b301ae2d0c 100644
--- a/drivers/gpu/drm/i915/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/intel_engine_cs.c
@@ -166,6 +166,12 @@ void intel_engine_init_hangcheck(struct intel_engine_cs *engine)
 	memset(&engine->hangcheck, 0, sizeof(engine->hangcheck));
 }
 
+static void intel_engine_init_requests(struct intel_engine_cs *engine)
+{
+	init_request_active(&engine->last_request, NULL);
+	INIT_LIST_HEAD(&engine->request_list);
+}
+
 /**
  * intel_engines_setup_common - setup engine state not requiring hw access
  * @engine: Engine to setup.
@@ -177,13 +183,13 @@ void intel_engine_init_hangcheck(struct intel_engine_cs *engine)
  */
 void intel_engine_setup_common(struct intel_engine_cs *engine)
 {
-	INIT_LIST_HEAD(&engine->request_list);
 	INIT_LIST_HEAD(&engine->buffers);
 	INIT_LIST_HEAD(&engine->execlist_queue);
 	spin_lock_init(&engine->execlist_lock);
 
 	engine->fence_context = fence_context_alloc(1);
 
+	intel_engine_init_requests(engine);
 	intel_engine_init_hangcheck(engine);
 	i915_gem_batch_pool_init(engine, &engine->batch_pool);
 }
diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index 4317cdf7011d..4eac4b4beece 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -6336,7 +6336,7 @@ bool i915_gpu_busy(void)
 	dev_priv = i915_mch_dev;
 
 	for_each_engine(engine, dev_priv)
-		ret |= !list_empty(&engine->request_list);
+		ret |= intel_engine_is_active(engine);
 
 out_unlock:
 	spin_unlock_irq(&mchdev_lock);
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 4593a65cae84..322274a239e4 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -2227,24 +2227,6 @@ void intel_engine_cleanup(struct intel_engine_cs *engine)
 	engine->i915 = NULL;
 }
 
-int intel_engine_idle(struct intel_engine_cs *engine)
-{
-	struct drm_i915_gem_request *req;
-
-	/* Wait upon the last request to be completed */
-	if (list_empty(&engine->request_list))
-		return 0;
-
-	req = list_entry(engine->request_list.prev,
-			 struct drm_i915_gem_request,
-			 link);
-
-	/* Make sure we do not trigger any retires */
-	return i915_wait_request(req,
-				 req->i915->mm.interruptible,
-				 NULL, NULL);
-}
-
 int intel_ring_alloc_request_extras(struct drm_i915_gem_request *request)
 {
 	int ret;
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index 88952bf10b9d..43e545e44352 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -3,6 +3,7 @@
 
 #include <linux/hashtable.h>
 #include "i915_gem_batch_pool.h"
+#include "i915_gem_request.h"
 
 #define I915_CMD_HASH_ORDER 9
 
@@ -307,6 +308,13 @@ struct intel_engine_cs {
 	 */
 	u32 last_submitted_seqno;
 
+	/* An RCU guarded pointer to the last request. No reference is
+	 * held to the request, users must carefully acquire a reference to
+	 * the request using i915_gem_active_get_request_rcu(), or hold the
+	 * struct_mutex.
+	 */
+	struct i915_gem_active last_request;
+
 	struct i915_gem_context *last_context;
 
 	struct intel_engine_hangcheck hangcheck;
@@ -465,7 +473,6 @@ static inline u32 intel_ring_offset(struct intel_ring *ring, u32 value)
 int __intel_ring_space(int head, int tail, int size);
 void intel_ring_update_space(struct intel_ring *ring);
 
-int __must_check intel_engine_idle(struct intel_engine_cs *engine);
 void intel_engine_init_seqno(struct intel_engine_cs *engine, u32 seqno);
 
 int intel_init_pipe_control(struct intel_engine_cs *engine, int size);
@@ -475,6 +482,14 @@ void intel_engine_setup_common(struct intel_engine_cs *engine);
 int intel_engine_init_common(struct intel_engine_cs *engine);
 void intel_engine_cleanup_common(struct intel_engine_cs *engine);
 
+static inline int intel_engine_idle(struct intel_engine_cs *engine,
+				    bool interruptible)
+{
+	/* Wait upon the last request to be completed */
+	return i915_gem_active_wait_unlocked(&engine->last_request,
+					     interruptible, NULL, NULL);
+}
+
 int intel_init_render_ring_buffer(struct intel_engine_cs *engine);
 int intel_init_bsd_ring_buffer(struct intel_engine_cs *engine);
 int intel_init_bsd2_ring_buffer(struct intel_engine_cs *engine);
@@ -504,17 +519,6 @@ static inline u32 intel_hws_seqno_address(struct intel_engine_cs *engine)
 }
 
 /* intel_breadcrumbs.c -- user interrupt bottom-half for waiters */
-struct intel_wait {
-	struct rb_node node;
-	struct task_struct *tsk;
-	u32 seqno;
-};
-
-struct intel_signal_node {
-	struct rb_node node;
-	struct intel_wait wait;
-};
-
 int intel_engine_init_breadcrumbs(struct intel_engine_cs *engine);
 
 static inline void intel_wait_init(struct intel_wait *wait, u32 seqno)
@@ -561,4 +565,9 @@ void intel_engine_fini_breadcrumbs(struct intel_engine_cs *engine);
 unsigned int intel_kick_waiters(struct drm_i915_private *i915);
 unsigned int intel_kick_signalers(struct drm_i915_private *i915);
 
+static inline bool intel_engine_is_active(struct intel_engine_cs *engine)
+{
+	return i915_gem_active_isset(&engine->last_request);
+}
+
 #endif /* _INTEL_RINGBUFFER_H_ */
-- 
2.8.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH 07/19] drm/i915: Simplify do_idling() (Ironlake vt-d w/a)
  2016-08-04 19:52 Using RCU requests, take 2 Chris Wilson
                   ` (5 preceding siblings ...)
  2016-08-04 19:52 ` [PATCH 06/19] drm/i915: Enable i915_gem_wait_for_idle() without holding struct_mutex Chris Wilson
@ 2016-08-04 19:52 ` Chris Wilson
  2016-08-05  6:19   ` Joonas Lahtinen
  2016-08-04 19:52 ` [PATCH 08/19] drm/i915/shrinker: Wait before acquiring struct_mutex under oom Chris Wilson
                   ` (12 subsequent siblings)
  19 siblings, 1 reply; 33+ messages in thread
From: Chris Wilson @ 2016-08-04 19:52 UTC (permalink / raw)
  To: intel-gfx

Now that we pass along the expected interruptible nature for the
wait-for-idle, we do not need to modify the global
i915->mm.interruptible for this single call. (Only the immediate call to
i915_gem_wait_for_idle() takes the interruptible status as the other
action, dma_map_sg(), is independent of i915.ko)

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/i915_gem_gtt.c | 37 ++++++++-----------------------------
 1 file changed, 8 insertions(+), 29 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index c1d79978f409..8b4f2f35019c 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -2241,31 +2241,6 @@ static bool needs_idle_maps(struct drm_i915_private *dev_priv)
 	return false;
 }
 
-static bool do_idling(struct drm_i915_private *dev_priv)
-{
-	struct i915_ggtt *ggtt = &dev_priv->ggtt;
-	bool ret = dev_priv->mm.interruptible;
-
-	if (unlikely(ggtt->do_idle_maps)) {
-		dev_priv->mm.interruptible = false;
-		if (i915_gem_wait_for_idle(dev_priv, false)) {
-			DRM_ERROR("Failed to wait for idle; VT'd may hang.\n");
-			/* Wait a bit, in hopes it avoids the hang */
-			udelay(10);
-		}
-	}
-
-	return ret;
-}
-
-static void undo_idling(struct drm_i915_private *dev_priv, bool interruptible)
-{
-	struct i915_ggtt *ggtt = &dev_priv->ggtt;
-
-	if (unlikely(ggtt->do_idle_maps))
-		dev_priv->mm.interruptible = interruptible;
-}
-
 void i915_check_and_clear_faults(struct drm_i915_private *dev_priv)
 {
 	struct intel_engine_cs *engine;
@@ -2713,14 +2688,18 @@ void i915_gem_gtt_finish_object(struct drm_i915_gem_object *obj)
 {
 	struct drm_device *dev = obj->base.dev;
 	struct drm_i915_private *dev_priv = to_i915(dev);
-	bool interruptible;
+	struct i915_ggtt *ggtt = &dev_priv->ggtt;
 
-	interruptible = do_idling(dev_priv);
+	if (unlikely(ggtt->do_idle_maps)) {
+		if (i915_gem_wait_for_idle(dev_priv, false)) {
+			DRM_ERROR("Failed to wait for idle; VT'd may hang.\n");
+			/* Wait a bit, in hopes it avoids the hang */
+			udelay(10);
+		}
+	}
 
 	dma_unmap_sg(&dev->pdev->dev, obj->pages->sgl, obj->pages->nents,
 		     PCI_DMA_BIDIRECTIONAL);
-
-	undo_idling(dev_priv, interruptible);
 }
 
 static void i915_gtt_color_adjust(struct drm_mm_node *node,
-- 
2.8.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH 08/19] drm/i915/shrinker: Wait before acquiring struct_mutex under oom
  2016-08-04 19:52 Using RCU requests, take 2 Chris Wilson
                   ` (6 preceding siblings ...)
  2016-08-04 19:52 ` [PATCH 07/19] drm/i915: Simplify do_idling() (Ironlake vt-d w/a) Chris Wilson
@ 2016-08-04 19:52 ` Chris Wilson
  2016-08-05  6:24   ` Joonas Lahtinen
  2016-08-04 19:52 ` [PATCH 09/19] drm/i915: Tidy generation of the GTT mmap offset Chris Wilson
                   ` (11 subsequent siblings)
  19 siblings, 1 reply; 33+ messages in thread
From: Chris Wilson @ 2016-08-04 19:52 UTC (permalink / raw)
  To: intel-gfx

We can now wait for the GPU (all engines) to become idle without
requiring the struct_mutex. Inside the shrinker, we need to currently
take the struct_mutex in order to purge objects and to purge the objects
we need the GPU to be idle - causing a stall whilst we hold the
struct_mutex. We can hide most of that stall by performing the wait
before taking the struct_mutex and only doing essential waits for
new rendering on objects to be freed.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/i915_gem_shrinker.c | 13 +++++++++----
 1 file changed, 9 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_shrinker.c b/drivers/gpu/drm/i915/i915_gem_shrinker.c
index 23d70376b104..9b92b6470ccc 100644
--- a/drivers/gpu/drm/i915/i915_gem_shrinker.c
+++ b/drivers/gpu/drm/i915/i915_gem_shrinker.c
@@ -323,17 +323,22 @@ i915_gem_shrinker_lock_uninterruptible(struct drm_i915_private *dev_priv,
 				       struct shrinker_lock_uninterruptible *slu,
 				       int timeout_ms)
 {
-	unsigned long timeout = msecs_to_jiffies(timeout_ms) + 1;
+	unsigned long timeout = jiffies + msecs_to_jiffies_timeout(timeout_ms);
+
+	do {
+		if (i915_gem_wait_for_idle(dev_priv, false) == 0 &&
+		    i915_gem_shrinker_lock(&dev_priv->drm, &slu->unlock))
+			break;
 
-	while (!i915_gem_shrinker_lock(&dev_priv->drm, &slu->unlock)) {
 		schedule_timeout_killable(1);
 		if (fatal_signal_pending(current))
 			return false;
-		if (--timeout == 0) {
+
+		if (time_after(jiffies, timeout)) {
 			pr_err("Unable to lock GPU to purge memory.\n");
 			return false;
 		}
-	}
+	} while (1);
 
 	slu->was_interruptible = dev_priv->mm.interruptible;
 	dev_priv->mm.interruptible = false;
-- 
2.8.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH 09/19] drm/i915: Tidy generation of the GTT mmap offset
  2016-08-04 19:52 Using RCU requests, take 2 Chris Wilson
                   ` (7 preceding siblings ...)
  2016-08-04 19:52 ` [PATCH 08/19] drm/i915/shrinker: Wait before acquiring struct_mutex under oom Chris Wilson
@ 2016-08-04 19:52 ` Chris Wilson
  2016-08-04 19:52 ` [PATCH 10/19] drm/i915: Remove unused no-shrinker-steal Chris Wilson
                   ` (10 subsequent siblings)
  19 siblings, 0 replies; 33+ messages in thread
From: Chris Wilson @ 2016-08-04 19:52 UTC (permalink / raw)
  To: intel-gfx

If we make the observation that mmap-offsets are only released when we
free an object, we can then deduce that the shrinker only creates free
space in the mmap arena indirectly by flushing the request list and
freeing expired objects. If we combine this with the lockless
vma-manager and lockless idling, we can avoid taking our big struct_mutex
until we need to actually free the requests.

One side-effect is that we defer the madvise checking until we need the
pages (i.e. the fault handler). This brings us into line with the other
delayed checks (and madvise in general).

v2: s/ret/err/ and use if (!err) rather than if (ret == 0)

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
---
 drivers/gpu/drm/i915/i915_gem.c | 69 +++++++++++++----------------------------
 1 file changed, 22 insertions(+), 47 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 0872de359bd7..ee222961c62a 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -1898,36 +1898,28 @@ u64 i915_gem_get_ggtt_alignment(struct drm_i915_private *dev_priv, u64 size,
 static int i915_gem_object_create_mmap_offset(struct drm_i915_gem_object *obj)
 {
 	struct drm_i915_private *dev_priv = to_i915(obj->base.dev);
-	int ret;
-
-	dev_priv->mm.shrinker_no_lock_stealing = true;
+	int err;
 
-	ret = drm_gem_create_mmap_offset(&obj->base);
-	if (ret != -ENOSPC)
-		goto out;
+	err = drm_gem_create_mmap_offset(&obj->base);
+	if (!err)
+		return 0;
 
-	/* Badly fragmented mmap space? The only way we can recover
-	 * space is by destroying unwanted objects. We can't randomly release
-	 * mmap_offsets as userspace expects them to be persistent for the
-	 * lifetime of the objects. The closest we can is to release the
-	 * offsets on purgeable objects by truncating it and marking it purged,
-	 * which prevents userspace from ever using that object again.
+	/* We can idle the GPU locklessly to flush stale objects, but in order
+	 * to claim that space for ourselves, we need to take the big
+	 * struct_mutex to free the requests+objects and allocate our slot.
 	 */
-	i915_gem_shrink(dev_priv,
-			obj->base.size >> PAGE_SHIFT,
-			I915_SHRINK_BOUND |
-			I915_SHRINK_UNBOUND |
-			I915_SHRINK_PURGEABLE);
-	ret = drm_gem_create_mmap_offset(&obj->base);
-	if (ret != -ENOSPC)
-		goto out;
+	err = i915_gem_wait_for_idle(dev_priv, true);
+	if (err)
+		return err;
 
-	i915_gem_shrink_all(dev_priv);
-	ret = drm_gem_create_mmap_offset(&obj->base);
-out:
-	dev_priv->mm.shrinker_no_lock_stealing = false;
+	err = i915_mutex_lock_interruptible(&dev_priv->drm);
+	if (!err) {
+		i915_gem_retire_requests(dev_priv);
+		err = drm_gem_create_mmap_offset(&obj->base);
+		mutex_unlock(&dev_priv->drm.struct_mutex);
+	}
 
-	return ret;
+	return err;
 }
 
 static void i915_gem_object_free_mmap_offset(struct drm_i915_gem_object *obj)
@@ -1944,32 +1936,15 @@ i915_gem_mmap_gtt(struct drm_file *file,
 	struct drm_i915_gem_object *obj;
 	int ret;
 
-	ret = i915_mutex_lock_interruptible(dev);
-	if (ret)
-		return ret;
-
 	obj = i915_gem_object_lookup(file, handle);
-	if (!obj) {
-		ret = -ENOENT;
-		goto unlock;
-	}
-
-	if (obj->madv != I915_MADV_WILLNEED) {
-		DRM_DEBUG("Attempting to mmap a purgeable buffer\n");
-		ret = -EFAULT;
-		goto out;
-	}
+	if (!obj)
+		return -ENOENT;
 
 	ret = i915_gem_object_create_mmap_offset(obj);
-	if (ret)
-		goto out;
+	if (ret == 0)
+		*offset = drm_vma_node_offset_addr(&obj->base.vma_node);
 
-	*offset = drm_vma_node_offset_addr(&obj->base.vma_node);
-
-out:
-	i915_gem_object_put(obj);
-unlock:
-	mutex_unlock(&dev->struct_mutex);
+	i915_gem_object_put_unlocked(obj);
 	return ret;
 }
 
-- 
2.8.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH 10/19] drm/i915: Remove unused no-shrinker-steal
  2016-08-04 19:52 Using RCU requests, take 2 Chris Wilson
                   ` (8 preceding siblings ...)
  2016-08-04 19:52 ` [PATCH 09/19] drm/i915: Tidy generation of the GTT mmap offset Chris Wilson
@ 2016-08-04 19:52 ` Chris Wilson
  2016-08-04 19:52 ` [PATCH 11/19] drm/i915: Do a nonblocking wait first in pread/pwrite Chris Wilson
                   ` (9 subsequent siblings)
  19 siblings, 0 replies; 33+ messages in thread
From: Chris Wilson @ 2016-08-04 19:52 UTC (permalink / raw)
  To: intel-gfx

After removing the user of this wart, we can remove the wart entirely.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
---
 drivers/gpu/drm/i915/i915_drv.h          | 1 -
 drivers/gpu/drm/i915/i915_gem_shrinker.c | 3 ---
 2 files changed, 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 6eff31202336..31a614fe9ed7 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1318,7 +1318,6 @@ struct i915_gem_mm {
 	struct notifier_block oom_notifier;
 	struct notifier_block vmap_notifier;
 	struct shrinker shrinker;
-	bool shrinker_no_lock_stealing;
 
 	/** LRU list of objects with fence regs on them. */
 	struct list_head fence_list;
diff --git a/drivers/gpu/drm/i915/i915_gem_shrinker.c b/drivers/gpu/drm/i915/i915_gem_shrinker.c
index 9b92b6470ccc..b80802b35353 100644
--- a/drivers/gpu/drm/i915/i915_gem_shrinker.c
+++ b/drivers/gpu/drm/i915/i915_gem_shrinker.c
@@ -244,9 +244,6 @@ static bool i915_gem_shrinker_lock(struct drm_device *dev, bool *unlock)
 		if (!mutex_is_locked_by(&dev->struct_mutex, current))
 			return false;
 
-		if (to_i915(dev)->mm.shrinker_no_lock_stealing)
-			return false;
-
 		*unlock = false;
 	} else
 		*unlock = true;
-- 
2.8.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH 11/19] drm/i915: Do a nonblocking wait first in pread/pwrite
  2016-08-04 19:52 Using RCU requests, take 2 Chris Wilson
                   ` (9 preceding siblings ...)
  2016-08-04 19:52 ` [PATCH 10/19] drm/i915: Remove unused no-shrinker-steal Chris Wilson
@ 2016-08-04 19:52 ` Chris Wilson
  2016-08-05  7:08   ` Joonas Lahtinen
  2016-08-04 19:52 ` [PATCH 12/19] drm/i915: Remove (struct_mutex) locking for wait-ioctl Chris Wilson
                   ` (8 subsequent siblings)
  19 siblings, 1 reply; 33+ messages in thread
From: Chris Wilson @ 2016-08-04 19:52 UTC (permalink / raw)
  To: intel-gfx

If we try and read or write to an active request, we first must wait
upon the GPU completing that request. Let's do that without holding the
mutex (and so allow someone else to access the GPU whilst we wait). Upn
completion, we will reacquire the mutex and only then start the
operation (i.e. we do not rely on state from before dropping the mutex).

v2: Repaint the goto labels

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/i915_gem.c | 62 +++++++++++++++++++++++------------------
 1 file changed, 35 insertions(+), 27 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index ee222961c62a..fa0936a787a7 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -956,25 +956,26 @@ i915_gem_pread_ioctl(struct drm_device *dev, void *data,
 		       args->size))
 		return -EFAULT;
 
-	ret = i915_mutex_lock_interruptible(dev);
-	if (ret)
-		return ret;
-
 	obj = i915_gem_object_lookup(file, args->handle);
-	if (!obj) {
-		ret = -ENOENT;
-		goto unlock;
-	}
+	if (!obj)
+		return -ENOENT;
 
 	/* Bounds check source.  */
 	if (args->offset > obj->base.size ||
 	    args->size > obj->base.size - args->offset) {
 		ret = -EINVAL;
-		goto out;
+		goto err;
 	}
 
-	trace_i915_gem_object_pread(obj, args->offset, args->size);
+	ret = __unsafe_wait_rendering(obj, to_rps_client(file), true);
+	if (ret)
+		goto err;
 
+	ret = i915_mutex_lock_interruptible(dev);
+	if (ret)
+		goto err;
+
+	trace_i915_gem_object_pread(obj, args->offset, args->size);
 	ret = i915_gem_shmem_pread(dev, obj, args, file);
 
 	/* pread for non shmem backed objects */
@@ -985,10 +986,13 @@ i915_gem_pread_ioctl(struct drm_device *dev, void *data,
 		intel_runtime_pm_put(to_i915(dev));
 	}
 
-out:
 	i915_gem_object_put(obj);
-unlock:
 	mutex_unlock(&dev->struct_mutex);
+
+	return ret;
+
+err:
+	i915_gem_object_put_unlocked(obj);
 	return ret;
 }
 
@@ -1374,27 +1378,28 @@ i915_gem_pwrite_ioctl(struct drm_device *dev, void *data,
 			return -EFAULT;
 	}
 
-	intel_runtime_pm_get(dev_priv);
-
-	ret = i915_mutex_lock_interruptible(dev);
-	if (ret)
-		goto put_rpm;
-
 	obj = i915_gem_object_lookup(file, args->handle);
-	if (!obj) {
-		ret = -ENOENT;
-		goto unlock;
-	}
+	if (!obj)
+		return -ENOENT;
 
 	/* Bounds check destination. */
 	if (args->offset > obj->base.size ||
 	    args->size > obj->base.size - args->offset) {
 		ret = -EINVAL;
-		goto out;
+		goto err;
 	}
 
-	trace_i915_gem_object_pwrite(obj, args->offset, args->size);
+	ret = __unsafe_wait_rendering(obj, to_rps_client(file), false);
+	if (ret)
+		goto err;
 
+	intel_runtime_pm_get(dev_priv);
+
+	ret = i915_mutex_lock_interruptible(dev);
+	if (ret)
+		goto err_rpm;
+
+	trace_i915_gem_object_pwrite(obj, args->offset, args->size);
 	ret = -EFAULT;
 	/* We can only do the GTT pwrite on untiled buffers, as otherwise
 	 * it would end up going through the fenced access, and we'll get
@@ -1419,14 +1424,17 @@ i915_gem_pwrite_ioctl(struct drm_device *dev, void *data,
 			ret = -ENODEV;
 	}
 
-out:
 	i915_gem_object_put(obj);
-unlock:
 	mutex_unlock(&dev->struct_mutex);
-put_rpm:
 	intel_runtime_pm_put(dev_priv);
 
 	return ret;
+
+err_rpm:
+	intel_runtime_pm_put(dev_priv);
+err:
+	i915_gem_object_put_unlocked(obj);
+	return ret;
 }
 
 static enum fb_op_origin
-- 
2.8.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH 12/19] drm/i915: Remove (struct_mutex) locking for wait-ioctl
  2016-08-04 19:52 Using RCU requests, take 2 Chris Wilson
                   ` (10 preceding siblings ...)
  2016-08-04 19:52 ` [PATCH 11/19] drm/i915: Do a nonblocking wait first in pread/pwrite Chris Wilson
@ 2016-08-04 19:52 ` Chris Wilson
  2016-08-04 19:52 ` [PATCH 13/19] drm/i915: Remove (struct_mutex) locking for busy-ioctl Chris Wilson
                   ` (7 subsequent siblings)
  19 siblings, 0 replies; 33+ messages in thread
From: Chris Wilson @ 2016-08-04 19:52 UTC (permalink / raw)
  To: intel-gfx

With a bit of care (and leniency) we can iterate over the object and
wait for previous rendering to complete with judicial use of atomic
reference counting. The ABI requires us to ensure that an active object
is eventually flushed (like the busy-ioctl) which is guaranteed by our
management of requests (i.e. everything that is submitted to hardware is
flushed in the same request). All we have to do is ensure that we can
detect when the requests are complete for reporting when the object is
idle (without triggering ETIME), locklessly - this is handled by
i915_gem_active_wait_unlocked().

The impact of this is actually quite small - the return to userspace
following the wait was already lockless and so we don't see much gain in
latency improvement upon completing the wait. What we do achieve here is
completing an already finished wait without hitting the struct_mutex,
our hold is quite short and so we are typically just a victim of
contention rather than a cause - but it is still one less contention
point!

v2: Break up a long line.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
---
 drivers/gpu/drm/i915/i915_gem.c | 43 ++++++++++++-----------------------------
 1 file changed, 12 insertions(+), 31 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index fa0936a787a7..4ef3f704b8b2 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2620,47 +2620,28 @@ int
 i915_gem_wait_ioctl(struct drm_device *dev, void *data, struct drm_file *file)
 {
 	struct drm_i915_gem_wait *args = data;
+	struct intel_rps_client *rps = to_rps_client(file);
 	struct drm_i915_gem_object *obj;
-	struct drm_i915_gem_request *requests[I915_NUM_ENGINES];
-	int i, n = 0;
-	int ret;
+	unsigned long active;
+	int idx, ret = 0;
 
 	if (args->flags != 0)
 		return -EINVAL;
 
-	ret = i915_mutex_lock_interruptible(dev);
-	if (ret)
-		return ret;
-
 	obj = i915_gem_object_lookup(file, args->bo_handle);
-	if (!obj) {
-		mutex_unlock(&dev->struct_mutex);
+	if (!obj)
 		return -ENOENT;
-	}
-
-	if (!i915_gem_object_is_active(obj))
-		goto out;
 
-	for (i = 0; i < I915_NUM_ENGINES; i++) {
-		struct drm_i915_gem_request *req;
-
-		req = i915_gem_active_get(&obj->last_read[i],
-					  &obj->base.dev->struct_mutex);
-		if (req)
-			requests[n++] = req;
+	active = __I915_BO_ACTIVE(obj);
+	for_each_active(active, idx) {
+		s64 *timeout = args->timeout_ns >= 0 ? &args->timeout_ns : NULL;
+		ret = i915_gem_active_wait_unlocked(&obj->last_read[idx], true,
+						    timeout, rps);
+		if (ret)
+			break;
 	}
 
-out:
-	i915_gem_object_put(obj);
-	mutex_unlock(&dev->struct_mutex);
-
-	for (i = 0; i < n; i++) {
-		if (ret == 0)
-			ret = i915_wait_request(requests[i], true,
-						args->timeout_ns > 0 ? &args->timeout_ns : NULL,
-						to_rps_client(file));
-		i915_gem_request_put(requests[i]);
-	}
+	i915_gem_object_put_unlocked(obj);
 	return ret;
 }
 
-- 
2.8.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH 13/19] drm/i915: Remove (struct_mutex) locking for busy-ioctl
  2016-08-04 19:52 Using RCU requests, take 2 Chris Wilson
                   ` (11 preceding siblings ...)
  2016-08-04 19:52 ` [PATCH 12/19] drm/i915: Remove (struct_mutex) locking for wait-ioctl Chris Wilson
@ 2016-08-04 19:52 ` Chris Wilson
  2016-08-04 19:52 ` [PATCH 14/19] drm/i915: Reduce locking inside swfinish ioctl Chris Wilson
                   ` (6 subsequent siblings)
  19 siblings, 0 replies; 33+ messages in thread
From: Chris Wilson @ 2016-08-04 19:52 UTC (permalink / raw)
  To: intel-gfx; +Cc: Akash Goel

By applying the same logic as for wait-ioctl, we can query whether a
request has completed without holding struct_mutex. The biggest impact
system-wide is removing the flush_active and the contention that causes.

Testcase: igt/gem_busy
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Akash Goel <akash.goel@intel.com>
---
 drivers/gpu/drm/i915/i915_gem.c | 126 ++++++++++++++++++++++++++++++----------
 1 file changed, 96 insertions(+), 30 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 4ef3f704b8b2..5ec3ebf33bc8 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -3734,49 +3734,115 @@ i915_gem_object_ggtt_unpin_view(struct drm_i915_gem_object *obj,
 	i915_vma_unpin(i915_gem_obj_to_ggtt_view(obj, view));
 }
 
+static __always_inline unsigned __busy_read_flag(unsigned int id)
+{
+	/* Note that we could alias engines in the execbuf API, but
+	 * that would be very unwise as it prevents userspace from
+	 * fine control over engine selection. Ahem.
+	 *
+	 * This should be something like EXEC_MAX_ENGINE instead of
+	 * I915_NUM_ENGINES.
+	 */
+	BUILD_BUG_ON(I915_NUM_ENGINES > 16);
+	return 0x10000 << id;
+}
+
+static __always_inline unsigned int __busy_write_flag(unsigned int id)
+{
+	return id;
+}
+
+static __always_inline unsigned
+__busy_flag(const struct i915_gem_active *active,
+	    unsigned int (*flag)(unsigned int id))
+{
+	/* For more discussion about the barriers and locking concerns,
+	 * see __i915_gem_active_get_rcu().
+	 */
+	do {
+		struct drm_i915_gem_request *request;
+		unsigned int id;
+
+		request = rcu_dereference(active->request);
+		if (!request || i915_gem_request_completed(request))
+			return 0;
+
+		id = request->engine->exec_id;
+
+		/* Check that the pointer wasn't reassigned and overwritten. */
+		if (request == rcu_access_pointer(active->request))
+			return flag(id);
+	} while (1);
+}
+
+static inline unsigned
+busy_read_flag(const struct i915_gem_active *active)
+{
+	return __busy_flag(active, __busy_read_flag);
+}
+
+static inline unsigned
+busy_write_flag(const struct i915_gem_active *active)
+{
+	return __busy_flag(active, __busy_write_flag);
+}
+
 int
 i915_gem_busy_ioctl(struct drm_device *dev, void *data,
 		    struct drm_file *file)
 {
 	struct drm_i915_gem_busy *args = data;
 	struct drm_i915_gem_object *obj;
-	int ret;
-
-	ret = i915_mutex_lock_interruptible(dev);
-	if (ret)
-		return ret;
+	unsigned long active;
 
 	obj = i915_gem_object_lookup(file, args->handle);
-	if (!obj) {
-		ret = -ENOENT;
-		goto unlock;
-	}
+	if (!obj)
+		return -ENOENT;
 
-	/* Count all active objects as busy, even if they are currently not used
-	 * by the gpu. Users of this interface expect objects to eventually
-	 * become non-busy without any further actions.
-	 */
 	args->busy = 0;
-	if (i915_gem_object_is_active(obj)) {
-		struct drm_i915_gem_request *req;
-		int i;
+	active = __I915_BO_ACTIVE(obj);
+	if (active) {
+		int idx;
 
-		for (i = 0; i < I915_NUM_ENGINES; i++) {
-			req = i915_gem_active_peek(&obj->last_read[i],
-						   &obj->base.dev->struct_mutex);
-			if (req)
-				args->busy |= 1 << (16 + req->engine->exec_id);
-		}
-		req = i915_gem_active_peek(&obj->last_write,
-					   &obj->base.dev->struct_mutex);
-		if (req)
-			args->busy |= req->engine->exec_id;
+		/* Yes, the lookups are intentionally racy.
+		 *
+		 * Even though we guard the pointer lookup by RCU, that only
+		 * guarantees that the pointer and its contents remain
+		 * dereferencable and does *not* mean that the request we
+		 * have is the same as the one being tracked by the object.
+		 *
+		 * Consider that we lookup the request just as it is being
+		 * retired and free. We take a local copy of the pointer,
+		 * but before we add its engine into the busy set, the other
+		 * thread reallocates it and assigns it to a task on another
+		 * engine with a fresh and incomplete seqno.
+		 *
+		 * So after we lookup the engine's id, we double check that
+		 * the active request is the same and only then do we add it
+		 * into the busy set.
+		 */
+		rcu_read_lock();
+
+		for_each_active(active, idx)
+			args->busy |= busy_read_flag(&obj->last_read[idx]);
+
+		/* For ABI sanity, we only care that the write engine is in
+		 * the set of read engines. This is ensured by the ordering
+		 * of setting last_read/last_write in i915_vma_move_to_active,
+		 * and then in reverse in retire.
+		 *
+		 * We don't care that the set of active read/write engines
+		 * may change during construction of the result, as it is
+		 * equally liable to change before userspace can inspect
+		 * the result.
+		 */
+		args->busy |= busy_write_flag(&obj->last_write);
+
+		rcu_read_unlock();
 	}
 
-	i915_gem_object_put(obj);
-unlock:
-	mutex_unlock(&dev->struct_mutex);
-	return ret;
+	i915_gem_object_put_unlocked(obj);
+	return 0;
 }
 
 int
-- 
2.8.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH 14/19] drm/i915: Reduce locking inside swfinish ioctl
  2016-08-04 19:52 Using RCU requests, take 2 Chris Wilson
                   ` (12 preceding siblings ...)
  2016-08-04 19:52 ` [PATCH 13/19] drm/i915: Remove (struct_mutex) locking for busy-ioctl Chris Wilson
@ 2016-08-04 19:52 ` Chris Wilson
  2016-08-05  6:55   ` Joonas Lahtinen
  2016-08-04 19:52 ` [PATCH 15/19] drm/i915: Remove pinned check from madvise ioctl Chris Wilson
                   ` (5 subsequent siblings)
  19 siblings, 1 reply; 33+ messages in thread
From: Chris Wilson @ 2016-08-04 19:52 UTC (permalink / raw)
  To: intel-gfx; +Cc: Daniel Vetter

We only need to take the struct_mutex if the object is pinned to the
display engine and so requires checking for clflush. (The race with
userspace pinning the object to a framebuffer is irrelevant.)

v2: Use access once for compiler hints (or not as it is a bitfield)
v3: READ_ONCE, obj->pin_display is not a bitfield anymore
v4: Don't be creative with goto.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
---
 drivers/gpu/drm/i915/i915_gem.c | 27 ++++++++++++---------------
 1 file changed, 12 insertions(+), 15 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 5ec3ebf33bc8..e18e30dfac4d 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -1516,26 +1516,23 @@ i915_gem_sw_finish_ioctl(struct drm_device *dev, void *data,
 {
 	struct drm_i915_gem_sw_finish *args = data;
 	struct drm_i915_gem_object *obj;
-	int ret = 0;
-
-	ret = i915_mutex_lock_interruptible(dev);
-	if (ret)
-		return ret;
+	int err = 0;
 
 	obj = i915_gem_object_lookup(file, args->handle);
-	if (!obj) {
-		ret = -ENOENT;
-		goto unlock;
-	}
+	if (!obj)
+		return -ENOENT;
 
 	/* Pinned buffers may be scanout, so flush the cache */
-	if (obj->pin_display)
-		i915_gem_object_flush_cpu_write_domain(obj);
+	if (READ_ONCE(obj->pin_display)) {
+		err = i915_mutex_lock_interruptible(dev);
+		if (!err) {
+			i915_gem_object_flush_cpu_write_domain(obj);
+			mutex_unlock(&dev->struct_mutex);
+		}
+	}
 
-	i915_gem_object_put(obj);
-unlock:
-	mutex_unlock(&dev->struct_mutex);
-	return ret;
+	i915_gem_object_put_unlocked(obj);
+	return err;
 }
 
 /**
-- 
2.8.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH 15/19] drm/i915: Remove pinned check from madvise ioctl
  2016-08-04 19:52 Using RCU requests, take 2 Chris Wilson
                   ` (13 preceding siblings ...)
  2016-08-04 19:52 ` [PATCH 14/19] drm/i915: Reduce locking inside swfinish ioctl Chris Wilson
@ 2016-08-04 19:52 ` Chris Wilson
  2016-08-04 19:52 ` [PATCH 16/19] drm/i915: Remove locking for get_tiling Chris Wilson
                   ` (4 subsequent siblings)
  19 siblings, 0 replies; 33+ messages in thread
From: Chris Wilson @ 2016-08-04 19:52 UTC (permalink / raw)
  To: intel-gfx; +Cc: Daniel Vetter

We don't need to incur the overhead of checking whether the object is
pinned prior to changing its madvise. If the object is pinned, the
madvise will not take effect until it is unpinned and so we cannot free
the pages being pointed at by hardware. Marking a pinned object with
allocated pages as DONTNEED will not trigger any undue warnings. The check
is therefore superfluous, and by removing it we can remove a linear walk
over all the vma the object has.

Still despite it being an overzealous check, that error code is part of
the current ABI and so we must proceed with caution.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
---
 drivers/gpu/drm/i915/i915_gem.c | 6 ------
 1 file changed, 6 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index e18e30dfac4d..eb7b0f4d8a53 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -3876,11 +3876,6 @@ i915_gem_madvise_ioctl(struct drm_device *dev, void *data,
 		goto unlock;
 	}
 
-	if (i915_gem_obj_is_pinned(obj)) {
-		ret = -EINVAL;
-		goto out;
-	}
-
 	if (obj->pages &&
 	    obj->tiling_mode != I915_TILING_NONE &&
 	    dev_priv->quirks & QUIRK_PIN_SWIZZLED_PAGES) {
@@ -3899,7 +3894,6 @@ i915_gem_madvise_ioctl(struct drm_device *dev, void *data,
 
 	args->retained = obj->madv != __I915_MADV_PURGED;
 
-out:
 	i915_gem_object_put(obj);
 unlock:
 	mutex_unlock(&dev->struct_mutex);
-- 
2.8.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH 16/19] drm/i915: Remove locking for get_tiling
  2016-08-04 19:52 Using RCU requests, take 2 Chris Wilson
                   ` (14 preceding siblings ...)
  2016-08-04 19:52 ` [PATCH 15/19] drm/i915: Remove pinned check from madvise ioctl Chris Wilson
@ 2016-08-04 19:52 ` Chris Wilson
  2016-08-04 19:52 ` [PATCH 17/19] drm/i915: Document and reject invalid tiling modes Chris Wilson
                   ` (3 subsequent siblings)
  19 siblings, 0 replies; 33+ messages in thread
From: Chris Wilson @ 2016-08-04 19:52 UTC (permalink / raw)
  To: intel-gfx

Since we are not concerned with userspace racing itself with set-tiling
(the order is indeterminant even if we take a lock), then we can safely
read back the single obj->tiling_mode and do the static lookup of
swizzle mode without having to take a lock.

get-tiling is reasonably frequent due to the back-channel passing around
of tiling parameters in DRI2/DRI3.

v2: Make tiling_mode a full unsigned int so that we can trivially use it
with READ_ONCE(). Separating it out into manual control over the flags
field was too noisy for a simple patch. Note that we could use the lower
bits of obj->stride for the tiling mode.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
---
 drivers/gpu/drm/i915/i915_drv.h        | 15 ++++++++-------
 drivers/gpu/drm/i915/i915_gem_tiling.c | 10 +++-------
 2 files changed, 11 insertions(+), 14 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 31a614fe9ed7..f18d8761305c 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2183,10 +2183,6 @@ struct drm_i915_gem_object {
 	unsigned int madv:2;
 
 	/**
-	 * Current tiling mode for the object.
-	 */
-	unsigned int tiling_mode:2;
-	/**
 	 * Whether the tiling parameters for the currently associated fence
 	 * register have changed. Note that for the purposes of tracking
 	 * tiling changes we also treat the unfenced register, the register
@@ -2218,6 +2214,14 @@ struct drm_i915_gem_object {
 
 	atomic_t frontbuffer_bits;
 
+	/**
+	 * Current tiling mode for the object.
+	 */
+	unsigned int tiling_mode;
+
+	/** Current tiling stride for the object, if it's tiled. */
+	uint32_t stride;
+
 	unsigned int has_wc_mmap;
 	/** Count of VMA actually bound by this object */
 	unsigned int bind_count;
@@ -2245,9 +2249,6 @@ struct drm_i915_gem_object {
 	struct i915_gem_active last_write;
 	struct i915_gem_active last_fence;
 
-	/** Current tiling stride for the object, if it's tiled. */
-	uint32_t stride;
-
 	/** References from framebuffers, locks out tiling changes. */
 	unsigned long framebuffer_references;
 
diff --git a/drivers/gpu/drm/i915/i915_gem_tiling.c b/drivers/gpu/drm/i915/i915_gem_tiling.c
index b7f9875f69b4..c0e01333bddf 100644
--- a/drivers/gpu/drm/i915/i915_gem_tiling.c
+++ b/drivers/gpu/drm/i915/i915_gem_tiling.c
@@ -303,10 +303,8 @@ i915_gem_get_tiling(struct drm_device *dev, void *data,
 	if (!obj)
 		return -ENOENT;
 
-	mutex_lock(&dev->struct_mutex);
-
-	args->tiling_mode = obj->tiling_mode;
-	switch (obj->tiling_mode) {
+	args->tiling_mode = READ_ONCE(obj->tiling_mode);
+	switch (args->tiling_mode) {
 	case I915_TILING_X:
 		args->swizzle_mode = dev_priv->mm.bit_6_swizzle_x;
 		break;
@@ -330,8 +328,6 @@ i915_gem_get_tiling(struct drm_device *dev, void *data,
 	if (args->swizzle_mode == I915_BIT_6_SWIZZLE_9_10_17)
 		args->swizzle_mode = I915_BIT_6_SWIZZLE_9_10;
 
-	i915_gem_object_put(obj);
-	mutex_unlock(&dev->struct_mutex);
-
+	i915_gem_object_put_unlocked(obj);
 	return 0;
 }
-- 
2.8.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH 17/19] drm/i915: Document and reject invalid tiling modes
  2016-08-04 19:52 Using RCU requests, take 2 Chris Wilson
                   ` (15 preceding siblings ...)
  2016-08-04 19:52 ` [PATCH 16/19] drm/i915: Remove locking for get_tiling Chris Wilson
@ 2016-08-04 19:52 ` Chris Wilson
  2016-08-05  6:44   ` Joonas Lahtinen
  2016-08-04 19:52 ` [PATCH 18/19] drm/i915: Repack fence tiling mode and stride into a single integer Chris Wilson
                   ` (2 subsequent siblings)
  19 siblings, 1 reply; 33+ messages in thread
From: Chris Wilson @ 2016-08-04 19:52 UTC (permalink / raw)
  To: intel-gfx

Through the GTT interface to the fence registers, we can only handle
linear, X and Y tiling. The more esoteric tiling patterns are ignored.
Document that the tiling ABI only supports upto Y tiling, and reject any
attempts to set a tiling mode other than NONE, X or Y.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
---
 drivers/gpu/drm/i915/i915_gem_tiling.c | 3 +++
 include/uapi/drm/i915_drm.h            | 1 +
 2 files changed, 4 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_gem_tiling.c b/drivers/gpu/drm/i915/i915_gem_tiling.c
index c0e01333bddf..6817f69947d9 100644
--- a/drivers/gpu/drm/i915/i915_gem_tiling.c
+++ b/drivers/gpu/drm/i915/i915_gem_tiling.c
@@ -68,6 +68,9 @@ i915_tiling_ok(struct drm_device *dev, int stride, int size, int tiling_mode)
 	if (tiling_mode == I915_TILING_NONE)
 		return true;
 
+	if (tiling_mode > I915_TILING_LAST)
+		return false;
+
 	if (IS_GEN2(dev) ||
 	    (tiling_mode == I915_TILING_Y && HAS_128_BYTE_Y_TILING(dev)))
 		tile_width = 128;
diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
index 0f292733cffc..452629de7a57 100644
--- a/include/uapi/drm/i915_drm.h
+++ b/include/uapi/drm/i915_drm.h
@@ -926,6 +926,7 @@ struct drm_i915_gem_caching {
 #define I915_TILING_NONE	0
 #define I915_TILING_X		1
 #define I915_TILING_Y		2
+#define I915_TILING_LAST	I915_TILING_Y
 
 #define I915_BIT_6_SWIZZLE_NONE		0
 #define I915_BIT_6_SWIZZLE_9		1
-- 
2.8.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH 18/19] drm/i915: Repack fence tiling mode and stride into a single integer
  2016-08-04 19:52 Using RCU requests, take 2 Chris Wilson
                   ` (16 preceding siblings ...)
  2016-08-04 19:52 ` [PATCH 17/19] drm/i915: Document and reject invalid tiling modes Chris Wilson
@ 2016-08-04 19:52 ` Chris Wilson
  2016-08-04 19:52 ` [PATCH 19/19] drm/i915: Assert that the request hasn't been retired Chris Wilson
  2016-08-05  5:49 ` ✗ Ro.CI.BAT: failure for series starting with [01/19] drm/i915: Introduce i915_gem_active_wait_unlocked() Patchwork
  19 siblings, 0 replies; 33+ messages in thread
From: Chris Wilson @ 2016-08-04 19:52 UTC (permalink / raw)
  To: intel-gfx

In the previous commit, we moved the obj->tiling_mode out of a bitfield
and into its own integer so that we could safely use READ_ONCE(). Let us
now repair some of that damage by sharing the tiling_mode with its
companion, the fence stride.

v2: New magic

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
---
 drivers/gpu/drm/i915/i915_debugfs.c        |  2 +-
 drivers/gpu/drm/i915/i915_drv.h            | 30 +++++++++++++++++------
 drivers/gpu/drm/i915/i915_gem.c            | 20 ++++++++-------
 drivers/gpu/drm/i915/i915_gem_execbuffer.c |  2 +-
 drivers/gpu/drm/i915/i915_gem_fence.c      | 39 ++++++++++++++++++------------
 drivers/gpu/drm/i915/i915_gem_tiling.c     | 19 +++++++++------
 drivers/gpu/drm/i915/i915_gpu_error.c      |  2 +-
 drivers/gpu/drm/i915/intel_display.c       | 34 +++++++++++++-------------
 drivers/gpu/drm/i915/intel_fbc.c           |  2 +-
 drivers/gpu/drm/i915/intel_overlay.c       |  2 +-
 drivers/gpu/drm/i915/intel_pm.c            |  2 +-
 drivers/gpu/drm/i915/intel_sprite.c        | 12 ++++-----
 12 files changed, 98 insertions(+), 68 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index 1faea382dfeb..0620a84d00ca 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -101,7 +101,7 @@ static char get_pin_flag(struct drm_i915_gem_object *obj)
 
 static char get_tiling_flag(struct drm_i915_gem_object *obj)
 {
-	switch (obj->tiling_mode) {
+	switch (i915_gem_object_get_tiling(obj)) {
 	default:
 	case I915_TILING_NONE: return ' ';
 	case I915_TILING_X: return 'X';
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index f18d8761305c..feec00f769e1 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2214,13 +2214,11 @@ struct drm_i915_gem_object {
 
 	atomic_t frontbuffer_bits;
 
-	/**
-	 * Current tiling mode for the object.
-	 */
-	unsigned int tiling_mode;
-
 	/** Current tiling stride for the object, if it's tiled. */
-	uint32_t stride;
+	unsigned int tiling_and_stride;
+#define FENCE_MINIMUM_STRIDE 128 /* See i915_tiling_ok() */
+#define TILING_MASK (FENCE_MINIMUM_STRIDE-1)
+#define STRIDE_MASK (~TILING_MASK)
 
 	unsigned int has_wc_mmap;
 	/** Count of VMA actually bound by this object */
@@ -2359,6 +2357,24 @@ i915_gem_object_has_active_engine(const struct drm_i915_gem_object *obj,
 	return obj->flags & BIT(engine + I915_BO_ACTIVE_SHIFT);
 }
 
+static inline unsigned int
+i915_gem_object_get_tiling(struct drm_i915_gem_object *obj)
+{
+	return obj->tiling_and_stride & TILING_MASK;
+}
+
+static inline bool
+i915_gem_object_is_tiled(struct drm_i915_gem_object *obj)
+{
+	return i915_gem_object_get_tiling(obj) != I915_TILING_NONE;
+}
+
+static inline unsigned int
+i915_gem_object_get_stride(struct drm_i915_gem_object *obj)
+{
+	return obj->tiling_and_stride & STRIDE_MASK;
+}
+
 /*
  * Optimised SGL iterator for GEM objects
  */
@@ -3457,7 +3473,7 @@ static inline bool i915_gem_object_needs_bit17_swizzle(struct drm_i915_gem_objec
 	struct drm_i915_private *dev_priv = to_i915(obj->base.dev);
 
 	return dev_priv->mm.bit_6_swizzle_x == I915_BIT_6_SWIZZLE_9_10_17 &&
-		obj->tiling_mode != I915_TILING_NONE;
+		i915_gem_object_is_tiled(obj);
 }
 
 /* i915_debugfs.c */
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index eb7b0f4d8a53..2b5cd63bca44 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -1041,7 +1041,7 @@ i915_gem_gtt_pwrite_fast(struct drm_i915_private *i915,
 	int ret;
 	bool hit_slow_path = false;
 
-	if (obj->tiling_mode != I915_TILING_NONE)
+	if (i915_gem_object_is_tiled(obj))
 		return -EFAULT;
 
 	ret = i915_gem_object_ggtt_pin(obj, NULL, 0, 0,
@@ -1669,7 +1669,7 @@ int i915_gem_fault(struct vm_area_struct *vma, struct vm_fault *vmf)
 
 	/* Use a partial view if the object is bigger than the aperture. */
 	if (obj->base.size >= ggtt->mappable_end &&
-	    obj->tiling_mode == I915_TILING_NONE) {
+	    !i915_gem_object_is_tiled(obj)) {
 		static const unsigned int chunk_size = 256; // 1 MiB
 
 		memset(&view, 0, sizeof(view));
@@ -2187,7 +2187,7 @@ i915_gem_object_get_pages_gtt(struct drm_i915_gem_object *obj)
 	if (i915_gem_object_needs_bit17_swizzle(obj))
 		i915_gem_object_do_bit_17_swizzle(obj);
 
-	if (obj->tiling_mode != I915_TILING_NONE &&
+	if (i915_gem_object_is_tiled(obj) &&
 	    dev_priv->quirks & QUIRK_PIN_SWIZZLED_PAGES)
 		i915_gem_object_pin_pages(obj);
 
@@ -2936,10 +2936,12 @@ i915_vma_insert(struct i915_vma *vma, u64 size, u64 alignment, u64 flags)
 
 	size = max(size, vma->size);
 	if (flags & PIN_MAPPABLE)
-		size = i915_gem_get_ggtt_size(dev_priv, size, obj->tiling_mode);
+		size = i915_gem_get_ggtt_size(dev_priv, size,
+					      i915_gem_object_get_tiling(obj));
 
 	min_alignment =
-		i915_gem_get_ggtt_alignment(dev_priv, size, obj->tiling_mode,
+		i915_gem_get_ggtt_alignment(dev_priv, size,
+					    i915_gem_object_get_tiling(obj),
 					    flags & PIN_MAPPABLE);
 	if (alignment == 0)
 		alignment = min_alignment;
@@ -3635,10 +3637,10 @@ void __i915_vma_set_map_and_fenceable(struct i915_vma *vma)
 
 	fence_size = i915_gem_get_ggtt_size(dev_priv,
 					    obj->base.size,
-					    obj->tiling_mode);
+					    i915_gem_object_get_tiling(obj));
 	fence_alignment = i915_gem_get_ggtt_alignment(dev_priv,
 						      obj->base.size,
-						      obj->tiling_mode,
+						      i915_gem_object_get_tiling(obj),
 						      true);
 
 	fenceable = (vma->node.size == fence_size &&
@@ -3877,7 +3879,7 @@ i915_gem_madvise_ioctl(struct drm_device *dev, void *data,
 	}
 
 	if (obj->pages &&
-	    obj->tiling_mode != I915_TILING_NONE &&
+	    i915_gem_object_is_tiled(obj) &&
 	    dev_priv->quirks & QUIRK_PIN_SWIZZLED_PAGES) {
 		if (obj->madv == I915_MADV_WILLNEED)
 			i915_gem_object_unpin_pages(obj);
@@ -4047,7 +4049,7 @@ void i915_gem_free_object(struct drm_gem_object *gem_obj)
 
 	if (obj->pages && obj->madv == I915_MADV_WILLNEED &&
 	    dev_priv->quirks & QUIRK_PIN_SWIZZLED_PAGES &&
-	    obj->tiling_mode != I915_TILING_NONE)
+	    i915_gem_object_is_tiled(obj))
 		i915_gem_object_unpin_pages(obj);
 
 	if (WARN_ON(obj->pages_pin_count))
diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index 71834741bd87..c494b79ded20 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -803,7 +803,7 @@ i915_gem_execbuffer_reserve(struct intel_engine_cs *engine,
 			entry->flags &= ~EXEC_OBJECT_NEEDS_FENCE;
 		need_fence =
 			entry->flags & EXEC_OBJECT_NEEDS_FENCE &&
-			obj->tiling_mode != I915_TILING_NONE;
+			i915_gem_object_is_tiled(obj);
 		need_mappable = need_fence || need_reloc_mappable(vma);
 
 		if (entry->flags & EXEC_OBJECT_PINNED)
diff --git a/drivers/gpu/drm/i915/i915_gem_fence.c b/drivers/gpu/drm/i915/i915_gem_fence.c
index 3b462da612ca..9e8173fe2a09 100644
--- a/drivers/gpu/drm/i915/i915_gem_fence.c
+++ b/drivers/gpu/drm/i915/i915_gem_fence.c
@@ -86,20 +86,22 @@ static void i965_write_fence_reg(struct drm_device *dev, int reg,
 
 	if (obj) {
 		u32 size = i915_gem_obj_ggtt_size(obj);
+		unsigned int tiling = i915_gem_object_get_tiling(obj);
+		unsigned int stride = i915_gem_object_get_stride(obj);
 		uint64_t val;
 
 		/* Adjust fence size to match tiled area */
-		if (obj->tiling_mode != I915_TILING_NONE) {
-			uint32_t row_size = obj->stride *
-				(obj->tiling_mode == I915_TILING_Y ? 32 : 8);
+		if (tiling != I915_TILING_NONE) {
+			uint32_t row_size = stride *
+				(tiling == I915_TILING_Y ? 32 : 8);
 			size = (size / row_size) * row_size;
 		}
 
 		val = (uint64_t)((i915_gem_obj_ggtt_offset(obj) + size - 4096) &
 				 0xfffff000) << 32;
 		val |= i915_gem_obj_ggtt_offset(obj) & 0xfffff000;
-		val |= (uint64_t)((obj->stride / 128) - 1) << fence_pitch_shift;
-		if (obj->tiling_mode == I915_TILING_Y)
+		val |= (uint64_t)((stride / 128) - 1) << fence_pitch_shift;
+		if (tiling == I915_TILING_Y)
 			val |= 1 << I965_FENCE_TILING_Y_SHIFT;
 		val |= I965_FENCE_REG_VALID;
 
@@ -122,6 +124,8 @@ static void i915_write_fence_reg(struct drm_device *dev, int reg,
 
 	if (obj) {
 		u32 size = i915_gem_obj_ggtt_size(obj);
+		unsigned int tiling = i915_gem_object_get_tiling(obj);
+		unsigned int stride = i915_gem_object_get_stride(obj);
 		int pitch_val;
 		int tile_width;
 
@@ -131,17 +135,17 @@ static void i915_write_fence_reg(struct drm_device *dev, int reg,
 		     "object 0x%08llx [fenceable? %d] not 1M or pot-size (0x%08x) aligned\n",
 		     i915_gem_obj_ggtt_offset(obj), obj->map_and_fenceable, size);
 
-		if (obj->tiling_mode == I915_TILING_Y && HAS_128_BYTE_Y_TILING(dev))
+		if (tiling == I915_TILING_Y && HAS_128_BYTE_Y_TILING(dev))
 			tile_width = 128;
 		else
 			tile_width = 512;
 
 		/* Note: pitch better be a power of two tile widths */
-		pitch_val = obj->stride / tile_width;
+		pitch_val = stride / tile_width;
 		pitch_val = ffs(pitch_val) - 1;
 
 		val = i915_gem_obj_ggtt_offset(obj);
-		if (obj->tiling_mode == I915_TILING_Y)
+		if (tiling == I915_TILING_Y)
 			val |= 1 << I830_FENCE_TILING_Y_SHIFT;
 		val |= I915_FENCE_SIZE_BITS(size);
 		val |= pitch_val << I830_FENCE_PITCH_SHIFT;
@@ -161,6 +165,8 @@ static void i830_write_fence_reg(struct drm_device *dev, int reg,
 
 	if (obj) {
 		u32 size = i915_gem_obj_ggtt_size(obj);
+		unsigned int tiling = i915_gem_object_get_tiling(obj);
+		unsigned int stride = i915_gem_object_get_stride(obj);
 		uint32_t pitch_val;
 
 		WARN((i915_gem_obj_ggtt_offset(obj) & ~I830_FENCE_START_MASK) ||
@@ -169,11 +175,11 @@ static void i830_write_fence_reg(struct drm_device *dev, int reg,
 		     "object 0x%08llx not 512K or pot-size 0x%08x aligned\n",
 		     i915_gem_obj_ggtt_offset(obj), size);
 
-		pitch_val = obj->stride / 128;
+		pitch_val = stride / 128;
 		pitch_val = ffs(pitch_val) - 1;
 
 		val = i915_gem_obj_ggtt_offset(obj);
-		if (obj->tiling_mode == I915_TILING_Y)
+		if (tiling == I915_TILING_Y)
 			val |= 1 << I830_FENCE_TILING_Y_SHIFT;
 		val |= I830_FENCE_SIZE_BITS(size);
 		val |= pitch_val << I830_FENCE_PITCH_SHIFT;
@@ -201,9 +207,12 @@ static void i915_gem_write_fence(struct drm_device *dev, int reg,
 	if (i915_gem_object_needs_mb(dev_priv->fence_regs[reg].obj))
 		mb();
 
-	WARN(obj && (!obj->stride || !obj->tiling_mode),
+	WARN(obj &&
+	     (!i915_gem_object_get_stride(obj) ||
+	      !i915_gem_object_get_tiling(obj)),
 	     "bogus fence setup with stride: 0x%x, tiling mode: %i\n",
-	     obj->stride, obj->tiling_mode);
+	     i915_gem_object_get_stride(obj),
+	     i915_gem_object_get_tiling(obj));
 
 	if (IS_GEN2(dev))
 		i830_write_fence_reg(dev, reg, obj);
@@ -248,7 +257,7 @@ static void i915_gem_object_update_fence(struct drm_i915_gem_object *obj,
 
 static inline void i915_gem_object_fence_lost(struct drm_i915_gem_object *obj)
 {
-	if (obj->tiling_mode)
+	if (i915_gem_object_is_tiled(obj))
 		i915_gem_release_mmap(obj);
 
 	/* As we do not have an associated fence register, we will force
@@ -361,7 +370,7 @@ i915_gem_object_get_fence(struct drm_i915_gem_object *obj)
 {
 	struct drm_device *dev = obj->base.dev;
 	struct drm_i915_private *dev_priv = to_i915(dev);
-	bool enable = obj->tiling_mode != I915_TILING_NONE;
+	bool enable = i915_gem_object_is_tiled(obj);
 	struct drm_i915_fence_reg *reg;
 	int ret;
 
@@ -477,7 +486,7 @@ void i915_gem_restore_fences(struct drm_device *dev)
 		 */
 		if (reg->obj) {
 			i915_gem_object_update_fence(reg->obj, reg,
-						     reg->obj->tiling_mode);
+						     i915_gem_object_get_tiling(reg->obj));
 		} else {
 			i915_gem_write_fence(dev, i, NULL);
 		}
diff --git a/drivers/gpu/drm/i915/i915_gem_tiling.c b/drivers/gpu/drm/i915/i915_gem_tiling.c
index 6817f69947d9..f4b984de83b5 100644
--- a/drivers/gpu/drm/i915/i915_gem_tiling.c
+++ b/drivers/gpu/drm/i915/i915_gem_tiling.c
@@ -170,6 +170,9 @@ i915_gem_set_tiling(struct drm_device *dev, void *data,
 	struct drm_i915_gem_object *obj;
 	int ret = 0;
 
+	/* Make sure we don't cross-contaminate obj->tiling_and_stride */
+	BUILD_BUG_ON(I915_TILING_LAST & STRIDE_MASK);
+
 	obj = i915_gem_object_lookup(file, args->handle);
 	if (!obj)
 		return -ENOENT;
@@ -217,8 +220,8 @@ i915_gem_set_tiling(struct drm_device *dev, void *data,
 		}
 	}
 
-	if (args->tiling_mode != obj->tiling_mode ||
-	    args->stride != obj->stride) {
+	if (args->tiling_mode != i915_gem_object_get_tiling(obj) ||
+	    args->stride != i915_gem_object_get_stride(obj)) {
 		/* We need to rebind the object if its current allocation
 		 * no longer meets the alignment restrictions for its new
 		 * tiling mode. Otherwise we can just leave it alone, but
@@ -241,7 +244,7 @@ i915_gem_set_tiling(struct drm_device *dev, void *data,
 			    dev_priv->quirks & QUIRK_PIN_SWIZZLED_PAGES) {
 				if (args->tiling_mode == I915_TILING_NONE)
 					i915_gem_object_unpin_pages(obj);
-				if (obj->tiling_mode == I915_TILING_NONE)
+				if (!i915_gem_object_is_tiled(obj))
 					i915_gem_object_pin_pages(obj);
 			}
 
@@ -250,16 +253,16 @@ i915_gem_set_tiling(struct drm_device *dev, void *data,
 							 &dev->struct_mutex) ||
 				obj->fence_reg != I915_FENCE_REG_NONE;
 
-			obj->tiling_mode = args->tiling_mode;
-			obj->stride = args->stride;
+			obj->tiling_and_stride =
+				args->stride | args->tiling_mode;
 
 			/* Force the fence to be reacquired for GTT access */
 			i915_gem_release_mmap(obj);
 		}
 	}
 	/* we have to maintain this existing ABI... */
-	args->stride = obj->stride;
-	args->tiling_mode = obj->tiling_mode;
+	args->stride = i915_gem_object_get_stride(obj);
+	args->tiling_mode = i915_gem_object_get_tiling(obj);
 
 	/* Try to preallocate memory required to save swizzling on put-pages */
 	if (i915_gem_object_needs_bit17_swizzle(obj)) {
@@ -306,7 +309,7 @@ i915_gem_get_tiling(struct drm_device *dev, void *data,
 	if (!obj)
 		return -ENOENT;
 
-	args->tiling_mode = READ_ONCE(obj->tiling_mode);
+	args->tiling_mode = READ_ONCE(obj->tiling_and_stride) & TILING_MASK;
 	switch (args->tiling_mode) {
 	case I915_TILING_X:
 		args->swizzle_mode = dev_priv->mm.bit_6_swizzle_x;
diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
index cc28ad429dd8..eecb87063c88 100644
--- a/drivers/gpu/drm/i915/i915_gpu_error.c
+++ b/drivers/gpu/drm/i915/i915_gpu_error.c
@@ -781,7 +781,7 @@ static void capture_bo(struct drm_i915_error_buffer *err,
 	err->pinned = 0;
 	if (i915_gem_obj_is_pinned(obj))
 		err->pinned = 1;
-	err->tiling = obj->tiling_mode;
+	err->tiling = i915_gem_object_get_tiling(obj);
 	err->dirty = obj->dirty;
 	err->purgeable = obj->madv != I915_MADV_WILLNEED;
 	err->userptr = obj->userptr.mm != NULL;
diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index 9068676943bf..9cbf5431c1e3 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -2466,9 +2466,8 @@ intel_alloc_initial_plane_obj(struct intel_crtc *crtc,
 		return false;
 	}
 
-	obj->tiling_mode = plane_config->tiling;
-	if (obj->tiling_mode == I915_TILING_X)
-		obj->stride = fb->pitches[0];
+	if (plane_config->tiling == I915_TILING_X)
+		obj->tiling_and_stride = fb->pitches[0] | I915_TILING_X;
 
 	mode_cmd.pixel_format = fb->pixel_format;
 	mode_cmd.width = fb->width;
@@ -2594,7 +2593,7 @@ valid_fb:
 	intel_state->dst.y2 = plane_state->crtc_y + plane_state->crtc_h;
 
 	obj = intel_fb_obj(fb);
-	if (obj->tiling_mode != I915_TILING_NONE)
+	if (i915_gem_object_is_tiled(obj))
 		dev_priv->preserve_bios_swizzle = true;
 
 	drm_framebuffer_reference(fb);
@@ -2672,8 +2671,7 @@ static void i9xx_update_primary_plane(struct drm_plane *primary,
 		BUG();
 	}
 
-	if (INTEL_INFO(dev)->gen >= 4 &&
-	    obj->tiling_mode != I915_TILING_NONE)
+	if (INTEL_INFO(dev)->gen >= 4 && i915_gem_object_is_tiled(obj))
 		dspcntr |= DISPPLANE_TILED;
 
 	if (IS_G4X(dev))
@@ -2782,7 +2780,7 @@ static void ironlake_update_primary_plane(struct drm_plane *primary,
 		BUG();
 	}
 
-	if (obj->tiling_mode != I915_TILING_NONE)
+	if (i915_gem_object_is_tiled(obj))
 		dspcntr |= DISPPLANE_TILED;
 
 	if (!IS_HASWELL(dev) && !IS_BROADWELL(dev))
@@ -11200,7 +11198,7 @@ static int intel_gen4_queue_flip(struct drm_device *dev,
 			MI_DISPLAY_FLIP_PLANE(intel_crtc->plane));
 	intel_ring_emit(ring, fb->pitches[0]);
 	intel_ring_emit(ring, intel_crtc->flip_work->gtt_offset |
-			obj->tiling_mode);
+			i915_gem_object_get_tiling(obj));
 
 	/* XXX Enabling the panel-fitter across page-flip is so far
 	 * untested on non-native modes, so ignore it for now.
@@ -11232,7 +11230,7 @@ static int intel_gen6_queue_flip(struct drm_device *dev,
 
 	intel_ring_emit(ring, MI_DISPLAY_FLIP |
 			MI_DISPLAY_FLIP_PLANE(intel_crtc->plane));
-	intel_ring_emit(ring, fb->pitches[0] | obj->tiling_mode);
+	intel_ring_emit(ring, fb->pitches[0] | i915_gem_object_get_tiling(obj));
 	intel_ring_emit(ring, intel_crtc->flip_work->gtt_offset);
 
 	/* Contrary to the suggestions in the documentation,
@@ -11335,7 +11333,7 @@ static int intel_gen7_queue_flip(struct drm_device *dev,
 	}
 
 	intel_ring_emit(ring, MI_DISPLAY_FLIP_I915 | plane_bit);
-	intel_ring_emit(ring, (fb->pitches[0] | obj->tiling_mode));
+	intel_ring_emit(ring, fb->pitches[0] | i915_gem_object_get_tiling(obj));
 	intel_ring_emit(ring, intel_crtc->flip_work->gtt_offset);
 	intel_ring_emit(ring, (MI_NOOP));
 
@@ -11442,7 +11440,7 @@ static void ilk_do_mmio_flip(struct intel_crtc *intel_crtc,
 
 	dspcntr = I915_READ(reg);
 
-	if (obj->tiling_mode != I915_TILING_NONE)
+	if (i915_gem_object_is_tiled(obj))
 		dspcntr |= DISPPLANE_TILED;
 	else
 		dspcntr &= ~DISPPLANE_TILED;
@@ -11670,7 +11668,8 @@ static int intel_crtc_page_flip(struct drm_crtc *crtc,
 
 	if (IS_VALLEYVIEW(dev) || IS_CHERRYVIEW(dev)) {
 		engine = &dev_priv->engine[BCS];
-		if (obj->tiling_mode != intel_fb_obj(work->old_fb)->tiling_mode)
+		if (i915_gem_object_get_tiling(obj) !=
+		    i915_gem_object_get_tiling(intel_fb_obj(work->old_fb)))
 			/* vlv: DISPLAY_FLIP fails to change tiling */
 			engine = NULL;
 	} else if (IS_IVYBRIDGE(dev) || IS_HASWELL(dev)) {
@@ -14932,15 +14931,15 @@ static int intel_framebuffer_init(struct drm_device *dev,
 	if (mode_cmd->flags & DRM_MODE_FB_MODIFIERS) {
 		/* Enforce that fb modifier and tiling mode match, but only for
 		 * X-tiled. This is needed for FBC. */
-		if (!!(obj->tiling_mode == I915_TILING_X) !=
+		if (!!(i915_gem_object_get_tiling(obj) == I915_TILING_X) !=
 		    !!(mode_cmd->modifier[0] == I915_FORMAT_MOD_X_TILED)) {
 			DRM_DEBUG("tiling_mode doesn't match fb modifier\n");
 			return -EINVAL;
 		}
 	} else {
-		if (obj->tiling_mode == I915_TILING_X)
+		if (i915_gem_object_get_tiling(obj) == I915_TILING_X)
 			mode_cmd->modifier[0] = I915_FORMAT_MOD_X_TILED;
-		else if (obj->tiling_mode == I915_TILING_Y) {
+		else if (i915_gem_object_get_tiling(obj) == I915_TILING_Y) {
 			DRM_DEBUG("No Y tiling for legacy addfb\n");
 			return -EINVAL;
 		}
@@ -14984,9 +14983,10 @@ static int intel_framebuffer_init(struct drm_device *dev,
 	}
 
 	if (mode_cmd->modifier[0] == I915_FORMAT_MOD_X_TILED &&
-	    mode_cmd->pitches[0] != obj->stride) {
+	    mode_cmd->pitches[0] != i915_gem_object_get_stride(obj)) {
 		DRM_DEBUG("pitch (%d) must match tiling stride (%d)\n",
-			  mode_cmd->pitches[0], obj->stride);
+			  mode_cmd->pitches[0],
+			  i915_gem_object_get_stride(obj));
 		return -EINVAL;
 	}
 
diff --git a/drivers/gpu/drm/i915/intel_fbc.c b/drivers/gpu/drm/i915/intel_fbc.c
index d4be07615aa9..85adc2b92594 100644
--- a/drivers/gpu/drm/i915/intel_fbc.c
+++ b/drivers/gpu/drm/i915/intel_fbc.c
@@ -741,7 +741,7 @@ static void intel_fbc_update_state_cache(struct intel_crtc *crtc,
 	cache->fb.pixel_format = fb->pixel_format;
 	cache->fb.stride = fb->pitches[0];
 	cache->fb.fence_reg = obj->fence_reg;
-	cache->fb.tiling_mode = obj->tiling_mode;
+	cache->fb.tiling_mode = i915_gem_object_get_tiling(obj);
 }
 
 static bool intel_fbc_can_activate(struct intel_crtc *crtc)
diff --git a/drivers/gpu/drm/i915/intel_overlay.c b/drivers/gpu/drm/i915/intel_overlay.c
index 413a2038e6d1..90f3ab424e01 100644
--- a/drivers/gpu/drm/i915/intel_overlay.c
+++ b/drivers/gpu/drm/i915/intel_overlay.c
@@ -1129,7 +1129,7 @@ int intel_overlay_put_image_ioctl(struct drm_device *dev, void *data,
 	drm_modeset_lock_all(dev);
 	mutex_lock(&dev->struct_mutex);
 
-	if (new_bo->tiling_mode) {
+	if (i915_gem_object_is_tiled(new_bo)) {
 		DRM_DEBUG_KMS("buffer used for overlay image can not be tiled\n");
 		ret = -EINVAL;
 		goto out_unlock;
diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index 4eac4b4beece..0440a7dfcd6c 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -1585,7 +1585,7 @@ static void i9xx_update_wm(struct drm_crtc *unused_crtc)
 		obj = intel_fb_obj(enabled->primary->state->fb);
 
 		/* self-refresh seems busted with untiled */
-		if (obj->tiling_mode == I915_TILING_NONE)
+		if (!i915_gem_object_is_tiled(obj))
 			enabled = NULL;
 	}
 
diff --git a/drivers/gpu/drm/i915/intel_sprite.c b/drivers/gpu/drm/i915/intel_sprite.c
index c47b74d9cfb1..5beafd4bc1c1 100644
--- a/drivers/gpu/drm/i915/intel_sprite.c
+++ b/drivers/gpu/drm/i915/intel_sprite.c
@@ -431,7 +431,7 @@ vlv_update_plane(struct drm_plane *dplane,
 	 */
 	sprctl |= SP_GAMMA_ENABLE;
 
-	if (obj->tiling_mode != I915_TILING_NONE)
+	if (i915_gem_object_is_tiled(obj))
 		sprctl |= SP_TILED;
 
 	/* Sizes are 0 based */
@@ -468,7 +468,7 @@ vlv_update_plane(struct drm_plane *dplane,
 	I915_WRITE(SPSTRIDE(pipe, plane), fb->pitches[0]);
 	I915_WRITE(SPPOS(pipe, plane), (crtc_y << 16) | crtc_x);
 
-	if (obj->tiling_mode != I915_TILING_NONE)
+	if (i915_gem_object_is_tiled(obj))
 		I915_WRITE(SPTILEOFF(pipe, plane), (y << 16) | x);
 	else
 		I915_WRITE(SPLINOFF(pipe, plane), linear_offset);
@@ -553,7 +553,7 @@ ivb_update_plane(struct drm_plane *plane,
 	 */
 	sprctl |= SPRITE_GAMMA_ENABLE;
 
-	if (obj->tiling_mode != I915_TILING_NONE)
+	if (i915_gem_object_is_tiled(obj))
 		sprctl |= SPRITE_TILED;
 
 	if (IS_HASWELL(dev) || IS_BROADWELL(dev))
@@ -607,7 +607,7 @@ ivb_update_plane(struct drm_plane *plane,
 	 * register */
 	if (IS_HASWELL(dev) || IS_BROADWELL(dev))
 		I915_WRITE(SPROFFSET(pipe), (y << 16) | x);
-	else if (obj->tiling_mode != I915_TILING_NONE)
+	else if (i915_gem_object_is_tiled(obj))
 		I915_WRITE(SPRTILEOFF(pipe), (y << 16) | x);
 	else
 		I915_WRITE(SPRLINOFF(pipe), linear_offset);
@@ -694,7 +694,7 @@ ilk_update_plane(struct drm_plane *plane,
 	 */
 	dvscntr |= DVS_GAMMA_ENABLE;
 
-	if (obj->tiling_mode != I915_TILING_NONE)
+	if (i915_gem_object_is_tiled(obj))
 		dvscntr |= DVS_TILED;
 
 	if (IS_GEN6(dev))
@@ -737,7 +737,7 @@ ilk_update_plane(struct drm_plane *plane,
 	I915_WRITE(DVSSTRIDE(pipe), fb->pitches[0]);
 	I915_WRITE(DVSPOS(pipe), (crtc_y << 16) | crtc_x);
 
-	if (obj->tiling_mode != I915_TILING_NONE)
+	if (i915_gem_object_is_tiled(obj))
 		I915_WRITE(DVSTILEOFF(pipe), (y << 16) | x);
 	else
 		I915_WRITE(DVSLINOFF(pipe), linear_offset);
-- 
2.8.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH 19/19] drm/i915: Assert that the request hasn't been retired
  2016-08-04 19:52 Using RCU requests, take 2 Chris Wilson
                   ` (17 preceding siblings ...)
  2016-08-04 19:52 ` [PATCH 18/19] drm/i915: Repack fence tiling mode and stride into a single integer Chris Wilson
@ 2016-08-04 19:52 ` Chris Wilson
  2016-08-05  5:49 ` ✗ Ro.CI.BAT: failure for series starting with [01/19] drm/i915: Introduce i915_gem_active_wait_unlocked() Patchwork
  19 siblings, 0 replies; 33+ messages in thread
From: Chris Wilson @ 2016-08-04 19:52 UTC (permalink / raw)
  To: intel-gfx

With all callers now not playing tricks with dropping the struct_mutex
between waiting and retiring, we can assert that the request is ready to
be retired.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
---
 drivers/gpu/drm/i915/i915_gem_request.c | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_request.c b/drivers/gpu/drm/i915/i915_gem_request.c
index 1f91dc8c4171..b317a672040f 100644
--- a/drivers/gpu/drm/i915/i915_gem_request.c
+++ b/drivers/gpu/drm/i915/i915_gem_request.c
@@ -170,7 +170,7 @@ static void i915_gem_request_retire(struct drm_i915_gem_request *request)
 	struct i915_gem_active *active, *next;
 
 	trace_i915_gem_request_retire(request);
-	list_del_init(&request->link);
+	list_del(&request->link);
 
 	/* We know the GPU must have read the request to have
 	 * sent us the seqno + interrupt, so use the position
@@ -228,9 +228,7 @@ void i915_gem_request_retire_upto(struct drm_i915_gem_request *req)
 	struct drm_i915_gem_request *tmp;
 
 	lockdep_assert_held(&req->i915->drm.struct_mutex);
-
-	if (list_empty(&req->link))
-		return;
+	GEM_BUG_ON(list_empty(&req->link));
 
 	do {
 		tmp = list_first_entry(&engine->request_list,
-- 
2.8.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* Re: [PATCH 02/19] drm/i915: Convert non-blocking waits for requests over to using RCU
  2016-08-04 19:52 ` [PATCH 02/19] drm/i915: Convert non-blocking waits for requests over to using RCU Chris Wilson
@ 2016-08-05  5:38   ` Joonas Lahtinen
  0 siblings, 0 replies; 33+ messages in thread
From: Joonas Lahtinen @ 2016-08-05  5:38 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx

On to, 2016-08-04 at 20:52 +0100, Chris Wilson wrote:
> We can completely avoid taking the struct_mutex around the non-blocking
> waits by switching over to the RCU request management (trading the mutex
> for a RCU read lock and some complex atomic operations). The improvement
> is that we gain further contention reduction, and overall the code
> become simpler due to the reduced mutex dancing.
> 
> v2: Move i915_gem_fault tracepoint to start of function, before the
> unlocked wait.

"Move i915_gem_fault tracepoint *back* to start of ..." to be exact :)

Regards, Joonas
-- 
Joonas Lahtinen
Open Source Technology Center
Intel Corporation
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 05/19] drm/i915: Remove forced stop ring on suspend/unload
  2016-08-04 19:52 ` [PATCH 05/19] drm/i915: Remove forced stop ring on suspend/unload Chris Wilson
@ 2016-08-05  5:46   ` Joonas Lahtinen
  0 siblings, 0 replies; 33+ messages in thread
From: Joonas Lahtinen @ 2016-08-05  5:46 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx

On to, 2016-08-04 at 20:52 +0100, Chris Wilson wrote:
> Before suspending (or unloading), we would first wait upon all rendering
> to be completed and then disable the rings. This later step is a remanent
> from DRI1 days when we did not use request tracking for all operations
> upon the ring. Now that we are sure we are waiting upon the very last
> operation by the engine, we can forgo clobbering the ring registers,
> though we do keep the assert that the engine is indeed idle before
> sleeping.
> 
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>

Could use thorough testing again.

Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>

Regards, Joonas
-- 
Joonas Lahtinen
Open Source Technology Center
Intel Corporation
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 33+ messages in thread

* ✗ Ro.CI.BAT: failure for series starting with [01/19] drm/i915: Introduce i915_gem_active_wait_unlocked()
  2016-08-04 19:52 Using RCU requests, take 2 Chris Wilson
                   ` (18 preceding siblings ...)
  2016-08-04 19:52 ` [PATCH 19/19] drm/i915: Assert that the request hasn't been retired Chris Wilson
@ 2016-08-05  5:49 ` Patchwork
  19 siblings, 0 replies; 33+ messages in thread
From: Patchwork @ 2016-08-05  5:49 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

== Series Details ==

Series: series starting with [01/19] drm/i915: Introduce i915_gem_active_wait_unlocked()
URL   : https://patchwork.freedesktop.org/series/10686/
State : failure

== Summary ==

Series 10686v1 Series without cover letter
http://patchwork.freedesktop.org/api/1.0/series/10686/revisions/1/mbox

Test drv_module_reload_basic:
                dmesg-warn -> PASS       (fi-hsw-i7-4770k)
Test kms_cursor_legacy:
        Subgroup basic-flip-vs-cursor-legacy:
                pass       -> FAIL       (ro-bdw-i7-5600u)
                fail       -> PASS       (ro-bdw-i5-5250u)
        Subgroup basic-flip-vs-cursor-varying-size:
                pass       -> FAIL       (ro-hsw-i7-4770r)

fi-hsw-i7-4770k  total:240  pass:218  dwarn:0   dfail:0   fail:0   skip:22 
fi-kbl-qkkr      total:240  pass:181  dwarn:29  dfail:0   fail:4   skip:26 
fi-skl-i7-6700k  total:240  pass:204  dwarn:4   dfail:2   fail:2   skip:28 
fi-snb-i7-2600   total:240  pass:198  dwarn:0   dfail:0   fail:0   skip:42 
ro-bdw-i5-5250u  total:240  pass:219  dwarn:5   dfail:0   fail:0   skip:16 
ro-bdw-i7-5557U  total:240  pass:223  dwarn:1   dfail:0   fail:0   skip:16 
ro-bdw-i7-5600u  total:240  pass:205  dwarn:1   dfail:0   fail:2   skip:32 
ro-bsw-n3050     total:240  pass:194  dwarn:0   dfail:0   fail:4   skip:42 
ro-byt-n2820     total:240  pass:197  dwarn:0   dfail:0   fail:3   skip:40 
ro-hsw-i3-4010u  total:240  pass:214  dwarn:0   dfail:0   fail:0   skip:26 
ro-hsw-i7-4770r  total:240  pass:213  dwarn:0   dfail:0   fail:1   skip:26 
ro-ilk-i7-620lm  total:240  pass:172  dwarn:1   dfail:0   fail:2   skip:65 
ro-ilk1-i5-650   total:235  pass:173  dwarn:0   dfail:0   fail:2   skip:60 
ro-ivb-i7-3770   total:240  pass:205  dwarn:0   dfail:0   fail:0   skip:35 
ro-ivb2-i7-3770  total:240  pass:209  dwarn:0   dfail:0   fail:0   skip:31 
ro-skl3-i5-6260u total:240  pass:221  dwarn:1   dfail:0   fail:4   skip:14 
ro-snb-i7-2620M  total:240  pass:198  dwarn:0   dfail:0   fail:1   skip:41 
fi-skl-i5-6260u failed to connect after reboot

Results at /archive/results/CI_IGT_test/RO_Patchwork_1719/

deedff0 drm-intel-nightly: 2016y-08m-04d-23h-50m-15s UTC integration manifest
d1497c5 drm/i915: Assert that the request hasn't been retired
ebf04c1 drm/i915: Repack fence tiling mode and stride into a single integer
ec0049e drm/i915: Document and reject invalid tiling modes
c58f587 drm/i915: Remove locking for get_tiling
95c8415 drm/i915: Remove pinned check from madvise ioctl
4aa445e drm/i915: Reduce locking inside swfinish ioctl
e835356 drm/i915: Remove (struct_mutex) locking for busy-ioctl
35ce145 drm/i915: Remove (struct_mutex) locking for wait-ioctl
caf0112 drm/i915: Do a nonblocking wait first in pread/pwrite
bd39eb1 drm/i915: Remove unused no-shrinker-steal
7d887c9 drm/i915: Tidy generation of the GTT mmap offset
8bdc98a drm/i915/shrinker: Wait before acquiring struct_mutex under oom
5a8779e drm/i915: Simplify do_idling() (Ironlake vt-d w/a)
07185f4 drm/i915: Enable i915_gem_wait_for_idle() without holding struct_mutex
90e9a7f drm/i915: Remove forced stop ring on suspend/unload
bbb5820 drm/i915/userptr: Remove superfluous interruptible=false on waiting
6837af5 drm/i915: Convert non-blocking userptr waits for requests over to using RCU
80f0ac3 drm/i915: Convert non-blocking waits for requests over to using RCU
1d13908 drm/i915: Introduce i915_gem_active_wait_unlocked()

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 06/19] drm/i915: Enable i915_gem_wait_for_idle() without holding struct_mutex
  2016-08-04 19:52 ` [PATCH 06/19] drm/i915: Enable i915_gem_wait_for_idle() without holding struct_mutex Chris Wilson
@ 2016-08-05  6:16   ` Joonas Lahtinen
  2016-08-05  6:51     ` Chris Wilson
  2016-08-05  8:23     ` Chris Wilson
  0 siblings, 2 replies; 33+ messages in thread
From: Joonas Lahtinen @ 2016-08-05  6:16 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx

On to, 2016-08-04 at 20:52 +0100, Chris Wilson wrote:
> -i915_gem_suspend(struct drm_device *dev)
> +int i915_gem_suspend(struct drm_device *dev)
>  {
>  	struct drm_i915_private *dev_priv = to_i915(dev);
> -	int ret = 0;
> +	int ret;
>  
>  	intel_suspend_gt_powersave(dev_priv);
>  
> @@ -4102,7 +4099,7 @@ i915_gem_suspend(struct drm_device *dev)
>  	if (ret)
>  		goto err;
>  
> -	ret = i915_gem_wait_for_idle(dev_priv);
> +	ret = i915_gem_wait_for_idle(dev_priv, true);
>  	if (ret)
>  		goto err;
>  
> @@ -4123,7 +4120,7 @@ i915_gem_suspend(struct drm_device *dev)
>  	return 0;
>  
>  err:
> -	mutex_unlock(&dev->struct_mutex);
> +	mutex_unlock(&dev_priv->drm.struct_mutex);

Did you intend to convert the parameter to dev_priv?

> @@ -486,7 +486,8 @@ void __i915_add_request(struct drm_i915_gem_request *request,
>  	 */
>  	request->emitted_jiffies = jiffies;
>  	request->previous_seqno = engine->last_submitted_seqno;
> -	smp_store_mb(engine->last_submitted_seqno, request->fence.seqno);
> +	engine->last_submitted_seqno = request->fence.seqno;

What's up with this change?

> +	i915_gem_active_set(&engine->last_request, request);
>  	list_add_tail(&request->link, &engine->request_list);
>  	list_add_tail(&request->ring_link, &ring->request_list);
>  
> <SNIP>
> --- a/drivers/gpu/drm/i915/intel_pm.c
> +++ b/drivers/gpu/drm/i915/intel_pm.c
> @@ -6336,7 +6336,7 @@ bool i915_gpu_busy(void)
>  	dev_priv = i915_mch_dev;
>  
>  	for_each_engine(engine, dev_priv)
> -		ret |= !list_empty(&engine->request_list);
> +		ret |= intel_engine_is_active(engine);

|= always makes me think of bitfields because, well -- it is bitwise
operation :P

		if (intel_engine_is_active(engine))
			ret = true;

But I can live with what it is.

Regards, Joonas
-- 
Joonas Lahtinen
Open Source Technology Center
Intel Corporation
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 07/19] drm/i915: Simplify do_idling() (Ironlake vt-d w/a)
  2016-08-04 19:52 ` [PATCH 07/19] drm/i915: Simplify do_idling() (Ironlake vt-d w/a) Chris Wilson
@ 2016-08-05  6:19   ` Joonas Lahtinen
  0 siblings, 0 replies; 33+ messages in thread
From: Joonas Lahtinen @ 2016-08-05  6:19 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx

On to, 2016-08-04 at 20:52 +0100, Chris Wilson wrote:
> Now that we pass along the expected interruptible nature for the
> wait-for-idle, we do not need to modify the global
> i915->mm.interruptible for this single call. (Only the immediate call to
> i915_gem_wait_for_idle() takes the interruptible status as the other
> action, dma_map_sg(), is independent of i915.ko)
> 
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>

Looks far less hackish, good;

Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>

Regards, Joonas
-- 
Joonas Lahtinen
Open Source Technology Center
Intel Corporation
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 08/19] drm/i915/shrinker: Wait before acquiring struct_mutex under oom
  2016-08-04 19:52 ` [PATCH 08/19] drm/i915/shrinker: Wait before acquiring struct_mutex under oom Chris Wilson
@ 2016-08-05  6:24   ` Joonas Lahtinen
  0 siblings, 0 replies; 33+ messages in thread
From: Joonas Lahtinen @ 2016-08-05  6:24 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx

On to, 2016-08-04 at 20:52 +0100, Chris Wilson wrote:
> We can now wait for the GPU (all engines) to become idle without
> requiring the struct_mutex. Inside the shrinker, we need to currently
> take the struct_mutex in order to purge objects and to purge the objects
> we need the GPU to be idle - causing a stall whilst we hold the
> struct_mutex. We can hide most of that stall by performing the wait
> before taking the struct_mutex and only doing essential waits for
> new rendering on objects to be freed.
> 
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>

Cleaner and the timeout detection should be more robust too;

Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>

Regards, Joonas
-- 
Joonas Lahtinen
Open Source Technology Center
Intel Corporation
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 17/19] drm/i915: Document and reject invalid tiling modes
  2016-08-04 19:52 ` [PATCH 17/19] drm/i915: Document and reject invalid tiling modes Chris Wilson
@ 2016-08-05  6:44   ` Joonas Lahtinen
  0 siblings, 0 replies; 33+ messages in thread
From: Joonas Lahtinen @ 2016-08-05  6:44 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx

On to, 2016-08-04 at 20:52 +0100, Chris Wilson wrote:
> Through the GTT interface to the fence registers, we can only handle
> linear, X and Y tiling. The more esoteric tiling patterns are ignored.
> Document that the tiling ABI only supports upto Y tiling, and reject any
> attempts to set a tiling mode other than NONE, X or Y.
> 
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>

Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>

-- 
Joonas Lahtinen
Open Source Technology Center
Intel Corporation
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 06/19] drm/i915: Enable i915_gem_wait_for_idle() without holding struct_mutex
  2016-08-05  6:16   ` Joonas Lahtinen
@ 2016-08-05  6:51     ` Chris Wilson
  2016-08-05  7:31       ` Joonas Lahtinen
  2016-08-05  8:23     ` Chris Wilson
  1 sibling, 1 reply; 33+ messages in thread
From: Chris Wilson @ 2016-08-05  6:51 UTC (permalink / raw)
  To: Joonas Lahtinen; +Cc: intel-gfx

On Fri, Aug 05, 2016 at 09:16:21AM +0300, Joonas Lahtinen wrote:
> On to, 2016-08-04 at 20:52 +0100, Chris Wilson wrote:
> > @@ -486,7 +486,8 @@ void __i915_add_request(struct drm_i915_gem_request *request,
> >  	 */
> >  	request->emitted_jiffies = jiffies;
> >  	request->previous_seqno = engine->last_submitted_seqno;
> > -	smp_store_mb(engine->last_submitted_seqno, request->fence.seqno);
> > +	engine->last_submitted_seqno = request->fence.seqno;
> 
> What's up with this change?

Where we use to use an ordered store of the last seqno for checking
idleness inside hangcheck, we now use the RCU active tracking instead.

engine->last_submitted_seqno is reduced to a heuristic used for debug
(gpu error state) and for deciding when to grant a fresh waitboost.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 14/19] drm/i915: Reduce locking inside swfinish ioctl
  2016-08-04 19:52 ` [PATCH 14/19] drm/i915: Reduce locking inside swfinish ioctl Chris Wilson
@ 2016-08-05  6:55   ` Joonas Lahtinen
  0 siblings, 0 replies; 33+ messages in thread
From: Joonas Lahtinen @ 2016-08-05  6:55 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx; +Cc: Daniel Vetter

On to, 2016-08-04 at 20:52 +0100, Chris Wilson wrote:
> We only need to take the struct_mutex if the object is pinned to the
> display engine and so requires checking for clflush. (The race with
> userspace pinning the object to a framebuffer is irrelevant.)
> 
> v2: Use access once for compiler hints (or not as it is a bitfield)
> v3: READ_ONCE, obj->pin_display is not a bitfield anymore
> v4: Don't be creative with goto.
> 
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Daniel Vetter <daniel.vetter@ffwll.ch>

This is readable, and it wasn't that hard.

Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>

Regards, Joonas
-- 
Joonas Lahtinen
Open Source Technology Center
Intel Corporation
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 11/19] drm/i915: Do a nonblocking wait first in pread/pwrite
  2016-08-04 19:52 ` [PATCH 11/19] drm/i915: Do a nonblocking wait first in pread/pwrite Chris Wilson
@ 2016-08-05  7:08   ` Joonas Lahtinen
  2016-08-05  7:59     ` Chris Wilson
  0 siblings, 1 reply; 33+ messages in thread
From: Joonas Lahtinen @ 2016-08-05  7:08 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx

On to, 2016-08-04 at 20:52 +0100, Chris Wilson wrote:
> If we try and read or write to an active request, we first must wait
> upon the GPU completing that request. Let's do that without holding the
> mutex (and so allow someone else to access the GPU whilst we wait). Upn

STILL TYPO                                                    Upon ---^

>  	/* Bounds check destination. */
>  	if (args->offset > obj->base.size ||
>  	    args->size > obj->base.size - args->offset) {
>  		ret = -EINVAL;
> -		goto out;
> +		goto err;
>  	}
>  
> -	trace_i915_gem_object_pwrite(obj, args->offset, args->size);
> +	ret = __unsafe_wait_rendering(obj, to_rps_client(file), false);
> +	if (ret)
> +		goto err;
>  
> +	intel_runtime_pm_get(dev_priv);
> +
> +	ret = i915_mutex_lock_interruptible(dev);
> +	if (ret)
> +		goto err_rpm;
> +
> +	trace_i915_gem_object_pwrite(obj, args->offset, args->size);

This trace is still moved, maybe add your reasoning to commit message.

With those addressed (and the RPM fix in separate patch);

Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>

Regards, Joonas
-- 
Joonas Lahtinen
Open Source Technology Center
Intel Corporation
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 06/19] drm/i915: Enable i915_gem_wait_for_idle() without holding struct_mutex
  2016-08-05  6:51     ` Chris Wilson
@ 2016-08-05  7:31       ` Joonas Lahtinen
  0 siblings, 0 replies; 33+ messages in thread
From: Joonas Lahtinen @ 2016-08-05  7:31 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

On pe, 2016-08-05 at 07:51 +0100, Chris Wilson wrote:
> On Fri, Aug 05, 2016 at 09:16:21AM +0300, Joonas Lahtinen wrote:
> > 
> > On to, 2016-08-04 at 20:52 +0100, Chris Wilson wrote:
> > > 
> > > @@ -486,7 +486,8 @@ void __i915_add_request(struct drm_i915_gem_request *request,
> > >  	 */
> > >  	request->emitted_jiffies = jiffies;
> > >  	request->previous_seqno = engine->last_submitted_seqno;
> > > -	smp_store_mb(engine->last_submitted_seqno, request->fence.seqno);
> > > +	engine->last_submitted_seqno = request->fence.seqno;
> > What's up with this change?
> Where we use to use an ordered store of the last seqno for checking
> idleness inside hangcheck, we now use the RCU active tracking instead.
> 
> engine->last_submitted_seqno is reduced to a heuristic used for debug
> (gpu error state) and for deciding when to grant a fresh waitboost.

Makes sense, put that to commit message and;

Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>

Regards, Joonas
-- 
Joonas Lahtinen
Open Source Technology Center
Intel Corporation
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 11/19] drm/i915: Do a nonblocking wait first in pread/pwrite
  2016-08-05  7:08   ` Joonas Lahtinen
@ 2016-08-05  7:59     ` Chris Wilson
  0 siblings, 0 replies; 33+ messages in thread
From: Chris Wilson @ 2016-08-05  7:59 UTC (permalink / raw)
  To: Joonas Lahtinen; +Cc: intel-gfx

On Fri, Aug 05, 2016 at 10:08:30AM +0300, Joonas Lahtinen wrote:
> On to, 2016-08-04 at 20:52 +0100, Chris Wilson wrote:
> > If we try and read or write to an active request, we first must wait
> > upon the GPU completing that request. Let's do that without holding the
> > mutex (and so allow someone else to access the GPU whilst we wait). Upn
> 
> STILL TYPO                                                    Upon ---^
> 
> >  	/* Bounds check destination. */
> >  	if (args->offset > obj->base.size ||
> >  	    args->size > obj->base.size - args->offset) {
> >  		ret = -EINVAL;
> > -		goto out;
> > +		goto err;
> >  	}
> >  
> > -	trace_i915_gem_object_pwrite(obj, args->offset, args->size);
> > +	ret = __unsafe_wait_rendering(obj, to_rps_client(file), false);
> > +	if (ret)
> > +		goto err;
> >  
> > +	intel_runtime_pm_get(dev_priv);
> > +
> > +	ret = i915_mutex_lock_interruptible(dev);
> > +	if (ret)
> > +		goto err_rpm;
> > +
> > +	trace_i915_gem_object_pwrite(obj, args->offset, args->size);
> 
> This trace is still moved, maybe add your reasoning to commit message.

I think the trace is not good enough. We know we entered the ioctl, we
can trace that itself. So the question is what information is valuable
at this point, and that would be which path we take (gtt, shmem, phys).

Anyway, I'll bump it back to the start and we can add a review of
tracepoints to the wishlist.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 06/19] drm/i915: Enable i915_gem_wait_for_idle() without holding struct_mutex
  2016-08-05  6:16   ` Joonas Lahtinen
  2016-08-05  6:51     ` Chris Wilson
@ 2016-08-05  8:23     ` Chris Wilson
  1 sibling, 0 replies; 33+ messages in thread
From: Chris Wilson @ 2016-08-05  8:23 UTC (permalink / raw)
  To: Joonas Lahtinen; +Cc: intel-gfx

On Fri, Aug 05, 2016 at 09:16:21AM +0300, Joonas Lahtinen wrote:
> On to, 2016-08-04 at 20:52 +0100, Chris Wilson wrote:
> > --- a/drivers/gpu/drm/i915/intel_pm.c
> > +++ b/drivers/gpu/drm/i915/intel_pm.c
> > @@ -6336,7 +6336,7 @@ bool i915_gpu_busy(void)
> >  	dev_priv = i915_mch_dev;
> >  
> >  	for_each_engine(engine, dev_priv)
> > -		ret |= !list_empty(&engine->request_list);
> > +		ret |= intel_engine_is_active(engine);
> 
> |= always makes me think of bitfields because, well -- it is bitwise
> operation :P
> 
> 		if (intel_engine_is_active(engine))
> 			ret = true;

Ah, this can just die now. And we can use dev_priv->gt.awake instead.
There's a little bit more of a lag, but ips on Ironlake is already a
sore point (and I hope this is not the straw that breaks the camel's
back).
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 33+ messages in thread

end of thread, other threads:[~2016-08-05  8:23 UTC | newest]

Thread overview: 33+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-08-04 19:52 Using RCU requests, take 2 Chris Wilson
2016-08-04 19:52 ` [PATCH 01/19] drm/i915: Introduce i915_gem_active_wait_unlocked() Chris Wilson
2016-08-04 19:52 ` [PATCH 02/19] drm/i915: Convert non-blocking waits for requests over to using RCU Chris Wilson
2016-08-05  5:38   ` Joonas Lahtinen
2016-08-04 19:52 ` [PATCH 03/19] drm/i915: Convert non-blocking userptr " Chris Wilson
2016-08-04 19:52 ` [PATCH 04/19] drm/i915/userptr: Remove superfluous interruptible=false on waiting Chris Wilson
2016-08-04 19:52 ` [PATCH 05/19] drm/i915: Remove forced stop ring on suspend/unload Chris Wilson
2016-08-05  5:46   ` Joonas Lahtinen
2016-08-04 19:52 ` [PATCH 06/19] drm/i915: Enable i915_gem_wait_for_idle() without holding struct_mutex Chris Wilson
2016-08-05  6:16   ` Joonas Lahtinen
2016-08-05  6:51     ` Chris Wilson
2016-08-05  7:31       ` Joonas Lahtinen
2016-08-05  8:23     ` Chris Wilson
2016-08-04 19:52 ` [PATCH 07/19] drm/i915: Simplify do_idling() (Ironlake vt-d w/a) Chris Wilson
2016-08-05  6:19   ` Joonas Lahtinen
2016-08-04 19:52 ` [PATCH 08/19] drm/i915/shrinker: Wait before acquiring struct_mutex under oom Chris Wilson
2016-08-05  6:24   ` Joonas Lahtinen
2016-08-04 19:52 ` [PATCH 09/19] drm/i915: Tidy generation of the GTT mmap offset Chris Wilson
2016-08-04 19:52 ` [PATCH 10/19] drm/i915: Remove unused no-shrinker-steal Chris Wilson
2016-08-04 19:52 ` [PATCH 11/19] drm/i915: Do a nonblocking wait first in pread/pwrite Chris Wilson
2016-08-05  7:08   ` Joonas Lahtinen
2016-08-05  7:59     ` Chris Wilson
2016-08-04 19:52 ` [PATCH 12/19] drm/i915: Remove (struct_mutex) locking for wait-ioctl Chris Wilson
2016-08-04 19:52 ` [PATCH 13/19] drm/i915: Remove (struct_mutex) locking for busy-ioctl Chris Wilson
2016-08-04 19:52 ` [PATCH 14/19] drm/i915: Reduce locking inside swfinish ioctl Chris Wilson
2016-08-05  6:55   ` Joonas Lahtinen
2016-08-04 19:52 ` [PATCH 15/19] drm/i915: Remove pinned check from madvise ioctl Chris Wilson
2016-08-04 19:52 ` [PATCH 16/19] drm/i915: Remove locking for get_tiling Chris Wilson
2016-08-04 19:52 ` [PATCH 17/19] drm/i915: Document and reject invalid tiling modes Chris Wilson
2016-08-05  6:44   ` Joonas Lahtinen
2016-08-04 19:52 ` [PATCH 18/19] drm/i915: Repack fence tiling mode and stride into a single integer Chris Wilson
2016-08-04 19:52 ` [PATCH 19/19] drm/i915: Assert that the request hasn't been retired Chris Wilson
2016-08-05  5:49 ` ✗ Ro.CI.BAT: failure for series starting with [01/19] drm/i915: Introduce i915_gem_active_wait_unlocked() Patchwork

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.