All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v2 00/13] Misc cleanups and locking fixes
@ 2016-01-08 11:29 Tvrtko Ursulin
  2016-01-08 11:29 ` [PATCH 01/13] drm/i915/bdw+: Replace list_del+list_add_tail with list_move_tail Tvrtko Ursulin
                   ` (14 more replies)
  0 siblings, 15 replies; 50+ messages in thread
From: Tvrtko Ursulin @ 2016-01-08 11:29 UTC (permalink / raw)
  To: Intel-gfx

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Random cleanups from yesterday mutated into a bit of locking verification and
fixing as suggested by Daniel Vetter.

Replacig list_for_each_entry with a dedicated i915_gem_obj_for_each_vma which
asserts the struct_mutex uncovered a few holes, not all of which (atomic display
are being fixed in this series).

I've also added the same assert to i915_gem_object_get_page since it sounds
correct to me to require it although I have not analysed all callsites. Some
might assume they know what they are doing so the approach might be too
drastic, but one like execlist is fixed in this series.

Tvrtko Ursulin (13):
  drm/i915/bdw+: Replace list_del+list_add_tail with list_move_tail
  drm/i915: Don't need a timer to wake us up
  drm/i915: Avoid invariant conditionals in lrc interrupt handler
  drm/i915: Fail engine initialization if LRCA is incorrectly aligned
  drm/i915: Cache LRCA in the context
  drm/i915: Only grab timestamps when needed
  drm/i915: Introduce dedicated object VMA iterator
  drm/i915: GEM operations need to be done under the big lock
  drm/i915: Remove two impossible asserts
  drm/i915: Introduce dedicated safe object VMA iterator
  drm/i915: Cache ringbuffer GTT address
  drm/i915: Add BKL asserts to get page helpers
  drm/i915: Cache LRC state page in the context

 drivers/gpu/drm/i915/i915_debugfs.c      |  23 +++---
 drivers/gpu/drm/i915/i915_drv.h          |  17 +++++
 drivers/gpu/drm/i915/i915_gem.c          |  72 ++++++++----------
 drivers/gpu/drm/i915/i915_gem_gtt.c      |   2 +-
 drivers/gpu/drm/i915/i915_gem_shrinker.c |   5 +-
 drivers/gpu/drm/i915/i915_gem_stolen.c   |   5 ++
 drivers/gpu/drm/i915/i915_gem_userptr.c  |   2 +-
 drivers/gpu/drm/i915/i915_gpu_error.c    |   4 +-
 drivers/gpu/drm/i915/intel_lrc.c         | 122 ++++++++++++++++---------------
 drivers/gpu/drm/i915/intel_lrc.h         |   3 +-
 drivers/gpu/drm/i915/intel_ringbuffer.c  |   3 +
 drivers/gpu/drm/i915/intel_ringbuffer.h  |   3 +
 12 files changed, 140 insertions(+), 121 deletions(-)

-- 
1.9.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 50+ messages in thread

* [PATCH 01/13] drm/i915/bdw+: Replace list_del+list_add_tail with list_move_tail
  2016-01-08 11:29 [PATCH v2 00/13] Misc cleanups and locking fixes Tvrtko Ursulin
@ 2016-01-08 11:29 ` Tvrtko Ursulin
  2016-01-11  8:22   ` Daniel Vetter
  2016-01-08 11:29 ` [PATCH 02/13] drm/i915: Don't need a timer to wake us up Tvrtko Ursulin
                   ` (13 subsequent siblings)
  14 siblings, 1 reply; 50+ messages in thread
From: Tvrtko Ursulin @ 2016-01-08 11:29 UTC (permalink / raw)
  To: Intel-gfx

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Same effect for slightly less source code and resulting binary.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/intel_lrc.c | 15 ++++++---------
 1 file changed, 6 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 23839ff04e27..8b6071fcd743 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -431,9 +431,8 @@ static void execlists_context_unqueue(struct intel_engine_cs *ring)
 			/* Same ctx: ignore first request, as second request
 			 * will update tail past first request's workload */
 			cursor->elsp_submitted = req0->elsp_submitted;
-			list_del(&req0->execlist_link);
-			list_add_tail(&req0->execlist_link,
-				&ring->execlist_retired_req_list);
+			list_move_tail(&req0->execlist_link,
+				       &ring->execlist_retired_req_list);
 			req0 = cursor;
 		} else {
 			req1 = cursor;
@@ -485,9 +484,8 @@ static bool execlists_check_remove_request(struct intel_engine_cs *ring,
 			     "Never submitted head request\n");
 
 			if (--head_req->elsp_submitted <= 0) {
-				list_del(&head_req->execlist_link);
-				list_add_tail(&head_req->execlist_link,
-					&ring->execlist_retired_req_list);
+				list_move_tail(&head_req->execlist_link,
+					       &ring->execlist_retired_req_list);
 				return true;
 			}
 		}
@@ -608,9 +606,8 @@ static int execlists_context_queue(struct drm_i915_gem_request *request)
 		if (request->ctx == tail_req->ctx) {
 			WARN(tail_req->elsp_submitted != 0,
 				"More than 2 already-submitted reqs queued\n");
-			list_del(&tail_req->execlist_link);
-			list_add_tail(&tail_req->execlist_link,
-				&ring->execlist_retired_req_list);
+			list_move_tail(&tail_req->execlist_link,
+				       &ring->execlist_retired_req_list);
 		}
 	}
 
-- 
1.9.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [PATCH 02/13] drm/i915: Don't need a timer to wake us up
  2016-01-08 11:29 [PATCH v2 00/13] Misc cleanups and locking fixes Tvrtko Ursulin
  2016-01-08 11:29 ` [PATCH 01/13] drm/i915/bdw+: Replace list_del+list_add_tail with list_move_tail Tvrtko Ursulin
@ 2016-01-08 11:29 ` Tvrtko Ursulin
  2016-01-11  8:26   ` Daniel Vetter
  2016-01-08 11:29 ` [PATCH 03/13] drm/i915: Avoid invariant conditionals in lrc interrupt handler Tvrtko Ursulin
                   ` (12 subsequent siblings)
  14 siblings, 1 reply; 50+ messages in thread
From: Tvrtko Ursulin @ 2016-01-08 11:29 UTC (permalink / raw)
  To: Intel-gfx

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Looks like the sleeping loop in __i915_wait_request can be
simplified by using io_schedule_timeout instead of setting
up and destroying a timer.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/i915_gem.c | 28 ++++++++--------------------
 1 file changed, 8 insertions(+), 20 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 6c60e04fc09c..de98dc41fb9f 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -1135,11 +1135,6 @@ i915_gem_check_wedge(struct i915_gpu_error *error,
 	return 0;
 }
 
-static void fake_irq(unsigned long data)
-{
-	wake_up_process((struct task_struct *)data);
-}
-
 static bool missed_irq(struct drm_i915_private *dev_priv,
 		       struct intel_engine_cs *ring)
 {
@@ -1291,7 +1286,7 @@ int __i915_wait_request(struct drm_i915_gem_request *req,
 	}
 
 	for (;;) {
-		struct timer_list timer;
+		long sched_timeout;
 
 		prepare_to_wait(&ring->irq_queue, &wait, state);
 
@@ -1321,21 +1316,14 @@ int __i915_wait_request(struct drm_i915_gem_request *req,
 			break;
 		}
 
-		timer.function = NULL;
-		if (timeout || missed_irq(dev_priv, ring)) {
-			unsigned long expire;
-
-			setup_timer_on_stack(&timer, fake_irq, (unsigned long)current);
-			expire = missed_irq(dev_priv, ring) ? jiffies + 1 : timeout_expire;
-			mod_timer(&timer, expire);
-		}
-
-		io_schedule();
+		if (timeout)
+			sched_timeout = timeout_expire - jiffies;
+		else if (missed_irq(dev_priv, ring))
+			sched_timeout = 1;
+		else
+			sched_timeout = MAX_SCHEDULE_TIMEOUT;
 
-		if (timer.function) {
-			del_singleshot_timer_sync(&timer);
-			destroy_timer_on_stack(&timer);
-		}
+		io_schedule_timeout(sched_timeout);
 	}
 	if (!irq_test_in_progress)
 		ring->irq_put(ring);
-- 
1.9.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [PATCH 03/13] drm/i915: Avoid invariant conditionals in lrc interrupt handler
  2016-01-08 11:29 [PATCH v2 00/13] Misc cleanups and locking fixes Tvrtko Ursulin
  2016-01-08 11:29 ` [PATCH 01/13] drm/i915/bdw+: Replace list_del+list_add_tail with list_move_tail Tvrtko Ursulin
  2016-01-08 11:29 ` [PATCH 02/13] drm/i915: Don't need a timer to wake us up Tvrtko Ursulin
@ 2016-01-08 11:29 ` Tvrtko Ursulin
  2016-01-11  8:29   ` Daniel Vetter
  2016-01-08 11:29 ` [PATCH 04/13] drm/i915: Fail engine initialization if LRCA is incorrectly aligned Tvrtko Ursulin
                   ` (11 subsequent siblings)
  14 siblings, 1 reply; 50+ messages in thread
From: Tvrtko Ursulin @ 2016-01-08 11:29 UTC (permalink / raw)
  To: Intel-gfx

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

There is no need to check on what Gen we are running on every
interrupt and every command submission. We can instead set up
some of that when engines are initialized, store it in the
engine structure and use it directly at runtime.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/intel_lrc.c        | 36 ++++++++++++++++++---------------
 drivers/gpu/drm/i915/intel_ringbuffer.h |  2 ++
 2 files changed, 22 insertions(+), 16 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 8b6071fcd743..84977a6e6f3f 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -298,29 +298,15 @@ uint64_t intel_lr_context_descriptor(struct intel_context *ctx,
 				     struct intel_engine_cs *ring)
 {
 	struct drm_i915_gem_object *ctx_obj = ctx->engine[ring->id].state;
-	uint64_t desc;
+	uint64_t desc = ring->ctx_desc_template;
 	uint64_t lrca = i915_gem_obj_ggtt_offset(ctx_obj) +
 			LRC_PPHWSP_PN * PAGE_SIZE;
 
 	WARN_ON(lrca & 0xFFFFFFFF00000FFFULL);
 
-	desc = GEN8_CTX_VALID;
-	desc |= GEN8_CTX_ADDRESSING_MODE(dev) << GEN8_CTX_ADDRESSING_MODE_SHIFT;
-	if (IS_GEN8(ctx_obj->base.dev))
-		desc |= GEN8_CTX_L3LLC_COHERENT;
-	desc |= GEN8_CTX_PRIVILEGE;
 	desc |= lrca;
 	desc |= (u64)intel_execlists_ctx_id(ctx_obj) << GEN8_CTX_ID_SHIFT;
 
-	/* TODO: WaDisableLiteRestore when we start using semaphore
-	 * signalling between Command Streamers */
-	/* desc |= GEN8_CTX_FORCE_RESTORE; */
-
-	/* WaEnableForceRestoreInCtxtDescForVCS:skl */
-	/* WaEnableForceRestoreInCtxtDescForVCS:bxt */
-	if (disable_lite_restore_wa(ring))
-		desc |= GEN8_CTX_FORCE_RESTORE;
-
 	return desc;
 }
 
@@ -556,7 +542,7 @@ void intel_lrc_irq_handler(struct intel_engine_cs *ring)
 		}
 	}
 
-	if (disable_lite_restore_wa(ring)) {
+	if (ring->disable_lite_restore_wa) {
 		/* Prevent a ctx to preempt itself */
 		if ((status & GEN8_CTX_STATUS_ACTIVE_IDLE) &&
 		    (submit_contexts != 0))
@@ -1980,6 +1966,24 @@ static int logical_ring_init(struct drm_device *dev, struct intel_engine_cs *rin
 		goto error;
 	}
 
+	ring->disable_lite_restore_wa = disable_lite_restore_wa(ring);
+
+	ring->ctx_desc_template = GEN8_CTX_VALID;
+	ring->ctx_desc_template |= GEN8_CTX_ADDRESSING_MODE(dev) <<
+				   GEN8_CTX_ADDRESSING_MODE_SHIFT;
+	if (IS_GEN8(dev))
+		ring->ctx_desc_template |= GEN8_CTX_L3LLC_COHERENT;
+	ring->ctx_desc_template |= GEN8_CTX_PRIVILEGE;
+
+	/* TODO: WaDisableLiteRestore when we start using semaphore
+	 * signalling between Command Streamers */
+	/* ring->ctx_desc_template |= GEN8_CTX_FORCE_RESTORE; */
+
+	/* WaEnableForceRestoreInCtxtDescForVCS:skl */
+	/* WaEnableForceRestoreInCtxtDescForVCS:bxt */
+	if (ring->disable_lite_restore_wa)
+		ring->ctx_desc_template |= GEN8_CTX_FORCE_RESTORE;
+
 	return 0;
 
 error:
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index 49574ffe54bc..0b91a4b77359 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -268,6 +268,8 @@ struct  intel_engine_cs {
 	struct list_head execlist_queue;
 	struct list_head execlist_retired_req_list;
 	u8 next_context_status_buffer;
+	bool disable_lite_restore_wa;
+	u32 ctx_desc_template;
 	u32             irq_keep_mask; /* bitmask for interrupts that should not be masked */
 	int		(*emit_request)(struct drm_i915_gem_request *request);
 	int		(*emit_flush)(struct drm_i915_gem_request *request,
-- 
1.9.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [PATCH 04/13] drm/i915: Fail engine initialization if LRCA is incorrectly aligned
  2016-01-08 11:29 [PATCH v2 00/13] Misc cleanups and locking fixes Tvrtko Ursulin
                   ` (2 preceding siblings ...)
  2016-01-08 11:29 ` [PATCH 03/13] drm/i915: Avoid invariant conditionals in lrc interrupt handler Tvrtko Ursulin
@ 2016-01-08 11:29 ` Tvrtko Ursulin
  2016-01-11  8:31   ` Daniel Vetter
  2016-01-08 11:29 ` [PATCH 05/13] drm/i915: Cache LRCA in the context Tvrtko Ursulin
                   ` (10 subsequent siblings)
  14 siblings, 1 reply; 50+ messages in thread
From: Tvrtko Ursulin @ 2016-01-08 11:29 UTC (permalink / raw)
  To: Intel-gfx

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

LRCA can change only when it goes from unpinned to pinned so it
makes sense to check its alignment at that point rather than at
every batch buffer submission.

Furthermore, if we check it at pin time we can actually
gracefuly fail the engine initialization rather than just
spamming the logs at runtime with WARNs.

v2: Return ENODEV for bad alignment. (Chris Wilson)

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/intel_lrc.c | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 84977a6e6f3f..ff146a15d395 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -302,8 +302,6 @@ uint64_t intel_lr_context_descriptor(struct intel_context *ctx,
 	uint64_t lrca = i915_gem_obj_ggtt_offset(ctx_obj) +
 			LRC_PPHWSP_PN * PAGE_SIZE;
 
-	WARN_ON(lrca & 0xFFFFFFFF00000FFFULL);
-
 	desc |= lrca;
 	desc |= (u64)intel_execlists_ctx_id(ctx_obj) << GEN8_CTX_ID_SHIFT;
 
@@ -1030,6 +1028,7 @@ static int intel_lr_context_do_pin(struct intel_engine_cs *ring,
 {
 	struct drm_device *dev = ring->dev;
 	struct drm_i915_private *dev_priv = dev->dev_private;
+	u64 lrca;
 	int ret = 0;
 
 	WARN_ON(!mutex_is_locked(&ring->dev->struct_mutex));
@@ -1038,6 +1037,12 @@ static int intel_lr_context_do_pin(struct intel_engine_cs *ring,
 	if (ret)
 		return ret;
 
+	lrca = i915_gem_obj_ggtt_offset(ctx_obj) + LRC_PPHWSP_PN * PAGE_SIZE;
+	if (WARN_ON(lrca & 0xFFFFFFFF00000FFFULL)) {
+		ret = -ENODEV;
+		goto unpin_ctx_obj;
+	}
+
 	ret = intel_pin_and_map_ringbuffer_obj(ring->dev, ringbuf);
 	if (ret)
 		goto unpin_ctx_obj;
-- 
1.9.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [PATCH 05/13] drm/i915: Cache LRCA in the context
  2016-01-08 11:29 [PATCH v2 00/13] Misc cleanups and locking fixes Tvrtko Ursulin
                   ` (3 preceding siblings ...)
  2016-01-08 11:29 ` [PATCH 04/13] drm/i915: Fail engine initialization if LRCA is incorrectly aligned Tvrtko Ursulin
@ 2016-01-08 11:29 ` Tvrtko Ursulin
  2016-01-11  8:38   ` Daniel Vetter
  2016-01-08 11:29 ` [PATCH 06/13] drm/i915: Only grab timestamps when needed Tvrtko Ursulin
                   ` (9 subsequent siblings)
  14 siblings, 1 reply; 50+ messages in thread
From: Tvrtko Ursulin @ 2016-01-08 11:29 UTC (permalink / raw)
  To: Intel-gfx

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

We are not allowed to call i915_gem_obj_ggtt_offset from irq
context without the big kernel lock held.

LRCA lifetime is well defined so cache it so it can be looked up
cheaply from the interrupt context and at command submission
time.

v2: Added irq context reasoning to the commit message. (Daniel Vetter)

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/i915_debugfs.c | 15 ++++++--------
 drivers/gpu/drm/i915/i915_drv.h     |  1 +
 drivers/gpu/drm/i915/intel_lrc.c    | 40 ++++++++++++++++---------------------
 drivers/gpu/drm/i915/intel_lrc.h    |  3 ++-
 4 files changed, 26 insertions(+), 33 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index 3b05bd189eab..714a45cf8a51 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -1976,12 +1976,13 @@ static int i915_context_status(struct seq_file *m, void *unused)
 }
 
 static void i915_dump_lrc_obj(struct seq_file *m,
-			      struct intel_engine_cs *ring,
-			      struct drm_i915_gem_object *ctx_obj)
+			      struct intel_context *ctx,
+			      struct intel_engine_cs *ring)
 {
 	struct page *page;
 	uint32_t *reg_state;
 	int j;
+	struct drm_i915_gem_object *ctx_obj = ctx->engine[ring->id].state;
 	unsigned long ggtt_offset = 0;
 
 	if (ctx_obj == NULL) {
@@ -1991,7 +1992,7 @@ static void i915_dump_lrc_obj(struct seq_file *m,
 	}
 
 	seq_printf(m, "CONTEXT: %s %u\n", ring->name,
-		   intel_execlists_ctx_id(ctx_obj));
+		   intel_execlists_ctx_id(ctx, ring));
 
 	if (!i915_gem_obj_ggtt_bound(ctx_obj))
 		seq_puts(m, "\tNot bound in GGTT\n");
@@ -2040,8 +2041,7 @@ static int i915_dump_lrc(struct seq_file *m, void *unused)
 	list_for_each_entry(ctx, &dev_priv->context_list, link) {
 		for_each_ring(ring, dev_priv, i) {
 			if (ring->default_context != ctx)
-				i915_dump_lrc_obj(m, ring,
-						  ctx->engine[i].state);
+				i915_dump_lrc_obj(m, ctx, ring);
 		}
 	}
 
@@ -2115,11 +2115,8 @@ static int i915_execlists(struct seq_file *m, void *data)
 
 		seq_printf(m, "\t%d requests in queue\n", count);
 		if (head_req) {
-			struct drm_i915_gem_object *ctx_obj;
-
-			ctx_obj = head_req->ctx->engine[ring_id].state;
 			seq_printf(m, "\tHead request id: %u\n",
-				   intel_execlists_ctx_id(ctx_obj));
+				   intel_execlists_ctx_id(head_req->ctx, ring));
 			seq_printf(m, "\tHead request tail: %u\n",
 				   head_req->tail);
 		}
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 8cf655c6fc03..b77a5d84eac2 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -881,6 +881,7 @@ struct intel_context {
 		struct drm_i915_gem_object *state;
 		struct intel_ringbuffer *ringbuf;
 		int pin_count;
+		u32 lrca;
 	} engine[I915_NUM_RINGS];
 
 	struct list_head link;
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index ff146a15d395..ffe004de22b0 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -265,7 +265,8 @@ int intel_sanitize_enable_execlists(struct drm_device *dev, int enable_execlists
 
 /**
  * intel_execlists_ctx_id() - get the Execlists Context ID
- * @ctx_obj: Logical Ring Context backing object.
+ * @ctx: Context to get the ID for
+ * @ring: Engine to get the ID for
  *
  * Do not confuse with ctx->id! Unfortunately we have a name overload
  * here: the old context ID we pass to userspace as a handler so that
@@ -275,14 +276,12 @@ int intel_sanitize_enable_execlists(struct drm_device *dev, int enable_execlists
  *
  * Return: 20-bits globally unique context ID.
  */
-u32 intel_execlists_ctx_id(struct drm_i915_gem_object *ctx_obj)
+u32 intel_execlists_ctx_id(struct intel_context *ctx,
+			   struct intel_engine_cs *ring)
 {
-	u32 lrca = i915_gem_obj_ggtt_offset(ctx_obj) +
-			LRC_PPHWSP_PN * PAGE_SIZE;
-
 	/* LRCA is required to be 4K aligned so the more significant 20 bits
 	 * are globally unique */
-	return lrca >> 12;
+	return ctx->engine[ring->id].lrca >> 12;
 }
 
 static bool disable_lite_restore_wa(struct intel_engine_cs *ring)
@@ -297,13 +296,11 @@ static bool disable_lite_restore_wa(struct intel_engine_cs *ring)
 uint64_t intel_lr_context_descriptor(struct intel_context *ctx,
 				     struct intel_engine_cs *ring)
 {
-	struct drm_i915_gem_object *ctx_obj = ctx->engine[ring->id].state;
+	uint64_t lrca = ctx->engine[ring->id].lrca;
 	uint64_t desc = ring->ctx_desc_template;
-	uint64_t lrca = i915_gem_obj_ggtt_offset(ctx_obj) +
-			LRC_PPHWSP_PN * PAGE_SIZE;
 
 	desc |= lrca;
-	desc |= (u64)intel_execlists_ctx_id(ctx_obj) << GEN8_CTX_ID_SHIFT;
+	desc |= (u64)intel_execlists_ctx_id(ctx, ring) << GEN8_CTX_ID_SHIFT;
 
 	return desc;
 }
@@ -461,9 +458,7 @@ static bool execlists_check_remove_request(struct intel_engine_cs *ring,
 					    execlist_link);
 
 	if (head_req != NULL) {
-		struct drm_i915_gem_object *ctx_obj =
-				head_req->ctx->engine[ring->id].state;
-		if (intel_execlists_ctx_id(ctx_obj) == request_id) {
+		if (intel_execlists_ctx_id(head_req->ctx, ring) == request_id) {
 			WARN(head_req->elsp_submitted == 0,
 			     "Never submitted head request\n");
 
@@ -1023,15 +1018,17 @@ int logical_ring_flush_all_caches(struct drm_i915_gem_request *req)
 }
 
 static int intel_lr_context_do_pin(struct intel_engine_cs *ring,
-		struct drm_i915_gem_object *ctx_obj,
-		struct intel_ringbuffer *ringbuf)
+				   struct intel_context *ctx)
 {
 	struct drm_device *dev = ring->dev;
 	struct drm_i915_private *dev_priv = dev->dev_private;
+	struct drm_i915_gem_object *ctx_obj = ctx->engine[ring->id].state;
+	struct intel_ringbuffer *ringbuf = ctx->engine[ring->id].ringbuf;
 	u64 lrca;
-	int ret = 0;
+	int ret;
 
 	WARN_ON(!mutex_is_locked(&ring->dev->struct_mutex));
+
 	ret = i915_gem_obj_ggtt_pin(ctx_obj, GEN8_LR_CONTEXT_ALIGN,
 			PIN_OFFSET_BIAS | GUC_WOPCM_TOP);
 	if (ret)
@@ -1047,6 +1044,7 @@ static int intel_lr_context_do_pin(struct intel_engine_cs *ring,
 	if (ret)
 		goto unpin_ctx_obj;
 
+	ctx->engine[ring->id].lrca = lrca;
 	ctx_obj->dirty = true;
 
 	/* Invalidate GuC TLB. */
@@ -1065,11 +1063,9 @@ static int intel_lr_context_pin(struct drm_i915_gem_request *rq)
 {
 	int ret = 0;
 	struct intel_engine_cs *ring = rq->ring;
-	struct drm_i915_gem_object *ctx_obj = rq->ctx->engine[ring->id].state;
-	struct intel_ringbuffer *ringbuf = rq->ringbuf;
 
 	if (rq->ctx->engine[ring->id].pin_count++ == 0) {
-		ret = intel_lr_context_do_pin(ring, ctx_obj, ringbuf);
+		ret = intel_lr_context_do_pin(ring, rq->ctx);
 		if (ret)
 			goto reset_pin_count;
 	}
@@ -1091,6 +1087,7 @@ void intel_lr_context_unpin(struct drm_i915_gem_request *rq)
 		if (--rq->ctx->engine[ring->id].pin_count == 0) {
 			intel_unpin_ringbuffer_obj(ringbuf);
 			i915_gem_object_ggtt_unpin(ctx_obj);
+			rq->ctx->engine[ring->id].lrca = 0;
 		}
 	}
 }
@@ -1960,10 +1957,7 @@ static int logical_ring_init(struct drm_device *dev, struct intel_engine_cs *rin
 		goto error;
 
 	/* As this is the default context, always pin it */
-	ret = intel_lr_context_do_pin(
-			ring,
-			ring->default_context->engine[ring->id].state,
-			ring->default_context->engine[ring->id].ringbuf);
+	ret = intel_lr_context_do_pin(ring, ring->default_context);
 	if (ret) {
 		DRM_ERROR(
 			"Failed to pin and map ringbuffer %s: %d\n",
diff --git a/drivers/gpu/drm/i915/intel_lrc.h b/drivers/gpu/drm/i915/intel_lrc.h
index de41ad6cd63d..f9db5f187d77 100644
--- a/drivers/gpu/drm/i915/intel_lrc.h
+++ b/drivers/gpu/drm/i915/intel_lrc.h
@@ -106,6 +106,8 @@ void intel_lr_context_reset(struct drm_device *dev,
 			struct intel_context *ctx);
 uint64_t intel_lr_context_descriptor(struct intel_context *ctx,
 				     struct intel_engine_cs *ring);
+u32 intel_execlists_ctx_id(struct intel_context *ctx,
+			   struct intel_engine_cs *ring);
 
 /* Execlists */
 int intel_sanitize_enable_execlists(struct drm_device *dev, int enable_execlists);
@@ -113,7 +115,6 @@ struct i915_execbuffer_params;
 int intel_execlists_submission(struct i915_execbuffer_params *params,
 			       struct drm_i915_gem_execbuffer2 *args,
 			       struct list_head *vmas);
-u32 intel_execlists_ctx_id(struct drm_i915_gem_object *ctx_obj);
 
 void intel_lrc_irq_handler(struct intel_engine_cs *ring);
 void intel_execlists_retire_requests(struct intel_engine_cs *ring);
-- 
1.9.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [PATCH 06/13] drm/i915: Only grab timestamps when needed
  2016-01-08 11:29 [PATCH v2 00/13] Misc cleanups and locking fixes Tvrtko Ursulin
                   ` (4 preceding siblings ...)
  2016-01-08 11:29 ` [PATCH 05/13] drm/i915: Cache LRCA in the context Tvrtko Ursulin
@ 2016-01-08 11:29 ` Tvrtko Ursulin
  2016-01-11  8:42   ` Daniel Vetter
  2016-01-08 11:29 ` [PATCH 07/13] drm/i915: Introduce dedicated object VMA iterator Tvrtko Ursulin
                   ` (8 subsequent siblings)
  14 siblings, 1 reply; 50+ messages in thread
From: Tvrtko Ursulin @ 2016-01-08 11:29 UTC (permalink / raw)
  To: Intel-gfx

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

No need to call ktime_get_raw_ns twice per unlimited wait and can
also elimate a local variable.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/i915_gem.c | 12 +++++++-----
 1 file changed, 7 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index de98dc41fb9f..c4f69579eb7a 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -1246,7 +1246,7 @@ int __i915_wait_request(struct drm_i915_gem_request *req,
 	int state = interruptible ? TASK_INTERRUPTIBLE : TASK_UNINTERRUPTIBLE;
 	DEFINE_WAIT(wait);
 	unsigned long timeout_expire;
-	s64 before, now;
+	s64 before = 0;
 	int ret;
 
 	WARN(!intel_irqs_enabled(dev_priv), "IRQs disabled");
@@ -1266,14 +1266,17 @@ int __i915_wait_request(struct drm_i915_gem_request *req,
 			return -ETIME;
 
 		timeout_expire = jiffies + nsecs_to_jiffies_timeout(*timeout);
+
+		/*
+		 * Record current time in case interrupted by signal, or wedged.
+		 */
+		before = ktime_get_raw_ns();
 	}
 
 	if (INTEL_INFO(dev_priv)->gen >= 6)
 		gen6_rps_boost(dev_priv, rps, req->emitted_jiffies);
 
-	/* Record current time in case interrupted by signal, or wedged */
 	trace_i915_gem_request_wait_begin(req);
-	before = ktime_get_raw_ns();
 
 	/* Optimistic spin for the next jiffie before touching IRQs */
 	ret = __i915_spin_request(req, state);
@@ -1331,11 +1334,10 @@ int __i915_wait_request(struct drm_i915_gem_request *req,
 	finish_wait(&ring->irq_queue, &wait);
 
 out:
-	now = ktime_get_raw_ns();
 	trace_i915_gem_request_wait_end(req);
 
 	if (timeout) {
-		s64 tres = *timeout - (now - before);
+		s64 tres = *timeout - (ktime_get_raw_ns() - before);
 
 		*timeout = tres < 0 ? 0 : tres;
 
-- 
1.9.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [PATCH 07/13] drm/i915: Introduce dedicated object VMA iterator
  2016-01-08 11:29 [PATCH v2 00/13] Misc cleanups and locking fixes Tvrtko Ursulin
                   ` (5 preceding siblings ...)
  2016-01-08 11:29 ` [PATCH 06/13] drm/i915: Only grab timestamps when needed Tvrtko Ursulin
@ 2016-01-08 11:29 ` Tvrtko Ursulin
  2016-01-08 11:44   ` Chris Wilson
  2016-01-08 13:29   ` Tvrtko Ursulin
  2016-01-08 11:29 ` [PATCH 08/13] drm/i915: GEM operations need to be done under the big lock Tvrtko Ursulin
                   ` (7 subsequent siblings)
  14 siblings, 2 replies; 50+ messages in thread
From: Tvrtko Ursulin @ 2016-01-08 11:29 UTC (permalink / raw)
  To: Intel-gfx; +Cc: Daniel Vetter

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Purpose is to catch places which iterate the object VMA list
without holding the big lock.

Implemented by open coding list_for_each_entry to make the
macro compatible with existing call sites.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
---
 drivers/gpu/drm/i915/i915_debugfs.c      |  8 ++++----
 drivers/gpu/drm/i915/i915_drv.h          |  6 ++++++
 drivers/gpu/drm/i915/i915_gem.c          | 24 ++++++++++++------------
 drivers/gpu/drm/i915/i915_gem_gtt.c      |  2 +-
 drivers/gpu/drm/i915/i915_gem_shrinker.c |  2 +-
 drivers/gpu/drm/i915/i915_gpu_error.c    |  4 ++--
 6 files changed, 26 insertions(+), 20 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index 714a45cf8a51..d7c2a3201161 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -117,7 +117,7 @@ static u64 i915_gem_obj_total_ggtt_size(struct drm_i915_gem_object *obj)
 	u64 size = 0;
 	struct i915_vma *vma;
 
-	list_for_each_entry(vma, &obj->vma_list, vma_link) {
+	i915_gem_obj_for_each_vma(vma, obj) {
 		if (i915_is_ggtt(vma->vm) &&
 		    drm_mm_node_allocated(&vma->node))
 			size += vma->node.size;
@@ -155,7 +155,7 @@ describe_obj(struct seq_file *m, struct drm_i915_gem_object *obj)
 		   obj->madv == I915_MADV_DONTNEED ? " purgeable" : "");
 	if (obj->base.name)
 		seq_printf(m, " (name: %d)", obj->base.name);
-	list_for_each_entry(vma, &obj->vma_list, vma_link) {
+	i915_gem_obj_for_each_vma(vma, obj) {
 		if (vma->pin_count > 0)
 			pin_count++;
 	}
@@ -164,7 +164,7 @@ describe_obj(struct seq_file *m, struct drm_i915_gem_object *obj)
 		seq_printf(m, " (display)");
 	if (obj->fence_reg != I915_FENCE_REG_NONE)
 		seq_printf(m, " (fence: %d)", obj->fence_reg);
-	list_for_each_entry(vma, &obj->vma_list, vma_link) {
+	i915_gem_obj_for_each_vma(vma, obj) {
 		seq_printf(m, " (%sgtt offset: %08llx, size: %08llx",
 			   i915_is_ggtt(vma->vm) ? "g" : "pp",
 			   vma->node.start, vma->node.size);
@@ -342,7 +342,7 @@ static int per_file_stats(int id, void *ptr, void *data)
 		stats->shared += obj->base.size;
 
 	if (USES_FULL_PPGTT(obj->base.dev)) {
-		list_for_each_entry(vma, &obj->vma_list, vma_link) {
+		i915_gem_obj_for_each_vma(vma, obj) {
 			struct i915_hw_ppgtt *ppgtt;
 
 			if (!drm_mm_node_allocated(&vma->node))
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index b77a5d84eac2..0406a020dfcc 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2852,6 +2852,12 @@ struct drm_i915_gem_object *i915_gem_object_create_from_data(
 void i915_gem_free_object(struct drm_gem_object *obj);
 void i915_gem_vma_destroy(struct i915_vma *vma);
 
+#define i915_gem_obj_for_each_vma(vma, obj) \
+	for (WARN_ON_ONCE(!mutex_is_locked(&(obj)->base.dev->struct_mutex)), \
+	     vma = list_first_entry(&(obj)->vma_list, typeof(*vma), vma_link);\
+	     &vma->vma_link != (&(obj)->vma_list); \
+	     vma = list_next_entry(vma, vma_link))
+
 /* Flags used by pin/bind&friends. */
 #define PIN_MAPPABLE	(1<<0)
 #define PIN_NONBLOCK	(1<<1)
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index c4f69579eb7a..415bb5ef8b3a 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2442,7 +2442,7 @@ i915_gem_object_retire__read(struct drm_i915_gem_object *obj, int ring)
 	list_move_tail(&obj->global_list,
 		       &to_i915(obj->base.dev)->mm.bound_list);
 
-	list_for_each_entry(vma, &obj->vma_list, vma_link) {
+	i915_gem_obj_for_each_vma(vma, obj) {
 		if (!list_empty(&vma->mm_list))
 			list_move_tail(&vma->mm_list, &vma->vm->inactive_list);
 	}
@@ -3834,7 +3834,7 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj,
 			 */
 		}
 
-		list_for_each_entry(vma, &obj->vma_list, vma_link) {
+		i915_gem_obj_for_each_vma(vma, obj) {
 			if (!drm_mm_node_allocated(&vma->node))
 				continue;
 
@@ -3844,7 +3844,7 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj,
 		}
 	}
 
-	list_for_each_entry(vma, &obj->vma_list, vma_link)
+	i915_gem_obj_for_each_vma(vma, obj)
 		vma->node.color = cache_level;
 	obj->cache_level = cache_level;
 
@@ -4564,7 +4564,7 @@ struct i915_vma *i915_gem_obj_to_vma(struct drm_i915_gem_object *obj,
 				     struct i915_address_space *vm)
 {
 	struct i915_vma *vma;
-	list_for_each_entry(vma, &obj->vma_list, vma_link) {
+	i915_gem_obj_for_each_vma(vma, obj) {
 		if (vma->ggtt_view.type == I915_GGTT_VIEW_NORMAL &&
 		    vma->vm == vm)
 			return vma;
@@ -4581,7 +4581,7 @@ struct i915_vma *i915_gem_obj_to_ggtt_view(struct drm_i915_gem_object *obj,
 	if (WARN_ONCE(!view, "no view specified"))
 		return ERR_PTR(-EINVAL);
 
-	list_for_each_entry(vma, &obj->vma_list, vma_link)
+	i915_gem_obj_for_each_vma(vma, obj)
 		if (vma->vm == ggtt &&
 		    i915_ggtt_view_equal(&vma->ggtt_view, view))
 			return vma;
@@ -5144,7 +5144,7 @@ u64 i915_gem_obj_offset(struct drm_i915_gem_object *o,
 
 	WARN_ON(vm == &dev_priv->mm.aliasing_ppgtt->base);
 
-	list_for_each_entry(vma, &o->vma_list, vma_link) {
+	i915_gem_obj_for_each_vma(vma, o) {
 		if (i915_is_ggtt(vma->vm) &&
 		    vma->ggtt_view.type != I915_GGTT_VIEW_NORMAL)
 			continue;
@@ -5163,7 +5163,7 @@ u64 i915_gem_obj_ggtt_offset_view(struct drm_i915_gem_object *o,
 	struct i915_address_space *ggtt = i915_obj_to_ggtt(o);
 	struct i915_vma *vma;
 
-	list_for_each_entry(vma, &o->vma_list, vma_link)
+	i915_gem_obj_for_each_vma(vma, o)
 		if (vma->vm == ggtt &&
 		    i915_ggtt_view_equal(&vma->ggtt_view, view))
 			return vma->node.start;
@@ -5177,7 +5177,7 @@ bool i915_gem_obj_bound(struct drm_i915_gem_object *o,
 {
 	struct i915_vma *vma;
 
-	list_for_each_entry(vma, &o->vma_list, vma_link) {
+	i915_gem_obj_for_each_vma(vma, o) {
 		if (i915_is_ggtt(vma->vm) &&
 		    vma->ggtt_view.type != I915_GGTT_VIEW_NORMAL)
 			continue;
@@ -5194,7 +5194,7 @@ bool i915_gem_obj_ggtt_bound_view(struct drm_i915_gem_object *o,
 	struct i915_address_space *ggtt = i915_obj_to_ggtt(o);
 	struct i915_vma *vma;
 
-	list_for_each_entry(vma, &o->vma_list, vma_link)
+	i915_gem_obj_for_each_vma(vma, o)
 		if (vma->vm == ggtt &&
 		    i915_ggtt_view_equal(&vma->ggtt_view, view) &&
 		    drm_mm_node_allocated(&vma->node))
@@ -5207,7 +5207,7 @@ bool i915_gem_obj_bound_any(struct drm_i915_gem_object *o)
 {
 	struct i915_vma *vma;
 
-	list_for_each_entry(vma, &o->vma_list, vma_link)
+	i915_gem_obj_for_each_vma(vma, o)
 		if (drm_mm_node_allocated(&vma->node))
 			return true;
 
@@ -5224,7 +5224,7 @@ unsigned long i915_gem_obj_size(struct drm_i915_gem_object *o,
 
 	BUG_ON(list_empty(&o->vma_list));
 
-	list_for_each_entry(vma, &o->vma_list, vma_link) {
+	i915_gem_obj_for_each_vma(vma, o) {
 		if (i915_is_ggtt(vma->vm) &&
 		    vma->ggtt_view.type != I915_GGTT_VIEW_NORMAL)
 			continue;
@@ -5237,7 +5237,7 @@ unsigned long i915_gem_obj_size(struct drm_i915_gem_object *o,
 bool i915_gem_obj_is_pinned(struct drm_i915_gem_object *obj)
 {
 	struct i915_vma *vma;
-	list_for_each_entry(vma, &obj->vma_list, vma_link)
+	i915_gem_obj_for_each_vma(vma, obj)
 		if (vma->pin_count > 0)
 			return true;
 
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 56f4f2e58d53..40fe2bf6bd91 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -3220,7 +3220,7 @@ void i915_gem_restore_gtt_mappings(struct drm_device *dev)
 	vm = &dev_priv->gtt.base;
 	list_for_each_entry(obj, &dev_priv->mm.bound_list, global_list) {
 		flush = false;
-		list_for_each_entry(vma, &obj->vma_list, vma_link) {
+		i915_gem_obj_for_each_vma(vma, obj) {
 			if (vma->vm != vm)
 				continue;
 
diff --git a/drivers/gpu/drm/i915/i915_gem_shrinker.c b/drivers/gpu/drm/i915/i915_gem_shrinker.c
index 16da9c1422cc..4106666a4303 100644
--- a/drivers/gpu/drm/i915/i915_gem_shrinker.c
+++ b/drivers/gpu/drm/i915/i915_gem_shrinker.c
@@ -52,7 +52,7 @@ static int num_vma_bound(struct drm_i915_gem_object *obj)
 	struct i915_vma *vma;
 	int count = 0;
 
-	list_for_each_entry(vma, &obj->vma_list, vma_link) {
+	i915_gem_obj_for_each_vma(vma, obj) {
 		if (drm_mm_node_allocated(&vma->node))
 			count++;
 		if (vma->pin_count)
diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
index 06ca4082735b..8b7a8c4b9b37 100644
--- a/drivers/gpu/drm/i915/i915_gpu_error.c
+++ b/drivers/gpu/drm/i915/i915_gpu_error.c
@@ -755,7 +755,7 @@ static u32 capture_pinned_bo(struct drm_i915_error_buffer *err,
 		if (err == last)
 			break;
 
-		list_for_each_entry(vma, &obj->vma_list, vma_link)
+		i915_gem_obj_for_each_vma(vma, obj)
 			if (vma->vm == vm && vma->pin_count > 0)
 				capture_bo(err++, vma);
 	}
@@ -1128,7 +1128,7 @@ static void i915_gem_capture_vm(struct drm_i915_private *dev_priv,
 	error->active_bo_count[ndx] = i;
 
 	list_for_each_entry(obj, &dev_priv->mm.bound_list, global_list) {
-		list_for_each_entry(vma, &obj->vma_list, vma_link)
+		i915_gem_obj_for_each_vma(vma, obj)
 			if (vma->vm == vm && vma->pin_count > 0)
 				i++;
 	}
-- 
1.9.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [PATCH 08/13] drm/i915: GEM operations need to be done under the big lock
  2016-01-08 11:29 [PATCH v2 00/13] Misc cleanups and locking fixes Tvrtko Ursulin
                   ` (6 preceding siblings ...)
  2016-01-08 11:29 ` [PATCH 07/13] drm/i915: Introduce dedicated object VMA iterator Tvrtko Ursulin
@ 2016-01-08 11:29 ` Tvrtko Ursulin
  2016-01-08 11:40   ` Chris Wilson
                     ` (2 more replies)
  2016-01-08 11:29 ` [PATCH 09/13] drm/i915: Remove two impossible asserts Tvrtko Ursulin
                   ` (6 subsequent siblings)
  14 siblings, 3 replies; 50+ messages in thread
From: Tvrtko Ursulin @ 2016-01-08 11:29 UTC (permalink / raw)
  To: Intel-gfx

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

VMA creation and GEM list management need the big lock.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/i915_gem_stolen.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_gem_stolen.c b/drivers/gpu/drm/i915/i915_gem_stolen.c
index c384dc9c8a63..16cab255fa6c 100644
--- a/drivers/gpu/drm/i915/i915_gem_stolen.c
+++ b/drivers/gpu/drm/i915/i915_gem_stolen.c
@@ -670,9 +670,12 @@ i915_gem_object_create_stolen_for_preallocated(struct drm_device *dev,
 	if (gtt_offset == I915_GTT_OFFSET_NONE)
 		return obj;
 
+	mutex_lock(&dev->struct_mutex);
+
 	vma = i915_gem_obj_lookup_or_create_vma(obj, ggtt);
 	if (IS_ERR(vma)) {
 		ret = PTR_ERR(vma);
+		mutex_unlock(&dev->struct_mutex);
 		goto err;
 	}
 
@@ -698,6 +701,8 @@ i915_gem_object_create_stolen_for_preallocated(struct drm_device *dev,
 	list_add_tail(&obj->global_list, &dev_priv->mm.bound_list);
 	i915_gem_object_pin_pages(obj);
 
+	mutex_unlock(&dev->struct_mutex);
+
 	return obj;
 
 err:
-- 
1.9.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [PATCH 09/13] drm/i915: Remove two impossible asserts
  2016-01-08 11:29 [PATCH v2 00/13] Misc cleanups and locking fixes Tvrtko Ursulin
                   ` (7 preceding siblings ...)
  2016-01-08 11:29 ` [PATCH 08/13] drm/i915: GEM operations need to be done under the big lock Tvrtko Ursulin
@ 2016-01-08 11:29 ` Tvrtko Ursulin
  2016-01-11  8:49   ` Daniel Vetter
  2016-01-08 11:29 ` [PATCH 10/13] drm/i915: Introduce dedicated safe object VMA iterator Tvrtko Ursulin
                   ` (5 subsequent siblings)
  14 siblings, 1 reply; 50+ messages in thread
From: Tvrtko Ursulin @ 2016-01-08 11:29 UTC (permalink / raw)
  To: Intel-gfx

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Engine initialization would have failed if those two weren't
pinned and calling i915_gem_obj_is_pinned is illegal from irq
context without the big lock held.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/intel_lrc.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index ffe004de22b0..5b3795815d8e 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -350,8 +350,6 @@ static int execlists_update_context(struct drm_i915_gem_request *rq)
 	uint32_t *reg_state;
 
 	BUG_ON(!ctx_obj);
-	WARN_ON(!i915_gem_obj_is_pinned(ctx_obj));
-	WARN_ON(!i915_gem_obj_is_pinned(rb_obj));
 
 	page = i915_gem_object_get_dirty_page(ctx_obj, LRC_STATE_PN);
 	reg_state = kmap_atomic(page);
-- 
1.9.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [PATCH 10/13] drm/i915: Introduce dedicated safe object VMA iterator
  2016-01-08 11:29 [PATCH v2 00/13] Misc cleanups and locking fixes Tvrtko Ursulin
                   ` (8 preceding siblings ...)
  2016-01-08 11:29 ` [PATCH 09/13] drm/i915: Remove two impossible asserts Tvrtko Ursulin
@ 2016-01-08 11:29 ` Tvrtko Ursulin
  2016-01-08 11:29 ` [PATCH 11/13] drm/i915: Cache ringbuffer GTT address Tvrtko Ursulin
                   ` (4 subsequent siblings)
  14 siblings, 0 replies; 50+ messages in thread
From: Tvrtko Ursulin @ 2016-01-08 11:29 UTC (permalink / raw)
  To: Intel-gfx; +Cc: Daniel Vetter

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Purpose is to catch places which iterate the object VMA list
without holding the big lock.

Implemented by open coding list_for_each_entry_safe to make
the macro compatible with existing call sites.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
---
 drivers/gpu/drm/i915/i915_drv.h          | 7 +++++++
 drivers/gpu/drm/i915/i915_gem.c          | 6 +++---
 drivers/gpu/drm/i915/i915_gem_shrinker.c | 3 +--
 drivers/gpu/drm/i915/i915_gem_userptr.c  | 2 +-
 4 files changed, 12 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 0406a020dfcc..81a83424fa13 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2858,6 +2858,13 @@ void i915_gem_vma_destroy(struct i915_vma *vma);
 	     &vma->vma_link != (&(obj)->vma_list); \
 	     vma = list_next_entry(vma, vma_link))
 
+#define i915_gem_obj_for_each_vma_safe(vma, next, obj) \
+	for (WARN_ON_ONCE(!mutex_is_locked(&(obj)->base.dev->struct_mutex)), \
+	     vma = list_first_entry(&(obj)->vma_list, typeof(*vma), vma_link), \
+	     next = list_next_entry(vma, vma_link); \
+	     &vma->vma_link != (&(obj)->vma_list); \
+	     vma = next, next = list_next_entry(next, vma_link))
+
 /* Flags used by pin/bind&friends. */
 #define PIN_MAPPABLE	(1<<0)
 #define PIN_NONBLOCK	(1<<1)
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 415bb5ef8b3a..5a6daeaef680 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -272,7 +272,7 @@ drop_pages(struct drm_i915_gem_object *obj)
 	int ret;
 
 	drm_gem_object_reference(&obj->base);
-	list_for_each_entry_safe(vma, next, &obj->vma_list, vma_link)
+	i915_gem_obj_for_each_vma_safe(vma, next, obj)
 		if (i915_vma_unbind(vma))
 			break;
 
@@ -3771,7 +3771,7 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj,
 	 * catch the issue of the CS prefetch crossing page boundaries and
 	 * reading an invalid PTE on older architectures.
 	 */
-	list_for_each_entry_safe(vma, next, &obj->vma_list, vma_link) {
+	i915_gem_obj_for_each_vma_safe(vma, next, obj) {
 		if (!drm_mm_node_allocated(&vma->node))
 			continue;
 
@@ -4507,7 +4507,7 @@ void i915_gem_free_object(struct drm_gem_object *gem_obj)
 
 	trace_i915_gem_object_destroy(obj);
 
-	list_for_each_entry_safe(vma, next, &obj->vma_list, vma_link) {
+	i915_gem_obj_for_each_vma_safe(vma, next, obj) {
 		int ret;
 
 		vma->pin_count = 0;
diff --git a/drivers/gpu/drm/i915/i915_gem_shrinker.c b/drivers/gpu/drm/i915/i915_gem_shrinker.c
index 4106666a4303..3cf83f4eb7c9 100644
--- a/drivers/gpu/drm/i915/i915_gem_shrinker.c
+++ b/drivers/gpu/drm/i915/i915_gem_shrinker.c
@@ -175,8 +175,7 @@ i915_gem_shrink(struct drm_i915_private *dev_priv,
 			drm_gem_object_reference(&obj->base);
 
 			/* For the unbound phase, this should be a no-op! */
-			list_for_each_entry_safe(vma, v,
-						 &obj->vma_list, vma_link)
+			i915_gem_obj_for_each_vma_safe(vma, v, obj)
 				if (i915_vma_unbind(vma))
 					break;
 
diff --git a/drivers/gpu/drm/i915/i915_gem_userptr.c b/drivers/gpu/drm/i915/i915_gem_userptr.c
index 19fb0bddc1cd..b730dfffe306 100644
--- a/drivers/gpu/drm/i915/i915_gem_userptr.c
+++ b/drivers/gpu/drm/i915/i915_gem_userptr.c
@@ -81,7 +81,7 @@ static void __cancel_userptr__worker(struct work_struct *work)
 		was_interruptible = dev_priv->mm.interruptible;
 		dev_priv->mm.interruptible = false;
 
-		list_for_each_entry_safe(vma, tmp, &obj->vma_list, vma_link) {
+		i915_gem_obj_for_each_vma_safe(vma, tmp, obj) {
 			int ret = i915_vma_unbind(vma);
 			WARN_ON(ret && ret != -EIO);
 		}
-- 
1.9.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [PATCH 11/13] drm/i915: Cache ringbuffer GTT address
  2016-01-08 11:29 [PATCH v2 00/13] Misc cleanups and locking fixes Tvrtko Ursulin
                   ` (9 preceding siblings ...)
  2016-01-08 11:29 ` [PATCH 10/13] drm/i915: Introduce dedicated safe object VMA iterator Tvrtko Ursulin
@ 2016-01-08 11:29 ` Tvrtko Ursulin
  2016-01-08 11:37   ` Chris Wilson
                     ` (2 more replies)
  2016-01-08 11:29 ` [PATCH 12/13] drm/i915: Add BKL asserts to get page helpers Tvrtko Ursulin
                   ` (3 subsequent siblings)
  14 siblings, 3 replies; 50+ messages in thread
From: Tvrtko Ursulin @ 2016-01-08 11:29 UTC (permalink / raw)
  To: Intel-gfx

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Purpose is to avoid calling i915_gem_obj_ggtt_offset from the
interrupt context without the big lock held.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/intel_lrc.c        | 3 +--
 drivers/gpu/drm/i915/intel_ringbuffer.c | 3 +++
 drivers/gpu/drm/i915/intel_ringbuffer.h | 1 +
 3 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 5b3795815d8e..70c511ef6b12 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -345,7 +345,6 @@ static int execlists_update_context(struct drm_i915_gem_request *rq)
 	struct intel_engine_cs *ring = rq->ring;
 	struct i915_hw_ppgtt *ppgtt = rq->ctx->ppgtt;
 	struct drm_i915_gem_object *ctx_obj = rq->ctx->engine[ring->id].state;
-	struct drm_i915_gem_object *rb_obj = rq->ringbuf->obj;
 	struct page *page;
 	uint32_t *reg_state;
 
@@ -355,7 +354,7 @@ static int execlists_update_context(struct drm_i915_gem_request *rq)
 	reg_state = kmap_atomic(page);
 
 	reg_state[CTX_RING_TAIL+1] = rq->tail;
-	reg_state[CTX_RING_BUFFER_START+1] = i915_gem_obj_ggtt_offset(rb_obj);
+	reg_state[CTX_RING_BUFFER_START+1] = rq->ringbuf->gtt_start;
 
 	if (ppgtt && !USES_FULL_48BIT_PPGTT(ppgtt->base.dev)) {
 		/* True 32b PPGTT with dynamic page allocation: update PDP
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 339701d7a9a5..9094ce254125 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -1988,6 +1988,7 @@ void intel_unpin_ringbuffer_obj(struct intel_ringbuffer *ringbuf)
 	else
 		iounmap(ringbuf->virtual_start);
 	ringbuf->virtual_start = NULL;
+	ringbuf->gtt_start = 0;
 	i915_gem_object_ggtt_unpin(ringbuf->obj);
 }
 
@@ -2054,6 +2055,8 @@ int intel_pin_and_map_ringbuffer_obj(struct drm_device *dev,
 		}
 	}
 
+	ringbuf->gtt_start = i915_gem_obj_ggtt_offset(obj);
+
 	return 0;
 }
 
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index 0b91a4b77359..25d3716228ae 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -98,6 +98,7 @@ struct intel_ring_hangcheck {
 struct intel_ringbuffer {
 	struct drm_i915_gem_object *obj;
 	void __iomem *virtual_start;
+	u64 gtt_start;
 
 	struct intel_engine_cs *ring;
 	struct list_head link;
-- 
1.9.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [PATCH 12/13] drm/i915: Add BKL asserts to get page helpers
  2016-01-08 11:29 [PATCH v2 00/13] Misc cleanups and locking fixes Tvrtko Ursulin
                   ` (10 preceding siblings ...)
  2016-01-08 11:29 ` [PATCH 11/13] drm/i915: Cache ringbuffer GTT address Tvrtko Ursulin
@ 2016-01-08 11:29 ` Tvrtko Ursulin
  2016-01-08 11:37   ` Chris Wilson
  2016-01-08 11:29 ` [PATCH 13/13] drm/i915: Cache LRC state page in the context Tvrtko Ursulin
                   ` (2 subsequent siblings)
  14 siblings, 1 reply; 50+ messages in thread
From: Tvrtko Ursulin @ 2016-01-08 11:29 UTC (permalink / raw)
  To: Intel-gfx

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Purpose is catching illegal callers.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.h | 2 ++
 drivers/gpu/drm/i915/i915_gem.c | 2 ++
 2 files changed, 4 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 81a83424fa13..51869496b299 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2916,6 +2916,8 @@ i915_gem_object_get_dirty_page(struct drm_i915_gem_object *obj, int n);
 static inline struct page *
 i915_gem_object_get_page(struct drm_i915_gem_object *obj, int n)
 {
+	WARN_ON_ONCE(!mutex_is_locked(&obj->base.dev->struct_mutex));
+
 	if (WARN_ON(n >= obj->base.size >> PAGE_SHIFT))
 		return NULL;
 
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 5a6daeaef680..b54b5619e07f 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -5250,6 +5250,8 @@ i915_gem_object_get_dirty_page(struct drm_i915_gem_object *obj, int n)
 {
 	struct page *page;
 
+	WARN_ON_ONCE(!mutex_is_locked(&obj->base.dev->struct_mutex));
+
 	/* Only default objects have per-page dirty tracking */
 	if (WARN_ON(obj->ops != &i915_gem_object_ops))
 		return NULL;
-- 
1.9.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [PATCH 13/13] drm/i915: Cache LRC state page in the context
  2016-01-08 11:29 [PATCH v2 00/13] Misc cleanups and locking fixes Tvrtko Ursulin
                   ` (11 preceding siblings ...)
  2016-01-08 11:29 ` [PATCH 12/13] drm/i915: Add BKL asserts to get page helpers Tvrtko Ursulin
@ 2016-01-08 11:29 ` Tvrtko Ursulin
  2016-01-08 11:38 ` [PATCH v2 00/13] Misc cleanups and locking fixes Chris Wilson
  2016-01-11  9:44 ` ✗ failure: Fi.CI.BAT Patchwork
  14 siblings, 0 replies; 50+ messages in thread
From: Tvrtko Ursulin @ 2016-01-08 11:29 UTC (permalink / raw)
  To: Intel-gfx

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

LRC lifetime is well defined so we can cache the page pointing
to the object backing store in the context in order to avoid
walking over the object SG page list from the interrupt context
without the big lock held.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.h  |  1 +
 drivers/gpu/drm/i915/intel_lrc.c | 17 +++++++++++------
 2 files changed, 12 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 51869496b299..0e3554c821fa 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -882,6 +882,7 @@ struct intel_context {
 		struct intel_ringbuffer *ringbuf;
 		int pin_count;
 		u32 lrca;
+		struct page *lrc_state_page;
 	} engine[I915_NUM_RINGS];
 
 	struct list_head link;
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 70c511ef6b12..016d8c833a99 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -344,14 +344,10 @@ static int execlists_update_context(struct drm_i915_gem_request *rq)
 {
 	struct intel_engine_cs *ring = rq->ring;
 	struct i915_hw_ppgtt *ppgtt = rq->ctx->ppgtt;
-	struct drm_i915_gem_object *ctx_obj = rq->ctx->engine[ring->id].state;
-	struct page *page;
 	uint32_t *reg_state;
 
-	BUG_ON(!ctx_obj);
-
-	page = i915_gem_object_get_dirty_page(ctx_obj, LRC_STATE_PN);
-	reg_state = kmap_atomic(page);
+	BUG_ON(!rq->ctx->engine[ring->id].lrc_state_page);
+	reg_state = kmap_atomic(rq->ctx->engine[ring->id].lrc_state_page);
 
 	reg_state[CTX_RING_TAIL+1] = rq->tail;
 	reg_state[CTX_RING_BUFFER_START+1] = rq->ringbuf->gtt_start;
@@ -1021,6 +1017,7 @@ static int intel_lr_context_do_pin(struct intel_engine_cs *ring,
 	struct drm_i915_private *dev_priv = dev->dev_private;
 	struct drm_i915_gem_object *ctx_obj = ctx->engine[ring->id].state;
 	struct intel_ringbuffer *ringbuf = ctx->engine[ring->id].ringbuf;
+	struct page *lrc_state_page;
 	u64 lrca;
 	int ret;
 
@@ -1037,11 +1034,18 @@ static int intel_lr_context_do_pin(struct intel_engine_cs *ring,
 		goto unpin_ctx_obj;
 	}
 
+	lrc_state_page = i915_gem_object_get_dirty_page(ctx_obj, LRC_STATE_PN);
+	if (WARN_ON(!lrc_state_page)) {
+		ret = -ENODEV;
+		goto unpin_ctx_obj;
+	}
+
 	ret = intel_pin_and_map_ringbuffer_obj(ring->dev, ringbuf);
 	if (ret)
 		goto unpin_ctx_obj;
 
 	ctx->engine[ring->id].lrca = lrca;
+	ctx->engine[ring->id].lrc_state_page = lrc_state_page;
 	ctx_obj->dirty = true;
 
 	/* Invalidate GuC TLB. */
@@ -1085,6 +1089,7 @@ void intel_lr_context_unpin(struct drm_i915_gem_request *rq)
 			intel_unpin_ringbuffer_obj(ringbuf);
 			i915_gem_object_ggtt_unpin(ctx_obj);
 			rq->ctx->engine[ring->id].lrca = 0;
+			rq->ctx->engine[ring->id].lrc_state_page = NULL;
 		}
 	}
 }
-- 
1.9.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 50+ messages in thread

* Re: [PATCH 12/13] drm/i915: Add BKL asserts to get page helpers
  2016-01-08 11:29 ` [PATCH 12/13] drm/i915: Add BKL asserts to get page helpers Tvrtko Ursulin
@ 2016-01-08 11:37   ` Chris Wilson
  0 siblings, 0 replies; 50+ messages in thread
From: Chris Wilson @ 2016-01-08 11:37 UTC (permalink / raw)
  To: Tvrtko Ursulin; +Cc: Intel-gfx

On Fri, Jan 08, 2016 at 11:29:51AM +0000, Tvrtko Ursulin wrote:
> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> 
> Purpose is catching illegal callers.

I know you didn't try benchmarking this.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 11/13] drm/i915: Cache ringbuffer GTT address
  2016-01-08 11:29 ` [PATCH 11/13] drm/i915: Cache ringbuffer GTT address Tvrtko Ursulin
@ 2016-01-08 11:37   ` Chris Wilson
  2016-01-11  8:49   ` Daniel Vetter
  2016-01-11 16:16   ` [PATCH 11/13] drm/i915: Cache ringbuffer GTT address Dave Gordon
  2 siblings, 0 replies; 50+ messages in thread
From: Chris Wilson @ 2016-01-08 11:37 UTC (permalink / raw)
  To: Tvrtko Ursulin; +Cc: Intel-gfx

On Fri, Jan 08, 2016 at 11:29:50AM +0000, Tvrtko Ursulin wrote:
> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> 
> Purpose is to avoid calling i915_gem_obj_ggtt_offset from the
> interrupt context without the big lock held.

No. Move to tracking VMA.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v2 00/13] Misc cleanups and locking fixes
  2016-01-08 11:29 [PATCH v2 00/13] Misc cleanups and locking fixes Tvrtko Ursulin
                   ` (12 preceding siblings ...)
  2016-01-08 11:29 ` [PATCH 13/13] drm/i915: Cache LRC state page in the context Tvrtko Ursulin
@ 2016-01-08 11:38 ` Chris Wilson
  2016-01-11  9:44 ` ✗ failure: Fi.CI.BAT Patchwork
  14 siblings, 0 replies; 50+ messages in thread
From: Chris Wilson @ 2016-01-08 11:38 UTC (permalink / raw)
  To: Tvrtko Ursulin; +Cc: Intel-gfx

On Fri, Jan 08, 2016 at 11:29:39AM +0000, Tvrtko Ursulin wrote:
> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> 
> Random cleanups from yesterday mutated into a bit of locking verification and
> fixing as suggested by Daniel Vetter.

Rather than duplicating my work, please do help fix the bugs that are
impeding moving forwards.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 08/13] drm/i915: GEM operations need to be done under the big lock
  2016-01-08 11:29 ` [PATCH 08/13] drm/i915: GEM operations need to be done under the big lock Tvrtko Ursulin
@ 2016-01-08 11:40   ` Chris Wilson
  2016-01-08 11:45     ` Chris Wilson
  2016-01-08 11:47     ` Tvrtko Ursulin
  2016-01-08 12:40   ` Tvrtko Ursulin
  2016-01-08 13:04   ` [PATCH v3 " Tvrtko Ursulin
  2 siblings, 2 replies; 50+ messages in thread
From: Chris Wilson @ 2016-01-08 11:40 UTC (permalink / raw)
  To: Tvrtko Ursulin; +Cc: Intel-gfx

On Fri, Jan 08, 2016 at 11:29:47AM +0000, Tvrtko Ursulin wrote:
> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> 
> VMA creation and GEM list management need the big lock.

No, this operation doesn't require locking.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 07/13] drm/i915: Introduce dedicated object VMA iterator
  2016-01-08 11:29 ` [PATCH 07/13] drm/i915: Introduce dedicated object VMA iterator Tvrtko Ursulin
@ 2016-01-08 11:44   ` Chris Wilson
  2016-01-11  8:48     ` Daniel Vetter
  2016-01-08 13:29   ` Tvrtko Ursulin
  1 sibling, 1 reply; 50+ messages in thread
From: Chris Wilson @ 2016-01-08 11:44 UTC (permalink / raw)
  To: Tvrtko Ursulin; +Cc: Daniel Vetter, Intel-gfx

On Fri, Jan 08, 2016 at 11:29:46AM +0000, Tvrtko Ursulin wrote:
> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> 
> Purpose is to catch places which iterate the object VMA list
> without holding the big lock.
> 
> Implemented by open coding list_for_each_entry to make the
> macro compatible with existing call sites.
> 
> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
> +#define i915_gem_obj_for_each_vma(vma, obj) \
> +	for (WARN_ON_ONCE(!mutex_is_locked(&(obj)->base.dev->struct_mutex)), \

Let's not go around adding WARN(!mutex_locked) to GEM code when
lockdep_assert_held doesn't add overhead outside of testing.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 08/13] drm/i915: GEM operations need to be done under the big lock
  2016-01-08 11:40   ` Chris Wilson
@ 2016-01-08 11:45     ` Chris Wilson
  2016-01-08 11:47     ` Tvrtko Ursulin
  1 sibling, 0 replies; 50+ messages in thread
From: Chris Wilson @ 2016-01-08 11:45 UTC (permalink / raw)
  To: Tvrtko Ursulin, Intel-gfx

On Fri, Jan 08, 2016 at 11:40:56AM +0000, Chris Wilson wrote:
> On Fri, Jan 08, 2016 at 11:29:47AM +0000, Tvrtko Ursulin wrote:
> > From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> > 
> > VMA creation and GEM list management need the big lock.
> 
> No, this operation doesn't require locking.

My bad, there's a vm->list tracking. I was thinking of the
obj->vma_list.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 08/13] drm/i915: GEM operations need to be done under the big lock
  2016-01-08 11:40   ` Chris Wilson
  2016-01-08 11:45     ` Chris Wilson
@ 2016-01-08 11:47     ` Tvrtko Ursulin
  2016-01-08 12:09       ` Chris Wilson
  1 sibling, 1 reply; 50+ messages in thread
From: Tvrtko Ursulin @ 2016-01-08 11:47 UTC (permalink / raw)
  To: Chris Wilson, Intel-gfx


On 08/01/16 11:40, Chris Wilson wrote:
> On Fri, Jan 08, 2016 at 11:29:47AM +0000, Tvrtko Ursulin wrote:
>> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>
>> VMA creation and GEM list management need the big lock.
>
> No, this operation doesn't require locking.

You can argue about positioning and such, but it is adding it to the 
bound and inactive list...

Tvrtko

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 08/13] drm/i915: GEM operations need to be done under the big lock
  2016-01-08 11:47     ` Tvrtko Ursulin
@ 2016-01-08 12:09       ` Chris Wilson
  0 siblings, 0 replies; 50+ messages in thread
From: Chris Wilson @ 2016-01-08 12:09 UTC (permalink / raw)
  To: Tvrtko Ursulin; +Cc: Intel-gfx

On Fri, Jan 08, 2016 at 11:47:03AM +0000, Tvrtko Ursulin wrote:
> 
> On 08/01/16 11:40, Chris Wilson wrote:
> >On Fri, Jan 08, 2016 at 11:29:47AM +0000, Tvrtko Ursulin wrote:
> >>From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> >>
> >>VMA creation and GEM list management need the big lock.
> >
> >No, this operation doesn't require locking.
> 
> You can argue about positioning and such, but it is adding it to the
> bound and inactive list...

Actually you raise a good point about the other stolen series,
namely we need to push this lock up to the caller since we have the
samme issue wit i915_gem_object_create_stolen() which I don't remember
that we fixed in valleyview_setup_pctx().
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 50+ messages in thread

* [PATCH 08/13] drm/i915: GEM operations need to be done under the big lock
  2016-01-08 11:29 ` [PATCH 08/13] drm/i915: GEM operations need to be done under the big lock Tvrtko Ursulin
  2016-01-08 11:40   ` Chris Wilson
@ 2016-01-08 12:40   ` Tvrtko Ursulin
  2016-01-08 13:04   ` [PATCH v3 " Tvrtko Ursulin
  2 siblings, 0 replies; 50+ messages in thread
From: Tvrtko Ursulin @ 2016-01-08 12:40 UTC (permalink / raw)
  To: Intel-gfx

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

VMA creation and GEM list management need the big lock.

v2:

Mutex unlock ended on the wrong path somehow. (0-day, Julia Lawall)

Not to mention drm_gem_object_unreference was there in existing
code with no mutex held.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/i915_gem_stolen.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_gem_stolen.c b/drivers/gpu/drm/i915/i915_gem_stolen.c
index c384dc9c8a63..7c2f9127bb23 100644
--- a/drivers/gpu/drm/i915/i915_gem_stolen.c
+++ b/drivers/gpu/drm/i915/i915_gem_stolen.c
@@ -670,6 +670,8 @@ i915_gem_object_create_stolen_for_preallocated(struct drm_device *dev,
 	if (gtt_offset == I915_GTT_OFFSET_NONE)
 		return obj;
 
+	mutex_lock(&dev->struct_mutex);
+
 	vma = i915_gem_obj_lookup_or_create_vma(obj, ggtt);
 	if (IS_ERR(vma)) {
 		ret = PTR_ERR(vma);
@@ -698,9 +700,12 @@ i915_gem_object_create_stolen_for_preallocated(struct drm_device *dev,
 	list_add_tail(&obj->global_list, &dev_priv->mm.bound_list);
 	i915_gem_object_pin_pages(obj);
 
+	mutex_unlock(&dev->struct_mutex);
+
 	return obj;
 
 err:
 	drm_gem_object_unreference(&obj->base);
+	mutex_unlock(&dev->struct_mutex);
 	return NULL;
 }
-- 
1.9.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [PATCH v3 08/13] drm/i915: GEM operations need to be done under the big lock
  2016-01-08 11:29 ` [PATCH 08/13] drm/i915: GEM operations need to be done under the big lock Tvrtko Ursulin
  2016-01-08 11:40   ` Chris Wilson
  2016-01-08 12:40   ` Tvrtko Ursulin
@ 2016-01-08 13:04   ` Tvrtko Ursulin
  2 siblings, 0 replies; 50+ messages in thread
From: Tvrtko Ursulin @ 2016-01-08 13:04 UTC (permalink / raw)
  To: Intel-gfx

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

VMA creation and GEM list management need the big lock.

v2:

Mutex unlock ended on the wrong path somehow. (0-day, Julia Lawall)

Not to mention drm_gem_object_unreference was there in existing
code with no mutex held.

v3:

Some callers of i915_gem_object_create_stolen_for_preallocated
already hold the lock so move the mutex into the other caller
as well.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/i915_gem_stolen.c | 3 +++
 drivers/gpu/drm/i915/intel_display.c   | 8 ++++++--
 2 files changed, 9 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_stolen.c b/drivers/gpu/drm/i915/i915_gem_stolen.c
index c384dc9c8a63..a97a5e762c0f 100644
--- a/drivers/gpu/drm/i915/i915_gem_stolen.c
+++ b/drivers/gpu/drm/i915/i915_gem_stolen.c
@@ -635,6 +635,9 @@ i915_gem_object_create_stolen_for_preallocated(struct drm_device *dev,
 	if (!drm_mm_initialized(&dev_priv->mm.stolen))
 		return NULL;
 
+	if (WARN_ON_ONCE(!mutex_is_locked(&dev->struct_mutex)))
+		return NULL;
+
 	DRM_DEBUG_KMS("creating preallocated stolen object: stolen_offset=%x, gtt_offset=%x, size=%x\n",
 			stolen_offset, gtt_offset, size);
 
diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index cb76054fe4c8..c502b7387509 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -2534,12 +2534,16 @@ intel_alloc_initial_plane_obj(struct intel_crtc *crtc,
 	if (size_aligned * 2 > dev_priv->gtt.stolen_usable_size)
 		return false;
 
+	mutex_lock(&dev->struct_mutex);
+
 	obj = i915_gem_object_create_stolen_for_preallocated(dev,
 							     base_aligned,
 							     base_aligned,
 							     size_aligned);
-	if (!obj)
+	if (!obj) {
+		mutex_unlock(&dev->struct_mutex);
 		return false;
+	}
 
 	obj->tiling_mode = plane_config->tiling;
 	if (obj->tiling_mode == I915_TILING_X)
@@ -2552,12 +2556,12 @@ intel_alloc_initial_plane_obj(struct intel_crtc *crtc,
 	mode_cmd.modifier[0] = fb->modifier[0];
 	mode_cmd.flags = DRM_MODE_FB_MODIFIERS;
 
-	mutex_lock(&dev->struct_mutex);
 	if (intel_framebuffer_init(dev, to_intel_framebuffer(fb),
 				   &mode_cmd, obj)) {
 		DRM_DEBUG_KMS("intel fb init failed\n");
 		goto out_unref_obj;
 	}
+
 	mutex_unlock(&dev->struct_mutex);
 
 	DRM_DEBUG_KMS("initial plane fb obj %p\n", obj);
-- 
1.9.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 50+ messages in thread

* Re: [PATCH 07/13] drm/i915: Introduce dedicated object VMA iterator
  2016-01-08 11:29 ` [PATCH 07/13] drm/i915: Introduce dedicated object VMA iterator Tvrtko Ursulin
  2016-01-08 11:44   ` Chris Wilson
@ 2016-01-08 13:29   ` Tvrtko Ursulin
  2016-01-11  8:43     ` Daniel Vetter
  1 sibling, 1 reply; 50+ messages in thread
From: Tvrtko Ursulin @ 2016-01-08 13:29 UTC (permalink / raw)
  To: Intel-gfx; +Cc: Daniel Vetter


On 08/01/16 11:29, Tvrtko Ursulin wrote:
> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>
> Purpose is to catch places which iterate the object VMA list
> without holding the big lock.
>
> Implemented by open coding list_for_each_entry to make the
> macro compatible with existing call sites.
>
> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
> ---
>   drivers/gpu/drm/i915/i915_debugfs.c      |  8 ++++----
>   drivers/gpu/drm/i915/i915_drv.h          |  6 ++++++
>   drivers/gpu/drm/i915/i915_gem.c          | 24 ++++++++++++------------
>   drivers/gpu/drm/i915/i915_gem_gtt.c      |  2 +-
>   drivers/gpu/drm/i915/i915_gem_shrinker.c |  2 +-
>   drivers/gpu/drm/i915/i915_gpu_error.c    |  4 ++--
>   6 files changed, 26 insertions(+), 20 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
> index 714a45cf8a51..d7c2a3201161 100644
> --- a/drivers/gpu/drm/i915/i915_debugfs.c
> +++ b/drivers/gpu/drm/i915/i915_debugfs.c
> @@ -117,7 +117,7 @@ static u64 i915_gem_obj_total_ggtt_size(struct drm_i915_gem_object *obj)
>   	u64 size = 0;
>   	struct i915_vma *vma;
>
> -	list_for_each_entry(vma, &obj->vma_list, vma_link) {
> +	i915_gem_obj_for_each_vma(vma, obj) {
>   		if (i915_is_ggtt(vma->vm) &&
>   		    drm_mm_node_allocated(&vma->node))
>   			size += vma->node.size;
> @@ -155,7 +155,7 @@ describe_obj(struct seq_file *m, struct drm_i915_gem_object *obj)
>   		   obj->madv == I915_MADV_DONTNEED ? " purgeable" : "");
>   	if (obj->base.name)
>   		seq_printf(m, " (name: %d)", obj->base.name);
> -	list_for_each_entry(vma, &obj->vma_list, vma_link) {
> +	i915_gem_obj_for_each_vma(vma, obj) {
>   		if (vma->pin_count > 0)
>   			pin_count++;
>   	}
> @@ -164,7 +164,7 @@ describe_obj(struct seq_file *m, struct drm_i915_gem_object *obj)
>   		seq_printf(m, " (display)");
>   	if (obj->fence_reg != I915_FENCE_REG_NONE)
>   		seq_printf(m, " (fence: %d)", obj->fence_reg);
> -	list_for_each_entry(vma, &obj->vma_list, vma_link) {
> +	i915_gem_obj_for_each_vma(vma, obj) {
>   		seq_printf(m, " (%sgtt offset: %08llx, size: %08llx",
>   			   i915_is_ggtt(vma->vm) ? "g" : "pp",
>   			   vma->node.start, vma->node.size);
> @@ -342,7 +342,7 @@ static int per_file_stats(int id, void *ptr, void *data)
>   		stats->shared += obj->base.size;
>
>   	if (USES_FULL_PPGTT(obj->base.dev)) {
> -		list_for_each_entry(vma, &obj->vma_list, vma_link) {
> +		i915_gem_obj_for_each_vma(vma, obj) {
>   			struct i915_hw_ppgtt *ppgtt;
>
>   			if (!drm_mm_node_allocated(&vma->node))
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index b77a5d84eac2..0406a020dfcc 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -2852,6 +2852,12 @@ struct drm_i915_gem_object *i915_gem_object_create_from_data(
>   void i915_gem_free_object(struct drm_gem_object *obj);
>   void i915_gem_vma_destroy(struct i915_vma *vma);
>
> +#define i915_gem_obj_for_each_vma(vma, obj) \
> +	for (WARN_ON_ONCE(!mutex_is_locked(&(obj)->base.dev->struct_mutex)), \
> +	     vma = list_first_entry(&(obj)->vma_list, typeof(*vma), vma_link);\
> +	     &vma->vma_link != (&(obj)->vma_list); \
> +	     vma = list_next_entry(vma, vma_link))
> +


Unfortunately error capture is not happy with this approach. Can't even 
see that error capture attempts to grab the mutex anywhere.

So what? Drop the idea or add a "doing error capture" flag somewhere?

Regards,

Tvrtko
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 01/13] drm/i915/bdw+: Replace list_del+list_add_tail with list_move_tail
  2016-01-08 11:29 ` [PATCH 01/13] drm/i915/bdw+: Replace list_del+list_add_tail with list_move_tail Tvrtko Ursulin
@ 2016-01-11  8:22   ` Daniel Vetter
  0 siblings, 0 replies; 50+ messages in thread
From: Daniel Vetter @ 2016-01-11  8:22 UTC (permalink / raw)
  To: Tvrtko Ursulin; +Cc: Intel-gfx

On Fri, Jan 08, 2016 at 11:29:40AM +0000, Tvrtko Ursulin wrote:
> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> 
> Same effect for slightly less source code and resulting binary.
> 
> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
> ---
>  drivers/gpu/drm/i915/intel_lrc.c | 15 ++++++---------
>  1 file changed, 6 insertions(+), 9 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
> index 23839ff04e27..8b6071fcd743 100644
> --- a/drivers/gpu/drm/i915/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/intel_lrc.c
> @@ -431,9 +431,8 @@ static void execlists_context_unqueue(struct intel_engine_cs *ring)
>  			/* Same ctx: ignore first request, as second request
>  			 * will update tail past first request's workload */
>  			cursor->elsp_submitted = req0->elsp_submitted;
> -			list_del(&req0->execlist_link);
> -			list_add_tail(&req0->execlist_link,
> -				&ring->execlist_retired_req_list);
> +			list_move_tail(&req0->execlist_link,
> +				       &ring->execlist_retired_req_list);
>  			req0 = cursor;
>  		} else {
>  			req1 = cursor;
> @@ -485,9 +484,8 @@ static bool execlists_check_remove_request(struct intel_engine_cs *ring,
>  			     "Never submitted head request\n");
>  
>  			if (--head_req->elsp_submitted <= 0) {
> -				list_del(&head_req->execlist_link);
> -				list_add_tail(&head_req->execlist_link,
> -					&ring->execlist_retired_req_list);
> +				list_move_tail(&head_req->execlist_link,
> +					       &ring->execlist_retired_req_list);
>  				return true;

Aside: Some of this code is over-indented ...
-Daniel

>  			}
>  		}
> @@ -608,9 +606,8 @@ static int execlists_context_queue(struct drm_i915_gem_request *request)
>  		if (request->ctx == tail_req->ctx) {
>  			WARN(tail_req->elsp_submitted != 0,
>  				"More than 2 already-submitted reqs queued\n");
> -			list_del(&tail_req->execlist_link);
> -			list_add_tail(&tail_req->execlist_link,
> -				&ring->execlist_retired_req_list);
> +			list_move_tail(&tail_req->execlist_link,
> +				       &ring->execlist_retired_req_list);
>  		}
>  	}
>  
> -- 
> 1.9.1
> 
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 02/13] drm/i915: Don't need a timer to wake us up
  2016-01-08 11:29 ` [PATCH 02/13] drm/i915: Don't need a timer to wake us up Tvrtko Ursulin
@ 2016-01-11  8:26   ` Daniel Vetter
  0 siblings, 0 replies; 50+ messages in thread
From: Daniel Vetter @ 2016-01-11  8:26 UTC (permalink / raw)
  To: Tvrtko Ursulin; +Cc: Intel-gfx

On Fri, Jan 08, 2016 at 11:29:41AM +0000, Tvrtko Ursulin wrote:
> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> 
> Looks like the sleeping loop in __i915_wait_request can be
> simplified by using io_schedule_timeout instead of setting
> up and destroying a timer.
> 
> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> Cc: Chris Wilson <chris@chris-wilson.co.uk>

io_schedule_timeout was only added in

commit 9cff8adeaa34b5d2802f03f89803da57856b3b72
Author: NeilBrown <neilb@suse.de>
Date:   Fri Feb 13 15:49:17 2015 +1100

    sched: Prevent recursion in io_schedule()

(well the EXPORT_SYMBOL for it), that was iirc why this was open-coded.
Please add this to your commit message.

Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>

> ---
>  drivers/gpu/drm/i915/i915_gem.c | 28 ++++++++--------------------
>  1 file changed, 8 insertions(+), 20 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index 6c60e04fc09c..de98dc41fb9f 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -1135,11 +1135,6 @@ i915_gem_check_wedge(struct i915_gpu_error *error,
>  	return 0;
>  }
>  
> -static void fake_irq(unsigned long data)
> -{
> -	wake_up_process((struct task_struct *)data);
> -}
> -
>  static bool missed_irq(struct drm_i915_private *dev_priv,
>  		       struct intel_engine_cs *ring)
>  {
> @@ -1291,7 +1286,7 @@ int __i915_wait_request(struct drm_i915_gem_request *req,
>  	}
>  
>  	for (;;) {
> -		struct timer_list timer;
> +		long sched_timeout;
>  
>  		prepare_to_wait(&ring->irq_queue, &wait, state);
>  
> @@ -1321,21 +1316,14 @@ int __i915_wait_request(struct drm_i915_gem_request *req,
>  			break;
>  		}
>  
> -		timer.function = NULL;
> -		if (timeout || missed_irq(dev_priv, ring)) {
> -			unsigned long expire;
> -
> -			setup_timer_on_stack(&timer, fake_irq, (unsigned long)current);
> -			expire = missed_irq(dev_priv, ring) ? jiffies + 1 : timeout_expire;
> -			mod_timer(&timer, expire);
> -		}
> -
> -		io_schedule();
> +		if (timeout)
> +			sched_timeout = timeout_expire - jiffies;
> +		else if (missed_irq(dev_priv, ring))
> +			sched_timeout = 1;
> +		else
> +			sched_timeout = MAX_SCHEDULE_TIMEOUT;
>  
> -		if (timer.function) {
> -			del_singleshot_timer_sync(&timer);
> -			destroy_timer_on_stack(&timer);
> -		}
> +		io_schedule_timeout(sched_timeout);
>  	}
>  	if (!irq_test_in_progress)
>  		ring->irq_put(ring);
> -- 
> 1.9.1
> 
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 03/13] drm/i915: Avoid invariant conditionals in lrc interrupt handler
  2016-01-08 11:29 ` [PATCH 03/13] drm/i915: Avoid invariant conditionals in lrc interrupt handler Tvrtko Ursulin
@ 2016-01-11  8:29   ` Daniel Vetter
  2016-01-11  9:43     ` Tvrtko Ursulin
  0 siblings, 1 reply; 50+ messages in thread
From: Daniel Vetter @ 2016-01-11  8:29 UTC (permalink / raw)
  To: Tvrtko Ursulin; +Cc: Intel-gfx

On Fri, Jan 08, 2016 at 11:29:42AM +0000, Tvrtko Ursulin wrote:
> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> 
> There is no need to check on what Gen we are running on every
> interrupt and every command submission. We can instead set up
> some of that when engines are initialized, store it in the
> engine structure and use it directly at runtime.
> 
> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> ---
>  drivers/gpu/drm/i915/intel_lrc.c        | 36 ++++++++++++++++++---------------
>  drivers/gpu/drm/i915/intel_ringbuffer.h |  2 ++
>  2 files changed, 22 insertions(+), 16 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
> index 8b6071fcd743..84977a6e6f3f 100644
> --- a/drivers/gpu/drm/i915/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/intel_lrc.c
> @@ -298,29 +298,15 @@ uint64_t intel_lr_context_descriptor(struct intel_context *ctx,
>  				     struct intel_engine_cs *ring)
>  {
>  	struct drm_i915_gem_object *ctx_obj = ctx->engine[ring->id].state;
> -	uint64_t desc;
> +	uint64_t desc = ring->ctx_desc_template;
>  	uint64_t lrca = i915_gem_obj_ggtt_offset(ctx_obj) +
>  			LRC_PPHWSP_PN * PAGE_SIZE;
>  
>  	WARN_ON(lrca & 0xFFFFFFFF00000FFFULL);
>  
> -	desc = GEN8_CTX_VALID;
> -	desc |= GEN8_CTX_ADDRESSING_MODE(dev) << GEN8_CTX_ADDRESSING_MODE_SHIFT;
> -	if (IS_GEN8(ctx_obj->base.dev))
> -		desc |= GEN8_CTX_L3LLC_COHERENT;
> -	desc |= GEN8_CTX_PRIVILEGE;
>  	desc |= lrca;
>  	desc |= (u64)intel_execlists_ctx_id(ctx_obj) << GEN8_CTX_ID_SHIFT;
>  
> -	/* TODO: WaDisableLiteRestore when we start using semaphore
> -	 * signalling between Command Streamers */
> -	/* desc |= GEN8_CTX_FORCE_RESTORE; */
> -
> -	/* WaEnableForceRestoreInCtxtDescForVCS:skl */
> -	/* WaEnableForceRestoreInCtxtDescForVCS:bxt */
> -	if (disable_lite_restore_wa(ring))
> -		desc |= GEN8_CTX_FORCE_RESTORE;
> -
>  	return desc;
>  }

tbh I'd go full monty and just cache the entire context descriptor.
-Daniel

>  
> @@ -556,7 +542,7 @@ void intel_lrc_irq_handler(struct intel_engine_cs *ring)
>  		}
>  	}
>  
> -	if (disable_lite_restore_wa(ring)) {
> +	if (ring->disable_lite_restore_wa) {
>  		/* Prevent a ctx to preempt itself */
>  		if ((status & GEN8_CTX_STATUS_ACTIVE_IDLE) &&
>  		    (submit_contexts != 0))
> @@ -1980,6 +1966,24 @@ static int logical_ring_init(struct drm_device *dev, struct intel_engine_cs *rin
>  		goto error;
>  	}
>  
> +	ring->disable_lite_restore_wa = disable_lite_restore_wa(ring);
> +
> +	ring->ctx_desc_template = GEN8_CTX_VALID;
> +	ring->ctx_desc_template |= GEN8_CTX_ADDRESSING_MODE(dev) <<
> +				   GEN8_CTX_ADDRESSING_MODE_SHIFT;
> +	if (IS_GEN8(dev))
> +		ring->ctx_desc_template |= GEN8_CTX_L3LLC_COHERENT;
> +	ring->ctx_desc_template |= GEN8_CTX_PRIVILEGE;
> +
> +	/* TODO: WaDisableLiteRestore when we start using semaphore
> +	 * signalling between Command Streamers */
> +	/* ring->ctx_desc_template |= GEN8_CTX_FORCE_RESTORE; */
> +
> +	/* WaEnableForceRestoreInCtxtDescForVCS:skl */
> +	/* WaEnableForceRestoreInCtxtDescForVCS:bxt */
> +	if (ring->disable_lite_restore_wa)
> +		ring->ctx_desc_template |= GEN8_CTX_FORCE_RESTORE;
> +
>  	return 0;
>  
>  error:
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
> index 49574ffe54bc..0b91a4b77359 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.h
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
> @@ -268,6 +268,8 @@ struct  intel_engine_cs {
>  	struct list_head execlist_queue;
>  	struct list_head execlist_retired_req_list;
>  	u8 next_context_status_buffer;
> +	bool disable_lite_restore_wa;
> +	u32 ctx_desc_template;
>  	u32             irq_keep_mask; /* bitmask for interrupts that should not be masked */
>  	int		(*emit_request)(struct drm_i915_gem_request *request);
>  	int		(*emit_flush)(struct drm_i915_gem_request *request,
> -- 
> 1.9.1
> 
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 04/13] drm/i915: Fail engine initialization if LRCA is incorrectly aligned
  2016-01-08 11:29 ` [PATCH 04/13] drm/i915: Fail engine initialization if LRCA is incorrectly aligned Tvrtko Ursulin
@ 2016-01-11  8:31   ` Daniel Vetter
  2016-01-11 16:02     ` Dave Gordon
  0 siblings, 1 reply; 50+ messages in thread
From: Daniel Vetter @ 2016-01-11  8:31 UTC (permalink / raw)
  To: Tvrtko Ursulin; +Cc: Intel-gfx

On Fri, Jan 08, 2016 at 11:29:43AM +0000, Tvrtko Ursulin wrote:
> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> 
> LRCA can change only when it goes from unpinned to pinned so it
> makes sense to check its alignment at that point rather than at
> every batch buffer submission.
> 
> Furthermore, if we check it at pin time we can actually
> gracefuly fail the engine initialization rather than just
> spamming the logs at runtime with WARNs.
> 
> v2: Return ENODEV for bad alignment. (Chris Wilson)
> 
> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> ---
>  drivers/gpu/drm/i915/intel_lrc.c | 9 +++++++--
>  1 file changed, 7 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
> index 84977a6e6f3f..ff146a15d395 100644
> --- a/drivers/gpu/drm/i915/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/intel_lrc.c
> @@ -302,8 +302,6 @@ uint64_t intel_lr_context_descriptor(struct intel_context *ctx,
>  	uint64_t lrca = i915_gem_obj_ggtt_offset(ctx_obj) +
>  			LRC_PPHWSP_PN * PAGE_SIZE;
>  
> -	WARN_ON(lrca & 0xFFFFFFFF00000FFFULL);
> -
>  	desc |= lrca;
>  	desc |= (u64)intel_execlists_ctx_id(ctx_obj) << GEN8_CTX_ID_SHIFT;
>  
> @@ -1030,6 +1028,7 @@ static int intel_lr_context_do_pin(struct intel_engine_cs *ring,
>  {
>  	struct drm_device *dev = ring->dev;
>  	struct drm_i915_private *dev_priv = dev->dev_private;
> +	u64 lrca;
>  	int ret = 0;
>  
>  	WARN_ON(!mutex_is_locked(&ring->dev->struct_mutex));
> @@ -1038,6 +1037,12 @@ static int intel_lr_context_do_pin(struct intel_engine_cs *ring,
>  	if (ret)
>  		return ret;
>  
> +	lrca = i915_gem_obj_ggtt_offset(ctx_obj) + LRC_PPHWSP_PN * PAGE_SIZE;
> +	if (WARN_ON(lrca & 0xFFFFFFFF00000FFFULL)) {

Essentially this checks that it's page-aligned (which is a fundamental
assumption of how we place objects we depend upon everywhere) and that it
fits within the 4G hw limit of the global gtt (again we assume our code is
correct that way). tbh I'd just drop entirely, it's a useless check.
-Daniel

> +		ret = -ENODEV;
> +		goto unpin_ctx_obj;
> +	}
> +
>  	ret = intel_pin_and_map_ringbuffer_obj(ring->dev, ringbuf);
>  	if (ret)
>  		goto unpin_ctx_obj;
> -- 
> 1.9.1
> 
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 05/13] drm/i915: Cache LRCA in the context
  2016-01-08 11:29 ` [PATCH 05/13] drm/i915: Cache LRCA in the context Tvrtko Ursulin
@ 2016-01-11  8:38   ` Daniel Vetter
  2016-01-11 19:41     ` Dave Gordon
  0 siblings, 1 reply; 50+ messages in thread
From: Daniel Vetter @ 2016-01-11  8:38 UTC (permalink / raw)
  To: Tvrtko Ursulin; +Cc: Intel-gfx

On Fri, Jan 08, 2016 at 11:29:44AM +0000, Tvrtko Ursulin wrote:
> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> 
> We are not allowed to call i915_gem_obj_ggtt_offset from irq
> context without the big kernel lock held.
> 
> LRCA lifetime is well defined so cache it so it can be looked up
> cheaply from the interrupt context and at command submission
> time.
> 
> v2: Added irq context reasoning to the commit message. (Daniel Vetter)
> 
> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

A i915_obj_for_each_vma macro with a
WARN_ON(!mutex_is_locked(dev->struct_mutex)) would be awesome to validate
this. Especially since this is by far not the only time I've seen this
bug. Needs to be a follow-up though to avoid stalling this fix.

> ---
>  drivers/gpu/drm/i915/i915_debugfs.c | 15 ++++++--------
>  drivers/gpu/drm/i915/i915_drv.h     |  1 +
>  drivers/gpu/drm/i915/intel_lrc.c    | 40 ++++++++++++++++---------------------
>  drivers/gpu/drm/i915/intel_lrc.h    |  3 ++-
>  4 files changed, 26 insertions(+), 33 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
> index 3b05bd189eab..714a45cf8a51 100644
> --- a/drivers/gpu/drm/i915/i915_debugfs.c
> +++ b/drivers/gpu/drm/i915/i915_debugfs.c
> @@ -1976,12 +1976,13 @@ static int i915_context_status(struct seq_file *m, void *unused)
>  }
>  
>  static void i915_dump_lrc_obj(struct seq_file *m,
> -			      struct intel_engine_cs *ring,
> -			      struct drm_i915_gem_object *ctx_obj)
> +			      struct intel_context *ctx,
> +			      struct intel_engine_cs *ring)
>  {
>  	struct page *page;
>  	uint32_t *reg_state;
>  	int j;
> +	struct drm_i915_gem_object *ctx_obj = ctx->engine[ring->id].state;
>  	unsigned long ggtt_offset = 0;
>  
>  	if (ctx_obj == NULL) {
> @@ -1991,7 +1992,7 @@ static void i915_dump_lrc_obj(struct seq_file *m,
>  	}
>  
>  	seq_printf(m, "CONTEXT: %s %u\n", ring->name,
> -		   intel_execlists_ctx_id(ctx_obj));
> +		   intel_execlists_ctx_id(ctx, ring));
>  
>  	if (!i915_gem_obj_ggtt_bound(ctx_obj))
>  		seq_puts(m, "\tNot bound in GGTT\n");
> @@ -2040,8 +2041,7 @@ static int i915_dump_lrc(struct seq_file *m, void *unused)
>  	list_for_each_entry(ctx, &dev_priv->context_list, link) {
>  		for_each_ring(ring, dev_priv, i) {
>  			if (ring->default_context != ctx)
> -				i915_dump_lrc_obj(m, ring,
> -						  ctx->engine[i].state);
> +				i915_dump_lrc_obj(m, ctx, ring);
>  		}
>  	}
>  
> @@ -2115,11 +2115,8 @@ static int i915_execlists(struct seq_file *m, void *data)
>  
>  		seq_printf(m, "\t%d requests in queue\n", count);
>  		if (head_req) {
> -			struct drm_i915_gem_object *ctx_obj;
> -
> -			ctx_obj = head_req->ctx->engine[ring_id].state;
>  			seq_printf(m, "\tHead request id: %u\n",
> -				   intel_execlists_ctx_id(ctx_obj));
> +				   intel_execlists_ctx_id(head_req->ctx, ring));
>  			seq_printf(m, "\tHead request tail: %u\n",
>  				   head_req->tail);
>  		}
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index 8cf655c6fc03..b77a5d84eac2 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -881,6 +881,7 @@ struct intel_context {
>  		struct drm_i915_gem_object *state;
>  		struct intel_ringbuffer *ringbuf;
>  		int pin_count;
> +		u32 lrca;

lrc_offset imo. Consistent with our other usage in the driver, and
actually readable. Please apply liberally everywhere else (I know that
bsepc calls it lrca, but we don't need to follow bad naming styles
blindly).

With that Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>

>  	} engine[I915_NUM_RINGS];
>  
>  	struct list_head link;
> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
> index ff146a15d395..ffe004de22b0 100644
> --- a/drivers/gpu/drm/i915/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/intel_lrc.c
> @@ -265,7 +265,8 @@ int intel_sanitize_enable_execlists(struct drm_device *dev, int enable_execlists
>  
>  /**
>   * intel_execlists_ctx_id() - get the Execlists Context ID
> - * @ctx_obj: Logical Ring Context backing object.
> + * @ctx: Context to get the ID for
> + * @ring: Engine to get the ID for
>   *
>   * Do not confuse with ctx->id! Unfortunately we have a name overload
>   * here: the old context ID we pass to userspace as a handler so that
> @@ -275,14 +276,12 @@ int intel_sanitize_enable_execlists(struct drm_device *dev, int enable_execlists
>   *
>   * Return: 20-bits globally unique context ID.
>   */
> -u32 intel_execlists_ctx_id(struct drm_i915_gem_object *ctx_obj)
> +u32 intel_execlists_ctx_id(struct intel_context *ctx,
> +			   struct intel_engine_cs *ring)
>  {
> -	u32 lrca = i915_gem_obj_ggtt_offset(ctx_obj) +
> -			LRC_PPHWSP_PN * PAGE_SIZE;
> -
>  	/* LRCA is required to be 4K aligned so the more significant 20 bits
>  	 * are globally unique */
> -	return lrca >> 12;
> +	return ctx->engine[ring->id].lrca >> 12;
>  }
>  
>  static bool disable_lite_restore_wa(struct intel_engine_cs *ring)
> @@ -297,13 +296,11 @@ static bool disable_lite_restore_wa(struct intel_engine_cs *ring)
>  uint64_t intel_lr_context_descriptor(struct intel_context *ctx,
>  				     struct intel_engine_cs *ring)
>  {
> -	struct drm_i915_gem_object *ctx_obj = ctx->engine[ring->id].state;
> +	uint64_t lrca = ctx->engine[ring->id].lrca;
>  	uint64_t desc = ring->ctx_desc_template;
> -	uint64_t lrca = i915_gem_obj_ggtt_offset(ctx_obj) +
> -			LRC_PPHWSP_PN * PAGE_SIZE;
>  
>  	desc |= lrca;
> -	desc |= (u64)intel_execlists_ctx_id(ctx_obj) << GEN8_CTX_ID_SHIFT;
> +	desc |= (u64)intel_execlists_ctx_id(ctx, ring) << GEN8_CTX_ID_SHIFT;
>  
>  	return desc;
>  }
> @@ -461,9 +458,7 @@ static bool execlists_check_remove_request(struct intel_engine_cs *ring,
>  					    execlist_link);
>  
>  	if (head_req != NULL) {
> -		struct drm_i915_gem_object *ctx_obj =
> -				head_req->ctx->engine[ring->id].state;
> -		if (intel_execlists_ctx_id(ctx_obj) == request_id) {
> +		if (intel_execlists_ctx_id(head_req->ctx, ring) == request_id) {
>  			WARN(head_req->elsp_submitted == 0,
>  			     "Never submitted head request\n");
>  
> @@ -1023,15 +1018,17 @@ int logical_ring_flush_all_caches(struct drm_i915_gem_request *req)
>  }
>  
>  static int intel_lr_context_do_pin(struct intel_engine_cs *ring,
> -		struct drm_i915_gem_object *ctx_obj,
> -		struct intel_ringbuffer *ringbuf)
> +				   struct intel_context *ctx)
>  {
>  	struct drm_device *dev = ring->dev;
>  	struct drm_i915_private *dev_priv = dev->dev_private;
> +	struct drm_i915_gem_object *ctx_obj = ctx->engine[ring->id].state;
> +	struct intel_ringbuffer *ringbuf = ctx->engine[ring->id].ringbuf;
>  	u64 lrca;
> -	int ret = 0;
> +	int ret;
>  
>  	WARN_ON(!mutex_is_locked(&ring->dev->struct_mutex));
> +
>  	ret = i915_gem_obj_ggtt_pin(ctx_obj, GEN8_LR_CONTEXT_ALIGN,
>  			PIN_OFFSET_BIAS | GUC_WOPCM_TOP);
>  	if (ret)
> @@ -1047,6 +1044,7 @@ static int intel_lr_context_do_pin(struct intel_engine_cs *ring,
>  	if (ret)
>  		goto unpin_ctx_obj;
>  
> +	ctx->engine[ring->id].lrca = lrca;
>  	ctx_obj->dirty = true;
>  
>  	/* Invalidate GuC TLB. */
> @@ -1065,11 +1063,9 @@ static int intel_lr_context_pin(struct drm_i915_gem_request *rq)
>  {
>  	int ret = 0;
>  	struct intel_engine_cs *ring = rq->ring;
> -	struct drm_i915_gem_object *ctx_obj = rq->ctx->engine[ring->id].state;
> -	struct intel_ringbuffer *ringbuf = rq->ringbuf;
>  
>  	if (rq->ctx->engine[ring->id].pin_count++ == 0) {
> -		ret = intel_lr_context_do_pin(ring, ctx_obj, ringbuf);
> +		ret = intel_lr_context_do_pin(ring, rq->ctx);
>  		if (ret)
>  			goto reset_pin_count;
>  	}
> @@ -1091,6 +1087,7 @@ void intel_lr_context_unpin(struct drm_i915_gem_request *rq)
>  		if (--rq->ctx->engine[ring->id].pin_count == 0) {
>  			intel_unpin_ringbuffer_obj(ringbuf);
>  			i915_gem_object_ggtt_unpin(ctx_obj);
> +			rq->ctx->engine[ring->id].lrca = 0;
>  		}
>  	}
>  }
> @@ -1960,10 +1957,7 @@ static int logical_ring_init(struct drm_device *dev, struct intel_engine_cs *rin
>  		goto error;
>  
>  	/* As this is the default context, always pin it */
> -	ret = intel_lr_context_do_pin(
> -			ring,
> -			ring->default_context->engine[ring->id].state,
> -			ring->default_context->engine[ring->id].ringbuf);
> +	ret = intel_lr_context_do_pin(ring, ring->default_context);
>  	if (ret) {
>  		DRM_ERROR(
>  			"Failed to pin and map ringbuffer %s: %d\n",
> diff --git a/drivers/gpu/drm/i915/intel_lrc.h b/drivers/gpu/drm/i915/intel_lrc.h
> index de41ad6cd63d..f9db5f187d77 100644
> --- a/drivers/gpu/drm/i915/intel_lrc.h
> +++ b/drivers/gpu/drm/i915/intel_lrc.h
> @@ -106,6 +106,8 @@ void intel_lr_context_reset(struct drm_device *dev,
>  			struct intel_context *ctx);
>  uint64_t intel_lr_context_descriptor(struct intel_context *ctx,
>  				     struct intel_engine_cs *ring);
> +u32 intel_execlists_ctx_id(struct intel_context *ctx,
> +			   struct intel_engine_cs *ring);
>  
>  /* Execlists */
>  int intel_sanitize_enable_execlists(struct drm_device *dev, int enable_execlists);
> @@ -113,7 +115,6 @@ struct i915_execbuffer_params;
>  int intel_execlists_submission(struct i915_execbuffer_params *params,
>  			       struct drm_i915_gem_execbuffer2 *args,
>  			       struct list_head *vmas);
> -u32 intel_execlists_ctx_id(struct drm_i915_gem_object *ctx_obj);
>  
>  void intel_lrc_irq_handler(struct intel_engine_cs *ring);
>  void intel_execlists_retire_requests(struct intel_engine_cs *ring);
> -- 
> 1.9.1
> 
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 06/13] drm/i915: Only grab timestamps when needed
  2016-01-08 11:29 ` [PATCH 06/13] drm/i915: Only grab timestamps when needed Tvrtko Ursulin
@ 2016-01-11  8:42   ` Daniel Vetter
  2016-01-11  9:45     ` Tvrtko Ursulin
  0 siblings, 1 reply; 50+ messages in thread
From: Daniel Vetter @ 2016-01-11  8:42 UTC (permalink / raw)
  To: Tvrtko Ursulin; +Cc: Intel-gfx

On Fri, Jan 08, 2016 at 11:29:45AM +0000, Tvrtko Ursulin wrote:
> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> 
> No need to call ktime_get_raw_ns twice per unlimited wait and can
> also elimate a local variable.
> 
> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> ---
>  drivers/gpu/drm/i915/i915_gem.c | 12 +++++++-----
>  1 file changed, 7 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index de98dc41fb9f..c4f69579eb7a 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -1246,7 +1246,7 @@ int __i915_wait_request(struct drm_i915_gem_request *req,
>  	int state = interruptible ? TASK_INTERRUPTIBLE : TASK_UNINTERRUPTIBLE;
>  	DEFINE_WAIT(wait);
>  	unsigned long timeout_expire;
> -	s64 before, now;
> +	s64 before = 0;

Is gcc really this dense? Should be easy for it to spot that both branches
depend upon the same condition. Please remove that assignment. With that
changed:

Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>

>  	int ret;
>  
>  	WARN(!intel_irqs_enabled(dev_priv), "IRQs disabled");
> @@ -1266,14 +1266,17 @@ int __i915_wait_request(struct drm_i915_gem_request *req,
>  			return -ETIME;
>  
>  		timeout_expire = jiffies + nsecs_to_jiffies_timeout(*timeout);
> +
> +		/*
> +		 * Record current time in case interrupted by signal, or wedged.
> +		 */
> +		before = ktime_get_raw_ns();
>  	}
>  
>  	if (INTEL_INFO(dev_priv)->gen >= 6)
>  		gen6_rps_boost(dev_priv, rps, req->emitted_jiffies);
>  
> -	/* Record current time in case interrupted by signal, or wedged */
>  	trace_i915_gem_request_wait_begin(req);
> -	before = ktime_get_raw_ns();
>  
>  	/* Optimistic spin for the next jiffie before touching IRQs */
>  	ret = __i915_spin_request(req, state);
> @@ -1331,11 +1334,10 @@ int __i915_wait_request(struct drm_i915_gem_request *req,
>  	finish_wait(&ring->irq_queue, &wait);
>  
>  out:
> -	now = ktime_get_raw_ns();
>  	trace_i915_gem_request_wait_end(req);
>  
>  	if (timeout) {
> -		s64 tres = *timeout - (now - before);
> +		s64 tres = *timeout - (ktime_get_raw_ns() - before);
>  
>  		*timeout = tres < 0 ? 0 : tres;
>  
> -- 
> 1.9.1
> 
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 07/13] drm/i915: Introduce dedicated object VMA iterator
  2016-01-08 13:29   ` Tvrtko Ursulin
@ 2016-01-11  8:43     ` Daniel Vetter
  2016-01-11  9:51       ` Tvrtko Ursulin
  0 siblings, 1 reply; 50+ messages in thread
From: Daniel Vetter @ 2016-01-11  8:43 UTC (permalink / raw)
  To: Tvrtko Ursulin; +Cc: Daniel Vetter, Intel-gfx

On Fri, Jan 08, 2016 at 01:29:14PM +0000, Tvrtko Ursulin wrote:
> 
> On 08/01/16 11:29, Tvrtko Ursulin wrote:
> >From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> >
> >Purpose is to catch places which iterate the object VMA list
> >without holding the big lock.
> >
> >Implemented by open coding list_for_each_entry to make the
> >macro compatible with existing call sites.
> >
> >Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> >Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
> >---
> >  drivers/gpu/drm/i915/i915_debugfs.c      |  8 ++++----
> >  drivers/gpu/drm/i915/i915_drv.h          |  6 ++++++
> >  drivers/gpu/drm/i915/i915_gem.c          | 24 ++++++++++++------------
> >  drivers/gpu/drm/i915/i915_gem_gtt.c      |  2 +-
> >  drivers/gpu/drm/i915/i915_gem_shrinker.c |  2 +-
> >  drivers/gpu/drm/i915/i915_gpu_error.c    |  4 ++--
> >  6 files changed, 26 insertions(+), 20 deletions(-)
> >
> >diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
> >index 714a45cf8a51..d7c2a3201161 100644
> >--- a/drivers/gpu/drm/i915/i915_debugfs.c
> >+++ b/drivers/gpu/drm/i915/i915_debugfs.c
> >@@ -117,7 +117,7 @@ static u64 i915_gem_obj_total_ggtt_size(struct drm_i915_gem_object *obj)
> >  	u64 size = 0;
> >  	struct i915_vma *vma;
> >
> >-	list_for_each_entry(vma, &obj->vma_list, vma_link) {
> >+	i915_gem_obj_for_each_vma(vma, obj) {
> >  		if (i915_is_ggtt(vma->vm) &&
> >  		    drm_mm_node_allocated(&vma->node))
> >  			size += vma->node.size;
> >@@ -155,7 +155,7 @@ describe_obj(struct seq_file *m, struct drm_i915_gem_object *obj)
> >  		   obj->madv == I915_MADV_DONTNEED ? " purgeable" : "");
> >  	if (obj->base.name)
> >  		seq_printf(m, " (name: %d)", obj->base.name);
> >-	list_for_each_entry(vma, &obj->vma_list, vma_link) {
> >+	i915_gem_obj_for_each_vma(vma, obj) {
> >  		if (vma->pin_count > 0)
> >  			pin_count++;
> >  	}
> >@@ -164,7 +164,7 @@ describe_obj(struct seq_file *m, struct drm_i915_gem_object *obj)
> >  		seq_printf(m, " (display)");
> >  	if (obj->fence_reg != I915_FENCE_REG_NONE)
> >  		seq_printf(m, " (fence: %d)", obj->fence_reg);
> >-	list_for_each_entry(vma, &obj->vma_list, vma_link) {
> >+	i915_gem_obj_for_each_vma(vma, obj) {
> >  		seq_printf(m, " (%sgtt offset: %08llx, size: %08llx",
> >  			   i915_is_ggtt(vma->vm) ? "g" : "pp",
> >  			   vma->node.start, vma->node.size);
> >@@ -342,7 +342,7 @@ static int per_file_stats(int id, void *ptr, void *data)
> >  		stats->shared += obj->base.size;
> >
> >  	if (USES_FULL_PPGTT(obj->base.dev)) {
> >-		list_for_each_entry(vma, &obj->vma_list, vma_link) {
> >+		i915_gem_obj_for_each_vma(vma, obj) {
> >  			struct i915_hw_ppgtt *ppgtt;
> >
> >  			if (!drm_mm_node_allocated(&vma->node))
> >diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> >index b77a5d84eac2..0406a020dfcc 100644
> >--- a/drivers/gpu/drm/i915/i915_drv.h
> >+++ b/drivers/gpu/drm/i915/i915_drv.h
> >@@ -2852,6 +2852,12 @@ struct drm_i915_gem_object *i915_gem_object_create_from_data(
> >  void i915_gem_free_object(struct drm_gem_object *obj);
> >  void i915_gem_vma_destroy(struct i915_vma *vma);
> >
> >+#define i915_gem_obj_for_each_vma(vma, obj) \
> >+	for (WARN_ON_ONCE(!mutex_is_locked(&(obj)->base.dev->struct_mutex)), \
> >+	     vma = list_first_entry(&(obj)->vma_list, typeof(*vma), vma_link);\
> >+	     &vma->vma_link != (&(obj)->vma_list); \
> >+	     vma = list_next_entry(vma, vma_link))
> >+
> 
> 
> Unfortunately error capture is not happy with this approach. Can't even see
> that error capture attempts to grab the mutex anywhere.
> 
> So what? Drop the idea or add a "doing error capture" flag somewhere?

Fix the bugs. Not surprise at all that we've screwed this up all over the
place ;-) Afaics modeset code isn't much better either ...
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 07/13] drm/i915: Introduce dedicated object VMA iterator
  2016-01-08 11:44   ` Chris Wilson
@ 2016-01-11  8:48     ` Daniel Vetter
  0 siblings, 0 replies; 50+ messages in thread
From: Daniel Vetter @ 2016-01-11  8:48 UTC (permalink / raw)
  To: Chris Wilson, Tvrtko Ursulin, Intel-gfx, Daniel Vetter

On Fri, Jan 08, 2016 at 11:44:04AM +0000, Chris Wilson wrote:
> On Fri, Jan 08, 2016 at 11:29:46AM +0000, Tvrtko Ursulin wrote:
> > From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> > 
> > Purpose is to catch places which iterate the object VMA list
> > without holding the big lock.
> > 
> > Implemented by open coding list_for_each_entry to make the
> > macro compatible with existing call sites.
> > 
> > Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> > Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
> > +#define i915_gem_obj_for_each_vma(vma, obj) \
> > +	for (WARN_ON_ONCE(!mutex_is_locked(&(obj)->base.dev->struct_mutex)), \
> 
> Let's not go around adding WARN(!mutex_locked) to GEM code when
> lockdep_assert_held doesn't add overhead outside of testing.

Hm yeah I still prefere WARN_ON for modeset code (where it doesn't matter)
because of increased test coverage. But for gem it indeed makes more sense
to only do this for lockdep-enabled builds. CI runs with lockdep, so we're
good.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 11/13] drm/i915: Cache ringbuffer GTT address
  2016-01-08 11:29 ` [PATCH 11/13] drm/i915: Cache ringbuffer GTT address Tvrtko Ursulin
  2016-01-08 11:37   ` Chris Wilson
@ 2016-01-11  8:49   ` Daniel Vetter
  2016-01-12 11:42     ` [PATCH v3 3/7] drm/i915: Cache ringbuffer GTT VMA Tvrtko Ursulin
  2016-01-11 16:16   ` [PATCH 11/13] drm/i915: Cache ringbuffer GTT address Dave Gordon
  2 siblings, 1 reply; 50+ messages in thread
From: Daniel Vetter @ 2016-01-11  8:49 UTC (permalink / raw)
  To: Tvrtko Ursulin; +Cc: Intel-gfx

On Fri, Jan 08, 2016 at 11:29:50AM +0000, Tvrtko Ursulin wrote:
> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> 
> Purpose is to avoid calling i915_gem_obj_ggtt_offset from the
> interrupt context without the big lock held.
> 
> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> ---
>  drivers/gpu/drm/i915/intel_lrc.c        | 3 +--
>  drivers/gpu/drm/i915/intel_ringbuffer.c | 3 +++
>  drivers/gpu/drm/i915/intel_ringbuffer.h | 1 +
>  3 files changed, 5 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
> index 5b3795815d8e..70c511ef6b12 100644
> --- a/drivers/gpu/drm/i915/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/intel_lrc.c
> @@ -345,7 +345,6 @@ static int execlists_update_context(struct drm_i915_gem_request *rq)
>  	struct intel_engine_cs *ring = rq->ring;
>  	struct i915_hw_ppgtt *ppgtt = rq->ctx->ppgtt;
>  	struct drm_i915_gem_object *ctx_obj = rq->ctx->engine[ring->id].state;
> -	struct drm_i915_gem_object *rb_obj = rq->ringbuf->obj;
>  	struct page *page;
>  	uint32_t *reg_state;
>  
> @@ -355,7 +354,7 @@ static int execlists_update_context(struct drm_i915_gem_request *rq)
>  	reg_state = kmap_atomic(page);
>  
>  	reg_state[CTX_RING_TAIL+1] = rq->tail;
> -	reg_state[CTX_RING_BUFFER_START+1] = i915_gem_obj_ggtt_offset(rb_obj);
> +	reg_state[CTX_RING_BUFFER_START+1] = rq->ringbuf->gtt_start;
>  
>  	if (ppgtt && !USES_FULL_48BIT_PPGTT(ppgtt->base.dev)) {
>  		/* True 32b PPGTT with dynamic page allocation: update PDP
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
> index 339701d7a9a5..9094ce254125 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> @@ -1988,6 +1988,7 @@ void intel_unpin_ringbuffer_obj(struct intel_ringbuffer *ringbuf)
>  	else
>  		iounmap(ringbuf->virtual_start);
>  	ringbuf->virtual_start = NULL;
> +	ringbuf->gtt_start = 0;
>  	i915_gem_object_ggtt_unpin(ringbuf->obj);
>  }
>  
> @@ -2054,6 +2055,8 @@ int intel_pin_and_map_ringbuffer_obj(struct drm_device *dev,
>  		}
>  	}
>  
> +	ringbuf->gtt_start = i915_gem_obj_ggtt_offset(obj);
> +
>  	return 0;
>  }
>  
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
> index 0b91a4b77359..25d3716228ae 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.h
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
> @@ -98,6 +98,7 @@ struct intel_ring_hangcheck {
>  struct intel_ringbuffer {
>  	struct drm_i915_gem_object *obj;
>  	void __iomem *virtual_start;
> +	u64 gtt_start;

gtt_offset, because consistency. Or vma, as Chris suggested.
-Daniel

>  
>  	struct intel_engine_cs *ring;
>  	struct list_head link;
> -- 
> 1.9.1
> 
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 09/13] drm/i915: Remove two impossible asserts
  2016-01-08 11:29 ` [PATCH 09/13] drm/i915: Remove two impossible asserts Tvrtko Ursulin
@ 2016-01-11  8:49   ` Daniel Vetter
  0 siblings, 0 replies; 50+ messages in thread
From: Daniel Vetter @ 2016-01-11  8:49 UTC (permalink / raw)
  To: Tvrtko Ursulin; +Cc: Intel-gfx

On Fri, Jan 08, 2016 at 11:29:48AM +0000, Tvrtko Ursulin wrote:
> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> 
> Engine initialization would have failed if those two weren't
> pinned and calling i915_gem_obj_is_pinned is illegal from irq
> context without the big lock held.
> 
> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>

> ---
>  drivers/gpu/drm/i915/intel_lrc.c | 2 --
>  1 file changed, 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
> index ffe004de22b0..5b3795815d8e 100644
> --- a/drivers/gpu/drm/i915/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/intel_lrc.c
> @@ -350,8 +350,6 @@ static int execlists_update_context(struct drm_i915_gem_request *rq)
>  	uint32_t *reg_state;
>  
>  	BUG_ON(!ctx_obj);
> -	WARN_ON(!i915_gem_obj_is_pinned(ctx_obj));
> -	WARN_ON(!i915_gem_obj_is_pinned(rb_obj));
>  
>  	page = i915_gem_object_get_dirty_page(ctx_obj, LRC_STATE_PN);
>  	reg_state = kmap_atomic(page);
> -- 
> 1.9.1
> 
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 03/13] drm/i915: Avoid invariant conditionals in lrc interrupt handler
  2016-01-11  8:29   ` Daniel Vetter
@ 2016-01-11  9:43     ` Tvrtko Ursulin
  0 siblings, 0 replies; 50+ messages in thread
From: Tvrtko Ursulin @ 2016-01-11  9:43 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: Intel-gfx


On 11/01/16 08:29, Daniel Vetter wrote:
> On Fri, Jan 08, 2016 at 11:29:42AM +0000, Tvrtko Ursulin wrote:
>> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>
>> There is no need to check on what Gen we are running on every
>> interrupt and every command submission. We can instead set up
>> some of that when engines are initialized, store it in the
>> engine structure and use it directly at runtime.
>>
>> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>> ---
>>   drivers/gpu/drm/i915/intel_lrc.c        | 36 ++++++++++++++++++---------------
>>   drivers/gpu/drm/i915/intel_ringbuffer.h |  2 ++
>>   2 files changed, 22 insertions(+), 16 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
>> index 8b6071fcd743..84977a6e6f3f 100644
>> --- a/drivers/gpu/drm/i915/intel_lrc.c
>> +++ b/drivers/gpu/drm/i915/intel_lrc.c
>> @@ -298,29 +298,15 @@ uint64_t intel_lr_context_descriptor(struct intel_context *ctx,
>>   				     struct intel_engine_cs *ring)
>>   {
>>   	struct drm_i915_gem_object *ctx_obj = ctx->engine[ring->id].state;
>> -	uint64_t desc;
>> +	uint64_t desc = ring->ctx_desc_template;
>>   	uint64_t lrca = i915_gem_obj_ggtt_offset(ctx_obj) +
>>   			LRC_PPHWSP_PN * PAGE_SIZE;
>>
>>   	WARN_ON(lrca & 0xFFFFFFFF00000FFFULL);
>>
>> -	desc = GEN8_CTX_VALID;
>> -	desc |= GEN8_CTX_ADDRESSING_MODE(dev) << GEN8_CTX_ADDRESSING_MODE_SHIFT;
>> -	if (IS_GEN8(ctx_obj->base.dev))
>> -		desc |= GEN8_CTX_L3LLC_COHERENT;
>> -	desc |= GEN8_CTX_PRIVILEGE;
>>   	desc |= lrca;
>>   	desc |= (u64)intel_execlists_ctx_id(ctx_obj) << GEN8_CTX_ID_SHIFT;
>>
>> -	/* TODO: WaDisableLiteRestore when we start using semaphore
>> -	 * signalling between Command Streamers */
>> -	/* desc |= GEN8_CTX_FORCE_RESTORE; */
>> -
>> -	/* WaEnableForceRestoreInCtxtDescForVCS:skl */
>> -	/* WaEnableForceRestoreInCtxtDescForVCS:bxt */
>> -	if (disable_lite_restore_wa(ring))
>> -		desc |= GEN8_CTX_FORCE_RESTORE;
>> -
>>   	return desc;
>>   }
>
> tbh I'd go full monty and just cache the entire context descriptor.

Yeah you're right - this was more a consequence of me learning about 
this code as making small cleanups.

Regards,

Tvrtko
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 50+ messages in thread

* ✗ failure: Fi.CI.BAT
  2016-01-08 11:29 [PATCH v2 00/13] Misc cleanups and locking fixes Tvrtko Ursulin
                   ` (13 preceding siblings ...)
  2016-01-08 11:38 ` [PATCH v2 00/13] Misc cleanups and locking fixes Chris Wilson
@ 2016-01-11  9:44 ` Patchwork
  14 siblings, 0 replies; 50+ messages in thread
From: Patchwork @ 2016-01-11  9:44 UTC (permalink / raw)
  To: Tvrtko Ursulin; +Cc: intel-gfx

== Summary ==

Built on ff88655b3a5467bbc3be8c67d3e05ebf182557d3 drm-intel-nightly: 2016y-01m-11d-07h-30m-16s UTC integration manifest

Test gem_basic:
        Subgroup create-close:
                pass       -> DMESG-WARN (skl-i7k-2)
Test gem_cpu_reloc:
        Subgroup basic:
                pass       -> DMESG-FAIL (skl-i7k-2)
Test gem_ctx_param_basic:
        Subgroup basic:
                pass       -> DMESG-WARN (skl-i7k-2)
        Subgroup invalid-param-set:
                pass       -> DMESG-WARN (skl-i7k-2)
        Subgroup non-root-set-no-zeromap:
                pass       -> DMESG-WARN (skl-i7k-2)
        Subgroup root-set-no-zeromap-disabled:
                pass       -> DMESG-WARN (skl-i7k-2)
Test gem_mmap:
        Subgroup basic:
                pass       -> DMESG-WARN (skl-i7k-2)
Test gem_mmap_gtt:
        Subgroup basic-read:
                pass       -> DMESG-WARN (skl-i7k-2)
        Subgroup basic-write:
                pass       -> DMESG-WARN (skl-i7k-2)
Test gem_storedw_loop:
        Subgroup basic-render:
                dmesg-warn -> PASS       (skl-i7k-2) UNSTABLE
Test kms_addfb_basic:
        Subgroup addfb25-modifier-no-flag:
                pass       -> DMESG-WARN (skl-i7k-2)
        Subgroup addfb25-x-tiled-mismatch:
                pass       -> DMESG-WARN (skl-i7k-2)
        Subgroup addfb25-yf-tiled:
                pass       -> DMESG-WARN (skl-i7k-2)
        Subgroup bad-pitch-1024:
                pass       -> DMESG-WARN (skl-i7k-2)
        Subgroup bad-pitch-63:
                pass       -> DMESG-WARN (skl-i7k-2)
        Subgroup bad-pitch-999:
                pass       -> DMESG-WARN (skl-i7k-2)
        Subgroup clobberred-modifier:
                pass       -> DMESG-WARN (skl-i7k-2)
        Subgroup too-high:
                pass       -> DMESG-WARN (skl-i7k-2)
        Subgroup too-wide:
                pass       -> DMESG-WARN (skl-i7k-2)
        Subgroup unused-offsets:
                pass       -> DMESG-WARN (skl-i7k-2)
Test kms_flip:
        Subgroup basic-flip-vs-dpms:
                dmesg-warn -> PASS       (ilk-hp8440p)
        Subgroup basic-plain-flip:
                pass       -> DMESG-FAIL (skl-i7k-2)
Test kms_pipe_crc_basic:
        Subgroup hang-read-crc-pipe-a:
                pass       -> DMESG-WARN (skl-i5k-2)
                pass       -> DMESG-WARN (bdw-ultra)
                pass       -> DMESG-WARN (skl-i7k-2)
                pass       -> DMESG-WARN (byt-nuc)
                pass       -> DMESG-WARN (snb-x220t)
                pass       -> DMESG-WARN (snb-dellxps)
                pass       -> DMESG-WARN (hsw-brixbox)
                pass       -> DMESG-WARN (ilk-hp8440p)
        Subgroup nonblocking-crc-pipe-a-frame-sequence:
                pass       -> DMESG-FAIL (skl-i7k-2)
        Subgroup read-crc-pipe-b:
                dmesg-warn -> PASS       (byt-nuc)
        Subgroup read-crc-pipe-b-frame-sequence:
                pass       -> DMESG-FAIL (skl-i7k-2)
Test prime_self_import:
        Subgroup basic-with_two_bos:
                pass       -> DMESG-WARN (skl-i7k-2)

bdw-ultra        total:138  pass:129  dwarn:3   dfail:0   fail:0   skip:6  
bsw-nuc-2        total:141  pass:113  dwarn:4   dfail:0   fail:0   skip:24 
byt-nuc          total:141  pass:117  dwarn:9   dfail:0   fail:0   skip:15 
hsw-brixbox      total:141  pass:132  dwarn:2   dfail:0   fail:0   skip:7  
ilk-hp8440p      total:141  pass:100  dwarn:4   dfail:0   fail:0   skip:37 
skl-i5k-2        total:141  pass:130  dwarn:3   dfail:0   fail:0   skip:8  
skl-i7k-2        total:141  pass:106  dwarn:22  dfail:4   fail:0   skip:8  
snb-dellxps      total:141  pass:121  dwarn:6   dfail:0   fail:0   skip:14 
snb-x220t        total:141  pass:121  dwarn:6   dfail:0   fail:1   skip:13 

HANGED hsw-gt2 in igt@kms_pipe_crc_basic@suspend-read-crc-pipe-c

Results at /archive/results/CI_IGT_test/Patchwork_1115/

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 06/13] drm/i915: Only grab timestamps when needed
  2016-01-11  8:42   ` Daniel Vetter
@ 2016-01-11  9:45     ` Tvrtko Ursulin
  0 siblings, 0 replies; 50+ messages in thread
From: Tvrtko Ursulin @ 2016-01-11  9:45 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: Intel-gfx


On 11/01/16 08:42, Daniel Vetter wrote:
> On Fri, Jan 08, 2016 at 11:29:45AM +0000, Tvrtko Ursulin wrote:
>> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>
>> No need to call ktime_get_raw_ns twice per unlimited wait and can
>> also elimate a local variable.
>>
>> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>> ---
>>   drivers/gpu/drm/i915/i915_gem.c | 12 +++++++-----
>>   1 file changed, 7 insertions(+), 5 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
>> index de98dc41fb9f..c4f69579eb7a 100644
>> --- a/drivers/gpu/drm/i915/i915_gem.c
>> +++ b/drivers/gpu/drm/i915/i915_gem.c
>> @@ -1246,7 +1246,7 @@ int __i915_wait_request(struct drm_i915_gem_request *req,
>>   	int state = interruptible ? TASK_INTERRUPTIBLE : TASK_UNINTERRUPTIBLE;
>>   	DEFINE_WAIT(wait);
>>   	unsigned long timeout_expire;
>> -	s64 before, now;
>> +	s64 before = 0;
>
> Is gcc really this dense? Should be easy for it to spot that both branches
> depend upon the same condition. Please remove that assignment. With that
> changed:
>
> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>

It is this dense, at least gcc 4.8.4 on my machine. :(

Do you want to remove it regardless of the warning?

Rrgards,

Tvrtko
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 07/13] drm/i915: Introduce dedicated object VMA iterator
  2016-01-11  8:43     ` Daniel Vetter
@ 2016-01-11  9:51       ` Tvrtko Ursulin
  2016-01-12 16:19         ` Daniel Vetter
  0 siblings, 1 reply; 50+ messages in thread
From: Tvrtko Ursulin @ 2016-01-11  9:51 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: Daniel Vetter, Intel-gfx


On 11/01/16 08:43, Daniel Vetter wrote:
> On Fri, Jan 08, 2016 at 01:29:14PM +0000, Tvrtko Ursulin wrote:
>>
>> On 08/01/16 11:29, Tvrtko Ursulin wrote:
>>> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>>
>>> Purpose is to catch places which iterate the object VMA list
>>> without holding the big lock.
>>>
>>> Implemented by open coding list_for_each_entry to make the
>>> macro compatible with existing call sites.
>>>
>>> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>> Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
>>> ---
>>>   drivers/gpu/drm/i915/i915_debugfs.c      |  8 ++++----
>>>   drivers/gpu/drm/i915/i915_drv.h          |  6 ++++++
>>>   drivers/gpu/drm/i915/i915_gem.c          | 24 ++++++++++++------------
>>>   drivers/gpu/drm/i915/i915_gem_gtt.c      |  2 +-
>>>   drivers/gpu/drm/i915/i915_gem_shrinker.c |  2 +-
>>>   drivers/gpu/drm/i915/i915_gpu_error.c    |  4 ++--
>>>   6 files changed, 26 insertions(+), 20 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
>>> index 714a45cf8a51..d7c2a3201161 100644
>>> --- a/drivers/gpu/drm/i915/i915_debugfs.c
>>> +++ b/drivers/gpu/drm/i915/i915_debugfs.c
>>> @@ -117,7 +117,7 @@ static u64 i915_gem_obj_total_ggtt_size(struct drm_i915_gem_object *obj)
>>>   	u64 size = 0;
>>>   	struct i915_vma *vma;
>>>
>>> -	list_for_each_entry(vma, &obj->vma_list, vma_link) {
>>> +	i915_gem_obj_for_each_vma(vma, obj) {
>>>   		if (i915_is_ggtt(vma->vm) &&
>>>   		    drm_mm_node_allocated(&vma->node))
>>>   			size += vma->node.size;
>>> @@ -155,7 +155,7 @@ describe_obj(struct seq_file *m, struct drm_i915_gem_object *obj)
>>>   		   obj->madv == I915_MADV_DONTNEED ? " purgeable" : "");
>>>   	if (obj->base.name)
>>>   		seq_printf(m, " (name: %d)", obj->base.name);
>>> -	list_for_each_entry(vma, &obj->vma_list, vma_link) {
>>> +	i915_gem_obj_for_each_vma(vma, obj) {
>>>   		if (vma->pin_count > 0)
>>>   			pin_count++;
>>>   	}
>>> @@ -164,7 +164,7 @@ describe_obj(struct seq_file *m, struct drm_i915_gem_object *obj)
>>>   		seq_printf(m, " (display)");
>>>   	if (obj->fence_reg != I915_FENCE_REG_NONE)
>>>   		seq_printf(m, " (fence: %d)", obj->fence_reg);
>>> -	list_for_each_entry(vma, &obj->vma_list, vma_link) {
>>> +	i915_gem_obj_for_each_vma(vma, obj) {
>>>   		seq_printf(m, " (%sgtt offset: %08llx, size: %08llx",
>>>   			   i915_is_ggtt(vma->vm) ? "g" : "pp",
>>>   			   vma->node.start, vma->node.size);
>>> @@ -342,7 +342,7 @@ static int per_file_stats(int id, void *ptr, void *data)
>>>   		stats->shared += obj->base.size;
>>>
>>>   	if (USES_FULL_PPGTT(obj->base.dev)) {
>>> -		list_for_each_entry(vma, &obj->vma_list, vma_link) {
>>> +		i915_gem_obj_for_each_vma(vma, obj) {
>>>   			struct i915_hw_ppgtt *ppgtt;
>>>
>>>   			if (!drm_mm_node_allocated(&vma->node))
>>> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
>>> index b77a5d84eac2..0406a020dfcc 100644
>>> --- a/drivers/gpu/drm/i915/i915_drv.h
>>> +++ b/drivers/gpu/drm/i915/i915_drv.h
>>> @@ -2852,6 +2852,12 @@ struct drm_i915_gem_object *i915_gem_object_create_from_data(
>>>   void i915_gem_free_object(struct drm_gem_object *obj);
>>>   void i915_gem_vma_destroy(struct i915_vma *vma);
>>>
>>> +#define i915_gem_obj_for_each_vma(vma, obj) \
>>> +	for (WARN_ON_ONCE(!mutex_is_locked(&(obj)->base.dev->struct_mutex)), \
>>> +	     vma = list_first_entry(&(obj)->vma_list, typeof(*vma), vma_link);\
>>> +	     &vma->vma_link != (&(obj)->vma_list); \
>>> +	     vma = list_next_entry(vma, vma_link))
>>> +
>>
>>
>> Unfortunately error capture is not happy with this approach. Can't even see
>> that error capture attempts to grab the mutex anywhere.
>>
>> So what? Drop the idea or add a "doing error capture" flag somewhere?
> 
> Fix the bugs. Not surprise at all that we've screwed this up all over the
> place ;-) Afaics modeset code isn't much better either ...

Ok I'll drop this patch then since the series contains fixes to all but one
related issues. The remaining one is then:

[   17.370366] ------------[ cut here ]------------
[   17.375633] WARNING: CPU: 0 PID: 1128 at drivers/gpu/drm/i915/i915_gem.c:5166 i915_gem_obj_ggtt_offset_view+0x10f/0x120 [i915]()
[   17.388879] WARN_ON_ONCE(!mutex_is_locked(&(o)->base.dev->struct_mutex))
[   17.396364] Modules linked in: hid_generic usbhid coretemp asix usbnet libphy mii i915 gpio_lynxpoint i2c_hid hid video i2c_algo_bit drm_kms_helper acpi_pad drm lpc_ich mfd_core nls_iso8859_1 e1000e ptp ahci libahci pps_core
[   17.419484] CPU: 0 PID: 1128 Comm: Xorg Tainted: G     U          4.4.0-rc8-160107+ #105
[   17.428771] Hardware name: Intel Corporation Broadwell Client platform/WhiteTip Mountain 1, BIOS BDW-E1R1.86C.0080.R01.1406120446 06/12/2014
[   17.443161]  ffffffffa0227790 ffff8800a98439b8 ffffffff81280d82 ffff8800a9843a00
[   17.451677]  ffff8800a98439f0 ffffffff81049c8c ffff8801495d0000 ffff8800aa934900
[   17.460166]  ffff8801495d8668 ffffffffa0242520 ffff8800aacea000 ffff8800a9843a50
[   17.468674] Call Trace:
[   17.471470]  [<ffffffff81280d82>] dump_stack+0x4b/0x79
[   17.477355]  [<ffffffff81049c8c>] warn_slowpath_common+0x7c/0xc0
[   17.484255]  [<ffffffff81049d17>] warn_slowpath_fmt+0x47/0x50
[   17.490869]  [<ffffffffa018f7ef>] i915_gem_obj_ggtt_offset_view+0x10f/0x120 [i915]
[   17.499572]  [<ffffffffa01a7290>] ? gen9_write8+0x2d0/0x2d0 [i915]
[   17.506663]  [<ffffffffa01bdbfe>] ironlake_update_primary_plane+0x1ee/0x3a0 [i915]
[   17.515358]  [<ffffffffa01ace1f>] intel_plane_atomic_update+0x5f/0x70 [i915]
[   17.523391]  [<ffffffffa00ef122>] drm_atomic_helper_commit_planes_on_crtc+0x142/0x230 [drm_kms_helper]
[   17.534063]  [<ffffffffa01c1744>] intel_atomic_commit+0x424/0x2270 [i915]
[   17.541763]  [<ffffffff814ee04a>] ? __ww_mutex_lock+0x4a/0x82
[   17.548355]  [<ffffffffa0097edb>] ? drm_atomic_check_only+0x12b/0x5e0 [drm]
[   17.556352]  [<ffffffffa0097a07>] ? drm_atomic_set_crtc_for_connector+0x77/0xf0 [drm]
[   17.565346]  [<ffffffffa00983c2>] drm_atomic_commit+0x32/0x50 [drm]
[   17.572553]  [<ffffffffa00ef782>] drm_atomic_helper_set_config+0x72/0xb0 [drm_kms_helper]
[   17.581954]  [<ffffffffa0088e8f>] drm_mode_set_config_internal+0x5f/0x100 [drm]
[   17.590313]  [<ffffffffa008cdc8>] drm_mode_setcrtc+0xd8/0x500 [drm]
[   17.597491]  [<ffffffffa007f5d8>] drm_ioctl+0x258/0x4f0 [drm]
[   17.604065]  [<ffffffff810c34ac>] ? unlock_page+0x4c/0x50
[   17.610245]  [<ffffffffa008ccf0>] ? drm_mode_setplane+0x1c0/0x1c0 [drm]
[   17.617844]  [<ffffffff8111597d>] do_vfs_ioctl+0x2cd/0x4a0
[   17.624137]  [<ffffffff81052572>] ? recalc_sigpending+0x12/0x40
[   17.630943]  [<ffffffff8111eb32>] ? __fget+0x72/0xb0
[   17.636645]  [<ffffffff81115b8c>] SyS_ioctl+0x3c/0x70
[   17.642448]  [<ffffffff81055301>] ? SyS_rt_sigprocmask+0x81/0xa0
[   17.649334]  [<ffffffff814f0017>] entry_SYSCALL_64_fastpath+0x12/0x6a
[   17.656740] ---[ end trace 3f29190006c97826 ]---

Would you like a BZ for this or maybe it is known already?

Regards,

Tvrtko
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 04/13] drm/i915: Fail engine initialization if LRCA is incorrectly aligned
  2016-01-11  8:31   ` Daniel Vetter
@ 2016-01-11 16:02     ` Dave Gordon
  2016-01-11 23:31       ` Chris Wilson
  0 siblings, 1 reply; 50+ messages in thread
From: Dave Gordon @ 2016-01-11 16:02 UTC (permalink / raw)
  To: Daniel Vetter, Tvrtko Ursulin; +Cc: Intel-gfx

On 11/01/16 08:31, Daniel Vetter wrote:
> On Fri, Jan 08, 2016 at 11:29:43AM +0000, Tvrtko Ursulin wrote:
>> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>
>> LRCA can change only when it goes from unpinned to pinned so it
>> makes sense to check its alignment at that point rather than at
>> every batch buffer submission.
>>
>> Furthermore, if we check it at pin time we can actually
>> gracefuly fail the engine initialization rather than just
>> spamming the logs at runtime with WARNs.
>>
>> v2: Return ENODEV for bad alignment. (Chris Wilson)
>>
>> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>> ---
>>   drivers/gpu/drm/i915/intel_lrc.c | 9 +++++++--
>>   1 file changed, 7 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
>> index 84977a6e6f3f..ff146a15d395 100644
>> --- a/drivers/gpu/drm/i915/intel_lrc.c
>> +++ b/drivers/gpu/drm/i915/intel_lrc.c
>> @@ -302,8 +302,6 @@ uint64_t intel_lr_context_descriptor(struct intel_context *ctx,
>>   	uint64_t lrca = i915_gem_obj_ggtt_offset(ctx_obj) +
>>   			LRC_PPHWSP_PN * PAGE_SIZE;
>>
>> -	WARN_ON(lrca & 0xFFFFFFFF00000FFFULL);
>> -
>>   	desc |= lrca;
>>   	desc |= (u64)intel_execlists_ctx_id(ctx_obj) << GEN8_CTX_ID_SHIFT;
>>
>> @@ -1030,6 +1028,7 @@ static int intel_lr_context_do_pin(struct intel_engine_cs *ring,
>>   {
>>   	struct drm_device *dev = ring->dev;
>>   	struct drm_i915_private *dev_priv = dev->dev_private;
>> +	u64 lrca;
>>   	int ret = 0;
>>
>>   	WARN_ON(!mutex_is_locked(&ring->dev->struct_mutex));
>> @@ -1038,6 +1037,12 @@ static int intel_lr_context_do_pin(struct intel_engine_cs *ring,
>>   	if (ret)
>>   		return ret;
>>
>> +	lrca = i915_gem_obj_ggtt_offset(ctx_obj) + LRC_PPHWSP_PN * PAGE_SIZE;
>> +	if (WARN_ON(lrca & 0xFFFFFFFF00000FFFULL)) {
>
> Essentially this checks that it's page-aligned (which is a fundamental
> assumption of how we place objects we depend upon everywhere) and that it
> fits within the 4G hw limit of the global gtt (again we assume our code is
> correct that way). tbh I'd just drop entirely, it's a useless check.
> -Daniel

IIRC the original version of this WARN (in intel_lr_context_descriptor()
above) was added with the GuC submission code, because the context 
descriptor as used in execlist code is a 64-bit value, but the GuC 
requires that all the unique stuff fits in those 20 unmasked bits of a 
32-bit value, with the low 12 bits being used for flags. So we wanted to 
check that we never got a context ID that couldn't be pruned down to 
just those 20 bits without losing information. It's never been seen to 
happen since GuC development finished, so we can reasonably lose the 
check now.

Unless anyone wants to be prepared for the expansion of the GTT beyond 4Gb!

.Dave.
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 11/13] drm/i915: Cache ringbuffer GTT address
  2016-01-08 11:29 ` [PATCH 11/13] drm/i915: Cache ringbuffer GTT address Tvrtko Ursulin
  2016-01-08 11:37   ` Chris Wilson
  2016-01-11  8:49   ` Daniel Vetter
@ 2016-01-11 16:16   ` Dave Gordon
  2016-01-11 17:44     ` Tvrtko Ursulin
  2 siblings, 1 reply; 50+ messages in thread
From: Dave Gordon @ 2016-01-11 16:16 UTC (permalink / raw)
  To: Tvrtko Ursulin, Intel-gfx

On 08/01/16 11:29, Tvrtko Ursulin wrote:
> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>
> Purpose is to avoid calling i915_gem_obj_ggtt_offset from the
> interrupt context without the big lock held.
>
> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> ---
>   drivers/gpu/drm/i915/intel_lrc.c        | 3 +--
>   drivers/gpu/drm/i915/intel_ringbuffer.c | 3 +++
>   drivers/gpu/drm/i915/intel_ringbuffer.h | 1 +
>   3 files changed, 5 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
> index 5b3795815d8e..70c511ef6b12 100644
> --- a/drivers/gpu/drm/i915/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/intel_lrc.c
> @@ -345,7 +345,6 @@ static int execlists_update_context(struct drm_i915_gem_request *rq)
>   	struct intel_engine_cs *ring = rq->ring;
>   	struct i915_hw_ppgtt *ppgtt = rq->ctx->ppgtt;
>   	struct drm_i915_gem_object *ctx_obj = rq->ctx->engine[ring->id].state;
> -	struct drm_i915_gem_object *rb_obj = rq->ringbuf->obj;
>   	struct page *page;
>   	uint32_t *reg_state;
>
> @@ -355,7 +354,7 @@ static int execlists_update_context(struct drm_i915_gem_request *rq)
>   	reg_state = kmap_atomic(page);
>
>   	reg_state[CTX_RING_TAIL+1] = rq->tail;
> -	reg_state[CTX_RING_BUFFER_START+1] = i915_gem_obj_ggtt_offset(rb_obj);
> +	reg_state[CTX_RING_BUFFER_START+1] = rq->ringbuf->gtt_start;
>
>   	if (ppgtt && !USES_FULL_48BIT_PPGTT(ppgtt->base.dev)) {
>   		/* True 32b PPGTT with dynamic page allocation: update PDP
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
> index 339701d7a9a5..9094ce254125 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> @@ -1988,6 +1988,7 @@ void intel_unpin_ringbuffer_obj(struct intel_ringbuffer *ringbuf)
>   	else
>   		iounmap(ringbuf->virtual_start);
>   	ringbuf->virtual_start = NULL;
> +	ringbuf->gtt_start = 0;

Zero isn't an invalid GTT address (although it does cause confusion).
Maybe we should define a suitable value to indicate not-an-address?
Anything with low bits set, or above 4Gb, is not valid, so perhaps:

#define	I915_GTT_BAD_ADDRESS		(~(u64)0)

.Dave.

>   	i915_gem_object_ggtt_unpin(ringbuf->obj);
>   }
>
> @@ -2054,6 +2055,8 @@ int intel_pin_and_map_ringbuffer_obj(struct drm_device *dev,
>   		}
>   	}
>
> +	ringbuf->gtt_start = i915_gem_obj_ggtt_offset(obj);
> +
>   	return 0;
>   }
>
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
> index 0b91a4b77359..25d3716228ae 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.h
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
> @@ -98,6 +98,7 @@ struct intel_ring_hangcheck {
>   struct intel_ringbuffer {
>   	struct drm_i915_gem_object *obj;
>   	void __iomem *virtual_start;
> +	u64 gtt_start;
>
>   	struct intel_engine_cs *ring;
>   	struct list_head link;
>

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 11/13] drm/i915: Cache ringbuffer GTT address
  2016-01-11 16:16   ` [PATCH 11/13] drm/i915: Cache ringbuffer GTT address Dave Gordon
@ 2016-01-11 17:44     ` Tvrtko Ursulin
  0 siblings, 0 replies; 50+ messages in thread
From: Tvrtko Ursulin @ 2016-01-11 17:44 UTC (permalink / raw)
  To: Dave Gordon, Intel-gfx


On 11/01/16 16:16, Dave Gordon wrote:
> On 08/01/16 11:29, Tvrtko Ursulin wrote:
>> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>
>> Purpose is to avoid calling i915_gem_obj_ggtt_offset from the
>> interrupt context without the big lock held.
>>
>> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>> ---
>>   drivers/gpu/drm/i915/intel_lrc.c        | 3 +--
>>   drivers/gpu/drm/i915/intel_ringbuffer.c | 3 +++
>>   drivers/gpu/drm/i915/intel_ringbuffer.h | 1 +
>>   3 files changed, 5 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/i915/intel_lrc.c
>> b/drivers/gpu/drm/i915/intel_lrc.c
>> index 5b3795815d8e..70c511ef6b12 100644
>> --- a/drivers/gpu/drm/i915/intel_lrc.c
>> +++ b/drivers/gpu/drm/i915/intel_lrc.c
>> @@ -345,7 +345,6 @@ static int execlists_update_context(struct
>> drm_i915_gem_request *rq)
>>       struct intel_engine_cs *ring = rq->ring;
>>       struct i915_hw_ppgtt *ppgtt = rq->ctx->ppgtt;
>>       struct drm_i915_gem_object *ctx_obj =
>> rq->ctx->engine[ring->id].state;
>> -    struct drm_i915_gem_object *rb_obj = rq->ringbuf->obj;
>>       struct page *page;
>>       uint32_t *reg_state;
>>
>> @@ -355,7 +354,7 @@ static int execlists_update_context(struct
>> drm_i915_gem_request *rq)
>>       reg_state = kmap_atomic(page);
>>
>>       reg_state[CTX_RING_TAIL+1] = rq->tail;
>> -    reg_state[CTX_RING_BUFFER_START+1] =
>> i915_gem_obj_ggtt_offset(rb_obj);
>> +    reg_state[CTX_RING_BUFFER_START+1] = rq->ringbuf->gtt_start;
>>
>>       if (ppgtt && !USES_FULL_48BIT_PPGTT(ppgtt->base.dev)) {
>>           /* True 32b PPGTT with dynamic page allocation: update PDP
>> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c
>> b/drivers/gpu/drm/i915/intel_ringbuffer.c
>> index 339701d7a9a5..9094ce254125 100644
>> --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
>> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
>> @@ -1988,6 +1988,7 @@ void intel_unpin_ringbuffer_obj(struct
>> intel_ringbuffer *ringbuf)
>>       else
>>           iounmap(ringbuf->virtual_start);
>>       ringbuf->virtual_start = NULL;
>> +    ringbuf->gtt_start = 0;
>
> Zero isn't an invalid GTT address (although it does cause confusion).
> Maybe we should define a suitable value to indicate not-an-address?
> Anything with low bits set, or above 4Gb, is not valid, so perhaps:

Oh yes, it crossed my mind and then I forgot about it, so thanks!

> #define    I915_GTT_BAD_ADDRESS        (~(u64)0)

It would match current behaviour of i915_gem_obj_offset(_view) of 
returning -1 when VMA is not found so I think it is good enough to go 
with this approach.

Regards,

Tvrtko
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 05/13] drm/i915: Cache LRCA in the context
  2016-01-11  8:38   ` Daniel Vetter
@ 2016-01-11 19:41     ` Dave Gordon
  2016-01-12  9:59       ` Tvrtko Ursulin
  0 siblings, 1 reply; 50+ messages in thread
From: Dave Gordon @ 2016-01-11 19:41 UTC (permalink / raw)
  To: Daniel Vetter, Tvrtko Ursulin; +Cc: Intel-gfx

On 11/01/16 08:38, Daniel Vetter wrote:
> On Fri, Jan 08, 2016 at 11:29:44AM +0000, Tvrtko Ursulin wrote:
>> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>
>> We are not allowed to call i915_gem_obj_ggtt_offset from irq
>> context without the big kernel lock held.
>>
>> LRCA lifetime is well defined so cache it so it can be looked up
>> cheaply from the interrupt context and at command submission
>> time.
>>
>> v2: Added irq context reasoning to the commit message. (Daniel Vetter)
>>
>> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>
> A i915_obj_for_each_vma macro with a
> WARN_ON(!mutex_is_locked(dev->struct_mutex)) would be awesome to validate
> this. Especially since this is by far not the only time I've seen this
> bug. Needs to be a follow-up though to avoid stalling this fix.
>
>> ---
>>   drivers/gpu/drm/i915/i915_debugfs.c | 15 ++++++--------
>>   drivers/gpu/drm/i915/i915_drv.h     |  1 +
>>   drivers/gpu/drm/i915/intel_lrc.c    | 40 ++++++++++++++++---------------------
>>   drivers/gpu/drm/i915/intel_lrc.h    |  3 ++-
>>   4 files changed, 26 insertions(+), 33 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
>> index 3b05bd189eab..714a45cf8a51 100644
>> --- a/drivers/gpu/drm/i915/i915_debugfs.c
>> +++ b/drivers/gpu/drm/i915/i915_debugfs.c
>> @@ -1976,12 +1976,13 @@ static int i915_context_status(struct seq_file *m, void *unused)
>>   }
>>
>>   static void i915_dump_lrc_obj(struct seq_file *m,
>> -			      struct intel_engine_cs *ring,
>> -			      struct drm_i915_gem_object *ctx_obj)
>> +			      struct intel_context *ctx,
>> +			      struct intel_engine_cs *ring)
>>   {
>>   	struct page *page;
>>   	uint32_t *reg_state;
>>   	int j;
>> +	struct drm_i915_gem_object *ctx_obj = ctx->engine[ring->id].state;
>>   	unsigned long ggtt_offset = 0;
>>
>>   	if (ctx_obj == NULL) {
>> @@ -1991,7 +1992,7 @@ static void i915_dump_lrc_obj(struct seq_file *m,
>>   	}
>>
>>   	seq_printf(m, "CONTEXT: %s %u\n", ring->name,
>> -		   intel_execlists_ctx_id(ctx_obj));
>> +		   intel_execlists_ctx_id(ctx, ring));
>>
>>   	if (!i915_gem_obj_ggtt_bound(ctx_obj))
>>   		seq_puts(m, "\tNot bound in GGTT\n");
>> @@ -2040,8 +2041,7 @@ static int i915_dump_lrc(struct seq_file *m, void *unused)
>>   	list_for_each_entry(ctx, &dev_priv->context_list, link) {
>>   		for_each_ring(ring, dev_priv, i) {
>>   			if (ring->default_context != ctx)
>> -				i915_dump_lrc_obj(m, ring,
>> -						  ctx->engine[i].state);
>> +				i915_dump_lrc_obj(m, ctx, ring);
>>   		}
>>   	}
>>
>> @@ -2115,11 +2115,8 @@ static int i915_execlists(struct seq_file *m, void *data)
>>
>>   		seq_printf(m, "\t%d requests in queue\n", count);
>>   		if (head_req) {
>> -			struct drm_i915_gem_object *ctx_obj;
>> -
>> -			ctx_obj = head_req->ctx->engine[ring_id].state;
>>   			seq_printf(m, "\tHead request id: %u\n",
>> -				   intel_execlists_ctx_id(ctx_obj));
>> +				   intel_execlists_ctx_id(head_req->ctx, ring));
>>   			seq_printf(m, "\tHead request tail: %u\n",
>>   				   head_req->tail);
>>   		}
>> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
>> index 8cf655c6fc03..b77a5d84eac2 100644
>> --- a/drivers/gpu/drm/i915/i915_drv.h
>> +++ b/drivers/gpu/drm/i915/i915_drv.h
>> @@ -881,6 +881,7 @@ struct intel_context {
>>   		struct drm_i915_gem_object *state;
>>   		struct intel_ringbuffer *ringbuf;
>>   		int pin_count;
>> +		u32 lrca;
>
> lrc_offset imo. Consistent with our other usage in the driver, and
> actually readable. Please apply liberally everywhere else (I know that
> bsepc calls it lrca, but we don't need to follow bad naming styles
> blindly).
>
> With that Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
>
>>   	} engine[I915_NUM_RINGS];
>>
>>   	struct list_head link;
>> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
>> index ff146a15d395..ffe004de22b0 100644
>> --- a/drivers/gpu/drm/i915/intel_lrc.c
>> +++ b/drivers/gpu/drm/i915/intel_lrc.c
>> @@ -265,7 +265,8 @@ int intel_sanitize_enable_execlists(struct drm_device *dev, int enable_execlists
>>
>>   /**
>>    * intel_execlists_ctx_id() - get the Execlists Context ID
>> - * @ctx_obj: Logical Ring Context backing object.
>> + * @ctx: Context to get the ID for
>> + * @ring: Engine to get the ID for
>>    *
>>    * Do not confuse with ctx->id! Unfortunately we have a name overload
>>    * here: the old context ID we pass to userspace as a handler so that
>> @@ -275,14 +276,12 @@ int intel_sanitize_enable_execlists(struct drm_device *dev, int enable_execlists
>>    *
>>    * Return: 20-bits globally unique context ID.
>>    */
>> -u32 intel_execlists_ctx_id(struct drm_i915_gem_object *ctx_obj)
>> +u32 intel_execlists_ctx_id(struct intel_context *ctx,
>> +			   struct intel_engine_cs *ring)
>>   {
>> -	u32 lrca = i915_gem_obj_ggtt_offset(ctx_obj) +
>> -			LRC_PPHWSP_PN * PAGE_SIZE;
>> -
>>   	/* LRCA is required to be 4K aligned so the more significant 20 bits
>>   	 * are globally unique */
>> -	return lrca >> 12;
>> +	return ctx->engine[ring->id].lrca >> 12;
>>   }
>>
>>   static bool disable_lite_restore_wa(struct intel_engine_cs *ring)
>> @@ -297,13 +296,11 @@ static bool disable_lite_restore_wa(struct intel_engine_cs *ring)
>>   uint64_t intel_lr_context_descriptor(struct intel_context *ctx,
>>   				     struct intel_engine_cs *ring)
>>   {
>> -	struct drm_i915_gem_object *ctx_obj = ctx->engine[ring->id].state;
>> +	uint64_t lrca = ctx->engine[ring->id].lrca;
>>   	uint64_t desc = ring->ctx_desc_template;
>> -	uint64_t lrca = i915_gem_obj_ggtt_offset(ctx_obj) +
>> -			LRC_PPHWSP_PN * PAGE_SIZE;
>>
>>   	desc |= lrca;
>> -	desc |= (u64)intel_execlists_ctx_id(ctx_obj) << GEN8_CTX_ID_SHIFT;
>> +	desc |= (u64)intel_execlists_ctx_id(ctx, ring) << GEN8_CTX_ID_SHIFT;
>>
>>   	return desc;
>>   }
>> @@ -461,9 +458,7 @@ static bool execlists_check_remove_request(struct intel_engine_cs *ring,
>>   					    execlist_link);
>>
>>   	if (head_req != NULL) {
>> -		struct drm_i915_gem_object *ctx_obj =
>> -				head_req->ctx->engine[ring->id].state;
>> -		if (intel_execlists_ctx_id(ctx_obj) == request_id) {
>> +		if (intel_execlists_ctx_id(head_req->ctx, ring) == request_id) {
>>   			WARN(head_req->elsp_submitted == 0,
>>   			     "Never submitted head request\n");
>>
>> @@ -1023,15 +1018,17 @@ int logical_ring_flush_all_caches(struct drm_i915_gem_request *req)
>>   }
>>
>>   static int intel_lr_context_do_pin(struct intel_engine_cs *ring,
>> -		struct drm_i915_gem_object *ctx_obj,
>> -		struct intel_ringbuffer *ringbuf)
>> +				   struct intel_context *ctx)
>>   {
>>   	struct drm_device *dev = ring->dev;
>>   	struct drm_i915_private *dev_priv = dev->dev_private;
>> +	struct drm_i915_gem_object *ctx_obj = ctx->engine[ring->id].state;
>> +	struct intel_ringbuffer *ringbuf = ctx->engine[ring->id].ringbuf;
>>   	u64 lrca;
>> -	int ret = 0;
>> +	int ret;
>>
>>   	WARN_ON(!mutex_is_locked(&ring->dev->struct_mutex));
>> +
>>   	ret = i915_gem_obj_ggtt_pin(ctx_obj, GEN8_LR_CONTEXT_ALIGN,
>>   			PIN_OFFSET_BIAS | GUC_WOPCM_TOP);
>>   	if (ret)
>> @@ -1047,6 +1044,7 @@ static int intel_lr_context_do_pin(struct intel_engine_cs *ring,
>>   	if (ret)
>>   		goto unpin_ctx_obj;
>>
>> +	ctx->engine[ring->id].lrca = lrca;
>>   	ctx_obj->dirty = true;
>>
>>   	/* Invalidate GuC TLB. */
>> @@ -1065,11 +1063,9 @@ static int intel_lr_context_pin(struct drm_i915_gem_request *rq)
>>   {
>>   	int ret = 0;
>>   	struct intel_engine_cs *ring = rq->ring;
>> -	struct drm_i915_gem_object *ctx_obj = rq->ctx->engine[ring->id].state;
>> -	struct intel_ringbuffer *ringbuf = rq->ringbuf;
>>
>>   	if (rq->ctx->engine[ring->id].pin_count++ == 0) {
>> -		ret = intel_lr_context_do_pin(ring, ctx_obj, ringbuf);
>> +		ret = intel_lr_context_do_pin(ring, rq->ctx);
>>   		if (ret)
>>   			goto reset_pin_count;
>>   	}
>> @@ -1091,6 +1087,7 @@ void intel_lr_context_unpin(struct drm_i915_gem_request *rq)
>>   		if (--rq->ctx->engine[ring->id].pin_count == 0) {
>>   			intel_unpin_ringbuffer_obj(ringbuf);
>>   			i915_gem_object_ggtt_unpin(ctx_obj);
>> +			rq->ctx->engine[ring->id].lrca = 0;
>>   		}
>>   	}
>>   }
>> @@ -1960,10 +1957,7 @@ static int logical_ring_init(struct drm_device *dev, struct intel_engine_cs *rin
>>   		goto error;
>>
>>   	/* As this is the default context, always pin it */
>> -	ret = intel_lr_context_do_pin(
>> -			ring,
>> -			ring->default_context->engine[ring->id].state,
>> -			ring->default_context->engine[ring->id].ringbuf);
>> +	ret = intel_lr_context_do_pin(ring, ring->default_context);
>>   	if (ret) {
>>   		DRM_ERROR(
>>   			"Failed to pin and map ringbuffer %s: %d\n",
>> diff --git a/drivers/gpu/drm/i915/intel_lrc.h b/drivers/gpu/drm/i915/intel_lrc.h
>> index de41ad6cd63d..f9db5f187d77 100644
>> --- a/drivers/gpu/drm/i915/intel_lrc.h
>> +++ b/drivers/gpu/drm/i915/intel_lrc.h
>> @@ -106,6 +106,8 @@ void intel_lr_context_reset(struct drm_device *dev,
>>   			struct intel_context *ctx);
>>   uint64_t intel_lr_context_descriptor(struct intel_context *ctx,
>>   				     struct intel_engine_cs *ring);
>> +u32 intel_execlists_ctx_id(struct intel_context *ctx,
>> +			   struct intel_engine_cs *ring);
>>
>>   /* Execlists */
>>   int intel_sanitize_enable_execlists(struct drm_device *dev, int enable_execlists);
>> @@ -113,7 +115,6 @@ struct i915_execbuffer_params;
>>   int intel_execlists_submission(struct i915_execbuffer_params *params,
>>   			       struct drm_i915_gem_execbuffer2 *args,
>>   			       struct list_head *vmas);
>> -u32 intel_execlists_ctx_id(struct drm_i915_gem_object *ctx_obj);
>>
>>   void intel_lrc_irq_handler(struct intel_engine_cs *ring);
>>   void intel_execlists_retire_requests(struct intel_engine_cs *ring);
>> --
>> 1.9.1
>>
>> _______________________________________________
>> Intel-gfx mailing list
>> Intel-gfx@lists.freedesktop.org
>> http://lists.freedesktop.org/mailman/listinfo/intel-gfx

How about caching the WHOLE descriptor so that retrieval becomes 
trivial, retrieving the context ID becomes almost as trivial, as
does retrieving the LRCA (all these could be inlined) and the single 
calculate-and-cache function can be nicely documented as to how a 
descriptor encodes LRCA etc. Something like this ...

/**
  * intel_lr_context_descriptor_update() - calculate & cache the
  *					  descriptor for a pinned
  * 					  context
  * @ctx: Context to work on
  * @ring: Engine the descriptor will be used with
  *
  * The context descriptor encodes various attributes of a context,
  * including its GTT address and some flags. Because it's fairly
  * expensive to calculate, we'll just do it once and cache the result,
  * which remains valid until the context is unpinned.
  *
  * This is what a descriptor looks like, from LSB to MSB:
  *	bits 0-11:	flags, GEN8_CTX_* (cached in ctx_desc_template)
  *	bits 12-31:	LRCA, GTT address of (the HWSP of) this context
  *	bits 32-51:	ctx ID, a globally unique tag (the LRCA again!)
  *	bits 52-63:	reserved, may encode the engine ID (for GuC)
  */
static void
intel_lr_context_descriptor_update(struct intel_context *ctx,
				   struct intel_engine_cs *ring)
{
         struct drm_i915_gem_object *ctx_obj;
	uint64_t lrca, desc;

	ctx_obj = ctx->engine[ring->id].state;
	lrca = i915_gem_obj_ggtt_offset(ctx_obj) + LRC_PPHWSP_PN * PAGE_SIZE;

	desc = ring->ctx_desc_template;			/* bits  0-11 */
	desc |= lrca;					/* bits 12-31 */
	desc |= (lrca >> PAGE_SHIFT) << 32;		/* bits 32-51 */

	ctx->engine[ring->id].ctx_desc = desc;
}

/**
  * intel_lr_context_descriptor() - get the descriptor for the specified
  * 				   context/engine combination
  * @ctx: Context to get the ID for
  * @ring: Engine to get the ID for
  *
  * Now just a trivial accessor, since the value was computed and cached
  * above at the point when the context was pinned.
  *
  * Return: 64-bit context descriptor (encodes LRCA and various flags)
  */
uint64_t inline
intel_lr_context_descriptor(struct intel_context *ctx,
			    struct intel_engine_cs *ring)
{
	return ctx->engine[ring->id].ctx_desc;
}

/**
* intel_lr_context_address() - get the GTT address of (the HWSP of) the
*			       specified context/engine combination
* @ctx: Context to get the ID for
* @ring: Engine to get the ID for
*
* The address (LRCA) is a portion of the context descriptor, so we can
* just extract the required part from the cached descriptor.
*
* Return: 32-bit GTT address
*/
uint32_t inline
intel_lr_context_address(struct intel_context *ctx,
			 struct intel_engine_cs *ring)
{
	return (u32)intel_lr_context_descriptor(ctx, ring) & PAGE_MASK;
}

/**
  * intel_execlists_ctx_id() - get the Execlists Context ID
  * @ctx: Context to get the ID for
  * @ring: Engine to get the ID for
  *
  * Do not confuse with ctx->id! Unfortunately we have a name overload
  * here: the old context ID we pass to userspace as a handler so that
  * they can refer to a context, and the new context ID we pass to the
  * ELSP so that the GPU can inform us of the context status via
  * interrupts.
  *
  * The context ID is a portion of the context descriptor, so we can
  * just extract the required part from the cached descriptor.
  *
  * Return: 20-bit globally unique ctx ID (actually, a GTT page number)
  */
uint32_t inline
intel_execlists_ctx_id(struct intel_context *ctx,
		       struct intel_engine_cs *ring)
{
	return (u32)(intel_lr_context_descriptor(ctx, ring) >> 32);
}

... and insert a call to intel_lr_context_descriptor_update() into the 
appropriate spot in intel_lr_context_do_pin().

.Dave.
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 04/13] drm/i915: Fail engine initialization if LRCA is incorrectly aligned
  2016-01-11 16:02     ` Dave Gordon
@ 2016-01-11 23:31       ` Chris Wilson
  0 siblings, 0 replies; 50+ messages in thread
From: Chris Wilson @ 2016-01-11 23:31 UTC (permalink / raw)
  To: Dave Gordon; +Cc: Intel-gfx

On Mon, Jan 11, 2016 at 04:02:09PM +0000, Dave Gordon wrote:
> IIRC the original version of this WARN (in intel_lr_context_descriptor()
> above) was added with the GuC submission code, because the context
> descriptor as used in execlist code is a 64-bit value, but the GuC
> requires that all the unique stuff fits in those 20 unmasked bits of
> a 32-bit value, with the low 12 bits being used for flags. So we
> wanted to check that we never got a context ID that couldn't be
> pruned down to just those 20 bits without losing information. It's
> never been seen to happen since GuC development finished, so we can
> reasonably lose the check now.

I am missing something here as the GuC doesn't use the high 32bits of
the context descriptor, i.e. it never touches the lrca portion?
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 05/13] drm/i915: Cache LRCA in the context
  2016-01-11 19:41     ` Dave Gordon
@ 2016-01-12  9:59       ` Tvrtko Ursulin
  0 siblings, 0 replies; 50+ messages in thread
From: Tvrtko Ursulin @ 2016-01-12  9:59 UTC (permalink / raw)
  To: Dave Gordon, Daniel Vetter; +Cc: Intel-gfx


On 11/01/16 19:41, Dave Gordon wrote:
> On 11/01/16 08:38, Daniel Vetter wrote:
>> On Fri, Jan 08, 2016 at 11:29:44AM +0000, Tvrtko Ursulin wrote:
>>> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>>
>>> We are not allowed to call i915_gem_obj_ggtt_offset from irq
>>> context without the big kernel lock held.
>>>
>>> LRCA lifetime is well defined so cache it so it can be looked up
>>> cheaply from the interrupt context and at command submission
>>> time.
>>>
>>> v2: Added irq context reasoning to the commit message. (Daniel Vetter)
>>>
>>> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>
>> A i915_obj_for_each_vma macro with a
>> WARN_ON(!mutex_is_locked(dev->struct_mutex)) would be awesome to validate
>> this. Especially since this is by far not the only time I've seen this
>> bug. Needs to be a follow-up though to avoid stalling this fix.
>>
>>> ---
>>>   drivers/gpu/drm/i915/i915_debugfs.c | 15 ++++++--------
>>>   drivers/gpu/drm/i915/i915_drv.h     |  1 +
>>>   drivers/gpu/drm/i915/intel_lrc.c    | 40
>>> ++++++++++++++++---------------------
>>>   drivers/gpu/drm/i915/intel_lrc.h    |  3 ++-
>>>   4 files changed, 26 insertions(+), 33 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/i915/i915_debugfs.c
>>> b/drivers/gpu/drm/i915/i915_debugfs.c
>>> index 3b05bd189eab..714a45cf8a51 100644
>>> --- a/drivers/gpu/drm/i915/i915_debugfs.c
>>> +++ b/drivers/gpu/drm/i915/i915_debugfs.c
>>> @@ -1976,12 +1976,13 @@ static int i915_context_status(struct
>>> seq_file *m, void *unused)
>>>   }
>>>
>>>   static void i915_dump_lrc_obj(struct seq_file *m,
>>> -                  struct intel_engine_cs *ring,
>>> -                  struct drm_i915_gem_object *ctx_obj)
>>> +                  struct intel_context *ctx,
>>> +                  struct intel_engine_cs *ring)
>>>   {
>>>       struct page *page;
>>>       uint32_t *reg_state;
>>>       int j;
>>> +    struct drm_i915_gem_object *ctx_obj = ctx->engine[ring->id].state;
>>>       unsigned long ggtt_offset = 0;
>>>
>>>       if (ctx_obj == NULL) {
>>> @@ -1991,7 +1992,7 @@ static void i915_dump_lrc_obj(struct seq_file *m,
>>>       }
>>>
>>>       seq_printf(m, "CONTEXT: %s %u\n", ring->name,
>>> -           intel_execlists_ctx_id(ctx_obj));
>>> +           intel_execlists_ctx_id(ctx, ring));
>>>
>>>       if (!i915_gem_obj_ggtt_bound(ctx_obj))
>>>           seq_puts(m, "\tNot bound in GGTT\n");
>>> @@ -2040,8 +2041,7 @@ static int i915_dump_lrc(struct seq_file *m,
>>> void *unused)
>>>       list_for_each_entry(ctx, &dev_priv->context_list, link) {
>>>           for_each_ring(ring, dev_priv, i) {
>>>               if (ring->default_context != ctx)
>>> -                i915_dump_lrc_obj(m, ring,
>>> -                          ctx->engine[i].state);
>>> +                i915_dump_lrc_obj(m, ctx, ring);
>>>           }
>>>       }
>>>
>>> @@ -2115,11 +2115,8 @@ static int i915_execlists(struct seq_file *m,
>>> void *data)
>>>
>>>           seq_printf(m, "\t%d requests in queue\n", count);
>>>           if (head_req) {
>>> -            struct drm_i915_gem_object *ctx_obj;
>>> -
>>> -            ctx_obj = head_req->ctx->engine[ring_id].state;
>>>               seq_printf(m, "\tHead request id: %u\n",
>>> -                   intel_execlists_ctx_id(ctx_obj));
>>> +                   intel_execlists_ctx_id(head_req->ctx, ring));
>>>               seq_printf(m, "\tHead request tail: %u\n",
>>>                      head_req->tail);
>>>           }
>>> diff --git a/drivers/gpu/drm/i915/i915_drv.h
>>> b/drivers/gpu/drm/i915/i915_drv.h
>>> index 8cf655c6fc03..b77a5d84eac2 100644
>>> --- a/drivers/gpu/drm/i915/i915_drv.h
>>> +++ b/drivers/gpu/drm/i915/i915_drv.h
>>> @@ -881,6 +881,7 @@ struct intel_context {
>>>           struct drm_i915_gem_object *state;
>>>           struct intel_ringbuffer *ringbuf;
>>>           int pin_count;
>>> +        u32 lrca;
>>
>> lrc_offset imo. Consistent with our other usage in the driver, and
>> actually readable. Please apply liberally everywhere else (I know that
>> bsepc calls it lrca, but we don't need to follow bad naming styles
>> blindly).
>>
>> With that Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
>>
>>>       } engine[I915_NUM_RINGS];
>>>
>>>       struct list_head link;
>>> diff --git a/drivers/gpu/drm/i915/intel_lrc.c
>>> b/drivers/gpu/drm/i915/intel_lrc.c
>>> index ff146a15d395..ffe004de22b0 100644
>>> --- a/drivers/gpu/drm/i915/intel_lrc.c
>>> +++ b/drivers/gpu/drm/i915/intel_lrc.c
>>> @@ -265,7 +265,8 @@ int intel_sanitize_enable_execlists(struct
>>> drm_device *dev, int enable_execlists
>>>
>>>   /**
>>>    * intel_execlists_ctx_id() - get the Execlists Context ID
>>> - * @ctx_obj: Logical Ring Context backing object.
>>> + * @ctx: Context to get the ID for
>>> + * @ring: Engine to get the ID for
>>>    *
>>>    * Do not confuse with ctx->id! Unfortunately we have a name overload
>>>    * here: the old context ID we pass to userspace as a handler so that
>>> @@ -275,14 +276,12 @@ int intel_sanitize_enable_execlists(struct
>>> drm_device *dev, int enable_execlists
>>>    *
>>>    * Return: 20-bits globally unique context ID.
>>>    */
>>> -u32 intel_execlists_ctx_id(struct drm_i915_gem_object *ctx_obj)
>>> +u32 intel_execlists_ctx_id(struct intel_context *ctx,
>>> +               struct intel_engine_cs *ring)
>>>   {
>>> -    u32 lrca = i915_gem_obj_ggtt_offset(ctx_obj) +
>>> -            LRC_PPHWSP_PN * PAGE_SIZE;
>>> -
>>>       /* LRCA is required to be 4K aligned so the more significant 20
>>> bits
>>>        * are globally unique */
>>> -    return lrca >> 12;
>>> +    return ctx->engine[ring->id].lrca >> 12;
>>>   }
>>>
>>>   static bool disable_lite_restore_wa(struct intel_engine_cs *ring)
>>> @@ -297,13 +296,11 @@ static bool disable_lite_restore_wa(struct
>>> intel_engine_cs *ring)
>>>   uint64_t intel_lr_context_descriptor(struct intel_context *ctx,
>>>                        struct intel_engine_cs *ring)
>>>   {
>>> -    struct drm_i915_gem_object *ctx_obj = ctx->engine[ring->id].state;
>>> +    uint64_t lrca = ctx->engine[ring->id].lrca;
>>>       uint64_t desc = ring->ctx_desc_template;
>>> -    uint64_t lrca = i915_gem_obj_ggtt_offset(ctx_obj) +
>>> -            LRC_PPHWSP_PN * PAGE_SIZE;
>>>
>>>       desc |= lrca;
>>> -    desc |= (u64)intel_execlists_ctx_id(ctx_obj) << GEN8_CTX_ID_SHIFT;
>>> +    desc |= (u64)intel_execlists_ctx_id(ctx, ring) <<
>>> GEN8_CTX_ID_SHIFT;
>>>
>>>       return desc;
>>>   }
>>> @@ -461,9 +458,7 @@ static bool execlists_check_remove_request(struct
>>> intel_engine_cs *ring,
>>>                           execlist_link);
>>>
>>>       if (head_req != NULL) {
>>> -        struct drm_i915_gem_object *ctx_obj =
>>> -                head_req->ctx->engine[ring->id].state;
>>> -        if (intel_execlists_ctx_id(ctx_obj) == request_id) {
>>> +        if (intel_execlists_ctx_id(head_req->ctx, ring) ==
>>> request_id) {
>>>               WARN(head_req->elsp_submitted == 0,
>>>                    "Never submitted head request\n");
>>>
>>> @@ -1023,15 +1018,17 @@ int logical_ring_flush_all_caches(struct
>>> drm_i915_gem_request *req)
>>>   }
>>>
>>>   static int intel_lr_context_do_pin(struct intel_engine_cs *ring,
>>> -        struct drm_i915_gem_object *ctx_obj,
>>> -        struct intel_ringbuffer *ringbuf)
>>> +                   struct intel_context *ctx)
>>>   {
>>>       struct drm_device *dev = ring->dev;
>>>       struct drm_i915_private *dev_priv = dev->dev_private;
>>> +    struct drm_i915_gem_object *ctx_obj = ctx->engine[ring->id].state;
>>> +    struct intel_ringbuffer *ringbuf = ctx->engine[ring->id].ringbuf;
>>>       u64 lrca;
>>> -    int ret = 0;
>>> +    int ret;
>>>
>>>       WARN_ON(!mutex_is_locked(&ring->dev->struct_mutex));
>>> +
>>>       ret = i915_gem_obj_ggtt_pin(ctx_obj, GEN8_LR_CONTEXT_ALIGN,
>>>               PIN_OFFSET_BIAS | GUC_WOPCM_TOP);
>>>       if (ret)
>>> @@ -1047,6 +1044,7 @@ static int intel_lr_context_do_pin(struct
>>> intel_engine_cs *ring,
>>>       if (ret)
>>>           goto unpin_ctx_obj;
>>>
>>> +    ctx->engine[ring->id].lrca = lrca;
>>>       ctx_obj->dirty = true;
>>>
>>>       /* Invalidate GuC TLB. */
>>> @@ -1065,11 +1063,9 @@ static int intel_lr_context_pin(struct
>>> drm_i915_gem_request *rq)
>>>   {
>>>       int ret = 0;
>>>       struct intel_engine_cs *ring = rq->ring;
>>> -    struct drm_i915_gem_object *ctx_obj =
>>> rq->ctx->engine[ring->id].state;
>>> -    struct intel_ringbuffer *ringbuf = rq->ringbuf;
>>>
>>>       if (rq->ctx->engine[ring->id].pin_count++ == 0) {
>>> -        ret = intel_lr_context_do_pin(ring, ctx_obj, ringbuf);
>>> +        ret = intel_lr_context_do_pin(ring, rq->ctx);
>>>           if (ret)
>>>               goto reset_pin_count;
>>>       }
>>> @@ -1091,6 +1087,7 @@ void intel_lr_context_unpin(struct
>>> drm_i915_gem_request *rq)
>>>           if (--rq->ctx->engine[ring->id].pin_count == 0) {
>>>               intel_unpin_ringbuffer_obj(ringbuf);
>>>               i915_gem_object_ggtt_unpin(ctx_obj);
>>> +            rq->ctx->engine[ring->id].lrca = 0;
>>>           }
>>>       }
>>>   }
>>> @@ -1960,10 +1957,7 @@ static int logical_ring_init(struct drm_device
>>> *dev, struct intel_engine_cs *rin
>>>           goto error;
>>>
>>>       /* As this is the default context, always pin it */
>>> -    ret = intel_lr_context_do_pin(
>>> -            ring,
>>> -            ring->default_context->engine[ring->id].state,
>>> -            ring->default_context->engine[ring->id].ringbuf);
>>> +    ret = intel_lr_context_do_pin(ring, ring->default_context);
>>>       if (ret) {
>>>           DRM_ERROR(
>>>               "Failed to pin and map ringbuffer %s: %d\n",
>>> diff --git a/drivers/gpu/drm/i915/intel_lrc.h
>>> b/drivers/gpu/drm/i915/intel_lrc.h
>>> index de41ad6cd63d..f9db5f187d77 100644
>>> --- a/drivers/gpu/drm/i915/intel_lrc.h
>>> +++ b/drivers/gpu/drm/i915/intel_lrc.h
>>> @@ -106,6 +106,8 @@ void intel_lr_context_reset(struct drm_device *dev,
>>>               struct intel_context *ctx);
>>>   uint64_t intel_lr_context_descriptor(struct intel_context *ctx,
>>>                        struct intel_engine_cs *ring);
>>> +u32 intel_execlists_ctx_id(struct intel_context *ctx,
>>> +               struct intel_engine_cs *ring);
>>>
>>>   /* Execlists */
>>>   int intel_sanitize_enable_execlists(struct drm_device *dev, int
>>> enable_execlists);
>>> @@ -113,7 +115,6 @@ struct i915_execbuffer_params;
>>>   int intel_execlists_submission(struct i915_execbuffer_params *params,
>>>                      struct drm_i915_gem_execbuffer2 *args,
>>>                      struct list_head *vmas);
>>> -u32 intel_execlists_ctx_id(struct drm_i915_gem_object *ctx_obj);
>>>
>>>   void intel_lrc_irq_handler(struct intel_engine_cs *ring);
>>>   void intel_execlists_retire_requests(struct intel_engine_cs *ring);
>>> --
>>> 1.9.1
>>>
>>> _______________________________________________
>>> Intel-gfx mailing list
>>> Intel-gfx@lists.freedesktop.org
>>> http://lists.freedesktop.org/mailman/listinfo/intel-gfx
>
> How about caching the WHOLE descriptor so that retrieval becomes
> trivial, retrieving the context ID becomes almost as trivial, as
> does retrieving the LRCA (all these could be inlined) and the single
> calculate-and-cache function can be nicely documented as to how a
> descriptor encodes LRCA etc. Something like this ...

Similar to v3 of my series from yesterday as Daniel already suggested 
caching it all. :)

But I think moving towards storing the VMA will be the approach unless 
we can agree on that I915_GTT_BAD_ADDRESS idea.

> /**
>   * intel_lr_context_descriptor_update() - calculate & cache the
>   *                      descriptor for a pinned
>   *                       context
>   * @ctx: Context to work on
>   * @ring: Engine the descriptor will be used with
>   *
>   * The context descriptor encodes various attributes of a context,
>   * including its GTT address and some flags. Because it's fairly
>   * expensive to calculate, we'll just do it once and cache the result,
>   * which remains valid until the context is unpinned.
>   *
>   * This is what a descriptor looks like, from LSB to MSB:
>   *    bits 0-11:    flags, GEN8_CTX_* (cached in ctx_desc_template)
>   *    bits 12-31:    LRCA, GTT address of (the HWSP of) this context
>   *    bits 32-51:    ctx ID, a globally unique tag (the LRCA again!)
>   *    bits 52-63:    reserved, may encode the engine ID (for GuC)
>   */
> static void
> intel_lr_context_descriptor_update(struct intel_context *ctx,
>                     struct intel_engine_cs *ring)
> {
>          struct drm_i915_gem_object *ctx_obj;
>      uint64_t lrca, desc;
>
>      ctx_obj = ctx->engine[ring->id].state;
>      lrca = i915_gem_obj_ggtt_offset(ctx_obj) + LRC_PPHWSP_PN * PAGE_SIZE;
>
>      desc = ring->ctx_desc_template;            /* bits  0-11 */
>      desc |= lrca;                    /* bits 12-31 */
>      desc |= (lrca >> PAGE_SHIFT) << 32;        /* bits 32-51 */
>
>      ctx->engine[ring->id].ctx_desc = desc;
> }
>
> /**
>   * intel_lr_context_descriptor() - get the descriptor for the specified
>   *                    context/engine combination
>   * @ctx: Context to get the ID for
>   * @ring: Engine to get the ID for
>   *
>   * Now just a trivial accessor, since the value was computed and cached
>   * above at the point when the context was pinned.
>   *
>   * Return: 64-bit context descriptor (encodes LRCA and various flags)
>   */
> uint64_t inline
> intel_lr_context_descriptor(struct intel_context *ctx,
>                  struct intel_engine_cs *ring)
> {
>      return ctx->engine[ring->id].ctx_desc;
> }
>
> /**
> * intel_lr_context_address() - get the GTT address of (the HWSP of) the
> *                   specified context/engine combination
> * @ctx: Context to get the ID for
> * @ring: Engine to get the ID for
> *
> * The address (LRCA) is a portion of the context descriptor, so we can
> * just extract the required part from the cached descriptor.
> *
> * Return: 32-bit GTT address
> */
> uint32_t inline
> intel_lr_context_address(struct intel_context *ctx,
>               struct intel_engine_cs *ring)
> {
>      return (u32)intel_lr_context_descriptor(ctx, ring) & PAGE_MASK;
> }
>
> /**
>   * intel_execlists_ctx_id() - get the Execlists Context ID
>   * @ctx: Context to get the ID for
>   * @ring: Engine to get the ID for
>   *
>   * Do not confuse with ctx->id! Unfortunately we have a name overload
>   * here: the old context ID we pass to userspace as a handler so that
>   * they can refer to a context, and the new context ID we pass to the
>   * ELSP so that the GPU can inform us of the context status via
>   * interrupts.
>   *
>   * The context ID is a portion of the context descriptor, so we can
>   * just extract the required part from the cached descriptor.
>   *
>   * Return: 20-bit globally unique ctx ID (actually, a GTT page number)
>   */
> uint32_t inline
> intel_execlists_ctx_id(struct intel_context *ctx,
>                 struct intel_engine_cs *ring)
> {
>      return (u32)(intel_lr_context_descriptor(ctx, ring) >> 32);
> }
>
> ... and insert a call to intel_lr_context_descriptor_update() into the
> appropriate spot in intel_lr_context_do_pin().

Regards,

Tvrtko

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 50+ messages in thread

* [PATCH v3 3/7] drm/i915: Cache ringbuffer GTT VMA
  2016-01-11  8:49   ` Daniel Vetter
@ 2016-01-12 11:42     ` Tvrtko Ursulin
  2016-01-12 11:54       ` Chris Wilson
  0 siblings, 1 reply; 50+ messages in thread
From: Tvrtko Ursulin @ 2016-01-12 11:42 UTC (permalink / raw)
  To: Intel-gfx; +Cc: Daniel Vetter

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Purpose is to avoid calling i915_gem_obj_ggtt_offset from the
interrupt context without the big lock held.

v2: Renamed gtt_start to gtt_offset. (Daniel Vetter)
v3: Cache the VMA instead of address. (Chris Wilson)

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/intel_lrc.c        | 3 +--
 drivers/gpu/drm/i915/intel_ringbuffer.c | 3 +++
 drivers/gpu/drm/i915/intel_ringbuffer.h | 1 +
 3 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 504030ce7f93..94314b344f38 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -340,7 +340,6 @@ static int execlists_update_context(struct drm_i915_gem_request *rq)
 	struct intel_engine_cs *ring = rq->ring;
 	struct i915_hw_ppgtt *ppgtt = rq->ctx->ppgtt;
 	struct drm_i915_gem_object *ctx_obj = rq->ctx->engine[ring->id].state;
-	struct drm_i915_gem_object *rb_obj = rq->ringbuf->obj;
 	struct page *page;
 	uint32_t *reg_state;
 
@@ -350,7 +349,7 @@ static int execlists_update_context(struct drm_i915_gem_request *rq)
 	reg_state = kmap_atomic(page);
 
 	reg_state[CTX_RING_TAIL+1] = rq->tail;
-	reg_state[CTX_RING_BUFFER_START+1] = i915_gem_obj_ggtt_offset(rb_obj);
+	reg_state[CTX_RING_BUFFER_START+1] = rq->ringbuf->vma->node.start;
 
 	if (ppgtt && !USES_FULL_48BIT_PPGTT(ppgtt->base.dev)) {
 		/* True 32b PPGTT with dynamic page allocation: update PDP
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 339701d7a9a5..eccc5323d59b 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -1988,6 +1988,7 @@ void intel_unpin_ringbuffer_obj(struct intel_ringbuffer *ringbuf)
 	else
 		iounmap(ringbuf->virtual_start);
 	ringbuf->virtual_start = NULL;
+	ringbuf->vma = NULL;
 	i915_gem_object_ggtt_unpin(ringbuf->obj);
 }
 
@@ -2054,6 +2055,8 @@ int intel_pin_and_map_ringbuffer_obj(struct drm_device *dev,
 		}
 	}
 
+	ringbuf->vma = i915_gem_obj_to_ggtt(obj);
+
 	return 0;
 }
 
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index 85ce2272f92c..ede57954080e 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -99,6 +99,7 @@ struct intel_ring_hangcheck {
 struct intel_ringbuffer {
 	struct drm_i915_gem_object *obj;
 	void __iomem *virtual_start;
+	struct i915_vma *vma;
 
 	struct intel_engine_cs *ring;
 	struct list_head link;
-- 
1.9.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 50+ messages in thread

* Re: [PATCH v3 3/7] drm/i915: Cache ringbuffer GTT VMA
  2016-01-12 11:42     ` [PATCH v3 3/7] drm/i915: Cache ringbuffer GTT VMA Tvrtko Ursulin
@ 2016-01-12 11:54       ` Chris Wilson
  0 siblings, 0 replies; 50+ messages in thread
From: Chris Wilson @ 2016-01-12 11:54 UTC (permalink / raw)
  To: Tvrtko Ursulin; +Cc: Daniel Vetter, Intel-gfx

On Tue, Jan 12, 2016 at 11:42:39AM +0000, Tvrtko Ursulin wrote:
> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> 
> Purpose is to avoid calling i915_gem_obj_ggtt_offset from the
> interrupt context without the big lock held.
> 
> v2: Renamed gtt_start to gtt_offset. (Daniel Vetter)
> v3: Cache the VMA instead of address. (Chris Wilson)
> 
> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
> Cc: Chris Wilson <chris@chris-wilson.co.uk>

I feel like I have done this before...

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 07/13] drm/i915: Introduce dedicated object VMA iterator
  2016-01-11  9:51       ` Tvrtko Ursulin
@ 2016-01-12 16:19         ` Daniel Vetter
  2016-01-12 16:43           ` Maarten Lankhorst
  2016-01-13 14:35           ` Tvrtko Ursulin
  0 siblings, 2 replies; 50+ messages in thread
From: Daniel Vetter @ 2016-01-12 16:19 UTC (permalink / raw)
  To: Tvrtko Ursulin; +Cc: Daniel Vetter, Intel-gfx

On Mon, Jan 11, 2016 at 09:51:38AM +0000, Tvrtko Ursulin wrote:
> 
> On 11/01/16 08:43, Daniel Vetter wrote:
> > On Fri, Jan 08, 2016 at 01:29:14PM +0000, Tvrtko Ursulin wrote:
> >>
> >> On 08/01/16 11:29, Tvrtko Ursulin wrote:
> >>> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> >>>
> >>> Purpose is to catch places which iterate the object VMA list
> >>> without holding the big lock.
> >>>
> >>> Implemented by open coding list_for_each_entry to make the
> >>> macro compatible with existing call sites.
> >>>
> >>> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> >>> Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
> >>> ---
> >>>   drivers/gpu/drm/i915/i915_debugfs.c      |  8 ++++----
> >>>   drivers/gpu/drm/i915/i915_drv.h          |  6 ++++++
> >>>   drivers/gpu/drm/i915/i915_gem.c          | 24 ++++++++++++------------
> >>>   drivers/gpu/drm/i915/i915_gem_gtt.c      |  2 +-
> >>>   drivers/gpu/drm/i915/i915_gem_shrinker.c |  2 +-
> >>>   drivers/gpu/drm/i915/i915_gpu_error.c    |  4 ++--
> >>>   6 files changed, 26 insertions(+), 20 deletions(-)
> >>>
> >>> diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
> >>> index 714a45cf8a51..d7c2a3201161 100644
> >>> --- a/drivers/gpu/drm/i915/i915_debugfs.c
> >>> +++ b/drivers/gpu/drm/i915/i915_debugfs.c
> >>> @@ -117,7 +117,7 @@ static u64 i915_gem_obj_total_ggtt_size(struct drm_i915_gem_object *obj)
> >>>   	u64 size = 0;
> >>>   	struct i915_vma *vma;
> >>>
> >>> -	list_for_each_entry(vma, &obj->vma_list, vma_link) {
> >>> +	i915_gem_obj_for_each_vma(vma, obj) {
> >>>   		if (i915_is_ggtt(vma->vm) &&
> >>>   		    drm_mm_node_allocated(&vma->node))
> >>>   			size += vma->node.size;
> >>> @@ -155,7 +155,7 @@ describe_obj(struct seq_file *m, struct drm_i915_gem_object *obj)
> >>>   		   obj->madv == I915_MADV_DONTNEED ? " purgeable" : "");
> >>>   	if (obj->base.name)
> >>>   		seq_printf(m, " (name: %d)", obj->base.name);
> >>> -	list_for_each_entry(vma, &obj->vma_list, vma_link) {
> >>> +	i915_gem_obj_for_each_vma(vma, obj) {
> >>>   		if (vma->pin_count > 0)
> >>>   			pin_count++;
> >>>   	}
> >>> @@ -164,7 +164,7 @@ describe_obj(struct seq_file *m, struct drm_i915_gem_object *obj)
> >>>   		seq_printf(m, " (display)");
> >>>   	if (obj->fence_reg != I915_FENCE_REG_NONE)
> >>>   		seq_printf(m, " (fence: %d)", obj->fence_reg);
> >>> -	list_for_each_entry(vma, &obj->vma_list, vma_link) {
> >>> +	i915_gem_obj_for_each_vma(vma, obj) {
> >>>   		seq_printf(m, " (%sgtt offset: %08llx, size: %08llx",
> >>>   			   i915_is_ggtt(vma->vm) ? "g" : "pp",
> >>>   			   vma->node.start, vma->node.size);
> >>> @@ -342,7 +342,7 @@ static int per_file_stats(int id, void *ptr, void *data)
> >>>   		stats->shared += obj->base.size;
> >>>
> >>>   	if (USES_FULL_PPGTT(obj->base.dev)) {
> >>> -		list_for_each_entry(vma, &obj->vma_list, vma_link) {
> >>> +		i915_gem_obj_for_each_vma(vma, obj) {
> >>>   			struct i915_hw_ppgtt *ppgtt;
> >>>
> >>>   			if (!drm_mm_node_allocated(&vma->node))
> >>> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> >>> index b77a5d84eac2..0406a020dfcc 100644
> >>> --- a/drivers/gpu/drm/i915/i915_drv.h
> >>> +++ b/drivers/gpu/drm/i915/i915_drv.h
> >>> @@ -2852,6 +2852,12 @@ struct drm_i915_gem_object *i915_gem_object_create_from_data(
> >>>   void i915_gem_free_object(struct drm_gem_object *obj);
> >>>   void i915_gem_vma_destroy(struct i915_vma *vma);
> >>>
> >>> +#define i915_gem_obj_for_each_vma(vma, obj) \
> >>> +	for (WARN_ON_ONCE(!mutex_is_locked(&(obj)->base.dev->struct_mutex)), \
> >>> +	     vma = list_first_entry(&(obj)->vma_list, typeof(*vma), vma_link);\
> >>> +	     &vma->vma_link != (&(obj)->vma_list); \
> >>> +	     vma = list_next_entry(vma, vma_link))
> >>> +
> >>
> >>
> >> Unfortunately error capture is not happy with this approach. Can't even see
> >> that error capture attempts to grab the mutex anywhere.
> >>
> >> So what? Drop the idea or add a "doing error capture" flag somewhere?
> > 
> > Fix the bugs. Not surprise at all that we've screwed this up all over the
> > place ;-) Afaics modeset code isn't much better either ...
> 
> Ok I'll drop this patch then since the series contains fixes to all but one
> related issues. The remaining one is then:
> 
> [   17.370366] ------------[ cut here ]------------
> [   17.375633] WARNING: CPU: 0 PID: 1128 at drivers/gpu/drm/i915/i915_gem.c:5166 i915_gem_obj_ggtt_offset_view+0x10f/0x120 [i915]()
> [   17.388879] WARN_ON_ONCE(!mutex_is_locked(&(o)->base.dev->struct_mutex))
> [   17.396364] Modules linked in: hid_generic usbhid coretemp asix usbnet libphy mii i915 gpio_lynxpoint i2c_hid hid video i2c_algo_bit drm_kms_helper acpi_pad drm lpc_ich mfd_core nls_iso8859_1 e1000e ptp ahci libahci pps_core
> [   17.419484] CPU: 0 PID: 1128 Comm: Xorg Tainted: G     U          4.4.0-rc8-160107+ #105
> [   17.428771] Hardware name: Intel Corporation Broadwell Client platform/WhiteTip Mountain 1, BIOS BDW-E1R1.86C.0080.R01.1406120446 06/12/2014
> [   17.443161]  ffffffffa0227790 ffff8800a98439b8 ffffffff81280d82 ffff8800a9843a00
> [   17.451677]  ffff8800a98439f0 ffffffff81049c8c ffff8801495d0000 ffff8800aa934900
> [   17.460166]  ffff8801495d8668 ffffffffa0242520 ffff8800aacea000 ffff8800a9843a50
> [   17.468674] Call Trace:
> [   17.471470]  [<ffffffff81280d82>] dump_stack+0x4b/0x79
> [   17.477355]  [<ffffffff81049c8c>] warn_slowpath_common+0x7c/0xc0
> [   17.484255]  [<ffffffff81049d17>] warn_slowpath_fmt+0x47/0x50
> [   17.490869]  [<ffffffffa018f7ef>] i915_gem_obj_ggtt_offset_view+0x10f/0x120 [i915]
> [   17.499572]  [<ffffffffa01a7290>] ? gen9_write8+0x2d0/0x2d0 [i915]
> [   17.506663]  [<ffffffffa01bdbfe>] ironlake_update_primary_plane+0x1ee/0x3a0 [i915]
> [   17.515358]  [<ffffffffa01ace1f>] intel_plane_atomic_update+0x5f/0x70 [i915]
> [   17.523391]  [<ffffffffa00ef122>] drm_atomic_helper_commit_planes_on_crtc+0x142/0x230 [drm_kms_helper]
> [   17.534063]  [<ffffffffa01c1744>] intel_atomic_commit+0x424/0x2270 [i915]
> [   17.541763]  [<ffffffff814ee04a>] ? __ww_mutex_lock+0x4a/0x82
> [   17.548355]  [<ffffffffa0097edb>] ? drm_atomic_check_only+0x12b/0x5e0 [drm]
> [   17.556352]  [<ffffffffa0097a07>] ? drm_atomic_set_crtc_for_connector+0x77/0xf0 [drm]
> [   17.565346]  [<ffffffffa00983c2>] drm_atomic_commit+0x32/0x50 [drm]
> [   17.572553]  [<ffffffffa00ef782>] drm_atomic_helper_set_config+0x72/0xb0 [drm_kms_helper]
> [   17.581954]  [<ffffffffa0088e8f>] drm_mode_set_config_internal+0x5f/0x100 [drm]
> [   17.590313]  [<ffffffffa008cdc8>] drm_mode_setcrtc+0xd8/0x500 [drm]
> [   17.597491]  [<ffffffffa007f5d8>] drm_ioctl+0x258/0x4f0 [drm]
> [   17.604065]  [<ffffffff810c34ac>] ? unlock_page+0x4c/0x50
> [   17.610245]  [<ffffffffa008ccf0>] ? drm_mode_setplane+0x1c0/0x1c0 [drm]
> [   17.617844]  [<ffffffff8111597d>] do_vfs_ioctl+0x2cd/0x4a0
> [   17.624137]  [<ffffffff81052572>] ? recalc_sigpending+0x12/0x40
> [   17.630943]  [<ffffffff8111eb32>] ? __fget+0x72/0xb0
> [   17.636645]  [<ffffffff81115b8c>] SyS_ioctl+0x3c/0x70
> [   17.642448]  [<ffffffff81055301>] ? SyS_rt_sigprocmask+0x81/0xa0
> [   17.649334]  [<ffffffff814f0017>] entry_SYSCALL_64_fastpath+0x12/0x6a
> [   17.656740] ---[ end trace 3f29190006c97826 ]---
> 
> Would you like a BZ for this or maybe it is known already?

Maarten is working on this, or at least we're tracking it already in
VIZ-7000.

Can you pls coordinate with Maarten to make sure the final "enable the
warning" patch lands only once all the bits are in?

Thanks, Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 07/13] drm/i915: Introduce dedicated object VMA iterator
  2016-01-12 16:19         ` Daniel Vetter
@ 2016-01-12 16:43           ` Maarten Lankhorst
  2016-01-13 14:35           ` Tvrtko Ursulin
  1 sibling, 0 replies; 50+ messages in thread
From: Maarten Lankhorst @ 2016-01-12 16:43 UTC (permalink / raw)
  To: Daniel Vetter, Tvrtko Ursulin; +Cc: Daniel Vetter, Intel-gfx

Op 12-01-16 om 17:19 schreef Daniel Vetter:
> On Mon, Jan 11, 2016 at 09:51:38AM +0000, Tvrtko Ursulin wrote:
>> On 11/01/16 08:43, Daniel Vetter wrote:
>>> On Fri, Jan 08, 2016 at 01:29:14PM +0000, Tvrtko Ursulin wrote:
>>>> On 08/01/16 11:29, Tvrtko Ursulin wrote:
>>>>> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>>>>
>>>>> Purpose is to catch places which iterate the object VMA list
>>>>> without holding the big lock.
>>>>>
>>>>> Implemented by open coding list_for_each_entry to make the
>>>>> macro compatible with existing call sites.
>>>>>
>>>>> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>>>> Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
>>>>> ---
>>>>>   drivers/gpu/drm/i915/i915_debugfs.c      |  8 ++++----
>>>>>   drivers/gpu/drm/i915/i915_drv.h          |  6 ++++++
>>>>>   drivers/gpu/drm/i915/i915_gem.c          | 24 ++++++++++++------------
>>>>>   drivers/gpu/drm/i915/i915_gem_gtt.c      |  2 +-
>>>>>   drivers/gpu/drm/i915/i915_gem_shrinker.c |  2 +-
>>>>>   drivers/gpu/drm/i915/i915_gpu_error.c    |  4 ++--
>>>>>   6 files changed, 26 insertions(+), 20 deletions(-)
>>>>>
>>>>> diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
>>>>> index 714a45cf8a51..d7c2a3201161 100644
>>>>> --- a/drivers/gpu/drm/i915/i915_debugfs.c
>>>>> +++ b/drivers/gpu/drm/i915/i915_debugfs.c
>>>>> @@ -117,7 +117,7 @@ static u64 i915_gem_obj_total_ggtt_size(struct drm_i915_gem_object *obj)
>>>>>   	u64 size = 0;
>>>>>   	struct i915_vma *vma;
>>>>>
>>>>> -	list_for_each_entry(vma, &obj->vma_list, vma_link) {
>>>>> +	i915_gem_obj_for_each_vma(vma, obj) {
>>>>>   		if (i915_is_ggtt(vma->vm) &&
>>>>>   		    drm_mm_node_allocated(&vma->node))
>>>>>   			size += vma->node.size;
>>>>> @@ -155,7 +155,7 @@ describe_obj(struct seq_file *m, struct drm_i915_gem_object *obj)
>>>>>   		   obj->madv == I915_MADV_DONTNEED ? " purgeable" : "");
>>>>>   	if (obj->base.name)
>>>>>   		seq_printf(m, " (name: %d)", obj->base.name);
>>>>> -	list_for_each_entry(vma, &obj->vma_list, vma_link) {
>>>>> +	i915_gem_obj_for_each_vma(vma, obj) {
>>>>>   		if (vma->pin_count > 0)
>>>>>   			pin_count++;
>>>>>   	}
>>>>> @@ -164,7 +164,7 @@ describe_obj(struct seq_file *m, struct drm_i915_gem_object *obj)
>>>>>   		seq_printf(m, " (display)");
>>>>>   	if (obj->fence_reg != I915_FENCE_REG_NONE)
>>>>>   		seq_printf(m, " (fence: %d)", obj->fence_reg);
>>>>> -	list_for_each_entry(vma, &obj->vma_list, vma_link) {
>>>>> +	i915_gem_obj_for_each_vma(vma, obj) {
>>>>>   		seq_printf(m, " (%sgtt offset: %08llx, size: %08llx",
>>>>>   			   i915_is_ggtt(vma->vm) ? "g" : "pp",
>>>>>   			   vma->node.start, vma->node.size);
>>>>> @@ -342,7 +342,7 @@ static int per_file_stats(int id, void *ptr, void *data)
>>>>>   		stats->shared += obj->base.size;
>>>>>
>>>>>   	if (USES_FULL_PPGTT(obj->base.dev)) {
>>>>> -		list_for_each_entry(vma, &obj->vma_list, vma_link) {
>>>>> +		i915_gem_obj_for_each_vma(vma, obj) {
>>>>>   			struct i915_hw_ppgtt *ppgtt;
>>>>>
>>>>>   			if (!drm_mm_node_allocated(&vma->node))
>>>>> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
>>>>> index b77a5d84eac2..0406a020dfcc 100644
>>>>> --- a/drivers/gpu/drm/i915/i915_drv.h
>>>>> +++ b/drivers/gpu/drm/i915/i915_drv.h
>>>>> @@ -2852,6 +2852,12 @@ struct drm_i915_gem_object *i915_gem_object_create_from_data(
>>>>>   void i915_gem_free_object(struct drm_gem_object *obj);
>>>>>   void i915_gem_vma_destroy(struct i915_vma *vma);
>>>>>
>>>>> +#define i915_gem_obj_for_each_vma(vma, obj) \
>>>>> +	for (WARN_ON_ONCE(!mutex_is_locked(&(obj)->base.dev->struct_mutex)), \
>>>>> +	     vma = list_first_entry(&(obj)->vma_list, typeof(*vma), vma_link);\
>>>>> +	     &vma->vma_link != (&(obj)->vma_list); \
>>>>> +	     vma = list_next_entry(vma, vma_link))
>>>>> +
>>>>
>>>> Unfortunately error capture is not happy with this approach. Can't even see
>>>> that error capture attempts to grab the mutex anywhere.
>>>>
>>>> So what? Drop the idea or add a "doing error capture" flag somewhere?
>>> Fix the bugs. Not surprise at all that we've screwed this up all over the
>>> place ;-) Afaics modeset code isn't much better either ...
>> Ok I'll drop this patch then since the series contains fixes to all but one
>> related issues. The remaining one is then:
>>
>> [   17.370366] ------------[ cut here ]------------
>> [   17.375633] WARNING: CPU: 0 PID: 1128 at drivers/gpu/drm/i915/i915_gem.c:5166 i915_gem_obj_ggtt_offset_view+0x10f/0x120 [i915]()
>> [   17.388879] WARN_ON_ONCE(!mutex_is_locked(&(o)->base.dev->struct_mutex))
>> [   17.396364] Modules linked in: hid_generic usbhid coretemp asix usbnet libphy mii i915 gpio_lynxpoint i2c_hid hid video i2c_algo_bit drm_kms_helper acpi_pad drm lpc_ich mfd_core nls_iso8859_1 e1000e ptp ahci libahci pps_core
>> [   17.419484] CPU: 0 PID: 1128 Comm: Xorg Tainted: G     U          4.4.0-rc8-160107+ #105
>> [   17.428771] Hardware name: Intel Corporation Broadwell Client platform/WhiteTip Mountain 1, BIOS BDW-E1R1.86C.0080.R01.1406120446 06/12/2014
>> [   17.443161]  ffffffffa0227790 ffff8800a98439b8 ffffffff81280d82 ffff8800a9843a00
>> [   17.451677]  ffff8800a98439f0 ffffffff81049c8c ffff8801495d0000 ffff8800aa934900
>> [   17.460166]  ffff8801495d8668 ffffffffa0242520 ffff8800aacea000 ffff8800a9843a50
>> [   17.468674] Call Trace:
>> [   17.471470]  [<ffffffff81280d82>] dump_stack+0x4b/0x79
>> [   17.477355]  [<ffffffff81049c8c>] warn_slowpath_common+0x7c/0xc0
>> [   17.484255]  [<ffffffff81049d17>] warn_slowpath_fmt+0x47/0x50
>> [   17.490869]  [<ffffffffa018f7ef>] i915_gem_obj_ggtt_offset_view+0x10f/0x120 [i915]
>> [   17.499572]  [<ffffffffa01a7290>] ? gen9_write8+0x2d0/0x2d0 [i915]
>> [   17.506663]  [<ffffffffa01bdbfe>] ironlake_update_primary_plane+0x1ee/0x3a0 [i915]
>> [   17.515358]  [<ffffffffa01ace1f>] intel_plane_atomic_update+0x5f/0x70 [i915]
>> [   17.523391]  [<ffffffffa00ef122>] drm_atomic_helper_commit_planes_on_crtc+0x142/0x230 [drm_kms_helper]
>> [   17.534063]  [<ffffffffa01c1744>] intel_atomic_commit+0x424/0x2270 [i915]
>> [   17.541763]  [<ffffffff814ee04a>] ? __ww_mutex_lock+0x4a/0x82
>> [   17.548355]  [<ffffffffa0097edb>] ? drm_atomic_check_only+0x12b/0x5e0 [drm]
>> [   17.556352]  [<ffffffffa0097a07>] ? drm_atomic_set_crtc_for_connector+0x77/0xf0 [drm]
>> [   17.565346]  [<ffffffffa00983c2>] drm_atomic_commit+0x32/0x50 [drm]
>> [   17.572553]  [<ffffffffa00ef782>] drm_atomic_helper_set_config+0x72/0xb0 [drm_kms_helper]
>> [   17.581954]  [<ffffffffa0088e8f>] drm_mode_set_config_internal+0x5f/0x100 [drm]
>> [   17.590313]  [<ffffffffa008cdc8>] drm_mode_setcrtc+0xd8/0x500 [drm]
>> [   17.597491]  [<ffffffffa007f5d8>] drm_ioctl+0x258/0x4f0 [drm]
>> [   17.604065]  [<ffffffff810c34ac>] ? unlock_page+0x4c/0x50
>> [   17.610245]  [<ffffffffa008ccf0>] ? drm_mode_setplane+0x1c0/0x1c0 [drm]
>> [   17.617844]  [<ffffffff8111597d>] do_vfs_ioctl+0x2cd/0x4a0
>> [   17.624137]  [<ffffffff81052572>] ? recalc_sigpending+0x12/0x40
>> [   17.630943]  [<ffffffff8111eb32>] ? __fget+0x72/0xb0
>> [   17.636645]  [<ffffffff81115b8c>] SyS_ioctl+0x3c/0x70
>> [   17.642448]  [<ffffffff81055301>] ? SyS_rt_sigprocmask+0x81/0xa0
>> [   17.649334]  [<ffffffff814f0017>] entry_SYSCALL_64_fastpath+0x12/0x6a
>> [   17.656740] ---[ end trace 3f29190006c97826 ]---
>>
>> Would you like a BZ for this or maybe it is known already?
> Maarten is working on this, or at least we're tracking it already in
> VIZ-7000.
>
> Can you pls coordinate with Maarten to make sure the final "enable the
> warning" patch lands only once all the bits are in?
>
Chris Wilson's patch bomb makes it possible to use the vma instead of grabbing the view over and over, and that should fix this specific error.

~Maarten
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 07/13] drm/i915: Introduce dedicated object VMA iterator
  2016-01-12 16:19         ` Daniel Vetter
  2016-01-12 16:43           ` Maarten Lankhorst
@ 2016-01-13 14:35           ` Tvrtko Ursulin
  1 sibling, 0 replies; 50+ messages in thread
From: Tvrtko Ursulin @ 2016-01-13 14:35 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: Daniel Vetter, Intel-gfx


On 12/01/16 16:19, Daniel Vetter wrote:
> On Mon, Jan 11, 2016 at 09:51:38AM +0000, Tvrtko Ursulin wrote:
>>
>> On 11/01/16 08:43, Daniel Vetter wrote:
>>> On Fri, Jan 08, 2016 at 01:29:14PM +0000, Tvrtko Ursulin wrote:
>>>>
>>>> On 08/01/16 11:29, Tvrtko Ursulin wrote:
>>>>> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>>>>
>>>>> Purpose is to catch places which iterate the object VMA list
>>>>> without holding the big lock.
>>>>>
>>>>> Implemented by open coding list_for_each_entry to make the
>>>>> macro compatible with existing call sites.
>>>>>
>>>>> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>>>> Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
>>>>> ---
>>>>>    drivers/gpu/drm/i915/i915_debugfs.c      |  8 ++++----
>>>>>    drivers/gpu/drm/i915/i915_drv.h          |  6 ++++++
>>>>>    drivers/gpu/drm/i915/i915_gem.c          | 24 ++++++++++++------------
>>>>>    drivers/gpu/drm/i915/i915_gem_gtt.c      |  2 +-
>>>>>    drivers/gpu/drm/i915/i915_gem_shrinker.c |  2 +-
>>>>>    drivers/gpu/drm/i915/i915_gpu_error.c    |  4 ++--
>>>>>    6 files changed, 26 insertions(+), 20 deletions(-)
>>>>>
>>>>> diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
>>>>> index 714a45cf8a51..d7c2a3201161 100644
>>>>> --- a/drivers/gpu/drm/i915/i915_debugfs.c
>>>>> +++ b/drivers/gpu/drm/i915/i915_debugfs.c
>>>>> @@ -117,7 +117,7 @@ static u64 i915_gem_obj_total_ggtt_size(struct drm_i915_gem_object *obj)
>>>>>    	u64 size = 0;
>>>>>    	struct i915_vma *vma;
>>>>>
>>>>> -	list_for_each_entry(vma, &obj->vma_list, vma_link) {
>>>>> +	i915_gem_obj_for_each_vma(vma, obj) {
>>>>>    		if (i915_is_ggtt(vma->vm) &&
>>>>>    		    drm_mm_node_allocated(&vma->node))
>>>>>    			size += vma->node.size;
>>>>> @@ -155,7 +155,7 @@ describe_obj(struct seq_file *m, struct drm_i915_gem_object *obj)
>>>>>    		   obj->madv == I915_MADV_DONTNEED ? " purgeable" : "");
>>>>>    	if (obj->base.name)
>>>>>    		seq_printf(m, " (name: %d)", obj->base.name);
>>>>> -	list_for_each_entry(vma, &obj->vma_list, vma_link) {
>>>>> +	i915_gem_obj_for_each_vma(vma, obj) {
>>>>>    		if (vma->pin_count > 0)
>>>>>    			pin_count++;
>>>>>    	}
>>>>> @@ -164,7 +164,7 @@ describe_obj(struct seq_file *m, struct drm_i915_gem_object *obj)
>>>>>    		seq_printf(m, " (display)");
>>>>>    	if (obj->fence_reg != I915_FENCE_REG_NONE)
>>>>>    		seq_printf(m, " (fence: %d)", obj->fence_reg);
>>>>> -	list_for_each_entry(vma, &obj->vma_list, vma_link) {
>>>>> +	i915_gem_obj_for_each_vma(vma, obj) {
>>>>>    		seq_printf(m, " (%sgtt offset: %08llx, size: %08llx",
>>>>>    			   i915_is_ggtt(vma->vm) ? "g" : "pp",
>>>>>    			   vma->node.start, vma->node.size);
>>>>> @@ -342,7 +342,7 @@ static int per_file_stats(int id, void *ptr, void *data)
>>>>>    		stats->shared += obj->base.size;
>>>>>
>>>>>    	if (USES_FULL_PPGTT(obj->base.dev)) {
>>>>> -		list_for_each_entry(vma, &obj->vma_list, vma_link) {
>>>>> +		i915_gem_obj_for_each_vma(vma, obj) {
>>>>>    			struct i915_hw_ppgtt *ppgtt;
>>>>>
>>>>>    			if (!drm_mm_node_allocated(&vma->node))
>>>>> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
>>>>> index b77a5d84eac2..0406a020dfcc 100644
>>>>> --- a/drivers/gpu/drm/i915/i915_drv.h
>>>>> +++ b/drivers/gpu/drm/i915/i915_drv.h
>>>>> @@ -2852,6 +2852,12 @@ struct drm_i915_gem_object *i915_gem_object_create_from_data(
>>>>>    void i915_gem_free_object(struct drm_gem_object *obj);
>>>>>    void i915_gem_vma_destroy(struct i915_vma *vma);
>>>>>
>>>>> +#define i915_gem_obj_for_each_vma(vma, obj) \
>>>>> +	for (WARN_ON_ONCE(!mutex_is_locked(&(obj)->base.dev->struct_mutex)), \
>>>>> +	     vma = list_first_entry(&(obj)->vma_list, typeof(*vma), vma_link);\
>>>>> +	     &vma->vma_link != (&(obj)->vma_list); \
>>>>> +	     vma = list_next_entry(vma, vma_link))
>>>>> +
>>>>
>>>>
>>>> Unfortunately error capture is not happy with this approach. Can't even see
>>>> that error capture attempts to grab the mutex anywhere.
>>>>
>>>> So what? Drop the idea or add a "doing error capture" flag somewhere?
>>>
>>> Fix the bugs. Not surprise at all that we've screwed this up all over the
>>> place ;-) Afaics modeset code isn't much better either ...
>>
>> Ok I'll drop this patch then since the series contains fixes to all but one
>> related issues. The remaining one is then:
>>
>> [   17.370366] ------------[ cut here ]------------
>> [   17.375633] WARNING: CPU: 0 PID: 1128 at drivers/gpu/drm/i915/i915_gem.c:5166 i915_gem_obj_ggtt_offset_view+0x10f/0x120 [i915]()
>> [   17.388879] WARN_ON_ONCE(!mutex_is_locked(&(o)->base.dev->struct_mutex))
>> [   17.396364] Modules linked in: hid_generic usbhid coretemp asix usbnet libphy mii i915 gpio_lynxpoint i2c_hid hid video i2c_algo_bit drm_kms_helper acpi_pad drm lpc_ich mfd_core nls_iso8859_1 e1000e ptp ahci libahci pps_core
>> [   17.419484] CPU: 0 PID: 1128 Comm: Xorg Tainted: G     U          4.4.0-rc8-160107+ #105
>> [   17.428771] Hardware name: Intel Corporation Broadwell Client platform/WhiteTip Mountain 1, BIOS BDW-E1R1.86C.0080.R01.1406120446 06/12/2014
>> [   17.443161]  ffffffffa0227790 ffff8800a98439b8 ffffffff81280d82 ffff8800a9843a00
>> [   17.451677]  ffff8800a98439f0 ffffffff81049c8c ffff8801495d0000 ffff8800aa934900
>> [   17.460166]  ffff8801495d8668 ffffffffa0242520 ffff8800aacea000 ffff8800a9843a50
>> [   17.468674] Call Trace:
>> [   17.471470]  [<ffffffff81280d82>] dump_stack+0x4b/0x79
>> [   17.477355]  [<ffffffff81049c8c>] warn_slowpath_common+0x7c/0xc0
>> [   17.484255]  [<ffffffff81049d17>] warn_slowpath_fmt+0x47/0x50
>> [   17.490869]  [<ffffffffa018f7ef>] i915_gem_obj_ggtt_offset_view+0x10f/0x120 [i915]
>> [   17.499572]  [<ffffffffa01a7290>] ? gen9_write8+0x2d0/0x2d0 [i915]
>> [   17.506663]  [<ffffffffa01bdbfe>] ironlake_update_primary_plane+0x1ee/0x3a0 [i915]
>> [   17.515358]  [<ffffffffa01ace1f>] intel_plane_atomic_update+0x5f/0x70 [i915]
>> [   17.523391]  [<ffffffffa00ef122>] drm_atomic_helper_commit_planes_on_crtc+0x142/0x230 [drm_kms_helper]
>> [   17.534063]  [<ffffffffa01c1744>] intel_atomic_commit+0x424/0x2270 [i915]
>> [   17.541763]  [<ffffffff814ee04a>] ? __ww_mutex_lock+0x4a/0x82
>> [   17.548355]  [<ffffffffa0097edb>] ? drm_atomic_check_only+0x12b/0x5e0 [drm]
>> [   17.556352]  [<ffffffffa0097a07>] ? drm_atomic_set_crtc_for_connector+0x77/0xf0 [drm]
>> [   17.565346]  [<ffffffffa00983c2>] drm_atomic_commit+0x32/0x50 [drm]
>> [   17.572553]  [<ffffffffa00ef782>] drm_atomic_helper_set_config+0x72/0xb0 [drm_kms_helper]
>> [   17.581954]  [<ffffffffa0088e8f>] drm_mode_set_config_internal+0x5f/0x100 [drm]
>> [   17.590313]  [<ffffffffa008cdc8>] drm_mode_setcrtc+0xd8/0x500 [drm]
>> [   17.597491]  [<ffffffffa007f5d8>] drm_ioctl+0x258/0x4f0 [drm]
>> [   17.604065]  [<ffffffff810c34ac>] ? unlock_page+0x4c/0x50
>> [   17.610245]  [<ffffffffa008ccf0>] ? drm_mode_setplane+0x1c0/0x1c0 [drm]
>> [   17.617844]  [<ffffffff8111597d>] do_vfs_ioctl+0x2cd/0x4a0
>> [   17.624137]  [<ffffffff81052572>] ? recalc_sigpending+0x12/0x40
>> [   17.630943]  [<ffffffff8111eb32>] ? __fget+0x72/0xb0
>> [   17.636645]  [<ffffffff81115b8c>] SyS_ioctl+0x3c/0x70
>> [   17.642448]  [<ffffffff81055301>] ? SyS_rt_sigprocmask+0x81/0xa0
>> [   17.649334]  [<ffffffff814f0017>] entry_SYSCALL_64_fastpath+0x12/0x6a
>> [   17.656740] ---[ end trace 3f29190006c97826 ]---
>>
>> Would you like a BZ for this or maybe it is known already?
>
> Maarten is working on this, or at least we're tracking it already in
> VIZ-7000.
>
> Can you pls coordinate with Maarten to make sure the final "enable the
> warning" patch lands only once all the bits are in?

Oh I dropped that patch due the error capture problem - do you have a 
suggestion how to handle that?

Regards,

Tvrtko
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 50+ messages in thread

end of thread, other threads:[~2016-01-13 14:35 UTC | newest]

Thread overview: 50+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-01-08 11:29 [PATCH v2 00/13] Misc cleanups and locking fixes Tvrtko Ursulin
2016-01-08 11:29 ` [PATCH 01/13] drm/i915/bdw+: Replace list_del+list_add_tail with list_move_tail Tvrtko Ursulin
2016-01-11  8:22   ` Daniel Vetter
2016-01-08 11:29 ` [PATCH 02/13] drm/i915: Don't need a timer to wake us up Tvrtko Ursulin
2016-01-11  8:26   ` Daniel Vetter
2016-01-08 11:29 ` [PATCH 03/13] drm/i915: Avoid invariant conditionals in lrc interrupt handler Tvrtko Ursulin
2016-01-11  8:29   ` Daniel Vetter
2016-01-11  9:43     ` Tvrtko Ursulin
2016-01-08 11:29 ` [PATCH 04/13] drm/i915: Fail engine initialization if LRCA is incorrectly aligned Tvrtko Ursulin
2016-01-11  8:31   ` Daniel Vetter
2016-01-11 16:02     ` Dave Gordon
2016-01-11 23:31       ` Chris Wilson
2016-01-08 11:29 ` [PATCH 05/13] drm/i915: Cache LRCA in the context Tvrtko Ursulin
2016-01-11  8:38   ` Daniel Vetter
2016-01-11 19:41     ` Dave Gordon
2016-01-12  9:59       ` Tvrtko Ursulin
2016-01-08 11:29 ` [PATCH 06/13] drm/i915: Only grab timestamps when needed Tvrtko Ursulin
2016-01-11  8:42   ` Daniel Vetter
2016-01-11  9:45     ` Tvrtko Ursulin
2016-01-08 11:29 ` [PATCH 07/13] drm/i915: Introduce dedicated object VMA iterator Tvrtko Ursulin
2016-01-08 11:44   ` Chris Wilson
2016-01-11  8:48     ` Daniel Vetter
2016-01-08 13:29   ` Tvrtko Ursulin
2016-01-11  8:43     ` Daniel Vetter
2016-01-11  9:51       ` Tvrtko Ursulin
2016-01-12 16:19         ` Daniel Vetter
2016-01-12 16:43           ` Maarten Lankhorst
2016-01-13 14:35           ` Tvrtko Ursulin
2016-01-08 11:29 ` [PATCH 08/13] drm/i915: GEM operations need to be done under the big lock Tvrtko Ursulin
2016-01-08 11:40   ` Chris Wilson
2016-01-08 11:45     ` Chris Wilson
2016-01-08 11:47     ` Tvrtko Ursulin
2016-01-08 12:09       ` Chris Wilson
2016-01-08 12:40   ` Tvrtko Ursulin
2016-01-08 13:04   ` [PATCH v3 " Tvrtko Ursulin
2016-01-08 11:29 ` [PATCH 09/13] drm/i915: Remove two impossible asserts Tvrtko Ursulin
2016-01-11  8:49   ` Daniel Vetter
2016-01-08 11:29 ` [PATCH 10/13] drm/i915: Introduce dedicated safe object VMA iterator Tvrtko Ursulin
2016-01-08 11:29 ` [PATCH 11/13] drm/i915: Cache ringbuffer GTT address Tvrtko Ursulin
2016-01-08 11:37   ` Chris Wilson
2016-01-11  8:49   ` Daniel Vetter
2016-01-12 11:42     ` [PATCH v3 3/7] drm/i915: Cache ringbuffer GTT VMA Tvrtko Ursulin
2016-01-12 11:54       ` Chris Wilson
2016-01-11 16:16   ` [PATCH 11/13] drm/i915: Cache ringbuffer GTT address Dave Gordon
2016-01-11 17:44     ` Tvrtko Ursulin
2016-01-08 11:29 ` [PATCH 12/13] drm/i915: Add BKL asserts to get page helpers Tvrtko Ursulin
2016-01-08 11:37   ` Chris Wilson
2016-01-08 11:29 ` [PATCH 13/13] drm/i915: Cache LRC state page in the context Tvrtko Ursulin
2016-01-08 11:38 ` [PATCH v2 00/13] Misc cleanups and locking fixes Chris Wilson
2016-01-11  9:44 ` ✗ failure: Fi.CI.BAT Patchwork

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.