All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v3 1/8] drm/i915: Assert that we don't submit to execlists whilst a preempt is pending
@ 2017-01-24 11:00 Chris Wilson
  2017-01-24 11:00 ` [PATCH v3 2/8] drm/i915: Only disable execlist preemption for the duration of the request Chris Wilson
                   ` (8 more replies)
  0 siblings, 9 replies; 16+ messages in thread
From: Chris Wilson @ 2017-01-24 11:00 UTC (permalink / raw)
  To: intel-gfx

To complement the check in execlists_elsp_ready(), also assert that we
don't submit the same context while it has a lite restore still pending.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
---
 drivers/gpu/drm/i915/intel_lrc.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 32096d141d57..9dd612a2df16 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -375,6 +375,7 @@ static void execlists_submit_ports(struct intel_engine_cs *engine)
 		dev_priv->regs + i915_mmio_reg_offset(RING_ELSP(engine));
 	u64 desc[2];
 
+	GEM_BUG_ON(port[0].count > 1);
 	if (!port[0].count)
 		execlists_context_status_change(port[0].request,
 						INTEL_CONTEXT_SCHEDULE_IN);
-- 
2.11.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH v3 2/8] drm/i915: Only disable execlist preemption for the duration of the request
  2017-01-24 11:00 [PATCH v3 1/8] drm/i915: Assert that we don't submit to execlists whilst a preempt is pending Chris Wilson
@ 2017-01-24 11:00 ` Chris Wilson
  2017-01-24 11:00 ` [PATCH v3 3/8] drm/i915: Move breadcrumbs irq_posted up a level to engine Chris Wilson
                   ` (7 subsequent siblings)
  8 siblings, 0 replies; 16+ messages in thread
From: Chris Wilson @ 2017-01-24 11:00 UTC (permalink / raw)
  To: intel-gfx

We need to prevent resubmission of the context immediately following an
initial resubmit (which does a lite-restore preemption). Currently we do
this by disabling all submission whilst the context is still active, but
we can improve this by limiting the restriction to only until we
receive notification from the context-switch interrupt that the
lite-restore preemption is complete.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
---
 drivers/gpu/drm/i915/i915_debugfs.c     | 18 ++++++++++++------
 drivers/gpu/drm/i915/intel_lrc.c        | 14 ++++----------
 drivers/gpu/drm/i915/intel_ringbuffer.h |  1 -
 3 files changed, 16 insertions(+), 17 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index fa69d72fdcb9..9d7a77ecec3d 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -3320,15 +3320,21 @@ static int i915_engine_info(struct seq_file *m, void *unused)
 
 			rcu_read_lock();
 			rq = READ_ONCE(engine->execlist_port[0].request);
-			if (rq)
-				print_request(m, rq, "\t\tELSP[0] ");
-			else
+			if (rq) {
+				seq_printf(m, "\t\tELSP[0] count=%d, ",
+					   engine->execlist_port[0].count);
+				print_request(m, rq, "rq: ");
+			} else {
 				seq_printf(m, "\t\tELSP[0] idle\n");
+			}
 			rq = READ_ONCE(engine->execlist_port[1].request);
-			if (rq)
-				print_request(m, rq, "\t\tELSP[1] ");
-			else
+			if (rq) {
+				seq_printf(m, "\t\tELSP[1] count=%d, ",
+					   engine->execlist_port[1].count);
+				print_request(m, rq, "rq: ");
+			} else {
 				seq_printf(m, "\t\tELSP[1] idle\n");
+			}
 			rcu_read_unlock();
 
 			spin_lock_irq(&engine->timeline->lock);
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 9dd612a2df16..9896027880ea 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -380,7 +380,7 @@ static void execlists_submit_ports(struct intel_engine_cs *engine)
 		execlists_context_status_change(port[0].request,
 						INTEL_CONTEXT_SCHEDULE_IN);
 	desc[0] = execlists_update_context(port[0].request);
-	engine->preempt_wa = port[0].count++; /* bdw only? fixed on skl? */
+	port[0].count++;
 
 	if (port[1].request) {
 		GEM_BUG_ON(port[1].count);
@@ -545,15 +545,11 @@ bool intel_execlists_idle(struct drm_i915_private *dev_priv)
 	return true;
 }
 
-static bool execlists_elsp_ready(struct intel_engine_cs *engine)
+static bool execlists_elsp_ready(const struct intel_engine_cs *engine)
 {
-	int port;
+	const struct execlist_port *port = engine->execlist_port;
 
-	port = 1; /* wait for a free slot */
-	if (engine->preempt_wa)
-		port = 0; /* wait for GPU to be idle before continuing */
-
-	return !engine->execlist_port[port].request;
+	return port[0].count + port[1].count < 2;
 }
 
 /*
@@ -601,8 +597,6 @@ static void intel_lrc_irq_handler(unsigned long data)
 				i915_gem_request_put(port[0].request);
 				port[0] = port[1];
 				memset(&port[1], 0, sizeof(port[1]));
-
-				engine->preempt_wa = false;
 			}
 
 			GEM_BUG_ON(port[0].count == 0 &&
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index 9183d148b36a..dbd32585f27a 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -380,7 +380,6 @@ struct intel_engine_cs {
 	struct rb_root execlist_queue;
 	struct rb_node *execlist_first;
 	unsigned int fw_domains;
-	bool preempt_wa;
 	u32 ctx_desc_template;
 
 	/* Contexts are pinned whilst they are active on the GPU. The last
-- 
2.11.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH v3 3/8] drm/i915: Move breadcrumbs irq_posted up a level to engine
  2017-01-24 11:00 [PATCH v3 1/8] drm/i915: Assert that we don't submit to execlists whilst a preempt is pending Chris Wilson
  2017-01-24 11:00 ` [PATCH v3 2/8] drm/i915: Only disable execlist preemption for the duration of the request Chris Wilson
@ 2017-01-24 11:00 ` Chris Wilson
  2017-01-24 15:18   ` [PATCH v4] " Chris Wilson
  2017-01-24 11:00 ` [PATCH v3 4/8] drm/i915: Only run execlist context-switch handler after an interrupt Chris Wilson
                   ` (6 subsequent siblings)
  8 siblings, 1 reply; 16+ messages in thread
From: Chris Wilson @ 2017-01-24 11:00 UTC (permalink / raw)
  To: intel-gfx; +Cc: Mika Kuoppala

In the next patch, we will use the irq_posted technique for another
engine interrupt, rather than use two members for the atomic updates, we
can use two bits of one instead. First, we need to update the
breadcrumbs to use the new common engine->irq_posted.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com
---
 drivers/gpu/drm/i915/i915_drv.h          | 2 +-
 drivers/gpu/drm/i915/i915_irq.c          | 2 +-
 drivers/gpu/drm/i915/intel_breadcrumbs.c | 9 +++++----
 drivers/gpu/drm/i915/intel_ringbuffer.h  | 4 +++-
 4 files changed, 10 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index ccb4fb804fca..652e40c6ca40 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -3964,7 +3964,7 @@ __i915_request_irq_complete(struct drm_i915_gem_request *req)
 	 */
 	if (engine->irq_seqno_barrier &&
 	    rcu_access_pointer(engine->breadcrumbs.irq_seqno_bh) == current &&
-	    cmpxchg_relaxed(&engine->breadcrumbs.irq_posted, 1, 0)) {
+	    test_and_clear_bit(ENGINE_IRQ_BREADCRUMB, &engine->irq_posted)) {
 		struct task_struct *tsk;
 
 		/* The ordering of irq_posted versus applying the barrier
diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index 6fefc34ef602..7e087c344265 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -1033,7 +1033,7 @@ static void ironlake_rps_change_irq_handler(struct drm_i915_private *dev_priv)
 
 static void notify_ring(struct intel_engine_cs *engine)
 {
-	smp_store_mb(engine->breadcrumbs.irq_posted, true);
+	set_bit(ENGINE_IRQ_BREADCRUMB, &engine->irq_posted);
 	if (intel_engine_wakeup(engine))
 		trace_i915_gem_request_notify(engine);
 }
diff --git a/drivers/gpu/drm/i915/intel_breadcrumbs.c b/drivers/gpu/drm/i915/intel_breadcrumbs.c
index c6fa77177615..3dea9e7e17e9 100644
--- a/drivers/gpu/drm/i915/intel_breadcrumbs.c
+++ b/drivers/gpu/drm/i915/intel_breadcrumbs.c
@@ -81,7 +81,7 @@ static void irq_enable(struct intel_engine_cs *engine)
 	 * we still need to force the barrier before reading the seqno,
 	 * just in case.
 	 */
-	engine->breadcrumbs.irq_posted = true;
+	__set_bit(ENGINE_IRQ_BREADCRUMB, &engine->irq_posted);
 
 	/* Caller disables interrupts */
 	spin_lock(&engine->i915->irq_lock);
@@ -96,7 +96,7 @@ static void irq_disable(struct intel_engine_cs *engine)
 	engine->irq_disable(engine);
 	spin_unlock(&engine->i915->irq_lock);
 
-	engine->breadcrumbs.irq_posted = false;
+	__clear_bit(ENGINE_IRQ_BREADCRUMB, &engine->irq_posted);
 }
 
 static void __intel_breadcrumbs_enable_irq(struct intel_breadcrumbs *b)
@@ -257,7 +257,8 @@ static bool __intel_engine_add_wait(struct intel_engine_cs *engine,
 			 * in case the seqno passed.
 			 */
 			__intel_breadcrumbs_enable_irq(b);
-			if (READ_ONCE(b->irq_posted))
+			if (test_bit(ENGINE_IRQ_BREADCRUMB,
+				     &engine->irq_posted))
 				wake_up_process(to_wait(next)->tsk);
 		}
 
@@ -610,7 +611,7 @@ void intel_engine_reset_breadcrumbs(struct intel_engine_cs *engine)
 	if (intel_engine_has_waiter(engine)) {
 		b->timeout = wait_timeout();
 		__intel_breadcrumbs_enable_irq(b);
-		if (READ_ONCE(b->irq_posted))
+		if (test_bit(ENGINE_IRQ_BREADCRUMB, &engine->irq_posted))
 			wake_up_process(b->first_wait->tsk);
 	} else {
 		/* sanitize the IMR and unmask any auxiliary interrupts */
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index dbd32585f27a..a9ea84ea3155 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -211,6 +211,9 @@ struct intel_engine_cs {
 
 	struct intel_render_state *render_state;
 
+	unsigned long irq_posted;
+#define ENGINE_IRQ_BREADCRUMB 0
+
 	/* Rather than have every client wait upon all user interrupts,
 	 * with the herd waking after every interrupt and each doing the
 	 * heavyweight seqno dance, we delegate the task (of being the
@@ -229,7 +232,6 @@ struct intel_engine_cs {
 	 */
 	struct intel_breadcrumbs {
 		struct task_struct __rcu *irq_seqno_bh; /* bh for interrupts */
-		bool irq_posted;
 
 		spinlock_t lock; /* protects the lists of requests; irqsafe */
 		struct rb_root waiters; /* sorted by retirement, priority */
-- 
2.11.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH v3 4/8] drm/i915: Only run execlist context-switch handler after an interrupt
  2017-01-24 11:00 [PATCH v3 1/8] drm/i915: Assert that we don't submit to execlists whilst a preempt is pending Chris Wilson
  2017-01-24 11:00 ` [PATCH v3 2/8] drm/i915: Only disable execlist preemption for the duration of the request Chris Wilson
  2017-01-24 11:00 ` [PATCH v3 3/8] drm/i915: Move breadcrumbs irq_posted up a level to engine Chris Wilson
@ 2017-01-24 11:00 ` Chris Wilson
  2017-01-24 15:20   ` [PATCH v4] " Chris Wilson
  2017-01-24 11:00 ` [PATCH v3 5/8] drm/i915: Skip the execlists CSB scan and rewrite if the ring is untouched Chris Wilson
                   ` (5 subsequent siblings)
  8 siblings, 1 reply; 16+ messages in thread
From: Chris Wilson @ 2017-01-24 11:00 UTC (permalink / raw)
  To: intel-gfx; +Cc: Mika Kuoppala

Mark when we run the execlist tasklet following the interrupt, so we
don't probe a potentially uninitialised register when submitting the
contexts multiple times before the hardware responds.

v2: Use a shared engine->irq_posted

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com
---
 drivers/gpu/drm/i915/i915_irq.c         | 7 +++++--
 drivers/gpu/drm/i915/intel_lrc.c        | 3 ++-
 drivers/gpu/drm/i915/intel_ringbuffer.h | 1 +
 3 files changed, 8 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index 7e087c344265..3f3c9082b0f8 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -1349,8 +1349,11 @@ gen8_cs_irq_handler(struct intel_engine_cs *engine, u32 iir, int test_shift)
 {
 	if (iir & (GT_RENDER_USER_INTERRUPT << test_shift))
 		notify_ring(engine);
-	if (iir & (GT_CONTEXT_SWITCH_INTERRUPT << test_shift))
-		tasklet_schedule(&engine->irq_tasklet);
+
+	if (iir & (GT_CONTEXT_SWITCH_INTERRUPT << test_shift)) {
+		set_bit(ENGINE_IRQ_EXECLIST, &engine->irq_posted);
+		tasklet_hi_schedule(&engine->irq_tasklet);
+	}
 }
 
 static irqreturn_t gen8_gt_irq_ack(struct drm_i915_private *dev_priv,
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 9896027880ea..7db9c49450f9 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -564,7 +564,7 @@ static void intel_lrc_irq_handler(unsigned long data)
 
 	intel_uncore_forcewake_get(dev_priv, engine->fw_domains);
 
-	if (!execlists_elsp_idle(engine)) {
+	while (__test_and_clear_bit(ENGINE_IRQ_EXECLIST, &engine->irq_posted)) {
 		u32 __iomem *csb_mmio =
 			dev_priv->regs + i915_mmio_reg_offset(RING_CONTEXT_STATUS_PTR(engine));
 		u32 __iomem *buf =
@@ -1297,6 +1297,7 @@ static int gen8_init_common_ring(struct intel_engine_cs *engine)
 	DRM_DEBUG_DRIVER("Execlists enabled for %s\n", engine->name);
 
 	/* After a GPU reset, we may have requests to replay */
+	__clear_bit(ENGINE_IRQ_EXECLIST, &engine->irq_posted);
 	if (!execlists_elsp_idle(engine)) {
 		engine->execlist_port[0].count = 0;
 		engine->execlist_port[1].count = 0;
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index a9ea84ea3155..8e872730f8eb 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -213,6 +213,7 @@ struct intel_engine_cs {
 
 	unsigned long irq_posted;
 #define ENGINE_IRQ_BREADCRUMB 0
+#define ENGINE_IRQ_EXECLIST 1
 
 	/* Rather than have every client wait upon all user interrupts,
 	 * with the herd waking after every interrupt and each doing the
-- 
2.11.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH v3 5/8] drm/i915: Skip the execlists CSB scan and rewrite if the ring is untouched
  2017-01-24 11:00 [PATCH v3 1/8] drm/i915: Assert that we don't submit to execlists whilst a preempt is pending Chris Wilson
                   ` (2 preceding siblings ...)
  2017-01-24 11:00 ` [PATCH v3 4/8] drm/i915: Only run execlist context-switch handler after an interrupt Chris Wilson
@ 2017-01-24 11:00 ` Chris Wilson
  2017-01-24 11:00 ` [PATCH v3 6/8] drm/i915: Only attempt to pass the first request to execlists Chris Wilson
                   ` (4 subsequent siblings)
  8 siblings, 0 replies; 16+ messages in thread
From: Chris Wilson @ 2017-01-24 11:00 UTC (permalink / raw)
  To: intel-gfx

If the CSB head/tail pointers are unchanged, we can skip the update of
the CSB register afterwards.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
---
 drivers/gpu/drm/i915/intel_lrc.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 7db9c49450f9..a46a05f0e79c 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -574,9 +574,12 @@ static void intel_lrc_irq_handler(unsigned long data)
 		csb = readl(csb_mmio);
 		head = GEN8_CSB_READ_PTR(csb);
 		tail = GEN8_CSB_WRITE_PTR(csb);
+		if (head == tail)
+			break;
+
 		if (tail < head)
 			tail += GEN8_CSB_ENTRIES;
-		while (head < tail) {
+		do {
 			unsigned int idx = ++head % GEN8_CSB_ENTRIES;
 			unsigned int status = readl(buf + 2 * idx);
 
@@ -601,7 +604,7 @@ static void intel_lrc_irq_handler(unsigned long data)
 
 			GEM_BUG_ON(port[0].count == 0 &&
 				   !(status & GEN8_CTX_STATUS_ACTIVE_IDLE));
-		}
+		} while (head < tail);
 
 		writel(_MASKED_FIELD(GEN8_CSB_READ_PTR_MASK,
 				     GEN8_CSB_WRITE_PTR(csb) << 8),
-- 
2.11.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH v3 6/8] drm/i915: Only attempt to pass the first request to execlists
  2017-01-24 11:00 [PATCH v3 1/8] drm/i915: Assert that we don't submit to execlists whilst a preempt is pending Chris Wilson
                   ` (3 preceding siblings ...)
  2017-01-24 11:00 ` [PATCH v3 5/8] drm/i915: Skip the execlists CSB scan and rewrite if the ring is untouched Chris Wilson
@ 2017-01-24 11:00 ` Chris Wilson
  2017-01-24 15:51   ` Mika Kuoppala
  2017-01-24 11:00 ` [PATCH v3 7/8] drm/i915: Dequeue execlists on a new request if any port is available Chris Wilson
                   ` (3 subsequent siblings)
  8 siblings, 1 reply; 16+ messages in thread
From: Chris Wilson @ 2017-01-24 11:00 UTC (permalink / raw)
  To: intel-gfx; +Cc: Mika Kuoppala

Only the first request added to the execlist queue, can be submitted. If
this request is not the first request on the queue, it means that there
are already higher priority requests waiting upon the tasklet and
kicking it will make no difference.

This is more relevant for a later patch, where we more eagerly try and
kick the tasklet to handle the submission of new requests.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com
---
 drivers/gpu/drm/i915/intel_lrc.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index a46a05f0e79c..9bf8858ecdcf 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -651,10 +651,11 @@ static void execlists_submit_request(struct drm_i915_gem_request *request)
 	/* Will be called from irq-context when using foreign fences. */
 	spin_lock_irqsave(&engine->timeline->lock, flags);
 
-	if (insert_request(&request->priotree, &engine->execlist_queue))
+	if (insert_request(&request->priotree, &engine->execlist_queue)) {
 		engine->execlist_first = &request->priotree.node;
-	if (execlists_elsp_idle(engine))
-		tasklet_hi_schedule(&engine->irq_tasklet);
+		if (execlists_elsp_idle(engine))
+			tasklet_hi_schedule(&engine->irq_tasklet);
+	}
 
 	spin_unlock_irqrestore(&engine->timeline->lock, flags);
 }
-- 
2.11.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH v3 7/8] drm/i915: Dequeue execlists on a new request if any port is available
  2017-01-24 11:00 [PATCH v3 1/8] drm/i915: Assert that we don't submit to execlists whilst a preempt is pending Chris Wilson
                   ` (4 preceding siblings ...)
  2017-01-24 11:00 ` [PATCH v3 6/8] drm/i915: Only attempt to pass the first request to execlists Chris Wilson
@ 2017-01-24 11:00 ` Chris Wilson
  2017-01-24 11:00 ` [PATCH v3 8/8] drm/i915: Emit dma-fence (and execlists submit) first from signaler Chris Wilson
                   ` (2 subsequent siblings)
  8 siblings, 0 replies; 16+ messages in thread
From: Chris Wilson @ 2017-01-24 11:00 UTC (permalink / raw)
  To: intel-gfx

If the second ELSP port is available, schedule the execlists tasklet to
see if the new request can occupy it.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/intel_lrc.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 9bf8858ecdcf..39fdfae0894a 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -653,7 +653,7 @@ static void execlists_submit_request(struct drm_i915_gem_request *request)
 
 	if (insert_request(&request->priotree, &engine->execlist_queue)) {
 		engine->execlist_first = &request->priotree.node;
-		if (execlists_elsp_idle(engine))
+		if (execlists_elsp_ready(engine))
 			tasklet_hi_schedule(&engine->irq_tasklet);
 	}
 
-- 
2.11.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH v3 8/8] drm/i915: Emit dma-fence (and execlists submit) first from signaler
  2017-01-24 11:00 [PATCH v3 1/8] drm/i915: Assert that we don't submit to execlists whilst a preempt is pending Chris Wilson
                   ` (5 preceding siblings ...)
  2017-01-24 11:00 ` [PATCH v3 7/8] drm/i915: Dequeue execlists on a new request if any port is available Chris Wilson
@ 2017-01-24 11:00 ` Chris Wilson
  2017-01-24 11:54 ` ✓ Fi.CI.BAT: success for series starting with [v3,1/8] drm/i915: Assert that we don't submit to execlists whilst a preempt is pending Patchwork
  2017-01-24 15:54 ` ✓ Fi.CI.BAT: success for series starting with [v3,1/8] drm/i915: Assert that we don't submit to execlists whilst a preempt is pending (rev3) Patchwork
  8 siblings, 0 replies; 16+ messages in thread
From: Chris Wilson @ 2017-01-24 11:00 UTC (permalink / raw)
  To: intel-gfx

When introduced, I thought that reducing client latency from the
signaler was the priority. Since its inception the signaler has become
responsible for keeping the execlists full, via the dma-fence. As this
is very important to minimise overall execution time, signal the
dma-fence first and then signal any waiting clients.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/intel_breadcrumbs.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_breadcrumbs.c b/drivers/gpu/drm/i915/intel_breadcrumbs.c
index 3dea9e7e17e9..e798ec7bb4f5 100644
--- a/drivers/gpu/drm/i915/intel_breadcrumbs.c
+++ b/drivers/gpu/drm/i915/intel_breadcrumbs.c
@@ -461,16 +461,16 @@ static int intel_breadcrumbs_signaler(void *arg)
 		 */
 		request = READ_ONCE(b->first_signal);
 		if (signal_complete(request)) {
+			local_bh_disable();
+			dma_fence_signal(&request->fence);
+			local_bh_enable(); /* kick start the tasklets */
+
 			/* Wake up all other completed waiters and select the
 			 * next bottom-half for the next user interrupt.
 			 */
 			intel_engine_remove_wait(engine,
 						 &request->signaling.wait);
 
-			local_bh_disable();
-			dma_fence_signal(&request->fence);
-			local_bh_enable(); /* kick start the tasklets */
-
 			/* Find the next oldest signal. Note that as we have
 			 * not been holding the lock, another client may
 			 * have installed an even older signal than the one
-- 
2.11.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* ✓ Fi.CI.BAT: success for series starting with [v3,1/8] drm/i915: Assert that we don't submit to execlists whilst a preempt is pending
  2017-01-24 11:00 [PATCH v3 1/8] drm/i915: Assert that we don't submit to execlists whilst a preempt is pending Chris Wilson
                   ` (6 preceding siblings ...)
  2017-01-24 11:00 ` [PATCH v3 8/8] drm/i915: Emit dma-fence (and execlists submit) first from signaler Chris Wilson
@ 2017-01-24 11:54 ` Patchwork
  2017-01-24 15:54 ` ✓ Fi.CI.BAT: success for series starting with [v3,1/8] drm/i915: Assert that we don't submit to execlists whilst a preempt is pending (rev3) Patchwork
  8 siblings, 0 replies; 16+ messages in thread
From: Patchwork @ 2017-01-24 11:54 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

== Series Details ==

Series: series starting with [v3,1/8] drm/i915: Assert that we don't submit to execlists whilst a preempt is pending
URL   : https://patchwork.freedesktop.org/series/18474/
State : success

== Summary ==

Series 18474v1 Series without cover letter
https://patchwork.freedesktop.org/api/1.0/series/18474/revisions/1/mbox/


fi-bdw-5557u     total:246  pass:232  dwarn:0   dfail:0   fail:0   skip:14 
fi-bsw-n3050     total:246  pass:207  dwarn:0   dfail:0   fail:0   skip:39 
fi-bxt-j4205     total:246  pass:224  dwarn:0   dfail:0   fail:0   skip:22 
fi-bxt-t5700     total:79   pass:66   dwarn:0   dfail:0   fail:0   skip:12 
fi-byt-j1900     total:246  pass:219  dwarn:0   dfail:0   fail:0   skip:27 
fi-byt-n2820     total:246  pass:215  dwarn:0   dfail:0   fail:0   skip:31 
fi-hsw-4770      total:246  pass:227  dwarn:0   dfail:0   fail:0   skip:19 
fi-hsw-4770r     total:246  pass:227  dwarn:0   dfail:0   fail:0   skip:19 
fi-ivb-3520m     total:246  pass:225  dwarn:0   dfail:0   fail:0   skip:21 
fi-ivb-3770      total:246  pass:225  dwarn:0   dfail:0   fail:0   skip:21 
fi-kbl-7500u     total:246  pass:225  dwarn:0   dfail:0   fail:0   skip:21 
fi-skl-6260u     total:246  pass:233  dwarn:0   dfail:0   fail:0   skip:13 
fi-skl-6700hq    total:246  pass:226  dwarn:0   dfail:0   fail:0   skip:20 
fi-skl-6700k     total:246  pass:222  dwarn:3   dfail:0   fail:0   skip:21 
fi-snb-2520m     total:246  pass:215  dwarn:0   dfail:0   fail:0   skip:31 
fi-snb-2600      total:246  pass:214  dwarn:0   dfail:0   fail:0   skip:32 

64fc20ef2f4bf8a6b563a812485fc6ac86637fcd drm-tip: 2017y-01m-24d-10h-02m-05s UTC integration manifest
b023ef3 drm/i915: Emit dma-fence (and execlists submit) first from signaler
d5dd99a drm/i915: Dequeue execlists on a new request if any port is available
8ed8ff2 drm/i915: Only attempt to pass the first request to execlists
a6325a8 drm/i915: Skip the execlists CSB scan and rewrite if the ring is untouched
f09eeae drm/i915: Only run execlist context-switch handler after an interrupt
49c666b drm/i915: Move breadcrumbs irq_posted up a level to engine
1327d79 drm/i915: Only disable execlist preemption for the duration of the request
17953d9 drm/i915: Assert that we don't submit to execlists whilst a preempt is pending

== Logs ==

For more details see: https://intel-gfx-ci.01.org/CI/Patchwork_3589/
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH v4] drm/i915: Move breadcrumbs irq_posted up a level to engine
  2017-01-24 11:00 ` [PATCH v3 3/8] drm/i915: Move breadcrumbs irq_posted up a level to engine Chris Wilson
@ 2017-01-24 15:18   ` Chris Wilson
  2017-01-24 15:45     ` Mika Kuoppala
  0 siblings, 1 reply; 16+ messages in thread
From: Chris Wilson @ 2017-01-24 15:18 UTC (permalink / raw)
  To: intel-gfx; +Cc: Mika Kuoppala

In the next patch, we will use the irq_posted technique for another
engine interrupt, rather than use two members for the atomic updates, we
can use two bits of one instead. First, we need to update the
breadcrumbs to use the new common engine->irq_posted.

v2: Use set_bit() rather than __set_bit() to ensure atomicity with
respect to other bits in the mask

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com
---
 drivers/gpu/drm/i915/i915_drv.h          | 2 +-
 drivers/gpu/drm/i915/i915_irq.c          | 2 +-
 drivers/gpu/drm/i915/intel_breadcrumbs.c | 9 ++++-----
 drivers/gpu/drm/i915/intel_ringbuffer.h  | 4 +++-
 4 files changed, 9 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index ccb4fb804fca..652e40c6ca40 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -3964,7 +3964,7 @@ __i915_request_irq_complete(struct drm_i915_gem_request *req)
 	 */
 	if (engine->irq_seqno_barrier &&
 	    rcu_access_pointer(engine->breadcrumbs.irq_seqno_bh) == current &&
-	    cmpxchg_relaxed(&engine->breadcrumbs.irq_posted, 1, 0)) {
+	    test_and_clear_bit(ENGINE_IRQ_BREADCRUMB, &engine->irq_posted)) {
 		struct task_struct *tsk;
 
 		/* The ordering of irq_posted versus applying the barrier
diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index 6fefc34ef602..7e087c344265 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -1033,7 +1033,7 @@ static void ironlake_rps_change_irq_handler(struct drm_i915_private *dev_priv)
 
 static void notify_ring(struct intel_engine_cs *engine)
 {
-	smp_store_mb(engine->breadcrumbs.irq_posted, true);
+	set_bit(ENGINE_IRQ_BREADCRUMB, &engine->irq_posted);
 	if (intel_engine_wakeup(engine))
 		trace_i915_gem_request_notify(engine);
 }
diff --git a/drivers/gpu/drm/i915/intel_breadcrumbs.c b/drivers/gpu/drm/i915/intel_breadcrumbs.c
index c6fa77177615..6b24f2544b6b 100644
--- a/drivers/gpu/drm/i915/intel_breadcrumbs.c
+++ b/drivers/gpu/drm/i915/intel_breadcrumbs.c
@@ -81,7 +81,7 @@ static void irq_enable(struct intel_engine_cs *engine)
 	 * we still need to force the barrier before reading the seqno,
 	 * just in case.
 	 */
-	engine->breadcrumbs.irq_posted = true;
+	set_bit(ENGINE_IRQ_BREADCRUMB, &engine->irq_posted);
 
 	/* Caller disables interrupts */
 	spin_lock(&engine->i915->irq_lock);
@@ -95,8 +95,6 @@ static void irq_disable(struct intel_engine_cs *engine)
 	spin_lock(&engine->i915->irq_lock);
 	engine->irq_disable(engine);
 	spin_unlock(&engine->i915->irq_lock);
-
-	engine->breadcrumbs.irq_posted = false;
 }
 
 static void __intel_breadcrumbs_enable_irq(struct intel_breadcrumbs *b)
@@ -257,7 +255,8 @@ static bool __intel_engine_add_wait(struct intel_engine_cs *engine,
 			 * in case the seqno passed.
 			 */
 			__intel_breadcrumbs_enable_irq(b);
-			if (READ_ONCE(b->irq_posted))
+			if (test_bit(ENGINE_IRQ_BREADCRUMB,
+				     &engine->irq_posted))
 				wake_up_process(to_wait(next)->tsk);
 		}
 
@@ -610,7 +609,7 @@ void intel_engine_reset_breadcrumbs(struct intel_engine_cs *engine)
 	if (intel_engine_has_waiter(engine)) {
 		b->timeout = wait_timeout();
 		__intel_breadcrumbs_enable_irq(b);
-		if (READ_ONCE(b->irq_posted))
+		if (test_bit(ENGINE_IRQ_BREADCRUMB, &engine->irq_posted))
 			wake_up_process(b->first_wait->tsk);
 	} else {
 		/* sanitize the IMR and unmask any auxiliary interrupts */
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index dbd32585f27a..a9ea84ea3155 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -211,6 +211,9 @@ struct intel_engine_cs {
 
 	struct intel_render_state *render_state;
 
+	unsigned long irq_posted;
+#define ENGINE_IRQ_BREADCRUMB 0
+
 	/* Rather than have every client wait upon all user interrupts,
 	 * with the herd waking after every interrupt and each doing the
 	 * heavyweight seqno dance, we delegate the task (of being the
@@ -229,7 +232,6 @@ struct intel_engine_cs {
 	 */
 	struct intel_breadcrumbs {
 		struct task_struct __rcu *irq_seqno_bh; /* bh for interrupts */
-		bool irq_posted;
 
 		spinlock_t lock; /* protects the lists of requests; irqsafe */
 		struct rb_root waiters; /* sorted by retirement, priority */
-- 
2.11.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH v4] drm/i915: Only run execlist context-switch handler after an interrupt
  2017-01-24 11:00 ` [PATCH v3 4/8] drm/i915: Only run execlist context-switch handler after an interrupt Chris Wilson
@ 2017-01-24 15:20   ` Chris Wilson
  2017-01-24 15:46     ` Mika Kuoppala
  0 siblings, 1 reply; 16+ messages in thread
From: Chris Wilson @ 2017-01-24 15:20 UTC (permalink / raw)
  To: intel-gfx; +Cc: Mika Kuoppala

Mark when we run the execlist tasklet following the interrupt, so we
don't probe a potentially uninitialised register when submitting the
contexts multiple times before the hardware responds.

v2: Use a shared engine->irq_posted
v3: Always use locked bitops to be sure of atomicity wrt to other bits
in the mask.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com
---
 drivers/gpu/drm/i915/i915_irq.c         | 7 +++++--
 drivers/gpu/drm/i915/intel_lrc.c        | 3 ++-
 drivers/gpu/drm/i915/intel_ringbuffer.h | 1 +
 3 files changed, 8 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index 7e087c344265..3f3c9082b0f8 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -1349,8 +1349,11 @@ gen8_cs_irq_handler(struct intel_engine_cs *engine, u32 iir, int test_shift)
 {
 	if (iir & (GT_RENDER_USER_INTERRUPT << test_shift))
 		notify_ring(engine);
-	if (iir & (GT_CONTEXT_SWITCH_INTERRUPT << test_shift))
-		tasklet_schedule(&engine->irq_tasklet);
+
+	if (iir & (GT_CONTEXT_SWITCH_INTERRUPT << test_shift)) {
+		set_bit(ENGINE_IRQ_EXECLIST, &engine->irq_posted);
+		tasklet_hi_schedule(&engine->irq_tasklet);
+	}
 }
 
 static irqreturn_t gen8_gt_irq_ack(struct drm_i915_private *dev_priv,
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 9896027880ea..f729568e5e54 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -564,7 +564,7 @@ static void intel_lrc_irq_handler(unsigned long data)
 
 	intel_uncore_forcewake_get(dev_priv, engine->fw_domains);
 
-	if (!execlists_elsp_idle(engine)) {
+	while (test_and_clear_bit(ENGINE_IRQ_EXECLIST, &engine->irq_posted)) {
 		u32 __iomem *csb_mmio =
 			dev_priv->regs + i915_mmio_reg_offset(RING_CONTEXT_STATUS_PTR(engine));
 		u32 __iomem *buf =
@@ -1297,6 +1297,7 @@ static int gen8_init_common_ring(struct intel_engine_cs *engine)
 	DRM_DEBUG_DRIVER("Execlists enabled for %s\n", engine->name);
 
 	/* After a GPU reset, we may have requests to replay */
+	clear_bit(ENGINE_IRQ_EXECLIST, &engine->irq_posted);
 	if (!execlists_elsp_idle(engine)) {
 		engine->execlist_port[0].count = 0;
 		engine->execlist_port[1].count = 0;
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index a9ea84ea3155..8e872730f8eb 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -213,6 +213,7 @@ struct intel_engine_cs {
 
 	unsigned long irq_posted;
 #define ENGINE_IRQ_BREADCRUMB 0
+#define ENGINE_IRQ_EXECLIST 1
 
 	/* Rather than have every client wait upon all user interrupts,
 	 * with the herd waking after every interrupt and each doing the
-- 
2.11.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH v4] drm/i915: Move breadcrumbs irq_posted up a level to engine
  2017-01-24 15:18   ` [PATCH v4] " Chris Wilson
@ 2017-01-24 15:45     ` Mika Kuoppala
  0 siblings, 0 replies; 16+ messages in thread
From: Mika Kuoppala @ 2017-01-24 15:45 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx

Chris Wilson <chris@chris-wilson.co.uk> writes:

> In the next patch, we will use the irq_posted technique for another
> engine interrupt, rather than use two members for the atomic updates, we
> can use two bits of one instead. First, we need to update the
> breadcrumbs to use the new common engine->irq_posted.
>
> v2: Use set_bit() rather than __set_bit() to ensure atomicity with
> respect to other bits in the mask
>
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Mika Kuoppala <mika.kuoppala@intel.com>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com
Missing '>' in the end of line

Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>

> ---
>  drivers/gpu/drm/i915/i915_drv.h          | 2 +-
>  drivers/gpu/drm/i915/i915_irq.c          | 2 +-
>  drivers/gpu/drm/i915/intel_breadcrumbs.c | 9 ++++-----
>  drivers/gpu/drm/i915/intel_ringbuffer.h  | 4 +++-
>  4 files changed, 9 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index ccb4fb804fca..652e40c6ca40 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -3964,7 +3964,7 @@ __i915_request_irq_complete(struct drm_i915_gem_request *req)
>  	 */
>  	if (engine->irq_seqno_barrier &&
>  	    rcu_access_pointer(engine->breadcrumbs.irq_seqno_bh) == current &&
> -	    cmpxchg_relaxed(&engine->breadcrumbs.irq_posted, 1, 0)) {
> +	    test_and_clear_bit(ENGINE_IRQ_BREADCRUMB, &engine->irq_posted)) {
>  		struct task_struct *tsk;
>  
>  		/* The ordering of irq_posted versus applying the barrier
> diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
> index 6fefc34ef602..7e087c344265 100644
> --- a/drivers/gpu/drm/i915/i915_irq.c
> +++ b/drivers/gpu/drm/i915/i915_irq.c
> @@ -1033,7 +1033,7 @@ static void ironlake_rps_change_irq_handler(struct drm_i915_private *dev_priv)
>  
>  static void notify_ring(struct intel_engine_cs *engine)
>  {
> -	smp_store_mb(engine->breadcrumbs.irq_posted, true);
> +	set_bit(ENGINE_IRQ_BREADCRUMB, &engine->irq_posted);
>  	if (intel_engine_wakeup(engine))
>  		trace_i915_gem_request_notify(engine);
>  }
> diff --git a/drivers/gpu/drm/i915/intel_breadcrumbs.c b/drivers/gpu/drm/i915/intel_breadcrumbs.c
> index c6fa77177615..6b24f2544b6b 100644
> --- a/drivers/gpu/drm/i915/intel_breadcrumbs.c
> +++ b/drivers/gpu/drm/i915/intel_breadcrumbs.c
> @@ -81,7 +81,7 @@ static void irq_enable(struct intel_engine_cs *engine)
>  	 * we still need to force the barrier before reading the seqno,
>  	 * just in case.
>  	 */
> -	engine->breadcrumbs.irq_posted = true;
> +	set_bit(ENGINE_IRQ_BREADCRUMB, &engine->irq_posted);
>  
>  	/* Caller disables interrupts */
>  	spin_lock(&engine->i915->irq_lock);
> @@ -95,8 +95,6 @@ static void irq_disable(struct intel_engine_cs *engine)
>  	spin_lock(&engine->i915->irq_lock);
>  	engine->irq_disable(engine);
>  	spin_unlock(&engine->i915->irq_lock);
> -
> -	engine->breadcrumbs.irq_posted = false;
>  }
>  
>  static void __intel_breadcrumbs_enable_irq(struct intel_breadcrumbs *b)
> @@ -257,7 +255,8 @@ static bool __intel_engine_add_wait(struct intel_engine_cs *engine,
>  			 * in case the seqno passed.
>  			 */
>  			__intel_breadcrumbs_enable_irq(b);
> -			if (READ_ONCE(b->irq_posted))
> +			if (test_bit(ENGINE_IRQ_BREADCRUMB,
> +				     &engine->irq_posted))
>  				wake_up_process(to_wait(next)->tsk);
>  		}
>  
> @@ -610,7 +609,7 @@ void intel_engine_reset_breadcrumbs(struct intel_engine_cs *engine)
>  	if (intel_engine_has_waiter(engine)) {
>  		b->timeout = wait_timeout();
>  		__intel_breadcrumbs_enable_irq(b);
> -		if (READ_ONCE(b->irq_posted))
> +		if (test_bit(ENGINE_IRQ_BREADCRUMB, &engine->irq_posted))
>  			wake_up_process(b->first_wait->tsk);
>  	} else {
>  		/* sanitize the IMR and unmask any auxiliary interrupts */
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
> index dbd32585f27a..a9ea84ea3155 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.h
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
> @@ -211,6 +211,9 @@ struct intel_engine_cs {
>  
>  	struct intel_render_state *render_state;
>  
> +	unsigned long irq_posted;
> +#define ENGINE_IRQ_BREADCRUMB 0
> +
>  	/* Rather than have every client wait upon all user interrupts,
>  	 * with the herd waking after every interrupt and each doing the
>  	 * heavyweight seqno dance, we delegate the task (of being the
> @@ -229,7 +232,6 @@ struct intel_engine_cs {
>  	 */
>  	struct intel_breadcrumbs {
>  		struct task_struct __rcu *irq_seqno_bh; /* bh for interrupts */
> -		bool irq_posted;
>  
>  		spinlock_t lock; /* protects the lists of requests; irqsafe */
>  		struct rb_root waiters; /* sorted by retirement, priority */
> -- 
> 2.11.0
>
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/intel-gfx
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v4] drm/i915: Only run execlist context-switch handler after an interrupt
  2017-01-24 15:20   ` [PATCH v4] " Chris Wilson
@ 2017-01-24 15:46     ` Mika Kuoppala
  0 siblings, 0 replies; 16+ messages in thread
From: Mika Kuoppala @ 2017-01-24 15:46 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx

Chris Wilson <chris@chris-wilson.co.uk> writes:

> Mark when we run the execlist tasklet following the interrupt, so we
> don't probe a potentially uninitialised register when submitting the
> contexts multiple times before the hardware responds.
>
> v2: Use a shared engine->irq_posted
> v3: Always use locked bitops to be sure of atomicity wrt to other bits
> in the mask.
>
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Mika Kuoppala <mika.kuoppala@intel.com>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com

Missing '>' from Cc.

Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>

> ---
>  drivers/gpu/drm/i915/i915_irq.c         | 7 +++++--
>  drivers/gpu/drm/i915/intel_lrc.c        | 3 ++-
>  drivers/gpu/drm/i915/intel_ringbuffer.h | 1 +
>  3 files changed, 8 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
> index 7e087c344265..3f3c9082b0f8 100644
> --- a/drivers/gpu/drm/i915/i915_irq.c
> +++ b/drivers/gpu/drm/i915/i915_irq.c
> @@ -1349,8 +1349,11 @@ gen8_cs_irq_handler(struct intel_engine_cs *engine, u32 iir, int test_shift)
>  {
>  	if (iir & (GT_RENDER_USER_INTERRUPT << test_shift))
>  		notify_ring(engine);
> -	if (iir & (GT_CONTEXT_SWITCH_INTERRUPT << test_shift))
> -		tasklet_schedule(&engine->irq_tasklet);
> +
> +	if (iir & (GT_CONTEXT_SWITCH_INTERRUPT << test_shift)) {
> +		set_bit(ENGINE_IRQ_EXECLIST, &engine->irq_posted);
> +		tasklet_hi_schedule(&engine->irq_tasklet);
> +	}
>  }
>  
>  static irqreturn_t gen8_gt_irq_ack(struct drm_i915_private *dev_priv,
> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
> index 9896027880ea..f729568e5e54 100644
> --- a/drivers/gpu/drm/i915/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/intel_lrc.c
> @@ -564,7 +564,7 @@ static void intel_lrc_irq_handler(unsigned long data)
>  
>  	intel_uncore_forcewake_get(dev_priv, engine->fw_domains);
>  
> -	if (!execlists_elsp_idle(engine)) {
> +	while (test_and_clear_bit(ENGINE_IRQ_EXECLIST, &engine->irq_posted)) {
>  		u32 __iomem *csb_mmio =
>  			dev_priv->regs + i915_mmio_reg_offset(RING_CONTEXT_STATUS_PTR(engine));
>  		u32 __iomem *buf =
> @@ -1297,6 +1297,7 @@ static int gen8_init_common_ring(struct intel_engine_cs *engine)
>  	DRM_DEBUG_DRIVER("Execlists enabled for %s\n", engine->name);
>  
>  	/* After a GPU reset, we may have requests to replay */
> +	clear_bit(ENGINE_IRQ_EXECLIST, &engine->irq_posted);
>  	if (!execlists_elsp_idle(engine)) {
>  		engine->execlist_port[0].count = 0;
>  		engine->execlist_port[1].count = 0;
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
> index a9ea84ea3155..8e872730f8eb 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.h
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
> @@ -213,6 +213,7 @@ struct intel_engine_cs {
>  
>  	unsigned long irq_posted;
>  #define ENGINE_IRQ_BREADCRUMB 0
> +#define ENGINE_IRQ_EXECLIST 1
>  
>  	/* Rather than have every client wait upon all user interrupts,
>  	 * with the herd waking after every interrupt and each doing the
> -- 
> 2.11.0
>
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/intel-gfx
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v3 6/8] drm/i915: Only attempt to pass the first request to execlists
  2017-01-24 11:00 ` [PATCH v3 6/8] drm/i915: Only attempt to pass the first request to execlists Chris Wilson
@ 2017-01-24 15:51   ` Mika Kuoppala
  0 siblings, 0 replies; 16+ messages in thread
From: Mika Kuoppala @ 2017-01-24 15:51 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx

Chris Wilson <chris@chris-wilson.co.uk> writes:

> Only the first request added to the execlist queue, can be submitted. If
> this request is not the first request on the queue, it means that there
> are already higher priority requests waiting upon the tasklet and
> kicking it will make no difference.
>
> This is more relevant for a later patch, where we more eagerly try and
> kick the tasklet to handle the submission of new requests.
>
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Mika Kuoppala <mika.kuoppala@intel.com>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com

Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>

> ---
>  drivers/gpu/drm/i915/intel_lrc.c | 7 ++++---
>  1 file changed, 4 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
> index a46a05f0e79c..9bf8858ecdcf 100644
> --- a/drivers/gpu/drm/i915/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/intel_lrc.c
> @@ -651,10 +651,11 @@ static void execlists_submit_request(struct drm_i915_gem_request *request)
>  	/* Will be called from irq-context when using foreign fences. */
>  	spin_lock_irqsave(&engine->timeline->lock, flags);
>  
> -	if (insert_request(&request->priotree, &engine->execlist_queue))
> +	if (insert_request(&request->priotree, &engine->execlist_queue)) {
>  		engine->execlist_first = &request->priotree.node;
> -	if (execlists_elsp_idle(engine))
> -		tasklet_hi_schedule(&engine->irq_tasklet);
> +		if (execlists_elsp_idle(engine))
> +			tasklet_hi_schedule(&engine->irq_tasklet);
> +	}
>  
>  	spin_unlock_irqrestore(&engine->timeline->lock, flags);
>  }
> -- 
> 2.11.0
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 16+ messages in thread

* ✓ Fi.CI.BAT: success for series starting with [v3,1/8] drm/i915: Assert that we don't submit to execlists whilst a preempt is pending (rev3)
  2017-01-24 11:00 [PATCH v3 1/8] drm/i915: Assert that we don't submit to execlists whilst a preempt is pending Chris Wilson
                   ` (7 preceding siblings ...)
  2017-01-24 11:54 ` ✓ Fi.CI.BAT: success for series starting with [v3,1/8] drm/i915: Assert that we don't submit to execlists whilst a preempt is pending Patchwork
@ 2017-01-24 15:54 ` Patchwork
  2017-01-24 16:06   ` Chris Wilson
  8 siblings, 1 reply; 16+ messages in thread
From: Patchwork @ 2017-01-24 15:54 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

== Series Details ==

Series: series starting with [v3,1/8] drm/i915: Assert that we don't submit to execlists whilst a preempt is pending (rev3)
URL   : https://patchwork.freedesktop.org/series/18474/
State : success

== Summary ==

Series 18474v3 Series without cover letter
https://patchwork.freedesktop.org/api/1.0/series/18474/revisions/3/mbox/

Test gem_exec_store:
        Subgroup basic-blt:
                incomplete -> PASS       (fi-byt-j1900)

fi-bdw-5557u     total:247  pass:233  dwarn:0   dfail:0   fail:0   skip:14 
fi-bsw-n3050     total:247  pass:208  dwarn:0   dfail:0   fail:0   skip:39 
fi-bxt-j4205     total:247  pass:225  dwarn:0   dfail:0   fail:0   skip:22 
fi-bxt-t5700     total:79   pass:66   dwarn:0   dfail:0   fail:0   skip:12 
fi-byt-j1900     total:247  pass:220  dwarn:0   dfail:0   fail:0   skip:27 
fi-byt-n2820     total:247  pass:216  dwarn:0   dfail:0   fail:0   skip:31 
fi-hsw-4770      total:247  pass:228  dwarn:0   dfail:0   fail:0   skip:19 
fi-hsw-4770r     total:247  pass:228  dwarn:0   dfail:0   fail:0   skip:19 
fi-ivb-3520m     total:247  pass:226  dwarn:0   dfail:0   fail:0   skip:21 
fi-ivb-3770      total:247  pass:226  dwarn:0   dfail:0   fail:0   skip:21 
fi-kbl-7500u     total:247  pass:226  dwarn:0   dfail:0   fail:0   skip:21 
fi-skl-6260u     total:247  pass:234  dwarn:0   dfail:0   fail:0   skip:13 
fi-skl-6700hq    total:247  pass:227  dwarn:0   dfail:0   fail:0   skip:20 
fi-skl-6700k     total:247  pass:222  dwarn:4   dfail:0   fail:0   skip:21 
fi-skl-6770hq    total:247  pass:234  dwarn:0   dfail:0   fail:0   skip:13 
fi-snb-2520m     total:247  pass:216  dwarn:0   dfail:0   fail:0   skip:31 
fi-snb-2600      total:247  pass:215  dwarn:0   dfail:0   fail:0   skip:32 

d0150572848ac878b63afdd3d4fcd9456635024d drm-tip: 2017y-01m-24d-13h-35m-17s UTC integration manifest
05c07b2 drm/i915: Emit dma-fence (and execlists submit) first from signaler
eecdd01 drm/i915: Dequeue execlists on a new request if any port is available
80a30c4 drm/i915: Only attempt to pass the first request to execlists
fe661c8 drm/i915: Skip the execlists CSB scan and rewrite if the ring is untouched
43170be drm/i915: Only run execlist context-switch handler after an interrupt
85c5527 drm/i915: Move breadcrumbs irq_posted up a level to engine
ffc5260 drm/i915: Only disable execlist preemption for the duration of the request
291be04 drm/i915: Assert that we don't submit to execlists whilst a preempt is pending

== Logs ==

For more details see: https://intel-gfx-ci.01.org/CI/Patchwork_3594/
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: ✓ Fi.CI.BAT: success for series starting with [v3,1/8] drm/i915: Assert that we don't submit to execlists whilst a preempt is pending (rev3)
  2017-01-24 15:54 ` ✓ Fi.CI.BAT: success for series starting with [v3,1/8] drm/i915: Assert that we don't submit to execlists whilst a preempt is pending (rev3) Patchwork
@ 2017-01-24 16:06   ` Chris Wilson
  0 siblings, 0 replies; 16+ messages in thread
From: Chris Wilson @ 2017-01-24 16:06 UTC (permalink / raw)
  To: intel-gfx

On Tue, Jan 24, 2017 at 03:54:19PM -0000, Patchwork wrote:
> == Series Details ==
> 
> Series: series starting with [v3,1/8] drm/i915: Assert that we don't submit to execlists whilst a preempt is pending (rev3)
> URL   : https://patchwork.freedesktop.org/series/18474/
> State : success
> 
> == Summary ==
> 
> Series 18474v3 Series without cover letter
> https://patchwork.freedesktop.org/api/1.0/series/18474/revisions/3/mbox/
> 
> Test gem_exec_store:
>         Subgroup basic-blt:
>                 incomplete -> PASS       (fi-byt-j1900)

Thanks for the reviews, lots of patches for such an unnoticeable
improvement. Still looking for a miracle.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2017-01-24 16:06 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-01-24 11:00 [PATCH v3 1/8] drm/i915: Assert that we don't submit to execlists whilst a preempt is pending Chris Wilson
2017-01-24 11:00 ` [PATCH v3 2/8] drm/i915: Only disable execlist preemption for the duration of the request Chris Wilson
2017-01-24 11:00 ` [PATCH v3 3/8] drm/i915: Move breadcrumbs irq_posted up a level to engine Chris Wilson
2017-01-24 15:18   ` [PATCH v4] " Chris Wilson
2017-01-24 15:45     ` Mika Kuoppala
2017-01-24 11:00 ` [PATCH v3 4/8] drm/i915: Only run execlist context-switch handler after an interrupt Chris Wilson
2017-01-24 15:20   ` [PATCH v4] " Chris Wilson
2017-01-24 15:46     ` Mika Kuoppala
2017-01-24 11:00 ` [PATCH v3 5/8] drm/i915: Skip the execlists CSB scan and rewrite if the ring is untouched Chris Wilson
2017-01-24 11:00 ` [PATCH v3 6/8] drm/i915: Only attempt to pass the first request to execlists Chris Wilson
2017-01-24 15:51   ` Mika Kuoppala
2017-01-24 11:00 ` [PATCH v3 7/8] drm/i915: Dequeue execlists on a new request if any port is available Chris Wilson
2017-01-24 11:00 ` [PATCH v3 8/8] drm/i915: Emit dma-fence (and execlists submit) first from signaler Chris Wilson
2017-01-24 11:54 ` ✓ Fi.CI.BAT: success for series starting with [v3,1/8] drm/i915: Assert that we don't submit to execlists whilst a preempt is pending Patchwork
2017-01-24 15:54 ` ✓ Fi.CI.BAT: success for series starting with [v3,1/8] drm/i915: Assert that we don't submit to execlists whilst a preempt is pending (rev3) Patchwork
2017-01-24 16:06   ` Chris Wilson

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.