* [PATCH 0/8] Support for more than two execlist ports (v2)
@ 2017-09-20 14:36 Mika Kuoppala
  2017-09-20 14:36 ` [PATCH 1/8] drm/i915: Make own struct for execlist items Mika Kuoppala
                   ` (9 more replies)
  0 siblings, 10 replies; 26+ messages in thread
From: Mika Kuoppala @ 2017-09-20 14:36 UTC (permalink / raw)
  To: intel-gfx

Hi,

The HWSP context status buffer handling, GuC cleanup and request
coalescing series were merged, which triggered a non-trivial
rebase. As a result I had to drop the reviewed-by tags from the patches :(

The first four should still read more or less the same, though.

Thank you for the reviews and comments!

-Mika

Mika Kuoppala (8):
  drm/i915: Make own struct for execlist items
  drm/i915: Move execlist initialization into intel_engine_cs.c
  drm/i915: Wrap port cancellation into a function
  drm/i915: Add execlist_port_complete
  drm/i915: Make execlist port count variable
  drm/i915: Introduce execlist_port_* accessors
  drm/i915: Keep track of reserved execlist ports
  drm/i915: Improve GuC request coalescing

 drivers/gpu/drm/i915/i915_debugfs.c        |  24 ++--
 drivers/gpu/drm/i915/i915_drv.h            |   3 +-
 drivers/gpu/drm/i915/i915_gem.c            |   6 +-
 drivers/gpu/drm/i915/i915_gpu_error.c      |  17 ++-
 drivers/gpu/drm/i915/i915_guc_submission.c | 102 ++++++++------
 drivers/gpu/drm/i915/i915_irq.c            |   5 +-
 drivers/gpu/drm/i915/intel_engine_cs.c     |  48 +++++--
 drivers/gpu/drm/i915/intel_lrc.c           | 214 ++++++++++++++++-------------
 drivers/gpu/drm/i915/intel_ringbuffer.h    | 197 +++++++++++++++++++++++---
 9 files changed, 424 insertions(+), 192 deletions(-)

-- 
2.11.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


* [PATCH 1/8] drm/i915: Make own struct for execlist items
  2017-09-20 14:36 [PATCH 0/8] Support for more than two execlist ports (v2) Mika Kuoppala
@ 2017-09-20 14:36 ` Mika Kuoppala
  2017-09-21 11:55   ` Michał Winiarski
                     ` (2 more replies)
  2017-09-20 14:36 ` [PATCH 2/8] drm/i915: Move execlist initialization into intel_engine_cs.c Mika Kuoppala
                   ` (8 subsequent siblings)
  9 siblings, 3 replies; 26+ messages in thread
From: Mika Kuoppala @ 2017-09-20 14:36 UTC (permalink / raw)
  To: intel-gfx

The engine's execlist-related items have grown to the point where
a separate struct is warranted. Carve the execlist-specific items
out into a dedicated struct to add clarity.

v2: add kerneldoc and fix whitespace (Joonas, Chris)
v3: csb_mmio changes, rebase

Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Acked-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
---
 drivers/gpu/drm/i915/i915_debugfs.c        |   8 +--
 drivers/gpu/drm/i915/i915_gem.c            |   6 +-
 drivers/gpu/drm/i915/i915_gpu_error.c      |   4 +-
 drivers/gpu/drm/i915/i915_guc_submission.c |  31 +++++----
 drivers/gpu/drm/i915/i915_irq.c            |   5 +-
 drivers/gpu/drm/i915/intel_engine_cs.c     |  12 ++--
 drivers/gpu/drm/i915/intel_lrc.c           | 100 +++++++++++++++--------------
 drivers/gpu/drm/i915/intel_ringbuffer.h    | 100 +++++++++++++++++++++++------
 8 files changed, 167 insertions(+), 99 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index ca6fa6d122c6..7648ff207670 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -3324,7 +3324,7 @@ static int i915_engine_info(struct seq_file *m, void *unused)
 			read = GEN8_CSB_READ_PTR(ptr);
 			write = GEN8_CSB_WRITE_PTR(ptr);
 			seq_printf(m, "\tExeclist CSB read %d [%d cached], write %d [%d from hws], interrupt posted? %s\n",
-				   read, engine->csb_head,
+				   read, engine->execlist.csb_head,
 				   write,
 				   intel_read_status_page(engine, intel_hws_csb_write_index(engine->i915)),
 				   yesno(test_bit(ENGINE_IRQ_EXECLIST,
@@ -3346,10 +3346,10 @@ static int i915_engine_info(struct seq_file *m, void *unused)
 			}
 
 			rcu_read_lock();
-			for (idx = 0; idx < ARRAY_SIZE(engine->execlist_port); idx++) {
+			for (idx = 0; idx < ARRAY_SIZE(engine->execlist.port); idx++) {
 				unsigned int count;
 
-				rq = port_unpack(&engine->execlist_port[idx],
+				rq = port_unpack(&engine->execlist.port[idx],
 						 &count);
 				if (rq) {
 					seq_printf(m, "\t\tELSP[%d] count=%d, ",
@@ -3363,7 +3363,7 @@ static int i915_engine_info(struct seq_file *m, void *unused)
 			rcu_read_unlock();
 
 			spin_lock_irq(&engine->timeline->lock);
-			for (rb = engine->execlist_first; rb; rb = rb_next(rb)){
+			for (rb = engine->execlist.first; rb; rb = rb_next(rb)) {
 				struct i915_priolist *p =
 					rb_entry(rb, typeof(*p), node);
 
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index c4bf34865fa3..7934bafab081 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2815,8 +2815,8 @@ i915_gem_reset_prepare_engine(struct intel_engine_cs *engine)
 	 * Turning off the engine->irq_tasklet until the reset is over
 	 * prevents the race.
 	 */
-	tasklet_kill(&engine->irq_tasklet);
-	tasklet_disable(&engine->irq_tasklet);
+	tasklet_kill(&engine->execlist.irq_tasklet);
+	tasklet_disable(&engine->execlist.irq_tasklet);
 
 	if (engine->irq_seqno_barrier)
 		engine->irq_seqno_barrier(engine);
@@ -2995,7 +2995,7 @@ void i915_gem_reset(struct drm_i915_private *dev_priv)
 
 void i915_gem_reset_finish_engine(struct intel_engine_cs *engine)
 {
-	tasklet_enable(&engine->irq_tasklet);
+	tasklet_enable(&engine->execlist.irq_tasklet);
 	kthread_unpark(engine->breadcrumbs.signaler);
 }
 
diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
index 0c779671fe2d..8710fbdbbcb1 100644
--- a/drivers/gpu/drm/i915/i915_gpu_error.c
+++ b/drivers/gpu/drm/i915/i915_gpu_error.c
@@ -1327,10 +1327,10 @@ static void engine_record_requests(struct intel_engine_cs *engine,
 static void error_record_engine_execlists(struct intel_engine_cs *engine,
 					  struct drm_i915_error_engine *ee)
 {
-	const struct execlist_port *port = engine->execlist_port;
+	const struct execlist_port *port = engine->execlist.port;
 	unsigned int n;
 
-	for (n = 0; n < ARRAY_SIZE(engine->execlist_port); n++) {
+	for (n = 0; n < ARRAY_SIZE(engine->execlist.port); n++) {
 		struct drm_i915_gem_request *rq = port_request(&port[n]);
 
 		if (!rq)
diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
index e191d56fc990..e967d266bc3c 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -494,11 +494,12 @@ static void i915_guc_submit(struct intel_engine_cs *engine)
 	struct drm_i915_private *dev_priv = engine->i915;
 	struct intel_guc *guc = &dev_priv->guc;
 	struct i915_guc_client *client = guc->execbuf_client;
-	struct execlist_port *port = engine->execlist_port;
-	unsigned int engine_id = engine->id;
+	struct intel_engine_execlist * const el = &engine->execlist;
+	struct execlist_port *port = el->port;
+	const unsigned int engine_id = engine->id;
 	unsigned int n;
 
-	for (n = 0; n < ARRAY_SIZE(engine->execlist_port); n++) {
+	for (n = 0; n < ARRAY_SIZE(el->port); n++) {
 		struct drm_i915_gem_request *rq;
 		unsigned int count;
 
@@ -558,7 +559,8 @@ static void port_assign(struct execlist_port *port,
 
 static void i915_guc_dequeue(struct intel_engine_cs *engine)
 {
-	struct execlist_port *port = engine->execlist_port;
+	struct intel_engine_execlist * const el = &engine->execlist;
+	struct execlist_port *port = el->port;
 	struct drm_i915_gem_request *last = NULL;
 	bool submit = false;
 	struct rb_node *rb;
@@ -567,15 +569,15 @@ static void i915_guc_dequeue(struct intel_engine_cs *engine)
 		port++;
 
 	spin_lock_irq(&engine->timeline->lock);
-	rb = engine->execlist_first;
-	GEM_BUG_ON(rb_first(&engine->execlist_queue) != rb);
+	rb = el->first;
+	GEM_BUG_ON(rb_first(&el->queue) != rb);
 	while (rb) {
 		struct i915_priolist *p = rb_entry(rb, typeof(*p), node);
 		struct drm_i915_gem_request *rq, *rn;
 
 		list_for_each_entry_safe(rq, rn, &p->requests, priotree.link) {
 			if (last && rq->ctx != last->ctx) {
-				if (port != engine->execlist_port) {
+				if (port != el->port) {
 					__list_del_many(&p->requests,
 							&rq->priotree.link);
 					goto done;
@@ -596,13 +598,13 @@ static void i915_guc_dequeue(struct intel_engine_cs *engine)
 		}
 
 		rb = rb_next(rb);
-		rb_erase(&p->node, &engine->execlist_queue);
+		rb_erase(&p->node, &el->queue);
 		INIT_LIST_HEAD(&p->requests);
 		if (p->priority != I915_PRIORITY_NORMAL)
 			kmem_cache_free(engine->i915->priorities, p);
 	}
 done:
-	engine->execlist_first = rb;
+	el->first = rb;
 	if (submit) {
 		port_assign(port, last);
 		i915_guc_submit(engine);
@@ -612,8 +614,8 @@ static void i915_guc_dequeue(struct intel_engine_cs *engine)
 
 static void i915_guc_irq_handler(unsigned long data)
 {
-	struct intel_engine_cs *engine = (struct intel_engine_cs *)data;
-	struct execlist_port *port = engine->execlist_port;
+	struct intel_engine_cs * const engine = (struct intel_engine_cs *)data;
+	struct execlist_port *port = engine->execlist.port;
 	struct drm_i915_gem_request *rq;
 
 	rq = port_request(&port[0]);
@@ -1144,7 +1146,7 @@ int i915_guc_submission_enable(struct drm_i915_private *dev_priv)
 	 * and it is guaranteed that it will remove the work item from the
 	 * queue before our request is completed.
 	 */
-	BUILD_BUG_ON(ARRAY_SIZE(engine->execlist_port) *
+	BUILD_BUG_ON(ARRAY_SIZE(engine->execlist.port) *
 		     sizeof(struct guc_wq_item) *
 		     I915_NUM_ENGINES > GUC_WQ_SIZE);
 
@@ -1175,14 +1177,15 @@ int i915_guc_submission_enable(struct drm_i915_private *dev_priv)
 	guc_interrupts_capture(dev_priv);
 
 	for_each_engine(engine, dev_priv, id) {
+		struct intel_engine_execlist * const el = &engine->execlist;
 		/* The tasklet was initialised by execlists, and may be in
 		 * a state of flux (across a reset) and so we just want to
 		 * take over the callback without changing any other state
 		 * in the tasklet.
 		 */
-		engine->irq_tasklet.func = i915_guc_irq_handler;
+		el->irq_tasklet.func = i915_guc_irq_handler;
 		clear_bit(ENGINE_IRQ_EXECLIST, &engine->irq_posted);
-		tasklet_schedule(&engine->irq_tasklet);
+		tasklet_schedule(&el->irq_tasklet);
 	}
 
 	return 0;
diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index c23efc4394ce..de703c26e124 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -1338,10 +1338,11 @@ static void snb_gt_irq_handler(struct drm_i915_private *dev_priv,
 static void
 gen8_cs_irq_handler(struct intel_engine_cs *engine, u32 iir, int test_shift)
 {
+	struct intel_engine_execlist * const el = &engine->execlist;
 	bool tasklet = false;
 
 	if (iir & (GT_CONTEXT_SWITCH_INTERRUPT << test_shift)) {
-		if (port_count(&engine->execlist_port[0])) {
+		if (port_count(&el->port[0])) {
 			__set_bit(ENGINE_IRQ_EXECLIST, &engine->irq_posted);
 			tasklet = true;
 		}
@@ -1353,7 +1354,7 @@ gen8_cs_irq_handler(struct intel_engine_cs *engine, u32 iir, int test_shift)
 	}
 
 	if (tasklet)
-		tasklet_hi_schedule(&engine->irq_tasklet);
+		tasklet_hi_schedule(&el->irq_tasklet);
 }
 
 static irqreturn_t gen8_gt_irq_ack(struct drm_i915_private *dev_priv,
diff --git a/drivers/gpu/drm/i915/intel_engine_cs.c b/drivers/gpu/drm/i915/intel_engine_cs.c
index 3d135c3cd380..eb6feaf69a3b 100644
--- a/drivers/gpu/drm/i915/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/intel_engine_cs.c
@@ -391,8 +391,8 @@ static void intel_engine_init_timeline(struct intel_engine_cs *engine)
  */
 void intel_engine_setup_common(struct intel_engine_cs *engine)
 {
-	engine->execlist_queue = RB_ROOT;
-	engine->execlist_first = NULL;
+	engine->execlist.queue = RB_ROOT;
+	engine->execlist.first = NULL;
 
 	intel_engine_init_timeline(engine);
 	intel_engine_init_hangcheck(engine);
@@ -1473,11 +1473,11 @@ bool intel_engine_is_idle(struct intel_engine_cs *engine)
 		return false;
 
 	/* Both ports drained, no more ELSP submission? */
-	if (port_request(&engine->execlist_port[0]))
+	if (port_request(&engine->execlist.port[0]))
 		return false;
 
 	/* ELSP is empty, but there are ready requests? */
-	if (READ_ONCE(engine->execlist_first))
+	if (READ_ONCE(engine->execlist.first))
 		return false;
 
 	/* Ring stopped? */
@@ -1526,8 +1526,8 @@ void intel_engines_mark_idle(struct drm_i915_private *i915)
 	for_each_engine(engine, i915, id) {
 		intel_engine_disarm_breadcrumbs(engine);
 		i915_gem_batch_pool_fini(&engine->batch_pool);
-		tasklet_kill(&engine->irq_tasklet);
-		engine->no_priolist = false;
+		tasklet_kill(&engine->execlist.irq_tasklet);
+		engine->execlist.no_priolist = false;
 	}
 }
 
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 86fed9f1f1ae..5c2fcc4936ba 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -291,17 +291,18 @@ lookup_priolist(struct intel_engine_cs *engine,
 		struct i915_priotree *pt,
 		int prio)
 {
+	struct intel_engine_execlist * const el = &engine->execlist;
 	struct i915_priolist *p;
 	struct rb_node **parent, *rb;
 	bool first = true;
 
-	if (unlikely(engine->no_priolist))
+	if (unlikely(el->no_priolist))
 		prio = I915_PRIORITY_NORMAL;
 
 find_priolist:
 	/* most positive priority is scheduled first, equal priorities fifo */
 	rb = NULL;
-	parent = &engine->execlist_queue.rb_node;
+	parent = &el->queue.rb_node;
 	while (*parent) {
 		rb = *parent;
 		p = rb_entry(rb, typeof(*p), node);
@@ -316,7 +317,7 @@ lookup_priolist(struct intel_engine_cs *engine,
 	}
 
 	if (prio == I915_PRIORITY_NORMAL) {
-		p = &engine->default_priolist;
+		p = &el->default_priolist;
 	} else {
 		p = kmem_cache_alloc(engine->i915->priorities, GFP_ATOMIC);
 		/* Convert an allocation failure to a priority bump */
@@ -331,7 +332,7 @@ lookup_priolist(struct intel_engine_cs *engine,
 			 * requests, so if userspace lied about their
 			 * dependencies that reordering may be visible.
 			 */
-			engine->no_priolist = true;
+			el->no_priolist = true;
 			goto find_priolist;
 		}
 	}
@@ -339,10 +340,10 @@ lookup_priolist(struct intel_engine_cs *engine,
 	p->priority = prio;
 	INIT_LIST_HEAD(&p->requests);
 	rb_link_node(&p->node, rb, parent);
-	rb_insert_color(&p->node, &engine->execlist_queue);
+	rb_insert_color(&p->node, &el->queue);
 
 	if (first)
-		engine->execlist_first = &p->node;
+		el->first = &p->node;
 
 	return ptr_pack_bits(p, first, 1);
 }
@@ -393,12 +394,12 @@ static u64 execlists_update_context(struct drm_i915_gem_request *rq)
 
 static void execlists_submit_ports(struct intel_engine_cs *engine)
 {
-	struct execlist_port *port = engine->execlist_port;
+	struct execlist_port *port = engine->execlist.port;
 	u32 __iomem *elsp =
 		engine->i915->regs + i915_mmio_reg_offset(RING_ELSP(engine));
 	unsigned int n;
 
-	for (n = ARRAY_SIZE(engine->execlist_port); n--; ) {
+	for (n = ARRAY_SIZE(engine->execlist.port); n--; ) {
 		struct drm_i915_gem_request *rq;
 		unsigned int count;
 		u64 desc;
@@ -453,7 +454,7 @@ static void port_assign(struct execlist_port *port,
 static void execlists_dequeue(struct intel_engine_cs *engine)
 {
 	struct drm_i915_gem_request *last;
-	struct execlist_port *port = engine->execlist_port;
+	struct execlist_port *port = engine->execlist.port;
 	struct rb_node *rb;
 	bool submit = false;
 
@@ -491,8 +492,8 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
 	 */
 
 	spin_lock_irq(&engine->timeline->lock);
-	rb = engine->execlist_first;
-	GEM_BUG_ON(rb_first(&engine->execlist_queue) != rb);
+	rb = engine->execlist.first;
+	GEM_BUG_ON(rb_first(&engine->execlist.queue) != rb);
 	while (rb) {
 		struct i915_priolist *p = rb_entry(rb, typeof(*p), node);
 		struct drm_i915_gem_request *rq, *rn;
@@ -515,7 +516,7 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
 				 * combine this request with the last, then we
 				 * are done.
 				 */
-				if (port != engine->execlist_port) {
+				if (port != engine->execlist.port) {
 					__list_del_many(&p->requests,
 							&rq->priotree.link);
 					goto done;
@@ -552,13 +553,13 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
 		}
 
 		rb = rb_next(rb);
-		rb_erase(&p->node, &engine->execlist_queue);
+		rb_erase(&p->node, &engine->execlist.queue);
 		INIT_LIST_HEAD(&p->requests);
 		if (p->priority != I915_PRIORITY_NORMAL)
 			kmem_cache_free(engine->i915->priorities, p);
 	}
 done:
-	engine->execlist_first = rb;
+	engine->execlist.first = rb;
 	if (submit)
 		port_assign(port, last);
 	spin_unlock_irq(&engine->timeline->lock);
@@ -569,7 +570,8 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
 
 static void execlists_cancel_requests(struct intel_engine_cs *engine)
 {
-	struct execlist_port *port = engine->execlist_port;
+	struct intel_engine_execlist * const el = &engine->execlist;
+	struct execlist_port *port = el->port;
 	struct drm_i915_gem_request *rq, *rn;
 	struct rb_node *rb;
 	unsigned long flags;
@@ -578,9 +580,9 @@ static void execlists_cancel_requests(struct intel_engine_cs *engine)
 	spin_lock_irqsave(&engine->timeline->lock, flags);
 
 	/* Cancel the requests on the HW and clear the ELSP tracker. */
-	for (n = 0; n < ARRAY_SIZE(engine->execlist_port); n++)
+	for (n = 0; n < ARRAY_SIZE(el->port); n++)
 		i915_gem_request_put(port_request(&port[n]));
-	memset(engine->execlist_port, 0, sizeof(engine->execlist_port));
+	memset(el->port, 0, sizeof(el->port));
 
 	/* Mark all executing requests as skipped. */
 	list_for_each_entry(rq, &engine->timeline->requests, link) {
@@ -590,7 +592,7 @@ static void execlists_cancel_requests(struct intel_engine_cs *engine)
 	}
 
 	/* Flush the queued requests to the timeline list (for retiring). */
-	rb = engine->execlist_first;
+	rb = el->first;
 	while (rb) {
 		struct i915_priolist *p = rb_entry(rb, typeof(*p), node);
 
@@ -603,7 +605,7 @@ static void execlists_cancel_requests(struct intel_engine_cs *engine)
 		}
 
 		rb = rb_next(rb);
-		rb_erase(&p->node, &engine->execlist_queue);
+		rb_erase(&p->node, &el->queue);
 		INIT_LIST_HEAD(&p->requests);
 		if (p->priority != I915_PRIORITY_NORMAL)
 			kmem_cache_free(engine->i915->priorities, p);
@@ -611,8 +613,8 @@ static void execlists_cancel_requests(struct intel_engine_cs *engine)
 
 	/* Remaining _unready_ requests will be nop'ed when submitted */
 
-	engine->execlist_queue = RB_ROOT;
-	engine->execlist_first = NULL;
+	el->queue = RB_ROOT;
+	el->first = NULL;
 	GEM_BUG_ON(port_isset(&port[0]));
 
 	/*
@@ -628,7 +630,7 @@ static void execlists_cancel_requests(struct intel_engine_cs *engine)
 
 static bool execlists_elsp_ready(const struct intel_engine_cs *engine)
 {
-	const struct execlist_port *port = engine->execlist_port;
+	const struct execlist_port *port = engine->execlist.port;
 
 	return port_count(&port[0]) + port_count(&port[1]) < 2;
 }
@@ -639,8 +641,9 @@ static bool execlists_elsp_ready(const struct intel_engine_cs *engine)
  */
 static void intel_lrc_irq_handler(unsigned long data)
 {
-	struct intel_engine_cs *engine = (struct intel_engine_cs *)data;
-	struct execlist_port *port = engine->execlist_port;
+	struct intel_engine_cs * const engine = (struct intel_engine_cs *)data;
+	struct intel_engine_execlist * const el = &engine->execlist;
+	struct execlist_port *port = el->port;
 	struct drm_i915_private *dev_priv = engine->i915;
 
 	/* We can skip acquiring intel_runtime_pm_get() here as it was taken
@@ -652,7 +655,7 @@ static void intel_lrc_irq_handler(unsigned long data)
 	 */
 	GEM_BUG_ON(!dev_priv->gt.awake);
 
-	intel_uncore_forcewake_get(dev_priv, engine->fw_domains);
+	intel_uncore_forcewake_get(dev_priv, el->fw_domains);
 
 	/* Prefer doing test_and_clear_bit() as a two stage operation to avoid
 	 * imposing the cost of a locked atomic transaction when submitting a
@@ -665,10 +668,10 @@ static void intel_lrc_irq_handler(unsigned long data)
 		unsigned int head, tail;
 
 		/* However GVT emulation depends upon intercepting CSB mmio */
-		if (unlikely(engine->csb_use_mmio)) {
+		if (unlikely(el->csb_use_mmio)) {
 			buf = (u32 * __force)
 				(dev_priv->regs + i915_mmio_reg_offset(RING_CONTEXT_STATUS_BUF_LO(engine, 0)));
-			engine->csb_head = -1; /* force mmio read of CSB ptrs */
+			el->csb_head = -1; /* force mmio read of CSB ptrs */
 		}
 
 		/* The write will be ordered by the uncached read (itself
@@ -682,19 +685,20 @@ static void intel_lrc_irq_handler(unsigned long data)
 		 * is set and we do a new loop.
 		 */
 		__clear_bit(ENGINE_IRQ_EXECLIST, &engine->irq_posted);
-		if (unlikely(engine->csb_head == -1)) { /* following a reset */
+		if (unlikely(el->csb_head == -1)) { /* following a reset */
 			head = readl(dev_priv->regs + i915_mmio_reg_offset(RING_CONTEXT_STATUS_PTR(engine)));
 			tail = GEN8_CSB_WRITE_PTR(head);
 			head = GEN8_CSB_READ_PTR(head);
-			engine->csb_head = head;
+			el->csb_head = head;
 		} else {
 			const int write_idx =
 				intel_hws_csb_write_index(dev_priv) -
 				I915_HWS_CSB_BUF0_INDEX;
 
-			head = engine->csb_head;
+			head = el->csb_head;
 			tail = READ_ONCE(buf[write_idx]);
 		}
+
 		while (head != tail) {
 			struct drm_i915_gem_request *rq;
 			unsigned int status;
@@ -748,8 +752,8 @@ static void intel_lrc_irq_handler(unsigned long data)
 				   !(status & GEN8_CTX_STATUS_ACTIVE_IDLE));
 		}
 
-		if (head != engine->csb_head) {
-			engine->csb_head = head;
+		if (head != el->csb_head) {
+			el->csb_head = head;
 			writel(_MASKED_FIELD(GEN8_CSB_READ_PTR_MASK, head << 8),
 			       dev_priv->regs + i915_mmio_reg_offset(RING_CONTEXT_STATUS_PTR(engine)));
 		}
@@ -758,7 +762,7 @@ static void intel_lrc_irq_handler(unsigned long data)
 	if (execlists_elsp_ready(engine))
 		execlists_dequeue(engine);
 
-	intel_uncore_forcewake_put(dev_priv, engine->fw_domains);
+	intel_uncore_forcewake_put(dev_priv, el->fw_domains);
 }
 
 static void insert_request(struct intel_engine_cs *engine,
@@ -769,7 +773,7 @@ static void insert_request(struct intel_engine_cs *engine,
 
 	list_add_tail(&pt->link, &ptr_mask_bits(p, 1)->requests);
 	if (ptr_unmask_bits(p, 1) && execlists_elsp_ready(engine))
-		tasklet_hi_schedule(&engine->irq_tasklet);
+		tasklet_hi_schedule(&engine->execlist.irq_tasklet);
 }
 
 static void execlists_submit_request(struct drm_i915_gem_request *request)
@@ -782,7 +786,7 @@ static void execlists_submit_request(struct drm_i915_gem_request *request)
 
 	insert_request(engine, &request->priotree, request->priotree.priority);
 
-	GEM_BUG_ON(!engine->execlist_first);
+	GEM_BUG_ON(!engine->execlist.first);
 	GEM_BUG_ON(list_empty(&request->priotree.link));
 
 	spin_unlock_irqrestore(&engine->timeline->lock, flags);
@@ -1289,6 +1293,7 @@ static u8 gtiir[] = {
 static int gen8_init_common_ring(struct intel_engine_cs *engine)
 {
 	struct drm_i915_private *dev_priv = engine->i915;
+	struct intel_engine_execlist * const el = &engine->execlist;
 	int ret;
 
 	ret = intel_mocs_init_engine(engine);
@@ -1321,11 +1326,11 @@ static int gen8_init_common_ring(struct intel_engine_cs *engine)
 	I915_WRITE(GEN8_GT_IIR(gtiir[engine->id]),
 		   GT_CONTEXT_SWITCH_INTERRUPT << engine->irq_shift);
 	clear_bit(ENGINE_IRQ_EXECLIST, &engine->irq_posted);
-	engine->csb_head = -1;
+	el->csb_head = -1;
 
 	/* After a GPU reset, we may have requests to replay */
-	if (!i915.enable_guc_submission && engine->execlist_first)
-		tasklet_schedule(&engine->irq_tasklet);
+	if (!i915.enable_guc_submission && el->first)
+		tasklet_schedule(&el->irq_tasklet);
 
 	return 0;
 }
@@ -1366,7 +1371,8 @@ static int gen9_init_render_ring(struct intel_engine_cs *engine)
 static void reset_common_ring(struct intel_engine_cs *engine,
 			      struct drm_i915_gem_request *request)
 {
-	struct execlist_port *port = engine->execlist_port;
+	struct intel_engine_execlist * const el = &engine->execlist;
+	struct execlist_port *port = el->port;
 	struct drm_i915_gem_request *rq, *rn;
 	struct intel_context *ce;
 	unsigned long flags;
@@ -1383,9 +1389,9 @@ static void reset_common_ring(struct intel_engine_cs *engine,
 	 * guessing the missed context-switch events by looking at what
 	 * requests were completed.
 	 */
-	for (n = 0; n < ARRAY_SIZE(engine->execlist_port); n++)
+	for (n = 0; n < ARRAY_SIZE(el->port); n++)
 		i915_gem_request_put(port_request(&port[n]));
-	memset(engine->execlist_port, 0, sizeof(engine->execlist_port));
+	memset(el->port, 0, sizeof(el->port));
 
 	/* Push back any incomplete requests for replay after the reset. */
 	list_for_each_entry_safe_reverse(rq, rn,
@@ -1719,8 +1725,8 @@ void intel_logical_ring_cleanup(struct intel_engine_cs *engine)
 	 * Tasklet cannot be active at this point due intel_mark_active/idle
 	 * so this is just for documentation.
 	 */
-	if (WARN_ON(test_bit(TASKLET_STATE_SCHED, &engine->irq_tasklet.state)))
-		tasklet_kill(&engine->irq_tasklet);
+	if (WARN_ON(test_bit(TASKLET_STATE_SCHED, &engine->execlist.irq_tasklet.state)))
+		tasklet_kill(&engine->execlist.irq_tasklet);
 
 	dev_priv = engine->i915;
 
@@ -1744,7 +1750,7 @@ static void execlists_set_default_submission(struct intel_engine_cs *engine)
 	engine->submit_request = execlists_submit_request;
 	engine->cancel_requests = execlists_cancel_requests;
 	engine->schedule = execlists_schedule;
-	engine->irq_tasklet.func = intel_lrc_irq_handler;
+	engine->execlist.irq_tasklet.func = intel_lrc_irq_handler;
 }
 
 static void
@@ -1806,7 +1812,7 @@ logical_ring_setup(struct intel_engine_cs *engine)
 	/* Intentionally left blank. */
 	engine->buffer = NULL;
 
-	engine->csb_use_mmio = irq_handler_force_mmio(dev_priv);
+	engine->execlist.csb_use_mmio = irq_handler_force_mmio(dev_priv);
 
 	fw_domains = intel_uncore_forcewake_for_reg(dev_priv,
 						    RING_ELSP(engine),
@@ -1820,9 +1826,9 @@ logical_ring_setup(struct intel_engine_cs *engine)
 						     RING_CONTEXT_STATUS_BUF_BASE(engine),
 						     FW_REG_READ);
 
-	engine->fw_domains = fw_domains;
+	engine->execlist.fw_domains = fw_domains;
 
-	tasklet_init(&engine->irq_tasklet,
+	tasklet_init(&engine->execlist.irq_tasklet,
 		     intel_lrc_irq_handler, (unsigned long)engine);
 
 	logical_ring_default_vfuncs(engine);
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index 138116a3b537..9469f314d3fa 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -184,6 +184,84 @@ struct i915_priolist {
 	int priority;
 };
 
+/**
+ * struct intel_engine_execlist - execlist submission queue and port state
+ *
+ * The struct intel_engine_execlist represents the combined logical state of
+ * the driver and the hardware for the execlist mode of submission.
+ */
+struct intel_engine_execlist {
+	/**
+	 * @irq_tasklet: softirq tasklet for bottom handler
+	 */
+	struct tasklet_struct irq_tasklet;
+
+	/**
+	 * @default_priolist: priority list for I915_PRIORITY_NORMAL
+	 */
+	struct i915_priolist default_priolist;
+
+	/**
+	 * @no_priolist: priority lists disabled
+	 */
+	bool no_priolist;
+
+	/**
+	 * @port: execlist port states
+	 *
+	 * For each hardware ELSP (ExecList Submission Port) we keep
+	 * track of the last request and the number of times we submitted
+	 * that port to hw. We then count the number of times the hw reports
+	 * a context completion or preemption. As only one context can
+	 * be active on hw, we limit resubmission of a context to port[0]. This
+	 * is called a Lite Restore of the context.
+	 */
+	struct execlist_port {
+		/**
+		 * @request_count: combined request and submission count
+		 */
+		struct drm_i915_gem_request *request_count;
+#define EXECLIST_COUNT_BITS 2
+#define port_request(p) ptr_mask_bits((p)->request_count, EXECLIST_COUNT_BITS)
+#define port_count(p) ptr_unmask_bits((p)->request_count, EXECLIST_COUNT_BITS)
+#define port_pack(rq, count) ptr_pack_bits(rq, count, EXECLIST_COUNT_BITS)
+#define port_unpack(p, count) ptr_unpack_bits((p)->request_count, count, EXECLIST_COUNT_BITS)
+#define port_set(p, packed) ((p)->request_count = (packed))
+#define port_isset(p) ((p)->request_count)
+#define port_index(p, e) ((p) - (e)->execlist.port)
+
+		/**
+		 * @context_id: context ID for port
+		 */
+		GEM_DEBUG_DECL(u32 context_id);
+	} port[2];
+
+	/**
+	 * @queue: queue of requests, in priority lists
+	 */
+	struct rb_root queue;
+
+	/**
+	 * @first: leftmost level in priority @queue
+	 */
+	struct rb_node *first;
+
+	/**
+	 * @fw_domains: forcewake domains for irq tasklet
+	 */
+	unsigned int fw_domains;
+
+	/**
+	 * @csb_head: context status buffer head
+	 */
+	unsigned int csb_head;
+
+	/**
+	 * @csb_use_mmio: access csb through mmio, instead of hwsp
+	 */
+	bool csb_use_mmio;
+};
+
 #define INTEL_ENGINE_CS_MAX_NAME 8
 
 struct intel_engine_cs {
@@ -380,27 +458,7 @@ struct intel_engine_cs {
 		u32	*(*signal)(struct drm_i915_gem_request *req, u32 *cs);
 	} semaphore;
 
-	/* Execlists */
-	struct tasklet_struct irq_tasklet;
-	struct i915_priolist default_priolist;
-	bool no_priolist;
-	struct execlist_port {
-		struct drm_i915_gem_request *request_count;
-#define EXECLIST_COUNT_BITS 2
-#define port_request(p) ptr_mask_bits((p)->request_count, EXECLIST_COUNT_BITS)
-#define port_count(p) ptr_unmask_bits((p)->request_count, EXECLIST_COUNT_BITS)
-#define port_pack(rq, count) ptr_pack_bits(rq, count, EXECLIST_COUNT_BITS)
-#define port_unpack(p, count) ptr_unpack_bits((p)->request_count, count, EXECLIST_COUNT_BITS)
-#define port_set(p, packed) ((p)->request_count = (packed))
-#define port_isset(p) ((p)->request_count)
-#define port_index(p, e) ((p) - (e)->execlist_port)
-		GEM_DEBUG_DECL(u32 context_id);
-	} execlist_port[2];
-	struct rb_root execlist_queue;
-	struct rb_node *execlist_first;
-	unsigned int fw_domains;
-	unsigned int csb_head;
-	bool csb_use_mmio;
+	struct intel_engine_execlist execlist;
 
 	/* Contexts are pinned whilst they are active on the GPU. The last
 	 * context executed remains active whilst the GPU is idle - the
-- 
2.11.0



* [PATCH 2/8] drm/i915: Move execlist initialization into intel_engine_cs.c
  2017-09-20 14:36 [PATCH 0/8] Support for more than two execlist ports (v2) Mika Kuoppala
  2017-09-20 14:36 ` [PATCH 1/8] drm/i915: Make own struct for execlist items Mika Kuoppala
@ 2017-09-20 14:36 ` Mika Kuoppala
  2017-09-21 12:20   ` Chris Wilson
  2017-09-20 14:37 ` [PATCH 3/8] drm/i915: Wrap port cancellation into a function Mika Kuoppala
                   ` (7 subsequent siblings)
  9 siblings, 1 reply; 26+ messages in thread
From: Mika Kuoppala @ 2017-09-20 14:36 UTC (permalink / raw)
  To: intel-gfx

Move execlist initialization into the common engine setup, as it
is shared by both the GuC and hardware execlist submission paths.

v2: rebase with csb changes

Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
---
 drivers/gpu/drm/i915/intel_engine_cs.c | 31 ++++++++++++++++++++++++++++---
 drivers/gpu/drm/i915/intel_lrc.c       | 19 -------------------
 2 files changed, 28 insertions(+), 22 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_engine_cs.c b/drivers/gpu/drm/i915/intel_engine_cs.c
index eb6feaf69a3b..d58e17efd243 100644
--- a/drivers/gpu/drm/i915/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/intel_engine_cs.c
@@ -380,6 +380,33 @@ static void intel_engine_init_timeline(struct intel_engine_cs *engine)
 	engine->timeline = &engine->i915->gt.global_timeline.engine[engine->id];
 }
 
+static bool csb_force_mmio(struct drm_i915_private *i915)
+{
+	/* GVT emulation depends upon intercepting CSB mmio */
+	if (intel_vgpu_active(i915))
+		return true;
+
+	/*
+	 * IOMMU adds unpredictable latency causing the CSB write (from the
+	 * GPU into the HWSP) to only be visible some time after the interrupt
+	 * (missed breadcrumb syndrome).
+	 */
+	if (intel_vtd_active())
+		return true;
+
+	return false;
+}
+
+static void intel_engine_init_execlist(struct intel_engine_cs *engine)
+{
+	struct intel_engine_execlist * const el = &engine->execlist;
+
+	el->csb_use_mmio = csb_force_mmio(engine->i915);
+
+	el->queue = RB_ROOT;
+	el->first = NULL;
+}
+
 /**
  * intel_engines_setup_common - setup engine state not requiring hw access
  * @engine: Engine to setup.
@@ -391,9 +418,7 @@ static void intel_engine_init_timeline(struct intel_engine_cs *engine)
  */
 void intel_engine_setup_common(struct intel_engine_cs *engine)
 {
-	engine->execlist.queue = RB_ROOT;
-	engine->execlist.first = NULL;
-
+	intel_engine_init_execlist(engine);
 	intel_engine_init_timeline(engine);
 	intel_engine_init_hangcheck(engine);
 	i915_gem_batch_pool_init(engine, &engine->batch_pool);
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 5c2fcc4936ba..a4ece4c4f291 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -1784,23 +1784,6 @@ logical_ring_default_irqs(struct intel_engine_cs *engine)
 	engine->irq_keep_mask = GT_CONTEXT_SWITCH_INTERRUPT << shift;
 }
 
-static bool irq_handler_force_mmio(struct drm_i915_private *i915)
-{
-	/* GVT emulation depends upon intercepting CSB mmio */
-	if (intel_vgpu_active(i915))
-		return true;
-
-	/*
-	 * IOMMU adds unpredictable latency causing the CSB write (from the
-	 * GPU into the HWSP) to only be visible some time after the interrupt
-	 * (missed breadcrumb syndrome).
-	 */
-	if (intel_vtd_active())
-		return true;
-
-	return false;
-}
-
 static void
 logical_ring_setup(struct intel_engine_cs *engine)
 {
@@ -1812,8 +1795,6 @@ logical_ring_setup(struct intel_engine_cs *engine)
 	/* Intentionally left blank. */
 	engine->buffer = NULL;
 
-	engine->execlist.csb_use_mmio = irq_handler_force_mmio(dev_priv);
-
 	fw_domains = intel_uncore_forcewake_for_reg(dev_priv,
 						    RING_ELSP(engine),
 						    FW_REG_WRITE);
-- 
2.11.0


* [PATCH 3/8] drm/i915: Wrap port cancellation into a function
  2017-09-20 14:36 [PATCH 0/8] Support for more than two execlist ports (v2) Mika Kuoppala
  2017-09-20 14:36 ` [PATCH 1/8] drm/i915: Make own struct for execlist items Mika Kuoppala
  2017-09-20 14:36 ` [PATCH 2/8] drm/i915: Move execlist initialization into intel_engine_cs.c Mika Kuoppala
@ 2017-09-20 14:37 ` Mika Kuoppala
  2017-09-21 12:18   ` Michał Winiarski
  2017-09-21 12:20   ` Chris Wilson
  2017-09-20 14:37 ` [PATCH 4/8] drm/i915: Add execlist_port_complete Mika Kuoppala
                   ` (6 subsequent siblings)
  9 siblings, 2 replies; 26+ messages in thread
From: Mika Kuoppala @ 2017-09-20 14:37 UTC (permalink / raw)
  To: intel-gfx

On the reset and wedged paths, we want to release the requests
tied to the ports and then mark the ports as unset.
Introduce a function for this.

v2: rebase

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
---
 drivers/gpu/drm/i915/intel_lrc.c | 21 ++++++++++++---------
 1 file changed, 12 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index a4ece4c4f291..ffb9c900328b 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -568,6 +568,16 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
 		execlists_submit_ports(engine);
 }
 
+static void execlist_cancel_port_requests(struct intel_engine_execlist *el)
+{
+	unsigned int i;
+
+	for (i = 0; i < ARRAY_SIZE(el->port); i++)
+		i915_gem_request_put(port_request(&el->port[i]));
+
+	memset(el->port, 0, sizeof(el->port));
+}
+
 static void execlists_cancel_requests(struct intel_engine_cs *engine)
 {
 	struct intel_engine_execlist * const el = &engine->execlist;
@@ -575,14 +585,11 @@ static void execlists_cancel_requests(struct intel_engine_cs *engine)
 	struct drm_i915_gem_request *rq, *rn;
 	struct rb_node *rb;
 	unsigned long flags;
-	unsigned long n;
 
 	spin_lock_irqsave(&engine->timeline->lock, flags);
 
 	/* Cancel the requests on the HW and clear the ELSP tracker. */
-	for (n = 0; n < ARRAY_SIZE(el->port); n++)
-		i915_gem_request_put(port_request(&port[n]));
-	memset(el->port, 0, sizeof(el->port));
+	execlist_cancel_port_requests(el);
 
 	/* Mark all executing requests as skipped. */
 	list_for_each_entry(rq, &engine->timeline->requests, link) {
@@ -1372,11 +1379,9 @@ static void reset_common_ring(struct intel_engine_cs *engine,
 			      struct drm_i915_gem_request *request)
 {
 	struct intel_engine_execlist * const el = &engine->execlist;
-	struct execlist_port *port = el->port;
 	struct drm_i915_gem_request *rq, *rn;
 	struct intel_context *ce;
 	unsigned long flags;
-	unsigned int n;
 
 	spin_lock_irqsave(&engine->timeline->lock, flags);
 
@@ -1389,9 +1394,7 @@ static void reset_common_ring(struct intel_engine_cs *engine,
 	 * guessing the missed context-switch events by looking at what
 	 * requests were completed.
 	 */
-	for (n = 0; n < ARRAY_SIZE(el->port); n++)
-		i915_gem_request_put(port_request(&port[n]));
-	memset(el->port, 0, sizeof(el->port));
+	execlist_cancel_port_requests(el);
 
 	/* Push back any incomplete requests for replay after the reset. */
 	list_for_each_entry_safe_reverse(rq, rn,
-- 
2.11.0


* [PATCH 4/8] drm/i915: Add execlist_port_complete
  2017-09-20 14:36 [PATCH 0/8] Support for more than two execlist ports (v2) Mika Kuoppala
                   ` (2 preceding siblings ...)
  2017-09-20 14:37 ` [PATCH 3/8] drm/i915: Wrap port cancellation into a function Mika Kuoppala
@ 2017-09-20 14:37 ` Mika Kuoppala
  2017-09-21 12:21   ` Chris Wilson
  2017-09-20 14:37 ` [PATCH 5/8] drm/i915: Make execlist port count variable Mika Kuoppala
                   ` (5 subsequent siblings)
  9 siblings, 1 reply; 26+ messages in thread
From: Mika Kuoppala @ 2017-09-20 14:37 UTC (permalink / raw)
  To: intel-gfx

When the first execlist entry is processed, we move the port contents
forward. Introduce a function for this, as the execlists and GuC paths
share this operation.

v2: rebase. s/GEM_DEBUG_BUG/GEM_BUG (Chris)

Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
---
 drivers/gpu/drm/i915/i915_guc_submission.c |  8 ++++----
 drivers/gpu/drm/i915/intel_lrc.c           | 22 +++++++++++-----------
 drivers/gpu/drm/i915/intel_ringbuffer.h    | 14 +++++++++++++-
 3 files changed, 28 insertions(+), 16 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
index e967d266bc3c..d9f6dedae0d7 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -592,7 +592,7 @@ static void i915_guc_dequeue(struct intel_engine_cs *engine)
 			rq->priotree.priority = INT_MAX;
 
 			__i915_gem_request_submit(rq);
-			trace_i915_gem_request_in(rq, port_index(port, engine));
+			trace_i915_gem_request_in(rq, port_index(port, el));
 			last = rq;
 			submit = true;
 		}
@@ -615,7 +615,8 @@ static void i915_guc_dequeue(struct intel_engine_cs *engine)
 static void i915_guc_irq_handler(unsigned long data)
 {
 	struct intel_engine_cs * const engine = (struct intel_engine_cs *)data;
-	struct execlist_port *port = engine->execlist.port;
+	struct intel_engine_execlist * const el = &engine->execlist;
+	struct execlist_port *port = el->port;
 	struct drm_i915_gem_request *rq;
 
 	rq = port_request(&port[0]);
@@ -623,8 +624,7 @@ static void i915_guc_irq_handler(unsigned long data)
 		trace_i915_gem_request_out(rq);
 		i915_gem_request_put(rq);
 
-		port[0] = port[1];
-		memset(&port[1], 0, sizeof(port[1]));
+		execlist_port_complete(el, port);
 
 		rq = port_request(&port[0]);
 	}
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index ffb9c900328b..3008e13f9c47 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -454,7 +454,8 @@ static void port_assign(struct execlist_port *port,
 static void execlists_dequeue(struct intel_engine_cs *engine)
 {
 	struct drm_i915_gem_request *last;
-	struct execlist_port *port = engine->execlist.port;
+	struct intel_engine_execlist * const el = &engine->execlist;
+	struct execlist_port *port = el->port;
 	struct rb_node *rb;
 	bool submit = false;
 
@@ -468,8 +469,6 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
 		 */
 		last->tail = last->wa_tail;
 
-	GEM_BUG_ON(port_isset(&port[1]));
-
 	/* Hardware submission is through 2 ports. Conceptually each port
 	 * has a (RING_START, RING_HEAD, RING_TAIL) tuple. RING_START is
 	 * static for a context, and unique to each, so we only execute
@@ -492,8 +491,8 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
 	 */
 
 	spin_lock_irq(&engine->timeline->lock);
-	rb = engine->execlist.first;
-	GEM_BUG_ON(rb_first(&engine->execlist.queue) != rb);
+	rb = el->first;
+	GEM_BUG_ON(rb_first(&el->queue) != rb);
 	while (rb) {
 		struct i915_priolist *p = rb_entry(rb, typeof(*p), node);
 		struct drm_i915_gem_request *rq, *rn;
@@ -516,7 +515,7 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
 				 * combine this request with the last, then we
 				 * are done.
 				 */
-				if (port != engine->execlist.port) {
+				if (port != el->port) {
 					__list_del_many(&p->requests,
 							&rq->priotree.link);
 					goto done;
@@ -541,25 +540,27 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
 				if (submit)
 					port_assign(port, last);
 				port++;
+
+				GEM_BUG_ON(port_isset(port));
 			}
 
 			INIT_LIST_HEAD(&rq->priotree.link);
 			rq->priotree.priority = INT_MAX;
 
 			__i915_gem_request_submit(rq);
-			trace_i915_gem_request_in(rq, port_index(port, engine));
+			trace_i915_gem_request_in(rq, port_index(port, el));
 			last = rq;
 			submit = true;
 		}
 
 		rb = rb_next(rb);
-		rb_erase(&p->node, &engine->execlist.queue);
+		rb_erase(&p->node, &el->queue);
 		INIT_LIST_HEAD(&p->requests);
 		if (p->priority != I915_PRIORITY_NORMAL)
 			kmem_cache_free(engine->i915->priorities, p);
 	}
 done:
-	engine->execlist.first = rb;
+	el->first = rb;
 	if (submit)
 		port_assign(port, last);
 	spin_unlock_irq(&engine->timeline->lock);
@@ -748,8 +749,7 @@ static void intel_lrc_irq_handler(unsigned long data)
 				trace_i915_gem_request_out(rq);
 				i915_gem_request_put(rq);
 
-				port[0] = port[1];
-				memset(&port[1], 0, sizeof(port[1]));
+				execlist_port_complete(el, port);
 			} else {
 				port_set(port, port_pack(rq, count));
 			}
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index 9469f314d3fa..bbf3dfe4c340 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -228,7 +228,7 @@ struct intel_engine_execlist {
 #define port_unpack(p, count) ptr_unpack_bits((p)->request_count, count, EXECLIST_COUNT_BITS)
 #define port_set(p, packed) ((p)->request_count = (packed))
 #define port_isset(p) ((p)->request_count)
-#define port_index(p, e) ((p) - (e)->execlist.port)
+#define port_index(p, el) ((p) - (el)->port)
 
 		/**
 		 * @context_id: context ID for port
@@ -511,6 +511,18 @@ struct intel_engine_cs {
 	u32 (*get_cmd_length_mask)(u32 cmd_header);
 };
 
+static inline void
+execlist_port_complete(struct intel_engine_execlist * const el,
+		       struct execlist_port * const port)
+{
+	struct execlist_port * const port1 = &el->port[1];
+
+	GEM_BUG_ON(port_index(port, el) != 0);
+
+	*port = *port1;
+	memset(port1, 0, sizeof(struct execlist_port));
+}
+
 static inline unsigned int
 intel_engine_flag(const struct intel_engine_cs *engine)
 {
-- 
2.11.0


* [PATCH 5/8] drm/i915: Make execlist port count variable
  2017-09-20 14:36 [PATCH 0/8] Support for more than two execlist ports (v2) Mika Kuoppala
                   ` (3 preceding siblings ...)
  2017-09-20 14:37 ` [PATCH 4/8] drm/i915: Add execlist_port_complete Mika Kuoppala
@ 2017-09-20 14:37 ` Mika Kuoppala
  2017-09-21 12:21   ` Chris Wilson
  2017-09-20 14:37 ` [PATCH 6/8] drm/i915: Introduce execlist_port_* accessors Mika Kuoppala
                   ` (4 subsequent siblings)
  9 siblings, 1 reply; 26+ messages in thread
From: Mika Kuoppala @ 2017-09-20 14:37 UTC (permalink / raw)
  To: intel-gfx

As we emulate execlists on top of the GuC workqueue, we are not
restricted to just 2 ports and can increase that number arbitrarily
to trade off queue depth (i.e. scheduling latency) against pipeline
bubbles.

v2: rebase. better commit msg (Chris)

Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
---
 drivers/gpu/drm/i915/i915_debugfs.c        |  8 ++++----
 drivers/gpu/drm/i915/i915_drv.h            |  3 ++-
 drivers/gpu/drm/i915/i915_gpu_error.c      | 17 ++++++++++++-----
 drivers/gpu/drm/i915/i915_guc_submission.c |  8 ++++++--
 drivers/gpu/drm/i915/intel_engine_cs.c     |  4 ++++
 drivers/gpu/drm/i915/intel_lrc.c           |  6 ++++--
 drivers/gpu/drm/i915/intel_ringbuffer.h    | 21 +++++++++++++++++----
 7 files changed, 49 insertions(+), 18 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index 7648ff207670..dbeb6f08ab79 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -3313,6 +3313,7 @@ static int i915_engine_info(struct seq_file *m, void *unused)
 
 		if (i915.enable_execlists) {
 			const u32 *hws = &engine->status_page.page_addr[I915_HWS_CSB_BUF0_INDEX];
+			struct intel_engine_execlist * const el = &engine->execlist;
 			u32 ptr, read, write;
 			unsigned int idx;
 
@@ -3346,11 +3347,10 @@ static int i915_engine_info(struct seq_file *m, void *unused)
 			}
 
 			rcu_read_lock();
-			for (idx = 0; idx < ARRAY_SIZE(engine->execlist.port); idx++) {
+			for (idx = 0; idx < execlist_num_ports(el); idx++) {
 				unsigned int count;
 
-				rq = port_unpack(&engine->execlist.port[idx],
-						 &count);
+				rq = port_unpack(&el->port[idx], &count);
 				if (rq) {
 					seq_printf(m, "\t\tELSP[%d] count=%d, ",
 						   idx, count);
@@ -3363,7 +3363,7 @@ static int i915_engine_info(struct seq_file *m, void *unused)
 			rcu_read_unlock();
 
 			spin_lock_irq(&engine->timeline->lock);
-			for (rb = engine->execlist.first; rb; rb = rb_next(rb)) {
+			for (rb = el->first; rb; rb = rb_next(rb)) {
 				struct i915_priolist *p =
 					rb_entry(rb, typeof(*p), node);
 
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 6d7d871b32ad..f4dd53eef61b 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1000,7 +1000,8 @@ struct i915_gpu_state {
 			u32 seqno;
 			u32 head;
 			u32 tail;
-		} *requests, execlist[2];
+		} *requests, execlist[EXECLIST_MAX_PORTS];
+		unsigned int num_ports;
 
 		struct drm_i915_error_waiter {
 			char comm[TASK_COMM_LEN];
diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
index 8710fbdbbcb1..0a803d76256b 100644
--- a/drivers/gpu/drm/i915/i915_gpu_error.c
+++ b/drivers/gpu/drm/i915/i915_gpu_error.c
@@ -396,6 +396,8 @@ static void error_print_context(struct drm_i915_error_state_buf *m,
 static void error_print_engine(struct drm_i915_error_state_buf *m,
 			       const struct drm_i915_error_engine *ee)
 {
+	int n;
+
 	err_printf(m, "%s command stream:\n", engine_str(ee->engine_id));
 	err_printf(m, "  START: 0x%08x\n", ee->start);
 	err_printf(m, "  HEAD:  0x%08x [0x%08x]\n", ee->head, ee->rq_head);
@@ -465,8 +467,11 @@ static void error_print_engine(struct drm_i915_error_state_buf *m,
 		   jiffies_to_msecs(jiffies - ee->hangcheck_timestamp));
 	err_printf(m, "  engine reset count: %u\n", ee->reset_count);
 
-	error_print_request(m, "  ELSP[0]: ", &ee->execlist[0]);
-	error_print_request(m, "  ELSP[1]: ", &ee->execlist[1]);
+	for (n = 0; n < ee->num_ports; n++) {
+		err_printf(m, "  ELSP[%d]:", n);
+		error_print_request(m, " ", &ee->execlist[n]);
+	}
+
 	error_print_context(m, "  Active context: ", &ee->context);
 }
 
@@ -1327,17 +1332,19 @@ static void engine_record_requests(struct intel_engine_cs *engine,
 static void error_record_engine_execlists(struct intel_engine_cs *engine,
 					  struct drm_i915_error_engine *ee)
 {
-	const struct execlist_port *port = engine->execlist.port;
+	const struct intel_engine_execlist * const el = &engine->execlist;
 	unsigned int n;
 
-	for (n = 0; n < ARRAY_SIZE(engine->execlist.port); n++) {
-		struct drm_i915_gem_request *rq = port_request(&port[n]);
+	for (n = 0; n < execlist_num_ports(el); n++) {
+		struct drm_i915_gem_request *rq = port_request(&el->port[n]);
 
 		if (!rq)
 			break;
 
 		record_request(rq, &ee->execlist[n]);
 	}
+
+	ee->num_ports = n;
 }
 
 static void record_context(struct drm_i915_error_context *e,
diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
index d9f6dedae0d7..3a4f875d5930 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -562,6 +562,8 @@ static void i915_guc_dequeue(struct intel_engine_cs *engine)
 	struct intel_engine_execlist * const el = &engine->execlist;
 	struct execlist_port *port = el->port;
 	struct drm_i915_gem_request *last = NULL;
+	const struct execlist_port * const last_port =
+		&el->port[el->port_mask];
 	bool submit = false;
 	struct rb_node *rb;
 
@@ -577,7 +579,7 @@ static void i915_guc_dequeue(struct intel_engine_cs *engine)
 
 		list_for_each_entry_safe(rq, rn, &p->requests, priotree.link) {
 			if (last && rq->ctx != last->ctx) {
-				if (port != el->port) {
+				if (port == last_port) {
 					__list_del_many(&p->requests,
 							&rq->priotree.link);
 					goto done;
@@ -617,6 +619,8 @@ static void i915_guc_irq_handler(unsigned long data)
 	struct intel_engine_cs * const engine = (struct intel_engine_cs *)data;
 	struct intel_engine_execlist * const el = &engine->execlist;
 	struct execlist_port *port = el->port;
+	const struct execlist_port * const last_port =
+		&el->port[el->port_mask];
 	struct drm_i915_gem_request *rq;
 
 	rq = port_request(&port[0]);
@@ -629,7 +633,7 @@ static void i915_guc_irq_handler(unsigned long data)
 		rq = port_request(&port[0]);
 	}
 
-	if (!port_isset(&port[1]))
+	if (!port_isset(last_port))
 		i915_guc_dequeue(engine);
 }
 
diff --git a/drivers/gpu/drm/i915/intel_engine_cs.c b/drivers/gpu/drm/i915/intel_engine_cs.c
index d58e17efd243..523d56084870 100644
--- a/drivers/gpu/drm/i915/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/intel_engine_cs.c
@@ -403,6 +403,10 @@ static void intel_engine_init_execlist(struct intel_engine_cs *engine)
 
 	el->csb_use_mmio = csb_force_mmio(engine->i915);
 
+	engine->execlist.port_mask = 1;
+	BUILD_BUG_ON_NOT_POWER_OF_2(execlist_num_ports(&engine->execlist));
+	GEM_BUG_ON(execlist_num_ports(&engine->execlist) > EXECLIST_MAX_PORTS);
+
 	el->queue = RB_ROOT;
 	el->first = NULL;
 }
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 3008e13f9c47..c659745e6b7b 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -399,7 +399,7 @@ static void execlists_submit_ports(struct intel_engine_cs *engine)
 		engine->i915->regs + i915_mmio_reg_offset(RING_ELSP(engine));
 	unsigned int n;
 
-	for (n = ARRAY_SIZE(engine->execlist.port); n--; ) {
+	for (n = execlist_num_ports(&engine->execlist); n--; ) {
 		struct drm_i915_gem_request *rq;
 		unsigned int count;
 		u64 desc;
@@ -456,6 +456,8 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
 	struct drm_i915_gem_request *last;
 	struct intel_engine_execlist * const el = &engine->execlist;
 	struct execlist_port *port = el->port;
+	const struct execlist_port * const last_port =
+		&el->port[el->port_mask];
 	struct rb_node *rb;
 	bool submit = false;
 
@@ -515,7 +517,7 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
 				 * combine this request with the last, then we
 				 * are done.
 				 */
-				if (port != el->port) {
+				if (port == last_port) {
 					__list_del_many(&p->requests,
 							&rq->priotree.link);
 					goto done;
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index bbf3dfe4c340..94e6c2a38fb7 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -234,7 +234,14 @@ struct intel_engine_execlist {
 		 * @context_id: context ID for port
 		 */
 		GEM_DEBUG_DECL(u32 context_id);
-	} port[2];
+
+#define EXECLIST_MAX_PORTS 2
+	} port[EXECLIST_MAX_PORTS];
+
+	/**
+	 * @port_mask: number of execlist ports - 1
+	 */
+	unsigned int port_mask;
 
 	/**
 	 * @queue: queue of requests, in priority lists
@@ -511,16 +518,22 @@ struct intel_engine_cs {
 	u32 (*get_cmd_length_mask)(u32 cmd_header);
 };
 
+static inline unsigned int
+execlist_num_ports(const struct intel_engine_execlist * const el)
+{
+	return el->port_mask + 1;
+}
+
 static inline void
 execlist_port_complete(struct intel_engine_execlist * const el,
 		       struct execlist_port * const port)
 {
-	struct execlist_port * const port1 = &el->port[1];
+	const unsigned int m = el->port_mask;
 
 	GEM_BUG_ON(port_index(port, el) != 0);
 
-	*port = *port1;
-	memset(port1, 0, sizeof(struct execlist_port));
+	memmove(port, port + 1, m * sizeof(struct execlist_port));
+	memset(port + m, 0, sizeof(struct execlist_port));
 }
 
 static inline unsigned int
-- 
2.11.0


* [PATCH 6/8] drm/i915: Introduce execlist_port_* accessors
  2017-09-20 14:36 [PATCH 0/8] Support for more than two execlist ports (v2) Mika Kuoppala
                   ` (4 preceding siblings ...)
  2017-09-20 14:37 ` [PATCH 5/8] drm/i915: Make execlist port count variable Mika Kuoppala
@ 2017-09-20 14:37 ` Mika Kuoppala
  2017-09-21 12:26   ` Chris Wilson
  2017-09-20 14:37 ` [PATCH 7/8] drm/i915: Keep track of reserved execlist ports Mika Kuoppala
                   ` (3 subsequent siblings)
  9 siblings, 1 reply; 26+ messages in thread
From: Mika Kuoppala @ 2017-09-20 14:37 UTC (permalink / raw)
  To: intel-gfx

Instead of trusting that the first available port is at index 0,
use an accessor to hide this. This is preparation for the
following patches, where the head can be at an arbitrary location
in the port array.

v2: improved commit message, elsp_ready readability (Chris)

Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
---
 drivers/gpu/drm/i915/i915_debugfs.c        | 16 +++++++----
 drivers/gpu/drm/i915/i915_gpu_error.c      |  4 +--
 drivers/gpu/drm/i915/i915_guc_submission.c | 17 ++++++-----
 drivers/gpu/drm/i915/i915_irq.c            |  2 +-
 drivers/gpu/drm/i915/intel_engine_cs.c     |  2 +-
 drivers/gpu/drm/i915/intel_lrc.c           | 42 +++++++++++++++------------
 drivers/gpu/drm/i915/intel_ringbuffer.h    | 46 ++++++++++++++++++++++++++----
 7 files changed, 87 insertions(+), 42 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index dbeb6f08ab79..af8cc2eab1b1 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -3348,16 +3348,20 @@ static int i915_engine_info(struct seq_file *m, void *unused)
 
 			rcu_read_lock();
 			for (idx = 0; idx < execlist_num_ports(el); idx++) {
-				unsigned int count;
+				const struct execlist_port *port;
+				unsigned int count, n;
 
-				rq = port_unpack(&el->port[idx], &count);
+				port = execlist_port_index(el, idx);
+				n = port_index(port, el);
+
+				rq = port_unpack(port, &count);
 				if (rq) {
-					seq_printf(m, "\t\tELSP[%d] count=%d, ",
-						   idx, count);
+					seq_printf(m, "\t\tELSP[%d:%d] count=%d, ",
+						   idx, n, count);
 					print_request(m, rq, "rq: ");
 				} else {
-					seq_printf(m, "\t\tELSP[%d] idle\n",
-						   idx);
+					seq_printf(m, "\t\tELSP[%d:%d] idle\n",
+						   idx, n);
 				}
 			}
 			rcu_read_unlock();
diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
index 0a803d76256b..19e4c297c857 100644
--- a/drivers/gpu/drm/i915/i915_gpu_error.c
+++ b/drivers/gpu/drm/i915/i915_gpu_error.c
@@ -1332,11 +1332,11 @@ static void engine_record_requests(struct intel_engine_cs *engine,
 static void error_record_engine_execlists(struct intel_engine_cs *engine,
 					  struct drm_i915_error_engine *ee)
 {
-	const struct intel_engine_execlist * const el = &engine->execlist;
+	struct intel_engine_execlist * const el = &engine->execlist;
 	unsigned int n;
 
 	for (n = 0; n < execlist_num_ports(el); n++) {
-		struct drm_i915_gem_request *rq = port_request(&el->port[n]);
+		struct drm_i915_gem_request *rq = port_request(execlist_port_index(el, n));
 
 		if (!rq)
 			break;
diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
index 3a4f875d5930..25c9bac94c39 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -562,8 +562,7 @@ static void i915_guc_dequeue(struct intel_engine_cs *engine)
 	struct intel_engine_execlist * const el = &engine->execlist;
 	struct execlist_port *port = el->port;
 	struct drm_i915_gem_request *last = NULL;
-	const struct execlist_port * const last_port =
-		&el->port[el->port_mask];
+	const struct execlist_port * const last_port = execlist_port_tail(el);
 	bool submit = false;
 	struct rb_node *rb;
 
@@ -587,7 +586,8 @@ static void i915_guc_dequeue(struct intel_engine_cs *engine)
 
 				if (submit)
 					port_assign(port, last);
-				port++;
+
+				port = execlist_port_next(el, port);
 			}
 
 			INIT_LIST_HEAD(&rq->priotree.link);
@@ -618,19 +618,18 @@ static void i915_guc_irq_handler(unsigned long data)
 {
 	struct intel_engine_cs * const engine = (struct intel_engine_cs *)data;
 	struct intel_engine_execlist * const el = &engine->execlist;
-	struct execlist_port *port = el->port;
-	const struct execlist_port * const last_port =
-		&el->port[el->port_mask];
+	struct execlist_port *port = execlist_port_head(el);
+	const struct execlist_port * const last_port = execlist_port_tail(el);
 	struct drm_i915_gem_request *rq;
 
-	rq = port_request(&port[0]);
+	rq = port_request(port);
 	while (rq && i915_gem_request_completed(rq)) {
 		trace_i915_gem_request_out(rq);
 		i915_gem_request_put(rq);
 
-		execlist_port_complete(el, port);
+		port = execlist_port_complete(el, port);
 
-		rq = port_request(&port[0]);
+		rq = port_request(port);
 	}
 
 	if (!port_isset(last_port))
diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index de703c26e124..ac5a95439393 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -1342,7 +1342,7 @@ gen8_cs_irq_handler(struct intel_engine_cs *engine, u32 iir, int test_shift)
 	bool tasklet = false;
 
 	if (iir & (GT_CONTEXT_SWITCH_INTERRUPT << test_shift)) {
-		if (port_count(&el->port[0])) {
+		if (port_count(execlist_port_head(el))) {
 			__set_bit(ENGINE_IRQ_EXECLIST, &engine->irq_posted);
 			tasklet = true;
 		}
diff --git a/drivers/gpu/drm/i915/intel_engine_cs.c b/drivers/gpu/drm/i915/intel_engine_cs.c
index 523d56084870..b0d702063a50 100644
--- a/drivers/gpu/drm/i915/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/intel_engine_cs.c
@@ -1502,7 +1502,7 @@ bool intel_engine_is_idle(struct intel_engine_cs *engine)
 		return false;
 
 	/* Both ports drained, no more ELSP submission? */
-	if (port_request(&engine->execlist.port[0]))
+	if (port_request(execlist_port_head(&engine->execlist)))
 		return false;
 
 	/* ELSP is empty, but there are ready requests? */
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index c659745e6b7b..8550cd6635c9 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -394,24 +394,27 @@ static u64 execlists_update_context(struct drm_i915_gem_request *rq)
 
 static void execlists_submit_ports(struct intel_engine_cs *engine)
 {
-	struct execlist_port *port = engine->execlist.port;
+	struct intel_engine_execlist * const el = &engine->execlist;
 	u32 __iomem *elsp =
 		engine->i915->regs + i915_mmio_reg_offset(RING_ELSP(engine));
 	unsigned int n;
 
-	for (n = execlist_num_ports(&engine->execlist); n--; ) {
+	for (n = execlist_num_ports(el); n--; ) {
+		struct execlist_port *port;
 		struct drm_i915_gem_request *rq;
 		unsigned int count;
 		u64 desc;
 
-		rq = port_unpack(&port[n], &count);
+		port = execlist_port_index(el, n);
+
+		rq = port_unpack(port, &count);
 		if (rq) {
 			GEM_BUG_ON(count > !n);
 			if (!count++)
 				execlists_context_status_change(rq, INTEL_CONTEXT_SCHEDULE_IN);
-			port_set(&port[n], port_pack(rq, count));
+			port_set(port, port_pack(rq, count));
 			desc = execlists_update_context(rq);
-			GEM_DEBUG_EXEC(port[n].context_id = upper_32_bits(desc));
+			GEM_DEBUG_EXEC(port->context_id = upper_32_bits(desc));
 		} else {
 			GEM_BUG_ON(!n);
 			desc = 0;
@@ -455,9 +458,8 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
 {
 	struct drm_i915_gem_request *last;
 	struct intel_engine_execlist * const el = &engine->execlist;
-	struct execlist_port *port = el->port;
-	const struct execlist_port * const last_port =
-		&el->port[el->port_mask];
+	struct execlist_port *port = execlist_port_head(el);
+	const struct execlist_port * const last_port = execlist_port_tail(el);
 	struct rb_node *rb;
 	bool submit = false;
 
@@ -541,7 +543,8 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
 
 				if (submit)
 					port_assign(port, last);
-				port++;
+
+				port = execlist_port_next(el, port);
 
 				GEM_BUG_ON(port_isset(port));
 			}
@@ -638,11 +641,12 @@ static void execlists_cancel_requests(struct intel_engine_cs *engine)
 	spin_unlock_irqrestore(&engine->timeline->lock, flags);
 }
 
-static bool execlists_elsp_ready(const struct intel_engine_cs *engine)
+static bool execlists_elsp_ready(struct intel_engine_execlist * const el)
 {
-	const struct execlist_port *port = engine->execlist.port;
+	struct execlist_port * const port0 = execlist_port_head(el);
+	struct execlist_port * const port1 = execlist_port_next(el, port0);
 
-	return port_count(&port[0]) + port_count(&port[1]) < 2;
+	return port_count(port0) + port_count(port1) < 2;
 }
 
 /*
@@ -653,7 +657,7 @@ static void intel_lrc_irq_handler(unsigned long data)
 {
 	struct intel_engine_cs * const engine = (struct intel_engine_cs *)data;
 	struct intel_engine_execlist * const el = &engine->execlist;
-	struct execlist_port *port = el->port;
+	struct execlist_port *port = execlist_port_head(el);
 	struct drm_i915_private *dev_priv = engine->i915;
 
 	/* We can skip acquiring intel_runtime_pm_get() here as it was taken
@@ -751,7 +755,7 @@ static void intel_lrc_irq_handler(unsigned long data)
 				trace_i915_gem_request_out(rq);
 				i915_gem_request_put(rq);
 
-				execlist_port_complete(el, port);
+				port = execlist_port_complete(el, port);
 			} else {
 				port_set(port, port_pack(rq, count));
 			}
@@ -768,7 +772,7 @@ static void intel_lrc_irq_handler(unsigned long data)
 		}
 	}
 
-	if (execlists_elsp_ready(engine))
+	if (execlists_elsp_ready(el))
 		execlists_dequeue(engine);
 
 	intel_uncore_forcewake_put(dev_priv, el->fw_domains);
@@ -778,16 +782,18 @@ static void insert_request(struct intel_engine_cs *engine,
 			   struct i915_priotree *pt,
 			   int prio)
 {
+	struct intel_engine_execlist * const el = &engine->execlist;
 	struct i915_priolist *p = lookup_priolist(engine, pt, prio);
 
 	list_add_tail(&pt->link, &ptr_mask_bits(p, 1)->requests);
-	if (ptr_unmask_bits(p, 1) && execlists_elsp_ready(engine))
-		tasklet_hi_schedule(&engine->execlist.irq_tasklet);
+	if (ptr_unmask_bits(p, 1) && execlists_elsp_ready(el))
+		tasklet_hi_schedule(&el->irq_tasklet);
 }
 
 static void execlists_submit_request(struct drm_i915_gem_request *request)
 {
 	struct intel_engine_cs *engine = request->engine;
+	struct intel_engine_execlist * const el = &engine->execlist;
 	unsigned long flags;
 
 	/* Will be called from irq-context when using foreign fences. */
@@ -795,7 +801,7 @@ static void execlists_submit_request(struct drm_i915_gem_request *request)
 
 	insert_request(engine, &request->priotree, request->priotree.priority);
 
-	GEM_BUG_ON(!engine->execlist.first);
+	GEM_BUG_ON(!el->first);
 	GEM_BUG_ON(list_empty(&request->priotree.link));
 
 	spin_unlock_irqrestore(&engine->timeline->lock, flags);
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index 94e6c2a38fb7..991f6c0bd6c2 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -244,6 +244,11 @@ struct intel_engine_execlist {
 	unsigned int port_mask;
 
 	/**
+	 * @port_head: first used execlist port
+	 */
+	unsigned int port_head;
+
+	/**
 	 * @queue: queue of requests, in priority lists
 	 */
 	struct rb_root queue;
@@ -524,16 +529,47 @@ execlist_num_ports(const struct intel_engine_execlist * const el)
 	return el->port_mask + 1;
 }
 
-static inline void
+#define __port_idx(start, index, mask) (((start) + (index)) & (mask))
+
+static inline struct execlist_port *
+execlist_port_head(struct intel_engine_execlist * const el)
+{
+	return &el->port[el->port_head];
+}
+
+/* Index starting from port_head */
+static inline struct execlist_port *
+execlist_port_index(struct intel_engine_execlist * const el,
+		    const unsigned int n)
+{
+	return &el->port[__port_idx(el->port_head, n, el->port_mask)];
+}
+
+static inline struct execlist_port *
+execlist_port_tail(struct intel_engine_execlist * const el)
+{
+	return &el->port[__port_idx(el->port_head, -1, el->port_mask)];
+}
+
+static inline struct execlist_port *
+execlist_port_next(struct intel_engine_execlist * const el,
+		   const struct execlist_port * const port)
+{
+	const unsigned int i = port_index(port, el);
+
+	return &el->port[__port_idx(i, 1, el->port_mask)];
+}
+
+static inline struct execlist_port *
 execlist_port_complete(struct intel_engine_execlist * const el,
 		       struct execlist_port * const port)
 {
-	const unsigned int m = el->port_mask;
+	GEM_BUG_ON(port_index(port, el) != el->port_head);
 
-	GEM_BUG_ON(port_index(port, el) != 0);
+	memset(port, 0, sizeof(struct execlist_port));
+	el->port_head = __port_idx(el->port_head, 1, el->port_mask);
 
-	memmove(port, port + 1, m * sizeof(struct execlist_port));
-	memset(port + m, 0, sizeof(struct execlist_port));
+	return execlist_port_head(el);
 }
 
 static inline unsigned int
-- 
2.11.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH 7/8] drm/i915: Keep track of reserved execlist ports
  2017-09-20 14:36 [PATCH 0/8] Support for more than two execlist ports (v2) Mika Kuoppala
                   ` (5 preceding siblings ...)
  2017-09-20 14:37 ` [PATCH 6/8] drm/i915: Introduce execlist_port_* accessors Mika Kuoppala
@ 2017-09-20 14:37 ` Mika Kuoppala
  2017-09-21 12:08   ` Mika Kuoppala
  2017-09-21 12:30   ` Chris Wilson
  2017-09-20 14:37 ` [PATCH 8/8] drm/i915: Improve GuC request coalescing Mika Kuoppala
                   ` (2 subsequent siblings)
  9 siblings, 2 replies; 26+ messages in thread
From: Mika Kuoppala @ 2017-09-20 14:37 UTC (permalink / raw)
  To: intel-gfx

To further enhance port processing, keep track of the
reserved ports. This way we can iterate over only the used
subset of the port space. Note that we lift the responsibility
of inspecting hw availability from execlists_submit_request()
and always do the dequeuing. This ensures that only the irq
handler is responsible for keeping track of available ports.

v2: rebase, comment fix, READ_ONCE only outside of irq handler (Chris)

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Michał Winiarski <michal.winiarski@intel.com>
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
---
 drivers/gpu/drm/i915/i915_guc_submission.c | 51 +++++++++--------
 drivers/gpu/drm/i915/i915_irq.c            |  2 +-
 drivers/gpu/drm/i915/intel_engine_cs.c     |  7 ++-
 drivers/gpu/drm/i915/intel_lrc.c           | 90 ++++++++++++++++++------------
 drivers/gpu/drm/i915/intel_ringbuffer.h    | 55 +++++++++++++-----
 5 files changed, 129 insertions(+), 76 deletions(-)

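[Editor's note, not part of the patch: the accounting this patch introduces can be illustrated with a standalone sketch. Names mirror the patch (port_head, port_count, __port_idx) but this is a simplified stand-in, not the i915 implementation — the real helpers operate on struct execlist_port, clear slots with memset() and use GEM_BUG_ON for the invariants.]

```c
/*
 * Sketch of the reserved-port accounting: a power-of-two ring of
 * ports indexed by (head + n) & mask, with port_count tracking how
 * many ports are currently reserved. Illustrative only.
 */
#include <assert.h>

#define NUM_PORTS 4 /* must be a power of two */

struct sketch_execlist {
	int port[NUM_PORTS];	/* stand-in for struct execlist_port */
	unsigned int port_mask;	/* NUM_PORTS - 1 */
	unsigned int port_head;	/* index of first used port */
	unsigned int port_count;/* number of reserved ports */
};

#define __port_idx(start, index, mask) (((start) + (index)) & (mask))

static unsigned int inactive_ports(const struct sketch_execlist *el)
{
	return (el->port_mask + 1) - el->port_count;
}

/* Reserve the next free port at the tail of the ring. */
static int *request_port(struct sketch_execlist *el)
{
	assert(inactive_ports(el));
	el->port_count++;
	return &el->port[__port_idx(el->port_head, el->port_count - 1,
				    el->port_mask)];
}

/* Retire the head port and advance the ring. */
static void release_port(struct sketch_execlist *el)
{
	assert(el->port_count);
	el->port[el->port_head] = 0;
	el->port_head = __port_idx(el->port_head, 1, el->port_mask);
	el->port_count--;
}
```

Because the ring size is a power of two, head advancement and tail lookup reduce to a single mask, which is why intel_engine_init_execlist() asserts BUILD_BUG_ON_NOT_POWER_OF_2 on the port count.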
diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
index 25c9bac94c39..359f57a59cba 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -487,7 +487,7 @@ static void guc_ring_doorbell(struct i915_guc_client *client)
  * @engine: engine associated with the commands
  *
  * The only error here arises if the doorbell hardware isn't functioning
- * as expected, which really shouln't happen.
+ * as expected, which really shouldn't happen.
  */
 static void i915_guc_submit(struct intel_engine_cs *engine)
 {
@@ -495,17 +495,19 @@ static void i915_guc_submit(struct intel_engine_cs *engine)
 	struct intel_guc *guc = &dev_priv->guc;
 	struct i915_guc_client *client = guc->execbuf_client;
 	struct intel_engine_execlist * const el = &engine->execlist;
-	struct execlist_port *port = el->port;
 	const unsigned int engine_id = engine->id;
 	unsigned int n;
 
-	for (n = 0; n < ARRAY_SIZE(el->port); n++) {
+	for (n = 0; n < execlist_active_ports(el); n++) {
+		struct execlist_port *port;
 		struct drm_i915_gem_request *rq;
 		unsigned int count;
 
-		rq = port_unpack(&port[n], &count);
+		port = execlist_port_index(el, n);
+
+		rq = port_unpack(port, &count);
 		if (rq && count == 0) {
-			port_set(&port[n], port_pack(rq, ++count));
+			port_set(port, port_pack(rq, ++count));
 
 			if (i915_vma_is_map_and_fenceable(rq->ring->vma))
 				POSTING_READ_FW(GUC_STATUS);
@@ -560,25 +562,27 @@ static void port_assign(struct execlist_port *port,
 static void i915_guc_dequeue(struct intel_engine_cs *engine)
 {
 	struct intel_engine_execlist * const el = &engine->execlist;
-	struct execlist_port *port = el->port;
+	struct execlist_port *port;
 	struct drm_i915_gem_request *last = NULL;
-	const struct execlist_port * const last_port = execlist_port_tail(el);
 	bool submit = false;
 	struct rb_node *rb;
 
-	if (port_isset(port))
-		port++;
-
 	spin_lock_irq(&engine->timeline->lock);
 	rb = el->first;
 	GEM_BUG_ON(rb_first(&el->queue) != rb);
-	while (rb) {
+
+	if (unlikely(!rb))
+		goto done;
+
+	port = execlist_request_port(el);
+
+	do {
 		struct i915_priolist *p = rb_entry(rb, typeof(*p), node);
 		struct drm_i915_gem_request *rq, *rn;
 
 		list_for_each_entry_safe(rq, rn, &p->requests, priotree.link) {
 			if (last && rq->ctx != last->ctx) {
-				if (port == last_port) {
+				if (!execlist_inactive_ports(el)) {
 					__list_del_many(&p->requests,
 							&rq->priotree.link);
 					goto done;
@@ -587,7 +591,8 @@ static void i915_guc_dequeue(struct intel_engine_cs *engine)
 				if (submit)
 					port_assign(port, last);
 
-				port = execlist_port_next(el, port);
+				port = execlist_request_port(el);
+				GEM_BUG_ON(port_isset(port));
 			}
 
 			INIT_LIST_HEAD(&rq->priotree.link);
@@ -604,7 +609,7 @@ static void i915_guc_dequeue(struct intel_engine_cs *engine)
 		INIT_LIST_HEAD(&p->requests);
 		if (p->priority != I915_PRIORITY_NORMAL)
 			kmem_cache_free(engine->i915->priorities, p);
-	}
+	} while (rb);
 done:
 	el->first = rb;
 	if (submit) {
@@ -618,21 +623,21 @@ static void i915_guc_irq_handler(unsigned long data)
 {
 	struct intel_engine_cs * const engine = (struct intel_engine_cs *)data;
 	struct intel_engine_execlist * const el = &engine->execlist;
-	struct execlist_port *port = execlist_port_head(el);
-	const struct execlist_port * const last_port = execlist_port_tail(el);
-	struct drm_i915_gem_request *rq;
 
-	rq = port_request(port);
-	while (rq && i915_gem_request_completed(rq)) {
+	while (execlist_active_ports(el)) {
+		struct execlist_port *port = execlist_port_head(el);
+		struct drm_i915_gem_request *rq = port_request(port);
+
+		if (!i915_gem_request_completed(rq))
+			break;
+
 		trace_i915_gem_request_out(rq);
 		i915_gem_request_put(rq);
 
-		port = execlist_port_complete(el, port);
-
-		rq = port_request(port);
+		execlist_release_port(el, port);
 	}
 
-	if (!port_isset(last_port))
+	if (execlist_inactive_ports(el))
 		i915_guc_dequeue(engine);
 }
 
diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index ac5a95439393..a9d888b726c4 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -1342,7 +1342,7 @@ gen8_cs_irq_handler(struct intel_engine_cs *engine, u32 iir, int test_shift)
 	bool tasklet = false;
 
 	if (iir & (GT_CONTEXT_SWITCH_INTERRUPT << test_shift)) {
-		if (port_count(execlist_port_head(el))) {
+		if (READ_ONCE(el->port_count)) {
 			__set_bit(ENGINE_IRQ_EXECLIST, &engine->irq_posted);
 			tasklet = true;
 		}
diff --git a/drivers/gpu/drm/i915/intel_engine_cs.c b/drivers/gpu/drm/i915/intel_engine_cs.c
index b0d702063a50..29b170fdd6ef 100644
--- a/drivers/gpu/drm/i915/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/intel_engine_cs.c
@@ -407,6 +407,9 @@ static void intel_engine_init_execlist(struct intel_engine_cs *engine)
 	BUILD_BUG_ON_NOT_POWER_OF_2(execlist_num_ports(&engine->execlist));
 	GEM_BUG_ON(execlist_num_ports(&engine->execlist) > EXECLIST_MAX_PORTS);
 
+	el->port_head = 0;
+	el->port_count = 0;
+
 	el->queue = RB_ROOT;
 	el->first = NULL;
 }
@@ -1501,8 +1504,8 @@ bool intel_engine_is_idle(struct intel_engine_cs *engine)
 	if (test_bit(ENGINE_IRQ_EXECLIST, &engine->irq_posted))
 		return false;
 
-	/* Both ports drained, no more ELSP submission? */
-	if (port_request(execlist_port_head(&engine->execlist)))
+	/* All ports drained, no more ELSP submission? */
+	if (execlist_active_ports(&engine->execlist))
 		return false;
 
 	/* ELSP is empty, but there are ready requests? */
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 8550cd6635c9..bea10620bed2 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -399,26 +399,29 @@ static void execlists_submit_ports(struct intel_engine_cs *engine)
 		engine->i915->regs + i915_mmio_reg_offset(RING_ELSP(engine));
 	unsigned int n;
 
-	for (n = execlist_num_ports(el); n--; ) {
-		struct execlist_port *port;
+	for (n = 0; n < execlist_inactive_ports(el); n++) {
+		writel(0, elsp);
+		writel(0, elsp);
+	}
+
+	for (n = execlist_active_ports(el); n--; ) {
 		struct drm_i915_gem_request *rq;
+		struct execlist_port *port;
 		unsigned int count;
 		u64 desc;
 
 		port = execlist_port_index(el, n);
-
 		rq = port_unpack(port, &count);
-		if (rq) {
-			GEM_BUG_ON(count > !n);
-			if (!count++)
-				execlists_context_status_change(rq, INTEL_CONTEXT_SCHEDULE_IN);
-			port_set(port, port_pack(rq, count));
-			desc = execlists_update_context(rq);
-			GEM_DEBUG_EXEC(port->context_id = upper_32_bits(desc));
-		} else {
-			GEM_BUG_ON(!n);
-			desc = 0;
-		}
+
+		GEM_BUG_ON(!rq);
+		GEM_BUG_ON(count > !n);
+
+		if (!count++)
+			execlists_context_status_change(rq, INTEL_CONTEXT_SCHEDULE_IN);
+
+		port_set(port, port_pack(rq, count));
+		desc = execlists_update_context(rq);
+		GEM_DEBUG_EXEC(port->context_id = upper_32_bits(desc));
 
 		writel(upper_32_bits(desc), elsp);
 		writel(lower_32_bits(desc), elsp);
@@ -456,15 +459,23 @@ static void port_assign(struct execlist_port *port,
 
 static void execlists_dequeue(struct intel_engine_cs *engine)
 {
-	struct drm_i915_gem_request *last;
 	struct intel_engine_execlist * const el = &engine->execlist;
-	struct execlist_port *port = execlist_port_head(el);
-	const struct execlist_port * const last_port = execlist_port_tail(el);
+	struct execlist_port *port;
+	struct drm_i915_gem_request *last;
 	struct rb_node *rb;
 	bool submit = false;
 
-	last = port_request(port);
-	if (last)
+	spin_lock_irq(&engine->timeline->lock);
+	rb = el->first;
+	GEM_BUG_ON(rb_first(&el->queue) != rb);
+
+	if (unlikely(!rb))
+		goto done;
+
+	if (execlist_active_ports(el)) {
+		port = execlist_port_tail(el);
+		last = port_request(port);
+
 		/* WaIdleLiteRestore:bdw,skl
 		 * Apply the wa NOOPs to prevent ring:HEAD == req:TAIL
 		 * as we resubmit the request. See gen8_emit_breadcrumb()
@@ -472,6 +483,11 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
 		 * request.
 		 */
 		last->tail = last->wa_tail;
+	} else {
+		/* Allocate first port to coalesce into */
+		port = execlist_request_port(el);
+		last = NULL;
+	}
 
 	/* Hardware submission is through 2 ports. Conceptually each port
 	 * has a (RING_START, RING_HEAD, RING_TAIL) tuple. RING_START is
@@ -493,11 +509,7 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
 	 * sequence of requests as being the most optimal (fewest wake ups
 	 * and context switches) submission.
 	 */
-
-	spin_lock_irq(&engine->timeline->lock);
-	rb = el->first;
-	GEM_BUG_ON(rb_first(&el->queue) != rb);
-	while (rb) {
+	do {
 		struct i915_priolist *p = rb_entry(rb, typeof(*p), node);
 		struct drm_i915_gem_request *rq, *rn;
 
@@ -515,11 +527,11 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
 			 */
 			if (last && !can_merge_ctx(rq->ctx, last->ctx)) {
 				/*
-				 * If we are on the second port and cannot
+				 * If we are on the last port and cannot
 				 * combine this request with the last, then we
 				 * are done.
 				 */
-				if (port == last_port) {
+				if (!execlist_inactive_ports(el)) {
 					__list_del_many(&p->requests,
 							&rq->priotree.link);
 					goto done;
@@ -544,8 +556,7 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
 				if (submit)
 					port_assign(port, last);
 
-				port = execlist_port_next(el, port);
-
+				port = execlist_request_port(el);
 				GEM_BUG_ON(port_isset(port));
 			}
 
@@ -563,7 +574,8 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
 		INIT_LIST_HEAD(&p->requests);
 		if (p->priority != I915_PRIORITY_NORMAL)
 			kmem_cache_free(engine->i915->priorities, p);
-	}
+	} while (rb);
+
 done:
 	el->first = rb;
 	if (submit)
@@ -582,6 +594,9 @@ static void execlist_cancel_port_requests(struct intel_engine_execlist *el)
 		i915_gem_request_put(port_request(&el->port[i]));
 
 	memset(el->port, 0, sizeof(el->port));
+
+	el->port_count = 0;
+	el->port_head = 0;
 }
 
 static void execlists_cancel_requests(struct intel_engine_cs *engine)
@@ -643,10 +658,12 @@ static void execlists_cancel_requests(struct intel_engine_cs *engine)
 
 static bool execlists_elsp_ready(struct intel_engine_execlist * const el)
 {
-	struct execlist_port * const port0 = execlist_port_head(el);
-	struct execlist_port * const port1 = execlist_port_next(el, port0);
+	const unsigned int active = execlist_active_ports(el);
+
+	if (!active)
+		return true;
 
-	return port_count(port0) + port_count(port1) < 2;
+	return port_count(execlist_port_tail(el)) + active < 2;
 }
 
 /*
@@ -657,7 +674,6 @@ static void intel_lrc_irq_handler(unsigned long data)
 {
 	struct intel_engine_cs * const engine = (struct intel_engine_cs *)data;
 	struct intel_engine_execlist * const el = &engine->execlist;
-	struct execlist_port *port = execlist_port_head(el);
 	struct drm_i915_private *dev_priv = engine->i915;
 
 	/* We can skip acquiring intel_runtime_pm_get() here as it was taken
@@ -714,6 +730,7 @@ static void intel_lrc_irq_handler(unsigned long data)
 		}
 
 		while (head != tail) {
+			struct execlist_port *port;
 			struct drm_i915_gem_request *rq;
 			unsigned int status;
 			unsigned int count;
@@ -742,6 +759,7 @@ static void intel_lrc_irq_handler(unsigned long data)
 			if (!(status & GEN8_CTX_STATUS_COMPLETED_MASK))
 				continue;
 
+			port = execlist_port_head(el);
 			/* Check the context/desc id for this event matches */
 			GEM_DEBUG_BUG_ON(buf[2 * head + 1] != port->context_id);
 
@@ -755,13 +773,13 @@ static void intel_lrc_irq_handler(unsigned long data)
 				trace_i915_gem_request_out(rq);
 				i915_gem_request_put(rq);
 
-				port = execlist_port_complete(el, port);
+				execlist_release_port(el, port);
 			} else {
 				port_set(port, port_pack(rq, count));
 			}
 
 			/* After the final element, the hw should be idle */
-			GEM_BUG_ON(port_count(port) == 0 &&
+			GEM_BUG_ON(execlist_active_ports(el) == 0 &&
 				   !(status & GEN8_CTX_STATUS_ACTIVE_IDLE));
 		}
 
@@ -786,7 +804,7 @@ static void insert_request(struct intel_engine_cs *engine,
 	struct i915_priolist *p = lookup_priolist(engine, pt, prio);
 
 	list_add_tail(&pt->link, &ptr_mask_bits(p, 1)->requests);
-	if (ptr_unmask_bits(p, 1) && execlists_elsp_ready(el))
+	if (ptr_unmask_bits(p, 1))
 		tasklet_hi_schedule(&el->irq_tasklet);
 }
 
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index 991f6c0bd6c2..efa5a8ea1ecb 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -249,6 +249,11 @@ struct intel_engine_execlist {
 	unsigned int port_head;
 
 	/**
+	 * @port_count: reserved ports
+	 */
+	unsigned int port_count;
+
+	/**
 	 * @queue: queue of requests, in priority lists
 	 */
 	struct rb_root queue;
@@ -529,14 +534,20 @@ execlist_num_ports(const struct intel_engine_execlist * const el)
 	return el->port_mask + 1;
 }
 
-#define __port_idx(start, index, mask) (((start) + (index)) & (mask))
+static inline unsigned int
+execlist_active_ports(const struct intel_engine_execlist * const el)
+{
+	return el->port_count;
+}
 
-static inline struct execlist_port *
-execlist_port_head(struct intel_engine_execlist * const el)
+static inline unsigned int
+execlist_inactive_ports(const struct intel_engine_execlist * const el)
 {
-	return &el->port[el->port_head];
+	return execlist_num_ports(el) - execlist_active_ports(el);
 }
 
+#define __port_idx(start, index, mask) (((start) + (index)) & (mask))
+
 /* Index starting from port_head */
 static inline struct execlist_port *
 execlist_port_index(struct intel_engine_execlist * const el,
@@ -546,30 +557,46 @@ execlist_port_index(struct intel_engine_execlist * const el,
 }
 
 static inline struct execlist_port *
-execlist_port_tail(struct intel_engine_execlist * const el)
+execlist_port_head(struct intel_engine_execlist * const el)
 {
-	return &el->port[__port_idx(el->port_head, -1, el->port_mask)];
+	GEM_BUG_ON(!el->port_count);
+
+	return execlist_port_index(el, 0);
 }
 
 static inline struct execlist_port *
-execlist_port_next(struct intel_engine_execlist * const el,
-		   const struct execlist_port * const port)
+execlist_port_tail(struct intel_engine_execlist * const el)
 {
-	const unsigned int i = port_index(port, el);
+	GEM_BUG_ON(!el->port_count);
 
-	return &el->port[__port_idx(i, 1, el->port_mask)];
+	return execlist_port_index(el, el->port_count - 1);
 }
 
 static inline struct execlist_port *
-execlist_port_complete(struct intel_engine_execlist * const el,
-		       struct execlist_port * const port)
+execlist_request_port(struct intel_engine_execlist * const el)
 {
+	GEM_BUG_ON(el->port_count == el->port_mask + 1);
+
+	el->port_count++;
+
+	GEM_BUG_ON(port_isset(execlist_port_tail(el)));
+
+	return execlist_port_tail(el);
+}
+
+static inline void
+execlist_release_port(struct intel_engine_execlist * const el,
+		      struct execlist_port * const port)
+{
+
 	GEM_BUG_ON(port_index(port, el) != el->port_head);
+	GEM_BUG_ON(!port_isset(port));
+	GEM_BUG_ON(!el->port_count);
 
 	memset(port, 0, sizeof(struct execlist_port));
-	el->port_head = __port_idx(el->port_head, 1, el->port_mask);
 
-	return execlist_port_head(el);
+	el->port_head = __port_idx(el->port_head, 1, el->port_mask);
+	el->port_count--;
 }
 
 static inline unsigned int
-- 
2.11.0

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH 8/8] drm/i915: Improve GuC request coalescing
  2017-09-20 14:36 [PATCH 0/8] Support for more than two execlist ports (v2) Mika Kuoppala
                   ` (6 preceding siblings ...)
  2017-09-20 14:37 ` [PATCH 7/8] drm/i915: Keep track of reserved execlist ports Mika Kuoppala
@ 2017-09-20 14:37 ` Mika Kuoppala
  2017-09-21 12:34   ` Chris Wilson
  2017-09-21 12:53   ` Michał Winiarski
  2017-09-20 15:19 ` ✓ Fi.CI.BAT: success for Support for more than two execlist ports (rev2) Patchwork
  2017-09-20 16:34 ` ✓ Fi.CI.IGT: " Patchwork
  9 siblings, 2 replies; 26+ messages in thread
From: Mika Kuoppala @ 2017-09-20 14:37 UTC (permalink / raw)
  To: intel-gfx

Now that we can keep track of which ports we have
dequeued, coalesce only those ports instead of iterating
through all of them.

Cc: Michał Winiarski <michal.winiarski@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
---
 drivers/gpu/drm/i915/i915_guc_submission.c | 31 +++++++++++++++++-------------
 drivers/gpu/drm/i915/intel_ringbuffer.h    |  9 +++++++++
 2 files changed, 27 insertions(+), 13 deletions(-)

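[Editor's note, not part of the patch: the masked index arithmetic used by execlist_get_port_index() below can be shown in isolation. Given a port's absolute array index, the logical offset from port_head is recovered by masked subtraction, letting submission resume coalescing from the first newly-filled port. Hypothetical standalone names; not the i915 code.]

```c
/*
 * Recover a port's logical offset from port_head given its absolute
 * array index, modulo the ring size. Adding -head as an unsigned
 * value and masking is equivalent to (n - head) mod NUM_PORTS.
 */
#include <assert.h>

#define PORT_MASK 3u /* 4 ports */

#define __port_idx(start, index, mask) (((start) + (index)) & (mask))

static unsigned int get_port_index(unsigned int n, unsigned int head)
{
	return __port_idx(n, -head, PORT_MASK);
}
```

With head at 3 on a 4-port ring, absolute index 0 is the slot one past the head, so its logical index is 1; i915_guc_submit() can then walk from that logical index up to execlist_active_ports().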
diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
index 359f57a59cba..1057a0fb9f27 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -485,11 +485,13 @@ static void guc_ring_doorbell(struct i915_guc_client *client)
 /**
  * i915_guc_submit() - Submit commands through GuC
  * @engine: engine associated with the commands
+ * @first: index of first execlist port to start coalescing from
  *
  * The only error here arises if the doorbell hardware isn't functioning
  * as expected, which really shouldn't happen.
  */
-static void i915_guc_submit(struct intel_engine_cs *engine)
+static void i915_guc_submit(struct intel_engine_cs *engine,
+			    const unsigned int first)
 {
 	struct drm_i915_private *dev_priv = engine->i915;
 	struct intel_guc *guc = &dev_priv->guc;
@@ -498,7 +500,7 @@ static void i915_guc_submit(struct intel_engine_cs *engine)
 	const unsigned int engine_id = engine->id;
 	unsigned int n;
 
-	for (n = 0; n < execlist_active_ports(el); n++) {
+	for (n = first; n < execlist_active_ports(el); n++) {
 		struct execlist_port *port;
 		struct drm_i915_gem_request *rq;
 		unsigned int count;
@@ -506,21 +508,22 @@ static void i915_guc_submit(struct intel_engine_cs *engine)
 		port = execlist_port_index(el, n);
 
 		rq = port_unpack(port, &count);
-		if (rq && count == 0) {
-			port_set(port, port_pack(rq, ++count));
+		GEM_BUG_ON(!rq);
+		GEM_BUG_ON(count);
 
-			if (i915_vma_is_map_and_fenceable(rq->ring->vma))
-				POSTING_READ_FW(GUC_STATUS);
+		port_set(port, port_pack(rq, ++count));
 
-			spin_lock(&client->wq_lock);
+		if (i915_vma_is_map_and_fenceable(rq->ring->vma))
+			POSTING_READ_FW(GUC_STATUS);
 
-			guc_wq_item_append(client, rq);
-			guc_ring_doorbell(client);
+		spin_lock(&client->wq_lock);
 
-			client->submissions[engine_id] += 1;
+		guc_wq_item_append(client, rq);
+		guc_ring_doorbell(client);
 
-			spin_unlock(&client->wq_lock);
-		}
+		client->submissions[engine_id] += 1;
+
+		spin_unlock(&client->wq_lock);
 	}
 }
 
@@ -566,6 +569,7 @@ static void i915_guc_dequeue(struct intel_engine_cs *engine)
 	struct drm_i915_gem_request *last = NULL;
 	bool submit = false;
 	struct rb_node *rb;
+	unsigned int first_idx;
 
 	spin_lock_irq(&engine->timeline->lock);
 	rb = el->first;
@@ -575,6 +579,7 @@ static void i915_guc_dequeue(struct intel_engine_cs *engine)
 		goto done;
 
 	port = execlist_request_port(el);
+	first_idx = execlist_get_port_index(el, port);
 
 	do {
 		struct i915_priolist *p = rb_entry(rb, typeof(*p), node);
@@ -614,7 +619,7 @@ static void i915_guc_dequeue(struct intel_engine_cs *engine)
 	el->first = rb;
 	if (submit) {
 		port_assign(port, last);
-		i915_guc_submit(engine);
+		i915_guc_submit(engine, first_idx);
 	}
 	spin_unlock_irq(&engine->timeline->lock);
 }
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index efa5a8ea1ecb..f2eb32539300 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -556,6 +556,15 @@ execlist_port_index(struct intel_engine_execlist * const el,
 	return &el->port[__port_idx(el->port_head, n, el->port_mask)];
 }
 
+static inline unsigned int
+execlist_get_port_index(const struct intel_engine_execlist * const el,
+			const struct execlist_port * const port)
+{
+	const unsigned int n = port_index(port, el);
+
+	return __port_idx(n, -el->port_head, el->port_mask);
+}
+
 static inline struct execlist_port *
 execlist_port_head(struct intel_engine_execlist * const el)
 {
-- 
2.11.0

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* ✓ Fi.CI.BAT: success for Support for more than two execlist ports (rev2)
  2017-09-20 14:36 [PATCH 0/8] Support for more than two execlist ports (v2) Mika Kuoppala
                   ` (7 preceding siblings ...)
  2017-09-20 14:37 ` [PATCH 8/8] drm/i915: Improve GuC request coalescing Mika Kuoppala
@ 2017-09-20 15:19 ` Patchwork
  2017-09-20 16:34 ` ✓ Fi.CI.IGT: " Patchwork
  9 siblings, 0 replies; 26+ messages in thread
From: Patchwork @ 2017-09-20 15:19 UTC (permalink / raw)
  To: Mika Kuoppala; +Cc: intel-gfx

== Series Details ==

Series: Support for more than two execlist ports (rev2)
URL   : https://patchwork.freedesktop.org/series/30183/
State : success

== Summary ==

Series 30183v2 Support for more than two execlist ports
https://patchwork.freedesktop.org/api/1.0/series/30183/revisions/2/mbox/

Test gem_exec_suspend:
        Subgroup basic-s3:
                incomplete -> PASS       (fi-kbl-7500u) fdo#102850
Test kms_cursor_legacy:
        Subgroup basic-busy-flip-before-cursor-legacy:
                fail       -> PASS       (fi-snb-2600) fdo#100215
Test kms_force_connector_basic:
        Subgroup force-connector-state:
                skip       -> PASS       (fi-ivb-3520m)
        Subgroup force-edid:
                skip       -> PASS       (fi-ivb-3520m)
        Subgroup force-load-detect:
                skip       -> PASS       (fi-ivb-3520m)
        Subgroup prune-stale-modes:
                skip       -> PASS       (fi-ivb-3520m)
Test pm_rpm:
        Subgroup basic-rte:
                dmesg-warn -> PASS       (fi-cfl-s) fdo#102294
Test drv_module_reload:
        Subgroup basic-reload:
                pass       -> DMESG-WARN (fi-glk-1) fdo#102777

fdo#102850 https://bugs.freedesktop.org/show_bug.cgi?id=102850
fdo#100215 https://bugs.freedesktop.org/show_bug.cgi?id=100215
fdo#102294 https://bugs.freedesktop.org/show_bug.cgi?id=102294
fdo#102777 https://bugs.freedesktop.org/show_bug.cgi?id=102777

fi-bdw-5557u     total:289  pass:268  dwarn:0   dfail:0   fail:0   skip:21  time:444s
fi-bdw-gvtdvm    total:289  pass:265  dwarn:0   dfail:0   fail:0   skip:24  time:468s
fi-blb-e6850     total:289  pass:224  dwarn:1   dfail:0   fail:0   skip:64  time:427s
fi-bsw-n3050     total:289  pass:243  dwarn:0   dfail:0   fail:0   skip:46  time:531s
fi-bwr-2160      total:289  pass:184  dwarn:0   dfail:0   fail:0   skip:105 time:277s
fi-bxt-j4205     total:289  pass:260  dwarn:0   dfail:0   fail:0   skip:29  time:498s
fi-byt-j1900     total:289  pass:254  dwarn:1   dfail:0   fail:0   skip:34  time:495s
fi-byt-n2820     total:289  pass:250  dwarn:1   dfail:0   fail:0   skip:38  time:492s
fi-cfl-s         total:289  pass:223  dwarn:34  dfail:0   fail:0   skip:32  time:540s
fi-elk-e7500     total:289  pass:230  dwarn:0   dfail:0   fail:0   skip:59  time:425s
fi-glk-1         total:289  pass:259  dwarn:1   dfail:0   fail:0   skip:29  time:566s
fi-hsw-4770      total:289  pass:263  dwarn:0   dfail:0   fail:0   skip:26  time:425s
fi-hsw-4770r     total:289  pass:263  dwarn:0   dfail:0   fail:0   skip:26  time:407s
fi-ilk-650       total:289  pass:229  dwarn:0   dfail:0   fail:0   skip:60  time:433s
fi-ivb-3520m     total:289  pass:261  dwarn:0   dfail:0   fail:0   skip:28  time:477s
fi-ivb-3770      total:289  pass:261  dwarn:0   dfail:0   fail:0   skip:28  time:461s
fi-kbl-7500u     total:289  pass:264  dwarn:1   dfail:0   fail:0   skip:24  time:473s
fi-kbl-7560u     total:289  pass:270  dwarn:0   dfail:0   fail:0   skip:19  time:576s
fi-kbl-r         total:289  pass:262  dwarn:0   dfail:0   fail:0   skip:27  time:588s
fi-pnv-d510      total:289  pass:223  dwarn:1   dfail:0   fail:0   skip:65  time:542s
fi-skl-6260u     total:289  pass:269  dwarn:0   dfail:0   fail:0   skip:20  time:446s
fi-skl-6700k     total:289  pass:265  dwarn:0   dfail:0   fail:0   skip:24  time:749s
fi-skl-6770hq    total:289  pass:269  dwarn:0   dfail:0   fail:0   skip:20  time:490s
fi-skl-gvtdvm    total:289  pass:266  dwarn:0   dfail:0   fail:0   skip:23  time:476s
fi-snb-2520m     total:289  pass:251  dwarn:0   dfail:0   fail:0   skip:38  time:560s
fi-snb-2600      total:289  pass:250  dwarn:0   dfail:0   fail:0   skip:39  time:413s

ed7a99bf23cea2276e77ba26d219e58e069d1e32 drm-tip: 2017y-09m-20d-11h-03m-37s UTC integration manifest
03b7a2df3fd1 drm/i915: Improve GuC request coalescing
582ba31b7a9f drm/i915: Keep track of reserved execlist ports
02b6dae25461 drm/i915: Introduce execlist_port_* accessors
202cc6c2b792 drm/i915: Make execlist port count variable
3853e558bdd7 drm/i915: Add execlist_port_complete
9dd456d33cc9 drm/i915: Wrap port cancellation into a function
830307905e9c drm/i915: Move execlist initialization into intel_engine_cs.c
791be2cd0de1 drm/i915: Make own struct for execlist items

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_5766/
^ permalink raw reply	[flat|nested] 26+ messages in thread

* ✓ Fi.CI.IGT: success for Support for more than two execlist ports (rev2)
  2017-09-20 14:36 [PATCH 0/8] Support for more than two execlist ports (v2) Mika Kuoppala
                   ` (8 preceding siblings ...)
  2017-09-20 15:19 ` ✓ Fi.CI.BAT: success for Support for more than two execlist ports (rev2) Patchwork
@ 2017-09-20 16:34 ` Patchwork
  9 siblings, 0 replies; 26+ messages in thread
From: Patchwork @ 2017-09-20 16:34 UTC (permalink / raw)
  To: Mika Kuoppala; +Cc: intel-gfx

== Series Details ==

Series: Support for more than two execlist ports (rev2)
URL   : https://patchwork.freedesktop.org/series/30183/
State : success

== Summary ==

Test perf:
        Subgroup blocking:
                pass       -> FAIL       (shard-hsw) fdo#102252 +1

fdo#102252 https://bugs.freedesktop.org/show_bug.cgi?id=102252

shard-hsw        total:2317 pass:1245 dwarn:2   dfail:0   fail:13  skip:1057 time:9584s

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_5766/shards.html
^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 1/8] drm/i915: Make own struct for execlist items
  2017-09-20 14:36 ` [PATCH 1/8] drm/i915: Make own struct for execlist items Mika Kuoppala
@ 2017-09-21 11:55   ` Michał Winiarski
  2017-09-21 12:19   ` Chris Wilson
  2017-09-21 15:13   ` Joonas Lahtinen
  2 siblings, 0 replies; 26+ messages in thread
From: Michał Winiarski @ 2017-09-21 11:55 UTC (permalink / raw)
  To: Mika Kuoppala; +Cc: intel-gfx

On Wed, Sep 20, 2017 at 05:36:58PM +0300, Mika Kuoppala wrote:
> Engine's execlist related items have been increasing to
> a point where a separate struct is warranted. Carve
> execlist-specific items into a dedicated struct to add clarity.
> 
> v2: add kerneldoc and fix whitespace (Joonas, Chris)
> v3: csb_mmio changes, rebase
> 
> Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
> Acked-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>

Reviewed-by: Michał Winiarski <michal.winiarski@intel.com>

-Michał

> ---
>  drivers/gpu/drm/i915/i915_debugfs.c        |   8 +--
>  drivers/gpu/drm/i915/i915_gem.c            |   6 +-
>  drivers/gpu/drm/i915/i915_gpu_error.c      |   4 +-
>  drivers/gpu/drm/i915/i915_guc_submission.c |  31 +++++----
>  drivers/gpu/drm/i915/i915_irq.c            |   5 +-
>  drivers/gpu/drm/i915/intel_engine_cs.c     |  12 ++--
>  drivers/gpu/drm/i915/intel_lrc.c           | 100 +++++++++++++++--------------
>  drivers/gpu/drm/i915/intel_ringbuffer.h    | 100 +++++++++++++++++++++++------
>  8 files changed, 167 insertions(+), 99 deletions(-)

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 7/8] drm/i915: Keep track of reserved execlist ports
  2017-09-20 14:37 ` [PATCH 7/8] drm/i915: Keep track of reserved execlist ports Mika Kuoppala
@ 2017-09-21 12:08   ` Mika Kuoppala
  2017-09-21 12:30   ` Chris Wilson
  1 sibling, 0 replies; 26+ messages in thread
From: Mika Kuoppala @ 2017-09-21 12:08 UTC (permalink / raw)
  To: intel-gfx

Mika Kuoppala <mika.kuoppala@linux.intel.com> writes:

> To further enhance port processing, keep track of
> reserved ports. This way we can iterate only the used subset
> of port space. Note that we lift the responsibility of
> execlists_submit_request() to inspect hw availability and

s/execlists_submit_request/insert_request/

-Mika

> always do dequeuing. This is to ensure that only the irq
> handler will be responsible for keeping track of available ports.
>
> v2: rebase, comment fix, READ_ONCE only outside of irq handler (Chris)
>
> Cc: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Michał Winiarski <michal.winiarski@intel.com>
> Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
> ---
>  drivers/gpu/drm/i915/i915_guc_submission.c | 51 +++++++++--------
>  drivers/gpu/drm/i915/i915_irq.c            |  2 +-
>  drivers/gpu/drm/i915/intel_engine_cs.c     |  7 ++-
>  drivers/gpu/drm/i915/intel_lrc.c           | 90 ++++++++++++++++++------------
>  drivers/gpu/drm/i915/intel_ringbuffer.h    | 55 +++++++++++++-----
>  5 files changed, 129 insertions(+), 76 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
> index 25c9bac94c39..359f57a59cba 100644
> --- a/drivers/gpu/drm/i915/i915_guc_submission.c
> +++ b/drivers/gpu/drm/i915/i915_guc_submission.c
> @@ -487,7 +487,7 @@ static void guc_ring_doorbell(struct i915_guc_client *client)
>   * @engine: engine associated with the commands
>   *
>   * The only error here arises if the doorbell hardware isn't functioning
> - * as expected, which really shouln't happen.
> + * as expected, which really shouldn't happen.
>   */
>  static void i915_guc_submit(struct intel_engine_cs *engine)
>  {
> @@ -495,17 +495,19 @@ static void i915_guc_submit(struct intel_engine_cs *engine)
>  	struct intel_guc *guc = &dev_priv->guc;
>  	struct i915_guc_client *client = guc->execbuf_client;
>  	struct intel_engine_execlist * const el = &engine->execlist;
> -	struct execlist_port *port = el->port;
>  	const unsigned int engine_id = engine->id;
>  	unsigned int n;
>  
> -	for (n = 0; n < ARRAY_SIZE(el->port); n++) {
> +	for (n = 0; n < execlist_active_ports(el); n++) {
> +		struct execlist_port *port;
>  		struct drm_i915_gem_request *rq;
>  		unsigned int count;
>  
> -		rq = port_unpack(&port[n], &count);
> +		port = execlist_port_index(el, n);
> +
> +		rq = port_unpack(port, &count);
>  		if (rq && count == 0) {
> -			port_set(&port[n], port_pack(rq, ++count));
> +			port_set(port, port_pack(rq, ++count));
>  
>  			if (i915_vma_is_map_and_fenceable(rq->ring->vma))
>  				POSTING_READ_FW(GUC_STATUS);
> @@ -560,25 +562,27 @@ static void port_assign(struct execlist_port *port,
>  static void i915_guc_dequeue(struct intel_engine_cs *engine)
>  {
>  	struct intel_engine_execlist * const el = &engine->execlist;
> -	struct execlist_port *port = el->port;
> +	struct execlist_port *port;
>  	struct drm_i915_gem_request *last = NULL;
> -	const struct execlist_port * const last_port = execlist_port_tail(el);
>  	bool submit = false;
>  	struct rb_node *rb;
>  
> -	if (port_isset(port))
> -		port++;
> -
>  	spin_lock_irq(&engine->timeline->lock);
>  	rb = el->first;
>  	GEM_BUG_ON(rb_first(&el->queue) != rb);
> -	while (rb) {
> +
> +	if (unlikely(!rb))
> +		goto done;
> +
> +	port = execlist_request_port(el);
> +
> +	do {
>  		struct i915_priolist *p = rb_entry(rb, typeof(*p), node);
>  		struct drm_i915_gem_request *rq, *rn;
>  
>  		list_for_each_entry_safe(rq, rn, &p->requests, priotree.link) {
>  			if (last && rq->ctx != last->ctx) {
> -				if (port == last_port) {
> +				if (!execlist_inactive_ports(el)) {
>  					__list_del_many(&p->requests,
>  							&rq->priotree.link);
>  					goto done;
> @@ -587,7 +591,8 @@ static void i915_guc_dequeue(struct intel_engine_cs *engine)
>  				if (submit)
>  					port_assign(port, last);
>  
> -				port = execlist_port_next(el, port);
> +				port = execlist_request_port(el);
> +				GEM_BUG_ON(port_isset(port));
>  			}
>  
>  			INIT_LIST_HEAD(&rq->priotree.link);
> @@ -604,7 +609,7 @@ static void i915_guc_dequeue(struct intel_engine_cs *engine)
>  		INIT_LIST_HEAD(&p->requests);
>  		if (p->priority != I915_PRIORITY_NORMAL)
>  			kmem_cache_free(engine->i915->priorities, p);
> -	}
> +	} while (rb);
>  done:
>  	el->first = rb;
>  	if (submit) {
> @@ -618,21 +623,21 @@ static void i915_guc_irq_handler(unsigned long data)
>  {
>  	struct intel_engine_cs * const engine = (struct intel_engine_cs *)data;
>  	struct intel_engine_execlist * const el = &engine->execlist;
> -	struct execlist_port *port = execlist_port_head(el);
> -	const struct execlist_port * const last_port = execlist_port_tail(el);
> -	struct drm_i915_gem_request *rq;
>  
> -	rq = port_request(port);
> -	while (rq && i915_gem_request_completed(rq)) {
> +	while (execlist_active_ports(el)) {
> +		struct execlist_port *port = execlist_port_head(el);
> +		struct drm_i915_gem_request *rq = port_request(port);
> +
> +		if (!i915_gem_request_completed(rq))
> +			break;
> +
>  		trace_i915_gem_request_out(rq);
>  		i915_gem_request_put(rq);
>  
> -		port = execlist_port_complete(el, port);
> -
> -		rq = port_request(port);
> +		execlist_release_port(el, port);
>  	}
>  
> -	if (!port_isset(last_port))
> +	if (execlist_inactive_ports(el))
>  		i915_guc_dequeue(engine);
>  }
>  
> diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
> index ac5a95439393..a9d888b726c4 100644
> --- a/drivers/gpu/drm/i915/i915_irq.c
> +++ b/drivers/gpu/drm/i915/i915_irq.c
> @@ -1342,7 +1342,7 @@ gen8_cs_irq_handler(struct intel_engine_cs *engine, u32 iir, int test_shift)
>  	bool tasklet = false;
>  
>  	if (iir & (GT_CONTEXT_SWITCH_INTERRUPT << test_shift)) {
> -		if (port_count(execlist_port_head(el))) {
> +		if (READ_ONCE(el->port_count)) {
>  			__set_bit(ENGINE_IRQ_EXECLIST, &engine->irq_posted);
>  			tasklet = true;
>  		}
> diff --git a/drivers/gpu/drm/i915/intel_engine_cs.c b/drivers/gpu/drm/i915/intel_engine_cs.c
> index b0d702063a50..29b170fdd6ef 100644
> --- a/drivers/gpu/drm/i915/intel_engine_cs.c
> +++ b/drivers/gpu/drm/i915/intel_engine_cs.c
> @@ -407,6 +407,9 @@ static void intel_engine_init_execlist(struct intel_engine_cs *engine)
>  	BUILD_BUG_ON_NOT_POWER_OF_2(execlist_num_ports(&engine->execlist));
>  	GEM_BUG_ON(execlist_num_ports(&engine->execlist) > EXECLIST_MAX_PORTS);
>  
> +	el->port_head = 0;
> +	el->port_count = 0;
> +
>  	el->queue = RB_ROOT;
>  	el->first = NULL;
>  }
> @@ -1501,8 +1504,8 @@ bool intel_engine_is_idle(struct intel_engine_cs *engine)
>  	if (test_bit(ENGINE_IRQ_EXECLIST, &engine->irq_posted))
>  		return false;
>  
> -	/* Both ports drained, no more ELSP submission? */
> -	if (port_request(execlist_port_head(&engine->execlist)))
> +	/* All ports drained, no more ELSP submission? */
> +	if (execlist_active_ports(&engine->execlist))
>  		return false;
>  
>  	/* ELSP is empty, but there are ready requests? */
> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
> index 8550cd6635c9..bea10620bed2 100644
> --- a/drivers/gpu/drm/i915/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/intel_lrc.c
> @@ -399,26 +399,29 @@ static void execlists_submit_ports(struct intel_engine_cs *engine)
>  		engine->i915->regs + i915_mmio_reg_offset(RING_ELSP(engine));
>  	unsigned int n;
>  
> -	for (n = execlist_num_ports(el); n--; ) {
> -		struct execlist_port *port;
> +	for (n = 0; n < execlist_inactive_ports(el); n++) {
> +		writel(0, elsp);
> +		writel(0, elsp);
> +	}
> +
> +	for (n = execlist_active_ports(el); n--; ) {
>  		struct drm_i915_gem_request *rq;
> +		struct execlist_port *port;
>  		unsigned int count;
>  		u64 desc;
>  
>  		port = execlist_port_index(el, n);
> -
>  		rq = port_unpack(port, &count);
> -		if (rq) {
> -			GEM_BUG_ON(count > !n);
> -			if (!count++)
> -				execlists_context_status_change(rq, INTEL_CONTEXT_SCHEDULE_IN);
> -			port_set(port, port_pack(rq, count));
> -			desc = execlists_update_context(rq);
> -			GEM_DEBUG_EXEC(port->context_id = upper_32_bits(desc));
> -		} else {
> -			GEM_BUG_ON(!n);
> -			desc = 0;
> -		}
> +
> +		GEM_BUG_ON(!rq);
> +		GEM_BUG_ON(count > !n);
> +
> +		if (!count++)
> +			execlists_context_status_change(rq, INTEL_CONTEXT_SCHEDULE_IN);
> +
> +		port_set(port, port_pack(rq, count));
> +		desc = execlists_update_context(rq);
> +		GEM_DEBUG_EXEC(port->context_id = upper_32_bits(desc));
>  
>  		writel(upper_32_bits(desc), elsp);
>  		writel(lower_32_bits(desc), elsp);
> @@ -456,15 +459,23 @@ static void port_assign(struct execlist_port *port,
>  
>  static void execlists_dequeue(struct intel_engine_cs *engine)
>  {
> -	struct drm_i915_gem_request *last;
>  	struct intel_engine_execlist * const el = &engine->execlist;
> -	struct execlist_port *port = execlist_port_head(el);
> -	const struct execlist_port * const last_port = execlist_port_tail(el);
> +	struct execlist_port *port;
> +	struct drm_i915_gem_request *last;
>  	struct rb_node *rb;
>  	bool submit = false;
>  
> -	last = port_request(port);
> -	if (last)
> +	spin_lock_irq(&engine->timeline->lock);
> +	rb = el->first;
> +	GEM_BUG_ON(rb_first(&el->queue) != rb);
> +
> +	if (unlikely(!rb))
> +		goto done;
> +
> +	if (execlist_active_ports(el)) {
> +		port = execlist_port_tail(el);
> +		last = port_request(port);
> +
>  		/* WaIdleLiteRestore:bdw,skl
>  		 * Apply the wa NOOPs to prevent ring:HEAD == req:TAIL
>  		 * as we resubmit the request. See gen8_emit_breadcrumb()
> @@ -472,6 +483,11 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
>  		 * request.
>  		 */
>  		last->tail = last->wa_tail;
> +	} else {
> +		/* Allocate first port to coalesce into */
> +		port = execlist_request_port(el);
> +		last = NULL;
> +	}
>  
>  	/* Hardware submission is through 2 ports. Conceptually each port
>  	 * has a (RING_START, RING_HEAD, RING_TAIL) tuple. RING_START is
> @@ -493,11 +509,7 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
>  	 * sequence of requests as being the most optimal (fewest wake ups
>  	 * and context switches) submission.
>  	 */
> -
> -	spin_lock_irq(&engine->timeline->lock);
> -	rb = el->first;
> -	GEM_BUG_ON(rb_first(&el->queue) != rb);
> -	while (rb) {
> +	do {
>  		struct i915_priolist *p = rb_entry(rb, typeof(*p), node);
>  		struct drm_i915_gem_request *rq, *rn;
>  
> @@ -515,11 +527,11 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
>  			 */
>  			if (last && !can_merge_ctx(rq->ctx, last->ctx)) {
>  				/*
> -				 * If we are on the second port and cannot
> +				 * If we are on the last port and cannot
>  				 * combine this request with the last, then we
>  				 * are done.
>  				 */
> -				if (port == last_port) {
> +				if (!execlist_inactive_ports(el)) {
>  					__list_del_many(&p->requests,
>  							&rq->priotree.link);
>  					goto done;
> @@ -544,8 +556,7 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
>  				if (submit)
>  					port_assign(port, last);
>  
> -				port = execlist_port_next(el, port);
> -
> +				port = execlist_request_port(el);
>  				GEM_BUG_ON(port_isset(port));
>  			}
>  
> @@ -563,7 +574,8 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
>  		INIT_LIST_HEAD(&p->requests);
>  		if (p->priority != I915_PRIORITY_NORMAL)
>  			kmem_cache_free(engine->i915->priorities, p);
> -	}
> +	} while (rb);
> +
>  done:
>  	el->first = rb;
>  	if (submit)
> @@ -582,6 +594,9 @@ static void execlist_cancel_port_requests(struct intel_engine_execlist *el)
>  		i915_gem_request_put(port_request(&el->port[i]));
>  
>  	memset(el->port, 0, sizeof(el->port));
> +
> +	el->port_count = 0;
> +	el->port_head = 0;
>  }
>  
>  static void execlists_cancel_requests(struct intel_engine_cs *engine)
> @@ -643,10 +658,12 @@ static void execlists_cancel_requests(struct intel_engine_cs *engine)
>  
>  static bool execlists_elsp_ready(struct intel_engine_execlist * const el)
>  {
> -	struct execlist_port * const port0 = execlist_port_head(el);
> -	struct execlist_port * const port1 = execlist_port_next(el, port0);
> +	const unsigned int active = execlist_active_ports(el);
> +
> +	if (!active)
> +		return true;
>  
> -	return port_count(port0) + port_count(port1) < 2;
> +	return port_count(execlist_port_tail(el)) + active < 2;
>  }
>  
>  /*
> @@ -657,7 +674,6 @@ static void intel_lrc_irq_handler(unsigned long data)
>  {
>  	struct intel_engine_cs * const engine = (struct intel_engine_cs *)data;
>  	struct intel_engine_execlist * const el = &engine->execlist;
> -	struct execlist_port *port = execlist_port_head(el);
>  	struct drm_i915_private *dev_priv = engine->i915;
>  
>  	/* We can skip acquiring intel_runtime_pm_get() here as it was taken
> @@ -714,6 +730,7 @@ static void intel_lrc_irq_handler(unsigned long data)
>  		}
>  
>  		while (head != tail) {
> +			struct execlist_port *port;
>  			struct drm_i915_gem_request *rq;
>  			unsigned int status;
>  			unsigned int count;
> @@ -742,6 +759,7 @@ static void intel_lrc_irq_handler(unsigned long data)
>  			if (!(status & GEN8_CTX_STATUS_COMPLETED_MASK))
>  				continue;
>  
> +			port = execlist_port_head(el);
>  			/* Check the context/desc id for this event matches */
>  			GEM_DEBUG_BUG_ON(buf[2 * head + 1] != port->context_id);
>  
> @@ -755,13 +773,13 @@ static void intel_lrc_irq_handler(unsigned long data)
>  				trace_i915_gem_request_out(rq);
>  				i915_gem_request_put(rq);
>  
> -				port = execlist_port_complete(el, port);
> +				execlist_release_port(el, port);
>  			} else {
>  				port_set(port, port_pack(rq, count));
>  			}
>  
>  			/* After the final element, the hw should be idle */
> -			GEM_BUG_ON(port_count(port) == 0 &&
> +			GEM_BUG_ON(execlist_active_ports(el) == 0 &&
>  				   !(status & GEN8_CTX_STATUS_ACTIVE_IDLE));
>  		}
>  
> @@ -786,7 +804,7 @@ static void insert_request(struct intel_engine_cs *engine,
>  	struct i915_priolist *p = lookup_priolist(engine, pt, prio);
>  
>  	list_add_tail(&pt->link, &ptr_mask_bits(p, 1)->requests);
> -	if (ptr_unmask_bits(p, 1) && execlists_elsp_ready(el))
> +	if (ptr_unmask_bits(p, 1))
>  		tasklet_hi_schedule(&el->irq_tasklet);
>  }
>  
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
> index 991f6c0bd6c2..efa5a8ea1ecb 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.h
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
> @@ -249,6 +249,11 @@ struct intel_engine_execlist {
>  	unsigned int port_head;
>  
>  	/**
> +	 * @port_count: reserved ports
> +	 */
> +	unsigned int port_count;
> +
> +	/**
>  	 * @queue: queue of requests, in priority lists
>  	 */
>  	struct rb_root queue;
> @@ -529,14 +534,20 @@ execlist_num_ports(const struct intel_engine_execlist * const el)
>  	return el->port_mask + 1;
>  }
>  
> -#define __port_idx(start, index, mask) (((start) + (index)) & (mask))
> +static inline unsigned int
> +execlist_active_ports(const struct intel_engine_execlist * const el)
> +{
> +	return el->port_count;
> +}
>  
> -static inline struct execlist_port *
> -execlist_port_head(struct intel_engine_execlist * const el)
> +static inline unsigned int
> +execlist_inactive_ports(const struct intel_engine_execlist * const el)
>  {
> -	return &el->port[el->port_head];
> +	return execlist_num_ports(el) - execlist_active_ports(el);
>  }
>  
> +#define __port_idx(start, index, mask) (((start) + (index)) & (mask))
> +
>  /* Index starting from port_head */
>  static inline struct execlist_port *
>  execlist_port_index(struct intel_engine_execlist * const el,
> @@ -546,30 +557,46 @@ execlist_port_index(struct intel_engine_execlist * const el,
>  }
>  
>  static inline struct execlist_port *
> -execlist_port_tail(struct intel_engine_execlist * const el)
> +execlist_port_head(struct intel_engine_execlist * const el)
>  {
> -	return &el->port[__port_idx(el->port_head, -1, el->port_mask)];
> +	GEM_BUG_ON(!el->port_count);
> +
> +	return execlist_port_index(el, 0);
>  }
>  
>  static inline struct execlist_port *
> -execlist_port_next(struct intel_engine_execlist * const el,
> -		   const struct execlist_port * const port)
> +execlist_port_tail(struct intel_engine_execlist * const el)
>  {
> -	const unsigned int i = port_index(port, el);
> +	GEM_BUG_ON(!el->port_count);
>  
> -	return &el->port[__port_idx(i, 1, el->port_mask)];
> +	return execlist_port_index(el, el->port_count - 1);
>  }
>  
>  static inline struct execlist_port *
> -execlist_port_complete(struct intel_engine_execlist * const el,
> -		       struct execlist_port * const port)
> +execlist_request_port(struct intel_engine_execlist * const el)
>  {
> +	GEM_BUG_ON(el->port_count == el->port_mask + 1);
> +
> +	el->port_count++;
> +
> +	GEM_BUG_ON(port_isset(execlist_port_tail(el)));
> +
> +	return execlist_port_tail(el);
> +}
> +
> +static inline void
> +execlist_release_port(struct intel_engine_execlist * const el,
> +		      struct execlist_port * const port)
> +{
> +
>  	GEM_BUG_ON(port_index(port, el) != el->port_head);
> +	GEM_BUG_ON(!port_isset(port));
> +	GEM_BUG_ON(!el->port_count);
>  
>  	memset(port, 0, sizeof(struct execlist_port));
> -	el->port_head = __port_idx(el->port_head, 1, el->port_mask);
>  
> -	return execlist_port_head(el);
> +	el->port_head = __port_idx(el->port_head, 1, el->port_mask);
> +	el->port_count--;
>  }
>  
>  static inline unsigned int
> -- 
> 2.11.0

^ permalink raw reply	[flat|nested] 26+ messages in thread
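The patch above turns the port array into a small ring buffer: ports are addressed relative to port_head through a power-of-two mask, port_count grows in execlist_request_port() and shrinks in execlist_release_port(). The bookkeeping can be sketched in isolation; the struct and helpers below are a simplified, hypothetical model for illustration, not the driver code itself:

```c
#include <assert.h>
#include <string.h>

#define NUM_PORTS 4 /* must be a power of two, as the BUILD_BUG_ON enforces */

struct port {
	int request; /* stands in for the packed request pointer; 0 == unset */
};

struct execlist {
	struct port port[NUM_PORTS];
	unsigned int port_mask;  /* NUM_PORTS - 1 */
	unsigned int port_head;  /* index of the oldest reserved port */
	unsigned int port_count; /* number of reserved ports */
};

/* Same indexing scheme as the patch: offsets from head wrap via the mask. */
#define __port_idx(start, index, mask) (((start) + (index)) & (mask))

static struct port *port_index(struct execlist *el, unsigned int n)
{
	return &el->port[__port_idx(el->port_head, n, el->port_mask)];
}

/* Reserve the next free port (the new tail), like execlist_request_port(). */
static struct port *request_port(struct execlist *el)
{
	assert(el->port_count <= el->port_mask); /* at least one port free */
	el->port_count++;
	return port_index(el, el->port_count - 1);
}

/* Retire the oldest port (the head), like execlist_release_port(). */
static void release_port(struct execlist *el)
{
	assert(el->port_count);
	memset(port_index(el, 0), 0, sizeof(struct port));
	el->port_head = __port_idx(el->port_head, 1, el->port_mask);
	el->port_count--;
}
```

With this shape, execlist_active_ports() is just port_count and execlist_inactive_ports() is NUM_PORTS - port_count, which is why the irq handlers above can loop "while there are active ports, inspect the head and either release it or stop".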

* Re: [PATCH 3/8] drm/i915: Wrap port cancellation into a function
  2017-09-20 14:37 ` [PATCH 3/8] drm/i915: Wrap port cancellation into a function Mika Kuoppala
@ 2017-09-21 12:18   ` Michał Winiarski
  2017-09-21 14:02     ` Mika Kuoppala
  2017-09-21 12:20   ` Chris Wilson
  1 sibling, 1 reply; 26+ messages in thread
From: Michał Winiarski @ 2017-09-21 12:18 UTC (permalink / raw)
  To: Mika Kuoppala; +Cc: intel-gfx

On Wed, Sep 20, 2017 at 05:37:00PM +0300, Mika Kuoppala wrote:
> On the reset and wedged paths, we want to release the requests
> that are tied to ports and then mark the ports to be unset.
> Introduce a function for this.
> 
> v2: rebase
> 
> Cc: Chris Wilson <chris@chris-wilson.co.uk>
> Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
> ---
>  drivers/gpu/drm/i915/intel_lrc.c | 21 ++++++++++++---------
>  1 file changed, 12 insertions(+), 9 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
> index a4ece4c4f291..ffb9c900328b 100644
> --- a/drivers/gpu/drm/i915/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/intel_lrc.c
> @@ -568,6 +568,16 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
>  		execlists_submit_ports(engine);
>  }
>  
> +static void execlist_cancel_port_requests(struct intel_engine_execlist *el)
> +{
> +	unsigned int i;
> +
> +	for (i = 0; i < ARRAY_SIZE(el->port); i++)
> +		i915_gem_request_put(port_request(&el->port[i]));
> +
> +	memset(el->port, 0, sizeof(el->port));
> +}
> +
>  static void execlists_cancel_requests(struct intel_engine_cs *engine)
>  {
>  	struct intel_engine_execlist * const el = &engine->execlist;
> @@ -575,14 +585,11 @@ static void execlists_cancel_requests(struct intel_engine_cs *engine)
>  	struct drm_i915_gem_request *rq, *rn;
>  	struct rb_node *rb;
>  	unsigned long flags;
> -	unsigned long n;
>  
>  	spin_lock_irqsave(&engine->timeline->lock, flags);
>  
>  	/* Cancel the requests on the HW and clear the ELSP tracker. */
> -	for (n = 0; n < ARRAY_SIZE(el->port); n++)
> -		i915_gem_request_put(port_request(&port[n]));

We could also drop the local variable for port.
It's only used in GEM_BUG_ON(port_isset(&port[0])).
Do we even need this assert when we're starting to treat ports in a more
ring-like fashion?

Reviewed-by: Michał Winiarski <michal.winiarski@intel.com>

-Michał

> -	memset(el->port, 0, sizeof(el->port));
> +	execlist_cancel_port_requests(el);
>  
>  	/* Mark all executing requests as skipped. */
>  	list_for_each_entry(rq, &engine->timeline->requests, link) {
> @@ -1372,11 +1379,9 @@ static void reset_common_ring(struct intel_engine_cs *engine,
>  			      struct drm_i915_gem_request *request)
>  {
>  	struct intel_engine_execlist * const el = &engine->execlist;
> -	struct execlist_port *port = el->port;
>  	struct drm_i915_gem_request *rq, *rn;
>  	struct intel_context *ce;
>  	unsigned long flags;
> -	unsigned int n;
>  
>  	spin_lock_irqsave(&engine->timeline->lock, flags);
>  
> @@ -1389,9 +1394,7 @@ static void reset_common_ring(struct intel_engine_cs *engine,
>  	 * guessing the missed context-switch events by looking at what
>  	 * requests were completed.
>  	 */
> -	for (n = 0; n < ARRAY_SIZE(el->port); n++)
> -		i915_gem_request_put(port_request(&port[n]));
> -	memset(el->port, 0, sizeof(el->port));
> +	execlist_cancel_port_requests(el);
>  
>  	/* Push back any incomplete requests for replay after the reset. */
>  	list_for_each_entry_safe_reverse(rq, rn,
> -- 
> 2.11.0
> 

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 1/8] drm/i915: Make own struct for execlist items
  2017-09-20 14:36 ` [PATCH 1/8] drm/i915: Make own struct for execlist items Mika Kuoppala
  2017-09-21 11:55   ` Michał Winiarski
@ 2017-09-21 12:19   ` Chris Wilson
  2017-09-21 15:13   ` Joonas Lahtinen
  2 siblings, 0 replies; 26+ messages in thread
From: Chris Wilson @ 2017-09-21 12:19 UTC (permalink / raw)
  To: Mika Kuoppala, intel-gfx

Quoting Mika Kuoppala (2017-09-20 15:36:58)
> Engine's execlist related items have been increasing to
> a point where a separate struct is warranted. Carve
> execlist-specific items into a dedicated struct to add clarity.
> 
> v2: add kerneldoc and fix whitespace (Joonas, Chris)
> v3: csb_mmio changes, rebase
> 
> Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
> Acked-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
-Chris

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 2/8] drm/i915: Move execlist initialization into intel_engine_cs.c
  2017-09-20 14:36 ` [PATCH 2/8] drm/i915: Move execlist initialization into intel_engine_cs.c Mika Kuoppala
@ 2017-09-21 12:20   ` Chris Wilson
  0 siblings, 0 replies; 26+ messages in thread
From: Chris Wilson @ 2017-09-21 12:20 UTC (permalink / raw)
  To: Mika Kuoppala, intel-gfx

Quoting Mika Kuoppala (2017-09-20 15:36:59)
> Move execlist init into a common engine setup, as it is
> common to both GuC and HW execlists.
> 
> v2: rebase with csb changes
> 
> Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
-Chris

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 3/8] drm/i915: Wrap port cancellation into a function
  2017-09-20 14:37 ` [PATCH 3/8] drm/i915: Wrap port cancellation into a function Mika Kuoppala
  2017-09-21 12:18   ` Michał Winiarski
@ 2017-09-21 12:20   ` Chris Wilson
  1 sibling, 0 replies; 26+ messages in thread
From: Chris Wilson @ 2017-09-21 12:20 UTC (permalink / raw)
  To: Mika Kuoppala, intel-gfx

Quoting Mika Kuoppala (2017-09-20 15:37:00)
> On the reset and wedged paths, we want to release the requests
> that are tied to ports and then mark the ports to be unset.
> Introduce a function for this.
> 
> v2: rebase
> 
> Cc: Chris Wilson <chris@chris-wilson.co.uk>
> Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
-Chris

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 4/8] drm/i915: Add execlist_port_complete
  2017-09-20 14:37 ` [PATCH 4/8] drm/i915: Add execlist_port_complete Mika Kuoppala
@ 2017-09-21 12:21   ` Chris Wilson
  0 siblings, 0 replies; 26+ messages in thread
From: Chris Wilson @ 2017-09-21 12:21 UTC (permalink / raw)
  To: Mika Kuoppala, intel-gfx

Quoting Mika Kuoppala (2017-09-20 15:37:01)
> When the first execlist entry is processed, we move the port (contents).
> Introduce a function for this, as both execlists and the GuC use this
> common operation.
> 
> v2: rebase. s/GEM_DEBUG_BUG/GEM_BUG (Chris)
> 
> Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
-Chris

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 5/8] drm/i915: Make execlist port count variable
  2017-09-20 14:37 ` [PATCH 5/8] drm/i915: Make execlist port count variable Mika Kuoppala
@ 2017-09-21 12:21   ` Chris Wilson
  0 siblings, 0 replies; 26+ messages in thread
From: Chris Wilson @ 2017-09-21 12:21 UTC (permalink / raw)
  To: Mika Kuoppala, intel-gfx

Quoting Mika Kuoppala (2017-09-20 15:37:02)
> As we emulate execlists on top of the GuC workqueue, it is not
> restricted to just 2 ports and we can increase that number arbitrarily
> to trade off queue depth (i.e. scheduling latency) against pipeline
> bubbles.
> 
> v2: rebase. better commit msg (Chris)
> 
> Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
-Chris

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 6/8] drm/i915: Introduce execlist_port_* accessors
  2017-09-20 14:37 ` [PATCH 6/8] drm/i915: Introduce execlist_port_* accessors Mika Kuoppala
@ 2017-09-21 12:26   ` Chris Wilson
  2017-09-21 14:45     ` Mika Kuoppala
  0 siblings, 1 reply; 26+ messages in thread
From: Chris Wilson @ 2017-09-21 12:26 UTC (permalink / raw)
  To: Mika Kuoppala, intel-gfx

Quoting Mika Kuoppala (2017-09-20 15:37:03)
> Instead of trusting that first available port is at index 0,
> use an accessor to hide this. This is preparation for
> following patches where the head can be at an arbitrary
> location in the port array.
> 
> v2: improved commit message, elsp_ready readability (Chris)
> 
> Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
> ---
>  drivers/gpu/drm/i915/i915_debugfs.c        | 16 +++++++----
>  drivers/gpu/drm/i915/i915_gpu_error.c      |  4 +--
>  drivers/gpu/drm/i915/i915_guc_submission.c | 17 ++++++-----
>  drivers/gpu/drm/i915/i915_irq.c            |  2 +-
>  drivers/gpu/drm/i915/intel_engine_cs.c     |  2 +-
>  drivers/gpu/drm/i915/intel_lrc.c           | 42 +++++++++++++++------------
>  drivers/gpu/drm/i915/intel_ringbuffer.h    | 46 ++++++++++++++++++++++++++----
>  7 files changed, 87 insertions(+), 42 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
> index dbeb6f08ab79..af8cc2eab1b1 100644
> --- a/drivers/gpu/drm/i915/i915_debugfs.c
> +++ b/drivers/gpu/drm/i915/i915_debugfs.c
> @@ -3348,16 +3348,20 @@ static int i915_engine_info(struct seq_file *m, void *unused)
>  
>                         rcu_read_lock();
>                         for (idx = 0; idx < execlist_num_ports(el); idx++) {
> -                               unsigned int count;
> +                               const struct execlist_port *port;
> +                               unsigned int count, n;
>  
> -                               rq = port_unpack(&el->port[idx], &count);
> +                               port = execlist_port_index(el, idx);
> +                               n = port_index(port, el);

Bah, execlist_port_index() implies to me that it should return the
index, not the port. I would just call it execlist_port(). How does that
look?

> -static inline void
> +#define __port_idx(start, index, mask) (((start) + (index)) & (mask))
> +
> +static inline struct execlist_port *
> +execlist_port_head(struct intel_engine_execlist * const el)
> +{
> +       return &el->port[el->port_head];
> +}
> +
> +/* Index starting from port_head */
> +static inline struct execlist_port *
> +execlist_port_index(struct intel_engine_execlist * const el,
> +                   const unsigned int n)
> +{
> +       return &el->port[__port_idx(el->port_head, n, el->port_mask)];
> +}
> +
> +static inline struct execlist_port *
> +execlist_port_tail(struct intel_engine_execlist * const el)
> +{
> +       return &el->port[__port_idx(el->port_head, -1, el->port_mask)];
> +}

Hmm, I was expecting

execlist_port_head() { return execlist_port(el, 0); }
execlist_port_tail() { return execlist_port(el, -1); }

What's the impact on object size? (As a quick guide to how much the
compiler can keep the code in check.)
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


* Re: [PATCH 7/8] drm/i915: Keep track of reserved execlist ports
  2017-09-20 14:37 ` [PATCH 7/8] drm/i915: Keep track of reserved execlist ports Mika Kuoppala
  2017-09-21 12:08   ` Mika Kuoppala
@ 2017-09-21 12:30   ` Chris Wilson
  1 sibling, 0 replies; 26+ messages in thread
From: Chris Wilson @ 2017-09-21 12:30 UTC (permalink / raw)
  To: Mika Kuoppala, intel-gfx; +Cc: Mika

Quoting Mika Kuoppala (2017-09-20 15:37:04)
> To further enhance port processing, keep track of
> reserved ports. This way we can iterate over only the used subset
> of the port space. Note that we lift from execlists_submit_request()
> the responsibility of inspecting hw availability, and
> always do dequeuing. This is to ensure that only the irq
> handler is responsible for keeping track of available ports.
> 
> v2: rebase, comment fix, READ_ONCE only outside of irq handler (Chris)
> 
> Cc: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Michał Winiarski <michal.winiarski@intel.com>
> Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>

Ok, doesn't look hideous. I need to look at it with a clear head, but for
now, could you check scripts/bloat-o-meter for my usual quick guide on
how much gcc likes it?
-Chris

* Re: [PATCH 8/8] drm/i915: Improve GuC request coalescing
  2017-09-20 14:37 ` [PATCH 8/8] drm/i915: Improve GuC request coalescing Mika Kuoppala
@ 2017-09-21 12:34   ` Chris Wilson
  2017-09-21 12:53   ` Michał Winiarski
  1 sibling, 0 replies; 26+ messages in thread
From: Chris Wilson @ 2017-09-21 12:34 UTC (permalink / raw)
  To: Mika Kuoppala, intel-gfx; +Cc: Mika

Quoting Mika Kuoppala (2017-09-20 15:37:05)
> -static void i915_guc_submit(struct intel_engine_cs *engine)
> +static void i915_guc_submit(struct intel_engine_cs *engine,
> +                           const unsigned int first)
>  {
>         struct drm_i915_private *dev_priv = engine->i915;
>         struct intel_guc *guc = &dev_priv->guc;
> @@ -498,7 +500,7 @@ static void i915_guc_submit(struct intel_engine_cs *engine)
>         const unsigned int engine_id = engine->id;
>         unsigned int n;
>  
> -       for (n = 0; n < execlist_active_ports(el); n++) {
> +       for (n = first; n < execlist_active_ports(el); n++) {
>                 struct execlist_port *port;
>                 struct drm_i915_gem_request *rq;
>                 unsigned int count;
> @@ -506,21 +508,22 @@ static void i915_guc_submit(struct intel_engine_cs *engine)
>                 port = execlist_port_index(el, n);
>  
>                 rq = port_unpack(port, &count);
> -               if (rq && count == 0) {
> -                       port_set(port, port_pack(rq, ++count));
> +               GEM_BUG_ON(!rq);
> +               GEM_BUG_ON(count);
>  
> -                       if (i915_vma_is_map_and_fenceable(rq->ring->vma))
> -                               POSTING_READ_FW(GUC_STATUS);
> +               port_set(port, port_pack(rq, ++count));

Ok, with this method we don't need count anymore. Seems sensible.
-Chris

* Re: [PATCH 8/8] drm/i915: Improve GuC request coalescing
  2017-09-20 14:37 ` [PATCH 8/8] drm/i915: Improve GuC request coalescing Mika Kuoppala
  2017-09-21 12:34   ` Chris Wilson
@ 2017-09-21 12:53   ` Michał Winiarski
  1 sibling, 0 replies; 26+ messages in thread
From: Michał Winiarski @ 2017-09-21 12:53 UTC (permalink / raw)
  To: Mika Kuoppala; +Cc: intel-gfx

On Wed, Sep 20, 2017 at 05:37:05PM +0300, Mika Kuoppala wrote:
> Now that we can keep track of what ports we have
> dequeued, coalesce only those ports instead of iterating
> through all ports.

s/coalesce/submit.

By coalescing I meant that we no longer have a 1:1 relationship between a
request and a GuC workitem. But we're doing that in guc_dequeue by keeping the
request-to-be-turned-into-a-workitem in the port.

> 
> Cc: Michał Winiarski <michal.winiarski@intel.com>
> Cc: Chris Wilson <chris@chris-wilson.co.uk>
> Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
> ---
>  drivers/gpu/drm/i915/i915_guc_submission.c | 31 +++++++++++++++++-------------
>  drivers/gpu/drm/i915/intel_ringbuffer.h    |  9 +++++++++
>  2 files changed, 27 insertions(+), 13 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
> index 359f57a59cba..1057a0fb9f27 100644
> --- a/drivers/gpu/drm/i915/i915_guc_submission.c
> +++ b/drivers/gpu/drm/i915/i915_guc_submission.c
> @@ -485,11 +485,13 @@ static void guc_ring_doorbell(struct i915_guc_client *client)
>  /**
>   * i915_guc_submit() - Submit commands through GuC
>   * @engine: engine associated with the commands
> + * @first: index of first execlist port to start coalescing from

s/coalescing/submitting

Reviewed-by: Michał Winiarski <michal.winiarski@intel.com>

-Michał

>   *
>   * The only error here arises if the doorbell hardware isn't functioning
>   * as expected, which really shouldn't happen.
>   */
> -static void i915_guc_submit(struct intel_engine_cs *engine)
> +static void i915_guc_submit(struct intel_engine_cs *engine,
> +			    const unsigned int first)
>  {
>  	struct drm_i915_private *dev_priv = engine->i915;
>  	struct intel_guc *guc = &dev_priv->guc;
> @@ -498,7 +500,7 @@ static void i915_guc_submit(struct intel_engine_cs *engine)
>  	const unsigned int engine_id = engine->id;
>  	unsigned int n;
>  
> -	for (n = 0; n < execlist_active_ports(el); n++) {
> +	for (n = first; n < execlist_active_ports(el); n++) {
>  		struct execlist_port *port;
>  		struct drm_i915_gem_request *rq;
>  		unsigned int count;
> @@ -506,21 +508,22 @@ static void i915_guc_submit(struct intel_engine_cs *engine)
>  		port = execlist_port_index(el, n);
>  
>  		rq = port_unpack(port, &count);
> -		if (rq && count == 0) {
> -			port_set(port, port_pack(rq, ++count));
> +		GEM_BUG_ON(!rq);
> +		GEM_BUG_ON(count);
>  
> -			if (i915_vma_is_map_and_fenceable(rq->ring->vma))
> -				POSTING_READ_FW(GUC_STATUS);
> +		port_set(port, port_pack(rq, ++count));
>  
> -			spin_lock(&client->wq_lock);
> +		if (i915_vma_is_map_and_fenceable(rq->ring->vma))
> +			POSTING_READ_FW(GUC_STATUS);
>  
> -			guc_wq_item_append(client, rq);
> -			guc_ring_doorbell(client);
> +		spin_lock(&client->wq_lock);
>  
> -			client->submissions[engine_id] += 1;
> +		guc_wq_item_append(client, rq);
> +		guc_ring_doorbell(client);
>  
> -			spin_unlock(&client->wq_lock);
> -		}
> +		client->submissions[engine_id] += 1;
> +
> +		spin_unlock(&client->wq_lock);
>  	}
>  }
>  
> @@ -566,6 +569,7 @@ static void i915_guc_dequeue(struct intel_engine_cs *engine)
>  	struct drm_i915_gem_request *last = NULL;
>  	bool submit = false;
>  	struct rb_node *rb;
> +	unsigned int first_idx;
>  
>  	spin_lock_irq(&engine->timeline->lock);
>  	rb = el->first;
> @@ -575,6 +579,7 @@ static void i915_guc_dequeue(struct intel_engine_cs *engine)
>  		goto done;
>  
>  	port = execlist_request_port(el);
> +	first_idx = execlist_get_port_index(el, port);
>  
>  	do {
>  		struct i915_priolist *p = rb_entry(rb, typeof(*p), node);
> @@ -614,7 +619,7 @@ static void i915_guc_dequeue(struct intel_engine_cs *engine)
>  	el->first = rb;
>  	if (submit) {
>  		port_assign(port, last);
> -		i915_guc_submit(engine);
> +		i915_guc_submit(engine, first_idx);
>  	}
>  	spin_unlock_irq(&engine->timeline->lock);
>  }
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
> index efa5a8ea1ecb..f2eb32539300 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.h
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
> @@ -556,6 +556,15 @@ execlist_port_index(struct intel_engine_execlist * const el,
>  	return &el->port[__port_idx(el->port_head, n, el->port_mask)];
>  }
>  
> +static inline unsigned int
> +execlist_get_port_index(const struct intel_engine_execlist * const el,
> +			const struct execlist_port * const port)
> +{
> +	const unsigned int n = port_index(port, el);
> +
> +	return __port_idx(n, -el->port_head, el->port_mask);
> +}
> +
>  static inline struct execlist_port *
>  execlist_port_head(struct intel_engine_execlist * const el)
>  {
> -- 
> 2.11.0
> 

* Re: [PATCH 3/8] drm/i915: Wrap port cancellation into a function
  2017-09-21 12:18   ` Michał Winiarski
@ 2017-09-21 14:02     ` Mika Kuoppala
  0 siblings, 0 replies; 26+ messages in thread
From: Mika Kuoppala @ 2017-09-21 14:02 UTC (permalink / raw)
  To: Michał Winiarski; +Cc: intel-gfx

Michał Winiarski <michal.winiarski@intel.com> writes:

> On Wed, Sep 20, 2017 at 05:37:00PM +0300, Mika Kuoppala wrote:
>> On reset and wedged path, we want to release the requests
>> that are tied to ports and then mark the ports to be unset.
>> Introduce a function for this.
>> 
>> v2: rebase
>> 
>> Cc: Chris Wilson <chris@chris-wilson.co.uk>
>> Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
>> ---
>>  drivers/gpu/drm/i915/intel_lrc.c | 21 ++++++++++++---------
>>  1 file changed, 12 insertions(+), 9 deletions(-)
>> 
>> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
>> index a4ece4c4f291..ffb9c900328b 100644
>> --- a/drivers/gpu/drm/i915/intel_lrc.c
>> +++ b/drivers/gpu/drm/i915/intel_lrc.c
>> @@ -568,6 +568,16 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
>>  		execlists_submit_ports(engine);
>>  }
>>  
>> +static void execlist_cancel_port_requests(struct intel_engine_execlist *el)
>> +{
>> +	unsigned int i;
>> +
>> +	for (i = 0; i < ARRAY_SIZE(el->port); i++)
>> +		i915_gem_request_put(port_request(&el->port[i]));
>> +
>> +	memset(el->port, 0, sizeof(el->port));
>> +}
>> +
>>  static void execlists_cancel_requests(struct intel_engine_cs *engine)
>>  {
>>  	struct intel_engine_execlist * const el = &engine->execlist;
>> @@ -575,14 +585,11 @@ static void execlists_cancel_requests(struct intel_engine_cs *engine)
>>  	struct drm_i915_gem_request *rq, *rn;
>>  	struct rb_node *rb;
>>  	unsigned long flags;
>> -	unsigned long n;
>>  
>>  	spin_lock_irqsave(&engine->timeline->lock, flags);
>>  
>>  	/* Cancel the requests on the HW and clear the ELSP tracker. */
>> -	for (n = 0; n < ARRAY_SIZE(el->port); n++)
>> -		i915_gem_request_put(port_request(&port[n]));
>
> We could also drop the local variable for port.

Dropped.

> It's only used in GEM_BUG_ON(port_isset(&port[0])).
> Do we even need this assert when we're starting to treat ports in a more
> ring-like fashion?
>

The memset is still so close by in this version that it indeed
begs the question.

But it is there to ensure that we really did handle the port parts
properly.

-Mika

> Reviewed-by: Michał Winiarski <michal.winiarski@intel.com>
>
> -Michał
>
>> -	memset(el->port, 0, sizeof(el->port));
>> +	execlist_cancel_port_requests(el);
>>  
>>  	/* Mark all executing requests as skipped. */
>>  	list_for_each_entry(rq, &engine->timeline->requests, link) {
>> @@ -1372,11 +1379,9 @@ static void reset_common_ring(struct intel_engine_cs *engine,
>>  			      struct drm_i915_gem_request *request)
>>  {
>>  	struct intel_engine_execlist * const el = &engine->execlist;
>> -	struct execlist_port *port = el->port;
>>  	struct drm_i915_gem_request *rq, *rn;
>>  	struct intel_context *ce;
>>  	unsigned long flags;
>> -	unsigned int n;
>>  
>>  	spin_lock_irqsave(&engine->timeline->lock, flags);
>>  
>> @@ -1389,9 +1394,7 @@ static void reset_common_ring(struct intel_engine_cs *engine,
>>  	 * guessing the missed context-switch events by looking at what
>>  	 * requests were completed.
>>  	 */
>> -	for (n = 0; n < ARRAY_SIZE(el->port); n++)
>> -		i915_gem_request_put(port_request(&port[n]));
>> -	memset(el->port, 0, sizeof(el->port));
>> +	execlist_cancel_port_requests(el);
>>  
>>  	/* Push back any incomplete requests for replay after the reset. */
>>  	list_for_each_entry_safe_reverse(rq, rn,
>> -- 
>> 2.11.0
>> 

* Re: [PATCH 6/8] drm/i915: Introduce execlist_port_* accessors
  2017-09-21 12:26   ` Chris Wilson
@ 2017-09-21 14:45     ` Mika Kuoppala
  0 siblings, 0 replies; 26+ messages in thread
From: Mika Kuoppala @ 2017-09-21 14:45 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx

Chris Wilson <chris@chris-wilson.co.uk> writes:

> Quoting Mika Kuoppala (2017-09-20 15:37:03)
>> Instead of trusting that first available port is at index 0,
>> use accessor to hide this. This is a preparation for a
>> following patches where head can be at arbitrary location
>> in the port array.
>> 
>> v2: improved commit message, elsp_ready readability (Chris)
>> 
>> Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
>> ---
>>  drivers/gpu/drm/i915/i915_debugfs.c        | 16 +++++++----
>>  drivers/gpu/drm/i915/i915_gpu_error.c      |  4 +--
>>  drivers/gpu/drm/i915/i915_guc_submission.c | 17 ++++++-----
>>  drivers/gpu/drm/i915/i915_irq.c            |  2 +-
>>  drivers/gpu/drm/i915/intel_engine_cs.c     |  2 +-
>>  drivers/gpu/drm/i915/intel_lrc.c           | 42 +++++++++++++++------------
>>  drivers/gpu/drm/i915/intel_ringbuffer.h    | 46 ++++++++++++++++++++++++++----
>>  7 files changed, 87 insertions(+), 42 deletions(-)
>> 
>> diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
>> index dbeb6f08ab79..af8cc2eab1b1 100644
>> --- a/drivers/gpu/drm/i915/i915_debugfs.c
>> +++ b/drivers/gpu/drm/i915/i915_debugfs.c
>> @@ -3348,16 +3348,20 @@ static int i915_engine_info(struct seq_file *m, void *unused)
>>  
>>                         rcu_read_lock();
>>                         for (idx = 0; idx < execlist_num_ports(el); idx++) {
>> -                               unsigned int count;
>> +                               const struct execlist_port *port;
>> +                               unsigned int count, n;
>>  
>> -                               rq = port_unpack(&el->port[idx], &count);
>> +                               port = execlist_port_index(el, idx);
>> +                               n = port_index(port, el);
>
> Bah, execlist_port_index() implies to me that it should return the
> index, not the port. I would just call it execlist_port(). How does that
> look?

It looks much better.

>
>> -static inline void
>> +#define __port_idx(start, index, mask) (((start) + (index)) & (mask))
>> +
>> +static inline struct execlist_port *
>> +execlist_port_head(struct intel_engine_execlist * const el)
>> +{
>> +       return &el->port[el->port_head];
>> +}
>> +
>> +/* Index starting from port_head */
>> +static inline struct execlist_port *
>> +execlist_port_index(struct intel_engine_execlist * const el,
>> +                   const unsigned int n)
>> +{
>> +       return &el->port[__port_idx(el->port_head, n, el->port_mask)];
>> +}
>> +
>> +static inline struct execlist_port *
>> +execlist_port_tail(struct intel_engine_execlist * const el)
>> +{
>> +       return &el->port[__port_idx(el->port_head, -1, el->port_mask)];
>> +}
>
> Hmm, I was expecting
>
> execlist_port_head() { return execlist_port(el, 0); }
> execlist_port_tail() { return execlist_port(el, -1); }

Seems that I did these on the next patch, moved to here.

>
> What's the impact on object size? (As a quick guide to how much the
> compiler can keep the code in check.)

I can't say what would count as still being in check, but things
grow:

add/remove: 0/0 grow/shrink: 6/1 up/down: 277/-26 (251)
function                                     old     new   delta
intel_lrc_irq_handler                       1591    1700    +109
i915_guc_irq_handler                        1041    1110     +69
i915_engine_info                            1983    2031     +48
insert_request                               127     152     +25
intel_engine_is_idle                         304     317     +13
gen8_cs_irq_handler                          113     126     +13
capture                                     5454    5428     -26
Total: Before=1144612, After=1144863, chg +0.02%

-Mika

* Re: [PATCH 1/8] drm/i915: Make own struct for execlist items
  2017-09-20 14:36 ` [PATCH 1/8] drm/i915: Make own struct for execlist items Mika Kuoppala
  2017-09-21 11:55   ` Michał Winiarski
  2017-09-21 12:19   ` Chris Wilson
@ 2017-09-21 15:13   ` Joonas Lahtinen
  2 siblings, 0 replies; 26+ messages in thread
From: Joonas Lahtinen @ 2017-09-21 15:13 UTC (permalink / raw)
  To: Mika Kuoppala, intel-gfx

On Wed, 2017-09-20 at 17:36 +0300, Mika Kuoppala wrote:
> Engine's execlist related items have been increasing to
> a point where a separate struct is warranted. Carve execlist
> specific items to a dedicated struct to add clarity.
> 
> v2: add kerneldoc and fix whitespace (Joonas, Chris)
> v3: csb_mmio changes, rebase
> 
> Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
> Acked-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>

With s/\b(el|execlist)\b/execlists/ this is;

Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>

Please rebase and merge, it's going to cause rebases anyway, so now is
the best time after yesterday.

Regards, Joonas
-- 
Joonas Lahtinen
Open Source Technology Center
Intel Corporation

end of thread (last update: 2017-09-21 15:14 UTC)

Thread overview: 26+ messages
2017-09-20 14:36 [PATCH 0/8] Support for more than two execlist ports (v2) Mika Kuoppala
2017-09-20 14:36 ` [PATCH 1/8] drm/i915: Make own struct for execlist items Mika Kuoppala
2017-09-21 11:55   ` Michał Winiarski
2017-09-21 12:19   ` Chris Wilson
2017-09-21 15:13   ` Joonas Lahtinen
2017-09-20 14:36 ` [PATCH 2/8] drm/i915: Move execlist initialization into intel_engine_cs.c Mika Kuoppala
2017-09-21 12:20   ` Chris Wilson
2017-09-20 14:37 ` [PATCH 3/8] drm/i915: Wrap port cancellation into a function Mika Kuoppala
2017-09-21 12:18   ` Michał Winiarski
2017-09-21 14:02     ` Mika Kuoppala
2017-09-21 12:20   ` Chris Wilson
2017-09-20 14:37 ` [PATCH 4/8] drm/i915: Add execlist_port_complete Mika Kuoppala
2017-09-21 12:21   ` Chris Wilson
2017-09-20 14:37 ` [PATCH 5/8] drm/i915: Make execlist port count variable Mika Kuoppala
2017-09-21 12:21   ` Chris Wilson
2017-09-20 14:37 ` [PATCH 6/8] drm/i915: Introduce execlist_port_* accessors Mika Kuoppala
2017-09-21 12:26   ` Chris Wilson
2017-09-21 14:45     ` Mika Kuoppala
2017-09-20 14:37 ` [PATCH 7/8] drm/i915: Keep track of reserved execlist ports Mika Kuoppala
2017-09-21 12:08   ` Mika Kuoppala
2017-09-21 12:30   ` Chris Wilson
2017-09-20 14:37 ` [PATCH 8/8] drm/i915: Improve GuC request coalescing Mika Kuoppala
2017-09-21 12:34   ` Chris Wilson
2017-09-21 12:53   ` Michał Winiarski
2017-09-20 15:19 ` ✓ Fi.CI.BAT: success for Support for more than two execlist ports (rev2) Patchwork
2017-09-20 16:34 ` ✓ Fi.CI.IGT: " Patchwork
