* [PATCH 0/8] Support for more than two execlist ports
@ 2017-09-12  8:36 Mika Kuoppala
  2017-09-12  8:36 ` [PATCH 1/8] drm/i915: Make own struct for execlist items Mika Kuoppala
                   ` (9 more replies)
  0 siblings, 10 replies; 30+ messages in thread
From: Mika Kuoppala @ 2017-09-12  8:36 UTC (permalink / raw)
  To: intel-gfx

Hi,

Here is a patchset that makes the number of execlist ports configurable
at init time, as long as it is a power of two. The purpose is to support
more than two ports (contexts) for GuC submission.

I did a few runs of gem_exec_nop and gem_exec_ctx on the non-GuC
paths to verify that these don't regress the hw submission path. I
expected a neutral or slightly negative effect, but there is an
improvement.

The series needs to be rebased on top of Michał Winiarski's
coalesced requests patch, but to get comments on the native side of
the handling and on the general approach, here it is.

-Mika

Mika Kuoppala (8):
  drm/i915: Make own struct for execlist items
  drm/i915: Wrap port cancellation into a function
  drm/i915: Add execlist_port_complete
  drm/i915: Make execlist port count variable
  drm/i915: Introduce iterators for execlist ports
  drm/i915: Introduce execlist_port_* accessors
  drm/i915: Move execlist initialization into intel_engine_cs.c
  drm/i915: Keep track of reserved execlist ports

 drivers/gpu/drm/i915/i915_debugfs.c        |   9 +-
 drivers/gpu/drm/i915/i915_drv.h            |   3 +-
 drivers/gpu/drm/i915/i915_gem.c            |  17 ++-
 drivers/gpu/drm/i915/i915_gpu_error.c      |  18 ++-
 drivers/gpu/drm/i915/i915_guc_submission.c |  59 ++++++----
 drivers/gpu/drm/i915/i915_irq.c            |   5 +-
 drivers/gpu/drm/i915/intel_engine_cs.c     |  39 +++++--
 drivers/gpu/drm/i915/intel_lrc.c           | 159 ++++++++++++++------------
 drivers/gpu/drm/i915/intel_ringbuffer.h    | 173 +++++++++++++++++++++++++----
 9 files changed, 339 insertions(+), 143 deletions(-)

-- 
2.11.0


* [PATCH 1/8] drm/i915: Make own struct for execlist items
  2017-09-12  8:36 [PATCH 0/8] Support for more than two execlist ports Mika Kuoppala
@ 2017-09-12  8:36 ` Mika Kuoppala
  2017-09-12 10:14   ` Chris Wilson
  2017-09-12  8:36 ` [PATCH 2/8] drm/i915: Wrap port cancellation into a function Mika Kuoppala
                   ` (8 subsequent siblings)
  9 siblings, 1 reply; 30+ messages in thread
From: Mika Kuoppala @ 2017-09-12  8:36 UTC (permalink / raw)
  To: intel-gfx

The engine's execlist-related items have grown to the point where a
separate struct is warranted. Carve the execlist-specific items out
into a dedicated struct to add clarity.

v2: add kerneldoc and fix whitespace (Joonas, Chris)

Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Acked-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
---
 drivers/gpu/drm/i915/i915_debugfs.c        |  6 +-
 drivers/gpu/drm/i915/i915_gem.c            | 16 +++---
 drivers/gpu/drm/i915/i915_gpu_error.c      |  4 +-
 drivers/gpu/drm/i915/i915_guc_submission.c | 19 ++++---
 drivers/gpu/drm/i915/i915_irq.c            |  5 +-
 drivers/gpu/drm/i915/intel_engine_cs.c     | 12 ++--
 drivers/gpu/drm/i915/intel_lrc.c           | 64 +++++++++++-----------
 drivers/gpu/drm/i915/intel_ringbuffer.h    | 88 +++++++++++++++++++++++-------
 8 files changed, 134 insertions(+), 80 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index 6338018f655d..cc36409936dc 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -3344,10 +3344,10 @@ static int i915_engine_info(struct seq_file *m, void *unused)
 			}
 
 			rcu_read_lock();
-			for (idx = 0; idx < ARRAY_SIZE(engine->execlist_port); idx++) {
+			for (idx = 0; idx < ARRAY_SIZE(engine->execlist.port); idx++) {
 				unsigned int count;
 
-				rq = port_unpack(&engine->execlist_port[idx],
+				rq = port_unpack(&engine->execlist.port[idx],
 						 &count);
 				if (rq) {
 					seq_printf(m, "\t\tELSP[%d] count=%d, ",
@@ -3361,7 +3361,7 @@ static int i915_engine_info(struct seq_file *m, void *unused)
 			rcu_read_unlock();
 
 			spin_lock_irq(&engine->timeline->lock);
-			for (rb = engine->execlist_first; rb; rb = rb_next(rb)){
+			for (rb = engine->execlist.first; rb; rb = rb_next(rb)) {
 				struct i915_priolist *p =
 					rb_entry(rb, typeof(*p), node);
 
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index f445587c1a4b..7d0263e9c220 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2815,8 +2815,8 @@ i915_gem_reset_prepare_engine(struct intel_engine_cs *engine)
 	 * Turning off the engine->irq_tasklet until the reset is over
 	 * prevents the race.
 	 */
-	tasklet_kill(&engine->irq_tasklet);
-	tasklet_disable(&engine->irq_tasklet);
+	tasklet_kill(&engine->execlist.irq_tasklet);
+	tasklet_disable(&engine->execlist.irq_tasklet);
 
 	if (engine->irq_seqno_barrier)
 		engine->irq_seqno_barrier(engine);
@@ -2995,7 +2995,7 @@ void i915_gem_reset(struct drm_i915_private *dev_priv)
 
 void i915_gem_reset_finish_engine(struct intel_engine_cs *engine)
 {
-	tasklet_enable(&engine->irq_tasklet);
+	tasklet_enable(&engine->execlist.irq_tasklet);
 	kthread_unpark(engine->breadcrumbs.signaler);
 }
 
@@ -3047,17 +3047,17 @@ static void engine_set_wedged(struct intel_engine_cs *engine)
 	 */
 
 	if (i915.enable_execlists) {
-		struct execlist_port *port = engine->execlist_port;
+		struct execlist_port *port = engine->execlist.port;
 		unsigned long flags;
 		unsigned int n;
 
 		spin_lock_irqsave(&engine->timeline->lock, flags);
 
-		for (n = 0; n < ARRAY_SIZE(engine->execlist_port); n++)
+		for (n = 0; n < ARRAY_SIZE(engine->execlist.port); n++)
 			i915_gem_request_put(port_request(&port[n]));
-		memset(engine->execlist_port, 0, sizeof(engine->execlist_port));
-		engine->execlist_queue = RB_ROOT;
-		engine->execlist_first = NULL;
+		memset(engine->execlist.port, 0, sizeof(engine->execlist.port));
+		engine->execlist.queue = RB_ROOT;
+		engine->execlist.first = NULL;
 
 		spin_unlock_irqrestore(&engine->timeline->lock, flags);
 
diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
index ed5a1eb839ad..6114bf79219d 100644
--- a/drivers/gpu/drm/i915/i915_gpu_error.c
+++ b/drivers/gpu/drm/i915/i915_gpu_error.c
@@ -1327,10 +1327,10 @@ static void engine_record_requests(struct intel_engine_cs *engine,
 static void error_record_engine_execlists(struct intel_engine_cs *engine,
 					  struct drm_i915_error_engine *ee)
 {
-	const struct execlist_port *port = engine->execlist_port;
+	const struct execlist_port *port = engine->execlist.port;
 	unsigned int n;
 
-	for (n = 0; n < ARRAY_SIZE(engine->execlist_port); n++) {
+	for (n = 0; n < ARRAY_SIZE(engine->execlist.port); n++) {
 		struct drm_i915_gem_request *rq = port_request(&port[n]);
 
 		if (!rq)
diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
index 48a1e9349a2c..f95defe18885 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -661,21 +661,22 @@ static void port_assign(struct execlist_port *port,
 
 static bool i915_guc_dequeue(struct intel_engine_cs *engine)
 {
-	struct execlist_port *port = engine->execlist_port;
+	struct intel_engine_execlist * const el = &engine->execlist;
+	struct execlist_port *port = el->port;
 	struct drm_i915_gem_request *last = port_request(port);
 	struct rb_node *rb;
 	bool submit = false;
 
 	spin_lock_irq(&engine->timeline->lock);
-	rb = engine->execlist_first;
-	GEM_BUG_ON(rb_first(&engine->execlist_queue) != rb);
+	rb = el->first;
+	GEM_BUG_ON(rb_first(&el->queue) != rb);
 	while (rb) {
 		struct i915_priolist *p = rb_entry(rb, typeof(*p), node);
 		struct drm_i915_gem_request *rq, *rn;
 
 		list_for_each_entry_safe(rq, rn, &p->requests, priotree.link) {
 			if (last && rq->ctx != last->ctx) {
-				if (port != engine->execlist_port) {
+				if (port != el->port) {
 					__list_del_many(&p->requests,
 							&rq->priotree.link);
 					goto done;
@@ -696,13 +697,13 @@ static bool i915_guc_dequeue(struct intel_engine_cs *engine)
 		}
 
 		rb = rb_next(rb);
-		rb_erase(&p->node, &engine->execlist_queue);
+		rb_erase(&p->node, &el->queue);
 		INIT_LIST_HEAD(&p->requests);
 		if (p->priority != I915_PRIORITY_NORMAL)
 			kmem_cache_free(engine->i915->priorities, p);
 	}
 done:
-	engine->execlist_first = rb;
+	el->first = rb;
 	if (submit)
 		port_assign(port, last);
 	spin_unlock_irq(&engine->timeline->lock);
@@ -712,8 +713,8 @@ static bool i915_guc_dequeue(struct intel_engine_cs *engine)
 
 static void i915_guc_irq_handler(unsigned long data)
 {
-	struct intel_engine_cs *engine = (struct intel_engine_cs *)data;
-	struct execlist_port *port = engine->execlist_port;
+	struct intel_engine_cs * const engine = (struct intel_engine_cs *)data;
+	struct execlist_port *port = engine->execlist.port;
 	struct drm_i915_gem_request *rq;
 	bool submit;
 
@@ -1256,7 +1257,7 @@ int i915_guc_submission_enable(struct drm_i915_private *dev_priv)
 		 * take over the callback without changing any other state
 		 * in the tasklet.
 		 */
-		engine->irq_tasklet.func = i915_guc_irq_handler;
+		engine->execlist.irq_tasklet.func = i915_guc_irq_handler;
 		clear_bit(ENGINE_IRQ_EXECLIST, &engine->irq_posted);
 
 		/* Replay the current set of previously submitted requests */
diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index 5d391e689070..e16fbea03823 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -1308,10 +1308,11 @@ static void snb_gt_irq_handler(struct drm_i915_private *dev_priv,
 static void
 gen8_cs_irq_handler(struct intel_engine_cs *engine, u32 iir, int test_shift)
 {
+	struct intel_engine_execlist * const el = &engine->execlist;
 	bool tasklet = false;
 
 	if (iir & (GT_CONTEXT_SWITCH_INTERRUPT << test_shift)) {
-		if (port_count(&engine->execlist_port[0])) {
+		if (port_count(&el->port[0])) {
 			__set_bit(ENGINE_IRQ_EXECLIST, &engine->irq_posted);
 			tasklet = true;
 		}
@@ -1323,7 +1324,7 @@ gen8_cs_irq_handler(struct intel_engine_cs *engine, u32 iir, int test_shift)
 	}
 
 	if (tasklet)
-		tasklet_hi_schedule(&engine->irq_tasklet);
+		tasklet_hi_schedule(&el->irq_tasklet);
 }
 
 static irqreturn_t gen8_gt_irq_ack(struct drm_i915_private *dev_priv,
diff --git a/drivers/gpu/drm/i915/intel_engine_cs.c b/drivers/gpu/drm/i915/intel_engine_cs.c
index 3ae89a9d6241..6197df0de3fe 100644
--- a/drivers/gpu/drm/i915/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/intel_engine_cs.c
@@ -391,8 +391,8 @@ static void intel_engine_init_timeline(struct intel_engine_cs *engine)
  */
 void intel_engine_setup_common(struct intel_engine_cs *engine)
 {
-	engine->execlist_queue = RB_ROOT;
-	engine->execlist_first = NULL;
+	engine->execlist.queue = RB_ROOT;
+	engine->execlist.first = NULL;
 
 	intel_engine_init_timeline(engine);
 	intel_engine_init_hangcheck(engine);
@@ -1334,11 +1334,11 @@ bool intel_engine_is_idle(struct intel_engine_cs *engine)
 		return false;
 
 	/* Both ports drained, no more ELSP submission? */
-	if (port_request(&engine->execlist_port[0]))
+	if (port_request(&engine->execlist.port[0]))
 		return false;
 
 	/* ELSP is empty, but there are ready requests? */
-	if (READ_ONCE(engine->execlist_first))
+	if (READ_ONCE(engine->execlist.first))
 		return false;
 
 	/* Ring stopped? */
@@ -1387,8 +1387,8 @@ void intel_engines_mark_idle(struct drm_i915_private *i915)
 	for_each_engine(engine, i915, id) {
 		intel_engine_disarm_breadcrumbs(engine);
 		i915_gem_batch_pool_fini(&engine->batch_pool);
-		tasklet_kill(&engine->irq_tasklet);
-		engine->no_priolist = false;
+		tasklet_kill(&engine->execlist.irq_tasklet);
+		engine->execlist.no_priolist = false;
 	}
 }
 
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index d89e1b8e1cc5..2964e7c0a873 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -338,12 +338,12 @@ static u64 execlists_update_context(struct drm_i915_gem_request *rq)
 
 static void execlists_submit_ports(struct intel_engine_cs *engine)
 {
-	struct execlist_port *port = engine->execlist_port;
+	struct execlist_port *port = engine->execlist.port;
 	u32 __iomem *elsp =
 		engine->i915->regs + i915_mmio_reg_offset(RING_ELSP(engine));
 	unsigned int n;
 
-	for (n = ARRAY_SIZE(engine->execlist_port); n--; ) {
+	for (n = ARRAY_SIZE(engine->execlist.port); n--; ) {
 		struct drm_i915_gem_request *rq;
 		unsigned int count;
 		u64 desc;
@@ -398,7 +398,7 @@ static void port_assign(struct execlist_port *port,
 static void execlists_dequeue(struct intel_engine_cs *engine)
 {
 	struct drm_i915_gem_request *last;
-	struct execlist_port *port = engine->execlist_port;
+	struct execlist_port *port = engine->execlist.port;
 	struct rb_node *rb;
 	bool submit = false;
 
@@ -436,8 +436,8 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
 	 */
 
 	spin_lock_irq(&engine->timeline->lock);
-	rb = engine->execlist_first;
-	GEM_BUG_ON(rb_first(&engine->execlist_queue) != rb);
+	rb = engine->execlist.first;
+	GEM_BUG_ON(rb_first(&engine->execlist.queue) != rb);
 	while (rb) {
 		struct i915_priolist *p = rb_entry(rb, typeof(*p), node);
 		struct drm_i915_gem_request *rq, *rn;
@@ -460,7 +460,7 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
 				 * combine this request with the last, then we
 				 * are done.
 				 */
-				if (port != engine->execlist_port) {
+				if (port != engine->execlist.port) {
 					__list_del_many(&p->requests,
 							&rq->priotree.link);
 					goto done;
@@ -497,13 +497,13 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
 		}
 
 		rb = rb_next(rb);
-		rb_erase(&p->node, &engine->execlist_queue);
+		rb_erase(&p->node, &engine->execlist.queue);
 		INIT_LIST_HEAD(&p->requests);
 		if (p->priority != I915_PRIORITY_NORMAL)
 			kmem_cache_free(engine->i915->priorities, p);
 	}
 done:
-	engine->execlist_first = rb;
+	engine->execlist.first = rb;
 	if (submit)
 		port_assign(port, last);
 	spin_unlock_irq(&engine->timeline->lock);
@@ -514,7 +514,7 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
 
 static bool execlists_elsp_ready(const struct intel_engine_cs *engine)
 {
-	const struct execlist_port *port = engine->execlist_port;
+	const struct execlist_port *port = engine->execlist.port;
 
 	return port_count(&port[0]) + port_count(&port[1]) < 2;
 }
@@ -525,8 +525,9 @@ static bool execlists_elsp_ready(const struct intel_engine_cs *engine)
  */
 static void intel_lrc_irq_handler(unsigned long data)
 {
-	struct intel_engine_cs *engine = (struct intel_engine_cs *)data;
-	struct execlist_port *port = engine->execlist_port;
+	struct intel_engine_cs * const engine = (struct intel_engine_cs *)data;
+	struct intel_engine_execlist * const el = &engine->execlist;
+	struct execlist_port *port = el->port;
 	struct drm_i915_private *dev_priv = engine->i915;
 
 	/* We can skip acquiring intel_runtime_pm_get() here as it was taken
@@ -538,7 +539,7 @@ static void intel_lrc_irq_handler(unsigned long data)
 	 */
 	GEM_BUG_ON(!dev_priv->gt.awake);
 
-	intel_uncore_forcewake_get(dev_priv, engine->fw_domains);
+	intel_uncore_forcewake_get(dev_priv, el->fw_domains);
 
 	/* Prefer doing test_and_clear_bit() as a two stage operation to avoid
 	 * imposing the cost of a locked atomic transaction when submitting a
@@ -626,7 +627,7 @@ static void intel_lrc_irq_handler(unsigned long data)
 	if (execlists_elsp_ready(engine))
 		execlists_dequeue(engine);
 
-	intel_uncore_forcewake_put(dev_priv, engine->fw_domains);
+	intel_uncore_forcewake_put(dev_priv, el->fw_domains);
 }
 
 static bool
@@ -634,17 +635,18 @@ insert_request(struct intel_engine_cs *engine,
 	       struct i915_priotree *pt,
 	       int prio)
 {
+	struct intel_engine_execlist *el = &engine->execlist;
 	struct i915_priolist *p;
 	struct rb_node **parent, *rb;
 	bool first = true;
 
-	if (unlikely(engine->no_priolist))
+	if (unlikely(engine->execlist.no_priolist))
 		prio = I915_PRIORITY_NORMAL;
 
 find_priolist:
 	/* most positive priority is scheduled first, equal priorities fifo */
 	rb = NULL;
-	parent = &engine->execlist_queue.rb_node;
+	parent = &el->queue.rb_node;
 	while (*parent) {
 		rb = *parent;
 		p = rb_entry(rb, typeof(*p), node);
@@ -660,7 +662,7 @@ insert_request(struct intel_engine_cs *engine,
 	}
 
 	if (prio == I915_PRIORITY_NORMAL) {
-		p = &engine->default_priolist;
+		p = &el->default_priolist;
 	} else {
 		p = kmem_cache_alloc(engine->i915->priorities, GFP_ATOMIC);
 		/* Convert an allocation failure to a priority bump */
@@ -675,20 +677,20 @@ insert_request(struct intel_engine_cs *engine,
 			 * requests, so if userspace lied about their
 			 * dependencies that reordering may be visible.
 			 */
-			engine->no_priolist = true;
+			el->no_priolist = true;
 			goto find_priolist;
 		}
 	}
 
 	p->priority = prio;
 	rb_link_node(&p->node, rb, parent);
-	rb_insert_color(&p->node, &engine->execlist_queue);
+	rb_insert_color(&p->node, &el->queue);
 
 	INIT_LIST_HEAD(&p->requests);
 	list_add_tail(&pt->link, &p->requests);
 
 	if (first)
-		engine->execlist_first = &p->node;
+		el->first = &p->node;
 
 	return first;
 }
@@ -705,10 +707,10 @@ static void execlists_submit_request(struct drm_i915_gem_request *request)
 			   &request->priotree,
 			   request->priotree.priority)) {
 		if (execlists_elsp_ready(engine))
-			tasklet_hi_schedule(&engine->irq_tasklet);
+			tasklet_hi_schedule(&engine->execlist.irq_tasklet);
 	}
 
-	GEM_BUG_ON(!engine->execlist_first);
+	GEM_BUG_ON(!engine->execlist.first);
 	GEM_BUG_ON(list_empty(&request->priotree.link));
 
 	spin_unlock_irqrestore(&engine->timeline->lock, flags);
@@ -1234,7 +1236,7 @@ static u8 gtiir[] = {
 static int gen8_init_common_ring(struct intel_engine_cs *engine)
 {
 	struct drm_i915_private *dev_priv = engine->i915;
-	struct execlist_port *port = engine->execlist_port;
+	struct execlist_port *port = engine->execlist.port;
 	unsigned int n;
 	bool submit;
 	int ret;
@@ -1272,7 +1274,7 @@ static int gen8_init_common_ring(struct intel_engine_cs *engine)
 
 	/* After a GPU reset, we may have requests to replay */
 	submit = false;
-	for (n = 0; n < ARRAY_SIZE(engine->execlist_port); n++) {
+	for (n = 0; n < ARRAY_SIZE(engine->execlist.port); n++) {
 		if (!port_isset(&port[n]))
 			break;
 
@@ -1327,7 +1329,7 @@ static int gen9_init_render_ring(struct intel_engine_cs *engine)
 static void reset_common_ring(struct intel_engine_cs *engine,
 			      struct drm_i915_gem_request *request)
 {
-	struct execlist_port *port = engine->execlist_port;
+	struct execlist_port *port = engine->execlist.port;
 	struct intel_context *ce;
 	unsigned int n;
 
@@ -1341,9 +1343,9 @@ static void reset_common_ring(struct intel_engine_cs *engine,
 	 * requests were completed.
 	 */
 	if (!request) {
-		for (n = 0; n < ARRAY_SIZE(engine->execlist_port); n++)
+		for (n = 0; n < ARRAY_SIZE(engine->execlist.port); n++)
 			i915_gem_request_put(port_request(&port[n]));
-		memset(engine->execlist_port, 0, sizeof(engine->execlist_port));
+		memset(engine->execlist.port, 0, sizeof(engine->execlist.port));
 		return;
 	}
 
@@ -1668,8 +1670,8 @@ void intel_logical_ring_cleanup(struct intel_engine_cs *engine)
 	 * Tasklet cannot be active at this point due intel_mark_active/idle
 	 * so this is just for documentation.
 	 */
-	if (WARN_ON(test_bit(TASKLET_STATE_SCHED, &engine->irq_tasklet.state)))
-		tasklet_kill(&engine->irq_tasklet);
+	if (WARN_ON(test_bit(TASKLET_STATE_SCHED, &engine->execlist.irq_tasklet.state)))
+		tasklet_kill(&engine->execlist.irq_tasklet);
 
 	dev_priv = engine->i915;
 
@@ -1697,7 +1699,7 @@ static void execlists_set_default_submission(struct intel_engine_cs *engine)
 {
 	engine->submit_request = execlists_submit_request;
 	engine->schedule = execlists_schedule;
-	engine->irq_tasklet.func = intel_lrc_irq_handler;
+	engine->execlist.irq_tasklet.func = intel_lrc_irq_handler;
 }
 
 static void
@@ -1772,9 +1774,9 @@ logical_ring_setup(struct intel_engine_cs *engine)
 						     RING_CONTEXT_STATUS_BUF_BASE(engine),
 						     FW_REG_READ);
 
-	engine->fw_domains = fw_domains;
+	engine->execlist.fw_domains = fw_domains;
 
-	tasklet_init(&engine->irq_tasklet,
+	tasklet_init(&engine->execlist.irq_tasklet,
 		     intel_lrc_irq_handler, (unsigned long)engine);
 
 	logical_ring_default_vfuncs(engine);
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index 79c0021f3700..326ca1453e91 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -184,6 +184,74 @@ struct i915_priolist {
 	int priority;
 };
 
+/**
+ * struct intel_engine_execlist - execlist submission queue and port state
+ *
+ * The struct intel_engine_execlist represents the combined logical state of driver
+ * and the hardware state for execlist mode of submission.
+ */
+struct intel_engine_execlist {
+	/**
+	 * @irq_tasklet: softirq tasklet for bottom handler
+	 */
+	struct tasklet_struct irq_tasklet;
+
+	/**
+	 * @default_priolist: priority list for I915_PRIORITY_NORMAL
+	 */
+	struct i915_priolist default_priolist;
+
+	/**
+	 * @no_priolist: priority lists disabled
+	 */
+	bool no_priolist;
+
+	/**
+	 * @port: execlist port states
+	 *
+	 * For each hardware ELSP (ExecList Submission Port) we keep
+	 * track of the last request and the number of times we submitted
+	 * that port to hw. We then count the number of times the hw reports
+	 * a context completion or preemption. As only one context can
+	 * be active on hw, we limit resubmission of context to port[0]. This
+	 * is called Lite Restore, of the context.
+	 */
+	struct execlist_port {
+		/**
+		 * @request_count: combined request and submission count
+		 */
+		struct drm_i915_gem_request *request_count;
+#define EXECLIST_COUNT_BITS 2
+#define port_request(p) ptr_mask_bits((p)->request_count, EXECLIST_COUNT_BITS)
+#define port_count(p) ptr_unmask_bits((p)->request_count, EXECLIST_COUNT_BITS)
+#define port_pack(rq, count) ptr_pack_bits(rq, count, EXECLIST_COUNT_BITS)
+#define port_unpack(p, count) ptr_unpack_bits((p)->request_count, count, EXECLIST_COUNT_BITS)
+#define port_set(p, packed) ((p)->request_count = (packed))
+#define port_isset(p) ((p)->request_count)
+#define port_index(p, e) ((p) - (e)->execlist.port)
+
+		/**
+		 * @context_id: context ID for port
+		 */
+		GEM_DEBUG_DECL(u32 context_id);
+	} port[2];
+
+	/**
+	 * @queue: queue of requests, in priority lists
+	 */
+	struct rb_root queue;
+
+	/**
+	 * @first: leftmost level in priority @queue
+	 */
+	struct rb_node *first;
+
+	/**
+	 * @fw_domains: forcewake domains for irq tasklet
+	 */
+	unsigned int fw_domains;
+};
+
 #define INTEL_ENGINE_CS_MAX_NAME 8
 
 struct intel_engine_cs {
@@ -372,25 +440,7 @@ struct intel_engine_cs {
 		u32	*(*signal)(struct drm_i915_gem_request *req, u32 *cs);
 	} semaphore;
 
-	/* Execlists */
-	struct tasklet_struct irq_tasklet;
-	struct i915_priolist default_priolist;
-	bool no_priolist;
-	struct execlist_port {
-		struct drm_i915_gem_request *request_count;
-#define EXECLIST_COUNT_BITS 2
-#define port_request(p) ptr_mask_bits((p)->request_count, EXECLIST_COUNT_BITS)
-#define port_count(p) ptr_unmask_bits((p)->request_count, EXECLIST_COUNT_BITS)
-#define port_pack(rq, count) ptr_pack_bits(rq, count, EXECLIST_COUNT_BITS)
-#define port_unpack(p, count) ptr_unpack_bits((p)->request_count, count, EXECLIST_COUNT_BITS)
-#define port_set(p, packed) ((p)->request_count = (packed))
-#define port_isset(p) ((p)->request_count)
-#define port_index(p, e) ((p) - (e)->execlist_port)
-		GEM_DEBUG_DECL(u32 context_id);
-	} execlist_port[2];
-	struct rb_root execlist_queue;
-	struct rb_node *execlist_first;
-	unsigned int fw_domains;
+	struct intel_engine_execlist execlist;
 
 	/* Contexts are pinned whilst they are active on the GPU. The last
 	 * context executed remains active whilst the GPU is idle - the
-- 
2.11.0


* [PATCH 2/8] drm/i915: Wrap port cancellation into a function
  2017-09-12  8:36 [PATCH 0/8] Support for more than two execlist ports Mika Kuoppala
  2017-09-12  8:36 ` [PATCH 1/8] drm/i915: Make own struct for execlist items Mika Kuoppala
@ 2017-09-12  8:36 ` Mika Kuoppala
  2017-09-12 10:15   ` Chris Wilson
  2017-09-12  8:36 ` [PATCH 3/8] drm/i915: Add execlist_port_complete Mika Kuoppala
                   ` (7 subsequent siblings)
  9 siblings, 1 reply; 30+ messages in thread
From: Mika Kuoppala @ 2017-09-12  8:36 UTC (permalink / raw)
  To: intel-gfx

On the reset and wedged paths, we want to release the requests that
are tied to the ports and then mark the ports as unset. Introduce a
function for this.

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
---
 drivers/gpu/drm/i915/i915_gem.c         | 11 ++++-------
 drivers/gpu/drm/i915/intel_engine_cs.c  | 10 ++++++++++
 drivers/gpu/drm/i915/intel_lrc.c        |  5 +----
 drivers/gpu/drm/i915/intel_ringbuffer.h |  2 ++
 4 files changed, 17 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 7d0263e9c220..d65f654821cb 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -3047,17 +3047,14 @@ static void engine_set_wedged(struct intel_engine_cs *engine)
 	 */
 
 	if (i915.enable_execlists) {
-		struct execlist_port *port = engine->execlist.port;
+		struct intel_engine_execlist * const el = &engine->execlist;
 		unsigned long flags;
-		unsigned int n;
 
 		spin_lock_irqsave(&engine->timeline->lock, flags);
 
-		for (n = 0; n < ARRAY_SIZE(engine->execlist.port); n++)
-			i915_gem_request_put(port_request(&port[n]));
-		memset(engine->execlist.port, 0, sizeof(engine->execlist.port));
-		engine->execlist.queue = RB_ROOT;
-		engine->execlist.first = NULL;
+		execlist_cancel_port_requests(el);
+		el->queue = RB_ROOT;
+		el->first = NULL;
 
 		spin_unlock_irqrestore(&engine->timeline->lock, flags);
 
diff --git a/drivers/gpu/drm/i915/intel_engine_cs.c b/drivers/gpu/drm/i915/intel_engine_cs.c
index 6197df0de3fe..192ecc26f432 100644
--- a/drivers/gpu/drm/i915/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/intel_engine_cs.c
@@ -1407,6 +1407,16 @@ bool intel_engine_can_store_dword(struct intel_engine_cs *engine)
 	}
 }
 
+void execlist_cancel_port_requests(struct intel_engine_execlist * const el)
+{
+	unsigned int i;
+
+	for (i = 0; i < ARRAY_SIZE(el->port); i++)
+		i915_gem_request_put(port_request(&el->port[i]));
+
+	memset(el->port, 0, sizeof(el->port));
+}
+
 #if IS_ENABLED(CONFIG_DRM_I915_SELFTEST)
 #include "selftests/mock_engine.c"
 #endif
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 2964e7c0a873..7dc893806b43 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -1331,7 +1331,6 @@ static void reset_common_ring(struct intel_engine_cs *engine,
 {
 	struct execlist_port *port = engine->execlist.port;
 	struct intel_context *ce;
-	unsigned int n;
 
 	/*
 	 * Catch up with any missed context-switch interrupts.
@@ -1343,9 +1342,7 @@ static void reset_common_ring(struct intel_engine_cs *engine,
 	 * requests were completed.
 	 */
 	if (!request) {
-		for (n = 0; n < ARRAY_SIZE(engine->execlist.port); n++)
-			i915_gem_request_put(port_request(&port[n]));
-		memset(engine->execlist.port, 0, sizeof(engine->execlist.port));
+		execlist_cancel_port_requests(&engine->execlist);
 		return;
 	}
 
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index 326ca1453e91..ae921f916c89 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -493,6 +493,8 @@ struct intel_engine_cs {
 	u32 (*get_cmd_length_mask)(u32 cmd_header);
 };
 
+void execlist_cancel_port_requests(struct intel_engine_execlist * const el);
+
 static inline unsigned int
 intel_engine_flag(const struct intel_engine_cs *engine)
 {
-- 
2.11.0


* [PATCH 3/8] drm/i915: Add execlist_port_complete
  2017-09-12  8:36 [PATCH 0/8] Support for more than two execlist ports Mika Kuoppala
  2017-09-12  8:36 ` [PATCH 1/8] drm/i915: Make own struct for execlist items Mika Kuoppala
  2017-09-12  8:36 ` [PATCH 2/8] drm/i915: Wrap port cancellation into a function Mika Kuoppala
@ 2017-09-12  8:36 ` Mika Kuoppala
  2017-09-12 10:19   ` Chris Wilson
  2017-09-12  8:36 ` [PATCH 4/8] drm/i915: Make execlist port count variable Mika Kuoppala
                   ` (6 subsequent siblings)
  9 siblings, 1 reply; 30+ messages in thread
From: Mika Kuoppala @ 2017-09-12  8:36 UTC (permalink / raw)
  To: intel-gfx

When the first execlist entry is processed, we move the port
(contents). Introduce a function for this, as the execlist, GuC and
reset handling paths all use this common operation.

Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
---
 drivers/gpu/drm/i915/i915_guc_submission.c | 12 ++++++------
 drivers/gpu/drm/i915/intel_lrc.c           | 28 ++++++++++++++--------------
 drivers/gpu/drm/i915/intel_ringbuffer.h    | 14 +++++++++++++-
 3 files changed, 33 insertions(+), 21 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
index f95defe18885..336d22ea5216 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -691,7 +691,7 @@ static bool i915_guc_dequeue(struct intel_engine_cs *engine)
 			rq->priotree.priority = INT_MAX;
 
 			i915_guc_submit(rq);
-			trace_i915_gem_request_in(rq, port_index(port, engine));
+			trace_i915_gem_request_in(rq, port_index(port, el));
 			last = rq;
 			submit = true;
 		}
@@ -714,20 +714,20 @@ static bool i915_guc_dequeue(struct intel_engine_cs *engine)
 static void i915_guc_irq_handler(unsigned long data)
 {
 	struct intel_engine_cs * const engine = (struct intel_engine_cs *)data;
-	struct execlist_port *port = engine->execlist.port;
+	struct intel_engine_execlist * const el = &engine->execlist;
+	struct execlist_port *port = el->port;
 	struct drm_i915_gem_request *rq;
 	bool submit;
 
 	do {
-		rq = port_request(&port[0]);
+		rq = port_request(port);
 		while (rq && i915_gem_request_completed(rq)) {
 			trace_i915_gem_request_out(rq);
 			i915_gem_request_put(rq);
 
-			port[0] = port[1];
-			memset(&port[1], 0, sizeof(port[1]));
+			execlist_port_complete(el, port);
 
-			rq = port_request(&port[0]);
+			rq = port_request(port);
 		}
 
 		submit = false;
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 7dc893806b43..73626dbcef50 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -398,7 +398,8 @@ static void port_assign(struct execlist_port *port,
 static void execlists_dequeue(struct intel_engine_cs *engine)
 {
 	struct drm_i915_gem_request *last;
-	struct execlist_port *port = engine->execlist.port;
+	struct intel_engine_execlist * const el = &engine->execlist;
+	struct execlist_port *port = el->port;
 	struct rb_node *rb;
 	bool submit = false;
 
@@ -412,8 +413,6 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
 		 */
 		last->tail = last->wa_tail;
 
-	GEM_BUG_ON(port_isset(&port[1]));
-
 	/* Hardware submission is through 2 ports. Conceptually each port
 	 * has a (RING_START, RING_HEAD, RING_TAIL) tuple. RING_START is
 	 * static for a context, and unique to each, so we only execute
@@ -436,8 +435,8 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
 	 */
 
 	spin_lock_irq(&engine->timeline->lock);
-	rb = engine->execlist.first;
-	GEM_BUG_ON(rb_first(&engine->execlist.queue) != rb);
+	rb = el->first;
+	GEM_BUG_ON(rb_first(&el->queue) != rb);
 	while (rb) {
 		struct i915_priolist *p = rb_entry(rb, typeof(*p), node);
 		struct drm_i915_gem_request *rq, *rn;
@@ -460,7 +459,7 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
 				 * combine this request with the last, then we
 				 * are done.
 				 */
-				if (port != engine->execlist.port) {
+				if (port != el->port) {
 					__list_del_many(&p->requests,
 							&rq->priotree.link);
 					goto done;
@@ -485,25 +484,27 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
 				if (submit)
 					port_assign(port, last);
 				port++;
+
+				GEM_BUG_ON(port_isset(port));
 			}
 
 			INIT_LIST_HEAD(&rq->priotree.link);
 			rq->priotree.priority = INT_MAX;
 
 			__i915_gem_request_submit(rq);
-			trace_i915_gem_request_in(rq, port_index(port, engine));
+			trace_i915_gem_request_in(rq, port_index(port, el));
 			last = rq;
 			submit = true;
 		}
 
 		rb = rb_next(rb);
-		rb_erase(&p->node, &engine->execlist.queue);
+		rb_erase(&p->node, &el->queue);
 		INIT_LIST_HEAD(&p->requests);
 		if (p->priority != I915_PRIORITY_NORMAL)
 			kmem_cache_free(engine->i915->priorities, p);
 	}
 done:
-	engine->execlist.first = rb;
+	el->first = rb;
 	if (submit)
 		port_assign(port, last);
 	spin_unlock_irq(&engine->timeline->lock);
@@ -609,8 +610,7 @@ static void intel_lrc_irq_handler(unsigned long data)
 				trace_i915_gem_request_out(rq);
 				i915_gem_request_put(rq);
 
-				port[0] = port[1];
-				memset(&port[1], 0, sizeof(port[1]));
+				execlist_port_complete(el, port);
 			} else {
 				port_set(port, port_pack(rq, count));
 			}
@@ -1329,7 +1329,8 @@ static int gen9_init_render_ring(struct intel_engine_cs *engine)
 static void reset_common_ring(struct intel_engine_cs *engine,
 			      struct drm_i915_gem_request *request)
 {
-	struct execlist_port *port = engine->execlist.port;
+	struct intel_engine_execlist * const el = &engine->execlist;
+	struct execlist_port *port = el->port;
 	struct intel_context *ce;
 
 	/*
@@ -1348,8 +1349,7 @@ static void reset_common_ring(struct intel_engine_cs *engine,
 
 	if (request->ctx != port_request(port)->ctx) {
 		i915_gem_request_put(port_request(port));
-		port[0] = port[1];
-		memset(&port[1], 0, sizeof(port[1]));
+		execlist_port_complete(el, port);
 	}
 
 	GEM_BUG_ON(request->ctx != port_request(port)->ctx);
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index ae921f916c89..d7a941fc3f87 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -228,7 +228,7 @@ struct intel_engine_execlist {
 #define port_unpack(p, count) ptr_unpack_bits((p)->request_count, count, EXECLIST_COUNT_BITS)
 #define port_set(p, packed) ((p)->request_count = (packed))
 #define port_isset(p) ((p)->request_count)
-#define port_index(p, e) ((p) - (e)->execlist.port)
+#define port_index(p, el) ((p) - (el)->port)
 
 		/**
 		 * @context_id: context ID for port
@@ -495,6 +495,18 @@ struct intel_engine_cs {
 
 void execlist_cancel_port_requests(struct intel_engine_execlist * const el);
 
+static inline void
+execlist_port_complete(struct intel_engine_execlist * const el,
+		       struct execlist_port * const port)
+{
+	struct execlist_port * const port1 = &el->port[1];
+
+	GEM_DEBUG_BUG_ON(port_index(port, el) != 0);
+
+	*port = *port1;
+	memset(port1, 0, sizeof(struct execlist_port));
+}
+
 static inline unsigned int
 intel_engine_flag(const struct intel_engine_cs *engine)
 {
-- 
2.11.0


* [PATCH 4/8] drm/i915: Make execlist port count variable
  2017-09-12  8:36 [PATCH 0/8] Support for more than two execlist ports Mika Kuoppala
                   ` (2 preceding siblings ...)
  2017-09-12  8:36 ` [PATCH 3/8] drm/i915: Add execlist_port_complete Mika Kuoppala
@ 2017-09-12  8:36 ` Mika Kuoppala
  2017-09-12  8:55   ` Chris Wilson
  2017-09-12 10:24   ` Chris Wilson
  2017-09-12  8:36 ` [PATCH 5/8] drm/i915: Introduce iterators for execlist ports Mika Kuoppala
                   ` (5 subsequent siblings)
  9 siblings, 2 replies; 30+ messages in thread
From: Mika Kuoppala @ 2017-09-12  8:36 UTC (permalink / raw)
  To: intel-gfx

GuC can handle more than two submission ports. Make the port count
variable to prepare for supporting more than two ports.
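
A rough sketch of why the count is kept as a power of two (the helpers
below, port_next_idx() and port_last(), are illustrative only and not
part of this series): with port_mask == nr_ports - 1, the last port is
simply &el->port[el->port_mask] and index wrap-around reduces to a
single AND.

/* Illustration only, not part of the patch */
static inline unsigned int
port_next_idx(const struct intel_engine_execlist *el, unsigned int idx)
{
        /* e.g. mask 0x3: 0 -> 1 -> 2 -> 3 -> 0 */
        return (idx + 1) & el->port_mask;
}

static inline struct execlist_port *
port_last(struct intel_engine_execlist *el)
{
        return &el->port[el->port_mask];
}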

Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
---
 drivers/gpu/drm/i915/i915_debugfs.c        |  8 ++++----
 drivers/gpu/drm/i915/i915_drv.h            |  3 ++-
 drivers/gpu/drm/i915/i915_gpu_error.c      | 17 ++++++++++++-----
 drivers/gpu/drm/i915/i915_guc_submission.c |  8 ++++++--
 drivers/gpu/drm/i915/intel_lrc.c           | 24 ++++++++++++++++--------
 drivers/gpu/drm/i915/intel_ringbuffer.h    | 23 ++++++++++++++++++-----
 6 files changed, 58 insertions(+), 25 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index cc36409936dc..77ac775e312d 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -3315,6 +3315,7 @@ static int i915_engine_info(struct seq_file *m, void *unused)
 			   upper_32_bits(addr), lower_32_bits(addr));
 
 		if (i915.enable_execlists) {
+			struct intel_engine_execlist * const el = &engine->execlist;
 			u32 ptr, read, write;
 			unsigned int idx;
 
@@ -3344,11 +3345,10 @@ static int i915_engine_info(struct seq_file *m, void *unused)
 			}
 
 			rcu_read_lock();
-			for (idx = 0; idx < ARRAY_SIZE(engine->execlist.port); idx++) {
+			for (idx = 0; idx < execlist_num_ports(el); idx++) {
 				unsigned int count;
 
-				rq = port_unpack(&engine->execlist.port[idx],
-						 &count);
+				rq = port_unpack(&el->port[idx], &count);
 				if (rq) {
 					seq_printf(m, "\t\tELSP[%d] count=%d, ",
 						   idx, count);
@@ -3361,7 +3361,7 @@ static int i915_engine_info(struct seq_file *m, void *unused)
 			rcu_read_unlock();
 
 			spin_lock_irq(&engine->timeline->lock);
-			for (rb = engine->execlist.first; rb; rb = rb_next(rb)) {
+			for (rb = el->first; rb; rb = rb_next(rb)) {
 				struct i915_priolist *p =
 					rb_entry(rb, typeof(*p), node);
 
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 1cc31a5b049f..fd068ac66388 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1001,7 +1001,8 @@ struct i915_gpu_state {
 			u32 seqno;
 			u32 head;
 			u32 tail;
-		} *requests, execlist[2];
+		} *requests, execlist[EXECLIST_MAX_PORTS];
+		unsigned int num_ports;
 
 		struct drm_i915_error_waiter {
 			char comm[TASK_COMM_LEN];
diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
index 6114bf79219d..e048d713f72c 100644
--- a/drivers/gpu/drm/i915/i915_gpu_error.c
+++ b/drivers/gpu/drm/i915/i915_gpu_error.c
@@ -396,6 +396,8 @@ static void error_print_context(struct drm_i915_error_state_buf *m,
 static void error_print_engine(struct drm_i915_error_state_buf *m,
 			       const struct drm_i915_error_engine *ee)
 {
+	int n;
+
 	err_printf(m, "%s command stream:\n", engine_str(ee->engine_id));
 	err_printf(m, "  START: 0x%08x\n", ee->start);
 	err_printf(m, "  HEAD:  0x%08x [0x%08x]\n", ee->head, ee->rq_head);
@@ -465,8 +467,11 @@ static void error_print_engine(struct drm_i915_error_state_buf *m,
 		   jiffies_to_msecs(jiffies - ee->hangcheck_timestamp));
 	err_printf(m, "  engine reset count: %u\n", ee->reset_count);
 
-	error_print_request(m, "  ELSP[0]: ", &ee->execlist[0]);
-	error_print_request(m, "  ELSP[1]: ", &ee->execlist[1]);
+	for (n = 0; n < ee->num_ports; n++) {
+		err_printf(m, "  ELSP[%d]:", n);
+		error_print_request(m, " ", &ee->execlist[n]);
+	}
+
 	error_print_context(m, "  Active context: ", &ee->context);
 }
 
@@ -1327,17 +1332,19 @@ static void engine_record_requests(struct intel_engine_cs *engine,
 static void error_record_engine_execlists(struct intel_engine_cs *engine,
 					  struct drm_i915_error_engine *ee)
 {
-	const struct execlist_port *port = engine->execlist.port;
+	const struct intel_engine_execlist * const el = &engine->execlist;
 	unsigned int n;
 
-	for (n = 0; n < ARRAY_SIZE(engine->execlist.port); n++) {
-		struct drm_i915_gem_request *rq = port_request(&port[n]);
+	for (n = 0; n < execlist_num_ports(el); n++) {
+		struct drm_i915_gem_request *rq = port_request(&el->port[n]);
 
 		if (!rq)
 			break;
 
 		record_request(rq, &ee->execlist[n]);
 	}
+
+	ee->num_ports = n;
 }
 
 static void record_context(struct drm_i915_error_context *e,
diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
index 336d22ea5216..c38a3e71e285 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -663,6 +663,8 @@ static bool i915_guc_dequeue(struct intel_engine_cs *engine)
 {
 	struct intel_engine_execlist * const el = &engine->execlist;
 	struct execlist_port *port = el->port;
+	const struct execlist_port * const last_port =
+		&el->port[el->port_mask];
 	struct drm_i915_gem_request *last = port_request(port);
 	struct rb_node *rb;
 	bool submit = false;
@@ -676,7 +678,7 @@ static bool i915_guc_dequeue(struct intel_engine_cs *engine)
 
 		list_for_each_entry_safe(rq, rn, &p->requests, priotree.link) {
 			if (last && rq->ctx != last->ctx) {
-				if (port != el->port) {
+				if (port == last_port) {
 					__list_del_many(&p->requests,
 							&rq->priotree.link);
 					goto done;
@@ -716,6 +718,8 @@ static void i915_guc_irq_handler(unsigned long data)
 	struct intel_engine_cs * const engine = (struct intel_engine_cs *)data;
 	struct intel_engine_execlist * const el = &engine->execlist;
 	struct execlist_port *port = el->port;
+	const struct execlist_port * const last_port =
+		&el->port[el->port_mask];
 	struct drm_i915_gem_request *rq;
 	bool submit;
 
@@ -731,7 +735,7 @@ static void i915_guc_irq_handler(unsigned long data)
 		}
 
 		submit = false;
-		if (!port_count(&port[1]))
+		if (!port_count(last_port))
 			submit = i915_guc_dequeue(engine);
 	} while (submit);
 }
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 73626dbcef50..4752d71dd644 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -343,7 +343,7 @@ static void execlists_submit_ports(struct intel_engine_cs *engine)
 		engine->i915->regs + i915_mmio_reg_offset(RING_ELSP(engine));
 	unsigned int n;
 
-	for (n = ARRAY_SIZE(engine->execlist.port); n--; ) {
+	for (n = execlist_num_ports(&engine->execlist); n--; ) {
 		struct drm_i915_gem_request *rq;
 		unsigned int count;
 		u64 desc;
@@ -400,6 +400,8 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
 	struct drm_i915_gem_request *last;
 	struct intel_engine_execlist * const el = &engine->execlist;
 	struct execlist_port *port = el->port;
+	const struct execlist_port * const last_port =
+		&el->port[el->port_mask];
 	struct rb_node *rb;
 	bool submit = false;
 
@@ -459,7 +461,7 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
 				 * combine this request with the last, then we
 				 * are done.
 				 */
-				if (port != el->port) {
+				if (port == last_port) {
 					__list_del_many(&p->requests,
 							&rq->priotree.link);
 					goto done;
@@ -1236,7 +1238,7 @@ static u8 gtiir[] = {
 static int gen8_init_common_ring(struct intel_engine_cs *engine)
 {
 	struct drm_i915_private *dev_priv = engine->i915;
-	struct execlist_port *port = engine->execlist.port;
+	struct intel_engine_execlist * const el = &engine->execlist;
 	unsigned int n;
 	bool submit;
 	int ret;
@@ -1274,16 +1276,18 @@ static int gen8_init_common_ring(struct intel_engine_cs *engine)
 
 	/* After a GPU reset, we may have requests to replay */
 	submit = false;
-	for (n = 0; n < ARRAY_SIZE(engine->execlist.port); n++) {
-		if (!port_isset(&port[n]))
+	for (n = 0; n < execlist_num_ports(el); n++) {
+		struct execlist_port * const port = &el->port[n];
+
+		if (!port_isset(port))
 			break;
 
 		DRM_DEBUG_DRIVER("Restarting %s:%d from 0x%x\n",
 				 engine->name, n,
-				 port_request(&port[n])->global_seqno);
+				 port_request(port)->global_seqno);
 
 		/* Discard the current inflight count */
-		port_set(&port[n], port_request(&port[n]));
+		port_set(port, port_request(port));
 		submit = true;
 	}
 
@@ -1343,7 +1347,7 @@ static void reset_common_ring(struct intel_engine_cs *engine,
 	 * requests were completed.
 	 */
 	if (!request) {
-		execlist_cancel_port_requests(&engine->execlist);
+		execlist_cancel_port_requests(el);
 		return;
 	}
 
@@ -1759,6 +1763,10 @@ logical_ring_setup(struct intel_engine_cs *engine)
 	/* Intentionally left blank. */
 	engine->buffer = NULL;
 
+	engine->execlist.port_mask = 1;
+	BUILD_BUG_ON_NOT_POWER_OF_2(execlist_num_ports(&engine->execlist));
+	GEM_BUG_ON(execlist_num_ports(&engine->execlist) > EXECLIST_MAX_PORTS);
+
 	fw_domains = intel_uncore_forcewake_for_reg(dev_priv,
 						    RING_ELSP(engine),
 						    FW_REG_WRITE);
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index d7a941fc3f87..7b64e5e16136 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -234,7 +234,14 @@ struct intel_engine_execlist {
 		 * @context_id: context ID for port
 		 */
 		GEM_DEBUG_DECL(u32 context_id);
-	} port[2];
+
+#define EXECLIST_MAX_PORTS 2
+	} port[EXECLIST_MAX_PORTS];
+
+	/**
+	 * @port_mask: number of execlist ports - 1
+	 */
+	unsigned int port_mask;
 
 	/**
 	 * @queue: queue of requests, in priority lists
@@ -493,20 +500,26 @@ struct intel_engine_cs {
 	u32 (*get_cmd_length_mask)(u32 cmd_header);
 };
 
-void execlist_cancel_port_requests(struct intel_engine_execlist * const el);
+static inline unsigned int
+execlist_num_ports(const struct intel_engine_execlist * const el)
+{
+	return el->port_mask + 1;
+}
 
 static inline void
 execlist_port_complete(struct intel_engine_execlist * const el,
 		       struct execlist_port * const port)
 {
-	struct execlist_port * const port1 = &el->port[1];
+	const unsigned int m = el->port_mask;
 
 	GEM_DEBUG_BUG_ON(port_index(port, el) != 0);
 
-	*port = *port1;
-	memset(port1, 0, sizeof(struct execlist_port));
+	memmove(port, port + 1, m * sizeof(struct execlist_port));
+	memset(port + m, 0, sizeof(struct execlist_port));
 }
 
+void execlist_cancel_port_requests(struct intel_engine_execlist * const el);
+
 static inline unsigned int
 intel_engine_flag(const struct intel_engine_cs *engine)
 {
-- 
2.11.0


* [PATCH 5/8] drm/i915: Introduce iterators for execlist ports
  2017-09-12  8:36 [PATCH 0/8] Support for more than two execlist ports Mika Kuoppala
                   ` (3 preceding siblings ...)
  2017-09-12  8:36 ` [PATCH 4/8] drm/i915: Make execlist port count variable Mika Kuoppala
@ 2017-09-12  8:36 ` Mika Kuoppala
  2017-09-12 10:30   ` Chris Wilson
  2017-09-12  8:36 ` [PATCH 6/8] drm/i915: Introduce execlist_port_* accessors Mika Kuoppala
                   ` (4 subsequent siblings)
  9 siblings, 1 reply; 30+ messages in thread
From: Mika Kuoppala @ 2017-09-12  8:36 UTC (permalink / raw)
  To: intel-gfx

Switch to iterators for execlist_port access. This is preparation for
indexing ports from an arbitrary location, which in turn allows us to
handle the ports in a ring-like fashion.
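
A minimal usage sketch for the iterators added below (the caller,
count_active_ports(), is illustrative only and not taken from the
series):

/* Illustrative caller only: count the populated ports in order */
static unsigned int count_active_ports(struct intel_engine_execlist *el)
{
        struct execlist_port *port;
        unsigned int n, active = 0;

        /* with port_mask == 1 this visits el->port[0], then el->port[1] */
        for_each_execlist_port(el, port, n) {
                if (!port_isset(port))
                        break;
                active++;
        }

        return active;
}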

Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
---
 drivers/gpu/drm/i915/i915_debugfs.c     |  5 +++--
 drivers/gpu/drm/i915/i915_gpu_error.c   |  7 ++++---
 drivers/gpu/drm/i915/intel_lrc.c        | 23 +++++++++++------------
 drivers/gpu/drm/i915/intel_ringbuffer.h | 16 +++++++++++-----
 4 files changed, 29 insertions(+), 22 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index 77ac775e312d..6bff702a5fc6 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -3254,6 +3254,7 @@ static int i915_engine_info(struct seq_file *m, void *unused)
 
 	for_each_engine(engine, dev_priv, id) {
 		struct intel_breadcrumbs *b = &engine->breadcrumbs;
+		struct execlist_port *port;
 		struct drm_i915_gem_request *rq;
 		struct rb_node *rb;
 		u64 addr;
@@ -3345,10 +3346,10 @@ static int i915_engine_info(struct seq_file *m, void *unused)
 			}
 
 			rcu_read_lock();
-			for (idx = 0; idx < execlist_num_ports(el); idx++) {
+			for_each_execlist_port(el, port, idx) {
 				unsigned int count;
 
-				rq = port_unpack(&el->port[idx], &count);
+				rq = port_unpack(port, &count);
 				if (rq) {
 					seq_printf(m, "\t\tELSP[%d] count=%d, ",
 						   idx, count);
diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
index e048d713f72c..966f00de53aa 100644
--- a/drivers/gpu/drm/i915/i915_gpu_error.c
+++ b/drivers/gpu/drm/i915/i915_gpu_error.c
@@ -1332,11 +1332,12 @@ static void engine_record_requests(struct intel_engine_cs *engine,
 static void error_record_engine_execlists(struct intel_engine_cs *engine,
 					  struct drm_i915_error_engine *ee)
 {
-	const struct intel_engine_execlist * const el = &engine->execlist;
+	struct intel_engine_execlist * const el = &engine->execlist;
+	const struct execlist_port *port;
 	unsigned int n;
 
-	for (n = 0; n < execlist_num_ports(el); n++) {
-		struct drm_i915_gem_request *rq = port_request(&el->port[n]);
+	for_each_execlist_port(el, port, n) {
+		struct drm_i915_gem_request *rq = port_request(port);
 
 		if (!rq)
 			break;
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 4752d71dd644..cfa21731b9c7 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -338,24 +338,25 @@ static u64 execlists_update_context(struct drm_i915_gem_request *rq)
 
 static void execlists_submit_ports(struct intel_engine_cs *engine)
 {
-	struct execlist_port *port = engine->execlist.port;
-	u32 __iomem *elsp =
+	u32 __iomem * const elsp =
 		engine->i915->regs + i915_mmio_reg_offset(RING_ELSP(engine));
+	struct intel_engine_execlist * const el = &engine->execlist;
+	struct execlist_port *port;
 	unsigned int n;
 
-	for (n = execlist_num_ports(&engine->execlist); n--; ) {
+	for_each_execlist_port_reverse(el, port, n) {
 		struct drm_i915_gem_request *rq;
 		unsigned int count;
 		u64 desc;
 
-		rq = port_unpack(&port[n], &count);
+		rq = port_unpack(port, &count);
 		if (rq) {
 			GEM_BUG_ON(count > !n);
 			if (!count++)
 				execlists_context_status_change(rq, INTEL_CONTEXT_SCHEDULE_IN);
-			port_set(&port[n], port_pack(rq, count));
+			port_set(port, port_pack(rq, count));
 			desc = execlists_update_context(rq);
-			GEM_DEBUG_EXEC(port[n].context_id = upper_32_bits(desc));
+			GEM_DEBUG_EXEC(port->context_id = upper_32_bits(desc));
 		} else {
 			GEM_BUG_ON(!n);
 			desc = 0;
@@ -1238,7 +1239,7 @@ static u8 gtiir[] = {
 static int gen8_init_common_ring(struct intel_engine_cs *engine)
 {
 	struct drm_i915_private *dev_priv = engine->i915;
-	struct intel_engine_execlist * const el = &engine->execlist;
+	struct execlist_port *port;
 	unsigned int n;
 	bool submit;
 	int ret;
@@ -1276,9 +1277,7 @@ static int gen8_init_common_ring(struct intel_engine_cs *engine)
 
 	/* After a GPU reset, we may have requests to replay */
 	submit = false;
-	for (n = 0; n < execlist_num_ports(el); n++) {
-		struct execlist_port * const port = &el->port[n];
-
+	for_each_execlist_port(&engine->execlist, port, n) {
 		if (!port_isset(port))
 			break;
 
@@ -1764,8 +1763,8 @@ logical_ring_setup(struct intel_engine_cs *engine)
 	engine->buffer = NULL;
 
 	engine->execlist.port_mask = 1;
-	BUILD_BUG_ON_NOT_POWER_OF_2(execlist_num_ports(&engine->execlist));
-	GEM_BUG_ON(execlist_num_ports(&engine->execlist) > EXECLIST_MAX_PORTS);
+	BUILD_BUG_ON_NOT_POWER_OF_2(engine->execlist.port_mask + 1);
+	GEM_BUG_ON(engine->execlist.port_mask >= EXECLIST_MAX_PORTS);
 
 	fw_domains = intel_uncore_forcewake_for_reg(dev_priv,
 						    RING_ELSP(engine),
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index 7b64e5e16136..838d2d9b77c1 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -500,11 +500,17 @@ struct intel_engine_cs {
 	u32 (*get_cmd_length_mask)(u32 cmd_header);
 };
 
-static inline unsigned int
-execlist_num_ports(const struct intel_engine_execlist * const el)
-{
-	return el->port_mask + 1;
-}
+/* Iterators over elsp ports */
+#define __port_idx(start, i, m) (((start) + (i)) & (m))
+
+#define for_each_execlist_port(el__, port__, n__) \
+	for ((n__) = 0; \
+	     (port__) = &(el__)->port[__port_idx(0, (n__), (el__)->port_mask)], (n__) < (el__)->port_mask + 1; \
+	     (n__)++)
+
+#define for_each_execlist_port_reverse(el__, port__, n__) \
+	for ((n__) = (el__)->port_mask + 1; \
+	     (port__) = &(el__)->port[__port_idx((el__)->port_mask, (n__), (el__)->port_mask)], (n__)--;)
 
 static inline void
 execlist_port_complete(struct intel_engine_execlist * const el,
-- 
2.11.0


* [PATCH 6/8] drm/i915: Introduce execlist_port_* accessors
  2017-09-12  8:36 [PATCH 0/8] Support for more than two execlist ports Mika Kuoppala
                   ` (4 preceding siblings ...)
  2017-09-12  8:36 ` [PATCH 5/8] drm/i915: Introduce iterators for execlist ports Mika Kuoppala
@ 2017-09-12  8:36 ` Mika Kuoppala
  2017-09-12 10:37   ` Chris Wilson
  2017-09-12  8:36 ` [PATCH 7/8] drm/i915: Move execlist initialization into intel_engine_cs.c Mika Kuoppala
                   ` (3 subsequent siblings)
  9 siblings, 1 reply; 30+ messages in thread
From: Mika Kuoppala @ 2017-09-12  8:36 UTC (permalink / raw)
  To: intel-gfx

Instead of trusting that the first available port is at index 0, use
an accessor to hide this. This allows us to just move the head on port
completion instead of memmoving the array.
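
A rough sketch of what ring-style head/next accessors could look like,
assuming a head index stored in the execlist struct (the port_head
field name is an assumption and may not match what the patch actually
adds):

/* Sketch only: el->port_head is an assumed field name */
static inline struct execlist_port *
execlist_port_head(struct intel_engine_execlist * const el)
{
        return &el->port[el->port_head];
}

static inline struct execlist_port *
execlist_port_next(struct intel_engine_execlist * const el,
                   const struct execlist_port * const port)
{
        const unsigned int idx = port - el->port;

        return &el->port[(idx + 1) & el->port_mask];
}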

Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
---
 drivers/gpu/drm/i915/i915_guc_submission.c | 15 ++++++-----
 drivers/gpu/drm/i915/i915_irq.c            |  2 +-
 drivers/gpu/drm/i915/intel_engine_cs.c     |  2 +-
 drivers/gpu/drm/i915/intel_lrc.c           | 29 +++++++++++-----------
 drivers/gpu/drm/i915/intel_ringbuffer.h    | 40 ++++++++++++++++++++++++------
 5 files changed, 57 insertions(+), 31 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
index c38a3e71e285..0dfb03a0cee4 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -662,9 +662,8 @@ static void port_assign(struct execlist_port *port,
 static bool i915_guc_dequeue(struct intel_engine_cs *engine)
 {
 	struct intel_engine_execlist * const el = &engine->execlist;
-	struct execlist_port *port = el->port;
-	const struct execlist_port * const last_port =
-		&el->port[el->port_mask];
+	struct execlist_port *port = execlist_port_head(el);
+	const struct execlist_port * const last_port = execlist_port_tail(el);
 	struct drm_i915_gem_request *last = port_request(port);
 	struct rb_node *rb;
 	bool submit = false;
@@ -686,7 +685,8 @@ static bool i915_guc_dequeue(struct intel_engine_cs *engine)
 
 				if (submit)
 					port_assign(port, last);
-				port++;
+
+				port = execlist_port_next(el, port);
 			}
 
 			INIT_LIST_HEAD(&rq->priotree.link);
@@ -717,9 +717,8 @@ static void i915_guc_irq_handler(unsigned long data)
 {
 	struct intel_engine_cs * const engine = (struct intel_engine_cs *)data;
 	struct intel_engine_execlist * const el = &engine->execlist;
-	struct execlist_port *port = el->port;
-	const struct execlist_port * const last_port =
-		&el->port[el->port_mask];
+	struct execlist_port *port = execlist_port_head(el);
+	const struct execlist_port * const last_port = execlist_port_tail(el);
 	struct drm_i915_gem_request *rq;
 	bool submit;
 
@@ -729,7 +728,7 @@ static void i915_guc_irq_handler(unsigned long data)
 			trace_i915_gem_request_out(rq);
 			i915_gem_request_put(rq);
 
-			execlist_port_complete(el, port);
+			port = execlist_port_complete(el, port);
 
 			rq = port_request(port);
 		}
diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index e16fbea03823..1919ac0b7b0f 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -1312,7 +1312,7 @@ gen8_cs_irq_handler(struct intel_engine_cs *engine, u32 iir, int test_shift)
 	bool tasklet = false;
 
 	if (iir & (GT_CONTEXT_SWITCH_INTERRUPT << test_shift)) {
-		if (port_count(&el->port[0])) {
+		if (port_count(execlist_port_head(el))) {
 			__set_bit(ENGINE_IRQ_EXECLIST, &engine->irq_posted);
 			tasklet = true;
 		}
diff --git a/drivers/gpu/drm/i915/intel_engine_cs.c b/drivers/gpu/drm/i915/intel_engine_cs.c
index 192ecc26f432..0fb5a1f99349 100644
--- a/drivers/gpu/drm/i915/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/intel_engine_cs.c
@@ -1334,7 +1334,7 @@ bool intel_engine_is_idle(struct intel_engine_cs *engine)
 		return false;
 
 	/* Both ports drained, no more ELSP submission? */
-	if (port_request(&engine->execlist.port[0]))
+	if (port_request(execlist_port_head(&engine->execlist)))
 		return false;
 
 	/* ELSP is empty, but there are ready requests? */
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index cfa21731b9c7..0bc42a3f9528 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -400,9 +400,8 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
 {
 	struct drm_i915_gem_request *last;
 	struct intel_engine_execlist * const el = &engine->execlist;
-	struct execlist_port *port = el->port;
-	const struct execlist_port * const last_port =
-		&el->port[el->port_mask];
+	struct execlist_port *port = execlist_port_head(el);
+	const struct execlist_port * const last_port = execlist_port_tail(el);
 	struct rb_node *rb;
 	bool submit = false;
 
@@ -486,7 +485,8 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
 
 				if (submit)
 					port_assign(port, last);
-				port++;
+
+				port = execlist_port_next(el, port);
 
 				GEM_BUG_ON(port_isset(port));
 			}
@@ -516,11 +516,11 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
 		execlists_submit_ports(engine);
 }
 
-static bool execlists_elsp_ready(const struct intel_engine_cs *engine)
+static bool execlists_elsp_ready(struct intel_engine_execlist * const el)
 {
-	const struct execlist_port *port = engine->execlist.port;
+	struct execlist_port * const port = execlist_port_head(el);
 
-	return port_count(&port[0]) + port_count(&port[1]) < 2;
+	return port_count(port) + port_count(execlist_port_next(el, port)) < 2;
 }
 
 /*
@@ -531,7 +531,7 @@ static void intel_lrc_irq_handler(unsigned long data)
 {
 	struct intel_engine_cs * const engine = (struct intel_engine_cs *)data;
 	struct intel_engine_execlist * const el = &engine->execlist;
-	struct execlist_port *port = el->port;
+	struct execlist_port *port = execlist_port_head(el);
 	struct drm_i915_private *dev_priv = engine->i915;
 
 	/* We can skip acquiring intel_runtime_pm_get() here as it was taken
@@ -613,7 +613,7 @@ static void intel_lrc_irq_handler(unsigned long data)
 				trace_i915_gem_request_out(rq);
 				i915_gem_request_put(rq);
 
-				execlist_port_complete(el, port);
+				port = execlist_port_complete(el, port);
 			} else {
 				port_set(port, port_pack(rq, count));
 			}
@@ -627,7 +627,7 @@ static void intel_lrc_irq_handler(unsigned long data)
 		       csb_mmio);
 	}
 
-	if (execlists_elsp_ready(engine))
+	if (execlists_elsp_ready(el))
 		execlists_dequeue(engine);
 
 	intel_uncore_forcewake_put(dev_priv, el->fw_domains);
@@ -701,6 +701,7 @@ insert_request(struct intel_engine_cs *engine,
 static void execlists_submit_request(struct drm_i915_gem_request *request)
 {
 	struct intel_engine_cs *engine = request->engine;
+	struct intel_engine_execlist * const el = &engine->execlist;
 	unsigned long flags;
 
 	/* Will be called from irq-context when using foreign fences. */
@@ -709,11 +710,11 @@ static void execlists_submit_request(struct drm_i915_gem_request *request)
 	if (insert_request(engine,
 			   &request->priotree,
 			   request->priotree.priority)) {
-		if (execlists_elsp_ready(engine))
-			tasklet_hi_schedule(&engine->execlist.irq_tasklet);
+		if (execlists_elsp_ready(el))
+			tasklet_hi_schedule(&el->irq_tasklet);
 	}
 
-	GEM_BUG_ON(!engine->execlist.first);
+	GEM_BUG_ON(!el->first);
 	GEM_BUG_ON(list_empty(&request->priotree.link));
 
 	spin_unlock_irqrestore(&engine->timeline->lock, flags);
@@ -1333,7 +1334,7 @@ static void reset_common_ring(struct intel_engine_cs *engine,
 			      struct drm_i915_gem_request *request)
 {
 	struct intel_engine_execlist * const el = &engine->execlist;
-	struct execlist_port *port = el->port;
+	struct execlist_port *port = execlist_port_head(el);
 	struct intel_context *ce;
 
 	/*
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index 838d2d9b77c1..13d8e1115f3b 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -244,6 +244,11 @@ struct intel_engine_execlist {
 	unsigned int port_mask;
 
 	/**
+	 * @port_head: first used execlist port
+	 */
+	unsigned int port_head;
+
+	/**
 	 * @queue: queue of requests, in priority lists
 	 */
 	struct rb_root queue;
@@ -505,23 +510,44 @@ struct intel_engine_cs {
 
 #define for_each_execlist_port(el__, port__, n__) \
 	for ((n__) = 0; \
-	     (port__) = &(el__)->port[__port_idx(0, (n__), (el__)->port_mask)], (n__) < (el__)->port_mask + 1; \
+	     (port__) = &(el__)->port[__port_idx((el__)->port_head, (n__), (el__)->port_mask)], (n__) < (el__)->port_mask + 1; \
 	     (n__)++)
 
 #define for_each_execlist_port_reverse(el__, port__, n__) \
 	for ((n__) = (el__)->port_mask + 1; \
-	     (port__) = &(el__)->port[__port_idx((el__)->port_mask, (n__), (el__)->port_mask)], (n__)--;)
+	     (port__) = &(el__)->port[__port_idx((el__)->port_head - 1, (n__), (el__)->port_mask)], (n__)--;)
 
-static inline void
+static inline struct execlist_port *
+execlist_port_head(struct intel_engine_execlist * const el)
+{
+	return &el->port[el->port_head];
+}
+
+static inline struct execlist_port *
+execlist_port_tail(struct intel_engine_execlist * const el)
+{
+	return &el->port[__port_idx(el->port_head, el->port_mask, el->port_mask)];
+}
+
+static inline struct execlist_port *
+execlist_port_next(struct intel_engine_execlist * const el,
+		   const struct execlist_port * const port)
+{
+	const unsigned int i = port_index(port, el);
+
+	return &el->port[__port_idx(i, 1, el->port_mask)];
+}
+
+static inline struct execlist_port *
 execlist_port_complete(struct intel_engine_execlist * const el,
 		       struct execlist_port * const port)
 {
-	const unsigned int m = el->port_mask;
+	GEM_DEBUG_BUG_ON(port_index(port, el) != el->port_head);
 
-	GEM_DEBUG_BUG_ON(port_index(port, el) != 0);
+	memset(port, 0, sizeof(struct execlist_port));
+	el->port_head = __port_idx(el->port_head, 1, el->port_mask);
 
-	memmove(port, port + 1, m * sizeof(struct execlist_port));
-	memset(port + m, 0, sizeof(struct execlist_port));
+	return execlist_port_head(el);
 }
 
 void execlist_cancel_port_requests(struct intel_engine_execlist * const el);
-- 
2.11.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH 7/8] drm/i915: Move execlist initialization into intel_engine_cs.c
  2017-09-12  8:36 [PATCH 0/8] Support for more than two execlist ports Mika Kuoppala
                   ` (5 preceding siblings ...)
  2017-09-12  8:36 ` [PATCH 6/8] drm/i915: Introduce execlist_port_* accessors Mika Kuoppala
@ 2017-09-12  8:36 ` Mika Kuoppala
  2017-09-12 10:41   ` Chris Wilson
  2017-09-12  8:36 ` [PATCH 8/8] drm/i915: Keep track of reserved execlist ports Mika Kuoppala
                   ` (2 subsequent siblings)
  9 siblings, 1 reply; 30+ messages in thread
From: Mika Kuoppala @ 2017-09-12  8:36 UTC (permalink / raw)
  To: intel-gfx

Move execlist init into the common engine setup, as it is
common to both the guc and hw execlist submission paths.

Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
---
 drivers/gpu/drm/i915/intel_engine_cs.c | 18 +++++++++++++++---
 drivers/gpu/drm/i915/intel_lrc.c       |  4 ----
 2 files changed, 15 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_engine_cs.c b/drivers/gpu/drm/i915/intel_engine_cs.c
index 0fb5a1f99349..4b9eaec50070 100644
--- a/drivers/gpu/drm/i915/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/intel_engine_cs.c
@@ -380,6 +380,20 @@ static void intel_engine_init_timeline(struct intel_engine_cs *engine)
 	engine->timeline = &engine->i915->gt.global_timeline.engine[engine->id];
 }
 
+static void intel_engine_init_execlist(struct intel_engine_cs *engine)
+{
+	struct intel_engine_execlist * const el = &engine->execlist;
+
+	el->port_mask = 1;
+	BUILD_BUG_ON_NOT_POWER_OF_2(el->port_mask + 1);
+	GEM_BUG_ON(el->port_mask >= EXECLIST_MAX_PORTS);
+
+	el->port_head = 0;
+
+	el->queue = RB_ROOT;
+	el->first = NULL;
+}
+
 /**
  * intel_engines_setup_common - setup engine state not requiring hw access
  * @engine: Engine to setup.
@@ -391,9 +405,7 @@ static void intel_engine_init_timeline(struct intel_engine_cs *engine)
  */
 void intel_engine_setup_common(struct intel_engine_cs *engine)
 {
-	engine->execlist.queue = RB_ROOT;
-	engine->execlist.first = NULL;
-
+	intel_engine_init_execlist(engine);
 	intel_engine_init_timeline(engine);
 	intel_engine_init_hangcheck(engine);
 	i915_gem_batch_pool_init(engine, &engine->batch_pool);
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 0bc42a3f9528..1a1c68c53fbd 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -1763,10 +1763,6 @@ logical_ring_setup(struct intel_engine_cs *engine)
 	/* Intentionally left blank. */
 	engine->buffer = NULL;
 
-	engine->execlist.port_mask = 1;
-	BUILD_BUG_ON_NOT_POWER_OF_2(engine->execlist.port_mask + 1);
-	GEM_BUG_ON(engine->execlist.port_mask >= EXECLIST_MAX_PORTS);
-
 	fw_domains = intel_uncore_forcewake_for_reg(dev_priv,
 						    RING_ELSP(engine),
 						    FW_REG_WRITE);
-- 
2.11.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH 8/8] drm/i915: Keep track of reserved execlist ports
  2017-09-12  8:36 [PATCH 0/8] Support for more than two execlist ports Mika Kuoppala
                   ` (6 preceding siblings ...)
  2017-09-12  8:36 ` [PATCH 7/8] drm/i915: Move execlist initialization into intel_engine_cs.c Mika Kuoppala
@ 2017-09-12  8:36 ` Mika Kuoppala
  2017-09-12  9:15   ` Chris Wilson
                     ` (5 more replies)
  2017-09-12  8:59 ` ✓ Fi.CI.BAT: success for Support for more than two " Patchwork
  2017-09-12 10:48 ` ✓ Fi.CI.IGT: " Patchwork
  9 siblings, 6 replies; 30+ messages in thread
From: Mika Kuoppala @ 2017-09-12  8:36 UTC (permalink / raw)
  To: intel-gfx

To further enhance port processing, keep track of the
reserved ports. This way we can iterate over only the used subset
of the port space. Note that we lift the responsibility for
inspecting hw availability from execlists_submit_request() and
always schedule dequeuing. This ensures that only the irq
handler is responsible for keeping track of available ports.
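
As a rough illustration of the head+count bookkeeping, here is a
self-contained user-space sketch (made-up names, not the kernel code):
requesting a port reserves the slot just past the current tail, and
releasing retires the head and shrinks the active window.

  #include <assert.h>
  #include <stdio.h>

  #define PORTS 4                         /* power of two */
  #define MASK  (PORTS - 1)

  struct elist {
          int slot[PORTS];
          unsigned int head;              /* first active slot */
          unsigned int count;             /* number of active slots */
  };

  static unsigned int idx(unsigned int base, unsigned int off)
  {
          return (base + off) & MASK;
  }

  /* reserve the next free slot (the new tail) */
  static int *request_port(struct elist *el)
  {
          assert(el->count < PORTS);      /* caller checked inactive > 0 */
          return &el->slot[idx(el->head, el->count++)];
  }

  /* retire the oldest active slot (the head) */
  static void release_port(struct elist *el)
  {
          assert(el->count);
          el->slot[el->head] = 0;
          el->head = idx(el->head, 1);
          el->count--;
  }

  int main(void)
  {
          struct elist el = { .head = 0, .count = 0 };

          *request_port(&el) = 1;
          *request_port(&el) = 2;
          release_port(&el);
          printf("active=%u head value=%d\n", el.count, el.slot[el.head]);
          return 0;                       /* active=1 head value=2 */
  }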

Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
---
 drivers/gpu/drm/i915/i915_guc_submission.c | 43 ++++++++-------
 drivers/gpu/drm/i915/i915_irq.c            |  2 +-
 drivers/gpu/drm/i915/intel_engine_cs.c     |  5 +-
 drivers/gpu/drm/i915/intel_lrc.c           | 84 ++++++++++++++++++------------
 drivers/gpu/drm/i915/intel_ringbuffer.h    | 50 +++++++++++++-----
 5 files changed, 117 insertions(+), 67 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
index 0dfb03a0cee4..fdda3a1835ad 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -662,12 +662,19 @@ static void port_assign(struct execlist_port *port,
 static bool i915_guc_dequeue(struct intel_engine_cs *engine)
 {
 	struct intel_engine_execlist * const el = &engine->execlist;
-	struct execlist_port *port = execlist_port_head(el);
-	const struct execlist_port * const last_port = execlist_port_tail(el);
-	struct drm_i915_gem_request *last = port_request(port);
+	struct execlist_port *port;
+	struct drm_i915_gem_request *last;
 	struct rb_node *rb;
 	bool submit = false;
 
+	if (execlist_active_ports(el)) {
+		port = execlist_port_tail(el);
+		last = port_request(port);
+	} else {
+		port = NULL;
+		last = NULL;
+	}
+
 	spin_lock_irq(&engine->timeline->lock);
 	rb = el->first;
 	GEM_BUG_ON(rb_first(&el->queue) != rb);
@@ -675,9 +682,12 @@ static bool i915_guc_dequeue(struct intel_engine_cs *engine)
 		struct i915_priolist *p = rb_entry(rb, typeof(*p), node);
 		struct drm_i915_gem_request *rq, *rn;
 
+		if (!port)
+			port = execlist_request_port(el);
+
 		list_for_each_entry_safe(rq, rn, &p->requests, priotree.link) {
 			if (last && rq->ctx != last->ctx) {
-				if (port == last_port) {
+				if (!execlist_inactive_ports(el)) {
 					__list_del_many(&p->requests,
 							&rq->priotree.link);
 					goto done;
@@ -686,7 +696,8 @@ static bool i915_guc_dequeue(struct intel_engine_cs *engine)
 				if (submit)
 					port_assign(port, last);
 
-				port = execlist_port_next(el, port);
+				port = execlist_request_port(el);
+				GEM_BUG_ON(port_isset(port));
 			}
 
 			INIT_LIST_HEAD(&rq->priotree.link);
@@ -717,26 +728,22 @@ static void i915_guc_irq_handler(unsigned long data)
 {
 	struct intel_engine_cs * const engine = (struct intel_engine_cs *)data;
 	struct intel_engine_execlist * const el = &engine->execlist;
-	struct execlist_port *port = execlist_port_head(el);
-	const struct execlist_port * const last_port = execlist_port_tail(el);
-	struct drm_i915_gem_request *rq;
-	bool submit;
 
 	do {
-		rq = port_request(port);
-		while (rq && i915_gem_request_completed(rq)) {
+		while (execlist_active_ports(el)) {
+			struct execlist_port *port = execlist_port_head(el);
+			struct drm_i915_gem_request *rq = port_request(port);
+
+			if (!i915_gem_request_completed(rq))
+				break;
+
 			trace_i915_gem_request_out(rq);
 			i915_gem_request_put(rq);
 
-			port = execlist_port_complete(el, port);
-
-			rq = port_request(port);
+			execlist_release_port(el, port);
 		}
 
-		submit = false;
-		if (!port_count(last_port))
-			submit = i915_guc_dequeue(engine);
-	} while (submit);
+	} while (execlist_inactive_ports(el) && i915_guc_dequeue(engine));
 }
 
 /*
diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index 1919ac0b7b0f..de4e608786a8 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -1312,7 +1312,7 @@ gen8_cs_irq_handler(struct intel_engine_cs *engine, u32 iir, int test_shift)
 	bool tasklet = false;
 
 	if (iir & (GT_CONTEXT_SWITCH_INTERRUPT << test_shift)) {
-		if (port_count(execlist_port_head(el))) {
+		if (execlist_active_ports(el)) {
 			__set_bit(ENGINE_IRQ_EXECLIST, &engine->irq_posted);
 			tasklet = true;
 		}
diff --git a/drivers/gpu/drm/i915/intel_engine_cs.c b/drivers/gpu/drm/i915/intel_engine_cs.c
index 4b9eaec50070..b09212bb66b4 100644
--- a/drivers/gpu/drm/i915/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/intel_engine_cs.c
@@ -389,6 +389,7 @@ static void intel_engine_init_execlist(struct intel_engine_cs *engine)
 	GEM_BUG_ON(el->port_mask >= EXECLIST_MAX_PORTS);
 
 	el->port_head = 0;
+	el->port_count = 0;
 
 	el->queue = RB_ROOT;
 	el->first = NULL;
@@ -1345,8 +1346,8 @@ bool intel_engine_is_idle(struct intel_engine_cs *engine)
 	if (test_bit(ENGINE_IRQ_EXECLIST, &engine->irq_posted))
 		return false;
 
-	/* Both ports drained, no more ELSP submission? */
-	if (port_request(execlist_port_head(&engine->execlist)))
+	/* All ports drained, no more ELSP submission? */
+	if (execlist_active_ports(&engine->execlist))
 		return false;
 
 	/* ELSP is empty, but there are ready requests? */
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 1a1c68c53fbd..18a87cf56b81 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -344,23 +344,27 @@ static void execlists_submit_ports(struct intel_engine_cs *engine)
 	struct execlist_port *port;
 	unsigned int n;
 
+	for (n = 0; n < execlist_inactive_ports(el); n++) {
+		writel(0, elsp);
+		writel(0, elsp);
+	}
+
 	for_each_execlist_port_reverse(el, port, n) {
 		struct drm_i915_gem_request *rq;
 		unsigned int count;
 		u64 desc;
 
 		rq = port_unpack(port, &count);
-		if (rq) {
-			GEM_BUG_ON(count > !n);
-			if (!count++)
-				execlists_context_status_change(rq, INTEL_CONTEXT_SCHEDULE_IN);
-			port_set(port, port_pack(rq, count));
-			desc = execlists_update_context(rq);
-			GEM_DEBUG_EXEC(port->context_id = upper_32_bits(desc));
-		} else {
-			GEM_BUG_ON(!n);
-			desc = 0;
-		}
+
+		GEM_BUG_ON(!rq);
+		GEM_BUG_ON(count > !n);
+
+		if (!count++)
+			execlists_context_status_change(rq, INTEL_CONTEXT_SCHEDULE_IN);
+
+		port_set(port, port_pack(rq, count));
+		desc = execlists_update_context(rq);
+		GEM_DEBUG_EXEC(port->context_id = upper_32_bits(desc));
 
 		writel(upper_32_bits(desc), elsp);
 		writel(lower_32_bits(desc), elsp);
@@ -398,15 +402,16 @@ static void port_assign(struct execlist_port *port,
 
 static void execlists_dequeue(struct intel_engine_cs *engine)
 {
-	struct drm_i915_gem_request *last;
 	struct intel_engine_execlist * const el = &engine->execlist;
-	struct execlist_port *port = execlist_port_head(el);
-	const struct execlist_port * const last_port = execlist_port_tail(el);
+	struct execlist_port *port;
+	struct drm_i915_gem_request *last;
 	struct rb_node *rb;
 	bool submit = false;
 
-	last = port_request(port);
-	if (last)
+	if (execlist_active_ports(el)) {
+		port = execlist_port_tail(el);
+		last = port_request(port);
+
 		/* WaIdleLiteRestore:bdw,skl
 		 * Apply the wa NOOPs to prevent ring:HEAD == req:TAIL
 		 * as we resubmit the request. See gen8_emit_breadcrumb()
@@ -414,6 +419,10 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
 		 * request.
 		 */
 		last->tail = last->wa_tail;
+	} else {
+		port = NULL;
+		last = NULL;
+	}
 
 	/* Hardware submission is through 2 ports. Conceptually each port
 	 * has a (RING_START, RING_HEAD, RING_TAIL) tuple. RING_START is
@@ -443,6 +452,9 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
 		struct i915_priolist *p = rb_entry(rb, typeof(*p), node);
 		struct drm_i915_gem_request *rq, *rn;
 
+		if (!port)
+			port = execlist_request_port(el);
+
 		list_for_each_entry_safe(rq, rn, &p->requests, priotree.link) {
 			/*
 			 * Can we combine this request with the current port?
@@ -461,7 +473,7 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
 				 * combine this request with the last, then we
 				 * are done.
 				 */
-				if (port == last_port) {
+				if (!execlist_inactive_ports(el)) {
 					__list_del_many(&p->requests,
 							&rq->priotree.link);
 					goto done;
@@ -486,8 +498,7 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
 				if (submit)
 					port_assign(port, last);
 
-				port = execlist_port_next(el, port);
-
+				port = execlist_request_port(el);
 				GEM_BUG_ON(port_isset(port));
 			}
 
@@ -518,9 +529,12 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
 
 static bool execlists_elsp_ready(struct intel_engine_execlist * const el)
 {
-	struct execlist_port * const port = execlist_port_head(el);
+	const unsigned int active = execlist_active_ports(el);
+
+	if (!active)
+		return true;
 
-	return port_count(port) + port_count(execlist_port_next(el, port)) < 2;
+	return port_count(execlist_port_tail(el)) + active < 2;
 }
 
 /*
@@ -531,7 +545,6 @@ static void intel_lrc_irq_handler(unsigned long data)
 {
 	struct intel_engine_cs * const engine = (struct intel_engine_cs *)data;
 	struct intel_engine_execlist * const el = &engine->execlist;
-	struct execlist_port *port = execlist_port_head(el);
 	struct drm_i915_private *dev_priv = engine->i915;
 
 	/* We can skip acquiring intel_runtime_pm_get() here as it was taken
@@ -571,6 +584,7 @@ static void intel_lrc_irq_handler(unsigned long data)
 		tail = GEN8_CSB_WRITE_PTR(head);
 		head = GEN8_CSB_READ_PTR(head);
 		while (head != tail) {
+			struct execlist_port *port;
 			struct drm_i915_gem_request *rq;
 			unsigned int status;
 			unsigned int count;
@@ -599,6 +613,7 @@ static void intel_lrc_irq_handler(unsigned long data)
 			if (!(status & GEN8_CTX_STATUS_COMPLETED_MASK))
 				continue;
 
+			port = execlist_port_head(el);
 			/* Check the context/desc id for this event matches */
 			GEM_DEBUG_BUG_ON(readl(buf + 2 * head + 1) !=
 					 port->context_id);
@@ -613,13 +628,13 @@ static void intel_lrc_irq_handler(unsigned long data)
 				trace_i915_gem_request_out(rq);
 				i915_gem_request_put(rq);
 
-				port = execlist_port_complete(el, port);
+				execlist_release_port(el, port);
 			} else {
 				port_set(port, port_pack(rq, count));
 			}
 
 			/* After the final element, the hw should be idle */
-			GEM_BUG_ON(port_count(port) == 0 &&
+			GEM_BUG_ON(execlist_active_ports(el) == 0 &&
 				   !(status & GEN8_CTX_STATUS_ACTIVE_IDLE));
 		}
 
@@ -709,10 +724,8 @@ static void execlists_submit_request(struct drm_i915_gem_request *request)
 
 	if (insert_request(engine,
 			   &request->priotree,
-			   request->priotree.priority)) {
-		if (execlists_elsp_ready(el))
-			tasklet_hi_schedule(&el->irq_tasklet);
-	}
+			   request->priotree.priority))
+		tasklet_hi_schedule(&el->irq_tasklet);
 
 	GEM_BUG_ON(!el->first);
 	GEM_BUG_ON(list_empty(&request->priotree.link));
@@ -1334,7 +1347,6 @@ static void reset_common_ring(struct intel_engine_cs *engine,
 			      struct drm_i915_gem_request *request)
 {
 	struct intel_engine_execlist * const el = &engine->execlist;
-	struct execlist_port *port = execlist_port_head(el);
 	struct intel_context *ce;
 
 	/*
@@ -1351,12 +1363,16 @@ static void reset_common_ring(struct intel_engine_cs *engine,
 		return;
 	}
 
-	if (request->ctx != port_request(port)->ctx) {
-		i915_gem_request_put(port_request(port));
-		execlist_port_complete(el, port);
-	}
+	if (execlist_active_ports(el)) {
+		struct execlist_port *port = execlist_port_head(el);
 
-	GEM_BUG_ON(request->ctx != port_request(port)->ctx);
+		if (request->ctx != port_request(port)->ctx) {
+			i915_gem_request_put(port_request(port));
+			execlist_release_port(el, port);
+		}
+
+		GEM_BUG_ON(request->ctx != port_request(port)->ctx);
+	}
 
 	/* If the request was innocent, we leave the request in the ELSP
 	 * and will try to replay it on restarting. The context image may
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index 13d8e1115f3b..9c377f6eb8fe 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -249,6 +249,11 @@ struct intel_engine_execlist {
 	unsigned int port_head;
 
 	/**
+	 * @port_count: reserved ports
+	 */
+	unsigned int port_count;
+
+	/**
 	 * @queue: queue of requests, in priority lists
 	 */
 	struct rb_root queue;
@@ -510,44 +515,65 @@ struct intel_engine_cs {
 
 #define for_each_execlist_port(el__, port__, n__) \
 	for ((n__) = 0; \
-	     (port__) = &(el__)->port[__port_idx((el__)->port_head, (n__), (el__)->port_mask)], (n__) < (el__)->port_mask + 1; \
+	     (port__) = &(el__)->port[__port_idx((el__)->port_head, (n__), (el__)->port_mask)], (n__) < (el__)->port_count; \
 	     (n__)++)
 
 #define for_each_execlist_port_reverse(el__, port__, n__) \
-	for ((n__) = (el__)->port_mask + 1; \
+	for ((n__) = (el__)->port_count; \
 	     (port__) = &(el__)->port[__port_idx((el__)->port_head - 1, (n__), (el__)->port_mask)], (n__)--;)
 
 static inline struct execlist_port *
 execlist_port_head(struct intel_engine_execlist * const el)
 {
+	GEM_DEBUG_BUG_ON(!el->port_count);
+
 	return &el->port[el->port_head];
 }
 
 static inline struct execlist_port *
 execlist_port_tail(struct intel_engine_execlist * const el)
 {
-	return &el->port[__port_idx(el->port_head, el->port_mask, el->port_mask)];
+	GEM_DEBUG_BUG_ON(!el->port_count);
+
+	return &el->port[__port_idx(el->port_head, el->port_count - 1, el->port_mask)];
 }
 
-static inline struct execlist_port *
-execlist_port_next(struct intel_engine_execlist * const el,
-		   const struct execlist_port * const port)
+static inline unsigned int
+execlist_active_ports(const struct intel_engine_execlist * const el)
 {
-	const unsigned int i = port_index(port, el);
+	return READ_ONCE(el->port_count);
+}
 
-	return &el->port[__port_idx(i, 1, el->port_mask)];
+static inline unsigned int
+execlist_inactive_ports(const struct intel_engine_execlist * const el)
+{
+	return el->port_mask + 1 - READ_ONCE(el->port_count);
 }
 
 static inline struct execlist_port *
-execlist_port_complete(struct intel_engine_execlist * const el,
-		       struct execlist_port * const port)
+execlist_request_port(struct intel_engine_execlist * const el)
+{
+	GEM_DEBUG_BUG_ON(el->port_count == el->port_mask + 1);
+
+	el->port_count++;
+
+	GEM_DEBUG_BUG_ON(port_isset(execlist_port_tail(el)));
+
+	return execlist_port_tail(el);
+}
+
+static inline void
+execlist_release_port(struct intel_engine_execlist * const el,
+		      struct execlist_port * const port)
 {
 	GEM_DEBUG_BUG_ON(port_index(port, el) != el->port_head);
+	GEM_DEBUG_BUG_ON(!port_isset(port));
+	GEM_DEBUG_BUG_ON(!el->port_count);
 
 	memset(port, 0, sizeof(struct execlist_port));
-	el->port_head = __port_idx(el->port_head, 1, el->port_mask);
 
-	return execlist_port_head(el);
+	el->port_head = __port_idx(el->port_head, 1, el->port_mask);
+	el->port_count--;
 }
 
 void execlist_cancel_port_requests(struct intel_engine_execlist * const el);
-- 
2.11.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* Re: [PATCH 4/8] drm/i915: Make execlist port count variable
  2017-09-12  8:36 ` [PATCH 4/8] drm/i915: Make execlist port count variable Mika Kuoppala
@ 2017-09-12  8:55   ` Chris Wilson
  2017-09-12 10:24   ` Chris Wilson
  1 sibling, 0 replies; 30+ messages in thread
From: Chris Wilson @ 2017-09-12  8:55 UTC (permalink / raw)
  To: Mika Kuoppala, intel-gfx

Quoting Mika Kuoppala (2017-09-12 09:36:14)
>  static inline void
>  execlist_port_complete(struct intel_engine_execlist * const el,
>                        struct execlist_port * const port)
>  {
> -       struct execlist_port * const port1 = &el->port[1];
> +       const unsigned int m = el->port_mask;
>  
>         GEM_DEBUG_BUG_ON(port_index(port, el) != 0);
>  
> -       *port = *port1;
> -       memset(port1, 0, sizeof(struct execlist_port));
> +       memmove(port, port + 1, m * sizeof(struct execlist_port));
> +       memset(port + m, 0, sizeof(struct execlist_port));

Ah, I was going to suggest you keep the port[0] = port[1] in the earlier
patch, then remembered there will be magic later on. But if we go through
this step, we may as well not cut out the meander.
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 30+ messages in thread

* ✓ Fi.CI.BAT: success for Support for more than two execlist ports
  2017-09-12  8:36 [PATCH 0/8] Support for more than two execlist ports Mika Kuoppala
                   ` (7 preceding siblings ...)
  2017-09-12  8:36 ` [PATCH 8/8] drm/i915: Keep track of reserved execlist ports Mika Kuoppala
@ 2017-09-12  8:59 ` Patchwork
  2017-09-12 10:48 ` ✓ Fi.CI.IGT: " Patchwork
  9 siblings, 0 replies; 30+ messages in thread
From: Patchwork @ 2017-09-12  8:59 UTC (permalink / raw)
  To: Mika Kuoppala; +Cc: intel-gfx

== Series Details ==

Series: Support for more than two execlist ports
URL   : https://patchwork.freedesktop.org/series/30183/
State : success

== Summary ==

Series 30183v1 Support for more than two execlist ports
https://patchwork.freedesktop.org/api/1.0/series/30183/revisions/1/mbox/

Test chamelium:
        Subgroup dp-edid-read:
                fail       -> PASS       (fi-kbl-7500u) fdo#102672
Test gem_ringfill:
        Subgroup basic-default-hang:
                incomplete -> DMESG-WARN (fi-pnv-d510) fdo#101600
Test kms_frontbuffer_tracking:
        Subgroup basic:
                dmesg-warn -> PASS       (fi-bdw-5557u) fdo#102473

fdo#102672 https://bugs.freedesktop.org/show_bug.cgi?id=102672
fdo#101600 https://bugs.freedesktop.org/show_bug.cgi?id=101600
fdo#102473 https://bugs.freedesktop.org/show_bug.cgi?id=102473

fi-bdw-5557u     total:289  pass:268  dwarn:0   dfail:0   fail:0   skip:21  time:444s
fi-bdw-gvtdvm    total:289  pass:265  dwarn:0   dfail:0   fail:0   skip:24  time:454s
fi-blb-e6850     total:289  pass:224  dwarn:1   dfail:0   fail:0   skip:64  time:380s
fi-bsw-n3050     total:289  pass:243  dwarn:0   dfail:0   fail:0   skip:46  time:538s
fi-bwr-2160      total:289  pass:184  dwarn:0   dfail:0   fail:0   skip:105 time:269s
fi-bxt-j4205     total:289  pass:260  dwarn:0   dfail:0   fail:0   skip:29  time:512s
fi-byt-j1900     total:289  pass:254  dwarn:1   dfail:0   fail:0   skip:34  time:502s
fi-byt-n2820     total:289  pass:250  dwarn:1   dfail:0   fail:0   skip:38  time:500s
fi-cfl-s         total:289  pass:250  dwarn:4   dfail:0   fail:0   skip:35  time:452s
fi-elk-e7500     total:289  pass:230  dwarn:0   dfail:0   fail:0   skip:59  time:447s
fi-glk-2a        total:289  pass:260  dwarn:0   dfail:0   fail:0   skip:29  time:594s
fi-hsw-4770      total:289  pass:263  dwarn:0   dfail:0   fail:0   skip:26  time:430s
fi-hsw-4770r     total:289  pass:263  dwarn:0   dfail:0   fail:0   skip:26  time:408s
fi-ilk-650       total:289  pass:229  dwarn:0   dfail:0   fail:0   skip:60  time:437s
fi-ivb-3520m     total:289  pass:261  dwarn:0   dfail:0   fail:0   skip:28  time:491s
fi-ivb-3770      total:289  pass:261  dwarn:0   dfail:0   fail:0   skip:28  time:463s
fi-kbl-7500u     total:289  pass:264  dwarn:1   dfail:0   fail:0   skip:24  time:486s
fi-kbl-7560u     total:289  pass:270  dwarn:0   dfail:0   fail:0   skip:19  time:576s
fi-kbl-r         total:289  pass:262  dwarn:0   dfail:0   fail:0   skip:27  time:586s
fi-pnv-d510      total:289  pass:223  dwarn:1   dfail:0   fail:0   skip:65  time:549s
fi-skl-6260u     total:289  pass:269  dwarn:0   dfail:0   fail:0   skip:20  time:460s
fi-skl-6700k     total:289  pass:265  dwarn:0   dfail:0   fail:0   skip:24  time:520s
fi-skl-6770hq    total:289  pass:269  dwarn:0   dfail:0   fail:0   skip:20  time:504s
fi-skl-gvtdvm    total:289  pass:266  dwarn:0   dfail:0   fail:0   skip:23  time:458s
fi-skl-x1585l    total:289  pass:268  dwarn:0   dfail:0   fail:0   skip:21  time:479s
fi-snb-2520m     total:289  pass:251  dwarn:0   dfail:0   fail:0   skip:38  time:564s
fi-snb-2600      total:289  pass:249  dwarn:0   dfail:0   fail:1   skip:39  time:421s

3cbfafbde5bb92b6ecdbfd47e189d2742aa6fbec drm-tip: 2017y-09m-11d-23h-04m-21s UTC integration manifest
92a4d479b91f drm/i915: Keep track of reserved execlist ports
68c571c9f0fb drm/i915: Move execlist initialization into intel_engine_cs.c
f960067d8174 drm/i915: Introduce execlist_port_* accessors
8378cd04c0e3 drm/i915: Introduce iterators for execlist ports
f54ccbc16ae8 drm/i915: Make execlist port count variable
3069386bf94e drm/i915: Add execlist_port_complete
ddfca3322990 drm/i915: Wrap port cancellation into a function
809bf696eded drm/i915: Make own struct for execlist items

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_5653/
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 8/8] drm/i915: Keep track of reserved execlist ports
  2017-09-12  8:36 ` [PATCH 8/8] drm/i915: Keep track of reserved execlist ports Mika Kuoppala
@ 2017-09-12  9:15   ` Chris Wilson
  2017-09-12 10:27     ` Mika Kuoppala
  2017-09-12  9:17   ` Chris Wilson
                     ` (4 subsequent siblings)
  5 siblings, 1 reply; 30+ messages in thread
From: Chris Wilson @ 2017-09-12  9:15 UTC (permalink / raw)
  To: Mika Kuoppala, intel-gfx

Quoting Mika Kuoppala (2017-09-12 09:36:18)
> +execlist_request_port(struct intel_engine_execlist * const el)
> +{
> +       GEM_DEBUG_BUG_ON(el->port_count == el->port_mask + 1);
> +
> +       el->port_count++;
> +
> +       GEM_DEBUG_BUG_ON(port_isset(execlist_port_tail(el)));
> +
> +       return execlist_port_tail(el);
> +}
> 
> diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
> index 0dfb03a0cee4..fdda3a1835ad 100644
> --- a/drivers/gpu/drm/i915/i915_guc_submission.c
> +++ b/drivers/gpu/drm/i915/i915_guc_submission.c
> @@ -662,12 +662,19 @@ static void port_assign(struct execlist_port *port,
>  static bool i915_guc_dequeue(struct intel_engine_cs *engine)
>  {
>         struct intel_engine_execlist * const el = &engine->execlist;
> -       struct execlist_port *port = execlist_port_head(el);
> -       const struct execlist_port * const last_port = execlist_port_tail(el);
> -       struct drm_i915_gem_request *last = port_request(port);
> +       struct execlist_port *port;
> +       struct drm_i915_gem_request *last;
>         struct rb_node *rb;
>         bool submit = false;
>  
> +       if (execlist_active_ports(el)) {
> +               port = execlist_port_tail(el);
> +               last = port_request(port);
> +       } else {
> +               port = NULL;
> +               last = NULL;
> +       }
> +
>         spin_lock_irq(&engine->timeline->lock);
>         rb = el->first;
>         GEM_BUG_ON(rb_first(&el->queue) != rb);
> @@ -675,9 +682,12 @@ static bool i915_guc_dequeue(struct intel_engine_cs *engine)
>                 struct i915_priolist *p = rb_entry(rb, typeof(*p), node);
>                 struct drm_i915_gem_request *rq, *rn;
>  
> +               if (!port)
> +                       port = execlist_request_port(el);
> +

I can't see port becoming NULL inside the loop, so why can't we do it
during the init above? What did I miss?
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 8/8] drm/i915: Keep track of reserved execlist ports
  2017-09-12  8:36 ` [PATCH 8/8] drm/i915: Keep track of reserved execlist ports Mika Kuoppala
  2017-09-12  9:15   ` Chris Wilson
@ 2017-09-12  9:17   ` Chris Wilson
  2017-09-12  9:19   ` Chris Wilson
                     ` (3 subsequent siblings)
  5 siblings, 0 replies; 30+ messages in thread
From: Chris Wilson @ 2017-09-12  9:17 UTC (permalink / raw)
  To: Mika Kuoppala, intel-gfx

Quoting Mika Kuoppala (2017-09-12 09:36:18)
>  static void execlists_dequeue(struct intel_engine_cs *engine)
>  {
> -       struct drm_i915_gem_request *last;
>         struct intel_engine_execlist * const el = &engine->execlist;
> -       struct execlist_port *port = execlist_port_head(el);
> -       const struct execlist_port * const last_port = execlist_port_tail(el);
> +       struct execlist_port *port;
> +       struct drm_i915_gem_request *last;
>         struct rb_node *rb;
>         bool submit = false;
>  
> -       last = port_request(port);
> -       if (last)
> +       if (execlist_active_ports(el)) {
> +               port = execlist_port_tail(el);
> +               last = port_request(port);
> +
>                 /* WaIdleLiteRestore:bdw,skl
>                  * Apply the wa NOOPs to prevent ring:HEAD == req:TAIL
>                  * as we resubmit the request. See gen8_emit_breadcrumb()
> @@ -414,6 +419,10 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
>                  * request.
>                  */
>                 last->tail = last->wa_tail;
> +       } else {
> +               port = NULL;
> +               last = NULL;
> +       }
>  
>         /* Hardware submission is through 2 ports. Conceptually each port
>          * has a (RING_START, RING_HEAD, RING_TAIL) tuple. RING_START is
> @@ -443,6 +452,9 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
>                 struct i915_priolist *p = rb_entry(rb, typeof(*p), node);
>                 struct drm_i915_gem_request *rq, *rn;
>  
> +               if (!port)
> +                       port = execlist_request_port(el);
> +
>                 list_for_each_entry_safe(rq, rn, &p->requests, priotree.link) {
>                         /*
>                          * Can we combine this request with the current port?
> @@ -461,7 +473,7 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
>                                  * combine this request with the last, then we
>                                  * are done.
>                                  */

The comment above is now confused, so
/*
 * If we are on the last port and cannot combine
 * this request with the previous, then we are
 * done.
 */
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 8/8] drm/i915: Keep track of reserved execlist ports
  2017-09-12  8:36 ` [PATCH 8/8] drm/i915: Keep track of reserved execlist ports Mika Kuoppala
  2017-09-12  9:15   ` Chris Wilson
  2017-09-12  9:17   ` Chris Wilson
@ 2017-09-12  9:19   ` Chris Wilson
  2017-09-12  9:23   ` Chris Wilson
                     ` (2 subsequent siblings)
  5 siblings, 0 replies; 30+ messages in thread
From: Chris Wilson @ 2017-09-12  9:19 UTC (permalink / raw)
  To: Mika Kuoppala, intel-gfx

Quoting Mika Kuoppala (2017-09-12 09:36:18)
> To further enhance port processing, keep track of the
> reserved ports. This way we can iterate over only the used subset
> of the port space. Note that we lift the responsibility for
> inspecting hw availability from execlists_submit_request() and
> always schedule dequeuing. This ensures that only the irq
> handler is responsible for keeping track of available ports.
> 
> Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
> ---
>  drivers/gpu/drm/i915/i915_guc_submission.c | 43 ++++++++-------
>  drivers/gpu/drm/i915/i915_irq.c            |  2 +-
>  drivers/gpu/drm/i915/intel_engine_cs.c     |  5 +-
>  drivers/gpu/drm/i915/intel_lrc.c           | 84 ++++++++++++++++++------------
>  drivers/gpu/drm/i915/intel_ringbuffer.h    | 50 +++++++++++++-----
>  5 files changed, 117 insertions(+), 67 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
> index 0dfb03a0cee4..fdda3a1835ad 100644
> --- a/drivers/gpu/drm/i915/i915_guc_submission.c
> +++ b/drivers/gpu/drm/i915/i915_guc_submission.c
> @@ -662,12 +662,19 @@ static void port_assign(struct execlist_port *port,
>  static bool i915_guc_dequeue(struct intel_engine_cs *engine)
>  {
>         struct intel_engine_execlist * const el = &engine->execlist;
> -       struct execlist_port *port = execlist_port_head(el);
> -       const struct execlist_port * const last_port = execlist_port_tail(el);
> -       struct drm_i915_gem_request *last = port_request(port);
> +       struct execlist_port *port;
> +       struct drm_i915_gem_request *last;
>         struct rb_node *rb;
>         bool submit = false;
>  
> +       if (execlist_active_ports(el)) {
> +               port = execlist_port_tail(el);
> +               last = port_request(port);
> +       } else {
> +               port = NULL;
> +               last = NULL;
> +       }
> +
>         spin_lock_irq(&engine->timeline->lock);
>         rb = el->first;
>         GEM_BUG_ON(rb_first(&el->queue) != rb);
> @@ -675,9 +682,12 @@ static bool i915_guc_dequeue(struct intel_engine_cs *engine)
>                 struct i915_priolist *p = rb_entry(rb, typeof(*p), node);
>                 struct drm_i915_gem_request *rq, *rn;
>  
> +               if (!port)
> +                       port = execlist_request_port(el);
> +
>                 list_for_each_entry_safe(rq, rn, &p->requests, priotree.link) {
>                         if (last && rq->ctx != last->ctx) {
> -                               if (port == last_port) {
> +                               if (!execlist_inactive_ports(el)) {
>                                         __list_del_many(&p->requests,
>                                                         &rq->priotree.link);
>                                         goto done;
> @@ -686,7 +696,8 @@ static bool i915_guc_dequeue(struct intel_engine_cs *engine)
>                                 if (submit)
>                                         port_assign(port, last);
>  
> -                               port = execlist_port_next(el, port);
> +                               port = execlist_request_port(el);
> +                               GEM_BUG_ON(port_isset(port));
>                         }
>  
>                         INIT_LIST_HEAD(&rq->priotree.link);
> @@ -717,26 +728,22 @@ static void i915_guc_irq_handler(unsigned long data)
>  {
>         struct intel_engine_cs * const engine = (struct intel_engine_cs *)data;
>         struct intel_engine_execlist * const el = &engine->execlist;
> -       struct execlist_port *port = execlist_port_head(el);
> -       const struct execlist_port * const last_port = execlist_port_tail(el);
> -       struct drm_i915_gem_request *rq;
> -       bool submit;
>  
>         do {
> -               rq = port_request(port);
> -               while (rq && i915_gem_request_completed(rq)) {
> +               while (execlist_active_ports(el)) {
> +                       struct execlist_port *port = execlist_port_head(el);
> +                       struct drm_i915_gem_request *rq = port_request(port);
> +
> +                       if (!i915_gem_request_completed(rq))
> +                               break;
> +
>                         trace_i915_gem_request_out(rq);
>                         i915_gem_request_put(rq);
>  
> -                       port = execlist_port_complete(el, port);
> -
> -                       rq = port_request(port);
> +                       execlist_release_port(el, port);
>                 }
>  
> -               submit = false;
> -               if (!port_count(last_port))
> -                       submit = i915_guc_dequeue(engine);
> -       } while (submit);
> +       } while (execlist_inactive_ports(el) && i915_guc_dequeue(engine));
>  }
>  
>  /*
> diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
> index 1919ac0b7b0f..de4e608786a8 100644
> --- a/drivers/gpu/drm/i915/i915_irq.c
> +++ b/drivers/gpu/drm/i915/i915_irq.c
> @@ -1312,7 +1312,7 @@ gen8_cs_irq_handler(struct intel_engine_cs *engine, u32 iir, int test_shift)
>         bool tasklet = false;
>  
>         if (iir & (GT_CONTEXT_SWITCH_INTERRUPT << test_shift)) {
> -               if (port_count(execlist_port_head(el))) {
> +               if (execlist_active_ports(el)) {
>                         __set_bit(ENGINE_IRQ_EXECLIST, &engine->irq_posted);
>                         tasklet = true;
>                 }
> diff --git a/drivers/gpu/drm/i915/intel_engine_cs.c b/drivers/gpu/drm/i915/intel_engine_cs.c
> index 4b9eaec50070..b09212bb66b4 100644
> --- a/drivers/gpu/drm/i915/intel_engine_cs.c
> +++ b/drivers/gpu/drm/i915/intel_engine_cs.c
> @@ -389,6 +389,7 @@ static void intel_engine_init_execlist(struct intel_engine_cs *engine)
>         GEM_BUG_ON(el->port_mask >= EXECLIST_MAX_PORTS);
>  
>         el->port_head = 0;
> +       el->port_count = 0;
>  
>         el->queue = RB_ROOT;
>         el->first = NULL;
> @@ -1345,8 +1346,8 @@ bool intel_engine_is_idle(struct intel_engine_cs *engine)
>         if (test_bit(ENGINE_IRQ_EXECLIST, &engine->irq_posted))
>                 return false;
>  
> -       /* Both ports drained, no more ELSP submission? */
> -       if (port_request(execlist_port_head(&engine->execlist)))
> +       /* All ports drained, no more ELSP submission? */
> +       if (execlist_active_ports(&engine->execlist))
>                 return false;
>  
>         /* ELSP is empty, but there are ready requests? */
> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
> index 1a1c68c53fbd..18a87cf56b81 100644
> --- a/drivers/gpu/drm/i915/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/intel_lrc.c
> @@ -344,23 +344,27 @@ static void execlists_submit_ports(struct intel_engine_cs *engine)
>         struct execlist_port *port;
>         unsigned int n;
>  
> +       for (n = 0; n < execlist_inactive_ports(el); n++) {
> +               writel(0, elsp);
> +               writel(0, elsp);
> +       }
> +
>         for_each_execlist_port_reverse(el, port, n) {
>                 struct drm_i915_gem_request *rq;
>                 unsigned int count;
>                 u64 desc;
>  
>                 rq = port_unpack(port, &count);
> -               if (rq) {
> -                       GEM_BUG_ON(count > !n);
> -                       if (!count++)
> -                               execlists_context_status_change(rq, INTEL_CONTEXT_SCHEDULE_IN);
> -                       port_set(port, port_pack(rq, count));
> -                       desc = execlists_update_context(rq);
> -                       GEM_DEBUG_EXEC(port->context_id = upper_32_bits(desc));
> -               } else {
> -                       GEM_BUG_ON(!n);
> -                       desc = 0;
> -               }
> +
> +               GEM_BUG_ON(!rq);
> +               GEM_BUG_ON(count > !n);
> +
> +               if (!count++)
> +                       execlists_context_status_change(rq, INTEL_CONTEXT_SCHEDULE_IN);
> +
> +               port_set(port, port_pack(rq, count));
> +               desc = execlists_update_context(rq);
> +               GEM_DEBUG_EXEC(port->context_id = upper_32_bits(desc));
>  
>                 writel(upper_32_bits(desc), elsp);
>                 writel(lower_32_bits(desc), elsp);
> @@ -398,15 +402,16 @@ static void port_assign(struct execlist_port *port,
>  
>  static void execlists_dequeue(struct intel_engine_cs *engine)
>  {
> -       struct drm_i915_gem_request *last;
>         struct intel_engine_execlist * const el = &engine->execlist;
> -       struct execlist_port *port = execlist_port_head(el);
> -       const struct execlist_port * const last_port = execlist_port_tail(el);
> +       struct execlist_port *port;
> +       struct drm_i915_gem_request *last;
>         struct rb_node *rb;
>         bool submit = false;
>  
> -       last = port_request(port);
> -       if (last)
> +       if (execlist_active_ports(el)) {
> +               port = execlist_port_tail(el);
> +               last = port_request(port);
> +
>                 /* WaIdleLiteRestore:bdw,skl
>                  * Apply the wa NOOPs to prevent ring:HEAD == req:TAIL
>                  * as we resubmit the request. See gen8_emit_breadcrumb()
> @@ -414,6 +419,10 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
>                  * request.
>                  */
>                 last->tail = last->wa_tail;
> +       } else {
> +               port = NULL;
> +               last = NULL;
> +       }
>  
>         /* Hardware submission is through 2 ports. Conceptually each port
>          * has a (RING_START, RING_HEAD, RING_TAIL) tuple. RING_START is
> @@ -443,6 +452,9 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
>                 struct i915_priolist *p = rb_entry(rb, typeof(*p), node);
>                 struct drm_i915_gem_request *rq, *rn;
>  
> +               if (!port)
> +                       port = execlist_request_port(el);
> +
>                 list_for_each_entry_safe(rq, rn, &p->requests, priotree.link) {
>                         /*
>                          * Can we combine this request with the current port?
> @@ -461,7 +473,7 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
>                                  * combine this request with the last, then we
>                                  * are done.
>                                  */
> -                               if (port == last_port) {
> +                               if (!execlist_inactive_ports(el)) {
>                                         __list_del_many(&p->requests,
>                                                         &rq->priotree.link);
>                                         goto done;
> @@ -486,8 +498,7 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
>                                 if (submit)
>                                         port_assign(port, last);
>  
> -                               port = execlist_port_next(el, port);
> -
> +                               port = execlist_request_port(el);
>                                 GEM_BUG_ON(port_isset(port));
>                         }
>  
> @@ -518,9 +529,12 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
>  
>  static bool execlists_elsp_ready(struct intel_engine_execlist * const el)
>  {
> -       struct execlist_port * const port = execlist_port_head(el);
> +       const unsigned int active = execlist_active_ports(el);
> +
> +       if (!active)
> +               return true;
>  
> -       return port_count(port) + port_count(execlist_port_next(el, port)) < 2;
> +       return port_count(execlist_port_tail(el)) + active < 2;
>  }
>  
>  /*
> @@ -531,7 +545,6 @@ static void intel_lrc_irq_handler(unsigned long data)
>  {
>         struct intel_engine_cs * const engine = (struct intel_engine_cs *)data;
>         struct intel_engine_execlist * const el = &engine->execlist;
> -       struct execlist_port *port = execlist_port_head(el);
>         struct drm_i915_private *dev_priv = engine->i915;
>  
>         /* We can skip acquiring intel_runtime_pm_get() here as it was taken
> @@ -571,6 +584,7 @@ static void intel_lrc_irq_handler(unsigned long data)
>                 tail = GEN8_CSB_WRITE_PTR(head);
>                 head = GEN8_CSB_READ_PTR(head);
>                 while (head != tail) {
> +                       struct execlist_port *port;
>                         struct drm_i915_gem_request *rq;
>                         unsigned int status;
>                         unsigned int count;
> @@ -599,6 +613,7 @@ static void intel_lrc_irq_handler(unsigned long data)
>                         if (!(status & GEN8_CTX_STATUS_COMPLETED_MASK))
>                                 continue;
>  
> +                       port = execlist_port_head(el);
>                         /* Check the context/desc id for this event matches */
>                         GEM_DEBUG_BUG_ON(readl(buf + 2 * head + 1) !=
>                                          port->context_id);
> @@ -613,13 +628,13 @@ static void intel_lrc_irq_handler(unsigned long data)
>                                 trace_i915_gem_request_out(rq);
>                                 i915_gem_request_put(rq);
>  
> -                               port = execlist_port_complete(el, port);
> +                               execlist_release_port(el, port);
>                         } else {
>                                 port_set(port, port_pack(rq, count));
>                         }
>  
>                         /* After the final element, the hw should be idle */
> -                       GEM_BUG_ON(port_count(port) == 0 &&
> +                       GEM_BUG_ON(execlist_active_ports(el) == 0 &&
>                                    !(status & GEN8_CTX_STATUS_ACTIVE_IDLE));
>                 }
>  
> @@ -709,10 +724,8 @@ static void execlists_submit_request(struct drm_i915_gem_request *request)
>  
>         if (insert_request(engine,
>                            &request->priotree,
> -                          request->priotree.priority)) {
> -               if (execlists_elsp_ready(el))
> -                       tasklet_hi_schedule(&el->irq_tasklet);
> -       }
> +                          request->priotree.priority))
> +               tasklet_hi_schedule(&el->irq_tasklet);
>  
>         GEM_BUG_ON(!el->first);
>         GEM_BUG_ON(list_empty(&request->priotree.link));
> @@ -1334,7 +1347,6 @@ static void reset_common_ring(struct intel_engine_cs *engine,
>                               struct drm_i915_gem_request *request)
>  {
>         struct intel_engine_execlist * const el = &engine->execlist;
> -       struct execlist_port *port = execlist_port_head(el);
>         struct intel_context *ce;
>  
>         /*
> @@ -1351,12 +1363,16 @@ static void reset_common_ring(struct intel_engine_cs *engine,
>                 return;
>         }
>  
> -       if (request->ctx != port_request(port)->ctx) {
> -               i915_gem_request_put(port_request(port));
> -               execlist_port_complete(el, port);
> -       }
> +       if (execlist_active_ports(el)) {
> +               struct execlist_port *port = execlist_port_head(el);
>  
> -       GEM_BUG_ON(request->ctx != port_request(port)->ctx);
> +               if (request->ctx != port_request(port)->ctx) {
> +                       i915_gem_request_put(port_request(port));
> +                       execlist_release_port(el, port);
> +               }
> +
> +               GEM_BUG_ON(request->ctx != port_request(port)->ctx);
> +       }
>  
>         /* If the request was innocent, we leave the request in the ELSP
>          * and will try to replay it on restarting. The context image may
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
> index 13d8e1115f3b..9c377f6eb8fe 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.h
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
> @@ -249,6 +249,11 @@ struct intel_engine_execlist {
>         unsigned int port_head;
>  
>         /**
> +        * @port_count: reserved ports
> +        */
> +       unsigned int port_count;
> +
> +       /**
>          * @queue: queue of requests, in priority lists
>          */
>         struct rb_root queue;
> @@ -510,44 +515,65 @@ struct intel_engine_cs {
>  
>  #define for_each_execlist_port(el__, port__, n__) \
>         for ((n__) = 0; \
> -            (port__) = &(el__)->port[__port_idx((el__)->port_head, (n__), (el__)->port_mask)], (n__) < (el__)->port_mask + 1; \
> +            (port__) = &(el__)->port[__port_idx((el__)->port_head, (n__), (el__)->port_mask)], (n__) < (el__)->port_count; \
>              (n__)++)
>  
>  #define for_each_execlist_port_reverse(el__, port__, n__) \
> -       for ((n__) = (el__)->port_mask + 1; \
> +       for ((n__) = (el__)->port_count; \
>              (port__) = &(el__)->port[__port_idx((el__)->port_head - 1, (n__), (el__)->port_mask)], (n__)--;)
>  
>  static inline struct execlist_port *
>  execlist_port_head(struct intel_engine_execlist * const el)
>  {
> +       GEM_DEBUG_BUG_ON(!el->port_count);
> +
>         return &el->port[el->port_head];
>  }
>  
>  static inline struct execlist_port *
>  execlist_port_tail(struct intel_engine_execlist * const el)
>  {
> -       return &el->port[__port_idx(el->port_head, el->port_mask, el->port_mask)];
> +       GEM_DEBUG_BUG_ON(!el->port_count);
> +
> +       return &el->port[__port_idx(el->port_head, el->port_count - 1, el->port_mask)];
>  }
>  
> -static inline struct execlist_port *
> -execlist_port_next(struct intel_engine_execlist * const el,
> -                  const struct execlist_port * const port)
> +static inline unsigned int
> +execlist_active_ports(const struct intel_engine_execlist * const el)
>  {
> -       const unsigned int i = port_index(port, el);
> +       return READ_ONCE(el->port_count);
> +}
>  
> -       return &el->port[__port_idx(i, 1, el->port_mask)];
> +static inline unsigned int
> +execlist_inactive_ports(const struct intel_engine_execlist * const el)
> +{
> +       return el->port_mask + 1 - READ_ONCE(el->port_count);
>  }
>  
>  static inline struct execlist_port *
> -execlist_port_complete(struct intel_engine_execlist * const el,
> -                      struct execlist_port * const port)
> +execlist_request_port(struct intel_engine_execlist * const el)
> +{
> +       GEM_DEBUG_BUG_ON(el->port_count == el->port_mask + 1);

The idea of the DEBUG ones is that they match GEM_DEBUG_EXEC, i.e. code
that only conditionally exists under debugging. GEM_BUG_ON() itself
already only exists for debugging (CI) builds.
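
Roughly, from memory of i915_gem.h (exact definitions may differ, so treat
this as a sketch):

#ifdef CONFIG_DRM_I915_DEBUG_GEM
#define GEM_BUG_ON(expr)	BUG_ON(expr)
#define GEM_DEBUG_EXEC(expr)	expr
#define GEM_DEBUG_BUG_ON(expr)	GEM_BUG_ON(expr)
#else
#define GEM_BUG_ON(expr)	BUILD_BUG_ON_INVALID(expr)
#define GEM_DEBUG_EXEC(expr)	do { } while (0)
#define GEM_DEBUG_BUG_ON(expr)
#endif

So GEM_DEBUG_BUG_ON() is only meant for asserting on state that itself only
exists under GEM_DEBUG_EXEC()/GEM_DEBUG_DECL(); a plain assert should just
be GEM_BUG_ON().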
-Chris

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 8/8] drm/i915: Keep track of reserved execlist ports
  2017-09-12  8:36 ` [PATCH 8/8] drm/i915: Keep track of reserved execlist ports Mika Kuoppala
                     ` (2 preceding siblings ...)
  2017-09-12  9:19   ` Chris Wilson
@ 2017-09-12  9:23   ` Chris Wilson
  2017-09-12 10:29     ` Mika Kuoppala
  2017-09-12  9:35   ` Chris Wilson
  2017-09-12 10:45   ` Chris Wilson
  5 siblings, 1 reply; 30+ messages in thread
From: Chris Wilson @ 2017-09-12  9:23 UTC (permalink / raw)
  To: Mika Kuoppala, intel-gfx

Quoting Mika Kuoppala (2017-09-12 09:36:18)
> +static inline unsigned int
> +execlist_active_ports(const struct intel_engine_execlist * const el)
>  {
> -       const unsigned int i = port_index(port, el);
> +       return READ_ONCE(el->port_count);

READ_ONCE? Could we separate the racy check from the serialized uses
inside the tasklet where we do want the compiler to go to town?
-Chris

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 8/8] drm/i915: Keep track of reserved execlist ports
  2017-09-12  8:36 ` [PATCH 8/8] drm/i915: Keep track of reserved execlist ports Mika Kuoppala
                     ` (3 preceding siblings ...)
  2017-09-12  9:23   ` Chris Wilson
@ 2017-09-12  9:35   ` Chris Wilson
  2017-09-12 10:45   ` Chris Wilson
  5 siblings, 0 replies; 30+ messages in thread
From: Chris Wilson @ 2017-09-12  9:35 UTC (permalink / raw)
  To: Mika Kuoppala, intel-gfx

Quoting Mika Kuoppala (2017-09-12 09:36:18)
> @@ -1351,12 +1363,16 @@ static void reset_common_ring(struct intel_engine_cs *engine,
>                 return;
>         }
>  
> -       if (request->ctx != port_request(port)->ctx) {
> -               i915_gem_request_put(port_request(port));
> -               execlist_port_complete(el, port);
> -       }
> +       if (execlist_active_ports(el)) {
> +               struct execlist_port *port = execlist_port_head(el);
>  
> -       GEM_BUG_ON(request->ctx != port_request(port)->ctx);

This needs a FIXME at least.
For starters we need to drop requests until the ctx matches (i.e. while
(rq->ctx != port_request(port)->ctx)), and then we have the issue that we
may have the same ctx multiple times in an N-slot elsp. So
while (rq->ctx != port_request(port)->ctx || i915_gem_request_completed(port_request(port)))
?
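
i.e. something like this (untested sketch, using the helpers introduced in
this series):

	while (execlist_active_ports(el)) {
		struct execlist_port *port = execlist_port_head(el);
		struct drm_i915_gem_request *rq = port_request(port);

		/* drop anything from another ctx, and anything already completed */
		if (rq->ctx == request->ctx && !i915_gem_request_completed(rq))
			break;

		i915_gem_request_put(rq);
		execlist_release_port(el, port);
	}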
-Chris

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 1/8] drm/i915: Make own struct for execlist items
  2017-09-12  8:36 ` [PATCH 1/8] drm/i915: Make own struct for execlist items Mika Kuoppala
@ 2017-09-12 10:14   ` Chris Wilson
  0 siblings, 0 replies; 30+ messages in thread
From: Chris Wilson @ 2017-09-12 10:14 UTC (permalink / raw)
  To: Mika Kuoppala, intel-gfx

Quoting Mika Kuoppala (2017-09-12 09:36:11)
> Engine's execlist related items have been increasing to
> a point where a separate struct is warranted. Carve execlist
> specific items to a dedicated struct to add clarity.
> 
> v2: add kerneldoc and fix whitespace (Joonas, Chris)
> 
> Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
> Acked-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
-Chris

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 2/8] drm/i915: Wrap port cancellation into a function
  2017-09-12  8:36 ` [PATCH 2/8] drm/i915: Wrap port cancellation into a function Mika Kuoppala
@ 2017-09-12 10:15   ` Chris Wilson
  0 siblings, 0 replies; 30+ messages in thread
From: Chris Wilson @ 2017-09-12 10:15 UTC (permalink / raw)
  To: Mika Kuoppala, intel-gfx

Quoting Mika Kuoppala (2017-09-12 09:36:12)
> On reset and wedged path, we want to release the requests
> that are tied to ports and then mark the ports to be unset.
> Introduce a function for this.
> 
> Cc: Chris Wilson <chris@chris-wilson.co.uk>
> Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
-Chris

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 3/8] drm/i915: Add execlist_port_complete
  2017-09-12  8:36 ` [PATCH 3/8] drm/i915: Add execlist_port_complete Mika Kuoppala
@ 2017-09-12 10:19   ` Chris Wilson
  0 siblings, 0 replies; 30+ messages in thread
From: Chris Wilson @ 2017-09-12 10:19 UTC (permalink / raw)
  To: Mika Kuoppala, intel-gfx

Quoting Mika Kuoppala (2017-09-12 09:36:13)
> @@ -412,8 +413,6 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
>                  */
>                 last->tail = last->wa_tail;
>  
> -       GEM_BUG_ON(port_isset(&port[1]));

* Raises eyebrow

> -
>         /* Hardware submission is through 2 ports. Conceptually each port
>          * has a (RING_START, RING_HEAD, RING_TAIL) tuple. RING_START is
>          * static for a context, and unique to each, so we only execute
> @@ -485,25 +484,27 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
>                                 if (submit)
>                                         port_assign(port, last);
>                                 port++;
> +
> +                               GEM_BUG_ON(port_isset(port));

* and relax

You had me worried there, but that looks like a neat generalisation.

> @@ -495,6 +495,18 @@ struct intel_engine_cs {
>  
>  void execlist_cancel_port_requests(struct intel_engine_execlist * const el);
>  
> +static inline void
> +execlist_port_complete(struct intel_engine_execlist * const el,
> +                      struct execlist_port * const port)
> +{
> +       struct execlist_port * const port1 = &el->port[1];
> +
> +       GEM_DEBUG_BUG_ON(port_index(port, el) != 0);

Just GEM_BUG_ON().

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
-Chris

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 4/8] drm/i915: Make execlist port count variable
  2017-09-12  8:36 ` [PATCH 4/8] drm/i915: Make execlist port count variable Mika Kuoppala
  2017-09-12  8:55   ` Chris Wilson
@ 2017-09-12 10:24   ` Chris Wilson
  1 sibling, 0 replies; 30+ messages in thread
From: Chris Wilson @ 2017-09-12 10:24 UTC (permalink / raw)
  To: Mika Kuoppala, intel-gfx

Quoting Mika Kuoppala (2017-09-12 09:36:14)
> GuC can handle more than two submission ports.

I know why you want this, but this is a little too wishy-washy.
Something along the lines of,
"As we emulate execlists on top of the GuC workqueue, it is not
restricted to just 2 ports and we can increase that number arbitrarily
to trade-off queue depth (i.e. scheduling latency) against pipeline
bubbles."

Not that we have any evidence to suggest we need more than 2 ports ;)

> Make port count variable to prepare supporting more than 2 ports.
> 
> Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
-Chris

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 8/8] drm/i915: Keep track of reserved execlist ports
  2017-09-12  9:15   ` Chris Wilson
@ 2017-09-12 10:27     ` Mika Kuoppala
  2017-09-12 10:51       ` Chris Wilson
  0 siblings, 1 reply; 30+ messages in thread
From: Mika Kuoppala @ 2017-09-12 10:27 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx

Chris Wilson <chris@chris-wilson.co.uk> writes:

> Quoting Mika Kuoppala (2017-09-12 09:36:18)
>> +execlist_request_port(struct intel_engine_execlist * const el)
>> +{
>> +       GEM_DEBUG_BUG_ON(el->port_count == el->port_mask + 1);
>> +
>> +       el->port_count++;
>> +
>> +       GEM_DEBUG_BUG_ON(port_isset(execlist_port_tail(el)));
>> +
>> +       return execlist_port_tail(el);
>> +}
>> 
>> diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
>> index 0dfb03a0cee4..fdda3a1835ad 100644
>> --- a/drivers/gpu/drm/i915/i915_guc_submission.c
>> +++ b/drivers/gpu/drm/i915/i915_guc_submission.c
>> @@ -662,12 +662,19 @@ static void port_assign(struct execlist_port *port,
>>  static bool i915_guc_dequeue(struct intel_engine_cs *engine)
>>  {
>>         struct intel_engine_execlist * const el = &engine->execlist;
>> -       struct execlist_port *port = execlist_port_head(el);
>> -       const struct execlist_port * const last_port = execlist_port_tail(el);
>> -       struct drm_i915_gem_request *last = port_request(port);
>> +       struct execlist_port *port;
>> +       struct drm_i915_gem_request *last;
>>         struct rb_node *rb;
>>         bool submit = false;
>>  
>> +       if (execlist_active_ports(el)) {
>> +               port = execlist_port_tail(el);
>> +               last = port_request(port);
>> +       } else {
>> +               port = NULL;
>> +               last = NULL;
>> +       }
>> +
>>         spin_lock_irq(&engine->timeline->lock);
>>         rb = el->first;
>>         GEM_BUG_ON(rb_first(&el->queue) != rb);
>> @@ -675,9 +682,12 @@ static bool i915_guc_dequeue(struct intel_engine_cs *engine)
>>                 struct i915_priolist *p = rb_entry(rb, typeof(*p), node);
>>                 struct drm_i915_gem_request *rq, *rn;
>>  
>> +               if (!port)
>> +                       port = execlist_request_port(el);
>> +
>
> I can't see port becoming NULL inside the loop, so why can't we do it
> during the init above? What did I miss?

Hmm, this might have warranted a comment. I wanted to avoid requesting
a port if there is nothing to dequeue, which is why it is inside while (rb).
We could allocate early and let the port linger if there is nothing to
dequeue, but then the GEM_DEBUGs would need to be relaxed more.
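
The eager variant would look something like this (sketch only), leaving a
reserved but unset port behind whenever the queue turns out to be empty:

	if (execlist_active_ports(el)) {
		port = execlist_port_tail(el);
		last = port_request(port);
	} else {
		/* reserved up front; stays unset if there is nothing to dequeue */
		port = execlist_request_port(el);
		last = NULL;
	}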

-Mika

> -Chris

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 8/8] drm/i915: Keep track of reserved execlist ports
  2017-09-12  9:23   ` Chris Wilson
@ 2017-09-12 10:29     ` Mika Kuoppala
  2017-09-12 10:53       ` Chris Wilson
  0 siblings, 1 reply; 30+ messages in thread
From: Mika Kuoppala @ 2017-09-12 10:29 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx

Chris Wilson <chris@chris-wilson.co.uk> writes:

> Quoting Mika Kuoppala (2017-09-12 09:36:18)
>> +static inline unsigned int
>> +execlist_active_ports(const struct intel_engine_execlist * const el)
>>  {
>> -       const unsigned int i = port_index(port, el);
>> +       return READ_ONCE(el->port_count);
>
> READ_ONCE? Could we separate the racy check from the serialized uses
> inside the tasklet where we do want the compiler to go to town?

Hmm, so substitute execlist_active_ports() in i915_irq.c with
READ_ONCE(el->port_count) and drop the READ_ONCE in this function?
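
Something like this, perhaps (sketch; the _peek name is just illustrative):

static inline unsigned int
execlist_active_ports_peek(const struct intel_engine_execlist * const el)
{
	/* lockless peek for the irq handler, outside the tasklet */
	return READ_ONCE(el->port_count);
}

static inline unsigned int
execlist_active_ports(const struct intel_engine_execlist * const el)
{
	/* serialized use inside the submission tasklet */
	return el->port_count;
}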

-Mika

> -Chris

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 5/8] drm/i915: Introduce iterators for execlist ports
  2017-09-12  8:36 ` [PATCH 5/8] drm/i915: Introduce iterators for execlist ports Mika Kuoppala
@ 2017-09-12 10:30   ` Chris Wilson
  2017-09-12 10:36     ` Mika Kuoppala
  0 siblings, 1 reply; 30+ messages in thread
From: Chris Wilson @ 2017-09-12 10:30 UTC (permalink / raw)
  To: Mika Kuoppala, intel-gfx

Quoting Mika Kuoppala (2017-09-12 09:36:15)
> Switch to iterators for execlist_port access. This is
> a preparation for indexing ports from arbitrary location. Which
> in turn allows us to handle ports in ring like fashion.
> 
> Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>

Something we want to keep an eye on as we switch to these macros is code
generation. As a quick guide, check the object size, i.e. run
scripts/bloat-o-meter.

> -static inline unsigned int
> -execlist_num_ports(const struct intel_engine_execlist * const el)
> -{
> -       return el->port_mask + 1;
> -}
> +/* Iterators over elsp ports */
> +#define __port_idx(start, i, m) (((start) + (i)) & (m))
> +
> +#define for_each_execlist_port(el__, port__, n__) \
> +       for ((n__) = 0; \
> +            (port__) = &(el__)->port[__port_idx(0, (n__), (el__)->port_mask)], (n__) < (el__)->port_mask + 1; \
> +            (n__)++)

Using (n__) is misleading. It can't be anything other than an lvalue (i.e.
a plain variable and not an expr). checkpatch can complain all it wants.

Probably should keep execlist_num_ports() for another patch.
That x < y + 1 just keeps on triggering me every time I see it.
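
For reference, a callsite then reads something like this (sketch; n must be
a plain local variable, as noted above):

	unsigned int n;
	struct execlist_port *port;

	for_each_execlist_port(el, port, n) {
		/* visits all port_mask + 1 slots in order, starting at index 0 */
		if (!port_isset(port))
			break;
	}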

> +
> +#define for_each_execlist_port_reverse(el__, port__, n__) \
> +       for ((n__) = (el__)->port_mask + 1; \
> +            (port__) = &(el__)->port[__port_idx((el__)->port_mask, (n__), (el__)->port_mask)], (n__)--;)
>  
>  static inline void
>  execlist_port_complete(struct intel_engine_execlist * const el,

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 5/8] drm/i915: Introduce iterators for execlist ports
  2017-09-12 10:30   ` Chris Wilson
@ 2017-09-12 10:36     ` Mika Kuoppala
  0 siblings, 0 replies; 30+ messages in thread
From: Mika Kuoppala @ 2017-09-12 10:36 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx

Chris Wilson <chris@chris-wilson.co.uk> writes:

> Quoting Mika Kuoppala (2017-09-12 09:36:15)
>> Switch to iterators for execlist_port access. This is
>> a preparation for indexing ports from arbitrary location. Which
>> in turn allows us to handle ports in ring like fashion.
>> 
>> Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
>
> Something we want to keep an eye on as we switch to these macros is code
> generation. As a quick guide, check the object size, i.e. run
> scripts/bloat-o-meter.
>
>> -static inline unsigned int
>> -execlist_num_ports(const struct intel_engine_execlist * const el)
>> -{
>> -       return el->port_mask + 1;
>> -}
>> +/* Iterators over elsp ports */
>> +#define __port_idx(start, i, m) (((start) + (i)) & (m))
>> +
>> +#define for_each_execlist_port(el__, port__, n__) \
>> +       for ((n__) = 0; \
>> +            (port__) = &(el__)->port[__port_idx(0, (n__), (el__)->port_mask)], (n__) < (el__)->port_mask + 1; \
>> +            (n__)++)
>
> Using (n__) is misleading. It can't be anything other than an lvalue (i.e.
> a plain variable and not an expr). checkpatch can complain all it wants.
>
> Probably should keep execlist_num_ports() for another patch.
> That x < y + 1 just keeps on triggering me every time I see it.

There is only one callsite per macro if I recall right. We could
just drop these macros and live a happier life, as no-one really is a fan.

-Mika

>
>> +
>> +#define for_each_execlist_port_reverse(el__, port__, n__) \
>> +       for ((n__) = (el__)->port_mask + 1; \
>> +            (port__) = &(el__)->port[__port_idx((el__)->port_mask, (n__), (el__)->port_mask)], (n__)--;)
>>  
>>  static inline void
>>  execlist_port_complete(struct intel_engine_execlist * const el,

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 6/8] drm/i915: Introduce execlist_port_* accessors
  2017-09-12  8:36 ` [PATCH 6/8] drm/i915: Introduce execlist_port_* accessors Mika Kuoppala
@ 2017-09-12 10:37   ` Chris Wilson
  0 siblings, 0 replies; 30+ messages in thread
From: Chris Wilson @ 2017-09-12 10:37 UTC (permalink / raw)
  To: Mika Kuoppala, intel-gfx

Quoting Mika Kuoppala (2017-09-12 09:36:16)
> Instead of trusting that first available port is at index 0,
> use accessor to hide this. This allows us to just move the head
> on port completion instead of memmoving the array.

Ah. Even after a re-read, I wasn't expecting to see the advance in this
path. "This allows" is too passive; it sounds like you are describing a
future optimisation.

> -static bool execlists_elsp_ready(const struct intel_engine_cs *engine)
> +static bool execlists_elsp_ready(struct intel_engine_execlist * const el)
>  {
> -       const struct execlist_port *port = engine->execlist.port;
> +       struct execlist_port * const port = execlist_port_head(el);

For readability
struct execlist_port * const port0 = execlist_port_head(el);
struct execlist_port * const port1 = execlist_port_next(el, port0);

> -       return port_count(&port[0]) + port_count(&port[1]) < 2;
> +       return port_count(port) + port_count(execlist_port_next(el, port)) < 2;
return port_count(port0) + port_count(port1) < 2;
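
i.e. the whole helper would then read (sketch):

static bool execlists_elsp_ready(struct intel_engine_execlist * const el)
{
	struct execlist_port * const port0 = execlist_port_head(el);
	struct execlist_port * const port1 = execlist_port_next(el, port0);

	return port_count(port0) + port_count(port1) < 2;
}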
-Chris

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 7/8] drm/i915: Move execlist initialization into intel_engine_cs.c
  2017-09-12  8:36 ` [PATCH 7/8] drm/i915: Move execlist initialization into intel_engine_cs.c Mika Kuoppala
@ 2017-09-12 10:41   ` Chris Wilson
  0 siblings, 0 replies; 30+ messages in thread
From: Chris Wilson @ 2017-09-12 10:41 UTC (permalink / raw)
  To: Mika Kuoppala, intel-gfx

Quoting Mika Kuoppala (2017-09-12 09:36:17)
> Move execlist init into a common engine setup. As it is
> common to both guc and hw execlists.

"Move execlist init into the commone engine setup for simplicity. At
present, it is split between the common setup and lrc for no real
reason. As we always inspect the execlists for debugging, we should
make sure that it is always initialised (and empty at start).

Plus now that we have a common substruct for execlists, having a common
initialiser makes our setup more self-descriptive."
 
> Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>

I would bump this to just after introduction of struct engine->execlist
-Chris

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 8/8] drm/i915: Keep track of reserved execlist ports
  2017-09-12  8:36 ` [PATCH 8/8] drm/i915: Keep track of reserved execlist ports Mika Kuoppala
                     ` (4 preceding siblings ...)
  2017-09-12  9:35   ` Chris Wilson
@ 2017-09-12 10:45   ` Chris Wilson
  5 siblings, 0 replies; 30+ messages in thread
From: Chris Wilson @ 2017-09-12 10:45 UTC (permalink / raw)
  To: Mika Kuoppala, intel-gfx

Quoting Mika Kuoppala (2017-09-12 09:36:18)
> To further enhance port processing, keep track of
> reserved ports. This way we can iterate only the used subset
> of port space. Note that we lift the responsibility of
> execlists_submit_request() to inspect hw availability and
> always do dequeuing. This is to ensure that only the irq
> handler will be responsible for keeping track of available ports.

Hmm, gut feeling is that tracking the tail instead of count is more
intuitive, especially for dequeue.
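
For comparison, the two bookkeeping schemes look like this (illustrative
helpers, not from the patch):

/* head + count: the tail is derived */
static inline unsigned int ring_tail(unsigned int head, unsigned int count,
				     unsigned int mask)
{
	return (head + count) & mask;
}

/* head + tail: the count is derived, with the usual caveat that head == tail
 * is ambiguous between empty and completely full unless one slot is kept
 * spare or a count is tracked anyway.
 */
static inline unsigned int ring_count(unsigned int head, unsigned int tail,
				      unsigned int mask)
{
	return (tail - head) & mask;
}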
-Chris

^ permalink raw reply	[flat|nested] 30+ messages in thread

* ✓ Fi.CI.IGT: success for Support for more than two execlist ports
  2017-09-12  8:36 [PATCH 0/8] Support for more than two execlist ports Mika Kuoppala
                   ` (8 preceding siblings ...)
  2017-09-12  8:59 ` ✓ Fi.CI.BAT: success for Support for more than two " Patchwork
@ 2017-09-12 10:48 ` Patchwork
  9 siblings, 0 replies; 30+ messages in thread
From: Patchwork @ 2017-09-12 10:48 UTC (permalink / raw)
  To: Mika Kuoppala; +Cc: intel-gfx

== Series Details ==

Series: Support for more than two execlist ports
URL   : https://patchwork.freedesktop.org/series/30183/
State : success

== Summary ==

Test kms_cursor_legacy:
        Subgroup flip-vs-cursor-atomic:
                fail       -> PASS       (shard-hsw) fdo#102402

fdo#102402 https://bugs.freedesktop.org/show_bug.cgi?id=102402

shard-hsw        total:2301 pass:1237 dwarn:0   dfail:0   fail:12  skip:1052 time:9452s

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_5653/shards.html

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 8/8] drm/i915: Keep track of reserved execlist ports
  2017-09-12 10:27     ` Mika Kuoppala
@ 2017-09-12 10:51       ` Chris Wilson
  0 siblings, 0 replies; 30+ messages in thread
From: Chris Wilson @ 2017-09-12 10:51 UTC (permalink / raw)
  To: Mika Kuoppala, intel-gfx

Quoting Mika Kuoppala (2017-09-12 11:27:53)
> Chris Wilson <chris@chris-wilson.co.uk> writes:
> 
> > Quoting Mika Kuoppala (2017-09-12 09:36:18)
> >> +execlist_request_port(struct intel_engine_execlist * const el)
> >> +{
> >> +       GEM_DEBUG_BUG_ON(el->port_count == el->port_mask + 1);
> >> +
> >> +       el->port_count++;
> >> +
> >> +       GEM_DEBUG_BUG_ON(port_isset(execlist_port_tail(el)));
> >> +
> >> +       return execlist_port_tail(el);
> >> +}
> >> 
> >> diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
> >> index 0dfb03a0cee4..fdda3a1835ad 100644
> >> --- a/drivers/gpu/drm/i915/i915_guc_submission.c
> >> +++ b/drivers/gpu/drm/i915/i915_guc_submission.c
> >> @@ -662,12 +662,19 @@ static void port_assign(struct execlist_port *port,
> >>  static bool i915_guc_dequeue(struct intel_engine_cs *engine)
> >>  {
> >>         struct intel_engine_execlist * const el = &engine->execlist;
> >> -       struct execlist_port *port = execlist_port_head(el);
> >> -       const struct execlist_port * const last_port = execlist_port_tail(el);
> >> -       struct drm_i915_gem_request *last = port_request(port);
> >> +       struct execlist_port *port;
> >> +       struct drm_i915_gem_request *last;
> >>         struct rb_node *rb;
> >>         bool submit = false;
> >>  
> >> +       if (execlist_active_ports(el)) {
> >> +               port = execlist_port_tail(el);
> >> +               last = port_request(port);
> >> +       } else {
> >> +               port = NULL;
> >> +               last = NULL;
> >> +       }
> >> +
> >>         spin_lock_irq(&engine->timeline->lock);
> >>         rb = el->first;
> >>         GEM_BUG_ON(rb_first(&el->queue) != rb);
> >> @@ -675,9 +682,12 @@ static bool i915_guc_dequeue(struct intel_engine_cs *engine)
> >>                 struct i915_priolist *p = rb_entry(rb, typeof(*p), node);
> >>                 struct drm_i915_gem_request *rq, *rn;
> >>  
> >> +               if (!port)
> >> +                       port = execlist_request_port(el);
> >> +
> >
> > I can't see port becoming NULL inside the loop, so why can't we do it
> > during the init above? What did I miss?
> 
> Hmm, this might have warranted a comment. I wanted to avoid requesting
> a port if there is nothing to dequeue, which is why it is inside while (rb).
> We could allocate early and let the port linger if there is nothing to
> dequeue, but then the GEM_DEBUGs would need to be relaxed more.

Ah, that's where only advancing at the end comes into play. It would also
help not to call it request_port... or just not hide the mechanics.
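
e.g. open-coded at the callsite (sketch, same mechanics as
execlist_request_port() above):

	if (!port) {
		GEM_DEBUG_BUG_ON(el->port_count == el->port_mask + 1);
		el->port_count++;
		port = execlist_port_tail(el);
		GEM_DEBUG_BUG_ON(port_isset(port));
	}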
-Chris

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 8/8] drm/i915: Keep track of reserved execlist ports
  2017-09-12 10:29     ` Mika Kuoppala
@ 2017-09-12 10:53       ` Chris Wilson
  0 siblings, 0 replies; 30+ messages in thread
From: Chris Wilson @ 2017-09-12 10:53 UTC (permalink / raw)
  To: Mika Kuoppala, intel-gfx

Quoting Mika Kuoppala (2017-09-12 11:29:48)
> Chris Wilson <chris@chris-wilson.co.uk> writes:
> 
> > Quoting Mika Kuoppala (2017-09-12 09:36:18)
> >> +static inline unsigned int
> >> +execlist_active_ports(const struct intel_engine_execlist * const el)
> >>  {
> >> -       const unsigned int i = port_index(port, el);
> >> +       return READ_ONCE(el->port_count);
> >
> > READ_ONCE? Could we separate the racy check from the serialized uses
> > inside the tasklet where we do want the compiler to go to town?
> 
> Hmm, so substitute execlist_active_ports() in i915_irq.c with
> READ_ONCE(el->port_count) and drop the READ_ONCE in this function?

The elsp_ready callers outside of the tasklet also need to take special care
(or at least document that they are outside of the serialized section).
-Chris

^ permalink raw reply	[flat|nested] 30+ messages in thread

end of thread, other threads:[~2017-09-12 10:53 UTC | newest]

Thread overview: 30+ messages (download: mbox.gz / follow: Atom feed)
2017-09-12  8:36 [PATCH 0/8] Support for more than two execlist ports Mika Kuoppala
2017-09-12  8:36 ` [PATCH 1/8] drm/i915: Make own struct for execlist items Mika Kuoppala
2017-09-12 10:14   ` Chris Wilson
2017-09-12  8:36 ` [PATCH 2/8] drm/i915: Wrap port cancellation into a function Mika Kuoppala
2017-09-12 10:15   ` Chris Wilson
2017-09-12  8:36 ` [PATCH 3/8] drm/i915: Add execlist_port_complete Mika Kuoppala
2017-09-12 10:19   ` Chris Wilson
2017-09-12  8:36 ` [PATCH 4/8] drm/i915: Make execlist port count variable Mika Kuoppala
2017-09-12  8:55   ` Chris Wilson
2017-09-12 10:24   ` Chris Wilson
2017-09-12  8:36 ` [PATCH 5/8] drm/i915: Introduce iterators for execlist ports Mika Kuoppala
2017-09-12 10:30   ` Chris Wilson
2017-09-12 10:36     ` Mika Kuoppala
2017-09-12  8:36 ` [PATCH 6/8] drm/i915: Introduce execlist_port_* accessors Mika Kuoppala
2017-09-12 10:37   ` Chris Wilson
2017-09-12  8:36 ` [PATCH 7/8] drm/i915: Move execlist initialization into intel_engine_cs.c Mika Kuoppala
2017-09-12 10:41   ` Chris Wilson
2017-09-12  8:36 ` [PATCH 8/8] drm/i915: Keep track of reserved execlist ports Mika Kuoppala
2017-09-12  9:15   ` Chris Wilson
2017-09-12 10:27     ` Mika Kuoppala
2017-09-12 10:51       ` Chris Wilson
2017-09-12  9:17   ` Chris Wilson
2017-09-12  9:19   ` Chris Wilson
2017-09-12  9:23   ` Chris Wilson
2017-09-12 10:29     ` Mika Kuoppala
2017-09-12 10:53       ` Chris Wilson
2017-09-12  9:35   ` Chris Wilson
2017-09-12 10:45   ` Chris Wilson
2017-09-12  8:59 ` ✓ Fi.CI.BAT: success for Support for more than two " Patchwork
2017-09-12 10:48 ` ✓ Fi.CI.IGT: " Patchwork
