* [PATCH v2 0/6] drm/i915/guc: use one GuC client per GPU engine
@ 2016-07-21 18:15 Dave Gordon
  2016-07-21 18:15 ` [PATCH v2 1/6] drm/i915/guc: doorbell reset should avoid used doorbells Dave Gordon
                   ` (6 more replies)
  0 siblings, 7 replies; 18+ messages in thread
From: Dave Gordon @ 2016-07-21 18:15 UTC (permalink / raw)
  To: intel-gfx

When using a single GuC client for multiple engines, the i915 driver has
to merge all work items into a single work queue, which the GuC firmware
then demultiplexes into separate submission queues per engine. In
theory, this could lead to the single queue becoming a bottleneck in
which an excess of outstanding work for one or more engines might
prevent work for an idle engine reaching the hardware.

To reduce this risk, we can create one GuC client per engine. Each will
have its own workqueue, to be used only for work targeting a single
engine, so there will be no cross-engine contention for workqueue slots.
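The head-of-line-blocking risk described above can be sketched in userspace C. This is an editor's illustration only, not driver code: the engine count, slot budget, and function names are invented. With one shared workqueue, a busy engine can exhaust every slot and starve an idle one; with per-engine queues the budgets are independent.

```c
#include <assert.h>

#define NUM_ENGINES 4
#define WQ_SLOTS    8	/* slots per workqueue (illustrative value) */

/* One shared queue: a single busy engine can consume every slot. */
static int shared_free = WQ_SLOTS;

/* Per-engine queues: each engine has its own slot budget. */
static int per_engine_free[NUM_ENGINES] = {
	WQ_SLOTS, WQ_SLOTS, WQ_SLOTS, WQ_SLOTS,
};

static int shared_submit(int engine)
{
	(void)engine;		/* which engine asked doesn't matter here */
	if (shared_free == 0)
		return -1;	/* blocked, even if 'engine' is idle */
	shared_free--;
	return 0;
}

static int per_engine_submit(int engine)
{
	if (per_engine_free[engine] == 0)
		return -1;	/* only this engine's backlog can block it */
	per_engine_free[engine]--;
	return 0;
}
```

After engine 0 fills the shared queue, a submission for idle engine 1 fails there but still succeeds on its private queue, which is the contention this series removes.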

Dave Gordon (6):
  drm/i915/guc: doorbell reset should avoid used doorbells
  drm/i915/guc: refactor guc_init_doorbell_hw()
  drm/i915/guc: use a separate GuC client for each engine
  drm/i915/guc: add engine mask to GuC client & pass to GuC
  drm/i915/guc: use for_each_engine_id() where appropriate
  drm/i915/guc: re-optimise i915_guc_client layout

 drivers/gpu/drm/i915/i915_debugfs.c        |  43 +++++++----
 drivers/gpu/drm/i915/i915_guc_submission.c | 118 ++++++++++++++++++-----------
 drivers/gpu/drm/i915/intel_guc.h           |  11 ++-
 3 files changed, 107 insertions(+), 65 deletions(-)

-- 
1.9.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


* [PATCH v2 1/6] drm/i915/guc: doorbell reset should avoid used doorbells
  2016-07-21 18:15 [PATCH v2 0/6] drm/i915/guc: use one GuC client per GPU engine Dave Gordon
@ 2016-07-21 18:15 ` Dave Gordon
  2016-07-22 10:00   ` Tvrtko Ursulin
  2016-07-21 18:15 ` [PATCH v2 2/6] drm/i915/guc: refactor guc_init_doorbell_hw() Dave Gordon
                   ` (5 subsequent siblings)
  6 siblings, 1 reply; 18+ messages in thread
From: Dave Gordon @ 2016-07-21 18:15 UTC (permalink / raw)
  To: intel-gfx

guc_init_doorbell_hw() borrows the (currently single) GuC client to use
in reinitialising ALL the doorbell registers (as the hardware doesn't
reset them when the GuC is reset). As a prerequisite for accommodating
multiple clients, it should only reset doorbells that are supposed to be
disabled, avoiding those that are marked as in use by any client.
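The skip-if-in-use behaviour this patch adds can be modelled outside the kernel. This is a hedged sketch, not the driver's code: `bitmap_test()` is a stand-in for the kernel's `test_bit()`, and `doorbells_to_reset()` is an invented helper that only counts which doorbells a reset pass would touch once used ones are skipped.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_DOORBELLS 256	/* mirrors GUC_MAX_DOORBELLS */

/* Minimal stand-in for the kernel's test_bit() on a flat bitmap. */
static int bitmap_test(const unsigned long *map, unsigned int bit)
{
	const unsigned int bpl = 8 * sizeof(unsigned long);

	return (map[bit / bpl] >> (bit % bpl)) & 1UL;
}

/*
 * Count how many doorbells a reset pass would touch when it skips
 * those marked in use -- the behaviour this patch introduces.
 */
static unsigned int doorbells_to_reset(const unsigned long *in_use)
{
	unsigned int i, n = 0;

	for (i = 0; i < MAX_DOORBELLS; i++) {
		if (bitmap_test(in_use, i))
			continue;	/* owned by a client: leave it alone */
		n++;
	}
	return n;
}
```

With an empty bitmap every doorbell is reinitialised, exactly as before the patch; marking any bit removes that doorbell from the pass.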

Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
---
 drivers/gpu/drm/i915/i915_guc_submission.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
index 01c1c16..9c5b81b 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -699,7 +699,7 @@ static void gem_release_guc_obj(struct drm_i915_gem_object *obj)
 }
 
 /*
- * Borrow the first client to set up & tear down every doorbell
+ * Borrow the first client to set up & tear down each unused doorbell
  * in turn, to ensure that all doorbell h/w is (re)initialised.
  */
 static void guc_init_doorbell_hw(struct intel_guc *guc)
@@ -715,6 +715,9 @@ static void guc_init_doorbell_hw(struct intel_guc *guc)
 		i915_reg_t drbreg = GEN8_DRBREGL(i);
 		u32 value = I915_READ(drbreg);
 
+		if (test_bit(i, guc->doorbell_bitmap))
+			continue;
+
 		err = guc_update_doorbell_id(guc, client, i);
 
 		/* Report update failure or unexpectedly active doorbell */
@@ -733,6 +736,9 @@ static void guc_init_doorbell_hw(struct intel_guc *guc)
 		i915_reg_t drbreg = GEN8_DRBREGL(i);
 		u32 value = I915_READ(drbreg);
 
+		if (test_bit(i, guc->doorbell_bitmap))
+			continue;
+
 		if (i != db_id && (value & GUC_DOORBELL_ENABLED))
 			DRM_DEBUG_DRIVER("Doorbell %d (reg 0x%x) finally 0x%x\n",
 					  i, drbreg.reg, value);
-- 
1.9.1


* [PATCH v2 2/6] drm/i915/guc: refactor guc_init_doorbell_hw()
  2016-07-21 18:15 [PATCH v2 0/6] drm/i915/guc: use one GuC client per GPU engine Dave Gordon
  2016-07-21 18:15 ` [PATCH v2 1/6] drm/i915/guc: doorbell reset should avoid used doorbells Dave Gordon
@ 2016-07-21 18:15 ` Dave Gordon
  2016-07-22 10:01   ` Tvrtko Ursulin
  2016-07-21 18:15 ` [PATCH v2 3/6] drm/i915/guc: use a separate GuC client for each engine Dave Gordon
                   ` (4 subsequent siblings)
  6 siblings, 1 reply; 18+ messages in thread
From: Dave Gordon @ 2016-07-21 18:15 UTC (permalink / raw)
  To: intel-gfx

We have essentially the same code in each of two different loops, so we
can refactor it into a little helper function.

Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
---
 drivers/gpu/drm/i915/i915_guc_submission.c | 54 +++++++++++++++++-------------
 1 file changed, 30 insertions(+), 24 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
index 9c5b81b..816bdca 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -698,32 +698,47 @@ static void gem_release_guc_obj(struct drm_i915_gem_object *obj)
 	kfree(client);
 }
 
+/* Check that a doorbell register is in the expected state */
+static bool guc_doorbell_check(struct intel_guc *guc, uint16_t db_id)
+{
+	struct drm_i915_private *dev_priv = guc_to_i915(guc);
+	i915_reg_t drbreg = GEN8_DRBREGL(db_id);
+	uint32_t value = I915_READ(drbreg);
+	bool enabled = (value & GUC_DOORBELL_ENABLED) != 0;
+	bool expected = test_bit(db_id, guc->doorbell_bitmap);
+
+	if (enabled == expected)
+		return true;
+
+	DRM_DEBUG_DRIVER("Doorbell %d (reg 0x%x) 0x%x, should be %s\n",
+			 db_id, drbreg.reg, value,
+			 expected ? "active" : "inactive");
+
+	return false;
+}
+
 /*
  * Borrow the first client to set up & tear down each unused doorbell
  * in turn, to ensure that all doorbell h/w is (re)initialised.
  */
 static void guc_init_doorbell_hw(struct intel_guc *guc)
 {
-	struct drm_i915_private *dev_priv = guc_to_i915(guc);
 	struct i915_guc_client *client = guc->execbuf_client;
-	uint16_t db_id, i;
-	int err;
+	uint16_t db_id;
+	int i, err;
 
+	/* Save client's original doorbell selection */
 	db_id = client->doorbell_id;
 
 	for (i = 0; i < GUC_MAX_DOORBELLS; ++i) {
-		i915_reg_t drbreg = GEN8_DRBREGL(i);
-		u32 value = I915_READ(drbreg);
-
-		if (test_bit(i, guc->doorbell_bitmap))
+		/* Skip if doorbell is OK */
+		if (guc_doorbell_check(guc, i))
 			continue;
 
 		err = guc_update_doorbell_id(guc, client, i);
-
-		/* Report update failure or unexpectedly active doorbell */
-		if (err || (i != db_id && (value & GUC_DOORBELL_ENABLED)))
-			DRM_DEBUG_DRIVER("Doorbell %d (reg 0x%x) was 0x%x, err %d\n",
-					  i, drbreg.reg, value, err);
+		if (err)
+			DRM_DEBUG_DRIVER("Doorbell %d update failed, err %d\n",
+					i, err);
 	}
 
 	/* Restore to original value */
@@ -732,18 +747,9 @@ static void guc_init_doorbell_hw(struct intel_guc *guc)
 		DRM_ERROR("Failed to restore doorbell to %d, err %d\n",
 			db_id, err);
 
-	for (i = 0; i < GUC_MAX_DOORBELLS; ++i) {
-		i915_reg_t drbreg = GEN8_DRBREGL(i);
-		u32 value = I915_READ(drbreg);
-
-		if (test_bit(i, guc->doorbell_bitmap))
-			continue;
-
-		if (i != db_id && (value & GUC_DOORBELL_ENABLED))
-			DRM_DEBUG_DRIVER("Doorbell %d (reg 0x%x) finally 0x%x\n",
-					  i, drbreg.reg, value);
-
-	}
+	/* Read back & verify all doorbell registers */
+	for (i = 0; i < GUC_MAX_DOORBELLS; ++i)
+		(void)guc_doorbell_check(guc, i);
 }
 
 /**
-- 
1.9.1


* [PATCH v2 3/6] drm/i915/guc: use a separate GuC client for each engine
  2016-07-21 18:15 [PATCH v2 0/6] drm/i915/guc: use one GuC client per GPU engine Dave Gordon
  2016-07-21 18:15 ` [PATCH v2 1/6] drm/i915/guc: doorbell reset should avoid used doorbells Dave Gordon
  2016-07-21 18:15 ` [PATCH v2 2/6] drm/i915/guc: refactor guc_init_doorbell_hw() Dave Gordon
@ 2016-07-21 18:15 ` Dave Gordon
  2016-07-21 18:30   ` Chris Wilson
  2016-07-21 18:15 ` [PATCH v2 4/6] drm/i915/guc: add engine mask to GuC client & pass to GuC Dave Gordon
                   ` (3 subsequent siblings)
  6 siblings, 1 reply; 18+ messages in thread
From: Dave Gordon @ 2016-07-21 18:15 UTC (permalink / raw)
  To: intel-gfx

When using a single GuC client for multiple engines, the i915 driver has
to merge all work items into a single work queue, which the GuC firmware
then demultiplexes into separate submission queues per engine. In
theory, this could lead to the single queue becoming a bottleneck in
which an excess of outstanding work for one or more engines might
prevent work for an idle engine reaching the hardware.

To reduce this risk, we can create one GuC client per engine. Each will
have its own workqueue, to be used only for work targeting a single
engine, so there will be no cross-engine contention for workqueue slots.

Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/i915_debugfs.c        | 25 ++++++++++++++++-----
 drivers/gpu/drm/i915/i915_guc_submission.c | 35 +++++++++++++++++++-----------
 drivers/gpu/drm/i915/intel_guc.h           |  2 +-
 3 files changed, 42 insertions(+), 20 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index 9aa62c5..793b1d9 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -2564,20 +2564,26 @@ static int i915_guc_info(struct seq_file *m, void *data)
 	struct drm_device *dev = node->minor->dev;
 	struct drm_i915_private *dev_priv = to_i915(dev);
 	struct intel_guc guc;
-	struct i915_guc_client client = {};
+	struct i915_guc_client *clients;
 	struct intel_engine_cs *engine;
+	enum intel_engine_id id;
 	u64 total = 0;
 
 	if (!HAS_GUC_SCHED(dev_priv))
 		return 0;
 
+	clients = kcalloc(I915_NUM_ENGINES, sizeof(*clients), GFP_KERNEL);
+	if (clients == NULL)
+		return -ENOMEM;
+
 	if (mutex_lock_interruptible(&dev->struct_mutex))
-		return 0;
+		goto done;
 
 	/* Take a local copy of the GuC data, so we can dump it at leisure */
 	guc = dev_priv->guc;
-	if (guc.execbuf_client)
-		client = *guc.execbuf_client;
+	for_each_engine_id(engine, dev_priv, id)
+		if (guc.exec_clients[id])
+			clients[id] = *guc.exec_clients[id];
 
 	mutex_unlock(&dev->struct_mutex);
 
@@ -2600,11 +2606,18 @@ static int i915_guc_info(struct seq_file *m, void *data)
 	}
 	seq_printf(m, "\t%s: %llu\n", "Total", total);
 
-	seq_printf(m, "\nGuC execbuf client @ %p:\n", guc.execbuf_client);
-	i915_guc_client_info(m, dev_priv, &client);
+	for_each_engine_id(engine, dev_priv, id) {
+		seq_printf(m, "\nGuC exec_client[%d] @ %p:\n",
+				id, guc.exec_clients[id]);
+		if (guc.exec_clients[id])
+			i915_guc_client_info(m, dev_priv, &clients[id]);
+	}
 
 	/* Add more as required ... */
 
+done:
+	kfree(clients);
+
 	return 0;
 }
 
diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
index 816bdca..9999797 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -434,7 +434,9 @@ static void guc_fini_ctx_desc(struct intel_guc *guc,
 int i915_guc_wq_check_space(struct drm_i915_gem_request *request)
 {
 	const size_t wqi_size = sizeof(struct guc_wq_item);
-	struct i915_guc_client *gc = request->i915->guc.execbuf_client;
+	enum intel_engine_id engine_id = request->engine->id;
+	struct intel_guc *guc = &request->i915->guc;
+	struct i915_guc_client *gc = guc->exec_clients[engine_id];
 	struct guc_process_desc *desc;
 	u32 freespace;
 
@@ -589,7 +591,7 @@ int i915_guc_submit(struct drm_i915_gem_request *rq)
 {
 	unsigned int engine_id = rq->engine->id;
 	struct intel_guc *guc = &rq->i915->guc;
-	struct i915_guc_client *client = guc->execbuf_client;
+	struct i915_guc_client *client = guc->exec_clients[engine_id];
 	int b_ret;
 
 	guc_add_workqueue_item(client, rq);
@@ -723,7 +725,7 @@ static bool guc_doorbell_check(struct intel_guc *guc, uint16_t db_id)
  */
 static void guc_init_doorbell_hw(struct intel_guc *guc)
 {
-	struct i915_guc_client *client = guc->execbuf_client;
+	struct i915_guc_client *client = guc->exec_clients[RCS];
 	uint16_t db_id;
 	int i, err;
 
@@ -1004,17 +1006,21 @@ int i915_guc_submission_enable(struct drm_i915_private *dev_priv)
 {
 	struct intel_guc *guc = &dev_priv->guc;
 	struct i915_guc_client *client;
+	struct intel_engine_cs *engine;
 
-	/* client for execbuf submission */
-	client = guc_client_alloc(dev_priv,
-				  GUC_CTX_PRIORITY_KMD_NORMAL,
-				  dev_priv->kernel_context);
-	if (!client) {
-		DRM_ERROR("Failed to create execbuf guc_client\n");
-		return -ENOMEM;
+	for_each_engine(engine, dev_priv) {
+		/* client for execbuf submission */
+		client = guc_client_alloc(dev_priv,
+					  GUC_CTX_PRIORITY_KMD_NORMAL,
+					  dev_priv->kernel_context);
+		if (!client) {
+			DRM_ERROR("Failed to create GuC client(s)\n");
+			return -ENOMEM;
+		}
+
+		guc->exec_clients[engine->id] = client;
 	}
 
-	guc->execbuf_client = client;
 	host2guc_sample_forcewake(guc, client);
 	guc_init_doorbell_hw(guc);
 
@@ -1024,9 +1030,12 @@ int i915_guc_submission_enable(struct drm_i915_private *dev_priv)
 void i915_guc_submission_disable(struct drm_i915_private *dev_priv)
 {
 	struct intel_guc *guc = &dev_priv->guc;
+	struct intel_engine_cs *engine;
 
-	guc_client_free(dev_priv, guc->execbuf_client);
-	guc->execbuf_client = NULL;
+	for_each_engine(engine, dev_priv) {
+		guc_client_free(dev_priv, guc->exec_clients[engine->id]);
+		guc->exec_clients[engine->id] = NULL;
+	}
 }
 
 void i915_guc_submission_fini(struct drm_i915_private *dev_priv)
diff --git a/drivers/gpu/drm/i915/intel_guc.h b/drivers/gpu/drm/i915/intel_guc.h
index 3e3e743..7b4cc4d 100644
--- a/drivers/gpu/drm/i915/intel_guc.h
+++ b/drivers/gpu/drm/i915/intel_guc.h
@@ -132,7 +132,7 @@ struct intel_guc {
 	struct drm_i915_gem_object *ctx_pool_obj;
 	struct ida ctx_ids;
 
-	struct i915_guc_client *execbuf_client;
+	struct i915_guc_client *exec_clients[I915_NUM_ENGINES];
 
 	DECLARE_BITMAP(doorbell_bitmap, GUC_MAX_DOORBELLS);
 	uint32_t db_cacheline;		/* Cyclic counter mod pagesize	*/
-- 
1.9.1


* [PATCH v2 4/6] drm/i915/guc: add engine mask to GuC client & pass to GuC
  2016-07-21 18:15 [PATCH v2 0/6] drm/i915/guc: use one GuC client per GPU engine Dave Gordon
                   ` (2 preceding siblings ...)
  2016-07-21 18:15 ` [PATCH v2 3/6] drm/i915/guc: use a separate GuC client for each engine Dave Gordon
@ 2016-07-21 18:15 ` Dave Gordon
  2016-07-21 18:15 ` [PATCH v2 5/6] drm/i915/guc: use for_each_engine_id() where appropriate Dave Gordon
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 18+ messages in thread
From: Dave Gordon @ 2016-07-21 18:15 UTC (permalink / raw)
  To: intel-gfx

The Context Descriptor passed by the kernel to the GuC contains a field
specifying which engine(s) the context will use. Historically, this was
always set to "all of them", but now that we have one client per engine,
we can be more precise, and set only the single bit for the engine that
the client is associated with.
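The "all engines" versus "single bit" distinction is just mask arithmetic. The sketch below is illustrative: `engine_flag()` plays the role of the driver's `intel_engine_flag()`, and the engine ids are the host-side ones, but nothing here is lifted from the patch itself.

```c
#include <assert.h>
#include <stdint.h>

enum engine_id { RCS = 0, BCS, VCS, VECS, NUM_ENGINES };

/* Analogue of intel_engine_flag(): one bit per host engine id. */
static uint32_t engine_flag(enum engine_id id)
{
	return 1u << id;
}

/*
 * The historical behaviour: mark the context as using every engine.
 * With one client per engine, each client now passes engine_flag()
 * for its single engine instead.
 */
static uint32_t all_engines_mask(void)
{
	return (1u << NUM_ENGINES) - 1;
}
```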

Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/i915_guc_submission.c | 15 ++++++++++-----
 drivers/gpu/drm/i915/intel_guc.h           |  3 ++-
 2 files changed, 12 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
index 9999797..6756db0 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -340,7 +340,7 @@ static void guc_init_ctx_desc(struct intel_guc *guc,
 	desc.priority = client->priority;
 	desc.db_id = client->doorbell_id;
 
-	for_each_engine(engine, dev_priv) {
+	for_each_engine_masked(engine, dev_priv, client->engines) {
 		struct intel_context *ce = &ctx->engine[engine->id];
 		struct guc_execlist_context *lrc = &desc.lrc[engine->guc_id];
 		struct drm_i915_gem_object *obj;
@@ -374,6 +374,8 @@ static void guc_init_ctx_desc(struct intel_guc *guc,
 		desc.engines_used |= (1 << engine->guc_id);
 	}
 
+	DRM_DEBUG_DRIVER("Host engines 0x%x => GuC engines used 0x%x\n",
+			client->engines, desc.engines_used);
 	WARN_ON(desc.engines_used == 0);
 
 	/*
@@ -768,6 +770,7 @@ static void guc_init_doorbell_hw(struct intel_guc *guc)
  */
 static struct i915_guc_client *
 guc_client_alloc(struct drm_i915_private *dev_priv,
+		 uint32_t engines,
 		 uint32_t priority,
 		 struct i915_gem_context *ctx)
 {
@@ -780,10 +783,11 @@ static void guc_init_doorbell_hw(struct intel_guc *guc)
 	if (!client)
 		return NULL;
 
-	client->doorbell_id = GUC_INVALID_DOORBELL_ID;
-	client->priority = priority;
 	client->owner = ctx;
 	client->guc = guc;
+	client->engines = engines;
+	client->priority = priority;
+	client->doorbell_id = GUC_INVALID_DOORBELL_ID;
 
 	client->ctx_index = (uint32_t)ida_simple_get(&guc->ctx_ids, 0,
 			GUC_MAX_GPU_CONTEXTS, GFP_KERNEL);
@@ -825,8 +829,8 @@ static void guc_init_doorbell_hw(struct intel_guc *guc)
 	if (guc_init_doorbell(guc, client, db_id))
 		goto err;
 
-	DRM_DEBUG_DRIVER("new priority %u client %p: ctx_index %u\n",
-		priority, client, client->ctx_index);
+	DRM_DEBUG_DRIVER("new priority %u client %p for engine(s) 0x%x: ctx_index %u\n",
+		priority, client, client->engines, client->ctx_index);
 	DRM_DEBUG_DRIVER("doorbell id %u, cacheline offset 0x%x\n",
 		client->doorbell_id, client->doorbell_offset);
 
@@ -1011,6 +1015,7 @@ int i915_guc_submission_enable(struct drm_i915_private *dev_priv)
 	for_each_engine(engine, dev_priv) {
 		/* client for execbuf submission */
 		client = guc_client_alloc(dev_priv,
+					  intel_engine_flag(engine),
 					  GUC_CTX_PRIORITY_KMD_NORMAL,
 					  dev_priv->kernel_context);
 		if (!client) {
diff --git a/drivers/gpu/drm/i915/intel_guc.h b/drivers/gpu/drm/i915/intel_guc.h
index 7b4cc4d..53d41b5 100644
--- a/drivers/gpu/drm/i915/intel_guc.h
+++ b/drivers/gpu/drm/i915/intel_guc.h
@@ -67,6 +67,8 @@ struct i915_guc_client {
 	void *client_base;		/* first page (only) of above	*/
 	struct i915_gem_context *owner;
 	struct intel_guc *guc;
+
+	uint32_t engines;		/* bitmap of (host) engine ids	*/
 	uint32_t priority;
 	uint32_t ctx_index;
 
@@ -79,7 +81,6 @@ struct i915_guc_client {
 	uint32_t wq_offset;
 	uint32_t wq_size;
 	uint32_t wq_tail;
-	uint32_t unused;		/* Was 'wq_head'		*/
 
 	uint32_t no_wq_space;
 	uint32_t q_fail;		/* No longer used		*/
-- 
1.9.1


* [PATCH v2 5/6] drm/i915/guc: use for_each_engine_id() where appropriate
  2016-07-21 18:15 [PATCH v2 0/6] drm/i915/guc: use one GuC client per GPU engine Dave Gordon
                   ` (3 preceding siblings ...)
  2016-07-21 18:15 ` [PATCH v2 4/6] drm/i915/guc: add engine mask to GuC client & pass to GuC Dave Gordon
@ 2016-07-21 18:15 ` Dave Gordon
  2016-07-22 10:04   ` Tvrtko Ursulin
  2016-07-21 18:15 ` [PATCH v2 6/6] drm/i915/guc: re-optimise i915_guc_client layout Dave Gordon
  2016-07-22  6:02 ` ✓ Ro.CI.BAT: success for drm/i915/guc: use one GuC client per GPU engine Patchwork
  6 siblings, 1 reply; 18+ messages in thread
From: Dave Gordon @ 2016-07-21 18:15 UTC (permalink / raw)
  To: intel-gfx

Now that host structures are indexed by host engine-id rather than
guc_id, we can usefully convert some for_each_engine() loops to use
for_each_engine_id() and avoid multiple dereferences of engine->id.

Also make a few related tweaks to cache structure members in locals wherever
they're used more than once or twice, hopefully eliminating repeated memory
references.

Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
---
 drivers/gpu/drm/i915/i915_debugfs.c        | 17 +++++++++--------
 drivers/gpu/drm/i915/i915_guc_submission.c | 22 +++++++++++++---------
 2 files changed, 22 insertions(+), 17 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index 793b1d9..2106766 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -2535,6 +2535,7 @@ static void i915_guc_client_info(struct seq_file *m,
 				 struct i915_guc_client *client)
 {
 	struct intel_engine_cs *engine;
+	enum intel_engine_id id;
 	uint64_t tot = 0;
 
 	seq_printf(m, "\tPriority %d, GuC ctx index: %u, PD offset 0x%x\n",
@@ -2549,11 +2550,11 @@ static void i915_guc_client_info(struct seq_file *m,
 	seq_printf(m, "\tFailed doorbell: %u\n", client->b_fail);
 	seq_printf(m, "\tLast submission result: %d\n", client->retcode);
 
-	for_each_engine(engine, dev_priv) {
+	for_each_engine_id(engine, dev_priv, id) {
+		u64 submissions = client->submissions[id];
+		tot += submissions;
 		seq_printf(m, "\tSubmissions: %llu %s\n",
-				client->submissions[engine->id],
-				engine->name);
-		tot += client->submissions[engine->id];
+				submissions, engine->name);
 	}
 	seq_printf(m, "\tTotal: %llu\n", tot);
 }
@@ -2598,11 +2599,11 @@ static int i915_guc_info(struct seq_file *m, void *data)
 	seq_printf(m, "GuC last action error code: %d\n", guc.action_err);
 
 	seq_printf(m, "\nGuC submissions:\n");
-	for_each_engine(engine, dev_priv) {
+	for_each_engine_id(engine, dev_priv, id) {
+		u64 submissions = guc.submissions[id];
+		total += submissions;
 		seq_printf(m, "\t%-24s: %10llu, last seqno 0x%08x\n",
-			engine->name, guc.submissions[engine->id],
-			guc.last_seqno[engine->id]);
-		total += guc.submissions[engine->id];
+			engine->name, submissions, guc.last_seqno[id]);
 	}
 	seq_printf(m, "\t%s: %llu\n", "Total", total);
 
diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
index 6756db0..ece3479 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -342,7 +342,8 @@ static void guc_init_ctx_desc(struct intel_guc *guc,
 
 	for_each_engine_masked(engine, dev_priv, client->engines) {
 		struct intel_context *ce = &ctx->engine[engine->id];
-		struct guc_execlist_context *lrc = &desc.lrc[engine->guc_id];
+		uint32_t guc_engine_id = engine->guc_id;
+		struct guc_execlist_context *lrc = &desc.lrc[guc_engine_id];
 		struct drm_i915_gem_object *obj;
 
 		/* TODO: We have a design issue to be solved here. Only when we
@@ -361,7 +362,7 @@ static void guc_init_ctx_desc(struct intel_guc *guc,
 		gfx_addr = i915_gem_obj_ggtt_offset(ce->state);
 		lrc->ring_lcra = gfx_addr + LRC_STATE_PN * PAGE_SIZE;
 		lrc->context_id = (client->ctx_index << GUC_ELC_CTXID_OFFSET) |
-				(engine->guc_id << GUC_ELC_ENGINE_OFFSET);
+				(guc_engine_id << GUC_ELC_ENGINE_OFFSET);
 
 		obj = ce->ringbuf->obj;
 		gfx_addr = i915_gem_obj_ggtt_offset(obj);
@@ -371,7 +372,7 @@ static void guc_init_ctx_desc(struct intel_guc *guc,
 		lrc->ring_next_free_location = gfx_addr;
 		lrc->ring_current_tail_pointer_value = 0;
 
-		desc.engines_used |= (1 << engine->guc_id);
+		desc.engines_used |= (1 << guc_engine_id);
 	}
 
 	DRM_DEBUG_DRIVER("Host engines 0x%x => GuC engines used 0x%x\n",
@@ -461,6 +462,7 @@ static void guc_add_workqueue_item(struct i915_guc_client *gc,
 	/* wqi_len is in DWords, and does not include the one-word header */
 	const size_t wqi_size = sizeof(struct guc_wq_item);
 	const u32 wqi_len = wqi_size/sizeof(u32) - 1;
+	struct intel_engine_cs *engine = rq->engine;
 	struct guc_process_desc *desc;
 	struct guc_wq_item *wqi;
 	void *base;
@@ -502,12 +504,11 @@ static void guc_add_workqueue_item(struct i915_guc_client *gc,
 	/* Now fill in the 4-word work queue item */
 	wqi->header = WQ_TYPE_INORDER |
 			(wqi_len << WQ_LEN_SHIFT) |
-			(rq->engine->guc_id << WQ_TARGET_SHIFT) |
+			(engine->guc_id << WQ_TARGET_SHIFT) |
 			WQ_NO_WCFLUSH_WAIT;
 
 	/* The GuC wants only the low-order word of the context descriptor */
-	wqi->context_desc = (u32)intel_lr_context_descriptor(rq->ctx,
-							     rq->engine);
+	wqi->context_desc = (u32)intel_lr_context_descriptor(rq->ctx, engine);
 
 	wqi->ring_tail = tail << WQ_RING_TAIL_SHIFT;
 	wqi->fence_id = rq->fence.seqno;
@@ -1035,11 +1036,14 @@ int i915_guc_submission_enable(struct drm_i915_private *dev_priv)
 void i915_guc_submission_disable(struct drm_i915_private *dev_priv)
 {
 	struct intel_guc *guc = &dev_priv->guc;
+	struct i915_guc_client **gc_ptr;
 	struct intel_engine_cs *engine;
+	enum intel_engine_id engine_id;
 
-	for_each_engine(engine, dev_priv) {
-		guc_client_free(dev_priv, guc->exec_clients[engine->id]);
-		guc->exec_clients[engine->id] = NULL;
+	for_each_engine_id(engine, dev_priv, engine_id) {
+		gc_ptr = &guc->exec_clients[engine_id];
+		guc_client_free(dev_priv, *gc_ptr);
+		*gc_ptr = NULL;
 	}
 }
 
-- 
1.9.1


* [PATCH v2 6/6] drm/i915/guc: re-optimise i915_guc_client layout
  2016-07-21 18:15 [PATCH v2 0/6] drm/i915/guc: use one GuC client per GPU engine Dave Gordon
                   ` (4 preceding siblings ...)
  2016-07-21 18:15 ` [PATCH v2 5/6] drm/i915/guc: use for_each_engine_id() where appropriate Dave Gordon
@ 2016-07-21 18:15 ` Dave Gordon
  2016-07-22 10:05   ` Tvrtko Ursulin
  2016-07-22  6:02 ` ✓ Ro.CI.BAT: success for drm/i915/guc: use one GuC client per GPU engine Patchwork
  6 siblings, 1 reply; 18+ messages in thread
From: Dave Gordon @ 2016-07-21 18:15 UTC (permalink / raw)
  To: intel-gfx

As we're tweaking the GuC-related code in debugfs, we can
drop the now-unused 'q_fail' and repack the structure.
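The repacking can be checked mechanically with `sizeof`/`offsetof`. The two structs below are an editor's reduced sketch of the layout change, keeping only a few of the fields from the patch; the point is that after a 32-bit member is dropped elsewhere, three u16 padding slots (rather than one) keep the following fields on their original 4-byte-aligned offsets with no compiler-inserted padding.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Before: one u16 pads doorbell_id back to 4-byte alignment. */
struct layout_before {
	uint32_t doorbell_offset;
	uint32_t cookie;
	uint16_t doorbell_id;
	uint16_t padding;	/* maintain alignment */
	uint32_t wq_offset;
};

/* After: three u16s absorb the space freed elsewhere in the struct,
 * so no implicit padding appears and later fields stay aligned. */
struct layout_after {
	uint32_t doorbell_offset;
	uint32_t cookie;
	uint16_t doorbell_id;
	uint16_t padding[3];	/* maintain alignment */
	uint32_t wq_offset;
};
```

Both layouts are padding-free as written: every size is exactly the sum of the member sizes, which is what "repack" aims for.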

Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
---
 drivers/gpu/drm/i915/i915_debugfs.c | 1 -
 drivers/gpu/drm/i915/intel_guc.h    | 6 ++----
 2 files changed, 2 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index 2106766..096d212 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -2546,7 +2546,6 @@ static void i915_guc_client_info(struct seq_file *m,
 		client->wq_size, client->wq_offset, client->wq_tail);
 
 	seq_printf(m, "\tWork queue full: %u\n", client->no_wq_space);
-	seq_printf(m, "\tFailed to queue: %u\n", client->q_fail);
 	seq_printf(m, "\tFailed doorbell: %u\n", client->b_fail);
 	seq_printf(m, "\tLast submission result: %d\n", client->retcode);
 
diff --git a/drivers/gpu/drm/i915/intel_guc.h b/drivers/gpu/drm/i915/intel_guc.h
index 53d41b5..a74b128 100644
--- a/drivers/gpu/drm/i915/intel_guc.h
+++ b/drivers/gpu/drm/i915/intel_guc.h
@@ -71,19 +71,17 @@ struct i915_guc_client {
 	uint32_t engines;		/* bitmap of (host) engine ids	*/
 	uint32_t priority;
 	uint32_t ctx_index;
-
 	uint32_t proc_desc_offset;
+
 	uint32_t doorbell_offset;
 	uint32_t cookie;
 	uint16_t doorbell_id;
-	uint16_t padding;		/* Maintain alignment		*/
+	uint16_t padding[3];		/* Maintain alignment		*/
 
 	uint32_t wq_offset;
 	uint32_t wq_size;
 	uint32_t wq_tail;
-
 	uint32_t no_wq_space;
-	uint32_t q_fail;		/* No longer used		*/
 	uint32_t b_fail;
 	int retcode;
 
-- 
1.9.1


* Re: [PATCH v2 3/6] drm/i915/guc: use a separate GuC client for each engine
  2016-07-21 18:15 ` [PATCH v2 3/6] drm/i915/guc: use a separate GuC client for each engine Dave Gordon
@ 2016-07-21 18:30   ` Chris Wilson
  2016-07-22 10:23     ` Dave Gordon
  0 siblings, 1 reply; 18+ messages in thread
From: Chris Wilson @ 2016-07-21 18:30 UTC (permalink / raw)
  To: Dave Gordon; +Cc: intel-gfx

On Thu, Jul 21, 2016 at 07:15:39PM +0100, Dave Gordon wrote:
> When using a single GuC client for multiple engines, the i915 driver has
> to merge all work items into a single work queue, which the GuC firmware
> then demultiplexes into separate submission queues per engine. In
> theory, this could lead to the single queue becoming a bottleneck in
> which an excess of outstanding work for one or more engines might
> prevent work for an idle engine reaching the hardware.
> 
> To reduce this risk, we can create one GuC client per engine. Each will
> have its own workqueue, to be used only for work targeting a single
> engine, so there will be no cross-engine contention for workqueue slots.
> 
> Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Does guc_context_desc.engines_used have any effect?
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

* ✓ Ro.CI.BAT: success for drm/i915/guc: use one GuC client per GPU engine
  2016-07-21 18:15 [PATCH v2 0/6] drm/i915/guc: use one GuC client per GPU engine Dave Gordon
                   ` (5 preceding siblings ...)
  2016-07-21 18:15 ` [PATCH v2 6/6] drm/i915/guc: re-optimise i915_guc_client layout Dave Gordon
@ 2016-07-22  6:02 ` Patchwork
  2016-07-22 10:06   ` Tvrtko Ursulin
  6 siblings, 1 reply; 18+ messages in thread
From: Patchwork @ 2016-07-22  6:02 UTC (permalink / raw)
  To: Dave Gordon; +Cc: intel-gfx

== Series Details ==

Series: drm/i915/guc: use one GuC client per GPU engine
URL   : https://patchwork.freedesktop.org/series/10149/
State : success

== Summary ==

Series 10149v1 drm/i915/guc: use one GuC client per GPU engine
http://patchwork.freedesktop.org/api/1.0/series/10149/revisions/1/mbox


fi-hsw-i7-4770k  total:244  pass:216  dwarn:0   dfail:0   fail:8   skip:20 
fi-kbl-qkkr      total:244  pass:179  dwarn:29  dfail:0   fail:8   skip:28 
fi-skl-i5-6260u  total:244  pass:224  dwarn:0   dfail:0   fail:8   skip:12 
fi-skl-i7-6700k  total:244  pass:210  dwarn:0   dfail:0   fail:8   skip:26 
fi-snb-i7-2600   total:244  pass:196  dwarn:0   dfail:0   fail:8   skip:40 
ro-bdw-i5-5250u  total:244  pass:219  dwarn:4   dfail:0   fail:8   skip:13 
ro-bdw-i7-5600u  total:244  pass:204  dwarn:0   dfail:0   fail:8   skip:32 
ro-bsw-n3050     total:218  pass:173  dwarn:0   dfail:0   fail:2   skip:42 
ro-byt-n2820     total:244  pass:197  dwarn:0   dfail:0   fail:9   skip:38 
ro-hsw-i3-4010u  total:244  pass:212  dwarn:0   dfail:0   fail:8   skip:24 
ro-hsw-i7-4770r  total:244  pass:212  dwarn:0   dfail:0   fail:8   skip:24 
ro-ilk-i7-620lm  total:244  pass:172  dwarn:0   dfail:0   fail:9   skip:63 
ro-ilk1-i5-650   total:239  pass:172  dwarn:0   dfail:0   fail:9   skip:58 
ro-ivb-i7-3770   total:244  pass:203  dwarn:0   dfail:0   fail:8   skip:33 
ro-skl3-i5-6260u total:244  pass:224  dwarn:0   dfail:0   fail:8   skip:12 
ro-snb-i7-2620M  total:244  pass:193  dwarn:0   dfail:0   fail:9   skip:42 
ro-bdw-i7-5557U failed to connect after reboot

Results at /archive/results/CI_IGT_test/RO_Patchwork_1565/

cf82f46 drm-intel-nightly: 2016y-07m-21d-20h-43m-36s UTC integration manifest
45411d3 drm/i915/guc: re-optimise i915_guc_client layout
c6b2ad6 drm/i915/guc: use for_each_engine_id() where appropriate
df49243 drm/i915/guc: add engine mask to GuC client & pass to GuC
1645de9 drm/i915/guc: use a separate GuC client for each engine
c810e6b drm/i915/guc: refactor guc_init_doorbell_hw()
ff4f069 drm/i915/guc: doorbell reset should avoid used doorbells

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 1/6] drm/i915/guc: doorbell reset should avoid used doorbells
  2016-07-21 18:15 ` [PATCH v2 1/6] drm/i915/guc: doorbell reset should avoid used doorbells Dave Gordon
@ 2016-07-22 10:00   ` Tvrtko Ursulin
  0 siblings, 0 replies; 18+ messages in thread
From: Tvrtko Ursulin @ 2016-07-22 10:00 UTC (permalink / raw)
  To: Dave Gordon, intel-gfx


On 21/07/16 19:15, Dave Gordon wrote:
> guc_init_doorbell_hw() borrows the (currently single) GuC client to use
> in reinitialising ALL the doorbell registers (as the hardware doesn't
> reset them when the GuC is reset). As a prerequisite for accommodating
> multiple clients, it should only reset doorbells that are supposed to be
> disabled, avoiding those that are marked as in use by any client.
>
> Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
> ---
>   drivers/gpu/drm/i915/i915_guc_submission.c | 8 +++++++-
>   1 file changed, 7 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
> index 01c1c16..9c5b81b 100644
> --- a/drivers/gpu/drm/i915/i915_guc_submission.c
> +++ b/drivers/gpu/drm/i915/i915_guc_submission.c
> @@ -699,7 +699,7 @@ static void gem_release_guc_obj(struct drm_i915_gem_object *obj)
>   }
>
>   /*
> - * Borrow the first client to set up & tear down every doorbell
> + * Borrow the first client to set up & tear down each unused doorbell
>    * in turn, to ensure that all doorbell h/w is (re)initialised.
>    */
>   static void guc_init_doorbell_hw(struct intel_guc *guc)
> @@ -715,6 +715,9 @@ static void guc_init_doorbell_hw(struct intel_guc *guc)
>   		i915_reg_t drbreg = GEN8_DRBREGL(i);
>   		u32 value = I915_READ(drbreg);
>
> +		if (test_bit(i, guc->doorbell_bitmap))
> +			continue;
> +
>   		err = guc_update_doorbell_id(guc, client, i);
>
>   		/* Report update failure or unexpectedly active doorbell */
> @@ -733,6 +736,9 @@ static void guc_init_doorbell_hw(struct intel_guc *guc)
>   		i915_reg_t drbreg = GEN8_DRBREGL(i);
>   		u32 value = I915_READ(drbreg);
>
> +		if (test_bit(i, guc->doorbell_bitmap))
> +			continue;
> +
>   		if (i != db_id && (value & GUC_DOORBELL_ENABLED))
>   			DRM_DEBUG_DRIVER("Doorbell %d (reg 0x%x) finally 0x%x\n",
>   					  i, drbreg.reg, value);
>

Still,

Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Regards,

Tvrtko

* Re: [PATCH v2 2/6] drm/i915/guc: refactor guc_init_doorbell_hw()
  2016-07-21 18:15 ` [PATCH v2 2/6] drm/i915/guc: refactor guc_init_doorbell_hw() Dave Gordon
@ 2016-07-22 10:01   ` Tvrtko Ursulin
  0 siblings, 0 replies; 18+ messages in thread
From: Tvrtko Ursulin @ 2016-07-22 10:01 UTC (permalink / raw)
  To: Dave Gordon, intel-gfx


On 21/07/16 19:15, Dave Gordon wrote:
> We have essentially the same code in each of two different loops, so we
> can refactor it into a little helper function.
>
> Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
> ---
>   drivers/gpu/drm/i915/i915_guc_submission.c | 54 +++++++++++++++++-------------
>   1 file changed, 30 insertions(+), 24 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
> index 9c5b81b..816bdca 100644
> --- a/drivers/gpu/drm/i915/i915_guc_submission.c
> +++ b/drivers/gpu/drm/i915/i915_guc_submission.c
> @@ -698,32 +698,47 @@ static void gem_release_guc_obj(struct drm_i915_gem_object *obj)
>   	kfree(client);
>   }
>
> +/* Check that a doorbell register is in the expected state */
> +static bool guc_doorbell_check(struct intel_guc *guc, uint16_t db_id)
> +{
> +	struct drm_i915_private *dev_priv = guc_to_i915(guc);
> +	i915_reg_t drbreg = GEN8_DRBREGL(db_id);
> +	uint32_t value = I915_READ(drbreg);
> +	bool enabled = (value & GUC_DOORBELL_ENABLED) != 0;
> +	bool expected = test_bit(db_id, guc->doorbell_bitmap);
> +
> +	if (enabled == expected)
> +		return true;
> +
> +	DRM_DEBUG_DRIVER("Doorbell %d (reg 0x%x) 0x%x, should be %s\n",
> +			 db_id, drbreg.reg, value,
> +			 expected ? "active" : "inactive");
> +
> +	return false;
> +}
> +
>   /*
>    * Borrow the first client to set up & tear down each unused doorbell
>    * in turn, to ensure that all doorbell h/w is (re)initialised.
>    */
>   static void guc_init_doorbell_hw(struct intel_guc *guc)
>   {
> -	struct drm_i915_private *dev_priv = guc_to_i915(guc);
>   	struct i915_guc_client *client = guc->execbuf_client;
> -	uint16_t db_id, i;
> -	int err;
> +	uint16_t db_id;
> +	int i, err;
>
> +	/* Save client's original doorbell selection */
>   	db_id = client->doorbell_id;
>
>   	for (i = 0; i < GUC_MAX_DOORBELLS; ++i) {
> -		i915_reg_t drbreg = GEN8_DRBREGL(i);
> -		u32 value = I915_READ(drbreg);
> -
> -		if (test_bit(i, guc->doorbell_bitmap))
> +		/* Skip if doorbell is OK */
> +		if (guc_doorbell_check(guc, i))
>   			continue;
>
>   		err = guc_update_doorbell_id(guc, client, i);
> -
> -		/* Report update failure or unexpectedly active doorbell */
> -		if (err || (i != db_id && (value & GUC_DOORBELL_ENABLED)))
> -			DRM_DEBUG_DRIVER("Doorbell %d (reg 0x%x) was 0x%x, err %d\n",
> -					  i, drbreg.reg, value, err);
> +		if (err)
> +			DRM_DEBUG_DRIVER("Doorbell %d update failed, err %d\n",
> +					i, err);
>   	}
>
>   	/* Restore to original value */
> @@ -732,18 +747,9 @@ static void guc_init_doorbell_hw(struct intel_guc *guc)
>   		DRM_ERROR("Failed to restore doorbell to %d, err %d\n",
>   			db_id, err);
>
> -	for (i = 0; i < GUC_MAX_DOORBELLS; ++i) {
> -		i915_reg_t drbreg = GEN8_DRBREGL(i);
> -		u32 value = I915_READ(drbreg);
> -
> -		if (test_bit(i, guc->doorbell_bitmap))
> -			continue;
> -
> -		if (i != db_id && (value & GUC_DOORBELL_ENABLED))
> -			DRM_DEBUG_DRIVER("Doorbell %d (reg 0x%x) finally 0x%x\n",
> -					  i, drbreg.reg, value);
> -
> -	}
> +	/* Read back & verify all doorbell registers */
> +	for (i = 0; i < GUC_MAX_DOORBELLS; ++i)
> +		(void)guc_doorbell_check(guc, i);
>   }
>
>   /**
>

Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Regards,

Tvrtko

* Re: [PATCH v2 5/6] drm/i915/guc: use for_each_engine_id() where appropriate
  2016-07-21 18:15 ` [PATCH v2 5/6] drm/i915/guc: use for_each_engine_id() where appropriate Dave Gordon
@ 2016-07-22 10:04   ` Tvrtko Ursulin
  2016-07-22 10:45     ` Dave Gordon
  0 siblings, 1 reply; 18+ messages in thread
From: Tvrtko Ursulin @ 2016-07-22 10:04 UTC (permalink / raw)
  To: Dave Gordon, intel-gfx


On 21/07/16 19:15, Dave Gordon wrote:
> Now that host structures are indexed by host engine-id rather than
> guc_id, we can usefully convert some for_each_engine() loops to use
> for_each_engine_id() and avoid multiple dereferences of engine->id.
>
> Also a few related tweaks to cache structure members locally wherever
> they're used more than once or twice, hopefully eliminating memory
> references.
>
> Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
> ---
>   drivers/gpu/drm/i915/i915_debugfs.c        | 17 +++++++++--------
>   drivers/gpu/drm/i915/i915_guc_submission.c | 22 +++++++++++++---------
>   2 files changed, 22 insertions(+), 17 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
> index 793b1d9..2106766 100644
> --- a/drivers/gpu/drm/i915/i915_debugfs.c
> +++ b/drivers/gpu/drm/i915/i915_debugfs.c
> @@ -2535,6 +2535,7 @@ static void i915_guc_client_info(struct seq_file *m,
>   				 struct i915_guc_client *client)
>   {
>   	struct intel_engine_cs *engine;
> +	enum intel_engine_id id;
>   	uint64_t tot = 0;
>
>   	seq_printf(m, "\tPriority %d, GuC ctx index: %u, PD offset 0x%x\n",
> @@ -2549,11 +2550,11 @@ static void i915_guc_client_info(struct seq_file *m,
>   	seq_printf(m, "\tFailed doorbell: %u\n", client->b_fail);
>   	seq_printf(m, "\tLast submission result: %d\n", client->retcode);
>
> -	for_each_engine(engine, dev_priv) {
> +	for_each_engine_id(engine, dev_priv, id) {
> +		u64 submissions = client->submissions[id];
> +		tot += submissions;
>   		seq_printf(m, "\tSubmissions: %llu %s\n",
> -				client->submissions[engine->id],
> -				engine->name);
> -		tot += client->submissions[engine->id];
> +				submissions, engine->name);
>   	}
>   	seq_printf(m, "\tTotal: %llu\n", tot);
>   }
> @@ -2598,11 +2599,11 @@ static int i915_guc_info(struct seq_file *m, void *data)
>   	seq_printf(m, "GuC last action error code: %d\n", guc.action_err);
>
>   	seq_printf(m, "\nGuC submissions:\n");
> -	for_each_engine(engine, dev_priv) {
> +	for_each_engine_id(engine, dev_priv, id) {
> +		u64 submissions = guc.submissions[id];
> +		total += submissions;
>   		seq_printf(m, "\t%-24s: %10llu, last seqno 0x%08x\n",
> -			engine->name, guc.submissions[engine->id],
> -			guc.last_seqno[engine->id]);
> -		total += guc.submissions[engine->id];
> +			engine->name, submissions, guc.last_seqno[id]);
>   	}
>   	seq_printf(m, "\t%s: %llu\n", "Total", total);
>
> diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
> index 6756db0..ece3479 100644
> --- a/drivers/gpu/drm/i915/i915_guc_submission.c
> +++ b/drivers/gpu/drm/i915/i915_guc_submission.c
> @@ -342,7 +342,8 @@ static void guc_init_ctx_desc(struct intel_guc *guc,
>
>   	for_each_engine_masked(engine, dev_priv, client->engines) {
>   		struct intel_context *ce = &ctx->engine[engine->id];
> -		struct guc_execlist_context *lrc = &desc.lrc[engine->guc_id];
> +		uint32_t guc_engine_id = engine->guc_id;
> +		struct guc_execlist_context *lrc = &desc.lrc[guc_engine_id];
>   		struct drm_i915_gem_object *obj;
>
>   		/* TODO: We have a design issue to be solved here. Only when we
> @@ -361,7 +362,7 @@ static void guc_init_ctx_desc(struct intel_guc *guc,
>   		gfx_addr = i915_gem_obj_ggtt_offset(ce->state);
>   		lrc->ring_lcra = gfx_addr + LRC_STATE_PN * PAGE_SIZE;
>   		lrc->context_id = (client->ctx_index << GUC_ELC_CTXID_OFFSET) |
> -				(engine->guc_id << GUC_ELC_ENGINE_OFFSET);
> +				(guc_engine_id << GUC_ELC_ENGINE_OFFSET);
>
>   		obj = ce->ringbuf->obj;
>   		gfx_addr = i915_gem_obj_ggtt_offset(obj);
> @@ -371,7 +372,7 @@ static void guc_init_ctx_desc(struct intel_guc *guc,
>   		lrc->ring_next_free_location = gfx_addr;
>   		lrc->ring_current_tail_pointer_value = 0;
>
> -		desc.engines_used |= (1 << engine->guc_id);
> +		desc.engines_used |= (1 << guc_engine_id);
>   	}
>
>   	DRM_DEBUG_DRIVER("Host engines 0x%x => GuC engines used 0x%x\n",
> @@ -461,6 +462,7 @@ static void guc_add_workqueue_item(struct i915_guc_client *gc,
>   	/* wqi_len is in DWords, and does not include the one-word header */
>   	const size_t wqi_size = sizeof(struct guc_wq_item);
>   	const u32 wqi_len = wqi_size/sizeof(u32) - 1;
> +	struct intel_engine_cs *engine = rq->engine;
>   	struct guc_process_desc *desc;
>   	struct guc_wq_item *wqi;
>   	void *base;
> @@ -502,12 +504,11 @@ static void guc_add_workqueue_item(struct i915_guc_client *gc,
>   	/* Now fill in the 4-word work queue item */
>   	wqi->header = WQ_TYPE_INORDER |
>   			(wqi_len << WQ_LEN_SHIFT) |
> -			(rq->engine->guc_id << WQ_TARGET_SHIFT) |
> +			(engine->guc_id << WQ_TARGET_SHIFT) |
>   			WQ_NO_WCFLUSH_WAIT;
>
>   	/* The GuC wants only the low-order word of the context descriptor */
> -	wqi->context_desc = (u32)intel_lr_context_descriptor(rq->ctx,
> -							     rq->engine);
> +	wqi->context_desc = (u32)intel_lr_context_descriptor(rq->ctx, engine);
>
>   	wqi->ring_tail = tail << WQ_RING_TAIL_SHIFT;
>   	wqi->fence_id = rq->fence.seqno;
> @@ -1035,11 +1036,14 @@ int i915_guc_submission_enable(struct drm_i915_private *dev_priv)
>   void i915_guc_submission_disable(struct drm_i915_private *dev_priv)
>   {
>   	struct intel_guc *guc = &dev_priv->guc;
> +	struct i915_guc_client **gc_ptr;
>   	struct intel_engine_cs *engine;
> +	enum intel_engine_id engine_id;
>
> -	for_each_engine(engine, dev_priv) {
> -		guc_client_free(dev_priv, guc->exec_clients[engine->id]);
> -		guc->exec_clients[engine->id] = NULL;
> +	for_each_engine_id(engine, dev_priv, engine_id) {
> +		gc_ptr = &guc->exec_clients[engine_id];
> +		guc_client_free(dev_priv, *gc_ptr);
> +		*gc_ptr = NULL;
>   	}
>   }
>
>

Didn't spot a difference from v1 which I reviewed before. What is new in v2?

Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Regards,

Tvrtko

* Re: [PATCH v2 6/6] drm/i915/guc: re-optimise i915_guc_client layout
  2016-07-21 18:15 ` [PATCH v2 6/6] drm/i915/guc: re-optimise i915_guc_client layout Dave Gordon
@ 2016-07-22 10:05   ` Tvrtko Ursulin
  0 siblings, 0 replies; 18+ messages in thread
From: Tvrtko Ursulin @ 2016-07-22 10:05 UTC (permalink / raw)
  To: Dave Gordon, intel-gfx


On 21/07/16 19:15, Dave Gordon wrote:
> As we're tweaking the GuC-related code in debugfs, we can
> drop the now-used 'q_fail' and repack the structure.

now-unused

>
> Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
> ---
>   drivers/gpu/drm/i915/i915_debugfs.c | 1 -
>   drivers/gpu/drm/i915/intel_guc.h    | 6 ++----
>   2 files changed, 2 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
> index 2106766..096d212 100644
> --- a/drivers/gpu/drm/i915/i915_debugfs.c
> +++ b/drivers/gpu/drm/i915/i915_debugfs.c
> @@ -2546,7 +2546,6 @@ static void i915_guc_client_info(struct seq_file *m,
>   		client->wq_size, client->wq_offset, client->wq_tail);
>
>   	seq_printf(m, "\tWork queue full: %u\n", client->no_wq_space);
> -	seq_printf(m, "\tFailed to queue: %u\n", client->q_fail);
>   	seq_printf(m, "\tFailed doorbell: %u\n", client->b_fail);
>   	seq_printf(m, "\tLast submission result: %d\n", client->retcode);
>
> diff --git a/drivers/gpu/drm/i915/intel_guc.h b/drivers/gpu/drm/i915/intel_guc.h
> index 53d41b5..a74b128 100644
> --- a/drivers/gpu/drm/i915/intel_guc.h
> +++ b/drivers/gpu/drm/i915/intel_guc.h
> @@ -71,19 +71,17 @@ struct i915_guc_client {
>   	uint32_t engines;		/* bitmap of (host) engine ids	*/
>   	uint32_t priority;
>   	uint32_t ctx_index;
> -
>   	uint32_t proc_desc_offset;
> +
>   	uint32_t doorbell_offset;
>   	uint32_t cookie;
>   	uint16_t doorbell_id;
> -	uint16_t padding;		/* Maintain alignment		*/
> +	uint16_t padding[3];		/* Maintain alignment		*/
>
>   	uint32_t wq_offset;
>   	uint32_t wq_size;
>   	uint32_t wq_tail;
> -
>   	uint32_t no_wq_space;
> -	uint32_t q_fail;		/* No longer used		*/
>   	uint32_t b_fail;
>   	int retcode;
>
>

Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Regards,

Tvrtko

* Re: ✓ Ro.CI.BAT: success for drm/i915/guc: use one GuC client per GPU engine
  2016-07-22  6:02 ` ✓ Ro.CI.BAT: success for drm/i915/guc: use one GuC client per GPU engine Patchwork
@ 2016-07-22 10:06   ` Tvrtko Ursulin
  2016-07-22 15:17     ` Dave Gordon
  0 siblings, 1 reply; 18+ messages in thread
From: Tvrtko Ursulin @ 2016-07-22 10:06 UTC (permalink / raw)
  To: intel-gfx, Dave Gordon


On 22/07/16 07:02, Patchwork wrote:
> == Series Details ==
>
> Series: drm/i915/guc: use one GuC client per GPU engine
> URL   : https://patchwork.freedesktop.org/series/10149/
> State : success
>
> == Summary ==
>
> Series 10149v1 drm/i915/guc: use one GuC client per GPU engine
> http://patchwork.freedesktop.org/api/1.0/series/10149/revisions/1/mbox
>
>
> fi-hsw-i7-4770k  total:244  pass:216  dwarn:0   dfail:0   fail:8   skip:20
> fi-kbl-qkkr      total:244  pass:179  dwarn:29  dfail:0   fail:8   skip:28
> fi-skl-i5-6260u  total:244  pass:224  dwarn:0   dfail:0   fail:8   skip:12
> fi-skl-i7-6700k  total:244  pass:210  dwarn:0   dfail:0   fail:8   skip:26
> fi-snb-i7-2600   total:244  pass:196  dwarn:0   dfail:0   fail:8   skip:40
> ro-bdw-i5-5250u  total:244  pass:219  dwarn:4   dfail:0   fail:8   skip:13
> ro-bdw-i7-5600u  total:244  pass:204  dwarn:0   dfail:0   fail:8   skip:32
> ro-bsw-n3050     total:218  pass:173  dwarn:0   dfail:0   fail:2   skip:42
> ro-byt-n2820     total:244  pass:197  dwarn:0   dfail:0   fail:9   skip:38
> ro-hsw-i3-4010u  total:244  pass:212  dwarn:0   dfail:0   fail:8   skip:24
> ro-hsw-i7-4770r  total:244  pass:212  dwarn:0   dfail:0   fail:8   skip:24
> ro-ilk-i7-620lm  total:244  pass:172  dwarn:0   dfail:0   fail:9   skip:63
> ro-ilk1-i5-650   total:239  pass:172  dwarn:0   dfail:0   fail:9   skip:58
> ro-ivb-i7-3770   total:244  pass:203  dwarn:0   dfail:0   fail:8   skip:33
> ro-skl3-i5-6260u total:244  pass:224  dwarn:0   dfail:0   fail:8   skip:12
> ro-snb-i7-2620M  total:244  pass:193  dwarn:0   dfail:0   fail:9   skip:42
> ro-bdw-i7-5557U failed to connect after reboot
>
> Results at /archive/results/CI_IGT_test/RO_Patchwork_1565/
>
> cf82f46 drm-intel-nightly: 2016y-07m-21d-20h-43m-36s UTC integration manifest
> 45411d3 drm/i915/guc: re-optimise i915_guc_client layout
> c6b2ad6 drm/i915/guc: use for_each_engine_id() where appropriate
> df49243 drm/i915/guc: add engine mask to GuC client & pass to GuC
> 1645de9 drm/i915/guc: use a separate GuC client for each engine
> c810e6b drm/i915/guc: refactor guc_init_doorbell_hw()
> ff4f069 drm/i915/guc: doorbell reset should avoid used doorbells

You'll need to send one with "do not merge enable guc" now that it was 
disabled on all platforms due to APL issues.

Regards,

Tvrtko


* Re: [PATCH v2 3/6] drm/i915/guc: use a separate GuC client for each engine
  2016-07-21 18:30   ` Chris Wilson
@ 2016-07-22 10:23     ` Dave Gordon
  2016-07-22 21:57       ` Chris Wilson
  0 siblings, 1 reply; 18+ messages in thread
From: Dave Gordon @ 2016-07-22 10:23 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx

On 21/07/16 19:30, Chris Wilson wrote:
> On Thu, Jul 21, 2016 at 07:15:39PM +0100, Dave Gordon wrote:
>> When using a single GuC client for multiple engines, the i915 driver has
>> to merge all work items into a single work queue, which the GuC firmware
>> then demultiplexes into separate submission queues per engine. In
>> theory, this could lead to the single queue becoming a bottleneck in
>> which an excess of outstanding work for one or more engines might
>> prevent work for an idle engine reaching the hardware.
>>
>> To reduce this risk, we can create one GuC client per engine. Each will
>> have its own workqueue, to be used only for work targeting a single
>> engine, so there will be no cross-engine contention for workqueue slots.
>>
>> Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
>> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>
> Does guc_context_desc.engines_used have any effect?
> -Chris

Yes, some of the firmware code uses it to optimise which queues it scans 
at certain times. If it knows that a certain queue *doesn't* contain 
work for a given engine, it can skip scanning that queue entirely.

Does this patchset change the results in the parallel-submission nop test?

.Dave.


* Re: [PATCH v2 5/6] drm/i915/guc: use for_each_engine_id() where appropriate
  2016-07-22 10:04   ` Tvrtko Ursulin
@ 2016-07-22 10:45     ` Dave Gordon
  0 siblings, 0 replies; 18+ messages in thread
From: Dave Gordon @ 2016-07-22 10:45 UTC (permalink / raw)
  To: Tvrtko Ursulin, intel-gfx

On 22/07/16 11:04, Tvrtko Ursulin wrote:
>
> On 21/07/16 19:15, Dave Gordon wrote:
>> Now that host structures are indexed by host engine-id rather than
>> guc_id, we can usefully convert some for_each_engine() loops to use
>> for_each_engine_id() and avoid multiple dereferences of engine->id.
>>
>> Also a few related tweaks to cache structure members locally wherever
>> they're used more than once or twice, hopefully eliminating memory
>> references.
>>
>> Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
>> ---
>>   drivers/gpu/drm/i915/i915_debugfs.c        | 17 +++++++++--------
>>   drivers/gpu/drm/i915/i915_guc_submission.c | 22 +++++++++++++---------
>>   2 files changed, 22 insertions(+), 17 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/i915/i915_debugfs.c
>> b/drivers/gpu/drm/i915/i915_debugfs.c
>> index 793b1d9..2106766 100644
>> --- a/drivers/gpu/drm/i915/i915_debugfs.c
>> +++ b/drivers/gpu/drm/i915/i915_debugfs.c
>> @@ -2535,6 +2535,7 @@ static void i915_guc_client_info(struct seq_file
>> *m,
>>                    struct i915_guc_client *client)
>>   {
>>       struct intel_engine_cs *engine;
>> +    enum intel_engine_id id;
>>       uint64_t tot = 0;
>>
>>       seq_printf(m, "\tPriority %d, GuC ctx index: %u, PD offset 0x%x\n",
>> @@ -2549,11 +2550,11 @@ static void i915_guc_client_info(struct
>> seq_file *m,
>>       seq_printf(m, "\tFailed doorbell: %u\n", client->b_fail);
>>       seq_printf(m, "\tLast submission result: %d\n", client->retcode);
>>
>> -    for_each_engine(engine, dev_priv) {
>> +    for_each_engine_id(engine, dev_priv, id) {
>> +        u64 submissions = client->submissions[id];
>> +        tot += submissions;
>>           seq_printf(m, "\tSubmissions: %llu %s\n",
>> -                client->submissions[engine->id],
>> -                engine->name);
>> -        tot += client->submissions[engine->id];
>> +                submissions, engine->name);
>>       }
>>       seq_printf(m, "\tTotal: %llu\n", tot);
>>   }
>> @@ -2598,11 +2599,11 @@ static int i915_guc_info(struct seq_file *m,
>> void *data)
>>       seq_printf(m, "GuC last action error code: %d\n", guc.action_err);
>>
>>       seq_printf(m, "\nGuC submissions:\n");
>> -    for_each_engine(engine, dev_priv) {
>> +    for_each_engine_id(engine, dev_priv, id) {
>> +        u64 submissions = guc.submissions[id];
>> +        total += submissions;
>>           seq_printf(m, "\t%-24s: %10llu, last seqno 0x%08x\n",
>> -            engine->name, guc.submissions[engine->id],
>> -            guc.last_seqno[engine->id]);
>> -        total += guc.submissions[engine->id];
>> +            engine->name, submissions, guc.last_seqno[id]);
>>       }
>>       seq_printf(m, "\t%s: %llu\n", "Total", total);
>>
>> diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c
>> b/drivers/gpu/drm/i915/i915_guc_submission.c
>> index 6756db0..ece3479 100644
>> --- a/drivers/gpu/drm/i915/i915_guc_submission.c
>> +++ b/drivers/gpu/drm/i915/i915_guc_submission.c
>> @@ -342,7 +342,8 @@ static void guc_init_ctx_desc(struct intel_guc *guc,
>>
>>       for_each_engine_masked(engine, dev_priv, client->engines) {
>>           struct intel_context *ce = &ctx->engine[engine->id];
>> -        struct guc_execlist_context *lrc = &desc.lrc[engine->guc_id];
>> +        uint32_t guc_engine_id = engine->guc_id;
>> +        struct guc_execlist_context *lrc = &desc.lrc[guc_engine_id];
>>           struct drm_i915_gem_object *obj;
>>
>>           /* TODO: We have a design issue to be solved here. Only when we
>> @@ -361,7 +362,7 @@ static void guc_init_ctx_desc(struct intel_guc *guc,
>>           gfx_addr = i915_gem_obj_ggtt_offset(ce->state);
>>           lrc->ring_lcra = gfx_addr + LRC_STATE_PN * PAGE_SIZE;
>>           lrc->context_id = (client->ctx_index << GUC_ELC_CTXID_OFFSET) |
>> -                (engine->guc_id << GUC_ELC_ENGINE_OFFSET);
>> +                (guc_engine_id << GUC_ELC_ENGINE_OFFSET);
>>
>>           obj = ce->ringbuf->obj;
>>           gfx_addr = i915_gem_obj_ggtt_offset(obj);
>> @@ -371,7 +372,7 @@ static void guc_init_ctx_desc(struct intel_guc *guc,
>>           lrc->ring_next_free_location = gfx_addr;
>>           lrc->ring_current_tail_pointer_value = 0;
>>
>> -        desc.engines_used |= (1 << engine->guc_id);
>> +        desc.engines_used |= (1 << guc_engine_id);
>>       }
>>
>>       DRM_DEBUG_DRIVER("Host engines 0x%x => GuC engines used 0x%x\n",
>> @@ -461,6 +462,7 @@ static void guc_add_workqueue_item(struct
>> i915_guc_client *gc,
>>       /* wqi_len is in DWords, and does not include the one-word
>> header */
>>       const size_t wqi_size = sizeof(struct guc_wq_item);
>>       const u32 wqi_len = wqi_size/sizeof(u32) - 1;
>> +    struct intel_engine_cs *engine = rq->engine;
>>       struct guc_process_desc *desc;
>>       struct guc_wq_item *wqi;
>>       void *base;
>> @@ -502,12 +504,11 @@ static void guc_add_workqueue_item(struct
>> i915_guc_client *gc,
>>       /* Now fill in the 4-word work queue item */
>>       wqi->header = WQ_TYPE_INORDER |
>>               (wqi_len << WQ_LEN_SHIFT) |
>> -            (rq->engine->guc_id << WQ_TARGET_SHIFT) |
>> +            (engine->guc_id << WQ_TARGET_SHIFT) |
>>               WQ_NO_WCFLUSH_WAIT;
>>
>>       /* The GuC wants only the low-order word of the context
>> descriptor */
>> -    wqi->context_desc = (u32)intel_lr_context_descriptor(rq->ctx,
>> -                                 rq->engine);
>> +    wqi->context_desc = (u32)intel_lr_context_descriptor(rq->ctx,
>> engine);
>>
>>       wqi->ring_tail = tail << WQ_RING_TAIL_SHIFT;
>>       wqi->fence_id = rq->fence.seqno;
>> @@ -1035,11 +1036,14 @@ int i915_guc_submission_enable(struct
>> drm_i915_private *dev_priv)
>>   void i915_guc_submission_disable(struct drm_i915_private *dev_priv)
>>   {
>>       struct intel_guc *guc = &dev_priv->guc;
>> +    struct i915_guc_client **gc_ptr;
>>       struct intel_engine_cs *engine;
>> +    enum intel_engine_id engine_id;
>>
>> -    for_each_engine(engine, dev_priv) {
>> -        guc_client_free(dev_priv, guc->exec_clients[engine->id]);
>> -        guc->exec_clients[engine->id] = NULL;
>> +    for_each_engine_id(engine, dev_priv, engine_id) {
>> +        gc_ptr = &guc->exec_clients[engine_id];
>> +        guc_client_free(dev_priv, *gc_ptr);
>> +        *gc_ptr = NULL;
>>       }
>>   }
>>
>>
>
> Didn't spot a difference from v1 which I reviewed before. What is new in
> v2?
>
> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>
> Regards,
> Tvrtko

Nothing different in this patch; it's v2 of the set, not of each patch 
within it. Notice in particular the presence of a cover letter (as it's 
more than a single patch), as agreed at yesterday's Tech Forum.

But I only found R-B's from you for v1 3/5 and 4/5; we discussed 1/5 and 
2/5 offline but AFAICT you didn't post an R-B for them?

.Dave.


* Re: ✓ Ro.CI.BAT: success for drm/i915/guc: use one GuC client per GPU engine
  2016-07-22 10:06   ` Tvrtko Ursulin
@ 2016-07-22 15:17     ` Dave Gordon
  0 siblings, 0 replies; 18+ messages in thread
From: Dave Gordon @ 2016-07-22 15:17 UTC (permalink / raw)
  To: Tvrtko Ursulin, intel-gfx

On 22/07/16 11:06, Tvrtko Ursulin wrote:
>
> On 22/07/16 07:02, Patchwork wrote:
>> == Series Details ==
>>
>> Series: drm/i915/guc: use one GuC client per GPU engine
>> URL   : https://patchwork.freedesktop.org/series/10149/
>> State : success
>>
>> == Summary ==
>>
>> Series 10149v1 drm/i915/guc: use one GuC client per GPU engine
>> http://patchwork.freedesktop.org/api/1.0/series/10149/revisions/1/mbox
>>
>>
>> fi-hsw-i7-4770k  total:244  pass:216  dwarn:0   dfail:0   fail:8   skip:20
>> fi-kbl-qkkr      total:244  pass:179  dwarn:29  dfail:0   fail:8   skip:28
>> fi-skl-i5-6260u  total:244  pass:224  dwarn:0   dfail:0   fail:8   skip:12
>> fi-skl-i7-6700k  total:244  pass:210  dwarn:0   dfail:0   fail:8   skip:26
>> fi-snb-i7-2600   total:244  pass:196  dwarn:0   dfail:0   fail:8   skip:40
>> ro-bdw-i5-5250u  total:244  pass:219  dwarn:4   dfail:0   fail:8   skip:13
>> ro-bdw-i7-5600u  total:244  pass:204  dwarn:0   dfail:0   fail:8   skip:32
>> ro-bsw-n3050     total:218  pass:173  dwarn:0   dfail:0   fail:2   skip:42
>> ro-byt-n2820     total:244  pass:197  dwarn:0   dfail:0   fail:9   skip:38
>> ro-hsw-i3-4010u  total:244  pass:212  dwarn:0   dfail:0   fail:8   skip:24
>> ro-hsw-i7-4770r  total:244  pass:212  dwarn:0   dfail:0   fail:8   skip:24
>> ro-ilk-i7-620lm  total:244  pass:172  dwarn:0   dfail:0   fail:9   skip:63
>> ro-ilk1-i5-650   total:239  pass:172  dwarn:0   dfail:0   fail:9   skip:58
>> ro-ivb-i7-3770   total:244  pass:203  dwarn:0   dfail:0   fail:8   skip:33
>> ro-skl3-i5-6260u total:244  pass:224  dwarn:0   dfail:0   fail:8   skip:12
>> ro-snb-i7-2620M  total:244  pass:193  dwarn:0   dfail:0   fail:9   skip:42
>> ro-bdw-i7-5557U failed to connect after reboot
>>
>> Results at /archive/results/CI_IGT_test/RO_Patchwork_1565/
>>
>> cf82f46 drm-intel-nightly: 2016y-07m-21d-20h-43m-36s UTC integration manifest
>> 45411d3 drm/i915/guc: re-optimise i915_guc_client layout
>> c6b2ad6 drm/i915/guc: use for_each_engine_id() where appropriate
>> df49243 drm/i915/guc: add engine mask to GuC client & pass to GuC
>> 1645de9 drm/i915/guc: use a separate GuC client for each engine
>> c810e6b drm/i915/guc: refactor guc_init_doorbell_hw()
>> ff4f069 drm/i915/guc: doorbell reset should avoid used doorbells
>
> You'll need to send one with "do not merge enable guc" now that it was
> disabled on all platforms due to APL issues.
>
> Regards,
> Tvrtko

Done :)

R-B's attached & series resent with GuC-enabling patch appended

.Dave.

* Re: [PATCH v2 3/6] drm/i915/guc: use a separate GuC client for each engine
  2016-07-22 10:23     ` Dave Gordon
@ 2016-07-22 21:57       ` Chris Wilson
  0 siblings, 0 replies; 18+ messages in thread
From: Chris Wilson @ 2016-07-22 21:57 UTC (permalink / raw)
  To: Dave Gordon; +Cc: intel-gfx

On Fri, Jul 22, 2016 at 11:23:11AM +0100, Dave Gordon wrote:
> Does this patchset change the results in the parallel-submission nop test?

No.

Using Execlists submission
Maximum execution latency on render, 3.392us, total 7.794us per cycle
All (4 engines): 8,257,536 cycles, average 1.211us per cycle

default: 5,789,696 cycles: 3.455us
render: 5,748,736 cycles: 3.479us
bsd: 13,406,208 cycles: 1.492us
blt: 13,852,672 cycles: 1.444us
vebox: 13,680,640 cycles: 1.462us

Using GuC submission (v6.1)
Maximum execution latency on render, 3.674us, total 9.089us per cycle
All (4 engines): 577,536 cycles, average 17.441us per cycle

default: 5,399,552 cycles: 3.704us
render: 5,341,184 cycles: 3.745us
bsd: 13,060,096 cycles: 1.531us
blt: 8,828,928 cycles: 2.265us
vebox: 12,945,408 cycles: 1.545us


-- 
Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2016-07-22 21:57 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-07-21 18:15 [PATCH v2 0/6] drm/i915/guc: use one GuC client per GPU engine Dave Gordon
2016-07-21 18:15 ` [PATCH v2 1/6] drm/i915/guc: doorbell reset should avoid used doorbells Dave Gordon
2016-07-22 10:00   ` Tvrtko Ursulin
2016-07-21 18:15 ` [PATCH v2 2/6] drm/i915/guc: refactor guc_init_doorbell_hw() Dave Gordon
2016-07-22 10:01   ` Tvrtko Ursulin
2016-07-21 18:15 ` [PATCH v2 3/6] drm/i915/guc: use a separate GuC client for each engine Dave Gordon
2016-07-21 18:30   ` Chris Wilson
2016-07-22 10:23     ` Dave Gordon
2016-07-22 21:57       ` Chris Wilson
2016-07-21 18:15 ` [PATCH v2 4/6] drm/i915/guc: add engine mask to GuC client & pass to GuC Dave Gordon
2016-07-21 18:15 ` [PATCH v2 5/6] drm/i915/guc: use for_each_engine_id() where appropriate Dave Gordon
2016-07-22 10:04   ` Tvrtko Ursulin
2016-07-22 10:45     ` Dave Gordon
2016-07-21 18:15 ` [PATCH v2 6/6] drm/i915/guc: re-optimise i915_guc_client layout Dave Gordon
2016-07-22 10:05   ` Tvrtko Ursulin
2016-07-22  6:02 ` ✓ Ro.CI.BAT: success for drm/i915/guc: use one GuC client per GPU engine Patchwork
2016-07-22 10:06   ` Tvrtko Ursulin
2016-07-22 15:17     ` Dave Gordon
