* [PATCH 01/10] drm/i915/guc: Precompute GuC shared data offset
@ 2017-10-05  9:13 Michał Winiarski
  2017-10-05  9:13 ` [PATCH 02/10] drm/i915/guc: Initialize GuC before restarting engines Michał Winiarski
                   ` (7 more replies)
  0 siblings, 8 replies; 37+ messages in thread
From: Michał Winiarski @ 2017-10-05  9:13 UTC (permalink / raw)
  To: intel-gfx

We're using the first page of the kernel context state to share data
with GuC. Let's precompute the ggtt offset at GuC initialization time
rather than recomputing it every time we use a GuC action.
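
The suspend/resume actions then reduce to passing the cached value
(a sketch; the full change is in the diff below):

        /* first page of the kernel context state is shared with GuC */
        data[2] = guc->shared_data_offset;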

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Jeff Mcgee <jeff.mcgee@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Oscar Mateo <oscar.mateo@intel.com>
Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
---
 drivers/gpu/drm/i915/i915_guc_submission.c | 4 ++--
 drivers/gpu/drm/i915/intel_uc.c            | 4 ++++
 drivers/gpu/drm/i915/intel_uc.h            | 3 +++
 3 files changed, 9 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
index 04f1281d81a5..2c0aeee3143d 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -1234,7 +1234,7 @@ int intel_guc_suspend(struct drm_i915_private *dev_priv)
 	/* any value greater than GUC_POWER_D0 */
 	data[1] = GUC_POWER_D1;
 	/* first page is shared data with GuC */
-	data[2] = guc_ggtt_offset(ctx->engine[RCS].state) + LRC_GUCSHR_PN * PAGE_SIZE;
+	data[2] = guc->shared_data_offset;
 
 	return intel_guc_send(guc, data, ARRAY_SIZE(data));
 }
@@ -1260,7 +1260,7 @@ int intel_guc_resume(struct drm_i915_private *dev_priv)
 	data[0] = INTEL_GUC_ACTION_EXIT_S_STATE;
 	data[1] = GUC_POWER_D0;
 	/* first page is shared data with GuC */
-	data[2] = guc_ggtt_offset(ctx->engine[RCS].state) + LRC_GUCSHR_PN * PAGE_SIZE;
+	data[2] = guc->shared_data_offset;
 
 	return intel_guc_send(guc, data, ARRAY_SIZE(data));
 }
diff --git a/drivers/gpu/drm/i915/intel_uc.c b/drivers/gpu/drm/i915/intel_uc.c
index e7875277ba97..f4893c85e54a 100644
--- a/drivers/gpu/drm/i915/intel_uc.c
+++ b/drivers/gpu/drm/i915/intel_uc.c
@@ -173,6 +173,10 @@ static void guc_free_load_err_log(struct intel_guc *guc)
 static int guc_enable_communication(struct intel_guc *guc)
 {
 	struct drm_i915_private *dev_priv = guc_to_i915(guc);
+	struct i915_gem_context *ctx = dev_priv->kernel_context;
+
+	guc->shared_data_offset = guc_ggtt_offset(ctx->engine[RCS].state) +
+		LRC_GUCSHR_PN * PAGE_SIZE;
 
 	if (HAS_GUC_CT(dev_priv))
 		return intel_guc_enable_ct(guc);
diff --git a/drivers/gpu/drm/i915/intel_uc.h b/drivers/gpu/drm/i915/intel_uc.h
index 4fa091e90b5f..10e8f0ed02e4 100644
--- a/drivers/gpu/drm/i915/intel_uc.h
+++ b/drivers/gpu/drm/i915/intel_uc.h
@@ -121,6 +121,9 @@ struct intel_guc {
 	/* To serialize the intel_guc_send actions */
 	struct mutex send_mutex;
 
+	/* Kernel context state ggtt offset, first page is shared with GuC */
+	u32 shared_data_offset;
+
 	/* GuC's FW specific send function */
 	int (*send)(struct intel_guc *guc, const u32 *data, u32 len);
 
-- 
2.13.5


* [PATCH 02/10] drm/i915/guc: Initialize GuC before restarting engines
  2017-10-05  9:13 [PATCH 01/10] drm/i915/guc: Precompute GuC shared data offset Michał Winiarski
@ 2017-10-05  9:13 ` Michał Winiarski
  2017-10-05  9:36   ` Chris Wilson
  2017-10-05  9:13 ` [PATCH 03/10] drm/i915/guc: Add preemption action to GuC firmware interface Michał Winiarski
                   ` (6 subsequent siblings)
  7 siblings, 1 reply; 37+ messages in thread
From: Michał Winiarski @ 2017-10-05  9:13 UTC (permalink / raw)
  To: intel-gfx; +Cc: Mika Kuoppala

Now that we're handling request resubmission the same way as regular
submission (from the tasklet), we can move GuC initialization earlier,
before restarting the engines. This way, we're no longer in a state of
flux during engine restart - we're already in the user-requested
submission mode.

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Michel Thierry <michel.thierry@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Cc: Oscar Mateo <oscar.mateo@intel.com>
Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
---
 drivers/gpu/drm/i915/i915_gem.c            | 10 +++++-----
 drivers/gpu/drm/i915/i915_guc_submission.c |  7 -------
 drivers/gpu/drm/i915/intel_lrc.c           |  2 +-
 3 files changed, 6 insertions(+), 13 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index ab8c6946fea4..cab328658a0a 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -4716,6 +4716,11 @@ int i915_gem_init_hw(struct drm_i915_private *dev_priv)
 		goto out;
 	}
 
+	/* We can't enable contexts until all firmware is loaded */
+	ret = intel_uc_init_hw(dev_priv);
+	if (ret)
+		goto out;
+
 	/* Need to do basic initialisation of all rings first: */
 	ret = __i915_gem_restart_engines(dev_priv);
 	if (ret)
@@ -4723,11 +4728,6 @@ int i915_gem_init_hw(struct drm_i915_private *dev_priv)
 
 	intel_mocs_init_l3cc_table(dev_priv);
 
-	/* We can't enable contexts until all firmware is loaded */
-	ret = intel_uc_init_hw(dev_priv);
-	if (ret)
-		goto out;
-
 out:
 	intel_uncore_forcewake_put(dev_priv, FORCEWAKE_ALL);
 	return ret;
diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
index 2c0aeee3143d..374501a67274 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -1182,14 +1182,7 @@ int i915_guc_submission_enable(struct drm_i915_private *dev_priv)
 
 	for_each_engine(engine, dev_priv, id) {
 		struct intel_engine_execlists * const execlists = &engine->execlists;
-		/* The tasklet was initialised by execlists, and may be in
-		 * a state of flux (across a reset) and so we just want to
-		 * take over the callback without changing any other state
-		 * in the tasklet.
-		 */
 		execlists->irq_tasklet.func = i915_guc_irq_handler;
-		clear_bit(ENGINE_IRQ_EXECLIST, &engine->irq_posted);
-		tasklet_schedule(&execlists->irq_tasklet);
 	}
 
 	return 0;
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index c5b76082d695..12956f2ba699 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -1462,7 +1462,7 @@ static int gen8_init_common_ring(struct intel_engine_cs *engine)
 	execlists->preempt = false;
 
 	/* After a GPU reset, we may have requests to replay */
-	if (!i915_modparams.enable_guc_submission && execlists->first)
+	if (execlists->first)
 		tasklet_schedule(&execlists->irq_tasklet);
 
 	return 0;
-- 
2.13.5


* [PATCH 03/10] drm/i915/guc: Add preemption action to GuC firmware interface
  2017-10-05  9:13 [PATCH 01/10] drm/i915/guc: Precompute GuC shared data offset Michał Winiarski
  2017-10-05  9:13 ` [PATCH 02/10] drm/i915/guc: Initialize GuC before restarting engines Michał Winiarski
@ 2017-10-05  9:13 ` Michał Winiarski
  2017-10-05 14:25   ` Jeff McGee
  2017-10-05  9:13 ` [PATCH 04/10] drm/i915/guc: Add a second client, to be used for preemption Michał Winiarski
                   ` (5 subsequent siblings)
  7 siblings, 1 reply; 37+ messages in thread
From: Michał Winiarski @ 2017-10-05  9:13 UTC (permalink / raw)
  To: intel-gfx

We're using a GuC action to request preemption. However, after
requesting preemption we need to wait for GuC to finish its own
post-processing before we start submitting our requests. The firmware
uses the shared context to report its status.
Let's update the GuC firmware interface with these new definitions.
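
For reference, these definitions end up being used roughly like this
(condensed from patch 09 of this series; shared_data stands for the CPU
mapping of the shared page set up there):

        data[0] = INTEL_GUC_ACTION_REQUEST_PREEMPTION;
        data[1] = guc->preempt_client->stage_id;
        data[2] = INTEL_GUC_PREEMPT_OPTION_IMMEDIATE |
                  INTEL_GUC_PREEMPT_OPTION_DROP_WORK_Q |
                  INTEL_GUC_PREEMPT_OPTION_DROP_SUBMIT_Q;
        data[3] = engine->guc_id;
        data[4] = guc->execbuf_client->priority;
        data[5] = guc->execbuf_client->stage_id;
        data[6] = guc->shared_data_offset;
        intel_guc_send(guc, data, ARRAY_SIZE(data));

        /* completion is then reported per engine in the shared page */
        report = &shared_data->preempt_ctx_report[engine->guc_id];
        /* poll report->report_return_status for
         * INTEL_GUC_REPORT_STATUS_COMPLETE, then clear
         * report->affected_count */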

Cc: Jeff Mcgee <jeff.mcgee@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Oscar Mateo <oscar.mateo@intel.com>
Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
---
 drivers/gpu/drm/i915/intel_guc_fwif.h | 40 +++++++++++++++++++++++++++++++++++
 1 file changed, 40 insertions(+)

diff --git a/drivers/gpu/drm/i915/intel_guc_fwif.h b/drivers/gpu/drm/i915/intel_guc_fwif.h
index 7eb6b4fa1d6f..e83e565b43d5 100644
--- a/drivers/gpu/drm/i915/intel_guc_fwif.h
+++ b/drivers/gpu/drm/i915/intel_guc_fwif.h
@@ -544,9 +544,36 @@ union guc_log_control {
 	u32 value;
 } __packed;
 
+struct guc_ctx_report {
+	u32 report_return_status;
+	u32 reserved1[64];
+	u32 affected_count;
+	u32 reserved2[2];
+} __packed;
+
+/* GuC Shared Context Data Struct */
+struct guc_shared_ctx_data {
+	u32 addr_of_last_preempted_data_low;
+	u32 addr_of_last_preempted_data_high;
+	u32 addr_of_last_preempted_data_high_tmp;
+	u32 padding;
+	u32 is_mapped_to_proxy;
+	u32 proxy_ctx_id;
+	u32 engine_reset_ctx_id;
+	u32 media_reset_count;
+	u32 reserved1[8];
+	u32 uk_last_ctx_switch_reason;
+	u32 was_reset;
+	u32 lrca_gpu_addr;
+	u64 execlist_ctx;
+	u32 reserved2[66];
+	struct guc_ctx_report preempt_ctx_report[GUC_MAX_ENGINES_NUM];
+} __packed;
+
 /* This Action will be programmed in C180 - SOFT_SCRATCH_O_REG */
 enum intel_guc_action {
 	INTEL_GUC_ACTION_DEFAULT = 0x0,
+	INTEL_GUC_ACTION_REQUEST_PREEMPTION = 0x2,
 	INTEL_GUC_ACTION_SAMPLE_FORCEWAKE = 0x6,
 	INTEL_GUC_ACTION_ALLOCATE_DOORBELL = 0x10,
 	INTEL_GUC_ACTION_DEALLOCATE_DOORBELL = 0x20,
@@ -562,6 +589,19 @@ enum intel_guc_action {
 	INTEL_GUC_ACTION_LIMIT
 };
 
+enum intel_guc_preempt_options {
+	INTEL_GUC_PREEMPT_OPTION_IMMEDIATE = 0x1,
+	INTEL_GUC_PREEMPT_OPTION_DROP_WORK_Q = 0x4,
+	INTEL_GUC_PREEMPT_OPTION_DROP_SUBMIT_Q = 0x8,
+};
+
+enum intel_guc_report_status {
+	INTEL_GUC_REPORT_STATUS_UNKNOWN = 0x0,
+	INTEL_GUC_REPORT_STATUS_ACKED = 0x1,
+	INTEL_GUC_REPORT_STATUS_ERROR = 0x2,
+	INTEL_GUC_REPORT_STATUS_COMPLETE = 0x4,
+};
+
 /*
  * The GuC sends its response to a command by overwriting the
  * command in SS0. The response is distinguishable from a command
-- 
2.13.5


* [PATCH 04/10] drm/i915/guc: Add a second client, to be used for preemption
  2017-10-05  9:13 [PATCH 01/10] drm/i915/guc: Precompute GuC shared data offset Michał Winiarski
  2017-10-05  9:13 ` [PATCH 02/10] drm/i915/guc: Initialize GuC before restarting engines Michał Winiarski
  2017-10-05  9:13 ` [PATCH 03/10] drm/i915/guc: Add preemption action to GuC firmware interface Michał Winiarski
@ 2017-10-05  9:13 ` Michał Winiarski
  2017-10-05  9:39   ` Chris Wilson
                     ` (3 more replies)
  2017-10-05  9:13 ` [PATCH 05/10] drm/i915/guc: Split guc_wq_item_append Michał Winiarski
                   ` (4 subsequent siblings)
  7 siblings, 4 replies; 37+ messages in thread
From: Michał Winiarski @ 2017-10-05  9:13 UTC (permalink / raw)
  To: intel-gfx; +Cc: Dave Gordon

From: Dave Gordon <david.s.gordon@intel.com>

This second client is created with KMD_HIGH priority and marked as
preemptive, which will allow us to request preemption using GuC actions.
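
"Marked as preemptive" means setting the preempt attribute in the
client's stage descriptor, as added in the diff below:

        if (client->priority <= GUC_CLIENT_PRIORITY_HIGH)
                desc->attribute |= GUC_STAGE_DESC_ATTR_PREEMPT;

i.e. any client at HIGH priority or above (GuC priorities are ordered
with lower values meaning higher priority) is treated as a preemption
client.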

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Jeff Mcgee <jeff.mcgee@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Oscar Mateo <oscar.mateo@intel.com>
Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
---
 drivers/gpu/drm/i915/i915_debugfs.c        |  4 ++++
 drivers/gpu/drm/i915/i915_guc_submission.c | 32 ++++++++++++++++++++++++++----
 drivers/gpu/drm/i915/intel_uc.h            |  1 +
 3 files changed, 33 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index b4a6ac60e7c6..1a963c13cab8 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -2488,6 +2488,10 @@ static int i915_guc_info(struct seq_file *m, void *data)
 
 	seq_printf(m, "\nGuC execbuf client @ %p:\n", guc->execbuf_client);
 	i915_guc_client_info(m, dev_priv, guc->execbuf_client);
+	if (guc->preempt_client) {
+		seq_printf(m, "\nGuC preempt client\n");
+		i915_guc_client_info(m, dev_priv, guc->preempt_client);
+	}
 
 	i915_guc_log_info(m, dev_priv);
 
diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
index 374501a67274..e8052e86426d 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -331,6 +331,8 @@ static void guc_stage_desc_init(struct intel_guc *guc,
 	memset(desc, 0, sizeof(*desc));
 
 	desc->attribute = GUC_STAGE_DESC_ATTR_ACTIVE | GUC_STAGE_DESC_ATTR_KERNEL;
+	if (client->priority <= GUC_CLIENT_PRIORITY_HIGH)
+		desc->attribute |= GUC_STAGE_DESC_ATTR_PREEMPT;
 	desc->stage_id = client->stage_id;
 	desc->priority = client->priority;
 	desc->db_id = client->doorbell_id;
@@ -1154,7 +1156,7 @@ int i915_guc_submission_enable(struct drm_i915_private *dev_priv)
 		     sizeof(struct guc_wq_item) *
 		     I915_NUM_ENGINES > GUC_WQ_SIZE);
 
-	if (!client) {
+	if (!guc->execbuf_client) {
 		client = guc_client_alloc(dev_priv,
 					  INTEL_INFO(dev_priv)->ring_mask,
 					  GUC_CLIENT_PRIORITY_KMD_NORMAL,
@@ -1167,15 +1169,29 @@ int i915_guc_submission_enable(struct drm_i915_private *dev_priv)
 		guc->execbuf_client = client;
 	}
 
+	if (!guc->preempt_client) {
+		client = guc_client_alloc(dev_priv,
+					  INTEL_INFO(dev_priv)->ring_mask,
+					  GUC_CLIENT_PRIORITY_KMD_HIGH,
+					  dev_priv->preempt_context);
+		if (IS_ERR(client)) {
+			DRM_ERROR("Failed to create GuC client for preemption!\n");
+			err = PTR_ERR(client);
+			goto err_free_clients;
+		}
+
+		guc->preempt_client = client;
+	}
+
 	err = intel_guc_sample_forcewake(guc);
 	if (err)
-		goto err_execbuf_client;
+		goto err_free_clients;
 
 	guc_reset_wq(client);
 
 	err = guc_init_doorbell_hw(guc);
 	if (err)
-		goto err_execbuf_client;
+		goto err_free_clients;
 
 	/* Take over from manual control of ELSP (execlists) */
 	guc_interrupts_capture(dev_priv);
@@ -1187,7 +1203,11 @@ int i915_guc_submission_enable(struct drm_i915_private *dev_priv)
 
 	return 0;
 
-err_execbuf_client:
+err_free_clients:
+	if (guc->preempt_client) {
+		guc_client_free(guc->preempt_client);
+		guc->preempt_client = NULL;
+	}
 	guc_client_free(guc->execbuf_client);
 	guc->execbuf_client = NULL;
 	return err;
@@ -1202,6 +1222,10 @@ void i915_guc_submission_disable(struct drm_i915_private *dev_priv)
 	/* Revert back to manual ELSP submission */
 	intel_engines_reset_default_submission(dev_priv);
 
+	if (guc->preempt_client) {
+		guc_client_free(guc->preempt_client);
+		guc->preempt_client = NULL;
+	}
 	guc_client_free(guc->execbuf_client);
 	guc->execbuf_client = NULL;
 }
diff --git a/drivers/gpu/drm/i915/intel_uc.h b/drivers/gpu/drm/i915/intel_uc.h
index 10e8f0ed02e4..c6c6f8513bbf 100644
--- a/drivers/gpu/drm/i915/intel_uc.h
+++ b/drivers/gpu/drm/i915/intel_uc.h
@@ -107,6 +107,7 @@ struct intel_guc {
 	struct ida stage_ids;
 
 	struct i915_guc_client *execbuf_client;
+	struct i915_guc_client *preempt_client;
 
 	DECLARE_BITMAP(doorbell_bitmap, GUC_NUM_DOORBELLS);
 	uint32_t db_cacheline;		/* Cyclic counter mod pagesize	*/
-- 
2.13.5


* [PATCH 05/10] drm/i915/guc: Split guc_wq_item_append
  2017-10-05  9:13 [PATCH 01/10] drm/i915/guc: Precompute GuC shared data offset Michał Winiarski
                   ` (2 preceding siblings ...)
  2017-10-05  9:13 ` [PATCH 04/10] drm/i915/guc: Add a second client, to be used for preemption Michał Winiarski
@ 2017-10-05  9:13 ` Michał Winiarski
  2017-10-05  9:45   ` Chris Wilson
  2017-10-05 20:37   ` Jeff McGee
  2017-10-05  9:20 ` [PATCH 06/10] drm/i915: Extract "emit write" part of emit breadcrumb functions Michał Winiarski
                   ` (3 subsequent siblings)
  7 siblings, 2 replies; 37+ messages in thread
From: Michał Winiarski @ 2017-10-05  9:13 UTC (permalink / raw)
  To: intel-gfx

We're using a special preempt context for HW to preempt into. We don't
want to emit any requests there, but we still need to wrap this context
into a valid GuC work item.
Let's clean up the functions operating on GuC work items.
We can extract guc_add_request - responsible for adding a GuC work item
and ringing the doorbell - and guc_wq_item_append - used by the function
above, and no longer tied to the concept of a gem request.
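
After the split, the regular request path becomes (a sketch; the actual
code is in the diff below):

        spin_lock(&client->wq_lock);
        guc_wq_item_append(client, engine->guc_id, ctx_desc,
                           ring_tail, rq->global_seqno);
        guc_ring_doorbell(client);
        spin_unlock(&client->wq_lock);

while the preemption path added later in the series can call
guc_wq_item_append() directly for the preempt context, with a zero
fence_id and no request involved.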

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Jeff Mcgee <jeff.mcgee@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Oscar Mateo <oscar.mateo@intel.com>
Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
---
 drivers/gpu/drm/i915/i915_guc_submission.c | 55 ++++++++++++++++--------------
 1 file changed, 30 insertions(+), 25 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
index e8052e86426d..2ce2bd6ed509 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -69,7 +69,7 @@
  * WQ_TYPE_INORDER is needed to support legacy submission via GuC, which
  * represents in-order queue. The kernel driver packs ring tail pointer and an
  * ELSP context descriptor dword into Work Item.
- * See guc_wq_item_append()
+ * See guc_add_request()
  *
  * ADS:
  * The Additional Data Struct (ADS) has pointers for different buffers used by
@@ -357,7 +357,7 @@ static void guc_stage_desc_init(struct intel_guc *guc,
 		 * submission or, in other words, not using a direct submission
 		 * model) the KMD's LRCA is not used for any work submission.
 		 * Instead, the GuC uses the LRCA of the user mode context (see
-		 * guc_wq_item_append below).
+		 * guc_add_request below).
 		 */
 		lrc->context_desc = lower_32_bits(ce->lrc_desc);
 
@@ -409,22 +409,18 @@ static void guc_stage_desc_fini(struct intel_guc *guc,
 
 /* Construct a Work Item and append it to the GuC's Work Queue */
 static void guc_wq_item_append(struct i915_guc_client *client,
-			       struct drm_i915_gem_request *rq)
+			       u32 target_engine, u32 context_desc,
+			       u32 ring_tail, u32 fence_id)
 {
 	/* wqi_len is in DWords, and does not include the one-word header */
 	const size_t wqi_size = sizeof(struct guc_wq_item);
 	const u32 wqi_len = wqi_size / sizeof(u32) - 1;
-	struct intel_engine_cs *engine = rq->engine;
-	struct i915_gem_context *ctx = rq->ctx;
 	struct guc_process_desc *desc = __get_process_desc(client);
 	struct guc_wq_item *wqi;
-	u32 ring_tail, wq_off;
+	u32 wq_off;
 
 	lockdep_assert_held(&client->wq_lock);
 
-	ring_tail = intel_ring_set_tail(rq->ring, rq->tail) / sizeof(u64);
-	GEM_BUG_ON(ring_tail > WQ_RING_TAIL_MAX);
-
 	/* For now workqueue item is 4 DWs; workqueue buffer is 2 pages. So we
 	 * should not have the case where structure wqi is across page, neither
 	 * wrapped to the beginning. This simplifies the implementation below.
@@ -446,15 +442,14 @@ static void guc_wq_item_append(struct i915_guc_client *client,
 	/* Now fill in the 4-word work queue item */
 	wqi->header = WQ_TYPE_INORDER |
 		      (wqi_len << WQ_LEN_SHIFT) |
-		      (engine->guc_id << WQ_TARGET_SHIFT) |
+		      (target_engine << WQ_TARGET_SHIFT) |
 		      WQ_NO_WCFLUSH_WAIT;
-
-	wqi->context_desc = lower_32_bits(intel_lr_context_descriptor(ctx, engine));
-
+	wqi->context_desc = context_desc;
 	wqi->submit_element_info = ring_tail << WQ_RING_TAIL_SHIFT;
-	wqi->fence_id = rq->global_seqno;
+	GEM_BUG_ON(ring_tail > WQ_RING_TAIL_MAX);
+	wqi->fence_id = fence_id;
 
-	/* Postincrement WQ tail for next time. */
+	/* Make the update visible to GuC */
 	WRITE_ONCE(desc->tail, (wq_off + wqi_size) & (GUC_WQ_SIZE - 1));
 }
 
@@ -484,6 +479,25 @@ static void guc_ring_doorbell(struct i915_guc_client *client)
 	GEM_BUG_ON(db->db_status != GUC_DOORBELL_ENABLED);
 }
 
+static void guc_add_request(struct intel_guc *guc,
+			    struct drm_i915_gem_request *rq)
+{
+	struct i915_guc_client *client = guc->execbuf_client;
+	struct intel_engine_cs *engine = rq->engine;
+	u32 ctx_desc = lower_32_bits(intel_lr_context_descriptor(rq->ctx, engine));
+	u32 ring_tail = intel_ring_set_tail(rq->ring, rq->tail) / sizeof(u64);
+
+	spin_lock(&client->wq_lock);
+
+	guc_wq_item_append(client, engine->guc_id, ctx_desc,
+			   ring_tail, rq->global_seqno);
+	guc_ring_doorbell(client);
+
+	client->submissions[engine->id] += 1;
+
+	spin_unlock(&client->wq_lock);
+}
+
 /**
  * i915_guc_submit() - Submit commands through GuC
  * @engine: engine associated with the commands
@@ -495,10 +509,8 @@ static void i915_guc_submit(struct intel_engine_cs *engine)
 {
 	struct drm_i915_private *dev_priv = engine->i915;
 	struct intel_guc *guc = &dev_priv->guc;
-	struct i915_guc_client *client = guc->execbuf_client;
 	struct intel_engine_execlists * const execlists = &engine->execlists;
 	struct execlist_port *port = execlists->port;
-	const unsigned int engine_id = engine->id;
 	unsigned int n;
 
 	for (n = 0; n < ARRAY_SIZE(execlists->port); n++) {
@@ -512,14 +524,7 @@ static void i915_guc_submit(struct intel_engine_cs *engine)
 			if (i915_vma_is_map_and_fenceable(rq->ring->vma))
 				POSTING_READ_FW(GUC_STATUS);
 
-			spin_lock(&client->wq_lock);
-
-			guc_wq_item_append(client, rq);
-			guc_ring_doorbell(client);
-
-			client->submissions[engine_id] += 1;
-
-			spin_unlock(&client->wq_lock);
+			guc_add_request(guc, rq);
 		}
 	}
 }
-- 
2.13.5


* [PATCH 06/10] drm/i915: Extract "emit write" part of emit breadcrumb functions
  2017-10-05  9:13 [PATCH 01/10] drm/i915/guc: Precompute GuC shared data offset Michał Winiarski
                   ` (3 preceding siblings ...)
  2017-10-05  9:13 ` [PATCH 05/10] drm/i915/guc: Split guc_wq_item_append Michał Winiarski
@ 2017-10-05  9:20 ` Michał Winiarski
  2017-10-05  9:20   ` [PATCH 07/10] drm/i915: Add information needed to track engine preempt state Michał Winiarski
                     ` (4 more replies)
  2017-10-05  9:33 ` [PATCH 01/10] drm/i915/guc: Precompute GuC shared data offset Chris Wilson
                   ` (2 subsequent siblings)
  7 siblings, 5 replies; 37+ messages in thread
From: Michał Winiarski @ 2017-10-05  9:20 UTC (permalink / raw)
  To: intel-gfx

Let's separate the "emit" part from touching any internal structures;
this way we can have a generic "emit coherent GGTT write" function.
We would like to reuse this functionality for emitting the HWSP write
used to confirm that preempt-to-idle has finished.
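
The intended reuse looks roughly like this (taken from patch 09 of this
series, which introduces intel_hws_preempt_done_address() and
GUC_PREEMPT_FINISHED):

        cs = gen8_emit_ggtt_write(intel_hws_preempt_done_address(engine),
                                  GUC_PREEMPT_FINISHED, cs);
        *cs++ = MI_USER_INTERRUPT;
        *cs++ = MI_NOOP;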

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
---
 drivers/gpu/drm/i915/intel_lrc.c        | 21 +++++-------------
 drivers/gpu/drm/i915/intel_ringbuffer.h | 38 +++++++++++++++++++++++++++++++++
 2 files changed, 43 insertions(+), 16 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 12956f2ba699..e6bbdc5dd01b 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -1766,10 +1766,8 @@ static void gen8_emit_breadcrumb(struct drm_i915_gem_request *request, u32 *cs)
 	/* w/a: bit 5 needs to be zero for MI_FLUSH_DW address. */
 	BUILD_BUG_ON(I915_GEM_HWS_INDEX_ADDR & (1 << 5));
 
-	*cs++ = (MI_FLUSH_DW + 1) | MI_FLUSH_DW_OP_STOREDW;
-	*cs++ = intel_hws_seqno_address(request->engine) | MI_FLUSH_DW_USE_GTT;
-	*cs++ = 0;
-	*cs++ = request->global_seqno;
+	cs = gen8_emit_ggtt_write(intel_hws_seqno_address(request->engine),
+				  request->global_seqno, cs);
 	*cs++ = MI_USER_INTERRUPT;
 	*cs++ = MI_NOOP;
 	request->tail = intel_ring_offset(request, cs);
@@ -1785,18 +1783,9 @@ static void gen8_emit_breadcrumb_render(struct drm_i915_gem_request *request,
 	/* We're using qword write, seqno should be aligned to 8 bytes. */
 	BUILD_BUG_ON(I915_GEM_HWS_INDEX & 1);
 
-	/* w/a for post sync ops following a GPGPU operation we
-	 * need a prior CS_STALL, which is emitted by the flush
-	 * following the batch.
-	 */
-	*cs++ = GFX_OP_PIPE_CONTROL(6);
-	*cs++ = PIPE_CONTROL_GLOBAL_GTT_IVB | PIPE_CONTROL_CS_STALL |
-		PIPE_CONTROL_QW_WRITE;
-	*cs++ = intel_hws_seqno_address(request->engine);
-	*cs++ = 0;
-	*cs++ = request->global_seqno;
-	/* We're thrashing one dword of HWS. */
-	*cs++ = 0;
+	cs = gen8_emit_ggtt_write_render(
+				      intel_hws_seqno_address(request->engine),
+				      request->global_seqno, cs);
 	*cs++ = MI_USER_INTERRUPT;
 	*cs++ = MI_NOOP;
 	request->tail = intel_ring_offset(request, cs);
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index 0fedda17488c..e9650552d3c2 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -831,6 +831,44 @@ static inline u32 *gen8_emit_pipe_control(u32 *batch, u32 flags, u32 offset)
 	return batch + 6;
 }
 
+static inline u32 *
+gen8_emit_ggtt_write_render(u32 gtt_offset, u32 value, u32 *cs)
+{
+	/* We're using qword write, offset should be aligned to 8 bytes. */
+	GEM_BUG_ON(!IS_ALIGNED(gtt_offset, 8));
+
+	/* w/a for post sync ops following a GPGPU operation we
+	 * need a prior CS_STALL, which is emitted by the flush
+	 * following the batch.
+	 */
+	*cs++ = GFX_OP_PIPE_CONTROL(6);
+	*cs++ = PIPE_CONTROL_GLOBAL_GTT_IVB | PIPE_CONTROL_CS_STALL |
+		PIPE_CONTROL_QW_WRITE;
+	*cs++ = gtt_offset;
+	*cs++ = 0;
+	*cs++ = value;
+	/* We're thrashing one dword of HWS. */
+	*cs++ = 0;
+
+	return cs;
+}
+
+static inline u32 *
+gen8_emit_ggtt_write(u32 gtt_offset, u32 value, u32 *cs)
+{
+	/* w/a: bit 5 needs to be zero for MI_FLUSH_DW address. */
+	GEM_BUG_ON(gtt_offset & (1 << 5));
+	/* Offset should be aligned to 8 bytes for both (QW/DW) write types */
+	GEM_BUG_ON(!IS_ALIGNED(gtt_offset, 8));
+
+	*cs++ = (MI_FLUSH_DW + 1) | MI_FLUSH_DW_OP_STOREDW;
+	*cs++ = gtt_offset | MI_FLUSH_DW_USE_GTT;
+	*cs++ = 0;
+	*cs++ = value;
+
+	return cs;
+}
+
 bool intel_engine_is_idle(struct intel_engine_cs *engine);
 bool intel_engines_are_idle(struct drm_i915_private *dev_priv);
 
-- 
2.13.5


* [PATCH 07/10] drm/i915: Add information needed to track engine preempt state
  2017-10-05  9:20 ` [PATCH 06/10] drm/i915: Extract "emit write" part of emit breadcrumb functions Michał Winiarski
@ 2017-10-05  9:20   ` Michał Winiarski
  2017-10-05  9:49     ` Chris Wilson
  2017-10-05 20:42     ` Jeff McGee
  2017-10-05  9:20   ` [PATCH 08/10] drm/i915/guc: Keep request->priority for its lifetime Michał Winiarski
                     ` (3 subsequent siblings)
  4 siblings, 2 replies; 37+ messages in thread
From: Michał Winiarski @ 2017-10-05  9:20 UTC (permalink / raw)
  To: intel-gfx

We shouldn't inspect ELSP context status (or any other bits that depend
on the specific submission backend) when using GuC submission.
Let's use another piece of the HWSP for the preempt context to write its
bit of information - meaning that preemption has finished and the
hardware is now idle.
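
The new slot is consumed roughly like this (a sketch based on the later
patches in this series; GUC_PREEMPT_FINISHED is defined in patch 09):

        /* the preempt context's breadcrumb writes the new HWSP slot */
        if (intel_read_status_page(engine, I915_GEM_HWS_PREEMPT_INDEX) ==
            GUC_PREEMPT_FINISHED) {
                /* preempt-to-idle is done - unwind and resubmit, then
                 * clear the slot for the next preemption */
                intel_write_status_page(engine, I915_GEM_HWS_PREEMPT_INDEX, 0);
        }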

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Jeff Mcgee <jeff.mcgee@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Oscar Mateo <oscar.mateo@intel.com>
Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
---
 drivers/gpu/drm/i915/intel_ringbuffer.h | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index e9650552d3c2..dd3b8727bf6d 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -591,6 +591,8 @@ intel_write_status_page(struct intel_engine_cs *engine, int reg, u32 value)
  */
 #define I915_GEM_HWS_INDEX		0x30
 #define I915_GEM_HWS_INDEX_ADDR (I915_GEM_HWS_INDEX << MI_STORE_DWORD_INDEX_SHIFT)
+#define I915_GEM_HWS_PREEMPT_INDEX	0x32
+#define I915_GEM_HWS_PREEMPT_ADDR (I915_GEM_HWS_PREEMPT_INDEX << MI_STORE_DWORD_INDEX_SHIFT)
 #define I915_GEM_HWS_SCRATCH_INDEX	0x40
 #define I915_GEM_HWS_SCRATCH_ADDR (I915_GEM_HWS_SCRATCH_INDEX << MI_STORE_DWORD_INDEX_SHIFT)
 
@@ -743,6 +745,11 @@ static inline u32 intel_hws_seqno_address(struct intel_engine_cs *engine)
 	return engine->status_page.ggtt_offset + I915_GEM_HWS_INDEX_ADDR;
 }
 
+static inline u32 intel_hws_preempt_done_address(struct intel_engine_cs *engine)
+{
+	return engine->status_page.ggtt_offset + I915_GEM_HWS_PREEMPT_ADDR;
+}
+
 /* intel_breadcrumbs.c -- user interrupt bottom-half for waiters */
 int intel_engine_init_breadcrumbs(struct intel_engine_cs *engine);
 
-- 
2.13.5


* [PATCH 08/10] drm/i915/guc: Keep request->priority for its lifetime
  2017-10-05  9:20 ` [PATCH 06/10] drm/i915: Extract "emit write" part of emit breadcrumb functions Michał Winiarski
  2017-10-05  9:20   ` [PATCH 07/10] drm/i915: Add information needed to track engine preempt state Michał Winiarski
@ 2017-10-05  9:20   ` Michał Winiarski
  2017-10-05  9:50     ` Chris Wilson
  2017-10-05  9:20   ` [PATCH 09/10] drm/i915/guc: Preemption! With GuC Michał Winiarski
                     ` (2 subsequent siblings)
  4 siblings, 1 reply; 37+ messages in thread
From: Michał Winiarski @ 2017-10-05  9:20 UTC (permalink / raw)
  To: intel-gfx

We also want to support preemption with the GuC submission backend.
In order to do that, we need to remember the priority, like we do on
the execlists path.

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Jeff Mcgee <jeff.mcgee@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
---
 drivers/gpu/drm/i915/i915_guc_submission.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
index 2ce2bd6ed509..0f36bba9fc9e 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -598,7 +598,6 @@ static void i915_guc_dequeue(struct intel_engine_cs *engine)
 			}
 
 			INIT_LIST_HEAD(&rq->priotree.link);
-			rq->priotree.priority = INT_MAX;
 
 			__i915_gem_request_submit(rq);
 			trace_i915_gem_request_in(rq, port_index(port, execlists));
@@ -633,6 +632,7 @@ static void i915_guc_irq_handler(unsigned long data)
 	rq = port_request(&port[0]);
 	while (rq && i915_gem_request_completed(rq)) {
 		trace_i915_gem_request_out(rq);
+		rq->priotree.priority = INT_MAX;
 		i915_gem_request_put(rq);
 
 		execlists_port_complete(execlists, port);
-- 
2.13.5


* [PATCH 09/10] drm/i915/guc: Preemption! With GuC.
  2017-10-05  9:20 ` [PATCH 06/10] drm/i915: Extract "emit write" part of emit breadcrumb functions Michał Winiarski
  2017-10-05  9:20   ` [PATCH 07/10] drm/i915: Add information needed to track engine preempt state Michał Winiarski
  2017-10-05  9:20   ` [PATCH 08/10] drm/i915/guc: Keep request->priority for its lifetime Michał Winiarski
@ 2017-10-05  9:20   ` Michał Winiarski
  2017-10-05 10:04     ` Chris Wilson
                       ` (2 more replies)
  2017-10-05  9:20   ` [PATCH 10/10] HAX Enable GuC Submission for CI Michał Winiarski
  2017-10-05  9:47   ` [PATCH 06/10] drm/i915: Extract "emit write" part of emit breadcrumb functions Chris Wilson
  4 siblings, 3 replies; 37+ messages in thread
From: Michał Winiarski @ 2017-10-05  9:20 UTC (permalink / raw)
  To: intel-gfx

Pretty similar to what we have on execlists.
We're reusing most of the GEM code; however, due to GuC quirks we need a
couple of extra bits.
Preemption is implemented as a GuC action, and actions can be pretty
slow. Because of that, we're using a mutex to serialize them. Since
we're requesting preemption from the tasklet, the task of creating a
work item and wrapping it in a GuC action is delegated to a worker.

To detect that preemption has finished, we're using an additional piece
of the HWSP, and since we're not getting context switch interrupts,
we're also adding a user interrupt.

The fact that our special preempt context has completed unfortunately
doesn't mean that we're ready to submit new work. We also need to wait
for GuC to finish its own processing.
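
In outline, the flow implemented below is:

        /* tasklet: a higher priority request is queued behind a busy port */
        WRITE_ONCE(execlists->preempt, true);
        queue_work(engine->i915->guc.preempt_wq, &engine->guc_preempt_work);

        /* worker: emit the HWSP write + MI_USER_INTERRUPT into the preempt
         * context, append a work item for it and send
         * INTEL_GUC_ACTION_REQUEST_PREEMPTION */

        /* irq handler: HWSP reads GUC_PREEMPT_FINISHED - cancel the ports,
         * unwind incomplete requests, wait for the GuC preempt report,
         * then clear the flag and dequeue again */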

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Jeff Mcgee <jeff.mcgee@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Oscar Mateo <oscar.mateo@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.c            |   3 +-
 drivers/gpu/drm/i915/i915_guc_submission.c | 139 +++++++++++++++++++++++++++--
 drivers/gpu/drm/i915/intel_lrc.c           |   4 +-
 drivers/gpu/drm/i915/intel_lrc.h           |   4 +
 drivers/gpu/drm/i915/intel_ringbuffer.h    |   2 +
 drivers/gpu/drm/i915/intel_uc.c            |   2 +
 drivers/gpu/drm/i915/intel_uc.h            |   3 +
 7 files changed, 144 insertions(+), 13 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index 66fc156b294a..06670cc3ece0 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -373,8 +373,7 @@ static int i915_getparam(struct drm_device *dev, void *data,
 			value |= I915_SCHEDULER_CAP_PRIORITY;
 
 			if (INTEL_INFO(dev_priv)->has_logical_ring_preemption &&
-			    i915_modparams.enable_execlists &&
-			    !i915_modparams.enable_guc_submission)
+			    i915_modparams.enable_execlists)
 				value |= I915_SCHEDULER_CAP_PREEMPTION;
 		}
 		break;
diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
index 0f36bba9fc9e..21412ea3dc0e 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -498,6 +498,88 @@ static void guc_add_request(struct intel_guc *guc,
 	spin_unlock(&client->wq_lock);
 }
 
+#define GUC_PREEMPT_FINISHED 0x1
+#define GUC_PREEMPT_BREADCRUMB_DWORDS 0x8
+static void inject_preempt_context(struct work_struct *work)
+{
+	struct intel_engine_cs *engine =
+		container_of(work, typeof(*engine), guc_preempt_work);
+	struct drm_i915_private *dev_priv = engine->i915;
+	struct intel_guc *guc = &engine->i915->guc;
+	struct i915_guc_client *client = guc->preempt_client;
+	struct intel_context *ce = &client->owner->engine[engine->id];
+	u32 ctx_desc = lower_32_bits(intel_lr_context_descriptor(client->owner,
+								 engine));
+	u32 *cs = ce->ring->vaddr + ce->ring->tail;
+	u32 data[7];
+
+	if (engine->id == RCS) {
+		cs = gen8_emit_ggtt_write_render(
+				intel_hws_preempt_done_address(engine),
+				GUC_PREEMPT_FINISHED, cs);
+		*cs++ = MI_USER_INTERRUPT;
+		*cs++ = MI_NOOP;
+	} else {
+		cs = gen8_emit_ggtt_write(
+				intel_hws_preempt_done_address(engine),
+				GUC_PREEMPT_FINISHED, cs);
+		*cs++ = MI_USER_INTERRUPT;
+		*cs++ = MI_NOOP;
+		*cs++ = MI_NOOP;
+		*cs++ = MI_NOOP;
+	}
+
+	ce->ring->tail += GUC_PREEMPT_BREADCRUMB_DWORDS * sizeof(u32);
+	ce->ring->tail &= (ce->ring->size - 1);
+
+	if (i915_vma_is_map_and_fenceable(ce->ring->vma))
+		POSTING_READ_FW(GUC_STATUS);
+
+	spin_lock_irq(&client->wq_lock);
+	guc_wq_item_append(client, engine->guc_id, ctx_desc,
+			   ce->ring->tail / sizeof(u64), 0);
+	spin_unlock_irq(&client->wq_lock);
+
+	data[0] = INTEL_GUC_ACTION_REQUEST_PREEMPTION;
+	data[1] = client->stage_id;
+	data[2] = INTEL_GUC_PREEMPT_OPTION_IMMEDIATE |
+		  INTEL_GUC_PREEMPT_OPTION_DROP_WORK_Q |
+		  INTEL_GUC_PREEMPT_OPTION_DROP_SUBMIT_Q;
+	data[3] = engine->guc_id;
+	data[4] = guc->execbuf_client->priority;
+	data[5] = guc->execbuf_client->stage_id;
+	data[6] = guc->shared_data_offset;
+
+	if (WARN_ON(intel_guc_send(guc, data, ARRAY_SIZE(data)))) {
+		WRITE_ONCE(engine->execlists.preempt, false);
+		tasklet_schedule(&engine->execlists.irq_tasklet);
+	}
+}
+
+/*
+ * We're using user interrupt and HWSP value to mark that preemption has
+ * finished and GPU is idle. Normally, we could unwind and continue similar to
+ * execlists submission path. Unfortunately, with GuC we also need to wait for
+ * it to finish its own postprocessing, before attempting to submit. Otherwise
+ * GuC may silently ignore our submissions, and thus we risk losing request at
+ * best, executing out-of-order and causing kernel panic at worst.
+ */
+#define GUC_PREEMPT_POSTPROCESS_DELAY_MS 10
+static void wait_for_guc_preempt_report(struct intel_engine_cs *engine)
+{
+	struct intel_guc *guc = &engine->i915->guc;
+	struct guc_shared_ctx_data *data = guc->shared_data_addr;
+	struct guc_ctx_report *report = &data->preempt_ctx_report[engine->guc_id];
+
+	GEM_BUG_ON(wait_for_atomic(report->report_return_status ==
+				   INTEL_GUC_REPORT_STATUS_COMPLETE,
+				   GUC_PREEMPT_POSTPROCESS_DELAY_MS));
+	/* GuC is expecting that we're also going to clear the affected context
+	 * counter */
+	report->affected_count = 0;
+}
+
+
 /**
  * i915_guc_submit() - Submit commands through GuC
  * @engine: engine associated with the commands
@@ -574,13 +656,28 @@ static void i915_guc_dequeue(struct intel_engine_cs *engine)
 	bool submit = false;
 	struct rb_node *rb;
 
-	if (port_isset(port))
-		port++;
-
 	spin_lock_irq(&engine->timeline->lock);
 	rb = execlists->first;
 	GEM_BUG_ON(rb_first(&execlists->queue) != rb);
-	while (rb) {
+
+	if (!rb)
+		goto unlock;
+
+	if (port_isset(port)) {
+		if (rb_entry(rb, struct i915_priolist, node)->priority >
+		    max(port_request(port)->priotree.priority, 0)) {
+			WRITE_ONCE(execlists->preempt, true);
+			queue_work(engine->i915->guc.preempt_wq,
+				   &engine->guc_preempt_work);
+			goto unlock;
+		} else if (port_isset(last_port)) {
+			goto unlock;
+		}
+
+		port++;
+	}
+
+	do {
 		struct i915_priolist *p = rb_entry(rb, typeof(*p), node);
 		struct drm_i915_gem_request *rq, *rn;
 
@@ -610,13 +707,14 @@ static void i915_guc_dequeue(struct intel_engine_cs *engine)
 		INIT_LIST_HEAD(&p->requests);
 		if (p->priority != I915_PRIORITY_NORMAL)
 			kmem_cache_free(engine->i915->priorities, p);
-	}
+	} while (rb);
 done:
 	execlists->first = rb;
 	if (submit) {
 		port_assign(port, last);
 		i915_guc_submit(engine);
 	}
+unlock:
 	spin_unlock_irq(&engine->timeline->lock);
 }
 
@@ -625,10 +723,23 @@ static void i915_guc_irq_handler(unsigned long data)
 	struct intel_engine_cs * const engine = (struct intel_engine_cs *)data;
 	struct intel_engine_execlists * const execlists = &engine->execlists;
 	struct execlist_port *port = execlists->port;
-	const struct execlist_port * const last_port =
-		&execlists->port[execlists->port_mask];
 	struct drm_i915_gem_request *rq;
 
+	if (READ_ONCE(execlists->preempt) &&
+	    intel_read_status_page(engine, I915_GEM_HWS_PREEMPT_INDEX) ==
+	    GUC_PREEMPT_FINISHED) {
+		execlist_cancel_port_requests(&engine->execlists);
+
+		spin_lock_irq(&engine->timeline->lock);
+		unwind_incomplete_requests(engine);
+		spin_unlock_irq(&engine->timeline->lock);
+
+		wait_for_guc_preempt_report(engine);
+
+		WRITE_ONCE(execlists->preempt, false);
+		intel_write_status_page(engine, I915_GEM_HWS_PREEMPT_INDEX, 0);
+	}
+
 	rq = port_request(&port[0]);
 	while (rq && i915_gem_request_completed(rq)) {
 		trace_i915_gem_request_out(rq);
@@ -640,7 +751,7 @@ static void i915_guc_irq_handler(unsigned long data)
 		rq = port_request(&port[0]);
 	}
 
-	if (!port_isset(last_port))
+	if (!READ_ONCE(execlists->preempt))
 		i915_guc_dequeue(engine);
 }
 
@@ -1046,14 +1157,22 @@ int i915_guc_submission_init(struct drm_i915_private *dev_priv)
 	if (ret < 0)
 		goto err_vaddr;
 
+	guc->preempt_wq = alloc_ordered_workqueue("i915-guc_preempt", WQ_HIGHPRI);
+	if (!guc->preempt_wq) {
+		ret = -ENOMEM;
+		goto err_log;
+	}
+
 	ret = guc_ads_create(guc);
 	if (ret < 0)
-		goto err_log;
+		goto err_wq;
 
 	ida_init(&guc->stage_ids);
 
 	return 0;
 
+err_wq:
+	destroy_workqueue(guc->preempt_wq);
 err_log:
 	intel_guc_log_destroy(guc);
 err_vaddr:
@@ -1069,6 +1188,7 @@ void i915_guc_submission_fini(struct drm_i915_private *dev_priv)
 
 	ida_destroy(&guc->stage_ids);
 	guc_ads_destroy(guc);
+	destroy_workqueue(guc->preempt_wq);
 	intel_guc_log_destroy(guc);
 	i915_gem_object_unpin_map(guc->stage_desc_pool->obj);
 	i915_vma_unpin_and_release(&guc->stage_desc_pool);
@@ -1204,6 +1324,7 @@ int i915_guc_submission_enable(struct drm_i915_private *dev_priv)
 	for_each_engine(engine, dev_priv, id) {
 		struct intel_engine_execlists * const execlists = &engine->execlists;
 		execlists->irq_tasklet.func = i915_guc_irq_handler;
+		INIT_WORK(&engine->guc_preempt_work, inject_preempt_context);
 	}
 
 	return 0;
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index e6bbdc5dd01b..affef8e808be 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -354,7 +354,7 @@ static void unwind_wa_tail(struct drm_i915_gem_request *rq)
 	assert_ring_tail_valid(rq->ring, rq->tail);
 }
 
-static void unwind_incomplete_requests(struct intel_engine_cs *engine)
+void unwind_incomplete_requests(struct intel_engine_cs *engine)
 {
 	struct drm_i915_gem_request *rq, *rn;
 	struct i915_priolist *uninitialized_var(p);
@@ -687,7 +687,7 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
 		execlists_submit_ports(engine);
 }
 
-static void
+void
 execlist_cancel_port_requests(struct intel_engine_execlists *execlists)
 {
 	struct execlist_port *port = execlists->port;
diff --git a/drivers/gpu/drm/i915/intel_lrc.h b/drivers/gpu/drm/i915/intel_lrc.h
index 689fde1a63a9..e650610de226 100644
--- a/drivers/gpu/drm/i915/intel_lrc.h
+++ b/drivers/gpu/drm/i915/intel_lrc.h
@@ -107,6 +107,10 @@ intel_lr_context_descriptor(struct i915_gem_context *ctx,
 	return ctx->engine[engine->id].lrc_desc;
 }
 
+void unwind_incomplete_requests(struct intel_engine_cs *engine);
+void
+execlist_cancel_port_requests(struct intel_engine_execlists *execlists);
+
 
 /* Execlists */
 int intel_sanitize_enable_execlists(struct drm_i915_private *dev_priv,
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index dd3b8727bf6d..2da1869fe4e3 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -498,6 +498,8 @@ struct intel_engine_cs {
 
 	bool needs_cmd_parser;
 
+	struct work_struct guc_preempt_work;
+
 	/*
 	 * Table of commands the command parser needs to know about
 	 * for this engine.
diff --git a/drivers/gpu/drm/i915/intel_uc.c b/drivers/gpu/drm/i915/intel_uc.c
index f4893c85e54a..ca66b61c799e 100644
--- a/drivers/gpu/drm/i915/intel_uc.c
+++ b/drivers/gpu/drm/i915/intel_uc.c
@@ -177,6 +177,8 @@ static int guc_enable_communication(struct intel_guc *guc)
 
 	guc->shared_data_offset = guc_ggtt_offset(ctx->engine[RCS].state) +
 		LRC_GUCSHR_PN * PAGE_SIZE;
+	guc->shared_data_addr = (void*)ctx->engine[RCS].lrc_reg_state -
+		LRC_STATE_PN * PAGE_SIZE;
 
 	if (HAS_GUC_CT(dev_priv))
 		return intel_guc_enable_ct(guc);
diff --git a/drivers/gpu/drm/i915/intel_uc.h b/drivers/gpu/drm/i915/intel_uc.h
index c6c6f8513bbf..8e2818e5741e 100644
--- a/drivers/gpu/drm/i915/intel_uc.h
+++ b/drivers/gpu/drm/i915/intel_uc.h
@@ -124,6 +124,9 @@ struct intel_guc {
 
 	/* Kernel context state ggtt offset, first page is shared with GuC */
 	u32 shared_data_offset;
+	void *shared_data_addr;
+
+	struct workqueue_struct *preempt_wq;
 
 	/* GuC's FW specific send function */
 	int (*send)(struct intel_guc *guc, const u32 *data, u32 len);
-- 
2.13.5


* [PATCH 10/10] HAX Enable GuC Submission for CI
  2017-10-05  9:20 ` [PATCH 06/10] drm/i915: Extract "emit write" part of emit breadcrumb functions Michał Winiarski
                     ` (2 preceding siblings ...)
  2017-10-05  9:20   ` [PATCH 09/10] drm/i915/guc: Preemption! With GuC Michał Winiarski
@ 2017-10-05  9:20   ` Michał Winiarski
  2017-10-05  9:47   ` [PATCH 06/10] drm/i915: Extract "emit write" part of emit breadcrumb functions Chris Wilson
  4 siblings, 0 replies; 37+ messages in thread
From: Michał Winiarski @ 2017-10-05  9:20 UTC (permalink / raw)
  To: intel-gfx

Also:
Revert "drm/i915/guc: Assert that we switch between known ggtt->invalidate functions"

This reverts commit 04f7b24eccdfae680a36e9825fe0d61dcd5ed528.
---
 drivers/gpu/drm/i915/i915_gem_gtt.c | 8 ++------
 drivers/gpu/drm/i915/i915_params.h  | 4 ++--
 2 files changed, 4 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 4c82ceb8d318..a6b7f81eeb03 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -3377,17 +3377,13 @@ int i915_ggtt_enable_hw(struct drm_i915_private *dev_priv)
 
 void i915_ggtt_enable_guc(struct drm_i915_private *i915)
 {
-	GEM_BUG_ON(i915->ggtt.invalidate != gen6_ggtt_invalidate);
-
 	i915->ggtt.invalidate = guc_ggtt_invalidate;
 }
 
 void i915_ggtt_disable_guc(struct drm_i915_private *i915)
 {
-	/* We should only be called after i915_ggtt_enable_guc() */
-	GEM_BUG_ON(i915->ggtt.invalidate != guc_ggtt_invalidate);
-
-	i915->ggtt.invalidate = gen6_ggtt_invalidate;
+	if (i915->ggtt.invalidate == guc_ggtt_invalidate)
+		i915->ggtt.invalidate = gen6_ggtt_invalidate;
 }
 
 void i915_gem_restore_gtt_mappings(struct drm_i915_private *dev_priv)
diff --git a/drivers/gpu/drm/i915/i915_params.h b/drivers/gpu/drm/i915/i915_params.h
index c7292268ed43..97c06116d777 100644
--- a/drivers/gpu/drm/i915/i915_params.h
+++ b/drivers/gpu/drm/i915/i915_params.h
@@ -44,8 +44,8 @@
 	param(int, disable_power_well, -1) \
 	param(int, enable_ips, 1) \
 	param(int, invert_brightness, 0) \
-	param(int, enable_guc_loading, 0) \
-	param(int, enable_guc_submission, 0) \
+	param(int, enable_guc_loading, 2) \
+	param(int, enable_guc_submission, 2) \
 	param(int, guc_log_level, -1) \
 	param(char *, guc_firmware_path, NULL) \
 	param(char *, huc_firmware_path, NULL) \
-- 
2.13.5


* Re: [PATCH 01/10] drm/i915/guc: Precompute GuC shared data offset
  2017-10-05  9:13 [PATCH 01/10] drm/i915/guc: Precompute GuC shared data offset Michał Winiarski
                   ` (4 preceding siblings ...)
  2017-10-05  9:20 ` [PATCH 06/10] drm/i915: Extract "emit write" part of emit breadcrumb functions Michał Winiarski
@ 2017-10-05  9:33 ` Chris Wilson
  2017-10-05 17:02   ` Daniele Ceraolo Spurio
  2017-10-05 11:30 ` Michal Wajdeczko
  2017-10-26 21:22 ` ✗ Fi.CI.BAT: failure for series starting with [01/10] " Patchwork
  7 siblings, 1 reply; 37+ messages in thread
From: Chris Wilson @ 2017-10-05  9:33 UTC (permalink / raw)
  To: Michał Winiarski, intel-gfx

Quoting Michał Winiarski (2017-10-05 10:13:40)
> We're using first page of kernel context state to share data with GuC,
> let's precompute the ggtt offset at GuC initialization time rather than
> everytime we're using GuC actions.

So LRC_GUCSHR_PN is still 0. Plans for that to change?

Atm, we should be changing one pointer deref for another...
-Chris

* Re: [PATCH 02/10] drm/i915/guc: Initialize GuC before restarting engines
  2017-10-05  9:13 ` [PATCH 02/10] drm/i915/guc: Initialize GuC before restarting engines Michał Winiarski
@ 2017-10-05  9:36   ` Chris Wilson
  2017-10-05 11:55     ` Michał Winiarski
  0 siblings, 1 reply; 37+ messages in thread
From: Chris Wilson @ 2017-10-05  9:36 UTC (permalink / raw)
  To: Michał Winiarski, intel-gfx; +Cc: Mika Kuoppala

Quoting Michał Winiarski (2017-10-05 10:13:41)
> Now that we're handling request resubmission the same way as regular
> submission (from the tasklet), we can move GuC initialization earlier,
> before restarting the engines. This way, we're no longer being in the
> state of flux during engine restart - we're already in user requested
> submission mode.

Do you remember why it was the other way around?

> Cc: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
> Cc: Michel Thierry <michel.thierry@intel.com>
> Cc: Mika Kuoppala <mika.kuoppala@intel.com>
> Cc: Oscar Mateo <oscar.mateo@intel.com>
> Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
-Chris

* Re: [PATCH 04/10] drm/i915/guc: Add a second client, to be used for preemption
  2017-10-05  9:13 ` [PATCH 04/10] drm/i915/guc: Add a second client, to be used for preemption Michał Winiarski
@ 2017-10-05  9:39   ` Chris Wilson
  2017-10-05 13:59   ` Michal Wajdeczko
                     ` (2 subsequent siblings)
  3 siblings, 0 replies; 37+ messages in thread
From: Chris Wilson @ 2017-10-05  9:39 UTC (permalink / raw)
  To: Michał Winiarski, intel-gfx; +Cc: Dave

Quoting Michał Winiarski (2017-10-05 10:13:43)
> From: Dave Gordon <david.s.gordon@intel.com>
> 
> This second client is created with priority KMD_HIGH, and marked
> as preemptive. This will allow us to request preemption using GuC actions.
> 
> Cc: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Jeff Mcgee <jeff.mcgee@intel.com>
> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
> Cc: Oscar Mateo <oscar.mateo@intel.com>
> Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
> Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
> ---
>  drivers/gpu/drm/i915/i915_debugfs.c        |  4 ++++
>  drivers/gpu/drm/i915/i915_guc_submission.c | 32 ++++++++++++++++++++++++++----
>  drivers/gpu/drm/i915/intel_uc.h            |  1 +
>  3 files changed, 33 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
> index b4a6ac60e7c6..1a963c13cab8 100644
> --- a/drivers/gpu/drm/i915/i915_debugfs.c
> +++ b/drivers/gpu/drm/i915/i915_debugfs.c
> @@ -2488,6 +2488,10 @@ static int i915_guc_info(struct seq_file *m, void *data)
>  
>         seq_printf(m, "\nGuC execbuf client @ %p:\n", guc->execbuf_client);
>         i915_guc_client_info(m, dev_priv, guc->execbuf_client);
> +       if (guc->preempt_client) {

Hmm, when do we have guc->execbuf_client but not guc->preempt_client?
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
-Chris

* Re: [PATCH 05/10] drm/i915/guc: Split guc_wq_item_append
  2017-10-05  9:13 ` [PATCH 05/10] drm/i915/guc: Split guc_wq_item_append Michał Winiarski
@ 2017-10-05  9:45   ` Chris Wilson
  2017-10-05 20:37   ` Jeff McGee
  1 sibling, 0 replies; 37+ messages in thread
From: Chris Wilson @ 2017-10-05  9:45 UTC (permalink / raw)
  To: Michał Winiarski, intel-gfx

Quoting Michał Winiarski (2017-10-05 10:13:44)
> We're using a special preempt context for HW to preempt into. We don't
> want to emit any requests there, but we still need to wrap this context
> into a valid GuC work item.
> Let's cleanup the functions operating on GuC work items.
> We can extract guc_request_add - responsible for adding GuC work item and
> ringing the doorbell, and guc_wq_item_append - used by the function
> above, not tied to the concept of gem request.
> 
> Cc: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Jeff Mcgee <jeff.mcgee@intel.com>
> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
> Cc: Oscar Mateo <oscar.mateo@intel.com>
> Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
> ---
>  drivers/gpu/drm/i915/i915_guc_submission.c | 55 ++++++++++++++++--------------
>  1 file changed, 30 insertions(+), 25 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
> index e8052e86426d..2ce2bd6ed509 100644
> --- a/drivers/gpu/drm/i915/i915_guc_submission.c
> +++ b/drivers/gpu/drm/i915/i915_guc_submission.c
> @@ -69,7 +69,7 @@
>   * WQ_TYPE_INORDER is needed to support legacy submission via GuC, which
>   * represents in-order queue. The kernel driver packs ring tail pointer and an
>   * ELSP context descriptor dword into Work Item.
> - * See guc_wq_item_append()
> + * See guc_add_request()
>   *
>   * ADS:
>   * The Additional Data Struct (ADS) has pointers for different buffers used by
> @@ -357,7 +357,7 @@ static void guc_stage_desc_init(struct intel_guc *guc,
>                  * submission or, in other words, not using a direct submission
>                  * model) the KMD's LRCA is not used for any work submission.
>                  * Instead, the GuC uses the LRCA of the user mode context (see
> -                * guc_wq_item_append below).
> +                * guc_add_request below).
>                  */
>                 lrc->context_desc = lower_32_bits(ce->lrc_desc);
>  
> @@ -409,22 +409,18 @@ static void guc_stage_desc_fini(struct intel_guc *guc,
>  
>  /* Construct a Work Item and append it to the GuC's Work Queue */
>  static void guc_wq_item_append(struct i915_guc_client *client,
> -                              struct drm_i915_gem_request *rq)
> +                              u32 target_engine, u32 context_desc,
> +                              u32 ring_tail, u32 fence_id)
>  {
>         /* wqi_len is in DWords, and does not include the one-word header */
>         const size_t wqi_size = sizeof(struct guc_wq_item);
>         const u32 wqi_len = wqi_size / sizeof(u32) - 1;
> -       struct intel_engine_cs *engine = rq->engine;
> -       struct i915_gem_context *ctx = rq->ctx;
>         struct guc_process_desc *desc = __get_process_desc(client);
>         struct guc_wq_item *wqi;
> -       u32 ring_tail, wq_off;
> +       u32 wq_off;
>  
>         lockdep_assert_held(&client->wq_lock);
>  
> -       ring_tail = intel_ring_set_tail(rq->ring, rq->tail) / sizeof(u64);
> -       GEM_BUG_ON(ring_tail > WQ_RING_TAIL_MAX);
> -
>         /* For now workqueue item is 4 DWs; workqueue buffer is 2 pages. So we
>          * should not have the case where structure wqi is across page, neither
>          * wrapped to the beginning. This simplifies the implementation below.
> @@ -446,15 +442,14 @@ static void guc_wq_item_append(struct i915_guc_client *client,
>         /* Now fill in the 4-word work queue item */
>         wqi->header = WQ_TYPE_INORDER |
>                       (wqi_len << WQ_LEN_SHIFT) |
> -                     (engine->guc_id << WQ_TARGET_SHIFT) |
> +                     (target_engine << WQ_TARGET_SHIFT) |
>                       WQ_NO_WCFLUSH_WAIT;
> -
> -       wqi->context_desc = lower_32_bits(intel_lr_context_descriptor(ctx, engine));
> -
> +       wqi->context_desc = context_desc;
>         wqi->submit_element_info = ring_tail << WQ_RING_TAIL_SHIFT;
> -       wqi->fence_id = rq->global_seqno;
> +       GEM_BUG_ON(ring_tail > WQ_RING_TAIL_MAX);
> +       wqi->fence_id = fence_id;

Hmm. Are we causing a problem for ourselves with the reuse of fence_ids?
Afaik, fence_id is currently unused, but I did think at some point we
will pass around guc fences.

Probably doesn't matter until all the missing pieces of the jigsaw are
put together, by which point fence_id will probably no longer be
global_seqno.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
-Chris

* Re: [PATCH 06/10] drm/i915: Extract "emit write" part of emit breadcrumb functions
  2017-10-05  9:20 ` [PATCH 06/10] drm/i915: Extract "emit write" part of emit breadcrumb functions Michał Winiarski
                     ` (3 preceding siblings ...)
  2017-10-05  9:20   ` [PATCH 10/10] HAX Enable GuC Submission for CI Michał Winiarski
@ 2017-10-05  9:47   ` Chris Wilson
  4 siblings, 0 replies; 37+ messages in thread
From: Chris Wilson @ 2017-10-05  9:47 UTC (permalink / raw)
  To: Michał Winiarski, intel-gfx

Quoting Michał Winiarski (2017-10-05 10:20:01)
> Let's separate the "emit" part from touching any internal structures,
> this way we can have a generic "emit coherent GGTT write" function.
> We would like to reuse this functionality for emitting HWSP write, to
> confirm that preempt-to-idle has finished.
> 
> Cc: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
> ---
>  drivers/gpu/drm/i915/intel_lrc.c        | 21 +++++-------------
>  drivers/gpu/drm/i915/intel_ringbuffer.h | 38 +++++++++++++++++++++++++++++++++
>  2 files changed, 43 insertions(+), 16 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
> index 12956f2ba699..e6bbdc5dd01b 100644
> --- a/drivers/gpu/drm/i915/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/intel_lrc.c
> @@ -1766,10 +1766,8 @@ static void gen8_emit_breadcrumb(struct drm_i915_gem_request *request, u32 *cs)
>         /* w/a: bit 5 needs to be zero for MI_FLUSH_DW address. */
>         BUILD_BUG_ON(I915_GEM_HWS_INDEX_ADDR & (1 << 5));
>  
> -       *cs++ = (MI_FLUSH_DW + 1) | MI_FLUSH_DW_OP_STOREDW;
> -       *cs++ = intel_hws_seqno_address(request->engine) | MI_FLUSH_DW_USE_GTT;
> -       *cs++ = 0;
> -       *cs++ = request->global_seqno;
> +       cs = gen8_emit_ggtt_write(intel_hws_seqno_address(request->engine),
> +                                 request->global_seqno, cs);
>         *cs++ = MI_USER_INTERRUPT;
>         *cs++ = MI_NOOP;
>         request->tail = intel_ring_offset(request, cs);
> @@ -1785,18 +1783,9 @@ static void gen8_emit_breadcrumb_render(struct drm_i915_gem_request *request,
>         /* We're using qword write, seqno should be aligned to 8 bytes. */
>         BUILD_BUG_ON(I915_GEM_HWS_INDEX & 1);
>  
> -       /* w/a for post sync ops following a GPGPU operation we
> -        * need a prior CS_STALL, which is emitted by the flush
> -        * following the batch.
> -        */
> -       *cs++ = GFX_OP_PIPE_CONTROL(6);
> -       *cs++ = PIPE_CONTROL_GLOBAL_GTT_IVB | PIPE_CONTROL_CS_STALL |
> -               PIPE_CONTROL_QW_WRITE;
> -       *cs++ = intel_hws_seqno_address(request->engine);
> -       *cs++ = 0;
> -       *cs++ = request->global_seqno;
> -       /* We're thrashing one dword of HWS. */
> -       *cs++ = 0;
> +       cs = gen8_emit_ggtt_write_render(
> +                                     intel_hws_seqno_address(request->engine),
> +                                     request->global_seqno, cs);

Would this be less problematic if we s/render/rcs/ ?

>         *cs++ = MI_USER_INTERRUPT;
>         *cs++ = MI_NOOP;
>         request->tail = intel_ring_offset(request, cs);
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
> index 0fedda17488c..e9650552d3c2 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.h
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
> @@ -831,6 +831,44 @@ static inline u32 *gen8_emit_pipe_control(u32 *batch, u32 flags, u32 offset)
>         return batch + 6;
>  }
>  
> +static inline u32 *
> +gen8_emit_ggtt_write_render(u32 gtt_offset, u32 value, u32 *cs)

I prefer cs to be the first param. I just checked: we did
emit_pipe_control in that fashion as well...
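For illustration, a minimal sketch of the non-render emitter with cs leading
(the body is lifted from the breadcrumb being replaced above; the final name
and flags are whatever the series settles on):

	static inline u32 *
	gen8_emit_ggtt_write(u32 *cs, u32 value, u32 gtt_offset)
	{
		/* Post-sync qword write via MI_FLUSH_DW into the GGTT */
		*cs++ = (MI_FLUSH_DW + 1) | MI_FLUSH_DW_OP_STOREDW;
		*cs++ = gtt_offset | MI_FLUSH_DW_USE_GTT;
		*cs++ = 0;
		*cs++ = value;

		return cs;
	}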

In principle,
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
-Chris

* Re: [PATCH 07/10] drm/i915: Add information needed to track engine preempt state
  2017-10-05  9:20   ` [PATCH 07/10] drm/i915: Add information needed to track engine preempt state Michał Winiarski
@ 2017-10-05  9:49     ` Chris Wilson
  2017-10-05 20:42     ` Jeff McGee
  1 sibling, 0 replies; 37+ messages in thread
From: Chris Wilson @ 2017-10-05  9:49 UTC (permalink / raw)
  To: Michał Winiarski, intel-gfx

Quoting Michał Winiarski (2017-10-05 10:20:02)
> We shouldn't inspect ELSP context status (or any other bits depending on
> specific submission backend) when using GuC submission.
> Let's use another piece of HWSP for preempt context, to write its bit of
> information, meaning that preemption has finished, and hardware is now
> idle.
> 
> Cc: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Jeff Mcgee <jeff.mcgee@intel.com>
> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
> Cc: Oscar Mateo <oscar.mateo@intel.com>
> Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>

But I'm not keen on this being a standalone patch; it has no use by itself.
-Chris

* Re: [PATCH 08/10] drm/i915/guc: Keep request->priority for its lifetime
  2017-10-05  9:20   ` [PATCH 08/10] drm/i915/guc: Keep request->priority for its lifetime Michał Winiarski
@ 2017-10-05  9:50     ` Chris Wilson
  0 siblings, 0 replies; 37+ messages in thread
From: Chris Wilson @ 2017-10-05  9:50 UTC (permalink / raw)
  To: Michał Winiarski, intel-gfx

Quoting Michał Winiarski (2017-10-05 10:20:03)
> We also want to support preemption with GuC submission backend.
> In order to do that, we need to remember the priority, like we do on
> execlists path.
> 
> Cc: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Jeff Mcgee <jeff.mcgee@intel.com>
> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
-Chris

* Re: [PATCH 09/10] drm/i915/guc: Preemption! With GuC.
  2017-10-05  9:20   ` [PATCH 09/10] drm/i915/guc: Preemption! With GuC Michał Winiarski
@ 2017-10-05 10:04     ` Chris Wilson
  2017-10-05 10:31     ` Chris Wilson
  2017-10-05 15:00     ` Michal Wajdeczko
  2 siblings, 0 replies; 37+ messages in thread
From: Chris Wilson @ 2017-10-05 10:04 UTC (permalink / raw)
  To: Michał Winiarski, intel-gfx

Quoting Michał Winiarski (2017-10-05 10:20:04)
> Pretty similar to what we have on execlists.
> We're reusing most of the GEM code, however, due to GuC quirks we need a
> couple of extra bits.
> Preemption is implemented as GuC action, and actions can be pretty slow.
> Because of that, we're using a mutex to serialize them. Since we're
> requesting preemption from the tasklet, the task of creating a workitem
> and wrapping it in GuC action is delegated to a worker.
> 
> To distinguish that preemption has finished, we're using additional
> piece of HWSP, and since we're not getting context switch interrupts,
> we're also adding a user interrupt.
> 
> The fact that our special preempt context has completed unfortunately
> doesn't mean that we're ready to submit new work. We also need to wait
> for GuC to finish its own processing.
> 
> Cc: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Jeff Mcgee <jeff.mcgee@intel.com>
> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> Cc: Oscar Mateo <oscar.mateo@intel.com>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
> ---
>  drivers/gpu/drm/i915/i915_drv.c            |   3 +-
>  drivers/gpu/drm/i915/i915_guc_submission.c | 139 +++++++++++++++++++++++++++--
>  drivers/gpu/drm/i915/intel_lrc.c           |   4 +-
>  drivers/gpu/drm/i915/intel_lrc.h           |   4 +
>  drivers/gpu/drm/i915/intel_ringbuffer.h    |   2 +
>  drivers/gpu/drm/i915/intel_uc.c            |   2 +
>  drivers/gpu/drm/i915/intel_uc.h            |   3 +
>  7 files changed, 144 insertions(+), 13 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
> index 66fc156b294a..06670cc3ece0 100644
> --- a/drivers/gpu/drm/i915/i915_drv.c
> +++ b/drivers/gpu/drm/i915/i915_drv.c
> @@ -373,8 +373,7 @@ static int i915_getparam(struct drm_device *dev, void *data,
>                         value |= I915_SCHEDULER_CAP_PRIORITY;
>  
>                         if (INTEL_INFO(dev_priv)->has_logical_ring_preemption &&
> -                           i915_modparams.enable_execlists &&
> -                           !i915_modparams.enable_guc_submission)
> +                           i915_modparams.enable_execlists)
>                                 value |= I915_SCHEDULER_CAP_PREEMPTION;
>                 }
>                 break;
> diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
> index 0f36bba9fc9e..21412ea3dc0e 100644
> --- a/drivers/gpu/drm/i915/i915_guc_submission.c
> +++ b/drivers/gpu/drm/i915/i915_guc_submission.c
> @@ -498,6 +498,88 @@ static void guc_add_request(struct intel_guc *guc,
>         spin_unlock(&client->wq_lock);
>  }
>  
> +#define GUC_PREEMPT_FINISHED 0x1
> +#define GUC_PREEMPT_BREADCRUMB_DWORDS 0x8
> +static void inject_preempt_context(struct work_struct *work)
> +{
> +       struct intel_engine_cs *engine =
> +               container_of(work, typeof(*engine), guc_preempt_work);
> +       struct drm_i915_private *dev_priv = engine->i915;
> +       struct intel_guc *guc = &engine->i915->guc;
> +       struct i915_guc_client *client = guc->preempt_client;
> +       struct intel_context *ce = &client->owner->engine[engine->id];
> +       u32 ctx_desc = lower_32_bits(intel_lr_context_descriptor(client->owner,
> +                                                                engine));
> +       u32 *cs = ce->ring->vaddr + ce->ring->tail;
> +       u32 data[7];
> +
> +       if (engine->id == RCS) {
> +               cs = gen8_emit_ggtt_write_render(
> +                               intel_hws_preempt_done_address(engine),
> +                               GUC_PREEMPT_FINISHED, cs);
> +               *cs++ = MI_USER_INTERRUPT;
> +               *cs++ = MI_NOOP;
> +       } else {
> +               cs = gen8_emit_ggtt_write(
> +                               intel_hws_preempt_done_address(engine),
> +                               GUC_PREEMPT_FINISHED, cs);
> +               *cs++ = MI_USER_INTERRUPT;
> +               *cs++ = MI_NOOP;
> +               *cs++ = MI_NOOP;
> +               *cs++ = MI_NOOP;
> +       }
> +
> +       ce->ring->tail += GUC_PREEMPT_BREADCRUMB_DWORDS * sizeof(u32);
> +       ce->ring->tail &= (ce->ring->size - 1);

My turn! What guarantees do you give me that everything is nicely
aligned such that we don't overflow the ring?
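One way of making that guarantee explicit (a sketch only, assuming the
breadcrumb is always emitted as a single fixed-size block):

	/* The preempt breadcrumb is always GUC_PREEMPT_BREADCRUMB_DWORDS, so
	 * a ring whose size is a multiple of it can never wrap mid-block. */
	GEM_BUG_ON(!IS_ALIGNED(ce->ring->size,
			       GUC_PREEMPT_BREADCRUMB_DWORDS * sizeof(u32)));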

> +       if (i915_vma_is_map_and_fenceable(ce->ring->vma))
> +               POSTING_READ_FW(GUC_STATUS);

Push this to

flush_ggtt_writes(ce->ring->vma);

It'll hide the dev_priv, and we can explain it a bit more carefully,
for both users.
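A sketch of that helper (keeping the posting read from GUC_STATUS as the
flushing mechanism is an assumption carried over from the quoted code):

	static void flush_ggtt_writes(struct i915_vma *vma)
	{
		/* dev_priv is used implicitly by POSTING_READ_FW() */
		struct drm_i915_private *dev_priv = to_i915(vma->obj->base.dev);

		/* Ensure the tail update written through the GGTT has landed
		 * before the GuC is told to go and read it; a posting read
		 * from any mmio register is a sufficient barrier. */
		if (i915_vma_is_map_and_fenceable(vma))
			POSTING_READ_FW(GUC_STATUS);
	}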

> +       spin_lock_irq(&client->wq_lock);
> +       guc_wq_item_append(client, engine->guc_id, ctx_desc,
> +                          ce->ring->tail / sizeof(u64), 0);
> +       spin_unlock_irq(&client->wq_lock);
> +
> +       data[0] = INTEL_GUC_ACTION_REQUEST_PREEMPTION;
> +       data[1] = client->stage_id;
> +       data[2] = INTEL_GUC_PREEMPT_OPTION_IMMEDIATE |
> +                 INTEL_GUC_PREEMPT_OPTION_DROP_WORK_Q |
> +                 INTEL_GUC_PREEMPT_OPTION_DROP_SUBMIT_Q;
> +       data[3] = engine->guc_id;
> +       data[4] = guc->execbuf_client->priority;
> +       data[5] = guc->execbuf_client->stage_id;
> +       data[6] = guc->shared_data_offset;

On the surface, this all makes sense. I'll let those in the know check
that this is what the guc wants.

> +       if (WARN_ON(intel_guc_send(guc, data, ARRAY_SIZE(data)))) {
> +               WRITE_ONCE(engine->execlists.preempt, false);
> +               tasklet_schedule(&engine->execlists.irq_tasklet);
> +       }
> +}
> +
> +/*
> + * We're using user interrupt and HWSP value to mark that preemption has
> + * finished and GPU is idle. Normally, we could unwind and continue similar to
> + * execlists submission path. Unfortunately, with GuC we also need to wait for
> + * it to finish its own postprocessing, before attempting to submit. Otherwise
> + * GuC may silently ignore our submissions, and thus we risk losing request at
> + * best, executing out-of-order and causing kernel panic at worst.
> + */
> +#define GUC_PREEMPT_POSTPROCESS_DELAY_MS 10
> +static void wait_for_guc_preempt_report(struct intel_engine_cs *engine)
> +{
> +       struct intel_guc *guc = &engine->i915->guc;
> +       struct guc_shared_ctx_data *data = guc->shared_data_addr;
> +       struct guc_ctx_report *report = &data->preempt_ctx_report[engine->guc_id];
> +
> +       GEM_BUG_ON(wait_for_atomic(report->report_return_status ==
> +                                  INTEL_GUC_REPORT_STATUS_COMPLETE,
> +                                  GUC_PREEMPT_POSTPROCESS_DELAY_MS));

This will be compiled out for !debug. Is that intentional? The above
comment suggests it may be, but also that we do want to at least try the wait.
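If the wait itself is meant to happen even in non-debug builds, one option
(a sketch; the error message wording is illustrative) would be:

	if (wait_for_atomic(report->report_return_status ==
			    INTEL_GUC_REPORT_STATUS_COMPLETE,
			    GUC_PREEMPT_POSTPROCESS_DELAY_MS))
		DRM_ERROR("GuC failed to report preemption completion\n");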

> +       /* GuC is expecting that we're also going to clear the affected context
> +        * counter */
> +       report->affected_count = 0;
> +}
> +
> +
>  /**
>   * i915_guc_submit() - Submit commands through GuC
>   * @engine: engine associated with the commands
> @@ -574,13 +656,28 @@ static void i915_guc_dequeue(struct intel_engine_cs *engine)
>         bool submit = false;
>         struct rb_node *rb;
>  
> -       if (port_isset(port))
> -               port++;
> -
>         spin_lock_irq(&engine->timeline->lock);
>         rb = execlists->first;
>         GEM_BUG_ON(rb_first(&execlists->queue) != rb);
> -       while (rb) {
> +
> +       if (!rb)
> +               goto unlock;
> +
> +       if (port_isset(port)) {

Feeling that we should do can_preempt() here as well, just to have that
get-out-of-jail clause.
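A sketch of such a get-out-of-jail predicate (the name and the exact
condition are illustrative, not from the patch):

	static bool can_preempt(const struct intel_engine_cs *engine)
	{
		return INTEL_INFO(engine->i915)->has_logical_ring_preemption;
	}

which the queue_work() branch in the quoted hunk could then check before
deciding to inject the preempt context.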

> +               if (rb_entry(rb, struct i915_priolist, node)->priority >
> +                   max(port_request(port)->priotree.priority, 0)) {
> +                       WRITE_ONCE(execlists->preempt, true);
> +                       queue_work(engine->i915->guc.preempt_wq,
> +                                  &engine->guc_preempt_work);
> +                       goto unlock;
> +               } else if (port_isset(last_port)) {
> +                       goto unlock;
> +               }
> +
> +               port++;
> +       }
> +
> +       do {
>                 struct i915_priolist *p = rb_entry(rb, typeof(*p), node);
>                 struct drm_i915_gem_request *rq, *rn;
>  
> @@ -610,13 +707,14 @@ static void i915_guc_dequeue(struct intel_engine_cs *engine)
>                 INIT_LIST_HEAD(&p->requests);
>                 if (p->priority != I915_PRIORITY_NORMAL)
>                         kmem_cache_free(engine->i915->priorities, p);
> -       }
> +       } while (rb);
>  done:
>         execlists->first = rb;
>         if (submit) {
>                 port_assign(port, last);
>                 i915_guc_submit(engine);
>         }
> +unlock:
>         spin_unlock_irq(&engine->timeline->lock);
>  }
>  
> @@ -625,10 +723,23 @@ static void i915_guc_irq_handler(unsigned long data)
>         struct intel_engine_cs * const engine = (struct intel_engine_cs *)data;
>         struct intel_engine_execlists * const execlists = &engine->execlists;
>         struct execlist_port *port = execlists->port;
> -       const struct execlist_port * const last_port =
> -               &execlists->port[execlists->port_mask];
>         struct drm_i915_gem_request *rq;
>  
> +       if (READ_ONCE(execlists->preempt) &&
> +           intel_read_status_page(engine, I915_GEM_HWS_PREEMPT_INDEX) ==
> +           GUC_PREEMPT_FINISHED) {
> +               execlist_cancel_port_requests(&engine->execlists);
> +
> +               spin_lock_irq(&engine->timeline->lock);
> +               unwind_incomplete_requests(engine);
> +               spin_unlock_irq(&engine->timeline->lock);
> +
> +               wait_for_guc_preempt_report(engine);
> +
> +               WRITE_ONCE(execlists->preempt, false);
> +               intel_write_status_page(engine, I915_GEM_HWS_PREEMPT_INDEX, 0);
> +       }
> +
>         rq = port_request(&port[0]);
>         while (rq && i915_gem_request_completed(rq)) {
>                 trace_i915_gem_request_out(rq);
> @@ -640,7 +751,7 @@ static void i915_guc_irq_handler(unsigned long data)
>                 rq = port_request(&port[0]);
>         }
>  
> -       if (!port_isset(last_port))
> +       if (!READ_ONCE(execlists->preempt))
>                 i915_guc_dequeue(engine);
>  }
>  
> @@ -1046,14 +1157,22 @@ int i915_guc_submission_init(struct drm_i915_private *dev_priv)
>         if (ret < 0)
>                 goto err_vaddr;
>  

/* Please explain your choice. */

Ordered? I presume that's because we are serialised by the guc send mutex,
so you don't want engines preempting in parallel. Makes sense, but please
do record the rationale for the reader.

How should we cancel the preempt work on reset?
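A sketch of both points (the comment wording and the reset-path placement
are assumptions, not part of the patch):

	/* Preemption requests are already serialised by guc->send_mutex, so
	 * an ordered, WQ_HIGHPRI workqueue is enough; nothing is gained by
	 * injecting preempt contexts on several engines in parallel. */
	guc->preempt_wq = alloc_ordered_workqueue("i915-guc_preempt",
						  WQ_HIGHPRI);

and, somewhere in the per-engine reset path:

	cancel_work_sync(&engine->guc_preempt_work);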

> +       guc->preempt_wq = alloc_ordered_workqueue("i915-guc_preempt", WQ_HIGHPRI);
> +       if (!guc->preempt_wq) {
> +               ret = -ENOMEM;
> +               goto err_log;
> +       }
> +
>         ret = guc_ads_create(guc);
>         if (ret < 0)
> -               goto err_log;
> +               goto err_wq;
>  
>         ida_init(&guc->stage_ids);
>  
>         return 0;
>  
> +err_wq:
> +       destroy_workqueue(guc->preempt_wq);
>  err_log:
>         intel_guc_log_destroy(guc);
>  err_vaddr:
> @@ -1069,6 +1188,7 @@ void i915_guc_submission_fini(struct drm_i915_private *dev_priv)
>  
>         ida_destroy(&guc->stage_ids);
>         guc_ads_destroy(guc);
> +       destroy_workqueue(guc->preempt_wq);
>         intel_guc_log_destroy(guc);
>         i915_gem_object_unpin_map(guc->stage_desc_pool->obj);
>         i915_vma_unpin_and_release(&guc->stage_desc_pool);
> @@ -1204,6 +1324,7 @@ int i915_guc_submission_enable(struct drm_i915_private *dev_priv)
>         for_each_engine(engine, dev_priv, id) {
>                 struct intel_engine_execlists * const execlists = &engine->execlists;
>                 execlists->irq_tasklet.func = i915_guc_irq_handler;
> +               INIT_WORK(&engine->guc_preempt_work, inject_preempt_context);
>         }
>  
>         return 0;
> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
> index e6bbdc5dd01b..affef8e808be 100644
> --- a/drivers/gpu/drm/i915/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/intel_lrc.c
> @@ -354,7 +354,7 @@ static void unwind_wa_tail(struct drm_i915_gem_request *rq)
>         assert_ring_tail_valid(rq->ring, rq->tail);
>  }
>  
> -static void unwind_incomplete_requests(struct intel_engine_cs *engine)
> +void unwind_incomplete_requests(struct intel_engine_cs *engine)
>  {
>         struct drm_i915_gem_request *rq, *rn;
>         struct i915_priolist *uninitialized_var(p);
> @@ -687,7 +687,7 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
>                 execlists_submit_ports(engine);
>  }
>  
> -static void
> +void
>  execlist_cancel_port_requests(struct intel_engine_execlists *execlists)
>  {
>         struct execlist_port *port = execlists->port;
> diff --git a/drivers/gpu/drm/i915/intel_lrc.h b/drivers/gpu/drm/i915/intel_lrc.h
> index 689fde1a63a9..e650610de226 100644
> --- a/drivers/gpu/drm/i915/intel_lrc.h
> +++ b/drivers/gpu/drm/i915/intel_lrc.h
> @@ -107,6 +107,10 @@ intel_lr_context_descriptor(struct i915_gem_context *ctx,
>         return ctx->engine[engine->id].lrc_desc;
>  }
>  
> +void unwind_incomplete_requests(struct intel_engine_cs *engine);
> +void
> +execlist_cancel_port_requests(struct intel_engine_execlists *execlists);

We are using execlists as our prefix, but only so far for internal
intel_lrc.c usage, so perhaps intel_execlists_ for sharing (verbosity is
unfortunately the way of unambiguity). And we should convert
unwind_incomplete_requests() over.
-Chris

* Re: [PATCH 09/10] drm/i915/guc: Preemption! With GuC.
  2017-10-05  9:20   ` [PATCH 09/10] drm/i915/guc: Preemption! With GuC Michał Winiarski
  2017-10-05 10:04     ` Chris Wilson
@ 2017-10-05 10:31     ` Chris Wilson
  2017-10-05 18:38       ` Michał Winiarski
  2017-10-05 15:00     ` Michal Wajdeczko
  2 siblings, 1 reply; 37+ messages in thread
From: Chris Wilson @ 2017-10-05 10:31 UTC (permalink / raw)
  To: Michał Winiarski, intel-gfx

Quoting Michał Winiarski (2017-10-05 10:20:04)
> @@ -625,10 +723,23 @@ static void i915_guc_irq_handler(unsigned long data)
>         struct intel_engine_cs * const engine = (struct intel_engine_cs *)data;
>         struct intel_engine_execlists * const execlists = &engine->execlists;
>         struct execlist_port *port = execlists->port;
> -       const struct execlist_port * const last_port =
> -               &execlists->port[execlists->port_mask];
>         struct drm_i915_gem_request *rq;
>  
> +       if (READ_ONCE(execlists->preempt) &&
> +           intel_read_status_page(engine, I915_GEM_HWS_PREEMPT_INDEX) ==
> +           GUC_PREEMPT_FINISHED) {
> +               execlist_cancel_port_requests(&engine->execlists);
> +
> +               spin_lock_irq(&engine->timeline->lock);
> +               unwind_incomplete_requests(engine);
> +               spin_unlock_irq(&engine->timeline->lock);
> +
> +               wait_for_guc_preempt_report(engine);
> +
> +               WRITE_ONCE(execlists->preempt, false);
> +               intel_write_status_page(engine, I915_GEM_HWS_PREEMPT_INDEX, 0);

Ok. So we injected the preempt context, observed it complete. What stops
the GuC from switching back to the normal context and draining its wq?
(Those requests we've just put back to be re-executed later.)
-Chris

* Re: [PATCH 01/10] drm/i915/guc: Precompute GuC shared data offset
  2017-10-05  9:13 [PATCH 01/10] drm/i915/guc: Precompute GuC shared data offset Michał Winiarski
                   ` (5 preceding siblings ...)
  2017-10-05  9:33 ` [PATCH 01/10] drm/i915/guc: Precompute GuC shared data offset Chris Wilson
@ 2017-10-05 11:30 ` Michal Wajdeczko
  2017-10-26 21:22 ` ✗ Fi.CI.BAT: failure for series starting with [01/10] " Patchwork
  7 siblings, 0 replies; 37+ messages in thread
From: Michal Wajdeczko @ 2017-10-05 11:30 UTC (permalink / raw)
  To: intel-gfx, Michał Winiarski

On Thu, 05 Oct 2017 11:13:40 +0200, Michał Winiarski  
<michal.winiarski@intel.com> wrote:

> We're using first page of kernel context state to share data with GuC,
> let's precompute the ggtt offset at GuC initialization time rather than
> everytime we're using GuC actions.
>
> Cc: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Jeff Mcgee <jeff.mcgee@intel.com>
> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
> Cc: Oscar Mateo <oscar.mateo@intel.com>
> Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
> ---
>  drivers/gpu/drm/i915/i915_guc_submission.c | 4 ++--
>  drivers/gpu/drm/i915/intel_uc.c            | 4 ++++
>  drivers/gpu/drm/i915/intel_uc.h            | 3 +++
>  3 files changed, 9 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c  
> b/drivers/gpu/drm/i915/i915_guc_submission.c
> index 04f1281d81a5..2c0aeee3143d 100644
> --- a/drivers/gpu/drm/i915/i915_guc_submission.c
> +++ b/drivers/gpu/drm/i915/i915_guc_submission.c
> @@ -1234,7 +1234,7 @@ int intel_guc_suspend(struct drm_i915_private  
> *dev_priv)
>  	/* any value greater than GUC_POWER_D0 */
>  	data[1] = GUC_POWER_D1;
>  	/* first page is shared data with GuC */

Please drop this now redundant comment

> -	data[2] = guc_ggtt_offset(ctx->engine[RCS].state) + LRC_GUCSHR_PN *  
> PAGE_SIZE;
> +	data[2] = guc->shared_data_offset;
> 	return intel_guc_send(guc, data, ARRAY_SIZE(data));
>  }
> @@ -1260,7 +1260,7 @@ int intel_guc_resume(struct drm_i915_private  
> *dev_priv)
>  	data[0] = INTEL_GUC_ACTION_EXIT_S_STATE;
>  	data[1] = GUC_POWER_D0;
>  	/* first page is shared data with GuC */

Please drop this now redundant comment

> -	data[2] = guc_ggtt_offset(ctx->engine[RCS].state) + LRC_GUCSHR_PN *  
> PAGE_SIZE;
> +	data[2] = guc->shared_data_offset;
> 	return intel_guc_send(guc, data, ARRAY_SIZE(data));
>  }
> diff --git a/drivers/gpu/drm/i915/intel_uc.c  
> b/drivers/gpu/drm/i915/intel_uc.c
> index e7875277ba97..f4893c85e54a 100644
> --- a/drivers/gpu/drm/i915/intel_uc.c
> +++ b/drivers/gpu/drm/i915/intel_uc.c
> @@ -173,6 +173,10 @@ static void guc_free_load_err_log(struct intel_guc  
> *guc)
>  static int guc_enable_communication(struct intel_guc *guc)
>  {
>  	struct drm_i915_private *dev_priv = guc_to_i915(guc);
> +	struct i915_gem_context *ctx = dev_priv->kernel_context;
> +
> +	guc->shared_data_offset = guc_ggtt_offset(ctx->engine[RCS].state) +
> +		LRC_GUCSHR_PN * PAGE_SIZE;

Hmm, I'm not sure that this is the best place for such initialization.
What about doing it directly in intel_uc_init_hw() or in a dedicated inline?
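A dedicated inline could look something like this (a sketch; the helper name
is made up):

	static inline u32 intel_guc_shared_data_offset(struct intel_guc *guc)
	{
		struct drm_i915_private *dev_priv = guc_to_i915(guc);
		struct i915_gem_context *ctx = dev_priv->kernel_context;

		/* First page of the kernel context state is shared with GuC */
		return guc_ggtt_offset(ctx->engine[RCS].state) +
		       LRC_GUCSHR_PN * PAGE_SIZE;
	}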

> 	if (HAS_GUC_CT(dev_priv))
>  		return intel_guc_enable_ct(guc);
> diff --git a/drivers/gpu/drm/i915/intel_uc.h  
> b/drivers/gpu/drm/i915/intel_uc.h
> index 4fa091e90b5f..10e8f0ed02e4 100644
> --- a/drivers/gpu/drm/i915/intel_uc.h
> +++ b/drivers/gpu/drm/i915/intel_uc.h
> @@ -121,6 +121,9 @@ struct intel_guc {
>  	/* To serialize the intel_guc_send actions */
>  	struct mutex send_mutex;
> +	/* Kernel context state ggtt offset, first page is shared with GuC */
> +	u32 shared_data_offset;

We want to keep all 'send'-related members together.
Maybe a better place for this new member would be after 'execbuf_client'?
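For illustration, the suggested placement (a sketch):

	struct i915_guc_client *execbuf_client;

	/* Kernel context state ggtt offset, first page is shared with GuC */
	u32 shared_data_offset;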

> +
>  	/* GuC's FW specific send function */
>  	int (*send)(struct intel_guc *guc, const u32 *data, u32 len);

* Re: [PATCH 02/10] drm/i915/guc: Initialize GuC before restarting engines
  2017-10-05  9:36   ` Chris Wilson
@ 2017-10-05 11:55     ` Michał Winiarski
  0 siblings, 0 replies; 37+ messages in thread
From: Michał Winiarski @ 2017-10-05 11:55 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx, Mika Kuoppala

On Thu, Oct 05, 2017 at 09:36:31AM +0000, Chris Wilson wrote:
> Quoting Michał Winiarski (2017-10-05 10:13:41)
> > Now that we're handling request resubmission the same way as regular
> > submission (from the tasklet), we can move GuC initialization earlier,
> > before restarting the engines. This way, we're no longer being in the
> > state of flux during engine restart - we're already in user requested
> > submission mode.
> 
> Do you remember why it was the other way around?

IIRC we were resubmitting directly from "intel_uc_init_hw".
I guess doing that without calling ->init_hw() first was harmful :)

-Michał

> 
> > Cc: Chris Wilson <chris@chris-wilson.co.uk>
> > Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
> > Cc: Michel Thierry <michel.thierry@intel.com>
> > Cc: Mika Kuoppala <mika.kuoppala@intel.com>
> > Cc: Oscar Mateo <oscar.mateo@intel.com>
> > Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
> -Chris

* Re: [PATCH 04/10] drm/i915/guc: Add a second client, to be used for preemption
  2017-10-05  9:13 ` [PATCH 04/10] drm/i915/guc: Add a second client, to be used for preemption Michał Winiarski
  2017-10-05  9:39   ` Chris Wilson
@ 2017-10-05 13:59   ` Michal Wajdeczko
  2017-10-05 14:58   ` Jeff McGee
  2017-10-05 15:22   ` Daniele Ceraolo Spurio
  3 siblings, 0 replies; 37+ messages in thread
From: Michal Wajdeczko @ 2017-10-05 13:59 UTC (permalink / raw)
  To: intel-gfx, Michał Winiarski; +Cc: Dave Gordon

On Thu, 05 Oct 2017 11:13:43 +0200, Michał Winiarski  
<michal.winiarski@intel.com> wrote:

> From: Dave Gordon <david.s.gordon@intel.com>
>
> This second client is created with priority KMD_HIGH, and marked
> as preemptive. This will allow us to request preemption using GuC  
> actions.
>
> Cc: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Jeff Mcgee <jeff.mcgee@intel.com>
> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
> Cc: Oscar Mateo <oscar.mateo@intel.com>
> Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
> Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
> ---
>  drivers/gpu/drm/i915/i915_debugfs.c        |  4 ++++
>  drivers/gpu/drm/i915/i915_guc_submission.c | 32  
> ++++++++++++++++++++++++++----
>  drivers/gpu/drm/i915/intel_uc.h            |  1 +
>  3 files changed, 33 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_debugfs.c  
> b/drivers/gpu/drm/i915/i915_debugfs.c
> index b4a6ac60e7c6..1a963c13cab8 100644
> --- a/drivers/gpu/drm/i915/i915_debugfs.c
> +++ b/drivers/gpu/drm/i915/i915_debugfs.c
> @@ -2488,6 +2488,10 @@ static int i915_guc_info(struct seq_file *m, void  
> *data)
> 	seq_printf(m, "\nGuC execbuf client @ %p:\n", guc->execbuf_client);
>  	i915_guc_client_info(m, dev_priv, guc->execbuf_client);
> +	if (guc->preempt_client) {
> +		seq_printf(m, "\nGuC preempt client\n");

Maybe, to match the 'execbuf' output above, this should also print "@ %p:"
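i.e. a sketch of the matching line:

	seq_printf(m, "\nGuC preempt client @ %p:\n", guc->preempt_client);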

> +		i915_guc_client_info(m, dev_priv, guc->preempt_client);
> +	}
> 	i915_guc_log_info(m, dev_priv);
> diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c  
> b/drivers/gpu/drm/i915/i915_guc_submission.c
> index 374501a67274..e8052e86426d 100644
> --- a/drivers/gpu/drm/i915/i915_guc_submission.c
> +++ b/drivers/gpu/drm/i915/i915_guc_submission.c
> @@ -331,6 +331,8 @@ static void guc_stage_desc_init(struct intel_guc  
> *guc,
>  	memset(desc, 0, sizeof(*desc));
> 	desc->attribute = GUC_STAGE_DESC_ATTR_ACTIVE |  
> GUC_STAGE_DESC_ATTR_KERNEL;
> +	if (client->priority <= GUC_CLIENT_PRIORITY_HIGH)
> +		desc->attribute |= GUC_STAGE_DESC_ATTR_PREEMPT;
>  	desc->stage_id = client->stage_id;
>  	desc->priority = client->priority;
>  	desc->db_id = client->doorbell_id;
> @@ -1154,7 +1156,7 @@ int i915_guc_submission_enable(struct  
> drm_i915_private *dev_priv)
>  		     sizeof(struct guc_wq_item) *
>  		     I915_NUM_ENGINES > GUC_WQ_SIZE);
> -	if (!client) {

Make sure to remove unnecessary earlier assignment:
	client = guc->execbuf_client;

> +	if (!guc->execbuf_client) {
>  		client = guc_client_alloc(dev_priv,
>  					  INTEL_INFO(dev_priv)->ring_mask,
>  					  GUC_CLIENT_PRIORITY_KMD_NORMAL,
> @@ -1167,15 +1169,29 @@ int i915_guc_submission_enable(struct  
> drm_i915_private *dev_priv)
>  		guc->execbuf_client = client;
>  	}
> +	if (!guc->preempt_client) {
> +		client = guc_client_alloc(dev_priv,
> +					  INTEL_INFO(dev_priv)->ring_mask,
> +					  GUC_CLIENT_PRIORITY_KMD_HIGH,
> +					  dev_priv->preempt_context);
> +		if (IS_ERR(client)) {
> +			DRM_ERROR("Failed to create GuC client for preemption!\n");
> +			err = PTR_ERR(client);
> +			goto err_free_clients;
> +		}
> +
> +		guc->preempt_client = client;
> +	}
> +

If we allow the clients to already be created here, then maybe we should
move their creation into a separate helper to make that explicit.
Then we will have a simpler error-handling sequence ;)
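Something along these lines, perhaps (a sketch only; the helper name and the
exact unwind are illustrative):

	static int guc_clients_create(struct drm_i915_private *dev_priv)
	{
		struct intel_guc *guc = &dev_priv->guc;
		struct i915_guc_client *client;

		client = guc_client_alloc(dev_priv,
					  INTEL_INFO(dev_priv)->ring_mask,
					  GUC_CLIENT_PRIORITY_KMD_NORMAL,
					  dev_priv->kernel_context);
		if (IS_ERR(client))
			return PTR_ERR(client);
		guc->execbuf_client = client;

		client = guc_client_alloc(dev_priv,
					  INTEL_INFO(dev_priv)->ring_mask,
					  GUC_CLIENT_PRIORITY_KMD_HIGH,
					  dev_priv->preempt_context);
		if (IS_ERR(client)) {
			guc_client_free(guc->execbuf_client);
			guc->execbuf_client = NULL;
			return PTR_ERR(client);
		}
		guc->preempt_client = client;

		return 0;
	}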

>  	err = intel_guc_sample_forcewake(guc);
>  	if (err)
> -		goto err_execbuf_client;
> +		goto err_free_clients;
> 	guc_reset_wq(client);
> 	err = guc_init_doorbell_hw(guc);
>  	if (err)
> -		goto err_execbuf_client;
> +		goto err_free_clients;
> 	/* Take over from manual control of ELSP (execlists) */
>  	guc_interrupts_capture(dev_priv);
> @@ -1187,7 +1203,11 @@ int i915_guc_submission_enable(struct  
> drm_i915_private *dev_priv)
> 	return 0;
> -err_execbuf_client:
> +err_free_clients:
> +	if (guc->preempt_client) {
> +		guc_client_free(guc->preempt_client);
> +		guc->preempt_client = NULL;
> +	}
>  	guc_client_free(guc->execbuf_client);
>  	guc->execbuf_client = NULL;
>  	return err;
> @@ -1202,6 +1222,10 @@ void i915_guc_submission_disable(struct  
> drm_i915_private *dev_priv)
>  	/* Revert back to manual ELSP submission */
>  	intel_engines_reset_default_submission(dev_priv);
> +	if (guc->preempt_client) {
> +		guc_client_free(guc->preempt_client);
> +		guc->preempt_client = NULL;
> +	}
>  	guc_client_free(guc->execbuf_client);
>  	guc->execbuf_client = NULL;
>  }
> diff --git a/drivers/gpu/drm/i915/intel_uc.h  
> b/drivers/gpu/drm/i915/intel_uc.h
> index 10e8f0ed02e4..c6c6f8513bbf 100644
> --- a/drivers/gpu/drm/i915/intel_uc.h
> +++ b/drivers/gpu/drm/i915/intel_uc.h
> @@ -107,6 +107,7 @@ struct intel_guc {
>  	struct ida stage_ids;
> 	struct i915_guc_client *execbuf_client;
> +	struct i915_guc_client *preempt_client;
> 	DECLARE_BITMAP(doorbell_bitmap, GUC_NUM_DOORBELLS);
>  	uint32_t db_cacheline;		/* Cyclic counter mod pagesize	*/

* Re: [PATCH 03/10] drm/i915/guc: Add preemption action to GuC firmware interface
  2017-10-05  9:13 ` [PATCH 03/10] drm/i915/guc: Add preemption action to GuC firmware interface Michał Winiarski
@ 2017-10-05 14:25   ` Jeff McGee
  0 siblings, 0 replies; 37+ messages in thread
From: Jeff McGee @ 2017-10-05 14:25 UTC (permalink / raw)
  To: Michał Winiarski; +Cc: intel-gfx

On Thu, Oct 05, 2017 at 11:13:42AM +0200, Michał Winiarski wrote:
> We're using GuC action to request preemption. However, after requesting
> preemption we need to wait for GuC to finish its own post-processing
> before we start submitting our requests. Firmware is using shared
> context to report its status.
> Let's update GuC firmware interface with those new definitions.
> 
> Cc: Jeff Mcgee <jeff.mcgee@intel.com>
> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
> Cc: Oscar Mateo <oscar.mateo@intel.com>
> Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>

Reviewed-by: Jeff McGee <jeff.mcgee@intel.com>

* Re: [PATCH 04/10] drm/i915/guc: Add a second client, to be used for preemption
  2017-10-05  9:13 ` [PATCH 04/10] drm/i915/guc: Add a second client, to be used for preemption Michał Winiarski
  2017-10-05  9:39   ` Chris Wilson
  2017-10-05 13:59   ` Michal Wajdeczko
@ 2017-10-05 14:58   ` Jeff McGee
  2017-10-05 15:22   ` Daniele Ceraolo Spurio
  3 siblings, 0 replies; 37+ messages in thread
From: Jeff McGee @ 2017-10-05 14:58 UTC (permalink / raw)
  To: Michał Winiarski; +Cc: Dave Gordon, intel-gfx

On Thu, Oct 05, 2017 at 11:13:43AM +0200, Michał Winiarski wrote:
> From: Dave Gordon <david.s.gordon@intel.com>
> 
> This second client is created with priority KMD_HIGH, and marked
> as preemptive. This will allow us to request preemption using GuC actions.
> 
> Cc: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Jeff Mcgee <jeff.mcgee@intel.com>
> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
> Cc: Oscar Mateo <oscar.mateo@intel.com>
> Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
> Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
> ---
>  drivers/gpu/drm/i915/i915_debugfs.c        |  4 ++++
>  drivers/gpu/drm/i915/i915_guc_submission.c | 32 ++++++++++++++++++++++++++----
>  drivers/gpu/drm/i915/intel_uc.h            |  1 +
>  3 files changed, 33 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
> index b4a6ac60e7c6..1a963c13cab8 100644
> --- a/drivers/gpu/drm/i915/i915_debugfs.c
> +++ b/drivers/gpu/drm/i915/i915_debugfs.c
> @@ -2488,6 +2488,10 @@ static int i915_guc_info(struct seq_file *m, void *data)
>  
>  	seq_printf(m, "\nGuC execbuf client @ %p:\n", guc->execbuf_client);
>  	i915_guc_client_info(m, dev_priv, guc->execbuf_client);
> +	if (guc->preempt_client) {
> +		seq_printf(m, "\nGuC preempt client\n");
> +		i915_guc_client_info(m, dev_priv, guc->preempt_client);
> +	}
>  
>  	i915_guc_log_info(m, dev_priv);
>  
> diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
> index 374501a67274..e8052e86426d 100644
> --- a/drivers/gpu/drm/i915/i915_guc_submission.c
> +++ b/drivers/gpu/drm/i915/i915_guc_submission.c
> @@ -331,6 +331,8 @@ static void guc_stage_desc_init(struct intel_guc *guc,
>  	memset(desc, 0, sizeof(*desc));
>  
>  	desc->attribute = GUC_STAGE_DESC_ATTR_ACTIVE | GUC_STAGE_DESC_ATTR_KERNEL;
> +	if (client->priority <= GUC_CLIENT_PRIORITY_HIGH)
> +		desc->attribute |= GUC_STAGE_DESC_ATTR_PREEMPT;
>  	desc->stage_id = client->stage_id;
>  	desc->priority = client->priority;
>  	desc->db_id = client->doorbell_id;
> @@ -1154,7 +1156,7 @@ int i915_guc_submission_enable(struct drm_i915_private *dev_priv)
>  		     sizeof(struct guc_wq_item) *
>  		     I915_NUM_ENGINES > GUC_WQ_SIZE);
>  
I think that client is still being initialized to guc->execbuf_client above.
If so, can that assignment be removed?

> -	if (!client) {
> +	if (!guc->execbuf_client) {
>  		client = guc_client_alloc(dev_priv,
>  					  INTEL_INFO(dev_priv)->ring_mask,
>  					  GUC_CLIENT_PRIORITY_KMD_NORMAL,
> @@ -1167,15 +1169,29 @@ int i915_guc_submission_enable(struct drm_i915_private *dev_priv)
>  		guc->execbuf_client = client;
>  	}
>  
> +	if (!guc->preempt_client) {
> +		client = guc_client_alloc(dev_priv,
> +					  INTEL_INFO(dev_priv)->ring_mask,
> +					  GUC_CLIENT_PRIORITY_KMD_HIGH,
> +					  dev_priv->preempt_context);
> +		if (IS_ERR(client)) {
> +			DRM_ERROR("Failed to create GuC client for preemption!\n");
> +			err = PTR_ERR(client);
> +			goto err_free_clients;
> +		}
> +
> +		guc->preempt_client = client;
> +	}
> +
>  	err = intel_guc_sample_forcewake(guc);
>  	if (err)
> -		goto err_execbuf_client;
> +		goto err_free_clients;
>  
>  	guc_reset_wq(client);
>  
>  	err = guc_init_doorbell_hw(guc);
>  	if (err)
> -		goto err_execbuf_client;
> +		goto err_free_clients;
>  
>  	/* Take over from manual control of ELSP (execlists) */
>  	guc_interrupts_capture(dev_priv);
> @@ -1187,7 +1203,11 @@ int i915_guc_submission_enable(struct drm_i915_private *dev_priv)
>  
>  	return 0;
>  
> -err_execbuf_client:
> +err_free_clients:
> +	if (guc->preempt_client) {
> +		guc_client_free(guc->preempt_client);
> +		guc->preempt_client = NULL;
> +	}
>  	guc_client_free(guc->execbuf_client);
>  	guc->execbuf_client = NULL;
>  	return err;
> @@ -1202,6 +1222,10 @@ void i915_guc_submission_disable(struct drm_i915_private *dev_priv)
>  	/* Revert back to manual ELSP submission */
>  	intel_engines_reset_default_submission(dev_priv);
>  
> +	if (guc->preempt_client) {
Do we need this check? It seems that execbuf_client and preempt_client are
both required for GuC submission to be enabled.
> +		guc_client_free(guc->preempt_client);
> +		guc->preempt_client = NULL;
> +	}
>  	guc_client_free(guc->execbuf_client);
>  	guc->execbuf_client = NULL;
>  }
> diff --git a/drivers/gpu/drm/i915/intel_uc.h b/drivers/gpu/drm/i915/intel_uc.h
> index 10e8f0ed02e4..c6c6f8513bbf 100644
> --- a/drivers/gpu/drm/i915/intel_uc.h
> +++ b/drivers/gpu/drm/i915/intel_uc.h
> @@ -107,6 +107,7 @@ struct intel_guc {
>  	struct ida stage_ids;
>  
>  	struct i915_guc_client *execbuf_client;
> +	struct i915_guc_client *preempt_client;
>  
>  	DECLARE_BITMAP(doorbell_bitmap, GUC_NUM_DOORBELLS);
>  	uint32_t db_cacheline;		/* Cyclic counter mod pagesize	*/
> -- 
> 2.13.5
> 

* Re: [PATCH 09/10] drm/i915/guc: Preemption! With GuC.
  2017-10-05  9:20   ` [PATCH 09/10] drm/i915/guc: Preemption! With GuC Michał Winiarski
  2017-10-05 10:04     ` Chris Wilson
  2017-10-05 10:31     ` Chris Wilson
@ 2017-10-05 15:00     ` Michal Wajdeczko
  2017-10-05 15:42       ` Michał Winiarski
  2 siblings, 1 reply; 37+ messages in thread
From: Michal Wajdeczko @ 2017-10-05 15:00 UTC (permalink / raw)
  To: intel-gfx, Michał Winiarski

On Thu, 05 Oct 2017 11:20:04 +0200, Michał Winiarski  
<michal.winiarski@intel.com> wrote:

> Pretty similar to what we have on execlists.
> We're reusing most of the GEM code, however, due to GuC quirks we need a
> couple of extra bits.
> Preemption is implemented as GuC action, and actions can be pretty slow.
> Because of that, we're using a mutex to serialize them. Since we're
> requesting preemption from the tasklet, the task of creating a workitem
> and wrapping it in GuC action is delegated to a worker.
>
> To distinguish that preemption has finished, we're using additional
> piece of HWSP, and since we're not getting context switch interrupts,
> we're also adding a user interrupt.
>
> The fact that our special preempt context has completed unfortunately
> doesn't mean that we're ready to submit new work. We also need to wait
> for GuC to finish its own processing.
>
> Cc: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Jeff Mcgee <jeff.mcgee@intel.com>
> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> Cc: Oscar Mateo <oscar.mateo@intel.com>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
> ---

<snip>

> diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c  
> b/drivers/gpu/drm/i915/i915_guc_submission.c
> index 0f36bba9fc9e..21412ea3dc0e 100644
> --- a/drivers/gpu/drm/i915/i915_guc_submission.c
> +++ b/drivers/gpu/drm/i915/i915_guc_submission.c
> @@ -498,6 +498,88 @@ static void guc_add_request(struct intel_guc *guc,
>  	spin_unlock(&client->wq_lock);
>  }
> +#define GUC_PREEMPT_FINISHED 0x1
> +#define GUC_PREEMPT_BREADCRUMB_DWORDS 0x8
> +static void inject_preempt_context(struct work_struct *work)
> +{
> +	struct intel_engine_cs *engine =
> +		container_of(work, typeof(*engine), guc_preempt_work);
> +	struct drm_i915_private *dev_priv = engine->i915;
> +	struct intel_guc *guc = &engine->i915->guc;

guc = &dev_priv->guc;

> +	struct i915_guc_client *client = guc->preempt_client;
> +	struct intel_context *ce = &client->owner->engine[engine->id];
> +	u32 ctx_desc = lower_32_bits(intel_lr_context_descriptor(client->owner,
> +								 engine));
> +	u32 *cs = ce->ring->vaddr + ce->ring->tail;
> +	u32 data[7];
> +
> +	if (engine->id == RCS) {
> +		cs = gen8_emit_ggtt_write_render(
> +				intel_hws_preempt_done_address(engine),
> +				GUC_PREEMPT_FINISHED, cs);
> +		*cs++ = MI_USER_INTERRUPT;
> +		*cs++ = MI_NOOP;
> +	} else {
> +		cs = gen8_emit_ggtt_write(
> +				intel_hws_preempt_done_address(engine),
> +				GUC_PREEMPT_FINISHED, cs);
> +		*cs++ = MI_USER_INTERRUPT;
> +		*cs++ = MI_NOOP;
> +		*cs++ = MI_NOOP;
> +		*cs++ = MI_NOOP;
> +	}

Maybe like this:

	cs = gen8_emit_ggtt_write_render(
		intel_hws_preempt_done_address(engine),
		GUC_PREEMPT_FINISHED, cs);
	*cs++ = MI_USER_INTERRUPT;
	*cs++ = MI_NOOP;
	if (engine->id != RCS) {
		*cs++ = MI_NOOP;
		*cs++ = MI_NOOP;
	}

> +
> +	ce->ring->tail += GUC_PREEMPT_BREADCRUMB_DWORDS * sizeof(u32);
> +	ce->ring->tail &= (ce->ring->size - 1);
> +
> +	if (i915_vma_is_map_and_fenceable(ce->ring->vma))
> +		POSTING_READ_FW(GUC_STATUS);
> +
> +	spin_lock_irq(&client->wq_lock);
> +	guc_wq_item_append(client, engine->guc_id, ctx_desc,
> +			   ce->ring->tail / sizeof(u64), 0);
> +	spin_unlock_irq(&client->wq_lock);
> +
> +	data[0] = INTEL_GUC_ACTION_REQUEST_PREEMPTION;
> +	data[1] = client->stage_id;
> +	data[2] = INTEL_GUC_PREEMPT_OPTION_IMMEDIATE |
> +		  INTEL_GUC_PREEMPT_OPTION_DROP_WORK_Q |
> +		  INTEL_GUC_PREEMPT_OPTION_DROP_SUBMIT_Q;
> +	data[3] = engine->guc_id;
> +	data[4] = guc->execbuf_client->priority;

Hmm, this is always GUC_CLIENT_PRIORITY_KMD_NORMAL, right ?

> +	data[5] = guc->execbuf_client->stage_id;
> +	data[6] = guc->shared_data_offset;
> +

Please pull the above action code into a new helper:

	int guc_send_preemption_request(guc, engine_guc_id)
	{
		data[] { ... };
		return intel_guc_send(guc, data, ARRAY_SIZE(data));
	}

> +	if (WARN_ON(intel_guc_send(guc, data, ARRAY_SIZE(data)))) {
> +		WRITE_ONCE(engine->execlists.preempt, false);
> +		tasklet_schedule(&engine->execlists.irq_tasklet);
> +	}
> +}
> +

<snip>

> @@ -177,6 +177,8 @@ static int guc_enable_communication(struct intel_guc  
> *guc)
> 	guc->shared_data_offset = guc_ggtt_offset(ctx->engine[RCS].state) +
>  		LRC_GUCSHR_PN * PAGE_SIZE;
> +	guc->shared_data_addr = (void*)ctx->engine[RCS].lrc_reg_state -
> +		LRC_STATE_PN * PAGE_SIZE;

Please pick a better place (see my comment in 1/10)

> 	if (HAS_GUC_CT(dev_priv))
>  		return intel_guc_enable_ct(guc);
> diff --git a/drivers/gpu/drm/i915/intel_uc.h  
> b/drivers/gpu/drm/i915/intel_uc.h
> index c6c6f8513bbf..8e2818e5741e 100644
> --- a/drivers/gpu/drm/i915/intel_uc.h
> +++ b/drivers/gpu/drm/i915/intel_uc.h
> @@ -124,6 +124,9 @@ struct intel_guc {
> 	/* Kernel context state ggtt offset, first page is shared with GuC */
>  	u32 shared_data_offset;
> +	void *shared_data_addr;
> +
> +	struct workqueue_struct *preempt_wq;

Please pick a better place (see my comment in 1/10)

> 	/* GuC's FW specific send function */
>  	int (*send)(struct intel_guc *guc, const u32 *data, u32 len);

* Re: [PATCH 04/10] drm/i915/guc: Add a second client, to be used for preemption
  2017-10-05  9:13 ` [PATCH 04/10] drm/i915/guc: Add a second client, to be used for preemption Michał Winiarski
                     ` (2 preceding siblings ...)
  2017-10-05 14:58   ` Jeff McGee
@ 2017-10-05 15:22   ` Daniele Ceraolo Spurio
  2017-10-05 16:00     ` Michał Winiarski
  3 siblings, 1 reply; 37+ messages in thread
From: Daniele Ceraolo Spurio @ 2017-10-05 15:22 UTC (permalink / raw)
  To: Michał Winiarski, intel-gfx; +Cc: Dave Gordon



On 05/10/17 02:13, Michał Winiarski wrote:
> From: Dave Gordon <david.s.gordon@intel.com>
> 
> This second client is created with priority KMD_HIGH, and marked
> as preemptive. This will allow us to request preemption using GuC actions.
> 
> Cc: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Jeff Mcgee <jeff.mcgee@intel.com>
> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
> Cc: Oscar Mateo <oscar.mateo@intel.com>
> Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
> Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
> ---
>   drivers/gpu/drm/i915/i915_debugfs.c        |  4 ++++
>   drivers/gpu/drm/i915/i915_guc_submission.c | 32 ++++++++++++++++++++++++++----
>   drivers/gpu/drm/i915/intel_uc.h            |  1 +
>   3 files changed, 33 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
> index b4a6ac60e7c6..1a963c13cab8 100644
> --- a/drivers/gpu/drm/i915/i915_debugfs.c
> +++ b/drivers/gpu/drm/i915/i915_debugfs.c
> @@ -2488,6 +2488,10 @@ static int i915_guc_info(struct seq_file *m, void *data)
>   
>   	seq_printf(m, "\nGuC execbuf client @ %p:\n", guc->execbuf_client);
>   	i915_guc_client_info(m, dev_priv, guc->execbuf_client);
> +	if (guc->preempt_client) {
> +		seq_printf(m, "\nGuC preempt client\n");
> +		i915_guc_client_info(m, dev_priv, guc->preempt_client);
> +	}
>   
>   	i915_guc_log_info(m, dev_priv);
>   
> diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
> index 374501a67274..e8052e86426d 100644
> --- a/drivers/gpu/drm/i915/i915_guc_submission.c
> +++ b/drivers/gpu/drm/i915/i915_guc_submission.c
> @@ -331,6 +331,8 @@ static void guc_stage_desc_init(struct intel_guc *guc,
>   	memset(desc, 0, sizeof(*desc));
>   
>   	desc->attribute = GUC_STAGE_DESC_ATTR_ACTIVE | GUC_STAGE_DESC_ATTR_KERNEL;
> +	if (client->priority <= GUC_CLIENT_PRIORITY_HIGH)
> +		desc->attribute |= GUC_STAGE_DESC_ATTR_PREEMPT;
>   	desc->stage_id = client->stage_id;
>   	desc->priority = client->priority;
>   	desc->db_id = client->doorbell_id;
> @@ -1154,7 +1156,7 @@ int i915_guc_submission_enable(struct drm_i915_private *dev_priv)
>   		     sizeof(struct guc_wq_item) *
>   		     I915_NUM_ENGINES > GUC_WQ_SIZE);
>   
> -	if (!client) {
> +	if (!guc->execbuf_client) {
>   		client = guc_client_alloc(dev_priv,
>   					  INTEL_INFO(dev_priv)->ring_mask,
>   					  GUC_CLIENT_PRIORITY_KMD_NORMAL,
> @@ -1167,15 +1169,29 @@ int i915_guc_submission_enable(struct drm_i915_private *dev_priv)
>   		guc->execbuf_client = client;
>   	}
>   
> +	if (!guc->preempt_client) {
> +		client = guc_client_alloc(dev_priv,
> +					  INTEL_INFO(dev_priv)->ring_mask,
> +					  GUC_CLIENT_PRIORITY_KMD_HIGH,
> +					  dev_priv->preempt_context);
> +		if (IS_ERR(client)) {
> +			DRM_ERROR("Failed to create GuC client for preemption!\n");
> +			err = PTR_ERR(client);
> +			goto err_free_clients;
> +		}
> +
> +		guc->preempt_client = client;
> +	}
> +
>   	err = intel_guc_sample_forcewake(guc);
>   	if (err)
> -		goto err_execbuf_client;
> +		goto err_free_clients;
>   
>   	guc_reset_wq(client);
>   
>   	err = guc_init_doorbell_hw(guc);
>   	if (err)
> -		goto err_execbuf_client;
> +		goto err_free_clients;
>   
>   	/* Take over from manual control of ELSP (execlists) */
>   	guc_interrupts_capture(dev_priv);
> @@ -1187,7 +1203,11 @@ int i915_guc_submission_enable(struct drm_i915_private *dev_priv)
>   
>   	return 0;
>   
> -err_execbuf_client:
> +err_free_clients:
> +	if (guc->preempt_client) {
> +		guc_client_free(guc->preempt_client);
> +		guc->preempt_client = NULL;
> +	}
>   	guc_client_free(guc->execbuf_client);
>   	guc->execbuf_client = NULL;
>   	return err;
> @@ -1202,6 +1222,10 @@ void i915_guc_submission_disable(struct drm_i915_private *dev_priv)
>   	/* Revert back to manual ELSP submission */
>   	intel_engines_reset_default_submission(dev_priv);
>   
> +	if (guc->preempt_client) {
> +		guc_client_free(guc->preempt_client);
> +		guc->preempt_client = NULL;
> +	}
>   	guc_client_free(guc->execbuf_client);
>   	guc->execbuf_client = NULL;
>   }
> diff --git a/drivers/gpu/drm/i915/intel_uc.h b/drivers/gpu/drm/i915/intel_uc.h
> index 10e8f0ed02e4..c6c6f8513bbf 100644
> --- a/drivers/gpu/drm/i915/intel_uc.h
> +++ b/drivers/gpu/drm/i915/intel_uc.h
> @@ -107,6 +107,7 @@ struct intel_guc {
>   	struct ida stage_ids;
>   
>   	struct i915_guc_client *execbuf_client;
> +	struct i915_guc_client *preempt_client;
>   
>   	DECLARE_BITMAP(doorbell_bitmap, GUC_NUM_DOORBELLS);
>   	uint32_t db_cacheline;		/* Cyclic counter mod pagesize	*/
> 


I think you also need to update guc_init_doorbell_hw() to handle the new 
client. The comment in there says:

/* Now for every client (and not only execbuf_client) make sure their
  * doorbells are known by the GuC */

Daniele

* Re: [PATCH 09/10] drm/i915/guc: Preemption! With GuC.
  2017-10-05 15:00     ` Michal Wajdeczko
@ 2017-10-05 15:42       ` Michał Winiarski
  0 siblings, 0 replies; 37+ messages in thread
From: Michał Winiarski @ 2017-10-05 15:42 UTC (permalink / raw)
  To: Michal Wajdeczko; +Cc: intel-gfx

On Thu, Oct 05, 2017 at 03:00:31PM +0000, Michal Wajdeczko wrote:
> On Thu, 05 Oct 2017 11:20:04 +0200, Michał Winiarski
> <michal.winiarski@intel.com> wrote:
> 
> > Pretty similar to what we have on execlists.
> > We're reusing most of the GEM code, however, due to GuC quirks we need a
> > couple of extra bits.
> > Preemption is implemented as GuC action, and actions can be pretty slow.
> > Because of that, we're using a mutex to serialize them. Since we're
> > requesting preemption from the tasklet, the task of creating a workitem
> > and wrapping it in GuC action is delegated to a worker.
> > 
> > To distinguish that preemption has finished, we're using additional
> > piece of HWSP, and since we're not getting context switch interrupts,
> > we're also adding a user interrupt.
> > 
> > The fact that our special preempt context has completed unfortunately
> > doesn't mean that we're ready to submit new work. We also need to wait
> > for GuC to finish its own processing.
> > 
> > Cc: Chris Wilson <chris@chris-wilson.co.uk>
> > Cc: Jeff Mcgee <jeff.mcgee@intel.com>
> > Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> > Cc: Oscar Mateo <oscar.mateo@intel.com>
> > Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> > Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
> > ---
> 
> <snip>
> 
> > diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c
> > b/drivers/gpu/drm/i915/i915_guc_submission.c
> > index 0f36bba9fc9e..21412ea3dc0e 100644
> > --- a/drivers/gpu/drm/i915/i915_guc_submission.c
> > +++ b/drivers/gpu/drm/i915/i915_guc_submission.c
> > @@ -498,6 +498,88 @@ static void guc_add_request(struct intel_guc *guc,
> >  	spin_unlock(&client->wq_lock);
> >  }
> > +#define GUC_PREEMPT_FINISHED 0x1
> > +#define GUC_PREEMPT_BREADCRUMB_DWORDS 0x8
> > +static void inject_preempt_context(struct work_struct *work)
> > +{
> > +	struct intel_engine_cs *engine =
> > +		container_of(work, typeof(*engine), guc_preempt_work);
> > +	struct drm_i915_private *dev_priv = engine->i915;
> > +	struct intel_guc *guc = &engine->i915->guc;
> 
> guc = &dev_priv->guc;
> 
> > +	struct i915_guc_client *client = guc->preempt_client;
> > +	struct intel_context *ce = &client->owner->engine[engine->id];
> > +	u32 ctx_desc = lower_32_bits(intel_lr_context_descriptor(client->owner,
> > +								 engine));
> > +	u32 *cs = ce->ring->vaddr + ce->ring->tail;
> > +	u32 data[7];
> > +
> > +	if (engine->id == RCS) {
> > +		cs = gen8_emit_ggtt_write_render(
> > +				intel_hws_preempt_done_address(engine),
> > +				GUC_PREEMPT_FINISHED, cs);
> > +		*cs++ = MI_USER_INTERRUPT;
> > +		*cs++ = MI_NOOP;
> > +	} else {
> > +		cs = gen8_emit_ggtt_write(
> > +				intel_hws_preempt_done_address(engine),
> > +				GUC_PREEMPT_FINISHED, cs);
> > +		*cs++ = MI_USER_INTERRUPT;
> > +		*cs++ = MI_NOOP;
> > +		*cs++ = MI_NOOP;
> > +		*cs++ = MI_NOOP;
> > +	}
> 
> Maybe like this:
> 
> 	cs = gen8_emit_ggtt_write_render(
> 		intel_hws_preempt_done_address(engine),
> 		GUC_PREEMPT_FINISHED, cs);
> 	*cs++ = MI_USER_INTERRUPT;
> 	*cs++ = MI_NOOP;
> 	if (engine->id != RCS) {
> 		*cs++ = MI_NOOP;
> 		*cs++ = MI_NOOP;
> 	}

Did you mean:

if (engine->id == RCS) {
	cs = gen8_emit_ggtt_write_render()
} else {
	cs = gen8_emit_ggtt_write()
	*cs++ = MI_NOOP;
	*cs++ = MI_NOOP;
}
*cs++ = MI_USER_INTERRUPT
*cs++ = MI_NOOP

?

[SNIP]

> > diff --git a/drivers/gpu/drm/i915/intel_uc.h
> > b/drivers/gpu/drm/i915/intel_uc.h
> > index c6c6f8513bbf..8e2818e5741e 100644
> > --- a/drivers/gpu/drm/i915/intel_uc.h
> > +++ b/drivers/gpu/drm/i915/intel_uc.h
> > @@ -124,6 +124,9 @@ struct intel_guc {
> > 	/* Kernel context state ggtt offset, first page is shared with GuC */
> >  	u32 shared_data_offset;
> > +	void *shared_data_addr;
> > +
> > +	struct workqueue_struct *preempt_wq;
> 
> Please pick better place (see my comment in 1/10)

I'll go with a helper for shared_data_* instead.

-Michał

> 
> > 	/* GuC's FW specific send function */
> >  	int (*send)(struct intel_guc *guc, const u32 *data, u32 len);

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 04/10] drm/i915/guc: Add a second client, to be used for preemption
  2017-10-05 15:22   ` Daniele Ceraolo Spurio
@ 2017-10-05 16:00     ` Michał Winiarski
  2017-10-05 16:35       ` Daniele Ceraolo Spurio
  0 siblings, 1 reply; 37+ messages in thread
From: Michał Winiarski @ 2017-10-05 16:00 UTC (permalink / raw)
  To: Daniele Ceraolo Spurio; +Cc: Dave Gordon, intel-gfx

On Thu, Oct 05, 2017 at 03:22:59PM +0000, Daniele Ceraolo Spurio wrote:
> 
> 
> On 05/10/17 02:13, Michał Winiarski wrote:
> > From: Dave Gordon <david.s.gordon@intel.com>
> > 
> > This second client is created with priority KMD_HIGH, and marked
> > as preemptive. This will allow us to request preemption using GuC actions.
> > 
> > Cc: Chris Wilson <chris@chris-wilson.co.uk>
> > Cc: Jeff Mcgee <jeff.mcgee@intel.com>
> > Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
> > Cc: Oscar Mateo <oscar.mateo@intel.com>
> > Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
> > Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
> > ---
> >   drivers/gpu/drm/i915/i915_debugfs.c        |  4 ++++
> >   drivers/gpu/drm/i915/i915_guc_submission.c | 32 ++++++++++++++++++++++++++----
> >   drivers/gpu/drm/i915/intel_uc.h            |  1 +
> >   3 files changed, 33 insertions(+), 4 deletions(-)

[SNIP]

> > diff --git a/drivers/gpu/drm/i915/intel_uc.h b/drivers/gpu/drm/i915/intel_uc.h
> > index 10e8f0ed02e4..c6c6f8513bbf 100644
> > --- a/drivers/gpu/drm/i915/intel_uc.h
> > +++ b/drivers/gpu/drm/i915/intel_uc.h
> > @@ -107,6 +107,7 @@ struct intel_guc {
> >   	struct ida stage_ids;
> >   	struct i915_guc_client *execbuf_client;
> > +	struct i915_guc_client *preempt_client;
> >   	DECLARE_BITMAP(doorbell_bitmap, GUC_NUM_DOORBELLS);
> >   	uint32_t db_cacheline;		/* Cyclic counter mod pagesize	*/
> > 
> 
> 
> I think you also need to update guc_init_doorbell_hw() to handle the new
> client. The comment in there says:
> 
> /* Now for every client (and not only execbuf_client) make sure their
>  * doorbells are known by the GuC */

But I don't want the doorbell for preempt_client...
I'm not submitting anything 'directly' using this client (just indirectly -
through preempt action).

Are you sure we need the doorbell?
Last time I tried to refactor GuC submission to use doorbell per engine, I got
the info that I should *avoid* allocating multiple doorbells.

-Michał

> 
> Daniele

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 04/10] drm/i915/guc: Add a second client, to be used for preemption
  2017-10-05 16:00     ` Michał Winiarski
@ 2017-10-05 16:35       ` Daniele Ceraolo Spurio
  2017-10-05 18:20         ` Michał Winiarski
  0 siblings, 1 reply; 37+ messages in thread
From: Daniele Ceraolo Spurio @ 2017-10-05 16:35 UTC (permalink / raw)
  To: Michał Winiarski; +Cc: Dave Gordon, intel-gfx



On 05/10/17 09:00, Michał Winiarski wrote:
> On Thu, Oct 05, 2017 at 03:22:59PM +0000, Daniele Ceraolo Spurio wrote:
>>
>>
>> On 05/10/17 02:13, Michał Winiarski wrote:
>>> From: Dave Gordon <david.s.gordon@intel.com>
>>>
>>> This second client is created with priority KMD_HIGH, and marked
>>> as preemptive. This will allow us to request preemption using GuC actions.
>>>
>>> Cc: Chris Wilson <chris@chris-wilson.co.uk>
>>> Cc: Jeff Mcgee <jeff.mcgee@intel.com>
>>> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
>>> Cc: Oscar Mateo <oscar.mateo@intel.com>
>>> Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
>>> Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
>>> ---
>>>    drivers/gpu/drm/i915/i915_debugfs.c        |  4 ++++
>>>    drivers/gpu/drm/i915/i915_guc_submission.c | 32 ++++++++++++++++++++++++++----
>>>    drivers/gpu/drm/i915/intel_uc.h            |  1 +
>>>    3 files changed, 33 insertions(+), 4 deletions(-)
> 
> [SNIP]
> 
>>> diff --git a/drivers/gpu/drm/i915/intel_uc.h b/drivers/gpu/drm/i915/intel_uc.h
>>> index 10e8f0ed02e4..c6c6f8513bbf 100644
>>> --- a/drivers/gpu/drm/i915/intel_uc.h
>>> +++ b/drivers/gpu/drm/i915/intel_uc.h
>>> @@ -107,6 +107,7 @@ struct intel_guc {
>>>    	struct ida stage_ids;
>>>    	struct i915_guc_client *execbuf_client;
>>> +	struct i915_guc_client *preempt_client;
>>>    	DECLARE_BITMAP(doorbell_bitmap, GUC_NUM_DOORBELLS);
>>>    	uint32_t db_cacheline;		/* Cyclic counter mod pagesize	*/
>>>
>>
>>
>> I think you also need to update guc_init_doorbell_hw() to handle the new
>> client. The comment in there says:
>>
>> /* Now for every client (and not only execbuf_client) make sure their
>>   * doorbells are known by the GuC */
> 
> But I don't want the doorbell for preempt_client...
> I'm not submitting anything 'directly' using this client (just indirectly -
> through preempt action).
> 
> Are you sure we need the doorbell?
> Last time I tried to refactor GuC submission to use doorbell per engine, I got
> the info that I should *avoid* allocating multiple doorbells.
> 
> -Michał
> 

Don't you still get a doorbell from guc_client_alloc()? or am I missing 
something?
One of the issues that guc_init_doorbell_hw() tries to solve is that 
after a GuC reset the doorbell HW is not cleaned up, so you can 
potentially have an active doorbell that GuC doesn't know about, which 
could lead to a failure if we try to release that doorbell later on. By 
doing the H2G we are sure that HW and GuC FW are in sync and I think 
this should apply to the doorbell in the preempt client as well.

The problem with having many doorbells is that they add a slight delay 
when waking the power well (not sure about the exact details) and AFAIK 
this happens even if you don't ring them. For this reason it is true 
that we want to keep the number of doorbells limited, but having 2 in 
total shouldn't be an issue. However, if we really don't need the 
doorbell then we should probably modify guc_client_alloc() to skip the 
doorbell allocation for the preempt client, but that could lead to a 
bigger rework to remove the assumption that each client has a doorbell.
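
Just to sketch the shape I have in mind (helper and parameter names here are
made up, the real doorbell setup path may look different):

/* Sketch only: let guc_client_alloc() skip the doorbell for clients that
 * never submit directly, e.g. the preempt client. "create_doorbell" stands
 * in for whatever the real setup path is called. */
static int guc_client_doorbell_init(struct intel_guc *guc,
				    struct i915_guc_client *client,
				    bool wants_doorbell)
{
	if (!wants_doorbell) {
		/* mark the client as doorbell-less so release skips it too */
		client->doorbell_id = GUC_DOORBELL_INVALID;
		return 0;
	}

	return create_doorbell(client);
}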

Daniele

>>
>> Daniele

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 01/10] drm/i915/guc: Precompute GuC shared data offset
  2017-10-05  9:33 ` [PATCH 01/10] drm/i915/guc: Precompute GuC shared data offset Chris Wilson
@ 2017-10-05 17:02   ` Daniele Ceraolo Spurio
  2017-10-06 12:35     ` Michał Winiarski
  0 siblings, 1 reply; 37+ messages in thread
From: Daniele Ceraolo Spurio @ 2017-10-05 17:02 UTC (permalink / raw)
  To: Chris Wilson, Michał Winiarski, intel-gfx



On 05/10/17 02:33, Chris Wilson wrote:
> Quoting Michał Winiarski (2017-10-05 10:13:40)
>> We're using first page of kernel context state to share data with GuC,
>> let's precompute the ggtt offset at GuC initialization time rather than
>> everytime we're using GuC actions.
> 
> So LRC_GUCSHR_PN is still 0. Plans for that to change?
> 

This is a requirement from the GuC side. GuC expects each context to 
have that extra page before the PPHWSP and it uses it to dump some 
per-lrc info, part of which is for internal use and part is info for the 
host (although we don't need/use it).
On certain events (reset/preempt/suspend etc) GuC will dump extra info 
and this is done in the page provided in the H2G. I think we use the one 
of the default ctx just for simplicity, but it should be possible to use 
a different one, possibly not attached to any lrc if needed, but I'm not 
sure if this has ever been tested.
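
For reference, the layout in question looks roughly like this (illustrative
only, using the page index names the driver already has):

/*
 * Context image layout as the GuC expects it:
 *
 *   page 0   LRC_GUCSHR_PN   GuC shared data / per-lrc dump area
 *   page 1   LRC_PPHWSP_PN   per-process HWSP
 *   page 2+  LRC_STATE_PN    register state context
 */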

-Daniele

> Atm, we should be changing one pointer deref for another...
> -Chris
>>

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 04/10] drm/i915/guc: Add a second client, to be used for preemption
  2017-10-05 16:35       ` Daniele Ceraolo Spurio
@ 2017-10-05 18:20         ` Michał Winiarski
  0 siblings, 0 replies; 37+ messages in thread
From: Michał Winiarski @ 2017-10-05 18:20 UTC (permalink / raw)
  To: Daniele Ceraolo Spurio; +Cc: Dave Gordon, intel-gfx

On Thu, Oct 05, 2017 at 04:35:54PM +0000, Daniele Ceraolo Spurio wrote:
> 
> 
> On 05/10/17 09:00, Michał Winiarski wrote:
> > On Thu, Oct 05, 2017 at 03:22:59PM +0000, Daniele Ceraolo Spurio wrote:
> > > 
> > > 
> > > On 05/10/17 02:13, Michał Winiarski wrote:
> > > > From: Dave Gordon <david.s.gordon@intel.com>
> > > > 
> > > > This second client is created with priority KMD_HIGH, and marked
> > > > as preemptive. This will allow us to request preemption using GuC actions.
> > > > 
> > > > Cc: Chris Wilson <chris@chris-wilson.co.uk>
> > > > Cc: Jeff Mcgee <jeff.mcgee@intel.com>
> > > > Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
> > > > Cc: Oscar Mateo <oscar.mateo@intel.com>
> > > > Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
> > > > Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
> > > > ---
> > > >    drivers/gpu/drm/i915/i915_debugfs.c        |  4 ++++
> > > >    drivers/gpu/drm/i915/i915_guc_submission.c | 32 ++++++++++++++++++++++++++----
> > > >    drivers/gpu/drm/i915/intel_uc.h            |  1 +
> > > >    3 files changed, 33 insertions(+), 4 deletions(-)
> > 
> > [SNIP]
> > 
> > > > diff --git a/drivers/gpu/drm/i915/intel_uc.h b/drivers/gpu/drm/i915/intel_uc.h
> > > > index 10e8f0ed02e4..c6c6f8513bbf 100644
> > > > --- a/drivers/gpu/drm/i915/intel_uc.h
> > > > +++ b/drivers/gpu/drm/i915/intel_uc.h
> > > > @@ -107,6 +107,7 @@ struct intel_guc {
> > > >    	struct ida stage_ids;
> > > >    	struct i915_guc_client *execbuf_client;
> > > > +	struct i915_guc_client *preempt_client;
> > > >    	DECLARE_BITMAP(doorbell_bitmap, GUC_NUM_DOORBELLS);
> > > >    	uint32_t db_cacheline;		/* Cyclic counter mod pagesize	*/
> > > > 
> > > 
> > > 
> > > I think you also need to update guc_init_doorbell_hw() to handle the new
> > > client. The comment in there says:
> > > 
> > > /* Now for every client (and not only execbuf_client) make sure their
> > >   * doorbells are known by the GuC */
> > 
> > But I don't want the doorbell for preempt_client...
> > I'm not submitting anything 'directly' using this client (just indirectly -
> > through preempt action).
> > 
> > Are you sure we need the doorbell?
> > Last time I tried to refactor GuC submission to use doorbell per engine, I got
> > the info that I should *avoid* allocating multiple doorbells.
> > 
> > -Michał
> > 
> 
> Don't you still get a doorbell from guc_client_alloc()? or am I missing
> something?

Right, but if we could get away without allocating one...

> One of the issues that guc_init_doorbell_hw() tries to solve is that after a
> GuC reset the doorbell HW is not cleaned up, so you can potentially have an
> active doorbell that GuC doesn't know about, which could lead to a failure
> if we try to release that doorbell later on. By doing the H2G we are sure
> that HW and GuC FW are in sync and I think this should apply to the doorbell
> in the preempt client as well.

Agreed.

> The problem with having many doorbells is that they add a slight delay when
> waking the power well (not sure about the exact details) and AFAIK this
> happens even if you don't ring them. For this reason it is true that we want
> to keep the number of doorbells limited, but having 2 in total shouldn't be
> an issue. However, if we really don't need the doorbell then we should
> probably modify guc_client_alloc() to skip the doorbell allocation for the
> preempt client, but that could lead to a bigger rework to remove the
> assumption that each client has a doorbell.

I guess I'll take a look at how much would need to be changed.
If it ends up being larger than I'm comfortable with for this series, I'll
fall back to modifying guc_init_doorbell_hw() and leave that for some other time.
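
Roughly this shape for the fallback, just to write down what I mean
("reinit_doorbell" is a placeholder name, and the real guc_init_doorbell_hw()
does extra HW cleanup around it):

static int guc_clients_doorbell_init(struct intel_guc *guc)
{
	struct i915_guc_client *clients[] = {
		guc->execbuf_client,
		guc->preempt_client,
	};
	int i, err = 0;

	/* make sure every allocated client, not just execbuf_client,
	 * has its doorbell re-registered with the GuC after a reset */
	for (i = 0; i < ARRAY_SIZE(clients) && !err; i++) {
		if (clients[i])
			err = reinit_doorbell(guc, clients[i]);
	}

	return err;
}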

Thanks!

-Michał

> 
> Daniele
> 
> > > 
> > > Daniele

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 09/10] drm/i915/guc: Preemption! With GuC.
  2017-10-05 10:31     ` Chris Wilson
@ 2017-10-05 18:38       ` Michał Winiarski
  0 siblings, 0 replies; 37+ messages in thread
From: Michał Winiarski @ 2017-10-05 18:38 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

On Thu, Oct 05, 2017 at 10:31:34AM +0000, Chris Wilson wrote:
> Quoting Michał Winiarski (2017-10-05 10:20:04)
> > @@ -625,10 +723,23 @@ static void i915_guc_irq_handler(unsigned long data)
> >         struct intel_engine_cs * const engine = (struct intel_engine_cs *)data;
> >         struct intel_engine_execlists * const execlists = &engine->execlists;
> >         struct execlist_port *port = execlists->port;
> > -       const struct execlist_port * const last_port =
> > -               &execlists->port[execlists->port_mask];
> >         struct drm_i915_gem_request *rq;
> >  
> > +       if (READ_ONCE(execlists->preempt) &&
> > +           intel_read_status_page(engine, I915_GEM_HWS_PREEMPT_INDEX) ==
> > +           GUC_PREEMPT_FINISHED) {
> > +               execlist_cancel_port_requests(&engine->execlists);
> > +
> > +               spin_lock_irq(&engine->timeline->lock);
> > +               unwind_incomplete_requests(engine);
> > +               spin_unlock_irq(&engine->timeline->lock);
> > +
> > +               wait_for_guc_preempt_report(engine);
> > +
> > +               WRITE_ONCE(execlists->preempt, false);
> > +               intel_write_status_page(engine, I915_GEM_HWS_PREEMPT_INDEX, 0);
> 
> Ok. So we injected the preempt context, observed it complete. What stops
> the GuC from switching back to the normal context and draining its wq?
> (Those requests we've just put back to be re-executed later.)
> -Chris

We're sending the GuC action with:
INTEL_GUC_PREEMPT_OPTION_DROP_WORK_Q | INTEL_GUC_PREEMPT_OPTION_DROP_SUBMIT_Q

which prevents GuC from doing exactly that.

I'll add a comment in inject_preempt_context().
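
Something along these lines (just the gist of the comment, not the full H2G
data the action carries):

	/*
	 * The preemption request asks GuC to drop both its work queue and
	 * its submit queue for this engine, so it won't switch back to the
	 * preempted context and drain the queued items behind our back;
	 * the requests we unwound will be re-executed by us later.
	 */
	u32 preempt_options = INTEL_GUC_PREEMPT_OPTION_DROP_WORK_Q |
			      INTEL_GUC_PREEMPT_OPTION_DROP_SUBMIT_Q;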

-Michał

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 05/10] drm/i915/guc: Split guc_wq_item_append
  2017-10-05  9:13 ` [PATCH 05/10] drm/i915/guc: Split guc_wq_item_append Michał Winiarski
  2017-10-05  9:45   ` Chris Wilson
@ 2017-10-05 20:37   ` Jeff McGee
  1 sibling, 0 replies; 37+ messages in thread
From: Jeff McGee @ 2017-10-05 20:37 UTC (permalink / raw)
  To: Michał Winiarski; +Cc: intel-gfx

On Thu, Oct 05, 2017 at 11:13:44AM +0200, Michał Winiarski wrote:
> We're using a special preempt context for HW to preempt into. We don't
> want to emit any requests there, but we still need to wrap this context
> into a valid GuC work item.
> Let's cleanup the functions operating on GuC work items.
> We can extract guc_request_add - responsible for adding GuC work item and
> ringing the doorbell, and guc_wq_item_append - used by the function
> above, not tied to the concept of gem request.
> 
> Cc: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Jeff Mcgee <jeff.mcgee@intel.com>
> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
> Cc: Oscar Mateo <oscar.mateo@intel.com>
> Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>

Reviewed-by: Jeff McGee <jeff.mcgee@intel.com>

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 07/10] drm/i915: Add information needed to track engine preempt state
  2017-10-05  9:20   ` [PATCH 07/10] drm/i915: Add information needed to track engine preempt state Michał Winiarski
  2017-10-05  9:49     ` Chris Wilson
@ 2017-10-05 20:42     ` Jeff McGee
  1 sibling, 0 replies; 37+ messages in thread
From: Jeff McGee @ 2017-10-05 20:42 UTC (permalink / raw)
  To: Michał Winiarski; +Cc: intel-gfx

On Thu, Oct 05, 2017 at 11:20:02AM +0200, Michał Winiarski wrote:
> We shouldn't inspect ELSP context status (or any other bits depending on
> specific submission backend) when using GuC submission.
> Let's use another piece of HWSP for preempt context, to write its bit of
> information, meaning that preemption has finished, and hardware is now
> idle.
> 
> Cc: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Jeff Mcgee <jeff.mcgee@intel.com>
> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
> Cc: Oscar Mateo <oscar.mateo@intel.com>
> Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>

Reviewed-by: Jeff McGee <jeff.mcgee@intel.com>

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 01/10] drm/i915/guc: Precompute GuC shared data offset
  2017-10-05 17:02   ` Daniele Ceraolo Spurio
@ 2017-10-06 12:35     ` Michał Winiarski
  2017-10-06 16:24       ` Daniele Ceraolo Spurio
  0 siblings, 1 reply; 37+ messages in thread
From: Michał Winiarski @ 2017-10-06 12:35 UTC (permalink / raw)
  To: Daniele Ceraolo Spurio; +Cc: intel-gfx

On Thu, Oct 05, 2017 at 05:02:39PM +0000, Daniele Ceraolo Spurio wrote:
> 
> 
> On 05/10/17 02:33, Chris Wilson wrote:
> > Quoting Michał Winiarski (2017-10-05 10:13:40)
> > > We're using first page of kernel context state to share data with GuC,
> > > let's precompute the ggtt offset at GuC initialization time rather than
> > > everytime we're using GuC actions.
> > 
> > So LRC_GUCSHR_PN is still 0. Plans for that to change?
> > 
> 
> This is a requirement from the GuC side. GuC expects each context to have
> that extra page before the PPHWSP and it uses it to dump some per-lrc info,
> part of which is for internal use and part is info for the host (although we
> don't need/use it).
> On certain events (reset/preempt/suspend etc) GuC will dump extra info and
> this is done in the page provided in the H2G. I think we use the one of the
> default ctx just for simplicity, but it should be possible to use a
> different one, possibly not attached to any lrc if needed, but I'm not sure
> if this has ever been tested.

Done that (allocated a separate object for the GuC shared data), and it seems to
work just fine on its own - except if we also try to remove the first page from
contexts, which seems to make GuC upset even though we're not using any actions.

We could still do that, though without removing the extra page we're just being
more wasteful. But perhaps it's cleaner that way? Having a separate object managed
in GuC code rather than reusing random places in the context state? Thoughts?
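
Roughly along these lines (a sketch from memory rather than the exact patch;
guc->shared_data as a vma pointer is a new field I'd have to add):

static int guc_shared_data_create(struct intel_guc *guc)
{
	struct i915_vma *vma;
	void *vaddr;

	/* dedicated page for the GuC shared data, instead of borrowing the
	 * first page of the kernel context image */
	vma = intel_guc_allocate_vma(guc, PAGE_SIZE);
	if (IS_ERR(vma))
		return PTR_ERR(vma);

	vaddr = i915_gem_object_pin_map(vma->obj, I915_MAP_WB);
	if (IS_ERR(vaddr)) {
		i915_vma_unpin_and_release(&vma);
		return PTR_ERR(vaddr);
	}

	guc->shared_data = vma;			/* new field */
	guc->shared_data_offset = guc_ggtt_offset(vma);
	guc->shared_data_addr = vaddr;

	return 0;
}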

-Michał

> 
> -Daniele
> 
> > Atm, we should be changing one pointer deref for another...
> > -Chris
> > > 

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 01/10] drm/i915/guc: Precompute GuC shared data offset
  2017-10-06 12:35     ` Michał Winiarski
@ 2017-10-06 16:24       ` Daniele Ceraolo Spurio
  0 siblings, 0 replies; 37+ messages in thread
From: Daniele Ceraolo Spurio @ 2017-10-06 16:24 UTC (permalink / raw)
  To: Michał Winiarski; +Cc: intel-gfx



On 06/10/17 05:35, Michał Winiarski wrote:
> On Thu, Oct 05, 2017 at 05:02:39PM +0000, Daniele Ceraolo Spurio wrote:
>>
>>
>> On 05/10/17 02:33, Chris Wilson wrote:
>>> Quoting Michał Winiarski (2017-10-05 10:13:40)
>>>> We're using first page of kernel context state to share data with GuC,
>>>> let's precompute the ggtt offset at GuC initialization time rather than
>>>> everytime we're using GuC actions.
>>>
>>> So LRC_GUCSHR_PN is still 0. Plans for that to change?
>>>
>>
>> This is a requirement from the GuC side. GuC expects each context to have
>> that extra page before the PPHWSP and it uses it to dump some per-lrc info,
>> part of which is for internal use and part is info for the host (although we
>> don't need/use it).
>> On certain events (reset/preempt/suspend etc) GuC will dump extra info and
>> this is done in the page provided in the H2G. I think we use the one of the
>> default ctx just for simplicity, but it should be possible to use a
>> different one, possibly not attached to any lrc if needed, but I'm not sure
>> if this has ever been tested.
> 
> Done that (allocated a separate object for the GuC shared data), and it seems to
> work just fine on its own - except if we also try to remove the first page from
> contexts, which seems to make GuC upset even though we're not using any actions.
> 

Yep, as I mentioned above, GuC dumps runtime info about each lrc it
handles in that page (e.g. whether an lrc has been submitted via proxy), so
it is probably going to either page-fault or write to the wrong memory
if that page is not allocated.

> We could still do that, though without removing the extra page we're just being
> more wasteful. But perhaps it's cleaner that way? Having a separate object managed
> in GuC code rather than reusing random places in the context state? Thoughts?
> 

This is similar to what we used to do by using the PPHWSP of the default 
ctx as the global HWSP. Personally I'd prefer to keep it separate as it 
feels cleaner and a single extra page shouldn't hurt us that much, but 
there was some push-back when I suggested the same for the HWSP.

Daniele

> -Michał
> 
>>
>> -Daniele
>>
>>> Atm, we should be changing one pointer deref for another...
>>> -Chris
>>>>

^ permalink raw reply	[flat|nested] 37+ messages in thread

* ✗ Fi.CI.BAT: failure for series starting with [01/10] drm/i915/guc: Precompute GuC shared data offset
  2017-10-05  9:13 [PATCH 01/10] drm/i915/guc: Precompute GuC shared data offset Michał Winiarski
                   ` (6 preceding siblings ...)
  2017-10-05 11:30 ` Michal Wajdeczko
@ 2017-10-26 21:22 ` Patchwork
  7 siblings, 0 replies; 37+ messages in thread
From: Patchwork @ 2017-10-26 21:22 UTC (permalink / raw)
  To: Michał Winiarski; +Cc: intel-gfx

== Series Details ==

Series: series starting with [01/10] drm/i915/guc: Precompute GuC shared data offset
URL   : https://patchwork.freedesktop.org/series/31412/
State : failure

== Summary ==

Series 31412 revision 1 was fully merged or fully failed: no git log


^ permalink raw reply	[flat|nested] 37+ messages in thread

end of thread

Thread overview: 37+ messages
2017-10-05  9:13 [PATCH 01/10] drm/i915/guc: Precompute GuC shared data offset Michał Winiarski
2017-10-05  9:13 ` [PATCH 02/10] drm/i915/guc: Initialize GuC before restarting engines Michał Winiarski
2017-10-05  9:36   ` Chris Wilson
2017-10-05 11:55     ` Michał Winiarski
2017-10-05  9:13 ` [PATCH 03/10] drm/i915/guc: Add preemption action to GuC firmware interface Michał Winiarski
2017-10-05 14:25   ` Jeff McGee
2017-10-05  9:13 ` [PATCH 04/10] drm/i915/guc: Add a second client, to be used for preemption Michał Winiarski
2017-10-05  9:39   ` Chris Wilson
2017-10-05 13:59   ` Michal Wajdeczko
2017-10-05 14:58   ` Jeff McGee
2017-10-05 15:22   ` Daniele Ceraolo Spurio
2017-10-05 16:00     ` Michał Winiarski
2017-10-05 16:35       ` Daniele Ceraolo Spurio
2017-10-05 18:20         ` Michał Winiarski
2017-10-05  9:13 ` [PATCH 05/10] drm/i915/guc: Split guc_wq_item_append Michał Winiarski
2017-10-05  9:45   ` Chris Wilson
2017-10-05 20:37   ` Jeff McGee
2017-10-05  9:20 ` [PATCH 06/10] drm/i915: Extract "emit write" part of emit breadcrumb functions Michał Winiarski
2017-10-05  9:20   ` [PATCH 07/10] drm/i915: Add information needed to track engine preempt state Michał Winiarski
2017-10-05  9:49     ` Chris Wilson
2017-10-05 20:42     ` Jeff McGee
2017-10-05  9:20   ` [PATCH 08/10] drm/i915/guc: Keep request->priority for its lifetime Michał Winiarski
2017-10-05  9:50     ` Chris Wilson
2017-10-05  9:20   ` [PATCH 09/10] drm/i915/guc: Preemption! With GuC Michał Winiarski
2017-10-05 10:04     ` Chris Wilson
2017-10-05 10:31     ` Chris Wilson
2017-10-05 18:38       ` Michał Winiarski
2017-10-05 15:00     ` Michal Wajdeczko
2017-10-05 15:42       ` Michał Winiarski
2017-10-05  9:20   ` [PATCH 10/10] HAX Enable GuC Submission for CI Michał Winiarski
2017-10-05  9:47   ` [PATCH 06/10] drm/i915: Extract "emit write" part of emit breadcrumb functions Chris Wilson
2017-10-05  9:33 ` [PATCH 01/10] drm/i915/guc: Precompute GuC shared data offset Chris Wilson
2017-10-05 17:02   ` Daniele Ceraolo Spurio
2017-10-06 12:35     ` Michał Winiarski
2017-10-06 16:24       ` Daniele Ceraolo Spurio
2017-10-05 11:30 ` Michal Wajdeczko
2017-10-26 21:22 ` ✗ Fi.CI.BAT: failure for series starting with [01/10] " Patchwork
