* [PATCH 00/12] Preemption with GuC, fourth try
@ 2017-10-25 20:00 Michał Winiarski
  2017-10-25 20:00 ` [PATCH v2 01/12] drm/i915/guc: Do not use 0 for GuC doorbell cookie Michał Winiarski
                   ` (19 more replies)
  0 siblings, 20 replies; 35+ messages in thread
From: Michał Winiarski @ 2017-10-25 20:00 UTC (permalink / raw)
  To: intel-gfx

No major changes from the previous iteration.
Dropped the workaround for the missing interrupt (which turned out to be
self-inflicted and is now properly fixed by Chris), and addressed the review
comments.

Dave Gordon (1):
  drm/i915/guc: Add a second client, to be used for preemption

Michał Winiarski (11):
  drm/i915/guc: Do not use 0 for GuC doorbell cookie
  drm/i915/guc: Extract GuC stage desc pool creation into a helper
  drm/i915/guc: Allocate separate shared data object for GuC
    communication
  drm/i915/guc: Add preemption action to GuC firmware interface
  drm/i915/guc: Split guc_wq_item_append
  drm/i915: Extract "emit write" part of emit breadcrumb functions
  drm/i915: Add information needed to track engine preempt state
  drm/i915/guc: Keep request->priority for its lifetime
  drm/i915: Rename helpers used for unwinding, use macro for can_preempt
  drm/i915/guc: Preemption! With GuC
  HAX Enable GuC Submission for CI

 drivers/gpu/drm/i915/i915_debugfs.c        |   2 +
 drivers/gpu/drm/i915/i915_drv.c            |   5 +-
 drivers/gpu/drm/i915/i915_drv.h            |   2 +
 drivers/gpu/drm/i915/i915_gem.c            |  10 +
 drivers/gpu/drm/i915/i915_gem_gtt.c        |   8 +-
 drivers/gpu/drm/i915/i915_guc_submission.c | 480 +++++++++++++++++++++++------
 drivers/gpu/drm/i915/i915_params.h         |   4 +-
 drivers/gpu/drm/i915/intel_engine_cs.c     |   6 +-
 drivers/gpu/drm/i915/intel_guc.c           |  14 +-
 drivers/gpu/drm/i915/intel_guc.h           |  11 +
 drivers/gpu/drm/i915/intel_guc_fwif.h      |  39 +++
 drivers/gpu/drm/i915/intel_lrc.c           |  65 ++--
 drivers/gpu/drm/i915/intel_lrc.h           |   1 -
 drivers/gpu/drm/i915/intel_ringbuffer.h    |  51 +++
 14 files changed, 534 insertions(+), 164 deletions(-)

-- 
2.13.6

* [PATCH v2 01/12] drm/i915/guc: Do not use 0 for GuC doorbell cookie
  2017-10-25 20:00 [PATCH 00/12] Preemption with GuC, fourth try Michał Winiarski
@ 2017-10-25 20:00 ` Michał Winiarski
  2017-10-25 20:00 ` [PATCH 02/12] drm/i915/guc: Extract GuC stage desc pool creation into a helper Michał Winiarski
                   ` (18 subsequent siblings)
  19 siblings, 0 replies; 35+ messages in thread
From: Michał Winiarski @ 2017-10-25 20:00 UTC (permalink / raw)
  To: intel-gfx

Apparently, this value is reserved and may be interpreted as changing
doorbell ownership. Even though we're not observing any side effects
now, let's skip over it to be consistent with the spec.
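
As an aside, a minimal userspace sketch (illustrative only, not the driver
code) of an increment that can never land on the reserved value 0, matching
the xchg() update in the hunk below:

  #include <stdint.h>
  #include <stdio.h>

  /* Bump a doorbell-style cookie, skipping the reserved value 0 on wrap. */
  static uint32_t next_cookie(uint32_t cookie)
  {
          uint32_t next = cookie + 1;

          return next ? next : cookie + 2;
  }

  int main(void)
  {
          printf("%u\n", next_cookie(5));          /* 6 */
          printf("%u\n", next_cookie(0xffffffff)); /* 1, never 0 */
          return 0;
  }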

v2: Apply checkpatch (Sagar)

Suggested-by: Sagar Arun Kamble <sagar.a.kamble@intel.com>
Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Sagar Arun Kamble <sagar.a.kamble@intel.com>
Reviewed-by: Sagar Arun Kamble <sagar.a.kamble@intel.com>
---
 drivers/gpu/drm/i915/i915_guc_submission.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
index 07289a4b5bfa..0b61bf18c3f4 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -475,9 +475,12 @@ static void guc_ring_doorbell(struct i915_guc_client *client)
 	/* pointer of current doorbell cacheline */
 	db = __get_doorbell(client);
 
-	/* we're not expecting the doorbell cookie to change behind our back */
+	/*
+	 * We're not expecting the doorbell cookie to change behind our back,
+	 * we also need to treat 0 as a reserved value.
+	 */
 	cookie = READ_ONCE(db->cookie);
-	WARN_ON_ONCE(xchg(&db->cookie, cookie + 1) != cookie);
+	WARN_ON_ONCE(xchg(&db->cookie, cookie + 1 ?: cookie + 2) != cookie);
 
 	/* XXX: doorbell was lost and need to acquire it again */
 	GEM_BUG_ON(db->db_status != GUC_DOORBELL_ENABLED);
-- 
2.13.6

* [PATCH 02/12] drm/i915/guc: Extract GuC stage desc pool creation into a helper
  2017-10-25 20:00 [PATCH 00/12] Preemption with GuC, fourth try Michał Winiarski
  2017-10-25 20:00 ` [PATCH v2 01/12] drm/i915/guc: Do not use 0 for GuC doorbell cookie Michał Winiarski
@ 2017-10-25 20:00 ` Michał Winiarski
  2017-10-25 20:00 ` [PATCH v2 03/12] drm/i915/guc: Allocate separate shared data object for GuC communication Michał Winiarski
                   ` (17 subsequent siblings)
  19 siblings, 0 replies; 35+ messages in thread
From: Michał Winiarski @ 2017-10-25 20:00 UTC (permalink / raw)
  To: intel-gfx

Since it's a two-step process, we get cleaner error handling in the
caller if we do the allocations in a helper.
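
For illustration, a rough standalone sketch (plain C with made-up names, not
the i915 code) of why pairing a create helper with a matching destroy helper
keeps the caller's unwind path down to one goto label per step:

  #include <errno.h>
  #include <stdlib.h>

  struct pool {
          void *backing;
          void *vaddr;
  };

  /* Both allocation steps live here, so a partial failure never leaks. */
  static int pool_create(struct pool *p)
  {
          p->backing = malloc(4096);
          if (!p->backing)
                  return -ENOMEM;

          p->vaddr = malloc(4096);
          if (!p->vaddr) {
                  free(p->backing);
                  return -ENOMEM;
          }

          return 0;
  }

  static void pool_destroy(struct pool *p)
  {
          free(p->vaddr);
          free(p->backing);
  }

  /* The caller needs only one unwind label per successfully created object. */
  static int submission_init(struct pool *desc_pool, struct pool *log)
  {
          int ret;

          ret = pool_create(desc_pool);
          if (ret)
                  return ret;

          ret = pool_create(log);
          if (ret)
                  goto err_desc_pool;

          return 0;

  err_desc_pool:
          pool_destroy(desc_pool);
          return ret;
  }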

Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Jeff McGee <jeff.mcgee@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Oscar Mateo <oscar.mateo@intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
---
 drivers/gpu/drm/i915/i915_guc_submission.c | 65 +++++++++++++++++-------------
 1 file changed, 38 insertions(+), 27 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
index 0b61bf18c3f4..e195bdee0473 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -311,6 +311,37 @@ static void guc_proc_desc_init(struct intel_guc *guc,
 	desc->priority = client->priority;
 }
 
+static int guc_stage_desc_pool_create(struct intel_guc *guc)
+{
+	struct i915_vma *vma;
+	void *vaddr;
+
+	vma = intel_guc_allocate_vma(guc,
+				     PAGE_ALIGN(sizeof(struct guc_stage_desc) *
+				     GUC_MAX_STAGE_DESCRIPTORS));
+	if (IS_ERR(vma))
+		return PTR_ERR(vma);
+
+	vaddr = i915_gem_object_pin_map(vma->obj, I915_MAP_WB);
+	if (IS_ERR(vaddr)) {
+		i915_vma_unpin_and_release(&vma);
+		return PTR_ERR(vaddr);
+	}
+
+	guc->stage_desc_pool = vma;
+	guc->stage_desc_pool_vaddr = vaddr;
+	ida_init(&guc->stage_ids);
+
+	return 0;
+}
+
+static void guc_stage_desc_pool_destroy(struct intel_guc *guc)
+{
+	ida_destroy(&guc->stage_ids);
+	i915_gem_object_unpin_map(guc->stage_desc_pool->obj);
+	i915_vma_unpin_and_release(&guc->stage_desc_pool);
+}
+
 /*
  * Initialise/clear the stage descriptor shared with the GuC firmware.
  *
@@ -953,47 +984,29 @@ static void guc_ads_destroy(struct intel_guc *guc)
 int i915_guc_submission_init(struct drm_i915_private *dev_priv)
 {
 	struct intel_guc *guc = &dev_priv->guc;
-	struct i915_vma *vma;
-	void *vaddr;
 	int ret;
 
 	if (guc->stage_desc_pool)
 		return 0;
 
-	vma = intel_guc_allocate_vma(guc,
-				PAGE_ALIGN(sizeof(struct guc_stage_desc) *
-					GUC_MAX_STAGE_DESCRIPTORS));
-	if (IS_ERR(vma))
-		return PTR_ERR(vma);
-
-	guc->stage_desc_pool = vma;
-
-	vaddr = i915_gem_object_pin_map(guc->stage_desc_pool->obj, I915_MAP_WB);
-	if (IS_ERR(vaddr)) {
-		ret = PTR_ERR(vaddr);
-		goto err_vma;
-	}
-
-	guc->stage_desc_pool_vaddr = vaddr;
+	ret = guc_stage_desc_pool_create(guc);
+	if (ret)
+		return ret;
 
 	ret = intel_guc_log_create(guc);
 	if (ret < 0)
-		goto err_vaddr;
+		goto err_stage_desc_pool;
 
 	ret = guc_ads_create(guc);
 	if (ret < 0)
 		goto err_log;
 
-	ida_init(&guc->stage_ids);
-
 	return 0;
 
 err_log:
 	intel_guc_log_destroy(guc);
-err_vaddr:
-	i915_gem_object_unpin_map(guc->stage_desc_pool->obj);
-err_vma:
-	i915_vma_unpin_and_release(&guc->stage_desc_pool);
+err_stage_desc_pool:
+	guc_stage_desc_pool_destroy(guc);
 	return ret;
 }
 
@@ -1001,11 +1014,9 @@ void i915_guc_submission_fini(struct drm_i915_private *dev_priv)
 {
 	struct intel_guc *guc = &dev_priv->guc;
 
-	ida_destroy(&guc->stage_ids);
 	guc_ads_destroy(guc);
 	intel_guc_log_destroy(guc);
-	i915_gem_object_unpin_map(guc->stage_desc_pool->obj);
-	i915_vma_unpin_and_release(&guc->stage_desc_pool);
+	guc_stage_desc_pool_destroy(guc);
 }
 
 static void guc_interrupts_capture(struct drm_i915_private *dev_priv)
-- 
2.13.6

* [PATCH v2 03/12] drm/i915/guc: Allocate separate shared data object for GuC communication
  2017-10-25 20:00 [PATCH 00/12] Preemption with GuC, fourth try Michał Winiarski
  2017-10-25 20:00 ` [PATCH v2 01/12] drm/i915/guc: Do not use 0 for GuC doorbell cookie Michał Winiarski
  2017-10-25 20:00 ` [PATCH 02/12] drm/i915/guc: Extract GuC stage desc pool creation into a helper Michał Winiarski
@ 2017-10-25 20:00 ` Michał Winiarski
  2017-10-25 21:04   ` Michel Thierry
  2017-10-25 20:00 ` [PATCH v2 04/12] drm/i915/guc: Add preemption action to GuC firmware interface Michał Winiarski
                   ` (16 subsequent siblings)
  19 siblings, 1 reply; 35+ messages in thread
From: Michał Winiarski @ 2017-10-25 20:00 UTC (permalink / raw)
  To: intel-gfx

We were using the first page of the kernel context render state for sharing
data with GuC. While that's justified by the fact that those pages are not
used (note, GuC still enforces this layout and refuses to work if we remove
the extra page in front), it's also confusing (why are we using this
particular page?). Let's allocate a separate object instead.

v2: Drop kernel_context from GuC suspend/resume action handlers (Michel)

Suggested-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Jeff McGee <jeff.mcgee@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Oscar Mateo <oscar.mateo@intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
---
 drivers/gpu/drm/i915/i915_guc_submission.c | 36 +++++++++++++++++++++++++++++-
 drivers/gpu/drm/i915/intel_guc.c           | 14 ++----------
 drivers/gpu/drm/i915/intel_guc.h           |  2 ++
 3 files changed, 39 insertions(+), 13 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
index e195bdee0473..d1a5613da24c 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -437,6 +437,33 @@ static void guc_stage_desc_fini(struct intel_guc *guc,
 	memset(desc, 0, sizeof(*desc));
 }
 
+static int guc_shared_data_create(struct intel_guc *guc)
+{
+	struct i915_vma *vma;
+	void *vaddr;
+
+	vma = intel_guc_allocate_vma(guc, PAGE_SIZE);
+	if (IS_ERR(vma))
+		return PTR_ERR(vma);
+
+	vaddr = i915_gem_object_pin_map(vma->obj, I915_MAP_WB);
+	if (IS_ERR(vaddr)) {
+		i915_vma_unpin_and_release(&vma);
+		return PTR_ERR(vaddr);
+	}
+
+	guc->shared_data = vma;
+	guc->shared_data_vaddr = vaddr;
+
+	return 0;
+}
+
+static void guc_shared_data_destroy(struct intel_guc *guc)
+{
+	i915_gem_object_unpin_map(guc->shared_data->obj);
+	i915_vma_unpin_and_release(&guc->shared_data);
+}
+
 /* Construct a Work Item and append it to the GuC's Work Queue */
 static void guc_wq_item_append(struct i915_guc_client *client,
 			       struct drm_i915_gem_request *rq)
@@ -993,9 +1020,13 @@ int i915_guc_submission_init(struct drm_i915_private *dev_priv)
 	if (ret)
 		return ret;
 
+	ret = guc_shared_data_create(guc);
+	if (ret)
+		goto err_stage_desc_pool;
+
 	ret = intel_guc_log_create(guc);
 	if (ret < 0)
-		goto err_stage_desc_pool;
+		goto err_shared_data;
 
 	ret = guc_ads_create(guc);
 	if (ret < 0)
@@ -1005,6 +1036,8 @@ int i915_guc_submission_init(struct drm_i915_private *dev_priv)
 
 err_log:
 	intel_guc_log_destroy(guc);
+err_shared_data:
+	guc_shared_data_destroy(guc);
 err_stage_desc_pool:
 	guc_stage_desc_pool_destroy(guc);
 	return ret;
@@ -1016,6 +1049,7 @@ void i915_guc_submission_fini(struct drm_i915_private *dev_priv)
 
 	guc_ads_destroy(guc);
 	intel_guc_log_destroy(guc);
+	guc_shared_data_destroy(guc);
 	guc_stage_desc_pool_destroy(guc);
 }
 
diff --git a/drivers/gpu/drm/i915/intel_guc.c b/drivers/gpu/drm/i915/intel_guc.c
index 10037c0fdf95..f74d50fdaeb0 100644
--- a/drivers/gpu/drm/i915/intel_guc.c
+++ b/drivers/gpu/drm/i915/intel_guc.c
@@ -268,7 +268,6 @@ int intel_guc_auth_huc(struct intel_guc *guc, u32 rsa_offset)
 int intel_guc_suspend(struct drm_i915_private *dev_priv)
 {
 	struct intel_guc *guc = &dev_priv->guc;
-	struct i915_gem_context *ctx;
 	u32 data[3];
 
 	if (guc->fw.load_status != INTEL_UC_FIRMWARE_SUCCESS)
@@ -276,14 +275,10 @@ int intel_guc_suspend(struct drm_i915_private *dev_priv)
 
 	gen9_disable_guc_interrupts(dev_priv);
 
-	ctx = dev_priv->kernel_context;
-
 	data[0] = INTEL_GUC_ACTION_ENTER_S_STATE;
 	/* any value greater than GUC_POWER_D0 */
 	data[1] = GUC_POWER_D1;
-	/* first page is shared data with GuC */
-	data[2] = guc_ggtt_offset(ctx->engine[RCS].state) +
-		  LRC_GUCSHR_PN * PAGE_SIZE;
+	data[2] = guc_ggtt_offset(guc->shared_data);
 
 	return intel_guc_send(guc, data, ARRAY_SIZE(data));
 }
@@ -295,7 +290,6 @@ int intel_guc_suspend(struct drm_i915_private *dev_priv)
 int intel_guc_resume(struct drm_i915_private *dev_priv)
 {
 	struct intel_guc *guc = &dev_priv->guc;
-	struct i915_gem_context *ctx;
 	u32 data[3];
 
 	if (guc->fw.load_status != INTEL_UC_FIRMWARE_SUCCESS)
@@ -304,13 +298,9 @@ int intel_guc_resume(struct drm_i915_private *dev_priv)
 	if (i915_modparams.guc_log_level >= 0)
 		gen9_enable_guc_interrupts(dev_priv);
 
-	ctx = dev_priv->kernel_context;
-
 	data[0] = INTEL_GUC_ACTION_EXIT_S_STATE;
 	data[1] = GUC_POWER_D0;
-	/* first page is shared data with GuC */
-	data[2] = guc_ggtt_offset(ctx->engine[RCS].state) +
-		  LRC_GUCSHR_PN * PAGE_SIZE;
+	data[2] = guc_ggtt_offset(guc->shared_data);
 
 	return intel_guc_send(guc, data, ARRAY_SIZE(data));
 }
diff --git a/drivers/gpu/drm/i915/intel_guc.h b/drivers/gpu/drm/i915/intel_guc.h
index 418450b1ae27..aa1583167b0a 100644
--- a/drivers/gpu/drm/i915/intel_guc.h
+++ b/drivers/gpu/drm/i915/intel_guc.h
@@ -54,6 +54,8 @@ struct intel_guc {
 	struct i915_vma *stage_desc_pool;
 	void *stage_desc_pool_vaddr;
 	struct ida stage_ids;
+	struct i915_vma *shared_data;
+	void *shared_data_vaddr;
 
 	struct i915_guc_client *execbuf_client;
 
-- 
2.13.6

* [PATCH v2 04/12] drm/i915/guc: Add preemption action to GuC firmware interface
  2017-10-25 20:00 [PATCH 00/12] Preemption with GuC, fourth try Michał Winiarski
                   ` (2 preceding siblings ...)
  2017-10-25 20:00 ` [PATCH v2 03/12] drm/i915/guc: Allocate separate shared data object for GuC communication Michał Winiarski
@ 2017-10-25 20:00 ` Michał Winiarski
  2017-10-25 20:00 ` [PATCH v3 05/12] drm/i915/guc: Add a second client, to be used for preemption Michał Winiarski
                   ` (15 subsequent siblings)
  19 siblings, 0 replies; 35+ messages in thread
From: Michał Winiarski @ 2017-10-25 20:00 UTC (permalink / raw)
  To: intel-gfx

We're using a GuC action to request preemption. However, after requesting
preemption we need to wait for GuC to finish its own post-processing
before we start submitting our requests. The firmware uses a shared
context to report its status.
Let's update the GuC firmware interface with these new definitions.
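
For reference, a hedged kernel-style sketch of how a submission backend could
consume these definitions after sending INTEL_GUC_ACTION_REQUEST_PREEMPTION:
poll the per-engine report in guc_shared_ctx_data until the firmware reports
completion. The helper name and the busy-wait bound are assumptions and not
part of this patch:

  /* Illustrative only: wait for GuC to mark the preemption as complete. */
  static int wait_for_guc_preempt_report(struct guc_shared_ctx_data *data,
                                         u32 guc_engine_id)
  {
          struct guc_ctx_report *report =
                  &data->preempt_ctx_report[guc_engine_id];
          unsigned int tries = 1000;

          while (READ_ONCE(report->report_return_status) !=
                 INTEL_GUC_REPORT_STATUS_COMPLETE) {
                  if (!--tries)
                          return -ETIMEDOUT;
                  cpu_relax();
          }

          /* Reset the report so the slot can be reused for the next cycle. */
          WRITE_ONCE(report->report_return_status,
                     INTEL_GUC_REPORT_STATUS_UNKNOWN);

          return 0;
  }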

v2: Drop unused INTEL_GUC_PREEMPT_OPTION_IMMEDIATE

Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
Cc: Jeff McGee <jeff.mcgee@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Oscar Mateo <oscar.mateo@intel.com>
Reviewed-by: Jeff McGee <jeff.mcgee@intel.com>
---
 drivers/gpu/drm/i915/intel_guc_fwif.h | 39 +++++++++++++++++++++++++++++++++++
 1 file changed, 39 insertions(+)

diff --git a/drivers/gpu/drm/i915/intel_guc_fwif.h b/drivers/gpu/drm/i915/intel_guc_fwif.h
index 80c507435458..e24dbec2ead8 100644
--- a/drivers/gpu/drm/i915/intel_guc_fwif.h
+++ b/drivers/gpu/drm/i915/intel_guc_fwif.h
@@ -544,9 +544,36 @@ union guc_log_control {
 	u32 value;
 } __packed;
 
+struct guc_ctx_report {
+	u32 report_return_status;
+	u32 reserved1[64];
+	u32 affected_count;
+	u32 reserved2[2];
+} __packed;
+
+/* GuC Shared Context Data Struct */
+struct guc_shared_ctx_data {
+	u32 addr_of_last_preempted_data_low;
+	u32 addr_of_last_preempted_data_high;
+	u32 addr_of_last_preempted_data_high_tmp;
+	u32 padding;
+	u32 is_mapped_to_proxy;
+	u32 proxy_ctx_id;
+	u32 engine_reset_ctx_id;
+	u32 media_reset_count;
+	u32 reserved1[8];
+	u32 uk_last_ctx_switch_reason;
+	u32 was_reset;
+	u32 lrca_gpu_addr;
+	u64 execlist_ctx;
+	u32 reserved2[66];
+	struct guc_ctx_report preempt_ctx_report[GUC_MAX_ENGINES_NUM];
+} __packed;
+
 /* This Action will be programmed in C180 - SOFT_SCRATCH_O_REG */
 enum intel_guc_action {
 	INTEL_GUC_ACTION_DEFAULT = 0x0,
+	INTEL_GUC_ACTION_REQUEST_PREEMPTION = 0x2,
 	INTEL_GUC_ACTION_SAMPLE_FORCEWAKE = 0x6,
 	INTEL_GUC_ACTION_ALLOCATE_DOORBELL = 0x10,
 	INTEL_GUC_ACTION_DEALLOCATE_DOORBELL = 0x20,
@@ -562,6 +589,18 @@ enum intel_guc_action {
 	INTEL_GUC_ACTION_LIMIT
 };
 
+enum intel_guc_preempt_options {
+	INTEL_GUC_PREEMPT_OPTION_DROP_WORK_Q = 0x4,
+	INTEL_GUC_PREEMPT_OPTION_DROP_SUBMIT_Q = 0x8,
+};
+
+enum intel_guc_report_status {
+	INTEL_GUC_REPORT_STATUS_UNKNOWN = 0x0,
+	INTEL_GUC_REPORT_STATUS_ACKED = 0x1,
+	INTEL_GUC_REPORT_STATUS_ERROR = 0x2,
+	INTEL_GUC_REPORT_STATUS_COMPLETE = 0x4,
+};
+
 /*
  * The GuC sends its response to a command by overwriting the
  * command in SS0. The response is distinguishable from a command
-- 
2.13.6

* [PATCH v3 05/12] drm/i915/guc: Add a second client, to be used for preemption
  2017-10-25 20:00 [PATCH 00/12] Preemption with GuC, fourth try Michał Winiarski
                   ` (3 preceding siblings ...)
  2017-10-25 20:00 ` [PATCH v2 04/12] drm/i915/guc: Add preemption action to GuC firmware interface Michał Winiarski
@ 2017-10-25 20:00 ` Michał Winiarski
  2017-10-26 12:50   ` Michal Wajdeczko
  2017-10-26 13:20   ` [PATCH v4] " Michał Winiarski
  2017-10-25 20:00 ` [PATCH 06/12] drm/i915/guc: Split guc_wq_item_append Michał Winiarski
                   ` (14 subsequent siblings)
  19 siblings, 2 replies; 35+ messages in thread
From: Michał Winiarski @ 2017-10-25 20:00 UTC (permalink / raw)
  To: intel-gfx; +Cc: Dave Gordon

From: Dave Gordon <david.s.gordon@intel.com>

This second client is created with priority KMD_HIGH, and marked
as preemptive. This will allow us to request preemption using GuC actions.

v2: Extract clients creation into a helper, debugfs fixups. (Michał)
Recreate doorbell on init. (Daniele)
Move clients into an array.

v3: And move clients back from an array, to get rid of the enum (Michał)

Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Jeff McGee <jeff.mcgee@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Oscar Mateo <oscar.mateo@intel.com>
---
 drivers/gpu/drm/i915/i915_debugfs.c        |   2 +
 drivers/gpu/drm/i915/i915_guc_submission.c | 112 ++++++++++++++++++++---------
 drivers/gpu/drm/i915/intel_guc.h           |   1 +
 3 files changed, 82 insertions(+), 33 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index c65e381b85f3..6e178d75bb17 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -2484,6 +2484,8 @@ static int i915_guc_info(struct seq_file *m, void *data)
 
 	seq_printf(m, "\nGuC execbuf client @ %p:\n", guc->execbuf_client);
 	i915_guc_client_info(m, dev_priv, guc->execbuf_client);
+	seq_printf(m, "\nGuC preempt client @ %p:\n", guc->preempt_client);
+	i915_guc_client_info(m, dev_priv, guc->preempt_client);
 
 	i915_guc_log_info(m, dev_priv);
 
diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
index d1a5613da24c..1d3123e74bc4 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -33,10 +33,11 @@
  *
  * GuC client:
  * A i915_guc_client refers to a submission path through GuC. Currently, there
- * is only one of these (the execbuf_client) and this one is charged with all
- * submissions to the GuC. This struct is the owner of a doorbell, a process
- * descriptor and a workqueue (all of them inside a single gem object that
- * contains all required pages for these elements).
+ * are two clients. One of them (the execbuf_client) is charged with all
+ * submissions to the GuC, the other one (preempt_client) is responsible for
+ * preempting the execbuf_client. This struct is the owner of a doorbell, a
+ * process descriptor and a workqueue (all of them inside a single gem object
+ * that contains all required pages for these elements).
  *
  * GuC stage descriptor:
  * During initialization, the driver allocates a static pool of 1024 such
@@ -363,6 +364,8 @@ static void guc_stage_desc_init(struct intel_guc *guc,
 	memset(desc, 0, sizeof(*desc));
 
 	desc->attribute = GUC_STAGE_DESC_ATTR_ACTIVE | GUC_STAGE_DESC_ATTR_KERNEL;
+	if (client->priority <= GUC_CLIENT_PRIORITY_HIGH)
+		desc->attribute |= GUC_STAGE_DESC_ATTR_PREEMPT;
 	desc->stage_id = client->stage_id;
 	desc->priority = client->priority;
 	desc->db_id = client->doorbell_id;
@@ -763,14 +766,18 @@ static int guc_init_doorbell_hw(struct intel_guc *guc)
 
 	/* Now for every client (and not only execbuf_client) make sure their
 	 * doorbells are known by the GuC */
-	//for (client = client_list; client != NULL; client = client->next)
-	{
-		ret = __create_doorbell(client);
-		if (ret) {
-			DRM_ERROR("Couldn't recreate client %u doorbell: %d\n",
-				client->stage_id, ret);
-			return ret;
-		}
+	ret = __create_doorbell(guc->execbuf_client);
+	if (ret) {
+		DRM_ERROR("Couldn't recreate client %u doorbell: %d\n",
+			  guc->execbuf_client->stage_id, ret);
+		return ret;
+	}
+
+	ret = __create_doorbell(guc->preempt_client);
+	if (ret) {
+		DRM_ERROR("Couldn't recreate client %u doorbell: %d\n",
+			  guc->preempt_client->stage_id, ret);
+		return ret;
 	}
 
 	/* Read back & verify all (used & unused) doorbell registers */
@@ -895,6 +902,47 @@ static void guc_client_free(struct i915_guc_client *client)
 	kfree(client);
 }
 
+static int guc_clients_create(struct intel_guc *guc)
+{
+	struct drm_i915_private *dev_priv = guc_to_i915(guc);
+	struct i915_guc_client *client;
+
+	client = guc_client_alloc(dev_priv,
+				  INTEL_INFO(dev_priv)->ring_mask,
+				  GUC_CLIENT_PRIORITY_KMD_NORMAL,
+				  dev_priv->kernel_context);
+	if (IS_ERR(client)) {
+		DRM_ERROR("Failed to create GuC client for submission!\n");
+		return PTR_ERR(client);
+	}
+	guc->execbuf_client = client;
+
+	client = guc_client_alloc(dev_priv,
+				  INTEL_INFO(dev_priv)->ring_mask,
+				  GUC_CLIENT_PRIORITY_KMD_HIGH,
+				  dev_priv->preempt_context);
+	if (IS_ERR(client)) {
+		DRM_ERROR("Failed to create GuC client for preemption!\n");
+		guc_client_free(guc->execbuf_client);
+		guc->execbuf_client = NULL;
+		return PTR_ERR(client);
+	}
+	guc->preempt_client = client;
+
+	return 0;
+}
+
+static void guc_clients_destroy(struct intel_guc *guc)
+{
+	struct i915_guc_client *client;
+
+	client = fetch_and_zero(&guc->execbuf_client);
+	guc_client_free(client);
+
+	client = fetch_and_zero(&guc->preempt_client);
+	guc_client_free(client);
+}
+
 static void guc_policy_init(struct guc_policy *policy)
 {
 	policy->execution_quantum = POLICY_DEFAULT_EXECUTION_QUANTUM_US;
@@ -1134,7 +1182,6 @@ static void i915_guc_submission_unpark(struct intel_engine_cs *engine)
 int i915_guc_submission_enable(struct drm_i915_private *dev_priv)
 {
 	struct intel_guc *guc = &dev_priv->guc;
-	struct i915_guc_client *client = guc->execbuf_client;
 	struct intel_engine_cs *engine;
 	enum intel_engine_id id;
 	int err;
@@ -1152,28 +1199,29 @@ int i915_guc_submission_enable(struct drm_i915_private *dev_priv)
 		     sizeof(struct guc_wq_item) *
 		     I915_NUM_ENGINES > GUC_WQ_SIZE);
 
-	if (!client) {
-		client = guc_client_alloc(dev_priv,
-					  INTEL_INFO(dev_priv)->ring_mask,
-					  GUC_CLIENT_PRIORITY_KMD_NORMAL,
-					  dev_priv->kernel_context);
-		if (IS_ERR(client)) {
-			DRM_ERROR("Failed to create GuC client for execbuf!\n");
-			return PTR_ERR(client);
-		}
-
-		guc->execbuf_client = client;
+	/*
+	 * We're being called on both module initialization and on reset,
+	 * until this flow is changed, we're using regular client presence to
+	 * determine which case we are in, and whether we should allocate new
+	 * clients or just reset their workqueues.
+	 */
+	if (!guc->execbuf_client) {
+		GEM_BUG_ON(guc->preempt_client);
+		err = guc_clients_create(guc);
+		if (err)
+			return err;
+	} else {
+		guc_reset_wq(guc->execbuf_client);
+		guc_reset_wq(guc->preempt_client);
 	}
 
 	err = intel_guc_sample_forcewake(guc);
 	if (err)
-		goto err_execbuf_client;
-
-	guc_reset_wq(client);
+		goto err_free_clients;
 
 	err = guc_init_doorbell_hw(guc);
 	if (err)
-		goto err_execbuf_client;
+		goto err_free_clients;
 
 	/* Take over from manual control of ELSP (execlists) */
 	guc_interrupts_capture(dev_priv);
@@ -1187,9 +1235,8 @@ int i915_guc_submission_enable(struct drm_i915_private *dev_priv)
 
 	return 0;
 
-err_execbuf_client:
-	guc_client_free(guc->execbuf_client);
-	guc->execbuf_client = NULL;
+err_free_clients:
+	guc_clients_destroy(guc);
 	return err;
 }
 
@@ -1204,6 +1251,5 @@ void i915_guc_submission_disable(struct drm_i915_private *dev_priv)
 	/* Revert back to manual ELSP submission */
 	intel_engines_reset_default_submission(dev_priv);
 
-	guc_client_free(guc->execbuf_client);
-	guc->execbuf_client = NULL;
+	guc_clients_destroy(guc);
 }
diff --git a/drivers/gpu/drm/i915/intel_guc.h b/drivers/gpu/drm/i915/intel_guc.h
index aa1583167b0a..6ca55c5bed3c 100644
--- a/drivers/gpu/drm/i915/intel_guc.h
+++ b/drivers/gpu/drm/i915/intel_guc.h
@@ -58,6 +58,7 @@ struct intel_guc {
 	void *shared_data_vaddr;
 
 	struct i915_guc_client *execbuf_client;
+	struct i915_guc_client *preempt_client;
 
 	DECLARE_BITMAP(doorbell_bitmap, GUC_NUM_DOORBELLS);
 	/* Cyclic counter mod pagesize	*/
-- 
2.13.6

* [PATCH 06/12] drm/i915/guc: Split guc_wq_item_append
  2017-10-25 20:00 [PATCH 00/12] Preemption with GuC, fourth try Michał Winiarski
                   ` (4 preceding siblings ...)
  2017-10-25 20:00 ` [PATCH v3 05/12] drm/i915/guc: Add a second client, to be used for preemption Michał Winiarski
@ 2017-10-25 20:00 ` Michał Winiarski
  2017-10-25 20:00 ` [PATCH v2 07/12] drm/i915: Extract "emit write" part of emit breadcrumb functions Michał Winiarski
                   ` (13 subsequent siblings)
  19 siblings, 0 replies; 35+ messages in thread
From: Michał Winiarski @ 2017-10-25 20:00 UTC (permalink / raw)
  To: intel-gfx

We're using a special preempt context for HW to preempt into. We don't
want to emit any requests there, but we still need to wrap this context
into a valid GuC work item.
Let's clean up the functions operating on GuC work items.
We can extract guc_add_request - responsible for adding a GuC work item and
ringing the doorbell, and guc_wq_item_append - used by the function
above and no longer tied to the concept of a GEM request.

Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Jeff McGee <jeff.mcgee@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Oscar Mateo <oscar.mateo@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Jeff McGee <jeff.mcgee@intel.com>
---
 drivers/gpu/drm/i915/i915_guc_submission.c | 55 ++++++++++++++++--------------
 1 file changed, 30 insertions(+), 25 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
index 1d3123e74bc4..12f1c6a0afce 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -71,7 +71,7 @@
  * WQ_TYPE_INORDER is needed to support legacy submission via GuC, which
  * represents in-order queue. The kernel driver packs ring tail pointer and an
  * ELSP context descriptor dword into Work Item.
- * See guc_wq_item_append()
+ * See guc_add_request()
  *
  * ADS:
  * The Additional Data Struct (ADS) has pointers for different buffers used by
@@ -390,7 +390,7 @@ static void guc_stage_desc_init(struct intel_guc *guc,
 		 * submission or, in other words, not using a direct submission
 		 * model) the KMD's LRCA is not used for any work submission.
 		 * Instead, the GuC uses the LRCA of the user mode context (see
-		 * guc_wq_item_append below).
+		 * guc_add_request below).
 		 */
 		lrc->context_desc = lower_32_bits(ce->lrc_desc);
 
@@ -469,22 +469,18 @@ static void guc_shared_data_destroy(struct intel_guc *guc)
 
 /* Construct a Work Item and append it to the GuC's Work Queue */
 static void guc_wq_item_append(struct i915_guc_client *client,
-			       struct drm_i915_gem_request *rq)
+			       u32 target_engine, u32 context_desc,
+			       u32 ring_tail, u32 fence_id)
 {
 	/* wqi_len is in DWords, and does not include the one-word header */
 	const size_t wqi_size = sizeof(struct guc_wq_item);
 	const u32 wqi_len = wqi_size / sizeof(u32) - 1;
-	struct intel_engine_cs *engine = rq->engine;
-	struct i915_gem_context *ctx = rq->ctx;
 	struct guc_process_desc *desc = __get_process_desc(client);
 	struct guc_wq_item *wqi;
-	u32 ring_tail, wq_off;
+	u32 wq_off;
 
 	lockdep_assert_held(&client->wq_lock);
 
-	ring_tail = intel_ring_set_tail(rq->ring, rq->tail) / sizeof(u64);
-	GEM_BUG_ON(ring_tail > WQ_RING_TAIL_MAX);
-
 	/* For now workqueue item is 4 DWs; workqueue buffer is 2 pages. So we
 	 * should not have the case where structure wqi is across page, neither
 	 * wrapped to the beginning. This simplifies the implementation below.
@@ -506,15 +502,14 @@ static void guc_wq_item_append(struct i915_guc_client *client,
 	/* Now fill in the 4-word work queue item */
 	wqi->header = WQ_TYPE_INORDER |
 		      (wqi_len << WQ_LEN_SHIFT) |
-		      (engine->guc_id << WQ_TARGET_SHIFT) |
+		      (target_engine << WQ_TARGET_SHIFT) |
 		      WQ_NO_WCFLUSH_WAIT;
-
-	wqi->context_desc = lower_32_bits(intel_lr_context_descriptor(ctx, engine));
-
+	wqi->context_desc = context_desc;
 	wqi->submit_element_info = ring_tail << WQ_RING_TAIL_SHIFT;
-	wqi->fence_id = rq->global_seqno;
+	GEM_BUG_ON(ring_tail > WQ_RING_TAIL_MAX);
+	wqi->fence_id = fence_id;
 
-	/* Postincrement WQ tail for next time. */
+	/* Make the update visible to GuC */
 	WRITE_ONCE(desc->tail, (wq_off + wqi_size) & (GUC_WQ_SIZE - 1));
 }
 
@@ -547,6 +542,25 @@ static void guc_ring_doorbell(struct i915_guc_client *client)
 	GEM_BUG_ON(db->db_status != GUC_DOORBELL_ENABLED);
 }
 
+static void guc_add_request(struct intel_guc *guc,
+			    struct drm_i915_gem_request *rq)
+{
+	struct i915_guc_client *client = guc->execbuf_client;
+	struct intel_engine_cs *engine = rq->engine;
+	u32 ctx_desc = lower_32_bits(intel_lr_context_descriptor(rq->ctx, engine));
+	u32 ring_tail = intel_ring_set_tail(rq->ring, rq->tail) / sizeof(u64);
+
+	spin_lock(&client->wq_lock);
+
+	guc_wq_item_append(client, engine->guc_id, ctx_desc,
+			   ring_tail, rq->global_seqno);
+	guc_ring_doorbell(client);
+
+	client->submissions[engine->id] += 1;
+
+	spin_unlock(&client->wq_lock);
+}
+
 /**
  * i915_guc_submit() - Submit commands through GuC
  * @engine: engine associated with the commands
@@ -558,10 +572,8 @@ static void i915_guc_submit(struct intel_engine_cs *engine)
 {
 	struct drm_i915_private *dev_priv = engine->i915;
 	struct intel_guc *guc = &dev_priv->guc;
-	struct i915_guc_client *client = guc->execbuf_client;
 	struct intel_engine_execlists * const execlists = &engine->execlists;
 	struct execlist_port *port = execlists->port;
-	const unsigned int engine_id = engine->id;
 	unsigned int n;
 
 	for (n = 0; n < execlists_num_ports(execlists); n++) {
@@ -575,14 +587,7 @@ static void i915_guc_submit(struct intel_engine_cs *engine)
 			if (i915_vma_is_map_and_fenceable(rq->ring->vma))
 				POSTING_READ_FW(GUC_STATUS);
 
-			spin_lock(&client->wq_lock);
-
-			guc_wq_item_append(client, rq);
-			guc_ring_doorbell(client);
-
-			client->submissions[engine_id] += 1;
-
-			spin_unlock(&client->wq_lock);
+			guc_add_request(guc, rq);
 		}
 	}
 }
-- 
2.13.6

* [PATCH v2 07/12] drm/i915: Extract "emit write" part of emit breadcrumb functions
  2017-10-25 20:00 [PATCH 00/12] Preemption with GuC, fourth try Michał Winiarski
                   ` (5 preceding siblings ...)
  2017-10-25 20:00 ` [PATCH 06/12] drm/i915/guc: Split guc_wq_item_append Michał Winiarski
@ 2017-10-25 20:00 ` Michał Winiarski
  2017-10-25 20:00 ` [PATCH 08/12] drm/i915: Add information needed to track engine preempt state Michał Winiarski
                   ` (12 subsequent siblings)
  19 siblings, 0 replies; 35+ messages in thread
From: Michał Winiarski @ 2017-10-25 20:00 UTC (permalink / raw)
  To: intel-gfx

Let's separate the "emit" part from touching any internal structures,
this way we can have a generic "emit coherent GGTT write" function.
We would like to reuse this functionality for emitting HWSP write, to
confirm that preempt-to-idle has finished.
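
A small standalone sketch (plain C, placeholder opcodes, not the i915 code)
of the idiom this patch leans on: emit helpers take the current command
stream pointer, write a fixed number of dwords and return the advanced
pointer, which is what lets callers chain emits and keep the *_sz constants
in sync:

  #include <stdint.h>
  #include <stdio.h>

  /* Stand-in for a 4-dword "store dword to GGTT address" packet. */
  static uint32_t *emit_ggtt_write(uint32_t *cs, uint32_t value, uint32_t offset)
  {
          *cs++ = 0x1;            /* opcode placeholder */
          *cs++ = offset;
          *cs++ = 0;
          *cs++ = value;

          return cs;
  }

  int main(void)
  {
          uint32_t ring[8] = { 0 };
          uint32_t *cs = ring;

          cs = emit_ggtt_write(cs, 42, 0x1000);
          *cs++ = 0x2;            /* MI_USER_INTERRUPT placeholder */
          *cs++ = 0x0;            /* MI_NOOP placeholder */

          /* 4 + 2 dwords, mirroring the 6-dword core of gen8_emit_breadcrumb. */
          printf("emitted %d dwords\n", (int)(cs - ring));
          return 0;
  }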

v2: Reorder args to match emit_pipe_control, s/render/rcs (Chris)

Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/intel_lrc.c        | 28 +++++++-----------------
 drivers/gpu/drm/i915/intel_ringbuffer.h | 38 +++++++++++++++++++++++++++++++++
 2 files changed, 46 insertions(+), 20 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index e821c1eba7a4..599c709fc5a7 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -1794,10 +1794,8 @@ static void gen8_emit_breadcrumb(struct drm_i915_gem_request *request, u32 *cs)
 	/* w/a: bit 5 needs to be zero for MI_FLUSH_DW address. */
 	BUILD_BUG_ON(I915_GEM_HWS_INDEX_ADDR & (1 << 5));
 
-	*cs++ = (MI_FLUSH_DW + 1) | MI_FLUSH_DW_OP_STOREDW;
-	*cs++ = intel_hws_seqno_address(request->engine) | MI_FLUSH_DW_USE_GTT;
-	*cs++ = 0;
-	*cs++ = request->global_seqno;
+	cs = gen8_emit_ggtt_write(cs, request->global_seqno,
+				  intel_hws_seqno_address(request->engine));
 	*cs++ = MI_USER_INTERRUPT;
 	*cs++ = MI_NOOP;
 	request->tail = intel_ring_offset(request, cs);
@@ -1807,24 +1805,14 @@ static void gen8_emit_breadcrumb(struct drm_i915_gem_request *request, u32 *cs)
 }
 static const int gen8_emit_breadcrumb_sz = 6 + WA_TAIL_DWORDS;
 
-static void gen8_emit_breadcrumb_render(struct drm_i915_gem_request *request,
+static void gen8_emit_breadcrumb_rcs(struct drm_i915_gem_request *request,
 					u32 *cs)
 {
 	/* We're using qword write, seqno should be aligned to 8 bytes. */
 	BUILD_BUG_ON(I915_GEM_HWS_INDEX & 1);
 
-	/* w/a for post sync ops following a GPGPU operation we
-	 * need a prior CS_STALL, which is emitted by the flush
-	 * following the batch.
-	 */
-	*cs++ = GFX_OP_PIPE_CONTROL(6);
-	*cs++ = PIPE_CONTROL_GLOBAL_GTT_IVB | PIPE_CONTROL_CS_STALL |
-		PIPE_CONTROL_QW_WRITE;
-	*cs++ = intel_hws_seqno_address(request->engine);
-	*cs++ = 0;
-	*cs++ = request->global_seqno;
-	/* We're thrashing one dword of HWS. */
-	*cs++ = 0;
+	cs = gen8_emit_ggtt_write_rcs(cs, request->global_seqno,
+				      intel_hws_seqno_address(request->engine));
 	*cs++ = MI_USER_INTERRUPT;
 	*cs++ = MI_NOOP;
 	request->tail = intel_ring_offset(request, cs);
@@ -1832,7 +1820,7 @@ static void gen8_emit_breadcrumb_render(struct drm_i915_gem_request *request,
 
 	gen8_emit_wa_tail(request, cs);
 }
-static const int gen8_emit_breadcrumb_render_sz = 8 + WA_TAIL_DWORDS;
+static const int gen8_emit_breadcrumb_rcs_sz = 8 + WA_TAIL_DWORDS;
 
 static int gen8_init_rcs_context(struct drm_i915_gem_request *req)
 {
@@ -1991,8 +1979,8 @@ int logical_render_ring_init(struct intel_engine_cs *engine)
 		engine->init_hw = gen8_init_render_ring;
 	engine->init_context = gen8_init_rcs_context;
 	engine->emit_flush = gen8_emit_flush_render;
-	engine->emit_breadcrumb = gen8_emit_breadcrumb_render;
-	engine->emit_breadcrumb_sz = gen8_emit_breadcrumb_render_sz;
+	engine->emit_breadcrumb = gen8_emit_breadcrumb_rcs;
+	engine->emit_breadcrumb_sz = gen8_emit_breadcrumb_rcs_sz;
 
 	ret = intel_engine_create_scratch(engine, PAGE_SIZE);
 	if (ret)
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index e5a0d489a304..c15161e56964 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -869,6 +869,44 @@ static inline u32 *gen8_emit_pipe_control(u32 *batch, u32 flags, u32 offset)
 	return batch + 6;
 }
 
+static inline u32 *
+gen8_emit_ggtt_write_rcs(u32 *cs, u32 value, u32 gtt_offset)
+{
+	/* We're using qword write, offset should be aligned to 8 bytes. */
+	GEM_BUG_ON(!IS_ALIGNED(gtt_offset, 8));
+
+	/* w/a for post sync ops following a GPGPU operation we
+	 * need a prior CS_STALL, which is emitted by the flush
+	 * following the batch.
+	 */
+	*cs++ = GFX_OP_PIPE_CONTROL(6);
+	*cs++ = PIPE_CONTROL_GLOBAL_GTT_IVB | PIPE_CONTROL_CS_STALL |
+		PIPE_CONTROL_QW_WRITE;
+	*cs++ = gtt_offset;
+	*cs++ = 0;
+	*cs++ = value;
+	/* We're thrashing one dword of HWS. */
+	*cs++ = 0;
+
+	return cs;
+}
+
+static inline u32 *
+gen8_emit_ggtt_write(u32 *cs, u32 value, u32 gtt_offset)
+{
+	/* w/a: bit 5 needs to be zero for MI_FLUSH_DW address. */
+	GEM_BUG_ON(gtt_offset & (1 << 5));
+	/* Offset should be aligned to 8 bytes for both (QW/DW) write types */
+	GEM_BUG_ON(!IS_ALIGNED(gtt_offset, 8));
+
+	*cs++ = (MI_FLUSH_DW + 1) | MI_FLUSH_DW_OP_STOREDW;
+	*cs++ = gtt_offset | MI_FLUSH_DW_USE_GTT;
+	*cs++ = 0;
+	*cs++ = value;
+
+	return cs;
+}
+
 bool intel_engine_is_idle(struct intel_engine_cs *engine);
 bool intel_engines_are_idle(struct drm_i915_private *dev_priv);
 
-- 
2.13.6

* [PATCH 08/12] drm/i915: Add information needed to track engine preempt state
  2017-10-25 20:00 [PATCH 00/12] Preemption with GuC, fourth try Michał Winiarski
                   ` (6 preceding siblings ...)
  2017-10-25 20:00 ` [PATCH v2 07/12] drm/i915: Extract "emit write" part of emit breadcrumb functions Michał Winiarski
@ 2017-10-25 20:00 ` Michał Winiarski
  2017-10-25 20:00 ` [PATCH v2 09/12] drm/i915/guc: Keep request->priority for its lifetime Michał Winiarski
                   ` (11 subsequent siblings)
  19 siblings, 0 replies; 35+ messages in thread
From: Michał Winiarski @ 2017-10-25 20:00 UTC (permalink / raw)
  To: intel-gfx

We shouldn't inspect ELSP context status (or any other bits that depend on
the specific submission backend) when using GuC submission.
Let's use another piece of the HWSP for the preempt context to write its bit
of information, meaning that preemption has finished and the hardware is
now idle.
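
A hedged sketch (actual usage arrives with the GuC preemption patch later in
the series; the function names below are just for illustration) of how the
submission backend can consult the new HWSP dword instead of ELSP status
bits, using the existing intel_read_status_page()/intel_write_status_page()
helpers:

  /* Illustrative only - the magic value is defined by a later patch. */
  #define GUC_PREEMPT_FINISHED 0x1

  static bool preempt_done(struct intel_engine_cs *engine)
  {
          return intel_read_status_page(engine, I915_GEM_HWS_PREEMPT_INDEX) ==
                 GUC_PREEMPT_FINISHED;
  }

  static void preempt_reset_hwsp(struct intel_engine_cs *engine)
  {
          intel_write_status_page(engine, I915_GEM_HWS_PREEMPT_INDEX, 0);
  }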

Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Jeff McGee <jeff.mcgee@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Oscar Mateo <oscar.mateo@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Jeff McGee <jeff.mcgee@intel.com>
---
 drivers/gpu/drm/i915/intel_ringbuffer.h | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index c15161e56964..4a5a08985328 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -626,6 +626,8 @@ intel_write_status_page(struct intel_engine_cs *engine, int reg, u32 value)
  */
 #define I915_GEM_HWS_INDEX		0x30
 #define I915_GEM_HWS_INDEX_ADDR (I915_GEM_HWS_INDEX << MI_STORE_DWORD_INDEX_SHIFT)
+#define I915_GEM_HWS_PREEMPT_INDEX	0x32
+#define I915_GEM_HWS_PREEMPT_ADDR (I915_GEM_HWS_PREEMPT_INDEX << MI_STORE_DWORD_INDEX_SHIFT)
 #define I915_GEM_HWS_SCRATCH_INDEX	0x40
 #define I915_GEM_HWS_SCRATCH_ADDR (I915_GEM_HWS_SCRATCH_INDEX << MI_STORE_DWORD_INDEX_SHIFT)
 
@@ -778,6 +780,11 @@ static inline u32 intel_hws_seqno_address(struct intel_engine_cs *engine)
 	return engine->status_page.ggtt_offset + I915_GEM_HWS_INDEX_ADDR;
 }
 
+static inline u32 intel_hws_preempt_done_address(struct intel_engine_cs *engine)
+{
+	return engine->status_page.ggtt_offset + I915_GEM_HWS_PREEMPT_ADDR;
+}
+
 /* intel_breadcrumbs.c -- user interrupt bottom-half for waiters */
 int intel_engine_init_breadcrumbs(struct intel_engine_cs *engine);
 
-- 
2.13.6

* [PATCH v2 09/12] drm/i915/guc: Keep request->priority for its lifetime
  2017-10-25 20:00 [PATCH 00/12] Preemption with GuC, fourth try Michał Winiarski
                   ` (7 preceding siblings ...)
  2017-10-25 20:00 ` [PATCH 08/12] drm/i915: Add information needed to track engine preempt state Michał Winiarski
@ 2017-10-25 20:00 ` Michał Winiarski
  2017-10-25 20:00 ` [PATCH v3 10/12] drm/i915: Rename helpers used for unwinding, use macro for can_preempt Michał Winiarski
                   ` (10 subsequent siblings)
  19 siblings, 0 replies; 35+ messages in thread
From: Michał Winiarski @ 2017-10-25 20:00 UTC (permalink / raw)
  To: intel-gfx

We also want to support preemption with the GuC submission backend.
In order to do that, we need to keep the request priority around, like we
do on the execlists path.

v2: Remove completed prio == INT_MAX optimization

Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Jeff McGee <jeff.mcgee@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/i915_guc_submission.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
index 12f1c6a0afce..600ef9d33773 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -637,7 +637,6 @@ static void i915_guc_dequeue(struct intel_engine_cs *engine)
 			}
 
 			INIT_LIST_HEAD(&rq->priotree.link);
-			rq->priotree.priority = INT_MAX;
 
 			__i915_gem_request_submit(rq);
 			trace_i915_gem_request_in(rq, port_index(port, execlists));
-- 
2.13.6

* [PATCH v3 10/12] drm/i915: Rename helpers used for unwinding, use macro for can_preempt
  2017-10-25 20:00 [PATCH 00/12] Preemption with GuC, fourth try Michał Winiarski
                   ` (8 preceding siblings ...)
  2017-10-25 20:00 ` [PATCH v2 09/12] drm/i915/guc: Keep request->priority for its lifetime Michał Winiarski
@ 2017-10-25 20:00 ` Michał Winiarski
  2017-10-25 20:15   ` Chris Wilson
  2017-10-25 20:00 ` [PATCH v4 11/12] drm/i915/guc: Preemption! With GuC Michał Winiarski
                   ` (9 subsequent siblings)
  19 siblings, 1 reply; 35+ messages in thread
From: Michał Winiarski @ 2017-10-25 20:00 UTC (permalink / raw)
  To: intel-gfx

We would also like to make use of execlist_cancel_port_requests and
unwind_incomplete_requests in the GuC preemption backend.
Let's rename the functions to use the correct prefixes, so that we can
simply add the declarations in the following patch.
A similar thing applies to can_preempt, except that we're introducing the
HAS_LOGICAL_RING_PREEMPTION macro instead, converting other users that
were previously touching the device info directly.

v2: s/intel_engine/execlists and pass execlists to unwind (Chris)
v3: use locked version for exporting, drop const qual (Chris)

Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.c        |  2 +-
 drivers/gpu/drm/i915/i915_drv.h        |  2 ++
 drivers/gpu/drm/i915/intel_engine_cs.c |  6 +++---
 drivers/gpu/drm/i915/intel_lrc.c       | 35 ++++++++++++++++++----------------
 4 files changed, 25 insertions(+), 20 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index 3db5851756f0..7b871802ae36 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -372,7 +372,7 @@ static int i915_getparam(struct drm_device *dev, void *data,
 			value |= I915_SCHEDULER_CAP_ENABLED;
 			value |= I915_SCHEDULER_CAP_PRIORITY;
 
-			if (INTEL_INFO(dev_priv)->has_logical_ring_preemption &&
+			if (HAS_LOGICAL_RING_PREEMPTION(dev_priv) &&
 			    i915_modparams.enable_execlists &&
 			    !i915_modparams.enable_guc_submission)
 				value |= I915_SCHEDULER_CAP_PREEMPTION;
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 366ba74b0ad2..61c155cbf9d7 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -3140,6 +3140,8 @@ intel_info(const struct drm_i915_private *dev_priv)
 
 #define HAS_LOGICAL_RING_CONTEXTS(dev_priv) \
 		((dev_priv)->info.has_logical_ring_contexts)
+#define HAS_LOGICAL_RING_PREEMPTION(dev_priv) \
+		((dev_priv)->info.has_logical_ring_preemption)
 #define USES_PPGTT(dev_priv)		(i915_modparams.enable_ppgtt)
 #define USES_FULL_PPGTT(dev_priv)	(i915_modparams.enable_ppgtt >= 2)
 #define USES_FULL_48BIT_PPGTT(dev_priv)	(i915_modparams.enable_ppgtt == 3)
diff --git a/drivers/gpu/drm/i915/intel_engine_cs.c b/drivers/gpu/drm/i915/intel_engine_cs.c
index fedb839dff61..3ac876ca6cae 100644
--- a/drivers/gpu/drm/i915/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/intel_engine_cs.c
@@ -620,7 +620,7 @@ int intel_engine_init_common(struct intel_engine_cs *engine)
 	 * Similarly the preempt context must always be available so that
 	 * we can interrupt the engine at any time.
 	 */
-	if (INTEL_INFO(engine->i915)->has_logical_ring_preemption) {
+	if (HAS_LOGICAL_RING_PREEMPTION(engine->i915)) {
 		ring = engine->context_pin(engine,
 					   engine->i915->preempt_context);
 		if (IS_ERR(ring)) {
@@ -651,7 +651,7 @@ int intel_engine_init_common(struct intel_engine_cs *engine)
 err_breadcrumbs:
 	intel_engine_fini_breadcrumbs(engine);
 err_unpin_preempt:
-	if (INTEL_INFO(engine->i915)->has_logical_ring_preemption)
+	if (HAS_LOGICAL_RING_PREEMPTION(engine->i915))
 		engine->context_unpin(engine, engine->i915->preempt_context);
 err_unpin_kernel:
 	engine->context_unpin(engine, engine->i915->kernel_context);
@@ -679,7 +679,7 @@ void intel_engine_cleanup_common(struct intel_engine_cs *engine)
 	intel_engine_cleanup_cmd_parser(engine);
 	i915_gem_batch_pool_fini(&engine->batch_pool);
 
-	if (INTEL_INFO(engine->i915)->has_logical_ring_preemption)
+	if (HAS_LOGICAL_RING_PREEMPTION(engine->i915))
 		engine->context_unpin(engine, engine->i915->preempt_context);
 	engine->context_unpin(engine, engine->i915->kernel_context);
 }
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 599c709fc5a7..b5d382ef8d85 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -354,7 +354,7 @@ static void unwind_wa_tail(struct drm_i915_gem_request *rq)
 	assert_ring_tail_valid(rq->ring, rq->tail);
 }
 
-static void unwind_incomplete_requests(struct intel_engine_cs *engine)
+static void __unwind_incomplete_requests(struct intel_engine_cs *engine)
 {
 	struct drm_i915_gem_request *rq, *rn;
 	struct i915_priolist *uninitialized_var(p);
@@ -385,6 +385,17 @@ static void unwind_incomplete_requests(struct intel_engine_cs *engine)
 	}
 }
 
+static void
+execlists_unwind_incomplete_requests(struct intel_engine_execlists *execlists)
+{
+	struct intel_engine_cs *engine =
+		container_of(execlists, typeof(*engine), execlists);
+
+	spin_lock_irq(&engine->timeline->lock);
+	__unwind_incomplete_requests(engine);
+	spin_unlock_irq(&engine->timeline->lock);
+}
+
 static inline void
 execlists_context_status_change(struct drm_i915_gem_request *rq,
 				unsigned long status)
@@ -515,11 +526,6 @@ static void inject_preempt_context(struct intel_engine_cs *engine)
 	elsp_write(ce->lrc_desc, elsp);
 }
 
-static bool can_preempt(struct intel_engine_cs *engine)
-{
-	return INTEL_INFO(engine->i915)->has_logical_ring_preemption;
-}
-
 static void execlists_dequeue(struct intel_engine_cs *engine)
 {
 	struct intel_engine_execlists * const execlists = &engine->execlists;
@@ -567,7 +573,7 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
 		if (port_count(&port[0]) > 1)
 			goto unlock;
 
-		if (can_preempt(engine) &&
+		if (HAS_LOGICAL_RING_PREEMPTION(engine->i915) &&
 		    rb_entry(rb, struct i915_priolist, node)->priority >
 		    max(last->priotree.priority, 0)) {
 			/*
@@ -691,7 +697,7 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
 }
 
 static void
-execlist_cancel_port_requests(struct intel_engine_execlists *execlists)
+execlists_cancel_port_requests(struct intel_engine_execlists * const execlists)
 {
 	struct execlist_port *port = execlists->port;
 	unsigned int num_ports = execlists_num_ports(execlists);
@@ -718,7 +724,7 @@ static void execlists_cancel_requests(struct intel_engine_cs *engine)
 	spin_lock_irqsave(&engine->timeline->lock, flags);
 
 	/* Cancel the requests on the HW and clear the ELSP tracker. */
-	execlist_cancel_port_requests(execlists);
+	execlists_cancel_port_requests(execlists);
 
 	/* Mark all executing requests as skipped. */
 	list_for_each_entry(rq, &engine->timeline->requests, link) {
@@ -858,11 +864,8 @@ static void intel_lrc_irq_handler(unsigned long data)
 
 			if (status & GEN8_CTX_STATUS_ACTIVE_IDLE &&
 			    buf[2*head + 1] == PREEMPT_ID) {
-				execlist_cancel_port_requests(execlists);
-
-				spin_lock_irq(&engine->timeline->lock);
-				unwind_incomplete_requests(engine);
-				spin_unlock_irq(&engine->timeline->lock);
+				execlists_cancel_port_requests(execlists);
+				execlists_unwind_incomplete_requests(execlists);
 
 				GEM_BUG_ON(!execlists_is_active(execlists,
 								EXECLISTS_ACTIVE_PREEMPT));
@@ -1531,10 +1534,10 @@ static void reset_common_ring(struct intel_engine_cs *engine,
 	 * guessing the missed context-switch events by looking at what
 	 * requests were completed.
 	 */
-	execlist_cancel_port_requests(execlists);
+	execlists_cancel_port_requests(execlists);
 
 	/* Push back any incomplete requests for replay after the reset. */
-	unwind_incomplete_requests(engine);
+	__unwind_incomplete_requests(engine);
 
 	spin_unlock_irqrestore(&engine->timeline->lock, flags);
 
-- 
2.13.6

* [PATCH v4 11/12] drm/i915/guc: Preemption! With GuC
  2017-10-25 20:00 [PATCH 00/12] Preemption with GuC, fourth try Michał Winiarski
                   ` (9 preceding siblings ...)
  2017-10-25 20:00 ` [PATCH v3 10/12] drm/i915: Rename helpers used for unwinding, use macro for can_preempt Michał Winiarski
@ 2017-10-25 20:00 ` Michał Winiarski
  2017-10-25 20:24   ` Chris Wilson
                     ` (2 more replies)
  2017-10-25 20:00 ` [PATCH 12/12] HAX Enable GuC Submission for CI Michał Winiarski
                   ` (8 subsequent siblings)
  19 siblings, 3 replies; 35+ messages in thread
From: Michał Winiarski @ 2017-10-25 20:00 UTC (permalink / raw)
  To: intel-gfx

Pretty similar to what we have on execlists.
We're reusing most of the GEM code; however, due to GuC quirks we need a
couple of extra bits.
Preemption is implemented as a GuC action, and actions can be pretty slow.
Because of that, we're using a mutex to serialize them. Since we're
requesting preemption from the tasklet, the task of creating a work item
and wrapping it in a GuC action is delegated to a worker.

To signal that preemption has finished, we're using an additional piece of
the HWSP, and since we're not getting context-switch interrupts, we're also
adding a user interrupt.

The fact that our special preempt context has completed unfortunately
doesn't mean that we're ready to submit new work. We also need to wait
for GuC to finish its own processing.
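
Roughly, the post-preemption step looks like the sketch below (assumed
function name and a simplified busy-wait; the real code lives in
i915_guc_submission.c and reuses the execlists helpers renamed in the
previous patch):

  static void guc_preemption_finished(struct intel_engine_cs *engine)
  {
          struct intel_engine_execlists * const execlists = &engine->execlists;
          struct intel_guc *guc = &engine->i915->guc;
          struct guc_shared_ctx_data *data = guc->shared_data_vaddr;
          struct guc_ctx_report *report =
                  &data->preempt_ctx_report[engine->guc_id];

          /*
           * The preempt context breadcrumb has landed in the HWSP, but GuC
           * still has its own post-processing to finish before we may
           * resubmit, so wait for it to report completion.
           */
          while (READ_ONCE(report->report_return_status) !=
                 INTEL_GUC_REPORT_STATUS_COMPLETE)
                  cpu_relax();

          execlists_cancel_port_requests(execlists);
          execlists_unwind_incomplete_requests(execlists);

          execlists_clear_active(execlists, EXECLISTS_ACTIVE_PREEMPT);
  }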

v2: Don't compile out the wait for GuC, handle workqueue flush on reset,
no need for ordered workqueue, put on a reviewer hat when looking at my own
patches (Chris)
Move struct work around in intel_guc, move user interrupt outside of
conditional (Michał)
Keep the ring around rather than chasing it through intel_context

v3: Extract WA for flushing ggtt writes to a helper (Chris)
Keep work_struct in intel_guc rather than engine (Michał)
Use ordered workqueue for inject_preempt worker to avoid GuC quirks.

v4: Drop now unused INTEL_GUC_PREEMPT_OPTION_IMMEDIATE (Daniele)
Drop stray newlines, use container_of for intel_guc in worker,
check for presence of workqueue when flushing it, rather than
enable_guc_submission modparam, reorder preempt postprocessing (Chris)

Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Jeff McGee <jeff.mcgee@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Oscar Mateo <oscar.mateo@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.c            |   3 +-
 drivers/gpu/drm/i915/i915_gem.c            |  10 ++
 drivers/gpu/drm/i915/i915_guc_submission.c | 208 +++++++++++++++++++++++++++--
 drivers/gpu/drm/i915/intel_guc.h           |   8 ++
 drivers/gpu/drm/i915/intel_lrc.c           |   4 +-
 drivers/gpu/drm/i915/intel_lrc.h           |   1 -
 drivers/gpu/drm/i915/intel_ringbuffer.h    |   6 +
 7 files changed, 222 insertions(+), 18 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index 7b871802ae36..af745749509c 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -373,8 +373,7 @@ static int i915_getparam(struct drm_device *dev, void *data,
 			value |= I915_SCHEDULER_CAP_PRIORITY;
 
 			if (HAS_LOGICAL_RING_PREEMPTION(dev_priv) &&
-			    i915_modparams.enable_execlists &&
-			    !i915_modparams.enable_guc_submission)
+			    i915_modparams.enable_execlists)
 				value |= I915_SCHEDULER_CAP_PREEMPTION;
 		}
 		break;
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index d803ef5f4a7f..c2506fb3a483 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2921,6 +2921,16 @@ i915_gem_reset_prepare_engine(struct intel_engine_cs *engine)
 	tasklet_kill(&engine->execlists.irq_tasklet);
 	tasklet_disable(&engine->execlists.irq_tasklet);
 
+	/*
+	 * We're using a worker to queue preemption requests from the tasklet in
+	 * GuC submission mode.
+	 * Even though the tasklet was disabled, we may still have a worker queued.
+	 * Let's make sure that all workers scheduled before disabling the
+	 * tasklet are completed before continuing with the reset.
+	 */
+	if (engine->i915->guc.preempt_wq)
+		flush_workqueue(engine->i915->guc.preempt_wq);
+
 	if (engine->irq_seqno_barrier)
 		engine->irq_seqno_barrier(engine);
 
diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
index 600ef9d33773..c7647afb969e 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -561,6 +561,109 @@ static void guc_add_request(struct intel_guc *guc,
 	spin_unlock(&client->wq_lock);
 }
 
+/*
+ * When we're doing submissions using regular execlists backend, writing to
+ * ELSP from CPU side is enough to make sure that writes to ringbuffer pages
+ * pinned in mappable aperture portion of GGTT are visible to command streamer.
+ * Writes done by GuC on our behalf do not guarantee such ordering,
+ * therefore, to ensure the flush, we issue a POSTING_READ.
+ */
+static void flush_ggtt_writes(struct i915_vma *vma)
+{
+	struct drm_i915_private *dev_priv = to_i915(vma->obj->base.dev);
+
+	if (i915_vma_is_map_and_fenceable(vma))
+		POSTING_READ_FW(GUC_STATUS);
+}
+
+#define GUC_PREEMPT_FINISHED 0x1
+#define GUC_PREEMPT_BREADCRUMB_DWORDS 0x8
+static void inject_preempt_context(struct work_struct *work)
+{
+	struct guc_preempt_work *preempt_work =
+		container_of(work, typeof(*preempt_work), work);
+	struct intel_engine_cs *engine = preempt_work->engine;
+	struct intel_guc *guc = container_of(preempt_work, typeof(*guc),
+					     preempt_work[engine->id]);
+	struct i915_guc_client *client = guc->preempt_client;
+	struct intel_ring *ring = client->owner->engine[engine->id].ring;
+	u32 ctx_desc = lower_32_bits(intel_lr_context_descriptor(client->owner,
+								 engine));
+	u32 *cs = ring->vaddr + ring->tail;
+	u32 data[7];
+
+	if (engine->id == RCS) {
+		cs = gen8_emit_ggtt_write_rcs(cs, GUC_PREEMPT_FINISHED,
+				intel_hws_preempt_done_address(engine));
+	} else {
+		cs = gen8_emit_ggtt_write(cs, GUC_PREEMPT_FINISHED,
+				intel_hws_preempt_done_address(engine));
+		*cs++ = MI_NOOP;
+		*cs++ = MI_NOOP;
+	}
+	*cs++ = MI_USER_INTERRUPT;
+	*cs++ = MI_NOOP;
+
+	GEM_BUG_ON(!IS_ALIGNED(ring->size,
+			       GUC_PREEMPT_BREADCRUMB_DWORDS * sizeof(u32)));
+	GEM_BUG_ON((void *)cs - (ring->vaddr + ring->tail) !=
+		   GUC_PREEMPT_BREADCRUMB_DWORDS * sizeof(u32));
+
+	ring->tail += GUC_PREEMPT_BREADCRUMB_DWORDS * sizeof(u32);
+	ring->tail &= (ring->size - 1);
+
+	flush_ggtt_writes(ring->vma);
+
+	spin_lock_irq(&client->wq_lock);
+	guc_wq_item_append(client, engine->guc_id, ctx_desc,
+			   ring->tail / sizeof(u64), 0);
+	spin_unlock_irq(&client->wq_lock);
+
+	data[0] = INTEL_GUC_ACTION_REQUEST_PREEMPTION;
+	data[1] = client->stage_id;
+	data[2] = INTEL_GUC_PREEMPT_OPTION_DROP_WORK_Q |
+		  INTEL_GUC_PREEMPT_OPTION_DROP_SUBMIT_Q;
+	data[3] = engine->guc_id;
+	data[4] = guc->execbuf_client->priority;
+	data[5] = guc->execbuf_client->stage_id;
+	data[6] = guc_ggtt_offset(guc->shared_data);
+
+	if (WARN_ON(intel_guc_send(guc, data, ARRAY_SIZE(data)))) {
+		execlists_clear_active(&engine->execlists,
+				       EXECLISTS_ACTIVE_PREEMPT);
+		tasklet_schedule(&engine->execlists.irq_tasklet);
+	}
+}
+
+/*
+ * We're using user interrupt and HWSP value to mark that preemption has
+ * finished and GPU is idle. Normally, we could unwind and continue similar to
+ * execlists submission path. Unfortunately, with GuC we also need to wait for
+ * it to finish its own postprocessing, before attempting to submit. Otherwise
+ * GuC may silently ignore our submissions, and thus we risk losing a request
+ * at best, or executing out-of-order and causing a kernel panic at worst.
+ */
+#define GUC_PREEMPT_POSTPROCESS_DELAY_MS 10
+static void wait_for_guc_preempt_report(struct intel_engine_cs *engine)
+{
+	struct intel_guc *guc = &engine->i915->guc;
+	struct guc_shared_ctx_data *data = guc->shared_data_vaddr;
+	struct guc_ctx_report *report =
+		&data->preempt_ctx_report[engine->guc_id];
+
+	WARN_ON(wait_for_atomic(report->report_return_status ==
+				INTEL_GUC_REPORT_STATUS_COMPLETE,
+				GUC_PREEMPT_POSTPROCESS_DELAY_MS));
+	/*
+	 * GuC is expecting that we're also going to clear the affected context
+	 * counter, let's also reset the return status to not depend on GuC
+	 * resetting it after receiving another preempt action
+	 */
+	report->affected_count = 0;
+	report->report_return_status = INTEL_GUC_REPORT_STATUS_UNKNOWN;
+}
+
+
 /**
  * i915_guc_submit() - Submit commands through GuC
  * @engine: engine associated with the commands
@@ -570,8 +673,7 @@ static void guc_add_request(struct intel_guc *guc,
  */
 static void i915_guc_submit(struct intel_engine_cs *engine)
 {
-	struct drm_i915_private *dev_priv = engine->i915;
-	struct intel_guc *guc = &dev_priv->guc;
+	struct intel_guc *guc = &engine->i915->guc;
 	struct intel_engine_execlists * const execlists = &engine->execlists;
 	struct execlist_port *port = execlists->port;
 	unsigned int n;
@@ -584,8 +686,7 @@ static void i915_guc_submit(struct intel_engine_cs *engine)
 		if (rq && count == 0) {
 			port_set(&port[n], port_pack(rq, ++count));
 
-			if (i915_vma_is_map_and_fenceable(rq->ring->vma))
-				POSTING_READ_FW(GUC_STATUS);
+			flush_ggtt_writes(rq->ring->vma);
 
 			guc_add_request(guc, rq);
 		}
@@ -613,13 +714,32 @@ static void i915_guc_dequeue(struct intel_engine_cs *engine)
 	bool submit = false;
 	struct rb_node *rb;
 
-	if (port_isset(port))
-		port++;
-
 	spin_lock_irq(&engine->timeline->lock);
 	rb = execlists->first;
 	GEM_BUG_ON(rb_first(&execlists->queue) != rb);
-	while (rb) {
+
+	if (!rb)
+		goto unlock;
+
+	if (HAS_LOGICAL_RING_PREEMPTION(engine->i915) && port_isset(port)) {
+		struct guc_preempt_work *preempt_work =
+			&engine->i915->guc.preempt_work[engine->id];
+
+		if (rb_entry(rb, struct i915_priolist, node)->priority >
+		    max(port_request(port)->priotree.priority, 0)) {
+			execlists_set_active(execlists,
+					     EXECLISTS_ACTIVE_PREEMPT);
+			queue_work(engine->i915->guc.preempt_wq,
+				   &preempt_work->work);
+			goto unlock;
+		} else if (port_isset(last_port)) {
+			goto unlock;
+		}
+
+		port++;
+	}
+
+	do {
 		struct i915_priolist *p = rb_entry(rb, typeof(*p), node);
 		struct drm_i915_gem_request *rq, *rn;
 
@@ -649,7 +769,7 @@ static void i915_guc_dequeue(struct intel_engine_cs *engine)
 		INIT_LIST_HEAD(&p->requests);
 		if (p->priority != I915_PRIORITY_NORMAL)
 			kmem_cache_free(engine->i915->priorities, p);
-	}
+	} while (rb);
 done:
 	execlists->first = rb;
 	if (submit) {
@@ -657,6 +777,7 @@ static void i915_guc_dequeue(struct intel_engine_cs *engine)
 		execlists_set_active(execlists, EXECLISTS_ACTIVE_USER);
 		i915_guc_submit(engine);
 	}
+unlock:
 	spin_unlock_irq(&engine->timeline->lock);
 }
 
@@ -665,8 +786,6 @@ static void i915_guc_irq_handler(unsigned long data)
 	struct intel_engine_cs * const engine = (struct intel_engine_cs *)data;
 	struct intel_engine_execlists * const execlists = &engine->execlists;
 	struct execlist_port *port = execlists->port;
-	const struct execlist_port * const last_port =
-		&execlists->port[execlists->port_mask];
 	struct drm_i915_gem_request *rq;
 
 	rq = port_request(&port[0]);
@@ -681,7 +800,19 @@ static void i915_guc_irq_handler(unsigned long data)
 	if (!rq)
 		execlists_clear_active(execlists, EXECLISTS_ACTIVE_USER);
 
-	if (!port_isset(last_port))
+	if (execlists_is_active(execlists, EXECLISTS_ACTIVE_PREEMPT) &&
+	    intel_read_status_page(engine, I915_GEM_HWS_PREEMPT_INDEX) ==
+	    GUC_PREEMPT_FINISHED) {
+		execlists_cancel_port_requests(&engine->execlists);
+		execlists_unwind_incomplete_requests(execlists);
+
+		wait_for_guc_preempt_report(engine);
+
+		execlists_clear_active(execlists, EXECLISTS_ACTIVE_PREEMPT);
+		intel_write_status_page(engine, I915_GEM_HWS_PREEMPT_INDEX, 0);
+	}
+
+	if (!execlists_is_active(execlists, EXECLISTS_ACTIVE_PREEMPT))
 		i915_guc_dequeue(engine);
 }
 
@@ -1056,6 +1187,50 @@ static void guc_ads_destroy(struct intel_guc *guc)
 	i915_vma_unpin_and_release(&guc->ads_vma);
 }
 
+static int guc_preempt_work_create(struct intel_guc *guc)
+{
+	struct drm_i915_private *dev_priv = guc_to_i915(guc);
+	struct intel_engine_cs *engine;
+	enum intel_engine_id id;
+
+	/*
+	 * Even though both sending GuC action, and adding a new workitem to
+	 * GuC workqueue are serialized (each with its own locking), since
+	 * we're using multiple engines, it's possible that we're going to
+	 * issue a preempt request with two (or more - each for a different
+	 * engine) workitems in the GuC queue. In this situation, GuC may submit
+	 * all of them, which will make us very confused.
+	 * Our preemption contexts may even already be complete - before we
+	 * even had the chance to send the preempt action to GuC! Rather
+	 * than introducing yet another lock, we can just use ordered workqueue
+	 * to make sure we're always sending a single preemption request with a
+	 * single workitem.
+	 */
+	guc->preempt_wq = alloc_ordered_workqueue("i915-guc_preempt",
+						  WQ_HIGHPRI);
+	if (!guc->preempt_wq)
+		return -ENOMEM;
+
+	for_each_engine(engine, dev_priv, id) {
+		guc->preempt_work[id].engine = engine;
+		INIT_WORK(&guc->preempt_work[id].work, inject_preempt_context);
+	}
+
+	return 0;
+}
+
+static void guc_preempt_work_destroy(struct intel_guc *guc)
+{
+	struct drm_i915_private *dev_priv = guc_to_i915(guc);
+	struct intel_engine_cs *engine;
+	enum intel_engine_id id;
+
+	for_each_engine(engine, dev_priv, id)
+		cancel_work_sync(&guc->preempt_work[id].work);
+
+	destroy_workqueue(guc->preempt_wq);
+}
+
 /*
  * Set up the memory resources to be shared with the GuC (via the GGTT)
  * at firmware loading time.
@@ -1080,12 +1255,18 @@ int i915_guc_submission_init(struct drm_i915_private *dev_priv)
 	if (ret < 0)
 		goto err_shared_data;
 
+	ret = guc_preempt_work_create(guc);
+	if (ret)
+		goto err_log;
+
 	ret = guc_ads_create(guc);
 	if (ret < 0)
-		goto err_log;
+		goto err_wq;
 
 	return 0;
 
+err_wq:
+	guc_preempt_work_destroy(guc);
 err_log:
 	intel_guc_log_destroy(guc);
 err_shared_data:
@@ -1100,6 +1281,7 @@ void i915_guc_submission_fini(struct drm_i915_private *dev_priv)
 	struct intel_guc *guc = &dev_priv->guc;
 
 	guc_ads_destroy(guc);
+	guc_preempt_work_destroy(guc);
 	intel_guc_log_destroy(guc);
 	guc_shared_data_destroy(guc);
 	guc_stage_desc_pool_destroy(guc);
diff --git a/drivers/gpu/drm/i915/intel_guc.h b/drivers/gpu/drm/i915/intel_guc.h
index 6ca55c5bed3c..538615ff5ea2 100644
--- a/drivers/gpu/drm/i915/intel_guc.h
+++ b/drivers/gpu/drm/i915/intel_guc.h
@@ -34,6 +34,11 @@
 #include "i915_guc_reg.h"
 #include "i915_vma.h"
 
+struct guc_preempt_work {
+	struct intel_engine_cs *engine;
+	struct work_struct work;
+};
+
 /*
  * Top level structure of GuC. It handles firmware loading and manages client
  * pool and doorbells. intel_guc owns a i915_guc_client to replace the legacy
@@ -60,6 +65,9 @@ struct intel_guc {
 	struct i915_guc_client *execbuf_client;
 	struct i915_guc_client *preempt_client;
 
+	struct guc_preempt_work preempt_work[I915_NUM_ENGINES];
+	struct workqueue_struct *preempt_wq;
+
 	DECLARE_BITMAP(doorbell_bitmap, GUC_NUM_DOORBELLS);
 	/* Cyclic counter mod pagesize	*/
 	u32 db_cacheline;
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index b5d382ef8d85..6840ec8db037 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -385,7 +385,7 @@ static void __unwind_incomplete_requests(struct intel_engine_cs *engine)
 	}
 }
 
-static void
+void
 execlists_unwind_incomplete_requests(struct intel_engine_execlists *execlists)
 {
 	struct intel_engine_cs *engine =
@@ -696,7 +696,7 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
 	}
 }
 
-static void
+void
 execlists_cancel_port_requests(struct intel_engine_execlists * const execlists)
 {
 	struct execlist_port *port = execlists->port;
diff --git a/drivers/gpu/drm/i915/intel_lrc.h b/drivers/gpu/drm/i915/intel_lrc.h
index 689fde1a63a9..17182ce29674 100644
--- a/drivers/gpu/drm/i915/intel_lrc.h
+++ b/drivers/gpu/drm/i915/intel_lrc.h
@@ -107,7 +107,6 @@ intel_lr_context_descriptor(struct i915_gem_context *ctx,
 	return ctx->engine[engine->id].lrc_desc;
 }
 
-
 /* Execlists */
 int intel_sanitize_enable_execlists(struct drm_i915_private *dev_priv,
 				    int enable_execlists);
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index 4a5a08985328..69ad875fd011 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -557,6 +557,12 @@ execlists_is_active(const struct intel_engine_execlists *execlists,
 	return test_bit(bit, (unsigned long *)&execlists->active);
 }
 
+void
+execlists_cancel_port_requests(struct intel_engine_execlists * const execlists);
+
+void
+execlists_unwind_incomplete_requests(struct intel_engine_execlists *execlists);
+
 static inline unsigned int
 execlists_num_ports(const struct intel_engine_execlists * const execlists)
 {
-- 
2.13.6

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH 12/12] HAX Enable GuC Submission for CI
  2017-10-25 20:00 [PATCH 00/12] Preemption with GuC, fourth try Michał Winiarski
                   ` (10 preceding siblings ...)
  2017-10-25 20:00 ` [PATCH v4 11/12] drm/i915/guc: Preemption! With GuC Michał Winiarski
@ 2017-10-25 20:00 ` Michał Winiarski
  2017-10-25 21:06 ` ✗ Fi.CI.BAT: failure for Preemption with GuC, fourth try Patchwork
                   ` (7 subsequent siblings)
  19 siblings, 0 replies; 35+ messages in thread
From: Michał Winiarski @ 2017-10-25 20:00 UTC (permalink / raw)
  To: intel-gfx

Also:
Revert "drm/i915/guc: Assert that we switch between known ggtt->invalidate functions"

This reverts commit 04f7b24eccdfae680a36e9825fe0d61dcd5ed528.
---
 drivers/gpu/drm/i915/i915_gem_gtt.c | 8 ++------
 drivers/gpu/drm/i915/i915_params.h  | 4 ++--
 2 files changed, 4 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 5eaa6893daaa..7dcf931abe37 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -3568,17 +3568,13 @@ int i915_ggtt_enable_hw(struct drm_i915_private *dev_priv)
 
 void i915_ggtt_enable_guc(struct drm_i915_private *i915)
 {
-	GEM_BUG_ON(i915->ggtt.invalidate != gen6_ggtt_invalidate);
-
 	i915->ggtt.invalidate = guc_ggtt_invalidate;
 }
 
 void i915_ggtt_disable_guc(struct drm_i915_private *i915)
 {
-	/* We should only be called after i915_ggtt_enable_guc() */
-	GEM_BUG_ON(i915->ggtt.invalidate != guc_ggtt_invalidate);
-
-	i915->ggtt.invalidate = gen6_ggtt_invalidate;
+	if (i915->ggtt.invalidate == guc_ggtt_invalidate)
+		i915->ggtt.invalidate = gen6_ggtt_invalidate;
 }
 
 void i915_gem_restore_gtt_mappings(struct drm_i915_private *dev_priv)
diff --git a/drivers/gpu/drm/i915/i915_params.h b/drivers/gpu/drm/i915/i915_params.h
index c7292268ed43..c38cef07b9fe 100644
--- a/drivers/gpu/drm/i915/i915_params.h
+++ b/drivers/gpu/drm/i915/i915_params.h
@@ -44,8 +44,8 @@
 	param(int, disable_power_well, -1) \
 	param(int, enable_ips, 1) \
 	param(int, invert_brightness, 0) \
-	param(int, enable_guc_loading, 0) \
-	param(int, enable_guc_submission, 0) \
+	param(int, enable_guc_loading, 1) \
+	param(int, enable_guc_submission, 1) \
 	param(int, guc_log_level, -1) \
 	param(char *, guc_firmware_path, NULL) \
 	param(char *, huc_firmware_path, NULL) \
-- 
2.13.6

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* Re: [PATCH v3 10/12] drm/i915: Rename helpers used for unwinding, use macro for can_preempt
  2017-10-25 20:00 ` [PATCH v3 10/12] drm/i915: Rename helpers used for unwinding, use macro for can_preempt Michał Winiarski
@ 2017-10-25 20:15   ` Chris Wilson
  0 siblings, 0 replies; 35+ messages in thread
From: Chris Wilson @ 2017-10-25 20:15 UTC (permalink / raw)
  To: Michał Winiarski, intel-gfx

Quoting Michał Winiarski (2017-10-25 21:00:18)
> We would also like to make use of execlist_cancel_port_requests and
> unwind_incomplete_requests in GuC preemption backend.
> Let's rename the functions to use the correct prefixes, so that we can
> simply add the declarations in the following patch.
> A similar thing applies for can_preempt, except we're introducing
> the HAS_LOGICAL_RING_PREEMPTION macro instead, converting other users that
> were previously touching device info directly.
> 
> v2: s/intel_engine/execlists and pass execlists to unwind (Chris)
> v3: use locked version for exporting, drop const qual (Chris)
> 
> Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
> Cc: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v4 11/12] drm/i915/guc: Preemption! With GuC
  2017-10-25 20:00 ` [PATCH v4 11/12] drm/i915/guc: Preemption! With GuC Michał Winiarski
@ 2017-10-25 20:24   ` Chris Wilson
  2017-10-25 21:14   ` Chris Wilson
  2017-10-26  7:27   ` [PATCH v5] " Michał Winiarski
  2 siblings, 0 replies; 35+ messages in thread
From: Chris Wilson @ 2017-10-25 20:24 UTC (permalink / raw)
  To: Michał Winiarski, intel-gfx

Quoting Michał Winiarski (2017-10-25 21:00:19)
> Pretty similar to what we have on execlists.
> We're reusing most of the GEM code, however, due to GuC quirks we need a
> couple of extra bits.
> Preemption is implemented as GuC action, and actions can be pretty slow.
> Because of that, we're using a mutex to serialize them. Since we're
> requesting preemption from the tasklet, the task of creating a workitem
> and wrapping it in GuC action is delegated to a worker.
> 
> To distinguish that preemption has finished, we're using additional
> piece of HWSP, and since we're not getting context switch interrupts,
> we're also adding a user interrupt.
> 
> The fact that our special preempt context has completed unfortunately
> doesn't mean that we're ready to submit new work. We also need to wait
> for GuC to finish its own processing.
> 
> v2: Don't compile out the wait for GuC, handle workqueue flush on reset,
> no need for ordered workqueue, put on a reviewer hat when looking at my own
> patches (Chris)
> Move struct work around in intel_guc, move user interruput outside of
> conditional (Michał)
> Keep ring around rather than chase though intel_context
> 
> v3: Extract WA for flushing ggtt writes to a helper (Chris)
> Keep work_struct in intel_guc rather than engine (Michał)
> Use ordered workqueue for inject_preempt worker to avoid GuC quirks.
> 
> v4: Drop now unused INTEL_GUC_PREEMPT_OPTION_IMMEDIATE (Daniele)
> Drop stray newlines, use container_of for intel_guc in worker,
> check for presence of workqueue when flushing it, rather than
> enable_guc_submission modparam, reorder preempt postprocessing (Chris)
> 
> Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
> Cc: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Jeff McGee <jeff.mcgee@intel.com>
> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> Cc: Oscar Mateo <oscar.mateo@intel.com>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> ---
> +#define GUC_PREEMPT_POSTPROCESS_DELAY_MS 10
> +static void wait_for_guc_preempt_report(struct intel_engine_cs *engine)
> +{
> +       struct intel_guc *guc = &engine->i915->guc;
> +       struct guc_shared_ctx_data *data = guc->shared_data_vaddr;
> +       struct guc_ctx_report *report =
> +               &data->preempt_ctx_report[engine->guc_id];
> +
> +       WARN_ON(wait_for_atomic(report->report_return_status ==
> +                               INTEL_GUC_REPORT_STATUS_COMPLETE,
> +                               GUC_PREEMPT_POSTPROCESS_DELAY_MS));
> +       /*
> +        * GuC is expecting that we're also going to clear the affected context
> +        * counter, let's also reset the return status to not depend on GuC
> +        * resetting it after recieving another preempt action
> +        */
> +       report->affected_count = 0;
> +       report->report_return_status = INTEL_GUC_REPORT_STATUS_UNKNOWN;
> +}
> +
> +

The horror! ;)

Looks like everything is in order. Preemptive
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>

I shall ponder the question of whether we can stress these
(guc/execlists) preemption paths more. We have checks that we can
preempt and reorder in-flight requests; we have simple smoketests. What
I feel is that we probably need more of the latter to find timing
issues. (A new mode for gem_concurrent_blit? There's nothing like running
a test for a few days to shake out timing bugs.)  Unless I've missed a
rule we can test in gem_exec_schedule.
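
Something along these lines is what I have in mind, purely as a sketch -
the helpers here (submit_spinner(), submit_prio(), wait_and_measure(),
stop_spinner()) are made-up placeholders, not existing IGT API:

	/* Keep every engine busy with low-priority spinners, then keep
	 * firing maximum-priority work at random engines and check that
	 * it overtakes within some latency budget. Placeholder helpers,
	 * see note above.
	 */
	for (n = 0; n < num_engines; n++)
		spin[n] = submit_spinner(fd, engine[n], MIN_PRIO);

	for (i = 0; i < rounds; i++) {
		unsigned int e = rand() % num_engines;

		submit_prio(fd, engine[e], MAX_PRIO);
		assert(wait_and_measure(fd, engine[e]) < latency_budget);
	}

	for (n = 0; n < num_engines; n++)
		stop_spinner(spin[n]);
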
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v2 03/12] drm/i915/guc: Allocate separate shared data object for GuC communication
  2017-10-25 20:00 ` [PATCH v2 03/12] drm/i915/guc: Allocate separate shared data object for GuC communication Michał Winiarski
@ 2017-10-25 21:04   ` Michel Thierry
  0 siblings, 0 replies; 35+ messages in thread
From: Michel Thierry @ 2017-10-25 21:04 UTC (permalink / raw)
  To: Michał Winiarski, intel-gfx

On 25/10/17 13:00, Michał Winiarski wrote:
> We were using first page of kernel context render state for sharing data
> with GuC. While it's justified by the fact that those pages are not used
> (note, GuC still enforces this layout and refuses to work if we remove
> the extra page in front), it's also confusing (why are we using this
> particular page?). Let's allocate a separate object instead.
> 
> v2: Drop kernel_context from GuC suspend/resume action handlers (Michel)

v2 Reviewed-by: Michel Thierry <michel.thierry@intel.com>
> 
> Suggested-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
> Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
> Cc: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
> Cc: Jeff McGee <jeff.mcgee@intel.com>
> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
> Cc: Oscar Mateo <oscar.mateo@intel.com>
> Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
> ---
>   drivers/gpu/drm/i915/i915_guc_submission.c | 36 +++++++++++++++++++++++++++++-
>   drivers/gpu/drm/i915/intel_guc.c           | 14 ++----------
>   drivers/gpu/drm/i915/intel_guc.h           |  2 ++
>   3 files changed, 39 insertions(+), 13 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
> index e195bdee0473..d1a5613da24c 100644
> --- a/drivers/gpu/drm/i915/i915_guc_submission.c
> +++ b/drivers/gpu/drm/i915/i915_guc_submission.c
> @@ -437,6 +437,33 @@ static void guc_stage_desc_fini(struct intel_guc *guc,
>          memset(desc, 0, sizeof(*desc));
>   }
> 
> +static int guc_shared_data_create(struct intel_guc *guc)
> +{
> +       struct i915_vma *vma;
> +       void *vaddr;
> +
> +       vma = intel_guc_allocate_vma(guc, PAGE_SIZE);
> +       if (IS_ERR(vma))
> +               return PTR_ERR(vma);
> +
> +       vaddr = i915_gem_object_pin_map(vma->obj, I915_MAP_WB);
> +       if (IS_ERR(vaddr)) {
> +               i915_vma_unpin_and_release(&vma);
> +               return PTR_ERR(vaddr);
> +       }
> +
> +       guc->shared_data = vma;
> +       guc->shared_data_vaddr = vaddr;
> +
> +       return 0;
> +}
> +
> +static void guc_shared_data_destroy(struct intel_guc *guc)
> +{
> +       i915_gem_object_unpin_map(guc->shared_data->obj);
> +       i915_vma_unpin_and_release(&guc->shared_data);
> +}
> +
>   /* Construct a Work Item and append it to the GuC's Work Queue */
>   static void guc_wq_item_append(struct i915_guc_client *client,
>                                 struct drm_i915_gem_request *rq)
> @@ -993,9 +1020,13 @@ int i915_guc_submission_init(struct drm_i915_private *dev_priv)
>          if (ret)
>                  return ret;
> 
> +       ret = guc_shared_data_create(guc);
> +       if (ret)
> +               goto err_stage_desc_pool;
> +
>          ret = intel_guc_log_create(guc);
>          if (ret < 0)
> -               goto err_stage_desc_pool;
> +               goto err_shared_data;
> 
>          ret = guc_ads_create(guc);
>          if (ret < 0)
> @@ -1005,6 +1036,8 @@ int i915_guc_submission_init(struct drm_i915_private *dev_priv)
> 
>   err_log:
>          intel_guc_log_destroy(guc);
> +err_shared_data:
> +       guc_shared_data_destroy(guc);
>   err_stage_desc_pool:
>          guc_stage_desc_pool_destroy(guc);
>          return ret;
> @@ -1016,6 +1049,7 @@ void i915_guc_submission_fini(struct drm_i915_private *dev_priv)
> 
>          guc_ads_destroy(guc);
>          intel_guc_log_destroy(guc);
> +       guc_shared_data_destroy(guc);
>          guc_stage_desc_pool_destroy(guc);
>   }
> 
> diff --git a/drivers/gpu/drm/i915/intel_guc.c b/drivers/gpu/drm/i915/intel_guc.c
> index 10037c0fdf95..f74d50fdaeb0 100644
> --- a/drivers/gpu/drm/i915/intel_guc.c
> +++ b/drivers/gpu/drm/i915/intel_guc.c
> @@ -268,7 +268,6 @@ int intel_guc_auth_huc(struct intel_guc *guc, u32 rsa_offset)
>   int intel_guc_suspend(struct drm_i915_private *dev_priv)
>   {
>          struct intel_guc *guc = &dev_priv->guc;
> -       struct i915_gem_context *ctx;
>          u32 data[3];
> 
>          if (guc->fw.load_status != INTEL_UC_FIRMWARE_SUCCESS)
> @@ -276,14 +275,10 @@ int intel_guc_suspend(struct drm_i915_private *dev_priv)
> 
>          gen9_disable_guc_interrupts(dev_priv);
> 
> -       ctx = dev_priv->kernel_context;
> -
>          data[0] = INTEL_GUC_ACTION_ENTER_S_STATE;
>          /* any value greater than GUC_POWER_D0 */
>          data[1] = GUC_POWER_D1;
> -       /* first page is shared data with GuC */
> -       data[2] = guc_ggtt_offset(ctx->engine[RCS].state) +
> -                 LRC_GUCSHR_PN * PAGE_SIZE;
> +       data[2] = guc_ggtt_offset(guc->shared_data);
> 
>          return intel_guc_send(guc, data, ARRAY_SIZE(data));
>   }
> @@ -295,7 +290,6 @@ int intel_guc_suspend(struct drm_i915_private *dev_priv)
>   int intel_guc_resume(struct drm_i915_private *dev_priv)
>   {
>          struct intel_guc *guc = &dev_priv->guc;
> -       struct i915_gem_context *ctx;
>          u32 data[3];
> 
>          if (guc->fw.load_status != INTEL_UC_FIRMWARE_SUCCESS)
> @@ -304,13 +298,9 @@ int intel_guc_resume(struct drm_i915_private *dev_priv)
>          if (i915_modparams.guc_log_level >= 0)
>                  gen9_enable_guc_interrupts(dev_priv);
> 
> -       ctx = dev_priv->kernel_context;
> -
>          data[0] = INTEL_GUC_ACTION_EXIT_S_STATE;
>          data[1] = GUC_POWER_D0;
> -       /* first page is shared data with GuC */
> -       data[2] = guc_ggtt_offset(ctx->engine[RCS].state) +
> -                 LRC_GUCSHR_PN * PAGE_SIZE;
> +       data[2] = guc_ggtt_offset(guc->shared_data);
> 
>          return intel_guc_send(guc, data, ARRAY_SIZE(data));
>   }
> diff --git a/drivers/gpu/drm/i915/intel_guc.h b/drivers/gpu/drm/i915/intel_guc.h
> index 418450b1ae27..aa1583167b0a 100644
> --- a/drivers/gpu/drm/i915/intel_guc.h
> +++ b/drivers/gpu/drm/i915/intel_guc.h
> @@ -54,6 +54,8 @@ struct intel_guc {
>          struct i915_vma *stage_desc_pool;
>          void *stage_desc_pool_vaddr;
>          struct ida stage_ids;
> +       struct i915_vma *shared_data;
> +       void *shared_data_vaddr;
> 
>          struct i915_guc_client *execbuf_client;
> 
> --
> 2.13.6
> 
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/intel-gfx
> 
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 35+ messages in thread

* ✗ Fi.CI.BAT: failure for Preemption with GuC, fourth try
  2017-10-25 20:00 [PATCH 00/12] Preemption with GuC, fourth try Michał Winiarski
                   ` (11 preceding siblings ...)
  2017-10-25 20:00 ` [PATCH 12/12] HAX Enable GuC Submission for CI Michał Winiarski
@ 2017-10-25 21:06 ` Patchwork
  2017-10-26  7:48 ` ✗ Fi.CI.BAT: warning for Preemption with GuC, fourth try (rev2) Patchwork
                   ` (6 subsequent siblings)
  19 siblings, 0 replies; 35+ messages in thread
From: Patchwork @ 2017-10-25 21:06 UTC (permalink / raw)
  To: Michał Winiarski; +Cc: intel-gfx

== Series Details ==

Series: Preemption with GuC, fourth try
URL   : https://patchwork.freedesktop.org/series/32654/
State : failure

== Summary ==

Series 32654v1 Preemption with GuC, fourth try
https://patchwork.freedesktop.org/api/1.0/series/32654/revisions/1/mbox/

Test chamelium:
        Subgroup dp-crc-fast:
                fail       -> PASS       (fi-kbl-7500u) fdo#102514
Test debugfs_test:
        Subgroup read_all_entries:
                pass       -> DMESG-WARN (fi-cfl-s)
Test drv_getparams_basic:
        Subgroup basic-eu-total:
                pass       -> DMESG-WARN (fi-cfl-s)
        Subgroup basic-subslice-total:
                pass       -> DMESG-WARN (fi-cfl-s)
Test drv_hangman:
        Subgroup error-state-basic:
                pass       -> DMESG-FAIL (fi-glk-1)
                pass       -> DMESG-FAIL (fi-glk-dsi)
                pass       -> DMESG-WARN (fi-cfl-s)
Test gem_basic:
        Subgroup bad-close:
                pass       -> INCOMPLETE (fi-glk-1)
                pass       -> INCOMPLETE (fi-glk-dsi)
Test gem_exec_reloc:
        Subgroup basic-gtt-active:
                pass       -> FAIL       (fi-gdg-551) fdo#102582 +2
        Subgroup basic-write-cpu-active:
                pass       -> FAIL       (fi-gdg-551)
        Subgroup basic-write-gtt-active:
                pass       -> FAIL       (fi-gdg-551) fdo#102582
Test gem_sync:
        Subgroup basic-many-each:
                pass       -> FAIL       (fi-kbl-7567u) fdo#103165
Test kms_flip:
        Subgroup basic-flip-vs-wf_vblank:
                fail       -> PASS       (fi-skl-6770hq)
Test kms_setmode:
        Subgroup basic-clone-single-crtc:
                pass       -> INCOMPLETE (fi-cnl-y)

fdo#102514 https://bugs.freedesktop.org/show_bug.cgi?id=102514
fdo#102582 https://bugs.freedesktop.org/show_bug.cgi?id=102582
fdo#102582 https://bugs.freedesktop.org/show_bug.cgi?id=102582
fdo#103165 https://bugs.freedesktop.org/show_bug.cgi?id=103165

fi-bdw-5557u     total:289  pass:268  dwarn:0   dfail:0   fail:0   skip:21  time:441s
fi-bdw-gvtdvm    total:289  pass:265  dwarn:0   dfail:0   fail:0   skip:24  time:451s
fi-blb-e6850     total:289  pass:223  dwarn:1   dfail:0   fail:0   skip:65  time:377s
fi-bsw-n3050     total:289  pass:243  dwarn:0   dfail:0   fail:0   skip:46  time:537s
fi-bwr-2160      total:289  pass:183  dwarn:0   dfail:0   fail:0   skip:106 time:262s
fi-bxt-dsi       total:289  pass:259  dwarn:0   dfail:0   fail:0   skip:30  time:500s
fi-bxt-j4205     total:289  pass:260  dwarn:0   dfail:0   fail:0   skip:29  time:508s
fi-byt-j1900     total:289  pass:253  dwarn:1   dfail:0   fail:0   skip:35  time:488s
fi-byt-n2820     total:289  pass:249  dwarn:1   dfail:0   fail:0   skip:39  time:487s
fi-cfl-s         total:289  pass:249  dwarn:8   dfail:0   fail:0   skip:32  time:562s
fi-cnl-y         total:289  pass:224  dwarn:0   dfail:0   fail:0   skip:24 
fi-elk-e7500     total:289  pass:229  dwarn:0   dfail:0   fail:0   skip:60  time:417s
fi-gdg-551       total:289  pass:173  dwarn:1   dfail:0   fail:6   skip:109 time:250s
fi-glk-1         total:16   pass:5    dwarn:0   dfail:1   fail:0   skip:9  
fi-glk-dsi       total:16   pass:5    dwarn:0   dfail:1   fail:0   skip:9  
fi-hsw-4770      total:289  pass:262  dwarn:0   dfail:0   fail:0   skip:27  time:429s
fi-hsw-4770r     total:289  pass:262  dwarn:0   dfail:0   fail:0   skip:27  time:424s
fi-ilk-650       total:289  pass:228  dwarn:0   dfail:0   fail:0   skip:61  time:434s
fi-ivb-3520m     total:289  pass:260  dwarn:0   dfail:0   fail:0   skip:29  time:493s
fi-ivb-3770      total:289  pass:260  dwarn:0   dfail:0   fail:0   skip:29  time:462s
fi-kbl-7500u     total:289  pass:264  dwarn:1   dfail:0   fail:0   skip:24  time:487s
fi-kbl-7560u     total:289  pass:270  dwarn:0   dfail:0   fail:0   skip:19  time:571s
fi-kbl-7567u     total:289  pass:268  dwarn:0   dfail:0   fail:1   skip:20  time:483s
fi-kbl-r         total:289  pass:262  dwarn:0   dfail:0   fail:0   skip:27  time:582s
fi-pnv-d510      total:289  pass:222  dwarn:1   dfail:0   fail:0   skip:66  time:549s
fi-skl-6260u     total:289  pass:269  dwarn:0   dfail:0   fail:0   skip:20  time:447s
fi-skl-6600u     total:289  pass:262  dwarn:0   dfail:0   fail:0   skip:27  time:597s
fi-skl-6700hq    total:289  pass:263  dwarn:0   dfail:0   fail:0   skip:26  time:651s
fi-skl-6700k     total:289  pass:265  dwarn:0   dfail:0   fail:0   skip:24  time:529s
fi-skl-6770hq    total:289  pass:269  dwarn:0   dfail:0   fail:0   skip:20  time:498s
fi-skl-gvtdvm    total:289  pass:266  dwarn:0   dfail:0   fail:0   skip:23  time:458s
fi-snb-2520m     total:289  pass:250  dwarn:0   dfail:0   fail:0   skip:39  time:565s
fi-snb-2600      total:289  pass:249  dwarn:0   dfail:0   fail:0   skip:40  time:423s

be67da218126edb88de92c391a48964c89219c11 drm-tip: 2017y-10m-25d-17h-36m-48s UTC integration manifest
51a3ae75372d HAX Enable GuC Submission for CI
30e2ce646bf4 drm/i915/guc: Preemption! With GuC
ee2027e659d3 drm/i915: Rename helpers used for unwinding, use macro for can_preempt
24e661557be6 drm/i915/guc: Keep request->priority for its lifetime
dbb9d7ee0e47 drm/i915: Add information needed to track engine preempt state
047a7e6c2786 drm/i915: Extract "emit write" part of emit breadcrumb functions
c6dc70581913 drm/i915/guc: Split guc_wq_item_append
2fb58671aa1c drm/i915/guc: Add a second client, to be used for preemption
ce8846251ead drm/i915/guc: Add preemption action to GuC firmware interface
6565d972ea35 drm/i915/guc: Allocate separate shared data object for GuC communication
336ed6330546 drm/i915/guc: Extract GuC stage desc pool creation into a helper
b84d980555d1 drm/i915/guc: Do not use 0 for GuC doorbell cookie

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_6192/
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v4 11/12] drm/i915/guc: Preemption! With GuC
  2017-10-25 20:00 ` [PATCH v4 11/12] drm/i915/guc: Preemption! With GuC Michał Winiarski
  2017-10-25 20:24   ` Chris Wilson
@ 2017-10-25 21:14   ` Chris Wilson
  2017-10-26  7:27   ` [PATCH v5] " Michał Winiarski
  2 siblings, 0 replies; 35+ messages in thread
From: Chris Wilson @ 2017-10-25 21:14 UTC (permalink / raw)
  To: Michał Winiarski, intel-gfx

Quoting Michał Winiarski (2017-10-25 21:00:19)
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index d803ef5f4a7f..c2506fb3a483 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -2921,6 +2921,16 @@ i915_gem_reset_prepare_engine(struct intel_engine_cs *engine)
>         tasklet_kill(&engine->execlists.irq_tasklet);
>         tasklet_disable(&engine->execlists.irq_tasklet);
>  
> +       /*
> +        * We're using worker to queue preemption requests from the tasklet in
> +        * GuC submission mode.
> +        * Even though tasklet was disabled, we may still have a worker queued.
> +        * Let's make sure that all workers scheduled before disabling the
> +        * tasklet are completed before continuing with the reset.
> +        */
> +       if (engine->i915->guc.preempt_wq)
> +               flush_workqueue(engine->i915->guc.preempt_wq);
> +
>         if (engine->irq_seqno_barrier)
>                 engine->irq_seqno_barrier(engine);

https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_6192/fi-glk-dsi/igt@drv_hangman@error-state-basic.html

[   31.382512] i915 0000:00:02.0: Resetting rcs0 after gpu hang
[   31.382864] INFO: trying to register non-static key.
[   31.382871] the code is fine but needs lockdep annotation.
[   31.382874] turning off the locking correctness validator.
[   31.382879] CPU: 0 PID: 1473 Comm: drv_hangman Not tainted 4.14.0-rc6-CI-Patchwork_6192+ #1
[   31.382884] Hardware name: Intel Corp. Geminilake/GLK RVP2 LP4SD (07), BIOS GELKRVPA.X64.0062.B30.1708222146 08/22/2017
[   31.382890] Call Trace:
[   31.382898]  dump_stack+0x68/0x9f
[   31.382903]  register_lock_class+0x3fd/0x580
[   31.382907]  ? __bfs+0x129/0x210
[   31.382912]  __lock_acquire+0xa4/0x1b00
[   31.382916]  ? flush_workqueue+0x75/0x520
[   31.382921]  ? __raw_spin_lock_init+0x21/0x60
[   31.382925]  lock_acquire+0xb0/0x200
[   31.382929]  ? lock_acquire+0xb0/0x200
[   31.382932]  ? flush_workqueue+0x75/0x520
[   31.382936]  flush_workqueue+0x98/0x520
[   31.382939]  ? flush_workqueue+0x75/0x520
[   31.382944]  ? _raw_spin_unlock_irq+0x37/0x50
[   31.382991]  i915_gem_reset_prepare_engine+0x5b/0x90 [i915]
[   31.383024]  ? i915_gem_reset_prepare_engine+0x5b/0x90 [i915]
[   31.383055]  i915_reset_engine+0x43/0x100 [i915]
[   31.383087]  i915_handle_error+0x1ee/0x430 [i915]
[   31.383095]  ? __might_fault+0x3e/0x90
[   31.383127]  i915_wedged_set+0x84/0xd0 [i915]
[   31.383132]  simple_attr_write+0xb4/0xd0
[   31.383138]  full_proxy_write+0x54/0x80
[   31.383143]  __vfs_write+0x28/0x130
[   31.383148]  ? rcu_read_lock_sched_held+0x7a/0x90
[   31.383152]  ? rcu_sync_lockdep_assert+0x2f/0x60
[   31.383157]  ? __sb_start_write+0x10c/0x200
[   31.383161]  vfs_write+0xc8/0x1c0
[   31.383164]  SyS_write+0x49/0xb0
[   31.383169]  entry_SYSCALL_64_fastpath+0x1c/0xb1
[   31.383173] RIP: 0033:0x7f3492708670
[   31.383177] RSP: 002b:00007fffcbef7258 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
[   31.383182] RAX: ffffffffffffffda RBX: ffffffff81491ef3 RCX: 00007f3492708670
[   31.383186] RDX: 0000000000000002 RSI: 00005637752962c5 RDI: 0000000000000007
[   31.383190] RBP: ffffc900022abf88 R08: 0000563776b75df0 R09: 0000000000000000
[   31.383195] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
[   31.383199] R13: 00007fffcbef7660 R14: 000056377527ea10 R15: 0000000000000007
[   31.383205]  ? __this_cpu_preempt_check+0x13/0x20
[   31.383232] BUG: unable to handle kernel paging request at 0000000100000058

> +static void guc_preempt_work_destroy(struct intel_guc *guc)
> +{
> +       struct drm_i915_private *dev_priv = guc_to_i915(guc);
> +       struct intel_engine_cs *engine;
> +       enum intel_engine_id id;
> +
> +       for_each_engine(engine, dev_priv, id)
> +               cancel_work_sync(&guc->preempt_work[id].work);
> +
> +       destroy_workqueue(guc->preempt_wq);

guc->preempt_wq = NULL;

glk is special in that CI run: it tries and fails to enable the GuC
due to no fw. (Do I hear the noise of a combinatorial explosion in the
background?)
-Chris

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 35+ messages in thread

* [PATCH v5] drm/i915/guc: Preemption! With GuC
  2017-10-25 20:00 ` [PATCH v4 11/12] drm/i915/guc: Preemption! With GuC Michał Winiarski
  2017-10-25 20:24   ` Chris Wilson
  2017-10-25 21:14   ` Chris Wilson
@ 2017-10-26  7:27   ` Michał Winiarski
  2017-10-26 13:35     ` [PATCH v6] " Michał Winiarski
  2 siblings, 1 reply; 35+ messages in thread
From: Michał Winiarski @ 2017-10-26  7:27 UTC (permalink / raw)
  To: intel-gfx

Pretty similar to what we have on execlists.
We're reusing most of the GEM code; however, due to GuC quirks we need a
couple of extra bits.
Preemption is implemented as GuC action, and actions can be pretty slow.
Because of that, we're using a mutex to serialize them. Since we're
requesting preemption from the tasklet, the task of creating a workitem
and wrapping it in GuC action is delegated to a worker.

To detect that preemption has finished, we're using an additional
piece of the HWSP, and since we're not getting context switch interrupts,
we're also adding a user interrupt.

The fact that our special preempt context has completed unfortunately
doesn't mean that we're ready to submit new work. We also need to wait
for GuC to finish its own processing.
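
The completion path is thus a two-stage check - first the HWSP
breadcrumb, then the shared-memory report written by GuC (sketch,
mirroring the irq handler and wait_for_guc_preempt_report() below):

	if (execlists_is_active(execlists, EXECLISTS_ACTIVE_PREEMPT) &&
	    intel_read_status_page(engine, I915_GEM_HWS_PREEMPT_INDEX) ==
	    GUC_PREEMPT_FINISHED) {
		/* unwind the inflight requests first... */
		execlists_cancel_port_requests(execlists);
		execlists_unwind_incomplete_requests(execlists);

		/* ...then wait for GuC's own bookkeeping to finish */
		wait_for_atomic(report->report_return_status ==
				INTEL_GUC_REPORT_STATUS_COMPLETE,
				GUC_PREEMPT_POSTPROCESS_DELAY_MS);

		/* only now is it safe to clear PREEMPT and dequeue again */
		execlists_clear_active(execlists, EXECLISTS_ACTIVE_PREEMPT);
	}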

v2: Don't compile out the wait for GuC, handle workqueue flush on reset,
no need for ordered workqueue, put on a reviewer hat when looking at my own
patches (Chris)
Move struct work around in intel_guc, move user interrupt outside of
conditional (Michał)
Keep ring around rather than chase through intel_context

v3: Extract WA for flushing ggtt writes to a helper (Chris)
Keep work_struct in intel_guc rather than engine (Michał)
Use ordered workqueue for inject_preempt worker to avoid GuC quirks.

v4: Drop now unused INTEL_GUC_PREEMPT_OPTION_IMMEDIATE (Daniele)
Drop stray newlines, use container_of for intel_guc in worker,
check for presence of workqueue when flushing it, rather than
enable_guc_submission modparam, reorder preempt postprocessing (Chris)

v5: Make wq NULL after destroying it

Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Jeff McGee <jeff.mcgee@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Oscar Mateo <oscar.mateo@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/i915_drv.c            |   3 +-
 drivers/gpu/drm/i915/i915_gem.c            |  10 ++
 drivers/gpu/drm/i915/i915_guc_submission.c | 208 +++++++++++++++++++++++++++--
 drivers/gpu/drm/i915/intel_guc.h           |   8 ++
 drivers/gpu/drm/i915/intel_lrc.c           |   4 +-
 drivers/gpu/drm/i915/intel_lrc.h           |   1 -
 drivers/gpu/drm/i915/intel_ringbuffer.h    |   6 +
 7 files changed, 222 insertions(+), 18 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index 7b871802ae36..af745749509c 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -373,8 +373,7 @@ static int i915_getparam(struct drm_device *dev, void *data,
 			value |= I915_SCHEDULER_CAP_PRIORITY;
 
 			if (HAS_LOGICAL_RING_PREEMPTION(dev_priv) &&
-			    i915_modparams.enable_execlists &&
-			    !i915_modparams.enable_guc_submission)
+			    i915_modparams.enable_execlists)
 				value |= I915_SCHEDULER_CAP_PREEMPTION;
 		}
 		break;
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index d803ef5f4a7f..c2506fb3a483 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2921,6 +2921,16 @@ i915_gem_reset_prepare_engine(struct intel_engine_cs *engine)
 	tasklet_kill(&engine->execlists.irq_tasklet);
 	tasklet_disable(&engine->execlists.irq_tasklet);
 
+	/*
+	 * We're using worker to queue preemption requests from the tasklet in
+	 * GuC submission mode.
+	 * Even though tasklet was disabled, we may still have a worker queued.
+	 * Let's make sure that all workers scheduled before disabling the
+	 * tasklet are completed before continuing with the reset.
+	 */
+	if (engine->i915->guc.preempt_wq)
+		flush_workqueue(engine->i915->guc.preempt_wq);
+
 	if (engine->irq_seqno_barrier)
 		engine->irq_seqno_barrier(engine);
 
diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
index 600ef9d33773..c3eaeb4485e1 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -561,6 +561,108 @@ static void guc_add_request(struct intel_guc *guc,
 	spin_unlock(&client->wq_lock);
 }
 
+/*
+ * When we're doing submissions using regular execlists backend, writing to
+ * ELSP from CPU side is enough to make sure that writes to ringbuffer pages
+ * pinned in mappable aperture portion of GGTT are visible to command streamer.
+ * Writes done by GuC on our behalf do not guarantee such ordering,
+ * therefore, to ensure the flush, we issue a POSTING_READ.
+ */
+static void flush_ggtt_writes(struct i915_vma *vma)
+{
+	struct drm_i915_private *dev_priv = to_i915(vma->obj->base.dev);
+
+	if (i915_vma_is_map_and_fenceable(vma))
+		POSTING_READ_FW(GUC_STATUS);
+}
+
+#define GUC_PREEMPT_FINISHED 0x1
+#define GUC_PREEMPT_BREADCRUMB_DWORDS 0x8
+static void inject_preempt_context(struct work_struct *work)
+{
+	struct guc_preempt_work *preempt_work =
+		container_of(work, typeof(*preempt_work), work);
+	struct intel_engine_cs *engine = preempt_work->engine;
+	struct intel_guc *guc = container_of(preempt_work, typeof(*guc),
+					     preempt_work[engine->id]);
+	struct i915_guc_client *client = guc->preempt_client;
+	struct intel_ring *ring = client->owner->engine[engine->id].ring;
+	u32 ctx_desc = lower_32_bits(intel_lr_context_descriptor(client->owner,
+								 engine));
+	u32 *cs = ring->vaddr + ring->tail;
+	u32 data[7];
+
+	if (engine->id == RCS) {
+		cs = gen8_emit_ggtt_write_rcs(cs, GUC_PREEMPT_FINISHED,
+				intel_hws_preempt_done_address(engine));
+	} else {
+		cs = gen8_emit_ggtt_write(cs, GUC_PREEMPT_FINISHED,
+				intel_hws_preempt_done_address(engine));
+		*cs++ = MI_NOOP;
+		*cs++ = MI_NOOP;
+	}
+	*cs++ = MI_USER_INTERRUPT;
+	*cs++ = MI_NOOP;
+
+	GEM_BUG_ON(!IS_ALIGNED(ring->size,
+			       GUC_PREEMPT_BREADCRUMB_DWORDS * sizeof(u32)));
+	GEM_BUG_ON((void *)cs - (ring->vaddr + ring->tail) !=
+		   GUC_PREEMPT_BREADCRUMB_DWORDS * sizeof(u32));
+
+	ring->tail += GUC_PREEMPT_BREADCRUMB_DWORDS * sizeof(u32);
+	ring->tail &= (ring->size - 1);
+
+	flush_ggtt_writes(ring->vma);
+
+	spin_lock_irq(&client->wq_lock);
+	guc_wq_item_append(client, engine->guc_id, ctx_desc,
+			   ring->tail / sizeof(u64), 0);
+	spin_unlock_irq(&client->wq_lock);
+
+	data[0] = INTEL_GUC_ACTION_REQUEST_PREEMPTION;
+	data[1] = client->stage_id;
+	data[2] = INTEL_GUC_PREEMPT_OPTION_DROP_WORK_Q |
+		  INTEL_GUC_PREEMPT_OPTION_DROP_SUBMIT_Q;
+	data[3] = engine->guc_id;
+	data[4] = guc->execbuf_client->priority;
+	data[5] = guc->execbuf_client->stage_id;
+	data[6] = guc_ggtt_offset(guc->shared_data);
+
+	if (WARN_ON(intel_guc_send(guc, data, ARRAY_SIZE(data)))) {
+		execlists_clear_active(&engine->execlists,
+				       EXECLISTS_ACTIVE_PREEMPT);
+		tasklet_schedule(&engine->execlists.irq_tasklet);
+	}
+}
+
+/*
+ * We're using user interrupt and HWSP value to mark that preemption has
+ * finished and GPU is idle. Normally, we could unwind and continue similar to
+ * execlists submission path. Unfortunately, with GuC we also need to wait for
+ * it to finish its own postprocessing, before attempting to submit. Otherwise
+ * GuC may silently ignore our submissions, and thus we risk losing a request
+ * at best, or executing out-of-order and causing a kernel panic at worst.
+ */
+#define GUC_PREEMPT_POSTPROCESS_DELAY_MS 10
+static void wait_for_guc_preempt_report(struct intel_engine_cs *engine)
+{
+	struct intel_guc *guc = &engine->i915->guc;
+	struct guc_shared_ctx_data *data = guc->shared_data_vaddr;
+	struct guc_ctx_report *report =
+		&data->preempt_ctx_report[engine->guc_id];
+
+	WARN_ON(wait_for_atomic(report->report_return_status ==
+				INTEL_GUC_REPORT_STATUS_COMPLETE,
+				GUC_PREEMPT_POSTPROCESS_DELAY_MS));
+	/*
+	 * GuC is expecting that we're also going to clear the affected context
+	 * counter, let's also reset the return status to not depend on GuC
+	 * resetting it after receiving another preempt action
+	 */
+	report->affected_count = 0;
+	report->report_return_status = INTEL_GUC_REPORT_STATUS_UNKNOWN;
+}
+
 /**
  * i915_guc_submit() - Submit commands through GuC
  * @engine: engine associated with the commands
@@ -570,8 +672,7 @@ static void guc_add_request(struct intel_guc *guc,
  */
 static void i915_guc_submit(struct intel_engine_cs *engine)
 {
-	struct drm_i915_private *dev_priv = engine->i915;
-	struct intel_guc *guc = &dev_priv->guc;
+	struct intel_guc *guc = &engine->i915->guc;
 	struct intel_engine_execlists * const execlists = &engine->execlists;
 	struct execlist_port *port = execlists->port;
 	unsigned int n;
@@ -584,8 +685,7 @@ static void i915_guc_submit(struct intel_engine_cs *engine)
 		if (rq && count == 0) {
 			port_set(&port[n], port_pack(rq, ++count));
 
-			if (i915_vma_is_map_and_fenceable(rq->ring->vma))
-				POSTING_READ_FW(GUC_STATUS);
+			flush_ggtt_writes(rq->ring->vma);
 
 			guc_add_request(guc, rq);
 		}
@@ -613,13 +713,32 @@ static void i915_guc_dequeue(struct intel_engine_cs *engine)
 	bool submit = false;
 	struct rb_node *rb;
 
-	if (port_isset(port))
-		port++;
-
 	spin_lock_irq(&engine->timeline->lock);
 	rb = execlists->first;
 	GEM_BUG_ON(rb_first(&execlists->queue) != rb);
-	while (rb) {
+
+	if (!rb)
+		goto unlock;
+
+	if (HAS_LOGICAL_RING_PREEMPTION(engine->i915) && port_isset(port)) {
+		struct guc_preempt_work *preempt_work =
+			&engine->i915->guc.preempt_work[engine->id];
+
+		if (rb_entry(rb, struct i915_priolist, node)->priority >
+		    max(port_request(port)->priotree.priority, 0)) {
+			execlists_set_active(execlists,
+					     EXECLISTS_ACTIVE_PREEMPT);
+			queue_work(engine->i915->guc.preempt_wq,
+				   &preempt_work->work);
+			goto unlock;
+		} else if (port_isset(last_port)) {
+			goto unlock;
+		}
+
+		port++;
+	}
+
+	do {
 		struct i915_priolist *p = rb_entry(rb, typeof(*p), node);
 		struct drm_i915_gem_request *rq, *rn;
 
@@ -649,7 +768,7 @@ static void i915_guc_dequeue(struct intel_engine_cs *engine)
 		INIT_LIST_HEAD(&p->requests);
 		if (p->priority != I915_PRIORITY_NORMAL)
 			kmem_cache_free(engine->i915->priorities, p);
-	}
+	} while (rb);
 done:
 	execlists->first = rb;
 	if (submit) {
@@ -657,6 +776,7 @@ static void i915_guc_dequeue(struct intel_engine_cs *engine)
 		execlists_set_active(execlists, EXECLISTS_ACTIVE_USER);
 		i915_guc_submit(engine);
 	}
+unlock:
 	spin_unlock_irq(&engine->timeline->lock);
 }
 
@@ -665,8 +785,6 @@ static void i915_guc_irq_handler(unsigned long data)
 	struct intel_engine_cs * const engine = (struct intel_engine_cs *)data;
 	struct intel_engine_execlists * const execlists = &engine->execlists;
 	struct execlist_port *port = execlists->port;
-	const struct execlist_port * const last_port =
-		&execlists->port[execlists->port_mask];
 	struct drm_i915_gem_request *rq;
 
 	rq = port_request(&port[0]);
@@ -681,7 +799,19 @@ static void i915_guc_irq_handler(unsigned long data)
 	if (!rq)
 		execlists_clear_active(execlists, EXECLISTS_ACTIVE_USER);
 
-	if (!port_isset(last_port))
+	if (execlists_is_active(execlists, EXECLISTS_ACTIVE_PREEMPT) &&
+	    intel_read_status_page(engine, I915_GEM_HWS_PREEMPT_INDEX) ==
+	    GUC_PREEMPT_FINISHED) {
+		execlists_cancel_port_requests(&engine->execlists);
+		execlists_unwind_incomplete_requests(execlists);
+
+		wait_for_guc_preempt_report(engine);
+
+		execlists_clear_active(execlists, EXECLISTS_ACTIVE_PREEMPT);
+		intel_write_status_page(engine, I915_GEM_HWS_PREEMPT_INDEX, 0);
+	}
+
+	if (!execlists_is_active(execlists, EXECLISTS_ACTIVE_PREEMPT))
 		i915_guc_dequeue(engine);
 }
 
@@ -1056,6 +1186,51 @@ static void guc_ads_destroy(struct intel_guc *guc)
 	i915_vma_unpin_and_release(&guc->ads_vma);
 }
 
+static int guc_preempt_work_create(struct intel_guc *guc)
+{
+	struct drm_i915_private *dev_priv = guc_to_i915(guc);
+	struct intel_engine_cs *engine;
+	enum intel_engine_id id;
+
+	/*
+	 * Even though both sending GuC action, and adding a new workitem to
+	 * GuC workqueue are serialized (each with its own locking), since
+	 * we're using multiple engines, it's possible that we're going to
+	 * issue a preempt request with two (or more - each for a different
+	 * engine) workitems in the GuC queue. In this situation, GuC may submit
+	 * all of them, which will make us very confused.
+	 * Our preemption contexts may even already be complete - before we
+	 * even had the chance to send the preempt action to GuC! Rather
+	 * than introducing yet another lock, we can just use ordered workqueue
+	 * to make sure we're always sending a single preemption request with a
+	 * single workitem.
+	 */
+	guc->preempt_wq = alloc_ordered_workqueue("i915-guc_preempt",
+						  WQ_HIGHPRI);
+	if (!guc->preempt_wq)
+		return -ENOMEM;
+
+	for_each_engine(engine, dev_priv, id) {
+		guc->preempt_work[id].engine = engine;
+		INIT_WORK(&guc->preempt_work[id].work, inject_preempt_context);
+	}
+
+	return 0;
+}
+
+static void guc_preempt_work_destroy(struct intel_guc *guc)
+{
+	struct drm_i915_private *dev_priv = guc_to_i915(guc);
+	struct intel_engine_cs *engine;
+	enum intel_engine_id id;
+
+	for_each_engine(engine, dev_priv, id)
+		cancel_work_sync(&guc->preempt_work[id].work);
+
+	destroy_workqueue(guc->preempt_wq);
+	guc->preempt_wq = NULL;
+}
+
 /*
  * Set up the memory resources to be shared with the GuC (via the GGTT)
  * at firmware loading time.
@@ -1080,12 +1255,18 @@ int i915_guc_submission_init(struct drm_i915_private *dev_priv)
 	if (ret < 0)
 		goto err_shared_data;
 
+	ret = guc_preempt_work_create(guc);
+	if (ret)
+		goto err_log;
+
 	ret = guc_ads_create(guc);
 	if (ret < 0)
-		goto err_log;
+		goto err_wq;
 
 	return 0;
 
+err_wq:
+	guc_preempt_work_destroy(guc);
 err_log:
 	intel_guc_log_destroy(guc);
 err_shared_data:
@@ -1100,6 +1281,7 @@ void i915_guc_submission_fini(struct drm_i915_private *dev_priv)
 	struct intel_guc *guc = &dev_priv->guc;
 
 	guc_ads_destroy(guc);
+	guc_preempt_work_destroy(guc);
 	intel_guc_log_destroy(guc);
 	guc_shared_data_destroy(guc);
 	guc_stage_desc_pool_destroy(guc);
diff --git a/drivers/gpu/drm/i915/intel_guc.h b/drivers/gpu/drm/i915/intel_guc.h
index 6ca55c5bed3c..538615ff5ea2 100644
--- a/drivers/gpu/drm/i915/intel_guc.h
+++ b/drivers/gpu/drm/i915/intel_guc.h
@@ -34,6 +34,11 @@
 #include "i915_guc_reg.h"
 #include "i915_vma.h"
 
+struct guc_preempt_work {
+	struct intel_engine_cs *engine;
+	struct work_struct work;
+};
+
 /*
  * Top level structure of GuC. It handles firmware loading and manages client
  * pool and doorbells. intel_guc owns a i915_guc_client to replace the legacy
@@ -60,6 +65,9 @@ struct intel_guc {
 	struct i915_guc_client *execbuf_client;
 	struct i915_guc_client *preempt_client;
 
+	struct guc_preempt_work preempt_work[I915_NUM_ENGINES];
+	struct workqueue_struct *preempt_wq;
+
 	DECLARE_BITMAP(doorbell_bitmap, GUC_NUM_DOORBELLS);
 	/* Cyclic counter mod pagesize	*/
 	u32 db_cacheline;
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index b5d382ef8d85..6840ec8db037 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -385,7 +385,7 @@ static void __unwind_incomplete_requests(struct intel_engine_cs *engine)
 	}
 }
 
-static void
+void
 execlists_unwind_incomplete_requests(struct intel_engine_execlists *execlists)
 {
 	struct intel_engine_cs *engine =
@@ -696,7 +696,7 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
 	}
 }
 
-static void
+void
 execlists_cancel_port_requests(struct intel_engine_execlists * const execlists)
 {
 	struct execlist_port *port = execlists->port;
diff --git a/drivers/gpu/drm/i915/intel_lrc.h b/drivers/gpu/drm/i915/intel_lrc.h
index 689fde1a63a9..17182ce29674 100644
--- a/drivers/gpu/drm/i915/intel_lrc.h
+++ b/drivers/gpu/drm/i915/intel_lrc.h
@@ -107,7 +107,6 @@ intel_lr_context_descriptor(struct i915_gem_context *ctx,
 	return ctx->engine[engine->id].lrc_desc;
 }
 
-
 /* Execlists */
 int intel_sanitize_enable_execlists(struct drm_i915_private *dev_priv,
 				    int enable_execlists);
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index 4a5a08985328..69ad875fd011 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -557,6 +557,12 @@ execlists_is_active(const struct intel_engine_execlists *execlists,
 	return test_bit(bit, (unsigned long *)&execlists->active);
 }
 
+void
+execlists_cancel_port_requests(struct intel_engine_execlists * const execlists);
+
+void
+execlists_unwind_incomplete_requests(struct intel_engine_execlists *execlists);
+
 static inline unsigned int
 execlists_num_ports(const struct intel_engine_execlists * const execlists)
 {
-- 
2.13.6

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* ✗ Fi.CI.BAT: warning for Preemption with GuC, fourth try (rev2)
  2017-10-25 20:00 [PATCH 00/12] Preemption with GuC, fourth try Michał Winiarski
                   ` (12 preceding siblings ...)
  2017-10-25 21:06 ` ✗ Fi.CI.BAT: failure for Preemption with GuC, fourth try Patchwork
@ 2017-10-26  7:48 ` Patchwork
  2017-10-26 13:41 ` ✗ Fi.CI.BAT: warning for Preemption with GuC, fourth try (rev3) Patchwork
                   ` (5 subsequent siblings)
  19 siblings, 0 replies; 35+ messages in thread
From: Patchwork @ 2017-10-26  7:48 UTC (permalink / raw)
  To: Michał Winiarski; +Cc: intel-gfx

== Series Details ==

Series: Preemption with GuC, fourth try (rev2)
URL   : https://patchwork.freedesktop.org/series/32654/
State : warning

== Summary ==

Series 32654v2 Preemption with GuC, fourth try
https://patchwork.freedesktop.org/api/1.0/series/32654/revisions/2/mbox/

Test drv_module_reload:
        Subgroup basic-reload:
                pass       -> DMESG-WARN (fi-cnl-y)
        Subgroup basic-no-display:
                pass       -> DMESG-WARN (fi-cnl-y)
        Subgroup basic-reload-inject:
                pass       -> DMESG-WARN (fi-cnl-y)

fi-bdw-5557u     total:289  pass:268  dwarn:0   dfail:0   fail:0   skip:21  time:439s
fi-bdw-gvtdvm    total:289  pass:265  dwarn:0   dfail:0   fail:0   skip:24  time:444s
fi-blb-e6850     total:289  pass:223  dwarn:1   dfail:0   fail:0   skip:65  time:370s
fi-bsw-n3050     total:289  pass:243  dwarn:0   dfail:0   fail:0   skip:46  time:527s
fi-bwr-2160      total:289  pass:183  dwarn:0   dfail:0   fail:0   skip:106 time:264s
fi-bxt-dsi       total:289  pass:259  dwarn:0   dfail:0   fail:0   skip:30  time:503s
fi-bxt-j4205     total:289  pass:260  dwarn:0   dfail:0   fail:0   skip:29  time:503s
fi-byt-j1900     total:289  pass:253  dwarn:1   dfail:0   fail:0   skip:35  time:492s
fi-byt-n2820     total:289  pass:249  dwarn:1   dfail:0   fail:0   skip:39  time:470s
fi-cfl-s         total:289  pass:253  dwarn:4   dfail:0   fail:0   skip:32  time:551s
fi-cnl-y         total:289  pass:259  dwarn:3   dfail:0   fail:0   skip:27  time:620s
fi-elk-e7500     total:289  pass:229  dwarn:0   dfail:0   fail:0   skip:60  time:419s
fi-gdg-551       total:289  pass:178  dwarn:1   dfail:0   fail:1   skip:109 time:246s
fi-glk-1         total:289  pass:261  dwarn:0   dfail:0   fail:0   skip:28  time:576s
fi-glk-dsi       total:289  pass:258  dwarn:0   dfail:0   fail:1   skip:30  time:482s
fi-hsw-4770      total:289  pass:262  dwarn:0   dfail:0   fail:0   skip:27  time:431s
fi-hsw-4770r     total:289  pass:262  dwarn:0   dfail:0   fail:0   skip:27  time:431s
fi-ilk-650       total:289  pass:228  dwarn:0   dfail:0   fail:0   skip:61  time:442s
fi-ivb-3520m     total:289  pass:260  dwarn:0   dfail:0   fail:0   skip:29  time:494s
fi-ivb-3770      total:289  pass:260  dwarn:0   dfail:0   fail:0   skip:29  time:461s
fi-kbl-7500u     total:289  pass:264  dwarn:1   dfail:0   fail:0   skip:24  time:492s
fi-kbl-7560u     total:289  pass:270  dwarn:0   dfail:0   fail:0   skip:19  time:574s
fi-kbl-7567u     total:289  pass:269  dwarn:0   dfail:0   fail:0   skip:20  time:479s
fi-kbl-r         total:289  pass:262  dwarn:0   dfail:0   fail:0   skip:27  time:584s
fi-pnv-d510      total:289  pass:222  dwarn:1   dfail:0   fail:0   skip:66  time:544s
fi-skl-6260u     total:289  pass:269  dwarn:0   dfail:0   fail:0   skip:20  time:449s
fi-skl-6600u     total:289  pass:262  dwarn:0   dfail:0   fail:0   skip:27  time:590s
fi-skl-6700hq    total:289  pass:263  dwarn:0   dfail:0   fail:0   skip:26  time:648s
fi-skl-6700k     total:289  pass:265  dwarn:0   dfail:0   fail:0   skip:24  time:519s
fi-skl-6770hq    total:289  pass:269  dwarn:0   dfail:0   fail:0   skip:20  time:495s
fi-skl-gvtdvm    total:289  pass:266  dwarn:0   dfail:0   fail:0   skip:23  time:454s
fi-snb-2520m     total:289  pass:250  dwarn:0   dfail:0   fail:0   skip:39  time:555s
fi-snb-2600      total:289  pass:249  dwarn:0   dfail:0   fail:0   skip:40  time:415s

2ea0b3d47030274c97624258e09fc7d1ffd0e0f2 drm-tip: 2017y-10m-25d-18h-42m-20s UTC integration manifest
d8525b3f8fda HAX Enable GuC Submission for CI
ca766427cf88 drm/i915/guc: Preemption! With GuC
5c250d581667 drm/i915: Rename helpers used for unwinding, use macro for can_preempt
dee14752e8a2 drm/i915/guc: Keep request->priority for its lifetime
8be229d8c330 drm/i915: Add information needed to track engine preempt state
ebe667124190 drm/i915: Extract "emit write" part of emit breadcrumb functions
e03bf3d7119c drm/i915/guc: Split guc_wq_item_append
f4bdf01a19d4 drm/i915/guc: Add a second client, to be used for preemption
55ce81eb7f4b drm/i915/guc: Add preemption action to GuC firmware interface
47cc469a334c drm/i915/guc: Allocate separate shared data object for GuC communication
ced55050645b drm/i915/guc: Extract GuC stage desc pool creation into a helper
814910338e4f drm/i915/guc: Do not use 0 for GuC doorbell cookie

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_6198/
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v3 05/12] drm/i915/guc: Add a second client, to be used for preemption
  2017-10-25 20:00 ` [PATCH v3 05/12] drm/i915/guc: Add a second client, to be used for preemption Michał Winiarski
@ 2017-10-26 12:50   ` Michal Wajdeczko
  2017-10-26 13:20   ` [PATCH v4] " Michał Winiarski
  1 sibling, 0 replies; 35+ messages in thread
From: Michal Wajdeczko @ 2017-10-26 12:50 UTC (permalink / raw)
  To: intel-gfx, Michał Winiarski; +Cc: Dave Gordon

On Wed, 25 Oct 2017 22:00:13 +0200, Michał Winiarski  
<michal.winiarski@intel.com> wrote:

> From: Dave Gordon <david.s.gordon@intel.com>
>
> This second client is created with priority KMD_HIGH, and marked
> as preemptive. This will allow us to request preemption using GuC  
> actions.
>
> v2: Extract clients creation into a helper, debugfs fixups. (Michał)
> Recreate doorbell on init. (Daniele)
> Move clients into an array.
>
> v3: And move clients back from an array, to get rid of the enum (Michał)
>
> Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
> Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
> Cc: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
> Cc: Jeff McGee <jeff.mcgee@intel.com>
> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
> Cc: Oscar Mateo <oscar.mateo@intel.com>
> ---
>  drivers/gpu/drm/i915/i915_debugfs.c        |   2 +
>  drivers/gpu/drm/i915/i915_guc_submission.c | 112  
> ++++++++++++++++++++---------
>  drivers/gpu/drm/i915/intel_guc.h           |   1 +
>  3 files changed, 82 insertions(+), 33 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_debugfs.c  
> b/drivers/gpu/drm/i915/i915_debugfs.c
> index c65e381b85f3..6e178d75bb17 100644
> --- a/drivers/gpu/drm/i915/i915_debugfs.c
> +++ b/drivers/gpu/drm/i915/i915_debugfs.c
> @@ -2484,6 +2484,8 @@ static int i915_guc_info(struct seq_file *m, void  
> *data)
> 	seq_printf(m, "\nGuC execbuf client @ %p:\n", guc->execbuf_client);
>  	i915_guc_client_info(m, dev_priv, guc->execbuf_client);
> +	seq_printf(m, "\nGuC preempt client @ %p:\n", guc->preempt_client);
> +	i915_guc_client_info(m, dev_priv, guc->preempt_client);
> 	i915_guc_log_info(m, dev_priv);
> diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c  
> b/drivers/gpu/drm/i915/i915_guc_submission.c
> index d1a5613da24c..1d3123e74bc4 100644
> --- a/drivers/gpu/drm/i915/i915_guc_submission.c
> +++ b/drivers/gpu/drm/i915/i915_guc_submission.c
> @@ -33,10 +33,11 @@
>   *
>   * GuC client:
>   * A i915_guc_client refers to a submission path through GuC.  
> Currently, there
> - * is only one of these (the execbuf_client) and this one is charged  
> with all
> - * submissions to the GuC. This struct is the owner of a doorbell, a  
> process
> - * descriptor and a workqueue (all of them inside a single gem object  
> that
> - * contains all required pages for these elements).
> + * are two clients. One of them (the execbuf_client) is charged with all
> + * submissions to the GuC, the other one (preempt_client) is  
> responsible for
> + * preempting the execbuf_client. This struct is the owner of a  
> doorbell, a
> + * process descriptor and a workqueue (all of them inside a single gem  
> object
> + * that contains all required pages for these elements).
>   *
>   * GuC stage descriptor:
>   * During initialization, the driver allocates a static pool of 1024  
> such
> @@ -363,6 +364,8 @@ static void guc_stage_desc_init(struct intel_guc  
> *guc,
>  	memset(desc, 0, sizeof(*desc));
> 	desc->attribute = GUC_STAGE_DESC_ATTR_ACTIVE |  
> GUC_STAGE_DESC_ATTR_KERNEL;
> +	if (client->priority <= GUC_CLIENT_PRIORITY_HIGH)

Can we use

	if (client->priority == GUC_CLIENT_PRIORITY_KMD_HIGH ||
	    client->priority == GUC_CLIENT_PRIORITY_HIGH)

I assume the compiler will optimize it to the same asm, but then we can
avoid a condition that may be read wrong (priority less than high)


> +		desc->attribute |= GUC_STAGE_DESC_ATTR_PREEMPT;
>  	desc->stage_id = client->stage_id;
>  	desc->priority = client->priority;
>  	desc->db_id = client->doorbell_id;
> @@ -763,14 +766,18 @@ static int guc_init_doorbell_hw(struct intel_guc  
> *guc)
> 	/* Now for every client (and not only execbuf_client) make sure their
>  	 * doorbells are known by the GuC */
> -	//for (client = client_list; client != NULL; client = client->next)
> -	{
> -		ret = __create_doorbell(client);
> -		if (ret) {
> -			DRM_ERROR("Couldn't recreate client %u doorbell: %d\n",
> -				client->stage_id, ret);
> -			return ret;
> -		}
> +	ret = __create_doorbell(guc->execbuf_client);
> +	if (ret) {
> +		DRM_ERROR("Couldn't recreate client %u doorbell: %d\n",
> +			  guc->execbuf_client->stage_id, ret);

This

> +		return ret;
> +	}
> +
> +	ret = __create_doorbell(guc->preempt_client);
> +	if (ret) {
> +		DRM_ERROR("Couldn't recreate client %u doorbell: %d\n",
> +			  guc->preempt_client->stage_id, ret);

And this error can be moved to __create_doorbell to avoid duplication

> +		return ret;
>  	}
> 	/* Read back & verify all (used & unused) doorbell registers */
> @@ -895,6 +902,47 @@ static void guc_client_free(struct i915_guc_client  
> *client)
>  	kfree(client);
>  }
> +static int guc_clients_create(struct intel_guc *guc)
> +{
> +	struct drm_i915_private *dev_priv = guc_to_i915(guc);
> +	struct i915_guc_client *client;
> +
> +	client = guc_client_alloc(dev_priv,
> +				  INTEL_INFO(dev_priv)->ring_mask,
> +				  GUC_CLIENT_PRIORITY_KMD_NORMAL,
> +				  dev_priv->kernel_context);
> +	if (IS_ERR(client)) {
> +		DRM_ERROR("Failed to create GuC client for submission!\n");
> +		return PTR_ERR(client);
> +	}
> +	guc->execbuf_client = client;
> +
> +	client = guc_client_alloc(dev_priv,
> +				  INTEL_INFO(dev_priv)->ring_mask,
> +				  GUC_CLIENT_PRIORITY_KMD_HIGH,
> +				  dev_priv->preempt_context);
> +	if (IS_ERR(client)) {
> +		DRM_ERROR("Failed to create GuC client for preemption!\n");
> +		guc_client_free(guc->execbuf_client);
> +		guc->execbuf_client = NULL;
> +		return PTR_ERR(client);
> +	}
> +	guc->preempt_client = client;
> +
> +	return 0;
> +}
> +
> +static void guc_clients_destroy(struct intel_guc *guc)
> +{
> +	struct i915_guc_client *client;
> +
> +	client = fetch_and_zero(&guc->execbuf_client);
> +	guc_client_free(client);
> +
> +	client = fetch_and_zero(&guc->preempt_client);
> +	guc_client_free(client);
> +}
> +
>  static void guc_policy_init(struct guc_policy *policy)
>  {
>  	policy->execution_quantum = POLICY_DEFAULT_EXECUTION_QUANTUM_US;
> @@ -1134,7 +1182,6 @@ static void i915_guc_submission_unpark(struct  
> intel_engine_cs *engine)
>  int i915_guc_submission_enable(struct drm_i915_private *dev_priv)
>  {
>  	struct intel_guc *guc = &dev_priv->guc;
> -	struct i915_guc_client *client = guc->execbuf_client;
>  	struct intel_engine_cs *engine;
>  	enum intel_engine_id id;
>  	int err;
> @@ -1152,28 +1199,29 @@ int i915_guc_submission_enable(struct  
> drm_i915_private *dev_priv)
>  		     sizeof(struct guc_wq_item) *
>  		     I915_NUM_ENGINES > GUC_WQ_SIZE);
> -	if (!client) {
> -		client = guc_client_alloc(dev_priv,
> -					  INTEL_INFO(dev_priv)->ring_mask,
> -					  GUC_CLIENT_PRIORITY_KMD_NORMAL,
> -					  dev_priv->kernel_context);
> -		if (IS_ERR(client)) {
> -			DRM_ERROR("Failed to create GuC client for execbuf!\n");
> -			return PTR_ERR(client);
> -		}
> -
> -		guc->execbuf_client = client;
> +	/*
> +	 * We're being called on both module initialization and on reset,
> +	 * until this flow is changed, we're using regular client presence to
> +	 * determine which case we are in, and whether we should allocate new
> +	 * clients or just reset their workqueues.
> +	 */
> +	if (!guc->execbuf_client) {
> +		GEM_BUG_ON(guc->preempt_client);

You can move this GEM_BUG_ON to guc_clients_create and check both

> +		err = guc_clients_create(guc);
> +		if (err)
> +			return err;
> +	} else {
> +		guc_reset_wq(guc->execbuf_client);
> +		guc_reset_wq(guc->preempt_client);
>  	}
> 	err = intel_guc_sample_forcewake(guc);
>  	if (err)
> -		goto err_execbuf_client;
> -
> -	guc_reset_wq(client);
> +		goto err_free_clients;
> 	err = guc_init_doorbell_hw(guc);
>  	if (err)
> -		goto err_execbuf_client;
> +		goto err_free_clients;
> 	/* Take over from manual control of ELSP (execlists) */
>  	guc_interrupts_capture(dev_priv);
> @@ -1187,9 +1235,8 @@ int i915_guc_submission_enable(struct  
> drm_i915_private *dev_priv)
> 	return 0;
> -err_execbuf_client:
> -	guc_client_free(guc->execbuf_client);
> -	guc->execbuf_client = NULL;
> +err_free_clients:
> +	guc_clients_destroy(guc);
>  	return err;
>  }
> @@ -1204,6 +1251,5 @@ void i915_guc_submission_disable(struct  
> drm_i915_private *dev_priv)
>  	/* Revert back to manual ELSP submission */
>  	intel_engines_reset_default_submission(dev_priv);
> -	guc_client_free(guc->execbuf_client);
> -	guc->execbuf_client = NULL;
> +	guc_clients_destroy(guc);
>  }
> diff --git a/drivers/gpu/drm/i915/intel_guc.h  
> b/drivers/gpu/drm/i915/intel_guc.h
> index aa1583167b0a..6ca55c5bed3c 100644
> --- a/drivers/gpu/drm/i915/intel_guc.h
> +++ b/drivers/gpu/drm/i915/intel_guc.h
> @@ -58,6 +58,7 @@ struct intel_guc {
>  	void *shared_data_vaddr;
> 	struct i915_guc_client *execbuf_client;
> +	struct i915_guc_client *preempt_client;
> 	DECLARE_BITMAP(doorbell_bitmap, GUC_NUM_DOORBELLS);
>  	/* Cyclic counter mod pagesize	*/
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 35+ messages in thread

* [PATCH v4] drm/i915/guc: Add a second client, to be used for preemption
  2017-10-25 20:00 ` [PATCH v3 05/12] drm/i915/guc: Add a second client, to be used for preemption Michał Winiarski
  2017-10-26 12:50   ` Michal Wajdeczko
@ 2017-10-26 13:20   ` Michał Winiarski
  2017-10-26 13:32     ` [PATCH v5] " Michał Winiarski
  1 sibling, 1 reply; 35+ messages in thread
From: Michał Winiarski @ 2017-10-26 13:20 UTC (permalink / raw)
  To: intel-gfx; +Cc: Dave Gordon

From: Dave Gordon <david.s.gordon@intel.com>

This second client is created with priority KMD_HIGH, and marked
as preemptive. This will allow us to request preemption using GuC actions.

v2: Extract clients creation into a helper, debugfs fixups. (Michał)
Recreate doorbell on init. (Daniele)
Move clients into an array.

v3: And move clients back from an array, to get rid of the enum (Michał)

v4: Use is_high_priority, move DRM_ERROR into __create_doorbell, move
GEM_BUG_ON inside guc_clients_create (Michał)

Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Jeff McGee <jeff.mcgee@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Oscar Mateo <oscar.mateo@intel.com>
---
 drivers/gpu/drm/i915/i915_debugfs.c        |   2 +
 drivers/gpu/drm/i915/i915_guc_submission.c | 117 ++++++++++++++++++++---------
 drivers/gpu/drm/i915/intel_guc.h           |   1 +
 3 files changed, 84 insertions(+), 36 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index c65e381b85f3..6e178d75bb17 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -2484,6 +2484,8 @@ static int i915_guc_info(struct seq_file *m, void *data)
 
 	seq_printf(m, "\nGuC execbuf client @ %p:\n", guc->execbuf_client);
 	i915_guc_client_info(m, dev_priv, guc->execbuf_client);
+	seq_printf(m, "\nGuC preempt client @ %p:\n", guc->preempt_client);
+	i915_guc_client_info(m, dev_priv, guc->preempt_client);
 
 	i915_guc_log_info(m, dev_priv);
 
diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
index d1a5613da24c..4c8a0241d953 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -33,10 +33,11 @@
  *
  * GuC client:
  * A i915_guc_client refers to a submission path through GuC. Currently, there
- * is only one of these (the execbuf_client) and this one is charged with all
- * submissions to the GuC. This struct is the owner of a doorbell, a process
- * descriptor and a workqueue (all of them inside a single gem object that
- * contains all required pages for these elements).
+ * are two clients. One of them (the execbuf_client) is charged with all
+ * submissions to the GuC, the other one (preempt_client) is responsible for
+ * preempting the execbuf_client. This struct is the owner of a doorbell, a
+ * process descriptor and a workqueue (all of them inside a single gem object
+ * that contains all required pages for these elements).
  *
  * GuC stage descriptor:
  * During initialization, the driver allocates a static pool of 1024 such
@@ -83,7 +84,8 @@
 
 static inline bool is_high_priority(struct i915_guc_client* client)
 {
-	return client->priority <= GUC_CLIENT_PRIORITY_HIGH;
+	return (client->priority == GUC_CLIENT_PRIORITY_KMD_HIGH ||
+		client->priority == GUC_CLIENT_PRIORITY_HIGH);
 }
 
 static int __reserve_doorbell(struct i915_guc_client *client)
@@ -196,8 +198,11 @@ static int __create_doorbell(struct i915_guc_client *client)
 	doorbell->cookie = 0;
 
 	err = __guc_allocate_doorbell(client->guc, client->stage_id);
-	if (err)
+	if (err) {
 		doorbell->db_status = GUC_DOORBELL_DISABLED;
+		DRM_ERROR("Couldn't create client %u doorbell: %d\n",
+			  client->stage_id, err);
+	}
 
 	return err;
 }
@@ -363,6 +368,8 @@ static void guc_stage_desc_init(struct intel_guc *guc,
 	memset(desc, 0, sizeof(*desc));
 
 	desc->attribute = GUC_STAGE_DESC_ATTR_ACTIVE | GUC_STAGE_DESC_ATTR_KERNEL;
+	if (is_high_priority(client))
+		desc->attribute |= GUC_STAGE_DESC_ATTR_PREEMPT;
 	desc->stage_id = client->stage_id;
 	desc->priority = client->priority;
 	desc->db_id = client->doorbell_id;
@@ -763,15 +770,13 @@ static int guc_init_doorbell_hw(struct intel_guc *guc)
 
 	/* Now for every client (and not only execbuf_client) make sure their
 	 * doorbells are known by the GuC */
-	//for (client = client_list; client != NULL; client = client->next)
-	{
-		ret = __create_doorbell(client);
-		if (ret) {
-			DRM_ERROR("Couldn't recreate client %u doorbell: %d\n",
-				client->stage_id, ret);
-			return ret;
-		}
-	}
+	ret = __create_doorbell(guc->execbuf_client);
+	if (ret)
+		return ret;
+
+	ret = __create_doorbell(guc->preempt_client);
+	if (ret)
+		return ret;
 
 	/* Read back & verify all (used & unused) doorbell registers */
 	for (db_id = 0; db_id < GUC_NUM_DOORBELLS; ++db_id)
@@ -895,6 +900,49 @@ static void guc_client_free(struct i915_guc_client *client)
 	kfree(client);
 }
 
+static int guc_clients_create(struct intel_guc *guc)
+{
+	struct drm_i915_private *dev_priv = guc_to_i915(guc);
+	struct i915_guc_client *client;
+
+	GEM_BUG_ON(guc->execbuf_client || guc->preempt_client);
+
+	client = guc_client_alloc(dev_priv,
+				  INTEL_INFO(dev_priv)->ring_mask,
+				  GUC_CLIENT_PRIORITY_KMD_NORMAL,
+				  dev_priv->kernel_context);
+	if (IS_ERR(client)) {
+		DRM_ERROR("Failed to create GuC client for submission!\n");
+		return PTR_ERR(client);
+	}
+	guc->execbuf_client = client;
+
+	client = guc_client_alloc(dev_priv,
+				  INTEL_INFO(dev_priv)->ring_mask,
+				  GUC_CLIENT_PRIORITY_KMD_HIGH,
+				  dev_priv->preempt_context);
+	if (IS_ERR(client)) {
+		DRM_ERROR("Failed to create GuC client for preemption!\n");
+		guc_client_free(guc->execbuf_client);
+		guc->execbuf_client = NULL;
+		return PTR_ERR(client);
+	}
+	guc->preempt_client = client;
+
+	return 0;
+}
+
+static void guc_clients_destroy(struct intel_guc *guc)
+{
+	struct i915_guc_client *client;
+
+	client = fetch_and_zero(&guc->execbuf_client);
+	guc_client_free(client);
+
+	client = fetch_and_zero(&guc->preempt_client);
+	guc_client_free(client);
+}
+
 static void guc_policy_init(struct guc_policy *policy)
 {
 	policy->execution_quantum = POLICY_DEFAULT_EXECUTION_QUANTUM_US;
@@ -1134,7 +1182,6 @@ static void i915_guc_submission_unpark(struct intel_engine_cs *engine)
 int i915_guc_submission_enable(struct drm_i915_private *dev_priv)
 {
 	struct intel_guc *guc = &dev_priv->guc;
-	struct i915_guc_client *client = guc->execbuf_client;
 	struct intel_engine_cs *engine;
 	enum intel_engine_id id;
 	int err;
@@ -1152,28 +1199,28 @@ int i915_guc_submission_enable(struct drm_i915_private *dev_priv)
 		     sizeof(struct guc_wq_item) *
 		     I915_NUM_ENGINES > GUC_WQ_SIZE);
 
-	if (!client) {
-		client = guc_client_alloc(dev_priv,
-					  INTEL_INFO(dev_priv)->ring_mask,
-					  GUC_CLIENT_PRIORITY_KMD_NORMAL,
-					  dev_priv->kernel_context);
-		if (IS_ERR(client)) {
-			DRM_ERROR("Failed to create GuC client for execbuf!\n");
-			return PTR_ERR(client);
-		}
-
-		guc->execbuf_client = client;
+	/*
+	 * We're being called on both module initialization and on reset,
+	 * until this flow is changed, we're using regular client presence to
+	 * determine which case we are in, and whether we should allocate new
+	 * clients or just reset their workqueues.
+	 */
+	if (!guc->execbuf_client) {
+		err = guc_clients_create(guc);
+		if (err)
+			return err;
+	} else {
+		guc_reset_wq(guc->execbuf_client);
+		guc_reset_wq(guc->preempt_client);
 	}
 
 	err = intel_guc_sample_forcewake(guc);
 	if (err)
-		goto err_execbuf_client;
-
-	guc_reset_wq(client);
+		goto err_free_clients;
 
 	err = guc_init_doorbell_hw(guc);
 	if (err)
-		goto err_execbuf_client;
+		goto err_free_clients;
 
 	/* Take over from manual control of ELSP (execlists) */
 	guc_interrupts_capture(dev_priv);
@@ -1187,9 +1234,8 @@ int i915_guc_submission_enable(struct drm_i915_private *dev_priv)
 
 	return 0;
 
-err_execbuf_client:
-	guc_client_free(guc->execbuf_client);
-	guc->execbuf_client = NULL;
+err_free_clients:
+	guc_clients_destroy(guc);
 	return err;
 }
 
@@ -1204,6 +1250,5 @@ void i915_guc_submission_disable(struct drm_i915_private *dev_priv)
 	/* Revert back to manual ELSP submission */
 	intel_engines_reset_default_submission(dev_priv);
 
-	guc_client_free(guc->execbuf_client);
-	guc->execbuf_client = NULL;
+	guc_clients_destroy(guc);
 }
diff --git a/drivers/gpu/drm/i915/intel_guc.h b/drivers/gpu/drm/i915/intel_guc.h
index aa1583167b0a..6ca55c5bed3c 100644
--- a/drivers/gpu/drm/i915/intel_guc.h
+++ b/drivers/gpu/drm/i915/intel_guc.h
@@ -58,6 +58,7 @@ struct intel_guc {
 	void *shared_data_vaddr;
 
 	struct i915_guc_client *execbuf_client;
+	struct i915_guc_client *preempt_client;
 
 	DECLARE_BITMAP(doorbell_bitmap, GUC_NUM_DOORBELLS);
 	/* Cyclic counter mod pagesize	*/
-- 
2.13.6

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH v5] drm/i915/guc: Add a second client, to be used for preemption
  2017-10-26 13:20   ` [PATCH v4] " Michał Winiarski
@ 2017-10-26 13:32     ` Michał Winiarski
  2017-10-26 13:44       ` Michal Wajdeczko
  2017-10-26 14:17       ` [PATCH v6] " Michał Winiarski
  0 siblings, 2 replies; 35+ messages in thread
From: Michał Winiarski @ 2017-10-26 13:32 UTC (permalink / raw)
  To: intel-gfx; +Cc: Dave Gordon

From: Dave Gordon <david.s.gordon@intel.com>

This second client is created with priority KMD_HIGH, and marked
as preemptive. This will allow us to request preemption using GuC actions.

v2: Extract clients creation into a helper, debugfs fixups. (Michał)
Recreate doorbell on init. (Daniele)
Move clients into an array.

v3: And move clients back from an array, to get rid of the enum (Michał)

v4: Use is_high_priority, move DRM_ERROR into __create_doorbell, move
GEM_BUG_ON inside guc_clients_create (Michał)

v5: Split the BUG_ON (Michał)

Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Jeff McGee <jeff.mcgee@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Oscar Mateo <oscar.mateo@intel.com>
---
 drivers/gpu/drm/i915/i915_debugfs.c        |   2 +
 drivers/gpu/drm/i915/i915_guc_submission.c | 118 ++++++++++++++++++++---------
 drivers/gpu/drm/i915/intel_guc.h           |   1 +
 3 files changed, 85 insertions(+), 36 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index c65e381b85f3..6e178d75bb17 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -2484,6 +2484,8 @@ static int i915_guc_info(struct seq_file *m, void *data)
 
 	seq_printf(m, "\nGuC execbuf client @ %p:\n", guc->execbuf_client);
 	i915_guc_client_info(m, dev_priv, guc->execbuf_client);
+	seq_printf(m, "\nGuC preempt client @ %p:\n", guc->preempt_client);
+	i915_guc_client_info(m, dev_priv, guc->preempt_client);
 
 	i915_guc_log_info(m, dev_priv);
 
diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
index d1a5613da24c..31c50ad46eb7 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -33,10 +33,11 @@
  *
  * GuC client:
  * A i915_guc_client refers to a submission path through GuC. Currently, there
- * is only one of these (the execbuf_client) and this one is charged with all
- * submissions to the GuC. This struct is the owner of a doorbell, a process
- * descriptor and a workqueue (all of them inside a single gem object that
- * contains all required pages for these elements).
+ * are two clients. One of them (the execbuf_client) is charged with all
+ * submissions to the GuC, the other one (preempt_client) is responsible for
+ * preempting the execbuf_client. This struct is the owner of a doorbell, a
+ * process descriptor and a workqueue (all of them inside a single gem object
+ * that contains all required pages for these elements).
  *
  * GuC stage descriptor:
  * During initialization, the driver allocates a static pool of 1024 such
@@ -83,7 +84,8 @@
 
 static inline bool is_high_priority(struct i915_guc_client* client)
 {
-	return client->priority <= GUC_CLIENT_PRIORITY_HIGH;
+	return (client->priority == GUC_CLIENT_PRIORITY_KMD_HIGH ||
+		client->priority == GUC_CLIENT_PRIORITY_HIGH);
 }
 
 static int __reserve_doorbell(struct i915_guc_client *client)
@@ -196,8 +198,11 @@ static int __create_doorbell(struct i915_guc_client *client)
 	doorbell->cookie = 0;
 
 	err = __guc_allocate_doorbell(client->guc, client->stage_id);
-	if (err)
+	if (err) {
 		doorbell->db_status = GUC_DOORBELL_DISABLED;
+		DRM_ERROR("Couldn't create client %u doorbell: %d\n",
+			  client->stage_id, err);
+	}
 
 	return err;
 }
@@ -363,6 +368,8 @@ static void guc_stage_desc_init(struct intel_guc *guc,
 	memset(desc, 0, sizeof(*desc));
 
 	desc->attribute = GUC_STAGE_DESC_ATTR_ACTIVE | GUC_STAGE_DESC_ATTR_KERNEL;
+	if (is_high_priority(client))
+		desc->attribute |= GUC_STAGE_DESC_ATTR_PREEMPT;
 	desc->stage_id = client->stage_id;
 	desc->priority = client->priority;
 	desc->db_id = client->doorbell_id;
@@ -763,15 +770,13 @@ static int guc_init_doorbell_hw(struct intel_guc *guc)
 
 	/* Now for every client (and not only execbuf_client) make sure their
 	 * doorbells are known by the GuC */
-	//for (client = client_list; client != NULL; client = client->next)
-	{
-		ret = __create_doorbell(client);
-		if (ret) {
-			DRM_ERROR("Couldn't recreate client %u doorbell: %d\n",
-				client->stage_id, ret);
-			return ret;
-		}
-	}
+	ret = __create_doorbell(guc->execbuf_client);
+	if (ret)
+		return ret;
+
+	ret = __create_doorbell(guc->preempt_client);
+	if (ret)
+		return ret;
 
 	/* Read back & verify all (used & unused) doorbell registers */
 	for (db_id = 0; db_id < GUC_NUM_DOORBELLS; ++db_id)
@@ -895,6 +900,50 @@ static void guc_client_free(struct i915_guc_client *client)
 	kfree(client);
 }
 
+static int guc_clients_create(struct intel_guc *guc)
+{
+	struct drm_i915_private *dev_priv = guc_to_i915(guc);
+	struct i915_guc_client *client;
+
+	GEM_BUG_ON(guc->execbuf_client);
+	GEM_BUG_ON(guc->preempt_client);
+
+	client = guc_client_alloc(dev_priv,
+				  INTEL_INFO(dev_priv)->ring_mask,
+				  GUC_CLIENT_PRIORITY_KMD_NORMAL,
+				  dev_priv->kernel_context);
+	if (IS_ERR(client)) {
+		DRM_ERROR("Failed to create GuC client for submission!\n");
+		return PTR_ERR(client);
+	}
+	guc->execbuf_client = client;
+
+	client = guc_client_alloc(dev_priv,
+				  INTEL_INFO(dev_priv)->ring_mask,
+				  GUC_CLIENT_PRIORITY_KMD_HIGH,
+				  dev_priv->preempt_context);
+	if (IS_ERR(client)) {
+		DRM_ERROR("Failed to create GuC client for preemption!\n");
+		guc_client_free(guc->execbuf_client);
+		guc->execbuf_client = NULL;
+		return PTR_ERR(client);
+	}
+	guc->preempt_client = client;
+
+	return 0;
+}
+
+static void guc_clients_destroy(struct intel_guc *guc)
+{
+	struct i915_guc_client *client;
+
+	client = fetch_and_zero(&guc->execbuf_client);
+	guc_client_free(client);
+
+	client = fetch_and_zero(&guc->preempt_client);
+	guc_client_free(client);
+}
+
 static void guc_policy_init(struct guc_policy *policy)
 {
 	policy->execution_quantum = POLICY_DEFAULT_EXECUTION_QUANTUM_US;
@@ -1134,7 +1183,6 @@ static void i915_guc_submission_unpark(struct intel_engine_cs *engine)
 int i915_guc_submission_enable(struct drm_i915_private *dev_priv)
 {
 	struct intel_guc *guc = &dev_priv->guc;
-	struct i915_guc_client *client = guc->execbuf_client;
 	struct intel_engine_cs *engine;
 	enum intel_engine_id id;
 	int err;
@@ -1152,28 +1200,28 @@ int i915_guc_submission_enable(struct drm_i915_private *dev_priv)
 		     sizeof(struct guc_wq_item) *
 		     I915_NUM_ENGINES > GUC_WQ_SIZE);
 
-	if (!client) {
-		client = guc_client_alloc(dev_priv,
-					  INTEL_INFO(dev_priv)->ring_mask,
-					  GUC_CLIENT_PRIORITY_KMD_NORMAL,
-					  dev_priv->kernel_context);
-		if (IS_ERR(client)) {
-			DRM_ERROR("Failed to create GuC client for execbuf!\n");
-			return PTR_ERR(client);
-		}
-
-		guc->execbuf_client = client;
+	/*
+	 * We're being called on both module initialization and on reset,
+	 * until this flow is changed, we're using regular client presence to
+	 * determine which case we are in, and whether we should allocate new
+	 * clients or just reset their workqueues.
+	 */
+	if (!guc->execbuf_client) {
+		err = guc_clients_create(guc);
+		if (err)
+			return err;
+	} else {
+		guc_reset_wq(guc->execbuf_client);
+		guc_reset_wq(guc->preempt_client);
 	}
 
 	err = intel_guc_sample_forcewake(guc);
 	if (err)
-		goto err_execbuf_client;
-
-	guc_reset_wq(client);
+		goto err_free_clients;
 
 	err = guc_init_doorbell_hw(guc);
 	if (err)
-		goto err_execbuf_client;
+		goto err_free_clients;
 
 	/* Take over from manual control of ELSP (execlists) */
 	guc_interrupts_capture(dev_priv);
@@ -1187,9 +1235,8 @@ int i915_guc_submission_enable(struct drm_i915_private *dev_priv)
 
 	return 0;
 
-err_execbuf_client:
-	guc_client_free(guc->execbuf_client);
-	guc->execbuf_client = NULL;
+err_free_clients:
+	guc_clients_destroy(guc);
 	return err;
 }
 
@@ -1204,6 +1251,5 @@ void i915_guc_submission_disable(struct drm_i915_private *dev_priv)
 	/* Revert back to manual ELSP submission */
 	intel_engines_reset_default_submission(dev_priv);
 
-	guc_client_free(guc->execbuf_client);
-	guc->execbuf_client = NULL;
+	guc_clients_destroy(guc);
 }
diff --git a/drivers/gpu/drm/i915/intel_guc.h b/drivers/gpu/drm/i915/intel_guc.h
index aa1583167b0a..6ca55c5bed3c 100644
--- a/drivers/gpu/drm/i915/intel_guc.h
+++ b/drivers/gpu/drm/i915/intel_guc.h
@@ -58,6 +58,7 @@ struct intel_guc {
 	void *shared_data_vaddr;
 
 	struct i915_guc_client *execbuf_client;
+	struct i915_guc_client *preempt_client;
 
 	DECLARE_BITMAP(doorbell_bitmap, GUC_NUM_DOORBELLS);
 	/* Cyclic counter mod pagesize	*/
-- 
2.13.6

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH v6] drm/i915/guc: Preemption! With GuC
  2017-10-26  7:27   ` [PATCH v5] " Michał Winiarski
@ 2017-10-26 13:35     ` Michał Winiarski
  0 siblings, 0 replies; 35+ messages in thread
From: Michał Winiarski @ 2017-10-26 13:35 UTC (permalink / raw)
  To: intel-gfx

Pretty similar to what we have on execlists.
We're reusing most of the GEM code; however, due to GuC quirks we need a
couple of extra bits.
Preemption is implemented as a GuC action, and actions can be pretty
slow. Because of that, we're using a mutex to serialize them. Since
we're requesting preemption from the tasklet, the task of creating a
workitem and wrapping it in a GuC action is delegated to a worker.

To distinguish that preemption has finished, we're using an additional
piece of the HWSP, and since we're not getting context switch
interrupts, we're also adding a user interrupt.

The fact that our special preempt context has completed unfortunately
doesn't mean that we're ready to submit new work. We also need to wait
for GuC to finish its own processing.

v2: Don't compile out the wait for GuC, handle workqueue flush on reset,
no need for ordered workqueue, put on a reviewer hat when looking at my own
patches (Chris)
Move struct work around in intel_guc, move user interrupt outside of
conditional (Michał)
Keep ring around rather than chase through intel_context

v3: Extract WA for flushing ggtt writes to a helper (Chris)
Keep work_struct in intel_guc rather than engine (Michał)
Use ordered workqueue for inject_preempt worker to avoid GuC quirks.

v4: Drop now unused INTEL_GUC_PREEMPT_OPTION_IMMEDIATE (Daniele)
Drop stray newlines, use container_of for intel_guc in worker,
check for presence of workqueue when flushing it, rather than
enable_guc_submission modparam, reorder preempt postprocessing (Chris)

v5: Make wq NULL after destroying it

v6: Swap struct guc_preempt_work members (Michał)

Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Jeff McGee <jeff.mcgee@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Oscar Mateo <oscar.mateo@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/i915_drv.c            |   3 +-
 drivers/gpu/drm/i915/i915_gem.c            |  10 ++
 drivers/gpu/drm/i915/i915_guc_submission.c | 208 +++++++++++++++++++++++++++--
 drivers/gpu/drm/i915/intel_guc.h           |   8 ++
 drivers/gpu/drm/i915/intel_lrc.c           |   4 +-
 drivers/gpu/drm/i915/intel_lrc.h           |   1 -
 drivers/gpu/drm/i915/intel_ringbuffer.h    |   6 +
 7 files changed, 222 insertions(+), 18 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index 7b871802ae36..af745749509c 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -373,8 +373,7 @@ static int i915_getparam(struct drm_device *dev, void *data,
 			value |= I915_SCHEDULER_CAP_PRIORITY;
 
 			if (HAS_LOGICAL_RING_PREEMPTION(dev_priv) &&
-			    i915_modparams.enable_execlists &&
-			    !i915_modparams.enable_guc_submission)
+			    i915_modparams.enable_execlists)
 				value |= I915_SCHEDULER_CAP_PREEMPTION;
 		}
 		break;
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index d803ef5f4a7f..c2506fb3a483 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2921,6 +2921,16 @@ i915_gem_reset_prepare_engine(struct intel_engine_cs *engine)
 	tasklet_kill(&engine->execlists.irq_tasklet);
 	tasklet_disable(&engine->execlists.irq_tasklet);
 
+	/*
+	 * We're using a worker to queue preemption requests from the tasklet
+	 * in GuC submission mode.
+	 * Even though the tasklet was disabled, we may still have a worker queued.
+	 * Let's make sure that all workers scheduled before disabling the
+	 * tasklet are completed before continuing with the reset.
+	 */
+	if (engine->i915->guc.preempt_wq)
+		flush_workqueue(engine->i915->guc.preempt_wq);
+
 	if (engine->irq_seqno_barrier)
 		engine->irq_seqno_barrier(engine);
 
diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
index 63641811de46..bf8d1e54aa5c 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -565,6 +565,108 @@ static void guc_add_request(struct intel_guc *guc,
 	spin_unlock(&client->wq_lock);
 }
 
+/*
+ * When we're doing submissions using the regular execlists backend, writing
+ * to ELSP from the CPU side is enough to make sure that writes to ringbuffer
+ * pages pinned in the mappable aperture portion of the GGTT are visible to
+ * the command streamer. Writes done by GuC on our behalf do not guarantee
+ * such ordering; therefore, to ensure the flush, we're issuing a
+ * POSTING READ.
+ */
+static void flush_ggtt_writes(struct i915_vma *vma)
+{
+	struct drm_i915_private *dev_priv = to_i915(vma->obj->base.dev);
+
+	if (i915_vma_is_map_and_fenceable(vma))
+		POSTING_READ_FW(GUC_STATUS);
+}
+
+#define GUC_PREEMPT_FINISHED 0x1
+#define GUC_PREEMPT_BREADCRUMB_DWORDS 0x8
+static void inject_preempt_context(struct work_struct *work)
+{
+	struct guc_preempt_work *preempt_work =
+		container_of(work, typeof(*preempt_work), work);
+	struct intel_engine_cs *engine = preempt_work->engine;
+	struct intel_guc *guc = container_of(preempt_work, typeof(*guc),
+					     preempt_work[engine->id]);
+	struct i915_guc_client *client = guc->preempt_client;
+	struct intel_ring *ring = client->owner->engine[engine->id].ring;
+	u32 ctx_desc = lower_32_bits(intel_lr_context_descriptor(client->owner,
+								 engine));
+	u32 *cs = ring->vaddr + ring->tail;
+	u32 data[7];
+
+	if (engine->id == RCS) {
+		cs = gen8_emit_ggtt_write_rcs(cs, GUC_PREEMPT_FINISHED,
+				intel_hws_preempt_done_address(engine));
+	} else {
+		cs = gen8_emit_ggtt_write(cs, GUC_PREEMPT_FINISHED,
+				intel_hws_preempt_done_address(engine));
+		*cs++ = MI_NOOP;
+		*cs++ = MI_NOOP;
+	}
+	*cs++ = MI_USER_INTERRUPT;
+	*cs++ = MI_NOOP;
+
+	GEM_BUG_ON(!IS_ALIGNED(ring->size,
+			       GUC_PREEMPT_BREADCRUMB_DWORDS * sizeof(u32)));
+	GEM_BUG_ON((void *)cs - (ring->vaddr + ring->tail) !=
+		   GUC_PREEMPT_BREADCRUMB_DWORDS * sizeof(u32));
+
+	ring->tail += GUC_PREEMPT_BREADCRUMB_DWORDS * sizeof(u32);
+	ring->tail &= (ring->size - 1);
+
+	flush_ggtt_writes(ring->vma);
+
+	spin_lock_irq(&client->wq_lock);
+	guc_wq_item_append(client, engine->guc_id, ctx_desc,
+			   ring->tail / sizeof(u64), 0);
+	spin_unlock_irq(&client->wq_lock);
+
+	data[0] = INTEL_GUC_ACTION_REQUEST_PREEMPTION;
+	data[1] = client->stage_id;
+	data[2] = INTEL_GUC_PREEMPT_OPTION_DROP_WORK_Q |
+		  INTEL_GUC_PREEMPT_OPTION_DROP_SUBMIT_Q;
+	data[3] = engine->guc_id;
+	data[4] = guc->execbuf_client->priority;
+	data[5] = guc->execbuf_client->stage_id;
+	data[6] = guc_ggtt_offset(guc->shared_data);
+
+	if (WARN_ON(intel_guc_send(guc, data, ARRAY_SIZE(data)))) {
+		execlists_clear_active(&engine->execlists,
+				       EXECLISTS_ACTIVE_PREEMPT);
+		tasklet_schedule(&engine->execlists.irq_tasklet);
+	}
+}
+
+/*
+ * We're using a user interrupt and a HWSP value to mark that preemption has
+ * finished and the GPU is idle. Normally, we could unwind and continue,
+ * similar to the execlists submission path. Unfortunately, with GuC we also
+ * need to wait for it to finish its own postprocessing before attempting to
+ * submit. Otherwise GuC may silently ignore our submissions, and thus we
+ * risk losing a request at best, executing out-of-order and causing a
+ * kernel panic at worst.
+ */
+#define GUC_PREEMPT_POSTPROCESS_DELAY_MS 10
+static void wait_for_guc_preempt_report(struct intel_engine_cs *engine)
+{
+	struct intel_guc *guc = &engine->i915->guc;
+	struct guc_shared_ctx_data *data = guc->shared_data_vaddr;
+	struct guc_ctx_report *report =
+		&data->preempt_ctx_report[engine->guc_id];
+
+	WARN_ON(wait_for_atomic(report->report_return_status ==
+				INTEL_GUC_REPORT_STATUS_COMPLETE,
+				GUC_PREEMPT_POSTPROCESS_DELAY_MS));
+	/*
+	 * GuC is expecting that we're also going to clear the affected context
+	 * counter, let's also reset the return status to not depend on GuC
+	 * resetting it after receiving another preempt action
+	 */
+	report->affected_count = 0;
+	report->report_return_status = INTEL_GUC_REPORT_STATUS_UNKNOWN;
+}
+
 /**
  * i915_guc_submit() - Submit commands through GuC
  * @engine: engine associated with the commands
@@ -574,8 +676,7 @@ static void guc_add_request(struct intel_guc *guc,
  */
 static void i915_guc_submit(struct intel_engine_cs *engine)
 {
-	struct drm_i915_private *dev_priv = engine->i915;
-	struct intel_guc *guc = &dev_priv->guc;
+	struct intel_guc *guc = &engine->i915->guc;
 	struct intel_engine_execlists * const execlists = &engine->execlists;
 	struct execlist_port *port = execlists->port;
 	unsigned int n;
@@ -588,8 +689,7 @@ static void i915_guc_submit(struct intel_engine_cs *engine)
 		if (rq && count == 0) {
 			port_set(&port[n], port_pack(rq, ++count));
 
-			if (i915_vma_is_map_and_fenceable(rq->ring->vma))
-				POSTING_READ_FW(GUC_STATUS);
+			flush_ggtt_writes(rq->ring->vma);
 
 			guc_add_request(guc, rq);
 		}
@@ -617,13 +717,32 @@ static void i915_guc_dequeue(struct intel_engine_cs *engine)
 	bool submit = false;
 	struct rb_node *rb;
 
-	if (port_isset(port))
-		port++;
-
 	spin_lock_irq(&engine->timeline->lock);
 	rb = execlists->first;
 	GEM_BUG_ON(rb_first(&execlists->queue) != rb);
-	while (rb) {
+
+	if (!rb)
+		goto unlock;
+
+	if (HAS_LOGICAL_RING_PREEMPTION(engine->i915) && port_isset(port)) {
+		struct guc_preempt_work *preempt_work =
+			&engine->i915->guc.preempt_work[engine->id];
+
+		if (rb_entry(rb, struct i915_priolist, node)->priority >
+		    max(port_request(port)->priotree.priority, 0)) {
+			execlists_set_active(execlists,
+					     EXECLISTS_ACTIVE_PREEMPT);
+			queue_work(engine->i915->guc.preempt_wq,
+				   &preempt_work->work);
+			goto unlock;
+		} else if (port_isset(last_port)) {
+			goto unlock;
+		}
+
+		port++;
+	}
+
+	do {
 		struct i915_priolist *p = rb_entry(rb, typeof(*p), node);
 		struct drm_i915_gem_request *rq, *rn;
 
@@ -653,7 +772,7 @@ static void i915_guc_dequeue(struct intel_engine_cs *engine)
 		INIT_LIST_HEAD(&p->requests);
 		if (p->priority != I915_PRIORITY_NORMAL)
 			kmem_cache_free(engine->i915->priorities, p);
-	}
+	} while (rb);
 done:
 	execlists->first = rb;
 	if (submit) {
@@ -661,6 +780,7 @@ static void i915_guc_dequeue(struct intel_engine_cs *engine)
 		execlists_set_active(execlists, EXECLISTS_ACTIVE_USER);
 		i915_guc_submit(engine);
 	}
+unlock:
 	spin_unlock_irq(&engine->timeline->lock);
 }
 
@@ -669,8 +789,6 @@ static void i915_guc_irq_handler(unsigned long data)
 	struct intel_engine_cs * const engine = (struct intel_engine_cs *)data;
 	struct intel_engine_execlists * const execlists = &engine->execlists;
 	struct execlist_port *port = execlists->port;
-	const struct execlist_port * const last_port =
-		&execlists->port[execlists->port_mask];
 	struct drm_i915_gem_request *rq;
 
 	rq = port_request(&port[0]);
@@ -685,7 +803,19 @@ static void i915_guc_irq_handler(unsigned long data)
 	if (!rq)
 		execlists_clear_active(execlists, EXECLISTS_ACTIVE_USER);
 
-	if (!port_isset(last_port))
+	if (execlists_is_active(execlists, EXECLISTS_ACTIVE_PREEMPT) &&
+	    intel_read_status_page(engine, I915_GEM_HWS_PREEMPT_INDEX) ==
+	    GUC_PREEMPT_FINISHED) {
+		execlists_cancel_port_requests(&engine->execlists);
+		execlists_unwind_incomplete_requests(execlists);
+
+		wait_for_guc_preempt_report(engine);
+
+		execlists_clear_active(execlists, EXECLISTS_ACTIVE_PREEMPT);
+		intel_write_status_page(engine, I915_GEM_HWS_PREEMPT_INDEX, 0);
+	}
+
+	if (!execlists_is_active(execlists, EXECLISTS_ACTIVE_PREEMPT))
 		i915_guc_dequeue(engine);
 }
 
@@ -1057,6 +1187,51 @@ static void guc_ads_destroy(struct intel_guc *guc)
 	i915_vma_unpin_and_release(&guc->ads_vma);
 }
 
+static int guc_preempt_work_create(struct intel_guc *guc)
+{
+	struct drm_i915_private *dev_priv = guc_to_i915(guc);
+	struct intel_engine_cs *engine;
+	enum intel_engine_id id;
+
+	/*
+	 * Even though both sending a GuC action and adding a new workitem to
+	 * the GuC workqueue are serialized (each with its own locking), since
+	 * we're using multiple engines, it's possible that we're going to
+	 * issue a preempt request with two (or more - each for different
+	 * engine) workitems in GuC queue. In this situation, GuC may submit
+	 * all of them, which will make us very confused.
+	 * Our preemption contexts may even already be complete - before we
+	 * even had the chance to send the preempt action to GuC! Rather than
+	 * introducing yet another lock, we can just use an ordered workqueue
+	 * to make sure we're always sending a single preemption request with a
+	 * single workitem.
+	 */
+	guc->preempt_wq = alloc_ordered_workqueue("i915-guc_preempt",
+						  WQ_HIGHPRI);
+	if (!guc->preempt_wq)
+		return -ENOMEM;
+
+	for_each_engine(engine, dev_priv, id) {
+		guc->preempt_work[id].engine = engine;
+		INIT_WORK(&guc->preempt_work[id].work, inject_preempt_context);
+	}
+
+	return 0;
+}
+
+static void guc_preempt_work_destroy(struct intel_guc *guc)
+{
+	struct drm_i915_private *dev_priv = guc_to_i915(guc);
+	struct intel_engine_cs *engine;
+	enum intel_engine_id id;
+
+	for_each_engine(engine, dev_priv, id)
+		cancel_work_sync(&guc->preempt_work[id].work);
+
+	destroy_workqueue(guc->preempt_wq);
+	guc->preempt_wq = NULL;
+}
+
 /*
  * Set up the memory resources to be shared with the GuC (via the GGTT)
  * at firmware loading time.
@@ -1081,12 +1256,18 @@ int i915_guc_submission_init(struct drm_i915_private *dev_priv)
 	if (ret < 0)
 		goto err_shared_data;
 
+	ret = guc_preempt_work_create(guc);
+	if (ret)
+		goto err_log;
+
 	ret = guc_ads_create(guc);
 	if (ret < 0)
-		goto err_log;
+		goto err_wq;
 
 	return 0;
 
+err_wq:
+	guc_preempt_work_destroy(guc);
 err_log:
 	intel_guc_log_destroy(guc);
 err_shared_data:
@@ -1101,6 +1282,7 @@ void i915_guc_submission_fini(struct drm_i915_private *dev_priv)
 	struct intel_guc *guc = &dev_priv->guc;
 
 	guc_ads_destroy(guc);
+	guc_preempt_work_destroy(guc);
 	intel_guc_log_destroy(guc);
 	guc_shared_data_destroy(guc);
 	guc_stage_desc_pool_destroy(guc);
diff --git a/drivers/gpu/drm/i915/intel_guc.h b/drivers/gpu/drm/i915/intel_guc.h
index 6ca55c5bed3c..607e02500262 100644
--- a/drivers/gpu/drm/i915/intel_guc.h
+++ b/drivers/gpu/drm/i915/intel_guc.h
@@ -34,6 +34,11 @@
 #include "i915_guc_reg.h"
 #include "i915_vma.h"
 
+struct guc_preempt_work {
+	struct work_struct work;
+	struct intel_engine_cs *engine;
+};
+
 /*
  * Top level structure of GuC. It handles firmware loading and manages client
  * pool and doorbells. intel_guc owns a i915_guc_client to replace the legacy
@@ -60,6 +65,9 @@ struct intel_guc {
 	struct i915_guc_client *execbuf_client;
 	struct i915_guc_client *preempt_client;
 
+	struct guc_preempt_work preempt_work[I915_NUM_ENGINES];
+	struct workqueue_struct *preempt_wq;
+
 	DECLARE_BITMAP(doorbell_bitmap, GUC_NUM_DOORBELLS);
 	/* Cyclic counter mod pagesize	*/
 	u32 db_cacheline;
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index b5d382ef8d85..6840ec8db037 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -385,7 +385,7 @@ static void __unwind_incomplete_requests(struct intel_engine_cs *engine)
 	}
 }
 
-static void
+void
 execlists_unwind_incomplete_requests(struct intel_engine_execlists *execlists)
 {
 	struct intel_engine_cs *engine =
@@ -696,7 +696,7 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
 	}
 }
 
-static void
+void
 execlists_cancel_port_requests(struct intel_engine_execlists * const execlists)
 {
 	struct execlist_port *port = execlists->port;
diff --git a/drivers/gpu/drm/i915/intel_lrc.h b/drivers/gpu/drm/i915/intel_lrc.h
index 689fde1a63a9..17182ce29674 100644
--- a/drivers/gpu/drm/i915/intel_lrc.h
+++ b/drivers/gpu/drm/i915/intel_lrc.h
@@ -107,7 +107,6 @@ intel_lr_context_descriptor(struct i915_gem_context *ctx,
 	return ctx->engine[engine->id].lrc_desc;
 }
 
-
 /* Execlists */
 int intel_sanitize_enable_execlists(struct drm_i915_private *dev_priv,
 				    int enable_execlists);
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index 4a5a08985328..69ad875fd011 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -557,6 +557,12 @@ execlists_is_active(const struct intel_engine_execlists *execlists,
 	return test_bit(bit, (unsigned long *)&execlists->active);
 }
 
+void
+execlists_cancel_port_requests(struct intel_engine_execlists * const execlists);
+
+void
+execlists_unwind_incomplete_requests(struct intel_engine_execlists *execlists);
+
 static inline unsigned int
 execlists_num_ports(const struct intel_engine_execlists * const execlists)
 {
-- 
2.13.6

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* ✗ Fi.CI.BAT: warning for Preemption with GuC, fourth try (rev3)
  2017-10-25 20:00 [PATCH 00/12] Preemption with GuC, fourth try Michał Winiarski
                   ` (13 preceding siblings ...)
  2017-10-26  7:48 ` ✗ Fi.CI.BAT: warning for Preemption with GuC, fourth try (rev2) Patchwork
@ 2017-10-26 13:41 ` Patchwork
  2017-10-26 13:59 ` ✓ Fi.CI.BAT: success for Preemption with GuC, fourth try (rev5) Patchwork
                   ` (4 subsequent siblings)
  19 siblings, 0 replies; 35+ messages in thread
From: Patchwork @ 2017-10-26 13:41 UTC (permalink / raw)
  To: Michał Winiarski; +Cc: intel-gfx

== Series Details ==

Series: Preemption with GuC, fourth try (rev3)
URL   : https://patchwork.freedesktop.org/series/32654/
State : warning

== Summary ==

Series 32654v3 Preemption with GuC, fourth try
https://patchwork.freedesktop.org/api/1.0/series/32654/revisions/3/mbox/

Test chamelium:
        Subgroup dp-crc-fast:
                pass       -> FAIL       (fi-kbl-7500u) fdo#102514
Test kms_pipe_crc_basic:
        Subgroup read-crc-pipe-b-frame-sequence:
                skip       -> PASS       (fi-hsw-4770r) fdo#102332
        Subgroup suspend-read-crc-pipe-b:
                pass       -> DMESG-WARN (fi-byt-j1900) fdo#101705
Test drv_module_reload:
        Subgroup basic-reload:
                pass       -> DMESG-WARN (fi-cnl-y)
        Subgroup basic-no-display:
                pass       -> DMESG-WARN (fi-cnl-y)
        Subgroup basic-reload-inject:
                pass       -> DMESG-WARN (fi-cnl-y)

fdo#102514 https://bugs.freedesktop.org/show_bug.cgi?id=102514
fdo#102332 https://bugs.freedesktop.org/show_bug.cgi?id=102332
fdo#101705 https://bugs.freedesktop.org/show_bug.cgi?id=101705

fi-bdw-5557u     total:289  pass:268  dwarn:0   dfail:0   fail:0   skip:21  time:438s
fi-bdw-gvtdvm    total:289  pass:265  dwarn:0   dfail:0   fail:0   skip:24  time:451s
fi-blb-e6850     total:289  pass:223  dwarn:1   dfail:0   fail:0   skip:65  time:369s
fi-bsw-n3050     total:289  pass:243  dwarn:0   dfail:0   fail:0   skip:46  time:528s
fi-bwr-2160      total:289  pass:183  dwarn:0   dfail:0   fail:0   skip:106 time:263s
fi-bxt-dsi       total:289  pass:259  dwarn:0   dfail:0   fail:0   skip:30  time:503s
fi-bxt-j4205     total:289  pass:260  dwarn:0   dfail:0   fail:0   skip:29  time:502s
fi-byt-j1900     total:289  pass:253  dwarn:1   dfail:0   fail:0   skip:35  time:494s
fi-byt-n2820     total:289  pass:249  dwarn:1   dfail:0   fail:0   skip:39  time:482s
fi-cfl-s         total:289  pass:253  dwarn:4   dfail:0   fail:0   skip:32  time:552s
fi-cnl-y         total:289  pass:259  dwarn:3   dfail:0   fail:0   skip:27  time:618s
fi-elk-e7500     total:289  pass:229  dwarn:0   dfail:0   fail:0   skip:60  time:418s
fi-gdg-551       total:289  pass:178  dwarn:1   dfail:0   fail:1   skip:109 time:249s
fi-glk-1         total:289  pass:261  dwarn:0   dfail:0   fail:0   skip:28  time:574s
fi-glk-dsi       total:289  pass:258  dwarn:0   dfail:0   fail:1   skip:30  time:487s
fi-hsw-4770      total:289  pass:262  dwarn:0   dfail:0   fail:0   skip:27  time:426s
fi-hsw-4770r     total:289  pass:262  dwarn:0   dfail:0   fail:0   skip:27  time:425s
fi-ilk-650       total:289  pass:228  dwarn:0   dfail:0   fail:0   skip:61  time:434s
fi-ivb-3520m     total:289  pass:260  dwarn:0   dfail:0   fail:0   skip:29  time:492s
fi-ivb-3770      total:289  pass:260  dwarn:0   dfail:0   fail:0   skip:29  time:463s
fi-kbl-7500u     total:289  pass:263  dwarn:1   dfail:0   fail:1   skip:24  time:480s
fi-kbl-7560u     total:289  pass:270  dwarn:0   dfail:0   fail:0   skip:19  time:575s
fi-kbl-7567u     total:289  pass:269  dwarn:0   dfail:0   fail:0   skip:20  time:478s
fi-kbl-r         total:289  pass:262  dwarn:0   dfail:0   fail:0   skip:27  time:581s
fi-pnv-d510      total:289  pass:222  dwarn:1   dfail:0   fail:0   skip:66  time:545s
fi-skl-6260u     total:289  pass:269  dwarn:0   dfail:0   fail:0   skip:20  time:454s
fi-skl-6600u     total:289  pass:262  dwarn:0   dfail:0   fail:0   skip:27  time:589s
fi-skl-6700hq    total:289  pass:263  dwarn:0   dfail:0   fail:0   skip:26  time:659s
fi-skl-6700k     total:289  pass:265  dwarn:0   dfail:0   fail:0   skip:24  time:522s
fi-skl-6770hq    total:289  pass:269  dwarn:0   dfail:0   fail:0   skip:20  time:501s
fi-skl-gvtdvm    total:289  pass:266  dwarn:0   dfail:0   fail:0   skip:23  time:456s
fi-snb-2520m     total:289  pass:250  dwarn:0   dfail:0   fail:0   skip:39  time:560s
fi-snb-2600      total:289  pass:249  dwarn:0   dfail:0   fail:0   skip:40  time:418s

df3033b174059a59aa0c890f81de8af037abd11f drm-tip: 2017y-10m-26d-11h-03m-59s UTC integration manifest
3218d17f96a7 HAX Enable GuC Submission for CI
a621438797f4 drm/i915/guc: Preemption! With GuC
cac38ec21a03 drm/i915: Rename helpers used for unwinding, use macro for can_preempt
d8b26d6919a9 drm/i915/guc: Keep request->priority for its lifetime
794b8ad2405a drm/i915: Add information needed to track engine preempt state
fd083843a919 drm/i915: Extract "emit write" part of emit breadcrumb functions
5b871dda8364 drm/i915/guc: Split guc_wq_item_append
b1bd6d916ab2 drm/i915/guc: Add a second client, to be used for preemption
cb79905d0954 drm/i915/guc: Add preemption action to GuC firmware interface
18c35c003dcb drm/i915/guc: Allocate separate shared data object for GuC communication
5bda4fcf07ad drm/i915/guc: Extract GuC stage desc pool creation into a helper
87f3d391fb9f drm/i915/guc: Do not use 0 for GuC doorbell cookie

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_6204/
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v5] drm/i915/guc: Add a second client, to be used for preemption
  2017-10-26 13:32     ` [PATCH v5] " Michał Winiarski
@ 2017-10-26 13:44       ` Michal Wajdeczko
  2017-10-26 14:17       ` [PATCH v6] " Michał Winiarski
  1 sibling, 0 replies; 35+ messages in thread
From: Michal Wajdeczko @ 2017-10-26 13:44 UTC (permalink / raw)
  To: intel-gfx, Michał Winiarski

On Thu, 26 Oct 2017 15:32:31 +0200, Michał Winiarski  
<michal.winiarski@intel.com> wrote:

> From: Dave Gordon <david.s.gordon@intel.com>
>
> This second client is created with priority KMD_HIGH, and marked
> as preemptive. This will allow us to request preemption using GuC  
> actions.
>
> v2: Extract clients creation into a helper, debugfs fixups. (Michał)
> Recreate doorbell on init. (Daniele)
> Move clients into an array.
>
> v3: And move clients back from an array, to get rid of the enum (Michał)
>
> v4: Use is_high_priority, move DRM_ERROR into __create_doorbell, move
> GEM_BUG_ON inside guc_clients_create (Michał)
>
> v5: Split the BUG_ON (Michał)
>
> Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
> Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
> Cc: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
> Cc: Jeff McGee <jeff.mcgee@intel.com>
> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
> Cc: Oscar Mateo <oscar.mateo@intel.com>
> ---

Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 35+ messages in thread

* ✓ Fi.CI.BAT: success for Preemption with GuC, fourth try (rev5)
  2017-10-25 20:00 [PATCH 00/12] Preemption with GuC, fourth try Michał Winiarski
                   ` (14 preceding siblings ...)
  2017-10-26 13:41 ` ✗ Fi.CI.BAT: warning for Preemption with GuC, fourth try (rev3) Patchwork
@ 2017-10-26 13:59 ` Patchwork
  2017-10-26 14:43 ` ✗ Fi.CI.BAT: warning for Preemption with GuC, fourth try (rev6) Patchwork
                   ` (3 subsequent siblings)
  19 siblings, 0 replies; 35+ messages in thread
From: Patchwork @ 2017-10-26 13:59 UTC (permalink / raw)
  To: Michał Winiarski; +Cc: intel-gfx

== Series Details ==

Series: Preemption with GuC, fourth try (rev5)
URL   : https://patchwork.freedesktop.org/series/32654/
State : success

== Summary ==

Series 32654v5 Preemption with GuC, fourth try
https://patchwork.freedesktop.org/api/1.0/series/32654/revisions/5/mbox/

Test kms_cursor_legacy:
        Subgroup basic-flip-before-cursor-atomic:
                pass       -> INCOMPLETE (fi-cnl-y) fdo#102035
Test kms_pipe_crc_basic:
        Subgroup read-crc-pipe-b-frame-sequence:
                skip       -> PASS       (fi-hsw-4770r) fdo#102332
        Subgroup suspend-read-crc-pipe-b:
                pass       -> DMESG-WARN (fi-byt-j1900) fdo#101705
Test drv_module_reload:
        Subgroup basic-reload:
                pass       -> DMESG-WARN (fi-bsw-n3050) fdo#103165

fdo#102035 https://bugs.freedesktop.org/show_bug.cgi?id=102035
fdo#102332 https://bugs.freedesktop.org/show_bug.cgi?id=102332
fdo#101705 https://bugs.freedesktop.org/show_bug.cgi?id=101705
fdo#103165 https://bugs.freedesktop.org/show_bug.cgi?id=103165

fi-bdw-5557u     total:289  pass:268  dwarn:0   dfail:0   fail:0   skip:21  time:441s
fi-bdw-gvtdvm    total:289  pass:265  dwarn:0   dfail:0   fail:0   skip:24  time:453s
fi-blb-e6850     total:289  pass:223  dwarn:1   dfail:0   fail:0   skip:65  time:370s
fi-bsw-n3050     total:289  pass:242  dwarn:1   dfail:0   fail:0   skip:46  time:520s
fi-bwr-2160      total:289  pass:183  dwarn:0   dfail:0   fail:0   skip:106 time:261s
fi-bxt-dsi       total:289  pass:259  dwarn:0   dfail:0   fail:0   skip:30  time:499s
fi-bxt-j4205     total:289  pass:260  dwarn:0   dfail:0   fail:0   skip:29  time:500s
fi-byt-j1900     total:289  pass:253  dwarn:1   dfail:0   fail:0   skip:35  time:495s
fi-byt-n2820     total:289  pass:249  dwarn:1   dfail:0   fail:0   skip:39  time:476s
fi-cfl-s         total:289  pass:253  dwarn:4   dfail:0   fail:0   skip:32  time:554s
fi-cnl-y         total:289  pass:193  dwarn:0   dfail:0   fail:0   skip:20 
fi-elk-e7500     total:289  pass:229  dwarn:0   dfail:0   fail:0   skip:60  time:417s
fi-gdg-551       total:289  pass:178  dwarn:1   dfail:0   fail:1   skip:109 time:251s
fi-glk-1         total:289  pass:261  dwarn:0   dfail:0   fail:0   skip:28  time:582s
fi-glk-dsi       total:289  pass:258  dwarn:0   dfail:0   fail:1   skip:30  time:488s
fi-hsw-4770      total:289  pass:262  dwarn:0   dfail:0   fail:0   skip:27  time:425s
fi-hsw-4770r     total:289  pass:262  dwarn:0   dfail:0   fail:0   skip:27  time:425s
fi-ilk-650       total:289  pass:228  dwarn:0   dfail:0   fail:0   skip:61  time:433s
fi-ivb-3520m     total:289  pass:260  dwarn:0   dfail:0   fail:0   skip:29  time:484s
fi-ivb-3770      total:289  pass:260  dwarn:0   dfail:0   fail:0   skip:29  time:459s
fi-kbl-7500u     total:289  pass:264  dwarn:1   dfail:0   fail:0   skip:24  time:490s
fi-kbl-7560u     total:289  pass:270  dwarn:0   dfail:0   fail:0   skip:19  time:573s
fi-kbl-7567u     total:289  pass:269  dwarn:0   dfail:0   fail:0   skip:20  time:483s
fi-kbl-r         total:289  pass:262  dwarn:0   dfail:0   fail:0   skip:27  time:585s
fi-pnv-d510      total:289  pass:222  dwarn:1   dfail:0   fail:0   skip:66  time:550s
fi-skl-6260u     total:289  pass:269  dwarn:0   dfail:0   fail:0   skip:20  time:444s
fi-skl-6600u     total:289  pass:262  dwarn:0   dfail:0   fail:0   skip:27  time:597s
fi-skl-6700hq    total:289  pass:263  dwarn:0   dfail:0   fail:0   skip:26  time:647s
fi-skl-6700k     total:289  pass:265  dwarn:0   dfail:0   fail:0   skip:24  time:522s
fi-skl-6770hq    total:289  pass:269  dwarn:0   dfail:0   fail:0   skip:20  time:502s
fi-skl-gvtdvm    total:289  pass:266  dwarn:0   dfail:0   fail:0   skip:23  time:457s
fi-snb-2520m     total:289  pass:250  dwarn:0   dfail:0   fail:0   skip:39  time:560s
fi-snb-2600      total:289  pass:249  dwarn:0   dfail:0   fail:0   skip:40  time:413s

df3033b174059a59aa0c890f81de8af037abd11f drm-tip: 2017y-10m-26d-11h-03m-59s UTC integration manifest
57f4372b843b HAX Enable GuC Submission for CI
bd453e53421a drm/i915/guc: Preemption! With GuC
c230cfaaeec4 drm/i915: Rename helpers used for unwinding, use macro for can_preempt
be74df89d1cb drm/i915/guc: Keep request->priority for its lifetime
c65600f6018d drm/i915: Add information needed to track engine preempt state
05053f4249e6 drm/i915: Extract "emit write" part of emit breadcrumb functions
44056d81e278 drm/i915/guc: Split guc_wq_item_append
e6230d13d5b1 drm/i915/guc: Add a second client, to be used for preemption
46cdb7035ad8 drm/i915/guc: Add preemption action to GuC firmware interface
05352d789d91 drm/i915/guc: Allocate separate shared data object for GuC communication
79c012264126 drm/i915/guc: Extract GuC stage desc pool creation into a helper
a535408d82c7 drm/i915/guc: Do not use 0 for GuC doorbell cookie

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_6205/
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 35+ messages in thread

* [PATCH v6] drm/i915/guc: Add a second client, to be used for preemption
  2017-10-26 13:32     ` [PATCH v5] " Michał Winiarski
  2017-10-26 13:44       ` Michal Wajdeczko
@ 2017-10-26 14:17       ` Michał Winiarski
  2017-10-26 18:49         ` Michel Thierry
  1 sibling, 1 reply; 35+ messages in thread
From: Michał Winiarski @ 2017-10-26 14:17 UTC (permalink / raw)
  To: intel-gfx; +Cc: Dave Gordon

From: Dave Gordon <david.s.gordon@intel.com>

This second client is created with priority KMD_HIGH, and marked
as preemptive. This will allow us to request preemption using GuC actions.

v2: Extract clients creation into a helper, debugfs fixups. (Michał)
Recreate doorbell on init. (Daniele)
Move clients into an array.

v3: And move clients back from an array, to get rid of the enum (Michał)

v4: Use is_high_priority, move DRM_ERROR into __create_doorbell, move
GEM_BUG_ON inside guc_clients_create (Michał)

v5: Split the BUG_ON (Michał)

v6: Cleanup after error during doorbell reinit (Michał)

Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Jeff McGee <jeff.mcgee@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Oscar Mateo <oscar.mateo@intel.com>
Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
---
 drivers/gpu/drm/i915/i915_debugfs.c        |   2 +
 drivers/gpu/drm/i915/i915_guc_submission.c | 118 ++++++++++++++++++++---------
 drivers/gpu/drm/i915/intel_guc.h           |   1 +
 3 files changed, 86 insertions(+), 35 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index c65e381b85f3..6e178d75bb17 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -2484,6 +2484,8 @@ static int i915_guc_info(struct seq_file *m, void *data)
 
 	seq_printf(m, "\nGuC execbuf client @ %p:\n", guc->execbuf_client);
 	i915_guc_client_info(m, dev_priv, guc->execbuf_client);
+	seq_printf(m, "\nGuC preempt client @ %p:\n", guc->preempt_client);
+	i915_guc_client_info(m, dev_priv, guc->preempt_client);
 
 	i915_guc_log_info(m, dev_priv);
 
diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
index d1a5613da24c..b0f034f92431 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -33,10 +33,11 @@
  *
  * GuC client:
  * A i915_guc_client refers to a submission path through GuC. Currently, there
- * is only one of these (the execbuf_client) and this one is charged with all
- * submissions to the GuC. This struct is the owner of a doorbell, a process
- * descriptor and a workqueue (all of them inside a single gem object that
- * contains all required pages for these elements).
+ * are two clients. One of them (the execbuf_client) is charged with all
+ * submissions to the GuC, the other one (preempt_client) is responsible for
+ * preempting the execbuf_client. This struct is the owner of a doorbell, a
+ * process descriptor and a workqueue (all of them inside a single gem object
+ * that contains all required pages for these elements).
  *
  * GuC stage descriptor:
  * During initialization, the driver allocates a static pool of 1024 such
@@ -83,7 +84,8 @@
 
 static inline bool is_high_priority(struct i915_guc_client* client)
 {
-	return client->priority <= GUC_CLIENT_PRIORITY_HIGH;
+	return (client->priority == GUC_CLIENT_PRIORITY_KMD_HIGH ||
+		client->priority == GUC_CLIENT_PRIORITY_HIGH);
 }
 
 static int __reserve_doorbell(struct i915_guc_client *client)
@@ -196,8 +198,11 @@ static int __create_doorbell(struct i915_guc_client *client)
 	doorbell->cookie = 0;
 
 	err = __guc_allocate_doorbell(client->guc, client->stage_id);
-	if (err)
+	if (err) {
 		doorbell->db_status = GUC_DOORBELL_DISABLED;
+		DRM_ERROR("Couldn't create client %u doorbell: %d\n",
+			  client->stage_id, err);
+	}
 
 	return err;
 }
@@ -363,6 +368,8 @@ static void guc_stage_desc_init(struct intel_guc *guc,
 	memset(desc, 0, sizeof(*desc));
 
 	desc->attribute = GUC_STAGE_DESC_ATTR_ACTIVE | GUC_STAGE_DESC_ATTR_KERNEL;
+	if (is_high_priority(client))
+		desc->attribute |= GUC_STAGE_DESC_ATTR_PREEMPT;
 	desc->stage_id = client->stage_id;
 	desc->priority = client->priority;
 	desc->db_id = client->doorbell_id;
@@ -763,14 +770,14 @@ static int guc_init_doorbell_hw(struct intel_guc *guc)
 
 	/* Now for every client (and not only execbuf_client) make sure their
 	 * doorbells are known by the GuC */
-	//for (client = client_list; client != NULL; client = client->next)
-	{
-		ret = __create_doorbell(client);
-		if (ret) {
-			DRM_ERROR("Couldn't recreate client %u doorbell: %d\n",
-				client->stage_id, ret);
-			return ret;
-		}
+	ret = __create_doorbell(guc->execbuf_client);
+	if (ret)
+		return ret;
+
+	ret = __create_doorbell(guc->preempt_client);
+	if (ret) {
+		__destroy_doorbell(guc->execbuf_client);
+		return ret;
 	}
 
 	/* Read back & verify all (used & unused) doorbell registers */
@@ -895,6 +902,50 @@ static void guc_client_free(struct i915_guc_client *client)
 	kfree(client);
 }
 
+static int guc_clients_create(struct intel_guc *guc)
+{
+	struct drm_i915_private *dev_priv = guc_to_i915(guc);
+	struct i915_guc_client *client;
+
+	GEM_BUG_ON(guc->execbuf_client);
+	GEM_BUG_ON(guc->preempt_client);
+
+	client = guc_client_alloc(dev_priv,
+				  INTEL_INFO(dev_priv)->ring_mask,
+				  GUC_CLIENT_PRIORITY_KMD_NORMAL,
+				  dev_priv->kernel_context);
+	if (IS_ERR(client)) {
+		DRM_ERROR("Failed to create GuC client for submission!\n");
+		return PTR_ERR(client);
+	}
+	guc->execbuf_client = client;
+
+	client = guc_client_alloc(dev_priv,
+				  INTEL_INFO(dev_priv)->ring_mask,
+				  GUC_CLIENT_PRIORITY_KMD_HIGH,
+				  dev_priv->preempt_context);
+	if (IS_ERR(client)) {
+		DRM_ERROR("Failed to create GuC client for preemption!\n");
+		guc_client_free(guc->execbuf_client);
+		guc->execbuf_client = NULL;
+		return PTR_ERR(client);
+	}
+	guc->preempt_client = client;
+
+	return 0;
+}
+
+static void guc_clients_destroy(struct intel_guc *guc)
+{
+	struct i915_guc_client *client;
+
+	client = fetch_and_zero(&guc->execbuf_client);
+	guc_client_free(client);
+
+	client = fetch_and_zero(&guc->preempt_client);
+	guc_client_free(client);
+}
+
 static void guc_policy_init(struct guc_policy *policy)
 {
 	policy->execution_quantum = POLICY_DEFAULT_EXECUTION_QUANTUM_US;
@@ -1134,7 +1185,6 @@ static void i915_guc_submission_unpark(struct intel_engine_cs *engine)
 int i915_guc_submission_enable(struct drm_i915_private *dev_priv)
 {
 	struct intel_guc *guc = &dev_priv->guc;
-	struct i915_guc_client *client = guc->execbuf_client;
 	struct intel_engine_cs *engine;
 	enum intel_engine_id id;
 	int err;
@@ -1152,28 +1202,28 @@ int i915_guc_submission_enable(struct drm_i915_private *dev_priv)
 		     sizeof(struct guc_wq_item) *
 		     I915_NUM_ENGINES > GUC_WQ_SIZE);
 
-	if (!client) {
-		client = guc_client_alloc(dev_priv,
-					  INTEL_INFO(dev_priv)->ring_mask,
-					  GUC_CLIENT_PRIORITY_KMD_NORMAL,
-					  dev_priv->kernel_context);
-		if (IS_ERR(client)) {
-			DRM_ERROR("Failed to create GuC client for execbuf!\n");
-			return PTR_ERR(client);
-		}
-
-		guc->execbuf_client = client;
+	/*
+	 * We're being called on both module initialization and on reset,
+	 * until this flow is changed, we're using regular client presence to
+	 * determine which case we are in, and whether we should allocate new
+	 * clients or just reset their workqueues.
+	 */
+	if (!guc->execbuf_client) {
+		err = guc_clients_create(guc);
+		if (err)
+			return err;
+	} else {
+		guc_reset_wq(guc->execbuf_client);
+		guc_reset_wq(guc->preempt_client);
 	}
 
 	err = intel_guc_sample_forcewake(guc);
 	if (err)
-		goto err_execbuf_client;
-
-	guc_reset_wq(client);
+		goto err_free_clients;
 
 	err = guc_init_doorbell_hw(guc);
 	if (err)
-		goto err_execbuf_client;
+		goto err_free_clients;
 
 	/* Take over from manual control of ELSP (execlists) */
 	guc_interrupts_capture(dev_priv);
@@ -1187,9 +1237,8 @@ int i915_guc_submission_enable(struct drm_i915_private *dev_priv)
 
 	return 0;
 
-err_execbuf_client:
-	guc_client_free(guc->execbuf_client);
-	guc->execbuf_client = NULL;
+err_free_clients:
+	guc_clients_destroy(guc);
 	return err;
 }
 
@@ -1204,6 +1253,5 @@ void i915_guc_submission_disable(struct drm_i915_private *dev_priv)
 	/* Revert back to manual ELSP submission */
 	intel_engines_reset_default_submission(dev_priv);
 
-	guc_client_free(guc->execbuf_client);
-	guc->execbuf_client = NULL;
+	guc_clients_destroy(guc);
 }
diff --git a/drivers/gpu/drm/i915/intel_guc.h b/drivers/gpu/drm/i915/intel_guc.h
index aa1583167b0a..6ca55c5bed3c 100644
--- a/drivers/gpu/drm/i915/intel_guc.h
+++ b/drivers/gpu/drm/i915/intel_guc.h
@@ -58,6 +58,7 @@ struct intel_guc {
 	void *shared_data_vaddr;
 
 	struct i915_guc_client *execbuf_client;
+	struct i915_guc_client *preempt_client;
 
 	DECLARE_BITMAP(doorbell_bitmap, GUC_NUM_DOORBELLS);
 	/* Cyclic counter mod pagesize	*/
-- 
2.13.6

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 35+ messages in thread
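A note on the v6 patch above: the commit message says the second client is
"marked as preemptive", yet the diff never sets a preempt flag on the client
directly; the attribute is derived from the client's priority. The sketch
below condenses that connection using only identifiers from the quoted hunks
and is illustrative rather than standalone, compilable driver code.

/*
 * Illustration only, condensed from the hunks quoted above.
 * The preempt client is allocated with GUC_CLIENT_PRIORITY_KMD_HIGH,
 * so is_high_priority() returns true for it and guc_stage_desc_init()
 * adds the PREEMPT attribute to its stage descriptor. The execbuf
 * client is allocated with GUC_CLIENT_PRIORITY_KMD_NORMAL and is
 * left unmarked.
 */
static inline bool is_high_priority(struct i915_guc_client *client)
{
	return client->priority == GUC_CLIENT_PRIORITY_KMD_HIGH ||
	       client->priority == GUC_CLIENT_PRIORITY_HIGH;
}

/* ... later, in guc_stage_desc_init() ... */
	desc->attribute = GUC_STAGE_DESC_ATTR_ACTIVE | GUC_STAGE_DESC_ATTR_KERNEL;
	if (is_high_priority(client))
		desc->attribute |= GUC_STAGE_DESC_ATTR_PREEMPT;
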

* ✗ Fi.CI.BAT: warning for Preemption with GuC, fourth try (rev6)
  2017-10-25 20:00 [PATCH 00/12] Preemption with GuC, fourth try Michał Winiarski
                   ` (15 preceding siblings ...)
  2017-10-26 13:59 ` ✓ Fi.CI.BAT: success for Preemption with GuC, fourth try (rev5) Patchwork
@ 2017-10-26 14:43 ` Patchwork
  2017-10-26 15:18 ` ✓ Fi.CI.IGT: success for Preemption with GuC, fourth try (rev5) Patchwork
                   ` (2 subsequent siblings)
  19 siblings, 0 replies; 35+ messages in thread
From: Patchwork @ 2017-10-26 14:43 UTC (permalink / raw)
  To: Michał Winiarski; +Cc: intel-gfx

== Series Details ==

Series: Preemption with GuC, fourth try (rev6)
URL   : https://patchwork.freedesktop.org/series/32654/
State : warning

== Summary ==

Series 32654v6 Preemption with GuC, fourth try
https://patchwork.freedesktop.org/api/1.0/series/32654/revisions/6/mbox/

Test kms_pipe_crc_basic:
        Subgroup read-crc-pipe-b-frame-sequence:
                skip       -> PASS       (fi-hsw-4770r) fdo#102332
        Subgroup suspend-read-crc-pipe-b:
                pass       -> DMESG-WARN (fi-byt-j1900) fdo#101705
Test drv_module_reload:
        Subgroup basic-reload:
                pass       -> DMESG-WARN (fi-cnl-y)
        Subgroup basic-no-display:
                dmesg-warn -> INCOMPLETE (fi-cfl-s) fdo#103206
                pass       -> DMESG-WARN (fi-cnl-y)
        Subgroup basic-reload-inject:
                pass       -> DMESG-WARN (fi-cnl-y)

fdo#102332 https://bugs.freedesktop.org/show_bug.cgi?id=102332
fdo#101705 https://bugs.freedesktop.org/show_bug.cgi?id=101705
fdo#103206 https://bugs.freedesktop.org/show_bug.cgi?id=103206

fi-bdw-5557u     total:289  pass:268  dwarn:0   dfail:0   fail:0   skip:21  time:445s
fi-bdw-gvtdvm    total:289  pass:265  dwarn:0   dfail:0   fail:0   skip:24  time:455s
fi-blb-e6850     total:289  pass:223  dwarn:1   dfail:0   fail:0   skip:65  time:370s
fi-bsw-n3050     total:289  pass:243  dwarn:0   dfail:0   fail:0   skip:46  time:525s
fi-bwr-2160      total:289  pass:183  dwarn:0   dfail:0   fail:0   skip:106 time:263s
fi-bxt-dsi       total:289  pass:259  dwarn:0   dfail:0   fail:0   skip:30  time:504s
fi-bxt-j4205     total:289  pass:260  dwarn:0   dfail:0   fail:0   skip:29  time:498s
fi-byt-j1900     total:289  pass:253  dwarn:1   dfail:0   fail:0   skip:35  time:490s
fi-byt-n2820     total:289  pass:249  dwarn:1   dfail:0   fail:0   skip:39  time:482s
fi-cfl-s         total:287  pass:253  dwarn:2   dfail:0   fail:0   skip:31 
fi-cnl-y         total:289  pass:259  dwarn:3   dfail:0   fail:0   skip:27  time:606s
fi-elk-e7500     total:289  pass:229  dwarn:0   dfail:0   fail:0   skip:60  time:418s
fi-gdg-551       total:289  pass:178  dwarn:1   dfail:0   fail:1   skip:109 time:250s
fi-glk-1         total:289  pass:261  dwarn:0   dfail:0   fail:0   skip:28  time:575s
fi-glk-dsi       total:289  pass:258  dwarn:0   dfail:0   fail:1   skip:30  time:483s
fi-hsw-4770      total:289  pass:262  dwarn:0   dfail:0   fail:0   skip:27  time:424s
fi-hsw-4770r     total:289  pass:262  dwarn:0   dfail:0   fail:0   skip:27  time:428s
fi-ilk-650       total:289  pass:228  dwarn:0   dfail:0   fail:0   skip:61  time:434s
fi-ivb-3520m     total:289  pass:260  dwarn:0   dfail:0   fail:0   skip:29  time:493s
fi-ivb-3770      total:289  pass:260  dwarn:0   dfail:0   fail:0   skip:29  time:455s
fi-kbl-7500u     total:289  pass:264  dwarn:1   dfail:0   fail:0   skip:24  time:491s
fi-kbl-7560u     total:289  pass:270  dwarn:0   dfail:0   fail:0   skip:19  time:572s
fi-kbl-7567u     total:289  pass:269  dwarn:0   dfail:0   fail:0   skip:20  time:485s
fi-kbl-r         total:289  pass:262  dwarn:0   dfail:0   fail:0   skip:27  time:586s
fi-pnv-d510      total:289  pass:222  dwarn:1   dfail:0   fail:0   skip:66  time:541s
fi-skl-6260u     total:289  pass:269  dwarn:0   dfail:0   fail:0   skip:20  time:449s
fi-skl-6600u     total:289  pass:262  dwarn:0   dfail:0   fail:0   skip:27  time:598s
fi-skl-6700hq    total:289  pass:263  dwarn:0   dfail:0   fail:0   skip:26  time:648s
fi-skl-6700k     total:289  pass:265  dwarn:0   dfail:0   fail:0   skip:24  time:523s
fi-skl-6770hq    total:289  pass:269  dwarn:0   dfail:0   fail:0   skip:20  time:496s
fi-skl-gvtdvm    total:289  pass:266  dwarn:0   dfail:0   fail:0   skip:23  time:461s
fi-snb-2520m     total:289  pass:250  dwarn:0   dfail:0   fail:0   skip:39  time:561s
fi-snb-2600      total:289  pass:249  dwarn:0   dfail:0   fail:0   skip:40  time:420s

df3033b174059a59aa0c890f81de8af037abd11f drm-tip: 2017y-10m-26d-11h-03m-59s UTC integration manifest
7172b849c490 HAX Enable GuC Submission for CI
996b8a088c11 drm/i915/guc: Preemption! With GuC
7bebd9377fe5 drm/i915: Rename helpers used for unwinding, use macro for can_preempt
c358e5f30565 drm/i915/guc: Keep request->priority for its lifetime
815b33b8e308 drm/i915: Add information needed to track engine preempt state
364832f82afd drm/i915: Extract "emit write" part of emit breadcrumb functions
274c59afc863 drm/i915/guc: Split guc_wq_item_append
ea7d6bdcc394 drm/i915/guc: Add a second client, to be used for preemption
2de790e9c225 drm/i915/guc: Add preemption action to GuC firmware interface
881c8236e90b drm/i915/guc: Allocate separate shared data object for GuC communication
db5247f8c3a1 drm/i915/guc: Extract GuC stage desc pool creation into a helper
ae0602d37052 drm/i915/guc: Do not use 0 for GuC doorbell cookie

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_6207/
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 35+ messages in thread

* ✓ Fi.CI.IGT: success for Preemption with GuC, fourth try (rev5)
  2017-10-25 20:00 [PATCH 00/12] Preemption with GuC, fourth try Michał Winiarski
                   ` (16 preceding siblings ...)
  2017-10-26 14:43 ` ✗ Fi.CI.BAT: warning for Preemption with GuC, fourth try (rev6) Patchwork
@ 2017-10-26 15:18 ` Patchwork
  2017-10-26 15:59 ` ✓ Fi.CI.IGT: success for Preemption with GuC, fourth try (rev6) Patchwork
  2017-10-26 20:37 ` [PATCH 00/12] Preemption with GuC, fourth try Chris Wilson
  19 siblings, 0 replies; 35+ messages in thread
From: Patchwork @ 2017-10-26 15:18 UTC (permalink / raw)
  To: Michał Winiarski; +Cc: intel-gfx

== Series Details ==

Series: Preemption with GuC, fourth try (rev5)
URL   : https://patchwork.freedesktop.org/series/32654/
State : success

== Summary ==

Test kms_busy:
        Subgroup extended-modeset-hang-oldfb-with-reset-render-A:
                pass       -> DMESG-WARN (shard-hsw) fdo#102249 +1
        Subgroup extended-modeset-hang-newfb-with-reset-render-B:
                pass       -> DMESG-WARN (shard-hsw) fdo#103038
Test kms_plane:
        Subgroup plane-panning-bottom-right-suspend-pipe-A-planes:
                skip       -> PASS       (shard-hsw)
Test kms_flip:
        Subgroup dpms-vs-vblank-race-interruptible:
                pass       -> FAIL       (shard-hsw) fdo#103060
Test kms_setmode:
        Subgroup basic:
                pass       -> FAIL       (shard-hsw) fdo#99912

fdo#102249 https://bugs.freedesktop.org/show_bug.cgi?id=102249
fdo#103038 https://bugs.freedesktop.org/show_bug.cgi?id=103038
fdo#103060 https://bugs.freedesktop.org/show_bug.cgi?id=103060
fdo#99912 https://bugs.freedesktop.org/show_bug.cgi?id=99912

shard-hsw        total:2539 pass:1430 dwarn:2   dfail:0   fail:10  skip:1097 time:9281s

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_6205/shards.html
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 35+ messages in thread

* ✓ Fi.CI.IGT: success for Preemption with GuC, fourth try (rev6)
  2017-10-25 20:00 [PATCH 00/12] Preemption with GuC, fourth try Michał Winiarski
                   ` (17 preceding siblings ...)
  2017-10-26 15:18 ` ✓ Fi.CI.IGT: success for Preemption with GuC, fourth try (rev5) Patchwork
@ 2017-10-26 15:59 ` Patchwork
  2017-10-26 20:37 ` [PATCH 00/12] Preemption with GuC, fourth try Chris Wilson
  19 siblings, 0 replies; 35+ messages in thread
From: Patchwork @ 2017-10-26 15:59 UTC (permalink / raw)
  To: Michał Winiarski; +Cc: intel-gfx

== Series Details ==

Series: Preemption with GuC, fourth try (rev6)
URL   : https://patchwork.freedesktop.org/series/32654/
State : success

== Summary ==

Test kms_busy:
        Subgroup extended-modeset-hang-newfb-with-reset-render-A:
                pass       -> DMESG-WARN (shard-hsw) fdo#102249 +1
Test kms_plane:
        Subgroup plane-panning-bottom-right-suspend-pipe-A-planes:
                skip       -> PASS       (shard-hsw)

fdo#102249 https://bugs.freedesktop.org/show_bug.cgi?id=102249

shard-hsw        total:2539 pass:1433 dwarn:1   dfail:0   fail:8   skip:1097 time:9240s

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_6207/shards.html
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v6] drm/i915/guc: Add a second client, to be used for preemption
  2017-10-26 14:17       ` [PATCH v6] " Michał Winiarski
@ 2017-10-26 18:49         ` Michel Thierry
  2017-10-26 20:02           ` Chris Wilson
  0 siblings, 1 reply; 35+ messages in thread
From: Michel Thierry @ 2017-10-26 18:49 UTC (permalink / raw)
  To: Michał Winiarski, intel-gfx

On 26/10/17 07:17, Michał Winiarski wrote:
> @@ -763,14 +770,14 @@ static int guc_init_doorbell_hw(struct intel_guc *guc)
> 
>          /* Now for every client (and not only execbuf_client) make sure their
>           * doorbells are known by the GuC */
> -       //for (client = client_list; client != NULL; client = client->next)
> -       {
> -               ret = __create_doorbell(client);
> -               if (ret) {
> -                       DRM_ERROR("Couldn't recreate client %u doorbell: %d\n",
> -                               client->stage_id, ret);
> -                       return ret;
> -               }
> +       ret = __create_doorbell(guc->execbuf_client);
> +       if (ret)
> +               return ret;
> +
> +       ret = __create_doorbell(guc->preempt_client);
> +       if (ret) {
> +               __destroy_doorbell(guc->execbuf_client);
> +               return ret;
>          }

I'm pretty sure there's an old client left behind after this, e.g.:

  static int guc_init_doorbell_hw(struct intel_guc *guc)
  {
-	struct i915_guc_client *client = guc->execbuf_client;
  ...

-Michel
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v6] drm/i915/guc: Add a second client, to be used for preemption
  2017-10-26 18:49         ` Michel Thierry
@ 2017-10-26 20:02           ` Chris Wilson
  2017-10-26 20:15             ` Michel Thierry
  0 siblings, 1 reply; 35+ messages in thread
From: Chris Wilson @ 2017-10-26 20:02 UTC (permalink / raw)
  To: Michel Thierry, Michał Winiarski, intel-gfx

Quoting Michel Thierry (2017-10-26 19:49:06)
> On 26/10/17 07:17, Michał Winiarski wrote:
> > @@ -763,14 +770,14 @@ static int guc_init_doorbell_hw(struct intel_guc *guc)
> > 
> >          /* Now for every client (and not only execbuf_client) make sure their
> >           * doorbells are known by the GuC */
> > -       //for (client = client_list; client != NULL; client = client->next)
> > -       {
> > -               ret = __create_doorbell(client);
> > -               if (ret) {
> > -                       DRM_ERROR("Couldn't recreate client %u doorbell: %d\n",
> > -                               client->stage_id, ret);
> > -                       return ret;
> > -               }
> > +       ret = __create_doorbell(guc->execbuf_client);
> > +       if (ret)
> > +               return ret;
> > +
> > +       ret = __create_doorbell(guc->preempt_client);
> > +       if (ret) {
> > +               __destroy_doorbell(guc->execbuf_client);
> > +               return ret;
> >          }
> 
> I'm pretty sure there's an old client left behind after this, e.g.:
> 
>   static int guc_init_doorbell_hw(struct intel_guc *guc)
>   {
> -       struct i915_guc_client *client = guc->execbuf_client;

That doesn't look like it's a problem, or that confusing. This client
local var is just a placeholder used to clear the unwanted doorbells. We
then walk over the known kernel clients to reset them.

So, I think the suggestion is to split this function again, but I don't
see a blocker for merging.
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v6] drm/i915/guc: Add a second client, to be used for preemption
  2017-10-26 20:02           ` Chris Wilson
@ 2017-10-26 20:15             ` Michel Thierry
  0 siblings, 0 replies; 35+ messages in thread
From: Michel Thierry @ 2017-10-26 20:15 UTC (permalink / raw)
  To: Chris Wilson, Michał Winiarski, intel-gfx

On 10/26/2017 1:02 PM, Chris Wilson wrote:
> Quoting Michel Thierry (2017-10-26 19:49:06)
>> On 26/10/17 07:17, Michał Winiarski wrote:
>>> @@ -763,14 +770,14 @@ static int guc_init_doorbell_hw(struct intel_guc *guc)
>>>
>>>           /* Now for every client (and not only execbuf_client) make sure their
>>>            * doorbells are known by the GuC */
>>> -       //for (client = client_list; client != NULL; client = client->next)
>>> -       {
>>> -               ret = __create_doorbell(client);
>>> -               if (ret) {
>>> -                       DRM_ERROR("Couldn't recreate client %u doorbell: %d\n",
>>> -                               client->stage_id, ret);
>>> -                       return ret;
>>> -               }
>>> +       ret = __create_doorbell(guc->execbuf_client);
>>> +       if (ret)
>>> +               return ret;
>>> +
>>> +       ret = __create_doorbell(guc->preempt_client);
>>> +       if (ret) {
>>> +               __destroy_doorbell(guc->execbuf_client);
>>> +               return ret;
>>>           }
>>
>> I'm pretty sure there's an old client left behind after this, e.g.:
>>
>>    static int guc_init_doorbell_hw(struct intel_guc *guc)
>>    {
>> -       struct i915_guc_client *client = guc->execbuf_client;
> 
> That doesn't look like it's a problem, or that confusing. This client
> local var is just a placeholder used to clear the unwanted doorbells. We
> then walk over the known kernel clients to reset them.
> 
> So, I think the suggestion is to split this function again, but I don't
> see a blocker for merging.

Ok, I'll need to split the function (or pass some flags to execute just 
parts of it) for the selftest anyway.
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 35+ messages in thread
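Following up on the exchange above, here is a rough sketch of what splitting
the doorbell re-creation out of guc_init_doorbell_hw() could look like. The
helper name guc_doorbells_create() is invented for illustration and is not
necessarily what later landed in the driver; the body reuses only calls
visible in the quoted diff.

/*
 * Hypothetical helper, sketched from the discussion above.
 * guc_init_doorbell_hw() would keep its placeholder-client loop that
 * clears stale doorbells, then call this to re-create the doorbells
 * of the two known kernel clients, unwinding the first on failure of
 * the second.
 */
static int guc_doorbells_create(struct intel_guc *guc)
{
	int ret;

	ret = __create_doorbell(guc->execbuf_client);
	if (ret)
		return ret;

	ret = __create_doorbell(guc->preempt_client);
	if (ret)
		__destroy_doorbell(guc->execbuf_client);

	return ret;
}

A selftest, as Michel mentions, could then exercise such a helper directly
instead of driving the full guc_init_doorbell_hw() reset path.
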

* Re: [PATCH 00/12] Preemption with GuC, fourth try
  2017-10-25 20:00 [PATCH 00/12] Preemption with GuC, fourth try Michał Winiarski
                   ` (18 preceding siblings ...)
  2017-10-26 15:59 ` ✓ Fi.CI.IGT: success for Preemption with GuC, fourth try (rev6) Patchwork
@ 2017-10-26 20:37 ` Chris Wilson
  19 siblings, 0 replies; 35+ messages in thread
From: Chris Wilson @ 2017-10-26 20:37 UTC (permalink / raw)
  To: Michał Winiarski, intel-gfx

Quoting Michał Winiarski (2017-10-25 21:00:08)
> No major changes from previous iteration.
> Dropped the workaround for missing interrupt (which turned out to be
> self-inflicted, now properly fixed by Chris), and applied the review comments.

And pushed. Thanks everyone for the reviews and suggestions; looks like
we have a solid foundation on which to build.
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 35+ messages in thread

end of thread, other threads:[~2017-10-26 20:37 UTC | newest]

Thread overview: 35+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-10-25 20:00 [PATCH 00/12] Preemption with GuC, fourth try Michał Winiarski
2017-10-25 20:00 ` [PATCH v2 01/12] drm/i915/guc: Do not use 0 for GuC doorbell cookie Michał Winiarski
2017-10-25 20:00 ` [PATCH 02/12] drm/i915/guc: Extract GuC stage desc pool creation into a helper Michał Winiarski
2017-10-25 20:00 ` [PATCH v2 03/12] drm/i915/guc: Allocate separate shared data object for GuC communication Michał Winiarski
2017-10-25 21:04   ` Michel Thierry
2017-10-25 20:00 ` [PATCH v2 04/12] drm/i915/guc: Add preemption action to GuC firmware interface Michał Winiarski
2017-10-25 20:00 ` [PATCH v3 05/12] drm/i915/guc: Add a second client, to be used for preemption Michał Winiarski
2017-10-26 12:50   ` Michal Wajdeczko
2017-10-26 13:20   ` [PATCH v4] " Michał Winiarski
2017-10-26 13:32     ` [PATCH v5] " Michał Winiarski
2017-10-26 13:44       ` Michal Wajdeczko
2017-10-26 14:17       ` [PATCH v6] " Michał Winiarski
2017-10-26 18:49         ` Michel Thierry
2017-10-26 20:02           ` Chris Wilson
2017-10-26 20:15             ` Michel Thierry
2017-10-25 20:00 ` [PATCH 06/12] drm/i915/guc: Split guc_wq_item_append Michał Winiarski
2017-10-25 20:00 ` [PATCH v2 07/12] drm/i915: Extract "emit write" part of emit breadcrumb functions Michał Winiarski
2017-10-25 20:00 ` [PATCH 08/12] drm/i915: Add information needed to track engine preempt state Michał Winiarski
2017-10-25 20:00 ` [PATCH v2 09/12] drm/i915/guc: Keep request->priority for its lifetime Michał Winiarski
2017-10-25 20:00 ` [PATCH v3 10/12] drm/i915: Rename helpers used for unwinding, use macro for can_preempt Michał Winiarski
2017-10-25 20:15   ` Chris Wilson
2017-10-25 20:00 ` [PATCH v4 11/12] drm/i915/guc: Preemption! With GuC Michał Winiarski
2017-10-25 20:24   ` Chris Wilson
2017-10-25 21:14   ` Chris Wilson
2017-10-26  7:27   ` [PATCH v5] " Michał Winiarski
2017-10-26 13:35     ` [PATCH v6] " Michał Winiarski
2017-10-25 20:00 ` [PATCH 12/12] HAX Enable GuC Submission for CI Michał Winiarski
2017-10-25 21:06 ` ✗ Fi.CI.BAT: failure for Preemption with GuC, fourth try Patchwork
2017-10-26  7:48 ` ✗ Fi.CI.BAT: warning for Preemption with GuC, fourth try (rev2) Patchwork
2017-10-26 13:41 ` ✗ Fi.CI.BAT: warning for Preemption with GuC, fourth try (rev3) Patchwork
2017-10-26 13:59 ` ✓ Fi.CI.BAT: success for Preemption with GuC, fourth try (rev5) Patchwork
2017-10-26 14:43 ` ✗ Fi.CI.BAT: warning for Preemption with GuC, fourth try (rev6) Patchwork
2017-10-26 15:18 ` ✓ Fi.CI.IGT: success for Preemption with GuC, fourth try (rev5) Patchwork
2017-10-26 15:59 ` ✓ Fi.CI.IGT: success for Preemption with GuC, fourth try (rev6) Patchwork
2017-10-26 20:37 ` [PATCH 00/12] Preemption with GuC, fourth try Chris Wilson
