* [PATCH 00/21] New GuC ABI
@ 2018-08-29 19:10 Michal Wajdeczko
  2018-08-29 19:10 ` [PATCH 01/21] drm/i915/guc: Update GuC power domain states Michal Wajdeczko
                   ` (10 more replies)
  0 siblings, 11 replies; 49+ messages in thread
From: Michal Wajdeczko @ 2018-08-29 19:10 UTC (permalink / raw)
  To: intel-gfx; +Cc: Rodrigo Vivi, Sujaritha Sundaresan

This series introduces the new Gen11 GuC ABI. Unfortunately, the
new ABI is not backward compatible, so on pre-Gen11 platforms we
will support only HuC authentication until new firmware is
released.

Note: To pass CI.BAT on machines with GuC, the HAX patch overrides
a forced modparam to disable GuC submission on those machines.

Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Michel Thierry <michel.thierry@intel.com>
Cc: John Spotswood <john.a.spotswood@intel.com>
Cc: Vinay Belgaumkar <vinay.belgaumkar@intel.com>
Cc: Tony Ye <tony.ye@intel.com>
Cc: Anusha Srivatsa <anusha.srivatsa@intel.com>
Cc: Jeff Mcgee <jeff.mcgee@intel.com>
Cc: Antonio Argenziano <antonio.argenziano@intel.com>
Cc: Sujaritha Sundaresan <sujaritha.sundaresan@intel.com> 

Michal Wajdeczko (21):
  drm/i915/guc: Update GuC power domain states
  drm/i915/guc: Don't allow GuC submission on pre-Gen11
  drm/i915/guc: Simplify preparation of GuC parameter block
  drm/i915/guc: Support dual Gen9/Gen11 parameters block
  drm/i915/guc: Update sample-forcewake command
  drm/i915/guc: Use guc_class instead of engine_class in fw interface
  drm/i915/guc: New GuC ADS object definition
  drm/i915/guc: Make use of the SW counter field in the context
    descriptor
  drm/i915/guc: New GuC IDs based on engine class and instance
  drm/i915: Add hooks for (per-engine) context allocation/update/free
  drm/i915/guc: New GuC stage descriptors
  drm/i915/guc: New GuC workqueue item submission mechanism
  drm/i915/guc: Add support for resume-parsing wq item
  drm/i915/guc: New reset-engine command
  drm/i915/guc: Support for extended GuC notification messages
  drm/i915/guc: New engine-reset-complete message
  drm/i915/guc: New GuC interrupt register for Gen11
  drm/i915/guc: New GuC scratch registers for Gen11
  drm/i915/huc: New HuC status register for Gen11
  drm/i915/guc: Enable command transport buffers for Gen11
  HAX Don't enable GuC submission on pre-Gen11 even if forced

 drivers/gpu/drm/i915/i915_debugfs.c         |   9 +-
 drivers/gpu/drm/i915/i915_drv.h             |  26 ++-
 drivers/gpu/drm/i915/i915_gem_context.c     |  11 +-
 drivers/gpu/drm/i915/i915_gem_context.h     |   2 +
 drivers/gpu/drm/i915/i915_pci.c             |   1 +
 drivers/gpu/drm/i915/i915_reg.h             |   2 +
 drivers/gpu/drm/i915/i915_utils.h           |  12 +
 drivers/gpu/drm/i915/intel_engine_cs.c      |  21 +-
 drivers/gpu/drm/i915/intel_guc.c            | 249 ++++++++++++++++-----
 drivers/gpu/drm/i915/intel_guc.h            |   9 +-
 drivers/gpu/drm/i915/intel_guc_ads.c        |  91 ++++++--
 drivers/gpu/drm/i915/intel_guc_ct.c         |   5 +-
 drivers/gpu/drm/i915/intel_guc_fwif.h       | 264 +++++++++++++---------
 drivers/gpu/drm/i915/intel_guc_reg.h        |   7 +
 drivers/gpu/drm/i915/intel_guc_submission.c | 329 +++++++++++++++++++++-------
 drivers/gpu/drm/i915/intel_huc.c            |  58 ++++-
 drivers/gpu/drm/i915/intel_lrc.c            |  29 ++-
 drivers/gpu/drm/i915/intel_ringbuffer.h     |   4 +
 drivers/gpu/drm/i915/intel_uc.c             |  21 ++
 drivers/gpu/drm/i915/selftests/intel_guc.c  |   2 +-
 20 files changed, 863 insertions(+), 289 deletions(-)

-- 
1.9.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


* [PATCH 01/21] drm/i915/guc: Update GuC power domain states
  2018-08-29 19:10 [PATCH 00/21] New GuC ABI Michal Wajdeczko
@ 2018-08-29 19:10 ` Michal Wajdeczko
  2018-08-29 20:57   ` Daniele Ceraolo Spurio
  2018-08-29 19:10 ` [PATCH 02/21] drm/i915/guc: Don't allow GuC submission on pre-Gen11 Michal Wajdeczko
                   ` (9 subsequent siblings)
  10 siblings, 1 reply; 49+ messages in thread
From: Michal Wajdeczko @ 2018-08-29 19:10 UTC (permalink / raw)
  To: intel-gfx

We should also update the GuC power domain states when GuC submission
is disabled; otherwise GuC might complain about or ignore our requests.
This seems to be required by all currently released GuC firmware.

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: John Spotswood <john.a.spotswood@intel.com>
Cc: Anusha Srivatsa <anusha.srivatsa@intel.com>
Cc: Tomasz Lis <tomasz.lis@intel.com>
---
 drivers/gpu/drm/i915/intel_uc.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/drivers/gpu/drm/i915/intel_uc.c b/drivers/gpu/drm/i915/intel_uc.c
index 7c95697..7a3a4ca 100644
--- a/drivers/gpu/drm/i915/intel_uc.c
+++ b/drivers/gpu/drm/i915/intel_uc.c
@@ -401,6 +401,10 @@ int intel_uc_init_hw(struct drm_i915_private *i915)
 		ret = intel_guc_submission_enable(guc);
 		if (ret)
 			goto err_communication;
+	} else {
+		ret = intel_guc_sample_forcewake(guc);
+		if (ret)
+			goto err_communication;
 	}
 
 	dev_info(i915->drm.dev, "GuC firmware version %u.%u\n",
-- 
1.9.1


* [PATCH 02/21] drm/i915/guc: Don't allow GuC submission on pre-Gen11
  2018-08-29 19:10 [PATCH 00/21] New GuC ABI Michal Wajdeczko
  2018-08-29 19:10 ` [PATCH 01/21] drm/i915/guc: Update GuC power domain states Michal Wajdeczko
@ 2018-08-29 19:10 ` Michal Wajdeczko
  2018-08-29 19:16   ` Srivatsa, Anusha
                     ` (3 more replies)
  2018-08-29 19:10 ` [PATCH 03/21] drm/i915/guc: Simplify preparation of GuC parameter block Michal Wajdeczko
                   ` (8 subsequent siblings)
  10 siblings, 4 replies; 49+ messages in thread
From: Michal Wajdeczko @ 2018-08-29 19:10 UTC (permalink / raw)
  To: intel-gfx; +Cc: Rodrigo Vivi, Sujaritha Sundaresan

The upcoming Gen11 GuC firmware requires a new interface that is
incompatible with existing pre-Gen11 firmware. Updated firmware for
pre-Gen11 will arrive later. In the meantime, sanitize the enable_guc
option so that we can enable HuC authentication but nothing else on
pre-Gen11.

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Michel Thierry <michel.thierry@intel.com>
Cc: John Spotswood <john.a.spotswood@intel.com>
Cc: Vinay Belgaumkar <vinay.belgaumkar@intel.com>
Cc: Tony Ye <tony.ye@intel.com>
Cc: Anusha Srivatsa <anusha.srivatsa@intel.com>
Cc: Jeff Mcgee <jeff.mcgee@intel.com>
Cc: Antonio Argenziano <antonio.argenziano@intel.com>
Cc: Sujaritha Sundaresan <sujaritha.sundaresan@intel.com>
---
 drivers/gpu/drm/i915/intel_uc.c | 15 +++++++++++++++
 1 file changed, 15 insertions(+)

diff --git a/drivers/gpu/drm/i915/intel_uc.c b/drivers/gpu/drm/i915/intel_uc.c
index 7a3a4ca..185b29b 100644
--- a/drivers/gpu/drm/i915/intel_uc.c
+++ b/drivers/gpu/drm/i915/intel_uc.c
@@ -63,6 +63,8 @@ static int __get_platform_enable_guc(struct drm_i915_private *i915)
 		enable_guc |= ENABLE_GUC_LOAD_HUC;
 
 	/* Any platform specific fine-tuning can be done here */
+	if (INTEL_GEN(i915) < 11)
+		enable_guc &= ~ENABLE_GUC_SUBMISSION;
 
 	return enable_guc;
 }
@@ -115,6 +117,13 @@ static void sanitize_options_early(struct drm_i915_private *i915)
 			 yesno(intel_uc_is_using_guc_submission()),
 			 yesno(intel_uc_is_using_huc()));
 
+	/* Verify GuC submission support */
+	if (intel_uc_is_using_guc_submission() && INTEL_GEN(i915) < 11) {
+		DRM_WARN("Incompatible option detected: %s=%d, %s!\n",
+			 "enable_guc", i915_modparams.enable_guc,
+			 "submission not supported");
+	}
+
 	/* Verify GuC firmware availability */
 	if (intel_uc_is_using_guc() && !intel_uc_fw_is_selected(guc_fw)) {
 		DRM_WARN("Incompatible option detected: %s=%d, %s!\n",
@@ -292,6 +301,12 @@ int intel_uc_init(struct drm_i915_private *i915)
 		return ret;
 
 	if (USES_GUC_SUBMISSION(i915)) {
+
+		if (INTEL_GEN(i915) < 11) {
+			intel_guc_fini(guc);
+			return -EIO;
+		}
+
 		/*
 		 * This is stuff we need to have available at fw load time
 		 * if we are planning to enable submission later
-- 
1.9.1
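
For readers following the thread, the masking done in
__get_platform_enable_guc() can be sketched outside the driver as below.
This is only an illustration under stated assumptions: the bit values
chosen for ENABLE_GUC_SUBMISSION and ENABLE_GUC_LOAD_HUC are assumed, and
sanitize_enable_guc() is a hypothetical helper, not an i915 function.

```c
#include <assert.h>

/* Assumed bit assignments, for illustration only. */
#define ENABLE_GUC_SUBMISSION	(1u << 0)
#define ENABLE_GUC_LOAD_HUC	(1u << 1)

/*
 * Hypothetical stand-alone helper: drop the submission bit on pre-Gen11
 * while leaving the HuC bit untouched.
 */
static unsigned int sanitize_enable_guc(int gen, unsigned int enable_guc)
{
	if (gen < 11)
		enable_guc &= ~ENABLE_GUC_SUBMISSION;

	return enable_guc;
}
```

With these assumed values, a Gen9 request for both submission and HuC
loading is reduced to HuC loading alone, while a Gen11 request passes
through unchanged.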


* [PATCH 03/21] drm/i915/guc: Simplify preparation of GuC parameter block
  2018-08-29 19:10 [PATCH 00/21] New GuC ABI Michal Wajdeczko
  2018-08-29 19:10 ` [PATCH 01/21] drm/i915/guc: Update GuC power domain states Michal Wajdeczko
  2018-08-29 19:10 ` [PATCH 02/21] drm/i915/guc: Don't allow GuC submission on pre-Gen11 Michal Wajdeczko
@ 2018-08-29 19:10 ` Michal Wajdeczko
  2018-08-30 22:58   ` John Spotswood
  2018-09-06  8:32   ` Joonas Lahtinen
  2018-08-29 19:10 ` [PATCH 04/21] drm/i915/guc: Support dual Gen9/Gen11 parameters block Michal Wajdeczko
                   ` (7 subsequent siblings)
  10 siblings, 2 replies; 49+ messages in thread
From: Michal Wajdeczko @ 2018-08-29 19:10 UTC (permalink / raw)
  To: intel-gfx

The definition of the parameter block passed to GuC is about to change.
Refactor the code slightly now to make the upcoming patch smaller.

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: John Spotswood <john.a.spotswood@intel.com>
---
 drivers/gpu/drm/i915/intel_guc.c | 38 +++++++++++++++++++++++---------------
 1 file changed, 23 insertions(+), 15 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_guc.c b/drivers/gpu/drm/i915/intel_guc.c
index 230aea6..982bcc8 100644
--- a/drivers/gpu/drm/i915/intel_guc.c
+++ b/drivers/gpu/drm/i915/intel_guc.c
@@ -320,19 +320,8 @@ static u32 guc_ctl_log_params_flags(struct intel_guc *guc)
 	return flags;
 }
 
-/*
- * Initialise the GuC parameter block before starting the firmware
- * transfer. These parameters are read by the firmware on startup
- * and cannot be changed thereafter.
- */
-void intel_guc_init_params(struct intel_guc *guc)
+static void guc_prepare_params(struct intel_guc *guc, u32 *params)
 {
-	struct drm_i915_private *dev_priv = guc_to_i915(guc);
-	u32 params[GUC_CTL_MAX_DWORDS];
-	int i;
-
-	memset(params, 0, sizeof(params));
-
 	/*
 	 * GuC ARAT increment is 10 ns. GuC default scheduler quantum is one
 	 * second. This ARAR is calculated by:
@@ -347,9 +336,12 @@ void intel_guc_init_params(struct intel_guc *guc)
 	params[GUC_CTL_LOG_PARAMS]  = guc_ctl_log_params_flags(guc);
 	params[GUC_CTL_DEBUG] = guc_ctl_debug_flags(guc);
 	params[GUC_CTL_CTXINFO] = guc_ctl_ctxinfo_flags(guc);
+}
 
-	for (i = 0; i < GUC_CTL_MAX_DWORDS; i++)
-		DRM_DEBUG_DRIVER("param[%2d] = %#x\n", i, params[i]);
+static void guc_write_params(struct intel_guc *guc, const u32 *params)
+{
+	struct drm_i915_private *dev_priv = guc_to_i915(guc);
+	int i;
 
 	/*
 	 * All SOFT_SCRATCH registers are in FORCEWAKE_BLITTER domain and
@@ -360,12 +352,28 @@ void intel_guc_init_params(struct intel_guc *guc)
 
 	I915_WRITE(SOFT_SCRATCH(0), 0);
 
-	for (i = 0; i < GUC_CTL_MAX_DWORDS; i++)
+	for (i = 0; i < GUC_CTL_MAX_DWORDS; i++) {
+		DRM_DEBUG_DRIVER("param[%2d] = %#x\n", i, params[i]);
 		I915_WRITE(SOFT_SCRATCH(1 + i), params[i]);
+	}
 
 	intel_uncore_forcewake_put(dev_priv, FORCEWAKE_BLITTER);
 }
 
+/*
+ * Initialise the GuC parameter block before starting the firmware
+ * transfer. These parameters are read by the firmware on startup
+ * and cannot be changed thereafter.
+ */
+void intel_guc_init_params(struct intel_guc *guc)
+{
+	u32 params[GUC_CTL_MAX_DWORDS];
+
+	memset(params, 0, sizeof(params));
+	guc_prepare_params(guc, params);
+	guc_write_params(guc, params);
+}
+
 int intel_guc_send_nop(struct intel_guc *guc, const u32 *action, u32 len,
 		       u32 *response_buf, u32 response_buf_size)
 {
-- 
1.9.1
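
The prepare/write split made by this patch can be illustrated with a
user-space sketch. Everything here is hypothetical: fake_scratch[] stands
in for the SOFT_SCRATCH registers, MAX_DWORDS is an assumed size, and the
value stored by prepare_params() is a placeholder rather than a real GuC
parameter.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MAX_DWORDS	14	/* assumed size of the parameter block */

/* Hypothetical stand-in for the SOFT_SCRATCH registers. */
static uint32_t fake_scratch[1 + MAX_DWORDS];

/* Step 1: fill the block in memory; no register access here. */
static void prepare_params(uint32_t *params)
{
	params[0] = 0x1234;	/* placeholder value */
}

/* Step 2: copy the finished block to the (fake) registers. */
static void write_params(const uint32_t *params)
{
	int i;

	fake_scratch[0] = 0;
	for (i = 0; i < MAX_DWORDS; i++)
		fake_scratch[1 + i] = params[i];
}

/* Entry point: zero the block, then prepare and write it. */
static void init_params(void)
{
	uint32_t params[MAX_DWORDS];

	memset(params, 0, sizeof(params));
	prepare_params(params);
	write_params(params);
}
```

Keeping the fill step free of register access means a later patch can
swap in a different prepare function without touching the write path.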


* [PATCH 04/21] drm/i915/guc: Support dual Gen9/Gen11 parameters block
  2018-08-29 19:10 [PATCH 00/21] New GuC ABI Michal Wajdeczko
                   ` (2 preceding siblings ...)
  2018-08-29 19:10 ` [PATCH 03/21] drm/i915/guc: Simplify preparation of GuC parameter block Michal Wajdeczko
@ 2018-08-29 19:10 ` Michal Wajdeczko
  2018-08-30 22:58   ` John Spotswood
  2018-09-06  8:39   ` Joonas Lahtinen
  2018-08-29 19:10 ` [PATCH 05/21] drm/i915/guc: Update sample-forcewake command Michal Wajdeczko
                   ` (6 subsequent siblings)
  10 siblings, 2 replies; 49+ messages in thread
From: Michal Wajdeczko @ 2018-08-29 19:10 UTC (permalink / raw)
  To: intel-gfx; +Cc: Rodrigo Vivi, Sujaritha Sundaresan

Gen11 GuC boot parameter definitions differ from those previously
used for Gen9. Support both definitions until new firmware for
pre-Gen11 is available.

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Michel Thierry <michel.thierry@intel.com>
Cc: John Spotswood <john.a.spotswood@intel.com>
Cc: Vinay Belgaumkar <vinay.belgaumkar@intel.com>
Cc: Tony Ye <tony.ye@intel.com>
Cc: Anusha Srivatsa <anusha.srivatsa@intel.com>
Cc: Jeff Mcgee <jeff.mcgee@intel.com>
Cc: Antonio Argenziano <antonio.argenziano@intel.com>
Cc: Sujaritha Sundaresan <sujaritha.sundaresan@intel.com>
---
 drivers/gpu/drm/i915/intel_guc.c      | 76 +++++++++++++++++++++++++----------
 drivers/gpu/drm/i915/intel_guc_fwif.h | 59 +++++++++++++--------------
 2 files changed, 83 insertions(+), 52 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_guc.c b/drivers/gpu/drm/i915/intel_guc.c
index 982bcc8..a9c2f7b 100644
--- a/drivers/gpu/drm/i915/intel_guc.c
+++ b/drivers/gpu/drm/i915/intel_guc.c
@@ -230,14 +230,7 @@ void intel_guc_fini(struct intel_guc *guc)
 static u32 guc_ctl_debug_flags(struct intel_guc *guc)
 {
 	u32 level = intel_guc_log_get_level(&guc->log);
-	u32 flags;
-	u32 ads;
-
-	ads = intel_guc_ggtt_offset(guc, guc->ads_vma) >> PAGE_SHIFT;
-	flags = ads << GUC_ADS_ADDR_SHIFT | GUC_ADS_ENABLED;
-
-	if (!GUC_LOG_LEVEL_IS_ENABLED(level))
-		flags |= GUC_LOG_DEFAULT_DISABLED;
+	u32 flags = 0;
 
 	if (!GUC_LOG_LEVEL_IS_VERBOSE(level))
 		flags |= GUC_LOG_DISABLED;
@@ -248,20 +241,28 @@ static u32 guc_ctl_debug_flags(struct intel_guc *guc)
 	return flags;
 }
 
-static u32 guc_ctl_feature_flags(struct intel_guc *guc)
+static u32 guc9_ctl_debug_flags(struct intel_guc *guc)
 {
-	u32 flags = 0;
+	u32 level = intel_guc_log_get_level(&guc->log);
+	u32 flags;
+	u32 ads;
 
-	flags |=  GUC_CTL_VCS2_ENABLED;
+	ads = intel_guc_ggtt_offset(guc, guc->ads_vma) >> PAGE_SHIFT;
+	flags = ads << GUC9_ADS_ADDR_SHIFT | GUC9_ADS_ENABLED;
 
-	if (USES_GUC_SUBMISSION(guc_to_i915(guc)))
-		flags |= GUC_CTL_KERNEL_SUBMISSIONS;
-	else
-		flags |= GUC_CTL_DISABLE_SCHEDULER;
+	if (!GUC_LOG_LEVEL_IS_ENABLED(level))
+		flags |= GUC9_LOG_DEFAULT_DISABLED;
+
+	flags |= guc_ctl_debug_flags(guc);
 
 	return flags;
 }
 
+static u32 guc9_ctl_feature_flags(struct intel_guc *guc)
+{
+	return GUC9_CTL_VCS2_ENABLED | GUC9_CTL_DISABLE_SCHEDULER;
+}
+
 static u32 guc_ctl_ctxinfo_flags(struct intel_guc *guc)
 {
 	u32 flags = 0;
@@ -279,6 +280,16 @@ static u32 guc_ctl_ctxinfo_flags(struct intel_guc *guc)
 	return flags;
 }
 
+static u32 guc_ctl_feature_flags(struct intel_guc *guc)
+{
+	u32 flags = 0;
+
+	if (!USES_GUC_SUBMISSION(guc_to_i915(guc)))
+		flags |= GUC_CTL_DISABLE_SCHEDULER;
+
+	return flags;
+}
+
 static u32 guc_ctl_log_params_flags(struct intel_guc *guc)
 {
 	u32 offset = intel_guc_ggtt_offset(guc, guc->log.vma) >> PAGE_SHIFT;
@@ -320,22 +331,39 @@ static u32 guc_ctl_log_params_flags(struct intel_guc *guc)
 	return flags;
 }
 
-static void guc_prepare_params(struct intel_guc *guc, u32 *params)
+static void guc9_prepare_params(struct intel_guc *guc, u32 *params)
 {
 	/*
 	 * GuC ARAT increment is 10 ns. GuC default scheduler quantum is one
 	 * second. This ARAR is calculated by:
 	 * Scheduler-Quantum-in-ns / ARAT-increment-in-ns = 1000000000 / 10
 	 */
-	params[GUC_CTL_ARAT_HIGH] = 0;
-	params[GUC_CTL_ARAT_LOW] = 100000000;
+	params[GUC9_CTL_ARAT_HIGH] = 0;
+	params[GUC9_CTL_ARAT_LOW] = 100000000;
+
+	params[GUC9_CTL_WA] |= GUC9_CTL_WA_UK_BY_DRIVER;
 
-	params[GUC_CTL_WA] |= GUC_CTL_WA_UK_BY_DRIVER;
+	params[GUC9_CTL_FEATURE] = guc9_ctl_feature_flags(guc);
+	params[GUC9_CTL_LOG_PARAMS] = guc_ctl_log_params_flags(guc);
+	params[GUC9_CTL_DEBUG] = guc9_ctl_debug_flags(guc);
+	params[GUC_CTL_CTXINFO] = guc_ctl_ctxinfo_flags(guc);
+}
 
+static u32 guc_ctl_ads_flags(struct intel_guc *guc)
+{
+	u32 ads = intel_guc_ggtt_offset(guc, guc->ads_vma) >> PAGE_SHIFT;
+	u32 flags = ads << GUC_ADS_ADDR_SHIFT;
+
+	return flags;
+}
+
+static void guc11_prepare_params(struct intel_guc *guc, u32 *params)
+{
+	params[GUC_CTL_CTXINFO] = guc_ctl_ctxinfo_flags(guc);
+	params[GUC_CTL_LOG_PARAMS] = guc_ctl_log_params_flags(guc);
 	params[GUC_CTL_FEATURE] = guc_ctl_feature_flags(guc);
-	params[GUC_CTL_LOG_PARAMS]  = guc_ctl_log_params_flags(guc);
 	params[GUC_CTL_DEBUG] = guc_ctl_debug_flags(guc);
-	params[GUC_CTL_CTXINFO] = guc_ctl_ctxinfo_flags(guc);
+	params[GUC_CTL_ADS] = guc_ctl_ads_flags(guc);
 }
 
 static void guc_write_params(struct intel_guc *guc, const u32 *params)
@@ -367,10 +395,14 @@ static void guc_write_params(struct intel_guc *guc, const u32 *params)
  */
 void intel_guc_init_params(struct intel_guc *guc)
 {
+	struct drm_i915_private *i915 = guc_to_i915(guc);
 	u32 params[GUC_CTL_MAX_DWORDS];
 
 	memset(params, 0, sizeof(params));
-	guc_prepare_params(guc, params);
+	if (INTEL_GEN(i915) >= 11)
+		guc11_prepare_params(guc, params);
+	else
+		guc9_prepare_params(guc, params);
 	guc_write_params(guc, params);
 }
 
diff --git a/drivers/gpu/drm/i915/intel_guc_fwif.h b/drivers/gpu/drm/i915/intel_guc_fwif.h
index 8382d59..7070e36 100644
--- a/drivers/gpu/drm/i915/intel_guc_fwif.h
+++ b/drivers/gpu/drm/i915/intel_guc_fwif.h
@@ -71,44 +71,28 @@
 #define GUC_STAGE_DESC_ATTR_PCH		BIT(6)
 #define GUC_STAGE_DESC_ATTR_TERMINATED	BIT(7)
 
-/* The guc control data is 10 DWORDs */
+/* New GuC control data */
 #define GUC_CTL_CTXINFO			0
 #define   GUC_CTL_CTXNUM_IN16_SHIFT	0
 #define   GUC_CTL_BASE_ADDR_SHIFT	12
 
-#define GUC_CTL_ARAT_HIGH		1
-#define GUC_CTL_ARAT_LOW		2
-
-#define GUC_CTL_DEVICE_INFO		3
-
-#define GUC_CTL_LOG_PARAMS		4
+#define GUC_CTL_LOG_PARAMS		1
 #define   GUC_LOG_VALID			(1 << 0)
 #define   GUC_LOG_NOTIFY_ON_HALF_FULL	(1 << 1)
 #define   GUC_LOG_ALLOC_IN_MEGABYTE	(1 << 3)
 #define   GUC_LOG_CRASH_SHIFT		4
-#define   GUC_LOG_CRASH_MASK		(0x1 << GUC_LOG_CRASH_SHIFT)
+#define   GUC_LOG_CRASH_MASK		(0x3 << GUC_LOG_CRASH_SHIFT)
 #define   GUC_LOG_DPC_SHIFT		6
 #define   GUC_LOG_DPC_MASK	        (0x7 << GUC_LOG_DPC_SHIFT)
 #define   GUC_LOG_ISR_SHIFT		9
 #define   GUC_LOG_ISR_MASK	        (0x7 << GUC_LOG_ISR_SHIFT)
 #define   GUC_LOG_BUF_ADDR_SHIFT	12
 
-#define GUC_CTL_PAGE_FAULT_CONTROL	5
-
-#define GUC_CTL_WA			6
-#define   GUC_CTL_WA_UK_BY_DRIVER	(1 << 3)
+#define GUC_CTL_WA			2
+#define GUC_CTL_FEATURE			3
+#define   GUC_CTL_DISABLE_SCHEDULER	(1 << 14)
 
-#define GUC_CTL_FEATURE			7
-#define   GUC_CTL_VCS2_ENABLED		(1 << 0)
-#define   GUC_CTL_KERNEL_SUBMISSIONS	(1 << 1)
-#define   GUC_CTL_FEATURE2		(1 << 2)
-#define   GUC_CTL_POWER_GATING		(1 << 3)
-#define   GUC_CTL_DISABLE_SCHEDULER	(1 << 4)
-#define   GUC_CTL_PREEMPTION_LOG	(1 << 5)
-#define   GUC_CTL_ENABLE_SLPC		(1 << 7)
-#define   GUC_CTL_RESET_ON_PREMPT_FAILURE	(1 << 8)
-
-#define GUC_CTL_DEBUG			8
+#define GUC_CTL_DEBUG			4
 #define   GUC_LOG_VERBOSITY_SHIFT	0
 #define   GUC_LOG_VERBOSITY_LOW		(0 << GUC_LOG_VERBOSITY_SHIFT)
 #define   GUC_LOG_VERBOSITY_MED		(1 << GUC_LOG_VERBOSITY_SHIFT)
@@ -121,13 +105,28 @@
 #define	  GUC_LOG_DESTINATION_MASK	(3 << 4)
 #define   GUC_LOG_DISABLED		(1 << 6)
 #define   GUC_PROFILE_ENABLED		(1 << 7)
-#define   GUC_WQ_TRACK_ENABLED		(1 << 8)
-#define   GUC_ADS_ENABLED		(1 << 9)
-#define   GUC_LOG_DEFAULT_DISABLED	(1 << 10)
-#define   GUC_ADS_ADDR_SHIFT		11
-#define   GUC_ADS_ADDR_MASK		0xfffff800
-
-#define GUC_CTL_RSRVD			9
+#define   GUC9_WQ_TRACK_ENABLED		(1 << 8)
+#define   GUC9_ADS_ENABLED		(1 << 9)
+#define   GUC9_LOG_DEFAULT_DISABLED	(1 << 10)
+#define   GUC9_ADS_ADDR_SHIFT		11
+#define   GUC9_ADS_ADDR_MASK		0xfffff800
+
+#define GUC_CTL_ADS			5
+#define   GUC_ADS_ADDR_SHIFT		1
+#define   GUC_ADS_ADDR_MASK		(0xFFFFF << GUC_ADS_ADDR_SHIFT)
+
+/* Legacy GuC control data */
+#define GUC9_CTL_ARAT_HIGH		1
+#define GUC9_CTL_ARAT_LOW		2
+#define GUC9_CTL_DEVICE_INFO		3
+#define GUC9_CTL_LOG_PARAMS		4
+#define GUC9_CTL_PAGE_FAULT_CONTROL	5
+#define GUC9_CTL_WA			6
+#define   GUC9_CTL_WA_UK_BY_DRIVER	(1 << 3)
+#define GUC9_CTL_FEATURE		7
+#define   GUC9_CTL_VCS2_ENABLED		(1 << 0)
+#define   GUC9_CTL_DISABLE_SCHEDULER	(1 << 4)
+#define GUC9_CTL_DEBUG			8
 
 #define GUC_CTL_MAX_DWORDS		(SOFT_SCRATCH_COUNT - 2) /* [1..14] */
 
-- 
1.9.1
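
One visible difference between the two layouts is where the ADS address
lives. The sketch below packs the same GGTT offset both ways, using the
shift and flag values from the patch above. PAGE_SHIFT is assumed to be 12
(4 KiB pages), and both helper functions are hypothetical stand-ins for
guc9_ctl_debug_flags() and guc_ctl_ads_flags(), with the log-level bits
left out.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT		12	/* assumed 4 KiB pages */

/* Values copied from the patch above. */
#define GUC9_ADS_ENABLED	(1u << 9)
#define GUC9_ADS_ADDR_SHIFT	11
#define GUC_ADS_ADDR_SHIFT	1

/* Gen9 layout: the ADS page number shares the DEBUG dword with flags. */
static uint32_t guc9_debug_ads_bits(uint32_t ggtt_offset)
{
	uint32_t ads = ggtt_offset >> PAGE_SHIFT;

	return (ads << GUC9_ADS_ADDR_SHIFT) | GUC9_ADS_ENABLED;
}

/* Gen11 layout: the ADS page number gets its own dword (GUC_CTL_ADS). */
static uint32_t guc11_ads_dword(uint32_t ggtt_offset)
{
	uint32_t ads = ggtt_offset >> PAGE_SHIFT;

	return ads << GUC_ADS_ADDR_SHIFT;
}
```

For an example offset of 0x00abc000 the page number is 0xabc, so the Gen9
helper yields 0x55e200 and the Gen11 helper yields 0x1578.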


* [PATCH 05/21] drm/i915/guc: Update sample-forcewake command
  2018-08-29 19:10 [PATCH 00/21] New GuC ABI Michal Wajdeczko
                   ` (3 preceding siblings ...)
  2018-08-29 19:10 ` [PATCH 04/21] drm/i915/guc: Support dual Gen9/Gen11 parameters block Michal Wajdeczko
@ 2018-08-29 19:10 ` Michal Wajdeczko
  2018-08-29 21:52   ` Daniele Ceraolo Spurio
  2018-08-29 19:10 ` [PATCH 06/21] drm/i915/guc: Use guc_class instead of engine_class in fw interface Michal Wajdeczko
                   ` (5 subsequent siblings)
  10 siblings, 1 reply; 49+ messages in thread
From: Michal Wajdeczko @ 2018-08-29 19:10 UTC (permalink / raw)
  To: intel-gfx

The action ID of this command has changed in the GuC firmware, from 0x6
to 0x3005.

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: John Spotswood <john.a.spotswood@intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Michel Thierry <michel.thierry@intel.com>
---
 drivers/gpu/drm/i915/intel_guc_fwif.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/intel_guc_fwif.h b/drivers/gpu/drm/i915/intel_guc_fwif.h
index 7070e36..963da91 100644
--- a/drivers/gpu/drm/i915/intel_guc_fwif.h
+++ b/drivers/gpu/drm/i915/intel_guc_fwif.h
@@ -659,7 +659,6 @@ enum intel_guc_action {
 	INTEL_GUC_ACTION_DEFAULT = 0x0,
 	INTEL_GUC_ACTION_REQUEST_PREEMPTION = 0x2,
 	INTEL_GUC_ACTION_REQUEST_ENGINE_RESET = 0x3,
-	INTEL_GUC_ACTION_SAMPLE_FORCEWAKE = 0x6,
 	INTEL_GUC_ACTION_ALLOCATE_DOORBELL = 0x10,
 	INTEL_GUC_ACTION_DEALLOCATE_DOORBELL = 0x20,
 	INTEL_GUC_ACTION_LOG_BUFFER_FILE_FLUSH_COMPLETE = 0x30,
@@ -667,6 +666,7 @@ enum intel_guc_action {
 	INTEL_GUC_ACTION_ENTER_S_STATE = 0x501,
 	INTEL_GUC_ACTION_EXIT_S_STATE = 0x502,
 	INTEL_GUC_ACTION_SLPC_REQUEST = 0x3003,
+	INTEL_GUC_ACTION_SAMPLE_FORCEWAKE = 0x3005,
 	INTEL_GUC_ACTION_AUTHENTICATE_HUC = 0x4000,
 	INTEL_GUC_ACTION_REGISTER_COMMAND_TRANSPORT_BUFFER = 0x4505,
 	INTEL_GUC_ACTION_DEREGISTER_COMMAND_TRANSPORT_BUFFER = 0x4506,
-- 
1.9.1


* [PATCH 06/21] drm/i915/guc: Use guc_class instead of engine_class in fw interface
  2018-08-29 19:10 [PATCH 00/21] New GuC ABI Michal Wajdeczko
                   ` (4 preceding siblings ...)
  2018-08-29 19:10 ` [PATCH 05/21] drm/i915/guc: Update sample-forcewake command Michal Wajdeczko
@ 2018-08-29 19:10 ` Michal Wajdeczko
  2018-08-29 19:58   ` Michel Thierry
  2018-09-06  8:55   ` Joonas Lahtinen
  2018-08-29 19:15 ` [PATCH 07/21] drm/i915/guc: New GuC ADS object definition Michal Wajdeczko
                   ` (4 subsequent siblings)
  10 siblings, 2 replies; 49+ messages in thread
From: Michal Wajdeczko @ 2018-08-29 19:10 UTC (permalink / raw)
  To: intel-gfx; +Cc: Lucas De Marchi, Rodrigo Vivi

Until now the GuC and HW engine classes have been the same, which
allowed us to use them interchangeably. It is better to start doing the
right thing and use the GuC definitions for the firmware interface.

We also keep the same class id in the ctx descriptor so that the driver
and firmware logs show the same values.

Signed-off-by: Michel Thierry <michel.thierry@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Michel Thierry <michel.thierry@intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Tomasz Lis <tomasz.lis@intel.com>
---
 drivers/gpu/drm/i915/intel_engine_cs.c  | 13 +++++++++++++
 drivers/gpu/drm/i915/intel_guc_fwif.h   |  7 +++++++
 drivers/gpu/drm/i915/intel_lrc.c        | 10 +++++++++-
 drivers/gpu/drm/i915/intel_ringbuffer.h |  2 ++
 4 files changed, 31 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/intel_engine_cs.c b/drivers/gpu/drm/i915/intel_engine_cs.c
index 1a34e8f..bc81354 100644
--- a/drivers/gpu/drm/i915/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/intel_engine_cs.c
@@ -85,6 +85,7 @@ struct engine_info {
 	unsigned int hw_id;
 	unsigned int uabi_id;
 	u8 class;
+	u8 guc_class;
 	u8 instance;
 	/* mmio bases table *must* be sorted in reverse gen order */
 	struct engine_mmio_base {
@@ -98,6 +99,7 @@ struct engine_info {
 		.hw_id = RCS_HW,
 		.uabi_id = I915_EXEC_RENDER,
 		.class = RENDER_CLASS,
+		.guc_class = GUC_RENDER_CLASS,
 		.instance = 0,
 		.mmio_bases = {
 			{ .gen = 1, .base = RENDER_RING_BASE }
@@ -107,6 +109,7 @@ struct engine_info {
 		.hw_id = BCS_HW,
 		.uabi_id = I915_EXEC_BLT,
 		.class = COPY_ENGINE_CLASS,
+		.guc_class = GUC_BLITTER_CLASS,
 		.instance = 0,
 		.mmio_bases = {
 			{ .gen = 6, .base = BLT_RING_BASE }
@@ -116,6 +119,7 @@ struct engine_info {
 		.hw_id = VCS_HW,
 		.uabi_id = I915_EXEC_BSD,
 		.class = VIDEO_DECODE_CLASS,
+		.guc_class = GUC_VIDEO_CLASS,
 		.instance = 0,
 		.mmio_bases = {
 			{ .gen = 11, .base = GEN11_BSD_RING_BASE },
@@ -127,6 +131,7 @@ struct engine_info {
 		.hw_id = VCS2_HW,
 		.uabi_id = I915_EXEC_BSD,
 		.class = VIDEO_DECODE_CLASS,
+		.guc_class = GUC_VIDEO_CLASS,
 		.instance = 1,
 		.mmio_bases = {
 			{ .gen = 11, .base = GEN11_BSD2_RING_BASE },
@@ -137,6 +142,7 @@ struct engine_info {
 		.hw_id = VCS3_HW,
 		.uabi_id = I915_EXEC_BSD,
 		.class = VIDEO_DECODE_CLASS,
+		.guc_class = GUC_VIDEO_CLASS,
 		.instance = 2,
 		.mmio_bases = {
 			{ .gen = 11, .base = GEN11_BSD3_RING_BASE }
@@ -146,6 +152,7 @@ struct engine_info {
 		.hw_id = VCS4_HW,
 		.uabi_id = I915_EXEC_BSD,
 		.class = VIDEO_DECODE_CLASS,
+		.guc_class = GUC_VIDEO_CLASS,
 		.instance = 3,
 		.mmio_bases = {
 			{ .gen = 11, .base = GEN11_BSD4_RING_BASE }
@@ -155,6 +162,7 @@ struct engine_info {
 		.hw_id = VECS_HW,
 		.uabi_id = I915_EXEC_VEBOX,
 		.class = VIDEO_ENHANCEMENT_CLASS,
+		.guc_class = GUC_VIDEOENHANCE_CLASS,
 		.instance = 0,
 		.mmio_bases = {
 			{ .gen = 11, .base = GEN11_VEBOX_RING_BASE },
@@ -165,6 +173,7 @@ struct engine_info {
 		.hw_id = VECS2_HW,
 		.uabi_id = I915_EXEC_VEBOX,
 		.class = VIDEO_ENHANCEMENT_CLASS,
+		.guc_class = GUC_VIDEOENHANCE_CLASS,
 		.instance = 1,
 		.mmio_bases = {
 			{ .gen = 11, .base = GEN11_VEBOX2_RING_BASE }
@@ -276,6 +285,9 @@ static void __sprint_engine_name(char *name, const struct engine_info *info)
 	if (GEM_WARN_ON(info->class > MAX_ENGINE_CLASS))
 		return -EINVAL;
 
+	if (GEM_WARN_ON(info->guc_class >= GUC_MAX_ENGINE_CLASSES))
+		return -EINVAL;
+
 	if (GEM_WARN_ON(info->instance > MAX_ENGINE_INSTANCE))
 		return -EINVAL;
 
@@ -291,6 +303,7 @@ static void __sprint_engine_name(char *name, const struct engine_info *info)
 	engine->i915 = dev_priv;
 	__sprint_engine_name(engine->name, info);
 	engine->hw_id = engine->guc_id = info->hw_id;
+	engine->guc_class = info->guc_class;
 	engine->mmio_base = __engine_mmio_base(dev_priv, info->mmio_bases);
 	engine->class = info->class;
 	engine->instance = info->instance;
diff --git a/drivers/gpu/drm/i915/intel_guc_fwif.h b/drivers/gpu/drm/i915/intel_guc_fwif.h
index 963da91..5b7a05b 100644
--- a/drivers/gpu/drm/i915/intel_guc_fwif.h
+++ b/drivers/gpu/drm/i915/intel_guc_fwif.h
@@ -39,6 +39,13 @@
 #define GUC_VIDEO_ENGINE2		4
 #define GUC_MAX_ENGINES_NUM		(GUC_VIDEO_ENGINE2 + 1)
 
+#define GUC_RENDER_CLASS	0
+#define GUC_VIDEO_CLASS		1
+#define GUC_VIDEOENHANCE_CLASS	2
+#define GUC_BLITTER_CLASS	3
+#define GUC_RESERVED_CLASS	4
+#define GUC_MAX_ENGINE_CLASSES	(GUC_RESERVED_CLASS + 1)
+
 /* Work queue item header definitions */
 #define WQ_STATUS_ACTIVE		1
 #define WQ_STATUS_SUSPENDED		2
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index f8ceb9c..f4b9972 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -249,7 +249,15 @@ static inline bool need_preempt(const struct intel_engine_cs *engine,
 
 		/* TODO: decide what to do with SW counter (bits 55-60) */
 
-		desc |= (u64)engine->class << GEN11_ENGINE_CLASS_SHIFT;
+		/*
+		 * Although GuC will never see this upper part as it fills
+		 * its own descriptor, using the guc_class here will help keep
+		 * the i915 and firmware logs in sync.
+		 */
+		if (HAS_GUC_SCHED(ctx->i915))
+			desc |= (u64)engine->guc_class << GEN11_ENGINE_CLASS_SHIFT;
+		else
+			desc |= (u64)engine->class << GEN11_ENGINE_CLASS_SHIFT;
 								/* bits 61-63 */
 	} else {
 		GEM_BUG_ON(ctx->hw_id >= BIT(GEN8_CTX_ID_WIDTH));
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index 3f6920d..f47009f 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -350,7 +350,9 @@ struct intel_engine_cs {
 
 	enum intel_engine_id id;
 	unsigned int hw_id;
+
 	unsigned int guc_id;
+	u8 guc_class;
 
 	u8 uabi_id;
 	u8 uabi_class;
-- 
1.9.1


* [PATCH 07/21] drm/i915/guc: New GuC ADS object definition
  2018-08-29 19:10 [PATCH 00/21] New GuC ABI Michal Wajdeczko
                   ` (5 preceding siblings ...)
  2018-08-29 19:10 ` [PATCH 06/21] drm/i915/guc: Use guc_class instead of engine_class in fw interface Michal Wajdeczko
@ 2018-08-29 19:15 ` Michal Wajdeczko
  2018-08-29 19:16 ` [PATCH 08/21] drm/i915/guc: Make use of the SW counter field in the context descriptor Michal Wajdeczko
                   ` (3 subsequent siblings)
  10 siblings, 0 replies; 49+ messages in thread
From: Michal Wajdeczko @ 2018-08-29 19:15 UTC (permalink / raw)
  To: intel-gfx

The definitions of the Additional Data Structure (ADS) object and
some of its sub-structs have been updated in the GuC firmware.

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Michel Thierry <michel.thierry@intel.com>
Cc: Tomasz Lis <tomasz.lis@intel.com>
---
 drivers/gpu/drm/i915/intel_engine_cs.c  |  5 ++
 drivers/gpu/drm/i915/intel_guc_ads.c    | 91 ++++++++++++++++++++++++---------
 drivers/gpu/drm/i915/intel_guc_fwif.h   | 89 ++++++++++++++++++--------------
 drivers/gpu/drm/i915/intel_ringbuffer.h |  2 +
 4 files changed, 126 insertions(+), 61 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_engine_cs.c b/drivers/gpu/drm/i915/intel_engine_cs.c
index bc81354..6cefe26 100644
--- a/drivers/gpu/drm/i915/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/intel_engine_cs.c
@@ -270,6 +270,11 @@ static void __sprint_engine_name(char *name, const struct engine_info *info)
 			 info->instance) >= INTEL_ENGINE_CS_MAX_NAME);
 }
 
+u32 intel_class_context_size(struct drm_i915_private *dev_priv, u8 class)
+{
+	return __intel_engine_context_size(dev_priv, class);
+}
+
 static int
 intel_engine_setup(struct drm_i915_private *dev_priv,
 		   enum intel_engine_id id)
diff --git a/drivers/gpu/drm/i915/intel_guc_ads.c b/drivers/gpu/drm/i915/intel_guc_ads.c
index f0db628..8e59a64 100644
--- a/drivers/gpu/drm/i915/intel_guc_ads.c
+++ b/drivers/gpu/drm/i915/intel_guc_ads.c
@@ -51,7 +51,7 @@ static void guc_policies_init(struct guc_policies *policies)
 	policies->max_num_work_items = POLICY_MAX_NUM_WI;
 
 	for (p = 0; p < GUC_CLIENT_PRIORITY_NUM; p++) {
-		for (i = GUC_RENDER_ENGINE; i < GUC_MAX_ENGINES_NUM; i++) {
+		for (i = 0; i < GUC_MAX_ENGINE_CLASSES; i++) {
 			policy = &policies->policy[p][i];
 
 			guc_policy_init(policy);
@@ -61,12 +61,35 @@ static void guc_policies_init(struct guc_policies *policies)
 	policies->is_valid = 1;
 }
 
+static u8 guc_class_to_intel_class(u8 guc_class)
+{
+	switch (guc_class) {
+	default:
+		MISSING_CASE(guc_class);
+		/* fall through */
+	case GUC_RENDER_CLASS:
+		return RENDER_CLASS;
+	case GUC_VIDEO_CLASS:
+		return VIDEO_DECODE_CLASS;
+	case GUC_VIDEOENHANCE_CLASS:
+		return VIDEO_ENHANCEMENT_CLASS;
+	case GUC_BLITTER_CLASS:
+		return COPY_ENGINE_CLASS;
+	}
+}
+
 /*
  * The first 80 dwords of the register state context, containing the
  * execlists and ppgtt registers.
  */
 #define LR_HW_CONTEXT_SIZE	(80 * sizeof(u32))
 
+static void
+guc_master_cmd_transport_pool_init(struct guc_master_cmd_transport_pool *pool)
+{
+	memset(pool, 0, sizeof(*pool));
+}
+
 /**
  * intel_guc_ads_create() - creates GuC ADS
  * @guc: intel_guc struct
@@ -76,19 +99,22 @@ int intel_guc_ads_create(struct intel_guc *guc)
 {
 	struct drm_i915_private *dev_priv = guc_to_i915(guc);
 	struct i915_vma *vma, *kernel_ctx_vma;
-	struct page *page;
 	/* The ads obj includes the struct itself and buffers passed to GuC */
 	struct {
 		struct guc_ads ads;
 		struct guc_policies policies;
 		struct guc_mmio_reg_state reg_state;
+		struct guc_gt_system_info system_info;
+		struct guc_gt_system_additional_info add_system_info;
+		struct guc_master_cmd_transport_pool ct_pool;
 		u8 reg_state_buffer[GUC_S3_SAVE_SPACE_PAGES * PAGE_SIZE];
 	} __packed *blob;
-	struct intel_engine_cs *engine;
-	enum intel_engine_id id;
 	const u32 skipped_offset = LRC_HEADER_PAGES * PAGE_SIZE;
 	const u32 skipped_size = LRC_PPHWSP_SZ * PAGE_SIZE + LR_HW_CONTEXT_SIZE;
+	u32 media_fuse;
 	u32 base;
+	u8 class;
+	int ret;
 
 	GEM_BUG_ON(guc->ads_vma);
 
@@ -98,21 +124,15 @@ int intel_guc_ads_create(struct intel_guc *guc)
 
 	guc->ads_vma = vma;
 
-	page = i915_vma_first_page(vma);
-	blob = kmap(page);
+	blob = i915_gem_object_pin_map(guc->ads_vma->obj, I915_MAP_WB);
+	if (IS_ERR(blob)) {
+		ret = PTR_ERR(blob);
+		goto err_vma;
+	}
 
 	/* GuC scheduling policies */
 	guc_policies_init(&blob->policies);
 
-	/* MMIO reg state */
-	for_each_engine(engine, dev_priv, id) {
-		blob->reg_state.white_list[engine->guc_id].mmio_start =
-			engine->mmio_base + GUC_MMIO_WHITE_LIST_START;
-
-		/* Nothing to be saved or restored for now. */
-		blob->reg_state.white_list[engine->guc_id].count = 0;
-	}
-
 	/*
 	 * The GuC requires a "Golden Context" when it reinitialises
 	 * engines after a reset. Here we use the Render ring default
@@ -123,27 +143,50 @@ int intel_guc_ads_create(struct intel_guc *guc)
 	 */
 	kernel_ctx_vma = to_intel_context(dev_priv->kernel_context,
 					  dev_priv->engine[RCS])->state;
-	blob->ads.golden_context_lrca =
+	blob->ads.golden_context_lrca[GUC_RENDER_CLASS] =
 		intel_guc_ggtt_offset(guc, kernel_ctx_vma) + skipped_offset;
 
 	/*
-	 * The GuC expects us to exclude the portion of the context image that
-	 * it skips from the size it is to read. It starts reading from after
-	 * the execlist context (so skipping the first page [PPHWSP] and 80
-	 * dwords). Weird guc is weird.
+	 * We only care about the golden context for the render class, really
+	 * (but skipping the execlist part of the context)
 	 */
-	for_each_engine(engine, dev_priv, id)
-		blob->ads.eng_state_size[engine->guc_id] =
-			engine->context_size - skipped_size;
+	class = guc_class_to_intel_class(GUC_RENDER_CLASS);
+	blob->ads.eng_state_size[GUC_RENDER_CLASS] =
+		intel_class_context_size(dev_priv, class) - skipped_size;
+
+	/* System info */
+	blob->system_info.slice_enabled = hweight8(INTEL_INFO(dev_priv)->sseu.slice_mask);
+	blob->system_info.rcs_enabled = 1;
+	blob->system_info.bcs_enabled = 1;
+
+	media_fuse = I915_READ(GEN11_GT_VEBOX_VDBOX_DISABLE);
+	blob->system_info.vdbox_enable_mask = ~(media_fuse & GEN11_GT_VDBOX_DISABLE_MASK);
+	blob->system_info.vebox_enable_mask = ~((media_fuse & GEN11_GT_VEBOX_DISABLE_MASK) >>
+						GEN11_GT_VEBOX_DISABLE_SHIFT);
 
 	base = intel_guc_ggtt_offset(guc, vma);
+
+	/* Additional info  */
+	guc_master_cmd_transport_pool_init(&blob->ct_pool);
+
+	blob->add_system_info.gfx_address_command_transport_pool =
+		base + ptr_offset(blob, ct_pool);
+	blob->add_system_info.command_transport_pool_count = GUC_CT_POOL_SIZE;
+
+	/* ADS */
 	blob->ads.scheduler_policies = base + ptr_offset(blob, policies);
 	blob->ads.reg_state_buffer = base + ptr_offset(blob, reg_state_buffer);
 	blob->ads.reg_state_addr = base + ptr_offset(blob, reg_state);
+	blob->ads.gt_system_info = base + ptr_offset(blob, system_info);
+	blob->ads.gt_system_additional_info = base + ptr_offset(blob, add_system_info);
 
-	kunmap(page);
+	i915_gem_object_unpin_map(guc->ads_vma->obj);
 
 	return 0;
+
+err_vma:
+	i915_vma_unpin_and_release(&guc->ads_vma, 0);
+	return ret;
 }
 
 void intel_guc_ads_destroy(struct intel_guc *guc)
diff --git a/drivers/gpu/drm/i915/intel_guc_fwif.h b/drivers/gpu/drm/i915/intel_guc_fwif.h
index 5b7a05b..2b41538 100644
--- a/drivers/gpu/drm/i915/intel_guc_fwif.h
+++ b/drivers/gpu/drm/i915/intel_guc_fwif.h
@@ -45,6 +45,7 @@
 #define GUC_BLITTER_CLASS	3
 #define GUC_RESERVED_CLASS	4
 #define GUC_MAX_ENGINE_CLASSES	(GUC_RESERVED_CLASS + 1)
+#define GUC_MAX_INSTANCES_PER_CLASS	4
 
 /* Work queue item header definitions */
 #define WQ_STATUS_ACTIVE		1
@@ -447,23 +448,19 @@ struct guc_ct_buffer_desc {
 struct guc_policy {
 	/* Time for one workload to execute. (in micro seconds) */
 	u32 execution_quantum;
-	u32 reserved1;
-
 	/* Time to wait for a preemption request to completed before issuing a
 	 * reset. (in micro seconds). */
 	u32 preemption_time;
-
 	/* How much time to allow to run after the first fault is observed.
 	 * Then preempt afterwards. (in micro seconds) */
 	u32 fault_time;
-
 	u32 policy_flags;
-	u32 reserved[2];
+	u32 reserved[8];
 } __packed;
 
 struct guc_policies {
-	struct guc_policy policy[GUC_CLIENT_PRIORITY_NUM][GUC_MAX_ENGINES_NUM];
-
+	struct guc_policy policy[GUC_CLIENT_PRIORITY_NUM][GUC_MAX_ENGINE_CLASSES];
+	u32 submission_queue_depth[GUC_MAX_ENGINE_CLASSES];
 	/* In micro seconds. How much time to allow before DPC processing is
 	 * called back via interrupt (to prevent DPC queue drain starving).
 	 * Typically 1000s of micro seconds (example only, not granularity). */
@@ -476,57 +473,75 @@ struct guc_policies {
 	 * idle. */
 	u32 max_num_work_items;
 
-	u32 reserved[19];
+	u32 reserved[4];
 } __packed;
 
 /* GuC MMIO reg state struct */
 
-#define GUC_REGSET_FLAGS_NONE		0x0
-#define GUC_REGSET_POWERCYCLE		0x1
-#define GUC_REGSET_MASKED		0x2
-#define GUC_REGSET_ENGINERESET		0x4
-#define GUC_REGSET_SAVE_DEFAULT_VALUE	0x8
-#define GUC_REGSET_SAVE_CURRENT_VALUE	0x10
 
-#define GUC_REGSET_MAX_REGISTERS	25
-#define GUC_MMIO_WHITE_LIST_START	0x24d0
-#define GUC_MMIO_WHITE_LIST_MAX		12
+#define GUC_REGSET_MAX_REGISTERS	64
 #define GUC_S3_SAVE_SPACE_PAGES		10
 
-struct guc_mmio_regset {
-	struct __packed {
-		u32 offset;
-		u32 value;
-		u32 flags;
-	} registers[GUC_REGSET_MAX_REGISTERS];
+struct guc_mmio_reg {
+	u32 offset;
+	u32 value;
+	u32 flags;
+#define GUC_REGSET_MASKED		(1 << 0)
+} __packed;
 
+struct guc_mmio_regset {
+	struct guc_mmio_reg registers[GUC_REGSET_MAX_REGISTERS];
 	u32 values_valid;
 	u32 number_of_registers;
 } __packed;
 
-/* MMIO registers that are set as non privileged */
-struct mmio_white_list {
-	u32 mmio_start;
-	u32 offsets[GUC_MMIO_WHITE_LIST_MAX];
-	u32 count;
+/* GuC register sets */
+struct guc_mmio_reg_state {
+	struct guc_mmio_regset engine_reg[GUC_MAX_ENGINE_CLASSES][GUC_MAX_INSTANCES_PER_CLASS];
+	u32 reserved[98];
+} __packed;
+
+/* Gen11+ HW info */
+struct guc_gt_system_info {
+	u32 slice_enabled;
+	u32 rcs_enabled;
+	u32 reserved0;
+	u32 bcs_enabled;
+	u32 vdbox_enable_mask;
+	u32 vdbox_sfc_support_mask;
+	u32 vebox_enable_mask;
+	u32 reserved[9];
 } __packed;
 
-struct guc_mmio_reg_state {
-	struct guc_mmio_regset global_reg;
-	struct guc_mmio_regset engine_reg[GUC_MAX_ENGINES_NUM];
-	struct mmio_white_list white_list[GUC_MAX_ENGINES_NUM];
+struct guc_gt_system_additional_info {
+	u32 reserved0[14];
+	u32 gfx_address_command_transport_pool;
+	u32 command_transport_pool_count;
+	u32 reserved1[95];
 } __packed;
 
-/* GuC Additional Data Struct */
+struct guc_master_cmd_transport_buffer_alloc {
+	struct guc_ct_buffer_desc desc;
+	u32 reserved[7];
+} __packed;
 
+#define GUC_CT_POOL_SIZE	2
+
+struct guc_master_cmd_transport_pool {
+	struct guc_master_cmd_transport_buffer_alloc pool[GUC_CT_POOL_SIZE];
+} __packed;
+
+/* GuC Additional Data Struct */
 struct guc_ads {
 	u32 reg_state_addr;
 	u32 reg_state_buffer;
-	u32 golden_context_lrca;
 	u32 scheduler_policies;
-	u32 reserved0[3];
-	u32 eng_state_size[GUC_MAX_ENGINES_NUM];
-	u32 reserved2[4];
+	u32 gt_system_info;
+	u32 gt_system_additional_info;
+	u32 control_data;
+	u32 golden_context_lrca[GUC_MAX_ENGINE_CLASSES];
+	u32 eng_state_size[GUC_MAX_ENGINE_CLASSES];
+	u32 reserved[16];
 } __packed;
 
 /* GuC logging structures */
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index f47009f..ec13d39 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -1183,6 +1183,8 @@ static inline void intel_engine_context_out(struct intel_engine_cs *engine)
 
 ktime_t intel_engine_get_busy_time(struct intel_engine_cs *engine);
 
+u32 intel_class_context_size(struct drm_i915_private *dev_priv, u8 class);
+
 #if IS_ENABLED(CONFIG_DRM_I915_SELFTEST)
 
 static inline bool inject_preempt_hang(struct intel_engine_execlists *execlists)
-- 
1.9.1


* [PATCH 08/21] drm/i915/guc: Make use of the SW counter field in the context descriptor
  2018-08-29 19:10 [PATCH 00/21] New GuC ABI Michal Wajdeczko
                   ` (6 preceding siblings ...)
  2018-08-29 19:15 ` [PATCH 07/21] drm/i915/guc: New GuC ADS object definition Michal Wajdeczko
@ 2018-08-29 19:16 ` Michal Wajdeczko
  2018-08-30  0:08   ` Lionel Landwerlin
  2018-08-29 19:17 ` [PATCH 09/21] drm/i915/guc: New GuC IDs based on engine class and instance Michal Wajdeczko
                   ` (2 subsequent siblings)
  10 siblings, 1 reply; 49+ messages in thread
From: Michal Wajdeczko @ 2018-08-29 19:16 UTC (permalink / raw)
  To: intel-gfx; +Cc: Oscar Mateo, Rodrigo Vivi

The new context descriptor format contains two assignable fields:
the SW Context ID (technically 11 bits, but practically limited to 2032
entries due to some being reserved for future use by the GuC) and the
SW Counter (6 bits).

We don't want to limit ourselves too much in the maximum number of
concurrent contexts we want to allow, so ideally we want to employ
every possible bit available. Unfortunately, a further limitation in
the interface with the GuC means the combination of SW Context ID +
SW Counter has to be unique within the same engine class (as we use
the SW Context ID to index in the GuC stage descriptor pool, and the
Engine Class + SW Counter to index in the 2-dimensional lrc array).
This essentially means we need to somehow encode the engine instance.

Since the BSpec allows 6 bits for engine instance, we use the whole
SW counter for this task. If the limitation of 2032 maximum simultaneous
contexts is too restrictive, we can always squeeze things a bit more
(3 extra bits for hw_id, 3 bits for instance) and things will still
work (Gen11 does not instantiate more than 8 engines of any class).

Another alternative would be to generate the hw_id per HW context
instead of per GEM context, but that has other problems (e.g. maximum
number of user-created contexts would be variable, no relationship
between a GuC principal descriptor and the proxy descriptor it uses, ...)
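
A minimal sketch of the resulting descriptor layout, assuming the shifts
and widths from i915_reg.h and the choice made in this patch (SW Context
ID = hw_id, SW Counter = engine instance). The helper name
gen11_desc_upper and the MAX_HW_ID_WITH_GUC macro are made up for
illustration; the real code lives in intel_lrc.c.

```c
#include <assert.h>
#include <stdint.h>

/* Shifts and widths as defined in i915_reg.h. */
#define GEN11_SW_CTX_ID_SHIFT		37
#define GEN11_SW_CTX_ID_WIDTH		11
#define GEN11_ENGINE_INSTANCE_SHIFT	48
#define GEN11_SW_COUNTER_SHIFT		55
#define GEN11_SW_COUNTER_WIDTH		6
#define GEN11_ENGINE_CLASS_SHIFT	61
#define GEN11_ENGINE_CLASS_WIDTH	3

/* 2048 possible IDs minus the 16 entries reserved by the GuC. */
#define MAX_HW_ID_WITH_GUC	((1u << GEN11_SW_CTX_ID_WIDTH) - 16)

/* Hypothetical helper: builds only bits 37-63 of the descriptor. */
static uint64_t gen11_desc_upper(uint32_t hw_id, uint32_t instance,
				 uint32_t engine_class)
{
	uint64_t desc = 0;

	assert(hw_id < MAX_HW_ID_WITH_GUC);
	assert(instance < (1u << GEN11_SW_COUNTER_WIDTH));
	assert(engine_class < (1u << GEN11_ENGINE_CLASS_WIDTH));

	/* SW Context ID, bits 37-47 */
	desc |= (uint64_t)hw_id << GEN11_SW_CTX_ID_SHIFT;
	/* Engine instance, bits 48-53 */
	desc |= (uint64_t)instance << GEN11_ENGINE_INSTANCE_SHIFT;
	/* SW Counter, bits 55-60: reused to carry the engine instance */
	desc |= (uint64_t)instance << GEN11_SW_COUNTER_SHIFT;
	/* Engine class, bits 61-63 */
	desc |= (uint64_t)engine_class << GEN11_ENGINE_CLASS_SHIFT;

	return desc;
}
```

Two HW contexts of the same GEM context (same hw_id) on different
instances of one engine class then differ in the SW Counter bits, which
is the uniqueness property described above.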

Bspec: 12254

Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Michel Thierry <michel.thierry@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.h         | 15 +++++++++++----
 drivers/gpu/drm/i915/i915_gem_context.c |  5 ++++-
 drivers/gpu/drm/i915/i915_gem_context.h |  2 ++
 drivers/gpu/drm/i915/i915_reg.h         |  2 ++
 drivers/gpu/drm/i915/intel_lrc.c        | 12 +++++++++---
 5 files changed, 28 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index e5b9d3c..34f5495 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1866,14 +1866,21 @@ struct drm_i915_private {
 		struct llist_head free_list;
 		struct work_struct free_work;
 
-		/* The hw wants to have a stable context identifier for the
+		/*
+		 * The HW wants to have a stable context identifier for the
 		 * lifetime of the context (for OA, PASID, faults, etc).
 		 * This is limited in execlists to 21 bits.
+		 * In enhanced execlist (GEN11+) this is limited to 11 bits
+		 * (the SW Context ID field) but GuC limits it a bit further
+		 * (11 bits - 16) due to some entries being reserved for future
+		 * use (so the firmware only supports a GuC stage descriptor
+		 * pool of 2032 entries).
 		 */
 		struct ida hw_ida;
-#define MAX_CONTEXT_HW_ID (1<<21) /* exclusive */
-#define MAX_GUC_CONTEXT_HW_ID (1 << 20) /* exclusive */
-#define GEN11_MAX_CONTEXT_HW_ID (1<<11) /* exclusive */
+#define MAX_CONTEXT_HW_ID			(1 << 21) /* exclusive */
+#define MAX_GUC_CONTEXT_HW_ID			(1 << 20) /* exclusive */
+#define GEN11_MAX_CONTEXT_HW_ID			(1 << 11) /* exclusive */
+#define GEN11_MAX_CONTEXT_HW_ID_WITH_GUC	(GEN11_MAX_CONTEXT_HW_ID - 16)
 	} contexts;
 
 	u32 fdi_rx_config;
diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index f15a039..e3b500c 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -209,7 +209,10 @@ static int assign_hw_id(struct drm_i915_private *dev_priv, unsigned *out)
 	unsigned int max;
 
 	if (INTEL_GEN(dev_priv) >= 11) {
-		max = GEN11_MAX_CONTEXT_HW_ID;
+		if (USES_GUC_SUBMISSION(dev_priv))
+			max = GEN11_MAX_CONTEXT_HW_ID_WITH_GUC;
+		else
+			max = GEN11_MAX_CONTEXT_HW_ID;
 	} else {
 		/*
 		 * When using GuC in proxy submission, GuC consumes the
diff --git a/drivers/gpu/drm/i915/i915_gem_context.h b/drivers/gpu/drm/i915/i915_gem_context.h
index 851dad6..4b87f5d 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.h
+++ b/drivers/gpu/drm/i915/i915_gem_context.h
@@ -154,6 +154,8 @@ struct i915_gem_context {
 		struct intel_ring *ring;
 		u32 *lrc_reg_state;
 		u64 lrc_desc;
+		u32 sw_context_id;
+		u32 sw_counter;
 		int pin_count;
 
 		const struct intel_context_ops *ops;
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index f232178..ea65d7b 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -3900,6 +3900,8 @@ enum {
 #define GEN8_CTX_ID_WIDTH 21
 #define GEN11_SW_CTX_ID_SHIFT 37
 #define GEN11_SW_CTX_ID_WIDTH 11
+#define GEN11_SW_COUNTER_SHIFT 55
+#define GEN11_SW_COUNTER_WIDTH 6
 #define GEN11_ENGINE_CLASS_SHIFT 61
 #define GEN11_ENGINE_CLASS_WIDTH 3
 #define GEN11_ENGINE_INSTANCE_SHIFT 48
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index f4b9972..3001a14 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -240,14 +240,15 @@ static inline bool need_preempt(const struct intel_engine_cs *engine,
 	 * anything below.
 	 */
 	if (INTEL_GEN(ctx->i915) >= 11) {
-		GEM_BUG_ON(ctx->hw_id >= BIT(GEN11_SW_CTX_ID_WIDTH));
-		desc |= (u64)ctx->hw_id << GEN11_SW_CTX_ID_SHIFT;
+		GEM_BUG_ON(ce->sw_context_id >= BIT(GEN11_SW_CTX_ID_WIDTH));
+		desc |= (u64)ce->sw_context_id << GEN11_SW_CTX_ID_SHIFT;
 								/* bits 37-47 */
 
 		desc |= (u64)engine->instance << GEN11_ENGINE_INSTANCE_SHIFT;
 								/* bits 48-53 */
 
-		/* TODO: decide what to do with SW counter (bits 55-60) */
+		desc |= (u64)ce->sw_counter << GEN11_SW_COUNTER_SHIFT;
+								/* bits 55-60 */
 
 		/*
 		 * Although GuC will never see this upper part as it fills
@@ -2771,6 +2772,11 @@ static int execlists_context_deferred_alloc(struct i915_gem_context *ctx,
 	ce->ring = ring;
 	ce->state = vma;
 
+	if (INTEL_GEN(ctx->i915) >= 11) {
+		ce->sw_context_id = ctx->hw_id;
+		ce->sw_counter = engine->instance;
+	}
+
 	return 0;
 
 error_ring_free:
-- 
1.9.1


* Re: [PATCH 02/21] drm/i915/guc: Don't allow GuC submission on pre-Gen11
  2018-08-29 19:10 ` [PATCH 02/21] drm/i915/guc: Don't allow GuC submission on pre-Gen11 Michal Wajdeczko
@ 2018-08-29 19:16   ` Srivatsa, Anusha
  2018-08-30 22:58   ` John Spotswood
                     ` (2 subsequent siblings)
  3 siblings, 0 replies; 49+ messages in thread
From: Srivatsa, Anusha @ 2018-08-29 19:16 UTC (permalink / raw)
  To: Wajdeczko, Michal, intel-gfx; +Cc: Sundaresan, Sujaritha, Vivi, Rodrigo



>-----Original Message-----
>From: Wajdeczko, Michal
>Sent: Wednesday, August 29, 2018 12:11 PM
>To: intel-gfx@lists.freedesktop.org
>Cc: Wajdeczko, Michal <Michal.Wajdeczko@intel.com>; Joonas Lahtinen
><joonas.lahtinen@linux.intel.com>; Vivi, Rodrigo <rodrigo.vivi@intel.com>;
>Ceraolo Spurio, Daniele <daniele.ceraolospurio@intel.com>; Thierry, Michel
><michel.thierry@intel.com>; Spotswood, John A <john.a.spotswood@intel.com>;
>Belgaumkar, Vinay <vinay.belgaumkar@intel.com>; Ye, Tony
><tony.ye@intel.com>; Srivatsa, Anusha <anusha.srivatsa@intel.com>; Mcgee,
>Jeff <jeff.mcgee@intel.com>; Argenziano, Antonio
><antonio.argenziano@intel.com>; Sundaresan, Sujaritha
><sujaritha.sundaresan@intel.com>
>Subject: [PATCH 02/21] drm/i915/guc: Don't allow GuC submission on pre-Gen11
>
>Upcoming Gen11 GuC firmware requires new interface that is incompatible with
>existing pre-Gen11 firmwares. Updated firmwares for pre-Gen11 will arrive later.
>In the meantime sanitize the enable_guc option so that we can enable HuC
>authentication but nothing else on pre-Gen11.
>
>Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
>Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
>Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
>Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
>Cc: Michel Thierry <michel.thierry@intel.com>
>Cc: John Spotswood <john.a.spotswood@intel.com>
>Cc: Vinay Belgaumkar <vinay.belgaumkar@intel.com>
>Cc: Tony Ye <tony.ye@intel.com>
>Cc: Anusha Srivatsa <anusha.srivatsa@intel.com>
>Cc: Jeff Mcgee <jeff.mcgee@intel.com>
>Cc: Antonio Argenziano <antonio.argenziano@intel.com>
>Cc: Sujaritha Sundaresan <sujaritha.sundaresan@intel.com>
Thanks for the fix Michal.

Reviewed-by: Anusha Srivatsa <anusha.srivatsa@intel.com>
>---
> drivers/gpu/drm/i915/intel_uc.c | 15 +++++++++++++++
> 1 file changed, 15 insertions(+)
>diff --git a/drivers/gpu/drm/i915/intel_uc.c b/drivers/gpu/drm/i915/intel_uc.c
>index 7a3a4ca..185b29b 100644
>--- a/drivers/gpu/drm/i915/intel_uc.c
>+++ b/drivers/gpu/drm/i915/intel_uc.c
>@@ -63,6 +63,8 @@ static int __get_platform_enable_guc(struct
>drm_i915_private *i915)
> 		enable_guc |= ENABLE_GUC_LOAD_HUC;
>
> 	/* Any platform specific fine-tuning can be done here */
>+	if (INTEL_GEN(i915) < 11)
>+		enable_guc &= ~ENABLE_GUC_SUBMISSION;
>
> 	return enable_guc;
> }
>@@ -115,6 +117,13 @@ static void sanitize_options_early(struct
>drm_i915_private *i915)
> 			 yesno(intel_uc_is_using_guc_submission()),
> 			 yesno(intel_uc_is_using_huc()));
>
>+	/* Verify GuC submission support */
>+	if (intel_uc_is_using_guc_submission() && INTEL_GEN(i915) < 11) {
>+		DRM_WARN("Incompatible option detected: %s=%d, %s!\n",
>+			 "enable_guc", i915_modparams.enable_guc,
>+			 "submission not supported");
>+	}
>+
> 	/* Verify GuC firmware availability */
> 	if (intel_uc_is_using_guc() && !intel_uc_fw_is_selected(guc_fw)) {
> 		DRM_WARN("Incompatible option detected: %s=%d, %s!\n",
>@@ -292,6 +301,12 @@ int intel_uc_init(struct drm_i915_private *i915)
> 		return ret;
>
> 	if (USES_GUC_SUBMISSION(i915)) {
>+
>+		if (INTEL_GEN(i915) < 11) {
>+			intel_guc_fini(guc);
>+			return -EIO;
>+		}
>+
> 		/*
> 		 * This is stuff we need to have available at fw load time
> 		 * if we are planning to enable submission later
>--
>1.9.1


* [PATCH 09/21] drm/i915/guc: New GuC IDs based on engine class and instance
  2018-08-29 19:10 [PATCH 00/21] New GuC ABI Michal Wajdeczko
                   ` (7 preceding siblings ...)
  2018-08-29 19:16 ` [PATCH 08/21] drm/i915/guc: Make use of the SW counter field in the context descriptor Michal Wajdeczko
@ 2018-08-29 19:17 ` Michal Wajdeczko
  2018-08-29 19:18 ` [PATCH 10/21] drm/i915: Add hooks for (per-engine) context allocation/update/free Michal Wajdeczko
  2018-08-29 19:19 ` [PATCH 20/21] drm/i915/guc: Enable command transport buffers " Michal Wajdeczko
  10 siblings, 0 replies; 49+ messages in thread
From: Michal Wajdeczko @ 2018-08-29 19:17 UTC (permalink / raw)
  To: intel-gfx; +Cc: Oscar Mateo, Rodrigo Vivi

Starting from Gen11, the ID to be provided to GuC needs to contain
the engine class in bits [0..2] and the instance in bits [3..6].

NOTE: this patch breaks some existing GuC functions that use the guc_id
as an array index. These functions are not used for now, as GuC
submission is disabled, and they will be updated in a follow-up patch
that requires the new IDs.
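
As a worked example of the encoding described above, here is a small
sketch that mirrors the shift and mask values added to intel_guc_fwif.h
by this patch. The function names (make_guc_id, guc_id_to_class,
guc_id_to_instance) are illustrative stand-ins for the macros, written as
functions so the range checks are explicit.

```c
#include <assert.h>
#include <stdint.h>

#define GUC_ENGINE_CLASS_SHIFT		0
#define GUC_ENGINE_CLASS_MASK		(0x7u << GUC_ENGINE_CLASS_SHIFT)
#define GUC_ENGINE_INSTANCE_SHIFT	3
#define GUC_ENGINE_INSTANCE_MASK	(0xfu << GUC_ENGINE_INSTANCE_SHIFT)

/* Class goes in bits [0..2], instance in bits [3..6]. */
static uint32_t make_guc_id(uint32_t guc_class, uint32_t instance)
{
	assert(guc_class <= 0x7);
	assert(instance <= 0xf);

	return (guc_class << GUC_ENGINE_CLASS_SHIFT) |
	       (instance << GUC_ENGINE_INSTANCE_SHIFT);
}

static uint32_t guc_id_to_class(uint32_t guc_id)
{
	return (guc_id & GUC_ENGINE_CLASS_MASK) >> GUC_ENGINE_CLASS_SHIFT;
}

static uint32_t guc_id_to_instance(uint32_t guc_id)
{
	return (guc_id & GUC_ENGINE_INSTANCE_MASK) >>
	       GUC_ENGINE_INSTANCE_SHIFT;
}
```

For class 1, instance 2 this gives 1 | (2 << 3) = 0x11, and both fields
can be recovered from the ID again.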

Bspec: 20944

Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
Signed-off-by: Michel Thierry <michel.thierry@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Michal Winiarski <michal.winiarski@intel.com>
Cc: Tomasz Lis <tomasz.lis@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
---
 drivers/gpu/drm/i915/intel_engine_cs.c |  3 ++-
 drivers/gpu/drm/i915/intel_guc_fwif.h  | 19 +++++++++++++++++++
 2 files changed, 21 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/intel_engine_cs.c b/drivers/gpu/drm/i915/intel_engine_cs.c
index 6cefe26..6709ead 100644
--- a/drivers/gpu/drm/i915/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/intel_engine_cs.c
@@ -307,7 +307,8 @@ u32 intel_class_context_size(struct drm_i915_private *dev_priv, u8 class)
 	engine->id = id;
 	engine->i915 = dev_priv;
 	__sprint_engine_name(engine->name, info);
-	engine->hw_id = engine->guc_id = info->hw_id;
+	engine->hw_id = info->hw_id;
+	engine->guc_id = MAKE_GUC_ID(info->guc_class, info->instance);
 	engine->guc_class = info->guc_class;
 	engine->mmio_base = __engine_mmio_base(dev_priv, info->mmio_bases);
 	engine->class = info->class;
diff --git a/drivers/gpu/drm/i915/intel_guc_fwif.h b/drivers/gpu/drm/i915/intel_guc_fwif.h
index 2b41538..227ab32 100644
--- a/drivers/gpu/drm/i915/intel_guc_fwif.h
+++ b/drivers/gpu/drm/i915/intel_guc_fwif.h
@@ -138,6 +138,25 @@
 
 #define GUC_CTL_MAX_DWORDS		(SOFT_SCRATCH_COUNT - 2) /* [1..14] */
 
+/*
+ * The class goes in bits [0..2] of the GuC ID, the instance in bits [3..6].
+ * Bit 7 can be used for operations that apply to all engine classes&instances.
+ */
+#define GUC_ENGINE_CLASS_SHIFT		0
+#define GUC_ENGINE_CLASS_MASK		(0x7 << GUC_ENGINE_CLASS_SHIFT)
+#define GUC_ENGINE_INSTANCE_SHIFT	3
+#define GUC_ENGINE_INSTANCE_MASK	(0xf << GUC_ENGINE_INSTANCE_SHIFT)
+#define GUC_ENGINE_ALL_INSTANCES	(1 << 7)
+
+#define MAKE_GUC_ID(class, instance) \
+	(((class) << GUC_ENGINE_CLASS_SHIFT) | \
+	 ((instance) << GUC_ENGINE_INSTANCE_SHIFT))
+
+#define GUC_ID_TO_ENGINE_CLASS(guc_id) \
+	(((guc_id) & GUC_ENGINE_CLASS_MASK) >> GUC_ENGINE_CLASS_SHIFT)
+#define GUC_ID_TO_ENGINE_INSTANCE(guc_id) \
+	(((guc_id) & GUC_ENGINE_INSTANCE_MASK) >> GUC_ENGINE_INSTANCE_SHIFT)
+
 /**
  * DOC: GuC Firmware Layout
  *
-- 
1.9.1


* [PATCH 10/21] drm/i915: Add hooks for (per-engine) context allocation/update/free
  2018-08-29 19:10 [PATCH 00/21] New GuC ABI Michal Wajdeczko
                   ` (8 preceding siblings ...)
  2018-08-29 19:17 ` [PATCH 09/21] drm/i915/guc: New GuC IDs based on engine class and instance Michal Wajdeczko
@ 2018-08-29 19:18 ` Michal Wajdeczko
  2018-08-29 19:18   ` [PATCH 11/21] drm/i915/guc: New GuC stage descriptors Michal Wajdeczko
                     ` (8 more replies)
  2018-08-29 19:19 ` [PATCH 20/21] drm/i915/guc: Enable command transport buffers " Michal Wajdeczko
  10 siblings, 9 replies; 49+ messages in thread
From: Michal Wajdeczko @ 2018-08-29 19:18 UTC (permalink / raw)
  To: intel-gfx

In an upcoming GuC patch we will need to be notified of each per-engine
context allocation/update/free to correctly set up GuC stage descriptors.
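
The pattern is a set of optional function pointers that are only called
when set. A minimal sketch, using opaque stand-in types (struct ctx,
struct engine) and hypothetical names (context_hooks, notify_alloc,
count_alloc) rather than the real i915 structures:

```c
#include <assert.h>
#include <stddef.h>

struct ctx;	/* stand-in for struct i915_gem_context */
struct engine;	/* stand-in for struct intel_engine_cs */

typedef void (*ctx_hook_fn)(struct ctx *ctx, struct engine *engine);

/* Optional hooks; a NULL pointer means "nobody is listening". */
struct context_hooks {
	ctx_hook_fn alloc_hook;
	ctx_hook_fn update_hook;
	ctx_hook_fn free_hook;
};

/* Calls the alloc hook if one is installed; returns 1 if it was called. */
static int notify_alloc(const struct context_hooks *hooks,
			struct ctx *ctx, struct engine *engine)
{
	assert(hooks);

	if (!hooks->alloc_hook)
		return 0;

	hooks->alloc_hook(ctx, engine);
	return 1;
}

/* Example listener that only counts how often it was notified. */
static int alloc_count;

static void count_alloc(struct ctx *ctx, struct engine *engine)
{
	(void)ctx;
	(void)engine;
	alloc_count++;
}
```

With no hook installed the call is a no-op, so code paths that do not use
the GuC are unaffected.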

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Michel Thierry <michel.thierry@intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Tomasz Lis <tomasz.lis@intel.com>
Cc: Michal Winiarski <michal.winiarski@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.h         | 11 +++++++++++
 drivers/gpu/drm/i915/i915_gem_context.c |  6 +++++-
 drivers/gpu/drm/i915/intel_lrc.c        |  7 +++++++
 3 files changed, 23 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 34f5495..234c819 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1881,6 +1881,17 @@ struct drm_i915_private {
 #define MAX_GUC_CONTEXT_HW_ID			(1 << 20) /* exclusive */
 #define GEN11_MAX_CONTEXT_HW_ID			(1 << 11) /* exclusive */
 #define GEN11_MAX_CONTEXT_HW_ID_WITH_GUC	(GEN11_MAX_CONTEXT_HW_ID - 16)
+
+		/*
+		 * Hooks for context (per-engine context, not gem context)
+		 * allocation, deallocation and descriptor update.
+		 */
+		void (*alloc_hook)(struct i915_gem_context *ctx,
+				   struct intel_engine_cs *engine);
+		void (*update_hook)(struct i915_gem_context *ctx,
+				    struct intel_engine_cs *engine);
+		void (*free_hook)(struct i915_gem_context *ctx,
+				  struct intel_engine_cs *engine);
 	} contexts;
 
 	u32 fdi_rx_config;
diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index e3b500c..976941e 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -126,9 +126,13 @@ static void i915_gem_context_free(struct i915_gem_context *ctx)
 
 	for (n = 0; n < ARRAY_SIZE(ctx->__engine); n++) {
 		struct intel_context *ce = &ctx->__engine[n];
+		struct intel_engine_cs *engine = ctx->i915->engine[n];
 
-		if (ce->ops)
+		if (ce->ops) {
+			if (ctx->i915->contexts.free_hook)
+				ctx->i915->contexts.free_hook(ctx, engine);
 			ce->ops->destroy(ce);
+		}
 	}
 
 	kfree(ctx->name);
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 3001a14..ef4d491 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -266,6 +266,9 @@ static inline bool need_preempt(const struct intel_engine_cs *engine,
 	}
 
 	ce->lrc_desc = desc;
+
+	if (ctx->i915->contexts.update_hook)
+		ctx->i915->contexts.update_hook(ctx, engine);
 }
 
 static struct i915_priolist *
@@ -2722,6 +2725,7 @@ static int execlists_context_deferred_alloc(struct i915_gem_context *ctx,
 					    struct intel_engine_cs *engine,
 					    struct intel_context *ce)
 {
+	struct drm_i915_private *i915 = engine->i915;
 	struct drm_i915_gem_object *ctx_obj;
 	struct i915_vma *vma;
 	uint32_t context_size;
@@ -2777,6 +2781,9 @@ static int execlists_context_deferred_alloc(struct i915_gem_context *ctx,
 		ce->sw_counter = engine->instance;
 	}
 
+	if (i915->contexts.alloc_hook)
+		i915->contexts.alloc_hook(ctx, engine);
+
 	return 0;
 
 error_ring_free:
-- 
1.9.1


* [PATCH 11/21] drm/i915/guc: New GuC stage descriptors
  2018-08-29 19:18 ` [PATCH 10/21] drm/i915: Add hooks for (per-engine) context allocation/update/free Michal Wajdeczko
@ 2018-08-29 19:18   ` Michal Wajdeczko
  2018-08-29 23:14     ` Daniele Ceraolo Spurio
  2018-10-12 18:25     ` [RFC] " Daniele Ceraolo Spurio
  2018-08-29 19:18   ` [PATCH 12/21] drm/i915/guc: New GuC workqueue item submission mechanism Michal Wajdeczko
                     ` (7 subsequent siblings)
  8 siblings, 2 replies; 49+ messages in thread
From: Michal Wajdeczko @ 2018-08-29 19:18 UTC (permalink / raw)
  To: intel-gfx; +Cc: Oscar Mateo

The new GuC stage descriptor stores information about all possible HW contexts
that use it. The idea is that every direct-submission GuC client gets one
SW Context ID and every HW context created by that client gets one SW
Counter (up to 64 entries). The correct SW Context ID and SW Counter now
get passed on every work queue item.
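
One way to track which of the 64 SW Counter slots are in use is a bitmap
of two 32-bit words, in the spirit of the bitmap32_* helpers this patch
adds to i915_utils.h. The sketch below is only an assumption about how
such tracking could look: sw_counter_alloc and the function-style bitmap
helpers are hypothetical and are not taken from the driver.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_SW_COUNTERS	64	/* "up to 64 entries" per stage descriptor */

static int bitmap32_test(const uint32_t *bitmap, unsigned int bit)
{
	assert(bit < MAX_SW_COUNTERS);
	return (bitmap[bit / 32] >> (bit % 32)) & 1u;
}

static void bitmap32_set(uint32_t *bitmap, unsigned int bit)
{
	assert(bit < MAX_SW_COUNTERS);
	bitmap[bit / 32] |= 1u << (bit % 32);
}

static void bitmap32_clear(uint32_t *bitmap, unsigned int bit)
{
	assert(bit < MAX_SW_COUNTERS);
	bitmap[bit / 32] &= ~(1u << (bit % 32));
}

/* Hypothetical: returns the lowest free SW Counter, or -1 if all are used. */
static int sw_counter_alloc(uint32_t *bitmap)
{
	unsigned int bit;

	for (bit = 0; bit < MAX_SW_COUNTERS; bit++) {
		if (!bitmap32_test(bitmap, bit)) {
			bitmap32_set(bitmap, bit);
			return (int)bit;
		}
	}

	return -1;
}
```

Freeing a counter is then a bitmap32_clear() on the same bitmap, after
which the slot can be handed out again.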

Signed-off-by: Michel Thierry <michel.thierry@intel.com>
Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Michel Thierry <michel.thierry@intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Tomasz Lis <tomasz.lis@intel.com>
Cc: Michal Winiarski <michal.winiarski@intel.com>
---
 drivers/gpu/drm/i915/i915_debugfs.c         |   9 +-
 drivers/gpu/drm/i915/i915_utils.h           |  12 ++
 drivers/gpu/drm/i915/intel_guc_fwif.h       |  67 ++++-----
 drivers/gpu/drm/i915/intel_guc_submission.c | 205 +++++++++++++++++++---------
 4 files changed, 190 insertions(+), 103 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index a5265c2..ad09970 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -2455,11 +2455,10 @@ static int i915_guc_stage_pool(struct seq_file *m, void *data)
 
 		seq_printf(m, "GuC stage descriptor %u:\n", index);
 		seq_printf(m, "\tIndex: %u\n", desc->stage_id);
+		seq_printf(m, "\tProxy Index: %u\n", desc->proxy_id);
 		seq_printf(m, "\tAttribute: 0x%x\n", desc->attribute);
 		seq_printf(m, "\tPriority: %d\n", desc->priority);
 		seq_printf(m, "\tDoorbell id: %d\n", desc->db_id);
-		seq_printf(m, "\tEngines used: 0x%x\n",
-			   desc->engines_used);
 		seq_printf(m, "\tDoorbell trigger phy: 0x%llx, cpu: 0x%llx, uK: 0x%x\n",
 			   desc->db_trigger_phy,
 			   desc->db_trigger_cpu,
@@ -2471,10 +2470,10 @@ static int i915_guc_stage_pool(struct seq_file *m, void *data)
 		seq_putc(m, '\n');
 
 		for_each_engine_masked(engine, dev_priv, client->engines, tmp) {
-			u32 guc_engine_id = engine->guc_id;
+			u32 class = GUC_ID_TO_ENGINE_CLASS(engine->guc_id);
+			u32 inst = GUC_ID_TO_ENGINE_INSTANCE(engine->guc_id);
 			struct guc_execlist_context *lrc =
-						&desc->lrc[guc_engine_id];
-
+						&desc->lrc[class][inst];
 			seq_printf(m, "\t%s LRC:\n", engine->name);
 			seq_printf(m, "\t\tContext desc: 0x%x\n",
 				   lrc->context_desc);
diff --git a/drivers/gpu/drm/i915/i915_utils.h b/drivers/gpu/drm/i915/i915_utils.h
index 395dd25..7ec1fe4 100644
--- a/drivers/gpu/drm/i915/i915_utils.h
+++ b/drivers/gpu/drm/i915/i915_utils.h
@@ -118,6 +118,18 @@ static inline u64 ptr_to_u64(const void *ptr)
 	__idx;								\
 })
 
+#define bitmap32_test_bit(bitmap, bit) ({ \
+	(bitmap)[(bit) / 32] & (1 << ((bit) % 32)); \
+})
+
+#define bitmap32_set_bit(bitmap, bit) ({ \
+	(bitmap)[(bit) / 32] |= (1 << ((bit) % 32)); \
+})
+
+#define bitmap32_clear_bit(bitmap, bit) ({ \
+	(bitmap)[(bit) / 32] &= ~(1 << ((bit) % 32)); \
+})
+
 #include <linux/list.h>
 
 static inline int list_is_first(const struct list_head *list,
diff --git a/drivers/gpu/drm/i915/intel_guc_fwif.h b/drivers/gpu/drm/i915/intel_guc_fwif.h
index 227ab32..0a897cd 100644
--- a/drivers/gpu/drm/i915/intel_guc_fwif.h
+++ b/drivers/gpu/drm/i915/intel_guc_fwif.h
@@ -29,9 +29,11 @@
 #define GUC_CLIENT_PRIORITY_NORMAL	3
 #define GUC_CLIENT_PRIORITY_NUM		4
 
-#define GUC_MAX_STAGE_DESCRIPTORS	1024
+#define GUC_MAX_STAGE_DESCRIPTORS	2032
 #define	GUC_INVALID_STAGE_ID		GUC_MAX_STAGE_DESCRIPTORS
 
+#define GUC_MAX_LRC_PER_CLASS		64
+
 #define GUC_RENDER_ENGINE		0
 #define GUC_VIDEO_ENGINE		1
 #define GUC_BLITTER_ENGINE		2
@@ -71,9 +73,12 @@
 #define GUC_DOORBELL_DISABLED		0
 
 #define GUC_STAGE_DESC_ATTR_ACTIVE	BIT(0)
-#define GUC_STAGE_DESC_ATTR_PENDING_DB	BIT(1)
-#define GUC_STAGE_DESC_ATTR_KERNEL	BIT(2)
-#define GUC_STAGE_DESC_ATTR_PREEMPT	BIT(3)
+#define GUC_STAGE_DESC_ATTR_TYPE_SHIFT	1
+#define GUC_STAGE_DESC_ATTR_PRINCIPAL	(0x0 << GUC_STAGE_DESC_ATTR_TYPE_SHIFT)
+#define GUC_STAGE_DESC_ATTR_PROXY	(0x1 << GUC_STAGE_DESC_ATTR_TYPE_SHIFT)
+#define GUC_STAGE_DESC_ATTR_REAL	(0x2 << GUC_STAGE_DESC_ATTR_TYPE_SHIFT)
+#define GUC_STAGE_DESC_ATTR_TYPE_MASK	(0x3 << GUC_STAGE_DESC_ATTR_TYPE_SHIFT)
+#define GUC_STAGE_DESC_ATTR_KERNEL	(1 << 3)
 #define GUC_STAGE_DESC_ATTR_RESET	BIT(4)
 #define GUC_STAGE_DESC_ATTR_WQLOCKED	BIT(5)
 #define GUC_STAGE_DESC_ATTR_PCH		BIT(6)
@@ -282,9 +287,10 @@ struct guc_process_desc {
 	u64 wq_base_addr;
 	u32 wq_size_bytes;
 	u32 wq_status;
-	u32 engine_presence;
 	u32 priority;
-	u32 reserved[30];
+	u32 token;
+	u32 queue_engine_error;
+	u32 reserved[23];
 } __packed;
 
 /* engine id and context id is packed into guc_execlist_context.context_id*/
@@ -295,16 +301,19 @@ struct guc_process_desc {
 struct guc_execlist_context {
 	u32 context_desc;
 	u32 context_id;
-	u32 ring_status;
+	u32 reserved0;
 	u32 ring_lrca;
 	u32 ring_begin;
 	u32 ring_end;
 	u32 ring_next_free_location;
 	u32 ring_current_tail_pointer_value;
-	u8 engine_state_submit_value;
-	u8 engine_state_wait_value;
-	u16 pagefault_count;
-	u16 engine_submit_queue_count;
+	u32 engine_state_wait_value;
+	u32 state_reserved;
+	u32 is_present_in_sq;
+	u32 sync_value;
+	u32 sync_addr;
+	u32 slpc_hints;
+	u32 reserved1[4];
 } __packed;
 
 /*
@@ -317,36 +326,30 @@ struct guc_execlist_context {
  * with the GuC, being allocated before the GuC is loaded with its firmware.
  */
 struct guc_stage_desc {
-	u32 sched_common_area;
+	u64 desc_private;
 	u32 stage_id;
-	u32 pas_id;
-	u8 engines_used;
+	u32 proxy_id;
 	u64 db_trigger_cpu;
 	u32 db_trigger_uk;
 	u64 db_trigger_phy;
-	u16 db_id;
-
-	struct guc_execlist_context lrc[GUC_MAX_ENGINES_NUM];
-
-	u8 attribute;
-
-	u32 priority;
-
+	u32 db_id;
+	struct guc_execlist_context lrc[GUC_MAX_ENGINE_CLASSES][GUC_MAX_LRC_PER_CLASS];
+	u32 lrc_bitmap[GUC_MAX_ENGINE_CLASSES][3];
+	u32 lrc_count;
+	u32 max_lrc_per_class;
+	u32 attribute; /* GUC_STAGE_DESC_ATTR_xxx */
+	u32 priority; /* GUC_CLIENT_PRIORITY_xxx */
 	u32 wq_sampled_tail_offset;
 	u32 wq_total_submit_enqueues;
-
 	u32 process_desc;
 	u32 wq_addr;
 	u32 wq_size;
-
-	u32 engine_presence;
-
-	u8 engine_suspended;
-
-	u8 reserved0[3];
-	u64 reserved1[1];
-
-	u64 desc_private;
+	u32 feature0;
+	u32 feature1;
+	u32 feature2;
+	u32 queue_engine_error;
+	u32 reserved[2];
+	u64 reserved3[12];
 } __packed;
 
 /**
diff --git a/drivers/gpu/drm/i915/intel_guc_submission.c b/drivers/gpu/drm/i915/intel_guc_submission.c
index 07b9d31..54655dc 100644
--- a/drivers/gpu/drm/i915/intel_guc_submission.c
+++ b/drivers/gpu/drm/i915/intel_guc_submission.c
@@ -46,17 +46,22 @@
  * that contains all required pages for these elements).
  *
  * GuC stage descriptor:
- * During initialization, the driver allocates a static pool of 1024 such
- * descriptors, and shares them with the GuC.
- * Currently, there exists a 1:1 mapping between a intel_guc_client and a
- * guc_stage_desc (via the client's stage_id), so effectively only one
- * gets used. This stage descriptor lets the GuC know about the doorbell,
- * workqueue and process descriptor. Theoretically, it also lets the GuC
- * know about our HW contexts (context ID, etc...), but we actually
- * employ a kind of submission where the GuC uses the LRCA sent via the work
- * item instead (the single guc_stage_desc associated to execbuf client
- * contains information about the default kernel context only, but this is
- * essentially unused). This is called a "proxy" submission.
+ * During initialization, the driver allocates a static pool of descriptors
+ * and shares them with the GuC. This stage descriptor lets the GuC know about
+ * the doorbell, workqueue and process descriptor, additionally it stores
+ * information about all possible HW contexts that use it (64 x number of
+ * engine classes of guc_execlist_context structs).
+ *
+ * The idea is that every direct-submission GuC client gets one SW Context ID
+ * and every HW context created by that client gets one SW Counter. The "SW
+ * Context ID" and "SW Counter" to use now get passed on every work queue item.
+ *
+ * But we don't have direct submission yet: does that mean we are limited to 64
+ * contexts in total (one client)? Not really: we can use extra GuC context
+ * descriptors to store more HW contexts. They are special in that they don't
+ * have their own work queue, doorbell or process descriptor. Instead, these
+ * "principal" GuC context descriptors use the one that belongs to the client
+ * as a "proxy" for submission (a generalization of the old proxy submission).
  *
  * The Scratch registers:
  * There are 16 MMIO-based registers start from 0xC180. The kernel driver writes
@@ -171,6 +176,16 @@ static struct guc_stage_desc *__get_stage_desc(struct intel_guc_client *client)
 	return &base[client->stage_id];
 }
 
+static struct guc_stage_desc *__get_ppal_stage_desc(struct intel_guc *guc,
+						    u32 index)
+{
+	struct guc_stage_desc *base = guc->stage_desc_pool_vaddr;
+
+	GEM_BUG_ON(index >= GUC_MAX_STAGE_DESCRIPTORS);
+
+	return &base[index];
+}
+
 /*
  * Initialise, update, or clear doorbell data shared with the GuC
  *
@@ -344,70 +359,20 @@ static void guc_stage_desc_pool_destroy(struct intel_guc *guc)
 static void guc_stage_desc_init(struct intel_guc *guc,
 				struct intel_guc_client *client)
 {
-	struct drm_i915_private *dev_priv = guc_to_i915(guc);
-	struct intel_engine_cs *engine;
-	struct i915_gem_context *ctx = client->owner;
 	struct guc_stage_desc *desc;
-	unsigned int tmp;
 	u32 gfx_addr;
 
 	desc = __get_stage_desc(client);
 	memset(desc, 0, sizeof(*desc));
 
 	desc->attribute = GUC_STAGE_DESC_ATTR_ACTIVE |
+			  GUC_STAGE_DESC_ATTR_PROXY |
 			  GUC_STAGE_DESC_ATTR_KERNEL;
-	if (is_high_priority(client))
-		desc->attribute |= GUC_STAGE_DESC_ATTR_PREEMPT;
 	desc->stage_id = client->stage_id;
+	desc->proxy_id = client->stage_id;
 	desc->priority = client->priority;
 	desc->db_id = client->doorbell_id;
 
-	for_each_engine_masked(engine, dev_priv, client->engines, tmp) {
-		struct intel_context *ce = to_intel_context(ctx, engine);
-		u32 guc_engine_id = engine->guc_id;
-		struct guc_execlist_context *lrc = &desc->lrc[guc_engine_id];
-
-		/* TODO: We have a design issue to be solved here. Only when we
-		 * receive the first batch, we know which engine is used by the
-		 * user. But here GuC expects the lrc and ring to be pinned. It
-		 * is not an issue for default context, which is the only one
-		 * for now who owns a GuC client. But for future owner of GuC
-		 * client, need to make sure lrc is pinned prior to enter here.
-		 */
-		if (!ce->state)
-			break;	/* XXX: continue? */
-
-		/*
-		 * XXX: When this is a GUC_STAGE_DESC_ATTR_KERNEL client (proxy
-		 * submission or, in other words, not using a direct submission
-		 * model) the KMD's LRCA is not used for any work submission.
-		 * Instead, the GuC uses the LRCA of the user mode context (see
-		 * guc_add_request below).
-		 */
-		lrc->context_desc = lower_32_bits(ce->lrc_desc);
-
-		/* The state page is after PPHWSP */
-		lrc->ring_lrca = intel_guc_ggtt_offset(guc, ce->state) +
-				 LRC_STATE_PN * PAGE_SIZE;
-
-		/* XXX: In direct submission, the GuC wants the HW context id
-		 * here. In proxy submission, it wants the stage id
-		 */
-		lrc->context_id = (client->stage_id << GUC_ELC_CTXID_OFFSET) |
-				(guc_engine_id << GUC_ELC_ENGINE_OFFSET);
-
-		lrc->ring_begin = intel_guc_ggtt_offset(guc, ce->ring->vma);
-		lrc->ring_end = lrc->ring_begin + ce->ring->size - 1;
-		lrc->ring_next_free_location = lrc->ring_begin;
-		lrc->ring_current_tail_pointer_value = 0;
-
-		desc->engines_used |= (1 << guc_engine_id);
-	}
-
-	DRM_DEBUG_DRIVER("Host engines 0x%x => GuC engines used 0x%x\n",
-			 client->engines, desc->engines_used);
-	WARN_ON(desc->engines_used == 0);
-
 	/*
 	 * The doorbell, process descriptor, and workqueue are all parts
 	 * of the client object, which the GuC will reference via the GGTT
@@ -430,7 +395,15 @@ static void guc_stage_desc_fini(struct intel_guc *guc,
 	struct guc_stage_desc *desc;
 
 	desc = __get_stage_desc(client);
-	memset(desc, 0, sizeof(*desc));
+	/* No memset: the stage desc might still be used as a principal */
+	desc->attribute &= ~GUC_STAGE_DESC_ATTR_TYPE_MASK;
+	desc->db_id = 0;
+	desc->db_trigger_phy = 0;
+	desc->db_trigger_cpu = 0;
+	desc->db_trigger_uk = 0;
+	desc->process_desc = 0;
+	desc->wq_addr = 0;
+	desc->wq_size = 0;
 }
 
 /* Construct a Work Item and append it to the GuC's Work Queue */
@@ -1299,6 +1272,87 @@ static void guc_set_default_submission(struct intel_engine_cs *engine)
 	engine->flags &= ~I915_ENGINE_SUPPORTS_STATS;
 }
 
+static void guc_ppal_stage_attach(struct i915_gem_context *ctx,
+				  struct intel_engine_cs *engine)
+{
+	struct intel_guc *guc = &ctx->i915->guc;
+	struct intel_context *ce = to_intel_context(ctx, engine);
+	struct guc_stage_desc *desc;
+
+	GEM_BUG_ON(ce->sw_context_id >= GUC_MAX_STAGE_DESCRIPTORS);
+
+	desc = __get_ppal_stage_desc(guc, ce->sw_context_id);
+
+	if (desc->lrc_count == 0) {
+		desc->attribute = GUC_STAGE_DESC_ATTR_ACTIVE |
+				  GUC_STAGE_DESC_ATTR_PRINCIPAL |
+				  GUC_STAGE_DESC_ATTR_KERNEL;
+		desc->stage_id = ce->sw_context_id;
+	}
+
+	GEM_BUG_ON(bitmap32_test_bit(desc->lrc_bitmap[engine->guc_class],
+				     ce->sw_counter));
+	bitmap32_set_bit(desc->lrc_bitmap[engine->guc_class], ce->sw_counter);
+	desc->lrc_count++;
+
+	/* GuC optimizations */
+	if (ce->sw_counter >= desc->max_lrc_per_class)
+		desc->max_lrc_per_class = ce->sw_counter + 1;
+}
+
+static void guc_ppal_stage_detach(struct i915_gem_context *ctx,
+				  struct intel_engine_cs *engine)
+{
+	struct intel_guc *guc = &ctx->i915->guc;
+	struct intel_context *ce = to_intel_context(ctx, engine);
+	struct guc_stage_desc *desc;
+	struct guc_execlist_context *lrc;
+
+	GEM_BUG_ON(ce->sw_context_id >= GUC_MAX_STAGE_DESCRIPTORS);
+
+	desc = __get_ppal_stage_desc(guc, ce->sw_context_id);
+
+	GEM_BUG_ON(!bitmap32_test_bit(desc->lrc_bitmap[engine->guc_class],
+				      ce->sw_counter));
+	bitmap32_clear_bit(desc->lrc_bitmap[engine->guc_class], ce->sw_counter);
+	desc->lrc_count--;
+
+	if (desc->lrc_count == 0) {
+		desc->attribute = 0;
+		desc->stage_id = 0;
+		desc->max_lrc_per_class = 0;
+	} else {
+		/* TODO: GuC optimizations */
+	}
+
+	lrc = &desc->lrc[engine->guc_class][ce->sw_counter];
+	memset(lrc, 0, sizeof(*lrc));
+}
+
+static void guc_ppal_stage_update(struct i915_gem_context *ctx,
+				  struct intel_engine_cs *engine)
+{
+	struct intel_guc *guc = &ctx->i915->guc;
+	struct intel_context *ce = to_intel_context(ctx, engine);
+	struct guc_stage_desc *desc;
+	struct guc_execlist_context *lrc;
+
+	GEM_BUG_ON(ce->sw_context_id >= GUC_MAX_STAGE_DESCRIPTORS);
+
+	desc = __get_ppal_stage_desc(guc, ce->sw_context_id);
+
+	GEM_BUG_ON(!bitmap32_test_bit(desc->lrc_bitmap[engine->guc_class],
+				      ce->sw_counter));
+
+	lrc = &desc->lrc[engine->guc_class][ce->sw_counter];
+
+	lrc->context_desc = lower_32_bits(ce->lrc_desc);
+	lrc->context_id = upper_32_bits(ce->lrc_desc);
+	lrc->ring_lrca = intel_guc_ggtt_offset(guc, ce->state) + LRC_STATE_PN * PAGE_SIZE;
+	lrc->ring_begin = intel_guc_ggtt_offset(guc, ce->ring->vma);
+	lrc->ring_end = lrc->ring_begin + ce->ring->size - 1;
+}
+
 int intel_guc_submission_enable(struct intel_guc *guc)
 {
 	struct drm_i915_private *dev_priv = guc_to_i915(guc);
@@ -1321,17 +1375,26 @@ int intel_guc_submission_enable(struct intel_guc *guc)
 
 	GEM_BUG_ON(!guc->execbuf_client);
 
+	dev_priv->contexts.alloc_hook = guc_ppal_stage_attach;
+	dev_priv->contexts.update_hook = guc_ppal_stage_update;
+	dev_priv->contexts.free_hook = guc_ppal_stage_detach;
+
+	for_each_engine(engine, dev_priv, id) {
+		guc_ppal_stage_attach(dev_priv->kernel_context, engine);
+		guc_ppal_stage_update(dev_priv->kernel_context, engine);
+	}
+
 	guc_reset_wq(guc->execbuf_client);
 	if (guc->preempt_client)
 		guc_reset_wq(guc->preempt_client);
 
 	err = intel_guc_sample_forcewake(guc);
 	if (err)
-		return err;
+		goto err_clear_ctx_hooks;
 
 	err = guc_clients_doorbell_init(guc);
 	if (err)
-		return err;
+		goto err_clear_ctx_hooks;
 
 	/* Take over from manual control of ELSP (execlists) */
 	guc_interrupts_capture(dev_priv);
@@ -1342,6 +1405,12 @@ int intel_guc_submission_enable(struct intel_guc *guc)
 	}
 
 	return 0;
+
+err_clear_ctx_hooks:
+	dev_priv->contexts.alloc_hook = NULL;
+	dev_priv->contexts.update_hook = NULL;
+	dev_priv->contexts.free_hook = NULL;
+	return err;
 }
 
 void intel_guc_submission_disable(struct intel_guc *guc)
@@ -1352,6 +1421,10 @@ void intel_guc_submission_disable(struct intel_guc *guc)
 
 	guc_interrupts_release(dev_priv);
 	guc_clients_doorbell_fini(guc);
+
+	dev_priv->contexts.alloc_hook = NULL;
+	dev_priv->contexts.update_hook = NULL;
+	dev_priv->contexts.free_hook = NULL;
 }
 
 #if IS_ENABLED(CONFIG_DRM_I915_SELFTEST)
-- 
1.9.1


* [PATCH 12/21] drm/i915/guc: New GuC workqueue item submission mechanism
  2018-08-29 19:18 ` [PATCH 10/21] drm/i915: Add hooks for (per-engine) context allocation/update/free Michal Wajdeczko
  2018-08-29 19:18   ` [PATCH 11/21] drm/i915/guc: New GuC stage descriptors Michal Wajdeczko
@ 2018-08-29 19:18   ` Michal Wajdeczko
  2018-08-29 19:18   ` [PATCH 13/21] drm/i915/guc: Add support for resume-parsing wq item Michal Wajdeczko
                     ` (6 subsequent siblings)
  8 siblings, 0 replies; 49+ messages in thread
From: Michal Wajdeczko @ 2018-08-29 19:18 UTC (permalink / raw)
  To: intel-gfx

Work queue item definitions were updated.
To simplify the scheduling logic in the GuC firmware, only the
out-of-order mode of scheduling is now supported.
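
As an illustration of the new layout, the following standalone sketch packs
the header and submit_element_info dwords using the shift values from this
patch. The helper names wq_kmd_header() and wq_submit_info() are hypothetical,
and the field widths checked by the asserts are inferred from the shifts and
from the 64-entry SW Counter limit, not taken from a firmware specification.

```c
#include <assert.h>
#include <stdint.h>

/* Shift values copied from the intel_guc_fwif.h hunk of this patch. */
#define WQ_TYPE_KMD              (0x3u << 0)
#define WQ_TARGET_SHIFT          8
#define WQ_LEN_SHIFT             16
#define WQ_SW_CTX_INDEX_SHIFT    0
#define WQ_SW_COUNTER_SHIFT      11
#define WQ_RING_TAIL_INDEX_SHIFT 18
#define WQ_RING_TAIL_MAX         0x7FFu	/* 2^11 QWords */

/* Build the header dword of a KMD work item (hypothetical helper). */
static uint32_t wq_kmd_header(uint32_t target_engine, uint32_t wqi_len)
{
	return WQ_TYPE_KMD |
	       (target_engine << WQ_TARGET_SHIFT) |
	       (wqi_len << WQ_LEN_SHIFT);
}

/* Build the submit_element_info dword (hypothetical helper). */
static uint32_t wq_submit_info(uint32_t sw_ctx, uint32_t sw_counter,
			       uint32_t ring_tail)
{
	/* Widths below are inferred from the shifts, not from a spec. */
	assert(sw_ctx < (1u << WQ_SW_COUNTER_SHIFT));
	assert(sw_counter < 64);
	assert(ring_tail <= WQ_RING_TAIL_MAX);

	return (sw_ctx << WQ_SW_CTX_INDEX_SHIFT) |
	       (sw_counter << WQ_SW_COUNTER_SHIFT) |
	       (ring_tail << WQ_RING_TAIL_INDEX_SHIFT);
}
```

For example, SW Context ID 5, SW Counter 3 and a ring tail of 0x10 QWords
pack into 0x00401805 under these assumptions.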

Credits-to: Michel Thierry <michel.thierry@intel.com>
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Michel Thierry <michel.thierry@intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Michał Winiarski <michal.winiarski@intel.com>
Cc: Tomasz Lis <tomasz.lis@intel.com>
---
 drivers/gpu/drm/i915/intel_guc_fwif.h       | 18 +++++++++++-----
 drivers/gpu/drm/i915/intel_guc_submission.c | 32 +++++++++++++++++++++--------
 drivers/gpu/drm/i915/selftests/intel_guc.c  |  2 +-
 3 files changed, 37 insertions(+), 15 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_guc_fwif.h b/drivers/gpu/drm/i915/intel_guc_fwif.h
index 0a897cd..156db08 100644
--- a/drivers/gpu/drm/i915/intel_guc_fwif.h
+++ b/drivers/gpu/drm/i915/intel_guc_fwif.h
@@ -49,22 +49,30 @@
 #define GUC_MAX_ENGINE_CLASSES	(GUC_RESERVED_CLASS + 1)
 #define GUC_MAX_INSTANCES_PER_CLASS	4
 
-/* Work queue item header definitions */
+/* Work queue status */
 #define WQ_STATUS_ACTIVE		1
 #define WQ_STATUS_SUSPENDED		2
 #define WQ_STATUS_CMD_ERROR		3
 #define WQ_STATUS_ENGINE_ID_NOT_USED	4
 #define WQ_STATUS_SUSPENDED_FROM_RESET	5
+#define WQ_STATUS_INVALID		6
+
+/* Work queue item header definitions */
 #define WQ_TYPE_SHIFT			0
 #define   WQ_TYPE_BATCH_BUF		(0x1 << WQ_TYPE_SHIFT)
 #define   WQ_TYPE_PSEUDO		(0x2 << WQ_TYPE_SHIFT)
-#define   WQ_TYPE_INORDER		(0x3 << WQ_TYPE_SHIFT)
+#define   WQ_TYPE_KMD			(0x3 << WQ_TYPE_SHIFT)
 #define   WQ_TYPE_NOOP			(0x4 << WQ_TYPE_SHIFT)
-#define WQ_TARGET_SHIFT			10
+#define   WQ_TYPE_RESUME		(0x5 << WQ_TYPE_SHIFT)
+#define   WQ_TYPE_INVALID		(0x6 << WQ_TYPE_SHIFT)
+#define WQ_TARGET_SHIFT			8
 #define WQ_LEN_SHIFT			16
-#define WQ_NO_WCFLUSH_WAIT		(1 << 27)
-#define WQ_PRESENT_WORKLOAD		(1 << 28)
+#define WQ_LEN_MASK			(0x7FF << WQ_LEN_SHIFT)
 
+/* Work queue item submit element info definitions */
+#define WQ_SW_CTX_INDEX_SHIFT		0
+#define WQ_SW_COUNTER_SHIFT		11
+#define WQ_RING_TAIL_INDEX_SHIFT	18
 #define WQ_RING_TAIL_SHIFT		20
 #define WQ_RING_TAIL_MAX		0x7FF	/* 2^11 QWords */
 #define WQ_RING_TAIL_MASK		(WQ_RING_TAIL_MAX << WQ_RING_TAIL_SHIFT)
diff --git a/drivers/gpu/drm/i915/intel_guc_submission.c b/drivers/gpu/drm/i915/intel_guc_submission.c
index 54655dc..378f97e 100644
--- a/drivers/gpu/drm/i915/intel_guc_submission.c
+++ b/drivers/gpu/drm/i915/intel_guc_submission.c
@@ -79,9 +79,14 @@
  * Work Items:
  * There are several types of work items that the host may place into a
  * workqueue, each with its own requirements and limitations. Currently only
- * WQ_TYPE_INORDER is needed to support legacy submission via GuC, which
- * represents in-order queue. The kernel driver packs ring tail pointer and an
- * ELSP context descriptor dword into Work Item.
+ * out-of-order WQ_TYPE_KMD is supported by the firmware.
+ * Our in-orderness is guaranteed by the execlists emulation with two "fake"
+ * ports that the scheduler uses when submitting to the GuC backend. If we are
+ * submitting requests for context A in the first port and we place a request
+ * for context B in the second port, we won't submit more requests for A until
+ * all the pending ones complete. We have to take this into account if we try
+ * to change the current execlist emulation model (e.g.: increasing the number
+ * of fake ports could cause requests to execute in wrong global seqno order).
  * See guc_add_request()
  *
  */
@@ -408,6 +413,7 @@ static void guc_stage_desc_fini(struct intel_guc *guc,
 
 /* Construct a Work Item and append it to the GuC's Work Queue */
 static void guc_wq_item_append(struct intel_guc_client *client,
+			       struct intel_context *ce,
 			       u32 target_engine, u32 context_desc,
 			       u32 ring_tail, u32 fence_id)
 {
@@ -445,12 +451,15 @@ static void guc_wq_item_append(struct intel_guc_client *client,
 		wqi->header = WQ_TYPE_NOOP | (wqi_len << WQ_LEN_SHIFT);
 	} else {
 		/* Now fill in the 4-word work queue item */
-		wqi->header = WQ_TYPE_INORDER |
-			      (wqi_len << WQ_LEN_SHIFT) |
+		wqi->header = WQ_TYPE_KMD |
 			      (target_engine << WQ_TARGET_SHIFT) |
-			      WQ_NO_WCFLUSH_WAIT;
+			      (wqi_len << WQ_LEN_SHIFT);
 		wqi->context_desc = context_desc;
-		wqi->submit_element_info = ring_tail << WQ_RING_TAIL_SHIFT;
+
+		wqi->submit_element_info =
+			(ce->sw_context_id << WQ_SW_CTX_INDEX_SHIFT) |
+			(ce->sw_counter << WQ_SW_COUNTER_SHIFT |
+			(ring_tail << WQ_RING_TAIL_INDEX_SHIFT));
 		GEM_BUG_ON(ring_tail > WQ_RING_TAIL_MAX);
 		wqi->fence_id = fence_id;
 	}
@@ -492,12 +501,13 @@ static void guc_add_request(struct intel_guc *guc, struct i915_request *rq)
 {
 	struct intel_guc_client *client = guc->execbuf_client;
 	struct intel_engine_cs *engine = rq->engine;
+	struct intel_context *ce = &rq->gem_context->__engine[rq->engine->id];
 	u32 ctx_desc = lower_32_bits(rq->hw_context->lrc_desc);
 	u32 ring_tail = intel_ring_set_tail(rq->ring, rq->tail) / sizeof(u64);
 
 	spin_lock(&client->wq_lock);
 
-	guc_wq_item_append(client, engine->guc_id, ctx_desc,
+	guc_wq_item_append(client, ce, engine->guc_id, ctx_desc,
 			   ring_tail, rq->global_seqno);
 	guc_ring_doorbell(client);
 
@@ -530,16 +540,20 @@ static void inject_preempt_context(struct work_struct *work)
 					     preempt_work[engine->id]);
 	struct intel_guc_client *client = guc->preempt_client;
 	struct guc_stage_desc *stage_desc = __get_stage_desc(client);
+	struct intel_context *ce = &client->owner->__engine[engine->id];
 	u32 ctx_desc = lower_32_bits(to_intel_context(client->owner,
 						      engine)->lrc_desc);
 	u32 data[7];
 
+	/* FIXME: Gen11+ preemption is different anyway */
+	GEM_BUG_ON(INTEL_GEN(guc_to_i915(guc)) >= 11);
+
 	/*
 	 * The ring contains commands to write GUC_PREEMPT_FINISHED into HWSP.
 	 * See guc_fill_preempt_context().
 	 */
 	spin_lock_irq(&client->wq_lock);
-	guc_wq_item_append(client, engine->guc_id, ctx_desc,
+	guc_wq_item_append(client, ce, engine->guc_id, ctx_desc,
 			   GUC_PREEMPT_BREADCRUMB_BYTES / sizeof(u64), 0);
 	spin_unlock_irq(&client->wq_lock);
 
diff --git a/drivers/gpu/drm/i915/selftests/intel_guc.c b/drivers/gpu/drm/i915/selftests/intel_guc.c
index 90ba88c..7b692ac 100644
--- a/drivers/gpu/drm/i915/selftests/intel_guc.c
+++ b/drivers/gpu/drm/i915/selftests/intel_guc.c
@@ -74,7 +74,7 @@ static int ring_doorbell_nop(struct intel_guc_client *client)
 
 	spin_lock_irq(&client->wq_lock);
 
-	guc_wq_item_append(client, 0, 0, 0, 0);
+	guc_wq_item_append(client, NULL, 0, 0, 0, 0);
 	guc_ring_doorbell(client);
 
 	spin_unlock_irq(&client->wq_lock);
-- 
1.9.1


* [PATCH 13/21] drm/i915/guc: Add support for resume-parsing wq item
  2018-08-29 19:18 ` [PATCH 10/21] drm/i915: Add hooks for (per-engine) context allocation/update/free Michal Wajdeczko
  2018-08-29 19:18   ` [PATCH 11/21] drm/i915/guc: New GuC stage descriptors Michal Wajdeczko
  2018-08-29 19:18   ` [PATCH 12/21] drm/i915/guc: New GuC workqueue item submission mechanism Michal Wajdeczko
@ 2018-08-29 19:18   ` Michal Wajdeczko
  2018-08-29 19:18   ` [PATCH 14/21] drm/i915/guc: New reset-engine command Michal Wajdeczko
                     ` (5 subsequent siblings)
  8 siblings, 0 replies; 49+ messages in thread
From: Michal Wajdeczko @ 2018-08-29 19:18 UTC (permalink / raw)
  To: intel-gfx

Since fw version 25.161, GuC lets us know when an engine had to be reset
due to a hang in another dependent engine, by setting BIT(engine_class) in
the queue_engine_error field. GuC will ignore any other wq item until this
flag is cleared.

To restart the workqueue processing for that engine, we must insert a
special wq item called resume-parsing and wait until the queue_engine_error
field is updated.
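
The sequence described above can be sketched outside the driver roughly as
follows. This is only an illustration: the *_sketch types, the helper names
and the 8192-byte queue size are assumptions; the item type values and the
"one RESUME plus three NOOP dwords" layout follow the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define WQ_TYPE_NOOP    (0x4u << 0)
#define WQ_TYPE_RESUME  (0x5u << 0)
#define WQ_TARGET_SHIFT 8
#define WQ_SIZE_SKETCH  8192u	/* assumed power-of-two queue size */

/* Hypothetical stand-ins for guc_wq_item and guc_process_desc. */
struct wq_item_sketch {
	uint32_t header;
	uint32_t context_desc;
	uint32_t submit_element_info;
	uint32_t fence_id;
};

struct proc_desc_sketch {
	uint32_t tail;
	uint32_t queue_engine_error;	/* one bit per engine class */
};

/* The firmware flags a class by setting BIT(engine_class). */
static bool needs_resume(const struct proc_desc_sketch *d,
			 unsigned int engine_class)
{
	return d->queue_engine_error & (UINT32_C(1) << engine_class);
}

/* One RESUME dword padded with three NOOP dwords to a normal item size. */
static void fill_resume_item(struct wq_item_sketch *wqi, uint32_t target_engine)
{
	const uint32_t noop = WQ_TYPE_NOOP | (target_engine << WQ_TARGET_SHIFT);

	wqi->header = WQ_TYPE_RESUME | (target_engine << WQ_TARGET_SHIFT);
	wqi->context_desc = noop;
	wqi->submit_element_info = noop;
	wqi->fence_id = noop;
}

/* Publish the item by advancing the tail, wrapping at the queue size. */
static void advance_tail(struct proc_desc_sketch *d)
{
	d->tail = (d->tail + (uint32_t)sizeof(struct wq_item_sketch)) &
		  (WQ_SIZE_SKETCH - 1);
}
```

In the driver the caller would then poll queue_engine_error until the
firmware clears the bit before appending the real work item.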

Signed-off-by: Michel Thierry <michel.thierry@intel.com>
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Michel Thierry <michel.thierry@intel.com>
Cc: Vinay Belgaumkar <vinay.belgaumkar@intel.com>
Cc: Michał Winiarski <michal.winiarski@intel.com>
Cc: Tomasz Lis <tomasz.lis@intel.com>
---
 drivers/gpu/drm/i915/intel_guc_submission.c | 92 +++++++++++++++++++++++++++--
 1 file changed, 86 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_guc_submission.c b/drivers/gpu/drm/i915/intel_guc_submission.c
index 378f97e..5df0204 100644
--- a/drivers/gpu/drm/i915/intel_guc_submission.c
+++ b/drivers/gpu/drm/i915/intel_guc_submission.c
@@ -411,6 +411,77 @@ static void guc_stage_desc_fini(struct intel_guc *guc,
 	desc->wq_size = 0;
 }
 
+static u32 get_wq_offset(struct guc_process_desc *desc)
+{
+	const size_t wqi_size = sizeof(struct guc_wq_item);
+	u32 wq_off;
+
+	/*
+	 * Free space is guaranteed, either by normal port submission or
+	 * because we waited for the wq_resume to be processed.
+	 */
+	wq_off = READ_ONCE(desc->tail);
+	GEM_BUG_ON(CIRC_SPACE(wq_off, READ_ONCE(desc->head),
+			      GUC_WQ_SIZE) < wqi_size);
+	GEM_BUG_ON(wq_off & (wqi_size - 1));
+
+	return wq_off;
+}
+
+static void write_wqi(struct guc_process_desc *desc, u32 wq_off)
+{
+	const size_t wqi_size = sizeof(struct guc_wq_item);
+
+	WRITE_ONCE(desc->tail, (wq_off + wqi_size) & (GUC_WQ_SIZE - 1));
+}
+
+static void guc_append_wq_resume_parsing_item(struct intel_guc_client *client,
+					      struct intel_engine_cs *engine)
+{
+	struct guc_process_desc *desc = __get_process_desc(client);
+	const u32 target_engine = engine->guc_id;
+	const u32 wqi_resume = WQ_TYPE_RESUME |
+			       (target_engine << WQ_TARGET_SHIFT) |
+			       (0 << WQ_LEN_SHIFT);
+	const u32 wqi_noop = WQ_TYPE_NOOP |
+			    (target_engine << WQ_TARGET_SHIFT) |
+			    (0 << WQ_LEN_SHIFT);
+	struct guc_wq_item *wqi;
+	u32 wq_off;
+
+	lockdep_assert_held(&client->wq_lock);
+
+	wq_off = get_wq_offset(desc);
+	wqi = client->vaddr + wq_off + GUC_DB_SIZE;
+
+	/*
+	 * Submit 4 wq_items (1 RESUME_WQ_PARSING followed by 3 NOOPs) in
+	 * order to keep it the same size as a 'normal' wq_item.
+	 */
+	wqi->header = wqi_resume;
+	wqi->context_desc = wqi_noop;
+	wqi->submit_element_info = wqi_noop;
+	wqi->fence_id = wqi_noop;
+
+	write_wqi(desc, wq_off);
+}
+
+#define GUC_WAIT_FOR_ENGINE_ERROR_CLEANED_MS 10
+
+static void guc_wait_wq_resumed(struct intel_guc_client *client,
+				struct intel_engine_cs *engine)
+{
+	struct guc_process_desc *desc = __get_process_desc(client);
+	const u32 target_engine_class = engine->guc_class;
+
+	lockdep_assert_held(&client->wq_lock);
+
+	/* must wait for the flag to be cleared */
+	WARN_ON(wait_for_atomic(!(desc->queue_engine_error &
+				  BIT(target_engine_class)),
+				GUC_WAIT_FOR_ENGINE_ERROR_CLEANED_MS));
+}
+
 /* Construct a Work Item and append it to the GuC's Work Queue */
 static void guc_wq_item_append(struct intel_guc_client *client,
 			       struct intel_context *ce,
@@ -438,11 +509,7 @@ static void guc_wq_item_append(struct intel_guc_client *client,
 	/* We expect the WQ to be active if we're appending items to it */
 	GEM_BUG_ON(desc->wq_status != WQ_STATUS_ACTIVE);
 
-	/* Free space is guaranteed. */
-	wq_off = READ_ONCE(desc->tail);
-	GEM_BUG_ON(CIRC_SPACE(wq_off, READ_ONCE(desc->head),
-			      GUC_WQ_SIZE) < wqi_size);
-	GEM_BUG_ON(wq_off & (wqi_size - 1));
+	wq_off = get_wq_offset(desc);
 
 	/* WQ starts from the page after doorbell / process_desc */
 	wqi = client->vaddr + wq_off + GUC_DB_SIZE;
@@ -465,7 +532,7 @@ static void guc_wq_item_append(struct intel_guc_client *client,
 	}
 
 	/* Make the update visible to GuC */
-	WRITE_ONCE(desc->tail, (wq_off + wqi_size) & (GUC_WQ_SIZE - 1));
+	write_wqi(desc, wq_off);
 }
 
 static void guc_reset_wq(struct intel_guc_client *client)
@@ -497,6 +564,14 @@ static void guc_ring_doorbell(struct intel_guc_client *client)
 	GEM_BUG_ON(db->db_status != GUC_DOORBELL_ENABLED);
 }
 
+static bool guc_needs_wq_resume_parsing_item(struct intel_guc_client *client,
+					     u32 target_engine_class)
+{
+	struct guc_process_desc *desc = __get_process_desc(client);
+
+	return desc->queue_engine_error & BIT(target_engine_class);
+}
+
 static void guc_add_request(struct intel_guc *guc, struct i915_request *rq)
 {
 	struct intel_guc_client *client = guc->execbuf_client;
@@ -507,6 +582,11 @@ static void guc_add_request(struct intel_guc *guc, struct i915_request *rq)
 
 	spin_lock(&client->wq_lock);
 
+	if (guc_needs_wq_resume_parsing_item(client, engine->guc_class)) {
+		guc_append_wq_resume_parsing_item(client, engine);
+		guc_wait_wq_resumed(client, engine);
+	}
+
 	guc_wq_item_append(client, ce, engine->guc_id, ctx_desc,
 			   ring_tail, rq->global_seqno);
 	guc_ring_doorbell(client);
-- 
1.9.1


* [PATCH 14/21] drm/i915/guc: New reset-engine command
  2018-08-29 19:18 ` [PATCH 10/21] drm/i915: Add hooks for (per-engine) context allocation/update/free Michal Wajdeczko
                     ` (2 preceding siblings ...)
  2018-08-29 19:18   ` [PATCH 13/21] drm/i915/guc: Add support for resume-parsing wq item Michal Wajdeczko
@ 2018-08-29 19:18   ` Michal Wajdeczko
  2018-08-29 19:18   ` [PATCH 15/21] drm/i915/guc: Support for extended GuC notification messages Michal Wajdeczko
                     ` (4 subsequent siblings)
  8 siblings, 0 replies; 49+ messages in thread
From: Michal Wajdeczko @ 2018-08-29 19:18 UTC (permalink / raw)
  To: intel-gfx

The format of the ENGINE_RESET H2G message has been updated. Additionally,
the firmware will send a G2H ENGINE_RESET_COMPLETE message (with the
engine's guc_class in data[2]) to confirm that the reset has been
completed (but this will be handled in another patch).

Requires GuC fw v25.161+.
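
The request/confirm handshake can be pictured with the standalone sketch
below. It is a simplified model, not the driver code: struct guc_sketch, the
callback types and the stub callbacks are hypothetical, and a bounded poll
loop stands in for the driver's timed wait. What it keeps from the patch is
the order of operations: set the per-class flag, send the request, wait for
the flag to be cleared by the completion message, and clear the flag
ourselves on any failure.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical stand-in for the relevant part of struct intel_guc. */
struct guc_sketch {
	uint32_t class_under_reset;	/* one bit per engine class */
};

/* Sends the H2G reset request; returns 0 or a negative errno. */
typedef int (*send_fn)(struct guc_sketch *guc, uint32_t guc_class);
/* Stands in for processing of incoming G2H messages. */
typedef void (*poll_fn)(struct guc_sketch *guc);

static int reset_engine_sketch(struct guc_sketch *guc, uint32_t guc_class,
			       send_fn send, poll_fn poll, int max_polls)
{
	const uint32_t bit = UINT32_C(1) << guc_class;
	int ret;

	/* Only one reset per class may be in flight. */
	assert(!(guc->class_under_reset & bit));
	guc->class_under_reset |= bit;

	ret = send(guc, guc_class);
	if (!ret) {
		/* Wait for the completion handler to clear the flag. */
		ret = -ETIMEDOUT;
		for (int i = 0; i < max_polls; i++) {
			poll(guc);
			if (!(guc->class_under_reset & bit)) {
				ret = 0;
				break;
			}
		}
	}

	/* On any failure drop the flag; the caller falls back to full reset. */
	if (ret)
		guc->class_under_reset &= ~bit;

	return ret;
}

/* Stub callbacks used only to exercise the sketch. */
static int send_ok(struct guc_sketch *guc, uint32_t guc_class)
{
	(void)guc;
	(void)guc_class;
	return 0;
}

static void g2h_complete_all(struct guc_sketch *guc)
{
	guc->class_under_reset = 0;
}

static void g2h_silent(struct guc_sketch *guc)
{
	(void)guc;
}
```

With send_ok and g2h_complete_all the call returns 0; with g2h_silent it
returns -ETIMEDOUT and leaves the flag cleared.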

Credits-to: Michel Thierry <michel.thierry@intel.com>
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Michel Thierry <michel.thierry@intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Vinay Belgaumkar <vinay.belgaumkar@intel.com>
Cc: Michal Winiarski <michal.winiarski@intel.com>
Cc: Tomasz Lis <tomasz.lis@intel.com>
---
 drivers/gpu/drm/i915/intel_guc.c | 73 ++++++++++++++++++++++++++++++++++------
 drivers/gpu/drm/i915/intel_guc.h |  6 ++++
 2 files changed, 69 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_guc.c b/drivers/gpu/drm/i915/intel_guc.c
index a9c2f7b..64f1dca 100644
--- a/drivers/gpu/drm/i915/intel_guc.c
+++ b/drivers/gpu/drm/i915/intel_guc.c
@@ -576,27 +576,80 @@ int intel_guc_suspend(struct intel_guc *guc)
 	return intel_guc_send(guc, data, ARRAY_SIZE(data));
 }
 
+static inline void
+guc_set_class_under_reset(struct intel_guc *guc, unsigned int guc_class)
+{
+	GEM_BUG_ON(guc_class >= GUC_MAX_ENGINE_CLASSES);
+	bitmap32_set_bit(&guc->engine_class_under_reset, guc_class);
+}
+
+static inline void
+guc_clear_class_under_reset(struct intel_guc *guc, unsigned int guc_class)
+{
+	GEM_BUG_ON(guc_class >= GUC_MAX_ENGINE_CLASSES);
+	bitmap32_clear_bit(&guc->engine_class_under_reset, guc_class);
+}
+
+static inline bool
+guc_is_class_under_reset(struct intel_guc *guc, unsigned int guc_class)
+{
+	return bitmap32_test_bit(&guc->engine_class_under_reset, guc_class);
+}
+
+static int __guc_action_reset_engine(struct intel_guc *guc,
+				     u32 guc_class,
+				     u32 stage_id)
+{
+	u32 action[] = {
+		INTEL_GUC_ACTION_REQUEST_ENGINE_RESET,
+		guc_class,
+		stage_id,
+	};
+
+	return intel_guc_send(guc, action, ARRAY_SIZE(action));
+}
+
+#define GUC_ENGINE_RESET_COMPLETE_WAIT_MS 100
+
 /**
  * intel_guc_reset_engine() - ask GuC to reset an engine
  * @guc:	intel_guc structure
  * @engine:	engine to be reset
+ *
+ * Ask GuC to reset an engine. The firmware will send a separate
+ * ENGINE_RESET_COMPLETE message (with the engine's guc_class)
+ * to confirm that the reset has been completed.
  */
 int intel_guc_reset_engine(struct intel_guc *guc,
 			   struct intel_engine_cs *engine)
 {
-	u32 data[7];
+	struct intel_guc_client *client = guc->execbuf_client;
+	u32 guc_class = engine->guc_class;
+	int ret;
 
-	GEM_BUG_ON(!guc->execbuf_client);
+	GEM_BUG_ON(guc_is_class_under_reset(guc, guc_class));
+	guc_set_class_under_reset(guc, guc_class);
 
-	data[0] = INTEL_GUC_ACTION_REQUEST_ENGINE_RESET;
-	data[1] = engine->guc_id;
-	data[2] = 0;
-	data[3] = 0;
-	data[4] = 0;
-	data[5] = guc->execbuf_client->stage_id;
-	data[6] = intel_guc_ggtt_offset(guc, guc->shared_data);
+	ret = __guc_action_reset_engine(guc, guc_class, client->stage_id);
+	if (ret)
+		goto out;
 
-	return intel_guc_send(guc, data, ARRAY_SIZE(data));
+	if ((wait_for(!guc_is_class_under_reset(guc, guc_class),
+		      GUC_ENGINE_RESET_COMPLETE_WAIT_MS))) {
+		DRM_ERROR("reset_complete timed out, engine class %d\n",
+			  engine->guc_class);
+		ret = -ETIMEDOUT;
+	}
+
+out:
+	/*
+	 * Clear flag on any failure, we fall back to full reset in
+	 * case of timeout/error.
+	 */
+	if (ret)
+		guc_clear_class_under_reset(guc, guc_class);
+
+	return ret;
 }
 
 /**
diff --git a/drivers/gpu/drm/i915/intel_guc.h b/drivers/gpu/drm/i915/intel_guc.h
index ad42faf..8688edc 100644
--- a/drivers/gpu/drm/i915/intel_guc.h
+++ b/drivers/gpu/drm/i915/intel_guc.h
@@ -74,6 +74,12 @@ struct intel_guc {
 	/* Cyclic counter mod pagesize	*/
 	u32 db_cacheline;
 
+	/*
+	 * Track outstanding request-engine-reset h2g commands,
+	 * accessed by guc_{set,clear,is}_class_under_reset()
+	 */
+	u32 engine_class_under_reset;
+
 	/* GuC's FW specific registers used in MMIO send */
 	struct {
 		u32 base;
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 49+ messages in thread
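
For illustration, the set/send/wait/clear handshake in the patch above can be modelled outside the kernel. This is only a userspace sketch under stated assumptions, not the driver code: struct reset_tracker, the tracker_*() helpers and reset_engine_model() are hypothetical names, plain bit operations stand in for the bitmap32_*() helpers, and the send result and the arrival of the completion message are passed in as parameters instead of calling intel_guc_send() and wait_for().

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_ENGINE_CLASSES 5u /* same value as GUC_MAX_ENGINE_CLASSES */

/* Hypothetical stand-in for guc->engine_class_under_reset. */
struct reset_tracker {
	uint32_t under_reset; /* one bit per GuC engine class */
};

void tracker_set(struct reset_tracker *t, unsigned int guc_class)
{
	assert(guc_class < MAX_ENGINE_CLASSES); /* plays the role of GEM_BUG_ON() */
	t->under_reset |= UINT32_C(1) << guc_class;
}

void tracker_clear(struct reset_tracker *t, unsigned int guc_class)
{
	assert(guc_class < MAX_ENGINE_CLASSES);
	t->under_reset &= ~(UINT32_C(1) << guc_class);
}

bool tracker_test(const struct reset_tracker *t, unsigned int guc_class)
{
	return guc_class < MAX_ENGINE_CLASSES &&
	       (t->under_reset & (UINT32_C(1) << guc_class)) != 0;
}

/*
 * send_ret models the result of sending the H2G request;
 * completion_arrives models whether the G2H answer shows up in time.
 */
int reset_engine_model(struct reset_tracker *t, unsigned int guc_class,
		       int send_ret, bool completion_arrives)
{
	int ret = send_ret;

	assert(!tracker_test(t, guc_class));
	tracker_set(t, guc_class);

	if (!ret) {
		/* In the driver, the G2H handler clears the bit asynchronously. */
		if (completion_arrives)
			tracker_clear(t, guc_class);
		/* Stand-in for the bounded wait_for() poll. */
		if (tracker_test(t, guc_class))
			ret = -ETIMEDOUT;
	}

	/* On any failure drop the flag so a full reset can take over. */
	if (ret)
		tracker_clear(t, guc_class);

	return ret;
}
```

In every outcome the bit ends up cleared, either by the completion path or by the error path, so a later request for the same class does not trip the initial assertion.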

* [PATCH 15/21] drm/i915/guc: Support for extended GuC notification messages
  2018-08-29 19:18 ` [PATCH 10/21] drm/i915: Add hooks for (per-engine) context allocation/update/free Michal Wajdeczko
                     ` (3 preceding siblings ...)
  2018-08-29 19:18   ` [PATCH 14/21] drm/i915/guc: New reset-engine command Michal Wajdeczko
@ 2018-08-29 19:18   ` Michal Wajdeczko
  2018-08-29 19:18   ` [PATCH 16/21] drm/i915/guc: New engine-reset-complete message Michal Wajdeczko
                     ` (3 subsequent siblings)
  8 siblings, 0 replies; 49+ messages in thread
From: Michal Wajdeczko @ 2018-08-29 19:18 UTC (permalink / raw)
  To: intel-gfx

GuC may send notification messages with a payload larger than a
single u32. Prepare the driver to accept those messages.

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Michel Thierry <michel.thierry@intel.com>
Cc: Vinay Belgaumkar <vinay.belgaumkar@intel.com>
Cc: Michal Winiarski <michal.winiarski@intel.com>
Cc: Tomasz Lis <tomasz.lis@intel.com>
---
 drivers/gpu/drm/i915/intel_guc.c    | 13 ++++++++++---
 drivers/gpu/drm/i915/intel_guc.h    |  3 ++-
 drivers/gpu/drm/i915/intel_guc_ct.c |  5 ++++-
 3 files changed, 16 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_guc.c b/drivers/gpu/drm/i915/intel_guc.c
index 64f1dca..da61115 100644
--- a/drivers/gpu/drm/i915/intel_guc.c
+++ b/drivers/gpu/drm/i915/intel_guc.c
@@ -511,17 +511,24 @@ void intel_guc_to_host_event_handler_mmio(struct intel_guc *guc)
 	spin_unlock(&guc->irq_lock);
 	enable_rpm_wakeref_asserts(dev_priv);
 
-	intel_guc_to_host_process_recv_msg(guc, msg);
+	intel_guc_to_host_process_recv_msg(guc, &msg, 1);
 }
 
-void intel_guc_to_host_process_recv_msg(struct intel_guc *guc, u32 msg)
+int intel_guc_to_host_process_recv_msg(struct intel_guc *guc,
+				       const u32 *payload, u32 len)
 {
+	u32 msg;
+
+	GEM_BUG_ON(!len);
+
 	/* Make sure to handle only enabled messages */
-	msg &= guc->msg_enabled_mask;
+	msg = payload[0] & guc->msg_enabled_mask;
 
 	if (msg & (INTEL_GUC_RECV_MSG_FLUSH_LOG_BUFFER |
 		   INTEL_GUC_RECV_MSG_CRASH_DUMP_POSTED))
 		intel_guc_log_handle_flush_event(&guc->log);
+
+	return 0;
 }
 
 int intel_guc_sample_forcewake(struct intel_guc *guc)
diff --git a/drivers/gpu/drm/i915/intel_guc.h b/drivers/gpu/drm/i915/intel_guc.h
index 8688edc..1345fe0 100644
--- a/drivers/gpu/drm/i915/intel_guc.h
+++ b/drivers/gpu/drm/i915/intel_guc.h
@@ -165,7 +165,8 @@ int intel_guc_send_mmio(struct intel_guc *guc, const u32 *action, u32 len,
 void intel_guc_to_host_event_handler(struct intel_guc *guc);
 void intel_guc_to_host_event_handler_nop(struct intel_guc *guc);
 void intel_guc_to_host_event_handler_mmio(struct intel_guc *guc);
-void intel_guc_to_host_process_recv_msg(struct intel_guc *guc, u32 msg);
+int intel_guc_to_host_process_recv_msg(struct intel_guc *guc,
+				       const u32 *payload, u32 len);
 int intel_guc_sample_forcewake(struct intel_guc *guc);
 int intel_guc_auth_huc(struct intel_guc *guc, u32 rsa_offset);
 int intel_guc_suspend(struct intel_guc *guc);
diff --git a/drivers/gpu/drm/i915/intel_guc_ct.c b/drivers/gpu/drm/i915/intel_guc_ct.c
index a52883e..2077166 100644
--- a/drivers/gpu/drm/i915/intel_guc_ct.c
+++ b/drivers/gpu/drm/i915/intel_guc_ct.c
@@ -709,6 +709,7 @@ static void ct_process_request(struct intel_guc_ct *ct,
 			       u32 action, u32 len, const u32 *payload)
 {
 	struct intel_guc *guc = ct_to_guc(ct);
+	int ret;
 
 	CT_DEBUG_DRIVER("CT: request %x %*ph\n", action, 4 * len, payload);
 
@@ -716,7 +717,9 @@ static void ct_process_request(struct intel_guc_ct *ct,
 	case INTEL_GUC_ACTION_DEFAULT:
 		if (unlikely(len < 1))
 			goto fail_unexpected;
-		intel_guc_to_host_process_recv_msg(guc, *payload);
+		ret = intel_guc_to_host_process_recv_msg(guc, payload, len);
+		if (ret)
+			goto fail_unexpected;
 		break;
 
 	default:
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 49+ messages in thread
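
As a rough illustration of the interface change above, the following userspace sketch shows a handler that takes a dword array plus a length, with the MMIO path wrapping its single register value as a one-dword payload. The names struct msg_state, process_recv_msg() and process_mmio_msg() are hypothetical, the log flush is reduced to a counter, and a zero length is reported with -EINVAL here whereas the patch uses GEM_BUG_ON().

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Bit positions taken from enum intel_guc_recv_message. */
#define MSG_CRASH_DUMP_POSTED (UINT32_C(1) << 1)
#define MSG_FLUSH_LOG_BUFFER  (UINT32_C(1) << 3)

/* Hypothetical stand-in for the relevant parts of struct intel_guc. */
struct msg_state {
	uint32_t enabled_mask;    /* which messages the host wants to handle */
	unsigned int log_flushes; /* counts simulated log flush events */
};

/* payload[0] carries the message bits; further dwords are message data. */
int process_recv_msg(struct msg_state *s, const uint32_t *payload, uint32_t len)
{
	uint32_t msg;

	assert(s != NULL);
	if (payload == NULL || len == 0)
		return -EINVAL;

	/* Make sure to handle only enabled messages. */
	msg = payload[0] & s->enabled_mask;

	if (msg & (MSG_FLUSH_LOG_BUFFER | MSG_CRASH_DUMP_POSTED))
		s->log_flushes++;

	return 0;
}

/* The MMIO path only ever has one dword, so it passes a length of 1. */
int process_mmio_msg(struct msg_state *s, uint32_t reg_value)
{
	return process_recv_msg(s, &reg_value, 1);
}
```

Returning an int lets the CT caller treat a malformed message like any other unexpected request.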

* [PATCH 16/21] drm/i915/guc: New engine-reset-complete message
  2018-08-29 19:18 ` [PATCH 10/21] drm/i915: Add hooks for (per-engine) context allocation/update/free Michal Wajdeczko
                     ` (4 preceding siblings ...)
  2018-08-29 19:18   ` [PATCH 15/21] drm/i915/guc: Support for extended GuC notification messages Michal Wajdeczko
@ 2018-08-29 19:18   ` Michal Wajdeczko
  2018-08-29 19:18   ` [PATCH 17/21] drm/i915/guc: New GuC interrupt register for Gen11 Michal Wajdeczko
                     ` (2 subsequent siblings)
  8 siblings, 0 replies; 49+ messages in thread
From: Michal Wajdeczko @ 2018-08-29 19:18 UTC (permalink / raw)
  To: intel-gfx

GuC sends an ENGINE_RESET_COMPLETE message as a follow-up answer
to an earlier ENGINE_RESET request from the host. Once this message
is received, clear the engine reset flag to unblock our reset process.

Credits-to: Michel Thierry <michel.thierry@intel.com>
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Michel Thierry <michel.thierry@intel.com>
Cc: Vinay Belgaumkar <vinay.belgaumkar@intel.com>
Cc: Michal Winiarski <michal.winiarski@intel.com>
Cc: Tomasz Lis <tomasz.lis@intel.com>
---
 drivers/gpu/drm/i915/intel_guc.c      | 29 +++++++++++++++++++++++++++++
 drivers/gpu/drm/i915/intel_guc_fwif.h |  3 ++-
 2 files changed, 31 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/intel_guc.c b/drivers/gpu/drm/i915/intel_guc.c
index da61115..9a177ff 100644
--- a/drivers/gpu/drm/i915/intel_guc.c
+++ b/drivers/gpu/drm/i915/intel_guc.c
@@ -27,6 +27,9 @@
 #include "intel_guc_submission.h"
 #include "i915_drv.h"
 
+static void guc_handle_engine_reset_completed(struct intel_guc *guc,
+					      const u32 engine_class);
+
 static void gen8_guc_raise_irq(struct intel_guc *guc)
 {
 	struct drm_i915_private *dev_priv = guc_to_i915(guc);
@@ -528,6 +531,12 @@ int intel_guc_to_host_process_recv_msg(struct intel_guc *guc,
 		   INTEL_GUC_RECV_MSG_CRASH_DUMP_POSTED))
 		intel_guc_log_handle_flush_event(&guc->log);
 
+	if (msg & INTEL_GUC_RECV_MSG_ENGINE_RESET_COMPLETE) {
+		if (len != 3)
+			return -EPROTO;
+		guc_handle_engine_reset_completed(guc, payload[1]);
+	}
+
 	return 0;
 }
 
@@ -588,6 +597,7 @@ int intel_guc_suspend(struct intel_guc *guc)
 {
 	GEM_BUG_ON(guc_class >= GUC_MAX_ENGINE_CLASSES);
 	bitmap32_set_bit(&guc->engine_class_under_reset, guc_class);
+	intel_guc_enable_msg(guc, INTEL_GUC_RECV_MSG_ENGINE_RESET_COMPLETE);
 }
 
 static inline void
@@ -595,6 +605,7 @@ int intel_guc_suspend(struct intel_guc *guc)
 {
 	GEM_BUG_ON(guc_class >= GUC_MAX_ENGINE_CLASSES);
 	bitmap32_clear_bit(&guc->engine_class_under_reset, guc_class);
+	intel_guc_disable_msg(guc, INTEL_GUC_RECV_MSG_ENGINE_RESET_COMPLETE);
 }
 
 static inline bool
@@ -659,6 +670,24 @@ int intel_guc_reset_engine(struct intel_guc *guc,
 	return ret;
 }
 
+/*
+ * GuC notifies the host that the engine reset has completed.
+ * This message should only be received after a request-reset h2g,
+ * so check that and clear the engine_class_under_reset flag.
+ */
+static void guc_handle_engine_reset_completed(struct intel_guc *guc,
+					      const u32 engine_class)
+{
+	if (engine_class >= GUC_MAX_ENGINE_CLASSES ||
+	    !guc_is_class_under_reset(guc, engine_class)) {
+		DRM_WARN("Unexpected reset-complete for engine class: %u\n",
+			 engine_class);
+		return;
+	}
+
+	guc_clear_class_under_reset(guc, engine_class);
+}
+
 /**
  * intel_guc_resume() - notify GuC resuming from suspend state
  * @guc:	the guc
diff --git a/drivers/gpu/drm/i915/intel_guc_fwif.h b/drivers/gpu/drm/i915/intel_guc_fwif.h
index 156db08..1958581 100644
--- a/drivers/gpu/drm/i915/intel_guc_fwif.h
+++ b/drivers/gpu/drm/i915/intel_guc_fwif.h
@@ -757,7 +757,8 @@ enum intel_guc_response_status {
 /* This action will be programmed in C1BC - SOFT_SCRATCH_15_REG */
 enum intel_guc_recv_message {
 	INTEL_GUC_RECV_MSG_CRASH_DUMP_POSTED = BIT(1),
-	INTEL_GUC_RECV_MSG_FLUSH_LOG_BUFFER = BIT(3)
+	INTEL_GUC_RECV_MSG_FLUSH_LOG_BUFFER = BIT(3),
+	INTEL_GUC_RECV_MSG_ENGINE_RESET_COMPLETE = BIT(25),
 };
 
 #endif
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 49+ messages in thread
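
The validation order used by the patch above (message bit, then length, then class range and pending flag) can be sketched in plain C. This is an illustrative model only: handle_reset_complete() is a hypothetical name, the pending-reset mask is passed by pointer instead of living in struct intel_guc, and the sketch follows the patch code in reading the class from payload[1].

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define MSG_ENGINE_RESET_COMPLETE (UINT32_C(1) << 25)
#define MAX_ENGINE_CLASSES        5u /* same value as GUC_MAX_ENGINE_CLASSES */
#define RESET_COMPLETE_LEN        3u /* length the patch expects */

/*
 * Returns 0 when the message was handled or ignored and -EPROTO when
 * its length does not match what the patch expects.
 */
int handle_reset_complete(uint32_t *under_reset,
			  const uint32_t *payload, uint32_t len)
{
	uint32_t guc_class;

	assert(under_reset && payload && len);

	if (!(payload[0] & MSG_ENGINE_RESET_COMPLETE))
		return 0; /* some other message, nothing to do here */

	if (len != RESET_COMPLETE_LEN)
		return -EPROTO;

	guc_class = payload[1];

	/* Range check first so the shift below is always well defined. */
	if (guc_class >= MAX_ENGINE_CLASSES ||
	    !(*under_reset & (UINT32_C(1) << guc_class)))
		return 0; /* unexpected completion; the patch only warns */

	*under_reset &= ~(UINT32_C(1) << guc_class);
	return 0;
}
```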

* [PATCH 17/21] drm/i915/guc: New GuC interrupt register for Gen11
  2018-08-29 19:18 ` [PATCH 10/21] drm/i915: Add hooks for (per-engine) context allocation/update/free Michal Wajdeczko
                     ` (5 preceding siblings ...)
  2018-08-29 19:18   ` [PATCH 16/21] drm/i915/guc: New engine-reset-complete message Michal Wajdeczko
@ 2018-08-29 19:18   ` Michal Wajdeczko
  2018-08-29 19:18   ` [PATCH 18/21] drm/i915/guc: New GuC scratch registers " Michal Wajdeczko
  2018-08-29 19:18   ` [PATCH 19/21] drm/i915/huc: New HuC status register " Michal Wajdeczko
  8 siblings, 0 replies; 49+ messages in thread
From: Michal Wajdeczko @ 2018-08-29 19:18 UTC (permalink / raw)
  To: intel-gfx; +Cc: Rodrigo Vivi

Gen11 defines a new, more flexible Host-to-GuC interrupt register.
Now the host can write any 32-bit payload to trigger an interrupt
and GuC can additionally read this payload from the register.
The current GuC firmware ignores the payload, so we just write 0.

Bspec: 21043

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Michel Thierry <michel.thierry@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
---
 drivers/gpu/drm/i915/intel_guc.c     | 14 +++++++++++++-
 drivers/gpu/drm/i915/intel_guc_reg.h |  1 +
 2 files changed, 14 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/intel_guc.c b/drivers/gpu/drm/i915/intel_guc.c
index 9a177ff..b4a6a92 100644
--- a/drivers/gpu/drm/i915/intel_guc.c
+++ b/drivers/gpu/drm/i915/intel_guc.c
@@ -37,6 +37,13 @@ static void gen8_guc_raise_irq(struct intel_guc *guc)
 	I915_WRITE(GUC_SEND_INTERRUPT, GUC_SEND_TRIGGER);
 }
 
+static void gen11_guc_raise_irq(struct intel_guc *guc)
+{
+	struct drm_i915_private *dev_priv = guc_to_i915(guc);
+
+	I915_WRITE(GEN11_GUC_HOST_INTERRUPT, 0);
+}
+
 static inline i915_reg_t guc_send_reg(struct intel_guc *guc, u32 i)
 {
 	GEM_BUG_ON(!guc->send_regs.base);
@@ -65,6 +72,8 @@ void intel_guc_init_send_regs(struct intel_guc *guc)
 
 void intel_guc_init_early(struct intel_guc *guc)
 {
+	struct drm_i915_private *i915 = guc_to_i915(guc);
+
 	intel_guc_fw_init_early(guc);
 	intel_guc_ct_init_early(&guc->ct);
 	intel_guc_log_init_early(&guc->log);
@@ -73,7 +82,10 @@ void intel_guc_init_early(struct intel_guc *guc)
 	spin_lock_init(&guc->irq_lock);
 	guc->send = intel_guc_send_nop;
 	guc->handler = intel_guc_to_host_event_handler_nop;
-	guc->notify = gen8_guc_raise_irq;
+	if (INTEL_GEN(i915) >= 11)
+		guc->notify = gen11_guc_raise_irq;
+	else
+		guc->notify = gen8_guc_raise_irq;
 }
 
 static int guc_init_wq(struct intel_guc *guc)
diff --git a/drivers/gpu/drm/i915/intel_guc_reg.h b/drivers/gpu/drm/i915/intel_guc_reg.h
index d860847..1542199 100644
--- a/drivers/gpu/drm/i915/intel_guc_reg.h
+++ b/drivers/gpu/drm/i915/intel_guc_reg.h
@@ -103,6 +103,7 @@
 
 #define GUC_SEND_INTERRUPT		_MMIO(0xc4c8)
 #define   GUC_SEND_TRIGGER		  (1<<0)
+#define GEN11_GUC_HOST_INTERRUPT	_MMIO(0x1901f0)
 
 #define GEN8_DRBREGL(x)			_MMIO(0x1000 + (x) * 8)
 #define   GEN8_DRB_VALID		  (1<<0)
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 49+ messages in thread
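
The per-generation notify hook from the patch above can be pictured with a small userspace model. This is a sketch under assumptions rather than driver code: struct fake_guc and the fake_*() functions are hypothetical, and a register write is simulated by recording the offset and value instead of calling I915_WRITE().

```c
#include <assert.h>
#include <stdint.h>

/* Offsets and trigger bit as defined in intel_guc_reg.h by this series. */
#define GUC_SEND_INTERRUPT_OFFSET       0xc4c8u
#define GUC_SEND_TRIGGER                (1u << 0)
#define GEN11_GUC_HOST_INTERRUPT_OFFSET 0x1901f0u

/* Hypothetical model: records the last simulated MMIO write. */
struct fake_guc {
	uint32_t last_reg;
	uint32_t last_val;
	void (*notify)(struct fake_guc *guc);
};

void fake_gen8_raise_irq(struct fake_guc *guc)
{
	guc->last_reg = GUC_SEND_INTERRUPT_OFFSET;
	guc->last_val = GUC_SEND_TRIGGER;
}

void fake_gen11_raise_irq(struct fake_guc *guc)
{
	/* Any payload would trigger the interrupt; firmware ignores it for now. */
	guc->last_reg = GEN11_GUC_HOST_INTERRUPT_OFFSET;
	guc->last_val = 0;
}

/* Pick the hook once at init time, based on the hardware generation. */
void fake_guc_init_early(struct fake_guc *guc, unsigned int gen)
{
	guc->last_reg = 0;
	guc->last_val = 0;
	guc->notify = gen >= 11 ? fake_gen11_raise_irq : fake_gen8_raise_irq;
}

void fake_guc_notify(struct fake_guc *guc)
{
	assert(guc->notify);
	guc->notify(guc);
}
```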

* [PATCH 18/21] drm/i915/guc: New GuC scratch registers for Gen11
  2018-08-29 19:18 ` [PATCH 10/21] drm/i915: Add hooks for (per-engine) context allocation/update/free Michal Wajdeczko
                     ` (6 preceding siblings ...)
  2018-08-29 19:18   ` [PATCH 17/21] drm/i915/guc: New GuC interrupt register for Gen11 Michal Wajdeczko
@ 2018-08-29 19:18   ` Michal Wajdeczko
  2018-08-29 19:18   ` [PATCH 19/21] drm/i915/huc: New HuC status register " Michal Wajdeczko
  8 siblings, 0 replies; 49+ messages in thread
From: Michal Wajdeczko @ 2018-08-29 19:18 UTC (permalink / raw)
  To: intel-gfx; +Cc: Rodrigo Vivi

Gen11 adds a new set of scratch registers that can be used for MMIO
based Host-to-GuC communication. Due to the limited number of these
registers, it is expected that the host will use them only for command
transport buffers (CTB) communication setup when CTB is available.

Bspec: 21044

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Michel Thierry <michel.thierry@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
---
 drivers/gpu/drm/i915/intel_guc.c     | 10 ++++++++--
 drivers/gpu/drm/i915/intel_guc_reg.h |  3 +++
 2 files changed, 11 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_guc.c b/drivers/gpu/drm/i915/intel_guc.c
index b4a6a92..586edfe 100644
--- a/drivers/gpu/drm/i915/intel_guc.c
+++ b/drivers/gpu/drm/i915/intel_guc.c
@@ -59,8 +59,14 @@ void intel_guc_init_send_regs(struct intel_guc *guc)
 	enum forcewake_domains fw_domains = 0;
 	unsigned int i;
 
-	guc->send_regs.base = i915_mmio_reg_offset(SOFT_SCRATCH(0));
-	guc->send_regs.count = SOFT_SCRATCH_COUNT - 1;
+	if (HAS_GUC_CT(dev_priv) && INTEL_GEN(dev_priv) >= 11) {
+		guc->send_regs.base =
+				i915_mmio_reg_offset(GEN11_SOFT_SCRATCH(0));
+		guc->send_regs.count = GEN11_SOFT_SCRATCH_COUNT;
+	} else {
+		guc->send_regs.base = i915_mmio_reg_offset(SOFT_SCRATCH(0));
+		guc->send_regs.count = SOFT_SCRATCH_COUNT - 1;
+	}
 
 	for (i = 0; i < guc->send_regs.count; i++) {
 		fw_domains |= intel_uncore_forcewake_for_reg(dev_priv,
diff --git a/drivers/gpu/drm/i915/intel_guc_reg.h b/drivers/gpu/drm/i915/intel_guc_reg.h
index 1542199..2149209 100644
--- a/drivers/gpu/drm/i915/intel_guc_reg.h
+++ b/drivers/gpu/drm/i915/intel_guc_reg.h
@@ -51,6 +51,9 @@
 #define SOFT_SCRATCH(n)			_MMIO(0xc180 + (n) * 4)
 #define SOFT_SCRATCH_COUNT		16
 
+#define GEN11_SOFT_SCRATCH(n)		_MMIO(0x190240 + (n) * 4)
+#define GEN11_SOFT_SCRATCH_COUNT	4
+
 #define UOS_RSA_SCRATCH(i)		_MMIO(0xc200 + (i) * 4)
 #define UOS_RSA_SCRATCH_COUNT		64
 
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 49+ messages in thread
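
To make the register arithmetic in the patch above concrete, here is a small sketch of the base/count selection and the resulting per-register offsets. struct send_regs, pick_send_regs() and send_reg_offset() are hypothetical names, and the 4-byte stride is taken from the SOFT_SCRATCH(n) and GEN11_SOFT_SCRATCH(n) macros.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Values from intel_guc_reg.h as changed by this series. */
#define SOFT_SCRATCH_BASE        0xc180u
#define SOFT_SCRATCH_COUNT       16u
#define GEN11_SOFT_SCRATCH_BASE  0x190240u
#define GEN11_SOFT_SCRATCH_COUNT 4u

struct send_regs {
	uint32_t base;
	unsigned int count;
};

/* Gen11 with CT uses the new bank; everything else keeps the legacy one. */
struct send_regs pick_send_regs(unsigned int gen, bool has_guc_ct)
{
	struct send_regs regs;

	if (has_guc_ct && gen >= 11) {
		regs.base = GEN11_SOFT_SCRATCH_BASE;
		regs.count = GEN11_SOFT_SCRATCH_COUNT;
	} else {
		regs.base = SOFT_SCRATCH_BASE;
		regs.count = SOFT_SCRATCH_COUNT - 1; /* as in the legacy branch */
	}
	return regs;
}

/* Offset of the i-th send register; registers are 4 bytes apart. */
uint32_t send_reg_offset(const struct send_regs *regs, unsigned int i)
{
	assert(regs->base);
	assert(i < regs->count);
	return regs->base + 4 * i;
}
```

With only four registers on Gen11, an MMIO message is limited to four dwords, which is consistent with using this path just to set up CTB.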

* [PATCH 19/21] drm/i915/huc: New HuC status register for Gen11
  2018-08-29 19:18 ` [PATCH 10/21] drm/i915: Add hooks for (per-engine) context allocation/update/free Michal Wajdeczko
                     ` (7 preceding siblings ...)
  2018-08-29 19:18   ` [PATCH 18/21] drm/i915/guc: New GuC scratch registers " Michal Wajdeczko
@ 2018-08-29 19:18   ` Michal Wajdeczko
  2018-08-30 22:59     ` John Spotswood
  8 siblings, 1 reply; 49+ messages in thread
From: Michal Wajdeczko @ 2018-08-29 19:18 UTC (permalink / raw)
  To: intel-gfx; +Cc: Rodrigo Vivi

Gen11 defines a new register for checking HuC authentication status.
Look into the right register and bit.

Bspec: 19686

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Tony Ye <tony.ye@intel.com>
Cc: Vinay Belgaumkar <vinay.belgaumkar@intel.com>
Cc: Michel Thierry <michel.thierry@intel.com>
Cc: John Spotswood <john.a.spotswood@intel.com>
Cc: Anusha Srivatsa <anusha.srivatsa@intel.com>
---
 drivers/gpu/drm/i915/intel_guc_reg.h |  3 ++
 drivers/gpu/drm/i915/intel_huc.c     | 58 +++++++++++++++++++++++++++++++-----
 2 files changed, 53 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_guc_reg.h b/drivers/gpu/drm/i915/intel_guc_reg.h
index 2149209..de36595 100644
--- a/drivers/gpu/drm/i915/intel_guc_reg.h
+++ b/drivers/gpu/drm/i915/intel_guc_reg.h
@@ -79,6 +79,9 @@
 #define HUC_STATUS2             _MMIO(0xD3B0)
 #define   HUC_FW_VERIFIED       (1<<7)
 
+#define GEN11_HUC_KERNEL_LOAD_INFO	_MMIO(0xC1DC)
+#define   HUC_LOAD_SUCCESSFUL		  (1 << 0)
+
 #define GUC_WOPCM_SIZE			_MMIO(0xc050)
 #define   GUC_WOPCM_SIZE_LOCKED		  (1<<0)
 #define   GUC_WOPCM_SIZE_SHIFT		12
diff --git a/drivers/gpu/drm/i915/intel_huc.c b/drivers/gpu/drm/i915/intel_huc.c
index 37ef540d..a710c0d 100644
--- a/drivers/gpu/drm/i915/intel_huc.c
+++ b/drivers/gpu/drm/i915/intel_huc.c
@@ -40,6 +40,47 @@ int intel_huc_init_misc(struct intel_huc *huc)
 	return 0;
 }
 
+static int gen8_huc_wait_verified(struct intel_huc *huc)
+{
+	struct drm_i915_private *i915 = huc_to_i915(huc);
+	u32 status;
+	int ret;
+
+	ret = __intel_wait_for_register(i915,
+					HUC_STATUS2,
+					HUC_FW_VERIFIED,
+					HUC_FW_VERIFIED,
+					2, 50, &status);
+	if (ret)
+		DRM_ERROR("HuC: status %#x\n", status);
+	return ret;
+}
+
+static int gen11_huc_wait_verified(struct intel_huc *huc)
+{
+	struct drm_i915_private *i915 = huc_to_i915(huc);
+	int ret;
+
+	ret = __intel_wait_for_register(i915,
+					GEN11_HUC_KERNEL_LOAD_INFO,
+					HUC_LOAD_SUCCESSFUL,
+					HUC_LOAD_SUCCESSFUL,
+					2, 50, NULL);
+	return ret;
+}
+
+static int huc_wait_verified(struct intel_huc *huc)
+{
+	struct drm_i915_private *i915 = huc_to_i915(huc);
+	int ret;
+
+	if (INTEL_GEN(i915) >= 11)
+		ret = gen11_huc_wait_verified(huc);
+	else
+		ret = gen8_huc_wait_verified(huc);
+	return ret;
+}
+
 /**
  * intel_huc_auth() - Authenticate HuC uCode
  * @huc: intel_huc structure
@@ -56,7 +97,6 @@ int intel_huc_auth(struct intel_huc *huc)
 	struct drm_i915_private *i915 = huc_to_i915(huc);
 	struct intel_guc *guc = &i915->guc;
 	struct i915_vma *vma;
-	u32 status;
 	int ret;
 
 	if (huc->fw.load_status != INTEL_UC_FIRMWARE_SUCCESS)
@@ -79,13 +119,9 @@ int intel_huc_auth(struct intel_huc *huc)
 	}
 
 	/* Check authentication status, it should be done by now */
-	ret = __intel_wait_for_register(i915,
-					HUC_STATUS2,
-					HUC_FW_VERIFIED,
-					HUC_FW_VERIFIED,
-					2, 50, &status);
+	ret = huc_wait_verified(huc);
 	if (ret) {
-		DRM_ERROR("HuC: Firmware not verified %#x\n", status);
+		DRM_ERROR("HuC: Firmware not verified %d\n", ret);
 		goto fail_unpin;
 	}
 
@@ -120,7 +156,13 @@ int intel_huc_check_status(struct intel_huc *huc)
 		return -ENODEV;
 
 	intel_runtime_pm_get(dev_priv);
-	status = I915_READ(HUC_STATUS2) & HUC_FW_VERIFIED;
+
+	if (INTEL_GEN(dev_priv) >= 11)
+		status = I915_READ(GEN11_HUC_KERNEL_LOAD_INFO) &
+			 HUC_LOAD_SUCCESSFUL;
+	else
+		status = I915_READ(HUC_STATUS2) & HUC_FW_VERIFIED;
+
 	intel_runtime_pm_put(dev_priv);
 
 	return status;
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 49+ messages in thread
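
The register/bit selection in the patch above can be summarised in a short sketch. The names struct huc_status_reg, huc_pick_status_reg() and huc_is_authenticated() are hypothetical, and the register value is passed in directly instead of being read with I915_READ(). Note that the sketch normalises the result to a bool, whereas intel_huc_check_status() in the patch returns the masked register value (bit 7 before Gen11, bit 0 on Gen11).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Offsets and bits from intel_guc_reg.h as changed by this series. */
#define HUC_STATUS2_OFFSET                0xd3b0u
#define HUC_FW_VERIFIED                   (1u << 7)
#define GEN11_HUC_KERNEL_LOAD_INFO_OFFSET 0xc1dcu
#define HUC_LOAD_SUCCESSFUL               (1u << 0)

struct huc_status_reg {
	uint32_t offset; /* MMIO offset to read */
	uint32_t mask;   /* bit that signals successful authentication */
};

struct huc_status_reg huc_pick_status_reg(unsigned int gen)
{
	struct huc_status_reg reg;

	if (gen >= 11) {
		reg.offset = GEN11_HUC_KERNEL_LOAD_INFO_OFFSET;
		reg.mask = HUC_LOAD_SUCCESSFUL;
	} else {
		reg.offset = HUC_STATUS2_OFFSET;
		reg.mask = HUC_FW_VERIFIED;
	}
	return reg;
}

/* reg_value is what a read of the selected register returned. */
bool huc_is_authenticated(unsigned int gen, uint32_t reg_value)
{
	struct huc_status_reg reg = huc_pick_status_reg(gen);

	assert(reg.mask != 0); /* every generation must define a status bit */
	return (reg_value & reg.mask) != 0;
}
```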

* [PATCH 20/21] drm/i915/guc: Enable command transport buffers for Gen11
  2018-08-29 19:10 [PATCH 00/21] New GuC ABI Michal Wajdeczko
                   ` (9 preceding siblings ...)
  2018-08-29 19:18 ` [PATCH 10/21] drm/i915: Add hooks for (per-engine) context allocation/update/free Michal Wajdeczko
@ 2018-08-29 19:19 ` Michal Wajdeczko
  2018-08-29 19:19   ` [PATCH 21/21] HAX Don't enable GuC submission on pre-Gen11 even if forced Michal Wajdeczko
  10 siblings, 1 reply; 49+ messages in thread
From: Michal Wajdeczko @ 2018-08-29 19:19 UTC (permalink / raw)
  To: intel-gfx

Gen11 GuC firmware can handle commands over CT buffers.

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Michel Thierry <michel.thierry@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
---
 drivers/gpu/drm/i915/i915_pci.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/i915/i915_pci.c b/drivers/gpu/drm/i915/i915_pci.c
index d6f7b9f..6bc4bc1 100644
--- a/drivers/gpu/drm/i915/i915_pci.c
+++ b/drivers/gpu/drm/i915/i915_pci.c
@@ -600,6 +600,7 @@
 	GEN10_FEATURES, \
 	GEN(11), \
 	.ddb_size = 2048, \
+	.has_guc_ct = 1, \
 	.has_logical_ring_elsq = 1
 
 static const struct intel_device_info intel_icelake_11_info = {
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 49+ messages in thread

* [PATCH 21/21] HAX Don't enable GuC submission on pre-Gen11 even if forced
  2018-08-29 19:19 ` [PATCH 20/21] drm/i915/guc: Enable command transport buffers " Michal Wajdeczko
@ 2018-08-29 19:19   ` Michal Wajdeczko
  0 siblings, 0 replies; 49+ messages in thread
From: Michal Wajdeczko @ 2018-08-29 19:19 UTC (permalink / raw)
  To: intel-gfx

This is just to fool CI skl|kbl-guc machines

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
---
 drivers/gpu/drm/i915/intel_uc.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/gpu/drm/i915/intel_uc.c b/drivers/gpu/drm/i915/intel_uc.c
index 185b29b..95697c0 100644
--- a/drivers/gpu/drm/i915/intel_uc.c
+++ b/drivers/gpu/drm/i915/intel_uc.c
@@ -122,6 +122,8 @@ static void sanitize_options_early(struct drm_i915_private *i915)
 		DRM_WARN("Incompatible option detected: %s=%d, %s!\n",
 			 "enable_guc", i915_modparams.enable_guc,
 			 "submission not supported");
+		i915_modparams.enable_guc &= ~ENABLE_GUC_SUBMISSION;
+		GEM_BUG_ON(intel_uc_is_using_guc_submission());
 	}
 
 	/* Verify GuC firmware availability */
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 49+ messages in thread

* Re: [PATCH 06/21] drm/i915/guc: Use guc_class instead of engine_class in fw interface
  2018-08-29 19:10 ` [PATCH 06/21] drm/i915/guc: Use guc_class instead of engine_class in fw interface Michal Wajdeczko
@ 2018-08-29 19:58   ` Michel Thierry
  2018-08-30  0:16     ` Lionel Landwerlin
  2018-09-06  8:55   ` Joonas Lahtinen
  1 sibling, 1 reply; 49+ messages in thread
From: Michel Thierry @ 2018-08-29 19:58 UTC (permalink / raw)
  To: Michal Wajdeczko, intel-gfx; +Cc: Lucas De Marchi, Rodrigo Vivi

+Lionel
(please see below as this touches the lrca format & relates to OA 
reporting too)

On 8/29/2018 12:10 PM, Michal Wajdeczko wrote:
> Until now the GuC and HW engine classes have been the same, which allowed
> us to use them interchangeably. But it is better to start doing the
> right thing and use the GuC definitions for the firmware interface.
> 
> We also keep the same class id in the ctx descriptor to be able to have
> the same values in the driver and firmware logs.
> 
> Signed-off-by: Michel Thierry <michel.thierry@intel.com>
> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
> Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
> Cc: Michel Thierry <michel.thierry@intel.com>
> Cc: Lucas De Marchi <lucas.demarchi@intel.com>
> Cc: Tomasz Lis <tomasz.lis@intel.com>
> ---
>   drivers/gpu/drm/i915/intel_engine_cs.c  | 13 +++++++++++++
>   drivers/gpu/drm/i915/intel_guc_fwif.h   |  7 +++++++
>   drivers/gpu/drm/i915/intel_lrc.c        | 10 +++++++++-
>   drivers/gpu/drm/i915/intel_ringbuffer.h |  2 ++
>   4 files changed, 31 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/i915/intel_engine_cs.c b/drivers/gpu/drm/i915/intel_engine_cs.c
> index 1a34e8f..bc81354 100644
> --- a/drivers/gpu/drm/i915/intel_engine_cs.c
> +++ b/drivers/gpu/drm/i915/intel_engine_cs.c
> @@ -85,6 +85,7 @@ struct engine_info {
>   	unsigned int hw_id;
>   	unsigned int uabi_id;
>   	u8 class;
> +	u8 guc_class;
>   	u8 instance;
>   	/* mmio bases table *must* be sorted in reverse gen order */
>   	struct engine_mmio_base {
> @@ -98,6 +99,7 @@ struct engine_info {
>   		.hw_id = RCS_HW,
>   		.uabi_id = I915_EXEC_RENDER,
>   		.class = RENDER_CLASS,
> +		.guc_class = GUC_RENDER_CLASS,
>   		.instance = 0,
>   		.mmio_bases = {
>   			{ .gen = 1, .base = RENDER_RING_BASE }
> @@ -107,6 +109,7 @@ struct engine_info {
>   		.hw_id = BCS_HW,
>   		.uabi_id = I915_EXEC_BLT,
>   		.class = COPY_ENGINE_CLASS,
> +		.guc_class = GUC_BLITTER_CLASS,
>   		.instance = 0,
>   		.mmio_bases = {
>   			{ .gen = 6, .base = BLT_RING_BASE }
> @@ -116,6 +119,7 @@ struct engine_info {
>   		.hw_id = VCS_HW,
>   		.uabi_id = I915_EXEC_BSD,
>   		.class = VIDEO_DECODE_CLASS,
> +		.guc_class = GUC_VIDEO_CLASS,
>   		.instance = 0,
>   		.mmio_bases = {
>   			{ .gen = 11, .base = GEN11_BSD_RING_BASE },
> @@ -127,6 +131,7 @@ struct engine_info {
>   		.hw_id = VCS2_HW,
>   		.uabi_id = I915_EXEC_BSD,
>   		.class = VIDEO_DECODE_CLASS,
> +		.guc_class = GUC_VIDEO_CLASS,
>   		.instance = 1,
>   		.mmio_bases = {
>   			{ .gen = 11, .base = GEN11_BSD2_RING_BASE },
> @@ -137,6 +142,7 @@ struct engine_info {
>   		.hw_id = VCS3_HW,
>   		.uabi_id = I915_EXEC_BSD,
>   		.class = VIDEO_DECODE_CLASS,
> +		.guc_class = GUC_VIDEO_CLASS,
>   		.instance = 2,
>   		.mmio_bases = {
>   			{ .gen = 11, .base = GEN11_BSD3_RING_BASE }
> @@ -146,6 +152,7 @@ struct engine_info {
>   		.hw_id = VCS4_HW,
>   		.uabi_id = I915_EXEC_BSD,
>   		.class = VIDEO_DECODE_CLASS,
> +		.guc_class = GUC_VIDEO_CLASS,
>   		.instance = 3,
>   		.mmio_bases = {
>   			{ .gen = 11, .base = GEN11_BSD4_RING_BASE }
> @@ -155,6 +162,7 @@ struct engine_info {
>   		.hw_id = VECS_HW,
>   		.uabi_id = I915_EXEC_VEBOX,
>   		.class = VIDEO_ENHANCEMENT_CLASS,
> +		.guc_class = GUC_VIDEOENHANCE_CLASS,
>   		.instance = 0,
>   		.mmio_bases = {
>   			{ .gen = 11, .base = GEN11_VEBOX_RING_BASE },
> @@ -165,6 +173,7 @@ struct engine_info {
>   		.hw_id = VECS2_HW,
>   		.uabi_id = I915_EXEC_VEBOX,
>   		.class = VIDEO_ENHANCEMENT_CLASS,
> +		.guc_class = GUC_VIDEOENHANCE_CLASS,
>   		.instance = 1,
>   		.mmio_bases = {
>   			{ .gen = 11, .base = GEN11_VEBOX2_RING_BASE }
> @@ -276,6 +285,9 @@ static void __sprint_engine_name(char *name, const struct engine_info *info)
>   	if (GEM_WARN_ON(info->class > MAX_ENGINE_CLASS))
>   		return -EINVAL;
>   
> +	if (GEM_WARN_ON(info->guc_class >= GUC_MAX_ENGINE_CLASSES))
> +		return -EINVAL;
> +
>   	if (GEM_WARN_ON(info->instance > MAX_ENGINE_INSTANCE))
>   		return -EINVAL;
>   
> @@ -291,6 +303,7 @@ static void __sprint_engine_name(char *name, const struct engine_info *info)
>   	engine->i915 = dev_priv;
>   	__sprint_engine_name(engine->name, info);
>   	engine->hw_id = engine->guc_id = info->hw_id;
> +	engine->guc_class = info->guc_class;
>   	engine->mmio_base = __engine_mmio_base(dev_priv, info->mmio_bases);
>   	engine->class = info->class;
>   	engine->instance = info->instance;
> diff --git a/drivers/gpu/drm/i915/intel_guc_fwif.h b/drivers/gpu/drm/i915/intel_guc_fwif.h
> index 963da91..5b7a05b 100644
> --- a/drivers/gpu/drm/i915/intel_guc_fwif.h
> +++ b/drivers/gpu/drm/i915/intel_guc_fwif.h
> @@ -39,6 +39,13 @@
>   #define GUC_VIDEO_ENGINE2		4
>   #define GUC_MAX_ENGINES_NUM		(GUC_VIDEO_ENGINE2 + 1)
>   
> +#define GUC_RENDER_CLASS	0
> +#define GUC_VIDEO_CLASS		1
> +#define GUC_VIDEOENHANCE_CLASS	2
> +#define GUC_BLITTER_CLASS	3
> +#define GUC_RESERVED_CLASS	4
> +#define GUC_MAX_ENGINE_CLASSES	(GUC_RESERVED_CLASS + 1)
> +
>   /* Work queue item header definitions */
>   #define WQ_STATUS_ACTIVE		1
>   #define WQ_STATUS_SUSPENDED		2
> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
> index f8ceb9c..f4b9972 100644
> --- a/drivers/gpu/drm/i915/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/intel_lrc.c
> @@ -249,7 +249,15 @@ static inline bool need_preempt(const struct intel_engine_cs *engine,
>   
>   		/* TODO: decide what to do with SW counter (bits 55-60) */
>   
> -		desc |= (u64)engine->class << GEN11_ENGINE_CLASS_SHIFT;
> +		/*
> +		 * Although GuC will never see this upper part as it fills
> +		 * its own descriptor, using the guc_class here will help keep
> +		 * the i915 and firmware logs in sync.
> +		 */
> +		if (HAS_GUC_SCHED(ctx->i915))
> +			desc |= (u64)engine->guc_class << GEN11_ENGINE_CLASS_SHIFT;
> +		else
> +			desc |= (u64)engine->class << GEN11_ENGINE_CLASS_SHIFT;
>   								/* bits 61-63 */

OA also uses this upper part (see oa_get_render_ctx_id), so it's 
something to be aware of.

My opinion is that it's useful to have the lrca matching the firmware
logs, but OA should account for this change, as it receives what the fw
sent to the hw. Which one is more important is for others to decide
(plus it only becomes a problem when engine-class and guc-class start to
deviate).

Acked-by: Michel Thierry <michel.thierry@intel.com>

-Michel


>   	} else {
>   		GEM_BUG_ON(ctx->hw_id >= BIT(GEN8_CTX_ID_WIDTH));
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
> index 3f6920d..f47009f 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.h
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
> @@ -350,7 +350,9 @@ struct intel_engine_cs {
>   
>   	enum intel_engine_id id;
>   	unsigned int hw_id;
> +
>   	unsigned int guc_id;
> +	u8 guc_class;
>   
>   	u8 uabi_id;
>   	u8 uabi_class;
> 
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


* Re: [PATCH 01/21] drm/i915/guc: Update GuC power domain states
  2018-08-29 19:10 ` [PATCH 01/21] drm/i915/guc: Update GuC power domain states Michal Wajdeczko
@ 2018-08-29 20:57   ` Daniele Ceraolo Spurio
  2018-08-29 22:43     ` Michal Wajdeczko
  0 siblings, 1 reply; 49+ messages in thread
From: Daniele Ceraolo Spurio @ 2018-08-29 20:57 UTC (permalink / raw)
  To: intel-gfx



On 29/08/18 12:10, Michal Wajdeczko wrote:
> We should update GuC power domain states also when GuC submission
> is disabled, otherwise GuC might complain or ignore our requests.
> This seems to be required for all currently released GuC firmwares.
> 
> Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
> Cc: John Spotswood <john.a.spotswood@intel.com>
> Cc: Anusha Srivatsa <anusha.srivatsa@intel.com>
> Cc: Tomasz Lis <tomasz.lis@intel.com>
> ---
>   drivers/gpu/drm/i915/intel_uc.c | 4 ++++
>   1 file changed, 4 insertions(+)
> 
> diff --git a/drivers/gpu/drm/i915/intel_uc.c b/drivers/gpu/drm/i915/intel_uc.c
> index 7c95697..7a3a4ca 100644
> --- a/drivers/gpu/drm/i915/intel_uc.c
> +++ b/drivers/gpu/drm/i915/intel_uc.c
> @@ -401,6 +401,10 @@ int intel_uc_init_hw(struct drm_i915_private *i915)
>   		ret = intel_guc_submission_enable(guc);
>   		if (ret)
>   			goto err_communication;
> +	} else {
> +		ret = intel_guc_sample_forcewake(guc);
> +		if (ret)
> +			goto err_communication;
>   	}

We can just pull this out of intel_guc_submission_enable and call it 
unconditionally instead of having the if here. Even better, we should 
only call this when we write to GEN9_PG_ENABLE and pass the value we're 
writing instead of re-calculating it inside the function, but that's a 
job for another time I guess.
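
Roughly what I have in mind, as a standalone sketch with stand-in types
rather than the real driver structures (all fake_* names are
hypothetical, and calling the forcewake step before the submission setup
is an assumption of the sketch, not something the FW is known to need):

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in for the GuC state; only what the sketch needs. */
struct fake_guc {
	bool submission_wanted;
	int forcewake_calls;
	int submission_calls;
};

/* Stand-in for intel_guc_sample_forcewake(). */
static int fake_sample_forcewake(struct fake_guc *guc)
{
	guc->forcewake_calls++;
	return 0;
}

/* Stand-in for intel_guc_submission_enable(), minus the forcewake call. */
static int fake_submission_enable(struct fake_guc *guc)
{
	guc->submission_calls++;
	return 0;
}

static int fake_uc_init_hw(struct fake_guc *guc)
{
	int ret;

	/* Called on every path, so no else branch is needed. */
	ret = fake_sample_forcewake(guc);
	if (ret)
		return ret;

	if (guc->submission_wanted) {
		ret = fake_submission_enable(guc);
		if (ret)
			return ret;
	}

	return 0;
}
```

The power domain update then happens exactly once whether or not
submission is enabled.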

Daniele

>   
>   	dev_info(i915->drm.dev, "GuC firmware version %u.%u\n",
> 

* Re: [PATCH 05/21] drm/i915/guc: Update sample-forcewake command
  2018-08-29 19:10 ` [PATCH 05/21] drm/i915/guc: Update sample-forcewake command Michal Wajdeczko
@ 2018-08-29 21:52   ` Daniele Ceraolo Spurio
  2018-08-29 22:31     ` Michal Wajdeczko
  0 siblings, 1 reply; 49+ messages in thread
From: Daniele Ceraolo Spurio @ 2018-08-29 21:52 UTC (permalink / raw)
  To: Michal Wajdeczko, intel-gfx



On 29/08/18 12:10, Michal Wajdeczko wrote:
> Action ID of this command has been changed in GuC firmware.
> 

the commit message of patch 1 says we need to use this command even if 
GuC submission is disabled, which is still a supported config on gen9. 
However, won't changing the value make the H2G fail, since gen9 FW still 
uses the old define?

Daniele

> Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
> Cc: John Spotswood <john.a.spotswood@intel.com>
> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
> Cc: Michel Thierry <michel.thierry@intel.com>
> ---
>   drivers/gpu/drm/i915/intel_guc_fwif.h | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/i915/intel_guc_fwif.h b/drivers/gpu/drm/i915/intel_guc_fwif.h
> index 7070e36..963da91 100644
> --- a/drivers/gpu/drm/i915/intel_guc_fwif.h
> +++ b/drivers/gpu/drm/i915/intel_guc_fwif.h
> @@ -659,7 +659,6 @@ enum intel_guc_action {
>   	INTEL_GUC_ACTION_DEFAULT = 0x0,
>   	INTEL_GUC_ACTION_REQUEST_PREEMPTION = 0x2,
>   	INTEL_GUC_ACTION_REQUEST_ENGINE_RESET = 0x3,
> -	INTEL_GUC_ACTION_SAMPLE_FORCEWAKE = 0x6,
>   	INTEL_GUC_ACTION_ALLOCATE_DOORBELL = 0x10,
>   	INTEL_GUC_ACTION_DEALLOCATE_DOORBELL = 0x20,
>   	INTEL_GUC_ACTION_LOG_BUFFER_FILE_FLUSH_COMPLETE = 0x30,
> @@ -667,6 +666,7 @@ enum intel_guc_action {
>   	INTEL_GUC_ACTION_ENTER_S_STATE = 0x501,
>   	INTEL_GUC_ACTION_EXIT_S_STATE = 0x502,
>   	INTEL_GUC_ACTION_SLPC_REQUEST = 0x3003,
> +	INTEL_GUC_ACTION_SAMPLE_FORCEWAKE = 0x3005,
>   	INTEL_GUC_ACTION_AUTHENTICATE_HUC = 0x4000,
>   	INTEL_GUC_ACTION_REGISTER_COMMAND_TRANSPORT_BUFFER = 0x4505,
>   	INTEL_GUC_ACTION_DEREGISTER_COMMAND_TRANSPORT_BUFFER = 0x4506,
> 

* Re: [PATCH 05/21] drm/i915/guc: Update sample-forcewake command
  2018-08-29 21:52   ` Daniele Ceraolo Spurio
@ 2018-08-29 22:31     ` Michal Wajdeczko
  0 siblings, 0 replies; 49+ messages in thread
From: Michal Wajdeczko @ 2018-08-29 22:31 UTC (permalink / raw)
  To: intel-gfx, Daniele Ceraolo Spurio

On Wed, 29 Aug 2018 23:52:26 +0200, Daniele Ceraolo Spurio  
<daniele.ceraolospurio@intel.com> wrote:

>
>
> On 29/08/18 12:10, Michal Wajdeczko wrote:
>> Action ID of this command has been changed in GuC firmware.
>>
>
> the commit message of patch 1 says we need to use this command even if  
> GuC submission is disabled, which is still a supported config on gen9.  
> However, won't changing the value make the H2G fail, since gen9 FW still  
> uses the old define?
>

Good catch! I was testing patch 1 separately, as it was added as a fix
for the issues exposed by the series when run with enable_guc=2.
I'll keep the old command ID, but with a GEN prefix, and use that in the
function.
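
Something along these lines, as a standalone sketch only: the two IDs
are the old (0x6) and new (0x3005) values from the patch, while the
define names, the helper and the "gen >= 11 uses the new ID" cut-off are
assumptions for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names; the values are the old and new action IDs. */
#define GEN9_GUC_ACTION_SAMPLE_FORCEWAKE	0x6
#define GEN11_GUC_ACTION_SAMPLE_FORCEWAKE	0x3005

/*
 * Hypothetical helper: pick the action ID understood by the firmware
 * for the given hardware generation.
 */
static uint32_t sample_forcewake_action(unsigned int gen)
{
	if (gen >= 11)
		return GEN11_GUC_ACTION_SAMPLE_FORCEWAKE;

	return GEN9_GUC_ACTION_SAMPLE_FORCEWAKE;
}
```

That way gen9 firmware keeps receiving the ID it already understands.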

Thanks,
Michal

* Re: [PATCH 01/21] drm/i915/guc: Update GuC power domain states
  2018-08-29 20:57   ` Daniele Ceraolo Spurio
@ 2018-08-29 22:43     ` Michal Wajdeczko
  0 siblings, 0 replies; 49+ messages in thread
From: Michal Wajdeczko @ 2018-08-29 22:43 UTC (permalink / raw)
  To: intel-gfx, Daniele Ceraolo Spurio

On Wed, 29 Aug 2018 22:57:54 +0200, Daniele Ceraolo Spurio  
<daniele.ceraolospurio@intel.com> wrote:

>
>
> On 29/08/18 12:10, Michal Wajdeczko wrote:
>> We should update GuC power domain states also when GuC submission
>> is disabled, otherwise GuC might complain or ignore our requests.
>> This seems to be required for all currently released GuC firmwares.
>>  Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
>> Cc: John Spotswood <john.a.spotswood@intel.com>
>> Cc: Anusha Srivatsa <anusha.srivatsa@intel.com>
>> Cc: Tomasz Lis <tomasz.lis@intel.com>
>> ---
>>   drivers/gpu/drm/i915/intel_uc.c | 4 ++++
>>   1 file changed, 4 insertions(+)
>>  diff --git a/drivers/gpu/drm/i915/intel_uc.c  
>> b/drivers/gpu/drm/i915/intel_uc.c
>> index 7c95697..7a3a4ca 100644
>> --- a/drivers/gpu/drm/i915/intel_uc.c
>> +++ b/drivers/gpu/drm/i915/intel_uc.c
>> @@ -401,6 +401,10 @@ int intel_uc_init_hw(struct drm_i915_private *i915)
>>   		ret = intel_guc_submission_enable(guc);
>>   		if (ret)
>>   			goto err_communication;
>> +	} else {
>> +		ret = intel_guc_sample_forcewake(guc);
>> +		if (ret)
>> +			goto err_communication;
>>   	}
>
> We can just pull this out of intel_guc_submission_enable and call it  
> unconditionally instead of having the if here.

I was considering this, but decided to call it separately as it is very
likely that it will no longer be required for newer fw releases, so it
will be easier to revert.

Michal

* Re: [PATCH 11/21] drm/i915/guc: New GuC stage descriptors
  2018-08-29 19:18   ` [PATCH 11/21] drm/i915/guc: New GuC stage descriptors Michal Wajdeczko
@ 2018-08-29 23:14     ` Daniele Ceraolo Spurio
  2018-10-12 18:25     ` [RFC] " Daniele Ceraolo Spurio
  1 sibling, 0 replies; 49+ messages in thread
From: Daniele Ceraolo Spurio @ 2018-08-29 23:14 UTC (permalink / raw)
  To: Michal Wajdeczko, intel-gfx; +Cc: Oscar Mateo



On 29/08/18 12:18, Michal Wajdeczko wrote:
> New GuC stage descriptor stores information about all possible HW contexts
> that use it. The idea is that every direct-submission GuC client gets one
> SW Context ID and every HW context created by that client gets one SW
> Counter (up to 64 entries). The correct SW Context ID and SW Counter now
> get passed on every work queue item.
> 
> Signed-off-by: Michel Thierry <michel.thierry@intel.com>
> Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
> Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
> Cc: Michel Thierry <michel.thierry@intel.com>
> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> Cc: Tomasz Lis <tomasz.lis@intel.com>
> Cc: Michal Winiarski <michal.winiarski@intel.com>
> ---
>   drivers/gpu/drm/i915/i915_debugfs.c         |   9 +-
>   drivers/gpu/drm/i915/i915_utils.h           |  12 ++
>   drivers/gpu/drm/i915/intel_guc_fwif.h       |  67 ++++-----
>   drivers/gpu/drm/i915/intel_guc_submission.c | 205 +++++++++++++++++++---------
>   4 files changed, 190 insertions(+), 103 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
> index a5265c2..ad09970 100644
> --- a/drivers/gpu/drm/i915/i915_debugfs.c
> +++ b/drivers/gpu/drm/i915/i915_debugfs.c
> @@ -2455,11 +2455,10 @@ static int i915_guc_stage_pool(struct seq_file *m, void *data)
>   
>   		seq_printf(m, "GuC stage descriptor %u:\n", index);
>   		seq_printf(m, "\tIndex: %u\n", desc->stage_id);
> +		seq_printf(m, "\tProxy Index: %u\n", desc->proxy_id);
>   		seq_printf(m, "\tAttribute: 0x%x\n", desc->attribute);
>   		seq_printf(m, "\tPriority: %d\n", desc->priority);
>   		seq_printf(m, "\tDoorbell id: %d\n", desc->db_id);
> -		seq_printf(m, "\tEngines used: 0x%x\n",
> -			   desc->engines_used);
>   		seq_printf(m, "\tDoorbell trigger phy: 0x%llx, cpu: 0x%llx, uK: 0x%x\n",
>   			   desc->db_trigger_phy,
>   			   desc->db_trigger_cpu,
> @@ -2471,10 +2470,10 @@ static int i915_guc_stage_pool(struct seq_file *m, void *data)
>   		seq_putc(m, '\n');
>   
>   		for_each_engine_masked(engine, dev_priv, client->engines, tmp) {
> -			u32 guc_engine_id = engine->guc_id;
> +			u32 class = GUC_ID_TO_ENGINE_CLASS(engine->guc_id);
> +			u32 inst = GUC_ID_TO_ENGINE_INSTANCE(engine->guc_id);
>   			struct guc_execlist_context *lrc =
> -						&desc->lrc[guc_engine_id];
> -
> +						&desc->lrc[class][inst];
>   			seq_printf(m, "\t%s LRC:\n", engine->name);
>   			seq_printf(m, "\t\tContext desc: 0x%x\n",
>   				   lrc->context_desc);
> diff --git a/drivers/gpu/drm/i915/i915_utils.h b/drivers/gpu/drm/i915/i915_utils.h
> index 395dd25..7ec1fe4 100644
> --- a/drivers/gpu/drm/i915/i915_utils.h
> +++ b/drivers/gpu/drm/i915/i915_utils.h
> @@ -118,6 +118,18 @@ static inline u64 ptr_to_u64(const void *ptr)
>   	__idx;								\
>   })
>   
> +#define bitmap32_test_bit(bitmap, bit) ({ \
> +	(bitmap)[(bit) / 32] & (1 << ((bit) % 32)); \
> +})
> +
> +#define bitmap32_set_bit(bitmap, bit) ({ \
> +	(bitmap)[(bit) / 32] |= (1 << ((bit) % 32)); \
> +})
> +
> +#define bitmap32_clear_bit(bitmap, bit) ({ \
> +	(bitmap)[(bit) / 32] &= ~(1 << ((bit) % 32)); \
> +})
> +
>   #include <linux/list.h>
>   
>   static inline int list_is_first(const struct list_head *list,
> diff --git a/drivers/gpu/drm/i915/intel_guc_fwif.h b/drivers/gpu/drm/i915/intel_guc_fwif.h
> index 227ab32..0a897cd 100644
> --- a/drivers/gpu/drm/i915/intel_guc_fwif.h
> +++ b/drivers/gpu/drm/i915/intel_guc_fwif.h
> @@ -29,9 +29,11 @@
>   #define GUC_CLIENT_PRIORITY_NORMAL	3
>   #define GUC_CLIENT_PRIORITY_NUM		4
>   
> -#define GUC_MAX_STAGE_DESCRIPTORS	1024
> +#define GUC_MAX_STAGE_DESCRIPTORS	2032
>   #define	GUC_INVALID_STAGE_ID		GUC_MAX_STAGE_DESCRIPTORS
>   
> +#define GUC_MAX_LRC_PER_CLASS		64
> +
>   #define GUC_RENDER_ENGINE		0
>   #define GUC_VIDEO_ENGINE		1
>   #define GUC_BLITTER_ENGINE		2
> @@ -71,9 +73,12 @@
>   #define GUC_DOORBELL_DISABLED		0
>   
>   #define GUC_STAGE_DESC_ATTR_ACTIVE	BIT(0)
> -#define GUC_STAGE_DESC_ATTR_PENDING_DB	BIT(1)
> -#define GUC_STAGE_DESC_ATTR_KERNEL	BIT(2)
> -#define GUC_STAGE_DESC_ATTR_PREEMPT	BIT(3)
> +#define GUC_STAGE_DESC_ATTR_TYPE_SHIFT	1
> +#define GUC_STAGE_DESC_ATTR_PRINCIPAL	(0x0 << GUC_STAGE_DESC_ATTR_TYPE_SHIFT)
> +#define GUC_STAGE_DESC_ATTR_PROXY	(0x1 << GUC_STAGE_DESC_ATTR_TYPE_SHIFT)
> +#define GUC_STAGE_DESC_ATTR_REAL	(0x2 << GUC_STAGE_DESC_ATTR_TYPE_SHIFT)
> +#define GUC_STAGE_DESC_ATTR_TYPE_MASK	(0x3 << GUC_STAGE_DESC_ATTR_TYPE_SHIFT)
> +#define GUC_STAGE_DESC_ATTR_KERNEL	(1 << 3)
>   #define GUC_STAGE_DESC_ATTR_RESET	BIT(4)
>   #define GUC_STAGE_DESC_ATTR_WQLOCKED	BIT(5)
>   #define GUC_STAGE_DESC_ATTR_PCH		BIT(6)
> @@ -282,9 +287,10 @@ struct guc_process_desc {
>   	u64 wq_base_addr;
>   	u32 wq_size_bytes;
>   	u32 wq_status;
> -	u32 engine_presence;
>   	u32 priority;
> -	u32 reserved[30];
> +	u32 token;
> +	u32 queue_engine_error;
> +	u32 reserved[23];
>   } __packed;
>   
>   /* engine id and context id is packed into guc_execlist_context.context_id*/
> @@ -295,16 +301,19 @@ struct guc_process_desc {
>   struct guc_execlist_context {
>   	u32 context_desc;
>   	u32 context_id;
> -	u32 ring_status;
> +	u32 reserved0;
>   	u32 ring_lrca;
>   	u32 ring_begin;
>   	u32 ring_end;
>   	u32 ring_next_free_location;
>   	u32 ring_current_tail_pointer_value;
> -	u8 engine_state_submit_value;
> -	u8 engine_state_wait_value;
> -	u16 pagefault_count;
> -	u16 engine_submit_queue_count;
> +	u32 engine_state_wait_value;
> +	u32 state_reserved;
> +	u32 is_present_in_sq;
> +	u32 sync_value;
> +	u32 sync_addr;
> +	u32 slpc_hints;
> +	u32 reserved1[4];
>   } __packed;
>   
>   /*
> @@ -317,36 +326,30 @@ struct guc_execlist_context {
>    * with the GuC, being allocated before the GuC is loaded with its firmware.
>    */
>   struct guc_stage_desc {
> -	u32 sched_common_area;
> +	u64 desc_private;
>   	u32 stage_id;
> -	u32 pas_id;
> -	u8 engines_used;
> +	u32 proxy_id;
>   	u64 db_trigger_cpu;
>   	u32 db_trigger_uk;
>   	u64 db_trigger_phy;
> -	u16 db_id;
> -
> -	struct guc_execlist_context lrc[GUC_MAX_ENGINES_NUM];
> -
> -	u8 attribute;
> -
> -	u32 priority;
> -
> +	u32 db_id;
> +	struct guc_execlist_context lrc[GUC_MAX_ENGINE_CLASSES][GUC_MAX_LRC_PER_CLASS];
> +	u32 lrc_bitmap[GUC_MAX_ENGINE_CLASSES][3];
> +	u32 lrc_count;
> +	u32 max_lrc_per_class;
> +	u32 attribute; /* GUC_STAGE_DESC_ATTR_xxx */
> +	u32 priority; /* GUC_CLIENT_PRIORITY_xxx */
>   	u32 wq_sampled_tail_offset;
>   	u32 wq_total_submit_enqueues;
> -
>   	u32 process_desc;
>   	u32 wq_addr;
>   	u32 wq_size;
> -
> -	u32 engine_presence;
> -
> -	u8 engine_suspended;
> -
> -	u8 reserved0[3];
> -	u64 reserved1[1];
> -
> -	u64 desc_private;
> +	u32 feature0;
> +	u32 feature1;
> +	u32 feature2;
> +	u32 queue_engine_error;
> +	u32 reserved[2];
> +	u64 reserved3[12];
>   } __packed;
>   
>   /**
> diff --git a/drivers/gpu/drm/i915/intel_guc_submission.c b/drivers/gpu/drm/i915/intel_guc_submission.c
> index 07b9d31..54655dc 100644
> --- a/drivers/gpu/drm/i915/intel_guc_submission.c
> +++ b/drivers/gpu/drm/i915/intel_guc_submission.c
> @@ -46,17 +46,22 @@
>    * that contains all required pages for these elements).
>    *
>    * GuC stage descriptor:
> - * During initialization, the driver allocates a static pool of 1024 such
> - * descriptors, and shares them with the GuC.
> - * Currently, there exists a 1:1 mapping between a intel_guc_client and a
> - * guc_stage_desc (via the client's stage_id), so effectively only one
> - * gets used. This stage descriptor lets the GuC know about the doorbell,
> - * workqueue and process descriptor. Theoretically, it also lets the GuC
> - * know about our HW contexts (context ID, etc...), but we actually
> - * employ a kind of submission where the GuC uses the LRCA sent via the work
> - * item instead (the single guc_stage_desc associated to execbuf client
> - * contains information about the default kernel context only, but this is
> - * essentially unused). This is called a "proxy" submission.
> + * During initialization, the driver allocates a static pool of descriptors
> + * and shares them with the GuC. This stage descriptor lets the GuC know about
> + * the doorbell, workqueue and process descriptor, additionally it stores
> + * information about all possible HW contexts that use it (64 x number of
> + * engine classes of guc_execlist_context structs).
> + *
> + * The idea is that every direct-submission GuC client gets one SW Context ID
> + * and every HW context created by that client gets one SW Counter. The "SW
> + * Context ID" and "SW Counter" to use now get passed on every work queue item.
> + *
> + * But we don't have direct submission yet: does that mean we are limited to 64
> + * contexts in total (one client)? Not really: we can use extra GuC context
> + * descriptors to store more HW contexts. They are special in that they don't
> + * have their own work queue, doorbell or process descriptor. Instead, these
> + * "principal" GuC context descriptors use the one that belongs to the client
> + * as a "proxy" for submission (a generalization of the old proxy submission).

AFAICS, with the implementation in this patch, a stage_desc can be both
"proxy" and "principal". This doesn't match the expectation of the FW; I
have confirmed with GuC FW engineers that they expect each stage
descriptor used by the kernel to be either one or the other, and the
interface has different values for the same field to distinguish them. The
only case where proxy stuff (db, wq) and principal stuff (lrcs) are
together should be for descriptors marked with GUC_STAGE_DESC_ATTR_REAL,
which AFAIU are the ones used for direct submission.

Things still kinda work when we do this with only 1 client since the 
only proxy we have is at the same time the principal of the 
kernel_context, so it is somehow a special and more stable case, but the 
abstraction doesn't really scale to more clients. I think a cleaner 
solution would be to reserve a handful of stage descriptors (or even 
just one) at the end of the array to work as proxies and use the 
remaining descriptors as principals. This way we can have a cleaner 
in-code separation of the 2 usages with very little cost. The only 
impact I can see is in the selftests as those won't be able to create 
256 clients anymore, but I'm sure we can find a way to override the 
reservation limit for selftest only as those would only need proxy 
entries anyway.
Also, each principal is tied to a single proxy via desc->proxy_id, so 
that also doesn't work with the current implementation.
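
A minimal standalone sketch of the split I'm suggesting, under stated
assumptions: the pool size is the 2032 from this patch, while the
reservation size of 4 and all helper names are made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define GUC_MAX_STAGE_DESCRIPTORS	2032	/* from this patch */
#define GUC_NUM_RESERVED_PROXIES	4	/* hypothetical reservation */
#define GUC_FIRST_PROXY_STAGE_ID \
	(GUC_MAX_STAGE_DESCRIPTORS - GUC_NUM_RESERVED_PROXIES)

/* Proxy descriptors (doorbell, wq, process desc) live at the end. */
static bool stage_id_is_proxy(uint32_t id)
{
	return id >= GUC_FIRST_PROXY_STAGE_ID &&
	       id < GUC_MAX_STAGE_DESCRIPTORS;
}

/* Principal descriptors (lrc arrays) use everything below that. */
static bool stage_id_is_principal(uint32_t id)
{
	return id < GUC_FIRST_PROXY_STAGE_ID;
}
```

With such a split no index is ever both kinds at once. The cost is that
the range available for sw_context_id shrinks by the reservation size,
so the hw_id limit would presumably have to shrink with it.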

Thoughts?

I'm going to skip detailed code review while we discuss the above.

Thanks,
Daniele

>    *
>    * The Scratch registers:
>    * There are 16 MMIO-based registers start from 0xC180. The kernel driver writes
> @@ -171,6 +176,16 @@ static struct guc_stage_desc *__get_stage_desc(struct intel_guc_client *client)
>   	return &base[client->stage_id];
>   }
>   
> +static struct guc_stage_desc *__get_ppal_stage_desc(struct intel_guc *guc,
> +						    u32 index)
> +{
> +	struct guc_stage_desc *base = guc->stage_desc_pool_vaddr;
> +
> +	GEM_BUG_ON(index >= GUC_MAX_STAGE_DESCRIPTORS);
> +
> +	return &base[index];
> +}
> +
>   /*
>    * Initialise, update, or clear doorbell data shared with the GuC
>    *
> @@ -344,70 +359,20 @@ static void guc_stage_desc_pool_destroy(struct intel_guc *guc)
>   static void guc_stage_desc_init(struct intel_guc *guc,
>   				struct intel_guc_client *client)
>   {
> -	struct drm_i915_private *dev_priv = guc_to_i915(guc);
> -	struct intel_engine_cs *engine;
> -	struct i915_gem_context *ctx = client->owner;
>   	struct guc_stage_desc *desc;
> -	unsigned int tmp;
>   	u32 gfx_addr;
>   
>   	desc = __get_stage_desc(client);
>   	memset(desc, 0, sizeof(*desc));
>   
>   	desc->attribute = GUC_STAGE_DESC_ATTR_ACTIVE |
> +			  GUC_STAGE_DESC_ATTR_PROXY |
>   			  GUC_STAGE_DESC_ATTR_KERNEL;
> -	if (is_high_priority(client))
> -		desc->attribute |= GUC_STAGE_DESC_ATTR_PREEMPT;
>   	desc->stage_id = client->stage_id;
> +	desc->proxy_id = client->stage_id;
>   	desc->priority = client->priority;
>   	desc->db_id = client->doorbell_id;
>   
> -	for_each_engine_masked(engine, dev_priv, client->engines, tmp) {
> -		struct intel_context *ce = to_intel_context(ctx, engine);
> -		u32 guc_engine_id = engine->guc_id;
> -		struct guc_execlist_context *lrc = &desc->lrc[guc_engine_id];
> -
> -		/* TODO: We have a design issue to be solved here. Only when we
> -		 * receive the first batch, we know which engine is used by the
> -		 * user. But here GuC expects the lrc and ring to be pinned. It
> -		 * is not an issue for default context, which is the only one
> -		 * for now who owns a GuC client. But for future owner of GuC
> -		 * client, need to make sure lrc is pinned prior to enter here.
> -		 */
> -		if (!ce->state)
> -			break;	/* XXX: continue? */
> -
> -		/*
> -		 * XXX: When this is a GUC_STAGE_DESC_ATTR_KERNEL client (proxy
> -		 * submission or, in other words, not using a direct submission
> -		 * model) the KMD's LRCA is not used for any work submission.
> -		 * Instead, the GuC uses the LRCA of the user mode context (see
> -		 * guc_add_request below).
> -		 */
> -		lrc->context_desc = lower_32_bits(ce->lrc_desc);
> -
> -		/* The state page is after PPHWSP */
> -		lrc->ring_lrca = intel_guc_ggtt_offset(guc, ce->state) +
> -				 LRC_STATE_PN * PAGE_SIZE;
> -
> -		/* XXX: In direct submission, the GuC wants the HW context id
> -		 * here. In proxy submission, it wants the stage id
> -		 */
> -		lrc->context_id = (client->stage_id << GUC_ELC_CTXID_OFFSET) |
> -				(guc_engine_id << GUC_ELC_ENGINE_OFFSET);
> -
> -		lrc->ring_begin = intel_guc_ggtt_offset(guc, ce->ring->vma);
> -		lrc->ring_end = lrc->ring_begin + ce->ring->size - 1;
> -		lrc->ring_next_free_location = lrc->ring_begin;
> -		lrc->ring_current_tail_pointer_value = 0;
> -
> -		desc->engines_used |= (1 << guc_engine_id);
> -	}
> -
> -	DRM_DEBUG_DRIVER("Host engines 0x%x => GuC engines used 0x%x\n",
> -			 client->engines, desc->engines_used);
> -	WARN_ON(desc->engines_used == 0);
> -
>   	/*
>   	 * The doorbell, process descriptor, and workqueue are all parts
>   	 * of the client object, which the GuC will reference via the GGTT
> @@ -430,7 +395,15 @@ static void guc_stage_desc_fini(struct intel_guc *guc,
>   	struct guc_stage_desc *desc;
>   
>   	desc = __get_stage_desc(client);
> -	memset(desc, 0, sizeof(*desc));
> +	/* No memset: the stage desc might still be used as a principal */
> +	desc->attribute &= ~GUC_STAGE_DESC_ATTR_TYPE_MASK;
> +	desc->db_id = 0;
> +	desc->db_trigger_phy = 0;
> +	desc->db_trigger_cpu = 0;
> +	desc->db_trigger_uk = 0;
> +	desc->process_desc = 0;
> +	desc->wq_addr = 0;
> +	desc->wq_size = 0;
>   }
>   
>   /* Construct a Work Item and append it to the GuC's Work Queue */
> @@ -1299,6 +1272,87 @@ static void guc_set_default_submission(struct intel_engine_cs *engine)
>   	engine->flags &= ~I915_ENGINE_SUPPORTS_STATS;
>   }
>   
> +static void guc_ppal_stage_attach(struct i915_gem_context *ctx,
> +				  struct intel_engine_cs *engine)
> +{
> +	struct intel_guc *guc = &ctx->i915->guc;
> +	struct intel_context *ce = to_intel_context(ctx, engine);
> +	struct guc_stage_desc *desc;
> +
> +	GEM_BUG_ON(ce->sw_context_id >= GUC_MAX_STAGE_DESCRIPTORS);
> +
> +	desc = __get_ppal_stage_desc(guc, ce->sw_context_id);
> +
> +	if (desc->lrc_count == 0) {
> +		desc->attribute = GUC_STAGE_DESC_ATTR_ACTIVE |
> +				  GUC_STAGE_DESC_ATTR_PRINCIPAL |
> +				  GUC_STAGE_DESC_ATTR_KERNEL;
> +		desc->stage_id = ce->sw_context_id;
> +	}
> +
> +	GEM_BUG_ON(bitmap32_test_bit(desc->lrc_bitmap[engine->guc_class],
> +				     ce->sw_counter));
> +	bitmap32_set_bit(desc->lrc_bitmap[engine->guc_class], ce->sw_counter);
> +	desc->lrc_count++;
> +
> +	/* GuC optimizations */
> +	if (ce->sw_counter >= desc->max_lrc_per_class)
> +		desc->max_lrc_per_class = ce->sw_counter + 1;
> +}
> +
> +static void guc_ppal_stage_detach(struct i915_gem_context *ctx,
> +				  struct intel_engine_cs *engine)
> +{
> +	struct intel_guc *guc = &ctx->i915->guc;
> +	struct intel_context *ce = to_intel_context(ctx, engine);
> +	struct guc_stage_desc *desc;
> +	struct guc_execlist_context *lrc;
> +
> +	GEM_BUG_ON(ce->sw_context_id >= GUC_MAX_STAGE_DESCRIPTORS);
> +
> +	desc = __get_ppal_stage_desc(guc, ce->sw_context_id);
> +
> +	GEM_BUG_ON(!bitmap32_test_bit(desc->lrc_bitmap[engine->guc_class],
> +				      ce->sw_counter));
> +	bitmap32_clear_bit(desc->lrc_bitmap[engine->guc_class], ce->sw_counter);
> +	desc->lrc_count--;
> +
> +	if (desc->lrc_count == 0) {
> +		desc->attribute = 0;
> +		desc->stage_id = 0;
> +		desc->max_lrc_per_class = 0;
> +	} else {
> +		/* TODO: GuC optimizations */
> +	}
> +
> +	lrc = &desc->lrc[engine->guc_class][ce->sw_counter];
> +	memset(lrc, 0, sizeof(*lrc));
> +}
> +
> +static void guc_ppal_stage_update(struct i915_gem_context *ctx,
> +				  struct intel_engine_cs *engine)
> +{
> +	struct intel_guc *guc = &ctx->i915->guc;
> +	struct intel_context *ce = to_intel_context(ctx, engine);
> +	struct guc_stage_desc *desc;
> +	struct guc_execlist_context *lrc;
> +
> +	GEM_BUG_ON(ce->sw_context_id >= GUC_MAX_STAGE_DESCRIPTORS);
> +
> +	desc = __get_ppal_stage_desc(guc, ce->sw_context_id);
> +
> +	GEM_BUG_ON(!bitmap32_test_bit(desc->lrc_bitmap[engine->guc_class],
> +				      ce->sw_counter));
> +
> +	lrc = &desc->lrc[engine->guc_class][ce->sw_counter];
> +
> +	lrc->context_desc = lower_32_bits(ce->lrc_desc);
> +	lrc->context_id = upper_32_bits(ce->lrc_desc);
> +	lrc->ring_lrca = intel_guc_ggtt_offset(guc, ce->state) + LRC_STATE_PN * PAGE_SIZE;
> +	lrc->ring_begin = intel_guc_ggtt_offset(guc, ce->ring->vma);
> +	lrc->ring_end = lrc->ring_begin + ce->ring->size - 1;
> +}
> +
>   int intel_guc_submission_enable(struct intel_guc *guc)
>   {
>   	struct drm_i915_private *dev_priv = guc_to_i915(guc);
> @@ -1321,17 +1375,26 @@ int intel_guc_submission_enable(struct intel_guc *guc)
>   
>   	GEM_BUG_ON(!guc->execbuf_client);
>   
> +	dev_priv->contexts.alloc_hook = guc_ppal_stage_attach;
> +	dev_priv->contexts.update_hook = guc_ppal_stage_update;
> +	dev_priv->contexts.free_hook = guc_ppal_stage_detach;
> +
> +	for_each_engine(engine, dev_priv, id) {
> +		guc_ppal_stage_attach(dev_priv->kernel_context, engine);
> +		guc_ppal_stage_update(dev_priv->kernel_context, engine);
> +	}
> +
>   	guc_reset_wq(guc->execbuf_client);
>   	if (guc->preempt_client)
>   		guc_reset_wq(guc->preempt_client);
>   
>   	err = intel_guc_sample_forcewake(guc);
>   	if (err)
> -		return err;
> +		goto err_clear_ctx_hooks;
>   
>   	err = guc_clients_doorbell_init(guc);
>   	if (err)
> -		return err;
> +		goto err_clear_ctx_hooks;
>   
>   	/* Take over from manual control of ELSP (execlists) */
>   	guc_interrupts_capture(dev_priv);
> @@ -1342,6 +1405,12 @@ int intel_guc_submission_enable(struct intel_guc *guc)
>   	}
>   
>   	return 0;
> +
> +err_clear_ctx_hooks:
> +	dev_priv->contexts.alloc_hook = NULL;
> +	dev_priv->contexts.update_hook = NULL;
> +	dev_priv->contexts.free_hook = NULL;
> +	return err;
>   }
>   
>   void intel_guc_submission_disable(struct intel_guc *guc)
> @@ -1352,6 +1421,10 @@ void intel_guc_submission_disable(struct intel_guc *guc)
>   
>   	guc_interrupts_release(dev_priv);
>   	guc_clients_doorbell_fini(guc);
> +
> +	dev_priv->contexts.alloc_hook = NULL;
> +	dev_priv->contexts.update_hook = NULL;
> +	dev_priv->contexts.free_hook = NULL;
>   }
>   
>   #if IS_ENABLED(CONFIG_DRM_I915_SELFTEST)
> 

* Re: [PATCH 08/21] drm/i915/guc: Make use of the SW counter field in the context descriptor
  2018-08-29 19:16 ` [PATCH 08/21] drm/i915/guc: Make use of the SW counter field in the context descriptor Michal Wajdeczko
@ 2018-08-30  0:08   ` Lionel Landwerlin
  2018-08-30 14:15     ` Lis, Tomasz
  0 siblings, 1 reply; 49+ messages in thread
From: Lionel Landwerlin @ 2018-08-30  0:08 UTC (permalink / raw)
  To: Michal Wajdeczko, intel-gfx; +Cc: Oscar Mateo, Rodrigo Vivi

On 29/08/2018 20:16, Michal Wajdeczko wrote:
> The new context descriptor format contains two assignable fields:
> the SW Context ID (technically 11 bits, but practically limited to 2032
> entries due to some being reserved for future use by the GuC) and the
> SW Counter (6 bits).
>
> We don't want to limit ourselves too much in the maximum number of
> concurrent contexts we want to allow, so ideally we want to employ
> every possible bit available. Unfortunately, a further limitation in
> the interface with the GuC means the combination of SW Context ID +
> SW Counter has to be unique within the same engine class (as we use
> the SW Context ID to index in the GuC stage descriptor pool, and the
> Engine Class + SW Counter to index in the 2-dimensional lrc array).
> This essentially means we need to somehow encode the engine instance.
>
> Since the BSpec allows 6 bits for engine instance, we use the whole
> SW counter for this task. If the limitation of 2032 maximum simultaneous
> contexts is too restrictive, we can always squeeze things a bit more
> (3 extra bits for hw_id, 3 bits for instance) and things will still
> work (Gen11 does not instantiate more than 8 engines of any class).
>
> Another alternative would be to generate the hw_id per HW context
> instead of per GEM context, but that has other problems (e.g. maximum
> number of user-created contexts would be variable, no relationship
> between a GuC principal descriptor and the proxy descriptor it uses, ...)
>
> Bspec: 12254
>
> Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
> Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
> Cc: Michel Thierry <michel.thierry@intel.com>
> ---
>   drivers/gpu/drm/i915/i915_drv.h         | 15 +++++++++++----
>   drivers/gpu/drm/i915/i915_gem_context.c |  5 ++++-
>   drivers/gpu/drm/i915/i915_gem_context.h |  2 ++
>   drivers/gpu/drm/i915/i915_reg.h         |  2 ++
>   drivers/gpu/drm/i915/intel_lrc.c        | 12 +++++++++---
>   5 files changed, 28 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index e5b9d3c..34f5495 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -1866,14 +1866,21 @@ struct drm_i915_private {
>   		struct llist_head free_list;
>   		struct work_struct free_work;
>   
> -		/* The hw wants to have a stable context identifier for the
> +		/*
> +		 * The HW wants to have a stable context identifier for the
>   		 * lifetime of the context (for OA, PASID, faults, etc).
>   		 * This is limited in execlists to 21 bits.
> +		 * In enhanced execlist (GEN11+) this is limited to 11 bits
> +		 * (the SW Context ID field) but GuC limits it a bit further
> +		 * (11 bits - 16) due to some entries being reserved for future
> +		 * use (so the firmware only supports a GuC stage descriptor
> +		 * pool of 2032 entries).
>   		 */
>   		struct ida hw_ida;
> -#define MAX_CONTEXT_HW_ID (1<<21) /* exclusive */
> -#define MAX_GUC_CONTEXT_HW_ID (1 << 20) /* exclusive */
> -#define GEN11_MAX_CONTEXT_HW_ID (1<<11) /* exclusive */
> +#define MAX_CONTEXT_HW_ID			(1 << 21) /* exclusive */
> +#define MAX_GUC_CONTEXT_HW_ID			(1 << 20) /* exclusive */
> +#define GEN11_MAX_CONTEXT_HW_ID			(1 << 11) /* exclusive */
> +#define GEN11_MAX_CONTEXT_HW_ID_WITH_GUC	(GEN11_MAX_CONTEXT_HW_ID - 16)
>   	} contexts;
>   
>   	u32 fdi_rx_config;
> diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
> index f15a039..e3b500c 100644
> --- a/drivers/gpu/drm/i915/i915_gem_context.c
> +++ b/drivers/gpu/drm/i915/i915_gem_context.c
> @@ -209,7 +209,10 @@ static int assign_hw_id(struct drm_i915_private *dev_priv, unsigned *out)
>   	unsigned int max;
>   
>   	if (INTEL_GEN(dev_priv) >= 11) {
> -		max = GEN11_MAX_CONTEXT_HW_ID;
> +		if (USES_GUC_SUBMISSION(dev_priv))
> +			max = GEN11_MAX_CONTEXT_HW_ID_WITH_GUC;
> +		else
> +			max = GEN11_MAX_CONTEXT_HW_ID;
>   	} else {
>   		/*
>   		 * When using GuC in proxy submission, GuC consumes the
> diff --git a/drivers/gpu/drm/i915/i915_gem_context.h b/drivers/gpu/drm/i915/i915_gem_context.h
> index 851dad6..4b87f5d 100644
> --- a/drivers/gpu/drm/i915/i915_gem_context.h
> +++ b/drivers/gpu/drm/i915/i915_gem_context.h
> @@ -154,6 +154,8 @@ struct i915_gem_context {
>   		struct intel_ring *ring;
>   		u32 *lrc_reg_state;
>   		u64 lrc_desc;
> +		u32 sw_context_id;
> +		u32 sw_counter;
>   		int pin_count;
>   
>   		const struct intel_context_ops *ops;
> diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
> index f232178..ea65d7b 100644
> --- a/drivers/gpu/drm/i915/i915_reg.h
> +++ b/drivers/gpu/drm/i915/i915_reg.h
> @@ -3900,6 +3900,8 @@ enum {
>   #define GEN8_CTX_ID_WIDTH 21
>   #define GEN11_SW_CTX_ID_SHIFT 37
>   #define GEN11_SW_CTX_ID_WIDTH 11
> +#define GEN11_SW_COUNTER_SHIFT 55
> +#define GEN11_SW_COUNTER_WIDTH 6
>   #define GEN11_ENGINE_CLASS_SHIFT 61
>   #define GEN11_ENGINE_CLASS_WIDTH 3
>   #define GEN11_ENGINE_INSTANCE_SHIFT 48
> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
> index f4b9972..3001a14 100644
> --- a/drivers/gpu/drm/i915/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/intel_lrc.c
> @@ -240,14 +240,15 @@ static inline bool need_preempt(const struct intel_engine_cs *engine,
>   	 * anything below.
>   	 */
>   	if (INTEL_GEN(ctx->i915) >= 11) {


Hey Michal,

There is a comment just above the if () about updating i915_perf.c
(oa_get_render_ctx_id) whenever the descriptor layout is updated.
Otherwise this will break part of the observability feature.
You can verify this with the IGT test: tests/perf
--run-subtest=gen8-unprivileged-single-ctx-counters

Thanks a lot,

-
Lionel
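
A minimal sketch of the synchronization concern raised above, assuming the perf code matches contexts on the upper dword of the descriptor. The shift and width constants are the ones quoted in this patch (the 6-bit instance width is taken from the "bits 48-53" comment). upper_field_mask(), ctx_match_mask() and ctx_match_value() are hypothetical names used only for illustration, not functions from i915_perf.c. Deriving the mask from the same constants as the descriptor builder means a layout change shows up in both places:

```c
#include <assert.h>
#include <stdint.h>

/* Layout constants as quoted in the i915_reg.h hunk of this patch. */
#define GEN11_SW_CTX_ID_SHIFT        37
#define GEN11_SW_CTX_ID_WIDTH        11
#define GEN11_ENGINE_INSTANCE_SHIFT  48
#define GEN11_ENGINE_INSTANCE_WIDTH  6	/* assumed from the "bits 48-53" comment */
#define GEN11_ENGINE_CLASS_SHIFT     61
#define GEN11_ENGINE_CLASS_WIDTH     3

/* Hypothetical helper: mask of one descriptor field as seen in the upper dword. */
static uint32_t upper_field_mask(unsigned int shift, unsigned int width)
{
	assert(shift >= 32 && width > 0 && width < 32 && shift + width <= 64);
	return ((1u << width) - 1) << (shift - 32);
}

/* Hypothetical helper: all upper-dword bits a context filter would compare. */
static uint32_t ctx_match_mask(void)
{
	return upper_field_mask(GEN11_SW_CTX_ID_SHIFT, GEN11_SW_CTX_ID_WIDTH) |
	       upper_field_mask(GEN11_ENGINE_INSTANCE_SHIFT,
				GEN11_ENGINE_INSTANCE_WIDTH) |
	       upper_field_mask(GEN11_ENGINE_CLASS_SHIFT,
				GEN11_ENGINE_CLASS_WIDTH);
}

/* Hypothetical helper: the value a filter would compare for one descriptor. */
static uint32_t ctx_match_value(uint64_t desc)
{
	return (uint32_t)(desc >> 32) & ctx_match_mask();
}
```

The sketch leaves the SW counter bits (55-60) out of the mask. Whether they belong there is a decision for the perf code and is not something this sketch settles.
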
> -		GEM_BUG_ON(ctx->hw_id >= BIT(GEN11_SW_CTX_ID_WIDTH));
> -		desc |= (u64)ctx->hw_id << GEN11_SW_CTX_ID_SHIFT;
> +		GEM_BUG_ON(ce->sw_context_id >= BIT(GEN11_SW_CTX_ID_WIDTH));
> +		desc |= (u64)ce->sw_context_id << GEN11_SW_CTX_ID_SHIFT;
>   								/* bits 37-47 */
>   
>   		desc |= (u64)engine->instance << GEN11_ENGINE_INSTANCE_SHIFT;
>   								/* bits 48-53 */
>   
> -		/* TODO: decide what to do with SW counter (bits 55-60) */
> +		desc |= (u64)ce->sw_counter << GEN11_SW_COUNTER_SHIFT;
> +								/* bits 55-60 */
>   
>   		/*
>   		 * Although GuC will never see this upper part as it fills
> @@ -2771,6 +2772,11 @@ static int execlists_context_deferred_alloc(struct i915_gem_context *ctx,
>   	ce->ring = ring;
>   	ce->state = vma;
>   
> +	if (INTEL_GEN(ctx->i915) >= 11) {
> +		ce->sw_context_id = ctx->hw_id;
> +		ce->sw_counter = engine->instance;
> +	}
> +
>   	return 0;
>   
>   error_ring_free:


_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 06/21] drm/i915/guc: Use guc_class instead of engine_class in fw interface
  2018-08-29 19:58   ` Michel Thierry
@ 2018-08-30  0:16     ` Lionel Landwerlin
  2018-08-30 13:29       ` Lis, Tomasz
  2018-08-30 22:34       ` Daniele Ceraolo Spurio
  0 siblings, 2 replies; 49+ messages in thread
From: Lionel Landwerlin @ 2018-08-30  0:16 UTC (permalink / raw)
  To: Michel Thierry, Michal Wajdeczko, intel-gfx; +Cc: Lucas De Marchi, Rodrigo Vivi

On 29/08/2018 20:58, Michel Thierry wrote:
> +Lionel
> (please see below as this touches the lrca format & relates to OA 
> reporting too)
>
> On 8/29/2018 12:10 PM, Michal Wajdeczko wrote:
>> Until now the GuC and HW engine classes have been the same, which allowed
>> us to use them interchangeably. But it is better to start doing the
>> right thing and use the GuC definitions for the firmware interface.
>>
>> We also keep the same class id in the ctx descriptor to be able to have
>> the same values in the driver and firmware logs.
>>
>> Signed-off-by: Michel Thierry <michel.thierry@intel.com>
>> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
>> Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
>> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
>> Cc: Michel Thierry <michel.thierry@intel.com>
>> Cc: Lucas De Marchi <lucas.demarchi@intel.com>
>> Cc: Tomasz Lis <tomasz.lis@intel.com>
>> ---
>>   drivers/gpu/drm/i915/intel_engine_cs.c  | 13 +++++++++++++
>>   drivers/gpu/drm/i915/intel_guc_fwif.h   |  7 +++++++
>>   drivers/gpu/drm/i915/intel_lrc.c        | 10 +++++++++-
>>   drivers/gpu/drm/i915/intel_ringbuffer.h |  2 ++
>>   4 files changed, 31 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/gpu/drm/i915/intel_engine_cs.c 
>> b/drivers/gpu/drm/i915/intel_engine_cs.c
>> index 1a34e8f..bc81354 100644
>> --- a/drivers/gpu/drm/i915/intel_engine_cs.c
>> +++ b/drivers/gpu/drm/i915/intel_engine_cs.c
>> @@ -85,6 +85,7 @@ struct engine_info {
>>       unsigned int hw_id;
>>       unsigned int uabi_id;
>>       u8 class;
>> +    u8 guc_class;
>>       u8 instance;
>>       /* mmio bases table *must* be sorted in reverse gen order */
>>       struct engine_mmio_base {
>> @@ -98,6 +99,7 @@ struct engine_info {
>>           .hw_id = RCS_HW,
>>           .uabi_id = I915_EXEC_RENDER,
>>           .class = RENDER_CLASS,
>> +        .guc_class = GUC_RENDER_CLASS,
>>           .instance = 0,
>>           .mmio_bases = {
>>               { .gen = 1, .base = RENDER_RING_BASE }
>> @@ -107,6 +109,7 @@ struct engine_info {
>>           .hw_id = BCS_HW,
>>           .uabi_id = I915_EXEC_BLT,
>>           .class = COPY_ENGINE_CLASS,
>> +        .guc_class = GUC_BLITTER_CLASS,
>>           .instance = 0,
>>           .mmio_bases = {
>>               { .gen = 6, .base = BLT_RING_BASE }
>> @@ -116,6 +119,7 @@ struct engine_info {
>>           .hw_id = VCS_HW,
>>           .uabi_id = I915_EXEC_BSD,
>>           .class = VIDEO_DECODE_CLASS,
>> +        .guc_class = GUC_VIDEO_CLASS,
>>           .instance = 0,
>>           .mmio_bases = {
>>               { .gen = 11, .base = GEN11_BSD_RING_BASE },
>> @@ -127,6 +131,7 @@ struct engine_info {
>>           .hw_id = VCS2_HW,
>>           .uabi_id = I915_EXEC_BSD,
>>           .class = VIDEO_DECODE_CLASS,
>> +        .guc_class = GUC_VIDEO_CLASS,
>>           .instance = 1,
>>           .mmio_bases = {
>>               { .gen = 11, .base = GEN11_BSD2_RING_BASE },
>> @@ -137,6 +142,7 @@ struct engine_info {
>>           .hw_id = VCS3_HW,
>>           .uabi_id = I915_EXEC_BSD,
>>           .class = VIDEO_DECODE_CLASS,
>> +        .guc_class = GUC_VIDEO_CLASS,
>>           .instance = 2,
>>           .mmio_bases = {
>>               { .gen = 11, .base = GEN11_BSD3_RING_BASE }
>> @@ -146,6 +152,7 @@ struct engine_info {
>>           .hw_id = VCS4_HW,
>>           .uabi_id = I915_EXEC_BSD,
>>           .class = VIDEO_DECODE_CLASS,
>> +        .guc_class = GUC_VIDEO_CLASS,
>>           .instance = 3,
>>           .mmio_bases = {
>>               { .gen = 11, .base = GEN11_BSD4_RING_BASE }
>> @@ -155,6 +162,7 @@ struct engine_info {
>>           .hw_id = VECS_HW,
>>           .uabi_id = I915_EXEC_VEBOX,
>>           .class = VIDEO_ENHANCEMENT_CLASS,
>> +        .guc_class = GUC_VIDEOENHANCE_CLASS,
>>           .instance = 0,
>>           .mmio_bases = {
>>               { .gen = 11, .base = GEN11_VEBOX_RING_BASE },
>> @@ -165,6 +173,7 @@ struct engine_info {
>>           .hw_id = VECS2_HW,
>>           .uabi_id = I915_EXEC_VEBOX,
>>           .class = VIDEO_ENHANCEMENT_CLASS,
>> +        .guc_class = GUC_VIDEOENHANCE_CLASS,
>>           .instance = 1,
>>           .mmio_bases = {
>>               { .gen = 11, .base = GEN11_VEBOX2_RING_BASE }
>> @@ -276,6 +285,9 @@ static void __sprint_engine_name(char *name, 
>> const struct engine_info *info)
>>       if (GEM_WARN_ON(info->class > MAX_ENGINE_CLASS))
>>           return -EINVAL;
>>   +    if (GEM_WARN_ON(info->guc_class >= GUC_MAX_ENGINE_CLASSES))
>> +        return -EINVAL;
>> +
>>       if (GEM_WARN_ON(info->instance > MAX_ENGINE_INSTANCE))
>>           return -EINVAL;
>>   @@ -291,6 +303,7 @@ static void __sprint_engine_name(char *name, 
>> const struct engine_info *info)
>>       engine->i915 = dev_priv;
>>       __sprint_engine_name(engine->name, info);
>>       engine->hw_id = engine->guc_id = info->hw_id;
>> +    engine->guc_class = info->guc_class;
>>       engine->mmio_base = __engine_mmio_base(dev_priv, 
>> info->mmio_bases);
>>       engine->class = info->class;
>>       engine->instance = info->instance;
>> diff --git a/drivers/gpu/drm/i915/intel_guc_fwif.h 
>> b/drivers/gpu/drm/i915/intel_guc_fwif.h
>> index 963da91..5b7a05b 100644
>> --- a/drivers/gpu/drm/i915/intel_guc_fwif.h
>> +++ b/drivers/gpu/drm/i915/intel_guc_fwif.h
>> @@ -39,6 +39,13 @@
>>   #define GUC_VIDEO_ENGINE2        4
>>   #define GUC_MAX_ENGINES_NUM        (GUC_VIDEO_ENGINE2 + 1)
>>   +#define GUC_RENDER_CLASS    0
>> +#define GUC_VIDEO_CLASS        1
>> +#define GUC_VIDEOENHANCE_CLASS    2
>> +#define GUC_BLITTER_CLASS    3
>> +#define GUC_RESERVED_CLASS    4
>> +#define GUC_MAX_ENGINE_CLASSES    (GUC_RESERVED_CLASS + 1)
>> +
>>   /* Work queue item header definitions */
>>   #define WQ_STATUS_ACTIVE        1
>>   #define WQ_STATUS_SUSPENDED        2
>> diff --git a/drivers/gpu/drm/i915/intel_lrc.c 
>> b/drivers/gpu/drm/i915/intel_lrc.c
>> index f8ceb9c..f4b9972 100644
>> --- a/drivers/gpu/drm/i915/intel_lrc.c
>> +++ b/drivers/gpu/drm/i915/intel_lrc.c
>> @@ -249,7 +249,15 @@ static inline bool need_preempt(const struct 
>> intel_engine_cs *engine,
>>             /* TODO: decide what to do with SW counter (bits 55-60) */
>>   -        desc |= (u64)engine->class << GEN11_ENGINE_CLASS_SHIFT;
>> +        /*
>> +         * Although GuC will never see this upper part as it fills
>> +         * its own descriptor, using the guc_class here will help keep
>> +         * the i915 and firmware logs in sync.
>> +         */
>> +        if (HAS_GUC_SCHED(ctx->i915))
>> +            desc |= (u64)engine->guc_class << GEN11_ENGINE_CLASS_SHIFT;
>> +        else
>> +            desc |= (u64)engine->class << GEN11_ENGINE_CLASS_SHIFT;
>>                                   /* bits 61-63 */
>
> OA also uses this upper part (see oa_get_render_ctx_id), so it's 
> something to be aware of.
>
> My opinion is that it's useful to have the lrca matching the firmware
> logs, but OA should account for this change, as it receives what the fw
> sent to the hw. Which one is more important is for others to decide
> (plus it only becomes a problem when engine-class and guc-class start 
> to deviate).
>
> Acked-by: Michel Thierry <michel.thierry@intel.com>
>
> -Michel


If GuC still behaves the same as the Gen9 firmware I was testing with, 
parts of the upper 32 bits of the descriptor will end up in HW.

Just make sure i915_perf.c is in sync with intel_lrc.c and it should be 
fine :)
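
A minimal sketch of when the two class numberings would start to matter, under stated assumptions. The GuC class IDs are the ones from the quoted intel_guc_fwif.h hunk. The HW class values in the enum are assumed for illustration (they are not shown in this thread), and hw_to_guc_class[] and class_ids_deviate() are hypothetical names. As long as the check below reports no deviation, the class bits written into the descriptor are the same whichever numbering is used:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* GuC class IDs, as defined in the quoted intel_guc_fwif.h hunk. */
#define GUC_RENDER_CLASS        0
#define GUC_VIDEO_CLASS         1
#define GUC_VIDEOENHANCE_CLASS  2
#define GUC_BLITTER_CLASS       3

/* HW engine class IDs: ASSUMED values, for illustration only. */
enum hw_class {
	HW_RENDER = 0,
	HW_VIDEO_DECODE = 1,
	HW_VIDEO_ENHANCE = 2,
	HW_COPY = 3,
	HW_CLASS_COUNT
};

/* Hypothetical table mirroring the .class / .guc_class pairs in the patch. */
static const uint8_t hw_to_guc_class[HW_CLASS_COUNT] = {
	[HW_RENDER]        = GUC_RENDER_CLASS,
	[HW_VIDEO_DECODE]  = GUC_VIDEO_CLASS,
	[HW_VIDEO_ENHANCE] = GUC_VIDEOENHANCE_CLASS,
	[HW_COPY]          = GUC_BLITTER_CLASS,
};

/* Returns true if any HW class maps to a different numeric GuC class. */
static bool class_ids_deviate(void)
{
	for (size_t i = 0; i < HW_CLASS_COUNT; i++)
		if (hw_to_guc_class[i] != i)
			return true;
	return false;
}
```

If a future firmware renumbered a class, class_ids_deviate() would return true, and any consumer of the descriptor's class bits would need to know which numbering was used.
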


>
>
>>       } else {
>>           GEM_BUG_ON(ctx->hw_id >= BIT(GEN8_CTX_ID_WIDTH));
>> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h 
>> b/drivers/gpu/drm/i915/intel_ringbuffer.h
>> index 3f6920d..f47009f 100644
>> --- a/drivers/gpu/drm/i915/intel_ringbuffer.h
>> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
>> @@ -350,7 +350,9 @@ struct intel_engine_cs {
>>         enum intel_engine_id id;
>>       unsigned int hw_id;
>> +
>>       unsigned int guc_id;
>> +    u8 guc_class;
>>         u8 uabi_id;
>>       u8 uabi_class;
>>
>

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 06/21] drm/i915/guc: Use guc_class instead of engine_class in fw interface
  2018-08-30  0:16     ` Lionel Landwerlin
@ 2018-08-30 13:29       ` Lis, Tomasz
  2018-08-30 14:16         ` Lis, Tomasz
  2018-08-30 14:56         ` Lionel Landwerlin
  2018-08-30 22:34       ` Daniele Ceraolo Spurio
  1 sibling, 2 replies; 49+ messages in thread
From: Lis, Tomasz @ 2018-08-30 13:29 UTC (permalink / raw)
  To: Lionel Landwerlin, Michel Thierry, Michal Wajdeczko, intel-gfx
  Cc: Lucas De Marchi, Rodrigo Vivi



On 2018-08-30 02:16, Lionel Landwerlin wrote:
> On 29/08/2018 20:58, Michel Thierry wrote:
>> +Lionel
>> (please see below as this touches the lrca format & relates to OA 
>> reporting too)
>>
>> On 8/29/2018 12:10 PM, Michal Wajdeczko wrote:
>>> Until now the GuC and HW engine classes have been the same, which allowed
>>> us to use them interchangeably. But it is better to start doing the
>>> right thing and use the GuC definitions for the firmware interface.
>>>
>>> We also keep the same class id in the ctx descriptor to be able to have
>>> the same values in the driver and firmware logs.
>>>
>>> Signed-off-by: Michel Thierry <michel.thierry@intel.com>
>>> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
>>> Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
>>> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
>>> Cc: Michel Thierry <michel.thierry@intel.com>
>>> Cc: Lucas De Marchi <lucas.demarchi@intel.com>
>>> Cc: Tomasz Lis <tomasz.lis@intel.com>
Tested-by: Tomasz Lis <tomasz.lis@intel.com>
>>> ---
>>>   drivers/gpu/drm/i915/intel_engine_cs.c  | 13 +++++++++++++
>>>   drivers/gpu/drm/i915/intel_guc_fwif.h   |  7 +++++++
>>>   drivers/gpu/drm/i915/intel_lrc.c        | 10 +++++++++-
>>>   drivers/gpu/drm/i915/intel_ringbuffer.h |  2 ++
>>>   4 files changed, 31 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/gpu/drm/i915/intel_engine_cs.c 
>>> b/drivers/gpu/drm/i915/intel_engine_cs.c
>>> index 1a34e8f..bc81354 100644
>>> --- a/drivers/gpu/drm/i915/intel_engine_cs.c
>>> +++ b/drivers/gpu/drm/i915/intel_engine_cs.c
>>> @@ -85,6 +85,7 @@ struct engine_info {
>>>       unsigned int hw_id;
>>>       unsigned int uabi_id;
>>>       u8 class;
>>> +    u8 guc_class;
>>>       u8 instance;
>>>       /* mmio bases table *must* be sorted in reverse gen order */
>>>       struct engine_mmio_base {
>>> @@ -98,6 +99,7 @@ struct engine_info {
>>>           .hw_id = RCS_HW,
>>>           .uabi_id = I915_EXEC_RENDER,
>>>           .class = RENDER_CLASS,
>>> +        .guc_class = GUC_RENDER_CLASS,
>>>           .instance = 0,
>>>           .mmio_bases = {
>>>               { .gen = 1, .base = RENDER_RING_BASE }
>>> @@ -107,6 +109,7 @@ struct engine_info {
>>>           .hw_id = BCS_HW,
>>>           .uabi_id = I915_EXEC_BLT,
>>>           .class = COPY_ENGINE_CLASS,
>>> +        .guc_class = GUC_BLITTER_CLASS,
>>>           .instance = 0,
>>>           .mmio_bases = {
>>>               { .gen = 6, .base = BLT_RING_BASE }
>>> @@ -116,6 +119,7 @@ struct engine_info {
>>>           .hw_id = VCS_HW,
>>>           .uabi_id = I915_EXEC_BSD,
>>>           .class = VIDEO_DECODE_CLASS,
>>> +        .guc_class = GUC_VIDEO_CLASS,
>>>           .instance = 0,
>>>           .mmio_bases = {
>>>               { .gen = 11, .base = GEN11_BSD_RING_BASE },
>>> @@ -127,6 +131,7 @@ struct engine_info {
>>>           .hw_id = VCS2_HW,
>>>           .uabi_id = I915_EXEC_BSD,
>>>           .class = VIDEO_DECODE_CLASS,
>>> +        .guc_class = GUC_VIDEO_CLASS,
>>>           .instance = 1,
>>>           .mmio_bases = {
>>>               { .gen = 11, .base = GEN11_BSD2_RING_BASE },
>>> @@ -137,6 +142,7 @@ struct engine_info {
>>>           .hw_id = VCS3_HW,
>>>           .uabi_id = I915_EXEC_BSD,
>>>           .class = VIDEO_DECODE_CLASS,
>>> +        .guc_class = GUC_VIDEO_CLASS,
>>>           .instance = 2,
>>>           .mmio_bases = {
>>>               { .gen = 11, .base = GEN11_BSD3_RING_BASE }
>>> @@ -146,6 +152,7 @@ struct engine_info {
>>>           .hw_id = VCS4_HW,
>>>           .uabi_id = I915_EXEC_BSD,
>>>           .class = VIDEO_DECODE_CLASS,
>>> +        .guc_class = GUC_VIDEO_CLASS,
>>>           .instance = 3,
>>>           .mmio_bases = {
>>>               { .gen = 11, .base = GEN11_BSD4_RING_BASE }
>>> @@ -155,6 +162,7 @@ struct engine_info {
>>>           .hw_id = VECS_HW,
>>>           .uabi_id = I915_EXEC_VEBOX,
>>>           .class = VIDEO_ENHANCEMENT_CLASS,
>>> +        .guc_class = GUC_VIDEOENHANCE_CLASS,
>>>           .instance = 0,
>>>           .mmio_bases = {
>>>               { .gen = 11, .base = GEN11_VEBOX_RING_BASE },
>>> @@ -165,6 +173,7 @@ struct engine_info {
>>>           .hw_id = VECS2_HW,
>>>           .uabi_id = I915_EXEC_VEBOX,
>>>           .class = VIDEO_ENHANCEMENT_CLASS,
>>> +        .guc_class = GUC_VIDEOENHANCE_CLASS,
>>>           .instance = 1,
>>>           .mmio_bases = {
>>>               { .gen = 11, .base = GEN11_VEBOX2_RING_BASE }
>>> @@ -276,6 +285,9 @@ static void __sprint_engine_name(char *name, 
>>> const struct engine_info *info)
>>>       if (GEM_WARN_ON(info->class > MAX_ENGINE_CLASS))
>>>           return -EINVAL;
>>>   +    if (GEM_WARN_ON(info->guc_class >= GUC_MAX_ENGINE_CLASSES))
>>> +        return -EINVAL;
>>> +
>>>       if (GEM_WARN_ON(info->instance > MAX_ENGINE_INSTANCE))
>>>           return -EINVAL;
>>>   @@ -291,6 +303,7 @@ static void __sprint_engine_name(char *name, 
>>> const struct engine_info *info)
>>>       engine->i915 = dev_priv;
>>>       __sprint_engine_name(engine->name, info);
>>>       engine->hw_id = engine->guc_id = info->hw_id;
>>> +    engine->guc_class = info->guc_class;
>>>       engine->mmio_base = __engine_mmio_base(dev_priv, 
>>> info->mmio_bases);
>>>       engine->class = info->class;
>>>       engine->instance = info->instance;
>>> diff --git a/drivers/gpu/drm/i915/intel_guc_fwif.h 
>>> b/drivers/gpu/drm/i915/intel_guc_fwif.h
>>> index 963da91..5b7a05b 100644
>>> --- a/drivers/gpu/drm/i915/intel_guc_fwif.h
>>> +++ b/drivers/gpu/drm/i915/intel_guc_fwif.h
>>> @@ -39,6 +39,13 @@
>>>   #define GUC_VIDEO_ENGINE2        4
>>>   #define GUC_MAX_ENGINES_NUM        (GUC_VIDEO_ENGINE2 + 1)
>>>   +#define GUC_RENDER_CLASS    0
>>> +#define GUC_VIDEO_CLASS        1
>>> +#define GUC_VIDEOENHANCE_CLASS    2
>>> +#define GUC_BLITTER_CLASS    3
>>> +#define GUC_RESERVED_CLASS    4
>>> +#define GUC_MAX_ENGINE_CLASSES    (GUC_RESERVED_CLASS + 1)
>>> +
>>>   /* Work queue item header definitions */
>>>   #define WQ_STATUS_ACTIVE        1
>>>   #define WQ_STATUS_SUSPENDED        2
>>> diff --git a/drivers/gpu/drm/i915/intel_lrc.c 
>>> b/drivers/gpu/drm/i915/intel_lrc.c
>>> index f8ceb9c..f4b9972 100644
>>> --- a/drivers/gpu/drm/i915/intel_lrc.c
>>> +++ b/drivers/gpu/drm/i915/intel_lrc.c
>>> @@ -249,7 +249,15 @@ static inline bool need_preempt(const struct 
>>> intel_engine_cs *engine,
>>>             /* TODO: decide what to do with SW counter (bits 55-60) */
>>>   -        desc |= (u64)engine->class << GEN11_ENGINE_CLASS_SHIFT;
>>> +        /*
>>> +         * Although GuC will never see this upper part as it fills
>>> +         * its own descriptor, using the guc_class here will help keep
>>> +         * the i915 and firmware logs in sync.
>>> +         */
>>> +        if (HAS_GUC_SCHED(ctx->i915))
>>> +            desc |= (u64)engine->guc_class << 
>>> GEN11_ENGINE_CLASS_SHIFT;
>>> +        else
>>> +            desc |= (u64)engine->class << GEN11_ENGINE_CLASS_SHIFT;
>>>                                   /* bits 61-63 */
>>
>> OA also uses this upper part (see oa_get_render_ctx_id), so it's 
>> something to be aware of.
>>
>> My opinion is that it's useful to have the lrca matching the firmware
>> logs, but OA should account for this change, as it receives what the fw
>> sent to the hw. Which one is more important is for others to decide
>> (plus it only becomes a problem when engine-class and guc-class start 
>> to deviate).
>>
>> Acked-by: Michel Thierry <michel.thierry@intel.com>
>>
>> -Michel
>
>
> If GuC still behaves the same as the Gen9 firmware I was testing with, 
> parts of the upper 32 bits of the descriptor will end up in HW.
>
> Just make sure i915_perf.c is in sync with intel_lrc.c and it should 
> be fine :)
>
Tested on KBL; works fine for both enable_guc=2 and enable_guc=3.

./tests/perf --run-subtest=gen8-unprivileged-single-ctx-counters
IGT-Version: 1.22-g11db680 (x86_64) (Linux: 4.19.0-rc1tli+ x86_64)
Subtest gen8-unprivileged-single-ctx-counters: SUCCESS (0,058s)

-Tomasz
>
>>
>>
>>>       } else {
>>>           GEM_BUG_ON(ctx->hw_id >= BIT(GEN8_CTX_ID_WIDTH));
>>> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h 
>>> b/drivers/gpu/drm/i915/intel_ringbuffer.h
>>> index 3f6920d..f47009f 100644
>>> --- a/drivers/gpu/drm/i915/intel_ringbuffer.h
>>> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
>>> @@ -350,7 +350,9 @@ struct intel_engine_cs {
>>>         enum intel_engine_id id;
>>>       unsigned int hw_id;
>>> +
>>>       unsigned int guc_id;
>>> +    u8 guc_class;
>>>         u8 uabi_id;
>>>       u8 uabi_class;
>>>
>>
>

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 08/21] drm/i915/guc: Make use of the SW counter field in the context descriptor
  2018-08-30  0:08   ` Lionel Landwerlin
@ 2018-08-30 14:15     ` Lis, Tomasz
  2018-08-31 15:31       ` Lis, Tomasz
  0 siblings, 1 reply; 49+ messages in thread
From: Lis, Tomasz @ 2018-08-30 14:15 UTC (permalink / raw)
  To: Lionel Landwerlin, Michal Wajdeczko, intel-gfx; +Cc: Oscar Mateo, Rodrigo Vivi



On 2018-08-30 02:08, Lionel Landwerlin wrote:
> On 29/08/2018 20:16, Michal Wajdeczko wrote:
>> The new context descriptor format contains two assignable fields:
>> the SW Context ID (technically 11 bits, but practically limited to 2032
>> entries due to some being reserved for future use by the GuC) and the
>> SW Counter (6 bits).
>>
>> We don't want to limit ourselves too much in the maximum number of
>> concurrent contexts we want to allow, so ideally we want to employ
>> every possible bit available. Unfortunately, a further limitation in
>> the interface with the GuC means the combination of SW Context ID +
>> SW Counter has to be unique within the same engine class (as we use
>> the SW Context ID to index in the GuC stage descriptor pool, and the
>> Engine Class + SW Counter to index in the 2-dimensional lrc array).
>> This essentially means we need to somehow encode the engine instance.
>>
>> Since the BSpec allows 6 bits for engine instance, we use the whole
>> SW counter for this task. If the limitation of 2032 maximum simultaneous
>> contexts is too restrictive, we can always squeeze things a bit more
>> (3 extra bits for hw_id, 3 bits for instance) and things will still
>> work (Gen11 does not instantiate more than 8 engines of any class).
>>
>> Another alternative would be to generate the hw_id per HW context
>> instead of per GEM context, but that has other problems (e.g. maximum
>> number of user-created contexts would be variable, no relationship
>> between a GuC principal descriptor and the proxy descriptor it uses, 
>> ...)
>>
>> Bspec: 12254
>>
>> Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
>> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
>> Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
>> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
>> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
>> Cc: Michel Thierry <michel.thierry@intel.com>
Tested-by: Tomasz Lis <tomasz.lis@intel.com>
>> ---
>>   drivers/gpu/drm/i915/i915_drv.h         | 15 +++++++++++----
>>   drivers/gpu/drm/i915/i915_gem_context.c |  5 ++++-
>>   drivers/gpu/drm/i915/i915_gem_context.h |  2 ++
>>   drivers/gpu/drm/i915/i915_reg.h         |  2 ++
>>   drivers/gpu/drm/i915/intel_lrc.c        | 12 +++++++++---
>>   5 files changed, 28 insertions(+), 8 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/i915/i915_drv.h 
>> b/drivers/gpu/drm/i915/i915_drv.h
>> index e5b9d3c..34f5495 100644
>> --- a/drivers/gpu/drm/i915/i915_drv.h
>> +++ b/drivers/gpu/drm/i915/i915_drv.h
>> @@ -1866,14 +1866,21 @@ struct drm_i915_private {
>>           struct llist_head free_list;
>>           struct work_struct free_work;
>>   -        /* The hw wants to have a stable context identifier for the
>> +        /*
>> +         * The HW wants to have a stable context identifier for the
>>            * lifetime of the context (for OA, PASID, faults, etc).
>>            * This is limited in execlists to 21 bits.
>> +         * In enhanced execlist (GEN11+) this is limited to 11 bits
>> +         * (the SW Context ID field) but GuC limits it a bit further
>> +         * (11 bits - 16) due to some entries being reserved for future
>> +         * use (so the firmware only supports a GuC stage descriptor
>> +         * pool of 2032 entries).
>>            */
>>           struct ida hw_ida;
>> -#define MAX_CONTEXT_HW_ID (1<<21) /* exclusive */
>> -#define MAX_GUC_CONTEXT_HW_ID (1 << 20) /* exclusive */
>> -#define GEN11_MAX_CONTEXT_HW_ID (1<<11) /* exclusive */
>> +#define MAX_CONTEXT_HW_ID            (1 << 21) /* exclusive */
>> +#define MAX_GUC_CONTEXT_HW_ID            (1 << 20) /* exclusive */
>> +#define GEN11_MAX_CONTEXT_HW_ID            (1 << 11) /* exclusive */
>> +#define GEN11_MAX_CONTEXT_HW_ID_WITH_GUC (GEN11_MAX_CONTEXT_HW_ID - 16)
>>       } contexts;
>>         u32 fdi_rx_config;
>> diff --git a/drivers/gpu/drm/i915/i915_gem_context.c 
>> b/drivers/gpu/drm/i915/i915_gem_context.c
>> index f15a039..e3b500c 100644
>> --- a/drivers/gpu/drm/i915/i915_gem_context.c
>> +++ b/drivers/gpu/drm/i915/i915_gem_context.c
>> @@ -209,7 +209,10 @@ static int assign_hw_id(struct drm_i915_private 
>> *dev_priv, unsigned *out)
>>       unsigned int max;
>>         if (INTEL_GEN(dev_priv) >= 11) {
>> -        max = GEN11_MAX_CONTEXT_HW_ID;
>> +        if (USES_GUC_SUBMISSION(dev_priv))
>> +            max = GEN11_MAX_CONTEXT_HW_ID_WITH_GUC;
>> +        else
>> +            max = GEN11_MAX_CONTEXT_HW_ID;
>>       } else {
>>           /*
>>            * When using GuC in proxy submission, GuC consumes the
>> diff --git a/drivers/gpu/drm/i915/i915_gem_context.h 
>> b/drivers/gpu/drm/i915/i915_gem_context.h
>> index 851dad6..4b87f5d 100644
>> --- a/drivers/gpu/drm/i915/i915_gem_context.h
>> +++ b/drivers/gpu/drm/i915/i915_gem_context.h
>> @@ -154,6 +154,8 @@ struct i915_gem_context {
>>           struct intel_ring *ring;
>>           u32 *lrc_reg_state;
>>           u64 lrc_desc;
>> +        u32 sw_context_id;
>> +        u32 sw_counter;
>>           int pin_count;
>>             const struct intel_context_ops *ops;
>> diff --git a/drivers/gpu/drm/i915/i915_reg.h 
>> b/drivers/gpu/drm/i915/i915_reg.h
>> index f232178..ea65d7b 100644
>> --- a/drivers/gpu/drm/i915/i915_reg.h
>> +++ b/drivers/gpu/drm/i915/i915_reg.h
>> @@ -3900,6 +3900,8 @@ enum {
>>   #define GEN8_CTX_ID_WIDTH 21
>>   #define GEN11_SW_CTX_ID_SHIFT 37
>>   #define GEN11_SW_CTX_ID_WIDTH 11
>> +#define GEN11_SW_COUNTER_SHIFT 55
>> +#define GEN11_SW_COUNTER_WIDTH 6
>>   #define GEN11_ENGINE_CLASS_SHIFT 61
>>   #define GEN11_ENGINE_CLASS_WIDTH 3
>>   #define GEN11_ENGINE_INSTANCE_SHIFT 48
>> diff --git a/drivers/gpu/drm/i915/intel_lrc.c 
>> b/drivers/gpu/drm/i915/intel_lrc.c
>> index f4b9972..3001a14 100644
>> --- a/drivers/gpu/drm/i915/intel_lrc.c
>> +++ b/drivers/gpu/drm/i915/intel_lrc.c
>> @@ -240,14 +240,15 @@ static inline bool need_preempt(const struct 
>> intel_engine_cs *engine,
>>        * anything below.
>>        */
>>       if (INTEL_GEN(ctx->i915) >= 11) {
>
>
> Hey Michal,
>
> There is a comment just above the if () about updating i915_perf.c
> (oa_get_render_ctx_id) whenever the descriptor layout is updated.
> Otherwise this will break part of the observability feature.
> You can verify this with the IGT test: tests/perf
> --run-subtest=gen8-unprivileged-single-ctx-counters
>
> Thanks a lot,
>
> -
> Lionel
Tested on KBL; works fine for both enable_guc=2 and enable_guc=3.

./tests/perf --run-subtest=gen8-unprivileged-single-ctx-counters
IGT-Version: 1.22-g11db680 (x86_64) (Linux: 4.19.0-rc1tli+ x86_64)
Subtest gen8-unprivileged-single-ctx-counters: SUCCESS (0,058s)

-Tomasz
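
For reference, a minimal sketch of the ID budget and field packing described in the commit message, assuming only the constants quoted in this patch. max_hw_id() and pack_sw_fields() are hypothetical helpers written for illustration. They are not the driver's assign_hw_id() or descriptor-building code:

```c
#include <assert.h>
#include <stdint.h>

/* Constants as quoted in the i915_drv.h and i915_reg.h hunks. */
#define GEN11_SW_CTX_ID_SHIFT             37
#define GEN11_SW_CTX_ID_WIDTH             11
#define GEN11_SW_COUNTER_SHIFT            55
#define GEN11_SW_COUNTER_WIDTH            6
#define GEN11_MAX_CONTEXT_HW_ID           (1 << 11)	/* exclusive */
#define GEN11_MAX_CONTEXT_HW_ID_WITH_GUC  (GEN11_MAX_CONTEXT_HW_ID - 16)

/* Hypothetical helper: exclusive upper bound for hw_id on Gen11. */
static unsigned int max_hw_id(int uses_guc_submission)
{
	return uses_guc_submission ? GEN11_MAX_CONTEXT_HW_ID_WITH_GUC
				   : GEN11_MAX_CONTEXT_HW_ID;
}

/*
 * Hypothetical helper: place the SW Context ID (bits 37-47) and the
 * SW Counter (bits 55-60) into a 64-bit descriptor value.
 */
static uint64_t pack_sw_fields(uint32_t sw_context_id, uint32_t sw_counter)
{
	assert(sw_context_id < (1u << GEN11_SW_CTX_ID_WIDTH));
	assert(sw_counter < (1u << GEN11_SW_COUNTER_WIDTH));

	return ((uint64_t)sw_context_id << GEN11_SW_CTX_ID_SHIFT) |
	       ((uint64_t)sw_counter << GEN11_SW_COUNTER_SHIFT);
}
```

With the engine instance stored in the SW counter, as the patch does, two engines of the same class that share one sw_context_id still produce distinct values, which is the uniqueness property the commit message asks for.
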
>> -        GEM_BUG_ON(ctx->hw_id >= BIT(GEN11_SW_CTX_ID_WIDTH));
>> -        desc |= (u64)ctx->hw_id << GEN11_SW_CTX_ID_SHIFT;
>> +        GEM_BUG_ON(ce->sw_context_id >= BIT(GEN11_SW_CTX_ID_WIDTH));
>> +        desc |= (u64)ce->sw_context_id << GEN11_SW_CTX_ID_SHIFT;
>>                                   /* bits 37-47 */
>>             desc |= (u64)engine->instance << 
>> GEN11_ENGINE_INSTANCE_SHIFT;
>>                                   /* bits 48-53 */
>>   -        /* TODO: decide what to do with SW counter (bits 55-60) */
>> +        desc |= (u64)ce->sw_counter << GEN11_SW_COUNTER_SHIFT;
>> +                                /* bits 55-60 */
>>             /*
>>            * Although GuC will never see this upper part as it fills
>> @@ -2771,6 +2772,11 @@ static int 
>> execlists_context_deferred_alloc(struct i915_gem_context *ctx,
>>       ce->ring = ring;
>>       ce->state = vma;
>>   +    if (INTEL_GEN(ctx->i915) >= 11) {
>> +        ce->sw_context_id = ctx->hw_id;
>> +        ce->sw_counter = engine->instance;
>> +    }
>> +
>>       return 0;
>>     error_ring_free:
>
>

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 06/21] drm/i915/guc: Use guc_class instead of engine_class in fw interface
  2018-08-30 13:29       ` Lis, Tomasz
@ 2018-08-30 14:16         ` Lis, Tomasz
  2018-08-30 14:56         ` Lionel Landwerlin
  1 sibling, 0 replies; 49+ messages in thread
From: Lis, Tomasz @ 2018-08-30 14:16 UTC (permalink / raw)
  To: Lionel Landwerlin, Michel Thierry, Michal Wajdeczko, intel-gfx
  Cc: Lucas De Marchi, Rodrigo Vivi

Uhh, sorry - answered on wrong patch.

Please ignore this one.

-Tomasz

On 2018-08-30 15:29, Lis, Tomasz wrote:
>
>
> On 2018-08-30 02:16, Lionel Landwerlin wrote:
>> On 29/08/2018 20:58, Michel Thierry wrote:
>>> +Lionel
>>> (please see below as this touches the lrca format & relates to OA 
>>> reporting too)
>>>
>>> On 8/29/2018 12:10 PM, Michal Wajdeczko wrote:
>>>> Until now the GuC and HW engine classes have been the same, which allowed
>>>> us to use them interchangeably. But it is better to start doing the
>>>> right thing and use the GuC definitions for the firmware interface.
>>>>
>>>> We also keep the same class id in the ctx descriptor to be able to 
>>>> have
>>>> the same values in the driver and firmware logs.
>>>>
>>>> Signed-off-by: Michel Thierry <michel.thierry@intel.com>
>>>> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
>>>> Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
>>>> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
>>>> Cc: Michel Thierry <michel.thierry@intel.com>
>>>> Cc: Lucas De Marchi <lucas.demarchi@intel.com>
>>>> Cc: Tomasz Lis <tomasz.lis@intel.com>
> Tested-by: Tomasz Lis <tomasz.lis@intel.com>
>>>> ---
>>>>   drivers/gpu/drm/i915/intel_engine_cs.c  | 13 +++++++++++++
>>>>   drivers/gpu/drm/i915/intel_guc_fwif.h   |  7 +++++++
>>>>   drivers/gpu/drm/i915/intel_lrc.c        | 10 +++++++++-
>>>>   drivers/gpu/drm/i915/intel_ringbuffer.h |  2 ++
>>>>   4 files changed, 31 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/drivers/gpu/drm/i915/intel_engine_cs.c 
>>>> b/drivers/gpu/drm/i915/intel_engine_cs.c
>>>> index 1a34e8f..bc81354 100644
>>>> --- a/drivers/gpu/drm/i915/intel_engine_cs.c
>>>> +++ b/drivers/gpu/drm/i915/intel_engine_cs.c
>>>> @@ -85,6 +85,7 @@ struct engine_info {
>>>>       unsigned int hw_id;
>>>>       unsigned int uabi_id;
>>>>       u8 class;
>>>> +    u8 guc_class;
>>>>       u8 instance;
>>>>       /* mmio bases table *must* be sorted in reverse gen order */
>>>>       struct engine_mmio_base {
>>>> @@ -98,6 +99,7 @@ struct engine_info {
>>>>           .hw_id = RCS_HW,
>>>>           .uabi_id = I915_EXEC_RENDER,
>>>>           .class = RENDER_CLASS,
>>>> +        .guc_class = GUC_RENDER_CLASS,
>>>>           .instance = 0,
>>>>           .mmio_bases = {
>>>>               { .gen = 1, .base = RENDER_RING_BASE }
>>>> @@ -107,6 +109,7 @@ struct engine_info {
>>>>           .hw_id = BCS_HW,
>>>>           .uabi_id = I915_EXEC_BLT,
>>>>           .class = COPY_ENGINE_CLASS,
>>>> +        .guc_class = GUC_BLITTER_CLASS,
>>>>           .instance = 0,
>>>>           .mmio_bases = {
>>>>               { .gen = 6, .base = BLT_RING_BASE }
>>>> @@ -116,6 +119,7 @@ struct engine_info {
>>>>           .hw_id = VCS_HW,
>>>>           .uabi_id = I915_EXEC_BSD,
>>>>           .class = VIDEO_DECODE_CLASS,
>>>> +        .guc_class = GUC_VIDEO_CLASS,
>>>>           .instance = 0,
>>>>           .mmio_bases = {
>>>>               { .gen = 11, .base = GEN11_BSD_RING_BASE },
>>>> @@ -127,6 +131,7 @@ struct engine_info {
>>>>           .hw_id = VCS2_HW,
>>>>           .uabi_id = I915_EXEC_BSD,
>>>>           .class = VIDEO_DECODE_CLASS,
>>>> +        .guc_class = GUC_VIDEO_CLASS,
>>>>           .instance = 1,
>>>>           .mmio_bases = {
>>>>               { .gen = 11, .base = GEN11_BSD2_RING_BASE },
>>>> @@ -137,6 +142,7 @@ struct engine_info {
>>>>           .hw_id = VCS3_HW,
>>>>           .uabi_id = I915_EXEC_BSD,
>>>>           .class = VIDEO_DECODE_CLASS,
>>>> +        .guc_class = GUC_VIDEO_CLASS,
>>>>           .instance = 2,
>>>>           .mmio_bases = {
>>>>               { .gen = 11, .base = GEN11_BSD3_RING_BASE }
>>>> @@ -146,6 +152,7 @@ struct engine_info {
>>>>           .hw_id = VCS4_HW,
>>>>           .uabi_id = I915_EXEC_BSD,
>>>>           .class = VIDEO_DECODE_CLASS,
>>>> +        .guc_class = GUC_VIDEO_CLASS,
>>>>           .instance = 3,
>>>>           .mmio_bases = {
>>>>               { .gen = 11, .base = GEN11_BSD4_RING_BASE }
>>>> @@ -155,6 +162,7 @@ struct engine_info {
>>>>           .hw_id = VECS_HW,
>>>>           .uabi_id = I915_EXEC_VEBOX,
>>>>           .class = VIDEO_ENHANCEMENT_CLASS,
>>>> +        .guc_class = GUC_VIDEOENHANCE_CLASS,
>>>>           .instance = 0,
>>>>           .mmio_bases = {
>>>>               { .gen = 11, .base = GEN11_VEBOX_RING_BASE },
>>>> @@ -165,6 +173,7 @@ struct engine_info {
>>>>           .hw_id = VECS2_HW,
>>>>           .uabi_id = I915_EXEC_VEBOX,
>>>>           .class = VIDEO_ENHANCEMENT_CLASS,
>>>> +        .guc_class = GUC_VIDEOENHANCE_CLASS,
>>>>           .instance = 1,
>>>>           .mmio_bases = {
>>>>               { .gen = 11, .base = GEN11_VEBOX2_RING_BASE }
>>>> @@ -276,6 +285,9 @@ static void __sprint_engine_name(char *name, 
>>>> const struct engine_info *info)
>>>>       if (GEM_WARN_ON(info->class > MAX_ENGINE_CLASS))
>>>>           return -EINVAL;
>>>>   +    if (GEM_WARN_ON(info->guc_class >= GUC_MAX_ENGINE_CLASSES))
>>>> +        return -EINVAL;
>>>> +
>>>>       if (GEM_WARN_ON(info->instance > MAX_ENGINE_INSTANCE))
>>>>           return -EINVAL;
>>>>   @@ -291,6 +303,7 @@ static void __sprint_engine_name(char *name, 
>>>> const struct engine_info *info)
>>>>       engine->i915 = dev_priv;
>>>>       __sprint_engine_name(engine->name, info);
>>>>       engine->hw_id = engine->guc_id = info->hw_id;
>>>> +    engine->guc_class = info->guc_class;
>>>>       engine->mmio_base = __engine_mmio_base(dev_priv, 
>>>> info->mmio_bases);
>>>>       engine->class = info->class;
>>>>       engine->instance = info->instance;
>>>> diff --git a/drivers/gpu/drm/i915/intel_guc_fwif.h 
>>>> b/drivers/gpu/drm/i915/intel_guc_fwif.h
>>>> index 963da91..5b7a05b 100644
>>>> --- a/drivers/gpu/drm/i915/intel_guc_fwif.h
>>>> +++ b/drivers/gpu/drm/i915/intel_guc_fwif.h
>>>> @@ -39,6 +39,13 @@
>>>>   #define GUC_VIDEO_ENGINE2        4
>>>>   #define GUC_MAX_ENGINES_NUM        (GUC_VIDEO_ENGINE2 + 1)
>>>>   +#define GUC_RENDER_CLASS    0
>>>> +#define GUC_VIDEO_CLASS        1
>>>> +#define GUC_VIDEOENHANCE_CLASS    2
>>>> +#define GUC_BLITTER_CLASS    3
>>>> +#define GUC_RESERVED_CLASS    4
>>>> +#define GUC_MAX_ENGINE_CLASSES    (GUC_RESERVED_CLASS + 1)
>>>> +
>>>>   /* Work queue item header definitions */
>>>>   #define WQ_STATUS_ACTIVE        1
>>>>   #define WQ_STATUS_SUSPENDED        2
>>>> diff --git a/drivers/gpu/drm/i915/intel_lrc.c 
>>>> b/drivers/gpu/drm/i915/intel_lrc.c
>>>> index f8ceb9c..f4b9972 100644
>>>> --- a/drivers/gpu/drm/i915/intel_lrc.c
>>>> +++ b/drivers/gpu/drm/i915/intel_lrc.c
>>>> @@ -249,7 +249,15 @@ static inline bool need_preempt(const struct 
>>>> intel_engine_cs *engine,
>>>>             /* TODO: decide what to do with SW counter (bits 55-60) */
>>>>   -        desc |= (u64)engine->class << GEN11_ENGINE_CLASS_SHIFT;
>>>> +        /*
>>>> +         * Although GuC will never see this upper part as it fills
>>>> +         * its own descriptor, using the guc_class here will help 
>>>> keep
>>>> +         * the i915 and firmware logs in sync.
>>>> +         */
>>>> +        if (HAS_GUC_SCHED(ctx->i915))
>>>> +            desc |= (u64)engine->guc_class << 
>>>> GEN11_ENGINE_CLASS_SHIFT;
>>>> +        else
>>>> +            desc |= (u64)engine->class << GEN11_ENGINE_CLASS_SHIFT;
>>>>                                   /* bits 61-63 */
>>>
>>> OA also uses this upper part (see oa_get_render_ctx_id), so it's 
>>> something to be aware of.
>>>
>>> My opinion is that it's useful to have the lrca matching the 
>>> firmware logs, but OA should account of this change at it receives 
>>> what the fw sent to the hw. Which one is more important is for 
>>> others to decide (plus it only becomes a problem when engine-class 
>>> and guc-class start to deviate).
>>>
>>> Acked-by: Michel Thierry <michel.thierry@intel.com>
>>>
>>> -Michel
>>
>>
>> If GuC still behaves the same as the Gen9 firmware I was testing 
>> with, parts of the upper 32bits of the descriptor will end up in HW.
>>
>> Just make sure i915_perf.c is in sync with intel_lrc.c and it should 
>> be fine :)
>>
> Tested on KBL; works fine for both enable_guc=2 and enable_guc=3.
>
> ./tests/perf --run-subtest=gen8-unprivileged-single-ctx-counters
> IGT-Version: 1.22-g11db680 (x86_64) (Linux: 4.19.0-rc1tli+ x86_64)
> Subtest gen8-unprivileged-single-ctx-counters: SUCCESS (0,058s)
>
> -Tomasz
>>
>>>
>>>
>>>>       } else {
>>>>           GEM_BUG_ON(ctx->hw_id >= BIT(GEN8_CTX_ID_WIDTH));
>>>> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h 
>>>> b/drivers/gpu/drm/i915/intel_ringbuffer.h
>>>> index 3f6920d..f47009f 100644
>>>> --- a/drivers/gpu/drm/i915/intel_ringbuffer.h
>>>> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
>>>> @@ -350,7 +350,9 @@ struct intel_engine_cs {
>>>>         enum intel_engine_id id;
>>>>       unsigned int hw_id;
>>>> +
>>>>       unsigned int guc_id;
>>>> +    u8 guc_class;
>>>>         u8 uabi_id;
>>>>       u8 uabi_class;
>>>>
>>>
>>
>

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 06/21] drm/i915/guc: Use guc_class instead of engine_class in fw interface
  2018-08-30 13:29       ` Lis, Tomasz
  2018-08-30 14:16         ` Lis, Tomasz
@ 2018-08-30 14:56         ` Lionel Landwerlin
  1 sibling, 0 replies; 49+ messages in thread
From: Lionel Landwerlin @ 2018-08-30 14:56 UTC (permalink / raw)
  To: Lis, Tomasz, Michel Thierry, Michal Wajdeczko, intel-gfx
  Cc: Lucas De Marchi, Rodrigo Vivi

On 30/08/2018 14:29, Lis, Tomasz wrote:
>
>
> On 2018-08-30 02:16, Lionel Landwerlin wrote:
>> On 29/08/2018 20:58, Michel Thierry wrote:
>>> +Lionel
>>> (please see below as this touches the lrca format & relates to OA 
>>> reporting too)
>>>
>>> On 8/29/2018 12:10 PM, Michal Wajdeczko wrote:
>>>> Until now the GuC and HW engine class has been the same, which allowed
>>>> us to use them interchangeable. But it is better to start doing the
>>>> right thing and use the GuC definitions for the firmware interface.
>>>>
>>>> We also keep the same class id in the ctx descriptor to be able to 
>>>> have
>>>> the same values in the driver and firmware logs.
>>>>
>>>> Signed-off-by: Michel Thierry <michel.thierry@intel.com>
>>>> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
>>>> Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
>>>> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
>>>> Cc: Michel Thierry <michel.thierry@intel.com>
>>>> Cc: Lucas De Marchi <lucas.demarchi@intel.com>
>>>> Cc: Tomasz Lis <tomasz.lis@intel.com>
> Tested-by: Tomasz Lis <tomasz.lis@intel.com>
>>>> ---
>>>>   drivers/gpu/drm/i915/intel_engine_cs.c  | 13 +++++++++++++
>>>>   drivers/gpu/drm/i915/intel_guc_fwif.h   |  7 +++++++
>>>>   drivers/gpu/drm/i915/intel_lrc.c        | 10 +++++++++-
>>>>   drivers/gpu/drm/i915/intel_ringbuffer.h |  2 ++
>>>>   4 files changed, 31 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/drivers/gpu/drm/i915/intel_engine_cs.c 
>>>> b/drivers/gpu/drm/i915/intel_engine_cs.c
>>>> index 1a34e8f..bc81354 100644
>>>> --- a/drivers/gpu/drm/i915/intel_engine_cs.c
>>>> +++ b/drivers/gpu/drm/i915/intel_engine_cs.c
>>>> @@ -85,6 +85,7 @@ struct engine_info {
>>>>       unsigned int hw_id;
>>>>       unsigned int uabi_id;
>>>>       u8 class;
>>>> +    u8 guc_class;
>>>>       u8 instance;
>>>>       /* mmio bases table *must* be sorted in reverse gen order */
>>>>       struct engine_mmio_base {
>>>> @@ -98,6 +99,7 @@ struct engine_info {
>>>>           .hw_id = RCS_HW,
>>>>           .uabi_id = I915_EXEC_RENDER,
>>>>           .class = RENDER_CLASS,
>>>> +        .guc_class = GUC_RENDER_CLASS,
>>>>           .instance = 0,
>>>>           .mmio_bases = {
>>>>               { .gen = 1, .base = RENDER_RING_BASE }
>>>> @@ -107,6 +109,7 @@ struct engine_info {
>>>>           .hw_id = BCS_HW,
>>>>           .uabi_id = I915_EXEC_BLT,
>>>>           .class = COPY_ENGINE_CLASS,
>>>> +        .guc_class = GUC_BLITTER_CLASS,
>>>>           .instance = 0,
>>>>           .mmio_bases = {
>>>>               { .gen = 6, .base = BLT_RING_BASE }
>>>> @@ -116,6 +119,7 @@ struct engine_info {
>>>>           .hw_id = VCS_HW,
>>>>           .uabi_id = I915_EXEC_BSD,
>>>>           .class = VIDEO_DECODE_CLASS,
>>>> +        .guc_class = GUC_VIDEO_CLASS,
>>>>           .instance = 0,
>>>>           .mmio_bases = {
>>>>               { .gen = 11, .base = GEN11_BSD_RING_BASE },
>>>> @@ -127,6 +131,7 @@ struct engine_info {
>>>>           .hw_id = VCS2_HW,
>>>>           .uabi_id = I915_EXEC_BSD,
>>>>           .class = VIDEO_DECODE_CLASS,
>>>> +        .guc_class = GUC_VIDEO_CLASS,
>>>>           .instance = 1,
>>>>           .mmio_bases = {
>>>>               { .gen = 11, .base = GEN11_BSD2_RING_BASE },
>>>> @@ -137,6 +142,7 @@ struct engine_info {
>>>>           .hw_id = VCS3_HW,
>>>>           .uabi_id = I915_EXEC_BSD,
>>>>           .class = VIDEO_DECODE_CLASS,
>>>> +        .guc_class = GUC_VIDEO_CLASS,
>>>>           .instance = 2,
>>>>           .mmio_bases = {
>>>>               { .gen = 11, .base = GEN11_BSD3_RING_BASE }
>>>> @@ -146,6 +152,7 @@ struct engine_info {
>>>>           .hw_id = VCS4_HW,
>>>>           .uabi_id = I915_EXEC_BSD,
>>>>           .class = VIDEO_DECODE_CLASS,
>>>> +        .guc_class = GUC_VIDEO_CLASS,
>>>>           .instance = 3,
>>>>           .mmio_bases = {
>>>>               { .gen = 11, .base = GEN11_BSD4_RING_BASE }
>>>> @@ -155,6 +162,7 @@ struct engine_info {
>>>>           .hw_id = VECS_HW,
>>>>           .uabi_id = I915_EXEC_VEBOX,
>>>>           .class = VIDEO_ENHANCEMENT_CLASS,
>>>> +        .guc_class = GUC_VIDEOENHANCE_CLASS,
>>>>           .instance = 0,
>>>>           .mmio_bases = {
>>>>               { .gen = 11, .base = GEN11_VEBOX_RING_BASE },
>>>> @@ -165,6 +173,7 @@ struct engine_info {
>>>>           .hw_id = VECS2_HW,
>>>>           .uabi_id = I915_EXEC_VEBOX,
>>>>           .class = VIDEO_ENHANCEMENT_CLASS,
>>>> +        .guc_class = GUC_VIDEOENHANCE_CLASS,
>>>>           .instance = 1,
>>>>           .mmio_bases = {
>>>>               { .gen = 11, .base = GEN11_VEBOX2_RING_BASE }
>>>> @@ -276,6 +285,9 @@ static void __sprint_engine_name(char *name, 
>>>> const struct engine_info *info)
>>>>       if (GEM_WARN_ON(info->class > MAX_ENGINE_CLASS))
>>>>           return -EINVAL;
>>>>   +    if (GEM_WARN_ON(info->guc_class >= GUC_MAX_ENGINE_CLASSES))
>>>> +        return -EINVAL;
>>>> +
>>>>       if (GEM_WARN_ON(info->instance > MAX_ENGINE_INSTANCE))
>>>>           return -EINVAL;
>>>>   @@ -291,6 +303,7 @@ static void __sprint_engine_name(char *name, 
>>>> const struct engine_info *info)
>>>>       engine->i915 = dev_priv;
>>>>       __sprint_engine_name(engine->name, info);
>>>>       engine->hw_id = engine->guc_id = info->hw_id;
>>>> +    engine->guc_class = info->guc_class;
>>>>       engine->mmio_base = __engine_mmio_base(dev_priv, 
>>>> info->mmio_bases);
>>>>       engine->class = info->class;
>>>>       engine->instance = info->instance;
>>>> diff --git a/drivers/gpu/drm/i915/intel_guc_fwif.h 
>>>> b/drivers/gpu/drm/i915/intel_guc_fwif.h
>>>> index 963da91..5b7a05b 100644
>>>> --- a/drivers/gpu/drm/i915/intel_guc_fwif.h
>>>> +++ b/drivers/gpu/drm/i915/intel_guc_fwif.h
>>>> @@ -39,6 +39,13 @@
>>>>   #define GUC_VIDEO_ENGINE2        4
>>>>   #define GUC_MAX_ENGINES_NUM        (GUC_VIDEO_ENGINE2 + 1)
>>>>   +#define GUC_RENDER_CLASS    0
>>>> +#define GUC_VIDEO_CLASS        1
>>>> +#define GUC_VIDEOENHANCE_CLASS    2
>>>> +#define GUC_BLITTER_CLASS    3
>>>> +#define GUC_RESERVED_CLASS    4
>>>> +#define GUC_MAX_ENGINE_CLASSES    (GUC_RESERVED_CLASS + 1)
>>>> +
>>>>   /* Work queue item header definitions */
>>>>   #define WQ_STATUS_ACTIVE        1
>>>>   #define WQ_STATUS_SUSPENDED        2
>>>> diff --git a/drivers/gpu/drm/i915/intel_lrc.c 
>>>> b/drivers/gpu/drm/i915/intel_lrc.c
>>>> index f8ceb9c..f4b9972 100644
>>>> --- a/drivers/gpu/drm/i915/intel_lrc.c
>>>> +++ b/drivers/gpu/drm/i915/intel_lrc.c
>>>> @@ -249,7 +249,15 @@ static inline bool need_preempt(const struct 
>>>> intel_engine_cs *engine,
>>>>             /* TODO: decide what to do with SW counter (bits 55-60) */
>>>>   -        desc |= (u64)engine->class << GEN11_ENGINE_CLASS_SHIFT;
>>>> +        /*
>>>> +         * Although GuC will never see this upper part as it fills
>>>> +         * its own descriptor, using the guc_class here will help 
>>>> keep
>>>> +         * the i915 and firmware logs in sync.
>>>> +         */
>>>> +        if (HAS_GUC_SCHED(ctx->i915))
>>>> +            desc |= (u64)engine->guc_class << 
>>>> GEN11_ENGINE_CLASS_SHIFT;
>>>> +        else
>>>> +            desc |= (u64)engine->class << GEN11_ENGINE_CLASS_SHIFT;
>>>>                                   /* bits 61-63 */
>>>
>>> OA also uses this upper part (see oa_get_render_ctx_id), so it's 
>>> something to be aware of.
>>>
>>> My opinion is that it's useful to have the lrca matching the 
>>> firmware logs, but OA should account of this change at it receives 
>>> what the fw sent to the hw. Which one is more important is for 
>>> others to decide (plus it only becomes a problem when engine-class 
>>> and guc-class start to deviate).
>>>
>>> Acked-by: Michel Thierry <michel.thierry@intel.com>
>>>
>>> -Michel
>>
>>
>> If GuC still behaves the same as the Gen9 firmware I was testing 
>> with, parts of the upper 32bits of the descriptor will end up in HW.
>>
>> Just make sure i915_perf.c is in sync with intel_lrc.c and it should 
>> be fine :)
>>
> Tested on KBL; works fine for both enable_guc=2 and enable_guc=3.
>
> ./tests/perf --run-subtest=gen8-unprivileged-single-ctx-counters
> IGT-Version: 1.22-g11db680 (x86_64) (Linux: 4.19.0-rc1tli+ x86_64)
> Subtest gen8-unprivileged-single-ctx-counters: SUCCESS (0,058s)


But this patch affects Gen11, so I think that is where you need to test, 
not on KBL.


>
> -Tomasz
>>
>>>
>>>
>>>>       } else {
>>>>           GEM_BUG_ON(ctx->hw_id >= BIT(GEN8_CTX_ID_WIDTH));
>>>> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h 
>>>> b/drivers/gpu/drm/i915/intel_ringbuffer.h
>>>> index 3f6920d..f47009f 100644
>>>> --- a/drivers/gpu/drm/i915/intel_ringbuffer.h
>>>> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
>>>> @@ -350,7 +350,9 @@ struct intel_engine_cs {
>>>>         enum intel_engine_id id;
>>>>       unsigned int hw_id;
>>>> +
>>>>       unsigned int guc_id;
>>>> +    u8 guc_class;
>>>>         u8 uabi_id;
>>>>       u8 uabi_class;
>>>>
>>>
>>
>
>

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 06/21] drm/i915/guc: Use guc_class instead of engine_class in fw interface
  2018-08-30  0:16     ` Lionel Landwerlin
  2018-08-30 13:29       ` Lis, Tomasz
@ 2018-08-30 22:34       ` Daniele Ceraolo Spurio
  1 sibling, 0 replies; 49+ messages in thread
From: Daniele Ceraolo Spurio @ 2018-08-30 22:34 UTC (permalink / raw)
  To: Lionel Landwerlin, Michel Thierry, Michal Wajdeczko, intel-gfx
  Cc: Lucas De Marchi, Rodrigo Vivi



On 29/08/18 17:16, Lionel Landwerlin wrote:
> On 29/08/2018 20:58, Michel Thierry wrote:
>> +Lionel
>> (please see below as this touches the lrca format & relates to OA 
>> reporting too)
>>
>> On 8/29/2018 12:10 PM, Michal Wajdeczko wrote:
>>> Until now the GuC and HW engine class has been the same, which allowed
>>> us to use them interchangeable. But it is better to start doing the
>>> right thing and use the GuC definitions for the firmware interface.
>>>
>>> We also keep the same class id in the ctx descriptor to be able to have
>>> the same values in the driver and firmware logs.
>>>
>>> Signed-off-by: Michel Thierry <michel.thierry@intel.com>
>>> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
>>> Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
>>> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
>>> Cc: Michel Thierry <michel.thierry@intel.com>
>>> Cc: Lucas De Marchi <lucas.demarchi@intel.com>
>>> Cc: Tomasz Lis <tomasz.lis@intel.com>
>>> ---
>>>   drivers/gpu/drm/i915/intel_engine_cs.c  | 13 +++++++++++++
>>>   drivers/gpu/drm/i915/intel_guc_fwif.h   |  7 +++++++
>>>   drivers/gpu/drm/i915/intel_lrc.c        | 10 +++++++++-
>>>   drivers/gpu/drm/i915/intel_ringbuffer.h |  2 ++
>>>   4 files changed, 31 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/gpu/drm/i915/intel_engine_cs.c 
>>> b/drivers/gpu/drm/i915/intel_engine_cs.c
>>> index 1a34e8f..bc81354 100644
>>> --- a/drivers/gpu/drm/i915/intel_engine_cs.c
>>> +++ b/drivers/gpu/drm/i915/intel_engine_cs.c
>>> @@ -85,6 +85,7 @@ struct engine_info {
>>>       unsigned int hw_id;
>>>       unsigned int uabi_id;
>>>       u8 class;
>>> +    u8 guc_class;
>>>       u8 instance;
>>>       /* mmio bases table *must* be sorted in reverse gen order */
>>>       struct engine_mmio_base {
>>> @@ -98,6 +99,7 @@ struct engine_info {
>>>           .hw_id = RCS_HW,
>>>           .uabi_id = I915_EXEC_RENDER,
>>>           .class = RENDER_CLASS,
>>> +        .guc_class = GUC_RENDER_CLASS,
>>>           .instance = 0,
>>>           .mmio_bases = {
>>>               { .gen = 1, .base = RENDER_RING_BASE }
>>> @@ -107,6 +109,7 @@ struct engine_info {
>>>           .hw_id = BCS_HW,
>>>           .uabi_id = I915_EXEC_BLT,
>>>           .class = COPY_ENGINE_CLASS,
>>> +        .guc_class = GUC_BLITTER_CLASS,
>>>           .instance = 0,
>>>           .mmio_bases = {
>>>               { .gen = 6, .base = BLT_RING_BASE }
>>> @@ -116,6 +119,7 @@ struct engine_info {
>>>           .hw_id = VCS_HW,
>>>           .uabi_id = I915_EXEC_BSD,
>>>           .class = VIDEO_DECODE_CLASS,
>>> +        .guc_class = GUC_VIDEO_CLASS,
>>>           .instance = 0,
>>>           .mmio_bases = {
>>>               { .gen = 11, .base = GEN11_BSD_RING_BASE },
>>> @@ -127,6 +131,7 @@ struct engine_info {
>>>           .hw_id = VCS2_HW,
>>>           .uabi_id = I915_EXEC_BSD,
>>>           .class = VIDEO_DECODE_CLASS,
>>> +        .guc_class = GUC_VIDEO_CLASS,
>>>           .instance = 1,
>>>           .mmio_bases = {
>>>               { .gen = 11, .base = GEN11_BSD2_RING_BASE },
>>> @@ -137,6 +142,7 @@ struct engine_info {
>>>           .hw_id = VCS3_HW,
>>>           .uabi_id = I915_EXEC_BSD,
>>>           .class = VIDEO_DECODE_CLASS,
>>> +        .guc_class = GUC_VIDEO_CLASS,
>>>           .instance = 2,
>>>           .mmio_bases = {
>>>               { .gen = 11, .base = GEN11_BSD3_RING_BASE }
>>> @@ -146,6 +152,7 @@ struct engine_info {
>>>           .hw_id = VCS4_HW,
>>>           .uabi_id = I915_EXEC_BSD,
>>>           .class = VIDEO_DECODE_CLASS,
>>> +        .guc_class = GUC_VIDEO_CLASS,
>>>           .instance = 3,
>>>           .mmio_bases = {
>>>               { .gen = 11, .base = GEN11_BSD4_RING_BASE }
>>> @@ -155,6 +162,7 @@ struct engine_info {
>>>           .hw_id = VECS_HW,
>>>           .uabi_id = I915_EXEC_VEBOX,
>>>           .class = VIDEO_ENHANCEMENT_CLASS,
>>> +        .guc_class = GUC_VIDEOENHANCE_CLASS,
>>>           .instance = 0,
>>>           .mmio_bases = {
>>>               { .gen = 11, .base = GEN11_VEBOX_RING_BASE },
>>> @@ -165,6 +173,7 @@ struct engine_info {
>>>           .hw_id = VECS2_HW,
>>>           .uabi_id = I915_EXEC_VEBOX,
>>>           .class = VIDEO_ENHANCEMENT_CLASS,
>>> +        .guc_class = GUC_VIDEOENHANCE_CLASS,
>>>           .instance = 1,
>>>           .mmio_bases = {
>>>               { .gen = 11, .base = GEN11_VEBOX2_RING_BASE }
>>> @@ -276,6 +285,9 @@ static void __sprint_engine_name(char *name, 
>>> const struct engine_info *info)
>>>       if (GEM_WARN_ON(info->class > MAX_ENGINE_CLASS))
>>>           return -EINVAL;
>>>   +    if (GEM_WARN_ON(info->guc_class >= GUC_MAX_ENGINE_CLASSES))
>>> +        return -EINVAL;
>>> +
>>>       if (GEM_WARN_ON(info->instance > MAX_ENGINE_INSTANCE))
>>>           return -EINVAL;
>>>   @@ -291,6 +303,7 @@ static void __sprint_engine_name(char *name, 
>>> const struct engine_info *info)
>>>       engine->i915 = dev_priv;
>>>       __sprint_engine_name(engine->name, info);
>>>       engine->hw_id = engine->guc_id = info->hw_id;
>>> +    engine->guc_class = info->guc_class;
>>>       engine->mmio_base = __engine_mmio_base(dev_priv, 
>>> info->mmio_bases);
>>>       engine->class = info->class;
>>>       engine->instance = info->instance;
>>> diff --git a/drivers/gpu/drm/i915/intel_guc_fwif.h 
>>> b/drivers/gpu/drm/i915/intel_guc_fwif.h
>>> index 963da91..5b7a05b 100644
>>> --- a/drivers/gpu/drm/i915/intel_guc_fwif.h
>>> +++ b/drivers/gpu/drm/i915/intel_guc_fwif.h
>>> @@ -39,6 +39,13 @@
>>>   #define GUC_VIDEO_ENGINE2        4
>>>   #define GUC_MAX_ENGINES_NUM        (GUC_VIDEO_ENGINE2 + 1)
>>>   +#define GUC_RENDER_CLASS    0
>>> +#define GUC_VIDEO_CLASS        1
>>> +#define GUC_VIDEOENHANCE_CLASS    2
>>> +#define GUC_BLITTER_CLASS    3
>>> +#define GUC_RESERVED_CLASS    4
>>> +#define GUC_MAX_ENGINE_CLASSES    (GUC_RESERVED_CLASS + 1)
>>> +
>>>   /* Work queue item header definitions */
>>>   #define WQ_STATUS_ACTIVE        1
>>>   #define WQ_STATUS_SUSPENDED        2
>>> diff --git a/drivers/gpu/drm/i915/intel_lrc.c 
>>> b/drivers/gpu/drm/i915/intel_lrc.c
>>> index f8ceb9c..f4b9972 100644
>>> --- a/drivers/gpu/drm/i915/intel_lrc.c
>>> +++ b/drivers/gpu/drm/i915/intel_lrc.c
>>> @@ -249,7 +249,15 @@ static inline bool need_preempt(const struct 
>>> intel_engine_cs *engine,
>>>             /* TODO: decide what to do with SW counter (bits 55-60) */
>>>   -        desc |= (u64)engine->class << GEN11_ENGINE_CLASS_SHIFT;
>>> +        /*
>>> +         * Although GuC will never see this upper part as it fills
>>> +         * its own descriptor, using the guc_class here will help keep
>>> +         * the i915 and firmware logs in sync.
>>> +         */
>>> +        if (HAS_GUC_SCHED(ctx->i915))
>>> +            desc |= (u64)engine->guc_class << GEN11_ENGINE_CLASS_SHIFT;
>>> +        else
>>> +            desc |= (u64)engine->class << GEN11_ENGINE_CLASS_SHIFT;
>>>                                   /* bits 61-63 */
>>
>> OA also uses this upper part (see oa_get_render_ctx_id), so it's 
>> something to be aware of.
>>
>> My opinion is that it's useful to have the lrca matching the firmware 
>> logs, but OA should account of this change at it receives what the fw 
>> sent to the hw. Which one is more important is for others to decide 
>> (plus it only becomes a problem when engine-class and guc-class start 
>> to deviate).
>>
>> Acked-by: Michel Thierry <michel.thierry@intel.com>
>>
>> -Michel
> 
> 
> If GuC still behaves the same as the Gen9 firmware I was testing with, 
> parts of the upper 32bits of the descriptor will end up in HW.
> 
> Just make sure i915_perf.c is in sync with intel_lrc.c and it should be 
> fine :)
> 

I'm not sure about keeping i915_perf.c in sync with intel_lrc.c. As Michel 
said, GuC will submit to the HW using the "real" IDs (i.e. the ones used 
in the non-GuC path of intel_lrc), so I would expect those to show up in 
the perf counters, while in the GuC logs we have the "fake" ones (i.e. the 
ones using the guc_ids). In this patch intel_lrc uses the guc_ids, and I 
don't think we want that for perf.
The IDs are currently the same in both paths, so testing is not going to 
show any issue with this until they diverge.
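
To make the concern concrete, here is a minimal standalone sketch (not the 
driver code). It assumes the class field sits in bits 61-63 of the 
descriptor, as the comment in the patch suggests. Every sketch_/SKETCH_ 
name is hypothetical, and the diverged GuC class value of 4 is made up 
purely for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: class field in bits 61-63, per the comment in the patch. */
#define SKETCH_ENGINE_CLASS_SHIFT 61
#define SKETCH_ENGINE_CLASS_MASK  0x7ull

/* Pick the class the way the patched descriptor code does, then shift it in. */
static inline uint64_t sketch_desc_class_bits(uint8_t hw_class,
					      uint8_t guc_class,
					      int has_guc_sched)
{
	uint8_t cls = has_guc_sched ? guc_class : hw_class;

	return ((uint64_t)cls & SKETCH_ENGINE_CLASS_MASK)
		<< SKETCH_ENGINE_CLASS_SHIFT;
}

/* Read the class field back, as a consumer of the descriptor would. */
static inline unsigned int sketch_desc_class(uint64_t desc)
{
	return (unsigned int)((desc >> SKETCH_ENGINE_CLASS_SHIFT) &
			      SKETCH_ENGINE_CLASS_MASK);
}

static inline void sketch_self_check(void)
{
	/* Today: blitter is 3 on both sides, so both paths agree. */
	assert(sketch_desc_class(sketch_desc_class_bits(3, 3, 1)) ==
	       sketch_desc_class(sketch_desc_class_bits(3, 3, 0)));

	/* Hypothetical future: GuC renumbers the class to 4. */
	assert(sketch_desc_class(sketch_desc_class_bits(3, 4, 1)) == 4);
	assert(sketch_desc_class(sketch_desc_class_bits(3, 4, 0)) == 3);
}
```

With identical values both paths produce the same class bits, which is why 
a test run today cannot tell them apart. Only a diverged value would show 
up as a mismatch between what the driver built and what a reader of the 
"real" class expects.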

Thanks,
Daniele

> 
>>
>>
>>>       } else {
>>>           GEM_BUG_ON(ctx->hw_id >= BIT(GEN8_CTX_ID_WIDTH));
>>> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h 
>>> b/drivers/gpu/drm/i915/intel_ringbuffer.h
>>> index 3f6920d..f47009f 100644
>>> --- a/drivers/gpu/drm/i915/intel_ringbuffer.h
>>> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
>>> @@ -350,7 +350,9 @@ struct intel_engine_cs {
>>>         enum intel_engine_id id;
>>>       unsigned int hw_id;
>>> +
>>>       unsigned int guc_id;
>>> +    u8 guc_class;
>>>         u8 uabi_id;
>>>       u8 uabi_class;
>>>
>>
> 

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 02/21] drm/i915/guc: Don't allow GuC submission on pre-Gen11
  2018-08-29 19:10 ` [PATCH 02/21] drm/i915/guc: Don't allow GuC submission on pre-Gen11 Michal Wajdeczko
  2018-08-29 19:16   ` Srivatsa, Anusha
@ 2018-08-30 22:58   ` John Spotswood
  2018-09-06  8:28   ` Joonas Lahtinen
  2018-09-06  8:29   ` Joonas Lahtinen
  3 siblings, 0 replies; 49+ messages in thread
From: John Spotswood @ 2018-08-30 22:58 UTC (permalink / raw)
  To: Wajdeczko, Michal, intel-gfx; +Cc: Sundaresan, Sujaritha, Vivi, Rodrigo

On Wed, 2018-08-29 at 12:10 -0700, Wajdeczko, Michal wrote:
> Upcoming Gen11 GuC firmware requires new interface that is
> incompatible
> with existing pre-Gen11 firmwares. Updated firmwares for pre-Gen11
> will
> arrive later. In the meantime sanitize the enable_guc option so that
> we
> can enable HuC authentication but nothing else on pre-Gen11.
> 
> Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
> Cc: Michel Thierry <michel.thierry@intel.com>
> Cc: John Spotswood <john.a.spotswood@intel.com>
> Cc: Vinay Belgaumkar <vinay.belgaumkar@intel.com>
> Cc: Tony Ye <tony.ye@intel.com>
> Cc: Anusha Srivatsa <anusha.srivatsa@intel.com>
> Cc: Jeff Mcgee <jeff.mcgee@intel.com>
> Cc: Antonio Argenziano <antonio.argenziano@intel.com>
> Cc: Sujaritha Sundaresan <sujaritha.sundaresan@intel.com>

Reviewed-by: John Spotswood <john.a.spotswood@intel.com>
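
For reference, a minimal sketch of the sanitizing described in the commit 
message, as I read it. It assumes enable_guc is a bitmask with submission 
in bit 0 and HuC loading in bit 1. The SKETCH_ names and the has_guc_fw / 
has_huc_fw flags are hypothetical stand-ins, not the driver's helpers.

```c
#include <assert.h>

/* Assumed bit layout of the enable_guc modparam (illustrative only). */
#define SKETCH_ENABLE_GUC_SUBMISSION (1u << 0)
#define SKETCH_ENABLE_GUC_LOAD_HUC   (1u << 1)

/*
 * Compute the platform default: start from what the available firmware
 * allows, then drop submission on anything older than Gen11.
 */
static inline unsigned int sketch_platform_enable_guc(int gen,
						      int has_guc_fw,
						      int has_huc_fw)
{
	unsigned int enable_guc = 0;

	if (has_guc_fw)
		enable_guc |= SKETCH_ENABLE_GUC_SUBMISSION;
	if (has_huc_fw)
		enable_guc |= SKETCH_ENABLE_GUC_LOAD_HUC;

	/* Pre-Gen11: keep HuC authentication, but no GuC submission. */
	if (gen < 11)
		enable_guc &= ~SKETCH_ENABLE_GUC_SUBMISSION;

	return enable_guc;
}
```

Under these assumptions a Gen9 part with both firmwares ends up with only 
the HuC bit set, while a Gen11 part keeps both bits.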

> ---
>  drivers/gpu/drm/i915/intel_uc.c | 15 +++++++++++++++
>  1 file changed, 15 insertions(+)
> 
> diff --git a/drivers/gpu/drm/i915/intel_uc.c
> b/drivers/gpu/drm/i915/intel_uc.c
> index 7a3a4ca..185b29b 100644
> --- a/drivers/gpu/drm/i915/intel_uc.c
> +++ b/drivers/gpu/drm/i915/intel_uc.c
> @@ -63,6 +63,8 @@ static int __get_platform_enable_guc(struct
> drm_i915_private *i915)
>  		enable_guc |= ENABLE_GUC_LOAD_HUC;
>  
>  	/* Any platform specific fine-tuning can be done here */
> +	if (INTEL_GEN(i915) < 11)
> +		enable_guc &= ~ENABLE_GUC_SUBMISSION;
>  
>  	return enable_guc;
>  }
> @@ -115,6 +117,13 @@ static void sanitize_options_early(struct
> drm_i915_private *i915)
>  			 yesno(intel_uc_is_using_guc_submission()),
>  			 yesno(intel_uc_is_using_huc()));
>  
> +	/* Verify GuC submission support */
> +	if (intel_uc_is_using_guc_submission() && INTEL_GEN(i915) <
> 11) {
> +		DRM_WARN("Incompatible option detected: %s=%d,
> %s!\n",
> +			 "enable_guc", i915_modparams.enable_guc,
> +			 "submission not supported");
> +	}
> +
>  	/* Verify GuC firmware availability */
>  	if (intel_uc_is_using_guc() &&
> !intel_uc_fw_is_selected(guc_fw)) {
>  		DRM_WARN("Incompatible option detected: %s=%d,
> %s!\n",
> @@ -292,6 +301,12 @@ int intel_uc_init(struct drm_i915_private *i915)
>  		return ret;
>  
>  	if (USES_GUC_SUBMISSION(i915)) {
> +
> +		if (INTEL_GEN(i915) < 11) {
> +			intel_guc_fini(guc);
> +			return -EIO;
> +		}
> +
>  		/*
>  		 * This is stuff we need to have available at fw
> load time
>  		 * if we are planning to enable submission later

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 03/21] drm/i915/guc: Simplify preparation of GuC parameter block
  2018-08-29 19:10 ` [PATCH 03/21] drm/i915/guc: Simplify preparation of GuC parameter block Michal Wajdeczko
@ 2018-08-30 22:58   ` John Spotswood
  2018-09-06  8:32   ` Joonas Lahtinen
  1 sibling, 0 replies; 49+ messages in thread
From: John Spotswood @ 2018-08-30 22:58 UTC (permalink / raw)
  To: Wajdeczko, Michal, intel-gfx

On Wed, 2018-08-29 at 12:10 -0700, Wajdeczko, Michal wrote:
> Definition of the parameters block passed to GuC is about to change.
> Slightly refactor code now to make upcoming patch smaller.
> 
> Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> Cc: John Spotswood <john.a.spotswood@intel.com>

Reviewed-by: John Spotswood <john.a.spotswood@intel.com>

> ---
>  drivers/gpu/drm/i915/intel_guc.c | 38 +++++++++++++++++++++++-------
> --------
>  1 file changed, 23 insertions(+), 15 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/intel_guc.c
> b/drivers/gpu/drm/i915/intel_guc.c
> index 230aea6..982bcc8 100644
> --- a/drivers/gpu/drm/i915/intel_guc.c
> +++ b/drivers/gpu/drm/i915/intel_guc.c
> @@ -320,19 +320,8 @@ static u32 guc_ctl_log_params_flags(struct
> intel_guc *guc)
>  	return flags;
>  }
>  
> -/*
> - * Initialise the GuC parameter block before starting the firmware
> - * transfer. These parameters are read by the firmware on startup
> - * and cannot be changed thereafter.
> - */
> -void intel_guc_init_params(struct intel_guc *guc)
> +static void guc_prepare_params(struct intel_guc *guc, u32 *params)
>  {
> -	struct drm_i915_private *dev_priv = guc_to_i915(guc);
> -	u32 params[GUC_CTL_MAX_DWORDS];
> -	int i;
> -
> -	memset(params, 0, sizeof(params));
> -
>  	/*
>  	 * GuC ARAT increment is 10 ns. GuC default scheduler
> quantum is one
>  	 * second. This ARAR is calculated by:
> @@ -347,9 +336,12 @@ void intel_guc_init_params(struct intel_guc
> *guc)
>  	params[GUC_CTL_LOG_PARAMS]  = guc_ctl_log_params_flags(guc);
>  	params[GUC_CTL_DEBUG] = guc_ctl_debug_flags(guc);
>  	params[GUC_CTL_CTXINFO] = guc_ctl_ctxinfo_flags(guc);
> +}
>  
> -	for (i = 0; i < GUC_CTL_MAX_DWORDS; i++)
> -		DRM_DEBUG_DRIVER("param[%2d] = %#x\n", i, params[i]);
> +static void guc_write_params(struct intel_guc *guc, const u32 *params)
> +{
> +	struct drm_i915_private *dev_priv = guc_to_i915(guc);
> +	int i;
>  
>  	/*
>  	 * All SOFT_SCRATCH registers are in FORCEWAKE_BLITTER domain and
> @@ -360,12 +352,28 @@ void intel_guc_init_params(struct intel_guc *guc)
>  
>  	I915_WRITE(SOFT_SCRATCH(0), 0);
>  
> -	for (i = 0; i < GUC_CTL_MAX_DWORDS; i++)
> +	for (i = 0; i < GUC_CTL_MAX_DWORDS; i++) {
> +		DRM_DEBUG_DRIVER("param[%2d] = %#x\n", i, params[i]);
>  		I915_WRITE(SOFT_SCRATCH(1 + i), params[i]);
> +	}
>  
>  	intel_uncore_forcewake_put(dev_priv, FORCEWAKE_BLITTER);
>  }
>  
> +/*
> + * Initialise the GuC parameter block before starting the firmware
> + * transfer. These parameters are read by the firmware on startup
> + * and cannot be changed thereafter.
> + */
> +void intel_guc_init_params(struct intel_guc *guc)
> +{
> +	u32 params[GUC_CTL_MAX_DWORDS];
> +
> +	memset(params, 0, sizeof(params));
> +	guc_prepare_params(guc, params);
> +	guc_write_params(guc, params);
> +}
> +
>  int intel_guc_send_nop(struct intel_guc *guc, const u32 *action, u32 len,
>  		       u32 *response_buf, u32 response_buf_size)
>  {
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


* Re: [PATCH 04/21] drm/i915/guc: Support dual Gen9/Gen11 parameters block
  2018-08-29 19:10 ` [PATCH 04/21] drm/i915/guc: Support dual Gen9/Gen11 parameters block Michal Wajdeczko
@ 2018-08-30 22:58   ` John Spotswood
  2018-09-06  8:39   ` Joonas Lahtinen
  1 sibling, 0 replies; 49+ messages in thread
From: John Spotswood @ 2018-08-30 22:58 UTC (permalink / raw)
  To: Wajdeczko, Michal, intel-gfx; +Cc: Sundaresan, Sujaritha, Vivi, Rodrigo

On Wed, 2018-08-29 at 12:10 -0700, Wajdeczko, Michal wrote:
> Gen11 GuC boot parameter definitions differ from those previously
> used for Gen9. Try to support both definitions until new firmware
> for pre-Gen11 is available.
> 
> Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
> Cc: Michel Thierry <michel.thierry@intel.com>
> Cc: John Spotswood <john.a.spotswood@intel.com>
> Cc: Vinay Belgaumkar <vinay.belgaumkar@intel.com>
> Cc: Tony Ye <tony.ye@intel.com>
> Cc: Anusha Srivatsa <anusha.srivatsa@intel.com>
> Cc: Jeff Mcgee <jeff.mcgee@intel.com>
> Cc: Antonio Argenziano <antonio.argenziano@intel.com>
> Cc: Sujaritha Sundaresan <sujaritha.sundaresan@intel.com>

Reviewed-by: John Spotswood <john.a.spotswood@intel.com>

> ---
>  drivers/gpu/drm/i915/intel_guc.c      | 76 +++++++++++++++++++++++++----------
>  drivers/gpu/drm/i915/intel_guc_fwif.h | 59 +++++++++++--------------
>  2 files changed, 83 insertions(+), 52 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/intel_guc.c b/drivers/gpu/drm/i915/intel_guc.c
> index 982bcc8..a9c2f7b 100644
> --- a/drivers/gpu/drm/i915/intel_guc.c
> +++ b/drivers/gpu/drm/i915/intel_guc.c
> @@ -230,14 +230,7 @@ void intel_guc_fini(struct intel_guc *guc)
>  static u32 guc_ctl_debug_flags(struct intel_guc *guc)
>  {
>  	u32 level = intel_guc_log_get_level(&guc->log);
> -	u32 flags;
> -	u32 ads;
> -
> -	ads = intel_guc_ggtt_offset(guc, guc->ads_vma) >> PAGE_SHIFT;
> -	flags = ads << GUC_ADS_ADDR_SHIFT | GUC_ADS_ENABLED;
> -
> -	if (!GUC_LOG_LEVEL_IS_ENABLED(level))
> -		flags |= GUC_LOG_DEFAULT_DISABLED;
> +	u32 flags = 0;
>  
>  	if (!GUC_LOG_LEVEL_IS_VERBOSE(level))
>  		flags |= GUC_LOG_DISABLED;
> @@ -248,20 +241,28 @@ static u32 guc_ctl_debug_flags(struct intel_guc *guc)
>  	return flags;
>  }
>  
> -static u32 guc_ctl_feature_flags(struct intel_guc *guc)
> +static u32 guc9_ctl_debug_flags(struct intel_guc *guc)
>  {
> -	u32 flags = 0;
> +	u32 level = intel_guc_log_get_level(&guc->log);
> +	u32 flags;
> +	u32 ads;
>  
> -	flags |=  GUC_CTL_VCS2_ENABLED;
> +	ads = intel_guc_ggtt_offset(guc, guc->ads_vma) >> PAGE_SHIFT;
> +	flags = ads << GUC9_ADS_ADDR_SHIFT | GUC9_ADS_ENABLED;
>  
> -	if (USES_GUC_SUBMISSION(guc_to_i915(guc)))
> -		flags |= GUC_CTL_KERNEL_SUBMISSIONS;
> -	else
> -		flags |= GUC_CTL_DISABLE_SCHEDULER;
> +	if (!GUC_LOG_LEVEL_IS_ENABLED(level))
> +		flags |= GUC9_LOG_DEFAULT_DISABLED;
> +
> +	flags |= guc_ctl_debug_flags(guc);
>  
>  	return flags;
>  }
>  
> +static u32 guc9_ctl_feature_flags(struct intel_guc *guc)
> +{
> +	return GUC9_CTL_VCS2_ENABLED | GUC9_CTL_DISABLE_SCHEDULER;
> +}
> +
>  static u32 guc_ctl_ctxinfo_flags(struct intel_guc *guc)
>  {
>  	u32 flags = 0;
> @@ -279,6 +280,16 @@ static u32 guc_ctl_ctxinfo_flags(struct intel_guc *guc)
>  	return flags;
>  }
>  
> +static u32 guc_ctl_feature_flags(struct intel_guc *guc)
> +{
> +	u32 flags = 0;
> +
> +	if (!USES_GUC_SUBMISSION(guc_to_i915(guc)))
> +		flags |= GUC_CTL_DISABLE_SCHEDULER;
> +
> +	return flags;
> +}
> +
>  static u32 guc_ctl_log_params_flags(struct intel_guc *guc)
>  {
>  	u32 offset = intel_guc_ggtt_offset(guc, guc->log.vma) >> PAGE_SHIFT;
> @@ -320,22 +331,39 @@ static u32 guc_ctl_log_params_flags(struct intel_guc *guc)
>  	return flags;
>  }
>  
> -static void guc_prepare_params(struct intel_guc *guc, u32 *params)
> +static void guc9_prepare_params(struct intel_guc *guc, u32 *params)
>  {
>  	/*
>  	 * GuC ARAT increment is 10 ns. GuC default scheduler quantum is one
>  	 * second. This ARAR is calculated by:
>  	 * Scheduler-Quantum-in-ns / ARAT-increment-in-ns = 1000000000 / 10
>  	 */
> -	params[GUC_CTL_ARAT_HIGH] = 0;
> -	params[GUC_CTL_ARAT_LOW] = 100000000;
> +	params[GUC9_CTL_ARAT_HIGH] = 0;
> +	params[GUC9_CTL_ARAT_LOW] = 100000000;
> +
> +	params[GUC9_CTL_WA] |= GUC9_CTL_WA_UK_BY_DRIVER;
>  
> -	params[GUC_CTL_WA] |= GUC_CTL_WA_UK_BY_DRIVER;
> +	params[GUC9_CTL_FEATURE] = guc9_ctl_feature_flags(guc);
> +	params[GUC9_CTL_LOG_PARAMS] = guc_ctl_log_params_flags(guc);
> +	params[GUC9_CTL_DEBUG] = guc9_ctl_debug_flags(guc);
> +	params[GUC_CTL_CTXINFO] = guc_ctl_ctxinfo_flags(guc);
> +}
>  
> +static u32 guc_ctl_ads_flags(struct intel_guc *guc)
> +{
> +	u32 ads = intel_guc_ggtt_offset(guc, guc->ads_vma) >> PAGE_SHIFT;
> +	u32 flags = ads << GUC_ADS_ADDR_SHIFT;
> +
> +	return flags;
> +}
> +
> +static void guc11_prepare_params(struct intel_guc *guc, u32 *params)
> +{
> +	params[GUC_CTL_CTXINFO] = guc_ctl_ctxinfo_flags(guc);
> +	params[GUC_CTL_LOG_PARAMS] = guc_ctl_log_params_flags(guc);
>  	params[GUC_CTL_FEATURE] = guc_ctl_feature_flags(guc);
> -	params[GUC_CTL_LOG_PARAMS]  = guc_ctl_log_params_flags(guc);
>  	params[GUC_CTL_DEBUG] = guc_ctl_debug_flags(guc);
> -	params[GUC_CTL_CTXINFO] = guc_ctl_ctxinfo_flags(guc);
> +	params[GUC_CTL_ADS] = guc_ctl_ads_flags(guc);
>  }
>  
>  static void guc_write_params(struct intel_guc *guc, const u32 *params)
> @@ -367,10 +395,14 @@ static void guc_write_params(struct intel_guc *guc, const u32 *params)
>   */
>  void intel_guc_init_params(struct intel_guc *guc)
>  {
> +	struct drm_i915_private *i915 = guc_to_i915(guc);
>  	u32 params[GUC_CTL_MAX_DWORDS];
>  
>  	memset(params, 0, sizeof(params));
> -	guc_prepare_params(guc, params);
> +	if (INTEL_GEN(i915) >= 11)
> +		guc11_prepare_params(guc, params);
> +	else
> +		guc9_prepare_params(guc, params);
>  	guc_write_params(guc, params);
>  }
>  
> diff --git a/drivers/gpu/drm/i915/intel_guc_fwif.h b/drivers/gpu/drm/i915/intel_guc_fwif.h
> index 8382d59..7070e36 100644
> --- a/drivers/gpu/drm/i915/intel_guc_fwif.h
> +++ b/drivers/gpu/drm/i915/intel_guc_fwif.h
> @@ -71,44 +71,28 @@
>  #define GUC_STAGE_DESC_ATTR_PCH		BIT(6)
>  #define GUC_STAGE_DESC_ATTR_TERMINATED	BIT(7)
>  
> -/* The guc control data is 10 DWORDs */
> +/* New GuC control data */
>  #define GUC_CTL_CTXINFO			0
>  #define   GUC_CTL_CTXNUM_IN16_SHIFT	0
>  #define   GUC_CTL_BASE_ADDR_SHIFT	12
>  
> -#define GUC_CTL_ARAT_HIGH		1
> -#define GUC_CTL_ARAT_LOW		2
> -
> -#define GUC_CTL_DEVICE_INFO		3
> -
> -#define GUC_CTL_LOG_PARAMS		4
> +#define GUC_CTL_LOG_PARAMS		1
>  #define   GUC_LOG_VALID			(1 << 0)
>  #define   GUC_LOG_NOTIFY_ON_HALF_FULL	(1 << 1)
>  #define   GUC_LOG_ALLOC_IN_MEGABYTE	(1 << 3)
>  #define   GUC_LOG_CRASH_SHIFT		4
> -#define   GUC_LOG_CRASH_MASK		(0x1 << GUC_LOG_CRASH_SHIFT)
> +#define   GUC_LOG_CRASH_MASK		(0x3 << GUC_LOG_CRASH_SHIFT)
>  #define   GUC_LOG_DPC_SHIFT		6
>  #define   GUC_LOG_DPC_MASK	        (0x7 << GUC_LOG_DPC_SHIFT)
>  #define   GUC_LOG_ISR_SHIFT		9
>  #define   GUC_LOG_ISR_MASK	        (0x7 << GUC_LOG_ISR_SHIFT)
>  #define   GUC_LOG_BUF_ADDR_SHIFT	12
>  
> -#define GUC_CTL_PAGE_FAULT_CONTROL	5
> -
> -#define GUC_CTL_WA			6
> -#define   GUC_CTL_WA_UK_BY_DRIVER	(1 << 3)
> +#define GUC_CTL_WA			2
> +#define GUC_CTL_FEATURE			3
> +#define   GUC_CTL_DISABLE_SCHEDULER	(1 << 14)
>  
> -#define GUC_CTL_FEATURE			7
> -#define   GUC_CTL_VCS2_ENABLED		(1 << 0)
> -#define   GUC_CTL_KERNEL_SUBMISSIONS	(1 << 1)
> -#define   GUC_CTL_FEATURE2		(1 << 2)
> -#define   GUC_CTL_POWER_GATING		(1 << 3)
> -#define   GUC_CTL_DISABLE_SCHEDULER	(1 << 4)
> -#define   GUC_CTL_PREEMPTION_LOG	(1 << 5)
> -#define   GUC_CTL_ENABLE_SLPC		(1 << 7)
> -#define   GUC_CTL_RESET_ON_PREMPT_FAILURE	(1 << 8)
> -
> -#define GUC_CTL_DEBUG			8
> +#define GUC_CTL_DEBUG			4
>  #define   GUC_LOG_VERBOSITY_SHIFT	0
>  #define   GUC_LOG_VERBOSITY_LOW		(0 << GUC_LOG_VERBOSITY_SHIFT)
>  #define   GUC_LOG_VERBOSITY_MED		(1 << GUC_LOG_VERBOSITY_SHIFT)
> @@ -121,13 +105,28 @@
>  #define	  GUC_LOG_DESTINATION_MASK	(3 << 4)
>  #define   GUC_LOG_DISABLED		(1 << 6)
>  #define   GUC_PROFILE_ENABLED		(1 << 7)
> -#define   GUC_WQ_TRACK_ENABLED		(1 << 8)
> -#define   GUC_ADS_ENABLED		(1 << 9)
> -#define   GUC_LOG_DEFAULT_DISABLED	(1 << 10)
> -#define   GUC_ADS_ADDR_SHIFT		11
> -#define   GUC_ADS_ADDR_MASK		0xfffff800
> -
> -#define GUC_CTL_RSRVD			9
> +#define   GUC9_WQ_TRACK_ENABLED		(1 << 8)
> +#define   GUC9_ADS_ENABLED		(1 << 9)
> +#define   GUC9_LOG_DEFAULT_DISABLED	(1 << 10)
> +#define   GUC9_ADS_ADDR_SHIFT		11
> +#define   GUC9_ADS_ADDR_MASK		0xfffff800
> +
> +#define GUC_CTL_ADS			5
> +#define   GUC_ADS_ADDR_SHIFT		1
> +#define   GUC_ADS_ADDR_MASK		(0xFFFFF << GUC_ADS_ADDR_SHIFT)
> +
> +/* Legacy GuC control data */
> +#define GUC9_CTL_ARAT_HIGH		1
> +#define GUC9_CTL_ARAT_LOW		2
> +#define GUC9_CTL_DEVICE_INFO		3
> +#define GUC9_CTL_LOG_PARAMS		4
> +#define GUC9_CTL_PAGE_FAULT_CONTROL	5
> +#define GUC9_CTL_WA			6
> +#define   GUC9_CTL_WA_UK_BY_DRIVER	(1 << 3)
> +#define GUC9_CTL_FEATURE		7
> +#define   GUC9_CTL_VCS2_ENABLED		(1 << 0)
> +#define   GUC9_CTL_DISABLE_SCHEDULER	(1 << 4)
> +#define GUC9_CTL_DEBUG			8
>  
>  #define GUC_CTL_MAX_DWORDS		(SOFT_SCRATCH_COUNT - 2) /* [1..14] */
>  
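
For readers comparing the two layouts, here is a minimal standalone sketch
of how the ADS address ends up encoded in each of them. The shift and mask
values are the ones from the intel_guc_fwif.h hunk above. The helper names
guc_ads_flags() and guc9_ads_flags() are hypothetical, they take a plain
GGTT offset instead of struct intel_guc, and PAGE_SHIFT = 12 is an
assumption (4 KiB pages). This is userspace illustration code, not part of
the patch.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT           12 /* assumption: 4 KiB pages */

/* New layout: the ADS address has its own dword (GUC_CTL_ADS). */
#define GUC_ADS_ADDR_SHIFT   1
#define GUC_ADS_ADDR_MASK    (0xFFFFFu << GUC_ADS_ADDR_SHIFT)

/* Legacy layout: the ADS address shares the GUC9_CTL_DEBUG dword. */
#define GUC9_ADS_ENABLED     (1u << 9)
#define GUC9_ADS_ADDR_SHIFT  11
#define GUC9_ADS_ADDR_MASK   0xfffff800u

/* Hypothetical helper: page frame number shifted into the new ADS field. */
static uint32_t guc_ads_flags(uint32_t ggtt_offset)
{
	uint32_t ads = ggtt_offset >> PAGE_SHIFT;

	return ads << GUC_ADS_ADDR_SHIFT;
}

/* Hypothetical helper: legacy encoding, address plus the enable bit. */
static uint32_t guc9_ads_flags(uint32_t ggtt_offset)
{
	uint32_t ads = ggtt_offset >> PAGE_SHIFT;

	return ads << GUC9_ADS_ADDR_SHIFT | GUC9_ADS_ENABLED;
}
```

With a GGTT offset of 0x00345000 the page frame number is 0x345, so the
new encoding yields 0x68a and the legacy one yields 0x1a2a00.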

* Re: [PATCH 19/21] drm/i915/huc: New HuC status register for Gen11
  2018-08-29 19:18   ` [PATCH 19/21] drm/i915/huc: New HuC status register " Michal Wajdeczko
@ 2018-08-30 22:59     ` John Spotswood
  0 siblings, 0 replies; 49+ messages in thread
From: John Spotswood @ 2018-08-30 22:59 UTC (permalink / raw)
  To: Wajdeczko, Michal, intel-gfx; +Cc: Vivi, Rodrigo

On Wed, 2018-08-29 at 12:18 -0700, Wajdeczko, Michal wrote:
> Gen11 defines a new register for checking HuC authentication status.
> Look into the right register and bit.
> 
> BSpec: 19686
> 
> Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
> Cc: Tony Ye <tony.ye@intel.com>
> Cc: Vinay Belgaumkar <vinay.belgaumkar@intel.com>
> Cc: Michel Thierry <michel.thierry@intel.com>
> Cc: John Spotswood <john.a.spotswood@intel.com>
> Cc: Anusha Srivatsa <anusha.srivatsa@intel.com>

Reviewed-by: John Spotswood <john.a.spotswood@intel.com>

> ---
>  drivers/gpu/drm/i915/intel_guc_reg.h |  3 ++
>  drivers/gpu/drm/i915/intel_huc.c     | 58 +++++++++++++++++++++++++++++++-----
>  2 files changed, 53 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/intel_guc_reg.h b/drivers/gpu/drm/i915/intel_guc_reg.h
> index 2149209..de36595 100644
> --- a/drivers/gpu/drm/i915/intel_guc_reg.h
> +++ b/drivers/gpu/drm/i915/intel_guc_reg.h
> @@ -79,6 +79,9 @@
>  #define HUC_STATUS2             _MMIO(0xD3B0)
>  #define   HUC_FW_VERIFIED       (1<<7)
>  
> +#define GEN11_HUC_KERNEL_LOAD_INFO	_MMIO(0xC1DC)
> +#define   HUC_LOAD_SUCCESSFUL		  (1 << 0)
> +
>  #define GUC_WOPCM_SIZE			_MMIO(0xc050)
>  #define   GUC_WOPCM_SIZE_LOCKED		  (1<<0)
>  #define   GUC_WOPCM_SIZE_SHIFT		12
> diff --git a/drivers/gpu/drm/i915/intel_huc.c b/drivers/gpu/drm/i915/intel_huc.c
> index 37ef540d..a710c0d 100644
> --- a/drivers/gpu/drm/i915/intel_huc.c
> +++ b/drivers/gpu/drm/i915/intel_huc.c
> @@ -40,6 +40,47 @@ int intel_huc_init_misc(struct intel_huc *huc)
>  	return 0;
>  }
>  
> +static int gen8_huc_wait_verified(struct intel_huc *huc)
> +{
> +	struct drm_i915_private *i915 = huc_to_i915(huc);
> +	u32 status;
> +	int ret;
> +
> +	ret = __intel_wait_for_register(i915,
> +					HUC_STATUS2,
> +					HUC_FW_VERIFIED,
> +					HUC_FW_VERIFIED,
> +					2, 50, &status);
> +	if (ret)
> +		DRM_ERROR("HuC: status %#x\n", status);
> +	return ret;
> +}
> +
> +static int gen11_huc_wait_verified(struct intel_huc *huc)
> +{
> +	struct drm_i915_private *i915 = huc_to_i915(huc);
> +	int ret;
> +
> +	ret = __intel_wait_for_register(i915,
> +					GEN11_HUC_KERNEL_LOAD_INFO,
> +					HUC_LOAD_SUCCESSFUL,
> +					HUC_LOAD_SUCCESSFUL,
> +					2, 50, NULL);
> +	return ret;
> +}
> +
> +static int huc_wait_verified(struct intel_huc *huc)
> +{
> +	struct drm_i915_private *i915 = huc_to_i915(huc);
> +	int ret;
> +
> +	if (INTEL_GEN(i915) >= 11)
> +		ret = gen11_huc_wait_verified(huc);
> +	else
> +		ret = gen8_huc_wait_verified(huc);
> +	return ret;
> +}
> +
>  /**
>   * intel_huc_auth() - Authenticate HuC uCode
>   * @huc: intel_huc structure
> @@ -56,7 +97,6 @@ int intel_huc_auth(struct intel_huc *huc)
>  	struct drm_i915_private *i915 = huc_to_i915(huc);
>  	struct intel_guc *guc = &i915->guc;
>  	struct i915_vma *vma;
> -	u32 status;
>  	int ret;
>  
>  	if (huc->fw.load_status != INTEL_UC_FIRMWARE_SUCCESS)
> @@ -79,13 +119,9 @@ int intel_huc_auth(struct intel_huc *huc)
>  	}
>  
>  	/* Check authentication status, it should be done by now */
> -	ret = __intel_wait_for_register(i915,
> -					HUC_STATUS2,
> -					HUC_FW_VERIFIED,
> -					HUC_FW_VERIFIED,
> -					2, 50, &status);
> +	ret = huc_wait_verified(huc);
>  	if (ret) {
> -		DRM_ERROR("HuC: Firmware not verified %#x\n", status);
> +		DRM_ERROR("HuC: Firmware not verified %d\n", ret);
>  		goto fail_unpin;
>  	}
>  
> @@ -120,7 +156,13 @@ int intel_huc_check_status(struct intel_huc *huc)
>  		return -ENODEV;
>  
>  	intel_runtime_pm_get(dev_priv);
> -	status = I915_READ(HUC_STATUS2) & HUC_FW_VERIFIED;
> +
> +	if (INTEL_GEN(dev_priv) >= 11)
> +		status = I915_READ(GEN11_HUC_KERNEL_LOAD_INFO) &
> +			 HUC_LOAD_SUCCESSFUL;
> +	else
> +		status = I915_READ(HUC_STATUS2) & HUC_FW_VERIFIED;
> +
>  	intel_runtime_pm_put(dev_priv);
>  
>  	return status;
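
The per-generation status check above boils down to picking a different
bit and testing it with a mask. A minimal standalone sketch of just that
decision follows. The bit definitions are taken from the intel_guc_reg.h
hunk. The helpers huc_verified_mask() and huc_is_verified() are
hypothetical and operate on an already-read register value, so the choice
of register (HUC_STATUS2 vs. GEN11_HUC_KERNEL_LOAD_INFO) and the polling
done by __intel_wait_for_register() are deliberately left out.

```c
#include <assert.h>
#include <stdint.h>

#define HUC_FW_VERIFIED      (1u << 7) /* bit in HUC_STATUS2 (pre-Gen11) */
#define HUC_LOAD_SUCCESSFUL  (1u << 0) /* bit in GEN11_HUC_KERNEL_LOAD_INFO */

/* Hypothetical helper: which bit reports authentication for a given gen. */
static uint32_t huc_verified_mask(int gen)
{
	return gen >= 11 ? HUC_LOAD_SUCCESSFUL : HUC_FW_VERIFIED;
}

/* Hypothetical helper: nonzero when the status value has that bit set. */
static int huc_is_verified(int gen, uint32_t reg_value)
{
	uint32_t mask = huc_verified_mask(gen);

	return (reg_value & mask) == mask;
}
```

Note that a value with only bit 7 set counts as verified for Gen9 but not
for Gen11, which is why reading the old register on Gen11 would give the
wrong answer.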

* Re: [PATCH 08/21] drm/i915/guc: Make use of the SW counter field in the context descriptor
  2018-08-30 14:15     ` Lis, Tomasz
@ 2018-08-31 15:31       ` Lis, Tomasz
  0 siblings, 0 replies; 49+ messages in thread
From: Lis, Tomasz @ 2018-08-31 15:31 UTC (permalink / raw)
  To: Lionel Landwerlin, Michal Wajdeczko, intel-gfx; +Cc: Oscar Mateo, Rodrigo Vivi



On 2018-08-30 16:15, Lis, Tomasz wrote:
>
>
> On 2018-08-30 02:08, Lionel Landwerlin wrote:
>> On 29/08/2018 20:16, Michal Wajdeczko wrote:
>>> The new context descriptor format contains two assignable fields:
>>> the SW Context ID (technically 11 bits, but practically limited to 2032
>>> entries due to some being reserved for future use by the GuC) and the
>>> SW Counter (6 bits).
>>>
>>> We don't want to limit ourselves too much in the maximum number of
>>> concurrent contexts we want to allow, so ideally we want to employ
>>> every possible bit available. Unfortunately, a further limitation in
>>> the interface with the GuC means the combination of SW Context ID +
>>> SW Counter has to be unique within the same engine class (as we use
>>> the SW Context ID to index in the GuC stage descriptor pool, and the
>>> Engine Class + SW Counter to index in the 2-dimensional lrc array).
>>> This essentially means we need to somehow encode the engine instance.
>>>
>>> Since the BSpec allows 6 bits for engine instance, we use the whole
>>> SW counter for this task. If the limitation of 2032 maximum
>>> simultaneous contexts is too restrictive, we can always squeeze things
>>> a bit more (3 extra bits for hw_id, 3 bits for instance) and things
>>> will still work (Gen11 does not instantiate more than 8 engines of any
>>> class).
>>>
>>> Another alternative would be to generate the hw_id per HW context
>>> instead of per GEM context, but that has other problems (e.g. maximum
>>> number of user-created contexts would be variable, no relationship
>>> between a GuC principal descriptor and the proxy descriptor it
>>> uses, ...)
>>>
>>> Bspec: 12254
>>>
>>> Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
>>> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
>>> Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
>>> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
>>> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
>>> Cc: Michel Thierry <michel.thierry@intel.com>
> Tested-by: Tomasz Lis <tomasz.lis@intel.com>
>>> ---
>>>   drivers/gpu/drm/i915/i915_drv.h         | 15 +++++++++++----
>>>   drivers/gpu/drm/i915/i915_gem_context.c |  5 ++++-
>>>   drivers/gpu/drm/i915/i915_gem_context.h |  2 ++
>>>   drivers/gpu/drm/i915/i915_reg.h         |  2 ++
>>>   drivers/gpu/drm/i915/intel_lrc.c        | 12 +++++++++---
>>>   5 files changed, 28 insertions(+), 8 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
>>> index e5b9d3c..34f5495 100644
>>> --- a/drivers/gpu/drm/i915/i915_drv.h
>>> +++ b/drivers/gpu/drm/i915/i915_drv.h
>>> @@ -1866,14 +1866,21 @@ struct drm_i915_private {
>>>           struct llist_head free_list;
>>>           struct work_struct free_work;
>>>   -        /* The hw wants to have a stable context identifier for the
>>> +        /*
>>> +         * The HW wants to have a stable context identifier for the
>>>            * lifetime of the context (for OA, PASID, faults, etc).
>>>            * This is limited in execlists to 21 bits.
>>> +         * In enhanced execlist (GEN11+) this is limited to 11 bits
>>> +         * (the SW Context ID field) but GuC limits it a bit further
>>> +         * (11 bits - 16) due to some entries being reserved for future
>>> +         * use (so the firmware only supports a GuC stage descriptor
>>> +         * pool of 2032 entries).
>>>            */
>>>           struct ida hw_ida;
>>> -#define MAX_CONTEXT_HW_ID (1<<21) /* exclusive */
>>> -#define MAX_GUC_CONTEXT_HW_ID (1 << 20) /* exclusive */
>>> -#define GEN11_MAX_CONTEXT_HW_ID (1<<11) /* exclusive */
>>> +#define MAX_CONTEXT_HW_ID            (1 << 21) /* exclusive */
>>> +#define MAX_GUC_CONTEXT_HW_ID            (1 << 20) /* exclusive */
>>> +#define GEN11_MAX_CONTEXT_HW_ID            (1 << 11) /* exclusive */
>>> +#define GEN11_MAX_CONTEXT_HW_ID_WITH_GUC (GEN11_MAX_CONTEXT_HW_ID - 16)
>>>       } contexts;
>>>         u32 fdi_rx_config;
>>> diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
>>> index f15a039..e3b500c 100644
>>> --- a/drivers/gpu/drm/i915/i915_gem_context.c
>>> +++ b/drivers/gpu/drm/i915/i915_gem_context.c
>>> @@ -209,7 +209,10 @@ static int assign_hw_id(struct drm_i915_private *dev_priv, unsigned *out)
>>>       unsigned int max;
>>>         if (INTEL_GEN(dev_priv) >= 11) {
>>> -        max = GEN11_MAX_CONTEXT_HW_ID;
>>> +        if (USES_GUC_SUBMISSION(dev_priv))
>>> +            max = GEN11_MAX_CONTEXT_HW_ID_WITH_GUC;
>>> +        else
>>> +            max = GEN11_MAX_CONTEXT_HW_ID;
>>>       } else {
>>>           /*
>>>            * When using GuC in proxy submission, GuC consumes the
>>> diff --git a/drivers/gpu/drm/i915/i915_gem_context.h b/drivers/gpu/drm/i915/i915_gem_context.h
>>> index 851dad6..4b87f5d 100644
>>> --- a/drivers/gpu/drm/i915/i915_gem_context.h
>>> +++ b/drivers/gpu/drm/i915/i915_gem_context.h
>>> @@ -154,6 +154,8 @@ struct i915_gem_context {
>>>           struct intel_ring *ring;
>>>           u32 *lrc_reg_state;
>>>           u64 lrc_desc;
>>> +        u32 sw_context_id;
>>> +        u32 sw_counter;
>>>           int pin_count;
>>>             const struct intel_context_ops *ops;
>>> diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
>>> index f232178..ea65d7b 100644
>>> --- a/drivers/gpu/drm/i915/i915_reg.h
>>> +++ b/drivers/gpu/drm/i915/i915_reg.h
>>> @@ -3900,6 +3900,8 @@ enum {
>>>   #define GEN8_CTX_ID_WIDTH 21
>>>   #define GEN11_SW_CTX_ID_SHIFT 37
>>>   #define GEN11_SW_CTX_ID_WIDTH 11
>>> +#define GEN11_SW_COUNTER_SHIFT 55
>>> +#define GEN11_SW_COUNTER_WIDTH 6
>>>   #define GEN11_ENGINE_CLASS_SHIFT 61
>>>   #define GEN11_ENGINE_CLASS_WIDTH 3
>>>   #define GEN11_ENGINE_INSTANCE_SHIFT 48
>>> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
>>> index f4b9972..3001a14 100644
>>> --- a/drivers/gpu/drm/i915/intel_lrc.c
>>> +++ b/drivers/gpu/drm/i915/intel_lrc.c
>>> @@ -240,14 +240,15 @@ static inline bool need_preempt(const struct intel_engine_cs *engine,
>>>        * anything below.
>>>        */
>>>       if (INTEL_GEN(ctx->i915) >= 11) {
>>
>>
>> Hey Michal,
>>
>> There is a comment just above the if () about updating
>> i915_perf.c (oa_get_render_ctx_id) when the descriptor is updated.
>> Otherwise this will break part of the observability feature.
>> You can verify this with the IGT tests/perf
>> --run-subtest=gen8-unprivileged-single-ctx-counters
>>
>> Thanks a lot,
>>
>> -
>> Lionel
> Tested on KBL; works fine for both enable_guc=2 and enable_guc=3.
>
> ./tests/perf --run-subtest=gen8-unprivileged-single-ctx-counters
> IGT-Version: 1.22-g11db680 (x86_64) (Linux: 4.19.0-rc1tli+ x86_64)
> Subtest gen8-unprivileged-single-ctx-counters: SUCCESS (0,058s)
>
> -Tomasz
Took a bit longer, but tested on ICL as well. The test passes for
enable_guc=1 and 3.
There is still an issue with enable_guc=2, but that's not related to
the observability feature (and is out of scope of this review).

-Tomasz
>>> -        GEM_BUG_ON(ctx->hw_id >= BIT(GEN11_SW_CTX_ID_WIDTH));
>>> -        desc |= (u64)ctx->hw_id << GEN11_SW_CTX_ID_SHIFT;
>>> +        GEM_BUG_ON(ce->sw_context_id >= BIT(GEN11_SW_CTX_ID_WIDTH));
>>> +        desc |= (u64)ce->sw_context_id << GEN11_SW_CTX_ID_SHIFT;
>>>                                   /* bits 37-47 */
>>>             desc |= (u64)engine->instance << GEN11_ENGINE_INSTANCE_SHIFT;
>>>                                   /* bits 48-53 */
>>>   -        /* TODO: decide what to do with SW counter (bits 55-60) */
>>> +        desc |= (u64)ce->sw_counter << GEN11_SW_COUNTER_SHIFT;
>>> +                                /* bits 55-60 */
>>>             /*
>>>            * Although GuC will never see this upper part as it fills
>>> @@ -2771,6 +2772,11 @@ static int execlists_context_deferred_alloc(struct i915_gem_context *ctx,
>>>       ce->ring = ring;
>>>       ce->state = vma;
>>>   +    if (INTEL_GEN(ctx->i915) >= 11) {
>>> +        ce->sw_context_id = ctx->hw_id;
>>> +        ce->sw_counter = engine->instance;
>>> +    }
>>> +
>>>       return 0;
>>>     error_ring_free:
>>
>>
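
As a cross-check of the bit layout discussed in the commit message, here
is a minimal standalone sketch that packs the upper descriptor fields and
reads them back. The shifts, widths and the 2032 limit are the values from
the quoted i915_reg.h and i915_drv.h hunks. The helpers gen11_desc_upper()
and desc_field() are hypothetical names for illustration, and the assert()
stands in for the kernel's GEM_BUG_ON(). It is userspace code, not the
driver implementation.

```c
#include <assert.h>
#include <stdint.h>

/* Field positions from the quoted i915_reg.h hunk. */
#define GEN11_SW_CTX_ID_SHIFT        37
#define GEN11_SW_CTX_ID_WIDTH        11
#define GEN11_ENGINE_INSTANCE_SHIFT  48
#define GEN11_SW_COUNTER_SHIFT       55
#define GEN11_SW_COUNTER_WIDTH       6
#define GEN11_ENGINE_CLASS_SHIFT     61
#define GEN11_ENGINE_CLASS_WIDTH     3

/* Limits from the quoted i915_drv.h hunk: 2048 - 16 = 2032 entries. */
#define GEN11_MAX_CONTEXT_HW_ID           (1u << 11)
#define GEN11_MAX_CONTEXT_HW_ID_WITH_GUC  (GEN11_MAX_CONTEXT_HW_ID - 16)

/* Hypothetical helper: pack only the upper fields of the descriptor. */
static uint64_t gen11_desc_upper(uint32_t sw_context_id, uint32_t instance,
				 uint32_t sw_counter, uint32_t engine_class)
{
	uint64_t desc = 0;

	/* Userspace stand-in for the GEM_BUG_ON() range check. */
	assert(sw_context_id < (1u << GEN11_SW_CTX_ID_WIDTH));

	desc |= (uint64_t)sw_context_id << GEN11_SW_CTX_ID_SHIFT;   /* bits 37-47 */
	desc |= (uint64_t)instance << GEN11_ENGINE_INSTANCE_SHIFT;  /* bits 48-53 */
	desc |= (uint64_t)sw_counter << GEN11_SW_COUNTER_SHIFT;     /* bits 55-60 */
	desc |= (uint64_t)engine_class << GEN11_ENGINE_CLASS_SHIFT; /* bits 61-63 */

	return desc;
}

/* Hypothetical helper: extract a field of the given width. */
static uint64_t desc_field(uint64_t desc, unsigned int shift, unsigned int width)
{
	return (desc >> shift) & ((1ull << width) - 1);
}
```

Passing engine->instance as both the instance and the SW counter mirrors
what the patch does in execlists_context_deferred_alloc(); the largest
hw_id usable with GuC submission is then 2031.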

* Re: [PATCH 02/21] drm/i915/guc: Don't allow GuC submission on pre-Gen11
  2018-08-29 19:10 ` [PATCH 02/21] drm/i915/guc: Don't allow GuC submission on pre-Gen11 Michal Wajdeczko
  2018-08-29 19:16   ` Srivatsa, Anusha
  2018-08-30 22:58   ` John Spotswood
@ 2018-09-06  8:28   ` Joonas Lahtinen
  2018-09-06  8:29   ` Joonas Lahtinen
  3 siblings, 0 replies; 49+ messages in thread
From: Joonas Lahtinen @ 2018-09-06  8:28 UTC (permalink / raw)
  To: Michal Wajdeczko, intel-gfx; +Cc: Rodrigo Vivi, Sujaritha Sundaresan

Quoting Michal Wajdeczko (2018-08-29 22:10:36)
> Upcoming Gen11 GuC firmware requires a new interface that is incompatible
> with existing pre-Gen11 firmwares. Updated firmwares for pre-Gen11 will
> arrive later. In the meantime sanitize the enable_guc option so that we
> can enable HuC authentication but nothing else on pre-Gen11.

Do you have a plan to source a GuC ICL machine to CI to regain the
lost coverage? That'd be the minimum requirement.

This is because I'm not really sold on replacing the existing
interface in upstream (for which we have some machines in CI) with
something we don't currently have any coverage for.

Regards, Joonas

* Re: [PATCH 02/21] drm/i915/guc: Don't allow GuC submission on pre-Gen11
  2018-08-29 19:10 ` [PATCH 02/21] drm/i915/guc: Don't allow GuC submission on pre-Gen11 Michal Wajdeczko
                     ` (2 preceding siblings ...)
  2018-09-06  8:28   ` Joonas Lahtinen
@ 2018-09-06  8:29   ` Joonas Lahtinen
  3 siblings, 0 replies; 49+ messages in thread
From: Joonas Lahtinen @ 2018-09-06  8:29 UTC (permalink / raw)
  To: Michal Wajdeczko, intel-gfx; +Cc: Rodrigo Vivi, Sujaritha Sundaresan

Quoting Michal Wajdeczko (2018-08-29 22:10:36)
> Upcoming Gen11 GuC firmware requires a new interface that is incompatible
> with existing pre-Gen11 firmwares. Updated firmwares for pre-Gen11 will
> arrive later. In the meantime sanitize the enable_guc option so that we
> can enable HuC authentication but nothing else on pre-Gen11.
> 
> Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
> Cc: Michel Thierry <michel.thierry@intel.com>
> Cc: John Spotswood <john.a.spotswood@intel.com>
> Cc: Vinay Belgaumkar <vinay.belgaumkar@intel.com>
> Cc: Tony Ye <tony.ye@intel.com>
> Cc: Anusha Srivatsa <anusha.srivatsa@intel.com>
> Cc: Jeff Mcgee <jeff.mcgee@intel.com>
> Cc: Antonio Argenziano <antonio.argenziano@intel.com>
> Cc: Sujaritha Sundaresan <sujaritha.sundaresan@intel.com>

<SNIP>

> @@ -292,6 +301,12 @@ int intel_uc_init(struct drm_i915_private *i915)
>                 return ret;
>  
>         if (USES_GUC_SUBMISSION(i915)) {
> +

Extra newline.

Regards, Joonas

> +               if (INTEL_GEN(i915) < 11) {
> +                       intel_guc_fini(guc);
> +                       return -EIO;
> +               }
> +

* Re: [PATCH 03/21] drm/i915/guc: Simplify preparation of GuC parameter block
  2018-08-29 19:10 ` [PATCH 03/21] drm/i915/guc: Simplify preparation of GuC parameter block Michal Wajdeczko
  2018-08-30 22:58   ` John Spotswood
@ 2018-09-06  8:32   ` Joonas Lahtinen
  1 sibling, 0 replies; 49+ messages in thread
From: Joonas Lahtinen @ 2018-09-06  8:32 UTC (permalink / raw)
  To: Michal Wajdeczko, intel-gfx

Quoting Michal Wajdeczko (2018-08-29 22:10:37)
> Definition of the parameters block passed to GuC is about to change.
> Slightly refactor code now to make upcoming patch smaller.
> 
> Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> Cc: John Spotswood <john.a.spotswood@intel.com>

Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>

Regards, Joonas

* Re: [PATCH 04/21] drm/i915/guc: Support dual Gen9/Gen11 parameters block
  2018-08-29 19:10 ` [PATCH 04/21] drm/i915/guc: Support dual Gen9/Gen11 parameters block Michal Wajdeczko
  2018-08-30 22:58   ` John Spotswood
@ 2018-09-06  8:39   ` Joonas Lahtinen
  1 sibling, 0 replies; 49+ messages in thread
From: Joonas Lahtinen @ 2018-09-06  8:39 UTC (permalink / raw)
  To: Michal Wajdeczko, intel-gfx; +Cc: Rodrigo Vivi, Sujaritha Sundaresan

Quoting Michal Wajdeczko (2018-08-29 22:10:38)
> Gen11 GuC boot parameter definitions differ from those previously
> used for Gen9. Try to support both definitions until new firmware
> for pre-Gen11 is available.

This is exactly the kind of branching we want to avoid. The purpose of
the GuC is to hide per-Gen differences, not to cause them :)

The new interface code should just be GuC interface code (not refer to
ICL or any generation specifically), then pre-Gen11 platforms just won't
have a drm-tip compatible firmware until -- well -- they do have it.

So for the series, just axe the old interface and replace it in-place.
Makes the review of the interface differences much more effective, too.

Regards, Joonas

* Re: [PATCH 06/21] drm/i915/guc: Use guc_class instead of engine_class in fw interface
  2018-08-29 19:10 ` [PATCH 06/21] drm/i915/guc: Use guc_class instead of engine_class in fw interface Michal Wajdeczko
  2018-08-29 19:58   ` Michel Thierry
@ 2018-09-06  8:55   ` Joonas Lahtinen
  1 sibling, 0 replies; 49+ messages in thread
From: Joonas Lahtinen @ 2018-09-06  8:55 UTC (permalink / raw)
  To: Michal Wajdeczko, intel-gfx; +Cc: Lucas De Marchi, Rodrigo Vivi

Quoting Michal Wajdeczko (2018-08-29 22:10:40)
> Until now the GuC and HW engine class have been the same, which allowed
> us to use them interchangeably. But it is better to start doing the
> right thing and use the GuC definitions for the firmware interface.
> 
> We also keep the same class id in the ctx descriptor to be able to have
> the same values in the driver and firmware logs.
> 
> Signed-off-by: Michel Thierry <michel.thierry@intel.com>
> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
> Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
> Cc: Michel Thierry <michel.thierry@intel.com>
> Cc: Lucas De Marchi <lucas.demarchi@intel.com>
> Cc: Tomasz Lis <tomasz.lis@intel.com>

<SNIP>

> +++ b/drivers/gpu/drm/i915/intel_lrc.c
> @@ -249,7 +249,15 @@ static inline bool need_preempt(const struct intel_engine_cs *engine,
>  
>                 /* TODO: decide what to do with SW counter (bits 55-60) */
>  
> -               desc |= (u64)engine->class << GEN11_ENGINE_CLASS_SHIFT;
> +               /*
> +                * Although GuC will never see this upper part as it fills
> +                * its own descriptor, using the guc_class here will help keep
> +                * the i915 and firmware logs in sync.
> +                */
> +               if (HAS_GUC_SCHED(ctx->i915))
> +                       desc |= (u64)engine->guc_class << GEN11_ENGINE_CLASS_SHIFT;
> +               else
> +                       desc |= (u64)engine->class << GEN11_ENGINE_CLASS_SHIFT;
>                                                                 /* bits 61-63 */

I'm fairly confident I've given this review comment long ago, but here
goes again.

The new member name should just be hw_class or something else agnostic,
and the branching of using ELSP vs. GuC identifier should happen at the engine
init time (or at another sweet spot). And then the actual write should
be unconditional, using the hw_class value.

We should not be adding GuC specifics to intel_lrc.c, I think.
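
A minimal sketch of that suggestion, not the actual i915 code: the
struct, the helper names and the ENGINE_CLASS_SHIFT value are
hypothetical stand-ins (the shift is picked to match the "bits 61-63"
comment in the quoted hunk). The identifier is chosen once at engine
init time and the descriptor write itself no longer branches.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for GEN11_ENGINE_CLASS_SHIFT (bits 61-63). */
#define ENGINE_CLASS_SHIFT 61

/* Toy engine: only the fields needed to illustrate the idea. */
struct toy_engine {
	uint8_t elsp_class; /* class id used with ELSP submission */
	uint8_t guc_class;  /* class id used by the GuC firmware */
	uint8_t hw_class;   /* backend-agnostic value written to the descriptor */
};

/* Pick the identifier once, at engine init time. */
static void toy_engine_init_hw_class(struct toy_engine *e, bool uses_guc)
{
	e->hw_class = uses_guc ? e->guc_class : e->elsp_class;
	assert(e->hw_class < 8); /* must fit in the 3-bit class field */
}

/* The descriptor write is unconditional: no GuC knowledge needed here. */
static uint64_t toy_desc_add_class(uint64_t desc, const struct toy_engine *e)
{
	return desc | ((uint64_t)e->hw_class << ENGINE_CLASS_SHIFT);
}
```

With this shape the GuC-specific decision lives in one init path, and
the descriptor code only ever reads hw_class.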

Regards, Joonas

* [RFC] drm/i915/guc: New GuC stage descriptors
  2018-08-29 19:18   ` [PATCH 11/21] drm/i915/guc: New GuC stage descriptors Michal Wajdeczko
  2018-08-29 23:14     ` Daniele Ceraolo Spurio
@ 2018-10-12 18:25     ` Daniele Ceraolo Spurio
  2018-10-17 18:42       ` Lis, Tomasz
  1 sibling, 1 reply; 49+ messages in thread
From: Daniele Ceraolo Spurio @ 2018-10-12 18:25 UTC (permalink / raw)
  To: intel-gfx; +Cc: Michel Thierry

With the new interface, GuC now requires every lrc to be registered in
one of the stage descriptors, which have been re-designed so that each
descriptor can store up to 64 lrcs per class (i.e. equal to the number
of possible SW counter values).
Similarly to what happened with the previous legacy design, it is possible
to have a single "proxy" descriptor that owns the workqueue and the
doorbell and use it for all submissions. To distinguish the proxy
descriptors from the ones used for lrc storage, the latter have been
called "principal". A descriptor can't be both a proxy and a principal
at the same time; to enforce this, since we only use 1 proxy descriptor
per client, we reserve enough descriptors from the bottom of the pool
(the highest indices) to be used as proxies and leave the others as
principals. For simplicity, we
currently map context IDs 1:1 to principal descriptors, but we could
have more contexts in flight if needed by using the SW counter.
Note that the lrcs need to be mapped in the principal descriptor until
GuC is done with them. This means that we can't release the HW id when
the user app closes the ctx because it might still be in flight with GuC
and that we need to be careful when unpinning because the fact that
a request on the next context has completed doesn't mean that GuC is
done processing the first one. See in-code comments for details.
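
As a rough illustration of the split described above, the pool can be
viewed as two index ranges. The three constants are taken from this
patch; the helper names are hypothetical and exist only for this
sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Values as defined in this patch. */
#define GUC_MAX_STAGE_DESCRIPTORS	1024
#define GUC_MAX_CLIENT_IDS		2
#define GUC_MAX_PPAL_STAGE_DESCRIPTORS \
	(GUC_MAX_STAGE_DESCRIPTORS - GUC_MAX_CLIENT_IDS)

/* Principal descriptors occupy indices [0, 1022). */
static bool toy_stage_id_is_principal(uint32_t id)
{
	return id < GUC_MAX_PPAL_STAGE_DESCRIPTORS;
}

/* Proxy descriptors occupy the reserved tail [1022, 1024). */
static bool toy_stage_id_is_proxy(uint32_t id)
{
	return id >= GUC_MAX_PPAL_STAGE_DESCRIPTORS &&
	       id < GUC_MAX_STAGE_DESCRIPTORS;
}

/* With the 1:1 mapping, a context ID is used directly as the index. */
static uint32_t toy_ppal_index_for_context(uint32_t hw_id)
{
	assert(toy_stage_id_is_principal(hw_id));
	return hw_id;
}
```

Because the two ranges are disjoint, no index can be both a proxy and a
principal.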

NOTE: GuC is not going to look at lrcs that are not in flight, so we
could potentially skip the unpinning steps. However, the unpinning steps
perform extra correctness checks, so it is better to keep them until
we're sure that the flow is solid.
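
The bookkeeping those checks protect can be sketched as follows. This
is a simplified model, not the driver code: toy_stage_desc and the
helpers are hypothetical, TOY_MAX_CLASSES is an assumed stand-in for
GUC_MAX_ENGINE_CLASSES, and the wait on is_present_in_sq done before
the real unpin is left out.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define TOY_MAX_CLASSES		4  /* assumed stand-in for GUC_MAX_ENGINE_CLASSES */
#define TOY_MAX_LRC_PER_CLASS	64 /* same value as GUC_MAX_LRC_PER_CLASS */

/* Toy principal descriptor: one allocation bitmap per engine class. */
struct toy_stage_desc {
	uint64_t alloc_map[TOY_MAX_CLASSES]; /* one bit per SW counter value */
	uint32_t lrc_count;                  /* number of lrcs currently mapped */
	int active;                          /* models GUC_STAGE_DESC_ATTR_ACTIVE */
};

static void toy_lrc_pin(struct toy_stage_desc *d, unsigned int cls,
			unsigned int counter)
{
	uint64_t bit;

	assert(cls < TOY_MAX_CLASSES && counter < TOY_MAX_LRC_PER_CLASS);
	bit = UINT64_C(1) << counter;
	assert(!(d->alloc_map[cls] & bit)); /* slot must not be in use */

	if (d->lrc_count++ == 0)
		d->active = 1; /* first lrc activates the descriptor */
	d->alloc_map[cls] |= bit;
}

static void toy_lrc_unpin(struct toy_stage_desc *d, unsigned int cls,
			  unsigned int counter)
{
	uint64_t bit;

	assert(cls < TOY_MAX_CLASSES && counter < TOY_MAX_LRC_PER_CLASS);
	bit = UINT64_C(1) << counter;
	assert(d->alloc_map[cls] & bit); /* slot must have been pinned */

	d->alloc_map[cls] &= ~bit;
	if (--d->lrc_count == 0)
		memset(d, 0, sizeof(*d)); /* last lrc releases the descriptor */
}
```

The assertions play the role of the GEM_BUG_ON checks: a double pin or
an unpin of a slot that was never mapped is caught immediately.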

Based on an initial patch by Oscar Mateo.

RFC: this will be sent as part of the updated series once we have
the gen9 FW with the new interface, but since the flow is
significantly different compared to the previous version I'd like
to gather some early feedback to make sure there aren't any major
issues.

Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Michel Thierry <michel.thierry@intel.com>
Cc: Michal Winiarski <michal.winiarski@intel.com>
Cc: Tomasz Lis <tomasz.lis@intel.com>
---
 drivers/gpu/drm/i915/i915_debugfs.c         |  30 +-
 drivers/gpu/drm/i915/i915_drv.h             |   5 +-
 drivers/gpu/drm/i915/i915_gem_context.c     |   9 +-
 drivers/gpu/drm/i915/intel_guc.h            |  14 +-
 drivers/gpu/drm/i915/intel_guc_fwif.h       |  73 +++--
 drivers/gpu/drm/i915/intel_guc_submission.c | 346 +++++++++++++++-----
 drivers/gpu/drm/i915/intel_lrc.c            |  18 +-
 drivers/gpu/drm/i915/intel_lrc.h            |   7 +
 drivers/gpu/drm/i915/selftests/intel_guc.c  |   4 +-
 9 files changed, 360 insertions(+), 146 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index 00c551d3e409..04bbde4a38a6 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -2474,11 +2474,10 @@ static int i915_guc_stage_pool(struct seq_file *m, void *data)
 
 		seq_printf(m, "GuC stage descriptor %u:\n", index);
 		seq_printf(m, "\tIndex: %u\n", desc->stage_id);
+		seq_printf(m, "\tProxy Index: %u\n", desc->proxy_id);
 		seq_printf(m, "\tAttribute: 0x%x\n", desc->attribute);
 		seq_printf(m, "\tPriority: %d\n", desc->priority);
 		seq_printf(m, "\tDoorbell id: %d\n", desc->db_id);
-		seq_printf(m, "\tEngines used: 0x%x\n",
-			   desc->engines_used);
 		seq_printf(m, "\tDoorbell trigger phy: 0x%llx, cpu: 0x%llx, uK: 0x%x\n",
 			   desc->db_trigger_phy,
 			   desc->db_trigger_cpu,
@@ -2490,18 +2489,21 @@ static int i915_guc_stage_pool(struct seq_file *m, void *data)
 		seq_putc(m, '\n');
 
 		for_each_engine_masked(engine, dev_priv, client->engines, tmp) {
-			u32 guc_engine_id = engine->guc_id;
-			struct guc_execlist_context *lrc =
-						&desc->lrc[guc_engine_id];
-
-			seq_printf(m, "\t%s LRC:\n", engine->name);
-			seq_printf(m, "\t\tContext desc: 0x%x\n",
-				   lrc->context_desc);
-			seq_printf(m, "\t\tContext id: 0x%x\n", lrc->context_id);
-			seq_printf(m, "\t\tLRCA: 0x%x\n", lrc->ring_lrca);
-			seq_printf(m, "\t\tRing begin: 0x%x\n", lrc->ring_begin);
-			seq_printf(m, "\t\tRing end: 0x%x\n", lrc->ring_end);
-			seq_putc(m, '\n');
+			u8 class = engine->class;
+			u8 inst = engine->instance;
+
+			if (desc->lrc_alloc_map[class].bitmap & BIT(inst)) {
+				struct guc_execlist_context *lrc =
+							&desc->lrc[class][inst];
+				seq_printf(m, "\t%s LRC:\n", engine->name);
+				seq_printf(m, "\t\tHW context desc: 0x%x:0x%x\n",
+						lower_32_bits(lrc->hw_context_desc),
+						upper_32_bits(lrc->hw_context_desc));
+				seq_printf(m, "\t\tLRC: 0x%x\n", lrc->ring_lrc);
+				seq_printf(m, "\t\tRing begin: 0x%x\n", lrc->ring_begin);
+				seq_printf(m, "\t\tRing end: 0x%x\n", lrc->ring_end);
+				seq_putc(m, '\n');
+			}
 		}
 	}
 
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index bd76931987ef..ce095d57e050 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1878,13 +1878,14 @@ struct drm_i915_private {
 		 * (the SW Context ID field) but GuC limits it further so
 		 * without taking advantage of part of the SW counter field the
 		 * firmware only supports a max number of contexts equal to the
-		 * number of entries in the GuC stage descriptor pool.
+		 * number of entries in the GuC stage descriptor pool, minus
+		 * the descriptors reserved for proxy usage
 		 */
 		struct ida hw_ida;
 #define MAX_CONTEXT_HW_ID (1<<21) /* exclusive */
 #define MAX_GUC_CONTEXT_HW_ID (1 << 20) /* exclusive */
 #define GEN11_MAX_CONTEXT_HW_ID (1<<11) /* exclusive */
-#define GEN11_MAX_CONTEXT_HW_ID_WITH_GUC GUC_MAX_STAGE_DESCRIPTORS
+#define GEN11_MAX_CONTEXT_HW_ID_WITH_GUC GUC_MAX_PPAL_STAGE_DESCRIPTORS
 		struct list_head hw_id_list;
 	} contexts;
 
diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index 552d2e108de4..c239d9b9307c 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -284,10 +284,15 @@ static void context_close(struct i915_gem_context *ctx)
 	i915_gem_context_set_closed(ctx);
 
 	/*
-	 * This context will never again be assinged to HW, so we can
+	 * This context will never again be assigned to HW, so we can
 	 * reuse its ID for the next context.
+	 *
+	 * If GuC is in use, we need to keep the ID until GuC has finished
+	 * processing all submitted requests because the ID is used by the
+	 * firmware to index the guc stage_desc pool.
 	 */
-	release_hw_id(ctx);
+	if (!USES_GUC_SUBMISSION(ctx->i915))
+		release_hw_id(ctx);
 
 	/*
 	 * The LUT uses the VMA as a backpointer to unref the object,
diff --git a/drivers/gpu/drm/i915/intel_guc.h b/drivers/gpu/drm/i915/intel_guc.h
index 11b3882482f4..05ee44fb66af 100644
--- a/drivers/gpu/drm/i915/intel_guc.h
+++ b/drivers/gpu/drm/i915/intel_guc.h
@@ -58,10 +58,14 @@ struct intel_guc {
 	bool interrupts_enabled;
 	unsigned int msg_enabled_mask;
 
+	struct ida client_ids;
+#define GUC_MAX_CLIENT_IDS 2
+
 	struct i915_vma *ads_vma;
 	struct i915_vma *stage_desc_pool;
 	void *stage_desc_pool_vaddr;
-	struct ida stage_ids;
+#define	GUC_MAX_PPAL_STAGE_DESCRIPTORS (GUC_MAX_STAGE_DESCRIPTORS - GUC_MAX_CLIENT_IDS)
+
 	struct i915_vma *shared_data;
 	void *shared_data_vaddr;
 
@@ -94,6 +98,14 @@ struct intel_guc {
 
 	/* GuC's FW specific notify function */
 	void (*notify)(struct intel_guc *guc);
+
+	/*
+	 * Override the first stage_desc to be used as proxy
+	 * (Default: GUC_MAX_PPAL_STAGE_DESCRIPTORS). The max number of ppal
+	 * descriptors is not updated accordingly since the test using this does
+	 * not allocate any context.
+	 */
+	I915_SELFTEST_DECLARE(u32 starting_proxy_id);
 };
 
 static inline bool intel_guc_is_alive(struct intel_guc *guc)
diff --git a/drivers/gpu/drm/i915/intel_guc_fwif.h b/drivers/gpu/drm/i915/intel_guc_fwif.h
index ce3ab6ed21d5..1a0f41a26173 100644
--- a/drivers/gpu/drm/i915/intel_guc_fwif.h
+++ b/drivers/gpu/drm/i915/intel_guc_fwif.h
@@ -32,6 +32,8 @@
 #define GUC_MAX_STAGE_DESCRIPTORS	1024
 #define	GUC_INVALID_STAGE_ID		GUC_MAX_STAGE_DESCRIPTORS
 
+#define GUC_MAX_LRC_PER_CLASS		64
+
 #define GUC_RENDER_ENGINE		0
 #define GUC_VIDEO_ENGINE		1
 #define GUC_BLITTER_ENGINE		2
@@ -66,9 +68,12 @@
 #define GUC_DOORBELL_DISABLED		0
 
 #define GUC_STAGE_DESC_ATTR_ACTIVE	BIT(0)
-#define GUC_STAGE_DESC_ATTR_PENDING_DB	BIT(1)
-#define GUC_STAGE_DESC_ATTR_KERNEL	BIT(2)
-#define GUC_STAGE_DESC_ATTR_PREEMPT	BIT(3)
+#define GUC_STAGE_DESC_ATTR_TYPE_SHIFT	1
+#define GUC_STAGE_DESC_ATTR_PRINCIPAL	(0x0 << GUC_STAGE_DESC_ATTR_TYPE_SHIFT)
+#define GUC_STAGE_DESC_ATTR_PROXY	(0x1 << GUC_STAGE_DESC_ATTR_TYPE_SHIFT)
+#define GUC_STAGE_DESC_ATTR_REAL	(0x2 << GUC_STAGE_DESC_ATTR_TYPE_SHIFT)
+#define GUC_STAGE_DESC_ATTR_TYPE_MASK	(0x3 << GUC_STAGE_DESC_ATTR_TYPE_SHIFT)
+#define GUC_STAGE_DESC_ATTR_KERNEL	(1 << 3)
 #define GUC_STAGE_DESC_ATTR_RESET	BIT(4)
 #define GUC_STAGE_DESC_ATTR_WQLOCKED	BIT(5)
 #define GUC_STAGE_DESC_ATTR_PCH		BIT(6)
@@ -277,9 +282,10 @@ struct guc_process_desc {
 	u64 wq_base_addr;
 	u32 wq_size_bytes;
 	u32 wq_status;
-	u32 engine_presence;
 	u32 priority;
-	u32 reserved[30];
+	u32 token;
+	u32 queue_engine_error;
+	u32 reserved[23];
 } __packed;
 
 /* engine id and context id is packed into guc_execlist_context.context_id*/
@@ -288,18 +294,20 @@ struct guc_process_desc {
 
 /* The execlist context including software and HW information */
 struct guc_execlist_context {
-	u32 context_desc;
-	u32 context_id;
-	u32 ring_status;
-	u32 ring_lrca;
+	u64 hw_context_desc;
+	u32 reserved0;
+	u32 ring_lrc;
 	u32 ring_begin;
 	u32 ring_end;
 	u32 ring_next_free_location;
 	u32 ring_current_tail_pointer_value;
-	u8 engine_state_submit_value;
-	u8 engine_state_wait_value;
-	u16 pagefault_count;
-	u16 engine_submit_queue_count;
+	u32 engine_state_wait_value;
+	u32 state_reserved;
+	u32 is_present_in_sq;
+	u32 sync_value;
+	u32 sync_addr;
+	u32 slpc_hints;
+	u32 reserved1[4];
 } __packed;
 
 /*
@@ -312,36 +320,33 @@ struct guc_execlist_context {
  * with the GuC, being allocated before the GuC is loaded with its firmware.
  */
 struct guc_stage_desc {
-	u32 sched_common_area;
+	u64 desc_private;
 	u32 stage_id;
-	u32 pas_id;
-	u8 engines_used;
+	u32 proxy_id;
 	u64 db_trigger_cpu;
 	u32 db_trigger_uk;
 	u64 db_trigger_phy;
-	u16 db_id;
-
-	struct guc_execlist_context lrc[GUC_MAX_ENGINES_NUM];
-
-	u8 attribute;
-
-	u32 priority;
-
+	u32 db_id;
+	struct guc_execlist_context lrc[GUC_MAX_ENGINE_CLASSES][GUC_MAX_LRC_PER_CLASS];
+	struct {
+		u64 bitmap;
+		u32 reserved0;
+	} __packed lrc_alloc_map[GUC_MAX_ENGINE_CLASSES];
+	u32 lrc_count;
+	u32 max_lrc_per_class;
+	u32 attribute; /* GUC_STAGE_DESC_ATTR_xxx */
+	u32 priority; /* GUC_CLIENT_PRIORITY_xxx */
 	u32 wq_sampled_tail_offset;
 	u32 wq_total_submit_enqueues;
-
 	u32 process_desc;
 	u32 wq_addr;
 	u32 wq_size;
-
-	u32 engine_presence;
-
-	u8 engine_suspended;
-
-	u8 reserved0[3];
-	u64 reserved1[1];
-
-	u64 desc_private;
+	u32 feature0;
+	u32 feature1;
+	u32 feature2;
+	u32 queue_engine_error;
+	u32 reserved[2];
+	u64 reserved3[12];
 } __packed;
 
 /**
diff --git a/drivers/gpu/drm/i915/intel_guc_submission.c b/drivers/gpu/drm/i915/intel_guc_submission.c
index eae668442ebe..9bf8ebbc4de1 100644
--- a/drivers/gpu/drm/i915/intel_guc_submission.c
+++ b/drivers/gpu/drm/i915/intel_guc_submission.c
@@ -46,17 +46,22 @@
  * that contains all required pages for these elements).
  *
  * GuC stage descriptor:
- * During initialization, the driver allocates a static pool of 1024 such
- * descriptors, and shares them with the GuC.
- * Currently, there exists a 1:1 mapping between a intel_guc_client and a
- * guc_stage_desc (via the client's stage_id), so effectively only one
- * gets used. This stage descriptor lets the GuC know about the doorbell,
- * workqueue and process descriptor. Theoretically, it also lets the GuC
- * know about our HW contexts (context ID, etc...), but we actually
- * employ a kind of submission where the GuC uses the LRCA sent via the work
- * item instead (the single guc_stage_desc associated to execbuf client
- * contains information about the default kernel context only, but this is
- * essentially unused). This is called a "proxy" submission.
+ * During initialization, the driver allocates a static pool of descriptors
+ * and shares them with the GuC. This stage descriptor lets the GuC know about
+ * the doorbell, workqueue and process descriptor, additionally it stores
+ * information about all possible HW contexts that use it (64 x number of
+ * engine classes of guc_execlist_context structs).
+ *
+ * The idea is that every direct-submission GuC client gets one SW Context ID
+ * and every HW context created by that client gets one SW Counter. The "SW
+ * Context ID" and "SW Counter" to use now get passed on every work queue item.
+ *
+ * But we don't have direct submission yet: does that mean we are limited to 64
+ * contexts in total (one client)? Not really: we can use extra GuC context
+ * descriptors to store more HW contexts. They are special in that they don't
+ * have their own work queue, doorbell or process descriptor. Instead, these
+ * "principal" GuC context descriptors use the one that belongs to the client
+ * as a "proxy" for submission (a generalization of the old proxy submission).
  *
  * The Scratch registers:
  * There are 16 MMIO-based registers start from 0xC180. The kernel driver writes
@@ -164,11 +169,28 @@ static int __guc_deallocate_doorbell(struct intel_guc *guc, u32 stage_id)
 	return intel_guc_send(guc, action, ARRAY_SIZE(action));
 }
 
-static struct guc_stage_desc *__get_stage_desc(struct intel_guc_client *client)
+static struct guc_stage_desc *__get_stage_desc(struct intel_guc *guc, u32 index)
+{
+	struct guc_stage_desc *base = guc->stage_desc_pool_vaddr;
+
+	GEM_BUG_ON(!USES_GUC_SUBMISSION(guc_to_i915(guc)));
+	GEM_BUG_ON(index >= GUC_MAX_STAGE_DESCRIPTORS);
+
+	return &base[index];
+}
+
+static struct guc_stage_desc *__get_proxy_stage_desc(struct intel_guc_client *client)
 {
-	struct guc_stage_desc *base = client->guc->stage_desc_pool_vaddr;
+	GEM_BUG_ON(!I915_SELFTEST_ONLY(client->guc->starting_proxy_id) &&
+			client->stage_id < GUC_MAX_PPAL_STAGE_DESCRIPTORS);
+	return __get_stage_desc(client->guc, client->stage_id);
+}
 
-	return &base[client->stage_id];
+static struct guc_stage_desc *__get_ppal_stage_desc(struct intel_guc *guc,
+						    u32 index)
+{
+	GEM_BUG_ON(index >= GUC_MAX_PPAL_STAGE_DESCRIPTORS);
+	return __get_stage_desc(guc, index);
 }
 
 /*
@@ -183,7 +205,7 @@ static void __update_doorbell_desc(struct intel_guc_client *client, u16 new_id)
 	struct guc_stage_desc *desc;
 
 	/* Update the GuC's idea of the doorbell ID */
-	desc = __get_stage_desc(client);
+	desc = __get_proxy_stage_desc(client);
 	desc->db_id = new_id;
 }
 
@@ -329,14 +351,12 @@ static int guc_stage_desc_pool_create(struct intel_guc *guc)
 
 	guc->stage_desc_pool = vma;
 	guc->stage_desc_pool_vaddr = vaddr;
-	ida_init(&guc->stage_ids);
 
 	return 0;
 }
 
 static void guc_stage_desc_pool_destroy(struct intel_guc *guc)
 {
-	ida_destroy(&guc->stage_ids);
 	i915_vma_unpin_and_release(&guc->stage_desc_pool, I915_VMA_RELEASE_MAP);
 }
 
@@ -347,78 +367,26 @@ static void guc_stage_desc_pool_destroy(struct intel_guc *guc)
  * data structures relating to this client (doorbell, process descriptor,
  * write queue, etc).
  */
-static void guc_stage_desc_init(struct intel_guc_client *client)
+static void guc_proxy_stage_desc_init(struct intel_guc_client *client)
 {
-	struct intel_guc *guc = client->guc;
-	struct drm_i915_private *dev_priv = guc_to_i915(guc);
-	struct intel_engine_cs *engine;
-	struct i915_gem_context *ctx = client->owner;
 	struct guc_stage_desc *desc;
-	unsigned int tmp;
 	u32 gfx_addr;
 
-	desc = __get_stage_desc(client);
+	desc = __get_proxy_stage_desc(client);
 	memset(desc, 0, sizeof(*desc));
 
 	desc->attribute = GUC_STAGE_DESC_ATTR_ACTIVE |
+			  GUC_STAGE_DESC_ATTR_PROXY |
 			  GUC_STAGE_DESC_ATTR_KERNEL;
-	if (is_high_priority(client))
-		desc->attribute |= GUC_STAGE_DESC_ATTR_PREEMPT;
 	desc->stage_id = client->stage_id;
 	desc->priority = client->priority;
 	desc->db_id = client->doorbell_id;
 
-	for_each_engine_masked(engine, dev_priv, client->engines, tmp) {
-		struct intel_context *ce = to_intel_context(ctx, engine);
-		u32 guc_engine_id = engine->guc_id;
-		struct guc_execlist_context *lrc = &desc->lrc[guc_engine_id];
-
-		/* TODO: We have a design issue to be solved here. Only when we
-		 * receive the first batch, we know which engine is used by the
-		 * user. But here GuC expects the lrc and ring to be pinned. It
-		 * is not an issue for default context, which is the only one
-		 * for now who owns a GuC client. But for future owner of GuC
-		 * client, need to make sure lrc is pinned prior to enter here.
-		 */
-		if (!ce->state)
-			break;	/* XXX: continue? */
-
-		/*
-		 * XXX: When this is a GUC_STAGE_DESC_ATTR_KERNEL client (proxy
-		 * submission or, in other words, not using a direct submission
-		 * model) the KMD's LRCA is not used for any work submission.
-		 * Instead, the GuC uses the LRCA of the user mode context (see
-		 * guc_add_request below).
-		 */
-		lrc->context_desc = lower_32_bits(ce->lrc_desc);
-
-		/* The state page is after PPHWSP */
-		lrc->ring_lrca = intel_guc_ggtt_offset(guc, ce->state) +
-				 LRC_STATE_PN * PAGE_SIZE;
-
-		/* XXX: In direct submission, the GuC wants the HW context id
-		 * here. In proxy submission, it wants the stage id
-		 */
-		lrc->context_id = (client->stage_id << GUC_ELC_CTXID_OFFSET) |
-				(guc_engine_id << GUC_ELC_ENGINE_OFFSET);
-
-		lrc->ring_begin = intel_guc_ggtt_offset(guc, ce->ring->vma);
-		lrc->ring_end = lrc->ring_begin + ce->ring->size - 1;
-		lrc->ring_next_free_location = lrc->ring_begin;
-		lrc->ring_current_tail_pointer_value = 0;
-
-		desc->engines_used |= (1 << guc_engine_id);
-	}
-
-	DRM_DEBUG_DRIVER("Host engines 0x%x => GuC engines used 0x%x\n",
-			 client->engines, desc->engines_used);
-	WARN_ON(desc->engines_used == 0);
-
 	/*
 	 * The doorbell, process descriptor, and workqueue are all parts
 	 * of the client object, which the GuC will reference via the GGTT
 	 */
-	gfx_addr = intel_guc_ggtt_offset(guc, client->vma);
+	gfx_addr = intel_guc_ggtt_offset(client->guc, client->vma);
 	desc->db_trigger_phy = sg_dma_address(client->vma->pages->sgl) +
 				client->doorbell_offset;
 	desc->db_trigger_cpu = ptr_to_u64(__get_doorbell(client));
@@ -430,11 +398,11 @@ static void guc_stage_desc_init(struct intel_guc_client *client)
 	desc->desc_private = ptr_to_u64(client);
 }
 
-static void guc_stage_desc_fini(struct intel_guc_client *client)
+static void guc_proxy_stage_desc_fini(struct intel_guc_client *client)
 {
 	struct guc_stage_desc *desc;
 
-	desc = __get_stage_desc(client);
+	desc = __get_proxy_stage_desc(client);
 	memset(desc, 0, sizeof(*desc));
 }
 
@@ -553,7 +521,7 @@ static void inject_preempt_context(struct work_struct *work)
 	struct intel_guc *guc = container_of(preempt_work, typeof(*guc),
 					     preempt_work[engine->id]);
 	struct intel_guc_client *client = guc->preempt_client;
-	struct guc_stage_desc *stage_desc = __get_stage_desc(client);
+	struct guc_stage_desc *stage_desc = __get_proxy_stage_desc(client);
 	struct intel_context *ce = to_intel_context(client->owner, engine);
 	u32 data[7];
 
@@ -919,6 +887,7 @@ guc_client_alloc(struct drm_i915_private *dev_priv,
 	struct i915_vma *vma;
 	void *vaddr;
 	int ret;
+	u32 starting_id;
 
 	client = kzalloc(sizeof(*client), GFP_KERNEL);
 	if (!client)
@@ -931,8 +900,11 @@ guc_client_alloc(struct drm_i915_private *dev_priv,
 	client->doorbell_id = GUC_DOORBELL_INVALID;
 	spin_lock_init(&client->wq_lock);
 
-	ret = ida_simple_get(&guc->stage_ids, 0, GUC_MAX_STAGE_DESCRIPTORS,
-			     GFP_KERNEL);
+	if (!I915_SELFTEST_ONLY(starting_id = guc->starting_proxy_id))
+		starting_id = GUC_MAX_PPAL_STAGE_DESCRIPTORS;
+
+	ret = ida_simple_get(&guc->client_ids, starting_id,
+			     GUC_MAX_STAGE_DESCRIPTORS, GFP_KERNEL);
 	if (ret < 0)
 		goto err_client;
 
@@ -983,7 +955,7 @@ guc_client_alloc(struct drm_i915_private *dev_priv,
 err_vma:
 	i915_vma_unpin_and_release(&client->vma, 0);
 err_id:
-	ida_simple_remove(&guc->stage_ids, client->stage_id);
+	ida_simple_remove(&guc->client_ids, client->stage_id);
 err_client:
 	kfree(client);
 	return ERR_PTR(ret);
@@ -993,7 +965,7 @@ static void guc_client_free(struct intel_guc_client *client)
 {
 	unreserve_doorbell(client);
 	i915_vma_unpin_and_release(&client->vma, I915_VMA_RELEASE_MAP);
-	ida_simple_remove(&client->guc->stage_ids, client->stage_id);
+	ida_simple_remove(&client->guc->client_ids, client->stage_id);
 	kfree(client);
 }
 
@@ -1063,7 +1035,7 @@ static int __guc_client_enable(struct intel_guc_client *client)
 	int ret;
 
 	guc_proc_desc_init(client);
-	guc_stage_desc_init(client);
+	guc_proxy_stage_desc_init(client);
 
 	ret = create_doorbell(client);
 	if (ret)
@@ -1072,7 +1044,7 @@ static int __guc_client_enable(struct intel_guc_client *client)
 	return 0;
 
 fail:
-	guc_stage_desc_fini(client);
+	guc_proxy_stage_desc_fini(client);
 	guc_proc_desc_fini(client);
 	return ret;
 }
@@ -1089,7 +1061,7 @@ static void __guc_client_disable(struct intel_guc_client *client)
 	else
 		__destroy_doorbell(client);
 
-	guc_stage_desc_fini(client);
+	guc_proxy_stage_desc_fini(client);
 	guc_proc_desc_fini(client);
 }
 
@@ -1145,6 +1117,9 @@ int intel_guc_submission_init(struct intel_guc *guc)
 	GEM_BUG_ON(!guc->stage_desc_pool);
 
 	WARN_ON(!guc_verify_doorbells(guc));
+
+	ida_init(&guc->client_ids);
+
 	ret = guc_clients_create(guc);
 	if (ret)
 		goto err_pool;
@@ -1157,6 +1132,7 @@ int intel_guc_submission_init(struct intel_guc *guc)
 	return 0;
 
 err_pool:
+	ida_destroy(&guc->client_ids);
 	guc_stage_desc_pool_destroy(guc);
 	return ret;
 }
@@ -1173,6 +1149,8 @@ void intel_guc_submission_fini(struct intel_guc *guc)
 	guc_clients_destroy(guc);
 	WARN_ON(!guc_verify_doorbells(guc));
 
+	ida_destroy(&guc->client_ids);
+
 	if (guc->stage_desc_pool)
 		guc_stage_desc_pool_destroy(guc);
 }
@@ -1257,6 +1235,203 @@ static void guc_submission_unpark(struct intel_engine_cs *engine)
 	intel_engine_pin_breadcrumbs_irq(engine);
 }
 
+static void guc_map_gem_ctx_to_ppal_stage(struct intel_guc *guc,
+					  struct guc_stage_desc *desc,
+					  u32 id)
+{
+	GEM_BUG_ON(desc->attribute & GUC_STAGE_DESC_ATTR_ACTIVE);
+
+	desc->attribute = GUC_STAGE_DESC_ATTR_ACTIVE |
+			  GUC_STAGE_DESC_ATTR_PRINCIPAL |
+			  GUC_STAGE_DESC_ATTR_KERNEL;
+	desc->stage_id = id;
+
+	/* all ppal contexts will be submitted through the execbuf client */
+	desc->proxy_id = guc->execbuf_client->stage_id;
+
+	/*
+	 * max_lrc_per_class is used in GuC to cut short loops over the
+	 * lrc_bitmap when only a small amount of lrcs are used. We could
+	 * recalculate this value every time an lrc is added or removed, but
+	 * given the fact that we only have a max number of lrcs per stage_desc
+	 * equal to the max number of instances of a class (because we map
+	 * gem_context 1:1 with stage_desc) and that the GuC loops only in
+	 * specific cases, redoing the calculation each time doesn't give us a
+	 * big benefit for the cost so we can just use a static value.
+	 */
+	desc->max_lrc_per_class = MAX_ENGINE_INSTANCE + 1;
+}
+
+static void guc_unmap_gem_ctx_from_ppal_stage(struct intel_guc *guc,
+					      struct guc_stage_desc *desc)
+{
+	GEM_BUG_ON(!(desc->attribute & GUC_STAGE_DESC_ATTR_ACTIVE));
+	GEM_BUG_ON(desc->lrc_count > 0);
+
+	memset(desc, 0, sizeof(*desc));
+}
+
+static inline void guc_ppal_stage_lrc_pin(struct intel_engine_cs *engine,
+					  struct i915_gem_context *ctx,
+					  struct intel_context *ce)
+{
+	struct intel_guc *guc = &ctx->i915->guc;
+	struct guc_stage_desc *desc;
+	struct guc_execlist_context *lrc;
+	u8 guc_class = engine->class;
+
+	/* 1:1 gem_context to ppal mapping */
+	GEM_BUG_ON(ce->sw_counter > MAX_ENGINE_INSTANCE);
+
+	desc = __get_ppal_stage_desc(guc, ce->sw_context_id);
+	GEM_BUG_ON(desc->lrc_alloc_map[guc_class].bitmap & BIT(ce->sw_counter));
+
+	if (!desc->lrc_count++)
+		guc_map_gem_ctx_to_ppal_stage(guc, desc, ce->sw_context_id);
+
+	lrc = &desc->lrc[guc_class][ce->sw_counter];
+	lrc->hw_context_desc = ce->lrc_desc;
+	lrc->ring_lrc = intel_guc_ggtt_offset(guc, ce->state) +
+			LRC_STATE_PN * PAGE_SIZE;
+	lrc->ring_begin = intel_guc_ggtt_offset(guc, ce->ring->vma);
+	lrc->ring_end = lrc->ring_begin + ce->ring->size - 1;
+
+	desc->lrc_alloc_map[guc_class].bitmap |= BIT(ce->sw_counter);
+}
+
+static inline void guc_ppal_stage_lrc_unpin(struct intel_context *ce)
+{
+	struct i915_gem_context *ctx = ce->gem_context;
+	struct intel_guc *guc = &ctx->i915->guc;
+	struct intel_engine_cs *engine = ctx->i915->engine[ce - ctx->__engine];
+	struct guc_stage_desc *desc;
+	struct guc_execlist_context *lrc;
+	u8 guc_class = engine->class;
+
+	desc = __get_ppal_stage_desc(guc, ce->sw_context_id);
+	GEM_BUG_ON(!(desc->lrc_alloc_map[guc_class].bitmap & BIT(ce->sw_counter)));
+
+	lrc = &desc->lrc[guc_class][ce->sw_counter];
+
+	/*
+	 * GuC needs us to keep the lrc mapped until it has finished processing
+	 * the ctx switch interrupt. When executing nop or very small workloads
+	 * it is possible (but quite unlikely) that 2 contexts on different
+	 * ELSPs of the same engine complete before the GuC manages to process
+	 * the interrupt for the first completion. Experiments show this happens
+	 * for ~0.2% of contexts when executing nop workloads on different
+	 * contexts back to back on the same engine. When submitting nop
+	 * workloads on all engines at the same time the hit-rate goes up to
+	 * ~0.7%. In all the observed cases GuC required < 100us to catch up,
+	 * with the single engine case being always below 20us.
+	 *
+	 * The completion of the request on the second lrc will reduce our
+	 * pin_count on the first lrc to zero, thus triggering a call to this
+	 * function potentially before GuC has had time to process the
+	 * interrupt. To avoid this, we could get an extra pin on the context or
+	 * delay the unpin when guc is in use, but given that the issue is
+	 * limited to pathological scenarios and has very low hit rate even
+	 * there, we can just introduce a small delay when it happens to give
+	 * time to GuC to catch up. Also to be noted that since the requests
+	 * have completed on the HW we've most likely already sent GuC the next
+	 * contexts to be executed, so it is unlikely that by waiting we'll add
+	 * bubbles in the HW execution.
+	 */
+	WARN_ON(wait_for_us(lrc->is_present_in_sq == 0, 1000));
+
+	desc->lrc_alloc_map[guc_class].bitmap &= ~BIT(ce->sw_counter);
+	memset(lrc, 0, sizeof(*lrc));
+
+	if (!--desc->lrc_count)
+		guc_unmap_gem_ctx_from_ppal_stage(guc, desc);
+}
+
+static inline void guc_init_lrc_mapping(struct intel_guc *guc)
+{
+	struct drm_i915_private *i915 = guc_to_i915(guc);
+	struct intel_engine_cs *engine;
+	struct i915_gem_context *ctx;
+	struct intel_context *ce;
+	enum intel_engine_id id;
+
+	/*
+	 * Some contexts (e.g. kernel_context) might have been pinned before we
+	 * enabled GuC submission, so we need to add them to the GuC bookkeeping.
+	 * Also, after a GuC reset we want to make sure that the information
+	 * shared with GuC is properly reset.
+	 *
+	 * NOTE: the code below assumes 1:1 mapping between ppal descriptors and
+	 * gem contexts for simplicity.
+	 */
+	list_for_each_entry(ctx, &i915->contexts.list, link) {
+		if (atomic_read(&ctx->hw_id_pin_count)) {
+			struct guc_stage_desc *desc;
+
+			/* make sure the descriptor is clean... */
+			GEM_BUG_ON(ctx->hw_id > GUC_MAX_PPAL_STAGE_DESCRIPTORS);
+			desc = __get_ppal_stage_desc(guc, ctx->hw_id);
+			memset(desc, 0, sizeof(*desc));
+
+			/* ...and then (re-)pin all the lrcs */
+			for_each_engine(engine, i915, id) {
+				ce = to_intel_context(ctx, engine);
+				if (ce->pin_count > 0)
+					guc_ppal_stage_lrc_pin(engine, ctx, ce);
+			}
+		}
+	}
+}
+
+static inline void guc_fini_lrc_mapping(struct intel_guc *guc)
+{
+	struct drm_i915_private *i915 = guc_to_i915(guc);
+	struct intel_engine_cs *engine;
+	struct i915_gem_context *ctx;
+	struct intel_context *ce;
+	enum intel_engine_id id;
+
+	list_for_each_entry(ctx, &i915->contexts.list, link) {
+		if (atomic_read(&ctx->hw_id_pin_count)) {
+			for_each_engine(engine, i915, id) {
+				ce = to_intel_context(ctx, engine);
+				if (ce->pin_count > 0)
+					guc_ppal_stage_lrc_unpin(ce);
+			}
+		}
+	}
+}
+
+static void guc_context_unpin(struct intel_context *ce)
+{
+	guc_ppal_stage_lrc_unpin(ce);
+	intel_execlists_context_unpin(ce);
+}
+
+static const struct intel_context_ops guc_context_ops = {
+	.unpin = guc_context_unpin,
+	.destroy = intel_execlists_context_destroy,
+};
+
+static struct intel_context *guc_context_pin(struct intel_engine_cs *engine,
+					     struct i915_gem_context *ctx)
+{
+	struct intel_context *ce = to_intel_context(ctx, engine);
+
+	lockdep_assert_held(&ctx->i915->drm.struct_mutex);
+
+	if (likely(ce->pin_count++))
+		return ce;
+	GEM_BUG_ON(!ce->pin_count); /* no overflow please! */
+
+	ce->ops = &guc_context_ops;
+
+	ce = intel_execlists_context_pin(engine, ctx, ce);
+	if (!IS_ERR(ce))
+		guc_ppal_stage_lrc_pin(engine, ctx, ce);
+
+	return ce;
+}
+
 static void guc_set_default_submission(struct intel_engine_cs *engine)
 {
 	/*
@@ -1274,6 +1449,8 @@ static void guc_set_default_submission(struct intel_engine_cs *engine)
 
 	engine->execlists.tasklet.func = guc_submission_tasklet;
 
+	engine->context_pin = guc_context_pin;
+
 	engine->park = guc_submission_park;
 	engine->unpark = guc_submission_unpark;
 
@@ -1320,6 +1497,8 @@ int intel_guc_submission_enable(struct intel_guc *guc)
 		engine->set_default_submission(engine);
 	}
 
+	guc_init_lrc_mapping(guc);
+
 	return 0;
 }
 
@@ -1329,6 +1508,7 @@ void intel_guc_submission_disable(struct intel_guc *guc)
 
 	GEM_BUG_ON(dev_priv->gt.awake); /* GT should be parked first */
 
+	guc_fini_lrc_mapping(guc);
 	guc_interrupts_release(dev_priv);
 	guc_clients_disable(guc);
 }
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 48e0cdf42221..444bc83554c5 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -1071,7 +1071,7 @@ static void execlists_submit_request(struct i915_request *request)
 	spin_unlock_irqrestore(&engine->timeline.lock, flags);
 }
 
-static void execlists_context_destroy(struct intel_context *ce)
+void intel_execlists_context_destroy(struct intel_context *ce)
 {
 	GEM_BUG_ON(ce->pin_count);
 
@@ -1084,7 +1084,7 @@ static void execlists_context_destroy(struct intel_context *ce)
 	i915_gem_object_put(ce->state->obj);
 }
 
-static void execlists_context_unpin(struct intel_context *ce)
+void intel_execlists_context_unpin(struct intel_context *ce)
 {
 	struct intel_engine_cs *engine;
 
@@ -1141,10 +1141,10 @@ static int __context_pin(struct i915_gem_context *ctx, struct i915_vma *vma)
 	return i915_vma_pin(vma, 0, 0, flags);
 }
 
-static struct intel_context *
-__execlists_context_pin(struct intel_engine_cs *engine,
-			struct i915_gem_context *ctx,
-			struct intel_context *ce)
+struct intel_context *
+intel_execlists_context_pin(struct intel_engine_cs *engine,
+			    struct i915_gem_context *ctx,
+			    struct intel_context *ce)
 {
 	void *vaddr;
 	int ret;
@@ -1205,8 +1205,8 @@ __execlists_context_pin(struct intel_engine_cs *engine,
 }
 
 static const struct intel_context_ops execlists_context_ops = {
-	.unpin = execlists_context_unpin,
-	.destroy = execlists_context_destroy,
+	.unpin = intel_execlists_context_unpin,
+	.destroy = intel_execlists_context_destroy,
 };
 
 static struct intel_context *
@@ -1224,7 +1224,7 @@ execlists_context_pin(struct intel_engine_cs *engine,
 
 	ce->ops = &execlists_context_ops;
 
-	return __execlists_context_pin(engine, ctx, ce);
+	return intel_execlists_context_pin(engine, ctx, ce);
 }
 
 static int execlists_request_alloc(struct i915_request *request)
diff --git a/drivers/gpu/drm/i915/intel_lrc.h b/drivers/gpu/drm/i915/intel_lrc.h
index f5a5502ecf70..178b181ea651 100644
--- a/drivers/gpu/drm/i915/intel_lrc.h
+++ b/drivers/gpu/drm/i915/intel_lrc.h
@@ -104,4 +104,11 @@ void intel_lr_context_resume(struct drm_i915_private *dev_priv);
 
 void intel_execlists_set_default_submission(struct intel_engine_cs *engine);
 
+struct intel_context *
+intel_execlists_context_pin(struct intel_engine_cs *engine,
+			    struct i915_gem_context *ctx,
+			    struct intel_context *ce);
+void intel_execlists_context_unpin(struct intel_context *ce);
+void intel_execlists_context_destroy(struct intel_context *ce);
+
 #endif /* _INTEL_LRC_H_ */
diff --git a/drivers/gpu/drm/i915/selftests/intel_guc.c b/drivers/gpu/drm/i915/selftests/intel_guc.c
index bf27162fb327..eb4e8bbe8c82 100644
--- a/drivers/gpu/drm/i915/selftests/intel_guc.c
+++ b/drivers/gpu/drm/i915/selftests/intel_guc.c
@@ -301,6 +301,7 @@ static int igt_guc_doorbells(void *arg)
 	if (err)
 		goto unlock;
 
+	guc->starting_proxy_id = GUC_MAX_PPAL_STAGE_DESCRIPTORS - ATTEMPTS;
 	for (i = 0; i < ATTEMPTS; i++) {
 		clients[i] = guc_client_alloc(dev_priv,
 					      INTEL_INFO(dev_priv)->ring_mask,
@@ -334,7 +335,8 @@ static int igt_guc_doorbells(void *arg)
 		 * The check below is only valid because we keep a doorbell
 		 * assigned during the whole life of the client.
 		 */
-		if (clients[i]->stage_id >= GUC_NUM_DOORBELLS) {
+		if ((clients[i]->stage_id - guc->starting_proxy_id) >=
+		     GUC_NUM_DOORBELLS) {
 			pr_err("[%d] more clients than doorbells (%d >= %d)\n",
 			       i, clients[i]->stage_id, GUC_NUM_DOORBELLS);
 			err = -EINVAL;
-- 
2.19.0


* Re: [RFC] drm/i915/guc: New GuC stage descriptors
  2018-10-12 18:25     ` [RFC] " Daniele Ceraolo Spurio
@ 2018-10-17 18:42       ` Lis, Tomasz
  2018-10-18 21:07         ` Daniele Ceraolo Spurio
  0 siblings, 1 reply; 49+ messages in thread
From: Lis, Tomasz @ 2018-10-17 18:42 UTC (permalink / raw)
  To: Daniele Ceraolo Spurio, intel-gfx; +Cc: Michel Thierry



On 2018-10-12 20:25, Daniele Ceraolo Spurio wrote:
> With the new interface, GuC now requires every lrc to be registered in
> one of the stage descriptors, which have been re-designed so that each
> descriptor can store up to 64 lrc per class (i.e. equal to the possible
> SW counter values).
> Similarly to what happened with the previous legacy design, it is possible
> to have a single "proxy" descriptor that owns the workqueue and the
> doorbell and use it for all submission. To distinguish the proxy
> descriptors from the one used for lrc storage, the latter have been
> called "principal". A descriptor can't be both a proxy and a principal
> at the same time; to enforce this, since we only use 1 proxy descriptor
> per client, we reserve enough descriptors from the bottom of the pool to
> be used as proxy and leave the others as principals. For simplicity, we
> currently map context IDs 1:1 to principal descriptors, but we could
> have more contexts in flight if needed by using the SW counter.
Does this have any influence on the concept of "default context" used by 
UMDs?
Or is this completely separate?
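
For reference, a minimal sketch of the proxy/principal split described above. It assumes the proxy descriptors are the last GUC_MAX_CLIENT_IDS entries of the pool, which is where guc_client_alloc() starts its ID range in this patch. The constants are copied from the patch; the two helper names are hypothetical and do not exist in the driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Values as defined in the patch. */
#define GUC_MAX_STAGE_DESCRIPTORS	1024
#define GUC_MAX_CLIENT_IDS		2
#define GUC_MAX_PPAL_STAGE_DESCRIPTORS \
	(GUC_MAX_STAGE_DESCRIPTORS - GUC_MAX_CLIENT_IDS)

/* Hypothetical helper: indices [0, 1022) hold principal descriptors. */
static inline bool stage_index_is_principal(uint32_t index)
{
	return index < GUC_MAX_PPAL_STAGE_DESCRIPTORS;
}

/* Hypothetical helper: indices [1022, 1024) are reserved for proxies. */
static inline bool stage_index_is_proxy(uint32_t index)
{
	return index >= GUC_MAX_PPAL_STAGE_DESCRIPTORS &&
	       index < GUC_MAX_STAGE_DESCRIPTORS;
}
```

With a 1:1 mapping of context IDs to principal descriptors, this leaves 1022 usable context IDs when GuC submission is enabled.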
> Note that the lrcs need to be mapped in the principal descriptor until
> guc is done with them. This means that we can't release the HW id when
> the user app closes the ctx because it might still be in flight with GuC
> and that we need to be careful when unpinning because the fact that the
s/the//
> a request on the next context has completed doesn't mean that GuC is
> done processing the first one. See in-code comments for details.
>
> NOTE: GuC is not going to look at lrcs that are not in flight, so we
> could potentially skip the unpinning steps. However, the unpinning steps
> perform extra correctness checks, so better keep them until we're sure
> that the flow is solid.
>
> Based on an initial patch by Oscar Mateo.
>
> RFC: this will be sent as part of the updated series once we have
> the gen9 FW with the new interface, but since the flow is
> significantly different compared to the previous version I'd like
> to gather some early feedback to make sure there aren't any major
> issues.
>
> Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
> Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
> Cc: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Michel Thierry <michel.thierry@intel.com>
> Cc: Michal Winiarski <michal.winiarski@intel.com>
> Cc: Tomasz Lis <tomasz.lis@intel.com>
> ---
>   drivers/gpu/drm/i915/i915_debugfs.c         |  30 +-
>   drivers/gpu/drm/i915/i915_drv.h             |   5 +-
>   drivers/gpu/drm/i915/i915_gem_context.c     |   9 +-
>   drivers/gpu/drm/i915/intel_guc.h            |  14 +-
>   drivers/gpu/drm/i915/intel_guc_fwif.h       |  73 +++--
>   drivers/gpu/drm/i915/intel_guc_submission.c | 346 +++++++++++++++-----
>   drivers/gpu/drm/i915/intel_lrc.c            |  18 +-
>   drivers/gpu/drm/i915/intel_lrc.h            |   7 +
>   drivers/gpu/drm/i915/selftests/intel_guc.c  |   4 +-
>   9 files changed, 360 insertions(+), 146 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
> index 00c551d3e409..04bbde4a38a6 100644
> --- a/drivers/gpu/drm/i915/i915_debugfs.c
> +++ b/drivers/gpu/drm/i915/i915_debugfs.c
> @@ -2474,11 +2474,10 @@ static int i915_guc_stage_pool(struct seq_file *m, void *data)
>   
>   		seq_printf(m, "GuC stage descriptor %u:\n", index);
>   		seq_printf(m, "\tIndex: %u\n", desc->stage_id);
> +		seq_printf(m, "\tProxy Index: %u\n", desc->proxy_id);
>   		seq_printf(m, "\tAttribute: 0x%x\n", desc->attribute);
>   		seq_printf(m, "\tPriority: %d\n", desc->priority);
>   		seq_printf(m, "\tDoorbell id: %d\n", desc->db_id);
> -		seq_printf(m, "\tEngines used: 0x%x\n",
> -			   desc->engines_used);
>   		seq_printf(m, "\tDoorbell trigger phy: 0x%llx, cpu: 0x%llx, uK: 0x%x\n",
>   			   desc->db_trigger_phy,
>   			   desc->db_trigger_cpu,
> @@ -2490,18 +2489,21 @@ static int i915_guc_stage_pool(struct seq_file *m, void *data)
>   		seq_putc(m, '\n');
>   
>   		for_each_engine_masked(engine, dev_priv, client->engines, tmp) {
> -			u32 guc_engine_id = engine->guc_id;
> -			struct guc_execlist_context *lrc =
> -						&desc->lrc[guc_engine_id];
> -
> -			seq_printf(m, "\t%s LRC:\n", engine->name);
> -			seq_printf(m, "\t\tContext desc: 0x%x\n",
> -				   lrc->context_desc);
> -			seq_printf(m, "\t\tContext id: 0x%x\n", lrc->context_id);
> -			seq_printf(m, "\t\tLRCA: 0x%x\n", lrc->ring_lrca);
> -			seq_printf(m, "\t\tRing begin: 0x%x\n", lrc->ring_begin);
> -			seq_printf(m, "\t\tRing end: 0x%x\n", lrc->ring_end);
> -			seq_putc(m, '\n');
> +			u8 class = engine->class;
> +			u8 inst = engine->instance;
> +
> +			if (desc->lrc_alloc_map[class].bitmap & BIT(inst)) {
> +				struct guc_execlist_context *lrc =
> +							&desc->lrc[class][inst];
> +				seq_printf(m, "\t%s LRC:\n", engine->name);
> +				seq_printf(m, "\t\tHW context desc: 0x%x:0x%x\n",
> +						lower_32_bits(lrc->hw_context_desc),
> +						upper_32_bits(lrc->hw_context_desc));
> +				seq_printf(m, "\t\tLRC: 0x%x\n", lrc->ring_lrc);
> +				seq_printf(m, "\t\tRing begin: 0x%x\n", lrc->ring_begin);
> +				seq_printf(m, "\t\tRing end: 0x%x\n", lrc->ring_end);
> +				seq_putc(m, '\n');
> +			}
>   		}
>   	}
>   
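
A small sketch of the per-class allocation bitmap test that the debugfs loop above relies on, assuming one u64 bitmap per engine class with one bit per lrc slot (as in lrc_alloc_map). The helper names are hypothetical. The sketch uses an explicit 64-bit shift so that slots up to 63 are safe regardless of the width of unsigned long.

```c
#include <assert.h>
#include <stdint.h>

#define GUC_MAX_LRC_PER_CLASS	64	/* value from the patch */

/* Hypothetical helper: is the given lrc slot marked as allocated? */
static inline int lrc_slot_is_allocated(uint64_t bitmap, unsigned int slot)
{
	return slot < GUC_MAX_LRC_PER_CLASS && ((bitmap >> slot) & 1);
}

/* Hypothetical helper: number of lrc slots allocated in one class. */
static inline unsigned int lrc_bitmap_count(uint64_t bitmap)
{
	unsigned int slot, n = 0;

	for (slot = 0; slot < GUC_MAX_LRC_PER_CLASS; slot++)
		if (bitmap & (UINT64_C(1) << slot))
			n++;

	return n;
}
```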
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index bd76931987ef..ce095d57e050 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -1878,13 +1878,14 @@ struct drm_i915_private {
>   		 * (the SW Context ID field) but GuC limits it further so
>   		 * without taking advantage of part of the SW counter field the
>   		 * firmware only supports a max number of contexts equal to the
> -		 * number of entries in the GuC stage descriptor pool.
> +		 * number of entries in the GuC stage descriptor pool, minus
> +		 * the descriptors reserved for proxy usage
>   		 */
>   		struct ida hw_ida;
>   #define MAX_CONTEXT_HW_ID (1<<21) /* exclusive */
>   #define MAX_GUC_CONTEXT_HW_ID (1 << 20) /* exclusive */
>   #define GEN11_MAX_CONTEXT_HW_ID (1<<11) /* exclusive */
> -#define GEN11_MAX_CONTEXT_HW_ID_WITH_GUC GUC_MAX_STAGE_DESCRIPTORS
> +#define GEN11_MAX_CONTEXT_HW_ID_WITH_GUC GUC_MAX_PPAL_STAGE_DESCRIPTORS
>   		struct list_head hw_id_list;
>   	} contexts;
>   
> diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
> index 552d2e108de4..c239d9b9307c 100644
> --- a/drivers/gpu/drm/i915/i915_gem_context.c
> +++ b/drivers/gpu/drm/i915/i915_gem_context.c
> @@ -284,10 +284,15 @@ static void context_close(struct i915_gem_context *ctx)
>   	i915_gem_context_set_closed(ctx);
>   
>   	/*
> -	 * This context will never again be assinged to HW, so we can
> +	 * This context will never again be assigned to HW, so we can
>   	 * reuse its ID for the next context.
> +	 *
> +	 * If GuC is in use, we need to keep the ID until GuC has finished
> +	 * processing all submitted requests because the ID is used by the
> +	 * firmware to index the guc stage_desc pool.
>   	 */
> -	release_hw_id(ctx);
> +	if (!USES_GUC_SUBMISSION(ctx->i915))
> +		release_hw_id(ctx);
>   
>   	/*
>   	 * The LUT uses the VMA as a backpointer to unref the object,
> diff --git a/drivers/gpu/drm/i915/intel_guc.h b/drivers/gpu/drm/i915/intel_guc.h
> index 11b3882482f4..05ee44fb66af 100644
> --- a/drivers/gpu/drm/i915/intel_guc.h
> +++ b/drivers/gpu/drm/i915/intel_guc.h
> @@ -58,10 +58,14 @@ struct intel_guc {
>   	bool interrupts_enabled;
>   	unsigned int msg_enabled_mask;
>   
> +	struct ida client_ids;
> +#define GUC_MAX_CLIENT_IDS 2
> +
>   	struct i915_vma *ads_vma;
>   	struct i915_vma *stage_desc_pool;
>   	void *stage_desc_pool_vaddr;
> -	struct ida stage_ids;
> +#define	GUC_MAX_PPAL_STAGE_DESCRIPTORS (GUC_MAX_STAGE_DESCRIPTORS - GUC_MAX_CLIENT_IDS)
> +
>   	struct i915_vma *shared_data;
>   	void *shared_data_vaddr;
>   
> @@ -94,6 +98,14 @@ struct intel_guc {
>   
>   	/* GuC's FW specific notify function */
>   	void (*notify)(struct intel_guc *guc);
> +
> +	/*
> +	 * Override the first stage_desc to be used as proxy
> +	 * (Default: GUC_MAX_PPAL_STAGE_DESCRIPTORS). The max number of ppal
> +	 * descriptors is not updated accordingly since the test using this does
> +	 * not allocate any context.
> +	 */
> +	I915_SELFTEST_DECLARE(u32 starting_proxy_id);
>   };
>   
>   static inline bool intel_guc_is_alive(struct intel_guc *guc)
> diff --git a/drivers/gpu/drm/i915/intel_guc_fwif.h b/drivers/gpu/drm/i915/intel_guc_fwif.h
> index ce3ab6ed21d5..1a0f41a26173 100644
> --- a/drivers/gpu/drm/i915/intel_guc_fwif.h
> +++ b/drivers/gpu/drm/i915/intel_guc_fwif.h
> @@ -32,6 +32,8 @@
>   #define GUC_MAX_STAGE_DESCRIPTORS	1024
>   #define	GUC_INVALID_STAGE_ID		GUC_MAX_STAGE_DESCRIPTORS
>   
> +#define GUC_MAX_LRC_PER_CLASS		64
> +
>   #define GUC_RENDER_ENGINE		0
>   #define GUC_VIDEO_ENGINE		1
>   #define GUC_BLITTER_ENGINE		2
> @@ -66,9 +68,12 @@
>   #define GUC_DOORBELL_DISABLED		0
>   
>   #define GUC_STAGE_DESC_ATTR_ACTIVE	BIT(0)
> -#define GUC_STAGE_DESC_ATTR_PENDING_DB	BIT(1)
> -#define GUC_STAGE_DESC_ATTR_KERNEL	BIT(2)
> -#define GUC_STAGE_DESC_ATTR_PREEMPT	BIT(3)
> +#define GUC_STAGE_DESC_ATTR_TYPE_SHIFT	1
> +#define GUC_STAGE_DESC_ATTR_PRINCIPAL	(0x0 << GUC_STAGE_DESC_ATTR_TYPE_SHIFT)
> +#define GUC_STAGE_DESC_ATTR_PROXY	(0x1 << GUC_STAGE_DESC_ATTR_TYPE_SHIFT)
> +#define GUC_STAGE_DESC_ATTR_REAL	(0x2 << GUC_STAGE_DESC_ATTR_TYPE_SHIFT)
> +#define GUC_STAGE_DESC_ATTR_TYPE_MASK	(0x3 << GUC_STAGE_DESC_ATTR_TYPE_SHIFT)
> +#define GUC_STAGE_DESC_ATTR_KERNEL	(1 << 3)
>   #define GUC_STAGE_DESC_ATTR_RESET	BIT(4)
>   #define GUC_STAGE_DESC_ATTR_WQLOCKED	BIT(5)
>   #define GUC_STAGE_DESC_ATTR_PCH		BIT(6)
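
A sketch of how the new two-bit type field could be decoded, assuming the encoding shown in this hunk. Because GUC_STAGE_DESC_ATTR_PRINCIPAL is 0, a plain "attribute & PRINCIPAL" test can never be true, so the type has to be compared after masking. The helper name is hypothetical, and BIT() here is a local stand-in for the kernel macro.

```c
#include <assert.h>
#include <stdint.h>

#define BIT(n)				(1u << (n))	/* stand-in for the kernel macro */

/* Encoding copied from the patch. */
#define GUC_STAGE_DESC_ATTR_ACTIVE	BIT(0)
#define GUC_STAGE_DESC_ATTR_TYPE_SHIFT	1
#define GUC_STAGE_DESC_ATTR_PRINCIPAL	(0x0 << GUC_STAGE_DESC_ATTR_TYPE_SHIFT)
#define GUC_STAGE_DESC_ATTR_PROXY	(0x1 << GUC_STAGE_DESC_ATTR_TYPE_SHIFT)
#define GUC_STAGE_DESC_ATTR_REAL	(0x2 << GUC_STAGE_DESC_ATTR_TYPE_SHIFT)
#define GUC_STAGE_DESC_ATTR_TYPE_MASK	(0x3 << GUC_STAGE_DESC_ATTR_TYPE_SHIFT)
#define GUC_STAGE_DESC_ATTR_KERNEL	(1 << 3)

/* Hypothetical helper: extract the descriptor type from the attribute. */
static inline uint32_t stage_desc_type(uint32_t attribute)
{
	return attribute & GUC_STAGE_DESC_ATTR_TYPE_MASK;
}
```

A caller would then write stage_desc_type(desc->attribute) == GUC_STAGE_DESC_ATTR_PRINCIPAL rather than testing the bit directly.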
> @@ -277,9 +282,10 @@ struct guc_process_desc {
>   	u64 wq_base_addr;
>   	u32 wq_size_bytes;
>   	u32 wq_status;
> -	u32 engine_presence;
>   	u32 priority;
> -	u32 reserved[30];
> +	u32 token;
> +	u32 queue_engine_error;
> +	u32 reserved[23];
>   } __packed;
>   
>   /* engine id and context id is packed into guc_execlist_context.context_id*/
> @@ -288,18 +294,20 @@ struct guc_process_desc {
>   
>   /* The execlist context including software and HW information */
>   struct guc_execlist_context {
> -	u32 context_desc;
> -	u32 context_id;
> -	u32 ring_status;
> -	u32 ring_lrca;
> +	u64 hw_context_desc;
> +	u32 reserved0;
> +	u32 ring_lrc;
>   	u32 ring_begin;
>   	u32 ring_end;
>   	u32 ring_next_free_location;
>   	u32 ring_current_tail_pointer_value;
> -	u8 engine_state_submit_value;
> -	u8 engine_state_wait_value;
> -	u16 pagefault_count;
> -	u16 engine_submit_queue_count;
> +	u32 engine_state_wait_value;
> +	u32 state_reserved;
> +	u32 is_present_in_sq;
> +	u32 sync_value;
> +	u32 sync_addr;
> +	u32 slpc_hints;
> +	u32 reserved1[4];
>   } __packed;
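
Since this struct is shared with the firmware, a compile-time layout check might be worth considering. The sketch below copies the new field list from this hunk into a userspace struct (the _sketch suffix marks it as hypothetical) and pins its size and one offset. The numbers 72 and 40 are simply the sums of the field sizes shown above, not values taken from any firmware specification. __attribute__((packed)) is the GCC/Clang spelling of __packed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace copy of the new layout, for illustration only. */
struct guc_execlist_context_sketch {
	uint64_t hw_context_desc;
	uint32_t reserved0;
	uint32_t ring_lrc;
	uint32_t ring_begin;
	uint32_t ring_end;
	uint32_t ring_next_free_location;
	uint32_t ring_current_tail_pointer_value;
	uint32_t engine_state_wait_value;
	uint32_t state_reserved;
	uint32_t is_present_in_sq;
	uint32_t sync_value;
	uint32_t sync_addr;
	uint32_t slpc_hints;
	uint32_t reserved1[4];
} __attribute__((packed));

/* 8 bytes + 12 * 4 bytes + 16 bytes = 72 bytes. */
_Static_assert(sizeof(struct guc_execlist_context_sketch) == 72,
	       "unexpected guc_execlist_context size");
/* is_present_in_sq follows one u64 and eight u32 fields: 8 + 32 = 40. */
_Static_assert(offsetof(struct guc_execlist_context_sketch,
			is_present_in_sq) == 40,
	       "unexpected is_present_in_sq offset");
```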
>   
>   /*
> @@ -312,36 +320,33 @@ struct guc_execlist_context {
>    * with the GuC, being allocated before the GuC is loaded with its firmware.
>    */
>   struct guc_stage_desc {
> -	u32 sched_common_area;
> +	u64 desc_private;
>   	u32 stage_id;
> -	u32 pas_id;
> -	u8 engines_used;
> +	u32 proxy_id;
>   	u64 db_trigger_cpu;
>   	u32 db_trigger_uk;
>   	u64 db_trigger_phy;
> -	u16 db_id;
> -
> -	struct guc_execlist_context lrc[GUC_MAX_ENGINES_NUM];
> -
> -	u8 attribute;
> -
> -	u32 priority;
> -
> +	u32 db_id;
> +	struct guc_execlist_context lrc[GUC_MAX_ENGINE_CLASSES][GUC_MAX_LRC_PER_CLASS];
> +	struct {
> +		u64 bitmap;
> +		u32 reserved0;
> +	} __packed lrc_alloc_map[GUC_MAX_ENGINE_CLASSES];
> +	u32 lrc_count;
> +	u32 max_lrc_per_class;
> +	u32 attribute; /* GUC_STAGE_DESC_ATTR_xxx */
> +	u32 priority; /* GUC_CLIENT_PRIORITY_xxx */
>   	u32 wq_sampled_tail_offset;
>   	u32 wq_total_submit_enqueues;
> -
>   	u32 process_desc;
>   	u32 wq_addr;
>   	u32 wq_size;
> -
> -	u32 engine_presence;
> -
> -	u8 engine_suspended;
> -
> -	u8 reserved0[3];
> -	u64 reserved1[1];
> -
> -	u64 desc_private;
> +	u32 feature0;
> +	u32 feature1;
> +	u32 feature2;
> +	u32 queue_engine_error;
> +	u32 reserved[2];
> +	u64 reserved3[12];
>   } __packed;
>   
>   /**
> diff --git a/drivers/gpu/drm/i915/intel_guc_submission.c b/drivers/gpu/drm/i915/intel_guc_submission.c
> index eae668442ebe..9bf8ebbc4de1 100644
> --- a/drivers/gpu/drm/i915/intel_guc_submission.c
> +++ b/drivers/gpu/drm/i915/intel_guc_submission.c
> @@ -46,17 +46,22 @@
>    * that contains all required pages for these elements).
>    *
>    * GuC stage descriptor:
> - * During initialization, the driver allocates a static pool of 1024 such
> - * descriptors, and shares them with the GuC.
> - * Currently, there exists a 1:1 mapping between a intel_guc_client and a
> - * guc_stage_desc (via the client's stage_id), so effectively only one
> - * gets used. This stage descriptor lets the GuC know about the doorbell,
> - * workqueue and process descriptor. Theoretically, it also lets the GuC
> - * know about our HW contexts (context ID, etc...), but we actually
> - * employ a kind of submission where the GuC uses the LRCA sent via the work
> - * item instead (the single guc_stage_desc associated to execbuf client
> - * contains information about the default kernel context only, but this is
> - * essentially unused). This is called a "proxy" submission.
> + * During initialization, the driver allocates a static pool of descriptors
> + * and shares them with the GuC. This stage descriptor lets the GuC know about
This sentence never defines "this stage descriptor". Maybe something like:
"Each entry at the beginning of the pool represents one guc_stage_desc. These
stage descriptors..."
> + * the doorbell, workqueue and process descriptor; additionally, it stores
> + * information about all possible HW contexts that use it (64 x number of
> + * engine classes of guc_execlist_context structs).
> + *
> + * The idea is that every direct-submission GuC client gets one SW Context ID
> + * and every HW context created by that client gets one SW Counter. The "SW
> + * Context ID" and "SW Counter" to use now get passed on every work queue item.
> + *
> + * But we don't have direct submission yet: does that mean we are limited to 64
> + * contexts in total (one client)? Not really: we can use extra GuC context
> + * descriptors to store more HW contexts. They are special in that they don't
> + * have their own work queue, doorbell or process descriptor. Instead, these
> + * "principal" GuC context descriptors use the one that belongs to the client
> + * as a "proxy" for submission (a generalization of the old proxy submission).
>    *
>    * The Scratch registers:
>    * There are 16 MMIO-based registers start from 0xC180. The kernel driver writes
> @@ -164,11 +169,28 @@ static int __guc_deallocate_doorbell(struct intel_guc *guc, u32 stage_id)
>   	return intel_guc_send(guc, action, ARRAY_SIZE(action));
>   }
>   
> -static struct guc_stage_desc *__get_stage_desc(struct intel_guc_client *client)
> +static struct guc_stage_desc *__get_stage_desc(struct intel_guc *guc, u32 index)
> +{
> +	struct guc_stage_desc *base = guc->stage_desc_pool_vaddr;
> +
> +	GEM_BUG_ON(!USES_GUC_SUBMISSION(guc_to_i915(guc)));
> +	GEM_BUG_ON(index >= GUC_MAX_STAGE_DESCRIPTORS);
> +
> +	return &base[index];
> +}
> +
> +static struct guc_stage_desc *__get_proxy_stage_desc(struct intel_guc_client *client)
>   {
> -	struct guc_stage_desc *base = client->guc->stage_desc_pool_vaddr;
> +	GEM_BUG_ON(!I915_SELFTEST_ONLY(client->guc->starting_proxy_id) &&
> +			client->stage_id < GUC_MAX_PPAL_STAGE_DESCRIPTORS);
> +	return __get_stage_desc(client->guc, client->stage_id);
> +}
>   
> -	return &base[client->stage_id];
> +static struct guc_stage_desc *__get_ppal_stage_desc(struct intel_guc *guc,
> +						    u32 index)
> +{
> +	GEM_BUG_ON(index >= GUC_MAX_PPAL_STAGE_DESCRIPTORS);
> +	return __get_stage_desc(guc, index);
>   }
>   
>   /*
> @@ -183,7 +205,7 @@ static void __update_doorbell_desc(struct intel_guc_client *client, u16 new_id)
>   	struct guc_stage_desc *desc;
>   
>   	/* Update the GuC's idea of the doorbell ID */
> -	desc = __get_stage_desc(client);
> +	desc = __get_proxy_stage_desc(client);
>   	desc->db_id = new_id;
>   }
>   
> @@ -329,14 +351,12 @@ static int guc_stage_desc_pool_create(struct intel_guc *guc)
>   
>   	guc->stage_desc_pool = vma;
>   	guc->stage_desc_pool_vaddr = vaddr;
> -	ida_init(&guc->stage_ids);
>   
>   	return 0;
>   }
>   
>   static void guc_stage_desc_pool_destroy(struct intel_guc *guc)
>   {
> -	ida_destroy(&guc->stage_ids);
>   	i915_vma_unpin_and_release(&guc->stage_desc_pool, I915_VMA_RELEASE_MAP);
>   }
>   
> @@ -347,78 +367,26 @@ static void guc_stage_desc_pool_destroy(struct intel_guc *guc)
>    * data structures relating to this client (doorbell, process descriptor,
>    * write queue, etc).
>    */
> -static void guc_stage_desc_init(struct intel_guc_client *client)
> +static void guc_proxy_stage_desc_init(struct intel_guc_client *client)
>   {
> -	struct intel_guc *guc = client->guc;
> -	struct drm_i915_private *dev_priv = guc_to_i915(guc);
> -	struct intel_engine_cs *engine;
> -	struct i915_gem_context *ctx = client->owner;
>   	struct guc_stage_desc *desc;
> -	unsigned int tmp;
>   	u32 gfx_addr;
>   
> -	desc = __get_stage_desc(client);
> +	desc = __get_proxy_stage_desc(client);
>   	memset(desc, 0, sizeof(*desc));
>   
>   	desc->attribute = GUC_STAGE_DESC_ATTR_ACTIVE |
> +			  GUC_STAGE_DESC_ATTR_PROXY |
>   			  GUC_STAGE_DESC_ATTR_KERNEL;
> -	if (is_high_priority(client))
> -		desc->attribute |= GUC_STAGE_DESC_ATTR_PREEMPT;
>   	desc->stage_id = client->stage_id;
>   	desc->priority = client->priority;
>   	desc->db_id = client->doorbell_id;
>   
> -	for_each_engine_masked(engine, dev_priv, client->engines, tmp) {
> -		struct intel_context *ce = to_intel_context(ctx, engine);
> -		u32 guc_engine_id = engine->guc_id;
> -		struct guc_execlist_context *lrc = &desc->lrc[guc_engine_id];
> -
> -		/* TODO: We have a design issue to be solved here. Only when we
> -		 * receive the first batch, we know which engine is used by the
> -		 * user. But here GuC expects the lrc and ring to be pinned. It
> -		 * is not an issue for default context, which is the only one
> -		 * for now who owns a GuC client. But for future owner of GuC
> -		 * client, need to make sure lrc is pinned prior to enter here.
> -		 */
> -		if (!ce->state)
> -			break;	/* XXX: continue? */
> -
> -		/*
> -		 * XXX: When this is a GUC_STAGE_DESC_ATTR_KERNEL client (proxy
> -		 * submission or, in other words, not using a direct submission
> -		 * model) the KMD's LRCA is not used for any work submission.
> -		 * Instead, the GuC uses the LRCA of the user mode context (see
> -		 * guc_add_request below).
> -		 */
> -		lrc->context_desc = lower_32_bits(ce->lrc_desc);
> -
> -		/* The state page is after PPHWSP */
> -		lrc->ring_lrca = intel_guc_ggtt_offset(guc, ce->state) +
> -				 LRC_STATE_PN * PAGE_SIZE;
> -
> -		/* XXX: In direct submission, the GuC wants the HW context id
> -		 * here. In proxy submission, it wants the stage id
> -		 */
> -		lrc->context_id = (client->stage_id << GUC_ELC_CTXID_OFFSET) |
> -				(guc_engine_id << GUC_ELC_ENGINE_OFFSET);
> -
> -		lrc->ring_begin = intel_guc_ggtt_offset(guc, ce->ring->vma);
> -		lrc->ring_end = lrc->ring_begin + ce->ring->size - 1;
> -		lrc->ring_next_free_location = lrc->ring_begin;
> -		lrc->ring_current_tail_pointer_value = 0;
> -
> -		desc->engines_used |= (1 << guc_engine_id);
> -	}
> -
> -	DRM_DEBUG_DRIVER("Host engines 0x%x => GuC engines used 0x%x\n",
> -			 client->engines, desc->engines_used);
> -	WARN_ON(desc->engines_used == 0);
> -
>   	/*
>   	 * The doorbell, process descriptor, and workqueue are all parts
>   	 * of the client object, which the GuC will reference via the GGTT
>   	 */
> -	gfx_addr = intel_guc_ggtt_offset(guc, client->vma);
> +	gfx_addr = intel_guc_ggtt_offset(client->guc, client->vma);
>   	desc->db_trigger_phy = sg_dma_address(client->vma->pages->sgl) +
>   				client->doorbell_offset;
>   	desc->db_trigger_cpu = ptr_to_u64(__get_doorbell(client));
> @@ -430,11 +398,11 @@ static void guc_stage_desc_init(struct intel_guc_client *client)
>   	desc->desc_private = ptr_to_u64(client);
>   }
>   
> -static void guc_stage_desc_fini(struct intel_guc_client *client)
> +static void guc_proxy_stage_desc_fini(struct intel_guc_client *client)
>   {
>   	struct guc_stage_desc *desc;
>   
> -	desc = __get_stage_desc(client);
> +	desc = __get_proxy_stage_desc(client);
>   	memset(desc, 0, sizeof(*desc));
>   }
>   
> @@ -553,7 +521,7 @@ static void inject_preempt_context(struct work_struct *work)
>   	struct intel_guc *guc = container_of(preempt_work, typeof(*guc),
>   					     preempt_work[engine->id]);
>   	struct intel_guc_client *client = guc->preempt_client;
> -	struct guc_stage_desc *stage_desc = __get_stage_desc(client);
> +	struct guc_stage_desc *stage_desc = __get_proxy_stage_desc(client);
>   	struct intel_context *ce = to_intel_context(client->owner, engine);
>   	u32 data[7];
>   
> @@ -919,6 +887,7 @@ guc_client_alloc(struct drm_i915_private *dev_priv,
>   	struct i915_vma *vma;
>   	void *vaddr;
>   	int ret;
> +	u32 starting_id;
>   
>   	client = kzalloc(sizeof(*client), GFP_KERNEL);
>   	if (!client)
> @@ -931,8 +900,11 @@ guc_client_alloc(struct drm_i915_private *dev_priv,
>   	client->doorbell_id = GUC_DOORBELL_INVALID;
>   	spin_lock_init(&client->wq_lock);
>   
> -	ret = ida_simple_get(&guc->stage_ids, 0, GUC_MAX_STAGE_DESCRIPTORS,
> -			     GFP_KERNEL);
> +	if (!I915_SELFTEST_ONLY(starting_id = guc->starting_proxy_id))
> +		starting_id = GUC_MAX_PPAL_STAGE_DESCRIPTORS;
> +
> +	ret = ida_simple_get(&guc->client_ids, starting_id,
> +			     GUC_MAX_STAGE_DESCRIPTORS, GFP_KERNEL);
>   	if (ret < 0)
>   		goto err_client;
>   
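
A userspace sketch of the ranged ID allocation used above, assuming the [start, end) semantics of ida_simple_get(): the lowest free ID in the range is returned, or -ENOSPC when the range is exhausted. The id_pool type and both functions are hypothetical stand-ins and are not the kernel IDA implementation.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

#define POOL_SIZE	1024	/* GUC_MAX_STAGE_DESCRIPTORS in the patch */

/* Hypothetical stand-in for struct ida: one flag per ID. */
struct id_pool {
	bool used[POOL_SIZE];
};

/* Return the lowest free ID in [start, end), or -ENOSPC if none is left. */
static inline int id_pool_get(struct id_pool *pool,
			      unsigned int start, unsigned int end)
{
	unsigned int id;

	if (end > POOL_SIZE)
		end = POOL_SIZE;

	for (id = start; id < end; id++) {
		if (!pool->used[id]) {
			pool->used[id] = true;
			return (int)id;
		}
	}

	return -ENOSPC;
}

/* Release a previously allocated ID. */
static inline void id_pool_put(struct id_pool *pool, unsigned int id)
{
	assert(id < POOL_SIZE);
	pool->used[id] = false;
}
```

With start = 1022 and end = 1024 this hands out exactly two proxy IDs, matching GUC_MAX_CLIENT_IDS.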
> @@ -983,7 +955,7 @@ guc_client_alloc(struct drm_i915_private *dev_priv,
>   err_vma:
>   	i915_vma_unpin_and_release(&client->vma, 0);
>   err_id:
> -	ida_simple_remove(&guc->stage_ids, client->stage_id);
> +	ida_simple_remove(&guc->client_ids, client->stage_id);
>   err_client:
>   	kfree(client);
>   	return ERR_PTR(ret);
> @@ -993,7 +965,7 @@ static void guc_client_free(struct intel_guc_client *client)
>   {
>   	unreserve_doorbell(client);
>   	i915_vma_unpin_and_release(&client->vma, I915_VMA_RELEASE_MAP);
> -	ida_simple_remove(&client->guc->stage_ids, client->stage_id);
> +	ida_simple_remove(&client->guc->client_ids, client->stage_id);
>   	kfree(client);
>   }
>   
> @@ -1063,7 +1035,7 @@ static int __guc_client_enable(struct intel_guc_client *client)
>   	int ret;
>   
>   	guc_proc_desc_init(client);
> -	guc_stage_desc_init(client);
> +	guc_proxy_stage_desc_init(client);
>   
>   	ret = create_doorbell(client);
>   	if (ret)
> @@ -1072,7 +1044,7 @@ static int __guc_client_enable(struct intel_guc_client *client)
>   	return 0;
>   
>   fail:
> -	guc_stage_desc_fini(client);
> +	guc_proxy_stage_desc_fini(client);
>   	guc_proc_desc_fini(client);
>   	return ret;
>   }
> @@ -1089,7 +1061,7 @@ static void __guc_client_disable(struct intel_guc_client *client)
>   	else
>   		__destroy_doorbell(client);
>   
> -	guc_stage_desc_fini(client);
> +	guc_proxy_stage_desc_fini(client);
>   	guc_proc_desc_fini(client);
>   }
>   
> @@ -1145,6 +1117,9 @@ int intel_guc_submission_init(struct intel_guc *guc)
>   	GEM_BUG_ON(!guc->stage_desc_pool);
>   
>   	WARN_ON(!guc_verify_doorbells(guc));
> +
> +	ida_init(&guc->client_ids);
> +
>   	ret = guc_clients_create(guc);
>   	if (ret)
>   		goto err_pool;
> @@ -1157,6 +1132,7 @@ int intel_guc_submission_init(struct intel_guc *guc)
>   	return 0;
>   
>   err_pool:
> +	ida_destroy(&guc->client_ids);
>   	guc_stage_desc_pool_destroy(guc);
>   	return ret;
>   }
> @@ -1173,6 +1149,8 @@ void intel_guc_submission_fini(struct intel_guc *guc)
>   	guc_clients_destroy(guc);
>   	WARN_ON(!guc_verify_doorbells(guc));
>   
> +	ida_destroy(&guc->client_ids);
> +
>   	if (guc->stage_desc_pool)
>   		guc_stage_desc_pool_destroy(guc);
>   }
> @@ -1257,6 +1235,203 @@ static void guc_submission_unpark(struct intel_engine_cs *engine)
>   	intel_engine_pin_breadcrumbs_irq(engine);
>   }
>   
> +static void guc_map_gem_ctx_to_ppal_stage(struct intel_guc *guc,
> +					  struct guc_stage_desc *desc,
> +					  u32 id)
> +{
> +	GEM_BUG_ON(desc->attribute & GUC_STAGE_DESC_ATTR_ACTIVE);
> +
> +	desc->attribute = GUC_STAGE_DESC_ATTR_ACTIVE |
> +			  GUC_STAGE_DESC_ATTR_PRINCIPAL |
> +			  GUC_STAGE_DESC_ATTR_KERNEL;
> +	desc->stage_id = id;
> +
> +	/* all ppal contexts will be submitted through the execbuf client */
> +	desc->proxy_id = guc->execbuf_client->stage_id;
> +
> +	/*
> +	 * max_lrc_per_class is used in GuC to cut short loops over the
> +	 * lrc_bitmap when only a small amount of lrcs are used. We could
> +	 * recalculate this value every time an lrc is added or removed, but
> +	 * given the fact that we only have a max number of lrcs per stage_desc
> +	 * equal to the max number of instances of a class (because we map
> +	 * gem_context 1:1 with stage_desc) and that the GuC loops only in
> +	 * specific cases, redoing the calculation each time doesn't give us a
> +	 * big benefit for the cost so we can just use a static value.
> +	 */
> +	desc->max_lrc_per_class = MAX_ENGINE_INSTANCE + 1;
> +}
> +
> +static void guc_unmap_gem_ctx_from_ppal_stage(struct intel_guc *guc,
> +					      struct guc_stage_desc *desc)
> +{
> +	GEM_BUG_ON(!(desc->attribute & GUC_STAGE_DESC_ATTR_ACTIVE));
> +	GEM_BUG_ON(desc->lrc_count > 0);
> +
> +	memset(desc, 0, sizeof(*desc));
> +}
> +
> +static inline void guc_ppal_stage_lrc_pin(struct intel_engine_cs *engine,
> +					  struct i915_gem_context *ctx,
> +					  struct intel_context *ce)
> +{
> +	struct intel_guc *guc = &ctx->i915->guc;
> +	struct guc_stage_desc *desc;
> +	struct guc_execlist_context *lrc;
> +	u8 guc_class = engine->class;
> +
> +	/* 1:1 gem_context to ppal mapping */
> +	GEM_BUG_ON(ce->sw_counter > MAX_ENGINE_INSTANCE);
> +
> +	desc = __get_ppal_stage_desc(guc, ce->sw_context_id);
> +	GEM_BUG_ON(desc->lrc_alloc_map[guc_class].bitmap & BIT(ce->sw_counter));
> +
> +	if (!desc->lrc_count++)
> +		guc_map_gem_ctx_to_ppal_stage(guc, desc, ce->sw_context_id);
> +
> +	lrc = &desc->lrc[guc_class][ce->sw_counter];
> +	lrc->hw_context_desc = ce->lrc_desc;
> +	lrc->ring_lrc = intel_guc_ggtt_offset(guc, ce->state) +
> +			LRC_STATE_PN * PAGE_SIZE;
> +	lrc->ring_begin = intel_guc_ggtt_offset(guc, ce->ring->vma);
> +	lrc->ring_end = lrc->ring_begin + ce->ring->size - 1;
> +
> +	desc->lrc_alloc_map[guc_class].bitmap |= BIT(ce->sw_counter);
> +}
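
A reduced sketch of the counting scheme used by the pin path, assuming a single engine class: the first pinned lrc activates the descriptor, each lrc sets its bit in the allocation bitmap, and the last unpin clears the whole descriptor. The struct and function names are hypothetical; assert() stands in for GEM_BUG_ON().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, heavily reduced principal descriptor (one class only). */
struct ppal_desc_sketch {
	bool active;
	uint32_t lrc_count;
	uint64_t bitmap;
};

static inline void sketch_lrc_pin(struct ppal_desc_sketch *d, unsigned int slot)
{
	assert(slot < 64);
	assert(!(d->bitmap & (UINT64_C(1) << slot)));	/* not already mapped */

	if (!d->lrc_count++)
		d->active = true;	/* first lrc: activate the descriptor */

	d->bitmap |= UINT64_C(1) << slot;
}

static inline void sketch_lrc_unpin(struct ppal_desc_sketch *d, unsigned int slot)
{
	assert(slot < 64);
	assert(d->bitmap & (UINT64_C(1) << slot));	/* must be mapped */

	d->bitmap &= ~(UINT64_C(1) << slot);

	if (!--d->lrc_count)
		memset(d, 0, sizeof(*d));	/* last lrc: clear the descriptor */
}
```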
> +
> +static inline void guc_ppal_stage_lrc_unpin(struct intel_context *ce)
> +{
> +	struct i915_gem_context *ctx = ce->gem_context;
> +	struct intel_guc *guc = &ctx->i915->guc;
> +	struct intel_engine_cs *engine = ctx->i915->engine[ce - ctx->__engine];
> +	struct guc_stage_desc *desc;
> +	struct guc_execlist_context *lrc;
> +	u8 guc_class = engine->class;
> +
> +	desc = __get_ppal_stage_desc(guc, ce->sw_context_id);
> +	GEM_BUG_ON(!(desc->lrc_alloc_map[guc_class].bitmap & BIT(ce->sw_counter)));
> +
> +	lrc = &desc->lrc[guc_class][ce->sw_counter];
> +
> +	/*
> +	 * GuC needs us to keep the lrc mapped until it has finished processing
> +	 * the ctx switch interrupt. When executing nop or very small workloads
> +	 * it is possible (but quite unlikely) that 2 contexts on different
> +	 * ELSPs of the same engine complete before the GuC manages to process
> +	 * the interrupt for the first completion. Experiments show this happens
> +	 * for ~0.2% of contexts when executing nop workloads on different
> +	 * contexts back to back on the same engine. When submitting nop
> +	 * workloads on all engines at the same time the hit-rate goes up to
> +	 * ~0.7%. In all the observed cases GuC required < 100us to catch up,
> +	 * with the single engine case being always below 20us.
> +	 *
> +	 * The completion of the request on the second lrc will reduce our
> +	 * pin_count on the first lrc to zero, thus triggering a call to this
> +	 * function potentially before GuC has had time to process the
> +	 * interrupt. To avoid this, we could get an extra pin on the context or
> +	 * delay the unpin when guc is in use, but given that the issue is
> +	 * limited to pathological scenarios and has very low hit rate even
> +	 * there, we can just introduce a small delay when it happens to give
> +	 * time to GuC to catch up. Also to be noted that since the requests
> +	 * have completed on the HW we've most likely already sent GuC the next
> +	 * contexts to be executed, so it is unlikely that by waiting we'll add
> +	 * bubbles in the HW execution.
> +	 */
> +	WARN_ON(wait_for_us(lrc->is_present_in_sq == 0, 1000));
> +
> +	desc->lrc_alloc_map[guc_class].bitmap &= ~BIT(ce->sw_counter);
> +	memset(lrc, 0, sizeof(*lrc));
> +
> +	if (!--desc->lrc_count)
> +		guc_unmap_gem_ctx_from_ppal_stage(guc, desc);
> +}
> +
> +static inline void guc_init_lrc_mapping(struct intel_guc *guc)
> +{
> +	struct drm_i915_private *i915 = guc_to_i915(guc);
> +	struct intel_engine_cs *engine;
> +	struct i915_gem_context *ctx;
> +	struct intel_context *ce;
> +	enum intel_engine_id id;
> +
> +	/*
> +	 * Some contexts (e.g. kernel_context) might have been pinned before we
> +	 * enabled GuC submission, so we need to add them to the GuC bookkeeping.
> +	 * Also, after a GuC reset we want to make sure that the information
> +	 * shared with GuC is properly reset.
> +	 *
> +	 * NOTE: the code below assumes 1:1 mapping between ppal descriptors and
> +	 * gem contexts for simplicity.
> +	 */
> +	list_for_each_entry(ctx, &i915->contexts.list, link) {
> +		if (atomic_read(&ctx->hw_id_pin_count)) {
> +			struct guc_stage_desc *desc;
> +
> +			/* make sure the descriptor is clean... */
> +			GEM_BUG_ON(ctx->hw_id > GUC_MAX_PPAL_STAGE_DESCRIPTORS);
> +			desc = __get_ppal_stage_desc(guc, ctx->hw_id);
> +			memset(desc, 0, sizeof(*desc));
> +
> +			/* ...and then (re-)pin all the lrcs */
> +			for_each_engine(engine, i915, id) {
> +				ce = to_intel_context(ctx, engine);
> +				if (ce->pin_count > 0)
> +					guc_ppal_stage_lrc_pin(engine, ctx, ce);
> +			}
> +		}
> +	}
> +}
> +
> +static inline void guc_fini_lrc_mapping(struct intel_guc *guc)
> +{
> +	struct drm_i915_private *i915 = guc_to_i915(guc);
> +	struct intel_engine_cs *engine;
> +	struct i915_gem_context *ctx;
> +	struct intel_context *ce;
> +	enum intel_engine_id id;
> +
> +	list_for_each_entry(ctx, &i915->contexts.list, link) {
> +		if (atomic_read(&ctx->hw_id_pin_count)) {
> +			for_each_engine(engine, i915, id) {
> +				ce = to_intel_context(ctx, engine);
> +				if (ce->pin_count > 0)
> +					guc_ppal_stage_lrc_unpin(ce);
> +			}
> +		}
> +	}
> +}
> +
> +static void guc_context_unpin(struct intel_context *ce)
> +{
> +	guc_ppal_stage_lrc_unpin(ce);
> +	intel_execlists_context_unpin(ce);
> +}
> +
> +static const struct intel_context_ops guc_context_ops = {
> +	.unpin = guc_context_unpin,
> +	.destroy = intel_execlists_context_destroy,
> +};
> +
> +static struct intel_context *guc_context_pin(struct intel_engine_cs *engine,
> +					     struct i915_gem_context *ctx)
> +{
> +	struct intel_context *ce = to_intel_context(ctx, engine);
> +
> +	lockdep_assert_held(&ctx->i915->drm.struct_mutex);
> +
> +	if (likely(ce->pin_count++))
> +		return ce;
> +	GEM_BUG_ON(!ce->pin_count); /* no overflow please! */
> +
> +	ce->ops = &guc_context_ops;
> +
> +	ce = intel_execlists_context_pin(engine, ctx, ce);
> +	if (!IS_ERR(ce))
> +		guc_ppal_stage_lrc_pin(engine, ctx, ce);
> +
> +	return ce;
> +}
> +
>   static void guc_set_default_submission(struct intel_engine_cs *engine)
>   {
>   	/*
> @@ -1274,6 +1449,8 @@ static void guc_set_default_submission(struct intel_engine_cs *engine)
>   
>   	engine->execlists.tasklet.func = guc_submission_tasklet;
>   
> +	engine->context_pin = guc_context_pin;
> +
>   	engine->park = guc_submission_park;
>   	engine->unpark = guc_submission_unpark;
>   
> @@ -1320,6 +1497,8 @@ int intel_guc_submission_enable(struct intel_guc *guc)
>   		engine->set_default_submission(engine);
>   	}
>   
> +	guc_init_lrc_mapping(guc);
> +
>   	return 0;
>   }
>   
> @@ -1329,6 +1508,7 @@ void intel_guc_submission_disable(struct intel_guc *guc)
>   
>   	GEM_BUG_ON(dev_priv->gt.awake); /* GT should be parked first */
>   
> +	guc_fini_lrc_mapping(guc);
>   	guc_interrupts_release(dev_priv);
>   	guc_clients_disable(guc);
>   }
> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
> index 48e0cdf42221..444bc83554c5 100644
> --- a/drivers/gpu/drm/i915/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/intel_lrc.c
> @@ -1071,7 +1071,7 @@ static void execlists_submit_request(struct i915_request *request)
>   	spin_unlock_irqrestore(&engine->timeline.lock, flags);
>   }
>   
> -static void execlists_context_destroy(struct intel_context *ce)
> +void intel_execlists_context_destroy(struct intel_context *ce)
>   {
>   	GEM_BUG_ON(ce->pin_count);
>   
> @@ -1084,7 +1084,7 @@ static void execlists_context_destroy(struct intel_context *ce)
>   	i915_gem_object_put(ce->state->obj);
>   }
>   
> -static void execlists_context_unpin(struct intel_context *ce)
> +void intel_execlists_context_unpin(struct intel_context *ce)
>   {
>   	struct intel_engine_cs *engine;
>   
> @@ -1141,10 +1141,10 @@ static int __context_pin(struct i915_gem_context *ctx, struct i915_vma *vma)
>   	return i915_vma_pin(vma, 0, 0, flags);
>   }
>   
> -static struct intel_context *
> -__execlists_context_pin(struct intel_engine_cs *engine,
> -			struct i915_gem_context *ctx,
> -			struct intel_context *ce)
> +struct intel_context *
> +intel_execlists_context_pin(struct intel_engine_cs *engine,
> +			    struct i915_gem_context *ctx,
> +			    struct intel_context *ce)
>   {
>   	void *vaddr;
>   	int ret;
> @@ -1205,8 +1205,8 @@ __execlists_context_pin(struct intel_engine_cs *engine,
>   }
>   
>   static const struct intel_context_ops execlists_context_ops = {
> -	.unpin = execlists_context_unpin,
> -	.destroy = execlists_context_destroy,
> +	.unpin = intel_execlists_context_unpin,
> +	.destroy = intel_execlists_context_destroy,
>   };
>   
>   static struct intel_context *
> @@ -1224,7 +1224,7 @@ execlists_context_pin(struct intel_engine_cs *engine,
>   
>   	ce->ops = &execlists_context_ops;
>   
> -	return __execlists_context_pin(engine, ctx, ce);
> +	return intel_execlists_context_pin(engine, ctx, ce);
>   }
>   
>   static int execlists_request_alloc(struct i915_request *request)
> diff --git a/drivers/gpu/drm/i915/intel_lrc.h b/drivers/gpu/drm/i915/intel_lrc.h
> index f5a5502ecf70..178b181ea651 100644
> --- a/drivers/gpu/drm/i915/intel_lrc.h
> +++ b/drivers/gpu/drm/i915/intel_lrc.h
> @@ -104,4 +104,11 @@ void intel_lr_context_resume(struct drm_i915_private *dev_priv);
>   
>   void intel_execlists_set_default_submission(struct intel_engine_cs *engine);
>   
> +struct intel_context *
> +intel_execlists_context_pin(struct intel_engine_cs *engine,
> +			    struct i915_gem_context *ctx,
> +			    struct intel_context *ce);
> +void intel_execlists_context_unpin(struct intel_context *ce);
> +void intel_execlists_context_destroy(struct intel_context *ce);
> +
>   #endif /* _INTEL_LRC_H_ */
> diff --git a/drivers/gpu/drm/i915/selftests/intel_guc.c b/drivers/gpu/drm/i915/selftests/intel_guc.c
> index bf27162fb327..eb4e8bbe8c82 100644
> --- a/drivers/gpu/drm/i915/selftests/intel_guc.c
> +++ b/drivers/gpu/drm/i915/selftests/intel_guc.c
> @@ -301,6 +301,7 @@ static int igt_guc_doorbells(void *arg)
>   	if (err)
>   		goto unlock;
>   
> +	guc->starting_proxy_id = GUC_MAX_PPAL_STAGE_DESCRIPTORS - ATTEMPTS;
Maybe add GEM_BUG_ON(GUC_MAX_PPAL_STAGE_DESCRIPTORS < ATTEMPTS) here, so
that the subtraction above can't wrap?
>   	for (i = 0; i < ATTEMPTS; i++) {
>   		clients[i] = guc_client_alloc(dev_priv,
>   					      INTEL_INFO(dev_priv)->ring_mask,
> @@ -334,7 +335,8 @@ static int igt_guc_doorbells(void *arg)
>   		 * The check below is only valid because we keep a doorbell
>   		 * assigned during the whole life of the client.
>   		 */
> -		if (clients[i]->stage_id >= GUC_NUM_DOORBELLS) {
> +		if ((clients[i]->stage_id - guc->starting_proxy_id) >=
> +		     GUC_NUM_DOORBELLS) {
I always get a bit nervous when I see unsigned numbers subtracted: if
stage_id is ever below starting_proxy_id, the result wraps around.
>   			pr_err("[%d] more clients than doorbells (%d >= %d)\n",
%u for stage_id, since it is unsigned?
>   			       i, clients[i]->stage_id, GUC_NUM_DOORBELLS);
>   			err = -EINVAL;
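
To illustrate the unsigned-subtraction concern raised in the comments on
this hunk, here is a minimal standalone sketch in plain C (not driver
code). EXAMPLE_NUM_DOORBELLS and the helper names are made up for the
example, and the doorbell count is an assumed value rather than the one
from the driver headers. It shows the check as written, plus a variant
that rejects ids below the base before subtracting:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed value, for illustration only. */
#define EXAMPLE_NUM_DOORBELLS 256u

/*
 * Same shape as the check in the hunk. If stage_id < starting_id the
 * subtraction wraps to a very large value, so the test still fires, but
 * only as a side effect of the wraparound.
 */
static bool out_of_range_naive(uint32_t stage_id, uint32_t starting_id)
{
	return (stage_id - starting_id) >= EXAMPLE_NUM_DOORBELLS;
}

/* Explicit variant: ids below the base are rejected before subtracting. */
static bool out_of_range_checked(uint32_t stage_id, uint32_t starting_id)
{
	if (stage_id < starting_id)
		return true;
	return (stage_id - starting_id) >= EXAMPLE_NUM_DOORBELLS;
}

/* Offset from the base; the caller must have validated the range. */
static uint32_t stage_offset(uint32_t stage_id, uint32_t starting_id)
{
	assert(stage_id >= starting_id);
	return stage_id - starting_id;
}
```

For small ids both variants agree, so the existing check is not wrong as
such; the explicit form just makes the intent visible and keeps the error
path from depending on wraparound.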

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


* Re: [RFC] drm/i915/guc: New GuC stage descriptors
  2018-10-17 18:42       ` Lis, Tomasz
@ 2018-10-18 21:07         ` Daniele Ceraolo Spurio
  0 siblings, 0 replies; 49+ messages in thread
From: Daniele Ceraolo Spurio @ 2018-10-18 21:07 UTC (permalink / raw)
  To: Lis, Tomasz, intel-gfx; +Cc: Michel Thierry



On 17/10/18 11:42, Lis, Tomasz wrote:
> 
> 
> On 2018-10-12 20:25, Daniele Ceraolo Spurio wrote:
>> With the new interface, GuC now requires every lrc to be registered in
>> one of the stage descriptors, which have been re-designed so that each
>> descriptor can store up to 64 lrc per class (i.e. equal to the possible
>> SW counter values).
>> Similarly to what happened with the previous legacy design, it is 
>> possible
>> to have a single "proxy" descriptor that owns the workqueue and the
>> doorbell and use it for all submission. To distinguish the proxy
>> descriptors from the one used for lrc storage, the latter have been
>> called "principal". A descriptor can't be both a proxy and a principal
>> at the same time; to enforce this, since we only use 1 proxy descriptor
>> per client, we reserve enough descriptor from the bottom of the pool to
>> be used as proxy and leave the others as principals. For simplicity, we
>> currently map context IDs 1:1 to principal descriptors, but we could
>> have more contexts in flight if needed by using the SW counter.
> Does this have any influence on the concept of "default context" used by 
> UMDs?
> Or is this completely separate?

No relationship at the moment. We only map gem_context to stage_desc; we
don't care who those contexts belong to or whether they're default or
not. We could potentially use an approach where each file descriptor
gets a stage descriptor and all the contexts opened by that FD use the
same stage_desc with different SW counters, but that would limit both
the number of open FDs and the number of contexts per FD, so it doesn't
seem like a good idea IMO.
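
To put rough numbers on that trade-off, here is a small standalone sketch
in plain C. The three constants are the values from the patch; the helper
names are hypothetical and exist only for this example. It models the
pool split (principal descriptors first, proxy descriptors in the last
slots) and the limits each mapping scheme would imply:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Values as defined in the patch. */
#define GUC_MAX_STAGE_DESCRIPTORS	1024u
#define GUC_MAX_CLIENT_IDS		2u
#define GUC_MAX_LRC_PER_CLASS		64u
#define GUC_MAX_PPAL_STAGE_DESCRIPTORS \
	(GUC_MAX_STAGE_DESCRIPTORS - GUC_MAX_CLIENT_IDS)

static_assert(GUC_MAX_CLIENT_IDS < GUC_MAX_STAGE_DESCRIPTORS,
	      "proxy reservation must leave room for principals");

/* Hypothetical helper: principal descriptors occupy [0, 1022). */
static bool is_ppal_index(uint32_t index)
{
	return index < GUC_MAX_PPAL_STAGE_DESCRIPTORS;
}

/* Hypothetical helper: proxy descriptors occupy [1022, 1024). */
static bool is_proxy_index(uint32_t index)
{
	return index >= GUC_MAX_PPAL_STAGE_DESCRIPTORS &&
	       index < GUC_MAX_STAGE_DESCRIPTORS;
}

/* Current scheme: one principal descriptor per gem_context. */
static uint32_t max_contexts_one_desc_per_ctx(void)
{
	return GUC_MAX_PPAL_STAGE_DESCRIPTORS;
}

/* Per-FD alternative: one principal descriptor per open FD... */
static uint32_t max_fds_one_desc_per_fd(void)
{
	return GUC_MAX_PPAL_STAGE_DESCRIPTORS;
}

/* ...and the FD's contexts share the lrc slots of each engine class. */
static uint32_t max_lrc_slots_per_class_per_fd(void)
{
	return GUC_MAX_LRC_PER_CLASS;
}
```

Under these assumptions the current 1:1 mapping allows 1022 contexts with
no per-FD cap, while the per-FD scheme would cap open FDs at 1022 and
give each FD at most 64 lrc slots per engine class.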

>> Note that the lrcs need to be mapped in the principal descriptor until
>> guc is done with them. This means that we can't release the HW id when
>> the user app closes the ctx because it might still be in flight with GuC
>> and that we need to be careful when unpinning because the fact that the
> s/the//
>> a request on the next context has completed doesn't mean that GuC is
>> done processing the first one. See in-code comments for details.
>>
>> NOTE: GuC is not going to look at lrcs that are not in flight, so we
>> could potentially skip the unpinning steps. However, the unpinning steps
>> perform extra correctness checks, so better keep them until we're sure
>> that the flow is solid.
>>
>> Based on an initial patch by Oscar Mateo.
>>
>> RFC: this will be sent as part of the updated series once we have
>> the gen9 FW with the new interface, but since the flow is
>> significantly different compared to the previous version I'd like
>> to gather some early feedback to make sure there aren't any major
>> issues.
>>
>> Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
>> Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
>> Cc: Chris Wilson <chris@chris-wilson.co.uk>
>> Cc: Michel Thierry <michel.thierry@intel.com>
>> Cc: Michal Winiarski <michal.winiarski@intel.com>
>> Cc: Tomasz Lis <tomasz.lis@intel.com>
>> ---
>>   drivers/gpu/drm/i915/i915_debugfs.c         |  30 +-
>>   drivers/gpu/drm/i915/i915_drv.h             |   5 +-
>>   drivers/gpu/drm/i915/i915_gem_context.c     |   9 +-
>>   drivers/gpu/drm/i915/intel_guc.h            |  14 +-
>>   drivers/gpu/drm/i915/intel_guc_fwif.h       |  73 +++--
>>   drivers/gpu/drm/i915/intel_guc_submission.c | 346 +++++++++++++++-----
>>   drivers/gpu/drm/i915/intel_lrc.c            |  18 +-
>>   drivers/gpu/drm/i915/intel_lrc.h            |   7 +
>>   drivers/gpu/drm/i915/selftests/intel_guc.c  |   4 +-
>>   9 files changed, 360 insertions(+), 146 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/i915/i915_debugfs.c 
>> b/drivers/gpu/drm/i915/i915_debugfs.c
>> index 00c551d3e409..04bbde4a38a6 100644
>> --- a/drivers/gpu/drm/i915/i915_debugfs.c
>> +++ b/drivers/gpu/drm/i915/i915_debugfs.c
>> @@ -2474,11 +2474,10 @@ static int i915_guc_stage_pool(struct seq_file 
>> *m, void *data)
>>           seq_printf(m, "GuC stage descriptor %u:\n", index);
>>           seq_printf(m, "\tIndex: %u\n", desc->stage_id);
>> +        seq_printf(m, "\tProxy Index: %u\n", desc->proxy_id);
>>           seq_printf(m, "\tAttribute: 0x%x\n", desc->attribute);
>>           seq_printf(m, "\tPriority: %d\n", desc->priority);
>>           seq_printf(m, "\tDoorbell id: %d\n", desc->db_id);
>> -        seq_printf(m, "\tEngines used: 0x%x\n",
>> -               desc->engines_used);
>>           seq_printf(m, "\tDoorbell trigger phy: 0x%llx, cpu: 0x%llx, 
>> uK: 0x%x\n",
>>                  desc->db_trigger_phy,
>>                  desc->db_trigger_cpu,
>> @@ -2490,18 +2489,21 @@ static int i915_guc_stage_pool(struct seq_file 
>> *m, void *data)
>>           seq_putc(m, '\n');
>>           for_each_engine_masked(engine, dev_priv, client->engines, 
>> tmp) {
>> -            u32 guc_engine_id = engine->guc_id;
>> -            struct guc_execlist_context *lrc =
>> -                        &desc->lrc[guc_engine_id];
>> -
>> -            seq_printf(m, "\t%s LRC:\n", engine->name);
>> -            seq_printf(m, "\t\tContext desc: 0x%x\n",
>> -                   lrc->context_desc);
>> -            seq_printf(m, "\t\tContext id: 0x%x\n", lrc->context_id);
>> -            seq_printf(m, "\t\tLRCA: 0x%x\n", lrc->ring_lrca);
>> -            seq_printf(m, "\t\tRing begin: 0x%x\n", lrc->ring_begin);
>> -            seq_printf(m, "\t\tRing end: 0x%x\n", lrc->ring_end);
>> -            seq_putc(m, '\n');
>> +            u8 class = engine->class;
>> +            u8 inst = engine->instance;
>> +
>> +            if (desc->lrc_alloc_map[class].bitmap & BIT(inst)) {
>> +                struct guc_execlist_context *lrc =
>> +                            &desc->lrc[class][inst];
>> +                seq_printf(m, "\t%s LRC:\n", engine->name);
>> +                seq_printf(m, "\t\tHW context desc: 0x%x:0x%x\n",
>> +                        lower_32_bits(lrc->hw_context_desc),
>> +                        upper_32_bits(lrc->hw_context_desc));
>> +                seq_printf(m, "\t\tLRC: 0x%x\n", lrc->ring_lrc);
>> +                seq_printf(m, "\t\tRing begin: 0x%x\n", 
>> lrc->ring_begin);
>> +                seq_printf(m, "\t\tRing end: 0x%x\n", lrc->ring_end);
>> +                seq_putc(m, '\n');
>> +            }
>>           }
>>       }
>> diff --git a/drivers/gpu/drm/i915/i915_drv.h 
>> b/drivers/gpu/drm/i915/i915_drv.h
>> index bd76931987ef..ce095d57e050 100644
>> --- a/drivers/gpu/drm/i915/i915_drv.h
>> +++ b/drivers/gpu/drm/i915/i915_drv.h
>> @@ -1878,13 +1878,14 @@ struct drm_i915_private {
>>            * (the SW Context ID field) but GuC limits it further so
>>            * without taking advantage of part of the SW counter field the
>>            * firmware only supports a max number of contexts equal to the
>> -         * number of entries in the GuC stage descriptor pool.
>> +         * number of entries in the GuC stage descriptor pool, minus
>> +         * the descriptors reserved for proxy usage
>>            */
>>           struct ida hw_ida;
>>   #define MAX_CONTEXT_HW_ID (1<<21) /* exclusive */
>>   #define MAX_GUC_CONTEXT_HW_ID (1 << 20) /* exclusive */
>>   #define GEN11_MAX_CONTEXT_HW_ID (1<<11) /* exclusive */
>> -#define GEN11_MAX_CONTEXT_HW_ID_WITH_GUC GUC_MAX_STAGE_DESCRIPTORS
>> +#define GEN11_MAX_CONTEXT_HW_ID_WITH_GUC GUC_MAX_PPAL_STAGE_DESCRIPTORS
>>           struct list_head hw_id_list;
>>       } contexts;
>> diff --git a/drivers/gpu/drm/i915/i915_gem_context.c 
>> b/drivers/gpu/drm/i915/i915_gem_context.c
>> index 552d2e108de4..c239d9b9307c 100644
>> --- a/drivers/gpu/drm/i915/i915_gem_context.c
>> +++ b/drivers/gpu/drm/i915/i915_gem_context.c
>> @@ -284,10 +284,15 @@ static void context_close(struct 
>> i915_gem_context *ctx)
>>       i915_gem_context_set_closed(ctx);
>>       /*
>> -     * This context will never again be assinged to HW, so we can
>> +     * This context will never again be assigned to HW, so we can
>>        * reuse its ID for the next context.
>> +     *
>> +     * if GuC is in use, we need to keep the ID until GuC has finished
>> +     * processing all submitted requests because the ID is used by the
>> +     * firmware to index the guc stage_desc pool.
>>        */
>> -    release_hw_id(ctx);
>> +    if (!USES_GUC_SUBMISSION(ctx->i915))
>> +        release_hw_id(ctx);
>>       /*
>>        * The LUT uses the VMA as a backpointer to unref the object,
>> diff --git a/drivers/gpu/drm/i915/intel_guc.h 
>> b/drivers/gpu/drm/i915/intel_guc.h
>> index 11b3882482f4..05ee44fb66af 100644
>> --- a/drivers/gpu/drm/i915/intel_guc.h
>> +++ b/drivers/gpu/drm/i915/intel_guc.h
>> @@ -58,10 +58,14 @@ struct intel_guc {
>>       bool interrupts_enabled;
>>       unsigned int msg_enabled_mask;
>> +    struct ida client_ids;
>> +#define GUC_MAX_CLIENT_IDS 2
>> +
>>       struct i915_vma *ads_vma;
>>       struct i915_vma *stage_desc_pool;
>>       void *stage_desc_pool_vaddr;
>> -    struct ida stage_ids;
>> +#define    GUC_MAX_PPAL_STAGE_DESCRIPTORS (GUC_MAX_STAGE_DESCRIPTORS 
>> - GUC_MAX_CLIENT_IDS)
>> +
>>       struct i915_vma *shared_data;
>>       void *shared_data_vaddr;
>> @@ -94,6 +98,14 @@ struct intel_guc {
>>       /* GuC's FW specific notify function */
>>       void (*notify)(struct intel_guc *guc);
>> +
>> +    /*
>> +     * Override the first stage_desc to be used as proxy
>> +     * (Default: GUC_MAX_PPAL_STAGE_DESCRIPTORS). The max number of ppal
>> +     * descriptors is not updated accordingly since the test using 
>> this does
>> +     * not allocate any context.
>> +     */
>> +    I915_SELFTEST_DECLARE(u32 starting_proxy_id);
>>   };
>>   static inline bool intel_guc_is_alive(struct intel_guc *guc)
>> diff --git a/drivers/gpu/drm/i915/intel_guc_fwif.h 
>> b/drivers/gpu/drm/i915/intel_guc_fwif.h
>> index ce3ab6ed21d5..1a0f41a26173 100644
>> --- a/drivers/gpu/drm/i915/intel_guc_fwif.h
>> +++ b/drivers/gpu/drm/i915/intel_guc_fwif.h
>> @@ -32,6 +32,8 @@
>>   #define GUC_MAX_STAGE_DESCRIPTORS    1024
>>   #define    GUC_INVALID_STAGE_ID        GUC_MAX_STAGE_DESCRIPTORS
>> +#define GUC_MAX_LRC_PER_CLASS        64
>> +
>>   #define GUC_RENDER_ENGINE        0
>>   #define GUC_VIDEO_ENGINE        1
>>   #define GUC_BLITTER_ENGINE        2
>> @@ -66,9 +68,12 @@
>>   #define GUC_DOORBELL_DISABLED        0
>>   #define GUC_STAGE_DESC_ATTR_ACTIVE    BIT(0)
>> -#define GUC_STAGE_DESC_ATTR_PENDING_DB    BIT(1)
>> -#define GUC_STAGE_DESC_ATTR_KERNEL    BIT(2)
>> -#define GUC_STAGE_DESC_ATTR_PREEMPT    BIT(3)
>> +#define GUC_STAGE_DESC_ATTR_TYPE_SHIFT    1
>> +#define GUC_STAGE_DESC_ATTR_PRINCIPAL    (0x0 << 
>> GUC_STAGE_DESC_ATTR_TYPE_SHIFT)
>> +#define GUC_STAGE_DESC_ATTR_PROXY    (0x1 << 
>> GUC_STAGE_DESC_ATTR_TYPE_SHIFT)
>> +#define GUC_STAGE_DESC_ATTR_REAL    (0x2 << 
>> GUC_STAGE_DESC_ATTR_TYPE_SHIFT)
>> +#define GUC_STAGE_DESC_ATTR_TYPE_MASK    (0x3 << 
>> GUC_STAGE_DESC_ATTR_TYPE_SHIFT)
>> +#define GUC_STAGE_DESC_ATTR_KERNEL    (1 << 3)
>>   #define GUC_STAGE_DESC_ATTR_RESET    BIT(4)
>>   #define GUC_STAGE_DESC_ATTR_WQLOCKED    BIT(5)
>>   #define GUC_STAGE_DESC_ATTR_PCH        BIT(6)
>> @@ -277,9 +282,10 @@ struct guc_process_desc {
>>       u64 wq_base_addr;
>>       u32 wq_size_bytes;
>>       u32 wq_status;
>> -    u32 engine_presence;
>>       u32 priority;
>> -    u32 reserved[30];
>> +    u32 token;
>> +    u32 queue_engine_error;
>> +    u32 reserved[23];
>>   } __packed;
>>   /* engine id and context id is packed into 
>> guc_execlist_context.context_id*/
>> @@ -288,18 +294,20 @@ struct guc_process_desc {
>>   /* The execlist context including software and HW information */
>>   struct guc_execlist_context {
>> -    u32 context_desc;
>> -    u32 context_id;
>> -    u32 ring_status;
>> -    u32 ring_lrca;
>> +    u64 hw_context_desc;
>> +    u32 reserved0;
>> +    u32 ring_lrc;
>>       u32 ring_begin;
>>       u32 ring_end;
>>       u32 ring_next_free_location;
>>       u32 ring_current_tail_pointer_value;
>> -    u8 engine_state_submit_value;
>> -    u8 engine_state_wait_value;
>> -    u16 pagefault_count;
>> -    u16 engine_submit_queue_count;
>> +    u32 engine_state_wait_value;
>> +    u32 state_reserved;
>> +    u32 is_present_in_sq;
>> +    u32 sync_value;
>> +    u32 sync_addr;
>> +    u32 slpc_hints;
>> +    u32 reserved1[4];
>>   } __packed;
>>   /*
>> @@ -312,36 +320,33 @@ struct guc_execlist_context {
>>    * with the GuC, being allocated before the GuC is loaded with its 
>> firmware.
>>    */
>>   struct guc_stage_desc {
>> -    u32 sched_common_area;
>> +    u64 desc_private;
>>       u32 stage_id;
>> -    u32 pas_id;
>> -    u8 engines_used;
>> +    u32 proxy_id;
>>       u64 db_trigger_cpu;
>>       u32 db_trigger_uk;
>>       u64 db_trigger_phy;
>> -    u16 db_id;
>> -
>> -    struct guc_execlist_context lrc[GUC_MAX_ENGINES_NUM];
>> -
>> -    u8 attribute;
>> -
>> -    u32 priority;
>> -
>> +    u32 db_id;
>> +    struct guc_execlist_context 
>> lrc[GUC_MAX_ENGINE_CLASSES][GUC_MAX_LRC_PER_CLASS];
>> +    struct {
>> +        u64 bitmap;
>> +        u32 reserved0;
>> +    } __packed lrc_alloc_map[GUC_MAX_ENGINE_CLASSES];
>> +    u32 lrc_count;
>> +    u32 max_lrc_per_class;
>> +    u32 attribute; /* GUC_STAGE_DESC_ATTR_xxx */
>> +    u32 priority; /* GUC_CLIENT_PRIORITY_xxx */
>>       u32 wq_sampled_tail_offset;
>>       u32 wq_total_submit_enqueues;
>> -
>>       u32 process_desc;
>>       u32 wq_addr;
>>       u32 wq_size;
>> -
>> -    u32 engine_presence;
>> -
>> -    u8 engine_suspended;
>> -
>> -    u8 reserved0[3];
>> -    u64 reserved1[1];
>> -
>> -    u64 desc_private;
>> +    u32 feature0;
>> +    u32 feature1;
>> +    u32 feature2;
>> +    u32 queue_engine_error;
>> +    u32 reserved[2];
>> +    u64 reserved3[12];
>>   } __packed;
>>   /**
>> diff --git a/drivers/gpu/drm/i915/intel_guc_submission.c 
>> b/drivers/gpu/drm/i915/intel_guc_submission.c
>> index eae668442ebe..9bf8ebbc4de1 100644
>> --- a/drivers/gpu/drm/i915/intel_guc_submission.c
>> +++ b/drivers/gpu/drm/i915/intel_guc_submission.c
>> @@ -46,17 +46,22 @@
>>    * that contains all required pages for these elements).
>>    *
>>    * GuC stage descriptor:
>> - * During initialization, the driver allocates a static pool of 1024 
>> such
>> - * descriptors, and shares them with the GuC.
>> - * Currently, there exists a 1:1 mapping between a intel_guc_client 
>> and a
>> - * guc_stage_desc (via the client's stage_id), so effectively only one
>> - * gets used. This stage descriptor lets the GuC know about the 
>> doorbell,
>> - * workqueue and process descriptor. Theoretically, it also lets the GuC
>> - * know about our HW contexts (context ID, etc...), but we actually
>> - * employ a kind of submission where the GuC uses the LRCA sent via 
>> the work
>> - * item instead (the single guc_stage_desc associated to execbuf client
>> - * contains information about the default kernel context only, but 
>> this is
>> - * essentially unused). This is called a "proxy" submission.
>> + * During initialization, the driver allocates a static pool of 
>> descriptors
>> + * and shares them with the GuC. This stage descriptor lets the GuC 
>> know about
> Sentence missing definition of "this stage descriptor", ie. "Each entry 
> at the beginning of the pool represents one guc_stage_desc. These stage 
> descriptors..."

Will add

>> + * the doorbell, workqueue and process descriptor, additionally it 
>> stores
>> + * information about all possible HW contexts that use it (64 x 
>> number of
>> + * engine classes of guc_execlist_context structs).
>> + *
>> + * The idea is that every direct-submission GuC client gets one SW 
>> Context ID
>> + * and every HW context created by that client gets one SW Counter. 
>> The "SW
>> + * Context ID" and "SW Counter" to use now get passed on every work 
>> queue item.
>> + *
>> + * But we don't have direct submission yet: does that mean we are 
>> limited to 64
>> + * contexts in total (one client)? Not really: we can use extra GuC 
>> context
>> + * descriptors to store more HW contexts. They are special in that 
>> they don't
>> + * have their own work queue, doorbell or process descriptor. 
>> Instead, these
>> + * "principal" GuC context descriptors use the one that belongs to 
>> the client
>> + * as a "proxy" for submission (a generalization of the old proxy 
>> submission).
>>    *
>>    * The Scratch registers:
>>    * There are 16 MMIO-based registers start from 0xC180. The kernel 
>> driver writes
>> @@ -164,11 +169,28 @@ static int __guc_deallocate_doorbell(struct 
>> intel_guc *guc, u32 stage_id)
>>       return intel_guc_send(guc, action, ARRAY_SIZE(action));
>>   }
>> -static struct guc_stage_desc *__get_stage_desc(struct 
>> intel_guc_client *client)
>> +static struct guc_stage_desc *__get_stage_desc(struct intel_guc *guc, 
>> u32 index)
>> +{
>> +    struct guc_stage_desc *base = guc->stage_desc_pool_vaddr;
>> +
>> +    GEM_BUG_ON(!USES_GUC_SUBMISSION(guc_to_i915(guc)));
>> +    GEM_BUG_ON(index >= GUC_MAX_STAGE_DESCRIPTORS);
>> +
>> +    return &base[index];
>> +}
>> +
>> +static struct guc_stage_desc *__get_proxy_stage_desc(struct 
>> intel_guc_client *client)
>>   {
>> -    struct guc_stage_desc *base = client->guc->stage_desc_pool_vaddr;
>> +    GEM_BUG_ON(!I915_SELFTEST_ONLY(client->guc->starting_proxy_id) &&
>> +            client->stage_id < GUC_MAX_PPAL_STAGE_DESCRIPTORS);
>> +    return __get_stage_desc(client->guc, client->stage_id);
>> +}
>> -    return &base[client->stage_id];
>> +static struct guc_stage_desc *__get_ppal_stage_desc(struct intel_guc 
>> *guc,
>> +                            u32 index)
>> +{
>> +    GEM_BUG_ON(index >= GUC_MAX_PPAL_STAGE_DESCRIPTORS);
>> +    return __get_stage_desc(guc, index);
>>   }
>>   /*
>> @@ -183,7 +205,7 @@ static void __update_doorbell_desc(struct 
>> intel_guc_client *client, u16 new_id)
>>       struct guc_stage_desc *desc;
>>       /* Update the GuC's idea of the doorbell ID */
>> -    desc = __get_stage_desc(client);
>> +    desc = __get_proxy_stage_desc(client);
>>       desc->db_id = new_id;
>>   }
>> @@ -329,14 +351,12 @@ static int guc_stage_desc_pool_create(struct 
>> intel_guc *guc)
>>       guc->stage_desc_pool = vma;
>>       guc->stage_desc_pool_vaddr = vaddr;
>> -    ida_init(&guc->stage_ids);
>>       return 0;
>>   }
>>   static void guc_stage_desc_pool_destroy(struct intel_guc *guc)
>>   {
>> -    ida_destroy(&guc->stage_ids);
>>       i915_vma_unpin_and_release(&guc->stage_desc_pool, 
>> I915_VMA_RELEASE_MAP);
>>   }
>> @@ -347,78 +367,26 @@ static void guc_stage_desc_pool_destroy(struct 
>> intel_guc *guc)
>>    * data structures relating to this client (doorbell, process 
>> descriptor,
>>    * write queue, etc).
>>    */
>> -static void guc_stage_desc_init(struct intel_guc_client *client)
>> +static void guc_proxy_stage_desc_init(struct intel_guc_client *client)
>>   {
>> -    struct intel_guc *guc = client->guc;
>> -    struct drm_i915_private *dev_priv = guc_to_i915(guc);
>> -    struct intel_engine_cs *engine;
>> -    struct i915_gem_context *ctx = client->owner;
>>       struct guc_stage_desc *desc;
>> -    unsigned int tmp;
>>       u32 gfx_addr;
>> -    desc = __get_stage_desc(client);
>> +    desc = __get_proxy_stage_desc(client);
>>       memset(desc, 0, sizeof(*desc));
>>       desc->attribute = GUC_STAGE_DESC_ATTR_ACTIVE |
>> +              GUC_STAGE_DESC_ATTR_PROXY |
>>                 GUC_STAGE_DESC_ATTR_KERNEL;
>> -    if (is_high_priority(client))
>> -        desc->attribute |= GUC_STAGE_DESC_ATTR_PREEMPT;
>>       desc->stage_id = client->stage_id;
>>       desc->priority = client->priority;
>>       desc->db_id = client->doorbell_id;
>> -    for_each_engine_masked(engine, dev_priv, client->engines, tmp) {
>> -        struct intel_context *ce = to_intel_context(ctx, engine);
>> -        u32 guc_engine_id = engine->guc_id;
>> -        struct guc_execlist_context *lrc = &desc->lrc[guc_engine_id];
>> -
>> -        /* TODO: We have a design issue to be solved here. Only when we
>> -         * receive the first batch, we know which engine is used by the
>> -         * user. But here GuC expects the lrc and ring to be pinned. It
>> -         * is not an issue for default context, which is the only one
>> -         * for now who owns a GuC client. But for future owner of GuC
>> -         * client, need to make sure lrc is pinned prior to enter here.
>> -         */
>> -        if (!ce->state)
>> -            break;    /* XXX: continue? */
>> -
>> -        /*
>> -         * XXX: When this is a GUC_STAGE_DESC_ATTR_KERNEL client (proxy
>> -         * submission or, in other words, not using a direct submission
>> -         * model) the KMD's LRCA is not used for any work submission.
>> -         * Instead, the GuC uses the LRCA of the user mode context (see
>> -         * guc_add_request below).
>> -         */
>> -        lrc->context_desc = lower_32_bits(ce->lrc_desc);
>> -
>> -        /* The state page is after PPHWSP */
>> -        lrc->ring_lrca = intel_guc_ggtt_offset(guc, ce->state) +
>> -                 LRC_STATE_PN * PAGE_SIZE;
>> -
>> -        /* XXX: In direct submission, the GuC wants the HW context id
>> -         * here. In proxy submission, it wants the stage id
>> -         */
>> -        lrc->context_id = (client->stage_id << GUC_ELC_CTXID_OFFSET) |
>> -                (guc_engine_id << GUC_ELC_ENGINE_OFFSET);
>> -
>> -        lrc->ring_begin = intel_guc_ggtt_offset(guc, ce->ring->vma);
>> -        lrc->ring_end = lrc->ring_begin + ce->ring->size - 1;
>> -        lrc->ring_next_free_location = lrc->ring_begin;
>> -        lrc->ring_current_tail_pointer_value = 0;
>> -
>> -        desc->engines_used |= (1 << guc_engine_id);
>> -    }
>> -
>> -    DRM_DEBUG_DRIVER("Host engines 0x%x => GuC engines used 0x%x\n",
>> -             client->engines, desc->engines_used);
>> -    WARN_ON(desc->engines_used == 0);
>> -
>>       /*
>>        * The doorbell, process descriptor, and workqueue are all parts
>>        * of the client object, which the GuC will reference via the GGTT
>>        */
>> -    gfx_addr = intel_guc_ggtt_offset(guc, client->vma);
>> +    gfx_addr = intel_guc_ggtt_offset(client->guc, client->vma);
>>       desc->db_trigger_phy = sg_dma_address(client->vma->pages->sgl) +
>>                   client->doorbell_offset;
>>       desc->db_trigger_cpu = ptr_to_u64(__get_doorbell(client));
>> @@ -430,11 +398,11 @@ static void guc_stage_desc_init(struct 
>> intel_guc_client *client)
>>       desc->desc_private = ptr_to_u64(client);
>>   }
>> -static void guc_stage_desc_fini(struct intel_guc_client *client)
>> +static void guc_proxy_stage_desc_fini(struct intel_guc_client *client)
>>   {
>>       struct guc_stage_desc *desc;
>> -    desc = __get_stage_desc(client);
>> +    desc = __get_proxy_stage_desc(client);
>>       memset(desc, 0, sizeof(*desc));
>>   }
>> @@ -553,7 +521,7 @@ static void inject_preempt_context(struct 
>> work_struct *work)
>>       struct intel_guc *guc = container_of(preempt_work, typeof(*guc),
>>                            preempt_work[engine->id]);
>>       struct intel_guc_client *client = guc->preempt_client;
>> -    struct guc_stage_desc *stage_desc = __get_stage_desc(client);
>> +    struct guc_stage_desc *stage_desc = __get_proxy_stage_desc(client);
>>       struct intel_context *ce = to_intel_context(client->owner, engine);
>>       u32 data[7];
>> @@ -919,6 +887,7 @@ guc_client_alloc(struct drm_i915_private *dev_priv,
>>       struct i915_vma *vma;
>>       void *vaddr;
>>       int ret;
>> +    u32 starting_id;
>>       client = kzalloc(sizeof(*client), GFP_KERNEL);
>>       if (!client)
>> @@ -931,8 +900,11 @@ guc_client_alloc(struct drm_i915_private *dev_priv,
>>       client->doorbell_id = GUC_DOORBELL_INVALID;
>>       spin_lock_init(&client->wq_lock);
>> -    ret = ida_simple_get(&guc->stage_ids, 0, GUC_MAX_STAGE_DESCRIPTORS,
>> -                 GFP_KERNEL);
>> +    if (!I915_SELFTEST_ONLY(starting_id = guc->starting_proxy_id))
>> +        starting_id = GUC_MAX_PPAL_STAGE_DESCRIPTORS;
>> +
>> +    ret = ida_simple_get(&guc->client_ids, starting_id,
>> +                 GUC_MAX_STAGE_DESCRIPTORS, GFP_KERNEL);
>>       if (ret < 0)
>>           goto err_client;
>> @@ -983,7 +955,7 @@ guc_client_alloc(struct drm_i915_private *dev_priv,
>>   err_vma:
>>       i915_vma_unpin_and_release(&client->vma, 0);
>>   err_id:
>> -    ida_simple_remove(&guc->stage_ids, client->stage_id);
>> +    ida_simple_remove(&guc->client_ids, client->stage_id);
>>   err_client:
>>       kfree(client);
>>       return ERR_PTR(ret);
>> @@ -993,7 +965,7 @@ static void guc_client_free(struct intel_guc_client *client)
>>   {
>>       unreserve_doorbell(client);
>>       i915_vma_unpin_and_release(&client->vma, I915_VMA_RELEASE_MAP);
>> -    ida_simple_remove(&client->guc->stage_ids, client->stage_id);
>> +    ida_simple_remove(&client->guc->client_ids, client->stage_id);
>>       kfree(client);
>>   }
>> @@ -1063,7 +1035,7 @@ static int __guc_client_enable(struct intel_guc_client *client)
>>       int ret;
>>       guc_proc_desc_init(client);
>> -    guc_stage_desc_init(client);
>> +    guc_proxy_stage_desc_init(client);
>>       ret = create_doorbell(client);
>>       if (ret)
>> @@ -1072,7 +1044,7 @@ static int __guc_client_enable(struct intel_guc_client *client)
>>       return 0;
>>   fail:
>> -    guc_stage_desc_fini(client);
>> +    guc_proxy_stage_desc_fini(client);
>>       guc_proc_desc_fini(client);
>>       return ret;
>>   }
>> @@ -1089,7 +1061,7 @@ static void __guc_client_disable(struct intel_guc_client *client)
>>       else
>>           __destroy_doorbell(client);
>> -    guc_stage_desc_fini(client);
>> +    guc_proxy_stage_desc_fini(client);
>>       guc_proc_desc_fini(client);
>>   }
>> @@ -1145,6 +1117,9 @@ int intel_guc_submission_init(struct intel_guc *guc)
>>       GEM_BUG_ON(!guc->stage_desc_pool);
>>       WARN_ON(!guc_verify_doorbells(guc));
>> +
>> +    ida_init(&guc->client_ids);
>> +
>>       ret = guc_clients_create(guc);
>>       if (ret)
>>           goto err_pool;
>> @@ -1157,6 +1132,7 @@ int intel_guc_submission_init(struct intel_guc *guc)
>>       return 0;
>>   err_pool:
>> +    ida_destroy(&guc->client_ids);
>>       guc_stage_desc_pool_destroy(guc);
>>       return ret;
>>   }
>> @@ -1173,6 +1149,8 @@ void intel_guc_submission_fini(struct intel_guc *guc)
>>       guc_clients_destroy(guc);
>>       WARN_ON(!guc_verify_doorbells(guc));
>> +    ida_destroy(&guc->client_ids);
>> +
>>       if (guc->stage_desc_pool)
>>           guc_stage_desc_pool_destroy(guc);
>>   }
>> @@ -1257,6 +1235,203 @@ static void guc_submission_unpark(struct intel_engine_cs *engine)
>>       intel_engine_pin_breadcrumbs_irq(engine);
>>   }
>> +static void guc_map_gem_ctx_to_ppal_stage(struct intel_guc *guc,
>> +                      struct guc_stage_desc *desc,
>> +                      u32 id)
>> +{
>> +    GEM_BUG_ON(desc->attribute & GUC_STAGE_DESC_ATTR_ACTIVE);
>> +
>> +    desc->attribute = GUC_STAGE_DESC_ATTR_ACTIVE |
>> +              GUC_STAGE_DESC_ATTR_PRINCIPAL |
>> +              GUC_STAGE_DESC_ATTR_KERNEL;
>> +    desc->stage_id = id;
>> +
>> +    /* all ppal contexts will be submitted through the execbuf client */
>> +    desc->proxy_id = guc->execbuf_client->stage_id;
>> +
>> +    /*
>> +     * max_lrc_per_class is used in GuC to cut short loops over the
>> +     * lrc_bitmap when only a small number of lrcs are used. We could
>> +     * recalculate this value every time an lrc is added or removed, but
>> +     * given the fact that we only have a max number of lrcs per stage_desc
>> +     * equal to the max number of instances of a class (because we map
>> +     * gem_context 1:1 with stage_desc) and that the GuC loops only in
>> +     * specific cases, redoing the calculation each time doesn't give us a
>> +     * big benefit for the cost so we can just use a static value.
>> +     */
>> +    desc->max_lrc_per_class = MAX_ENGINE_INSTANCE + 1;
>> +}
>> +
>> +static void guc_unmap_gem_ctx_from_ppal_stage(struct intel_guc *guc,
>> +                          struct guc_stage_desc *desc)
>> +{
>> +    GEM_BUG_ON(!(desc->attribute & GUC_STAGE_DESC_ATTR_ACTIVE));
>> +    GEM_BUG_ON(desc->lrc_count > 0);
>> +
>> +    memset(desc, 0, sizeof(*desc));
>> +}
>> +
>> +static inline void guc_ppal_stage_lrc_pin(struct intel_engine_cs *engine,
>> +                      struct i915_gem_context *ctx,
>> +                      struct intel_context *ce)
>> +{
>> +    struct intel_guc *guc = &ctx->i915->guc;
>> +    struct guc_stage_desc *desc;
>> +    struct guc_execlist_context *lrc;
>> +    u8 guc_class = engine->class;
>> +
>> +    /* 1:1 gem_context to ppal mapping */
>> +    GEM_BUG_ON(ce->sw_counter > MAX_ENGINE_INSTANCE);
>> +
>> +    desc = __get_ppal_stage_desc(guc, ce->sw_context_id);
>> +    GEM_BUG_ON(desc->lrc_alloc_map[guc_class].bitmap & BIT(ce->sw_counter));
>> +
>> +    if (!desc->lrc_count++)
>> +        guc_map_gem_ctx_to_ppal_stage(guc, desc, ce->sw_context_id);
>> +
>> +    lrc = &desc->lrc[guc_class][ce->sw_counter];
>> +    lrc->hw_context_desc = ce->lrc_desc;
>> +    lrc->ring_lrc = intel_guc_ggtt_offset(guc, ce->state) +
>> +            LRC_STATE_PN * PAGE_SIZE;
>> +    lrc->ring_begin = intel_guc_ggtt_offset(guc, ce->ring->vma);
>> +    lrc->ring_end = lrc->ring_begin + ce->ring->size - 1;
>> +
>> +    desc->lrc_alloc_map[guc_class].bitmap |= BIT(ce->sw_counter);
>> +}
>> +
>> +static inline void guc_ppal_stage_lrc_unpin(struct intel_context *ce)
>> +{
>> +    struct i915_gem_context *ctx = ce->gem_context;
>> +    struct intel_guc *guc = &ctx->i915->guc;
>> +    struct intel_engine_cs *engine = ctx->i915->engine[ce - ctx->__engine];
>> +    struct guc_stage_desc *desc;
>> +    struct guc_execlist_context *lrc;
>> +    u8 guc_class = engine->class;
>> +
>> +    desc = __get_ppal_stage_desc(guc, ce->sw_context_id);
>> +    GEM_BUG_ON(!(desc->lrc_alloc_map[guc_class].bitmap & BIT(ce->sw_counter)));
>> +
>> +    lrc = &desc->lrc[guc_class][ce->sw_counter];
>> +
>> +    /*
>> +     * GuC needs us to keep the lrc mapped until it has finished processing
>> +     * the ctx switch interrupt. When executing nop or very small workloads
>> +     * it is possible (but quite unlikely) that 2 contexts on different
>> +     * ELSPs of the same engine complete before the GuC manages to process
>> +     * the interrupt for the first completion. Experiments show this happens
>> +     * for ~0.2% of contexts when executing nop workloads on different
>> +     * contexts back to back on the same engine. When submitting nop
>> +     * workloads on all engines at the same time the hit-rate goes up to
>> +     * ~0.7%. In all the observed cases GuC required < 100us to catch up,
>> +     * with the single engine case being always below 20us.
>> +     *
>> +     * The completion of the request on the second lrc will reduce our
>> +     * pin_count on the first lrc to zero, thus triggering a call to this
>> +     * function potentially before GuC has had time to process the
>> +     * interrupt. To avoid this, we could get an extra pin on the context or
>> +     * delay the unpin when guc is in use, but given that the issue is
>> +     * limited to pathological scenarios and has very low hit rate even
>> +     * there, we can just introduce a small delay when it happens to give
>> +     * GuC time to catch up. Note also that since the requests
>> +     * have completed on the HW we've most likely already sent GuC the next
>> +     * contexts to be executed, so it is unlikely that by waiting we'll add
>> +     * bubbles in the HW execution.
>> +     */
>> +    WARN_ON(wait_for_us(lrc->is_present_in_sq == 0, 1000));
>> +
>> +    desc->lrc_alloc_map[guc_class].bitmap &= ~BIT(ce->sw_counter);
>> +    memset(lrc, 0, sizeof(*lrc));
>> +
>> +    if (!--desc->lrc_count)
>> +        guc_unmap_gem_ctx_from_ppal_stage(guc, desc);
>> +}
>> +
>> +static inline void guc_init_lrc_mapping(struct intel_guc *guc)
>> +{
>> +    struct drm_i915_private *i915 = guc_to_i915(guc);
>> +    struct intel_engine_cs *engine;
>> +    struct i915_gem_context *ctx;
>> +    struct intel_context *ce;
>> +    enum intel_engine_id id;
>> +
>> +    /*
>> +     * Some contexts (e.g. kernel_context) might have been pinned before we
>> +     * enabled GuC submission, so we need to add them to the GuC bookkeeping.
>> +     * Also, after a GuC reset we want to make sure that the information
>> +     * shared with GuC is properly reset.
>> +     *
>> +     * NOTE: the code below assumes 1:1 mapping between ppal descriptors and
>> +     * gem contexts for simplicity.
>> +     */
>> +    list_for_each_entry(ctx, &i915->contexts.list, link) {
>> +        if (atomic_read(&ctx->hw_id_pin_count)) {
>> +            struct guc_stage_desc *desc;
>> +
>> +            /* make sure the descriptor is clean... */
>> +            GEM_BUG_ON(ctx->hw_id > GUC_MAX_PPAL_STAGE_DESCRIPTORS);
>> +            desc = __get_ppal_stage_desc(guc, ctx->hw_id);
>> +            memset(desc, 0, sizeof(*desc));
>> +
>> +            /* ...and then (re-)pin all the lrcs */
>> +            for_each_engine(engine, i915, id) {
>> +                ce = to_intel_context(ctx, engine);
>> +                if (ce->pin_count > 0)
>> +                    guc_ppal_stage_lrc_pin(engine, ctx, ce);
>> +            }
>> +        }
>> +    }
>> +}
>> +
>> +static inline void guc_fini_lrc_mapping(struct intel_guc *guc)
>> +{
>> +    struct drm_i915_private *i915 = guc_to_i915(guc);
>> +    struct intel_engine_cs *engine;
>> +    struct i915_gem_context *ctx;
>> +    struct intel_context *ce;
>> +    enum intel_engine_id id;
>> +
>> +    list_for_each_entry(ctx, &i915->contexts.list, link) {
>> +        if (atomic_read(&ctx->hw_id_pin_count)) {
>> +            for_each_engine(engine, i915, id) {
>> +                ce = to_intel_context(ctx, engine);
>> +                if (ce->pin_count > 0)
>> +                    guc_ppal_stage_lrc_unpin(ce);
>> +            }
>> +        }
>> +    }
>> +}
>> +
>> +static void guc_context_unpin(struct intel_context *ce)
>> +{
>> +    guc_ppal_stage_lrc_unpin(ce);
>> +    intel_execlists_context_unpin(ce);
>> +}
>> +
>> +static const struct intel_context_ops guc_context_ops = {
>> +    .unpin = guc_context_unpin,
>> +    .destroy = intel_execlists_context_destroy,
>> +};
>> +
>> +static struct intel_context *guc_context_pin(struct intel_engine_cs *engine,
>> +                         struct i915_gem_context *ctx)
>> +{
>> +    struct intel_context *ce = to_intel_context(ctx, engine);
>> +
>> +    lockdep_assert_held(&ctx->i915->drm.struct_mutex);
>> +
>> +    if (likely(ce->pin_count++))
>> +        return ce;
>> +    GEM_BUG_ON(!ce->pin_count); /* no overflow please! */
>> +
>> +    ce->ops = &guc_context_ops;
>> +
>> +    ce = intel_execlists_context_pin(engine, ctx, ce);
>> +    if (!IS_ERR(ce))
>> +        guc_ppal_stage_lrc_pin(engine, ctx, ce);
>> +
>> +    return ce;
>> +}
>> +
>>   static void guc_set_default_submission(struct intel_engine_cs *engine)
>>   {
>>       /*
>> @@ -1274,6 +1449,8 @@ static void guc_set_default_submission(struct intel_engine_cs *engine)
>>       engine->execlists.tasklet.func = guc_submission_tasklet;
>> +    engine->context_pin = guc_context_pin;
>> +
>>       engine->park = guc_submission_park;
>>       engine->unpark = guc_submission_unpark;
>> @@ -1320,6 +1497,8 @@ int intel_guc_submission_enable(struct intel_guc *guc)
>>           engine->set_default_submission(engine);
>>       }
>> +    guc_init_lrc_mapping(guc);
>> +
>>       return 0;
>>   }
>> @@ -1329,6 +1508,7 @@ void intel_guc_submission_disable(struct intel_guc *guc)
>>       GEM_BUG_ON(dev_priv->gt.awake); /* GT should be parked first */
>> +    guc_fini_lrc_mapping(guc);
>>       guc_interrupts_release(dev_priv);
>>       guc_clients_disable(guc);
>>   }
>> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
>> index 48e0cdf42221..444bc83554c5 100644
>> --- a/drivers/gpu/drm/i915/intel_lrc.c
>> +++ b/drivers/gpu/drm/i915/intel_lrc.c
>> @@ -1071,7 +1071,7 @@ static void execlists_submit_request(struct i915_request *request)
>>       spin_unlock_irqrestore(&engine->timeline.lock, flags);
>>   }
>> -static void execlists_context_destroy(struct intel_context *ce)
>> +void intel_execlists_context_destroy(struct intel_context *ce)
>>   {
>>       GEM_BUG_ON(ce->pin_count);
>> @@ -1084,7 +1084,7 @@ static void execlists_context_destroy(struct intel_context *ce)
>>       i915_gem_object_put(ce->state->obj);
>>   }
>> -static void execlists_context_unpin(struct intel_context *ce)
>> +void intel_execlists_context_unpin(struct intel_context *ce)
>>   {
>>       struct intel_engine_cs *engine;
>> @@ -1141,10 +1141,10 @@ static int __context_pin(struct i915_gem_context *ctx, struct i915_vma *vma)
>>       return i915_vma_pin(vma, 0, 0, flags);
>>   }
>> -static struct intel_context *
>> -__execlists_context_pin(struct intel_engine_cs *engine,
>> -            struct i915_gem_context *ctx,
>> -            struct intel_context *ce)
>> +struct intel_context *
>> +intel_execlists_context_pin(struct intel_engine_cs *engine,
>> +                struct i915_gem_context *ctx,
>> +                struct intel_context *ce)
>>   {
>>       void *vaddr;
>>       int ret;
>> @@ -1205,8 +1205,8 @@ __execlists_context_pin(struct intel_engine_cs *engine,
>>   }
>>   static const struct intel_context_ops execlists_context_ops = {
>> -    .unpin = execlists_context_unpin,
>> -    .destroy = execlists_context_destroy,
>> +    .unpin = intel_execlists_context_unpin,
>> +    .destroy = intel_execlists_context_destroy,
>>   };
>>   static struct intel_context *
>> @@ -1224,7 +1224,7 @@ execlists_context_pin(struct intel_engine_cs *engine,
>>       ce->ops = &execlists_context_ops;
>> -    return __execlists_context_pin(engine, ctx, ce);
>> +    return intel_execlists_context_pin(engine, ctx, ce);
>>   }
>>   static int execlists_request_alloc(struct i915_request *request)
>> diff --git a/drivers/gpu/drm/i915/intel_lrc.h b/drivers/gpu/drm/i915/intel_lrc.h
>> index f5a5502ecf70..178b181ea651 100644
>> --- a/drivers/gpu/drm/i915/intel_lrc.h
>> +++ b/drivers/gpu/drm/i915/intel_lrc.h
>> @@ -104,4 +104,11 @@ void intel_lr_context_resume(struct drm_i915_private *dev_priv);
>>   void intel_execlists_set_default_submission(struct intel_engine_cs *engine);
>> +struct intel_context *
>> +intel_execlists_context_pin(struct intel_engine_cs *engine,
>> +                struct i915_gem_context *ctx,
>> +                struct intel_context *ce);
>> +void intel_execlists_context_unpin(struct intel_context *ce);
>> +void intel_execlists_context_destroy(struct intel_context *ce);
>> +
>>   #endif /* _INTEL_LRC_H_ */
>> diff --git a/drivers/gpu/drm/i915/selftests/intel_guc.c b/drivers/gpu/drm/i915/selftests/intel_guc.c
>> index bf27162fb327..eb4e8bbe8c82 100644
>> --- a/drivers/gpu/drm/i915/selftests/intel_guc.c
>> +++ b/drivers/gpu/drm/i915/selftests/intel_guc.c
>> @@ -301,6 +301,7 @@ static int igt_guc_doorbells(void *arg)
>>       if (err)
>>           goto unlock;
>> +    guc->starting_proxy_id = GUC_MAX_PPAL_STAGE_DESCRIPTORS - ATTEMPTS;
> maybe GEM_BUG_ON(GUC_MAX_PPAL_STAGE_DESCRIPTORS < ATTEMPTS) ?

ack, but BUILD_BUG_ON
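
For reference, the compile-time form of the check agreed on above could look roughly like the sketch below. It is a userspace stand-in, not driver code: DEMO_BUILD_BUG_ON imitates the kernel's BUILD_BUG_ON() using C11 static_assert, and both constant values are placeholders assumed for illustration, not taken from the patch.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stdint.h>

/* Placeholder values, assumed for illustration only. */
#define DEMO_MAX_PPAL_STAGE_DESCRIPTORS 1024u
#define DEMO_ATTEMPTS                   266u

/*
 * Userspace imitation of the kernel's BUILD_BUG_ON(cond):
 * the build fails when cond evaluates to true.
 */
#define DEMO_BUILD_BUG_ON(cond) \
	static_assert(!(cond), "BUILD_BUG_ON failed: " #cond)

/* Rejects at compile time any configuration where the subtraction would wrap. */
DEMO_BUILD_BUG_ON(DEMO_MAX_PPAL_STAGE_DESCRIPTORS < DEMO_ATTEMPTS);

/* Hypothetical helper mirroring the starting_proxy_id computation. */
static inline uint32_t demo_starting_proxy_id(void)
{
	return DEMO_MAX_PPAL_STAGE_DESCRIPTORS - DEMO_ATTEMPTS;
}
```

Compared with GEM_BUG_ON, which only fires at runtime on debug builds, the compile-time check catches a bad pair of constants before the selftest can ever run.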

>>       for (i = 0; i < ATTEMPTS; i++) {
>>           clients[i] = guc_client_alloc(dev_priv,
>>                             INTEL_INFO(dev_priv)->ring_mask,
>> @@ -334,7 +335,8 @@ static int igt_guc_doorbells(void *arg)
>>            * The check below is only valid because we keep a doorbell
>>            * assigned during the whole life of the client.
>>            */
>> -        if (clients[i]->stage_id >= GUC_NUM_DOORBELLS) {
>> +        if ((clients[i]->stage_id - guc->starting_proxy_id) >=
>> +             GUC_NUM_DOORBELLS) {
> I always get a bit nervous when I see unsigned numbers subtracted..

The check you asked for above should ensure we don't underflow here.
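
To make the underflow concern concrete, here is a small standalone sketch of the same comparison on 32-bit unsigned values. The function name and the numbers used with it are hypothetical; only the shape of the expression follows the hunk above.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Same shape as the selftest condition: true means "more clients than
 * doorbells". All arithmetic is unsigned, so a stage_id below
 * starting_id wraps to a huge value instead of going negative.
 */
static bool demo_too_many_clients(uint32_t stage_id, uint32_t starting_id,
				  uint32_t num_doorbells)
{
	return (stage_id - starting_id) >= num_doorbells;
}
```

With a starting id of 758 and 256 doorbells, an id of 760 passes, an id of 1014 trips the check as intended, and an id of 757 also trips it because the difference wraps to 0xFFFFFFFF. As long as ids are only handed out from starting_id upwards, that last case cannot occur.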

>>               pr_err("[%d] more clients than doorbells (%d >= %d)\n",
> %u ?

ack

Daniele

>>                      i, clients[i]->stage_id, GUC_NUM_DOORBELLS);
>>               err = -EINVAL;
> 

Thread overview: 49+ messages
2018-08-29 19:10 [PATCH 00/21] New GuC ABI Michal Wajdeczko
2018-08-29 19:10 ` [PATCH 01/21] drm/i915/guc: Update GuC power domain states Michal Wajdeczko
2018-08-29 20:57   ` Daniele Ceraolo Spurio
2018-08-29 22:43     ` Michal Wajdeczko
2018-08-29 19:10 ` [PATCH 02/21] drm/i915/guc: Don't allow GuC submission on pre-Gen11 Michal Wajdeczko
2018-08-29 19:16   ` Srivatsa, Anusha
2018-08-30 22:58   ` John Spotswood
2018-09-06  8:28   ` Joonas Lahtinen
2018-09-06  8:29   ` Joonas Lahtinen
2018-08-29 19:10 ` [PATCH 03/21] drm/i915/guc: Simplify preparation of GuC parameter block Michal Wajdeczko
2018-08-30 22:58   ` John Spotswood
2018-09-06  8:32   ` Joonas Lahtinen
2018-08-29 19:10 ` [PATCH 04/21] drm/i915/guc: Support dual Gen9/Gen11 parameters block Michal Wajdeczko
2018-08-30 22:58   ` John Spotswood
2018-09-06  8:39   ` Joonas Lahtinen
2018-08-29 19:10 ` [PATCH 05/21] drm/i915/guc: Update sample-forcewake command Michal Wajdeczko
2018-08-29 21:52   ` Daniele Ceraolo Spurio
2018-08-29 22:31     ` Michal Wajdeczko
2018-08-29 19:10 ` [PATCH 06/21] drm/i915/guc: Use guc_class instead of engine_class in fw interface Michal Wajdeczko
2018-08-29 19:58   ` Michel Thierry
2018-08-30  0:16     ` Lionel Landwerlin
2018-08-30 13:29       ` Lis, Tomasz
2018-08-30 14:16         ` Lis, Tomasz
2018-08-30 14:56         ` Lionel Landwerlin
2018-08-30 22:34       ` Daniele Ceraolo Spurio
2018-09-06  8:55   ` Joonas Lahtinen
2018-08-29 19:15 ` [PATCH 07/21] drm/i915/guc: New GuC ADS object definition Michal Wajdeczko
2018-08-29 19:16 ` [PATCH 08/21] drm/i915/guc: Make use of the SW counter field in the context descriptor Michal Wajdeczko
2018-08-30  0:08   ` Lionel Landwerlin
2018-08-30 14:15     ` Lis, Tomasz
2018-08-31 15:31       ` Lis, Tomasz
2018-08-29 19:17 ` [PATCH 09/21] drm/i915/guc: New GuC IDs based on engine class and instance Michal Wajdeczko
2018-08-29 19:18 ` [PATCH 10/21] drm/i915: Add hooks for (per-engine) context allocation/update/free Michal Wajdeczko
2018-08-29 19:18   ` [PATCH 11/21] drm/i915/guc: New GuC stage descriptors Michal Wajdeczko
2018-08-29 23:14     ` Daniele Ceraolo Spurio
2018-10-12 18:25     ` [RFC] " Daniele Ceraolo Spurio
2018-10-17 18:42       ` Lis, Tomasz
2018-10-18 21:07         ` Daniele Ceraolo Spurio
2018-08-29 19:18   ` [PATCH 12/21] drm/i915/guc: New GuC workqueue item submission mechanism Michal Wajdeczko
2018-08-29 19:18   ` [PATCH 13/21] drm/i915/guc: Add support for resume-parsing wq item Michal Wajdeczko
2018-08-29 19:18   ` [PATCH 14/21] drm/i915/guc: New reset-engine command Michal Wajdeczko
2018-08-29 19:18   ` [PATCH 15/21] drm/i915/guc: Support for extended GuC notification messages Michal Wajdeczko
2018-08-29 19:18   ` [PATCH 16/21] drm/i915/guc: New engine-reset-complete message Michal Wajdeczko
2018-08-29 19:18   ` [PATCH 17/21] drm/i915/guc: New GuC interrupt register for Gen11 Michal Wajdeczko
2018-08-29 19:18   ` [PATCH 18/21] drm/i915/guc: New GuC scratch registers " Michal Wajdeczko
2018-08-29 19:18   ` [PATCH 19/21] drm/i915/huc: New HuC status register " Michal Wajdeczko
2018-08-30 22:59     ` John Spotswood
2018-08-29 19:19 ` [PATCH 20/21] drm/i915/guc: Enable command transport buffers " Michal Wajdeczko
2018-08-29 19:19   ` [PATCH 21/21] HAX Don't enable GuC submission on pre-Gen11 even if forced Michal Wajdeczko
