All of lore.kernel.org
 help / color / mirror / Atom feed
From: Michal Wajdeczko <michal.wajdeczko@intel.com>
To: intel-gfx@lists.freedesktop.org
Subject: [PATCH v6 09/12] drm/i915/uc: Use correct error code for GuC initialization failure
Date: Wed,  4 Apr 2018 21:07:21 +0000	[thread overview]
Message-ID: <20180404210724.3416-9-michal.wajdeczko@intel.com> (raw)
In-Reply-To: <20180404210724.3416-1-michal.wajdeczko@intel.com>

Since commit 6ca9a2beb54a ("drm/i915: Unwind i915_gem_init() failure")
we believed that we correctly handle all errors encountered during
GuC initialization, including special one that indicates request to
run driver with disabled GPU submission (-EIO).

Unfortunately since commit 121981fafe69 ("drm/i915/guc: Combine
enable_guc_loading|submission modparams") we stopped using that
error code to avoid unwanted fallback to execlist submission mode.

In result any GuC initialization failure was treated as non-recoverable
error leading to driver load abort, so we could not even read related
GuC error log to investigate cause of the problem.

Fix that by always returning -EIO on uC hardware related failure.

v2: don't allow -EIO from uc_init
    don't call uc_fini[_misc] on -EIO
    mark guc fw as failed on hw init failure
    prepare uc_fini_hw to run after earlier -EIO

v3: update comments (Sagar)
    use sanitize functions on failure in init_hw (Michal)
    and also sanitize guc/huc fw in fini_hw (Michal)

v4: rebase
v5: rebase

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Michal Winiarski <michal.winiarski@intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Sagar Arun Kamble <sagar.a.kamble@intel.com>
Reviewed-by: Sagar Arun Kamble <sagar.a.kamble@intel.com>
---
 drivers/gpu/drm/i915/i915_gem.c    | 17 ++++++++++-------
 drivers/gpu/drm/i915/intel_guc.h   |  5 +++++
 drivers/gpu/drm/i915/intel_uc.c    | 15 +++++++++++----
 drivers/gpu/drm/i915/intel_uc_fw.h |  5 +++++
 4 files changed, 31 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index ea00f3e..6ab5f4c 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -5369,8 +5369,10 @@ int i915_gem_init(struct drm_i915_private *dev_priv)
 	intel_init_gt_powersave(dev_priv);
 
 	ret = intel_uc_init(dev_priv);
-	if (ret)
+	if (ret) {
+		GEM_BUG_ON(ret == -EIO);
 		goto err_pm;
+	}
 
 	ret = i915_gem_init_hw(dev_priv);
 	if (ret)
@@ -5417,7 +5419,8 @@ int i915_gem_init(struct drm_i915_private *dev_priv)
 	i915_gem_contexts_lost(dev_priv);
 	i915_gem_fini_hw(dev_priv);
 err_uc_init:
-	intel_uc_fini(dev_priv);
+	if (ret != -EIO)
+		intel_uc_fini(dev_priv);
 err_pm:
 	if (ret != -EIO) {
 		intel_cleanup_gt_powersave(dev_priv);
@@ -5431,15 +5434,15 @@ int i915_gem_init(struct drm_i915_private *dev_priv)
 	intel_uncore_forcewake_put(dev_priv, FORCEWAKE_ALL);
 	mutex_unlock(&dev_priv->drm.struct_mutex);
 
-	intel_uc_fini_misc(dev_priv);
-
-	if (ret != -EIO)
+	if (ret != -EIO) {
+		intel_uc_fini_misc(dev_priv);
 		i915_gem_cleanup_userptr(dev_priv);
+	}
 
 	if (ret == -EIO) {
 		/*
-		 * Allow engine initialisation to fail by marking the GPU as
-		 * wedged. But we only want to do this where the GPU is angry,
+		 * Allow engines or uC initialization to fail by marking the GPU
+		 * as wedged. But we only want to do this when the GPU is angry,
 		 * for all other failure, such as an allocation failure, bail.
 		 */
 		if (!i915_terminally_wedged(&dev_priv->gpu_error)) {
diff --git a/drivers/gpu/drm/i915/intel_guc.h b/drivers/gpu/drm/i915/intel_guc.h
index f1265e1..c587068 100644
--- a/drivers/gpu/drm/i915/intel_guc.h
+++ b/drivers/gpu/drm/i915/intel_guc.h
@@ -176,6 +176,11 @@ static inline int intel_guc_sanitize(struct intel_guc *guc)
 	return 0;
 }
 
+static inline bool intel_guc_is_loaded(struct intel_guc *guc)
+{
+	return intel_uc_fw_is_loaded(&guc->fw);
+}
+
 static inline void intel_guc_enable_msg(struct intel_guc *guc, u32 mask)
 {
 	spin_lock_irq(&guc->irq_lock);
diff --git a/drivers/gpu/drm/i915/intel_uc.c b/drivers/gpu/drm/i915/intel_uc.c
index 0439966..7862731 100644
--- a/drivers/gpu/drm/i915/intel_uc.c
+++ b/drivers/gpu/drm/i915/intel_uc.c
@@ -332,6 +332,8 @@ static void __uc_sanitize(struct drm_i915_private *i915)
 	intel_huc_sanitize(huc);
 	intel_guc_sanitize(guc);
 
+	GEM_BUG_ON(intel_guc_is_loaded(guc));
+
 	__intel_uc_reset_hw(i915);
 }
 
@@ -420,11 +422,13 @@ int intel_uc_init_hw(struct drm_i915_private *dev_priv)
 	 * Note that there is no fallback as either user explicitly asked for
 	 * the GuC or driver default option was to run with the GuC enabled.
 	 */
-	if (GEM_WARN_ON(ret == -EIO))
-		ret = -EINVAL;
-
 	dev_err(dev_priv->drm.dev, "GuC initialization failed %d\n", ret);
-	return ret;
+
+	/* Sanitize GuC/HuC to avoid clean-up on wedged */
+	__uc_sanitize(dev_priv);
+
+	/* We want to disable GPU submission but keep KMS alive */
+	return -EIO;
 }
 
 void intel_uc_fini_hw(struct drm_i915_private *dev_priv)
@@ -436,6 +440,9 @@ void intel_uc_fini_hw(struct drm_i915_private *dev_priv)
 
 	GEM_BUG_ON(!HAS_GUC(dev_priv));
 
+	if (!intel_guc_is_loaded(guc))
+		return;
+
 	if (USES_GUC_SUBMISSION(dev_priv))
 		intel_guc_submission_disable(guc);
 
diff --git a/drivers/gpu/drm/i915/intel_uc_fw.h b/drivers/gpu/drm/i915/intel_uc_fw.h
index dc33b12..77ad2aa 100644
--- a/drivers/gpu/drm/i915/intel_uc_fw.h
+++ b/drivers/gpu/drm/i915/intel_uc_fw.h
@@ -121,6 +121,11 @@ static inline void intel_uc_fw_sanitize(struct intel_uc_fw *uc_fw)
 		uc_fw->load_status = INTEL_UC_FIRMWARE_PENDING;
 }
 
+static inline bool intel_uc_fw_is_loaded(struct intel_uc_fw *uc_fw)
+{
+	return uc_fw->load_status == INTEL_UC_FIRMWARE_SUCCESS;
+}
+
 /**
  * intel_uc_fw_get_upload_size() - Get size of firmware needed to be uploaded.
  * @uc_fw: uC firmware.
-- 
1.9.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

  parent reply	other threads:[~2018-04-04 21:07 UTC|newest]

Thread overview: 16+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-04-04 21:07 [PATCH v6 01/12] drm/i915: Correctly handle error path in i915_gem_init_hw Michal Wajdeczko
2018-04-04 21:07 ` [PATCH v6 02/12] drm/i915: Move i915_gem_fini to i915_gem.c Michal Wajdeczko
2018-04-04 21:07 ` [PATCH v6 03/12] drm/i915: Introduce i915_gem_fini_hw for symmetry with i915_gem_init_hw Michal Wajdeczko
2018-04-04 21:07 ` [PATCH v6 04/12] drm/i915: Add i915_gem_fini_hw to i915_gem_suspend Michal Wajdeczko
2018-04-04 21:07 ` [PATCH v6 05/12] drm/i915: Add i915_gem_fini_hw to i915_reset Michal Wajdeczko
2018-04-04 21:07 ` [PATCH v6 06/12] drm/i915/guc: Ignore dev_priv->gt.awake while disabling submission Michal Wajdeczko
2018-04-04 21:07 ` [PATCH v6 07/12] drm/i915/guc: Restore symmetric doorbell cleanup Michal Wajdeczko
2018-04-04 21:07 ` [PATCH v6 08/12] drm/i915/uc: Fully sanitize uC within intel_uc_fini_hw Michal Wajdeczko
2018-04-04 21:07 ` Michal Wajdeczko [this message]
2018-04-04 21:07 ` [PATCH v6 10/12] drm/i915/uc: Use helper functions to detect fw load status Michal Wajdeczko
2018-04-04 21:07 ` [PATCH v6 11/12] drm/i915/uc: Trivial s/dev_priv/i915 in intel_uc.c Michal Wajdeczko
2018-04-04 21:07 ` [PATCH v6 12/12] HAX: Enable GuC for CI Michal Wajdeczko
2018-04-04 21:57 ` ✓ Fi.CI.BAT: success for series starting with [v6,01/12] drm/i915: Correctly handle error path in i915_gem_init_hw Patchwork
2018-04-05  0:33 ` ✗ Fi.CI.IGT: failure " Patchwork
2018-04-05 10:02   ` Sagar Arun Kamble
2018-04-05 10:19     ` Chris Wilson

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20180404210724.3416-9-michal.wajdeczko@intel.com \
    --to=michal.wajdeczko@intel.com \
    --cc=intel-gfx@lists.freedesktop.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.