All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/4] add more probe failures
@ 2019-07-29 15:22 Michal Wajdeczko
  2019-07-29 15:22 ` [PATCH 1/4] drm/i915: Report -ENODEV after injecting probe failure Michal Wajdeczko
                   ` (5 more replies)
  0 siblings, 6 replies; 15+ messages in thread
From: Michal Wajdeczko @ 2019-07-29 15:22 UTC (permalink / raw)
  To: intel-gfx

There are still many places where we can inject failures,
but doing so without preparation may make CI unhappy as
it will treat any new ERROR/WARN very seriously.

Mitigate that by converting some of dmesg messages to
debug after injecting fake error.

Michal Wajdeczko (4):
  drm/i915: Report -ENODEV after injecting probe failure
  drm/i915: Add i915 to i915_inject_probe_failure
  drm/i915/uc: Inject load errors into intel_uc_init_hw
  drm/i915/wopcm: Don't fail on WOPCM partitioning failure

 .../gpu/drm/i915/display/intel_connector.c    |  2 +-
 drivers/gpu/drm/i915/gt/intel_engine_cs.c     |  2 +-
 drivers/gpu/drm/i915/gt/uc/intel_uc.c         | 11 +++
 drivers/gpu/drm/i915/i915_drv.c               | 27 ++++----
 drivers/gpu/drm/i915/i915_drv.h               | 19 ++++--
 drivers/gpu/drm/i915/i915_gem.c               | 24 ++-----
 drivers/gpu/drm/i915/i915_pci.c               |  4 +-
 drivers/gpu/drm/i915/intel_gvt.c              |  2 +-
 drivers/gpu/drm/i915/intel_uncore.c           |  2 +-
 drivers/gpu/drm/i915/intel_wopcm.c            | 68 ++++++++++---------
 drivers/gpu/drm/i915/intel_wopcm.h            | 11 ++-
 11 files changed, 97 insertions(+), 75 deletions(-)

-- 
2.19.2

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH 1/4] drm/i915: Report -ENODEV after injecting probe failure
  2019-07-29 15:22 [PATCH 0/4] add more probe failures Michal Wajdeczko
@ 2019-07-29 15:22 ` Michal Wajdeczko
  2019-07-29 17:54   ` Daniele Ceraolo Spurio
  2019-07-29 15:22 ` [PATCH 2/4] drm/i915: Add i915 to i915_inject_probe_failure Michal Wajdeczko
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 15+ messages in thread
From: Michal Wajdeczko @ 2019-07-29 15:22 UTC (permalink / raw)
  To: intel-gfx

We want to do more driver testing using injected load errors,
but we don't want to limit ourselves to use only -ENODEV (as
other errors are interpreted as real issues, like this:

<4> [381.569479] i915: probe of 0000:00:02.0 failed with error -7

Suggested-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
---
 drivers/gpu/drm/i915/i915_pci.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_pci.c b/drivers/gpu/drm/i915/i915_pci.c
index bd9211b3d76e..332949c20a29 100644
--- a/drivers/gpu/drm/i915/i915_pci.c
+++ b/drivers/gpu/drm/i915/i915_pci.c
@@ -956,7 +956,7 @@ static int i915_pci_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
 
 	err = i915_driver_probe(pdev, ent);
 	if (err)
-		return err;
+		return i915_error_injected() ? -ENODEV : err;
 
 	if (i915_inject_probe_failure()) {
 		i915_pci_remove(pdev);
-- 
2.19.2

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 2/4] drm/i915: Add i915 to i915_inject_probe_failure
  2019-07-29 15:22 [PATCH 0/4] add more probe failures Michal Wajdeczko
  2019-07-29 15:22 ` [PATCH 1/4] drm/i915: Report -ENODEV after injecting probe failure Michal Wajdeczko
@ 2019-07-29 15:22 ` Michal Wajdeczko
  2019-07-30  8:07   ` kbuild test robot
  2019-07-29 15:23 ` [PATCH 3/4] drm/i915/uc: Inject load errors into intel_uc_init_hw Michal Wajdeczko
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 15+ messages in thread
From: Michal Wajdeczko @ 2019-07-29 15:22 UTC (permalink / raw)
  To: intel-gfx

With i915 added to i915_inject_probe_failure we can use dedicated
printk when injecting artificial load failure.

Also make this function look like other i915 functions that return
error code and make it more flexible to return any provided error
code instead of previously assumed -ENODEV.

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
---
 .../gpu/drm/i915/display/intel_connector.c    |  2 +-
 drivers/gpu/drm/i915/gt/intel_engine_cs.c     |  2 +-
 drivers/gpu/drm/i915/i915_drv.c               | 27 ++++++++++---------
 drivers/gpu/drm/i915/i915_drv.h               | 12 +++++----
 drivers/gpu/drm/i915/i915_gem.c               | 10 +++----
 drivers/gpu/drm/i915/i915_pci.c               |  2 +-
 drivers/gpu/drm/i915/intel_gvt.c              |  2 +-
 drivers/gpu/drm/i915/intel_uncore.c           |  2 +-
 drivers/gpu/drm/i915/intel_wopcm.c            |  2 +-
 9 files changed, 31 insertions(+), 30 deletions(-)

diff --git a/drivers/gpu/drm/i915/display/intel_connector.c b/drivers/gpu/drm/i915/display/intel_connector.c
index d0163d86c42a..cf8823ce9606 100644
--- a/drivers/gpu/drm/i915/display/intel_connector.c
+++ b/drivers/gpu/drm/i915/display/intel_connector.c
@@ -118,7 +118,7 @@ int intel_connector_register(struct drm_connector *connector)
 	if (ret)
 		goto err;
 
-	if (i915_inject_probe_failure()) {
+	if (i915_inject_probe_failure(to_i915(connector->dev))) {
 		ret = -EFAULT;
 		goto err_backlight;
 	}
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
index 65cbf1d9118d..8bd9a9adf4a5 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
@@ -426,7 +426,7 @@ int intel_engines_init_mmio(struct drm_i915_private *i915)
 	WARN_ON(engine_mask &
 		GENMASK(BITS_PER_TYPE(mask) - 1, I915_NUM_ENGINES));
 
-	if (i915_inject_probe_failure())
+	if (i915_inject_probe_failure(i915))
 		return -ENODEV;
 
 	for (i = 0; i < ARRAY_SIZE(intel_engines); i++) {
diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index f2d3d754af37..5e9cb7e91ae1 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -83,19 +83,20 @@ static struct drm_driver driver;
 #if IS_ENABLED(CONFIG_DRM_I915_DEBUG)
 static unsigned int i915_probe_fail_count;
 
-bool __i915_inject_probe_failure(const char *func, int line)
+int __i915_inject_load_error(struct drm_i915_private *i915, int err,
+			     const char *func, int line)
 {
 	if (i915_probe_fail_count >= i915_modparams.inject_load_failure)
-		return false;
+		return 0;
 
-	if (++i915_probe_fail_count == i915_modparams.inject_load_failure) {
-		DRM_INFO("Injecting failure at checkpoint %u [%s:%d]\n",
-			 i915_modparams.inject_load_failure, func, line);
-		i915_modparams.inject_load_failure = 0;
-		return true;
-	}
+	if (++i915_probe_fail_count < i915_modparams.inject_load_failure)
+		return 0;
 
-	return false;
+	__i915_printk(i915, KERN_INFO,
+		      "Injecting failure %d at checkpoint %u [%s:%d]\n",
+		      err, i915_modparams.inject_load_failure, func, line);
+	i915_modparams.inject_load_failure = 0;
+	return err;
 }
 
 bool i915_error_injected(void)
@@ -687,7 +688,7 @@ static int i915_driver_modeset_probe(struct drm_device *dev)
 	struct pci_dev *pdev = dev_priv->drm.pdev;
 	int ret;
 
-	if (i915_inject_probe_failure())
+	if (i915_inject_probe_failure(dev_priv))
 		return -ENODEV;
 
 	if (HAS_DISPLAY(dev_priv)) {
@@ -894,7 +895,7 @@ static int i915_driver_early_probe(struct drm_i915_private *dev_priv)
 {
 	int ret = 0;
 
-	if (i915_inject_probe_failure())
+	if (i915_inject_probe_failure(dev_priv))
 		return -ENODEV;
 
 	intel_device_info_subplatform_init(dev_priv);
@@ -985,7 +986,7 @@ static int i915_driver_mmio_probe(struct drm_i915_private *dev_priv)
 {
 	int ret;
 
-	if (i915_inject_probe_failure())
+	if (i915_inject_probe_failure(dev_priv))
 		return -ENODEV;
 
 	if (i915_get_bridge_dev(dev_priv))
@@ -1530,7 +1531,7 @@ static int i915_driver_hw_probe(struct drm_i915_private *dev_priv)
 	struct pci_dev *pdev = dev_priv->drm.pdev;
 	int ret;
 
-	if (i915_inject_probe_failure())
+	if (i915_inject_probe_failure(dev_priv))
 		return -ENODEV;
 
 	intel_device_info_runtime_init(dev_priv);
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 59d4a1146039..6b059d51aaff 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -122,19 +122,21 @@
 
 #if IS_ENABLED(CONFIG_DRM_I915_DEBUG)
 
-bool __i915_inject_probe_failure(const char *func, int line);
-#define i915_inject_probe_failure() \
-	__i915_inject_probe_failure(__func__, __LINE__)
-
+int __i915_inject_load_error(struct drm_i915_private *i915, int err,
+			     const char *func, int line);
+#define i915_inject_load_error(_i915, _err) \
+	__i915_inject_load_error((_i915), (_err), __func__, __LINE__)
 bool i915_error_injected(void);
 
 #else
 
-#define i915_inject_probe_failure() false
+#define i915_inject_load_error(_err) 0
 #define i915_error_injected() false
 
 #endif
 
+#define i915_inject_probe_failure(i915) i915_inject_load_error((i915), -ENODEV)
+
 #define i915_probe_error(i915, fmt, ...)				   \
 	__i915_printk(i915, i915_error_injected() ? KERN_DEBUG : KERN_ERR, \
 		      fmt, ##__VA_ARGS__)
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 01dd0d1d9bf6..32b4fa5c579c 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -1511,15 +1511,13 @@ int i915_gem_init(struct drm_i915_private *dev_priv)
 	if (ret)
 		goto err_gt;
 
-	if (i915_inject_probe_failure()) {
-		ret = -ENODEV;
+	ret = i915_inject_load_error(dev_priv, -ENODEV);
+	if (ret)
 		goto err_gt;
-	}
 
-	if (i915_inject_probe_failure()) {
-		ret = -EIO;
+	ret = i915_inject_load_error(dev_priv, -EIO);
+	if (ret)
 		goto err_gt;
-	}
 
 	intel_uncore_forcewake_put(&dev_priv->uncore, FORCEWAKE_ALL);
 	mutex_unlock(&dev_priv->drm.struct_mutex);
diff --git a/drivers/gpu/drm/i915/i915_pci.c b/drivers/gpu/drm/i915/i915_pci.c
index 332949c20a29..1ca010cb836a 100644
--- a/drivers/gpu/drm/i915/i915_pci.c
+++ b/drivers/gpu/drm/i915/i915_pci.c
@@ -958,7 +958,7 @@ static int i915_pci_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
 	if (err)
 		return i915_error_injected() ? -ENODEV : err;
 
-	if (i915_inject_probe_failure()) {
+	if (i915_inject_probe_failure(to_i915(pci_get_drvdata(pdev)))) {
 		i915_pci_remove(pdev);
 		return -ENODEV;
 	}
diff --git a/drivers/gpu/drm/i915/intel_gvt.c b/drivers/gpu/drm/i915/intel_gvt.c
index c66b2d8a6219..2b6c016387c2 100644
--- a/drivers/gpu/drm/i915/intel_gvt.c
+++ b/drivers/gpu/drm/i915/intel_gvt.c
@@ -95,7 +95,7 @@ int intel_gvt_init(struct drm_i915_private *dev_priv)
 {
 	int ret;
 
-	if (i915_inject_probe_failure())
+	if (i915_inject_probe_failure(dev_priv))
 		return -ENODEV;
 
 	if (!i915_modparams.enable_gvt) {
diff --git a/drivers/gpu/drm/i915/intel_uncore.c b/drivers/gpu/drm/i915/intel_uncore.c
index 475ab3d4d91d..88eb3b5c226d 100644
--- a/drivers/gpu/drm/i915/intel_uncore.c
+++ b/drivers/gpu/drm/i915/intel_uncore.c
@@ -1331,7 +1331,7 @@ static int __fw_domain_init(struct intel_uncore *uncore,
 	GEM_BUG_ON(domain_id >= FW_DOMAIN_ID_COUNT);
 	GEM_BUG_ON(uncore->fw_domain[domain_id]);
 
-	if (i915_inject_probe_failure())
+	if (i915_inject_probe_failure(uncore->i915))
 		return -ENOMEM;
 
 	d = kzalloc(sizeof(*d), GFP_KERNEL);
diff --git a/drivers/gpu/drm/i915/intel_wopcm.c b/drivers/gpu/drm/i915/intel_wopcm.c
index 0e86a9e85b49..e173a8e61bfd 100644
--- a/drivers/gpu/drm/i915/intel_wopcm.c
+++ b/drivers/gpu/drm/i915/intel_wopcm.c
@@ -177,7 +177,7 @@ int intel_wopcm_init(struct intel_wopcm *wopcm)
 
 	GEM_BUG_ON(!wopcm->size);
 
-	if (i915_inject_probe_failure())
+	if (i915_inject_probe_failure(i915))
 		return -E2BIG;
 
 	if (guc_fw_size >= wopcm->size) {
-- 
2.19.2

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 3/4] drm/i915/uc: Inject load errors into intel_uc_init_hw
  2019-07-29 15:22 [PATCH 0/4] add more probe failures Michal Wajdeczko
  2019-07-29 15:22 ` [PATCH 1/4] drm/i915: Report -ENODEV after injecting probe failure Michal Wajdeczko
  2019-07-29 15:22 ` [PATCH 2/4] drm/i915: Add i915 to i915_inject_probe_failure Michal Wajdeczko
@ 2019-07-29 15:23 ` Michal Wajdeczko
  2019-07-30  8:47   ` Chris Wilson
  2019-07-29 15:23 ` [PATCH 4/4] drm/i915/wopcm: Don't fail on WOPCM partitioning failure Michal Wajdeczko
                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 15+ messages in thread
From: Michal Wajdeczko @ 2019-07-29 15:23 UTC (permalink / raw)
  To: intel-gfx

Inject load errors into intel_uc_init_hw to make sure we
correctly handle uC initialization failures.

To avoid complains from CI about inserted errors or warnings,
use helper macro that checks if there was an error injection.

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/gt/uc/intel_uc.c | 8 ++++++++
 drivers/gpu/drm/i915/i915_drv.h       | 7 ++++++-
 drivers/gpu/drm/i915/i915_gem.c       | 2 +-
 3 files changed, 15 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_uc.c b/drivers/gpu/drm/i915/gt/uc/intel_uc.c
index fafa9be1e12a..9e1156c29cb1 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_uc.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_uc.c
@@ -400,6 +400,14 @@ int intel_uc_init_hw(struct intel_uc *uc)
 	if (!intel_uc_is_using_guc(uc))
 		return 0;
 
+	ret = i915_inject_load_error(i915, -EIO);
+	if (ret)
+		return ret;
+
+	ret = i915_inject_load_error(i915, -ENXIO);
+	if (ret)
+		return ret;
+
 	GEM_BUG_ON(!intel_uc_fw_supported(&guc->fw));
 
 	guc_reset_interrupts(guc);
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 6b059d51aaff..36f7a146f06a 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -137,9 +137,14 @@ bool i915_error_injected(void);
 
 #define i915_inject_probe_failure(i915) i915_inject_load_error((i915), -ENODEV)
 
-#define i915_probe_error(i915, fmt, ...)				   \
+#define I915_ERROR(i915, fmt, ...) \
 	__i915_printk(i915, i915_error_injected() ? KERN_DEBUG : KERN_ERR, \
 		      fmt, ##__VA_ARGS__)
+#define I915_WARN(i915, fmt, ...) \
+	__i915_printk(i915, i915_error_injected() ? KERN_DEBUG : KERN_WARNING, \
+		      fmt, ##__VA_ARGS__)
+
+#define i915_probe_error(i915, fmt, ...) I915_ERROR(i915, fmt, ##__VA_ARGS__)
 
 struct drm_i915_gem_object;
 
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 32b4fa5c579c..c437ab5554ec 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -1248,7 +1248,7 @@ int i915_gem_init_hw(struct drm_i915_private *i915)
 	/* We can't enable contexts until all firmware is loaded */
 	ret = intel_uc_init_hw(&i915->gt.uc);
 	if (ret) {
-		DRM_ERROR("Enabling uc failed (%d)\n", ret);
+		I915_ERROR(i915, "Enabling uc failed (%d)\n", ret);
 		goto out;
 	}
 
-- 
2.19.2

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 4/4] drm/i915/wopcm: Don't fail on WOPCM partitioning failure
  2019-07-29 15:22 [PATCH 0/4] add more probe failures Michal Wajdeczko
                   ` (2 preceding siblings ...)
  2019-07-29 15:23 ` [PATCH 3/4] drm/i915/uc: Inject load errors into intel_uc_init_hw Michal Wajdeczko
@ 2019-07-29 15:23 ` Michal Wajdeczko
  2019-07-29 18:04   ` Daniele Ceraolo Spurio
  2019-07-29 17:12 ` ✓ Fi.CI.BAT: success for add more probe failures Patchwork
  2019-07-29 19:56 ` ✗ Fi.CI.BAT: failure for add more probe failures (rev2) Patchwork
  5 siblings, 1 reply; 15+ messages in thread
From: Michal Wajdeczko @ 2019-07-29 15:23 UTC (permalink / raw)
  To: intel-gfx

We shouldn't immediately fail on WOPCM partitioning or programming
as we plan to restore fallback on any GuC related failures.

While around, add some more probe failure injections.

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/gt/uc/intel_uc.c |  3 ++
 drivers/gpu/drm/i915/i915_gem.c       | 12 +----
 drivers/gpu/drm/i915/intel_wopcm.c    | 66 +++++++++++++++------------
 drivers/gpu/drm/i915/intel_wopcm.h    | 11 ++++-
 4 files changed, 50 insertions(+), 42 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_uc.c b/drivers/gpu/drm/i915/gt/uc/intel_uc.c
index 9e1156c29cb1..f096063189d4 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_uc.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_uc.c
@@ -410,6 +410,9 @@ int intel_uc_init_hw(struct intel_uc *uc)
 
 	GEM_BUG_ON(!intel_uc_fw_supported(&guc->fw));
 
+	if (!intel_wopcm_is_ready(&i915->wopcm))
+		return -ENXIO;
+
 	guc_reset_interrupts(guc);
 
 	/* WaEnableuKernelHeaderValidFix:skl */
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index c437ab5554ec..02e09864856f 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -1239,11 +1239,7 @@ int i915_gem_init_hw(struct drm_i915_private *i915)
 		goto out;
 	}
 
-	ret = intel_wopcm_init_hw(&i915->wopcm, gt);
-	if (ret) {
-		DRM_ERROR("Enabling WOPCM failed (%d)\n", ret);
-		goto out;
-	}
+	intel_wopcm_init_hw(&i915->wopcm, gt);
 
 	/* We can't enable contexts until all firmware is loaded */
 	ret = intel_uc_init_hw(&i915->gt.uc);
@@ -1432,10 +1428,7 @@ int i915_gem_init(struct drm_i915_private *dev_priv)
 		return ret;
 
 	intel_uc_fetch_firmwares(&dev_priv->gt.uc);
-
-	ret = intel_wopcm_init(&dev_priv->wopcm);
-	if (ret)
-		goto err_uc_fw;
+	intel_wopcm_init(&dev_priv->wopcm);
 
 	/* This is just a security blanket to placate dragons.
 	 * On some systems, we very sporadically observe that the first TLBs
@@ -1559,7 +1552,6 @@ int i915_gem_init(struct drm_i915_private *dev_priv)
 	intel_uncore_forcewake_put(&dev_priv->uncore, FORCEWAKE_ALL);
 	mutex_unlock(&dev_priv->drm.struct_mutex);
 
-err_uc_fw:
 	intel_uc_cleanup_firmwares(&dev_priv->gt.uc);
 
 	if (ret != -EIO) {
diff --git a/drivers/gpu/drm/i915/intel_wopcm.c b/drivers/gpu/drm/i915/intel_wopcm.c
index e173a8e61bfd..89b2ffef879a 100644
--- a/drivers/gpu/drm/i915/intel_wopcm.c
+++ b/drivers/gpu/drm/i915/intel_wopcm.c
@@ -137,7 +137,11 @@ static inline int check_hw_restriction(struct drm_i915_private *i915,
 				       u32 guc_wopcm_base, u32 guc_wopcm_size,
 				       u32 huc_fw_size)
 {
-	int err = 0;
+	int err;
+
+	err = i915_inject_load_error(i915, -E2BIG);
+	if (err)
+		return err;
 
 	if (IS_GEN(i915, 9))
 		err = gen9_check_dword_gap(guc_wopcm_base, guc_wopcm_size);
@@ -156,12 +160,10 @@ static inline int check_hw_restriction(struct drm_i915_private *i915,
  * This function will partition WOPCM space based on GuC and HuC firmware sizes
  * and will allocate max remaining for use by GuC. This function will also
  * enforce platform dependent hardware restrictions on GuC WOPCM offset and
- * size. It will fail the WOPCM init if any of these checks were failed, so that
- * the following GuC firmware uploading would be aborted.
- *
- * Return: 0 on success, non-zero error code on failure.
+ * size. It will fail the WOPCM init if any of these checks fail, so that the
+ * following WOPCM registers setup and GuC firmware uploading would be aborted.
  */
-int intel_wopcm_init(struct intel_wopcm *wopcm)
+void intel_wopcm_init(struct intel_wopcm *wopcm)
 {
 	struct drm_i915_private *i915 = wopcm_to_i915(wopcm);
 	u32 guc_fw_size = intel_uc_fw_get_upload_size(&i915->gt.uc.guc.fw);
@@ -173,23 +175,23 @@ int intel_wopcm_init(struct intel_wopcm *wopcm)
 	int err;
 
 	if (!USES_GUC(i915))
-		return 0;
+		return;
 
 	GEM_BUG_ON(!wopcm->size);
 
 	if (i915_inject_probe_failure(i915))
-		return -E2BIG;
+		return;
 
 	if (guc_fw_size >= wopcm->size) {
 		DRM_ERROR("GuC FW (%uKiB) is too big to fit in WOPCM.",
 			  guc_fw_size / 1024);
-		return -E2BIG;
+		goto fail;
 	}
 
 	if (huc_fw_size >= wopcm->size) {
 		DRM_ERROR("HuC FW (%uKiB) is too big to fit in WOPCM.",
 			  huc_fw_size / 1024);
-		return -E2BIG;
+		goto fail;
 	}
 
 	guc_wopcm_base = ALIGN(huc_fw_size + WOPCM_RESERVED_SIZE,
@@ -197,7 +199,7 @@ int intel_wopcm_init(struct intel_wopcm *wopcm)
 	if ((guc_wopcm_base + ctx_rsvd) >= wopcm->size) {
 		DRM_ERROR("GuC WOPCM base (%uKiB) is too big.\n",
 			  guc_wopcm_base / 1024);
-		return -E2BIG;
+		goto fail;
 	}
 
 	guc_wopcm_size = wopcm->size - guc_wopcm_base - ctx_rsvd;
@@ -211,18 +213,19 @@ int intel_wopcm_init(struct intel_wopcm *wopcm)
 		DRM_ERROR("Need %uKiB WOPCM for GuC, %uKiB available.\n",
 			  (guc_fw_size + guc_wopcm_rsvd) / 1024,
 			  guc_wopcm_size / 1024);
-		return -E2BIG;
+		goto fail;
 	}
 
 	err = check_hw_restriction(i915, guc_wopcm_base, guc_wopcm_size,
 				   huc_fw_size);
 	if (err)
-		return err;
+		goto fail;
 
 	wopcm->guc.base = guc_wopcm_base;
 	wopcm->guc.size = guc_wopcm_size;
-
-	return 0;
+	return;
+fail:
+	I915_WARN(i915, "WOPCM partitioning failed, GuC will fail to load!\n");
 }
 
 static int
@@ -231,14 +234,25 @@ write_and_verify(struct intel_gt *gt,
 {
 	struct intel_uncore *uncore = gt->uncore;
 	u32 reg_val;
+	int err;
 
 	GEM_BUG_ON(val & ~mask);
 
+	err = i915_inject_load_error(gt->i915, -EIO);
+	if (err)
+		return err;
+
 	intel_uncore_write(uncore, reg, val);
 
 	reg_val = intel_uncore_read(uncore, reg);
 
-	return (reg_val & mask) != (val | locked_bit) ? -EIO : 0;
+	if ((reg_val & mask) != (val | locked_bit)) {
+		I915_WARN(gt->i915, "WOPCM register %#x=%#x\n",
+			  i915_mmio_reg_offset(reg), reg_val);
+		return -EIO;
+	}
+
+	return 0;
 }
 
 /**
@@ -249,19 +263,16 @@ write_and_verify(struct intel_gt *gt,
  * Setup the GuC WOPCM size and offset registers with the calculated values. It
  * will verify the register values to make sure the registers are locked with
  * correct values.
- *
- * Return: 0 on success. -EIO if registers were locked with incorrect values.
  */
-int intel_wopcm_init_hw(struct intel_wopcm *wopcm, struct intel_gt *gt)
+void intel_wopcm_init_hw(struct intel_wopcm *wopcm, struct intel_gt *gt)
 {
 	struct drm_i915_private *i915 = wopcm_to_i915(wopcm);
-	struct intel_uncore *uncore = gt->uncore;
 	u32 huc_agent;
 	u32 mask;
 	int err;
 
-	if (!USES_GUC(i915))
-		return 0;
+	if (!intel_wopcm_guc_size(wopcm))
+		return;
 
 	GEM_BUG_ON(!HAS_GT_UC(i915));
 	GEM_BUG_ON(!wopcm->guc.size);
@@ -281,14 +292,9 @@ int intel_wopcm_init_hw(struct intel_wopcm *wopcm, struct intel_gt *gt)
 	if (err)
 		goto err_out;
 
-	return 0;
+	wopcm->ready = true;
+	return;
 
 err_out:
-	DRM_ERROR("Failed to init WOPCM registers:\n");
-	DRM_ERROR("DMA_GUC_WOPCM_OFFSET=%#x\n",
-		  intel_uncore_read(uncore, DMA_GUC_WOPCM_OFFSET));
-	DRM_ERROR("GUC_WOPCM_SIZE=%#x\n",
-		  intel_uncore_read(uncore, GUC_WOPCM_SIZE));
-
-	return err;
+	I915_WARN(i915, "WOPCM programming failed, GuC will fail to load!\n");
 }
diff --git a/drivers/gpu/drm/i915/intel_wopcm.h b/drivers/gpu/drm/i915/intel_wopcm.h
index 56aaed4d64ff..daf9c1dbe20b 100644
--- a/drivers/gpu/drm/i915/intel_wopcm.h
+++ b/drivers/gpu/drm/i915/intel_wopcm.h
@@ -17,6 +17,7 @@ struct intel_gt;
  * @guc: GuC WOPCM Region info.
  * @guc.base: GuC WOPCM base which is offset from WOPCM base.
  * @guc.size: Size of the GuC WOPCM region.
+ * @ready: indicates that WOPCM registers are correctly programmed.
  */
 struct intel_wopcm {
 	u32 size;
@@ -24,6 +25,7 @@ struct intel_wopcm {
 		u32 base;
 		u32 size;
 	} guc;
+	bool ready;
 };
 
 /**
@@ -42,7 +44,12 @@ static inline u32 intel_wopcm_guc_size(struct intel_wopcm *wopcm)
 }
 
 void intel_wopcm_init_early(struct intel_wopcm *wopcm);
-int intel_wopcm_init(struct intel_wopcm *wopcm);
-int intel_wopcm_init_hw(struct intel_wopcm *wopcm, struct intel_gt *gt);
+void intel_wopcm_init(struct intel_wopcm *wopcm);
+void intel_wopcm_init_hw(struct intel_wopcm *wopcm, struct intel_gt *gt);
+
+static inline bool intel_wopcm_is_ready(struct intel_wopcm *wopcm)
+{
+	return wopcm->ready;
+}
 
 #endif
-- 
2.19.2

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* ✓ Fi.CI.BAT: success for add more probe failures
  2019-07-29 15:22 [PATCH 0/4] add more probe failures Michal Wajdeczko
                   ` (3 preceding siblings ...)
  2019-07-29 15:23 ` [PATCH 4/4] drm/i915/wopcm: Don't fail on WOPCM partitioning failure Michal Wajdeczko
@ 2019-07-29 17:12 ` Patchwork
  2019-07-29 19:56 ` ✗ Fi.CI.BAT: failure for add more probe failures (rev2) Patchwork
  5 siblings, 0 replies; 15+ messages in thread
From: Patchwork @ 2019-07-29 17:12 UTC (permalink / raw)
  To: Michal Wajdeczko; +Cc: intel-gfx

== Series Details ==

Series: add more probe failures
URL   : https://patchwork.freedesktop.org/series/64390/
State : success

== Summary ==

CI Bug Log - changes from CI_DRM_6574 -> Patchwork_13788
====================================================

Summary
-------

  **SUCCESS**

  No regressions found.

  External URL: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13788/

Known issues
------------

  Here are the changes found in Patchwork_13788 that come from known issues:

### IGT changes ###

#### Issues hit ####

  * igt@kms_busy@basic-flip-a:
    - fi-kbl-7567u:       [PASS][1] -> [SKIP][2] ([fdo#109271] / [fdo#109278]) +2 similar issues
   [1]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6574/fi-kbl-7567u/igt@kms_busy@basic-flip-a.html
   [2]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13788/fi-kbl-7567u/igt@kms_busy@basic-flip-a.html

  * igt@kms_pipe_crc_basic@suspend-read-crc-pipe-a:
    - fi-blb-e6850:       [PASS][3] -> [INCOMPLETE][4] ([fdo#107718])
   [3]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6574/fi-blb-e6850/igt@kms_pipe_crc_basic@suspend-read-crc-pipe-a.html
   [4]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13788/fi-blb-e6850/igt@kms_pipe_crc_basic@suspend-read-crc-pipe-a.html

  
#### Possible fixes ####

  * igt@gem_basic@create-fd-close:
    - fi-icl-u3:          [DMESG-WARN][5] ([fdo#107724]) -> [PASS][6]
   [5]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6574/fi-icl-u3/igt@gem_basic@create-fd-close.html
   [6]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13788/fi-icl-u3/igt@gem_basic@create-fd-close.html

  * igt@gem_ctx_create@basic-files:
    - fi-icl-u3:          [INCOMPLETE][7] ([fdo#107713] / [fdo#109100]) -> [PASS][8]
   [7]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6574/fi-icl-u3/igt@gem_ctx_create@basic-files.html
   [8]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13788/fi-icl-u3/igt@gem_ctx_create@basic-files.html

  * igt@i915_selftest@live_execlists:
    - fi-skl-gvtdvm:      [DMESG-FAIL][9] ([fdo#111108]) -> [PASS][10]
   [9]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6574/fi-skl-gvtdvm/igt@i915_selftest@live_execlists.html
   [10]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13788/fi-skl-gvtdvm/igt@i915_selftest@live_execlists.html

  * igt@kms_chamelium@hdmi-hpd-fast:
    - fi-kbl-7500u:       [FAIL][11] ([fdo#109485]) -> [PASS][12]
   [11]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6574/fi-kbl-7500u/igt@kms_chamelium@hdmi-hpd-fast.html
   [12]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13788/fi-kbl-7500u/igt@kms_chamelium@hdmi-hpd-fast.html

  * igt@kms_frontbuffer_tracking@basic:
    - fi-icl-u2:          [FAIL][13] ([fdo#103167]) -> [PASS][14]
   [13]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6574/fi-icl-u2/igt@kms_frontbuffer_tracking@basic.html
   [14]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13788/fi-icl-u2/igt@kms_frontbuffer_tracking@basic.html

  
  [fdo#103167]: https://bugs.freedesktop.org/show_bug.cgi?id=103167
  [fdo#107713]: https://bugs.freedesktop.org/show_bug.cgi?id=107713
  [fdo#107718]: https://bugs.freedesktop.org/show_bug.cgi?id=107718
  [fdo#107724]: https://bugs.freedesktop.org/show_bug.cgi?id=107724
  [fdo#109100]: https://bugs.freedesktop.org/show_bug.cgi?id=109100
  [fdo#109271]: https://bugs.freedesktop.org/show_bug.cgi?id=109271
  [fdo#109278]: https://bugs.freedesktop.org/show_bug.cgi?id=109278
  [fdo#109485]: https://bugs.freedesktop.org/show_bug.cgi?id=109485
  [fdo#111108]: https://bugs.freedesktop.org/show_bug.cgi?id=111108


Participating hosts (51 -> 46)
------------------------------

  Additional (1): fi-pnv-d510 
  Missing    (6): fi-kbl-soraka fi-byt-squawks fi-bsw-cyan fi-icl-y fi-byt-clapper fi-bdw-samus 


Build changes
-------------

  * CI: CI-20190529 -> None
  * Linux: CI_DRM_6574 -> Patchwork_13788

  CI-20190529: 20190529
  CI_DRM_6574: 30577cdd3fc7a9385b5ef15940606d320d6e2c4a @ git://anongit.freedesktop.org/gfx-ci/linux
  IGT_5114: 54450721c3b323d01e0c6e5d5e883b7499c8022c @ git://anongit.freedesktop.org/xorg/app/intel-gpu-tools
  Patchwork_13788: ed6c2b2fb9ff1f2bde6b5282bdab40345588a11b @ git://anongit.freedesktop.org/gfx-ci/linux


== Linux commits ==

ed6c2b2fb9ff drm/i915/wopcm: Don't fail on WOPCM partitioning failure
f31d5a7c468e drm/i915/uc: Inject load errors into intel_uc_init_hw
cdf6826ebe4d drm/i915: Add i915 to i915_inject_probe_failure
130bf947b81d drm/i915: Report -ENODEV after injecting probe failure

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13788/
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 1/4] drm/i915: Report -ENODEV after injecting probe failure
  2019-07-29 15:22 ` [PATCH 1/4] drm/i915: Report -ENODEV after injecting probe failure Michal Wajdeczko
@ 2019-07-29 17:54   ` Daniele Ceraolo Spurio
  2019-07-29 19:45     ` Chris Wilson
  0 siblings, 1 reply; 15+ messages in thread
From: Daniele Ceraolo Spurio @ 2019-07-29 17:54 UTC (permalink / raw)
  To: Michal Wajdeczko, intel-gfx, Sundaresan, Sujaritha



On 7/29/19 8:22 AM, Michal Wajdeczko wrote:
> We want to do more driver testing using injected load errors,
> but we don't want to limit ourselves to use only -ENODEV (as
> other errors are interpreted as real issues, like this:
> 
> <4> [381.569479] i915: probe of 0000:00:02.0 failed with error -7
> 
> Suggested-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
> Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>

IIRC the final outcome of the discussion we had on this a while back was 
that CI would ignore this kind of error message. Anything changed? We 
already have places we return something different from -ENODEV (e.g. 
__fw_domain_init() in intel_uncore.c)

Daniele

> ---
>   drivers/gpu/drm/i915/i915_pci.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_pci.c b/drivers/gpu/drm/i915/i915_pci.c
> index bd9211b3d76e..332949c20a29 100644
> --- a/drivers/gpu/drm/i915/i915_pci.c
> +++ b/drivers/gpu/drm/i915/i915_pci.c
> @@ -956,7 +956,7 @@ static int i915_pci_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
>   
>   	err = i915_driver_probe(pdev, ent);
>   	if (err)
> -		return err;
> +		return i915_error_injected() ? -ENODEV : err;
>   
>   	if (i915_inject_probe_failure()) {
>   		i915_pci_remove(pdev);
> 
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 4/4] drm/i915/wopcm: Don't fail on WOPCM partitioning failure
  2019-07-29 15:23 ` [PATCH 4/4] drm/i915/wopcm: Don't fail on WOPCM partitioning failure Michal Wajdeczko
@ 2019-07-29 18:04   ` Daniele Ceraolo Spurio
  0 siblings, 0 replies; 15+ messages in thread
From: Daniele Ceraolo Spurio @ 2019-07-29 18:04 UTC (permalink / raw)
  To: Michal Wajdeczko, intel-gfx



On 7/29/19 8:23 AM, Michal Wajdeczko wrote:
> We shouldn't immediately fail on WOPCM partitioning or programming
> as we plan to restore fallback on any GuC related failures.
> 
> While around, add some more probe failure injections.
> 

I was planning to move the uC wopcm programming to intel_uc_init_hw() 
(where you now have intel_wopcm_is_ready) since we're actually not 
initializing any wopcm HW but uC HW related to how/where the uC accesses 
the wopcm. That would allow us to return the failure directly from the 
function without having to save state. Thoughts?

Daniele

> Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
> Cc: Chris Wilson <chris@chris-wilson.co.uk>
> ---
>   drivers/gpu/drm/i915/gt/uc/intel_uc.c |  3 ++
>   drivers/gpu/drm/i915/i915_gem.c       | 12 +----
>   drivers/gpu/drm/i915/intel_wopcm.c    | 66 +++++++++++++++------------
>   drivers/gpu/drm/i915/intel_wopcm.h    | 11 ++++-
>   4 files changed, 50 insertions(+), 42 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_uc.c b/drivers/gpu/drm/i915/gt/uc/intel_uc.c
> index 9e1156c29cb1..f096063189d4 100644
> --- a/drivers/gpu/drm/i915/gt/uc/intel_uc.c
> +++ b/drivers/gpu/drm/i915/gt/uc/intel_uc.c
> @@ -410,6 +410,9 @@ int intel_uc_init_hw(struct intel_uc *uc)
>   
>   	GEM_BUG_ON(!intel_uc_fw_supported(&guc->fw));
>   
> +	if (!intel_wopcm_is_ready(&i915->wopcm))
> +		return -ENXIO;
> +
>   	guc_reset_interrupts(guc);
>   
>   	/* WaEnableuKernelHeaderValidFix:skl */
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index c437ab5554ec..02e09864856f 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -1239,11 +1239,7 @@ int i915_gem_init_hw(struct drm_i915_private *i915)
>   		goto out;
>   	}
>   
> -	ret = intel_wopcm_init_hw(&i915->wopcm, gt);
> -	if (ret) {
> -		DRM_ERROR("Enabling WOPCM failed (%d)\n", ret);
> -		goto out;
> -	}
> +	intel_wopcm_init_hw(&i915->wopcm, gt);
>   
>   	/* We can't enable contexts until all firmware is loaded */
>   	ret = intel_uc_init_hw(&i915->gt.uc);
> @@ -1432,10 +1428,7 @@ int i915_gem_init(struct drm_i915_private *dev_priv)
>   		return ret;
>   
>   	intel_uc_fetch_firmwares(&dev_priv->gt.uc);
> -
> -	ret = intel_wopcm_init(&dev_priv->wopcm);
> -	if (ret)
> -		goto err_uc_fw;
> +	intel_wopcm_init(&dev_priv->wopcm);
>   
>   	/* This is just a security blanket to placate dragons.
>   	 * On some systems, we very sporadically observe that the first TLBs
> @@ -1559,7 +1552,6 @@ int i915_gem_init(struct drm_i915_private *dev_priv)
>   	intel_uncore_forcewake_put(&dev_priv->uncore, FORCEWAKE_ALL);
>   	mutex_unlock(&dev_priv->drm.struct_mutex);
>   
> -err_uc_fw:
>   	intel_uc_cleanup_firmwares(&dev_priv->gt.uc);
>   
>   	if (ret != -EIO) {
> diff --git a/drivers/gpu/drm/i915/intel_wopcm.c b/drivers/gpu/drm/i915/intel_wopcm.c
> index e173a8e61bfd..89b2ffef879a 100644
> --- a/drivers/gpu/drm/i915/intel_wopcm.c
> +++ b/drivers/gpu/drm/i915/intel_wopcm.c
> @@ -137,7 +137,11 @@ static inline int check_hw_restriction(struct drm_i915_private *i915,
>   				       u32 guc_wopcm_base, u32 guc_wopcm_size,
>   				       u32 huc_fw_size)
>   {
> -	int err = 0;
> +	int err;
> +
> +	err = i915_inject_load_error(i915, -E2BIG);
> +	if (err)
> +		return err;
>   
>   	if (IS_GEN(i915, 9))
>   		err = gen9_check_dword_gap(guc_wopcm_base, guc_wopcm_size);
> @@ -156,12 +160,10 @@ static inline int check_hw_restriction(struct drm_i915_private *i915,
>    * This function will partition WOPCM space based on GuC and HuC firmware sizes
>    * and will allocate max remaining for use by GuC. This function will also
>    * enforce platform dependent hardware restrictions on GuC WOPCM offset and
> - * size. It will fail the WOPCM init if any of these checks were failed, so that
> - * the following GuC firmware uploading would be aborted.
> - *
> - * Return: 0 on success, non-zero error code on failure.
> + * size. It will fail the WOPCM init if any of these checks fail, so that the
> + * following WOPCM registers setup and GuC firmware uploading would be aborted.
>    */
> -int intel_wopcm_init(struct intel_wopcm *wopcm)
> +void intel_wopcm_init(struct intel_wopcm *wopcm)
>   {
>   	struct drm_i915_private *i915 = wopcm_to_i915(wopcm);
>   	u32 guc_fw_size = intel_uc_fw_get_upload_size(&i915->gt.uc.guc.fw);
> @@ -173,23 +175,23 @@ int intel_wopcm_init(struct intel_wopcm *wopcm)
>   	int err;
>   
>   	if (!USES_GUC(i915))
> -		return 0;
> +		return;
>   
>   	GEM_BUG_ON(!wopcm->size);
>   
>   	if (i915_inject_probe_failure(i915))
> -		return -E2BIG;
> +		return;
>   
>   	if (guc_fw_size >= wopcm->size) {
>   		DRM_ERROR("GuC FW (%uKiB) is too big to fit in WOPCM.",
>   			  guc_fw_size / 1024);
> -		return -E2BIG;
> +		goto fail;
>   	}
>   
>   	if (huc_fw_size >= wopcm->size) {
>   		DRM_ERROR("HuC FW (%uKiB) is too big to fit in WOPCM.",
>   			  huc_fw_size / 1024);
> -		return -E2BIG;
> +		goto fail;
>   	}
>   
>   	guc_wopcm_base = ALIGN(huc_fw_size + WOPCM_RESERVED_SIZE,
> @@ -197,7 +199,7 @@ int intel_wopcm_init(struct intel_wopcm *wopcm)
>   	if ((guc_wopcm_base + ctx_rsvd) >= wopcm->size) {
>   		DRM_ERROR("GuC WOPCM base (%uKiB) is too big.\n",
>   			  guc_wopcm_base / 1024);
> -		return -E2BIG;
> +		goto fail;
>   	}
>   
>   	guc_wopcm_size = wopcm->size - guc_wopcm_base - ctx_rsvd;
> @@ -211,18 +213,19 @@ int intel_wopcm_init(struct intel_wopcm *wopcm)
>   		DRM_ERROR("Need %uKiB WOPCM for GuC, %uKiB available.\n",
>   			  (guc_fw_size + guc_wopcm_rsvd) / 1024,
>   			  guc_wopcm_size / 1024);
> -		return -E2BIG;
> +		goto fail;
>   	}
>   
>   	err = check_hw_restriction(i915, guc_wopcm_base, guc_wopcm_size,
>   				   huc_fw_size);
>   	if (err)
> -		return err;
> +		goto fail;
>   
>   	wopcm->guc.base = guc_wopcm_base;
>   	wopcm->guc.size = guc_wopcm_size;
> -
> -	return 0;
> +	return;
> +fail:
> +	I915_WARN(i915, "WOPCM partitioning failed, GuC will fail to load!\n");
>   }
>   
>   static int
> @@ -231,14 +234,25 @@ write_and_verify(struct intel_gt *gt,
>   {
>   	struct intel_uncore *uncore = gt->uncore;
>   	u32 reg_val;
> +	int err;
>   
>   	GEM_BUG_ON(val & ~mask);
>   
> +	err = i915_inject_load_error(gt->i915, -EIO);
> +	if (err)
> +		return err;
> +
>   	intel_uncore_write(uncore, reg, val);
>   
>   	reg_val = intel_uncore_read(uncore, reg);
>   
> -	return (reg_val & mask) != (val | locked_bit) ? -EIO : 0;
> +	if ((reg_val & mask) != (val | locked_bit)) {
> +		I915_WARN(gt->i915, "WOPCM register %#x=%#x\n",
> +			  i915_mmio_reg_offset(reg), reg_val);
> +		return -EIO;
> +	}
> +
> +	return 0;
>   }
>   
>   /**
> @@ -249,19 +263,16 @@ write_and_verify(struct intel_gt *gt,
>    * Setup the GuC WOPCM size and offset registers with the calculated values. It
>    * will verify the register values to make sure the registers are locked with
>    * correct values.
> - *
> - * Return: 0 on success. -EIO if registers were locked with incorrect values.
>    */
> -int intel_wopcm_init_hw(struct intel_wopcm *wopcm, struct intel_gt *gt)
> +void intel_wopcm_init_hw(struct intel_wopcm *wopcm, struct intel_gt *gt)
>   {
>   	struct drm_i915_private *i915 = wopcm_to_i915(wopcm);
> -	struct intel_uncore *uncore = gt->uncore;
>   	u32 huc_agent;
>   	u32 mask;
>   	int err;
>   
> -	if (!USES_GUC(i915))
> -		return 0;
> +	if (!intel_wopcm_guc_size(wopcm))
> +		return;
>   
>   	GEM_BUG_ON(!HAS_GT_UC(i915));
>   	GEM_BUG_ON(!wopcm->guc.size);
> @@ -281,14 +292,9 @@ int intel_wopcm_init_hw(struct intel_wopcm *wopcm, struct intel_gt *gt)
>   	if (err)
>   		goto err_out;
>   
> -	return 0;
> +	wopcm->ready = true;
> +	return;
>   
>   err_out:
> -	DRM_ERROR("Failed to init WOPCM registers:\n");
> -	DRM_ERROR("DMA_GUC_WOPCM_OFFSET=%#x\n",
> -		  intel_uncore_read(uncore, DMA_GUC_WOPCM_OFFSET));
> -	DRM_ERROR("GUC_WOPCM_SIZE=%#x\n",
> -		  intel_uncore_read(uncore, GUC_WOPCM_SIZE));
> -
> -	return err;
> +	I915_WARN(i915, "WOPCM programming failed, GuC will fail to load!\n");
>   }
> diff --git a/drivers/gpu/drm/i915/intel_wopcm.h b/drivers/gpu/drm/i915/intel_wopcm.h
> index 56aaed4d64ff..daf9c1dbe20b 100644
> --- a/drivers/gpu/drm/i915/intel_wopcm.h
> +++ b/drivers/gpu/drm/i915/intel_wopcm.h
> @@ -17,6 +17,7 @@ struct intel_gt;
>    * @guc: GuC WOPCM Region info.
>    * @guc.base: GuC WOPCM base which is offset from WOPCM base.
>    * @guc.size: Size of the GuC WOPCM region.
> + * @ready: indicates that WOPCM registers are correctly programmed.
>    */
>   struct intel_wopcm {
>   	u32 size;
> @@ -24,6 +25,7 @@ struct intel_wopcm {
>   		u32 base;
>   		u32 size;
>   	} guc;
> +	bool ready;
>   };
>   
>   /**
> @@ -42,7 +44,12 @@ static inline u32 intel_wopcm_guc_size(struct intel_wopcm *wopcm)
>   }
>   
>   void intel_wopcm_init_early(struct intel_wopcm *wopcm);
> -int intel_wopcm_init(struct intel_wopcm *wopcm);
> -int intel_wopcm_init_hw(struct intel_wopcm *wopcm, struct intel_gt *gt);
> +void intel_wopcm_init(struct intel_wopcm *wopcm);
> +void intel_wopcm_init_hw(struct intel_wopcm *wopcm, struct intel_gt *gt);
> +
> +static inline bool intel_wopcm_is_ready(struct intel_wopcm *wopcm)
> +{
> +	return wopcm->ready;
> +}
>   
>   #endif
> 
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 1/4] drm/i915: Report -ENODEV after injecting probe failure
  2019-07-29 17:54   ` Daniele Ceraolo Spurio
@ 2019-07-29 19:45     ` Chris Wilson
  0 siblings, 0 replies; 15+ messages in thread
From: Chris Wilson @ 2019-07-29 19:45 UTC (permalink / raw)
  To: Sundaresan, Sujaritha, Daniele Ceraolo Spurio, Michal Wajdeczko,
	intel-gfx

Quoting Daniele Ceraolo Spurio (2019-07-29 18:54:15)
> 
> 
> On 7/29/19 8:22 AM, Michal Wajdeczko wrote:
> > We want to do more driver testing using injected load errors,
> > but we don't want to limit ourselves to use only -ENODEV (as
> > other errors are interpreted as real issues, like this:
> > 
> > <4> [381.569479] i915: probe of 0000:00:02.0 failed with error -7
> > 
> > Suggested-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
> > Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
> > Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
> 
> IIRC the final outcome of the discussion we had on this a while back was 
> that CI would ignore this kind of error message. Anything changed? We 
> already have places we return something different from -ENODEV (e.g. 
> __fw_domain_init() in intel_uncore.c)

If it was only a note, it would be naturally ignored... We could simply
make it a notice and justify as the base really doesn't know how severe
the error is, just that the probe failed:

diff --git a/drivers/base/dd.c b/drivers/base/dd.c
index 994a90747420..877735f2dfd4 100644
--- a/drivers/base/dd.c
+++ b/drivers/base/dd.c
@@ -613,7 +613,7 @@ static int really_probe(struct device *dev, struct device_driver *drv)
                break;
        default:
                /* driver matched but the probe failed */
-               printk(KERN_WARNING
+               printk(KERN_NOTICE
                       "%s: probe of %s failed with error %d\n",
                       drv->name, dev_name(dev), ret);
        }

> 
> Daniele
> 
> > ---
> >   drivers/gpu/drm/i915/i915_pci.c | 2 +-
> >   1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/i915_pci.c b/drivers/gpu/drm/i915/i915_pci.c
> > index bd9211b3d76e..332949c20a29 100644
> > --- a/drivers/gpu/drm/i915/i915_pci.c
> > +++ b/drivers/gpu/drm/i915/i915_pci.c
> > @@ -956,7 +956,7 @@ static int i915_pci_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
> >   
> >       err = i915_driver_probe(pdev, ent);
> >       if (err)
> > -             return err;
> > +             return i915_error_injected() ? -ENODEV : err;
> >   
> >       if (i915_inject_probe_failure()) {
> >               i915_pci_remove(pdev);
> > 
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/intel-gfx
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* ✗ Fi.CI.BAT: failure for add more probe failures (rev2)
  2019-07-29 15:22 [PATCH 0/4] add more probe failures Michal Wajdeczko
                   ` (4 preceding siblings ...)
  2019-07-29 17:12 ` ✓ Fi.CI.BAT: success for add more probe failures Patchwork
@ 2019-07-29 19:56 ` Patchwork
  5 siblings, 0 replies; 15+ messages in thread
From: Patchwork @ 2019-07-29 19:56 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

== Series Details ==

Series: add more probe failures (rev2)
URL   : https://patchwork.freedesktop.org/series/64390/
State : failure

== Summary ==

Applying: drm/i915: Report -ENODEV after injecting probe failure
error: patch failed: drivers/base/dd.c:613
error: drivers/base/dd.c: patch does not apply
error: Did you hand edit your patch?
It does not apply to blobs recorded in its index.
hint: Use 'git am --show-current-patch' to see the failed patch
Using index info to reconstruct a base tree...
Patch failed at 0001 drm/i915: Report -ENODEV after injecting probe failure
When you have resolved this problem, run "git am --continue".
If you prefer to skip this patch, run "git am --skip" instead.
To restore the original branch and stop patching, run "git am --abort".

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 2/4] drm/i915: Add i915 to i915_inject_probe_failure
  2019-07-29 15:22 ` [PATCH 2/4] drm/i915: Add i915 to i915_inject_probe_failure Michal Wajdeczko
@ 2019-07-30  8:07   ` kbuild test robot
  0 siblings, 0 replies; 15+ messages in thread
From: kbuild test robot @ 2019-07-30  8:07 UTC (permalink / raw)
  To: Michal Wajdeczko; +Cc: intel-gfx, kbuild-all

[-- Attachment #1: Type: text/plain, Size: 12314 bytes --]

Hi Michal,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on drm-intel/for-linux-next]
[cannot apply to v5.3-rc2 next-20190730]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:    https://github.com/0day-ci/linux/commits/Michal-Wajdeczko/add-more-probe-failures/20190730-060559
base:   git://anongit.freedesktop.org/drm-intel for-linux-next
config: i386-defconfig (attached as .config)
compiler: gcc-7 (Debian 7.4.0-10) 7.4.0
reproduce:
        # save the attached .config to linux build tree
        make ARCH=i386 

If you fix the issue, kindly add following tag
Reported-by: kbuild test robot <lkp@intel.com>

All error/warnings (new ones prefixed by >>):

   drivers/gpu/drm/i915/i915_drv.c: In function 'i915_driver_modeset_probe':
>> drivers/gpu/drm/i915/i915_drv.c:691:40: error: macro "i915_inject_load_error" passed 2 arguments, but takes just 1
     if (i915_inject_probe_failure(dev_priv))
                                           ^
   In file included from drivers/gpu/drm/i915/display/intel_bw.h:11:0,
                    from drivers/gpu/drm/i915/i915_drv.c:52:
>> drivers/gpu/drm/i915/i915_drv.h:138:41: error: 'i915_inject_load_error' undeclared (first use in this function); did you mean 'i915_gpu_error'?
    #define i915_inject_probe_failure(i915) i915_inject_load_error((i915), -ENODEV)
                                            ^
>> drivers/gpu/drm/i915/i915_drv.c:691:6: note: in expansion of macro 'i915_inject_probe_failure'
     if (i915_inject_probe_failure(dev_priv))
         ^~~~~~~~~~~~~~~~~~~~~~~~~
   drivers/gpu/drm/i915/i915_drv.h:138:41: note: each undeclared identifier is reported only once for each function it appears in
    #define i915_inject_probe_failure(i915) i915_inject_load_error((i915), -ENODEV)
                                            ^
>> drivers/gpu/drm/i915/i915_drv.c:691:6: note: in expansion of macro 'i915_inject_probe_failure'
     if (i915_inject_probe_failure(dev_priv))
         ^~~~~~~~~~~~~~~~~~~~~~~~~
   drivers/gpu/drm/i915/i915_drv.c: In function 'i915_driver_early_probe':
   drivers/gpu/drm/i915/i915_drv.c:898:40: error: macro "i915_inject_load_error" passed 2 arguments, but takes just 1
     if (i915_inject_probe_failure(dev_priv))
                                           ^
   In file included from drivers/gpu/drm/i915/display/intel_bw.h:11:0,
                    from drivers/gpu/drm/i915/i915_drv.c:52:
>> drivers/gpu/drm/i915/i915_drv.h:138:41: error: 'i915_inject_load_error' undeclared (first use in this function); did you mean 'i915_gpu_error'?
    #define i915_inject_probe_failure(i915) i915_inject_load_error((i915), -ENODEV)
                                            ^
   drivers/gpu/drm/i915/i915_drv.c:898:6: note: in expansion of macro 'i915_inject_probe_failure'
     if (i915_inject_probe_failure(dev_priv))
         ^~~~~~~~~~~~~~~~~~~~~~~~~
   drivers/gpu/drm/i915/i915_drv.c: In function 'i915_driver_mmio_probe':
   drivers/gpu/drm/i915/i915_drv.c:989:40: error: macro "i915_inject_load_error" passed 2 arguments, but takes just 1
     if (i915_inject_probe_failure(dev_priv))
                                           ^
   In file included from drivers/gpu/drm/i915/display/intel_bw.h:11:0,
                    from drivers/gpu/drm/i915/i915_drv.c:52:
>> drivers/gpu/drm/i915/i915_drv.h:138:41: error: 'i915_inject_load_error' undeclared (first use in this function); did you mean 'i915_gpu_error'?
    #define i915_inject_probe_failure(i915) i915_inject_load_error((i915), -ENODEV)
                                            ^
   drivers/gpu/drm/i915/i915_drv.c:989:6: note: in expansion of macro 'i915_inject_probe_failure'
     if (i915_inject_probe_failure(dev_priv))
         ^~~~~~~~~~~~~~~~~~~~~~~~~
   drivers/gpu/drm/i915/i915_drv.c: In function 'i915_driver_hw_probe':
   drivers/gpu/drm/i915/i915_drv.c:1534:40: error: macro "i915_inject_load_error" passed 2 arguments, but takes just 1
     if (i915_inject_probe_failure(dev_priv))
                                           ^
   In file included from drivers/gpu/drm/i915/display/intel_bw.h:11:0,
                    from drivers/gpu/drm/i915/i915_drv.c:52:
>> drivers/gpu/drm/i915/i915_drv.h:138:41: error: 'i915_inject_load_error' undeclared (first use in this function); did you mean 'i915_gpu_error'?
    #define i915_inject_probe_failure(i915) i915_inject_load_error((i915), -ENODEV)
                                            ^
   drivers/gpu/drm/i915/i915_drv.c:1534:6: note: in expansion of macro 'i915_inject_probe_failure'
     if (i915_inject_probe_failure(dev_priv))
         ^~~~~~~~~~~~~~~~~~~~~~~~~
--
   drivers/gpu/drm/i915/i915_pci.c: In function 'i915_pci_probe':
>> drivers/gpu/drm/i915/i915_pci.c:961:62: error: macro "i915_inject_load_error" passed 2 arguments, but takes just 1
     if (i915_inject_probe_failure(to_i915(pci_get_drvdata(pdev)))) {
                                                                 ^
   In file included from drivers/gpu/drm/i915/i915_pci.c:33:0:
>> drivers/gpu/drm/i915/i915_drv.h:138:41: error: 'i915_inject_load_error' undeclared (first use in this function); did you mean 'i915_gpu_error'?
    #define i915_inject_probe_failure(i915) i915_inject_load_error((i915), -ENODEV)
                                            ^
>> drivers/gpu/drm/i915/i915_pci.c:961:6: note: in expansion of macro 'i915_inject_probe_failure'
     if (i915_inject_probe_failure(to_i915(pci_get_drvdata(pdev)))) {
         ^~~~~~~~~~~~~~~~~~~~~~~~~
   drivers/gpu/drm/i915/i915_drv.h:138:41: note: each undeclared identifier is reported only once for each function it appears in
    #define i915_inject_probe_failure(i915) i915_inject_load_error((i915), -ENODEV)
                                            ^
>> drivers/gpu/drm/i915/i915_pci.c:961:6: note: in expansion of macro 'i915_inject_probe_failure'
     if (i915_inject_probe_failure(to_i915(pci_get_drvdata(pdev)))) {
         ^~~~~~~~~~~~~~~~~~~~~~~~~
--
   drivers/gpu/drm/i915/gt/intel_engine_cs.c: In function 'intel_engines_init_mmio':
>> drivers/gpu/drm/i915/gt/intel_engine_cs.c:429:36: error: macro "i915_inject_load_error" passed 2 arguments, but takes just 1
     if (i915_inject_probe_failure(i915))
                                       ^
   In file included from drivers/gpu/drm/i915/gt/intel_engine_cs.c:29:0:
>> drivers/gpu/drm/i915/i915_drv.h:138:41: error: 'i915_inject_load_error' undeclared (first use in this function); did you mean 'i915_gpu_error'?
    #define i915_inject_probe_failure(i915) i915_inject_load_error((i915), -ENODEV)
                                            ^
>> drivers/gpu/drm/i915/gt/intel_engine_cs.c:429:6: note: in expansion of macro 'i915_inject_probe_failure'
     if (i915_inject_probe_failure(i915))
         ^~~~~~~~~~~~~~~~~~~~~~~~~
   drivers/gpu/drm/i915/i915_drv.h:138:41: note: each undeclared identifier is reported only once for each function it appears in
    #define i915_inject_probe_failure(i915) i915_inject_load_error((i915), -ENODEV)
                                            ^
>> drivers/gpu/drm/i915/gt/intel_engine_cs.c:429:6: note: in expansion of macro 'i915_inject_probe_failure'
     if (i915_inject_probe_failure(i915))
         ^~~~~~~~~~~~~~~~~~~~~~~~~
--
   drivers/gpu/drm/i915/i915_gem.c: In function 'i915_gem_init':
>> drivers/gpu/drm/i915/i915_gem.c:1514:48: error: macro "i915_inject_load_error" passed 2 arguments, but takes just 1
     ret = i915_inject_load_error(dev_priv, -ENODEV);
                                                   ^
>> drivers/gpu/drm/i915/i915_gem.c:1514:8: error: 'i915_inject_load_error' undeclared (first use in this function); did you mean 'i915_gpu_error'?
     ret = i915_inject_load_error(dev_priv, -ENODEV);
           ^~~~~~~~~~~~~~~~~~~~~~
           i915_gpu_error
   drivers/gpu/drm/i915/i915_gem.c:1514:8: note: each undeclared identifier is reported only once for each function it appears in
   drivers/gpu/drm/i915/i915_gem.c:1518:45: error: macro "i915_inject_load_error" passed 2 arguments, but takes just 1
     ret = i915_inject_load_error(dev_priv, -EIO);
                                                ^
--
   drivers/gpu/drm/i915/intel_uncore.c: In function '__fw_domain_init':
>> drivers/gpu/drm/i915/intel_uncore.c:1334:44: error: macro "i915_inject_load_error" passed 2 arguments, but takes just 1
     if (i915_inject_probe_failure(uncore->i915))
                                               ^
   In file included from drivers/gpu/drm/i915/intel_uncore.c:27:0:
>> drivers/gpu/drm/i915/i915_drv.h:138:41: error: 'i915_inject_load_error' undeclared (first use in this function); did you mean 'i915_gpu_error'?
    #define i915_inject_probe_failure(i915) i915_inject_load_error((i915), -ENODEV)
                                            ^
>> drivers/gpu/drm/i915/intel_uncore.c:1334:6: note: in expansion of macro 'i915_inject_probe_failure'
     if (i915_inject_probe_failure(uncore->i915))
         ^~~~~~~~~~~~~~~~~~~~~~~~~
   drivers/gpu/drm/i915/i915_drv.h:138:41: note: each undeclared identifier is reported only once for each function it appears in
    #define i915_inject_probe_failure(i915) i915_inject_load_error((i915), -ENODEV)
                                            ^
>> drivers/gpu/drm/i915/intel_uncore.c:1334:6: note: in expansion of macro 'i915_inject_probe_failure'
     if (i915_inject_probe_failure(uncore->i915))
         ^~~~~~~~~~~~~~~~~~~~~~~~~
--
   drivers/gpu/drm/i915/intel_wopcm.c: In function 'intel_wopcm_init':
>> drivers/gpu/drm/i915/intel_wopcm.c:180:36: error: macro "i915_inject_load_error" passed 2 arguments, but takes just 1
     if (i915_inject_probe_failure(i915))
                                       ^
   In file included from drivers/gpu/drm/i915/intel_wopcm.c:8:0:
>> drivers/gpu/drm/i915/i915_drv.h:138:41: error: 'i915_inject_load_error' undeclared (first use in this function); did you mean 'i915_gpu_error'?
    #define i915_inject_probe_failure(i915) i915_inject_load_error((i915), -ENODEV)
                                            ^
>> drivers/gpu/drm/i915/intel_wopcm.c:180:6: note: in expansion of macro 'i915_inject_probe_failure'
     if (i915_inject_probe_failure(i915))
         ^~~~~~~~~~~~~~~~~~~~~~~~~
   drivers/gpu/drm/i915/i915_drv.h:138:41: note: each undeclared identifier is reported only once for each function it appears in
    #define i915_inject_probe_failure(i915) i915_inject_load_error((i915), -ENODEV)
                                            ^
>> drivers/gpu/drm/i915/intel_wopcm.c:180:6: note: in expansion of macro 'i915_inject_probe_failure'
     if (i915_inject_probe_failure(i915))
         ^~~~~~~~~~~~~~~~~~~~~~~~~
--
   drivers/gpu/drm/i915/display/intel_connector.c: In function 'intel_connector_register':
>> drivers/gpu/drm/i915/display/intel_connector.c:121:55: error: macro "i915_inject_load_error" passed 2 arguments, but takes just 1
     if (i915_inject_probe_failure(to_i915(connector->dev))) {
                                                          ^
   In file included from drivers/gpu/drm/i915/display/intel_connector.c:34:0:
>> drivers/gpu/drm/i915/i915_drv.h:138:41: error: 'i915_inject_load_error' undeclared (first use in this function); did you mean 'i915_gpu_error'?
    #define i915_inject_probe_failure(i915) i915_inject_load_error((i915), -ENODEV)
                                            ^
>> drivers/gpu/drm/i915/display/intel_connector.c:121:6: note: in expansion of macro 'i915_inject_probe_failure'
     if (i915_inject_probe_failure(to_i915(connector->dev))) {
         ^~~~~~~~~~~~~~~~~~~~~~~~~
   drivers/gpu/drm/i915/i915_drv.h:138:41: note: each undeclared identifier is reported only once for each function it appears in
    #define i915_inject_probe_failure(i915) i915_inject_load_error((i915), -ENODEV)
                                            ^
>> drivers/gpu/drm/i915/display/intel_connector.c:121:6: note: in expansion of macro 'i915_inject_probe_failure'
     if (i915_inject_probe_failure(to_i915(connector->dev))) {
         ^~~~~~~~~~~~~~~~~~~~~~~~~

vim +/i915_inject_load_error +691 drivers/gpu/drm/i915/i915_drv.c

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 27884 bytes --]

[-- Attachment #3: Type: text/plain, Size: 159 bytes --]

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 3/4] drm/i915/uc: Inject load errors into intel_uc_init_hw
  2019-07-29 15:23 ` [PATCH 3/4] drm/i915/uc: Inject load errors into intel_uc_init_hw Michal Wajdeczko
@ 2019-07-30  8:47   ` Chris Wilson
  2019-07-30  9:27     ` Chris Wilson
  0 siblings, 1 reply; 15+ messages in thread
From: Chris Wilson @ 2019-07-30  8:47 UTC (permalink / raw)
  To: Michal Wajdeczko, intel-gfx

Quoting Michal Wajdeczko (2019-07-29 16:23:00)
> Inject load errors into intel_uc_init_hw to make sure we
> correctly handle uC initialization failures.
> 
> To avoid complains from CI about inserted errors or warnings,
> use helper macro that checks if there was an error injection.
> 
> Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
> Cc: Chris Wilson <chris@chris-wilson.co.uk>
> ---
>  drivers/gpu/drm/i915/gt/uc/intel_uc.c | 8 ++++++++
>  drivers/gpu/drm/i915/i915_drv.h       | 7 ++++++-
>  drivers/gpu/drm/i915/i915_gem.c       | 2 +-
>  3 files changed, 15 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_uc.c b/drivers/gpu/drm/i915/gt/uc/intel_uc.c
> index fafa9be1e12a..9e1156c29cb1 100644
> --- a/drivers/gpu/drm/i915/gt/uc/intel_uc.c
> +++ b/drivers/gpu/drm/i915/gt/uc/intel_uc.c
> @@ -400,6 +400,14 @@ int intel_uc_init_hw(struct intel_uc *uc)
>         if (!intel_uc_is_using_guc(uc))
>                 return 0;
>  
> +       ret = i915_inject_load_error(i915, -EIO);
> +       if (ret)
> +               return ret;
> +
> +       ret = i915_inject_load_error(i915, -ENXIO);
> +       if (ret)
> +               return ret;
> +
>         GEM_BUG_ON(!intel_uc_fw_supported(&guc->fw));
>  
>         guc_reset_interrupts(guc);
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index 6b059d51aaff..36f7a146f06a 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -137,9 +137,14 @@ bool i915_error_injected(void);
>  
>  #define i915_inject_probe_failure(i915) i915_inject_load_error((i915), -ENODEV)
>  
> -#define i915_probe_error(i915, fmt, ...)                                  \
> +#define I915_ERROR(i915, fmt, ...) \
>         __i915_printk(i915, i915_error_injected() ? KERN_DEBUG : KERN_ERR, \
>                       fmt, ##__VA_ARGS__)
> +#define I915_WARN(i915, fmt, ...) \
> +       __i915_printk(i915, i915_error_injected() ? KERN_DEBUG : KERN_WARNING, \
> +                     fmt, ##__VA_ARGS__)

I didn't see I915_WARN be used in this series. Is it likely? We either
abort the module load, in which it is an error, or we are quite happy to
continue in which case I'd vote for a "normal but significant condition"
i.e. KERN_NOTICE.
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 3/4] drm/i915/uc: Inject load errors into intel_uc_init_hw
  2019-07-30  8:47   ` Chris Wilson
@ 2019-07-30  9:27     ` Chris Wilson
  2019-07-30 10:56       ` Michal Wajdeczko
  0 siblings, 1 reply; 15+ messages in thread
From: Chris Wilson @ 2019-07-30  9:27 UTC (permalink / raw)
  To: Michal Wajdeczko, intel-gfx

Quoting Chris Wilson (2019-07-30 09:47:37)
> Quoting Michal Wajdeczko (2019-07-29 16:23:00)
> > Inject load errors into intel_uc_init_hw to make sure we
> > correctly handle uC initialization failures.
> > 
> > To avoid complains from CI about inserted errors or warnings,
> > use helper macro that checks if there was an error injection.
> > 
> > Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
> > Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
> > Cc: Chris Wilson <chris@chris-wilson.co.uk>
> > ---
> >  drivers/gpu/drm/i915/gt/uc/intel_uc.c | 8 ++++++++
> >  drivers/gpu/drm/i915/i915_drv.h       | 7 ++++++-
> >  drivers/gpu/drm/i915/i915_gem.c       | 2 +-
> >  3 files changed, 15 insertions(+), 2 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/gt/uc/intel_uc.c b/drivers/gpu/drm/i915/gt/uc/intel_uc.c
> > index fafa9be1e12a..9e1156c29cb1 100644
> > --- a/drivers/gpu/drm/i915/gt/uc/intel_uc.c
> > +++ b/drivers/gpu/drm/i915/gt/uc/intel_uc.c
> > @@ -400,6 +400,14 @@ int intel_uc_init_hw(struct intel_uc *uc)
> >         if (!intel_uc_is_using_guc(uc))
> >                 return 0;
> >  
> > +       ret = i915_inject_load_error(i915, -EIO);
> > +       if (ret)
> > +               return ret;
> > +
> > +       ret = i915_inject_load_error(i915, -ENXIO);
> > +       if (ret)
> > +               return ret;
> > +
> >         GEM_BUG_ON(!intel_uc_fw_supported(&guc->fw));
> >  
> >         guc_reset_interrupts(guc);
> > diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> > index 6b059d51aaff..36f7a146f06a 100644
> > --- a/drivers/gpu/drm/i915/i915_drv.h
> > +++ b/drivers/gpu/drm/i915/i915_drv.h
> > @@ -137,9 +137,14 @@ bool i915_error_injected(void);
> >  
> >  #define i915_inject_probe_failure(i915) i915_inject_load_error((i915), -ENODEV)
> >  
> > -#define i915_probe_error(i915, fmt, ...)                                  \
> > +#define I915_ERROR(i915, fmt, ...) \
> >         __i915_printk(i915, i915_error_injected() ? KERN_DEBUG : KERN_ERR, \
> >                       fmt, ##__VA_ARGS__)
> > +#define I915_WARN(i915, fmt, ...) \
> > +       __i915_printk(i915, i915_error_injected() ? KERN_DEBUG : KERN_WARNING, \
> > +                     fmt, ##__VA_ARGS__)
> 
> I didn't see I915_WARN be used in this series. Is it likely? We either
> abort the module load, in which it is an error, or we are quite happy to
> continue in which case I'd vote for a "normal but significant condition"
> i.e. KERN_NOTICE.

Spotted the I915_WARN, I guess that's reasonable.

However, I do dislike I915_ERROR / I915_WARN on principle. Especially as
they have the coupling to the probe-failure magic and I don't feel like
it is a good idea to have that spread too far.
-chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 3/4] drm/i915/uc: Inject load errors into intel_uc_init_hw
  2019-07-30  9:27     ` Chris Wilson
@ 2019-07-30 10:56       ` Michal Wajdeczko
  2019-07-30 11:09         ` Chris Wilson
  0 siblings, 1 reply; 15+ messages in thread
From: Michal Wajdeczko @ 2019-07-30 10:56 UTC (permalink / raw)
  To: intel-gfx, Chris Wilson

On Tue, 30 Jul 2019 11:27:51 +0200, Chris Wilson  
<chris@chris-wilson.co.uk> wrote:

> Quoting Chris Wilson (2019-07-30 09:47:37)
>> Quoting Michal Wajdeczko (2019-07-29 16:23:00)
>> > Inject load errors into intel_uc_init_hw to make sure we
>> > correctly handle uC initialization failures.
>> >
>> > To avoid complains from CI about inserted errors or warnings,
>> > use helper macro that checks if there was an error injection.
>> >
>> > Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
>> > Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
>> > Cc: Chris Wilson <chris@chris-wilson.co.uk>
>> > ---
>> >  drivers/gpu/drm/i915/gt/uc/intel_uc.c | 8 ++++++++
>> >  drivers/gpu/drm/i915/i915_drv.h       | 7 ++++++-
>> >  drivers/gpu/drm/i915/i915_gem.c       | 2 +-
>> >  3 files changed, 15 insertions(+), 2 deletions(-)
>> >
>> > diff --git a/drivers/gpu/drm/i915/gt/uc/intel_uc.c  
>> b/drivers/gpu/drm/i915/gt/uc/intel_uc.c
>> > index fafa9be1e12a..9e1156c29cb1 100644
>> > --- a/drivers/gpu/drm/i915/gt/uc/intel_uc.c
>> > +++ b/drivers/gpu/drm/i915/gt/uc/intel_uc.c
>> > @@ -400,6 +400,14 @@ int intel_uc_init_hw(struct intel_uc *uc)
>> >         if (!intel_uc_is_using_guc(uc))
>> >                 return 0;
>> >
>> > +       ret = i915_inject_load_error(i915, -EIO);
>> > +       if (ret)
>> > +               return ret;
>> > +
>> > +       ret = i915_inject_load_error(i915, -ENXIO);
>> > +       if (ret)
>> > +               return ret;
>> > +
>> >         GEM_BUG_ON(!intel_uc_fw_supported(&guc->fw));
>> >
>> >         guc_reset_interrupts(guc);
>> > diff --git a/drivers/gpu/drm/i915/i915_drv.h  
>> b/drivers/gpu/drm/i915/i915_drv.h
>> > index 6b059d51aaff..36f7a146f06a 100644
>> > --- a/drivers/gpu/drm/i915/i915_drv.h
>> > +++ b/drivers/gpu/drm/i915/i915_drv.h
>> > @@ -137,9 +137,14 @@ bool i915_error_injected(void);
>> >
>> >  #define i915_inject_probe_failure(i915)  
>> i915_inject_load_error((i915), -ENODEV)
>> >
>> > -#define i915_probe_error(i915, fmt,  
>> ...)                                  \
>> > +#define I915_ERROR(i915, fmt, ...) \
>> >         __i915_printk(i915, i915_error_injected() ? KERN_DEBUG :  
>> KERN_ERR, \
>> >                       fmt, ##__VA_ARGS__)
>> > +#define I915_WARN(i915, fmt, ...) \
>> > +       __i915_printk(i915, i915_error_injected() ? KERN_DEBUG :  
>> KERN_WARNING, \
>> > +                     fmt, ##__VA_ARGS__)
>>
>> I didn't see I915_WARN be used in this series. Is it likely? We either
>> abort the module load, in which it is an error, or we are quite happy to
>> continue in which case I'd vote for a "normal but significant condition"
>> i.e. KERN_NOTICE.
>
> Spotted the I915_WARN, I guess that's reasonable.
>
> However, I do dislike I915_ERROR / I915_WARN on principle. Especially as
> they have the coupling to the probe-failure magic and I don't feel like
> it is a good idea to have that spread too far.

But what else can we do in i915 alone to
a) have messages that show origin of the problem as soon as possible,
b) ignore these fake errors if caused by injected failures ?

otherwise we risk that all new injected errors will be immediately
reported as regressions and we stop at CI.BAT

There are other options but we need some buy-in from the CI team:

- we can use existing messages to temporary ignore errors between
   "injecting" and "failed"

i915 0000:00:02.0: Injecting failure %d at checkpoint %u [file:line]
i915 0000:00:02.0: Device initialization failed (%d)

- we can use existing messages to temporary ignore errors between
   "injecting" and new "completed"

i915 0000:00:02.0: Injecting failure %d at checkpoint %u [file:line]
i915 0000:00:02.0: Checkpoint %u failure recovery completed

btw, I'm also thinking about introducing 'recoverable' failures, which
could be used not only to make sure that we don't break the driver while
unwinding, but also to make sure that driver was finally able handle such
failures in expected way:

i915_inject_load_error(i915, -ENOMEM); // probe must fail
i915_inject_recoverable_error(i915, -EIO, WEDGED);  // probe ok but gpu  
wedged
i915_inject_recoverable_error(i915, -ENOEXEC, NORMAL); // probe must be ok
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 3/4] drm/i915/uc: Inject load errors into intel_uc_init_hw
  2019-07-30 10:56       ` Michal Wajdeczko
@ 2019-07-30 11:09         ` Chris Wilson
  0 siblings, 0 replies; 15+ messages in thread
From: Chris Wilson @ 2019-07-30 11:09 UTC (permalink / raw)
  To: Michal Wajdeczko, intel-gfx

Quoting Michal Wajdeczko (2019-07-30 11:56:11)
> On Tue, 30 Jul 2019 11:27:51 +0200, Chris Wilson  
> <chris@chris-wilson.co.uk> wrote:
> 
> > Quoting Chris Wilson (2019-07-30 09:47:37)
> >> Quoting Michal Wajdeczko (2019-07-29 16:23:00)
> >> > Inject load errors into intel_uc_init_hw to make sure we
> >> > correctly handle uC initialization failures.
> >> >
> >> > To avoid complains from CI about inserted errors or warnings,
> >> > use helper macro that checks if there was an error injection.
> >> >
> >> > Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
> >> > Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
> >> > Cc: Chris Wilson <chris@chris-wilson.co.uk>
> >> > ---
> >> >  drivers/gpu/drm/i915/gt/uc/intel_uc.c | 8 ++++++++
> >> >  drivers/gpu/drm/i915/i915_drv.h       | 7 ++++++-
> >> >  drivers/gpu/drm/i915/i915_gem.c       | 2 +-
> >> >  3 files changed, 15 insertions(+), 2 deletions(-)
> >> >
> >> > diff --git a/drivers/gpu/drm/i915/gt/uc/intel_uc.c  
> >> b/drivers/gpu/drm/i915/gt/uc/intel_uc.c
> >> > index fafa9be1e12a..9e1156c29cb1 100644
> >> > --- a/drivers/gpu/drm/i915/gt/uc/intel_uc.c
> >> > +++ b/drivers/gpu/drm/i915/gt/uc/intel_uc.c
> >> > @@ -400,6 +400,14 @@ int intel_uc_init_hw(struct intel_uc *uc)
> >> >         if (!intel_uc_is_using_guc(uc))
> >> >                 return 0;
> >> >
> >> > +       ret = i915_inject_load_error(i915, -EIO);
> >> > +       if (ret)
> >> > +               return ret;
> >> > +
> >> > +       ret = i915_inject_load_error(i915, -ENXIO);
> >> > +       if (ret)
> >> > +               return ret;
> >> > +
> >> >         GEM_BUG_ON(!intel_uc_fw_supported(&guc->fw));
> >> >
> >> >         guc_reset_interrupts(guc);
> >> > diff --git a/drivers/gpu/drm/i915/i915_drv.h  
> >> b/drivers/gpu/drm/i915/i915_drv.h
> >> > index 6b059d51aaff..36f7a146f06a 100644
> >> > --- a/drivers/gpu/drm/i915/i915_drv.h
> >> > +++ b/drivers/gpu/drm/i915/i915_drv.h
> >> > @@ -137,9 +137,14 @@ bool i915_error_injected(void);
> >> >
> >> >  #define i915_inject_probe_failure(i915)  
> >> i915_inject_load_error((i915), -ENODEV)
> >> >
> >> > -#define i915_probe_error(i915, fmt,  
> >> ...)                                  \
> >> > +#define I915_ERROR(i915, fmt, ...) \
> >> >         __i915_printk(i915, i915_error_injected() ? KERN_DEBUG :  
> >> KERN_ERR, \
> >> >                       fmt, ##__VA_ARGS__)
> >> > +#define I915_WARN(i915, fmt, ...) \
> >> > +       __i915_printk(i915, i915_error_injected() ? KERN_DEBUG :  
> >> KERN_WARNING, \
> >> > +                     fmt, ##__VA_ARGS__)
> >>
> >> I didn't see I915_WARN be used in this series. Is it likely? We either
> >> abort the module load, in which it is an error, or we are quite happy to
> >> continue in which case I'd vote for a "normal but significant condition"
> >> i.e. KERN_NOTICE.
> >
> > Spotted the I915_WARN, I guess that's reasonable.
> >
> > However, I do dislike I915_ERROR / I915_WARN on principle. Especially as
> > they have the coupling to the probe-failure magic and I don't feel like
> > it is a good idea to have that spread too far.
> 
> But what else can we do in i915 alone to
> a) have messages that show origin of the problem as soon as possible,

(Actually, is it not that the issue here is the opposite, it's errors
being reported later on after the injection.)

> b) ignore these fake errors if caused by injected failures ?

My issue is that I915_ERROR() is simply too generic (and too reminiscent
of DRM_ERROR), it invites use outside of the probe paths. If probe/PROBE
is kept in the name, it should help stop it being applied where it makes
no sense (and I fear could cause confusion if message levels did get
flipped).

> otherwise we risk that all new injected errors will be immediately
> reported as regressions and we stop at CI.BAT

What I thought was a very very simple solution was to teach that test to
simply ignore the expected errors. I'm still waiting on igt being
standalone, or at least being able to control the runner from within the
test.
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2019-07-30 11:09 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-07-29 15:22 [PATCH 0/4] add more probe failures Michal Wajdeczko
2019-07-29 15:22 ` [PATCH 1/4] drm/i915: Report -ENODEV after injecting probe failure Michal Wajdeczko
2019-07-29 17:54   ` Daniele Ceraolo Spurio
2019-07-29 19:45     ` Chris Wilson
2019-07-29 15:22 ` [PATCH 2/4] drm/i915: Add i915 to i915_inject_probe_failure Michal Wajdeczko
2019-07-30  8:07   ` kbuild test robot
2019-07-29 15:23 ` [PATCH 3/4] drm/i915/uc: Inject load errors into intel_uc_init_hw Michal Wajdeczko
2019-07-30  8:47   ` Chris Wilson
2019-07-30  9:27     ` Chris Wilson
2019-07-30 10:56       ` Michal Wajdeczko
2019-07-30 11:09         ` Chris Wilson
2019-07-29 15:23 ` [PATCH 4/4] drm/i915/wopcm: Don't fail on WOPCM partitioning failure Michal Wajdeczko
2019-07-29 18:04   ` Daniele Ceraolo Spurio
2019-07-29 17:12 ` ✓ Fi.CI.BAT: success for add more probe failures Patchwork
2019-07-29 19:56 ` ✗ Fi.CI.BAT: failure for add more probe failures (rev2) Patchwork

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.