dri-devel.lists.freedesktop.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2 0/4] Further multi-gt handling
@ 2022-09-15 23:26 Matt Roper
  2022-09-15 23:26 ` [PATCH v2 1/4] drm/i915/gt: Cleanup partial engine discovery failures Matt Roper
                   ` (3 more replies)
  0 siblings, 4 replies; 7+ messages in thread
From: Matt Roper @ 2022-09-15 23:26 UTC (permalink / raw)
  To: intel-gfx; +Cc: Janusz Krzysztofik, Daniele Ceraolo Spurio, dri-devel

Now that MTL is going to start providing two GTs, there are a few more
places in the driver that need to iterate over each GT instead of
operating directly on gt0.  Also some more deliberate cleanup is needed,
in cases where we fail GT/engine initialization after the first GT has
been fully setup.

v2:
 - Drop unnecessary helper function.  (Janusz)
 - Consolidate some adjacent GT loops (Daniele)

Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com>

Chris Wilson (1):
  drm/i915/gt: Cleanup partial engine discovery failures

Tvrtko Ursulin (3):
  drm/i915: Make GEM resume all engines
  drm/i915: Make GEM suspend all GTs
  drm/i915: Handle all GTs on driver (un)load paths

 drivers/gpu/drm/i915/gem/i915_gem_pm.c    | 33 ++++++++++++++---
 drivers/gpu/drm/i915/gt/intel_engine_cs.c |  7 +++-
 drivers/gpu/drm/i915/i915_driver.c        |  3 +-
 drivers/gpu/drm/i915/i915_gem.c           | 43 ++++++++++++++++-------
 4 files changed, 68 insertions(+), 18 deletions(-)

-- 
2.37.3


^ permalink raw reply	[flat|nested] 7+ messages in thread

* [PATCH v2 1/4] drm/i915/gt: Cleanup partial engine discovery failures
  2022-09-15 23:26 [PATCH v2 0/4] Further multi-gt handling Matt Roper
@ 2022-09-15 23:26 ` Matt Roper
  2022-09-16  8:03   ` Janusz Krzysztofik
  2022-09-15 23:26 ` [PATCH v2 2/4] drm/i915: Make GEM resume all engines Matt Roper
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 7+ messages in thread
From: Matt Roper @ 2022-09-15 23:26 UTC (permalink / raw)
  To: intel-gfx; +Cc: Janusz Krzysztofik, Chris Wilson, dri-devel, Chris Wilson

From: Chris Wilson <chris.p.wilson@intel.com>

If we abort driver initialisation in the middle of gt/engine discovery,
some engines will be fully setup and some not. Those incompletely setup
engines only have 'engine->release == NULL' and so will leak any of the
common objects allocated.

v2:
 - Drop the destroy_pinned_context() helper for now.  It's not really
   worth it with just a single callsite at the moment.  (Janusz)

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
---
 drivers/gpu/drm/i915/gt/intel_engine_cs.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
index 1f7188129cd1..2ddcad497fa3 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
@@ -1274,8 +1274,13 @@ int intel_engines_init(struct intel_gt *gt)
 			return err;
 
 		err = setup(engine);
-		if (err)
+		if (err) {
+			intel_engine_cleanup_common(engine);
 			return err;
+		}
+
+		/* The backend should now be responsible for cleanup */
+		GEM_BUG_ON(engine->release == NULL);
 
 		err = engine_init_common(engine);
 		if (err)
-- 
2.37.3


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [PATCH v2 2/4] drm/i915: Make GEM resume all engines
  2022-09-15 23:26 [PATCH v2 0/4] Further multi-gt handling Matt Roper
  2022-09-15 23:26 ` [PATCH v2 1/4] drm/i915/gt: Cleanup partial engine discovery failures Matt Roper
@ 2022-09-15 23:26 ` Matt Roper
  2022-09-15 23:26 ` [PATCH v2 3/4] drm/i915: Make GEM suspend all GTs Matt Roper
  2022-09-15 23:26 ` [PATCH v2 4/4] drm/i915: Handle all GTs on driver (un)load paths Matt Roper
  3 siblings, 0 replies; 7+ messages in thread
From: Matt Roper @ 2022-09-15 23:26 UTC (permalink / raw)
  To: intel-gfx; +Cc: Andi Shyti, Andi Shyti, dri-devel, Tvrtko Ursulin

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Walk all GTs from i915_gem_resume when resuming engines.

Cc: Andi Shyti <andi.shyti@intel.com>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
---
 drivers/gpu/drm/i915/gem/i915_gem_pm.c | 22 ++++++++++++++++++++--
 1 file changed, 20 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_pm.c b/drivers/gpu/drm/i915/gem/i915_gem_pm.c
index 3428f735e786..2c80cc8362b6 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_pm.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_pm.c
@@ -212,7 +212,8 @@ int i915_gem_freeze_late(struct drm_i915_private *i915)
 
 void i915_gem_resume(struct drm_i915_private *i915)
 {
-	int ret;
+	struct intel_gt *gt;
+	int ret, i, j;
 
 	GEM_TRACE("%s\n", dev_name(i915->drm.dev));
 
@@ -224,8 +225,25 @@ void i915_gem_resume(struct drm_i915_private *i915)
 	 * guarantee that the context image is complete. So let's just reset
 	 * it and start again.
 	 */
-	intel_gt_resume(to_gt(i915));
+	for_each_gt(gt, i915, i)
+		if (intel_gt_resume(gt))
+			goto err_wedged;
 
 	ret = lmem_restore(i915, I915_TTM_BACKUP_ALLOW_GPU);
 	GEM_WARN_ON(ret);
+
+	return;
+
+err_wedged:
+	for_each_gt(gt, i915, j) {
+		if (!intel_gt_is_wedged(gt)) {
+			dev_err(i915->drm.dev,
+				"Failed to re-initialize GPU[%u], declaring it wedged!\n",
+				j);
+			intel_gt_set_wedged(gt);
+		}
+
+		if (j == i)
+			break;
+	}
 }
-- 
2.37.3


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [PATCH v2 3/4] drm/i915: Make GEM suspend all GTs
  2022-09-15 23:26 [PATCH v2 0/4] Further multi-gt handling Matt Roper
  2022-09-15 23:26 ` [PATCH v2 1/4] drm/i915/gt: Cleanup partial engine discovery failures Matt Roper
  2022-09-15 23:26 ` [PATCH v2 2/4] drm/i915: Make GEM resume all engines Matt Roper
@ 2022-09-15 23:26 ` Matt Roper
  2022-09-15 23:26 ` [PATCH v2 4/4] drm/i915: Handle all GTs on driver (un)load paths Matt Roper
  3 siblings, 0 replies; 7+ messages in thread
From: Matt Roper @ 2022-09-15 23:26 UTC (permalink / raw)
  To: intel-gfx; +Cc: Andi Shyti, dri-devel, Tvrtko Ursulin

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Walk all GTs when suspending.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
---
 drivers/gpu/drm/i915/gem/i915_gem_pm.c | 11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_pm.c b/drivers/gpu/drm/i915/gem/i915_gem_pm.c
index 2c80cc8362b6..e5bfb6be9f7a 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_pm.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_pm.c
@@ -22,6 +22,9 @@
 
 void i915_gem_suspend(struct drm_i915_private *i915)
 {
+	struct intel_gt *gt;
+	unsigned int i;
+
 	GEM_TRACE("%s\n", dev_name(i915->drm.dev));
 
 	intel_wakeref_auto(&to_gt(i915)->userfault_wakeref, 0);
@@ -36,7 +39,8 @@ void i915_gem_suspend(struct drm_i915_private *i915)
 	 * state. Fortunately, the kernel_context is disposable and we do
 	 * not rely on its state.
 	 */
-	intel_gt_suspend_prepare(to_gt(i915));
+	for_each_gt(gt, i915, i)
+		intel_gt_suspend_prepare(gt);
 
 	i915_gem_drain_freed_objects(i915);
 }
@@ -131,7 +135,9 @@ void i915_gem_suspend_late(struct drm_i915_private *i915)
 		&i915->mm.purge_list,
 		NULL
 	}, **phase;
+	struct intel_gt *gt;
 	unsigned long flags;
+	unsigned int i;
 	bool flush = false;
 
 	/*
@@ -154,7 +160,8 @@ void i915_gem_suspend_late(struct drm_i915_private *i915)
 	 * machine in an unusable condition.
 	 */
 
-	intel_gt_suspend_late(to_gt(i915));
+	for_each_gt(gt, i915, i)
+		intel_gt_suspend_late(gt);
 
 	spin_lock_irqsave(&i915->mm.obj_lock, flags);
 	for (phase = phases; *phase; phase++) {
-- 
2.37.3


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [PATCH v2 4/4] drm/i915: Handle all GTs on driver (un)load paths
  2022-09-15 23:26 [PATCH v2 0/4] Further multi-gt handling Matt Roper
                   ` (2 preceding siblings ...)
  2022-09-15 23:26 ` [PATCH v2 3/4] drm/i915: Make GEM suspend all GTs Matt Roper
@ 2022-09-15 23:26 ` Matt Roper
  2022-09-16  8:19   ` [Intel-gfx] " Andi Shyti
  3 siblings, 1 reply; 7+ messages in thread
From: Matt Roper @ 2022-09-15 23:26 UTC (permalink / raw)
  To: intel-gfx; +Cc: Daniele Ceraolo Spurio, dri-devel, Tvrtko Ursulin

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

This, along with the changes already landed in commit 1c66a12ab431
("drm/i915: Handle each GT on init/release and suspend/resume") makes
engines from all GTs actually known to the driver.

To accomplish this we need to sprinkle a lot of for_each_gt calls around
but is otherwise pretty un-eventuful.

v2:
 - Consolidate adjacent GT loops in a couple places.  (Daniele)

Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
---
 drivers/gpu/drm/i915/i915_driver.c |  3 ++-
 drivers/gpu/drm/i915/i915_gem.c    | 43 +++++++++++++++++++++---------
 2 files changed, 33 insertions(+), 13 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_driver.c b/drivers/gpu/drm/i915/i915_driver.c
index c459eb362c47..9d1fc2477f80 100644
--- a/drivers/gpu/drm/i915/i915_driver.c
+++ b/drivers/gpu/drm/i915/i915_driver.c
@@ -1661,7 +1661,8 @@ static int intel_runtime_suspend(struct device *kdev)
 
 		intel_runtime_pm_enable_interrupts(dev_priv);
 
-		intel_gt_runtime_resume(to_gt(dev_priv));
+		for_each_gt(gt, dev_priv, i)
+			intel_gt_runtime_resume(gt);
 
 		enable_rpm_wakeref_asserts(rpm);
 
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index f18cc6270b2b..e936e67f3411 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -1128,6 +1128,8 @@ void i915_gem_drain_workqueue(struct drm_i915_private *i915)
 
 int i915_gem_init(struct drm_i915_private *dev_priv)
 {
+	struct intel_gt *gt;
+	unsigned int i;
 	int ret;
 
 	/* We need to fallback to 4K pages if host doesn't support huge gtt. */
@@ -1158,9 +1160,11 @@ int i915_gem_init(struct drm_i915_private *dev_priv)
 	 */
 	intel_init_clock_gating(dev_priv);
 
-	ret = intel_gt_init(to_gt(dev_priv));
-	if (ret)
-		goto err_unlock;
+	for_each_gt(gt, dev_priv, i) {
+		ret = intel_gt_init(gt);
+		if (ret)
+			goto err_unlock;
+	}
 
 	return 0;
 
@@ -1173,8 +1177,13 @@ int i915_gem_init(struct drm_i915_private *dev_priv)
 err_unlock:
 	i915_gem_drain_workqueue(dev_priv);
 
-	if (ret != -EIO)
-		intel_uc_cleanup_firmwares(&to_gt(dev_priv)->uc);
+	if (ret != -EIO) {
+		for_each_gt(gt, dev_priv, i) {
+			intel_gt_driver_remove(gt);
+			intel_gt_driver_release(gt);
+			intel_uc_cleanup_firmwares(&gt->uc);
+		}
+	}
 
 	if (ret == -EIO) {
 		/*
@@ -1182,10 +1191,12 @@ int i915_gem_init(struct drm_i915_private *dev_priv)
 		 * as wedged. But we only want to do this when the GPU is angry,
 		 * for all other failure, such as an allocation failure, bail.
 		 */
-		if (!intel_gt_is_wedged(to_gt(dev_priv))) {
-			i915_probe_error(dev_priv,
-					 "Failed to initialize GPU, declaring it wedged!\n");
-			intel_gt_set_wedged(to_gt(dev_priv));
+		for_each_gt(gt, dev_priv, i) {
+			if (!intel_gt_is_wedged(gt)) {
+				i915_probe_error(dev_priv,
+						"Failed to initialize GPU, declaring it wedged!\n");
+				intel_gt_set_wedged(gt);
+			}
 		}
 
 		/* Minimal basic recovery for KMS */
@@ -1213,10 +1224,14 @@ void i915_gem_driver_unregister(struct drm_i915_private *i915)
 
 void i915_gem_driver_remove(struct drm_i915_private *dev_priv)
 {
+	struct intel_gt *gt;
+	unsigned int i;
+
 	intel_wakeref_auto_fini(&to_gt(dev_priv)->userfault_wakeref);
 
 	i915_gem_suspend_late(dev_priv);
-	intel_gt_driver_remove(to_gt(dev_priv));
+	for_each_gt(gt, dev_priv, i)
+		intel_gt_driver_remove(gt);
 	dev_priv->uabi_engines = RB_ROOT;
 
 	/* Flush any outstanding unpin_work. */
@@ -1227,9 +1242,13 @@ void i915_gem_driver_remove(struct drm_i915_private *dev_priv)
 
 void i915_gem_driver_release(struct drm_i915_private *dev_priv)
 {
-	intel_gt_driver_release(to_gt(dev_priv));
+	struct intel_gt *gt;
+	unsigned int i;
 
-	intel_uc_cleanup_firmwares(&to_gt(dev_priv)->uc);
+	for_each_gt(gt, dev_priv, i) {
+		intel_gt_driver_release(gt);
+		intel_uc_cleanup_firmwares(&gt->uc);
+	}
 
 	i915_gem_drain_freed_objects(dev_priv);
 
-- 
2.37.3


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH v2 1/4] drm/i915/gt: Cleanup partial engine discovery failures
  2022-09-15 23:26 ` [PATCH v2 1/4] drm/i915/gt: Cleanup partial engine discovery failures Matt Roper
@ 2022-09-16  8:03   ` Janusz Krzysztofik
  0 siblings, 0 replies; 7+ messages in thread
From: Janusz Krzysztofik @ 2022-09-16  8:03 UTC (permalink / raw)
  To: intel-gfx, Matt Roper
  Cc: Janusz Krzysztofik, Chris Wilson, dri-devel, Chris Wilson

On Friday, 16 September 2022 01:26:51 CEST Matt Roper wrote:
> From: Chris Wilson <chris.p.wilson@intel.com>
> 
> If we abort driver initialisation in the middle of gt/engine discovery,
> some engines will be fully setup and some not. Those incompletely setup
> engines only have 'engine->release == NULL' and so will leak any of the
> common objects allocated.
> 
> v2:
>  - Drop the destroy_pinned_context() helper for now.  It's not really
>    worth it with just a single callsite at the moment.  (Janusz)
> 
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com>
> Signed-off-by: Matt Roper <matthew.d.roper@intel.com>

Reviewed-by: Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com>

> ---
>  drivers/gpu/drm/i915/gt/intel_engine_cs.c | 7 ++++++-
>  1 file changed, 6 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> index 1f7188129cd1..2ddcad497fa3 100644
> --- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> +++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> @@ -1274,8 +1274,13 @@ int intel_engines_init(struct intel_gt *gt)
>  			return err;
>  
>  		err = setup(engine);
> -		if (err)
> +		if (err) {
> +			intel_engine_cleanup_common(engine);
>  			return err;
> +		}
> +
> +		/* The backend should now be responsible for cleanup */
> +		GEM_BUG_ON(engine->release == NULL);
>  
>  		err = engine_init_common(engine);
>  		if (err)
> 





^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [Intel-gfx] [PATCH v2 4/4] drm/i915: Handle all GTs on driver (un)load paths
  2022-09-15 23:26 ` [PATCH v2 4/4] drm/i915: Handle all GTs on driver (un)load paths Matt Roper
@ 2022-09-16  8:19   ` Andi Shyti
  0 siblings, 0 replies; 7+ messages in thread
From: Andi Shyti @ 2022-09-16  8:19 UTC (permalink / raw)
  To: Matt Roper; +Cc: intel-gfx, dri-devel

Hi Matt,

On Thu, Sep 15, 2022 at 04:26:54PM -0700, Matt Roper wrote:
> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> 
> This, along with the changes already landed in commit 1c66a12ab431
> ("drm/i915: Handle each GT on init/release and suspend/resume") makes
> engines from all GTs actually known to the driver.
> 
> To accomplish this we need to sprinkle a lot of for_each_gt calls around
> but is otherwise pretty un-eventuful.
> 
> v2:
>  - Consolidate adjacent GT loops in a couple places.  (Daniele)
> 
> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> Signed-off-by: Matt Roper <matthew.d.roper@intel.com>

Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>

Andi

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2022-09-16  8:19 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-09-15 23:26 [PATCH v2 0/4] Further multi-gt handling Matt Roper
2022-09-15 23:26 ` [PATCH v2 1/4] drm/i915/gt: Cleanup partial engine discovery failures Matt Roper
2022-09-16  8:03   ` Janusz Krzysztofik
2022-09-15 23:26 ` [PATCH v2 2/4] drm/i915: Make GEM resume all engines Matt Roper
2022-09-15 23:26 ` [PATCH v2 3/4] drm/i915: Make GEM suspend all GTs Matt Roper
2022-09-15 23:26 ` [PATCH v2 4/4] drm/i915: Handle all GTs on driver (un)load paths Matt Roper
2022-09-16  8:19   ` [Intel-gfx] " Andi Shyti

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).