All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v4 0/3] Better handle memory pressure at suspend
@ 2023-10-04 17:18 Mario Limonciello
  2023-10-04 17:18 ` [PATCH v4 1/3] drm/amd: Evict resources during PM ops prepare() callback Mario Limonciello
                   ` (2 more replies)
  0 siblings, 3 replies; 10+ messages in thread
From: Mario Limonciello @ 2023-10-04 17:18 UTC (permalink / raw)
  To: amd-gfx; +Cc: Harry.Wentland, Mario Limonciello

At suspend time if there is memory pressure then dynamically allocating
memory will cause failures that don't clean up properly when trying
suspend a second time.

Move the bigger memory allocations into Linux PM prepare() callback and
drop allocations that aren't really needed in DC code.

v1: https://lore.kernel.org/amd-gfx/20230925143359.14932-1-mario.limonciello@amd.com/
v2: https://lore.kernel.org/amd-gfx/20231002224449.95565-1-mario.limonciello@amd.com/T/#mc800319a05df821cd1875234b09bf212e2e3282b
v3: https://lore.kernel.org/amd-gfx/20231003205437.123426-1-mario.limonciello@amd.com/T/#m00a49b75cd2638bf8a0ebd549d6a6010bfb7328b

v3->v4:
 * Combine patches 1/2
 * Drop adev->in_suspend references
v2->v3:
 * Handle adev->in_suspend in prepare() and complete()
 * Add missing scratch variable in dc_resource_state_destruct()
 * Revert error code propagation in same series
v1->v2:
 * Handle DC code too
 * Add prepare callback rather than moving symbol calls
Mario Limonciello (3):
  drm/amd: Evict resources during PM ops prepare() callback
  drm/amd/display: Destroy DC context while keeping DML
  drm/amd/display: make dc_set_power_state() return type `void` again

 drivers/gpu/drm/amd/amdgpu/amdgpu.h           |  1 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c    | 26 +++++++++++++++-
 drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c       |  7 +++--
 .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 17 +++-------
 drivers/gpu/drm/amd/display/dc/core/dc.c      | 31 ++-----------------
 .../gpu/drm/amd/display/dc/core/dc_resource.c | 12 +++++++
 drivers/gpu/drm/amd/display/dc/dc.h           |  2 +-
 7 files changed, 50 insertions(+), 46 deletions(-)

-- 
2.34.1


^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH v4 1/3] drm/amd: Evict resources during PM ops prepare() callback
  2023-10-04 17:18 [PATCH v4 0/3] Better handle memory pressure at suspend Mario Limonciello
@ 2023-10-04 17:18 ` Mario Limonciello
  2023-10-05 14:13   ` Alex Deucher
       [not found]   ` <12e15ad4-cfa5-4ce9-89a4-f8d1718cb256@gmail.com>
  2023-10-04 17:18 ` [PATCH v4 2/3] drm/amd/display: Destroy DC context while keeping DML Mario Limonciello
  2023-10-04 17:18 ` [PATCH v4 3/3] drm/amd/display: make dc_set_power_state() return type `void` again Mario Limonciello
  2 siblings, 2 replies; 10+ messages in thread
From: Mario Limonciello @ 2023-10-04 17:18 UTC (permalink / raw)
  To: amd-gfx; +Cc: Harry.Wentland, Mario Limonciello

Linux PM core has a prepare() callback run before suspend.

If the system is under high memory pressure, the resources may need
to be evicted into swap instead.  If the storage backing for swap
is offlined during the suspend() step then such a call may fail.

So duplicate this step into prepare() to move evict majority of
resources while leaving all existing steps that put the GPU into a
low power state in suspend().

Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2362
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu.h        |  1 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 26 +++++++++++++++++++++-
 drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c    |  7 +++---
 3 files changed, 30 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
index d23fb4b5ad95..6643d0ed6b1b 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
@@ -1413,6 +1413,7 @@ void amdgpu_driver_postclose_kms(struct drm_device *dev,
 void amdgpu_driver_release_kms(struct drm_device *dev);
 
 int amdgpu_device_ip_suspend(struct amdgpu_device *adev);
+int amdgpu_device_prepare(struct drm_device *dev);
 int amdgpu_device_suspend(struct drm_device *dev, bool fbcon);
 int amdgpu_device_resume(struct drm_device *dev, bool fbcon);
 u32 amdgpu_get_vblank_counter_kms(struct drm_crtc *crtc);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index bad2b5577e96..67acee569c08 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -4259,6 +4259,31 @@ static int amdgpu_device_evict_resources(struct amdgpu_device *adev)
 /*
  * Suspend & resume.
  */
+/**
+ * amdgpu_device_prepare - prepare for device suspend
+ *
+ * @dev: drm dev pointer
+ *
+ * Prepare to put the hw in the suspend state (all asics).
+ * Returns 0 for success or an error on failure.
+ * Called at driver suspend.
+ */
+int amdgpu_device_prepare(struct drm_device *dev)
+{
+	struct amdgpu_device *adev = drm_to_adev(dev);
+	int r;
+
+	if (dev->switch_power_state == DRM_SWITCH_POWER_OFF)
+		return 0;
+
+	/* Evict the majority of BOs before starting suspend sequence */
+	r = amdgpu_device_evict_resources(adev);
+	if (r)
+		return r;
+
+	return 0;
+}
+
 /**
  * amdgpu_device_suspend - initiate device suspend
  *
@@ -4279,7 +4304,6 @@ int amdgpu_device_suspend(struct drm_device *dev, bool fbcon)
 
 	adev->in_suspend = true;
 
-	/* Evict the majority of BOs before grabbing the full access */
 	r = amdgpu_device_evict_resources(adev);
 	if (r)
 		return r;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
index e3471293846f..175167582db0 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
@@ -2425,8 +2425,9 @@ static int amdgpu_pmops_prepare(struct device *dev)
 	/* Return a positive number here so
 	 * DPM_FLAG_SMART_SUSPEND works properly
 	 */
-	if (amdgpu_device_supports_boco(drm_dev))
-		return pm_runtime_suspended(dev);
+	if (amdgpu_device_supports_boco(drm_dev) &&
+	    pm_runtime_suspended(dev))
+		return 1;
 
 	/* if we will not support s3 or s2i for the device
 	 *  then skip suspend
@@ -2435,7 +2436,7 @@ static int amdgpu_pmops_prepare(struct device *dev)
 	    !amdgpu_acpi_is_s3_active(adev))
 		return 1;
 
-	return 0;
+	return amdgpu_device_prepare(drm_dev);
 }
 
 static void amdgpu_pmops_complete(struct device *dev)
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH v4 2/3] drm/amd/display: Destroy DC context while keeping DML
  2023-10-04 17:18 [PATCH v4 0/3] Better handle memory pressure at suspend Mario Limonciello
  2023-10-04 17:18 ` [PATCH v4 1/3] drm/amd: Evict resources during PM ops prepare() callback Mario Limonciello
@ 2023-10-04 17:18 ` Mario Limonciello
  2023-10-05 14:27   ` Alex Deucher
  2023-10-04 17:18 ` [PATCH v4 3/3] drm/amd/display: make dc_set_power_state() return type `void` again Mario Limonciello
  2 siblings, 1 reply; 10+ messages in thread
From: Mario Limonciello @ 2023-10-04 17:18 UTC (permalink / raw)
  To: amd-gfx; +Cc: Harry.Wentland, Mario Limonciello

If there is memory pressure at suspend time then dynamically
allocating a large structure as part of DC suspend code will
fail.

Instead re-use the same structure and clear all members except
those that should be maintained.

Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2362
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
---
 drivers/gpu/drm/amd/display/dc/core/dc.c      | 25 -------------------
 .../gpu/drm/amd/display/dc/core/dc_resource.c | 12 +++++++++
 2 files changed, 12 insertions(+), 25 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/core/dc.c b/drivers/gpu/drm/amd/display/dc/core/dc.c
index 39e291a467e2..cb8c7c5a8807 100644
--- a/drivers/gpu/drm/amd/display/dc/core/dc.c
+++ b/drivers/gpu/drm/amd/display/dc/core/dc.c
@@ -4728,9 +4728,6 @@ bool dc_set_power_state(
 	struct dc *dc,
 	enum dc_acpi_cm_power_state power_state)
 {
-	struct kref refcount;
-	struct display_mode_lib *dml;
-
 	if (!dc->current_state)
 		return true;
 
@@ -4750,30 +4747,8 @@ bool dc_set_power_state(
 		break;
 	default:
 		ASSERT(dc->current_state->stream_count == 0);
-		/* Zero out the current context so that on resume we start with
-		 * clean state, and dc hw programming optimizations will not
-		 * cause any trouble.
-		 */
-		dml = kzalloc(sizeof(struct display_mode_lib),
-				GFP_KERNEL);
-
-		ASSERT(dml);
-		if (!dml)
-			return false;
-
-		/* Preserve refcount */
-		refcount = dc->current_state->refcount;
-		/* Preserve display mode lib */
-		memcpy(dml, &dc->current_state->bw_ctx.dml, sizeof(struct display_mode_lib));
 
 		dc_resource_state_destruct(dc->current_state);
-		memset(dc->current_state, 0,
-				sizeof(*dc->current_state));
-
-		dc->current_state->refcount = refcount;
-		dc->current_state->bw_ctx.dml = *dml;
-
-		kfree(dml);
 
 		break;
 	}
diff --git a/drivers/gpu/drm/amd/display/dc/core/dc_resource.c b/drivers/gpu/drm/amd/display/dc/core/dc_resource.c
index aa7b5db83644..e487c966c118 100644
--- a/drivers/gpu/drm/amd/display/dc/core/dc_resource.c
+++ b/drivers/gpu/drm/amd/display/dc/core/dc_resource.c
@@ -4350,6 +4350,18 @@ void dc_resource_state_destruct(struct dc_state *context)
 		context->streams[i] = NULL;
 	}
 	context->stream_count = 0;
+	context->stream_mask = 0;
+	memset(&context->res_ctx, 0, sizeof(context->res_ctx));
+	memset(&context->pp_display_cfg, 0, sizeof(context->pp_display_cfg));
+	memset(&context->dcn_bw_vars, 0, sizeof(context->dcn_bw_vars));
+	context->clk_mgr = NULL;
+	memset(&context->bw_ctx.bw, 0, sizeof(context->bw_ctx.bw));
+	memset(context->block_sequence, 0, sizeof(context->block_sequence));
+	context->block_sequence_steps = 0;
+	memset(context->dc_dmub_cmd, 0, sizeof(context->dc_dmub_cmd));
+	context->dmub_cmd_count = 0;
+	memset(&context->perf_params, 0, sizeof(context->perf_params));
+	memset(&context->scratch, 0, sizeof(context->scratch));
 }
 
 void dc_resource_state_copy_construct(
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH v4 3/3] drm/amd/display: make dc_set_power_state() return type `void` again
  2023-10-04 17:18 [PATCH v4 0/3] Better handle memory pressure at suspend Mario Limonciello
  2023-10-04 17:18 ` [PATCH v4 1/3] drm/amd: Evict resources during PM ops prepare() callback Mario Limonciello
  2023-10-04 17:18 ` [PATCH v4 2/3] drm/amd/display: Destroy DC context while keeping DML Mario Limonciello
@ 2023-10-04 17:18 ` Mario Limonciello
  2023-10-05 14:28   ` Alex Deucher
  2 siblings, 1 reply; 10+ messages in thread
From: Mario Limonciello @ 2023-10-04 17:18 UTC (permalink / raw)
  To: amd-gfx; +Cc: Harry.Wentland, Mario Limonciello

As dc_set_power_state() no longer allocates memory, it's not necessary
to have return types and check return code as it can't fail anymore.

Change it back to `void`.

Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
---
 .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c   | 17 +++++------------
 drivers/gpu/drm/amd/display/dc/core/dc.c        |  6 ++----
 drivers/gpu/drm/amd/display/dc/dc.h             |  2 +-
 3 files changed, 8 insertions(+), 17 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index a59a11ae42db..df9d9437f149 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -2685,11 +2685,6 @@ static void hpd_rx_irq_work_suspend(struct amdgpu_display_manager *dm)
 	}
 }
 
-static int dm_set_power_state(struct dc *dc, enum dc_acpi_cm_power_state power_state)
-{
-	return dc_set_power_state(dc, power_state) ? 0 : -ENOMEM;
-}
-
 static int dm_suspend(void *handle)
 {
 	struct amdgpu_device *adev = handle;
@@ -2723,7 +2718,9 @@ static int dm_suspend(void *handle)
 
 	hpd_rx_irq_work_suspend(dm);
 
-	return dm_set_power_state(dm->dc, DC_ACPI_CM_POWER_STATE_D3);
+	dc_set_power_state(dm->dc, DC_ACPI_CM_POWER_STATE_D3);
+
+	return 0;
 }
 
 struct drm_connector *
@@ -2917,9 +2914,7 @@ static int dm_resume(void *handle)
 		if (r)
 			DRM_ERROR("DMUB interface failed to initialize: status=%d\n", r);
 
-		r = dm_set_power_state(dm->dc, DC_ACPI_CM_POWER_STATE_D0);
-		if (r)
-			return r;
+		dc_set_power_state(dm->dc, DC_ACPI_CM_POWER_STATE_D0);
 
 		dc_resume(dm->dc);
 
@@ -2969,9 +2964,7 @@ static int dm_resume(void *handle)
 	}
 
 	/* power on hardware */
-	r = dm_set_power_state(dm->dc, DC_ACPI_CM_POWER_STATE_D0);
-	if (r)
-		return r;
+	 dc_set_power_state(dm->dc, DC_ACPI_CM_POWER_STATE_D0);
 
 	/* program HPD filter */
 	dc_resume(dm->dc);
diff --git a/drivers/gpu/drm/amd/display/dc/core/dc.c b/drivers/gpu/drm/amd/display/dc/core/dc.c
index cb8c7c5a8807..2645d59dc58e 100644
--- a/drivers/gpu/drm/amd/display/dc/core/dc.c
+++ b/drivers/gpu/drm/amd/display/dc/core/dc.c
@@ -4724,12 +4724,12 @@ void dc_power_down_on_boot(struct dc *dc)
 		dc->hwss.power_down_on_boot(dc);
 }
 
-bool dc_set_power_state(
+void dc_set_power_state(
 	struct dc *dc,
 	enum dc_acpi_cm_power_state power_state)
 {
 	if (!dc->current_state)
-		return true;
+		return;
 
 	switch (power_state) {
 	case DC_ACPI_CM_POWER_STATE_D0:
@@ -4752,8 +4752,6 @@ bool dc_set_power_state(
 
 		break;
 	}
-
-	return true;
 }
 
 void dc_resume(struct dc *dc)
diff --git a/drivers/gpu/drm/amd/display/dc/dc.h b/drivers/gpu/drm/amd/display/dc/dc.h
index b140eb240ad7..b6002b11a745 100644
--- a/drivers/gpu/drm/amd/display/dc/dc.h
+++ b/drivers/gpu/drm/amd/display/dc/dc.h
@@ -2330,7 +2330,7 @@ void dc_notify_vsync_int_state(struct dc *dc, struct dc_stream_state *stream, bo
 
 /* Power Interfaces */
 
-bool dc_set_power_state(
+void dc_set_power_state(
 		struct dc *dc,
 		enum dc_acpi_cm_power_state power_state);
 void dc_resume(struct dc *dc);
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH v4 1/3] drm/amd: Evict resources during PM ops prepare() callback
  2023-10-04 17:18 ` [PATCH v4 1/3] drm/amd: Evict resources during PM ops prepare() callback Mario Limonciello
@ 2023-10-05 14:13   ` Alex Deucher
       [not found]   ` <12e15ad4-cfa5-4ce9-89a4-f8d1718cb256@gmail.com>
  1 sibling, 0 replies; 10+ messages in thread
From: Alex Deucher @ 2023-10-05 14:13 UTC (permalink / raw)
  To: Mario Limonciello; +Cc: Harry.Wentland, amd-gfx

On Wed, Oct 4, 2023 at 1:37 PM Mario Limonciello
<mario.limonciello@amd.com> wrote:
>
> Linux PM core has a prepare() callback run before suspend.
>
> If the system is under high memory pressure, the resources may need
> to be evicted into swap instead.  If the storage backing for swap
> is offlined during the suspend() step then such a call may fail.
>
> So duplicate this step into prepare() to move evict majority of
> resources while leaving all existing steps that put the GPU into a
> low power state in suspend().
>
> Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2362
> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu.h        |  1 +
>  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 26 +++++++++++++++++++++-
>  drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c    |  7 +++---
>  3 files changed, 30 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> index d23fb4b5ad95..6643d0ed6b1b 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> @@ -1413,6 +1413,7 @@ void amdgpu_driver_postclose_kms(struct drm_device *dev,
>  void amdgpu_driver_release_kms(struct drm_device *dev);
>
>  int amdgpu_device_ip_suspend(struct amdgpu_device *adev);
> +int amdgpu_device_prepare(struct drm_device *dev);
>  int amdgpu_device_suspend(struct drm_device *dev, bool fbcon);
>  int amdgpu_device_resume(struct drm_device *dev, bool fbcon);
>  u32 amdgpu_get_vblank_counter_kms(struct drm_crtc *crtc);
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> index bad2b5577e96..67acee569c08 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> @@ -4259,6 +4259,31 @@ static int amdgpu_device_evict_resources(struct amdgpu_device *adev)
>  /*
>   * Suspend & resume.
>   */
> +/**
> + * amdgpu_device_prepare - prepare for device suspend
> + *
> + * @dev: drm dev pointer
> + *
> + * Prepare to put the hw in the suspend state (all asics).
> + * Returns 0 for success or an error on failure.
> + * Called at driver suspend.
> + */
> +int amdgpu_device_prepare(struct drm_device *dev)
> +{
> +       struct amdgpu_device *adev = drm_to_adev(dev);
> +       int r;
> +
> +       if (dev->switch_power_state == DRM_SWITCH_POWER_OFF)
> +               return 0;
> +
> +       /* Evict the majority of BOs before starting suspend sequence */
> +       r = amdgpu_device_evict_resources(adev);
> +       if (r)
> +               return r;
> +
> +       return 0;
> +}
> +
>  /**
>   * amdgpu_device_suspend - initiate device suspend
>   *
> @@ -4279,7 +4304,6 @@ int amdgpu_device_suspend(struct drm_device *dev, bool fbcon)
>
>         adev->in_suspend = true;
>
> -       /* Evict the majority of BOs before grabbing the full access */
>         r = amdgpu_device_evict_resources(adev);
>         if (r)
>                 return r;

Might want to add a note that this is likely a noop in the normal
suspend case and is just here to handle the case where
amdgpu_device_suspend() is called outside of the normal pmops
framework.
Other than that, the patch is:
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>

> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> index e3471293846f..175167582db0 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> @@ -2425,8 +2425,9 @@ static int amdgpu_pmops_prepare(struct device *dev)
>         /* Return a positive number here so
>          * DPM_FLAG_SMART_SUSPEND works properly
>          */
> -       if (amdgpu_device_supports_boco(drm_dev))
> -               return pm_runtime_suspended(dev);
> +       if (amdgpu_device_supports_boco(drm_dev) &&
> +           pm_runtime_suspended(dev))
> +               return 1;
>
>         /* if we will not support s3 or s2i for the device
>          *  then skip suspend
> @@ -2435,7 +2436,7 @@ static int amdgpu_pmops_prepare(struct device *dev)
>             !amdgpu_acpi_is_s3_active(adev))
>                 return 1;
>
> -       return 0;
> +       return amdgpu_device_prepare(drm_dev);
>  }
>
>  static void amdgpu_pmops_complete(struct device *dev)
> --
> 2.34.1
>

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v4 2/3] drm/amd/display: Destroy DC context while keeping DML
  2023-10-04 17:18 ` [PATCH v4 2/3] drm/amd/display: Destroy DC context while keeping DML Mario Limonciello
@ 2023-10-05 14:27   ` Alex Deucher
  2023-10-05 14:37     ` Mario Limonciello
  0 siblings, 1 reply; 10+ messages in thread
From: Alex Deucher @ 2023-10-05 14:27 UTC (permalink / raw)
  To: Mario Limonciello; +Cc: Harry.Wentland, amd-gfx

On Wed, Oct 4, 2023 at 1:37 PM Mario Limonciello
<mario.limonciello@amd.com> wrote:
>
> If there is memory pressure at suspend time then dynamically
> allocating a large structure as part of DC suspend code will
> fail.
>
> Instead re-use the same structure and clear all members except
> those that should be maintained.
>
> Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2362
> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
> ---
>  drivers/gpu/drm/amd/display/dc/core/dc.c      | 25 -------------------
>  .../gpu/drm/amd/display/dc/core/dc_resource.c | 12 +++++++++
>  2 files changed, 12 insertions(+), 25 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/display/dc/core/dc.c b/drivers/gpu/drm/amd/display/dc/core/dc.c
> index 39e291a467e2..cb8c7c5a8807 100644
> --- a/drivers/gpu/drm/amd/display/dc/core/dc.c
> +++ b/drivers/gpu/drm/amd/display/dc/core/dc.c
> @@ -4728,9 +4728,6 @@ bool dc_set_power_state(
>         struct dc *dc,
>         enum dc_acpi_cm_power_state power_state)
>  {
> -       struct kref refcount;
> -       struct display_mode_lib *dml;
> -
>         if (!dc->current_state)
>                 return true;
>
> @@ -4750,30 +4747,8 @@ bool dc_set_power_state(
>                 break;
>         default:
>                 ASSERT(dc->current_state->stream_count == 0);
> -               /* Zero out the current context so that on resume we start with
> -                * clean state, and dc hw programming optimizations will not
> -                * cause any trouble.
> -                */
> -               dml = kzalloc(sizeof(struct display_mode_lib),
> -                               GFP_KERNEL);
> -
> -               ASSERT(dml);
> -               if (!dml)
> -                       return false;
> -
> -               /* Preserve refcount */
> -               refcount = dc->current_state->refcount;
> -               /* Preserve display mode lib */
> -               memcpy(dml, &dc->current_state->bw_ctx.dml, sizeof(struct display_mode_lib));
>
>                 dc_resource_state_destruct(dc->current_state);
> -               memset(dc->current_state, 0,
> -                               sizeof(*dc->current_state));
> -
> -               dc->current_state->refcount = refcount;
> -               dc->current_state->bw_ctx.dml = *dml;

The dml dance seems a bit weird.  I guess it's here because
dc_resource_state_destruct() might change it?  Can we safely drop
this?  If we do need it, we could pre-allocate a dml structure and use
that.

Alex

> -
> -               kfree(dml);
>
>                 break;
>         }
> diff --git a/drivers/gpu/drm/amd/display/dc/core/dc_resource.c b/drivers/gpu/drm/amd/display/dc/core/dc_resource.c
> index aa7b5db83644..e487c966c118 100644
> --- a/drivers/gpu/drm/amd/display/dc/core/dc_resource.c
> +++ b/drivers/gpu/drm/amd/display/dc/core/dc_resource.c
> @@ -4350,6 +4350,18 @@ void dc_resource_state_destruct(struct dc_state *context)
>                 context->streams[i] = NULL;
>         }
>         context->stream_count = 0;
> +       context->stream_mask = 0;
> +       memset(&context->res_ctx, 0, sizeof(context->res_ctx));
> +       memset(&context->pp_display_cfg, 0, sizeof(context->pp_display_cfg));
> +       memset(&context->dcn_bw_vars, 0, sizeof(context->dcn_bw_vars));
> +       context->clk_mgr = NULL;
> +       memset(&context->bw_ctx.bw, 0, sizeof(context->bw_ctx.bw));
> +       memset(context->block_sequence, 0, sizeof(context->block_sequence));
> +       context->block_sequence_steps = 0;
> +       memset(context->dc_dmub_cmd, 0, sizeof(context->dc_dmub_cmd));
> +       context->dmub_cmd_count = 0;
> +       memset(&context->perf_params, 0, sizeof(context->perf_params));
> +       memset(&context->scratch, 0, sizeof(context->scratch));
>  }
>
>  void dc_resource_state_copy_construct(
> --
> 2.34.1
>

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v4 3/3] drm/amd/display: make dc_set_power_state() return type `void` again
  2023-10-04 17:18 ` [PATCH v4 3/3] drm/amd/display: make dc_set_power_state() return type `void` again Mario Limonciello
@ 2023-10-05 14:28   ` Alex Deucher
  0 siblings, 0 replies; 10+ messages in thread
From: Alex Deucher @ 2023-10-05 14:28 UTC (permalink / raw)
  To: Mario Limonciello; +Cc: Harry.Wentland, amd-gfx

On Wed, Oct 4, 2023 at 1:27 PM Mario Limonciello
<mario.limonciello@amd.com> wrote:
>
> As dc_set_power_state() no longer allocates memory, it's not necessary
> to have return types and check return code as it can't fail anymore.
>
> Change it back to `void`.
>
> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>

> ---
>  .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c   | 17 +++++------------
>  drivers/gpu/drm/amd/display/dc/core/dc.c        |  6 ++----
>  drivers/gpu/drm/amd/display/dc/dc.h             |  2 +-
>  3 files changed, 8 insertions(+), 17 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> index a59a11ae42db..df9d9437f149 100644
> --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> @@ -2685,11 +2685,6 @@ static void hpd_rx_irq_work_suspend(struct amdgpu_display_manager *dm)
>         }
>  }
>
> -static int dm_set_power_state(struct dc *dc, enum dc_acpi_cm_power_state power_state)
> -{
> -       return dc_set_power_state(dc, power_state) ? 0 : -ENOMEM;
> -}
> -
>  static int dm_suspend(void *handle)
>  {
>         struct amdgpu_device *adev = handle;
> @@ -2723,7 +2718,9 @@ static int dm_suspend(void *handle)
>
>         hpd_rx_irq_work_suspend(dm);
>
> -       return dm_set_power_state(dm->dc, DC_ACPI_CM_POWER_STATE_D3);
> +       dc_set_power_state(dm->dc, DC_ACPI_CM_POWER_STATE_D3);
> +
> +       return 0;
>  }
>
>  struct drm_connector *
> @@ -2917,9 +2914,7 @@ static int dm_resume(void *handle)
>                 if (r)
>                         DRM_ERROR("DMUB interface failed to initialize: status=%d\n", r);
>
> -               r = dm_set_power_state(dm->dc, DC_ACPI_CM_POWER_STATE_D0);
> -               if (r)
> -                       return r;
> +               dc_set_power_state(dm->dc, DC_ACPI_CM_POWER_STATE_D0);
>
>                 dc_resume(dm->dc);
>
> @@ -2969,9 +2964,7 @@ static int dm_resume(void *handle)
>         }
>
>         /* power on hardware */
> -       r = dm_set_power_state(dm->dc, DC_ACPI_CM_POWER_STATE_D0);
> -       if (r)
> -               return r;
> +        dc_set_power_state(dm->dc, DC_ACPI_CM_POWER_STATE_D0);
>
>         /* program HPD filter */
>         dc_resume(dm->dc);
> diff --git a/drivers/gpu/drm/amd/display/dc/core/dc.c b/drivers/gpu/drm/amd/display/dc/core/dc.c
> index cb8c7c5a8807..2645d59dc58e 100644
> --- a/drivers/gpu/drm/amd/display/dc/core/dc.c
> +++ b/drivers/gpu/drm/amd/display/dc/core/dc.c
> @@ -4724,12 +4724,12 @@ void dc_power_down_on_boot(struct dc *dc)
>                 dc->hwss.power_down_on_boot(dc);
>  }
>
> -bool dc_set_power_state(
> +void dc_set_power_state(
>         struct dc *dc,
>         enum dc_acpi_cm_power_state power_state)
>  {
>         if (!dc->current_state)
> -               return true;
> +               return;
>
>         switch (power_state) {
>         case DC_ACPI_CM_POWER_STATE_D0:
> @@ -4752,8 +4752,6 @@ bool dc_set_power_state(
>
>                 break;
>         }
> -
> -       return true;
>  }
>
>  void dc_resume(struct dc *dc)
> diff --git a/drivers/gpu/drm/amd/display/dc/dc.h b/drivers/gpu/drm/amd/display/dc/dc.h
> index b140eb240ad7..b6002b11a745 100644
> --- a/drivers/gpu/drm/amd/display/dc/dc.h
> +++ b/drivers/gpu/drm/amd/display/dc/dc.h
> @@ -2330,7 +2330,7 @@ void dc_notify_vsync_int_state(struct dc *dc, struct dc_stream_state *stream, bo
>
>  /* Power Interfaces */
>
> -bool dc_set_power_state(
> +void dc_set_power_state(
>                 struct dc *dc,
>                 enum dc_acpi_cm_power_state power_state);
>  void dc_resume(struct dc *dc);
> --
> 2.34.1
>

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v4 2/3] drm/amd/display: Destroy DC context while keeping DML
  2023-10-05 14:27   ` Alex Deucher
@ 2023-10-05 14:37     ` Mario Limonciello
  0 siblings, 0 replies; 10+ messages in thread
From: Mario Limonciello @ 2023-10-05 14:37 UTC (permalink / raw)
  To: Wentland, Harry, Alex Deucher, Siqueira, Rodrigo; +Cc: amd-gfx

On 10/5/2023 09:27, Alex Deucher wrote:
> On Wed, Oct 4, 2023 at 1:37 PM Mario Limonciello
> <mario.limonciello@amd.com> wrote:
>>
>> If there is memory pressure at suspend time then dynamically
>> allocating a large structure as part of DC suspend code will
>> fail.
>>
>> Instead re-use the same structure and clear all members except
>> those that should be maintained.
>>
>> Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2362
>> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
>> ---
>>   drivers/gpu/drm/amd/display/dc/core/dc.c      | 25 -------------------
>>   .../gpu/drm/amd/display/dc/core/dc_resource.c | 12 +++++++++
>>   2 files changed, 12 insertions(+), 25 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/display/dc/core/dc.c b/drivers/gpu/drm/amd/display/dc/core/dc.c
>> index 39e291a467e2..cb8c7c5a8807 100644
>> --- a/drivers/gpu/drm/amd/display/dc/core/dc.c
>> +++ b/drivers/gpu/drm/amd/display/dc/core/dc.c
>> @@ -4728,9 +4728,6 @@ bool dc_set_power_state(
>>          struct dc *dc,
>>          enum dc_acpi_cm_power_state power_state)
>>   {
>> -       struct kref refcount;
>> -       struct display_mode_lib *dml;
>> -
>>          if (!dc->current_state)
>>                  return true;
>>
>> @@ -4750,30 +4747,8 @@ bool dc_set_power_state(
>>                  break;
>>          default:
>>                  ASSERT(dc->current_state->stream_count == 0);
>> -               /* Zero out the current context so that on resume we start with
>> -                * clean state, and dc hw programming optimizations will not
>> -                * cause any trouble.
>> -                */
>> -               dml = kzalloc(sizeof(struct display_mode_lib),
>> -                               GFP_KERNEL);
>> -
>> -               ASSERT(dml);
>> -               if (!dml)
>> -                       return false;
>> -
>> -               /* Preserve refcount */
>> -               refcount = dc->current_state->refcount;
>> -               /* Preserve display mode lib */
>> -               memcpy(dml, &dc->current_state->bw_ctx.dml, sizeof(struct display_mode_lib));
>>
>>                  dc_resource_state_destruct(dc->current_state);
>> -               memset(dc->current_state, 0,
>> -                               sizeof(*dc->current_state));
>> -
>> -               dc->current_state->refcount = refcount;
>> -               dc->current_state->bw_ctx.dml = *dml;
> 
> The dml dance seems a bit weird.  I guess it's here because
> dc_resource_state_destruct() might change it?  Can we safely drop
> this?  If we do need it, we could pre-allocate a dml structure and use
> that.

The dml structure is huge, so I think it's sub-optimal to have two 
copies of it.  That's why I aimed to just destroy everything else except 
it instead.

The only reason it's "safe" to drop the whole above stuff is because of 
"threading the needle" of what dc_resource_state_destruct() does.

In the earlier version I had a mistake to miss clearing the scratch 
variable and it caused some IGT faliures.

This probably needs to be double checked with the DML2 series landing as 
well to make sure it didn't get caught in the middle.

> 
> Alex
> 
>> -
>> -               kfree(dml);
>>
>>                  break;
>>          }
>> diff --git a/drivers/gpu/drm/amd/display/dc/core/dc_resource.c b/drivers/gpu/drm/amd/display/dc/core/dc_resource.c
>> index aa7b5db83644..e487c966c118 100644
>> --- a/drivers/gpu/drm/amd/display/dc/core/dc_resource.c
>> +++ b/drivers/gpu/drm/amd/display/dc/core/dc_resource.c
>> @@ -4350,6 +4350,18 @@ void dc_resource_state_destruct(struct dc_state *context)
>>                  context->streams[i] = NULL;
>>          }
>>          context->stream_count = 0;
>> +       context->stream_mask = 0;
>> +       memset(&context->res_ctx, 0, sizeof(context->res_ctx));
>> +       memset(&context->pp_display_cfg, 0, sizeof(context->pp_display_cfg));
>> +       memset(&context->dcn_bw_vars, 0, sizeof(context->dcn_bw_vars));
>> +       context->clk_mgr = NULL;
>> +       memset(&context->bw_ctx.bw, 0, sizeof(context->bw_ctx.bw));
>> +       memset(context->block_sequence, 0, sizeof(context->block_sequence));
>> +       context->block_sequence_steps = 0;
>> +       memset(context->dc_dmub_cmd, 0, sizeof(context->dc_dmub_cmd));
>> +       context->dmub_cmd_count = 0;
>> +       memset(&context->perf_params, 0, sizeof(context->perf_params));
>> +       memset(&context->scratch, 0, sizeof(context->scratch));
>>   }
>>
>>   void dc_resource_state_copy_construct(
>> --
>> 2.34.1
>>


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v4 1/3] drm/amd: Evict resources during PM ops prepare() callback
       [not found]   ` <12e15ad4-cfa5-4ce9-89a4-f8d1718cb256@gmail.com>
@ 2023-10-05 14:59     ` Mario Limonciello
  2023-10-06  7:22       ` Christian König
  0 siblings, 1 reply; 10+ messages in thread
From: Mario Limonciello @ 2023-10-05 14:59 UTC (permalink / raw)
  To: Christian König, amd-gfx; +Cc: Harry.Wentland

On 10/5/2023 09:39, Christian König wrote:
> Am 04.10.23 um 19:18 schrieb Mario Limonciello:
>> Linux PM core has a prepare() callback run before suspend.
>>
>> If the system is under high memory pressure, the resources may need
>> to be evicted into swap instead.  If the storage backing for swap
>> is offlined during the suspend() step then such a call may fail.
>>
>> So duplicate this step into prepare() to move evict majority of
>> resources while leaving all existing steps that put the GPU into a
>> low power state in suspend().
>>
>> Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2362
>> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
>> ---
>>   drivers/gpu/drm/amd/amdgpu/amdgpu.h        |  1 +
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 26 +++++++++++++++++++++-
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c    |  7 +++---
>>   3 files changed, 30 insertions(+), 4 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h 
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>> index d23fb4b5ad95..6643d0ed6b1b 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>> @@ -1413,6 +1413,7 @@ void amdgpu_driver_postclose_kms(struct 
>> drm_device *dev,
>>   void amdgpu_driver_release_kms(struct drm_device *dev);
>>   int amdgpu_device_ip_suspend(struct amdgpu_device *adev);
>> +int amdgpu_device_prepare(struct drm_device *dev);
>>   int amdgpu_device_suspend(struct drm_device *dev, bool fbcon);
>>   int amdgpu_device_resume(struct drm_device *dev, bool fbcon);
>>   u32 amdgpu_get_vblank_counter_kms(struct drm_crtc *crtc);
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> index bad2b5577e96..67acee569c08 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> @@ -4259,6 +4259,31 @@ static int amdgpu_device_evict_resources(struct 
>> amdgpu_device *adev)
>>   /*
>>    * Suspend & resume.
>>    */
>> +/**
>> + * amdgpu_device_prepare - prepare for device suspend
>> + *
>> + * @dev: drm dev pointer
>> + *
>> + * Prepare to put the hw in the suspend state (all asics).
>> + * Returns 0 for success or an error on failure.
>> + * Called at driver suspend.
>> + */
>> +int amdgpu_device_prepare(struct drm_device *dev)
>> +{
>> +    struct amdgpu_device *adev = drm_to_adev(dev);
>> +    int r;
>> +
>> +    if (dev->switch_power_state == DRM_SWITCH_POWER_OFF)
>> +        return 0;
>> +
>> +    /* Evict the majority of BOs before starting suspend sequence */
>> +    r = amdgpu_device_evict_resources(adev);
>> +    if (r)
>> +        return r;
>> +
>> +    return 0;
>> +}
>> +
>>   /**
>>    * amdgpu_device_suspend - initiate device suspend
>>    *
>> @@ -4279,7 +4304,6 @@ int amdgpu_device_suspend(struct drm_device 
>> *dev, bool fbcon)
>>       adev->in_suspend = true;
>> -    /* Evict the majority of BOs before grabbing the full access */
>>       r = amdgpu_device_evict_resources(adev);
>>       if (r)
>>           return r;
> 
> I would just completely drop this extra amdgpu_device_evict_resources() 
> call now.
> 
> We have a second call which is used to evacuate firmware etc... after 
> the hw has been shut down. That one can't move, but also shouldn't 
> allocate that much memory.
> 

The problem is that amdgpu_device_suspend() is also called from 
amdgpu_switcheroo_set_state() as well as a bunch of pmops sequences that 
I don't expect call prepare() like poweroff().

I would think we still want to evict resources at the beginning of 
amdgpu_device_suspend() for all of those.

So it's an extra call for the prepare() path but it should be harmless.

> Regards,
> Christian.
> 
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c 
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
>> index e3471293846f..175167582db0 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
>> @@ -2425,8 +2425,9 @@ static int amdgpu_pmops_prepare(struct device *dev)
>>       /* Return a positive number here so
>>        * DPM_FLAG_SMART_SUSPEND works properly
>>        */
>> -    if (amdgpu_device_supports_boco(drm_dev))
>> -        return pm_runtime_suspended(dev);
>> +    if (amdgpu_device_supports_boco(drm_dev) &&
>> +        pm_runtime_suspended(dev))
>> +        return 1;
>>       /* if we will not support s3 or s2i for the device
>>        *  then skip suspend
>> @@ -2435,7 +2436,7 @@ static int amdgpu_pmops_prepare(struct device *dev)
>>           !amdgpu_acpi_is_s3_active(adev))
>>           return 1;
>> -    return 0;
>> +    return amdgpu_device_prepare(drm_dev);
>>   }
>>   static void amdgpu_pmops_complete(struct device *dev)
> 


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v4 1/3] drm/amd: Evict resources during PM ops prepare() callback
  2023-10-05 14:59     ` Mario Limonciello
@ 2023-10-06  7:22       ` Christian König
  0 siblings, 0 replies; 10+ messages in thread
From: Christian König @ 2023-10-06  7:22 UTC (permalink / raw)
  To: Mario Limonciello, amd-gfx; +Cc: Harry.Wentland

Am 05.10.23 um 16:59 schrieb Mario Limonciello:
> On 10/5/2023 09:39, Christian König wrote:
>> Am 04.10.23 um 19:18 schrieb Mario Limonciello:
>>> Linux PM core has a prepare() callback run before suspend.
>>>
>>> If the system is under high memory pressure, the resources may need
>>> to be evicted into swap instead.  If the storage backing for swap
>>> is offlined during the suspend() step then such a call may fail.
>>>
>>> So duplicate this step into prepare() to move evict majority of
>>> resources while leaving all existing steps that put the GPU into a
>>> low power state in suspend().
>>>
>>> Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2362
>>> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
>>> ---
>>>   drivers/gpu/drm/amd/amdgpu/amdgpu.h        |  1 +
>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 26 
>>> +++++++++++++++++++++-
>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c    |  7 +++---
>>>   3 files changed, 30 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h 
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>> index d23fb4b5ad95..6643d0ed6b1b 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>> @@ -1413,6 +1413,7 @@ void amdgpu_driver_postclose_kms(struct 
>>> drm_device *dev,
>>>   void amdgpu_driver_release_kms(struct drm_device *dev);
>>>   int amdgpu_device_ip_suspend(struct amdgpu_device *adev);
>>> +int amdgpu_device_prepare(struct drm_device *dev);
>>>   int amdgpu_device_suspend(struct drm_device *dev, bool fbcon);
>>>   int amdgpu_device_resume(struct drm_device *dev, bool fbcon);
>>>   u32 amdgpu_get_vblank_counter_kms(struct drm_crtc *crtc);
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>> index bad2b5577e96..67acee569c08 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>> @@ -4259,6 +4259,31 @@ static int 
>>> amdgpu_device_evict_resources(struct amdgpu_device *adev)
>>>   /*
>>>    * Suspend & resume.
>>>    */
>>> +/**
>>> + * amdgpu_device_prepare - prepare for device suspend
>>> + *
>>> + * @dev: drm dev pointer
>>> + *
>>> + * Prepare to put the hw in the suspend state (all asics).
>>> + * Returns 0 for success or an error on failure.
>>> + * Called at driver suspend.
>>> + */
>>> +int amdgpu_device_prepare(struct drm_device *dev)
>>> +{
>>> +    struct amdgpu_device *adev = drm_to_adev(dev);
>>> +    int r;
>>> +
>>> +    if (dev->switch_power_state == DRM_SWITCH_POWER_OFF)
>>> +        return 0;
>>> +
>>> +    /* Evict the majority of BOs before starting suspend sequence */
>>> +    r = amdgpu_device_evict_resources(adev);
>>> +    if (r)
>>> +        return r;
>>> +
>>> +    return 0;
>>> +}
>>> +
>>>   /**
>>>    * amdgpu_device_suspend - initiate device suspend
>>>    *
>>> @@ -4279,7 +4304,6 @@ int amdgpu_device_suspend(struct drm_device 
>>> *dev, bool fbcon)
>>>       adev->in_suspend = true;
>>> -    /* Evict the majority of BOs before grabbing the full access */
>>>       r = amdgpu_device_evict_resources(adev);
>>>       if (r)
>>>           return r;
>>
>> I would just completely drop this extra 
>> amdgpu_device_evict_resources() call now.
>>
>> We have a second call which is used to evacuate firmware etc... after 
>> the hw has been shut down. That one can't move, but also shouldn't 
>> allocate that much memory.
>>
>
> The problem is that amdgpu_device_suspend() is also called from 
> amdgpu_switcheroo_set_state() as well as a bunch of pmops sequences 
> that I don't expect call prepare() like poweroff().
>
> I would think we still want to evict resources at the beginning of 
> amdgpu_device_suspend() for all of those.
>
> So it's an extra call for the prepare() path but it should be harmless.

That's true, but I would rather say that we should then call 
amdgpu_device_prepare() in those call paths as well.

We might end up putting more stuff into the prepare call than just 
eviction VRAM.

Regards,
Christian.

>
>> Regards,
>> Christian.
>>
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c 
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
>>> index e3471293846f..175167582db0 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
>>> @@ -2425,8 +2425,9 @@ static int amdgpu_pmops_prepare(struct device 
>>> *dev)
>>>       /* Return a positive number here so
>>>        * DPM_FLAG_SMART_SUSPEND works properly
>>>        */
>>> -    if (amdgpu_device_supports_boco(drm_dev))
>>> -        return pm_runtime_suspended(dev);
>>> +    if (amdgpu_device_supports_boco(drm_dev) &&
>>> +        pm_runtime_suspended(dev))
>>> +        return 1;
>>>       /* if we will not support s3 or s2i for the device
>>>        *  then skip suspend
>>> @@ -2435,7 +2436,7 @@ static int amdgpu_pmops_prepare(struct device 
>>> *dev)
>>>           !amdgpu_acpi_is_s3_active(adev))
>>>           return 1;
>>> -    return 0;
>>> +    return amdgpu_device_prepare(drm_dev);
>>>   }
>>>   static void amdgpu_pmops_complete(struct device *dev)
>>
>


^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2023-10-06  7:22 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-10-04 17:18 [PATCH v4 0/3] Better handle memory pressure at suspend Mario Limonciello
2023-10-04 17:18 ` [PATCH v4 1/3] drm/amd: Evict resources during PM ops prepare() callback Mario Limonciello
2023-10-05 14:13   ` Alex Deucher
     [not found]   ` <12e15ad4-cfa5-4ce9-89a4-f8d1718cb256@gmail.com>
2023-10-05 14:59     ` Mario Limonciello
2023-10-06  7:22       ` Christian König
2023-10-04 17:18 ` [PATCH v4 2/3] drm/amd/display: Destroy DC context while keeping DML Mario Limonciello
2023-10-05 14:27   ` Alex Deucher
2023-10-05 14:37     ` Mario Limonciello
2023-10-04 17:18 ` [PATCH v4 3/3] drm/amd/display: make dc_set_power_state() return type `void` again Mario Limonciello
2023-10-05 14:28   ` Alex Deucher

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.