* [PATCH] drm/amdgpu: disable sdma ecc irq only when sdma RAS is enabled in suspend
@ 2023-05-06  9:03 Guchun Chen
  2023-05-06  9:52 ` Zhou1, Tao
  0 siblings, 1 reply; 4+ messages in thread
From: Guchun Chen @ 2023-05-06  9:03 UTC
  To: amd-gfx, alexander.deucher, hawking.zhang, lijo.lazar, Tao.Zhou1,
	christian.koenig
  Cc: Guchun Chen

sdma_v4_0_ip is shared by several ASICs, but sdma_v4_0_hw_fini
unconditionally disables ecc_irq, which is only enabled on ASICs
that support SDMA ECC. This triggers a warning during the suspend
cycle on chips that have SDMA IP v4.0 but no SDMA ECC. Disable
ecc_irq only when SDMA RAS is supported.

[ 7283.166354] RIP: 0010:amdgpu_irq_put+0x45/0x70 [amdgpu]
[ 7283.167001] RSP: 0018:ffff9a5fc3967d08 EFLAGS: 00010246
[ 7283.167019] RAX: ffff98d88afd3770 RBX: 0000000000000001 RCX: 0000000000000000
[ 7283.167023] RDX: 0000000000000000 RSI: ffff98d89da30390 RDI: ffff98d89da20000
[ 7283.167025] RBP: ffff98d89da20000 R08: 0000000000036838 R09: 0000000000000006
[ 7283.167028] R10: ffffd5764243c008 R11: 0000000000000000 R12: ffff98d89da30390
[ 7283.167030] R13: ffff98d89da38978 R14: ffffffff999ae15a R15: ffff98d880130105
[ 7283.167032] FS:  0000000000000000(0000) GS:ffff98d996f00000(0000) knlGS:0000000000000000
[ 7283.167036] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 7283.167039] CR2: 00000000f7a9d178 CR3: 00000001c42ea000 CR4: 00000000003506e0
[ 7283.167041] Call Trace:
[ 7283.167046]  <TASK>
[ 7283.167048]  sdma_v4_0_hw_fini+0x38/0xa0 [amdgpu]
[ 7283.167704]  amdgpu_device_ip_suspend_phase2+0x101/0x1a0 [amdgpu]
[ 7283.168296]  amdgpu_device_suspend+0x103/0x180 [amdgpu]
[ 7283.168875]  amdgpu_pmops_freeze+0x21/0x60 [amdgpu]
[ 7283.169464]  pci_pm_freeze+0x54/0xc0

Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2522

Signed-off-by: Guchun Chen <guchun.chen@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c b/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
index b5affba22156..8b8ddf050266 100644
--- a/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
@@ -1903,9 +1903,11 @@ static int sdma_v4_0_hw_fini(void *handle)
 		return 0;
 	}
 
-	for (i = 0; i < adev->sdma.num_instances; i++) {
-		amdgpu_irq_put(adev, &adev->sdma.ecc_irq,
-			       AMDGPU_SDMA_IRQ_INSTANCE0 + i);
+	if (amdgpu_ras_is_supported(adev, AMDGPU_RAS_BLOCK__SDMA)) {
+		for (i = 0; i < adev->sdma.num_instances; i++) {
+			amdgpu_irq_put(adev, &adev->sdma.ecc_irq,
+				       AMDGPU_SDMA_IRQ_INSTANCE0 + i);
+		}
 	}
 
 	sdma_v4_0_ctx_switch_enable(adev, false);
-- 
2.25.1
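
The imbalance described in the commit message can be illustrated with a small
standalone model. This is only a sketch, not driver code: struct model_dev,
model_irq_get(), model_irq_put() and the other model_* names are hypothetical
stand-ins, and the model assumes (based on the commit message and the trace
above) that the ECC interrupt reference is taken only when SDMA RAS is
supported and that releasing a reference that was never taken produces a
warning.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_NUM_INSTANCES 2

/* Hypothetical model of a device; not the real struct amdgpu_device. */
struct model_dev {
	bool sdma_ras_supported;               /* stands in for amdgpu_ras_is_supported() */
	int ecc_irq_refs[MODEL_NUM_INSTANCES]; /* per-instance enable count */
	int warnings;                          /* unbalanced puts seen so far */
};

/* Take one reference on the ECC interrupt of a given SDMA instance. */
static void model_irq_get(struct model_dev *dev, int inst)
{
	assert(inst >= 0 && inst < MODEL_NUM_INSTANCES);
	dev->ecc_irq_refs[inst]++;
}

/* Drop one reference; a put without a matching get counts as a warning. */
static void model_irq_put(struct model_dev *dev, int inst)
{
	assert(inst >= 0 && inst < MODEL_NUM_INSTANCES);
	if (dev->ecc_irq_refs[inst] == 0) {
		dev->warnings++;
		return;
	}
	dev->ecc_irq_refs[inst]--;
}

/* Assumed enable path: references are taken only when RAS is supported. */
static void model_late_init(struct model_dev *dev)
{
	if (!dev->sdma_ras_supported)
		return;
	for (int i = 0; i < MODEL_NUM_INSTANCES; i++)
		model_irq_get(dev, i);
}

/* Teardown before the patch: always drops the references. */
static void model_hw_fini_unguarded(struct model_dev *dev)
{
	for (int i = 0; i < MODEL_NUM_INSTANCES; i++)
		model_irq_put(dev, i);
}

/* Teardown after the patch: mirrors the condition used on the enable path. */
static void model_hw_fini_guarded(struct model_dev *dev)
{
	if (!dev->sdma_ras_supported)
		return;
	for (int i = 0; i < MODEL_NUM_INSTANCES; i++)
		model_irq_put(dev, i);
}

/* Run one init/suspend cycle and report how many warnings it produced. */
static int model_suspend_warnings(bool ras_supported, bool guarded)
{
	struct model_dev dev = { .sdma_ras_supported = ras_supported };

	model_late_init(&dev);
	if (guarded)
		model_hw_fini_guarded(&dev);
	else
		model_hw_fini_unguarded(&dev);
	return dev.warnings;
}
```

In this model, model_suspend_warnings(false, false) counts one warning per
instance, while the guarded teardown counts none whether or not RAS is
supported. The point of the guard is that the release path checks the same
condition as the acquire path, so the two stay paired.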



* RE: [PATCH] drm/amdgpu: disable sdma ecc irq only when sdma RAS is enabled in suspend
  2023-05-06  9:03 [PATCH] drm/amdgpu: disable sdma ecc irq only when sdma RAS is enabled in suspend Guchun Chen
@ 2023-05-06  9:52 ` Zhou1, Tao
  0 siblings, 0 replies; 4+ messages in thread
From: Zhou1, Tao @ 2023-05-06  9:52 UTC
  To: Chen, Guchun, amd-gfx, Deucher, Alexander, Zhang, Hawking, Lazar,
	Lijo, Koenig, Christian

[AMD Official Use Only - General]

Reviewed-by: Tao Zhou <tao.zhou1@amd.com>



* RE: [PATCH] drm/amdgpu: disable sdma ecc irq only when sdma RAS is enabled in suspend
  2023-05-09  1:27 Guchun Chen
@ 2023-05-09 13:34 ` Deucher, Alexander
  0 siblings, 0 replies; 4+ messages in thread
From: Deucher, Alexander @ 2023-05-09 13:34 UTC
  To: Chen, Guchun, amd-gfx, Zhang, Hawking, Lazar, Lijo, Quan, Evan,
	Koenig, Christian, Pan, Xinhui
  Cc: Zhou1, Tao

[Public]


Acked-by: Alex Deucher <alexander.deucher@amd.com>



* [PATCH] drm/amdgpu: disable sdma ecc irq only when sdma RAS is enabled in suspend
@ 2023-05-09  1:27 Guchun Chen
  2023-05-09 13:34 ` Deucher, Alexander
  0 siblings, 1 reply; 4+ messages in thread
From: Guchun Chen @ 2023-05-09  1:27 UTC
  To: amd-gfx, alexander.deucher, hawking.zhang, lijo.lazar, evan.quan,
	christian.koenig, xinhui.pan
  Cc: Tao Zhou, Guchun Chen

sdma_v4_0_ip is shared by several ASICs, but sdma_v4_0_hw_fini
unconditionally disables ecc_irq, which is only enabled on ASICs
that support SDMA ECC. This triggers a warning during the suspend
cycle on chips that have SDMA IP v4.0 but no SDMA ECC. Disable
ecc_irq only when SDMA RAS is supported.

[ 7283.166354] RIP: 0010:amdgpu_irq_put+0x45/0x70 [amdgpu]
[ 7283.167001] RSP: 0018:ffff9a5fc3967d08 EFLAGS: 00010246
[ 7283.167019] RAX: ffff98d88afd3770 RBX: 0000000000000001 RCX: 0000000000000000
[ 7283.167023] RDX: 0000000000000000 RSI: ffff98d89da30390 RDI: ffff98d89da20000
[ 7283.167025] RBP: ffff98d89da20000 R08: 0000000000036838 R09: 0000000000000006
[ 7283.167028] R10: ffffd5764243c008 R11: 0000000000000000 R12: ffff98d89da30390
[ 7283.167030] R13: ffff98d89da38978 R14: ffffffff999ae15a R15: ffff98d880130105
[ 7283.167032] FS:  0000000000000000(0000) GS:ffff98d996f00000(0000) knlGS:0000000000000000
[ 7283.167036] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 7283.167039] CR2: 00000000f7a9d178 CR3: 00000001c42ea000 CR4: 00000000003506e0
[ 7283.167041] Call Trace:
[ 7283.167046]  <TASK>
[ 7283.167048]  sdma_v4_0_hw_fini+0x38/0xa0 [amdgpu]
[ 7283.167704]  amdgpu_device_ip_suspend_phase2+0x101/0x1a0 [amdgpu]
[ 7283.168296]  amdgpu_device_suspend+0x103/0x180 [amdgpu]
[ 7283.168875]  amdgpu_pmops_freeze+0x21/0x60 [amdgpu]
[ 7283.169464]  pci_pm_freeze+0x54/0xc0

Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2522

Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c b/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
index b5affba22156..8b8ddf050266 100644
--- a/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
@@ -1903,9 +1903,11 @@ static int sdma_v4_0_hw_fini(void *handle)
 		return 0;
 	}
 
-	for (i = 0; i < adev->sdma.num_instances; i++) {
-		amdgpu_irq_put(adev, &adev->sdma.ecc_irq,
-			       AMDGPU_SDMA_IRQ_INSTANCE0 + i);
+	if (amdgpu_ras_is_supported(adev, AMDGPU_RAS_BLOCK__SDMA)) {
+		for (i = 0; i < adev->sdma.num_instances; i++) {
+			amdgpu_irq_put(adev, &adev->sdma.ecc_irq,
+				       AMDGPU_SDMA_IRQ_INSTANCE0 + i);
+		}
 	}
 
 	sdma_v4_0_ctx_switch_enable(adev, false);
-- 
2.25.1



