* [PATCH] drm/amdgpu: Update RAS init handling
@ 2020-09-11  4:02 Clements, John
  2020-09-11  6:02 ` Zhang, Hawking
  0 siblings, 1 reply; 6+ messages in thread
From: Clements, John @ 2020-09-11  4:02 UTC (permalink / raw)
  To: amd-gfx list, Chen, Guchun, Zhang, Hawking


[-- Attachment #1.1: Type: text/plain, Size: 124 bytes --]

[AMD Official Use Only - Internal Distribution Only]

Add a RAS status check and tear down the RAS context if RAS init fails.
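
The intended control flow can be sketched outside the driver as follows. This is a minimal, self-contained model and not the amdgpu code: struct mock_ras, mock_ras_load() and the context_alive flag are hypothetical stand-ins for the PSP RAS context, psp_ras_load() and the effect of amdgpu_ras_fini(). It only illustrates that a submit error or a non-zero status word from the RAS trusted application leaves RAS uninitialized and tears the context down.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the RAS context; not the driver's real struct. */
struct mock_ras {
	bool initialized;    /* mirrors psp->ras.ras_initialized */
	bool context_alive;  /* false once the context has been torn down */
	uint32_t session_id;
};

/*
 * submit_ret: result of submitting the load command (0 on success)
 * ta_status:  status word written to shared memory (0 on success)
 */
static int mock_ras_load(struct mock_ras *ras, int submit_ret,
			 uint32_t session_id, uint32_t ta_status)
{
	if (!submit_ret) {
		ras->session_id = session_id;

		/* Only mark RAS usable when the status word is clean. */
		if (!ta_status)
			ras->initialized = true;
	}

	/* Either failure mode tears the context down. */
	if (submit_ret || ta_status)
		ras->context_alive = false;

	/* As in the patch, only the submit result is returned. */
	return submit_ret;
}
```

Note that in this model, as in the patch, a non-zero status word does not change the return value; callers would have to look at the initialized flag to see that RAS is unavailable.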

[-- Attachment #1.2: Type: text/html, Size: 1820 bytes --]

[-- Attachment #2: 0001-drm-amdgpu-Update-RAS-init-handling.patch --]
[-- Type: application/octet-stream, Size: 1520 bytes --]

From b17babc8bca65728975ddbc98bf0ee7338eac4f3 Mon Sep 17 00:00:00 2001
From: John Clements <john.clements@amd.com>
Date: Fri, 11 Sep 2020 11:57:56 +0800
Subject: [PATCH] drm/amdgpu: Update RAS init handling

Output RAS init status

If RAS init fails, tear down the RAS context

Signed-off-by: John Clements <john.clements@amd.com>
Change-Id: Ic7660a709c60f12b481fdee0a8b32694210138c0
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c | 14 +++++++++++++-
 1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
index d6c38e24f130..7dd515bab22e 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
@@ -929,6 +929,7 @@ static int psp_ras_load(struct psp_context *psp)
 {
 	int ret;
 	struct psp_gfx_cmd_resp *cmd;
+	struct ta_ras_shared_memory *ras_cmd;
 
 	/*
 	 * TODO: bypass the loading in sriov for now
@@ -952,11 +953,22 @@ static int psp_ras_load(struct psp_context *psp)
 	ret = psp_cmd_submit_buf(psp, NULL, cmd,
 			psp->fence_buf_mc_addr);
 
+	ras_cmd = (struct ta_ras_shared_memory *)psp->ras.ras_shared_buf;
+
 	if (!ret) {
-		psp->ras.ras_initialized = true;
 		psp->ras.session_id = cmd->resp.session_id;
+
+		if (!ras_cmd->ras_status)
+			psp->ras.ras_initialized = true;
+		else
+		{
+			dev_warn(psp->adev->dev, "RAS Init Status: 0x%X\n", ras_cmd->ras_status);
+		}
 	}
 
+	if (ret || ras_cmd->ras_status)
+		amdgpu_ras_fini(psp->adev);
+
 	kfree(cmd);
 
 	return ret;
-- 
2.17.1


[-- Attachment #3: Type: text/plain, Size: 154 bytes --]

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


* RE: [PATCH] drm/amdgpu: Update RAS init handling
  2020-09-11  4:02 [PATCH] drm/amdgpu: Update RAS init handling Clements, John
@ 2020-09-11  6:02 ` Zhang, Hawking
  2020-09-14 18:38   ` Deucher, Alexander
  0 siblings, 1 reply; 6+ messages in thread
From: Zhang, Hawking @ 2020-09-11  6:02 UTC (permalink / raw)
  To: Clements, John, amd-gfx list, Chen, Guchun


[-- Attachment #1.1: Type: text/plain, Size: 959 bytes --]

[AMD Public Use]

+                             {
+                                             dev_warn(psp->adev->dev, "RAS Init Status: 0x%X\n", ras_cmd->ras_status);
+                             }
Please remove the redundant braces.

Other than that, the patch is
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>

In addition, please create another patch to move the NBIO RAS controller IRQ source registration to sw_init. That is consistent with what we did for the other IP blocks, which register their IRQ sources in their IP sw_init functions.

Regards,
Hawking


* Re: [PATCH] drm/amdgpu: Update RAS init handling
  2020-09-11  6:02 ` Zhang, Hawking
@ 2020-09-14 18:38   ` Deucher, Alexander
  0 siblings, 0 replies; 6+ messages in thread
From: Deucher, Alexander @ 2020-09-14 18:38 UTC (permalink / raw)
  To: Zhang, Hawking, Clements, John, amd-gfx list, Chen,  Guchun


[-- Attachment #1.1: Type: text/plain, Size: 1510 bytes --]

[AMD Public Use]

Also, a general nit: per the kernel coding style, braces should be on the same line as the if or else, e.g.,
} else {
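
For illustration, a small standalone sketch of that brace placement is shown below. It is not the driver code: struct ras_state and ras_record_status() are hypothetical names, and fprintf() stands in for dev_warn(). Braces are kept on both branches here because the else branch holds more than one statement.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for the driver's RAS state; not the real struct. */
struct ras_state {
	bool initialized;
	uint32_t last_status;
};

/* Returns true when the reported status is 0, i.e. init succeeded. */
static bool ras_record_status(struct ras_state *ras, uint32_t status)
{
	ras->last_status = status;

	if (!status) {
		ras->initialized = true;
	} else {
		/* The closing and opening braces share the line with "else". */
		ras->initialized = false;
		fprintf(stderr, "RAS Init Status: 0x%X\n", (unsigned int)status);
	}

	return ras->initialized;
}
```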


Alex


* RE: [PATCH] drm/amdgpu: Update RAS init handling
  2020-09-09  9:24 Clements, John
  2020-09-09 10:34 ` Zhang, Hawking
@ 2020-09-09 11:48 ` Chen, Guchun
  1 sibling, 0 replies; 6+ messages in thread
From: Chen, Guchun @ 2020-09-09 11:48 UTC (permalink / raw)
  To: Clements, John, amd-gfx list, Zhang, Hawking


[-- Attachment #1.1: Type: text/plain, Size: 596 bytes --]

[AMD Public Use]

+            if (!adev->psp.ras.ras_initialized)
+                           return -EINVAL;
+
             if (!con)
                            return -EINVAL;

I suggest squashing the new check into the existing one below it.
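
A combined check could look roughly like the sketch below. It is a self-contained illustration only: struct ras_ctx and ras_feature_enable_precheck() are hypothetical names standing in for the RAS context and the early-return guard in amdgpu_ras_feature_enable().

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the RAS context pointer checked as "con". */
struct ras_ctx {
	int features;
};

/*
 * Both conditions return the same error code, so one test covers
 * "RAS firmware not initialized" and "no RAS context".
 */
static int ras_feature_enable_precheck(bool ras_initialized,
				       const struct ras_ctx *con)
{
	if (!ras_initialized || !con)
		return -EINVAL;

	return 0;
}
```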

Regards,
Guchun



* RE: [PATCH] drm/amdgpu: Update RAS init handling
  2020-09-09  9:24 Clements, John
@ 2020-09-09 10:34 ` Zhang, Hawking
  2020-09-09 11:48 ` Chen, Guchun
  1 sibling, 0 replies; 6+ messages in thread
From: Zhang, Hawking @ 2020-09-09 10:34 UTC (permalink / raw)
  To: Clements, John, amd-gfx list


[-- Attachment #1.1: Type: text/plain, Size: 503 bytes --]

[AMD Public Use]

The patch is

Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>

BTW, please use git send-email for code review going forward, instead of attaching the patch.

Regards,
Hawking


* [PATCH] drm/amdgpu: Update RAS init handling
@ 2020-09-09  9:24 Clements, John
  2020-09-09 10:34 ` Zhang, Hawking
  2020-09-09 11:48 ` Chen, Guchun
  0 siblings, 2 replies; 6+ messages in thread
From: Clements, John @ 2020-09-09  9:24 UTC (permalink / raw)
  To: amd-gfx list, Zhang, Hawking


[-- Attachment #1.1: Type: text/plain, Size: 58 bytes --]

[AMD Official Use Only - Internal Distribution Only]



[-- Attachment #1.2: Type: text/html, Size: 1760 bytes --]

[-- Attachment #2: 0001-drm-amdgpu-Update-RAS-init-handling.patch --]
[-- Type: application/octet-stream, Size: 1946 bytes --]

From bf9f81158639eb8df72e3c8e89c86e6f4c499355 Mon Sep 17 00:00:00 2001
From: John Clements <john.clements@amd.com>
Date: Wed, 9 Sep 2020 17:19:53 +0800
Subject: [PATCH] drm/amdgpu: Update RAS init handling

Upon RAS init failure:
Output RAS init status in log
Do not attempt further RAS FW functions

Signed-off-by: John Clements <john.clements@amd.com>
Change-Id: Ie59c764f5eec387d04bd676ad4e053bb39f30667
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c | 9 ++++++++-
 drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 3 +++
 2 files changed, 11 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
index d6c38e24f130..c765d14efb58 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
@@ -929,6 +929,7 @@ static int psp_ras_load(struct psp_context *psp)
 {
 	int ret;
 	struct psp_gfx_cmd_resp *cmd;
+	struct ta_ras_shared_memory *ras_cmd;
 
 	/*
 	 * TODO: bypass the loading in sriov for now
@@ -952,9 +953,15 @@ static int psp_ras_load(struct psp_context *psp)
 	ret = psp_cmd_submit_buf(psp, NULL, cmd,
 			psp->fence_buf_mc_addr);
 
+	ras_cmd = (struct ta_ras_shared_memory *)psp->ras.ras_shared_buf;
+
 	if (!ret) {
-		psp->ras.ras_initialized = true;
 		psp->ras.session_id = cmd->resp.session_id;
+
+		if (!ras_cmd->ras_status)
+			psp->ras.ras_initialized = true;
+		else
+			dev_warn(psp->adev->dev, "RAS Init Status: 0x%X\n", ras_cmd->ras_status);
 	}
 
 	kfree(cmd);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
index e5ea14774c0c..c659b105b66b 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
@@ -604,6 +604,9 @@ int amdgpu_ras_feature_enable(struct amdgpu_device *adev,
 	union ta_ras_cmd_input *info;
 	int ret;
 
+	if (!adev->psp.ras.ras_initialized)
+		return -EINVAL;
+
 	if (!con)
 		return -EINVAL;
 
-- 
2.17.1



Thread overview: 6+ messages
2020-09-11  4:02 [PATCH] drm/amdgpu: Update RAS init handling Clements, John
2020-09-11  6:02 ` Zhang, Hawking
2020-09-14 18:38   ` Deucher, Alexander
  -- strict thread matches above, loose matches on Subject: below --
2020-09-09  9:24 Clements, John
2020-09-09 10:34 ` Zhang, Hawking
2020-09-09 11:48 ` Chen, Guchun