* [PATCH] drm/amdgpu: Update RAS init handling
@ 2020-09-11 4:02 Clements, John
2020-09-11 6:02 ` Zhang, Hawking
0 siblings, 1 reply; 6+ messages in thread
From: Clements, John @ 2020-09-11 4:02 UTC (permalink / raw)
To: amd-gfx list, Chen, Guchun, Zhang, Hawking
[-- Attachment #1.1: Type: text/plain, Size: 124 bytes --]
[AMD Official Use Only - Internal Distribution Only]
Added a RAS status check; the RAS context is now torn down if RAS init fails.
[-- Attachment #1.2: Type: text/html, Size: 1820 bytes --]
[-- Attachment #2: 0001-drm-amdgpu-Update-RAS-init-handling.patch --]
[-- Type: application/octet-stream, Size: 1520 bytes --]
From b17babc8bca65728975ddbc98bf0ee7338eac4f3 Mon Sep 17 00:00:00 2001
From: John Clements <john.clements@amd.com>
Date: Fri, 11 Sep 2020 11:57:56 +0800
Subject: [PATCH] drm/amdgpu: Update RAS init handling
Output RAS init status
If RAS init fails, teardown RAS context
Signed-off-by: John Clements <john.clements@amd.com>
Change-Id: Ic7660a709c60f12b481fdee0a8b32694210138c0
---
drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c | 14 +++++++++++++-
1 file changed, 13 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
index d6c38e24f130..7dd515bab22e 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
@@ -929,6 +929,7 @@ static int psp_ras_load(struct psp_context *psp)
{
int ret;
struct psp_gfx_cmd_resp *cmd;
+ struct ta_ras_shared_memory *ras_cmd;
/*
* TODO: bypass the loading in sriov for now
@@ -952,11 +953,22 @@ static int psp_ras_load(struct psp_context *psp)
ret = psp_cmd_submit_buf(psp, NULL, cmd,
psp->fence_buf_mc_addr);
+ ras_cmd = (struct ta_ras_shared_memory*)psp->ras.ras_shared_buf;
+
if (!ret) {
- psp->ras.ras_initialized = true;
psp->ras.session_id = cmd->resp.session_id;
+
+ if (!ras_cmd->ras_status)
+ psp->ras.ras_initialized = true;
+ else
+ {
+ dev_warn(psp->adev->dev, "RAS Init Status: 0x%X\n", ras_cmd->ras_status);
+ }
}
+ if (ret || ras_cmd->ras_status)
+ amdgpu_ras_fini(psp->adev);
+
kfree(cmd);
return ret;
--
2.17.1
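For readers skimming the diff, the control flow it introduces can be sketched outside the driver. This is a minimal illustration only: struct ras_ctx, struct ras_shared_mem, ras_teardown() and ras_load_finish() are hypothetical stand-ins (assumptions, not the real amdgpu types or functions), with ras_teardown() playing the role of amdgpu_ras_fini().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for the TA shared buffer; not the real amdgpu type. */
struct ras_shared_mem {
	uint32_t ras_status;	/* 0 is treated as "init succeeded" */
};

/* Hypothetical stand-in for the RAS part of the PSP context. */
struct ras_ctx {
	bool initialized;
	bool torn_down;
	uint32_t session_id;
	struct ras_shared_mem shared;
};

/* Stand-in for amdgpu_ras_fini(): only records that teardown happened. */
static void ras_teardown(struct ras_ctx *ras)
{
	ras->initialized = false;
	ras->torn_down = true;
}

/*
 * Mirrors the flow of the patched psp_ras_load() after the load command
 * has been submitted. submit_ret is the submit return code and session_id
 * is the session reported in the response.
 */
static int ras_load_finish(struct ras_ctx *ras, int submit_ret,
			   uint32_t session_id)
{
	assert(ras != NULL);

	if (!submit_ret) {
		ras->session_id = session_id;

		/* Only mark RAS as initialized when the status is clean. */
		if (!ras->shared.ras_status)
			ras->initialized = true;
		else
			fprintf(stderr, "RAS Init Status: 0x%X\n",
				(unsigned int)ras->shared.ras_status);
	}

	/* Tear down on either a submit failure or a non-zero status. */
	if (submit_ret || ras->shared.ras_status)
		ras_teardown(ras);

	return submit_ret;
}
```

Note that in this sketch, as in the diff, a non-zero status with a successful submit still returns 0 to the caller; the failure is visible only through the warning and the torn-down context.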
[-- Attachment #3: Type: text/plain, Size: 154 bytes --]
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx
^ permalink raw reply related [flat|nested] 6+ messages in thread
* RE: [PATCH] drm/amdgpu: Update RAS init handling
2020-09-11 4:02 [PATCH] drm/amdgpu: Update RAS init handling Clements, John
@ 2020-09-11 6:02 ` Zhang, Hawking
2020-09-14 18:38 ` Deucher, Alexander
0 siblings, 1 reply; 6+ messages in thread
From: Zhang, Hawking @ 2020-09-11 6:02 UTC (permalink / raw)
To: Clements, John, amd-gfx list, Chen, Guchun
[-- Attachment #1.1: Type: text/plain, Size: 959 bytes --]
[AMD Public Use]
+ {
+ dev_warn(psp->adev->dev, "RAS Init Status: 0x%X\n", ras_cmd->ras_status);
+ }
Please remove the redundant braces.
Other than that, the patch is
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
In addition, please create another patch to move the nbio RAS controller irq source registration to sw_init. That is consistent with what we did for the other IP blocks, which register their irq sources in the IP sw_init funcs.
Regards,
Hawking
From: Clements, John <John.Clements@amd.com>
Sent: Friday, September 11, 2020 12:03
To: amd-gfx list <amd-gfx@lists.freedesktop.org>; Chen, Guchun <Guchun.Chen@amd.com>; Zhang, Hawking <Hawking.Zhang@amd.com>
Subject: [PATCH] drm/amdgpu: Update RAS init handling
[AMD Official Use Only - Internal Distribution Only]
Added RAS status check and tear down RAS context if RAS init fails
[-- Attachment #1.2: Type: text/html, Size: 4638 bytes --]
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH] drm/amdgpu: Update RAS init handling
2020-09-11 6:02 ` Zhang, Hawking
@ 2020-09-14 18:38 ` Deucher, Alexander
0 siblings, 0 replies; 6+ messages in thread
From: Deucher, Alexander @ 2020-09-14 18:38 UTC (permalink / raw)
To: Zhang, Hawking, Clements, John, amd-gfx list, Chen, Guchun
[-- Attachment #1.1: Type: text/plain, Size: 1510 bytes --]
[AMD Public Use]
Also, a general nit: per kernel coding style, braces should be on the same line as the if or else, e.g.,
} else {
Alex
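The two style comments in this thread can be illustrated with a small standalone sketch. The helper names ras_init_ok() and ras_init_ok_counted() and the failures counter are hypothetical and exist only for illustration; only the format string is taken from the patch. The first function drops the braces because both branches are single statements; the second keeps them because one branch has two statements, and places them on the same line as the if and else.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Both branches are single statements, so no braces are used. */
static bool ras_init_ok(uint32_t ras_status, char *msg, size_t len)
{
	bool initialized = false;

	assert(msg != NULL && len > 0);
	msg[0] = '\0';

	if (!ras_status)
		initialized = true;
	else
		snprintf(msg, len, "RAS Init Status: 0x%X",
			 (unsigned int)ras_status);

	return initialized;
}

/*
 * One branch needs two statements, so both branches get braces,
 * written as "if (...) {" and "} else {".
 */
static bool ras_init_ok_counted(uint32_t ras_status, char *msg, size_t len,
				unsigned int *failures)
{
	bool initialized = false;

	assert(msg != NULL && len > 0 && failures != NULL);
	msg[0] = '\0';

	if (!ras_status) {
		initialized = true;
	} else {
		snprintf(msg, len, "RAS Init Status: 0x%X",
			 (unsigned int)ras_status);
		(*failures)++;
	}

	return initialized;
}
```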
[-- Attachment #1.2: Type: text/html, Size: 5060 bytes --]
^ permalink raw reply [flat|nested] 6+ messages in thread
* RE: [PATCH] drm/amdgpu: Update RAS init handling
2020-09-09 9:24 Clements, John
2020-09-09 10:34 ` Zhang, Hawking
@ 2020-09-09 11:48 ` Chen, Guchun
1 sibling, 0 replies; 6+ messages in thread
From: Chen, Guchun @ 2020-09-09 11:48 UTC (permalink / raw)
To: Clements, John, amd-gfx list, Zhang, Hawking
[-- Attachment #1.1: Type: text/plain, Size: 596 bytes --]
[AMD Public Use]
+ if (!adev->psp.ras.ras_initialized)
+ return -EINVAL;
+
if (!con)
return -EINVAL;
I suggest squashing the new check into the one below.
Regards,
Guchun
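A minimal sketch of what the squashed check might look like, assuming both conditions should keep returning -EINVAL. The types struct ras_state and struct ras_manager and the function ras_feature_enable() are hypothetical stand-ins, not the real amdgpu definitions.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for adev->psp.ras. */
struct ras_state {
	bool ras_initialized;
};

/* Hypothetical stand-in for the RAS manager context ("con" in the diff). */
struct ras_manager {
	bool enabled;
};

/*
 * Both preconditions share one test and one error path, which is the
 * effect of squashing the new ras_initialized check into the !con check.
 */
static int ras_feature_enable(const struct ras_state *ras,
			      struct ras_manager *con)
{
	assert(ras != NULL);

	if (!con || !ras->ras_initialized)
		return -EINVAL;

	con->enabled = true;
	return 0;
}
```

Because both checks return the same error code, merging them does not change what the caller observes; it only removes the duplicated return.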
From: amd-gfx <amd-gfx-bounces@lists.freedesktop.org> On Behalf Of Clements, John
Sent: Wednesday, September 9, 2020 5:24 PM
To: amd-gfx list <amd-gfx@lists.freedesktop.org>; Zhang, Hawking <Hawking.Zhang@amd.com>
Subject: [PATCH] drm/amdgpu: Update RAS init handling
[AMD Official Use Only - Internal Distribution Only]
[-- Attachment #1.2: Type: text/html, Size: 4174 bytes --]
^ permalink raw reply [flat|nested] 6+ messages in thread
* RE: [PATCH] drm/amdgpu: Update RAS init handling
2020-09-09 9:24 Clements, John
@ 2020-09-09 10:34 ` Zhang, Hawking
2020-09-09 11:48 ` Chen, Guchun
1 sibling, 0 replies; 6+ messages in thread
From: Zhang, Hawking @ 2020-09-09 10:34 UTC (permalink / raw)
To: Clements, John, amd-gfx list
[-- Attachment #1.1: Type: text/plain, Size: 503 bytes --]
[AMD Public Use]
The patch is
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
BTW, please use git send-email for code review going forward, instead of attaching the patch.
Regards,
Hawking
From: Clements, John <John.Clements@amd.com>
Sent: Wednesday, September 9, 2020 17:24
To: amd-gfx list <amd-gfx@lists.freedesktop.org>; Zhang, Hawking <Hawking.Zhang@amd.com>
Subject: [PATCH] drm/amdgpu: Update RAS init handling
[AMD Official Use Only - Internal Distribution Only]
[-- Attachment #1.2: Type: text/html, Size: 3518 bytes --]
^ permalink raw reply [flat|nested] 6+ messages in thread
* [PATCH] drm/amdgpu: Update RAS init handling
@ 2020-09-09 9:24 Clements, John
2020-09-09 10:34 ` Zhang, Hawking
2020-09-09 11:48 ` Chen, Guchun
0 siblings, 2 replies; 6+ messages in thread
From: Clements, John @ 2020-09-09 9:24 UTC (permalink / raw)
To: amd-gfx list, Zhang, Hawking
[-- Attachment #1.1: Type: text/plain, Size: 58 bytes --]
[AMD Official Use Only - Internal Distribution Only]
[-- Attachment #1.2: Type: text/html, Size: 1760 bytes --]
[-- Attachment #2: 0001-drm-amdgpu-Update-RAS-init-handling.patch --]
[-- Type: application/octet-stream, Size: 1946 bytes --]
From bf9f81158639eb8df72e3c8e89c86e6f4c499355 Mon Sep 17 00:00:00 2001
From: John Clements <john.clements@amd.com>
Date: Wed, 9 Sep 2020 17:19:53 +0800
Subject: [PATCH] drm/amdgpu: Update RAS init handling
Upon RAS init failure:
Output RAS init status in log
Do not attempt further RAS FW functions
Signed-off-by: John Clements <john.clements@amd.com>
Change-Id: Ie59c764f5eec387d04bd676ad4e053bb39f30667
---
drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c | 9 ++++++++-
drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 3 +++
2 files changed, 11 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
index d6c38e24f130..c765d14efb58 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
@@ -929,6 +929,7 @@ static int psp_ras_load(struct psp_context *psp)
{
int ret;
struct psp_gfx_cmd_resp *cmd;
+ struct ta_ras_shared_memory *ras_cmd;
/*
* TODO: bypass the loading in sriov for now
@@ -952,9 +953,15 @@ static int psp_ras_load(struct psp_context *psp)
ret = psp_cmd_submit_buf(psp, NULL, cmd,
psp->fence_buf_mc_addr);
+ ras_cmd = (struct ta_ras_shared_memory*)psp->ras.ras_shared_buf;
+
if (!ret) {
- psp->ras.ras_initialized = true;
psp->ras.session_id = cmd->resp.session_id;
+
+ if (!ras_cmd->ras_status)
+ psp->ras.ras_initialized = true;
+ else
+ dev_warn(psp->adev->dev, "RAS Init Status: 0x%X\n", ras_cmd->ras_status);
}
kfree(cmd);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
index e5ea14774c0c..c659b105b66b 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
@@ -604,6 +604,9 @@ int amdgpu_ras_feature_enable(struct amdgpu_device *adev,
union ta_ras_cmd_input *info;
int ret;
+ if (!adev->psp.ras.ras_initialized)
+ return -EINVAL;
+
if (!con)
return -EINVAL;
--
2.17.1
^ permalink raw reply related [flat|nested] 6+ messages in thread
end of thread, other threads:[~2020-09-14 18:38 UTC | newest]
Thread overview: 6+ messages
2020-09-11 4:02 [PATCH] drm/amdgpu: Update RAS init handling Clements, John
2020-09-11 6:02 ` Zhang, Hawking
2020-09-14 18:38 ` Deucher, Alexander
-- strict thread matches above, loose matches on Subject: below --
2020-09-09 9:24 Clements, John
2020-09-09 10:34 ` Zhang, Hawking
2020-09-09 11:48 ` Chen, Guchun