* [PATCH] drm/amdgpu: skip mes self test for gc 11.0.3 in recover
@ 2022-10-19 3:45 YuBiao Wang
2022-10-19 3:53 ` Luben Tuikov
0 siblings, 1 reply; 5+ messages in thread
From: YuBiao Wang @ 2022-10-19 3:45 UTC (permalink / raw)
To: amd-gfx
Cc: YuBiao Wang, Andrey Grodzovsky, Jack Xiao, Feifei Xu,
horace.chen, Kevin Wang, Tuikov Luben, Deucher Alexander,
Evan Quan, Christian König, Monk Liu, Hawking Zhang
Temporarily disable mes self test for gc 11.0.3 during gpu_recovery.
Signed-off-by: YuBiao Wang <YuBiao.Wang@amd.com>
---
drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index e0445e8cc342..5b8362727226 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -5381,7 +5381,7 @@ int amdgpu_device_gpu_recover(struct amdgpu_device *adev,
drm_sched_start(&ring->sched, !tmp_adev->asic_reset_res);
}
- if (adev->enable_mes)
+ if (adev->enable_mes && adev->ip_versions[GC_HWIP][0] != IP_VERSION(11, 0, 3))
amdgpu_mes_self_test(tmp_adev);
if (!drm_drv_uses_atomic_modeset(adev_to_drm(tmp_adev)) && !job_signaled) {
--
2.25.1
^ permalink raw reply related [flat|nested] 5+ messages in thread
* Re: [PATCH] drm/amdgpu: skip mes self test for gc 11.0.3 in recover
2022-10-19 3:45 [PATCH] drm/amdgpu: skip mes self test for gc 11.0.3 in recover YuBiao Wang
@ 2022-10-19 3:53 ` Luben Tuikov
2022-10-19 4:50 ` Wang, YuBiao
2022-10-20 2:44 ` Wang, YuBiao
0 siblings, 2 replies; 5+ messages in thread
From: Luben Tuikov @ 2022-10-19 3:53 UTC (permalink / raw)
To: YuBiao Wang, amd-gfx
Cc: Andrey Grodzovsky, Jack Xiao, Feifei Xu, horace.chen, Kevin Wang,
Deucher Alexander, Evan Quan, Christian König, Monk Liu,
Hawking Zhang
On 2022-10-18 23:45, YuBiao Wang wrote:
> Temporarily disable mes self test for gc 11.0.3 during gpu_recovery.
>
Is this "temporary" as in "we'll revert this commit later", or is
it "temporary" as in the code execution itself?
> Signed-off-by: YuBiao Wang <YuBiao.Wang@amd.com>
> ---
> drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> index e0445e8cc342..5b8362727226 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> @@ -5381,7 +5381,7 @@ int amdgpu_device_gpu_recover(struct amdgpu_device *adev,
> drm_sched_start(&ring->sched, !tmp_adev->asic_reset_res);
> }
>
> - if (adev->enable_mes)
> + if (adev->enable_mes && adev->ip_versions[GC_HWIP][0] != IP_VERSION(11, 0, 3))
> amdgpu_mes_self_test(tmp_adev);
Is this just for this version of the IP or this and any newer versions?
Regards,
Luben
>
> if (!drm_drv_uses_atomic_modeset(adev_to_drm(tmp_adev)) && !job_signaled) {
^ permalink raw reply [flat|nested] 5+ messages in thread
* RE: [PATCH] drm/amdgpu: skip mes self test for gc 11.0.3 in recover
2022-10-19 3:53 ` Luben Tuikov
@ 2022-10-19 4:50 ` Wang, YuBiao
2022-10-20 2:44 ` Wang, YuBiao
1 sibling, 0 replies; 5+ messages in thread
From: Wang, YuBiao @ 2022-10-19 4:50 UTC (permalink / raw)
To: Tuikov, Luben, amd-gfx
Cc: Grodzovsky, Andrey, Xiao, Jack, Wang, Yang(Kevin),
Xu, Feifei, Chen, Horace, Deucher, Alexander, Quan, Evan, Koenig,
Christian, Liu, Monk, Zhang, Hawking
Hi Luben,
As far as I know, this is only for gc 11.0.3. The mes self test is also skipped in mes late init for this version of the IP.
Best Regards,
Yubiao Wang
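
The difference between "only this version" and "this and any newer versions" comes down to the comparison operator applied to the packed IP version. The following is a minimal standalone sketch, not driver code: the IP_VERSION macro is modeled on the driver's packing of major/minor/revision into one integer, and the two helper names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Packs major/minor/revision into one integer; modeled on the driver's macro. */
#define IP_VERSION(mj, mn, rv) (((mj) << 16) | ((mn) << 8) | (rv))

/* Hypothetical helper: exact match, which is what the patch's "!=" implies. */
static bool skip_self_test_exact(uint32_t gc_version)
{
	return gc_version == IP_VERSION(11, 0, 3);
}

/* Hypothetical helper: range match, which would also cover newer versions. */
static bool skip_self_test_from(uint32_t gc_version)
{
	return gc_version >= IP_VERSION(11, 0, 3);
}
```

With the exact match, a later revision such as 11.0.4 would still run the self test; with the range match it would not. The patch as posted uses the exact form.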
^ permalink raw reply [flat|nested] 5+ messages in thread
* RE: [PATCH] drm/amdgpu: skip mes self test for gc 11.0.3 in recover
2022-10-19 3:53 ` Luben Tuikov
2022-10-19 4:50 ` Wang, YuBiao
@ 2022-10-20 2:44 ` Wang, YuBiao
2022-10-20 7:03 ` Luben Tuikov
1 sibling, 1 reply; 5+ messages in thread
From: Wang, YuBiao @ 2022-10-20 2:44 UTC (permalink / raw)
To: Tuikov, Luben, amd-gfx, Zhang, Hawking
Cc: Grodzovsky, Andrey, Xiao, Jack, Wang, Yang(Kevin),
Xu, Feifei, Chen, Horace, Deucher, Alexander, Quan, Evan, Koenig,
Christian, Liu, Monk
Hi Luben,
>> Is this "temporary" as in "we'll revert this commit later", or is it "temporary" as in the code execution itself?
>> Is this just for this version of the IP or this and any newer versions?
I suppose that it is meant to be reverted later. There is a similar patch in commit c25a7a8bf19a98578ad27aaaa78082276ea1557c, which also temporarily skips the mes self test only for gc_11.0.3 during mes late init and was reviewed by @Zhang, Hawking. My patch also skips the mes self test during gpu recover, since the self test also causes a failure during reset.
Best Regards,
Yubiao Wang
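
Since the same exclusion now applies in two places (late init and gpu recover), one way to picture it is a single predicate consulted by both paths. The sketch below is illustrative only and assumes a simplified device model: every `sketch_*` name is hypothetical, and neither the patch nor the earlier commit is claimed to be structured this way.

```c
#include <stdbool.h>
#include <stdint.h>

/* Packs major/minor/revision into one integer; modeled on the driver's macro. */
#define IP_VERSION(mj, mn, rv) (((mj) << 16) | ((mn) << 8) | (rv))

/* Hypothetical device model; not the real struct amdgpu_device. */
struct sketch_dev {
	bool enable_mes;
	uint32_t gc_version;
	int self_tests_run;	/* counts simulated self-test invocations */
};

/* Single predicate shared by both paths: MES enabled and GC is not 11.0.3. */
static bool sketch_mes_self_test_allowed(const struct sketch_dev *dev)
{
	return dev->enable_mes && dev->gc_version != IP_VERSION(11, 0, 3);
}

/* Stand-in for the self test: only records that it ran. */
static void sketch_mes_self_test(struct sketch_dev *dev)
{
	dev->self_tests_run++;
}

/* Simulated late-init path. */
static void sketch_late_init(struct sketch_dev *dev)
{
	if (sketch_mes_self_test_allowed(dev))
		sketch_mes_self_test(dev);
}

/* Simulated recovery path; the reset itself is omitted. */
static void sketch_gpu_recover(struct sketch_dev *dev)
{
	if (sketch_mes_self_test_allowed(dev))
		sketch_mes_self_test(dev);
}

/* Runs both paths once and returns how many self tests were executed. */
static int sketch_run_both(bool enable_mes, uint32_t gc_version)
{
	struct sketch_dev dev = { enable_mes, gc_version, 0 };

	sketch_late_init(&dev);
	sketch_gpu_recover(&dev);
	return dev.self_tests_run;
}
```

In this model a gc 11.0.3 device runs no self test on either path, while any other MES-enabled version runs it on both.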
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH] drm/amdgpu: skip mes self test for gc 11.0.3 in recover
2022-10-20 2:44 ` Wang, YuBiao
@ 2022-10-20 7:03 ` Luben Tuikov
0 siblings, 0 replies; 5+ messages in thread
From: Luben Tuikov @ 2022-10-20 7:03 UTC (permalink / raw)
To: Wang, YuBiao, amd-gfx, Zhang, Hawking
Cc: Grodzovsky, Andrey, Xiao, Jack, Wang, Yang(Kevin),
Xu, Feifei, Chen, Horace, Deucher, Alexander, Quan, Evan, Koenig,
Christian, Liu, Monk
Hi YuBiao,
Ah, okay, there's a precedent for such a change then.
Acked-by: Luben Tuikov <luben.tuikov@amd.com>
Regards,
Luben
^ permalink raw reply [flat|nested] 5+ messages in thread