* [PATCH] drm/amdgpu: record error code when ring test failed
@ 2016-08-30  9:59 Chunming Zhou
       [not found] ` <1472551169-29801-1-git-send-email-David1.Zhou-5C7GfCeVMHo@public.gmane.org>
  0 siblings, 1 reply; 6+ messages in thread
From: Chunming Zhou @ 2016-08-30  9:59 UTC
  To: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW; +Cc: Chunming Zhou

Change-Id: I3a59f602a4d5ec42c8c184daa14eb8194b0dab9e
Signed-off-by: Chunming Zhou <David1.Zhou@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
index f5810f7..8c17888 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
@@ -280,7 +280,7 @@ void amdgpu_ib_pool_fini(struct amdgpu_device *adev)
 int amdgpu_ib_ring_tests(struct amdgpu_device *adev)
 {
 	unsigned i;
-	int r;
+	int r, ret = 0;
 
 	for (i = 0; i < AMDGPU_MAX_RINGS; ++i) {
 		struct amdgpu_ring *ring = adev->rings[i];
@@ -301,10 +301,11 @@ int amdgpu_ib_ring_tests(struct amdgpu_device *adev)
 			} else {
 				/* still not good, but we can live with it */
 				DRM_ERROR("amdgpu: failed testing IB on ring %d (%d).\n", i, r);
+				ret = r;
 			}
 		}
 	}
-	return 0;
+	return ret;
 }
 
 /*
-- 
1.9.1

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


* Re: [PATCH] drm/amdgpu: record error code when ring test failed
       [not found] ` <1472551169-29801-1-git-send-email-David1.Zhou-5C7GfCeVMHo@public.gmane.org>
@ 2016-08-30 12:24   ` Christian König
  2016-08-30 14:24   ` Deucher, Alexander
  1 sibling, 0 replies; 6+ messages in thread
From: Christian König @ 2016-08-30 12:24 UTC
  To: Chunming Zhou, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

On 30.08.2016 at 11:59, Chunming Zhou wrote:
> Change-Id: I3a59f602a4d5ec42c8c184daa14eb8194b0dab9e
> Signed-off-by: Chunming Zhou <David1.Zhou@amd.com>

Reviewed-by: Christian König <christian.koenig@amd.com>

> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c | 5 +++--
>   1 file changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
> index f5810f7..8c17888 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
> @@ -280,7 +280,7 @@ void amdgpu_ib_pool_fini(struct amdgpu_device *adev)
>   int amdgpu_ib_ring_tests(struct amdgpu_device *adev)
>   {
>   	unsigned i;
> -	int r;
> +	int r, ret = 0;
>   
>   	for (i = 0; i < AMDGPU_MAX_RINGS; ++i) {
>   		struct amdgpu_ring *ring = adev->rings[i];
> @@ -301,10 +301,11 @@ int amdgpu_ib_ring_tests(struct amdgpu_device *adev)
>   			} else {
>   				/* still not good, but we can live with it */
>   				DRM_ERROR("amdgpu: failed testing IB on ring %d (%d).\n", i, r);
> +				ret = r;
>   			}
>   		}
>   	}
> -	return 0;
> +	return ret;
>   }
>   
>   /*



* RE: [PATCH] drm/amdgpu: record error code when ring test failed
       [not found] ` <1472551169-29801-1-git-send-email-David1.Zhou-5C7GfCeVMHo@public.gmane.org>
  2016-08-30 12:24   ` Christian König
@ 2016-08-30 14:24   ` Deucher, Alexander
       [not found]     ` <MWHPR12MB1694B57F7A0C9E4B37637C95F7E00-Gy0DoCVfaSW4WA4dJ5YXGAdYzm3356FpvxpqHgZTriW3zl9H0oFU5g@public.gmane.org>
  1 sibling, 1 reply; 6+ messages in thread
From: Deucher, Alexander @ 2016-08-30 14:24 UTC
  To: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW; +Cc: Zhou, David(ChunMing)

> -----Original Message-----
> From: amd-gfx [mailto:amd-gfx-bounces@lists.freedesktop.org] On Behalf
> Of Chunming Zhou
> Sent: Tuesday, August 30, 2016 5:59 AM
> To: amd-gfx@lists.freedesktop.org
> Cc: Zhou, David(ChunMing)
> Subject: [PATCH] drm/amdgpu: record error code when ring test failed
> 
> Change-Id: I3a59f602a4d5ec42c8c184daa14eb8194b0dab9e
> Signed-off-by: Chunming Zhou <David1.Zhou@amd.com>
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
> index f5810f7..8c17888 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
> @@ -280,7 +280,7 @@ void amdgpu_ib_pool_fini(struct amdgpu_device
> *adev)
>  int amdgpu_ib_ring_tests(struct amdgpu_device *adev)
>  {
>  	unsigned i;
> -	int r;
> +	int r, ret = 0;
> 
>  	for (i = 0; i < AMDGPU_MAX_RINGS; ++i) {
>  		struct amdgpu_ring *ring = adev->rings[i];
> @@ -301,10 +301,11 @@ int amdgpu_ib_ring_tests(struct amdgpu_device
> *adev)
>  			} else {
>  				/* still not good, but we can live with it */
>  				DRM_ERROR("amdgpu: failed testing IB on
> ring %d (%d).\n", i, r);
> +				ret = r;

Hmm, I think that was intentional so as not to fail completely even if some of the engines aren't working.

Alex

>  			}
>  		}
>  	}
> -	return 0;
> +	return ret;
>  }
> 
>  /*
> --
> 1.9.1
> 

* Re: [PATCH] drm/amdgpu: record error code when ring test failed
       [not found]     ` <MWHPR12MB1694B57F7A0C9E4B37637C95F7E00-Gy0DoCVfaSW4WA4dJ5YXGAdYzm3356FpvxpqHgZTriW3zl9H0oFU5g@public.gmane.org>
@ 2016-08-30 15:04       ` Christian König
       [not found]         ` <03ed1bf9-b4f8-be92-2653-f19109310103-ANTagKRnAhcb1SvskN2V4Q@public.gmane.org>
  2016-08-31  1:44       ` zhoucm1
  1 sibling, 1 reply; 6+ messages in thread
From: Christian König @ 2016-08-30 15:04 UTC
  To: Deucher, Alexander, Zhou, David(ChunMing),
	amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

On 30.08.2016 at 16:24, Deucher, Alexander wrote:
>> -----Original Message-----
>> From: amd-gfx [mailto:amd-gfx-bounces@lists.freedesktop.org] On Behalf
>> Of Chunming Zhou
>> Sent: Tuesday, August 30, 2016 5:59 AM
>> To: amd-gfx@lists.freedesktop.org
>> Cc: Zhou, David(ChunMing)
>> Subject: [PATCH] drm/amdgpu: record error code when ring test failed
>>
>> Change-Id: I3a59f602a4d5ec42c8c184daa14eb8194b0dab9e
>> Signed-off-by: Chunming Zhou <David1.Zhou@amd.com>
>> ---
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c | 5 +++--
>>   1 file changed, 3 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
>> index f5810f7..8c17888 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
>> @@ -280,7 +280,7 @@ void amdgpu_ib_pool_fini(struct amdgpu_device
>> *adev)
>>   int amdgpu_ib_ring_tests(struct amdgpu_device *adev)
>>   {
>>   	unsigned i;
>> -	int r;
>> +	int r, ret = 0;
>>
>>   	for (i = 0; i < AMDGPU_MAX_RINGS; ++i) {
>>   		struct amdgpu_ring *ring = adev->rings[i];
>> @@ -301,10 +301,11 @@ int amdgpu_ib_ring_tests(struct amdgpu_device
>> *adev)
>>   			} else {
>>   				/* still not good, but we can live with it */
>>   				DRM_ERROR("amdgpu: failed testing IB on
>> ring %d (%d).\n", i, r);
>> +				ret = r;
> Hmm, I think that was intentional so as not to fail completely even if some of the engines aren't working.

Yeah, I've had the same concern so I double checked it. The driver just 
prints an additional error message and continues with the startup.

In general I think it makes sense to return an error here, because then we 
can easily identify cases where we need to fall back to a full engine reset.
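
To illustrate the behaviour described above, here is a minimal, self-contained sketch of a test loop that keeps going after a failure but remembers the last error code, together with a startup caller that only logs it. It is not the driver code: `sketch_ring`, `sketch_ib_ring_tests`, `sketch_device_init` and `SKETCH_MAX_RINGS` are hypothetical names, each ring's test outcome is simply stored in a field, and the branch that precedes the `else` in the diff is left out.

```c
#include <errno.h>
#include <stddef.h>
#include <stdio.h>

#define SKETCH_MAX_RINGS 4	/* hypothetical stand-in for AMDGPU_MAX_RINGS */

/* Hypothetical ring: test_result is what its IB test would return. */
struct sketch_ring {
	int ready;
	int test_result;
};

/*
 * Test every ready ring. A failure does not stop the loop, but the
 * most recent error code is remembered and returned to the caller.
 */
static int sketch_ib_ring_tests(struct sketch_ring *rings[SKETCH_MAX_RINGS])
{
	unsigned int i;
	int r, ret = 0;

	for (i = 0; i < SKETCH_MAX_RINGS; ++i) {
		struct sketch_ring *ring = rings[i];

		if (!ring || !ring->ready)
			continue;

		r = ring->test_result;
		if (r) {
			/* still not good, but we can live with it */
			fprintf(stderr, "failed testing IB on ring %u (%d)\n", i, r);
			ret = r;
		}
	}
	return ret;
}

/* Startup caller (assumed shape): log the error and carry on. */
static int sketch_device_init(struct sketch_ring *rings[SKETCH_MAX_RINGS])
{
	int r = sketch_ib_ring_tests(rings);

	if (r)
		fprintf(stderr, "ib ring test failed (%d), continuing\n", r);
	return 0;
}
```

With one failing ring among several working ones, `sketch_ib_ring_tests()` reports that ring's error code while `sketch_device_init()` still returns 0, so startup is not aborted.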

Christian.

> Alex
>
>>   			}
>>   		}
>>   	}
>> -	return 0;
>> +	return ret;
>>   }
>>
>>   /*
>> --
>> 1.9.1
>>

* Re: [PATCH] drm/amdgpu: record error code when ring test failed
       [not found]         ` <03ed1bf9-b4f8-be92-2653-f19109310103-ANTagKRnAhcb1SvskN2V4Q@public.gmane.org>
@ 2016-08-30 15:06           ` Alex Deucher
  0 siblings, 0 replies; 6+ messages in thread
From: Alex Deucher @ 2016-08-30 15:06 UTC
  To: Christian König
  Cc: Deucher, Alexander, Zhou, David(ChunMing),
	amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

On Tue, Aug 30, 2016 at 11:04 AM, Christian König
<deathsimple@vodafone.de> wrote:
> On 30.08.2016 at 16:24, Deucher, Alexander wrote:
>>>
>>> -----Original Message-----
>>> From: amd-gfx [mailto:amd-gfx-bounces@lists.freedesktop.org] On Behalf
>>> Of Chunming Zhou
>>> Sent: Tuesday, August 30, 2016 5:59 AM
>>> To: amd-gfx@lists.freedesktop.org
>>> Cc: Zhou, David(ChunMing)
>>> Subject: [PATCH] drm/amdgpu: record error code when ring test failed
>>>
>>> Change-Id: I3a59f602a4d5ec42c8c184daa14eb8194b0dab9e
>>> Signed-off-by: Chunming Zhou <David1.Zhou@amd.com>
>>> ---
>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c | 5 +++--
>>>   1 file changed, 3 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
>>> index f5810f7..8c17888 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
>>> @@ -280,7 +280,7 @@ void amdgpu_ib_pool_fini(struct amdgpu_device
>>> *adev)
>>>   int amdgpu_ib_ring_tests(struct amdgpu_device *adev)
>>>   {
>>>         unsigned i;
>>> -       int r;
>>> +       int r, ret = 0;
>>>
>>>         for (i = 0; i < AMDGPU_MAX_RINGS; ++i) {
>>>                 struct amdgpu_ring *ring = adev->rings[i];
>>> @@ -301,10 +301,11 @@ int amdgpu_ib_ring_tests(struct amdgpu_device
>>> *adev)
>>>                         } else {
>>>                                 /* still not good, but we can live with
>>> it */
>>>                                 DRM_ERROR("amdgpu: failed testing IB on
>>> ring %d (%d).\n", i, r);
>>> +                               ret = r;
>>
>> Hmm, I think that was intentional so as not to fail completely even if
>> some of the engines aren't working.
>
>
> Yeah, I've had the same concern so I double checked it. The driver just
> prints an additional error message and continues with the startup.
>
> In general I think it makes sense to return an error here, because then we can
> easily identify cases where we need to fall back to a full engine reset.
>

Thanks for checking.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>

> Christian.
>
>
>> Alex
>>
>>>                         }
>>>                 }
>>>         }
>>> -       return 0;
>>> +       return ret;
>>>   }
>>>
>>>   /*
>>> --
>>> 1.9.1
>>>

* Re: [PATCH] drm/amdgpu: record error code when ring test failed
       [not found]     ` <MWHPR12MB1694B57F7A0C9E4B37637C95F7E00-Gy0DoCVfaSW4WA4dJ5YXGAdYzm3356FpvxpqHgZTriW3zl9H0oFU5g@public.gmane.org>
  2016-08-30 15:04       ` Christian König
@ 2016-08-31  1:44       ` zhoucm1
  1 sibling, 0 replies; 6+ messages in thread
From: zhoucm1 @ 2016-08-31  1:44 UTC
  To: Deucher, Alexander, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW



On 2016-08-30 22:24, Deucher, Alexander wrote:
>> -----Original Message-----
>> From: amd-gfx [mailto:amd-gfx-bounces@lists.freedesktop.org] On Behalf
>> Of Chunming Zhou
>> Sent: Tuesday, August 30, 2016 5:59 AM
>> To: amd-gfx@lists.freedesktop.org
>> Cc: Zhou, David(ChunMing)
>> Subject: [PATCH] drm/amdgpu: record error code when ring test failed
>>
>> Change-Id: I3a59f602a4d5ec42c8c184daa14eb8194b0dab9e
>> Signed-off-by: Chunming Zhou <David1.Zhou@amd.com>
>> ---
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c | 5 +++--
>>   1 file changed, 3 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
>> index f5810f7..8c17888 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
>> @@ -280,7 +280,7 @@ void amdgpu_ib_pool_fini(struct amdgpu_device
>> *adev)
>>   int amdgpu_ib_ring_tests(struct amdgpu_device *adev)
>>   {
>>   	unsigned i;
>> -	int r;
>> +	int r, ret = 0;
>>
>>   	for (i = 0; i < AMDGPU_MAX_RINGS; ++i) {
>>   		struct amdgpu_ring *ring = adev->rings[i];
>> @@ -301,10 +301,11 @@ int amdgpu_ib_ring_tests(struct amdgpu_device
>> *adev)
>>   			} else {
>>   				/* still not good, but we can live with it */
>>   				DRM_ERROR("amdgpu: failed testing IB on
>> ring %d (%d).\n", i, r);
>> +				ret = r;
> Hmm, I think that was intentional so as not to fail completely even if some of the engines aren't working.
This failure sometimes occurs after a GPU reset, and the GPU can then hang 
if the error is ignored.
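
As a rough sketch of why the return value matters on the reset path, the following self-contained example escalates to a full reset only when the ring tests report an error after a lighter reset. Everything here is hypothetical and for illustration only: `sketch_device`, `sketch_ring_tests` and `sketch_gpu_reset` are not driver functions, and the test outcomes are plain fields rather than real hardware checks.

```c
#include <errno.h>
#include <stdio.h>

/* Hypothetical device state; none of these names come from the driver. */
struct sketch_device {
	int result_after_soft;	/* ring test result after a soft reset */
	int result_after_full;	/* ring test result after a full reset */
	int full_resets;	/* how many full resets were attempted */
};

/* Stand-in for the ring tests: returns 0 or a negative error code. */
static int sketch_ring_tests(const struct sketch_device *dev, int after_full)
{
	return after_full ? dev->result_after_full : dev->result_after_soft;
}

/*
 * Reset path (assumed shape): if the ring tests fail after the soft
 * reset, do not ignore the error but escalate to a full reset.
 */
static int sketch_gpu_reset(struct sketch_device *dev)
{
	int r;

	/* a soft reset would be performed here */
	r = sketch_ring_tests(dev, 0);
	if (!r)
		return 0;

	fprintf(stderr, "ring test failed after soft reset (%d), doing full reset\n", r);
	dev->full_resets++;

	/* a full reset would be performed here */
	return sketch_ring_tests(dev, 1);
}
```

If the ring tests always returned 0, as before the patch, a caller of this shape could not tell that the lighter reset left a ring unusable.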

Regards,
David Zhou
>
> Alex
>
>>   			}
>>   		}
>>   	}
>> -	return 0;
>> +	return ret;
>>   }
>>
>>   /*
>> --
>> 1.9.1
>>

end of thread

Thread overview: 6+ messages
2016-08-30  9:59 [PATCH] drm/amdgpu: record error code when ring test failed Chunming Zhou
     [not found] ` <1472551169-29801-1-git-send-email-David1.Zhou-5C7GfCeVMHo@public.gmane.org>
2016-08-30 12:24   ` Christian König
2016-08-30 14:24   ` Deucher, Alexander
     [not found]     ` <MWHPR12MB1694B57F7A0C9E4B37637C95F7E00-Gy0DoCVfaSW4WA4dJ5YXGAdYzm3356FpvxpqHgZTriW3zl9H0oFU5g@public.gmane.org>
2016-08-30 15:04       ` Christian König
     [not found]         ` <03ed1bf9-b4f8-be92-2653-f19109310103-ANTagKRnAhcb1SvskN2V4Q@public.gmane.org>
2016-08-30 15:06           ` Alex Deucher
2016-08-31  1:44       ` zhoucm1
