* [PATCH] drm/amdgpu: always force full reset for SOC21
@ 2024-03-24  0:52 Alex Deucher
  2024-03-24 10:16 ` Friedrich Vock
  2024-04-02  8:21 ` Christian König
  0 siblings, 2 replies; 5+ messages in thread
From: Alex Deucher @ 2024-03-24  0:52 UTC (permalink / raw)
  To: amd-gfx; +Cc: Alex Deucher

There are cases where soft reset seems to succeed, but
does not, so always use mode1/2 for now.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/soc21.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/soc21.c b/drivers/gpu/drm/amd/amdgpu/soc21.c
index 581a3bd11481..8526282f4da1 100644
--- a/drivers/gpu/drm/amd/amdgpu/soc21.c
+++ b/drivers/gpu/drm/amd/amdgpu/soc21.c
@@ -457,10 +457,8 @@ static bool soc21_need_full_reset(struct amdgpu_device *adev)
 {
 	switch (amdgpu_ip_version(adev, GC_HWIP, 0)) {
 	case IP_VERSION(11, 0, 0):
-		return amdgpu_ras_is_supported(adev, AMDGPU_RAS_BLOCK__UMC);
 	case IP_VERSION(11, 0, 2):
 	case IP_VERSION(11, 0, 3):
-		return false;
 	default:
 		return true;
 	}
-- 
2.44.0
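
For readers following the thread, the effect of the two deletions can be sketched as a standalone C snippet. This is not the driver code: IP_VERSION() below is a simplified stand-in for the kernel macro, the function names need_full_reset_before() and need_full_reset_after() are hypothetical, and the umc_ras_supported flag stands in for the amdgpu_ras_is_supported() call shown in the diff.

```c
#include <stdbool.h>

/* Simplified stand-in for the kernel's IP_VERSION() macro (assumption). */
#define IP_VERSION(mj, mn, rv) (((mj) << 16) | ((mn) << 8) | (rv))

/*
 * Logic before the patch: GC 11.0.0 needs a full reset only when UMC RAS
 * is supported, GC 11.0.2 and 11.0.3 never do, everything else always does.
 */
static bool need_full_reset_before(unsigned int gc_version,
				   bool umc_ras_supported)
{
	switch (gc_version) {
	case IP_VERSION(11, 0, 0):
		return umc_ras_supported;
	case IP_VERSION(11, 0, 2):
	case IP_VERSION(11, 0, 3):
		return false;
	default:
		return true;
	}
}

/*
 * Logic after the patch: with both early returns removed, the three
 * case labels fall through to default and a full reset is always requested.
 */
static bool need_full_reset_after(unsigned int gc_version)
{
	switch (gc_version) {
	case IP_VERSION(11, 0, 0):
	case IP_VERSION(11, 0, 2):
	case IP_VERSION(11, 0, 3):
	default:
		return true;
	}
}
```

With both returns removed, the three GC 11.0.x cases fall through to the default label, so the function reports that a full reset is needed for every GC version.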



* Re: [PATCH] drm/amdgpu: always force full reset for SOC21
  2024-03-24  0:52 [PATCH] drm/amdgpu: always force full reset for SOC21 Alex Deucher
@ 2024-03-24 10:16 ` Friedrich Vock
  2024-03-25 15:01   ` Alex Deucher
  2024-04-02  8:21 ` Christian König
  1 sibling, 1 reply; 5+ messages in thread
From: Friedrich Vock @ 2024-03-24 10:16 UTC (permalink / raw)
  To: Alex Deucher, amd-gfx

On 24.03.24 01:52, Alex Deucher wrote:
> There are cases where soft reset seems to succeed, but
> does not, so always use mode1/2 for now.

Does "for now" mean that a proper fix is being worked on/will appear later?

Immediately falling back to full resets is a really bad experience, and
it's especially catastrophic when only MODE1 is available.

Of course, soft resets succeeding but leaving the GPU in a faulty state
isn't acceptable either, but I think it's pretty important to keep the
ability to do soft resets if at all possible.

If it's not possible to wait with this until the proper fix is
available, I hope that at least it can be reverted soon.

Thanks,
Friedrich



* Re: [PATCH] drm/amdgpu: always force full reset for SOC21
  2024-03-24 10:16 ` Friedrich Vock
@ 2024-03-25 15:01   ` Alex Deucher
  2024-03-26 23:31     ` Kasiviswanathan, Harish
  0 siblings, 1 reply; 5+ messages in thread
From: Alex Deucher @ 2024-03-25 15:01 UTC (permalink / raw)
  To: Friedrich Vock; +Cc: Alex Deucher, amd-gfx

On Sun, Mar 24, 2024 at 6:42 AM Friedrich Vock <friedrich.vock@gmx.de> wrote:
>
> On 24.03.24 01:52, Alex Deucher wrote:
> > There are cases where soft reset seems to succeed, but
> > does not, so always use mode1/2 for now.
>
> Does "for now" mean that a proper fix is being worked on/will appear later?
>
> Immediately falling back to full resets is a really bad experience, and
> it's especially catastrophic when only MODE1 is available.
>
> Of course, soft resets succeeding but leaving the GPU in a faulty state
> isn't acceptable either, but I think it's pretty important to keep the
> ability to do soft resets if at all possible.
>
> If it's not possible to wait with this until the proper fix is
> available, I hope that at least it can be reverted soon.

Yes, it's being actively debugged.

Alex



* RE: [PATCH] drm/amdgpu: always force full reset for SOC21
  2024-03-25 15:01   ` Alex Deucher
@ 2024-03-26 23:31     ` Kasiviswanathan, Harish
  0 siblings, 0 replies; 5+ messages in thread
From: Kasiviswanathan, Harish @ 2024-03-26 23:31 UTC (permalink / raw)
  To: Alex Deucher, Friedrich Vock; +Cc: Deucher, Alexander, amd-gfx

[AMD Official Use Only - General]

Reviewed-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>



* Re: [PATCH] drm/amdgpu: always force full reset for SOC21
  2024-03-24  0:52 [PATCH] drm/amdgpu: always force full reset for SOC21 Alex Deucher
  2024-03-24 10:16 ` Friedrich Vock
@ 2024-04-02  8:21 ` Christian König
  1 sibling, 0 replies; 5+ messages in thread
From: Christian König @ 2024-04-02  8:21 UTC (permalink / raw)
  To: Alex Deucher, amd-gfx

On 24.03.24 at 01:52, Alex Deucher wrote:
> There are cases where soft reset seems to succeed, but
> does not, so always use mode1/2 for now.
>
> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

Acked-by: Christian König <christian.koenig@amd.com>

IIRC I've requested some changes to how soft reset is done for SOC21 but
never found the time to actually go over the new specification.

We probably just need to adjust the soft recovery code for those
new hardware generations.

Regards,
Christian.



