stable.vger.kernel.org archive mirror
* [PATCH REGRESSION] Revert "drm/amdgpu: stop scheduler when calling hw_fini (v2)"
@ 2022-01-09 18:11 Len Brown
  2022-01-10 16:08 ` Deucher, Alexander
  0 siblings, 1 reply; 6+ messages in thread
From: Len Brown @ 2022-01-09 18:11 UTC (permalink / raw)
  To: torvalds
  Cc: linux-pm, linux-kernel, Len Brown, Guchun Chen,
	Andrey Grodzovsky, Christian Koenig, Alex Deucher, stable

From: Len Brown <len.brown@intel.com>

This reverts commit f7d6779df642720e22bffd449e683bb8690bd3bf.

This bisected regression has impacted suspend-resume stability
since 5.15-rc1. It regressed -stable via 5.14.10.

https://bugzilla.kernel.org/show_bug.cgi?id=215315

Fixes: f7d6779df64 ("drm/amdgpu: stop scheduler when calling hw_fini (v2)")
Cc: Guchun Chen <guchun.chen@amd.com>
Cc: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Cc: Christian Koenig <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: <stable@vger.kernel.org> # 5.14+
Signed-off-by: Len Brown <len.brown@intel.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c | 8 --------
 1 file changed, 8 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
index 9afd11ca2709..45977a72b5dd 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
@@ -547,9 +547,6 @@ void amdgpu_fence_driver_hw_fini(struct amdgpu_device *adev)
 		if (!ring || !ring->fence_drv.initialized)
 			continue;
 
-		if (!ring->no_scheduler)
-			drm_sched_stop(&ring->sched, NULL);
-
 		/* You can't wait for HW to signal if it's gone */
 		if (!drm_dev_is_unplugged(adev_to_drm(adev)))
 			r = amdgpu_fence_wait_empty(ring);
@@ -609,11 +606,6 @@ void amdgpu_fence_driver_hw_init(struct amdgpu_device *adev)
 		if (!ring || !ring->fence_drv.initialized)
 			continue;
 
-		if (!ring->no_scheduler) {
-			drm_sched_resubmit_jobs(&ring->sched);
-			drm_sched_start(&ring->sched, true);
-		}
-
 		/* enable the interrupt */
 		if (ring->fence_drv.irq_src)
 			amdgpu_irq_get(adev, ring->fence_drv.irq_src,
-- 
2.25.1



* RE: [PATCH REGRESSION] Revert "drm/amdgpu: stop scheduler when calling hw_fini (v2)"
  2022-01-09 18:11 [PATCH REGRESSION] Revert "drm/amdgpu: stop scheduler when calling hw_fini (v2)" Len Brown
@ 2022-01-10 16:08 ` Deucher, Alexander
  2022-01-10 16:16   ` Christian König
  0 siblings, 1 reply; 6+ messages in thread
From: Deucher, Alexander @ 2022-01-10 16:08 UTC (permalink / raw)
  To: Len Brown, torvalds, Chen, Guchun, Grodzovsky, Andrey, Koenig, Christian
  Cc: linux-pm, linux-kernel, Len Brown, stable

[Public]

> -----Original Message-----
> From: Len Brown <lenb417@gmail.com> On Behalf Of Len Brown
> Sent: Sunday, January 9, 2022 1:12 PM
> To: torvalds@linux-foundation.org
> Cc: linux-pm@vger.kernel.org; linux-kernel@vger.kernel.org; Len Brown
> <len.brown@intel.com>; Chen, Guchun <Guchun.Chen@amd.com>;
> Grodzovsky, Andrey <Andrey.Grodzovsky@amd.com>; Koenig, Christian
> <Christian.Koenig@amd.com>; Deucher, Alexander
> <Alexander.Deucher@amd.com>; stable@vger.kernel.org
> Subject: [PATCH REGRESSION] Revert "drm/amdgpu: stop scheduler when
> calling hw_fini (v2)"
> 
> From: Len Brown <len.brown@intel.com>
> 
> This reverts commit f7d6779df642720e22bffd449e683bb8690bd3bf.
> 
> This bisected regression has impacted suspend-resume stability since
> 5.15-rc1. It regressed -stable via 5.14.10.
> 
> https://bugzilla.kernel.org/show_bug.cgi?id=215315
> 
> Fixes: f7d6779df64 ("drm/amdgpu: stop scheduler when calling hw_fini (v2)")
> Cc: Guchun Chen <guchun.chen@amd.com>
> Cc: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
> Cc: Christian Koenig <christian.koenig@amd.com>
> Cc: Alex Deucher <alexander.deucher@amd.com>
> Cc: <stable@vger.kernel.org> # 5.14+
> Signed-off-by: Len Brown <len.brown@intel.com>

@Chen, Guchun, @Grodzovsky, Andrey, @Koenig, Christian

Any ideas?  What's the consequence of reverting this patch?  Didn't this patch fix another suspend/resume issue?

Alex

> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c | 8 --------
>  1 file changed, 8 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
> index 9afd11ca2709..45977a72b5dd 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
> @@ -547,9 +547,6 @@ void amdgpu_fence_driver_hw_fini(struct
> amdgpu_device *adev)
>  		if (!ring || !ring->fence_drv.initialized)
>  			continue;
> 
> -		if (!ring->no_scheduler)
> -			drm_sched_stop(&ring->sched, NULL);
> -
>  		/* You can't wait for HW to signal if it's gone */
>  		if (!drm_dev_is_unplugged(adev_to_drm(adev)))
>  			r = amdgpu_fence_wait_empty(ring);
> @@ -609,11 +606,6 @@ void amdgpu_fence_driver_hw_init(struct
> amdgpu_device *adev)
>  		if (!ring || !ring->fence_drv.initialized)
>  			continue;
> 
> -		if (!ring->no_scheduler) {
> -			drm_sched_resubmit_jobs(&ring->sched);
> -			drm_sched_start(&ring->sched, true);
> -		}
> -
>  		/* enable the interrupt */
>  		if (ring->fence_drv.irq_src)
>  			amdgpu_irq_get(adev, ring->fence_drv.irq_src,
> --
> 2.25.1


* Re: [PATCH REGRESSION] Revert "drm/amdgpu: stop scheduler when calling hw_fini (v2)"
  2022-01-10 16:08 ` Deucher, Alexander
@ 2022-01-10 16:16   ` Christian König
  2022-01-10 16:25     ` Deucher, Alexander
  0 siblings, 1 reply; 6+ messages in thread
From: Christian König @ 2022-01-10 16:16 UTC (permalink / raw)
  To: Deucher, Alexander, Len Brown, torvalds, Chen, Guchun,
	Grodzovsky, Andrey
  Cc: linux-pm, linux-kernel, Len Brown, stable

On 10.01.22 at 17:08, Deucher, Alexander wrote:
> [Public]
>
>> -----Original Message-----
>> From: Len Brown <lenb417@gmail.com> On Behalf Of Len Brown
>> Sent: Sunday, January 9, 2022 1:12 PM
>> To: torvalds@linux-foundation.org
>> Cc: linux-pm@vger.kernel.org; linux-kernel@vger.kernel.org; Len Brown
>> <len.brown@intel.com>; Chen, Guchun <Guchun.Chen@amd.com>;
>> Grodzovsky, Andrey <Andrey.Grodzovsky@amd.com>; Koenig, Christian
>> <Christian.Koenig@amd.com>; Deucher, Alexander
>> <Alexander.Deucher@amd.com>; stable@vger.kernel.org
>> Subject: [PATCH REGRESSION] Revert "drm/amdgpu: stop scheduler when
>> calling hw_fini (v2)"
>>
>> From: Len Brown <len.brown@intel.com>
>>
>> This reverts commit f7d6779df642720e22bffd449e683bb8690bd3bf.
>>
>> This bisected regression has impacted suspend-resume stability since
>> 5.15-rc1. It regressed -stable via 5.14.10.
>>
>> https://bugzilla.kernel.org/show_bug.cgi?id=215315
>>
>> Fixes: f7d6779df64 ("drm/amdgpu: stop scheduler when calling hw_fini (v2)")
>> Cc: Guchun Chen <guchun.chen@amd.com>
>> Cc: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
>> Cc: Christian Koenig <christian.koenig@amd.com>
>> Cc: Alex Deucher <alexander.deucher@amd.com>
>> Cc: <stable@vger.kernel.org> # 5.14+
>> Signed-off-by: Len Brown <len.brown@intel.com>
> @Chen, Guchun, @Grodzovsky, Andrey, @Koenig, Christian
>
> Any ideas?  What's the consequence of reverting this patch?  Didn't this patch fix another suspend/resume issue?

I think Guchun was just trying to adapt to our having removed the
scheduler stop from the fence driver hw_fini path.

Not sure if that actually fixed something or was just a precaution.

Regards,
Christian.

>
> Alex
>
>> ---
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c | 8 --------
>>   1 file changed, 8 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
>> index 9afd11ca2709..45977a72b5dd 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
>> @@ -547,9 +547,6 @@ void amdgpu_fence_driver_hw_fini(struct
>> amdgpu_device *adev)
>>   		if (!ring || !ring->fence_drv.initialized)
>>   			continue;
>>
>> -		if (!ring->no_scheduler)
>> -			drm_sched_stop(&ring->sched, NULL);
>> -
>>   		/* You can't wait for HW to signal if it's gone */
>>   		if (!drm_dev_is_unplugged(adev_to_drm(adev)))
>>   			r = amdgpu_fence_wait_empty(ring);
>> @@ -609,11 +606,6 @@ void amdgpu_fence_driver_hw_init(struct
>> amdgpu_device *adev)
>>   		if (!ring || !ring->fence_drv.initialized)
>>   			continue;
>>
>> -		if (!ring->no_scheduler) {
>> -			drm_sched_resubmit_jobs(&ring->sched);
>> -			drm_sched_start(&ring->sched, true);
>> -		}
>> -
>>   		/* enable the interrupt */
>>   		if (ring->fence_drv.irq_src)
>>   			amdgpu_irq_get(adev, ring->fence_drv.irq_src,
>> --
>> 2.25.1



* RE: [PATCH REGRESSION] Revert "drm/amdgpu: stop scheduler when calling hw_fini (v2)"
  2022-01-10 16:16   ` Christian König
@ 2022-01-10 16:25     ` Deucher, Alexander
  2022-01-10 16:43       ` Lukas Wunner
  2022-01-11  3:20       ` Chen, Guchun
  0 siblings, 2 replies; 6+ messages in thread
From: Deucher, Alexander @ 2022-01-10 16:25 UTC (permalink / raw)
  To: Koenig, Christian, Len Brown, torvalds, Chen, Guchun, Grodzovsky, Andrey
  Cc: linux-pm, linux-kernel, Len Brown, stable

[Public]

> -----Original Message-----
> From: Koenig, Christian <Christian.Koenig@amd.com>
> Sent: Monday, January 10, 2022 11:16 AM
> To: Deucher, Alexander <Alexander.Deucher@amd.com>; Len Brown
> <lenb@kernel.org>; torvalds@linux-foundation.org; Chen, Guchun
> <Guchun.Chen@amd.com>; Grodzovsky, Andrey
> <Andrey.Grodzovsky@amd.com>
> Cc: linux-pm@vger.kernel.org; linux-kernel@vger.kernel.org; Len Brown
> <len.brown@intel.com>; stable@vger.kernel.org
> Subject: Re: [PATCH REGRESSION] Revert "drm/amdgpu: stop scheduler
> when calling hw_fini (v2)"
> 
> On 10.01.22 at 17:08, Deucher, Alexander wrote:
> > [Public]
> >
> >> -----Original Message-----
> >> From: Len Brown <lenb417@gmail.com> On Behalf Of Len Brown
> >> Sent: Sunday, January 9, 2022 1:12 PM
> >> To: torvalds@linux-foundation.org
> >> Cc: linux-pm@vger.kernel.org; linux-kernel@vger.kernel.org; Len Brown
> >> <len.brown@intel.com>; Chen, Guchun <Guchun.Chen@amd.com>;
> >> Grodzovsky, Andrey <Andrey.Grodzovsky@amd.com>; Koenig, Christian
> >> <Christian.Koenig@amd.com>; Deucher, Alexander
> >> <Alexander.Deucher@amd.com>; stable@vger.kernel.org
> >> Subject: [PATCH REGRESSION] Revert "drm/amdgpu: stop scheduler when
> >> calling hw_fini (v2)"
> >>
> >> From: Len Brown <len.brown@intel.com>
> >>
> >> This reverts commit f7d6779df642720e22bffd449e683bb8690bd3bf.
> >>
> >> This bisected regression has impacted suspend-resume stability since
> >> 5.15-rc1. It regressed -stable via 5.14.10.
> >>
> >> https://bugzilla.kernel.org/show_bug.cgi?id=215315
> >>
> >> Fixes: f7d6779df64 ("drm/amdgpu: stop scheduler when calling hw_fini
> >> (v2)")
> >> Cc: Guchun Chen <guchun.chen@amd.com>
> >> Cc: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
> >> Cc: Christian Koenig <christian.koenig@amd.com>
> >> Cc: Alex Deucher <alexander.deucher@amd.com>
> >> Cc: <stable@vger.kernel.org> # 5.14+
> >> Signed-off-by: Len Brown <len.brown@intel.com>
> > @Chen, Guchun, @Grodzovsky, Andrey, @Koenig, Christian
> >
> > Any ideas?  What's the consequence of reverting this patch?  Didn't this
> > patch fix another suspend/resume issue?
> 
> I think Guchun was just trying to adapt that we removed the scheduler stop
> from the fence driver hw fini path.
> 
> Not sure if that actually fixed something or was just a precaution.

Thanks.  I'll wait for feedback from Guchun and Andrey and if they are ok with it, I'll apply the revert.

Alex


> 
> Regards,
> Christian.
> 
> >
> > Alex
> >
> >> ---
> >>   drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c | 8 --------
> >>   1 file changed, 8 deletions(-)
> >>
> >> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
> >> b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
> >> index 9afd11ca2709..45977a72b5dd 100644
> >> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
> >> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
> >> @@ -547,9 +547,6 @@ void amdgpu_fence_driver_hw_fini(struct
> >> amdgpu_device *adev)
> >>   		if (!ring || !ring->fence_drv.initialized)
> >>   			continue;
> >>
> >> -		if (!ring->no_scheduler)
> >> -			drm_sched_stop(&ring->sched, NULL);
> >> -
> >>   		/* You can't wait for HW to signal if it's gone */
> >>   		if (!drm_dev_is_unplugged(adev_to_drm(adev)))
> >>   			r = amdgpu_fence_wait_empty(ring);
> >> @@ -609,11 +606,6 @@ void amdgpu_fence_driver_hw_init(struct
> >> amdgpu_device *adev)
> >>   		if (!ring || !ring->fence_drv.initialized)
> >>   			continue;
> >>
> >> -		if (!ring->no_scheduler) {
> >> -			drm_sched_resubmit_jobs(&ring->sched);
> >> -			drm_sched_start(&ring->sched, true);
> >> -		}
> >> -
> >>   		/* enable the interrupt */
> >>   		if (ring->fence_drv.irq_src)
> >>   			amdgpu_irq_get(adev, ring->fence_drv.irq_src,
> >> --
> >> 2.25.1


* Re: [PATCH REGRESSION] Revert "drm/amdgpu: stop scheduler when calling hw_fini (v2)"
  2022-01-10 16:25     ` Deucher, Alexander
@ 2022-01-10 16:43       ` Lukas Wunner
  2022-01-11  3:20       ` Chen, Guchun
  1 sibling, 0 replies; 6+ messages in thread
From: Lukas Wunner @ 2022-01-10 16:43 UTC (permalink / raw)
  To: Deucher, Alexander
  Cc: Koenig, Christian, Len Brown, torvalds, Chen, Guchun, Grodzovsky,
	Andrey, linux-pm, linux-kernel, Len Brown, stable

On Mon, Jan 10, 2022 at 04:25:51PM +0000, Deucher, Alexander wrote:
> Thanks.  I'll wait for feedback from Guchun and Andrey and if they are
> ok with it, I'll apply the revert.

Linus already picked it up yesterday; it's in v5.16.


* RE: [PATCH REGRESSION] Revert "drm/amdgpu: stop scheduler when calling hw_fini (v2)"
  2022-01-10 16:25     ` Deucher, Alexander
  2022-01-10 16:43       ` Lukas Wunner
@ 2022-01-11  3:20       ` Chen, Guchun
  1 sibling, 0 replies; 6+ messages in thread
From: Chen, Guchun @ 2022-01-11  3:20 UTC (permalink / raw)
  To: Deucher, Alexander, Koenig, Christian, Len Brown, torvalds,
	Grodzovsky, Andrey
  Cc: linux-pm, linux-kernel, Len Brown, stable

[Public]

Hi Alex/Christian,

This patch puts drm_sched_stop before amdgpu_fence_wait_empty so that the scheduler is stopped first. Otherwise there is a possible race in which the drm scheduler keeps submitting commands to the hardware during suspend, so the fences that amdgpu_fence_wait_empty waits on never get a chance to drain. This is based on an earlier discussion with Andrey.
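
The ordering argument above can be sketched with a toy model in plain C. This is not kernel code and does not call any DRM or amdgpu API: every name in it (toy_ring, toy_sched_stop, toy_tick, toy_wait_empty, toy_suspend) is hypothetical, and it assumes a wait with a fixed tick budget purely so that the two orderings can be compared deterministically.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: none of these names exist in the kernel. */
struct toy_ring {
	int pending_fences;  /* jobs on the "hardware" that have not signaled yet */
	int queued_jobs;     /* jobs still held by the software scheduler */
	bool sched_running;  /* true while the scheduler may push more jobs */
};

/* Stand-in for stopping the scheduler: no further jobs reach the hardware. */
static void toy_sched_stop(struct toy_ring *r)
{
	r->sched_running = false;
}

/* One time step: a running scheduler pushes one job, the hardware retires one. */
static void toy_tick(struct toy_ring *r)
{
	if (r->sched_running && r->queued_jobs > 0) {
		r->queued_jobs--;
		r->pending_fences++;
	}
	if (r->pending_fences > 0)
		r->pending_fences--;
}

/* Stand-in for a bounded wait (an assumption): true if drained in max_ticks. */
static bool toy_wait_empty(struct toy_ring *r, int max_ticks)
{
	assert(max_ticks >= 0);
	for (int i = 0; i < max_ticks && r->pending_fences > 0; i++)
		toy_tick(r);
	return r->pending_fences == 0;
}

/* Suspend path in either order; returns whether the wait saw an empty ring. */
static bool toy_suspend(bool stop_first, int max_ticks)
{
	struct toy_ring r = { .pending_fences = 2, .queued_jobs = 5,
			      .sched_running = true };

	if (stop_first)
		toy_sched_stop(&r);
	return toy_wait_empty(&r, max_ticks);
}
```

In this model, toy_suspend(true, 3) drains the two outstanding fences within the budget, while toy_suspend(false, 3) does not, because the still-running scheduler refills the ring as fast as it empties. The model only illustrates the intent behind the ordering; it says nothing about why the real patch regressed suspend/resume.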

In Brown's case, without this patch, his test runs well for 10 hours. With the patch applied, however, the issue occurs in under an hour. I guess this patch exposes another underlying problem: if the patch were completely faulty, the test would break in the first suspend/resume round rather than fail after several rounds.
https://bugzilla.kernel.org/show_bug.cgi?id=215315

Anyway, we can revert it for now, and I will continue investigating the root cause.

Regards,
Guchun


Thread overview: 6+ messages
2022-01-09 18:11 [PATCH REGRESSION] Revert "drm/amdgpu: stop scheduler when calling hw_fini (v2)" Len Brown
2022-01-10 16:08 ` Deucher, Alexander
2022-01-10 16:16   ` Christian König
2022-01-10 16:25     ` Deucher, Alexander
2022-01-10 16:43       ` Lukas Wunner
2022-01-11  3:20       ` Chen, Guchun
