[PATCH] drm/amdgpu: wait for all rings to drain before runtime suspending

* [PATCH] drm/amdgpu: wait for all rings to drain before runtime suspending
@ 2019-12-10 22:08 Alex Deucher
  2019-12-11  2:26 ` zhoucm1
  0 siblings, 1 reply; 4+ messages in thread
From: Alex Deucher @ 2019-12-10 22:08 UTC
  To: amd-gfx; +Cc: Alex Deucher

Add a safety check to runtime suspend to make sure all outstanding
fences have signaled before we suspend.  Doesn't fix any known issue.

We already do this via the fence driver suspend function, but we
just force completion rather than bailing.  This bails on runtime
suspend so we can try again later once the fences are signaled to
avoid missing any outstanding work.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 12 +++++++++++-
 1 file changed, 11 insertions(+), 1 deletion(-)
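
For readers comparing the two policies described above (force completion
versus bailing with -EBUSY), here is a minimal user-space sketch. It is not
driver code: struct model_ring, MODEL_MAX_RINGS and every model_* function
are hypothetical stand-ins, and the check here never blocks, whereas the
real amdgpu_fence_wait_empty() may wait. Sequence-number wraparound is
ignored for simplicity.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define MODEL_MAX_RINGS 4	/* hypothetical; stands in for AMDGPU_MAX_RINGS */

/* Hypothetical user-space stand-in for a ring's fence bookkeeping. */
struct model_ring {
	int ready;		/* models ring->sched.ready */
	unsigned int emitted;	/* last fence sequence number emitted */
	unsigned int signaled;	/* last fence sequence number signaled */
};

/* Return 0 if no fence is outstanding, -EBUSY otherwise. Never blocks. */
static int model_fence_check_empty(const struct model_ring *ring)
{
	assert(ring->signaled <= ring->emitted);
	return ring->signaled == ring->emitted ? 0 : -EBUSY;
}

/* "Force completion": declare every emitted fence signaled. */
static void model_force_completion(struct model_ring *ring)
{
	ring->signaled = ring->emitted;
}

/* Policy of this patch: refuse to suspend while any ready ring has work. */
static int model_runtime_suspend(struct model_ring *rings[], size_t n)
{
	size_t i;

	for (i = 0; i < n; i++) {
		struct model_ring *ring = rings[i];

		if (ring && ring->ready && model_fence_check_empty(ring))
			return -EBUSY;
	}

	return 0;
}
```

In this model a caller that gets -EBUSY simply retries later, which is the
"try again later once the fences are signaled" behavior the commit message
describes.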

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
index 2f367146c72c..81322b0a8acf 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
@@ -1214,13 +1214,23 @@ static int amdgpu_pmops_runtime_suspend(struct device *dev)
 	struct pci_dev *pdev = to_pci_dev(dev);
 	struct drm_device *drm_dev = pci_get_drvdata(pdev);
 	struct amdgpu_device *adev = drm_dev->dev_private;
-	int ret;
+	int ret, i;
 
 	if (!adev->runpm) {
 		pm_runtime_forbid(dev);
 		return -EBUSY;
 	}
 
+	/* wait for all rings to drain before suspending */
+	for (i = 0; i < AMDGPU_MAX_RINGS; i++) {
+		struct amdgpu_ring *ring = adev->rings[i];
+		if (ring && ring->sched.ready) {
+			ret = amdgpu_fence_wait_empty(ring);
+			if (ret)
+				return -EBUSY;
+		}
+	}
+
 	if (amdgpu_device_supports_boco(drm_dev))
 		drm_dev->switch_power_state = DRM_SWITCH_POWER_CHANGING;
 	drm_kms_helper_poll_disable(drm_dev);
-- 
2.23.0

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

* Re: [PATCH] drm/amdgpu: wait for all rings to drain before runtime suspending
  2019-12-10 22:08 [PATCH] drm/amdgpu: wait for all rings to drain before runtime suspending Alex Deucher
@ 2019-12-11  2:26 ` zhoucm1
  2019-12-11 13:07   ` Christian König
  0 siblings, 1 reply; 4+ messages in thread
From: zhoucm1 @ 2019-12-11  2:26 UTC
  To: Alex Deucher, amd-gfx; +Cc: Alex Deucher


On 2019/12/11 6:08 AM, Alex Deucher wrote:
> Add a safety check to runtime suspend to make sure all outstanding
> fences have signaled before we suspend.  Doesn't fix any known issue.
>
> We already do this via the fence driver suspend function, but we
> just force completion rather than bailing.  This bails on runtime
> suspend so we can try again later once the fences are signaled to
> avoid missing any outstanding work.

The idea sounds OK to me, but if you want to drain the rings, you should
make sure there are no more submissions, right?

So you should park all schedulers before waiting for all outstanding
fences to complete.

-David
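
The ordering suggested here (stop new submissions first, then wait for
outstanding fences) can be sketched in user space as follows. This is only
an illustration under stated assumptions: struct model_sched and all
model_* functions are hypothetical, the "hardware" is simulated by
model_hw_retire_one(), and the wait is bounded by a step count rather than
a real timeout. It does not show the scheduler's actual parking API.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical model of one scheduler and the ring it feeds. */
struct model_sched {
	bool parked;		/* true: new submissions are rejected */
	unsigned int emitted;	/* jobs pushed to the hardware */
	unsigned int signaled;	/* jobs the hardware has finished */
};

/* Submit one job; fails while the scheduler is parked. */
static int model_submit(struct model_sched *s)
{
	if (s->parked)
		return -EAGAIN;
	s->emitted++;
	return 0;
}

/* Simulate the hardware finishing one outstanding job, if any. */
static void model_hw_retire_one(struct model_sched *s)
{
	if (s->signaled < s->emitted)
		s->signaled++;
}

/*
 * Park first so the amount of outstanding work can only shrink, then
 * drain. On timeout, unpark and report -EBUSY so the caller can retry.
 */
static int model_park_and_drain(struct model_sched *s, unsigned int max_steps)
{
	s->parked = true;

	while (s->signaled != s->emitted) {
		if (max_steps-- == 0) {
			s->parked = false;
			return -EBUSY;
		}
		model_hw_retire_one(s);
	}

	return 0;
}
```

Without the park step, a submission arriving between the emptiness check
and the suspend itself could leave work queued on a ring that is about to
be powered down.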


* Re: [PATCH] drm/amdgpu: wait for all rings to drain before runtime suspending
  2019-12-11  2:26 ` zhoucm1
@ 2019-12-11 13:07   ` Christian König
  2019-12-11 14:47     ` Alex Deucher
  0 siblings, 1 reply; 4+ messages in thread
From: Christian König @ 2019-12-11 13:07 UTC
  To: zhoucm1, Alex Deucher, amd-gfx; +Cc: Alex Deucher

On 11.12.19 at 03:26, zhoucm1 wrote:
>
> On 2019/12/11 6:08 AM, Alex Deucher wrote:
>> Add a safety check to runtime suspend to make sure all outstanding
>> fences have signaled before we suspend.  Doesn't fix any known issue.
>>
>> We already do this via the fence driver suspend function, but we
>> just force completion rather than bailing.  This bails on runtime
>> suspend so we can try again later once the fences are signaled to
>> avoid missing any outstanding work.
>
> The idea sounds OK to me, but if you want to drain the rings, you
> should make sure there are no more submissions, right?
>
> So you should park all schedulers before waiting for all outstanding
> fences to complete.

At that point userspace should already be put on hold, so there should be
no new submissions. But it probably won't hurt to stop the scheduler anyway.

But another issue I see is what happens if we locked up the hardware?

Christian.



* Re: [PATCH] drm/amdgpu: wait for all rings to drain before runtime suspending
  2019-12-11 13:07   ` Christian König
@ 2019-12-11 14:47     ` Alex Deucher
  0 siblings, 0 replies; 4+ messages in thread
From: Alex Deucher @ 2019-12-11 14:47 UTC
  To: Christian Koenig; +Cc: Alex Deucher, zhoucm1, amd-gfx list

On Wed, Dec 11, 2019 at 8:07 AM Christian König
<ckoenig.leichtzumerken@gmail.com> wrote:
>
> On 11.12.19 at 03:26, zhoucm1 wrote:
> >
> > On 2019/12/11 6:08 AM, Alex Deucher wrote:
> >> Add a safety check to runtime suspend to make sure all outstanding
> >> fences have signaled before we suspend.  Doesn't fix any known issue.
> >>
> >> We already do this via the fence driver suspend function, but we
> >> just force completion rather than bailing.  This bails on runtime
> >> suspend so we can try again later once the fences are signaled to
> >> avoid missing any outstanding work.
> >
> > The idea sounds OK to me, but if you want to drain the rings, you
> > should make sure there are no more submissions, right?
> >
> > So you should park all schedulers before waiting for all outstanding
> > fences to complete.
>
> At that point userspace should already be put on hold, so there should be
> no new submissions. But it probably won't hurt to stop the scheduler anyway.
>

Any ioctl calls will wake the hw again or increase the usage count.
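
As a rough illustration of that point, here is a small user-space model of
a usage count gating runtime suspend. It is loosely modeled on how a
runtime PM usage counter behaves, but struct model_dev and the model_*
functions are hypothetical and do not reproduce the kernel's runtime PM
API or the amdgpu ioctl path.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical device state: a usage count plus a suspended flag. */
struct model_dev {
	int usage_count;
	bool suspended;
};

/* Entry to an ioctl-like path: take a reference and wake the device. */
static void model_get_sync(struct model_dev *d)
{
	d->usage_count++;
	d->suspended = false;
}

/* Exit from the path: drop the reference taken on entry. */
static void model_put(struct model_dev *d)
{
	assert(d->usage_count > 0);
	d->usage_count--;
}

/* Runtime suspend is only allowed while nobody holds a reference. */
static int model_try_runtime_suspend(struct model_dev *d)
{
	if (d->usage_count > 0)
		return -EBUSY;
	d->suspended = true;
	return 0;
}
```

In this model, a submission path that holds a reference for its whole
duration cannot overlap with a successful suspend, which is the property
being relied on above.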

> But another issue I see is what happens if we locked up the hardware?
>

Regular GPU reset would kick in eventually.

Alex
