All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] drm/amdgpu: Fix S3 resume failre.
@ 2018-07-19 15:19 Andrey Grodzovsky
       [not found] ` <1532013596-26662-1-git-send-email-andrey.grodzovsky-5C7GfCeVMHo@public.gmane.org>
  0 siblings, 1 reply; 15+ messages in thread
From: Andrey Grodzovsky @ 2018-07-19 15:19 UTC (permalink / raw)
  To: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW
  Cc: Alexander.Deucher-5C7GfCeVMHo, Andrey Grodzovsky,
	harry.wentland-5C7GfCeVMHo, christian.koenig-5C7GfCeVMHo,
	ville.syrjala-VuQAYsv1563Yd54FQh9/CA

Problem:
FB is still not unpinned during the first run of amdgpu_bo_evict_vram
and so it's left for the second run, but during second run the SDMA for
moving buffer around already disabled and you have to do
it with CPU, but FB is not in visible VRAM and hence the eviction failure
leading later to resume failure.

Fix:
When DAL in use get a pointer to FB from crtc->primary->state rather
then from crtc->primary which is not set for DAL since it supports
atomic KMS.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107065
Fixes e00fb85 drm: Stop updating plane->crtc/fb/old_fb on atomic drivers
Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 709e4a3..dd9ebf7 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -2642,7 +2642,8 @@ int amdgpu_device_suspend(struct drm_device *dev, bool suspend, bool fbcon)
 	/* unpin the front buffers and cursors */
 	list_for_each_entry(crtc, &dev->mode_config.crtc_list, head) {
 		struct amdgpu_crtc *amdgpu_crtc = to_amdgpu_crtc(crtc);
-		struct drm_framebuffer *fb = crtc->primary->fb;
+		 struct drm_framebuffer *fb = amdgpu_device_has_dc_support(adev) ?
+				 crtc->primary->state->fb : crtc->primary->fb;
 		struct amdgpu_bo *robj;
 
 		if (amdgpu_crtc->cursor_bo) {
-- 
2.7.4

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH] drm/amdgpu: Fix S3 resume failre.
       [not found] ` <1532013596-26662-1-git-send-email-andrey.grodzovsky-5C7GfCeVMHo@public.gmane.org>
@ 2018-07-19 15:25   ` Christian König
       [not found]     ` <fb787740-8201-8d27-4ee2-dfcc8d732bc7-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
  2018-07-19 15:39   ` Ville Syrjälä
  1 sibling, 1 reply; 15+ messages in thread
From: Christian König @ 2018-07-19 15:25 UTC (permalink / raw)
  To: Andrey Grodzovsky, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW
  Cc: Alexander.Deucher-5C7GfCeVMHo, harry.wentland-5C7GfCeVMHo,
	christian.koenig-5C7GfCeVMHo,
	ville.syrjala-VuQAYsv1563Yd54FQh9/CA

Am 19.07.2018 um 17:19 schrieb Andrey Grodzovsky:
> Problem:
> FB is still not unpinned during the first run of amdgpu_bo_evict_vram
> and so it's left for the second run, but during second run the SDMA for
> moving buffer around already disabled and you have to do
> it with CPU, but FB is not in visible VRAM and hence the eviction failure
> leading later to resume failure.
>
> Fix:
> When DAL in use get a pointer to FB from crtc->primary->state rather
> then from crtc->primary which is not set for DAL since it supports
> atomic KMS.

Nice catch, but could we add a helper for unpinning them and just unpin 
both instead of checking if DC is enabled or not?

I think that would be a little bit cleaner.

Christian.

>
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107065
> Fixes e00fb85 drm: Stop updating plane->crtc/fb/old_fb on atomic drivers
> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 3 ++-
>   1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> index 709e4a3..dd9ebf7 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> @@ -2642,7 +2642,8 @@ int amdgpu_device_suspend(struct drm_device *dev, bool suspend, bool fbcon)
>   	/* unpin the front buffers and cursors */
>   	list_for_each_entry(crtc, &dev->mode_config.crtc_list, head) {
>   		struct amdgpu_crtc *amdgpu_crtc = to_amdgpu_crtc(crtc);
> -		struct drm_framebuffer *fb = crtc->primary->fb;
> +		 struct drm_framebuffer *fb = amdgpu_device_has_dc_support(adev) ?
> +				 crtc->primary->state->fb : crtc->primary->fb;
>   		struct amdgpu_bo *robj;
>   
>   		if (amdgpu_crtc->cursor_bo) {

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] drm/amdgpu: Fix S3 resume failre.
       [not found]     ` <fb787740-8201-8d27-4ee2-dfcc8d732bc7-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
@ 2018-07-19 15:32       ` Harry Wentland
  2018-07-19 16:34       ` Andrey Grodzovsky
  1 sibling, 0 replies; 15+ messages in thread
From: Harry Wentland @ 2018-07-19 15:32 UTC (permalink / raw)
  To: christian.koenig-5C7GfCeVMHo, Andrey Grodzovsky,
	amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW
  Cc: Alexander.Deucher-5C7GfCeVMHo, ville.syrjala-VuQAYsv1563Yd54FQh9/CA

On 2018-07-19 11:25 AM, Christian König wrote:
> Am 19.07.2018 um 17:19 schrieb Andrey Grodzovsky:
>> Problem:
>> FB is still not unpinned during the first run of amdgpu_bo_evict_vram
>> and so it's left for the second run, but during second run the SDMA for
>> moving buffer around already disabled and you have to do
>> it with CPU, but FB is not in visible VRAM and hence the eviction failure
>> leading later to resume failure.
>>
>> Fix:
>> When DAL in use get a pointer to FB from crtc->primary->state rather
>> then from crtc->primary which is not set for DAL since it supports
>> atomic KMS.
> 

Good catch.

> Nice catch, but could we add a helper for unpinning them and just unpin both instead of checking if DC is enabled or not?
> 
> I think that would be a little bit cleaner.
> 
> Christian.
> 
>>
>> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107065
>> Fixes e00fb85 drm: Stop updating plane->crtc/fb/old_fb on atomic drivers
>> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
>> ---
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 3 ++-
>>   1 file changed, 2 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> index 709e4a3..dd9ebf7 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> @@ -2642,7 +2642,8 @@ int amdgpu_device_suspend(struct drm_device *dev, bool suspend, bool fbcon)
>>       /* unpin the front buffers and cursors */
>>       list_for_each_entry(crtc, &dev->mode_config.crtc_list, head) {
>>           struct amdgpu_crtc *amdgpu_crtc = to_amdgpu_crtc(crtc);
>> -        struct drm_framebuffer *fb = crtc->primary->fb;
>> +         struct drm_framebuffer *fb = amdgpu_device_has_dc_support(adev) ?
>> +                 crtc->primary->state->fb : crtc->primary->fb;
Is that the only place or are there other places where DC is using fb from crtc, rather than crtc_state?

Harry

>>           struct amdgpu_bo *robj;
>>             if (amdgpu_crtc->cursor_bo) {
> 
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] drm/amdgpu: Fix S3 resume failre.
       [not found] ` <1532013596-26662-1-git-send-email-andrey.grodzovsky-5C7GfCeVMHo@public.gmane.org>
  2018-07-19 15:25   ` Christian König
@ 2018-07-19 15:39   ` Ville Syrjälä
       [not found]     ` <20180719153902.GA5565-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
  1 sibling, 1 reply; 15+ messages in thread
From: Ville Syrjälä @ 2018-07-19 15:39 UTC (permalink / raw)
  To: Andrey Grodzovsky
  Cc: Alexander.Deucher-5C7GfCeVMHo, harry.wentland-5C7GfCeVMHo,
	christian.koenig-5C7GfCeVMHo,
	amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

On Thu, Jul 19, 2018 at 11:19:56AM -0400, Andrey Grodzovsky wrote:
> Problem:
> FB is still not unpinned during the first run of amdgpu_bo_evict_vram
> and so it's left for the second run, but during second run the SDMA for
> moving buffer around already disabled and you have to do
> it with CPU, but FB is not in visible VRAM and hence the eviction failure
> leading later to resume failure.
> 
> Fix:
> When DAL in use get a pointer to FB from crtc->primary->state rather
> then from crtc->primary which is not set for DAL since it supports
> atomic KMS.
> 
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107065
> Fixes e00fb85 drm: Stop updating plane->crtc/fb/old_fb on atomic drivers
> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> index 709e4a3..dd9ebf7 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> @@ -2642,7 +2642,8 @@ int amdgpu_device_suspend(struct drm_device *dev, bool suspend, bool fbcon)
>  	/* unpin the front buffers and cursors */
>  	list_for_each_entry(crtc, &dev->mode_config.crtc_list, head) {
>  		struct amdgpu_crtc *amdgpu_crtc = to_amdgpu_crtc(crtc);
> -		struct drm_framebuffer *fb = crtc->primary->fb;
> +		 struct drm_framebuffer *fb = amdgpu_device_has_dc_support(adev) ?
> +				 crtc->primary->state->fb : crtc->primary->fb;

So apparently you haven't yet turned off the planes here. If I'm 
reading things right amdgpu_device_ip_suspend() should end up doing
that through drm_atomic_helper_suspend(). So it looks like like now
you'll end up unpinning the same bos twice. Doesn't that mess up
some kind of refcount or something?

To me it would seem better to susped the display before trying
to evict the bos.

>  		struct amdgpu_bo *robj;
>  
>  		if (amdgpu_crtc->cursor_bo) {
> -- 
> 2.7.4

-- 
Ville Syrjälä
Intel
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] drm/amdgpu: Fix S3 resume failre.
       [not found]     ` <20180719153902.GA5565-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
@ 2018-07-19 16:33       ` Andrey Grodzovsky
       [not found]         ` <86de8654-1bff-ad56-9e72-9ec1eb2e7f76-5C7GfCeVMHo@public.gmane.org>
  0 siblings, 1 reply; 15+ messages in thread
From: Andrey Grodzovsky @ 2018-07-19 16:33 UTC (permalink / raw)
  To: Ville Syrjälä
  Cc: Alexander.Deucher-5C7GfCeVMHo, harry.wentland-5C7GfCeVMHo,
	christian.koenig-5C7GfCeVMHo,
	amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW



On 07/19/2018 11:39 AM, Ville Syrjälä wrote:
> On Thu, Jul 19, 2018 at 11:19:56AM -0400, Andrey Grodzovsky wrote:
>> Problem:
>> FB is still not unpinned during the first run of amdgpu_bo_evict_vram
>> and so it's left for the second run, but during second run the SDMA for
>> moving buffer around already disabled and you have to do
>> it with CPU, but FB is not in visible VRAM and hence the eviction failure
>> leading later to resume failure.
>>
>> Fix:
>> When DAL in use get a pointer to FB from crtc->primary->state rather
>> then from crtc->primary which is not set for DAL since it supports
>> atomic KMS.
>>
>> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107065
>> Fixes e00fb85 drm: Stop updating plane->crtc/fb/old_fb on atomic drivers
>> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
>> ---
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 3 ++-
>>   1 file changed, 2 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> index 709e4a3..dd9ebf7 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> @@ -2642,7 +2642,8 @@ int amdgpu_device_suspend(struct drm_device *dev, bool suspend, bool fbcon)
>>   	/* unpin the front buffers and cursors */
>>   	list_for_each_entry(crtc, &dev->mode_config.crtc_list, head) {
>>   		struct amdgpu_crtc *amdgpu_crtc = to_amdgpu_crtc(crtc);
>> -		struct drm_framebuffer *fb = crtc->primary->fb;
>> +		 struct drm_framebuffer *fb = amdgpu_device_has_dc_support(adev) ?
>> +				 crtc->primary->state->fb : crtc->primary->fb;
> So apparently you haven't yet turned off the planes here. If I'm
> reading things right amdgpu_device_ip_suspend() should end up doing
> that through drm_atomic_helper_suspend(). So it looks like like now
> you'll end up unpinning the same bos twice. Doesn't that mess up
> some kind of refcount or something?
amdgpu_bo_unpin has a guard against that, amdgpu_bo_unreserve is less clear.

>
> To me it would seem better to susped the display before trying
> to evict the bos.

Yea, i was aware of that and indeed DAL shouldn't rely on the code in  
amdgpu_device_suspend to unpin
front buffer and cursor since the atomic helper should do it. Problem is 
that during amdgpu_device_ip_suspend
the SDMA engine gets suspended too, so you have to embed another 
eviction in between, after display is suspended but before
SDMA and this forces ordering between them which kind of already in 
place (amd_ip_block_type) but still it's an extra constrain.
Also I am not sure about side effects of another eviction call there - 
Christian any ideas ?

Andrey

>
>>   		struct amdgpu_bo *robj;
>>   
>>   		if (amdgpu_crtc->cursor_bo) {
>> -- 
>> 2.7.4

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] drm/amdgpu: Fix S3 resume failre.
       [not found]     ` <fb787740-8201-8d27-4ee2-dfcc8d732bc7-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
  2018-07-19 15:32       ` Harry Wentland
@ 2018-07-19 16:34       ` Andrey Grodzovsky
  1 sibling, 0 replies; 15+ messages in thread
From: Andrey Grodzovsky @ 2018-07-19 16:34 UTC (permalink / raw)
  To: christian.koenig-5C7GfCeVMHo, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW
  Cc: Alexander.Deucher-5C7GfCeVMHo, harry.wentland-5C7GfCeVMHo,
	ville.syrjala-VuQAYsv1563Yd54FQh9/CA



On 07/19/2018 11:25 AM, Christian König wrote:
> Am 19.07.2018 um 17:19 schrieb Andrey Grodzovsky:
>> Problem:
>> FB is still not unpinned during the first run of amdgpu_bo_evict_vram
>> and so it's left for the second run, but during second run the SDMA for
>> moving buffer around already disabled and you have to do
>> it with CPU, but FB is not in visible VRAM and hence the eviction 
>> failure
>> leading later to resume failure.
>>
>> Fix:
>> When DAL in use get a pointer to FB from crtc->primary->state rather
>> then from crtc->primary which is not set for DAL since it supports
>> atomic KMS.
>
> Nice catch, but could we add a helper for unpinning them and just 
> unpin both instead of checking if DC is enabled or not?

They are mutually exclusive, once is only used in legacy drivers another 
only in atomic KMS drivers.

Andrey

>
> I think that would be a little bit cleaner.
>
> Christian.
>
>>
>> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107065
>> Fixes e00fb85 drm: Stop updating plane->crtc/fb/old_fb on atomic drivers
>> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
>> ---
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 3 ++-
>>   1 file changed, 2 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> index 709e4a3..dd9ebf7 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> @@ -2642,7 +2642,8 @@ int amdgpu_device_suspend(struct drm_device 
>> *dev, bool suspend, bool fbcon)
>>       /* unpin the front buffers and cursors */
>>       list_for_each_entry(crtc, &dev->mode_config.crtc_list, head) {
>>           struct amdgpu_crtc *amdgpu_crtc = to_amdgpu_crtc(crtc);
>> -        struct drm_framebuffer *fb = crtc->primary->fb;
>> +         struct drm_framebuffer *fb = 
>> amdgpu_device_has_dc_support(adev) ?
>> +                 crtc->primary->state->fb : crtc->primary->fb;
>>           struct amdgpu_bo *robj;
>>             if (amdgpu_crtc->cursor_bo) {
>

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] drm/amdgpu: Fix S3 resume failre.
       [not found]         ` <86de8654-1bff-ad56-9e72-9ec1eb2e7f76-5C7GfCeVMHo@public.gmane.org>
@ 2018-07-19 16:47           ` Michel Dänzer
       [not found]             ` <13eee582-a100-4c36-e3c6-ad6f96f40efd-otUistvHUpPR7s880joybQ@public.gmane.org>
  0 siblings, 1 reply; 15+ messages in thread
From: Michel Dänzer @ 2018-07-19 16:47 UTC (permalink / raw)
  To: Andrey Grodzovsky, Ville Syrjälä
  Cc: Alexander.Deucher-5C7GfCeVMHo, harry.wentland-5C7GfCeVMHo,
	christian.koenig-5C7GfCeVMHo,
	amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

On 2018-07-19 06:33 PM, Andrey Grodzovsky wrote:
> On 07/19/2018 11:39 AM, Ville Syrjälä wrote:
>> On Thu, Jul 19, 2018 at 11:19:56AM -0400, Andrey Grodzovsky wrote:
>>> Problem:
>>> FB is still not unpinned during the first run of amdgpu_bo_evict_vram
>>> and so it's left for the second run, but during second run the SDMA for
>>> moving buffer around already disabled and you have to do
>>> it with CPU, but FB is not in visible VRAM and hence the eviction
>>> failure
>>> leading later to resume failure.
>>>
>>> Fix:
>>> When DAL in use get a pointer to FB from crtc->primary->state rather
>>> then from crtc->primary which is not set for DAL since it supports
>>> atomic KMS.
>>>
>>> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107065
>>> Fixes e00fb85 drm: Stop updating plane->crtc/fb/old_fb on atomic drivers
>>> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
>>> ---
>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 3 ++-
>>>   1 file changed, 2 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>> index 709e4a3..dd9ebf7 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>> @@ -2642,7 +2642,8 @@ int amdgpu_device_suspend(struct drm_device
>>> *dev, bool suspend, bool fbcon)
>>>       /* unpin the front buffers and cursors */
>>>       list_for_each_entry(crtc, &dev->mode_config.crtc_list, head) {
>>>           struct amdgpu_crtc *amdgpu_crtc = to_amdgpu_crtc(crtc);
>>> -        struct drm_framebuffer *fb = crtc->primary->fb;
>>> +         struct drm_framebuffer *fb =
>>> amdgpu_device_has_dc_support(adev) ?
>>> +                 crtc->primary->state->fb : crtc->primary->fb;
>> So apparently you haven't yet turned off the planes here. If I'm
>> reading things right amdgpu_device_ip_suspend() should end up doing
>> that through drm_atomic_helper_suspend(). So it looks like like now
>> you'll end up unpinning the same bos twice. Doesn't that mess up
>> some kind of refcount or something?
> amdgpu_bo_unpin has a guard against that, amdgpu_bo_unreserve is less
> clear.

BO reservation shouldn't an issue here, BOs are only reserved for a
short time around (un)pinning them.


>> To me it would seem better to susped the display before trying
>> to evict the bos.
> 
> Yea, i was aware of that and indeed DAL shouldn't rely on the code in 
> amdgpu_device_suspend to unpin
> front buffer and cursor since the atomic helper should do it. Problem is
> that during amdgpu_device_ip_suspend
> the SDMA engine gets suspended too, so you have to embed another
> eviction in between, after display is suspended but before
> SDMA and this forces ordering between them which kind of already in
> place (amd_ip_block_type) but still it's an extra constrain.

Ville's point (which I basically agree with) is that the display
hardware should be turned off before evicting VRAM the first time, in
which case no second eviction should be necessary (for this purpose).


-- 
Earthling Michel Dänzer               |               http://www.amd.com
Libre software enthusiast             |             Mesa and X developer
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] drm/amdgpu: Fix S3 resume failre.
       [not found]             ` <13eee582-a100-4c36-e3c6-ad6f96f40efd-otUistvHUpPR7s880joybQ@public.gmane.org>
@ 2018-07-19 16:53               ` Andrey Grodzovsky
       [not found]                 ` <020d0b20-32f5-8974-06a7-5ef015ddca87-5C7GfCeVMHo@public.gmane.org>
  0 siblings, 1 reply; 15+ messages in thread
From: Andrey Grodzovsky @ 2018-07-19 16:53 UTC (permalink / raw)
  To: Michel Dänzer, Ville Syrjälä
  Cc: Alexander.Deucher-5C7GfCeVMHo, harry.wentland-5C7GfCeVMHo,
	christian.koenig-5C7GfCeVMHo,
	amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW



On 07/19/2018 12:47 PM, Michel Dänzer wrote:
> On 2018-07-19 06:33 PM, Andrey Grodzovsky wrote:
>> On 07/19/2018 11:39 AM, Ville Syrjälä wrote:
>>> On Thu, Jul 19, 2018 at 11:19:56AM -0400, Andrey Grodzovsky wrote:
>>>> Problem:
>>>> FB is still not unpinned during the first run of amdgpu_bo_evict_vram
>>>> and so it's left for the second run, but during second run the SDMA for
>>>> moving buffer around already disabled and you have to do
>>>> it with CPU, but FB is not in visible VRAM and hence the eviction
>>>> failure
>>>> leading later to resume failure.
>>>>
>>>> Fix:
>>>> When DAL in use get a pointer to FB from crtc->primary->state rather
>>>> then from crtc->primary which is not set for DAL since it supports
>>>> atomic KMS.
>>>>
>>>> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107065
>>>> Fixes e00fb85 drm: Stop updating plane->crtc/fb/old_fb on atomic drivers
>>>> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
>>>> ---
>>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 3 ++-
>>>>    1 file changed, 2 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>> index 709e4a3..dd9ebf7 100644
>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>> @@ -2642,7 +2642,8 @@ int amdgpu_device_suspend(struct drm_device
>>>> *dev, bool suspend, bool fbcon)
>>>>        /* unpin the front buffers and cursors */
>>>>        list_for_each_entry(crtc, &dev->mode_config.crtc_list, head) {
>>>>            struct amdgpu_crtc *amdgpu_crtc = to_amdgpu_crtc(crtc);
>>>> -        struct drm_framebuffer *fb = crtc->primary->fb;
>>>> +         struct drm_framebuffer *fb =
>>>> amdgpu_device_has_dc_support(adev) ?
>>>> +                 crtc->primary->state->fb : crtc->primary->fb;
>>> So apparently you haven't yet turned off the planes here. If I'm
>>> reading things right amdgpu_device_ip_suspend() should end up doing
>>> that through drm_atomic_helper_suspend(). So it looks like like now
>>> you'll end up unpinning the same bos twice. Doesn't that mess up
>>> some kind of refcount or something?
>> amdgpu_bo_unpin has a guard against that, amdgpu_bo_unreserve is less
>> clear.
> BO reservation shouldn't an issue here, BOs are only reserved for a
> short time around (un)pinning them.
>
>
>>> To me it would seem better to susped the display before trying
>>> to evict the bos.
>> Yea, i was aware of that and indeed DAL shouldn't rely on the code in
>> amdgpu_device_suspend to unpin
>> front buffer and cursor since the atomic helper should do it. Problem is
>> that during amdgpu_device_ip_suspend
>> the SDMA engine gets suspended too, so you have to embed another
>> eviction in between, after display is suspended but before
>> SDMA and this forces ordering between them which kind of already in
>> place (amd_ip_block_type) but still it's an extra constrain.
> Ville's point (which I basically agree with) is that the display
> hardware should be turned off before evicting VRAM the first time, in
> which case no second eviction should be necessary (for this purpose).

Display HW is turned off as part of all IPs in a loop inside 
amdgpu_device_ip_suspend.
Are you suggesting to extract the  display HW turn off from inside 
amdgpu_device_ip_suspend and place it
before the first call to amdgpu_bo_evict_vram ?

Andrey

>
>

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] drm/amdgpu: Fix S3 resume failre.
       [not found]                 ` <020d0b20-32f5-8974-06a7-5ef015ddca87-5C7GfCeVMHo@public.gmane.org>
@ 2018-07-19 16:59                   ` Michel Dänzer
       [not found]                     ` <455ebec7-b2cf-0f8e-3fbc-be192d0913c0-otUistvHUpPR7s880joybQ@public.gmane.org>
  0 siblings, 1 reply; 15+ messages in thread
From: Michel Dänzer @ 2018-07-19 16:59 UTC (permalink / raw)
  To: Andrey Grodzovsky, Ville Syrjälä
  Cc: Alexander.Deucher-5C7GfCeVMHo, harry.wentland-5C7GfCeVMHo,
	christian.koenig-5C7GfCeVMHo,
	amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

On 2018-07-19 06:53 PM, Andrey Grodzovsky wrote:
> 
> 
> On 07/19/2018 12:47 PM, Michel Dänzer wrote:
>> On 2018-07-19 06:33 PM, Andrey Grodzovsky wrote:
>>> On 07/19/2018 11:39 AM, Ville Syrjälä wrote:
>>>> On Thu, Jul 19, 2018 at 11:19:56AM -0400, Andrey Grodzovsky wrote:
>>>>> Problem:
>>>>> FB is still not unpinned during the first run of amdgpu_bo_evict_vram
>>>>> and so it's left for the second run, but during second run the SDMA
>>>>> for
>>>>> moving buffer around already disabled and you have to do
>>>>> it with CPU, but FB is not in visible VRAM and hence the eviction
>>>>> failure
>>>>> leading later to resume failure.
>>>>>
>>>>> Fix:
>>>>> When DAL in use get a pointer to FB from crtc->primary->state rather
>>>>> then from crtc->primary which is not set for DAL since it supports
>>>>> atomic KMS.
>>>>>
>>>>> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107065
>>>>> Fixes e00fb85 drm: Stop updating plane->crtc/fb/old_fb on atomic
>>>>> drivers
>>>>> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
>>>>> ---
>>>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 3 ++-
>>>>>    1 file changed, 2 insertions(+), 1 deletion(-)
>>>>>
>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>>> index 709e4a3..dd9ebf7 100644
>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>>> @@ -2642,7 +2642,8 @@ int amdgpu_device_suspend(struct drm_device
>>>>> *dev, bool suspend, bool fbcon)
>>>>>        /* unpin the front buffers and cursors */
>>>>>        list_for_each_entry(crtc, &dev->mode_config.crtc_list, head) {
>>>>>            struct amdgpu_crtc *amdgpu_crtc = to_amdgpu_crtc(crtc);
>>>>> -        struct drm_framebuffer *fb = crtc->primary->fb;
>>>>> +         struct drm_framebuffer *fb =
>>>>> amdgpu_device_has_dc_support(adev) ?
>>>>> +                 crtc->primary->state->fb : crtc->primary->fb;
>>>> So apparently you haven't yet turned off the planes here. If I'm
>>>> reading things right amdgpu_device_ip_suspend() should end up doing
>>>> that through drm_atomic_helper_suspend(). So it looks like like now
>>>> you'll end up unpinning the same bos twice. Doesn't that mess up
>>>> some kind of refcount or something?
>>> amdgpu_bo_unpin has a guard against that, amdgpu_bo_unreserve is less
>>> clear.
>> BO reservation shouldn't an issue here, BOs are only reserved for a
>> short time around (un)pinning them.
>>
>>
>>>> To me it would seem better to susped the display before trying
>>>> to evict the bos.
>>> Yea, i was aware of that and indeed DAL shouldn't rely on the code in
>>> amdgpu_device_suspend to unpin
>>> front buffer and cursor since the atomic helper should do it. Problem is
>>> that during amdgpu_device_ip_suspend
>>> the SDMA engine gets suspended too, so you have to embed another
>>> eviction in between, after display is suspended but before
>>> SDMA and this forces ordering between them which kind of already in
>>> place (amd_ip_block_type) but still it's an extra constrain.
>> Ville's point (which I basically agree with) is that the display
>> hardware should be turned off before evicting VRAM the first time, in
>> which case no second eviction should be necessary (for this purpose).
> 
> Display HW is turned off as part of all IPs in a loop inside
> amdgpu_device_ip_suspend.
> Are you suggesting to extract the  display HW turn off from inside
> amdgpu_device_ip_suspend and place it
> before the first call to amdgpu_bo_evict_vram ?

In a nutshell, yes.

Or maybe it would be easier to move the amdgpu_bo_evict_vram call down
to somewhere called from amdgpu_device_ip_suspend?


-- 
Earthling Michel Dänzer               |               http://www.amd.com
Libre software enthusiast             |             Mesa and X developer
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] drm/amdgpu: Fix S3 resume failre.
       [not found]                     ` <455ebec7-b2cf-0f8e-3fbc-be192d0913c0-otUistvHUpPR7s880joybQ@public.gmane.org>
@ 2018-07-19 17:07                       ` Andrey Grodzovsky
       [not found]                         ` <7406a2c5-56ed-15f0-7007-25a806218a79-5C7GfCeVMHo@public.gmane.org>
  0 siblings, 1 reply; 15+ messages in thread
From: Andrey Grodzovsky @ 2018-07-19 17:07 UTC (permalink / raw)
  To: Michel Dänzer, Ville Syrjälä
  Cc: Alexander.Deucher-5C7GfCeVMHo, harry.wentland-5C7GfCeVMHo,
	christian.koenig-5C7GfCeVMHo,
	amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW



On 07/19/2018 12:59 PM, Michel Dänzer wrote:
> On 2018-07-19 06:53 PM, Andrey Grodzovsky wrote:
>>
>> On 07/19/2018 12:47 PM, Michel Dänzer wrote:
>>> On 2018-07-19 06:33 PM, Andrey Grodzovsky wrote:
>>>> On 07/19/2018 11:39 AM, Ville Syrjälä wrote:
>>>>> On Thu, Jul 19, 2018 at 11:19:56AM -0400, Andrey Grodzovsky wrote:
>>>>>> Problem:
>>>>>> FB is still not unpinned during the first run of amdgpu_bo_evict_vram
>>>>>> and so it's left for the second run, but during second run the SDMA
>>>>>> for
>>>>>> moving buffer around already disabled and you have to do
>>>>>> it with CPU, but FB is not in visible VRAM and hence the eviction
>>>>>> failure
>>>>>> leading later to resume failure.
>>>>>>
>>>>>> Fix:
>>>>>> When DAL in use get a pointer to FB from crtc->primary->state rather
>>>>>> then from crtc->primary which is not set for DAL since it supports
>>>>>> atomic KMS.
>>>>>>
>>>>>> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107065
>>>>>> Fixes e00fb85 drm: Stop updating plane->crtc/fb/old_fb on atomic
>>>>>> drivers
>>>>>> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
>>>>>> ---
>>>>>>     drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 3 ++-
>>>>>>     1 file changed, 2 insertions(+), 1 deletion(-)
>>>>>>
>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>>>> index 709e4a3..dd9ebf7 100644
>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>>>> @@ -2642,7 +2642,8 @@ int amdgpu_device_suspend(struct drm_device
>>>>>> *dev, bool suspend, bool fbcon)
>>>>>>         /* unpin the front buffers and cursors */
>>>>>>         list_for_each_entry(crtc, &dev->mode_config.crtc_list, head) {
>>>>>>             struct amdgpu_crtc *amdgpu_crtc = to_amdgpu_crtc(crtc);
>>>>>> -        struct drm_framebuffer *fb = crtc->primary->fb;
>>>>>> +         struct drm_framebuffer *fb =
>>>>>> amdgpu_device_has_dc_support(adev) ?
>>>>>> +                 crtc->primary->state->fb : crtc->primary->fb;
>>>>> So apparently you haven't yet turned off the planes here. If I'm
>>>>> reading things right amdgpu_device_ip_suspend() should end up doing
>>>>> that through drm_atomic_helper_suspend(). So it looks like like now
>>>>> you'll end up unpinning the same bos twice. Doesn't that mess up
>>>>> some kind of refcount or something?
>>>> amdgpu_bo_unpin has a guard against that, amdgpu_bo_unreserve is less
>>>> clear.
>>> BO reservation shouldn't an issue here, BOs are only reserved for a
>>> short time around (un)pinning them.
>>>
>>>
>>>>> To me it would seem better to susped the display before trying
>>>>> to evict the bos.
>>>> Yea, i was aware of that and indeed DAL shouldn't rely on the code in
>>>> amdgpu_device_suspend to unpin
>>>> front buffer and cursor since the atomic helper should do it. Problem is
>>>> that during amdgpu_device_ip_suspend
>>>> the SDMA engine gets suspended too, so you have to embed another
>>>> eviction in between, after display is suspended but before
>>>> SDMA and this forces ordering between them which kind of already in
>>>> place (amd_ip_block_type) but still it's an extra constrain.
>>> Ville's point (which I basically agree with) is that the display
>>> hardware should be turned off before evicting VRAM the first time, in
>>> which case no second eviction should be necessary (for this purpose).
>> Display HW is turned off as part of all IPs in a loop inside
>> amdgpu_device_ip_suspend.
>> Are you suggesting to extract the  display HW turn off from inside
>> amdgpu_device_ip_suspend and place it
>> before the first call to amdgpu_bo_evict_vram ?
> In a nutshell, yes.
>
> Or maybe it would be easier to move the amdgpu_bo_evict_vram call down
> to somewhere called from amdgpu_device_ip_suspend?

I can move the BEFORE and AFTER calls to amdgpu_bo_evict_vram inside 
amdgpu_device_ip_suspend
such that the first one is called AFTER display is shut off, while the 
second in the very end of the function.
I am just not sure what's gonna be the side effect of evicting after 
bunch of blocks (such as GMC) are already disabled.

Andrey

>
>

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] drm/amdgpu: Fix S3 resume failre.
       [not found]                         ` <7406a2c5-56ed-15f0-7007-25a806218a79-5C7GfCeVMHo@public.gmane.org>
@ 2018-07-19 18:30                           ` Alex Deucher
       [not found]                             ` <CADnq5_Om34fPU5uY_ChVv7b2jAJcP36-Auv6q-2a=Jf=d_L_KA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 15+ messages in thread
From: Alex Deucher @ 2018-07-19 18:30 UTC (permalink / raw)
  To: Andrey Grodzovsky
  Cc: Michel Dänzer, amd-gfx list, Deucher, Alexander, Wentland,
	Harry, Christian Koenig, Ville Syrjälä

[-- Attachment #1: Type: text/plain, Size: 4691 bytes --]

On Thu, Jul 19, 2018 at 1:07 PM, Andrey Grodzovsky
<Andrey.Grodzovsky-5C7GfCeVMHo@public.gmane.org> wrote:
>
>
> On 07/19/2018 12:59 PM, Michel Dänzer wrote:
>>
>> On 2018-07-19 06:53 PM, Andrey Grodzovsky wrote:
>>>
>>>
>>> On 07/19/2018 12:47 PM, Michel Dänzer wrote:
>>>>
>>>> On 2018-07-19 06:33 PM, Andrey Grodzovsky wrote:
>>>>>
>>>>> On 07/19/2018 11:39 AM, Ville Syrjälä wrote:
>>>>>>
>>>>>> On Thu, Jul 19, 2018 at 11:19:56AM -0400, Andrey Grodzovsky wrote:
>>>>>>>
>>>>>>> Problem:
>>>>>>> FB is still not unpinned during the first run of amdgpu_bo_evict_vram
>>>>>>> and so it's left for the second run, but during second run the SDMA
>>>>>>> for
>>>>>>> moving buffer around already disabled and you have to do
>>>>>>> it with CPU, but FB is not in visible VRAM and hence the eviction
>>>>>>> failure
>>>>>>> leading later to resume failure.
>>>>>>>
>>>>>>> Fix:
>>>>>>> When DAL in use get a pointer to FB from crtc->primary->state rather
>>>>>>> then from crtc->primary which is not set for DAL since it supports
>>>>>>> atomic KMS.
>>>>>>>
>>>>>>> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107065
>>>>>>> Fixes e00fb85 drm: Stop updating plane->crtc/fb/old_fb on atomic
>>>>>>> drivers
>>>>>>> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky-5C7GfCeVMHo@public.gmane.org>
>>>>>>> ---
>>>>>>>     drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 3 ++-
>>>>>>>     1 file changed, 2 insertions(+), 1 deletion(-)
>>>>>>>
>>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>>>>> index 709e4a3..dd9ebf7 100644
>>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>>>>> @@ -2642,7 +2642,8 @@ int amdgpu_device_suspend(struct drm_device
>>>>>>> *dev, bool suspend, bool fbcon)
>>>>>>>         /* unpin the front buffers and cursors */
>>>>>>>         list_for_each_entry(crtc, &dev->mode_config.crtc_list, head)
>>>>>>> {
>>>>>>>             struct amdgpu_crtc *amdgpu_crtc = to_amdgpu_crtc(crtc);
>>>>>>> -        struct drm_framebuffer *fb = crtc->primary->fb;
>>>>>>> +         struct drm_framebuffer *fb =
>>>>>>> amdgpu_device_has_dc_support(adev) ?
>>>>>>> +                 crtc->primary->state->fb : crtc->primary->fb;
>>>>>>
>>>>>> So apparently you haven't yet turned off the planes here. If I'm
>>>>>> reading things right amdgpu_device_ip_suspend() should end up doing
>>>>>> that through drm_atomic_helper_suspend(). So it looks like like now
>>>>>> you'll end up unpinning the same bos twice. Doesn't that mess up
>>>>>> some kind of refcount or something?
>>>>>
>>>>> amdgpu_bo_unpin has a guard against that, amdgpu_bo_unreserve is less
>>>>> clear.
>>>>
>>>> BO reservation shouldn't an issue here, BOs are only reserved for a
>>>> short time around (un)pinning them.
>>>>
>>>>
>>>>>> To me it would seem better to susped the display before trying
>>>>>> to evict the bos.
>>>>>
>>>>> Yea, i was aware of that and indeed DAL shouldn't rely on the code in
>>>>> amdgpu_device_suspend to unpin
>>>>> front buffer and cursor since the atomic helper should do it. Problem
>>>>> is
>>>>> that during amdgpu_device_ip_suspend
>>>>> the SDMA engine gets suspended too, so you have to embed another
>>>>> eviction in between, after display is suspended but before
>>>>> SDMA and this forces ordering between them which kind of already in
>>>>> place (amd_ip_block_type) but still it's an extra constrain.
>>>>
>>>> Ville's point (which I basically agree with) is that the display
>>>> hardware should be turned off before evicting VRAM the first time, in
>>>> which case no second eviction should be necessary (for this purpose).
>>>
>>> Display HW is turned off as part of all IPs in a loop inside
>>> amdgpu_device_ip_suspend.
>>> Are you suggesting to extract the  display HW turn off from inside
>>> amdgpu_device_ip_suspend and place it
>>> before the first call to amdgpu_bo_evict_vram ?
>>
>> In a nutshell, yes.
>>
>> Or maybe it would be easier to move the amdgpu_bo_evict_vram call down
>> to somewhere called from amdgpu_device_ip_suspend?
>
>
> I can move the BEFORE and AFTER calls to amdgpu_bo_evict_vram inside
> amdgpu_device_ip_suspend
> such that the first one is called AFTER display is shut off, while the
> second in the very end of the function.
> I am just not sure what's gonna be the side effect of evicting after bunch
> of blocks (such as GMC) are already disabled.

How about something like the attached patches?  Basically split the ip
suspend sequence in two like we do for resume.

Alex

[-- Attachment #2: 0002-drm-amdgpu-rework-suspend-and-resume-to-deal-with-at.patch --]
[-- Type: text/x-patch, Size: 4598 bytes --]

From 17389a607dc90e858c8e072800f7bfaca2c4db86 Mon Sep 17 00:00:00 2001
From: Alex Deucher <alexander.deucher@amd.com>
Date: Thu, 19 Jul 2018 13:24:33 -0500
Subject: [PATCH 2/2] drm/amdgpu: rework suspend and resume to deal with atomic
 changes

Use the newly split ip suspend functions to do suspend displays
first (to deal with atomic so that FBs can be unpinned before
attempting to evict vram), then evict vram, then suspend the
other IPs.  Also move the non-DC pinning code to only be
called in the non-DC cases since atomic should take care of
DC.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107065
Fixes e00fb85 drm: Stop updating plane->crtc/fb/old_fb on atomic drivers
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 86 ++++++++++++++++--------------
 1 file changed, 45 insertions(+), 41 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 6dcbc44394d9..465c8904b233 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -2709,44 +2709,46 @@ int amdgpu_device_suspend(struct drm_device *dev, bool suspend, bool fbcon)
 			drm_helper_connector_dpms(connector, DRM_MODE_DPMS_OFF);
 		}
 		drm_modeset_unlock_all(dev);
-	}
-
-	amdgpu_amdkfd_suspend(adev);
-
-	/* unpin the front buffers and cursors */
-	list_for_each_entry(crtc, &dev->mode_config.crtc_list, head) {
-		struct amdgpu_crtc *amdgpu_crtc = to_amdgpu_crtc(crtc);
-		struct drm_framebuffer *fb = crtc->primary->fb;
-		struct amdgpu_bo *robj;
-
-		if (amdgpu_crtc->cursor_bo) {
-			struct amdgpu_bo *aobj = gem_to_amdgpu_bo(amdgpu_crtc->cursor_bo);
-			r = amdgpu_bo_reserve(aobj, true);
-			if (r == 0) {
-				amdgpu_bo_unpin(aobj);
-				amdgpu_bo_unreserve(aobj);
+			/* unpin the front buffers and cursors */
+		list_for_each_entry(crtc, &dev->mode_config.crtc_list, head) {
+			struct amdgpu_crtc *amdgpu_crtc = to_amdgpu_crtc(crtc);
+			struct drm_framebuffer *fb = crtc->primary->fb;
+			struct amdgpu_bo *robj;
+
+			if (amdgpu_crtc->cursor_bo) {
+				struct amdgpu_bo *aobj = gem_to_amdgpu_bo(amdgpu_crtc->cursor_bo);
+				r = amdgpu_bo_reserve(aobj, true);
+				if (r == 0) {
+					amdgpu_bo_unpin(aobj);
+					amdgpu_bo_unreserve(aobj);
+				}
 			}
-		}
 
-		if (fb == NULL || fb->obj[0] == NULL) {
-			continue;
-		}
-		robj = gem_to_amdgpu_bo(fb->obj[0]);
-		/* don't unpin kernel fb objects */
-		if (!amdgpu_fbdev_robj_is_fb(adev, robj)) {
-			r = amdgpu_bo_reserve(robj, true);
-			if (r == 0) {
-				amdgpu_bo_unpin(robj);
-				amdgpu_bo_unreserve(robj);
+			if (fb == NULL || fb->obj[0] == NULL) {
+				continue;
+			}
+			robj = gem_to_amdgpu_bo(fb->obj[0]);
+			/* don't unpin kernel fb objects */
+			if (!amdgpu_fbdev_robj_is_fb(adev, robj)) {
+				r = amdgpu_bo_reserve(robj, true);
+				if (r == 0) {
+					amdgpu_bo_unpin(robj);
+					amdgpu_bo_unreserve(robj);
+				}
 			}
 		}
 	}
+
+	amdgpu_amdkfd_suspend(adev);
+
+	r = amdgpu_device_ip_suspend_phase1(adev);
+
 	/* evict vram memory */
 	amdgpu_bo_evict_vram(adev);
 
 	amdgpu_fence_driver_suspend(adev);
 
-	r = amdgpu_device_ip_suspend(adev);
+	r = amdgpu_device_ip_suspend_phase2(adev);
 
 	/* evict remaining vram memory
 	 * This second call to evict vram is to evict the gart page table
@@ -2824,19 +2826,21 @@ int amdgpu_device_resume(struct drm_device *dev, bool resume, bool fbcon)
 	if (r)
 		goto unlock;
 
-	/* pin cursors */
-	list_for_each_entry(crtc, &dev->mode_config.crtc_list, head) {
-		struct amdgpu_crtc *amdgpu_crtc = to_amdgpu_crtc(crtc);
-
-		if (amdgpu_crtc->cursor_bo) {
-			struct amdgpu_bo *aobj = gem_to_amdgpu_bo(amdgpu_crtc->cursor_bo);
-			r = amdgpu_bo_reserve(aobj, true);
-			if (r == 0) {
-				r = amdgpu_bo_pin(aobj, AMDGPU_GEM_DOMAIN_VRAM);
-				if (r != 0)
-					DRM_ERROR("Failed to pin cursor BO (%d)\n", r);
-				amdgpu_crtc->cursor_addr = amdgpu_bo_gpu_offset(aobj);
-				amdgpu_bo_unreserve(aobj);
+	if (!amdgpu_device_has_dc_support(adev)) {
+		/* pin cursors */
+		list_for_each_entry(crtc, &dev->mode_config.crtc_list, head) {
+			struct amdgpu_crtc *amdgpu_crtc = to_amdgpu_crtc(crtc);
+
+			if (amdgpu_crtc->cursor_bo) {
+				struct amdgpu_bo *aobj = gem_to_amdgpu_bo(amdgpu_crtc->cursor_bo);
+				r = amdgpu_bo_reserve(aobj, true);
+				if (r == 0) {
+					r = amdgpu_bo_pin(aobj, AMDGPU_GEM_DOMAIN_VRAM);
+					if (r != 0)
+						DRM_ERROR("Failed to pin cursor BO (%d)\n", r);
+					amdgpu_crtc->cursor_addr = amdgpu_bo_gpu_offset(aobj);
+					amdgpu_bo_unreserve(aobj);
+				}
 			}
 		}
 	}
-- 
2.13.6


[-- Attachment #3: 0001-drm-amdgpu-split-ip-suspend-into-2-phases.patch --]
[-- Type: text/x-patch, Size: 4412 bytes --]

From 57fe101b1272653b63aa2fdcac8de1a425233559 Mon Sep 17 00:00:00 2001
From: Alex Deucher <alexander.deucher@amd.com>
Date: Thu, 19 Jul 2018 13:10:07 -0500
Subject: [PATCH 1/2] drm/amdgpu: split ip suspend into 2 phases

We need to do some IPs earlier to deal with ordering issues
similar to how resume is split into two phases. Do DCE first
to deal with atomic, then do the rest.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 78 +++++++++++++++++++++++++++++-
 1 file changed, 76 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 99a867fd7031..6dcbc44394d9 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -1926,7 +1926,7 @@ static void amdgpu_device_ip_late_init_func_handler(struct work_struct *work)
 }
 
 /**
- * amdgpu_device_ip_suspend - run suspend for hardware IPs
+ * amdgpu_device_ip_suspend_phase1 - run suspend for hardware IPs (phase 1)
  *
  * @adev: amdgpu_device pointer
  *
@@ -1936,7 +1936,55 @@ static void amdgpu_device_ip_late_init_func_handler(struct work_struct *work)
  * in each IP into a state suitable for suspend.
  * Returns 0 on success, negative error code on failure.
  */
-int amdgpu_device_ip_suspend(struct amdgpu_device *adev)
+static int amdgpu_device_ip_suspend_phase1(struct amdgpu_device *adev)
+{
+	int i, r;
+
+	if (amdgpu_sriov_vf(adev))
+		amdgpu_virt_request_full_gpu(adev, false);
+
+	for (i = adev->num_ip_blocks - 1; i >= 0; i--) {
+		if (!adev->ip_blocks[i].status.valid)
+			continue;
+		/* displays are handled separately */
+		if (adev->ip_blocks[i].version->type == AMD_IP_BLOCK_TYPE_DCE) {
+			/* ungate blocks so that suspend can properly shut them down */
+			if (adev->ip_blocks[i].version->funcs->set_clockgating_state) {
+				r = adev->ip_blocks[i].version->funcs->set_clockgating_state((void *)adev,
+											     AMD_CG_STATE_UNGATE);
+				if (r) {
+					DRM_ERROR("set_clockgating_state(ungate) of IP block <%s> failed %d\n",
+						  adev->ip_blocks[i].version->funcs->name, r);
+				}
+			}
+			/* XXX handle errors */
+			r = adev->ip_blocks[i].version->funcs->suspend(adev);
+			/* XXX handle errors */
+			if (r) {
+				DRM_ERROR("suspend of IP block <%s> failed %d\n",
+					  adev->ip_blocks[i].version->funcs->name, r);
+			}
+		}
+	}
+
+	if (amdgpu_sriov_vf(adev))
+		amdgpu_virt_release_full_gpu(adev, false);
+
+	return 0;
+}
+
+/**
+ * amdgpu_device_ip_suspend_phase2 - run suspend for hardware IPs (phase 2)
+ *
+ * @adev: amdgpu_device pointer
+ *
+ * Main suspend function for hardware IPs.  The list of all the hardware
+ * IPs that make up the asic is walked, clockgating is disabled and the
+ * suspend callbacks are run.  suspend puts the hardware and software state
+ * in each IP into a state suitable for suspend.
+ * Returns 0 on success, negative error code on failure.
+ */
+static int amdgpu_device_ip_suspend_phase2(struct amdgpu_device *adev)
 {
 	int i, r;
 
@@ -1957,6 +2005,9 @@ int amdgpu_device_ip_suspend(struct amdgpu_device *adev)
 	for (i = adev->num_ip_blocks - 1; i >= 0; i--) {
 		if (!adev->ip_blocks[i].status.valid)
 			continue;
+		/* displays are handled in phase1 */
+		if (adev->ip_blocks[i].version->type == AMD_IP_BLOCK_TYPE_DCE)
+			continue;
 		/* ungate blocks so that suspend can properly shut them down */
 		if (adev->ip_blocks[i].version->type != AMD_IP_BLOCK_TYPE_SMC &&
 			adev->ip_blocks[i].version->funcs->set_clockgating_state) {
@@ -1982,6 +2033,29 @@ int amdgpu_device_ip_suspend(struct amdgpu_device *adev)
 	return 0;
 }
 
+/**
+ * amdgpu_device_ip_suspend - run suspend for hardware IPs
+ *
+ * @adev: amdgpu_device pointer
+ *
+ * Main suspend function for hardware IPs.  The list of all the hardware
+ * IPs that make up the asic is walked, clockgating is disabled and the
+ * suspend callbacks are run.  suspend puts the hardware and software state
+ * in each IP into a state suitable for suspend.
+ * Returns 0 on success, negative error code on failure.
+ */
+int amdgpu_device_ip_suspend(struct amdgpu_device *adev)
+{
+	int r;
+
+	r = amdgpu_device_ip_suspend_phase1(adev);
+	if (r)
+		return r;
+	r = amdgpu_device_ip_suspend_phase2(adev);
+
+	return r;
+}
+
 static int amdgpu_device_ip_reinit_early_sriov(struct amdgpu_device *adev)
 {
 	int i, r;
-- 
2.13.6


[-- Attachment #4: Type: text/plain, Size: 154 bytes --]

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH] drm/amdgpu: Fix S3 resume failre.
       [not found]                             ` <CADnq5_Om34fPU5uY_ChVv7b2jAJcP36-Auv6q-2a=Jf=d_L_KA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2018-07-19 19:37                               ` Harry Wentland
       [not found]                                 ` <6d0bf937-a857-11d7-3472-9b9e848b6576-5C7GfCeVMHo@public.gmane.org>
  2018-07-20 13:52                               ` Andrey Grodzovsky
  1 sibling, 1 reply; 15+ messages in thread
From: Harry Wentland @ 2018-07-19 19:37 UTC (permalink / raw)
  To: Alex Deucher, Andrey Grodzovsky
  Cc: Deucher, Alexander, Michel Dänzer, amd-gfx list,
	Christian Koenig, Ville Syrjälä

On 2018-07-19 02:30 PM, Alex Deucher wrote:
> On Thu, Jul 19, 2018 at 1:07 PM, Andrey Grodzovsky
> <Andrey.Grodzovsky@amd.com> wrote:
>>
>>
>> On 07/19/2018 12:59 PM, Michel Dänzer wrote:
>>>
>>> On 2018-07-19 06:53 PM, Andrey Grodzovsky wrote:
>>>>
>>>>
>>>> On 07/19/2018 12:47 PM, Michel Dänzer wrote:
>>>>>
>>>>> On 2018-07-19 06:33 PM, Andrey Grodzovsky wrote:
>>>>>>
>>>>>> On 07/19/2018 11:39 AM, Ville Syrjälä wrote:
>>>>>>>
>>>>>>> On Thu, Jul 19, 2018 at 11:19:56AM -0400, Andrey Grodzovsky wrote:
>>>>>>>>
>>>>>>>> Problem:
>>>>>>>> FB is still not unpinned during the first run of amdgpu_bo_evict_vram
>>>>>>>> and so it's left for the second run, but during second run the SDMA
>>>>>>>> for
>>>>>>>> moving buffer around already disabled and you have to do
>>>>>>>> it with CPU, but FB is not in visible VRAM and hence the eviction
>>>>>>>> failure
>>>>>>>> leading later to resume failure.
>>>>>>>>
>>>>>>>> Fix:
>>>>>>>> When DAL in use get a pointer to FB from crtc->primary->state rather
>>>>>>>> then from crtc->primary which is not set for DAL since it supports
>>>>>>>> atomic KMS.
>>>>>>>>
>>>>>>>> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107065
>>>>>>>> Fixes e00fb85 drm: Stop updating plane->crtc/fb/old_fb on atomic
>>>>>>>> drivers
>>>>>>>> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
>>>>>>>> ---
>>>>>>>>     drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 3 ++-
>>>>>>>>     1 file changed, 2 insertions(+), 1 deletion(-)
>>>>>>>>
>>>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>>>>>> index 709e4a3..dd9ebf7 100644
>>>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>>>>>> @@ -2642,7 +2642,8 @@ int amdgpu_device_suspend(struct drm_device
>>>>>>>> *dev, bool suspend, bool fbcon)
>>>>>>>>         /* unpin the front buffers and cursors */
>>>>>>>>         list_for_each_entry(crtc, &dev->mode_config.crtc_list, head)
>>>>>>>> {
>>>>>>>>             struct amdgpu_crtc *amdgpu_crtc = to_amdgpu_crtc(crtc);
>>>>>>>> -        struct drm_framebuffer *fb = crtc->primary->fb;
>>>>>>>> +         struct drm_framebuffer *fb =
>>>>>>>> amdgpu_device_has_dc_support(adev) ?
>>>>>>>> +                 crtc->primary->state->fb : crtc->primary->fb;
>>>>>>>
>>>>>>> So apparently you haven't yet turned off the planes here. If I'm
>>>>>>> reading things right amdgpu_device_ip_suspend() should end up doing
>>>>>>> that through drm_atomic_helper_suspend(). So it looks like like now
>>>>>>> you'll end up unpinning the same bos twice. Doesn't that mess up
>>>>>>> some kind of refcount or something?
>>>>>>
>>>>>> amdgpu_bo_unpin has a guard against that, amdgpu_bo_unreserve is less
>>>>>> clear.
>>>>>
>>>>> BO reservation shouldn't an issue here, BOs are only reserved for a
>>>>> short time around (un)pinning them.
>>>>>
>>>>>
>>>>>>> To me it would seem better to susped the display before trying
>>>>>>> to evict the bos.
>>>>>>
>>>>>> Yea, i was aware of that and indeed DAL shouldn't rely on the code in
>>>>>> amdgpu_device_suspend to unpin
>>>>>> front buffer and cursor since the atomic helper should do it. Problem
>>>>>> is
>>>>>> that during amdgpu_device_ip_suspend
>>>>>> the SDMA engine gets suspended too, so you have to embed another
>>>>>> eviction in between, after display is suspended but before
>>>>>> SDMA and this forces ordering between them which kind of already in
>>>>>> place (amd_ip_block_type) but still it's an extra constrain.
>>>>>
>>>>> Ville's point (which I basically agree with) is that the display
>>>>> hardware should be turned off before evicting VRAM the first time, in
>>>>> which case no second eviction should be necessary (for this purpose).
>>>>
>>>> Display HW is turned off as part of all IPs in a loop inside
>>>> amdgpu_device_ip_suspend.
>>>> Are you suggesting to extract the  display HW turn off from inside
>>>> amdgpu_device_ip_suspend and place it
>>>> before the first call to amdgpu_bo_evict_vram ?
>>>
>>> In a nutshell, yes.
>>>
>>> Or maybe it would be easier to move the amdgpu_bo_evict_vram call down
>>> to somewhere called from amdgpu_device_ip_suspend?
>>
>>
>> I can move the BEFORE and AFTER calls to amdgpu_bo_evict_vram inside
>> amdgpu_device_ip_suspend
>> such that the first one is called AFTER display is shut off, while the
>> second in the very end of the function.
>> I am just not sure what's gonna be the side effect of evicting after bunch
>> of blocks (such as GMC) are already disabled.
> 
> How about something like the attached patches?  Basically split the ip
> suspend sequence in two like we do for resume.
> 

Patches are
Acked-by: Harry Wentland <harry.wentland@amd.com>

Harry

> Alex
> 
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] drm/amdgpu: Fix S3 resume failre.
       [not found]                                 ` <6d0bf937-a857-11d7-3472-9b9e848b6576-5C7GfCeVMHo@public.gmane.org>
@ 2018-07-19 20:17                                   ` Andrey Grodzovsky
       [not found]                                     ` <99b55a5d-fc78-c319-4c3e-cc6aa0478f7a-5C7GfCeVMHo@public.gmane.org>
  0 siblings, 1 reply; 15+ messages in thread
From: Andrey Grodzovsky @ 2018-07-19 20:17 UTC (permalink / raw)
  To: Harry Wentland, Alex Deucher
  Cc: Deucher, Alexander, Ville Syrjälä,
	Michel Dänzer, Christian Koenig, amd-gfx list



On 07/19/2018 03:37 PM, Harry Wentland wrote:
> On 2018-07-19 02:30 PM, Alex Deucher wrote:
>> On Thu, Jul 19, 2018 at 1:07 PM, Andrey Grodzovsky
>> <Andrey.Grodzovsky@amd.com> wrote:
>>>
>>> On 07/19/2018 12:59 PM, Michel Dänzer wrote:
>>>> On 2018-07-19 06:53 PM, Andrey Grodzovsky wrote:
>>>>>
>>>>> On 07/19/2018 12:47 PM, Michel Dänzer wrote:
>>>>>> On 2018-07-19 06:33 PM, Andrey Grodzovsky wrote:
>>>>>>> On 07/19/2018 11:39 AM, Ville Syrjälä wrote:
>>>>>>>> On Thu, Jul 19, 2018 at 11:19:56AM -0400, Andrey Grodzovsky wrote:
>>>>>>>>> Problem:
>>>>>>>>> FB is still not unpinned during the first run of amdgpu_bo_evict_vram
>>>>>>>>> and so it's left for the second run, but during second run the SDMA
>>>>>>>>> for
>>>>>>>>> moving buffer around already disabled and you have to do
>>>>>>>>> it with CPU, but FB is not in visible VRAM and hence the eviction
>>>>>>>>> failure
>>>>>>>>> leading later to resume failure.
>>>>>>>>>
>>>>>>>>> Fix:
>>>>>>>>> When DAL in use get a pointer to FB from crtc->primary->state rather
>>>>>>>>> then from crtc->primary which is not set for DAL since it supports
>>>>>>>>> atomic KMS.
>>>>>>>>>
>>>>>>>>> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107065
>>>>>>>>> Fixes e00fb85 drm: Stop updating plane->crtc/fb/old_fb on atomic
>>>>>>>>> drivers
>>>>>>>>> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
>>>>>>>>> ---
>>>>>>>>>      drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 3 ++-
>>>>>>>>>      1 file changed, 2 insertions(+), 1 deletion(-)
>>>>>>>>>
>>>>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>>>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>>>>>>> index 709e4a3..dd9ebf7 100644
>>>>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>>>>>>> @@ -2642,7 +2642,8 @@ int amdgpu_device_suspend(struct drm_device
>>>>>>>>> *dev, bool suspend, bool fbcon)
>>>>>>>>>          /* unpin the front buffers and cursors */
>>>>>>>>>          list_for_each_entry(crtc, &dev->mode_config.crtc_list, head)
>>>>>>>>> {
>>>>>>>>>              struct amdgpu_crtc *amdgpu_crtc = to_amdgpu_crtc(crtc);
>>>>>>>>> -        struct drm_framebuffer *fb = crtc->primary->fb;
>>>>>>>>> +         struct drm_framebuffer *fb =
>>>>>>>>> amdgpu_device_has_dc_support(adev) ?
>>>>>>>>> +                 crtc->primary->state->fb : crtc->primary->fb;
>>>>>>>> So apparently you haven't yet turned off the planes here. If I'm
>>>>>>>> reading things right amdgpu_device_ip_suspend() should end up doing
>>>>>>>> that through drm_atomic_helper_suspend(). So it looks like like now
>>>>>>>> you'll end up unpinning the same bos twice. Doesn't that mess up
>>>>>>>> some kind of refcount or something?
>>>>>>> amdgpu_bo_unpin has a guard against that, amdgpu_bo_unreserve is less
>>>>>>> clear.
>>>>>> BO reservation shouldn't an issue here, BOs are only reserved for a
>>>>>> short time around (un)pinning them.
>>>>>>
>>>>>>
>>>>>>>> To me it would seem better to susped the display before trying
>>>>>>>> to evict the bos.
>>>>>>> Yea, i was aware of that and indeed DAL shouldn't rely on the code in
>>>>>>> amdgpu_device_suspend to unpin
>>>>>>> front buffer and cursor since the atomic helper should do it. Problem
>>>>>>> is
>>>>>>> that during amdgpu_device_ip_suspend
>>>>>>> the SDMA engine gets suspended too, so you have to embed another
>>>>>>> eviction in between, after display is suspended but before
>>>>>>> SDMA and this forces ordering between them which kind of already in
>>>>>>> place (amd_ip_block_type) but still it's an extra constrain.
>>>>>> Ville's point (which I basically agree with) is that the display
>>>>>> hardware should be turned off before evicting VRAM the first time, in
>>>>>> which case no second eviction should be necessary (for this purpose).
>>>>> Display HW is turned off as part of all IPs in a loop inside
>>>>> amdgpu_device_ip_suspend.
>>>>> Are you suggesting to extract the  display HW turn off from inside
>>>>> amdgpu_device_ip_suspend and place it
>>>>> before the first call to amdgpu_bo_evict_vram ?
>>>> In a nutshell, yes.
>>>>
>>>> Or maybe it would be easier to move the amdgpu_bo_evict_vram call down
>>>> to somewhere called from amdgpu_device_ip_suspend?
>>>
>>> I can move the BEFORE and AFTER calls to amdgpu_bo_evict_vram inside
>>> amdgpu_device_ip_suspend
>>> such that the first one is called AFTER display is shut off, while the
>>> second in the very end of the function.
>>> I am just not sure what's gonna be the side effect of evicting after bunch
>>> of blocks (such as GMC) are already disabled.
>> How about something like the attached patches?  Basically split the ip
>> suspend sequence in two like we do for resume.
>>
> Patches are
> Acked-by: Harry Wentland <harry.wentland@amd.com>
>
> Harry
>
>> Alex

Patches look good indeed but on second S3 in a raw i get a warning in 
dma_fence_is_later
about fence contexts not equal. I will have to take a look why is that.

Andrey

>>
> _______________________________________________
> amd-gfx mailing list
> amd-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/amd-gfx

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] drm/amdgpu: Fix S3 resume failre.
       [not found]                                     ` <99b55a5d-fc78-c319-4c3e-cc6aa0478f7a-5C7GfCeVMHo@public.gmane.org>
@ 2018-07-19 22:45                                       ` Andrey Grodzovsky
  0 siblings, 0 replies; 15+ messages in thread
From: Andrey Grodzovsky @ 2018-07-19 22:45 UTC (permalink / raw)
  To: Harry Wentland, Alex Deucher
  Cc: Deucher, Alexander, Ville Syrjälä,
	Michel Dänzer, Christian Koenig, amd-gfx list



On 07/19/2018 04:17 PM, Andrey Grodzovsky wrote:
>
>
> On 07/19/2018 03:37 PM, Harry Wentland wrote:
>> On 2018-07-19 02:30 PM, Alex Deucher wrote:
>>> On Thu, Jul 19, 2018 at 1:07 PM, Andrey Grodzovsky
>>> <Andrey.Grodzovsky@amd.com> wrote:
>>>>
>>>> On 07/19/2018 12:59 PM, Michel Dänzer wrote:
>>>>> On 2018-07-19 06:53 PM, Andrey Grodzovsky wrote:
>>>>>>
>>>>>> On 07/19/2018 12:47 PM, Michel Dänzer wrote:
>>>>>>> On 2018-07-19 06:33 PM, Andrey Grodzovsky wrote:
>>>>>>>> On 07/19/2018 11:39 AM, Ville Syrjälä wrote:
>>>>>>>>> On Thu, Jul 19, 2018 at 11:19:56AM -0400, Andrey Grodzovsky 
>>>>>>>>> wrote:
>>>>>>>>>> Problem:
>>>>>>>>>> FB is still not unpinned during the first run of 
>>>>>>>>>> amdgpu_bo_evict_vram
>>>>>>>>>> and so it's left for the second run, but during second run 
>>>>>>>>>> the SDMA
>>>>>>>>>> for
>>>>>>>>>> moving buffer around already disabled and you have to do
>>>>>>>>>> it with CPU, but FB is not in visible VRAM and hence the 
>>>>>>>>>> eviction
>>>>>>>>>> failure
>>>>>>>>>> leading later to resume failure.
>>>>>>>>>>
>>>>>>>>>> Fix:
>>>>>>>>>> When DAL in use get a pointer to FB from crtc->primary->state 
>>>>>>>>>> rather
>>>>>>>>>> then from crtc->primary which is not set for DAL since it 
>>>>>>>>>> supports
>>>>>>>>>> atomic KMS.
>>>>>>>>>>
>>>>>>>>>> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107065
>>>>>>>>>> Fixes e00fb85 drm: Stop updating plane->crtc/fb/old_fb on atomic
>>>>>>>>>> drivers
>>>>>>>>>> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
>>>>>>>>>> ---
>>>>>>>>>>      drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 3 ++-
>>>>>>>>>>      1 file changed, 2 insertions(+), 1 deletion(-)
>>>>>>>>>>
>>>>>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>>>>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>>>>>>>> index 709e4a3..dd9ebf7 100644
>>>>>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>>>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>>>>>>>> @@ -2642,7 +2642,8 @@ int amdgpu_device_suspend(struct 
>>>>>>>>>> drm_device
>>>>>>>>>> *dev, bool suspend, bool fbcon)
>>>>>>>>>>          /* unpin the front buffers and cursors */
>>>>>>>>>>          list_for_each_entry(crtc, 
>>>>>>>>>> &dev->mode_config.crtc_list, head)
>>>>>>>>>> {
>>>>>>>>>>              struct amdgpu_crtc *amdgpu_crtc = 
>>>>>>>>>> to_amdgpu_crtc(crtc);
>>>>>>>>>> -        struct drm_framebuffer *fb = crtc->primary->fb;
>>>>>>>>>> +         struct drm_framebuffer *fb =
>>>>>>>>>> amdgpu_device_has_dc_support(adev) ?
>>>>>>>>>> + crtc->primary->state->fb : crtc->primary->fb;
>>>>>>>>> So apparently you haven't yet turned off the planes here. If I'm
>>>>>>>>> reading things right amdgpu_device_ip_suspend() should end up 
>>>>>>>>> doing
>>>>>>>>> that through drm_atomic_helper_suspend(). So it looks like 
>>>>>>>>> like now
>>>>>>>>> you'll end up unpinning the same bos twice. Doesn't that mess up
>>>>>>>>> some kind of refcount or something?
>>>>>>>> amdgpu_bo_unpin has a guard against that, amdgpu_bo_unreserve 
>>>>>>>> is less
>>>>>>>> clear.
>>>>>>> BO reservation shouldn't an issue here, BOs are only reserved for a
>>>>>>> short time around (un)pinning them.
>>>>>>>
>>>>>>>
>>>>>>>>> To me it would seem better to susped the display before trying
>>>>>>>>> to evict the bos.
>>>>>>>> Yea, i was aware of that and indeed DAL shouldn't rely on the 
>>>>>>>> code in
>>>>>>>> amdgpu_device_suspend to unpin
>>>>>>>> front buffer and cursor since the atomic helper should do it. 
>>>>>>>> Problem
>>>>>>>> is
>>>>>>>> that during amdgpu_device_ip_suspend
>>>>>>>> the SDMA engine gets suspended too, so you have to embed another
>>>>>>>> eviction in between, after display is suspended but before
>>>>>>>> SDMA and this forces ordering between them which kind of 
>>>>>>>> already in
>>>>>>>> place (amd_ip_block_type) but still it's an extra constrain.
>>>>>>> Ville's point (which I basically agree with) is that the display
>>>>>>> hardware should be turned off before evicting VRAM the first 
>>>>>>> time, in
>>>>>>> which case no second eviction should be necessary (for this 
>>>>>>> purpose).
>>>>>> Display HW is turned off as part of all IPs in a loop inside
>>>>>> amdgpu_device_ip_suspend.
>>>>>> Are you suggesting to extract the  display HW turn off from inside
>>>>>> amdgpu_device_ip_suspend and place it
>>>>>> before the first call to amdgpu_bo_evict_vram ?
>>>>> In a nutshell, yes.
>>>>>
>>>>> Or maybe it would be easier to move the amdgpu_bo_evict_vram call 
>>>>> down
>>>>> to somewhere called from amdgpu_device_ip_suspend?
>>>>
>>>> I can move the BEFORE and AFTER calls to amdgpu_bo_evict_vram inside
>>>> amdgpu_device_ip_suspend
>>>> such that the first one is called AFTER display is shut off, while the
>>>> second in the very end of the function.
>>>> I am just not sure what's gonna be the side effect of evicting 
>>>> after bunch
>>>> of blocks (such as GMC) are already disabled.
>>> How about something like the attached patches?  Basically split the ip
>>> suspend sequence in two like we do for resume.
>>>
>> Patches are
>> Acked-by: Harry Wentland <harry.wentland@amd.com>
>>
>> Harry
>>
>>> Alex
>
> Patches look good indeed but on second S3 in a raw i get a warning in 
> dma_fence_is_later
> about fence contexts not equal. I will have to take a look why is that.
>
> Andrey

Seems like amdgpu_ttm_set_buffer_funcs_status destroys adev->mman.entity 
on suspend without releasing
adev->mman.bdev.man[TTM_PL_VRAM].move fence so on resume the new 
drm_sched_entity.fence_context causes
the warning against the old fence context which is different.
BTW, happens with my original change to, I just haven't noticed.

Andrey

>
>>>
>> _______________________________________________
>> amd-gfx mailing list
>> amd-gfx@lists.freedesktop.org
>> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
>

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] drm/amdgpu: Fix S3 resume failre.
       [not found]                             ` <CADnq5_Om34fPU5uY_ChVv7b2jAJcP36-Auv6q-2a=Jf=d_L_KA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  2018-07-19 19:37                               ` Harry Wentland
@ 2018-07-20 13:52                               ` Andrey Grodzovsky
  1 sibling, 0 replies; 15+ messages in thread
From: Andrey Grodzovsky @ 2018-07-20 13:52 UTC (permalink / raw)
  To: Alex Deucher
  Cc: Michel Dänzer, amd-gfx list, Deucher, Alexander, Wentland,
	Harry, Christian Koenig, Ville Syrjälä



On 07/19/2018 02:30 PM, Alex Deucher wrote:
> On Thu, Jul 19, 2018 at 1:07 PM, Andrey Grodzovsky
> <Andrey.Grodzovsky@amd.com> wrote:
>>
>> On 07/19/2018 12:59 PM, Michel Dänzer wrote:
>>> On 2018-07-19 06:53 PM, Andrey Grodzovsky wrote:
>>>>
>>>> On 07/19/2018 12:47 PM, Michel Dänzer wrote:
>>>>> On 2018-07-19 06:33 PM, Andrey Grodzovsky wrote:
>>>>>> On 07/19/2018 11:39 AM, Ville Syrjälä wrote:
>>>>>>> On Thu, Jul 19, 2018 at 11:19:56AM -0400, Andrey Grodzovsky wrote:
>>>>>>>> Problem:
>>>>>>>> FB is still not unpinned during the first run of amdgpu_bo_evict_vram
>>>>>>>> and so it's left for the second run, but during second run the SDMA
>>>>>>>> for
>>>>>>>> moving buffer around already disabled and you have to do
>>>>>>>> it with CPU, but FB is not in visible VRAM and hence the eviction
>>>>>>>> failure
>>>>>>>> leading later to resume failure.
>>>>>>>>
>>>>>>>> Fix:
>>>>>>>> When DAL in use get a pointer to FB from crtc->primary->state rather
>>>>>>>> then from crtc->primary which is not set for DAL since it supports
>>>>>>>> atomic KMS.
>>>>>>>>
>>>>>>>> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107065
>>>>>>>> Fixes e00fb85 drm: Stop updating plane->crtc/fb/old_fb on atomic
>>>>>>>> drivers
>>>>>>>> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
>>>>>>>> ---
>>>>>>>>      drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 3 ++-
>>>>>>>>      1 file changed, 2 insertions(+), 1 deletion(-)
>>>>>>>>
>>>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>>>>>> index 709e4a3..dd9ebf7 100644
>>>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>>>>>> @@ -2642,7 +2642,8 @@ int amdgpu_device_suspend(struct drm_device
>>>>>>>> *dev, bool suspend, bool fbcon)
>>>>>>>>          /* unpin the front buffers and cursors */
>>>>>>>>          list_for_each_entry(crtc, &dev->mode_config.crtc_list, head)
>>>>>>>> {
>>>>>>>>              struct amdgpu_crtc *amdgpu_crtc = to_amdgpu_crtc(crtc);
>>>>>>>> -        struct drm_framebuffer *fb = crtc->primary->fb;
>>>>>>>> +         struct drm_framebuffer *fb =
>>>>>>>> amdgpu_device_has_dc_support(adev) ?
>>>>>>>> +                 crtc->primary->state->fb : crtc->primary->fb;
>>>>>>> So apparently you haven't yet turned off the planes here. If I'm
>>>>>>> reading things right amdgpu_device_ip_suspend() should end up doing
>>>>>>> that through drm_atomic_helper_suspend(). So it looks like like now
>>>>>>> you'll end up unpinning the same bos twice. Doesn't that mess up
>>>>>>> some kind of refcount or something?
>>>>>> amdgpu_bo_unpin has a guard against that, amdgpu_bo_unreserve is less
>>>>>> clear.
>>>>> BO reservation shouldn't an issue here, BOs are only reserved for a
>>>>> short time around (un)pinning them.
>>>>>
>>>>>
>>>>>>> To me it would seem better to susped the display before trying
>>>>>>> to evict the bos.
>>>>>> Yea, i was aware of that and indeed DAL shouldn't rely on the code in
>>>>>> amdgpu_device_suspend to unpin
>>>>>> front buffer and cursor since the atomic helper should do it. Problem
>>>>>> is
>>>>>> that during amdgpu_device_ip_suspend
>>>>>> the SDMA engine gets suspended too, so you have to embed another
>>>>>> eviction in between, after display is suspended but before
>>>>>> SDMA and this forces ordering between them which kind of already in
>>>>>> place (amd_ip_block_type) but still it's an extra constrain.
>>>>> Ville's point (which I basically agree with) is that the display
>>>>> hardware should be turned off before evicting VRAM the first time, in
>>>>> which case no second eviction should be necessary (for this purpose).
>>>> Display HW is turned off as part of all IPs in a loop inside
>>>> amdgpu_device_ip_suspend.
>>>> Are you suggesting to extract the  display HW turn off from inside
>>>> amdgpu_device_ip_suspend and place it
>>>> before the first call to amdgpu_bo_evict_vram ?
>>> In a nutshell, yes.
>>>
>>> Or maybe it would be easier to move the amdgpu_bo_evict_vram call down
>>> to somewhere called from amdgpu_device_ip_suspend?
>>
>> I can move the BEFORE and AFTER calls to amdgpu_bo_evict_vram inside
>> amdgpu_device_ip_suspend
>> such that the first one is called AFTER display is shut off, while the
>> second in the very end of the function.
>> I am just not sure what's gonna be the side effect of evicting after bunch
>> of blocks (such as GMC) are already disabled.
> How about something like the attached patches?  Basically split the ip
> suspend sequence in two like we do for resume.
>
> Alex

Please carry over the regression  info and the related Bugzilla ticket 
into the comments here -

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107065
Fixes e00fb85 drm: Stop updating plane->crtc/fb/old_fb on atomic drivers

Since the warning I observed is not specific to your change please push 
this and I will handle the warning
separately.

Reviewed-and-tested-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>

Andrey


Andrey
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2018-07-20 13:52 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-07-19 15:19 [PATCH] drm/amdgpu: Fix S3 resume failre Andrey Grodzovsky
     [not found] ` <1532013596-26662-1-git-send-email-andrey.grodzovsky-5C7GfCeVMHo@public.gmane.org>
2018-07-19 15:25   ` Christian König
     [not found]     ` <fb787740-8201-8d27-4ee2-dfcc8d732bc7-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
2018-07-19 15:32       ` Harry Wentland
2018-07-19 16:34       ` Andrey Grodzovsky
2018-07-19 15:39   ` Ville Syrjälä
     [not found]     ` <20180719153902.GA5565-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
2018-07-19 16:33       ` Andrey Grodzovsky
     [not found]         ` <86de8654-1bff-ad56-9e72-9ec1eb2e7f76-5C7GfCeVMHo@public.gmane.org>
2018-07-19 16:47           ` Michel Dänzer
     [not found]             ` <13eee582-a100-4c36-e3c6-ad6f96f40efd-otUistvHUpPR7s880joybQ@public.gmane.org>
2018-07-19 16:53               ` Andrey Grodzovsky
     [not found]                 ` <020d0b20-32f5-8974-06a7-5ef015ddca87-5C7GfCeVMHo@public.gmane.org>
2018-07-19 16:59                   ` Michel Dänzer
     [not found]                     ` <455ebec7-b2cf-0f8e-3fbc-be192d0913c0-otUistvHUpPR7s880joybQ@public.gmane.org>
2018-07-19 17:07                       ` Andrey Grodzovsky
     [not found]                         ` <7406a2c5-56ed-15f0-7007-25a806218a79-5C7GfCeVMHo@public.gmane.org>
2018-07-19 18:30                           ` Alex Deucher
     [not found]                             ` <CADnq5_Om34fPU5uY_ChVv7b2jAJcP36-Auv6q-2a=Jf=d_L_KA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2018-07-19 19:37                               ` Harry Wentland
     [not found]                                 ` <6d0bf937-a857-11d7-3472-9b9e848b6576-5C7GfCeVMHo@public.gmane.org>
2018-07-19 20:17                                   ` Andrey Grodzovsky
     [not found]                                     ` <99b55a5d-fc78-c319-4c3e-cc6aa0478f7a-5C7GfCeVMHo@public.gmane.org>
2018-07-19 22:45                                       ` Andrey Grodzovsky
2018-07-20 13:52                               ` Andrey Grodzovsky

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.