Re: [PATCH v2 1/2] drm/amdgpu: enhance amdgpu_vcn_suspend

From: Leo Liu <leo.liu@amd.com>
To: "Zhu, James" <James.Zhu@amd.com>,
	"amd-gfx@lists.freedesktop.org" <amd-gfx@lists.freedesktop.org>
Subject: Re: [PATCH v2 1/2] drm/amdgpu: enhance amdgpu_vcn_suspend
Date: Mon, 17 May 2021 14:15:24 -0400
Message-ID: <a09ac369-8d0c-5fc1-77c9-070498143861@amd.com>
In-Reply-To: <DM5PR12MB25173E8B288010950417C2E2E42D9@DM5PR12MB2517.namprd12.prod.outlook.com>

The saved data come from the engine cache. They reflect the engine's 
runtime state before suspend, which might be different after the engine 
has been powered off.
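
The ordering concern can be illustrated with a small user-space model. This is only a sketch: `fake_engine`, `fake_power_gate` and `fake_save` are hypothetical stand-ins, not amdgpu code, and the model assumes that power gating wipes the runtime state (on real hardware the state "might be different", which is enough to make a late save unreliable).

```c
#include <string.h>

#define CACHE_WORDS 4

/* Hypothetical stand-in for a VCN instance; not the amdgpu structures. */
struct fake_engine {
    int powered;                  /* 1 while the engine is ungated */
    unsigned cache[CACHE_WORDS];  /* runtime state held by the engine */
};

/* Modelling assumption: power gating loses the runtime state. */
static void fake_power_gate(struct fake_engine *e)
{
    e->powered = 0;
    memset(e->cache, 0, sizeof(e->cache));
}

/* Copy the engine state into a save buffer, as the save_bo step would. */
static void fake_save(const struct fake_engine *e, unsigned *saved)
{
    memcpy(saved, e->cache, sizeof(e->cache));
}

/* Returns 1 if the saved copy still matches the pre-suspend state. */
static int suspend_preserves_state(int save_before_gate)
{
    struct fake_engine e = { 1, { 0x11, 0x22, 0x33, 0x44 } };
    unsigned before[CACHE_WORDS], saved[CACHE_WORDS];

    memcpy(before, e.cache, sizeof(before));
    if (save_before_gate) {
        fake_save(&e, saved);
        fake_power_gate(&e);
    } else {
        fake_power_gate(&e);
        fake_save(&e, saved);
    }
    return memcmp(before, saved, sizeof(before)) == 0;
}
```

In this model, `suspend_preserves_state(1)` keeps the pre-suspend contents, while `suspend_preserves_state(0)` saves whatever is left after the gate.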

Regards,

Leo

On 2021-05-17 2:11 p.m., Zhu, James wrote:
>
> [AMD Official Use Only - Internal Distribution Only]
>
>
> save_bo doesn't need VCN to be ungated; it just keeps data in memory.
>
> Thanks & Best Regards!
>
>
> James Zhu
>
> ------------------------------------------------------------------------
> *From:* Liu, Leo <Leo.Liu@amd.com>
> *Sent:* Monday, May 17, 2021 2:07 PM
> *To:* Zhu, James <James.Zhu@amd.com>; 
> amd-gfx@lists.freedesktop.org <amd-gfx@lists.freedesktop.org>
> *Subject:* Re: [PATCH v2 1/2] drm/amdgpu: enhance amdgpu_vcn_suspend
>
> Definitely, we need to move cancel_delayed_work_sync to before the 
> power gate.
>
> Should "save_bo" be step 4, before the power gate?
>
> Regards,
>
> Leo
>
>
> On 2021-05-17 1:59 p.m., James Zhu wrote:
>>
>> Then we forgot the proposal I provided before.
>>
>> I think the sequence below may fix the race condition that we are 
>> facing.
>>
>> 1. stop scheduling new jobs
>>
>>     for (i = 0; i < adev->vcn.num_vcn_inst; ++i) {
>>         if (adev->vcn.harvest_config & (1 << i))
>>             continue;
>>
>>         ring = &adev->vcn.inst[i].ring_dec;
>>         ring->sched.ready = false;
>>
>>         for (j = 0; j < adev->vcn.num_enc_rings; ++j) {
>>             ring = &adev->vcn.inst[i].ring_enc[j];
>>             ring->sched.ready = false;
>>         }
>>     }
>>
>> 2. cancel_delayed_work_sync(&adev->vcn.idle_work);
>>
>> 3. SOC15_WAIT_ON_RREG(VCN, inst_idx, mmUVD_POWER_STATUS, 1,
>>          UVD_POWER_STATUS__UVD_POWER_STATUS_MASK);
>>
>> 4. amdgpu_device_ip_set_powergating_state(adev, 
>> AMD_IP_BLOCK_TYPE_VCN,   AMD_PG_STATE_GATE);
>>
>> 5. save_bo
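
Steps 1 to 4 above, together with the bounded wait discussed further down the thread, can be sketched as a user-space model. Everything here is an assumption for illustration: `fake_vcn`, `fake_wait_idle` and `fake_suspend` are hypothetical names, the poll counter stands in for reading mmUVD_POWER_STATUS, and the save step is left out because its placement is still under discussion.

```c
#include <stdbool.h>

#define NUM_RINGS 3

/* Hypothetical model of the driver state; names are illustrative only. */
struct fake_vcn {
    bool ring_ready[NUM_RINGS];  /* mirrors ring->sched.ready */
    bool idle_work_pending;      /* mirrors the delayed idle_work */
    int  busy_polls_left;        /* polls until idle; negative means hung */
    bool gated;
    bool forced;                 /* set when gating followed a timeout */
};

/* Step 3: poll the modelled power status a bounded number of times. */
static bool fake_wait_idle(struct fake_vcn *v, int max_polls)
{
    for (int i = 0; i < max_polls; i++) {
        if (v->busy_polls_left == 0)
            return true;
        if (v->busy_polls_left > 0)
            v->busy_polls_left--;
    }
    return v->busy_polls_left == 0;
}

static void fake_suspend(struct fake_vcn *v, int max_polls)
{
    /* 1. stop scheduling new jobs on every ring */
    for (int i = 0; i < NUM_RINGS; i++)
        v->ring_ready[i] = false;

    /* 2. cancel the idle handler once */
    v->idle_work_pending = false;

    /* 3. wait for in-flight work, but give up after max_polls */
    bool idle = fake_wait_idle(v, max_polls);

    /* 4. gate either way; record whether the gate was forced */
    v->forced = !idle;
    v->gated = true;
}
```

A healthy engine drains within the poll budget and is gated normally; a hung one (negative `busy_polls_left`) is still gated, with `forced` set so the caller can warn.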
>>
>> Best Regards!
>>
>> James
>>
>> On 2021-05-17 1:43 p.m., Leo Liu wrote:
>>>
>>> On 2021-05-17 12:54 p.m., James Zhu wrote:
>>>> I am wondering what happens if some jobs are still left in the queue. 
>>>> With luck, the check sees
>>>
>>> Yes, it's possible. In this case the delayed handler is set, so 
>>> cancelling once is enough.
>>>
>>>
>>>>
>>>> UVD_POWER_STATUS done, but afterwards the fw starts a new job that is 
>>>> listed in the queue.
>>>>
>>>> To handle this situation perfectly, we need to add a mechanism to 
>>>> suspend the fw first.
>>>
>>> I think that should be handled by the sequence from 
>>> vcn_v3_0_stop_dpg_mode().
>>>
>>>
>>>>
>>>> Another case: if we are unlucky and the vcn fw hangs at that time, 
>>>> UVD_POWER_STATUS always stays busy. Then we need to force power 
>>>> gating of the vcn hw after waiting for a certain time.
>>>
>>> Yep, we still need to gate VCN power after a certain timeout.
>>>
>>>
>>> Regards,
>>>
>>> Leo
>>>
>>>
>>>
>>>>
>>>> Best Regards!
>>>>
>>>> James
>>>>
>>>> On 2021-05-17 12:34 p.m., Leo Liu wrote:
>>>>>
>>>>> On 2021-05-17 11:52 a.m., James Zhu wrote:
>>>>>> During vcn suspend, stop the rings from receiving new requests,
>>>>>> and try to wait for all vcn jobs to finish gracefully.
>>>>>>
>>>>>> v2: Force power gating of the vcn hardware after a few waiting retries.
>>>>>>
>>>>>> Signed-off-by: James Zhu <James.Zhu@amd.com>
>>>>>> ---
>>>>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c | 22 +++++++++++++++++++++-
>>>>>>   1 file changed, 21 insertions(+), 1 deletion(-)
>>>>>>
>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c
>>>>>> index 2016459..9f3a6e7 100644
>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c
>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c
>>>>>> @@ -275,9 +275,29 @@ int amdgpu_vcn_suspend(struct amdgpu_device *adev)
>>>>>>   {
>>>>>>       unsigned size;
>>>>>>       void *ptr;
>>>>>> +    int retry_max = 6;
>>>>>>       int i;
>>>>>> -    cancel_delayed_work_sync(&adev->vcn.idle_work);
>>>>>> +    for (i = 0; i < adev->vcn.num_vcn_inst; ++i) {
>>>>>> +        if (adev->vcn.harvest_config & (1 << i))
>>>>>> +            continue;
>>>>>> +        ring = &adev->vcn.inst[i].ring_dec;
>>>>>> +        ring->sched.ready = false;
>>>>>> +
>>>>>> +        for (j = 0; j < adev->vcn.num_enc_rings; ++j) {
>>>>>> +            ring = &adev->vcn.inst[i].ring_enc[j];
>>>>>> +            ring->sched.ready = false;
>>>>>> +        }
>>>>>> +    }
>>>>>> +
>>>>>> +    while (retry_max-- && cancel_delayed_work_sync(&adev->vcn.idle_work))
>>>>>> +        mdelay(5);
>>>>>
>>>>> I think it's possible to have one pending job unprocessed by VCN 
>>>>> when the suspend sequence gets here, but it shouldn't be more than 
>>>>> one. cancel_delayed_work_sync will probably return false after the 
>>>>> first call, so calling it once should be enough here. We probably 
>>>>> need to wait longer in:
>>>>>
>>>>> SOC15_WAIT_ON_RREG(VCN, inst_idx, mmUVD_POWER_STATUS, 1,
>>>>>         UVD_POWER_STATUS__UVD_POWER_STATUS_MASK);
>>>>>
>>>>> to make sure the unprocessed job gets done.
>>>>>
>>>>>
>>>>> Regards,
>>>>>
>>>>> Leo
>>>>>
>>>>>
>>>>>> +    if (!retry_max && !amdgpu_sriov_vf(adev)) {
>>>>>> +        if (RREG32_SOC15(VCN, i, mmUVD_STATUS)) {
>>>>>> +            dev_warn(adev->dev, "Forced powering gate vcn hardware!");
>>>>>> +            vcn_v3_0_set_powergating_state(adev, AMD_PG_STATE_GATE);
>>>>>> +        }
>>>>>> +    }
>>>>>>         for (i = 0; i < adev->vcn.num_vcn_inst; ++i) {
>>>>>>           if (adev->vcn.harvest_config & (1 << i))
