On 2021-04-10 1:34 p.m., Christian König wrote:
> Hi Andrey,
>
> Am 09.04.21 um 20:18 schrieb Andrey Grodzovsky:
>> [SNIP]
>>>>
>>>> If we use a list and a flag called 'emit_allowed' under a lock, such
>>>> that in amdgpu_fence_emit we lock the list, check the flag and, if
>>>> true, add the new HW fence to the list and proceed to HW emission as
>>>> normal, otherwise return with -ENODEV. In amdgpu_pci_remove we take
>>>> the lock, set the flag to false, and then iterate the list and
>>>> force signal it. Will this not prevent any new HW fence creation
>>>> from now on from any place trying to do so?
>>>
>>> Way too much overhead. The fence processing is intentionally lock
>>> free to avoid cache line bouncing because the IRQ can move from CPU
>>> to CPU.
>>>
>>> We need something which at least doesn't affect the processing of
>>> fences in the interrupt handler at all.
>>
>>
>> As far as I see in the code, amdgpu_fence_emit is only called from
>> task context. Also, we can skip this list I proposed and just use
>> amdgpu_fence_driver_force_completion for each ring to signal all
>> created HW fences.
>
> Ah, wait a second, this gave me another idea.
>
> See amdgpu_fence_driver_force_completion():
>
> amdgpu_fence_write(ring, ring->fence_drv.sync_seq);
>
> If we change that to something like:
>
> amdgpu_fence_write(ring, ring->fence_drv.sync_seq + 0x3FFFFFFF);
>
> Not only the currently submitted, but also the next 0x3FFFFFFF fences
> will be considered signaled.
>
> This basically solves our problem of making sure that new fences are
> also signaled without any additional overhead whatsoever.

The problem with this is that setting sync_seq to some MAX value alone is
not enough. You actually have to call amdgpu_fence_process to iterate and
signal the fences currently stored in the ring->fence_drv.fences array,
and you have to guarantee that once you are done signalling, no more HW
fences will be added to that array. I was thinking of doing something
like below:

amdgpu_fence_emit()
{
    dma_fence_init(fence);

    srcu_read_lock(amdgpu_unplug_srcu);

    if (!adev->unplug) {
        seq = ++ring->fence_drv.sync_seq;
        emit_fence(fence);

        /* We can't wait forever as the HW might be gone at any point */
        dma_fence_wait_timeout(old_fence, 5S);

        ring->fence_drv.fences[seq & ring->fence_drv.num_fences_mask] = fence;
    } else {
        dma_fence_set_error(fence, -ENODEV);
        dma_fence_signal(fence);
    }

    srcu_read_unlock(amdgpu_unplug_srcu);

    return fence;
}

amdgpu_pci_remove()
{
    adev->unplug = true;
    synchronize_srcu(amdgpu_unplug_srcu);

    /*
     * Past this point no more fences are submitted to the HW ring, and
     * hence we can safely force signal all that are currently there.
     * Any subsequently created HW fences will be returned signaled with
     * an error code right away.
     */
    for_each_ring(adev)
        amdgpu_fence_process(ring);

    drm_dev_unplug(dev);
    stop schedulers;
    cancel_sync(all timers and queued works);
    hw_fini();
    unmap_mmio();
}
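To be a bit more concrete about just the SRCU part, something like the
sketch below is what I have in mind. It is untested, and both the
amdgpu_unplug_srcu domain and the adev->unplug flag are new things we would
have to add, so the names are only placeholders. One detail the pseudocode
above glosses over is that srcu_read_lock() returns an index which has to
be handed back to srcu_read_unlock():

#include <linux/srcu.h>
#include <linux/dma-fence.h>

/* One driver-global SRCU domain guarding fence emission against unplug. */
DEFINE_STATIC_SRCU(amdgpu_unplug_srcu);

/* Read side: wraps every path that can emit a HW fence. */
int amdgpu_fence_emit_guarded(struct amdgpu_device *adev,
                              struct dma_fence *fence)
{
    int idx = srcu_read_lock(&amdgpu_unplug_srcu);
    int r = 0;

    if (!READ_ONCE(adev->unplug)) {
        /* ... assign the seqno, write it to the ring and kick the HW ... */
    } else {
        /* Device is gone: hand back an already-signaled fence. */
        dma_fence_set_error(fence, -ENODEV);
        dma_fence_signal(fence);
        r = -ENODEV;
    }

    srcu_read_unlock(&amdgpu_unplug_srcu, idx);
    return r;
}

/* Write side: called once from amdgpu_pci_remove(). */
void amdgpu_fence_emission_stop(struct amdgpu_device *adev)
{
    WRITE_ONCE(adev->unplug, true);

    /*
     * Wait for every reader that might still have seen unplug == false.
     * After this returns, no new HW fence can be inserted into
     * ring->fence_drv.fences, so force signalling what is currently
     * there (amdgpu_fence_process / force_completion) is safe.
     */
    synchronize_srcu(&amdgpu_unplug_srcu);
}

The point of SRCU here is that the read side is nearly free on the fence
emission path, while the single synchronize_srcu() in pci_remove gives us
the barrier after which force-signalling is safe.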
Andrey

>
>
>>
>>>>>
>>>>> Alternatively grabbing the reset write side and stopping and then
>>>>> restarting the scheduler could work as well.
>>>>>
>>>>> Christian.
>>>>
>>>>
>>>> I didn't get the above, and I don't see why I need to reuse the GPU
>>>> reset rw_lock; I rely on the SRCU unplug flag for unplug. Also, it is
>>>> not clear to me why we are focusing on the scheduler threads. Any
>>>> code path that generates HW fences should be covered, so any code
>>>> leading to amdgpu_fence_emit needs to be taken into account, such as
>>>> direct IB submissions, VM flushes, etc.
>>>
>>> You need to work together with the reset lock anyway, because a
>>> hotplug could run at the same time as a reset.
>>
>>
>> Going my way, I indeed now see that I have to take the reset write
>> side lock while signalling the HW fences, in order to protect against
>> scheduler/HW fence detachment and reattachment during scheduler
>> stop/restart. But if we go with your approach, then calling
>> drm_dev_unplug and scoping amdgpu_job_timeout with drm_dev_enter/exit
>> should be enough to prevent any concurrent GPU resets during unplug.
>> In fact I already do it anyway -
>> https://cgit.freedesktop.org/~agrodzov/linux/commit/?h=drm-misc-next&id=ef0ea4dd29ef44d2649c5eda16c8f4869acc36b1
>
> Yes, good point as well.
>
> Christian.
>
>>
>> Andrey
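P.S. For reference, the drm_dev_enter()/drm_dev_exit() scoping of the job
timeout handler mentioned above would look roughly like the outline below.
This is only a sketch of the pattern, not the code from the linked commit,
and it assumes the usual amdgpu helpers (to_amdgpu_ring, adev_to_drm) plus
the amdgpu internal headers:

#include <drm/drm_drv.h>
#include <drm/gpu_scheduler.h>
/* plus amdgpu.h / amdgpu_ring.h for the amdgpu types used below */

static enum drm_gpu_sched_stat
amdgpu_job_timedout_sketch(struct drm_sched_job *s_job)
{
    struct amdgpu_ring *ring = to_amdgpu_ring(s_job->sched);
    struct amdgpu_device *adev = ring->adev;
    int idx;

    /*
     * drm_dev_enter() fails once drm_dev_unplug() has been called, so
     * after unplug the timeout handler becomes a no-op and can never
     * start a GPU reset that races with the teardown.
     */
    if (!drm_dev_enter(adev_to_drm(adev), &idx))
        return DRM_GPU_SCHED_STAT_ENODEV;

    /* ... usual hang handling / amdgpu_device_gpu_recover() ... */

    drm_dev_exit(idx);
    return DRM_GPU_SCHED_STAT_NOMINAL;
}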