From: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
To: "Christian König" <christian.koenig@amd.com>,
"Christian König" <ckoenig.leichtzumerken@gmail.com>,
"Li, Dennis" <Dennis.Li@amd.com>,
"amd-gfx@lists.freedesktop.org" <amd-gfx@lists.freedesktop.org>,
"Deucher, Alexander" <Alexander.Deucher@amd.com>,
"Kuehling, Felix" <Felix.Kuehling@amd.com>,
"Zhang, Hawking" <Hawking.Zhang@amd.com>
Subject: Re: [PATCH 0/4] Refine GPU recovery sequence to enhance its stability
Date: Mon, 12 Apr 2021 15:12:58 -0400 [thread overview]
Message-ID: <0224d3c2-1bd0-3941-fa7f-5667027368fa@amd.com> (raw)
In-Reply-To: <2894bf97-8c39-6610-c479-b089c46513e7@amd.com>
[-- Attachment #1.1: Type: text/plain, Size: 9490 bytes --]
On 2021-04-12 2:23 p.m., Christian König wrote:
> Am 12.04.21 um 20:18 schrieb Andrey Grodzovsky:
>>
>> On 2021-04-12 2:05 p.m., Christian König wrote:
>>
>>> Am 12.04.21 um 20:01 schrieb Andrey Grodzovsky:
>>>>
>>>> On 2021-04-12 1:44 p.m., Christian König wrote:
>>>>
>>>>>
>>>>> Am 12.04.21 um 19:27 schrieb Andrey Grodzovsky:
>>>>>> On 2021-04-10 1:34 p.m., Christian König wrote:
>>>>>>> Hi Andrey,
>>>>>>>
>>>>>>> Am 09.04.21 um 20:18 schrieb Andrey Grodzovsky:
>>>>>>>> [SNIP]
>>>>>>>>>>
>>>>>>>>>> If we use a list and a flag called 'emit_allowed' under a
>>>>>>>>>> lock such that in amdgpu_fence_emit we lock the list, check
>>>>>>>>>> the flag and if true add the new HW fence to list and proceed
>>>>>>>>>> to HW emition as normal, otherwise return with -ENODEV. In
>>>>>>>>>> amdgpu_pci_remove we take the lock, set the flag to false,
>>>>>>>>>> and then iterate the list and force signal it. Will this not
>>>>>>>>>> prevent any new HW fence creation from now on from any place
>>>>>>>>>> trying to do so ?
>>>>>>>>>
>>>>>>>>> Way to much overhead. The fence processing is intentionally
>>>>>>>>> lock free to avoid cache line bouncing because the IRQ can
>>>>>>>>> move from CPU to CPU.
>>>>>>>>>
>>>>>>>>> We need something which at least the processing of fences in
>>>>>>>>> the interrupt handler doesn't affect at all.
>>>>>>>>
>>>>>>>>
>>>>>>>> As far as I see in the code, amdgpu_fence_emit is only called
>>>>>>>> from task context. Also, we can skip this list I proposed and
>>>>>>>> just use amdgpu_fence_driver_force_completion for each ring to
>>>>>>>> signal all created HW fences.
>>>>>>>
>>>>>>> Ah, wait a second this gave me another idea.
>>>>>>>
>>>>>>> See amdgpu_fence_driver_force_completion():
>>>>>>>
>>>>>>> amdgpu_fence_write(ring, ring->fence_drv.sync_seq);
>>>>>>>
>>>>>>> If we change that to something like:
>>>>>>>
>>>>>>> amdgpu_fence_write(ring, ring->fence_drv.sync_seq + 0x3FFFFFFF);
>>>>>>>
>>>>>>> Not only the currently submitted, but also the next 0x3FFFFFFF
>>>>>>> fences will be considered signaled.
>>>>>>>
>>>>>>> This basically solves out problem of making sure that new fences
>>>>>>> are also signaled without any additional overhead whatsoever.
>>>>>>
>>>>>>
>>>>>> Problem with this is that the act of setting the sync_seq to some
>>>>>> MAX value alone is not enough, you actually have to call
>>>>>> amdgpu_fence_process to iterate and signal the fences currently
>>>>>> stored in ring->fence_drv.fences array and to guarantee that once
>>>>>> you done your signalling no more HW fences will be added to that
>>>>>> array anymore. I was thinking to do something like bellow:
>>>>>>
>>>>>
>>>>> Well we could implement the is_signaled callback once more, but
>>>>> I'm not sure if that is a good idea.
>>>>
>>>>
>>>> This indeed could save the explicit signaling I am doing bellow but
>>>> I also set an error code there which might be helpful to propagate
>>>> to users
>>>>
>>>>
>>>>>
>>>>>> amdgpu_fence_emit()
>>>>>>
>>>>>> {
>>>>>>
>>>>>> dma_fence_init(fence);
>>>>>>
>>>>>> srcu_read_lock(amdgpu_unplug_srcu)
>>>>>>
>>>>>> if (!adev->unplug)) {
>>>>>>
>>>>>> seq = ++ring->fence_drv.sync_seq;
>>>>>> emit_fence(fence);
>>>>>>
>>>>>> */* We can't wait forever as the HW might be gone at any point*/**
>>>>>> dma_fence_wait_timeout(old_fence, 5S);*
>>>>>>
>>>>>
>>>>> You can pretty much ignore this wait here. It is only as a last
>>>>> resort so that we never overwrite the ring buffers.
>>>>
>>>>
>>>> If device is present how can I ignore this ?
>>>>
>>
>> I think you missed my question here
>>
>
> Sorry I thought I answered that below.
>
> See this is just the last resort so that we don't need to worry about
> ring buffer overflows during testing.
>
> We should not get here in practice and if we get here generating a
> deadlock might actually be the best handling.
>
> The alternative would be to call BUG().
>
>>>>
>>>>>
>>>>> But it should not have a timeout as far as I can see.
>>>>
>>>>
>>>> Without timeout wait the who approach falls apart as I can't call
>>>> srcu_synchronize on this scope because once device is physically
>>>> gone the wait here will be forever
>>>>
>>>
>>> Yeah, but this is intentional. The only alternative to avoid
>>> corruption is to wait with a timeout and call BUG() if that
>>> triggers. That isn't much better.
>>>
>>>>
>>>>>
>>>>>> ring->fence_drv.fences[seq &
>>>>>> ring->fence_drv.num_fences_mask] = fence;
>>>>>>
>>>>>> } else {
>>>>>>
>>>>>> dma_fence_set_error(fence, -ENODEV);
>>>>>> DMA_fence_signal(fence)
>>>>>>
>>>>>> }
>>>>>>
>>>>>> srcu_read_unlock(amdgpu_unplug_srcu)
>>>>>> return fence;
>>>>>>
>>>>>> }
>>>>>>
>>>>>> amdgpu_pci_remove
>>>>>>
>>>>>> {
>>>>>>
>>>>>> adev->unplug = true;
>>>>>> synchronize_srcu(amdgpu_unplug_srcu)
>>>>>>
>>>>>
>>>>> Well that is just duplicating what drm_dev_unplug() should be
>>>>> doing on a different level.
>>>>
>>>>
>>>> drm_dev_unplug is on a much wider scope, for everything in the
>>>> device including 'flushing' in flight IOCTLs, this deals
>>>> specifically with the issue of force signalling HW fences
>>>>
>>>
>>> Yeah, but it adds the same overhead as the device srcu.
>>>
>>> Christian.
>>
>>
>> So what's the right approach ? How we guarantee that when running
>> amdgpu_fence_driver_force_completion we will signal all the HW fences
>> and not racing against some more fences insertion into that array ?
>>
>
> Well I would still say the best approach would be to insert this
> between the front end and the backend and not rely on signaling fences
> while holding the device srcu.
My question is, even now, when we run
amdgpu_fence_driver_fini_hw->amdgpu_fence_wait_empty or
amdgpu_fence_driver_fini_hw->amdgpu_fence_driver_force_completion, what
there prevents a race with another fence being at the same time emitted
and inserted into the fence array ? Looks like nothing.
>
> BTW: Could it be that the device SRCU protects more than one device
> and we deadlock because of this?
I haven't actually experienced any deadlock until now but, yes,
drm_unplug_srcu is defined as static in drm_drv.c and so in the
presence of multiple devices from same or different drivers we in fact
are dependent on all their critical sections i guess.
Andrey
>
> Christian.
>
>> Andrey
>>
>>
>>>
>>>> Andrey
>>>>
>>>>
>>>>>
>>>>> Christian.
>>>>>
>>>>>> /* Past this point no more fence are submitted to HW ring and
>>>>>> hence we can safely call force signal on all that are currently
>>>>>> there.
>>>>>> * Any subsequently created HW fences will be returned
>>>>>> signaled with an error code right away
>>>>>> */
>>>>>>
>>>>>> for_each_ring(adev)
>>>>>> amdgpu_fence_process(ring)
>>>>>>
>>>>>> drm_dev_unplug(dev);
>>>>>> Stop schedulers
>>>>>> cancel_sync(all timers and queued works);
>>>>>> hw_fini
>>>>>> unmap_mmio
>>>>>>
>>>>>> }
>>>>>>
>>>>>>
>>>>>> Andrey
>>>>>>
>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Alternatively grabbing the reset write side and stopping and
>>>>>>>>>>> then restarting the scheduler could work as well.
>>>>>>>>>>>
>>>>>>>>>>> Christian.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> I didn't get the above and I don't see why I need to reuse
>>>>>>>>>> the GPU reset rw_lock. I rely on the SRCU unplug flag for
>>>>>>>>>> unplug. Also, not clear to me why are we focusing on the
>>>>>>>>>> scheduler threads, any code patch to generate HW fences
>>>>>>>>>> should be covered, so any code leading to amdgpu_fence_emit
>>>>>>>>>> needs to be taken into account such as, direct IB
>>>>>>>>>> submissions, VM flushes e.t.c
>>>>>>>>>
>>>>>>>>> You need to work together with the reset lock anyway, cause a
>>>>>>>>> hotplug could run at the same time as a reset.
>>>>>>>>
>>>>>>>>
>>>>>>>> For going my way indeed now I see now that I have to take reset
>>>>>>>> write side lock during HW fences signalling in order to protect
>>>>>>>> against scheduler/HW fences detachment and reattachment during
>>>>>>>> schedulers stop/restart. But if we go with your approach then
>>>>>>>> calling drm_dev_unplug and scoping amdgpu_job_timeout with
>>>>>>>> drm_dev_enter/exit should be enough to prevent any concurrent
>>>>>>>> GPU resets during unplug. In fact I already do it anyway -
>>>>>>>> https://nam11.safelinks.protection.outlook.com/?url=https:%2F%2Fcgit.freedesktop.org%2F~agrodzov%2Flinux%2Fcommit%2F%3Fh%3Ddrm-misc-next%26id%3Def0ea4dd29ef44d2649c5eda16c8f4869acc36b1&data=04%7C01%7Candrey.grodzovsky%40amd.com%7Ceefa9c90ed8c405ec3b708d8fc46daaa%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637536728550884740%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&sdata=UiNaJE%2BH45iYmbwSDnMSKZS5z0iak0fNlbbfYqKS2Jo%3D&reserved=0
>>>>>>>
>>>>>>> Yes, good point as well.
>>>>>>>
>>>>>>> Christian.
>>>>>>>
>>>>>>>>
>>>>>>>> Andrey
>>>>>>>>
>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Christian.
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Andrey
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Christian.
>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Andrey
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Andrey
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>
>>>>>>>
>>>>>
>>>
>
[-- Attachment #1.2: Type: text/html, Size: 18466 bytes --]
[-- Attachment #2: Type: text/plain, Size: 154 bytes --]
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx
next prev parent reply other threads:[~2021-04-12 19:13 UTC|newest]
Thread overview: 56+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-03-18 7:23 [PATCH 0/4] Refine GPU recovery sequence to enhance its stability Dennis Li
2021-03-18 7:23 ` [PATCH 1/4] drm/amdgpu: remove reset lock from low level functions Dennis Li
2021-03-18 7:23 ` [PATCH 2/4] drm/amdgpu: refine the GPU recovery sequence Dennis Li
2021-03-18 7:56 ` Christian König
2021-03-18 7:23 ` [PATCH 3/4] drm/amdgpu: instead of using down/up_read directly Dennis Li
2021-03-18 7:23 ` [PATCH 4/4] drm/amdkfd: add reset lock protection for kfd entry functions Dennis Li
2021-03-18 7:53 ` [PATCH 0/4] Refine GPU recovery sequence to enhance its stability Christian König
2021-03-18 8:28 ` Li, Dennis
2021-03-18 8:58 ` AW: " Koenig, Christian
2021-03-18 9:30 ` Li, Dennis
2021-03-18 9:51 ` Christian König
2021-04-05 17:58 ` Andrey Grodzovsky
2021-04-06 10:34 ` Christian König
2021-04-06 11:21 ` Christian König
2021-04-06 21:22 ` Andrey Grodzovsky
2021-04-07 10:28 ` Christian König
2021-04-07 19:44 ` Andrey Grodzovsky
2021-04-08 8:22 ` Christian König
2021-04-08 8:32 ` Christian König
2021-04-08 16:08 ` Andrey Grodzovsky
2021-04-08 18:58 ` Christian König
2021-04-08 20:39 ` Andrey Grodzovsky
2021-04-09 6:53 ` Christian König
2021-04-09 7:01 ` Christian König
2021-04-09 15:42 ` Andrey Grodzovsky
2021-04-09 16:39 ` Christian König
2021-04-09 18:18 ` Andrey Grodzovsky
2021-04-10 17:34 ` Christian König
2021-04-12 17:27 ` Andrey Grodzovsky
2021-04-12 17:44 ` Christian König
2021-04-12 18:01 ` Andrey Grodzovsky
2021-04-12 18:05 ` Christian König
2021-04-12 18:18 ` Andrey Grodzovsky
2021-04-12 18:23 ` Christian König
2021-04-12 19:12 ` Andrey Grodzovsky [this message]
2021-04-12 19:18 ` Christian König
2021-04-12 20:01 ` Andrey Grodzovsky
2021-04-13 7:10 ` Christian König
2021-04-13 9:13 ` Li, Dennis
2021-04-13 9:14 ` Christian König
2021-04-13 20:08 ` Daniel Vetter
2021-04-13 15:12 ` Andrey Grodzovsky
2021-04-13 18:03 ` Christian König
2021-04-13 18:18 ` Andrey Grodzovsky
2021-04-13 18:25 ` Christian König
2021-04-13 18:30 ` Andrey Grodzovsky
2021-04-14 7:01 ` Christian König
2021-04-14 14:36 ` Andrey Grodzovsky
2021-04-14 14:58 ` Christian König
2021-04-15 6:27 ` Andrey Grodzovsky
2021-04-15 7:02 ` Christian König
2021-04-15 14:11 ` Andrey Grodzovsky
2021-04-15 15:09 ` Christian König
2021-04-13 20:07 ` Daniel Vetter
2021-04-13 5:36 ` Andrey Grodzovsky
2021-04-13 7:07 ` Christian König
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=0224d3c2-1bd0-3941-fa7f-5667027368fa@amd.com \
--to=andrey.grodzovsky@amd.com \
--cc=Alexander.Deucher@amd.com \
--cc=Dennis.Li@amd.com \
--cc=Felix.Kuehling@amd.com \
--cc=Hawking.Zhang@amd.com \
--cc=amd-gfx@lists.freedesktop.org \
--cc=christian.koenig@amd.com \
--cc=ckoenig.leichtzumerken@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).