Dennis, Christian, are there any updates in the plan on how to move on with this? As you know, I need very similar code for my upstreaming of device hot-unplug. My latest solution (https://lists.freedesktop.org/archives/amd-gfx/2021-January/058606.html) was not acceptable because of the low-level guards at the register-accessor level, which hurt performance. Basically I need a way to prevent any MMIO write accesses from the kernel driver after the device is removed (UMD accesses are taken care of by page-faulting a dummy page).

We are now using the hot-unplug code for the Freemont program, so upstreaming has become more of a priority than before. This MMIO access issue is currently my main blocker for upstreaming. Is there any way I can assist in pushing this forward?

Andrey
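As an illustration of the alternative to guarding each register accessor, here is a minimal sketch, assuming the standard drm_dev_enter()/drm_dev_exit() unplug guard is used to bracket a whole hardware-touching path; amdgpu_do_hw_teardown() is a hypothetical placeholder, not code from this thread:

#include <drm/drm_drv.h>

static void amdgpu_do_hw_teardown(struct drm_device *ddev); /* hypothetical MMIO-heavy helper */

static void amdgpu_do_hw_teardown_guarded(struct drm_device *ddev)
{
        int idx;

        /* drm_dev_enter() returns false once drm_dev_unplug() has been
         * called, so the whole MMIO-heavy path is skipped after removal. */
        if (!drm_dev_enter(ddev, &idx))
                return;

        amdgpu_do_hw_teardown(ddev);

        drm_dev_exit(idx);
}

The unplug check is then paid once per code path rather than once per register access.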
On 2021-03-18 5:51 a.m., Christian König wrote:
> On 18.03.21 at 10:30, Li, Dennis wrote:
>> >>> The GPU reset doesn't complete the fences we wait for. It only completes the hardware fences as part of the reset.
>>
>> >>> So waiting for a fence while holding the reset lock is illegal and needs to be avoided.
>>
>> I understand your concern. It is more complex for DRM GFX, therefore I am abandoning the lock protection for DRM ioctls for now. Maybe we can try to add all kernel dma_fence waits to a list and signal them all in the recovery thread. Do you have the same concern for the compute cases?
>
> Yes, compute (KFD) is even harder to handle.
>
> See, you can't simply signal the dma_fence being waited on. Waiting for a dma_fence also means you wait for the GPU reset to finish.
>
> If we signaled the dma_fence during the GPU reset, we would run into memory corruption, because the hardware jobs running after the GPU reset would access memory which has already been freed.
>
>> >>> Lockdep also complains about this when it is used correctly. The only reason it doesn't complain here is because you use an atomic+wait_event instead of a locking primitive.
>>
>> Agreed. This approach escapes lockdep's monitoring. Its goal is to block other threads when the GPU recovery thread starts, but I couldn't find a better method to solve this problem. Do you have a suggestion?
>
> Well, completely abandon those changes here.
>
> What we need to do is to identify where hardware access happens and then insert taking the read side of the GPU reset lock there, so that we don't wait for a dma_fence or allocate memory, but still protect the hardware from concurrent access and reset.
>
> Regards,
> Christian.
>
>> Best Regards
>> Dennis Li
>>
>> *From:* Koenig, Christian
>> *Sent:* Thursday, March 18, 2021 4:59 PM
>> *To:* Li, Dennis; amd-gfx@lists.freedesktop.org; Deucher, Alexander; Kuehling, Felix; Zhang, Hawking
>> *Subject:* RE: [PATCH 0/4] Refine GPU recovery sequence to enhance its stability
>>
>> Exactly that's what you don't seem to understand.
>>
>> The GPU reset doesn't complete the fences we wait for. It only completes the hardware fences as part of the reset.
>>
>> So waiting for a fence while holding the reset lock is illegal and needs to be avoided.
>>
>> Lockdep also complains about this when it is used correctly. The only reason it doesn't complain here is because you use an atomic+wait_event instead of a locking primitive.
>>
>> Regards,
>> Christian.
>>
>> ------------------------------------------------------------------------
>> *From:* Li, Dennis
>> *Sent:* Thursday, March 18, 2021 09:28
>> *To:* Koenig, Christian; amd-gfx@lists.freedesktop.org; Deucher, Alexander; Kuehling, Felix; Zhang, Hawking
>> *Subject:* RE: [PATCH 0/4] Refine GPU recovery sequence to enhance its stability
>>
>> >>> Those two steps need to be exchanged or otherwise it is possible that new delayed work items etc. are started before the lock is taken.
>>
>> What about adding a check for adev->in_gpu_reset in the work item? If we exchange the two steps, it may introduce a deadlock. For example, if a user thread holds the read lock and is waiting for a fence, and the recovery thread tries to take the write lock and then complete the fences, the recovery thread will be blocked forever.
>>
>> Best Regards
>> Dennis Li
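A minimal sketch of the cycle described above, assuming the two steps are exchanged so the write lock is taken before the fences are force-completed; the function names are hypothetical and only the lock/wait pattern matters:

#include <linux/rwsem.h>
#include <linux/dma-fence.h>

/* User/ioctl thread: takes the read lock, then waits on a fence. */
static long example_user_path(struct amdgpu_device *adev, struct dma_fence *fence)
{
        long r;

        down_read(&adev->reset_sem);
        /* Blocks here: a hung fence only signals after the recovery
         * thread has force-completed it. */
        r = dma_fence_wait(fence, false);
        up_read(&adev->reset_sem);

        return r;
}

/* Recovery thread, taking the write lock before completing the fences. */
static void example_recovery_path(struct amdgpu_device *adev)
{
        /* Never succeeds: the reader above holds reset_sem while waiting
         * for a fence that only this thread can complete. */
        down_write(&adev->reset_sem);

        /* ... force-completion of the hardware fences would happen here, too late ... */

        up_write(&adev->reset_sem);
}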
>> -----Original Message-----
>> From: Koenig, Christian
>> Sent: Thursday, March 18, 2021 3:54 PM
>> To: Li, Dennis; amd-gfx@lists.freedesktop.org; Deucher, Alexander; Kuehling, Felix; Zhang, Hawking
>> Subject: Re: [PATCH 0/4] Refine GPU recovery sequence to enhance its stability
>>
>> On 18.03.21 at 08:23, Dennis Li wrote:
>> > We have defined two variables, in_gpu_reset and reset_sem, in the adev object. The atomic variable in_gpu_reset is used to prevent the recovery thread from re-entering and to make the lower-level functions return earlier once recovery has started, but it couldn't block other threads from accessing the hardware while the recovery thread does. The r/w semaphore reset_sem is used to solve these synchronization issues between the recovery thread and the other threads.
>> >
>> > The original solution locked register access in the lower-level functions, which introduces the following issues:
>> >
>> > 1) Many lower-level functions are used by both the recovery thread and other threads. Firstly, we must identify all of these functions, and it is easy to miss some. Secondly, these functions have to select which lock (read or write) to take according to the thread they are running in; if the thread context isn't considered, the added lock will easily introduce a deadlock. Besides that, developers will often forget to add the locks to new functions.
>> >
>> > 2) Performance drops, because the lower-level functions are called much more frequently.
>> >
>> > 3) It easily introduces false-positive lockdep complaints, because the write lock covers a large range in the recovery thread, while the lower-level functions that take the read lock may also be protected by other locks in other threads.
>> >
>> > Therefore the new solution tries to add lock protection at the KFD ioctl level instead. Its goal is that no threads except the recovery thread or its children (for XGMI) access the hardware while a GPU reset and resume is in progress. So the recovery thread is refined as follows:
>> >
>> > Step 0: atomic_cmpxchg(&adev->in_gpu_reset, 0, 1)
>> >     1) If it fails, a recovery thread is already running and the current thread exits directly;
>> >     2) if it succeeds, enter the recovery thread.
>> >
>> > Step 1: Cancel all delayed work, stop the DRM scheduler, complete all unreceived fences and so on. This tries to stop or pause the other threads.
>> >
>> > Step 2: Call down_write(&adev->reset_sem) to take the write lock, which blocks the recovery thread until the other threads have released their read locks.
>>
>> Those two steps need to be exchanged or otherwise it is possible that new delayed work items etc. are started before the lock is taken.
>>
>> Just to make it clear: until this is fixed the whole patch set is a NAK.
>>
>> Regards,
>> Christian.
>>
>> > Step 3: Normally only the recovery thread is left running and accessing the hardware, so it is safe to do the GPU reset now.
>> >
>> > Step 4: Do the post-GPU-reset work, such as calling all IPs' resume functions.
>> >
>> > Step 5: Atomically set adev->in_gpu_reset back to 0, wake up the other threads and release the write lock. The recovery thread exits normally.
>> >
>> > Other threads call amdgpu_read_lock to synchronize with the recovery thread. If a thread finds that in_gpu_reset is 1, it releases the read lock if it holds one and then blocks itself waiting for the recovery-finished event. If the thread successfully takes the read lock and in_gpu_reset is 0, it continues; it will either exit normally or be stopped by the recovery thread in Step 1.
>> >
>> > Dennis Li (4):
>> >    drm/amdgpu: remove reset lock from low level functions
>> >    drm/amdgpu: refine the GPU recovery sequence
>> >    drm/amdgpu: instead of using down/up_read directly
>> >    drm/amdkfd: add reset lock protection for kfd entry functions
>> >
>> >   drivers/gpu/drm/amd/amdgpu/amdgpu.h           |   6 +
>> >   drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c   |  14 +-
>> >   drivers/gpu/drm/amd/amdgpu/amdgpu_device.c    | 173 +++++++++++++-----
>> >   .../gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c    |   8 -
>> >   drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c        |   4 +-
>> >   drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c         |   9 +-
>> >   drivers/gpu/drm/amd/amdgpu/mxgpu_ai.c         |   5 +-
>> >   drivers/gpu/drm/amd/amdgpu/mxgpu_nv.c         |   5 +-
>> >   drivers/gpu/drm/amd/amdkfd/kfd_chardev.c      | 172 ++++++++++++++++-
>> >   drivers/gpu/drm/amd/amdkfd/kfd_priv.h         |   3 +-
>> >   drivers/gpu/drm/amd/amdkfd/kfd_process.c      |   4 +
>> >   .../amd/amdkfd/kfd_process_queue_manager.c    |  17 ++
>> >   12 files changed, 345 insertions(+), 75 deletions(-)
>> >
>
> _______________________________________________
> amd-gfx mailing list
> amd-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
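For reference, a minimal sketch of the amdgpu_read_lock() behaviour the cover letter above describes (reconstructed from the description, not taken from the actual patches; recovery_done_event is an assumed wait queue woken at Step 5):

#include <linux/rwsem.h>
#include <linux/atomic.h>
#include <linux/wait.h>

/* Take adev->reset_sem for reading, backing off while a recovery runs. */
static void amdgpu_read_lock(struct amdgpu_device *adev)
{
        for (;;) {
                down_read(&adev->reset_sem);

                /* No recovery in flight: keep the read lock and continue. */
                if (!atomic_read(&adev->in_gpu_reset))
                        return;

                /* A recovery thread has started (Step 0) but has not yet
                 * taken the write lock (Step 2): drop the read lock and
                 * wait for the recovery-finished event before retrying. */
                up_read(&adev->reset_sem);
                wait_event(adev->recovery_done_event,
                           !atomic_read(&adev->in_gpu_reset));
        }
}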