JingWen - could you give these patches a try on an SRIOV XGMI system?
If you see issues, maybe you could let me connect and debug. My SRIOV
XGMI system, which Shayun kindly arranged for me, is not loading the
driver with my drm-misc-next branch even without my patches.

Andrey

On 2022-01-17 14:21, Andrey Grodzovsky wrote:
>
>
> On 2022-01-17 2:17 p.m., Christian König wrote:
>> On 17.01.22 at 20:14, Andrey Grodzovsky wrote:
>>>
>>> Ping on the question
>>>
>>
>> Oh, my! That was already more than a week ago and is completely
>> swapped out of my head again.
>>
>>> Andrey
>>>
>>> On 2022-01-05 1:11 p.m., Andrey Grodzovsky wrote:
>>>>>> Also, what about having the reset_active or in_reset flag in the
>>>>>> reset_domain itself?
>>>>>
>>>>> Offhand that sounds like a good idea.
>>>>
>>>>
>>>> What then about the adev->reset_sem semaphore? Should we also move
>>>> this to reset_domain? Both of the moves have functional
>>>> implications only for the XGMI case, because there will be
>>>> contention over accessing those single-instance variables from
>>>> multiple devices, while now each device has its own copy.
>>
>> Since this is a rw semaphore that should be unproblematic I think. It
>> could just be that the cache line of the lock then plays ping/pong
>> between the CPU cores.
>>
>>>>
>>>> What benefit does the centralization into reset_domain give? Is it,
>>>> for example, to prevent one device in a hive from trying to access
>>>> through MMIO another one's VRAM (shared FB memory) while the other
>>>> one goes through reset?
>>
>> I think that this is the killer argument for a centralized lock, yes.
>
>
> np, I will add a patch centralizing both the flag and the semaphore
> into the reset domain and resend.
>
> Andrey
>
>
>>
>> Christian.
>>
>>>>
>>>> Andrey
>>