Sure, I will investigate that. What about the ticket Lijo raised, which was basically doing 8 resets instead of one? Lijo, can this ticket wait until I come up with the new design for the amdgpu reset function, or do you need a quick solution now? In that case we can use the already existing patch temporarily.

Andrey

On 2022-05-12 09:15, Christian König wrote:
>> I am not sure why HIVE is the object we should work with. Hive is one
>> use case, a single device is another, and then Lijo described something
>> called a partition, which is what? A particular pipe within the GPU?
>> IMHO, what they all share in common is that all of them use a reset
>> domain when they want a recovery operation, so maybe GPU reset should
>> be oriented to work with the reset domain?
>
> Yes, exactly, that's the idea.
>
> Basically the reset domain knows which amdgpu devices it needs to
> reset together.
>
> Whether you then represent that by always having a hive, even when
> you only have one device in it, or by putting an array of devices
> which need to be reset together into the reset domain, doesn't matter.
>
> Maybe go for the latter approach; that is probably a bit cleaner and
> less code to change.
>
> Christian.
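
To make the array-of-devices idea concrete, here is a minimal sketch in plain C of a reset domain that holds the devices to be reset together. All names (sketch_device, sketch_reset_domain, sketch_domain_add, sketch_domain_reset) are hypothetical and only for illustration. This is not the actual amdgpu code, and a real implementation would also need locking and the actual hardware reset sequence.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_MAX_DEVICES 8   /* hypothetical limit, chosen for this sketch */

/* Hypothetical stand-in for a device; not the real struct amdgpu_device. */
struct sketch_device {
    int id;
    int reset_count;           /* how many times this device was reset */
};

/* Hypothetical reset domain: the array of devices reset together. */
struct sketch_reset_domain {
    struct sketch_device *devices[SKETCH_MAX_DEVICES];
    size_t num_devices;
};

/* Register a device with the domain (a hive adds several, a single GPU one). */
static void sketch_domain_add(struct sketch_reset_domain *domain,
                              struct sketch_device *dev)
{
    assert(domain->num_devices < SKETCH_MAX_DEVICES);
    domain->devices[domain->num_devices++] = dev;
}

/* Reset every device in the domain exactly once; returns how many were reset. */
static size_t sketch_domain_reset(struct sketch_reset_domain *domain)
{
    size_t i;

    for (i = 0; i < domain->num_devices; i++)
        domain->devices[i]->reset_count++;   /* placeholder for the real reset */

    return domain->num_devices;
}
```

With this shape, the single-device case is just a domain with one entry, so the reset path does not need to know whether it is dealing with a hive or not.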