The only real solution I can see is to be able to reliably kill shaders
in an OOM situation.

Well, we can in fact preempt our compute shaders with low latency.
Killing a KFD process will do exactly that.

I've taken a look at that thing as well and to be honest it is not even 
remotely sufficient.

We need something which stops the hardware *immediately* from accessing 
system memory, rather than waiting for the SQ to kill all waves, flush 
caches, etc.

One possibility I've been playing around with for a while is to replace 
the root PD for the VMIDs in question on the fly. E.g. we just let it 
point to some dummy which redirects everything into nirvana.
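
To make the idea a bit more concrete, here is a userspace toy model of
swapping a per-VMID root pointer to a dummy. This is only a sketch: every
name in it (toy_pd, toy_vmid_root, toy_revoke, ...) is hypothetical and
is not driver or hardware API. It models nothing about TLBs, caches or
accesses already in flight, which is presumably where the real
difficulty lies.

```c
#include <assert.h>
#include <stdatomic.h>

#define TOY_NUM_VMIDS 16 /* assumption: arbitrary size for the toy table */

/* Hypothetical stand-in for a page directory; "backing" names what
 * translations through this directory end up hitting. */
struct toy_pd {
	const char *backing;
};

/* The dummy directory that redirects everything "into nirvana". */
static struct toy_pd toy_dummy_pd = { "dummy" };

/* One root pointer per VMID, swapped atomically. */
static _Atomic(struct toy_pd *) toy_vmid_root[TOY_NUM_VMIDS];

/* Install a root directory for a VMID. */
static void toy_set_root(unsigned int vmid, struct toy_pd *pd)
{
	assert(vmid < TOY_NUM_VMIDS);
	atomic_store(&toy_vmid_root[vmid], pd);
}

/* Point the VMID at the dummy in a single step and return the old
 * root, so the caller can tear it down later at its own pace. */
static struct toy_pd *toy_revoke(unsigned int vmid)
{
	assert(vmid < TOY_NUM_VMIDS);
	return atomic_exchange(&toy_vmid_root[vmid], &toy_dummy_pd);
}

/* "Translate" an access: report what the current root resolves to.
 * The root must have been installed with toy_set_root() first. */
static const char *toy_translate(unsigned int vmid)
{
	assert(vmid < TOY_NUM_VMIDS);
	return atomic_load(&toy_vmid_root[vmid])->backing;
}
```

In this model, any toy_translate() call made after toy_revoke() resolves
to the dummy, without waiting for whoever is still using the VMID to
stop.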

But implementing this is easier said than done...

Regards,
Christian.