Comment # 17 on bug 108854 from
(In reply to Tom Seewald from comment #16)

> But in general shouldn't the kernel driver (ideally) be able to handle mesa
> passing malformed/bad commands rather than freezing the device (step 3 to
> 4)?  I understand not every case can be covered, and I also understand that
> GPU resets need to be supported in user space for seamless recovery, but
> shouldn't the driver "unstick" itself enough so the computer can be rebooted
> normally?

This is generally not bad data from mesa per se.  There's not really a good
way to validate whether every combination of state sent to the GPU is valid.
There are hundreds of registers and state buffers that the GPU uses to process
the 3D pipeline.  It's impossible to test every combination of state, dispatch,
and ordering.  The hangs are generally due to a deadlock in the hw caused by a
bad interaction of states set by the application.  E.g., some hw block is
waiting on a signal from another hw block which won't get sent because the user
sent another state update which stops that signal.
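
To make the example concrete, here is a toy model of that kind of hang.  It is
only a sketch: the two "blocks", the single signal, and the cycle numbers are
all made up for illustration and do not correspond to any real hw block or to
anything in the driver.  The point is that each state update is valid on its
own, and only the ordering relative to the in-flight work causes the hang.

```python
# Toy model only: "block A", "block B" and the signal between them are
# hypothetical and do not describe real GPU hardware or driver code.

def simulate(state_updates, max_cycles=10):
    """Return the cycle on which block B receives A's signal, or None on a hang.

    state_updates maps a cycle number to the new value of the (hypothetical)
    "signal enabled" state that the application programs.
    """
    signal_enabled = True  # block A may signal block B by default
    a_done_at = 2          # block A finishes its work on this cycle

    for cycle in range(max_cycles):
        # Apply the state update the application queued for this cycle, if any.
        if cycle in state_updates:
            signal_enabled = state_updates[cycle]

        # Block B only makes progress once A is done AND the signal gets through.
        if cycle >= a_done_at and signal_enabled:
            return cycle

    # B never saw the signal within the window: from the outside this is a hang.
    return None


print(simulate({}))          # no extra state updates: B proceeds on cycle 2
print(simulate({1: False}))  # a valid update lands before A signals: None (hang)
```

Neither call passes anything "malformed"; the second one just happens to
disable the signal before block A gets to send it, so block B waits forever.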

The GPU reset should generally be able to recover the GPU, but in some cases
you may end up with a deadlock in sw in the kernel somewhere.

