https://bugs.freedesktop.org/show_bug.cgi?id=107572

            Bug ID: 107572
           Summary: Unrecoverable GPU hang with IP block:gfx_v8_0 is hung
           Product: DRI
           Version: unspecified
          Hardware: x86-64 (AMD64)
                OS: Linux (All)
            Status: NEW
          Severity: normal
          Priority: medium
         Component: DRM/AMDgpu
          Assignee: dri-devel@lists.freedesktop.org
          Reporter: madcatx@atlas.cz

Hello,

I have been experiencing a worrying number of these hangs ever since I
got my RX 570 a few months ago. I can reproduce the hang quite reliably
with some 3D workloads; for instance, Unigine Superposition run on High
quality or Witcher 3 (through WINE) crashes the GPU within minutes. Once
that happens I can always SSH into the machine and try to get at least
some debugging information. Unfortunately, there does not seem to be
much to go on. dmesg does not tell me more than this:

[ 254.704581] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx timeout, last signaled seq=103742, last emitted seq=103745
[ 254.704586] [drm] IP block:gfx_v8_0 is hung!
[ 254.704629] [drm] GPU recovery disabled.

Here are a few things I have tried so far:

- Boot with amdgpu.dc=0
- Boot with amdgpu.vm_update_mode=3
- Force the GPU to max power state
- Disable IOMMU (both by iommu=off and by disabling VT-d in BIOS)
- Boot with amdgpu.gpu_recovery=1 (does not produce any additional info)

I grabbed the umr tool to try to get the state of the GPU when it
crashes, but it does not seem to be able to read anything. Running:

umr -R gfx[.]

leaves me with:

[ERROR]: Could not open ring debugfs file

I checked that the entries in /sys/kernel/debug/amdgpu that look
relevant are there, but cat'ing them gives me "Operation not permitted".
Yes, I am doing it as root.

Once this happens the only way out is a hard reboot. I am running
up-to-date Fedora 28, kernel 4.17.2, Mesa 18.0 series, LLVM 6.0.1.

Is there anything else I can do?

Thanks.

-- 
You are receiving this mail because:
You are the assignee for the bug.
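
For anyone comparing several of these hangs, the ring timeout line from dmesg can be reduced to a few numbers. The following is a minimal sketch, assuming the message format is exactly as quoted in the report; the function name parse_ring_timeout and the "pending" field are made up for illustration and are not part of any amdgpu or umr tooling. Reading the difference between the two sequence numbers as the count of submissions emitted but not yet signaled is likewise an interpretation, not something the log states.

```python
import re

# Pattern for the ring timeout line as quoted in the report above.
# Assumption: other kernel versions may word this message differently.
TIMEOUT_RE = re.compile(
    r"ring (?P<ring>\S+) timeout, "
    r"last signaled seq=(?P<signaled>\d+), "
    r"last emitted seq=(?P<emitted>\d+)"
)


def parse_ring_timeout(line):
    """Return (ring, signaled, emitted, pending), or None if the line does not match.

    "pending" is a hypothetical helper value: emitted minus signaled.
    """
    m = TIMEOUT_RE.search(line)
    if m is None:
        return None
    signaled = int(m.group("signaled"))
    emitted = int(m.group("emitted"))
    return m.group("ring"), signaled, emitted, emitted - signaled


# The first dmesg line from the report, copied verbatim.
sample = (
    "[ 254.704581] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx timeout, "
    "last signaled seq=103742, last emitted seq=103745"
)

print(parse_ring_timeout(sample))
```

Piping saved dmesg output through this function line by line would show whether the same ring is involved each time and whether the gap between the two sequence numbers varies from hang to hang.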