* [Bug 107572] Unrecoverable GPU hang with IP block:gfx_v8_0 is hung
From: bugzilla-daemon @ 2018-08-14 23:45 UTC
  To: dri-devel



https://bugs.freedesktop.org/show_bug.cgi?id=107572

            Bug ID: 107572
           Summary: Unrecoverable GPU hang with IP block:gfx_v8_0 is hung
           Product: DRI
           Version: unspecified
          Hardware: x86-64 (AMD64)
                OS: Linux (All)
            Status: NEW
          Severity: normal
          Priority: medium
         Component: DRM/AMDgpu
          Assignee: dri-devel@lists.freedesktop.org
          Reporter: madcatx@atlas.cz

Hello,

I have been experiencing a worrying number of these hangs ever since I got my RX 570
a few months ago. I can reproduce the hang quite reliably with some 3D
workloads; for instance, Unigine Superposition run on High quality or
Witcher 3 (through WINE) will crash the GPU within minutes.

Once that happens I can always SSH into the machine and try to get at least
some debugging information. Unfortunately, there does not seem to be much to go
on.

dmesg does not tell me more than this:
[  254.704581] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx timeout, last signaled seq=103742, last emitted seq=103745
[  254.704586] [drm] IP block:gfx_v8_0 is hung!
[  254.704629] [drm] GPU recovery disabled.
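For anyone comparing several of these hangs, the sequence numbers in the timeout line can be pulled out mechanically. The following is a minimal Python sketch, not part of the original report; the function name `parse_ring_timeouts` and the `pending` field are made up for illustration. It assumes the message format matches the line quoted above, and it treats the difference between the two sequence numbers as the number of submissions emitted but not yet signaled, which is an interpretation rather than something the log states.

```python
import re

# Matches the ring-timeout message quoted in the report. Whitespace between
# the clauses may be a newline, since mail clients often wrap the line.
TIMEOUT_RE = re.compile(
    r"ring (?P<ring>\S+) timeout,\s*"
    r"last signaled seq=(?P<signaled>\d+),\s*"
    r"last emitted seq=(?P<emitted>\d+)"
)


def parse_ring_timeouts(log_text):
    """Return one dict per ring-timeout message found in log_text."""
    results = []
    for match in TIMEOUT_RE.finditer(log_text):
        signaled = int(match.group("signaled"))
        emitted = int(match.group("emitted"))
        results.append({
            "ring": match.group("ring"),
            "signaled": signaled,
            "emitted": emitted,
            # Hypothetical convenience field: emitted minus signaled.
            "pending": emitted - signaled,
        })
    return results


# The message from this report, wrapped across two lines as in the mail.
SAMPLE = (
    "[  254.704581] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx timeout,\n"
    "last signaled seq=103742, last emitted seq=103745\n"
)

if __name__ == "__main__":
    for entry in parse_ring_timeouts(SAMPLE):
        print(entry)
```

Feeding it the saved output of dmesg from several crashes makes it easy to see whether the same ring is involved each time.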

Here are a few things I have tried so far:
- Boot with amdgpu.dc=0
- Boot with amdgpu.vm_update_mode=3
- Force the GPU to max power state
- Disable IOMMU (both by iommu=off and by disabling VT-d in BIOS)
- Boot with amdgpu.gpu_recovery=1 (does not produce any additional info)
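When juggling several boot options like these, it helps to confirm which ones the running kernel was actually booted with. Below is a minimal Python sketch, again not part of the original report, that extracts the `amdgpu.*` options from a kernel command-line string. The helper names `amdgpu_params` and `read_cmdline` are hypothetical; the only system interface assumed is `/proc/cmdline`.

```python
def amdgpu_params(cmdline):
    """Extract amdgpu.<name>=<value> options from a kernel command line."""
    prefix = "amdgpu."
    params = {}
    for token in cmdline.split():
        if token.startswith(prefix) and "=" in token:
            name, value = token[len(prefix):].split("=", 1)
            params[name] = value
    return params


def read_cmdline(path="/proc/cmdline"):
    """Return the command line the running kernel was booted with."""
    with open(path) as handle:
        return handle.read()


if __name__ == "__main__":
    print(amdgpu_params(read_cmdline()))
```

Options that are not prefixed with `amdgpu.`, such as `iommu=off`, are deliberately ignored by this sketch.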

I grabbed the umr tool to try to capture the state of the GPU when it crashes, but
it does not seem to be able to read anything. Running:

umr -R gfx[.]

Leaves me with:

[ERROR]: Could not open ring debugfs file

I checked that the entries in /sys/kernel/debug/amdgpu that look relevant are there, but
cat'ing them gives me "Operation not permitted". Yes, I am doing it as root.
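To record exactly which error each entry returns, a small probe can be run over the files instead of cat'ing them one by one. This is a minimal Python sketch, not part of the original report; `probe_readable` is a made-up helper name. "Operation not permitted" is the message for EPERM, which is a different error from EACCES ("Permission denied"), so capturing the errno name may help whoever triages the report.

```python
import errno
import os


def probe_readable(paths):
    """Try to read one byte from each path; map path -> 'ok' or errno name."""
    outcome = {}
    for path in paths:
        try:
            with open(path, "rb") as handle:
                handle.read(1)
            outcome[path] = "ok"
        except OSError as exc:
            # Translate the numeric errno into its symbolic name, e.g. EPERM.
            outcome[path] = errno.errorcode.get(exc.errno, str(exc.errno))
    return outcome


if __name__ == "__main__":
    import sys

    # Pass the debugfs entries to test as command-line arguments.
    for path, status in probe_readable(sys.argv[1:]).items():
        print(f"{status:8} {os.path.abspath(path)}")
```

Run it as root with the paths of the entries that failed; the output lists one status per file.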

Once this happens the only way out is a hard reboot.

I am running up-to-date Fedora 28, kernel 4.17.2, Mesa 18.0 series, LLVM 6.0.1.

Is there anything else I can do?

Thanks.

-- 
You are receiving this mail because:
You are the assignee for the bug.


_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

