* [Bug 108493] Unigine Heaven at 4K crashes amdgpu and causes a GPU hang
From: bugzilla-daemon @ 2018-10-19 10:24 UTC (permalink / raw)
  To: dri-devel



https://bugs.freedesktop.org/show_bug.cgi?id=108493

            Bug ID: 108493
           Summary: Unigine Heaven at 4K crashes amdgpu and causes a GPU
                    hang
           Product: DRI
           Version: unspecified
          Hardware: x86-64 (AMD64)
                OS: Linux (All)
            Status: NEW
          Severity: normal
          Priority: medium
         Component: DRM/AMDgpu
          Assignee: dri-devel@lists.freedesktop.org
          Reporter: venemo@msn.com

I experience a consistent amdgpu crash when using my AMD GPU with a 4K screen.

Hardware:
* Sapphire Radeon RX 570 Pulse ITX 4GB
* Zotac AMP box mini external GPU enclosure
* Dell XPS 13 9370 laptop
* Dell U2718Q 4K display

Software:
I first tried Fedora 28 and am now using Fedora 29. I tried kernel versions
4.18.12, 4.18.13 and 4.19-rc7; the issue appears with all of them. The Mesa
version is 18.2.2, but the crash also occurs with 18.0 (on Fedora 28).

Steps to reproduce the crash:
1. Turn off the laptop
2. Attach the eGPU to the laptop
3. Attach a 4K screen to the HDMI output of the AMD GPU
4. Turn on the laptop
5. Add the following to the kernel command line: 'module_blacklist=i915 3'
('module_blacklist=i915' ensures the Intel GPU is not used at all, and '3'
boots to a text console so the graphical login won't interfere)
6. Launch the operating system
7. Log in from the console
8. Launch an X session with 'startx'
9. Start the Unigine Heaven benchmark in fullscreen 4K
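
Before running the benchmark it may help to confirm that the boot parameters
from step 5 actually took effect. The following is a minimal sketch, not part
of the original report: the helper names parse_cmdline and repro_boot_ok are
hypothetical, and the parser assumes no quoted values containing spaces.

```python
from pathlib import Path


def parse_cmdline(cmdline: str) -> dict:
    """Split a kernel command line into {name: value}; bare words map to None.

    Assumption: no quoted values containing spaces.
    """
    params = {}
    for token in cmdline.split():
        name, sep, value = token.partition("=")
        params[name] = value if sep else None
    return params


def repro_boot_ok(cmdline: str) -> bool:
    """True if i915 is blacklisted and the bare '3' argument is present."""
    params = parse_cmdline(cmdline)
    # module_blacklist takes a comma-separated list of module names.
    blacklisted = (params.get("module_blacklist") or "").split(",")
    return "i915" in blacklisted and "3" in params


if __name__ == "__main__":
    # /proc/cmdline holds the command line the running kernel was booted with.
    proc_cmdline = Path("/proc/cmdline")
    if proc_cmdline.exists():
        print(repro_boot_ok(proc_cmdline.read_text()))
```

Run on the booted machine, this prints True only when both parameters are
present on the running kernel's command line.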

Expected outcome:
Unigine Heaven should show up and run in a stable and performant manner.

Actual outcome:
Unigine Heaven shows up, runs for a couple of seconds, and then the screen goes
dark. I can still log into the machine over SSH, but cannot kill X or interact
with the AMD GPU in any way. I can't even reboot the machine; the only thing
that works is long-pressing the power key.

Relevant lines from dmesg log:
[  305.078426] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx timeout,
signaled seq=147930, emitted seq=147933
[  305.078567] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring sdma0 timeout,
signaled seq=3176, emitted seq=3178
[  305.078573] [drm] GPU recovery disabled.
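
The timeout lines above can be pulled out of a longer log mechanically. The
sketch below is an illustration only; the regular expression is written
against the three lines quoted in this report, and the "pending" figure
(emitted minus signaled) is my reading of the message, not something the
report states. The name ring_timeouts is hypothetical.

```python
import re

# Matches the "ring <name> timeout, signaled seq=N, emitted seq=M" message.
# \s+ is used between fields so that lines wrapped by a mail client still match.
RING_TIMEOUT = re.compile(
    r"ring\s+(?P<ring>\S+)\s+timeout,\s+"
    r"signaled\s+seq=(?P<signaled>\d+),\s+emitted\s+seq=(?P<emitted>\d+)"
)


def ring_timeouts(log: str) -> list:
    """Return (ring, signaled, emitted, pending) for each timeout in the log."""
    results = []
    for match in RING_TIMEOUT.finditer(log):
        signaled = int(match["signaled"])
        emitted = int(match["emitted"])
        # Assumption: the difference is the number of submissions that were
        # emitted to the ring but never signaled as complete.
        results.append((match["ring"], signaled, emitted, emitted - signaled))
    return results


# The excerpt from this report, including the mail client's line wrapping.
SAMPLE = """\
[  305.078426] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx timeout,
signaled seq=147930, emitted seq=147933
[  305.078567] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring sdma0 timeout,
signaled seq=3176, emitted seq=3178
[  305.078573] [drm] GPU recovery disabled.
"""

if __name__ == "__main__":
    for ring, signaled, emitted, pending in ring_timeouts(SAMPLE):
        print(f"{ring}: signaled={signaled} emitted={emitted} pending={pending}")
```

On the excerpt this reports the gfx ring with 3 outstanding sequence numbers
and the sdma0 ring with 2.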

Possible workarounds:
* The crash does not happen when I disable power management with amdgpu.dpm=0,
but performance is then very poor.
* The crash also doesn't happen when I run 'echo low >
/sys/class/drm/card0/device/power_dpm_force_performance_level', again at the
cost of poor performance.
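
The second workaround can also be scripted so that the previous level is
remembered and can be restored afterwards. This is a rough sketch under a few
assumptions: the card is card0 as in the command above, the script runs as
root, and only the basic levels (auto, low, high, manual) are accepted here.
The function name force_performance_level is hypothetical.

```python
from pathlib import Path

# sysfs node used in the workaround above; card0 is an assumption and may
# differ on systems with more than one GPU.
DEFAULT_NODE = Path(
    "/sys/class/drm/card0/device/power_dpm_force_performance_level"
)

# Basic levels only; the driver accepts further profile_* values that this
# sketch deliberately leaves out.
KNOWN_LEVELS = {"auto", "low", "high", "manual"}


def force_performance_level(level: str, node: Path = DEFAULT_NODE) -> str:
    """Write `level` to the sysfs node and return the previously set level."""
    if level not in KNOWN_LEVELS:
        raise ValueError(f"unsupported level: {level!r}")
    previous = node.read_text().strip()
    node.write_text(level + "\n")
    return previous


if __name__ == "__main__":
    if DEFAULT_NODE.exists():
        # Writing requires root; PermissionError is raised otherwise.
        old = force_performance_level("low")
        print(f"performance level changed from {old} to low")
```

Passing the returned value back to force_performance_level restores the
original setting once testing is done.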

Additional information:
* Note that running any other graphics-intensive application (e.g. your
favourite game) will also result in the same crash, but Unigine Heaven is what
I found to be the quickest way to reproduce it.
* Also note that the crash is not X-specific, but again this is what I found to
be the simplest way to reproduce it.
* The very same hardware works correctly on Windows without a crash, so this is
probably not a hardware defect.
* The crash is almost immediate at 4K, but it also occurs at other
resolutions; it just takes more time. At 1440p it takes a couple of minutes but
still crashes. At 1080p I could run it for several minutes without a crash (I
did not test further than that).
* The problem seems to be similar to these:
https://bugs.freedesktop.org/show_bug.cgi?id=105733 and
https://bugs.freedesktop.org/show_bug.cgi?id=102322 - the difference is that
the workarounds suggested there don't help; they only seem to postpone the
crash by a very small margin. It still crashes in less than a minute.
* Enabling GPU recovery does not actually manage to recover the GPU.

If you need any other kind of log or any more info, please let me know. Thank
you in advance for looking into solving this problem.


