From: bugzilla-daemon@freedesktop.org
To: dri-devel@lists.freedesktop.org
Subject: [Bug 111481] AMD Navi GPU frequent freezes on both Manjaro/Ubuntu with kernel 5.3 and mesa 19.2 -git/llvm9
Date: Sat, 09 Nov 2019 17:57:57 +0000

https://bugs.freedesktop.org/show_bug.cgi?id=111481

Comment #223 on bug 111481 from lptech1024@gmail.com
Followup to #216:

Fedora 31: Kernel 5.3.9, GNOME 3.34, Mesa 19.2.2, linux-firmware 20190923,
LLVM 9.0.0

The hang is 100% reproducible.

It occurs running the Linux-native (Vulkan) version of Shadow of the Tomb
Raider (SotTR). I have never run SotTR under Proton/Wine, so that isn't a
confounding variable.

The hang occurs during the (unskippable) cutscene for the Amazon River in
Peru, anywhere from 15 seconds before the pilot is struck up to the moment
the pilot is struck. Even when the video hangs, you can usually hear
fragments (sound effects) of the game for a few seconds afterward.

I ran SotTR with vktrace and activated the GNOME (Wayland) overview to see
if I could catch any relevant terminal output there (none that I saw). The
game still had focus, so it continued playing. After the hang (when I
rebooted), there wasn't a vktrace file. I assume that either it wasn't
written out because of the hang or there was no content to write.

However, with it running visible in the overview (and a manual kernel
update), I got both ring gfx and sdma errors:

Nov 07 [SNIP]:24 [SNIP] kernel: [drm] GPU recovery disabled.
Nov 07 [SNIP]:24 [SNIP] kernel: [drm] GPU recovery disabled.
Nov 07 [SNIP]:24 [SNIP] kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process  pid 0 thread  pid 0
Nov 07 [SNIP]:24 [SNIP] kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process gnome-shell pid 1722 thread gnome-shel:cs0 pid 1768
Nov 07 [SNIP]:24 [SNIP] kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring sdma1 timeout, signaled seq=1049, emitted seq=1053
Nov 07 [SNIP]:24 [SNIP] kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring sdma0 timeout, signaled seq=30017, emitted seq=30020
Nov 07 [SNIP]:19 [SNIP] kernel: [drm] GPU recovery disabled.
Nov 07 [SNIP]:19 [SNIP] kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process ShadowOfTheTomb pid 3890 thread WebViewRenderer pid 4981
Nov 07 [SNIP]:19 [SNIP] kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx_0.0.0 timeout, signaled seq=75610, emitted seq=75612
Nov 07 [SNIP]:19 [SNIP] kernel: [drm:amdgpu_dm_atomic_commit_tail [amdgpu]] *ERROR* Waiting for fences timed out or interrupted!
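
For anyone comparing these timeouts across reports: the gap between the
"emitted" and "signaled" sequence numbers suggests how many submissions were
still outstanding on each ring when the timeout fired. The following is only
a rough Python sketch. It assumes each log entry has been joined onto a
single line with the quoted-printable escapes decoded (so "seq=1049" rather
than "seq=3D1049"). The regular expression and the helper name pending_jobs
are my own illustration, not part of any amdgpu tooling, and the pattern is
based solely on the lines quoted in this comment.

```python
import re

# Pattern inferred from the timeout lines in this comment; it is an
# assumption, not a documented or stable kernel log format.
RING_TIMEOUT = re.compile(
    r"ring (?P<ring>\S+) timeout, "
    r"signaled seq=(?P<signaled>\d+), emitted seq=(?P<emitted>\d+)"
)


def pending_jobs(log_text):
    """Return {ring name: emitted - signaled} for each ring-timeout line."""
    pending = {}
    for match in RING_TIMEOUT.finditer(log_text):
        emitted = int(match["emitted"])
        signaled = int(match["signaled"])
        pending[match["ring"]] = emitted - signaled
    return pending


# The three ring-timeout lines from the log above, prefix trimmed.
sample = """\
[drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring sdma1 timeout, signaled seq=1049, emitted seq=1053
[drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring sdma0 timeout, signaled seq=30017, emitted seq=30020
[drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx_0.0.0 timeout, signaled seq=75610, emitted seq=75612
"""

print(pending_jobs(sample))
# {'sdma1': 4, 'sdma0': 3, 'gfx_0.0.0': 2}
```

In this log every ring shows only a handful of unsignaled submissions, which
fits a stall rather than a large backlog, though that reading is an
interpretation and not something the log states outright.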

As a workaround to proceed in the game, I downloaded the AMDVLK 2019.Q4.2
.deb, extracted the contents, modified the JSON file (to point to the local
amdvlk64.so), and ran SotTR with the VK_ICD_FILENAMES variable set to the
AMDVLK JSON file.
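
The JSON edit and the environment variable can be sketched in Python roughly
as follows. This is a minimal illustration, not the exact steps I ran: it
assumes the extracted manifest follows the usual Vulkan ICD layout with an
"ICD" object holding "library_path", and every path and helper name below
(point_icd_at_local_library, env_with_icd, the directories in the usage
comment) is hypothetical.

```python
import json
import os
from pathlib import Path


def point_icd_at_local_library(manifest_in, manifest_out, library):
    """Copy an ICD manifest, rewriting ICD.library_path to a local library."""
    manifest = json.loads(Path(manifest_in).read_text())
    # Assumes the standard manifest layout: {"ICD": {"library_path": ...}}.
    manifest["ICD"]["library_path"] = str(Path(library).resolve())
    Path(manifest_out).write_text(json.dumps(manifest, indent=2))
    return manifest


def env_with_icd(manifest_path, base_env=None):
    """Return a copy of the environment that selects only this manifest."""
    env = dict(os.environ if base_env is None else base_env)
    env["VK_ICD_FILENAMES"] = str(Path(manifest_path).resolve())
    return env


# Hypothetical usage, with placeholder paths for the extracted .deb contents:
#
#   import subprocess
#   point_icd_at_local_library(
#       "extracted/etc/vulkan/icd.d/amd_icd64.json",
#       "local_amd_icd64.json",
#       "extracted/usr/lib/x86_64-linux-gnu/amdvlk64.so",
#   )
#   subprocess.run(["./game-launcher"], env=env_with_icd("local_amd_icd64.json"))
```

Setting VK_ICD_FILENAMES only in the launched process's environment keeps
the system-wide RADV installation untouched, so switching back is just a
matter of starting the game without the variable.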

The AMDVLK graphics were terrible (a significant percentage of random pixels
turning random colors, bad rendering of elements, etc.), but I did not
experience any hangs during the cutscene. After reaching a known save point,
I switched back to mesa/RADV-llvm and haven't experienced a hang since (I
haven't progressed much further yet, but that's the only hang so far; about
13% of the game has been completed).

This would seem to point to a bug at least partially due to mesa/RADV-llvm.

