From mboxrd@z Thu Jan 1 00:00:00 1970
From: bugzilla-daemon@freedesktop.org
Subject: [Bug 111481] AMD Navi GPU frequent freezes on both Manjaro/Ubuntu
with kernel 5.3 and mesa 19.2 -git/llvm9
Date: Sun, 10 Nov 2019 12:26:53 +0000
Message-ID:
References:
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="===============0590466287=="
Return-path:
Received: from culpepper.freedesktop.org (culpepper.freedesktop.org
[IPv6:2610:10:20:722:a800:ff:fe98:4b55])
by gabe.freedesktop.org (Postfix) with ESMTP id 3ACEE6E612
for ; Sun, 10 Nov 2019 12:27:06 +0000 (UTC)
In-Reply-To:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: dri-devel-bounces@lists.freedesktop.org
Sender: "dri-devel"
To: dri-devel@lists.freedesktop.org
List-Id: dri-devel@lists.freedesktop.org
--===============0590466287==
Content-Type: multipart/alternative; boundary="15733888250.9B3cA.2731"
Content-Transfer-Encoding: 7bit
--15733888250.9B3cA.2731
Date: Sun, 10 Nov 2019 12:27:05 +0000
MIME-Version: 1.0
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Bugzilla-URL: http://bugs.freedesktop.org/
Auto-Submitted: auto-generated
https://bugs.freedesktop.org/show_bug.cgi?id=3D111481
--- Comment #224 from Marko Popovic ---
(In reply to lptech1024 from comment #223)
> Followup to #216:
>=20
> Fedora 31: Kernel 5.3.9, GNOME 3.34, Mesa 19.2.2, linux-firmware 20190923,
> LLVM 9.0.0
>=20
> The hang is 100% reproducible.
>=20
> It occurs running the Linux-native (Vulkan) version of Shadow of the Tomb
> Raider (SotTR). I have never run SotTR under Proton/Wine, so that isn't a
> confounding variable.
>=20
> The (unskippable) cutscene is for the Amazon River in Peru and occurs
> anywhere between 15 seconds before the pilot is struck and the pilot is
> struck. Even when the video hangs, you can usually hear fragments (sound
> effects) of the game for a few seconds afterwords.
>=20
> I ran SotTR with vktrace and activated the Gnome (Wayland) overview to see
> if there I could catch any relevant terminal output (none that I saw). The
> game still had focus, so it continued playing. After the hang (when I
> rebooted), there wasn't a vktrace file. I would assume this would be eith=
er
> it didn't write it out due to the hang or it didn't have content to write.
>=20
> However, with it running visible in the overview (and a manual kernel
> update), I got both ring gfx and sdma errors:
>=20
> Nov 07 [SNIP]:24 [SNIP] kernel: [drm] GPU recovery disabled.
> Nov 07 [SNIP]:24 [SNIP] kernel: [drm] GPU recovery disabled.
> Nov 07 [SNIP]:24 [SNIP] kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR*
> Process information: process pid 0 thread pid 0
> Nov 07 [SNIP]:24 [SNIP] kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR*
> Process information: process gnome-shell pid 1722 thread gnome-shel:cs0 p=
id
> 1768
> Nov 07 [SNIP]:24 [SNIP] kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR*
> ring sdma1 timeout, signaled seq=3D1049, emitted seq=3D1053
> Nov 07 [SNIP]:24 [SNIP] kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR*
> ring sdma0 timeout, signaled seq=3D30017, emitted seq=3D30020
> Nov 07 [SNIP]:19 [SNIP] kernel: [drm] GPU recovery disabled.
> Nov 07 [SNIP]:19 [SNIP] kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR*
> Process information: process ShadowOfTheTomb pid 3890 thread WebViewRende=
rer
> pid 4981
> Nov 07 [SNIP]:19 [SNIP] kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR*
> ring gfx_0.0.0 timeout, signaled seq=3D75610, emitted seq=3D75612
> Nov 07 [SNIP]:19 [SNIP] kernel: [drm:amdgpu_dm_atomic_commit_tail [amdgpu=
]]
> *ERROR* Waiting for fences timed out or interrupted!
>=20
> As a workaround to proceed in the game, I downloaded the AMDVLD 2019.Q4.2
> .deb, extracted the contents, modified the JSON file (to point to the loc=
al
> amdvlk64.so), and ran SotTR with the VK_ICD_FILENAMES variable set to the
> AMDVLK JSON file.
>=20
> The AMDVLK graphics were terrible (significant percentage of random pixels
> turning random colors, bad rendering of elements, etc), but I did not
> experience any hangs during the cutscene. After reaching a known save poi=
nt,
> I switched back to mesa/RADV-llvm and haven't experienced a hang since
> (haven't progressed that much further yet, but that's the only hang so fa=
r -
> about 13% of the game has been completed).
>=20
> This would seem to point to a bug at least partially due to mesa/RADV-llv=
m.
radv related hangs got fixed in Mesa 20 git series, this thread is more
concerned with SDMA kernel-driver hangs.
--=20
You are receiving this mail because:
You are the assignee for the bug.=
--15733888250.9B3cA.2731
Date: Sun, 10 Nov 2019 12:27:05 +0000
MIME-Version: 1.0
Content-Type: text/html; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Bugzilla-URL: http://bugs.freedesktop.org/
Auto-Submitted: auto-generated
Comm=
ent # 224
on bug 11148=
1
from Marko Popovic
(In reply to lptech1024 from comment #223)
> Followup to #216:
>=20
> Fedora 31: Kernel 5.3.9, GNOME 3.34, Mesa 19.2.2, linux-firmware 20190=
923,
> LLVM 9.0.0
>=20
> The hang is 100% reproducible.
>=20
> It occurs running the Linux-native (Vulkan) version of Shadow of the T=
omb
> Raider (SotTR). I have never run SotTR under Proton/Wine, so that isn'=
t a
> confounding variable.
>=20
> The (unskippable) cutscene is for the Amazon River in Peru and occurs
> anywhere between 15 seconds before the pilot is struck and the pilot is
> struck. Even when the video hangs, you can usually hear fragments (sou=
nd
> effects) of the game for a few seconds afterwords.
>=20
> I ran SotTR with vktrace and activated the Gnome (Wayland) overview to=
see
> if there I could catch any relevant terminal output (none that I saw).=
The
> game still had focus, so it continued playing. After the hang (when I
> rebooted), there wasn't a vktrace file. I would assume this would be e=
ither
> it didn't write it out due to the hang or it didn't have content to wr=
ite.
>=20
> However, with it running visible in the overview (and a manual kernel
> update), I got both ring gfx and sdma errors:
>=20
> Nov 07 [SNIP]:24 [SNIP] kernel: [drm] GPU recovery disabled.
> Nov 07 [SNIP]:24 [SNIP] kernel: [drm] GPU recovery disabled.
> Nov 07 [SNIP]:24 [SNIP] kernel: [drm:amdgpu_job_timedout [amdgpu]] *ER=
ROR*
> Process information: process pid 0 thread pid 0
> Nov 07 [SNIP]:24 [SNIP] kernel: [drm:amdgpu_job_timedout [amdgpu]] *ER=
ROR*
> Process information: process gnome-shell pid 1722 thread gnome-shel:cs=
0 pid
> 1768
> Nov 07 [SNIP]:24 [SNIP] kernel: [drm:amdgpu_job_timedout [amdgpu]] *ER=
ROR*
> ring sdma1 timeout, signaled seq=3D1049, emitted seq=3D1053
> Nov 07 [SNIP]:24 [SNIP] kernel: [drm:amdgpu_job_timedout [amdgpu]] *ER=
ROR*
> ring sdma0 timeout, signaled seq=3D30017, emitted seq=3D30020
> Nov 07 [SNIP]:19 [SNIP] kernel: [drm] GPU recovery disabled.
> Nov 07 [SNIP]:19 [SNIP] kernel: [drm:amdgpu_job_timedout [amdgpu]] *ER=
ROR*
> Process information: process ShadowOfTheTomb pid 3890 thread WebViewRe=
nderer
> pid 4981
> Nov 07 [SNIP]:19 [SNIP] kernel: [drm:amdgpu_job_timedout [amdgpu]] *ER=
ROR*
> ring gfx_0.0.0 timeout, signaled seq=3D75610, emitted seq=3D75612
> Nov 07 [SNIP]:19 [SNIP] kernel: [drm:amdgpu_dm_atomic_commit_tail [amd=
gpu]]
> *ERROR* Waiting for fences timed out or interrupted!
>=20
> As a workaround to proceed in the game, I downloaded the AMDVLD 2019.Q=
4.2
> .deb, extracted the contents, modified the JSON file (to point to the =
local
> amdvlk64.so), and ran SotTR with the VK_ICD_FILENAMES variable set to =
the
> AMDVLK JSON file.
>=20
> The AMDVLK graphics were terrible (significant percentage of random pi=
xels
> turning random colors, bad rendering of elements, etc), but I did not
> experience any hangs during the cutscene. After reaching a known save =
point,
> I switched back to mesa/RADV-llvm and haven't experienced a hang since
> (haven't progressed that much further yet, but that's the only hang so=
far -
> about 13% of the game has been completed).
>=20
> This would seem to point to a bug at least partially due to mesa/RADV-=
llvm.
radv related hangs got fixed in Mesa 20 git series, this thread is more
concerned with SDMA kernel-driver hangs.
You are receiving this mail because:
- You are the assignee for the bug.
=
--15733888250.9B3cA.2731--
--===============0590466287==
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: base64
Content-Disposition: inline
X19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX18KZHJpLWRldmVs
IG1haWxpbmcgbGlzdApkcmktZGV2ZWxAbGlzdHMuZnJlZWRlc2t0b3Aub3JnCmh0dHBzOi8vbGlz
dHMuZnJlZWRlc2t0b3Aub3JnL21haWxtYW4vbGlzdGluZm8vZHJpLWRldmVs
--===============0590466287==--