From mboxrd@z Thu Jan 1 00:00:00 1970
From: bugzilla-daemon@freedesktop.org
Subject: [Bug 102322] System crashes after "[drm] IP block:gmc_v8_0 is hung!" / [drm] IP block:sdma_v3_0 is hung!
Date: Tue, 26 Jun 2018 15:20:45 +0000
Message-ID:
References:
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="===============1679670518=="
Return-path:
Received: from culpepper.freedesktop.org (culpepper.freedesktop.org
[131.252.210.165])
by gabe.freedesktop.org (Postfix) with ESMTP id 514E86E5AA
for ; Tue, 26 Jun 2018 15:20:45 +0000 (UTC)
In-Reply-To:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: dri-devel-bounces@lists.freedesktop.org
Sender: "dri-devel"
To: dri-devel@lists.freedesktop.org
List-Id: dri-devel@lists.freedesktop.org
--===============1679670518==
Content-Type: multipart/alternative; boundary="15300264450.b3E37c.31284"
Content-Transfer-Encoding: 7bit
--15300264450.b3E37c.31284
Date: Tue, 26 Jun 2018 15:20:45 +0000
MIME-Version: 1.0
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Bugzilla-URL: http://bugs.freedesktop.org/
Auto-Submitted: auto-generated
https://bugs.freedesktop.org/show_bug.cgi?id=102322
--- Comment #8 from Andrey Grodzovsky ---
(In reply to dwagner from comment #7)
> (In reply to Andrey Grodzovsky from comment #6)
> > Verify you are using latest AMD firmware and up to date MESA/LLVM
>
> Firmware:
>
> pacman -Q linux-firmware
> linux-firmware 20180606.d114732-1
>
> ll /usr/lib/firmware/amdgpu/vega10_vce.bin
> -rw-r--r-- 1 root root 165344 Jun 7 08:01 /usr/lib/firmware/amdgpu/vega10_vce.bin
>
>
> MESA:
>
> pacman -Q mesa
> mesa 18.1.2-1
>
>
> LLVM:
> pacman -Q llvm-libs
> llvm-libs 6.0.0-4
>
> Is this new enough?
The kernel and MESA seem new enough. LLVM is at 6, so you may want to try 7.
The firmware also looks fairly recent, but I would still advise manually
overwriting all firmware files with the files from here:
https://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git/tree/amdgpu
Back up your existing firmware/amdgpu folder first, just in case.
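A minimal sketch of the backup step, in case it helps. The function name and the timestamped ".bak-" suffix are made up for illustration, and the path in the usage comment is the one from the listing quoted above; it may differ on other distributions and needs root to write next to it.

```python
import shutil
import time
from pathlib import Path


def backup_firmware_dir(src: Path, backup_root: Path) -> Path:
    """Copy a firmware directory to a timestamped folder and return the copy's path.

    Hypothetical helper for illustration; not part of any AMD or distro tooling.
    """
    if not src.is_dir():
        raise FileNotFoundError(f"firmware directory not found: {src}")
    stamp = time.strftime("%Y%m%d-%H%M%S")
    dest = backup_root / f"{src.name}.bak-{stamp}"
    # copytree raises if dest already exists, so an earlier backup is never overwritten.
    shutil.copytree(src, dest)
    return dest


# Example call (assumed path, run as root):
#   backup_firmware_dir(Path("/usr/lib/firmware/amdgpu"), Path("/usr/lib/firmware"))
```

Once the copy exists, the files from the linux-firmware tree can be placed over the originals, and the backup can be copied back if the new files make things worse.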
>
>
> BTW: In a forum somebody asked what the dmesg output on crash looked like if
> I enabled amdgpu.gpu_recovery=1 - the result is a few lines more of output,
> but still a fatal system crash:
>
> Jun 26 00:50:09 ryzen kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx timeout, last signaled seq=12277, last emitted seq=12279
> Jun 26 00:50:09 ryzen kernel: [drm] IP block:gmc_v8_0 is hung!
> Jun 26 00:50:09 ryzen kernel: [drm] IP block:gfx_v8_0 is hung!
> Jun 26 00:50:09 ryzen kernel: amdgpu 0000:0a:00.0: GPU reset begin!
> Jun 26 00:50:15 ryzen kernel: [drm:drm_atomic_helper_wait_for_flip_done [drm_kms_helper]] *ERROR* [CRTC:42:crtc-0] flip_done timed out
> Jun 26 00:50:15 ryzen kernel: [drm:drm_atomic_helper_wait_for_dependencies [drm_kms_helper]] *ERROR* [CRTC:42:crtc-0] flip_done timed out
> Jun 26 00:50:25 ryzen kernel: [drm:drm_atomic_helper_wait_for_dependencies [drm_kms_helper]] *ERROR* [PLANE:40:plane-4] flip_done timed out
It's a known issue. Try the patch I attached to resolve the deadlock, but you
will probably experience other failures after that anyway.
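For comparing logs before and after the patch, a small parser for the ring timeout line can help. This is only a sketch: the regular expression is derived from the single log line quoted above, the function name and the "pending" field are made up for illustration, and reading emitted minus signaled as the number of outstanding fences is my interpretation of the two sequence numbers.

```python
import re

# Pattern derived from the one "ring gfx timeout" line in this report;
# other kernel versions may word the message differently.
TIMEOUT_RE = re.compile(
    r"ring (?P<ring>\S+) timeout, "
    r"last signaled seq=(?P<signaled>\d+), "
    r"last emitted seq=(?P<emitted>\d+)"
)


def parse_ring_timeout(log_text: str):
    """Return ring name and sequence numbers from a timeout message, or None."""
    # Collapse whitespace so a message wrapped across lines still matches.
    joined = " ".join(log_text.split())
    match = TIMEOUT_RE.search(joined)
    if match is None:
        return None
    signaled = int(match["signaled"])
    emitted = int(match["emitted"])
    return {
        "ring": match["ring"],
        "signaled": signaled,
        "emitted": emitted,
        # Assumed meaning: fences emitted to the ring but not yet signaled.
        "pending": emitted - signaled,
    }


sample = (
    "Jun 26 00:50:09 ryzen kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* "
    "ring gfx timeout, last signaled seq=12277, last emitted seq=12279"
)
print(parse_ring_timeout(sample))
```

For the line above this reports ring "gfx" with two sequence numbers emitted but not signaled at the time of the timeout.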
Andrey
-- 
You are receiving this mail because:
You are the assignee for the bug.
--15300264450.b3E37c.31284--
--===============1679670518==--