From mboxrd@z Thu Jan 1 00:00:00 1970
From: bugzilla-daemon@freedesktop.org
Subject: [Bug 111481] AMD Navi GPU frequent freezes on both Manjaro/Ubuntu with kernel 5.3 and mesa 19.2 -git/llvm9
Date: Sun, 01 Sep 2019 00:36:02 +0000
Message-ID:
References:
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="===============0052882344=="
Return-path:
Received: from culpepper.freedesktop.org (culpepper.freedesktop.org [IPv6:2610:10:20:722:a800:ff:fe98:4b55]) by gabe.freedesktop.org (Postfix) with ESMTP id AC5936E162 for ; Sun, 1 Sep 2019 00:36:03 +0000 (UTC)
In-Reply-To:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: dri-devel-bounces@lists.freedesktop.org
Sender: "dri-devel"
To: dri-devel@lists.freedesktop.org
List-Id: dri-devel@lists.freedesktop.org

--===============0052882344==
Content-Type: multipart/alternative; boundary="15672981630.6bF4aB.21039"
Content-Transfer-Encoding: 7bit

--15672981630.6bF4aB.21039
Date: Sun, 1 Sep 2019 00:36:03 +0000
MIME-Version: 1.0
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Bugzilla-URL: http://bugs.freedesktop.org/
Auto-Submitted: auto-generated

https://bugs.freedesktop.org/show_bug.cgi?id=111481

--- Comment #10 from Marko Popovic ---
(In reply to Matthias Müller from comment #9)
> On my side i can report that the issue does not occur if i don't use a tool
> to modify the FANs - does anyone of you use something of the like or are
> this seperate issues?

I don't use any tools, everything is stock.

(In reply to Mathieu Belanger from comment #7)
> Created attachment 145225 [details] [review]
> Merge last adg5f code
>
> Ok, I did look at the recent kernel patch and commit and they seam to have
> fixed a couple bugs. I do not know it it include these but I did not crash
> one time since I merged that into the kernel 5.3-rc6. (that code is staged
> for 5.4 merge window).
>
> I did attach the patch so you can merge that if you wish to try. It add all
> the latest bits for AMDGPU into 5.3-rc6, including Renoir support.

After applying the patch, the same type of error still occurs. Luckily it is
very easy to reproduce with the Citra emulator; apparently it does something
that AMD's driver really doesn't like, which makes the error more likely to
occur. The error also seems more likely on my end when the CPU is under heavy
I/O load.

Latest log after applying the patch from the merge posted in the attachment:

sep 01 02:29:10 Marko-PC kernel: [drm:amdgpu_dm_atomic_commit_tail [amdgpu]] *ERROR* Waiting for fences timed out!
sep 01 02:29:10 Marko-PC kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx_0.0.0 timeout, signaled seq=16312, emitted seq=16314
sep 01 02:29:10 Marko-PC kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process citra-qt pid 2928 thread citra-qt:cs0 pid 2938
sep 01 02:29:10 Marko-PC kernel: [drm] GPU recovery disabled.

It would be very nice to get an official response from AMD, if only to
confirm that we are being listened to.

-- 
You are receiving this mail because:
You are the assignee for the bug.

--15672981630.6bF4aB.21039
Date: Sun, 1 Sep 2019 00:36:03 +0000
MIME-Version: 1.0
Content-Type: text/html; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Bugzilla-URL: http://bugs.freedesktop.org/
Auto-Submitted: auto-generated

--15672981630.6bF4aB.21039--

--===============0052882344==
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: base64
Content-Disposition: inline

X19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX18KZHJpLWRldmVs
IG1haWxpbmcgbGlzdApkcmktZGV2ZWxAbGlzdHMuZnJlZWRlc2t0b3Aub3JnCmh0dHBzOi8vbGlz
dHMuZnJlZWRlc2t0b3Aub3JnL21haWxtYW4vbGlzdGluZm8vZHJpLWRldmVs

--===============0052882344==--