* [Bug 111763] ring_gfx hangs/freezes on Navi gpus
@ 2019-09-22 12:01 bugzilla-daemon
  2019-09-23  2:46 ` bugzilla-daemon
                   ` (38 more replies)
  0 siblings, 39 replies; 40+ messages in thread
From: bugzilla-daemon @ 2019-09-22 12:01 UTC (permalink / raw)
  To: dri-devel


https://bugs.freedesktop.org/show_bug.cgi?id=111763

            Bug ID: 111763
           Summary: ring_gfx hangs/freezes on Navi gpus
           Product: DRI
           Version: unspecified
          Hardware: x86-64 (AMD64)
                OS: Linux (All)
            Status: NEW
          Severity: major
          Priority: not set
         Component: DRM/AMDgpu
          Assignee: dri-devel@lists.freedesktop.org
          Reporter: popovic.marko@protonmail.com

I'm opening this bug to track ring_gfx hangs separately, so that
https://bugs.freedesktop.org/show_bug.cgi?id=111481 can stay focused on
sdma0/1-type freezes, since those are the ones that seem to cause random
"out of the blue" hangs on the desktop.

There is another type of freeze/hang that happens when playing Starcraft II via
D9VK. This one doesn't seem to be related to either NGG or DMA, because I have
both disabled with AMD_DEBUG=nodma and AMD_DEBUG=nongg and the hangs occur
anyway, in exactly the same place every time.
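
AMD_DEBUG is a single environment variable, so assigning it twice keeps only
the last value. A minimal Python sketch of carrying both flags in one
assignment, assuming Mesa accepts a comma-separated list here; the helper name
env_with_amd_debug is made up for illustration:

```python
import os


def env_with_amd_debug(flags, base_env=None):
    """Return a copy of the environment with AMD_DEBUG set to the given flags.

    Hypothetical helper: the flags are joined with commas so that a single
    variable carries all of them (a second assignment would replace the first).
    """
    env = dict(os.environ if base_env is None else base_env)
    env["AMD_DEBUG"] = ",".join(flags)
    return env


if __name__ == "__main__":
    env = env_with_amd_debug(["nodma", "nongg"])
    print(env["AMD_DEBUG"])
    # The resulting dict can be passed as subprocess.run(..., env=env)
    # when launching the program under test.
```

The helper copies the environment instead of modifying os.environ, so the
flags only apply to the process that is launched with the returned dict.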

Error logs:
sep 17 11:48:24 Marko-PC kernel: [drm:amdgpu_dm_commit_planes.constprop.0
[amdgpu]] *ERROR* Waiting for fences timed out or interrupted!
sep 17 11:48:24 Marko-PC kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR*
ring gfx_0.0.0 timeout, signaled seq=2361623, emitted seq=2361625
sep 17 11:48:24 Marko-PC kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR*
Process information: process SC2_x64.exe pid 20236 thread SC2_x64.exe pid 20236
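
The timeout lines above carry two fence sequence numbers. A minimal Python
sketch for pulling them out of a saved log, assuming the message format shown
above; the regular expression and the function name are illustrative, and
reading the difference as "fences emitted but not yet signaled" is an
assumption:

```python
import re

# Matches the part of an amdgpu_job_timedout message that names the ring and
# the two sequence numbers. \s+ also matches line breaks, so messages that
# were wrapped by the mail client are still recognized.
TIMEOUT_RE = re.compile(
    r"ring\s+(?P<ring>\S+)\s+timeout,"
    r"\s+signaled\s+seq=(?P<signaled>\d+),"
    r"\s+emitted\s+seq=(?P<emitted>\d+)"
)


def parse_ring_timeouts(log_text):
    """Return one dict per ring timeout message found in log_text."""
    results = []
    for match in TIMEOUT_RE.finditer(log_text):
        signaled = int(match.group("signaled"))
        emitted = int(match.group("emitted"))
        results.append(
            {
                "ring": match.group("ring"),
                "signaled": signaled,
                "emitted": emitted,
                # Assumed meaning: work submitted but not completed.
                "pending": emitted - signaled,
            }
        )
    return results


SAMPLE = (
    "sep 17 11:48:24 Marko-PC kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR*\n"
    "ring gfx_0.0.0 timeout, signaled seq=2361623, emitted seq=2361625\n"
)

if __name__ == "__main__":
    for entry in parse_ring_timeouts(SAMPLE):
        print(entry)
```

Comparing the ring name across reports is a quick way to separate gfx_0.0.0
timeouts from the sdma0/sdma1 ones tracked in the other bug.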

I will try to provide trace files captured with RenderDoc for the issues
described. They also happen in native games like Rise of the Tomb Raider and
Vulkan etc. I will provide as much info as possible.

Using kernel 5.3, Mesa 19.2 and LLVM 9.

-- 
You are receiving this mail because:
You are the assignee for the bug.

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


* [Bug 111763] ring_gfx hangs/freezes on Navi gpus
  2019-09-22 12:01 [Bug 111763] ring_gfx hangs/freezes on Navi gpus bugzilla-daemon
@ 2019-09-23  2:46 ` bugzilla-daemon
  2019-09-23  6:56 ` bugzilla-daemon
                   ` (37 subsequent siblings)
  38 siblings, 0 replies; 40+ messages in thread
From: bugzilla-daemon @ 2019-09-23  2:46 UTC (permalink / raw)
  To: dri-devel


--- Comment #1 from Jeremy Attali <jeremy.attali@gmail.com> ---
Not sure if this will help anyone else, but I found a workaround in my case
with DOOM. I was having the same crashes Marko described with Starcraft II, so
I tried the following:

- In Steam, I disabled the In Game Steam Overlay
- I switched the Graphics API from OpenGL to Vulkan

I have not had any crashes so far, but I haven't tried to isolate which of the
two changes made the difference.

Packages:
linux 5.3.arch1-1
linux-firmware-agd5f-radeon-navi10 2019.09.13.18.36-1
mesa-git 1:19.3.0_devel.115574.40087ffc5b9-1
vulkan-radeon-git 1:19.3.0_devel.115574.40087ffc5b9-1
libdrm 2.4.99-1
lib32-mesa-git 1:19.3.0_devel.115574.40087ffc5b9-1
lib32-vulkan-radeon-git 1:19.3.0_devel.115574.40087ffc5b9-1
lib32-libdrm 2.4.99-1

* [Bug 111763] ring_gfx hangs/freezes on Navi gpus
  2019-09-22 12:01 [Bug 111763] ring_gfx hangs/freezes on Navi gpus bugzilla-daemon
  2019-09-23  2:46 ` bugzilla-daemon
@ 2019-09-23  6:56 ` bugzilla-daemon
  2019-09-23  6:57 ` bugzilla-daemon
                   ` (36 subsequent siblings)
  38 siblings, 0 replies; 40+ messages in thread
From: bugzilla-daemon @ 2019-09-23  6:56 UTC (permalink / raw)
  To: dri-devel


--- Comment #2 from Daniel Lu <daniel.lawrence.lu@gmail.com> ---
Created attachment 145464
  --> https://bugs.freedesktop.org/attachment.cgi?id=145464&action=edit
dmesg output

* [Bug 111763] ring_gfx hangs/freezes on Navi gpus
  2019-09-22 12:01 [Bug 111763] ring_gfx hangs/freezes on Navi gpus bugzilla-daemon
  2019-09-23  2:46 ` bugzilla-daemon
  2019-09-23  6:56 ` bugzilla-daemon
@ 2019-09-23  6:57 ` bugzilla-daemon
  2019-09-23  7:00 ` bugzilla-daemon
                   ` (35 subsequent siblings)
  38 siblings, 0 replies; 40+ messages in thread
From: bugzilla-daemon @ 2019-09-23  6:57 UTC (permalink / raw)
  To: dri-devel


--- Comment #3 from Daniel Lu <daniel.lawrence.lu@gmail.com> ---
Created attachment 145465
  --> https://bugs.freedesktop.org/attachment.cgi?id=145465&action=edit
output of running sudo umr -R gfx_0.0.0

* [Bug 111763] ring_gfx hangs/freezes on Navi gpus
  2019-09-22 12:01 [Bug 111763] ring_gfx hangs/freezes on Navi gpus bugzilla-daemon
                   ` (2 preceding siblings ...)
  2019-09-23  6:57 ` bugzilla-daemon
@ 2019-09-23  7:00 ` bugzilla-daemon
  2019-09-30 12:18 ` bugzilla-daemon
                   ` (34 subsequent siblings)
  38 siblings, 0 replies; 40+ messages in thread
From: bugzilla-daemon @ 2019-09-23  7:00 UTC (permalink / raw)
  To: dri-devel


--- Comment #4 from Daniel Lu <daniel.lawrence.lu@gmail.com> ---
I am seeing a similar hang in Starcraft II. Unlike Marko, I am not using d9vk
--- instead, I'm using wine-nine. The hang doesn't happen in all games but
seems to be particularly frequent in the coop mission "dead of night".

Using mesa-git 19.3.0_devel.115092.3f5b541fc8b-1.

* [Bug 111763] ring_gfx hangs/freezes on Navi gpus
  2019-09-22 12:01 [Bug 111763] ring_gfx hangs/freezes on Navi gpus bugzilla-daemon
                   ` (3 preceding siblings ...)
  2019-09-23  7:00 ` bugzilla-daemon
@ 2019-09-30 12:18 ` bugzilla-daemon
  2019-09-30 15:10 ` bugzilla-daemon
                   ` (33 subsequent siblings)
  38 siblings, 0 replies; 40+ messages in thread
From: bugzilla-daemon @ 2019-09-30 12:18 UTC (permalink / raw)
  To: dri-devel


--- Comment #5 from Doug Ty <git@dougty.com> ---
I've been getting this too with Minecraft:  
https://bugs.freedesktop.org/show_bug.cgi?id=111669

For my particular case at least, AMD_DEBUG=nodma seems to fix it

* [Bug 111763] ring_gfx hangs/freezes on Navi gpus
  2019-09-22 12:01 [Bug 111763] ring_gfx hangs/freezes on Navi gpus bugzilla-daemon
                   ` (4 preceding siblings ...)
  2019-09-30 12:18 ` bugzilla-daemon
@ 2019-09-30 15:10 ` bugzilla-daemon
  2019-09-30 21:55 ` bugzilla-daemon
                   ` (32 subsequent siblings)
  38 siblings, 0 replies; 40+ messages in thread
From: bugzilla-daemon @ 2019-09-30 15:10 UTC (permalink / raw)
  To: dri-devel


--- Comment #6 from Marko Popovic <popovic.marko@protonmail.com> ---
(In reply to Doug Ty from comment #5)
> I've been getting this too with Minecraft:  
> https://bugs.freedesktop.org/show_bug.cgi?id=111669
> 
> For my particular case at least, AMD_DEBUG=nodma seems to fix it

(In reply to Marko Popovic from comment #0)
> There is another type of freeze/hang happening when playing Starcraft II via
> D9VK. This one doesn't seem to be related to either ngg or dma because I
> have them both disabled by AMD_DEBUG=nodma and AMD_DEBUG=nongg and the hangs
> occur anyway, on exactly the same place every time.

You are referring to the sdma0/sdma1 type of hang, which is tracked here:
https://bugs.freedesktop.org/show_bug.cgi?id=111481

The ring_gfx hangs are considerably more reproducible and are not affected by
AMD_DEBUG=nodma or AMD_DEBUG=nongg, as I already mentioned above in the bug
description.

* [Bug 111763] ring_gfx hangs/freezes on Navi gpus
  2019-09-22 12:01 [Bug 111763] ring_gfx hangs/freezes on Navi gpus bugzilla-daemon
                   ` (5 preceding siblings ...)
  2019-09-30 15:10 ` bugzilla-daemon
@ 2019-09-30 21:55 ` bugzilla-daemon
  2019-09-30 22:02 ` bugzilla-daemon
                   ` (31 subsequent siblings)
  38 siblings, 0 replies; 40+ messages in thread
From: bugzilla-daemon @ 2019-09-30 21:55 UTC (permalink / raw)
  To: dri-devel


--- Comment #7 from Doug Ty <git@dougty.com> ---
(In reply to Marko Popovic from comment #6)
> (In reply to Doug Ty from comment #5)
> > I've been getting this too with Minecraft:  
> > https://bugs.freedesktop.org/show_bug.cgi?id=111669
> > 
> > For my particular case at least, AMD_DEBUG=nodma seems to fix it
> 
> You are referring to sdma0 / sdma1 type hang which is tracked
> here:https://bugs.freedesktop.org/show_bug.cgi?id=111481
> 
> For ring_gfx hangs they're quite more reproducible and are not affected by
> AMD_DEBUG=nodma or AMD_DEBUG=nongg which I already mentioned above in the
> bug description.

Sorry, but this is incorrect. My Minecraft hang is most definitely a ring gfx
hang, *not* sdma. I've posted logs and apitraces in the linked thread if you'd
like to check for yourself.

I can't explain why nodma isn't working for you; perhaps it doesn't work for
the game? Have you tried putting it in /etc/environment so it's system-wide? I
don't know what to tell you regarding nodma, but my hang is definitely ring gfx
as well.

* [Bug 111763] ring_gfx hangs/freezes on Navi gpus
  2019-09-22 12:01 [Bug 111763] ring_gfx hangs/freezes on Navi gpus bugzilla-daemon
                   ` (6 preceding siblings ...)
  2019-09-30 21:55 ` bugzilla-daemon
@ 2019-09-30 22:02 ` bugzilla-daemon
  2019-10-03 12:26 ` bugzilla-daemon
                   ` (30 subsequent siblings)
  38 siblings, 0 replies; 40+ messages in thread
From: bugzilla-daemon @ 2019-09-30 22:02 UTC (permalink / raw)
  To: dri-devel


--- Comment #8 from Marko Popovic <popovic.marko@protonmail.com> ---
(In reply to Doug Ty from comment #7)
> (In reply to Marko Popovic from comment #6)
> > (In reply to Doug Ty from comment #5)
> > > I've been getting this too with Minecraft:  
> > > https://bugs.freedesktop.org/show_bug.cgi?id=111669
> > > 
> > > For my particular case at least, AMD_DEBUG=nodma seems to fix it
> > 
> > You are referring to sdma0 / sdma1 type hang which is tracked
> > here:https://bugs.freedesktop.org/show_bug.cgi?id=111481
> > 
> > For ring_gfx hangs they're quite more reproducible and are not affected by
> > AMD_DEBUG=nodma or AMD_DEBUG=nongg which I already mentioned above in the
> > bug description.
> 
> Sorry, but this is incorrect. My Minecraft hang is most definitely a ring
> gfx hang, *not* sdma. I've posted logs and apitraces in the linked thread if
> you'd like to check for yourself.
> 
> I can't explain why nodma isn't working for you, perhaps it doesn't work for
> game? Have you tried putting it in /etc/environment so it's system-wide? I
> don't know what to tell you regarding nodma, but my hang is definitely ring
> gfx as well.

I guess we just have many different types of hangs then... ring_gfx hangs seem
more mysterious than sdma0/1 hangs, since there is no "universal" workaround
for them. nodma stops the global sdma-type hangs for me, and nongg stops the
citra-related ring_gfx hang, but neither of those two variables stops the
Starcraft II and RoTR ring_gfx-type hangs for me, so it's really confusing.

* [Bug 111763] ring_gfx hangs/freezes on Navi gpus
  2019-09-22 12:01 [Bug 111763] ring_gfx hangs/freezes on Navi gpus bugzilla-daemon
                   ` (7 preceding siblings ...)
  2019-09-30 22:02 ` bugzilla-daemon
@ 2019-10-03 12:26 ` bugzilla-daemon
  2019-10-11 13:37 ` bugzilla-daemon
                   ` (29 subsequent siblings)
  38 siblings, 0 replies; 40+ messages in thread
From: bugzilla-daemon @ 2019-10-03 12:26 UTC (permalink / raw)
  To: dri-devel


--- Comment #9 from Marko Popovic <popovic.marko@protonmail.com> ---
https://cgit.freedesktop.org/mesa/mesa/commit/?id=a2a68d551c1c2a4f13761ffa8f3f6f13fee7a384

This might actually fix the ring_gfx type hangs, or even the sdma ones, at
least for the Vulkan API? I'm not exactly sure, but I will be testing the
latest Mesa builds from Oibaf's PPA in the following days and will report back
on the issue :)

* [Bug 111763] ring_gfx hangs/freezes on Navi gpus
  2019-09-22 12:01 [Bug 111763] ring_gfx hangs/freezes on Navi gpus bugzilla-daemon
                   ` (8 preceding siblings ...)
  2019-10-03 12:26 ` bugzilla-daemon
@ 2019-10-11 13:37 ` bugzilla-daemon
  2019-10-11 13:57 ` bugzilla-daemon
                   ` (28 subsequent siblings)
  38 siblings, 0 replies; 40+ messages in thread
From: bugzilla-daemon @ 2019-10-11 13:37 UTC (permalink / raw)
  To: dri-devel


--- Comment #10 from takios+fdbugs@takios.de ---
(In reply to Marko Popovic from comment #9)
> https://cgit.freedesktop.org/mesa/mesa/commit/
> ?id=a2a68d551c1c2a4f13761ffa8f3f6f13fee7a384
> 
> This might actually fix the ring_gfx type hangs or even sdma ones at least
> for Vulkan API? Not exactly sure but will also be testing the latest MESA
> builds from Oibaf's PPA in following days and report back on the issue :)

Sadly, I'm still getting the ring_gfx hangs after a few minutes of playing
Trackmania 2.

* [Bug 111763] ring_gfx hangs/freezes on Navi gpus
  2019-09-22 12:01 [Bug 111763] ring_gfx hangs/freezes on Navi gpus bugzilla-daemon
                   ` (9 preceding siblings ...)
  2019-10-11 13:37 ` bugzilla-daemon
@ 2019-10-11 13:57 ` bugzilla-daemon
  2019-10-15 12:58 ` bugzilla-daemon
                   ` (27 subsequent siblings)
  38 siblings, 0 replies; 40+ messages in thread
From: bugzilla-daemon @ 2019-10-11 13:57 UTC (permalink / raw)
  To: dri-devel


--- Comment #11 from Marko Popovic <popovic.marko@protonmail.com> ---
(In reply to takios+fdbugs from comment #10)
> (In reply to Marko Popovic from comment #9)
> > https://cgit.freedesktop.org/mesa/mesa/commit/
> > ?id=a2a68d551c1c2a4f13761ffa8f3f6f13fee7a384
> > 
> > This might actually fix the ring_gfx type hangs or even sdma ones at least
> > for Vulkan API? Not exactly sure but will also be testing the latest MESA
> > builds from Oibaf's PPA in following days and report back on the issue :)
> 
> Sadly, I'm still getting the ring_gfx hangs after a few minutes of playing
> Trackmania 2.

Oh yes, I forgot to add a reply here. It didn't solve any of the hangs for me
either.

* [Bug 111763] ring_gfx hangs/freezes on Navi gpus
  2019-09-22 12:01 [Bug 111763] ring_gfx hangs/freezes on Navi gpus bugzilla-daemon
                   ` (10 preceding siblings ...)
  2019-10-11 13:57 ` bugzilla-daemon
@ 2019-10-15 12:58 ` bugzilla-daemon
  2019-10-15 17:10 ` bugzilla-daemon
                   ` (26 subsequent siblings)
  38 siblings, 0 replies; 40+ messages in thread
From: bugzilla-daemon @ 2019-10-15 12:58 UTC (permalink / raw)
  To: dri-devel


--- Comment #12 from shahul <shahulhameed.481@gmail.com> ---

I am working on a Navi10 RX 5700.
I am facing the issue below when I run the Unigine Heaven benchmark:

 [drm:amdgpu_dm_atomic_commit_tail [amdgpu]] *ERROR* Waiting for fences timed
out!
 [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx_0.0.0 timeout, signaled
seq=5075872, emitted seq=5075874
[drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process
heaven_x64 pid 13723 thread heaven_x64:cs0 pid 13741
 [drm] GPU recovery disabled.

Is there any fix for it?

Thanks in advance.

* [Bug 111763] ring_gfx hangs/freezes on Navi gpus
  2019-09-22 12:01 [Bug 111763] ring_gfx hangs/freezes on Navi gpus bugzilla-daemon
                   ` (11 preceding siblings ...)
  2019-10-15 12:58 ` bugzilla-daemon
@ 2019-10-15 17:10 ` bugzilla-daemon
  2019-10-23  7:26 ` bugzilla-daemon
                   ` (25 subsequent siblings)
  38 siblings, 0 replies; 40+ messages in thread
From: bugzilla-daemon @ 2019-10-15 17:10 UTC (permalink / raw)
  To: dri-devel


--- Comment #13 from Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> ---
For hangs involving radv the AMD_DEBUG options aren't relevant.
You should use RADV_DEBUG instead (probably doesn't support the same values).

Also opening a bug in https://gitlab.freedesktop.org/mesa/mesa/issues is a good
idea since gfx hangs are most likely a driver issue (radv or radeonsi,
depending on the API used).
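
A minimal Python sketch of that distinction, assuming radeonsi is the OpenGL
driver and radv the Vulkan driver; the mapping and the function name are
illustrative, and the flag values each variable accepts have to be taken from
the respective driver's documentation:

```python
# Which debug variable each Mesa driver reads, per the comment above.
DEBUG_VAR_BY_DRIVER = {
    "radeonsi": "AMD_DEBUG",  # OpenGL
    "radv": "RADV_DEBUG",     # Vulkan
}


def debug_assignment(driver, flags):
    """Build a VAR=value string for the given driver (hypothetical helper)."""
    try:
        variable = DEBUG_VAR_BY_DRIVER[driver]
    except KeyError:
        raise ValueError(f"unknown driver: {driver!r}") from None
    return f"{variable}={','.join(flags)}"


if __name__ == "__main__":
    # nodma is the radeonsi flag discussed earlier in this thread.
    print(debug_assignment("radeonsi", ["nodma"]))
```

The flag list is passed through unchanged on purpose, since the two variables
probably do not support the same values.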

* [Bug 111763] ring_gfx hangs/freezes on Navi gpus
  2019-09-22 12:01 [Bug 111763] ring_gfx hangs/freezes on Navi gpus bugzilla-daemon
                   ` (12 preceding siblings ...)
  2019-10-15 17:10 ` bugzilla-daemon
@ 2019-10-23  7:26 ` bugzilla-daemon
  2019-10-31 12:09 ` bugzilla-daemon
                   ` (24 subsequent siblings)
  38 siblings, 0 replies; 40+ messages in thread
From: bugzilla-daemon @ 2019-10-23  7:26 UTC (permalink / raw)
  To: dri-devel


sambolinux <sambolinux@gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Priority|not set                     |medium

* [Bug 111763] ring_gfx hangs/freezes on Navi gpus
  2019-09-22 12:01 [Bug 111763] ring_gfx hangs/freezes on Navi gpus bugzilla-daemon
                   ` (13 preceding siblings ...)
  2019-10-23  7:26 ` bugzilla-daemon
@ 2019-10-31 12:09 ` bugzilla-daemon
  2019-10-31 12:11 ` bugzilla-daemon
                   ` (23 subsequent siblings)
  38 siblings, 0 replies; 40+ messages in thread
From: bugzilla-daemon @ 2019-10-31 12:09 UTC (permalink / raw)
  To: dri-devel


--- Comment #14 from wychuchol <wychuchol7777@gmail.com> ---
RX 5700 XT, Pop OS 19.10, latest Oibaf Mesa, not sure which LLVM.
Anomaly 1.5.0 update 3, a standalone 64-bit mod for S.T.A.L.K.E.R. Call of
Pripyat, running under Wine with d3dx11_43->dxvk (winetricks dxvk
d3dcompiler_43 d3dx11_43).

Oct 30 02:49:30 pop-os kernel: [ 4864.627343]
[drm:amdgpu_dm_commit_planes.constprop.0 [amdgpu]] *ERROR* Waiting for fences
timed out!
Oct 30 02:49:30 pop-os kernel: [ 4869.231450] [drm:amdgpu_job_timedout
[amdgpu]] *ERROR* ring gfx_0.0.0 timeout, signaled seq=2626284, emitted
seq=2626286
Oct 30 02:49:30 pop-os kernel: [ 4869.231486] [drm:amdgpu_job_timedout
[amdgpu]] *ERROR* Process information: process AnomalyDX11.exe pid 5791 thread
AnomalyDX11.exe pid 5791
Oct 30 02:49:30 pop-os kernel: [ 4869.231487] [drm] GPU recovery disabled.

It happens at random. Sometimes it hangs straight away, sometimes it can go
over an hour without a crash. It is a complete crash, with no option available
besides a hard reset. Not even the mouse pointer moves (as with the sdma0
hang).

I'm sorry if this is not the right place to report this, I'm somewhat new to
all of this.

* [Bug 111763] ring_gfx hangs/freezes on Navi gpus
  2019-09-22 12:01 [Bug 111763] ring_gfx hangs/freezes on Navi gpus bugzilla-daemon
                   ` (14 preceding siblings ...)
  2019-10-31 12:09 ` bugzilla-daemon
@ 2019-10-31 12:11 ` bugzilla-daemon
  2019-11-01  1:23 ` bugzilla-daemon
                   ` (22 subsequent siblings)
  38 siblings, 0 replies; 40+ messages in thread
From: bugzilla-daemon @ 2019-10-31 12:11 UTC (permalink / raw)
  To: dri-devel


--- Comment #15 from wychuchol <wychuchol7777@gmail.com> ---
Forgot to add, Kernel v5.4-rc5.

* [Bug 111763] ring_gfx hangs/freezes on Navi gpus
  2019-09-22 12:01 [Bug 111763] ring_gfx hangs/freezes on Navi gpus bugzilla-daemon
                   ` (15 preceding siblings ...)
  2019-10-31 12:11 ` bugzilla-daemon
@ 2019-11-01  1:23 ` bugzilla-daemon
  2019-11-01 16:26 ` bugzilla-daemon
                   ` (21 subsequent siblings)
  38 siblings, 0 replies; 40+ messages in thread
From: bugzilla-daemon @ 2019-11-01  1:23 UTC (permalink / raw)
  To: dri-devel


--- Comment #16 from Andrew Sheldon <asheldon55@gmail.com> ---
(In reply to wychuchol from comment #14)
> RX 5700 XT Pop OS 19.10 latest Oibaf mesa not sure what llvm
> Anomaly 1.5.0 update 3 standalone 64 bit mod for S.T.A.L.K.E.R. Call of
> Pripyat running under wine d3dx11_43->dxvk (winetricks dxvk d3dcompiler_43
> d3dx11_43)
> 
> Oct 30 02:49:30 pop-os kernel: [ 4864.627343]
> [drm:amdgpu_dm_commit_planes.constprop.0 [amdgpu]] *ERROR* Waiting for
> fences timed out!
> Oct 30 02:49:30 pop-os kernel: [ 4869.231450] [drm:amdgpu_job_timedout
> [amdgpu]] *ERROR* ring gfx_0.0.0 timeout, signaled seq=2626284, emitted
> seq=2626286
> Oct 30 02:49:30 pop-os kernel: [ 4869.231486] [drm:amdgpu_job_timedout
> [amdgpu]] *ERROR* Process information: process AnomalyDX11.exe pid 5791
> thread AnomalyDX11.exe pid 5791
> Oct 30 02:49:30 pop-os kernel: [ 4869.231487] [drm] GPU recovery disabled.
> 
> Happens at random. Sometimes hangs straight away, sometimes can go over an
> hour without crash. Complete crash, no option available besides hard reset.
> Not even mouse pointer would move (as with sdma0 hang).
> 
> I'm sorry if it's not the right place to report this, I'm somewhat new to
> all of this.

Ring gfx type hangs tend to be in Mesa. Report here:
https://gitlab.freedesktop.org/mesa/mesa/issues

Also I'm not sure how up to date the Oibaf repo is, but Mesa git landed ACO
recently for Navi cards. You can try with RADV_PERFTEST=aco environment
variable set if your Mesa is new enough, and you might have better luck with
hangs.
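
A minimal Python sketch of launching a program with that variable set for one
process only; the function name is made up, and the child process here is a
stand-in for the actual game command:

```python
import os
import subprocess
import sys


def run_with_aco(command, **run_kwargs):
    """Run command with RADV_PERFTEST=aco added to a copy of the environment.

    Extra keyword arguments are passed through to subprocess.run.
    """
    env = dict(os.environ)
    env["RADV_PERFTEST"] = "aco"
    return subprocess.run(command, env=env, **run_kwargs)


if __name__ == "__main__":
    # Stand-in for a game: a child Python process that prints the variable.
    child = [sys.executable, "-c", "import os; print(os.environ['RADV_PERFTEST'])"]
    result = run_with_aco(child, capture_output=True, text=True)
    print(result.stdout.strip())
```

Setting the variable per process keeps the rest of the desktop on the default
compiler, which makes it easier to compare behaviour with and without ACO.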

* [Bug 111763] ring_gfx hangs/freezes on Navi gpus
  2019-09-22 12:01 [Bug 111763] ring_gfx hangs/freezes on Navi gpus bugzilla-daemon
                   ` (16 preceding siblings ...)
  2019-11-01  1:23 ` bugzilla-daemon
@ 2019-11-01 16:26 ` bugzilla-daemon
  2019-11-02 12:35 ` bugzilla-daemon
                   ` (20 subsequent siblings)
  38 siblings, 0 replies; 40+ messages in thread
From: bugzilla-daemon @ 2019-11-01 16:26 UTC (permalink / raw)
  To: dri-devel


--- Comment #17 from wychuchol <wychuchol7777@gmail.com> ---
(In reply to Andrew Sheldon from comment #16)
> (In reply to wychuchol from comment #14)
> > RX 5700 XT Pop OS 19.10 latest Oibaf mesa not sure what llvm
> > Anomaly 1.5.0 update 3 standalone 64 bit mod for S.T.A.L.K.E.R. Call of
> > Pripyat running under wine d3dx11_43->dxvk (winetricks dxvk d3dcompiler_43
> > d3dx11_43)
> > 
> > Oct 30 02:49:30 pop-os kernel: [ 4864.627343]
> > [drm:amdgpu_dm_commit_planes.constprop.0 [amdgpu]] *ERROR* Waiting for
> > fences timed out!
> > Oct 30 02:49:30 pop-os kernel: [ 4869.231450] [drm:amdgpu_job_timedout
> > [amdgpu]] *ERROR* ring gfx_0.0.0 timeout, signaled seq=2626284, emitted
> > seq=2626286
> > Oct 30 02:49:30 pop-os kernel: [ 4869.231486] [drm:amdgpu_job_timedout
> > [amdgpu]] *ERROR* Process information: process AnomalyDX11.exe pid 5791
> > thread AnomalyDX11.exe pid 5791
> > Oct 30 02:49:30 pop-os kernel: [ 4869.231487] [drm] GPU recovery disabled.
> > 
> > Happens at random. Sometimes hangs straight away, sometimes can go over an
> > hour without crash. Complete crash, no option available besides hard reset.
> > Not even mouse pointer would move (as with sdma0 hang).
> > 
> > I'm sorry if it's not the right place to report this, I'm somewhat new to
> > all of this.
> 
> Ring gfx type hangs tend to be in Mesa. Report here:
> https://gitlab.freedesktop.org/mesa/mesa/issues
> 
> Also I'm not sure how up to date the Oibaf repo is, but Mesa git landed ACO
> recently for Navi cards. You can try with RADV_PERFTEST=aco environment
> variable set if your Mesa is new enough, and you might have better luck with
> hangs.

Thank you so very much. There's no way to be sure, since the hangs seemed to
happen at random, but I would have expected at least 2 or 3 of them in the time
I've tested it, and it has been a smooth ride so far. No performance impact
either, though the way I run this game I'm supposedly putting most of the
calculations on the CPU, not the GPU.
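
For anyone who wants to script the suggestion quoted above, here is a minimal Python sketch. It assumes RADV_PERFTEST is read as a comma-separated list of flags and that the installed Mesa is new enough to know the aco flag; the helper names env_with_aco and launch are made up for illustration.

```python
import os
import subprocess


def env_with_aco(base=None):
    """Return a copy of the environment with "aco" added to RADV_PERFTEST."""
    env = dict(os.environ if base is None else base)
    # Keep any flags that are already set and append "aco" only once.
    flags = [f for f in env.get("RADV_PERFTEST", "").split(",") if f]
    if "aco" not in flags:
        flags.append("aco")
    env["RADV_PERFTEST"] = ",".join(flags)
    return env


def launch(command):
    """Run `command` (a list of arguments) with ACO requested for RADV."""
    return subprocess.run(command, env=env_with_aco(), check=False)


# Show the value a child process would see when starting from an empty environment.
print(env_with_aco({})["RADV_PERFTEST"])
```

The variable only affects processes started with that environment, so it has to be set for the game launcher (or wine) itself, not just in an unrelated shell.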


* [Bug 111763] ring_gfx hangs/freezes on Navi gpus
  2019-09-22 12:01 [Bug 111763] ring_gfx hangs/freezes on Navi gpus bugzilla-daemon
                   ` (17 preceding siblings ...)
  2019-11-01 16:26 ` bugzilla-daemon
@ 2019-11-02 12:35 ` bugzilla-daemon
  2019-11-02 23:11 ` bugzilla-daemon
                   ` (19 subsequent siblings)
  38 siblings, 0 replies; 40+ messages in thread
From: bugzilla-daemon @ 2019-11-02 12:35 UTC (permalink / raw)
  To: dri-devel


[-- Attachment #1.1: Type: text/plain, Size: 1053 bytes --]

https://bugs.freedesktop.org/show_bug.cgi?id=111763

--- Comment #18 from wychuchol <wychuchol7777@gmail.com> ---
It happened again. This time without a game or anything running, barely logged
in and opened a program and boom.

Nov  2 12:42:07 pop-os kernel: [ 1675.883513]
[drm:amdgpu_dm_commit_planes.constprop.0 [amdgpu]] *ERROR* Waiting for fences
timed out!
Nov  2 12:42:07 pop-os kernel: [ 1680.747513] [drm:amdgpu_job_timedout
[amdgpu]] *ERROR* ring gfx_0.0.0 timeout, signaled seq=2714, emitted seq=2716
Nov  2 12:42:07 pop-os kernel: [ 1680.747549] [drm:amdgpu_job_timedout
[amdgpu]] *ERROR* Process information: process Xorg pid 2293 thread Xorg:cs0
pid 2294
Nov  2 12:42:07 pop-os kernel: [ 1680.747551] [drm] GPU recovery disabled.

Only the cursor moved and no clicks registered; I restarted with REISUB.
I tried registering at https://gitlab.freedesktop.org/mesa/mesa/issues but I'm
not getting the account confirmation mail, so I can't post it there.


* [Bug 111763] ring_gfx hangs/freezes on Navi gpus
  2019-09-22 12:01 [Bug 111763] ring_gfx hangs/freezes on Navi gpus bugzilla-daemon
                   ` (18 preceding siblings ...)
  2019-11-02 12:35 ` bugzilla-daemon
@ 2019-11-02 23:11 ` bugzilla-daemon
  2019-11-04 16:08 ` bugzilla-daemon
                   ` (18 subsequent siblings)
  38 siblings, 0 replies; 40+ messages in thread
From: bugzilla-daemon @ 2019-11-02 23:11 UTC (permalink / raw)
  To: dri-devel


[-- Attachment #1.1: Type: text/plain, Size: 1477 bytes --]

https://bugs.freedesktop.org/show_bug.cgi?id=111763

--- Comment #19 from wychuchol <wychuchol7777@gmail.com> ---
This perhaps needs a separate entry, but it's related (it didn't happen before
I tried RADV_PERFTEST=aco and AMD_DEBUG="nongg,nodma"), so I'll post it in case
someone else has the same issue.

After some time in Witcher 3 GOTY, run with Lutris, the PC restarts on its own.
I thought something was overheating (I've noticed graphics card memory in
PSensor sometimes reaching 90, so I thought maybe that's what's happening), but
I investigated kern.log and this always happened before that autonomous reset:

Nov  2 22:01:53 pop-os kernel: [  979.244964] pcieport 0000:00:01.1: AER:
Corrected error received: 0000:01:00.0
Nov  2 22:01:53 pop-os kernel: [  979.244967] nvme 0000:01:00.0: AER: PCIe Bus
Error: severity=Corrected, type=Data Link Layer, (Transmitter ID)
Nov  2 22:01:53 pop-os kernel: [  979.244968] nvme 0000:01:00.0: AER:   device
[1987:5012] error status/mask=00001000/00006000
Nov  2 22:01:53 pop-os kernel: [  979.244968] nvme 0000:01:00.0: AER:    [12]
Timeout               
Nov  2 22:01:53 pop-os kernel: [  979.262629] Emergency Sync complete

A solution I found is to add pci=nommconf in /etc/default/grub to the line 
GRUB_CMDLINE_LINUX_DEFAULT="quiet splash" (so it looks like this:
GRUB_CMDLINE_LINUX_DEFAULT="quiet splash pci=nommconf").
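
The edit described above can be sketched in Python as a pure text transformation. This is only an illustration: the function name add_kernel_param is made up, it works on a string rather than on /etc/default/grub itself, and it assumes the value is written in double quotes as in the example line.

```python
import re


def add_kernel_param(grub_text, param):
    """Append `param` to GRUB_CMDLINE_LINUX_DEFAULT unless it is already there."""

    def patch(match):
        params = match.group(2).split()
        if param not in params:
            params.append(param)
        return f'{match.group(1)}"{" ".join(params)}"'

    # Only the GRUB_CMDLINE_LINUX_DEFAULT line is rewritten; other lines pass through.
    return re.sub(
        r'^(GRUB_CMDLINE_LINUX_DEFAULT=)"([^"]*)"',
        patch,
        grub_text,
        flags=re.MULTILINE,
    )


before = 'GRUB_TIMEOUT=5\nGRUB_CMDLINE_LINUX_DEFAULT="quiet splash"\n'
print(add_kernel_param(before, "pci=nommconf"))
```

The change in the file has no effect until the bootloader configuration is regenerated and the machine is rebooted; how that is done depends on the distribution.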


* [Bug 111763] ring_gfx hangs/freezes on Navi gpus
  2019-09-22 12:01 [Bug 111763] ring_gfx hangs/freezes on Navi gpus bugzilla-daemon
                   ` (19 preceding siblings ...)
  2019-11-02 23:11 ` bugzilla-daemon
@ 2019-11-04 16:08 ` bugzilla-daemon
  2019-11-04 16:10 ` bugzilla-daemon
                   ` (17 subsequent siblings)
  38 siblings, 0 replies; 40+ messages in thread
From: bugzilla-daemon @ 2019-11-04 16:08 UTC (permalink / raw)
  To: dri-devel


[-- Attachment #1.1: Type: text/plain, Size: 1520 bytes --]

https://bugs.freedesktop.org/show_bug.cgi?id=111763

--- Comment #20 from wychuchol <wychuchol7777@gmail.com> ---
Barely started the PC, opened Pale Moon, got a hang where only the cursor
moves, and then dozens of graphical artifacts on screen, like square patches of
glitches.

Nov  3 13:15:10 pop-os kernel: [  133.998883]
[drm:amdgpu_dm_commit_planes.constprop.0 [amdgpu]] *ERROR* Waiting for fences
timed out!
Nov  3 13:15:10 pop-os kernel: [  139.118912] [drm:amdgpu_job_timedout
[amdgpu]] *ERROR* ring gfx_0.0.0 timeout, signaled seq=11145, emitted seq=11148
Nov  3 13:15:10 pop-os kernel: [  139.118956] [drm:amdgpu_job_timedout
[amdgpu]] *ERROR* Process information: process gnome-shell pid 2588 thread
gnome-shel:cs0 pid 2606
Nov  3 13:15:10 pop-os kernel: [  139.118958] [drm] GPU recovery disabled.

Then sometime later I got ring gfx related crash with Witcher 3 which didn't
happen before:
Nov  3 14:08:47 pop-os kernel: [ 3185.175837]
[drm:amdgpu_dm_commit_planes.constprop.0 [amdgpu]] *ERROR* Waiting for fences
timed out!
Nov  3 14:08:47 pop-os kernel: [ 3190.039750] [drm:amdgpu_job_timedout
[amdgpu]] *ERROR* ring gfx_0.0.0 timeout, signaled seq=1448573, emitted
seq=1448575
Nov  3 14:08:47 pop-os kernel: [ 3190.039786] [drm:amdgpu_job_timedout
[amdgpu]] *ERROR* Process information: process witcher3.exe pid 8100 thread
witcher3.exe pid 10168
Nov  3 14:08:47 pop-os kernel: [ 3190.039788] [drm] GPU recovery disabled.
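
The timeout lines in these logs all have the same shape: the ring name, then the last signaled and the last emitted sequence number. A minimal Python sketch, assuming the log text has been pasted into a string (the name parse_ring_timeouts is made up for illustration), extracts those fields and the gap between the two numbers:

```python
import re

# Matches the amdgpu_job_timedout message quoted in this thread. \s+ is used
# between tokens because mail clients wrap the log lines at arbitrary points.
RING_TIMEOUT = re.compile(
    r"ring\s+(?P<ring>\S+)\s+timeout,\s+"
    r"signaled\s+seq=(?P<signaled>\d+),\s+"
    r"emitted\s+seq=(?P<emitted>\d+)"
)


def parse_ring_timeouts(log_text):
    """Return (ring, signaled, emitted, emitted - signaled) per timeout found."""
    results = []
    for m in RING_TIMEOUT.finditer(log_text):
        signaled = int(m.group("signaled"))
        emitted = int(m.group("emitted"))
        results.append((m.group("ring"), signaled, emitted, emitted - signaled))
    return results


sample = """
Nov  3 13:15:10 pop-os kernel: [  139.118912] [drm:amdgpu_job_timedout
[amdgpu]] *ERROR* ring gfx_0.0.0 timeout, signaled seq=11145, emitted seq=11148
Nov  3 14:08:47 pop-os kernel: [ 3190.039750] [drm:amdgpu_job_timedout
[amdgpu]] *ERROR* ring gfx_0.0.0 timeout, signaled seq=1448573, emitted
seq=1448575
"""

for ring, signaled, emitted, gap in parse_ring_timeouts(sample):
    print(ring, signaled, emitted, gap)
```

The "soft recovered" variant of the message carries no sequence numbers, so this pattern deliberately skips it.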


* [Bug 111763] ring_gfx hangs/freezes on Navi gpus
  2019-09-22 12:01 [Bug 111763] ring_gfx hangs/freezes on Navi gpus bugzilla-daemon
                   ` (20 preceding siblings ...)
  2019-11-04 16:08 ` bugzilla-daemon
@ 2019-11-04 16:10 ` bugzilla-daemon
  2019-11-04 22:13 ` bugzilla-daemon
                   ` (16 subsequent siblings)
  38 siblings, 0 replies; 40+ messages in thread
From: bugzilla-daemon @ 2019-11-04 16:10 UTC (permalink / raw)
  To: dri-devel


[-- Attachment #1.1: Type: text/plain, Size: 1665 bytes --]

https://bugs.freedesktop.org/show_bug.cgi?id=111763

--- Comment #21 from Marko Popovic <popovic.marko@protonmail.com> ---
(In reply to wychuchol from comment #20)
> Barely started PC, opened palemoon, cursor-moves-only hang and then dozens
> of graphical artifacts on screen like square patches of glitches. 
> 
> Nov  3 13:15:10 pop-os kernel: [  133.998883]
> [drm:amdgpu_dm_commit_planes.constprop.0 [amdgpu]] *ERROR* Waiting for
> fences timed out!
> Nov  3 13:15:10 pop-os kernel: [  139.118912] [drm:amdgpu_job_timedout
> [amdgpu]] *ERROR* ring gfx_0.0.0 timeout, signaled seq=11145, emitted
> seq=11148
> Nov  3 13:15:10 pop-os kernel: [  139.118956] [drm:amdgpu_job_timedout
> [amdgpu]] *ERROR* Process information: process gnome-shell pid 2588 thread
> gnome-shel:cs0 pid 2606
> Nov  3 13:15:10 pop-os kernel: [  139.118958] [drm] GPU recovery disabled.
> 
> Then sometime later I got ring gfx related crash with Witcher 3 which didn't
> happen before:
> Nov  3 14:08:47 pop-os kernel: [ 3185.175837]
> [drm:amdgpu_dm_commit_planes.constprop.0 [amdgpu]] *ERROR* Waiting for
> fences timed out!
> Nov  3 14:08:47 pop-os kernel: [ 3190.039750] [drm:amdgpu_job_timedout
> [amdgpu]] *ERROR* ring gfx_0.0.0 timeout, signaled seq=1448573, emitted
> seq=1448575
> Nov  3 14:08:47 pop-os kernel: [ 3190.039786] [drm:amdgpu_job_timedout
> [amdgpu]] *ERROR* Process information: process witcher3.exe pid 8100 thread
> witcher3.exe pid 10168
> Nov  3 14:08:47 pop-os kernel: [ 3190.039788] [drm] GPU recovery disabled.

What kernel/MESA combo are you using?


* [Bug 111763] ring_gfx hangs/freezes on Navi gpus
  2019-09-22 12:01 [Bug 111763] ring_gfx hangs/freezes on Navi gpus bugzilla-daemon
                   ` (21 preceding siblings ...)
  2019-11-04 16:10 ` bugzilla-daemon
@ 2019-11-04 22:13 ` bugzilla-daemon
  2019-11-05  6:07 ` bugzilla-daemon
                   ` (15 subsequent siblings)
  38 siblings, 0 replies; 40+ messages in thread
From: bugzilla-daemon @ 2019-11-04 22:13 UTC (permalink / raw)
  To: dri-devel


[-- Attachment #1.1: Type: text/plain, Size: 1181 bytes --]

https://bugs.freedesktop.org/show_bug.cgi?id=111763

--- Comment #22 from wychuchol <wychuchol7777@gmail.com> ---
(In reply to Marko Popovic from comment #21)
> What kernel/MESA combo are you using?

DRM 3.35.0, 5.4.0-050400rc5-generic, LLVM 9.0.0
Mesa 19.3.0-devel (git-ff6e148 2019-10-29 eoan-oibaf-ppa)

Or at least that's what I got from glxinfo | grep OpenGL

Stalker hung again after just a few minutes of playtime, so I don't know
whether any of the fixes actually fixed anything or merely held things together
a bit more securely.

Nov  4 23:04:16 pop-os kernel: [100672.998576]
[drm:amdgpu_dm_commit_planes.constprop.0 [amdgpu]] *ERROR* Waiting for fences
timed out!
Nov  4 23:04:16 pop-os kernel: [100677.862509] [drm:amdgpu_job_timedout
[amdgpu]] *ERROR* ring gfx_0.0.0 timeout, signaled seq=23742723, emitted
seq=23742725
Nov  4 23:04:16 pop-os kernel: [100677.862545] [drm:amdgpu_job_timedout
[amdgpu]] *ERROR* Process information: process AnomalyDX11.exe pid 3904 thread
AnomalyDX11.exe pid 3904
Nov  4 23:04:16 pop-os kernel: [100677.862547] [drm] GPU recovery disabled.


* [Bug 111763] ring_gfx hangs/freezes on Navi gpus
  2019-09-22 12:01 [Bug 111763] ring_gfx hangs/freezes on Navi gpus bugzilla-daemon
                   ` (22 preceding siblings ...)
  2019-11-04 22:13 ` bugzilla-daemon
@ 2019-11-05  6:07 ` bugzilla-daemon
  2019-11-05 16:28 ` bugzilla-daemon
                   ` (14 subsequent siblings)
  38 siblings, 0 replies; 40+ messages in thread
From: bugzilla-daemon @ 2019-11-05  6:07 UTC (permalink / raw)
  To: dri-devel


[-- Attachment #1.1: Type: text/plain, Size: 2296 bytes --]

https://bugs.freedesktop.org/show_bug.cgi?id=111763

--- Comment #23 from wychuchol <wychuchol7777@gmail.com> ---
(In reply to wychuchol from comment #19)
> After some time in Witcher 3 GOTY run with Lutris PC restarts on its own. I
> thought something is overheating (I've noticed graphic card memory in
> PSensor sometimes reaching 90 so I thought maybe that's what's happening)
> but I investigated kern.log and this always happened before that autonomous
> reset:
> 
> Nov  2 22:01:53 pop-os kernel: [  979.244964] pcieport 0000:00:01.1: AER:
> Corrected error received: 0000:01:00.0
> Nov  2 22:01:53 pop-os kernel: [  979.244967] nvme 0000:01:00.0: AER: PCIe
> Bus Error: severity=Corrected, type=Data Link Layer, (Transmitter ID)
> Nov  2 22:01:53 pop-os kernel: [  979.244968] nvme 0000:01:00.0: AER:  
> device [1987:5012] error status/mask=00001000/00006000
> Nov  2 22:01:53 pop-os kernel: [  979.244968] nvme 0000:01:00.0: AER:   
> [12] Timeout               
> Nov  2 22:01:53 pop-os kernel: [  979.262629] Emergency Sync complete

The thing with those AER errors is that they can go on and on, and the reset
happens a few minutes after the last logged error.
This might be overheating. I found out how to log sensors readings to a text
file and saw that memory went up to 96 C (and stayed there for about 1m 10s).
Last reading before reset:
amdgpu-pci-2800
Adapter: PCI adapter
vddgfx:       +1.16 V  
fan1:        1551 RPM  (min =    0 RPM, max = 3200 RPM)
edge:         +74.0°C  (crit = +118.0°C, hyst = -273.1°C)
                       (emerg = +99.0°C)
junction:     +88.0°C  (crit = +99.0°C, hyst = -273.1°C)
                       (emerg = +99.0°C)
mem:          +96.0°C  (crit = +99.0°C, hyst = -273.1°C)
                       (emerg = +99.0°C)
power1:      162.00 W  (cap = 195.00 W)

k10temp-pci-00c3
Adapter: PCI adapter
Tdie:         +70.5°C  (high = +70.0°C)
Tctl:         +70.5°C  

Now the weird thing is: if this is in fact overheating, why didn't the fan go
beyond 1600 rpm even once? The highest was about 1581 rpm, and I don't have the
silent BIOS switched on (Sapphire Pulse RX 5700 XT, lever facing away from the
video ports).
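
To make the reading above easier to judge, here is a minimal Python sketch that computes how far each temperature is from its reported crit limit. It assumes the text is plain `sensors` output in the layout pasted above; the name crit_margins is made up for illustration.

```python
import re

# One temperature line that carries a crit limit, e.g.
#   mem:          +96.0°C  (crit = +99.0°C, hyst = -273.1°C)
TEMP_LINE = re.compile(
    r"^(?P<label>\w+):\s+\+?(?P<temp>-?\d+(?:\.\d+)?)°C"
    r"\s+\(crit\s*=\s*\+?(?P<crit>-?\d+(?:\.\d+)?)°C"
)


def crit_margins(sensors_text):
    """Map each label to (reading, crit, crit - reading) in degrees Celsius."""
    margins = {}
    for line in sensors_text.splitlines():
        m = TEMP_LINE.match(line.strip())
        if m:
            temp = float(m.group("temp"))
            crit = float(m.group("crit"))
            margins[m.group("label")] = (temp, crit, crit - temp)
    return margins


sample = """
fan1:        1551 RPM  (min =    0 RPM, max = 3200 RPM)
edge:         +74.0°C  (crit = +118.0°C, hyst = -273.1°C)
                       (emerg = +99.0°C)
junction:     +88.0°C  (crit = +99.0°C, hyst = -273.1°C)
mem:          +96.0°C  (crit = +99.0°C, hyst = -273.1°C)
Tdie:         +70.5°C  (high = +70.0°C)
"""

for label, (temp, crit, margin) in crit_margins(sample).items():
    print(f"{label}: {temp} C, crit {crit} C, margin {margin} C")
```

Lines without a crit value (the fan, power and k10temp lines here) are skipped rather than guessed at.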


* [Bug 111763] ring_gfx hangs/freezes on Navi gpus
  2019-09-22 12:01 [Bug 111763] ring_gfx hangs/freezes on Navi gpus bugzilla-daemon
                   ` (23 preceding siblings ...)
  2019-11-05  6:07 ` bugzilla-daemon
@ 2019-11-05 16:28 ` bugzilla-daemon
  2019-11-09  2:54 ` bugzilla-daemon
                   ` (13 subsequent siblings)
  38 siblings, 0 replies; 40+ messages in thread
From: bugzilla-daemon @ 2019-11-05 16:28 UTC (permalink / raw)
  To: dri-devel


[-- Attachment #1.1: Type: text/plain, Size: 3268 bytes --]

https://bugs.freedesktop.org/show_bug.cgi?id=111763

--- Comment #24 from wychuchol <wychuchol7777@gmail.com> ---
(In reply to wychuchol from comment #23)
> (In reply to wychuchol from comment #19)
> > After some time in Witcher 3 GOTY run with Lutris PC restarts on its own. I
> > thought something is overheating (I've noticed graphic card memory in
> > PSensor sometimes reaching 90 so I thought maybe that's what's happening)
> > but I investigated kern.log and this always happened before that autonomous
> > reset:
> > 
> > Nov  2 22:01:53 pop-os kernel: [  979.244964] pcieport 0000:00:01.1: AER:
> > Corrected error received: 0000:01:00.0
> > Nov  2 22:01:53 pop-os kernel: [  979.244967] nvme 0000:01:00.0: AER: PCIe
> > Bus Error: severity=Corrected, type=Data Link Layer, (Transmitter ID)
> > Nov  2 22:01:53 pop-os kernel: [  979.244968] nvme 0000:01:00.0: AER:  
> > device [1987:5012] error status/mask=00001000/00006000
> > Nov  2 22:01:53 pop-os kernel: [  979.244968] nvme 0000:01:00.0: AER:   
> > [12] Timeout               
> > Nov  2 22:01:53 pop-os kernel: [  979.262629] Emergency Sync complete
> 
> Thing with those AER errors is that they can go on and on and reset happens
> few minutes after the last logged error. 
> This might be overheating, I managed to find how to output sensors readings
> into txt log and found that memory went up to 96 C (or rather it stayed
> there for about 1m 10s)
> Last reading before reset:
> amdgpu-pci-2800
> Adapter: PCI adapter
> vddgfx:       +1.16 V  
> fan1:        1551 RPM  (min =    0 RPM, max = 3200 RPM)
> edge:         +74.0°C  (crit = +118.0°C, hyst = -273.1°C)
>                        (emerg = +99.0°C)
> junction:     +88.0°C  (crit = +99.0°C, hyst = -273.1°C)
>                        (emerg = +99.0°C)
> mem:          +96.0°C  (crit = +99.0°C, hyst = -273.1°C)
>                        (emerg = +99.0°C)
> power1:      162.00 W  (cap = 195.00 W)
> 
> k10temp-pci-00c3
> Adapter: PCI adapter
> Tdie:         +70.5°C  (high = +70.0°C)
> Tctl:         +70.5°C  
> 
> Now the weird thing is - if this is in fact overheating why fan didn't go
> beyond 1600 rpm even once.... Highest was like 1581 rpm and I don't have
> silent bios switched on (sapphire pulse rx 5700 xt, lever facing away from
> video ports).

Okay, I don't think it's overheating anymore. I found a moment in Anomaly 1.5.0
that I can't get past without the system resetting, just before a psi storm in
Army Warehouses (I can provide a savefile).

Last sensors reading before crash (5 second increments):
amdgpu-pci-2800
Adapter: PCI adapter
vddgfx:       +1.01 V  
fan1:        1560 RPM  (min =    0 RPM, max = 3200 RPM)
edge:         +69.0°C  (crit = +118.0°C, hyst = -273.1°C)
                       (emerg = +99.0°C)
junction:     +84.0°C  (crit = +99.0°C, hyst = -273.1°C)
                       (emerg = +99.0°C)
mem:          +80.0°C  (crit = +99.0°C, hyst = -273.1°C)
                       (emerg = +99.0°C)
power1:      227.00 W  (cap = 195.00 W)

k10temp-pci-00c3
Adapter: PCI adapter
Tdie:         +71.8°C  (high = +70.0°C)
Tctl:         +71.8°C
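
One thing in this reading that the temperatures don't show is that power1 is reported above its cap. A small Python sketch, assuming the same `sensors` layout as above (the name power_over_cap is made up for illustration), makes that comparison explicit:

```python
import re

# The power line as printed above, e.g.
#   power1:      227.00 W  (cap = 195.00 W)
POWER_LINE = re.compile(
    r"^power1:\s+(?P<power>\d+(?:\.\d+)?)\s+W\s+"
    r"\(cap\s*=\s*(?P<cap>\d+(?:\.\d+)?)\s+W\)"
)


def power_over_cap(sensors_text):
    """Return (power, cap, power - cap) in watts for the first power1 line."""
    for line in sensors_text.splitlines():
        m = POWER_LINE.match(line.strip())
        if m:
            power = float(m.group("power"))
            cap = float(m.group("cap"))
            return power, cap, power - cap
    return None


print(power_over_cap("power1:      227.00 W  (cap = 195.00 W)"))
print(power_over_cap("power1:      162.00 W  (cap = 195.00 W)"))
```

A positive last value means the reported draw exceeds the cap, as in this reading; the earlier reading stays below it. Whether that has anything to do with the resets is not something the numbers alone can tell.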


* [Bug 111763] ring_gfx hangs/freezes on Navi gpus
  2019-09-22 12:01 [Bug 111763] ring_gfx hangs/freezes on Navi gpus bugzilla-daemon
                   ` (24 preceding siblings ...)
  2019-11-05 16:28 ` bugzilla-daemon
@ 2019-11-09  2:54 ` bugzilla-daemon
  2019-11-09 12:42 ` bugzilla-daemon
                   ` (12 subsequent siblings)
  38 siblings, 0 replies; 40+ messages in thread
From: bugzilla-daemon @ 2019-11-09  2:54 UTC (permalink / raw)
  To: dri-devel


[-- Attachment #1.1: Type: text/plain, Size: 870 bytes --]

https://bugs.freedesktop.org/show_bug.cgi?id=111763

--- Comment #25 from Ben Klein <robobenklein@gmail.com> ---
Created attachment 145918
  --> https://bugs.freedesktop.org/attachment.cgi?id=145918&action=edit
Journal excerpt vega56 ring gfx timeout, then gpu reset

I think I'm having this problem on a Vega 56, I didn't see anyone else mention
that card here.

I attached the relevant log, I think it's this same issue, but someone correct
me if I'm wrong.

OpenGL renderer string: Radeon RX Vega (VEGA10, DRM 3.33.0, 5.3.0-20-generic,
LLVM 9.0.0)
OpenGL core profile version string: 4.5 (Core Profile) Mesa 19.2.1

Running Pop!_OS:
Linux robo-triangulum 5.3.0-20-generic
#21+system76~1572304854~19.10~8caa3e6-Ubuntu SMP Tue Oct 29 00:4 x86_64 x86_64
x86_64 GNU/Linux
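
Since several reports in this thread identify their setup by the OpenGL renderer string, here is a minimal Python sketch that splits such a string into its parts. It assumes the "name (CHIP, DRM x, kernel, LLVM y)" layout seen in the strings quoted here, which other Mesa versions may not follow; the name parse_renderer is made up for illustration.

```python
import re

# Layout assumed from the strings quoted in this thread:
#   Radeon RX Vega (VEGA10, DRM 3.33.0, 5.3.0-20-generic, LLVM 9.0.0)
RENDERER = re.compile(
    r"^(?P<name>.+?)\s+\((?P<chip>[^,]+),\s+DRM\s+(?P<drm>[^,]+),\s+"
    r"(?P<kernel>[^,]+),\s+LLVM\s+(?P<llvm>[^)]+)\)$"
)


def parse_renderer(renderer):
    """Return a dict with name, chip, drm, kernel and llvm, or None."""
    # Collapse whitespace first so strings wrapped by a mail client still match.
    m = RENDERER.match(" ".join(renderer.split()))
    return m.groupdict() if m else None


wrapped = "Radeon RX Vega (VEGA10, DRM 3.33.0,\n5.3.0-20-generic, LLVM 9.0.0)"
print(parse_renderer(wrapped))
```

Strings that don't fit the assumed layout return None instead of a partial result.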


* [Bug 111763] ring_gfx hangs/freezes on Navi gpus
  2019-09-22 12:01 [Bug 111763] ring_gfx hangs/freezes on Navi gpus bugzilla-daemon
                   ` (25 preceding siblings ...)
  2019-11-09  2:54 ` bugzilla-daemon
@ 2019-11-09 12:42 ` bugzilla-daemon
  2019-11-09 20:12 ` bugzilla-daemon
                   ` (11 subsequent siblings)
  38 siblings, 0 replies; 40+ messages in thread
From: bugzilla-daemon @ 2019-11-09 12:42 UTC (permalink / raw)
  To: dri-devel


[-- Attachment #1.1: Type: text/plain, Size: 1071 bytes --]

https://bugs.freedesktop.org/show_bug.cgi?id=111763

--- Comment #26 from Marko Popovic <popovic.marko@protonmail.com> ---
(In reply to Ben Klein from comment #25)
> Created attachment 145918 [details]
> Journal excerpt vega56 ring gfx timeout, then gpu reset
> 
> I think I'm having this problem on a Vega 56, I didn't see anyone else
> mention that card here.
> 
> I attached the relevant log, I think it's this same issue, but someone
> correct me if I'm wrong.
> 
> OpenGL renderer string: Radeon RX Vega (VEGA10, DRM 3.33.0,
> 5.3.0-20-generic, LLVM 9.0.0)
> OpenGL core profile version string: 4.5 (Core Profile) Mesa 19.2.1
> 
> Running Pop!_OS:
> Linux robo-triangulum 5.3.0-20-generic
> #21+system76~1572304854~19.10~8caa3e6-Ubuntu SMP Tue Oct 29 00:4 x86_64
> x86_64 x86_64 GNU/Linux

Could be. There are a few patches in the latest RADV, so try Mesa 20.0 git to
see if it fixes anything for you; apparently the radv hangs on Navi GPUs
stopped with that fix.


* [Bug 111763] ring_gfx hangs/freezes on Navi gpus
  2019-09-22 12:01 [Bug 111763] ring_gfx hangs/freezes on Navi gpus bugzilla-daemon
                   ` (26 preceding siblings ...)
  2019-11-09 12:42 ` bugzilla-daemon
@ 2019-11-09 20:12 ` bugzilla-daemon
  2019-11-10 12:20 ` bugzilla-daemon
                   ` (10 subsequent siblings)
  38 siblings, 0 replies; 40+ messages in thread
From: bugzilla-daemon @ 2019-11-09 20:12 UTC (permalink / raw)
  To: dri-devel


[-- Attachment #1.1: Type: text/plain, Size: 965 bytes --]

https://bugs.freedesktop.org/show_bug.cgi?id=111763

James Wood <Chryseus8080@gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |Chryseus8080@gmail.com

--- Comment #27 from James Wood <Chryseus8080@gmail.com> ---
This doesn't seem to be exclusive to Navi GPUs, I've been having instances of
ring gfx timeouts freezing up the system in numerous games such as Project
Zomboid (was recently fixed by the developer) and ArmA 3 with the all too
familiar dmesg:
[drm:amdgpu_dm_atomic_commit_tail [amdgpu]] *ERROR* Waiting for fences timed
out or interrupted!
[drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx timeout, but soft recovered

I'm using:
Radeon RX 590 Series (POLARIS10, DRM 3.33.0, 5.3.8-arch1-1, LLVM 9.0.0)


* [Bug 111763] ring_gfx hangs/freezes on Navi gpus
  2019-09-22 12:01 [Bug 111763] ring_gfx hangs/freezes on Navi gpus bugzilla-daemon
                   ` (27 preceding siblings ...)
  2019-11-09 20:12 ` bugzilla-daemon
@ 2019-11-10 12:20 ` bugzilla-daemon
  2019-11-10 13:50 ` bugzilla-daemon
                   ` (9 subsequent siblings)
  38 siblings, 0 replies; 40+ messages in thread
From: bugzilla-daemon @ 2019-11-10 12:20 UTC (permalink / raw)
  To: dri-devel


[-- Attachment #1.1: Type: text/plain, Size: 439 bytes --]

https://bugs.freedesktop.org/show_bug.cgi?id=111763

--- Comment #28 from Marko Popovic <popovic.marko@protonmail.com> ---
I think this bug report can be closed now. Mesa 20 git basically fixes the
radv-related ring_gfx hangs. There is still a hang that happens in the Citra
emulator (ngg related), but AMD developers are aware of it, so it will probably
get fixed too.


* [Bug 111763] ring_gfx hangs/freezes on Navi gpus
  2019-09-22 12:01 [Bug 111763] ring_gfx hangs/freezes on Navi gpus bugzilla-daemon
                   ` (28 preceding siblings ...)
  2019-11-10 12:20 ` bugzilla-daemon
@ 2019-11-10 13:50 ` bugzilla-daemon
  2019-11-10 13:51 ` bugzilla-daemon
                   ` (8 subsequent siblings)
  38 siblings, 0 replies; 40+ messages in thread
From: bugzilla-daemon @ 2019-11-10 13:50 UTC (permalink / raw)
  To: dri-devel


[-- Attachment #1.1: Type: text/plain, Size: 555 bytes --]

https://bugs.freedesktop.org/show_bug.cgi?id=111763

--- Comment #29 from Daniel Suarez <danielsuarez369@protonmail.com> ---
(In reply to Marko Popovic from comment #28)
> I think this bug report can be closed now, Mesa 20 git basically fixes radv
> related ring_gfx hangs, there is still hang that happens in Citra emulator
> (ngg related) but AMD developers are aware of it so will probably get fixed
> too.

Yeah.. "soon". Still waiting for them to fix bug 111481


* [Bug 111763] ring_gfx hangs/freezes on Navi gpus
  2019-09-22 12:01 [Bug 111763] ring_gfx hangs/freezes on Navi gpus bugzilla-daemon
                   ` (29 preceding siblings ...)
  2019-11-10 13:50 ` bugzilla-daemon
@ 2019-11-10 13:51 ` bugzilla-daemon
  2019-11-10 13:53 ` bugzilla-daemon
                   ` (7 subsequent siblings)
  38 siblings, 0 replies; 40+ messages in thread
From: bugzilla-daemon @ 2019-11-10 13:51 UTC (permalink / raw)
  To: dri-devel


[-- Attachment #1.1: Type: text/plain, Size: 716 bytes --]

https://bugs.freedesktop.org/show_bug.cgi?id=111763

--- Comment #30 from Marko Popovic <popovic.marko@protonmail.com> ---
(In reply to Daniel Suarez from comment #29)
> (In reply to Marko Popovic from comment #28)
> > I think this bug report can be closed now, Mesa 20 git basically fixes radv
> > related ring_gfx hangs, there is still hang that happens in Citra emulator
> > (ngg related) but AMD developers are aware of it so will probably get fixed
> > too.
> 
> Yeah.. "soon". Still waiting for them to fix bug 111481

SDMA hangs have nothing to do with ring_gfx hangs which were mostly radv
related and are fixed now


* [Bug 111763] ring_gfx hangs/freezes on Navi gpus
  2019-09-22 12:01 [Bug 111763] ring_gfx hangs/freezes on Navi gpus bugzilla-daemon
                   ` (30 preceding siblings ...)
  2019-11-10 13:51 ` bugzilla-daemon
@ 2019-11-10 13:53 ` bugzilla-daemon
  2019-11-10 13:55 ` bugzilla-daemon
                   ` (6 subsequent siblings)
  38 siblings, 0 replies; 40+ messages in thread
From: bugzilla-daemon @ 2019-11-10 13:53 UTC (permalink / raw)
  To: dri-devel


[-- Attachment #1.1: Type: text/plain, Size: 950 bytes --]

https://bugs.freedesktop.org/show_bug.cgi?id=111763

--- Comment #31 from Daniel Suarez <danielsuarez369@protonmail.com> ---
(In reply to Marko Popovic from comment #30)
> (In reply to Daniel Suarez from comment #29)
> > (In reply to Marko Popovic from comment #28)
> > > I think this bug report can be closed now, Mesa 20 git basically fixes radv
> > > related ring_gfx hangs, there is still hang that happens in Citra emulator
> > > (ngg related) but AMD developers are aware of it so will probably get fixed
> > > too.
> > 
> > Yeah.. "soon". Still waiting for them to fix bug 111481
> 
> SDMA hangs have nothing to do with ring_gfx hangs which were mostly radv
> related and are fixed now

Still, I can't even play Vulkan titles reliably because the system constantly
hangs even with the workarounds in the bug report. AMD really needs to fix
them.


* [Bug 111763] ring_gfx hangs/freezes on Navi gpus
  2019-09-22 12:01 [Bug 111763] ring_gfx hangs/freezes on Navi gpus bugzilla-daemon
                   ` (31 preceding siblings ...)
  2019-11-10 13:53 ` bugzilla-daemon
@ 2019-11-10 13:55 ` bugzilla-daemon
  2019-11-10 13:58 ` bugzilla-daemon
                   ` (5 subsequent siblings)
  38 siblings, 0 replies; 40+ messages in thread
From: bugzilla-daemon @ 2019-11-10 13:55 UTC (permalink / raw)
  To: dri-devel


[-- Attachment #1.1: Type: text/plain, Size: 1161 bytes --]

https://bugs.freedesktop.org/show_bug.cgi?id=111763

--- Comment #32 from Marko Popovic <popovic.marko@protonmail.com> ---
(In reply to Daniel Suarez from comment #31)
> (In reply to Marko Popovic from comment #30)
> > (In reply to Daniel Suarez from comment #29)
> > > (In reply to Marko Popovic from comment #28)
> > > > I think this bug report can be closed now, Mesa 20 git basically fixes radv
> > > > related ring_gfx hangs, there is still hang that happens in Citra emulator
> > > > (ngg related) but AMD developers are aware of it so will probably get fixed
> > > > too.
> > > 
> > > Yeah.. "soon". Still waiting for them to fix bug 111481
> > 
> > SDMA hangs have nothing to do with ring_gfx hangs which were mostly radv
> > related and are fixed now
> 
> Still, I can't even play Vulkan titles reliably because the system
> constantly hangs even with the workarounds in the bug report. AMD really
> needs to fix them.

Mesa 20.0 should fix the Vulkan hangs for you, and with AMD_DEBUG=nodma SDMA is
fully disabled, so you can't get any SDMA-related hangs.


* [Bug 111763] ring_gfx hangs/freezes on Navi gpus
  2019-09-22 12:01 [Bug 111763] ring_gfx hangs/freezes on Navi gpus bugzilla-daemon
                   ` (32 preceding siblings ...)
  2019-11-10 13:55 ` bugzilla-daemon
@ 2019-11-10 13:58 ` bugzilla-daemon
  2019-11-10 14:00 ` bugzilla-daemon
                   ` (4 subsequent siblings)
  38 siblings, 0 replies; 40+ messages in thread
From: bugzilla-daemon @ 2019-11-10 13:58 UTC (permalink / raw)
  To: dri-devel



https://bugs.freedesktop.org/show_bug.cgi?id=111763

--- Comment #33 from Daniel Suarez <danielsuarez369@protonmail.com> ---
(In reply to Marko Popovic from comment #32)
> (In reply to Daniel Suarez from comment #31)
> > (In reply to Marko Popovic from comment #30)
> > > (In reply to Daniel Suarez from comment #29)
> > > > (In reply to Marko Popovic from comment #28)
> > > > > I think this bug report can be closed now, Mesa 20 git basically fixes radv
> > > > > related ring_gfx hangs, there is still hang that happens in Citra emulator
> > > > > (ngg related) but AMD developers are aware of it so will probably get fixed
> > > > > too.
> > > > 
> > > > Yeah.. "soon". Still waiting for them to fix bug 111481
> > > 
> > > SDMA hangs have nothing to do with ring_gfx hangs which were mostly radv
> > > related and are fixed now
> > 
> > Still, I can't even play Vulkan titles reliably because the system
> > constantly hangs even with the workarounds in the bug report. AMD really
> > needs to fix them.
> 
> Mesa 20.0 should fix Vulkan hangs for you, and with nodma SDMA is disabled
> fully so you can't get any hangs that are SDMA related.

That workaround delays the hangs at best, and I have gotten hangs from OpenGL
games and also when using AMDVLK.

Don't get me wrong, I'm not saying this bug report shouldn't be closed. I'm just
saying that your saying "soon" is very misleading. AMD still hasn't properly
fixed bugs that lead to hangs by just watching Firefox, and it's been MONTHS.
"Soon" for them is months, apparently.


* [Bug 111763] ring_gfx hangs/freezes on Navi gpus
  2019-09-22 12:01 [Bug 111763] ring_gfx hangs/freezes on Navi gpus bugzilla-daemon
                   ` (33 preceding siblings ...)
  2019-11-10 13:58 ` bugzilla-daemon
@ 2019-11-10 14:00 ` bugzilla-daemon
  2019-11-10 14:04 ` bugzilla-daemon
                   ` (3 subsequent siblings)
  38 siblings, 0 replies; 40+ messages in thread
From: bugzilla-daemon @ 2019-11-10 14:00 UTC (permalink / raw)
  To: dri-devel



https://bugs.freedesktop.org/show_bug.cgi?id=111763

--- Comment #34 from Marko Popovic <popovic.marko@protonmail.com> ---
(In reply to Daniel Suarez from comment #33)
> (In reply to Marko Popovic from comment #32)
> > (In reply to Daniel Suarez from comment #31)
> > > (In reply to Marko Popovic from comment #30)
> > > > (In reply to Daniel Suarez from comment #29)
> > > > > (In reply to Marko Popovic from comment #28)
> > > > > > I think this bug report can be closed now, Mesa 20 git basically fixes radv
> > > > > > related ring_gfx hangs, there is still hang that happens in Citra emulator
> > > > > > (ngg related) but AMD developers are aware of it so will probably get fixed
> > > > > > too.
> > > > > 
> > > > > Yeah.. "soon". Still waiting for them to fix bug 111481
> > > > 
> > > > SDMA hangs have nothing to do with ring_gfx hangs which were mostly radv
> > > > related and are fixed now
> > > 
> > > Still, I can't even play Vulkan titles reliably because the system
> > > constantly hangs even with the workarounds in the bug report. AMD really
> > > needs to fix them.
> > 
> > Mesa 20.0 should fix Vulkan hangs for you, and with nodma SDMA is disabled
> > fully so you can't get any hangs that are SDMA related.
> 
> That workaround delays the hangs at best, and I have gotten hangs from
> OpenGL games and also when using AMDVLK.
> 
> Don't get me wrong, I'm not saying this bug report shouldn't be closed. I'm
> just saying that your saying "soon" is very misleading. AMD still hasn't
> properly fixed bugs that lead to hangs by just watching Firefox, and it's
> been MONTHS. "Soon" for them is months, apparently.

And where exactly did I say soon?


* [Bug 111763] ring_gfx hangs/freezes on Navi gpus
  2019-09-22 12:01 [Bug 111763] ring_gfx hangs/freezes on Navi gpus bugzilla-daemon
                   ` (34 preceding siblings ...)
  2019-11-10 14:00 ` bugzilla-daemon
@ 2019-11-10 14:04 ` bugzilla-daemon
  2019-11-12 23:15 ` bugzilla-daemon
                   ` (2 subsequent siblings)
  38 siblings, 0 replies; 40+ messages in thread
From: bugzilla-daemon @ 2019-11-10 14:04 UTC (permalink / raw)
  To: dri-devel



https://bugs.freedesktop.org/show_bug.cgi?id=111763

--- Comment #35 from Daniel Suarez <danielsuarez369@protonmail.com> ---
(In reply to Marko Popovic from comment #34)
> (In reply to Daniel Suarez from comment #33)
> > (In reply to Marko Popovic from comment #32)
> > > (In reply to Daniel Suarez from comment #31)
> > > > (In reply to Marko Popovic from comment #30)
> > > > > (In reply to Daniel Suarez from comment #29)
> > > > > > (In reply to Marko Popovic from comment #28)
> > > > > > > I think this bug report can be closed now, Mesa 20 git basically fixes radv
> > > > > > > related ring_gfx hangs, there is still hang that happens in Citra emulator
> > > > > > > (ngg related) but AMD developers are aware of it so will probably get fixed
> > > > > > > too.
> > > > > > 
> > > > > > Yeah.. "soon". Still waiting for them to fix bug 111481
> > > > > 
> > > > > SDMA hangs have nothing to do with ring_gfx hangs which were mostly radv
> > > > > related and are fixed now
> > > > 
> > > > Still, I can't even play Vulkan titles reliably because the system
> > > > constantly hangs even with the workarounds in the bug report. AMD really
> > > > needs to fix them.
> > > 
> > > Mesa 20.0 should fix Vulkan hangs for you, and with nodma SDMA is disabled
> > > fully so you can't get any hangs that are SDMA related.
> > 
> > That workaround delays the hangs at best, and I have gotten hangs from
> > OpenGL games and also when using AMDVLK.
> > 
> > Don't get me wrong, I'm not saying this bug report shouldn't be closed. I'm
> > just saying that your saying "soon" is very misleading. AMD still hasn't
> > properly fixed bugs that lead to hangs by just watching Firefox, and it's
> > been MONTHS. "Soon" for them is months, apparently.
> 
> And where exactly did I say soon?

My bad, I read "soon" instead of "too". Apologies.


* [Bug 111763] ring_gfx hangs/freezes on Navi gpus
  2019-09-22 12:01 [Bug 111763] ring_gfx hangs/freezes on Navi gpus bugzilla-daemon
                   ` (35 preceding siblings ...)
  2019-11-10 14:04 ` bugzilla-daemon
@ 2019-11-12 23:15 ` bugzilla-daemon
  2019-11-13  0:04 ` bugzilla-daemon
  2019-11-19  9:52 ` bugzilla-daemon
  38 siblings, 0 replies; 40+ messages in thread
From: bugzilla-daemon @ 2019-11-12 23:15 UTC (permalink / raw)
  To: dri-devel



https://bugs.freedesktop.org/show_bug.cgi?id=111763

--- Comment #36 from John H <hamz_23@hotmail.com> ---
Also, for people who have a 5700XT card, check whether yours has dual BIOSes.

Typically one is for running at normal clock speeds, and the other is for
running overclocked values.

My card, the Powercolor Red Devil 5700XT, is an example of such a card. In OC
mode I have had all sorts of random freezes and crashes in both Windows AND
Linux.

Since switching to the default clocks, sometimes called Silent mode, I haven't
had a single problem. This is just a heads up for users who have Navi10
based cards with a selectable BIOS.


* [Bug 111763] ring_gfx hangs/freezes on Navi gpus
  2019-09-22 12:01 [Bug 111763] ring_gfx hangs/freezes on Navi gpus bugzilla-daemon
                   ` (36 preceding siblings ...)
  2019-11-12 23:15 ` bugzilla-daemon
@ 2019-11-13  0:04 ` bugzilla-daemon
  2019-11-19  9:52 ` bugzilla-daemon
  38 siblings, 0 replies; 40+ messages in thread
From: bugzilla-daemon @ 2019-11-13  0:04 UTC (permalink / raw)
  To: dri-devel



https://bugs.freedesktop.org/show_bug.cgi?id=111763

--- Comment #37 from Andrew Sheldon <asheldon55@gmail.com> ---
(In reply to Daniel Suarez from comment #33)

> That workaround delays the hangs at best, and I have gotten hangs from
> OpenGL games and also when using AMDVLK.
> 

Those hangs shouldn't be SDMA related, however. If you are getting hangs from
specific games, report them on the corresponding bug tracker
(https://gitlab.freedesktop.org/mesa/mesa for OGL and RADV,
https://github.com/GPUOpen-Drivers/AMDVLK/issues for AMDVLK).

I suggest using RADV_PERFTEST=aco with mesa-git for the most stable Vulkan
experience (or try the AMDGPU-PRO Vulkan driver). 

There's also the "divide error" random hang issue, but it shouldn't be related
to SDMA either.
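
For anyone unsure how to apply the environment-variable workarounds mentioned
in this thread (RADV_PERFTEST=aco here, AMD_DEBUG=nodma earlier), below is a
minimal sketch of a launcher that sets them for a single program only. The
helper names build_env and launch, and the WORKAROUND_VARS table, are made up
for illustration and are not part of Mesa or any AMD tool. Whether these
variables have any effect depends on the Mesa version installed, so treat the
values as assumptions taken from the comments above rather than a guaranteed
fix.

```python
import os
import subprocess
import sys

# Variables suggested in this thread; whether they help depends on the
# installed Mesa version (assumption, not verified here).
WORKAROUND_VARS = {
    "RADV_PERFTEST": "aco",  # ask RADV to use the ACO shader compiler
    "AMD_DEBUG": "nodma",    # ask radeonsi not to use SDMA
}


def build_env(base=None, extra=None):
    """Return a copy of the base environment with the extra variables applied."""
    env = dict(os.environ if base is None else base)
    env.update(extra or {})
    return env


def launch(argv, base=None):
    """Run argv with the workaround variables set and return its exit code."""
    completed = subprocess.run(argv, env=build_env(base, WORKAROUND_VARS), check=False)
    return completed.returncode
```

Calling launch(["vkcube"]) would start that program with both variables set
while leaving the rest of the session untouched, which makes it easier to
compare behaviour with and without the workarounds when filing a report.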


* [Bug 111763] ring_gfx hangs/freezes on Navi gpus
  2019-09-22 12:01 [Bug 111763] ring_gfx hangs/freezes on Navi gpus bugzilla-daemon
                   ` (37 preceding siblings ...)
  2019-11-13  0:04 ` bugzilla-daemon
@ 2019-11-19  9:52 ` bugzilla-daemon
  38 siblings, 0 replies; 40+ messages in thread
From: bugzilla-daemon @ 2019-11-19  9:52 UTC (permalink / raw)
  To: dri-devel



https://bugs.freedesktop.org/show_bug.cgi?id=111763

Martin Peres <martin.peres@free.fr> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |MOVED
             Status|NEW                         |RESOLVED

--- Comment #38 from Martin Peres <martin.peres@free.fr> ---
-- GitLab Migration Automatic Message --

This bug has been migrated to freedesktop.org's GitLab instance and has been
closed from further activity.

You can subscribe and participate further through the new bug through this link
to our GitLab instance: https://gitlab.freedesktop.org/drm/amd/issues/914.


end of thread, other threads:[~2019-11-19  9:52 UTC | newest]

Thread overview: 40+ messages
2019-09-22 12:01 [Bug 111763] ring_gfx hangs/freezes on Navi gpus bugzilla-daemon
2019-09-23  2:46 ` bugzilla-daemon
2019-09-23  6:56 ` bugzilla-daemon
2019-09-23  6:57 ` bugzilla-daemon
2019-09-23  7:00 ` bugzilla-daemon
2019-09-30 12:18 ` bugzilla-daemon
2019-09-30 15:10 ` bugzilla-daemon
2019-09-30 21:55 ` bugzilla-daemon
2019-09-30 22:02 ` bugzilla-daemon
2019-10-03 12:26 ` bugzilla-daemon
2019-10-11 13:37 ` bugzilla-daemon
2019-10-11 13:57 ` bugzilla-daemon
2019-10-15 12:58 ` bugzilla-daemon
2019-10-15 17:10 ` bugzilla-daemon
2019-10-23  7:26 ` bugzilla-daemon
2019-10-31 12:09 ` bugzilla-daemon
2019-10-31 12:11 ` bugzilla-daemon
2019-11-01  1:23 ` bugzilla-daemon
2019-11-01 16:26 ` bugzilla-daemon
2019-11-02 12:35 ` bugzilla-daemon
2019-11-02 23:11 ` bugzilla-daemon
2019-11-04 16:08 ` bugzilla-daemon
2019-11-04 16:10 ` bugzilla-daemon
2019-11-04 22:13 ` bugzilla-daemon
2019-11-05  6:07 ` bugzilla-daemon
2019-11-05 16:28 ` bugzilla-daemon
2019-11-09  2:54 ` bugzilla-daemon
2019-11-09 12:42 ` bugzilla-daemon
2019-11-09 20:12 ` bugzilla-daemon
2019-11-10 12:20 ` bugzilla-daemon
2019-11-10 13:50 ` bugzilla-daemon
2019-11-10 13:51 ` bugzilla-daemon
2019-11-10 13:53 ` bugzilla-daemon
2019-11-10 13:55 ` bugzilla-daemon
2019-11-10 13:58 ` bugzilla-daemon
2019-11-10 14:00 ` bugzilla-daemon
2019-11-10 14:04 ` bugzilla-daemon
2019-11-12 23:15 ` bugzilla-daemon
2019-11-13  0:04 ` bugzilla-daemon
2019-11-19  9:52 ` bugzilla-daemon
