* [Bug 107154] [drm] GPU recovery disabled.
@ 2018-07-08 9:24 bugzilla-daemon
2018-07-08 19:00 ` bugzilla-daemon
` (10 more replies)
0 siblings, 11 replies; 12+ messages in thread
From: bugzilla-daemon @ 2018-07-08 9:24 UTC (permalink / raw)
To: dri-devel
[-- Attachment #1.1: Type: text/plain, Size: 2323 bytes --]
https://bugs.freedesktop.org/show_bug.cgi?id=107154
Bug ID: 107154
Summary: [drm] GPU recovery disabled.
Product: DRI
Version: unspecified
Hardware: x86-64 (AMD64)
OS: Linux (All)
Status: NEW
Severity: normal
Priority: medium
Component: DRM/AMDgpu
Assignee: dri-devel@lists.freedesktop.org
Reporter: freedesktop.org@nentwig.biz
Hi!
This is a surprisingly long-standing problem with an RX 460; more precisely, it
has been present since 4.15, all the way up to 4.18 AMD staging DRM next [1].
After resuming from sleep (echo -n mem > /sys/power/state) amdgpu is dead
(always, reliably).
Here's what dmesg has to say about it:
[Sun Jul 8 11:01:17 2018] PM: suspend exit
[Sun Jul 8 11:01:19 2018] [drm:gfx_v8_0_ring_test_ib [amdgpu]] *ERROR* amdgpu: IB test timed out.
[Sun Jul 8 11:01:19 2018] [drm:amdgpu_ib_ring_tests [amdgpu]] *ERROR* amdgpu: failed testing IB on GFX ring (-110).
[Sun Jul 8 11:01:19 2018] [drm:process_one_work] *ERROR* ib ring test failed (-110).
[Sun Jul 8 11:01:28 2018] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx timeout, last signaled seq=864, last emitted seq=868
[Sun Jul 8 11:01:28 2018] [drm] GPU recovery disabled.
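For anyone comparing logs, the interesting numbers in these lines are the
error code and the gap between the fence sequence numbers. Below is a minimal
Python sketch, assuming dmesg text in the format shown above; the function
name summarize_resume_errors and the returned dictionary are made up for
illustration and are not part of any tool. -110 is -ETIMEDOUT on Linux, and
"last emitted" minus "last signaled" is the number of fences still outstanding
on the ring when the timeout fired.

```python
import re

# Linux errno values seen in the log above (110 is ETIMEDOUT on Linux).
LINUX_ERRNO = {110: "ETIMEDOUT"}


def summarize_resume_errors(dmesg_text):
    """Extract error codes and the fence gap from amdgpu lines after resume."""
    # Mail clients wrap long log lines, so collapse all whitespace first.
    flat = " ".join(dmesg_text.split())
    # Only look at what was logged after the system came back from suspend.
    _, found, after = flat.partition("PM: suspend exit")
    if not found:
        after = flat
    # Negative numbers in parentheses are kernel error codes, e.g. "(-110)".
    codes = sorted({int(c) for c in re.findall(r"\((-\d+)\)", after)})
    m = re.search(r"last signaled seq=(\d+), last emitted seq=(\d+)", after)
    pending = int(m.group(2)) - int(m.group(1)) if m else None
    return {
        "errors": [LINUX_ERRNO.get(-c, str(c)) for c in codes],
        "pending_fences": pending,
        "recovery_disabled": "GPU recovery disabled" in after,
    }
```

Fed the log above, this reports ETIMEDOUT, 4 pending fences (868 - 864), and
that recovery was disabled.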
From earlier versions:
[ 42.802559] PM: suspend exit
[ 42.824332] amdgpu 0000:41:00.0: GPU fault detected: 147 0x0bd84802
[ 42.824338] amdgpu 0000:41:00.0: VM_CONTEXT1_PROTECTION_FAULT_ADDR 0x0034F97B
[ 42.824341] amdgpu 0000:41:00.0: VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x0C048002
[ 42.824345] amdgpu 0000:41:00.0: VM fault (0x02, vmid 6) at page 3471739, read from 'TC0' (0x54433000) (72)
[ 52.956306] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx timeout, last signaled seq=1287, last emitted seq=1289
[ 52.956316] [drm] IP block:gfx_v8_0 is hung!
[ 52.956362] [drm] GPU recovery disabled.
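The fault lines can be cross-checked by hand. The following Python sketch
splits the status register into fields; the bit positions are an assumption
based on the gfx8 register masks, and the helper names decode_vm_fault and
client_tag are hypothetical. With that layout, 0x0C048002 decodes to what the
kernel printed (protections 0x02, vmid 6, a read, client id 72), 0x0034F97B is
page 3471739, and 0x54433000 is the ASCII tag 'TC0'.

```python
def decode_vm_fault(status, addr):
    """Split VM_CONTEXT1_PROTECTION_FAULT_STATUS/ADDR into readable fields.

    The bit positions below are an assumption (gfx8 register layout); they
    reproduce the values printed in the log above.
    """
    return {
        "protections": status & 0xFF,         # bits 0-7
        "client_id": (status >> 12) & 0xFF,   # bits 12-19
        "write": bool((status >> 24) & 0x1),  # bit 24: 0 = read, 1 = write
        "vmid": (status >> 25) & 0xF,         # bits 25-28
        "page": addr,                         # fault address in GPU pages
        "byte_address": addr << 12,           # assuming 4 KiB GPU pages
    }


def client_tag(raw):
    """Turn the packed ASCII client name (e.g. 0x54433000) into text."""
    return raw.to_bytes(4, "big").rstrip(b"\x00").decode("ascii")
```

Comparing the decoded fields between two faulting runs makes it easier to see
whether the same client and VMID are involved each time.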
I've also seen fault 146 but other than that it mostly looks the same. 4.14-lts
(with dc=0) works fine.
RX 460, Zenith Extreme, 1950x.
[1] Arch Linux AUR; the versioning is a bit confusing, it may actually already
be the 4.19 branch. The latest commit is 3838e387fd1eb17bfcf6ff7d443d931adb5cb41b.
--
You are receiving this mail because:
You are the assignee for the bug.
[-- Attachment #1.2: Type: text/html, Size: 3602 bytes --]
[-- Attachment #2: Type: text/plain, Size: 160 bytes --]
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel
^ permalink raw reply [flat|nested] 12+ messages in thread
* [Bug 107154] [drm] GPU recovery disabled.
From: bugzilla-daemon @ 2018-07-08 19:00 UTC (permalink / raw)
To: dri-devel
https://bugs.freedesktop.org/show_bug.cgi?id=107154
--- Comment #1 from dwagner <jb5sgc1n.nya@20mm.eu> ---
Indeed, crashes upon S3 resumes have been abundant with amdgpu.dc=1 for many
months now, and seemingly for more than one reason.
One bug I reported in August 2017 with
https://bugs.freedesktop.org/show_bug.cgi?id=102323 - that one was fixed
quickly.
The next S3 resume crash I reported in October 2017 in
https://bugs.freedesktop.org/show_bug.cgi?id=103277, that one stayed without
any resolution until April 2018, and the fix found in that report only works if
no "drm.edid_firmware=..." kernel command line option is used.
Another crash bug with S3 resumes I reported for 4.17.2 kernels in
https://bugs.freedesktop.org/show_bug.cgi?id=107065 - then realized that 4.18
pre-releases exhibit the very same kind of crash immediately upon starting X11.
For this crash upon X11 startup, there is a patch in the bug report, but it
does not prevent the S3 resume crash.
I currently work around S3 resume crashes by switching to the console display
before entering S3 sleep - but this is really an awkward work-around.
* [Bug 107154] [drm] GPU recovery disabled.
From: bugzilla-daemon @ 2018-07-08 20:03 UTC (permalink / raw)
To: dri-devel
https://bugs.freedesktop.org/show_bug.cgi?id=107154
--- Comment #2 from freedesktop.org@nentwig.biz ---
(In reply to dwagner from comment #1)
> I currently work around S3 resume crashes by switching to the console
> display before entering S3 sleep - but this is really an awkward work-around.
Oh, that doesn't help either. It crashes the very moment I switch back to X.
And what's more, starting with 4.15, amdgpu.dc=0 doesn't appear to make any
difference.
* [Bug 107154] [drm] GPU recovery disabled.
From: bugzilla-daemon @ 2018-07-09 8:53 UTC (permalink / raw)
To: dri-devel
https://bugs.freedesktop.org/show_bug.cgi?id=107154
--- Comment #3 from Michel Dänzer <michel@daenzer.net> ---
Please attach the full dmesg output.
Can you bisect between 4.14 and 4.15?
* [Bug 107154] [drm] GPU recovery disabled.
From: bugzilla-daemon @ 2018-07-09 11:31 UTC (permalink / raw)
To: dri-devel
https://bugs.freedesktop.org/show_bug.cgi?id=107154
--- Comment #4 from Christian König <ckoenig.leichtzumerken@gmail.com> ---
Do you have a full dmesg?
* [Bug 107154] [drm] GPU recovery disabled.
From: bugzilla-daemon @ 2018-07-09 16:03 UTC (permalink / raw)
To: dri-devel
https://bugs.freedesktop.org/show_bug.cgi?id=107154
--- Comment #5 from freedesktop.org@nentwig.biz ---
Created attachment 140525
--> https://bugs.freedesktop.org/attachment.cgi?id=140525&action=edit
dmesg amdgpu.dc=1
Booted with amdgpu.dc=1.
* [Bug 107154] [drm] GPU recovery disabled.
From: bugzilla-daemon @ 2018-07-09 16:04 UTC (permalink / raw)
To: dri-devel
https://bugs.freedesktop.org/show_bug.cgi?id=107154
--- Comment #6 from freedesktop.org@nentwig.biz ---
Created attachment 140526
--> https://bugs.freedesktop.org/attachment.cgi?id=140526&action=edit
dmesg /etc/modprobe.d/
Booted with amdgpu.dc=1 in /etc/modprobe.d/
* [Bug 107154] [drm] GPU recovery disabled.
From: bugzilla-daemon @ 2018-07-09 16:13 UTC (permalink / raw)
To: dri-devel
https://bugs.freedesktop.org/show_bug.cgi?id=107154
--- Comment #7 from freedesktop.org@nentwig.biz ---
Sure, attached. AMD staging kernel. I don't know how to tell whether DC=1 is
really enabled, so I did two runs: one with amdgpu.dc=1 as boot parameter and
one with /etc/modprobe.d/ on top of that.
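A possible way to check this: a loaded module normally exposes its readable
parameters under /sys/module/<name>/parameters/, so the value amdgpu was
loaded with should appear in /sys/module/amdgpu/parameters/dc. The Python
sketch below assumes that file is readable on the running kernel; the function
name amdgpu_dc_setting and its arguments are hypothetical and exist only so
the paths can be overridden.

```python
from pathlib import Path


def amdgpu_dc_setting(sysfs_root="/sys", cmdline_path="/proc/cmdline"):
    """Report the dc value the loaded module sees and what the command line asked for."""
    param = Path(sysfs_root) / "module" / "amdgpu" / "parameters" / "dc"
    # None means the module is not loaded or the parameter is not exposed.
    module_value = int(param.read_text().strip()) if param.exists() else None

    requested = None
    cmdline = Path(cmdline_path)
    if cmdline.exists():
        # The kernel command line is a whitespace-separated list of options.
        for word in cmdline.read_text().split():
            if word.startswith("amdgpu.dc="):
                requested = int(word.split("=", 1)[1])
    return {"module": module_value, "cmdline": requested}
```

As far as I know a value of -1 means the driver decides automatically, so this
shows what was requested rather than proving that DC is actually driving the
display.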
Procedure was the same both times:
- boot
- X login
- switch to console
- sleep, wakeup
- switch to X
The drm/amdgpu lines appear already in the console right after waking up, prior
to switching to X.
This time "only" X crashed (could still move the pointer); at times the
complete machine is dead, no switching to console and no SSH.
(As a side note: is it normal that waking up on Ryzen takes something on the
order of 10-30s? I'm used to split-second wakeups on Intel.)
HTH
* [Bug 107154] [drm] GPU recovery disabled.
From: bugzilla-daemon @ 2018-07-09 16:29 UTC (permalink / raw)
To: dri-devel
https://bugs.freedesktop.org/show_bug.cgi?id=107154
--- Comment #8 from freedesktop.org@nentwig.biz ---
Created attachment 140528
--> https://bugs.freedesktop.org/attachment.cgi?id=140528&action=edit
dmesg 4.14 LTS
Sorry, forgot about the requested 4.14 dmesg log. Attached as well.
This is: boot, login (to KDE this time), do stuff, remember, sleep, wakeup.
* [Bug 107154] [drm] GPU recovery disabled.
From: bugzilla-daemon @ 2018-07-10 7:04 UTC (permalink / raw)
To: dri-devel
https://bugs.freedesktop.org/show_bug.cgi?id=107154
Christian König <ckoenig.leichtzumerken@gmail.com> changed:
What       |Removed |Added
------------------------------
Resolution |---     |FIXED
Status     |NEW     |RESOLVED
--- Comment #9 from Christian König <ckoenig.leichtzumerken@gmail.com> ---
Yeah, that is a known problem in the PCI subsystem. Will be fixed with 4.19 and
then backported to older kernels.
* [Bug 107154] [drm] GPU recovery disabled.
From: bugzilla-daemon @ 2018-09-02 10:26 UTC (permalink / raw)
To: dri-devel
https://bugs.freedesktop.org/show_bug.cgi?id=107154
--- Comment #10 from freedesktop.org@nentwig.biz ---
So, there's 4.19rc1-amd-next \o/
echo: write error: Device or resource busy
This started to happen with 4.18. dmesg:
[ 171.245467] Freezing of tasks failed after 20.006 seconds (1 tasks refusing to freeze, wq_busy=0):
[ 171.245484] systemd-udevd D 0 700 615 0x80000124
So, is this something to report to fricking systemd?
Gee, really...?!
* [Bug 107154] [drm] GPU recovery disabled.
From: bugzilla-daemon @ 2018-09-11 13:38 UTC (permalink / raw)
To: dri-devel
https://bugs.freedesktop.org/show_bug.cgi?id=107154
--- Comment #11 from kyle.devir@mykolab.com ---
> systemd-udevd
This is not systemd's fault, but indicative of something hanging in kernel
land, which udevd ends up being blocked on.
I experienced this a few major kernel releases ago; it was resolved by the
next major version. I never did figure out what caused udevd to block... :/
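To see which tasks are stuck like this before attempting a suspend, one can
scan /proc for processes in state D (uninterruptible sleep), the state shown
for systemd-udevd in the dmesg line quoted earlier in this thread. A minimal
Python sketch; the function name uninterruptible_tasks and the proc_root
parameter are hypothetical, the latter only there so the scan can be pointed
at a different directory.

```python
import os


def uninterruptible_tasks(proc_root="/proc"):
    """Return (pid, comm) for every task currently in state D."""
    blocked = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # only numeric directories are processes
        try:
            with open(os.path.join(proc_root, entry, "stat")) as f:
                stat = f.read()
        except OSError:
            continue  # the process exited while we were scanning
        # comm is wrapped in parentheses and may itself contain spaces or ')',
        # so split on the last ')' rather than on whitespace.
        head, _, tail = stat.rpartition(")")
        comm = head.split("(", 1)[1]
        state = tail.split()[0]
        if state == "D":
            blocked.append((int(entry), comm))
    return sorted(blocked)
```

For a PID reported here, /proc/<pid>/stack (readable by root, if the kernel
provides it) can hint at where in the kernel the task is blocked.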
end of thread, other threads:[~2018-09-11 13:38 UTC | newest]
Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
2018-07-08 9:24 [Bug 107154] [drm] GPU recovery disabled bugzilla-daemon
2018-07-08 19:00 ` bugzilla-daemon
2018-07-08 20:03 ` bugzilla-daemon
2018-07-09 8:53 ` bugzilla-daemon
2018-07-09 11:31 ` bugzilla-daemon
2018-07-09 16:03 ` bugzilla-daemon
2018-07-09 16:04 ` bugzilla-daemon
2018-07-09 16:13 ` bugzilla-daemon
2018-07-09 16:29 ` bugzilla-daemon
2018-07-10 7:04 ` bugzilla-daemon
2018-09-02 10:26 ` bugzilla-daemon
2018-09-11 13:38 ` bugzilla-daemon