From mboxrd@z Thu Jan 1 00:00:00 1970
From: bugzilla-daemon@freedesktop.org
Subject: [Bug 79980] New: Random radeonsi crashes
Date: Fri, 13 Jun 2014 13:06:44 +0000
Message-ID:
Field | Value |
---|---|
Priority | medium |
Bug ID | 79980 |
Assignee | dri-devel@lists.freedesktop.org |
Summary | Random radeonsi crashes |
Severity | normal |
Classification | Unclassified |
OS | All |
Reporter | darkbasic@linuxsystems.it |
Hardware | Other |
Status | NEW |
Version | XOrg CVS |
Component | DRM/Radeon |
Product | DRI |
Created attachment 100978 [details]
dmesg
Kernel 3.15.0-rc8 + PTE patches
What specific app were you using that caused the GPU hang? Also, if this is a regression, can you bisect?
No specific app (not counting KDE desktop effects). If the problem is in the kernel, it's a regression, because I didn't have any problems with -rc5. Unfortunately it's not easy to trigger the crash, so there is no chance to bisect given how busy I currently am.
(In reply to comment #3)
> Does it still happen if you drop the PTE patches?

Is that stop poisoning the GART TLB? Whatever - it could be a separate issue, but I am now getting sort of random crashes on your drm-next-3.16 with my pitcairn. I am stable on deathsimple 3.15 fixes + hdmi patches.
(In reply to comment #5)
> (In reply to comment #3)
> > Does it still happen if you drop the PTE patches?
>
> Is that stop poisoning the GART TLB?

Ok ignore that :-) I didn't spot the rs600
(In reply to comment #6)
> (In reply to comment #5)
> > (In reply to comment #3)
> > > Does it still happen if you drop the PTE patches?
> >
> > Is that stop poisoning the GART TLB?
>
> Ok ignore that :-) I didn't spot the rs600

It applies to all asics from rs600 forward.
(In reply to comment #7)
> (In reply to comment #6)
> > (In reply to comment #5)
> > > (In reply to comment #3)
> > > > Does it still happen if you drop the PTE patches?
> > >
> > > Is that stop poisoning the GART TLB?
> >
> > Ok ignore that :-) I didn't spot the rs600
>
> It applies to all asics from rs600 forward.

Ahh, in the meantime I've now built with

optimize SI VM handling + use lower_32_bits where appropriate reverted - the latter just so I could revert the former.

I'll see if I am stable over the next couple of days like this.
(In reply to comment #8)
> (In reply to comment #7)
> > (In reply to comment #6)
> > > (In reply to comment #5)
> > > > (In reply to comment #3)
> > > > > Does it still happen if you drop the PTE patches?
> > > >
> > > > Is that stop poisoning the GART TLB?
> > >
> > > Ok ignore that :-) I didn't spot the rs600
> >
> > It applies to all asics from rs600 forward.
>
> Ahh, in the meantime I've now built with
>
> optimize SI VM handling + use lower_32_bits where appropriate reverted - the
> latter just so I could revert the former.
>
> I'll see if I am stable over the next couple of days like this.

I am stable so far with the above reverted.
(In reply to comment #9)
> (In reply to comment #8)
> > optimize SI VM handling + use lower_32_bits where appropriate reverted - the
> > latter just so I could revert the former.
> >
> > I'll see if I am stable over the next couple of days like this.
>
> I am stable so far with the above reverted.

Spoke too soon, I just locked. It wasn't quite the same as before, in that the screen stayed on displaying normally rather than off/on + junk.

I wasn't doing anything GPU related (accepting I always am with glamor); I was doing a big compile, so memory pressure I guess.

Also, just to add to the mix: after thinking I was stable yesterday I upgraded gcc and updated llvm and mesa, so they were different in several ways, though I haven't rebuilt the kernel.
Created attachment 101226 [details]
gray screen
This is what I often get. I was simply syncing my portage tree when it
happened.
What | Removed | Added |
---|---|---|
Priority | medium | high |
First of all, excuse my bad English. I have the same problem with my HD 7950: when using hangouts, playing Left 4 Dead 2, or watching a flash video, my screen goes crazy with vertical lines or grey fog. It started when I upgraded to the testing repo (Archlinux) and downloaded the newest linux-firmware package, which includes TAHITI_mc2.bin. I suffered this bug on kernels 3.14 and 3.15. For now, I am using the 3.15.1 kernel and the old Tahiti firmware, and it seems stable.
> Wasn't doing anything GPU related (accepting I always am with glamor), was
> doing a big compile, so memory pressure I guess.
You're right, I was compiling too when it crashed. Nothing GPU related anyway.
(In reply to comment #10)
> (In reply to comment #9)
> > (In reply to comment #8)
> > > optimize SI VM handling + use lower_32_bits where appropriate reverted - the
> > > latter just so I could revert the former.
> > >
> > > I'll see if I am stable over the next couple of days like this.
> >
> > I am stable so far with the above reverted.
>
> Spoke too soon, I just locked. Wasn't quite the same as before in that
> screen stayed on displaying normal rather than off/on + junk.
>
> Wasn't doing anything GPU related (accepting I always am with glamor), was
> doing a big compile, so memory pressure I guess.
>
> Also just add to the mix, after thinking I was stable yesterday I upgraded
> gcc and updated llvm and mesa so they were different in several ways, though
> I haven't rebuilt kernel.

I got another lock last thing. This one was "typical": it happened when closing seamonkey, and this is the third time closing it has locked. Of course it doesn't do it if I try. I must be using gl someway/sometimes, as the last thing I see is the xterm from where it was started, and there is a mesa message about the default setting for s3tc being overridden by env (and that's not by me - I don't have drirc anywhere).

I think this is going to be a pain to find - I just tried reset --hard onto "add large PTE support for NI, SI and CIK v5". That failed to resume from mem on the 1st try, though it wasn't locked, just corrupt (mouse cursor a large block of junk, fluxbox desktop black, but toolbar still visible), so maybe a different issue fixed by a later commit. I could SysRq - the log was normal.
This bug is caused by TAHITI_mc2.bin firmware. The old firmware works good.
(In reply to comment #15)
> This bug is caused by TAHITI_mc2.bin firmware. The old firmware works good.

Well I haven't tried without it, but I have so far failed to reproduce this bug on a slightly older 3.15 drm fixes also using TAHITI_mc2.bin.
(In reply to comment #15)
> This bug is caused by TAHITI_mc2.bin firmware. The old firmware works good.

Did you test a new kernel with the old firmware or an old kernel without the new firmware patch? It could be some other change if you did the latter.
If it's the same problem Marek is seeing it's probably this: 6d2f294 - drm/radeon: use normal BOs for the page tables v4
(In reply to comment #17)
> (In reply to comment #15)
> > This bug is caused by TAHITI_mc2.bin firmware. The old firmware works good.
>
> Did you test a new kernel with the old firmware or an old kernel without the
> new firmware patch? It could be some other change if you did the latter.

3.14 or 3.15 + New firmware = Crashes
3.14 or 3.15 + Old firmware = No problems!
OK forget it. It's not a firmware related problem. I had this bug with old firmware on kernel 3.15.1. I resized a flash video window (vdpau accelerated) and lost my screen.
It happened again. In this case with 3.16-rc2, resizing a firefox window with flash content (vdpau on).
It happened on 3.16-rc1 too while doing a video call with skype.