From mboxrd@z Thu Jan 1 00:00:00 1970
From: bugzilla-daemon@freedesktop.org
Subject: [Bug 91278] Tonga GPU lock/reset fail with Unigine Valley
Date: Wed, 30 Sep 2015 20:42:20 +0000
Message-ID:
References:
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="===============1573068851=="
Return-path:
Received: from culpepper.freedesktop.org (unknown [131.252.210.165])
 by gabe.freedesktop.org (Postfix) with ESMTP id 070246E8D8 for ;
 Wed, 30 Sep 2015 13:42:20 -0700 (PDT)
In-Reply-To:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: dri-devel-bounces@lists.freedesktop.org
Sender: "dri-devel"
To: dri-devel@lists.freedesktop.org
List-Id: dri-devel@lists.freedesktop.org

--===============1573068851==
Content-Type: multipart/alternative; boundary="1443645739.84DC01.16527"; charset="UTF-8"

--1443645739.84DC01.16527
Date: Wed, 30 Sep 2015 20:42:19 +0000
MIME-Version: 1.0
Content-Type: text/plain; charset="UTF-8"

https://bugs.freedesktop.org/show_bug.cgi?id=91278

--- Comment #20 from Mathias Tillman ---

--1443645739.84DC01.16527
Date: Wed, 30 Sep 2015 20:42:19 +0000
MIME-Version: 1.0
Content-Type: text/html; charset="UTF-8"

Comment #20 on bug 91278 from Mathias Tillman
(In reply to Andy Furniss from comment #19)
> (In reply to Andy Furniss from comment #18)
> > So after last update I ran valley again briefly and saw a few vmfaults, then
> > a longer run and got thousands again.
> > 
> > without touching anything else did echo mem >/sys/power/state and then woke
> > up.
> > 
> > 10 minute run of valley has produced zero faults.
> 
> Further test from power off, nothing else running apart from X/fluxox short
> run of valley no faults. Reran valley for a bit longer and got thousands.
> Did memsleep ran valley no faults but after about 10 minutes it hung.

Do you get those GPU faults in the log even when there's no hang? I haven't
checked dmesg while running valley myself, but I do know they always appear
when a hang has happened (I'm using ssh to grab dmesg while it's hung).

Dmesg is sometimes completely filled with GPU faults; other times it's just a
few. I ran it a few minutes ago and only got this:

[ 1737.984328] amdgpu 0000:01:00.0: GPU fault detected: 146 0x08804804
[ 1737.984338] amdgpu 0000:01:00.0:   VM_CONTEXT1_PROTECTION_FAULT_ADDR   0x00100110
[ 1737.984343] amdgpu 0000:01:00.0:   VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x0A048004
[ 1737.984348] VM fault (0x04, vmid 5) at page 1048848, read from 'TC6' (0x54433600) (72)
[ 1737.984355] amdgpu 0000:01:00.0: GPU fault detected: 146 0x08804004
[ 1737.984359] amdgpu 0000:01:00.0:   VM_CONTEXT1_PROTECTION_FAULT_ADDR   0x00000000
[ 1737.984363] amdgpu 0000:01:00.0:   VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x00000000
[ 1737.984366] VM fault (0x00, vmid 0) at page 0, read from '' (0x00000000) (0)
[ 1737.984374] amdgpu 0000:01:00.0: GPU fault detected: 146 0x08800804
[ 1737.984378] amdgpu 0000:01:00.0:   VM_CONTEXT1_PROTECTION_FAULT_ADDR   0x00000000
[ 1737.984381] amdgpu 0000:01:00.0:   VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x00000000
[ 1737.984384] VM fault (0x00, vmid 0) at page 0, read from '' (0x00000000) (0)
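
For anyone reading the log above: each "VM fault" summary line looks like a decode of the ADDR and STATUS registers printed just before it, plus a four-byte client tag. Here is a minimal Python sketch of that decode. The bit positions in the comments are an assumption (they are not stated anywhere in this thread), but they do reproduce the first fault exactly; `decode_vm_fault` is a hypothetical helper name, not a driver function.

```python
def decode_vm_fault(addr: int, status: int, mc_client: int) -> dict:
    """Decode one fault from the raw register values shown in dmesg.

    Assumed STATUS layout (not confirmed in this thread):
      bits 0-7   protection flags
      bits 12-19 memory client id
      bit  24    0 = read, 1 = write
      bits 25-28 vmid
    """
    # The client tag is four ASCII bytes, most significant byte first,
    # padded with NUL bytes on the right.
    tag = mc_client.to_bytes(4, "big").rstrip(b"\x00")
    return {
        "protections": status & 0xFF,
        "memory_client_id": (status >> 12) & 0xFF,
        "write": bool((status >> 24) & 0x1),
        "vmid": (status >> 25) & 0xF,
        "page": addr,  # dmesg prints this value in decimal
        "client": tag.decode("ascii", errors="replace"),
    }


# First fault from the log: ADDR 0x00100110, STATUS 0x0A048004, tag 0x54433600
first = decode_vm_fault(0x00100110, 0x0A048004, 0x54433600)
print(first)

# Second and third faults: every register reads back as zero
empty = decode_vm_fault(0x00000000, 0x00000000, 0x00000000)
print(empty)
```

With the first fault's values this gives protections 0x04, vmid 5, page 1048848, a read access, client 'TC6' and client id 72, the same as the summary line. The all-zero registers decode to vmid 0, page 0 and an empty client name, which matches the second and third summary lines.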


--1443645739.84DC01.16527--
--===============1573068851==--