* [Bug 78221] New: 3.16 RC1: AMD R9 270 GPU locks up on some heavy 2D activity - GPU VM fault occurs. (possibly DMA copying issue strikes back?)
@ 2014-06-18  2:20 bugzilla-daemon
  2014-06-18  2:22 ` [Bug 78221] " bugzilla-daemon
                   ` (23 more replies)
  0 siblings, 24 replies; 25+ messages in thread
From: bugzilla-daemon @ 2014-06-18  2:20 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=78221

            Bug ID: 78221
           Summary: 3.16 RC1: AMD R9 270 GPU locks up on some heavy 2D
                    activity - GPU VM fault occurs. (possibly DMA copying
                    issue strikes back?)
           Product: Drivers
           Version: 2.5
    Kernel Version: 3.16-rc1
          Hardware: x86-64
                OS: Linux
              Tree: Mainline
            Status: NEW
          Severity: high
          Priority: P1
         Component: Video(DRI - non Intel)
          Assignee: drivers_video-dri@kernel-bugs.osdl.org
          Reporter: t3st3r@mail.ru
        Regression: No

Configuration:
 AMD R9 270 with 3.16-rc1 kernel.
 Ubuntu 14.04 + oibaf PPA used as a recent open-source graphics stack (Mesa 10.3
git).
 XFCE used as DE, compositing is off.

To reproduce:
 Intermittent bug. The R9 270 GPU can sometimes lock up under some heavy
2D-based loads.
 A known way to trigger the bug is to install the Battle for Wesnoth game from
the Ubuntu repos and let the game display a map with many units/heavy
animations while running in windowed mode (BfW is an SDL-based, 2D-only game; it
does not make GL calls, etc). At random intervals, usually within about 30
minutes, the GPU can lose stability and lock up.

Special considerations:
 1) GPU recovery often succeeds in this case. Yet it's bad to have 20 seconds of
black screen, and sometimes recovery can fail after multiple GPU crashes.
 2) Kernels prior to 3.15 had a similar issue. Kernel 3.15 (the release version,
but not the -RCs) has been rock solid and has never crashed the GPU to the best
of my knowledge, no matter what I attempted. Now 3.16-rc1 crashes again.
 3) Taking 2) and this version history into account, I suspect it could have
something to do with commit b5be1a839a33634393394e4782edaa37a4bc1a1e or
something around it. Possibly that's what reintroduced lockups in 3.16. Maybe
the underlying causes of the deadlocks were not fixed for the R9 270?
 4) The GPU seems to be stable under the 3D loads I've attempted: it does not
crash even after hours of quite demanding 3D loads like games, etc. Only some
specific 2D loads can cause such issues.

Crash details are looking like this:
===CUT===
Jun 18 03:10:01 localhost kernel: [26125.102351] radeon 0000:01:00.0: ring 0
stalled for more than 10263msec
Jun 18 03:10:01 localhost kernel: [26125.102362] radeon 0000:01:00.0: GPU
lockup (waiting for 0x0000000000445274 last fence id 0x0000000000445273 on ring
0)
Jun 18 03:10:01 localhost kernel: [26125.102370] radeon 0000:01:00.0: failed to
get a new IB (-35)
Jun 18 03:10:01 localhost kernel: [26125.671219] AMD-Vi: Event logged
[IO_PAGE_FAULT device=01:00.0 domain=0x0018 address=0x00000000803dae00
flags=0x0000]
Jun 18 03:10:01 localhost kernel: [26125.671232] AMD-Vi: Event logged [
Jun 18 03:10:01 localhost kernel: [26125.671232] radeon 0000:01:00.0: Saved
23200 dwords of commands on ring 0.
Jun 18 03:10:01 localhost kernel: [26125.671240] IO_PAGE_FAULT device=01:00.0
domain=0x0018 address=0x00000000803dae30 flags=0x0020]
Jun 18 03:10:01 localhost kernel: [26125.671243] AMD-Vi: Event logged
[IO_PAGE_FAULT device=01:00.0 domain=0x0018 address=0x0000000080000100
flags=0x0020]
Jun 18 03:10:01 localhost kernel: [26125.671248] AMD-Vi: Event logged
[IO_PAGE_FAULT device=01:00.0 domain=0x0018 address=0x00000000803dad00
flags=0x0000]
Jun 18 03:10:01 localhost kernel: [26125.671361] radeon 0000:01:00.0: GPU
softreset: 0x0000006C
Jun 18 03:10:01 localhost kernel: [26125.671365] radeon 0000:01:00.0:  
GRBM_STATUS               = 0xA0003028
Jun 18 03:10:01 localhost kernel: [26125.671367] radeon 0000:01:00.0:  
GRBM_STATUS_SE0           = 0x00000006
Jun 18 03:10:01 localhost kernel: [26125.671370] radeon 0000:01:00.0:  
GRBM_STATUS_SE1           = 0x00000006
Jun 18 03:10:01 localhost kernel: [26125.671372] radeon 0000:01:00.0:  
SRBM_STATUS               = 0x200000C0
Jun 18 03:10:01 localhost kernel: [26125.671483] radeon 0000:01:00.0:  
SRBM_STATUS2              = 0x00000000
Jun 18 03:10:01 localhost kernel: [26125.671485] radeon 0000:01:00.0:  
R_008674_CP_STALLED_STAT1 = 0x00000000
Jun 18 03:10:01 localhost kernel: [26125.671487] radeon 0000:01:00.0:  
R_008678_CP_STALLED_STAT2 = 0x00010000
Jun 18 03:10:01 localhost kernel: [26125.671489] radeon 0000:01:00.0:  
R_00867C_CP_BUSY_STAT     = 0x00000002
Jun 18 03:10:01 localhost kernel: [26125.671492] radeon 0000:01:00.0:  
R_008680_CP_STAT          = 0x80010243
Jun 18 03:10:01 localhost kernel: [26125.671494] radeon 0000:01:00.0:  
R_00D034_DMA_STATUS_REG   = 0x44483146
Jun 18 03:10:01 localhost kernel: [26125.671496] radeon 0000:01:00.0:  
R_00D834_DMA_STATUS_REG   = 0x44C84246
Jun 18 03:10:01 localhost kernel: [26125.671499] radeon 0000:01:00.0:  
VM_CONTEXT1_PROTECTION_FAULT_ADDR   0x00000000
Jun 18 03:10:01 localhost kernel: [26125.671501] radeon 0000:01:00.0:  
VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x00000000
Jun 18 03:10:02 localhost kernel: [26126.218193] radeon 0000:01:00.0:
GRBM_SOFT_RESET=0x0000DDFF
Jun 18 03:10:02 localhost kernel: [26126.218247] radeon 0000:01:00.0:
SRBM_SOFT_RESET=0x00100140
Jun 18 03:10:02 localhost kernel: [26126.219404] radeon 0000:01:00.0:  
GRBM_STATUS               = 0x00003028
Jun 18 03:10:02 localhost kernel: [26126.219407] radeon 0000:01:00.0:  
GRBM_STATUS_SE0           = 0x00000006
Jun 18 03:10:02 localhost kernel: [26126.219409] radeon 0000:01:00.0:  
GRBM_STATUS_SE1           = 0x00000006
Jun 18 03:10:02 localhost kernel: [26126.219411] radeon 0000:01:00.0:  
SRBM_STATUS               = 0x200000C0
Jun 18 03:10:02 localhost kernel: [26126.219522] radeon 0000:01:00.0:  
SRBM_STATUS2              = 0x00000000
Jun 18 03:10:02 localhost kernel: [26126.219524] radeon 0000:01:00.0:  
R_008674_CP_STALLED_STAT1 = 0x00000000
Jun 18 03:10:02 localhost kernel: [26126.219526] radeon 0000:01:00.0:  
R_008678_CP_STALLED_STAT2 = 0x00000000
Jun 18 03:10:02 localhost kernel: [26126.219528] radeon 0000:01:00.0:  
R_00867C_CP_BUSY_STAT     = 0x00000000
Jun 18 03:10:02 localhost kernel: [26126.219530] radeon 0000:01:00.0:  
R_008680_CP_STAT          = 0x00000000
Jun 18 03:10:02 localhost kernel: [26126.219533] radeon 0000:01:00.0:  
R_00D034_DMA_STATUS_REG   = 0x44C83D57
Jun 18 03:10:02 localhost kernel: [26126.219535] radeon 0000:01:00.0:  
R_00D834_DMA_STATUS_REG   = 0x44C83D57
Jun 18 03:10:02 localhost kernel: [26126.219780] radeon 0000:01:00.0: GPU reset
succeeded, trying to resume
Jun 18 03:10:02 localhost kernel: [26126.246602] [drm] probing gen 2 caps for
device 1002:5a16 = 31cd02/0
Jun 18 03:10:02 localhost kernel: [26126.246606] [drm] PCIE gen 2 link speeds
already enabled
Jun 18 03:10:02 localhost kernel: [26126.250269] [drm] PCIE GART of 1024M
enabled (table at 0x0000000000276000).
Jun 18 03:10:02 localhost kernel: [26126.250405] radeon 0000:01:00.0: WB
enabled
Jun 18 03:10:02 localhost kernel: [26126.250408] radeon 0000:01:00.0: fence
driver on ring 0 use gpu addr 0x0000000080000c00 and cpu addr
0xffff88041651fc00
Jun 18 03:10:02 localhost kernel: [26126.250411] radeon 0000:01:00.0: fence
driver on ring 1 use gpu addr 0x0000000080000c04 and cpu addr
0xffff88041651fc04
Jun 18 03:10:02 localhost kernel: [26126.250413] radeon 0000:01:00.0: fence
driver on ring 2 use gpu addr 0x0000000080000c08 and cpu addr
0xffff88041651fc08
Jun 18 03:10:02 localhost kernel: [26126.250415] radeon 0000:01:00.0: fence
driver on ring 3 use gpu addr 0x0000000080000c0c and cpu addr
0xffff88041651fc0c
Jun 18 03:10:02 localhost kernel: [26126.250417] radeon 0000:01:00.0: fence
driver on ring 4 use gpu addr 0x0000000080000c10 and cpu addr
0xffff88041651fc10
Jun 18 03:10:02 localhost kernel: [26126.251393] radeon 0000:01:00.0: fence
driver on ring 5 use gpu addr 0x0000000000075a18 and cpu addr
0xffffc90011db5a18
Jun 18 03:10:02 localhost kernel: [26126.436426] [drm] ring test on 0 succeeded
in 3 usecs
Jun 18 03:10:02 localhost kernel: [26126.436432] [drm] ring test on 1 succeeded
in 1 usecs
Jun 18 03:10:02 localhost kernel: [26126.436437] [drm] ring test on 2 succeeded
in 1 usecs
Jun 18 03:10:02 localhost kernel: [26126.436500] [drm] ring test on 3 succeeded
in 2 usecs
Jun 18 03:10:02 localhost kernel: [26126.436510] [drm] ring test on 4 succeeded
in 1 usecs
Jun 18 03:10:02 localhost kernel: [26126.613588] [drm] ring test on 5 succeeded
in 2 usecs
Jun 18 03:10:02 localhost kernel: [26126.613595] [drm] UVD initialized
successfully.
Jun 18 03:10:04 localhost kernel: [26128.294193] SysRq : Emergency Sync
Jun 18 03:10:04 localhost kernel: [26128.309427] Emergency Sync complete
Jun 18 03:10:12 localhost kernel: [26136.610250] radeon 0000:01:00.0: ring 0
stalled for more than 10001msec
Jun 18 03:10:12 localhost kernel: [26136.610260] radeon 0000:01:00.0: GPU
lockup (waiting for 0x0000000000445396 last fence id 0x0000000000445273 on ring
0)
Jun 18 03:10:12 localhost kernel: [26136.610266] [drm:r600_ib_test] *ERROR*
radeon: fence wait failed (-35).
Jun 18 03:10:12 localhost kernel: [26136.610273] [drm:radeon_ib_ring_tests]
*ERROR* radeon: failed testing IB on GFX ring (-35).
Jun 18 03:10:12 localhost kernel: [26136.610278] radeon 0000:01:00.0: ib ring
test failed (-35).
Jun 18 03:10:13 localhost kernel: [26137.083011] SysRq : Emergency Sync
Jun 18 03:10:13 localhost kernel: [26137.098644] Emergency Sync complete
Jun 18 03:10:13 localhost kernel: [26137.164779] radeon 0000:01:00.0: GPU
softreset: 0x00000048
Jun 18 03:10:13 localhost kernel: [26137.164782] radeon 0000:01:00.0:  
GRBM_STATUS               = 0xA0003028
Jun 18 03:10:13 localhost kernel: [26137.164785] radeon 0000:01:00.0:  
GRBM_STATUS_SE0           = 0x00000006
Jun 18 03:10:13 localhost kernel: [26137.164787] radeon 0000:01:00.0:  
GRBM_STATUS_SE1           = 0x00000006
Jun 18 03:10:13 localhost kernel: [26137.164789] radeon 0000:01:00.0:  
SRBM_STATUS               = 0x200000C0
Jun 18 03:10:13 localhost kernel: [26137.164911] radeon 0000:01:00.0:  
SRBM_STATUS2              = 0x00000000
Jun 18 03:10:13 localhost kernel: [26137.164913] radeon 0000:01:00.0:  
R_008674_CP_STALLED_STAT1 = 0x00000000
Jun 18 03:10:13 localhost kernel: [26137.164918] radeon 0000:01:00.0:  
R_008678_CP_STALLED_STAT2 = 0x00010000
Jun 18 03:10:13 localhost kernel: [26137.164925] radeon 0000:01:00.0:  
R_00867C_CP_BUSY_STAT     = 0x00000002
Jun 18 03:10:13 localhost kernel: [26137.164930] radeon 0000:01:00.0:  
R_008680_CP_STAT          = 0x80010243
Jun 18 03:10:13 localhost kernel: [26137.164933] radeon 0000:01:00.0:  
R_00D034_DMA_STATUS_REG   = 0x44C83D57
Jun 18 03:10:13 localhost kernel: [26137.164935] radeon 0000:01:00.0:  
R_00D834_DMA_STATUS_REG   = 0x44C83D57
Jun 18 03:10:13 localhost kernel: [26137.164937] radeon 0000:01:00.0:  
VM_CONTEXT1_PROTECTION_FAULT_ADDR   0x00000000
Jun 18 03:10:13 localhost kernel: [26137.164940] radeon 0000:01:00.0:  
VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x00000000
Jun 18 03:10:13 localhost kernel: [26137.709342] radeon 0000:01:00.0:
GRBM_SOFT_RESET=0x0000DDFF
Jun 18 03:10:13 localhost kernel: [26137.709397] radeon 0000:01:00.0:
SRBM_SOFT_RESET=0x00000100
Jun 18 03:10:13 localhost kernel: [26137.710554] radeon 0000:01:00.0:  
GRBM_STATUS               = 0x00003028
Jun 18 03:10:13 localhost kernel: [26137.710556] radeon 0000:01:00.0:  
GRBM_STATUS_SE0           = 0x00000006
Jun 18 03:10:13 localhost kernel: [26137.710558] radeon 0000:01:00.0:  
GRBM_STATUS_SE1           = 0x00000006
Jun 18 03:10:13 localhost kernel: [26137.710560] radeon 0000:01:00.0:  
SRBM_STATUS               = 0x200000C0
Jun 18 03:10:13 localhost kernel: [26137.710681] radeon 0000:01:00.0:  
SRBM_STATUS2              = 0x00000000
Jun 18 03:10:13 localhost kernel: [26137.710683] radeon 0000:01:00.0:  
R_008674_CP_STALLED_STAT1 = 0x00000000
Jun 18 03:10:13 localhost kernel: [26137.710685] radeon 0000:01:00.0:  
R_008678_CP_STALLED_STAT2 = 0x00000000
Jun 18 03:10:13 localhost kernel: [26137.710687] radeon 0000:01:00.0:  
R_00867C_CP_BUSY_STAT     = 0x00000000
Jun 18 03:10:13 localhost kernel: [26137.710689] radeon 0000:01:00.0:  
R_008680_CP_STAT          = 0x00000000
Jun 18 03:10:13 localhost kernel: [26137.710695] radeon 0000:01:00.0:  
R_00D034_DMA_STATUS_REG   = 0x44C83D57
Jun 18 03:10:13 localhost kernel: [26137.710706] radeon 0000:01:00.0:  
R_00D834_DMA_STATUS_REG   = 0x44C83D57
Jun 18 03:10:13 localhost kernel: [26137.710960] radeon 0000:01:00.0: GPU reset
succeeded, trying to resume
Jun 18 03:10:13 localhost kernel: [26137.724290] [drm] probing gen 2 caps for
device 1002:5a16 = 31cd02/0
Jun 18 03:10:13 localhost kernel: [26137.724294] [drm] PCIE gen 2 link speeds
already enabled
Jun 18 03:10:13 localhost kernel: [26137.728012] [drm] PCIE GART of 1024M
enabled (table at 0x0000000000276000).
Jun 18 03:10:13 localhost kernel: [26137.728147] radeon 0000:01:00.0: WB
enabled
Jun 18 03:10:13 localhost kernel: [26137.728150] radeon 0000:01:00.0: fence
driver on ring 0 use gpu addr 0x0000000080000c00 and cpu addr
0xffff88041651fc00
Jun 18 03:10:13 localhost kernel: [26137.728152] radeon 0000:01:00.0: fence
driver on ring 1 use gpu addr 0x0000000080000c04 and cpu addr
0xffff88041651fc04
Jun 18 03:10:13 localhost kernel: [26137.728154] radeon 0000:01:00.0: fence
driver on ring 2 use gpu addr 0x0000000080000c08 and cpu addr
0xffff88041651fc08
Jun 18 03:10:13 localhost kernel: [26137.728156] radeon 0000:01:00.0: fence
driver on ring 3 use gpu addr 0x0000000080000c0c and cpu addr
0xffff88041651fc0c
Jun 18 03:10:13 localhost kernel: [26137.728158] radeon 0000:01:00.0: fence
driver on ring 4 use gpu addr 0x0000000080000c10 and cpu addr
0xffff88041651fc10
Jun 18 03:10:13 localhost kernel: [26137.729140] radeon 0000:01:00.0: fence
driver on ring 5 use gpu addr 0x0000000000075a18 and cpu addr
0xffffc90011db5a18
Jun 18 03:10:14 localhost kernel: [26137.914127] [drm] ring test on 0 succeeded
in 3 usecs
Jun 18 03:10:14 localhost kernel: [26137.914133] [drm] ring test on 1 succeeded
in 1 usecs
Jun 18 03:10:14 localhost kernel: [26137.914138] [drm] ring test on 2 succeeded
in 1 usecs
Jun 18 03:10:14 localhost kernel: [26137.914202] [drm] ring test on 3 succeeded
in 2 usecs
Jun 18 03:10:14 localhost kernel: [26137.914211] [drm] ring test on 4 succeeded
in 1 usecs
Jun 18 03:10:14 localhost kernel: [26138.091289] [drm] ring test on 5 succeeded
in 2 usecs
Jun 18 03:10:14 localhost kernel: [26138.091296] [drm] UVD initialized
successfully.
Jun 18 03:10:14 localhost kernel: [26138.091385] [drm] ib test on ring 0
succeeded in 0 usecs
Jun 18 03:10:14 localhost kernel: [26138.091475] [drm] ib test on ring 1
succeeded in 0 usecs
Jun 18 03:10:14 localhost kernel: [26138.091601] [drm] ib test on ring 2
succeeded in 0 usecs
Jun 18 03:10:14 localhost kernel: [26138.091644] [drm] ib test on ring 3
succeeded in 0 usecs
Jun 18 03:10:14 localhost kernel: [26138.091679] [drm] ib test on ring 4
succeeded in 0 usecs
Jun 18 03:10:16 localhost kernel: [26140.121922] SysRq : Keyboard mode set to
system default
Jun 18 03:10:18 localhost kernel: [26141.841306] SysRq : Emergency Sync
Jun 18 03:10:18 localhost kernel: [26141.866011] Emergency Sync complete
Jun 18 03:10:18 localhost kernel: [26142.361117] SysRq : Emergency Sync
Jun 18 03:10:18 localhost kernel: [26142.383541] Emergency Sync complete
Jun 18 03:10:24 localhost kernel: [26148.238943] radeon 0000:01:00.0: ring 5
stalled for more than 10000msec
Jun 18 03:10:24 localhost kernel: [26148.238952] radeon 0000:01:00.0: GPU
lockup (waiting for 0x0000000000000004 last fence id 0x0000000000000002 on ring
5)
Jun 18 03:10:24 localhost kernel: [26148.238958] [drm:uvd_v1_0_ib_test] *ERROR*
radeon: fence wait failed (-35).
Jun 18 03:10:24 localhost kernel: [26148.238966] [drm:radeon_ib_ring_tests]
*ERROR* radeon: failed testing IB on ring 5 (-35).
Jun 18 03:10:24 localhost kernel: [26148.238995] [drm:radeon_pm_resume_dpm]
*ERROR* radeon: dpm resume failed
Jun 18 03:10:24 localhost kernel: [26148.262523] radeon 0000:01:00.0: GPU fault
detected: 146 0x05428804
Jun 18 03:10:24 localhost kernel: [26148.262534] radeon 0000:01:00.0:  
VM_CONTEXT1_PROTECTION_FAULT_ADDR   0x0000A1AA
Jun 18 03:10:24 localhost kernel: [26148.262539] radeon 0000:01:00.0:  
VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x02088004
Jun 18 03:10:24 localhost kernel: [26148.262544] VM fault (0x04, vmid 1) at
page 41386, read from TC (136)
Jun 18 03:10:24 localhost kernel: [26148.262552] radeon 0000:01:00.0: GPU fault
detected: 146 0x06824804
Jun 18 03:10:24 localhost kernel: [26148.262556] radeon 0000:01:00.0:  
VM_CONTEXT1_PROTECTION_FAULT_ADDR   0x00000000
Jun 18 03:10:24 localhost kernel: [26148.262559] radeon 0000:01:00.0:  
VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x00000000
Jun 18 03:10:24 localhost kernel: [26148.262563] VM fault (0x00, vmid 0) at
page 0, read from unknown (0)
Jun 18 03:10:24 localhost kernel: [26148.262570] radeon 0000:01:00.0: GPU fault
detected: 146 0x05428404
Jun 18 03:10:24 localhost kernel: [26148.262573] radeon 0000:01:00.0:  
VM_CONTEXT1_PROTECTION_FAULT_ADDR   0x00000000
Jun 18 03:10:24 localhost kernel: [26148.262577] radeon 0000:01:00.0:  
VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x00000000
Jun 18 03:10:24 localhost kernel: [26148.262580] VM fault (0x00, vmid 0) at
page 0, read from unknown (0)
Jun 18 03:10:24 localhost kernel: [26148.262586] radeon 0000:01:00.0: GPU fault
detected: 146 0x06c24404
Jun 18 03:10:24 localhost kernel: [26148.262590] radeon 0000:01:00.0:  
VM_CONTEXT1_PROTECTION_FAULT_ADDR   0x00000000
Jun 18 03:10:24 localhost kernel: [26148.262593] radeon 0000:01:00.0:  
VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x00000000
Jun 18 03:10:24 localhost kernel: [26148.262597] VM fault (0x00, vmid 0) at
page 0, read from unknown (0)
Jun 18 03:10:24 localhost kernel: [26148.262603] radeon 0000:01:00.0: GPU fault
detected: 146 0x06228804
Jun 18 03:10:24 localhost kernel: [26148.262606] radeon 0000:01:00.0:  
VM_CONTEXT1_PROTECTION_FAULT_ADDR   0x00000000
Jun 18 03:10:24 localhost kernel: [26148.262610] radeon 0000:01:00.0:  
VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x00000000
Jun 18 03:10:24 localhost kernel: [26148.262613] VM fault (0x00, vmid 0) at
page 0, read from unknown (0)
Jun 18 03:10:24 localhost kernel: [26148.262619] radeon 0000:01:00.0: GPU fault
detected: 146 0x07024804
Jun 18 03:10:24 localhost kernel: [26148.262623] radeon 0000:01:00.0:  
VM_CONTEXT1_PROTECTION_FAULT_ADDR   0x00000000
Jun 18 03:10:24 localhost kernel: [26148.262626] radeon 0000:01:00.0:  
VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x00000000
Jun 18 03:10:24 localhost kernel: [26148.262629] VM fault (0x00, vmid 0) at
page 0, read from unknown (0)
Jun 18 03:10:24 localhost kernel: [26148.262635] radeon 0000:01:00.0: GPU fault
detected: 146 0x0802c804
Jun 18 03:10:24 localhost kernel: [26148.262639] radeon 0000:01:00.0:  
VM_CONTEXT1_PROTECTION_FAULT_ADDR   0x00000000
Jun 18 03:10:24 localhost kernel: [26148.262642] radeon 0000:01:00.0:  
VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x00000000
Jun 18 03:10:24 localhost kernel: [26148.262646] VM fault (0x00, vmid 0) at
page 0, read from unknown (0)
Jun 18 03:10:24 localhost kernel: [26148.262652] radeon 0000:01:00.0: GPU fault
detected: 146 0x0882c404
Jun 18 03:10:24 localhost kernel: [26148.262655] radeon 0000:01:00.0:  
VM_CONTEXT1_PROTECTION_FAULT_ADDR   0x00000000
Jun 18 03:10:24 localhost kernel: [26148.262659] radeon 0000:01:00.0:  
VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x00000000
===CUT===
(The VM fault message repeats several thousand times; some messages are omitted
because the full log is about 3.5 MB.)
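
The first VM fault in the log above can also be decoded by hand. The sketch
below assumes a bit layout that reproduces the kernel's own decoded line
("VM fault (0x04, vmid 1) at page 41386, read from TC (136)"): protections in
bits 0-3, memory client id in bits 12-19, a read/write flag in bit 24, and the
VMID in bits 25-28. This layout is inferred from the log itself and should be
treated as an assumption; `decode_vm_fault` is a hypothetical helper, not a
driver function.

```python
# Sketch: decode VM_CONTEXT1_PROTECTION_FAULT_ADDR/STATUS values from the log.
# The bit positions are an assumption inferred from the kernel's own decoded
# output in this report.

def decode_vm_fault(addr: int, status: int) -> dict:
    """Split a fault address/status pair into the fields the kernel prints."""
    return {
        "page": addr,                          # fault address is a page number
        "protections": status & 0xF,           # bits 0-3
        "client_id": (status >> 12) & 0xFF,    # bits 12-19
        "write": bool((status >> 24) & 0x1),   # bit 24: 1 = write, 0 = read
        "vmid": (status >> 25) & 0xF,          # bits 25-28
    }

# Values taken from the first "GPU fault detected" entry in the log.
fault = decode_vm_fault(0x0000A1AA, 0x02088004)
print(fault)
```

For the logged values this gives page 41386, protections 0x04, client id 136,
a read access, and vmid 1, which matches the kernel's decoded line.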

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

* [Bug 78221] 3.16 RC1: AMD R9 270 GPU locks up on some heavy 2D activity - GPU VM fault occurs. (possibly DMA copying issue strikes back?)
  2014-06-18  2:20 [Bug 78221] New: 3.16 RC1: AMD R9 270 GPU locks up on some heavy 2D activity - GPU VM fault occurs. (possibly DMA copying issue strikes back?) bugzilla-daemon
@ 2014-06-18  2:22 ` bugzilla-daemon
  2014-06-18 15:12 ` bugzilla-daemon
                   ` (22 subsequent siblings)
  23 siblings, 0 replies; 25+ messages in thread
From: bugzilla-daemon @ 2014-06-18  2:22 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=78221

--- Comment #1 from t3st3r@mail.ru ---
*** Bug 78211 has been marked as a duplicate of this bug. ***

* [Bug 78221] 3.16 RC1: AMD R9 270 GPU locks up on some heavy 2D activity - GPU VM fault occurs. (possibly DMA copying issue strikes back?)
  2014-06-18  2:20 [Bug 78221] New: 3.16 RC1: AMD R9 270 GPU locks up on some heavy 2D activity - GPU VM fault occurs. (possibly DMA copying issue strikes back?) bugzilla-daemon
  2014-06-18  2:22 ` [Bug 78221] " bugzilla-daemon
@ 2014-06-18 15:12 ` bugzilla-daemon
  2014-06-19  7:37 ` bugzilla-daemon
                   ` (21 subsequent siblings)
  23 siblings, 0 replies; 25+ messages in thread
From: bugzilla-daemon @ 2014-06-18 15:12 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=78221

Alex Deucher <alexdeucher@gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |alexdeucher@gmail.com

--- Comment #2 from Alex Deucher <alexdeucher@gmail.com> ---
This is more likely a bug in the Mesa 3D driver than a kernel bug.  The 3D
driver is used for both 2D and 3D acceleration.

* [Bug 78221] 3.16 RC1: AMD R9 270 GPU locks up on some heavy 2D activity - GPU VM fault occurs. (possibly DMA copying issue strikes back?)
  2014-06-18  2:20 [Bug 78221] New: 3.16 RC1: AMD R9 270 GPU locks up on some heavy 2D activity - GPU VM fault occurs. (possibly DMA copying issue strikes back?) bugzilla-daemon
  2014-06-18  2:22 ` [Bug 78221] " bugzilla-daemon
  2014-06-18 15:12 ` bugzilla-daemon
@ 2014-06-19  7:37 ` bugzilla-daemon
  2014-06-19 13:46 ` bugzilla-daemon
                   ` (20 subsequent siblings)
  23 siblings, 0 replies; 25+ messages in thread
From: bugzilla-daemon @ 2014-06-19  7:37 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=78221

--- Comment #3 from t3st3r@mail.ru ---
It does not look like that. I re-checked, using the very same version of the
drivers in the process. I'm unable to trigger the bug with the 3.15 mainline
kernel, no matter what. But it happens easily with older kernels like 3.14 or
the early 3.15 RCs, and it also happens with 3.16-rc1. This makes me think it
comes down to DMA issues. What makes me even more suspicious is that the early
3.15 RCs were crashing as well, but in the release version the crash is gone.
It looks like last-minute changes related to DMA fixed this instability. But now
it has reappeared :(.

* [Bug 78221] 3.16 RC1: AMD R9 270 GPU locks up on some heavy 2D activity - GPU VM fault occurs. (possibly DMA copying issue strikes back?)
  2014-06-18  2:20 [Bug 78221] New: 3.16 RC1: AMD R9 270 GPU locks up on some heavy 2D activity - GPU VM fault occurs. (possibly DMA copying issue strikes back?) bugzilla-daemon
                   ` (2 preceding siblings ...)
  2014-06-19  7:37 ` bugzilla-daemon
@ 2014-06-19 13:46 ` bugzilla-daemon
  2014-06-21  4:04 ` bugzilla-daemon
                   ` (19 subsequent siblings)
  23 siblings, 0 replies; 25+ messages in thread
From: bugzilla-daemon @ 2014-06-19 13:46 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=78221

--- Comment #4 from Alex Deucher <alexdeucher@gmail.com> ---
Can you bisect and see what fixed it in 3.15 or what broke it again in 3.16?
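
For readers unfamiliar with the term: bisecting is a binary search over the
commit history between a known-good and a known-bad kernel. A minimal Python
sketch of the idea follows. The commit names and the `is_bad` test are
hypothetical placeholders; in practice `git bisect` drives the search, and each
step means building and booting a kernel. Because this bug is intermittent, a
"good" verdict is only trustworthy after a sufficiently long test run.

```python
from typing import Callable, Sequence

def first_bad(commits: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Return the first bad commit, assuming commits[0] is good,
    commits[-1] is bad, and every commit after the first bad one is bad."""
    lo, hi = 0, len(commits) - 1      # invariant: commits[lo] good, commits[hi] bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(commits[mid]):
            hi = mid                  # breakage is at mid or earlier
        else:
            lo = mid                  # breakage is after mid
    return commits[hi]

# Hypothetical history: c0 (good) ... c9 (bad), breakage introduced at c6.
history = [f"c{i}" for i in range(10)]
culprit = first_bad(history, lambda c: int(c[1:]) >= 6)
print(culprit)
```

Finding what fixed the bug in 3.15 is the same search with the roles of "good"
and "bad" swapped.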

* [Bug 78221] 3.16 RC1: AMD R9 270 GPU locks up on some heavy 2D activity - GPU VM fault occurs. (possibly DMA copying issue strikes back?)
  2014-06-18  2:20 [Bug 78221] New: 3.16 RC1: AMD R9 270 GPU locks up on some heavy 2D activity - GPU VM fault occurs. (possibly DMA copying issue strikes back?) bugzilla-daemon
                   ` (3 preceding siblings ...)
  2014-06-19 13:46 ` bugzilla-daemon
@ 2014-06-21  4:04 ` bugzilla-daemon
  2014-06-22  7:12 ` bugzilla-daemon
                   ` (18 subsequent siblings)
  23 siblings, 0 replies; 25+ messages in thread
From: bugzilla-daemon @ 2014-06-21  4:04 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=78221

--- Comment #5 from t3st3r@mail.ru ---
Will try that, since the bug is nasty enough. It can take some time.

As an initial investigation, looking at the commit log and matching it against
occurrences of the bug, it appears the stability issues were fixed as a result
of commit 0a4ae727d6aa459247b027387edb6ff99f657792 (it appears between 3.15-rc8
and the 3.15 release).

So all 3.15 RCs were not stable on the R9 270. However, the 3.15 release is okay
due to these last-minute fixes. Yet 0a4ae727d6aa459247b027387edb6ff99f657792
seems to be composed of a few commits, so let's chew on it a bit more. Most
likely it comes down to 91b0275c0ecd1870c5f8bfb73e2da2d6c29414b3.

I think I will try a little experiment first: return CPDMA to how it was in the
3.15 last-minute fix and see if stability returns to 3.16-rc1 with the R9 270.

* [Bug 78221] 3.16 RC1: AMD R9 270 GPU locks up on some heavy 2D activity - GPU VM fault occurs. (possibly DMA copying issue strikes back?)
  2014-06-18  2:20 [Bug 78221] New: 3.16 RC1: AMD R9 270 GPU locks up on some heavy 2D activity - GPU VM fault occurs. (possibly DMA copying issue strikes back?) bugzilla-daemon
                   ` (4 preceding siblings ...)
  2014-06-21  4:04 ` bugzilla-daemon
@ 2014-06-22  7:12 ` bugzilla-daemon
  2014-06-23 14:44 ` bugzilla-daemon
                   ` (17 subsequent siblings)
  23 siblings, 0 replies; 25+ messages in thread
From: bugzilla-daemon @ 2014-06-22  7:12 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=78221

--- Comment #6 from t3st3r@mail.ru ---
Hmm, wrong guess about CPDMA. Trying harder; due to the nature of the bug it can
take some time.

* [Bug 78221] 3.16 RC1: AMD R9 270 GPU locks up on some heavy 2D activity - GPU VM fault occurs. (possibly DMA copying issue strikes back?)
  2014-06-18  2:20 [Bug 78221] New: 3.16 RC1: AMD R9 270 GPU locks up on some heavy 2D activity - GPU VM fault occurs. (possibly DMA copying issue strikes back?) bugzilla-daemon
                   ` (5 preceding siblings ...)
  2014-06-22  7:12 ` bugzilla-daemon
@ 2014-06-23 14:44 ` bugzilla-daemon
  2014-06-23 14:45 ` bugzilla-daemon
                   ` (16 subsequent siblings)
  23 siblings, 0 replies; 25+ messages in thread
From: bugzilla-daemon @ 2014-06-23 14:44 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=78221

--- Comment #7 from Alex Deucher <alexdeucher@gmail.com> ---
Created attachment 140711
  --> https://bugzilla.kernel.org/attachment.cgi?id=140711&action=edit
patch 1/2

Does this patch set help?

* [Bug 78221] 3.16 RC1: AMD R9 270 GPU locks up on some heavy 2D activity - GPU VM fault occurs. (possibly DMA copying issue strikes back?)
  2014-06-18  2:20 [Bug 78221] New: 3.16 RC1: AMD R9 270 GPU locks up on some heavy 2D activity - GPU VM fault occurs. (possibly DMA copying issue strikes back?) bugzilla-daemon
                   ` (6 preceding siblings ...)
  2014-06-23 14:44 ` bugzilla-daemon
@ 2014-06-23 14:45 ` bugzilla-daemon
  2014-06-24 11:40 ` bugzilla-daemon
                   ` (15 subsequent siblings)
  23 siblings, 0 replies; 25+ messages in thread
From: bugzilla-daemon @ 2014-06-23 14:45 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=78221

--- Comment #8 from Alex Deucher <alexdeucher@gmail.com> ---
Created attachment 140721
  --> https://bugzilla.kernel.org/attachment.cgi?id=140721&action=edit
patch 2/2

* [Bug 78221] 3.16 RC1: AMD R9 270 GPU locks up on some heavy 2D activity - GPU VM fault occurs. (possibly DMA copying issue strikes back?)
  2014-06-18  2:20 [Bug 78221] New: 3.16 RC1: AMD R9 270 GPU locks up on some heavy 2D activity - GPU VM fault occurs. (possibly DMA copying issue strikes back?) bugzilla-daemon
                   ` (7 preceding siblings ...)
  2014-06-23 14:45 ` bugzilla-daemon
@ 2014-06-24 11:40 ` bugzilla-daemon
  2014-06-24 16:23 ` bugzilla-daemon
                   ` (14 subsequent siblings)
  23 siblings, 0 replies; 25+ messages in thread
From: bugzilla-daemon @ 2014-06-24 11:40 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=78221

--- Comment #9 from t3st3r@mail.ru ---
Hmm, this patch does not apply cleanly to 3.16-rc1 or -rc2; it mostly produces
a bunch of conflicts in radeon_vm.c, which are a bit over my head to resolve at
this point. Which kernel version am I supposed to try?

* [Bug 78221] 3.16 RC1: AMD R9 270 GPU locks up on some heavy 2D activity - GPU VM fault occurs. (possibly DMA copying issue strikes back?)
  2014-06-18  2:20 [Bug 78221] New: 3.16 RC1: AMD R9 270 GPU locks up on some heavy 2D activity - GPU VM fault occurs. (possibly DMA copying issue strikes back?) bugzilla-daemon
                   ` (8 preceding siblings ...)
  2014-06-24 11:40 ` bugzilla-daemon
@ 2014-06-24 16:23 ` bugzilla-daemon
  2014-06-24 16:23 ` bugzilla-daemon
                   ` (13 subsequent siblings)
  23 siblings, 0 replies; 25+ messages in thread
From: bugzilla-daemon @ 2014-06-24 16:23 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=78221

Alex Deucher <alexdeucher@gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
 Attachment #140711|0                           |1
        is obsolete|                            |

--- Comment #10 from Alex Deucher <alexdeucher@gmail.com> ---
Created attachment 140871
  --> https://bugzilla.kernel.org/attachment.cgi?id=140871&action=edit
patch 1/2

Sorry, updated patches.  These apply against 3.15.

* [Bug 78221] 3.16 RC1: AMD R9 270 GPU locks up on some heavy 2D activity - GPU VM fault occurs. (possibly DMA copying issue strikes back?)
  2014-06-18  2:20 [Bug 78221] New: 3.16 RC1: AMD R9 270 GPU locks up on some heavy 2D activity - GPU VM fault occurs. (possibly DMA copying issue strikes back?) bugzilla-daemon
                   ` (9 preceding siblings ...)
  2014-06-24 16:23 ` bugzilla-daemon
@ 2014-06-24 16:23 ` bugzilla-daemon
  2014-06-25  1:05 ` bugzilla-daemon
                   ` (12 subsequent siblings)
  23 siblings, 0 replies; 25+ messages in thread
From: bugzilla-daemon @ 2014-06-24 16:23 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=78221

Alex Deucher <alexdeucher@gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
 Attachment #140721|0                           |1
        is obsolete|                            |

--- Comment #11 from Alex Deucher <alexdeucher@gmail.com> ---
Created attachment 140881
  --> https://bugzilla.kernel.org/attachment.cgi?id=140881&action=edit
patch 2/2


* [Bug 78221] 3.16 RC1: AMD R9 270 GPU locks up on some heavy 2D activity - GPU VM fault occurs. (possibly DMA copying issue strikes back?)
  2014-06-18  2:20 [Bug 78221] New: 3.16 RC1: AMD R9 270 GPU locks up on some heavy 2D activity - GPU VM fault occurs. (possibly DMA copying issue strikes back?) bugzilla-daemon
                   ` (10 preceding siblings ...)
  2014-06-24 16:23 ` bugzilla-daemon
@ 2014-06-25  1:05 ` bugzilla-daemon
  2014-06-25  2:11 ` bugzilla-daemon
                   ` (11 subsequent siblings)
  23 siblings, 0 replies; 25+ messages in thread
From: bugzilla-daemon @ 2014-06-25  1:05 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=78221

--- Comment #12 from t3st3r@mail.ru ---
And what am I supposed to test if the patch is against 3.15? The 3.15 release
is fine "on its own" and does not expose this bug. So it is impossible to see
the bug appear -> apply the patch -> check that the bug is gone (which looks
like the most logical course of action to me, unless I got something wrong),
because 3.15 lacks this bug even before the patch.

The bug appears to have been fixed between 3.15-rc8 and 3.15 as a result of the
mentioned merge. Then it reappeared in 3.16-rc1 (and later) as a result of
other merges. So wouldn't it be logical for the patch to be against some
3.15-rc* or 3.16-rc*? Unfortunately, those have so many changes related to VM
management that I'm not skilled enough to port the patch to these versions
myself (most notably, the radeon_vm.c changes are quite complicated). So I
can't check whether the GPU lockup is gone after patching some "known-bad"
version.

Or do you mean something like this: take 3.15 (which is OK) and check that the
patch does not break anything? But that wouldn't be a direct check that the bug
is actually gone, right?


* [Bug 78221] 3.16 RC1: AMD R9 270 GPU locks up on some heavy 2D activity - GPU VM fault occurs. (possibly DMA copying issue strikes back?)
  2014-06-18  2:20 [Bug 78221] New: 3.16 RC1: AMD R9 270 GPU locks up on some heavy 2D activity - GPU VM fault occurs. (possibly DMA copying issue strikes back?) bugzilla-daemon
                   ` (11 preceding siblings ...)
  2014-06-25  1:05 ` bugzilla-daemon
@ 2014-06-25  2:11 ` bugzilla-daemon
  2014-06-25  9:45 ` bugzilla-daemon
                   ` (10 subsequent siblings)
  23 siblings, 0 replies; 25+ messages in thread
From: bugzilla-daemon @ 2014-06-25  2:11 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=78221

--- Comment #13 from Alex Deucher <alexdeucher@gmail.com> ---
Sorry, I misread your comments and thought it was broken on 3.15 as well.  You
can follow the thread here:
http://lists.freedesktop.org/archives/dri-devel/2014-June/062305.html


* [Bug 78221] 3.16 RC1: AMD R9 270 GPU locks up on some heavy 2D activity - GPU VM fault occurs. (possibly DMA copying issue strikes back?)
  2014-06-18  2:20 [Bug 78221] New: 3.16 RC1: AMD R9 270 GPU locks up on some heavy 2D activity - GPU VM fault occurs. (possibly DMA copying issue strikes back?) bugzilla-daemon
                   ` (12 preceding siblings ...)
  2014-06-25  2:11 ` bugzilla-daemon
@ 2014-06-25  9:45 ` bugzilla-daemon
  2014-06-25 13:17 ` bugzilla-daemon
                   ` (9 subsequent siblings)
  23 siblings, 0 replies; 25+ messages in thread
From: bugzilla-daemon @ 2014-06-25  9:45 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=78221

--- Comment #14 from t3st3r@mail.ru ---
No, no, v3.15 (release) is okay on my GPU with regard to this bug. That is what
makes testing the patch tricky.

The bug has been here for an unknown length of time. I can tell for sure it
plagued all 3.15 RCs (maybe earlier versions as well). But between 3.15-rc8 and
the 3.15 release, a bunch of last-minute DRM fixes landed
(0a4ae727d6aa459247b027387edb6ff99f657792). Among everything else, that merge
corrected this GPU deadlock problem. So v3.15 (the one and only) does not
expose the bug.

When I gave v3.16-rc1 a try, I found that the bug had reappeared. Hence it
looked like a regression in the 3.16 RCs.


* [Bug 78221] 3.16 RC1: AMD R9 270 GPU locks up on some heavy 2D activity - GPU VM fault occurs. (possibly DMA copying issue strikes back?)
  2014-06-18  2:20 [Bug 78221] New: 3.16 RC1: AMD R9 270 GPU locks up on some heavy 2D activity - GPU VM fault occurs. (possibly DMA copying issue strikes back?) bugzilla-daemon
                   ` (13 preceding siblings ...)
  2014-06-25  9:45 ` bugzilla-daemon
@ 2014-06-25 13:17 ` bugzilla-daemon
  2014-08-05  8:06 ` bugzilla-daemon
                   ` (8 subsequent siblings)
  23 siblings, 0 replies; 25+ messages in thread
From: bugzilla-daemon @ 2014-06-25 13:17 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=78221

--- Comment #15 from Alex Deucher <alexdeucher@gmail.com> ---
Any luck narrowing down what fixed it in 3.15 or what broke it again in 3.16?


* [Bug 78221] 3.16 RC1: AMD R9 270 GPU locks up on some heavy 2D activity - GPU VM fault occurs. (possibly DMA copying issue strikes back?)
  2014-06-18  2:20 [Bug 78221] New: 3.16 RC1: AMD R9 270 GPU locks up on some heavy 2D activity - GPU VM fault occurs. (possibly DMA copying issue strikes back?) bugzilla-daemon
                   ` (14 preceding siblings ...)
  2014-06-25 13:17 ` bugzilla-daemon
@ 2014-08-05  8:06 ` bugzilla-daemon
  2014-08-14 11:56 ` bugzilla-daemon
                   ` (7 subsequent siblings)
  23 siblings, 0 replies; 25+ messages in thread
From: bugzilla-daemon @ 2014-08-05  8:06 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=78221

--- Comment #16 from t3st3r@mail.ru ---
I have to admit this bug really sucks. I've attempted to bisect 3.15 ->
3.16-rc1 several times, but these attempts have failed so far.

While I generally found quite fast ways to trigger this bug in lucky cases, in
other cases the bug does not trigger for many hours, or even requires a reboot
on the same kernel version to increase the chance that it appears. The bug also
seems to be really picky about the previous history of GPU usage (e.g.
launching some 3D game before BfW can throw everything off, so that the bug
does not trigger for literally days, but it can still occasionally backstab).

In some cases, deciding whether a kernel is bugged or not turned out to be a
really daunting and time-consuming task. My last attempt was also wrong. I bet
some of the "good" kernels were not as good as they should have been. Bad
kernels, on the other hand, can be trusted to be bad, i.e. the GPU crashed.

So the last attempt also led me into a really strange area: I don't even have
the hardware in question, so this module is never used.

P.S. As far as I understand, the fix from
http://lists.freedesktop.org/archives/dri-devel/2014-June/062305.html wasn't
ported to the 3.16 series? So 3.16 keeps failing for me.


And as an example, the last bisect looked like this:
$ git bisect log
git bisect start
# good: [1860e379875dfe7271c649058aeddffe5afd9d0d] Linux 3.15
git bisect good 1860e379875dfe7271c649058aeddffe5afd9d0d
# bad: [7171511eaec5bf23fb06078f59784a3a0626b38f] Linux 3.16-rc1
git bisect bad 7171511eaec5bf23fb06078f59784a3a0626b38f
# good: [aaeb2554337217dfa4eac2fcc90da7be540b9a73] Merge branch 'v4l_for_linus'
of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media into next
git bisect good aaeb2554337217dfa4eac2fcc90da7be540b9a73
# good: [16b9057804c02e2d351e9c8f606e909b43cbd9e7] Merge branch 'for-linus' of
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
git bisect good 16b9057804c02e2d351e9c8f606e909b43cbd9e7
# bad: [249c8b8d7e2d1bf9505dc46458537e77326c24fd] i40evf: remove unnecessary
log messages
git bisect bad 249c8b8d7e2d1bf9505dc46458537e77326c24fd
# good: [758bd61aa987e82765bd432f37bd81bd197c4b1a] Merge branch 'master' of
git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net-next
git bisect good 758bd61aa987e82765bd432f37bd81bd197c4b1a
# bad: [9db7cb6901740453a442e598563b576987dd471b] Merge branch 'master' of
git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next into
for-davem
git bisect bad 9db7cb6901740453a442e598563b576987dd471b
# bad: [99abe65ff18b6bbac2e55524827b571c3eccfa86] Merge tag 'nfc-next-3.16-1'
of git://git.kernel.org/pub/scm/linux/kernel/git/sameo/nfc-next
git bisect bad 99abe65ff18b6bbac2e55524827b571c3eccfa86
# bad: [75e58071c0c64f331ccc4c0037990a1e50099f7f] Merge branch 'for-linville'
of git://github.com/kvalo/ath
git bisect bad 75e58071c0c64f331ccc4c0037990a1e50099f7f
# bad: [d5738b41e555f97f597b19bc549fa811b516d6b6] Revert "wl1251: enforce
changed hw encryption support on monitor state change"
git bisect bad d5738b41e555f97f597b19bc549fa811b516d6b6
# bad: [0aa7142812c19af25ad21405eefc499e83da2fcc] iwlwifi: mvm: fix sparse
warning when _DEBUGFS isn't set
git bisect bad 0aa7142812c19af25ad21405eefc499e83da2fcc
# bad: [14b485f041e35f60212317017c2127b8a9b6be31] iwlwifi: mvm: prevent nic to
powered up at driver load
git bisect bad 14b485f041e35f60212317017c2127b8a9b6be31
# bad: [1e9551debacdaa044eeb514f4366beac6e18f6d9] iwlwifi: mvm: rs: don't allow
TPC when power save is disabled
git bisect bad 1e9551debacdaa044eeb514f4366beac6e18f6d9
# bad: [cebeb0f1885fa93c44be5d4e0b9b640210ff088c] Merge remote-tracking branch
'wireless-next/master' into iwlwifi-next
git bisect bad cebeb0f1885fa93c44be5d4e0b9b640210ff088c
# bad: [939ecf6b14c46e3448411a934418311b492bfee4] Merge remote-tracking branch
'iwlwifi-fixes/master' into iwlwifi-next
git bisect bad 939ecf6b14c46e3448411a934418311b492bfee4
# first bad commit: [939ecf6b14c46e3448411a934418311b492bfee4] Merge
remote-tracking branch 'iwlwifi-fixes/master' into iwlwifi-next

Obviously iwlwifi has nothing to do with this bug. I bet I failed to judge the
quality of some kernel(s) correctly one more time.
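
The difficulty described above (a bad kernel occasionally passing as "good")
can be put into rough numbers. The sketch below is a minimal illustration in
Python, assuming that crashes on a bad kernel arrive at random with a constant
average rate. The function names are hypothetical, and the 30-minute mean time
to failure is only borrowed from the "usually in range of 30 minutes" estimate
in the original report. Since the bug reportedly depends on the previous
history of GPU usage, real behaviour is likely less regular than this model.

```python
import math


def false_good_probability(test_minutes: float, mttf_minutes: float) -> float:
    """Chance that a bad kernel survives a test of this length without a crash.

    Assumes exponentially distributed time to failure (an idealization).
    """
    return math.exp(-test_minutes / mttf_minutes)


def required_test_minutes(mttf_minutes: float, max_false_good: float) -> float:
    """Shortest test length that keeps the false-"good" chance at or below
    max_false_good under the same exponential assumption."""
    return mttf_minutes * math.log(1.0 / max_false_good)


def chance_of_wrong_bisect(good_verdicts: int, per_verdict_false_good: float) -> float:
    """Worst-case chance that at least one "good" verdict was wrong, assuming
    every kernel marked "good" was in fact bad and verdicts are independent."""
    return 1.0 - (1.0 - per_verdict_false_good) ** good_verdicts


if __name__ == "__main__":
    minutes = required_test_minutes(mttf_minutes=30.0, max_false_good=0.05)
    print(f"test each kernel for about {minutes:.0f} minutes")
    print(f"risk over 3 good verdicts: {chance_of_wrong_bisect(3, 0.05):.3f}")
```

Under these assumptions, a 30-minute mean time to failure calls for roughly 90
minutes of crash-free running before a kernel is marked "good" with about a 5%
chance of error. If the real mean time to failure is hours or days, a reliable
bisect becomes correspondingly more expensive.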


* [Bug 78221] 3.16 RC1: AMD R9 270 GPU locks up on some heavy 2D activity - GPU VM fault occurs. (possibly DMA copying issue strikes back?)
  2014-06-18  2:20 [Bug 78221] New: 3.16 RC1: AMD R9 270 GPU locks up on some heavy 2D activity - GPU VM fault occurs. (possibly DMA copying issue strikes back?) bugzilla-daemon
                   ` (15 preceding siblings ...)
  2014-08-05  8:06 ` bugzilla-daemon
@ 2014-08-14 11:56 ` bugzilla-daemon
  2014-08-24  1:05 ` bugzilla-daemon
                   ` (6 subsequent siblings)
  23 siblings, 0 replies; 25+ messages in thread
From: bugzilla-daemon @ 2014-08-14 11:56 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=78221

Tomasz Mloduchowski <q@qdot.me> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |q@qdot.me

--- Comment #17 from Tomasz Mloduchowski <q@qdot.me> ---
I can confirm that the bug still occurs on 3.16 as well. 

Different hardware:
02:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI]
Curacao XT [Radeon R9 270X]

Non-AMD-Vi  (Intel Xeon), IO-MMU disabled. 

Occasionally on large window resizes (4K display running awesome WM, moving a
2D window from a small tile to a large one) this issue triggers. 


[ 6735.965953] radeon 0000:02:00.0: ring 0 stalled for more than 10081msec
[ 6735.965958] radeon 0000:02:00.0: GPU lockup (waiting for 0x0000000000041872
last fence id 0x0000000000041871 on ring 0)
[ 6735.965962] radeon 0000:02:00.0: failed to get a new IB (-35)
[ 6736.546504] radeon 0000:02:00.0: Saved 12093 dwords of commands on ring 0.
[ 6736.546647] radeon 0000:02:00.0: GPU softreset: 0x0000006C
[ 6736.546651] radeon 0000:02:00.0:   GRBM_STATUS               = 0xA0003028
[ 6736.546654] radeon 0000:02:00.0:   GRBM_STATUS_SE0           = 0x00000006
[ 6736.546657] radeon 0000:02:00.0:   GRBM_STATUS_SE1           = 0x00000006
[ 6736.546660] radeon 0000:02:00.0:   SRBM_STATUS               = 0x200000C0
[ 6736.546773] radeon 0000:02:00.0:   SRBM_STATUS2              = 0x00000000
[ 6736.546777] radeon 0000:02:00.0:   R_008674_CP_STALLED_STAT1 = 0x00000000
[ 6736.546780] radeon 0000:02:00.0:   R_008678_CP_STALLED_STAT2 = 0x00010000
[ 6736.546783] radeon 0000:02:00.0:   R_00867C_CP_BUSY_STAT     = 0x00000002
[ 6736.546786] radeon 0000:02:00.0:   R_008680_CP_STAT          = 0x80010243
[ 6736.546789] radeon 0000:02:00.0:   R_00D034_DMA_STATUS_REG   = 0x44C83146
[ 6736.546793] radeon 0000:02:00.0:   R_00D834_DMA_STATUS_REG   = 0x44C84246
[ 6736.546796] radeon 0000:02:00.0:   VM_CONTEXT1_PROTECTION_FAULT_ADDR  
0x00000000
[ 6736.546802] radeon 0000:02:00.0:   VM_CONTEXT1_PROTECTION_FAULT_STATUS
0x00000000
[ 6737.119141] radeon 0000:02:00.0: GRBM_SOFT_RESET=0x0000DDFF
[ 6737.119197] radeon 0000:02:00.0: SRBM_SOFT_RESET=0x00100140
[ 6737.120382] radeon 0000:02:00.0:   GRBM_STATUS               = 0x00003028
[ 6737.120385] radeon 0000:02:00.0:   GRBM_STATUS_SE0           = 0x00000006
[ 6737.120388] radeon 0000:02:00.0:   GRBM_STATUS_SE1           = 0x00000006
[ 6737.120391] radeon 0000:02:00.0:   SRBM_STATUS               = 0x20000AC0
[ 6737.120503] radeon 0000:02:00.0:   SRBM_STATUS2              = 0x00000000
[ 6737.120507] radeon 0000:02:00.0:   R_008674_CP_STALLED_STAT1 = 0x00000000
[ 6737.120510] radeon 0000:02:00.0:   R_008678_CP_STALLED_STAT2 = 0x00000000
[ 6737.120513] radeon 0000:02:00.0:   R_00867C_CP_BUSY_STAT     = 0x00000000
[ 6737.120516] radeon 0000:02:00.0:   R_008680_CP_STAT          = 0x00000000
[ 6737.120519] radeon 0000:02:00.0:   R_00D034_DMA_STATUS_REG   = 0x44C83D57
[ 6737.120522] radeon 0000:02:00.0:   R_00D834_DMA_STATUS_REG   = 0x44C83D57
[ 6737.120770] radeon 0000:02:00.0: GPU reset succeeded, trying to resume
[ 6737.169219] [drm] probing gen 2 caps for device 8086:340a = 3b3d02/0
[ 6737.169230] [drm] PCIE gen 2 link speeds already enabled
[ 6737.172143] [drm] PCIE GART of 1024M enabled (table at 0x0000000000276000).
[ 6737.172320] radeon 0000:02:00.0: WB enabled
[ 6737.172324] radeon 0000:02:00.0: fence driver on ring 0 use gpu addr
0x0000000080000c00 and cpu addr 0xffff880197695c00
[ 6737.172327] radeon 0000:02:00.0: fence driver on ring 1 use gpu addr
0x0000000080000c04 and cpu addr 0xffff880197695c04
[ 6737.172330] radeon 0000:02:00.0: fence driver on ring 2 use gpu addr
0x0000000080000c08 and cpu addr 0xffff880197695c08
[ 6737.172335] radeon 0000:02:00.0: fence driver on ring 3 use gpu addr
0x0000000080000c0c and cpu addr 0xffff880197695c0c
[ 6737.172338] radeon 0000:02:00.0: fence driver on ring 4 use gpu addr
0x0000000080000c10 and cpu addr 0xffff880197695c10
[ 6737.216900] radeon 0000:02:00.0: fence driver on ring 5 use gpu addr
0x0000000000075a18 and cpu addr 0xffffc90001735a18
[ 6737.402614] [drm] ring test on 0 succeeded in 3 usecs
[ 6737.402627] [drm] ring test on 1 succeeded in 1 usecs
[ 6737.402634] [drm] ring test on 2 succeeded in 1 usecs
[ 6737.402701] [drm] ring test on 3 succeeded in 2 usecs
[ 6737.402713] [drm] ring test on 4 succeeded in 1 usecs
[ 6737.579764] [drm] ring test on 5 succeeded in 2 usecs
[ 6737.579778] [drm] UVD initialized successfully.
[ 6747.574404] radeon 0000:02:00.0: ring 0 stalled for more than 10000msec
[ 6747.574410] radeon 0000:02:00.0: GPU lockup (waiting for 0x0000000000041920
last fence id 0x0000000000041871 on ring 0)
[ 6747.574414] [drm:r600_ib_test] *ERROR* radeon: fence wait failed (-35).
[ 6747.574418] [drm:radeon_ib_ring_tests] *ERROR* radeon: failed testing IB on
GFX ring (-35).
[ 6747.574421] radeon 0000:02:00.0: ib ring test failed (-35).
[ 6748.140502] radeon 0000:02:00.0: GPU softreset: 0x00000048
[ 6748.140507] radeon 0000:02:00.0:   GRBM_STATUS               = 0xA0003028
[ 6748.140510] radeon 0000:02:00.0:   GRBM_STATUS_SE0           = 0x00000006
[ 6748.140513] radeon 0000:02:00.0:   GRBM_STATUS_SE1           = 0x00000006
[ 6748.140516] radeon 0000:02:00.0:   SRBM_STATUS               = 0x200000C0
[ 6748.140628] radeon 0000:02:00.0:   SRBM_STATUS2              = 0x00000000
[ 6748.140631] radeon 0000:02:00.0:   R_008674_CP_STALLED_STAT1 = 0x00000000
[ 6748.140635] radeon 0000:02:00.0:   R_008678_CP_STALLED_STAT2 = 0x00010000
[ 6748.140638] radeon 0000:02:00.0:   R_00867C_CP_BUSY_STAT     = 0x00000002
[ 6748.140641] radeon 0000:02:00.0:   R_008680_CP_STAT          = 0x80010243
[ 6748.140644] radeon 0000:02:00.0:   R_00D034_DMA_STATUS_REG   = 0x44C83D57
[ 6748.140647] radeon 0000:02:00.0:   R_00D834_DMA_STATUS_REG   = 0x44C83D57
[ 6748.140651] radeon 0000:02:00.0:   VM_CONTEXT1_PROTECTION_FAULT_ADDR  
0x00000000
[ 6748.140654] radeon 0000:02:00.0:   VM_CONTEXT1_PROTECTION_FAULT_STATUS
0x00000000
[ 6748.692401] radeon 0000:02:00.0: GRBM_SOFT_RESET=0x0000DDFF
[ 6748.692457] radeon 0000:02:00.0: SRBM_SOFT_RESET=0x00000100
[ 6748.693617] radeon 0000:02:00.0:   GRBM_STATUS               = 0x00003028
[ 6748.693621] radeon 0000:02:00.0:   GRBM_STATUS_SE0           = 0x00000006
[ 6748.693624] radeon 0000:02:00.0:   GRBM_STATUS_SE1           = 0x00000006
[ 6748.693627] radeon 0000:02:00.0:   SRBM_STATUS               = 0x200000C0
[ 6748.693746] radeon 0000:02:00.0:   SRBM_STATUS2              = 0x00000000
[ 6748.693751] radeon 0000:02:00.0:   R_008674_CP_STALLED_STAT1 = 0x00000000
[ 6748.693754] radeon 0000:02:00.0:   R_008678_CP_STALLED_STAT2 = 0x00000000
[ 6748.693757] radeon 0000:02:00.0:   R_00867C_CP_BUSY_STAT     = 0x00000000
[ 6748.693760] radeon 0000:02:00.0:   R_008680_CP_STAT          = 0x00000000
[ 6748.693763] radeon 0000:02:00.0:   R_00D034_DMA_STATUS_REG   = 0x44C83D57
[ 6748.693767] radeon 0000:02:00.0:   R_00D834_DMA_STATUS_REG   = 0x44C83D57
[ 6748.694014] radeon 0000:02:00.0: GPU reset succeeded, trying to resume
[ 6748.709717] [drm] probing gen 2 caps for device 8086:340a = 3b3d02/0
[ 6748.709721] [drm] PCIE gen 2 link speeds already enabled
[ 6748.712059] [drm] PCIE GART of 1024M enabled (table at 0x0000000000276000).
[ 6748.712221] radeon 0000:02:00.0: WB enabled
[ 6748.712224] radeon 0000:02:00.0: fence driver on ring 0 use gpu addr
0x0000000080000c00 and cpu addr 0xffff880197695c00
[ 6748.712225] radeon 0000:02:00.0: fence driver on ring 1 use gpu addr
0x0000000080000c04 and cpu addr 0xffff880197695c04
[ 6748.712227] radeon 0000:02:00.0: fence driver on ring 2 use gpu addr
0x0000000080000c08 and cpu addr 0xffff880197695c08
[ 6748.712229] radeon 0000:02:00.0: fence driver on ring 3 use gpu addr
0x0000000080000c0c and cpu addr 0xffff880197695c0c
[ 6748.712231] radeon 0000:02:00.0: fence driver on ring 4 use gpu addr
0x0000000080000c10 and cpu addr 0xffff880197695c10
[ 6748.755479] radeon 0000:02:00.0: fence driver on ring 5 use gpu addr
0x0000000000075a18 and cpu addr 0xffffc90001735a18
[ 6748.941259] [drm] ring test on 0 succeeded in 3 usecs
[ 6748.941266] [drm] ring test on 1 succeeded in 1 usecs
[ 6748.941272] [drm] ring test on 2 succeeded in 1 usecs
[ 6748.941338] [drm] ring test on 3 succeeded in 2 usecs
[ 6748.941350] [drm] ring test on 4 succeeded in 1 usecs
[ 6749.118470] [drm] ring test on 5 succeeded in 2 usecs
[ 6749.118480] [drm] UVD initialized successfully.
[ 6749.118615] [drm] ib test on ring 0 succeeded in 0 usecs
[ 6749.118672] [drm] ib test on ring 1 succeeded in 0 usecs
[ 6749.118732] [drm] ib test on ring 2 succeeded in 0 usecs
[ 6749.118768] [drm] ib test on ring 3 succeeded in 0 usecs
[ 6749.118804] [drm] ib test on ring 4 succeeded in 0 usecs
[ 6759.264624] radeon 0000:02:00.0: ring 5 stalled for more than 10000msec
[ 6759.264630] radeon 0000:02:00.0: GPU lockup (waiting for 0x0000000000000004
last fence id 0x0000000000000002 on ring 5)
[ 6759.264634] [drm:uvd_v1_0_ib_test] *ERROR* radeon: fence wait failed (-35).
[ 6759.264640] [drm:radeon_ib_ring_tests] *ERROR* radeon: failed testing IB on
ring 5 (-35).
[ 6759.264667] [drm:radeon_pm_resume_dpm] *ERROR* radeon: dpm resume failed
[ 6759.279402] radeon 0000:02:00.0: GPU fault detected: 146 0x0bc33d04
[ 6759.279407] radeon 0000:02:00.0:   VM_CONTEXT1_PROTECTION_FAULT_ADDR  
0x00021CDE
[ 6759.279410] radeon 0000:02:00.0:   VM_CONTEXT1_PROTECTION_FAULT_STATUS
0x0303D004
[ 6759.279413] VM fault (0x04, vmid 1) at page 138462, write from DMA1 (61)
[ 6759.280478] radeon 0000:02:00.0: GPU fault detected: 146 0x0bc24804
[ 6759.280482] radeon 0000:02:00.0:   VM_CONTEXT1_PROTECTION_FAULT_ADDR  
0x00021CDE
[ 6759.280484] radeon 0000:02:00.0:   VM_CONTEXT1_PROTECTION_FAULT_STATUS
0x02048004
[ 6759.280487] VM fault (0x04, vmid 1) at page 138462, read from TC (72)
[ 6759.281017] radeon 0000:02:00.0: GPU fault detected: 146 0x01033d04
[ 6759.281020] radeon 0000:02:00.0:   VM_CONTEXT1_PROTECTION_FAULT_ADDR  
0x00021988
[ 6759.281023] radeon 0000:02:00.0:   VM_CONTEXT1_PROTECTION_FAULT_STATUS
0x0303D004
[ 6759.281025] VM fault (0x04, vmid 1) at page 137608, write from DMA1 (61)
[ 6759.281062] radeon 0000:02:00.0: GPU fault detected: 146 0x01033d04
[ 6759.281064] radeon 0000:02:00.0:   VM_CONTEXT1_PROTECTION_FAULT_ADDR  
0x00000000
[ 6759.281066] radeon 0000:02:00.0:   VM_CONTEXT1_PROTECTION_FAULT_STATUS
0x02048004
[ 6759.281069] VM fault (0x04, vmid 1) at page 0, read from TC (72)
[ 6759.283614] radeon 0000:02:00.0: GPU fault detected: 146 0x0143a004
[ 6759.283619] radeon 0000:02:00.0:   VM_CONTEXT1_PROTECTION_FAULT_ADDR  
0x0001D38A
[ 6759.283621] radeon 0000:02:00.0:   VM_CONTEXT1_PROTECTION_FAULT_STATUS
0x030A0004
[ 6759.283624] VM fault (0x04, vmid 1) at page 119690, write from CB (160)
[ 6759.283841] radeon 0000:02:00.0: GPU fault detected: 146 0x05439004
[ 6759.283844] radeon 0000:02:00.0:   VM_CONTEXT1_PROTECTION_FAULT_ADDR  
0x00000000
[ 6759.283846] radeon 0000:02:00.0:   VM_CONTEXT1_PROTECTION_FAULT_STATUS
0x02048004
[ 6759.283848] VM fault (0x04, vmid 1) at page 0, read from TC (72)
[ 6759.283853] radeon 0000:02:00.0: GPU fault detected: 146 0x05439004
[ 6759.283856] radeon 0000:02:00.0:   VM_CONTEXT1_PROTECTION_FAULT_ADDR  
0x0001D391
[ 6759.283858] radeon 0000:02:00.0:   VM_CONTEXT1_PROTECTION_FAULT_STATUS
0x02048004
[ 6759.283861] VM fault (0x04, vmid 1) at page 119697, read from TC (72)
[ 6759.283889] radeon 0000:02:00.0: GPU fault detected: 146 0x05c3a004
[ 6759.283891] radeon 0000:02:00.0:   VM_CONTEXT1_PROTECTION_FAULT_ADDR  
0x00000000
[ 6759.283894] radeon 0000:02:00.0:   VM_CONTEXT1_PROTECTION_FAULT_STATUS
0x0303D004
[ 6759.283896] VM fault (0x04, vmid 1) at page 0, write from DMA1 (61)
[ 6759.283901] radeon 0000:02:00.0: GPU fault detected: 146 0x05a32004
[ 6759.283904] radeon 0000:02:00.0:   VM_CONTEXT1_PROTECTION_FAULT_ADDR  
0x00021988
[ 6759.283906] radeon 0000:02:00.0:   VM_CONTEXT1_PROTECTION_FAULT_STATUS
0x0303D004
[ 6759.283908] VM fault (0x04, vmid 1) at page 137608, write from DMA1 (61)
[ 6759.283914] radeon 0000:02:00.0: GPU fault detected: 146 0x05a31004
[ 6759.283916] radeon 0000:02:00.0:   VM_CONTEXT1_PROTECTION_FAULT_ADDR  
0x00021988
[ 6759.283918] radeon 0000:02:00.0:   VM_CONTEXT1_PROTECTION_FAULT_STATUS
0x0303D004
[ 6759.283921] VM fault (0x04, vmid 1) at page 137608, write from DMA1 (61)
[ 6759.283965] radeon 0000:02:00.0: GPU fault detected: 146 0x06e3d004
[ 6759.283967] radeon 0000:02:00.0:   VM_CONTEXT1_PROTECTION_FAULT_ADDR  
0x00000000
[ 6759.283969] radeon 0000:02:00.0:   VM_CONTEXT1_PROTECTION_FAULT_STATUS
0x02048004
[ 6759.283972] VM fault (0x04, vmid 1) at page 0, read from TC (72)
[ 6759.284178] radeon 0000:02:00.0: GPU fault detected: 146 0x01424804
[ 6759.284180] radeon 0000:02:00.0:   VM_CONTEXT1_PROTECTION_FAULT_ADDR  
0x0001D38C
[ 6759.284183] radeon 0000:02:00.0:   VM_CONTEXT1_PROTECTION_FAULT_STATUS
0x03060004
[ 6759.284185] VM fault (0x04, vmid 1) at page 119692, write from CB (96)
[ 6759.284190] radeon 0000:02:00.0: GPU fault detected: 146 0x03224804
[ 6759.284193] radeon 0000:02:00.0:   VM_CONTEXT1_PROTECTION_FAULT_ADDR  
0x0001D3A4
[ 6759.284195] radeon 0000:02:00.0:   VM_CONTEXT1_PROTECTION_FAULT_STATUS
0x03050004
[ 6759.284197] VM fault (0x04, vmid 1) at page 119716, write from CB (80)
[ 6759.284422] radeon 0000:02:00.0: GPU fault detected: 146 0x01224804
[ 6759.284424] radeon 0000:02:00.0:   VM_CONTEXT1_PROTECTION_FAULT_ADDR  
0x00000000
[ 6759.284427] radeon 0000:02:00.0:   VM_CONTEXT1_PROTECTION_FAULT_STATUS
0x02048004
[ 6759.284429] VM fault (0x04, vmid 1) at page 0, read from TC (72)
[ 6759.284444] radeon 0000:02:00.0: GPU fault detected: 146 0x01036004
[ 6759.284447] radeon 0000:02:00.0:   VM_CONTEXT1_PROTECTION_FAULT_ADDR  
0x0001D395
[ 6759.284449] radeon 0000:02:00.0:   VM_CONTEXT1_PROTECTION_FAULT_STATUS
0x02048004
[ 6759.284451] VM fault (0x04, vmid 1) at page 119701, read from TC (72)
[ 6759.284556] radeon 0000:02:00.0: GPU fault detected: 146 0x03035004
[ 6759.284558] radeon 0000:02:00.0:   VM_CONTEXT1_PROTECTION_FAULT_ADDR  
0x00000000
[ 6759.284561] radeon 0000:02:00.0:   VM_CONTEXT1_PROTECTION_FAULT_STATUS
0x03060004
[ 6759.284563] VM fault (0x04, vmid 1) at page 0, write from CB (96)
[ 6759.284568] radeon 0000:02:00.0: GPU fault detected: 146 0x0343a004
[ 6759.284570] radeon 0000:02:00.0:   VM_CONTEXT1_PROTECTION_FAULT_ADDR  
0x00021B93
[ 6759.284573] radeon 0000:02:00.0:   VM_CONTEXT1_PROTECTION_FAULT_STATUS
0x03090004
[ 6759.284575] VM fault (0x04, vmid 1) at page 138131, write from CB (144)
[ 6759.284612] radeon 0000:02:00.0: GPU fault detected: 146 0x03232004
[ 6759.284615] radeon 0000:02:00.0:   VM_CONTEXT1_PROTECTION_FAULT_ADDR  
0x00000000
[ 6759.284617] radeon 0000:02:00.0:   VM_CONTEXT1_PROTECTION_FAULT_STATUS
0x0303D004
[ 6759.284619] VM fault (0x04, vmid 1) at page 0, write from DMA1 (61)
[ 6759.284624] radeon 0000:02:00.0: GPU fault detected: 146 0x03c39004
[ 6759.284627] radeon 0000:02:00.0:   VM_CONTEXT1_PROTECTION_FAULT_ADDR  
0x00021CDF
[ 6759.284629] radeon 0000:02:00.0:   VM_CONTEXT1_PROTECTION_FAULT_STATUS
0x0303D004
[ 6759.284631] VM fault (0x04, vmid 1) at page 138463, write from DMA1 (61)
[ 6759.284637] radeon 0000:02:00.0: GPU fault detected: 146 0x03231004
[ 6759.284639] radeon 0000:02:00.0:   VM_CONTEXT1_PROTECTION_FAULT_ADDR  
0x00021CDE
[ 6759.284641] radeon 0000:02:00.0:   VM_CONTEXT1_PROTECTION_FAULT_STATUS
0x0303D004
[ 6759.284644] VM fault (0x04, vmid 1) at page 138462, write from DMA1 (61)
[ 6759.284649] radeon 0000:02:00.0: GPU fault detected: 146 0x03231004
[ 6759.284651] radeon 0000:02:00.0:   VM_CONTEXT1_PROTECTION_FAULT_ADDR  
0x00021CDE
[ 6759.284653] radeon 0000:02:00.0:   VM_CONTEXT1_PROTECTION_FAULT_STATUS
0x0303D004
[ 6759.284656] VM fault (0x04, vmid 1) at page 138462, write from DMA1 (61)
[ 6759.284716] radeon 0000:02:00.0: GPU fault detected: 146 0x05035004
[ 6759.284718] radeon 0000:02:00.0:   VM_CONTEXT1_PROTECTION_FAULT_ADDR  
0x00021B93
[ 6759.284720] radeon 0000:02:00.0:   VM_CONTEXT1_PROTECTION_FAULT_STATUS
0x02044004
[ 6759.284723] VM fault (0x04, vmid 1) at page 138131, read from TC (68)
[ 6759.284728] radeon 0000:02:00.0: GPU fault detected: 146 0x05035004
[ 6759.284730] radeon 0000:02:00.0:   VM_CONTEXT1_PROTECTION_FAULT_ADDR  
0x00021CDF
[ 6759.284732] radeon 0000:02:00.0:   VM_CONTEXT1_PROTECTION_FAULT_STATUS
0x020C4004
[ 6759.284735] VM fault (0x04, vmid 1) at page 138463, read from TC (196)
[ 6759.516471] radeon 0000:02:00.0: GPU fault detected: 146 0x01036004
[ 6759.516475] radeon 0000:02:00.0:   VM_CONTEXT1_PROTECTION_FAULT_ADDR  
0x00021988
[ 6759.516477] radeon 0000:02:00.0:   VM_CONTEXT1_PROTECTION_FAULT_STATUS
0x03060004
[ 6759.516479] VM fault (0x04, vmid 1) at page 137608, write from CB (96)
[ 6759.516483] radeon 0000:02:00.0: GPU fault detected: 146 0x01035004
[ 6759.516485] radeon 0000:02:00.0:   VM_CONTEXT1_PROTECTION_FAULT_ADDR  
0x00021988
[ 6759.516486] radeon 0000:02:00.0:   VM_CONTEXT1_PROTECTION_FAULT_STATUS
0x03060004
[ 6759.516488] VM fault (0x04, vmid 1) at page 137608, write from CB (96)
[ 6759.533652] radeon 0000:02:00.0: GPU fault detected: 146 0x0bc36004
[ 6759.533656] radeon 0000:02:00.0:   VM_CONTEXT1_PROTECTION_FAULT_ADDR  
0x00021CDE
[ 6759.533658] radeon 0000:02:00.0:   VM_CONTEXT1_PROTECTION_FAULT_STATUS
0x03060004
[ 6759.533659] VM fault (0x04, vmid 1) at page 138462, write from CB (96)
[ 6759.533858] radeon 0000:02:00.0: GPU fault detected: 146 0x0bc32004
[ 6759.533860] radeon 0000:02:00.0:   VM_CONTEXT1_PROTECTION_FAULT_ADDR  
0x00021CDE
[ 6759.533861] radeon 0000:02:00.0:   VM_CONTEXT1_PROTECTION_FAULT_STATUS
0x03020004
[ 6759.533863] VM fault (0x04, vmid 1) at page 138462, write from CB (32)
[ 6759.547549] radeon 0000:02:00.0: GPU fault detected: 146 0x0bc24804
[ 6759.547552] radeon 0000:02:00.0:   VM_CONTEXT1_PROTECTION_FAULT_ADDR  
0x00021CDE
[ 6759.547554] radeon 0000:02:00.0:   VM_CONTEXT1_PROTECTION_FAULT_STATUS
0x02048004
[ 6759.547555] VM fault (0x04, vmid 1) at page 138462, read from TC (72)
[ 6761.492893] radeon 0000:02:00.0: GPU fault detected: 146 0x0bc33d04
[ 6761.492896] radeon 0000:02:00.0:   VM_CONTEXT1_PROTECTION_FAULT_ADDR  
0x00021CDE
[ 6761.492898] radeon 0000:02:00.0:   VM_CONTEXT1_PROTECTION_FAULT_STATUS
0x0303D004
[ 6761.492900] VM fault (0x04, vmid 1) at page 138462, write from DMA1 (61)
[ 6761.493081] radeon 0000:02:00.0: GPU fault detected: 146 0x0bc24804
[ 6761.493083] radeon 0000:02:00.0:   VM_CONTEXT1_PROTECTION_FAULT_ADDR  
0x00021CDE
[ 6761.493085] radeon 0000:02:00.0:   VM_CONTEXT1_PROTECTION_FAULT_STATUS
0x02048004
[ 6761.493087] VM fault (0x04, vmid 1) at page 138462, read from TC (72)
[ 6761.493486] radeon 0000:02:00.0: GPU fault detected: 146 0x0bc36004
[ 6761.493489] radeon 0000:02:00.0:   VM_CONTEXT1_PROTECTION_FAULT_ADDR  
0x00021CDE
[ 6761.493491] radeon 0000:02:00.0:   VM_CONTEXT1_PROTECTION_FAULT_STATUS
0x03060004
[ 6761.493493] VM fault (0x04, vmid 1) at page 138462, write from CB (96)
[ 6762.236056] radeon 0000:02:00.0: GPU fault detected: 146 0x0bc21004
[ 6762.236060] radeon 0000:02:00.0:   VM_CONTEXT1_PROTECTION_FAULT_ADDR  
0x00021CDE
[ 6762.236062] radeon 0000:02:00.0:   VM_CONTEXT1_PROTECTION_FAULT_STATUS
0x02010004
[ 6762.236064] VM fault (0x04, vmid 1) at page 138462, read from CB (16)
[ 6762.236240] radeon 0000:02:00.0: GPU fault detected: 146 0x0bc22004
[ 6762.236244] radeon 0000:02:00.0:   VM_CONTEXT1_PROTECTION_FAULT_ADDR  
0x00021CDE
[ 6762.236246] radeon 0000:02:00.0:   VM_CONTEXT1_PROTECTION_FAULT_STATUS
0x02020004
[ 6762.236248] VM fault (0x04, vmid 1) at page 138462, read from CB (32)
[ 6770.359479] radeon 0000:02:00.0: GPU fault detected: 146 0x01036004
[ 6770.359483] radeon 0000:02:00.0:   VM_CONTEXT1_PROTECTION_FAULT_ADDR  
0x00021988
[ 6770.359489] radeon 0000:02:00.0:   VM_CONTEXT1_PROTECTION_FAULT_STATUS
0x03060004
[ 6770.359492] VM fault (0x04, vmid 1) at page 137608, write from CB (96)
[ 6770.359496] radeon 0000:02:00.0: GPU fault detected: 146 0x01039004
[ 6770.359498] radeon 0000:02:00.0:   VM_CONTEXT1_PROTECTION_FAULT_ADDR  
0x00021988
[ 6770.359500] radeon 0000:02:00.0:   VM_CONTEXT1_PROTECTION_FAULT_STATUS
0x02044004
[ 6770.359502] VM fault (0x04, vmid 1) at page 137608, read from TC (68)
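
For anyone decoding these dumps by hand: the one-line "VM fault" summary can be reproduced from the two registers printed above it. Below is a minimal Python sketch. The bit positions are an assumption inferred from the register/summary pairs quoted in this report, not copied from the driver source, and `decode_vm_fault` is a hypothetical helper name. The block name the driver prints (DMA1, TC, CB) comes from a lookup table inside the driver that is not reproduced here, so only the numeric client id is shown.

```python
def decode_vm_fault(addr: int, status: int) -> dict:
    """Split the two fault registers into the fields of the summary line.

    Bit layout is an ASSUMPTION inferred from the logs in this thread:
      bits 0-3    protection fault bits
      bits 12-19  memory client id
      bit  24     1 = write, 0 = read
      bits 25-28  VM context id (vmid)
    """
    return {
        "protections": status & 0xF,
        "client_id": (status >> 12) & 0xFF,
        "access": "write" if (status >> 24) & 1 else "read",
        "vmid": (status >> 25) & 0xF,
        # The summary line shows the ADDR register as a decimal page number.
        "page": addr,
    }


if __name__ == "__main__":
    # Register values taken from one of the faults quoted above.
    f = decode_vm_fault(0x00021CDE, 0x0303D004)
    print("VM fault (0x%02x, vmid %d) at page %d, %s from client %d"
          % (f["protections"], f["vmid"], f["page"], f["access"], f["client_id"]))
```

With the values above, the fields match the summary line in the log (page 138462, write, client 61), which is the only evidence offered here that the assumed layout is right.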

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* [Bug 78221] 3.16 RC1: AMD R9 270 GPU locks up on some heavy 2D activity - GPU VM fault occurs. (possibly DMA copying issue strikes back?)
  2014-06-18  2:20 [Bug 78221] New: 3.16 RC1: AMD R9 270 GPU locks up on some heavy 2D activity - GPU VM fault occurs. (possibly DMA copying issue strikes back?) bugzilla-daemon
                   ` (16 preceding siblings ...)
  2014-08-14 11:56 ` bugzilla-daemon
@ 2014-08-24  1:05 ` bugzilla-daemon
  2014-08-25  9:58 ` bugzilla-daemon
                   ` (5 subsequent siblings)
  23 siblings, 0 replies; 25+ messages in thread
From: bugzilla-daemon @ 2014-08-24  1:05 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=78221

--- Comment #18 from t3st3r@mail.ru ---
This is getting even more interesting. After some investigation I have an idea
why the bisect never succeeds. It looks like there were no stable kernels at
all: 3.15 is also broken. However, it takes "almost forever" to crash it with
the previously used methods.

Somehow I stumbled on a similar but far more effective test case (another map
in the BfW game) which locks up the GPU within seconds to a minute. That's what
I need :). It also proved able to knock down "good" 3.15 kernels in about 30
seconds. So 3.15 was not good at all, and a bisect that starts from it as the
"good" point obviously can't succeed.

On the other hand, now I can try the mentioned patches...

^ permalink raw reply	[flat|nested] 25+ messages in thread

* [Bug 78221] 3.16 RC1: AMD R9 270 GPU locks up on some heavy 2D activity - GPU VM fault occurs. (possibly DMA copying issue strikes back?)
  2014-06-18  2:20 [Bug 78221] New: 3.16 RC1: AMD R9 270 GPU locks up on some heavy 2D activity - GPU VM fault occurs. (possibly DMA copying issue strikes back?) bugzilla-daemon
                   ` (17 preceding siblings ...)
  2014-08-24  1:05 ` bugzilla-daemon
@ 2014-08-25  9:58 ` bugzilla-daemon
  2014-09-08 12:19 ` bugzilla-daemon
                   ` (4 subsequent siblings)
  23 siblings, 0 replies; 25+ messages in thread
From: bugzilla-daemon @ 2014-08-25  9:58 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=78221

--- Comment #19 from Michel Dänzer <michel@daenzer.net> ---
Does a 3.17 based drm-fixes kernel tree work better? There have been a couple
of stability fixes.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* [Bug 78221] 3.16 RC1: AMD R9 270 GPU locks up on some heavy 2D activity - GPU VM fault occurs. (possibly DMA copying issue strikes back?)
  2014-06-18  2:20 [Bug 78221] New: 3.16 RC1: AMD R9 270 GPU locks up on some heavy 2D activity - GPU VM fault occurs. (possibly DMA copying issue strikes back?) bugzilla-daemon
                   ` (18 preceding siblings ...)
  2014-08-25  9:58 ` bugzilla-daemon
@ 2014-09-08 12:19 ` bugzilla-daemon
  2014-09-08 12:22 ` bugzilla-daemon
                   ` (3 subsequent siblings)
  23 siblings, 0 replies; 25+ messages in thread
From: bugzilla-daemon @ 2014-09-08 12:19 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=78221

--- Comment #20 from t3st3r@mail.ru ---
1) About 3.15 + patch: I gave it a try, and it took quite a while to form an
opinion about it. Overall it is quite stable and survives several days of
running the problematic load. But eventually the GPU can still crash. The
interesting thing about the occurrence I caught is that, despite the scary
message about a failed DPM resume, the GPU seems to be operable after a
successful recovery. I got a couple of similar crashes within a week as well.
It looked like this:

===cut===
[815114.959250] SysRq : Emergency Sync
[815115.071974] Emergency Sync complete
[815116.935547] radeon 0000:01:00.0: ring 0 stalled for more than 10082msec
[815116.935556] radeon 0000:01:00.0: GPU lockup (waiting for 0x0000000007f39f60
last fence id 0x0000000007f39f5f on ring 0)
[815116.935564] radeon 0000:01:00.0: failed to get a new IB (-35)
[815116.942472] radeon 0000:01:00.0: sa_manager is not empty, clearing anyway
[815117.134467] SysRq : Keyboard mode set to system default
[815117.500079] AMD-Vi: Event logged [IO_PAGE_FAULT device=01:00.0
domain=0x0018 address=0x0000000080406640 flags=0x0000]
[815117.500092] radeon 0000:01:00.0: Saved 6061 dwords of commands on ring 0.
[815117.500097] AMD-Vi: Event logged [IO_PAGE_FAULT device=01:00.0
domain=0x0018 address=0x0000000080406650 flags=0x0020]
[815117.500104] AMD-Vi: Event logged [IO_PAGE_FAULT device=01:00.0
domain=0x0018 address=0x0000000080000100 flags=0x0020]
[815117.500110] AMD-Vi: Event logged [IO_PAGE_FAULT device=01:00.0
domain=0x0018 address=0x0000000080404500 flags=0x0000]
[815117.500222] radeon 0000:01:00.0: GPU softreset: 0x0000006C
[815117.500226] radeon 0000:01:00.0:   GRBM_STATUS               = 0xA0003028
[815117.500229] radeon 0000:01:00.0:   GRBM_STATUS_SE0           = 0x00000006
[815117.500231] radeon 0000:01:00.0:   GRBM_STATUS_SE1           = 0x00000006
[815117.500233] radeon 0000:01:00.0:   SRBM_STATUS               = 0x200002C0
[815117.500349] radeon 0000:01:00.0:   SRBM_STATUS2              = 0x00000000
[815117.500351] radeon 0000:01:00.0:   R_008674_CP_STALLED_STAT1 = 0x00000000
[815117.500353] radeon 0000:01:00.0:   R_008678_CP_STALLED_STAT2 = 0x00010000
[815117.500356] radeon 0000:01:00.0:   R_00867C_CP_BUSY_STAT     = 0x00000002
[815117.500358] radeon 0000:01:00.0:   R_008680_CP_STAT          = 0x80010243
[815117.500360] radeon 0000:01:00.0:   R_00D034_DMA_STATUS_REG   = 0x44483106
[815117.500362] radeon 0000:01:00.0:   R_00D834_DMA_STATUS_REG   = 0x44C84246
[815117.500365] radeon 0000:01:00.0:   VM_CONTEXT1_PROTECTION_FAULT_ADDR  
0x00000000
[815117.500368] radeon 0000:01:00.0:   VM_CONTEXT1_PROTECTION_FAULT_STATUS
0x00000000
[815118.057253] radeon 0000:01:00.0: GRBM_SOFT_RESET=0x0000DDFF
[815118.057308] radeon 0000:01:00.0: SRBM_SOFT_RESET=0x00100140
[815118.058465] radeon 0000:01:00.0:   GRBM_STATUS               = 0x00003028
[815118.058468] radeon 0000:01:00.0:   GRBM_STATUS_SE0           = 0x00000006
[815118.058470] radeon 0000:01:00.0:   GRBM_STATUS_SE1           = 0x00000006
[815118.058472] radeon 0000:01:00.0:   SRBM_STATUS               = 0x200000C0
[815118.058583] radeon 0000:01:00.0:   SRBM_STATUS2              = 0x00000000
[815118.058585] radeon 0000:01:00.0:   R_008674_CP_STALLED_STAT1 = 0x00000000
[815118.058588] radeon 0000:01:00.0:   R_008678_CP_STALLED_STAT2 = 0x00000000
[815118.058590] radeon 0000:01:00.0:   R_00867C_CP_BUSY_STAT     = 0x00000000
[815118.058592] radeon 0000:01:00.0:   R_008680_CP_STAT          = 0x00000000
[815118.058594] radeon 0000:01:00.0:   R_00D034_DMA_STATUS_REG   = 0x44C83D57
[815118.058597] radeon 0000:01:00.0:   R_00D834_DMA_STATUS_REG   = 0x44C83D57
[815118.058843] radeon 0000:01:00.0: GPU reset succeeded, trying to resume
[815118.086936] [drm] probing gen 2 caps for device 1002:5a16 = 31cd02/0
[815118.086939] [drm] PCIE gen 2 link speeds already enabled
[815118.090599] [drm] PCIE GART of 1024M enabled (table at 0x0000000000276000).
[815118.090704] radeon 0000:01:00.0: WB enabled
[815118.090707] radeon 0000:01:00.0: fence driver on ring 0 use gpu addr
0x0000000080000c00 and cpu addr 0xffff880414545c00
[815118.090709] radeon 0000:01:00.0: fence driver on ring 1 use gpu addr
0x0000000080000c04 and cpu addr 0xffff880414545c04
[815118.090711] radeon 0000:01:00.0: fence driver on ring 2 use gpu addr
0x0000000080000c08 and cpu addr 0xffff880414545c08
[815118.090713] radeon 0000:01:00.0: fence driver on ring 3 use gpu addr
0x0000000080000c0c and cpu addr 0xffff880414545c0c
[815118.090715] radeon 0000:01:00.0: fence driver on ring 4 use gpu addr
0x0000000080000c10 and cpu addr 0xffff880414545c10
[815118.091689] radeon 0000:01:00.0: fence driver on ring 5 use gpu addr
0x0000000000075a18 and cpu addr 0xffffc90012135a18
[815118.278813] [drm] ring test on 0 succeeded in 3 usecs
[815118.278819] [drm] ring test on 1 succeeded in 1 usecs
[815118.278824] [drm] ring test on 2 succeeded in 1 usecs
[815118.278888] [drm] ring test on 3 succeeded in 2 usecs
[815118.278897] [drm] ring test on 4 succeeded in 1 usecs
[815118.455982] [drm] ring test on 5 succeeded in 2 usecs
[815118.455989] [drm] UVD initialized successfully.
[815128.453467] radeon 0000:01:00.0: ring 0 stalled for more than 10001msec
[815128.453477] radeon 0000:01:00.0: GPU lockup (waiting for 0x0000000007f39fad
last fence id 0x0000000007f39f5f on ring 0)
[815128.453483] [drm:r600_ib_test] *ERROR* radeon: fence wait failed (-35).
[815128.453491] [drm:radeon_ib_ring_tests] *ERROR* radeon: failed testing IB on
GFX ring (-35).
[815128.453496] radeon 0000:01:00.0: ib ring test failed (-35).
[815129.011900] radeon 0000:01:00.0: GPU softreset: 0x00000048
[815129.011904] radeon 0000:01:00.0:   GRBM_STATUS               = 0xA0003028
[815129.011907] radeon 0000:01:00.0:   GRBM_STATUS_SE0           = 0x00000006
[815129.011909] radeon 0000:01:00.0:   GRBM_STATUS_SE1           = 0x00000006
[815129.011911] radeon 0000:01:00.0:   SRBM_STATUS               = 0x200000C0
[815129.012022] radeon 0000:01:00.0:   SRBM_STATUS2              = 0x00000000
[815129.012025] radeon 0000:01:00.0:   R_008674_CP_STALLED_STAT1 = 0x00000000
[815129.012027] radeon 0000:01:00.0:   R_008678_CP_STALLED_STAT2 = 0x00010000
[815129.012029] radeon 0000:01:00.0:   R_00867C_CP_BUSY_STAT     = 0x00000002
[815129.012031] radeon 0000:01:00.0:   R_008680_CP_STAT          = 0x80010243
[815129.012034] radeon 0000:01:00.0:   R_00D034_DMA_STATUS_REG   = 0x44C83D57
[815129.012036] radeon 0000:01:00.0:   R_00D834_DMA_STATUS_REG   = 0x44C83D57
[815129.012039] radeon 0000:01:00.0:   VM_CONTEXT1_PROTECTION_FAULT_ADDR  
0x00000000
[815129.012041] radeon 0000:01:00.0:   VM_CONTEXT1_PROTECTION_FAULT_STATUS
0x00000000
[815129.561916] radeon 0000:01:00.0: GRBM_SOFT_RESET=0x0000DDFF
[815129.561971] radeon 0000:01:00.0: SRBM_SOFT_RESET=0x00000100
[815129.563128] radeon 0000:01:00.0:   GRBM_STATUS               = 0x00003028
[815129.563131] radeon 0000:01:00.0:   GRBM_STATUS_SE0           = 0x00000006
[815129.563133] radeon 0000:01:00.0:   GRBM_STATUS_SE1           = 0x00000006
[815129.563135] radeon 0000:01:00.0:   SRBM_STATUS               = 0x200000C0
[815129.563246] radeon 0000:01:00.0:   SRBM_STATUS2              = 0x00000000
[815129.563249] radeon 0000:01:00.0:   R_008674_CP_STALLED_STAT1 = 0x00000000
[815129.563251] radeon 0000:01:00.0:   R_008678_CP_STALLED_STAT2 = 0x00000000
[815129.563253] radeon 0000:01:00.0:   R_00867C_CP_BUSY_STAT     = 0x00000000
[815129.563255] radeon 0000:01:00.0:   R_008680_CP_STAT          = 0x00000000
[815129.563257] radeon 0000:01:00.0:   R_00D034_DMA_STATUS_REG   = 0x44C83D57
[815129.563260] radeon 0000:01:00.0:   R_00D834_DMA_STATUS_REG   = 0x44C83D57
[815129.563506] radeon 0000:01:00.0: GPU reset succeeded, trying to resume
[815129.576411] [drm] probing gen 2 caps for device 1002:5a16 = 31cd02/0
[815129.576415] [drm] PCIE gen 2 link speeds already enabled
[815129.580147] [drm] PCIE GART of 1024M enabled (table at 0x0000000000276000).
[815129.580250] radeon 0000:01:00.0: WB enabled
[815129.580253] radeon 0000:01:00.0: fence driver on ring 0 use gpu addr
0x0000000080000c00 and cpu addr 0xffff880414545c00
[815129.580255] radeon 0000:01:00.0: fence driver on ring 1 use gpu addr
0x0000000080000c04 and cpu addr 0xffff880414545c04
[815129.580257] radeon 0000:01:00.0: fence driver on ring 2 use gpu addr
0x0000000080000c08 and cpu addr 0xffff880414545c08
[815129.580259] radeon 0000:01:00.0: fence driver on ring 3 use gpu addr
0x0000000080000c0c and cpu addr 0xffff880414545c0c
[815129.580261] radeon 0000:01:00.0: fence driver on ring 4 use gpu addr
0x0000000080000c10 and cpu addr 0xffff880414545c10
[815129.581232] radeon 0000:01:00.0: fence driver on ring 5 use gpu addr
0x0000000000075a18 and cpu addr 0xffffc90012135a18
[815129.767993] [drm] ring test on 0 succeeded in 3 usecs
[815129.767999] [drm] ring test on 1 succeeded in 1 usecs
[815129.768004] [drm] ring test on 2 succeeded in 1 usecs
[815129.768068] [drm] ring test on 3 succeeded in 2 usecs
[815129.768077] [drm] ring test on 4 succeeded in 1 usecs
[815129.945157] [drm] ring test on 5 succeeded in 2 usecs
[815129.945164] [drm] UVD initialized successfully.
[815129.946125] [drm] ib test on ring 0 succeeded in 0 usecs
[815129.946210] [drm] ib test on ring 1 succeeded in 0 usecs
[815129.946301] [drm] ib test on ring 2 succeeded in 0 usecs
[815129.946345] [drm] ib test on ring 3 succeeded in 0 usecs
[815129.946380] [drm] ib test on ring 4 succeeded in 0 usecs
[815137.847012] SysRq : Emergency Sync
[815137.965713] Emergency Sync complete
[815139.742325] SysRq : Emergency Sync
[815139.864190] Emergency Sync complete
[815140.093163] radeon 0000:01:00.0: ring 5 stalled for more than 10000msec
[815140.093173] radeon 0000:01:00.0: GPU lockup (waiting for 0x0000000000000004
last fence id 0x0000000000000002 on ring 5)
[815140.093179] [drm:uvd_v1_0_ib_test] *ERROR* radeon: fence wait failed (-35).
[815140.093188] [drm:radeon_ib_ring_tests] *ERROR* radeon: failed testing IB on
ring 5 (-35).
[815140.093217] [drm:radeon_pm_resume_dpm] *ERROR* radeon: dpm resume failed
===cut===
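
Each "GPU lockup" line in the log above carries two fence sequence numbers: the one being waited for and the last one that was signaled. The Python sketch below pulls them out and prints the difference. `parse_lockups` and `SAMPLE` are hypothetical names, and reading the difference as "fences submitted but not yet signaled" is an interpretation of the message text, not something confirmed in this thread. The pattern allows for the mail line wrap in the middle of the message.

```python
import re

# \s+ between the two halves tolerates the line wrap added by the mailer.
LOCKUP_RE = re.compile(
    r"GPU lockup \(waiting for (0x[0-9a-fA-F]+)\s+"
    r"last fence id (0x[0-9a-fA-F]+) on ring (\d+)\)"
)

# Lines copied from the log above.
SAMPLE = """\
[815116.935556] radeon 0000:01:00.0: GPU lockup (waiting for 0x0000000007f39f60
last fence id 0x0000000007f39f5f on ring 0)
[815128.453477] radeon 0000:01:00.0: GPU lockup (waiting for 0x0000000007f39fad
last fence id 0x0000000007f39f5f on ring 0)
[815140.093173] radeon 0000:01:00.0: GPU lockup (waiting for 0x0000000000000004
last fence id 0x0000000000000002 on ring 5)
"""


def parse_lockups(log_text: str):
    """Return a list of (ring, waited_for, last_signaled, gap) tuples."""
    results = []
    for m in LOCKUP_RE.finditer(log_text):
        waited = int(m.group(1), 16)
        last = int(m.group(2), 16)
        results.append((int(m.group(3)), waited, last, waited - last))
    return results


if __name__ == "__main__":
    for ring, waited, last, gap in parse_lockups(SAMPLE):
        print(f"ring {ring}: waiting for {waited:#x}, "
              f"last signaled {last:#x}, gap {gap}")
```

On the sample, the last signaled fence on ring 0 is the same value in both reports, so that ring made no progress between the first reset and the failed IB test.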

^ permalink raw reply	[flat|nested] 25+ messages in thread

* [Bug 78221] 3.16 RC1: AMD R9 270 GPU locks up on some heavy 2D activity - GPU VM fault occurs. (possibly DMA copying issue strikes back?)
  2014-06-18  2:20 [Bug 78221] New: 3.16 RC1: AMD R9 270 GPU locks up on some heavy 2D activity - GPU VM fault occurs. (possibly DMA copying issue strikes back?) bugzilla-daemon
                   ` (19 preceding siblings ...)
  2014-09-08 12:19 ` bugzilla-daemon
@ 2014-09-08 12:22 ` bugzilla-daemon
  2014-09-09  3:09 ` bugzilla-daemon
                   ` (2 subsequent siblings)
  23 siblings, 0 replies; 25+ messages in thread
From: bugzilla-daemon @ 2014-09-08 12:22 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=78221

--- Comment #21 from t3st3r@mail.ru ---
2) About 3.17: I tried 3.17-rc1 and it crashed after about 30 seconds of
running the problematic workload.

I will try newer -RCs as well, since I can see there were some extra changes to
the radeon-related code.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* [Bug 78221] 3.16 RC1: AMD R9 270 GPU locks up on some heavy 2D activity - GPU VM fault occurs. (possibly DMA copying issue strikes back?)
  2014-06-18  2:20 [Bug 78221] New: 3.16 RC1: AMD R9 270 GPU locks up on some heavy 2D activity - GPU VM fault occurs. (possibly DMA copying issue strikes back?) bugzilla-daemon
                   ` (20 preceding siblings ...)
  2014-09-08 12:22 ` bugzilla-daemon
@ 2014-09-09  3:09 ` bugzilla-daemon
  2014-09-30  4:03 ` bugzilla-daemon
  2015-07-10 23:38 ` bugzilla-daemon
  23 siblings, 0 replies; 25+ messages in thread
From: bugzilla-daemon @ 2014-09-09  3:09 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=78221

--- Comment #22 from t3st3r@mail.ru ---
I tested 3.17-rc4. Result: it crashed after about 3 minutes of running (see
below).

Are some stability fixes missing from 3.17-rc4 mainline? At first glance I do
not see any radeon-related commits in drm-fixes which haven't made it into
-rc4. Am I missing something?

===cut===
 kernel: [  599.949295] radeon 0000:01:00.0: ring 3 stalled for more than
10167msec
 kernel: [  599.949305] radeon 0000:01:00.0: GPU lockup (waiting for
0x0000000000001eb0 last fence id 0x0000000000001eaf on ring 3)
 kernel: [  599.949312] radeon 0000:01:00.0: scheduling IB failed (-35).
 kernel: [  600.507409] AMD-Vi: Event logged [IO_PAGE_FAULT device=01:00.0
domain=0x0018 address=0x000000008040a840 flags=0x0010]
 kernel: [  600.507420] AMD-Vi: Event logged [IO_PAGE_FAULT device=01:00.0
domain=0x0018 address=0x000000008040a870 flags=0x0030]
 kernel: [  600.507426] AMD-Vi: Event logged [IO_PAGE_FAULT device=01:00.0
domain=0x0018 address=0x0000000080000100 flags=0x0030]
 kernel: [  600.507431] AMD-Vi: Event logged [IO_PAGE_FAULT device=01:00.0
domain=0x0018 address=0x000000008040a700 flags=0x0010]
 kernel: [  600.507460] radeon 0000:01:00.0: Saved 19308 dwords of commands on
ring 0.
 kernel: [  600.507590] radeon 0000:01:00.0: GPU softreset: 0x0000006C
 kernel: [  600.507593] radeon 0000:01:00.0:   GRBM_STATUS               =
0xA0003028
 kernel: [  600.507596] radeon 0000:01:00.0:   GRBM_STATUS_SE0           =
0x00000006
 kernel: [  600.507598] radeon 0000:01:00.0:   GRBM_STATUS_SE1           =
0x00000006
 kernel: [  600.507600] radeon 0000:01:00.0:   SRBM_STATUS               =
0x200000C0
 kernel: [  600.507711] radeon 0000:01:00.0:   SRBM_STATUS2              =
0x00000000
 kernel: [  600.507714] radeon 0000:01:00.0:   R_008674_CP_STALLED_STAT1 =
0x00000000
 kernel: [  600.507716] radeon 0000:01:00.0:   R_008678_CP_STALLED_STAT2 =
0x00010000
 kernel: [  600.507718] radeon 0000:01:00.0:   R_00867C_CP_BUSY_STAT     =
0x00000002
 kernel: [  600.507720] radeon 0000:01:00.0:   R_008680_CP_STAT          =
0x80010243
 kernel: [  600.507723] radeon 0000:01:00.0:   R_00D034_DMA_STATUS_REG   =
0x44483106
 kernel: [  600.507725] radeon 0000:01:00.0:   R_00D834_DMA_STATUS_REG   =
0x44E84266
 kernel: [  600.507728] radeon 0000:01:00.0:  
VM_CONTEXT1_PROTECTION_FAULT_ADDR   0x00000000
 kernel: [  600.507730] radeon 0000:01:00.0:  
VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x00000000
 kernel: [  601.054357] radeon 0000:01:00.0: GRBM_SOFT_RESET=0x0000DDFF
 kernel: [  601.054411] radeon 0000:01:00.0: SRBM_SOFT_RESET=0x00100140
 kernel: [  601.055568] radeon 0000:01:00.0:   GRBM_STATUS               =
0x00003028
 kernel: [  601.055571] radeon 0000:01:00.0:   GRBM_STATUS_SE0           =
0x00000006
 kernel: [  601.055573] radeon 0000:01:00.0:   GRBM_STATUS_SE1           =
0x00000006
 kernel: [  601.055575] radeon 0000:01:00.0:   SRBM_STATUS               =
0x20000AC0
 kernel: [  601.055686] radeon 0000:01:00.0:   SRBM_STATUS2              =
0x00000000
 kernel: [  601.055689] radeon 0000:01:00.0:   R_008674_CP_STALLED_STAT1 =
0x00000000
 kernel: [  601.055691] radeon 0000:01:00.0:   R_008678_CP_STALLED_STAT2 =
0x00000000
 kernel: [  601.055693] radeon 0000:01:00.0:   R_00867C_CP_BUSY_STAT     =
0x00000000
 kernel: [  601.055695] radeon 0000:01:00.0:   R_008680_CP_STAT          =
0x00000000
 kernel: [  601.055698] radeon 0000:01:00.0:   R_00D034_DMA_STATUS_REG   =
0x44C83D57
 kernel: [  601.055700] radeon 0000:01:00.0:   R_00D834_DMA_STATUS_REG   =
0x44C83D57
 kernel: [  601.055951] radeon 0000:01:00.0: GPU reset succeeded, trying to
resume
 kernel: [  601.083744] [drm] probing gen 2 caps for device 1002:5a16 =
31cd02/0
 kernel: [  601.083747] [drm] PCIE gen 2 link speeds already enabled
 kernel: [  601.084938] [drm] PCIE GART of 1024M enabled (table at
0x0000000000276000).
 kernel: [  601.085046] radeon 0000:01:00.0: WB enabled
 kernel: [  601.085049] radeon 0000:01:00.0: fence driver on ring 0 use gpu
addr 0x0000000080000c00 and cpu addr 0xffff880413fbec00
 kernel: [  601.085052] radeon 0000:01:00.0: fence driver on ring 1 use gpu
addr 0x0000000080000c04 and cpu addr 0xffff880413fbec04
 kernel: [  601.085054] radeon 0000:01:00.0: fence driver on ring 2 use gpu
addr 0x0000000080000c08 and cpu addr 0xffff880413fbec08
 kernel: [  601.085056] radeon 0000:01:00.0: fence driver on ring 3 use gpu
addr 0x0000000080000c0c and cpu addr 0xffff880413fbec0c
 kernel: [  601.085057] radeon 0000:01:00.0: fence driver on ring 4 use gpu
addr 0x0000000080000c10 and cpu addr 0xffff880413fbec10
 kernel: [  601.086030] radeon 0000:01:00.0: fence driver on ring 5 use gpu
addr 0x0000000000075a18 and cpu addr 0xffffc90011db5a18
 kernel: [  601.271000] [drm] ring test on 0 succeeded in 3 usecs
 kernel: [  601.271006] [drm] ring test on 1 succeeded in 1 usecs
 kernel: [  601.271011] [drm] ring test on 2 succeeded in 1 usecs
 kernel: [  601.271075] [drm] ring test on 3 succeeded in 2 usecs
 kernel: [  601.271084] [drm] ring test on 4 succeeded in 1 usecs
 kernel: [  601.448164] [drm] ring test on 5 succeeded in 2 usecs
 kernel: [  601.448172] [drm] UVD initialized successfully.
 kernel: [  611.444226] radeon 0000:01:00.0: ring 0 stalled for more than
10000msec
 kernel: [  611.444237] radeon 0000:01:00.0: GPU lockup (waiting for
0x000000000001a60a last fence id 0x000000000001a4dd on ring 0)
 kernel: [  611.444244] [drm:r600_ib_test] *ERROR* radeon: fence wait failed
(-35).
 kernel: [  611.444252] [drm:radeon_ib_ring_tests] *ERROR* radeon: failed
testing IB on GFX ring (-35).
 kernel: [  611.444257] radeon 0000:01:00.0: ib ring test failed (-35).
 kernel: [  611.997330] radeon 0000:01:00.0: GPU softreset: 0x00000048
 kernel: [  611.997333] radeon 0000:01:00.0:   GRBM_STATUS               =
0xA0003028
 kernel: [  611.997336] radeon 0000:01:00.0:   GRBM_STATUS_SE0           =
0x00000006
 kernel: [  611.997338] radeon 0000:01:00.0:   GRBM_STATUS_SE1           =
0x00000006
 kernel: [  611.997341] radeon 0000:01:00.0:   SRBM_STATUS               =
0x200000C0
 kernel: [  611.997452] radeon 0000:01:00.0:   SRBM_STATUS2              =
0x00000000
 kernel: [  611.997454] radeon 0000:01:00.0:   R_008674_CP_STALLED_STAT1 =
0x00000000
 kernel: [  611.997456] radeon 0000:01:00.0:   R_008678_CP_STALLED_STAT2 =
0x00010000
 kernel: [  611.997458] radeon 0000:01:00.0:   R_00867C_CP_BUSY_STAT     =
0x00400002
 kernel: [  611.997461] radeon 0000:01:00.0:   R_008680_CP_STAT          =
0x84010243
 kernel: [  611.997463] radeon 0000:01:00.0:   R_00D034_DMA_STATUS_REG   =
0x44C83D57
 kernel: [  611.997465] radeon 0000:01:00.0:   R_00D834_DMA_STATUS_REG   =
0x44C83D57
 kernel: [  611.997468] radeon 0000:01:00.0:  
VM_CONTEXT1_PROTECTION_FAULT_ADDR   0x00000000
 kernel: [  611.997470] radeon 0000:01:00.0:  
VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x00000000
 kernel: [  612.542126] radeon 0000:01:00.0: GRBM_SOFT_RESET=0x0000DDFF
 kernel: [  612.542180] radeon 0000:01:00.0: SRBM_SOFT_RESET=0x00000100
 kernel: [  612.543338] radeon 0000:01:00.0:   GRBM_STATUS               =
0x00003028
 kernel: [  612.543340] radeon 0000:01:00.0:   GRBM_STATUS_SE0           =
0x00000006
 kernel: [  612.543343] radeon 0000:01:00.0:   GRBM_STATUS_SE1           =
0x00000006
 kernel: [  612.543345] radeon 0000:01:00.0:   SRBM_STATUS               =
0x200000C0
 kernel: [  612.543456] radeon 0000:01:00.0:   SRBM_STATUS2              =
0x00000000
 kernel: [  612.543458] radeon 0000:01:00.0:   R_008674_CP_STALLED_STAT1 =
0x00000000
 kernel: [  612.543460] radeon 0000:01:00.0:   R_008678_CP_STALLED_STAT2 =
0x00000000
 kernel: [  612.543462] radeon 0000:01:00.0:   R_00867C_CP_BUSY_STAT     =
0x00000000
 kernel: [  612.543465] radeon 0000:01:00.0:   R_008680_CP_STAT          =
0x00000000
 kernel: [  612.543467] radeon 0000:01:00.0:   R_00D034_DMA_STATUS_REG   =
0x44C83D57
 kernel: [  612.543469] radeon 0000:01:00.0:   R_00D834_DMA_STATUS_REG   =
0x44C83D57
 kernel: [  612.543724] radeon 0000:01:00.0: GPU reset succeeded, trying to
resume
 kernel: [  612.556911] [drm] probing gen 2 caps for device 1002:5a16 =
31cd02/0
 kernel: [  612.556915] [drm] PCIE gen 2 link speeds already enabled
 kernel: [  612.558107] [drm] PCIE GART of 1024M enabled (table at
0x0000000000276000).
 kernel: [  612.558216] radeon 0000:01:00.0: WB enabled
 kernel: [  612.558219] radeon 0000:01:00.0: fence driver on ring 0 use gpu
addr 0x0000000080000c00 and cpu addr 0xffff880413fbec00
 kernel: [  612.558222] radeon 0000:01:00.0: fence driver on ring 1 use gpu
addr 0x0000000080000c04 and cpu addr 0xffff880413fbec04
 kernel: [  612.558224] radeon 0000:01:00.0: fence driver on ring 2 use gpu
addr 0x0000000080000c08 and cpu addr 0xffff880413fbec08
 kernel: [  612.558226] radeon 0000:01:00.0: fence driver on ring 3 use gpu
addr 0x0000000080000c0c and cpu addr 0xffff880413fbec0c
 kernel: [  612.558228] radeon 0000:01:00.0: fence driver on ring 4 use gpu
addr 0x0000000080000c10 and cpu addr 0xffff880413fbec10
 kernel: [  612.559203] radeon 0000:01:00.0: fence driver on ring 5 use gpu
addr 0x0000000000075a18 and cpu addr 0xffffc90011db5a18
 kernel: [  612.744297] [drm] ring test on 0 succeeded in 3 usecs
 kernel: [  612.744302] [drm] ring test on 1 succeeded in 1 usecs
 kernel: [  612.744308] [drm] ring test on 2 succeeded in 1 usecs
 kernel: [  612.744371] [drm] ring test on 3 succeeded in 2 usecs
 kernel: [  612.744380] [drm] ring test on 4 succeeded in 1 usecs
 kernel: [  612.921464] [drm] ring test on 5 succeeded in 2 usecs
 kernel: [  612.921472] [drm] UVD initialized successfully.
 kernel: [  612.921539] [drm] ib test on ring 0 succeeded in 0 usecs
 kernel: [  612.921634] [drm] ib test on ring 1 succeeded in 0 usecs
 kernel: [  612.921722] [drm] ib test on ring 2 succeeded in 0 usecs
 kernel: [  612.921762] [drm] ib test on ring 3 succeeded in 0 usecs
 kernel: [  612.921796] [drm] ib test on ring 4 succeeded in 0 usecs
 kernel: [  623.068910] radeon 0000:01:00.0: ring 5 stalled for more than
10000msec
 kernel: [  623.068921] radeon 0000:01:00.0: GPU lockup (waiting for
0x0000000000000004 last fence id 0x0000000000000002 on ring 5)
 kernel: [  623.068927] [drm:uvd_v1_0_ib_test] *ERROR* radeon: fence wait
failed (-35).
 kernel: [  623.068935] [drm:radeon_ib_ring_tests] *ERROR* radeon: failed
testing IB on ring 5 (-35).
 kernel: [  623.098333] radeon 0000:01:00.0: GPU fault detected: 146 0x07a23d0c
 kernel: [  623.098342] radeon 0000:01:00.0:  
VM_CONTEXT1_PROTECTION_FAULT_ADDR   0x0000BDBD
 kernel: [  623.098347] radeon 0000:01:00.0:  
VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x0203D00C
 kernel: [  623.098352] VM fault (0x0c, vmid 1) at page 48573, read from DMA1
(61)
 kernel: [  623.098364] radeon 0000:01:00.0: GPU fault detected: 146 0x07c23d0c
 kernel: [  623.098368] radeon 0000:01:00.0:  
VM_CONTEXT1_PROTECTION_FAULT_ADDR   0x00000000
 kernel: [  623.098372] radeon 0000:01:00.0:  
VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x0208400C
 kernel: [  623.098377] VM fault (0x0c, vmid 1) at page 0, read from TC (132)
 kernel: [  623.098383] radeon 0000:01:00.0: GPU fault detected: 146 0x07e23d0c
 kernel: [  623.098387] radeon 0000:01:00.0:  
VM_CONTEXT1_PROTECTION_FAULT_ADDR   0x0000BDBC
 kernel: [  623.098391] radeon 0000:01:00.0:  
VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x0200800C
 kernel: [  623.098395] VM fault (0x0c, vmid 1) at page 48572, read from TC (8)
 kernel: [  623.128770] radeon 0000:01:00.0: GPU fault detected: 146 0x06033d14
 kernel: [  623.128781] radeon 0000:01:00.0:  
VM_CONTEXT1_PROTECTION_FAULT_ADDR   0x0000BDB0
 kernel: [  623.128787] radeon 0000:01:00.0:  
VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x0303D014
 kernel: [  623.128793] VM fault (0x04, vmid 1) at page 48560, write from DMA1
(61)
 kernel: [  623.128820] radeon 0000:01:00.0: GPU fault detected: 146 0x06033d14
 kernel: [  623.128825] radeon 0000:01:00.0:  
VM_CONTEXT1_PROTECTION_FAULT_ADDR   0x00000000
 kernel: [  623.128830] radeon 0000:01:00.0:  
VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x0204400C
 kernel: [  623.128835] VM fault (0x0c, vmid 1) at page 0, read from TC (68)
 kernel: [  623.128842] radeon 0000:01:00.0: GPU fault detected: 146 0x06033d14
 kernel: [  623.128847] radeon 0000:01:00.0:  
VM_CONTEXT1_PROTECTION_FAULT_ADDR   0x0000BDB8
 kernel: [  623.128852] radeon 0000:01:00.0:  
VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x0204400C
 kernel: [  623.128857] VM fault (0x0c, vmid 1) at page 48568, read from TC
(68)
 kernel: [  623.129932] radeon 0000:01:00.0: GPU fault detected: 146 0x06033d14
 kernel: [  623.129940] radeon 0000:01:00.0:  
VM_CONTEXT1_PROTECTION_FAULT_ADDR   0x0000BDB0
 kernel: [  623.129944] radeon 0000:01:00.0:  
VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x0303D014
 kernel: [  623.129948] VM fault (0x04, vmid 1) at page 48560, write from DMA1
(61)
 kernel: [  623.129965] radeon 0000:01:00.0: GPU fault detected: 146 0x06233d14
===cut===
Note: several megabytes of similar "VM fault" flood skipped.
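
A flood of that size is easier to compare across runs as a summary than as raw text. The Python sketch below counts the "VM fault" summary lines per page and memory client. `summarize_vm_faults` and `SAMPLE` are hypothetical names, and the pattern is written against the line format seen in this report only, including the mail line wrap before the client id; other kernel versions may print a different format.

```python
import re
from collections import Counter

# \s+ tolerates mail line wrapping such as "read from DMA1\n(61)".
FAULT_RE = re.compile(
    r"VM fault \(0x[0-9a-fA-F]+, vmid (\d+)\) at page (\d+),\s+"
    r"(read|write) from\s+(\S+)\s+\((\d+)\)"
)

# Lines copied from the log above.
SAMPLE = """\
 kernel: [  623.098352] VM fault (0x0c, vmid 1) at page 48573, read from DMA1
(61)
 kernel: [  623.098377] VM fault (0x0c, vmid 1) at page 0, read from TC (132)
 kernel: [  623.128793] VM fault (0x04, vmid 1) at page 48560, write from DMA1
(61)
 kernel: [  623.129948] VM fault (0x04, vmid 1) at page 48560, write from DMA1
(61)
"""


def summarize_vm_faults(log_text: str) -> Counter:
    """Count faults per (vmid, page, access, client block, client id)."""
    counts = Counter()
    for m in FAULT_RE.finditer(log_text):
        vmid, page, access, block, client = m.groups()
        counts[(int(vmid), int(page), access, block, int(client))] += 1
    return counts


if __name__ == "__main__":
    # Most frequent fault signature first.
    for key, n in summarize_vm_faults(SAMPLE).most_common():
        vmid, page, access, block, client = key
        print(f"{n:6d}x vmid {vmid} page {page} {access} from {block} ({client})")
```

Feeding it the full dmesg or syslog text instead of `SAMPLE` gives one line per distinct fault signature, which is small enough to attach to a report.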

^ permalink raw reply	[flat|nested] 25+ messages in thread

* [Bug 78221] 3.16 RC1: AMD R9 270 GPU locks up on some heavy 2D activity - GPU VM fault occurs. (possibly DMA copying issue strikes back?)
  2014-06-18  2:20 [Bug 78221] New: 3.16 RC1: AMD R9 270 GPU locks up on some heavy 2D activity - GPU VM fault occurs. (possibly DMA copying issue strikes back?) bugzilla-daemon
                   ` (21 preceding siblings ...)
  2014-09-09  3:09 ` bugzilla-daemon
@ 2014-09-30  4:03 ` bugzilla-daemon
  2015-07-10 23:38 ` bugzilla-daemon
  23 siblings, 0 replies; 25+ messages in thread
From: bugzilla-daemon @ 2014-09-30  4:03 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=78221

Jean-Michel Smith <jean.michel.sm@gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jean.michel.sm@gmail.com

--- Comment #23 from Jean-Michel Smith <jean.michel.sm@gmail.com> ---
I've seen this bug as well, through quite a few versions of 3.15 and 3.16.
Sometimes it just freezes X; other times it hangs the entire system. Here is
the output from the last hang (I was able to log in remotely, as this time it
didn't completely crash the system):

01:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI]
Curacao XT [Radeon R9 270X]

(uname -a)
Linux prime 3.16.3-gentoo #1 SMP PREEMPT Thu Sep 18 20:59:58 CDT 2014 x86_64
Intel(R) Core(TM) i7-3930K CPU @ 3.20GHz GenuineIntel GNU/Linux

(lsmod)
cfbfillrect             3634  1 radeon
cfbimgblt               2055  1 radeon
cfbcopyarea             3110  1 radeon
i2c_algo_bit            5055  1 radeon
drm_kms_helper         33715  1 radeon
ttm                    59052  1 radeon
drm                   226864  6 ttm,drm_kms_helper,radeon
firmware_class          8187  1 radeon
radeon               1258462  3 

(relevant dmesg info)

[120499.589293] radeon 0000:01:00.0: ring 0 stalled for more than 10473msec
[120499.589296] radeon 0000:01:00.0: GPU lockup (waiting for 0x00000000000783d0
last fence id 0x00000000000783cf on ring 0)
[120499.589299] radeon 0000:01:00.0: failed to get a new IB (-35)
[120500.099613] radeon 0000:01:00.0: Saved 3600 dwords of commands on ring 0.
[120500.099743] radeon 0000:01:00.0: GPU softreset: 0x0000006C
[120500.099746] radeon 0000:01:00.0:   GRBM_STATUS               = 0xA0003028
[120500.099748] radeon 0000:01:00.0:   GRBM_STATUS_SE0           = 0x00000006
[120500.099750] radeon 0000:01:00.0:   GRBM_STATUS_SE1           = 0x00000006
[120500.099751] radeon 0000:01:00.0:   SRBM_STATUS               = 0x20000AC0
[120500.099862] radeon 0000:01:00.0:   SRBM_STATUS2              = 0x00000000
[120500.099864] radeon 0000:01:00.0:   R_008674_CP_STALLED_STAT1 = 0x00000000
[120500.099866] radeon 0000:01:00.0:   R_008678_CP_STALLED_STAT2 = 0x00010000
[120500.099868] radeon 0000:01:00.0:   R_00867C_CP_BUSY_STAT     = 0x00000002
[120500.099870] radeon 0000:01:00.0:   R_008680_CP_STAT          = 0x80010243
[120500.099872] radeon 0000:01:00.0:   R_00D034_DMA_STATUS_REG   = 0x44C83146
[120500.099874] radeon 0000:01:00.0:   R_00D834_DMA_STATUS_REG   = 0x44E84266
[120500.099876] radeon 0000:01:00.0:   VM_CONTEXT1_PROTECTION_FAULT_ADDR   0x00000000
[120500.099879] radeon 0000:01:00.0:   VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x00000000
[120500.592138] radeon 0000:01:00.0: GRBM_SOFT_RESET=0x0000DDFF
[120500.592192] radeon 0000:01:00.0: SRBM_SOFT_RESET=0x00100140
[120500.593350] radeon 0000:01:00.0:   GRBM_STATUS               = 0x00003028
[120500.593352] radeon 0000:01:00.0:   GRBM_STATUS_SE0           = 0x00000006
[120500.593354] radeon 0000:01:00.0:   GRBM_STATUS_SE1           = 0x00000006
[120500.593356] radeon 0000:01:00.0:   SRBM_STATUS               = 0x20000AC0
[120500.593466] radeon 0000:01:00.0:   SRBM_STATUS2              = 0x00000000
[120500.593468] radeon 0000:01:00.0:   R_008674_CP_STALLED_STAT1 = 0x00000000
[120500.593470] radeon 0000:01:00.0:   R_008678_CP_STALLED_STAT2 = 0x00000000
[120500.593472] radeon 0000:01:00.0:   R_00867C_CP_BUSY_STAT     = 0x00000000
[120500.593473] radeon 0000:01:00.0:   R_008680_CP_STAT          = 0x00000000
[120500.593475] radeon 0000:01:00.0:   R_00D034_DMA_STATUS_REG   = 0x44C83D57
[120500.593477] radeon 0000:01:00.0:   R_00D834_DMA_STATUS_REG   = 0x44C83D57
[120500.593718] radeon 0000:01:00.0: GPU reset succeeded, trying to resume
[120500.621478] [drm] probing gen 2 caps for device 8086:3c04 = 7a7103/e
[120500.621482] [drm] PCIE gen 3 link speeds already enabled
[120500.623908] [drm] PCIE GART of 1024M enabled (table at 0x0000000000276000).
[120500.624051] radeon 0000:01:00.0: WB enabled
[120500.624054] radeon 0000:01:00.0: fence driver on ring 0 use gpu addr 0x0000000100000c00 and cpu addr 0xffff8807fb4aac00
[120500.624056] radeon 0000:01:00.0: fence driver on ring 1 use gpu addr 0x0000000100000c04 and cpu addr 0xffff8807fb4aac04
[120500.624058] radeon 0000:01:00.0: fence driver on ring 2 use gpu addr 0x0000000100000c08 and cpu addr 0xffff8807fb4aac08
[120500.624059] radeon 0000:01:00.0: fence driver on ring 3 use gpu addr 0x0000000100000c0c and cpu addr 0xffff8807fb4aac0c
[120500.624061] radeon 0000:01:00.0: fence driver on ring 4 use gpu addr 0x0000000100000c10 and cpu addr 0xffff8807fb4aac10
[120500.624680] radeon 0000:01:00.0: fence driver on ring 5 use gpu addr 0x0000000000075a18 and cpu addr 0xffffc900142b5a18
[120500.789277] [drm] ring test on 0 succeeded in 3 usecs
[120500.789283] [drm] ring test on 1 succeeded in 1 usecs
[120500.789287] [drm] ring test on 2 succeeded in 1 usecs
[120500.789351] [drm] ring test on 3 succeeded in 2 usecs
[120500.789361] [drm] ring test on 4 succeeded in 1 usecs
[120500.981448] [drm] ring test on 5 succeeded in 2 usecs
[120500.981456] [drm] UVD initialized successfully.
[120510.981602] radeon 0000:01:00.0: ring 0 stalled for more than 10002msec
[120510.981604] radeon 0000:01:00.0: GPU lockup (waiting for 0x0000000000078407 last fence id 0x00000000000783cf on ring 0)
[120510.981606] [drm:r600_ib_test] *ERROR* radeon: fence wait failed (-35).
[120510.981608] [drm:radeon_ib_ring_tests] *ERROR* radeon: failed testing IB on GFX ring (-35).
[120510.981609] radeon 0000:01:00.0: ib ring test failed (-35).
[120511.461309] radeon 0000:01:00.0: GPU softreset: 0x00000048
[120511.461310] radeon 0000:01:00.0:   GRBM_STATUS               = 0xA0003028
[120511.461312] radeon 0000:01:00.0:   GRBM_STATUS_SE0           = 0x00000006
[120511.461313] radeon 0000:01:00.0:   GRBM_STATUS_SE1           = 0x00000006
[120511.461314] radeon 0000:01:00.0:   SRBM_STATUS               = 0x200000C0
[120511.461428] radeon 0000:01:00.0:   SRBM_STATUS2              = 0x00000000
[120511.461429] radeon 0000:01:00.0:   R_008674_CP_STALLED_STAT1 = 0x00000000
[120511.461431] radeon 0000:01:00.0:   R_008678_CP_STALLED_STAT2 = 0x00010000
[120511.461432] radeon 0000:01:00.0:   R_00867C_CP_BUSY_STAT     = 0x00000002
[120511.461434] radeon 0000:01:00.0:   R_008680_CP_STAT          = 0x80010243
[120511.461435] radeon 0000:01:00.0:   R_00D034_DMA_STATUS_REG   = 0x44C83D57
[120511.461437] radeon 0000:01:00.0:   R_00D834_DMA_STATUS_REG   = 0x44C83D57
[120511.461439] radeon 0000:01:00.0:   VM_CONTEXT1_PROTECTION_FAULT_ADDR   0x00000000
[120511.461440] radeon 0000:01:00.0:   VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x00000000
[120511.933287] radeon 0000:01:00.0: GRBM_SOFT_RESET=0x0000DDFF
[120511.933340] radeon 0000:01:00.0: SRBM_SOFT_RESET=0x00000100
[120511.934495] radeon 0000:01:00.0:   GRBM_STATUS               = 0x00003028
[120511.934496] radeon 0000:01:00.0:   GRBM_STATUS_SE0           = 0x00000006
[120511.934498] radeon 0000:01:00.0:   GRBM_STATUS_SE1           = 0x00000006
[120511.934499] radeon 0000:01:00.0:   SRBM_STATUS               = 0x200000C0
[120511.934609] radeon 0000:01:00.0:   SRBM_STATUS2              = 0x00000000
[120511.934610] radeon 0000:01:00.0:   R_008674_CP_STALLED_STAT1 = 0x00000000
[120511.934612] radeon 0000:01:00.0:   R_008678_CP_STALLED_STAT2 = 0x00000000
[120511.934613] radeon 0000:01:00.0:   R_00867C_CP_BUSY_STAT     = 0x00000000
[120511.934614] radeon 0000:01:00.0:   R_008680_CP_STAT          = 0x00000000
[120511.934616] radeon 0000:01:00.0:   R_00D034_DMA_STATUS_REG   = 0x44C83D57
[120511.934617] radeon 0000:01:00.0:   R_00D834_DMA_STATUS_REG   = 0x44C83D57
[120511.934857] radeon 0000:01:00.0: GPU reset succeeded, trying to resume
[120511.945176] [drm] probing gen 2 caps for device 8086:3c04 = 7a7103/e
[120511.945179] [drm] PCIE gen 3 link speeds already enabled
[120511.947127] [drm] PCIE GART of 1024M enabled (table at 0x0000000000276000).
[120511.947253] radeon 0000:01:00.0: WB enabled
[120511.947255] radeon 0000:01:00.0: fence driver on ring 0 use gpu addr 0x0000000100000c00 and cpu addr 0xffff8807fb4aac00
[120511.947256] radeon 0000:01:00.0: fence driver on ring 1 use gpu addr 0x0000000100000c04 and cpu addr 0xffff8807fb4aac04
[120511.947257] radeon 0000:01:00.0: fence driver on ring 2 use gpu addr 0x0000000100000c08 and cpu addr 0xffff8807fb4aac08
[120511.947258] radeon 0000:01:00.0: fence driver on ring 3 use gpu addr 0x0000000100000c0c and cpu addr 0xffff8807fb4aac0c
[120511.947259] radeon 0000:01:00.0: fence driver on ring 4 use gpu addr 0x0000000100000c10 and cpu addr 0xffff8807fb4aac10
[120511.947868] radeon 0000:01:00.0: fence driver on ring 5 use gpu addr 0x0000000000075a18 and cpu addr 0xffffc900142b5a18
[120512.109348] [drm] ring test on 0 succeeded in 4 usecs
[120512.109352] [drm] ring test on 1 succeeded in 1 usecs
[120512.109355] [drm] ring test on 2 succeeded in 1 usecs
[120512.109417] [drm] ring test on 3 succeeded in 2 usecs
[120512.109426] [drm] ring test on 4 succeeded in 1 usecs
[120512.286478] [drm] ring test on 5 succeeded in 2 usecs
[120512.286483] [drm] UVD initialized successfully.
[120512.286534] [drm] ib test on ring 0 succeeded in 0 usecs
[120512.286580] [drm] ib test on ring 1 succeeded in 0 usecs
[120512.286623] [drm] ib test on ring 2 succeeded in 0 usecs
[120512.286648] [drm] ib test on ring 3 succeeded in 0 usecs
[120512.286672] [drm] ib test on ring 4 succeeded in 0 usecs
[120522.435679] radeon 0000:01:00.0: ring 5 stalled for more than 10000msec
[120522.435685] radeon 0000:01:00.0: GPU lockup (waiting for 0x0000000000000004 last fence id 0x0000000000000002 on ring 5)
[120522.435688] [drm:uvd_v1_0_ib_test] *ERROR* radeon: fence wait failed (-35).
[120522.435695] [drm:radeon_ib_ring_tests] *ERROR* radeon: failed testing IB on ring 5 (-35).
[120522.435730] [drm:radeon_pm_resume_dpm] *ERROR* radeon: dpm resume failed
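
For readers triaging similar logs, the stall and lockup messages above can be
pulled out mechanically. The following is a minimal Python sketch, not part of
the original report: the function name summarize_lockups and both regular
expressions are illustrative assumptions based only on the message text quoted
in this comment. Whitespace is collapsed first because mail wrapping splits
some kernel messages across two lines.

```python
import re

# Patterns derived from the radeon messages quoted in this comment.
# They are assumptions for illustration, not an official log format.
STALL_RE = re.compile(r"ring (\d+) stalled for more than (\d+)msec")
LOCKUP_RE = re.compile(
    r"GPU lockup \(waiting for (0x[0-9a-fA-F]+)\s+"
    r"last fence id (0x[0-9a-fA-F]+) on ring (\d+)\)"
)


def summarize_lockups(dmesg_text):
    """Return (stalls, lockups) extracted from a dmesg excerpt.

    stalls  -> list of (ring, stalled_msec)
    lockups -> list of (ring, awaited_fence_id, last_fence_id)
    """
    # Collapse all whitespace so messages wrapped by the mailer match again.
    flat = " ".join(dmesg_text.split())
    stalls = [(int(ring), int(ms)) for ring, ms in STALL_RE.findall(flat)]
    lockups = [
        (int(ring), int(awaited, 16), int(last, 16))
        for awaited, last, ring in LOCKUP_RE.findall(flat)
    ]
    return stalls, lockups


sample = """\
[120499.589293] radeon 0000:01:00.0: ring 0 stalled for more than 10473msec
[120499.589296] radeon 0000:01:00.0: GPU lockup (waiting for 0x00000000000783d0
last fence id 0x00000000000783cf on ring 0)
[120522.435679] radeon 0000:01:00.0: ring 5 stalled for more than 10000msec
[120522.435685] radeon 0000:01:00.0: GPU lockup (waiting for 0x0000000000000004
last fence id 0x0000000000000002 on ring 5)
"""

stalls, lockups = summarize_lockups(sample)
for ring, awaited, last in lockups:
    # Difference between the awaited and the last reported fence id.
    print(f"ring {ring}: fence id gap {awaited - last}")
```

With the two lockup messages used as the sample, this reports a fence id gap of
1 on ring 0 and 2 on ring 5.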


* [Bug 78221] 3.16 RC1: AMD R9 270 GPU locks up on some heavy 2D activity - GPU VM fault occurs. (possibly DMA copying issue strikes back?)
  2014-06-18  2:20 [Bug 78221] New: 3.16 RC1: AMD R9 270 GPU locks up on some heavy 2D activity - GPU VM fault occurs. (possibly DMA copying issue strikes back?) bugzilla-daemon
                   ` (22 preceding siblings ...)
  2014-09-30  4:03 ` bugzilla-daemon
@ 2015-07-10 23:38 ` bugzilla-daemon
  23 siblings, 0 replies; 25+ messages in thread
From: bugzilla-daemon @ 2015-07-10 23:38 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=78221

Linux Tester <linux.tester@sharklasers.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |linux.tester@sharklasers.com

--- Comment #24 from Linux Tester <linux.tester@sharklasers.com> ---
After some updates to MESA and the kernel, this bug no longer happens at all.
2D is now rock solid for me on the R9 270, and I can run even the most
troublesome workloads for weeks without any issues. Although I failed to
pinpoint what exactly fixed the bug, thanks anyway. I think it is now safe to
close this bug as fixed. I bet you've got dozens of other GPU lockups to chew
on, so I'm glad to inform you that at least one nasty thing has been nailed
down.

(It's me, the original bug reporter, who has forgotten the password to both the
original mailbox and the Bugzilla account, dammit.)

