linux-kernel.vger.kernel.org archive mirror
* radeon 0000:1d:00.0: GPU lockup (current fence id 0x00000000017a66bf last fence id 0x00000000017a67a1 on ring 0)
From: Borislav Petkov @ 2019-01-12 20:50 UTC
  To: dri-devel
  Cc: Alex Deucher, Christian König, David (ChunMing) Zhou,
	amd-gfx, linux-kernel

Hi guys,

my odyssey with the GPU continues. This time it didn't reset itself
but started spewing a single line about the hardware locking up.

The machine was responsive to sysrq so I was able to write out
/var/log/messages and reboot.

This is still with 4.20-rc7 but I'm building 5.0-rc1 to see if there's a
difference.

Any and all ideas on what to do here are greatly appreciated.

Thx.

Jan 12 21:38:16 zn vmunix: [257393.853806] radeon 0000:1d:00.0: GPU lockup (current fence id 0x00000000017a66bf last fence id 0x00000000017a67a0 on ring 0)
Jan 12 21:38:16 zn vmunix: [257394.369780] radeon 0000:1d:00.0: GPU lockup (current fence id 0x00000000017a66bf last fence id 0x00000000017a67a0 on ring 0)
Jan 12 21:38:17 zn vmunix: [257394.877795] radeon 0000:1d:00.0: GPU lockup (current fence id 0x00000000017a66bf last fence id 0x00000000017a67a1 on ring 0)
Jan 12 21:38:17 zn vmunix: [257395.389789] radeon 0000:1d:00.0: GPU lockup (current fence id 0x00000000017a66bf last fence id 0x00000000017a67a1 on ring 0)
Jan 12 21:38:18 zn vmunix: [257395.901786] radeon 0000:1d:00.0: GPU lockup (current fence id 0x00000000017a66bf last fence id 0x00000000017a67a1 on ring 0)
Jan 12 21:38:18 zn vmunix: [257396.413787] radeon 0000:1d:00.0: GPU lockup (current fence id 0x00000000017a66bf last fence id 0x00000000017a67a1 on ring 0)
Jan 12 21:38:19 zn vmunix: [257396.925790] radeon 0000:1d:00.0: GPU lockup (current fence id 0x00000000017a66bf last fence id 0x00000000017a67a1 on ring 0)
Jan 12 21:38:19 zn vmunix: [257397.437787] radeon 0000:1d:00.0: GPU lockup (current fence id 0x00000000017a66bf last fence id 0x00000000017a67a1 on ring 0)
Jan 12 21:38:20 zn vmunix: [257397.949788] radeon 0000:1d:00.0: GPU lockup (current fence id 0x00000000017a66bf last fence id 0x00000000017a67a1 on ring 0)
Jan 12 21:38:20 zn vmunix: [257398.461786] radeon 0000:1d:00.0: GPU lockup (current fence id 0x00000000017a66bf last fence id 0x00000000017a67a1 on ring 0)
Jan 12 21:38:21 zn vmunix: [257398.973824] radeon 0000:1d:00.0: GPU lockup (current fence id 0x00000000017a66bf last fence id 0x00000000017a67a1 on ring 0)
Jan 12 21:38:21 zn vmunix: [257399.485828] radeon 0000:1d:00.0: GPU lockup (current fence id 0x00000000017a66bf last fence id 0x00000000017a67a1 on ring 0)
Jan 12 21:38:22 zn vmunix: [257399.997842] radeon 0000:1d:00.0: GPU lockup (current fence id 0x00000000017a66bf last fence id 0x00000000017a67a1 on ring 0)
Jan 12 21:38:22 zn vmunix: [257400.509852] radeon 0000:1d:00.0: GPU lockup (current fence id 0x00000000017a66bf last fence id 0x00000000017a67a1 on ring 0)
Jan 12 21:38:23 zn vmunix: [257401.021845] radeon 0000:1d:00.0: GPU lockup (current fence id 0x00000000017a66bf last fence id 0x00000000017a67a1 on ring 0)
Jan 12 21:38:23 zn vmunix: [257401.533826] radeon 0000:1d:00.0: GPU lockup (current fence id 0x00000000017a66bf last fence id 0x00000000017a67a1 on ring 0)
Jan 12 21:38:24 zn vmunix: [257402.045822] radeon 0000:1d:00.0: GPU lockup (current fence id 0x00000000017a66bf last fence id 0x00000000017a67a1 on ring 0)
Jan 12 21:38:24 zn vmunix: [257402.557826] radeon 0000:1d:00.0: GPU lockup (current fence id 0x00000000017a66bf last fence id 0x00000000017a67a1 on ring 0)
Jan 12 21:38:25 zn vmunix: [257403.069833] radeon 0000:1d:00.0: GPU lockup (current fence id 0x00000000017a66bf last fence id 0x00000000017a67a1 on ring 0)
Jan 12 21:38:25 zn vmunix: [257403.581826] radeon 0000:1d:00.0: GPU lockup (current fence id 0x00000000017a66bf last fence id 0x00000000017a67a1 on ring 0)
Jan 12 21:38:26 zn vmunix: [257404.093828] radeon 0000:1d:00.0: GPU lockup (current fence id 0x00000000017a66bf last fence id 0x00000000017a67a1 on ring 0)
Jan 12 21:38:26 zn vmunix: [257404.605827] radeon 0000:1d:00.0: GPU lockup (current fence id 0x00000000017a66bf last fence id 0x00000000017a67a1 on ring 0)
Jan 12 21:38:27 zn vmunix: [257405.117821] radeon 0000:1d:00.0: GPU lockup (current fence id 0x00000000017a66bf last fence id 0x00000000017a67a1 on ring 0)
Jan 12 21:38:27 zn vmunix: [257405.629883] radeon 0000:1d:00.0: GPU lockup (current fence id 0x00000000017a66bf last fence id 0x00000000017a67a1 on ring 0)
Jan 12 21:38:28 zn vmunix: [257406.141842] radeon 0000:1d:00.0: GPU lockup (current fence id 0x00000000017a66bf last fence id 0x00000000017a67a1 on ring 0)
Jan 12 21:38:28 zn vmunix: [257406.653828] radeon 0000:1d:00.0: GPU lockup (current fence id 0x00000000017a66bf last fence id 0x00000000017a67a1 on ring 0)
Jan 12 21:38:29 zn vmunix: [257407.165827] radeon 0000:1d:00.0: GPU lockup (current fence id 0x00000000017a66bf last fence id 0x00000000017a67a1 on ring 0)
Jan 12 21:38:29 zn vmunix: [257407.677827] radeon 0000:1d:00.0: GPU lockup (current fence id 0x00000000017a66bf last fence id 0x00000000017a67a1 on ring 0)
Jan 12 21:38:30 zn vmunix: [257408.189823] radeon 0000:1d:00.0: GPU lockup (current fence id 0x00000000017a66bf last fence id 0x00000000017a67a1 on ring 0)
Jan 12 21:38:30 zn vmunix: [257408.701827] radeon 0000:1d:00.0: GPU lockup (current fence id 0x00000000017a66bf last fence id 0x00000000017a67a1 on ring 0)
Jan 12 21:38:31 zn vmunix: [257409.213827] radeon 0000:1d:00.0: GPU lockup (current fence id 0x00000000017a66bf last fence id 0x00000000017a67a1 on ring 0)
Jan 12 21:38:32 zn vmunix: [257409.725822] radeon 0000:1d:00.0: GPU lockup (current fence id 0x00000000017a66bf last fence id 0x00000000017a67a1 on ring 0)
Jan 12 21:38:32 zn vmunix: [257410.237827] radeon 0000:1d:00.0: GPU lockup (current fence id 0x00000000017a66bf last fence id 0x00000000017a67a1 on ring 0)
Jan 12 21:38:33 zn vmunix: [257410.749833] radeon 0000:1d:00.0: GPU lockup (current fence id 0x00000000017a66bf last fence id 0x00000000017a67a1 on ring 0)
Jan 12 21:38:33 zn vmunix: [257411.261824] radeon 0000:1d:00.0: GPU lockup (current fence id 0x00000000017a66bf last fence id 0x00000000017a67a1 on ring 0)
Jan 12 21:38:34 zn vmunix: [257411.773828] radeon 0000:1d:00.0: GPU lockup (current fence id 0x00000000017a66bf last fence id 0x00000000017a67a1 on ring 0)
Jan 12 21:38:34 zn vmunix: [257412.285827] radeon 0000:1d:00.0: GPU lockup (current fence id 0x00000000017a66bf last fence id 0x00000000017a67a1 on ring 0)
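
To make the log above easier to compare across runs, here is a minimal
Python sketch that parses the "GPU lockup" lines and reports the gap
between the two fence ids. The helper name parse_lockup and the "pending"
field are made up for illustration. It assumes, which this thread does not
confirm, that "current" is the last fence the GPU signaled and "last" is
the last fence the driver emitted, so the difference approximates the
number of fences still outstanding on the ring.

```python
import re

# Matches the radeon lockup message as it appears in the log above.
LOCKUP_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+radeon\s+(?P<dev>\S+):\s+GPU lockup "
    r"\(current fence id 0x(?P<cur>[0-9a-fA-F]+) "
    r"last fence id 0x(?P<last>[0-9a-fA-F]+) on ring (?P<ring>\d+)\)"
)


def parse_lockup(line):
    """Return a dict describing one lockup line, or None if it does not match."""
    m = LOCKUP_RE.search(line)
    if m is None:
        return None
    cur = int(m.group("cur"), 16)
    last = int(m.group("last"), 16)
    return {
        "ts": float(m.group("ts")),   # kernel timestamp in seconds
        "dev": m.group("dev"),        # PCI address
        "ring": int(m.group("ring")),
        "current": cur,
        "last": last,
        # Assumption: emitted minus signaled, i.e. fences still outstanding.
        "pending": last - cur,
    }


# Two lines copied from the log above.
SAMPLE = [
    "Jan 12 21:38:16 zn vmunix: [257393.853806] radeon 0000:1d:00.0: GPU lockup "
    "(current fence id 0x00000000017a66bf last fence id 0x00000000017a67a0 on ring 0)",
    "Jan 12 21:38:17 zn vmunix: [257394.877795] radeon 0000:1d:00.0: GPU lockup "
    "(current fence id 0x00000000017a66bf last fence id 0x00000000017a67a1 on ring 0)",
]

events = [e for e in (parse_lockup(line) for line in SAMPLE) if e is not None]

if __name__ == "__main__":
    for e in events:
        print(f"{e['ts']:.6f} ring {e['ring']}: "
              f"current=0x{e['current']:x} last=0x{e['last']:x} pending={e['pending']}")
```

A "current" value that never moves while the message repeats, as in the log
above, is what one would expect from a ring that has stopped making
progress; the sketch only makes that easier to see and does not say why.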


-- 
Regards/Gruss,
    Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.


* Re: radeon 0000:1d:00.0: GPU lockup (current fence id 0x00000000017a66bf last fence id 0x00000000017a67a1 on ring 0)
From: Borislav Petkov @ 2019-01-15 10:21 UTC
  To: dri-devel
  Cc: Alex Deucher, Christian König, David (ChunMing) Zhou,
	amd-gfx, linux-kernel

On Sat, Jan 12, 2019 at 09:50:51PM +0100, Borislav Petkov wrote:
> Hi guys,
> 
> my odyssey with the GPU continues. This time it didn't reset itself
> but started spewing a single line about the hardware locking up.
> 
> The machine was responsive to sysrq so I was able to write out
> /var/log/messages and reboot.
> 
> This is still with 4.20-rc7 but I'm building 5.0-rc1 to see if there's a
> difference.

Well, not really. This time the reset succeeded and the machine is still
alive:

[111333.620619] radeon 0000:1d:00.0: ring 0 stalled for more than 10360msec
[111333.620626] radeon 0000:1d:00.0: GPU lockup (current fence id 0x000000000080f31d last fence id 0x000000000080f416 on ring 0)
[111334.132277] radeon 0000:1d:00.0: ring 0 stalled for more than 10872msec
[111334.132283] radeon 0000:1d:00.0: GPU lockup (current fence id 0x000000000080f31d last fence id 0x000000000080f418 on ring 0)
[111334.199083] radeon 0000:1d:00.0: failed to get a new IB (-35)
[111334.199107] [drm:radeon_cs_ioctl [radeon]] *ERROR* Failed to get ib !
[111334.206116] radeon 0000:1d:00.0: Saved 8121 dwords of commands on ring 0.
[111334.206127] radeon 0000:1d:00.0: GPU softreset: 0x00000008
[111334.206130] radeon 0000:1d:00.0:   R_008010_GRBM_STATUS      = 0xA0001030
[111334.206132] radeon 0000:1d:00.0:   R_008014_GRBM_STATUS2     = 0x00000003
[111334.206135] radeon 0000:1d:00.0:   R_000E50_SRBM_STATUS      = 0x200000C0
[111334.206137] radeon 0000:1d:00.0:   R_008674_CP_STALLED_STAT1 = 0x00000000
[111334.206139] radeon 0000:1d:00.0:   R_008678_CP_STALLED_STAT2 = 0x00000000
[111334.206141] radeon 0000:1d:00.0:   R_00867C_CP_BUSY_STAT     = 0x00020182
[111334.206144] radeon 0000:1d:00.0:   R_008680_CP_STAT          = 0x80028645
[111334.206146] radeon 0000:1d:00.0:   R_00D034_DMA_STATUS_REG   = 0x44C83D57
[111334.272194] radeon 0000:1d:00.0: R_008020_GRBM_SOFT_RESET=0x00004001
[111334.272247] radeon 0000:1d:00.0: SRBM_SOFT_RESET=0x00000100
[111334.274336] radeon 0000:1d:00.0:   R_008010_GRBM_STATUS      = 0xA0003030
[111334.274338] radeon 0000:1d:00.0:   R_008014_GRBM_STATUS2     = 0x00000003
[111334.274339] radeon 0000:1d:00.0:   R_000E50_SRBM_STATUS      = 0x200080C0
[111334.274341] radeon 0000:1d:00.0:   R_008674_CP_STALLED_STAT1 = 0x00000000
[111334.274342] radeon 0000:1d:00.0:   R_008678_CP_STALLED_STAT2 = 0x00000000
[111334.274344] radeon 0000:1d:00.0:   R_00867C_CP_BUSY_STAT     = 0x00000000
[111334.274345] radeon 0000:1d:00.0:   R_008680_CP_STAT          = 0x80100000
[111334.274347] radeon 0000:1d:00.0:   R_00D034_DMA_STATUS_REG   = 0x44C83D57
[111334.274354] radeon 0000:1d:00.0: GPU reset succeeded, trying to resume
[111334.290030] [drm] PCIE gen 2 link speeds already enabled
[111334.292121] [drm] PCIE GART of 512M enabled (table at 0x0000000000142000).
[111334.292135] radeon 0000:1d:00.0: WB enabled
[111334.292137] radeon 0000:1d:00.0: fence driver on ring 0 use gpu addr 0x0000000020000c00 and cpu addr 0x00000000fb2c042c
[111334.292325] radeon 0000:1d:00.0: fence driver on ring 5 use gpu addr 0x00000000000521d0 and cpu addr 0x0000000014f22c80
[111334.323193] [drm] ring test on 0 succeeded in 0 usecs
[111334.497890] [drm] ring test on 5 succeeded in 1 usecs
[111334.497896] [drm] UVD initialized successfully.
[111334.724316] [drm] ib test on ring 0 succeeded in 0 usecs
[111335.380416] [drm] ib test on ring 5 succeeded
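
To see at a glance what the soft reset changed, the two register dumps
above can be diffed. The following is a minimal Python sketch with made-up
helper names (parse_regs, changed_bits); it only XORs the before and after
values taken from the log and does not attempt to decode what the
individual bits mean.

```python
import re

# Matches lines such as "R_008010_GRBM_STATUS      = 0xA0001030".
REG_RE = re.compile(r"(R_[0-9A-F]+_\w+)\s*=\s*0x([0-9A-Fa-f]+)")


def parse_regs(lines):
    """Map register name to integer value for every matching line."""
    regs = {}
    for line in lines:
        m = REG_RE.search(line)
        if m:
            regs[m.group(1)] = int(m.group(2), 16)
    return regs


def changed_bits(before, after):
    """Return {name: before XOR after} for registers whose value changed."""
    return {
        name: before[name] ^ after[name]
        for name in before
        if name in after and before[name] != after[name]
    }


# Values copied from the dump printed before the soft reset.
BEFORE = parse_regs([
    "R_008010_GRBM_STATUS      = 0xA0001030",
    "R_000E50_SRBM_STATUS      = 0x200000C0",
    "R_00867C_CP_BUSY_STAT     = 0x00020182",
    "R_008680_CP_STAT          = 0x80028645",
    "R_00D034_DMA_STATUS_REG   = 0x44C83D57",
])

# Values copied from the dump printed after the soft reset.
AFTER = parse_regs([
    "R_008010_GRBM_STATUS      = 0xA0003030",
    "R_000E50_SRBM_STATUS      = 0x200080C0",
    "R_00867C_CP_BUSY_STAT     = 0x00000000",
    "R_008680_CP_STAT          = 0x80100000",
    "R_00D034_DMA_STATUS_REG   = 0x44C83D57",
])

diff = changed_bits(BEFORE, AFTER)

if __name__ == "__main__":
    for name, bits in diff.items():
        print(f"{name}: 0x{BEFORE[name]:08X} -> 0x{AFTER[name]:08X} "
              f"(changed bits 0x{bits:08X})")
```

Registers that are identical in both dumps, such as R_00D034_DMA_STATUS_REG
here, are left out of the result.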

-- 
Regards/Gruss,
    Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.

