* Re: Re:[mipsel+rs780e]Occasionally "GPU lockup" after resuming from suspend.
[not found] <2699.222.92.8.142.1318833267.squirrel@mail.lemote.com>
@ 2011-10-18 8:35 ` Chen Jie
2011-10-20 16:31 ` Michel Dänzer
0 siblings, 1 reply; 5+ messages in thread
From: Chen Jie @ 2011-10-18 8:35 UTC (permalink / raw)
To: chenhc; +Cc: Michel Dänzer, dri-devel
Hi,
On 17 October 2011 at 2:34 PM, <chenhc@lemote.com> wrote:
> If I start X but switch to the console and then suspend and resume, a "GPU
> reset" hardly ever happens. But there is a new problem: the IRQ of the radeon
> card is disabled. Maybe "GPU reset" has something to do with "IRQ
> disabled"?
>
> I have tried "irqpoll"; it doesn't fix this problem.
>
> [ 571.914062] irq 6: nobody cared (try booting with the "irqpoll" option)
> [ 571.914062] Call Trace:
> [ 571.914062] [<ffffffff806f3248>] dump_stack+0x8/0x34
> [ 571.914062] [<ffffffff8027e1e4>] __report_bad_irq.clone.6+0x44/0x15c
> [ 571.914062] [<ffffffff8027e584>] note_interrupt+0x204/0x2a0
> [ 571.914062] [<ffffffff8027c7cc>] handle_irq_event_percpu+0x19c/0x1f8
> [ 571.914062] [<ffffffff8027c890>] handle_irq_event+0x68/0xa8
> [ 571.914062] [<ffffffff8027f038>] handle_level_irq+0xd8/0x13c
> [ 571.914062] [<ffffffff8027bec8>] generic_handle_irq+0x48/0x58
> [ 571.914062] [<ffffffff80204574>] do_IRQ+0x18/0x24
> [ 571.914062] [<ffffffff8020152c>] mach_irq_dispatch+0xf0/0x194
> [ 571.914062] [<ffffffff80202a40>] ret_from_irq+0x0/0x4
> [ 571.914062]
> [ 571.914062] handlers:
> [ 571.914062] [<ffffffff8053bba8>] radeon_driver_irq_handler_kms
>
> P.S.: I am using the latest kernel from git, and IRQ 6 is not shared by
> other devices.
>
Does fence_wait depend on the GPU's interrupt? If so, can I say that the "GPU
lockup" is caused by the unexpected disabling of the GPU's IRQ?
> > Hi Alex, Michel
> >
> > 2011/10/5 Alex Deucher <alexdeucher@gmail.com>
> >
> >> 2011/10/5 Michel Dänzer <michel@daenzer.net>:
> >> > On Thu, 2011-09-29 at 17:17 +0800, Chen Jie wrote:
> >> >>
> >> >> We got occasionally "GPU lockup" after resuming from suspend (on
> >> >> mipsel platform with a mips64 compatible CPU and rs780e, the kernel is
> >> >> 3.1.0-rc8 64bit). Related kernel message:
> >> >
> >> > [...]
> >> >
> >> >> [ 177.085937] radeon 0000:01:05.0: GPU lockup CP stall for more than
> >> >> 10019msec
> >> >> [ 177.089843] ------------[ cut here ]------------
> >> >> [ 177.097656] WARNING: at drivers/gpu/drm/radeon/radeon_fence.c:267
> >> >> radeon_fence_wait+0x25c/0x33c()
> >> >> [ 177.105468] GPU lockup (waiting for 0x000013C3 last fence id
> >> >> 0x000013AD)
> >> >> [ 177.113281] Modules linked in: psmouse serio_raw
> >> >> [ 177.117187] Call Trace:
> >> >> [ 177.121093] [<ffffffff806f3e7c>] dump_stack+0x8/0x34
> >> >> [ 177.125000] [<ffffffff8022e4f4>] warn_slowpath_common+0x78/0xa0
> >> >> [ 177.132812] [<ffffffff8022e5b8>] warn_slowpath_fmt+0x38/0x44
> >> >> [ 177.136718] [<ffffffff80522ed8>] radeon_fence_wait+0x25c/0x33c
> >> >> [ 177.144531] [<ffffffff804e9e70>] ttm_bo_wait+0x108/0x220
> >> >> [ 177.148437] [<ffffffff8053b478>] radeon_gem_wait_idle_ioctl
> >> >> +0x80/0x114
> >> >> [ 177.156250] [<ffffffff804d2fe8>] drm_ioctl+0x2e4/0x3fc
> >> >> [ 177.160156] [<ffffffff805a1820>] radeon_kms_compat_ioctl+0x28/0x38
> >> >> [ 177.167968] [<ffffffff80311a04>] compat_sys_ioctl+0x120/0x35c
> >> >> [ 177.171875] [<ffffffff80211d18>] handle_sys+0x118/0x138
> >> >> [ 177.179687] ---[ end trace 92f63d998efe4c6d ]---
> >> >> [ 177.187500] radeon 0000:01:05.0: GPU softreset
> >> >> [ 177.191406] radeon 0000:01:05.0: R_008010_GRBM_STATUS=0xF57C2030
> >> >> [ 177.195312] radeon 0000:01:05.0: R_008014_GRBM_STATUS2=0x00111103
> >> >> [ 177.203125] radeon 0000:01:05.0: R_000E50_SRBM_STATUS=0x20023040
> >> >> [ 177.363281] radeon 0000:01:05.0: Wait for MC idle timedout !
> >> >
> >> > [...]
> >> >
> >> >> What may cause a "GPU lockup"?
> >> >
> >> > Lots of things... The most common cause is an incorrect command stream
> >> > sent to the GPU by userspace or the kernel.
> >> >
> >> >> Why reset didn't work?
> >> >
> >> > Might be related to 'Wait for MC idle timedout !', but I don't know
> >> > offhand what could be up with that.
> >> >
> >> >
> >> >> BTW, one question:
> >> >> I got 'RADEON_IS_PCI | RADEON_IS_IGP' in rdev->flags, which causes
> >> >> need_dma32 was set.
> >> >> Is it correct? (drivers/char/agp is not available on mips, could that
> >> >> be the reason?)
> >> >
> >> > Not sure, Alex?
> >>
> >> You don't need AGP for newer IGP cards (rs4xx+). It gets set by default if
> >> the card is not AGP or PCIE. That should be changed as only the
> >> legacy r1xx PCI GART block has that limitation. I'll send a patch out
> >> shortly.
> >>
> >> Got it, thanks for the reply.
> >
>
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel
* Re: Re:[mipsel+rs780e]Occasionally "GPU lockup" after resuming from suspend.
2011-10-18 8:35 ` Re:[mipsel+rs780e]Occasionally "GPU lockup" after resuming from suspend Chen Jie
@ 2011-10-20 16:31 ` Michel Dänzer
0 siblings, 0 replies; 5+ messages in thread
From: Michel Dänzer @ 2011-10-20 16:31 UTC (permalink / raw)
To: Chen Jie; +Cc: chenhc, dri-devel
On Tue, 2011-10-18 at 16:35 +0800, Chen Jie wrote:
>
> On 17 October 2011 at 2:34 PM, <chenhc@lemote.com> wrote:
> > If I start X but switch to the console and then suspend and resume, a
> > "GPU reset" hardly ever happens. But there is a new problem: the IRQ of
> > the radeon card is disabled. Maybe "GPU reset" has something to do with
> > "IRQ disabled"?
> >
> > I have tried "irqpoll"; it doesn't fix this problem.
> >
> > [ 571.914062] irq 6: nobody cared (try booting with the "irqpoll" option)
> > [ 571.914062] Call Trace:
> > [ 571.914062] [<ffffffff806f3248>] dump_stack+0x8/0x34
> > [ 571.914062] [<ffffffff8027e1e4>] __report_bad_irq.clone.6+0x44/0x15c
> > [ 571.914062] [<ffffffff8027e584>] note_interrupt+0x204/0x2a0
> > [ 571.914062] [<ffffffff8027c7cc>] handle_irq_event_percpu+0x19c/0x1f8
> > [ 571.914062] [<ffffffff8027c890>] handle_irq_event+0x68/0xa8
> > [ 571.914062] [<ffffffff8027f038>] handle_level_irq+0xd8/0x13c
> > [ 571.914062] [<ffffffff8027bec8>] generic_handle_irq+0x48/0x58
> > [ 571.914062] [<ffffffff80204574>] do_IRQ+0x18/0x24
> > [ 571.914062] [<ffffffff8020152c>] mach_irq_dispatch+0xf0/0x194
> > [ 571.914062] [<ffffffff80202a40>] ret_from_irq+0x0/0x4
> > [ 571.914062]
> > [ 571.914062] handlers:
> > [ 571.914062] [<ffffffff8053bba8>] radeon_driver_irq_handler_kms
> >
> > P.S.: I am using the latest kernel from git, and IRQ 6 is not shared by
> > other devices.
>
> Does fence_wait depend on the GPU's interrupt? If so, can I say that the
> "GPU lockup" is caused by the unexpected disabling of the GPU's IRQ?
No, if the GPU didn't actually lock up, the fences should still signal
eventually, as radeon_fence_signaled()->radeon_fence_poll_locked() is
called after the wait for the SW interrupt times out.
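
To illustrate the fallback described above, here is a minimal, self-contained
sketch in C. It is not the radeon driver's code: the struct, the enum, and the
fake_* function names are hypothetical stand-ins, and the sleep on the
interrupt is reduced to a boolean. It only aims to show why a lost interrupt
on its own should delay a fence wait rather than produce a lockup report.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the driver's fence bookkeeping. */
struct fake_fence_state {
    uint32_t hw_seq;       /* last sequence number the GPU has written back */
    uint32_t signaled_seq; /* last sequence number software has observed */
    bool irq_delivered;    /* does the "fence done" interrupt reach the CPU? */
};

enum fake_wait_result {
    FAKE_SIGNALED_BY_IRQ,
    FAKE_SIGNALED_BY_POLL,
    FAKE_LOCKUP
};

/* Poll: refresh the software view from what the GPU wrote, then compare.
 * The signed difference keeps the comparison valid across counter wrap. */
bool fake_fence_poll(struct fake_fence_state *s, uint32_t wanted)
{
    s->signaled_seq = s->hw_seq;
    return (int32_t)(s->signaled_seq - wanted) >= 0;
}

/* Wait for fence `wanted`. The real wait sleeps with a timeout; here the
 * sleep either ends because an interrupt arrived or because it timed out. */
enum fake_wait_result fake_fence_wait(struct fake_fence_state *s,
                                      uint32_t wanted)
{
    /* Case 1: the interrupt arrived and the fence has been reached. */
    if (s->irq_delivered && fake_fence_poll(s, wanted))
        return FAKE_SIGNALED_BY_IRQ;

    /* Case 2: the wait timed out (for example, the IRQ line is disabled).
     * Poll anyway; a GPU that is still running has advanced hw_seq. */
    if (fake_fence_poll(s, wanted))
        return FAKE_SIGNALED_BY_POLL;

    /* Case 3: no progress even when polling, so report a lockup. */
    return FAKE_LOCKUP;
}
```

In this model a disabled IRQ with a healthy GPU ends in
FAKE_SIGNALED_BY_POLL; only a GPU whose sequence number stopped advancing
ends in FAKE_LOCKUP.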
--
Earthling Michel Dänzer | http://www.amd.com
Libre software enthusiast | Debian, X and DRI developer
* Re: Re:[mipsel+rs780e]Occasionally "GPU lockup" after resuming from suspend.
2011-10-05 9:41 ` Michel Dänzer
@ 2011-10-05 13:54 ` Alex Deucher
0 siblings, 0 replies; 5+ messages in thread
From: Alex Deucher @ 2011-10-05 13:54 UTC (permalink / raw)
To: Michel Dänzer; +Cc: dri-devel, Chen Jie
2011/10/5 Michel Dänzer <michel@daenzer.net>:
> On Thu, 2011-09-29 at 17:17 +0800, Chen Jie wrote:
>>
>> We got occasionally "GPU lockup" after resuming from suspend(on mipsel
>> platform with a mips64 compatible CPU and rs780e, the kernel is
>> 3.1.0-rc8 64bit). Related kernel message:
>
> [...]
>
>> [ 177.085937] radeon 0000:01:05.0: GPU lockup CP stall for more than
>> 10019msec
>> [ 177.089843] ------------[ cut here ]------------
>> [ 177.097656] WARNING: at drivers/gpu/drm/radeon/radeon_fence.c:267
>> radeon_fence_wait+0x25c/0x33c()
>> [ 177.105468] GPU lockup (waiting for 0x000013C3 last fence id
>> 0x000013AD)
>> [ 177.113281] Modules linked in: psmouse serio_raw
>> [ 177.117187] Call Trace:
>> [ 177.121093] [<ffffffff806f3e7c>] dump_stack+0x8/0x34
>> [ 177.125000] [<ffffffff8022e4f4>] warn_slowpath_common+0x78/0xa0
>> [ 177.132812] [<ffffffff8022e5b8>] warn_slowpath_fmt+0x38/0x44
>> [ 177.136718] [<ffffffff80522ed8>] radeon_fence_wait+0x25c/0x33c
>> [ 177.144531] [<ffffffff804e9e70>] ttm_bo_wait+0x108/0x220
>> [ 177.148437] [<ffffffff8053b478>] radeon_gem_wait_idle_ioctl
>> +0x80/0x114
>> [ 177.156250] [<ffffffff804d2fe8>] drm_ioctl+0x2e4/0x3fc
>> [ 177.160156] [<ffffffff805a1820>] radeon_kms_compat_ioctl+0x28/0x38
>> [ 177.167968] [<ffffffff80311a04>] compat_sys_ioctl+0x120/0x35c
>> [ 177.171875] [<ffffffff80211d18>] handle_sys+0x118/0x138
>> [ 177.179687] ---[ end trace 92f63d998efe4c6d ]---
>> [ 177.187500] radeon 0000:01:05.0: GPU softreset
>> [ 177.191406] radeon 0000:01:05.0: R_008010_GRBM_STATUS=0xF57C2030
>> [ 177.195312] radeon 0000:01:05.0: R_008014_GRBM_STATUS2=0x00111103
>> [ 177.203125] radeon 0000:01:05.0: R_000E50_SRBM_STATUS=0x20023040
>> [ 177.363281] radeon 0000:01:05.0: Wait for MC idle timedout !
>
> [...]
>
>> What may cause a "GPU lockup"?
>
> Lots of things... The most common cause is an incorrect command stream
> sent to the GPU by userspace or the kernel.
>
>> Why reset didn't work?
>
> Might be related to 'Wait for MC idle timedout !', but I don't know
> offhand what could be up with that.
>
>
>> BTW, one question:
>> I got 'RADEON_IS_PCI | RADEON_IS_IGP' in rdev->flags, which causes
>> need_dma32 was set.
>> Is it correct? (drivers/char/agp is not available on mips, could that
>> be the reason?)
>
> Not sure, Alex?
You don't need AGP for newer IGP cards (rs4xx+). RADEON_IS_PCI gets set by
default if the card is not AGP or PCIE. That should be changed, as only the
legacy r1xx PCI GART block has that limitation. I'll send a patch out
shortly.
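
The flag logic discussed above can be sketched roughly as follows. This is an
illustration under stated assumptions, not the driver's code and not the
promised patch: the BUS_* and FAMILY_* constants, their values, and all
function names are hypothetical, and the "narrowed" rule is only one reading
of the remark that the limitation belongs to the legacy PCI GART block on
parts older than rs4xx.

```c
#include <stdbool.h>

/* Illustrative constants only; these are not the driver's real values. */
enum { BUS_AGP = 1 << 0, BUS_PCI = 1 << 1, BUS_PCIE = 1 << 2, BUS_IGP = 1 << 3 };

/* Hypothetical family list, ordered from oldest to newest. */
enum fake_family { FAMILY_R100, FAMILY_RS400, FAMILY_RS780 };

/* Bus classification as described in the thread: anything that is neither
 * AGP nor PCIE falls through to PCI by default. */
unsigned classify_bus(bool is_agp, bool is_pcie, bool is_igp)
{
    unsigned flags = 0;

    if (is_agp)
        flags |= BUS_AGP;
    else if (is_pcie)
        flags |= BUS_PCIE;
    else
        flags |= BUS_PCI; /* default case */

    if (is_igp)
        flags |= BUS_IGP;
    return flags;
}

/* Behaviour reported in the thread: the PCI flag alone forces 32-bit DMA. */
bool need_dma32_reported(unsigned flags)
{
    return (flags & BUS_PCI) != 0;
}

/* Assumed narrowing: restrict 32-bit DMA to PCI parts older than rs4xx. */
bool need_dma32_narrowed(unsigned flags, enum fake_family family)
{
    return (flags & BUS_PCI) != 0 && family < FAMILY_RS400;
}
```

Under this sketch, an IGP such as the rs780e on a platform without AGP is
classified as BUS_PCI | BUS_IGP, so the reported rule sets need_dma32 while
the narrowed rule does not.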
Alex
* Re: Re:[mipsel+rs780e]Occasionally "GPU lockup" after resuming from suspend.
2011-09-29 9:17 Chen Jie
@ 2011-10-05 9:41 ` Michel Dänzer
2011-10-05 13:54 ` Alex Deucher
0 siblings, 1 reply; 5+ messages in thread
From: Michel Dänzer @ 2011-10-05 9:41 UTC (permalink / raw)
To: Chen Jie; +Cc: dri-devel
On Thu, 2011-09-29 at 17:17 +0800, Chen Jie wrote:
>
> We got occasionally "GPU lockup" after resuming from suspend(on mipsel
> platform with a mips64 compatible CPU and rs780e, the kernel is
> 3.1.0-rc8 64bit). Related kernel message:
[...]
> [ 177.085937] radeon 0000:01:05.0: GPU lockup CP stall for more than
> 10019msec
> [ 177.089843] ------------[ cut here ]------------
> [ 177.097656] WARNING: at drivers/gpu/drm/radeon/radeon_fence.c:267
> radeon_fence_wait+0x25c/0x33c()
> [ 177.105468] GPU lockup (waiting for 0x000013C3 last fence id
> 0x000013AD)
> [ 177.113281] Modules linked in: psmouse serio_raw
> [ 177.117187] Call Trace:
> [ 177.121093] [<ffffffff806f3e7c>] dump_stack+0x8/0x34
> [ 177.125000] [<ffffffff8022e4f4>] warn_slowpath_common+0x78/0xa0
> [ 177.132812] [<ffffffff8022e5b8>] warn_slowpath_fmt+0x38/0x44
> [ 177.136718] [<ffffffff80522ed8>] radeon_fence_wait+0x25c/0x33c
> [ 177.144531] [<ffffffff804e9e70>] ttm_bo_wait+0x108/0x220
> [ 177.148437] [<ffffffff8053b478>] radeon_gem_wait_idle_ioctl
> +0x80/0x114
> [ 177.156250] [<ffffffff804d2fe8>] drm_ioctl+0x2e4/0x3fc
> [ 177.160156] [<ffffffff805a1820>] radeon_kms_compat_ioctl+0x28/0x38
> [ 177.167968] [<ffffffff80311a04>] compat_sys_ioctl+0x120/0x35c
> [ 177.171875] [<ffffffff80211d18>] handle_sys+0x118/0x138
> [ 177.179687] ---[ end trace 92f63d998efe4c6d ]---
> [ 177.187500] radeon 0000:01:05.0: GPU softreset
> [ 177.191406] radeon 0000:01:05.0: R_008010_GRBM_STATUS=0xF57C2030
> [ 177.195312] radeon 0000:01:05.0: R_008014_GRBM_STATUS2=0x00111103
> [ 177.203125] radeon 0000:01:05.0: R_000E50_SRBM_STATUS=0x20023040
> [ 177.363281] radeon 0000:01:05.0: Wait for MC idle timedout !
[...]
> What may cause a "GPU lockup"?
Lots of things... The most common cause is an incorrect command stream
sent to the GPU by userspace or the kernel.
> Why reset didn't work?
Might be related to 'Wait for MC idle timedout !', but I don't know
offhand what could be up with that.
> BTW, one question:
> I got 'RADEON_IS_PCI | RADEON_IS_IGP' in rdev->flags, which causes
> need_dma32 was set.
> Is it correct? (drivers/char/agp is not available on mips, could that
> be the reason?)
Not sure, Alex?
--
Earthling Michel Dänzer | http://www.amd.com
Libre software enthusiast | Debian, X and DRI developer
* Re:[mipsel+rs780e]Occasionally "GPU lockup" after resuming from suspend.
@ 2011-09-29 9:17 Chen Jie
2011-10-05 9:41 ` Michel Dänzer
0 siblings, 1 reply; 5+ messages in thread
From: Chen Jie @ 2011-09-29 9:17 UTC (permalink / raw)
To: Alex Deucher; +Cc: Michel Dänzer, dri-devel
Hi,
Adding more information.
We occasionally get a "GPU lockup" after resuming from suspend (on a mipsel
platform with a mips64-compatible CPU and an rs780e; the kernel is 3.1.0-rc8,
64-bit). Related kernel messages:
/* return from STR */
[ 156.152343] radeon 0000:01:05.0: WB enabled
[ 156.187500] [drm] ring test succeeded in 0 usecs
[ 156.187500] [drm] ib test succeeded in 0 usecs
[ 156.398437] ata2: SATA link down (SStatus 0 SControl 300)
[ 156.398437] ata3: SATA link down (SStatus 0 SControl 300)
[ 156.398437] ata4: SATA link down (SStatus 0 SControl 300)
[ 156.578125] ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[ 156.597656] ata1.00: configured for UDMA/133
[ 156.613281] usb 1-5: reset high speed USB device number 4 using ehci_hcd
[ 157.027343] usb 3-2: reset low speed USB device number 2 using ohci_hcd
[ 157.609375] usb 3-3: reset low speed USB device number 3 using ohci_hcd
[ 157.683593] r8169 0000:02:00.0: eth0: link up
[ 165.621093] PM: resume of devices complete after 9679.556 msecs
[ 165.628906] Restarting tasks ... done.
[ 177.085937] radeon 0000:01:05.0: GPU lockup CP stall for more than 10019msec
[ 177.089843] ------------[ cut here ]------------
[ 177.097656] WARNING: at drivers/gpu/drm/radeon/radeon_fence.c:267 radeon_fence_wait+0x25c/0x33c()
[ 177.105468] GPU lockup (waiting for 0x000013C3 last fence id 0x000013AD)
[ 177.113281] Modules linked in: psmouse serio_raw
[ 177.117187] Call Trace:
[ 177.121093] [<ffffffff806f3e7c>] dump_stack+0x8/0x34
[ 177.125000] [<ffffffff8022e4f4>] warn_slowpath_common+0x78/0xa0
[ 177.132812] [<ffffffff8022e5b8>] warn_slowpath_fmt+0x38/0x44
[ 177.136718] [<ffffffff80522ed8>] radeon_fence_wait+0x25c/0x33c
[ 177.144531] [<ffffffff804e9e70>] ttm_bo_wait+0x108/0x220
[ 177.148437] [<ffffffff8053b478>] radeon_gem_wait_idle_ioctl+0x80/0x114
[ 177.156250] [<ffffffff804d2fe8>] drm_ioctl+0x2e4/0x3fc
[ 177.160156] [<ffffffff805a1820>] radeon_kms_compat_ioctl+0x28/0x38
[ 177.167968] [<ffffffff80311a04>] compat_sys_ioctl+0x120/0x35c
[ 177.171875] [<ffffffff80211d18>] handle_sys+0x118/0x138
[ 177.179687] ---[ end trace 92f63d998efe4c6d ]---
[ 177.187500] radeon 0000:01:05.0: GPU softreset
[ 177.191406] radeon 0000:01:05.0: R_008010_GRBM_STATUS=0xF57C2030
[ 177.195312] radeon 0000:01:05.0: R_008014_GRBM_STATUS2=0x00111103
[ 177.203125] radeon 0000:01:05.0: R_000E50_SRBM_STATUS=0x20023040
[ 177.363281] radeon 0000:01:05.0: Wait for MC idle timedout !
[ 177.367187] radeon 0000:01:05.0: R_008020_GRBM_SOFT_RESET=0x00007FEE
[ 177.390625] radeon 0000:01:05.0: R_008020_GRBM_SOFT_RESET=0x00000001
[ 177.414062] radeon 0000:01:05.0: R_008010_GRBM_STATUS=0xA0003030
[ 177.417968] radeon 0000:01:05.0: R_008014_GRBM_STATUS2=0x00000003
[ 177.425781] radeon 0000:01:05.0: R_000E50_SRBM_STATUS=0x2002B040
[ 177.433593] radeon 0000:01:05.0: GPU reset succeed
[ 177.605468] radeon 0000:01:05.0: Wait for MC idle timedout !
[ 177.761718] radeon 0000:01:05.0: Wait for MC idle timedout !
[ 177.804687] radeon 0000:01:05.0: WB enabled
[ 178.000000] [drm:r600_ring_test] *ERROR* radeon: ring test failed (scratch(0x8504)=0xCAFEDEAD)
[ 178.007812] [drm:r600_resume] *ERROR* r600 startup failed on resume
[ 178.988281] [drm:radeon_ib_schedule] *ERROR* radeon: couldn't schedule IB(5).
[ 178.996093] [drm:radeon_cs_ioctl] *ERROR* Failed to schedule IB !
[ 179.003906] [drm:radeon_ib_schedule] *ERROR* radeon: couldn't schedule IB(6).
...
What may cause a "GPU lockup"? Why didn't the reset work? Any ideas?
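
As a reading aid for the "waiting for 0x000013C3 last fence id 0x000013AD"
line in the log above, the small C sketch below computes how far the last
signaled fence is behind the one being waited on. The function name is
hypothetical, and treating the sequence numbers as 32-bit counters that may
wrap is an assumption based only on the eight hex digits printed.

```c
#include <stdint.h>

/* Number of fences between the last one seen as signaled and the one being
 * waited on. Unsigned subtraction stays correct if the counter has wrapped. */
uint32_t fence_gap(uint32_t waiting_for, uint32_t last_signaled)
{
    return waiting_for - last_signaled;
}
```

For the values in the log, fence_gap(0x000013C3, 0x000013AD) is 0x16, that
is, 22 fences were emitted after the last one observed as signaled.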
BTW, one question:
I got 'RADEON_IS_PCI | RADEON_IS_IGP' in rdev->flags, which causes
need_dma32 to be set.
Is that correct? (drivers/char/agp is not available on mips; could that be
the reason?)
On 28 September 2011 at 3:23 PM, <chenhc@lemote.com> wrote:
> Hi Alex,
>
> When we do STR (S3) with an RS780E radeon card on a MIPS platform, a "GPU
> reset" may happen after resume (the probability is about 5%). After that,
> X is unusable.
>
> We know there is a "ring test" at system resume time and at GPU reset time.
> Whether or not a GPU reset happens, the "ring test" at system resume time
> always succeeds. But the "ring test" at GPU reset time usually fails.
>
> We use the latest kernel (3.1.0-rc8 from git), and X.org is 7.6.
>
> Any ideas?
>
> Best regards,
> Huacai Chen
>
>
Regards,
- Chen Jie