* Another GPU hang
@ 2011-05-21 21:08 Dmitry Nezhevenko
  2011-05-21 21:45 ` Chris Wilson
  0 siblings, 1 reply; 6+ messages in thread
From: Dmitry Nezhevenko @ 2011-05-21 21:08 UTC
  To: intel-gfx

Hi,

I'm not sure why I'm getting this. I've made a lot of configuration
changes, so I can't be sure which one causes it. I've switched to an amd64
distro and upgraded everything to the latest Debian unstable. I'm also on
2.6.39 now.

I sometimes get this hang just after unplugging the HDMI cable from the
laptop.

[ 8401.416156] [drm:i915_hangcheck_elapsed] *ERROR* Hangcheck timer elapsed... GPU hung
[ 8401.416169] [drm] capturing error event; look for more information in /debug/dri/0/i915_error_state
[ 8401.418055] [drm:i915_wait_request] *ERROR* i915_wait_request returns -11 (awaiting 154589 at 154588, next 154591)
[ 8401.920228] [drm:i915_reset] *ERROR* Failed to reset chip.

Actually the monitor is DVI, connected via HDMI->DVI cable to ASUS F6A
laptop with Intel GM45 graphics.

The i915_error_state file is around 1.5 MB, so I've uploaded it to
http://dion.org.ua/uploads/2011/05/2.6.39_intel_hung.txt

xserver-xorg    1:7.6+6
xserver-xorg-core       2:1.10.1-2
xserver-xorg-video-intel        2:2.15.0-3

Kernel version 2.6.39.

Any ideas?

-- 
WBR, Dmitry


* Re: Another GPU hang
  2011-05-21 21:08 Another GPU hang Dmitry Nezhevenko
@ 2011-05-21 21:45 ` Chris Wilson
  2011-05-21 22:47   ` Dmitry Nezhevenko
  0 siblings, 1 reply; 6+ messages in thread
From: Chris Wilson @ 2011-05-21 21:45 UTC
  To: Dmitry Nezhevenko, intel-gfx

On Sun, 22 May 2011 00:08:10 +0300, Dmitry Nezhevenko <dion@inhex.net> wrote:
> Hi,
> 
> I'm not sure why I'm getting this. I've performed a lot of configuration
> changes, so can't be sure, which one causes. I've switched to amd64
> distro and upgraded everything to latest debian unstable. Also I'm on
> 2.6.39 now.
> 
> I'm getting such hang sometimes just after unplugging HDMI cable from
> laptop. 

Hmm. Are you absolutely sure? I've an open bug 35576, but nothing that
indicated a correlation with modeswitching. The hang would appear to be
due to waiting on a dead pipe then. We've had a long history with such
bugs and purposely changing modes and so are now careful to flush any
pending waits from userspace and in KMS before modeswitching. This raises
the question of whether there is a window for a hotplug event to turn off
the pipe before userspace has finished flushing its queue of pending ops.

The important question is do you see this 0x01820000 at other times?
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre


* Re: Another GPU hang
  2011-05-21 21:45 ` Chris Wilson
@ 2011-05-21 22:47   ` Dmitry Nezhevenko
  2011-05-22  8:02     ` Chris Wilson
  0 siblings, 1 reply; 6+ messages in thread
From: Dmitry Nezhevenko @ 2011-05-21 22:47 UTC
  To: Chris Wilson; +Cc: intel-gfx

On Sat, May 21, 2011 at 10:45:24PM +0100, Chris Wilson wrote:
> On Sun, 22 May 2011 00:08:10 +0300, Dmitry Nezhevenko <dion@inhex.net> wrote:
> > Hi,
> > 
> > I'm not sure why I'm getting this. I've performed a lot of configuration
> > changes, so can't be sure, which one causes. I've switched to amd64
> > distro and upgraded everything to latest debian unstable. Also I'm on
> > 2.6.39 now.
> > 
> > I'm getting such hang sometimes just after unplugging HDMI cable from
> > laptop. 
> 
> Hmm. Are you absolutely sure? I've an open bug 35576, but nothing that
> indicated a correlation with modeswitching. The hang would appear to be
> due to waiting on a dead pipe then. We've had a long history with such
> bugs and purposely changing modes and so are now careful to flush any
> pending waits from userspace and in KMS before modeswitching. This raises
> the question that maybe there is a window for a hotplug event to turn-off
> the pipe before userspace has finished flushing its queue of pending ops.

I've just tried to reproduce it again. It looks like I can't reproduce it when
the external display (DVI connected via an HDMI->DVI cable) is turned off via xrandr.

However, if the display is active (LVDS turned off and HDMI1 active), it was
easy enough to reproduce just by playing with the HDMI connector. After
5-6 attempts to plug/unplug the monitor I got a hang again:

[38440.632948] [drm:drm_edid_block_valid] *ERROR* EDID checksum is invalid, remainder is 81
[38440.632956] [drm:drm_edid_block_valid] *ERROR* Raw EDID:
[38440.632962] <3>00 ff ff ff ff ff ff 00 38 a3 8c 67 01 01 01 01  ........8..g....
[38440.632968] <3>07 14 01 03 80 34 20 78 ea fc 85 a4 55 4d 9d 25  .....4 x....UM.%
[38440.632973] <3>12 50 54 bf ef 80 81 c0 81 80 90 40 8b c0 95 00  .PT........@....
[38440.632978] <3>a9 40 b3 00 d1 00 28 3c 80 a0 70 b0 23 40 30 20  .@....(<..p.#@0
[38440.632982] <3>36 00 06 44 21 00 00 1a 00 00 00 fd 00 32 55 1f  6..D!........2U.
[38440.632987] <3>5c 11 00 0a 20 20 20 20 20 20 00 00 00 fc 00 4c  \...      .....L
[38440.632992] <3>43 44 32 34 39 30 57 55 58 69 32 0a 00 00 00 ff  CD2490WUXi2.....
[38440.632997] <3>00 30 32 33 30 32 34 37 31 55 4f ff ff ff ff ff  .02302471UO.....
[38440.633000]
[38461.596072] [drm:i915_hangcheck_elapsed] *ERROR* Hangcheck timer elapsed... GPU hung
[38461.596083] [drm] capturing error event; look for more information in /debug/dri/0/i915_error_state
[38461.597995] [drm:i915_wait_request] *ERROR* i915_wait_request returns -11 (awaiting 2888267 at 2888261, next 2888268)
[38462.100045] [drm:i915_reset] *ERROR* Failed to reset chip.
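
For reference, the "remainder" in the EDID error can be checked by hand: a 128-byte EDID block is valid when its bytes sum to 0 modulo 256. The following is a minimal sketch, not the kernel's code. The helper names `parse_edid_dump` and `edid_remainder` are made up for illustration, and the regular expression assumes dump lines shaped like the ones in this log (a `<3>` prefix followed by 16 hex bytes).

```python
import re

# One dump line: "<3>" followed by 16 space-separated hex bytes.
# The trailing ASCII column is ignored.
DUMP_LINE = re.compile(r"<\d>((?:[0-9a-f]{2} ){15}[0-9a-f]{2})")


def parse_edid_dump(text):
    """Collect the raw bytes from dmesg-style EDID dump lines."""
    data = bytearray()
    for line in text.splitlines():
        match = DUMP_LINE.search(line)
        if match:
            data.extend(bytes.fromhex(match.group(1)))
    return bytes(data)


def edid_remainder(block):
    """Return the sum of a 128-byte EDID block modulo 256 (0 means valid)."""
    if len(block) != 128:
        raise ValueError("expected 128 bytes, got %d" % len(block))
    return sum(block) % 256
```

Fed the eight dump lines above, this gives a remainder of 81, the same value the kernel printed, so the block really was read back corrupted (note the run of ff bytes at the end).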

> The important question is do you see this 0x01820000 at other times?
> -Chris

You are asking about the "IPEHR:" line in the i915_error_state file, right? This
time it was IPEHR: 0x01800002.
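
To compare the value across several captured dumps, something like the sketch below could pull it out of each file. It assumes the dump contains lines of the form `IPEHR: 0x...`, as quoted here. The function name `ipehr_values` is hypothetical and not part of any tool.

```python
import re

# Matches e.g. "  IPEHR: 0x01800002" at the start of a line.
IPEHR_RE = re.compile(r"^\s*IPEHR:\s*(0x[0-9a-fA-F]+)", re.MULTILINE)


def ipehr_values(error_state_text):
    """Return every IPEHR value found in an i915_error_state dump, as ints."""
    return [int(value, 16) for value in IPEHR_RE.findall(error_state_text)]
```

For example, `ipehr_values(open("2.6.39_intel_hung_2.txt").read())` would list the values in the second dump, which can then be compared against 0x01820000.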

I've uploaded the full file to:
http://dion.org.ua/uploads/2011/05/2.6.39_intel_hung_2.txt

As for other hangs, some time ago there were hangs while playing video using
mplayer -vo xv. I asked about it here and you replied that it was fixed in
commit:

commit 23f9b14df7c102c1036134835dd5d1a508059858
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Sat Feb 12 10:42:34 2011 +0000

    i965: Remove broken maximum base addresses from video

So after upgrading to the proper Debian package everything was OK. At least
currently I don't remember any hang except on display plug/unplug.
 
-- 
WBR, Dmitry


* Re: Another GPU hang
  2011-05-21 22:47   ` Dmitry Nezhevenko
@ 2011-05-22  8:02     ` Chris Wilson
  2011-05-22 11:49       ` Dmitry Nezhevenko
  0 siblings, 1 reply; 6+ messages in thread
From: Chris Wilson @ 2011-05-22  8:02 UTC
  To: Dmitry Nezhevenko; +Cc: intel-gfx

On Sun, 22 May 2011 01:47:01 +0300, Dmitry Nezhevenko <dion@inhex.net> wrote:
> I've just tried to reproduce it again. It looks like I'm unable to do it when
> external display (DVI connected via HDMI->DVI cable) is turned off via xrandr. 
> 
> However if display is active (LVDS turned off and HDMI1 active) it was
> easy enough to reproduce this by just playing with HDMI connector. So
> after 5-6 attempts to plug/unplug monitor I've got hung again:

Ok, and I just got another bug (37450) which makes me think that, at
least, part of this problem is due to the stale DPMS values. (We depend
upon the DPMS state for not feeding WAITs onto a disabled pipe).
https://bugzilla.kernel.org/show_bug.cgi?id=24982
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre


* Re: Another GPU hang
  2011-05-22  8:02     ` Chris Wilson
@ 2011-05-22 11:49       ` Dmitry Nezhevenko
  2011-05-25  7:40         ` Dmitry Nezhevenko
  0 siblings, 1 reply; 6+ messages in thread
From: Dmitry Nezhevenko @ 2011-05-22 11:49 UTC
  To: Chris Wilson; +Cc: intel-gfx

On Sun, May 22, 2011 at 09:02:43AM +0100, Chris Wilson wrote:
> On Sun, 22 May 2011 01:47:01 +0300, Dmitry Nezhevenko <dion@inhex.net> wrote:
> > I've just tried to reproduce it again. It looks like I'm unable to do it when
> > external display (DVI connected via HDMI->DVI cable) is turned off via xrandr. 
> > 
> > However if display is active (LVDS turned off and HDMI1 active) it was
> > easy enough to reproduce this by just playing with HDMI connector. So
> > after 5-6 attempts to plug/unplug monitor I've got hung again:
> 
> Ok, and I just got another bug (37450) which makes me think that, at
> least, part of this problem is due to the stale DPMS values. (We depend
> upon the DPMS state for not feeding WAITs onto a disabled pipe).
> https://bugzilla.kernel.org/show_bug.cgi?id=24982
> -Chris
> 

There is a reference to commit 811aaa55ba21ab37407018cfc01770d6b037d3fb in
the bug report. I've double-checked that this commit is present in the 2.6.39
kernel I'm using, so the bug is probably not applicable to me.
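
One way to make such a check, sketched here under assumptions: it uses `git merge-base --is-ancestor`, which exits with 0 when the commit is contained in the given ref and 1 when it is not. The helper names are hypothetical, and the sketch assumes a local clone of the mainline tree with the `v2.6.39` tag available. A distribution kernel may carry extra patches that this does not see.

```python
import subprocess


def is_ancestor_cmd(commit, ref):
    """Build the git command that tests whether `commit` is contained in `ref`."""
    return ["git", "merge-base", "--is-ancestor", commit, ref]


def commit_in_release(commit, ref, repo="."):
    """Return True if `commit` is an ancestor of `ref` in the clone at `repo`."""
    result = subprocess.run(is_ancestor_cmd(commit, ref), cwd=repo)
    if result.returncode not in (0, 1):
        # Any other exit status means git itself failed (bad ref, no repo, ...).
        raise RuntimeError("git exited with status %d" % result.returncode)
    return result.returncode == 0
```

Usage would be along the lines of `commit_in_release("811aaa55ba21ab37407018cfc01770d6b037d3fb", "v2.6.39", repo="/path/to/linux")`, where the path is a placeholder.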

Should I file a bug report at kernel.org for this issue?

PS. If you have a patch or anything else for me to try, I can test it.
 
-- 
WBR, Dmitry


* Re: Another GPU hang
  2011-05-22 11:49       ` Dmitry Nezhevenko
@ 2011-05-25  7:40         ` Dmitry Nezhevenko
  0 siblings, 0 replies; 6+ messages in thread
From: Dmitry Nezhevenko @ 2011-05-25  7:40 UTC
  To: Chris Wilson; +Cc: intel-gfx

On Sun, May 22, 2011 at 02:49:30PM +0300, Dmitry Nezhevenko wrote:
> > Ok, and I just got another bug (37450) which makes me think that, at
> > least, part of this problem is due to the stale DPMS values. (We depend
> > upon the DPMS state for not feeding WAITs onto a disabled pipe).
> > https://bugzilla.kernel.org/show_bug.cgi?id=24982
> > -Chris
> > 
> 
> There is a reference to commit 811aaa55ba21ab37407018cfc01770d6b037d3fb in
> bugreport. I've double checked that this commit is present in 2.6.39
> kernel I'm using. So this is probably not applicable to me. 
> 
> Should I create bugreport at kernel.org for this issue?
> 
> PS. In case if you've some patch or other things to do, I can try it. 

Just a note: I once reproduced the issue without even touching the
connectors. It was enough to switch from the HDMI1 to the LVDS output using
xrandr.
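
A rough sketch of how that switch could be repeated automatically while watching dmesg for the hangcheck message. The output names are assumptions (`HDMI1` as above, and `LVDS1` for the panel; `xrandr -q` shows the real names), and the helper names and delay are made up for illustration.

```python
import subprocess
import time


def switch_commands(external="HDMI1", internal="LVDS1"):
    """Return the two xrandr invocations that flip between the outputs."""
    to_internal = ["xrandr", "--output", internal, "--auto",
                   "--output", external, "--off"]
    to_external = ["xrandr", "--output", external, "--auto",
                   "--output", internal, "--off"]
    return [to_internal, to_external]


def run_cycles(cycles, delay=5.0, run=subprocess.run):
    """Switch to the internal panel and back `cycles` times, pausing between."""
    for _ in range(cycles):
        for argv in switch_commands():
            run(argv, check=True)
            time.sleep(delay)
```

Calling `run_cycles(10)` from a terminal inside the X session would perform ten round trips. The `run` parameter exists only so the loop can be exercised without touching the display.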


Regards
 
-- 
WBR, Dmitry

