linux-kernel.vger.kernel.org archive mirror
* 4.11-rc0, thinkpad x220: GPU hang
@ 2017-02-28 14:34 Pavel Machek
  2017-02-28 15:02 ` Chris Wilson
  0 siblings, 1 reply; 12+ messages in thread
From: Pavel Machek @ 2017-02-28 14:34 UTC (permalink / raw)
  To: kernel list, daniel.vetter, jani.nikula, intel-gfx, dri-devel

Hi!

mplayer stopped working after a while. Dmesg says:

[ 3000.266533] cdc_ether 2-1.2:1.0 usb0: register 'cdc_ether' at usb-0000:00:1d.0-1.2, CDC Ethernet Device, 22:1b:e4:4e:56:f5
[ 3190.767227] [drm] GPU HANG: ecode 6:0:0xbb409fff, in chromium [4597], reason: Hang on render ring, action: reset
[ 3190.767311] [drm] GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.
[ 3190.767313] [drm] Please file a _new_ bug report on bugs.freedesktop.org against DRI -> DRM/Intel
[ 3190.767315] [drm] drm/i915 developers can then reassign to the right component if it's not a kernel issue.
[ 3190.767317] [drm] The gpu crash dump is required to analyze gpu hangs, so please always attach it.
[ 3190.767320] [drm] GPU crash dump saved to /sys/class/drm/card0/error
[ 3190.767427] drm/i915: Resetting chip after gpu hang
[ 3228.329384] cdc_ether 2-1.2:1.0 usb0: kevent 12 may have been dropped
[ 3228.329604] cdc_ether 2-1.2:1.0 usb0: kevent 12 may have been dropped
[ 3877.246261] perf: interrupt took too long (3142 > 3133), lowering kernel.perf_event_max_sample_rate to 63500
[ 4802.784478] drm/i915: Resetting chip after gpu hang
[ 4810.784851] drm/i915: Resetting chip after gpu hang
[ 4829.829795] drm/i915: Resetting chip after gpu hang
[ 4837.826154] drm/i915: Resetting chip after gpu hang
[ 5125.026814] [drm:intel_pipe_update_end] *ERROR* Atomic update failure on pipe A (start=308257 end=308258) time 203 us, min 763, max 767, scanline start 761, end 771
[ 5125.192602] [drm:intel_pipe_update_end] *ERROR* Atomic update failure on pipe B (start=307385 end=307386) time 204 us, min 1073, max 1079, scanline start 1071, end 1086
[ 5125.309992] [drm:intel_pipe_update_end] *ERROR* Atomic update failure on pipe A (start=308274 end=308275) time 203 us, min 763, max 767, scanline start 758, end 768
[ 5125.460013] [drm:intel_pipe_update_end] *ERROR* Atomic update failure on pipe A (start=308283 end=308284) time 204 us, min 763, max 767, scanline start 761, end 771
[ 5125.493340] [drm:intel_pipe_update_end] *ERROR* Atomic update failure on pipe A (start=308285 end=308286) time 202 us, min 763, max 767, scanline start 761, end 771
[ 5125.526684] [drm:intel_pipe_update_end] *ERROR* Atomic update failure on pipe A (start=308287 end=308288) time 204 us, min 763, max 767, scanline start 762, end 772
[ 5125.593245] [drm:intel_pipe_update_end] *ERROR* Atomic update failure on pipe A (start=308291 end=308292) time 203 us, min 763, max 767, scanline start 758, end 768
[ 5125.676636] [drm:intel_pipe_update_end] *ERROR* Atomic update failure on pipe A (start=308296 end=308297) time 202 us, min 763, max 767, scanline start 762, end 772
[ 5125.709960] [drm:intel_pipe_update_end] *ERROR* Atomic update failure on pipe A (start=308298 end=308299) time 203 us, min 763, max 767, scanline start 762, end 772
[ 5126.093109] [drm:intel_pipe_update_end] *ERROR* Atomic update failure on pipe A (start=308321 end=308322) time 204 us, min 763, max 767, scanline start 759, end 770
[ 5647.879171] drm/i915: Resetting chip after gpu hang
[ 5655.879507] drm/i915: Resetting chip after gpu hang
[ 5850.864464] drm/i915: Resetting chip after gpu hang
[ 5858.864853] drm/i915: Resetting chip after gpu hang
[ 5904.850879] drm/i915: Resetting chip after gpu hang
[ 5912.851252] drm/i915: Resetting chip after gpu hang
[ 5949.876973] drm/i915: Resetting chip after gpu hang
[ 5957.877460] drm/i915: Resetting chip after gpu hang
[ 6018.872153] drm/i915: Resetting chip after gpu hang
[ 6030.872646] drm/i915: Resetting chip after gpu hang
[ 7108.362610] perf: interrupt took too long (3935 > 3927), lowering kernel.perf_event_max_sample_rate to 50750
[ 9670.047072] drm/i915: Resetting chip after gpu hang
[ 9678.047415] drm/i915: Resetting chip after gpu hang
[10408.064806] drm/i915: Resetting chip after gpu hang
[10416.097168] drm/i915: Resetting chip after gpu hang
[10416.097181] [drm:i915_reset] *ERROR* GPU recovery failed
pavel@duo:/data/film$

Umm. Dmesg wants me to attach card0/error, but it looks like it
contains quite a lot of data. If it contains actual framebuffer
content, it may not be wise to post it to the mailing list...

Best regards,
									Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: 4.11-rc0, thinkpad x220: GPU hang
  2017-02-28 14:34 4.11-rc0, thinkpad x220: GPU hang Pavel Machek
@ 2017-02-28 15:02 ` Chris Wilson
  2017-03-05 23:01   ` [regression] " Pavel Machek
  0 siblings, 1 reply; 12+ messages in thread
From: Chris Wilson @ 2017-02-28 15:02 UTC (permalink / raw)
  To: Pavel Machek
  Cc: kernel list, daniel.vetter, jani.nikula, intel-gfx, dri-devel

On Tue, Feb 28, 2017 at 03:34:53PM +0100, Pavel Machek wrote:
> Hi!
> 
> mplayer stopped working after a while. Dmesg says:
> 
> [ 3000.266533] cdc_ether 2-1.2:1.0 usb0: register 'cdc_ether' at usb-0000:00:1d.0-1.2, CDC Ethernet Device, 22:1b:e4:4e:56:f5
> [ 3190.767227] [drm] GPU HANG: ecode 6:0:0xbb409fff, in chromium [4597], reason: Hang on render ring, action: reset
> [ 3190.767311] [drm] GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.
> [ 3190.767313] [drm] Please file a _new_ bug report on bugs.freedesktop.org against DRI -> DRM/Intel
> [ 3190.767315] [drm] drm/i915 developers can then reassign to the right component if it's not a kernel issue.
> [ 3190.767317] [drm] The gpu crash dump is required to analyze gpu hangs, so please always attach it.
> [ 3190.767320] [drm] GPU crash dump saved to /sys/class/drm/card0/error
> [ 3190.767427] drm/i915: Resetting chip after gpu hang
> [ 3228.329384] cdc_ether 2-1.2:1.0 usb0: kevent 12 may have been dropped
> [ 3228.329604] cdc_ether 2-1.2:1.0 usb0: kevent 12 may have been dropped
> [ 3877.246261] perf: interrupt took too long (3142 > 3133), lowering kernel.perf_event_max_sample_rate to 63500
> [ 4802.784478] drm/i915: Resetting chip after gpu hang
> [ 4810.784851] drm/i915: Resetting chip after gpu hang
> [ 4829.829795] drm/i915: Resetting chip after gpu hang
> [ 4837.826154] drm/i915: Resetting chip after gpu hang
> [ 5125.026814] [drm:intel_pipe_update_end] *ERROR* Atomic update failure on pipe A (start=308257 end=308258) time 203 us, min 763, max 767, scanline start 761, end 771
> [ 5125.192602] [drm:intel_pipe_update_end] *ERROR* Atomic update failure on pipe B (start=307385 end=307386) time 204 us, min 1073, max 1079, scanline start 1071, end 1086
> [ 5125.309992] [drm:intel_pipe_update_end] *ERROR* Atomic update failure on pipe A (start=308274 end=308275) time 203 us, min 763, max 767, scanline start 758, end 768
> [ 5125.460013] [drm:intel_pipe_update_end] *ERROR* Atomic update failure on pipe A (start=308283 end=308284) time 204 us, min 763, max 767, scanline start 761, end 771
> [ 5125.493340] [drm:intel_pipe_update_end] *ERROR* Atomic update failure on pipe A (start=308285 end=308286) time 202 us, min 763, max 767, scanline start 761, end 771
> [ 5125.526684] [drm:intel_pipe_update_end] *ERROR* Atomic update failure on pipe A (start=308287 end=308288) time 204 us, min 763, max 767, scanline start 762, end 772
> [ 5125.593245] [drm:intel_pipe_update_end] *ERROR* Atomic update failure on pipe A (start=308291 end=308292) time 203 us, min 763, max 767, scanline start 758, end 768
> [ 5125.676636] [drm:intel_pipe_update_end] *ERROR* Atomic update failure on pipe A (start=308296 end=308297) time 202 us, min 763, max 767, scanline start 762, end 772
> [ 5125.709960] [drm:intel_pipe_update_end] *ERROR* Atomic update failure on pipe A (start=308298 end=308299) time 203 us, min 763, max 767, scanline start 762, end 772
> [ 5126.093109] [drm:intel_pipe_update_end] *ERROR* Atomic update failure on pipe A (start=308321 end=308322) time 204 us, min 763, max 767, scanline start 759, end 770
> [ 5647.879171] drm/i915: Resetting chip after gpu hang
> [ 5655.879507] drm/i915: Resetting chip after gpu hang
> [ 5850.864464] drm/i915: Resetting chip after gpu hang
> [ 5858.864853] drm/i915: Resetting chip after gpu hang
> [ 5904.850879] drm/i915: Resetting chip after gpu hang
> [ 5912.851252] drm/i915: Resetting chip after gpu hang
> [ 5949.876973] drm/i915: Resetting chip after gpu hang
> [ 5957.877460] drm/i915: Resetting chip after gpu hang
> [ 6018.872153] drm/i915: Resetting chip after gpu hang
> [ 6030.872646] drm/i915: Resetting chip after gpu hang
> [ 7108.362610] perf: interrupt took too long (3935 > 3927), lowering kernel.perf_event_max_sample_rate to 50750
> [ 9670.047072] drm/i915: Resetting chip after gpu hang
> [ 9678.047415] drm/i915: Resetting chip after gpu hang
> [10408.064806] drm/i915: Resetting chip after gpu hang
> [10416.097168] drm/i915: Resetting chip after gpu hang
> [10416.097181] [drm:i915_reset] *ERROR* GPU recovery failed
> pavel@duo:/data/film$
> 
> Umm. Dmesg wants me to attach card0/error, but it looks like it
> contains quite a lot of data. If it contains actual framebuffer
> content, it may not be wise to post it to the mailing list...

It contains command and register states. No pixel data unless userspace
got particularly creative with its memory corruption.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [regression] Re: 4.11-rc0, thinkpad x220: GPU hang
  2017-02-28 15:02 ` Chris Wilson
@ 2017-03-05 23:01   ` Pavel Machek
  2017-03-06 11:15     ` Chris Wilson
  2017-03-14  9:08     ` Thorsten Leemhuis
  0 siblings, 2 replies; 12+ messages in thread
From: Pavel Machek @ 2017-03-05 23:01 UTC (permalink / raw)
  To: Chris Wilson, kernel list, daniel.vetter, jani.nikula, intel-gfx,
	dri-devel

Hi!

> > mplayer stopped working after a while. Dmesg says:
> > 
> > [ 3000.266533] cdc_ether 2-1.2:1.0 usb0: register 'cdc_ether' at

Now I'm pretty sure it is a regression in v4.11-rc0. Any ideas what to
try? Bisect will be slow and nasty :-(.

									Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [regression] Re: 4.11-rc0, thinkpad x220: GPU hang
  2017-03-05 23:01   ` [regression] " Pavel Machek
@ 2017-03-06 11:15     ` Chris Wilson
  2017-03-06 11:47       ` Chris Wilson
  2017-03-06 12:10       ` Pavel Machek
  2017-03-14  9:08     ` Thorsten Leemhuis
  1 sibling, 2 replies; 12+ messages in thread
From: Chris Wilson @ 2017-03-06 11:15 UTC (permalink / raw)
  To: Pavel Machek
  Cc: kernel list, daniel.vetter, jani.nikula, intel-gfx, dri-devel

On Mon, Mar 06, 2017 at 12:01:51AM +0100, Pavel Machek wrote:
> Hi!
> 
> > > mplayer stopped working after a while. Dmesg says:
> > > 
> > > [ 3000.266533] cdc_ether 2-1.2:1.0 usb0: register 'cdc_ether' at
> 
> Now I'm pretty sure it is a regression in v4.11-rc0. Any ideas what to
> try? Bisect will be slow and nasty :-(.

I came to the conclusion that #99671 is the ring HEAD overtaking the TAIL,
and under the presumption that your bug matches (as the symptoms do):

diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 4ffa35faff49..62e31a7438ac 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -782,10 +782,10 @@ static void i9xx_submit_request(struct drm_i915_gem_request *request)
 {
        struct drm_i915_private *dev_priv = request->i915;
 
-       i915_gem_request_submit(request);
-
        GEM_BUG_ON(!IS_ALIGNED(request->tail, 8));
        I915_WRITE_TAIL(request->engine, request->tail);
+
+       i915_gem_request_submit(request);
 }
 
 static void i9xx_emit_breadcrumb(struct drm_i915_gem_request *req, u32 *cs)


-- 
Chris Wilson, Intel Open Source Technology Centre

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [regression] Re: 4.11-rc0, thinkpad x220: GPU hang
  2017-03-06 11:15     ` Chris Wilson
@ 2017-03-06 11:47       ` Chris Wilson
  2017-03-06 12:10       ` Pavel Machek
  1 sibling, 0 replies; 12+ messages in thread
From: Chris Wilson @ 2017-03-06 11:47 UTC (permalink / raw)
  To: Pavel Machek, kernel list, daniel.vetter, jani.nikula, intel-gfx,
	dri-devel

On Mon, Mar 06, 2017 at 11:15:28AM +0000, Chris Wilson wrote:
> On Mon, Mar 06, 2017 at 12:01:51AM +0100, Pavel Machek wrote:
> > Hi!
> > 
> > > > mplayer stopped working after a while. Dmesg says:
> > > > 
> > > > [ 3000.266533] cdc_ether 2-1.2:1.0 usb0: register 'cdc_ether' at
> > 
> > Now I'm pretty sure it is a regression in v4.11-rc0. Any ideas what to
> > try? Bisect will be slow and nasty :-(.
> 
> I came to the conclusion that #99671 is the ring HEAD overtaking the TAIL,
> and under the presumption that your bug matches (as the symptoms do):
> 
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
> index 4ffa35faff49..62e31a7438ac 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> @@ -782,10 +782,10 @@ static void i9xx_submit_request(struct drm_i915_gem_request *request)
>  {
>         struct drm_i915_private *dev_priv = request->i915;
>  
> -       i915_gem_request_submit(request);
> -
>         GEM_BUG_ON(!IS_ALIGNED(request->tail, 8));
>         I915_WRITE_TAIL(request->engine, request->tail);
> +
> +       i915_gem_request_submit(request);

Hmm. request->tail is not set until i915_gem_request_submit(). Uh oh.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

^ permalink raw reply	[flat|nested] 12+ messages in thread
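
The ordering problem noted in the message above can be illustrated with a
small toy model. This is only a sketch under stated assumptions: every name
here (toy_ring, toy_request, toy_request_submit, toy_write_tail, the `end`
field) is hypothetical and is not i915 API. It assumes, as the message says,
that the request's tail only becomes valid once the submit step has run, and
models the TAIL register as a plain field.

```c
/* Toy model of a command ring; none of these names are real i915 API. */
#include <assert.h>
#include <stdint.h>

#define RING_SIZE 64u   /* bytes; a power of two so offsets can be masked */

struct toy_ring {
	uint32_t head;      /* offset the consumer (the GPU) has read up to */
	uint32_t hw_tail;   /* last value written to the modelled TAIL register */
};

struct toy_request {
	uint32_t end;       /* hypothetical: where this request's commands end */
	uint32_t tail;      /* only meaningful after toy_request_submit() */
};

/* Stand-in for the submit step that fills in the request's tail. */
static void toy_request_submit(struct toy_request *rq)
{
	rq->tail = rq->end;
}

/* Stand-in for the TAIL register write, with an 8-byte alignment check. */
static void toy_write_tail(struct toy_ring *ring, uint32_t tail)
{
	assert((tail & 7u) == 0);
	ring->hw_tail = tail & (RING_SIZE - 1);
}

/* Bytes the consumer has been told about: distance from head to tail. */
static uint32_t toy_ring_pending(const struct toy_ring *ring)
{
	return (ring->hw_tail - ring->head) & (RING_SIZE - 1);
}

/* Existing order: submit first, then publish the tail. */
static void submit_then_write(struct toy_ring *ring, struct toy_request *rq)
{
	toy_request_submit(rq);
	toy_write_tail(ring, rq->tail);
}

/* Order in the proposed patch: the tail is published before it is set. */
static void write_then_submit(struct toy_ring *ring, struct toy_request *rq)
{
	toy_write_tail(ring, rq->tail);   /* reads a stale value */
	toy_request_submit(rq);
}
```

In this model, a request with `end = 16` and a stale `tail = 0` leaves 16
bytes pending after submit_then_write(), but 0 bytes pending after
write_then_submit(): the consumer is never told about the new commands. That
is one plausible reading of why the reordered patch could not work as posted;
the real driver may differ in detail.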

* Re: [regression] Re: 4.11-rc0, thinkpad x220: GPU hang
  2017-03-06 11:15     ` Chris Wilson
  2017-03-06 11:47       ` Chris Wilson
@ 2017-03-06 12:10       ` Pavel Machek
  2017-03-06 12:23         ` Chris Wilson
  1 sibling, 1 reply; 12+ messages in thread
From: Pavel Machek @ 2017-03-06 12:10 UTC (permalink / raw)
  To: Chris Wilson, kernel list, daniel.vetter, jani.nikula, intel-gfx,
	dri-devel

On Mon 2017-03-06 11:15:28, Chris Wilson wrote:
> On Mon, Mar 06, 2017 at 12:01:51AM +0100, Pavel Machek wrote:
> > Hi!
> > 
> > > > mplayer stopped working after a while. Dmesg says:
> > > > 
> > > > [ 3000.266533] cdc_ether 2-1.2:1.0 usb0: register 'cdc_ether' at
> > 
> > Now I'm pretty sure it is a regression in v4.11-rc0. Any ideas what to
> > try? Bisect will be slow and nasty :-(.
> 
> I came to the conclusion that #99671 is the ring HEAD overtaking the TAIL,
> and under the presumption that your bug matches (as the symptoms do):
> 
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
> index 4ffa35faff49..62e31a7438ac 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> @@ -782,10 +782,10 @@ static void i9xx_submit_request(struct drm_i915_gem_request *request)
>  {
>         struct drm_i915_private *dev_priv = request->i915;
>  
> -       i915_gem_request_submit(request);
> -
>         GEM_BUG_ON(!IS_ALIGNED(request->tail, 8));
>         I915_WRITE_TAIL(request->engine, request->tail);
> +
> +       i915_gem_request_submit(request);
>  }
>  
>  static void i9xx_emit_breadcrumb(struct drm_i915_gem_request *req, u32 *cs)

I applied it as:

diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 91bc4ab..9c49c7a 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -1338,9 +1338,9 @@ static void i9xx_submit_request(struct drm_i915_gem_request *request)
 {
 	struct drm_i915_private *dev_priv = request->i915;
 
-	i915_gem_request_submit(request);
-
 	I915_WRITE_TAIL(request->engine, request->tail);
+
+	i915_gem_request_submit(request);
 }
 
 static void i9xx_emit_breadcrumb(struct drm_i915_gem_request *req,

Hmm. But your next mail suggests that it may not be smart to try to
boot it? :-).

										Pavel

-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [regression] Re: 4.11-rc0, thinkpad x220: GPU hang
  2017-03-06 12:10       ` Pavel Machek
@ 2017-03-06 12:23         ` Chris Wilson
  2017-03-21 14:02           ` Pavel Machek
                             ` (2 more replies)
  0 siblings, 3 replies; 12+ messages in thread
From: Chris Wilson @ 2017-03-06 12:23 UTC (permalink / raw)
  To: Pavel Machek
  Cc: kernel list, daniel.vetter, jani.nikula, intel-gfx, dri-devel

On Mon, Mar 06, 2017 at 01:10:48PM +0100, Pavel Machek wrote:
> On Mon 2017-03-06 11:15:28, Chris Wilson wrote:
> > On Mon, Mar 06, 2017 at 12:01:51AM +0100, Pavel Machek wrote:
> > > Hi!
> > > 
> > > > > mplayer stopped working after a while. Dmesg says:
> > > > > 
> > > > > [ 3000.266533] cdc_ether 2-1.2:1.0 usb0: register 'cdc_ether' at
> > > 
> > > Now I'm pretty sure it is a regression in v4.11-rc0. Any ideas what to
> > > try? Bisect will be slow and nasty :-(.
> > 
> > I came to the conclusion that #99671 is the ring HEAD overtaking the TAIL,
> > and under the presumption that your bug matches (as the symptoms do):
> > 
> > diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
> > index 4ffa35faff49..62e31a7438ac 100644
> > --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> > +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> > @@ -782,10 +782,10 @@ static void i9xx_submit_request(struct drm_i915_gem_request *request)
> >  {
> >         struct drm_i915_private *dev_priv = request->i915;
> >  
> > -       i915_gem_request_submit(request);
> > -
> >         GEM_BUG_ON(!IS_ALIGNED(request->tail, 8));
> >         I915_WRITE_TAIL(request->engine, request->tail);
> > +
> > +       i915_gem_request_submit(request);
> >  }
> >  
> >  static void i9xx_emit_breadcrumb(struct drm_i915_gem_request *req, u32 *cs)
> 
> I applied it as:
> 
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
> index 91bc4ab..9c49c7a 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> @@ -1338,9 +1338,9 @@ static void i9xx_submit_request(struct drm_i915_gem_request *request)
>  {
>  	struct drm_i915_private *dev_priv = request->i915;
>  
> -	i915_gem_request_submit(request);
> -
>  	I915_WRITE_TAIL(request->engine, request->tail);
> +
> +	i915_gem_request_submit(request);
>  }
>  
>  static void i9xx_emit_breadcrumb(struct drm_i915_gem_request *req,
> 
> Hmm. But your next mail suggests that it may not be smart to try to
> boot it? :-).

Don't bother, it'll promptly hang.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [regression] Re: 4.11-rc0, thinkpad x220: GPU hang
  2017-03-05 23:01   ` [regression] " Pavel Machek
  2017-03-06 11:15     ` Chris Wilson
@ 2017-03-14  9:08     ` Thorsten Leemhuis
  2017-03-14 11:35       ` Pavel Machek
  1 sibling, 1 reply; 12+ messages in thread
From: Thorsten Leemhuis @ 2017-03-14  9:08 UTC (permalink / raw)
  To: Pavel Machek, Chris Wilson, kernel list, daniel.vetter,
	jani.nikula, intel-gfx, dri-devel

On 06.03.2017 00:01, Pavel Machek wrote:
>>> mplayer stopped working after a while. Dmesg says:
>>>
>>> [ 3000.266533] cdc_ether 2-1.2:1.0 usb0: register 'cdc_ether' at
> Now I'm pretty sure it is a regression in v4.11-rc0. Any ideas what to
> try? Bisect will be slow and nasty :-(.

@Pavel, @Chris: What's the status of this?

I added this report to the list of regressions for Linux 4.11. I'll try
to watch this thread for further updates on this issue to document
progress in my weekly reports. Please let me know in case the discussion
moves to a different place (bugzilla or another mail thread for
example). tia!

Ciao, Thorsten

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [regression] Re: 4.11-rc0, thinkpad x220: GPU hang
  2017-03-14  9:08     ` Thorsten Leemhuis
@ 2017-03-14 11:35       ` Pavel Machek
  0 siblings, 0 replies; 12+ messages in thread
From: Pavel Machek @ 2017-03-14 11:35 UTC (permalink / raw)
  To: Thorsten Leemhuis
  Cc: Chris Wilson, kernel list, daniel.vetter, jani.nikula, intel-gfx,
	dri-devel

On Tue 2017-03-14 10:08:23, Thorsten Leemhuis wrote:
> On 06.03.2017 00:01, Pavel Machek wrote:
> >>> mplayer stopped working after a while. Dmesg says:
> >>>
> >>> [ 3000.266533] cdc_ether 2-1.2:1.0 usb0: register 'cdc_ether' at
> > Now I'm pretty sure it is a regression in v4.11-rc0. Any ideas what to
> > try? Bisect will be slow and nasty :-(.
> 
> @Pavel, @Chris: What's the status of this?
> 
> I added this report to the list of regressions for Linux 4.11. I'll try
> to watch this thread for further updates on this issue to document
> progress in my weekly reports. Please let me know in case the discussion
> moves to a different place (bugzilla or another mail thread for
> example). tia!

We know where the bug is, but there's no fix for it. There was one patch, but
it was quickly withdrawn.

									Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [regression] Re: 4.11-rc0, thinkpad x220: GPU hang
  2017-03-06 12:23         ` Chris Wilson
@ 2017-03-21 14:02           ` Pavel Machek
  2017-03-25 21:33           ` Pavel Machek
  2017-04-09 10:33           ` Pavel Machek
  2 siblings, 0 replies; 12+ messages in thread
From: Pavel Machek @ 2017-03-21 14:02 UTC (permalink / raw)
  To: Chris Wilson, kernel list, daniel.vetter, jani.nikula, intel-gfx,
	dri-devel

Hi!

> > > > > > mplayer stopped working after a while. Dmesg says:
> > > > > > 
> > > > > > [ 3000.266533] cdc_ether 2-1.2:1.0 usb0: register 'cdc_ether' at
> > > > 
> > > > Now I'm pretty sure it is a regression in v4.11-rc0. Any ideas what to
> > > > try? Bisect will be slow and nasty :-(.
> > > 
> > > I came to the conclusion that #99671 is the ring HEAD overtaking the TAIL,
> > > and under the presumption that your bug matches (as the symptoms do):
> > > 
> > > diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
> > > index 4ffa35faff49..62e31a7438ac 100644
> > > --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> > > +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> > > @@ -782,10 +782,10 @@ static void i9xx_submit_request(struct drm_i915_gem_request *request)
> > >  {
> > >         struct drm_i915_private *dev_priv = request->i915;
> > >  
> > > -       i915_gem_request_submit(request);
> > > -
> > >         GEM_BUG_ON(!IS_ALIGNED(request->tail, 8));
> > >         I915_WRITE_TAIL(request->engine, request->tail);
> > > +
> > > +       i915_gem_request_submit(request);
> > >  }
> > >  
> > >  static void i9xx_emit_breadcrumb(struct drm_i915_gem_request *req, u32 *cs)
> > 
> > I applied it as:
> > 
> > diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
> > index 91bc4ab..9c49c7a 100644
> > --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> > +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> > @@ -1338,9 +1338,9 @@ static void i9xx_submit_request(struct drm_i915_gem_request *request)
> >  {
> >  	struct drm_i915_private *dev_priv = request->i915;
> >  
> > -	i915_gem_request_submit(request);
> > -
> >  	I915_WRITE_TAIL(request->engine, request->tail);
> > +
> > +	i915_gem_request_submit(request);
> >  }
> >  
> >  static void i9xx_emit_breadcrumb(struct drm_i915_gem_request *req,
> > 
> > Hmm. But your next mail suggests that it may not be smart to try to
> > boot it? :-).
> 
> Don't bother, it'll promptly hang.

Any news here?

Is there something I can revert to get back to a working system?

Thanks,
									Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [regression] Re: 4.11-rc0, thinkpad x220: GPU hang
  2017-03-06 12:23         ` Chris Wilson
  2017-03-21 14:02           ` Pavel Machek
@ 2017-03-25 21:33           ` Pavel Machek
  2017-04-09 10:33           ` Pavel Machek
  2 siblings, 0 replies; 12+ messages in thread
From: Pavel Machek @ 2017-03-25 21:33 UTC (permalink / raw)
  To: Chris Wilson, kernel list, daniel.vetter, jani.nikula, intel-gfx,
	dri-devel

On Mon 2017-03-06 12:23:41, Chris Wilson wrote:
> On Mon, Mar 06, 2017 at 01:10:48PM +0100, Pavel Machek wrote:
> > On Mon 2017-03-06 11:15:28, Chris Wilson wrote:
> > > On Mon, Mar 06, 2017 at 12:01:51AM +0100, Pavel Machek wrote:
> > > > Hi!
> > > > 
> > > > > > mplayer stopped working after a while. Dmesg says:
> > > > > > 
> > > > > > [ 3000.266533] cdc_ether 2-1.2:1.0 usb0: register 'cdc_ether' at
> > > > 
> > > > Now I'm pretty sure it is a regression in v4.11-rc0. Any ideas what to
> > > > try? Bisect will be slow and nasty :-(.
> > > 
> > > I came to the conclusion that #99671 is the ring HEAD overtaking the TAIL,
> > > and under the presumption that your bug matches (as the symptoms do):
> > > 
...
> >  static void i9xx_emit_breadcrumb(struct drm_i915_gem_request *req,
> > 
> > Hmm. But your next mail suggests that it may not be smart to try to
> > boot it? :-).
> 
> Don't bother, it'll promptly hang.

Any news here? Is there a chance this is fixed in -rc4?
									Pavel

-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [regression] Re: 4.11-rc0, thinkpad x220: GPU hang
  2017-03-06 12:23         ` Chris Wilson
  2017-03-21 14:02           ` Pavel Machek
  2017-03-25 21:33           ` Pavel Machek
@ 2017-04-09 10:33           ` Pavel Machek
  2 siblings, 0 replies; 12+ messages in thread
From: Pavel Machek @ 2017-04-09 10:33 UTC (permalink / raw)
  To: Chris Wilson, kernel list, daniel.vetter, jani.nikula, intel-gfx,
	dri-devel

On Mon 2017-03-06 12:23:41, Chris Wilson wrote:
> On Mon, Mar 06, 2017 at 01:10:48PM +0100, Pavel Machek wrote:
> > On Mon 2017-03-06 11:15:28, Chris Wilson wrote:
> > > On Mon, Mar 06, 2017 at 12:01:51AM +0100, Pavel Machek wrote:
> > > > Hi!
> > > > 
> > > > > > mplayer stopped working after a while. Dmesg says:
> > > > > > 
> > > > > > [ 3000.266533] cdc_ether 2-1.2:1.0 usb0: register 'cdc_ether' at
> > > > 
> > > > Now I'm pretty sure it is a regression in v4.11-rc0. Any ideas what to
> > > > try? Bisect will be slow and nasty :-(.
> > > 
> > > I came to the conclusion that #99671 is the ring HEAD overtaking the TAIL,
> > > and under the presumption that your bug matches (as the symptoms do):
> > > 
> > > diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
> > > index 4ffa35faff49..62e31a7438ac 100644
> > > --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> > > +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> > > @@ -782,10 +782,10 @@ static void i9xx_submit_request(struct drm_i915_gem_request *request)
> > >  {
> > >         struct drm_i915_private *dev_priv = request->i915;
> > >  
> > > -       i915_gem_request_submit(request);
> > > -
> > >         GEM_BUG_ON(!IS_ALIGNED(request->tail, 8));
> > >         I915_WRITE_TAIL(request->engine, request->tail);
> > > +
> > > +       i915_gem_request_submit(request);
> > >  }
> > >  
> > >  static void i9xx_emit_breadcrumb(struct drm_i915_gem_request *req, u32 *cs)
> > 
> > I applied it as:
> > 
> > diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
> > index 91bc4ab..9c49c7a 100644
> > --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> > +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> > @@ -1338,9 +1338,9 @@ static void i9xx_submit_request(struct drm_i915_gem_request *request)
> >  {
> >  	struct drm_i915_private *dev_priv = request->i915;
> >  
> > -	i915_gem_request_submit(request);
> > -
> >  	I915_WRITE_TAIL(request->engine, request->tail);
> > +
> > +	i915_gem_request_submit(request);
> >  }
> >  
> >  static void i9xx_emit_breadcrumb(struct drm_i915_gem_request *req,
> > 
> > Hmm. But your next mail suggests that it may not be smart to try to
> > boot it? :-).
> 
> Don't bother, it'll promptly hang.

Any news here? 4.11-rc5 is actually usable on the hardware (unlike
-rc1); I'm not sure what changed.

-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

^ permalink raw reply	[flat|nested] 12+ messages in thread

Thread overview: 12+ messages
2017-02-28 14:34 4.11-rc0, thinkpad x220: GPU hang Pavel Machek
2017-02-28 15:02 ` Chris Wilson
2017-03-05 23:01   ` [regression] " Pavel Machek
2017-03-06 11:15     ` Chris Wilson
2017-03-06 11:47       ` Chris Wilson
2017-03-06 12:10       ` Pavel Machek
2017-03-06 12:23         ` Chris Wilson
2017-03-21 14:02           ` Pavel Machek
2017-03-25 21:33           ` Pavel Machek
2017-04-09 10:33           ` Pavel Machek
2017-03-14  9:08     ` Thorsten Leemhuis
2017-03-14 11:35       ` Pavel Machek
