linux-kernel.vger.kernel.org archive mirror
* i915 driver crashes on T540p if docking station attached
@ 2015-07-30  0:49 Theodore Ts'o
  2015-07-30  1:39 ` [REGRESSION] " Theodore Ts'o
  0 siblings, 1 reply; 19+ messages in thread
From: Theodore Ts'o @ 2015-07-30  0:49 UTC
  To: intel-gfx, dri-devel
  Cc: Daniel Vetter, Mani Nikula, Ander Conselvan de Oliveira, linux-kernel


Unfortunately the failure causes a series of recursive faults and I
haven't been able to capture the stack trace, but on 4.2-rcX kernels,
I can reliably cause the system to crash if my T540p is booted with
the docking station attached.

It will also crash if I boot the system first, and then insert the
laptop into the dockstation.

Unfortunately, I can't get a stack trace because there are a huge
number of recursive/double faults, and the system dies so quickly that
nothing ends up in the log files.  If you really need a stack dump I
can try to rig something, but modern Laptops don't have serial
consoles any more, alas, so it's a bit of a pain.

I was able to bisect it down to this commit, however: 8c7b5ccb72987:
"drm/i915: Use atomic helpers for computing changed flags"

Is there any chance Intel could add a Lenovo Dockstation with a
Multistream DP output to your test hardware?  Unfortunately it
seems pretty common that I see regressions with my particular
hardware.  Maybe there aren't enough people using Thinkpads any more?  :-(

					- Ted


P.S.  The git bisect log

git bisect start
# bad: [421d125c06c4be4c5005cb69840206bd09b71dd6] builddeb: sign the modules after splitting out the debuginfo files
git bisect bad 421d125c06c4be4c5005cb69840206bd09b71dd6
# good: [b953c0d234bc72e8489d3bf51a276c5c4ec85345] Linux 4.1
git bisect good b953c0d234bc72e8489d3bf51a276c5c4ec85345
# good: [aeaa2122af4e53f3bfd28e8f294557bb95af43fc] drm/i915/skl: Add the INIT power domain to the MISC I/O power well
git bisect good aeaa2122af4e53f3bfd28e8f294557bb95af43fc
# bad: [4d70f38a760ad2879d2ebd84001c92980180f630] drm/i915/bios: remove a redundant NULL pointer check
git bisect bad 4d70f38a760ad2879d2ebd84001c92980180f630
# bad: [27a1b688d9f1fa2abd14bfe6a8729a19fb3b1b25] drm/i915/bxt: Enable WaEnableYV12BugFixInHalfSliceChicken7 for Broxton
git bisect bad 27a1b688d9f1fa2abd14bfe6a8729a19fb3b1b25
# good: [4be0731786de10d0e9ae1d159504c83c6b052647] drm/i915: Add crtc states before calling compute_config()
git bisect good 4be0731786de10d0e9ae1d159504c83c6b052647
# good: [d5432a9d19b61ba6a2b3d88f3026e0ca60eb57a1] drm/i915: Stage new modeset state straight into atomic state
git bisect good d5432a9d19b61ba6a2b3d88f3026e0ca60eb57a1
# bad: [a821fc46bc7bb6d4cf9a5f8d2787fd70231c2c10] drm/i915: Swap atomic state in legacy modeset
git bisect bad a821fc46bc7bb6d4cf9a5f8d2787fd70231c2c10
# bad: [8c7b5ccb729870e606321b3703e2c2e698c49a95] drm/i915: Use atomic helpers for computing changed flags
git bisect bad 8c7b5ccb729870e606321b3703e2c2e698c49a95
# good: [0f63cca2afdc38877e86acfa9821020f6e2213fd] drm/i915: Update crtc state active flag based on DPMS
git bisect good 0f63cca2afdc38877e86acfa9821020f6e2213fd
# good: [840bfe953384a134c8639f2964d9b74bfa671e16] drm/atomic: Make mode_fixup() optional for check_modeset()
git bisect good 840bfe953384a134c8639f2964d9b74bfa671e16
# first bad commit: [8c7b5ccb729870e606321b3703e2c2e698c49a95] drm/i915: Use atomic helpers for computing changed flags



* [REGRESSION] Re: i915 driver crashes on T540p if docking station attached
  2015-07-30  0:49 i915 driver crashes on T540p if docking station attached Theodore Ts'o
@ 2015-07-30  1:39 ` Theodore Ts'o
  2015-07-30  5:18   ` Linus Torvalds
  0 siblings, 1 reply; 19+ messages in thread
From: Theodore Ts'o @ 2015-07-30  1:39 UTC
  To: intel-gfx, dri-devel, Daniel Vetter, Mani Nikula,
	Ander Conselvan de Oliveira, linux-kernel, torvalds

On Wed, Jul 29, 2015 at 08:49:37PM -0400, Theodore Ts'o wrote:
> 
> Unfortunately the failure causes a series of recursive faults and I
> haven't been able to capture the stack trace, but on 4.2-rcX kernels,
> I can reliably cause the system to crash if my T540p is booted with
> the docking station attached.
> 
> It will also crash if I boot the system first, and then insert the
> laptop into the dockstation.
> 
> Unfortunately, I can't get a stack trace because there are a huge
> number of recursive/double faults, and the system dies so quickly that
> nothing ends up in the log files.  If you really need a stack dump I
> can try to rig something, but modern Laptops don't have serial
> consoles any more, alas, so it's a bit of a pain.

The bad news is that I tried to use kdump to capture a crashdump and
hopefully get more information, and kdump utterly wedged on the panic.
The good news is that because it wedged the system, I was able to get the
console stackdump before it scrolled off due to a whole series of
recursive oops messages.

It's here:  https://goo.gl/photos/xHjn2Z97JQEw6k2C9

Hopefully this is useful.  It's not obvious how to revert this change,
since there were a large number of changes to i915 after this.  If
someone could help me with a revert, I'd be happy to test it.

Thanks,

						- Ted
						


> 
> I was able to bisect it down to this commit, however: 8c7b5ccb72987:
> "drm/i915: Use atomic helpers for computing changed flags"
> 
> Is there any chance Intel could add a Lenovo Dockstation with a
> Multistream DP output to your test hardware?  Unfortunately it
> seems pretty common that I see regressions with my particular
> hardware.  Maybe there aren't enough people using Thinkpads any more?  :-(
> 
> 					- Ted
> 
> 
> P.S.  The git bisect log
> 
> git bisect start
> # bad: [421d125c06c4be4c5005cb69840206bd09b71dd6] builddeb: sign the modules after splitting out the debuginfo files
> git bisect bad 421d125c06c4be4c5005cb69840206bd09b71dd6
> # good: [b953c0d234bc72e8489d3bf51a276c5c4ec85345] Linux 4.1
> git bisect good b953c0d234bc72e8489d3bf51a276c5c4ec85345
> # good: [aeaa2122af4e53f3bfd28e8f294557bb95af43fc] drm/i915/skl: Add the INIT power domain to the MISC I/O power well
> git bisect good aeaa2122af4e53f3bfd28e8f294557bb95af43fc
> # bad: [4d70f38a760ad2879d2ebd84001c92980180f630] drm/i915/bios: remove a redundant NULL pointer check
> git bisect bad 4d70f38a760ad2879d2ebd84001c92980180f630
> # bad: [27a1b688d9f1fa2abd14bfe6a8729a19fb3b1b25] drm/i915/bxt: Enable WaEnableYV12BugFixInHalfSliceChicken7 for Broxton
> git bisect bad 27a1b688d9f1fa2abd14bfe6a8729a19fb3b1b25
> # good: [4be0731786de10d0e9ae1d159504c83c6b052647] drm/i915: Add crtc states before calling compute_config()
> git bisect good 4be0731786de10d0e9ae1d159504c83c6b052647
> # good: [d5432a9d19b61ba6a2b3d88f3026e0ca60eb57a1] drm/i915: Stage new modeset state straight into atomic state
> git bisect good d5432a9d19b61ba6a2b3d88f3026e0ca60eb57a1
> # bad: [a821fc46bc7bb6d4cf9a5f8d2787fd70231c2c10] drm/i915: Swap atomic state in legacy modeset
> git bisect bad a821fc46bc7bb6d4cf9a5f8d2787fd70231c2c10
> # bad: [8c7b5ccb729870e606321b3703e2c2e698c49a95] drm/i915: Use atomic helpers for computing changed flags
> git bisect bad 8c7b5ccb729870e606321b3703e2c2e698c49a95
> # good: [0f63cca2afdc38877e86acfa9821020f6e2213fd] drm/i915: Update crtc state active flag based on DPMS
> git bisect good 0f63cca2afdc38877e86acfa9821020f6e2213fd
> # good: [840bfe953384a134c8639f2964d9b74bfa671e16] drm/atomic: Make mode_fixup() optional for check_modeset()
> git bisect good 840bfe953384a134c8639f2964d9b74bfa671e16
> # first bad commit: [8c7b5ccb729870e606321b3703e2c2e698c49a95] drm/i915: Use atomic helpers for computing changed flags
> 


* Re: [REGRESSION] Re: i915 driver crashes on T540p if docking station attached
  2015-07-30  1:39 ` [REGRESSION] " Theodore Ts'o
@ 2015-07-30  5:18   ` Linus Torvalds
  2015-07-30 11:16     ` Dave Airlie
  2015-07-30 14:40     ` Daniel Vetter
  0 siblings, 2 replies; 19+ messages in thread
From: Linus Torvalds @ 2015-07-30  5:18 UTC
  To: Theodore Ts'o, intel-gfx, DRI, Daniel Vetter, Mani Nikula,
	Ander Conselvan de Oliveira, Linux Kernel Mailing List,
	Linus Torvalds

[-- Attachment #1: Type: text/plain, Size: 1635 bytes --]

On Wed, Jul 29, 2015 at 6:39 PM, Theodore Ts'o <tytso@mit.edu> wrote:
>
> It's here:  https://goo.gl/photos/xHjn2Z97JQEw6k2C9

You didn't catch enough of the code line to decode the code, but it's
early enough in drm_crtc_index() (just five bytes in) that it's almost
certainly the very first dereference, so it's almost guaranteed to be
that

   crtc->dev

access as part of list_for_each_entry(), with crtc being NULL. And
yes, "->dev" is the very first field, so the offset is zero too (while
the "->mode_config" list access would not be at offset zero).
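
To illustrate the offset argument, here is a minimal userspace sketch. It
assumes a heavily simplified stand-in for the real structure: the mock_*
types and both helper functions are hypothetical names for this example
only, not the kernel's definitions. The point is just that a load of the
first field through a NULL pointer touches address zero, while a load of
any later field touches a non-zero address.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins; not the kernel's real layouts. */
struct mock_list_head {
	struct mock_list_head *next, *prev;
};

struct mock_device {
	int id;
};

struct mock_crtc {
	struct mock_device *dev;    /* first field, so it sits at offset 0 */
	struct mock_list_head head; /* later field, at a non-zero offset */
};

/* Address a load of crtc->dev would touch for the given crtc pointer. */
static uintptr_t dev_fault_address(const struct mock_crtc *crtc)
{
	return (uintptr_t)crtc + offsetof(struct mock_crtc, dev);
}

/* Address a load of crtc->head would touch for the given crtc pointer. */
static uintptr_t head_fault_address(const struct mock_crtc *crtc)
{
	return (uintptr_t)crtc + offsetof(struct mock_crtc, head);
}
```

Calling dev_fault_address(NULL) yields zero on common platforms, which
matches a fault at address zero, whereas head_fault_address(NULL) yields
the offset of the second field.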

And it looks like it is called from drm_atomic_helper_check_modeset():
the reason it has a question mark in the backtrace is because the
fault happens before the stack frame has even been set up.

There are multiple calls to "drm_crtc_index()" from that function, I
can't tell which one it is. Looking at the code generation I get, I
think it's because update_connector_routing() gets inlined, and that
one does several calls. Most of them look like this:

                if (connector->state->crtc) {
                        idx = drm_crtc_index(connector->state->crtc);

ie they check that the crtc is non-NULL, but that last one does not:

        connector_state->best_encoder = new_encoder;
        idx = drm_crtc_index(connector_state->crtc);

        crtc_state = state->crtc_states[idx];
        crtc_state->mode_changed = true;

and I suspect the fix might be something like the attached. Totally
untested. Ted?

This whole "atomic modeset" series has been one royal fuck-up, guys.
We've had too many of these kinds of crap issues.

                           Linus

[-- Attachment #2: patch.diff --]
[-- Type: text/plain, Size: 845 bytes --]

 drivers/gpu/drm/drm_atomic_helper.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/drm_atomic_helper.c b/drivers/gpu/drm/drm_atomic_helper.c
index 5b59d5ad7d1c..aac212297b49 100644
--- a/drivers/gpu/drm/drm_atomic_helper.c
+++ b/drivers/gpu/drm/drm_atomic_helper.c
@@ -230,10 +230,12 @@ update_connector_routing(struct drm_atomic_state *state, int conn_idx)
 	}
 
 	connector_state->best_encoder = new_encoder;
-	idx = drm_crtc_index(connector_state->crtc);
+	if (connector_state->crtc) {
+		idx = drm_crtc_index(connector_state->crtc);
 
-	crtc_state = state->crtc_states[idx];
-	crtc_state->mode_changed = true;
+		crtc_state = state->crtc_states[idx];
+		crtc_state->mode_changed = true;
+	}
 
 	DRM_DEBUG_ATOMIC("[CONNECTOR:%d:%s] using [ENCODER:%d:%s] on [CRTC:%d]\n",
 			 connector->base.id,


* Re: [REGRESSION] Re: i915 driver crashes on T540p if docking station attached
  2015-07-30  5:18   ` Linus Torvalds
@ 2015-07-30 11:16     ` Dave Airlie
  2015-07-30 14:40     ` Daniel Vetter
  1 sibling, 0 replies; 19+ messages in thread
From: Dave Airlie @ 2015-07-30 11:16 UTC
  To: Linus Torvalds
  Cc: Theodore Ts'o, intel-gfx, DRI, Daniel Vetter, Mani Nikula,
	Ander Conselvan de Oliveira, Linux Kernel Mailing List

On 30 July 2015 at 15:18, Linus Torvalds <torvalds@linux-foundation.org> wrote:
> On Wed, Jul 29, 2015 at 6:39 PM, Theodore Ts'o <tytso@mit.edu> wrote:
>>
>> It's here:  https://goo.gl/photos/xHjn2Z97JQEw6k2C9
>
> You didn't catch enough of the code line to decode the code, but it's
> early enough in drm_crtc_index() (just five bytes in) that it's almost
> certainly the very first dereference, so it's almost guaranteed to be
> that
>
>    crtc->dev
>
> access as part of list_for_each_entry(), with crtc being NULL. And
> yes, "->dev" is the very first field, so the offset is zero too (while
> the "->mode_config" list access would not be at offset zero).
>
> And it looks like it is called from drm_atomic_helper_check_modeset():
> the reason it has a question mark in the backtrace is because the
> fault happens before the stack frame has even been set up.
>
> There are multiple calls to "drm_crtc_index()" from that function, I
> can't tell which one it is. Looking at the code generation I get, I
> think it's because update_connector_routing() gets inlined, and that
> one does several calls. Most of them look like this:
>
>                 if (connector->state->crtc) {
>                         idx = drm_crtc_index(connector->state->crtc);
>
> ie they check that the crtc is non-NULL, but that last one does not:
>
>         connector_state->best_encoder = new_encoder;
>         idx = drm_crtc_index(connector_state->crtc);
>
>         crtc_state = state->crtc_states[idx];
>         crtc_state->mode_changed = true;
>
> and I suspect the fix might be something like the attached. Totally
> untested. Ted?
>
> This whole "atomic modeset" series has been one royal fuck-up, guys.
> We've had too many of these kinds of crap issues.

It hasn't been that bad. On a scale of 1 to "MD eats my RAID array", I'd
say we are barely at a 5.

There have been a lot of small and seemingly easily fixed teething
problems. Essentially rewriting the DRM API to provide a new userspace
API and internal interface, and porting some drivers partly to the new
interface, while trying to maintain the old ABI/API on top seamlessly,
was always going to be an impossible task. It was never going to
magically all just work in -next and land in your tree fully formed
smelling of lavender and elderberries. This is a massive undertaking,
and doing it over a few kernels was the only possible way it could
ever land.

I think the biggest problem we've had is the QA team at Intel got
reorganised or something right when they really needed to be doing
testing on this stuff, so what was sitting in -next never got as much
testing as it had previously, and you can see that in the types of
cases that are getting through. I think the other thing we can learn
is that when Android forks the kernel we should just say this shit is
too hard, let Google go and create a new API and a complete set of
graphics drivers and deal with it in 10 years, because that was
seriously the only other option.

So yes, it's a pity other kernel developers are seeing our fallout, but
I've experienced lots of other kernel developers' fallout over the
years, and generally the idea is to get this stuff fixed to a
reasonable state before you release a final kernel.

Note I'm not personally involved in the development for atomic
modesetting at all, I'm running the kernels with it where and when I
can, and I trust the developers who work on it are doing as much as
they can to make it work.

That said hopefully Daniel can find a bag of fucks to debug and write
a proper patch, instead of rage quitting the universe, and just git
reset --hard v4.0 drivers/gpu/drm/i915..

Dave.


* Re: [REGRESSION] Re: i915 driver crashes on T540p if docking station attached
  2015-07-30  5:18   ` Linus Torvalds
  2015-07-30 11:16     ` Dave Airlie
@ 2015-07-30 14:40     ` Daniel Vetter
  2015-07-30 15:32       ` Theodore Ts'o
  2015-07-30 15:50       ` Theodore Ts'o
  1 sibling, 2 replies; 19+ messages in thread
From: Daniel Vetter @ 2015-07-30 14:40 UTC
  To: Linus Torvalds
  Cc: Theodore Ts'o, intel-gfx, DRI, Daniel Vetter, Mani Nikula,
	Ander Conselvan de Oliveira, Linux Kernel Mailing List

On Wed, Jul 29, 2015 at 10:18:16PM -0700, Linus Torvalds wrote:
>  drivers/gpu/drm/drm_atomic_helper.c | 8 +++++---
>  1 file changed, 5 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/gpu/drm/drm_atomic_helper.c b/drivers/gpu/drm/drm_atomic_helper.c
> index 5b59d5ad7d1c..aac212297b49 100644
> --- a/drivers/gpu/drm/drm_atomic_helper.c
> +++ b/drivers/gpu/drm/drm_atomic_helper.c
> @@ -230,10 +230,12 @@ update_connector_routing(struct drm_atomic_state *state, int conn_idx)
>  	}
>  
>  	connector_state->best_encoder = new_encoder;
> -	idx = drm_crtc_index(connector_state->crtc);
> +	if (connector_state->crtc) {
> +		idx = drm_crtc_index(connector_state->crtc);
>  
> -	crtc_state = state->crtc_states[idx];
> -	crtc_state->mode_changed = true;
> +		crtc_state = state->crtc_states[idx];
> +		crtc_state->mode_changed = true;
> +	}

This shouldn't happen since if it does we ended up stealing the encoder
from the connector itself (we do check for connector_state->crtc earlier)
and that would be a bug. I haven't figured out a precise theory but my
guess is on the best_encoder selection, and indeed dp mst encoder
selection seems to have gone belly up in 4.2 with the bisected commit.

I have 4 patches in git://people.freedesktop.org/~danvet/drm fixes-stuff
but I couldn't test them yet since no dp mst here and I didn't find
anything that would ship faster than 1-2 weeks yet. I'll try to get some
other people here to test it meanwhile too.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


* Re: [REGRESSION] Re: i915 driver crashes on T540p if docking station attached
  2015-07-30 14:40     ` Daniel Vetter
@ 2015-07-30 15:32       ` Theodore Ts'o
  2015-07-30 15:54         ` [Intel-gfx] " Daniel Vetter
  2015-07-30 15:57         ` Takashi Iwai
  2015-07-30 15:50       ` Theodore Ts'o
  1 sibling, 2 replies; 19+ messages in thread
From: Theodore Ts'o @ 2015-07-30 15:32 UTC
  To: Linus Torvalds, intel-gfx, DRI, Daniel Vetter, Mani Nikula,
	Ander Conselvan de Oliveira, Linux Kernel Mailing List

On Thu, Jul 30, 2015 at 04:40:02PM +0200, Daniel Vetter wrote:
> On Wed, Jul 29, 2015 at 10:18:16PM -0700, Linus Torvalds wrote:
> >  drivers/gpu/drm/drm_atomic_helper.c | 8 +++++---
> >  1 file changed, 5 insertions(+), 3 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/drm_atomic_helper.c b/drivers/gpu/drm/drm_atomic_helper.c
> > index 5b59d5ad7d1c..aac212297b49 100644
> > --- a/drivers/gpu/drm/drm_atomic_helper.c
> > +++ b/drivers/gpu/drm/drm_atomic_helper.c
> > @@ -230,10 +230,12 @@ update_connector_routing(struct drm_atomic_state *state, int conn_idx)
> >  	}
> >  
> >  	connector_state->best_encoder = new_encoder;
> > -	idx = drm_crtc_index(connector_state->crtc);
> > +	if (connector_state->crtc) {
> > +		idx = drm_crtc_index(connector_state->crtc);
> >  
> > -	crtc_state = state->crtc_states[idx];
> > -	crtc_state->mode_changed = true;
> > +		crtc_state = state->crtc_states[idx];
> > +		crtc_state->mode_changed = true;
> > +	}
> 
> This shouldn't happen since if it does we ended up stealing the encoder
> from the connector itself (we do check for connector_state->crtc earlier)
> and that would be a bug. I haven't figured out a precise theory but my
> guess is on the best_encoder selection, and indeed dp mst encoder
> selection seems to have gone belly up in 4.2 with the bisected commit.

Well, I just tested Linus's patch and it works.

BTW, is there any chance that I can suspend my laptop, and then move
it from my docking station at home (where I have a Dell 30" display)
to my docking station at work (where I have a Dell 24" display), and
actually have the new monitor be detected?  For at least the past
year, I have had to reboot in order to be able to use the external
monitor.  This used to work, but it's been a very long-standing
regression.  I understand that Multi-stream DP is an evil horrible hack,
and supporting it is painful, but this used to work, and it hasn't in
a long time.  :-(

					- Ted


* Re: [REGRESSION] Re: i915 driver crashes on T540p if docking station attached
  2015-07-30 14:40     ` Daniel Vetter
  2015-07-30 15:32       ` Theodore Ts'o
@ 2015-07-30 15:50       ` Theodore Ts'o
  2015-07-30 15:59         ` Theodore Ts'o
                           ` (2 more replies)
  1 sibling, 3 replies; 19+ messages in thread
From: Theodore Ts'o @ 2015-07-30 15:50 UTC
  To: Linus Torvalds, intel-gfx, DRI, Daniel Vetter, Mani Nikula,
	Ander Conselvan de Oliveira, Linux Kernel Mailing List

On Thu, Jul 30, 2015 at 04:40:02PM +0200, Daniel Vetter wrote:
> I have 4 patches in git://people.freedesktop.org/~danvet/drm fixes-stuff
> but I couldn't test them yet since no dp mst here and I didn't find
> anything that would ship faster than 1-2 weeks yet. I'll try to get some
> other people here to test it meanwhile too.

I've tried pulling in your patches from fixes-stuff, onto Linus's tree
(without Linus's fix), and the good news is that I'm no longer
crashing on boot.

The *bad* news is that (a) it breaks the external monitor attached to
the docking station completely (this was working with Linus's patch),
and (b) it's triggering a LOCKDEP failure.

So even though Linus's patch wasn't supposed to work, I think I'm
going to go back to it....

					- Ted


Jul 30 11:46:49 closure kernel: [    4.221951] 
Jul 30 11:46:49 closure kernel: [    4.221954] ======================================================
Jul 30 11:46:49 closure kernel: [    4.221957] [ INFO: possible circular locking dependency detected ]
Jul 30 11:46:49 closure kernel: [    4.221960] 4.2.0-rc4-13906-g5f1b75cd #16 Not tainted
Jul 30 11:46:49 closure kernel: [    4.221963] -------------------------------------------------------
Jul 30 11:46:49 closure kernel: [    4.221966] modprobe/503 is trying to acquire lock:
Jul 30 11:46:49 closure kernel: [    4.221968]  (init_mutex){+.+.+.}, at: [<ffffffff8138b380>] acpi_video_get_backlight_type+0x17/0x164
Jul 30 11:46:49 closure kernel: [    4.221977] 
Jul 30 11:46:49 closure kernel: [    4.221977] but task is already holding lock:
Jul 30 11:46:49 closure kernel: [    4.221979]  (&(&backlight_notifier)->rwsem){++++..}, at: [<ffffffff8109a7c9>] __blocking_notifier_call_chain+0x37/0x69
Jul 30 11:46:49 closure kernel: [    4.221987] 
Jul 30 11:46:49 closure kernel: [    4.221987] which lock already depends on the new lock.
Jul 30 11:46:49 closure kernel: [    4.221987] 
Jul 30 11:46:49 closure kernel: [    4.221990] 
Jul 30 11:46:49 closure kernel: [    4.221990] the existing dependency chain (in reverse order) is:
Jul 30 11:46:49 closure kernel: [    4.221995] 
Jul 30 11:46:49 closure kernel: [    4.221995] -> #1 (&(&backlight_notifier)->rwsem){++++..}:
Jul 30 11:46:49 closure kernel: [    4.222001]        [<ffffffff810bbe08>] lock_acquire+0x104/0x18b
Jul 30 11:46:49 closure kernel: [    4.222007]        [<ffffffff8161f1db>] down_write+0x46/0x8a
Jul 30 11:46:49 closure kernel: [    4.222012]        [<ffffffff8109a6c0>] blocking_notifier_chain_register+0x36/0x57
Jul 30 11:46:49 closure kernel: [    4.222017]        [<ffffffff8134eb4e>] backlight_register_notifier+0x18/0x1a
Jul 30 11:46:49 closure kernel: [    4.222022]        [<ffffffff8138b463>] acpi_video_get_backlight_type+0xfa/0x164
Jul 30 11:46:49 closure kernel: [    4.222028]        [<ffffffffc03a1e45>] 0xffffffffc03a1e45
Jul 30 11:46:49 closure audispd: No plugins found, exiting
Jul 30 11:46:49 closure kernel: [    4.222032]        [<ffffffffc03a28a8>] 0xffffffffc03a28a8
Jul 30 11:46:49 closure kernel: [    4.222036]        [<ffffffff810003c7>] do_one_initcall+0x19a/0x1af
Jul 30 11:46:49 closure kernel: [    4.222042]        [<ffffffff81619985>] do_init_module+0x60/0x1e3
Jul 30 11:46:49 closure kernel: [    4.222047]        [<ffffffff810f0a5b>] load_module+0x1c42/0x2059
Jul 30 11:46:49 closure kernel: [    4.222052]        [<ffffffff810f1046>] SyS_finit_module+0x85/0x92
Jul 30 11:46:49 closure kernel: [    4.222056]        [<ffffffff8162109b>] entry_SYSCALL_64_fastpath+0x16/0x73
Jul 30 11:46:49 closure kernel: [    4.222060] 
Jul 30 11:46:49 closure kernel: [    4.222060] -> #0 (init_mutex){+.+.+.}:
Jul 30 11:46:49 closure kernel: [    4.222065]        [<ffffffff810bb77a>] __lock_acquire+0xc55/0xf54
Jul 30 11:46:49 closure kernel: [    4.222070]        [<ffffffff810bbe08>] lock_acquire+0x104/0x18b
Jul 30 11:46:49 closure kernel: [    4.222074]        [<ffffffff8161d83a>] mutex_lock_nested+0x70/0x391
Jul 30 11:46:49 closure kernel: [    4.222078]        [<ffffffff8138b380>] acpi_video_get_backlight_type+0x17/0x164
Jul 30 11:46:49 closure kernel: [    4.222083]        [<ffffffff8138b505>] acpi_video_backlight_notify+0x19/0x2f
Jul 30 11:46:49 closure kernel: [    4.222088]        [<ffffffff8109a442>] notifier_call_chain+0x4c/0x71
Jul 30 11:46:49 closure kernel: [    4.222092]        [<ffffffff8109a7e2>] __blocking_notifier_call_chain+0x50/0x69
Jul 30 11:46:49 closure kernel: [    4.222098]        [<ffffffff8109a80f>] blocking_notifier_call_chain+0x14/0x16
Jul 30 11:46:49 closure kernel: [    4.222103]        [<ffffffff8134f023>] backlight_device_register+0x1df/0x1f1
Jul 30 11:46:49 closure kernel: [    4.222108]        [<ffffffffc07b3061>] intel_backlight_register+0xf0/0x157 [i915]
Jul 30 11:46:49 closure kernel: [    4.222146]        [<ffffffffc078c843>] intel_modeset_gem_init+0x158/0x164 [i915]
Jul 30 11:46:49 closure kernel: [    4.222176]        [<ffffffffc07b997c>] i915_driver_load+0xf1c/0x1139 [i915]
Jul 30 11:46:49 closure kernel: [    4.222205]        [<ffffffffc053af19>] drm_dev_register+0x84/0xfd [drm]
Jul 30 11:46:49 closure kernel: [    4.222217]        [<ffffffffc053d77e>] drm_get_pci_dev+0x102/0x1bc [drm]
Jul 30 11:46:49 closure kernel: [    4.222228]        [<ffffffffc07291e2>] i915_pci_probe+0x4f/0x51 [i915]
Jul 30 11:46:49 closure kernel: [    4.222247]        [<ffffffff81333ad3>] pci_device_probe+0x74/0xd6
Jul 30 11:46:49 closure kernel: [    4.222253]        [<ffffffff813d4806>] driver_probe_device+0x15f/0x387
Jul 30 11:46:49 closure kernel: [    4.222257]        [<ffffffff813d4a81>] __driver_attach+0x53/0x74
Jul 30 11:46:49 closure kernel: [    4.222262]        [<ffffffff813d2aa0>] bus_for_each_dev+0x6f/0x89
Jul 30 11:46:49 closure kernel: [    4.222266]        [<ffffffff813d41f0>] driver_attach+0x1e/0x20
Jul 30 11:46:49 closure kernel: [    4.222269]        [<ffffffff813d3e33>] bus_add_driver+0x140/0x238
Jul 30 11:46:49 closure kernel: [    4.222273]        [<ffffffff813d53d8>] driver_register+0x8f/0xcc
Jul 30 11:46:49 closure kernel: [    4.222278]        [<ffffffff81332be1>] __pci_register_driver+0x5e/0x62
Jul 30 11:46:49 closure kernel: [    4.222282]        [<ffffffffc053d890>] drm_pci_init+0x58/0xda [drm]
Jul 30 11:46:49 closure kernel: [    4.222293]        [<ffffffffc081f0a0>] i915_init+0xa0/0xa8 [i915]
Jul 30 11:46:49 closure kernel: [    4.222312]        [<ffffffff810003c7>] do_one_initcall+0x19a/0x1af
Jul 30 11:46:49 closure kernel: [    4.222317]        [<ffffffff81619985>] do_init_module+0x60/0x1e3
Jul 30 11:46:49 closure kernel: [    4.222321]        [<ffffffff810f0a5b>] load_module+0x1c42/0x2059
Jul 30 11:46:49 closure kernel: [    4.222325]        [<ffffffff810f1046>] SyS_finit_module+0x85/0x92
Jul 30 11:46:49 closure kernel: [    4.222329]        [<ffffffff8162109b>] entry_SYSCALL_64_fastpath+0x16/0x73
Jul 30 11:46:49 closure kernel: [    4.222334] 
Jul 30 11:46:49 closure kernel: [    4.222334] other info that might help us debug this:
Jul 30 11:46:49 closure kernel: [    4.222334] 
Jul 30 11:46:49 closure kernel: [    4.222340]  Possible unsafe locking scenario:
Jul 30 11:46:49 closure kernel: [    4.222340] 
Jul 30 11:46:49 closure kernel: [    4.222344]        CPU0                    CPU1
Jul 30 11:46:49 closure kernel: [    4.222347]        ----                    ----
Jul 30 11:46:49 closure kernel: [    4.222350]   lock(&(&backlight_notifier)->rwsem);
Jul 30 11:46:49 closure kernel: [    4.222353]                                lock(init_mutex);
Jul 30 11:46:49 closure kernel: [    4.222357]                                lock(&(&backlight_notifier)->rwsem);
Jul 30 11:46:49 closure kernel: [    4.222363]   lock(init_mutex);
Jul 30 11:46:49 closure kernel: [    4.222366] 
Jul 30 11:46:49 closure kernel: [    4.222366]  *** DEADLOCK ***
Jul 30 11:46:49 closure kernel: [    4.222366] 
Jul 30 11:46:49 closure kernel: [    4.222371] 4 locks held by modprobe/503:
Jul 30 11:46:49 closure kernel: [    4.222374]  #0:  (&dev->mutex){......}, at: [<ffffffff813d3ff1>] device_lock+0xf/0x11
Jul 30 11:46:49 closure kernel: [    4.222381]  #1:  (&dev->mutex){......}, at: [<ffffffff813d3ff1>] device_lock+0xf/0x11
Jul 30 11:46:49 closure kernel: [    4.222388]  #2:  (drm_global_mutex){+.+.+.}, at: [<ffffffffc053aeb9>] drm_dev_register+0x24/0xfd [drm]
Jul 30 11:46:49 closure kernel: [    4.222402]  #3:  (&(&backlight_notifier)->rwsem){++++..}, at: [<ffffffff8109a7c9>] __blocking_notifier_call_chain+0x37/0x69
Jul 30 11:46:49 closure kernel: [    4.222410] 
Jul 30 11:46:49 closure kernel: [    4.222410] stack backtrace:
Jul 30 11:46:49 closure kernel: [    4.222416] CPU: 7 PID: 503 Comm: modprobe Not tainted 4.2.0-rc4-13906-g5f1b75cd #16
Jul 30 11:46:49 closure kernel: [    4.222420] Hardware name: LENOVO 20BECTO1WW/20BECTO1WW, BIOS GMET59WW (2.07 ) 02/12/2014
Jul 30 11:46:49 closure kernel: [    4.222425]  ffffffff8280a230 ffff8800c992b5d8 ffffffff8161a71e 0000000000000006
Jul 30 11:46:49 closure kernel: [    4.222431]  ffffffff8280a230 ffff8800c992b628 ffffffff810b9adf ffffffff82265780
Jul 30 11:46:49 closure kernel: [    4.222437]  ffff880405588000 0000000000000004 ffff880405588880 0000000000000004
Jul 30 11:46:49 closure kernel: [    4.222443] Call Trace:
Jul 30 11:46:49 closure kernel: [    4.222447]  [<ffffffff8161a71e>] dump_stack+0x4c/0x65
Jul 30 11:46:49 closure kernel: [    4.222451]  [<ffffffff810b9adf>] print_circular_bug+0x1f8/0x209
Jul 30 11:46:49 closure kernel: [    4.222455]  [<ffffffff810bb77a>] __lock_acquire+0xc55/0xf54
Jul 30 11:46:49 closure kernel: [    4.222460]  [<ffffffff810bbe08>] lock_acquire+0x104/0x18b
Jul 30 11:46:49 closure kernel: [    4.222464]  [<ffffffff8138b380>] ? acpi_video_get_backlight_type+0x17/0x164
Jul 30 11:46:49 closure kernel: [    4.222469]  [<ffffffff8161d83a>] mutex_lock_nested+0x70/0x391
Jul 30 11:46:49 closure kernel: [    4.222472]  [<ffffffff8138b380>] ? acpi_video_get_backlight_type+0x17/0x164
Jul 30 11:46:49 closure kernel: [    4.222476]  [<ffffffff8138b380>] ? acpi_video_get_backlight_type+0x17/0x164
Jul 30 11:46:49 closure kernel: [    4.222480]  [<ffffffff8138b380>] acpi_video_get_backlight_type+0x17/0x164
Jul 30 11:46:49 closure kernel: [    4.222484]  [<ffffffff8138b505>] acpi_video_backlight_notify+0x19/0x2f
Jul 30 11:46:49 closure kernel: [    4.222488]  [<ffffffff8109a442>] notifier_call_chain+0x4c/0x71
Jul 30 11:46:49 closure kernel: [    4.222492]  [<ffffffff8109a7e2>] __blocking_notifier_call_chain+0x50/0x69
Jul 30 11:46:49 closure kernel: [    4.222496]  [<ffffffff8109a80f>] blocking_notifier_call_chain+0x14/0x16
Jul 30 11:46:49 closure kernel: [    4.222500]  [<ffffffff8134f023>] backlight_device_register+0x1df/0x1f1
Jul 30 11:46:49 closure kernel: [    4.222530]  [<ffffffffc07b3061>] intel_backlight_register+0xf0/0x157 [i915]
Jul 30 11:46:49 closure kernel: [    4.222556]  [<ffffffffc078c843>] intel_modeset_gem_init+0x158/0x164 [i915]
Jul 30 11:46:49 closure kernel: [    4.222584]  [<ffffffffc07b997c>] i915_driver_load+0xf1c/0x1139 [i915]
Jul 30 11:46:49 closure kernel: [    4.222589]  [<ffffffff810ba715>] ? mark_held_locks+0x56/0x6c
Jul 30 11:46:49 closure kernel: [    4.222593]  [<ffffffff81620836>] ? _raw_spin_unlock_irqrestore+0x3f/0x4d
Jul 30 11:46:49 closure kernel: [    4.222597]  [<ffffffff810ba89c>] ? trace_hardirqs_on_caller+0x171/0x18d
Jul 30 11:46:49 closure kernel: [    4.222607]  [<ffffffffc053af19>] drm_dev_register+0x84/0xfd [drm]
Jul 30 11:46:49 closure kernel: [    4.222618]  [<ffffffffc053d77e>] drm_get_pci_dev+0x102/0x1bc [drm]
Jul 30 11:46:49 closure kernel: [    4.222636]  [<ffffffffc07291e2>] i915_pci_probe+0x4f/0x51 [i915]
Jul 30 11:46:49 closure kernel: [    4.222640]  [<ffffffff81333ad3>] pci_device_probe+0x74/0xd6
Jul 30 11:46:49 closure kernel: [    4.222644]  [<ffffffff813d4a2e>] ? driver_probe_device+0x387/0x387
Jul 30 11:46:49 closure kernel: [    4.222648]  [<ffffffff813d4806>] driver_probe_device+0x15f/0x387
Jul 30 11:46:49 closure kernel: [    4.222652]  [<ffffffff813d4a2e>] ? driver_probe_device+0x387/0x387
Jul 30 11:46:49 closure kernel: [    4.222655]  [<ffffffff813d4a81>] __driver_attach+0x53/0x74
Jul 30 11:46:49 closure kernel: [    4.222659]  [<ffffffff813d2aa0>] bus_for_each_dev+0x6f/0x89
Jul 30 11:46:49 closure kernel: [    4.222662]  [<ffffffff813d41f0>] driver_attach+0x1e/0x20
Jul 30 11:46:49 closure kernel: [    4.222666]  [<ffffffff813d3e33>] bus_add_driver+0x140/0x238
Jul 30 11:46:49 closure kernel: [    4.222670]  [<ffffffff813d53d8>] driver_register+0x8f/0xcc
Jul 30 11:46:49 closure kernel: [    4.222674]  [<ffffffff81332be1>] __pci_register_driver+0x5e/0x62
Jul 30 11:46:49 closure kernel: [    4.222677]  [<ffffffffc081f000>] ? 0xffffffffc081f000
Jul 30 11:46:49 closure kernel: [    4.222687]  [<ffffffffc053d890>] drm_pci_init+0x58/0xda [drm]
Jul 30 11:46:49 closure kernel: [    4.222690]  [<ffffffffc081f000>] ? 0xffffffffc081f000
Jul 30 11:46:49 closure kernel: [    4.222708]  [<ffffffffc081f0a0>] i915_init+0xa0/0xa8 [i915]
Jul 30 11:46:49 closure kernel: [    4.222712]  [<ffffffffc081f000>] ? 0xffffffffc081f000
Jul 30 11:46:49 closure kernel: [    4.222716]  [<ffffffff810003c7>] do_one_initcall+0x19a/0x1af
Jul 30 11:46:49 closure kernel: [    4.222719]  [<ffffffff8161994d>] ? do_init_module+0x28/0x1e3
Jul 30 11:46:49 closure kernel: [    4.222723]  [<ffffffff81199350>] ? kmem_cache_alloc_trace+0xba/0xcc
Jul 30 11:46:49 closure kernel: [    4.222727]  [<ffffffff81619985>] do_init_module+0x60/0x1e3
Jul 30 11:46:49 closure kernel: [    4.222731]  [<ffffffff810f0a5b>] load_module+0x1c42/0x2059
Jul 30 11:46:49 closure kernel: [    4.222736]  [<ffffffff810f1046>] SyS_finit_module+0x85/0x92
Jul 30 11:46:49 closure kernel: [    4.222739]  [<ffffffff8162109b>] entry_SYSCALL_64_fastpath+0x16/0x73



^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [Intel-gfx] [REGRESSION] Re: i915 driver crashes on T540p if docking station attached
  2015-07-30 15:32       ` Theodore Ts'o
@ 2015-07-30 15:54         ` Daniel Vetter
  2015-07-30 15:57         ` Takashi Iwai
  1 sibling, 0 replies; 19+ messages in thread
From: Daniel Vetter @ 2015-07-30 15:54 UTC (permalink / raw)
  To: Theodore Ts'o, Linus Torvalds, intel-gfx, DRI, Daniel Vetter,
	Mani Nikula, Ander Conselvan de Oliveira,
	Linux Kernel Mailing List

On Thu, Jul 30, 2015 at 5:32 PM, Theodore Ts'o <tytso@mit.edu> wrote:
> On Thu, Jul 30, 2015 at 04:40:02PM +0200, Daniel Vetter wrote:
>> On Wed, Jul 29, 2015 at 10:18:16PM -0700, Linus Torvalds wrote:
>> >  drivers/gpu/drm/drm_atomic_helper.c | 8 +++++---
>> >  1 file changed, 5 insertions(+), 3 deletions(-)
>> >
>> > diff --git a/drivers/gpu/drm/drm_atomic_helper.c b/drivers/gpu/drm/drm_atomic_helper.c
>> > index 5b59d5ad7d1c..aac212297b49 100644
>> > --- a/drivers/gpu/drm/drm_atomic_helper.c
>> > +++ b/drivers/gpu/drm/drm_atomic_helper.c
>> > @@ -230,10 +230,12 @@ update_connector_routing(struct drm_atomic_state *state, int conn_idx)
>> >     }
>> >
>> >     connector_state->best_encoder = new_encoder;
>> > -   idx = drm_crtc_index(connector_state->crtc);
>> > +   if (connector_state->crtc) {
>> > +           idx = drm_crtc_index(connector_state->crtc);
>> >
>> > -   crtc_state = state->crtc_states[idx];
>> > -   crtc_state->mode_changed = true;
>> > +           crtc_state = state->crtc_states[idx];
>> > +           crtc_state->mode_changed = true;
>> > +   }
>>
>> This shouldn't happen since if it does we ended up stealing the encoder
>> from the connector itself (we do check for connector_state->crtc earlier)
>> and that would be a bug. I haven't figured out a precise theory but my
>> guess is on the best_encoder selection, and indeed dp mst encoder
>> selection seems to have gone belly up in 4.2 with the bisected commit.
>
> Well, I just tested Linus's patch and it works.

That's seriously surprising if you mean display and everything actually
works. Is dpms on/off and suspend and all that also still working? Can you
please change the check into a

if (!connector_state->crtc)
        return 0;

so that we don't blow up on the debug line below and then grab dmesg with
drm.debug=0x1e when this happens? Note there will be lots of noise, so you
might need to dig the full dmesg out of the logs.
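
A minimal sketch of the early return suggested above, assuming simplified
stand-in types rather than the real DRM structures (mark_mode_changed and
the four-entry crtc_states array are hypothetical, for illustration only):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins for the DRM types; not the real kernel definitions. */
struct crtc { int index; };
struct crtc_state { bool mode_changed; };
struct connector_state { struct crtc *crtc; };
struct atomic_state { struct crtc_state *crtc_states[4]; };

/*
 * Mirrors the tail of update_connector_routing(): flag the CRTC that drives
 * this connector as needing a modeset, but bail out early when the connector
 * has no CRTC attached.
 */
static int mark_mode_changed(struct atomic_state *state,
                             struct connector_state *conn_state)
{
    struct crtc_state *crtc_state;

    if (!conn_state->crtc)
        return 0;   /* nothing to flag; avoids dereferencing a NULL crtc */

    assert(conn_state->crtc->index >= 0 && conn_state->crtc->index < 4);
    crtc_state = state->crtc_states[conn_state->crtc->index];
    crtc_state->mode_changed = true;
    return 0;
}
```

Unlike wrapping the three statements in an if block, the early return also
skips anything that follows them in the function, such as the debug line
mentioned above.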

> BTW, is there any chance that I can suspend my laptop, and then move
> it from my docking station at home (where I have a Dell 30" display)
> to my docking station at work (where I have a Dell 24" display), and
> actually have the new monitor be detected?  For at least the past
> year, I have to reboot in order to be able to use the external
> monitor?  This used to work, but it's been a very long-standing
> regression.  I understand that Multi-stream DP is an evil horrible hack,
> and supporting it is painful, but this used to work, and it hasn't in
> a long time.  :-(

Hm we seem to not reprobe mst state on resume. The quick hack below should
help (but totally untested since still no dp mst hub here).
-Daniel

diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index 884b4f9b81c4..c0677c83a0e9 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -775,6 +775,9 @@ static int i915_drm_resume(struct drm_device *dev)
 	/* Config may have changed between suspend and resume */
 	drm_helper_hpd_irq_event(dev);
 
+	dev_priv->short_hpd_port_mask = ~0;
+	queue_work(dev_priv->dp_wq, &dev_priv->dig_port_work);
+
 	intel_opregion_init(dev);
 
 	intel_fbdev_set_suspend(dev, FBINFO_STATE_RUNNING, false);
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: [Intel-gfx] [REGRESSION] Re: i915 driver crashes on T540p if docking station attached
  2015-07-30 15:32       ` Theodore Ts'o
  2015-07-30 15:54         ` [Intel-gfx] " Daniel Vetter
@ 2015-07-30 15:57         ` Takashi Iwai
  2015-07-30 18:14           ` Linus Torvalds
  1 sibling, 1 reply; 19+ messages in thread
From: Takashi Iwai @ 2015-07-30 15:57 UTC (permalink / raw)
  To: Theodore Ts'o
  Cc: Linus Torvalds, intel-gfx, DRI, Daniel Vetter, Mani Nikula,
	Ander Conselvan de Oliveira, Linux Kernel Mailing List

On Thu, 30 Jul 2015 17:32:28 +0200,
Theodore Ts'o wrote:
> 
> On Thu, Jul 30, 2015 at 04:40:02PM +0200, Daniel Vetter wrote:
> > On Wed, Jul 29, 2015 at 10:18:16PM -0700, Linus Torvalds wrote:
> > >  drivers/gpu/drm/drm_atomic_helper.c | 8 +++++---
> > >  1 file changed, 5 insertions(+), 3 deletions(-)
> > > 
> > > diff --git a/drivers/gpu/drm/drm_atomic_helper.c b/drivers/gpu/drm/drm_atomic_helper.c
> > > index 5b59d5ad7d1c..aac212297b49 100644
> > > --- a/drivers/gpu/drm/drm_atomic_helper.c
> > > +++ b/drivers/gpu/drm/drm_atomic_helper.c
> > > @@ -230,10 +230,12 @@ update_connector_routing(struct drm_atomic_state *state, int conn_idx)
> > >  	}
> > >  
> > >  	connector_state->best_encoder = new_encoder;
> > > -	idx = drm_crtc_index(connector_state->crtc);
> > > +	if (connector_state->crtc) {
> > > +		idx = drm_crtc_index(connector_state->crtc);
> > >  
> > > -	crtc_state = state->crtc_states[idx];
> > > -	crtc_state->mode_changed = true;
> > > +		crtc_state = state->crtc_states[idx];
> > > +		crtc_state->mode_changed = true;
> > > +	}
> > 
> > This shouldn't happen since if it does we ended up stealing the encoder
> > from the connector itself (we do check for connector_state->crtc earlier)
> > and that would be a bug. I haven't figured out a precise theory but my
> > guess is on the best_encoder selection, and indeed dp mst encoder
> > selection seems to have gone belly up in 4.2 with the bisected commit.
> 
> Well, I just tested Linus's patch and it works.
> 
> BTW, is there any chance that I can suspend my laptop, and then move
> it from my docking station at home (where I have a Dell 30" display)
> to my docking station at work (where I have a Dell 24" display), and
> actually have the new monitor be detected?  For at least the past
> year, I have to reboot in order to be able to use the external
> monitor?  This used to work, but it's been a very long-standing
> > regression.  I understand that Multi-stream DP is an evil horrible hack,
> and supporting it is painful, but this used to work, and it hasn't in
> a long time.  :-(

Is it related to this?
   https://bugs.freedesktop.org/show_bug.cgi?id=89589

I wanted to check this myself, too, as the same bug was reported to
openSUSE bugzilla, but I had no hardware showing it.


Takashi

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [REGRESSION] Re: i915 driver crashes on T540p if docking station attached
  2015-07-30 15:50       ` Theodore Ts'o
@ 2015-07-30 15:59         ` Theodore Ts'o
  2015-07-30 16:00         ` Daniel Vetter
  2015-08-03 15:27         ` Daniel Vetter
  2 siblings, 0 replies; 19+ messages in thread
From: Theodore Ts'o @ 2015-07-30 15:59 UTC (permalink / raw)
  To: Linus Torvalds, intel-gfx, DRI, Daniel Vetter, Mani Nikula,
	Ander Conselvan de Oliveira, Linux Kernel Mailing List

[-- Attachment #1: Type: text/plain, Size: 818 bytes --]

On Thu, Jul 30, 2015 at 11:50:29AM -0400, Theodore Ts'o wrote:
> I've tried pulling in your patches from fixes-stuff, onto Linus's tree
> (without Linus's fix), and the good news is that I'm no longer
> crashing on boot.
> 
> The *bad* news is that (a) it breaks the external monitor attached to
> the docking station completely (this was working with Linus's patch),
> and (b) it's triggering a LOCKDEP failure.

Well, that's not fair.  Even with Linus's fix, there is still a
LOCKDEP failure.  And a few more i915 WARNINGS.  But at least the
external monitor works, so this is what I'm using.  Enclosed please
find a dmesg with the lockdep and i915 warnings and my .config.  The
kernel that I used can be found at:

https://git.kernel.org/cgit/linux/kernel/git/tytso/ext4.git/log/?h=i915-test-4.2.0-rc4

						- Ted

[-- Attachment #2: dmesg.gz --]
[-- Type: application/gzip, Size: 25784 bytes --]

[-- Attachment #3: config.gz --]
[-- Type: application/gzip, Size: 31022 bytes --]

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [REGRESSION] Re: i915 driver crashes on T540p if docking station attached
  2015-07-30 15:50       ` Theodore Ts'o
  2015-07-30 15:59         ` Theodore Ts'o
@ 2015-07-30 16:00         ` Daniel Vetter
  2015-08-03 15:27         ` Daniel Vetter
  2 siblings, 0 replies; 19+ messages in thread
From: Daniel Vetter @ 2015-07-30 16:00 UTC (permalink / raw)
  To: Theodore Ts'o, Linus Torvalds, intel-gfx, DRI, Daniel Vetter,
	Mani Nikula, Ander Conselvan de Oliveira,
	Linux Kernel Mailing List

On Thu, Jul 30, 2015 at 11:50:29AM -0400, Theodore Ts'o wrote:
> On Thu, Jul 30, 2015 at 04:40:02PM +0200, Daniel Vetter wrote:
> > I have 4 patches in git://people.freedesktop.org/~danvet/drm fixes-stuff
> > but I couldn't test them yet since no dp mst here and I didn't find
> > anything that would ship faster than 1-2 weeks yet. I'll try to get some
> > other people here to test it meanwhile too.
> 
> I've tried pulling in your patches from fixes-stuff, onto Linus's tree
> (without Linus's fix), and the good news is that I'm no longer
> crashing on boot.

Ok so I'm not completely clueless yet, the encoder confusion indeed
resulted in the follow-up crash. But obviously I don't understand yet
exactly what's going on if this breaks the display.

> The *bad* news is that (a) it breaks the external monitor attached to
> the docking station completely (this was working with Linus's patch),
> and (b) it's triggering a LOCKDEP failure.

The lockdep splat is all in the driver load before we do any modeset at
all, so shouldn't have changed between these patches. Are you sure it's a
regression due to mine and wasn't there before?
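
The two chains in the report quoted below reduce to an AB-BA ordering:
acpi_video_get_backlight_type() takes init_mutex and then registers a
backlight notifier (taking the notifier rwsem), while
backlight_device_register() runs the notifier chain with the rwsem held and
the callback takes init_mutex. A minimal userspace sketch of that check,
with hypothetical names and far simpler than what lockdep actually does:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical IDs for the two locks named in the report. */
enum { LOCK_INIT_MUTEX, LOCK_NOTIFIER_RWSEM, NR_LOCKS };

/* order[a][b] is true once lock b has been taken while lock a was held. */
static bool order[NR_LOCKS][NR_LOCKS];

/*
 * Record the dependency "held -> wanted" and report whether the opposite
 * order was already recorded, which is the two-lock case of a circular
 * locking dependency.
 */
static bool acquire_while_holding(int held, int wanted)
{
    assert(held >= 0 && held < NR_LOCKS);
    assert(wanted >= 0 && wanted < NR_LOCKS);

    bool inversion = order[wanted][held];
    order[held][wanted] = true;
    return inversion;
}
```

Feeding it chain #1 (init_mutex held, rwsem wanted) and then chain #0 (rwsem
held, init_mutex wanted) makes the second call report the inversion, without
the two paths ever having to race on real hardware.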

> So even though Linus's patch wasn't supposed to work, I think I'm
> going to go back to it....

Well I found some dp mst hubs meanwhile so hopefully tomorrow I can test
myself what's going wrong here.
-Daniel

> 
> 					- Ted
> 
> 
> Jul 30 11:46:49 closure kernel: [    4.221951] 
> Jul 30 11:46:49 closure kernel: [    4.221954] ======================================================
> Jul 30 11:46:49 closure kernel: [    4.221957] [ INFO: possible circular locking dependency detected ]
> Jul 30 11:46:49 closure kernel: [    4.221960] 4.2.0-rc4-13906-g5f1b75cd #16 Not tainted
> Jul 30 11:46:49 closure kernel: [    4.221963] -------------------------------------------------------
> Jul 30 11:46:49 closure kernel: [    4.221966] modprobe/503 is trying to acquire lock:
> Jul 30 11:46:49 closure kernel: [    4.221968]  (init_mutex){+.+.+.}, at: [<ffffffff8138b380>] acpi_video_get_backlight_type+0x17/0x164
> Jul 30 11:46:49 closure kernel: [    4.221977] 
> Jul 30 11:46:49 closure kernel: [    4.221977] but task is already holding lock:
> Jul 30 11:46:49 closure kernel: [    4.221979]  (&(&backlight_notifier)->rwsem){++++..}, at: [<ffffffff8109a7c9>] __blocking_notifier_call_chain+0x37/0x69
> Jul 30 11:46:49 closure kernel: [    4.221987] 
> Jul 30 11:46:49 closure kernel: [    4.221987] which lock already depends on the new lock.
> Jul 30 11:46:49 closure kernel: [    4.221987] 
> Jul 30 11:46:49 closure kernel: [    4.221990] 
> Jul 30 11:46:49 closure kernel: [    4.221990] the existing dependency chain (in reverse order) is:
> Jul 30 11:46:49 closure kernel: [    4.221995] 
> Jul 30 11:46:49 closure kernel: [    4.221995] -> #1 (&(&backlight_notifier)->rwsem){++++..}:
> Jul 30 11:46:49 closure kernel: [    4.222001]        [<ffffffff810bbe08>] lock_acquire+0x104/0x18b
> Jul 30 11:46:49 closure kernel: [    4.222007]        [<ffffffff8161f1db>] down_write+0x46/0x8a
> Jul 30 11:46:49 closure kernel: [    4.222012]        [<ffffffff8109a6c0>] blocking_notifier_chain_register+0x36/0x57
> Jul 30 11:46:49 closure kernel: [    4.222017]        [<ffffffff8134eb4e>] backlight_register_notifier+0x18/0x1a
> Jul 30 11:46:49 closure kernel: [    4.222022]        [<ffffffff8138b463>] acpi_video_get_backlight_type+0xfa/0x164
> Jul 30 11:46:49 closure kernel: [    4.222028]        [<ffffffffc03a1e45>] 0xffffffffc03a1e45
> Jul 30 11:46:49 closure audispd: No plugins found, exiting
> Jul 30 11:46:49 closure kernel: [    4.222032]        [<ffffffffc03a28a8>] 0xffffffffc03a28a8
> Jul 30 11:46:49 closure kernel: [    4.222036]        [<ffffffff810003c7>] do_one_initcall+0x19a/0x1af
> Jul 30 11:46:49 closure kernel: [    4.222042]        [<ffffffff81619985>] do_init_module+0x60/0x1e3
> Jul 30 11:46:49 closure kernel: [    4.222047]        [<ffffffff810f0a5b>] load_module+0x1c42/0x2059
> Jul 30 11:46:49 closure kernel: [    4.222052]        [<ffffffff810f1046>] SyS_finit_module+0x85/0x92
> Jul 30 11:46:49 closure kernel: [    4.222056]        [<ffffffff8162109b>] entry_SYSCALL_64_fastpath+0x16/0x73
> Jul 30 11:46:49 closure kernel: [    4.222060] 
> Jul 30 11:46:49 closure kernel: [    4.222060] -> #0 (init_mutex){+.+.+.}:
> Jul 30 11:46:49 closure kernel: [    4.222065]        [<ffffffff810bb77a>] __lock_acquire+0xc55/0xf54
> Jul 30 11:46:49 closure kernel: [    4.222070]        [<ffffffff810bbe08>] lock_acquire+0x104/0x18b
> Jul 30 11:46:49 closure kernel: [    4.222074]        [<ffffffff8161d83a>] mutex_lock_nested+0x70/0x391
> Jul 30 11:46:49 closure kernel: [    4.222078]        [<ffffffff8138b380>] acpi_video_get_backlight_type+0x17/0x164
> Jul 30 11:46:49 closure kernel: [    4.222083]        [<ffffffff8138b505>] acpi_video_backlight_notify+0x19/0x2f
> Jul 30 11:46:49 closure kernel: [    4.222088]        [<ffffffff8109a442>] notifier_call_chain+0x4c/0x71
> Jul 30 11:46:49 closure kernel: [    4.222092]        [<ffffffff8109a7e2>] __blocking_notifier_call_chain+0x50/0x69
> Jul 30 11:46:49 closure kernel: [    4.222098]        [<ffffffff8109a80f>] blocking_notifier_call_chain+0x14/0x16
> Jul 30 11:46:49 closure kernel: [    4.222103]        [<ffffffff8134f023>] backlight_device_register+0x1df/0x1f1
> Jul 30 11:46:49 closure kernel: [    4.222108]        [<ffffffffc07b3061>] intel_backlight_register+0xf0/0x157 [i915]
> Jul 30 11:46:49 closure kernel: [    4.222146]        [<ffffffffc078c843>] intel_modeset_gem_init+0x158/0x164 [i915]
> Jul 30 11:46:49 closure kernel: [    4.222176]        [<ffffffffc07b997c>] i915_driver_load+0xf1c/0x1139 [i915]
> Jul 30 11:46:49 closure kernel: [    4.222205]        [<ffffffffc053af19>] drm_dev_register+0x84/0xfd [drm]
> Jul 30 11:46:49 closure kernel: [    4.222217]        [<ffffffffc053d77e>] drm_get_pci_dev+0x102/0x1bc [drm]
> Jul 30 11:46:49 closure kernel: [    4.222228]        [<ffffffffc07291e2>] i915_pci_probe+0x4f/0x51 [i915]
> Jul 30 11:46:49 closure kernel: [    4.222247]        [<ffffffff81333ad3>] pci_device_probe+0x74/0xd6
> Jul 30 11:46:49 closure kernel: [    4.222253]        [<ffffffff813d4806>] driver_probe_device+0x15f/0x387
> Jul 30 11:46:49 closure kernel: [    4.222257]        [<ffffffff813d4a81>] __driver_attach+0x53/0x74
> Jul 30 11:46:49 closure kernel: [    4.222262]        [<ffffffff813d2aa0>] bus_for_each_dev+0x6f/0x89
> Jul 30 11:46:49 closure kernel: [    4.222266]        [<ffffffff813d41f0>] driver_attach+0x1e/0x20
> Jul 30 11:46:49 closure kernel: [    4.222269]        [<ffffffff813d3e33>] bus_add_driver+0x140/0x238
> Jul 30 11:46:49 closure kernel: [    4.222273]        [<ffffffff813d53d8>] driver_register+0x8f/0xcc
> Jul 30 11:46:49 closure kernel: [    4.222278]        [<ffffffff81332be1>] __pci_register_driver+0x5e/0x62
> Jul 30 11:46:49 closure kernel: [    4.222282]        [<ffffffffc053d890>] drm_pci_init+0x58/0xda [drm]
> Jul 30 11:46:49 closure kernel: [    4.222293]        [<ffffffffc081f0a0>] i915_init+0xa0/0xa8 [i915]
> Jul 30 11:46:49 closure kernel: [    4.222312]        [<ffffffff810003c7>] do_one_initcall+0x19a/0x1af
> Jul 30 11:46:49 closure kernel: [    4.222317]        [<ffffffff81619985>] do_init_module+0x60/0x1e3
> Jul 30 11:46:49 closure kernel: [    4.222321]        [<ffffffff810f0a5b>] load_module+0x1c42/0x2059
> Jul 30 11:46:49 closure kernel: [    4.222325]        [<ffffffff810f1046>] SyS_finit_module+0x85/0x92
> Jul 30 11:46:49 closure kernel: [    4.222329]        [<ffffffff8162109b>] entry_SYSCALL_64_fastpath+0x16/0x73
> Jul 30 11:46:49 closure kernel: [    4.222334] 
> Jul 30 11:46:49 closure kernel: [    4.222334] other info that might help us debug this:
> Jul 30 11:46:49 closure kernel: [    4.222334] 
> Jul 30 11:46:49 closure kernel: [    4.222340]  Possible unsafe locking scenario:
> Jul 30 11:46:49 closure kernel: [    4.222340] 
> Jul 30 11:46:49 closure kernel: [    4.222344]        CPU0                    CPU1
> Jul 30 11:46:49 closure kernel: [    4.222347]        ----                    ----
> Jul 30 11:46:49 closure kernel: [    4.222350]   lock(&(&backlight_notifier)->rwsem);
> Jul 30 11:46:49 closure kernel: [    4.222353]                                lock(init_mutex);
> Jul 30 11:46:49 closure kernel: [    4.222357]                                lock(&(&backlight_notifier)->rwsem);
> Jul 30 11:46:49 closure kernel: [    4.222363]   lock(init_mutex);
> Jul 30 11:46:49 closure kernel: [    4.222366] 
> Jul 30 11:46:49 closure kernel: [    4.222366]  *** DEADLOCK ***
> Jul 30 11:46:49 closure kernel: [    4.222366] 
> Jul 30 11:46:49 closure kernel: [    4.222371] 4 locks held by modprobe/503:
> Jul 30 11:46:49 closure kernel: [    4.222374]  #0:  (&dev->mutex){......}, at: [<ffffffff813d3ff1>] device_lock+0xf/0x11
> Jul 30 11:46:49 closure kernel: [    4.222381]  #1:  (&dev->mutex){......}, at: [<ffffffff813d3ff1>] device_lock+0xf/0x11
> Jul 30 11:46:49 closure kernel: [    4.222388]  #2:  (drm_global_mutex){+.+.+.}, at: [<ffffffffc053aeb9>] drm_dev_register+0x24/0xfd [drm]
> Jul 30 11:46:49 closure kernel: [    4.222402]  #3:  (&(&backlight_notifier)->rwsem){++++..}, at: [<ffffffff8109a7c9>] __blocking_notifier_call_chain+0x37/0x69
> Jul 30 11:46:49 closure kernel: [    4.222410] 
> Jul 30 11:46:49 closure kernel: [    4.222410] stack backtrace:
> Jul 30 11:46:49 closure kernel: [    4.222416] CPU: 7 PID: 503 Comm: modprobe Not tainted 4.2.0-rc4-13906-g5f1b75cd #16
> Jul 30 11:46:49 closure kernel: [    4.222420] Hardware name: LENOVO 20BECTO1WW/20BECTO1WW, BIOS GMET59WW (2.07 ) 02/12/2014
> Jul 30 11:46:49 closure kernel: [    4.222425]  ffffffff8280a230 ffff8800c992b5d8 ffffffff8161a71e 0000000000000006
> Jul 30 11:46:49 closure kernel: [    4.222431]  ffffffff8280a230 ffff8800c992b628 ffffffff810b9adf ffffffff82265780
> Jul 30 11:46:49 closure kernel: [    4.222437]  ffff880405588000 0000000000000004 ffff880405588880 0000000000000004
> Jul 30 11:46:49 closure kernel: [    4.222443] Call Trace:
> Jul 30 11:46:49 closure kernel: [    4.222447]  [<ffffffff8161a71e>] dump_stack+0x4c/0x65
> Jul 30 11:46:49 closure kernel: [    4.222451]  [<ffffffff810b9adf>] print_circular_bug+0x1f8/0x209
> Jul 30 11:46:49 closure kernel: [    4.222455]  [<ffffffff810bb77a>] __lock_acquire+0xc55/0xf54
> Jul 30 11:46:49 closure kernel: [    4.222460]  [<ffffffff810bbe08>] lock_acquire+0x104/0x18b
> Jul 30 11:46:49 closure kernel: [    4.222464]  [<ffffffff8138b380>] ? acpi_video_get_backlight_type+0x17/0x164
> Jul 30 11:46:49 closure kernel: [    4.222469]  [<ffffffff8161d83a>] mutex_lock_nested+0x70/0x391
> Jul 30 11:46:49 closure kernel: [    4.222472]  [<ffffffff8138b380>] ? acpi_video_get_backlight_type+0x17/0x164
> Jul 30 11:46:49 closure kernel: [    4.222476]  [<ffffffff8138b380>] ? acpi_video_get_backlight_type+0x17/0x164
> Jul 30 11:46:49 closure kernel: [    4.222480]  [<ffffffff8138b380>] acpi_video_get_backlight_type+0x17/0x164
> Jul 30 11:46:49 closure kernel: [    4.222484]  [<ffffffff8138b505>] acpi_video_backlight_notify+0x19/0x2f
> Jul 30 11:46:49 closure kernel: [    4.222488]  [<ffffffff8109a442>] notifier_call_chain+0x4c/0x71
> Jul 30 11:46:49 closure kernel: [    4.222492]  [<ffffffff8109a7e2>] __blocking_notifier_call_chain+0x50/0x69
> Jul 30 11:46:49 closure kernel: [    4.222496]  [<ffffffff8109a80f>] blocking_notifier_call_chain+0x14/0x16
> Jul 30 11:46:49 closure kernel: [    4.222500]  [<ffffffff8134f023>] backlight_device_register+0x1df/0x1f1
> Jul 30 11:46:49 closure kernel: [    4.222530]  [<ffffffffc07b3061>] intel_backlight_register+0xf0/0x157 [i915]
> Jul 30 11:46:49 closure kernel: [    4.222556]  [<ffffffffc078c843>] intel_modeset_gem_init+0x158/0x164 [i915]
> Jul 30 11:46:49 closure kernel: [    4.222584]  [<ffffffffc07b997c>] i915_driver_load+0xf1c/0x1139 [i915]
> Jul 30 11:46:49 closure kernel: [    4.222589]  [<ffffffff810ba715>] ? mark_held_locks+0x56/0x6c
> Jul 30 11:46:49 closure kernel: [    4.222593]  [<ffffffff81620836>] ? _raw_spin_unlock_irqrestore+0x3f/0x4d
> Jul 30 11:46:49 closure kernel: [    4.222597]  [<ffffffff810ba89c>] ? trace_hardirqs_on_caller+0x171/0x18d
> Jul 30 11:46:49 closure kernel: [    4.222607]  [<ffffffffc053af19>] drm_dev_register+0x84/0xfd [drm]
> Jul 30 11:46:49 closure kernel: [    4.222618]  [<ffffffffc053d77e>] drm_get_pci_dev+0x102/0x1bc [drm]
> Jul 30 11:46:49 closure kernel: [    4.222636]  [<ffffffffc07291e2>] i915_pci_probe+0x4f/0x51 [i915]
> Jul 30 11:46:49 closure kernel: [    4.222640]  [<ffffffff81333ad3>] pci_device_probe+0x74/0xd6
> Jul 30 11:46:49 closure kernel: [    4.222644]  [<ffffffff813d4a2e>] ? driver_probe_device+0x387/0x387
> Jul 30 11:46:49 closure kernel: [    4.222648]  [<ffffffff813d4806>] driver_probe_device+0x15f/0x387
> Jul 30 11:46:49 closure kernel: [    4.222652]  [<ffffffff813d4a2e>] ? driver_probe_device+0x387/0x387
> Jul 30 11:46:49 closure kernel: [    4.222655]  [<ffffffff813d4a81>] __driver_attach+0x53/0x74
> Jul 30 11:46:49 closure kernel: [    4.222659]  [<ffffffff813d2aa0>] bus_for_each_dev+0x6f/0x89
> Jul 30 11:46:49 closure kernel: [    4.222662]  [<ffffffff813d41f0>] driver_attach+0x1e/0x20
> Jul 30 11:46:49 closure kernel: [    4.222666]  [<ffffffff813d3e33>] bus_add_driver+0x140/0x238
> Jul 30 11:46:49 closure kernel: [    4.222670]  [<ffffffff813d53d8>] driver_register+0x8f/0xcc
> Jul 30 11:46:49 closure kernel: [    4.222674]  [<ffffffff81332be1>] __pci_register_driver+0x5e/0x62
> Jul 30 11:46:49 closure kernel: [    4.222677]  [<ffffffffc081f000>] ? 0xffffffffc081f000
> Jul 30 11:46:49 closure kernel: [    4.222687]  [<ffffffffc053d890>] drm_pci_init+0x58/0xda [drm]
> Jul 30 11:46:49 closure kernel: [    4.222690]  [<ffffffffc081f000>] ? 0xffffffffc081f000
> Jul 30 11:46:49 closure kernel: [    4.222708]  [<ffffffffc081f0a0>] i915_init+0xa0/0xa8 [i915]
> Jul 30 11:46:49 closure kernel: [    4.222712]  [<ffffffffc081f000>] ? 0xffffffffc081f000
> Jul 30 11:46:49 closure kernel: [    4.222716]  [<ffffffff810003c7>] do_one_initcall+0x19a/0x1af
> Jul 30 11:46:49 closure kernel: [    4.222719]  [<ffffffff8161994d>] ? do_init_module+0x28/0x1e3
> Jul 30 11:46:49 closure kernel: [    4.222723]  [<ffffffff81199350>] ? kmem_cache_alloc_trace+0xba/0xcc
> Jul 30 11:46:49 closure kernel: [    4.222727]  [<ffffffff81619985>] do_init_module+0x60/0x1e3
> Jul 30 11:46:49 closure kernel: [    4.222731]  [<ffffffff810f0a5b>] load_module+0x1c42/0x2059
> Jul 30 11:46:49 closure kernel: [    4.222736]  [<ffffffff810f1046>] SyS_finit_module+0x85/0x92
> Jul 30 11:46:49 closure kernel: [    4.222739]  [<ffffffff8162109b>] entry_SYSCALL_64_fastpath+0x16/0x73

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [Intel-gfx] [REGRESSION] Re: i915 driver crashes on T540p if docking station attached
  2015-07-30 15:57         ` Takashi Iwai
@ 2015-07-30 18:14           ` Linus Torvalds
  0 siblings, 0 replies; 19+ messages in thread
From: Linus Torvalds @ 2015-07-30 18:14 UTC (permalink / raw)
  To: Takashi Iwai
  Cc: Theodore Ts'o, intel-gfx, DRI, Daniel Vetter, Mani Nikula,
	Ander Conselvan de Oliveira, Linux Kernel Mailing List

On Thu, Jul 30, 2015 at 8:57 AM, Takashi Iwai <tiwai@suse.de> wrote:
> On Thu, 30 Jul 2015 17:32:28 +0200,
> Theodore Ts'o wrote:
>>
>> BTW, is there any chance that I can suspend my laptop, and then move
>> it from my docking station at home (where I have a Dell 30" display)
>> to my docking station at work (where I have a Dell 24" display), and
>> actually have the new monitor be detected?  For at least the past
>> year, I have to reboot in order to be able to use the external
>> monitor?  This used to work, but it's been a very long-standing
>> regression.  I understand that Multi-stream DP is an evil horrible hack,
>> and supporting it is painful, but this used to work, and it hasn't in
>> a long time.  :-(
>
> Is it related to this?
>    https://bugs.freedesktop.org/show_bug.cgi?id=89589
>
> I wanted to check this myself, too, as the same bug was reported to
> openSUSE bugzilla, but I had no hardware showing it.

Hmm. That commit e7d6f7d70829 looks like it should still revert fairly
cleanly (just move the call to intel_dp_mst_resume() to before the
intel_modeset_setup_hw_state() call and locking).

Ted, worth checking out, even if that presumably ends up
re-introducing some WARN_ON's..

                    Linus

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [REGRESSION] Re: i915 driver crashes on T540p if docking station attached
  2015-07-30 15:50       ` Theodore Ts'o
  2015-07-30 15:59         ` Theodore Ts'o
  2015-07-30 16:00         ` Daniel Vetter
@ 2015-08-03 15:27         ` Daniel Vetter
  2015-08-03 16:25           ` Theodore Ts'o
  2 siblings, 1 reply; 19+ messages in thread
From: Daniel Vetter @ 2015-08-03 15:27 UTC (permalink / raw)
  To: Theodore Ts'o, Linus Torvalds, intel-gfx, DRI, Daniel Vetter,
	Mani Nikula, Ander Conselvan de Oliveira,
	Linux Kernel Mailing List

On Thu, Jul 30, 2015 at 11:50:29AM -0400, Theodore Ts'o wrote:
> On Thu, Jul 30, 2015 at 04:40:02PM +0200, Daniel Vetter wrote:
> > I have 4 patches in git://people.freedesktop.org/~danvet/drm fixes-stuff
> > but I couldn't test them yet since no dp mst here and I didn't find
> > anything that would ship faster than 1-2 weeks yet. I'll try to get some
> > other people here to test it meanwhile too.
> 
> I've tried pulling in your patches from fixes-stuff, onto Linus's tree
> (without Linus's fix), and the good news is that I'm no longer
> crashing on boot.
> 
> The *bad* news is that (a) it breaks the external monitor attached to
> the docking station completely (this was working with Linus's patch),
> and (b) it's triggering a LOCKDEP failure.
> 
> So even though Linus's patch wasn't supposed to work, I think I'm
> going to go back to it....

Ok I updated fixes-stuff with just 2 patches which seem to be enough to
fix it. Plus a patch to convert Linus' hack into something we can keep
plus a drive-by WARNING fix in mst that got in the way for me.

Seems to work here in getting rid of the Oops. If this tests out for you
too I'll send a pull to Linus.

Thanks, Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [REGRESSION] Re: i915 driver crashes on T540p if docking station attached
  2015-08-03 15:27         ` Daniel Vetter
@ 2015-08-03 16:25           ` Theodore Ts'o
  2015-08-03 17:24             ` Linus Torvalds
  2015-08-04 16:05             ` Daniel Vetter
  0 siblings, 2 replies; 19+ messages in thread
From: Theodore Ts'o @ 2015-08-03 16:25 UTC (permalink / raw)
  To: Linus Torvalds, intel-gfx, DRI, Daniel Vetter, Mani Nikula,
	Ander Conselvan de Oliveira, Linux Kernel Mailing List

On Mon, Aug 03, 2015 at 05:27:29PM +0200, Daniel Vetter wrote:
> 
> Ok I updated fixes-stuff with just 2 patches which seem to be enough to
> fix it. Plus a patch to convert Linus' hack into something we can keep
> plus a drive-by WARNING fix in mst that got in the way for me.
> 
> Seems to work here in getting rid of the Oops. If this tests out for you
> too I'll send a pull to Linus.

I've just tried pulling in your updated fixes-stuff, and it avoids the
oops and allows the external monitor to work correctly.  However, I'm
still seeing a large number of drm/i915 related warning messages and
other kernel kvetching.

Thanks!!

						- Ted

[    4.084198] [drm] Initialized drm 1.1.0 20060810
[    4.129576] [drm] Memory usable by graphics device = 2048M
[    4.129616] [drm] Replacing VGA console driver
[    4.130315] Console: switching to colour dummy device 80x25
[    4.145332] [drm] Supports vblank timestamp caching Rev 2 (21.10.2013).
[    4.145334] [drm] Driver supports precise vblank timestamp query.
[    4.146184] vgaarb: device changed decodes: PCI:0000:00:02.0,olddecodes=io+mem,decodes=io+mem:owns=io+mem
[    4.163778] usbcore: registered new interface driver btusb
[    4.170719] ------------[ cut here ]------------
[    4.170749] WARNING: CPU: 0 PID: 463 at /usr/projects/linux/linux/drivers/gpu/drm/i915/intel_pm.c:2339 ilk_update_wm+0x71a/0xb27 [i915]()
[    4.170751] WARN_ON(!r->enable)
[    4.170752] Modules linked in:
[    4.170754]  btusb btrtl btbcm btintel iwlmvm(+) bluetooth mac80211 iwlwifi snd_hda_intel i915(+) drm_kms_helper snd_hda_codec cfg80211 drm snd_hwdep lpc_ich snd_hda_core intel_gtt thinkpad_acpi tpm_tis nvram tpm intel_smartconnect uvcvideo videobuf2_vmalloc videobuf2_memops videobuf2_core sch_fq_codel kvm_intel kvm ecryptfs parport_pc ppdev lp parport autofs4 btrfs xor hid_generic usbhid hid raid6_pq microcode rtsx_pci_sdmmc ehci_pci e1000e rtsx_pci ehci_hcd xhci_pci ptp mfd_core pps_core xhci_hcd
[    4.170805] CPU: 0 PID: 463 Comm: systemd-udevd Not tainted 4.2.0-rc5-14194-g130583b #18
[    4.170807] Hardware name: LENOVO 20BECTO1WW/20BECTO1WW, BIOS GMET59WW (2.07 ) 02/12/2014
[    4.170809]  0000000000000009 ffff880403f0f4c8 ffffffff8161aaee 0000000000000006
[    4.170814]  ffff880403f0f518 ffff880403f0f508 ffffffff8107e5f0 0000000000000006
[    4.170818]  ffffffffc05ade43 ffff8800c8b70000 ffff8800c7f16000 ffff880405fb48b8
[    4.170823] Call Trace:
[    4.170829]  [<ffffffff8161aaee>] dump_stack+0x4c/0x65
[    4.170833]  [<ffffffff8107e5f0>] warn_slowpath_common+0xa1/0xbb
[    4.170856]  [<ffffffffc05ade43>] ? ilk_update_wm+0x71a/0xb27 [i915]
[    4.170859]  [<ffffffff8107e650>] warn_slowpath_fmt+0x46/0x48
[    4.170879]  [<ffffffffc05abb1e>] ? ilk_compute_wm_maximums+0x43/0xa2 [i915]
[    4.170899]  [<ffffffffc05ade43>] ilk_update_wm+0x71a/0xb27 [i915]
[    4.170921]  [<ffffffffc05afb2b>] intel_update_watermarks+0x1e/0x20 [i915]
[    4.170957]  [<ffffffffc05ff8d4>] haswell_crtc_disable+0x270/0x2ae [i915]
[    4.170989]  [<ffffffffc060199d>] intel_crtc_control+0xa0/0xe1 [i915]
[    4.171020]  [<ffffffffc0601a2b>] intel_crtc_update_dpms+0x4d/0x5d [i915]
[    4.171052]  [<ffffffffc0607dd9>] intel_modeset_setup_hw_state+0x7b0/0xa90 [i915]
[    4.171081]  [<ffffffffc05ec6de>] ? hsw_write64+0xcd/0xcd [i915]
[    4.171113]  [<ffffffffc060ab44>] ? ilk_fbc_disable+0x29/0x69 [i915]
[    4.171142]  [<ffffffffc0609512>] intel_modeset_init+0x130d/0x14e3 [i915]
[    4.171179]  [<ffffffffc0636962>] i915_driver_load+0xf05/0x1139 [i915]
[    4.171183]  [<ffffffff810ba787>] ? mark_held_locks+0x56/0x6c
[    4.171186]  [<ffffffff81620c06>] ? _raw_spin_unlock_irqrestore+0x3f/0x4d
[    4.171189]  [<ffffffff810ba90e>] ? trace_hardirqs_on_caller+0x171/0x18d
[    4.171204]  [<ffffffffc042cf19>] drm_dev_register+0x84/0xfd [drm]
[    4.171215]  [<ffffffffc042f77e>] drm_get_pci_dev+0x102/0x1bc [drm]
[    4.171237]  [<ffffffffc05a61e2>] i915_pci_probe+0x4f/0x51 [i915]
[    4.171240]  [<ffffffff81333c33>] pci_device_probe+0x74/0xd6
[    4.171245]  [<ffffffff813d4b8e>] ? driver_probe_device+0x387/0x387
[    4.171248]  [<ffffffff813d4966>] driver_probe_device+0x15f/0x387
[    4.171250]  [<ffffffff813d4b8e>] ? driver_probe_device+0x387/0x387
[    4.171252]  [<ffffffff813d4be1>] __driver_attach+0x53/0x74
[    4.171255]  [<ffffffff813d2c00>] bus_for_each_dev+0x6f/0x89
[    4.171257]  [<ffffffff813d4350>] driver_attach+0x1e/0x20
[    4.171260]  [<ffffffff813d3f93>] bus_add_driver+0x140/0x238
[    4.171263]  [<ffffffff813d5538>] driver_register+0x8f/0xcc
[    4.171266]  [<ffffffff81332d41>] __pci_register_driver+0x5e/0x62
[    4.171268]  [<ffffffffc069c000>] ? 0xffffffffc069c000
[    4.171278]  [<ffffffffc042f890>] drm_pci_init+0x58/0xda [drm]
[    4.171281]  [<ffffffffc069c000>] ? 0xffffffffc069c000
[    4.171301]  [<ffffffffc069c0a0>] i915_init+0xa0/0xa8 [i915]
[    4.171303]  [<ffffffffc069c000>] ? 0xffffffffc069c000
[    4.171307]  [<ffffffff810003c7>] do_one_initcall+0x19a/0x1af
[    4.171310]  [<ffffffff81619d1d>] ? do_init_module+0x28/0x1e3
[    4.171313]  [<ffffffff81199429>] ? kmem_cache_alloc_trace+0xba/0xcc
[    4.171315]  [<ffffffff81619d55>] do_init_module+0x60/0x1e3
[    4.171319]  [<ffffffff810f0acd>] load_module+0x1c42/0x2059
[    4.171324]  [<ffffffff810f10b8>] SyS_finit_module+0x85/0x92
[    4.171327]  [<ffffffff8162145b>] entry_SYSCALL_64_fastpath+0x16/0x73
[    4.171329] ---[ end trace 7eb514b89de5fc4a ]---
[    4.171331] ------------[ cut here ]------------
[    4.171354] WARNING: CPU: 0 PID: 463 at /usr/projects/linux/linux/drivers/gpu/drm/i915/intel_pm.c:2339 ilk_update_wm+0x71a/0xb27 [i915]()
[    4.171355] WARN_ON(!r->enable)
[    4.171357] Modules linked in:
[    4.171358]  btusb btrtl btbcm btintel iwlmvm(+) bluetooth mac80211 iwlwifi snd_hda_intel i915(+) drm_kms_helper snd_hda_codec cfg80211 drm snd_hwdep lpc_ich snd_hda_core intel_gtt thinkpad_acpi tpm_tis nvram tpm intel_smartconnect uvcvideo videobuf2_vmalloc videobuf2_memops videobuf2_core sch_fq_codel kvm_intel kvm ecryptfs parport_pc ppdev lp parport autofs4 btrfs xor hid_generic usbhid hid raid6_pq microcode rtsx_pci_sdmmc ehci_pci e1000e rtsx_pci ehci_hcd xhci_pci ptp mfd_core pps_core xhci_hcd
[    4.171404] CPU: 0 PID: 463 Comm: systemd-udevd Tainted: G        W       4.2.0-rc5-14194-g130583b #18
[    4.171406] Hardware name: LENOVO 20BECTO1WW/20BECTO1WW, BIOS GMET59WW (2.07 ) 02/12/2014
[    4.171408]  0000000000000009 ffff880403f0f4c8 ffffffff8161aaee 0000000000000006
[    4.171412]  ffff880403f0f518 ffff880403f0f508 ffffffff8107e5f0 0000000000000006
[    4.171417]  ffffffffc05ade43 ffff8800c8b70000 ffff8800c7f15000 ffff880405fb48b8
[    4.171421] Call Trace:
[    4.171424]  [<ffffffff8161aaee>] dump_stack+0x4c/0x65
[    4.171427]  [<ffffffff8107e5f0>] warn_slowpath_common+0xa1/0xbb
[    4.171449]  [<ffffffffc05ade43>] ? ilk_update_wm+0x71a/0xb27 [i915]
[    4.171452]  [<ffffffff8107e650>] warn_slowpath_fmt+0x46/0x48
[    4.171472]  [<ffffffffc05abb1e>] ? ilk_compute_wm_maximums+0x43/0xa2 [i915]
[    4.171491]  [<ffffffffc05ade43>] ilk_update_wm+0x71a/0xb27 [i915]
[    4.171513]  [<ffffffffc05afb2b>] intel_update_watermarks+0x1e/0x20 [i915]
[    4.171546]  [<ffffffffc05ff8d4>] haswell_crtc_disable+0x270/0x2ae [i915]
[    4.171579]  [<ffffffffc060199d>] intel_crtc_control+0xa0/0xe1 [i915]
[    4.171610]  [<ffffffffc0601a2b>] intel_crtc_update_dpms+0x4d/0x5d [i915]
[    4.171641]  [<ffffffffc0607dd9>] intel_modeset_setup_hw_state+0x7b0/0xa90 [i915]
[    4.171671]  [<ffffffffc05ec6de>] ? hsw_write64+0xcd/0xcd [i915]
[    4.171702]  [<ffffffffc060ab44>] ? ilk_fbc_disable+0x29/0x69 [i915]
[    4.171733]  [<ffffffffc0609512>] intel_modeset_init+0x130d/0x14e3 [i915]
[    4.171770]  [<ffffffffc0636962>] i915_driver_load+0xf05/0x1139 [i915]
[    4.171773]  [<ffffffff810ba787>] ? mark_held_locks+0x56/0x6c
[    4.171776]  [<ffffffff81620c06>] ? _raw_spin_unlock_irqrestore+0x3f/0x4d
[    4.171779]  [<ffffffff810ba90e>] ? trace_hardirqs_on_caller+0x171/0x18d
[    4.171791]  [<ffffffffc042cf19>] drm_dev_register+0x84/0xfd [drm]
[    4.171802]  [<ffffffffc042f77e>] drm_get_pci_dev+0x102/0x1bc [drm]
[    4.171825]  [<ffffffffc05a61e2>] i915_pci_probe+0x4f/0x51 [i915]
[    4.171828]  [<ffffffff81333c33>] pci_device_probe+0x74/0xd6
[    4.171831]  [<ffffffff813d4b8e>] ? driver_probe_device+0x387/0x387
[    4.171833]  [<ffffffff813d4966>] driver_probe_device+0x15f/0x387
[    4.171836]  [<ffffffff813d4b8e>] ? driver_probe_device+0x387/0x387
[    4.171838]  [<ffffffff813d4be1>] __driver_attach+0x53/0x74
[    4.171841]  [<ffffffff813d2c00>] bus_for_each_dev+0x6f/0x89
[    4.171844]  [<ffffffff813d4350>] driver_attach+0x1e/0x20
[    4.171846]  [<ffffffff813d3f93>] bus_add_driver+0x140/0x238
[    4.171849]  [<ffffffff813d5538>] driver_register+0x8f/0xcc
[    4.171852]  [<ffffffff81332d41>] __pci_register_driver+0x5e/0x62
[    4.171854]  [<ffffffffc069c000>] ? 0xffffffffc069c000
[    4.171866]  [<ffffffffc042f890>] drm_pci_init+0x58/0xda [drm]
[    4.171868]  [<ffffffffc069c000>] ? 0xffffffffc069c000
[    4.171890]  [<ffffffffc069c0a0>] i915_init+0xa0/0xa8 [i915]
[    4.171893]  [<ffffffffc069c000>] ? 0xffffffffc069c000
[    4.171896]  [<ffffffff810003c7>] do_one_initcall+0x19a/0x1af
[    4.171898]  [<ffffffff81619d1d>] ? do_init_module+0x28/0x1e3
[    4.171901]  [<ffffffff81199429>] ? kmem_cache_alloc_trace+0xba/0xcc
[    4.171904]  [<ffffffff81619d55>] do_init_module+0x60/0x1e3
[    4.171907]  [<ffffffff810f0acd>] load_module+0x1c42/0x2059
[    4.171911]  [<ffffffff810f10b8>] SyS_finit_module+0x85/0x92
[    4.171914]  [<ffffffff8162145b>] entry_SYSCALL_64_fastpath+0x16/0x73
[    4.171916] ---[ end trace 7eb514b89de5fc4b ]---
[    4.176978] Bluetooth: hci0: read Intel version: 370710018002030d48
[    4.176981] Bluetooth: hci0: Intel device is already patched. patch num: 48

[    4.181839] ======================================================
[    4.181844] [ INFO: possible circular locking dependency detected ]
[    4.181849] 4.2.0-rc5-14194-g130583b #18 Tainted: G        W      
[    4.181854] -------------------------------------------------------
[    4.181859] systemd-udevd/463 is trying to acquire lock:
[    4.181864]  (init_mutex){+.+.+.}, at: [<ffffffff8138b4e0>] acpi_video_get_backlight_type+0x17/0x164
[    4.181878] 
               but task is already holding lock:
[    4.181883]  (&(&backlight_notifier)->rwsem){++++..}, at: [<ffffffff8109a7cc>] __blocking_notifier_call_chain+0x37/0x69
[    4.181895] 
               which lock already depends on the new lock.

[    4.181902] 
               the existing dependency chain (in reverse order) is:
[    4.181912] 
               -> #1 (&(&backlight_notifier)->rwsem){++++..}:
[    4.181923]        [<ffffffff810bbe7a>] lock_acquire+0x104/0x18b
[    4.181932]        [<ffffffff8161f5ab>] down_write+0x46/0x8a
[    4.181942]        [<ffffffff8109a6c3>] blocking_notifier_chain_register+0x36/0x57
[    4.181953]        [<ffffffff8134ecae>] backlight_register_notifier+0x18/0x1a
[    4.181962]        [<ffffffff8138b5c3>] acpi_video_get_backlight_type+0xfa/0x164
[    4.181973]        [<ffffffffc03d2e45>] 0xffffffffc03d2e45
[    4.181981]        [<ffffffffc03d38a8>] 0xffffffffc03d38a8
[    4.181988]        [<ffffffff810003c7>] do_one_initcall+0x19a/0x1af
[    4.181997]        [<ffffffff81619d55>] do_init_module+0x60/0x1e3
[    4.182006]        [<ffffffff810f0acd>] load_module+0x1c42/0x2059
[    4.182015]        [<ffffffff810f10b8>] SyS_finit_module+0x85/0x92
[    4.182023]        [<ffffffff8162145b>] entry_SYSCALL_64_fastpath+0x16/0x73
[    4.182031] 
               -> #0 (init_mutex){+.+.+.}:
[    4.182042]        [<ffffffff810bb7ec>] __lock_acquire+0xc55/0xf54
[    4.182050]        [<ffffffff810bbe7a>] lock_acquire+0x104/0x18b
[    4.182058]        [<ffffffff8161dc0a>] mutex_lock_nested+0x70/0x391
[    4.182066]        [<ffffffff8138b4e0>] acpi_video_get_backlight_type+0x17/0x164
[    4.182077]        [<ffffffff8138b665>] acpi_video_backlight_notify+0x19/0x2f
[    4.182086]        [<ffffffff8109a445>] notifier_call_chain+0x4c/0x71
[    4.182094]        [<ffffffff8109a7e5>] __blocking_notifier_call_chain+0x50/0x69
[    4.182105]        [<ffffffff8109a812>] blocking_notifier_call_chain+0x14/0x16
[    4.182116]        [<ffffffff8134f183>] backlight_device_register+0x1df/0x1f1
[    4.182125]        [<ffffffffc063005e>] intel_backlight_register+0xf0/0x157 [i915]
[    4.182174]        [<ffffffffc0609840>] intel_modeset_gem_init+0x158/0x164 [i915]
[    4.182214]        [<ffffffffc0636979>] i915_driver_load+0xf1c/0x1139 [i915]
[    4.182253]        [<ffffffffc042cf19>] drm_dev_register+0x84/0xfd [drm]
[    4.182271]        [<ffffffffc042f77e>] drm_get_pci_dev+0x102/0x1bc [drm]
[    4.182287]        [<ffffffffc05a61e2>] i915_pci_probe+0x4f/0x51 [i915]
[    4.182314]        [<ffffffff81333c33>] pci_device_probe+0x74/0xd6
[    4.182322]        [<ffffffff813d4966>] driver_probe_device+0x15f/0x387
[    4.182331]        [<ffffffff813d4be1>] __driver_attach+0x53/0x74
[    4.182339]        [<ffffffff813d2c00>] bus_for_each_dev+0x6f/0x89
[    4.182347]        [<ffffffff813d4350>] driver_attach+0x1e/0x20
[    4.182355]        [<ffffffff813d3f93>] bus_add_driver+0x140/0x238
[    4.182363]        [<ffffffff813d5538>] driver_register+0x8f/0xcc
[    4.182371]        [<ffffffff81332d41>] __pci_register_driver+0x5e/0x62
[    4.182379]        [<ffffffffc042f890>] drm_pci_init+0x58/0xda [drm]
[    4.182396]        [<ffffffffc069c0a0>] i915_init+0xa0/0xa8 [i915]
[    4.182423]        [<ffffffff810003c7>] do_one_initcall+0x19a/0x1af
[    4.182432]        [<ffffffff81619d55>] do_init_module+0x60/0x1e3
[    4.182440]        [<ffffffff810f0acd>] load_module+0x1c42/0x2059
[    4.182448]        [<ffffffff810f10b8>] SyS_finit_module+0x85/0x92
[    4.182456]        [<ffffffff8162145b>] entry_SYSCALL_64_fastpath+0x16/0x73
[    4.182465] 
               other info that might help us debug this:

[    4.182477]  Possible unsafe locking scenario:

[    4.182486]        CPU0                    CPU1
[    4.182491]        ----                    ----
[    4.182497]   lock(&(&backlight_notifier)->rwsem);
[    4.182504]                                lock(init_mutex);
[    4.182512]                                lock(&(&backlight_notifier)->rwsem);
[    4.182522]   lock(init_mutex);
[    4.182528] 
                *** DEADLOCK ***

[    4.182540] 4 locks held by systemd-udevd/463:
[    4.182546]  #0:  (&dev->mutex){......}, at: [<ffffffff813d4151>] device_lock+0xf/0x11
[    4.182560]  #1:  (&dev->mutex){......}, at: [<ffffffff813d4151>] device_lock+0xf/0x11
[    4.182574]  #2:  (drm_global_mutex){+.+.+.}, at: [<ffffffffc042ceb9>] drm_dev_register+0x24/0xfd [drm]
[    4.182596]  #3:  (&(&backlight_notifier)->rwsem){++++..}, at: [<ffffffff8109a7cc>] __blocking_notifier_call_chain+0x37/0x69
[    4.182612] 
               stack backtrace:
[    4.182622] CPU: 0 PID: 463 Comm: systemd-udevd Tainted: G        W       4.2.0-rc5-14194-g130583b #18
[    4.182632] Hardware name: LENOVO 20BECTO1WW/20BECTO1WW, BIOS GMET59WW (2.07 ) 02/12/2014
[    4.182642]  ffffffff8280b780 ffff880403f0f5d8 ffffffff8161aaee 0000000000000006
[    4.182654]  ffffffff8280b780 ffff880403f0f628 ffffffff810b9b51 ffffffff82265780
[    4.182667]  ffff880403de0000 0000000000000004 ffff880403de0880 0000000000000004
[    4.182679] Call Trace:
[    4.182685]  [<ffffffff8161aaee>] dump_stack+0x4c/0x65
[    4.182693]  [<ffffffff810b9b51>] print_circular_bug+0x1f8/0x209
[    4.182701]  [<ffffffff810bb7ec>] __lock_acquire+0xc55/0xf54
[    4.182710]  [<ffffffff810bbe7a>] lock_acquire+0x104/0x18b
[    4.182717]  [<ffffffff8138b4e0>] ? acpi_video_get_backlight_type+0x17/0x164
[    4.182726]  [<ffffffff8161dc0a>] mutex_lock_nested+0x70/0x391
[    4.182734]  [<ffffffff8138b4e0>] ? acpi_video_get_backlight_type+0x17/0x164
[    4.182742]  [<ffffffff8138b4e0>] ? acpi_video_get_backlight_type+0x17/0x164
[    4.182750]  [<ffffffff8138b4e0>] acpi_video_get_backlight_type+0x17/0x164
[    4.182759]  [<ffffffff8138b665>] acpi_video_backlight_notify+0x19/0x2f
[    4.182766]  [<ffffffff8109a445>] notifier_call_chain+0x4c/0x71
[    4.182774]  [<ffffffff8109a7e5>] __blocking_notifier_call_chain+0x50/0x69
[    4.182782]  [<ffffffff8109a812>] blocking_notifier_call_chain+0x14/0x16
[    4.182790]  [<ffffffff8134f183>] backlight_device_register+0x1df/0x1f1
[    4.182833]  [<ffffffffc063005e>] intel_backlight_register+0xf0/0x157 [i915]
[    4.182872]  [<ffffffffc0609840>] intel_modeset_gem_init+0x158/0x164 [i915]
[    4.182915]  [<ffffffffc0636979>] i915_driver_load+0xf1c/0x1139 [i915]
[    4.182924]  [<ffffffff810ba787>] ? mark_held_locks+0x56/0x6c
[    4.182932]  [<ffffffff81620c06>] ? _raw_spin_unlock_irqrestore+0x3f/0x4d
[    4.182940]  [<ffffffff810ba90e>] ? trace_hardirqs_on_caller+0x171/0x18d
[    4.182956]  [<ffffffffc042cf19>] drm_dev_register+0x84/0xfd [drm]
[    4.182972]  [<ffffffffc042f77e>] drm_get_pci_dev+0x102/0x1bc [drm]
[    4.182998]  [<ffffffffc05a61e2>] i915_pci_probe+0x4f/0x51 [i915]
[    4.183006]  [<ffffffff81333c33>] pci_device_probe+0x74/0xd6
[    4.183014]  [<ffffffff813d4b8e>] ? driver_probe_device+0x387/0x387
[    4.183021]  [<ffffffff813d4966>] driver_probe_device+0x15f/0x387
[    4.183029]  [<ffffffff813d4b8e>] ? driver_probe_device+0x387/0x387
[    4.183036]  [<ffffffff813d4be1>] __driver_attach+0x53/0x74
[    4.183043]  [<ffffffff813d2c00>] bus_for_each_dev+0x6f/0x89
[    4.183050]  [<ffffffff813d4350>] driver_attach+0x1e/0x20
[    4.183058]  [<ffffffff813d3f93>] bus_add_driver+0x140/0x238
[    4.183065]  [<ffffffff813d5538>] driver_register+0x8f/0xcc
[    4.183073]  [<ffffffff81332d41>] __pci_register_driver+0x5e/0x62
[    4.183080]  [<ffffffffc069c000>] ? 0xffffffffc069c000
[    4.183095]  [<ffffffffc042f890>] drm_pci_init+0x58/0xda [drm]
[    4.183102]  [<ffffffffc069c000>] ? 0xffffffffc069c000
[    4.183126]  [<ffffffffc069c0a0>] i915_init+0xa0/0xa8 [i915]
[    4.183132]  [<ffffffffc069c000>] ? 0xffffffffc069c000
[    4.183139]  [<ffffffff810003c7>] do_one_initcall+0x19a/0x1af
[    4.183146]  [<ffffffff81619d1d>] ? do_init_module+0x28/0x1e3
[    4.183153]  [<ffffffff81199429>] ? kmem_cache_alloc_trace+0xba/0xcc
[    4.183161]  [<ffffffff81619d55>] do_init_module+0x60/0x1e3
[    4.183169]  [<ffffffff810f0acd>] load_module+0x1c42/0x2059
[    4.183178]  [<ffffffff810f10b8>] SyS_finit_module+0x85/0x92
[    4.183185]  [<ffffffff8162145b>] entry_SYSCALL_64_fastpath+0x16/0x73
[    4.186598] ACPI: Video Device [VID] (multi-head: yes  rom: no  post: no)
[    4.191522] snd_hda_intel 0000:00:03.0: bound 0000:00:02.0 (ops i915_audio_component_bind_ops [i915])
[    4.191536] [drm] Initialized i915 1.6.0 20150522 for 0000:00:02.0 on minor 0
[    4.191691] i801_smbus 0000:00:1f.3: SMBus using PCI interrupt
[    4.248792] [drm] GMBUS [i915 gmbus dpb] timed out, falling back to bit banging on pin 5
[    4.322899] fbcon: inteldrmfb (fb0) is primary device
[    5.440946] Console: switching to colour frame buffer device 360x101
[    5.452747] i915 0000:00:02.0: fb0: inteldrmfb frame buffer device
[    5.452767] i915 0000:00:02.0: registered panic notifier

* Re: [REGRESSION] Re: i915 driver crashes on T540p if docking station attached
  2015-08-03 16:25           ` Theodore Ts'o
@ 2015-08-03 17:24             ` Linus Torvalds
  2015-08-03 18:49               ` Theodore Ts'o
  2015-08-03 22:05               ` Daniel Vetter
  2015-08-04 16:05             ` Daniel Vetter
  1 sibling, 2 replies; 19+ messages in thread
From: Linus Torvalds @ 2015-08-03 17:24 UTC (permalink / raw)
  To: Theodore Ts'o, Linus Torvalds, intel-gfx, DRI, Daniel Vetter,
	Mani Nikula, Ander Conselvan de Oliveira,
	Linux Kernel Mailing List

On Mon, Aug 3, 2015 at 9:25 AM, Theodore Ts'o <tytso@mit.edu> wrote:
>
> I've just tried pulling in your updated fixes-stuff, and it avoids the
> oops and allows the external monitor to work correctly.

Good. Have either of you tested the suspend/resume behavior? Is that fixed too?

>                      However, I'm
> still seeing a large number of drm/i915 related warning messages and
> other kernel kvetching.

I suspect I can live with that for now. The lockdep one looks like
it's mainly an initialization issue, so you'd never get the actual
deadlock in practice, but it's obviously annoying.  The intel_pm.c one
I'll have to defer to the i915 people for..

I'll be travelling much of this week (flying to Finland tomorrow, back
on Sunday - yay, 30h in airplanes for three days on the ground, but
it's my dad's bday), and my internet will be sporadic. But I'll have a
laptop and be able to pull stuff every once in a while.

It would be good to have this one resolved, and I just need to worry
about the remaining VM problem..

                           Linus

> [    4.170749] WARNING: CPU: 0 PID: 463 at drivers/gpu/drm/i915/intel_pm.c:2339 ilk_update_wm+0x71a/0xb27 [i915]()
> [    4.170751] WARN_ON(!r->enable)

* Re: [REGRESSION] Re: i915 driver crashes on T540p if docking station attached
  2015-08-03 17:24             ` Linus Torvalds
@ 2015-08-03 18:49               ` Theodore Ts'o
  2015-08-03 22:05               ` Daniel Vetter
  1 sibling, 0 replies; 19+ messages in thread
From: Theodore Ts'o @ 2015-08-03 18:49 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: intel-gfx, DRI, Daniel Vetter, Mani Nikula,
	Ander Conselvan de Oliveira, Linux Kernel Mailing List

On Mon, Aug 03, 2015 at 10:24:53AM -0700, Linus Torvalds wrote:
> On Mon, Aug 3, 2015 at 9:25 AM, Theodore Ts'o <tytso@mit.edu> wrote:
> >
> > I've just tried pulling in your updated fixes-stuff, and it avoids the
> > oops and allows the external monitor to work correctly.
> 
> Good. Have either of you tested the suspend/resume behavior? Is that fixed too?

No, I haven't had a chance to test the suspend/resume behavior,
because that requires suspending at work, going home, and connecting
to a dock which has a different monitor attached to it, and resuming
(or vice versa of suspending at home and then resuming at work).

So it's a bit trickier for me to test.  It's also not a regression:
the workaround of rebooting is annoying, but I've lived with it for
several releases now.  I'll try the two patches/changes that folks
had suggested, hopefully later this week.

						- Ted

* Re: [REGRESSION] Re: i915 driver crashes on T540p if docking station attached
  2015-08-03 17:24             ` Linus Torvalds
  2015-08-03 18:49               ` Theodore Ts'o
@ 2015-08-03 22:05               ` Daniel Vetter
  2015-08-04  1:17                 ` Rafael J. Wysocki
  1 sibling, 1 reply; 19+ messages in thread
From: Daniel Vetter @ 2015-08-03 22:05 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Theodore Ts'o, intel-gfx, DRI, Daniel Vetter, Mani Nikula,
	Ander Conselvan de Oliveira, Linux Kernel Mailing List,
	Hans de Goede, Hart, Rafael J. Wysocki

On Mon, Aug 3, 2015 at 7:24 PM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>>                      However, I'm
>> still seeing a large number of drm/i915 related warning messages and
>> other kernel kvetching.
>
> I suspect I can live with that for now. The lockdep one looks like
> it's mainly an initialization issue, so you'd never get the actual
> deadlock in practice, but it's obviously annoying.  The intel_pm.c one
> I'll have to defer to the i915 people for..

The lockdep splat is just acpi being inconsistent with init_mutex vs.
backlight notifier_chain (which has its own lock) calls. init_mutex
is new in 4.2 and has been added in

commit 87521e16a7abbf3fa337f56cb4d1e18247f15e8a
Author: Hans de Goede <hdegoede@redhat.com>
Date:   Tue Jun 16 16:27:48 2015 +0200

    acpi-video-detect: Rewrite backlight interface selection logic


Not mine ;-) But adding relevant people.

I'll send you a pull for the mst one tomorrow and look into the
watermark fail in intel_pm.c too.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
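
For readers following the lockdep report, the ordering inversion
described above can be sketched in plain user-space C. This is a toy
model only: the `toy_*` names are hypothetical, it does not use or
reproduce the kernel's lockdep code, and the two replayed paths are
read off the splat earlier in the thread (init_mutex held while the
notifier rwsem is taken for registration, then the rwsem held while a
notifier callback takes init_mutex). It records "B was taken while A
was held" edges and flags the acquisition that would close a cycle.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-ins for the two locks named in the splat. */
enum toy_lock { INIT_MUTEX, BACKLIGHT_RWSEM, NR_TOY_LOCKS };

struct toy_lockdep {
	bool held[NR_TOY_LOCKS];
	/* dep[a][b] is set once lock b has been taken while a was held. */
	bool dep[NR_TOY_LOCKS][NR_TOY_LOCKS];
};

/* Depth-first search along recorded edges: is 'to' reachable from 'from'? */
static bool toy_reaches(const struct toy_lockdep *ld, int from, int to,
			bool *seen)
{
	if (from == to)
		return true;
	seen[from] = true;
	for (int next = 0; next < NR_TOY_LOCKS; next++)
		if (ld->dep[from][next] && !seen[next] &&
		    toy_reaches(ld, next, to, seen))
			return true;
	return false;
}

/* Returns false if taking 'lock' now would close a dependency cycle. */
static bool toy_acquire(struct toy_lockdep *ld, enum toy_lock lock)
{
	bool ok = true;

	assert(lock < NR_TOY_LOCKS);
	for (int h = 0; h < NR_TOY_LOCKS; h++) {
		bool seen[NR_TOY_LOCKS] = { false };

		if (!ld->held[h])
			continue;
		/* New edge h -> lock is a cycle if lock already leads to h. */
		if (toy_reaches(ld, lock, h, seen))
			ok = false;
		ld->dep[h][lock] = true;
	}
	ld->held[lock] = true;
	return ok;
}

static void toy_release(struct toy_lockdep *ld, enum toy_lock lock)
{
	ld->held[lock] = false;
}

/* Replays both paths; returns how many acquisitions closed a cycle. */
int toy_replay_backlight_paths(void)
{
	struct toy_lockdep ld;
	int reports = 0;

	memset(&ld, 0, sizeof(ld));

	/* Chain entry #1: init_mutex held, rwsem taken to register. */
	reports += !toy_acquire(&ld, INIT_MUTEX);
	reports += !toy_acquire(&ld, BACKLIGHT_RWSEM);
	toy_release(&ld, BACKLIGHT_RWSEM);
	toy_release(&ld, INIT_MUTEX);

	/* Chain entry #0: rwsem held by the notifier call chain,
	 * callback then takes init_mutex. */
	reports += !toy_acquire(&ld, BACKLIGHT_RWSEM);
	reports += !toy_acquire(&ld, INIT_MUTEX);
	toy_release(&ld, INIT_MUTEX);
	toy_release(&ld, BACKLIGHT_RWSEM);

	return reports;
}
```

Note that the model flags the inversion even though both paths run
one after the other on a single thread, which matches the point made
earlier in the thread that the report does not imply an actual
deadlock was hit.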

* Re: [REGRESSION] Re: i915 driver crashes on T540p if docking station attached
  2015-08-03 22:05               ` Daniel Vetter
@ 2015-08-04  1:17                 ` Rafael J. Wysocki
  0 siblings, 0 replies; 19+ messages in thread
From: Rafael J. Wysocki @ 2015-08-04  1:17 UTC (permalink / raw)
  To: Daniel Vetter, Hans de Goede
  Cc: Linus Torvalds, Theodore Ts'o, intel-gfx, DRI, Daniel Vetter,
	Mani Nikula, Ander Conselvan de Oliveira,
	Linux Kernel Mailing List, Hart, Rafael J. Wysocki

On Tuesday, August 04, 2015 12:05:14 AM Daniel Vetter wrote:
> On Mon, Aug 3, 2015 at 7:24 PM, Linus Torvalds
> <torvalds@linux-foundation.org> wrote:
> >>                      However, I'm
> >> still seeing a large number of drm/i915 related warning messages and
> >> other kernel kvetching.
> >
> > I suspect I can live with that for now. The lockdep one looks like
> > it's mainly an initialization issue, so you'd never get the actual
> > deadlock in practice, but it's obviously annoying.  The intel_pm.c one
> > I'll have to defer to the i915 people for..
> 
> The lockdep splat is just acpi being inconsistent with init_mutex vs.
> backlight notifier_chain (which has its own lock) calls. init_mutex
> is new in 4.2 and has been added in
> 
> commit 87521e16a7abbf3fa337f56cb4d1e18247f15e8a
> Author: Hans de Goede <hdegoede@redhat.com>
> Date:   Tue Jun 16 16:27:48 2015 +0200
> 
>     acpi-video-detect: Rewrite backlight interface selection logic
> 
> 
> Not mine ;-) But adding relevant people.

Hans, can you have a look at this, please?

Rafael


* Re: [REGRESSION] Re: i915 driver crashes on T540p if docking station attached
  2015-08-03 16:25           ` Theodore Ts'o
  2015-08-03 17:24             ` Linus Torvalds
@ 2015-08-04 16:05             ` Daniel Vetter
  1 sibling, 0 replies; 19+ messages in thread
From: Daniel Vetter @ 2015-08-04 16:05 UTC (permalink / raw)
  To: Theodore Ts'o, Linus Torvalds, intel-gfx, DRI, Daniel Vetter,
	Mani Nikula, Ander Conselvan de Oliveira,
	Linux Kernel Mailing List

On Mon, Aug 03, 2015 at 12:25:11PM -0400, Theodore Ts'o wrote:
> On Mon, Aug 03, 2015 at 05:27:29PM +0200, Daniel Vetter wrote:
> > 
> > Ok I updated fixes-stuff with just 2 patches which seem to be enough to
> > fix it. Plus a patch to convert Linus' hack into something we can keep
> > plus a drive-by WARNING fix in mst that got in the way for me.
> > 
> > Seems to work here in getting rid of the Oops. If this tests out for you
> > too I'll send a pull to Linus.
> 
> I've just tried pulling in your updated fixes-stuff, and it avoids the
> oops and allows the external monitor to work correctly.  However, I'm
> still seeing a large number of drm/i915 related warning messages and
> other kernel kvetching.

Involved a bit of head-scratching since I'm not too familiar with the
watermark code and it gained a lot of complexity for atomic. But the below
patch should be able to fix this WARNING (and it looks like it was a
genuine one). If it works for you I'll bake it into a proper patch.

Thanks, Daniel

diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index 30e0f54ba19d..ae07fd0c395c 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -15121,6 +15121,11 @@ void intel_modeset_setup_hw_state(struct drm_device *dev,
 
 	intel_modeset_readout_hw_state(dev);
 
+	if (IS_GEN9(dev))
+		skl_wm_get_hw_state(dev);
+	else if (HAS_PCH_SPLIT(dev))
+		ilk_wm_get_hw_state(dev);
+
 	/*
 	 * Now that we have the config, copy it to each CRTC struct
 	 * Note that this could go away if we move to using crtc_config
@@ -15162,11 +15167,6 @@ void intel_modeset_setup_hw_state(struct drm_device *dev,
 		pll->on = false;
 	}
 
-	if (IS_GEN9(dev))
-		skl_wm_get_hw_state(dev);
-	else if (HAS_PCH_SPLIT(dev))
-		ilk_wm_get_hw_state(dev);
-
 	if (force_restore) {
 		i915_redisable_vga(dev);
 
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
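
The diff above only moves the watermark readout so that it runs before
the later steps of intel_modeset_setup_hw_state(). Why the order could
matter can be sketched with a toy model in user-space C. This is an
assumption-laden illustration, not the i915 logic: all `toy_*` names
and fields are hypothetical, and the model simply assumes that a
later step (here a CRTC being disabled) recomputes watermarks from
software state that is only valid once the hardware state has been
read out, which is what the call trace in this thread suggests.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-ins; not the i915 structures. */
struct toy_hw {
	bool wm_enabled_in_hw;	/* what the hardware registers say */
};

struct toy_sw {
	bool wm_enable;		/* software copy, stale until readout */
	bool crtc_active;
	int warnings;		/* counts stand-ins for WARN_ON() hits */
};

/* Copy the hardware watermark state into the software state. */
static void toy_wm_get_hw_state(const struct toy_hw *hw, struct toy_sw *sw)
{
	assert(hw && sw);
	sw->wm_enable = hw->wm_enabled_in_hw;
}

/* Recompute watermarks; complains if the old state looks disabled. */
static void toy_update_wm(struct toy_sw *sw)
{
	if (!sw->wm_enable)
		sw->warnings++;	/* stand-in for WARN_ON(!r->enable) */
}

/* Disabling a CRTC triggers a watermark update, as in the call trace. */
static void toy_disable_crtc(struct toy_sw *sw)
{
	sw->crtc_active = false;
	toy_update_wm(sw);
}

/* Returns the number of warnings for the chosen ordering. */
int toy_setup_hw_state(bool readout_first)
{
	struct toy_hw hw = { .wm_enabled_in_hw = true };
	struct toy_sw sw = {
		.wm_enable = false,
		.crtc_active = true,
		.warnings = 0,
	};

	if (readout_first)
		toy_wm_get_hw_state(&hw, &sw);	/* order after the patch */

	toy_disable_crtc(&sw);

	if (!readout_first)
		toy_wm_get_hw_state(&hw, &sw);	/* order before the patch */

	return sw.warnings;
}
```

In this model, calling toy_setup_hw_state(false) counts one warning
and toy_setup_hw_state(true) counts none, mirroring the intent of
moving the readout earlier.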

Thread overview: 19+ messages
2015-07-30  0:49 i915 driver crashes on T540p if docking station attached Theodore Ts'o
2015-07-30  1:39 ` [REGRESSION] " Theodore Ts'o
2015-07-30  5:18   ` Linus Torvalds
2015-07-30 11:16     ` Dave Airlie
2015-07-30 14:40     ` Daniel Vetter
2015-07-30 15:32       ` Theodore Ts'o
2015-07-30 15:54         ` [Intel-gfx] " Daniel Vetter
2015-07-30 15:57         ` Takashi Iwai
2015-07-30 18:14           ` Linus Torvalds
2015-07-30 15:50       ` Theodore Ts'o
2015-07-30 15:59         ` Theodore Ts'o
2015-07-30 16:00         ` Daniel Vetter
2015-08-03 15:27         ` Daniel Vetter
2015-08-03 16:25           ` Theodore Ts'o
2015-08-03 17:24             ` Linus Torvalds
2015-08-03 18:49               ` Theodore Ts'o
2015-08-03 22:05               ` Daniel Vetter
2015-08-04  1:17                 ` Rafael J. Wysocki
2015-08-04 16:05             ` Daniel Vetter