* [RT] Lockdep warning on boot with 2.6.31-rc5-rt1.1
@ 2009-08-07 14:46 Clark Williams
  2009-08-07 15:09 ` Peter Zijlstra
  2009-08-08 12:00 ` Theodore Tso
  0 siblings, 2 replies; 16+ messages in thread
From: Clark Williams @ 2009-08-07 14:46 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: LKML, RT

Peter,

I'm getting this warning from lockdep when booting on my T60. 

The two addresses reported (0xffffffff812664a2 and 0xffffffff812664ae)
actually bracket one call to mutex_lock() in driver_attach() so I'm not
sure what the complaint is.

Clark

=============================================
[ INFO: possible recursive locking detected ]
2.6.31-rc5-rt1.1 #37
---------------------------------------------
swapper/1 is trying to acquire lock:
 (&dev->mutex){+.+...}, at: [<ffffffff812664ae>] __driver_attach+0x48/0x81

but task is already holding lock:
 (&dev->mutex){+.+...}, at: [<ffffffff812664a2>] __driver_attach+0x3c/0x81

other info that might help us debug this:
1 lock held by swapper/1:
 #0:  (&dev->mutex){+.+...}, at: [<ffffffff812664a2>] __driver_attach+0x3c/0x81

stack backtrace:
Pid: 1, comm: swapper Not tainted 2.6.31-rc5-rt1.1 #37
Call Trace:
 [<ffffffff81071e38>] __lock_acquire+0x14ae/0x153c
 [<ffffffff812664ae>] ? __driver_attach+0x48/0x81
 [<ffffffff81071fc3>] lock_acquire+0xfd/0x129
 [<ffffffff812664ae>] ? __driver_attach+0x48/0x81
 [<ffffffff8137b667>] _mutex_lock+0x31/0x40
 [<ffffffff812664ae>] ? __driver_attach+0x48/0x81
 [<ffffffff812664ae>] __driver_attach+0x48/0x81
 [<ffffffff81266466>] ? __driver_attach+0x0/0x81
 [<ffffffff81265b9c>] bus_for_each_dev+0x59/0x8e
 [<ffffffff81266245>] driver_attach+0x1e/0x20
 [<ffffffff81265465>] bus_add_driver+0x13f/0x288
 [<ffffffff812667b3>] driver_register+0x9d/0x10e
 [<ffffffff81207e26>] acpi_bus_register_driver+0x43/0x45
 [<ffffffff815f5fdf>] acpi_ec_init+0x37/0x55
 [<ffffffff815f5d47>] acpi_init+0x224/0x265
 [<ffffffff815f5b23>] ? acpi_init+0x0/0x265
 [<ffffffff81009080>] do_one_initcall+0x75/0x18a
 [<ffffffff8106dd0a>] ? put_lock_stats+0xe/0x27
 [<ffffffff8137b3d1>] ? rt_spin_unlock+0x23/0x6d
 [<ffffffff8106dcc8>] ? get_lock_stats+0x16/0x4a
 [<ffffffff8106dcc8>] ? get_lock_stats+0x16/0x4a
 [<ffffffff81145616>] ? proc_register+0x18c/0x1a2
 [<ffffffff8137e6fe>] ? sub_preempt_count+0x35/0x49
 [<ffffffff8106dd0a>] ? put_lock_stats+0xe/0x27
 [<ffffffff8106de1c>] ? lock_release_holdtime+0xf9/0xfe
 [<ffffffff8137b3d1>] ? rt_spin_unlock+0x23/0x6d
 [<ffffffff81145616>] ? proc_register+0x18c/0x1a2
 [<ffffffff8114574a>] ? create_proc_entry+0x79/0x91
 [<ffffffff8109c069>] ? register_irq_proc+0xb3/0xcf
 [<ffffffff81140000>] ? pde_users_dec+0x20/0x45
 [<ffffffff815d370a>] kernel_init+0x17e/0x28b
 [<ffffffff8100d15a>] child_rip+0xa/0x20
 [<ffffffff8100ca94>] ? restore_args+0x0/0x30
 [<ffffffff815d358c>] ? kernel_init+0x0/0x28b
 [<ffffffff8100d150>] ? child_rip+0x0/0x20
---------------------------
| preempt count: 00000000 ]
| 0-level deep critical section nesting:
----------------------------------------
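Editor's sketch: one way to read the report above. lockdep validates lock classes rather than individual lock instances, and locks initialized at the same code site normally share a class. Assuming the two reported addresses are two separate acquisitions (for example a parent's dev->mutex followed by the device's own dev->mutex), the second one looks recursive to the validator even though the instances differ. The userspace model below only illustrates that idea; `toy_lock`, `toy_task` and `toy_acquire` are hypothetical names, not kernel API.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_HELD 8

/* Hypothetical model: every lock instance carries a class id. */
struct toy_lock {
    int class_id;       /* shared by all locks set up at the same init site */
    const char *name;   /* label for humans only */
};

/* Per-task record of the locks currently held. */
struct toy_task {
    const struct toy_lock *held[TOY_MAX_HELD];
    size_t nheld;
};

/*
 * Returns 1 if the acquisition passes the toy check, 0 if it would be
 * reported as possible recursive locking (a lock of the same class is
 * already held). The lock is recorded as held either way, because the
 * report is a warning and does not block the acquisition.
 */
int toy_acquire(struct toy_task *t, const struct toy_lock *l)
{
    int ok = 1;
    size_t i;

    assert(t->nheld < TOY_MAX_HELD);
    for (i = 0; i < t->nheld; i++)
        if (t->held[i]->class_id == l->class_id)
            ok = 0;
    t->held[t->nheld++] = l;
    return ok;
}
```

With a parent and a child device whose mutexes share one class, the first `toy_acquire()` passes and the second is flagged, which matches the shape of the report: one lock held, a second lock of the same class requested.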

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [RT] Lockdep warning on boot with 2.6.31-rc5-rt1.1
  2009-08-07 14:46 [RT] Lockdep warning on boot with 2.6.31-rc5-rt1.1 Clark Williams
@ 2009-08-07 15:09 ` Peter Zijlstra
  2009-08-07 16:04   ` Alan Stern
                     ` (2 more replies)
  2009-08-08 12:00 ` Theodore Tso
  1 sibling, 3 replies; 16+ messages in thread
From: Peter Zijlstra @ 2009-08-07 15:09 UTC (permalink / raw)
  To: Clark Williams
  Cc: LKML, RT, Thomas Gleixner, Alan Stern, greg, Rafael J. Wysocki,
	Kay Sievers

On Fri, 2009-08-07 at 09:46 -0500, Clark Williams wrote:
> Peter,
> 
> I'm getting this warning from lockdep when booting on my T60. 
> 
> The two addresses reported (0xffffffff812664a2 and 0xffffffff812664ae)
> actually bracket one call to mutex_lock() in driver_attach() so I'm not
> sure what the complaint is.
> 
> Clark
> 
> =============================================
> [ INFO: possible recursive locking detected ]
> 2.6.31-rc5-rt1.1 #37
> ---------------------------------------------
> swapper/1 is trying to acquire lock:
>  (&dev->mutex){+.+...}, at: [<ffffffff812664ae>]
> __driver_attach+0x48/0x81
> 
> but task is already holding lock:
>  (&dev->mutex){+.+...}, at: [<ffffffff812664a2>]
> __driver_attach+0x3c/0x81
> 
> other info that might help us debug this:
> 1 lock held by swapper/1:
>  #0:  (&dev->mutex){+.+...}, at: [<ffffffff812664a2>]
> __driver_attach+0x3c/0x81

Oh, that's tglx who's gone wild with sem->mutex conversions.

It used to be that _all_ dev->sem instances were taken on suspend or
something like that, I think that got fixed a long while back.

I'd have to look at what the current locking requirements for dev->sem
are. 

I remember talking to Alan on several occasions about this, and I just
went over some of the old emails, but I must say the precise
requirements stay hidden from me. Also, I'm not sure these emails are
still representative of the current state.


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [RT] Lockdep warning on boot with 2.6.31-rc5-rt1.1
  2009-08-07 15:09 ` Peter Zijlstra
@ 2009-08-07 16:04   ` Alan Stern
  2009-08-07 16:15     ` Peter Zijlstra
  2009-08-08  9:06     ` Ming Lei
  2009-08-08  3:20     ` Dave Young
  2009-08-08  8:33     ` Ming Lei
  2 siblings, 2 replies; 16+ messages in thread
From: Alan Stern @ 2009-08-07 16:04 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Clark Williams, LKML, RT, Thomas Gleixner, greg,
	Rafael J. Wysocki, Kay Sievers

On Fri, 7 Aug 2009, Peter Zijlstra wrote:

> On Fri, 2009-08-07 at 09:46 -0500, Clark Williams wrote:
> > Peter,
> > 
> > I'm getting this warning from lockdep when booting on my T60. 
> > 
> > The two addresses reported (0xffffffff812664a2 and 0xffffffff812664ae)
> > actually bracket one call to mutex_lock() in driver_attach() so I'm not
> > sure what the complaint is.

> Oh, that's tglx who's gone wild with sem->mutex conversions.

Is this code available somewhere?

> It used to be that _all_ dev->sem instances were taken on suspend or
> something like that, I think that got fixed a long while back.
> 
> I'd have to look at what the current locking requirements for dev->sem
> are. 

It is supposed to be locked whenever the driver core invokes a probe, 
remove, or PM-related callback.  Under some circumstances, the parent's 
semaphore is supposed to be locked as well.  Individual subsystems may 
have their own requirements in addition to these.

The ordering requirement is: Don't try to acquire a device's lock if
you already hold the lock for a non-ancestor device.  More generally
(if more obscurely): If you already hold device A's lock, then don't
try to acquire the lock for device B unless you already hold the lock
for A & B's most recent common ancestor.
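Editor's sketch: the ordering rule above, written as a check for a single task. This is a userspace model under stated assumptions, not driver-core code: `toy_dev`, `toy_common_ancestor` and `toy_may_lock` are hypothetical names, the tree is a plain parent-pointer structure, and two devices with no common ancestor are treated as "not allowed" for a task that already holds one of them.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a node in the device tree. */
struct toy_dev {
    struct toy_dev *parent;   /* NULL for a root device */
};

/* Depth of a device: 0 for a root. */
static size_t toy_depth(const struct toy_dev *d)
{
    size_t n = 0;

    while (d->parent) {
        d = d->parent;
        n++;
    }
    return n;
}

/* Most recent common ancestor of a and b (may be a or b); NULL if none. */
const struct toy_dev *toy_common_ancestor(const struct toy_dev *a,
                                          const struct toy_dev *b)
{
    size_t da = toy_depth(a), db = toy_depth(b);

    while (da > db) { a = a->parent; da--; }
    while (db > da) { b = b->parent; db--; }
    while (a != b) {
        a = a->parent;
        b = b->parent;
        if (!a || !b)
            return NULL;   /* different trees */
    }
    return a;
}

static int toy_is_held(const struct toy_dev **held, size_t nheld,
                       const struct toy_dev *d)
{
    size_t i;

    for (i = 0; i < nheld; i++)
        if (held[i] == d)
            return 1;
    return 0;
}

/*
 * With the locks in held[0..nheld-1] already taken by this task, may it
 * also take b's lock? For every held device A, the most recent common
 * ancestor of A and b must itself be held.
 */
int toy_may_lock(const struct toy_dev **held, size_t nheld,
                 const struct toy_dev *b)
{
    size_t i;

    for (i = 0; i < nheld; i++) {
        const struct toy_dev *anc = toy_common_ancestor(held[i], b);

        if (!anc || !toy_is_held(held, nheld, anc))
            return 0;
    }
    return 1;
}
```

Holding a hub and then locking one of its children passes, because the common ancestor is the hub itself. Holding one child and then locking its sibling fails unless the hub is also held, and locking the hub while holding only a child fails as well, which is the "non-ancestor" form of the rule.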

> I remember talking to Alan on several occasions about this, and I just
> went over some of the old emails, but I must say the precise
> requirements stay hidden from me. Also, I'm not sure these emails are
> still representative of the current state.

I think they are, pretty much.  The real problem, of course, is that 
lockdep doesn't understand tree-structured lock orderings.  Hence it 
isn't practical to convert dev->sem into a mutex.

Alan Stern


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [RT] Lockdep warning on boot with 2.6.31-rc5-rt1.1
  2009-08-07 16:04   ` Alan Stern
@ 2009-08-07 16:15     ` Peter Zijlstra
  2009-08-07 16:45       ` Alan Stern
  2009-08-08  9:06     ` Ming Lei
  1 sibling, 1 reply; 16+ messages in thread
From: Peter Zijlstra @ 2009-08-07 16:15 UTC (permalink / raw)
  To: Alan Stern
  Cc: Clark Williams, LKML, RT, Thomas Gleixner, greg,
	Rafael J. Wysocki, Kay Sievers

On Fri, 2009-08-07 at 12:04 -0400, Alan Stern wrote:
> On Fri, 7 Aug 2009, Peter Zijlstra wrote:
> 
> > On Fri, 2009-08-07 at 09:46 -0500, Clark Williams wrote:
> > > Peter,
> > > 
> > > I'm getting this warning from lockdep when booting on my T60. 
> > > 
> > > The two addresses reported (0xffffffff812664a2 and 0xffffffff812664ae)
> > > actually bracket one call to mutex_lock() in driver_attach() so I'm not
> > > sure what the complaint is.
> 
> > Oh, that's tglx who's gone wild with sem->mutex conversions.
> 
> Is this code available somewhere?

It's in the -rt tree, but this patch was posted to lkml at:
  http://lkml.org/lkml/2009/7/26/36

The -rt tree can be found in various places, but while tglx is out
celebrating his holidays the latest can be found through:
  http://lkml.org/lkml/2009/8/5/406

> > It used to be that _all_ dev->sem instances were taken on suspend or
> > something like that, I think that got fixed a long while back.
> > 
> > I'd have to look at what the current locking requirements for dev->sem
> > are. 
> 
> It is supposed to be locked whenever the driver core invokes a probe, 
> remove, or PM-related callback.  Under some circumstances, the parent's 
> semaphore is supposed to be locked as well.  Individual subsystems may 
> have their own requirements in addition to these.
> 
> The ordering requirement is: Don't try to acquire a device's lock if
> you already hold the lock for a non-ancestor device.  More generally
> (if more obscurely): If you already hold device A's lock, then don't
> try to acquire the lock for device B unless you already hold the lock
> for A & B's most recent common ancestor.
> 
> > I remember talking to Alan on several occasions about this, and I just
> > went over some of the old emails, but I must say the precise
> > requirements stay hidden from me. Also, I'm not sure these emails are
> > still representative of the current state.
> 
> I think they are, pretty much.  The real problem, of course, is that 
> lockdep doesn't understand tree-structured lock orderings.  Hence it 
> isn't practical to convert dev->sem into a mutex.

Right, well it would if we'd make every instance a class, but since
classes should reside in static storage this is far from trivial.

If we'd be able to find a mapping such that we can use a limited number
of these classes to represent the needed structure then we're good.

I think I proposed adding a class to each driver or something, but then
you countered that a single driver could register itself at conflicting
places in the device tree.

Still, it might be worth trying that to see where we end up, and
possibly fixing up a few drivers to be more intelligent.

/me ponders

Nested busses would be interesting though, suppose we assign a class to
a USB bus driver, and we chain USB hubs, you'd get a nesting of similar
classes and that'd upset lockdep again :/


The other proposal was to create a fixed list of classes and register
each device in the class corresponding to its depth in the tree. I can't
remember what was wrong with that, but I seem to have been persuaded
that it was hard too.



^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [RT] Lockdep warning on boot with 2.6.31-rc5-rt1.1
  2009-08-07 16:15     ` Peter Zijlstra
@ 2009-08-07 16:45       ` Alan Stern
  2009-08-07 16:49         ` Peter Zijlstra
  0 siblings, 1 reply; 16+ messages in thread
From: Alan Stern @ 2009-08-07 16:45 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Clark Williams, LKML, RT, Thomas Gleixner, greg,
	Rafael J. Wysocki, Kay Sievers

On Fri, 7 Aug 2009, Peter Zijlstra wrote:

> > I think they are, pretty much.  The real problem, of course, is that 
> > lockdep doesn't understand tree-structured lock orderings.  Hence it 
> > isn't practical to convert dev->sem into a mutex.
> 
> Right, well it would if we'd make every instance a class, but since
> classes should reside in static storage this is far from trivial.
> 
> If we'd be able to find a mapping such that we can use a limited number
> of these classes to represent the needed structure then we're good.
> 
> I think I proposed adding a class to each driver or something, but then
> you countered that a single driver could register itself at conflicting
> places in the device tree.

Also there can be devices in the tree that don't have any driver.  And 
there can be devices that have the same driver as their children.

> Still it might be worth to try that and see where we'll end up and
> possibly fix up a few drivers to be more intelligent.
> 
> /me ponders
> 
> Nested busses would be interesting though, suppose we assign a class to
> a USB bus driver, and we chain USB hubs, you'd get a nesting of similar
> classes and that'd upset lockdep again :/

Yep.

> The other proposal was creating a fixed list of classes and register
> each device at a class corresponding to its depth in the tree. I can't
> remember what was wrong with that, but I seem to have been persuaded
> that that was hard too.

It probably would work for the most part.  However a possible scenario
involves first locking a parent and then locking all its children.  (I
don't know if this ever happens anywhere, but it might.)  This can't
cause a deadlock but it would run into trouble with depth-based
classes.
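Editor's sketch: why the parent-then-all-children scenario trips over depth-based classes. In the model below the class of a device is simply its depth in the tree, and a validator that objects to two held locks of the same class flags the second child, even though a task holding the parent cannot deadlock on the siblings under the tree ordering rule. `toy_dev`, `toy_depth_class` and `toy_first_collision` are hypothetical names for illustration only.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a node in the device tree. */
struct toy_dev {
    struct toy_dev *parent;   /* NULL for a root device */
};

/* Hypothetical class assignment: the class is the device's depth. */
size_t toy_depth_class(const struct toy_dev *d)
{
    size_t n = 0;

    while (d->parent) {
        d = d->parent;
        n++;
    }
    return n;
}

/*
 * Index of the first held device whose class equals the class of next,
 * or -1 if there is no collision.
 */
int toy_first_collision(const struct toy_dev **held, size_t nheld,
                        const struct toy_dev *next)
{
    size_t i;
    size_t cls = toy_depth_class(next);

    for (i = 0; i < nheld; i++)
        if (toy_depth_class(held[i]) == cls)
            return (int)i;
    return -1;
}
```

Locking the parent and then the first child produces no collision, since their depths differ. The second child has the same depth as the first, so the model reports a collision at the first child's slot.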

Alan Stern


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [RT] Lockdep warning on boot with 2.6.31-rc5-rt1.1
  2009-08-07 16:45       ` Alan Stern
@ 2009-08-07 16:49         ` Peter Zijlstra
  2009-08-07 21:30           ` Alan Stern
  0 siblings, 1 reply; 16+ messages in thread
From: Peter Zijlstra @ 2009-08-07 16:49 UTC (permalink / raw)
  To: Alan Stern
  Cc: Clark Williams, LKML, RT, Thomas Gleixner, greg,
	Rafael J. Wysocki, Kay Sievers

On Fri, 2009-08-07 at 12:45 -0400, Alan Stern wrote:
> On Fri, 7 Aug 2009, Peter Zijlstra wrote:

> > The other proposal was creating a fixed list of classes and register
> > each device at a class corresponding to its depth in the tree. I can't
> > remember what was wrong with that, but I seem to have been persuaded
> > that that was hard too.
> 
> It probably would work for the most part.  However a possible scenario
> involves first locking a parent and then locking all its children.  (I
> don't know if this ever happens anywhere, but it might.)  This can't
> cause a deadlock but it would run into trouble with depth-based
> classes.

If you know which parent is locked, we can solve that with
mutex_lock_nest_lock() [ doesn't currently exist, but is analogous to
spin_lock_nest_lock() ] and together with
http://lkml.org/lkml/2009/7/23/222 that would allow you to lock up to
2048 children.

Would something like that work?
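Editor's sketch: the idea behind a nest_lock style annotation, as a userspace model. The caller names an outer lock it already holds; the validator then accepts several locks of one class as long as each is taken under that same outer lock, and folds them into a single entry with a reference count. The names `toy_lock`, `toy_task` and `toy_acquire_nest` are hypothetical, and `TOY_MAX_NESTED` merely reuses the 2048 figure from the message above as an assumed counter limit.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_HELD   16
#define TOY_MAX_NESTED 2048   /* figure quoted in the thread; assumed limit */

struct toy_lock {
    int class_id;
};

/* One entry per held class; same-class locks under a nest lock are counted. */
struct toy_held {
    const struct toy_lock *lock;        /* first instance taken */
    const struct toy_lock *nest_lock;   /* outer lock named by the caller, or NULL */
    unsigned int references;            /* instances folded into this entry */
};

struct toy_task {
    struct toy_held held[TOY_MAX_HELD];
    size_t nheld;
};

/* Is this exact instance recorded as the first lock of some held entry? */
static int toy_holds(const struct toy_task *t, const struct toy_lock *l)
{
    size_t i;

    for (i = 0; i < t->nheld; i++)
        if (t->held[i].lock == l)
            return 1;
    return 0;
}

/* Returns 1 if the acquisition is accepted, 0 if it would be flagged. */
int toy_acquire_nest(struct toy_task *t, const struct toy_lock *l,
                     const struct toy_lock *nest_lock)
{
    size_t i;

    for (i = 0; i < t->nheld; i++) {
        struct toy_held *h = &t->held[i];

        if (h->lock->class_id != l->class_id)
            continue;
        /* Same class already held: accepted only under the same held outer lock. */
        if (!nest_lock || h->nest_lock != nest_lock || !toy_holds(t, nest_lock))
            return 0;
        if (h->references >= TOY_MAX_NESTED)
            return 0;
        h->references++;
        return 1;
    }
    assert(t->nheld < TOY_MAX_HELD);
    t->held[t->nheld].lock = l;
    t->held[t->nheld].nest_lock = nest_lock;
    t->held[t->nheld].references = 1;
    t->nheld++;
    return 1;
}
```

In the parent-and-children case, the parent would be taken with no nest lock, and each child with the parent named as the outer lock. A child taken without naming the parent is still flagged, which is why the caller has to know which parent is locked.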


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [RT] Lockdep warning on boot with 2.6.31-rc5-rt1.1
  2009-08-07 16:49         ` Peter Zijlstra
@ 2009-08-07 21:30           ` Alan Stern
  0 siblings, 0 replies; 16+ messages in thread
From: Alan Stern @ 2009-08-07 21:30 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Clark Williams, LKML, RT, Thomas Gleixner, greg,
	Rafael J. Wysocki, Kay Sievers

On Fri, 7 Aug 2009, Peter Zijlstra wrote:

> On Fri, 2009-08-07 at 12:45 -0400, Alan Stern wrote:
> > On Fri, 7 Aug 2009, Peter Zijlstra wrote:
> 
> > > The other proposal was creating a fixed list of classes and register
> > > each device at a class corresponding to its depth in the tree. I can't
> > > remember what was wrong with that, but I seem to have been persuaded
> > > that that was hard too.
> > 
> > It probably would work for the most part.  However a possible scenario
> > involves first locking a parent and then locking all its children.  (I
> > don't know if this ever happens anywhere, but it might.)  This can't
> > cause a deadlock but it would run into trouble with depth-based
> > classes.
> 
> If you know which parent is locked, we can solve that with
> mutex_lock_nest_lock() [ doesn't currently exist, but is analogous to
> spin_lock_nest_lock() ] and together with
> http://lkml.org/lkml/2009/7/23/222 that would allow you to lock up to
> 2048 children.

Not only do I not know which parent is locked, I don't even know if
this ever happens anywhere at all!  My point was purely theoretical.

> Would something like that work?

Perhaps -- I don't understand what spin_lock_nest_lock() is supposed to
do.

Alan Stern


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [RT] Lockdep warning on boot with 2.6.31-rc5-rt1.1
  2009-08-07 15:09 ` Peter Zijlstra
@ 2009-08-08  3:20     ` Dave Young
  2009-08-08  3:20     ` Dave Young
  2009-08-08  8:33     ` Ming Lei
  2 siblings, 0 replies; 16+ messages in thread
From: Dave Young @ 2009-08-08  3:20 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Clark Williams, LKML, RT, Thomas Gleixner, Alan Stern, greg,
	Rafael J. Wysocki, Kay Sievers

On Fri, Aug 7, 2009 at 11:09 PM, Peter Zijlstra<peterz@infradead.org> wrote:
> On Fri, 2009-08-07 at 09:46 -0500, Clark Williams wrote:
>> Peter,
>>
>> I'm getting this warning from lockdep when booting on my T60.
>>
>> The two addresses reported (0xffffffff812664a2 and 0xffffffff812664ae)
>> actually bracket one call to mutex_lock() in driver_attach() so I'm not
>> sure what the complaint is.
>>
>> Clark
>>
>> =============================================
>> [ INFO: possible recursive locking detected ]
>> 2.6.31-rc5-rt1.1 #37
>> ---------------------------------------------
>> swapper/1 is trying to acquire lock:
>>  (&dev->mutex){+.+...}, at: [<ffffffff812664ae>]
>> __driver_attach+0x48/0x81
>>
>> but task is already holding lock:
>>  (&dev->mutex){+.+...}, at: [<ffffffff812664a2>]
>> __driver_attach+0x3c/0x81
>>
>> other info that might help us debug this:
>> 1 lock held by swapper/1:
>>  #0:  (&dev->mutex){+.+...}, at: [<ffffffff812664a2>]
>> __driver_attach+0x3c/0x81
>
> Oh, that's tglx who's gone wild with sem->mutex conversions.
>
> It used to be that _all_ dev->sem instances were taken on suspend or
> something like that, I think that got fixed a long while back.
>
> I'd have to look at what the current locking requirements for dev->sem
> are.
>
> I remember talking to Alan on several occasions about this, and I just
> went over some of the old emails, but I must say the precise
> requirements stay hidden from me. Also, I'm not sure these emails are
> still representative of the current state.

I think you mean this thread:

http://lkml.org/lkml/2008/4/17/305




-- 
Regards
dave

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [RT] Lockdep warning on boot with 2.6.31-rc5-rt1.1
  2009-08-07 15:09 ` Peter Zijlstra
@ 2009-08-08  8:33     ` Ming Lei
  2009-08-08  3:20     ` Dave Young
  2009-08-08  8:33     ` Ming Lei
  2 siblings, 0 replies; 16+ messages in thread
From: Ming Lei @ 2009-08-08  8:33 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Clark Williams, LKML, RT, Thomas Gleixner, Alan Stern, greg,
	Rafael J. Wysocki, Kay Sievers

2009/8/7 Peter Zijlstra <peterz@infradead.org>:
> On Fri, 2009-08-07 at 09:46 -0500, Clark Williams wrote:
>> Peter,
>>
>> I'm getting this warning from lockdep when booting on my T60.
>>
>> The two addresses reported (0xffffffff812664a2 and 0xffffffff812664ae)
>> actually bracket one call to mutex_lock() in driver_attach() so I'm not
>> sure what the complaint is.
>>
>> Clark
>>
>> =============================================
>> [ INFO: possible recursive locking detected ]
>> 2.6.31-rc5-rt1.1 #37
>> ---------------------------------------------
>> swapper/1 is trying to acquire lock:
>>  (&dev->mutex){+.+...}, at: [<ffffffff812664ae>]
>> __driver_attach+0x48/0x81
>>
>> but task is already holding lock:
>>  (&dev->mutex){+.+...}, at: [<ffffffff812664a2>]
>> __driver_attach+0x3c/0x81
>>
>> other info that might help us debug this:
>> 1 lock held by swapper/1:
>>  #0:  (&dev->mutex){+.+...}, at: [<ffffffff812664a2>]
>> __driver_attach+0x3c/0x81
>
> Oh, that's tglx who's gone wild with sem->mutex conversions.

Maybe we can introduce some mutex interfaces which bypass lockdep validation
temporarily to allow driver core to convert to mutex from sema if lockdep
can't validate tree-structured lock orderings.
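Editor's sketch: what a validation bypass could look like in a toy validator. A per-lock opt-out flag makes the same-class check skip that lock entirely. This is only a model of the proposal, with hypothetical names (`toy_lock`, `novalidate`, `toy_check`); it does not describe an existing mutex interface. The obvious cost is that genuine ordering mistakes on opted-out locks would also go unreported.

```c
#include <assert.h>
#include <stddef.h>

struct toy_lock {
    int class_id;
    int novalidate;   /* hypothetical opt-out: nonzero skips validation */
};

/*
 * Returns 1 if acquiring next would be accepted given the held locks,
 * 0 if it would be flagged as a same-class (recursive) acquisition.
 */
int toy_check(const struct toy_lock **held, size_t nheld,
              const struct toy_lock *next)
{
    size_t i;

    if (next->novalidate)
        return 1;
    for (i = 0; i < nheld; i++)
        if (!held[i]->novalidate && held[i]->class_id == next->class_id)
            return 0;
    return 1;
}
```

With the flag clear, a parent and child of the same class collide as before; with the flag set on the device locks, the pair is accepted while other classes are still checked normally.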

Thanks.

--
Lei Ming

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [RT] Lockdep warning on boot with 2.6.31-rc5-rt1.1
  2009-08-07 16:04   ` Alan Stern
  2009-08-07 16:15     ` Peter Zijlstra
@ 2009-08-08  9:06     ` Ming Lei
  2009-08-08 15:19       ` Alan Stern
  1 sibling, 1 reply; 16+ messages in thread
From: Ming Lei @ 2009-08-08  9:06 UTC (permalink / raw)
  To: Alan Stern
  Cc: Peter Zijlstra, Clark Williams, LKML, RT, Thomas Gleixner, greg,
	Rafael J. Wysocki, Kay Sievers

2009/8/8 Alan Stern <stern@rowland.harvard.edu>:
> On Fri, 7 Aug 2009, Peter Zijlstra wrote:

>> It used to be that _all_ dev->sem instances were taken on suspend or
>> something like that, I think that got fixed a long while back.
>>
>> I'd have to look at what the current locking requirements for dev->sem
>> are.
>
> It is supposed to be locked whenever the driver core invokes a probe,
> remove, or PM-related callback.  Under some circumstances, the parent's
> semaphore is supposed to be locked as well.  Individual subsystems may
> have their own requirements in addition to these.
>
> The ordering requirement is: Don't try to acquire a device's lock if
> you already hold the lock for a non-ancestor device.  More generally
> (if more obscurely): If you already hold device A's lock, then don't
> try to acquire the lock for device B unless you already hold the lock
> for A & B's most recent common ancestor.
>

It seems that the following case is very common, and A and B have no
common ancestor, but we can hold device A and B's lock at the same
time, can't we?

Thanks.

device A  comes in one bus:
	device_add()
         ->bus_attach_device()
            ->device_attach():drivers/base/dd.c /*holding device A's lock*/
               ->...drv->probe()		/*sleep here some time*/

then device B comes in another bus:
	device_add()
         ->bus_attach_device()
            ->device_attach():drivers/base/dd.c /*holding device B's lock*/
               ->...drv->probe()		/*sleep here some time*/

-- 
Lei Ming

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [RT] Lockdep warning on boot with 2.6.31-rc5-rt1.1
  2009-08-07 14:46 [RT] Lockdep warning on boot with 2.6.31-rc5-rt1.1 Clark Williams
  2009-08-07 15:09 ` Peter Zijlstra
@ 2009-08-08 12:00 ` Theodore Tso
  2009-08-08 14:07     ` Dave Young
  1 sibling, 1 reply; 16+ messages in thread
From: Theodore Tso @ 2009-08-08 12:00 UTC (permalink / raw)
  To: Clark Williams; +Cc: Peter Zijlstra, LKML, RT

On Fri, Aug 07, 2009 at 09:46:08AM -0500, Clark Williams wrote:
> Peter,
> 
> I'm getting this warning from lockdep when booting on my T60. 
> 
> The two addresses reported (0xffffffff812664a2 and 0xffffffff812664ae)
> actually bracket one call to mutex_lock() in driver_attach() so I'm not
> sure what the complaint is.

I'm getting a different lockdep warning when booting on my T400 using
2.6.31-rc5; not sure if it's related or not....

In any case, it's screwing up the ability for lockdep to find any
other problems.

[    0.297775] INFO: trying to register non-static key.
[    0.297775] the code is fine but needs lockdep annotation.
[    0.297775] turning off the locking correctness validator.
[    0.297775] Pid: 1, comm: swapper Not tainted 2.6.31-rc5-00256-gf124845 #4
[    0.297775] Call Trace:
[    0.297775]  [<c0511a2f>] ? printk+0x14/0x1d
[    0.297775]  [<c016de1f>] register_lock_class+0x5a/0x2a1
[    0.297775]  [<c016e932>] ? mark_lock+0x1e/0x1e4
[    0.297775]  [<c016f673>] __lock_acquire+0x9c/0xb1e
[    0.297775]  [<c01be910>] ? mod_zone_page_state+0x9f/0xaf
[    0.297775]  [<c016ed98>] ? trace_hardirqs_on_caller+0x103/0x124
[    0.297775]  [<c016e932>] ? mark_lock+0x1e/0x1e4
[    0.297775]  [<c016eb3b>] ? mark_held_locks+0x43/0x5b
[    0.297775]  [<c01d490f>] ? kmem_cache_alloc+0xaf/0x127
[    0.297775]  [<c016ed98>] ? trace_hardirqs_on_caller+0x103/0x124
[    0.297775]  [<c0170189>] lock_acquire+0x94/0xb7
[    0.297775]  [<c045d6bd>] ? alloc_netdev_mq+0x105/0x1cc
[    0.297775]  [<c0513f8d>] _spin_lock_bh+0x28/0x58
[    0.297775]  [<c045d6bd>] ? alloc_netdev_mq+0x105/0x1cc
[    0.297775]  [<c045d6bd>] alloc_netdev_mq+0x105/0x1cc
[    0.297775]  [<c03f7bbf>] ? loopback_setup+0x0/0x79
[    0.297775]  [<c03f7c6f>] loopback_net_init+0x25/0x68
[    0.297782]  [<c0457317>] register_pernet_operations+0x2f/0xa1
[    0.297832]  [<c0512fcd>] ? mutex_lock_nested+0x33/0x3b
[    0.297891]  [<c0457435>] register_pernet_device+0x24/0x4c
[    0.297951]  [<c0796a09>] net_dev_init+0x101/0x150
[    0.298010]  [<c0796908>] ? net_dev_init+0x0/0x150
[    0.298069]  [<c010115c>] do_one_initcall+0x6a/0x177
[    0.298127]  [<c016e932>] ? mark_lock+0x1e/0x1e4
[    0.298185]  [<c016e932>] ? mark_lock+0x1e/0x1e4
[    0.298244]  [<c01b371a>] ? get_page_from_freelist+0x28f/0x3be
[    0.298304]  [<c016ed98>] ? trace_hardirqs_on_caller+0x103/0x124
[    0.298364]  [<c016edc4>] ? trace_hardirqs_on+0xb/0xd
[    0.298423]  [<c016e932>] ? mark_lock+0x1e/0x1e4
[    0.298482]  [<c016e932>] ? mark_lock+0x1e/0x1e4
[    0.298540]  [<c016edc4>] ? trace_hardirqs_on+0xb/0xd
[    0.298600]  [<c03104ad>] ? ida_get_new_above+0x157/0x171
[    0.298660]  [<c0213468>] ? proc_register+0x14b/0x15c
[    0.298719]  [<c011e5f6>] ? sched_clock+0x8/0xb
[    0.298777]  [<c016d9b1>] ? lock_release_holdtime+0x30/0x131
[    0.298837]  [<c0213468>] ? proc_register+0x14b/0x15c
[    0.298896]  [<c0513dfb>] ? _spin_unlock+0x22/0x25
[    0.298954]  [<c0213468>] ? proc_register+0x14b/0x15c
[    0.299013]  [<c021359b>] ? create_proc_entry+0x80/0x96
[    0.299073]  [<c0191064>] ? register_irq_proc+0x91/0xad
[    0.299132]  [<c01910d8>] ? init_irq_proc+0x58/0x65
[    0.299191]  [<c0768301>] kernel_init+0x131/0x182
[    0.299249]  [<c07681d0>] ? kernel_init+0x0/0x182

							- Ted

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [RT] Lockdep warning on boot with 2.6.31-rc5-rt1.1
  2009-08-08 12:00 ` Theodore Tso
@ 2009-08-08 14:07     ` Dave Young
  0 siblings, 0 replies; 16+ messages in thread
From: Dave Young @ 2009-08-08 14:07 UTC (permalink / raw)
  To: Theodore Tso, Clark Williams, Peter Zijlstra, LKML, RT

On Sat, Aug 8, 2009 at 8:00 PM, Theodore Tso<tytso@mit.edu> wrote:
> On Fri, Aug 07, 2009 at 09:46:08AM -0500, Clark Williams wrote:
>> Peter,
>>
>> I'm getting this warning from lockdep when booting on my T60.
>>
>> The two addresses reported (0xffffffff812664a2 and 0xffffffff812664ae)
>> actually bracket one call to mutex_lock() in driver_attach() so I'm not
>> sure what the complaint is.
>
> I'm getting a different lockdep warning when booting on my T400 using
> 2.6.31-rc5; not sure if it's related or not....
>
> In any case, it's screwing up the ability for lockdep to find any
> other problems.
>
> [    0.297775] INFO: trying to register non-static key.
> [    0.297775] the code is fine but needs lockdep annotation.
> [    0.297775] turning off the locking correctness validator.
> [    0.297775] Pid: 1, comm: swapper Not tainted 2.6.31-rc5-00256-gf124845 #4
> [    0.297775] Call Trace:
> [    0.297775]  [<c0511a2f>] ? printk+0x14/0x1d
> [    0.297775]  [<c016de1f>] register_lock_class+0x5a/0x2a1
> [    0.297775]  [<c016e932>] ? mark_lock+0x1e/0x1e4
> [    0.297775]  [<c016f673>] __lock_acquire+0x9c/0xb1e
> [    0.297775]  [<c01be910>] ? mod_zone_page_state+0x9f/0xaf
> [    0.297775]  [<c016ed98>] ? trace_hardirqs_on_caller+0x103/0x124
> [    0.297775]  [<c016e932>] ? mark_lock+0x1e/0x1e4
> [    0.297775]  [<c016eb3b>] ? mark_held_locks+0x43/0x5b
> [    0.297775]  [<c01d490f>] ? kmem_cache_alloc+0xaf/0x127
> [    0.297775]  [<c016ed98>] ? trace_hardirqs_on_caller+0x103/0x124
> [    0.297775]  [<c0170189>] lock_acquire+0x94/0xb7
> [    0.297775]  [<c045d6bd>] ? alloc_netdev_mq+0x105/0x1cc
> [    0.297775]  [<c0513f8d>] _spin_lock_bh+0x28/0x58
> [    0.297775]  [<c045d6bd>] ? alloc_netdev_mq+0x105/0x1cc
> [    0.297775]  [<c045d6bd>] alloc_netdev_mq+0x105/0x1cc
> [    0.297775]  [<c03f7bbf>] ? loopback_setup+0x0/0x79
> [    0.297775]  [<c03f7c6f>] loopback_net_init+0x25/0x68
> [    0.297782]  [<c0457317>] register_pernet_operations+0x2f/0xa1
> [    0.297832]  [<c0512fcd>] ? mutex_lock_nested+0x33/0x3b
> [    0.297891]  [<c0457435>] register_pernet_device+0x24/0x4c
> [    0.297951]  [<c0796a09>] net_dev_init+0x101/0x150
> [    0.298010]  [<c0796908>] ? net_dev_init+0x0/0x150
> [    0.298069]  [<c010115c>] do_one_initcall+0x6a/0x177
> [    0.298127]  [<c016e932>] ? mark_lock+0x1e/0x1e4
> [    0.298185]  [<c016e932>] ? mark_lock+0x1e/0x1e4
> [    0.298244]  [<c01b371a>] ? get_page_from_freelist+0x28f/0x3be
> [    0.298304]  [<c016ed98>] ? trace_hardirqs_on_caller+0x103/0x124
> [    0.298364]  [<c016edc4>] ? trace_hardirqs_on+0xb/0xd
> [    0.298423]  [<c016e932>] ? mark_lock+0x1e/0x1e4
> [    0.298482]  [<c016e932>] ? mark_lock+0x1e/0x1e4
> [    0.298540]  [<c016edc4>] ? trace_hardirqs_on+0xb/0xd
> [    0.298600]  [<c03104ad>] ? ida_get_new_above+0x157/0x171
> [    0.298660]  [<c0213468>] ? proc_register+0x14b/0x15c
> [    0.298719]  [<c011e5f6>] ? sched_clock+0x8/0xb
> [    0.298777]  [<c016d9b1>] ? lock_release_holdtime+0x30/0x131
> [    0.298837]  [<c0213468>] ? proc_register+0x14b/0x15c
> [    0.298896]  [<c0513dfb>] ? _spin_unlock+0x22/0x25
> [    0.298954]  [<c0213468>] ? proc_register+0x14b/0x15c
> [    0.299013]  [<c021359b>] ? create_proc_entry+0x80/0x96
> [    0.299073]  [<c0191064>] ? register_irq_proc+0x91/0xad
> [    0.299132]  [<c01910d8>] ? init_irq_proc+0x58/0x65
> [    0.299191]  [<c0768301>] kernel_init+0x131/0x182
> [    0.299249]  [<c07681d0>] ? kernel_init+0x0/0x182
>

It's a different problem; for this issue, please see:

http://lkml.org/lkml/2009/8/5/49
http://lkml.org/lkml/2009/8/5/51

-- 
Regards
dave

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [RT] Lockdep warning on boot with 2.6.31-rc5-rt1.1
  2009-08-08  9:06     ` Ming Lei
@ 2009-08-08 15:19       ` Alan Stern
  0 siblings, 0 replies; 16+ messages in thread
From: Alan Stern @ 2009-08-08 15:19 UTC (permalink / raw)
  To: Ming Lei
  Cc: Peter Zijlstra, Clark Williams, LKML, RT, Thomas Gleixner, greg,
	Rafael J. Wysocki, Kay Sievers

On Sat, 8 Aug 2009, Ming Lei wrote:

> > The ordering requirement is: Don't try to acquire a device's lock if
> > you already hold the lock for a non-ancestor device.  More generally
> > (if more obscurely): If you already hold device A's lock, then don't
> > try to acquire the lock for device B unless you already hold the lock
> > for A & B's most recent common ancestor.
> >
> 
> It seems that the following case is very common, and A and B have no
> common ancestor, but we can hold device A and B's lock at the same
> time, can't we?
> 
> Thanks.
> 
> device A  comes in one bus:
> 	device_add()
>          ->bus_attach_device()
>             ->device_attach():drivers/base/dd.c /*holding device A's lock*/
>                ->...drv->probe()		/*sleep here some time*/

So right now thread 1 is sleeping.

> then device B comes in another bus:

So all this must happen in a different thread, thread 2:

> 	device_add()
>          ->bus_attach_device()
>             ->device_attach():drivers/base/dd.c /*holding device B's lock*/
>                ->...drv->probe()		/*sleep here some time*/

At this point, thread 1 holds A's lock and thread 2 holds B's lock.  
Neither thread holds both locks.

Alan Stern


^ permalink raw reply	[flat|nested] 16+ messages in thread

Thread overview: 16+ messages
2009-08-07 14:46 [RT] Lockdep warning on boot with 2.6.31-rc5-rt1.1 Clark Williams
2009-08-07 15:09 ` Peter Zijlstra
2009-08-07 16:04   ` Alan Stern
2009-08-07 16:15     ` Peter Zijlstra
2009-08-07 16:45       ` Alan Stern
2009-08-07 16:49         ` Peter Zijlstra
2009-08-07 21:30           ` Alan Stern
2009-08-08  9:06     ` Ming Lei
2009-08-08 15:19       ` Alan Stern
2009-08-08  3:20   ` Dave Young
2009-08-08  3:20     ` Dave Young
2009-08-08  8:33   ` Ming Lei
2009-08-08  8:33     ` Ming Lei
2009-08-08 12:00 ` Theodore Tso
2009-08-08 14:07   ` Dave Young
2009-08-08 14:07     ` Dave Young
