linux-kernel.vger.kernel.org archive mirror
* 2.6.24-rc6: possible recursive locking detected
@ 2008-01-03 22:58 Christian Kujau
  2008-01-03 23:06 ` Rafael J. Wysocki
  0 siblings, 1 reply; 17+ messages in thread
From: Christian Kujau @ 2008-01-03 22:58 UTC (permalink / raw)
  To: linux-kernel; +Cc: jfs-discussion

hi,

a few minutes after upgrading from -rc5 to -rc6 I got:

[ 1310.670986] =============================================
[ 1310.671690] [ INFO: possible recursive locking detected ]
[ 1310.672097] 2.6.24-rc6 #1
[ 1310.672421] ---------------------------------------------
[ 1310.672828] FahCore_a0.exe/3692 is trying to acquire lock:
[ 1310.673238]  (&q->lock){++..}, at: [<c011544b>] __wake_up+0x1b/0x50
[ 1310.673869]
[ 1310.673870] but task is already holding lock:
[ 1310.674567]  (&q->lock){++..}, at: [<c011544b>] __wake_up+0x1b/0x50
[ 1310.675267]
[ 1310.675268] other info that might help us debug this:
[ 1310.675952] 5 locks held by FahCore_a0.exe/3692:
[ 1310.676334]  #0:  (rcu_read_lock){..--}, at: [<c038b620>] net_rx_action+0x60/0x1b0
[ 1310.677251]  #1:  (rcu_read_lock){..--}, at: [<c0388d60>] netif_receive_skb+0x100/0x470
[ 1310.677924]  #2:  (rcu_read_lock){..--}, at: [<c03a7fb2>] ip_local_deliver_finish+0x32/0x210
[ 1310.678460]  #3:  (clock-AF_INET){-.-?}, at: [<c038164e>] sock_def_readable+0x1e/0x80
[ 1310.679250]  #4:  (&q->lock){++..}, at: [<c011544b>] __wake_up+0x1b/0x50
[ 1310.680151]
[ 1310.680152] stack backtrace:
[ 1310.680772] Pid: 3692, comm: FahCore_a0.exe Not tainted 2.6.24-rc6 #1
[ 1310.681209]  [<c01038aa>] show_trace_log_lvl+0x1a/0x30
[ 1310.681659]  [<c0104322>] show_trace+0x12/0x20
[ 1310.682085]  [<c0104cba>] dump_stack+0x6a/0x70
[ 1310.682512]  [<c0138ec1>] __lock_acquire+0x971/0x10c0
[ 1310.682961]  [<c013966e>] lock_acquire+0x5e/0x80
[ 1310.683392]  [<c0419b78>] _spin_lock_irqsave+0x38/0x50
[ 1310.683914]  [<c011544b>] __wake_up+0x1b/0x50
[ 1310.684337]  [<c018e2ba>] ep_poll_safewake+0x9a/0xc0
[ 1310.684822]  [<c018f11b>] ep_poll_callback+0x8b/0xe0
[ 1310.685265]  [<c0114418>] __wake_up_common+0x48/0x70
[ 1310.685712]  [<c0115467>] __wake_up+0x37/0x50
[ 1310.686136]  [<c03816aa>] sock_def_readable+0x7a/0x80
[ 1310.686579]  [<c0381c2b>] sock_queue_rcv_skb+0xeb/0x150
[ 1310.687028]  [<c03c7d99>] udp_queue_rcv_skb+0x139/0x2a0
[ 1310.687554]  [<c03c81f1>] __udp4_lib_rcv+0x2f1/0x7e0
[ 1310.687996]  [<c03c86f2>] udp_rcv+0x12/0x20
[ 1310.688415]  [<c03a80a5>] ip_local_deliver_finish+0x125/0x210
[ 1310.688881]  [<c03a84ed>] ip_local_deliver+0x2d/0x90
[ 1310.689323]  [<c03a7d6b>] ip_rcv_finish+0xeb/0x300
[ 1310.689760]  [<c03a8425>] ip_rcv+0x195/0x230
[ 1310.690182]  [<c0388fdc>] netif_receive_skb+0x37c/0x470
[ 1310.690632]  [<c038ba39>] process_backlog+0x69/0xc0
[ 1310.691175]  [<c038b6f7>] net_rx_action+0x137/0x1b0
[ 1310.691681]  [<c011e5c2>] __do_softirq+0x52/0xb0
[ 1310.692006]  [<c0104e94>] do_softirq+0x94/0xe0
[ 1310.692301]  =======================


This is a single CPU machine, and the box was quite busy due to disk I/O 
(load 6-8). The machine continues to run and all is well now. Even the 
application mentioned above (FahCore_a0.exe) is running fine 
("Folding@Home", cpu bound). The binary is located on an jfs filesystem, 
which was also under heavy I/O. Can someone tell me why the backtrace 
shows so much net* stuff? There was not much net I/O...

more details and .config: http://nerdbynature.de/bits/2.6.24-rc6

Thanks,
Christian.
-- 
BOFH excuse #312:

incompatible bit-registration operators


* Re: 2.6.24-rc6: possible recursive locking detected
  2008-01-03 22:58 2.6.24-rc6: possible recursive locking detected Christian Kujau
@ 2008-01-03 23:06 ` Rafael J. Wysocki
  2008-01-04  8:30   ` Ingo Molnar
  0 siblings, 1 reply; 17+ messages in thread
From: Rafael J. Wysocki @ 2008-01-03 23:06 UTC (permalink / raw)
  To: Christian Kujau; +Cc: linux-kernel, jfs-discussion, Ingo Molnar, Peter Zijlstra

[Added some CCs]

On Thursday, 3 of January 2008, Christian Kujau wrote:
> hi,
> 
> a few minutes after upgrading from -rc5 to -rc6 I got:
> 
> [ 1310.670986] =============================================
> [ 1310.671690] [ INFO: possible recursive locking detected ]
> [ 1310.672097] 2.6.24-rc6 #1
> [ 1310.672421] ---------------------------------------------
> [ 1310.672828] FahCore_a0.exe/3692 is trying to acquire lock:
> [ 1310.673238]  (&q->lock){++..}, at: [<c011544b>] __wake_up+0x1b/0x50
> [ 1310.673869]
> [ 1310.673870] but task is already holding lock:
> [ 1310.674567]  (&q->lock){++..}, at: [<c011544b>] __wake_up+0x1b/0x50
> [ 1310.675267]
> [ 1310.675268] other info that might help us debug this:
> [ 1310.675952] 5 locks held by FahCore_a0.exe/3692:
> [ 1310.676334]  #0:  (rcu_read_lock){..--}, at: [<c038b620>] net_rx_action+0x60/0x1b0
> [ 1310.677251]  #1:  (rcu_read_lock){..--}, at: [<c0388d60>] netif_receive_skb+0x100/0x470
> [ 1310.677924]  #2:  (rcu_read_lock){..--}, at: [<c03a7fb2>] ip_local_deliver_finish+0x32/0x210
> [ 1310.678460]  #3:  (clock-AF_INET){-.-?}, at: [<c038164e>] sock_def_readable+0x1e/0x80
> [ 1310.679250]  #4:  (&q->lock){++..}, at: [<c011544b>] __wake_up+0x1b/0x50
> [ 1310.680151]
> [ 1310.680152] stack backtrace:
> [ 1310.680772] Pid: 3692, comm: FahCore_a0.exe Not tainted 2.6.24-rc6 #1
> [ 1310.681209]  [<c01038aa>] show_trace_log_lvl+0x1a/0x30
> [ 1310.681659]  [<c0104322>] show_trace+0x12/0x20
> [ 1310.682085]  [<c0104cba>] dump_stack+0x6a/0x70
> [ 1310.682512]  [<c0138ec1>] __lock_acquire+0x971/0x10c0
> [ 1310.682961]  [<c013966e>] lock_acquire+0x5e/0x80
> [ 1310.683392]  [<c0419b78>] _spin_lock_irqsave+0x38/0x50
> [ 1310.683914]  [<c011544b>] __wake_up+0x1b/0x50
> [ 1310.684337]  [<c018e2ba>] ep_poll_safewake+0x9a/0xc0
> [ 1310.684822]  [<c018f11b>] ep_poll_callback+0x8b/0xe0
> [ 1310.685265]  [<c0114418>] __wake_up_common+0x48/0x70
> [ 1310.685712]  [<c0115467>] __wake_up+0x37/0x50
> [ 1310.686136]  [<c03816aa>] sock_def_readable+0x7a/0x80
> [ 1310.686579]  [<c0381c2b>] sock_queue_rcv_skb+0xeb/0x150
> [ 1310.687028]  [<c03c7d99>] udp_queue_rcv_skb+0x139/0x2a0
> [ 1310.687554]  [<c03c81f1>] __udp4_lib_rcv+0x2f1/0x7e0
> [ 1310.687996]  [<c03c86f2>] udp_rcv+0x12/0x20
> [ 1310.688415]  [<c03a80a5>] ip_local_deliver_finish+0x125/0x210
> [ 1310.688881]  [<c03a84ed>] ip_local_deliver+0x2d/0x90
> [ 1310.689323]  [<c03a7d6b>] ip_rcv_finish+0xeb/0x300
> [ 1310.689760]  [<c03a8425>] ip_rcv+0x195/0x230
> [ 1310.690182]  [<c0388fdc>] netif_receive_skb+0x37c/0x470
> [ 1310.690632]  [<c038ba39>] process_backlog+0x69/0xc0
> [ 1310.691175]  [<c038b6f7>] net_rx_action+0x137/0x1b0
> [ 1310.691681]  [<c011e5c2>] __do_softirq+0x52/0xb0
> [ 1310.692006]  [<c0104e94>] do_softirq+0x94/0xe0
> [ 1310.692301]  =======================
> 
> 
> This is a single CPU machine, and the box was quite busy due to disk I/O 
> (load 6-8). The machine continues to run and all is well now. Even the 
> application mentioned above (FahCore_a0.exe) is running fine 
> ("Folding@Home", cpu bound). The binary is located on an jfs filesystem, 
> which was also under heavy I/O. Can someone tell me why the backtrace 
> shows so much net* stuff? There was not much net I/O...
> 
> more details and .config: http://nerdbynature.de/bits/2.6.24-rc6
> 
> Thanks,
> Christian.


* Re: 2.6.24-rc6: possible recursive locking detected
  2008-01-03 23:06 ` Rafael J. Wysocki
@ 2008-01-04  8:30   ` Ingo Molnar
  2008-01-05  7:12     ` Herbert Xu
  0 siblings, 1 reply; 17+ messages in thread
From: Ingo Molnar @ 2008-01-04  8:30 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Christian Kujau, linux-kernel, jfs-discussion, Peter Zijlstra,
	Davide Libenzi, Herbert Xu



[ Added even more CCs :-) Seems eventpoll & net related. ]

* Rafael J. Wysocki <rjw@sisk.pl> wrote:

> [Added some CCs]
> 
> On Thursday, 3 of January 2008, Christian Kujau wrote:
> > hi,
> > 
> > a few minutes after upgrading from -rc5 to -rc6 I got:
> > 
> > [ 1310.670986] =============================================
> > [ 1310.671690] [ INFO: possible recursive locking detected ]
> > [ 1310.672097] 2.6.24-rc6 #1
> > [ 1310.672421] ---------------------------------------------
> > [ 1310.672828] FahCore_a0.exe/3692 is trying to acquire lock:
> > [ 1310.673238]  (&q->lock){++..}, at: [<c011544b>] __wake_up+0x1b/0x50
> > [ 1310.673869]
> > [ 1310.673870] but task is already holding lock:
> > [ 1310.674567]  (&q->lock){++..}, at: [<c011544b>] __wake_up+0x1b/0x50
> > [ 1310.675267]
> > [ 1310.675268] other info that might help us debug this:
> > [ 1310.675952] 5 locks held by FahCore_a0.exe/3692:
> > [ 1310.676334]  #0:  (rcu_read_lock){..--}, at: [<c038b620>] net_rx_action+0x60/0x1b0
> > [ 1310.677251]  #1:  (rcu_read_lock){..--}, at: [<c0388d60>] netif_receive_skb+0x100/0x470
> > [ 1310.677924]  #2:  (rcu_read_lock){..--}, at: [<c03a7fb2>] ip_local_deliver_finish+0x32/0x210
> > [ 1310.678460]  #3:  (clock-AF_INET){-.-?}, at: [<c038164e>] sock_def_readable+0x1e/0x80
> > [ 1310.679250]  #4:  (&q->lock){++..}, at: [<c011544b>] __wake_up+0x1b/0x50
> > [ 1310.680151]
> > [ 1310.680152] stack backtrace:
> > [ 1310.680772] Pid: 3692, comm: FahCore_a0.exe Not tainted 2.6.24-rc6 #1
> > [ 1310.681209]  [<c01038aa>] show_trace_log_lvl+0x1a/0x30
> > [ 1310.681659]  [<c0104322>] show_trace+0x12/0x20
> > [ 1310.682085]  [<c0104cba>] dump_stack+0x6a/0x70
> > [ 1310.682512]  [<c0138ec1>] __lock_acquire+0x971/0x10c0
> > [ 1310.682961]  [<c013966e>] lock_acquire+0x5e/0x80
> > [ 1310.683392]  [<c0419b78>] _spin_lock_irqsave+0x38/0x50
> > [ 1310.683914]  [<c011544b>] __wake_up+0x1b/0x50
> > [ 1310.684337]  [<c018e2ba>] ep_poll_safewake+0x9a/0xc0
> > [ 1310.684822]  [<c018f11b>] ep_poll_callback+0x8b/0xe0
> > [ 1310.685265]  [<c0114418>] __wake_up_common+0x48/0x70
> > [ 1310.685712]  [<c0115467>] __wake_up+0x37/0x50
> > [ 1310.686136]  [<c03816aa>] sock_def_readable+0x7a/0x80
> > [ 1310.686579]  [<c0381c2b>] sock_queue_rcv_skb+0xeb/0x150
> > [ 1310.687028]  [<c03c7d99>] udp_queue_rcv_skb+0x139/0x2a0
> > [ 1310.687554]  [<c03c81f1>] __udp4_lib_rcv+0x2f1/0x7e0
> > [ 1310.687996]  [<c03c86f2>] udp_rcv+0x12/0x20
> > [ 1310.688415]  [<c03a80a5>] ip_local_deliver_finish+0x125/0x210
> > [ 1310.688881]  [<c03a84ed>] ip_local_deliver+0x2d/0x90
> > [ 1310.689323]  [<c03a7d6b>] ip_rcv_finish+0xeb/0x300
> > [ 1310.689760]  [<c03a8425>] ip_rcv+0x195/0x230
> > [ 1310.690182]  [<c0388fdc>] netif_receive_skb+0x37c/0x470
> > [ 1310.690632]  [<c038ba39>] process_backlog+0x69/0xc0
> > [ 1310.691175]  [<c038b6f7>] net_rx_action+0x137/0x1b0
> > [ 1310.691681]  [<c011e5c2>] __do_softirq+0x52/0xb0
> > [ 1310.692006]  [<c0104e94>] do_softirq+0x94/0xe0
> > [ 1310.692301]  =======================
> > 
> > 
> > This is a single CPU machine, and the box was quite busy due to disk I/O 
> > (load 6-8). The machine continues to run and all is well now. Even the 
> > application mentioned above (FahCore_a0.exe) is running fine 
> > ("Folding@Home", cpu bound). The binary is located on an jfs filesystem, 
> > which was also under heavy I/O. Can someone tell me why the backtrace 
> > shows so much net* stuff? There was not much net I/O...
> > 
> > more details and .config: http://nerdbynature.de/bits/2.6.24-rc6
> > 
> > Thanks,
> > Christian.


* Re: 2.6.24-rc6: possible recursive locking detected
  2008-01-04  8:30   ` Ingo Molnar
@ 2008-01-05  7:12     ` Herbert Xu
  2008-01-05 16:53       ` Peter Zijlstra
  0 siblings, 1 reply; 17+ messages in thread
From: Herbert Xu @ 2008-01-05  7:12 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Rafael J. Wysocki, Christian Kujau, linux-kernel, jfs-discussion,
	Peter Zijlstra, Davide Libenzi

On Fri, Jan 04, 2008 at 09:30:49AM +0100, Ingo Molnar wrote:
>
> > > [ 1310.670986] =============================================
> > > [ 1310.671690] [ INFO: possible recursive locking detected ]
> > > [ 1310.672097] 2.6.24-rc6 #1
> > > [ 1310.672421] ---------------------------------------------
> > > [ 1310.672828] FahCore_a0.exe/3692 is trying to acquire lock:
> > > [ 1310.673238]  (&q->lock){++..}, at: [<c011544b>] __wake_up+0x1b/0x50
> > > [ 1310.673869]
> > > [ 1310.673870] but task is already holding lock:
> > > [ 1310.674567]  (&q->lock){++..}, at: [<c011544b>] __wake_up+0x1b/0x50
> > > [ 1310.675267]
> > > [ 1310.675268] other info that might help us debug this:
> > > [ 1310.675952] 5 locks held by FahCore_a0.exe/3692:
> > > [ 1310.676334]  #0:  (rcu_read_lock){..--}, at: [<c038b620>] net_rx_action+0x60/0x1b0
> > > [ 1310.677251]  #1:  (rcu_read_lock){..--}, at: [<c0388d60>] netif_receive_skb+0x100/0x470
> > > [ 1310.677924]  #2:  (rcu_read_lock){..--}, at: [<c03a7fb2>] ip_local_deliver_finish+0x32/0x210
> > > [ 1310.678460]  #3:  (clock-AF_INET){-.-?}, at: [<c038164e>] sock_def_readable+0x1e/0x80
> > > [ 1310.679250]  #4:  (&q->lock){++..}, at: [<c011544b>] __wake_up+0x1b/0x50

The net part might just be a red herring, since the problem is that
__wake_up is somehow reentering itself.
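
Reading the backtrace bottom-up, the nesting seems to be roughly:

	sock_def_readable()
	  __wake_up(sk->sk_sleep)             <- takes the first &q->lock
	    __wake_up_common()
	      ep_poll_callback()              <- epoll callback on that queue
	        ep_poll_safewake()
	          __wake_up(epoll wait queue) <- second &q->lock, same class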

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


* Re: 2.6.24-rc6: possible recursive locking detected
  2008-01-05  7:12     ` Herbert Xu
@ 2008-01-05 16:53       ` Peter Zijlstra
  2008-01-05 17:01         ` Peter Zijlstra
  0 siblings, 1 reply; 17+ messages in thread
From: Peter Zijlstra @ 2008-01-05 16:53 UTC (permalink / raw)
  To: Herbert Xu
  Cc: Ingo Molnar, Rafael J. Wysocki, Christian Kujau, linux-kernel,
	jfs-discussion, Davide Libenzi, Johannes Berg, Oleg Nesterov


On Sat, 2008-01-05 at 18:12 +1100, Herbert Xu wrote:
> On Fri, Jan 04, 2008 at 09:30:49AM +0100, Ingo Molnar wrote:
> >
> > > > [ 1310.670986] =============================================
> > > > [ 1310.671690] [ INFO: possible recursive locking detected ]
> > > > [ 1310.672097] 2.6.24-rc6 #1
> > > > [ 1310.672421] ---------------------------------------------
> > > > [ 1310.672828] FahCore_a0.exe/3692 is trying to acquire lock:
> > > > [ 1310.673238]  (&q->lock){++..}, at: [<c011544b>] __wake_up+0x1b/0x50
> > > > [ 1310.673869]
> > > > [ 1310.673870] but task is already holding lock:
> > > > [ 1310.674567]  (&q->lock){++..}, at: [<c011544b>] __wake_up+0x1b/0x50
> > > > [ 1310.675267]
> > > > [ 1310.675268] other info that might help us debug this:
> > > > [ 1310.675952] 5 locks held by FahCore_a0.exe/3692:
> > > > [ 1310.676334]  #0:  (rcu_read_lock){..--}, at: [<c038b620>] net_rx_action+0x60/0x1b0
> > > > [ 1310.677251]  #1:  (rcu_read_lock){..--}, at: [<c0388d60>] netif_receive_skb+0x100/0x470
> > > > [ 1310.677924]  #2:  (rcu_read_lock){..--}, at: [<c03a7fb2>] ip_local_deliver_finish+0x32/0x210
> > > > [ 1310.678460]  #3:  (clock-AF_INET){-.-?}, at: [<c038164e>] sock_def_readable+0x1e/0x80
> > > > [ 1310.679250]  #4:  (&q->lock){++..}, at: [<c011544b>] __wake_up+0x1b/0x50
> 
> The net part might just be a red herring, since the problem is that
> __wake_up is somehow reentering itself.

/*
 * Perform a safe wake up of the poll wait list. The problem is that
 * with the new callback'd wake up system, it is possible that the
 * poll callback is reentered from inside the call to wake_up() done
 * on the poll wait queue head. The rule is that we cannot reenter the
 * wake up code from the same task more than EP_MAX_POLLWAKE_NESTS times,
 * and we cannot reenter the same wait queue head at all. This will
 * enable to have a hierarchy of epoll file descriptor of no more than
 * EP_MAX_POLLWAKE_NESTS deep. We need the irq version of the spin lock
 * because this one gets called by the poll callback, that in turn is called
 * from inside a wake_up(), that might be called from irq context.
 */

Seems to suggest that the epoll code can indeed recurse into wakeup.

Davide, Johannes, any ideas?



* Re: 2.6.24-rc6: possible recursive locking detected
  2008-01-05 16:53       ` Peter Zijlstra
@ 2008-01-05 17:01         ` Peter Zijlstra
  2008-01-05 21:35           ` Davide Libenzi
  2008-01-07 17:22           ` Oleg Nesterov
  0 siblings, 2 replies; 17+ messages in thread
From: Peter Zijlstra @ 2008-01-05 17:01 UTC (permalink / raw)
  To: Herbert Xu
  Cc: Ingo Molnar, Rafael J. Wysocki, Christian Kujau, linux-kernel,
	jfs-discussion, Davide Libenzi, Johannes Berg, Oleg Nesterov


On Sat, 2008-01-05 at 17:53 +0100, Peter Zijlstra wrote:
> On Sat, 2008-01-05 at 18:12 +1100, Herbert Xu wrote:
> > On Fri, Jan 04, 2008 at 09:30:49AM +0100, Ingo Molnar wrote:
> > >
> > > > > [ 1310.670986] =============================================
> > > > > [ 1310.671690] [ INFO: possible recursive locking detected ]
> > > > > [ 1310.672097] 2.6.24-rc6 #1
> > > > > [ 1310.672421] ---------------------------------------------
> > > > > [ 1310.672828] FahCore_a0.exe/3692 is trying to acquire lock:
> > > > > [ 1310.673238]  (&q->lock){++..}, at: [<c011544b>] __wake_up+0x1b/0x50
> > > > > [ 1310.673869]
> > > > > [ 1310.673870] but task is already holding lock:
> > > > > [ 1310.674567]  (&q->lock){++..}, at: [<c011544b>] __wake_up+0x1b/0x50
> > > > > [ 1310.675267]
> > > > > [ 1310.675268] other info that might help us debug this:
> > > > > [ 1310.675952] 5 locks held by FahCore_a0.exe/3692:
> > > > > [ 1310.676334]  #0:  (rcu_read_lock){..--}, at: [<c038b620>] net_rx_action+0x60/0x1b0
> > > > > [ 1310.677251]  #1:  (rcu_read_lock){..--}, at: [<c0388d60>] netif_receive_skb+0x100/0x470
> > > > > [ 1310.677924]  #2:  (rcu_read_lock){..--}, at: [<c03a7fb2>] ip_local_deliver_finish+0x32/0x210
> > > > > [ 1310.678460]  #3:  (clock-AF_INET){-.-?}, at: [<c038164e>] sock_def_readable+0x1e/0x80
> > > > > [ 1310.679250]  #4:  (&q->lock){++..}, at: [<c011544b>] __wake_up+0x1b/0x50
> > 
> > The net part might just be a red herring, since the problem is that
> > __wake_up is somehow reentering itself.
> 
> /*
>  * Perform a safe wake up of the poll wait list. The problem is that
>  * with the new callback'd wake up system, it is possible that the
>  * poll callback is reentered from inside the call to wake_up() done
>  * on the poll wait queue head. The rule is that we cannot reenter the
>  * wake up code from the same task more than EP_MAX_POLLWAKE_NESTS times,
>  * and we cannot reenter the same wait queue head at all. This will
>  * enable to have a hierarchy of epoll file descriptor of no more than
>  * EP_MAX_POLLWAKE_NESTS deep. We need the irq version of the spin lock
>  * because this one gets called by the poll callback, that in turn is called
>  * from inside a wake_up(), that might be called from irq context.
>  */
> 
> Seems to suggest that the epoll code can indeed recurse into wakeup.
> 
> Davide, Johannes, any ideas?

Since EP_MAX_POLLWAKE_NESTS < MAX_LOCKDEP_SUBCLASSES we could perhaps do
something like:

  wake_up_nested(..., wake_nests);

although I'm not quite sure that is correct, my understanding of this
code is still fragile at best.
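
Just to sketch the idea (untested, and both the name and the exact shape of
the helper are only a guess), such a wrapper could take the waitqueue lock
with an explicit lockdep subclass:

	#ifdef CONFIG_DEBUG_LOCK_ALLOC
	/*
	 * Like wake_up(), but tell lockdep at which nesting level the
	 * waitqueue lock is taken, so that nested (but distinct) waitqueue
	 * locks are not reported as recursion on the same lock class.
	 */
	#define wake_up_nested(x, s)					\
	do {								\
		unsigned long flags;					\
									\
		spin_lock_irqsave_nested(&(x)->lock, flags, (s));	\
		wake_up_locked(x);					\
		spin_unlock_irqrestore(&(x)->lock, flags);		\
	} while (0)
	#else
	#define wake_up_nested(x, s)	wake_up(x)
	#endif

ep_poll_safewake() would then call something like
wake_up_nested(wq, 1 + wake_nests) instead of the plain wake_up(wq).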



* Re: 2.6.24-rc6: possible recursive locking detected
  2008-01-05 17:01         ` Peter Zijlstra
@ 2008-01-05 21:35           ` Davide Libenzi
  2008-01-06  0:20             ` Christian Kujau
  2008-01-06 21:44             ` Cyrill Gorcunov
  2008-01-07 17:22           ` Oleg Nesterov
  1 sibling, 2 replies; 17+ messages in thread
From: Davide Libenzi @ 2008-01-05 21:35 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Herbert Xu, Ingo Molnar, Rafael J. Wysocki, Christian Kujau,
	Linux Kernel Mailing List, jfs-discussion, Johannes Berg,
	Oleg Nesterov

On Sat, 5 Jan 2008, Peter Zijlstra wrote:

> 
> On Sat, 2008-01-05 at 17:53 +0100, Peter Zijlstra wrote:
> > On Sat, 2008-01-05 at 18:12 +1100, Herbert Xu wrote:
> > > On Fri, Jan 04, 2008 at 09:30:49AM +0100, Ingo Molnar wrote:
> > > >
> > > > > > [ 1310.670986] =============================================
> > > > > > [ 1310.671690] [ INFO: possible recursive locking detected ]
> > > > > > [ 1310.672097] 2.6.24-rc6 #1
> > > > > > [ 1310.672421] ---------------------------------------------
> > > > > > [ 1310.672828] FahCore_a0.exe/3692 is trying to acquire lock:
> > > > > > [ 1310.673238]  (&q->lock){++..}, at: [<c011544b>] __wake_up+0x1b/0x50
> > > > > > [ 1310.673869]
> > > > > > [ 1310.673870] but task is already holding lock:
> > > > > > [ 1310.674567]  (&q->lock){++..}, at: [<c011544b>] __wake_up+0x1b/0x50
> > > > > > [ 1310.675267]
> > > > > > [ 1310.675268] other info that might help us debug this:
> > > > > > [ 1310.675952] 5 locks held by FahCore_a0.exe/3692:
> > > > > > [ 1310.676334]  #0:  (rcu_read_lock){..--}, at: [<c038b620>] net_rx_action+0x60/0x1b0
> > > > > > [ 1310.677251]  #1:  (rcu_read_lock){..--}, at: [<c0388d60>] netif_receive_skb+0x100/0x470
> > > > > > [ 1310.677924]  #2:  (rcu_read_lock){..--}, at: [<c03a7fb2>] ip_local_deliver_finish+0x32/0x210
> > > > > > [ 1310.678460]  #3:  (clock-AF_INET){-.-?}, at: [<c038164e>] sock_def_readable+0x1e/0x80
> > > > > > [ 1310.679250]  #4:  (&q->lock){++..}, at: [<c011544b>] __wake_up+0x1b/0x50
> > > 
> > > The net part might just be a red herring, since the problem is that
> > > __wake_up is somehow reentering itself.
> > 
> > /*
> >  * Perform a safe wake up of the poll wait list. The problem is that
> >  * with the new callback'd wake up system, it is possible that the
> >  * poll callback is reentered from inside the call to wake_up() done
> >  * on the poll wait queue head. The rule is that we cannot reenter the
> >  * wake up code from the same task more than EP_MAX_POLLWAKE_NESTS times,
> >  * and we cannot reenter the same wait queue head at all. This will
> >  * enable to have a hierarchy of epoll file descriptor of no more than
> >  * EP_MAX_POLLWAKE_NESTS deep. We need the irq version of the spin lock
> >  * because this one gets called by the poll callback, that in turn is called
> >  * from inside a wake_up(), that might be called from irq context.
> >  */
> > 
> > Seems to suggest that the epoll code can indeed recurse into wakeup.
> > 
> > Davide, Johannes, any ideas?
> 
> Since EP_MAX_POLLWAKE_NESTS < MAX_LOCKDEP_SUBCLASSES we could perhaps do
> something like:
> 
>   wake_up_nested(..., wake_nests);
> 
> although I'm not quite sure that is correct, my understanding of this
> code is still fragile at best.

I remember I talked with Arjan about this some time ago. Basically, since 1) 
you can drop an epoll fd inside another epoll fd and 2) callback-based wakeups 
are used, you can see a wake_up() from inside another wake_up(), but they 
will never refer to the same lock instance.
Think about:

	dfd = socket(...);
	efd1 = epoll_create();
	efd2 = epoll_create();
	epoll_ctl(efd1, EPOLL_CTL_ADD, dfd, ...);
	epoll_ctl(efd2, EPOLL_CTL_ADD, efd1, ...);

When a packet arrives at the device underneath "dfd", the net code will 
issue a wake_up() on its poll wake list. Epoll (efd1) has installed a 
callback wakeup entry on that queue, and the wake_up() performed by the 
"dfd" net code will end up in ep_poll_callback(). At this point epoll 
(efd1) notices that it may have some event ready, so it needs to wake up 
the waiters on its poll wait list (efd2). So it calls ep_poll_safewake() 
that ends up in another wake_up(), after having checked the recursion 
constraints. Those are: nest no more than EP_MAX_POLLWAKE_NESTS deep, to 
avoid stack blasting, and never hit the same queue, to avoid loops like:

	epoll_ctl(efd2, EPOLL_CTL_ADD, efd1, ...);
	epoll_ctl(efd3, EPOLL_CTL_ADD, efd2, ...);
	epoll_ctl(efd4, EPOLL_CTL_ADD, efd3, ...);
	epoll_ctl(efd1, EPOLL_CTL_ADD, efd4, ...);

The code "if (tncur->wq == wq || ..." prevents re-entering the same 
queue/lock.
I don't know how the lockdep code works, so I can't comment on 
wake_up_nested(), although I have a feeling it is not enough in this case.
A solution may be to move the call to ep_poll_safewake() (that'd become a 
simple wake_up()) inside a tasklet or whatever is today trendy for delayed 
work. But this kinda scares me to be honest, since epoll already has a 
bunch of places where it could be asynchronously hit (plus performance 
regressions would need to be verified).
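
FWIW, the legal two-level nesting above can be built from user space with
something like this (untested sketch; binding/sending and error handling
omitted):

	#include <sys/epoll.h>
	#include <sys/socket.h>
	#include <netinet/in.h>

	int main(void)
	{
		struct epoll_event ev = { .events = EPOLLIN };
		int dfd  = socket(AF_INET, SOCK_DGRAM, 0);	/* "dfd" above */
		int efd1 = epoll_create(10);
		int efd2 = epoll_create(10);

		ev.data.fd = dfd;
		epoll_ctl(efd1, EPOLL_CTL_ADD, dfd, &ev);	/* dfd inside efd1 */
		ev.data.fd = efd1;
		epoll_ctl(efd2, EPOLL_CTL_ADD, efd1, &ev);	/* efd1 inside efd2 */

		/*
		 * A packet arriving on dfd now wakes efd1's poll callback,
		 * which in turn wakes the waiters on efd2: a wake_up()
		 * issued from inside another wake_up().
		 */
		for (;;)
			epoll_wait(efd2, &ev, 1, -1);
	}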



- Davide




* Re: 2.6.24-rc6: possible recursive locking detected
  2008-01-05 21:35           ` Davide Libenzi
@ 2008-01-06  0:20             ` Christian Kujau
  2008-01-07 21:35               ` Davide Libenzi
  2008-01-06 21:44             ` Cyrill Gorcunov
  1 sibling, 1 reply; 17+ messages in thread
From: Christian Kujau @ 2008-01-06  0:20 UTC (permalink / raw)
  To: Davide Libenzi
  Cc: Peter Zijlstra, Herbert Xu, Ingo Molnar, Rafael J. Wysocki,
	Linux Kernel Mailing List, jfs-discussion, Johannes Berg,
	Oleg Nesterov

On Sat, 5 Jan 2008, Davide Libenzi wrote:
> A solution may be to move the call to ep_poll_safewake() (that'd become a
> simple wake_up()) inside a tasklet or whatever is today trendy for delayed
> work. But this kinda scares me to be honest, since epoll already has a
> bunch of places where it could be asynchronously hit (plus performance
> regressions would need to be verified).

Although I'm not able to reproduce this one right now, I'm happy to test 
any patches you guys come up with.

Thanks for your time,
Christian.
-- 
BOFH excuse #78:

Yes, yes, its called a design limitation


* Re: 2.6.24-rc6: possible recursive locking detected
  2008-01-05 21:35           ` Davide Libenzi
  2008-01-06  0:20             ` Christian Kujau
@ 2008-01-06 21:44             ` Cyrill Gorcunov
  2008-01-06 21:53               ` Cyrill Gorcunov
  1 sibling, 1 reply; 17+ messages in thread
From: Cyrill Gorcunov @ 2008-01-06 21:44 UTC (permalink / raw)
  To: Davide Libenzi
  Cc: Peter Zijlstra, Herbert Xu, Ingo Molnar, Rafael J. Wysocki,
	Christian Kujau, Linux Kernel Mailing List, jfs-discussion,
	Johannes Berg, Oleg Nesterov

[Davide Libenzi - Sat, Jan 05, 2008 at 01:35:25PM -0800]
| On Sat, 5 Jan 2008, Peter Zijlstra wrote:
| 
[...snip...] 
| I remember I talked with Arjan about this some time ago. Basically, since 1) 
| you can drop an epoll fd inside another epoll fd and 2) callback-based wakeups 
| are used, you can see a wake_up() from inside another wake_up(), but they 
| will never refer to the same lock instance.
| Think about:
| 
| 	dfd = socket(...);
| 	efd1 = epoll_create();
| 	efd2 = epoll_create();
| 	epoll_ctl(efd1, EPOLL_CTL_ADD, dfd, ...);
| 	epoll_ctl(efd2, EPOLL_CTL_ADD, efd1, ...);
| 
| When a packet arrives at the device underneath "dfd", the net code will 
| issue a wake_up() on its poll wake list. Epoll (efd1) has installed a 
| callback wakeup entry on that queue, and the wake_up() performed by the 
| "dfd" net code will end up in ep_poll_callback(). At this point epoll 
| (efd1) notices that it may have some event ready, so it needs to wake up 
| the waiters on its poll wait list (efd2). So it calls ep_poll_safewake() 
| that ends up in another wake_up(), after having checked the recursion 
| constraints. Those are: nest no more than EP_MAX_POLLWAKE_NESTS deep, to 
| avoid stack blasting, and never hit the same queue, to avoid loops like:
| 
| 	epoll_ctl(efd2, EPOLL_CTL_ADD, efd1, ...);
| 	epoll_ctl(efd3, EPOLL_CTL_ADD, efd2, ...);
| 	epoll_ctl(efd4, EPOLL_CTL_ADD, efd3, ...);
| 	epoll_ctl(efd1, EPOLL_CTL_ADD, efd4, ...);
| 
| The code "if (tncur->wq == wq || ..." prevents re-entering the same 
| queue/lock.
| I don't know how the lockdep code works, so I can't comment on 
| wake_up_nested(), although I have a feeling it is not enough in this case.
| A solution may be to move the call to ep_poll_safewake() (that'd become a 
| simple wake_up()) inside a tasklet or whatever is today trendy for delayed 
| work. But this kinda scares me to be honest, since epoll already has a 
| bunch of places where it could be asynchronously hit (plus performance 
| regressions would need to be verified).
| 
| 
| 
| - Davide
| 
| 

it's quite possible that I'm wrong, but just out of interest...
why is the assignment in ep_poll_safewake()

	struct list_head *lsthead = &psw->wake_task_list;

not protected by a spinlock?

		- Cyrill -


* Re: 2.6.24-rc6: possible recursive locking detected
  2008-01-06 21:44             ` Cyrill Gorcunov
@ 2008-01-06 21:53               ` Cyrill Gorcunov
  0 siblings, 0 replies; 17+ messages in thread
From: Cyrill Gorcunov @ 2008-01-06 21:53 UTC (permalink / raw)
  To: Davide Libenzi, Peter Zijlstra, Herbert Xu, Ingo Molnar,
	Rafael J. Wysocki, Christian Kujau, Linux Kernel Mailing List,
	jfs-discussion, Johannes Berg, Oleg Nesterov

[Cyrill Gorcunov - Mon, Jan 07, 2008 at 12:44:42AM +0300]
| [Davide Libenzi - Sat, Jan 05, 2008 at 01:35:25PM -0800]
| | On Sat, 5 Jan 2008, Peter Zijlstra wrote:
| | 
| [...snip...] 
| | I remember I talked with Arjan about this some time ago. Basically, since 1) 
| | you can drop an epoll fd inside another epoll fd and 2) callback-based wakeups 
| | are used, you can see a wake_up() from inside another wake_up(), but they 
| | will never refer to the same lock instance.
| | Think about:
| | 
| | 	dfd = socket(...);
| | 	efd1 = epoll_create();
| | 	efd2 = epoll_create();
| | 	epoll_ctl(efd1, EPOLL_CTL_ADD, dfd, ...);
| | 	epoll_ctl(efd2, EPOLL_CTL_ADD, efd1, ...);
| | 
| | When a packet arrives at the device underneath "dfd", the net code will 
| | issue a wake_up() on its poll wake list. Epoll (efd1) has installed a 
| | callback wakeup entry on that queue, and the wake_up() performed by the 
| | "dfd" net code will end up in ep_poll_callback(). At this point epoll 
| | (efd1) notices that it may have some event ready, so it needs to wake up 
| | the waiters on its poll wait list (efd2). So it calls ep_poll_safewake() 
| | that ends up in another wake_up(), after having checked the recursion 
| | constraints. Those are: nest no more than EP_MAX_POLLWAKE_NESTS deep, to 
| | avoid stack blasting, and never hit the same queue, to avoid loops like:
| | 
| | 	epoll_ctl(efd2, EPOLL_CTL_ADD, efd1, ...);
| | 	epoll_ctl(efd3, EPOLL_CTL_ADD, efd2, ...);
| | 	epoll_ctl(efd4, EPOLL_CTL_ADD, efd3, ...);
| | 	epoll_ctl(efd1, EPOLL_CTL_ADD, efd4, ...);
| | 
| | The code "if (tncur->wq == wq || ..." prevents re-entering the same 
| | queue/lock.
| | I don't know how the lockdep code works, so I can't comment on 
| | wake_up_nested(), although I have a feeling it is not enough in this case.
| | A solution may be to move the call to ep_poll_safewake() (that'd become a 
| | simple wake_up()) inside a tasklet or whatever is today trendy for delayed 
| | work. But this kinda scares me to be honest, since epoll already has a 
| | bunch of places where it could be asynchronously hit (plus performance 
| | regressions would need to be verified).
| | 
| | 
| | 
| | - Davide
| | 
| | 
| 
| it's quite possible that I'm wrong, but just out of interest...
| why is the assignment in ep_poll_safewake()
| 
| 	struct list_head *lsthead = &psw->wake_task_list;
| 
| not protected by a spinlock?
| 
| 		- Cyrill -

it was a completely stupid question... please drop ;)

		- Cyrill -


* Re: 2.6.24-rc6: possible recursive locking detected
  2008-01-05 17:01         ` Peter Zijlstra
  2008-01-05 21:35           ` Davide Libenzi
@ 2008-01-07 17:22           ` Oleg Nesterov
  2008-01-07 17:49             ` Oleg Nesterov
  1 sibling, 1 reply; 17+ messages in thread
From: Oleg Nesterov @ 2008-01-07 17:22 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Herbert Xu, Ingo Molnar, Rafael J. Wysocki, Christian Kujau,
	linux-kernel, jfs-discussion, Davide Libenzi, Johannes Berg

On 01/05, Peter Zijlstra wrote:
>
> Since EP_MAX_POLLWAKE_NESTS < MAX_LOCKDEP_SUBCLASSES we could perhaps do
> something like:
>
>   wake_up_nested(..., wake_nests);

I think this would be the most correct change. But I wonder if it is possible
to do something more generic (but otoh more stupid/hackish and less safe).

Consider this "just for illustration" patch,

--- t/kernel/lockdep.c	2007-11-09 12:57:31.000000000 +0300
+++ t/kernel/lockdep.c	2008-01-07 19:43:50.000000000 +0300
@@ -1266,10 +1266,13 @@ check_deadlock(struct task_struct *curr,
 	struct held_lock *prev;
 	int i;
 
-	for (i = 0; i < curr->lockdep_depth; i++) {
+	for (i = curr->lockdep_depth; --i >= 0; ) {
 		prev = curr->held_locks + i;
 		if (prev->class != next->class)
 			continue;
+
+		if (prev->trylock == -1)
+			return 2;
 		/*
 		 * Allow read-after-read recursion of the same
 		 * lock class (i.e. read_lock(lock)+read_lock(lock)):
-------------------------------------------------------------------------

Now,

	// trylock == -1
	#define	spin_mark_nested(l)	\
		lock_acquire(&(l)->dep_map, 0, -1, 0, 2, _THIS_IP_)
	#define	spin_unmark_nested(l)	\
		lock_release(&(l)->dep_map, 1, _THIS_IP_)

and ep_poll_safewake() can do:

	/* Do really wake up now */
	spin_mark_nested(&wq->lock);
	wake_up(wq);
	spin_unmark_nested(&wq->lock);

Possible?

Oleg.



* Re: 2.6.24-rc6: possible recursive locking detected
  2008-01-07 17:22           ` Oleg Nesterov
@ 2008-01-07 17:49             ` Oleg Nesterov
  2008-01-13 16:32               ` Peter Zijlstra
  0 siblings, 1 reply; 17+ messages in thread
From: Oleg Nesterov @ 2008-01-07 17:49 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Herbert Xu, Ingo Molnar, Rafael J. Wysocki, Christian Kujau,
	linux-kernel, jfs-discussion, Davide Libenzi, Johannes Berg

On 01/07, Oleg Nesterov wrote:
>
> Consider this "just for illustration" patch,
> 
> --- t/kernel/lockdep.c	2007-11-09 12:57:31.000000000 +0300
> +++ t/kernel/lockdep.c	2008-01-07 19:43:50.000000000 +0300
> @@ -1266,10 +1266,13 @@ check_deadlock(struct task_struct *curr,
>  	struct held_lock *prev;
>  	int i;
>  
> -	for (i = 0; i < curr->lockdep_depth; i++) {
> +	for (i = curr->lockdep_depth; --i >= 0; ) {
>  		prev = curr->held_locks + i;
>  		if (prev->class != next->class)
>  			continue;
> +
> +		if (prev->trylock == -1)
> +			return 2;
>  		/*
>  		 * Allow read-after-read recursion of the same
>  		 * lock class (i.e. read_lock(lock)+read_lock(lock)):
> -------------------------------------------------------------------------
> 
> Now,
> 
> 	// trylock == -1
> 	#define	spin_mark_nested(l)	\
> 		lock_acquire(&(l)->dep_map, 0, -1, 0, 2, _THIS_IP_)
> 	#define	spin_unmark_nested(l)	\
> 		lock_release(&(l)->dep_map, 1, _THIS_IP_)
> 
> and ep_poll_safewake() can do:
> 
> 	/* Do really wake up now */
> 	spin_mark_nested(&wq->lock);
> 	wake_up(wq);
> 	spin_unmark_nested(&wq->lock);

I tested the patch above with the following code,

	wait_queue_head_t w1, w2, w3;

	init_waitqueue_head(&w1);
	init_waitqueue_head(&w2);
	init_waitqueue_head(&w3);

	local_irq_disable();
	spin_lock(&w1.lock);

	spin_mark_nested(&w2.lock);
	spin_lock(&w2.lock);

	spin_mark_nested(&w3.lock);
	wake_up(&w3);
	spin_unmark_nested(&w3.lock);

	spin_unlock(&w2.lock);
	spin_unmark_nested(&w2.lock);

	spin_unlock(&w1.lock);
	local_irq_enable();

seems to work. What do you think?

Oleg.



* Re: 2.6.24-rc6: possible recursive locking detected
  2008-01-06  0:20             ` Christian Kujau
@ 2008-01-07 21:35               ` Davide Libenzi
  0 siblings, 0 replies; 17+ messages in thread
From: Davide Libenzi @ 2008-01-07 21:35 UTC (permalink / raw)
  To: Christian Kujau
  Cc: Peter Zijlstra, Herbert Xu, Ingo Molnar, Rafael J. Wysocki,
	Linux Kernel Mailing List, jfs-discussion, Johannes Berg,
	Oleg Nesterov

On Sun, 6 Jan 2008, Christian Kujau wrote:

> On Sat, 5 Jan 2008, Davide Libenzi wrote:
> > A solution may be to move the call to ep_poll_safewake() (that'd become a
> > simple wake_up()) inside a tasklet or whatever is today trendy for delayed
> > work. But this kinda scares me to be honest, since epoll already has a
> > bunch of places where it could be asynchronously hit (plus performance
> > regressions would need to be verified).
> 
> Although I'm not able to reproduce this one right now, I'm happy to test any
> patches you guys come up with.

There's no need to reproduce it. It's there, it's among us ;)



- Davide




* Re: 2.6.24-rc6: possible recursive locking detected
  2008-01-07 17:49             ` Oleg Nesterov
@ 2008-01-13 16:32               ` Peter Zijlstra
  2008-01-14 21:27                 ` Oleg Nesterov
  0 siblings, 1 reply; 17+ messages in thread
From: Peter Zijlstra @ 2008-01-13 16:32 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Herbert Xu, Ingo Molnar, Rafael J. Wysocki, Christian Kujau,
	linux-kernel, jfs-discussion, Davide Libenzi, Johannes Berg


On Mon, 2008-01-07 at 20:49 +0300, Oleg Nesterov wrote:
> On 01/07, Oleg Nesterov wrote:
> >
> > Consider this "just for illustration" patch,
> > 
> > --- t/kernel/lockdep.c	2007-11-09 12:57:31.000000000 +0300
> > +++ t/kernel/lockdep.c	2008-01-07 19:43:50.000000000 +0300
> > @@ -1266,10 +1266,13 @@ check_deadlock(struct task_struct *curr,
> >  	struct held_lock *prev;
> >  	int i;
> >  
> > -	for (i = 0; i < curr->lockdep_depth; i++) {
> > +	for (i = curr->lockdep_depth; --i >= 0; ) {
> >  		prev = curr->held_locks + i;
> >  		if (prev->class != next->class)
> >  			continue;
> > +
> > +		if (prev->trylock == -1)
> > +			return 2;
> >  		/*
> >  		 * Allow read-after-read recursion of the same
> >  		 * lock class (i.e. read_lock(lock)+read_lock(lock)):
> > -------------------------------------------------------------------------
> > 
> > Now,
> > 
> > 	// trylock == -1
> > 	#define	spin_mark_nested(l)	\
> > 		lock_acquire(&(l)->dep_map, 0, -1, 0, 2, _THIS_IP_)
> > 	#define	spin_unmark_nested(l)	\
> > 		lock_release(&(l)->dep_map, 1, _THIS_IP_)
> > 
> > and ep_poll_safewake() can do:
> > 
> > 	/* Do really wake up now */
> > 	spin_mark_nested(&wq->lock);
> > 	wake_up(wq);
> > 	spin_unmark_nested(&wq->lock);
> 
> I tested the patch above with the following code,
> 
> 	wait_queue_head_t w1, w2, w3;
> 
> 	init_waitqueue_head(&w1);
> 	init_waitqueue_head(&w2);
> 	init_waitqueue_head(&w3);
> 
> 	local_irq_disable();
> 	spin_lock(&w1.lock);
> 
> 	spin_mark_nested(&w2.lock);
> 	spin_lock(&w2.lock);
> 
> 	spin_mark_nested(&w3.lock);
> 	wake_up(&w3);
> 	spin_unmark_nested(&w3.lock);
> 
> 	spin_unlock(&w2.lock);
> 	spin_unmark_nested(&w2.lock);
> 
> 	spin_unlock(&w1.lock);
> 	local_irq_enable();
> 
> seems to work. What do you think?

I've been pondering this for a while, and some days I really like it,
some days I don't.

The problem I have with it is that it becomes very easy to falsely
annotate problems away - it's a very powerful annotation. That said, it's
almost powerful enough to annotate the device semaphore/mutex problem.

I think I'll do wake_up_nested() for now and keep this around.

Thanks for this very nice idea though.



* Re: 2.6.24-rc6: possible recursive locking detected
  2008-01-13 16:32               ` Peter Zijlstra
@ 2008-01-14 21:27                 ` Oleg Nesterov
  2008-01-30 10:34                   ` hrtimers and lockdep (was: Re: 2.6.24-rc6: possible recursive locking detected) Peter Zijlstra
  0 siblings, 1 reply; 17+ messages in thread
From: Oleg Nesterov @ 2008-01-14 21:27 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Herbert Xu, Ingo Molnar, Rafael J. Wysocki, Christian Kujau,
	linux-kernel, jfs-discussion, Davide Libenzi, Johannes Berg

On 01/13, Peter Zijlstra wrote:
> 
> On Mon, 2008-01-07 at 20:49 +0300, Oleg Nesterov wrote:
>
> > On 01/07, Oleg Nesterov wrote:
> > >
> > > Consider this "just for illustration" patch,
> > > 
> > > --- t/kernel/lockdep.c	2007-11-09 12:57:31.000000000 +0300
> > > +++ t/kernel/lockdep.c	2008-01-07 19:43:50.000000000 +0300
> > > @@ -1266,10 +1266,13 @@ check_deadlock(struct task_struct *curr,
> > >  	struct held_lock *prev;
> > >  	int i;
> > >  
> > > -	for (i = 0; i < curr->lockdep_depth; i++) {
> > > +	for (i = curr->lockdep_depth; --i >= 0; ) {
> > >  		prev = curr->held_locks + i;
> > >  		if (prev->class != next->class)
> > >  			continue;
> > > +
> > > +		if (prev->trylock == -1)
> > > +			return 2;
> > >  		/*
> > >  		 * Allow read-after-read recursion of the same
> > >  		 * lock class (i.e. read_lock(lock)+read_lock(lock)):
> > > -------------------------------------------------------------------------
> > > 
> > > Now,
> > > 
> > > 	// trylock == -1
> > > 	#define	spin_mark_nested(l)	\
> > > 		lock_acquire(&(l)->dep_map, 0, -1, 0, 2, _THIS_IP_)
> > > 	#define	spin_unmark_nested(l)	\
> > > 		lock_release(&(l)->dep_map, 1, _THIS_IP_)
> > > 
> > > and ep_poll_safewake() can do:
> > > 
> > > 	/* Do really wake up now */
> > > 	spin_mark_nested(&wq->lock);
> > > 	wake_up(wq);
> > > 	spin_unmark_nested(&wq->lock);
> > 
> > seems to work. What do you think?
> 
> I've been pondering this for a while, and some days I really like it,
> some days I don't.
> 
> The problem I have with it is that it becomes very easy to falsely
> annotate problems away - it's a very powerful annotation.

Also, I don't like the overloading of ->trylock; this is really hackish.

> I think I'll do wake_up_nested() for now and keep this around.

Agreed.

Perhaps it is a bit easier to use spin_lock_nested() + __wake_up_common()
directly (we have a lot of wake_up_xxx helpers), but this is up to you.


Offtopic question. Why do we have so much lockdep stuff in timer.c and hrtimer.c?
We never lock 2 bases at the same time, except in migrate_timers(). We can kill
double_spin_lock() and base_lock_keys[] and just use spin_lock_nested in
migrate_timers(), no?
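
I.e., something like this in migrate_timers() (untested, just to illustrate
the idea):

	spin_lock_irq(&new_base->lock);
	spin_lock_nested(&old_base->lock, SINGLE_DEPTH_NESTING);

	/* ... move the pending timers from old_base to new_base ... */

	spin_unlock(&old_base->lock);
	spin_unlock_irq(&new_base->lock);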

Oleg.



* hrtimers and lockdep (was: Re: 2.6.24-rc6: possible recursive locking detected)
  2008-01-14 21:27                 ` Oleg Nesterov
@ 2008-01-30 10:34                   ` Peter Zijlstra
  2008-01-30 17:36                     ` Thomas Gleixner
  0 siblings, 1 reply; 17+ messages in thread
From: Peter Zijlstra @ 2008-01-30 10:34 UTC (permalink / raw)
  To: Oleg Nesterov; +Cc: Ingo Molnar, linux-kernel, Thomas Gleixner

( trimmed CC list )

Sorry for the delay, this message seems to have gotten lost in my
inbox :-/

On Tue, 2008-01-15 at 00:27 +0300, Oleg Nesterov wrote:
> On 01/13, Peter Zijlstra wrote:

> Offtopic question. Why do we have so much lockdep stuff in timer.c and hrtimer.c?
> We never lock 2 bases at the same time, except in migrate_timers(). We can kill
> double_spin_lock() and base_lock_keys[] and just use spin_lock_nested in
> migrate_timers(), no?

Lets ask Thomas.. :-)



* Re: hrtimers and lockdep (was: Re: 2.6.24-rc6: possible recursive locking detected)
  2008-01-30 10:34                   ` hrtimers and lockdep (was: Re: 2.6.24-rc6: possible recursive locking detected) Peter Zijlstra
@ 2008-01-30 17:36                     ` Thomas Gleixner
  0 siblings, 0 replies; 17+ messages in thread
From: Thomas Gleixner @ 2008-01-30 17:36 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: Oleg Nesterov, Ingo Molnar, linux-kernel

On Wed, 30 Jan 2008, Peter Zijlstra wrote:

> ( trimmed CC list )
> 
> Sorry for the delay, this message seems to have gotten lost in my
> inbox :-/
> 
> On Tue, 2008-01-15 at 00:27 +0300, Oleg Nesterov wrote:
> > On 01/13, Peter Zijlstra wrote:
> 
> > Offtopic question. Why do we have so much lockdep stuff in timer.c and hrtimer.c?
> > We never lock 2 bases at the same time, except in migrate_timers(). We can kill
> > double_spin_lock() and base_lock_keys[] and just use spin_lock_nested in
> > migrate_timers(), no?
> 
> Lets ask Thomas.. :-)

No objections as long as it all works :)

   tglx

