netdev.vger.kernel.org archive mirror
* Re: [BUG: 3.13.0-rc4] inconsistent lock state
From: Eric Dumazet @ 2013-12-17 18:59 UTC
  To: Knut Petersen; +Cc: Linus Torvalds, linux-kernel, netdev, David Miller

On Tue, 2013-12-17 at 19:13 +0100, Knut Petersen wrote:
> Hi Linus / everybody!
> 
> Booting openSuSE 13.1 with kernel 3.13.0-rc4 triggers the attached
> warning.
> 
> cu,
>   Knut
> 
> 

The following patch should solve the issue.

http://patchwork.ozlabs.org/patch/301382/

Sorry for this ...

Thanks

* Re: [BUG: 3.13.0-rc4] inconsistent lock state
From: Linus Torvalds @ 2013-12-17 19:06 UTC
  To: Knut Petersen, David Miller, Eric Dumazet
  Cc: linux-kernel, Network Development, Shawn Bohrer

[-- Attachment #1: Type: text/plain, Size: 1308 bytes --]

David, Eric, netdev,
 this seems to be due to __udp4_lib_rcv() doing udp_sk_rx_dst_set(),
which takes the 'sk->sk_dst_lock' spinlock. This all happens in a
software irq context.

And on the other hand, inet_csk_listen_start() does sk_dst_reset(),
which takes the same lock *without* bh-disable, so it could deadlock
if an interrupt comes in, and bh processing happens, and we try to
take the lock recursively.

Now, if I read everything correctly, I think that this is all fine in
practice because inet_csk_listen_start() is only ever called for TCP
sockets (inet_listen seems to exit unless it's a SOCK_STREAM), and
that this is a false lockdep warning due to a very hacky use of
sk_dst_lock by UDP.

However, even if that's the case, then to make lockdep happy, maybe
UDP and TCP sockets need to initialize that sk_dst_lock in different
lockdep classes. Because we should make sure that lockdep is happy
either way.

Or maybe use a different lock for that UDP hack.
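
To make the reasoning above concrete, here is a minimal user-space sketch of the bookkeeping that produces this kind of report. It is not kernel code and not the posted fix: the type `lock_class_model`, the enum `acquire_ctx` and every function name are hypothetical, and the model only tracks the two usage states named in the warning (SOFTIRQ-ON-W and IN-SOFTIRQ-W). It compares three cases: one shared class for TCP and UDP, separate classes per protocol, and a process-context path that disables bottom halves first (roughly what spin_lock_bh() does in the kernel).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy stand-in for a lockdep lock class: it only remembers
 * in which softirq-related contexts the lock has ever been taken. */
struct lock_class_model {
    bool used_in_softirq;   /* taken while running a softirq (IN-SOFTIRQ-W) */
    bool used_softirq_on;   /* taken with softirqs enabled (SOFTIRQ-ON-W)   */
};

/* Context of a single acquisition (names are illustrative only). */
enum acquire_ctx {
    CTX_PROCESS_SOFTIRQ_ON,   /* plain spin_lock() from process context      */
    CTX_PROCESS_SOFTIRQ_OFF,  /* process context with bottom halves disabled */
    CTX_IN_SOFTIRQ            /* e.g. the packet receive path                */
};

/* Record one acquisition; return false once the class has been seen both
 * inside a softirq and with softirqs enabled, which is the combination
 * that could self-deadlock on one CPU. */
bool model_acquire(struct lock_class_model *cls, enum acquire_ctx ctx)
{
    assert(cls != NULL);
    if (ctx == CTX_IN_SOFTIRQ)
        cls->used_in_softirq = true;
    else if (ctx == CTX_PROCESS_SOFTIRQ_ON)
        cls->used_softirq_on = true;
    return !(cls->used_in_softirq && cls->used_softirq_on);
}

/* One class shared by TCP and UDP sockets: the listen path and the UDP
 * receive path together make the class look inconsistent. */
bool shared_class_is_consistent(void)
{
    struct lock_class_model sk_dst_lock = { false, false };
    bool ok = model_acquire(&sk_dst_lock, CTX_PROCESS_SOFTIRQ_ON);
    ok = model_acquire(&sk_dst_lock, CTX_IN_SOFTIRQ) && ok;
    return ok;
}

/* Separate classes per protocol: each class sees only one kind of use. */
bool split_classes_are_consistent(void)
{
    struct lock_class_model tcp_dst_lock = { false, false };
    struct lock_class_model udp_dst_lock = { false, false };
    bool ok = model_acquire(&tcp_dst_lock, CTX_PROCESS_SOFTIRQ_ON);
    ok = model_acquire(&udp_dst_lock, CTX_IN_SOFTIRQ) && ok;
    return ok;
}

/* Shared class, but the process-context path disables bottom halves
 * before taking the lock, so no softirq can interrupt the holder. */
bool bh_disabled_is_consistent(void)
{
    struct lock_class_model sk_dst_lock = { false, false };
    bool ok = model_acquire(&sk_dst_lock, CTX_PROCESS_SOFTIRQ_OFF);
    ok = model_acquire(&sk_dst_lock, CTX_IN_SOFTIRQ) && ok;
    return ok;
}
```

In this model only the shared-class case is flagged. That matches the point that the check works per lock class rather than per socket, so it cannot know the two call paths never touch the same socket.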

This seems to have been introduced by commit 975022310233 ("udp: ipv4:
must add synchronization in udp_sk_rx_dst_set()").

Hmm?

              Linus

On Tue, Dec 17, 2013 at 10:13 AM, Knut Petersen
<Knut_Petersen@t-online.de> wrote:
> Hi Linus / everybody!
>
> Booting openSuSE 13.1 with kernel 3.13.0-rc4 triggers the attached
> warning.

[-- Attachment #2: messages --]
[-- Type: text/plain, Size: 6286 bytes --]

golem kernel: [   25.324096] 
golem kernel: [   25.326525] =================================
golem kernel: [   25.328009] [ INFO: inconsistent lock state ]
golem kernel: [   25.328009] 3.13.0-rc4-main #89 Not tainted
golem kernel: [   25.328009] ---------------------------------
golem kernel: [   25.328009] inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage.
golem kernel: [   25.328009] SuSEfirewall2/1453 [HC0[0]:SC1[1]:HE1:SE0] takes:
golem kernel: [   25.328009]  (&(&sk->sk_dst_lock)->rlock){+.?...}, at: [<c0516fe1>] __udp4_lib_rcv+0x4db/0x6d2
golem kernel: [   25.328009] {SOFTIRQ-ON-W} state was registered at:
golem kernel: [   25.328009]   [<c0154319>] __lock_acquire+0x6d5/0x1556
golem kernel: [   25.328009]   [<c0155733>] lock_acquire+0x72/0xc9
golem kernel: [   25.328009]   [<c054e122>] _raw_spin_lock+0x2a/0x39
golem kernel: [   25.328009]   [<c04fdff3>] inet_csk_listen_start+0x7d/0xc1
golem kernel: [   25.328009]   [<c051cff2>] inet_listen+0x145/0x16b
golem kernel: [   25.328009]   [<c04c23d4>] SyS_listen+0x48/0x63
golem kernel: [   25.328009]   [<c04c3096>] SyS_socketcall+0xb4/0x1fc
golem kernel: [   25.328009]   [<c0553df4>] sysenter_do_call+0x12/0x32
golem kernel: [   25.328009] irq event stamp: 53738
golem kernel: [   25.328009] hardirqs last  enabled at (53738): [<c012dc10>] local_bh_enable+0x92/0xa4
golem kernel: [   25.328009] hardirqs last disabled at (53737): [<c012dbc2>] local_bh_enable+0x44/0xa4
golem kernel: [   25.328009] softirqs last  enabled at (41266): [<c012d9aa>] __do_softirq+0x168/0x21a
golem kernel: [   25.328009] softirqs last disabled at (53653): [<c0102bff>] do_softirq_own_stack+0x34/0x3a
golem kernel: [   25.328009] 
golem kernel: [   25.328009] other info that might help us debug this:
golem kernel: [   25.328009]  Possible unsafe locking scenario:
golem kernel: [   25.328009] 
golem kernel: [   25.328009]        CPU0
golem kernel: [   25.328009]        ----
golem kernel: [   25.328009]   lock(&(&sk->sk_dst_lock)->rlock);
golem kernel: [   25.328009]   <Interrupt>
golem kernel: [   25.328009]     lock(&(&sk->sk_dst_lock)->rlock);
golem kernel: [   25.328009] 
golem kernel: [   25.328009]  *** DEADLOCK ***
golem kernel: [   25.328009] 
golem kernel: [   25.328009] 5 locks held by SuSEfirewall2/1453:
golem kernel: [   25.328009]  #0:  (&dup_mmap_sem){.+.+.+}, at: [<c01ab8c4>] uprobe_start_dup_mmap+0x12/0x14
golem kernel: [   25.328009]  #1:  (&mm->mmap_sem){++++++}, at: [<c0128491>] dup_mm+0x8c/0x3a4
golem kernel: [   25.328009]  #2:  (&mm->mmap_sem/1){+.+.+.}, at: [<c01284aa>] dup_mm+0xa5/0x3a4
golem kernel: [   25.328009]  #3:  (rcu_read_lock){.+.+..}, at: [<c04cfe90>] __netif_receive_skb_core+0x136/0x5c2
golem kernel: [   25.328009]  #4:  (rcu_read_lock){.+.+..}, at: [<c04f3f4f>] ip_local_deliver_finish+0x2b/0x21b
golem kernel: [   25.328009] 
golem kernel: [   25.328009] stack backtrace:
golem kernel: [   25.328009] CPU: 0 PID: 1453 Comm: SuSEfirewall2 Not tainted 3.13.0-rc4-main #89
golem kernel: [   25.328009] Hardware name:    /i915GMm-HFS, BIOS 6.00 PG 09/14/2005
golem kernel: [   25.328009]  00000000 f02d0110 f600bd60 c0548f1f f600bd80 c054714f c06d12e1 c06d11e0
golem kernel: [   25.328009]  c06d1fe2 f02d05ec f02d0110 c0152caa f600bda4 c015361e 00000004 00000000
golem kernel: [   25.328009]  00000006 00000004 f02d05e8 00000002 f02d05ec f600be04 c01542a6 c0153a54
golem kernel: [   25.328009] Call Trace:
golem kernel: [   25.328009]  [<c0548f1f>] dump_stack+0x16/0x18
golem kernel: [   25.328009]  [<c054714f>] print_usage_bug+0x21f/0x22b
golem kernel: [   25.328009]  [<c0152caa>] ? check_usage_backwards+0xda/0xda
golem kernel: [   25.328009]  [<c015361e>] mark_lock+0x2c4/0x4c2
golem kernel: [   25.328009]  [<c01542a6>] __lock_acquire+0x662/0x1556
golem kernel: [   25.328009]  [<c0153a54>] ? trace_hardirqs_on+0xb/0xd
golem kernel: [   25.328009]  [<fa07d588>] ? nf_conntrack_seqadj_fini+0x15/0x15 [nf_conntrack]
golem kernel: [   25.328009]  [<c0154363>] ? __lock_acquire+0x71f/0x1556
golem kernel: [   25.328009]  [<c0155733>] lock_acquire+0x72/0xc9
golem kernel: [   25.328009]  [<c0516fe1>] ? __udp4_lib_rcv+0x4db/0x6d2
golem kernel: [   25.328009]  [<c054e122>] _raw_spin_lock+0x2a/0x39
golem kernel: [   25.328009]  [<c0516fe1>] ? __udp4_lib_rcv+0x4db/0x6d2
golem kernel: [   25.328009]  [<c0516fe1>] __udp4_lib_rcv+0x4db/0x6d2
golem kernel: [   25.328009]  [<c05175e7>] udp_rcv+0x17/0x19
golem kernel: [   25.328009]  [<c04f4027>] ip_local_deliver_finish+0x103/0x21b
golem kernel: [   25.328009]  [<c04f45bd>] ip_local_deliver+0x75/0x7a
golem kernel: [   25.328009]  [<c04f43e3>] ip_rcv_finish+0x2a4/0x332
golem kernel: [   25.328009]  [<c04f481d>] ip_rcv+0x25b/0x2ba
golem kernel: [   25.328009]  [<c04d0291>] __netif_receive_skb_core+0x537/0x5c2
golem kernel: [   25.328009]  [<c04d0ada>] __netif_receive_skb+0x1b/0x57
golem kernel: [   25.328009]  [<c04d0b4d>] netif_receive_skb+0x37/0x3a
golem kernel: [   25.328009]  [<c04d12a3>] napi_gro_receive+0x36/0x72
golem kernel: [   25.328009]  [<c045a5d3>] sky2_poll+0x6c1/0x98c
golem kernel: [   25.328009]  [<c01539fb>] ? trace_hardirqs_on_caller+0x11d/0x16b
golem kernel: [   25.328009]  [<c04d0d71>] net_rx_action+0xaa/0x19e
golem kernel: [   25.328009]  [<c012d900>] __do_softirq+0xbe/0x21a
golem kernel: [   25.328009]  [<c012d842>] ? __tasklet_hrtimer_trampoline+0x33/0x33
golem kernel: [   25.328009]  <IRQ>  [<c012dcc0>] ? irq_exit+0x46/0x81
golem kernel: [   25.328009]  [<c011ccba>] ? smp_apic_timer_interrupt+0x36/0x3f
golem kernel: [   25.328009]  [<c054ee6b>] ? apic_timer_interrupt+0x2f/0x34
golem kernel: [   25.328009]  [<c01c007b>] ? bdi_has_dirty_io+0x37/0x46
golem kernel: [   25.328009]  [<c0128626>] ? dup_mm+0x221/0x3a4
golem kernel: [   25.328009]  [<c01293d0>] ? copy_process.part.34+0xc02/0x104e
golem kernel: [   25.328009]  [<c0129981>] ? do_fork+0xab/0x263
golem kernel: [   25.328009]  [<c02f7b6a>] ? _copy_to_user+0x41/0x4c
golem kernel: [   25.328009]  [<c02f7764>] ? trace_hardirqs_on_thunk+0xc/0x10
golem kernel: [   25.328009]  [<c0553e23>] ? sysenter_exit+0xf/0x16
golem kernel: [   25.328009]  [<c0129bb0>] ? SyS_clone+0x1b/0x1d
golem kernel: [   25.328009]  [<c0553df4>] ? sysenter_do_call+0x12/0x32

* Re: [BUG: 3.13.0-rc4] inconsistent lock state
From: David Miller @ 2013-12-17 19:24 UTC
  To: torvalds; +Cc: Knut_Petersen, edumazet, linux-kernel, netdev, sbohrer

From: Linus Torvalds <torvalds@linux-foundation.org>
Date: Tue, 17 Dec 2013 11:06:15 -0800

>  this seems to be due to __udp4_lib_rcv() doing udp_sk_rx_dst_set(),
> which takes the 'sk->sk_dst_lock' spinlock. This all happens in a
> software irq context.

I have a fix for this from Eric that I'll send to you today.

* Re: [BUG: 3.13.0-rc4] inconsistent lock state
From: Knut Petersen @ 2013-12-18  7:54 UTC
  To: Eric Dumazet; +Cc: Linus Torvalds, linux-kernel, netdev, David Miller

On 17.12.2013 19:59, Eric Dumazet wrote:
> On Tue, 2013-12-17 at 19:13 +0100, Knut Petersen wrote:
>> Hi Linus / everybody!
>>
>> Booting openSuSE 13.1 with kernel 3.13.0-rc4 triggers the attached
>> warning.
>>
>> cu,
>>    Knut
>>
>>
> The following patch should solve the issue.
>
> http://patchwork.ozlabs.org/patch/301382/

Indeed, it solves the problem.

cu,
  Knut
