* lockdep possible recursive lock in slab parent->list->rlock in rc2
From: Andi Kleen @ 2009-12-27 12:06 UTC
To: linux-kernel, netdev, penberg

Get this on a NFS root system while booting
This must be a recent change in the last week,
I didn't see it in a post rc1 git* from last week
(I haven't done a exact bisect)

It's triggered by the r8169 driver close function,
but looks more like a slab problem?

I haven't checked it in detail if the locks are
really different or just lockdep not knowing
enough classes.

-Andi

=============================================
[ INFO: possible recursive locking detected ]
2.6.33-rc2 #19
---------------------------------------------
swapper/1 is trying to acquire lock:
 (&(&parent->list_lock)->rlock){-.-...}, at: [<ffffffff810cc93a>] cache_flusharray+0x55/0x10a

but task is already holding lock:
 (&(&parent->list_lock)->rlock){-.-...}, at: [<ffffffff810cc93a>] cache_flusharray+0x55/0x10a

other info that might help us debug this:
2 locks held by swapper/1:
 #0: (rtnl_mutex){+.+.+.}, at: [<ffffffff813e24d6>] rtnl_lock+0x12/0x14
 #1: (&(&parent->list_lock)->rlock){-.-...}, at: [<ffffffff810cc93a>] cache_flusharray+0x55/0x10a

stack backtrace:
Pid: 1, comm: swapper Not tainted 2.6.33-rc2-MCE6 #19
Call Trace:
 [<ffffffff810687da>] __lock_acquire+0xf94/0x1771
 [<ffffffff81066402>] ? mark_held_locks+0x4d/0x6b
 [<ffffffff81066663>] ? trace_hardirqs_on_caller+0x10b/0x12f
 [<ffffffff8105b061>] ? sched_clock_local+0x1c/0x80
 [<ffffffff8105b061>] ? sched_clock_local+0x1c/0x80
 [<ffffffff81069073>] lock_acquire+0xbc/0xd9
 [<ffffffff810cc93a>] ? cache_flusharray+0x55/0x10a
 [<ffffffff8149639d>] _raw_spin_lock+0x31/0x66
 [<ffffffff810cc93a>] ? cache_flusharray+0x55/0x10a
 [<ffffffff810cbbf8>] ? kfree_debugcheck+0x11/0x2d
 [<ffffffff810cc93a>] cache_flusharray+0x55/0x10a
 [<ffffffff81066d67>] ? debug_check_no_locks_freed+0x119/0x12f
 [<ffffffff810cc387>] kmem_cache_free+0x18f/0x1f2
 [<ffffffff810cc515>] slab_destroy+0x12b/0x138
 [<ffffffff810cc683>] free_block+0x161/0x1a2
 [<ffffffff810cc982>] cache_flusharray+0x9d/0x10a
 [<ffffffff81066d67>] ? debug_check_no_locks_freed+0x119/0x12f
 [<ffffffff810ccbf3>] kfree+0x204/0x23b
 [<ffffffff81066694>] ? trace_hardirqs_on+0xd/0xf
 [<ffffffff813d002a>] skb_release_data+0xc6/0xcb
 [<ffffffff813cfd19>] __kfree_skb+0x19/0x86
 [<ffffffff813cfdb1>] consume_skb+0x2b/0x2d
 [<ffffffff8133929a>] rtl8169_rx_clear+0x7f/0xbb
 [<ffffffff8133ada2>] rtl8169_down+0x12c/0x13b
 [<ffffffff8133b58a>] rtl8169_close+0x30/0x131
 [<ffffffff813e8d98>] ? dev_deactivate+0x168/0x198
 [<ffffffff813d94d6>] dev_close+0x8c/0xae
 [<ffffffff813d8e62>] dev_change_flags+0xba/0x180
 [<ffffffff81a87e63>] ic_close_devs+0x2e/0x48
 [<ffffffff81a88a5b>] ip_auto_config+0x914/0xe1e
 [<ffffffff8105b061>] ? sched_clock_local+0x1c/0x80
 [<ffffffff810649a1>] ? trace_hardirqs_off+0xd/0xf
 [<ffffffff8105b1c0>] ? cpu_clock+0x2d/0x3f
 [<ffffffff810649c7>] ? lock_release_holdtime+0x24/0x181
 [<ffffffff81a86967>] ? tcp_congestion_default+0x0/0x12
 [<ffffffff81496c60>] ? _raw_spin_unlock+0x26/0x2b
 [<ffffffff81a86967>] ? tcp_congestion_default+0x0/0x12
 [<ffffffff81a88147>] ? ip_auto_config+0x0/0xe1e
 [<ffffffff810001f0>] do_one_initcall+0x5a/0x14f
 [<ffffffff81a5364c>] kernel_init+0x141/0x197
 [<ffffffff81003794>] kernel_thread_helper+0x4/0x10
 [<ffffffff81496efc>] ? restore_args+0x0/0x30
 [<ffffffff81a5350b>] ? kernel_init+0x0/0x197
 [<ffffffff81003790>] ? kernel_thread_helper+0x0/0x10
IP-Config: Retrying forever (NFS root)...
r8169: eth0: link up

-- 
ak@linux.intel.com -- Speaking for myself only.

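[Editor's sketch] The report above asks whether the two list locks are really different or whether lockdep simply lacks enough classes. Lockdep validates lock classes rather than individual lock instances, so two distinct locks that share a class look like a recursive acquisition when one is taken while the other is held. The toy model below illustrates only that idea. Every name in it (toy_lock_class, toy_lock, toy_task, toy_would_warn, toy_nested_free_warns) is hypothetical and none of it is kernel code; the mapping onto the nested cache_flusharray() calls in the trace is an assumption drawn from reading the backtrace.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: all names here are hypothetical, not the kernel's lockdep. */
struct toy_lock_class {
	const char *name;
};

struct toy_lock {
	const struct toy_lock_class *class; /* the checker looks at this, not at the instance */
};

#define TOY_MAX_HELD 8

struct toy_task {
	const struct toy_lock *held[TOY_MAX_HELD];
	size_t depth;
};

/* True if taking 'l' would look recursive to a purely class-based checker. */
static bool toy_would_warn(const struct toy_task *t, const struct toy_lock *l)
{
	for (size_t i = 0; i < t->depth; i++)
		if (t->held[i]->class == l->class)
			return true;
	return false;
}

/* Record 'l' as held by the task. */
static void toy_acquire(struct toy_task *t, const struct toy_lock *l)
{
	assert(t->depth < TOY_MAX_HELD);
	t->held[t->depth++] = l;
}

/*
 * Two distinct list locks are nested, as in the nested cache_flusharray()
 * calls in the trace (assumed mapping). 'same_class' selects whether the
 * inner lock shares the outer lock's class or has its own.
 */
static bool toy_nested_free_warns(bool same_class)
{
	static const struct toy_lock_class default_l3 = { "default_l3" };
	static const struct toy_lock_class on_slab_l3 = { "on_slab_l3" };
	struct toy_lock outer = { &default_l3 };
	struct toy_lock inner = { same_class ? &default_l3 : &on_slab_l3 };
	struct toy_task task = { .depth = 0 };

	toy_acquire(&task, &outer);
	return toy_would_warn(&task, &inner);
}
```

In this model toy_nested_free_warns(true) reports a warning although outer and inner are different objects, while toy_nested_free_warns(false) does not, which is the distinction the question in the report is getting at.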
* Re: lockdep possible recursive lock in slab parent->list->rlock in rc2
From: Pekka Enberg @ 2009-12-27 12:33 UTC
To: Andi Kleen
Cc: linux-kernel, netdev, paulmck, a.p.zijlstra, cl, Heiko Carstens

Hi Andi,

On Sun, 2009-12-27 at 13:06 +0100, Andi Kleen wrote:
> Get this on a NFS root system while booting
> This must be a recent change in the last week,
> I didn't see it in a post rc1 git* from last week
> (I haven't done a exact bisect)
>
> It's triggered by the r8169 driver close function,
> but looks more like a slab problem?
>
> I haven't checked it in detail if the locks are
> really different or just lockdep not knowing
> enough classes.

I broke the lockdep annotations in commit
ce79ddc8e2376a9a93c7d42daf89bfcbb9187e62 ("SLAB: Fix lockdep annotations
for CPU hotplug"). Does this fix things for you? Heiko, the following
patch should fix it for you too.

			Pekka

diff --git a/mm/slab.c b/mm/slab.c
index 7d41f15..7451bda 100644
--- a/mm/slab.c
+++ b/mm/slab.c
@@ -654,7 +654,7 @@ static void init_node_lock_keys(int q)
 
 		l3 = s->cs_cachep->nodelists[q];
 		if (!l3 || OFF_SLAB(s->cs_cachep))
-			return;
+			continue;
 		lockdep_set_class(&l3->list_lock, &on_slab_l3_key);
 		alc = l3->alien;
 		/*
@@ -665,7 +665,7 @@ static void init_node_lock_keys(int q)
 		 * for alloc_alien_cache,
 		 */
 		if (!alc || (unsigned long)alc == BAD_ALIEN_MAGIC)
-			continue;
+			continue;
 		for_each_node(r) {
 			if (alc[r])
 				lockdep_set_class(&alc[r]->lock,

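[Editor's sketch] The patch above swaps `return` for `continue` inside the loop in init_node_lock_keys(). The standalone example below illustrates why that matters: with `return`, the first cache that is skipped ends the whole walk, so every later cache keeps its default lock class. The struct and function names (toy_cache, annotate_with_return, annotate_with_continue, toy_demo_count) are hypothetical stand-ins, the returned count exists only for illustration (the kernel function returns void), and the four-entry array is made-up sample data.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for one cache in the walk; not the kernel's struct. */
struct toy_cache {
	bool has_node_list; /* models l3 != NULL */
	bool off_slab;      /* models OFF_SLAB(cachep) */
	bool annotated;     /* models lockdep_set_class() having been called */
};

/* Pre-fix shape: the first skipped cache aborts the loop. */
static size_t annotate_with_return(struct toy_cache *caches, size_t n)
{
	size_t done = 0;

	for (size_t i = 0; i < n; i++) {
		if (!caches[i].has_node_list || caches[i].off_slab)
			return done; /* later caches are never visited */
		caches[i].annotated = true;
		done++;
	}
	return done;
}

/* Post-fix shape: a skipped cache no longer hides the ones after it. */
static size_t annotate_with_continue(struct toy_cache *caches, size_t n)
{
	size_t done = 0;

	for (size_t i = 0; i < n; i++) {
		if (!caches[i].has_node_list || caches[i].off_slab)
			continue;
		caches[i].annotated = true;
		done++;
	}
	return done;
}

/* Made-up sample: on-slab, off-slab, on-slab, on-slab. */
static size_t toy_demo_count(bool fixed)
{
	struct toy_cache caches[] = {
		{ true, false, false },
		{ true, true,  false },
		{ true, false, false },
		{ true, false, false },
	};
	size_t n = sizeof(caches) / sizeof(caches[0]);

	return fixed ? annotate_with_continue(caches, n)
		     : annotate_with_return(caches, n);
}
```

With this sample, the `return` variant annotates only the first cache, while the `continue` variant annotates all three on-slab caches.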
* Re: lockdep possible recursive lock in slab parent->list->rlock in rc2
From: Andi Kleen @ 2009-12-27 12:43 UTC
To: Pekka Enberg
Cc: Andi Kleen, linux-kernel, netdev, paulmck, a.p.zijlstra, cl, Heiko Carstens

> I broke the lockdep annotations in commit
> ce79ddc8e2376a9a93c7d42daf89bfcbb9187e62 ("SLAB: Fix lockdep annotations
> for CPU hotplug"). Does this fix things for you? Heiko, the following
> patch should fix it for you too.

Yes that patch fixes it. Thanks.

-Andi

* Re: lockdep possible recursive lock in slab parent->list->rlock in rc2
From: Heiko Carstens @ 2009-12-28 9:40 UTC
To: Pekka Enberg
Cc: Andi Kleen, linux-kernel, netdev, paulmck, a.p.zijlstra, cl

On Sun, Dec 27, 2009 at 02:33:14PM +0200, Pekka Enberg wrote:
> Hi Andi,
>
> On Sun, 2009-12-27 at 13:06 +0100, Andi Kleen wrote:
> > Get this on a NFS root system while booting
> > This must be a recent change in the last week,
> > I didn't see it in a post rc1 git* from last week
> > (I haven't done a exact bisect)
> >
> > It's triggered by the r8169 driver close function,
> > but looks more like a slab problem?
> >
> > I haven't checked it in detail if the locks are
> > really different or just lockdep not knowing
> > enough classes.
>
> I broke the lockdep annotations in commit
> ce79ddc8e2376a9a93c7d42daf89bfcbb9187e62 ("SLAB: Fix lockdep annotations
> for CPU hotplug"). Does this fix things for you? Heiko, the following
> patch should fix it for you too.

Works fine here too. Thanks!

* Re: lockdep possible recursive lock in slab parent->list->rlock in rc2
From: Paul E. McKenney @ 2009-12-29 1:11 UTC
To: Pekka Enberg
Cc: Andi Kleen, linux-kernel, netdev, a.p.zijlstra, cl, Heiko Carstens

On Sun, Dec 27, 2009 at 02:33:14PM +0200, Pekka Enberg wrote:
> Hi Andi,
>
> On Sun, 2009-12-27 at 13:06 +0100, Andi Kleen wrote:
> > Get this on a NFS root system while booting
> > This must be a recent change in the last week,
> > I didn't see it in a post rc1 git* from last week
> > (I haven't done a exact bisect)
> >
> > It's triggered by the r8169 driver close function,
> > but looks more like a slab problem?
> >
> > I haven't checked it in detail if the locks are
> > really different or just lockdep not knowing
> > enough classes.
>
> I broke the lockdep annotations in commit
> ce79ddc8e2376a9a93c7d42daf89bfcbb9187e62 ("SLAB: Fix lockdep annotations
> for CPU hotplug"). Does this fix things for you? Heiko, the following
> patch should fix it for you too.

And no lockdep warnings here, either.

I did get the following new-to-me preempt_count underflow, but doubt
that it is related.

							Thanx, Paul

Badness at kernel/sched.c:5350
NIP: c0000000005b2e58 LR: c0000000005b2e3c CTR: c000000000025f0c
REGS: c000000042893b30 TRAP: 0700 Not tainted (2.6.33-rc2-autokern1)
MSR: 8000000000029032 <EE,ME,CE,IR,DR> CR: 22000082 XER: 0000000c
TASK = c00000007d8737e0[0] 'swapper' THREAD: c000000042890000 CPU: 2
GPR00: 0000000000000000 c000000042893db0 c0000000009c07f8 0000000000000001
GPR04: 0000000000000001 0000000000000006 0000000000000001 000000000000004a
GPR08: 0000000000000000 c00000000128adb8 c00000000088aa20 c000000000a0da08
GPR12: 0000000000000002 c0000000009df880 0000000000000000 0000000000c00020
GPR16: 0000000000000002 0000000000000000 0000000000000000 0000000000000000
GPR20: 0000000000000000 c0000000009e24b0 0000000000000001 c0000000009df480
GPR24: 0000000000000000 c0000000009d8628 c0000000009df880 0000000000000002
GPR28: c0000000009e2068 c0000000009d8628 c00000000093c000 c000000042890000
NIP [c0000000005b2e58] .sub_preempt_count+0x58/0xc8
LR [c0000000005b2e3c] .sub_preempt_count+0x3c/0xc8
Call Trace:
[c000000042893db0] [c000000042893e30] 0xc000000042893e30 (unreliable)
[c000000042893e30] [c000000000014d38] .cpu_idle+0x1f0/0x20c
[c000000042893ec0] [c0000000005ba678] .start_secondary+0x380/0x3c4
[c000000042893f90] [c000000000008264] .start_secondary_prolog+0x10/0x14
Instruction dump:
78290464 80090014 7f801800 40bc0074 4bd45745 60000000 2fa30000 419e0070
e93e8a08 80090000 2f800000 409e0060 <0fe00000> 48000058 78000620 2fa00000
BUG: scheduling while atomic: swapper/0/0x00000000
INFO: lockdep is turned off.
Modules linked in: ehea
Call Trace:
[c000000042897bf0] [c0000000000123b0] .show_stack+0x70/0x184 (unreliable)
[c000000042897ca0] [c00000000005eaa0] .__schedule_bug+0xa4/0xc4
[c000000042897d30] [c0000000005abe4c] .schedule+0xd8/0xa8c
[c000000042897e30] [c000000000014d40] .cpu_idle+0x1f8/0x20c
[c000000042897ec0] [c0000000005ba678] .start_secondary+0x380/0x3c4
[c000000042897f90] [c000000000008264] .start_secondary_prolog+0x10/0x14

> diff --git a/mm/slab.c b/mm/slab.c
> index 7d41f15..7451bda 100644
> --- a/mm/slab.c
> +++ b/mm/slab.c
> @@ -654,7 +654,7 @@ static void init_node_lock_keys(int q)
>  
>  		l3 = s->cs_cachep->nodelists[q];
>  		if (!l3 || OFF_SLAB(s->cs_cachep))
> -			return;
> +			continue;
>  		lockdep_set_class(&l3->list_lock, &on_slab_l3_key);
>  		alc = l3->alien;
>  		/*
> @@ -665,7 +665,7 @@ static void init_node_lock_keys(int q)
>  		 * for alloc_alien_cache,
>  		 */
>  		if (!alc || (unsigned long)alc == BAD_ALIEN_MAGIC)
> -			return;
> +			continue;
>  		for_each_node(r) {
>  			if (alc[r])
>  				lockdep_set_class(&alc[r]->lock,