* WARNING: bad unlock balance in sch_direct_xmit @ 2019-12-03 15:59 syzbot 2020-01-05 18:30 ` syzbot 2020-01-05 22:58 ` syzbot 0 siblings, 2 replies; 17+ messages in thread From: syzbot @ 2019-12-03 15:59 UTC (permalink / raw) To: davem, jhs, jiri, linux-kernel, netdev, syzkaller-bugs, xiyou.wangcong Hello, syzbot found the following crash on: HEAD commit: 81b6b964 Merge branch 'master' of git://git.kernel.org/pub.. git tree: upstream console output: https://syzkaller.appspot.com/x/log.txt?x=1677d882e00000 kernel config: https://syzkaller.appspot.com/x/.config?x=773597fe8d7cb41a dashboard link: https://syzkaller.appspot.com/bug?extid=4ec99438ed7450da6272 compiler: gcc (GCC) 9.0.0 20181231 (experimental) Unfortunately, I don't have any reproducer for this crash yet. IMPORTANT: if you fix the bug, please add the following tag to the commit: Reported-by: syzbot+4ec99438ed7450da6272@syzkaller.appspotmail.com ===================================== WARNING: bad unlock balance detected! 5.4.0-syzkaller #0 Not tainted ------------------------------------- syz-executor.4/10485 is trying to release lock (&dev->qdisc_xmit_lock_key) at: [<ffffffff85e7458a>] spin_unlock include/linux/spinlock.h:378 [inline] [<ffffffff85e7458a>] __netif_tx_unlock include/linux/netdevice.h:3961 [inline] [<ffffffff85e7458a>] sch_direct_xmit+0x3fa/0xd30 net/sched/sch_generic.c:315 but there are no more locks to release! other info that might help us debug this: 7 locks held by syz-executor.4/10485: #0: ffff88809555b060 (&pipe->mutex/1){+.+.}, at: pipe_lock_nested fs/pipe.c:63 [inline] #0: ffff88809555b060 (&pipe->mutex/1){+.+.}, at: pipe_lock fs/pipe.c:71 [inline] #0: ffff88809555b060 (&pipe->mutex/1){+.+.}, at: pipe_wait+0x1ce/0x1f0 fs/pipe.c:119 #1: ffff8880ae809d50 ((&ndev->rs_timer)){+.-.}, at: lockdep_copy_map include/linux/lockdep.h:172 [inline] #1: ffff8880ae809d50 ((&ndev->rs_timer)){+.-.}, at: call_timer_fn+0xe0/0x780 kernel/time/timer.c:1394 #2: ffffffff891a4080 (rcu_read_lock){....}, at: ip6_nd_hdr net/ipv6/ndisc.c:463 [inline] #2: ffffffff891a4080 (rcu_read_lock){....}, at: ndisc_send_skb+0x7fe/0x1490 net/ipv6/ndisc.c:499 #3: ffffffff891a4040 (rcu_read_lock_bh){....}, at: lwtunnel_xmit_redirect include/net/lwtunnel.h:92 [inline] #3: ffffffff891a4040 (rcu_read_lock_bh){....}, at: ip6_finish_output2+0x214/0x25c0 net/ipv6/ip6_output.c:102 #4: ffffffff891a4040 (rcu_read_lock_bh){....}, at: __dev_queue_xmit+0x20a/0x35c0 net/core/dev.c:3948 #5: ffff8880a8e16250 (&dev->qdisc_tx_busylock_key#19){+...}, at: spin_trylock include/linux/spinlock.h:348 [inline] #5: ffff8880a8e16250 (&dev->qdisc_tx_busylock_key#19){+...}, at: qdisc_run_begin include/net/sch_generic.h:159 [inline] #5: ffff8880a8e16250 (&dev->qdisc_tx_busylock_key#19){+...}, at: __dev_xmit_skb net/core/dev.c:3611 [inline] #5: ffff8880a8e16250 (&dev->qdisc_tx_busylock_key#19){+...}, at: __dev_queue_xmit+0x2412/0x35c0 net/core/dev.c:3982 #6: ffff8880a8e16138 (&dev->qdisc_running_key#19){+...}, at: dev_queue_xmit+0x18/0x20 net/core/dev.c:4046 stack backtrace: CPU: 0 PID: 10485 Comm: syz-executor.4 Not tainted 5.4.0-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: <IRQ> __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x197/0x210 lib/dump_stack.c:118 print_unlock_imbalance_bug kernel/locking/lockdep.c:4008 [inline] print_unlock_imbalance_bug.cold+0x114/0x123 kernel/locking/lockdep.c:3984 __lock_release kernel/locking/lockdep.c:4242 [inline] lock_release+0x5f2/0x960 kernel/locking/lockdep.c:4503 __raw_spin_unlock include/linux/spinlock_api_smp.h:150 [inline] _raw_spin_unlock+0x16/0x40 kernel/locking/spinlock.c:183 spin_unlock include/linux/spinlock.h:378 [inline] __netif_tx_unlock include/linux/netdevice.h:3961 [inline] sch_direct_xmit+0x3fa/0xd30 net/sched/sch_generic.c:315 __dev_xmit_skb net/core/dev.c:3621 [inline] __dev_queue_xmit+0x2707/0x35c0 net/core/dev.c:3982 dev_queue_xmit+0x18/0x20 net/core/dev.c:4046 neigh_hh_output include/net/neighbour.h:500 [inline] neigh_output include/net/neighbour.h:509 [inline] ip6_finish_output2+0xfbe/0x25c0 net/ipv6/ip6_output.c:116 __ip6_finish_output+0x444/0xaa0 net/ipv6/ip6_output.c:142 ip6_finish_output+0x38/0x1f0 net/ipv6/ip6_output.c:152 NF_HOOK_COND include/linux/netfilter.h:296 [inline] ip6_output+0x25e/0x880 net/ipv6/ip6_output.c:175 dst_output include/net/dst.h:436 [inline] NF_HOOK include/linux/netfilter.h:307 [inline] ndisc_send_skb+0xf1f/0x1490 net/ipv6/ndisc.c:505 ndisc_send_rs+0x134/0x720 net/ipv6/ndisc.c:699 addrconf_rs_timer+0x30f/0x6e0 net/ipv6/addrconf.c:3879 call_timer_fn+0x1ac/0x780 kernel/time/timer.c:1404 expire_timers kernel/time/timer.c:1449 [inline] __run_timers kernel/time/timer.c:1773 [inline] __run_timers kernel/time/timer.c:1740 [inline] run_timer_softirq+0x6c3/0x1790 kernel/time/timer.c:1786 __do_softirq+0x262/0x98c kernel/softirq.c:292 invoke_softirq kernel/softirq.c:373 [inline] irq_exit+0x19b/0x1e0 kernel/softirq.c:413 exiting_irq arch/x86/include/asm/apic.h:536 [inline] smp_apic_timer_interrupt+0x1a3/0x610 arch/x86/kernel/apic/apic.c:1137 apic_timer_interrupt+0xf/0x20 arch/x86/entry/entry_64.S:829 </IRQ> RIP: 0010:arch_local_irq_restore arch/x86/include/asm/paravirt.h:752 [inline] RIP: 0010:__raw_spin_unlock_irqrestore include/linux/spinlock_api_smp.h:160 [inline] RIP: 0010:_raw_spin_unlock_irqrestore+0x90/0xe0 kernel/locking/spinlock.c:191 Code: 48 c7 c0 58 34 13 89 48 ba 00 00 00 00 00 fc ff df 48 c1 e8 03 80 3c 10 00 75 39 48 83 3d ff 65 96 01 00 74 24 48 89 df 57 9d <0f> 1f 44 00 00 bf 01 00 00 00 e8 e1 a7 d3 f9 65 8b 05 12 50 85 78 RSP: 0018:ffff88805df77a30 EFLAGS: 00000282 ORIG_RAX: ffffffffffffff13 RAX: 1ffffffff122668b RBX: 0000000000000282 RCX: 0000000000000006 RDX: dffffc0000000000 RSI: 0000000000000008 RDI: 0000000000000282 RBP: ffff88805df77a40 R08: 1ffffffff15377ad R09: fffffbfff15377ae R10: fffffbfff15377ad R11: ffffffff8a9bbd6f R12: ffff88809555b080 R13: 0000000000000282 R14: 0000000000000000 R15: 0000000000000001 spin_unlock_irqrestore include/linux/spinlock.h:393 [inline] __wake_up_common_lock+0xf8/0x150 kernel/sched/wait.c:125 __wake_up+0xe/0x10 kernel/sched/wait.c:142 wakeup_pipe_writers+0x5c/0x90 fs/splice.c:457 splice_from_pipe_next.part.0+0x237/0x300 fs/splice.c:560 splice_from_pipe_next fs/splice.c:543 [inline] __splice_from_pipe+0x10f/0x7d0 fs/splice.c:622 vmsplice_to_user fs/splice.c:1272 [inline] do_vmsplice.part.0+0x249/0x2b0 fs/splice.c:1350 do_vmsplice fs/splice.c:1344 [inline] __do_sys_vmsplice+0x1bc/0x210 fs/splice.c:1371 __se_sys_vmsplice fs/splice.c:1353 [inline] __x64_sys_vmsplice+0x97/0xf0 fs/splice.c:1353 do_syscall_64+0xfa/0x790 arch/x86/entry/common.c:294 entry_SYSCALL_64_after_hwframe+0x49/0xbe RIP: 0033:0x45a679 Code: ad b6 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 7b b6 fb ff c3 66 2e 0f 1f 84 00 00 00 00 RSP: 002b:00007f139a14ac78 EFLAGS: 00000246 ORIG_RAX: 0000000000000116 RAX: ffffffffffffffda RBX: 0000000000000004 RCX: 000000000045a679 RDX: 0000000000000001 RSI: 0000000020000000 RDI: 0000000000000003 RBP: 000000000075c070 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000246 R12: 00007f139a14b6d4 R13: 00000000004ca80d R14: 00000000004e3128 R15: 00000000ffffffff --- This bug is generated by a bot. It may contain errors. See https://goo.gl/tpsmEJ for more information about syzbot. syzbot engineers can be reached at syzkaller@googlegroups.com. syzbot will keep track of this bug report. See: https://goo.gl/tpsmEJ#status for how to communicate with syzbot. ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: WARNING: bad unlock balance in sch_direct_xmit 2019-12-03 15:59 WARNING: bad unlock balance in sch_direct_xmit syzbot @ 2020-01-05 18:30 ` syzbot 2020-01-05 22:58 ` syzbot 1 sibling, 0 replies; 17+ messages in thread From: syzbot @ 2020-01-05 18:30 UTC (permalink / raw) To: davem, jhs, jiri, linux-kernel, netdev, syzkaller-bugs, xiyou.wangcong syzbot has found a reproducer for the following crash on: HEAD commit: 36487907 Merge branch 'akpm' (patches from Andrew) git tree: upstream console output: https://syzkaller.appspot.com/x/log.txt?x=165b3e15e00000 kernel config: https://syzkaller.appspot.com/x/.config?x=f2f3ef188b7e16cf dashboard link: https://syzkaller.appspot.com/bug?extid=4ec99438ed7450da6272 compiler: gcc (GCC) 9.0.0 20181231 (experimental) syz repro: https://syzkaller.appspot.com/x/repro.syz?x=1722c5c1e00000 C reproducer: https://syzkaller.appspot.com/x/repro.c?x=167aee3ee00000 IMPORTANT: if you fix the bug, please add the following tag to the commit: Reported-by: syzbot+4ec99438ed7450da6272@syzkaller.appspotmail.com ===================================== WARNING: bad unlock balance detected! 5.5.0-rc4-syzkaller #0 Not tainted ------------------------------------- swapper/0/0 is trying to release lock (&dev->qdisc_xmit_lock_key) at: [<ffffffff863f0dda>] spin_unlock include/linux/spinlock.h:378 [inline] [<ffffffff863f0dda>] __netif_tx_unlock include/linux/netdevice.h:3966 [inline] [<ffffffff863f0dda>] sch_direct_xmit+0x3fa/0xd30 net/sched/sch_generic.c:315 but there are no more locks to release! other info that might help us debug this: 6 locks held by swapper/0/0: #0: ffffc90000007d50 ((&idev->mc_dad_timer)){+.-.}, at: lockdep_copy_map include/linux/lockdep.h:172 [inline] #0: ffffc90000007d50 ((&idev->mc_dad_timer)){+.-.}, at: call_timer_fn+0xe0/0x780 kernel/time/timer.c:1394 #1: ffffffff899a5600 (rcu_read_lock){....}, at: dev_net include/linux/netdevice.h:2188 [inline] #1: ffffffff899a5600 (rcu_read_lock){....}, at: mld_sendpack+0x180/0xed0 net/ipv6/mcast.c:1649 #2: ffffffff899a55c0 (rcu_read_lock_bh){....}, at: lwtunnel_xmit_redirect include/net/lwtunnel.h:92 [inline] #2: ffffffff899a55c0 (rcu_read_lock_bh){....}, at: ip6_finish_output2+0x214/0x25c0 net/ipv6/ip6_output.c:102 #3: ffffffff899a55c0 (rcu_read_lock_bh){....}, at: __dev_queue_xmit+0x20a/0x35c0 net/core/dev.c:3948 #4: ffff88809f1b1250 (&dev->qdisc_tx_busylock_key#3){+...}, at: spin_trylock include/linux/spinlock.h:348 [inline] #4: ffff88809f1b1250 (&dev->qdisc_tx_busylock_key#3){+...}, at: qdisc_run_begin include/net/sch_generic.h:159 [inline] #4: ffff88809f1b1250 (&dev->qdisc_tx_busylock_key#3){+...}, at: __dev_xmit_skb net/core/dev.c:3611 [inline] #4: ffff88809f1b1250 (&dev->qdisc_tx_busylock_key#3){+...}, at: __dev_queue_xmit+0x2412/0x35c0 net/core/dev.c:3982 #5: ffff88809f1b1138 (&dev->qdisc_running_key#3){+...}, at: dev_queue_xmit+0x18/0x20 net/core/dev.c:4046 stack backtrace: CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.5.0-rc4-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: <IRQ> __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x197/0x210 lib/dump_stack.c:118 print_unlock_imbalance_bug kernel/locking/lockdep.c:4008 [inline] print_unlock_imbalance_bug.cold+0x114/0x123 kernel/locking/lockdep.c:3984 __lock_release kernel/locking/lockdep.c:4242 [inline] lock_release+0x5f2/0x960 kernel/locking/lockdep.c:4503 __raw_spin_unlock include/linux/spinlock_api_smp.h:150 [inline] _raw_spin_unlock+0x16/0x40 kernel/locking/spinlock.c:183 spin_unlock include/linux/spinlock.h:378 [inline] __netif_tx_unlock include/linux/netdevice.h:3966 [inline] sch_direct_xmit+0x3fa/0xd30 net/sched/sch_generic.c:315 __dev_xmit_skb net/core/dev.c:3621 [inline] __dev_queue_xmit+0x2707/0x35c0 net/core/dev.c:3982 dev_queue_xmit+0x18/0x20 net/core/dev.c:4046 neigh_hh_output include/net/neighbour.h:499 [inline] neigh_output include/net/neighbour.h:508 [inline] ip6_finish_output2+0xfbe/0x25c0 net/ipv6/ip6_output.c:116 __ip6_finish_output+0x444/0xaa0 net/ipv6/ip6_output.c:142 ip6_finish_output+0x38/0x1f0 net/ipv6/ip6_output.c:152 NF_HOOK_COND include/linux/netfilter.h:296 [inline] ip6_output+0x25e/0x880 net/ipv6/ip6_output.c:175 dst_output include/net/dst.h:436 [inline] NF_HOOK include/linux/netfilter.h:307 [inline] NF_HOOK include/linux/netfilter.h:301 [inline] mld_sendpack+0x9c2/0xed0 net/ipv6/mcast.c:1682 mld_send_initial_cr.part.0+0x114/0x160 net/ipv6/mcast.c:2099 mld_send_initial_cr net/ipv6/mcast.c:2083 [inline] mld_dad_timer_expire+0x42/0x230 net/ipv6/mcast.c:2118 call_timer_fn+0x1ac/0x780 kernel/time/timer.c:1404 expire_timers kernel/time/timer.c:1449 [inline] __run_timers kernel/time/timer.c:1773 [inline] __run_timers kernel/time/timer.c:1740 [inline] run_timer_softirq+0x6c3/0x1790 kernel/time/timer.c:1786 __do_softirq+0x262/0x98c kernel/softirq.c:292 invoke_softirq kernel/softirq.c:373 [inline] irq_exit+0x19b/0x1e0 kernel/softirq.c:413 exiting_irq arch/x86/include/asm/apic.h:536 [inline] smp_apic_timer_interrupt+0x1a3/0x610 arch/x86/kernel/apic/apic.c:1137 apic_timer_interrupt+0xf/0x20 arch/x86/entry/entry_64.S:829 </IRQ> RIP: 0010:native_safe_halt+0xe/0x10 arch/x86/include/asm/irqflags.h:61 Code: 78 94 db f9 eb 8a cc cc cc cc cc cc e9 07 00 00 00 0f 00 2d c4 24 51 00 f4 c3 66 90 e9 07 00 00 00 0f 00 2d b4 24 51 00 fb f4 <c3> cc 55 48 89 e5 41 57 41 56 41 55 41 54 53 e8 9e 68 8b f9 e8 09 RSP: 0018:ffffffff89807ce8 EFLAGS: 00000286 ORIG_RAX: ffffffffffffff13 RAX: 1ffffffff132669e RBX: ffffffff8987a140 RCX: 0000000000000000 RDX: dffffc0000000000 RSI: 0000000000000006 RDI: ffffffff8987a9d4 RBP: ffffffff89807d18 R08: ffffffff8987a140 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000000 R12: dffffc0000000000 R13: ffffffff8a7bb380 R14: 0000000000000000 R15: 0000000000000000 arch_cpu_idle+0xa/0x10 arch/x86/kernel/process.c:690 default_idle_call+0x84/0xb0 kernel/sched/idle.c:94 cpuidle_idle_call kernel/sched/idle.c:154 [inline] do_idle+0x3c8/0x6e0 kernel/sched/idle.c:269 cpu_startup_entry+0x1b/0x20 kernel/sched/idle.c:361 rest_init+0x23b/0x371 init/main.c:451 arch_call_rest_init+0xe/0x1b start_kernel+0x904/0x943 init/main.c:784 x86_64_start_reservations+0x29/0x2b arch/x86/kernel/head64.c:490 x86_64_start_kernel+0x77/0x7b arch/x86/kernel/head64.c:471 secondary_startup_64+0xa4/0xb0 arch/x86/kernel/head_64.S:242 kobject: 'brport' (000000006bf26f50): kobject_cleanup, parent 000000007f2d209d kobject: 'brport' (000000006bf26f50): calling ktype release kobject: 'brport': free name kobject: 'brport' (000000006a34e5a7): kobject_cleanup, parent 000000007f2d209d kobject: 'brport' (000000006a34e5a7): calling ktype release kobject: 'brport': free name kobject: 'brport' (0000000064bee465): kobject_cleanup, parent 000000007f2d209d kobject: 'brport' (0000000064bee465): calling ktype release kobject: 'brport': free name kobject: 'brport' (00000000572b5ef3): kobject_cleanup, parent 000000007f2d209d kobject: 'brport' (00000000572b5ef3): calling ktype release kobject: 'brport': free name kobject: 'brport' (00000000c5891925): kobject_cleanup, parent 000000007f2d209d kobject: 'brport' (00000000c5891925): calling ktype release kobject: 'brport': free name kobject: 'brport' (000000007996a1ba): kobject_cleanup, parent 000000007f2d209d kobject: 'brport' (000000007996a1ba): calling ktype release kobject: 'brport': free name kobject: 'brport' (00000000c48f444d): kobject_cleanup, parent 000000007f2d209d kobject: 'brport' (00000000c48f444d): calling ktype release kobject: 'brport': free name kobject: 'brport' (000000002f26f70f): kobject_cleanup, parent 000000007f2d209d kobject: 'brport' (000000002f26f70f): calling ktype release kobject: 'brport': free name kobject: 'brport' (000000009742ca09): kobject_cleanup, parent 000000007f2d209d kobject: 'brport' (000000009742ca09): calling ktype release kobject: 'brport': free name kobject: 'brport' (00000000b1045bd4): kobject_cleanup, parent 000000007f2d209d kobject: 'brport' (00000000b1045bd4): calling ktype release kobject: 'brport': free name kobject: 'brport' (000000009f21e9dc): kobject_cleanup, parent 000000007f2d209d kobject: 'brport' (000000009f21e9dc): calling ktype release kobject: 'brport': free name kobject: 'brport' (000000002b66d6a0): kobject_cleanup, parent 000000007f2d209d kobject: 'brport' (000000002b66d6a0): calling ktype release kobject: 'brport': free name kobject: 'brport' (0000000090c92031): kobject_cleanup, parent 000000007f2d209d kobject: 'brport' (0000000090c92031): calling ktype release kobject: 'brport': free name kobject: 'brport' (00000000b4538cf1): kobject_cleanup, parent 000000007f2d209d kobject: 'brport' (00000000b4538cf1): calling ktype release kobject: 'brport': free name ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: WARNING: bad unlock balance in sch_direct_xmit 2019-12-03 15:59 WARNING: bad unlock balance in sch_direct_xmit syzbot 2020-01-05 18:30 ` syzbot @ 2020-01-05 22:58 ` syzbot 2020-01-07 5:36 ` Cong Wang 1 sibling, 1 reply; 17+ messages in thread From: syzbot @ 2020-01-05 22:58 UTC (permalink / raw) To: a, alex.aring, allison, andrew, andy, ap420073, ast, b.a.t.m.a.n, bridge, cleech, daniel, davem, dsa, f.fainelli, fw, gregkh, gustavo, haiyangz, info, j.vosburgh, j, jakub.kicinski, jhs, jiri, johan.hedberg, johannes.berg, jwi, kstewart, kvalo, kys, linmiaohe, linux-bluetooth, linux-hams, linux-hyperv, linux-kernel, linux-ppp, linux-wireless, linux-wpan, liuhangbin, marcel, mareklindner, mkubecek, mmanning, netdev, nikolay, oss-drivers, paulus, ralf, roopa, sashal syzbot has bisected this bug to: commit ab92d68fc22f9afab480153bd82a20f6e2533769 Author: Taehee Yoo <ap420073@gmail.com> Date: Mon Oct 21 18:47:51 2019 +0000 net: core: add generic lockdep keys bisection log: https://syzkaller.appspot.com/x/bisect.txt?x=15e88ec6e00000 start commit: 36487907 Merge branch 'akpm' (patches from Andrew) git tree: upstream final crash: https://syzkaller.appspot.com/x/report.txt?x=17e88ec6e00000 console output: https://syzkaller.appspot.com/x/log.txt?x=13e88ec6e00000 kernel config: https://syzkaller.appspot.com/x/.config?x=f2f3ef188b7e16cf dashboard link: https://syzkaller.appspot.com/bug?extid=4ec99438ed7450da6272 syz repro: https://syzkaller.appspot.com/x/repro.syz?x=1722c5c1e00000 C reproducer: https://syzkaller.appspot.com/x/repro.c?x=167aee3ee00000 Reported-by: syzbot+4ec99438ed7450da6272@syzkaller.appspotmail.com Fixes: ab92d68fc22f ("net: core: add generic lockdep keys") For information about bisection process see: https://goo.gl/tpsmEJ#bisection ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: WARNING: bad unlock balance in sch_direct_xmit 2020-01-05 22:58 ` syzbot @ 2020-01-07 5:36 ` Cong Wang 2020-01-07 11:30 ` Taehee Yoo 0 siblings, 1 reply; 17+ messages in thread From: Cong Wang @ 2020-01-07 5:36 UTC (permalink / raw) To: syzbot; +Cc: Taehee Yoo, Linux Kernel Network Developers Hi, Taehee On Sun, Jan 5, 2020 at 2:59 PM syzbot <syzbot+4ec99438ed7450da6272@syzkaller.appspotmail.com> wrote: > > syzbot has bisected this bug to: > > commit ab92d68fc22f9afab480153bd82a20f6e2533769 > Author: Taehee Yoo <ap420073@gmail.com> > Date: Mon Oct 21 18:47:51 2019 +0000 > > net: core: add generic lockdep keys Why netdev_update_lockdep_key() is needed? It causes the bug here because the unregister and register are not atomic although under RTNL, fast path could still lock with one key and unlock with another key after the previous got unregistered. From my understand of lockdep here, as long as the device itself is not changed, it doesn't need to update those keys. Thanks. ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: WARNING: bad unlock balance in sch_direct_xmit 2020-01-07 5:36 ` Cong Wang @ 2020-01-07 11:30 ` Taehee Yoo 2020-01-08 0:34 ` Cong Wang 0 siblings, 1 reply; 17+ messages in thread From: Taehee Yoo @ 2020-01-07 11:30 UTC (permalink / raw) To: Cong Wang; +Cc: syzbot, Linux Kernel Network Developers On Tue, 7 Jan 2020 at 14:36, Cong Wang <xiyou.wangcong@gmail.com> wrote: > > Hi, Taehee > Hi, Cong! Thank you for your diagnosis. > On Sun, Jan 5, 2020 at 2:59 PM syzbot > <syzbot+4ec99438ed7450da6272@syzkaller.appspotmail.com> wrote: > > > > syzbot has bisected this bug to: > > > > commit ab92d68fc22f9afab480153bd82a20f6e2533769 > > Author: Taehee Yoo <ap420073@gmail.com> > > Date: Mon Oct 21 18:47:51 2019 +0000 > > > > net: core: add generic lockdep keys > > Why netdev_update_lockdep_key() is needed? > > It causes the bug here because the unregister and register are not > atomic although under RTNL, fast path could still lock with one key > and unlock with another key after the previous got unregistered. > > From my understand of lockdep here, as long as the device itself > is not changed, it doesn't need to update those keys. > > Thanks. The goal of netdev_update_lockdep_key() is to avoid wrong lockdep splat. diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c index 02916f43bf63..cea5ef66b813 100644 --- a/net/core/rtnetlink.c +++ b/net/core/rtnetlink.c @@ -2448,7 +2448,7 @@ static int do_set_master(struct net_device *dev, int ifindex, err = ops->ndo_del_slave(upper_dev, dev); if (err) return err; - netdev_update_lockdep_key(dev); + //netdev_update_lockdep_key(dev); } else { return -EOPNOTSUPP; } Test commands: ip link add team0 type team ip link add team1 type team ip link set team0 master team1 ip link set team0 nomaster ip link set team1 master team0 Splats: [ 105.484781][ T1178] ====================================================== [ 105.485894][ T1178] WARNING: possible circular locking dependency detected [ 105.486791][ T1178] 5.5.0-rc2+ #264 Not tainted [ 105.487369][ T1178] ------------------------------------------------------ [ 105.488130][ T1178] ip/1178 is trying to acquire lock: [ 105.488948][ T1178] ffff8880521d9280 (&dev->addr_list_lock_key#4){+...}, at: dev_uc_sync_multiple+0x95/0x120 [ 105.490336][ T1178] [ 105.490336][ T1178] but task is already holding lock: [ 105.491710][ T1178] ffff88806aa29280 (&dev->addr_list_lock_key#3){+...}, at: team_add_slave+0x165d/0x1972 [team] [ 105.493471][ T1178] [ 105.493471][ T1178] which lock already depends on the new lock. [ 105.493471][ T1178] [ 105.495423][ T1178] [ 105.495423][ T1178] the existing dependency chain (in reverse order) is: [ 105.496809][ T1178] [ 105.496809][ T1178] -> #1 (&dev->addr_list_lock_key#3){+...}: [ 105.497747][ T1178] _raw_spin_lock+0x30/0x70 [ 105.498201][ T1178] dev_uc_sync_multiple+0x95/0x120 [ 105.498733][ T1178] team_add_slave+0x1668/0x1972 [team] [ 105.499293][ T1178] do_setlink+0xaab/0x2ef0 [ 105.499755][ T1178] __rtnl_newlink+0x9c5/0x1270 [ 105.500239][ T1178] rtnl_newlink+0x65/0x90 [ 105.500713][ T1178] rtnetlink_rcv_msg+0x4a8/0x890 [ 105.501269][ T1178] netlink_rcv_skb+0x121/0x350 [ 105.501799][ T1178] netlink_unicast+0x421/0x600 [ 105.502327][ T1178] netlink_sendmsg+0x65a/0xb90 [ 105.502890][ T1178] ____sys_sendmsg+0x5ce/0x7a0 [ 105.503469][ T1178] ___sys_sendmsg+0x10f/0x1b0 [ 105.504115][ T1178] __sys_sendmsg+0xc6/0x150 [ 105.504746][ T1178] do_syscall_64+0x99/0x4f0 [ 105.505391][ T1178] entry_SYSCALL_64_after_hwframe+0x49/0xbe [ 105.506206][ T1178] [ 105.506206][ T1178] -> #0 (&dev->addr_list_lock_key#4){+...}: [ 105.507238][ T1178] __lock_acquire+0x2d8d/0x3de0 [ 105.507907][ T1178] lock_acquire+0x164/0x3b0 [ 105.508536][ T1178] _raw_spin_lock+0x30/0x70 [ 105.509180][ T1178] dev_uc_sync_multiple+0x95/0x120 [ 105.509825][ T1178] team_add_slave+0x1668/0x1972 [team] [ 105.510451][ T1178] do_setlink+0xaab/0x2ef0 [ 105.510961][ T1178] __rtnl_newlink+0x9c5/0x1270 [ 105.511525][ T1178] rtnl_newlink+0x65/0x90 [ 105.512026][ T1178] rtnetlink_rcv_msg+0x4a8/0x890 [ 105.512618][ T1178] netlink_rcv_skb+0x121/0x350 [ 105.513158][ T1178] netlink_unicast+0x421/0x600 [ 105.513843][ T1178] netlink_sendmsg+0x65a/0xb90 [ 105.514524][ T1178] ____sys_sendmsg+0x5ce/0x7a0 [ 105.515186][ T1178] ___sys_sendmsg+0x10f/0x1b0 [ 105.515852][ T1178] __sys_sendmsg+0xc6/0x150 [ 105.516493][ T1178] do_syscall_64+0x99/0x4f0 [ 105.517190][ T1178] entry_SYSCALL_64_after_hwframe+0x49/0xbe [ 105.518005][ T1178] [ 105.518005][ T1178] other info that might help us debug this: [ 105.518005][ T1178] [ 105.519317][ T1178] Possible unsafe locking scenario: [ 105.519317][ T1178] [ 105.520268][ T1178] CPU0 CPU1 [ 105.520958][ T1178] ---- ---- [ 105.521640][ T1178] lock(&dev->addr_list_lock_key#3); [ 105.522866][ T1178] lock(&dev->addr_list_lock_key#4); [ 105.523613][ T1178] lock(&dev->addr_list_lock_key#3); [ 105.524303][ T1178] lock(&dev->addr_list_lock_key#4); [ 105.524890][ T1178] [ 105.524890][ T1178] *** DEADLOCK *** [ 105.524890][ T1178] [ 105.525624][ T1178] 3 locks held by ip/1178: [ 105.526010][ T1178] #0: ffffffffb0cc0b70 (rtnl_mutex){+.+.}, at: rtnetlink_rcv_msg+0x457/0x890 [ 105.526864][ T1178] #1: ffff88806aa29c80 (team->team_lock_key#2){+.+.}, at: team_add_slave+0x89/0x1972 [team] [ 105.527857][ T1178] #2: ffff88806aa29280 (&dev->addr_list_lock_key#3){+...}, at: team_add_slave+0x165d/0x1972 [team] [ 105.528914][ T1178] [ 105.528914][ T1178] stack backtrace: [ 105.529505][ T1178] CPU: 3 PID: 1178 Comm: ip Not tainted 5.5.0-rc2+ #264 [ 105.530202][ T1178] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006 [ 105.531417][ T1178] Call Trace: [ 105.531824][ T1178] dump_stack+0x96/0xdb [ 105.532333][ T1178] check_noncircular+0x371/0x450 [ 105.532963][ T1178] ? print_circular_bug.isra.36+0x310/0x310 [ 105.533575][ T1178] ? stack_trace_save+0x82/0xb0 [ 105.534058][ T1178] ? hlock_class+0x130/0x130 [ 105.534549][ T1178] ? __lock_acquire+0x2d8d/0x3de0 [ 105.535036][ T1178] __lock_acquire+0x2d8d/0x3de0 [ 105.535545][ T1178] ? register_lock_class+0x14d0/0x14d0 [ 105.536034][ T1178] ? find_held_lock+0x39/0x1d0 [ 105.536469][ T1178] lock_acquire+0x164/0x3b0 [ 105.536958][ T1178] ? dev_uc_sync_multiple+0x95/0x120 [ 105.537736][ T1178] _raw_spin_lock+0x30/0x70 [ 105.538398][ T1178] ? dev_uc_sync_multiple+0x95/0x120 [ 105.539260][ T1178] dev_uc_sync_multiple+0x95/0x120 [ 105.540213][ T1178] team_add_slave+0x1668/0x1972 [team] [ 105.541113][ T1178] ? team_init+0x7b0/0x7b0 [team] [ 105.541760][ T1178] ? mutex_is_locked+0x13/0x50 [ 105.542359][ T1178] ? rtnl_is_locked+0x11/0x20 [ 105.542984][ T1178] ? netdev_master_upper_dev_get+0xf/0x120 [ 105.543734][ T1178] do_setlink+0xaab/0x2ef0 [ 105.544296][ T1178] ? is_bpf_text_address+0x81/0xe0 [ ... ] After "ip link set team0 master team1", the "team1 -> team0" locking path will be recorded in lockdep key of both team1 and team0. Then, if "ip link set team1 master team0" is executed, "team0 -> team1" locking path also will be recorded in lockdep key. At this moment, lockdep will catch possible deadlock situation and it prints the above warning message. But, both "team0 -> team1" and "team1 -> team0" will not be existing concurrently. so the above message is actually wrong. In order to avoid this message, a recorded locking path should be removed. So, both lockdep_unregister_key() and lockdep_register_key() are needed. Yes, netdev_update_lockdep_key() is not atomic so "WARNING: bad unlock balance in sch_direct_xmit" message could be printed. Thank you so much! Taehee Yoo ^ permalink raw reply related [flat|nested] 17+ messages in thread
* Re: WARNING: bad unlock balance in sch_direct_xmit 2020-01-07 11:30 ` Taehee Yoo @ 2020-01-08 0:34 ` Cong Wang 2020-01-08 11:42 ` Taehee Yoo 0 siblings, 1 reply; 17+ messages in thread From: Cong Wang @ 2020-01-08 0:34 UTC (permalink / raw) To: Taehee Yoo; +Cc: syzbot, Linux Kernel Network Developers On Tue, Jan 7, 2020 at 3:31 AM Taehee Yoo <ap420073@gmail.com> wrote: > After "ip link set team0 master team1", the "team1 -> team0" locking path > will be recorded in lockdep key of both team1 and team0. > Then, if "ip link set team1 master team0" is executed, "team0 -> team1" > locking path also will be recorded in lockdep key. At this moment, > lockdep will catch possible deadlock situation and it prints the above > warning message. But, both "team0 -> team1" and "team1 -> team0" > will not be existing concurrently. so the above message is actually wrong. > In order to avoid this message, a recorded locking path should be > removed. So, both lockdep_unregister_key() and lockdep_register_key() > are needed. > So, after you move the key down to each netdevice, they are now treated as different locks. Is this stacked device scenario the reason why you move it to per-netdevice? If so, I wonder why not just use nested locks? Like: netif_addr_nested_lock(upper, 0); netif_addr_nested_lock(lower, 1); netif_addr_nested_unlock(lower); netif_addr_nested_unlock(upper); For this case, they could still share a same key. Thanks for the details! ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: WARNING: bad unlock balance in sch_direct_xmit 2020-01-08 0:34 ` Cong Wang @ 2020-01-08 11:42 ` Taehee Yoo 2020-01-09 23:38 ` Cong Wang 2020-01-13 11:08 ` Dmitry Vyukov 0 siblings, 2 replies; 17+ messages in thread From: Taehee Yoo @ 2020-01-08 11:42 UTC (permalink / raw) To: Cong Wang; +Cc: syzbot, Linux Kernel Network Developers On Wed, 8 Jan 2020 at 09:34, Cong Wang <xiyou.wangcong@gmail.com> wrote: > > On Tue, Jan 7, 2020 at 3:31 AM Taehee Yoo <ap420073@gmail.com> wrote: > > After "ip link set team0 master team1", the "team1 -> team0" locking path > > will be recorded in lockdep key of both team1 and team0. > > Then, if "ip link set team1 master team0" is executed, "team0 -> team1" > > locking path also will be recorded in lockdep key. At this moment, > > lockdep will catch possible deadlock situation and it prints the above > > warning message. But, both "team0 -> team1" and "team1 -> team0" > > will not be existing concurrently. so the above message is actually wrong. > > In order to avoid this message, a recorded locking path should be > > removed. So, both lockdep_unregister_key() and lockdep_register_key() > > are needed. > > > > So, after you move the key down to each netdevice, they are now treated > as different locks. Is this stacked device scenario the reason why you > move it to per-netdevice? If so, I wonder why not just use nested locks? > Like: > > netif_addr_nested_lock(upper, 0); > netif_addr_nested_lock(lower, 1); > netif_addr_nested_unlock(lower); > netif_addr_nested_unlock(upper); > > For this case, they could still share a same key. > > Thanks for the details! Yes, the reason for using dynamic lockdep key is to avoid lockdep warning in stacked device scenario. But, the addr_list_lock case is a little bit different. There was a bug in netif_addr_lock_nested() that "dev->netdev_ops->ndo_get_lock_subclass" isn't updated after "master" and "nomaster" command. So, the wrong subclass is used, so lockdep warning message was printed. There were some ways to fix this problem, using dynamic key is just one of them. I think using the correct subclass in netif_addr_lock_nested() is also a correct way to fix that problem. Another minor reason was that the subclass is limited by 8. but dynamic key has no limitation. Unfortunately, dynamic key has a problem too. lockdep limits the maximum number of lockdep keys. $cat /proc/lockdep_stats lock-classes: 1228 [max: 8192] So, If so many network interfaces are created, they use so many lockdep keys. If so, lockdep will stop. This is the cause of "BUG: MAX_LOCKDEP_KEYS too low!". Thank you! Taehee Yoo ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: WARNING: bad unlock balance in sch_direct_xmit 2020-01-08 11:42 ` Taehee Yoo @ 2020-01-09 23:38 ` Cong Wang 2020-01-10 3:06 ` Taehee Yoo 2020-01-13 11:08 ` Dmitry Vyukov 1 sibling, 1 reply; 17+ messages in thread From: Cong Wang @ 2020-01-09 23:38 UTC (permalink / raw) To: Taehee Yoo; +Cc: syzbot, Linux Kernel Network Developers On Wed, Jan 8, 2020 at 3:43 AM Taehee Yoo <ap420073@gmail.com> wrote: > > On Wed, 8 Jan 2020 at 09:34, Cong Wang <xiyou.wangcong@gmail.com> wrote: > > > > On Tue, Jan 7, 2020 at 3:31 AM Taehee Yoo <ap420073@gmail.com> wrote: > > > After "ip link set team0 master team1", the "team1 -> team0" locking path > > > will be recorded in lockdep key of both team1 and team0. > > > Then, if "ip link set team1 master team0" is executed, "team0 -> team1" > > > locking path also will be recorded in lockdep key. At this moment, > > > lockdep will catch possible deadlock situation and it prints the above > > > warning message. But, both "team0 -> team1" and "team1 -> team0" > > > will not be existing concurrently. so the above message is actually wrong. > > > In order to avoid this message, a recorded locking path should be > > > removed. So, both lockdep_unregister_key() and lockdep_register_key() > > > are needed. > > > > > > > So, after you move the key down to each netdevice, they are now treated > > as different locks. Is this stacked device scenario the reason why you > > move it to per-netdevice? If so, I wonder why not just use nested locks? > > Like: > > > > netif_addr_nested_lock(upper, 0); > > netif_addr_nested_lock(lower, 1); > > netif_addr_nested_unlock(lower); > > netif_addr_nested_unlock(upper); > > > > For this case, they could still share a same key. > > > > Thanks for the details! > > Yes, the reason for using dynamic lockdep key is to avoid lockdep > warning in stacked device scenario. > But, the addr_list_lock case is a little bit different. > There was a bug in netif_addr_lock_nested() that > "dev->netdev_ops->ndo_get_lock_subclass" isn't updated after "master" > and "nomaster" command. > So, the wrong subclass is used, so lockdep warning message was printed. Hmm? I never propose netdev_ops->ndo_get_lock_subclass(), and the subclasses are always 0,1, no matter which is the master device, so it doesn't need a ops. > There were some ways to fix this problem, using dynamic key is just one > of them. I think using the correct subclass in netif_addr_lock_nested() > is also a correct way to fix that problem. Another minor reason was that > the subclass is limited by 8. but dynamic key has no limitation. Yeah, but in practice I believe 8 is sufficient for stacked devices. > > Unfortunately, dynamic key has a problem too. > lockdep limits the maximum number of lockdep keys. Right, and also the problem reported by syzbot, that is not safe during unregister and register. Anyway, do you think we should revert back to the static keys and use subclass to address the lockdep issue instead? Thanks! ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: WARNING: bad unlock balance in sch_direct_xmit 2020-01-09 23:38 ` Cong Wang @ 2020-01-10 3:06 ` Taehee Yoo 2020-01-10 4:43 ` Cong Wang 0 siblings, 1 reply; 17+ messages in thread From: Taehee Yoo @ 2020-01-10 3:06 UTC (permalink / raw) To: Cong Wang; +Cc: syzbot, Linux Kernel Network Developers On Fri, 10 Jan 2020 at 08:38, Cong Wang <xiyou.wangcong@gmail.com> wrote: > > On Wed, Jan 8, 2020 at 3:43 AM Taehee Yoo <ap420073@gmail.com> wrote: > > > > On Wed, 8 Jan 2020 at 09:34, Cong Wang <xiyou.wangcong@gmail.com> wrote: > > > > > > On Tue, Jan 7, 2020 at 3:31 AM Taehee Yoo <ap420073@gmail.com> wrote: > > > > After "ip link set team0 master team1", the "team1 -> team0" locking path > > > > will be recorded in lockdep key of both team1 and team0. > > > > Then, if "ip link set team1 master team0" is executed, "team0 -> team1" > > > > locking path also will be recorded in lockdep key. At this moment, > > > > lockdep will catch possible deadlock situation and it prints the above > > > > warning message. But, both "team0 -> team1" and "team1 -> team0" > > > > will not be existing concurrently. so the above message is actually wrong. > > > > In order to avoid this message, a recorded locking path should be > > > > removed. So, both lockdep_unregister_key() and lockdep_register_key() > > > > are needed. > > > > > > > > > > So, after you move the key down to each netdevice, they are now treated > > > as different locks. Is this stacked device scenario the reason why you > > > move it to per-netdevice? If so, I wonder why not just use nested locks? > > > Like: > > > > > > netif_addr_nested_lock(upper, 0); > > > netif_addr_nested_lock(lower, 1); > > > netif_addr_nested_unlock(lower); > > > netif_addr_nested_unlock(upper); > > > > > > For this case, they could still share a same key. > > > > > > Thanks for the details! > > > > Yes, the reason for using dynamic lockdep key is to avoid lockdep > > warning in stacked device scenario. > > But, the addr_list_lock case is a little bit different. > > There was a bug in netif_addr_lock_nested() that > > "dev->netdev_ops->ndo_get_lock_subclass" isn't updated after "master" > > and "nomaster" command. > > So, the wrong subclass is used, so lockdep warning message was printed. > > Hmm? I never propose netdev_ops->ndo_get_lock_subclass(), and > the subclasses are always 0,1, no matter which is the master device, > so it doesn't need a ops. > It's just the reason why the dynamic lockdep key was adopted instead of a nested lock. > > > There were some ways to fix this problem, using dynamic key is just one > > of them. I think using the correct subclass in netif_addr_lock_nested() > > is also a correct way to fix that problem. Another minor reason was that > > the subclass is limited by 8. but dynamic key has no limitation. > > Yeah, but in practice I believe 8 is sufficient for stacked devices. > I agree with this. > > > > > Unfortunately, dynamic key has a problem too. > > lockdep limits the maximum number of lockdep keys. > > > Right, and also the problem reported by syzbot, that is not safe > during unregister and register. > qdisc_xmit_lock_key has this problem. But, I'm not sure about addr_list_lock_key. If addr_list_lock is used outside of RTNL, it has this problem. If it isn't used outside of RTNL, it doesn't have this problem. > Anyway, do you think we should revert back to the static keys > and use subclass to address the lockdep issue instead? > > Thanks! I agree with this to reduce the number of dynamic lockdep keys. Thanks a lot! ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: WARNING: bad unlock balance in sch_direct_xmit 2020-01-10 3:06 ` Taehee Yoo @ 2020-01-10 4:43 ` Cong Wang 2020-01-10 6:02 ` Taehee Yoo 0 siblings, 1 reply; 17+ messages in thread From: Cong Wang @ 2020-01-10 4:43 UTC (permalink / raw) To: Taehee Yoo; +Cc: syzbot, Linux Kernel Network Developers On Thu, Jan 9, 2020 at 7:06 PM Taehee Yoo <ap420073@gmail.com> wrote: > > On Fri, 10 Jan 2020 at 08:38, Cong Wang <xiyou.wangcong@gmail.com> wrote: > > > > On Wed, Jan 8, 2020 at 3:43 AM Taehee Yoo <ap420073@gmail.com> wrote: > > > > > > On Wed, 8 Jan 2020 at 09:34, Cong Wang <xiyou.wangcong@gmail.com> wrote: > > > > > > > > On Tue, Jan 7, 2020 at 3:31 AM Taehee Yoo <ap420073@gmail.com> wrote: > > > > > After "ip link set team0 master team1", the "team1 -> team0" locking path > > > > > will be recorded in lockdep key of both team1 and team0. > > > > > Then, if "ip link set team1 master team0" is executed, "team0 -> team1" > > > > > locking path also will be recorded in lockdep key. At this moment, > > > > > lockdep will catch possible deadlock situation and it prints the above > > > > > warning message. But, both "team0 -> team1" and "team1 -> team0" > > > > > will not be existing concurrently. so the above message is actually wrong. > > > > > In order to avoid this message, a recorded locking path should be > > > > > removed. So, both lockdep_unregister_key() and lockdep_register_key() > > > > > are needed. > > > > > > > > > > > > > So, after you move the key down to each netdevice, they are now treated > > > > as different locks. Is this stacked device scenario the reason why you > > > > move it to per-netdevice? If so, I wonder why not just use nested locks? > > > > Like: > > > > > > > > netif_addr_nested_lock(upper, 0); > > > > netif_addr_nested_lock(lower, 1); > > > > netif_addr_nested_unlock(lower); > > > > netif_addr_nested_unlock(upper); > > > > > > > > For this case, they could still share a same key. > > > > > > > > Thanks for the details! > > > > > > Yes, the reason for using dynamic lockdep key is to avoid lockdep > > > warning in stacked device scenario. > > > But, the addr_list_lock case is a little bit different. > > > There was a bug in netif_addr_lock_nested() that > > > "dev->netdev_ops->ndo_get_lock_subclass" isn't updated after "master" > > > and "nomaster" command. > > > So, the wrong subclass is used, so lockdep warning message was printed. > > > > Hmm? I never propose netdev_ops->ndo_get_lock_subclass(), and > > the subclasses are always 0,1, no matter which is the master device, > > so it doesn't need a ops. > > > > It's just the reason why the dynamic lockdep key was adopted instead of > a nested lock. Oh, but why? :) As I said, at least for the addr lock case, we can always pass subclass in the same order as they are called if we switch it back to static keys. > > > > > > There were some ways to fix this problem, using dynamic key is just one > > > of them. I think using the correct subclass in netif_addr_lock_nested() > > > is also a correct way to fix that problem. Another minor reason was that > > > the subclass is limited by 8. but dynamic key has no limitation. > > > > Yeah, but in practice I believe 8 is sufficient for stacked devices. > > > > I agree with this. > > > > > > > > > Unfortunately, dynamic key has a problem too. > > > lockdep limits the maximum number of lockdep keys. > > > > > > Right, and also the problem reported by syzbot, that is not safe > > during unregister and register. > > > > qdisc_xmit_lock_key has this problem. > But, I'm not sure about addr_list_lock_key. > If addr_list_lock is used outside of RTNL, it has this problem. > If it isn't used outside of RTNL, it doesn't have this problem. Yeah, I am aware. > > > Anyway, do you think we should revert back to the static keys > > and use subclass to address the lockdep issue instead? > > > > Thanks! > > I agree with this to reduce the number of dynamic lockdep keys. I am trying to fix this syzbot warning, not to address the key limit. The reason is that I think dynamic keys are not necessary and not able to be used safely in this case. What I am still not sure is whether using subclass (with static keys) could address the lockdep issue you fixed with dynamic keys. Thanks! ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: WARNING: bad unlock balance in sch_direct_xmit 2020-01-10 4:43 ` Cong Wang @ 2020-01-10 6:02 ` Taehee Yoo 2020-01-11 21:53 ` Cong Wang 0 siblings, 1 reply; 17+ messages in thread From: Taehee Yoo @ 2020-01-10 6:02 UTC (permalink / raw) To: Cong Wang; +Cc: syzbot, Linux Kernel Network Developers On Fri, 10 Jan 2020 at 13:43, Cong Wang <xiyou.wangcong@gmail.com> wrote: > > On Thu, Jan 9, 2020 at 7:06 PM Taehee Yoo <ap420073@gmail.com> wrote: > > > > On Fri, 10 Jan 2020 at 08:38, Cong Wang <xiyou.wangcong@gmail.com> wrote: > > > > > > On Wed, Jan 8, 2020 at 3:43 AM Taehee Yoo <ap420073@gmail.com> wrote: > > > > > > > > On Wed, 8 Jan 2020 at 09:34, Cong Wang <xiyou.wangcong@gmail.com> wrote: > > > > > > > > > > On Tue, Jan 7, 2020 at 3:31 AM Taehee Yoo <ap420073@gmail.com> wrote: > > > > > > After "ip link set team0 master team1", the "team1 -> team0" locking path > > > > > > will be recorded in lockdep key of both team1 and team0. > > > > > > Then, if "ip link set team1 master team0" is executed, "team0 -> team1" > > > > > > locking path also will be recorded in lockdep key. At this moment, > > > > > > lockdep will catch possible deadlock situation and it prints the above > > > > > > warning message. But, both "team0 -> team1" and "team1 -> team0" > > > > > > will not be existing concurrently. so the above message is actually wrong. > > > > > > In order to avoid this message, a recorded locking path should be > > > > > > removed. So, both lockdep_unregister_key() and lockdep_register_key() > > > > > > are needed. > > > > > > > > > > > > > > > > So, after you move the key down to each netdevice, they are now treated > > > > > as different locks. Is this stacked device scenario the reason why you > > > > > move it to per-netdevice? If so, I wonder why not just use nested locks? > > > > > Like: > > > > > > > > > > netif_addr_nested_lock(upper, 0); > > > > > netif_addr_nested_lock(lower, 1); > > > > > netif_addr_nested_unlock(lower); > > > > > netif_addr_nested_unlock(upper); > > > > > > > > > > For this case, they could still share a same key. > > > > > > > > > > Thanks for the details! > > > > > > > > Yes, the reason for using dynamic lockdep key is to avoid lockdep > > > > warning in stacked device scenario. > > > > But, the addr_list_lock case is a little bit different. > > > > There was a bug in netif_addr_lock_nested() that > > > > "dev->netdev_ops->ndo_get_lock_subclass" isn't updated after "master" > > > > and "nomaster" command. > > > > So, the wrong subclass is used, so lockdep warning message was printed. > > > > > > Hmm? I never propose netdev_ops->ndo_get_lock_subclass(), and > > > the subclasses are always 0,1, no matter which is the master device, > > > so it doesn't need a ops. > > > > > > > It's just the reason why the dynamic lockdep key was adopted instead of > > a nested lock. > > Oh, but why? :) As I said, at least for the addr lock case, we can always > pass subclass in the same order as they are called if we switch it back > to static keys. > ndo_get_lock_subclass() was used to calculate subclass which was used by netif_addr_lock_nested(). -static inline void netif_addr_lock_nested(struct net_device *dev) -{ - int subclass = SINGLE_DEPTH_NESTING; - - if (dev->netdev_ops->ndo_get_lock_subclass) - subclass = dev->netdev_ops->ndo_get_lock_subclass(dev); - - spin_lock_nested(&dev->addr_list_lock, subclass); -} The most important thing about nested lock is to get the correct subclass. nest_level was used as subclass and this was calculated by ->ndo_get_lock_subclass(). But, ->ndo_get_lock_subclass() didn't calculate correct subclass. After "master" and "nomaster" operations, nest_level should be updated recursively, but it didn't. So incorrect subclass was used. team3 <-- subclass 0 "ip link set team3 master team2" team2 <-- subclass 0 team3 <-- subclass 1 "ip link set team2 master team1" team1 <-- subclass 0 team3 <-- subclass 1 team3 <-- subclass 1 "ip link set team1 master team0" team0 <-- subclass 0 team1 <-- subclass 1 team3 <-- subclass 1 team3 <-- subclass 1 After "master" and "nomaster" operation, subclass values of all lower or upper interfaces would be changed. But ->ndo_get_lock_subclass() didn't update subclass recursively, lockdep warning appeared. In order to fix this, I had two ways. 1. use dynamic keys instead of static keys. 2. fix ndo_get_lock_subclass(). The reason why I adopted using dynamic keys instead of fixing ->ndo_get_lock_subclass() is that the ->ndo_get_lock_subclass() isn't a common helper function. So, driver writers should implement ->ndo_get_lock_subclass(). If we use dynamic keys, ->ndo_get_lock_subclass() code could be removed. > > > > > > > > > There were some ways to fix this problem, using dynamic key is just one > > > > of them. I think using the correct subclass in netif_addr_lock_nested() > > > > is also a correct way to fix that problem. Another minor reason was that > > > > the subclass is limited by 8. but dynamic key has no limitation. > > > > > > Yeah, but in practice I believe 8 is sufficient for stacked devices. > > > > > > > I agree with this. > > > > > > > > > > > > > Unfortunately, dynamic key has a problem too. > > > > lockdep limits the maximum number of lockdep keys. > > > > > > > > > Right, and also the problem reported by syzbot, that is not safe > > > during unregister and register. > > > > > > > qdisc_xmit_lock_key has this problem. > > But, I'm not sure about addr_list_lock_key. > > If addr_list_lock is used outside of RTNL, it has this problem. > > If it isn't used outside of RTNL, it doesn't have this problem. > > Yeah, I am aware. > > > > > > > Anyway, do you think we should revert back to the static keys > > > and use subclass to address the lockdep issue instead? > > > > > > Thanks! > > > > I agree with this to reduce the number of dynamic lockdep keys. > > I am trying to fix this syzbot warning, not to address the key limit. > > The reason is that I think dynamic keys are not necessary and > not able to be used safely in this case. What I am still not sure > is whether using subclass (with static keys) could address the > lockdep issue you fixed with dynamic keys. > > Thanks! What I fixed problems with dynamic lockdep keys could be fixed by nested lock too. I think if the subclass value synchronization routine works well, there will be no problem. Thanks a lot! ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: WARNING: bad unlock balance in sch_direct_xmit 2020-01-10 6:02 ` Taehee Yoo @ 2020-01-11 21:53 ` Cong Wang 2020-01-11 23:28 ` Cong Wang 2020-01-13 10:13 ` Taehee Yoo 0 siblings, 2 replies; 17+ messages in thread From: Cong Wang @ 2020-01-11 21:53 UTC (permalink / raw) To: Taehee Yoo; +Cc: syzbot, Linux Kernel Network Developers On Thu, Jan 9, 2020 at 10:02 PM Taehee Yoo <ap420073@gmail.com> wrote: > ndo_get_lock_subclass() was used to calculate subclass which was used by > netif_addr_lock_nested(). > > -static inline void netif_addr_lock_nested(struct net_device *dev) > -{ > - int subclass = SINGLE_DEPTH_NESTING; > - > - if (dev->netdev_ops->ndo_get_lock_subclass) > - subclass = dev->netdev_ops->ndo_get_lock_subclass(dev); > - > - spin_lock_nested(&dev->addr_list_lock, subclass); > -} > > The most important thing about nested lock is to get the correct subclass. > nest_level was used as subclass and this was calculated by > ->ndo_get_lock_subclass(). > But, ->ndo_get_lock_subclass() didn't calculate correct subclass. > After "master" and "nomaster" operations, nest_level should be updated > recursively, but it didn't. So incorrect subclass was used. > > team3 <-- subclass 0 > > "ip link set team3 master team2" > > team2 <-- subclass 0 > team3 <-- subclass 1 > > "ip link set team2 master team1" > > team1 <-- subclass 0 > team3 <-- subclass 1 > team3 <-- subclass 1 > > "ip link set team1 master team0" > > team0 <-- subclass 0 > team1 <-- subclass 1 > team3 <-- subclass 1 > team3 <-- subclass 1 > > After "master" and "nomaster" operation, subclass values of all lower or > upper interfaces would be changed. But ->ndo_get_lock_subclass() > didn't update subclass recursively, lockdep warning appeared. > In order to fix this, I had two ways. > 1. use dynamic keys instead of static keys. > 2. fix ndo_get_lock_subclass(). > > The reason why I adopted using dynamic keys instead of fixing > ->ndo_get_lock_subclass() is that the ->ndo_get_lock_subclass() isn't > a common helper function. > So, driver writers should implement ->ndo_get_lock_subclass(). > If we use dynamic keys, ->ndo_get_lock_subclass() code could be removed. > The details you provide here are really helpful for me to understand the reasons behind your changes. Let me think about this and see how I could address both problems. This appears to be harder than I originally thought. > > What I fixed problems with dynamic lockdep keys could be fixed by > nested lock too. I think if the subclass value synchronization routine > works well, there will be no problem. Great! We are on the same page. Thanks for all the information and the reproducer too! ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: WARNING: bad unlock balance in sch_direct_xmit 2020-01-11 21:53 ` Cong Wang @ 2020-01-11 23:28 ` Cong Wang 2020-01-13 10:53 ` Taehee Yoo 2020-01-13 10:13 ` Taehee Yoo 1 sibling, 1 reply; 17+ messages in thread From: Cong Wang @ 2020-01-11 23:28 UTC (permalink / raw) To: Taehee Yoo; +Cc: syzbot, Linux Kernel Network Developers On Sat, Jan 11, 2020 at 1:53 PM Cong Wang <xiyou.wangcong@gmail.com> wrote: > The details you provide here are really helpful for me to understand > the reasons behind your changes. Let me think about this and see how > I could address both problems. This appears to be harder than I originally > thought. Do you think the following patch will make everyone happy? diff --git a/net/core/dev.c b/net/core/dev.c index 0ad39c87b7fd..7e885d069707 100644 --- a/net/core/dev.c +++ b/net/core/dev.c @@ -9177,22 +9177,10 @@ static void netdev_unregister_lockdep_key(struct net_device *dev) void netdev_update_lockdep_key(struct net_device *dev) { - struct netdev_queue *queue; - int i; - - lockdep_unregister_key(&dev->qdisc_xmit_lock_key); lockdep_unregister_key(&dev->addr_list_lock_key); - - lockdep_register_key(&dev->qdisc_xmit_lock_key); lockdep_register_key(&dev->addr_list_lock_key); lockdep_set_class(&dev->addr_list_lock, &dev->addr_list_lock_key); - for (i = 0; i < dev->num_tx_queues; i++) { - queue = netdev_get_tx_queue(dev, i); - - lockdep_set_class(&queue->_xmit_lock, - &dev->qdisc_xmit_lock_key); - } } EXPORT_SYMBOL(netdev_update_lockdep_key); I think as long as we don't take _xmit_lock nestedly, it is fine. And most (or all?) of the software netdev's are already lockless, so I can't think of any case we take more than one _xmit_lock on TX path. I tested it with the syzbot reproducer and your set master/nomaster commands, I don't get any lockdep splat. What do you think? Thanks! ^ permalink raw reply related [flat|nested] 17+ messages in thread
* Re: WARNING: bad unlock balance in sch_direct_xmit 2020-01-11 23:28 ` Cong Wang @ 2020-01-13 10:53 ` Taehee Yoo 2020-01-14 19:39 ` Cong Wang 0 siblings, 1 reply; 17+ messages in thread From: Taehee Yoo @ 2020-01-13 10:53 UTC (permalink / raw) To: Cong Wang; +Cc: syzbot, Linux Kernel Network Developers On Sun, 12 Jan 2020 at 08:28, Cong Wang <xiyou.wangcong@gmail.com> wrote: > > On Sat, Jan 11, 2020 at 1:53 PM Cong Wang <xiyou.wangcong@gmail.com> wrote: > > The details you provide here are really helpful for me to understand > > the reasons behind your changes. Let me think about this and see how > > I could address both problems. This appears to be harder than I originally > > thought. > > Do you think the following patch will make everyone happy? > > diff --git a/net/core/dev.c b/net/core/dev.c > index 0ad39c87b7fd..7e885d069707 100644 > --- a/net/core/dev.c > +++ b/net/core/dev.c > @@ -9177,22 +9177,10 @@ static void > netdev_unregister_lockdep_key(struct net_device *dev) > > void netdev_update_lockdep_key(struct net_device *dev) > { > - struct netdev_queue *queue; > - int i; > - > - lockdep_unregister_key(&dev->qdisc_xmit_lock_key); > lockdep_unregister_key(&dev->addr_list_lock_key); > - > - lockdep_register_key(&dev->qdisc_xmit_lock_key); > lockdep_register_key(&dev->addr_list_lock_key); > > lockdep_set_class(&dev->addr_list_lock, &dev->addr_list_lock_key); > - for (i = 0; i < dev->num_tx_queues; i++) { > - queue = netdev_get_tx_queue(dev, i); > - > - lockdep_set_class(&queue->_xmit_lock, > - &dev->qdisc_xmit_lock_key); > - } > } > EXPORT_SYMBOL(netdev_update_lockdep_key); > > I think as long as we don't take _xmit_lock nestedly, it is fine. And > most (or all?) of the software netdev's are already lockless, so I can't > think of any case we take more than one _xmit_lock on TX path. > > I tested it with the syzbot reproducer and your set master/nomaster > commands, I don't get any lockdep splat. > > What do you think? > > Thanks! I have tested this approach and I have no found any problem. As you said, most of virtual interfaces are already lockless. So, generally lockdep warning will not occur. I found two virtual interfaces that they don't have LLTX and they also could be upper interface. Interfaces are "rmnet" and virt_wifi" type. My test case is here. [Before] master0(bond or team or bridge) | slave0(rmnet or virt_wifi) | master1 | slave1 | master2 | veth [After] master0(bond or team or bridge) | slave1(rmnet or virt_wifi) | master2 | slave0 | master1 | veth In this test, the ordering of slave1 and slave0 will be changed. But, rmnet and virt_wifi type interface couldn't be slave of bond, team, and bridge type interface. So, This graph will not be made. So, I agree with this approach. Thank you so much! Taehee Yoo ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: WARNING: bad unlock balance in sch_direct_xmit 2020-01-13 10:53 ` Taehee Yoo @ 2020-01-14 19:39 ` Cong Wang 0 siblings, 0 replies; 17+ messages in thread From: Cong Wang @ 2020-01-14 19:39 UTC (permalink / raw) To: Taehee Yoo; +Cc: syzbot, Linux Kernel Network Developers On Mon, Jan 13, 2020 at 2:53 AM Taehee Yoo <ap420073@gmail.com> wrote: > > On Sun, 12 Jan 2020 at 08:28, Cong Wang <xiyou.wangcong@gmail.com> wrote: > > > > On Sat, Jan 11, 2020 at 1:53 PM Cong Wang <xiyou.wangcong@gmail.com> wrote: > > > The details you provide here are really helpful for me to understand > > > the reasons behind your changes. Let me think about this and see how > > > I could address both problems. This appears to be harder than I originally > > > thought. > > > > Do you think the following patch will make everyone happy? > > > > diff --git a/net/core/dev.c b/net/core/dev.c > > index 0ad39c87b7fd..7e885d069707 100644 > > --- a/net/core/dev.c > > +++ b/net/core/dev.c > > @@ -9177,22 +9177,10 @@ static void > > netdev_unregister_lockdep_key(struct net_device *dev) > > > > void netdev_update_lockdep_key(struct net_device *dev) > > { > > - struct netdev_queue *queue; > > - int i; > > - > > - lockdep_unregister_key(&dev->qdisc_xmit_lock_key); > > lockdep_unregister_key(&dev->addr_list_lock_key); > > - > > - lockdep_register_key(&dev->qdisc_xmit_lock_key); > > lockdep_register_key(&dev->addr_list_lock_key); > > > > lockdep_set_class(&dev->addr_list_lock, &dev->addr_list_lock_key); > > - for (i = 0; i < dev->num_tx_queues; i++) { > > - queue = netdev_get_tx_queue(dev, i); > > - > > - lockdep_set_class(&queue->_xmit_lock, > > - &dev->qdisc_xmit_lock_key); > > - } > > } > > EXPORT_SYMBOL(netdev_update_lockdep_key); > > > > I think as long as we don't take _xmit_lock nestedly, it is fine. And > > most (or all?) of the software netdev's are already lockless, so I can't > > think of any case we take more than one _xmit_lock on TX path. > > > > I tested it with the syzbot reproducer and your set master/nomaster > > commands, I don't get any lockdep splat. > > > > What do you think? > > > > Thanks! > > I have tested this approach and I have no found any problem. > As you said, most of virtual interfaces are already lockless. > So, generally lockdep warning will not occur. > I found two virtual interfaces that they don't have LLTX and they also > could be upper interface. Interfaces are "rmnet" and virt_wifi" type. > > My test case is here. > > [Before] > master0(bond or team or bridge) > | > slave0(rmnet or virt_wifi) > | > master1 > | > slave1 > | > master2 > | > veth > > [After] > master0(bond or team or bridge) > | > slave1(rmnet or virt_wifi) > | > master2 > | > slave0 > | > master1 > | > veth > > In this test, the ordering of slave1 and slave0 will be changed. > But, rmnet and virt_wifi type interface couldn't be slave of bond, team, > and bridge type interface. So, This graph will not be made. > So, I agree with this approach. Thanks for reviewing it and auditing more network devices. I will send out the patch formally. ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: WARNING: bad unlock balance in sch_direct_xmit 2020-01-11 21:53 ` Cong Wang 2020-01-11 23:28 ` Cong Wang @ 2020-01-13 10:13 ` Taehee Yoo 1 sibling, 0 replies; 17+ messages in thread From: Taehee Yoo @ 2020-01-13 10:13 UTC (permalink / raw) To: Cong Wang; +Cc: syzbot, Linux Kernel Network Developers On Sun, 12 Jan 2020 at 06:53, Cong Wang <xiyou.wangcong@gmail.com> wrote: > > On Thu, Jan 9, 2020 at 10:02 PM Taehee Yoo <ap420073@gmail.com> wrote: > > ndo_get_lock_subclass() was used to calculate subclass which was used by > > netif_addr_lock_nested(). > > > > -static inline void netif_addr_lock_nested(struct net_device *dev) > > -{ > > - int subclass = SINGLE_DEPTH_NESTING; > > - > > - if (dev->netdev_ops->ndo_get_lock_subclass) > > - subclass = dev->netdev_ops->ndo_get_lock_subclass(dev); > > - > > - spin_lock_nested(&dev->addr_list_lock, subclass); > > -} > > > > The most important thing about nested lock is to get the correct subclass. > > nest_level was used as subclass and this was calculated by > > ->ndo_get_lock_subclass(). > > But, ->ndo_get_lock_subclass() didn't calculate correct subclass. > > After "master" and "nomaster" operations, nest_level should be updated > > recursively, but it didn't. So incorrect subclass was used. > > > > team3 <-- subclass 0 > > > > "ip link set team3 master team2" > > > > team2 <-- subclass 0 > > team3 <-- subclass 1 > > > > "ip link set team2 master team1" > > > > team1 <-- subclass 0 > > team3 <-- subclass 1 > > team3 <-- subclass 1 > > > > "ip link set team1 master team0" > > > > team0 <-- subclass 0 > > team1 <-- subclass 1 > > team3 <-- subclass 1 > > team3 <-- subclass 1 > > > > After "master" and "nomaster" operation, subclass values of all lower or > > upper interfaces would be changed. But ->ndo_get_lock_subclass() > > didn't update subclass recursively, lockdep warning appeared. > > In order to fix this, I had two ways. > > 1. use dynamic keys instead of static keys. > > 2. fix ndo_get_lock_subclass(). > > > > The reason why I adopted using dynamic keys instead of fixing > > ->ndo_get_lock_subclass() is that the ->ndo_get_lock_subclass() isn't > > a common helper function. > > So, driver writers should implement ->ndo_get_lock_subclass(). > > If we use dynamic keys, ->ndo_get_lock_subclass() code could be removed. > > > > The details you provide here are really helpful for me to understand > the reasons behind your changes. Let me think about this and see how > I could address both problems. This appears to be harder than I originally > thought. > > > > > What I fixed problems with dynamic lockdep keys could be fixed by > > nested lock too. I think if the subclass value synchronization routine > > works well, there will be no problem. > > Great! We are on the same page. > > Thanks for all the information and the reproducer too! I really glad my explanation helps you! Thank you so much! ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: WARNING: bad unlock balance in sch_direct_xmit 2020-01-08 11:42 ` Taehee Yoo 2020-01-09 23:38 ` Cong Wang @ 2020-01-13 11:08 ` Dmitry Vyukov 1 sibling, 0 replies; 17+ messages in thread From: Dmitry Vyukov @ 2020-01-13 11:08 UTC (permalink / raw) To: Taehee Yoo; +Cc: Cong Wang, syzbot, Linux Kernel Network Developers On Wed, Jan 8, 2020 at 12:44 PM Taehee Yoo <ap420073@gmail.com> wrote: > > On Wed, 8 Jan 2020 at 09:34, Cong Wang <xiyou.wangcong@gmail.com> wrote: > > > > On Tue, Jan 7, 2020 at 3:31 AM Taehee Yoo <ap420073@gmail.com> wrote: > > > After "ip link set team0 master team1", the "team1 -> team0" locking path > > > will be recorded in lockdep key of both team1 and team0. > > > Then, if "ip link set team1 master team0" is executed, "team0 -> team1" > > > locking path also will be recorded in lockdep key. At this moment, > > > lockdep will catch possible deadlock situation and it prints the above > > > warning message. But, both "team0 -> team1" and "team1 -> team0" > > > will not be existing concurrently. so the above message is actually wrong. > > > In order to avoid this message, a recorded locking path should be > > > removed. So, both lockdep_unregister_key() and lockdep_register_key() > > > are needed. > > > > > > > So, after you move the key down to each netdevice, they are now treated > > as different locks. Is this stacked device scenario the reason why you > > move it to per-netdevice? If so, I wonder why not just use nested locks? > > Like: > > > > netif_addr_nested_lock(upper, 0); > > netif_addr_nested_lock(lower, 1); > > netif_addr_nested_unlock(lower); > > netif_addr_nested_unlock(upper); > > > > For this case, they could still share a same key. > > > > Thanks for the details! > > Yes, the reason for using dynamic lockdep key is to avoid lockdep > warning in stacked device scenario. > But, the addr_list_lock case is a little bit different. > There was a bug in netif_addr_lock_nested() that > "dev->netdev_ops->ndo_get_lock_subclass" isn't updated after "master" > and "nomaster" command. > So, the wrong subclass is used, so lockdep warning message was printed. > There were some ways to fix this problem, using dynamic key is just one > of them. I think using the correct subclass in netif_addr_lock_nested() > is also a correct way to fix that problem. Another minor reason was that > the subclass is limited by 8. but dynamic key has no limitation. > > Unfortunately, dynamic key has a problem too. > lockdep limits the maximum number of lockdep keys. > $cat /proc/lockdep_stats > lock-classes: 1228 [max: 8192] > > So, If so many network interfaces are created, they use so many > lockdep keys. If so, lockdep will stop. > This is the cause of "BUG: MAX_LOCKDEP_KEYS too low!". Hi Taehee, Cong, We actually have some serious problems with lockdep limits recently: https://groups.google.com/g/syzkaller-bugs/c/O9pFzd9KABU/m/KCuRo3w5CgAJ I wonder if it's related to what you mentioned... I will CC you on that other thread. ^ permalink raw reply [flat|nested] 17+ messages in thread
end of thread, other threads:[~2020-01-14 19:39 UTC | newest] Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2019-12-03 15:59 WARNING: bad unlock balance in sch_direct_xmit syzbot 2020-01-05 18:30 ` syzbot 2020-01-05 22:58 ` syzbot 2020-01-07 5:36 ` Cong Wang 2020-01-07 11:30 ` Taehee Yoo 2020-01-08 0:34 ` Cong Wang 2020-01-08 11:42 ` Taehee Yoo 2020-01-09 23:38 ` Cong Wang 2020-01-10 3:06 ` Taehee Yoo 2020-01-10 4:43 ` Cong Wang 2020-01-10 6:02 ` Taehee Yoo 2020-01-11 21:53 ` Cong Wang 2020-01-11 23:28 ` Cong Wang 2020-01-13 10:53 ` Taehee Yoo 2020-01-14 19:39 ` Cong Wang 2020-01-13 10:13 ` Taehee Yoo 2020-01-13 11:08 ` Dmitry Vyukov
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).