* WARNING: locking bug in inet_autobind @ 2019-05-16 5:46 syzbot 2019-05-21 8:31 ` syzbot ` (3 more replies) 0 siblings, 4 replies; 18+ messages in thread From: syzbot @ 2019-05-16 5:46 UTC (permalink / raw) To: ast, bpf, daniel, davem, kafai, kuznet, linux-kernel, netdev, songliubraving, syzkaller-bugs, yhs, yoshfuji Hello, syzbot found the following crash on: HEAD commit: 35c99ffa Merge tag 'for_linus' of git://git.kernel.org/pub.. git tree: net-next console output: https://syzkaller.appspot.com/x/log.txt?x=10e970f4a00000 kernel config: https://syzkaller.appspot.com/x/.config?x=82f0809e8f0a8c87 dashboard link: https://syzkaller.appspot.com/bug?extid=94cc2a66fc228b23f360 compiler: gcc (GCC) 9.0.0 20181231 (experimental) Unfortunately, I don't have any reproducer for this crash yet. IMPORTANT: if you fix the bug, please add the following tag to the commit: Reported-by: syzbot+94cc2a66fc228b23f360@syzkaller.appspotmail.com WARNING: CPU: 1 PID: 32543 at kernel/locking/lockdep.c:734 arch_local_save_flags arch/x86/include/asm/paravirt.h:762 [inline] WARNING: CPU: 1 PID: 32543 at kernel/locking/lockdep.c:734 arch_local_save_flags arch/x86/include/asm/paravirt.h:760 [inline] WARNING: CPU: 1 PID: 32543 at kernel/locking/lockdep.c:734 look_up_lock_class kernel/locking/lockdep.c:725 [inline] WARNING: CPU: 1 PID: 32543 at kernel/locking/lockdep.c:734 register_lock_class+0xe10/0x1860 kernel/locking/lockdep.c:1078 Kernel panic - not syncing: panic_on_warn set ... 
CPU: 1 PID: 32543 Comm: syz-executor.4 Not tainted 5.1.0+ #9 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x172/0x1f0 lib/dump_stack.c:113 panic+0x2cb/0x65c kernel/panic.c:214 __warn.cold+0x20/0x45 kernel/panic.c:566 report_bug+0x263/0x2b0 lib/bug.c:186 fixup_bug arch/x86/kernel/traps.c:180 [inline] fixup_bug arch/x86/kernel/traps.c:175 [inline] do_error_trap+0x11b/0x200 arch/x86/kernel/traps.c:273 do_invalid_op+0x37/0x50 arch/x86/kernel/traps.c:292 invalid_op+0x14/0x20 arch/x86/entry/entry_64.S:972 RIP: 0010:look_up_lock_class kernel/locking/lockdep.c:734 [inline] RIP: 0010:register_lock_class+0xe10/0x1860 kernel/locking/lockdep.c:1078 Code: 00 48 89 da 4d 8b 76 c0 48 b8 00 00 00 00 00 fc ff df 48 c1 ea 03 80 3c 02 00 0f 85 23 07 00 00 4c 89 33 e9 e3 f4 ff ff 0f 0b <0f> 0b e9 ea f3 ff ff 44 89 e0 4c 8b 95 50 ff ff ff 83 c0 01 4c 8b RSP: 0018:ffff88806395f9e8 EFLAGS: 00010083 RAX: dffffc0000000000 RBX: ffff8880a947f1e0 RCX: 0000000000000000 RDX: 1ffff1101528fe3f RSI: 0000000000000000 RDI: ffff8880a947f1f8 RBP: ffff88806395fab0 R08: 1ffff1100c72bf45 R09: ffffffff8a459c80 R10: ffffffff8a0e47e0 R11: 0000000000000000 R12: ffffffff8a1235a0 R13: 0000000000000000 R14: 0000000000000000 R15: ffffffff87fe4c60 __lock_acquire+0x116/0x5490 kernel/locking/lockdep.c:3673 lock_acquire+0x16f/0x3f0 kernel/locking/lockdep.c:4302 __raw_spin_lock_bh include/linux/spinlock_api_smp.h:135 [inline] _raw_spin_lock_bh+0x33/0x50 kernel/locking/spinlock.c:175 spin_lock_bh include/linux/spinlock.h:343 [inline] lock_sock_nested+0x41/0x120 net/core/sock.c:2917 lock_sock include/net/sock.h:1525 [inline] inet_autobind+0x20/0x1a0 net/ipv4/af_inet.c:183 inet_dgram_connect+0x252/0x2e0 net/ipv4/af_inet.c:573 __sys_connect+0x266/0x330 net/socket.c:1840 __do_sys_connect net/socket.c:1851 [inline] __se_sys_connect net/socket.c:1848 [inline] __x64_sys_connect+0x73/0xb0 net/socket.c:1848 
do_syscall_64+0x103/0x680 arch/x86/entry/common.c:301 entry_SYSCALL_64_after_hwframe+0x49/0xbe RIP: 0033:0x458da9 Code: ad b8 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 7b b8 fb ff c3 66 2e 0f 1f 84 00 00 00 00 RSP: 002b:00007f695f8b6c78 EFLAGS: 00000246 ORIG_RAX: 000000000000002a RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 0000000000458da9 RDX: 000000000000001c RSI: 0000000020000000 RDI: 0000000000000003 RBP: 000000000073bf00 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000246 R12: 00007f695f8b76d4 R13: 00000000004bf1fe R14: 00000000004d04f8 R15: 00000000ffffffff Kernel Offset: disabled Rebooting in 86400 seconds.. --- This bug is generated by a bot. It may contain errors. See https://goo.gl/tpsmEJ for more information about syzbot. syzbot engineers can be reached at syzkaller@googlegroups.com. syzbot will keep track of this bug report. See: https://goo.gl/tpsmEJ#status for how to communicate with syzbot. ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: WARNING: locking bug in inet_autobind 2019-05-16 5:46 WARNING: locking bug in inet_autobind syzbot @ 2019-05-21 8:31 ` syzbot 2019-05-22 3:16 ` syzbot ` (2 subsequent siblings) 3 siblings, 0 replies; 18+ messages in thread From: syzbot @ 2019-05-21 8:31 UTC (permalink / raw) To: ast, bpf, daniel, davem, kafai, kuznet, linux-kernel, netdev, songliubraving, syzkaller-bugs, yhs, yoshfuji syzbot has found a reproducer for the following crash on: HEAD commit: f49aa1de Merge tag 'for-5.2-rc1-tag' of git://git.kernel.o.. git tree: net-next console output: https://syzkaller.appspot.com/x/log.txt?x=14e5b130a00000 kernel config: https://syzkaller.appspot.com/x/.config?x=fc045131472947d7 dashboard link: https://syzkaller.appspot.com/bug?extid=94cc2a66fc228b23f360 compiler: gcc (GCC) 9.0.0 20181231 (experimental) syz repro: https://syzkaller.appspot.com/x/repro.syz?x=163731f8a00000 IMPORTANT: if you fix the bug, please add the following tag to the commit: Reported-by: syzbot+94cc2a66fc228b23f360@syzkaller.appspotmail.com WARNING: CPU: 1 PID: 28592 at kernel/locking/lockdep.c:734 arch_local_save_flags arch/x86/include/asm/paravirt.h:762 [inline] WARNING: CPU: 1 PID: 28592 at kernel/locking/lockdep.c:734 arch_local_save_flags arch/x86/include/asm/paravirt.h:760 [inline] WARNING: CPU: 1 PID: 28592 at kernel/locking/lockdep.c:734 look_up_lock_class kernel/locking/lockdep.c:725 [inline] WARNING: CPU: 1 PID: 28592 at kernel/locking/lockdep.c:734 register_lock_class+0xe10/0x1860 kernel/locking/lockdep.c:1078 Kernel panic - not syncing: panic_on_warn set ... 
CPU: 1 PID: 28592 Comm: syz-executor.5 Not tainted 5.2.0-rc1+ #1 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x172/0x1f0 lib/dump_stack.c:113 panic+0x2cb/0x744 kernel/panic.c:218 __warn.cold+0x20/0x4d kernel/panic.c:575 report_bug+0x263/0x2b0 lib/bug.c:186 fixup_bug arch/x86/kernel/traps.c:179 [inline] fixup_bug arch/x86/kernel/traps.c:174 [inline] do_error_trap+0x11b/0x200 arch/x86/kernel/traps.c:272 do_invalid_op+0x37/0x50 arch/x86/kernel/traps.c:291 invalid_op+0x14/0x20 arch/x86/entry/entry_64.S:986 RIP: 0010:look_up_lock_class kernel/locking/lockdep.c:734 [inline] RIP: 0010:register_lock_class+0xe10/0x1860 kernel/locking/lockdep.c:1078 Code: 00 48 89 da 4d 8b 76 c0 48 b8 00 00 00 00 00 fc ff df 48 c1 ea 03 80 3c 02 00 0f 85 23 07 00 00 4c 89 33 e9 e3 f4 ff ff 0f 0b <0f> 0b e9 ea f3 ff ff 44 89 e0 4c 8b 95 50 ff ff ff 83 c0 01 4c 8b RSP: 0018:ffff888093d179e8 EFLAGS: 00010083 RAX: dffffc0000000000 RBX: ffff8880967cd160 RCX: 0000000000000000 RDX: 1ffff11012cf9a2f RSI: 0000000000000000 RDI: ffff8880967cd178 RBP: ffff888093d17ab0 R08: 1ffff110127a2f45 R09: ffffffff8a659d40 R10: ffffffff8a2e8440 R11: 0000000000000000 R12: ffffffff8a323030 R13: 0000000000000000 R14: 0000000000000000 R15: ffffffff88022ba0 __lock_acquire+0x116/0x5490 kernel/locking/lockdep.c:3673 lock_acquire+0x16f/0x3f0 kernel/locking/lockdep.c:4302 __raw_spin_lock_bh include/linux/spinlock_api_smp.h:135 [inline] _raw_spin_lock_bh+0x33/0x50 kernel/locking/spinlock.c:175 spin_lock_bh include/linux/spinlock.h:343 [inline] lock_sock_nested+0x41/0x120 net/core/sock.c:2917 lock_sock include/net/sock.h:1525 [inline] inet_autobind+0x20/0x1a0 net/ipv4/af_inet.c:183 inet_dgram_connect+0x243/0x2d0 net/ipv4/af_inet.c:573 __sys_connect+0x264/0x330 net/socket.c:1840 __do_sys_connect net/socket.c:1851 [inline] __se_sys_connect net/socket.c:1848 [inline] __x64_sys_connect+0x73/0xb0 net/socket.c:1848 
do_syscall_64+0xfd/0x680 arch/x86/entry/common.c:301 entry_SYSCALL_64_after_hwframe+0x49/0xbe RIP: 0033:0x459279 Code: fd b7 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 cb b7 fb ff c3 66 2e 0f 1f 84 00 00 00 00 RSP: 002b:00007f2321b1ac78 EFLAGS: 00000246 ORIG_RAX: 000000000000002a RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 0000000000459279 RDX: 000000000000001c RSI: 0000000020000000 RDI: 0000000000000003 RBP: 000000000075bf20 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000246 R12: 00007f2321b1b6d4 R13: 00000000004bf74d R14: 00000000004d0c18 R15: 00000000ffffffff Kernel Offset: disabled Rebooting in 86400 seconds.. ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: WARNING: locking bug in inet_autobind 2019-05-16 5:46 WARNING: locking bug in inet_autobind syzbot 2019-05-21 8:31 ` syzbot @ 2019-05-22 3:16 ` syzbot [not found] ` <0000000000008b645c058971629b-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org> 2022-09-18 15:52 ` Tetsuo Handa 2022-12-29 6:26 ` [syzbot] " syzbot 3 siblings, 1 reply; 18+ messages in thread From: syzbot @ 2019-05-22 3:16 UTC (permalink / raw) To: Yong.Zhao, airlied, alexander.deucher, amd-gfx, ast, bpf, christian.koenig, daniel, daniel, davem, david1.zhou, dri-devel, evan.quan, felix.kuehling, harry.wentland, kafai, kuznet, linux-kernel, netdev, ozeng, ray.huang, rex.zhu, songliubraving, syzkaller-bugs, yhs, yong.zhao, yoshfuji syzbot has bisected this bug to: commit c0d9271ecbd891cdeb0fad1edcdd99ee717a655f Author: Yong Zhao <Yong.Zhao@amd.com> Date: Fri Feb 1 23:36:21 2019 +0000 drm/amdgpu: Delete user queue doorbell variables bisection log: https://syzkaller.appspot.com/x/bisect.txt?x=1433ece4a00000 start commit: f49aa1de Merge tag 'for-5.2-rc1-tag' of git://git.kernel.o.. git tree: net-next final crash: https://syzkaller.appspot.com/x/report.txt?x=1633ece4a00000 console output: https://syzkaller.appspot.com/x/log.txt?x=1233ece4a00000 kernel config: https://syzkaller.appspot.com/x/.config?x=fc045131472947d7 dashboard link: https://syzkaller.appspot.com/bug?extid=94cc2a66fc228b23f360 syz repro: https://syzkaller.appspot.com/x/repro.syz?x=163731f8a00000 Reported-by: syzbot+94cc2a66fc228b23f360@syzkaller.appspotmail.com Fixes: c0d9271ecbd8 ("drm/amdgpu: Delete user queue doorbell variables") For information about bisection process see: https://goo.gl/tpsmEJ#bisection ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: WARNING: locking bug in inet_autobind
       [not found]   ` <0000000000008b645c058971629b-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
@ 2019-05-22  3:21     ` Zhao, Yong
  0 siblings, 0 replies; 18+ messages in thread
From: Zhao, Yong @ 2019-05-22  3:21 UTC (permalink / raw)
  To: syzbot, airlied-cv59FeDIM0c, Deucher, Alexander, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, ast-DgEjT+Ai2ygdnm+yROfE0A, bpf-u79uwXL29TY76Z2rM5mHXA, Koenig, Christian, daniel-/w4YWyX8dFk, daniel-FeC+5ew28dpmcu3hnIyYJQ, davem-fT/PcQaiUtIeIZ0/mPfg9Q, Zhou, David(ChunMing), dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, Quan, Evan, Kuehling, Felix, Wentland, Harry, kafai-b10kYP2dOMg, kuznet-v/Mj1YrvjDBInbfyfbPRSQ, linux-kernel-u79uwXL29TY76Z2rM5mHXA

This commit was reverted later. I guess the revert was probably not picked up properly.

Regards,
Yong

________________________________
From: syzbot <syzbot+94cc2a66fc228b23f360-Pl5Pbv+GP7P466ipTTIvnc23WoclnBCfAL8bYrjMMd8@public.gmane.org>
Sent: Tuesday, May 21, 2019 11:16 PM
To: Zhao, Yong; airlied-cv59FeDIM0c@public.gmane.org; Deucher, Alexander; amd-gfx-PD4FTy7X32mqWrfYKbYh0A@public.gmane.orgktop.org; ast-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org; bpf-u79uwXL29TY76Z2rM5mHXA@public.gmane.org; Koenig, Christian; daniel@ffwll.ch; daniel-FeC+5ew28dpmcu3hnIyYJQ@public.gmane.org; davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org; Zhou, David(ChunMing); dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org; Quan, Evan; Kuehling, Felix; Wentland, Harry; kafai-b10kYP2dOMg@public.gmane.org; kuznet-v/Mj1YrvjDBInbfyfbPRSQ@public.gmane.org; linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org; netdev@vger.kernel.org; Zeng, Oak; Huang, Ray; rex.zhu-5C7GfCeVMHo@public.gmane.org; songliubraving@fb.com; syzkaller-bugs-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org; yhs-b10kYP2dOMg@public.gmane.org; Zhao, Yong; yoshfuji@linux-ipv6.org
Subject: Re: WARNING: locking bug in inet_autobind

syzbot has bisected this bug to:

commit c0d9271ecbd891cdeb0fad1edcdd99ee717a655f
Author: Yong Zhao <Yong.Zhao-5C7GfCeVMHo@public.gmane.org>
Date:   Fri Feb 1 23:36:21 2019 +0000

    drm/amdgpu: Delete user queue doorbell variables

bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=1433ece4a00000
start commit:   f49aa1de Merge tag 'for-5.2-rc1-tag' of git://git.kernel.o..
git tree:       net-next
final crash:    https://syzkaller.appspot.com/x/report.txt?x=1633ece4a00000
console output: https://syzkaller.appspot.com/x/log.txt?x=1233ece4a00000
kernel config:  https://syzkaller.appspot.com/x/.config?x=fc045131472947d7
dashboard link: https://syzkaller.appspot.com/bug?extid=94cc2a66fc228b23f360
syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=163731f8a00000

Reported-by: syzbot+94cc2a66fc228b23f360-Pl5Pbv+GP7P466ipTTIvnc23WoclnBCfAL8bYrjMMd8@public.gmane.org
Fixes: c0d9271ecbd8 ("drm/amdgpu: Delete user queue doorbell variables")

For information about bisection process see: https://goo.gl/tpsmEJ#bisection
* Re: WARNING: locking bug in inet_autobind 2019-05-16 5:46 WARNING: locking bug in inet_autobind syzbot 2019-05-21 8:31 ` syzbot 2019-05-22 3:16 ` syzbot @ 2022-09-18 15:52 ` Tetsuo Handa 2022-09-18 18:25 ` Boqun Feng 2022-12-29 6:26 ` [syzbot] " syzbot 3 siblings, 1 reply; 18+ messages in thread From: Tetsuo Handa @ 2022-09-18 15:52 UTC (permalink / raw) To: Peter Zijlstra, Ingo Molnar, Will Deacon, Waiman Long, Boqun Feng, David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni Cc: netdev, syzbot, syzkaller-bugs syzbot is reporting locking bug in inet_autobind(), for commit 37159ef2c1ae1e69 ("l2tp: fix a lockdep splat") started calling lockdep_set_class_and_name(&sk->sk_lock.slock, &l2tp_socket_class, "l2tp_sock") in l2tp_tunnel_create() (which is currently in l2tp_tunnel_register()). How can we fix this problem? ------------[ cut here ]------------ class->name=slock-AF_INET6 lock->name=l2tp_sock lock->key=l2tp_socket_class WARNING: CPU: 2 PID: 9237 at kernel/locking/lockdep.c:940 look_up_lock_class+0xcc/0x140 Modules linked in: CPU: 2 PID: 9237 Comm: a.out Not tainted 6.0.0-rc5-00094-ga335366bad13-dirty #860 Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006 RIP: 0010:look_up_lock_class+0xcc/0x140 On 2019/05/16 14:46, syzbot wrote: > HEAD commit: 35c99ffa Merge tag 'for_linus' of git://git.kernel.org/pub.. > git tree: net-next > console output: https://syzkaller.appspot.com/x/log.txt?x=10e970f4a00000 > kernel config: https://syzkaller.appspot.com/x/.config?x=82f0809e8f0a8c87 > dashboard link: https://syzkaller.appspot.com/bug?extid=94cc2a66fc228b23f360 > compiler: gcc (GCC) 9.0.0 20181231 (experimental) C reproducer is available at https://syzkaller.appspot.com/text?tag=ReproC&x=15062310080000 . ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: WARNING: locking bug in inet_autobind
  2022-09-18 15:52 ` Tetsuo Handa
@ 2022-09-18 18:25   ` Boqun Feng
  2022-09-19  5:02     ` Tetsuo Handa
  0 siblings, 1 reply; 18+ messages in thread
From: Boqun Feng @ 2022-09-18 18:25 UTC (permalink / raw)
  To: Tetsuo Handa
  Cc: Peter Zijlstra, Ingo Molnar, Will Deacon, Waiman Long, David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni, netdev, syzbot, syzkaller-bugs

On Mon, Sep 19, 2022 at 12:52:45AM +0900, Tetsuo Handa wrote:
> syzbot is reporting locking bug in inet_autobind(), for
> commit 37159ef2c1ae1e69 ("l2tp: fix a lockdep splat") started
> calling
>
>   lockdep_set_class_and_name(&sk->sk_lock.slock, &l2tp_socket_class, "l2tp_sock")
>
> in l2tp_tunnel_create() (which is currently in l2tp_tunnel_register()).
> How can we fix this problem?
>

Just a theory: it seems that memory corruption happens around
lockdep_set_class_and_name(). In l2tp_tunnel_register(), "sk" gets
published before lockdep_set_class_and_name() is called:

	tunnel->sock = sk;
	...
	lockdep_set_class_and_name(&sk->sk_lock.slock,...);

What could happen is that sock_lock_init() races with
l2tp_tunnel_register(), so that two lockdep_set_class_and_name() calls
race with each other.

Anyway, "sk" should not be published until its lock is properly
initialized. Could you try the following (untested)? It looks to me as
if all the other code around lockdep_set_class_and_name() should be
moved up as well, but I don't want to pretend I'm an expert ;-)

Regards,
Boqun

diff --git a/net/l2tp/l2tp_core.c b/net/l2tp/l2tp_core.c
index 7499c51b1850..1a01d23abc53 100644
--- a/net/l2tp/l2tp_core.c
+++ b/net/l2tp/l2tp_core.c
@@ -1480,7 +1480,9 @@ int l2tp_tunnel_register(struct l2tp_tunnel *tunnel, struct net *net,
 	sk = sock->sk;
 
 	sock_hold(sk);
-	tunnel->sock = sk;
+	lockdep_set_class_and_name(&sk->sk_lock.slock, &l2tp_socket_class,
+				   "l2tp_sock");
+	smp_store_release(&tunnel->sock, sk);
 
 	spin_lock_bh(&pn->l2tp_tunnel_list_lock);
 	list_for_each_entry(tunnel_walk, &pn->l2tp_tunnel_list, list) {
@@ -1509,8 +1511,6 @@ int l2tp_tunnel_register(struct l2tp_tunnel *tunnel, struct net *net,
 
 	tunnel->old_sk_destruct = sk->sk_destruct;
 	sk->sk_destruct = &l2tp_tunnel_destruct;
-	lockdep_set_class_and_name(&sk->sk_lock.slock, &l2tp_socket_class,
-				   "l2tp_sock");
 	sk->sk_allocation = GFP_ATOMIC;
 
 	trace_register_tunnel(tunnel);

> ------------[ cut here ]------------
> class->name=slock-AF_INET6 lock->name=l2tp_sock lock->key=l2tp_socket_class
> WARNING: CPU: 2 PID: 9237 at kernel/locking/lockdep.c:940 look_up_lock_class+0xcc/0x140
> Modules linked in:
> CPU: 2 PID: 9237 Comm: a.out Not tainted 6.0.0-rc5-00094-ga335366bad13-dirty #860
> Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
> RIP: 0010:look_up_lock_class+0xcc/0x140
>
> On 2019/05/16 14:46, syzbot wrote:
> > HEAD commit:    35c99ffa Merge tag 'for_linus' of git://git.kernel.org/pub..
> > git tree:       net-next
> > console output: https://syzkaller.appspot.com/x/log.txt?x=10e970f4a00000
> > kernel config:  https://syzkaller.appspot.com/x/.config?x=82f0809e8f0a8c87
> > dashboard link: https://syzkaller.appspot.com/bug?extid=94cc2a66fc228b23f360
> > compiler:       gcc (GCC) 9.0.0 20181231 (experimental)
>
> C reproducer is available at
> https://syzkaller.appspot.com/text?tag=ReproC&x=15062310080000 .
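The ordering idea behind the diff above (finish setting up the lock, then publish the pointer) can be sketched in userspace with C11 atomics. This is only an illustration under stated assumptions, not kernel code: `fake_sock`, `fake_tunnel`, `tunnel_publish()`, `tunnel_class_is()` and `publish_demo()` are hypothetical names, and a release store from <stdatomic.h> stands in for smp_store_release().

```c
#include <stdatomic.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins; these are not the kernel's struct sock or
 * struct l2tp_tunnel. */
struct fake_sock {
	const char *lock_class_name;	/* models the lockdep class name */
};

struct fake_tunnel {
	_Atomic(struct fake_sock *) sock;	/* models tunnel->sock */
};

/* Writer: finish the initialisation first, then publish with release
 * semantics so the initialisation cannot be observed after the pointer. */
void tunnel_publish(struct fake_tunnel *t, struct fake_sock *sk)
{
	sk->lock_class_name = "l2tp_sock";
	atomic_store_explicit(&t->sock, sk, memory_order_release);
}

/* Reader: an acquire load that sees the pointer also sees the class name
 * written before the release store. Returns 1 on a name match, else 0. */
int tunnel_class_is(struct fake_tunnel *t, const char *name)
{
	struct fake_sock *sk =
		atomic_load_explicit(&t->sock, memory_order_acquire);

	return sk != NULL && strcmp(sk->lock_class_name, name) == 0;
}

/* Self-check: returns 1 if nothing is visible before publication and the
 * final class name is visible afterwards. */
int publish_demo(void)
{
	struct fake_sock sk = { .lock_class_name = "slock-AF_INET6" };
	struct fake_tunnel t;

	atomic_init(&t.sock, NULL);
	if (tunnel_class_is(&t, "l2tp_sock"))
		return 0;
	tunnel_publish(&t, &sk);
	return tunnel_class_is(&t, "l2tp_sock");
}
```

Calling publish_demo() from a small main() exercises both paths. The pattern only protects readers that reach the object through the published pointer; a reader holding some other reference to the same object gets no ordering guarantee from it.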
* Re: WARNING: locking bug in inet_autobind 2022-09-18 18:25 ` Boqun Feng @ 2022-09-19 5:02 ` Tetsuo Handa 2022-09-27 13:00 ` Tetsuo Handa 0 siblings, 1 reply; 18+ messages in thread From: Tetsuo Handa @ 2022-09-19 5:02 UTC (permalink / raw) To: Boqun Feng Cc: Peter Zijlstra, Ingo Molnar, Will Deacon, Waiman Long, David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni, netdev, syzbot, syzkaller-bugs On 2022/09/19 3:25, Boqun Feng wrote: > On Mon, Sep 19, 2022 at 12:52:45AM +0900, Tetsuo Handa wrote: >> syzbot is reporting locking bug in inet_autobind(), for >> commit 37159ef2c1ae1e69 ("l2tp: fix a lockdep splat") started >> calling >> >> lockdep_set_class_and_name(&sk->sk_lock.slock, &l2tp_socket_class, "l2tp_sock") >> >> in l2tp_tunnel_create() (which is currently in l2tp_tunnel_register()). >> How can we fix this problem? >> > > Just a theory, it seems that we have a memory corruption happened for > lockdep_set_class_and_name(), in l2tp_tunnel_register(), the "sk" gets > published before lockdep_set_class_and_name(): > > tunnel->sock = sk; > ... > lockdep_set_class_and_name(&sk->sk_lock.slock,...); > > And what could happen is that sock_lock_init() races with the > l2tp_tunnel_register(), which results into two > lockdep_set_class_and_name()s race with each other. > > Anyway, "sk" should not be published until its lock gets properly > initialized, could you try the following (untested)? Looks to me all > other code around the lockdep_set_class_and_name() should be moved > upwards, but I don't want to pretend I'm an expert ;-) This diff did not help. 
------------[ cut here ]------------
Looking for class "l2tp_sock" with key l2tp_socket_class, but found a different class "slock-AF_INET6" with the same key
WARNING: CPU: 1 PID: 14195 at kernel/locking/lockdep.c:940 look_up_lock_class+0xcc/0x140
Modules linked in:
CPU: 1 PID: 14195 Comm: a.out Not tainted 6.0.0-rc6-dirty #863
Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
RIP: 0010:look_up_lock_class+0xcc/0x140

A roughly simplified reproducer (which is unlikely to trigger the
problem on any single run) is shown below.

----------------------------------------
#include <unistd.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <linux/if_pppox.h>

int main(int argc, char *argv[])
{
	const int fd0 = socket(AF_PPPOX, SOCK_STREAM, 1);
	const int fd1 = socket(AF_INET6, SOCK_DGRAM, IPPROTO_IP);
	struct sockaddr_pppol2tp addr0 = {
		.sa_family = AF_PPPOX,
		.sa_protocol = 1,
		.pppol2tp.fd = fd1, /* AF_INET6 UDP socket. */
		.pppol2tp.addr.sin_port = htons(1),
		.pppol2tp.addr.sin_addr = htonl(INADDR_LOOPBACK),
		.pppol2tp.s_tunnel = 2
	};
	struct sockaddr_in6 addr1 = {
		.sin6_family = AF_INET6,
		.sin6_port = htons(0),
		.sin6_addr = in6addr_loopback
	};

	if (fork() == 0) {
		/* Invoke inet_autobind() due to .sin6_port = htons(0). */
		connect(fd1, (struct sockaddr *) &addr1, sizeof(addr1));
		_exit(0);
	}
	/* Call lockdep_set_class_and_name(sk) of already published fd1. */
	connect(fd0, (struct sockaddr *) &addr0, sizeof(addr0));
	return 0;
}
----------------------------------------

The reproducer creates two file descriptors via
socket(AF_PPPOX, SOCK_STREAM, 1) and socket(AF_INET6, SOCK_DGRAM, IPPROTO_IP).
The connect() on the AF_PPPOX socket calls l2tp_tunnel_register() via
pppol2tp_connect(). l2tp_tunnel_register() modifies the "sk" of an already
published socket, which can be reached via its file descriptor using
sockfd_lookup(). In this reproducer, the "sk" created via
socket(AF_INET6, SOCK_DGRAM, IPPROTO_IP) is the one modified by the connect()
on the AF_PPPOX socket.
But since this file descriptor is visible to userspace, the userspace can concurrently call connect() on AF_INET6 socket (which invokes inet_autobind() by passing port == 0) using this file descriptor. As a result, spin_lock_bh(&sk->sk_lock.slock) from lock_sock_nested(sk) from lock_sock(sk) from inet_autobind() from inet_dgram_connect() finds that there already is a class "slock-AF_INET6" which would have been a normal result if l2tp_tunnel_register() did not call lockdep_set_class_and_name(&sk->sk_lock.slock, &l2tp_socket_class, "l2tp_sock") on this AF_INET6 socket. It seems like a race condition, for a debug printk() patch shown below suggested that this happens when lock_sock(sk) and lockdep_set_class_and_name(&sk->sk_lock.slock) ran in parallel. ---------------------------------------- diff --git a/net/ipv4/af_inet.c b/net/ipv4/af_inet.c index 3ca0cc467886..57b31d06b0e1 100644 --- a/net/ipv4/af_inet.c +++ b/net/ipv4/af_inet.c @@ -174,6 +174,8 @@ static int inet_autobind(struct sock *sk) { struct inet_sock *inet; /* We may need to bind the socket. 
*/ + if (!strcmp(current->comm, "a.out")) + pr_info("inet_autobind(sk=%px)\n", sk); lock_sock(sk); inet = inet_sk(sk); if (!inet->inet_num) { diff --git a/net/l2tp/l2tp_core.c b/net/l2tp/l2tp_core.c index 7499c51b1850..1bb14b19bca0 100644 --- a/net/l2tp/l2tp_core.c +++ b/net/l2tp/l2tp_core.c @@ -1509,8 +1509,12 @@ int l2tp_tunnel_register(struct l2tp_tunnel *tunnel, struct net *net, tunnel->old_sk_destruct = sk->sk_destruct; sk->sk_destruct = &l2tp_tunnel_destruct; + if (!strcmp(current->comm, "a.out")) + pr_info("l2tp_tunnel_register(sk=%px) before\n", sk); lockdep_set_class_and_name(&sk->sk_lock.slock, &l2tp_socket_class, "l2tp_sock"); + if (!strcmp(current->comm, "a.out")) + pr_info("l2tp_tunnel_register(sk=%px) after\n", sk); sk->sk_allocation = GFP_ATOMIC; trace_register_tunnel(tunnel); ---------------------------------------- ---------------------------------------- [ 229.873612][T41464] l2tp_core: l2tp_tunnel_register(sk=ffff8880148a7800) before [ 229.873619][T41464] l2tp_core: l2tp_tunnel_register(sk=ffff8880148a7800) after [ 229.873654][T41465] IPv4: inet_autobind(sk=ffff8880148a7800) [ 229.879263][T41468] IPv4: inet_autobind(sk=ffff8880d63a1e00) [ 229.879264][T41467] l2tp_core: l2tp_tunnel_register(sk=ffff8880d63a1e00) before [ 229.879272][T41468] ------------[ cut here ]------------ [ 229.879272][T41467] l2tp_core: l2tp_tunnel_register(sk=ffff8880d63a1e00) after [ 229.879275][T41468] Looking for class "l2tp_sock" with key l2tp_socket_class, but found a different class "slock-AF_INET6" with the same key [ 229.879932][T41450] l2tp_core: l2tp_tunnel_register(sk=ffff88807c416180) after [ 229.882029][T41468] WARNING: CPU: 0 PID: 41468 at kernel/locking/lockdep.c:940 look_up_lock_class+0xcc/0x140 [ 229.888126][T41471] IPv4: inet_autobind(sk=ffff88807c410000) [ 229.888126][T41470] l2tp_core: l2tp_tunnel_register(sk=ffff88807c410000) before [ 229.888134][T41470] l2tp_core: l2tp_tunnel_register(sk=ffff88807c410000) after [ 229.889140][T41468] Modules linked in: [ 
230.006548][T41468] CPU: 0 PID: 41468 Comm: a.out Not tainted 6.0.0-rc6-00001-g7def00e9a851-dirty #871 [ 230.009327][T41468] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006 [ 230.012117][T41468] RIP: 0010:look_up_lock_class+0xcc/0x140 [ 230.014633][T41468] Code: 8b 17 48 c7 c0 90 42 4b 88 48 39 c2 74 c4 f6 05 dd 31 dc 01 01 75 bb c6 05 d4 31 dc 01 01 48 c7 c7 26 5e f3 85 e8 f4 17 4c fc <0f> 0b eb a4 e8 5b c1 93 fd 48 c7 c7 fd 4c 19 86 89 de e8 c5 06 ff [ 230.020534][T41468] RSP: 0018:ffffc90013bc3ba0 EFLAGS: 00010046 [ 230.023183][T41468] RAX: 4ca7765a49bbb600 RBX: ffffffff8837db90 RCX: ffff8880d5ddd580 [ 230.025998][T41468] RDX: 0000000000000000 RSI: 0000000080000201 RDI: 0000000000000000 [ 230.028984][T41468] RBP: 0000000000000001 R08: ffffffff8136457a R09: 0000000000000000 [ 230.031785][T41468] R10: ffffffff81366013 R11: ffff8880d5ddd580 R12: 0000000000000000 [ 230.034512][T41468] R13: ffff8880d63a1eb0 R14: 0000000000000000 R15: 0000000000000000 [ 230.037347][T41468] FS: 00007efccdb44640(0000) GS:ffff88807dc00000(0000) knlGS:0000000000000000 [ 230.040207][T41468] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 230.042940][T41468] CR2: 00007efccdb43ef8 CR3: 0000000011a99000 CR4: 00000000000506f0 [ 230.045741][T41468] Call Trace: [ 230.048282][T41468] <TASK> [ 230.050869][T41468] register_lock_class+0x48/0x300 [ 230.053474][T41468] __lock_acquire+0x87/0x3340 [ 230.056057][T41468] ? __lock_acquire+0x65f/0x3340 [ 230.058852][T41468] ? console_trylock_spinning+0x187/0x2c0 [ 230.061637][T41468] lock_acquire+0xc6/0x1d0 [ 230.064189][T41468] ? lock_sock_nested+0x56/0xa0 [ 230.066753][T41468] ? lock_sock_nested+0x56/0xa0 [ 230.069337][T41468] _raw_spin_lock_bh+0x31/0x40 [ 230.071879][T41468] ? lock_sock_nested+0x56/0xa0 [ 230.074527][T41468] lock_sock_nested+0x56/0xa0 [ 230.077195][T41468] inet_dgram_connect+0xd7/0x1c0 [ 230.079829][T41468] __sys_connect+0x137/0x150 [ 230.082440][T41468] ? 
syscall_enter_from_user_mode+0x2e/0x1d0 [ 230.085198][T41468] ? lockdep_hardirqs_on+0x8d/0x130 [ 230.087957][T41468] __x64_sys_connect+0x18/0x20 [ 230.090690][T41468] do_syscall_64+0x3d/0x90 [ 230.093232][T41468] entry_SYSCALL_64_after_hwframe+0x63/0xcd ---------------------------------------- But unfortunately reordering tunnel->sock = sk; ... lockdep_set_class_and_name(&sk->sk_lock.slock,...); by lockdep_set_class_and_name(&sk->sk_lock.slock, &l2tp_socket_class, "l2tp_sock"); smp_store_release(&tunnel->sock, sk); does not help, for connect() on AF_INET6 socket is not finding this "sk" by accessing tunnel->sock. ^ permalink raw reply related [flat|nested] 18+ messages in thread
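The interleaving suggested by the printk() output above can be modelled deterministically in userspace. The sketch below is a toy model under explicit assumptions, not the lockdep implementation: it assumes that a lookup first samples the lock's key, finds the class registered for that key, and only afterwards compares the class name with the lock's current name. All `toy_*` names are hypothetical, and names are compared with strcmp() purely for the sake of the model.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Toy model: these types only mimic the roles of a lock class and of the
 * per-lock map; they are not the kernel definitions. */
struct toy_class { const void *key; const char *name; };
struct toy_map   { const void *key; const char *name; };

#define TOY_MAX_CLASSES 8
static struct toy_class toy_classes[TOY_MAX_CLASSES];
static size_t toy_nr_classes;

/* Find the class registered for a key, or NULL if there is none. */
struct toy_class *toy_find(const void *key)
{
	for (size_t i = 0; i < toy_nr_classes; i++)
		if (toy_classes[i].key == key)
			return &toy_classes[i];
	return NULL;
}

/* Register a new class for a key. */
struct toy_class *toy_register(const void *key, const char *name)
{
	assert(toy_nr_classes < TOY_MAX_CLASSES);
	toy_classes[toy_nr_classes] = (struct toy_class){ key, name };
	return &toy_classes[toy_nr_classes++];
}

/* Models changing class and name of a lock that is already visible. */
void toy_set_class_and_name(struct toy_map *m, const void *key,
			    const char *name)
{
	m->key = key;
	m->name = name;
}

/* Returns true if the modelled "found a different class" check fires.
 * racy == true: the class change lands between the locker's key sampling
 * and its name comparison. racy == false: it completes beforehand. */
bool toy_scenario(bool racy)
{
	static const char inet6_key = 0, l2tp_key = 0; /* addresses only */
	struct toy_map slock = { &inet6_key, "slock-AF_INET6" };
	const void *sampled;
	struct toy_class *class;

	toy_nr_classes = 0;
	toy_register(&inet6_key, "slock-AF_INET6"); /* class already known */

	if (!racy)
		toy_set_class_and_name(&slock, &l2tp_key, "l2tp_sock");

	sampled = slock.key;		/* locker samples the key */
	if (racy)
		toy_set_class_and_name(&slock, &l2tp_key, "l2tp_sock");

	class = toy_find(sampled);	/* locker looks up the class */
	if (!class)
		class = toy_register(sampled, slock.name);

	return strcmp(class->name, slock.name) != 0;
}
```

In this model toy_scenario(false) finds consistent data, while toy_scenario(true) pairs the old "slock-AF_INET6" class with the new "l2tp_sock" name, which matches the shape of the warning text. Whether the kernel hits exactly this interleaving is not confirmed by the thread.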
* Re: WARNING: locking bug in inet_autobind 2022-09-19 5:02 ` Tetsuo Handa @ 2022-09-27 13:00 ` Tetsuo Handa 2022-11-22 18:02 ` Jakub Sitnicki 0 siblings, 1 reply; 18+ messages in thread From: Tetsuo Handa @ 2022-09-27 13:00 UTC (permalink / raw) To: Boqun Feng, David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni Cc: Peter Zijlstra, Ingo Molnar, Will Deacon, Waiman Long, netdev, syzbot, syzkaller-bugs On 2022/09/19 14:02, Tetsuo Handa wrote: > But unfortunately reordering > > tunnel->sock = sk; > ... > lockdep_set_class_and_name(&sk->sk_lock.slock,...); > > by > > lockdep_set_class_and_name(&sk->sk_lock.slock, &l2tp_socket_class, "l2tp_sock"); > smp_store_release(&tunnel->sock, sk); > > does not help, for connect() on AF_INET6 socket is not finding this "sk" by > accessing tunnel->sock. > I considered something like below diff, but I came to think that this problem cannot be solved unless l2tp_tunnel_register() stops using userspace-supplied file descriptor and starts always calling l2tp_tunnel_sock_create(), for userspace can continue using userspace-supplied file descriptor as if a normal socket even after lockdep_set_class_and_name() told that this is a tunneling socket. Since userspace-supplied file descriptor has to be a datagram socket, can we somehow copy the source/destination addresses from userspace-supplied socket to kernel-created socket? 
diff --git a/net/l2tp/l2tp_core.c b/net/l2tp/l2tp_core.c
index 7499c51b1850..07429bed7c4c 100644
--- a/net/l2tp/l2tp_core.c
+++ b/net/l2tp/l2tp_core.c
@@ -1382,8 +1382,6 @@ static int l2tp_tunnel_sock_create(struct net *net,
 	return err;
 }
 
-static struct lock_class_key l2tp_socket_class;
-
 int l2tp_tunnel_create(int fd, int version, u32 tunnel_id, u32 peer_tunnel_id,
 		       struct l2tp_tunnel_cfg *cfg, struct l2tp_tunnel **tunnelp)
 {
@@ -1509,8 +1507,20 @@ int l2tp_tunnel_register(struct l2tp_tunnel *tunnel, struct net *net,
 
 	tunnel->old_sk_destruct = sk->sk_destruct;
 	sk->sk_destruct = &l2tp_tunnel_destruct;
-	lockdep_set_class_and_name(&sk->sk_lock.slock, &l2tp_socket_class,
-				   "l2tp_sock");
+	if (IS_ENABLED(CONFIG_LOCKDEP)) {
+		static struct lock_class_key l2tp_socket_class;
+
+		/* Changing class/name of an already visible sock might race
+		 * with first lock_sock() call on that sock. In order to make
+		 * sure that register_lock_class() has completed before
+		 * lockdep_set_class_and_name() changes class/name, explicitly
+		 * lock/release that sock.
+		 */
+		lock_sock(sk);
+		release_sock(sk);
+		lockdep_set_class_and_name(&sk->sk_lock.slock,
+					   &l2tp_socket_class, "l2tp_sock");
+	}
 	sk->sk_allocation = GFP_ATOMIC;
 
 	trace_register_tunnel(tunnel);
* Re: WARNING: locking bug in inet_autobind
  2022-09-27 13:00     ` Tetsuo Handa
@ 2022-11-22 18:02       ` Jakub Sitnicki
  0 siblings, 0 replies; 18+ messages in thread
From: Jakub Sitnicki @ 2022-11-22 18:02 UTC
To: Eric Dumazet, Tetsuo Handa
Cc: Boqun Feng, David S. Miller, Jakub Kicinski, Paolo Abeni,
	Peter Zijlstra, Ingo Molnar, Will Deacon, Waiman Long, netdev,
	syzbot, syzkaller-bugs

On Tue, Sep 27, 2022 at 10:00 PM +09, Tetsuo Handa wrote:
> On 2022/09/19 14:02, Tetsuo Handa wrote:
>> But unfortunately reordering
>>
>>   tunnel->sock = sk;
>>   ...
>>   lockdep_set_class_and_name(&sk->sk_lock.slock,...);
>>
>> by
>>
>>   lockdep_set_class_and_name(&sk->sk_lock.slock, &l2tp_socket_class, "l2tp_sock");
>>   smp_store_release(&tunnel->sock, sk);
>>
>> does not help, for connect() on AF_INET6 socket is not finding this "sk" by
>> accessing tunnel->sock.
>>
>
> I considered something like below diff, but I came to think that this problem
> cannot be solved unless l2tp_tunnel_register() stops using userspace-supplied
> file descriptor and starts always calling l2tp_tunnel_sock_create(), for
> userspace can continue using userspace-supplied file descriptor as if a normal
> socket even after lockdep_set_class_and_name() told that this is a tunneling
> socket.
>
> Since userspace-supplied file descriptor has to be a datagram socket,
> can we somehow copy the source/destination addresses from
> userspace-supplied socket to kernel-created socket?
>
>
> diff --git a/net/l2tp/l2tp_core.c b/net/l2tp/l2tp_core.c
> index 7499c51b1850..07429bed7c4c 100644
> --- a/net/l2tp/l2tp_core.c
> +++ b/net/l2tp/l2tp_core.c
> @@ -1382,8 +1382,6 @@ static int l2tp_tunnel_sock_create(struct net *net,
>  	return err;
>  }
>
> -static struct lock_class_key l2tp_socket_class;
> -
>  int l2tp_tunnel_create(int fd, int version, u32 tunnel_id, u32 peer_tunnel_id,
>  		       struct l2tp_tunnel_cfg *cfg, struct l2tp_tunnel **tunnelp)
>  {
> @@ -1509,8 +1507,20 @@ int l2tp_tunnel_register(struct l2tp_tunnel *tunnel, struct net *net,
>
>  	tunnel->old_sk_destruct = sk->sk_destruct;
>  	sk->sk_destruct = &l2tp_tunnel_destruct;
> -	lockdep_set_class_and_name(&sk->sk_lock.slock, &l2tp_socket_class,
> -				   "l2tp_sock");
> +	if (IS_ENABLED(CONFIG_LOCKDEP)) {
> +		static struct lock_class_key l2tp_socket_class;
> +
> +		/* Changing class/name of an already visible sock might race
> +		 * with first lock_sock() call on that sock. In order to make
> +		 * sure that register_lock_class() has completed before
> +		 * lockdep_set_class_and_name() changes class/name, explicitly
> +		 * lock/release that sock.
> +		 */
> +		lock_sock(sk);
> +		release_sock(sk);
> +		lockdep_set_class_and_name(&sk->sk_lock.slock,
> +					   &l2tp_socket_class, "l2tp_sock");
> +	}
>  	sk->sk_allocation = GFP_ATOMIC;
>
>  	trace_register_tunnel(tunnel);

What if we revisit Eric's lockdep splat fix in 37159ef2c1ae ("l2tp: fix
a lockdep splat") and:

1. remove the lockdep_set_class_and_name(...) call in l2tp; it looks
   like an odd case within the network stack, and

2. switch to bh_lock_sock_nested in l2tp_xmit_core so that we don't
   break what has been fixed in 37159ef2c1ae.

Eric, WDYT?

* Re: [syzbot] WARNING: locking bug in inet_autobind
  2019-05-16  5:46 WARNING: locking bug in inet_autobind syzbot
                   ` (2 preceding siblings ...)
  2022-09-18 15:52 ` Tetsuo Handa
@ 2022-12-29  6:26 ` syzbot
  2023-01-03 15:39   ` Felix Kuehling
  3 siblings, 1 reply; 18+ messages in thread
From: syzbot @ 2022-12-29  6:26 UTC
To: Alexander.Deucher, Christian.Koenig, David1.Zhou, Evan.Quan,
	Felix.Kuehling, Harry.Wentland, Oak.Zeng, Ray.Huang, Yong.Zhao,
	airlied, alexander.deucher, amd-gfx, ast, boqun.feng, bpf,
	christian.koenig, daniel, daniel, davem, david1.zhou, dri-devel,
	dsahern, edumazet, evan.quan, felix.kuehling, gautammenghani201,
	harry.wentland, jakub, kafai, kuba, kuznet, linux-kernel, longman,
	mingo, netdev, ozeng, pabeni, penguin-kernel, penguin-kernel,
	peterz, ray.huang, rex.zhu, songliubraving, syzkaller-bugs, will,
	yhs, yong.zhao, yoshfuji

syzbot has found a reproducer for the following issue on:

HEAD commit:    1b929c02afd3 Linux 6.2-rc1
git tree:       upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=145c6a68480000
kernel config:  https://syzkaller.appspot.com/x/.config?x=2651619a26b4d687
dashboard link: https://syzkaller.appspot.com/bug?extid=94cc2a66fc228b23f360
compiler:       gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2
syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=13e13e32480000
C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=13790f08480000

Downloadable assets:
disk image: https://storage.googleapis.com/syzbot-assets/d1849f1ca322/disk-1b929c02.raw.xz
vmlinux: https://storage.googleapis.com/syzbot-assets/924cb8aa4ada/vmlinux-1b929c02.xz
kernel image: https://storage.googleapis.com/syzbot-assets/8c7330dae0a0/bzImage-1b929c02.xz

The issue was bisected to:

commit c0d9271ecbd891cdeb0fad1edcdd99ee717a655f
Author: Yong Zhao <Yong.Zhao@amd.com>
Date:   Fri Feb 1 23:36:21 2019 +0000

    drm/amdgpu: Delete user queue doorbell variables

bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=1433ece4a00000
final oops:     https://syzkaller.appspot.com/x/report.txt?x=1633ece4a00000
console output: https://syzkaller.appspot.com/x/log.txt?x=1233ece4a00000

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+94cc2a66fc228b23f360@syzkaller.appspotmail.com
Fixes: c0d9271ecbd8 ("drm/amdgpu: Delete user queue doorbell variables")

------------[ cut here ]------------
Looking for class "l2tp_sock" with key l2tp_socket_class, but found a different class "slock-AF_INET6" with the same key
WARNING: CPU: 0 PID: 7280 at kernel/locking/lockdep.c:937 look_up_lock_class+0x97/0x110 kernel/locking/lockdep.c:937
Modules linked in:
CPU: 0 PID: 7280 Comm: syz-executor835 Not tainted 6.2.0-rc1-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/26/2022
RIP: 0010:look_up_lock_class+0x97/0x110 kernel/locking/lockdep.c:937
Code: 17 48 81 fa e0 e5 f6 8f 74 59 80 3d 5d bc 57 04 00 75 50 48 c7 c7 00 4d 4c 8a 48 89 04 24 c6 05 49 bc 57 04 01 e8 a9 42 b9 ff <0f> 0b 48 8b 04 24 eb 31 9c 5a 80 e6 02 74 95 e8 45 38 02 fa 85 c0
RSP: 0018:ffffc9000b5378b8 EFLAGS: 00010082
RAX: 0000000000000000 RBX: ffffffff91c06a00 RCX: 0000000000000000
RDX: ffff8880292d0000 RSI: ffffffff8166721c RDI: fffff520016a6f09
RBP: 0000000000000000 R08: 0000000000000005 R09: 0000000000000000
R10: 0000000080000201 R11: 20676e696b6f6f4c R12: 0000000000000000
R13: ffff88802a5820b0 R14: 0000000000000000 R15: 0000000000000000
FS:  00007f1fd7a97700(0000) GS:ffff8880b9800000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000020000100 CR3: 0000000078ab4000 CR4: 00000000003506f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 <TASK>
 register_lock_class+0xbe/0x1120 kernel/locking/lockdep.c:1289
 __lock_acquire+0x109/0x56d0 kernel/locking/lockdep.c:4934
 lock_acquire kernel/locking/lockdep.c:5668 [inline]
 lock_acquire+0x1e3/0x630 kernel/locking/lockdep.c:5633
 __raw_spin_lock_bh include/linux/spinlock_api_smp.h:126 [inline]
 _raw_spin_lock_bh+0x33/0x40 kernel/locking/spinlock.c:178
 spin_lock_bh include/linux/spinlock.h:355 [inline]
 lock_sock_nested+0x5f/0xf0 net/core/sock.c:3473
 lock_sock include/net/sock.h:1725 [inline]
 inet_autobind+0x1a/0x190 net/ipv4/af_inet.c:177
 inet_send_prepare net/ipv4/af_inet.c:813 [inline]
 inet_send_prepare+0x325/0x4e0 net/ipv4/af_inet.c:807
 inet6_sendmsg+0x43/0xe0 net/ipv6/af_inet6.c:655
 sock_sendmsg_nosec net/socket.c:714 [inline]
 sock_sendmsg+0xd3/0x120 net/socket.c:734
 __sys_sendto+0x23a/0x340 net/socket.c:2117
 __do_sys_sendto net/socket.c:2129 [inline]
 __se_sys_sendto net/socket.c:2125 [inline]
 __x64_sys_sendto+0xe1/0x1b0 net/socket.c:2125
 do_syscall_x64 arch/x86/entry/common.c:50 [inline]
 do_syscall_64+0x39/0xb0 arch/x86/entry/common.c:80
 entry_SYSCALL_64_after_hwframe+0x63/0xcd
RIP: 0033:0x7f1fd78538b9
Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 e1 15 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b8 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007f1fd7a971f8 EFLAGS: 00000212 ORIG_RAX: 000000000000002c
RAX: ffffffffffffffda RBX: 00007f1fd78f0038 RCX: 00007f1fd78538b9
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000004
RBP: 00007f1fd78f0030 R08: 0000000020000100 R09: 000000000000001c
R10: 0000000004008000 R11: 0000000000000212 R12: 00007f1fd78f003c
R13: 00007f1fd79ffc8f R14: 00007f1fd7a97300 R15: 0000000000022000
 </TASK>

* Re: [syzbot] WARNING: locking bug in inet_autobind
  2022-12-29  6:26 ` [syzbot] " syzbot
@ 2023-01-03 15:39   ` Felix Kuehling
  2023-01-03 16:05     ` Waiman Long
  0 siblings, 1 reply; 18+ messages in thread
From: Felix Kuehling @ 2023-01-03 15:39 UTC
To: syzbot, Alexander.Deucher, Christian.Koenig, David1.Zhou,
	Evan.Quan, Harry.Wentland, Oak.Zeng, Ray.Huang, Yong.Zhao, airlied,
	amd-gfx, ast, boqun.feng, bpf, daniel, daniel, davem, dri-devel,
	dsahern, edumazet, gautammenghani201, jakub, kafai, kuba, kuznet,
	linux-kernel, longman, mingo, netdev, ozeng, pabeni,
	penguin-kernel, peterz, rex.zhu, songliubraving, syzkaller-bugs,
	will, yhs, yoshfuji

The regression point doesn't make sense. The kernel config doesn't
enable CONFIG_DRM_AMDGPU, so there is no way that a change in AMDGPU
could have caused this regression.

Regards,
  Felix


On 2022-12-29 at 01:26, syzbot wrote:
> syzbot has found a reproducer for the following issue on:
>
> HEAD commit:    1b929c02afd3 Linux 6.2-rc1
> git tree:       upstream
> [...]
>
> The issue was bisected to:
>
> commit c0d9271ecbd891cdeb0fad1edcdd99ee717a655f
> Author: Yong Zhao <Yong.Zhao@amd.com>
> Date:   Fri Feb 1 23:36:21 2019 +0000
>
>     drm/amdgpu: Delete user queue doorbell variables
>
> [...]

* Re: [syzbot] WARNING: locking bug in inet_autobind
  2023-01-03 15:39   ` Felix Kuehling
@ 2023-01-03 16:05     ` Waiman Long
  2023-01-03 16:20       ` Felix Kuehling
  0 siblings, 1 reply; 18+ messages in thread
From: Waiman Long @ 2023-01-03 16:05 UTC
To: Felix Kuehling, syzbot, Alexander.Deucher, Christian.Koenig,
	David1.Zhou, Evan.Quan, Harry.Wentland, Oak.Zeng, Ray.Huang,
	Yong.Zhao, airlied, amd-gfx, ast, boqun.feng, bpf, daniel, daniel,
	davem, dri-devel, dsahern, edumazet, gautammenghani201, jakub,
	kafai, kuba, kuznet, linux-kernel, mingo, netdev, ozeng, pabeni,
	penguin-kernel, peterz, rex.zhu, songliubraving, syzkaller-bugs,
	will, yhs, yoshfuji

On 1/3/23 10:39, Felix Kuehling wrote:
> The regression point doesn't make sense. The kernel config doesn't
> enable CONFIG_DRM_AMDGPU, so there is no way that a change in AMDGPU
> could have caused this regression.
>
I agree. It is likely a pre-existing problem or caused by another commit
that got triggered because of the change in cacheline alignment caused
by commit c0d9271ecbd ("drm/amdgpu: Delete user queue doorbell variable").

Cheers,
Longman

> Regards,
>   Felix
>
> [...]

* Re: [syzbot] WARNING: locking bug in inet_autobind
  2023-01-03 16:05     ` Waiman Long
@ 2023-01-03 16:20       ` Felix Kuehling
  2023-01-03 22:07         ` Tetsuo Handa
  0 siblings, 1 reply; 18+ messages in thread
From: Felix Kuehling @ 2023-01-03 16:20 UTC
To: Waiman Long, syzbot, Alexander.Deucher, Christian.Koenig,
	David1.Zhou, Evan.Quan, Harry.Wentland, Oak.Zeng, Ray.Huang,
	Yong.Zhao, airlied, amd-gfx, ast, boqun.feng, bpf, daniel, daniel,
	davem, dri-devel, dsahern, edumazet, gautammenghani201, jakub,
	kafai, kuba, kuznet, linux-kernel, mingo, netdev, ozeng, pabeni,
	penguin-kernel, peterz, rex.zhu, songliubraving, syzkaller-bugs,
	will, yhs, yoshfuji

On 2023-01-03 at 11:05, Waiman Long wrote:
> On 1/3/23 10:39, Felix Kuehling wrote:
>> The regression point doesn't make sense. The kernel config doesn't
>> enable CONFIG_DRM_AMDGPU, so there is no way that a change in AMDGPU
>> could have caused this regression.
>>
> I agree. It is likely a pre-existing problem or caused by another
> commit that got triggered because of the change in cacheline alignment
> caused by commit c0d9271ecbd ("drm/amdgpu: Delete user queue doorbell
> variable").

I don't think the change can affect cache line alignment. The entire
amdgpu driver doesn't even get compiled in the kernel config that was
used, and the change doesn't touch any files outside
drivers/gpu/drm/amd/amdgpu:

# CONFIG_DRM_AMDGPU is not set

My guess would be that it's an intermittent bug that is confusing bisect.

Regards,
  Felix

>
> Cheers,
> Longman
>
> [...]

* Re: [syzbot] WARNING: locking bug in inet_autobind
  2023-01-03 16:20       ` Felix Kuehling
  2023-01-03 22:07           ` Tetsuo Handa
@ 2023-01-03 22:07         ` Tetsuo Handa
  0 siblings, 0 replies; 18+ messages in thread
From: Tetsuo Handa @ 2023-01-03 22:07 UTC
To: Felix Kuehling, Waiman Long, edumazet, jakub
Cc: syzkaller-bugs, netdev, syzbot, Alexander.Deucher,
	Christian.Koenig, David1.Zhou, Evan.Quan, Harry.Wentland, Oak.Zeng,
	Ray.Huang, Yong.Zhao, airlied, ast, boqun.feng, daniel, daniel,
	davem, dsahern, gautammenghani201, kafai, kuba, kuznet, mingo,
	ozeng, pabeni, peterz, rex.zhu, songliubraving, will, yhs, yoshfuji

On 2023/01/04 1:20, Felix Kuehling wrote:
>
> On 2023-01-03 at 11:05, Waiman Long wrote:
>> On 1/3/23 10:39, Felix Kuehling wrote:
>>> The regression point doesn't make sense. The kernel config doesn't enable CONFIG_DRM_AMDGPU, so there is no way that a change in AMDGPU could have caused this regression.
>>>
>> I agree. It is likely a pre-existing problem or caused by another commit that got triggered because of the change in cacheline alignment caused by commit c0d9271ecbd ("drm/amdgpu: Delete user queue doorbell variable").
> I don't think the change can affect cache line alignment. The entire amdgpu driver doesn't even get compiled in the kernel config that was used, and the change doesn't touch any files outside drivers/gpu/drm/amd/amdgpu:
>
> # CONFIG_DRM_AMDGPU is not set
>
> My guess would be that it's an intermittent bug that is confusing bisect.
>
> Regards,
>   Felix

This was already explained in
https://groups.google.com/g/syzkaller-bugs/c/1rmGDmbXWIw/m/nIQm0EmxBAAJ .

Jakub Sitnicki suggested

  What if we revisit Eric's lockdep splat fix in 37159ef2c1ae ("l2tp: fix
  a lockdep splat") and:

  1. remove the lockdep_set_class_and_name(...) call in l2tp; it looks
     like an odd case within the network stack, and

  2. switch to bh_lock_sock_nested in l2tp_xmit_core so that we don't
     break what has been fixed in 37159ef2c1ae.

and we are waiting for response from Eric Dumazet.

* Re: [syzbot] WARNING: locking bug in inet_autobind
  2023-01-03 22:07 ` Tetsuo Handa
@ 2023-01-03 22:12 ` Eric Dumazet
From: Eric Dumazet @ 2023-01-03 22:12 UTC
  To: Tetsuo Handa
  Cc: Felix Kuehling, Waiman Long, jakub, syzkaller-bugs, netdev, syzbot, Alexander.Deucher, Christian.Koenig, David1.Zhou, Evan.Quan, Harry.Wentland, Oak.Zeng, Ray.Huang, Yong.Zhao, airlied, ast, boqun.feng, daniel, daniel, davem, dsahern, gautammenghani201, kafai, kuba, kuznet, mingo, ozeng, pabeni, peterz, rex.zhu, songliubraving, will, yhs, yoshfuji

On Tue, Jan 3, 2023 at 11:08 PM Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp> wrote:
>
> On 2023/01/04 1:20, Felix Kuehling wrote:
> >
> > On 2023-01-03 at 11:05, Waiman Long wrote:
> >> On 1/3/23 10:39, Felix Kuehling wrote:
> >>> The regression point doesn't make sense. The kernel config doesn't enable CONFIG_DRM_AMDGPU, so there is no way that a change in AMDGPU could have caused this regression.
> >>>
> >> I agree. It is likely a pre-existing problem or caused by another commit that got triggered because of the change in cacheline alignment caused by commit c0d9271ecbd ("drm/amdgpu: Delete user queue doorbell variable").
> > I don't think the change can affect cache line alignment. The entire amdgpu driver doesn't even get compiled in the kernel config that was used, and the change doesn't touch any files outside drivers/gpu/drm/amd/amdgpu:
> >
> > # CONFIG_DRM_AMDGPU is not set
> >
> > My guess would be that it's an intermittent bug that is confusing bisect.
> >
> > Regards,
> > Felix
>
> This was already explained in https://groups.google.com/g/syzkaller-bugs/c/1rmGDmbXWIw/m/nIQm0EmxBAAJ .
>
> Jakub Sitnicki suggested
>
>   What if we revisit Eric's lockdep splat fix in 37159ef2c1ae ("l2tp: fix
>   a lockdep splat") and:
>
>   1. remove the lockdep_set_class_and_name(...) call in l2tp; it looks
>      like an odd case within the network stack, and
>
>   2. switch to bh_lock_sock_nested in l2tp_xmit_core so that we don't
>      break what has been fixed in 37159ef2c1ae.
>
> and we are waiting for response from Eric Dumazet.
>

Eric Dumazet has been very busy. Send a patch, instead of an idea/description.

Thanks.
* Re: [syzbot] WARNING: locking bug in inet_autobind
  [not found] <20221229101603.2931-1-hdanton@sina.com>
@ 2022-12-29 10:43 ` syzbot
From: syzbot @ 2022-12-29 10:43 UTC
  To: hdanton, linux-kernel, syzkaller-bugs

Hello,

syzbot has tested the proposed patch but the reproducer is still triggering an issue:
INFO: rcu detected stall in corrupted

rcu: INFO: rcu_preempt detected expedited stalls on CPUs/tasks: { P5564 } 2687 jiffies s: 2885 root: 0x0/T
rcu: blocking rcu_node structures (internal RCU debug):

Tested on:

commit:         1b929c02 Linux 6.2-rc1
git tree:       https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
console output: https://syzkaller.appspot.com/x/log.txt?x=124c2632480000
kernel config:  https://syzkaller.appspot.com/x/.config?x=2651619a26b4d687
dashboard link: https://syzkaller.appspot.com/bug?extid=94cc2a66fc228b23f360
compiler:       gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2
patch:          https://syzkaller.appspot.com/x/patch.diff?x=16485ff2480000
end of thread, other threads:[~2023-01-04 8:15 UTC | newest]

Thread overview: 18+ messages
2019-05-16 5:46 WARNING: locking bug in inet_autobind syzbot
2019-05-21 8:31 ` syzbot
2019-05-22 3:16 ` syzbot
  [not found] ` <0000000000008b645c058971629b-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
2019-05-22 3:21 ` Zhao, Yong
2022-09-18 15:52 ` Tetsuo Handa
2022-09-18 18:25 ` Boqun Feng
2022-09-19 5:02 ` Tetsuo Handa
2022-09-27 13:00 ` Tetsuo Handa
2022-11-22 18:02 ` Jakub Sitnicki
2022-12-29 6:26 ` [syzbot] " syzbot
2023-01-03 15:39 ` Felix Kuehling
2023-01-03 16:05 ` Waiman Long
2023-01-03 16:20 ` Felix Kuehling
2023-01-03 22:07 ` Tetsuo Handa
2023-01-03 22:07 ` Tetsuo Handa
2023-01-03 22:07 ` Tetsuo Handa
2023-01-03 22:12 ` Eric Dumazet
  [not found] <20221229101603.2931-1-hdanton@sina.com>
2022-12-29 10:43 ` syzbot