* [syzbot] BUG: sleeping function called from invalid context in _copy_to_iter
@ 2021-08-08 23:38 syzbot
  2021-08-09 17:32 ` Shoaib Rao
  0 siblings, 1 reply; 21+ messages in thread
From: syzbot @ 2021-08-08 23:38 UTC
To: andrii, ast, bpf, christian.brauner, cong.wang, daniel, davem, edumazet, jamorris, john.fastabend, kafai, kpsingh, kuba, linux-kernel, linux-kselftest, netdev, rao.shoaib, shuah, songliubraving, syzkaller-bugs, viro, yhs

Hello,

syzbot found the following issue on:

HEAD commit:    c2eecaa193ff pktgen: Remove redundant clone_skb override
git tree:       net-next
console output: https://syzkaller.appspot.com/x/log.txt?x=12e3a69e300000
kernel config:  https://syzkaller.appspot.com/x/.config?x=aba0c23f8230e048
dashboard link: https://syzkaller.appspot.com/bug?extid=8760ca6c1ee783ac4abd
compiler:       gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.1
syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=15c5b104300000
C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=10062aaa300000

The issue was bisected to:

commit 314001f0bf927015e459c9d387d62a231fe93af3
Author: Rao Shoaib <rao.shoaib@oracle.com>
Date:   Sun Aug 1 07:57:07 2021 +0000

    af_unix: Add OOB support

bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=10765f8e300000
final oops:     https://syzkaller.appspot.com/x/report.txt?x=12765f8e300000
console output: https://syzkaller.appspot.com/x/log.txt?x=14765f8e300000

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+8760ca6c1ee783ac4abd@syzkaller.appspotmail.com
Fixes: 314001f0bf92 ("af_unix: Add OOB support")

BUG: sleeping function called from invalid context at lib/iov_iter.c:619
in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 8443, name: syz-executor700
2 locks held by syz-executor700/8443:
 #0: ffff888028fa0d00 (&u->iolock){+.+.}-{3:3}, at: unix_stream_read_generic+0x16c6/0x2190 net/unix/af_unix.c:2501
 #1: ffff888028fa0df0 (&u->lock){+.+.}-{2:2}, at: spin_lock include/linux/spinlock.h:354 [inline]
 #1: ffff888028fa0df0 (&u->lock){+.+.}-{2:2}, at: unix_stream_read_generic+0x16d0/0x2190 net/unix/af_unix.c:2502
Preemption disabled at:
[<0000000000000000>] 0x0
CPU: 1 PID: 8443 Comm: syz-executor700 Not tainted 5.14.0-rc3-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
 __dump_stack lib/dump_stack.c:88 [inline]
 dump_stack_lvl+0xcd/0x134 lib/dump_stack.c:105
 ___might_sleep.cold+0x1f1/0x237 kernel/sched/core.c:9154
 __might_fault+0x6e/0x180 mm/memory.c:5258
 _copy_to_iter+0x199/0x1600 lib/iov_iter.c:619
 copy_to_iter include/linux/uio.h:139 [inline]
 simple_copy_to_iter+0x4c/0x70 net/core/datagram.c:519
 __skb_datagram_iter+0x10f/0x770 net/core/datagram.c:425
 skb_copy_datagram_iter+0x40/0x50 net/core/datagram.c:533
 skb_copy_datagram_msg include/linux/skbuff.h:3620 [inline]
 unix_stream_read_actor+0x78/0xc0 net/unix/af_unix.c:2701
 unix_stream_recv_urg net/unix/af_unix.c:2433 [inline]
 unix_stream_read_generic+0x17cd/0x2190 net/unix/af_unix.c:2504
 unix_stream_recvmsg+0xb1/0xf0 net/unix/af_unix.c:2717
 sock_recvmsg_nosec net/socket.c:944 [inline]
 sock_recvmsg net/socket.c:962 [inline]
 sock_recvmsg net/socket.c:958 [inline]
 ____sys_recvmsg+0x2c4/0x600 net/socket.c:2622
 ___sys_recvmsg+0x127/0x200 net/socket.c:2664
 do_recvmmsg+0x24d/0x6d0 net/socket.c:2758
 __sys_recvmmsg net/socket.c:2837 [inline]
 __do_sys_recvmmsg net/socket.c:2860 [inline]
 __se_sys_recvmmsg net/socket.c:2853 [inline]
 __x64_sys_recvmmsg+0x20b/0x260 net/socket.c:2853
 do_syscall_x64 arch/x86/entry/common.c:50 [inline]
 do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
 entry_SYSCALL_64_after_hwframe+0x44/0xae
RIP: 0033:0x43ef39
Code: 28 c3 e8 2a 14 00 00 66 2e 0f 1f 84 00 00 00 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 c0 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007ffca8776d68 EFLAGS: 00000246 ORIG_RAX: 000000000000012b
RAX: ffffffffffffffda RBX: 0000000000400488 RCX: 000000000043ef39
RDX: 0000000000000700 RSI: 0000000020001140 RDI: 0000000000000004
RBP: 0000000000402f20 R08: 0000000000000000 R09: 0000000000400488
R10: 0000000000000007 R11: 0000000000000246 R12: 0000000000402fb0
R13: 0000000000000000 R14: 00000000004ac018 R15: 0000000000400488

=============================
[ BUG: Invalid wait context ]
5.14.0-rc3-syzkaller #0 Tainted: G W
-----------------------------
syz-executor700/8443 is trying to lock:
ffff8880212b6a28 (&mm->mmap_lock#2){++++}-{3:3}, at: __might_fault+0xa3/0x180 mm/memory.c:5260
other info that might help us debug this:
context-{4:4}
2 locks held by syz-executor700/8443:
 #0: ffff888028fa0d00 (&u->iolock){+.+.}-{3:3}, at: unix_stream_read_generic+0x16c6/0x2190 net/unix/af_unix.c:2501
 #1: ffff888028fa0df0 (&u->lock){+.+.}-{2:2}, at: spin_lock include/linux/spinlock.h:354 [inline]
 #1: ffff888028fa0df0 (&u->lock){+.+.}-{2:2}, at: unix_stream_read_generic+0x16d0/0x2190 net/unix/af_unix.c:2502
stack backtrace:
CPU: 1 PID: 8443 Comm: syz-executor700 Tainted: G W 5.14.0-rc3-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
 __dump_stack lib/dump_stack.c:88 [inline]
 dump_stack_lvl+0xcd/0x134 lib/dump_stack.c:105
 print_lock_invalid_wait_context kernel/locking/lockdep.c:4666 [inline]
 check_wait_context kernel/locking/lockdep.c:4727 [inline]
 __lock_acquire.cold+0x213/0x3ab kernel/locking/lockdep.c:4965
 lock_acquire kernel/locking/lockdep.c:5625 [inline]
 lock_acquire+0x1ab/0x510 kernel/locking/lockdep.c:5590
 __might_fault mm/memory.c:5261 [inline]
 __might_fault+0x106/0x180 mm/memory.c:5246
 _copy_to_iter+0x199/0x1600 lib/iov_iter.c:619
 copy_to_iter include/linux/uio.h:139 [inline]
 simple_copy_to_iter+0x4c/0x70 net/core/datagram.c:519
 __skb_datagram_iter+0x10f/0x770 net/core/datagram.c:425
 skb_copy_datagram_iter+0x40/0x50 net/core/datagram.c:533
 skb_copy_datagram_msg include/linux/skbuff.h:3620 [inline]
 unix_stream_read_actor+0x78/0xc0 net/unix/af_unix.c:2701
 unix_stream_recv_urg net/unix/af_unix.c:2433 [inline]
 unix_stream_read_generic+0x17cd/0x2190 net/unix/af_unix.c:2504
 unix_stream_recvmsg+0xb1/0xf0 net/unix/af_unix.c:2717
 sock_recvmsg_nosec net/socket.c:944 [inline]
 sock_recvmsg net/socket.c:962 [inline]
 sock_recvmsg net/socket.c:958 [inline]
 ____sys_recvmsg+0x2c4/0x600 net/socket.c:2622
 ___sys_recvmsg+0x127/0x200 net/socket.c:2664
 do_recvmmsg+0x24d/0x6d0 net/socket.c:2758
 __sys_recvmmsg net/socket.c:2837 [inline]
 __do_sys_recvmmsg net/socket.c:2860 [inline]
 __se_sys_recvmmsg net/socket.c:2853 [inline]
 __x64_sys_recvmmsg+0x20b/0x260 net/socket.c:2853
 do_syscall_x64 arch/x86/entry/common.c:50 [inline]
 do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
 entry_SYSCALL_64_after_hwframe+0x44/0xae
RIP: 0033:0x43ef39
Code: 28 c3 e8 2a 14 00 00 66 2e 0f 1f 84 00 00 00 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 c0 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007ffca8776d68 EFLAGS: 00000246 ORIG_RAX: 000000000000012b
RAX: ffffffffffffffda RBX: 0000000000400488 RCX: 000000000043ef39
RDX: 0000000000000700 RSI: 0000000020001140 RDI: 0000000000000004
RBP: 0000000000402f20 R08: 0000000000000000 R09: 0000000000400488
R10: 0000000000000007 R11: 0000000000000246 R12: 0000

---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkaller@googlegroups.com.

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
For information about bisection process see: https://goo.gl/tpsmEJ#bisection
syzbot can test patches for this issue, for details see:
https://goo.gl/tpsmEJ#testing-patches
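
For readers unfamiliar with this class of report, the first splat above can be read as follows: the copy to user memory is reached with in_atomic() == 1 because the spinlock &u->lock is held, and the check fires on the calling context alone, before any page fault has happened. The sketch below is a userspace model of that idea, not kernel code. Every model_* name and both read_* helpers are hypothetical, and the model assumes a configuration in which taking a spinlock disables preemption while taking a mutex such as &u->iolock does not.

```c
/* Userspace model of the kernel's atomic-context check. Not kernel code:
 * every model_* name below is hypothetical and only mimics the idea. */
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

static int preempt_count;   /* > 0 models in_atomic() == 1 */
static int splats;          /* number of times the check fired */

/* A spinlock disables preemption, so the holder must not sleep. */
static void model_spin_lock(void)   { preempt_count++; }
static void model_spin_unlock(void) { assert(preempt_count > 0); preempt_count--; }

/* A mutex is a sleeping lock; holding it does not make the context atomic. */
static void model_mutex_lock(void)   { }
static void model_mutex_unlock(void) { }

/* Fires on the calling context alone, before anything has slept. */
static void model_might_sleep(const char *where)
{
	if (preempt_count > 0) {
		splats++;
		fprintf(stderr,
			"BUG: sleeping function called from invalid context at %s\n",
			where);
	}
}

/* A copy to user memory may fault and sleep, so it is annotated up front. */
static size_t model_copy_to_user(char *dst, const char *src, size_t len)
{
	model_might_sleep("model_copy_to_user");
	memcpy(dst, src, len);   /* the copy itself succeeds in this model */
	return len;
}

/* Shape of a read path that holds only the mutex during the copy. */
int read_under_mutex(char *dst, const char *src, size_t len)
{
	int before = splats;

	model_mutex_lock();
	model_copy_to_user(dst, src, len);
	model_mutex_unlock();
	return splats - before;   /* number of warnings raised by this call */
}

/* Shape of the reported path: mutex, then spinlock, then the copy. */
int read_under_spinlock(char *dst, const char *src, size_t len)
{
	int before = splats;

	model_mutex_lock();
	model_spin_lock();
	model_copy_to_user(dst, src, len);
	model_spin_unlock();
	model_mutex_unlock();
	return splats - before;   /* number of warnings raised by this call */
}
```

In this model both helpers copy the byte successfully, but only read_under_spinlock() raises the warning. That mirrors the two held locks listed in the report: the warning depends on which lock is held around the copy, not on whether the copy actually faults.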
* Re: [syzbot] BUG: sleeping function called from invalid context in _copy_to_iter
  2021-08-08 23:38 [syzbot] BUG: sleeping function called from invalid context in _copy_to_iter syzbot
@ 2021-08-09 17:32 ` Shoaib Rao
  2021-08-09 18:06   ` Dmitry Vyukov
  0 siblings, 1 reply; 21+ messages in thread
From: Shoaib Rao @ 2021-08-09 17:32 UTC
To: syzbot, andrii, ast, bpf, christian.brauner, cong.wang, daniel, davem, edumazet, jamorris, john.fastabend, kafai, kpsingh, kuba, linux-kernel, linux-kselftest, netdev, shuah, songliubraving, syzkaller-bugs, viro, yhs

This seems like a false positive.

1) The function will not sleep, because it only calls the copy routine if the byte is present.
2) There is no difference between this new call and the older calls in unix_stream_read_generic().

Shoaib

On 8/8/21 4:38 PM, syzbot wrote:
> Hello,
>
> syzbot found the following issue on:
>
> [...]
* Re: [syzbot] BUG: sleeping function called from invalid context in _copy_to_iter
  2021-08-09 17:32 ` Shoaib Rao
@ 2021-08-09 18:06   ` Dmitry Vyukov
  2021-08-09 19:16     ` Shoaib Rao
  0 siblings, 1 reply; 21+ messages in thread
From: Dmitry Vyukov @ 2021-08-09 18:06 UTC
To: Shoaib Rao
Cc: syzbot, andrii, ast, bpf, christian.brauner, cong.wang, daniel, davem, edumazet, jamorris, john.fastabend, kafai, kpsingh, kuba, linux-kernel, linux-kselftest, netdev, shuah, songliubraving, syzkaller-bugs, viro, yhs

On Mon, 9 Aug 2021 at 19:33, Shoaib Rao <rao.shoaib@oracle.com> wrote:
>
> This seems like a false positive. 1) The function will not sleep because
> it only calls copy routine if the byte is present. 2). There is no
> difference between this new call and the older calls in
> unix_stream_read_generic().

Hi Shoaib,

Thanks for looking into this.
Do you have any ideas on how to fix this tool's false positive? Tools with false positives are an order of magnitude less useful than tools without false positives. For example, do we turn it off on syzbot? But I don't remember any other false positives from the "sleeping function called from invalid context" checker...

> On 8/8/21 4:38 PM, syzbot wrote:
> > Hello,
> >
> > syzbot found the following issue on:
> >
> > [...]
* Re: [syzbot] BUG: sleeping function called from invalid context in _copy_to_iter
  2021-08-09 18:06 ` Dmitry Vyukov
@ 2021-08-09 19:16   ` Shoaib Rao
  2021-08-09 19:21     ` Dmitry Vyukov
  2021-08-09 19:57     ` Al Viro
  0 siblings, 2 replies; 21+ messages in thread
From: Shoaib Rao @ 2021-08-09 19:16 UTC
To: Dmitry Vyukov
Cc: syzbot, andrii, ast, bpf, christian.brauner, cong.wang, daniel, davem, edumazet, jamorris, john.fastabend, kafai, kpsingh, kuba, linux-kernel, linux-kselftest, netdev, shuah, songliubraving, syzkaller-bugs, viro, yhs

On 8/9/21 11:06 AM, Dmitry Vyukov wrote:
> On Mon, 9 Aug 2021 at 19:33, Shoaib Rao <rao.shoaib@oracle.com> wrote:
>> This seems like a false positive. 1) The function will not sleep because
>> it only calls copy routine if the byte is present. 2). There is no
>> difference between this new call and the older calls in
>> unix_stream_read_generic().
> Hi Shoaib,
>
> Thanks for looking into this.
> Do you have any ideas on how to fix this tool's false positive? Tools
> with false positives are order of magnitude less useful than tools w/o
> false positives. E.g. do we turn it off on syzbot? But I don't
> remember any other false positives from "sleeping function called from
> invalid context" checker...

Before we take any action, I would like to understand why the tool does not single out the other calls to recv_actor in unix_stream_read_generic(). The context is the same in all cases. I also do not understand why the code would sleep. Let's assume the user-provided address is bad: the code will return EFAULT and will never sleep. If the kernel-provided address is bad, the system will panic. The only difference I see is that the new code holds two locks while the previous code held one, but the locks are acquired before the call to copy. So please help me understand how the tool works. Even though I have evaluated the code carefully, there is always a possibility that the tool is correct.

Shoaib

>
>> On 8/8/21 4:38 PM, syzbot wrote:
>>> Hello,
>>>
>>> syzbot found the following issue on:
>>>
>>> HEAD commit:    c2eecaa193ff pktgen: Remove redundant clone_skb override
>>> git tree:       net-next
>>> console output: https://syzkaller.appspot.com/x/log.txt?x=12e3a69e300000
>>> kernel config:  https://syzkaller.appspot.com/x/.config?x=aba0c23f8230e048
>>> dashboard link: https://syzkaller.appspot.com/bug?extid=8760ca6c1ee783ac4abd
>>> compiler:       gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.1
>>> syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=15c5b104300000
>>> C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=10062aaa300000
>>>
>>> The issue was bisected to:
>>>
>>> commit 314001f0bf927015e459c9d387d62a231fe93af3
>>> Author: Rao Shoaib <rao.shoaib@oracle.com>
>>> Date:   Sun Aug 1 07:57:07 2021 +0000
>>>
>>>     af_unix: Add OOB support
>>>
>>> bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=10765f8e300000
>>> final oops:     https://syzkaller.appspot.com/x/report.txt?x=12765f8e300000
>>> console output: https://syzkaller.appspot.com/x/log.txt?x=14765f8e300000
>>>
>>> IMPORTANT: if you fix the issue, please add the following tag to the commit:
>>> Reported-by: syzbot+8760ca6c1ee783ac4abd@syzkaller.appspotmail.com
>>> Fixes: 314001f0bf92 ("af_unix: Add OOB support")
>>>
>>> BUG: sleeping function called from invalid context at lib/iov_iter.c:619
>>> in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 8443, name: syz-executor700
>>> 2 locks held by syz-executor700/8443:
>>>  #0: ffff888028fa0d00 (&u->iolock){+.+.}-{3:3}, at: unix_stream_read_generic+0x16c6/0x2190 net/unix/af_unix.c:2501
>>>  #1: ffff888028fa0df0 (&u->lock){+.+.}-{2:2}, at: spin_lock include/linux/spinlock.h:354 [inline]
>>>  #1: ffff888028fa0df0 (&u->lock){+.+.}-{2:2}, at: unix_stream_read_generic+0x16d0/0x2190 net/unix/af_unix.c:2502
>>> Preemption disabled at:
>>> [<0000000000000000>] 0x0
>>> CPU: 1 PID: 8443 Comm: syz-executor700 Not tainted 5.14.0-rc3-syzkaller #0
>>> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
>>> Call Trace:
>>>  __dump_stack lib/dump_stack.c:88 [inline]
>>>  dump_stack_lvl+0xcd/0x134 lib/dump_stack.c:105
>>>  ___might_sleep.cold+0x1f1/0x237 kernel/sched/core.c:9154
>>>  __might_fault+0x6e/0x180 mm/memory.c:5258
>>>  _copy_to_iter+0x199/0x1600 lib/iov_iter.c:619
>>>  copy_to_iter include/linux/uio.h:139 [inline]
>>>  simple_copy_to_iter+0x4c/0x70 net/core/datagram.c:519
>>>  __skb_datagram_iter+0x10f/0x770 net/core/datagram.c:425
>>>  skb_copy_datagram_iter+0x40/0x50 net/core/datagram.c:533
>>>  skb_copy_datagram_msg include/linux/skbuff.h:3620 [inline]
>>>  unix_stream_read_actor+0x78/0xc0 net/unix/af_unix.c:2701
>>>  unix_stream_recv_urg net/unix/af_unix.c:2433 [inline]
>>>  unix_stream_read_generic+0x17cd/0x2190 net/unix/af_unix.c:2504
>>>  unix_stream_recvmsg+0xb1/0xf0 net/unix/af_unix.c:2717
>>>  sock_recvmsg_nosec net/socket.c:944 [inline]
>>>  sock_recvmsg net/socket.c:962 [inline]
>>>  sock_recvmsg net/socket.c:958 [inline]
>>>  ____sys_recvmsg+0x2c4/0x600 net/socket.c:2622
>>>  ___sys_recvmsg+0x127/0x200 net/socket.c:2664
>>>  do_recvmmsg+0x24d/0x6d0 net/socket.c:2758
>>>  __sys_recvmmsg net/socket.c:2837 [inline]
>>>  __do_sys_recvmmsg net/socket.c:2860 [inline]
>>>  __se_sys_recvmmsg net/socket.c:2853 [inline]
>>>  __x64_sys_recvmmsg+0x20b/0x260 net/socket.c:2853
>>>  do_syscall_x64 arch/x86/entry/common.c:50 [inline]
>>>  do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
>>>  entry_SYSCALL_64_after_hwframe+0x44/0xae
>>> RIP: 0033:0x43ef39
>>> Code: 28 c3 e8 2a 14 00 00 66 2e 0f 1f 84 00 00 00 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 c0 ff ff ff f7 d8 64 89 01 48
>>> RSP: 002b:00007ffca8776d68 EFLAGS: 00000246 ORIG_RAX: 000000000000012b
>>> RAX: ffffffffffffffda RBX: 0000000000400488 RCX: 000000000043ef39
>>> RDX: 0000000000000700 RSI: 0000000020001140 RDI: 0000000000000004
>>> RBP: 0000000000402f20 R08: 0000000000000000 R09: 0000000000400488
>>> R10: 0000000000000007 R11: 0000000000000246 R12: 0000000000402fb0
>>> R13: 0000000000000000 R14: 00000000004ac018 R15: 0000000000400488
>>>
>>> =============================
>>> [ BUG: Invalid wait context ]
>>> 5.14.0-rc3-syzkaller #0 Tainted: G W
>>> -----------------------------
>>> syz-executor700/8443 is trying to lock:
>>> ffff8880212b6a28 (&mm->mmap_lock#2){++++}-{3:3}, at: __might_fault+0xa3/0x180 mm/memory.c:5260
>>> other info that might help us debug this:
>>> context-{4:4}
>>> 2 locks held by syz-executor700/8443:
>>>  #0: ffff888028fa0d00 (&u->iolock){+.+.}-{3:3}, at: unix_stream_read_generic+0x16c6/0x2190 net/unix/af_unix.c:2501
>>>  #1: ffff888028fa0df0 (&u->lock){+.+.}-{2:2}, at: spin_lock include/linux/spinlock.h:354 [inline]
>>>  #1: ffff888028fa0df0 (&u->lock){+.+.}-{2:2}, at: unix_stream_read_generic+0x16d0/0x2190 net/unix/af_unix.c:2502
>>> stack backtrace:
>>> CPU: 1 PID: 8443 Comm: syz-executor700 Tainted: G W 5.14.0-rc3-syzkaller #0
>>> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
>>> Call Trace:
>>>  __dump_stack lib/dump_stack.c:88 [inline]
>>>  dump_stack_lvl+0xcd/0x134 lib/dump_stack.c:105
>>>  print_lock_invalid_wait_context kernel/locking/lockdep.c:4666 [inline]
>>>  check_wait_context kernel/locking/lockdep.c:4727 [inline]
>>>  __lock_acquire.cold+0x213/0x3ab kernel/locking/lockdep.c:4965
>>>  lock_acquire kernel/locking/lockdep.c:5625 [inline]
>>>  lock_acquire+0x1ab/0x510 kernel/locking/lockdep.c:5590
>>>  __might_fault mm/memory.c:5261 [inline]
>>>  __might_fault+0x106/0x180 mm/memory.c:5246
>>>  _copy_to_iter+0x199/0x1600 lib/iov_iter.c:619
>>>  copy_to_iter include/linux/uio.h:139 [inline]
>>>  simple_copy_to_iter+0x4c/0x70 net/core/datagram.c:519
>>>  __skb_datagram_iter+0x10f/0x770 net/core/datagram.c:425
>>>  skb_copy_datagram_iter+0x40/0x50 net/core/datagram.c:533
>>>  skb_copy_datagram_msg include/linux/skbuff.h:3620 [inline]
>>>  unix_stream_read_actor+0x78/0xc0 net/unix/af_unix.c:2701
>>>  unix_stream_recv_urg net/unix/af_unix.c:2433 [inline]
>>>  unix_stream_read_generic+0x17cd/0x2190 net/unix/af_unix.c:2504
>>>  unix_stream_recvmsg+0xb1/0xf0 net/unix/af_unix.c:2717
>>>  sock_recvmsg_nosec net/socket.c:944 [inline]
>>>  sock_recvmsg net/socket.c:962 [inline]
>>>  sock_recvmsg net/socket.c:958 [inline]
>>>  ____sys_recvmsg+0x2c4/0x600 net/socket.c:2622
>>>  ___sys_recvmsg+0x127/0x200 net/socket.c:2664
>>>  do_recvmmsg+0x24d/0x6d0 net/socket.c:2758
>>>  __sys_recvmmsg net/socket.c:2837 [inline]
>>>  __do_sys_recvmmsg net/socket.c:2860 [inline]
>>>  __se_sys_recvmmsg net/socket.c:2853 [inline]
>>>  __x64_sys_recvmmsg+0x20b/0x260 net/socket.c:2853
>>>  do_syscall_x64 arch/x86/entry/common.c:50 [inline]
>>>  do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
>>>  entry_SYSCALL_64_after_hwframe+0x44/0xae
>>> RIP: 0033:0x43ef39
>>> Code: 28 c3 e8 2a 14 00 00 66 2e 0f 1f 84
00 00 00 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 c0 ff ff ff f7 d8 64 89 01 48 >>> RSP: 002b:00007ffca8776d68 EFLAGS: 00000246 ORIG_RAX: 000000000000012b >>> RAX: ffffffffffffffda RBX: 0000000000400488 RCX: 000000000043ef39 >>> RDX: 0000000000000700 RSI: 0000000020001140 RDI: 0000000000000004 >>> RBP: 0000000000402f20 R08: 0000000000000000 R09: 0000000000400488 >>> R10: 0000000000000007 R11: 0000000000000246 R12: 0000 >>> >>> >>> --- >>> This report is generated by a bot. It may contain errors. >>> See https://urldefense.com/v3/__https://goo.gl/tpsmEJ__;!!ACWV5N9M2RV99hQ!fbn9ny5Bw51Jl6yrU93iULDBXa_DPjyVIgQuZWyQbCo5IRkAzvYs6JKlPG1UhbpZ$ for more information about syzbot. >>> syzbot engineers can be reached at syzkaller@googlegroups.com. >>> >>> syzbot will keep track of this issue. See: >>> https://urldefense.com/v3/__https://goo.gl/tpsmEJ*status__;Iw!!ACWV5N9M2RV99hQ!fbn9ny5Bw51Jl6yrU93iULDBXa_DPjyVIgQuZWyQbCo5IRkAzvYs6JKlPKlEx5v1$ for how to communicate with syzbot. >>> For information about bisection process see: https://urldefense.com/v3/__https://goo.gl/tpsmEJ*bisection__;Iw!!ACWV5N9M2RV99hQ!fbn9ny5Bw51Jl6yrU93iULDBXa_DPjyVIgQuZWyQbCo5IRkAzvYs6JKlPJk7KaIr$ >>> syzbot can test patches for this issue, for details see: >>> https://urldefense.com/v3/__https://goo.gl/tpsmEJ*testing-patches__;Iw!!ACWV5N9M2RV99hQ!fbn9ny5Bw51Jl6yrU93iULDBXa_DPjyVIgQuZWyQbCo5IRkAzvYs6JKlPMhq2hD3$ >> -- >> You received this message because you are subscribed to the Google Groups "syzkaller-bugs" group. >> To unsubscribe from this group and stop receiving emails from it, send an email to syzkaller-bugs+unsubscribe@googlegroups.com. >> To view this discussion on the web visit https://urldefense.com/v3/__https://groups.google.com/d/msgid/syzkaller-bugs/0c106e6c-672f-474e-5815-97b65596139d*40oracle.com__;JQ!!ACWV5N9M2RV99hQ!fbn9ny5Bw51Jl6yrU93iULDBXa_DPjyVIgQuZWyQbCo5IRkAzvYs6JKlPHjmYAGZ$ . 
* Re: [syzbot] BUG: sleeping function called from invalid context in _copy_to_iter
  2021-08-09 19:16 ` Shoaib Rao
@ 2021-08-09 19:21 ` Dmitry Vyukov
  2021-08-09 19:40 ` Shoaib Rao
  2021-08-09 19:57 ` Al Viro
  1 sibling, 1 reply; 21+ messages in thread
From: Dmitry Vyukov @ 2021-08-09 19:21 UTC (permalink / raw)
  To: Shoaib Rao
  Cc: syzbot, andrii, ast, bpf, christian.brauner, cong.wang, daniel,
	davem, edumazet, jamorris, john.fastabend, kafai, kpsingh, kuba,
	linux-kernel, linux-kselftest, netdev, shuah, songliubraving,
	syzkaller-bugs, viro, yhs

On Mon, 9 Aug 2021 at 21:16, Shoaib Rao <rao.shoaib@oracle.com> wrote:
> On 8/9/21 11:06 AM, Dmitry Vyukov wrote:
> > On Mon, 9 Aug 2021 at 19:33, Shoaib Rao <rao.shoaib@oracle.com> wrote:
> >> This seems like a false positive. 1) The function will not sleep because
> >> it only calls copy routine if the byte is present. 2). There is no
> >> difference between this new call and the older calls in
> >> unix_stream_read_generic().
> > Hi Shoaib,
> >
> > Thanks for looking into this.
> > Do you have any ideas on how to fix this tool's false positive? Tools
> > with false positives are order of magnitude less useful than tools w/o
> > false positives. E.g. do we turn it off on syzbot? But I don't
> > remember any other false positives from "sleeping function called from
> > invalid context" checker...
>
> Before we take any action I would like to understand why the tool does
> not single out other calls to recv_actor in unix_stream_read_generic().
> The context in all cases is the same. I also do not understand why the
> code would sleep, Let's assume the user provided address is bad, the
> code will return EFAULT, it will never sleep,

I always assumed that it's because if user pages are swapped out, it
may need to read them back from disk.

> if the kernel provided
> address is bad the system will panic.
> The only difference I see is that
> the new code holds 2 locks while the previous code held one lock, but
> the locks are acquired before the call to copy.
>
> So please help me understand how the tool works. Even though I have
> evaluated the code carefully, there is always a possibility that the
> tool is correct.
>
> Shoaib
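The point that a copy to a perfectly valid user address can still fault (and therefore may have to wait) can be observed from user space. The following is only a sketch under stated assumptions: it runs on Linux, `faults_on_first_touch()` is a hypothetical helper name, and it counts the page faults reported by `getrusage()` while freshly mapped anonymous pages are touched for the first time. It does not reproduce the kernel path from the report.

```c
#define _DEFAULT_SOURCE /* for MAP_ANONYMOUS with glibc */
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/resource.h>
#include <unistd.h>

/*
 * Hypothetical helper: map npages fresh anonymous pages, write one byte
 * to each, and return how many page faults (minor + major) the process
 * took while doing so. Returns -1 on error.
 */
long faults_on_first_touch(size_t npages)
{
	long page = sysconf(_SC_PAGESIZE);
	struct rusage before, after;
	volatile char *buf;
	void *map;
	size_t len, i;

	if (page <= 0 || npages == 0)
		return -1;
	len = npages * (size_t)page;

	map = mmap(NULL, len, PROT_READ | PROT_WRITE,
		   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (map == MAP_FAILED)
		return -1;
	buf = map;

	if (getrusage(RUSAGE_SELF, &before) != 0) {
		munmap(map, len);
		return -1;
	}

	/* The addresses are valid, yet the first write to each page faults. */
	for (i = 0; i < npages; i++)
		buf[i * (size_t)page] = 1;

	if (getrusage(RUSAGE_SELF, &after) != 0) {
		munmap(map, len);
		return -1;
	}
	munmap(map, len);

	/* Fault counters only ever grow. */
	assert(after.ru_minflt >= before.ru_minflt);
	assert(after.ru_majflt >= before.ru_majflt);

	return (after.ru_minflt - before.ru_minflt) +
	       (after.ru_majflt - before.ru_majflt);
}
```

On a typical Linux system `faults_on_first_touch(64)` is expected to return a nonzero count, although the exact number depends on the kernel configuration (for example transparent huge pages). When the same kind of fault happens inside a kernel copy to user memory, servicing it may involve waiting, which is why the copy helpers are treated as functions that may sleep.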
* Re: [syzbot] BUG: sleeping function called from invalid context in _copy_to_iter
  2021-08-09 19:21 ` Dmitry Vyukov
@ 2021-08-09 19:40 ` Shoaib Rao
  2021-08-09 20:02 ` Eric Dumazet
  2021-08-09 20:04 ` Al Viro
  0 siblings, 2 replies; 21+ messages in thread
From: Shoaib Rao @ 2021-08-09 19:40 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: syzbot, andrii, ast, bpf, christian.brauner, cong.wang, daniel,
	davem, edumazet, jamorris, john.fastabend, kafai, kpsingh, kuba,
	linux-kernel, linux-kselftest, netdev, shuah, songliubraving,
	syzkaller-bugs, viro, yhs

On 8/9/21 12:21 PM, Dmitry Vyukov wrote:
> On Mon, 9 Aug 2021 at 21:16, Shoaib Rao <rao.shoaib@oracle.com> wrote:
>> On 8/9/21 11:06 AM, Dmitry Vyukov wrote:
>>> On Mon, 9 Aug 2021 at 19:33, Shoaib Rao <rao.shoaib@oracle.com> wrote:
>>>> This seems like a false positive. 1) The function will not sleep because
>>>> it only calls copy routine if the byte is present. 2). There is no
>>>> difference between this new call and the older calls in
>>>> unix_stream_read_generic().
>>> Hi Shoaib,
>>>
>>> Thanks for looking into this.
>>> Do you have any ideas on how to fix this tool's false positive? Tools
>>> with false positives are order of magnitude less useful than tools w/o
>>> false positives. E.g. do we turn it off on syzbot? But I don't
>>> remember any other false positives from "sleeping function called from
>>> invalid context" checker...
>> Before we take any action I would like to understand why the tool does
>> not single out other calls to recv_actor in unix_stream_read_generic().
>> The context in all cases is the same. I also do not understand why the
>> code would sleep, Let's assume the user provided address is bad, the
>> code will return EFAULT, it will never sleep,
> I always assumed that it's because if user pages are swapped out, it
> may need to read them back from disk.

Page faults occur all the time, the page may not even be in the cache or
the mapping is not there (mmap), so I would not consider this a bug.
The code should complain about all other calls as they are also copying to
user pages. I must not be following some semantics for the code to be
triggered but I can not figure that out. What is the recommended
interface to do user copy from kernel?

Shoaib

>
>> if the kernel provided
>> address is bad the system will panic. The only difference I see is that
>> the new code holds 2 locks while the previous code held one lock, but
>> the locks are acquired before the call to copy.
>>
>> So please help me understand how the tool works. Even though I have
>> evaluated the code carefully, there is always a possibility that the
>> tool is correct.
>>
>> Shoaib
* Re: [syzbot] BUG: sleeping function called from invalid context in _copy_to_iter
  2021-08-09 19:40 ` Shoaib Rao
@ 2021-08-09 20:02 ` Eric Dumazet
  2021-08-09 20:09 ` Eric Dumazet
  2021-08-09 20:04 ` Al Viro
  1 sibling, 1 reply; 21+ messages in thread
From: Eric Dumazet @ 2021-08-09 20:02 UTC (permalink / raw)
  To: Shoaib Rao
  Cc: Dmitry Vyukov, syzbot, Andrii Nakryiko, Alexei Starovoitov, bpf,
	Christian Brauner, Cong Wang, Daniel Borkmann, David Miller,
	jamorris, John Fastabend, Martin KaFai Lau, kpsingh,
	Jakub Kicinski, LKML, open list:KERNEL SELFTEST FRAMEWORK, netdev,
	Shuah Khan, Song Liu, syzkaller-bugs, Al Viro, Yonghong Song

On Mon, Aug 9, 2021 at 9:40 PM Shoaib Rao <rao.shoaib@oracle.com> wrote:
>
>
> On 8/9/21 12:21 PM, Dmitry Vyukov wrote:
> > On Mon, 9 Aug 2021 at 21:16, Shoaib Rao <rao.shoaib@oracle.com> wrote:
> >> On 8/9/21 11:06 AM, Dmitry Vyukov wrote:
> >>> On Mon, 9 Aug 2021 at 19:33, Shoaib Rao <rao.shoaib@oracle.com> wrote:
> >>>> This seems like a false positive. 1) The function will not sleep because
> >>>> it only calls copy routine if the byte is present. 2). There is no
> >>>> difference between this new call and the older calls in
> >>>> unix_stream_read_generic().
> >>> Hi Shoaib,
> >>>
> >>> Thanks for looking into this.
> >>> Do you have any ideas on how to fix this tool's false positive? Tools
> >>> with false positives are order of magnitude less useful than tools w/o
> >>> false positives. E.g. do we turn it off on syzbot? But I don't
> >>> remember any other false positives from "sleeping function called from
> >>> invalid context" checker...
> >> Before we take any action I would like to understand why the tool does
> >> not single out other calls to recv_actor in unix_stream_read_generic().
> >> The context in all cases is the same.
> >> I also do not understand why the
> >> code would sleep, Let's assume the user provided address is bad, the
> >> code will return EFAULT, it will never sleep,
> > I always assumed that it's because if user pages are swapped out, it
> > may need to read them back from disk.
>
> Page faults occur all the time, the page may not even be in the cache or
> the mapping is not there (mmap), so I would not consider this a bug. The
> code should complain about all other calls as they are also copying to
> user pages. I must not be following some semantics for the code to be
> triggered but I can not figure that out. What is the recommended
> interface to do user copy from kernel?

Are you aware of the difference between a mutex and a spinlock ?

When copying data from/to user, you can not hold a spinlock.

>
> Shoaib
>
> >
> >> if the kernel provided
> >> address is bad the system will panic. The only difference I see is that
> >> the new code holds 2 locks while the previous code held one lock, but
> >> the locks are acquired before the call to copy.
> >>
> >> So please help me understand how the tool works. Even though I have
> >> evaluated the code carefully, there is always a possibility that the
> >> tool is correct.
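The mutex versus spinlock distinction above can be illustrated with a toy user-space model. This is a sketch only, not kernel code: every `toy_*` name is hypothetical, the "atomic context" is reduced to a single counter, and the assumption that the older call sites hold only the mutex is inferred from the report's lock list rather than stated in the thread. The second variant shows one common way to restructure such code (drop the spinlock before copying); it is not claimed to be the fix that was applied.

```c
#include <assert.h>
#include <stdio.h>

/* Toy user-space model. Every toy_* name is hypothetical, not a kernel API. */
struct toy_task {
	int preempt_count; /* > 0: atomic context, sleeping is not allowed */
	int violations;    /* sleeping calls made while atomic */
};

/* Stand-in for a "may sleep" debug check. */
static void toy_might_sleep(struct toy_task *t, const char *what)
{
	if (t->preempt_count > 0) {
		t->violations++;
		fprintf(stderr, "toy BUG: %s may sleep in atomic context\n", what);
	}
}

/* Spinlock: while it is held, the context is atomic. */
static void toy_spin_lock(struct toy_task *t)
{
	t->preempt_count++;
}

static void toy_spin_unlock(struct toy_task *t)
{
	assert(t->preempt_count > 0);
	t->preempt_count--;
}

/* Mutex: taking it may sleep; holding it leaves the context sleepable. */
static void toy_mutex_lock(struct toy_task *t)
{
	toy_might_sleep(t, "mutex_lock");
}

static void toy_mutex_unlock(struct toy_task *t)
{
	(void)t;
}

/* Copy to user memory: may fault and sleep, so it carries the check. */
static void toy_copy_to_user(struct toy_task *t)
{
	toy_might_sleep(t, "copy_to_user");
}

/* Assumed shape of the older call sites: only the mutex is held. */
int toy_copy_under_mutex_only(void)
{
	struct toy_task t = { 0, 0 };

	toy_mutex_lock(&t);
	toy_copy_to_user(&t);
	toy_mutex_unlock(&t);
	return t.violations;
}

/* Shape shown in the report: mutex, then spinlock, then the copy. */
int toy_copy_under_spinlock(void)
{
	struct toy_task t = { 0, 0 };

	toy_mutex_lock(&t);
	toy_spin_lock(&t);
	toy_copy_to_user(&t); /* flagged: the spinlock is held */
	toy_spin_unlock(&t);
	toy_mutex_unlock(&t);
	return t.violations;
}

/* One possible restructuring: use the spinlock only to pin the data. */
int toy_copy_after_unlock(void)
{
	struct toy_task t = { 0, 0 };

	toy_mutex_lock(&t);
	toy_spin_lock(&t);
	/* ... take a reference to the data here, no copying ... */
	toy_spin_unlock(&t);
	toy_copy_to_user(&t); /* fine: only the mutex is held */
	toy_mutex_unlock(&t);
	return t.violations;
}
```

In this model `toy_copy_under_spinlock()` records one violation while the other two variants record none, which matches the report listing `&u->lock` (a spinlock) among the held locks with `in_atomic(): 1`.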
* Re: [syzbot] BUG: sleeping function called from invalid context in _copy_to_iter
  2021-08-09 20:02             ` Eric Dumazet
@ 2021-08-09 20:09               ` Eric Dumazet
  2021-08-09 20:31                 ` Shoaib Rao
  0 siblings, 1 reply; 21+ messages in thread
From: Eric Dumazet @ 2021-08-09 20:09 UTC (permalink / raw)
  To: Shoaib Rao
  Cc: Dmitry Vyukov, syzbot, Andrii Nakryiko, Alexei Starovoitov, bpf,
      Christian Brauner, Cong Wang, Daniel Borkmann, David Miller,
      jamorris, John Fastabend, Martin KaFai Lau, kpsingh,
      Jakub Kicinski, LKML, open list:KERNEL SELFTEST FRAMEWORK, netdev,
      Shuah Khan, Song Liu, syzkaller-bugs, Al Viro, Yonghong Song

On Mon, Aug 9, 2021 at 10:02 PM Eric Dumazet <edumazet@google.com> wrote:
>
> On Mon, Aug 9, 2021 at 9:40 PM Shoaib Rao <rao.shoaib@oracle.com> wrote:
> >
> >
> > On 8/9/21 12:21 PM, Dmitry Vyukov wrote:
> > > On Mon, 9 Aug 2021 at 21:16, Shoaib Rao <rao.shoaib@oracle.com> wrote:
> > >> On 8/9/21 11:06 AM, Dmitry Vyukov wrote:
> > >>> On Mon, 9 Aug 2021 at 19:33, Shoaib Rao <rao.shoaib@oracle.com> wrote:
> > >>>> This seems like a false positive. 1) The function will not sleep because
> > >>>> it only calls copy routine if the byte is present. 2). There is no
> > >>>> difference between this new call and the older calls in
> > >>>> unix_stream_read_generic().
> > >>> Hi Shoaib,
> > >>>
> > >>> Thanks for looking into this.
> > >>> Do you have any ideas on how to fix this tool's false positive? Tools
> > >>> with false positives are order of magnitude less useful than tools w/o
> > >>> false positives. E.g. do we turn it off on syzbot? But I don't
> > >>> remember any other false positives from "sleeping function called from
> > >>> invalid context" checker...
> > >> Before we take any action I would like to understand why the tool does
> > >> not single out other calls to recv_actor in unix_stream_read_generic().
> > >> The context in all cases is the same. I also do not understand why the
> > >> code would sleep, Let's assume the user provided address is bad, the
> > >> code will return EFAULT, it will never sleep,
> > > I always assumed that it's because if user pages are swapped out, it
> > > may need to read them back from disk.
> >
> > Page faults occur all the time, the page may not even be in the cache or
> > the mapping is not there (mmap), so I would not consider this a bug. The
> > code should complain about all other calls as they are also copying to
> > user pages. I must not be following some semantics for the code to be
> > triggered but I can not figure that out. What is the recommended
> > interface to do user copy from kernel?
>
> Are you aware of the difference between a mutex and a spinlock ?
>
> When copying data from/to user, you can not hold a spinlock.
>
>

I am guessing that even your test would trigger the warning,
if you make sure to include CONFIG_DEBUG_ATOMIC_SLEEP=y in your kernel build.

> >
> > Shoaib
> >
> > >
> > >> if the kernel provided
> > >> address is bad the system will panic. The only difference I see is that
> > >> the new code holds 2 locks while the previous code held one lock, but
> > >> the locks are acquired before the call to copy.
> > >>
> > >> So please help me understand how the tool works. Even though I have
> > >> evaluated the code carefully, there is always a possibility that the
> > >> tool is correct.
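As a rough illustration of why the warning discussed above fires only for the new call: in the report, _copy_to_iter() reaches __might_fault() and ___might_sleep(), and the check complains because in_atomic() is 1 while &u->lock (a spinlock) is held. Holding only &u->iolock (a mutex) does not make the task atomic, which is why the older copies under that mutex are not flagged. The userspace model below mimics that logic with a simple counter. It is a sketch under stated assumptions, not the kernel implementation, and every model_* name is hypothetical.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical userspace model; these are not the kernel's definitions. */
int model_preempt_count;  /* > 0 stands for "in atomic context" */
int model_warnings;       /* number of times the check has fired */

/* Stand-in for might_sleep(): complain when called in atomic context. */
void model_might_sleep(const char *where)
{
	if (model_preempt_count > 0) {
		model_warnings++;
		fprintf(stderr,
			"model: sleeping function called from invalid context at %s\n",
			where);
	}
}

/* Taking a spinlock makes the holder atomic until it is released. */
void model_spin_lock(void)   { model_preempt_count++; }
void model_spin_unlock(void) { model_preempt_count--; }

/* A mutex may itself sleep, but holding one does not make the holder atomic. */
void model_mutex_lock(void)   { model_might_sleep("model_mutex_lock"); }
void model_mutex_unlock(void) { }

/* Stand-in for a copy to user memory, which may fault and therefore sleep. */
void model_copy_to_user(char *dst, const char *src, size_t len)
{
	model_might_sleep("model_copy_to_user");
	memcpy(dst, src, len);
}

/* Shape of the older calls: copy while holding only the mutex. */
void model_recv_under_mutex(char *dst, const char *src, size_t len)
{
	model_mutex_lock();
	model_copy_to_user(dst, src, len);
	model_mutex_unlock();
}

/* Shape of the reported call: copy while also holding the spinlock. */
void model_recv_under_spinlock(char *dst, const char *src, size_t len)
{
	model_mutex_lock();
	model_spin_lock();
	model_copy_to_user(dst, src, len);  /* the check fires here */
	model_spin_unlock();
	model_mutex_unlock();
}
```

Calling model_recv_under_mutex() leaves model_warnings unchanged, while model_recv_under_spinlock() increments it once. As in the kernel, the model only reports the problem; the copy itself still proceeds.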
* Re: [syzbot] BUG: sleeping function called from invalid context in _copy_to_iter
  2021-08-09 20:09               ` Eric Dumazet
@ 2021-08-09 20:31                 ` Shoaib Rao
  2021-08-10  9:19                   ` Eric Dumazet
  0 siblings, 1 reply; 21+ messages in thread
From: Shoaib Rao @ 2021-08-09 20:31 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Dmitry Vyukov, syzbot, Andrii Nakryiko, Alexei Starovoitov, bpf,
      Christian Brauner, Cong Wang, Daniel Borkmann, David Miller,
      jamorris, John Fastabend, Martin KaFai Lau, kpsingh,
      Jakub Kicinski, LKML, open list:KERNEL SELFTEST FRAMEWORK, netdev,
      Shuah Khan, Song Liu, syzkaller-bugs, Al Viro, Yonghong Song

On 8/9/21 1:09 PM, Eric Dumazet wrote:
> On Mon, Aug 9, 2021 at 10:02 PM Eric Dumazet <edumazet@google.com> wrote:
>> On Mon, Aug 9, 2021 at 9:40 PM Shoaib Rao <rao.shoaib@oracle.com> wrote:
>>>
>>> On 8/9/21 12:21 PM, Dmitry Vyukov wrote:
>>>> On Mon, 9 Aug 2021 at 21:16, Shoaib Rao <rao.shoaib@oracle.com> wrote:
>>>>> On 8/9/21 11:06 AM, Dmitry Vyukov wrote:
>>>>>> On Mon, 9 Aug 2021 at 19:33, Shoaib Rao <rao.shoaib@oracle.com> wrote:
>>>>>>> This seems like a false positive. 1) The function will not sleep because
>>>>>>> it only calls copy routine if the byte is present. 2). There is no
>>>>>>> difference between this new call and the older calls in
>>>>>>> unix_stream_read_generic().
>>>>>> Hi Shoaib,
>>>>>>
>>>>>> Thanks for looking into this.
>>>>>> Do you have any ideas on how to fix this tool's false positive? Tools
>>>>>> with false positives are order of magnitude less useful than tools w/o
>>>>>> false positives. E.g. do we turn it off on syzbot? But I don't
>>>>>> remember any other false positives from "sleeping function called from
>>>>>> invalid context" checker...
>>>>> Before we take any action I would like to understand why the tool does
>>>>> not single out other calls to recv_actor in unix_stream_read_generic().
>>>>> The context in all cases is the same. I also do not understand why the
>>>>> code would sleep, Let's assume the user provided address is bad, the
>>>>> code will return EFAULT, it will never sleep,
>>>> I always assumed that it's because if user pages are swapped out, it
>>>> may need to read them back from disk.
>>> Page faults occur all the time, the page may not even be in the cache or
>>> the mapping is not there (mmap), so I would not consider this a bug. The
>>> code should complain about all other calls as they are also copying to
>>> user pages. I must not be following some semantics for the code to be
>>> triggered but I can not figure that out. What is the recommended
>>> interface to do user copy from kernel?
>> Are you aware of the difference between a mutex and a spinlock ?
>>
>> When copying data from/to user, you can not hold a spinlock.
>>
>>
> I am guessing that even your test would trigger the warning,
> if you make sure to include CONFIG_DEBUG_ATOMIC_SLEEP=y in your kernel build.

Eric,

Thanks for the pointer, have you ever overlooked something when coding?

Shoaib

>
>>> Shoaib
>>>
>>>>> if the kernel provided
>>>>> address is bad the system will panic. The only difference I see is that
>>>>> the new code holds 2 locks while the previous code held one lock, but
>>>>> the locks are acquired before the call to copy.
>>>>>
>>>>> So please help me understand how the tool works. Even though I have
>>>>> evaluated the code carefully, there is always a possibility that the
>>>>> tool is correct.
>>>>> >>>>> Shoaib >>>>> >>>>>> >>>>>>> On 8/8/21 4:38 PM, syzbot wrote: >>>>>>>> [...] ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [syzbot] BUG: sleeping function called from invalid context in _copy_to_iter 2021-08-09 20:31 ` Shoaib Rao @ 2021-08-10 9:19 ` Eric Dumazet 2021-08-10 17:50 ` Shoaib Rao 0 siblings, 1 reply; 21+ messages in thread From: Eric Dumazet @ 2021-08-10 9:19 UTC (permalink / raw) To: Shoaib Rao, Eric Dumazet Cc: Dmitry Vyukov, syzbot, Andrii Nakryiko, Alexei Starovoitov, bpf, Christian Brauner, Cong Wang, Daniel Borkmann, David Miller, jamorris, John Fastabend, Martin KaFai Lau, kpsingh, Jakub Kicinski, LKML, open list:KERNEL SELFTEST FRAMEWORK, netdev, Shuah Khan, Song Liu, syzkaller-bugs, Al Viro, Yonghong Song On 8/9/21 10:31 PM, Shoaib Rao wrote: > > On 8/9/21 1:09 PM, Eric Dumazet wrote: >> I am guessing that even your test would trigger the warning, >> if you make sure to include CONFIG_DEBUG_ATOMIC_SLEEP=y in your kernel build. > > Eric, > > Thanks for the pointer, have you ever overlooked something when coding? > I _think_ I was trying to help, not to shame you in any way. My question about spinlock/mutex was not sarcastic; you authored 6 official Linux patches, and there is no evidence of Linux kernel expertise. ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [syzbot] BUG: sleeping function called from invalid context in _copy_to_iter 2021-08-10 9:19 ` Eric Dumazet @ 2021-08-10 17:50 ` Shoaib Rao 2021-08-10 18:02 ` Eric Dumazet 0 siblings, 1 reply; 21+ messages in thread From: Shoaib Rao @ 2021-08-10 17:50 UTC (permalink / raw) To: Eric Dumazet, Eric Dumazet Cc: Dmitry Vyukov, syzbot, Andrii Nakryiko, Alexei Starovoitov, bpf, Christian Brauner, Cong Wang, Daniel Borkmann, David Miller, jamorris, John Fastabend, Martin KaFai Lau, kpsingh, Jakub Kicinski, LKML, open list:KERNEL SELFTEST FRAMEWORK, netdev, Shuah Khan, Song Liu, syzkaller-bugs, Al Viro, Yonghong Song On 8/10/21 2:19 AM, Eric Dumazet wrote: > > On 8/9/21 10:31 PM, Shoaib Rao wrote: >> On 8/9/21 1:09 PM, Eric Dumazet wrote: >>> I am guessing that even your test would trigger the warning, >>> if you make sure to include CONFIG_DEBUG_ATOMIC_SLEEP=y in your kernel build. >> Eric, >> >> Thanks for the pointer, have you ever overlooked something when coding? >> > I _think_ I was trying to help, not to shame you in any way. How did the previous email help? I did not get any reply when I asked what could be the cause. > > My question about spinlock/mutex was not sarcastic; you authored > 6 official Linux patches, and there is no evidence of Linux kernel expertise. That is no measure of someone's understanding. There are other OSes as well. I have worked on Solaris and other *unix* OSes for over 20 years. This was an oversight on my part and I apologize, but instead of questioning my expertise it would have been helpful to say what might have caused it. Shoaib ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [syzbot] BUG: sleeping function called from invalid context in _copy_to_iter 2021-08-10 17:50 ` Shoaib Rao @ 2021-08-10 18:02 ` Eric Dumazet 2021-08-10 18:29 ` Shoaib Rao 0 siblings, 1 reply; 21+ messages in thread From: Eric Dumazet @ 2021-08-10 18:02 UTC (permalink / raw) To: Shoaib Rao Cc: Eric Dumazet, Dmitry Vyukov, syzbot, Andrii Nakryiko, Alexei Starovoitov, bpf, Christian Brauner, Cong Wang, Daniel Borkmann, David Miller, jamorris, John Fastabend, Martin KaFai Lau, kpsingh, Jakub Kicinski, LKML, open list:KERNEL SELFTEST FRAMEWORK, netdev, Shuah Khan, Song Liu, syzkaller-bugs, Al Viro, Yonghong Song On Tue, Aug 10, 2021 at 7:50 PM Shoaib Rao <rao.shoaib@oracle.com> wrote: > > > On 8/10/21 2:19 AM, Eric Dumazet wrote: > > > > On 8/9/21 10:31 PM, Shoaib Rao wrote: > >> On 8/9/21 1:09 PM, Eric Dumazet wrote: > >>> I am guessing that even your test would trigger the warning, > >>> if you make sure to include CONFIG_DEBUG_ATOMIC_SLEEP=y in your kernel build. > >> Eric, > >> > >> Thanks for the pointer, have you ever overlooked something when coding? > >> > > I _think_ I was trying to help, not to shame you in any way. > How did the previous email help? I did not get any reply when I asked > what could be the cause. Which previous email? Are you expecting immediate answers to your emails? I am not working for Oracle. > > > > My question about spinlock/mutex was not sarcastic; you authored > > 6 official Linux patches, and there is no evidence of Linux kernel expertise. > > That is no measure of someone's understanding. There are other OSes as > well. I have worked on Solaris and other *unix* OSes for over 20 years. > This was an oversight on my part and I apologize, but instead of > questioning my expertise it would have been helpful to say what might > have caused it. I sent two emails with _useful_ _information_. If you felt you were attacked, I suggest you take a deep breath, and read my emails without trying to change their intention and meaning. 
If you think my emails were not useful, just ignore them, this is fine by me. ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [syzbot] BUG: sleeping function called from invalid context in _copy_to_iter 2021-08-10 18:02 ` Eric Dumazet @ 2021-08-10 18:29 ` Shoaib Rao 0 siblings, 0 replies; 21+ messages in thread From: Shoaib Rao @ 2021-08-10 18:29 UTC (permalink / raw) To: Eric Dumazet Cc: Eric Dumazet, Dmitry Vyukov, syzbot, Andrii Nakryiko, Alexei Starovoitov, bpf, Christian Brauner, Cong Wang, Daniel Borkmann, David Miller, jamorris, John Fastabend, Martin KaFai Lau, kpsingh, Jakub Kicinski, LKML, open list:KERNEL SELFTEST FRAMEWORK, netdev, Shuah Khan, Song Liu, syzkaller-bugs, Al Viro, Yonghong Song On 8/10/21 11:02 AM, Eric Dumazet wrote: > On Tue, Aug 10, 2021 at 7:50 PM Shoaib Rao <rao.shoaib@oracle.com> wrote: >> >> On 8/10/21 2:19 AM, Eric Dumazet wrote: >>> On 8/9/21 10:31 PM, Shoaib Rao wrote: >>>> On 8/9/21 1:09 PM, Eric Dumazet wrote: >>>>> I am guessing that even your test would trigger the warning, >>>>> if you make sure to include CONFIG_DEBUG_ATOMIC_SLEEP=y in your kernel build. >>>> Eric, >>>> >>>> Thanks for the pointer, have you ever overlooked something when coding? >>>> >>> I _think_ I was trying to help, not to shame you in any way. >> How did the previous email help? I did not get any reply when I asked >> what could be the cause. > Which previous email? Are you expecting immediate answers to your emails? > I am not working for Oracle. > >>> My question about spinlock/mutex was not sarcastic; you authored >>> 6 official Linux patches, and there is no evidence of Linux kernel expertise. >> That is no measure of someone's understanding. There are other OSes as >> well. I have worked on Solaris and other *unix* OSes for over 20 years. >> This was an oversight on my part and I apologize, but instead of >> questioning my expertise it would have been helpful to say what might >> have caused it. > > I sent two emails with _useful_ _information_. 
> > If you felt you were attacked, I suggest you take a deep breath, > and read my emails without trying to change their intention and meaning. > > If you think my emails were not useful, just ignore them, this is fine by me. Hi Eric, I went back and looked at the two emails. You are correct. > Are you aware of the difference between a mutex and a spinlock ? > > When copying data from/to user, you can not hold a spinlock. The second line is useful but the first one was not necessary. Shoaib ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [syzbot] BUG: sleeping function called from invalid context in _copy_to_iter 2021-08-09 19:40 ` Shoaib Rao 2021-08-09 20:02 ` Eric Dumazet @ 2021-08-09 20:04 ` Al Viro 2021-08-09 20:16 ` Al Viro 1 sibling, 1 reply; 21+ messages in thread From: Al Viro @ 2021-08-09 20:04 UTC (permalink / raw) To: Shoaib Rao Cc: Dmitry Vyukov, syzbot, andrii, ast, bpf, christian.brauner, cong.wang, daniel, davem, edumazet, jamorris, john.fastabend, kafai, kpsingh, kuba, linux-kernel, linux-kselftest, netdev, shuah, songliubraving, syzkaller-bugs, yhs On Mon, Aug 09, 2021 at 12:40:03PM -0700, Shoaib Rao wrote: > Page faults occur all the time, the page may not even be in the cache or the > mapping is not there (mmap), so I would not consider this a bug. The code > should complain about all other calls as they are also copying to user > pages. I must not be following some semantics for the code to be triggered > but I can not figure that out. What is the recommended interface to do user > copy from kernel? What are you talking about? Yes, page faults happen. No, they must not be triggered in contexts when you cannot afford going to sleep. In particular, you can't do that while holding a spinlock. There are things that can't be done under a spinlock. If your commit is attempting that, it's simply broken. ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [syzbot] BUG: sleeping function called from invalid context in _copy_to_iter 2021-08-09 20:04 ` Al Viro @ 2021-08-09 20:16 ` Al Viro 2021-08-09 20:30 ` Shoaib Rao 2021-08-09 20:37 ` Shoaib Rao 0 siblings, 2 replies; 21+ messages in thread From: Al Viro @ 2021-08-09 20:16 UTC (permalink / raw) To: Shoaib Rao Cc: Dmitry Vyukov, syzbot, andrii, ast, bpf, christian.brauner, cong.wang, daniel, davem, edumazet, jamorris, john.fastabend, kafai, kpsingh, kuba, linux-kernel, linux-kselftest, netdev, shuah, songliubraving, syzkaller-bugs, yhs

On Mon, Aug 09, 2021 at 08:04:40PM +0000, Al Viro wrote:
> On Mon, Aug 09, 2021 at 12:40:03PM -0700, Shoaib Rao wrote:
>
> > Page faults occur all the time, the page may not even be in the cache or the
> > mapping is not there (mmap), so I would not consider this a bug. The code
> > should complain about all other calls as they are also copying to user
> > pages. I must not be following some semantics for the code to be triggered
> > but I can not figure that out. What is the recommended interface to do user
> > copy from kernel?
>
> What are you talking about? Yes, page faults happen. No, they
> must not be triggered in contexts when you cannot afford going to sleep.
> In particular, you can't do that while holding a spinlock.
>
> There are things that can't be done under a spinlock. If your
> commit is attempting that, it's simply broken.

... in particular, this

+#if IS_ENABLED(CONFIG_AF_UNIX_OOB)
+	mutex_lock(&u->iolock);
+	unix_state_lock(sk);
+
+	err = unix_stream_recv_urg(state);
+
+	unix_state_unlock(sk);
+	mutex_unlock(&u->iolock);
+#endif

is 100% broken, since you *are* attempting to copy data to userland between
spin_lock(&unix_sk(s)->lock) and spin_unlock(&unix_sk(s)->lock).

You can't do blocking operations under a spinlock. And copyout is inherently
a blocking operation - it can require any kind of IO to complete. If you
have the destination (very much valid - no bad addresses there) in the middle
of a page mmapped from a file and currently not paged in, you *must* read
the current contents of the page, at least into the parts of page that
are not going to be overwritten by your copyout. No way around that. And
that can involve any kind of delays and any amount of disk/network/whatnot
traffic.

You fundamentally can not do that kind of thing without giving the CPU up.
And under a spinlock you are not allowed to do that.

In the current form that commit is obviously broken.

^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [syzbot] BUG: sleeping function called from invalid context in _copy_to_iter 2021-08-09 20:16 ` Al Viro @ 2021-08-09 20:30 ` Shoaib Rao 2021-08-09 20:37 ` Shoaib Rao 1 sibling, 0 replies; 21+ messages in thread From: Shoaib Rao @ 2021-08-09 20:30 UTC (permalink / raw) To: Al Viro Cc: Dmitry Vyukov, syzbot, andrii, ast, bpf, christian.brauner, cong.wang, daniel, davem, edumazet, jamorris, john.fastabend, kafai, kpsingh, kuba, linux-kernel, linux-kselftest, netdev, shuah, songliubraving, syzkaller-bugs, yhs On 8/9/21 1:16 PM, Al Viro wrote: > On Mon, Aug 09, 2021 at 08:04:40PM +0000, Al Viro wrote: >> On Mon, Aug 09, 2021 at 12:40:03PM -0700, Shoaib Rao wrote: >> >>> Page faults occur all the time, the page may not even be in the cache or the >>> mapping is not there (mmap), so I would not consider this a bug. The code >>> should complain about all other calls as they are also copying to user >>> pages. I must not be following some semantics for the code to be triggered >>> but I can not figure that out. What is the recommended interface to do user >>> copy from kernel? >> What are you talking about? Yes, page faults happen. No, they >> must not be triggered in contexts when you cannot afford going to sleep. >> In particular, you can't do that while holding a spinlock. >> >> There are things that can't be done under a spinlock. If your >> commit is attempting that, it's simply broken. > ... in particular, this > > +#if IS_ENABLED(CONFIG_AF_UNIX_OOB) > + mutex_lock(&u->iolock); > + unix_state_lock(sk); > + > + err = unix_stream_recv_urg(state); > + > + unix_state_unlock(sk); > + mutex_unlock(&u->iolock); > +#endif > > is 100% broken, since you *are* attempting to copy data to userland between > spin_lock(&unix_sk(s)->lock) and spin_unlock(&unix_sk(s)->lock). > > You can't do blocking operations under a spinlock. And copyout is inherently > a blocking operation - it can require any kind of IO to complete. 
If you > have the destination (very much valid - no bad addresses there) in the middle > of a page mmapped from a file and currently not paged in, you *must* read > the current contents of the page, at least into the parts of page that > are not going to be overwritten by your copyout. No way around that. And > that can involve any kind of delays and any amount of disk/network/whatnot > traffic. > > You fundamentally can not do that kind of thing without giving the CPU up. > And under a spinlock you are not allowed to do that. > > In the current form that commit is obviously broken. I am quite aware of spinlocks, mutexes, and all the other kernel structures. As I said, the fact that Linux uses lock* names for both spinlocks and mutexes is confusing unless you look at the details of the lock. I will fix the issue; it is a simple fix: copy the byte to a kernel variable, release the lock, then copy the byte to userland. Shoaib ^ permalink raw reply [flat|nested] 21+ messages in thread
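The fix proposed in the message above (copy the byte to a kernel variable, release the lock, then copy to userland) can be sketched roughly as follows. This is a userspace illustration only, not the actual af_unix patch: `struct toy_sock`, `toy_lock()`, `toy_unlock()`, `toy_copy_to_user()` and `toy_recv_urg()` are hypothetical names, the spinlock is a C11 `atomic_flag` stand-in for `unix_state_lock()`, and a plain `memcpy()` stands in for the copy to user space that may sleep in the kernel.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a socket holding at most one pending OOB byte. */
struct toy_sock {
	atomic_flag lock;        /* toy spinlock, stands in for unix_state_lock() */
	bool has_oob;            /* true while an OOB byte is pending */
	unsigned char oob_byte;  /* the pending byte */
};

static void toy_lock(struct toy_sock *s)
{
	/* Busy-wait: the holder must never sleep while others spin here. */
	while (atomic_flag_test_and_set_explicit(&s->lock, memory_order_acquire))
		;
}

static void toy_unlock(struct toy_sock *s)
{
	atomic_flag_clear_explicit(&s->lock, memory_order_release);
}

/* Stand-in for a copy to user space; in the kernel this may fault and sleep. */
static int toy_copy_to_user(unsigned char *dst, const unsigned char *src, size_t len)
{
	memcpy(dst, src, len);
	return 0;
}

/* Returns 1 if an OOB byte was delivered, 0 if none was pending. */
int toy_recv_urg(struct toy_sock *s, unsigned char *user_buf)
{
	unsigned char tmp;

	assert(user_buf != NULL);

	toy_lock(s);
	if (!s->has_oob) {
		toy_unlock(s);
		return 0;
	}
	tmp = s->oob_byte;       /* snapshot into a local variable under the lock */
	s->has_oob = false;      /* consume the byte while still holding the lock */
	toy_unlock(s);

	/* Lock is dropped: the possibly blocking copy happens outside it. */
	toy_copy_to_user(user_buf, &tmp, 1);
	return 1;
}
```

One design question this sketch glosses over: the byte is consumed before the copy, so a real implementation would have to decide what happens if the copy to user space fails after the lock has been released.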
* Re: [syzbot] BUG: sleeping function called from invalid context in _copy_to_iter 2021-08-09 20:16 ` Al Viro 2021-08-09 20:30 ` Shoaib Rao @ 2021-08-09 20:37 ` Shoaib Rao 2021-08-09 21:41 ` Al Viro 1 sibling, 1 reply; 21+ messages in thread From: Shoaib Rao @ 2021-08-09 20:37 UTC (permalink / raw) To: Al Viro Cc: Dmitry Vyukov, syzbot, andrii, ast, bpf, christian.brauner, cong.wang, daniel, davem, edumazet, jamorris, john.fastabend, kafai, kpsingh, kuba, linux-kernel, linux-kselftest, netdev, shuah, songliubraving, syzkaller-bugs, yhs On 8/9/21 1:16 PM, Al Viro wrote: > On Mon, Aug 09, 2021 at 08:04:40PM +0000, Al Viro wrote: >> On Mon, Aug 09, 2021 at 12:40:03PM -0700, Shoaib Rao wrote: >> >>> Page faults occur all the time, the page may not even be in the cache or the >>> mapping is not there (mmap), so I would not consider this a bug. The code >>> should complain about all other calls as they are also copying to user >>> pages. I must not be following some semantics for the code to be triggered >>> but I can not figure that out. What is the recommended interface to do user >>> copy from kernel? >> What are you talking about? Yes, page faults happen. No, they >> must not be triggered in contexts when you cannot afford going to sleep. >> In particular, you can't do that while holding a spinlock. >> >> There are things that can't be done under a spinlock. If your >> commit is attempting that, it's simply broken. > ... in particular, this > > +#if IS_ENABLED(CONFIG_AF_UNIX_OOB) > + mutex_lock(&u->iolock); > + unix_state_lock(sk); > + > + err = unix_stream_recv_urg(state); > + > + unix_state_unlock(sk); > + mutex_unlock(&u->iolock); > +#endif > > is 100% broken, since you *are* attempting to copy data to userland between > spin_lock(&unix_sk(s)->lock) and spin_unlock(&unix_sk(s)->lock). Yes, but why are we calling it unix_state_lock() why not unix_state_spinlock() ? 
I have tons of experience doing kernel coding and you can never ever cover everything, that is why I wanted to root cause the issue instead of just turning off the check. Imagine you or Eric make a mistake and break the kernel, how would you guys feel if I were to write a similar email? Shoaib > > You can't do blocking operations under a spinlock. And copyout is inherently > a blocking operation - it can require any kind of IO to complete. If you > have the destination (very much valid - no bad addresses there) in the middle > of a page mmapped from a file and currently not paged in, you *must* read > the current contents of the page, at least into the parts of page that > are not going to be overwritten by your copyout. No way around that. And > that can involve any kind of delays and any amount of disk/network/whatnot > traffic. > > You fundamentally can not do that kind of thing without giving the CPU up. > And under a spinlock you are not allowed to do that. > > In the current form that commit is obviously broken. I am ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [syzbot] BUG: sleeping function called from invalid context in _copy_to_iter 2021-08-09 20:37 ` Shoaib Rao @ 2021-08-09 21:41 ` Al Viro 2021-08-09 22:38 ` Shoaib Rao 0 siblings, 1 reply; 21+ messages in thread From: Al Viro @ 2021-08-09 21:41 UTC (permalink / raw) To: Shoaib Rao Cc: Dmitry Vyukov, syzbot, andrii, ast, bpf, christian.brauner, cong.wang, daniel, davem, edumazet, jamorris, john.fastabend, kafai, kpsingh, kuba, linux-kernel, linux-kselftest, netdev, shuah, songliubraving, syzkaller-bugs, yhs

On Mon, Aug 09, 2021 at 01:37:08PM -0700, Shoaib Rao wrote:

> > +#if IS_ENABLED(CONFIG_AF_UNIX_OOB)
> > +	mutex_lock(&u->iolock);
> > +	unix_state_lock(sk);
> > +
> > +	err = unix_stream_recv_urg(state);
> > +
> > +	unix_state_unlock(sk);
> > +	mutex_unlock(&u->iolock);
> > +#endif
> >
> > is 100% broken, since you *are* attempting to copy data to userland between
> > spin_lock(&unix_sk(s)->lock) and spin_unlock(&unix_sk(s)->lock).
>
> Yes, but why are we calling it unix_state_lock() why not
> unix_state_spinlock() ?

We'd never bothered with such naming conventions; keep in mind that
locking rules can and do change from time to time, and encoding the
nature of locking primitive into the name would result in tons of
noise.

> I have tons of experience doing kernel coding and you can never ever cover
> everything, that is why I wanted to root cause the issue instead of just
> turning off the check.
>
> Imagine you or Eric make a mistake and break the kernel, how would you guys
> feel if I were to write a similar email?

Moderately embarrassed, at a guess, but what would that have to do with
somebody pointing the bug out? Bonehead mistakes happen, they are embarrassing
no matter who catches them - trust me, it's no less unpleasant when you end
up being one who finds your own bug months after it went into the tree. Been
there, done that...

Since you asked, as far as my reactions normally go:
	* I made a mistake that ended up screwing people over => can be
hideously embarrassing, no matter what. No cause for that in your case,
AFAICS - it hadn't even gone into mainline yet.
	* I made a dumb mistake that got caught (again, doesn't matter
by whom) => unpleasant; shit happens (does it ever), but that's not
a tragedy. Ought to look for the ways to catch the same kind of mistakes
and see if I have stepped into the same problem anywhere else - often
enough the blind spots strike more than once. If the method of catching
the same kind of crap ends up being something like 'grep for <pattern>,
manually check the instances to weed out the false positive'... might
be worth running over the tree; often enough the blind spots are shared.
Would be partially applicable in your case ("if using an unfamiliar locking
helper, check what it does"), but not easily greppable.
	* I kept looking at bug report, missing the relevant indicators
despite the increasingly direct references to those by other people =>
mildly embarrassing (possibly more than mildly, if that persists for long).
Ought to get some coffee, wake up properly (if applicable, that is) and make
notes for myself re what to watch out for. Partially applicable here;
I'm no telepath, but at a guess you missed the list of locks in the report
_and_ missed repeated references to some spinlock being involved.
Since the call chain had not (AFAICS) been missed, the question
"which spinlock do they keep blathering about?" wouldn't have been hard.
Might be useful to make note of, for the next time you have to deal with
such reports.
	* Somebody starts asking whether I bloody understand something
trivial => figure out what does that have to do with the situation at
hand, reply with the description of what I'd missed (again, quite possibly
the answer will be "enough coffee") and move on to figuring out how to
fix the damn bug. Not exactly applicable here - the closest I can see
is Eric's question regarding the difference between mutex and spinlock.
In similar situation I'd go with something along the lines of "Sorry,
hadn't spotted the spinlock in question"; your reply had been a bit
more combative than that, but that's a matter of taste. None of my
postings would fit into that class, AFAICS...
	* Somebody explains (in painful details) what's wrong with the
code => more or less the same as above, only with less temptation (for
me) to get defensive. Reactions vary - some folks find it more offensive
than the previous one, but essentially it's the same thing.

The above describes my reactions, in case it's not obvious -
I'm not saying that everyone should react the same way, but you've
asked how would I (or Eric) react in such-and-such case. And I can't
speak for Eric, obviously...

^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [syzbot] BUG: sleeping function called from invalid context in _copy_to_iter 2021-08-09 21:41 ` Al Viro @ 2021-08-09 22:38 ` Shoaib Rao 0 siblings, 0 replies; 21+ messages in thread From: Shoaib Rao @ 2021-08-09 22:38 UTC (permalink / raw) To: Al Viro Cc: Dmitry Vyukov, syzbot, andrii, ast, bpf, christian.brauner, cong.wang, daniel, davem, edumazet, jamorris, john.fastabend, kafai, kpsingh, kuba, linux-kernel, linux-kselftest, netdev, shuah, songliubraving, syzkaller-bugs, yhs On 8/9/21 2:41 PM, Al Viro wrote: > On Mon, Aug 09, 2021 at 01:37:08PM -0700, Shoaib Rao wrote: > >>> +#if IS_ENABLED(CONFIG_AF_UNIX_OOB) >>> + mutex_lock(&u->iolock); >>> + unix_state_lock(sk); >>> + >>> + err = unix_stream_recv_urg(state); >>> + >>> + unix_state_unlock(sk); >>> + mutex_unlock(&u->iolock); >>> +#endif >>> >>> is 100% broken, since you *are* attempting to copy data to userland between >>> spin_lock(&unix_sk(s)->lock) and spin_unlock(&unix_sk(s)->lock). >> Yes, but why are we calling it unix_state_lock() why not >> unix_state_spinlock() ? > We'd never bothered with such naming conventions; keep in mind that > locking rules can and do change from time to time, and encoding the > nature of locking primitive into the name would result in tons of > noise. Rules, ordering, and semantics can change, but naming IMHO helps out a lot. There are certain OSes where spinlocks only spin for a bit and then block. However, they are still called spinlocks. > >> I have tons of experience doing kernel coding and you can never ever cover >> everything, that is why I wanted to root cause the issue instead of just >> turning off the check. >> >> Imagine you or Eric make a mistake and break the kernel, how would you guys >> feel if I were to write a similar email? > Moderately embarrassed, at a guess, but what would that have to do with > somebody pointing the bug out? 
Bonehead mistakes happen, they are embarrassing > no matter who catches them - trust me, it's no less unpleasant when you end > up being one who finds your own bug months after it went into the tree. Been > there, done that... > > Since you asked, as far as my reactions normally go: > * I made a mistake that ended up screwing people over => can be > hideously embarrassing, no matter what. No cause for that in your case, > AFAICS - it hadn't even gone into mainline yet. > * I made a dumb mistake that got caught (again, doesn't matter > by whom) => unpleasant; shit happens (does it ever), but that's not > a tragedy. Ought to look for the ways to catch the same kind of mistakes > and see if I have stepped into the same problem anywhere else - often > enough the blind spots strike more than once. If the method of catching > the same kind of crap ends up being something like 'grep for <pattern>, > manually check the instances to weed out the false positive'... might > be worth running over the tree; often enough the blind spots are shared. > Would be partially applicable in your case ("if using an unfamiliar locking > helper, check what it does"), but not easily greppable. > * I kept looking at bug report, missing the relevant indicators > despite the increasingly direct references to those by other people => > mildly embarrassing (possibly more than mildly, if that persists for long). > Ought to get some coffee, wake up properly (if applicable, that is) and make > notes for myself re what to watch out for. Partially applicable here; > I'm no telepath, but at a guess you missed the list of locks in the report > _and_ missed repeated references to some spinlock being involved. > Since the call chain had not (AFAICS) been missed, the question > "which spinlock do they keep blathering about?" wouldn't have been hard. > Might be useful to make note of, for the next time you have to deal with > such reports. 
> * Somebody starts asking whether I bloody understand something > trivial => figure out what does that have to do with the situation at > hand, reply with the description of what I'd missed (again, quite possibly > the answer will be "enough coffee") and move on to figuring out how to > fix the damn bug. Not exactly applicable here - the closest I can see > is Eric's question regarding the difference between mutex and spinlock. > In similar situation I'd go with something along the lines of "Sorry, > hadn't spotted the spinlock in question"; your reply had been a bit > more combative than that, but that's a matter of taste. None of my > postings would fit into that class, AFAICS... > * Somebody explains (in painful details) what's wrong with the > code => more or less the same as above, only with less temptation (for > me) to get defensive. Reactions vary - some folks find it more offensive > than the previous one, but essentially it's the same thing. > > The above describes my reactions, in case it's not obvious - > I'm not saying that everyone should react the same way, but you've > asked how would I (or Eric) react in such-and-such case. And I can't > speak for Eric, obviously... Al, I really appreciate the time you have taken to write the email. I agree with what you have stated 99%. My displeasure is with the fact that when I asked what conditions trigger this error (not familiar with the checker), no one replied. As I said in the emails, I did suspect the locks but did not have time to look at the definition, your email arrived as I was looking at the definition. It would have been better and polite to say, are you sure you are not holding a spinlock? Would that not solve the issue? Why do we have to always assume that the other person is not knowledgeable and inferior to us. Is there any documentation that lists possible reasons when the checker points to an error? Thanks again for the email. 
Regards, Shoaib ^ permalink raw reply [flat|nested] 21+ messages in thread
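[Editorial sketch, not part of the thread.] The question of what conditions trigger the "sleeping function called from invalid context" report can be illustrated with a small userspace model. This is a simplified approximation, assuming the check fires when the task is in atomic context (for example, holding a spinlock) or has interrupts disabled, which matches the `in_atomic(): 1, irqs_disabled(): 0` fields printed in the report. All names below (`toy_task`, `model_*`) are hypothetical and do not exist in the kernel; the real check has further conditions that this model leaves out.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-task state, loosely echoing the fields in the report. */
struct toy_task {
    int atomic_depth;    /* > 0 while at least one spinlock is held */
    bool irqs_disabled;  /* true while interrupts are off */
};

/* A mutex is a sleeping lock: taking it leaves the context sleepable. */
static void model_mutex_lock(struct toy_task *t)   { (void)t; }
static void model_mutex_unlock(struct toy_task *t) { (void)t; }

/* A spinlock makes the context atomic until it is released. */
static void model_spin_lock(struct toy_task *t)
{
    t->atomic_depth++;
}

static void model_spin_unlock(struct toy_task *t)
{
    assert(t->atomic_depth > 0);
    t->atomic_depth--;
}

/* Returns true when a might_sleep()-style check would complain. */
static bool model_might_sleep_would_warn(const struct toy_task *t)
{
    return t->atomic_depth > 0 || t->irqs_disabled;
}
```

In this model, taking only the mutex (the role of `u->iolock` in the report) leaves the context sleepable, while additionally taking the spinlock (the role of `u->lock`) is what makes the check complain. Note that the check fires because the callee *may* sleep, whether or not it actually would on a given call.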
* Re: [syzbot] BUG: sleeping function called from invalid context in _copy_to_iter
  2021-08-09 19:16 ` Shoaib Rao
  2021-08-09 19:21 ` Dmitry Vyukov
@ 2021-08-09 19:57 ` Al Viro
  2021-08-09 20:18 ` Shoaib Rao
  1 sibling, 1 reply; 21+ messages in thread
From: Al Viro @ 2021-08-09 19:57 UTC
To: Shoaib Rao
Cc: Dmitry Vyukov, syzbot, andrii, ast, bpf, christian.brauner,
    cong.wang, daniel, davem, edumazet, jamorris, john.fastabend, kafai,
    kpsingh, kuba, linux-kernel, linux-kselftest, netdev, shuah,
    songliubraving, syzkaller-bugs, yhs

On Mon, Aug 09, 2021 at 12:16:27PM -0700, Shoaib Rao wrote:
>
> On 8/9/21 11:06 AM, Dmitry Vyukov wrote:
> > On Mon, 9 Aug 2021 at 19:33, Shoaib Rao <rao.shoaib@oracle.com> wrote:
> > > This seems like a false positive. 1) The function will not sleep because
> > > it only calls copy routine if the byte is present. 2). There is no
> > > difference between this new call and the older calls in
> > > unix_stream_read_generic().
> > Hi Shoaib,
> >
> > Thanks for looking into this.
> > Do you have any ideas on how to fix this tool's false positive? Tools
> > with false positives are order of magnitude less useful than tools w/o
> > false positives. E.g. do we turn it off on syzbot? But I don't
> > remember any other false positives from "sleeping function called from
> > invalid context" checker...
>
> Before we take any action I would like to understand why the tool does not
> single out other calls to recv_actor in unix_stream_read_generic(). The
> context in all cases is the same. I also do not understand why the code
> would sleep, Let's assume the user provided address is bad, the code will
> return EFAULT, it will never sleep, if the kernel provided address is bad
> the system will panic. The only difference I see is that the new code holds
> 2 locks while the previous code held one lock, but the locks are acquired
> before the call to copy.
>
> So please help me understand how the tool works. Even though I have
> evaluated the code carefully, there is always a possibility that the tool is
> correct.

Huh???

What do you mean "address is bad"? "Address is inside an area mmapped from
NFS file". And it bloody well will sleep on attempt to read the page.

You should never, ever do copy_{to,from}_user() or equivalents while holding
a spinlock, period.
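[Editorial sketch, not part of the thread.] The rule stated above, no copy to userland while a spinlock is held, is commonly satisfied by staging the data under the lock and doing the possibly-sleeping copy only after the lock is dropped. The following is a userspace analogy under stated assumptions: `toy_sock`, `toy_copy_out` and the other `toy_*` names are hypothetical, the spinlock is a plain C11 `atomic_flag`, and `toy_copy_out` merely stands in for `copy_to_user()`. It is not the fix that was later submitted for af_unix, only an instance of the general pattern.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a socket holding at most one pending OOB byte. */
struct toy_sock {
    atomic_flag lock;  /* spinlock: the holder must not block */
    bool has_oob;
    char oob_byte;
};

static void toy_sock_init(struct toy_sock *s)
{
    atomic_flag_clear(&s->lock);
    s->has_oob = false;
    s->oob_byte = 0;
}

static void toy_lock(struct toy_sock *s)
{
    while (atomic_flag_test_and_set_explicit(&s->lock, memory_order_acquire))
        ;  /* busy-wait until the flag is released */
}

static void toy_unlock(struct toy_sock *s)
{
    atomic_flag_clear_explicit(&s->lock, memory_order_release);
}

static void toy_queue_oob(struct toy_sock *s, char c)
{
    toy_lock(s);
    s->oob_byte = c;
    s->has_oob = true;
    toy_unlock(s);
}

/* Stand-in for copy_to_user(): in the kernel this may fault and sleep. */
static int toy_copy_out(char *dst, const char *src, size_t len)
{
    assert(dst != NULL && src != NULL);
    memcpy(dst, src, len);
    return 0;
}

/* Returns 1 if a byte was delivered, 0 if none was pending, -1 on copy failure. */
static int toy_recv_oob(struct toy_sock *s, char *dst)
{
    char staged = 0;
    bool have;

    toy_lock(s);
    have = s->has_oob;
    if (have) {
        staged = s->oob_byte;  /* grab the data while the state is stable */
        s->has_oob = false;
    }
    toy_unlock(s);             /* drop the spinlock first ... */

    if (!have)
        return 0;
    /* ... and only then do the copy that is allowed to sleep. */
    return toy_copy_out(dst, &staged, 1) ? -1 : 1;
}
```

One trade-off of this toy version: the byte is consumed before the copy is known to have succeeded, so a failed copy loses it. Real code has to decide how to handle that case.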
* Re: [syzbot] BUG: sleeping function called from invalid context in _copy_to_iter
  2021-08-09 19:57 ` Al Viro
@ 2021-08-09 20:18 ` Shoaib Rao
  0 siblings, 0 replies; 21+ messages in thread
From: Shoaib Rao @ 2021-08-09 20:18 UTC
To: Al Viro
Cc: Dmitry Vyukov, syzbot, andrii, ast, bpf, christian.brauner,
    cong.wang, daniel, davem, edumazet, jamorris, john.fastabend, kafai,
    kpsingh, kuba, linux-kernel, linux-kselftest, netdev, shuah,
    songliubraving, syzkaller-bugs, yhs

On 8/9/21 12:57 PM, Al Viro wrote:
> On Mon, Aug 09, 2021 at 12:16:27PM -0700, Shoaib Rao wrote:
>> On 8/9/21 11:06 AM, Dmitry Vyukov wrote:
>>> On Mon, 9 Aug 2021 at 19:33, Shoaib Rao <rao.shoaib@oracle.com> wrote:
>>>> This seems like a false positive. 1) The function will not sleep because
>>>> it only calls copy routine if the byte is present. 2). There is no
>>>> difference between this new call and the older calls in
>>>> unix_stream_read_generic().
>>> Hi Shoaib,
>>>
>>> Thanks for looking into this.
>>> Do you have any ideas on how to fix this tool's false positive? Tools
>>> with false positives are order of magnitude less useful than tools w/o
>>> false positives. E.g. do we turn it off on syzbot? But I don't
>>> remember any other false positives from "sleeping function called from
>>> invalid context" checker...
>> Before we take any action I would like to understand why the tool does not
>> single out other calls to recv_actor in unix_stream_read_generic(). The
>> context in all cases is the same. I also do not understand why the code
>> would sleep, Let's assume the user provided address is bad, the code will
>> return EFAULT, it will never sleep, if the kernel provided address is bad
>> the system will panic. The only difference I see is that the new code holds
>> 2 locks while the previous code held one lock, but the locks are acquired
>> before the call to copy.
>>
>> So please help me understand how the tool works. Even though I have
>> evaluated the code carefully, there is always a possibility that the tool is
>> correct.
> Huh???
>
> What do you mean "address is bad"? "Address is inside an area mmapped from
> NFS file". And it bloody well will sleep on attempt to read the page.

That is exactly what I said :-). There are times when copying
thread/task may sleep when the page is not there and it does not have to
be an NFS file, Linux supports mmap without backing memory and page
faults occur with files all the time. With the bad address I meant that
the user passes in an incorrect address.

>
> You should never, ever do copy_{to,from}_user() or equivalents while holding
> a spinlock, period.

Yes spinlock should not be held if the process can sleep. In this case
it wont but there is no way to indicate that. Thanks for pointing that
out, as the second lock I am holding is indeed a spinlock (it is
accessed via unix_state_unlock so I missed the spinlock). I will modify
the code and resubmit. I am glad we found the root cause.

Shoaib