* [syzbot] BUG: sleeping function called from invalid context in static_key_slow_inc
@ 2022-11-17 9:55 syzbot
2022-11-17 12:03 ` syzbot
2022-11-18 1:56 ` syzbot
0 siblings, 2 replies; 11+ messages in thread
From: syzbot @ 2022-11-17 9:55 UTC (permalink / raw)
To: Jason, frederic, juri.lelli, kirill.shutemov, linux-kernel,
mark.rutland, netdev, peterz, sathyanarayanan.kuppuswamy,
steven.price, syzkaller-bugs, tglx
Hello,
syzbot found the following issue on:
HEAD commit: 064bc7312bd0 netdevsim: Fix memory leak of nsim_dev->fa_co..
git tree: net
console output: https://syzkaller.appspot.com/x/log.txt?x=13d3204e880000
kernel config: https://syzkaller.appspot.com/x/.config?x=a33ac7bbc22a8c35
dashboard link: https://syzkaller.appspot.com/bug?extid=703d9e154b3b58277261
compiler: gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2
Unfortunately, I don't have any reproducer for this issue yet.
Downloadable assets:
disk image: https://storage.googleapis.com/syzbot-assets/0634e1c0e4cb/disk-064bc731.raw.xz
vmlinux: https://storage.googleapis.com/syzbot-assets/fe1039d2de22/vmlinux-064bc731.xz
kernel image: https://storage.googleapis.com/syzbot-assets/5a0d673875fa/bzImage-064bc731.xz
IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+703d9e154b3b58277261@syzkaller.appspotmail.com
BUG: sleeping function called from invalid context at include/linux/percpu-rwsem.h:49
in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 9420, name: syz-executor.5
preempt_count: 1, expected: 0
RCU nest depth: 0, expected: 0
INFO: lockdep is turned off.
Preemption disabled at:
[<0000000000000000>] 0x0
CPU: 1 PID: 9420 Comm: syz-executor.5 Not tainted 6.1.0-rc4-syzkaller-00212-g064bc7312bd0 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/26/2022
Call Trace:
<TASK>
__dump_stack lib/dump_stack.c:88 [inline]
dump_stack_lvl+0xd1/0x138 lib/dump_stack.c:106
__might_resched.cold+0x222/0x26b kernel/sched/core.c:9890
percpu_down_read include/linux/percpu-rwsem.h:49 [inline]
cpus_read_lock+0x1b/0x140 kernel/cpu.c:310
static_key_slow_inc+0x12/0x20 kernel/jump_label.c:158
udp_tunnel_encap_enable include/net/udp_tunnel.h:187 [inline]
setup_udp_tunnel_sock+0x43d/0x550 net/ipv4/udp_tunnel_core.c:81
l2tp_tunnel_register+0xc51/0x1210 net/l2tp/l2tp_core.c:1509
pppol2tp_connect+0xcdc/0x1a10 net/l2tp/l2tp_ppp.c:723
__sys_connect_file+0x153/0x1a0 net/socket.c:1976
__sys_connect+0x165/0x1a0 net/socket.c:1993
__do_sys_connect net/socket.c:2003 [inline]
__se_sys_connect net/socket.c:2000 [inline]
__x64_sys_connect+0x73/0xb0 net/socket.c:2000
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x39/0xb0 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x63/0xcd
RIP: 0033:0x7fd94d28b639
Code: ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b8 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007fd94e038168 EFLAGS: 00000246
ORIG_RAX: 000000000000002a
RAX: ffffffffffffffda RBX: 00007fd94d3abf80 RCX: 00007fd94d28b639
RDX: 000000000000002e RSI: 0000000020000000 RDI: 0000000000000003
RBP: 00007fd94d2e6ae9 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
R13: 00007ffde93128bf R14: 00007fd94e038300 R15: 0000000000022000
</TASK>
---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkaller@googlegroups.com.
syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [syzbot] BUG: sleeping function called from invalid context in static_key_slow_inc
2022-11-17 9:55 [syzbot] BUG: sleeping function called from invalid context in static_key_slow_inc syzbot
@ 2022-11-17 12:03 ` syzbot
2022-11-18 1:56 ` syzbot
1 sibling, 0 replies; 11+ messages in thread
From: syzbot @ 2022-11-17 12:03 UTC (permalink / raw)
To: Jason, davem, edumazet, frederic, jacob.e.keller, jiri,
juri.lelli, kirill.shutemov, kuba, linux-kernel, mark.rutland,
netdev, nicolas.dichtel, pabeni, paul, peterz, razor,
sathyanarayanan.kuppuswamy, steven.price, syzkaller-bugs, tglx
syzbot has found a reproducer for the following issue on:
HEAD commit: 064bc7312bd0 netdevsim: Fix memory leak of nsim_dev->fa_co..
git tree: net
console+strace: https://syzkaller.appspot.com/x/log.txt?x=16b2b231880000
kernel config: https://syzkaller.appspot.com/x/.config?x=a33ac7bbc22a8c35
dashboard link: https://syzkaller.appspot.com/bug?extid=703d9e154b3b58277261
compiler: gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2
syz repro: https://syzkaller.appspot.com/x/repro.syz?x=13cd2f79880000
C reproducer: https://syzkaller.appspot.com/x/repro.c?x=109e1695880000
Downloadable assets:
disk image: https://storage.googleapis.com/syzbot-assets/0634e1c0e4cb/disk-064bc731.raw.xz
vmlinux: https://storage.googleapis.com/syzbot-assets/fe1039d2de22/vmlinux-064bc731.xz
kernel image: https://storage.googleapis.com/syzbot-assets/5a0d673875fa/bzImage-064bc731.xz
IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+703d9e154b3b58277261@syzkaller.appspotmail.com
BUG: sleeping function called from invalid context at include/linux/percpu-rwsem.h:49
in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 3634, name: syz-executor167
preempt_count: 1, expected: 0
RCU nest depth: 0, expected: 0
3 locks held by syz-executor167/3634:
#0: ffffffff8df6b530 (cb_lock){++++}-{3:3}, at: genl_rcv+0x19/0x40 net/netlink/genetlink.c:860
#1: ffffffff8df6b5e8 (genl_mutex){+.+.}-{3:3}, at: genl_lock net/netlink/genetlink.c:33 [inline]
#1: ffffffff8df6b5e8 (genl_mutex){+.+.}-{3:3}, at: genl_rcv_msg+0x50d/0x780 net/netlink/genetlink.c:848
#2: ffff8880182fa0b8 (k-clock-AF_INET){+++.}-{2:2}, at: l2tp_tunnel_register+0x126/0x1210 net/l2tp/l2tp_core.c:1477
Preemption disabled at:
[<0000000000000000>] 0x0
CPU: 1 PID: 3634 Comm: syz-executor167 Not tainted 6.1.0-rc4-syzkaller-00212-g064bc7312bd0 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/26/2022
Call Trace:
<TASK>
__dump_stack lib/dump_stack.c:88 [inline]
dump_stack_lvl+0xd1/0x138 lib/dump_stack.c:106
__might_resched.cold+0x222/0x26b kernel/sched/core.c:9890
percpu_down_read include/linux/percpu-rwsem.h:49 [inline]
cpus_read_lock+0x1b/0x140 kernel/cpu.c:310
static_key_slow_inc+0x12/0x20 kernel/jump_label.c:158
udp_tunnel_encap_enable include/net/udp_tunnel.h:189 [inline]
setup_udp_tunnel_sock+0x3e1/0x550 net/ipv4/udp_tunnel_core.c:81
l2tp_tunnel_register+0xc51/0x1210 net/l2tp/l2tp_core.c:1509
l2tp_nl_cmd_tunnel_create+0x3d6/0x8b0 net/l2tp/l2tp_netlink.c:245
genl_family_rcv_msg_doit+0x228/0x320 net/netlink/genetlink.c:756
genl_family_rcv_msg net/netlink/genetlink.c:833 [inline]
genl_rcv_msg+0x445/0x780 net/netlink/genetlink.c:850
netlink_rcv_skb+0x157/0x430 net/netlink/af_netlink.c:2540
genl_rcv+0x28/0x40 net/netlink/genetlink.c:861
netlink_unicast_kernel net/netlink/af_netlink.c:1319 [inline]
netlink_unicast+0x547/0x7f0 net/netlink/af_netlink.c:1345
netlink_sendmsg+0x91b/0xe10 net/netlink/af_netlink.c:1921
sock_sendmsg_nosec net/socket.c:714 [inline]
sock_sendmsg+0xd3/0x120 net/socket.c:734
sock_no_sendpage+0x10c/0x160 net/core/sock.c:3219
kernel_sendpage.part.0+0x1d5/0x700 net/socket.c:3561
kernel_sendpage net/socket.c:3558 [inline]
sock_sendpage+0xe3/0x140 net/socket.c:1054
pipe_to_sendpage+0x2b1/0x380 fs/splice.c:361
splice_from_pipe_feed fs/splice.c:415 [inline]
__splice_from_pipe+0x449/0x8a0 fs/splice.c:559
splice_from_pipe fs/splice.c:594 [inline]
generic_splice_sendpage+0xd8/0x140 fs/splice.c:743
do_splice_from fs/splice.c:764 [inline]
direct_splice_actor+0x114/0x180 fs/splice.c:931
splice_direct_to_actor+0x335/0x8a0 fs/splice.c:886
do_splice_direct+0x1ab/0x280 fs/splice.c:974
do_sendfile+0xb19/0x1270 fs/read_write.c:1255
__do_sys_sendfile64 fs/read_write.c:1323 [inline]
__se_sys_sendfile64 fs/read_write.c:1309 [inline]
__x64_sys_sendfile64+0x1d0/0x210 fs/read_write.c:1309
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x39/0xb0 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x63/0xcd
RIP: 0033:0x7f93d1567cb9
Code: 28 c3 e8 5a 14 00 00 66 2e 0f 1f 84 00 00 00 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 c0 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007ffdd8ae4a88 EFLAGS: 00000246 ORIG_RAX: 0000000000000028
RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f93d1567cb9
RDX: 0000000000000000 RSI: 0000000000000004 RDI: 0000000000000005
RBP: 00007f93d152b680 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000100000000 R11: 0000000000000246 R12: 00007f93d152b710
R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
</TASK>
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [syzbot] BUG: sleeping function called from invalid context in static_key_slow_inc
2022-11-17 9:55 [syzbot] BUG: sleeping function called from invalid context in static_key_slow_inc syzbot
2022-11-17 12:03 ` syzbot
@ 2022-11-18 1:56 ` syzbot
2022-11-18 11:51 ` [PATCH 6.1-rc6] l2tp: call udp_tunnel_encap_enable() and sock_release() without sk_callback_lock Tetsuo Handa
1 sibling, 1 reply; 11+ messages in thread
From: syzbot @ 2022-11-18 1:56 UTC (permalink / raw)
To: Jason, davem, edumazet, frederic, gnault, jacob.e.keller, jakub,
jiri, johannes, juri.lelli, kirill.shutemov, kuba, linux-kernel,
mark.rutland, netdev, nicolas.dichtel, pabeni, paul, peterz,
razor, sathyanarayanan.kuppuswamy, steven.price, syzkaller-bugs,
tglx, tparkin
syzbot has bisected this issue to:
commit b68777d54fac21fc833ec26ea1a2a84f975ab035
Author: Jakub Sitnicki <jakub@cloudflare.com>
Date: Mon Nov 14 19:16:19 2022 +0000
l2tp: Serialize access to sk_user_data with sk_callback_lock
bisection log: https://syzkaller.appspot.com/x/bisect.txt?x=1600bb49880000
start commit: 064bc7312bd0 netdevsim: Fix memory leak of nsim_dev->fa_co..
git tree: net
final oops: https://syzkaller.appspot.com/x/report.txt?x=1500bb49880000
console output: https://syzkaller.appspot.com/x/log.txt?x=1100bb49880000
kernel config: https://syzkaller.appspot.com/x/.config?x=a33ac7bbc22a8c35
dashboard link: https://syzkaller.appspot.com/bug?extid=703d9e154b3b58277261
syz repro: https://syzkaller.appspot.com/x/repro.syz?x=13cd2f79880000
C reproducer: https://syzkaller.appspot.com/x/repro.c?x=109e1695880000
Reported-by: syzbot+703d9e154b3b58277261@syzkaller.appspotmail.com
Fixes: b68777d54fac ("l2tp: Serialize access to sk_user_data with sk_callback_lock")
For information about bisection process see: https://goo.gl/tpsmEJ#bisection
^ permalink raw reply [flat|nested] 11+ messages in thread
* [PATCH 6.1-rc6] l2tp: call udp_tunnel_encap_enable() and sock_release() without sk_callback_lock
2022-11-18 1:56 ` syzbot
@ 2022-11-18 11:51 ` Tetsuo Handa
2022-11-18 12:36 ` Eric Dumazet
0 siblings, 1 reply; 11+ messages in thread
From: Tetsuo Handa @ 2022-11-18 11:51 UTC (permalink / raw)
To: David S. Miller",
Eric Dumazet, Jakub Kicinski, Paolo Abeni, Hideaki YOSHIFUJI,
David Ahern, Jakub Sitnicki, Tom Parkin
Cc: syzbot, netdev, syzkaller-bugs, Haowei Yan
syzbot is reporting sleep in atomic context at l2tp_tunnel_register() [1],
for commit b68777d54fac ("l2tp: Serialize access to sk_user_data with
sk_callback_lock") missed that udp_tunnel_encap_enable() from
setup_udp_tunnel_sock() might sleep.
Since we don't want to drop sk->sk_callback_lock inside
setup_udp_tunnel_sock() right before calling udp_tunnel_encap_enable(),
introduce a variant which does not call udp_tunnel_encap_enable(). And
call udp_tunnel_encap_enable() after dropping sk->sk_callback_lock.
Also, drop sk->sk_callback_lock before calling sock_release() in order to
avoid circular locking dependency problem.
Link: https://syzkaller.appspot.com/bug?extid=703d9e154b3b58277261 [1]
Reported-by: syzbot <syzbot+703d9e154b3b58277261@syzkaller.appspotmail.com>
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Fixes: b68777d54fac ("l2tp: Serialize access to sk_user_data with sk_callback_lock")
---
F.Y.I. Below is the lockdep message:
======================================================
WARNING: possible circular locking dependency detected
6.1.0-rc5+ #2 Not tainted
------------------------------------------------------
a.out/2794 is trying to acquire lock:
ffff8c628878bdf0 (k-sk_lock-AF_INET){+.+.}-{0:0}, at: sk_common_release+0x19/0xe0
but task is already holding lock:
ffff8c628878c078 (k-clock-AF_INET){+++.}-{2:2}, at: l2tp_tunnel_register+0x64/0x5e0 [l2tp_core]
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #2 (k-clock-AF_INET){+++.}-{2:2}:
lock_acquire+0xc7/0x2e0
_raw_read_lock_bh+0x3d/0x80
sock_i_uid+0x19/0x40
udp_lib_lport_inuse+0x2c/0x120
udp_lib_get_port+0xf8/0x570
udp_v4_get_port+0xbb/0xc0
__inet_bind+0x10e/0x240
inet_bind+0x2b/0x40
kernel_bind+0xb/0x10
udp_sock_create4+0x97/0x160 [udp_tunnel]
l2tp_tunnel_sock_create+0x316/0x330 [l2tp_core]
l2tp_tunnel_register+0x394/0x5e0 [l2tp_core]
l2tp_nl_cmd_tunnel_create+0xe8/0x200 [l2tp_netlink]
genl_family_rcv_msg_doit.isra.17+0x102/0x140
genl_rcv_msg+0x112/0x270
netlink_rcv_skb+0x4f/0x100
genl_rcv+0x23/0x40
netlink_unicast+0x1a5/0x280
netlink_sendmsg+0x22f/0x490
sock_sendmsg+0x2e/0x40
____sys_sendmsg+0x1e9/0x210
___sys_sendmsg+0x77/0xb0
__sys_sendmsg+0x60/0xb0
__x64_sys_sendmsg+0x1a/0x20
do_syscall_64+0x34/0x80
entry_SYSCALL_64_after_hwframe+0x63/0xcd
-> #1 (&table->hash[i].lock){+...}-{2:2}:
lock_acquire+0xc7/0x2e0
_raw_spin_lock_bh+0x31/0x40
udp_lib_get_port+0xda/0x570
udp_v4_get_port+0xbb/0xc0
__inet_bind+0x10e/0x240
inet_bind+0x2b/0x40
kernel_bind+0xb/0x10
udp_sock_create4+0x97/0x160 [udp_tunnel]
l2tp_tunnel_sock_create+0x316/0x330 [l2tp_core]
l2tp_tunnel_register+0x394/0x5e0 [l2tp_core]
l2tp_nl_cmd_tunnel_create+0xe8/0x200 [l2tp_netlink]
genl_family_rcv_msg_doit.isra.17+0x102/0x140
genl_rcv_msg+0x112/0x270
netlink_rcv_skb+0x4f/0x100
genl_rcv+0x23/0x40
netlink_unicast+0x1a5/0x280
netlink_sendmsg+0x22f/0x490
sock_sendmsg+0x2e/0x40
____sys_sendmsg+0x1e9/0x210
___sys_sendmsg+0x77/0xb0
__sys_sendmsg+0x60/0xb0
__x64_sys_sendmsg+0x1a/0x20
do_syscall_64+0x34/0x80
entry_SYSCALL_64_after_hwframe+0x63/0xcd
-> #0 (k-sk_lock-AF_INET){+.+.}-{0:0}:
check_prevs_add+0x16a/0x1070
__lock_acquire+0x11bd/0x1670
lock_acquire+0xc7/0x2e0
udp_destroy_sock+0x2d/0xd0
sk_common_release+0x19/0xe0
udp_lib_close+0x9/0x10
inet_release+0x2e/0x60
__sock_release+0x7e/0xa0
sock_release+0xb/0x10
l2tp_tunnel_register+0x3f1/0x5e0 [l2tp_core]
l2tp_nl_cmd_tunnel_create+0xe8/0x200 [l2tp_netlink]
genl_family_rcv_msg_doit.isra.17+0x102/0x140
genl_rcv_msg+0x112/0x270
netlink_rcv_skb+0x4f/0x100
genl_rcv+0x23/0x40
netlink_unicast+0x1a5/0x280
netlink_sendmsg+0x22f/0x490
sock_sendmsg+0x2e/0x40
____sys_sendmsg+0x1e9/0x210
___sys_sendmsg+0x77/0xb0
__sys_sendmsg+0x60/0xb0
__x64_sys_sendmsg+0x1a/0x20
do_syscall_64+0x34/0x80
entry_SYSCALL_64_after_hwframe+0x63/0xcd
other info that might help us debug this:
Chain exists of:
k-sk_lock-AF_INET --> &table->hash[i].lock --> k-clock-AF_INET
Possible unsafe locking scenario:
CPU0 CPU1
---- ----
lock(k-clock-AF_INET);
lock(&table->hash[i].lock);
lock(k-clock-AF_INET);
lock(k-sk_lock-AF_INET);
*** DEADLOCK ***
3 locks held by a.out/2794:
#0: ffffffffb466fc30 (cb_lock){++++}-{3:3}, at: genl_rcv+0x14/0x40
#1: ffffffffb466fcc8 (genl_mutex){+.+.}-{3:3}, at: genl_rcv_msg+0x14d/0x270
#2: ffff8c628878c078 (k-clock-AF_INET){+++.}-{2:2}, at: l2tp_tunnel_register+0x64/0x5e0 [l2tp_core]
include/net/udp_tunnel.h | 2 ++
net/ipv4/udp_tunnel_core.c | 10 ++++++++--
net/l2tp/l2tp_core.c | 10 +++++-----
3 files changed, 15 insertions(+), 7 deletions(-)
diff --git a/include/net/udp_tunnel.h b/include/net/udp_tunnel.h
index 72394f441dad..a84fa57bc750 100644
--- a/include/net/udp_tunnel.h
+++ b/include/net/udp_tunnel.h
@@ -92,6 +92,8 @@ struct udp_tunnel_sock_cfg {
/* Setup the given (UDP) sock to receive UDP encapsulated packets */
void setup_udp_tunnel_sock(struct net *net, struct socket *sock,
struct udp_tunnel_sock_cfg *sock_cfg);
+void setup_udp_tunnel_sock_no_enable(struct net *net, struct socket *sock,
+ struct udp_tunnel_sock_cfg *sock_cfg);
/* -- List of parsable UDP tunnel types --
*
diff --git a/net/ipv4/udp_tunnel_core.c b/net/ipv4/udp_tunnel_core.c
index 8242c8947340..dff825664000 100644
--- a/net/ipv4/udp_tunnel_core.c
+++ b/net/ipv4/udp_tunnel_core.c
@@ -57,8 +57,8 @@ int udp_sock_create4(struct net *net, struct udp_port_cfg *cfg,
}
EXPORT_SYMBOL(udp_sock_create4);
-void setup_udp_tunnel_sock(struct net *net, struct socket *sock,
- struct udp_tunnel_sock_cfg *cfg)
+void setup_udp_tunnel_sock_no_enable(struct net *net, struct socket *sock,
+ struct udp_tunnel_sock_cfg *cfg)
{
struct sock *sk = sock->sk;
@@ -77,7 +77,13 @@ void setup_udp_tunnel_sock(struct net *net, struct socket *sock,
udp_sk(sk)->encap_destroy = cfg->encap_destroy;
udp_sk(sk)->gro_receive = cfg->gro_receive;
udp_sk(sk)->gro_complete = cfg->gro_complete;
+}
+EXPORT_SYMBOL_GPL(setup_udp_tunnel_sock_no_enable);
+void setup_udp_tunnel_sock(struct net *net, struct socket *sock,
+ struct udp_tunnel_sock_cfg *cfg)
+{
+ setup_udp_tunnel_sock_no_enable(net, sock, cfg);
udp_tunnel_encap_enable(sock);
}
EXPORT_SYMBOL_GPL(setup_udp_tunnel_sock);
diff --git a/net/l2tp/l2tp_core.c b/net/l2tp/l2tp_core.c
index 754fdda8a5f5..a4f611196c83 100644
--- a/net/l2tp/l2tp_core.c
+++ b/net/l2tp/l2tp_core.c
@@ -1506,7 +1506,7 @@ int l2tp_tunnel_register(struct l2tp_tunnel *tunnel, struct net *net,
.encap_destroy = l2tp_udp_encap_destroy,
};
- setup_udp_tunnel_sock(net, sock, &udp_cfg);
+ setup_udp_tunnel_sock_no_enable(net, sock, &udp_cfg);
} else {
rcu_assign_sk_user_data(sk, tunnel);
}
@@ -1519,19 +1519,19 @@ int l2tp_tunnel_register(struct l2tp_tunnel *tunnel, struct net *net,
trace_register_tunnel(tunnel);
+ write_unlock(&sk->sk_callback_lock);
+ if (tunnel->encap == L2TP_ENCAPTYPE_UDP)
+ udp_tunnel_encap_enable(sock);
if (tunnel->fd >= 0)
sockfd_put(sock);
-
- write_unlock(&sk->sk_callback_lock);
return 0;
err_sock:
+ write_unlock(&sk->sk_callback_lock);
if (tunnel->fd < 0)
sock_release(sock);
else
sockfd_put(sock);
-
- write_unlock(&sk->sk_callback_lock);
err:
return ret;
}
--
2.18.4
^ permalink raw reply related [flat|nested] 11+ messages in thread
* Re: [PATCH 6.1-rc6] l2tp: call udp_tunnel_encap_enable() and sock_release() without sk_callback_lock
2022-11-18 11:51 ` [PATCH 6.1-rc6] l2tp: call udp_tunnel_encap_enable() and sock_release() without sk_callback_lock Tetsuo Handa
@ 2022-11-18 12:36 ` Eric Dumazet
2022-11-18 13:19 ` Tetsuo Handa
` (2 more replies)
0 siblings, 3 replies; 11+ messages in thread
From: Eric Dumazet @ 2022-11-18 12:36 UTC (permalink / raw)
To: Tetsuo Handa
Cc: David S. Miller, Jakub Kicinski, Paolo Abeni, Hideaki YOSHIFUJI,
David Ahern, Jakub Sitnicki, Tom Parkin, syzbot, netdev,
syzkaller-bugs, Haowei Yan
On Fri, Nov 18, 2022 at 3:51 AM Tetsuo Handa
<penguin-kernel@i-love.sakura.ne.jp> wrote:
>
> syzbot is reporting sleep in atomic context at l2tp_tunnel_register() [1],
> for commit b68777d54fac ("l2tp: Serialize access to sk_user_data with
> sk_callback_lock") missed that udp_tunnel_encap_enable() from
> setup_udp_tunnel_sock() might sleep.
>
> Since we don't want to drop sk->sk_callback_lock inside
> setup_udp_tunnel_sock() right before calling udp_tunnel_encap_enable(),
> introduce a variant which does not call udp_tunnel_encap_enable(). And
> call udp_tunnel_encap_enable() after dropping sk->sk_callback_lock.
>
> Also, drop sk->sk_callback_lock before calling sock_release() in order to
> avoid circular locking dependency problem.
Please look at recent discussion, your patch does not address another
fundamental problem.
Also, Jakub was working on a fix already. Perhaps sync with him to
avoid duplicate work.
https://lore.kernel.org/netdev/20221114191619.124659-1-jakub@cloudflare.com/T/
Thanks.
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 6.1-rc6] l2tp: call udp_tunnel_encap_enable() and sock_release() without sk_callback_lock
2022-11-18 12:36 ` Eric Dumazet
@ 2022-11-18 13:19 ` Tetsuo Handa
2022-11-18 15:04 ` Eric Dumazet
2022-11-18 17:50 ` Jakub Sitnicki
[not found] ` <a2199ab7c03e71af3ac791e119e52c94e9f023f56c8b0d8014dd70aceee2784e@mu>
2 siblings, 1 reply; 11+ messages in thread
From: Tetsuo Handa @ 2022-11-18 13:19 UTC (permalink / raw)
To: Eric Dumazet
Cc: David S. Miller, Jakub Kicinski, Paolo Abeni, Hideaki YOSHIFUJI,
David Ahern, Jakub Sitnicki, Tom Parkin, syzbot, netdev,
syzkaller-bugs, Haowei Yan
On 2022/11/18 21:36, Eric Dumazet wrote:
> Please look at recent discussion, your patch does not address another
> fundamental problem.
>
> Also, Jakub was working on a fix already. Perhaps sync with him to
> avoid duplicate work.
I can't afford monitoring all mailing lists. Since a thread at syzkaller-bugs group
did not get that information, I started this work. Please consider including
syzbot+XXXXXXXXXXXXXXXXXXXX@syzkaller.appspotmail.com into the discussions so that
we can google for recent discussions (if any) using mail address as a keyword.
>
> https://lore.kernel.org/netdev/20221114191619.124659-1-jakub@cloudflare.com/T/
>
> Thanks.
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 6.1-rc6] l2tp: call udp_tunnel_encap_enable() and sock_release() without sk_callback_lock
2022-11-18 13:19 ` Tetsuo Handa
@ 2022-11-18 15:04 ` Eric Dumazet
0 siblings, 0 replies; 11+ messages in thread
From: Eric Dumazet @ 2022-11-18 15:04 UTC (permalink / raw)
To: Tetsuo Handa
Cc: David S. Miller, Jakub Kicinski, Paolo Abeni, Hideaki YOSHIFUJI,
David Ahern, Jakub Sitnicki, Tom Parkin, syzbot, netdev,
syzkaller-bugs, Haowei Yan
On Fri, Nov 18, 2022 at 5:19 AM Tetsuo Handa
<penguin-kernel@i-love.sakura.ne.jp> wrote:
>
> On 2022/11/18 21:36, Eric Dumazet wrote:
> > Please look at recent discussion, your patch does not address another
> > fundamental problem.
> >
> > Also, Jakub was working on a fix already. Perhaps sync with him to
> > avoid duplicate work.
>
> I can't afford monitoring all mailing lists. Since a thread at syzkaller-bugs group
> did not get that information, I started this work. Please consider including
> syzbot+XXXXXXXXXXXXXXXXXXXX@syzkaller.appspotmail.com into the discussions so that
> we can google for recent discussions (if any) using mail address as a keyword.
>
This is not going to happen.
The discussion happened before the reports were made public.
No more than 7 syzbot reports are attached to the same root cause.
We deal with hundreds of syzbot reports per week, there is no way we
can retroactively find all relevant netdev@ threads.
If you can not afford making sure you are not wasting your time, this
is your call.
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 6.1-rc6] l2tp: call udp_tunnel_encap_enable() and sock_release() without sk_callback_lock
2022-11-18 12:36 ` Eric Dumazet
2022-11-18 13:19 ` Tetsuo Handa
@ 2022-11-18 17:50 ` Jakub Sitnicki
2022-11-19 10:08 ` Tetsuo Handa
[not found] ` <a2199ab7c03e71af3ac791e119e52c94e9f023f56c8b0d8014dd70aceee2784e@mu>
2 siblings, 1 reply; 11+ messages in thread
From: Jakub Sitnicki @ 2022-11-18 17:50 UTC (permalink / raw)
To: Tetsuo Handa, Eric Dumazet
Cc: David S. Miller, Jakub Kicinski, Paolo Abeni, Hideaki YOSHIFUJI,
David Ahern, Tom Parkin, syzbot, netdev, syzkaller-bugs,
Haowei Yan
On Fri, Nov 18, 2022 at 04:36 AM -08, Eric Dumazet wrote:
> On Fri, Nov 18, 2022 at 3:51 AM Tetsuo Handa
> <penguin-kernel@i-love.sakura.ne.jp> wrote:
>>
>> syzbot is reporting sleep in atomic context at l2tp_tunnel_register() [1],
>> for commit b68777d54fac ("l2tp: Serialize access to sk_user_data with
>> sk_callback_lock") missed that udp_tunnel_encap_enable() from
>> setup_udp_tunnel_sock() might sleep.
>>
>> Since we don't want to drop sk->sk_callback_lock inside
>> setup_udp_tunnel_sock() right before calling udp_tunnel_encap_enable(),
>> introduce a variant which does not call udp_tunnel_encap_enable(). And
>> call udp_tunnel_encap_enable() after dropping sk->sk_callback_lock.
>>
>> Also, drop sk->sk_callback_lock before calling sock_release() in order to
>> avoid circular locking dependency problem.
>
> Please look at recent discussion, your patch does not address another
> fundamental problem.
>
> Also, Jakub was working on a fix already. Perhaps sync with him to
> avoid duplicate work.
>
> https://lore.kernel.org/netdev/20221114191619.124659-1-jakub@cloudflare.com/T/
>
> Thanks.
Thanks for the patch, Tetsuo.
As Eric has pointed out [1], there is another problem - in addition to
sleeping in atomic context, I have also failed to use the write_lock
variant which disabled BH locally.
The latter bug can lead to dead-locks, as reported by syzcaller [2, 3],
because we grab sk_callback_lock in softirq context, which can then
block waiting on us if:
1) it runs on the same CPU, or
CPU0
----
lock(clock-AF_INET6);
<Interrupt>
lock(clock-AF_INET6);
2) lock ordering leads to priority inversion
CPU0 CPU1
---- ----
lock(clock-AF_INET6);
local_irq_disable();
lock(&tcp_hashinfo.bhash[i].lock);
lock(clock-AF_INET6);
<Interrupt>
lock(&tcp_hashinfo.bhash[i].lock);
IOW, your patch works if we also s/write_\(un\)\?lock/write_\1lock_bh/.
But, I also have an alternative idea - instead of pulling the function
call that might sleep out of the critical section, I think we can make
the critical section much shorter by rearranging the tunnel
initialization code slightly. That is, a change like below.
-jkbs
[1] https://lore.kernel.org/netdev/CANn89iLQUZnyGNCn2GpW31FXpE_Lt7a5Urr21RqzfAE4sYxs+w@mail.gmail.com/
[2] https://lore.kernel.org/netdev/000000000000e38b6605eda76f98@google.com
[3] https://lore.kernel.org/netdev/000000000000dfa31e05eda76f75@google.com/
--8<--
diff --git a/net/l2tp/l2tp_core.c b/net/l2tp/l2tp_core.c
index 754fdda8a5f5..07454c0418e3 100644
--- a/net/l2tp/l2tp_core.c
+++ b/net/l2tp/l2tp_core.c
@@ -1474,11 +1474,15 @@ int l2tp_tunnel_register(struct l2tp_tunnel *tunnel, struct net *net,
}
sk = sock->sk;
- write_lock(&sk->sk_callback_lock);
+ write_lock_bh(&sk->sk_callback_lock);
ret = l2tp_validate_socket(sk, net, tunnel->encap);
if (ret < 0)
goto err_sock;
+ if (tunnel->encap != L2TP_ENCAPTYPE_UDP)
+ rcu_assign_sk_user_data(sk, tunnel);
+
+ write_unlock_bh(&sk->sk_callback_lock);
tunnel->l2tp_net = net;
pn = l2tp_pernet(net);
@@ -1507,8 +1511,6 @@ int l2tp_tunnel_register(struct l2tp_tunnel *tunnel, struct net *net,
};
setup_udp_tunnel_sock(net, sock, &udp_cfg);
- } else {
- rcu_assign_sk_user_data(sk, tunnel);
}
tunnel->old_sk_destruct = sk->sk_destruct;
@@ -1522,7 +1524,6 @@ int l2tp_tunnel_register(struct l2tp_tunnel *tunnel, struct net *net,
if (tunnel->fd >= 0)
sockfd_put(sock);
- write_unlock(&sk->sk_callback_lock);
return 0;
err_sock:
@@ -1530,8 +1531,6 @@ int l2tp_tunnel_register(struct l2tp_tunnel *tunnel, struct net *net,
sock_release(sock);
else
sockfd_put(sock);
-
- write_unlock(&sk->sk_callback_lock);
err:
return ret;
}
^ permalink raw reply related [flat|nested] 11+ messages in thread
* Re: [PATCH 6.1-rc6] l2tp: call udp_tunnel_encap_enable() and sock_release() without sk_callback_lock
[not found] ` <a2199ab7c03e71af3ac791e119e52c94e9f023f56c8b0d8014dd70aceee2784e@mu>
@ 2022-11-18 22:10 ` Jakub Sitnicki
0 siblings, 0 replies; 11+ messages in thread
From: Jakub Sitnicki @ 2022-11-18 22:10 UTC (permalink / raw)
To: Tetsuo Handa, Eric Dumazet
Cc: David S. Miller, Jakub Kicinski, Paolo Abeni, Hideaki YOSHIFUJI,
David Ahern, Tom Parkin, syzbot, netdev, syzkaller-bugs,
Haowei Yan
On Fri, Nov 18, 2022 at 06:50 PM +01, Jakub Sitnicki wrote:
[...]
> But, I also have an alternative idea - instead of pulling the function
> call that might sleep out of the critical section, I think we could make
> the critical section much shorter by rearranging the tunnel
> initialization code slightly. That is, a change like below.
[...]
> --8<--
>
> diff --git a/net/l2tp/l2tp_core.c b/net/l2tp/l2tp_core.c
> index 754fdda8a5f5..07454c0418e3 100644
> --- a/net/l2tp/l2tp_core.c
> +++ b/net/l2tp/l2tp_core.c
> @@ -1474,11 +1474,15 @@ int l2tp_tunnel_register(struct l2tp_tunnel *tunnel, struct net *net,
> }
>
> sk = sock->sk;
> - write_lock(&sk->sk_callback_lock);
> + write_lock_bh(&sk->sk_callback_lock);
>
> ret = l2tp_validate_socket(sk, net, tunnel->encap);
> if (ret < 0)
> goto err_sock;
> + if (tunnel->encap != L2TP_ENCAPTYPE_UDP)
> + rcu_assign_sk_user_data(sk, tunnel);
sk_user_data needs to be reset back to NULL if we bail out when the
tunnel already exists. Will add that and turn it into a patch tomorrow.
> +
> + write_unlock_bh(&sk->sk_callback_lock);
>
> tunnel->l2tp_net = net;
> pn = l2tp_pernet(net);
> @@ -1507,8 +1511,6 @@ int l2tp_tunnel_register(struct l2tp_tunnel *tunnel, struct net *net,
> };
>
> setup_udp_tunnel_sock(net, sock, &udp_cfg);
> - } else {
> - rcu_assign_sk_user_data(sk, tunnel);
> }
>
> tunnel->old_sk_destruct = sk->sk_destruct;
> @@ -1522,7 +1524,6 @@ int l2tp_tunnel_register(struct l2tp_tunnel *tunnel, struct net *net,
> if (tunnel->fd >= 0)
> sockfd_put(sock);
>
> - write_unlock(&sk->sk_callback_lock);
> return 0;
>
> err_sock:
> @@ -1530,8 +1531,6 @@ int l2tp_tunnel_register(struct l2tp_tunnel *tunnel, struct net *net,
> sock_release(sock);
> else
> sockfd_put(sock);
> -
> - write_unlock(&sk->sk_callback_lock);
> err:
> return ret;
> }
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 6.1-rc6] l2tp: call udp_tunnel_encap_enable() and sock_release() without sk_callback_lock
2022-11-18 17:50 ` Jakub Sitnicki
@ 2022-11-19 10:08 ` Tetsuo Handa
2022-11-19 13:13 ` Jakub Sitnicki
0 siblings, 1 reply; 11+ messages in thread
From: Tetsuo Handa @ 2022-11-19 10:08 UTC (permalink / raw)
To: Jakub Sitnicki, Eric Dumazet
Cc: David S. Miller, Jakub Kicinski, Paolo Abeni, Hideaki YOSHIFUJI,
David Ahern, Tom Parkin, syzbot, netdev, syzkaller-bugs,
Haowei Yan
On 2022/11/19 2:50, Jakub Sitnicki wrote:
> Thanks for the patch, Tetsuo.
>
> As Eric has pointed out [1], there is another problem - in addition to
> sleeping in atomic context, I have also failed to use the write_lock
> variant which disabled BH locally.
>
> The latter bug can lead to dead-locks, as reported by syzcaller [2, 3],
> because we grab sk_callback_lock in softirq context, which can then
> block waiting on us if:
Below is another approach I was thinking of, for reusing existing locks is prone
to locking bugs like [2] and [3].
I couldn't interpret "Write-protected by @sk_callback_lock." part because
it does not say what lock is needed for protecting sk_user_data for read access.
Is it possible to use a mutex dedicated for l2tp_tunnel_destruct() (and optionally
setup_udp_tunnel_sock_no_enable() in order not to create l2tp_tunnel_register_mutex =>
cpu_hotplug_lock chain) ?
By the way I haven't heard an response on
Since userspace-supplied file descriptor has to be a datagram socket,
can we somehow copy the source/destination addresses from
userspace-supplied socket to kernel-created socket?
at https://lkml.kernel.org/r/c9695548-3f27-dda1-3124-ec21da106741@I-love.SAKURA.ne.jp
(that is, always create a new socket in order to be able to assign lockdep class
before that socket is used).
diff --git a/include/net/sock.h b/include/net/sock.h
index e0517ecc6531..49473013afa6 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -323,7 +323,7 @@ struct sk_filter;
* @sk_tskey: counter to disambiguate concurrent tstamp requests
* @sk_zckey: counter to order MSG_ZEROCOPY notifications
* @sk_socket: Identd and reporting IO signals
- * @sk_user_data: RPC layer private data. Write-protected by @sk_callback_lock.
+ * @sk_user_data: RPC layer private data.
* @sk_frag: cached page frag
* @sk_peek_off: current peek_offset value
* @sk_send_head: front of stuff to transmit
diff --git a/net/l2tp/l2tp_core.c b/net/l2tp/l2tp_core.c
index 754fdda8a5f5..2bfcf6968d89 100644
--- a/net/l2tp/l2tp_core.c
+++ b/net/l2tp/l2tp_core.c
@@ -1150,10 +1150,8 @@ static void l2tp_tunnel_destruct(struct sock *sk)
}
/* Remove hooks into tunnel socket */
- write_lock_bh(&sk->sk_callback_lock);
sk->sk_destruct = tunnel->old_sk_destruct;
sk->sk_user_data = NULL;
- write_unlock_bh(&sk->sk_callback_lock);
/* Call the original destructor */
if (sk->sk_destruct)
@@ -1455,6 +1453,7 @@ static int l2tp_validate_socket(const struct sock *sk, const struct net *net,
int l2tp_tunnel_register(struct l2tp_tunnel *tunnel, struct net *net,
struct l2tp_tunnel_cfg *cfg)
{
+ static DEFINE_MUTEX(l2tp_tunnel_register_mutex);
struct l2tp_tunnel *tunnel_walk;
struct l2tp_net *pn;
struct socket *sock;
@@ -1474,7 +1473,7 @@ int l2tp_tunnel_register(struct l2tp_tunnel *tunnel, struct net *net,
}
sk = sock->sk;
- write_lock(&sk->sk_callback_lock);
+ mutex_lock(&l2tp_tunnel_register_mutex);
ret = l2tp_validate_socket(sk, net, tunnel->encap);
if (ret < 0)
@@ -1519,19 +1518,18 @@ int l2tp_tunnel_register(struct l2tp_tunnel *tunnel, struct net *net,
trace_register_tunnel(tunnel);
+ mutex_unlock(&l2tp_tunnel_register_mutex);
if (tunnel->fd >= 0)
sockfd_put(sock);
- write_unlock(&sk->sk_callback_lock);
return 0;
err_sock:
+ mutex_unlock(&l2tp_tunnel_register_mutex);
if (tunnel->fd < 0)
sock_release(sock);
else
sockfd_put(sock);
-
- write_unlock(&sk->sk_callback_lock);
err:
return ret;
}
^ permalink raw reply related [flat|nested] 11+ messages in thread
* Re: [PATCH 6.1-rc6] l2tp: call udp_tunnel_encap_enable() and sock_release() without sk_callback_lock
2022-11-19 10:08 ` Tetsuo Handa
@ 2022-11-19 13:13 ` Jakub Sitnicki
0 siblings, 0 replies; 11+ messages in thread
From: Jakub Sitnicki @ 2022-11-19 13:13 UTC (permalink / raw)
To: Tetsuo Handa
Cc: Eric Dumazet, David S. Miller, Jakub Kicinski, Paolo Abeni,
Hideaki YOSHIFUJI, David Ahern, Tom Parkin, syzbot, netdev,
syzkaller-bugs, Haowei Yan
On Sat, Nov 19, 2022 at 07:08 PM +09, Tetsuo Handa wrote:
> On 2022/11/19 2:50, Jakub Sitnicki wrote:
>> Thanks for the patch, Tetsuo.
>>
>> As Eric has pointed out [1], there is another problem - in addition to
>> sleeping in atomic context, I have also failed to use the write_lock
>> variant which disabled BH locally.
>>
>> The latter bug can lead to dead-locks, as reported by syzcaller [2, 3],
>> because we grab sk_callback_lock in softirq context, which can then
>> block waiting on us if:
>
> Below is another approach I was thinking of, for reusing existing locks is prone
> to locking bugs like [2] and [3].
>
> I couldn't interpret "Write-protected by @sk_callback_lock." part because
> it does not say what lock is needed for protecting sk_user_data for read access.
sk_user_data is RCU-protected on reader-side. But we still need to
synchronize writers.
> Is it possible to use a mutex dedicated for l2tp_tunnel_destruct() (and optionally
> setup_udp_tunnel_sock_no_enable() in order not to create l2tp_tunnel_register_mutex =>
> cpu_hotplug_lock chain) ?
No, we need to a common lock to synchronize with other users in the net
stack (reuseport groups, sockmap/psock to name a couple).
> By the way I haven't heard an response on
>
> Since userspace-supplied file descriptor has to be a datagram socket,
> can we somehow copy the source/destination addresses from
> userspace-supplied socket to kernel-created socket?
>
> at https://lkml.kernel.org/r/c9695548-3f27-dda1-3124-ec21da106741@I-love.SAKURA.ne.jp
> (that is, always create a new socket in order to be able to assign lockdep class
> before that socket is used).
This is a drive by fix for me to l2tp, so I might not be the best person
to ask, but I will take a look at the thread.
[...]
^ permalink raw reply [flat|nested] 11+ messages in thread
end of thread, other threads:[~2022-11-19 13:18 UTC | newest]
Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-11-17 9:55 [syzbot] BUG: sleeping function called from invalid context in static_key_slow_inc syzbot
2022-11-17 12:03 ` syzbot
2022-11-18 1:56 ` syzbot
2022-11-18 11:51 ` [PATCH 6.1-rc6] l2tp: call udp_tunnel_encap_enable() and sock_release() without sk_callback_lock Tetsuo Handa
2022-11-18 12:36 ` Eric Dumazet
2022-11-18 13:19 ` Tetsuo Handa
2022-11-18 15:04 ` Eric Dumazet
2022-11-18 17:50 ` Jakub Sitnicki
2022-11-19 10:08 ` Tetsuo Handa
2022-11-19 13:13 ` Jakub Sitnicki
[not found] ` <a2199ab7c03e71af3ac791e119e52c94e9f023f56c8b0d8014dd70aceee2784e@mu>
2022-11-18 22:10 ` Jakub Sitnicki
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.