From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1753144AbcK0BL3 (ORCPT );
        Sat, 26 Nov 2016 20:11:29 -0500
Received: from mail-io0-f193.google.com ([209.85.223.193]:36554 "EHLO
        mail-io0-f193.google.com" rhost-flags-OK-OK-OK-OK)
        by vger.kernel.org with ESMTP id S1752507AbcK0BLV (ORCPT );
        Sat, 26 Nov 2016 20:11:21 -0500
MIME-Version: 1.0
In-Reply-To:
References:
From: Cong Wang
Date: Sat, 26 Nov 2016 17:11:00 -0800
Message-ID:
Subject: Re: netlink: GPF in sock_sndtimeo
To: Dmitry Vyukov
Cc: David Miller, Johannes Berg, Florian Westphal, Eric Dumazet,
        Herbert Xu, netdev, LKML, syzkaller
Content-Type: text/plain; charset=UTF-8
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Sat, Nov 26, 2016 at 7:44 AM, Dmitry Vyukov wrote:
> Hello,
>
> The following program triggers GPF in sock_sndtimeo:
> https://gist.githubusercontent.com/dvyukov/c19cadd309791cf5cb9b2bf936d3f48d/raw/1743ba0211079a5465d039512b427bc6b59b1a76/gistfile1.txt
>
> On commit 16ae16c6e5616c084168740990fc508bda6655d4 (Nov 24).
>
> general protection fault: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN
> Dumping ftrace buffer:
>    (ftrace buffer empty)
> Modules linked in:
> CPU: 1 PID: 19950 Comm: syz-executor Not tainted 4.9.0-rc5+ #54
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
> task: ffff88002a0d0840 task.stack: ffff880036920000
> RIP: 0010:[] [< inline >] sock_sndtimeo
> include/net/sock.h:2075
> RIP: 0010:[] []
> netlink_unicast+0xe1/0x730 net/netlink/af_netlink.c:1232
> RSP: 0018:ffff880036926f68 EFLAGS: 00010202
> RAX: 0000000000000068 RBX: ffff880036927000 RCX: ffffc900021d0000
> RDX: 0000000000000d63 RSI: 00000000024000c0 RDI: 0000000000000340
> RBP: ffff880036927028 R08: ffffed0006ea7aab R09: ffffed0006ea7aab
> R10: 0000000000000001 R11: ffffed0006ea7aaa R12: dffffc0000000000
> R13: 0000000000000000 R14: ffff880035de3400 R15: ffff880035de3400
> FS: 00007f90a2fc7700(0000) GS:ffff88003ed00000(0000) knlGS:0000000000000000
> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 00000000006de0c0 CR3: 0000000035de6000 CR4: 00000000000006e0
> Stack:
> ffff880035de3400 ffffffff819f02a1 1ffff10006d24df4 0000000000000004
> 00004db400000014 ffff880036926fd8 ffffffff00000000 0000000041b58ab3
> ffffffff89653c11 ffffffff86cb3500 ffffffff819f0345 ffff880035de3400
> Call Trace:
> [< inline >] audit_replace kernel/audit.c:817
> [] audit_receive_msg+0x22c9/0x2ce0 kernel/audit.c:894
> [< inline >] audit_receive_skb kernel/audit.c:1120
> [] audit_receive+0x1dc/0x360 kernel/audit.c:1133
> [< inline >] netlink_unicast_kernel net/netlink/af_netlink.c:1214
> [] netlink_unicast+0x514/0x730 net/netlink/af_netlink.c:1240
> [] netlink_sendmsg+0xaa4/0xe50 net/netlink/af_netlink.c:1786
> [< inline >] sock_sendmsg_nosec net/socket.c:621
> [] sock_sendmsg+0xcf/0x110 net/socket.c:631
> [] sock_write_iter+0x32b/0x620 net/socket.c:829
> [< inline >] new_sync_write fs/read_write.c:499
> [] __vfs_write+0x4fe/0x830 fs/read_write.c:512
> [] vfs_write+0x175/0x4e0 fs/read_write.c:560
> [< inline >] SYSC_write fs/read_write.c:607
> [] SyS_write+0x100/0x240 fs/read_write.c:599
> [] do_syscall_64+0x2f4/0x940 arch/x86/entry/common.c:280
> [] entry_SYSCALL64_slow_path+0x25/0x25
> Code: fe 4c 89 f7 e8 31 16 ff ff 8b 8d 70 ff ff ff 49 89 c7 31 c0 85
> c9 75 25 e8 7d 4a a3 fa 49 8d bd 40 03 00 00 48 89 f8 48 c1 e8 03 <42>
> 80 3c 20 00 0f 85 3a 06 00 00 49 8b 85 40 03 00 00 4c 8d 73
> RIP [< inline >] sock_sndtimeo include/net/sock.h:2075
> RIP [] netlink_unicast+0xe1/0x730
> net/netlink/af_netlink.c:1232
> RSP
> ---[ end trace 8383a15fba6fdc59 ]---

It is racy on audit_sock, especially on the netns exit path.

Could the following patch help a little bit? Also, I don't see how the
synchronize_net() here could sync with the netlink rcv path, since unlike
packets from the wire, netlink messages are not handled in BH context, nor
do I see any RCU taken on the rcv path.

diff --git a/kernel/audit.c b/kernel/audit.c
index f1ca116..20bc79e 100644
--- a/kernel/audit.c
+++ b/kernel/audit.c
@@ -1167,10 +1167,13 @@ static void __net_exit audit_net_exit(struct net *net)
 {
        struct audit_net *aunet = net_generic(net, audit_net_id);
        struct sock *sock = aunet->nlsk;
+
+       mutex_lock(&audit_cmd_mutex);
        if (sock == audit_sock) {
                audit_pid = 0;
                audit_sock = NULL;
        }
+       mutex_unlock(&audit_cmd_mutex);
 
        RCU_INIT_POINTER(aunet->nlsk, NULL);
        synchronize_net();