* Re: netlink: GPF in sock_sndtimeo [not found] ` <CAM_iQpVeLvfYV+1jX1ZKOntZim4roof4=> @ 2016-11-29 16:48 ` Richard Guy Briggs 2016-11-29 23:13 ` Cong Wang 0 siblings, 1 reply; 36+ messages in thread From: Richard Guy Briggs @ 2016-11-29 16:48 UTC (permalink / raw) To: Cong Wang Cc: Dmitry Vyukov, David Miller, Johannes Berg, Florian Westphal, Eric Dumazet, Herbert Xu, netdev, LKML, syzkaller On 2016-11-26 17:11, Cong Wang wrote: > On Sat, Nov 26, 2016 at 7:44 AM, Dmitry Vyukov <dvyukov@google.com> wrote: > > Hello, Eric, thanks for the heads up on this. (more below...) > > The following program triggers GPF in sock_sndtimeo: > > https://gist.githubusercontent.com/dvyukov/c19cadd309791cf5cb9b2bf936d3f48d/raw/1743ba0211079a5465d039512b427bc6b59b1a76/gistfile1.txt > > > > On commit 16ae16c6e5616c084168740990fc508bda6655d4 (Nov 24). > > > > general protection fault: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN > > Dumping ftrace buffer: > > (ftrace buffer empty) > > Modules linked in: > > CPU: 1 PID: 19950 Comm: syz-executor Not tainted 4.9.0-rc5+ #54 > > Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011 > > task: ffff88002a0d0840 task.stack: ffff880036920000 > > RIP: 0010:[<ffffffff86cb35e1>] [< inline >] sock_sndtimeo > > include/net/sock.h:2075 > > RIP: 0010:[<ffffffff86cb35e1>] [<ffffffff86cb35e1>] > > netlink_unicast+0xe1/0x730 net/netlink/af_netlink.c:1232 > > RSP: 0018:ffff880036926f68 EFLAGS: 00010202 > > RAX: 0000000000000068 RBX: ffff880036927000 RCX: ffffc900021d0000 > > RDX: 0000000000000d63 RSI: 00000000024000c0 RDI: 0000000000000340 > > RBP: ffff880036927028 R08: ffffed0006ea7aab R09: ffffed0006ea7aab > > R10: 0000000000000001 R11: ffffed0006ea7aaa R12: dffffc0000000000 > > R13: 0000000000000000 R14: ffff880035de3400 R15: ffff880035de3400 > > FS: 00007f90a2fc7700(0000) GS:ffff88003ed00000(0000) knlGS:0000000000000000 > > CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > > CR2: 00000000006de0c0 CR3: 0000000035de6000 CR4: 00000000000006e0 > > Stack: > > ffff880035de3400 ffffffff819f02a1 1ffff10006d24df4 0000000000000004 > > 00004db400000014 ffff880036926fd8 ffffffff00000000 0000000041b58ab3 > > ffffffff89653c11 ffffffff86cb3500 ffffffff819f0345 ffff880035de3400 > > Call Trace: > > [< inline >] audit_replace kernel/audit.c:817 > > [<ffffffff816c34b9>] audit_receive_msg+0x22c9/0x2ce0 kernel/audit.c:894 > > [< inline >] audit_receive_skb kernel/audit.c:1120 > > [<ffffffff816c40ac>] audit_receive+0x1dc/0x360 kernel/audit.c:1133 > > [< inline >] netlink_unicast_kernel net/netlink/af_netlink.c:1214 > > [<ffffffff86cb3a14>] netlink_unicast+0x514/0x730 net/netlink/af_netlink.c:1240 > > [<ffffffff86cb46d4>] netlink_sendmsg+0xaa4/0xe50 net/netlink/af_netlink.c:1786 > > [< inline >] sock_sendmsg_nosec net/socket.c:621 > > [<ffffffff86a6d54f>] sock_sendmsg+0xcf/0x110 net/socket.c:631 > > [<ffffffff86a6d8bb>] sock_write_iter+0x32b/0x620 net/socket.c:829 > > [< inline >] new_sync_write fs/read_write.c:499 > > [<ffffffff81a6f24e>] __vfs_write+0x4fe/0x830 fs/read_write.c:512 > > [<ffffffff81a70cf5>] vfs_write+0x175/0x4e0 fs/read_write.c:560 > > [< inline >] SYSC_write fs/read_write.c:607 > > [<ffffffff81a75180>] SyS_write+0x100/0x240 fs/read_write.c:599 > > [<ffffffff81009a24>] do_syscall_64+0x2f4/0x940 arch/x86/entry/common.c:280 > > [<ffffffff88149e8d>] entry_SYSCALL64_slow_path+0x25/0x25 > > Code: fe 4c 89 f7 e8 31 16 ff ff 8b 8d 70 ff ff ff 49 89 c7 31 c0 85 > > c9 75 25 e8 7d 4a a3 fa 49 8d bd 40 03 00 00 48 89 f8 48 c1 e8 03 <42> > > 80 3c 20 00 0f 85 3a 06 00 00 49 8b 85 40 03 00 00 4c 8d 73 > > RIP [< inline >] sock_sndtimeo include/net/sock.h:2075 > > RIP [<ffffffff86cb35e1>] netlink_unicast+0xe1/0x730 > > net/netlink/af_netlink.c:1232 > > RSP <ffff880036926f68> > > ---[ end trace 8383a15fba6fdc59 ]--- > > It is racy on audit_sock, especially on the netns exit path. I think that is the only place it is racy. The other places audit_sock is set is when the socket failure has just triggered a reset. Is there a notifier callback for failed or reaped sockets? > Could the following patch help a little bit? Also, I don't see how the > synchronize_net() here could sync with netlink rcv path, since unlike > packets from wire, netlink messages are not handled in BH context > nor I see any RCU taken on rcv path. Thanks Cong, this looks like it could help. I'll have a closer look. > diff --git a/kernel/audit.c b/kernel/audit.c > index f1ca116..20bc79e 100644 > --- a/kernel/audit.c > +++ b/kernel/audit.c > @@ -1167,10 +1167,13 @@ static void __net_exit audit_net_exit(struct net *net) > { > struct audit_net *aunet = net_generic(net, audit_net_id); > struct sock *sock = aunet->nlsk; > + > + mutex_lock(&audit_cmd_mutex); > if (sock == audit_sock) { > audit_pid = 0; > audit_sock = NULL; > } > + mutex_unlock(&audit_cmd_mutex); > > RCU_INIT_POINTER(aunet->nlsk, NULL); > synchronize_net(); - RGB -- Richard Guy Briggs <rgb@redhat.com> Kernel Security Engineering, Base Operating Systems, Red Hat Remote, Ottawa, Canada Voice: +1.647.777.2635, Internal: (81) 32635 ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: netlink: GPF in sock_sndtimeo 2016-11-29 16:48 ` netlink: GPF in sock_sndtimeo Richard Guy Briggs @ 2016-11-29 23:13 ` Cong Wang 2016-11-30 4:52 ` Richard Guy Briggs 0 siblings, 1 reply; 36+ messages in thread From: Cong Wang @ 2016-11-29 23:13 UTC (permalink / raw) To: Richard Guy Briggs Cc: Dmitry Vyukov, David Miller, Johannes Berg, Florian Westphal, Eric Dumazet, Herbert Xu, netdev, LKML, syzkaller On Tue, Nov 29, 2016 at 8:48 AM, Richard Guy Briggs <rgb@redhat.com> wrote: > On 2016-11-26 17:11, Cong Wang wrote: >> It is racy on audit_sock, especially on the netns exit path. > > I think that is the only place it is racy. The other places audit_sock > is set is when the socket failure has just triggered a reset. > > Is there a notifier callback for failed or reaped sockets? Is NETLINK_URELEASE event what you are looking for? ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: netlink: GPF in sock_sndtimeo 2016-11-29 23:13 ` Cong Wang @ 2016-11-30 4:52 ` Richard Guy Briggs 2016-12-09 6:02 ` Richard Guy Briggs 0 siblings, 1 reply; 36+ messages in thread From: Richard Guy Briggs @ 2016-11-30 4:52 UTC (permalink / raw) To: Cong Wang Cc: Dmitry Vyukov, David Miller, Johannes Berg, Florian Westphal, Eric Dumazet, Herbert Xu, netdev, LKML, syzkaller On 2016-11-29 15:13, Cong Wang wrote: > On Tue, Nov 29, 2016 at 8:48 AM, Richard Guy Briggs <rgb@redhat.com> wrote: > > On 2016-11-26 17:11, Cong Wang wrote: > >> It is racy on audit_sock, especially on the netns exit path. > > > > I think that is the only place it is racy. The other places audit_sock > > is set is when the socket failure has just triggered a reset. > > > > Is there a notifier callback for failed or reaped sockets? > > Is NETLINK_URELEASE event what you are looking for? Possibly, yes. Thanks, I'll have a look. - RGB -- Richard Guy Briggs <rgb@redhat.com> Kernel Security Engineering, Base Operating Systems, Red Hat Remote, Ottawa, Canada Voice: +1.647.777.2635, Internal: (81) 32635 ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: netlink: GPF in sock_sndtimeo 2016-11-30 4:52 ` Richard Guy Briggs @ 2016-12-09 6:02 ` Richard Guy Briggs 2016-12-09 6:57 ` Cong Wang 2016-12-09 10:49 ` netlink: GPF in sock_sndtimeo Dmitry Vyukov 0 siblings, 2 replies; 36+ messages in thread From: Richard Guy Briggs @ 2016-12-09 6:02 UTC (permalink / raw) To: Cong Wang, linux-audit, pmoore Cc: Dmitry Vyukov, David Miller, Johannes Berg, Florian Westphal, Eric Dumazet, Herbert Xu, netdev, LKML, syzkaller On 2016-11-29 23:52, Richard Guy Briggs wrote: > On 2016-11-29 15:13, Cong Wang wrote: > > On Tue, Nov 29, 2016 at 8:48 AM, Richard Guy Briggs <rgb@redhat.com> wrote: > > > On 2016-11-26 17:11, Cong Wang wrote: > > >> It is racy on audit_sock, especially on the netns exit path. > > > > > > I think that is the only place it is racy. The other places audit_sock > > > is set is when the socket failure has just triggered a reset. > > > > > > Is there a notifier callback for failed or reaped sockets? > > > > Is NETLINK_URELEASE event what you are looking for? > > Possibly, yes. Thanks, I'll have a look. I tried a quick compile attempt on the test case (I assume it is a socket fuzzer) and get the following compile error: cc -g -O0 -Wall -D_GNU_SOURCE -o socket_fuzz socket_fuzz.c socket_fuzz.c:16:1: warning: "_GNU_SOURCE" redefined <command-line>: warning: this is the location of the previous definition socket_fuzz.c: In function ‘segv_handler’: socket_fuzz.c:89: warning: implicit declaration of function ‘__atomic_load_n’ socket_fuzz.c:89: error: ‘__ATOMIC_RELAXED’ undeclared (first use in this function) socket_fuzz.c:89: error: (Each undeclared identifier is reported only once socket_fuzz.c:89: error: for each function it appears in.) socket_fuzz.c: In function ‘loop’: socket_fuzz.c:280: warning: unused variable ‘errno0’ socket_fuzz.c: In function ‘test’: socket_fuzz.c:303: warning: implicit declaration of function ‘__atomic_fetch_add’ socket_fuzz.c:303: error: ‘__ATOMIC_SEQ_CST’ undeclared (first use in this function) socket_fuzz.c:303: warning: implicit declaration of function ‘__atomic_fetch_sub’ I also tried to extend Cong Wang's idea to attempt to proactively respond to a NETLINK_URELEASE on the audit_sock and reset it, but ran into a locking error stack dump using mutex_lock(&audit_cmd_mutex) in the notifier callback. Eliminating the lock since the sock is dead anways eliminates the error. Is it safe? I'll resubmit if this looks remotely sane. Meanwhile I'll try to get the test case to compile. This is being tracked as https://github.com/linux-audit/audit-kernel/issues/30 Subject: [PATCH] audit: proactively reset audit_sock on matching NETLINK_URELEASE diff --git a/kernel/audit.c b/kernel/audit.c index f1ca116..91d222d 100644 --- a/kernel/audit.c +++ b/kernel/audit.c @@ -423,6 +423,7 @@ static void kauditd_send_skb(struct sk_buff *skb) snprintf(s, sizeof(s), "audit_pid=%d reset", audit_pid); audit_log_lost(s); audit_pid = 0; + audit_nlk_portid = 0; audit_sock = NULL; } else { pr_warn("re-scheduling(#%d) write to audit_pid=%d\n", @@ -1143,6 +1144,28 @@ static int audit_bind(struct net *net, int group) return 0; } +static int audit_sock_netlink_notify(struct notifier_block *nb, + unsigned long event, + void *_notify) +{ + struct netlink_notify *notify = _notify; + struct audit_net *aunet = net_generic(notify->net, audit_net_id); + + if (event == NETLINK_URELEASE && notify->protocol == NETLINK_AUDIT) { + if (audit_nlk_portid == notify->portid && + audit_sock == aunet->nlsk) { + audit_pid = 0; + audit_nlk_portid = 0; + audit_sock = NULL; + } + } + return NOTIFY_DONE; +} + +static struct notifier_block audit_netlink_notifier = { + .notifier_call = audit_sock_netlink_notify, +}; + static int __net_init audit_net_init(struct net *net) { struct netlink_kernel_cfg cfg = { @@ -1167,10 +1190,14 @@ static void __net_exit audit_net_exit(struct net *net) { struct audit_net *aunet = net_generic(net, audit_net_id); struct sock *sock = aunet->nlsk; + + mutex_lock(&audit_cmd_mutex); if (sock == audit_sock) { audit_pid = 0; + audit_nlk_portid = 0; audit_sock = NULL; } + mutex_unlock(&audit_cmd_mutex); RCU_INIT_POINTER(aunet->nlsk, NULL); synchronize_net(); @@ -1202,6 +1229,7 @@ static int __init audit_init(void) audit_enabled = audit_default; audit_ever_enabled |= !!audit_default; + netlink_register_notifier(&audit_netlink_notifier); audit_log(NULL, GFP_KERNEL, AUDIT_KERNEL, "initialized"); for (i = 0; i < AUDIT_INODE_BUCKETS; i++) -- 1.7.1 > - RGB - RGB -- Richard Guy Briggs <rgb@redhat.com> Kernel Security Engineering, Base Operating Systems, Red Hat Remote, Ottawa, Canada Voice: +1.647.777.2635, Internal: (81) 32635 ^ permalink raw reply related [flat|nested] 36+ messages in thread
* Re: netlink: GPF in sock_sndtimeo 2016-12-09 6:02 ` Richard Guy Briggs @ 2016-12-09 6:57 ` Cong Wang 2016-12-09 11:01 ` Richard Guy Briggs 2016-12-09 10:49 ` netlink: GPF in sock_sndtimeo Dmitry Vyukov 1 sibling, 1 reply; 36+ messages in thread From: Cong Wang @ 2016-12-09 6:57 UTC (permalink / raw) To: Richard Guy Briggs Cc: linux-audit, Paul Moore, Dmitry Vyukov, David Miller, Johannes Berg, Florian Westphal, Eric Dumazet, Herbert Xu, netdev, LKML, syzkaller On Thu, Dec 8, 2016 at 10:02 PM, Richard Guy Briggs <rgb@redhat.com> wrote: > I also tried to extend Cong Wang's idea to attempt to proactively respond to a > NETLINK_URELEASE on the audit_sock and reset it, but ran into a locking error > stack dump using mutex_lock(&audit_cmd_mutex) in the notifier callback. > Eliminating the lock since the sock is dead anways eliminates the error. > > Is it safe? I'll resubmit if this looks remotely sane. Meanwhile I'll try to > get the test case to compile. It doesn't look safe, because 'audit_sock', 'audit_nlk_portid' and 'audit_pid' are updated as a whole and race between audit_receive_msg() and NETLINK_URELEASE. > @@ -1167,10 +1190,14 @@ static void __net_exit audit_net_exit(struct net *net) > { > struct audit_net *aunet = net_generic(net, audit_net_id); > struct sock *sock = aunet->nlsk; > + > + mutex_lock(&audit_cmd_mutex); > if (sock == audit_sock) { > audit_pid = 0; > + audit_nlk_portid = 0; > audit_sock = NULL; > } > + mutex_unlock(&audit_cmd_mutex); > If you decide to use NETLINK_URELEASE notifier, the above piece is no longer needed, the net_exit path simply releases a refcnt. ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: netlink: GPF in sock_sndtimeo 2016-12-09 6:57 ` Cong Wang @ 2016-12-09 11:01 ` Richard Guy Briggs 2016-12-10 4:13 ` Cong Wang 0 siblings, 1 reply; 36+ messages in thread From: Richard Guy Briggs @ 2016-12-09 11:01 UTC (permalink / raw) To: Cong Wang Cc: linux-audit, Paul Moore, Dmitry Vyukov, David Miller, Johannes Berg, Florian Westphal, Eric Dumazet, Herbert Xu, netdev, LKML, syzkaller On 2016-12-08 22:57, Cong Wang wrote: > On Thu, Dec 8, 2016 at 10:02 PM, Richard Guy Briggs <rgb@redhat.com> wrote: > > I also tried to extend Cong Wang's idea to attempt to proactively respond to a > > NETLINK_URELEASE on the audit_sock and reset it, but ran into a locking error > > stack dump using mutex_lock(&audit_cmd_mutex) in the notifier callback. > > Eliminating the lock since the sock is dead anways eliminates the error. > > > > Is it safe? I'll resubmit if this looks remotely sane. Meanwhile I'll try to > > get the test case to compile. > > It doesn't look safe, because 'audit_sock', 'audit_nlk_portid' and 'audit_pid' > are updated as a whole and race between audit_receive_msg() and > NETLINK_URELEASE. This is what I expected and why I originally added the mutex lock in the callback... The dumps I got were bare with no wrapper identifying the process context or specific error, so I'm at a bit of a loss how to solve this (without thinking more about it) other than instinctively removing the mutex. Another approach might be to look at consolidating the three into one identifier or derive the other two from one, or serialize their access. > > @@ -1167,10 +1190,14 @@ static void __net_exit audit_net_exit(struct net *net) > > { > > struct audit_net *aunet = net_generic(net, audit_net_id); > > struct sock *sock = aunet->nlsk; > > + > > + mutex_lock(&audit_cmd_mutex); > > if (sock == audit_sock) { > > audit_pid = 0; > > + audit_nlk_portid = 0; > > audit_sock = NULL; > > } > > + mutex_unlock(&audit_cmd_mutex); > > If you decide to use NETLINK_URELEASE notifier, the above piece is no > longer needed, the net_exit path simply releases a refcnt. Good point. It would have already killed it off. So this piece is arguably too late anyways. - RGB -- Richard Guy Briggs <rgb@redhat.com> Kernel Security Engineering, Base Operating Systems, Red Hat Remote, Ottawa, Canada Voice: +1.647.777.2635, Internal: (81) 32635 ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: netlink: GPF in sock_sndtimeo 2016-12-09 11:01 ` Richard Guy Briggs @ 2016-12-10 4:13 ` Cong Wang 2016-12-10 7:40 ` Cong Wang 2016-12-12 10:02 ` Richard Guy Briggs 0 siblings, 2 replies; 36+ messages in thread From: Cong Wang @ 2016-12-10 4:13 UTC (permalink / raw) To: Richard Guy Briggs Cc: linux-audit, Paul Moore, Dmitry Vyukov, David Miller, Johannes Berg, Florian Westphal, Eric Dumazet, Herbert Xu, netdev, LKML, syzkaller On Fri, Dec 9, 2016 at 3:01 AM, Richard Guy Briggs <rgb@redhat.com> wrote: > On 2016-12-08 22:57, Cong Wang wrote: >> On Thu, Dec 8, 2016 at 10:02 PM, Richard Guy Briggs <rgb@redhat.com> wrote: >> > I also tried to extend Cong Wang's idea to attempt to proactively respond to a >> > NETLINK_URELEASE on the audit_sock and reset it, but ran into a locking error >> > stack dump using mutex_lock(&audit_cmd_mutex) in the notifier callback. >> > Eliminating the lock since the sock is dead anways eliminates the error. >> > >> > Is it safe? I'll resubmit if this looks remotely sane. Meanwhile I'll try to >> > get the test case to compile. >> >> It doesn't look safe, because 'audit_sock', 'audit_nlk_portid' and 'audit_pid' >> are updated as a whole and race between audit_receive_msg() and >> NETLINK_URELEASE. > > This is what I expected and why I originally added the mutex lock in the > callback... The dumps I got were bare with no wrapper identifying the > process context or specific error, so I'm at a bit of a loss how to > solve this (without thinking more about it) other than instinctively > removing the mutex. Netlink notifier can safely be converted to blocking one, I will send a patch. But I seriously doubt you really need NETLINK_URELEASE here, it adds nothing but overhead, b/c the netlink notifier is called on every netlink socket in the system, but for net exit path, that is relatively a slow path. Also, kauditd_send_skb() needs audit_cmd_mutex too. I will send a formal patch. Thanks. ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: netlink: GPF in sock_sndtimeo 2016-12-10 4:13 ` Cong Wang @ 2016-12-10 7:40 ` Cong Wang 2016-12-12 10:07 ` Dmitry Vyukov 2016-12-13 7:51 ` Richard Guy Briggs 2016-12-12 10:02 ` Richard Guy Briggs 1 sibling, 2 replies; 36+ messages in thread From: Cong Wang @ 2016-12-10 7:40 UTC (permalink / raw) To: Richard Guy Briggs Cc: linux-audit, Paul Moore, Dmitry Vyukov, David Miller, Johannes Berg, Florian Westphal, Eric Dumazet, Herbert Xu, netdev, LKML, syzkaller [-- Attachment #1: Type: text/plain, Size: 1664 bytes --] On Fri, Dec 9, 2016 at 8:13 PM, Cong Wang <xiyou.wangcong@gmail.com> wrote: > On Fri, Dec 9, 2016 at 3:01 AM, Richard Guy Briggs <rgb@redhat.com> wrote: >> On 2016-12-08 22:57, Cong Wang wrote: >>> On Thu, Dec 8, 2016 at 10:02 PM, Richard Guy Briggs <rgb@redhat.com> wrote: >>> > I also tried to extend Cong Wang's idea to attempt to proactively respond to a >>> > NETLINK_URELEASE on the audit_sock and reset it, but ran into a locking error >>> > stack dump using mutex_lock(&audit_cmd_mutex) in the notifier callback. >>> > Eliminating the lock since the sock is dead anways eliminates the error. >>> > >>> > Is it safe? I'll resubmit if this looks remotely sane. Meanwhile I'll try to >>> > get the test case to compile. >>> >>> It doesn't look safe, because 'audit_sock', 'audit_nlk_portid' and 'audit_pid' >>> are updated as a whole and race between audit_receive_msg() and >>> NETLINK_URELEASE. >> >> This is what I expected and why I originally added the mutex lock in the >> callback... The dumps I got were bare with no wrapper identifying the >> process context or specific error, so I'm at a bit of a loss how to >> solve this (without thinking more about it) other than instinctively >> removing the mutex. > > Netlink notifier can safely be converted to blocking one, I will send > a patch. > > But I seriously doubt you really need NETLINK_URELEASE here, > it adds nothing but overhead, b/c the netlink notifier is called on > every netlink socket in the system, but for net exit path, that is > relatively a slow path. > > Also, kauditd_send_skb() needs audit_cmd_mutex too. Please let me know what you think about the attached patch? Thanks! [-- Attachment #2: audit_sock.diff --] [-- Type: text/plain, Size: 1370 bytes --] commit a12b43ee814625933ff155c20dc863c59cfcf240 Author: Cong Wang <xiyou.wangcong@gmail.com> Date: Fri Dec 9 17:56:42 2016 -0800 audit: close a race condition on audit_sock Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> diff --git a/kernel/audit.c b/kernel/audit.c index f1ca116..ab947d8 100644 --- a/kernel/audit.c +++ b/kernel/audit.c @@ -423,6 +423,8 @@ static void kauditd_send_skb(struct sk_buff *skb) snprintf(s, sizeof(s), "audit_pid=%d reset", audit_pid); audit_log_lost(s); audit_pid = 0; + audit_nlk_portid = 0; + sock_put(audit_sock); audit_sock = NULL; } else { pr_warn("re-scheduling(#%d) write to audit_pid=%d\n", @@ -899,6 +901,9 @@ static int audit_receive_msg(struct sk_buff *skb, struct nlmsghdr *nlh) audit_log_config_change("audit_pid", new_pid, audit_pid, 1); audit_pid = new_pid; audit_nlk_portid = NETLINK_CB(skb).portid; + sock_hold(skb->sk); + if (audit_sock) + sock_put(audit_sock); audit_sock = skb->sk; } if (s.mask & AUDIT_STATUS_RATE_LIMIT) { @@ -1167,10 +1172,6 @@ static void __net_exit audit_net_exit(struct net *net) { struct audit_net *aunet = net_generic(net, audit_net_id); struct sock *sock = aunet->nlsk; - if (sock == audit_sock) { - audit_pid = 0; - audit_sock = NULL; - } RCU_INIT_POINTER(aunet->nlsk, NULL); synchronize_net(); ^ permalink raw reply related [flat|nested] 36+ messages in thread
* Re: netlink: GPF in sock_sndtimeo 2016-12-10 7:40 ` Cong Wang @ 2016-12-12 10:07 ` Dmitry Vyukov 2016-12-13 7:51 ` Richard Guy Briggs 1 sibling, 0 replies; 36+ messages in thread From: Dmitry Vyukov @ 2016-12-12 10:07 UTC (permalink / raw) To: syzkaller Cc: Richard Guy Briggs, linux-audit, Paul Moore, David Miller, Johannes Berg, Florian Westphal, Eric Dumazet, Herbert Xu, netdev, LKML On Sat, Dec 10, 2016 at 8:40 AM, Cong Wang <xiyou.wangcong@gmail.com> wrote: >>> On 2016-12-08 22:57, Cong Wang wrote: >>>> On Thu, Dec 8, 2016 at 10:02 PM, Richard Guy Briggs <rgb@redhat.com> wrote: >>>> > I also tried to extend Cong Wang's idea to attempt to proactively respond to a >>>> > NETLINK_URELEASE on the audit_sock and reset it, but ran into a locking error >>>> > stack dump using mutex_lock(&audit_cmd_mutex) in the notifier callback. >>>> > Eliminating the lock since the sock is dead anways eliminates the error. >>>> > >>>> > Is it safe? I'll resubmit if this looks remotely sane. Meanwhile I'll try to >>>> > get the test case to compile. >>>> >>>> It doesn't look safe, because 'audit_sock', 'audit_nlk_portid' and 'audit_pid' >>>> are updated as a whole and race between audit_receive_msg() and >>>> NETLINK_URELEASE. >>> >>> This is what I expected and why I originally added the mutex lock in the >>> callback... The dumps I got were bare with no wrapper identifying the >>> process context or specific error, so I'm at a bit of a loss how to >>> solve this (without thinking more about it) other than instinctively >>> removing the mutex. >> >> Netlink notifier can safely be converted to blocking one, I will send >> a patch. >> >> But I seriously doubt you really need NETLINK_URELEASE here, >> it adds nothing but overhead, b/c the netlink notifier is called on >> every netlink socket in the system, but for net exit path, that is >> relatively a slow path. >> >> Also, kauditd_send_skb() needs audit_cmd_mutex too. > > Please let me know what you think about the attached patch? Applied the patch locally and have not seen the bug since then (~24 hours of testing). ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: netlink: GPF in sock_sndtimeo 2016-12-10 7:40 ` Cong Wang 2016-12-12 10:07 ` Dmitry Vyukov @ 2016-12-13 7:51 ` Richard Guy Briggs 2016-12-13 8:28 ` Richard Guy Briggs 1 sibling, 1 reply; 36+ messages in thread From: Richard Guy Briggs @ 2016-12-13 7:51 UTC (permalink / raw) To: Cong Wang Cc: Herbert Xu, Johannes Berg, netdev, Florian Westphal, LKML, Eric Dumazet, linux-audit, syzkaller, David Miller, Dmitry Vyukov On 2016-12-09 23:40, Cong Wang wrote: > On Fri, Dec 9, 2016 at 8:13 PM, Cong Wang <xiyou.wangcong@gmail.com> wrote: > > On Fri, Dec 9, 2016 at 3:01 AM, Richard Guy Briggs <rgb@redhat.com> wrote: > >> On 2016-12-08 22:57, Cong Wang wrote: > >>> On Thu, Dec 8, 2016 at 10:02 PM, Richard Guy Briggs <rgb@redhat.com> wrote: > >>> > I also tried to extend Cong Wang's idea to attempt to proactively respond to a > >>> > NETLINK_URELEASE on the audit_sock and reset it, but ran into a locking error > >>> > stack dump using mutex_lock(&audit_cmd_mutex) in the notifier callback. > >>> > Eliminating the lock since the sock is dead anways eliminates the error. > >>> > > >>> > Is it safe? I'll resubmit if this looks remotely sane. Meanwhile I'll try to > >>> > get the test case to compile. > >>> > >>> It doesn't look safe, because 'audit_sock', 'audit_nlk_portid' and 'audit_pid' > >>> are updated as a whole and race between audit_receive_msg() and > >>> NETLINK_URELEASE. > >> > >> This is what I expected and why I originally added the mutex lock in the > >> callback... The dumps I got were bare with no wrapper identifying the > >> process context or specific error, so I'm at a bit of a loss how to > >> solve this (without thinking more about it) other than instinctively > >> removing the mutex. > > > > Netlink notifier can safely be converted to blocking one, I will send > > a patch. > > > > But I seriously doubt you really need NETLINK_URELEASE here, > > it adds nothing but overhead, b/c the netlink notifier is called on > > every netlink socket in the system, but for net exit path, that is > > relatively a slow path. > > > > Also, kauditd_send_skb() needs audit_cmd_mutex too. > > Please let me know what you think about the attached patch? > > Thanks! > commit a12b43ee814625933ff155c20dc863c59cfcf240 > Author: Cong Wang <xiyou.wangcong@gmail.com> > Date: Fri Dec 9 17:56:42 2016 -0800 > > audit: close a race condition on audit_sock > > Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> > > diff --git a/kernel/audit.c b/kernel/audit.c > index f1ca116..ab947d8 100644 > --- a/kernel/audit.c > +++ b/kernel/audit.c > @@ -423,6 +423,8 @@ static void kauditd_send_skb(struct sk_buff *skb) > snprintf(s, sizeof(s), "audit_pid=%d reset", audit_pid); > audit_log_lost(s); > audit_pid = 0; > + audit_nlk_portid = 0; > + sock_put(audit_sock); > audit_sock = NULL; > } else { > pr_warn("re-scheduling(#%d) write to audit_pid=%d\n", > @@ -899,6 +901,9 @@ static int audit_receive_msg(struct sk_buff *skb, struct nlmsghdr *nlh) > audit_log_config_change("audit_pid", new_pid, audit_pid, 1); > audit_pid = new_pid; > audit_nlk_portid = NETLINK_CB(skb).portid; > + sock_hold(skb->sk); > + if (audit_sock) > + sock_put(audit_sock); > audit_sock = skb->sk; > } > if (s.mask & AUDIT_STATUS_RATE_LIMIT) { > @@ -1167,10 +1172,6 @@ static void __net_exit audit_net_exit(struct net *net) > { > struct audit_net *aunet = net_generic(net, audit_net_id); > struct sock *sock = aunet->nlsk; > - if (sock == audit_sock) { > - audit_pid = 0; > - audit_sock = NULL; > - } So how does this not leak memory leaving the sock refcount incremented by the registered audit daemon when that daemon shuts down normally? - RGB -- Richard Guy Briggs <rgb@redhat.com> Kernel Security Engineering, Base Operating Systems, Red Hat Remote, Ottawa, Canada Voice: +1.647.777.2635, Internal: (81) 32635 ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: netlink: GPF in sock_sndtimeo 2016-12-13 7:51 ` Richard Guy Briggs @ 2016-12-13 8:28 ` Richard Guy Briggs 0 siblings, 0 replies; 36+ messages in thread From: Richard Guy Briggs @ 2016-12-13 8:28 UTC (permalink / raw) To: Cong Wang Cc: Herbert Xu, Johannes Berg, netdev, Florian Westphal, LKML, Eric Dumazet, linux-audit, syzkaller, David Miller, Dmitry Vyukov On 2016-12-13 02:51, Richard Guy Briggs wrote: > On 2016-12-09 23:40, Cong Wang wrote: > > On Fri, Dec 9, 2016 at 8:13 PM, Cong Wang <xiyou.wangcong@gmail.com> wrote: > > > On Fri, Dec 9, 2016 at 3:01 AM, Richard Guy Briggs <rgb@redhat.com> wrote: > > >> On 2016-12-08 22:57, Cong Wang wrote: > > >>> On Thu, Dec 8, 2016 at 10:02 PM, Richard Guy Briggs <rgb@redhat.com> wrote: > > >>> > I also tried to extend Cong Wang's idea to attempt to proactively respond to a > > >>> > NETLINK_URELEASE on the audit_sock and reset it, but ran into a locking error > > >>> > stack dump using mutex_lock(&audit_cmd_mutex) in the notifier callback. > > >>> > Eliminating the lock since the sock is dead anways eliminates the error. > > >>> > > > >>> > Is it safe? I'll resubmit if this looks remotely sane. Meanwhile I'll try to > > >>> > get the test case to compile. > > >>> > > >>> It doesn't look safe, because 'audit_sock', 'audit_nlk_portid' and 'audit_pid' > > >>> are updated as a whole and race between audit_receive_msg() and > > >>> NETLINK_URELEASE. > > >> > > >> This is what I expected and why I originally added the mutex lock in the > > >> callback... The dumps I got were bare with no wrapper identifying the > > >> process context or specific error, so I'm at a bit of a loss how to > > >> solve this (without thinking more about it) other than instinctively > > >> removing the mutex. > > > > > > Netlink notifier can safely be converted to blocking one, I will send > > > a patch. > > > > > > But I seriously doubt you really need NETLINK_URELEASE here, > > > it adds nothing but overhead, b/c the netlink notifier is called on > > > every netlink socket in the system, but for net exit path, that is > > > relatively a slow path. > > > > > > Also, kauditd_send_skb() needs audit_cmd_mutex too. > > > > Please let me know what you think about the attached patch? > > > > Thanks! > > > commit a12b43ee814625933ff155c20dc863c59cfcf240 > > Author: Cong Wang <xiyou.wangcong@gmail.com> > > Date: Fri Dec 9 17:56:42 2016 -0800 > > > > audit: close a race condition on audit_sock > > > > Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> > > > > diff --git a/kernel/audit.c b/kernel/audit.c > > index f1ca116..ab947d8 100644 > > --- a/kernel/audit.c > > +++ b/kernel/audit.c > > @@ -423,6 +423,8 @@ static void kauditd_send_skb(struct sk_buff *skb) > > snprintf(s, sizeof(s), "audit_pid=%d reset", audit_pid); > > audit_log_lost(s); > > audit_pid = 0; > > + audit_nlk_portid = 0; > > + sock_put(audit_sock); > > audit_sock = NULL; > > } else { > > pr_warn("re-scheduling(#%d) write to audit_pid=%d\n", > > @@ -899,6 +901,9 @@ static int audit_receive_msg(struct sk_buff *skb, struct nlmsghdr *nlh) > > audit_log_config_change("audit_pid", new_pid, audit_pid, 1); > > audit_pid = new_pid; > > audit_nlk_portid = NETLINK_CB(skb).portid; > > + sock_hold(skb->sk); > > + if (audit_sock) > > + sock_put(audit_sock); > > audit_sock = skb->sk; > > } > > if (s.mask & AUDIT_STATUS_RATE_LIMIT) { > > @@ -1167,10 +1172,6 @@ static void __net_exit audit_net_exit(struct net *net) > > { > > struct audit_net *aunet = net_generic(net, audit_net_id); > > struct sock *sock = aunet->nlsk; > > - if (sock == audit_sock) { > > - audit_pid = 0; > > - audit_sock = NULL; > > - } > > So how does this not leak memory leaving the sock refcount incremented > by the registered audit daemon when that daemon shuts down normally? Sorry, that should have been: How does it not leak if auditd exits abnormally without sending a shutdown message, but no message is sent on the queue to trigger an error before the net namespace exits? > - RGB - RGB -- Richard Guy Briggs <rgb@redhat.com> Kernel Security Engineering, Base Operating Systems, Red Hat Remote, Ottawa, Canada Voice: +1.647.777.2635, Internal: (81) 32635 ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: netlink: GPF in sock_sndtimeo 2016-12-10 4:13 ` Cong Wang 2016-12-10 7:40 ` Cong Wang @ 2016-12-12 10:02 ` Richard Guy Briggs 2016-12-12 10:03 ` [PATCH v2] audit: use proper refcount locking on audit_sock Richard Guy Briggs ` (2 more replies) 1 sibling, 3 replies; 36+ messages in thread From: Richard Guy Briggs @ 2016-12-12 10:02 UTC (permalink / raw) To: Cong Wang Cc: linux-audit, Paul Moore, Dmitry Vyukov, David Miller, Johannes Berg, Florian Westphal, Eric Dumazet, Herbert Xu, netdev, LKML, syzkaller On 2016-12-09 20:13, Cong Wang wrote: > On Fri, Dec 9, 2016 at 3:01 AM, Richard Guy Briggs <rgb@redhat.com> wrote: > > On 2016-12-08 22:57, Cong Wang wrote: > >> On Thu, Dec 8, 2016 at 10:02 PM, Richard Guy Briggs <rgb@redhat.com> wrote: > >> > I also tried to extend Cong Wang's idea to attempt to proactively respond to a > >> > NETLINK_URELEASE on the audit_sock and reset it, but ran into a locking error > >> > stack dump using mutex_lock(&audit_cmd_mutex) in the notifier callback. > >> > Eliminating the lock since the sock is dead anways eliminates the error. > >> > > >> > Is it safe? I'll resubmit if this looks remotely sane. Meanwhile I'll try to > >> > get the test case to compile. > >> > >> It doesn't look safe, because 'audit_sock', 'audit_nlk_portid' and 'audit_pid' > >> are updated as a whole and race between audit_receive_msg() and > >> NETLINK_URELEASE. > > > > This is what I expected and why I originally added the mutex lock in the > > callback... The dumps I got were bare with no wrapper identifying the > > process context or specific error, so I'm at a bit of a loss how to > > solve this (without thinking more about it) other than instinctively > > removing the mutex. > > Netlink notifier can safely be converted to blocking one, I will send > a patch. I had a quick look at how that might happen. The netlink notifier chain is atomic. Would the registered callback funciton need to spawn a one-time thread to avoid blocking? > But I seriously doubt you really need NETLINK_URELEASE here, > it adds nothing but overhead, b/c the netlink notifier is called on > every netlink socket in the system, but for net exit path, that is > relatively a slow path. I was a bit concerned about its overhead, but was hoping to update audit_sock more quickly in the case of a sock shutting down for any reason. > Also, kauditd_send_skb() needs audit_cmd_mutex too. Agreed. > I will send a formal patch. I had a look at your patch. It looks attractively simple. The audit next tree has patches queued that add an audit_reset function that will require more work. I still see some potential gaps. - If the process messes up (or the sock lookup messes up) it is reset in the kauditd thread under the audit_cmd_mutex. - If the process exits normally or is replaced due to an audit_replace error, it is reset from audit_receive_skb under the audit_cmd_mutex. - If the process dies before the kauditd thread notices, either reap it via notifier callback or it needs a check on net exit to reset. This last one appears necessary to decrement the sock refcount so the sock can be released in netlink_kernel_release(). If we want to be proactive and use the netlink notifier, we assume the overhead of adding to the netlink notifier chain and eliminate all the other reset calls under the kauditd thread. If we are ok being reactionary, then we'll at least need the net exit check on audit_sock. Have I understood this correctly? I'll follow with a patch based on audit#next There will be an upstream merge conflict between audit#next and net#next due to the removal of: RCU_INIT_POINTER(aunet->nlsk, NULL); synchronize_net(); from the end of audit_net_exit(). This patch should probably go through the audit maintainer due to the other anticipated merge conflicts. > Thanks. - RGB -- Richard Guy Briggs <rgb@redhat.com> Kernel Security Engineering, Base Operating Systems, Red Hat Remote, Ottawa, Canada Voice: +1.647.777.2635, Internal: (81) 32635 ^ permalink raw reply [flat|nested] 36+ messages in thread
* [PATCH v2] audit: use proper refcount locking on audit_sock 2016-12-12 10:02 ` Richard Guy Briggs @ 2016-12-12 10:03 ` Richard Guy Briggs 2016-12-12 17:10 ` Paul Moore ` (2 more replies) 2016-12-13 0:10 ` netlink: GPF in sock_sndtimeo Cong Wang 2016-12-13 15:03 ` [RFC PATCH v3] audit: use proper refcount locking on audit_sock Richard Guy Briggs 2 siblings, 3 replies; 36+ messages in thread From: Richard Guy Briggs @ 2016-12-12 10:03 UTC (permalink / raw) To: netdev, linux-kernel, linux-audit Cc: Richard Guy Briggs, dvyukov, xiyou.wangcong, edumazet, eparis, pmoore, sgrubb Resetting audit_sock appears to be racy. audit_sock was being copied and dereferenced without using a refcount on the source sock. Bump the refcount on the underlying sock when we store a refrence in audit_sock and release it when we reset audit_sock. audit_sock modification needs the audit_cmd_mutex. See: https://lkml.org/lkml/2016/11/26/232 Thanks to Eric Dumazet <edumazet@google.com> and Cong Wang <xiyou.wangcong@gmail.com> on ideas how to fix it. Signed-off-by: Richard Guy Briggs <rgb@redhat.com> --- There has been a lot of change in the audit code that is about to go upstream to address audit queue issues. This patch is based on the source tree: git://git.infradead.org/users/pcmoore/audit#next --- kernel/audit.c | 34 ++++++++++++++++++++++++++++------ 1 files changed, 28 insertions(+), 6 deletions(-) diff --git a/kernel/audit.c b/kernel/audit.c index f20eee0..439f7f3 100644 --- a/kernel/audit.c +++ b/kernel/audit.c @@ -452,7 +452,9 @@ static void auditd_reset(void) struct sk_buff *skb; /* break the connection */ + sock_put(audit_sock); audit_pid = 0; + audit_nlk_portid = 0; audit_sock = NULL; /* flush all of the retry queue to the hold queue */ @@ -478,6 +480,12 @@ static int kauditd_send_unicast_skb(struct sk_buff *skb) if (rc >= 0) { consume_skb(skb); rc = 0; + } else { + if (rc & (-ENOMEM|-EPERM|-ECONNREFUSED)) { + mutex_lock(&audit_cmd_mutex); + auditd_reset(); + mutex_unlock(&audit_cmd_mutex); + } } return rc; @@ -579,7 +587,9 @@ static int kauditd_thread(void *dummy) auditd = 0; if (AUDITD_BAD(rc, reschedule)) { + mutex_lock(&audit_cmd_mutex); auditd_reset(); + mutex_unlock(&audit_cmd_mutex); reschedule = 0; } } else @@ -594,7 +604,9 @@ static int kauditd_thread(void *dummy) auditd = 0; if (AUDITD_BAD(rc, reschedule)) { kauditd_hold_skb(skb); + mutex_lock(&audit_cmd_mutex); auditd_reset(); + mutex_unlock(&audit_cmd_mutex); reschedule = 0; } else /* temporary problem (we hope), queue @@ -623,7 +635,9 @@ quick_loop: if (rc) { auditd = 0; if (AUDITD_BAD(rc, reschedule)) { + mutex_lock(&audit_cmd_mutex); auditd_reset(); + mutex_unlock(&audit_cmd_mutex); reschedule = 0; } @@ -1004,17 +1018,22 @@ static int audit_receive_msg(struct sk_buff *skb, struct nlmsghdr *nlh) return -EACCES; } if (audit_pid && new_pid && - audit_replace(requesting_pid) != -ECONNREFUSED) { + (audit_replace(requesting_pid) & (-ECONNREFUSED|-EPERM|-ENOMEM))) { audit_log_config_change("audit_pid", new_pid, audit_pid, 0); return -EEXIST; } if (audit_enabled != AUDIT_OFF) audit_log_config_change("audit_pid", new_pid, audit_pid, 1); - audit_pid = new_pid; - audit_nlk_portid = NETLINK_CB(skb).portid; - audit_sock = skb->sk; - if (!new_pid) + if (new_pid) { + if (audit_sock) + sock_put(audit_sock); + audit_pid = new_pid; + audit_nlk_portid = NETLINK_CB(skb).portid; + sock_hold(skb->sk); + audit_sock = skb->sk; + } else { auditd_reset(); + } wake_up_interruptible(&kauditd_wait); } if (s.mask & AUDIT_STATUS_RATE_LIMIT) { @@ -1283,8 +1302,11 @@ static void __net_exit audit_net_exit(struct net *net) { struct audit_net *aunet = net_generic(net, audit_net_id); struct sock *sock = aunet->nlsk; - if (sock == audit_sock) + if (sock == audit_sock) { + mutex_lock(&audit_cmd_mutex); auditd_reset(); + mutex_unlock(&audit_cmd_mutex); + } RCU_INIT_POINTER(aunet->nlsk, NULL); synchronize_net(); -- 1.7.1 ^ permalink raw reply related [flat|nested] 36+ messages in thread
* Re: [PATCH v2] audit: use proper refcount locking on audit_sock 2016-12-12 10:03 ` [PATCH v2] audit: use proper refcount locking on audit_sock Richard Guy Briggs @ 2016-12-12 17:10 ` Paul Moore 2016-12-13 4:49 ` Richard Guy Briggs 2016-12-12 20:18 ` Paul Moore 2016-12-12 23:58 ` Cong Wang 2 siblings, 1 reply; 36+ messages in thread From: Paul Moore @ 2016-12-12 17:10 UTC (permalink / raw) To: Richard Guy Briggs Cc: netdev, linux-kernel, linux-audit, edumazet, xiyou.wangcong, dvyukov On Mon, Dec 12, 2016 at 5:03 AM, Richard Guy Briggs <rgb@redhat.com> wrote: > Resetting audit_sock appears to be racy. > > audit_sock was being copied and dereferenced without using a refcount on > the source sock. > > Bump the refcount on the underlying sock when we store a refrence in > audit_sock and release it when we reset audit_sock. audit_sock > modification needs the audit_cmd_mutex. > > See: https://lkml.org/lkml/2016/11/26/232 > > Thanks to Eric Dumazet <edumazet@google.com> and Cong Wang > <xiyou.wangcong@gmail.com> on ideas how to fix it. > > Signed-off-by: Richard Guy Briggs <rgb@redhat.com> > --- > There has been a lot of change in the audit code that is about to go > upstream to address audit queue issues. This patch is based on the > source tree: git://git.infradead.org/users/pcmoore/audit#next > --- > kernel/audit.c | 34 ++++++++++++++++++++++++++++------ > 1 files changed, 28 insertions(+), 6 deletions(-) This is coming in pretty late for the v4.10 merge window, much later than I would usually take things, but this is arguably important, and (at first glance) relatively low risk - what testing have you done on this? > diff --git a/kernel/audit.c b/kernel/audit.c > index f20eee0..439f7f3 100644 > --- a/kernel/audit.c > +++ b/kernel/audit.c > @@ -452,7 +452,9 @@ static void auditd_reset(void) > struct sk_buff *skb; > > /* break the connection */ > + sock_put(audit_sock); > audit_pid = 0; > + audit_nlk_portid = 0; > audit_sock = NULL; > > /* flush all of the retry queue to the hold queue */ > @@ -478,6 +480,12 @@ static int kauditd_send_unicast_skb(struct sk_buff *skb) > if (rc >= 0) { > consume_skb(skb); > rc = 0; > + } else { > + if (rc & (-ENOMEM|-EPERM|-ECONNREFUSED)) { > + mutex_lock(&audit_cmd_mutex); > + auditd_reset(); > + mutex_unlock(&audit_cmd_mutex); > + } > } > > return rc; > @@ -579,7 +587,9 @@ static int kauditd_thread(void *dummy) > > auditd = 0; > if (AUDITD_BAD(rc, reschedule)) { > + mutex_lock(&audit_cmd_mutex); > auditd_reset(); > + mutex_unlock(&audit_cmd_mutex); > reschedule = 0; > } > } else > @@ -594,7 +604,9 @@ static int kauditd_thread(void *dummy) > auditd = 0; > if (AUDITD_BAD(rc, reschedule)) { > kauditd_hold_skb(skb); > + mutex_lock(&audit_cmd_mutex); > auditd_reset(); > + mutex_unlock(&audit_cmd_mutex); > reschedule = 0; > } else > /* temporary problem (we hope), queue > @@ -623,7 +635,9 @@ quick_loop: > if (rc) { > auditd = 0; > if (AUDITD_BAD(rc, reschedule)) { > + mutex_lock(&audit_cmd_mutex); > auditd_reset(); > + mutex_unlock(&audit_cmd_mutex); > reschedule = 0; > } > > @@ -1004,17 +1018,22 @@ static int audit_receive_msg(struct sk_buff *skb, struct nlmsghdr *nlh) > return -EACCES; > } > if (audit_pid && new_pid && > - audit_replace(requesting_pid) != -ECONNREFUSED) { > + (audit_replace(requesting_pid) & (-ECONNREFUSED|-EPERM|-ENOMEM))) { > audit_log_config_change("audit_pid", new_pid, audit_pid, 0); > return -EEXIST; > } > if (audit_enabled != AUDIT_OFF) > audit_log_config_change("audit_pid", new_pid, audit_pid, 1); > - audit_pid = new_pid; > - audit_nlk_portid = NETLINK_CB(skb).portid; > - audit_sock = skb->sk; > - if (!new_pid) > + if (new_pid) { > + if (audit_sock) > + sock_put(audit_sock); > + audit_pid = new_pid; > + audit_nlk_portid = NETLINK_CB(skb).portid; > + sock_hold(skb->sk); > + audit_sock = skb->sk; > + } else { > auditd_reset(); > + } > wake_up_interruptible(&kauditd_wait); > } > if (s.mask & AUDIT_STATUS_RATE_LIMIT) { > @@ -1283,8 +1302,11 @@ static void __net_exit audit_net_exit(struct net *net) > { > struct audit_net *aunet = net_generic(net, audit_net_id); > struct sock *sock = aunet->nlsk; > - if (sock == audit_sock) > + if (sock == audit_sock) { > + mutex_lock(&audit_cmd_mutex); > auditd_reset(); > + mutex_unlock(&audit_cmd_mutex); > + } > > RCU_INIT_POINTER(aunet->nlsk, NULL); > synchronize_net(); > -- > 1.7.1 > > -- > Linux-audit mailing list > Linux-audit@redhat.com > https://www.redhat.com/mailman/listinfo/linux-audit -- paul moore www.paul-moore.com ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [PATCH v2] audit: use proper refcount locking on audit_sock 2016-12-12 17:10 ` Paul Moore @ 2016-12-13 4:49 ` Richard Guy Briggs 0 siblings, 0 replies; 36+ messages in thread From: Richard Guy Briggs @ 2016-12-13 4:49 UTC (permalink / raw) To: Paul Moore Cc: netdev, linux-kernel, linux-audit, edumazet, xiyou.wangcong, dvyukov On 2016-12-12 12:10, Paul Moore wrote: > On Mon, Dec 12, 2016 at 5:03 AM, Richard Guy Briggs <rgb@redhat.com> wrote: > > Resetting audit_sock appears to be racy. > > > > audit_sock was being copied and dereferenced without using a refcount on > > the source sock. > > > > Bump the refcount on the underlying sock when we store a refrence in > > audit_sock and release it when we reset audit_sock. audit_sock > > modification needs the audit_cmd_mutex. > > > > See: https://lkml.org/lkml/2016/11/26/232 > > > > Thanks to Eric Dumazet <edumazet@google.com> and Cong Wang > > <xiyou.wangcong@gmail.com> on ideas how to fix it. > > > > Signed-off-by: Richard Guy Briggs <rgb@redhat.com> > > --- > > There has been a lot of change in the audit code that is about to go > > upstream to address audit queue issues. This patch is based on the > > source tree: git://git.infradead.org/users/pcmoore/audit#next > > --- > > kernel/audit.c | 34 ++++++++++++++++++++++++++++------ > > 1 files changed, 28 insertions(+), 6 deletions(-) > > This is coming in pretty late for the v4.10 merge window, much later > than I would usually take things, but this is arguably important, and > (at first glance) relatively low risk - what testing have you done on > this? At this point, compile and boot, and I'm able to compile and run the supplied test code without any issues. > > diff --git a/kernel/audit.c b/kernel/audit.c > > index f20eee0..439f7f3 100644 > > --- a/kernel/audit.c > > +++ b/kernel/audit.c > > @@ -452,7 +452,9 @@ static void auditd_reset(void) > > struct sk_buff *skb; > > > > /* break the connection */ > > + sock_put(audit_sock); > > audit_pid = 0; > > + audit_nlk_portid = 0; > > audit_sock = NULL; > > > > /* flush all of the retry queue to the hold queue */ > > @@ -478,6 +480,12 @@ static int kauditd_send_unicast_skb(struct sk_buff *skb) > > if (rc >= 0) { > > consume_skb(skb); > > rc = 0; > > + } else { > > + if (rc & (-ENOMEM|-EPERM|-ECONNREFUSED)) { > > + mutex_lock(&audit_cmd_mutex); > > + auditd_reset(); > > + mutex_unlock(&audit_cmd_mutex); > > + } > > } > > > > return rc; > > @@ -579,7 +587,9 @@ static int kauditd_thread(void *dummy) > > > > auditd = 0; > > if (AUDITD_BAD(rc, reschedule)) { > > + mutex_lock(&audit_cmd_mutex); > > auditd_reset(); > > + mutex_unlock(&audit_cmd_mutex); > > reschedule = 0; > > } > > } else > > @@ -594,7 +604,9 @@ static int kauditd_thread(void *dummy) > > auditd = 0; > > if (AUDITD_BAD(rc, reschedule)) { > > kauditd_hold_skb(skb); > > + mutex_lock(&audit_cmd_mutex); > > auditd_reset(); > > + mutex_unlock(&audit_cmd_mutex); > > reschedule = 0; > > } else > > /* temporary problem (we hope), queue > > @@ -623,7 +635,9 @@ quick_loop: > > if (rc) { > > auditd = 0; > > if (AUDITD_BAD(rc, reschedule)) { > > + mutex_lock(&audit_cmd_mutex); > > auditd_reset(); > > + mutex_unlock(&audit_cmd_mutex); > > reschedule = 0; > > } > > > > @@ -1004,17 +1018,22 @@ static int audit_receive_msg(struct sk_buff *skb, struct nlmsghdr *nlh) > > return -EACCES; > > } > > if (audit_pid && new_pid && > > - audit_replace(requesting_pid) != -ECONNREFUSED) { > > + (audit_replace(requesting_pid) & (-ECONNREFUSED|-EPERM|-ENOMEM))) { > > audit_log_config_change("audit_pid", new_pid, audit_pid, 0); > > return -EEXIST; > > } > > if (audit_enabled != AUDIT_OFF) > > audit_log_config_change("audit_pid", new_pid, audit_pid, 1); > > - audit_pid = new_pid; > > - audit_nlk_portid = NETLINK_CB(skb).portid; > > - audit_sock = skb->sk; > > - if (!new_pid) > > + if (new_pid) { > > + if (audit_sock) > > + sock_put(audit_sock); > > + audit_pid = new_pid; > > + audit_nlk_portid = NETLINK_CB(skb).portid; > > + sock_hold(skb->sk); > > + audit_sock = skb->sk; > > + } else { > > auditd_reset(); > > + } > > wake_up_interruptible(&kauditd_wait); > > } > > if (s.mask & AUDIT_STATUS_RATE_LIMIT) { > > @@ -1283,8 +1302,11 @@ static void __net_exit audit_net_exit(struct net *net) > > { > > struct audit_net *aunet = net_generic(net, audit_net_id); > > struct sock *sock = aunet->nlsk; > > - if (sock == audit_sock) > > + if (sock == audit_sock) { > > + mutex_lock(&audit_cmd_mutex); > > auditd_reset(); > > + mutex_unlock(&audit_cmd_mutex); > > + } > > > > RCU_INIT_POINTER(aunet->nlsk, NULL); > > synchronize_net(); > > -- > > 1.7.1 > > > > -- > > Linux-audit mailing list > > Linux-audit@redhat.com > > https://www.redhat.com/mailman/listinfo/linux-audit > > > > -- > paul moore > www.paul-moore.com - RGB -- Richard Guy Briggs <rgb@redhat.com> Kernel Security Engineering, Base Operating Systems, Red Hat Remote, Ottawa, Canada Voice: +1.647.777.2635, Internal: (81) 32635 ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [PATCH v2] audit: use proper refcount locking on audit_sock 2016-12-12 10:03 ` [PATCH v2] audit: use proper refcount locking on audit_sock Richard Guy Briggs 2016-12-12 17:10 ` Paul Moore @ 2016-12-12 20:18 ` Paul Moore 2016-12-13 5:10 ` Richard Guy Briggs 2016-12-12 23:58 ` Cong Wang 2 siblings, 1 reply; 36+ messages in thread From: Paul Moore @ 2016-12-12 20:18 UTC (permalink / raw) To: Richard Guy Briggs Cc: netdev, linux-kernel, linux-audit, edumazet, xiyou.wangcong, dvyukov On Mon, Dec 12, 2016 at 5:03 AM, Richard Guy Briggs <rgb@redhat.com> wrote: > Resetting audit_sock appears to be racy. > > audit_sock was being copied and dereferenced without using a refcount on > the source sock. > > Bump the refcount on the underlying sock when we store a refrence in > audit_sock and release it when we reset audit_sock. audit_sock > modification needs the audit_cmd_mutex. > > See: https://lkml.org/lkml/2016/11/26/232 > > Thanks to Eric Dumazet <edumazet@google.com> and Cong Wang > <xiyou.wangcong@gmail.com> on ideas how to fix it. > > Signed-off-by: Richard Guy Briggs <rgb@redhat.com> > --- > There has been a lot of change in the audit code that is about to go > upstream to address audit queue issues. This patch is based on the > source tree: git://git.infradead.org/users/pcmoore/audit#next > --- > kernel/audit.c | 34 ++++++++++++++++++++++++++++------ > 1 files changed, 28 insertions(+), 6 deletions(-) My previous question about testing still stands, but I took a closer look and have some additional comments, see below ... > diff --git a/kernel/audit.c b/kernel/audit.c > index f20eee0..439f7f3 100644 > --- a/kernel/audit.c > +++ b/kernel/audit.c > @@ -452,7 +452,9 @@ static void auditd_reset(void) > struct sk_buff *skb; > > /* break the connection */ > + sock_put(audit_sock); > audit_pid = 0; > + audit_nlk_portid = 0; > audit_sock = NULL; > > /* flush all of the retry queue to the hold queue */ > @@ -478,6 +480,12 @@ static int kauditd_send_unicast_skb(struct sk_buff *skb) > if (rc >= 0) { > consume_skb(skb); > rc = 0; > + } else { > + if (rc & (-ENOMEM|-EPERM|-ECONNREFUSED)) { I dislike the way you wrote this because instead of simply looking at this to see if it correct I need to sort out all the bits and find out if there are other error codes that could run afoul of this check ... make it simple, e.g. (rc == -ENOMEM || rc == -EPERM || ...). Actually, since EPERM is 1, -EPERM (-1 in two's compliment is 0xffffffff) is going to cause this to be true for pretty much any value of rc, yes? > + mutex_lock(&audit_cmd_mutex); > + auditd_reset(); > + mutex_unlock(&audit_cmd_mutex); > + } The code in audit#next handles netlink_unicast() errors in kauditd_thread() and you are adding error handling code here in kauditd_send_unicast_skb() ... that's messy. I don't care too much where the auditd_reset() call is made, but let's only do it in one function; FWIW, I originally put the error handling code in kauditd_thread() because there was other error handling code that needed to done in that scope so it resulted in cleaner code. Related, I see you are now considering ENOMEM to be a fatal condition, that differs from the AUDITD_BAD macro in kauditd_thread(); this difference needs to be reconciled. Finally, you should update the comment header block for auditd_reset() that it needs to be called with the audit_cmd_mutex held. > @@ -1004,17 +1018,22 @@ static int audit_receive_msg(struct sk_buff *skb, struct nlmsghdr *nlh) > return -EACCES; > } > if (audit_pid && new_pid && > - audit_replace(requesting_pid) != -ECONNREFUSED) { > + (audit_replace(requesting_pid) & (-ECONNREFUSED|-EPERM|-ENOMEM))) { Do we simply want to treat any error here as fatal, and not just ECONN/EPERM/ENOMEM? If not, let's come up with a single macro to handle the fatal netlink_unicast() return codes so we have some chance to keep things consistent in the future. -- paul moore www.paul-moore.com ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [PATCH v2] audit: use proper refcount locking on audit_sock 2016-12-12 20:18 ` Paul Moore @ 2016-12-13 5:10 ` Richard Guy Briggs 2016-12-13 15:01 ` Richard Guy Briggs 0 siblings, 1 reply; 36+ messages in thread From: Richard Guy Briggs @ 2016-12-13 5:10 UTC (permalink / raw) To: Paul Moore Cc: netdev, linux-kernel, linux-audit, edumazet, xiyou.wangcong, dvyukov On 2016-12-12 15:18, Paul Moore wrote: > On Mon, Dec 12, 2016 at 5:03 AM, Richard Guy Briggs <rgb@redhat.com> wrote: > > Resetting audit_sock appears to be racy. > > > > audit_sock was being copied and dereferenced without using a refcount on > > the source sock. > > > > Bump the refcount on the underlying sock when we store a refrence in > > audit_sock and release it when we reset audit_sock. audit_sock > > modification needs the audit_cmd_mutex. > > > > See: https://lkml.org/lkml/2016/11/26/232 > > > > Thanks to Eric Dumazet <edumazet@google.com> and Cong Wang > > <xiyou.wangcong@gmail.com> on ideas how to fix it. > > > > Signed-off-by: Richard Guy Briggs <rgb@redhat.com> > > --- > > There has been a lot of change in the audit code that is about to go > > upstream to address audit queue issues. This patch is based on the > > source tree: git://git.infradead.org/users/pcmoore/audit#next > > --- > > kernel/audit.c | 34 ++++++++++++++++++++++++++++------ > > 1 files changed, 28 insertions(+), 6 deletions(-) > > diff --git a/kernel/audit.c b/kernel/audit.c > > index f20eee0..439f7f3 100644 > > --- a/kernel/audit.c > > +++ b/kernel/audit.c > > @@ -452,7 +452,9 @@ static void auditd_reset(void) > > struct sk_buff *skb; > > > > /* break the connection */ > > + sock_put(audit_sock); > > audit_pid = 0; > > + audit_nlk_portid = 0; > > audit_sock = NULL; > > > > /* flush all of the retry queue to the hold queue */ > > @@ -478,6 +480,12 @@ static int kauditd_send_unicast_skb(struct sk_buff *skb) > > if (rc >= 0) { > > consume_skb(skb); > > rc = 0; > > + } else { > > + if (rc & (-ENOMEM|-EPERM|-ECONNREFUSED)) { > > I dislike the way you wrote this because instead of simply looking at > this to see if it correct I need to sort out all the bits and find out > if there are other error codes that could run afoul of this check ... > make it simple, e.g. (rc == -ENOMEM || rc == -EPERM || ...). > Actually, since EPERM is 1, -EPERM (-1 in two's compliment is > 0xffffffff) is going to cause this to be true for pretty much any > value of rc, yes? Yes, you are correct. We need there a logical or on the results of each comparison to the return code rather than bit-wise or-ing the result codes together first to save a step. > > + mutex_lock(&audit_cmd_mutex); > > + auditd_reset(); > > + mutex_unlock(&audit_cmd_mutex); > > + } > > The code in audit#next handles netlink_unicast() errors in > kauditd_thread() and you are adding error handling code here in > kauditd_send_unicast_skb() ... that's messy. I don't care too much > where the auditd_reset() call is made, but let's only do it in one > function; FWIW, I originally put the error handling code in > kauditd_thread() because there was other error handling code that > needed to done in that scope so it resulted in cleaner code. Hmmm, I seem to remember it not returning the return code and I thought I had changed it to do so, but I see now that it was already there. Agreed, I needlessly duplicated that error handling. > Related, I see you are now considering ENOMEM to be a fatal condition, > that differs from the AUDITD_BAD macro in kauditd_thread(); this > difference needs to be reconciled. Also correct about -EPERM now that I check back to the intent of commit 32a1dbaece7e ("audit: try harder to send to auditd upon netlink failure") > Finally, you should update the comment header block for auditd_reset() > that it needs to be called with the audit_cmd_mutex held. Yup. > > @@ -1004,17 +1018,22 @@ static int audit_receive_msg(struct sk_buff *skb, struct nlmsghdr *nlh) > > return -EACCES; > > } > > if (audit_pid && new_pid && > > - audit_replace(requesting_pid) != -ECONNREFUSED) { > > + (audit_replace(requesting_pid) & (-ECONNREFUSED|-EPERM|-ENOMEM))) { > > Do we simply want to treat any error here as fatal, and not just > ECONN/EPERM/ENOMEM? If not, let's come up with a single macro to > handle the fatal netlink_unicast() return codes so we have some chance > to keep things consistent in the future. I'll work through this before I post another patch... > paul moore - RGB -- Richard Guy Briggs <rgb@redhat.com> Kernel Security Engineering, Base Operating Systems, Red Hat Remote, Ottawa, Canada Voice: +1.647.777.2635, Internal: (81) 32635 ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [PATCH v2] audit: use proper refcount locking on audit_sock 2016-12-13 5:10 ` Richard Guy Briggs @ 2016-12-13 15:01 ` Richard Guy Briggs 0 siblings, 0 replies; 36+ messages in thread From: Richard Guy Briggs @ 2016-12-13 15:01 UTC (permalink / raw) To: Paul Moore Cc: netdev, linux-kernel, linux-audit, edumazet, xiyou.wangcong, dvyukov On 2016-12-13 00:10, Richard Guy Briggs wrote: > On 2016-12-12 15:18, Paul Moore wrote: > > On Mon, Dec 12, 2016 at 5:03 AM, Richard Guy Briggs <rgb@redhat.com> wrote: > > > Resetting audit_sock appears to be racy. > > > > > > audit_sock was being copied and dereferenced without using a refcount on > > > the source sock. > > > > > > Bump the refcount on the underlying sock when we store a refrence in > > > audit_sock and release it when we reset audit_sock. audit_sock > > > modification needs the audit_cmd_mutex. > > > > > > See: https://lkml.org/lkml/2016/11/26/232 > > > > > > Thanks to Eric Dumazet <edumazet@google.com> and Cong Wang > > > <xiyou.wangcong@gmail.com> on ideas how to fix it. > > > > > > Signed-off-by: Richard Guy Briggs <rgb@redhat.com> > > > --- > > > There has been a lot of change in the audit code that is about to go > > > upstream to address audit queue issues. This patch is based on the > > > source tree: git://git.infradead.org/users/pcmoore/audit#next > > > --- > > > kernel/audit.c | 34 ++++++++++++++++++++++++++++------ > > > 1 files changed, 28 insertions(+), 6 deletions(-) > > > diff --git a/kernel/audit.c b/kernel/audit.c > > > index f20eee0..439f7f3 100644 > > > --- a/kernel/audit.c > > > +++ b/kernel/audit.c > > > @@ -452,7 +452,9 @@ static void auditd_reset(void) > > > struct sk_buff *skb; > > > > > > /* break the connection */ > > > + sock_put(audit_sock); > > > audit_pid = 0; > > > + audit_nlk_portid = 0; > > > audit_sock = NULL; > > > > > > /* flush all of the retry queue to the hold queue */ > > > @@ -478,6 +480,12 @@ static int kauditd_send_unicast_skb(struct sk_buff *skb) > > > if (rc >= 0) { > > > consume_skb(skb); > > > rc = 0; > > > + } else { > > > + if (rc & (-ENOMEM|-EPERM|-ECONNREFUSED)) { > > > > I dislike the way you wrote this because instead of simply looking at > > this to see if it correct I need to sort out all the bits and find out > > if there are other error codes that could run afoul of this check ... > > make it simple, e.g. (rc == -ENOMEM || rc == -EPERM || ...). > > Actually, since EPERM is 1, -EPERM (-1 in two's compliment is > > 0xffffffff) is going to cause this to be true for pretty much any > > value of rc, yes? > > Yes, you are correct. We need there a logical or on the results of each > comparison to the return code rather than bit-wise or-ing the result > codes together first to save a step. > > > > + mutex_lock(&audit_cmd_mutex); > > > + auditd_reset(); > > > + mutex_unlock(&audit_cmd_mutex); > > > + } > > > > The code in audit#next handles netlink_unicast() errors in > > kauditd_thread() and you are adding error handling code here in > > kauditd_send_unicast_skb() ... that's messy. I don't care too much > > where the auditd_reset() call is made, but let's only do it in one > > function; FWIW, I originally put the error handling code in > > kauditd_thread() because there was other error handling code that > > needed to done in that scope so it resulted in cleaner code. > > Hmmm, I seem to remember it not returning the return code and I thought > I had changed it to do so, but I see now that it was already there. > Agreed, I needlessly duplicated that error handling. > > > Related, I see you are now considering ENOMEM to be a fatal condition, > > that differs from the AUDITD_BAD macro in kauditd_thread(); this > > difference needs to be reconciled. > > Also correct about -EPERM now that I check back to the intent of commit > 32a1dbaece7e ("audit: try harder to send to auditd upon netlink > failure") > > > Finally, you should update the comment header block for auditd_reset() > > that it needs to be called with the audit_cmd_mutex held. > > Yup. > > > > @@ -1004,17 +1018,22 @@ static int audit_receive_msg(struct sk_buff *skb, struct nlmsghdr *nlh) > > > return -EACCES; > > > } > > > if (audit_pid && new_pid && > > > - audit_replace(requesting_pid) != -ECONNREFUSED) { > > > + (audit_replace(requesting_pid) & (-ECONNREFUSED|-EPERM|-ENOMEM))) { > > > > Do we simply want to treat any error here as fatal, and not just > > ECONN/EPERM/ENOMEM? If not, let's come up with a single macro to > > handle the fatal netlink_unicast() return codes so we have some chance > > to keep things consistent in the future. > > I'll work through this before I post another patch... Ok, I've gone back to look at the reasoning in commit 133e1e5acd4a ("audit: stop an old auditd being starved out by a new auditd") which suggests only ECONNREFUSED can cause an audit_pid replace, so I've returned it to its original state. I'll post another tested patch, but I'm still not that happy that it does not proactively reset audit_pid, audit_nlk_portid and audit_sock when auditd's socket has a problem. I'll leave the test run overnight. > > paul moore > > - RGB - RGB -- Richard Guy Briggs <rgb@redhat.com> Kernel Security Engineering, Base Operating Systems, Red Hat Remote, Ottawa, Canada Voice: +1.647.777.2635, Internal: (81) 32635 ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [PATCH v2] audit: use proper refcount locking on audit_sock 2016-12-12 10:03 ` [PATCH v2] audit: use proper refcount locking on audit_sock Richard Guy Briggs 2016-12-12 17:10 ` Paul Moore 2016-12-12 20:18 ` Paul Moore @ 2016-12-12 23:58 ` Cong Wang 2016-12-13 14:55 ` Richard Guy Briggs 2 siblings, 1 reply; 36+ messages in thread From: Cong Wang @ 2016-12-12 23:58 UTC (permalink / raw) To: Richard Guy Briggs Cc: Linux Kernel Network Developers, LKML, linux-audit, Dmitry Vyukov, Eric Dumazet, Eric Paris, Paul Moore, sgrubb On Mon, Dec 12, 2016 at 2:03 AM, Richard Guy Briggs <rgb@redhat.com> wrote: > Resetting audit_sock appears to be racy. > > audit_sock was being copied and dereferenced without using a refcount on > the source sock. > > Bump the refcount on the underlying sock when we store a refrence in > audit_sock and release it when we reset audit_sock. audit_sock > modification needs the audit_cmd_mutex. > > See: https://lkml.org/lkml/2016/11/26/232 > > Thanks to Eric Dumazet <edumazet@google.com> and Cong Wang > <xiyou.wangcong@gmail.com> on ideas how to fix it. > > Signed-off-by: Richard Guy Briggs <rgb@redhat.com> > --- > There has been a lot of change in the audit code that is about to go > upstream to address audit queue issues. This patch is based on the > source tree: git://git.infradead.org/users/pcmoore/audit#next > --- > kernel/audit.c | 34 ++++++++++++++++++++++++++++------ > 1 files changed, 28 insertions(+), 6 deletions(-) > > diff --git a/kernel/audit.c b/kernel/audit.c > index f20eee0..439f7f3 100644 > --- a/kernel/audit.c > +++ b/kernel/audit.c > @@ -452,7 +452,9 @@ static void auditd_reset(void) > struct sk_buff *skb; > > /* break the connection */ > + sock_put(audit_sock); Why audit_sock can't be NULL here? > audit_pid = 0; > + audit_nlk_portid = 0; > audit_sock = NULL; > > /* flush all of the retry queue to the hold queue */ > @@ -478,6 +480,12 @@ static int kauditd_send_unicast_skb(struct sk_buff *skb) > if (rc >= 0) { > consume_skb(skb); > rc = 0; > + } else { > + if (rc & (-ENOMEM|-EPERM|-ECONNREFUSED)) { Are these errno's bits?? > + mutex_lock(&audit_cmd_mutex); > + auditd_reset(); > + mutex_unlock(&audit_cmd_mutex); > + } > } > > return rc; > @@ -579,7 +587,9 @@ static int kauditd_thread(void *dummy) > > auditd = 0; > if (AUDITD_BAD(rc, reschedule)) { > + mutex_lock(&audit_cmd_mutex); > auditd_reset(); > + mutex_unlock(&audit_cmd_mutex); > reschedule = 0; > } > } else > @@ -594,7 +604,9 @@ static int kauditd_thread(void *dummy) > auditd = 0; > if (AUDITD_BAD(rc, reschedule)) { > kauditd_hold_skb(skb); > + mutex_lock(&audit_cmd_mutex); > auditd_reset(); > + mutex_unlock(&audit_cmd_mutex); > reschedule = 0; > } else > /* temporary problem (we hope), queue > @@ -623,7 +635,9 @@ quick_loop: > if (rc) { > auditd = 0; > if (AUDITD_BAD(rc, reschedule)) { > + mutex_lock(&audit_cmd_mutex); > auditd_reset(); > + mutex_unlock(&audit_cmd_mutex); > reschedule = 0; > } > > @@ -1004,17 +1018,22 @@ static int audit_receive_msg(struct sk_buff *skb, struct nlmsghdr *nlh) > return -EACCES; > } > if (audit_pid && new_pid && > - audit_replace(requesting_pid) != -ECONNREFUSED) { > + (audit_replace(requesting_pid) & (-ECONNREFUSED|-EPERM|-ENOMEM))) { > audit_log_config_change("audit_pid", new_pid, audit_pid, 0); > return -EEXIST; > } > if (audit_enabled != AUDIT_OFF) > audit_log_config_change("audit_pid", new_pid, audit_pid, 1); > - audit_pid = new_pid; > - audit_nlk_portid = NETLINK_CB(skb).portid; > - audit_sock = skb->sk; > - if (!new_pid) > + if (new_pid) { > + if (audit_sock) > + sock_put(audit_sock); > + audit_pid = new_pid; > + audit_nlk_portid = NETLINK_CB(skb).portid; > + sock_hold(skb->sk); Why refcnt is still needed here? I need it because I removed the code in net exit code path. > + audit_sock = skb->sk; > + } else { > auditd_reset(); > + } > wake_up_interruptible(&kauditd_wait); > } > if (s.mask & AUDIT_STATUS_RATE_LIMIT) { > @@ -1283,8 +1302,11 @@ static void __net_exit audit_net_exit(struct net *net) > { > struct audit_net *aunet = net_generic(net, audit_net_id); > struct sock *sock = aunet->nlsk; > - if (sock == audit_sock) > + if (sock == audit_sock) { > + mutex_lock(&audit_cmd_mutex); You need to put the if check inside the mutex too. Again, this could be removed if you use refcnt. > auditd_reset(); > + mutex_unlock(&audit_cmd_mutex); > + } > > RCU_INIT_POINTER(aunet->nlsk, NULL); > synchronize_net(); > -- > 1.7.1 > ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [PATCH v2] audit: use proper refcount locking on audit_sock 2016-12-12 23:58 ` Cong Wang @ 2016-12-13 14:55 ` Richard Guy Briggs 0 siblings, 0 replies; 36+ messages in thread From: Richard Guy Briggs @ 2016-12-13 14:55 UTC (permalink / raw) To: Cong Wang Cc: Linux Kernel Network Developers, LKML, linux-audit, Dmitry Vyukov, Eric Dumazet, Eric Paris, Paul Moore, sgrubb On 2016-12-12 15:58, Cong Wang wrote: > On Mon, Dec 12, 2016 at 2:03 AM, Richard Guy Briggs <rgb@redhat.com> wrote: > > Resetting audit_sock appears to be racy. > > > > audit_sock was being copied and dereferenced without using a refcount on > > the source sock. > > > > Bump the refcount on the underlying sock when we store a refrence in > > audit_sock and release it when we reset audit_sock. audit_sock > > modification needs the audit_cmd_mutex. > > > > See: https://lkml.org/lkml/2016/11/26/232 > > > > Thanks to Eric Dumazet <edumazet@google.com> and Cong Wang > > <xiyou.wangcong@gmail.com> on ideas how to fix it. > > > > Signed-off-by: Richard Guy Briggs <rgb@redhat.com> > > --- > > There has been a lot of change in the audit code that is about to go > > upstream to address audit queue issues. This patch is based on the > > source tree: git://git.infradead.org/users/pcmoore/audit#next > > --- > > kernel/audit.c | 34 ++++++++++++++++++++++++++++------ > > 1 files changed, 28 insertions(+), 6 deletions(-) > > > > diff --git a/kernel/audit.c b/kernel/audit.c > > index f20eee0..439f7f3 100644 > > --- a/kernel/audit.c > > +++ b/kernel/audit.c > > @@ -452,7 +452,9 @@ static void auditd_reset(void) > > struct sk_buff *skb; > > > > /* break the connection */ > > + sock_put(audit_sock); > > Why audit_sock can't be NULL here? Fixed. > > audit_pid = 0; > > + audit_nlk_portid = 0; > > audit_sock = NULL; > > > > /* flush all of the retry queue to the hold queue */ > > @@ -478,6 +480,12 @@ static int kauditd_send_unicast_skb(struct sk_buff *skb) > > if (rc >= 0) { > > consume_skb(skb); > > rc = 0; > > + } else { > > + if (rc & (-ENOMEM|-EPERM|-ECONNREFUSED)) { > > Are these errno's bits?? No, I've fixed this silly error. > > + mutex_lock(&audit_cmd_mutex); > > + auditd_reset(); > > + mutex_unlock(&audit_cmd_mutex); > > + } > > } > > > > return rc; > > @@ -579,7 +587,9 @@ static int kauditd_thread(void *dummy) > > > > auditd = 0; > > if (AUDITD_BAD(rc, reschedule)) { > > + mutex_lock(&audit_cmd_mutex); > > auditd_reset(); > > + mutex_unlock(&audit_cmd_mutex); > > reschedule = 0; > > } > > } else > > @@ -594,7 +604,9 @@ static int kauditd_thread(void *dummy) > > auditd = 0; > > if (AUDITD_BAD(rc, reschedule)) { > > kauditd_hold_skb(skb); > > + mutex_lock(&audit_cmd_mutex); > > auditd_reset(); > > + mutex_unlock(&audit_cmd_mutex); > > reschedule = 0; > > } else > > /* temporary problem (we hope), queue > > @@ -623,7 +635,9 @@ quick_loop: > > if (rc) { > > auditd = 0; > > if (AUDITD_BAD(rc, reschedule)) { > > + mutex_lock(&audit_cmd_mutex); > > auditd_reset(); > > + mutex_unlock(&audit_cmd_mutex); > > reschedule = 0; > > } > > > > @@ -1004,17 +1018,22 @@ static int audit_receive_msg(struct sk_buff *skb, struct nlmsghdr *nlh) > > return -EACCES; > > } > > if (audit_pid && new_pid && > > - audit_replace(requesting_pid) != -ECONNREFUSED) { > > + (audit_replace(requesting_pid) & (-ECONNREFUSED|-EPERM|-ENOMEM))) { > > audit_log_config_change("audit_pid", new_pid, audit_pid, 0); > > return -EEXIST; > > } > > if (audit_enabled != AUDIT_OFF) > > audit_log_config_change("audit_pid", new_pid, audit_pid, 1); > > - audit_pid = new_pid; > > - audit_nlk_portid = NETLINK_CB(skb).portid; > > - audit_sock = skb->sk; > > - if (!new_pid) > > + if (new_pid) { > > + if (audit_sock) > > + sock_put(audit_sock); > > + audit_pid = new_pid; > > + audit_nlk_portid = NETLINK_CB(skb).portid; > > + sock_hold(skb->sk); > > Why refcnt is still needed here? I need it because I removed the code > in net exit code path. Because there is a chance that auditd exits abnormally and no message is send from the kauditd thread to discover it has gone. > > + audit_sock = skb->sk; > > + } else { > > auditd_reset(); > > + } > > wake_up_interruptible(&kauditd_wait); > > } > > if (s.mask & AUDIT_STATUS_RATE_LIMIT) { > > @@ -1283,8 +1302,11 @@ static void __net_exit audit_net_exit(struct net *net) > > { > > struct audit_net *aunet = net_generic(net, audit_net_id); > > struct sock *sock = aunet->nlsk; > > - if (sock == audit_sock) > > + if (sock == audit_sock) { > > + mutex_lock(&audit_cmd_mutex); > > You need to put the if check inside the mutex too. Again, this could be > removed if you use refcnt. Ok, right, fixed. That last patch was a bit of a mess! Thanks for your patience in checking it... > > auditd_reset(); > > + mutex_unlock(&audit_cmd_mutex); > > + } > > > > RCU_INIT_POINTER(aunet->nlsk, NULL); > > synchronize_net(); > > -- > > 1.7.1 > > - RGB -- Richard Guy Briggs <rgb@redhat.com> Kernel Security Engineering, Base Operating Systems, Red Hat Remote, Ottawa, Canada Voice: +1.647.777.2635, Internal: (81) 32635 ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: netlink: GPF in sock_sndtimeo 2016-12-12 10:02 ` Richard Guy Briggs 2016-12-12 10:03 ` [PATCH v2] audit: use proper refcount locking on audit_sock Richard Guy Briggs @ 2016-12-13 0:10 ` Cong Wang 2016-12-13 10:52 ` Richard Guy Briggs 2016-12-13 15:03 ` [RFC PATCH v3] audit: use proper refcount locking on audit_sock Richard Guy Briggs 2 siblings, 1 reply; 36+ messages in thread From: Cong Wang @ 2016-12-13 0:10 UTC (permalink / raw) To: Richard Guy Briggs Cc: linux-audit, Paul Moore, Dmitry Vyukov, David Miller, Johannes Berg, Florian Westphal, Eric Dumazet, Herbert Xu, netdev, LKML, syzkaller On Mon, Dec 12, 2016 at 2:02 AM, Richard Guy Briggs <rgb@redhat.com> wrote: > On 2016-12-09 20:13, Cong Wang wrote: >> Netlink notifier can safely be converted to blocking one, I will send >> a patch. > > I had a quick look at how that might happen. The netlink notifier chain > is atomic. Would the registered callback funciton need to spawn a > one-time thread to avoid blocking? It is already non-atomic now: https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=efa172f42836477bf1ac3c9a3053140df764699c > I had a look at your patch. It looks attractively simple. The audit > next tree has patches queued that add an audit_reset function that will > require more work. I still see some potential gaps. > > - If the process messes up (or the sock lookup messes up) it is reset > in the kauditd thread under the audit_cmd_mutex. > > - If the process exits normally or is replaced due to an audit_replace > error, it is reset from audit_receive_skb under the audit_cmd_mutex. > > - If the process dies before the kauditd thread notices, either reap it > via notifier callback or it needs a check on net exit to reset. This > last one appears necessary to decrement the sock refcount so the sock > can be released in netlink_kernel_release(). > > If we want to be proactive and use the netlink notifier, we assume the > overhead of adding to the netlink notifier chain and eliminate all the > other reset calls under the kauditd thread. If we are ok being > reactionary, then we'll at least need the net exit check on audit_sock. > I don't see why we need to check it in net exit if we use refcnt, because we have two different users of audit_sock: kauditd and netns, if both take care of refcnt properly, we don't need to worry about who is the last, no matter what failures occur in what order. ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: netlink: GPF in sock_sndtimeo 2016-12-13 0:10 ` netlink: GPF in sock_sndtimeo Cong Wang @ 2016-12-13 10:52 ` Richard Guy Briggs 2016-12-14 0:17 ` Cong Wang 0 siblings, 1 reply; 36+ messages in thread From: Richard Guy Briggs @ 2016-12-13 10:52 UTC (permalink / raw) To: Cong Wang Cc: linux-audit, Paul Moore, Dmitry Vyukov, David Miller, Johannes Berg, Florian Westphal, Eric Dumazet, Herbert Xu, netdev, LKML, syzkaller On 2016-12-12 16:10, Cong Wang wrote: > On Mon, Dec 12, 2016 at 2:02 AM, Richard Guy Briggs <rgb@redhat.com> wrote: > > On 2016-12-09 20:13, Cong Wang wrote: > >> Netlink notifier can safely be converted to blocking one, I will send > >> a patch. > > > > I had a quick look at how that might happen. The netlink notifier chain > > is atomic. Would the registered callback funciton need to spawn a > > one-time thread to avoid blocking? > > It is already non-atomic now: > https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=efa172f42836477bf1ac3c9a3053140df764699c Ok, that is recent... It is still less attractive as you point out due to the overhead, but still worth considering if we can't find another way. > > I had a look at your patch. It looks attractively simple. The audit > > next tree has patches queued that add an audit_reset function that will > > require more work. I still see some potential gaps. > > > > - If the process messes up (or the sock lookup messes up) it is reset > > in the kauditd thread under the audit_cmd_mutex. > > > > - If the process exits normally or is replaced due to an audit_replace > > error, it is reset from audit_receive_skb under the audit_cmd_mutex. > > > > - If the process dies before the kauditd thread notices, either reap it > > via notifier callback or it needs a check on net exit to reset. This > > last one appears necessary to decrement the sock refcount so the sock > > can be released in netlink_kernel_release(). > > > > If we want to be proactive and use the netlink notifier, we assume the > > overhead of adding to the netlink notifier chain and eliminate all the > > other reset calls under the kauditd thread. If we are ok being > > reactionary, then we'll at least need the net exit check on audit_sock. > > I don't see why we need to check it in net exit if we use refcnt, > because we have two different users of audit_sock: kauditd and > netns, if both take care of refcnt properly, we don't need to worry > about who is the last, no matter what failures occur in what order. It is actually the audit_pid and audit_nlk_portid that I care about more. The audit daemon could vanish or close the socket while the kernel sock to which it was attached is still quite valid. Accessing the set of three atomically is the urge. I wonder if it makes more sense to test for the presence of auditd using audit_sock rather than audit_pid, but still keep audit_pid for our reporting and replacement strategy. Another idea would be to put the three in one struct. Can someone explain how they think the original test was able to trigger this GPF? Network namespace shutdown while something pretended to set up a new auditd? That's impressive for a fuzzer if that's the case... Is there an strace? I guess it is all in test(). - RGB -- Richard Guy Briggs <rgb@redhat.com> Kernel Security Engineering, Base Operating Systems, Red Hat Remote, Ottawa, Canada Voice: +1.647.777.2635, Internal: (81) 32635 ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: netlink: GPF in sock_sndtimeo 2016-12-13 10:52 ` Richard Guy Briggs @ 2016-12-14 0:17 ` Cong Wang 2016-12-14 4:17 ` Richard Guy Briggs 0 siblings, 1 reply; 36+ messages in thread From: Cong Wang @ 2016-12-14 0:17 UTC (permalink / raw) To: Richard Guy Briggs Cc: linux-audit, Paul Moore, Dmitry Vyukov, David Miller, Johannes Berg, Florian Westphal, Eric Dumazet, Herbert Xu, netdev, LKML, syzkaller On Tue, Dec 13, 2016 at 2:52 AM, Richard Guy Briggs <rgb@redhat.com> wrote: > It is actually the audit_pid and audit_nlk_portid that I care about > more. The audit daemon could vanish or close the socket while the > kernel sock to which it was attached is still quite valid. Accessing > the set of three atomically is the urge. I wonder if it makes more > sense to test for the presence of auditd using audit_sock rather than > audit_pid, but still keep audit_pid for our reporting and replacement > strategy. Another idea would be to put the three in one struct. Note, the process has audit_pid should hold a refcnt to the netns too, so the netns can't be gone until that process is gone. > > Can someone explain how they think the original test was able to trigger > this GPF? Network namespace shutdown while something pretended to set > up a new auditd? That's impressive for a fuzzer if that's the case... > Is there an strace? I guess it is all in test(). > I am surprised you still don't get the race condition even when you are now working on v2... The race happens in this scenarios : 1) Create a new netns 2) In the new netns, communicate with kauditd to set audit_sock 3) Generate some audit messages, so kauditd will keep sending them via audit_sock 4) exit the netns 5) the previous audit_sock is now going away, but kaudit_sock could still access it in this small window. ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: netlink: GPF in sock_sndtimeo 2016-12-14 0:17 ` Cong Wang @ 2016-12-14 4:17 ` Richard Guy Briggs 0 siblings, 0 replies; 36+ messages in thread From: Richard Guy Briggs @ 2016-12-14 4:17 UTC (permalink / raw) To: Cong Wang Cc: Herbert Xu, Johannes Berg, netdev, Florian Westphal, LKML, Eric Dumazet, linux-audit, syzkaller, David Miller, Dmitry Vyukov On 2016-12-13 16:17, Cong Wang wrote: > On Tue, Dec 13, 2016 at 2:52 AM, Richard Guy Briggs <rgb@redhat.com> wrote: > > It is actually the audit_pid and audit_nlk_portid that I care about > > more. The audit daemon could vanish or close the socket while the > > kernel sock to which it was attached is still quite valid. Accessing > > the set of three atomically is the urge. I wonder if it makes more > > sense to test for the presence of auditd using audit_sock rather than > > audit_pid, but still keep audit_pid for our reporting and replacement > > strategy. Another idea would be to put the three in one struct. > > Note, the process has audit_pid should hold a refcnt to the netns too, > so the netns can't be gone until that process is gone. I noted that. I did wonder if there might be a problem if all the processes were moved to another netns with the struct sock stuck in the now process-void netns. This is alluded-to in 6f285b19d09f ("audit: Send replies in the proper network namespace."). > > Can someone explain how they think the original test was able to trigger > > this GPF? Network namespace shutdown while something pretended to set > > up a new auditd? That's impressive for a fuzzer if that's the case... > > Is there an strace? I guess it is all in test(). > > I am surprised you still don't get the race condition even when you > are now working on v2... > > The race happens in this scenarios : > > 1) Create a new netns > > 2) In the new netns, communicate with kauditd to set audit_sock > > 3) Generate some audit messages, so kauditd will keep sending them > via audit_sock > > 4) exit the netns > > 5) the previous audit_sock is now going away, but kaudit_sock could still > access it in this small window. Ah ok that fits... - RGB -- Richard Guy Briggs <rgb@redhat.com> Kernel Security Engineering, Base Operating Systems, Red Hat Remote, Ottawa, Canada Voice: +1.647.777.2635, Internal: (81) 32635 ^ permalink raw reply [flat|nested] 36+ messages in thread
* [RFC PATCH v3] audit: use proper refcount locking on audit_sock 2016-12-12 10:02 ` Richard Guy Briggs 2016-12-12 10:03 ` [PATCH v2] audit: use proper refcount locking on audit_sock Richard Guy Briggs 2016-12-13 0:10 ` netlink: GPF in sock_sndtimeo Cong Wang @ 2016-12-13 15:03 ` Richard Guy Briggs 2016-12-13 20:50 ` Paul Moore 2016-12-14 0:19 ` Cong Wang 2 siblings, 2 replies; 36+ messages in thread From: Richard Guy Briggs @ 2016-12-13 15:03 UTC (permalink / raw) To: netdev, linux-kernel, linux-audit Cc: Richard Guy Briggs, dvyukov, xiyou.wangcong, edumazet, eparis, pmoore, sgrubb Resetting audit_sock appears to be racy. audit_sock was being copied and dereferenced without using a refcount on the source sock. Bump the refcount on the underlying sock when we store a refrence in audit_sock and release it when we reset audit_sock. audit_sock modification needs the audit_cmd_mutex. See: https://lkml.org/lkml/2016/11/26/232 Thanks to Eric Dumazet <edumazet@google.com> and Cong Wang <xiyou.wangcong@gmail.com> on ideas how to fix it. Signed-off-by: Richard Guy Briggs <rgb@redhat.com> --- There has been a lot of change in the audit code that is about to go upstream to address audit queue issues. This patch is based on the source tree: git://git.infradead.org/users/pcmoore/audit#next --- kernel/audit.c | 28 +++++++++++++++++++++++----- 1 files changed, 23 insertions(+), 5 deletions(-) diff --git a/kernel/audit.c b/kernel/audit.c index f20eee0..3bb4126 100644 --- a/kernel/audit.c +++ b/kernel/audit.c @@ -446,14 +446,19 @@ static void kauditd_retry_skb(struct sk_buff *skb) * Description: * Break the auditd/kauditd connection and move all the records in the retry * queue into the hold queue in case auditd reconnects. + * The audit_cmd_mutex must be held when calling this function. */ static void auditd_reset(void) { struct sk_buff *skb; /* break the connection */ + if (audit_sock) { + sock_put(audit_sock); + audit_sock = NULL; + } audit_pid = 0; - audit_sock = NULL; + audit_nlk_portid = 0; /* flush all of the retry queue to the hold queue */ while ((skb = skb_dequeue(&audit_retry_queue))) @@ -579,7 +584,9 @@ static int kauditd_thread(void *dummy) auditd = 0; if (AUDITD_BAD(rc, reschedule)) { + mutex_lock(&audit_cmd_mutex); auditd_reset(); + mutex_unlock(&audit_cmd_mutex); reschedule = 0; } } else @@ -594,7 +601,9 @@ static int kauditd_thread(void *dummy) auditd = 0; if (AUDITD_BAD(rc, reschedule)) { kauditd_hold_skb(skb); + mutex_lock(&audit_cmd_mutex); auditd_reset(); + mutex_unlock(&audit_cmd_mutex); reschedule = 0; } else /* temporary problem (we hope), queue @@ -623,7 +632,9 @@ quick_loop: if (rc) { auditd = 0; if (AUDITD_BAD(rc, reschedule)) { + mutex_lock(&audit_cmd_mutex); auditd_reset(); + mutex_unlock(&audit_cmd_mutex); reschedule = 0; } @@ -1010,11 +1021,16 @@ static int audit_receive_msg(struct sk_buff *skb, struct nlmsghdr *nlh) } if (audit_enabled != AUDIT_OFF) audit_log_config_change("audit_pid", new_pid, audit_pid, 1); - audit_pid = new_pid; - audit_nlk_portid = NETLINK_CB(skb).portid; - audit_sock = skb->sk; - if (!new_pid) + if (new_pid) { + if (audit_sock) + sock_put(audit_sock); + audit_pid = new_pid; + audit_nlk_portid = NETLINK_CB(skb).portid; + sock_hold(skb->sk); + audit_sock = skb->sk; + } else { auditd_reset(); + } wake_up_interruptible(&kauditd_wait); } if (s.mask & AUDIT_STATUS_RATE_LIMIT) { @@ -1283,8 +1299,10 @@ static void __net_exit audit_net_exit(struct net *net) { struct audit_net *aunet = net_generic(net, audit_net_id); struct sock *sock = aunet->nlsk; + mutex_lock(&audit_cmd_mutex); if (sock == audit_sock) auditd_reset(); + mutex_unlock(&audit_cmd_mutex); RCU_INIT_POINTER(aunet->nlsk, NULL); synchronize_net(); -- 1.7.1 ^ permalink raw reply related [flat|nested] 36+ messages in thread
* Re: [RFC PATCH v3] audit: use proper refcount locking on audit_sock 2016-12-13 15:03 ` [RFC PATCH v3] audit: use proper refcount locking on audit_sock Richard Guy Briggs @ 2016-12-13 20:50 ` Paul Moore 2016-12-14 0:19 ` Cong Wang 1 sibling, 0 replies; 36+ messages in thread From: Paul Moore @ 2016-12-13 20:50 UTC (permalink / raw) To: Richard Guy Briggs Cc: netdev, linux-kernel, linux-audit, edumazet, xiyou.wangcong, dvyukov On Tue, Dec 13, 2016 at 10:03 AM, Richard Guy Briggs <rgb@redhat.com> wrote: > Resetting audit_sock appears to be racy. > > audit_sock was being copied and dereferenced without using a refcount on > the source sock. > > Bump the refcount on the underlying sock when we store a refrence in > audit_sock and release it when we reset audit_sock. audit_sock > modification needs the audit_cmd_mutex. > > See: https://lkml.org/lkml/2016/11/26/232 > > Thanks to Eric Dumazet <edumazet@google.com> and Cong Wang > <xiyou.wangcong@gmail.com> on ideas how to fix it. > > Signed-off-by: Richard Guy Briggs <rgb@redhat.com> > --- > There has been a lot of change in the audit code that is about to go > upstream to address audit queue issues. This patch is based on the > source tree: git://git.infradead.org/users/pcmoore/audit#next > --- > kernel/audit.c | 28 +++++++++++++++++++++++----- > 1 files changed, 23 insertions(+), 5 deletions(-) This looks more reasonable. I still wonder about synchronization between threads changing the audit_* connection variables and the kauditd_thread, but I guess we can treat that as another issue; this patch fixes a bug and is worth merging now. I'm building a test kernel right now, assuming nothing blows up I'll push this patch with the rest of the audit patches tomorrow; if something bad happens, this is going to miss the first audit pull request. > diff --git a/kernel/audit.c b/kernel/audit.c > index f20eee0..3bb4126 100644 > --- a/kernel/audit.c > +++ b/kernel/audit.c > @@ -446,14 +446,19 @@ static void kauditd_retry_skb(struct sk_buff *skb) > * Description: > * Break the auditd/kauditd connection and move all the records in the retry > * queue into the hold queue in case auditd reconnects. > + * The audit_cmd_mutex must be held when calling this function. > */ Don't resend, but in the future please start comments like this on the previous line. ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [RFC PATCH v3] audit: use proper refcount locking on audit_sock 2016-12-13 15:03 ` [RFC PATCH v3] audit: use proper refcount locking on audit_sock Richard Guy Briggs 2016-12-13 20:50 ` Paul Moore @ 2016-12-14 0:19 ` Cong Wang 2016-12-14 4:00 ` Richard Guy Briggs 1 sibling, 1 reply; 36+ messages in thread From: Cong Wang @ 2016-12-14 0:19 UTC (permalink / raw) To: Richard Guy Briggs Cc: Linux Kernel Network Developers, LKML, linux-audit, Dmitry Vyukov, Eric Dumazet, Eric Paris, Paul Moore, sgrubb On Tue, Dec 13, 2016 at 7:03 AM, Richard Guy Briggs <rgb@redhat.com> wrote: > @@ -1283,8 +1299,10 @@ static void __net_exit audit_net_exit(struct net *net) > { > struct audit_net *aunet = net_generic(net, audit_net_id); > struct sock *sock = aunet->nlsk; > + mutex_lock(&audit_cmd_mutex); > if (sock == audit_sock) > auditd_reset(); > + mutex_unlock(&audit_cmd_mutex); This still doesn't look correct to me, b/c here we release the audit_sock refcnt twice: 1) inside audit_reset() 2) netlink_kernel_release() ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [RFC PATCH v3] audit: use proper refcount locking on audit_sock 2016-12-14 0:19 ` Cong Wang @ 2016-12-14 4:00 ` Richard Guy Briggs 2016-12-14 5:36 ` Cong Wang 0 siblings, 1 reply; 36+ messages in thread From: Richard Guy Briggs @ 2016-12-14 4:00 UTC (permalink / raw) To: Cong Wang Cc: Linux Kernel Network Developers, LKML, Eric Dumazet, linux-audit, Dmitry Vyukov On 2016-12-13 16:19, Cong Wang wrote: > On Tue, Dec 13, 2016 at 7:03 AM, Richard Guy Briggs <rgb@redhat.com> wrote: > > @@ -1283,8 +1299,10 @@ static void __net_exit audit_net_exit(struct net *net) > > { > > struct audit_net *aunet = net_generic(net, audit_net_id); > > struct sock *sock = aunet->nlsk; > > + mutex_lock(&audit_cmd_mutex); > > if (sock == audit_sock) > > auditd_reset(); > > + mutex_unlock(&audit_cmd_mutex); > > This still doesn't look correct to me, b/c here we release the audit_sock > refcnt twice: > > 1) inside audit_reset() The audit_reset() refcount decrement corresponds to a setting of audit_sock only if audit_sock is still non-NULL. > 2) netlink_kernel_release() This refcount decrement corresponds to netlink_kernel_create(). - RGB -- Richard Guy Briggs <rgb@redhat.com> Kernel Security Engineering, Base Operating Systems, Red Hat Remote, Ottawa, Canada Voice: +1.647.777.2635, Internal: (81) 32635 ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [RFC PATCH v3] audit: use proper refcount locking on audit_sock 2016-12-14 4:00 ` Richard Guy Briggs @ 2016-12-14 5:36 ` Cong Wang 0 siblings, 0 replies; 36+ messages in thread From: Cong Wang @ 2016-12-14 5:36 UTC (permalink / raw) To: Richard Guy Briggs Cc: Linux Kernel Network Developers, LKML, Eric Dumazet, linux-audit, Dmitry Vyukov On Tue, Dec 13, 2016 at 8:00 PM, Richard Guy Briggs <rgb@redhat.com> wrote: > On 2016-12-13 16:19, Cong Wang wrote: >> On Tue, Dec 13, 2016 at 7:03 AM, Richard Guy Briggs <rgb@redhat.com> wrote: >> > @@ -1283,8 +1299,10 @@ static void __net_exit audit_net_exit(struct net *net) >> > { >> > struct audit_net *aunet = net_generic(net, audit_net_id); >> > struct sock *sock = aunet->nlsk; >> > + mutex_lock(&audit_cmd_mutex); >> > if (sock == audit_sock) >> > auditd_reset(); >> > + mutex_unlock(&audit_cmd_mutex); >> >> This still doesn't look correct to me, b/c here we release the audit_sock >> refcnt twice: >> >> 1) inside audit_reset() > > The audit_reset() refcount decrement corresponds to a setting of > audit_sock only if audit_sock is still non-NULL. > Hmm, thinking about it again, looks like the sock == audit_sock and audit_sock != NULL checks can guarantee we are safe. So, Reviewed-by: Cong Wang <xiyou.wangcong@gmail.com> ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: netlink: GPF in sock_sndtimeo 2016-12-09 6:02 ` Richard Guy Briggs 2016-12-09 6:57 ` Cong Wang @ 2016-12-09 10:49 ` Dmitry Vyukov 2016-12-09 11:48 ` Richard Guy Briggs 1 sibling, 1 reply; 36+ messages in thread From: Dmitry Vyukov @ 2016-12-09 10:49 UTC (permalink / raw) To: Richard Guy Briggs Cc: Cong Wang, linux-audit, pmoore, David Miller, Johannes Berg, Florian Westphal, Eric Dumazet, Herbert Xu, netdev, LKML, syzkaller On Fri, Dec 9, 2016 at 7:02 AM, Richard Guy Briggs <rgb@redhat.com> wrote: > On 2016-11-29 23:52, Richard Guy Briggs wrote: >> On 2016-11-29 15:13, Cong Wang wrote: >> > On Tue, Nov 29, 2016 at 8:48 AM, Richard Guy Briggs <rgb@redhat.com> wrote: >> > > On 2016-11-26 17:11, Cong Wang wrote: >> > >> It is racy on audit_sock, especially on the netns exit path. >> > > >> > > I think that is the only place it is racy. The other places audit_sock >> > > is set is when the socket failure has just triggered a reset. >> > > >> > > Is there a notifier callback for failed or reaped sockets? >> > >> > Is NETLINK_URELEASE event what you are looking for? >> >> Possibly, yes. Thanks, I'll have a look. > > I tried a quick compile attempt on the test case (I assume it is a > socket fuzzer) and get the following compile error: > cc -g -O0 -Wall -D_GNU_SOURCE -o socket_fuzz socket_fuzz.c > socket_fuzz.c:16:1: warning: "_GNU_SOURCE" redefined > <command-line>: warning: this is the location of the previous definition > socket_fuzz.c: In function ‘segv_handler’: > socket_fuzz.c:89: warning: implicit declaration of function ‘__atomic_load_n’ > socket_fuzz.c:89: error: ‘__ATOMIC_RELAXED’ undeclared (first use in this function) > socket_fuzz.c:89: error: (Each undeclared identifier is reported only once > socket_fuzz.c:89: error: for each function it appears in.) > socket_fuzz.c: In function ‘loop’: > socket_fuzz.c:280: warning: unused variable ‘errno0’ > socket_fuzz.c: In function ‘test’: > socket_fuzz.c:303: warning: implicit declaration of function ‘__atomic_fetch_add’ > socket_fuzz.c:303: error: ‘__ATOMIC_SEQ_CST’ undeclared (first use in this function) > socket_fuzz.c:303: warning: implicit declaration of function ‘__atomic_fetch_sub’ -std=gnu99 should help ignore warnings > I also tried to extend Cong Wang's idea to attempt to proactively respond to a > NETLINK_URELEASE on the audit_sock and reset it, but ran into a locking error > stack dump using mutex_lock(&audit_cmd_mutex) in the notifier callback. > Eliminating the lock since the sock is dead anways eliminates the error. > > Is it safe? I'll resubmit if this looks remotely sane. Meanwhile I'll try to > get the test case to compile. > > This is being tracked as https://github.com/linux-audit/audit-kernel/issues/30 > > Subject: [PATCH] audit: proactively reset audit_sock on matching NETLINK_URELEASE > > diff --git a/kernel/audit.c b/kernel/audit.c > index f1ca116..91d222d 100644 > --- a/kernel/audit.c > +++ b/kernel/audit.c > @@ -423,6 +423,7 @@ static void kauditd_send_skb(struct sk_buff *skb) > snprintf(s, sizeof(s), "audit_pid=%d reset", audit_pid); > audit_log_lost(s); > audit_pid = 0; > + audit_nlk_portid = 0; > audit_sock = NULL; > } else { > pr_warn("re-scheduling(#%d) write to audit_pid=%d\n", > @@ -1143,6 +1144,28 @@ static int audit_bind(struct net *net, int group) > return 0; > } > > +static int audit_sock_netlink_notify(struct notifier_block *nb, > + unsigned long event, > + void *_notify) > +{ > + struct netlink_notify *notify = _notify; > + struct audit_net *aunet = net_generic(notify->net, audit_net_id); > + > + if (event == NETLINK_URELEASE && notify->protocol == NETLINK_AUDIT) { > + if (audit_nlk_portid == notify->portid && > + audit_sock == aunet->nlsk) { > + audit_pid = 0; > + audit_nlk_portid = 0; > + audit_sock = NULL; > + } > + } > + return NOTIFY_DONE; > +} > + > +static struct notifier_block audit_netlink_notifier = { > + .notifier_call = audit_sock_netlink_notify, > +}; > + > static int __net_init audit_net_init(struct net *net) > { > struct netlink_kernel_cfg cfg = { > @@ -1167,10 +1190,14 @@ static void __net_exit audit_net_exit(struct net *net) > { > struct audit_net *aunet = net_generic(net, audit_net_id); > struct sock *sock = aunet->nlsk; > + > + mutex_lock(&audit_cmd_mutex); > if (sock == audit_sock) { > audit_pid = 0; > + audit_nlk_portid = 0; > audit_sock = NULL; > } > + mutex_unlock(&audit_cmd_mutex); > > RCU_INIT_POINTER(aunet->nlsk, NULL); > synchronize_net(); > @@ -1202,6 +1229,7 @@ static int __init audit_init(void) > audit_enabled = audit_default; > audit_ever_enabled |= !!audit_default; > > + netlink_register_notifier(&audit_netlink_notifier); > audit_log(NULL, GFP_KERNEL, AUDIT_KERNEL, "initialized"); > > for (i = 0; i < AUDIT_INODE_BUCKETS; i++) > -- > 1.7.1 > > >> - RGB > > - RGB > > -- > Richard Guy Briggs <rgb@redhat.com> > Kernel Security Engineering, Base Operating Systems, Red Hat > Remote, Ottawa, Canada > Voice: +1.647.777.2635, Internal: (81) 32635 ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: netlink: GPF in sock_sndtimeo 2016-12-09 10:49 ` netlink: GPF in sock_sndtimeo Dmitry Vyukov @ 2016-12-09 11:48 ` Richard Guy Briggs 2016-12-09 11:53 ` Dmitry Vyukov 0 siblings, 1 reply; 36+ messages in thread From: Richard Guy Briggs @ 2016-12-09 11:48 UTC (permalink / raw) To: Dmitry Vyukov; +Cc: netdev, LKML On 2016-12-09 11:49, Dmitry Vyukov wrote: > On Fri, Dec 9, 2016 at 7:02 AM, Richard Guy Briggs <rgb@redhat.com> wrote: > > On 2016-11-29 23:52, Richard Guy Briggs wrote: > > I tried a quick compile attempt on the test case (I assume it is a > > socket fuzzer) and get the following compile error: > > cc -g -O0 -Wall -D_GNU_SOURCE -o socket_fuzz socket_fuzz.c > > socket_fuzz.c:16:1: warning: "_GNU_SOURCE" redefined > > <command-line>: warning: this is the location of the previous definition > > socket_fuzz.c: In function ‘segv_handler’: > > socket_fuzz.c:89: warning: implicit declaration of function ‘__atomic_load_n’ > > socket_fuzz.c:89: error: ‘__ATOMIC_RELAXED’ undeclared (first use in this function) > > socket_fuzz.c:89: error: (Each undeclared identifier is reported only once > > socket_fuzz.c:89: error: for each function it appears in.) > > socket_fuzz.c: In function ‘loop’: > > socket_fuzz.c:280: warning: unused variable ‘errno0’ > > socket_fuzz.c: In function ‘test’: > > socket_fuzz.c:303: warning: implicit declaration of function ‘__atomic_fetch_add’ > > socket_fuzz.c:303: error: ‘__ATOMIC_SEQ_CST’ undeclared (first use in this function) > > socket_fuzz.c:303: warning: implicit declaration of function ‘__atomic_fetch_sub’ > > -std=gnu99 should help > ignore warnings I got a little further, left with "__ATOMIC_RELAXED undeclared", "__ATOMIC_SEQ_CST undeclared" under gcc 4.4.7-16. gcc 4.8.2-15 leaves me with "undefined reference to `clock_gettime'" What compiler version do you recommend? > >> - RGB > > > > - RGB - RGB -- Richard Guy Briggs <rgb@redhat.com> Kernel Security Engineering, Base Operating Systems, Red Hat Remote, Ottawa, Canada Voice: +1.647.777.2635, Internal: (81) 32635 ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: netlink: GPF in sock_sndtimeo 2016-12-09 11:48 ` Richard Guy Briggs @ 2016-12-09 11:53 ` Dmitry Vyukov 2016-12-09 12:12 ` Richard Guy Briggs 0 siblings, 1 reply; 36+ messages in thread From: Dmitry Vyukov @ 2016-12-09 11:53 UTC (permalink / raw) To: Richard Guy Briggs; +Cc: netdev, LKML On Fri, Dec 9, 2016 at 12:48 PM, Richard Guy Briggs <rgb@redhat.com> wrote: > On 2016-12-09 11:49, Dmitry Vyukov wrote: >> On Fri, Dec 9, 2016 at 7:02 AM, Richard Guy Briggs <rgb@redhat.com> wrote: >> > On 2016-11-29 23:52, Richard Guy Briggs wrote: >> > I tried a quick compile attempt on the test case (I assume it is a >> > socket fuzzer) and get the following compile error: >> > cc -g -O0 -Wall -D_GNU_SOURCE -o socket_fuzz socket_fuzz.c >> > socket_fuzz.c:16:1: warning: "_GNU_SOURCE" redefined >> > <command-line>: warning: this is the location of the previous definition >> > socket_fuzz.c: In function ‘segv_handler’: >> > socket_fuzz.c:89: warning: implicit declaration of function ‘__atomic_load_n’ >> > socket_fuzz.c:89: error: ‘__ATOMIC_RELAXED’ undeclared (first use in this function) >> > socket_fuzz.c:89: error: (Each undeclared identifier is reported only once >> > socket_fuzz.c:89: error: for each function it appears in.) >> > socket_fuzz.c: In function ‘loop’: >> > socket_fuzz.c:280: warning: unused variable ‘errno0’ >> > socket_fuzz.c: In function ‘test’: >> > socket_fuzz.c:303: warning: implicit declaration of function ‘__atomic_fetch_add’ >> > socket_fuzz.c:303: error: ‘__ATOMIC_SEQ_CST’ undeclared (first use in this function) >> > socket_fuzz.c:303: warning: implicit declaration of function ‘__atomic_fetch_sub’ >> >> -std=gnu99 should help >> ignore warnings > > I got a little further, left with "__ATOMIC_RELAXED undeclared", "__ATOMIC_SEQ_CST > undeclared" under gcc 4.4.7-16. > > gcc 4.8.2-15 leaves me with "undefined reference to `clock_gettime'" add -lrt > What compiler version do you recommend? 6.x sounds reasonable 4.4 branch is 7.5 years old, surprised that it does not disintegrate into dust yet :) >> >> - RGB >> > >> > - RGB > > - RGB > > -- > Richard Guy Briggs <rgb@redhat.com> > Kernel Security Engineering, Base Operating Systems, Red Hat > Remote, Ottawa, Canada > Voice: +1.647.777.2635, Internal: (81) 32635 ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: netlink: GPF in sock_sndtimeo 2016-12-09 11:53 ` Dmitry Vyukov @ 2016-12-09 12:12 ` Richard Guy Briggs 0 siblings, 0 replies; 36+ messages in thread From: Richard Guy Briggs @ 2016-12-09 12:12 UTC (permalink / raw) To: Dmitry Vyukov; +Cc: netdev, LKML, linux-audit On 2016-12-09 12:53, Dmitry Vyukov wrote: > On Fri, Dec 9, 2016 at 12:48 PM, Richard Guy Briggs <rgb@redhat.com> wrote: > > On 2016-12-09 11:49, Dmitry Vyukov wrote: > >> On Fri, Dec 9, 2016 at 7:02 AM, Richard Guy Briggs <rgb@redhat.com> wrote: > >> > On 2016-11-29 23:52, Richard Guy Briggs wrote: > >> > I tried a quick compile attempt on the test case (I assume it is a > >> > socket fuzzer) and get the following compile error: > >> > cc -g -O0 -Wall -D_GNU_SOURCE -o socket_fuzz socket_fuzz.c > >> > socket_fuzz.c:16:1: warning: "_GNU_SOURCE" redefined > >> > <command-line>: warning: this is the location of the previous definition > >> > socket_fuzz.c: In function ‘segv_handler’: > >> > socket_fuzz.c:89: warning: implicit declaration of function ‘__atomic_load_n’ > >> > socket_fuzz.c:89: error: ‘__ATOMIC_RELAXED’ undeclared (first use in this function) > >> > socket_fuzz.c:89: error: (Each undeclared identifier is reported only once > >> > socket_fuzz.c:89: error: for each function it appears in.) > >> > socket_fuzz.c: In function ‘loop’: > >> > socket_fuzz.c:280: warning: unused variable ‘errno0’ > >> > socket_fuzz.c: In function ‘test’: > >> > socket_fuzz.c:303: warning: implicit declaration of function ‘__atomic_fetch_add’ > >> > socket_fuzz.c:303: error: ‘__ATOMIC_SEQ_CST’ undeclared (first use in this function) > >> > socket_fuzz.c:303: warning: implicit declaration of function ‘__atomic_fetch_sub’ > >> > >> -std=gnu99 should help > >> ignore warnings > > > > I got a little further, left with "__ATOMIC_RELAXED undeclared", "__ATOMIC_SEQ_CST > > undeclared" under gcc 4.4.7-16. > > > > gcc 4.8.2-15 leaves me with "undefined reference to `clock_gettime'" > > add -lrt Ok, that helped. Thanks! > > What compiler version do you recommend? > > 6.x sounds reasonable > 4.4 branch is 7.5 years old, surprised that it does not disintegrate > into dust yet :) These are under RHEL6... so there are updates to them, but yeah, they are old. > >> >> - RGB > >> > > >> > - RGB > > > > - RGB > > > > -- > > Richard Guy Briggs <rgb@redhat.com> > > Kernel Security Engineering, Base Operating Systems, Red Hat > > Remote, Ottawa, Canada > > Voice: +1.647.777.2635, Internal: (81) 32635 - RGB -- Richard Guy Briggs <rgb@redhat.com> Kernel Security Engineering, Base Operating Systems, Red Hat Remote, Ottawa, Canada Voice: +1.647.777.2635, Internal: (81) 32635 ^ permalink raw reply [flat|nested] 36+ messages in thread
* netlink: GPF in sock_sndtimeo @ 2016-11-26 15:44 Dmitry Vyukov 2016-11-26 16:17 ` Eric Dumazet 2016-11-27 1:11 ` Cong Wang 0 siblings, 2 replies; 36+ messages in thread From: Dmitry Vyukov @ 2016-11-26 15:44 UTC (permalink / raw) To: David Miller, Johannes Berg, Florian Westphal, Cong Wang, Eric Dumazet, Herbert Xu, netdev, LKML Cc: syzkaller Hello, The following program triggers GPF in sock_sndtimeo: https://gist.githubusercontent.com/dvyukov/c19cadd309791cf5cb9b2bf936d3f48d/raw/1743ba0211079a5465d039512b427bc6b59b1a76/gistfile1.txt On commit 16ae16c6e5616c084168740990fc508bda6655d4 (Nov 24). general protection fault: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN Dumping ftrace buffer: (ftrace buffer empty) Modules linked in: CPU: 1 PID: 19950 Comm: syz-executor Not tainted 4.9.0-rc5+ #54 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011 task: ffff88002a0d0840 task.stack: ffff880036920000 RIP: 0010:[<ffffffff86cb35e1>] [< inline >] sock_sndtimeo include/net/sock.h:2075 RIP: 0010:[<ffffffff86cb35e1>] [<ffffffff86cb35e1>] netlink_unicast+0xe1/0x730 net/netlink/af_netlink.c:1232 RSP: 0018:ffff880036926f68 EFLAGS: 00010202 RAX: 0000000000000068 RBX: ffff880036927000 RCX: ffffc900021d0000 RDX: 0000000000000d63 RSI: 00000000024000c0 RDI: 0000000000000340 RBP: ffff880036927028 R08: ffffed0006ea7aab R09: ffffed0006ea7aab R10: 0000000000000001 R11: ffffed0006ea7aaa R12: dffffc0000000000 R13: 0000000000000000 R14: ffff880035de3400 R15: ffff880035de3400 FS: 00007f90a2fc7700(0000) GS:ffff88003ed00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00000000006de0c0 CR3: 0000000035de6000 CR4: 00000000000006e0 Stack: ffff880035de3400 ffffffff819f02a1 1ffff10006d24df4 0000000000000004 00004db400000014 ffff880036926fd8 ffffffff00000000 0000000041b58ab3 ffffffff89653c11 ffffffff86cb3500 ffffffff819f0345 ffff880035de3400 Call Trace: [< inline >] audit_replace kernel/audit.c:817 [<ffffffff816c34b9>] audit_receive_msg+0x22c9/0x2ce0 kernel/audit.c:894 [< inline >] audit_receive_skb kernel/audit.c:1120 [<ffffffff816c40ac>] audit_receive+0x1dc/0x360 kernel/audit.c:1133 [< inline >] netlink_unicast_kernel net/netlink/af_netlink.c:1214 [<ffffffff86cb3a14>] netlink_unicast+0x514/0x730 net/netlink/af_netlink.c:1240 [<ffffffff86cb46d4>] netlink_sendmsg+0xaa4/0xe50 net/netlink/af_netlink.c:1786 [< inline >] sock_sendmsg_nosec net/socket.c:621 [<ffffffff86a6d54f>] sock_sendmsg+0xcf/0x110 net/socket.c:631 [<ffffffff86a6d8bb>] sock_write_iter+0x32b/0x620 net/socket.c:829 [< inline >] new_sync_write fs/read_write.c:499 [<ffffffff81a6f24e>] __vfs_write+0x4fe/0x830 fs/read_write.c:512 [<ffffffff81a70cf5>] vfs_write+0x175/0x4e0 fs/read_write.c:560 [< inline >] SYSC_write fs/read_write.c:607 [<ffffffff81a75180>] SyS_write+0x100/0x240 fs/read_write.c:599 [<ffffffff81009a24>] do_syscall_64+0x2f4/0x940 arch/x86/entry/common.c:280 [<ffffffff88149e8d>] entry_SYSCALL64_slow_path+0x25/0x25 Code: fe 4c 89 f7 e8 31 16 ff ff 8b 8d 70 ff ff ff 49 89 c7 31 c0 85 c9 75 25 e8 7d 4a a3 fa 49 8d bd 40 03 00 00 48 89 f8 48 c1 e8 03 <42> 80 3c 20 00 0f 85 3a 06 00 00 49 8b 85 40 03 00 00 4c 8d 73 RIP [< inline >] sock_sndtimeo include/net/sock.h:2075 RIP [<ffffffff86cb35e1>] netlink_unicast+0xe1/0x730 net/netlink/af_netlink.c:1232 RSP <ffff880036926f68> ---[ end trace 8383a15fba6fdc59 ]--- ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: netlink: GPF in sock_sndtimeo 2016-11-26 15:44 Dmitry Vyukov @ 2016-11-26 16:17 ` Eric Dumazet 2016-11-27 1:11 ` Cong Wang 1 sibling, 0 replies; 36+ messages in thread From: Eric Dumazet @ 2016-11-26 16:17 UTC (permalink / raw) To: Dmitry Vyukov, Richard Guy Briggs Cc: David Miller, Johannes Berg, Florian Westphal, Cong Wang, Herbert Xu, netdev, LKML, syzkaller CC Richard Guy Briggs <rgb@redhat.com> On Sat, Nov 26, 2016 at 7:44 AM, Dmitry Vyukov <dvyukov@google.com> wrote: > Hello, > > The following program triggers GPF in sock_sndtimeo: > https://gist.githubusercontent.com/dvyukov/c19cadd309791cf5cb9b2bf936d3f48d/raw/1743ba0211079a5465d039512b427bc6b59b1a76/gistfile1.txt > > On commit 16ae16c6e5616c084168740990fc508bda6655d4 (Nov 24). > > general protection fault: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN > Dumping ftrace buffer: > (ftrace buffer empty) > Modules linked in: > CPU: 1 PID: 19950 Comm: syz-executor Not tainted 4.9.0-rc5+ #54 > Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011 > task: ffff88002a0d0840 task.stack: ffff880036920000 > RIP: 0010:[<ffffffff86cb35e1>] [< inline >] sock_sndtimeo > include/net/sock.h:2075 > RIP: 0010:[<ffffffff86cb35e1>] [<ffffffff86cb35e1>] > netlink_unicast+0xe1/0x730 net/netlink/af_netlink.c:1232 > RSP: 0018:ffff880036926f68 EFLAGS: 00010202 > RAX: 0000000000000068 RBX: ffff880036927000 RCX: ffffc900021d0000 > RDX: 0000000000000d63 RSI: 00000000024000c0 RDI: 0000000000000340 > RBP: ffff880036927028 R08: ffffed0006ea7aab R09: ffffed0006ea7aab > R10: 0000000000000001 R11: ffffed0006ea7aaa R12: dffffc0000000000 > R13: 0000000000000000 R14: ffff880035de3400 R15: ffff880035de3400 > FS: 00007f90a2fc7700(0000) GS:ffff88003ed00000(0000) knlGS:0000000000000000 > CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > CR2: 00000000006de0c0 CR3: 0000000035de6000 CR4: 00000000000006e0 > Stack: > ffff880035de3400 ffffffff819f02a1 1ffff10006d24df4 0000000000000004 > 00004db400000014 ffff880036926fd8 ffffffff00000000 0000000041b58ab3 > ffffffff89653c11 ffffffff86cb3500 ffffffff819f0345 ffff880035de3400 > Call Trace: > [< inline >] audit_replace kernel/audit.c:817 > [<ffffffff816c34b9>] audit_receive_msg+0x22c9/0x2ce0 kernel/audit.c:894 > [< inline >] audit_receive_skb kernel/audit.c:1120 > [<ffffffff816c40ac>] audit_receive+0x1dc/0x360 kernel/audit.c:1133 > [< inline >] netlink_unicast_kernel net/netlink/af_netlink.c:1214 > [<ffffffff86cb3a14>] netlink_unicast+0x514/0x730 net/netlink/af_netlink.c:1240 > [<ffffffff86cb46d4>] netlink_sendmsg+0xaa4/0xe50 net/netlink/af_netlink.c:1786 > [< inline >] sock_sendmsg_nosec net/socket.c:621 > [<ffffffff86a6d54f>] sock_sendmsg+0xcf/0x110 net/socket.c:631 > [<ffffffff86a6d8bb>] sock_write_iter+0x32b/0x620 net/socket.c:829 > [< inline >] new_sync_write fs/read_write.c:499 > [<ffffffff81a6f24e>] __vfs_write+0x4fe/0x830 fs/read_write.c:512 > [<ffffffff81a70cf5>] vfs_write+0x175/0x4e0 fs/read_write.c:560 > [< inline >] SYSC_write fs/read_write.c:607 > [<ffffffff81a75180>] SyS_write+0x100/0x240 fs/read_write.c:599 > [<ffffffff81009a24>] do_syscall_64+0x2f4/0x940 arch/x86/entry/common.c:280 > [<ffffffff88149e8d>] entry_SYSCALL64_slow_path+0x25/0x25 > Code: fe 4c 89 f7 e8 31 16 ff ff 8b 8d 70 ff ff ff 49 89 c7 31 c0 85 > c9 75 25 e8 7d 4a a3 fa 49 8d bd 40 03 00 00 48 89 f8 48 c1 e8 03 <42> > 80 3c 20 00 0f 85 3a 06 00 00 49 8b 85 40 03 00 00 4c 8d 73 > RIP [< inline >] sock_sndtimeo include/net/sock.h:2075 > RIP [<ffffffff86cb35e1>] netlink_unicast+0xe1/0x730 > net/netlink/af_netlink.c:1232 > RSP <ffff880036926f68> > ---[ end trace 8383a15fba6fdc59 ]--- Looks a bug added in commit 32a1dbaece7e37cea415e03cd426172249aa859e ("audit: try harder to send to auditd upon netlink failure") or 133e1e5acd4a63c4a0dcc413e90d5decdbce9c4a ("audit: stop an old auditd being starved out by a new auditd") Richard, can you take a look ? Thanks ! ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: netlink: GPF in sock_sndtimeo 2016-11-26 15:44 Dmitry Vyukov 2016-11-26 16:17 ` Eric Dumazet @ 2016-11-27 1:11 ` Cong Wang 1 sibling, 0 replies; 36+ messages in thread From: Cong Wang @ 2016-11-27 1:11 UTC (permalink / raw) To: Dmitry Vyukov Cc: David Miller, Johannes Berg, Florian Westphal, Eric Dumazet, Herbert Xu, netdev, LKML, syzkaller On Sat, Nov 26, 2016 at 7:44 AM, Dmitry Vyukov <dvyukov@google.com> wrote: > Hello, > > The following program triggers GPF in sock_sndtimeo: > https://gist.githubusercontent.com/dvyukov/c19cadd309791cf5cb9b2bf936d3f48d/raw/1743ba0211079a5465d039512b427bc6b59b1a76/gistfile1.txt > > On commit 16ae16c6e5616c084168740990fc508bda6655d4 (Nov 24). > > general protection fault: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN > Dumping ftrace buffer: > (ftrace buffer empty) > Modules linked in: > CPU: 1 PID: 19950 Comm: syz-executor Not tainted 4.9.0-rc5+ #54 > Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011 > task: ffff88002a0d0840 task.stack: ffff880036920000 > RIP: 0010:[<ffffffff86cb35e1>] [< inline >] sock_sndtimeo > include/net/sock.h:2075 > RIP: 0010:[<ffffffff86cb35e1>] [<ffffffff86cb35e1>] > netlink_unicast+0xe1/0x730 net/netlink/af_netlink.c:1232 > RSP: 0018:ffff880036926f68 EFLAGS: 00010202 > RAX: 0000000000000068 RBX: ffff880036927000 RCX: ffffc900021d0000 > RDX: 0000000000000d63 RSI: 00000000024000c0 RDI: 0000000000000340 > RBP: ffff880036927028 R08: ffffed0006ea7aab R09: ffffed0006ea7aab > R10: 0000000000000001 R11: ffffed0006ea7aaa R12: dffffc0000000000 > R13: 0000000000000000 R14: ffff880035de3400 R15: ffff880035de3400 > FS: 00007f90a2fc7700(0000) GS:ffff88003ed00000(0000) knlGS:0000000000000000 > CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > CR2: 00000000006de0c0 CR3: 0000000035de6000 CR4: 00000000000006e0 > Stack: > ffff880035de3400 ffffffff819f02a1 1ffff10006d24df4 0000000000000004 > 00004db400000014 ffff880036926fd8 ffffffff00000000 0000000041b58ab3 > ffffffff89653c11 ffffffff86cb3500 ffffffff819f0345 ffff880035de3400 > Call Trace: > [< inline >] audit_replace kernel/audit.c:817 > [<ffffffff816c34b9>] audit_receive_msg+0x22c9/0x2ce0 kernel/audit.c:894 > [< inline >] audit_receive_skb kernel/audit.c:1120 > [<ffffffff816c40ac>] audit_receive+0x1dc/0x360 kernel/audit.c:1133 > [< inline >] netlink_unicast_kernel net/netlink/af_netlink.c:1214 > [<ffffffff86cb3a14>] netlink_unicast+0x514/0x730 net/netlink/af_netlink.c:1240 > [<ffffffff86cb46d4>] netlink_sendmsg+0xaa4/0xe50 net/netlink/af_netlink.c:1786 > [< inline >] sock_sendmsg_nosec net/socket.c:621 > [<ffffffff86a6d54f>] sock_sendmsg+0xcf/0x110 net/socket.c:631 > [<ffffffff86a6d8bb>] sock_write_iter+0x32b/0x620 net/socket.c:829 > [< inline >] new_sync_write fs/read_write.c:499 > [<ffffffff81a6f24e>] __vfs_write+0x4fe/0x830 fs/read_write.c:512 > [<ffffffff81a70cf5>] vfs_write+0x175/0x4e0 fs/read_write.c:560 > [< inline >] SYSC_write fs/read_write.c:607 > [<ffffffff81a75180>] SyS_write+0x100/0x240 fs/read_write.c:599 > [<ffffffff81009a24>] do_syscall_64+0x2f4/0x940 arch/x86/entry/common.c:280 > [<ffffffff88149e8d>] entry_SYSCALL64_slow_path+0x25/0x25 > Code: fe 4c 89 f7 e8 31 16 ff ff 8b 8d 70 ff ff ff 49 89 c7 31 c0 85 > c9 75 25 e8 7d 4a a3 fa 49 8d bd 40 03 00 00 48 89 f8 48 c1 e8 03 <42> > 80 3c 20 00 0f 85 3a 06 00 00 49 8b 85 40 03 00 00 4c 8d 73 > RIP [< inline >] sock_sndtimeo include/net/sock.h:2075 > RIP [<ffffffff86cb35e1>] netlink_unicast+0xe1/0x730 > net/netlink/af_netlink.c:1232 > RSP <ffff880036926f68> > ---[ end trace 8383a15fba6fdc59 ]--- It is racy on audit_sock, especially on the netns exit path. Could the following patch help a little bit? Also, I don't see how the synchronize_net() here could sync with netlink rcv path, since unlike packets from wire, netlink messages are not handled in BH context nor I see any RCU taken on rcv path. diff --git a/kernel/audit.c b/kernel/audit.c index f1ca116..20bc79e 100644 --- a/kernel/audit.c +++ b/kernel/audit.c @@ -1167,10 +1167,13 @@ static void __net_exit audit_net_exit(struct net *net) { struct audit_net *aunet = net_generic(net, audit_net_id); struct sock *sock = aunet->nlsk; + + mutex_lock(&audit_cmd_mutex); if (sock == audit_sock) { audit_pid = 0; audit_sock = NULL; } + mutex_unlock(&audit_cmd_mutex); RCU_INIT_POINTER(aunet->nlsk, NULL); synchronize_net(); ^ permalink raw reply related [flat|nested] 36+ messages in thread
end of thread, other threads:[~2016-12-14 6:37 UTC | newest] Thread overview: 36+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- [not found] <CACT4Y+aG1+91U1PWMTwpE_6vbEuqG7CdLCM1H=3WVJWtz=> [not found] ` <CAM_iQpVeLvfYV+1jX1ZKOntZim4roof4=> 2016-11-29 16:48 ` netlink: GPF in sock_sndtimeo Richard Guy Briggs 2016-11-29 23:13 ` Cong Wang 2016-11-30 4:52 ` Richard Guy Briggs 2016-12-09 6:02 ` Richard Guy Briggs 2016-12-09 6:57 ` Cong Wang 2016-12-09 11:01 ` Richard Guy Briggs 2016-12-10 4:13 ` Cong Wang 2016-12-10 7:40 ` Cong Wang 2016-12-12 10:07 ` Dmitry Vyukov 2016-12-13 7:51 ` Richard Guy Briggs 2016-12-13 8:28 ` Richard Guy Briggs 2016-12-12 10:02 ` Richard Guy Briggs 2016-12-12 10:03 ` [PATCH v2] audit: use proper refcount locking on audit_sock Richard Guy Briggs 2016-12-12 17:10 ` Paul Moore 2016-12-13 4:49 ` Richard Guy Briggs 2016-12-12 20:18 ` Paul Moore 2016-12-13 5:10 ` Richard Guy Briggs 2016-12-13 15:01 ` Richard Guy Briggs 2016-12-12 23:58 ` Cong Wang 2016-12-13 14:55 ` Richard Guy Briggs 2016-12-13 0:10 ` netlink: GPF in sock_sndtimeo Cong Wang 2016-12-13 10:52 ` Richard Guy Briggs 2016-12-14 0:17 ` Cong Wang 2016-12-14 4:17 ` Richard Guy Briggs 2016-12-13 15:03 ` [RFC PATCH v3] audit: use proper refcount locking on audit_sock Richard Guy Briggs 2016-12-13 20:50 ` Paul Moore 2016-12-14 0:19 ` Cong Wang 2016-12-14 4:00 ` Richard Guy Briggs 2016-12-14 5:36 ` Cong Wang 2016-12-09 10:49 ` netlink: GPF in sock_sndtimeo Dmitry Vyukov 2016-12-09 11:48 ` Richard Guy Briggs 2016-12-09 11:53 ` Dmitry Vyukov 2016-12-09 12:12 ` Richard Guy Briggs 2016-11-26 15:44 Dmitry Vyukov 2016-11-26 16:17 ` Eric Dumazet 2016-11-27 1:11 ` Cong Wang
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).