From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1751576AbcLEHVY (ORCPT ); Mon, 5 Dec 2016 02:21:24 -0500 Received: from helcar.hengli.com.au ([209.40.204.226]:49327 "EHLO helcar.apana.org.au" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1750785AbcLEHVV (ORCPT ); Mon, 5 Dec 2016 02:21:21 -0500 Date: Mon, 5 Dec 2016 15:19:46 +0800 From: Herbert Xu To: Cong Wang Cc: Andrey Konovalov , "David S. Miller" , Johannes Berg , Florian Westphal , Eric Dumazet , Bob Copeland , Tom Herbert , David Decotigny , netdev , LKML Subject: Re: net: use-after-free in worker_thread Message-ID: <20161205071946.GB9496@gondor.apana.org.au> References: MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.5.23 (2014-03-12) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Sat, Dec 03, 2016 at 10:14:48AM -0800, Cong Wang wrote: > On Sat, Dec 3, 2016 at 9:41 AM, Cong Wang wrote: > > On Sat, Dec 3, 2016 at 4:56 AM, Andrey Konovalov wrote: > >> Hi! > >> > >> I'm seeing lots of the following error reports while running the > >> syzkaller fuzzer. > >> > >> Reports appeared when I updated to 3c49de52 (Dec 2) from 2caceb32 (Dec 1). > >> > >> ================================================================== > >> BUG: KASAN: use-after-free in worker_thread+0x17d8/0x18a0 > >> Read of size 8 at addr ffff880067f3ecd8 by task kworker/3:1/774 > >> > >> page:ffffea00019fce00 count:1 mapcount:0 mapping: (null) > >> index:0xffff880067f39c10 compound_mapcount: 0 > >> flags: 0x500000000004080(slab|head) > >> page dumped because: kasan: bad access detected > >> > >> CPU: 3 PID: 774 Comm: kworker/3:1 Not tainted 4.9.0-rc7+ #66 > >> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011 > >> ffff88006c267838 ffffffff81f882da ffffffff6c25e338 1ffff1000d84ce9a > >> ffffed000d84ce92 ffff88006c25e340 0000000041b58ab3 ffffffff8541e198 > >> ffffffff81f88048 0000000100000000 0000000041b58ab3 ffffffff853d3ee8 > >> Call Trace: > >> [< inline >] __dump_stack lib/dump_stack.c:15 > >> [] dump_stack+0x292/0x398 lib/dump_stack.c:51 > >> [< inline >] describe_address mm/kasan/report.c:262 > >> [] kasan_report_error+0x121/0x560 mm/kasan/report.c:368 > >> [< inline >] kasan_report mm/kasan/report.c:390 > >> [] __asan_report_load8_noabort+0x3e/0x40 > >> mm/kasan/report.c:411 > >> [] worker_thread+0x17d8/0x18a0 kernel/workqueue.c:2228 > >> [] kthread+0x323/0x3e0 kernel/kthread.c:209 > >> [] ret_from_fork+0x2a/0x40 arch/x86/entry/entry_64.S:433 > > > > Heck... this is the pending work vs. sk_destruct() race. :-/ > > We can't wait for the work in RCU callback, let me think about it... Sorry, my patch was obviously crap as it was trying to delay the freeing of a socket from sk_destruct which can't be done. > Please try the attached patch, I only did compile test, I can't access > my desktop now, so can't do further tests. Thanks for the patch. It'll obviously work but I wanted avoid that because it penalises the common path for the rare case. Andrey, please try this patch and let me know if it's any better. ---8<--- Subject: netlink: Do not schedule work from sk_destruct It is wrong to schedule a work from sk_destruct using the socket as the memory reserve because the socket will be freed immediately after the return from sk_destruct. Instead we should do the deferral prior to sk_free. This patch does just that. Fixes: 707693c8a498 ("netlink: Call cb->done from a worker thread") Signed-off-by: Herbert Xu diff --git a/net/netlink/af_netlink.c b/net/netlink/af_netlink.c index 602e5eb..8a642c5 100644 --- a/net/netlink/af_netlink.c +++ b/net/netlink/af_netlink.c @@ -322,11 +322,13 @@ static void netlink_skb_set_owner_r(struct sk_buff *skb, struct sock *sk) sk_mem_charge(sk, skb->truesize); } -static void __netlink_sock_destruct(struct sock *sk) +static void netlink_sock_destruct(struct sock *sk) { struct netlink_sock *nlk = nlk_sk(sk); if (nlk->cb_running) { + if (nlk->cb.done) + nlk->cb.done(&nlk->cb); module_put(nlk->cb.module); kfree_skb(nlk->cb.skb); } @@ -348,21 +350,7 @@ static void netlink_sock_destruct_work(struct work_struct *work) struct netlink_sock *nlk = container_of(work, struct netlink_sock, work); - nlk->cb.done(&nlk->cb); - __netlink_sock_destruct(&nlk->sk); -} - -static void netlink_sock_destruct(struct sock *sk) -{ - struct netlink_sock *nlk = nlk_sk(sk); - - if (nlk->cb_running && nlk->cb.done) { - INIT_WORK(&nlk->work, netlink_sock_destruct_work); - schedule_work(&nlk->work); - return; - } - - __netlink_sock_destruct(sk); + sk_free(&nlk->sk); } /* This lock without WQ_FLAG_EXCLUSIVE is good on UP and it is _very_ bad on @@ -664,11 +652,17 @@ static int netlink_create(struct net *net, struct socket *sock, int protocol, goto out; } -static void deferred_put_nlk_sk(struct rcu_head *head) +static void deferred_free_nlk_sk(struct rcu_head *head) { struct netlink_sock *nlk = container_of(head, struct netlink_sock, rcu); - sock_put(&nlk->sk); + if (nlk->cb_running && nlk->cb.done) { + INIT_WORK(&nlk->work, netlink_sock_destruct_work); + schedule_work(&nlk->work); + return; + } + + sk_free(&nlk->sk); } static int netlink_release(struct socket *sock) @@ -743,7 +737,9 @@ static int netlink_release(struct socket *sock) local_bh_disable(); sock_prot_inuse_add(sock_net(sk), &netlink_proto, -1); local_bh_enable(); - call_rcu(&nlk->rcu, deferred_put_nlk_sk); + + if (atomic_dec_and_test(&sk->sk_refcnt)) + call_rcu(&nlk->rcu, deferred_free_nlk_sk); return 0; } -- Email: Herbert Xu Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt