From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-1.1 required=3.0 tests=DKIMWL_WL_HIGH,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,MAILING_LIST_MULTI, SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 47728C43381 for ; Mon, 25 Mar 2019 01:23:46 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 0FDDE20693 for ; Mon, 25 Mar 2019 01:23:46 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=default; t=1553477026; bh=QLE/muGLqp4ExMEy7KCJ6i4T1iu323BEk1k+JhCCWGM=; h=References:In-Reply-To:From:Date:Subject:To:Cc:List-ID:From; b=jDNMToHpgqNEOC/weWOZqYkZRZgWCXkK+mxw/ncKBJJJU+iAslKmBUvj5FxKOOvHs nJ4S2cxfh0fJ09IPkTBdG4Oi80WFMv8ihYPZ5NaNWhLxjMe6dNSSn/VUhCy8xyMMLN nzjjpvlTCpAwHdtOKhD8u8UUVfmgzwr94Ae7qT68= Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729140AbfCYBXp (ORCPT ); Sun, 24 Mar 2019 21:23:45 -0400 Received: from mail-lf1-f67.google.com ([209.85.167.67]:45823 "EHLO mail-lf1-f67.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1729123AbfCYBXp (ORCPT ); Sun, 24 Mar 2019 21:23:45 -0400 Received: by mail-lf1-f67.google.com with SMTP id 5so4780473lft.12 for ; Sun, 24 Mar 2019 18:23:43 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux-foundation.org; s=google; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=Wf9yYyFx5XQycNPtTrNCRc9C90M0oW6GSIu1pC1YItI=; b=e7jdEUyW3sYQ481V5uCwd0nHFZrOLlPdAmWu2ViiO7nkllP8t8muZrXYkTqJzUTTJX EXfrZiDE/2CbKNS/1W1CU56XDBYV7+t0NaQJ6K02ZTiWYXa/JgAXwDv+5p2oSIBgzKfr 3HSdu1dDi/ZMK7h69Ghn4ZtJ37EaKRqtoYQ9M= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=Wf9yYyFx5XQycNPtTrNCRc9C90M0oW6GSIu1pC1YItI=; b=kCNYCjduxPinXfHvduN9wSd4WITQs6aSfibdx+JiqL1YA+CsefoUyP6pSq1bv9Sk+e ZzJnBoMYHUiIY5/UamcuK9vBwEFi92mhvPU0LzPxETVHivtykrHRUxmZJBwzp8rWCVaD BsKhUl5tcJYOZnsq+jjsuEqBN+we9uhEXJO/M2ONuJRMEtDLW1KrPo4r0lNjqUCFaJVM AxpG8Q4nNcobKwjeVpsY3yIcl106WAYHEMXOGNiNxXf1MMuYVqSpxN+qL3ZffmcoV4y7 gHtkamIkaokLv3c8Rh1XeLW2ApyjfavwbBZpbWTtfC1JHnL3zKpWyacFxOib4AxWZvak l7HQ== X-Gm-Message-State: APjAAAUglVuMPazEuADH2E1miT99qIJHyPeLQrCauC81voyUObuDosAl 7dUwvF/Q9vpBupPjPBSV7f+iebbzXDw= X-Google-Smtp-Source: APXvYqyjeOh4vk5UHzdOqzDHWmkvdhj1qsas1GOXo9rbsdxMUZ8dNzW0xGFQWe2cRvIuwPEZ4xm8pg== X-Received: by 2002:ac2:43c3:: with SMTP id u3mr10862254lfl.69.1553477022281; Sun, 24 Mar 2019 18:23:42 -0700 (PDT) Received: from mail-lj1-f179.google.com (mail-lj1-f179.google.com. [209.85.208.179]) by smtp.gmail.com with ESMTPSA id s24sm1738494ljs.30.2019.03.24.18.23.40 for (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Sun, 24 Mar 2019 18:23:41 -0700 (PDT) Received: by mail-lj1-f179.google.com with SMTP id y6so6284502ljd.12 for ; Sun, 24 Mar 2019 18:23:40 -0700 (PDT) X-Received: by 2002:a2e:22c4:: with SMTP id i187mr2863776lji.94.1553477020489; Sun, 24 Mar 2019 18:23:40 -0700 (PDT) MIME-Version: 1.0 References: <0000000000006946d2057bbd0eef@google.com> In-Reply-To: <0000000000006946d2057bbd0eef@google.com> From: Linus Torvalds Date: Sun, 24 Mar 2019 18:23:24 -0700 X-Gmail-Original-Message-ID: Message-ID: Subject: Re: KASAN: use-after-free Read in path_lookupat To: syzbot , Alexei Starovoitov , Daniel Borkmann Cc: linux-fsdevel , Linux List Kernel Mailing , syzkaller-bugs , Al Viro Content-Type: text/plain; charset="UTF-8" Sender: linux-fsdevel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org Hmm. Al, this one seems real and also seems pretty nasty from a vfs interface standpoint. On Wed, Nov 28, 2018 at 9:40 AM syzbot wrote: > > BUG: KASAN: use-after-free in lookup_last fs/namei.c:2269 [inline] > BUG: KASAN: use-after-free in path_lookupat.isra.43+0x9f8/0xc00 fs/namei.c:2318 > ... > lookup_last fs/namei.c:2269 [inline] > path_lookupat.isra.43+0x9f8/0xc00 fs/namei.c:2318 Ok, path lookup using RCU. > Allocated by task 9424: > kstrdup+0x39/0x70 mm/util.c:49 > bpf_symlink+0x26/0x140 kernel/bpf/inode.c:356 It's the symlink data for the bpf filesystem. Fair enough. > Freed by task 9425: > kfree+0xcf/0x230 mm/slab.c:3817 > bpf_evict_inode+0x11f/0x150 kernel/bpf/inode.c:565 > evict+0x4b9/0x980 fs/inode.c:558 > iput_final fs/inode.c:1550 [inline] > iput+0x674/0xa90 fs/inode.c:1576 > do_unlinkat+0x733/0xa30 fs/namei.c:4069 .. and the path lookup is racing with the final unlink. We actually RCU-delay the inode freeing itself, but when we do the final iput(), the "evict()" function is called synchronously. Now, the simple fix would seem to just RCU-delay the kfree() of the symlink data in bpf_evict_inode(). Maybe that's the right thing to do. I'd call it trivial, except you'd need to have that rcu head to attach it to, making it slightly less trivial. I guess the kmalloc could just malloc that too. But it does strike me that this might be a generic issue. I wonder how many other filesystems do this? Alexei, Daniel - the "proper" fix right is probably one of four cases: (a) rcu-delay the freeing, and use simple_symlink_inode_operations(). No set_delayed_call() needed, because it stays around for the inode lifetime, but the RCU-delaying is still needed for lookup. or (b) put the symlink in a page, and use page_symlink_inode_operations for that inode, where we have a "struct delayed_call" and get a page ref or (c) put the symlink in the inode itself, and then use that simple_symlink_inode_operations (and now the RCU-delaying of the inode protects it). of (d) make a special getlink() that does "copy symlink, set_delayed_call(done, kfree_link, copy)" but I do wonder if we should perhaps just make simple_symlink_inode_operations do that (d) case for people. Al, comments? At the very least, if we don't make simple_symlink_inode_operations() do that, we should have a *big* comment that if it's not part of the inode data, it needs to be RCU-delayed. Or maybe we could add a final inode callback function for "rcu_free" that is called as the RCU-delayed freeing of the inode itself happens? And then people could hook into that for freeing the inode->i_link data. So many choices.. But the current situation seems unnecessarily complex for the filesystem, and isn't really documented. Our documentation currently says for get_link(): "If the body won't go away until the inode is gone, nothing else is needed", which is wrong (or at least very misleading, since the last "inode is gone" callback we have is that evict() function). Linus