From: Al Viro <viro@zeniv.linux.org.uk>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: syzbot <syzbot+7a8ba368b47fdefca61e@syzkaller.appspotmail.com>,
Alexei Starovoitov <ast@kernel.org>,
Daniel Borkmann <daniel@iogearbox.net>,
linux-fsdevel <linux-fsdevel@vger.kernel.org>,
Linux List Kernel Mailing <linux-kernel@vger.kernel.org>,
syzkaller-bugs <syzkaller-bugs@googlegroups.com>
Subject: Re: KASAN: use-after-free Read in path_lookupat
Date: Mon, 25 Mar 2019 04:57:44 +0000
Message-ID: <20190325045744.GK2217@ZenIV.linux.org.uk>
In-Reply-To: <CAHk-=wiijJMTw=nTj_ED+YWMCrvHzh3ezfVYRxvzcW6+trgyPA@mail.gmail.com>
On Sun, Mar 24, 2019 at 06:23:24PM -0700, Linus Torvalds wrote:
> Al, comments? At the very least, if we don't make
> simple_symlink_inode_operations() do that, we should have a *big*
> comment that if it's not part of the inode data, it needs to be
> RCU-delayed.
simple_symlink_inode_operations is a red herring here - what matters
is ->i_link being set; those have ->get_link == simple_get_link,
but note that it is *not* called:
	res = inode->i_link;
	if (!res) {
		const char * (*get)(struct dentry *, struct inode *,
				struct delayed_call *);
		get = inode->i_op->get_link;
		if (nd->flags & LOOKUP_RCU) {
			res = get(NULL, inode, &last->done);
			if (res == ERR_PTR(-ECHILD)) {
				if (unlikely(unlazy_walk(nd)))
					return ERR_PTR(-ECHILD);
				res = get(dentry, inode, &last->done);
			}
		} else {
			res = get(dentry, inode, &last->done);
		}
		if (IS_ERR_OR_NULL(res))
			return res;
	}
for traversal and similar for readlink(2). And we certainly don't want
to allocate copies in those cases - it would fuck RCU traversals for
all fast symlinks (i.e. for the majority of symlinks out there).
Actual situation:
	* shmem, erofs: OK, kfree() from the thing ->destroy_inode() is calling via
	  call_rcu().
	* befs, ext2, ext4, freevxfs, jfs, orangefs, ufs: OK, coallocated with inode
	* debugfs: broken
	* jffs2: broken, freeing of f->target should be moved to jffs2_i_callback().
	* ubifs: broken, ought to move kfree(ui->data); from ubifs_destroy_inode() to
	  ubifs_i_callback()
	* ceph: broken, needs to move kfree(ci->symlink) from ceph_destroy_inode()
	  to ceph_i_callback().
	* bpf: broken
So we have 5 broken cases, all with the same kind of fix: move freeing
into the RCU-delayed part of ->destroy_inode(); for debugfs and bpf
that requires adding ->alloc_inode()/->destroy_inode(), rather than
relying upon the defaults from fs/inode.c.
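A rough userspace illustration of that fix pattern, for readers who want to see the ordering spelled out. This is not kernel code: every toy_* name is hypothetical, and the callback queue only stands in for call_rcu() and the end of a grace period. The point it tries to show is that the separately allocated symlink body is freed from the deferred callback, together with the inode, and never from the immediate part of the destroy path.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Toy stand-in for rcu_head/call_rcu(): callbacks are queued and run
 * only when the simulated grace period ends. */
struct toy_rcu_head {
	struct toy_rcu_head *next;
	void (*func)(struct toy_rcu_head *);
};

static struct toy_rcu_head *toy_pending;

static void toy_call_rcu(struct toy_rcu_head *head,
			 void (*func)(struct toy_rcu_head *))
{
	assert(func != NULL);
	head->func = func;
	head->next = toy_pending;
	toy_pending = head;
}

/* Simulated end of a grace period: no lockless reader is still running,
 * so the queued callbacks may free memory.  Returns how many ran. */
static int toy_grace_period_end(void)
{
	int ran = 0;

	while (toy_pending) {
		struct toy_rcu_head *head = toy_pending;

		toy_pending = head->next;
		head->func(head);
		ran++;
	}
	return ran;
}

struct toy_inode {
	char *i_link;			/* separately allocated symlink body */
	struct toy_rcu_head rcu;
};

static int toy_bodies_freed;

/* Deferred part: the only place where the symlink body is freed. */
static void toy_i_callback(struct toy_rcu_head *head)
{
	struct toy_inode *inode = (struct toy_inode *)
		((char *)head - offsetof(struct toy_inode, rcu));

	free(inode->i_link);
	toy_bodies_freed++;
	free(inode);
}

/* Immediate part: must not free anything a lockless walker may still
 * dereference, so it only schedules the deferred part. */
static void toy_destroy_inode(struct toy_inode *inode)
{
	toy_call_rcu(&inode->rcu, toy_i_callback);
}

static struct toy_inode *toy_new_symlink(const char *target)
{
	struct toy_inode *inode = calloc(1, sizeof(*inode));

	if (!inode)
		return NULL;
	inode->i_link = malloc(strlen(target) + 1);
	if (!inode->i_link) {
		free(inode);
		return NULL;
	}
	strcpy(inode->i_link, target);
	return inode;
}
```

In this model a reader that picked up inode->i_link before toy_destroy_inode() can keep using the string until toy_grace_period_end() runs; freeing the body directly in toy_destroy_inode() would reproduce the use-after-free.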
> Or maybe we could add a final inode callback function for "rcu_free"
> that is called as the RCU-delayed freeing of the inode itself happens?
> And then people could hook into that for freeing the inode->i_link
> data.
You mean, split ->destroy_inode() into immediate and RCU-delayed parts?
There are filesystems where both parts are non-empty - we can't just
switch all ->destroy_inode() work to call_rcu().
> So many choices.. But the current situation seems unnecessarily
> complex for the filesystem, and isn't really documented.
>
> Our documentation currently says for get_link(): "If the body won't go
> away until the inode is gone, nothing else is needed", which is wrong
> (or at least very misleading, since the last "inode is gone" callback
> we have is that evict() function).
s/inode is gone/struct inode is freed/, but it's obviously not clear
enough.