Linux-Fsdevel Archive on lore.kernel.org
From: Al Viro <viro@zeniv.linux.org.uk>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: syzbot <syzbot+7a8ba368b47fdefca61e@syzkaller.appspotmail.com>,
	Alexei Starovoitov <ast@kernel.org>,
	Daniel Borkmann <daniel@iogearbox.net>,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>,
	Linux List Kernel Mailing <linux-kernel@vger.kernel.org>,
	syzkaller-bugs <syzkaller-bugs@googlegroups.com>
Subject: Re: KASAN: use-after-free Read in path_lookupat
Date: Mon, 25 Mar 2019 04:57:44 +0000
Message-ID: <20190325045744.GK2217@ZenIV.linux.org.uk> (raw)
In-Reply-To: <CAHk-=wiijJMTw=nTj_ED+YWMCrvHzh3ezfVYRxvzcW6+trgyPA@mail.gmail.com>

On Sun, Mar 24, 2019 at 06:23:24PM -0700, Linus Torvalds wrote:


> Al, comments? At the very least, if we don't make
> simple_symlink_inode_operations() do that, we should have a *big*
> comment that if it's not part of the inode data, it needs to be
> RCU-delayed.

simple_symlink_inode_operations is a red herring here - what matters
is ->i_link being set; those have ->get_link == simple_get_link,
but note that it is *not* called:
        res = inode->i_link;
        if (!res) {
                const char * (*get)(struct dentry *, struct inode *,
                                struct delayed_call *);
                get = inode->i_op->get_link;
                if (nd->flags & LOOKUP_RCU) {
                        res = get(NULL, inode, &last->done);
                        if (res == ERR_PTR(-ECHILD)) {
                                if (unlikely(unlazy_walk(nd)))
                                        return ERR_PTR(-ECHILD);
                                res = get(dentry, inode, &last->done);
                        }
                } else {
                        res = get(dentry, inode, &last->done);
                }
                if (IS_ERR_OR_NULL(res))
                        return res;
        }
for traversal, and similarly for readlink(2).  And we certainly don't want
to allocate copies in those cases - it would fuck RCU traversals for
all fast symlinks (i.e. for the majority of symlinks out there).
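
The ordering in that excerpt can be modelled in userspace. This is only a sketch, not kernel code: every demo_* name is hypothetical, and the real ->get_link() signature is reduced to a single argument. It shows that a non-NULL ->i_link is returned directly, so the ->get_link() method is never reached for such inodes.

```c
/* Userspace model only: every demo_* name is hypothetical. */
#include <stddef.h>

struct demo_inode {
	const char *i_link;	/* cached symlink body, or NULL */
	const char *(*get_link)(struct demo_inode *);
};

/* Counts how often any ->get_link() method actually ran. */
int demo_get_link_calls;

/* Stand-in for simple_get_link(): just hands back ->i_link. */
const char *demo_simple_get_link(struct demo_inode *inode)
{
	demo_get_link_calls++;
	return inode->i_link;
}

/* Stand-in for a filesystem that has to produce the body on demand. */
const char *demo_slow_get_link(struct demo_inode *inode)
{
	(void)inode;
	demo_get_link_calls++;
	return "/slow/target";
}

/* Same shape as the excerpt: consult ->i_link first, method second. */
const char *demo_get_link(struct demo_inode *inode)
{
	const char *res = inode->i_link;

	if (!res)
		res = inode->get_link(inode);
	return res;
}
```

With ->i_link set, the caller ends up holding the raw pointer, which is why the lifetime of that memory matters rather than which ->get_link() the inode carries.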

Actual situation:

* shmem, erofs: OK, kfree() is done in the callback that ->destroy_inode()
schedules via call_rcu().
* befs, ext2, ext4, freevxfs, jfs, orangefs, ufs: OK, co-allocated with the inode
* debugfs: broken
* jffs2: broken, freeing of f->target should be moved to jffs2_i_callback().
* ubifs: broken, ought to move kfree(ui->data); from ubifs_destroy_inode() to
ubifs_i_callback()
* ceph: broken, needs to move kfree(ci->symlink) from ceph_destroy_inode()
to ceph_i_callback().
* bpf: broken

So we have 5 broken cases, all with the same kind of fix: move freeing
into the RCU-delayed part of ->destroy_inode(); for debugfs and bpf
that requires adding ->alloc_inode()/->destroy_inode(), rather than
relying upon the defaults from fs/inode.c.
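
The shape of that fix can be sketched as a userspace model. Assumptions: toy_call_rcu() and toy_grace_period() are hypothetical stand-ins for call_rcu() and the end of a grace period, and struct toy_inode is not the kernel's struct inode. The point is only that the symlink body is freed in the deferred callback, so a walker that already loaded ->i_link can keep reading it until the grace period ends.

```c
/* Userspace model only: every toy_* name is hypothetical. */
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct toy_rcu_head {
	struct toy_rcu_head *next;
	void (*func)(struct toy_rcu_head *);
};

struct toy_rcu_head *toy_pending;	/* callbacks waiting for the grace period */
int toy_links_freed;			/* how many symlink bodies were freed */

/* Queue a callback instead of running it right away. */
void toy_call_rcu(struct toy_rcu_head *head,
		  void (*func)(struct toy_rcu_head *))
{
	head->func = func;
	head->next = toy_pending;
	toy_pending = head;
}

/* Pretend all readers are done: run every queued callback. */
void toy_grace_period(void)
{
	while (toy_pending) {
		struct toy_rcu_head *head = toy_pending;

		toy_pending = head->next;
		head->func(head);
	}
}

struct toy_inode {
	struct toy_rcu_head rcu;	/* first member, so a cast recovers the inode */
	char *i_link;			/* separately allocated symlink body */
};

/* RCU-delayed part: the body is freed here, together with the inode. */
void toy_i_callback(struct toy_rcu_head *head)
{
	struct toy_inode *inode = (struct toy_inode *)head;

	free(inode->i_link);
	toy_links_freed++;
	free(inode);
}

/* Immediate part: calling free(inode->i_link) here is the broken pattern. */
void toy_destroy_inode(struct toy_inode *inode)
{
	toy_call_rcu(&inode->rcu, toy_i_callback);
}

struct toy_inode *toy_new_symlink(const char *target)
{
	struct toy_inode *inode = calloc(1, sizeof(*inode));
	size_t len = strlen(target) + 1;

	assert(inode);
	inode->i_link = malloc(len);
	assert(inode->i_link);
	memcpy(inode->i_link, target, len);
	return inode;
}
```

In this model a pointer read from ->i_link before toy_destroy_inode() stays valid until toy_grace_period() runs, which is the property the RCU pathwalk relies on.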

> Or maybe we could add a final inode callback function for "rcu_free"
> that is called as the RCU-delayed freeing of the inode itself happens?
> And then people could hook into that for freeing the inode->i_link
> data.

You mean, split ->destroy_inode() into immediate and RCU-delayed parts?
There are filesystems where both parts are non-empty - we can't just
switch all ->destroy_inode() work to call_rcu().

> So many choices.. But the current situation seems unnecessarily
> complex for the filesystem, and isn't really documented.
> 
> Our documentation currently says for get_link(): "If the body won't go
> away until the inode is gone, nothing else is needed", which is wrong
> (or at least very misleading, since the last "inode is gone" callback
> we have is that evict() function).

s/inode is gone/struct inode is freed/, but it's obviously not clear
enough.
