linux-fsdevel.vger.kernel.org archive mirror
From: Al Viro <viro@zeniv.linux.org.uk>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: syzbot <syzbot+7a8ba368b47fdefca61e@syzkaller.appspotmail.com>,
	Alexei Starovoitov <ast@kernel.org>,
	Daniel Borkmann <daniel@iogearbox.net>,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>,
	Linux List Kernel Mailing <linux-kernel@vger.kernel.org>,
	syzkaller-bugs <syzkaller-bugs@googlegroups.com>
Subject: Re: KASAN: use-after-free Read in path_lookupat
Date: Mon, 25 Mar 2019 19:43:32 +0000
Message-ID: <20190325194332.GO2217@ZenIV.linux.org.uk>
In-Reply-To: <CAHk-=wg4iJsMHBzK52WzP+5_92HbwvX_vh_s4mMUuN0FJGdM5A@mail.gmail.com>

On Mon, Mar 25, 2019 at 11:36:01AM -0700, Linus Torvalds wrote:
> Right. Not just move the existing destroy_inode() - because as you
> say, people may not be able to do that in RCU context, but split it
> up, and add a "final_free_inode()" callback or something for the RCU
> phase.
> 
> In fact, I suspect that *every* user of destroy_inode() already has to
> have its own RCU callback to another final freeing function anyway.

Nope - pipes do not, and for a good reason.

> Because they really shouldn't free the inode itself early. Maybe we
> can just make that be a generic thing?

Maybe...  OTOH, we already have more methods on the destruction end
than I'm comfortable trying to document.  Because we clearly *do*
need a clear set of rules along the lines of "this kind of thing belongs
in this method, this - in that one".

As it is, on the way to inode destruction we have

1) [kinda] ->drop_inode() - decide whether this inode is not worth
keeping around in icache once the refcount reaches zero.  Predicate,
shouldn't change inode state at all.    Common instances:
	* default (encoded as NULL, available as generic_drop_inode()) -
"keep it around if it's still hashed and link count is non-zero".
	* generic_delete_inode(): "don't retain that sucker"
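
A minimal userspace sketch of the two common predicates described above.
Everything prefixed with toy_ is hypothetical and only models the policy;
it is not the kernel's struct inode or its actual helpers.  The const
pointer is there to mirror the "shouldn't change inode state" rule.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for struct inode: only what the predicate looks at. */
struct toy_inode {
	unsigned int nlink;	/* link count */
	bool hashed;		/* still on the icache hash chain? */
};

/* Hypothetical method type; const because a predicate must not modify state. */
typedef int (*toy_drop_fn)(const struct toy_inode *);

/* Default policy: keep it around if it's still hashed and link count is non-zero. */
static int toy_generic_drop_inode(const struct toy_inode *inode)
{
	return !inode->nlink || !inode->hashed;
}

/* "Don't retain that sucker": always ask for eviction. */
static int toy_generic_delete_inode(const struct toy_inode *inode)
{
	(void)inode;
	return 1;
}

/* The default is encoded as a NULL method, so the caller falls back to it. */
static int toy_should_drop(toy_drop_fn drop, const struct toy_inode *inode)
{
	assert(inode != NULL);
	return drop ? drop(inode) : toy_generic_drop_inode(inode);
}
```

With a NULL method, a hashed inode with a non-zero link count is retained;
an unhashed or unlinked one is not, and toy_generic_delete_inode() never
retains anything.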

However, 3 instances are more or less weird - f2fs, gfs2 and ocfs2.
The gfs2 one is the least odd of those, but the other two...
What the hell is forced writeback doing in ocfs2_drop_inode()?
If they don't want to retain anything, fine, but then why do
the
        inode->i_state |= I_WILL_FREE;
        spin_unlock(&inode->i_lock);
        write_inode_now(inode, 1);
        spin_lock(&inode->i_lock);
        WARN_ON(inode->i_state & I_NEW);
        inode->i_state &= ~I_WILL_FREE;
dance in ->drop_inode()?  It will be immediately followed by
->evict_inode(), and it feels like the damn thing would be
a lot more natural there...  And what, for pity's sake, is f2fs
doing with truncation in that predicate, of all places?  The
comment in there is
        /*
         * This is to avoid a deadlock condition like below.
         * writeback_single_inode(inode)
         *  - f2fs_write_data_page
         *    - f2fs_gc -> iput -> evict
         *       - inode_wait_for_writeback(inode)
         */
which looks... uninspiring, to put it mildly.

2) ->evict_inode() - called when we kick the inode out.  Freeing
on-disk inodes, etc. belongs there.  inode is still in icache
hash chain at that point, so any icache lookups for it will block
until that thing is done.  If we have something non-trivial done
by iget... test() callbacks, we must keep the data structures needed
by those.

3) ->destroy_inode() - destructor.  By that point all remaining
references to inode are (stale) RCU ones.  The stuff that might
be reached via those has to have an RCU delay between the call
of ->destroy_inode() and freeing.  Very commonly that's done
by call_rcu(), and more often than not it's the only thing in
->destroy_inode().  However, if we know that there'll be no
RCU accessors, we can do freeing immediately - pipes do just
that.
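
The two destructor styles in 3) can be modelled in userspace roughly as
follows.  This is a sketch under loud assumptions: toy_call_rcu() and
toy_grace_period_elapsed() are hypothetical stand-ins that merely queue
and later run callbacks; they are not the kernel's call_rcu() and provide
no real RCU guarantees.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct rcu_head. */
struct toy_rcu_head {
	struct toy_rcu_head *next;
	void (*func)(struct toy_rcu_head *);
};

static struct toy_rcu_head *toy_pending;	/* callbacks awaiting a grace period */
static int toy_freed_count;			/* how many inodes were really freed */

/* Queue a callback instead of running it: models the RCU delay. */
static void toy_call_rcu(struct toy_rcu_head *head,
			 void (*func)(struct toy_rcu_head *))
{
	head->func = func;
	head->next = toy_pending;
	toy_pending = head;
}

/* Pretend all RCU readers are done; run every queued callback. */
static void toy_grace_period_elapsed(void)
{
	while (toy_pending) {
		struct toy_rcu_head *head = toy_pending;

		toy_pending = head->next;
		head->func(head);
	}
}

struct toy_fs_inode {
	struct toy_rcu_head rcu;	/* first member, so the cast below is valid */
	void *fs_private;		/* data that stale RCU readers might still reach */
};

static struct toy_fs_inode *toy_alloc_inode(void)
{
	struct toy_fs_inode *inode = calloc(1, sizeof(*inode));

	assert(inode != NULL);
	return inode;
}

/* Final freeing: only safe once no RCU accessor can see the inode. */
static void toy_free_inode_cb(struct toy_rcu_head *head)
{
	struct toy_fs_inode *inode = (struct toy_fs_inode *)head;

	free(inode->fs_private);
	free(inode);
	toy_freed_count++;
}

/* Common case: the destructor does nothing except schedule the freeing. */
static void toy_destroy_inode_rcu(struct toy_fs_inode *inode)
{
	toy_call_rcu(&inode->rcu, toy_free_inode_cb);
}

/* Pipe-like case: no RCU accessors are possible, so free immediately. */
static void toy_destroy_inode_now(struct toy_fs_inode *inode)
{
	toy_free_inode_cb(&inode->rcu);
}
```

In this model an inode passed to toy_destroy_inode_rcu() stays allocated
until toy_grace_period_elapsed() runs, while toy_destroy_inode_now() frees
it on the spot.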

And the above is piss-poor as documentation goes - it doesn't
answer the "where should this go?" any better than "try to
see what similar filesystems are doing", which is asking for
cargo-culting ;-/
