linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Jan Kara <jack@suse.cz>
To: Al Viro <viro@zeniv.linux.org.uk>
Cc: Dave Chinner <david@fromorbit.com>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	syzbot <syzbot+7a8ba368b47fdefca61e@syzkaller.appspotmail.com>,
	Alexei Starovoitov <ast@kernel.org>,
	Daniel Borkmann <daniel@iogearbox.net>,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>,
	Linux List Kernel Mailing <linux-kernel@vger.kernel.org>,
	syzkaller-bugs <syzkaller-bugs@googlegroups.com>,
	Jan Kara <jack@suse.cz>, Jaegeuk Kim <jaegeuk@kernel.org>,
	Joel Becker <jlbec@evilplan.org>, Mark Fasheh <mark@fasheh.com>
Subject: Re: KASAN: use-after-free Read in path_lookupat
Date: Wed, 27 Mar 2019 18:22:54 +0100	[thread overview]
Message-ID: <20190327172254.GC6742@quack2.suse.cz> (raw)
In-Reply-To: <20190325230211.GR2217@ZenIV.linux.org.uk>

On Mon 25-03-19 23:02:11, Al Viro wrote:
> On Tue, Mar 26, 2019 at 09:48:23AM +1100, Dave Chinner wrote:
> 
> > And when it comes to VFS inode reclaim, XFS does not implement
> > ->evict_inode because there is nothing at the VFS level to do.
> > And ->destroy_inode ends up doing cleanup work (e.g. freeing on-disk
> > inodes) which is non-trivial, blocking work, but then still requires
> > the struct xfs_inode to be written back to disk before it can bei
> > freed. So it just gets marked "reclaimable" and background reclaim
> > then takes care of it from there so we avoid synchronous IO in inode
> > reclaim...
> > 
> > This works because don't track dirty inode metadata in the VFS
> > writeback code (it's tracked with much more precision in the XFS log
> > infrastructure) and we don't write back inodes from the VFS
> > infrastructure, either. It's all done based on internal state
> > outside the VFS.
> > 
> > And, because of this, the VFS cannot assume that it can free
> > the struct inode after calling ->destroy_inode or even use
> > call_rcu() to run a filesystem destructor because the filesystem
> > may need to do work that needs to block and that's not allowed in an
> > RCU callback...
> 
> In Linus' patch that's what you get with non-NULL ->destroy_inode
> + NULL ->destroy_inode_rcu, so XFS won't be screwed by that.
> Said that, yes, XFS adds another fun twist there (AFAICS, it's
> the only in-tree filesystem that pulls that off).
> 
> I would really like some comments from f2fs and ocfs2 folks, as well
> as Jan - he's had much more recent contact with writeback code than
> I have...  Could somebody explain what's going on in f2fs and ocfs2
> ->drop_inode()?  It _should_ be just a predicate; looks like both
> are playing very odd games to work around writeback problems and
> I wonder if there's a cleaner solution for that.  I can try and dig
> through maillist(s) archives, but I would really appreciate it
> if somebody could give a braindump on the issues dealt with in there...

OCFS2 is discussed elsewhere and should be relatively easy to deal with.
F2FS seems harder. The problem is that AFAICS they get inode references
from their garbage collection code which can get called during page
writeback. And then they need to drop these references and they can be the
last ones to hold the inode reference for an unlinked inode forcing flush
worker into inode cleanup. Which generally causes problems and was the
reason why writeback code does not take inode references but relies on
I_SYNC to protect it from inode reclaim instead (see commit 169ebd90131b
"writeback: Avoid iput() from flusher thread"). And they noticed the
problem as well and hacked around it... Now I don't know enough about F2FS
and its garbage collection to tell how they can avoid dropping inode
references from flush worker context. But that's the right solution for
avoiding deadlocks.

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

      parent reply	other threads:[~2019-03-27 17:23 UTC|newest]

Thread overview: 37+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-11-28 17:40 KASAN: use-after-free Read in path_lookupat syzbot
2019-03-25  0:44 ` syzbot
2019-03-25  1:25   ` Linus Torvalds
2019-03-25  1:23 ` Linus Torvalds
2019-03-25  4:57   ` Al Viro
2019-03-25  9:15     ` Daniel Borkmann
2019-03-25 11:11       ` Al Viro
2019-03-25 11:17         ` Al Viro
2019-03-25 11:21           ` Daniel Borkmann
2019-03-25 18:36     ` Linus Torvalds
2019-03-25 19:18       ` Linus Torvalds
2019-03-25 21:14         ` Al Viro
2019-03-25 21:45           ` Linus Torvalds
2019-03-25 22:04             ` Daniel Borkmann
2019-03-25 22:13               ` Linus Torvalds
2019-03-25 22:41                 ` Daniel Borkmann
2019-03-25 22:49               ` Al Viro
2019-03-25 23:37             ` Al Viro
2019-03-25 23:44               ` Alexei Starovoitov
2019-03-26  0:21                 ` Al Viro
2019-03-26  1:38               ` ceph: fix use-after-free on symlink traversal Al Viro
2019-03-26  1:39                 ` jffs2: " Al Viro
2019-03-26  1:40                 ` ubifs: " Al Viro
2019-03-26  1:43                 ` debugfs: " Al Viro
2019-03-26 10:41                 ` ceph: " Jeff Layton
2019-03-26 11:38                 ` Ilya Dryomov
2019-03-26  1:45               ` KASAN: use-after-free Read in path_lookupat Al Viro
2019-04-10 18:11                 ` Al Viro
2019-04-10 19:44                   ` Linus Torvalds
2019-03-25 19:43       ` Al Viro
2019-03-25 22:48         ` Dave Chinner
2019-03-25 23:02           ` Al Viro
     [not found]             ` <CAGe7X7mb=gK7zhSwmT_6mmmkcbjhZAOb=wj31BdUcHkNUPsm2Q@mail.gmail.com>
2019-03-26  4:15               ` Al Viro
2019-03-27 16:58                 ` Jan Kara
2019-03-27 18:59                   ` Al Viro
2019-03-28  9:00                     ` Jan Kara
2019-03-27 17:22             ` Jan Kara [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190327172254.GC6742@quack2.suse.cz \
    --to=jack@suse.cz \
    --cc=ast@kernel.org \
    --cc=daniel@iogearbox.net \
    --cc=david@fromorbit.com \
    --cc=jaegeuk@kernel.org \
    --cc=jlbec@evilplan.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mark@fasheh.com \
    --cc=syzbot+7a8ba368b47fdefca61e@syzkaller.appspotmail.com \
    --cc=syzkaller-bugs@googlegroups.com \
    --cc=torvalds@linux-foundation.org \
    --cc=viro@zeniv.linux.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).