linux-kernel.vger.kernel.org archive mirror
From: Al Viro <viro@zeniv.linux.org.uk>
To: Jonathan Corbet <corbet@lwn.net>
Cc: "Tobin C. Harding" <tobin@kernel.org>,
	Mauro Carvalho Chehab <mchehab@s-opensource.com>,
	Neil Brown <neilb@suse.com>, Randy Dunlap <rdunlap@infradead.org>,
	linux-doc@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	linux-kernel@vger.kernel.org, Ian Kent <raven@themaw.net>
Subject: Re: [PATCH v3 00/24] Convert vfs.txt to vfs.rst
Date: Tue, 2 Apr 2019 18:54:01 +0100
Message-ID: <20190402175401.GL2217@ZenIV.linux.org.uk>
In-Reply-To: <20190402164824.GK2217@ZenIV.linux.org.uk>

On Tue, Apr 02, 2019 at 05:48:24PM +0100, Al Viro wrote:

> ->d_prune() must not drop/regain any of the locks held by caller.
> It must _not_ free anything attached to dentry - that belongs
> later in the shutdown sequence.  If anything, I'm tempted to
> make it take const struct dentry * as argument, just to make
> that clear.
> 
> No new (counted) references can be acquired by that point;
> lockless dcache lookup might find our dentry a match, but
> result of such lookup is not going to be legitimized - it's
> doomed to be thrown out as stale.
> 
> It really makes more sense as part of struct dentry lifecycle
> description...  
> 
> [1] in theory, ->d_time might be changed by overlapping lockless
> call of ->d_revalidate().  Up to filesystem - VFS doesn't touch
> that field (and AFAICS only NFS uses it these days).

One addition: ->d_prune() can overlap only with
	* lockless ->d_hash()/->d_compare()
	* lockless ->d_revalidate()
	* lockless ->d_manage()
So it must not destroy anything used by those without an RCU
delay.  The same goes for ->d_release() - both the list of
the things it can overlap with and requirements re RCU delays.

In-tree ->d_prune() instances are fine and so are the majority
of ->d_release() instances.  However, autofs ->d_release() has
something that looks like an RCU use-after-free waiting to happen:
static void autofs_dentry_release(struct dentry *de)
{
        struct autofs_info *ino = autofs_dentry_ino(de);
        struct autofs_sb_info *sbi = autofs_sbi(de->d_sb);

        pr_debug("releasing %p\n", de);

        if (!ino)
                return;
...
        autofs_free_ino(ino);
}
with autofs_free_ino() being straight kfree().  Which means
that the lockless case of autofs_d_manage() can run into
autofs_dentry_ino(dentry) getting freed right under it.

And there we do have this reachable:
int autofs_expire_wait(const struct path *path, int rcu_walk)
{
        struct dentry *dentry = path->dentry;
        struct autofs_sb_info *sbi = autofs_sbi(dentry->d_sb);
        struct autofs_info *ino = autofs_dentry_ino(dentry);
        int status;
        int state;

        /* Block on any pending expire */
        if (!(ino->flags & AUTOFS_INF_WANT_EXPIRE))
                return 0;
        if (rcu_walk)
                return -ECHILD;

the second check bails out with -ECHILD in lockless mode, but the
first one is reached in lockless mode as well and it dereferences
ino, so AFAICS we do have a problem there.  Smells like we ought to
make that kfree
in autofs_free_ino() RCU-delayed...  Ian, have you, by any
chance, run into reports like that?  Use-after-free or
oopsen in autofs_expire_wait() and friends, that is...
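
To make the idea of an RCU-delayed free concrete, here is a minimal
single-threaded userspace model; it is a sketch, not kernel code and
not the actual autofs fix.  All model_* names are hypothetical
stand-ins, and the explicit "grace period end" call replaces what real
RCU does for you.  In the kernel the usual tool for this would
presumably be kfree_rcu() with a struct rcu_head embedded in the
object, but that is an assumption about the shape of a fix, not
something stated above.

```c
#include <stdlib.h>

#define MODEL_WANT_EXPIRE 0x1	/* hypothetical stand-in for AUTOFS_INF_WANT_EXPIRE */

/* Hypothetical stand-in for struct autofs_info; only what the model needs. */
struct model_ino {
	int flags;
	struct model_ino *next_deferred;	/* plays the role of an embedded rcu_head */
};

/* Objects released by ->d_release() but not yet safe to free. */
static struct model_ino *deferred_list;

/* Deferred free: queue the object instead of freeing it on the spot. */
static void model_free_deferred(struct model_ino *ino)
{
	ino->next_deferred = deferred_list;
	deferred_list = ino;
}

/*
 * Model of the grace period ending: every lockless reader that could
 * still hold a pointer has finished, so the queued objects can go.
 * Returns the number of objects freed.
 */
static int model_grace_period_end(void)
{
	int n = 0;

	while (deferred_list) {
		struct model_ino *ino = deferred_list;

		deferred_list = ino->next_deferred;
		free(ino);
		n++;
	}
	return n;
}

/* Lockless reader, mirroring the first check in autofs_expire_wait(). */
static int model_wants_expire(const struct model_ino *ino)
{
	return (ino->flags & MODEL_WANT_EXPIRE) != 0;
}
```

The point of the model: between model_free_deferred() and
model_grace_period_end() a reader that picked up the pointer earlier
can still call model_wants_expire() safely, which is exactly the
window a straight kfree() does not provide.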

AFAICS, everything else is safe; however, looking through
those has turned up a fishy spot in ceph_d_revalidate():
        } else if (d_really_is_positive(dentry) &&
                   ceph_snap(d_inode(dentry)) == CEPH_SNAPDIR) {
                valid = 1;
Again, lockless ->d_revalidate() is called without anything
that would hold ->d_inode stable; d_really_is_positive() and
d_inode() each fetch ->d_inode on their own, so the first part
of the condition does not guarantee that we won't run into
ceph_snap(NULL) and oops.  Sure, the compiler is almost certainly
not going to reload here, but we ought to use d_inode_rcu(dentry)
there.
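
The load-once pattern being asked for can be sketched in userspace as
follows.  This is a model under stated assumptions, not the ceph code:
the model_* types and MODEL_SNAPDIR are hypothetical stand-ins, and
the volatile cast merely imitates what a single READ_ONCE()-style
fetch (as done by d_inode_rcu()) gives you.

```c
#include <stddef.h>

#define MODEL_SNAPDIR 2	/* hypothetical stand-in for CEPH_SNAPDIR */

/* Hypothetical stand-ins for struct inode and struct dentry. */
struct model_inode { int snap; };
struct model_dentry { struct model_inode *d_inode; };

/* Fetch the pointer exactly once; the compiler may not re-read it. */
static struct model_inode *model_inode_once(const struct model_dentry *dentry)
{
	return *(struct model_inode * const volatile *)&dentry->d_inode;
}

/*
 * NULL test and dereference both use the same local copy, so a
 * concurrent change of ->d_inode cannot slip in between them.
 */
static int model_is_snapdir(const struct model_dentry *dentry)
{
	struct model_inode *inode = model_inode_once(dentry);

	return inode && inode->snap == MODEL_SNAPDIR;
}
```

With this shape a negative dentry simply yields 0 instead of handing
NULL to the snap check, no matter when ->d_inode changes relative to
the test.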

Sigh...

