All of lore.kernel.org
 help / color / mirror / Atom feed
From: Al Viro <viro@zeniv.linux.org.uk>
To: Jonathan Corbet <corbet@lwn.net>
Cc: "Tobin C. Harding" <tobin@kernel.org>,
	Mauro Carvalho Chehab <mchehab@s-opensource.com>,
	Neil Brown <neilb@suse.com>, Randy Dunlap <rdunlap@infradead.org>,
	linux-doc@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	linux-kernel@vger.kernel.org, Ian Kent <raven@themaw.net>
Subject: Re: [PATCH v3 00/24] Convert vfs.txt to vfs.rst
Date: Tue, 2 Apr 2019 18:54:01 +0100	[thread overview]
Message-ID: <20190402175401.GL2217@ZenIV.linux.org.uk> (raw)
In-Reply-To: <20190402164824.GK2217@ZenIV.linux.org.uk>

On Tue, Apr 02, 2019 at 05:48:24PM +0100, Al Viro wrote:

> ->d_prune() must not drop/regain any of the locks held by caller.
> It must _not_ free anything attached to dentry - that belongs
> later in the shutdown sequence.  If anything, I'm tempted to
> make it take const struct dentry * as argument, just to make
> that clear.
> 
> No new (counted) references can be acquired by that point;
> lockless dcache lookup might find our dentry a match, but
> result of such lookup is not going to be legitimized - it's
> doomed to be thrown out as stale.
> 
> It really makes more sense as part of struct dentry lifecycle
> description...  
> 
> [1] in theory, ->d_time might be changed by overlapping lockless
> call of ->d_revalidate().  Up to filesystem - VFS doesn't touch
> that field (and AFAICS only NFS uses it these days).

One addition: ->d_prune() can overlap only with
	* lockless ->d_hash()/->d_compare()
	* lockless ->d_revalidate()
	* lockless ->d_manage()
So it must not destroy anything used by those without an RCU
delay.  The same goes for ->d_release() - both the list of
the things it can overlap with and requirements re RCU delays.

In-tree ->d_prune() instances are fine and so's the majority
of ->d_release().  However, autofs ->d_release() has something
that looks like an RCU use-after-free waiting to happen:
static void autofs_dentry_release(struct dentry *de)
{
        struct autofs_info *ino = autofs_dentry_ino(de);
        struct autofs_sb_info *sbi = autofs_sbi(de->d_sb);

        pr_debug("releasing %p\n", de);

        if (!ino)
                return;
...
        autofs_free_ino(ino);
}
with autofs_free_ino() being straight kfree().  Which means
that the lockless case of autofs_d_manage() can run into
autofs_dentry_ino(dentry) getting freed right under it.

And there we do have this reachable:
int autofs_expire_wait(const struct path *path, int rcu_walk)
{
        struct dentry *dentry = path->dentry;
        struct autofs_sb_info *sbi = autofs_sbi(dentry->d_sb);
        struct autofs_info *ino = autofs_dentry_ino(dentry);
        int status;
        int state;

        /* Block on any pending expire */
        if (!(ino->flags & AUTOFS_INF_WANT_EXPIRE))
                return 0;
        if (rcu_walk)
                return -ECHILD;

the second check buggers off in lockless mode; the first one
can be done in lockless mode just fine, so AFAICS we do have
a problem there.  Smells like we ought to make that kfree
in autofs_free_ino() RCU-delayed...  Ian, have you, by any
chance, run into reports like that?  Use-after-free or
oopsen in autofs_expire_wait() and friends, that is...

AFAICS, everything else is safe; however, looking through
those has turned up a fishy spot in ceph_d_revalidate():
        } else if (d_really_is_positive(dentry) &&
                   ceph_snap(d_inode(dentry)) == CEPH_SNAPDIR) {
                valid = 1;
Again, lockless ->d_revalidate() is called without anything
that would hold ->d_inode stable; the first part of the
condition does not guarantee that we won't run into
ceph_snap(NULL) and oops.  Sure, compiler is almost certainly
not going to reload here, but we ought to use d_inode_rcu(dentry)
there.

Sigh...

  reply	other threads:[~2019-04-02 17:54 UTC|newest]

Thread overview: 43+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-03-27  5:16 [PATCH v3 00/24] Convert vfs.txt to vfs.rst Tobin C. Harding
2019-03-27  5:16 ` [PATCH v3 01/24] vfs: Remove trailing whitespace Tobin C. Harding
2019-03-27  5:16 ` [PATCH v3 02/24] vfs: Clean up VFS data structure declarations Tobin C. Harding
2019-03-27  5:16 ` [PATCH v3 03/24] fs: Update function docstring for dio_complete() Tobin C. Harding
2019-03-27  5:16 ` [PATCH v3 04/24] fs: Add docstrings to exported functions Tobin C. Harding
2019-03-27  5:16 ` [PATCH v3 05/24] fs: Guard unusual text with backticks Tobin C. Harding
2019-03-27  5:16 ` [PATCH v3 06/24] fs: Update function docstring for simple_write_end() Tobin C. Harding
2019-03-27  5:17 ` [PATCH v3 07/24] fs: Fix function docstring for posix_acl_update_mode() Tobin C. Harding
2019-03-27  5:17 ` [PATCH v3 08/24] dcache: Remove trailing whitespace Tobin C. Harding
2019-03-27  5:17 ` [PATCH v3 09/24] dcache: Fix i.e. usage in coments Tobin C. Harding
2019-03-27  5:17 ` [PATCH v3 10/24] dcache: Fix e.g. usage in comment Tobin C. Harding
2019-03-27  5:17 ` [PATCH v3 11/24] dcache: Fix docstring comment for d_drop() Tobin C. Harding
2019-03-27  5:17 ` [PATCH v3 12/24] dcache: Fix non-docstring comments Tobin C. Harding
2019-03-27  5:17 ` [PATCH v3 13/24] dcache: Clean up function docstrings Tobin C. Harding
2019-03-27  5:17 ` [PATCH v3 14/24] dcache: Clean up function docstring members Tobin C. Harding
2019-03-27  5:17 ` [PATCH v3 15/24] docs: filesystems: vfs: Remove space before tab Tobin C. Harding
2019-03-27  5:17 ` [PATCH v3 16/24] docs: filesystems: vfs: Use uniform space after period Tobin C. Harding
2019-03-27  5:17 ` [PATCH v3 17/24] docs: filesystems: vfs: Use 72 character column width Tobin C. Harding
2019-03-27  5:17 ` [PATCH v3 18/24] docs: filesystems: vfs: Use uniform spacing around headings Tobin C. Harding
2019-03-27  5:17 ` [PATCH v3 19/24] docs: filesystems: vfs: Use correct initial heading Tobin C. Harding
2019-03-27  5:17 ` [PATCH v3 20/24] docs: filesystems: vfs: Use SPDX identifier Tobin C. Harding
2019-04-01  5:43   ` Mukesh Ojha
2019-03-27  5:17 ` [PATCH v3 21/24] docs: filesystems: vfs: Fix pre-amble indentation Tobin C. Harding
2019-03-27  5:17 ` [PATCH v3 22/24] fs: Copy documentation to struct declarations Tobin C. Harding
2019-03-27  5:17 ` [PATCH v3 23/24] dcache: Copy documentation to struct declaration Tobin C. Harding
2019-03-27  5:17 ` [PATCH v3 24/24] docs: Convert vfs.txt to reStructuredText format Tobin C. Harding
2019-03-27  5:24 ` [PATCH v3 00/24] Convert vfs.txt to vfs.rst Joe Perches
2019-03-27  6:26   ` Tobin C. Harding
2019-04-02 15:49 ` Jonathan Corbet
2019-04-02 16:48   ` Al Viro
2019-04-02 17:54     ` Al Viro [this message]
2019-04-02 19:08       ` Al Viro
2019-04-02 23:36         ` Ian Kent
2019-04-02 23:56         ` Ian Kent
2019-04-03  0:55           ` NeilBrown
2019-04-03 19:35             ` Al Viro
2019-04-04  6:30               ` Ian Kent
2019-04-03 23:28             ` Ian Kent
2019-04-02 19:25     ` Tobin C. Harding
2019-04-03 19:47       ` Al Viro
2019-04-03 20:59         ` Tobin C. Harding
2019-04-03  1:00     ` NeilBrown
2019-04-03  1:44       ` Al Viro

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190402175401.GL2217@ZenIV.linux.org.uk \
    --to=viro@zeniv.linux.org.uk \
    --cc=corbet@lwn.net \
    --cc=linux-doc@vger.kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mchehab@s-opensource.com \
    --cc=neilb@suse.com \
    --cc=raven@themaw.net \
    --cc=rdunlap@infradead.org \
    --cc=tobin@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.