From: Al Viro <viro@ZenIV.linux.org.uk>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: linux-fsdevel@vger.kernel.org, Christoph Hellwig <hch@lst.de>,
Thomas Gleixner <tglx@linutronix.de>,
Peter Zijlstra <peterz@infradead.org>,
Sebastian Andrzej Siewior <bigeasy@linutronix.de>,
linux-kernel@vger.kernel.org,
John Ogness <john.ogness@linutronix.de>
Subject: [BUG] lock_parent() breakage when used from shrink_dentry_list() (was Re: [PATCH v2 6/6] fs/dcache: Avoid remaining try_lock loop in shrink_dentry_list())
Date: Fri, 23 Feb 2018 20:13:17 +0000
Message-ID: <20180223201317.GG30522@ZenIV.linux.org.uk>
In-Reply-To: <20180223174216.GD30522@ZenIV.linux.org.uk>
On Fri, Feb 23, 2018 at 05:42:16PM +0000, Al Viro wrote:
> 4) the nasty one - shrink_dentry_list() evictions of zero-count dentries.
> _That_ calls for careful use of RCU, etc. - none of the others need that. Need
> to think how to deal with that sucker; in any case, I do not believe that sharing
> said RCU use, etc. with any other cases would do anything other than obfuscating
> the rest.
Arrrgh... Actually, there's a nasty corner case where the variant in mainline is
broken. Look:
	dentry placed on a shrink list
	we pick the fucker from the list and lock it.
	we call lock_parent() on it.
	dentry is not a root and it's not deleted, so we proceed.
	trylock fails.
	we grab rcu_read_lock()
	we drop dentry->d_lock
		on another CPU, something does e.g. d_prune_aliases() (or finds the
		sucker in hash and proceeds to unhash and dput(), etc.) - anything
		that evicts that dentry. It is marked with DCACHE_MAY_FREE and left
		alone. The parent, OTOH, is dropped and freeing gets scheduled as
		soon as RCU allows.
	we grab parent->d_lock
	we verify that dentry->d_parent is still the same (it is)
	we do rcu_read_unlock()
	we grab dentry->d_lock
	we return parent
At that point we are fucked - there's nothing to prevent parent from being
freed at any point. And we assume that its ->d_lock is held and needs to
be dropped.
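To make the interleaving above concrete, here is a single-threaded userspace replay of it. This is a sketch, not kernel code: toy_dentry, toy_lock(), toy_unlock(), toy_evict() and toy_lock_parent_old() are hypothetical names invented for illustration, the lock is a plain flag, and RCU is reduced to comments. It starts in the slow path (i.e. after the trylock has already failed) and only shows the ordering problem.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for struct dentry; every field is a simplification. */
struct toy_dentry {
	struct toy_dentry *parent;	/* models ->d_parent */
	int count;			/* models ->d_lockref.count; < 0 means killed */
	bool locked;			/* models ->d_lock in a single-threaded replay */
	bool free_scheduled;		/* set once the object is handed to RCU for freeing */
};

static void toy_lock(struct toy_dentry *d)
{
	assert(!d->locked);
	d->locked = true;
}

static void toy_unlock(struct toy_dentry *d)
{
	assert(d->locked);
	d->locked = false;
}

/* What the other CPU does in the window: kill the child, then drop the parent. */
static void toy_evict(struct toy_dentry *child)
{
	struct toy_dentry *parent = child->parent;

	toy_lock(child);
	child->count = -1;		/* the child is marked dead ... */
	toy_unlock(child);		/* ... but left allocated: it is on a shrink list */
	parent->count = -1;
	parent->free_scheduled = true;	/* parent goes away as soon as RCU allows */
}

/* Slow path of the pre-patch ordering, entered with child locked. */
static struct toy_dentry *toy_lock_parent_old(struct toy_dentry *child, bool race)
{
	struct toy_dentry *parent;

	/* rcu_read_lock() would be taken here */
	toy_unlock(child);
	if (race)
		toy_evict(child);	/* the other CPU wins the race here */
	parent = child->parent;
	toy_lock(parent);
	/* ->d_parent is unchanged, so the "goto again" recheck passes */
	/* rcu_read_unlock() happens here: nothing pins parent any more */
	toy_lock(child);
	return parent;			/* no check that child is still alive */
}
```

With a live parent and a child that starts out locked, toy_lock_parent_old(&child, true) hands back the parent with free_scheduled already set, which is the state the caller then tries to unlock.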
The call site in d_prune_aliases() avoids the same scenario, since there we
are already holding ->i_lock and another thread won't get to __dentry_kill()
until we are done with lock_parent().
Unless I'm missing something, that's a (narrow) memory corruptor. The window is
narrow, but not impossibly so - if that other thread had been spinning on an attempt
to grab dentry->d_lock in d_prune_aliases(), it has to squeeze through
	if (!dentry->d_lockref.count) {
and then in lock_parent() called there - through
	if (IS_ROOT(dentry))
		return NULL;
	if (unlikely(dentry->d_lockref.count < 0))
		return NULL;
	if (likely(spin_trylock(&parent->d_lock)))
before the first CPU gets through
	parent = READ_ONCE(dentry->d_parent);
	spin_lock(&parent->d_lock);
The first CPU can't be preempted, but there's nothing to prevent an IRQ arriving
at that point, letting the second one win the race.
Comments?
I think the (untested) patch below is -stable fodder:
lock_parent() needs to recheck if dentry got __dentry_kill'ed under it
In case when dentry passed to lock_parent() is protected from freeing only
by the fact that it's on a shrink list and trylock of parent fails, we
could get hit by __dentry_kill() (and subsequent dentry_kill(parent))
between unlocking dentry and locking presumed parent. We need to recheck
that dentry is alive once we lock both it and parent *and* postpone
rcu_read_unlock() until after that point. Otherwise we could return
a pointer to struct dentry that already is rcu-scheduled for freeing, with
->d_lock held on it; caller's subsequent attempt to unlock it can end
up with memory corruption.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
diff --git a/fs/dcache.c b/fs/dcache.c
index 7c38f39958bc..32aaab21e648 100644
--- a/fs/dcache.c
+++ b/fs/dcache.c
@@ -647,11 +647,16 @@ static inline struct dentry *lock_parent(struct dentry *dentry)
 		spin_unlock(&parent->d_lock);
 		goto again;
 	}
-	rcu_read_unlock();
-	if (parent != dentry)
+	if (parent != dentry) {
 		spin_lock_nested(&dentry->d_lock, DENTRY_D_LOCK_NESTED);
-	else
+		if (unlikely(dentry->d_lockref.count < 0)) {
+			spin_unlock(&parent->d_lock);
+			parent = NULL;
+		}
+	} else {
 		parent = NULL;
+	}
+	rcu_read_unlock();
 	return parent;
 }
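
For illustration only, the reordered tail of lock_parent() can be replayed in a single-threaded userspace model. Nothing below is kernel code: mini_dentry, mini_lock(), mini_unlock(), mini_evict() and mini_lock_parent_fixed() are hypothetical names, the lock is a plain flag, RCU appears only as comments, and the parent == dentry case is left out. The sketch assumes the slow path has already been entered with the child locked.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for struct dentry; every field is a simplification. */
struct mini_dentry {
	struct mini_dentry *parent;	/* models ->d_parent */
	int count;			/* models ->d_lockref.count; < 0 means killed */
	bool locked;			/* models ->d_lock */
	bool free_scheduled;		/* handed to RCU for freeing */
};

static void mini_lock(struct mini_dentry *d)
{
	assert(!d->locked);
	d->locked = true;
}

static void mini_unlock(struct mini_dentry *d)
{
	assert(d->locked);
	d->locked = false;
}

/* The other CPU: kill the child, then drop the parent. */
static void mini_evict(struct mini_dentry *child)
{
	struct mini_dentry *parent = child->parent;

	mini_lock(child);
	child->count = -1;
	mini_unlock(child);
	parent->count = -1;
	parent->free_scheduled = true;
}

/* Slow path with the recheck added and the RCU section extended. */
static struct mini_dentry *mini_lock_parent_fixed(struct mini_dentry *child, bool race)
{
	struct mini_dentry *parent;

	/* rcu_read_lock() would be taken here */
	mini_unlock(child);
	if (race)
		mini_evict(child);	/* the other CPU gets in here */
	parent = child->parent;
	mini_lock(parent);		/* still safe: we are inside the RCU section */
	mini_lock(child);
	if (child->count < 0) {
		/* child was killed in the window: do not hand parent back */
		mini_unlock(parent);
		parent = NULL;
	}
	/* rcu_read_unlock() only after the recheck */
	return parent;
}
```

In this model the raced call returns NULL with only the child locked, while the unraced call returns the parent with both locks held, which is what callers of lock_parent() expect.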