From: Al Viro <viro@ZenIV.linux.org.uk>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Miklos Szeredi <miklos@szeredi.hu>,
Dave Chinner <david@fromorbit.com>,
Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
linux-fsdevel <linux-fsdevel@vger.kernel.org>
Subject: Re: dcache shrink list corruption?
Date: Wed, 30 Apr 2014 21:38:23 +0100 [thread overview]
Message-ID: <20140430203823.GT18016@ZenIV.linux.org.uk> (raw)
In-Reply-To: <CA+55aFxe4HKHiWbarfwtww9=WUgfAmk+Cf3G7tdBB19uMdjbEg@mail.gmail.com>
On Wed, Apr 30, 2014 at 01:23:26PM -0700, Linus Torvalds wrote:
> On Wed, Apr 30, 2014 at 12:59 PM, Al Viro <viro@zeniv.linux.org.uk> wrote:
> >
> > Another thing: I don't like what's going on with freeing vs. ->d_lock there.
> > Had that been a mutex, we'd definitely get a repeat of "vfs: fix subtle
> > use-after-free of pipe_inode_info". The question is, can spin_unlock(p)
> > dereference p after another CPU gets through spin_lock(p)? Linus?
>
> spin_unlock() *should* be safe wrt that issue.
>
> But I have to say, I think paravirtualized spinlocks may break that.
> They do all kinds of "kick waiters" after releasing the lock.
>
> Doesn't the RCU protection solve that, though? Nobody should be
> releasing the dentry under us, afaik..
We do not (and cannot) call dentry_kill() with rcu_read_lock held - it can
trigger any amount of IO, for one thing. We can take it around the
couple of places where do that spin_unlock(&dentry->d_lock) (along with
setting DCACHE_RCUACCESS) - that's what I'd been refering to. Then this
sucker (tests still running, so far everything seems to survive) becomes
the following (again, on top of 1/6..4/6). BTW, is there any convenient
way to tell git commit --amend to update the commit date? Something
like --date=now would be nice, but it isn't accepted...
commit 797ff22681dc969b478ed837787d24dfd2dd2132
Author: Al Viro <viro@zeniv.linux.org.uk>
Date: Tue Apr 29 23:52:05 2014 -0400
dentry_kill(): don't try to remove from shrink list
If the victim in on the shrink list, don't remove it from there.
If shrink_dentry_list() manages to remove it from the list before
we are done - fine, we'll just free it as usual. If not - mark
it with new flag (DCACHE_MAY_FREE) and leave it there.
Eventually, shrink_dentry_list() will get to it, remove the sucker
from shrink list and call dentry_kill(dentry, 0). Which is where
we'll deal with freeing.
Since now dentry_kill(dentry, 0) may happen after or during
dentry_kill(dentry, 1), we need to recognize that (by seeing
DCACHE_DENTRY_KILLED already set), unlock everything
and either free the sucker (in case DCACHE_MAY_FREE has been
set) or leave it for ongoing dentry_kill(dentry, 1) to deal with.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
diff --git a/fs/dcache.c b/fs/dcache.c
index e482775..fa40d26 100644
--- a/fs/dcache.c
+++ b/fs/dcache.c
@@ -489,6 +489,20 @@ relock:
goto relock;
}
+ if (unlikely(dentry->d_flags & DCACHE_DENTRY_KILLED)) {
+ if (parent)
+ spin_unlock(&parent->d_lock);
+ if (dentry->d_flags & DCACHE_MAY_FREE) {
+ spin_unlock(&dentry->d_lock);
+ dentry_free(dentry);
+ } else {
+ dentry->d_flags |= DCACHE_RCUACCESS;
+ rcu_read_lock();
+ spin_unlock(&dentry->d_lock);
+ rcu_read_unlock();
+ }
+ return parent;
+ }
/*
* The dentry is now unrecoverably dead to the world.
*/
@@ -504,8 +518,6 @@ relock:
if (dentry->d_flags & DCACHE_LRU_LIST) {
if (!(dentry->d_flags & DCACHE_SHRINK_LIST))
d_lru_del(dentry);
- else
- d_shrink_del(dentry);
}
/* if it was on the hash then remove it */
__d_drop(dentry);
@@ -527,7 +539,16 @@ relock:
if (dentry->d_op && dentry->d_op->d_release)
dentry->d_op->d_release(dentry);
- dentry_free(dentry);
+ spin_lock(&dentry->d_lock);
+ if (dentry->d_flags & DCACHE_SHRINK_LIST) {
+ dentry->d_flags |= DCACHE_MAY_FREE | DCACHE_RCUACCESS;
+ rcu_read_lock();
+ spin_unlock(&dentry->d_lock);
+ rcu_read_unlock();
+ } else {
+ spin_unlock(&dentry->d_lock);
+ dentry_free(dentry);
+ }
return parent;
}
@@ -829,7 +850,7 @@ static void shrink_dentry_list(struct list_head *list)
* We found an inuse dentry which was not removed from
* the LRU because of laziness during lookup. Do not free it.
*/
- if (dentry->d_lockref.count) {
+ if (dentry->d_lockref.count > 0) {
spin_unlock(&dentry->d_lock);
continue;
}
diff --git a/include/linux/dcache.h b/include/linux/dcache.h
index 3b9bfdb..3c7ec32 100644
--- a/include/linux/dcache.h
+++ b/include/linux/dcache.h
@@ -221,6 +221,8 @@ struct dentry_operations {
#define DCACHE_SYMLINK_TYPE 0x00300000 /* Symlink */
#define DCACHE_FILE_TYPE 0x00400000 /* Other file type */
+#define DCACHE_MAY_FREE 0x00800000
+
extern seqlock_t rename_lock;
static inline int dname_external(const struct dentry *dentry)
next prev parent reply other threads:[~2014-04-30 20:38 UTC|newest]
Thread overview: 61+ messages / expand[flat|nested] mbox.gz Atom feed top
2014-04-29 16:01 dcache shrink list corruption? Miklos Szeredi
2014-04-29 17:43 ` Linus Torvalds
2014-04-29 18:03 ` Miklos Szeredi
2014-04-29 18:16 ` Al Viro
2014-04-29 19:10 ` Al Viro
2014-04-29 21:18 ` Dave Chinner
2014-04-29 21:48 ` Al Viro
2014-04-29 23:04 ` Linus Torvalds
2014-04-29 23:20 ` Al Viro
2014-04-30 2:31 ` Al Viro
2014-04-30 2:56 ` Linus Torvalds
2014-04-30 4:04 ` Al Viro
2014-04-30 15:49 ` Miklos Szeredi
2014-04-30 15:56 ` Miklos Szeredi
2014-04-30 16:03 ` Al Viro
2014-04-30 17:33 ` Miklos Szeredi
2014-04-30 18:36 ` Al Viro
2014-04-30 18:42 ` Miklos Szeredi
2014-04-30 19:02 ` Al Viro
2014-04-30 19:59 ` Al Viro
2014-04-30 20:23 ` Linus Torvalds
2014-04-30 20:38 ` Al Viro [this message]
2014-04-30 20:57 ` Linus Torvalds
2014-04-30 21:12 ` Al Viro
2014-04-30 22:12 ` Al Viro
2014-04-30 23:04 ` Linus Torvalds
2014-04-30 23:14 ` Linus Torvalds
2014-04-30 23:43 ` Al Viro
2014-05-01 0:18 ` Linus Torvalds
2014-05-01 2:51 ` Al Viro
2014-05-01 2:59 ` Linus Torvalds
2014-05-01 3:12 ` Al Viro
2014-05-01 9:42 ` Miklos Szeredi
2014-05-01 14:34 ` Al Viro
2014-05-01 21:02 ` Al Viro
2014-05-01 21:05 ` Al Viro
2014-05-01 22:52 ` Linus Torvalds
2014-05-02 8:43 ` Szeredi Miklos
2014-05-02 21:04 ` Linus Torvalds
2014-04-30 23:38 ` Al Viro
2014-04-30 9:15 ` Miklos Szeredi
2014-05-02 5:51 ` Al Viro
2014-05-02 9:00 ` Szeredi Miklos
2014-05-02 21:02 ` Miklos Szeredi
2014-05-02 21:08 ` Miklos Szeredi
2014-05-02 21:18 ` Linus Torvalds
2014-05-02 22:40 ` Al Viro
2014-05-02 23:06 ` Al Viro
2014-05-03 4:26 ` Al Viro
2014-05-03 18:07 ` Linus Torvalds
2014-05-03 18:25 ` Al Viro
2014-05-03 18:21 ` Al Viro
2014-05-04 6:29 ` Al Viro
2014-05-06 10:17 ` Miklos Szeredi
2014-05-06 14:53 ` Linus Torvalds
2014-05-06 16:52 ` Al Viro
2014-05-06 17:01 ` Linus Torvalds
2014-05-06 19:15 ` Al Viro
2014-05-02 22:32 ` Al Viro
2014-04-29 18:17 ` Linus Torvalds
2014-04-29 17:56 ` Al Viro
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20140430203823.GT18016@ZenIV.linux.org.uk \
--to=viro@zeniv.linux.org.uk \
--cc=david@fromorbit.com \
--cc=linux-fsdevel@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=miklos@szeredi.hu \
--cc=torvalds@linux-foundation.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.