All of lore.kernel.org
 help / color / mirror / Atom feed
From: Al Viro <viro@ZenIV.linux.org.uk>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Miklos Szeredi <miklos@szeredi.hu>,
	Dave Chinner <david@fromorbit.com>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>
Subject: Re: dcache shrink list corruption?
Date: Wed, 30 Apr 2014 21:38:23 +0100	[thread overview]
Message-ID: <20140430203823.GT18016@ZenIV.linux.org.uk> (raw)
In-Reply-To: <CA+55aFxe4HKHiWbarfwtww9=WUgfAmk+Cf3G7tdBB19uMdjbEg@mail.gmail.com>

On Wed, Apr 30, 2014 at 01:23:26PM -0700, Linus Torvalds wrote:
> On Wed, Apr 30, 2014 at 12:59 PM, Al Viro <viro@zeniv.linux.org.uk> wrote:
> >
> > Another thing: I don't like what's going on with freeing vs. ->d_lock there.
> > Had that been a mutex, we'd definitely get a repeat of "vfs: fix subtle
> > use-after-free of pipe_inode_info".  The question is, can spin_unlock(p)
> > dereference p after another CPU gets through spin_lock(p)?  Linus?
> 
> spin_unlock() *should* be safe wrt that issue.
> 
> But I have to say, I think paravirtualized spinlocks may break that.
> They do all kinds of "kick waiters" after releasing the lock.
> 
> Doesn't the RCU protection solve that, though? Nobody should be
> releasing the dentry under us, afaik..

We do not (and cannot) call dentry_kill() with rcu_read_lock held - it can
trigger any amount of IO, for one thing.  We can take it around the
couple of places where do that spin_unlock(&dentry->d_lock) (along with
setting DCACHE_RCUACCESS) - that's what I'd been refering to.  Then this
sucker (tests still running, so far everything seems to survive) becomes
the following (again, on top of 1/6..4/6).  BTW, is there any convenient
way to tell git commit --amend to update the commit date?  Something
like --date=now would be nice, but it isn't accepted...

commit 797ff22681dc969b478ed837787d24dfd2dd2132
Author: Al Viro <viro@zeniv.linux.org.uk>
Date:   Tue Apr 29 23:52:05 2014 -0400

    dentry_kill(): don't try to remove from shrink list
    
    If the victim in on the shrink list, don't remove it from there.
    If shrink_dentry_list() manages to remove it from the list before
    we are done - fine, we'll just free it as usual.  If not - mark
    it with new flag (DCACHE_MAY_FREE) and leave it there.
    
    Eventually, shrink_dentry_list() will get to it, remove the sucker
    from shrink list and call dentry_kill(dentry, 0).  Which is where
    we'll deal with freeing.
    
    Since now dentry_kill(dentry, 0) may happen after or during
    dentry_kill(dentry, 1), we need to recognize that (by seeing
    DCACHE_DENTRY_KILLED already set), unlock everything
    and either free the sucker (in case DCACHE_MAY_FREE has been
    set) or leave it for ongoing dentry_kill(dentry, 1) to deal with.
    
    Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

diff --git a/fs/dcache.c b/fs/dcache.c
index e482775..fa40d26 100644
--- a/fs/dcache.c
+++ b/fs/dcache.c
@@ -489,6 +489,20 @@ relock:
 		goto relock;
 	}
 
+	if (unlikely(dentry->d_flags & DCACHE_DENTRY_KILLED)) {
+		if (parent)
+			spin_unlock(&parent->d_lock);
+		if (dentry->d_flags & DCACHE_MAY_FREE) {
+			spin_unlock(&dentry->d_lock);
+			dentry_free(dentry);
+		} else {
+			dentry->d_flags |= DCACHE_RCUACCESS;
+			rcu_read_lock();
+			spin_unlock(&dentry->d_lock);
+			rcu_read_unlock();
+		}
+		return parent;
+	}
 	/*
 	 * The dentry is now unrecoverably dead to the world.
 	 */
@@ -504,8 +518,6 @@ relock:
 	if (dentry->d_flags & DCACHE_LRU_LIST) {
 		if (!(dentry->d_flags & DCACHE_SHRINK_LIST))
 			d_lru_del(dentry);
-		else
-			d_shrink_del(dentry);
 	}
 	/* if it was on the hash then remove it */
 	__d_drop(dentry);
@@ -527,7 +539,16 @@ relock:
 	if (dentry->d_op && dentry->d_op->d_release)
 		dentry->d_op->d_release(dentry);
 
-	dentry_free(dentry);
+	spin_lock(&dentry->d_lock);
+	if (dentry->d_flags & DCACHE_SHRINK_LIST) {
+		dentry->d_flags |= DCACHE_MAY_FREE | DCACHE_RCUACCESS;
+		rcu_read_lock();
+		spin_unlock(&dentry->d_lock);
+		rcu_read_unlock();
+	} else {
+		spin_unlock(&dentry->d_lock);
+		dentry_free(dentry);
+	}
 	return parent;
 }
 
@@ -829,7 +850,7 @@ static void shrink_dentry_list(struct list_head *list)
 		 * We found an inuse dentry which was not removed from
 		 * the LRU because of laziness during lookup. Do not free it.
 		 */
-		if (dentry->d_lockref.count) {
+		if (dentry->d_lockref.count > 0) {
 			spin_unlock(&dentry->d_lock);
 			continue;
 		}
diff --git a/include/linux/dcache.h b/include/linux/dcache.h
index 3b9bfdb..3c7ec32 100644
--- a/include/linux/dcache.h
+++ b/include/linux/dcache.h
@@ -221,6 +221,8 @@ struct dentry_operations {
 #define DCACHE_SYMLINK_TYPE		0x00300000 /* Symlink */
 #define DCACHE_FILE_TYPE		0x00400000 /* Other file type */
 
+#define DCACHE_MAY_FREE			0x00800000
+
 extern seqlock_t rename_lock;
 
 static inline int dname_external(const struct dentry *dentry)

  reply	other threads:[~2014-04-30 20:38 UTC|newest]

Thread overview: 61+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-04-29 16:01 dcache shrink list corruption? Miklos Szeredi
2014-04-29 17:43 ` Linus Torvalds
2014-04-29 18:03   ` Miklos Szeredi
2014-04-29 18:16     ` Al Viro
2014-04-29 19:10       ` Al Viro
2014-04-29 21:18         ` Dave Chinner
2014-04-29 21:48           ` Al Viro
2014-04-29 23:04             ` Linus Torvalds
2014-04-29 23:20               ` Al Viro
2014-04-30  2:31                 ` Al Viro
2014-04-30  2:56                   ` Linus Torvalds
2014-04-30  4:04                     ` Al Viro
2014-04-30 15:49                       ` Miklos Szeredi
2014-04-30 15:56                         ` Miklos Szeredi
2014-04-30 16:03                         ` Al Viro
2014-04-30 17:33                           ` Miklos Szeredi
2014-04-30 18:36                             ` Al Viro
2014-04-30 18:42                               ` Miklos Szeredi
2014-04-30 19:02                                 ` Al Viro
2014-04-30 19:59                                   ` Al Viro
2014-04-30 20:23                                     ` Linus Torvalds
2014-04-30 20:38                                       ` Al Viro [this message]
2014-04-30 20:57                                         ` Linus Torvalds
2014-04-30 21:12                                           ` Al Viro
2014-04-30 22:12                                             ` Al Viro
2014-04-30 23:04                                               ` Linus Torvalds
2014-04-30 23:14                                                 ` Linus Torvalds
2014-04-30 23:43                                                   ` Al Viro
2014-05-01  0:18                                                     ` Linus Torvalds
2014-05-01  2:51                                                       ` Al Viro
2014-05-01  2:59                                                         ` Linus Torvalds
2014-05-01  3:12                                                           ` Al Viro
2014-05-01  9:42                                                             ` Miklos Szeredi
2014-05-01 14:34                                                               ` Al Viro
2014-05-01 21:02                                                                 ` Al Viro
2014-05-01 21:05                                                                   ` Al Viro
2014-05-01 22:52                                                                     ` Linus Torvalds
2014-05-02  8:43                                                                 ` Szeredi Miklos
2014-05-02 21:04                                                                 ` Linus Torvalds
2014-04-30 23:38                                                 ` Al Viro
2014-04-30  9:15                     ` Miklos Szeredi
2014-05-02  5:51                       ` Al Viro
2014-05-02  9:00                         ` Szeredi Miklos
2014-05-02 21:02                           ` Miklos Szeredi
2014-05-02 21:08                           ` Miklos Szeredi
2014-05-02 21:18                             ` Linus Torvalds
2014-05-02 22:40                               ` Al Viro
2014-05-02 23:06                                 ` Al Viro
2014-05-03  4:26                                 ` Al Viro
2014-05-03 18:07                                   ` Linus Torvalds
2014-05-03 18:25                                     ` Al Viro
2014-05-03 18:21                                   ` Al Viro
2014-05-04  6:29                                     ` Al Viro
2014-05-06 10:17                                       ` Miklos Szeredi
2014-05-06 14:53                                         ` Linus Torvalds
2014-05-06 16:52                                           ` Al Viro
2014-05-06 17:01                                             ` Linus Torvalds
2014-05-06 19:15                                               ` Al Viro
2014-05-02 22:32                             ` Al Viro
2014-04-29 18:17     ` Linus Torvalds
2014-04-29 17:56 ` Al Viro

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20140430203823.GT18016@ZenIV.linux.org.uk \
    --to=viro@zeniv.linux.org.uk \
    --cc=david@fromorbit.com \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=miklos@szeredi.hu \
    --cc=torvalds@linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.