linux-fsdevel.vger.kernel.org archive mirror
From: Al Viro <viro@ZenIV.linux.org.uk>
To: John Ogness <john.ogness@linutronix.de>
Cc: linux-fsdevel@vger.kernel.org,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Christoph Hellwig <hch@lst.de>,
	Thomas Gleixner <tglx@linutronix.de>,
	Peter Zijlstra <peterz@infradead.org>,
	Sebastian Andrzej Siewior <bigeasy@linutronix.de>,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH v2 6/6] fs/dcache: Avoid remaining try_lock loop in shrink_dentry_list()
Date: Fri, 23 Feb 2018 17:42:16 +0000
Message-ID: <20180223174216.GD30522@ZenIV.linux.org.uk>
In-Reply-To: <20180223150928.GC30522@ZenIV.linux.org.uk>

On Fri, Feb 23, 2018 at 03:09:28PM +0000, Al Viro wrote:
> You are conflating the "we have a reference" cases with this one, and
> they are very different.  Note, BTW, that had we raced with somebody
> else grabbing a reference, we would've quietly dropped dentry from
> the shrink list; what if we do the following: just after checking that
> refcount is not positive, do
> 	inode = dentry->d_inode;
> 	if unlikely(inode && !spin_trylock...)
> 		rcu_read_lock
> 		drop ->d_lock
> 		grab inode->i_lock
> 		grab ->d_lock
> 		if unlikely(dentry->d_inode != inode)
> 			drop inode->i_lock
> 			rcu_read_unlock
> 			if !killed
> 				drop ->d_lock
> 				drop parent's ->d_lock
> 				continue;
> 		else
> 			rcu_read_unlock
> *before* going into
>                 if (unlikely(dentry->d_flags & DCACHE_DENTRY_KILLED)) {
>                         bool can_free = dentry->d_flags & DCACHE_MAY_FREE;
>                         spin_unlock(&dentry->d_lock);
> 			...
> part?

Owww....  It's actually even nastier than I realized - dropping ->d_lock
opens us to having the sucker freed by dput() from another thread here.
IOW, between d_shrink_del(dentry) and __dentry_kill(dentry) dropping ->d_lock
is dangerous...

It's really very different from all other cases, and the trickiest by far.

FWIW, my impression from the series:
	1) dentry_kill() should deal with trylock failures on its own, leaving
the callers only the real "we need to drop the parent" case.  See upthread for
one variant of doing that.
	2) switching parent eviction in shrink_dentry_list() to dentry_kill()
is fine.
	3) for d_delete() the trylock loop is wrong; however, it does not need
anything more elaborate than
{
	struct inode *inode;
	int isdir = d_is_dir(dentry);
	/*
	 * Are we the only user?
	 */
	spin_lock(&dentry->d_lock);
	if (dentry->d_lockref.count != 1)
		goto Shared;

	inode = dentry->d_inode;
	if (unlikely(!spin_trylock(&inode->i_lock))) {
		/* back off and retake both in ->i_lock, ->d_lock order */
		spin_unlock(&dentry->d_lock);
		spin_lock(&inode->i_lock);
		spin_lock(&dentry->d_lock);
		if (dentry->d_lockref.count != 1) {
			spin_unlock(&inode->i_lock);
			goto Shared;
		}
	}

	dentry->d_flags &= ~DCACHE_CANT_MOUNT;
	dentry_unlink_inode(dentry);	/* drops ->d_lock and ->i_lock */
	fsnotify_nameremove(dentry, isdir);
	return;

Shared:	/* can't make it negative, must unhash */
	if (!d_unhashed(dentry))
		__d_drop(dentry);
	spin_unlock(&dentry->d_lock);

	fsnotify_nameremove(dentry, isdir);
}

If not an outright "lock inode first from the very beginning" - note that
inode is stable (and non-NULL) here.  IOW, that needs to be compared with
{
	struct inode *inode = dentry->d_inode;
	int isdir = d_is_dir(dentry);
	spin_lock(&inode->i_lock);
	spin_lock(&dentry->d_lock);
	/*
	 * Are we the only user?
	 */
	if (dentry->d_lockref.count == 1) {
		dentry->d_flags &= ~DCACHE_CANT_MOUNT;
		dentry_unlink_inode(dentry);
	} else {
		if (!d_unhashed(dentry))
			__d_drop(dentry);
		spin_unlock(&dentry->d_lock);
		spin_unlock(&inode->i_lock);
	}
	fsnotify_nameremove(dentry, isdir);
}

That costs an extra boinking of ->i_lock in case the dentry is shared, but it's
much shorter and simpler that way.  Needs profiling; if the second variant
does not give worse performance, I would definitely prefer that one.
	4) the nasty one - shrink_dentry_list() evictions of zero-count dentries.
_That_ calls for careful use of RCU, etc. - none of the others need that.  Need
to think how to deal with that sucker; in any case, I do not believe that sharing
said RCU use, etc. with any other cases would do anything other than obfuscating
the rest.
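
For anyone wanting to play with the two d_delete() shapes from (3) outside
the kernel, here is a rough userspace sketch using pthread mutexes.  All of
the names in it (toy_inode, toy_dentry, toy_delete_trylock,
toy_delete_lock_first) are made up for illustration and are not kernel
interfaces; it assumes, as in d_delete(), that the caller holds a reference
so the inode pointer is stable and non-NULL, and it models only the locking
order and the recheck after ->d_lock has been dropped, nothing else.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins, not the kernel's struct inode / struct dentry. */
struct toy_inode {
	pthread_mutex_t i_lock;		/* outer lock, taken first */
};

struct toy_dentry {
	pthread_mutex_t d_lock;		/* inner lock, nests inside i_lock */
	int count;			/* refcount, protected by d_lock */
	struct toy_inode *inode;	/* assumed stable: caller holds a ref */
	bool hashed;
};

/* Variant 1: take d_lock, trylock i_lock, back off and recheck on failure. */
static bool toy_delete_trylock(struct toy_dentry *d)
{
	struct toy_inode *inode;

	pthread_mutex_lock(&d->d_lock);
	if (d->count != 1)
		goto shared;

	inode = d->inode;
	assert(inode != NULL);
	if (pthread_mutex_trylock(&inode->i_lock) != 0) {
		/* Blocking here would invert the order; retake both in order. */
		pthread_mutex_unlock(&d->d_lock);
		pthread_mutex_lock(&inode->i_lock);
		pthread_mutex_lock(&d->d_lock);
		/* d_lock was dropped, so the count has to be rechecked. */
		if (d->count != 1) {
			pthread_mutex_unlock(&inode->i_lock);
			goto shared;
		}
	}

	/* Sole user with both locks held: make the dentry negative. */
	d->inode = NULL;
	pthread_mutex_unlock(&d->d_lock);
	pthread_mutex_unlock(&inode->i_lock);
	return true;

shared:	/* cannot make it negative, only unhash */
	d->hashed = false;
	pthread_mutex_unlock(&d->d_lock);
	return false;
}

/* Variant 2: lock the inode first from the very beginning. */
static bool toy_delete_lock_first(struct toy_dentry *d)
{
	struct toy_inode *inode = d->inode;
	bool sole;

	assert(inode != NULL);
	pthread_mutex_lock(&inode->i_lock);
	pthread_mutex_lock(&d->d_lock);
	sole = (d->count == 1);
	if (sole)
		d->inode = NULL;
	else
		d->hashed = false;
	pthread_mutex_unlock(&d->d_lock);
	/* In the shared case this i_lock round trip was pure overhead. */
	pthread_mutex_unlock(&inode->i_lock);
	return sole;
}
```

Both functions return true when the dentry was made negative and false when
it was merely unhashed.  The second one always pays for the i_lock round
trip, which is the extra cost mentioned above; whether that matters is
exactly what would need profiling.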
