linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Al Viro <viro@ZenIV.linux.org.uk>
To: Oleg Drokin <green@linuxhacker.ru>
Cc: Mailing List <linux-kernel@vger.kernel.org>,
	"<linux-fsdevel@vger.kernel.org>" <linux-fsdevel@vger.kernel.org>
Subject: Re: More parallel atomic_open/d_splice_alias fun with NFS and possibly more FSes.
Date: Tue, 5 Jul 2016 21:08:26 +0100	[thread overview]
Message-ID: <20160705200826.GP14480@ZenIV.linux.org.uk> (raw)
In-Reply-To: <1C63FC85-848B-486F-8223-F5CEB5C39848@linuxhacker.ru>

On Tue, Jul 05, 2016 at 03:12:44PM -0400, Oleg Drokin wrote:
> 
> When d_in_lookup was introduced, it was described as:
>     New primitives: d_in_lookup() (a predicate checking if dentry is in
>     the in-lookup state) and d_lookup_done() (tells the system that
>     we are done with lookup and if it's still marked as in-lookup, it
>     should cease to be such).
> 
> I don't see where it mentions anything about exclusive vs parallel lookup
> that probably led to some confusion.

In the same commit:

#define DCACHE_PAR_LOOKUP               0x10000000 /* being looked up (with parent locked shared) */

static inline int d_in_lookup(struct dentry *dentry)
{
       return dentry->d_flags & DCACHE_PAR_LOOKUP;
}

Sure, we could use d_alloc_parallel() for all lookups, but it wouldn't buy
us anything in terms of exclusion (parent locked exclusive => no other
lookup attempts on that parent/name pair anyway) and it would cost extra
searches both in the primary and in-lookup hashes, as well as insertions and
removals from the latter.

Hell knows - perhaps teaching d_alloc_parallel() that NULL wq => just
allocate and mark in-lookup, without touching either primary or in-lookup
hash (and scream bloody murder if the parent isn't locked exclusive) would
be a good idea.  A few places in fs/dcache.c would need to check for
->d_wait being non-NULL (__d_lookup_done(), __d_add() an __d_move());
We could use that for lookups when parent is locked exclusive; then
d_in_lookup() would be true for *all* dentries passed to ->lookup().  

I'll look into that, but it's obviously the next cycle fodder.

> So with Lustre the dentry can be in three states, really:
> 
> 1. hashed dentry that's all A-ok to reuse.
> 2. hashed dentry that's NOT valid (dlm lock went away) - this is distinguished in d_compare by looking at a bit in the fs_data
> 3. unhashed dentry ( I guess could be both valid and invalid lustre-wise).
> 
> So the logic in ll_lookup_it_finish() (part of regular lookup) is this:
> 
> If the dentry we have is not hashed - this is a new lookup, so we need to
> call into ll_splice_alias() to see if there's a better dentry we need to
> reuse that was already rejected by VFS before since we did not have necessary locks,
> but we do have them now.
> The comment at the top of ll_dcompare() explains why we don't just unhash the
> dentry on lock-loss - that apparently leads to a loop around real_lookup for
> real-contended dentries.
> This is also why we cannot use d_splice_alias here - such cases are possible
> for regular files and directories.
>
> Anyway, I guess additional point of confusion here is then why does
> ll_lookup_it_finish() need to check for hashedness of the dentry since it's in
> lookup, so we should be unhashed here.
> I checked the commit history and this check was added along with atomic_open
> support, so I imagine we can just move it up into ll_atomic_open and then your
> change starts to make sense along with a few other things.

So basically this
        } else if (!it_disposition(it, DISP_LOOKUP_NEG)  &&
                   !it_disposition(it, DISP_OPEN_CREATE)) {
                /* With DISP_OPEN_CREATE dentry will be
                 * instantiated in ll_create_it.
                 */
                LASSERT(!d_inode(*de));
                d_instantiate(*de, inode);
        }
is something we should do only for negative hashed fed to it by
->atomic_open(), and that - only if we have no O_CREAT in flags?

Then, since 3/3 eliminates that case completely, we could just rip that
else-if, along with the check for d_unhashed itself, making the call of
ll_splice_alias() unconditional there.  Or am I misreading what you said?
Do you see any problems with what's in #for-linus now (head at 11f0083)?

  reply	other threads:[~2016-07-05 20:08 UTC|newest]

Thread overview: 34+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-06-17  4:09 More parallel atomic_open/d_splice_alias fun with NFS and possibly more FSes Oleg Drokin
2016-06-17  4:29 ` Al Viro
2016-06-25 16:38   ` Oleg Drokin
2016-07-03  6:29     ` Al Viro
2016-07-04  0:08       ` Al Viro
2016-07-04  0:37         ` Oleg Drokin
2016-07-04  3:08           ` Al Viro
2016-07-04  3:55             ` Oleg Drokin
2016-07-05  2:25               ` Al Viro
2016-07-10 17:01                 ` Oleg Drokin
2016-07-10 18:14                   ` James Simmons
2016-07-11  1:01                     ` Al Viro
2016-07-11  1:03                       ` Al Viro
2016-07-11 22:54                         ` lustre sendmsg stuff Oleg Drokin
2016-07-11 17:15                       ` More parallel atomic_open/d_splice_alias fun with NFS and possibly more FSes James Simmons
2016-07-05  2:28       ` Oleg Drokin
2016-07-05  2:32         ` Oleg Drokin
2016-07-05  4:43         ` Oleg Drokin
2016-07-05  6:22       ` Oleg Drokin
2016-07-05 12:31         ` Al Viro
2016-07-05 13:51           ` Al Viro
2016-07-05 15:21             ` Oleg Drokin
2016-07-05 17:42               ` Al Viro
2016-07-05 18:12                 ` Oleg Drokin
2016-07-05 16:33             ` Oleg Drokin
2016-07-05 18:08               ` Al Viro
2016-07-05 19:12                 ` Oleg Drokin
2016-07-05 20:08                   ` Al Viro [this message]
2016-07-05 20:21                     ` Oleg Drokin
2016-07-06  0:29                       ` Oleg Drokin
2016-07-06  3:20                         ` Al Viro
2016-07-06  3:25                           ` Oleg Drokin
2016-07-06  4:35                             ` Oleg Drokin
2016-07-06 16:24             ` Oleg Drokin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20160705200826.GP14480@ZenIV.linux.org.uk \
    --to=viro@zeniv.linux.org.uk \
    --cc=green@linuxhacker.ru \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).