linux-unionfs.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Ian Kent <raven@themaw.net>
To: Amir Goldstein <amir73il@gmail.com>
Cc: Chengguang Xu <cgxu519@mykernel.net>,
	Miklos Szeredi <miklos@szeredi.hu>,
	Al Viro <viro@zeniv.linux.org.uk>,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>,
	overlayfs <linux-unionfs@vger.kernel.org>
Subject: Re: [RFC PATCH v3 0/9] Suppress negative dentry
Date: Mon, 18 May 2020 18:26:48 +0800	[thread overview]
Message-ID: <cfc4e94f6667eabee664b63cc23051e9d816456c.camel@themaw.net> (raw)
In-Reply-To: <CAOQ4uxjT8DouPmf1mk1x24X8FcN5peYAqwdr362P4gcW+x15dw@mail.gmail.com>

On Mon, 2020-05-18 at 08:27 +0300, Amir Goldstein wrote:
> On Mon, May 18, 2020 at 3:53 AM Ian Kent <raven@themaw.net> wrote:
> > On Fri, 2020-05-15 at 15:20 +0800, Chengguang Xu wrote:
> > > This series adds a new lookup flag LOOKUP_DONTCACHE_NEGATIVE
> > > to indicate to drop negative dentry in slow path of lookup.
> > > 
> > > In overlayfs, negative dentries in upper/lower layers are useless
> > > after construction of overlayfs' own dentry, so in order to
> > > effectively reclaim those dentries, specify
> > > LOOKUP_DONTCACHE_NEGATIVE
> > > flag when doing lookup in upper/lower layers.
> > 
> > I've looked at this a couple of times now.
> > 
> > I'm not at all sure of the wisdom of adding a flag to a VFS
> > function
> > that allows circumventing what a file system chooses to do.
> 
> But it is not really a conscious choice is it?

I thought that too until recently when I had the need to ask the
question "how do these negative hashed dentries get into the dcache.

> How exactly does a filesystem express its desire to cache a negative
> dentry? The documentation of lookup() in vfs.rst makes it clear that
> it is not up to the filesystem to make that decision.

That's the question I had too and, somewhat embarrassingly, got
a response that started with "Are you serious ..." for Al when
I made a claim that ext4 doesn't create negative hashed dentries.

What was so bad about that claim is it's really obvious that ext4
does do this in ext4/namei.c:ext4_lookup():

...
inode = NULL;
if (bh) {
...
}
...
return d_splice_alias(inode, dentry);

and inode can be negative here.

Now d_splice_alias() is pretty complicated but, if passed a NULL
dentry it amounts to calling __d_add(dentry, NULL) which produces
a negative hashed dentry, a decision made by ext4 ->lookup() (and
I must say typical of the very few ways such dentries get into
the dcache).

Now on final dput of the walk these dentries should end up with a
reference count of zero which triggers dput() to add them to the
lru list so they can be considered as prune targets and can be
found in subsequent lookups (they are hashed).

This is how using negative hashed detries helps to avoid the
expensive alloc/free (at least) that occurs when looking up paths
that don't exist.

Negative "unhashed" dentries are discarded on final dput() so
don't really come into the picture except that dropping a negative
hashed dentry will cause it to be discarded on final dput().

But nothing is ever quite as simple as a description of how it
appears to (is meant to) work so, by all means, take what I say
with a grain of salt, ;)

> The VFS needs to cache the negative dentry on lookup(), so
> it can turn it positive on create().
> Low level kernel modules that call the VFS lookup() might know
> that caching the negative dentry is counter productive.
> 
> > I also do really see the need for it because only hashed negative
> > dentrys will be retained by the VFS so, if you see a hashed
> > negative
> > dentry then you can cause it to be discarded on release of the last
> > reference by dropping it.
> > 
> > So what's different here, why is adding an argument to do that drop
> > in the VFS itself needed instead of just doing it in overlayfs?
> 
> That was v1 patch. It was dealing with the possible race of
> returned negative dentry becoming positive before dropping it
> in an intrusive manner.

Right, very much something that overlay fs must care about, file
systems not so much.

It might be possible to assume that if the underlying file system
->lookup() call has produced a negative hashed dentry that it will
stay negative. That assumption definitely can't be made for negative
unhashed dentries, of course, since they may still be in the process
of being filled in.

But, unfortunately, I see your point, it's a risky assumption.

> 
> In retrospect, I think this race doesn't matter and there is no
> harm in dropping a positive dentry in a race obviously caused by
> accessing the underlying layer, which as documented results in
> "undefined behavior".

Umm ... I thought we were talking about negative dentries ...

Ian


  parent reply	other threads:[~2020-05-18 10:26 UTC|newest]

Thread overview: 26+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-05-15  7:20 [RFC PATCH v3 0/9] Suppress negative dentry Chengguang Xu
2020-05-15  7:20 ` [RFC PATCH v3 1/9] fs/dcache: Introduce a new lookup flag LOOKUP_DONTCACHE_NEGATIVE Chengguang Xu
2020-05-15  7:20 ` [RFC PATCH v3 2/9] ovl: Suppress negative dentry in lookup Chengguang Xu
2020-05-15  7:20 ` [RFC PATCH v3 3/9] cifs: Adjust argument for lookup_positive_unlocked() Chengguang Xu
2020-05-15  7:20 ` [RFC PATCH v3 4/9] debugfs: " Chengguang Xu
2020-05-15  7:20 ` [RFC PATCH v3 5/9] ecryptfs: Adjust argument for lookup_one_len_unlocked() Chengguang Xu
2020-05-15  7:20 ` [RFC PATCH v3 6/9] exportfs: " Chengguang Xu
2020-05-15  7:20 ` [RFC PATCH v3 7/9] kernfs: Adjust argument for lookup_positive_unlocked() Chengguang Xu
2020-05-15  7:20 ` [RFC PATCH v3 8/9] nfsd: " Chengguang Xu
2020-05-15  7:20 ` [RFC PATCH v3 9/9] quota: " Chengguang Xu
2020-05-15  7:30 ` [RFC PATCH v3 0/9] Suppress negative dentry Amir Goldstein
2020-05-15  8:25   ` Chengguang Xu
2020-05-15  8:42     ` Miklos Szeredi
2020-05-18  0:53 ` Ian Kent
2020-05-18  5:27   ` Amir Goldstein
2020-05-18  7:52     ` Miklos Szeredi
2020-05-18  8:51       ` Amir Goldstein
2020-05-18  9:17         ` Miklos Szeredi
2020-05-19  5:01       ` cgxu
2020-05-19  8:21         ` Miklos Szeredi
2020-05-19  9:23           ` cgxu
2020-05-20 14:44             ` Miklos Szeredi
2020-05-25 13:37               ` Chengguang Xu
2020-05-25 13:50                 ` Miklos Szeredi
2020-05-18 10:26     ` Ian Kent [this message]
2020-05-18 10:39       ` Ian Kent

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=cfc4e94f6667eabee664b63cc23051e9d816456c.camel@themaw.net \
    --to=raven@themaw.net \
    --cc=amir73il@gmail.com \
    --cc=cgxu519@mykernel.net \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-unionfs@vger.kernel.org \
    --cc=miklos@szeredi.hu \
    --cc=viro@zeniv.linux.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).