Re: [RFC PATCH v3 0/9] Suppress negative dentry

From: cgxu <cgxu519@mykernel.net>
To: Miklos Szeredi <miklos@szeredi.hu>
Cc: Amir Goldstein <amir73il@gmail.com>, Ian Kent <raven@themaw.net>,
	Al Viro <viro@zeniv.linux.org.uk>,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>,
	overlayfs <linux-unionfs@vger.kernel.org>
Subject: Re: [RFC PATCH v3 0/9] Suppress negative dentry
Date: Tue, 19 May 2020 13:01:44 +0800	[thread overview]
Message-ID: <05e92557-055c-0dea-4fe4-0194606b6c77@mykernel.net> (raw)
In-Reply-To: <CAJfpegtpi1SVJRbQb8zM0t66WnrjKsPEGEN3qZKRzrZePP06dA@mail.gmail.com>

  ---- 在 星期一, 2020-05-18 15:52:48 Miklos Szeredi <miklos@szeredi.hu> 撰写 
----
  > On Mon, May 18, 2020 at 7:27 AM Amir Goldstein <amir73il@gmail.com> 
wrote:
  > >
  > > On Mon, May 18, 2020 at 3:53 AM Ian Kent <raven@themaw.net> wrote:
  > > >
  > > > On Fri, 2020-05-15 at 15:20 +0800, Chengguang Xu wrote:
  > > > > This series adds a new lookup flag LOOKUP_DONTCACHE_NEGATIVE
  > > > > to indicate to drop negative dentry in slow path of lookup.
  > > > >
  > > > > In overlayfs, negative dentries in upper/lower layers are useless
  > > > > after construction of overlayfs' own dentry, so in order to
  > > > > effectively reclaim those dentries, specify 
LOOKUP_DONTCACHE_NEGATIVE
  > > > > flag when doing lookup in upper/lower layers.
  > > >
  > > > I've looked at this a couple of times now.
  > > >
  > > > I'm not at all sure of the wisdom of adding a flag to a VFS function
  > > > that allows circumventing what a file system chooses to do.
  > >
  > > But it is not really a conscious choice is it?
  > > How exactly does a filesystem express its desire to cache a negative
  > > dentry? The documentation of lookup() in vfs.rst makes it clear that
  > > it is not up to the filesystem to make that decision.
  > > The VFS needs to cache the negative dentry on lookup(), so
  > > it can turn it positive on create().
  > > Low level kernel modules that call the VFS lookup() might know
  > > that caching the negative dentry is counter productive.
  > >
  > > >
  > > > I also do really see the need for it because only hashed negative
  > > > dentrys will be retained by the VFS so, if you see a hashed negative
  > > > dentry then you can cause it to be discarded on release of the last
  > > > reference by dropping it.
  > > >
  > > > So what's different here, why is adding an argument to do that drop
  > > > in the VFS itself needed instead of just doing it in overlayfs?
  > >
  > > That was v1 patch. It was dealing with the possible race of
  > > returned negative dentry becoming positive before dropping it
  > > in an intrusive manner.
  > >
  > > In retrospect, I think this race doesn't matter and there is no
  > > harm in dropping a positive dentry in a race obviously caused by
  > > accessing the underlying layer, which as documented results in
  > > "undefined behavior".
  > >
  > > Miklos, am I missing something?
  >  > Dropping a positive dentry is harmful in case there's a long term
  > reference to the dentry (e.g. an open file) since it will look as if
  > the file was deleted, when in fact it wasn't.
  >  > It's possible to unhash a negative dentry in a safe way if we make
  > sure it cannot become positive.  One way is to grab d_lock and remove
  > it from the hash table only if count is one.
  >  > So yes, we could have a helper to do that instead of the lookup flag.
  > The disadvantage being that we'd also be dropping negatives that did
  > not enter the cache because of our lookup.
  >


If we don't consider that only drop negative dentry of our lookup,
it is possible to do like below, isn't it?

diff --git a/fs/overlayfs/namei.c b/fs/overlayfs/namei.c
index 723d17744758..fa339e23b0f8 100644
--- a/fs/overlayfs/namei.c
+++ b/fs/overlayfs/namei.c
@@ -200,7 +200,7 @@ static int ovl_lookup_single(struct dentry *base, 
struct ovl_lookup_data *d,
         int err;
         bool last_element = !post[0];

-       this = lookup_positive_unlocked(name, base, namelen);
+       this = lookup_one_len_unlocked(name, base, namelen);
         if (IS_ERR(this)) {
                 err = PTR_ERR(this);
                 this = NULL;
@@ -209,6 +209,18 @@ static int ovl_lookup_single(struct dentry *base, 
struct ovl_lookup_data *d,
                 goto out_err;
         }

+       if (d_flags_negative(this->d_flags)) {
+               inode_lock_shared(base->d_inode);
+               if (d_flags_negative(this->d_flags))
+                       d_drop(this);
+               inode_unlock_shared(base->d_inode);
+
+               dput(this);
+               this = NULL;
+               err = -ENOENT;
+               goto out;
+       }
+
         if (ovl_dentry_weird(this)) {
                 /* Don't support traversing automounts and other 
weirdness */
                 err = -EREMOTE;


Thanks,
cgxu