From: NeilBrown <neilb@suse.de>
To: Al Viro <viro@zeniv.linux.org.uk>, Daire Byrne <daire@dneg.com>,
	Trond Myklebust <trond.myklebust@hammerspace.com>,
	Chuck Lever <chuck.lever@oracle.com>
Cc: Linux NFS Mailing List <linux-nfs@vger.kernel.org>,
	linux-fsdevel@vger.kernel.org,
	LKML <linux-kernel@vger.kernel.org>
Subject: [PATCH 07/12] NFS: support parallel updates in the one directory.
Date: Tue, 14 Jun 2022 09:18:22 +1000
Message-ID: <165516230200.21248.14713533079253477888.stgit@noble.brown>
In-Reply-To: <165516173293.21248.14587048046993234326.stgit@noble.brown>

NFS can easily support parallel updates as the locking is done on the
server, so this patch enables parallel updates for NFS.

NFS unlink needs to block concurrent open()s once it decides to actually
unlink the file, rather than rename it to .nfsXXXX (aka sillyrename).
It currently does this by temporarily unhashing the dentry and relying
on the exclusive lock on the directory to block a ->lookup().  That
doesn't work now that unlink uses a shared lock, so an alternative
approach is needed.

__nfs_lookup_revalidate (->d_revalidate) now blocks while
DCACHE_PAR_UPDATE is set (in RCU-walk mode it returns -ECHILD instead).
If nfs_unlink() is called with an exclusive lock and DCACHE_PAR_UPDATE
is not set, it sets the flag itself for the potential race window.

I'd rather use some other indicator in the dentry to tell
__nfs_lookup_revalidate() to wait, but we are nearly out of d_flags bits,
and NFS doesn't have a general-purpose d_fsdata.

NFS "silly-rename" may now be called with only a shared lock on the
directory, so it needs a bit of extra care to get exclusive access to
the new name. d_lock_update_nested() and d_unlock_update() help here.

Signed-off-by: NeilBrown <neilb@suse.de>
---
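Illustration only (below the "---", so not part of the commit): the
DCACHE_PAR_UPDATE handshake described in the message above can be
modelled in userspace roughly as follows.  This is a sketch under stated
assumptions, not the kernel code: fake_dentry, FAKE_PAR_UPDATE and the
fake_*() helpers are hypothetical names, a pthread mutex stands in for
d_lock, and a condition variable stands in for wait_var_event().  The
sketch wakes waiters explicitly when it clears the flag, since a
wait_var_event() waiter only re-checks its condition after a matching
wake_up_var().

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>

#define FAKE_PAR_UPDATE 0x1u	/* hypothetical stand-in for DCACHE_PAR_UPDATE */

struct fake_dentry {
	pthread_mutex_t lock;	/* plays the role of d_lock */
	pthread_cond_t wait;	/* plays the role of the wait_var_event() queue */
	unsigned int flags;
};

#define FAKE_DENTRY_INIT \
	{ PTHREAD_MUTEX_INITIALIZER, PTHREAD_COND_INITIALIZER, 0 }

/* Set the flag if nobody has; return true if this caller must clear it. */
static bool fake_unlink_begin(struct fake_dentry *d)
{
	bool did_set = false;

	pthread_mutex_lock(&d->lock);
	if (!(d->flags & FAKE_PAR_UPDATE)) {
		d->flags |= FAKE_PAR_UPDATE;
		did_set = true;
	}
	pthread_mutex_unlock(&d->lock);
	return did_set;
}

/* Clear the flag only if this caller set it, then wake any waiters. */
static void fake_unlink_end(struct fake_dentry *d, bool did_set)
{
	if (!did_set)
		return;
	pthread_mutex_lock(&d->lock);
	assert(d->flags & FAKE_PAR_UPDATE);
	d->flags &= ~FAKE_PAR_UPDATE;
	pthread_cond_broadcast(&d->wait);
	pthread_mutex_unlock(&d->lock);
}

/*
 * An RCU-style walk must not sleep, so it bails out with -ECHILD while
 * the flag is set.  A normal walk sleeps until the flag is cleared.
 * Returns 1 ("dentry still valid") otherwise.
 */
static int fake_revalidate(struct fake_dentry *d, bool rcu_walk)
{
	int ret = 1;

	pthread_mutex_lock(&d->lock);
	if (rcu_walk) {
		if (d->flags & FAKE_PAR_UPDATE)
			ret = -ECHILD;
	} else {
		while (d->flags & FAKE_PAR_UPDATE)
			pthread_cond_wait(&d->wait, &d->lock);
	}
	pthread_mutex_unlock(&d->lock);
	return ret;
}
```

A caller would pair fake_unlink_begin() and fake_unlink_end() around the
remove, mirroring did_set_par_update in nfs_unlink().  While the flag is
set, fake_revalidate(d, true) returns -ECHILD and fake_revalidate(d, false)
sleeps until another thread calls fake_unlink_end().
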
 fs/nfs/dir.c    |   29 +++++++++++++++++++++++------
 fs/nfs/inode.c  |    2 ++
 fs/nfs/unlink.c |    5 ++++-
 3 files changed, 29 insertions(+), 7 deletions(-)

diff --git a/fs/nfs/dir.c b/fs/nfs/dir.c
index a8ecdd527662..54c2c7adcd56 100644
--- a/fs/nfs/dir.c
+++ b/fs/nfs/dir.c
@@ -1778,6 +1778,9 @@ __nfs_lookup_revalidate(struct dentry *dentry, unsigned int flags,
 	int ret;
 
 	if (flags & LOOKUP_RCU) {
+		if (dentry->d_flags & DCACHE_PAR_UPDATE)
+			/* Pending unlink */
+			return -ECHILD;
 		parent = READ_ONCE(dentry->d_parent);
 		dir = d_inode_rcu(parent);
 		if (!dir)
@@ -1786,6 +1789,9 @@ __nfs_lookup_revalidate(struct dentry *dentry, unsigned int flags,
 		if (parent != READ_ONCE(dentry->d_parent))
 			return -ECHILD;
 	} else {
+		/* Wait for unlink to complete */
+		wait_var_event(&dentry->d_flags,
+			       !(dentry->d_flags & DCACHE_PAR_UPDATE));
 		parent = dget_parent(dentry);
 		ret = reval(d_inode(parent), dentry, flags);
 		dput(parent);
@@ -2453,7 +2459,7 @@ static int nfs_safe_remove(struct dentry *dentry)
 int nfs_unlink(struct inode *dir, struct dentry *dentry)
 {
 	int error;
-	int need_rehash = 0;
+	bool did_set_par_update = false;
 
 	dfprintk(VFS, "NFS: unlink(%s/%lu, %pd)\n", dir->i_sb->s_id,
 		dir->i_ino, dentry);
@@ -2468,15 +2474,26 @@ int nfs_unlink(struct inode *dir, struct dentry *dentry)
 		error = nfs_sillyrename(dir, dentry);
 		goto out;
 	}
-	if (!d_unhashed(dentry)) {
-		__d_drop(dentry);
-		need_rehash = 1;
+	/* We must prevent any concurrent open until the unlink
+	 * completes.  ->d_revalidate will wait for DCACHE_PAR_UPDATE
+	 * to clear, but if this happens to be a non-parallel update,
+	 * we still want to block opens.  So set DCACHE_PAR_UPDATE
+	 * temporarily.
+	 */
+	if (!(dentry->d_flags & DCACHE_PAR_UPDATE)) {
+		/* Must have exclusive lock on parent */
+		did_set_par_update = true;
+		dentry->d_flags |= DCACHE_PAR_UPDATE;
 	}
+
 	spin_unlock(&dentry->d_lock);
 	error = nfs_safe_remove(dentry);
 	nfs_dentry_remove_handle_error(dir, dentry, error);
-	if (need_rehash)
-		d_rehash(dentry);
+	if (did_set_par_update) {
+		spin_lock(&dentry->d_lock);
+		dentry->d_flags &= ~DCACHE_PAR_UPDATE;
+		spin_unlock(&dentry->d_lock);
+	}
 out:
 	trace_nfs_unlink_exit(dir, dentry, error);
 	return error;
diff --git a/fs/nfs/inode.c b/fs/nfs/inode.c
index b4e46b0ffa2d..cea2554710d2 100644
--- a/fs/nfs/inode.c
+++ b/fs/nfs/inode.c
@@ -481,6 +481,8 @@ nfs_fhget(struct super_block *sb, struct nfs_fh *fh, struct nfs_fattr *fattr)
 
 		/* We can't support update_atime(), since the server will reset it */
 		inode->i_flags |= S_NOATIME|S_NOCMTIME;
+		/* Parallel updates to directories are trivial */
+		inode->i_flags |= S_PAR_UPDATE;
 		inode->i_mode = fattr->mode;
 		nfsi->cache_validity = 0;
 		if ((fattr->valid & NFS_ATTR_FATTR_MODE) == 0
diff --git a/fs/nfs/unlink.c b/fs/nfs/unlink.c
index 9697cd5d2561..52a20eb6131c 100644
--- a/fs/nfs/unlink.c
+++ b/fs/nfs/unlink.c
@@ -462,6 +462,7 @@ nfs_sillyrename(struct inode *dir, struct dentry *dentry)
 	sdentry = NULL;
 	do {
 		int slen;
+		d_unlock_update(sdentry);
 		dput(sdentry);
 		sillycounter++;
 		slen = scnprintf(silly, sizeof(silly),
@@ -479,7 +480,8 @@ nfs_sillyrename(struct inode *dir, struct dentry *dentry)
 		 */
 		if (IS_ERR(sdentry))
 			goto out;
-	} while (d_inode(sdentry) != NULL); /* need negative lookup */
+	} while (!d_lock_update_nested(sdentry, NULL, NULL,
+				       SINGLE_DEPTH_NESTING));
 
 	ihold(inode);
 
@@ -524,6 +526,7 @@ nfs_sillyrename(struct inode *dir, struct dentry *dentry)
 	rpc_put_task(task);
 out_dput:
 	iput(inode);
+	d_unlock_update(sdentry);
 	dput(sdentry);
 out:
 	return error;


