All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/2] nfs_complete_rename() calls d_move() without i_mutex
@ 2017-06-15 16:13 Benjamin Coddington
  2017-06-15 16:13 ` [PATCH 1/2] Revert "NFS: nfs_rename() handle -ERESTARTSYS dentry left behind" Benjamin Coddington
  2017-06-15 16:13 ` [PATCH 2/2] NFS: nfs_rename() - revalidate directories on -ERESTARTSYS Benjamin Coddington
  0 siblings, 2 replies; 10+ messages in thread
From: Benjamin Coddington @ 2017-06-15 16:13 UTC (permalink / raw)
  To: Trond Myklebust, Anna Schumaker; +Cc: Jeff Layton, mszeredi, bfields, linux-nfs

Once commit 920b4530fb80430ff30ef83efe21ba1fa5623731 "NFS: nfs_rename()
handle -ERESTARTSYS dentry left behind" moved the local d_move() into the
RPC asyncronous context, d_move() could be called without holding the
directories' i_mutex.

Let's revert that commit, and a follow-up fix for it in 1/2, and then fix
the original problem once more by forcing a revalidation of the old and new
directories if we notice that the rename was interrupted in 2/2.

Benjamin Coddington (2):
  Revert "NFS: nfs_rename() handle -ERESTARTSYS dentry left behind"
  NFS: nfs_rename() - revalidate directories on -ERESTARTSYS

 fs/nfs/dir.c            | 47 ++++++++++++++++++++---------------------------
 fs/nfs/unlink.c         |  7 +++++++
 include/linux/nfs_xdr.h |  1 +
 3 files changed, 28 insertions(+), 27 deletions(-)

-- 
2.9.3


^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH 1/2] Revert "NFS: nfs_rename() handle -ERESTARTSYS dentry left behind"
  2017-06-15 16:13 [PATCH 0/2] nfs_complete_rename() calls d_move() without i_mutex Benjamin Coddington
@ 2017-06-15 16:13 ` Benjamin Coddington
  2017-06-15 18:16   ` Jeff Layton
  2017-06-15 16:13 ` [PATCH 2/2] NFS: nfs_rename() - revalidate directories on -ERESTARTSYS Benjamin Coddington
  1 sibling, 1 reply; 10+ messages in thread
From: Benjamin Coddington @ 2017-06-15 16:13 UTC (permalink / raw)
  To: Trond Myklebust, Anna Schumaker; +Cc: Jeff Layton, mszeredi, bfields, linux-nfs

This reverts commit 920b4530fb80430ff30ef83efe21ba1fa5623731 which could
call d_move() without holding the directory's i_mutex, and reverts commit
d4ea7e3c5c0e341c15b073016dbf3ab6c65f12f3 "NFS: Fix old dentry rehash after
move", which was a follow-up fix.

Fixes: 920b4530fb80 ("NFS: nfs_rename() handle -ERESTARTSYS dentry left behind")
Cc: stable@vger.kernel.org # v4.10+
Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
---
 fs/nfs/dir.c | 45 ++++++++++++++++++---------------------------
 1 file changed, 18 insertions(+), 27 deletions(-)

diff --git a/fs/nfs/dir.c b/fs/nfs/dir.c
index 32ccd7754f8a..1faf337b316f 100644
--- a/fs/nfs/dir.c
+++ b/fs/nfs/dir.c
@@ -1946,29 +1946,6 @@ nfs_link(struct dentry *old_dentry, struct inode *dir, struct dentry *dentry)
 }
 EXPORT_SYMBOL_GPL(nfs_link);
 
-static void
-nfs_complete_rename(struct rpc_task *task, struct nfs_renamedata *data)
-{
-	struct dentry *old_dentry = data->old_dentry;
-	struct dentry *new_dentry = data->new_dentry;
-	struct inode *old_inode = d_inode(old_dentry);
-	struct inode *new_inode = d_inode(new_dentry);
-
-	nfs_mark_for_revalidate(old_inode);
-
-	switch (task->tk_status) {
-	case 0:
-		if (new_inode != NULL)
-			nfs_drop_nlink(new_inode);
-		d_move(old_dentry, new_dentry);
-		nfs_set_verifier(new_dentry,
-					nfs_save_change_attribute(data->new_dir));
-		break;
-	case -ENOENT:
-		nfs_dentry_handle_enoent(old_dentry);
-	}
-}
-
 /*
  * RENAME
  * FIXME: Some nfsds, like the Linux user space nfsd, may generate a
@@ -1999,7 +1976,7 @@ int nfs_rename(struct inode *old_dir, struct dentry *old_dentry,
 {
 	struct inode *old_inode = d_inode(old_dentry);
 	struct inode *new_inode = d_inode(new_dentry);
-	struct dentry *dentry = NULL;
+	struct dentry *dentry = NULL, *rehash = NULL;
 	struct rpc_task *task;
 	int error = -EBUSY;
 
@@ -2022,8 +1999,10 @@ int nfs_rename(struct inode *old_dir, struct dentry *old_dentry,
 		 * To prevent any new references to the target during the
 		 * rename, we unhash the dentry in advance.
 		 */
-		if (!d_unhashed(new_dentry))
+		if (!d_unhashed(new_dentry)) {
 			d_drop(new_dentry);
+			rehash = new_dentry;
+		}
 
 		if (d_count(new_dentry) > 2) {
 			int err;
@@ -2040,6 +2019,7 @@ int nfs_rename(struct inode *old_dir, struct dentry *old_dentry,
 				goto out;
 
 			new_dentry = dentry;
+			rehash = NULL;
 			new_inode = NULL;
 		}
 	}
@@ -2048,8 +2028,7 @@ int nfs_rename(struct inode *old_dir, struct dentry *old_dentry,
 	if (new_inode != NULL)
 		NFS_PROTO(new_inode)->return_delegation(new_inode);
 
-	task = nfs_async_rename(old_dir, new_dir, old_dentry, new_dentry,
-					nfs_complete_rename);
+	task = nfs_async_rename(old_dir, new_dir, old_dentry, new_dentry, NULL);
 	if (IS_ERR(task)) {
 		error = PTR_ERR(task);
 		goto out;
@@ -2059,9 +2038,21 @@ int nfs_rename(struct inode *old_dir, struct dentry *old_dentry,
 	if (error == 0)
 		error = task->tk_status;
 	rpc_put_task(task);
+	nfs_mark_for_revalidate(old_inode);
 out:
+	if (rehash)
+		d_rehash(rehash);
 	trace_nfs_rename_exit(old_dir, old_dentry,
 			new_dir, new_dentry, error);
+	if (!error) {
+		if (new_inode != NULL)
+			nfs_drop_nlink(new_inode);
+		d_move(old_dentry, new_dentry);
+		nfs_set_verifier(new_dentry,
+					nfs_save_change_attribute(new_dir));
+	} else if (error == -ENOENT)
+		nfs_dentry_handle_enoent(old_dentry);
+
 	/* new dentry created? */
 	if (dentry)
 		dput(dentry);
-- 
2.9.3


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH 2/2] NFS: nfs_rename() - revalidate directories on -ERESTARTSYS
  2017-06-15 16:13 [PATCH 0/2] nfs_complete_rename() calls d_move() without i_mutex Benjamin Coddington
  2017-06-15 16:13 ` [PATCH 1/2] Revert "NFS: nfs_rename() handle -ERESTARTSYS dentry left behind" Benjamin Coddington
@ 2017-06-15 16:13 ` Benjamin Coddington
  2017-06-15 18:18   ` Jeff Layton
  1 sibling, 1 reply; 10+ messages in thread
From: Benjamin Coddington @ 2017-06-15 16:13 UTC (permalink / raw)
  To: Trond Myklebust, Anna Schumaker; +Cc: Jeff Layton, mszeredi, bfields, linux-nfs

An interrupted rename will leave the old dentry behind if the rename
succeeds.  Fix this by forcing a lookup the next time through
->d_revalidate.

A previous attempt at solving this problem took the approach to complete
the work of the rename asynchronously, however that approach was wrong
since it would allow the d_move() to occur after the directory's i_mutex
had been dropped by the original process.

Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
---
 fs/nfs/dir.c            | 2 ++
 fs/nfs/unlink.c         | 7 +++++++
 include/linux/nfs_xdr.h | 1 +
 3 files changed, 10 insertions(+)

diff --git a/fs/nfs/dir.c b/fs/nfs/dir.c
index 1faf337b316f..bb697e5c3f6c 100644
--- a/fs/nfs/dir.c
+++ b/fs/nfs/dir.c
@@ -2037,6 +2037,8 @@ int nfs_rename(struct inode *old_dir, struct dentry *old_dentry,
 	error = rpc_wait_for_completion_task(task);
 	if (error == 0)
 		error = task->tk_status;
+	else
+		((struct nfs_renamedata *)task->tk_calldata)->cancelled = 1;
 	rpc_put_task(task);
 	nfs_mark_for_revalidate(old_inode);
 out:
diff --git a/fs/nfs/unlink.c b/fs/nfs/unlink.c
index 191aa577dd1f..b47158a34879 100644
--- a/fs/nfs/unlink.c
+++ b/fs/nfs/unlink.c
@@ -288,6 +288,13 @@ static void nfs_async_rename_release(void *calldata)
 	if (d_really_is_positive(data->old_dentry))
 		nfs_mark_for_revalidate(d_inode(data->old_dentry));
 
+	/* The result of the rename is unknown. Play it safe by
+	 * forcing a new lookup */
+	if (data->cancelled) {
+		nfs_force_lookup_revalidate(data->old_dir);
+		nfs_force_lookup_revalidate(data->new_dir);
+	}
+
 	dput(data->old_dentry);
 	dput(data->new_dentry);
 	iput(data->old_dir);
diff --git a/include/linux/nfs_xdr.h b/include/linux/nfs_xdr.h
index b28c83475ee8..2a8406b4b353 100644
--- a/include/linux/nfs_xdr.h
+++ b/include/linux/nfs_xdr.h
@@ -1533,6 +1533,7 @@ struct nfs_renamedata {
 	struct nfs_fattr	new_fattr;
 	void (*complete)(struct rpc_task *, struct nfs_renamedata *);
 	long timeout;
+	int cancelled;
 };
 
 struct nfs_access_entry;
-- 
2.9.3


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH 1/2] Revert "NFS: nfs_rename() handle -ERESTARTSYS dentry left behind"
  2017-06-15 16:13 ` [PATCH 1/2] Revert "NFS: nfs_rename() handle -ERESTARTSYS dentry left behind" Benjamin Coddington
@ 2017-06-15 18:16   ` Jeff Layton
  0 siblings, 0 replies; 10+ messages in thread
From: Jeff Layton @ 2017-06-15 18:16 UTC (permalink / raw)
  To: Benjamin Coddington, Trond Myklebust, Anna Schumaker
  Cc: mszeredi, bfields, linux-nfs

On Thu, 2017-06-15 at 12:13 -0400, Benjamin Coddington wrote:
> This reverts commit 920b4530fb80430ff30ef83efe21ba1fa5623731 which could
> call d_move() without holding the directory's i_mutex, and reverts commit
> d4ea7e3c5c0e341c15b073016dbf3ab6c65f12f3 "NFS: Fix old dentry rehash after
> move", which was a follow-up fix.
> 
> Fixes: 920b4530fb80 ("NFS: nfs_rename() handle -ERESTARTSYS dentry left behind")
> Cc: stable@vger.kernel.org # v4.10+
> Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
> ---
>  fs/nfs/dir.c | 45 ++++++++++++++++++---------------------------
>  1 file changed, 18 insertions(+), 27 deletions(-)
> 
> diff --git a/fs/nfs/dir.c b/fs/nfs/dir.c
> index 32ccd7754f8a..1faf337b316f 100644
> --- a/fs/nfs/dir.c
> +++ b/fs/nfs/dir.c
> @@ -1946,29 +1946,6 @@ nfs_link(struct dentry *old_dentry, struct inode *dir, struct dentry *dentry)
>  }
>  EXPORT_SYMBOL_GPL(nfs_link);
>  
> -static void
> -nfs_complete_rename(struct rpc_task *task, struct nfs_renamedata *data)
> -{
> -	struct dentry *old_dentry = data->old_dentry;
> -	struct dentry *new_dentry = data->new_dentry;
> -	struct inode *old_inode = d_inode(old_dentry);
> -	struct inode *new_inode = d_inode(new_dentry);
> -
> -	nfs_mark_for_revalidate(old_inode);
> -
> -	switch (task->tk_status) {
> -	case 0:
> -		if (new_inode != NULL)
> -			nfs_drop_nlink(new_inode);
> -		d_move(old_dentry, new_dentry);
> -		nfs_set_verifier(new_dentry,
> -					nfs_save_change_attribute(data->new_dir));
> -		break;
> -	case -ENOENT:
> -		nfs_dentry_handle_enoent(old_dentry);
> -	}
> -}
> -
>  /*
>   * RENAME
>   * FIXME: Some nfsds, like the Linux user space nfsd, may generate a
> @@ -1999,7 +1976,7 @@ int nfs_rename(struct inode *old_dir, struct dentry *old_dentry,
>  {
>  	struct inode *old_inode = d_inode(old_dentry);
>  	struct inode *new_inode = d_inode(new_dentry);
> -	struct dentry *dentry = NULL;
> +	struct dentry *dentry = NULL, *rehash = NULL;
>  	struct rpc_task *task;
>  	int error = -EBUSY;
>  
> @@ -2022,8 +1999,10 @@ int nfs_rename(struct inode *old_dir, struct dentry *old_dentry,
>  		 * To prevent any new references to the target during the
>  		 * rename, we unhash the dentry in advance.
>  		 */
> -		if (!d_unhashed(new_dentry))
> +		if (!d_unhashed(new_dentry)) {
>  			d_drop(new_dentry);
> +			rehash = new_dentry;
> +		}
>  
>  		if (d_count(new_dentry) > 2) {
>  			int err;
> @@ -2040,6 +2019,7 @@ int nfs_rename(struct inode *old_dir, struct dentry *old_dentry,
>  				goto out;
>  
>  			new_dentry = dentry;
> +			rehash = NULL;
>  			new_inode = NULL;
>  		}
>  	}
> @@ -2048,8 +2028,7 @@ int nfs_rename(struct inode *old_dir, struct dentry *old_dentry,
>  	if (new_inode != NULL)
>  		NFS_PROTO(new_inode)->return_delegation(new_inode);
>  
> -	task = nfs_async_rename(old_dir, new_dir, old_dentry, new_dentry,
> -					nfs_complete_rename);
> +	task = nfs_async_rename(old_dir, new_dir, old_dentry, new_dentry, NULL);
>  	if (IS_ERR(task)) {
>  		error = PTR_ERR(task);
>  		goto out;
> @@ -2059,9 +2038,21 @@ int nfs_rename(struct inode *old_dir, struct dentry *old_dentry,
>  	if (error == 0)
>  		error = task->tk_status;
>  	rpc_put_task(task);
> +	nfs_mark_for_revalidate(old_inode);
>  out:
> +	if (rehash)
> +		d_rehash(rehash);
>  	trace_nfs_rename_exit(old_dir, old_dentry,
>  			new_dir, new_dentry, error);
> +	if (!error) {
> +		if (new_inode != NULL)
> +			nfs_drop_nlink(new_inode);
> +		d_move(old_dentry, new_dentry);
> +		nfs_set_verifier(new_dentry,
> +					nfs_save_change_attribute(new_dir));
> +	} else if (error == -ENOENT)
> +		nfs_dentry_handle_enoent(old_dentry);
> +
>  	/* new dentry created? */
>  	if (dentry)
>  		dput(dentry);

Reviewed-by: Jeff Layton <jlayton@redhat.com>

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 2/2] NFS: nfs_rename() - revalidate directories on -ERESTARTSYS
  2017-06-15 16:13 ` [PATCH 2/2] NFS: nfs_rename() - revalidate directories on -ERESTARTSYS Benjamin Coddington
@ 2017-06-15 18:18   ` Jeff Layton
  2017-06-15 19:06     ` Jeff Layton
  0 siblings, 1 reply; 10+ messages in thread
From: Jeff Layton @ 2017-06-15 18:18 UTC (permalink / raw)
  To: Benjamin Coddington, Trond Myklebust, Anna Schumaker
  Cc: mszeredi, bfields, linux-nfs

On Thu, 2017-06-15 at 12:13 -0400, Benjamin Coddington wrote:
> An interrupted rename will leave the old dentry behind if the rename
> succeeds.  Fix this by forcing a lookup the next time through
> ->d_revalidate.
> 
> A previous attempt at solving this problem took the approach to complete
> the work of the rename asynchronously, however that approach was wrong
> since it would allow the d_move() to occur after the directory's i_mutex
> had been dropped by the original process.
> 
> Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
> ---
>  fs/nfs/dir.c            | 2 ++
>  fs/nfs/unlink.c         | 7 +++++++
>  include/linux/nfs_xdr.h | 1 +
>  3 files changed, 10 insertions(+)
> 
> diff --git a/fs/nfs/dir.c b/fs/nfs/dir.c
> index 1faf337b316f..bb697e5c3f6c 100644
> --- a/fs/nfs/dir.c
> +++ b/fs/nfs/dir.c
> @@ -2037,6 +2037,8 @@ int nfs_rename(struct inode *old_dir, struct dentry *old_dentry,
>  	error = rpc_wait_for_completion_task(task);
>  	if (error == 0)
>  		error = task->tk_status;
> +	else
> +		((struct nfs_renamedata *)task->tk_calldata)->cancelled = 1;

This looks a bit racy. You could end up setting cancelled just after the
reply comes in and the completion callback checks it. I think that you
probably either want to do this with an atomic operation or under a lock
of some sort.

You could probably do it with an xchg() operation?

>  	rpc_put_task(task);
>  	nfs_mark_for_revalidate(old_inode);
>  out:
> diff --git a/fs/nfs/unlink.c b/fs/nfs/unlink.c
> index 191aa577dd1f..b47158a34879 100644
> --- a/fs/nfs/unlink.c
> +++ b/fs/nfs/unlink.c
> @@ -288,6 +288,13 @@ static void nfs_async_rename_release(void *calldata)
>  	if (d_really_is_positive(data->old_dentry))
>  		nfs_mark_for_revalidate(d_inode(data->old_dentry));
>  
> +	/* The result of the rename is unknown. Play it safe by
> +	 * forcing a new lookup */
> +	if (data->cancelled) {
> +		nfs_force_lookup_revalidate(data->old_dir);
> +		nfs_force_lookup_revalidate(data->new_dir);
> +	}
> +
>  	dput(data->old_dentry);
>  	dput(data->new_dentry);
>  	iput(data->old_dir);
> diff --git a/include/linux/nfs_xdr.h b/include/linux/nfs_xdr.h
> index b28c83475ee8..2a8406b4b353 100644
> --- a/include/linux/nfs_xdr.h
> +++ b/include/linux/nfs_xdr.h
> @@ -1533,6 +1533,7 @@ struct nfs_renamedata {
>  	struct nfs_fattr	new_fattr;
>  	void (*complete)(struct rpc_task *, struct nfs_renamedata *);
>  	long timeout;
> +	int cancelled;

No need for a whole int for a flag and these do get allocated. Make it a
bool?

>  };
>  
>  struct nfs_access_entry;

-- 
Jeff Layton <jlayton@redhat.com>

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 2/2] NFS: nfs_rename() - revalidate directories on -ERESTARTSYS
  2017-06-15 18:18   ` Jeff Layton
@ 2017-06-15 19:06     ` Jeff Layton
  2017-06-15 20:19       ` Benjamin Coddington
  0 siblings, 1 reply; 10+ messages in thread
From: Jeff Layton @ 2017-06-15 19:06 UTC (permalink / raw)
  To: Benjamin Coddington, Trond Myklebust, Anna Schumaker
  Cc: mszeredi, bfields, linux-nfs

On Thu, 2017-06-15 at 14:18 -0400, Jeff Layton wrote:
> On Thu, 2017-06-15 at 12:13 -0400, Benjamin Coddington wrote:
> > An interrupted rename will leave the old dentry behind if the rename
> > succeeds.  Fix this by forcing a lookup the next time through
> > ->d_revalidate.
> > 
> > A previous attempt at solving this problem took the approach to complete
> > the work of the rename asynchronously, however that approach was wrong
> > since it would allow the d_move() to occur after the directory's i_mutex
> > had been dropped by the original process.
> > 
> > Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
> > ---
> >  fs/nfs/dir.c            | 2 ++
> >  fs/nfs/unlink.c         | 7 +++++++
> >  include/linux/nfs_xdr.h | 1 +
> >  3 files changed, 10 insertions(+)
> > 
> > diff --git a/fs/nfs/dir.c b/fs/nfs/dir.c
> > index 1faf337b316f..bb697e5c3f6c 100644
> > --- a/fs/nfs/dir.c
> > +++ b/fs/nfs/dir.c
> > @@ -2037,6 +2037,8 @@ int nfs_rename(struct inode *old_dir, struct dentry *old_dentry,
> >  	error = rpc_wait_for_completion_task(task);
> >  	if (error == 0)
> >  		error = task->tk_status;
> > +	else
> > +		((struct nfs_renamedata *)task->tk_calldata)->cancelled = 1;
> 
> This looks a bit racy. You could end up setting cancelled just after the
> reply comes in and the completion callback checks it. I think that you
> probably either want to do this with an atomic operation or under a lock
> of some sort.
> 
> You could probably do it with an xchg() operation?
> 

As Ben pointed out on IRC, that flag is checked in rpc_release, so as
long as we ensure that it's set while we hold a task reference we should
be fine here without any locking.

That said, do we need a barrier here? We do need to ensure that
cancelled is set before the rpc_put_task occurs.

> >  	rpc_put_task(task);
> >  	nfs_mark_for_revalidate(old_inode);
> >  out:
> > diff --git a/fs/nfs/unlink.c b/fs/nfs/unlink.c
> > index 191aa577dd1f..b47158a34879 100644
> > --- a/fs/nfs/unlink.c
> > +++ b/fs/nfs/unlink.c
> > @@ -288,6 +288,13 @@ static void nfs_async_rename_release(void *calldata)
> >  	if (d_really_is_positive(data->old_dentry))
> >  		nfs_mark_for_revalidate(d_inode(data->old_dentry));
> >  
> > +	/* The result of the rename is unknown. Play it safe by
> > +	 * forcing a new lookup */
> > +	if (data->cancelled) {
> > +		nfs_force_lookup_revalidate(data->old_dir);
> > +		nfs_force_lookup_revalidate(data->new_dir);
> > +	}
> > +
> >  	dput(data->old_dentry);
> >  	dput(data->new_dentry);
> >  	iput(data->old_dir);
> > diff --git a/include/linux/nfs_xdr.h b/include/linux/nfs_xdr.h
> > index b28c83475ee8..2a8406b4b353 100644
> > --- a/include/linux/nfs_xdr.h
> > +++ b/include/linux/nfs_xdr.h
> > @@ -1533,6 +1533,7 @@ struct nfs_renamedata {
> >  	struct nfs_fattr	new_fattr;
> >  	void (*complete)(struct rpc_task *, struct nfs_renamedata *);
> >  	long timeout;
> > +	int cancelled;
> 
> No need for a whole int for a flag and these do get allocated. Make it a
> bool?
> 
> >  };
> >  
> >  struct nfs_access_entry;
> 
> 

-- 
Jeff Layton <jlayton@poochiereds.net>

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 2/2] NFS: nfs_rename() - revalidate directories on -ERESTARTSYS
  2017-06-15 19:06     ` Jeff Layton
@ 2017-06-15 20:19       ` Benjamin Coddington
  2017-06-15 20:34         ` Jeff Layton
  0 siblings, 1 reply; 10+ messages in thread
From: Benjamin Coddington @ 2017-06-15 20:19 UTC (permalink / raw)
  To: Jeff Layton; +Cc: Trond Myklebust, Anna Schumaker, mszeredi, bfields, linux-nfs


On 15 Jun 2017, at 15:06, Jeff Layton wrote:

> On Thu, 2017-06-15 at 14:18 -0400, Jeff Layton wrote:
>> On Thu, 2017-06-15 at 12:13 -0400, Benjamin Coddington wrote:
>>> An interrupted rename will leave the old dentry behind if the rename
>>> succeeds.  Fix this by forcing a lookup the next time through
>>> ->d_revalidate.
>>>
>>> A previous attempt at solving this problem took the approach to 
>>> complete
>>> the work of the rename asynchronously, however that approach was 
>>> wrong
>>> since it would allow the d_move() to occur after the directory's 
>>> i_mutex
>>> had been dropped by the original process.
>>>
>>> Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
>>> ---
>>>  fs/nfs/dir.c            | 2 ++
>>>  fs/nfs/unlink.c         | 7 +++++++
>>>  include/linux/nfs_xdr.h | 1 +
>>>  3 files changed, 10 insertions(+)
>>>
>>> diff --git a/fs/nfs/dir.c b/fs/nfs/dir.c
>>> index 1faf337b316f..bb697e5c3f6c 100644
>>> --- a/fs/nfs/dir.c
>>> +++ b/fs/nfs/dir.c
>>> @@ -2037,6 +2037,8 @@ int nfs_rename(struct inode *old_dir, struct 
>>> dentry *old_dentry,
>>>  	error = rpc_wait_for_completion_task(task);
>>>  	if (error == 0)
>>>  		error = task->tk_status;
>>> +	else
>>> +		((struct nfs_renamedata *)task->tk_calldata)->cancelled = 1;
>>
>> This looks a bit racy. You could end up setting cancelled just after 
>> the
>> reply comes in and the completion callback checks it. I think that 
>> you
>> probably either want to do this with an atomic operation or under a 
>> lock
>> of some sort.
>>
>> You could probably do it with an xchg() operation?
>>
>
> As Ben pointed out on IRC, that flag is checked in rpc_release, so as
> long as we ensure that it's set while we hold a task reference we 
> should
> be fine here without any locking.
>
> That said, do we need a barrier here? We do need to ensure that
> cancelled is set before the rpc_put_task occurs.

Yes, I think we probably do.


>>>  	rpc_put_task(task);
>>>  	nfs_mark_for_revalidate(old_inode);
>>>  out:
>>> diff --git a/fs/nfs/unlink.c b/fs/nfs/unlink.c
>>> index 191aa577dd1f..b47158a34879 100644
>>> --- a/fs/nfs/unlink.c
>>> +++ b/fs/nfs/unlink.c
>>> @@ -288,6 +288,13 @@ static void nfs_async_rename_release(void 
>>> *calldata)
>>>  	if (d_really_is_positive(data->old_dentry))
>>>  		nfs_mark_for_revalidate(d_inode(data->old_dentry));
>>>
>>> +	/* The result of the rename is unknown. Play it safe by
>>> +	 * forcing a new lookup */
>>> +	if (data->cancelled) {
>>> +		nfs_force_lookup_revalidate(data->old_dir);
>>> +		nfs_force_lookup_revalidate(data->new_dir);

Jeff's pointed out on IRC that we should hold the i_lock to call
nfs_force_lookup_revalidate(), so I'll add that.


>>> +	}
>>> +
>>>  	dput(data->old_dentry);
>>>  	dput(data->new_dentry);
>>>  	iput(data->old_dir);
>>> diff --git a/include/linux/nfs_xdr.h b/include/linux/nfs_xdr.h
>>> index b28c83475ee8..2a8406b4b353 100644
>>> --- a/include/linux/nfs_xdr.h
>>> +++ b/include/linux/nfs_xdr.h
>>> @@ -1533,6 +1533,7 @@ struct nfs_renamedata {
>>>  	struct nfs_fattr	new_fattr;
>>>  	void (*complete)(struct rpc_task *, struct nfs_renamedata *);
>>>  	long timeout;
>>> +	int cancelled;
>>
>> No need for a whole int for a flag and these do get allocated. Make 
>> it a
>> bool?

or

unsigned int : 1

which seems to be often used -- see nfs4_opendata.  The cancelled flag 
could
be changed there as well I suppose.

Ben

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 2/2] NFS: nfs_rename() - revalidate directories on -ERESTARTSYS
  2017-06-15 20:19       ` Benjamin Coddington
@ 2017-06-15 20:34         ` Jeff Layton
  2017-06-15 20:57           ` Benjamin Coddington
  0 siblings, 1 reply; 10+ messages in thread
From: Jeff Layton @ 2017-06-15 20:34 UTC (permalink / raw)
  To: Benjamin Coddington
  Cc: Trond Myklebust, Anna Schumaker, mszeredi, bfields, linux-nfs

On Thu, 2017-06-15 at 16:19 -0400, Benjamin Coddington wrote:
> On 15 Jun 2017, at 15:06, Jeff Layton wrote:
> 
> > On Thu, 2017-06-15 at 14:18 -0400, Jeff Layton wrote:
> > > On Thu, 2017-06-15 at 12:13 -0400, Benjamin Coddington wrote:
> > > > An interrupted rename will leave the old dentry behind if the rename
> > > > succeeds.  Fix this by forcing a lookup the next time through
> > > > ->d_revalidate.
> > > > 
> > > > A previous attempt at solving this problem took the approach to 
> > > > complete
> > > > the work of the rename asynchronously, however that approach was 
> > > > wrong
> > > > since it would allow the d_move() to occur after the directory's 
> > > > i_mutex
> > > > had been dropped by the original process.
> > > > 
> > > > Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
> > > > ---
> > > >  fs/nfs/dir.c            | 2 ++
> > > >  fs/nfs/unlink.c         | 7 +++++++
> > > >  include/linux/nfs_xdr.h | 1 +
> > > >  3 files changed, 10 insertions(+)
> > > > 
> > > > diff --git a/fs/nfs/dir.c b/fs/nfs/dir.c
> > > > index 1faf337b316f..bb697e5c3f6c 100644
> > > > --- a/fs/nfs/dir.c
> > > > +++ b/fs/nfs/dir.c
> > > > @@ -2037,6 +2037,8 @@ int nfs_rename(struct inode *old_dir, struct 
> > > > dentry *old_dentry,
> > > >  	error = rpc_wait_for_completion_task(task);
> > > >  	if (error == 0)
> > > >  		error = task->tk_status;
> > > > +	else
> > > > +		((struct nfs_renamedata *)task->tk_calldata)->cancelled = 1;
> > > 
> > > This looks a bit racy. You could end up setting cancelled just after 
> > > the
> > > reply comes in and the completion callback checks it. I think that 
> > > you
> > > probably either want to do this with an atomic operation or under a 
> > > lock
> > > of some sort.
> > > 
> > > You could probably do it with an xchg() operation?
> > > 
> > 
> > As Ben pointed out on IRC, that flag is checked in rpc_release, so as
> > long as we ensure that it's set while we hold a task reference we 
> > should
> > be fine here without any locking.
> > 
> > That said, do we need a barrier here? We do need to ensure that
> > cancelled is set before the rpc_put_task occurs.
> 
> Yes, I think we probably do.
> 

Yeah, I think a smp_wmb() there, paired with the implied barrier in the
atomic_dec_and_test in rpc_put_task?


> 
> > > >  	rpc_put_task(task);
> > > >  	nfs_mark_for_revalidate(old_inode);
> > > >  out:
> > > > diff --git a/fs/nfs/unlink.c b/fs/nfs/unlink.c
> > > > index 191aa577dd1f..b47158a34879 100644
> > > > --- a/fs/nfs/unlink.c
> > > > +++ b/fs/nfs/unlink.c
> > > > @@ -288,6 +288,13 @@ static void nfs_async_rename_release(void 
> > > > *calldata)
> > > >  	if (d_really_is_positive(data->old_dentry))
> > > >  		nfs_mark_for_revalidate(d_inode(data->old_dentry));
> > > > 
> > > > +	/* The result of the rename is unknown. Play it safe by
> > > > +	 * forcing a new lookup */
> > > > +	if (data->cancelled) {
> > > > +		nfs_force_lookup_revalidate(data->old_dir);
> > > > +		nfs_force_lookup_revalidate(data->new_dir);
> 
> Jeff's pointed out on IRC that we should hold the i_lock to call
> nfs_force_lookup_revalidate(), so I'll add that.
> 
> 
> > > > +	}
> > > > +
> > > >  	dput(data->old_dentry);
> > > >  	dput(data->new_dentry);
> > > >  	iput(data->old_dir);
> > > > diff --git a/include/linux/nfs_xdr.h b/include/linux/nfs_xdr.h
> > > > index b28c83475ee8..2a8406b4b353 100644
> > > > --- a/include/linux/nfs_xdr.h
> > > > +++ b/include/linux/nfs_xdr.h
> > > > @@ -1533,6 +1533,7 @@ struct nfs_renamedata {
> > > >  	struct nfs_fattr	new_fattr;
> > > >  	void (*complete)(struct rpc_task *, struct nfs_renamedata *);
> > > >  	long timeout;
> > > > +	int cancelled;
> > > 
> > > No need for a whole int for a flag and these do get allocated. Make 
> > > it a
> > > bool?
> 
> or
> 
> unsigned int : 1
> 
> which seems to be often used -- see nfs4_opendata.  The cancelled flag 
> could
> be changed there as well I suppose.

I'd prefer a bool, but it's really up to Trond and Anna, I suppose.

-- 
Jeff Layton <jlayton@redhat.com>

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 2/2] NFS: nfs_rename() - revalidate directories on -ERESTARTSYS
  2017-06-15 20:34         ` Jeff Layton
@ 2017-06-15 20:57           ` Benjamin Coddington
  2017-06-15 21:06             ` Anna Schumaker
  0 siblings, 1 reply; 10+ messages in thread
From: Benjamin Coddington @ 2017-06-15 20:57 UTC (permalink / raw)
  To: Jeff Layton; +Cc: Trond Myklebust, Anna Schumaker, mszeredi, bfields, linux-nfs

On 15 Jun 2017, at 16:34, Jeff Layton wrote:

> Yeah, I think a smp_wmb() there, paired with the implied barrier in the
> atomic_dec_and_test in rpc_put_task?

Yes, that should do it.

>>>> No need for a whole int for a flag and these do get allocated. Make
>>>> it a
>>>> bool?
>>
>> or
>>
>> unsigned int : 1
>>
>> which seems to be often used -- see nfs4_opendata.  The cancelled flag
>> could
>> be changed there as well I suppose.
>
> I'd prefer a bool, but it's really up to Trond and Anna, I suppose.

If Anna or Trond will tell us how they'd like it, I can follow up with a
patch to make them all consistent.

Ben

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 2/2] NFS: nfs_rename() - revalidate directories on -ERESTARTSYS
  2017-06-15 20:57           ` Benjamin Coddington
@ 2017-06-15 21:06             ` Anna Schumaker
  0 siblings, 0 replies; 10+ messages in thread
From: Anna Schumaker @ 2017-06-15 21:06 UTC (permalink / raw)
  To: Benjamin Coddington, Jeff Layton
  Cc: Trond Myklebust, Anna Schumaker, mszeredi, bfields, linux-nfs



On 06/15/2017 04:57 PM, Benjamin Coddington wrote:
> On 15 Jun 2017, at 16:34, Jeff Layton wrote:
> 
>> Yeah, I think a smp_wmb() there, paired with the implied barrier in the
>> atomic_dec_and_test in rpc_put_task?
> 
> Yes, that should do it.
> 
>>>>> No need for a whole int for a flag and these do get allocated. Make
>>>>> it a
>>>>> bool?
>>>
>>> or
>>>
>>> unsigned int : 1
>>>
>>> which seems to be often used -- see nfs4_opendata.  The cancelled flag
>>> could
>>> be changed there as well I suppose.
>>
>> I'd prefer a bool, but it's really up to Trond and Anna, I suppose.
> 
> If Anna or Trond will tell us how they'd like it, I can follow up with a
> patch to make them all consistent.

My preference is for a bool

Anna

> 
> Ben
> --
> To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2017-06-15 21:06 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-06-15 16:13 [PATCH 0/2] nfs_complete_rename() calls d_move() without i_mutex Benjamin Coddington
2017-06-15 16:13 ` [PATCH 1/2] Revert "NFS: nfs_rename() handle -ERESTARTSYS dentry left behind" Benjamin Coddington
2017-06-15 18:16   ` Jeff Layton
2017-06-15 16:13 ` [PATCH 2/2] NFS: nfs_rename() - revalidate directories on -ERESTARTSYS Benjamin Coddington
2017-06-15 18:18   ` Jeff Layton
2017-06-15 19:06     ` Jeff Layton
2017-06-15 20:19       ` Benjamin Coddington
2017-06-15 20:34         ` Jeff Layton
2017-06-15 20:57           ` Benjamin Coddington
2017-06-15 21:06             ` Anna Schumaker

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.