All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/8] simplify reconnecting dentries looked up by filehandle (v2)
@ 2013-10-25 20:29 ` J. Bruce Fields
  0 siblings, 0 replies; 23+ messages in thread
From: J. Bruce Fields @ 2013-10-25 20:29 UTC (permalink / raw)
  To: linux-fsdevel-u79uwXL29TY76Z2rM5mHXA
  Cc: Christoph Hellwig, Al Viro, linux-nfs-u79uwXL29TY76Z2rM5mHXA,
	J. Bruce Fields

From: "J. Bruce Fields" <bfields-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>

This is a revised version of patches to simplify the code that
reconnects directories looked up by filehandle.  Thanks to Christoph for
review.  Changes since v1 include:

	- BUG_ON finding a filesystem root with DCACHE_DISCONNECTED
	- fix some dentry reference leaks
	- only do the dentry_connected() check when we actually hit a
	  race--it's redundant in the normal case.
	- simplify some of the logic
	- add detail to changelogs and comments
	- sequence the patches slightly differently (hopefully it's
	  clearer)

Christoph, I credited one minor patch to you, and I ignored several
Reviewed-by's you posted since I wasn't sure if you'd want them on the
revised patches.

The following explanation is mostly unchanged from before, including the
performance results.  (I didn't redo them, but a couple quick checks
didn't suggest any significant difference from the revisions.)

--b.

When we lookup a filehandle for a directory not already in the dentry
cache, fs/exportfs/expfs.c:reconnect_path() is what populates the dentry
cache with the directory and its parents.

The current code is more complicated than necessary:

	- after looking up the parent, it searches the parent directory
	  for our target dentry.  If that search fails with -ENOENT, it
	  retries up to 10 times.  I don't understand why.  This should
	  only happen if there's a concurrent rename or rmdir, in which
	  case that concurrent operation should have reconnected the
	  dentry for us, and we're done.

	- each time it succesfully connects a dentry to its parent, it
	  restarts from scratch with the original dentry.  Why not
	  continue working with the parent we just found?

Patches showing the resulting simplification follow.

I tested performance with a script that creates an N-deep directory
tree, gets a filehandle for the bottom directory, writes 2 to
/proc/sys/vm/drop_caches, then times an open_by_handle_at() of the
filehandle.  Code at

	git://linux-nfs.org/~bfields/fhtests.git

For directories of various depths, some example observed times (median
results of 3 similar runs, in seconds), were:

		depth:	8000	2000	200
	no patches:	11	0.7	0.02
	all patches:	 0.1	0.03	0.01

For depths < 2000 I used an ugly hack to shrink_slab_node() to force
drop_caches to free more dentries.  Difference look lost in the noise
for much smaller depths.

Nesting that deep sounds crazy to me, and cold lookup of a
filehandle isn't the common case.  But maybe it's worth not having to
worry about the awful performance in these pathalogical cases.  And it
does simplify the code.

Christoph Hellwig (1):
  exportfs: BUG_ON in crazy corner case

J. Bruce Fields (7):
  exportfs: more detailed comment for path_reconnect
  exportfs: clear DISCONNECTED on all parents sooner
  exportfs: stop retrying once we race with rename/remove
  exportfs: eliminate unused "noprogress" counter
  exportfs: move most of reconnect_path to helper function
  exportfs: better variable name
  exportfs: fix quadratic behavior in filehandle lookup

 fs/exportfs/expfs.c |  249 +++++++++++++++++++++++++++------------------------
 1 file changed, 133 insertions(+), 116 deletions(-)

-- 
1.7.9.5

--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 23+ messages in thread

* [PATCH 0/8] simplify reconnecting dentries looked up by filehandle (v2)
@ 2013-10-25 20:29 ` J. Bruce Fields
  0 siblings, 0 replies; 23+ messages in thread
From: J. Bruce Fields @ 2013-10-25 20:29 UTC (permalink / raw)
  To: linux-fsdevel; +Cc: Christoph Hellwig, Al Viro, linux-nfs, J. Bruce Fields

From: "J. Bruce Fields" <bfields@redhat.com>

This is a revised version of patches to simplify the code that
reconnects directories looked up by filehandle.  Thanks to Christoph for
review.  Changes since v1 include:

	- BUG_ON finding a filesystem root with DCACHE_DISCONNECTED
	- fix some dentry reference leaks
	- only do the dentry_connected() check when we actually hit a
	  race--it's redundant in the normal case.
	- simplify some of the logic
	- add detail to changelogs and comments
	- sequence the patches slightly differently (hopefully it's
	  clearer)

Christoph, I credited one minor patch to you, and I ignored several
Reviewed-by's you posted since I wasn't sure if you'd want them on the
revised patches.

The following explanation is mostly unchanged from before, including the
performance results.  (I didn't redo them, but a couple quick checks
didn't suggest any significant difference from the revisions.)

--b.

When we lookup a filehandle for a directory not already in the dentry
cache, fs/exportfs/expfs.c:reconnect_path() is what populates the dentry
cache with the directory and its parents.

The current code is more complicated than necessary:

	- after looking up the parent, it searches the parent directory
	  for our target dentry.  If that search fails with -ENOENT, it
	  retries up to 10 times.  I don't understand why.  This should
	  only happen if there's a concurrent rename or rmdir, in which
	  case that concurrent operation should have reconnected the
	  dentry for us, and we're done.

	- each time it succesfully connects a dentry to its parent, it
	  restarts from scratch with the original dentry.  Why not
	  continue working with the parent we just found?

Patches showing the resulting simplification follow.

I tested performance with a script that creates an N-deep directory
tree, gets a filehandle for the bottom directory, writes 2 to
/proc/sys/vm/drop_caches, then times an open_by_handle_at() of the
filehandle.  Code at

	git://linux-nfs.org/~bfields/fhtests.git

For directories of various depths, some example observed times (median
results of 3 similar runs, in seconds), were:

		depth:	8000	2000	200
	no patches:	11	0.7	0.02
	all patches:	 0.1	0.03	0.01

For depths < 2000 I used an ugly hack to shrink_slab_node() to force
drop_caches to free more dentries.  Difference look lost in the noise
for much smaller depths.

Nesting that deep sounds crazy to me, and cold lookup of a
filehandle isn't the common case.  But maybe it's worth not having to
worry about the awful performance in these pathalogical cases.  And it
does simplify the code.

Christoph Hellwig (1):
  exportfs: BUG_ON in crazy corner case

J. Bruce Fields (7):
  exportfs: more detailed comment for path_reconnect
  exportfs: clear DISCONNECTED on all parents sooner
  exportfs: stop retrying once we race with rename/remove
  exportfs: eliminate unused "noprogress" counter
  exportfs: move most of reconnect_path to helper function
  exportfs: better variable name
  exportfs: fix quadratic behavior in filehandle lookup

 fs/exportfs/expfs.c |  249 +++++++++++++++++++++++++++------------------------
 1 file changed, 133 insertions(+), 116 deletions(-)

-- 
1.7.9.5


^ permalink raw reply	[flat|nested] 23+ messages in thread

* [PATCH 1/8] exportfs: BUG_ON in crazy corner case
  2013-10-25 20:29 ` J. Bruce Fields
  (?)
@ 2013-10-25 20:29 ` J. Bruce Fields
  -1 siblings, 0 replies; 23+ messages in thread
From: J. Bruce Fields @ 2013-10-25 20:29 UTC (permalink / raw)
  To: linux-fsdevel
  Cc: Christoph Hellwig, Al Viro, linux-nfs, Christoph Hellwig,
	J. Bruce Fields

From: Christoph Hellwig <hch@lst.de>

This would indicate a nasty bug in the dcache and has never triggered in
the past 10 years as far as I know.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
---
 fs/exportfs/expfs.c |    8 ++------
 1 file changed, 2 insertions(+), 6 deletions(-)

diff --git a/fs/exportfs/expfs.c b/fs/exportfs/expfs.c
index c43fe9b..6d0a7fa 100644
--- a/fs/exportfs/expfs.c
+++ b/fs/exportfs/expfs.c
@@ -112,18 +112,14 @@ reconnect_path(struct vfsmount *mnt, struct dentry *target_dir, char *nbuf)
 	while (target_dir->d_flags & DCACHE_DISCONNECTED && noprogress++ < 10) {
 		struct dentry *pd = find_disconnected_root(target_dir);
 
+		BUG_ON(pd == mnt->mnt_sb->s_root);
+
 		if (!IS_ROOT(pd)) {
 			/* must have found a connected parent - great */
 			spin_lock(&pd->d_lock);
 			pd->d_flags &= ~DCACHE_DISCONNECTED;
 			spin_unlock(&pd->d_lock);
 			noprogress = 0;
-		} else if (pd == mnt->mnt_sb->s_root) {
-			printk(KERN_ERR "export: Eeek filesystem root is not connected, impossible\n");
-			spin_lock(&pd->d_lock);
-			pd->d_flags &= ~DCACHE_DISCONNECTED;
-			spin_unlock(&pd->d_lock);
-			noprogress = 0;
 		} else {
 			/*
 			 * We have hit the top of a disconnected path, try to
-- 
1.7.9.5


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH 2/8] exportfs: more detailed comment for path_reconnect
  2013-10-25 20:29 ` J. Bruce Fields
@ 2013-10-25 20:29     ` J. Bruce Fields
  -1 siblings, 0 replies; 23+ messages in thread
From: J. Bruce Fields @ 2013-10-25 20:29 UTC (permalink / raw)
  To: linux-fsdevel-u79uwXL29TY76Z2rM5mHXA
  Cc: Christoph Hellwig, Al Viro, linux-nfs-u79uwXL29TY76Z2rM5mHXA,
	J. Bruce Fields

From: "J. Bruce Fields" <bfields-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>

Signed-off-by: J. Bruce Fields <bfields-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
---
 fs/exportfs/expfs.c |   14 +++++++++++++-
 1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/fs/exportfs/expfs.c b/fs/exportfs/expfs.c
index 6d0a7fa..87e6dca 100644
--- a/fs/exportfs/expfs.c
+++ b/fs/exportfs/expfs.c
@@ -93,7 +93,19 @@ find_disconnected_root(struct dentry *dentry)
 /*
  * Make sure target_dir is fully connected to the dentry tree.
  *
- * It may already be, as the flag isn't always updated when connection happens.
+ * On successful return, DCACHE_DISCONNECTED will be cleared on
+ * target_dir, and target_dir->d_parent->...->d_parent will reach the
+ * root of the filesystem.
+ *
+ * Whenever DCACHE_DISCONNECTED is unset, target_dir is fully connected.
+ * But the converse is not true: target_dir may have DCACHE_DISCONNECTED
+ * set but already be connected.  In that case we'll verify the
+ * connection to root and then clear the flag.
+ *
+ * Note that target_dir could be removed by a concurrent operation.  In
+ * that case reconnect_path may still succeed with target_dir fully
+ * connected, but further operations using the filehandle will fail when
+ * necessary (due to S_DEAD being set on the directory).
  */
 static int
 reconnect_path(struct vfsmount *mnt, struct dentry *target_dir, char *nbuf)
-- 
1.7.9.5

--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH 2/8] exportfs: more detailed comment for path_reconnect
@ 2013-10-25 20:29     ` J. Bruce Fields
  0 siblings, 0 replies; 23+ messages in thread
From: J. Bruce Fields @ 2013-10-25 20:29 UTC (permalink / raw)
  To: linux-fsdevel; +Cc: Christoph Hellwig, Al Viro, linux-nfs, J. Bruce Fields

From: "J. Bruce Fields" <bfields@redhat.com>

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
---
 fs/exportfs/expfs.c |   14 +++++++++++++-
 1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/fs/exportfs/expfs.c b/fs/exportfs/expfs.c
index 6d0a7fa..87e6dca 100644
--- a/fs/exportfs/expfs.c
+++ b/fs/exportfs/expfs.c
@@ -93,7 +93,19 @@ find_disconnected_root(struct dentry *dentry)
 /*
  * Make sure target_dir is fully connected to the dentry tree.
  *
- * It may already be, as the flag isn't always updated when connection happens.
+ * On successful return, DCACHE_DISCONNECTED will be cleared on
+ * target_dir, and target_dir->d_parent->...->d_parent will reach the
+ * root of the filesystem.
+ *
+ * Whenever DCACHE_DISCONNECTED is unset, target_dir is fully connected.
+ * But the converse is not true: target_dir may have DCACHE_DISCONNECTED
+ * set but already be connected.  In that case we'll verify the
+ * connection to root and then clear the flag.
+ *
+ * Note that target_dir could be removed by a concurrent operation.  In
+ * that case reconnect_path may still succeed with target_dir fully
+ * connected, but further operations using the filehandle will fail when
+ * necessary (due to S_DEAD being set on the directory).
  */
 static int
 reconnect_path(struct vfsmount *mnt, struct dentry *target_dir, char *nbuf)
-- 
1.7.9.5


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH 3/8] exportfs: clear DISCONNECTED on all parents sooner
  2013-10-25 20:29 ` J. Bruce Fields
@ 2013-10-25 20:30     ` J. Bruce Fields
  -1 siblings, 0 replies; 23+ messages in thread
From: J. Bruce Fields @ 2013-10-25 20:30 UTC (permalink / raw)
  To: linux-fsdevel-u79uwXL29TY76Z2rM5mHXA
  Cc: Christoph Hellwig, Al Viro, linux-nfs-u79uwXL29TY76Z2rM5mHXA,
	J. Bruce Fields

From: "J. Bruce Fields" <bfields-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>

Once we've found any connected parent, we know all our parents are
connected--that's true even if there's a concurrent rename.  May as well
clear them all at once and be done with it.

Reviewed-by: Cristoph Hellwig <hch-jcswGhMUV9g@public.gmane.org>
Signed-off-by: J. Bruce Fields <bfields-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
---
 fs/exportfs/expfs.c |   25 +++++++++++++++++++++----
 1 file changed, 21 insertions(+), 4 deletions(-)

diff --git a/fs/exportfs/expfs.c b/fs/exportfs/expfs.c
index 87e6dca..c65b748 100644
--- a/fs/exportfs/expfs.c
+++ b/fs/exportfs/expfs.c
@@ -90,6 +90,24 @@ find_disconnected_root(struct dentry *dentry)
 	return dentry;
 }
 
+static void clear_disconnected(struct dentry *dentry)
+{
+	dget(dentry);
+	while (dentry->d_flags & DCACHE_DISCONNECTED) {
+		struct dentry *parent = dget_parent(dentry);
+
+		WARN_ON_ONCE(IS_ROOT(dentry));
+
+		spin_lock(&dentry->d_lock);
+		dentry->d_flags &= ~DCACHE_DISCONNECTED;
+		spin_unlock(&dentry->d_lock);
+
+		dput(dentry);
+		dentry = parent;
+	}
+	dput(dentry);
+}
+
 /*
  * Make sure target_dir is fully connected to the dentry tree.
  *
@@ -128,10 +146,9 @@ reconnect_path(struct vfsmount *mnt, struct dentry *target_dir, char *nbuf)
 
 		if (!IS_ROOT(pd)) {
 			/* must have found a connected parent - great */
-			spin_lock(&pd->d_lock);
-			pd->d_flags &= ~DCACHE_DISCONNECTED;
-			spin_unlock(&pd->d_lock);
-			noprogress = 0;
+			clear_disconnected(target_dir);
+			dput(pd);
+			break;
 		} else {
 			/*
 			 * We have hit the top of a disconnected path, try to
-- 
1.7.9.5

--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH 3/8] exportfs: clear DISCONNECTED on all parents sooner
@ 2013-10-25 20:30     ` J. Bruce Fields
  0 siblings, 0 replies; 23+ messages in thread
From: J. Bruce Fields @ 2013-10-25 20:30 UTC (permalink / raw)
  To: linux-fsdevel; +Cc: Christoph Hellwig, Al Viro, linux-nfs, J. Bruce Fields

From: "J. Bruce Fields" <bfields@redhat.com>

Once we've found any connected parent, we know all our parents are
connected--that's true even if there's a concurrent rename.  May as well
clear them all at once and be done with it.

Reviewed-by: Cristoph Hellwig <hch@lst.de>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
---
 fs/exportfs/expfs.c |   25 +++++++++++++++++++++----
 1 file changed, 21 insertions(+), 4 deletions(-)

diff --git a/fs/exportfs/expfs.c b/fs/exportfs/expfs.c
index 87e6dca..c65b748 100644
--- a/fs/exportfs/expfs.c
+++ b/fs/exportfs/expfs.c
@@ -90,6 +90,24 @@ find_disconnected_root(struct dentry *dentry)
 	return dentry;
 }
 
+static void clear_disconnected(struct dentry *dentry)
+{
+	dget(dentry);
+	while (dentry->d_flags & DCACHE_DISCONNECTED) {
+		struct dentry *parent = dget_parent(dentry);
+
+		WARN_ON_ONCE(IS_ROOT(dentry));
+
+		spin_lock(&dentry->d_lock);
+		dentry->d_flags &= ~DCACHE_DISCONNECTED;
+		spin_unlock(&dentry->d_lock);
+
+		dput(dentry);
+		dentry = parent;
+	}
+	dput(dentry);
+}
+
 /*
  * Make sure target_dir is fully connected to the dentry tree.
  *
@@ -128,10 +146,9 @@ reconnect_path(struct vfsmount *mnt, struct dentry *target_dir, char *nbuf)
 
 		if (!IS_ROOT(pd)) {
 			/* must have found a connected parent - great */
-			spin_lock(&pd->d_lock);
-			pd->d_flags &= ~DCACHE_DISCONNECTED;
-			spin_unlock(&pd->d_lock);
-			noprogress = 0;
+			clear_disconnected(target_dir);
+			dput(pd);
+			break;
 		} else {
 			/*
 			 * We have hit the top of a disconnected path, try to
-- 
1.7.9.5


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH 4/8] exportfs: stop retrying once we race with rename/remove
  2013-10-25 20:29 ` J. Bruce Fields
@ 2013-10-25 20:30     ` J. Bruce Fields
  -1 siblings, 0 replies; 23+ messages in thread
From: J. Bruce Fields @ 2013-10-25 20:30 UTC (permalink / raw)
  To: linux-fsdevel-u79uwXL29TY76Z2rM5mHXA
  Cc: Christoph Hellwig, Al Viro, linux-nfs-u79uwXL29TY76Z2rM5mHXA,
	J. Bruce Fields

From: "J. Bruce Fields" <bfields-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>

There are two places here where we could race with a rename or remove:

	- We could find the parent, but then be removed or renamed away
	  from that parent directory before finding our name in that
	  directory.
	- We could find the parent, and find our name in that parent,
	  but then be renamed or removed before we look ourselves up by
	  that name in that parent.

In both cases the concurrent rename or remove will take care of
reconnecting the directory that we're currently examining.  Our target
directory should then also be connected.  Check this and clear
DISCONNECTED in these cases instead of looping around again.

Note: we *do* need to check that this actually happened if we want to be
robust in the face of corrupted filesystems: a corrupted filesystem
could just return a completely wrong parent, and we want to fail with an
error in that case before starting to clear DISCONNECTED on
non-DISCONNECTED filesystems.

Signed-off-by: J. Bruce Fields <bfields-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
---
 fs/exportfs/expfs.c |   45 ++++++++++++++++++++++++++++++++++++++++-----
 1 file changed, 40 insertions(+), 5 deletions(-)

diff --git a/fs/exportfs/expfs.c b/fs/exportfs/expfs.c
index c65b748..6b5ddd5 100644
--- a/fs/exportfs/expfs.c
+++ b/fs/exportfs/expfs.c
@@ -90,6 +90,23 @@ find_disconnected_root(struct dentry *dentry)
 	return dentry;
 }
 
+static bool dentry_connected(struct dentry *dentry)
+{
+	dget(dentry);
+	while (dentry->d_flags & DCACHE_DISCONNECTED) {
+		struct dentry *parent = dget_parent(dentry);
+
+		dput(dentry);
+		if (IS_ROOT(dentry)) {
+			dput(parent);
+			return false;
+		}
+		dentry = parent;
+	}
+	dput(dentry);
+	return true;
+}
+
 static void clear_disconnected(struct dentry *dentry)
 {
 	dget(dentry);
@@ -189,9 +206,9 @@ reconnect_path(struct vfsmount *mnt, struct dentry *target_dir, char *nbuf)
 				dput(pd);
 				if (err == -ENOENT)
 					/* some race between get_parent and
-					 * get_name?  just try again
+					 * get_name?
 					 */
-					continue;
+					goto out_reconnected;
 				break;
 			}
 			dprintk("%s: found name: %s\n", __func__, nbuf);
@@ -211,12 +228,12 @@ reconnect_path(struct vfsmount *mnt, struct dentry *target_dir, char *nbuf)
 			 * hopefully, npd == pd, though it isn't really
 			 * a problem if it isn't
 			 */
+			dput(npd);
+			dput(ppd);
 			if (npd == pd)
 				noprogress = 0;
 			else
-				printk("%s: npd != pd\n", __func__);
-			dput(npd);
-			dput(ppd);
+				goto out_reconnected;
 			if (IS_ROOT(pd)) {
 				/* something went wrong, we have to give up */
 				dput(pd);
@@ -234,6 +251,24 @@ reconnect_path(struct vfsmount *mnt, struct dentry *target_dir, char *nbuf)
 	}
 
 	return 0;
+out_reconnected:
+	/*
+	 * Someone must have renamed our entry into another parent, in
+	 * which case it has been reconnected by the rename.
+	 *
+	 * Or someone removed it entirely, in which case filehandle
+	 * lookup will succeed but the directory is now IS_DEAD and
+	 * subsequent operations on it will fail.
+	 *
+	 * Alternatively, maybe there was no race at all, and the
+	 * filesystem is just corrupt and gave us a parent that doesn't
+	 * actually contain any entry pointing to this inode.  So,
+	 * double check that this worked and return -ESTALE if not:
+	 */
+	if (!dentry_connected(target_dir))
+		return -ESTALE;
+	clear_disconnected(target_dir);
+	return 0;
 }
 
 struct getdents_callback {
-- 
1.7.9.5

--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH 4/8] exportfs: stop retrying once we race with rename/remove
@ 2013-10-25 20:30     ` J. Bruce Fields
  0 siblings, 0 replies; 23+ messages in thread
From: J. Bruce Fields @ 2013-10-25 20:30 UTC (permalink / raw)
  To: linux-fsdevel; +Cc: Christoph Hellwig, Al Viro, linux-nfs, J. Bruce Fields

From: "J. Bruce Fields" <bfields@redhat.com>

There are two places here where we could race with a rename or remove:

	- We could find the parent, but then be removed or renamed away
	  from that parent directory before finding our name in that
	  directory.
	- We could find the parent, and find our name in that parent,
	  but then be renamed or removed before we look ourselves up by
	  that name in that parent.

In both cases the concurrent rename or remove will take care of
reconnecting the directory that we're currently examining.  Our target
directory should then also be connected.  Check this and clear
DISCONNECTED in these cases instead of looping around again.

Note: we *do* need to check that this actually happened if we want to be
robust in the face of corrupted filesystems: a corrupted filesystem
could just return a completely wrong parent, and we want to fail with an
error in that case before starting to clear DISCONNECTED on
non-DISCONNECTED filesystems.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
---
 fs/exportfs/expfs.c |   45 ++++++++++++++++++++++++++++++++++++++++-----
 1 file changed, 40 insertions(+), 5 deletions(-)

diff --git a/fs/exportfs/expfs.c b/fs/exportfs/expfs.c
index c65b748..6b5ddd5 100644
--- a/fs/exportfs/expfs.c
+++ b/fs/exportfs/expfs.c
@@ -90,6 +90,23 @@ find_disconnected_root(struct dentry *dentry)
 	return dentry;
 }
 
+static bool dentry_connected(struct dentry *dentry)
+{
+	dget(dentry);
+	while (dentry->d_flags & DCACHE_DISCONNECTED) {
+		struct dentry *parent = dget_parent(dentry);
+
+		dput(dentry);
+		if (IS_ROOT(dentry)) {
+			dput(parent);
+			return false;
+		}
+		dentry = parent;
+	}
+	dput(dentry);
+	return true;
+}
+
 static void clear_disconnected(struct dentry *dentry)
 {
 	dget(dentry);
@@ -189,9 +206,9 @@ reconnect_path(struct vfsmount *mnt, struct dentry *target_dir, char *nbuf)
 				dput(pd);
 				if (err == -ENOENT)
 					/* some race between get_parent and
-					 * get_name?  just try again
+					 * get_name?
 					 */
-					continue;
+					goto out_reconnected;
 				break;
 			}
 			dprintk("%s: found name: %s\n", __func__, nbuf);
@@ -211,12 +228,12 @@ reconnect_path(struct vfsmount *mnt, struct dentry *target_dir, char *nbuf)
 			 * hopefully, npd == pd, though it isn't really
 			 * a problem if it isn't
 			 */
+			dput(npd);
+			dput(ppd);
 			if (npd == pd)
 				noprogress = 0;
 			else
-				printk("%s: npd != pd\n", __func__);
-			dput(npd);
-			dput(ppd);
+				goto out_reconnected;
 			if (IS_ROOT(pd)) {
 				/* something went wrong, we have to give up */
 				dput(pd);
@@ -234,6 +251,24 @@ reconnect_path(struct vfsmount *mnt, struct dentry *target_dir, char *nbuf)
 	}
 
 	return 0;
+out_reconnected:
+	/*
+	 * Someone must have renamed our entry into another parent, in
+	 * which case it has been reconnected by the rename.
+	 *
+	 * Or someone removed it entirely, in which case filehandle
+	 * lookup will succeed but the directory is now IS_DEAD and
+	 * subsequent operations on it will fail.
+	 *
+	 * Alternatively, maybe there was no race at all, and the
+	 * filesystem is just corrupt and gave us a parent that doesn't
+	 * actually contain any entry pointing to this inode.  So,
+	 * double check that this worked and return -ESTALE if not:
+	 */
+	if (!dentry_connected(target_dir))
+		return -ESTALE;
+	clear_disconnected(target_dir);
+	return 0;
 }
 
 struct getdents_callback {
-- 
1.7.9.5


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH 5/8] exportfs: eliminate unused "noprogress" counter
  2013-10-25 20:29 ` J. Bruce Fields
                   ` (2 preceding siblings ...)
  (?)
@ 2013-10-25 20:30 ` J. Bruce Fields
  -1 siblings, 0 replies; 23+ messages in thread
From: J. Bruce Fields @ 2013-10-25 20:30 UTC (permalink / raw)
  To: linux-fsdevel; +Cc: Christoph Hellwig, Al Viro, linux-nfs, J. Bruce Fields

From: "J. Bruce Fields" <bfields@redhat.com>

Note this counter is now being set to 0 on every pass through the loop,
so it no longer serves any useful purpose.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
---
 fs/exportfs/expfs.c |   15 ++-------------
 1 file changed, 2 insertions(+), 13 deletions(-)

diff --git a/fs/exportfs/expfs.c b/fs/exportfs/expfs.c
index 6b5ddd5..d8ba88a 100644
--- a/fs/exportfs/expfs.c
+++ b/fs/exportfs/expfs.c
@@ -145,18 +145,9 @@ static void clear_disconnected(struct dentry *dentry)
 static int
 reconnect_path(struct vfsmount *mnt, struct dentry *target_dir, char *nbuf)
 {
-	int noprogress = 0;
 	int err = -ESTALE;
 
-	/*
-	 * It is possible that a confused file system might not let us complete
-	 * the path to the root.  For example, if get_parent returns a directory
-	 * in which we cannot find a name for the child.  While this implies a
-	 * very sick filesystem we don't want it to cause knfsd to spin.  Hence
-	 * the noprogress counter.  If we go through the loop 10 times (2 is
-	 * probably enough) without getting anywhere, we just give up
-	 */
-	while (target_dir->d_flags & DCACHE_DISCONNECTED && noprogress++ < 10) {
+	while (target_dir->d_flags & DCACHE_DISCONNECTED) {
 		struct dentry *pd = find_disconnected_root(target_dir);
 
 		BUG_ON(pd == mnt->mnt_sb->s_root);
@@ -230,9 +221,7 @@ reconnect_path(struct vfsmount *mnt, struct dentry *target_dir, char *nbuf)
 			 */
 			dput(npd);
 			dput(ppd);
-			if (npd == pd)
-				noprogress = 0;
-			else
+			if (npd != pd)
 				goto out_reconnected;
 			if (IS_ROOT(pd)) {
 				/* something went wrong, we have to give up */
-- 
1.7.9.5


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH 6/8] exportfs: move most of reconnect_path to helper function
  2013-10-25 20:29 ` J. Bruce Fields
                   ` (3 preceding siblings ...)
  (?)
@ 2013-10-25 20:30 ` J. Bruce Fields
  -1 siblings, 0 replies; 23+ messages in thread
From: J. Bruce Fields @ 2013-10-25 20:30 UTC (permalink / raw)
  To: linux-fsdevel; +Cc: Christoph Hellwig, Al Viro, linux-nfs, J. Bruce Fields

From: "J. Bruce Fields" <bfields@redhat.com>

Also replace 3 easily-confused three-letter acronyms by more helpful
variable names.

Just cleanup, no change in functionality, with one exception: the
dentry_connected() check in the "out_reconnected" case will now only
check the ancestors of the current dentry instead of checking all the
way from target_dir.  Since we've already verified connectivity up to
this dentry, that should be sufficient.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
---
 fs/exportfs/expfs.c |  164 +++++++++++++++++++++++++++------------------------
 1 file changed, 86 insertions(+), 78 deletions(-)

diff --git a/fs/exportfs/expfs.c b/fs/exportfs/expfs.c
index d8ba88a..d32ead9 100644
--- a/fs/exportfs/expfs.c
+++ b/fs/exportfs/expfs.c
@@ -126,6 +126,86 @@ static void clear_disconnected(struct dentry *dentry)
 }
 
 /*
+ * Reconnect a directory dentry with its parent.
+ *
+ * This can return a dentry, or NULL, or an error.
+ *
+ * In the first case the returned dentry is the parent of the given
+ * dentry, and may itself need to be reconnected to its parent.
+ *
+ * In the NULL case, a concurrent VFS operation has either renamed or
+ * removed this directory.  The concurrent operation has reconnected our
+ * dentry, so we no longer need to.
+ */
+static struct dentry *reconnect_one(struct vfsmount *mnt,
+		struct dentry *dentry, char *nbuf)
+{
+	struct dentry *parent;
+	struct dentry *tmp;
+	int err;
+
+	parent = ERR_PTR(-EACCES);
+	mutex_lock(&dentry->d_inode->i_mutex);
+	if (mnt->mnt_sb->s_export_op->get_parent)
+		parent = mnt->mnt_sb->s_export_op->get_parent(dentry);
+	mutex_unlock(&dentry->d_inode->i_mutex);
+
+	if (IS_ERR(parent)) {
+		dprintk("%s: get_parent of %ld failed, err %d\n",
+			__func__, dentry->d_inode->i_ino, PTR_ERR(parent));
+		return parent;
+	}
+
+	dprintk("%s: find name of %lu in %lu\n", __func__,
+		dentry->d_inode->i_ino, parent->d_inode->i_ino);
+	err = exportfs_get_name(mnt, parent, nbuf, dentry);
+	if (err == -ENOENT)
+		goto out_reconnected;
+	if (err)
+		goto out_err;
+	dprintk("%s: found name: %s\n", __func__, nbuf);
+	mutex_lock(&parent->d_inode->i_mutex);
+	tmp = lookup_one_len(nbuf, parent, strlen(nbuf));
+	mutex_unlock(&parent->d_inode->i_mutex);
+	if (IS_ERR(tmp)) {
+		dprintk("%s: lookup failed: %d\n", __func__, PTR_ERR(tmp));
+		goto out_err;
+	}
+	if (tmp != dentry) {
+		dput(tmp);
+		goto out_reconnected;
+	}
+	dput(tmp);
+	if (IS_ROOT(dentry)) {
+		err = -ESTALE;
+		goto out_err;
+	}
+	return parent;
+
+out_err:
+	dput(parent);
+	return ERR_PTR(err);
+out_reconnected:
+	dput(parent);
+	/*
+	 * Someone must have renamed our entry into another parent, in
+	 * which case it has been reconnected by the rename.
+	 *
+	 * Or someone removed it entirely, in which case filehandle
+	 * lookup will succeed but the directory is now IS_DEAD and
+	 * subsequent operations on it will fail.
+	 *
+	 * Alternatively, maybe there was no race at all, and the
+	 * filesystem is just corrupt and gave us a parent that doesn't
+	 * actually contain any entry pointing to this inode.  So,
+	 * double check that this worked and return -ESTALE if not:
+	 */
+	if (!dentry_connected(dentry))
+		return ERR_PTR(-ESTALE);
+	return NULL;
+}
+
+/*
  * Make sure target_dir is fully connected to the dentry tree.
  *
  * On successful return, DCACHE_DISCONNECTED will be cleared on
@@ -158,76 +238,19 @@ reconnect_path(struct vfsmount *mnt, struct dentry *target_dir, char *nbuf)
 			dput(pd);
 			break;
 		} else {
+			struct dentry *parent;
 			/*
 			 * We have hit the top of a disconnected path, try to
 			 * find parent and connect.
-			 *
-			 * Racing with some other process renaming a directory
-			 * isn't much of a problem here.  If someone renames
-			 * the directory, it will end up properly connected,
-			 * which is what we want
-			 *
-			 * Getting the parent can't be supported generically,
-			 * the locking is too icky.
-			 *
-			 * Instead we just return EACCES.  If server reboots
-			 * or inodes get flushed, you lose
-			 */
-			struct dentry *ppd = ERR_PTR(-EACCES);
-			struct dentry *npd;
-
-			mutex_lock(&pd->d_inode->i_mutex);
-			if (mnt->mnt_sb->s_export_op->get_parent)
-				ppd = mnt->mnt_sb->s_export_op->get_parent(pd);
-			mutex_unlock(&pd->d_inode->i_mutex);
-
-			if (IS_ERR(ppd)) {
-				err = PTR_ERR(ppd);
-				dprintk("%s: get_parent of %ld failed, err %d\n",
-					__func__, pd->d_inode->i_ino, err);
-				dput(pd);
-				break;
-			}
-
-			dprintk("%s: find name of %lu in %lu\n", __func__,
-				pd->d_inode->i_ino, ppd->d_inode->i_ino);
-			err = exportfs_get_name(mnt, ppd, nbuf, pd);
-			if (err) {
-				dput(ppd);
-				dput(pd);
-				if (err == -ENOENT)
-					/* some race between get_parent and
-					 * get_name?
-					 */
-					goto out_reconnected;
-				break;
-			}
-			dprintk("%s: found name: %s\n", __func__, nbuf);
-			mutex_lock(&ppd->d_inode->i_mutex);
-			npd = lookup_one_len(nbuf, ppd, strlen(nbuf));
-			mutex_unlock(&ppd->d_inode->i_mutex);
-			if (IS_ERR(npd)) {
-				err = PTR_ERR(npd);
-				dprintk("%s: lookup failed: %d\n",
-					__func__, err);
-				dput(ppd);
-				dput(pd);
-				break;
-			}
-			/* we didn't really want npd, we really wanted
-			 * a side-effect of the lookup.
-			 * hopefully, npd == pd, though it isn't really
-			 * a problem if it isn't
 			 */
-			dput(npd);
-			dput(ppd);
-			if (npd != pd)
+			 parent = reconnect_one(mnt, pd, nbuf);
+			 if (!parent)
 				goto out_reconnected;
-			if (IS_ROOT(pd)) {
-				/* something went wrong, we have to give up */
-				dput(pd);
+			if (IS_ERR(parent)) {
+				err = PTR_ERR(parent);
 				break;
 			}
+			dput(parent);
 		}
 		dput(pd);
 	}
@@ -241,21 +264,6 @@ reconnect_path(struct vfsmount *mnt, struct dentry *target_dir, char *nbuf)
 
 	return 0;
 out_reconnected:
-	/*
-	 * Someone must have renamed our entry into another parent, in
-	 * which case it has been reconnected by the rename.
-	 *
-	 * Or someone removed it entirely, in which case filehandle
-	 * lookup will succeed but the directory is now IS_DEAD and
-	 * subsequent operations on it will fail.
-	 *
-	 * Alternatively, maybe there was no race at all, and the
-	 * filesystem is just corrupt and gave us a parent that doesn't
-	 * actually contain any entry pointing to this inode.  So,
-	 * double check that this worked and return -ESTALE if not:
-	 */
-	if (!dentry_connected(target_dir))
-		return -ESTALE;
 	clear_disconnected(target_dir);
 	return 0;
 }
-- 
1.7.9.5


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH 7/8] exportfs: better variable name
  2013-10-25 20:29 ` J. Bruce Fields
@ 2013-10-25 20:30     ` J. Bruce Fields
  -1 siblings, 0 replies; 23+ messages in thread
From: J. Bruce Fields @ 2013-10-25 20:30 UTC (permalink / raw)
  To: linux-fsdevel-u79uwXL29TY76Z2rM5mHXA
  Cc: Christoph Hellwig, Al Viro, linux-nfs-u79uwXL29TY76Z2rM5mHXA,
	J. Bruce Fields

From: "J. Bruce Fields" <bfields-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>

Replace another unhelpful acronym.

Signed-off-by: J. Bruce Fields <bfields-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
---
 fs/exportfs/expfs.c |   12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/fs/exportfs/expfs.c b/fs/exportfs/expfs.c
index d32ead9..b33b9c4 100644
--- a/fs/exportfs/expfs.c
+++ b/fs/exportfs/expfs.c
@@ -228,14 +228,14 @@ reconnect_path(struct vfsmount *mnt, struct dentry *target_dir, char *nbuf)
 	int err = -ESTALE;
 
 	while (target_dir->d_flags & DCACHE_DISCONNECTED) {
-		struct dentry *pd = find_disconnected_root(target_dir);
+		struct dentry *dentry = find_disconnected_root(target_dir);
 
-		BUG_ON(pd == mnt->mnt_sb->s_root);
+		BUG_ON(dentry == mnt->mnt_sb->s_root);
 
-		if (!IS_ROOT(pd)) {
+		if (!IS_ROOT(dentry)) {
 			/* must have found a connected parent - great */
 			clear_disconnected(target_dir);
-			dput(pd);
+			dput(dentry);
 			break;
 		} else {
 			struct dentry *parent;
@@ -243,7 +243,7 @@ reconnect_path(struct vfsmount *mnt, struct dentry *target_dir, char *nbuf)
 			 * We have hit the top of a disconnected path, try to
 			 * find parent and connect.
 			 */
-			 parent = reconnect_one(mnt, pd, nbuf);
+			 parent = reconnect_one(mnt, dentry, nbuf);
 			 if (!parent)
 				goto out_reconnected;
 			if (IS_ERR(parent)) {
@@ -252,7 +252,7 @@ reconnect_path(struct vfsmount *mnt, struct dentry *target_dir, char *nbuf)
 			}
 			dput(parent);
 		}
-		dput(pd);
+		dput(dentry);
 	}
 
 	if (target_dir->d_flags & DCACHE_DISCONNECTED) {
-- 
1.7.9.5

--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH 7/8] exportfs: better variable name
@ 2013-10-25 20:30     ` J. Bruce Fields
  0 siblings, 0 replies; 23+ messages in thread
From: J. Bruce Fields @ 2013-10-25 20:30 UTC (permalink / raw)
  To: linux-fsdevel; +Cc: Christoph Hellwig, Al Viro, linux-nfs, J. Bruce Fields

From: "J. Bruce Fields" <bfields@redhat.com>

Replace another unhelpful acronym.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
---
 fs/exportfs/expfs.c |   12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/fs/exportfs/expfs.c b/fs/exportfs/expfs.c
index d32ead9..b33b9c4 100644
--- a/fs/exportfs/expfs.c
+++ b/fs/exportfs/expfs.c
@@ -228,14 +228,14 @@ reconnect_path(struct vfsmount *mnt, struct dentry *target_dir, char *nbuf)
 	int err = -ESTALE;
 
 	while (target_dir->d_flags & DCACHE_DISCONNECTED) {
-		struct dentry *pd = find_disconnected_root(target_dir);
+		struct dentry *dentry = find_disconnected_root(target_dir);
 
-		BUG_ON(pd == mnt->mnt_sb->s_root);
+		BUG_ON(dentry == mnt->mnt_sb->s_root);
 
-		if (!IS_ROOT(pd)) {
+		if (!IS_ROOT(dentry)) {
 			/* must have found a connected parent - great */
 			clear_disconnected(target_dir);
-			dput(pd);
+			dput(dentry);
 			break;
 		} else {
 			struct dentry *parent;
@@ -243,7 +243,7 @@ reconnect_path(struct vfsmount *mnt, struct dentry *target_dir, char *nbuf)
 			 * We have hit the top of a disconnected path, try to
 			 * find parent and connect.
 			 */
-			 parent = reconnect_one(mnt, pd, nbuf);
+			 parent = reconnect_one(mnt, dentry, nbuf);
 			 if (!parent)
 				goto out_reconnected;
 			if (IS_ERR(parent)) {
@@ -252,7 +252,7 @@ reconnect_path(struct vfsmount *mnt, struct dentry *target_dir, char *nbuf)
 			}
 			dput(parent);
 		}
-		dput(pd);
+		dput(dentry);
 	}
 
 	if (target_dir->d_flags & DCACHE_DISCONNECTED) {
-- 
1.7.9.5


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH 8/8] exportfs: fix quadratic behavior in filehandle lookup
  2013-10-25 20:29 ` J. Bruce Fields
                   ` (4 preceding siblings ...)
  (?)
@ 2013-10-25 20:30 ` J. Bruce Fields
  -1 siblings, 0 replies; 23+ messages in thread
From: J. Bruce Fields @ 2013-10-25 20:30 UTC (permalink / raw)
  To: linux-fsdevel; +Cc: Christoph Hellwig, Al Viro, linux-nfs, J. Bruce Fields

From: "J. Bruce Fields" <bfields@redhat.com>

Suppose we're given the filehandle for a directory whose closest
ancestor in the dcache is its Nth ancestor.

The main loop in reconnect_path searches for an IS_ROOT ancestor of
target_dir, reconnects that ancestor to its parent, then recommences the
search for an IS_ROOT ancestor from target_dir.

This behavior is quadratic in N.  And there's really no need to restart
the search from target_dir each time: once a directory has been looked
up, it won't become IS_ROOT again.  So instead of starting from
target_dir each time, we can continue where we left off.

This simplifies the code and improves performance on very deep directory
heirachies.  (I can't think of any reason anyone should need heirarchies
a hundred or more deep, but the performance improvement may be valuable
if only to limit damage in case of abuse.)

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
---
 fs/exportfs/expfs.c |   66 ++++++++++-----------------------------------------
 1 file changed, 13 insertions(+), 53 deletions(-)

diff --git a/fs/exportfs/expfs.c b/fs/exportfs/expfs.c
index b33b9c4..48a359d 100644
--- a/fs/exportfs/expfs.c
+++ b/fs/exportfs/expfs.c
@@ -69,27 +69,6 @@ find_acceptable_alias(struct dentry *result,
 	return NULL;
 }
 
-/*
- * Find root of a disconnected subtree and return a reference to it.
- */
-static struct dentry *
-find_disconnected_root(struct dentry *dentry)
-{
-	dget(dentry);
-	while (!IS_ROOT(dentry)) {
-		struct dentry *parent = dget_parent(dentry);
-
-		if (!(parent->d_flags & DCACHE_DISCONNECTED)) {
-			dput(parent);
-			break;
-		}
-
-		dput(dentry);
-		dentry = parent;
-	}
-	return dentry;
-}
-
 static bool dentry_connected(struct dentry *dentry)
 {
 	dget(dentry);
@@ -225,45 +204,26 @@ out_reconnected:
 static int
 reconnect_path(struct vfsmount *mnt, struct dentry *target_dir, char *nbuf)
 {
-	int err = -ESTALE;
+	struct dentry *dentry, *parent;
 
-	while (target_dir->d_flags & DCACHE_DISCONNECTED) {
-		struct dentry *dentry = find_disconnected_root(target_dir);
+	dentry = dget(target_dir);
 
+	while (dentry->d_flags & DCACHE_DISCONNECTED) {
 		BUG_ON(dentry == mnt->mnt_sb->s_root);
 
-		if (!IS_ROOT(dentry)) {
-			/* must have found a connected parent - great */
-			clear_disconnected(target_dir);
-			dput(dentry);
+		if (IS_ROOT(dentry))
+			parent = reconnect_one(mnt, dentry, nbuf);
+		else
+			parent = dget_parent(dentry);
+
+		if (!parent)
 			break;
-		} else {
-			struct dentry *parent;
-			/*
-			 * We have hit the top of a disconnected path, try to
-			 * find parent and connect.
-			 */
-			 parent = reconnect_one(mnt, dentry, nbuf);
-			 if (!parent)
-				goto out_reconnected;
-			if (IS_ERR(parent)) {
-				err = PTR_ERR(parent);
-				break;
-			}
-			dput(parent);
-		}
 		dput(dentry);
+		if (IS_ERR(parent))
+			return PTR_ERR(parent);
+		dentry = parent;
 	}
-
-	if (target_dir->d_flags & DCACHE_DISCONNECTED) {
-		/* something went wrong - oh-well */
-		if (!err)
-			err = -ESTALE;
-		return err;
-	}
-
-	return 0;
-out_reconnected:
+	dput(dentry);
 	clear_disconnected(target_dir);
 	return 0;
 }
-- 
1.7.9.5


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* Re: [PATCH 4/8] exportfs: stop retrying once we race with rename/remove
  2013-10-25 20:30     ` J. Bruce Fields
@ 2013-10-27  7:04         ` NeilBrown
  -1 siblings, 0 replies; 23+ messages in thread
From: NeilBrown @ 2013-10-27  7:04 UTC (permalink / raw)
  To: J. Bruce Fields
  Cc: linux-fsdevel-u79uwXL29TY76Z2rM5mHXA, Christoph Hellwig, Al Viro,
	linux-nfs-u79uwXL29TY76Z2rM5mHXA

[-- Attachment #1: Type: text/plain, Size: 4584 bytes --]

On Fri, 25 Oct 2013 16:30:01 -0400 "J. Bruce Fields" <bfields-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
wrote:

> From: "J. Bruce Fields" <bfields-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
> 
> There are two places here where we could race with a rename or remove:
> 
> 	- We could find the parent, but then be removed or renamed away
> 	  from that parent directory before finding our name in that
> 	  directory.
> 	- We could find the parent, and find our name in that parent,
> 	  but then be renamed or removed before we look ourselves up by
> 	  that name in that parent.
> 
> In both cases the concurrent rename or remove will take care of
> reconnecting the directory that we're currently examining.  Our target
> directory should then also be connected.  Check this and clear
> DISCONNECTED in these cases instead of looping around again.
> 
> Note: we *do* need to check that this actually happened if we want to be
> robust in the face of corrupted filesystems: a corrupted filesystem
> could just return a completely wrong parent, and we want to fail with an
> error in that case before starting to clear DISCONNECTED on
> non-DISCONNECTED filesystems.
> 
> Signed-off-by: J. Bruce Fields <bfields-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
> ---
>  fs/exportfs/expfs.c |   45 ++++++++++++++++++++++++++++++++++++++++-----
>  1 file changed, 40 insertions(+), 5 deletions(-)
> 
> diff --git a/fs/exportfs/expfs.c b/fs/exportfs/expfs.c
> index c65b748..6b5ddd5 100644
> --- a/fs/exportfs/expfs.c
> +++ b/fs/exportfs/expfs.c
> @@ -90,6 +90,23 @@ find_disconnected_root(struct dentry *dentry)
>  	return dentry;
>  }
>  
> +static bool dentry_connected(struct dentry *dentry)
> +{
> +	dget(dentry);
> +	while (dentry->d_flags & DCACHE_DISCONNECTED) {
> +		struct dentry *parent = dget_parent(dentry);
> +
> +		dput(dentry);
> +		if (IS_ROOT(dentry)) {
> +			dput(parent);
> +			return false;
> +		}
> +		dentry = parent;
> +	}
> +	dput(dentry);
> +	return true;
> +}

This looks remarkably similar to find_disconnected_root().

static bool dentry_connected(struct dentry *dentry)
{
	return !IS_ROOT(find_disconnected_root(dentry));
}

??
....

> +
>  static void clear_disconnected(struct dentry *dentry)
>  {
>  	dget(dentry);
> @@ -189,9 +206,9 @@ reconnect_path(struct vfsmount *mnt, struct dentry *target_dir, char *nbuf)
>  				dput(pd);
>  				if (err == -ENOENT)
>  					/* some race between get_parent and
> -					 * get_name?  just try again
> +					 * get_name?
>  					 */
> -					continue;
> +					goto out_reconnected;
>  				break;
>  			}
>  			dprintk("%s: found name: %s\n", __func__, nbuf);
> @@ -211,12 +228,12 @@ reconnect_path(struct vfsmount *mnt, struct dentry *target_dir, char *nbuf)
>  			 * hopefully, npd == pd, though it isn't really
>  			 * a problem if it isn't
>  			 */
> +			dput(npd);
> +			dput(ppd);
>  			if (npd == pd)
>  				noprogress = 0;
>  			else
> -				printk("%s: npd != pd\n", __func__);
> -			dput(npd);
> -			dput(ppd);
> +				goto out_reconnected;
>  			if (IS_ROOT(pd)) {
>  				/* something went wrong, we have to give up */
>  				dput(pd);
> @@ -234,6 +251,24 @@ reconnect_path(struct vfsmount *mnt, struct dentry *target_dir, char *nbuf)
>  	}
>  
>  	return 0;
> +out_reconnected:
> +	/*
> +	 * Someone must have renamed our entry into another parent, in
> +	 * which case it has been reconnected by the rename.
> +	 *
> +	 * Or someone removed it entirely, in which case filehandle
> +	 * lookup will succeed but the directory is now IS_DEAD and
> +	 * subsequent operations on it will fail.
> +	 *
> +	 * Alternatively, maybe there was no race at all, and the
> +	 * filesystem is just corrupt and gave us a parent that doesn't
> +	 * actually contain any entry pointing to this inode.  So,
> +	 * double check that this worked and return -ESTALE if not:
> +	 */
> +	if (!dentry_connected(target_dir))
> +		return -ESTALE;
> +	clear_disconnected(target_dir);

... or just open-code it.  Then this becomes:

  if (!IS_ROOT(find_disconnected_root(target_dir))) {
         clear_disconnected(target_dir);
         return 0;
  }
  return -ESTALE;

which is pleasing similar to the (new) code higher up:

		struct dentry *pd = find_disconnected_root(target_dir);

		if (!IS_ROOT(pd)) {
			/* must have found a connected parent - great */
			clear_disconnected(target_dir);
                ....


Just a thought,
NeilBrown


> +	return 0;
>  }
>  
>  struct getdents_callback {


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 828 bytes --]

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 4/8] exportfs: stop retrying once we race with rename/remove
@ 2013-10-27  7:04         ` NeilBrown
  0 siblings, 0 replies; 23+ messages in thread
From: NeilBrown @ 2013-10-27  7:04 UTC (permalink / raw)
  To: J. Bruce Fields; +Cc: linux-fsdevel, Christoph Hellwig, Al Viro, linux-nfs

[-- Attachment #1: Type: text/plain, Size: 4497 bytes --]

On Fri, 25 Oct 2013 16:30:01 -0400 "J. Bruce Fields" <bfields@redhat.com>
wrote:

> From: "J. Bruce Fields" <bfields@redhat.com>
> 
> There are two places here where we could race with a rename or remove:
> 
> 	- We could find the parent, but then be removed or renamed away
> 	  from that parent directory before finding our name in that
> 	  directory.
> 	- We could find the parent, and find our name in that parent,
> 	  but then be renamed or removed before we look ourselves up by
> 	  that name in that parent.
> 
> In both cases the concurrent rename or remove will take care of
> reconnecting the directory that we're currently examining.  Our target
> directory should then also be connected.  Check this and clear
> DISCONNECTED in these cases instead of looping around again.
> 
> Note: we *do* need to check that this actually happened if we want to be
> robust in the face of corrupted filesystems: a corrupted filesystem
> could just return a completely wrong parent, and we want to fail with an
> error in that case before starting to clear DISCONNECTED on
> non-DISCONNECTED filesystems.
> 
> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
> ---
>  fs/exportfs/expfs.c |   45 ++++++++++++++++++++++++++++++++++++++++-----
>  1 file changed, 40 insertions(+), 5 deletions(-)
> 
> diff --git a/fs/exportfs/expfs.c b/fs/exportfs/expfs.c
> index c65b748..6b5ddd5 100644
> --- a/fs/exportfs/expfs.c
> +++ b/fs/exportfs/expfs.c
> @@ -90,6 +90,23 @@ find_disconnected_root(struct dentry *dentry)
>  	return dentry;
>  }
>  
> +static bool dentry_connected(struct dentry *dentry)
> +{
> +	dget(dentry);
> +	while (dentry->d_flags & DCACHE_DISCONNECTED) {
> +		struct dentry *parent = dget_parent(dentry);
> +
> +		dput(dentry);
> +		if (IS_ROOT(dentry)) {
> +			dput(parent);
> +			return false;
> +		}
> +		dentry = parent;
> +	}
> +	dput(dentry);
> +	return true;
> +}

This looks remarkably similar to find_disconnected_root().

static bool dentry_connected(struct dentry *dentry)
{
	return !IS_ROOT(find_disconnected_root(dentry));
}

??
....

> +
>  static void clear_disconnected(struct dentry *dentry)
>  {
>  	dget(dentry);
> @@ -189,9 +206,9 @@ reconnect_path(struct vfsmount *mnt, struct dentry *target_dir, char *nbuf)
>  				dput(pd);
>  				if (err == -ENOENT)
>  					/* some race between get_parent and
> -					 * get_name?  just try again
> +					 * get_name?
>  					 */
> -					continue;
> +					goto out_reconnected;
>  				break;
>  			}
>  			dprintk("%s: found name: %s\n", __func__, nbuf);
> @@ -211,12 +228,12 @@ reconnect_path(struct vfsmount *mnt, struct dentry *target_dir, char *nbuf)
>  			 * hopefully, npd == pd, though it isn't really
>  			 * a problem if it isn't
>  			 */
> +			dput(npd);
> +			dput(ppd);
>  			if (npd == pd)
>  				noprogress = 0;
>  			else
> -				printk("%s: npd != pd\n", __func__);
> -			dput(npd);
> -			dput(ppd);
> +				goto out_reconnected;
>  			if (IS_ROOT(pd)) {
>  				/* something went wrong, we have to give up */
>  				dput(pd);
> @@ -234,6 +251,24 @@ reconnect_path(struct vfsmount *mnt, struct dentry *target_dir, char *nbuf)
>  	}
>  
>  	return 0;
> +out_reconnected:
> +	/*
> +	 * Someone must have renamed our entry into another parent, in
> +	 * which case it has been reconnected by the rename.
> +	 *
> +	 * Or someone removed it entirely, in which case filehandle
> +	 * lookup will succeed but the directory is now IS_DEAD and
> +	 * subsequent operations on it will fail.
> +	 *
> +	 * Alternatively, maybe there was no race at all, and the
> +	 * filesystem is just corrupt and gave us a parent that doesn't
> +	 * actually contain any entry pointing to this inode.  So,
> +	 * double check that this worked and return -ESTALE if not:
> +	 */
> +	if (!dentry_connected(target_dir))
> +		return -ESTALE;
> +	clear_disconnected(target_dir);

... or just open-code it.  Then this becomes:

  if (!IS_ROOT(find_disconnected_root(target_dir))) {
         clear_disconnected(target_dir);
         return 0;
  }
  return -ESTALE;

which is pleasing similar to the (new) code higher up:

		struct dentry *pd = find_disconnected_root(target_dir);

		if (!IS_ROOT(pd)) {
			/* must have found a connected parent - great */
			clear_disconnected(target_dir);
                ....


Just a thought,
NeilBrown


> +	return 0;
>  }
>  
>  struct getdents_callback {


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 828 bytes --]

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 4/8] exportfs: stop retrying once we race with rename/remove
  2013-10-27  7:04         ` NeilBrown
@ 2013-10-28  8:53             ` Christoph Hellwig
  -1 siblings, 0 replies; 23+ messages in thread
From: Christoph Hellwig @ 2013-10-28  8:53 UTC (permalink / raw)
  To: NeilBrown
  Cc: J. Bruce Fields, linux-fsdevel-u79uwXL29TY76Z2rM5mHXA,
	Christoph Hellwig, Al Viro, linux-nfs-u79uwXL29TY76Z2rM5mHXA

On Sun, Oct 27, 2013 at 06:04:09PM +1100, NeilBrown wrote:
> This looks remarkably similar to find_disconnected_root().

It does, but that function is going away in the last patch of the
series.

--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 4/8] exportfs: stop retrying once we race with rename/remove
@ 2013-10-28  8:53             ` Christoph Hellwig
  0 siblings, 0 replies; 23+ messages in thread
From: Christoph Hellwig @ 2013-10-28  8:53 UTC (permalink / raw)
  To: NeilBrown
  Cc: J. Bruce Fields, linux-fsdevel, Christoph Hellwig, Al Viro, linux-nfs

On Sun, Oct 27, 2013 at 06:04:09PM +1100, NeilBrown wrote:
> This looks remarkably similar to find_disconnected_root().

It does, but that function is going away in the last patch of the
series.


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 4/8] exportfs: stop retrying once we race with rename/remove
  2013-10-27  7:04         ` NeilBrown
  (?)
  (?)
@ 2013-10-28 15:08         ` J. Bruce Fields
  -1 siblings, 0 replies; 23+ messages in thread
From: J. Bruce Fields @ 2013-10-28 15:08 UTC (permalink / raw)
  To: NeilBrown; +Cc: linux-fsdevel, Christoph Hellwig, Al Viro, linux-nfs

On Sun, Oct 27, 2013 at 06:04:09PM +1100, NeilBrown wrote:
> ... or just open-code it.  Then this becomes:
> 
>   if (!IS_ROOT(find_disconnected_root(target_dir))) {
>          clear_disconnected(target_dir);
>          return 0;
>   }
>   return -ESTALE;
> 
> which is pleasing similar to the (new) code higher up:
> 
> 		struct dentry *pd = find_disconnected_root(target_dir);
> 
> 		if (!IS_ROOT(pd)) {
> 			/* must have found a connected parent - great */
> 			clear_disconnected(target_dir);
>                 ....

We end up ditching the other call to find_disconnected_root(), and I
find dentry_connected() slightly more straightforward here, so I'd
prefer to stick with it as is.

I could do the above as an intermediate step and then write a separate
patch for the !IS_ROOT(find_disconnected_root())->dentry_connected()
that makes that argument explicitly if you think that would be clearer?

Thanks for the review.

--b.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 0/8] simplify reconnecting dentries looked up by filehandle (v2)
  2013-10-25 20:29 ` J. Bruce Fields
@ 2013-10-30 16:10     ` Christoph Hellwig
  -1 siblings, 0 replies; 23+ messages in thread
From: Christoph Hellwig @ 2013-10-30 16:10 UTC (permalink / raw)
  To: J. Bruce Fields
  Cc: linux-fsdevel-u79uwXL29TY76Z2rM5mHXA, Christoph Hellwig, Al Viro,
	linux-nfs-u79uwXL29TY76Z2rM5mHXA

The whole series looks good to me,

Reviewed-by: Christoph Hellwig <hch-jcswGhMUV9g@public.gmane.org>
--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 0/8] simplify reconnecting dentries looked up by filehandle (v2)
@ 2013-10-30 16:10     ` Christoph Hellwig
  0 siblings, 0 replies; 23+ messages in thread
From: Christoph Hellwig @ 2013-10-30 16:10 UTC (permalink / raw)
  To: J. Bruce Fields; +Cc: linux-fsdevel, Christoph Hellwig, Al Viro, linux-nfs

The whole series looks good to me,

Reviewed-by: Christoph Hellwig <hch@lst.de>

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 0/8] simplify reconnecting dentries looked up by filehandle (v2)
  2013-10-30 16:10     ` Christoph Hellwig
@ 2013-10-30 16:38         ` J. Bruce Fields
  -1 siblings, 0 replies; 23+ messages in thread
From: J. Bruce Fields @ 2013-10-30 16:38 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: linux-fsdevel-u79uwXL29TY76Z2rM5mHXA, Al Viro,
	linux-nfs-u79uwXL29TY76Z2rM5mHXA

On Wed, Oct 30, 2013 at 09:10:13AM -0700, Christoph Hellwig wrote:
> The whole series looks good to me,
> 
> Reviewed-by: Christoph Hellwig <hch-jcswGhMUV9g@public.gmane.org>

Thanks.  I've added that to the for-viro branch at:

	git://linux-nfs.org/~bfields/linux.git for-viro

So that now includes:

	- delegation support
	- jlayton's fix for a setlease/open race
	- 32/64-bit filehandle lookup fix
	- fixes for some uses of DCACHE_DISCONNECTED
	- this path_reconnect cleanup

As far as I'm concerned all of that is ready for 3.13.


I still intend to contribute a basic open_by_fhandle_at test but haven't
gotten much past running xfstests/new.

And as you said it would also be nice to have stress testing that tries
to trigger the various reconnect/rename/rmdir races, but I haven't
thought much yet about how to do that.

--b.
--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 0/8] simplify reconnecting dentries looked up by filehandle (v2)
@ 2013-10-30 16:38         ` J. Bruce Fields
  0 siblings, 0 replies; 23+ messages in thread
From: J. Bruce Fields @ 2013-10-30 16:38 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: linux-fsdevel, Al Viro, linux-nfs

On Wed, Oct 30, 2013 at 09:10:13AM -0700, Christoph Hellwig wrote:
> The whole series looks good to me,
> 
> Reviewed-by: Christoph Hellwig <hch@lst.de>

Thanks.  I've added that to the for-viro branch at:

	git://linux-nfs.org/~bfields/linux.git for-viro

So that now includes:

	- delegation support
	- jlayton's fix for a setlease/open race
	- 32/64-bit filehandle lookup fix
	- fixes for some uses of DCACHE_DISCONNECTED
	- this path_reconnect cleanup

As far as I'm concerned all of that is ready for 3.13.


I still intend to contribute a basic open_by_fhandle_at test but haven't
gotten much past running xfstests/new.

And as you said it would also be nice to have stress testing that tries
to trigger the various reconnect/rename/rmdir races, but I haven't
thought much yet about how to do that.

--b.

^ permalink raw reply	[flat|nested] 23+ messages in thread

end of thread, other threads:[~2013-10-30 16:38 UTC | newest]

Thread overview: 23+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-10-25 20:29 [PATCH 0/8] simplify reconnecting dentries looked up by filehandle (v2) J. Bruce Fields
2013-10-25 20:29 ` J. Bruce Fields
2013-10-25 20:29 ` [PATCH 1/8] exportfs: BUG_ON in crazy corner case J. Bruce Fields
     [not found] ` <1382733005-6006-1-git-send-email-bfields-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2013-10-25 20:29   ` [PATCH 2/8] exportfs: more detailed comment for path_reconnect J. Bruce Fields
2013-10-25 20:29     ` J. Bruce Fields
2013-10-25 20:30   ` [PATCH 3/8] exportfs: clear DISCONNECTED on all parents sooner J. Bruce Fields
2013-10-25 20:30     ` J. Bruce Fields
2013-10-25 20:30   ` [PATCH 4/8] exportfs: stop retrying once we race with rename/remove J. Bruce Fields
2013-10-25 20:30     ` J. Bruce Fields
     [not found]     ` <1382733005-6006-5-git-send-email-bfields-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2013-10-27  7:04       ` NeilBrown
2013-10-27  7:04         ` NeilBrown
     [not found]         ` <20131027180409.66037e01-wvvUuzkyo1EYVZTmpyfIwg@public.gmane.org>
2013-10-28  8:53           ` Christoph Hellwig
2013-10-28  8:53             ` Christoph Hellwig
2013-10-28 15:08         ` J. Bruce Fields
2013-10-25 20:30   ` [PATCH 7/8] exportfs: better variable name J. Bruce Fields
2013-10-25 20:30     ` J. Bruce Fields
2013-10-30 16:10   ` [PATCH 0/8] simplify reconnecting dentries looked up by filehandle (v2) Christoph Hellwig
2013-10-30 16:10     ` Christoph Hellwig
     [not found]     ` <20131030161013.GA19668-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org>
2013-10-30 16:38       ` J. Bruce Fields
2013-10-30 16:38         ` J. Bruce Fields
2013-10-25 20:30 ` [PATCH 5/8] exportfs: eliminate unused "noprogress" counter J. Bruce Fields
2013-10-25 20:30 ` [PATCH 6/8] exportfs: move most of reconnect_path to helper function J. Bruce Fields
2013-10-25 20:30 ` [PATCH 8/8] exportfs: fix quadratic behavior in filehandle lookup J. Bruce Fields

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.