All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/7] Allow lock dropping before waiting on inodes being freed
@ 2016-05-13 17:42 ` Andreas Gruenbacher
  0 siblings, 0 replies; 19+ messages in thread
From: Andreas Gruenbacher @ 2016-05-13 17:42 UTC (permalink / raw)
  To: Alexander Viro, Bob Peterson, Steven Whitehouse
  Cc: Andreas Gruenbacher, cluster-devel, linux-fsdevel

This patch set my Bob and me avoids a deadlock in gfs2 when an inode is being
freed: normally, when find_inode finds an inode in I_FREEING or I_WILL_FREE
state, it waits for the inode to go away before returning
(__wait_on_freeing_inode).  Freeing an inode on gfs2 requires taking the
inode's glock.  However, in gfs2_lookup_by_inum, we need to be holding the
inode glock while doing lookups, so we can deadlock with processes concurrently
freeing inodes.

This patch set allows gfs2_lookup_by_inum to drop the inode glock before
waiting for inodes to go away.  We could try non-blocking lookups and wait for
an arbitrary amount of time before retrying the lookup instead as well, but
waiting for the actual event (i.e., the inode being freed to likely have
disappeared) makes more sense.

Thanks,
Andreas

Andreas Gruenbacher (5):
  GFS2: Remove superfluous assignment
  GFS2: No need for non-blocking gfs2_ilookup in delete_work_func
  vfs: Introduce prepare_wait_on_freeing_inode
  GFS2: Use non-blocking wait in gfs2_iget
  GFS2: Prevent deadlock in gfs2_lookup_by_inum

Bob Peterson (2):
  Revert "GFS2: Eliminate parameter non_block on gfs2_inode_lookup"
  Revert "GFS2: Don't filter out I_FREEING inodes anymore"

 fs/gfs2/dir.c        |   2 +-
 fs/gfs2/glock.c      |   4 +-
 fs/gfs2/inode.c      | 109 ++++++++++++++++++++++++++++++++++++++++++++++-----
 fs/gfs2/inode.h      |   8 +++-
 fs/gfs2/ops_fstype.c |   2 +-
 fs/inode.c           |  18 +++++++--
 include/linux/fs.h   |   1 +
 include/linux/wait.h |  21 +++++++---
 8 files changed, 142 insertions(+), 23 deletions(-)

-- 
2.5.5


^ permalink raw reply	[flat|nested] 19+ messages in thread

* [Cluster-devel] [PATCH 0/7] Allow lock dropping before waiting on inodes being freed
@ 2016-05-13 17:42 ` Andreas Gruenbacher
  0 siblings, 0 replies; 19+ messages in thread
From: Andreas Gruenbacher @ 2016-05-13 17:42 UTC (permalink / raw)
  To: cluster-devel.redhat.com

This patch set my Bob and me avoids a deadlock in gfs2 when an inode is being
freed: normally, when find_inode finds an inode in I_FREEING or I_WILL_FREE
state, it waits for the inode to go away before returning
(__wait_on_freeing_inode).  Freeing an inode on gfs2 requires taking the
inode's glock.  However, in gfs2_lookup_by_inum, we need to be holding the
inode glock while doing lookups, so we can deadlock with processes concurrently
freeing inodes.

This patch set allows gfs2_lookup_by_inum to drop the inode glock before
waiting for inodes to go away.  We could try non-blocking lookups and wait for
an arbitrary amount of time before retrying the lookup instead as well, but
waiting for the actual event (i.e., the inode being freed to likely have
disappeared) makes more sense.

Thanks,
Andreas

Andreas Gruenbacher (5):
  GFS2: Remove superfluous assignment
  GFS2: No need for non-blocking gfs2_ilookup in delete_work_func
  vfs: Introduce prepare_wait_on_freeing_inode
  GFS2: Use non-blocking wait in gfs2_iget
  GFS2: Prevent deadlock in gfs2_lookup_by_inum

Bob Peterson (2):
  Revert "GFS2: Eliminate parameter non_block on gfs2_inode_lookup"
  Revert "GFS2: Don't filter out I_FREEING inodes anymore"

 fs/gfs2/dir.c        |   2 +-
 fs/gfs2/glock.c      |   4 +-
 fs/gfs2/inode.c      | 109 ++++++++++++++++++++++++++++++++++++++++++++++-----
 fs/gfs2/inode.h      |   8 +++-
 fs/gfs2/ops_fstype.c |   2 +-
 fs/inode.c           |  18 +++++++--
 include/linux/fs.h   |   1 +
 include/linux/wait.h |  21 +++++++---
 8 files changed, 142 insertions(+), 23 deletions(-)

-- 
2.5.5



^ permalink raw reply	[flat|nested] 19+ messages in thread

* [PATCH 1/7] Revert "GFS2: Eliminate parameter non_block on gfs2_inode_lookup"
  2016-05-13 17:42 ` [Cluster-devel] " Andreas Gruenbacher
@ 2016-05-13 17:42   ` Andreas Gruenbacher
  -1 siblings, 0 replies; 19+ messages in thread
From: Andreas Gruenbacher @ 2016-05-13 17:42 UTC (permalink / raw)
  To: Alexander Viro, Bob Peterson, Steven Whitehouse
  Cc: cluster-devel, linux-fsdevel

From: Bob Peterson <rpeterso@redhat.com>

This reverts commit 73b462d2808d7cbca4d7886cf6aaed850640e6cd.

Signed-off-by: Bob Peterson <rpeterso@redhat.com>
---
 fs/gfs2/dir.c        | 2 +-
 fs/gfs2/inode.c      | 5 +++--
 fs/gfs2/inode.h      | 3 ++-
 fs/gfs2/ops_fstype.c | 2 +-
 4 files changed, 7 insertions(+), 5 deletions(-)

diff --git a/fs/gfs2/dir.c b/fs/gfs2/dir.c
index 4a01f30..d4014af 100644
--- a/fs/gfs2/dir.c
+++ b/fs/gfs2/dir.c
@@ -1660,7 +1660,7 @@ struct inode *gfs2_dir_search(struct inode *dir, const struct qstr *name,
 		brelse(bh);
 		if (fail_on_exist)
 			return ERR_PTR(-EEXIST);
-		inode = gfs2_inode_lookup(dir->i_sb, dtype, addr, formal_ino);
+		inode = gfs2_inode_lookup(dir->i_sb, dtype, addr, formal_ino, 0);
 		if (!IS_ERR(inode))
 			GFS2_I(inode)->i_rahead = rahead;
 		return inode;
diff --git a/fs/gfs2/inode.c b/fs/gfs2/inode.c
index aea002e..be6036a 100644
--- a/fs/gfs2/inode.c
+++ b/fs/gfs2/inode.c
@@ -80,12 +80,13 @@ static void gfs2_set_iop(struct inode *inode)
  * @sb: The super block
  * @no_addr: The inode number
  * @type: The type of the inode
+ * non_block: Can we block on inodes that are being freed?
  *
  * Returns: A VFS inode, or an error
  */
 
 struct inode *gfs2_inode_lookup(struct super_block *sb, unsigned int type,
-				u64 no_addr, u64 no_formal_ino)
+				u64 no_addr, u64 no_formal_ino, int non_block)
 {
 	struct inode *inode;
 	struct gfs2_inode *ip;
@@ -169,7 +170,7 @@ struct inode *gfs2_lookup_by_inum(struct gfs2_sbd *sdp, u64 no_addr,
 	if (error)
 		goto fail;
 
-	inode = gfs2_inode_lookup(sb, DT_UNKNOWN, no_addr, 0);
+	inode = gfs2_inode_lookup(sb, DT_UNKNOWN, no_addr, 0, 1);
 	if (IS_ERR(inode))
 		goto fail;
 
diff --git a/fs/gfs2/inode.h b/fs/gfs2/inode.h
index e1af0d4..22c27a8 100644
--- a/fs/gfs2/inode.h
+++ b/fs/gfs2/inode.h
@@ -94,7 +94,8 @@ err:
 }
 
 extern struct inode *gfs2_inode_lookup(struct super_block *sb, unsigned type, 
-				       u64 no_addr, u64 no_formal_ino);
+				       u64 no_addr, u64 no_formal_ino,
+				       int non_block);
 extern struct inode *gfs2_lookup_by_inum(struct gfs2_sbd *sdp, u64 no_addr,
 					 u64 *no_formal_ino,
 					 unsigned int blktype);
diff --git a/fs/gfs2/ops_fstype.c b/fs/gfs2/ops_fstype.c
index 49b0bff..dbed9e2 100644
--- a/fs/gfs2/ops_fstype.c
+++ b/fs/gfs2/ops_fstype.c
@@ -454,7 +454,7 @@ static int gfs2_lookup_root(struct super_block *sb, struct dentry **dptr,
 	struct dentry *dentry;
 	struct inode *inode;
 
-	inode = gfs2_inode_lookup(sb, DT_DIR, no_addr, 0);
+	inode = gfs2_inode_lookup(sb, DT_DIR, no_addr, 0, 0);
 	if (IS_ERR(inode)) {
 		fs_err(sdp, "can't read in %s inode: %ld\n", name, PTR_ERR(inode));
 		return PTR_ERR(inode);
-- 
2.5.5


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [Cluster-devel] [PATCH 1/7] Revert "GFS2: Eliminate parameter non_block on gfs2_inode_lookup"
@ 2016-05-13 17:42   ` Andreas Gruenbacher
  0 siblings, 0 replies; 19+ messages in thread
From: Andreas Gruenbacher @ 2016-05-13 17:42 UTC (permalink / raw)
  To: cluster-devel.redhat.com

From: Bob Peterson <rpeterso@redhat.com>

This reverts commit 73b462d2808d7cbca4d7886cf6aaed850640e6cd.

Signed-off-by: Bob Peterson <rpeterso@redhat.com>
---
 fs/gfs2/dir.c        | 2 +-
 fs/gfs2/inode.c      | 5 +++--
 fs/gfs2/inode.h      | 3 ++-
 fs/gfs2/ops_fstype.c | 2 +-
 4 files changed, 7 insertions(+), 5 deletions(-)

diff --git a/fs/gfs2/dir.c b/fs/gfs2/dir.c
index 4a01f30..d4014af 100644
--- a/fs/gfs2/dir.c
+++ b/fs/gfs2/dir.c
@@ -1660,7 +1660,7 @@ struct inode *gfs2_dir_search(struct inode *dir, const struct qstr *name,
 		brelse(bh);
 		if (fail_on_exist)
 			return ERR_PTR(-EEXIST);
-		inode = gfs2_inode_lookup(dir->i_sb, dtype, addr, formal_ino);
+		inode = gfs2_inode_lookup(dir->i_sb, dtype, addr, formal_ino, 0);
 		if (!IS_ERR(inode))
 			GFS2_I(inode)->i_rahead = rahead;
 		return inode;
diff --git a/fs/gfs2/inode.c b/fs/gfs2/inode.c
index aea002e..be6036a 100644
--- a/fs/gfs2/inode.c
+++ b/fs/gfs2/inode.c
@@ -80,12 +80,13 @@ static void gfs2_set_iop(struct inode *inode)
  * @sb: The super block
  * @no_addr: The inode number
  * @type: The type of the inode
+ * non_block: Can we block on inodes that are being freed?
  *
  * Returns: A VFS inode, or an error
  */
 
 struct inode *gfs2_inode_lookup(struct super_block *sb, unsigned int type,
-				u64 no_addr, u64 no_formal_ino)
+				u64 no_addr, u64 no_formal_ino, int non_block)
 {
 	struct inode *inode;
 	struct gfs2_inode *ip;
@@ -169,7 +170,7 @@ struct inode *gfs2_lookup_by_inum(struct gfs2_sbd *sdp, u64 no_addr,
 	if (error)
 		goto fail;
 
-	inode = gfs2_inode_lookup(sb, DT_UNKNOWN, no_addr, 0);
+	inode = gfs2_inode_lookup(sb, DT_UNKNOWN, no_addr, 0, 1);
 	if (IS_ERR(inode))
 		goto fail;
 
diff --git a/fs/gfs2/inode.h b/fs/gfs2/inode.h
index e1af0d4..22c27a8 100644
--- a/fs/gfs2/inode.h
+++ b/fs/gfs2/inode.h
@@ -94,7 +94,8 @@ err:
 }
 
 extern struct inode *gfs2_inode_lookup(struct super_block *sb, unsigned type, 
-				       u64 no_addr, u64 no_formal_ino);
+				       u64 no_addr, u64 no_formal_ino,
+				       int non_block);
 extern struct inode *gfs2_lookup_by_inum(struct gfs2_sbd *sdp, u64 no_addr,
 					 u64 *no_formal_ino,
 					 unsigned int blktype);
diff --git a/fs/gfs2/ops_fstype.c b/fs/gfs2/ops_fstype.c
index 49b0bff..dbed9e2 100644
--- a/fs/gfs2/ops_fstype.c
+++ b/fs/gfs2/ops_fstype.c
@@ -454,7 +454,7 @@ static int gfs2_lookup_root(struct super_block *sb, struct dentry **dptr,
 	struct dentry *dentry;
 	struct inode *inode;
 
-	inode = gfs2_inode_lookup(sb, DT_DIR, no_addr, 0);
+	inode = gfs2_inode_lookup(sb, DT_DIR, no_addr, 0, 0);
 	if (IS_ERR(inode)) {
 		fs_err(sdp, "can't read in %s inode: %ld\n", name, PTR_ERR(inode));
 		return PTR_ERR(inode);
-- 
2.5.5



^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH 2/7] Revert "GFS2: Don't filter out I_FREEING inodes anymore"
  2016-05-13 17:42 ` [Cluster-devel] " Andreas Gruenbacher
@ 2016-05-13 17:42   ` Andreas Gruenbacher
  -1 siblings, 0 replies; 19+ messages in thread
From: Andreas Gruenbacher @ 2016-05-13 17:42 UTC (permalink / raw)
  To: Alexander Viro, Bob Peterson, Steven Whitehouse
  Cc: cluster-devel, linux-fsdevel

From: Bob Peterson <rpeterso@redhat.com>

This reverts commit ff34245d524a898eee6e013eb1ec165095277148.

Signed-off-by: Bob Peterson <rpeterso@redhat.com>
---
 fs/gfs2/export.c |  2 +-
 fs/gfs2/glock.c  |  2 +-
 fs/gfs2/inode.c  | 58 +++++++++++++++++++++++++++++++++++++++++++++++++++++---
 fs/gfs2/inode.h  |  2 +-
 4 files changed, 58 insertions(+), 6 deletions(-)

diff --git a/fs/gfs2/export.c b/fs/gfs2/export.c
index d5bda85..5d15e94 100644
--- a/fs/gfs2/export.c
+++ b/fs/gfs2/export.c
@@ -137,7 +137,7 @@ static struct dentry *gfs2_get_dentry(struct super_block *sb,
 	struct gfs2_sbd *sdp = sb->s_fs_info;
 	struct inode *inode;
 
-	inode = gfs2_ilookup(sb, inum->no_addr);
+	inode = gfs2_ilookup(sb, inum->no_addr, 0);
 	if (inode) {
 		if (GFS2_I(inode)->i_no_formal_ino != inum->no_formal_ino) {
 			iput(inode);
diff --git a/fs/gfs2/glock.c b/fs/gfs2/glock.c
index 3910cea..672de35 100644
--- a/fs/gfs2/glock.c
+++ b/fs/gfs2/glock.c
@@ -589,7 +589,7 @@ static void delete_work_func(struct work_struct *work)
 	/* Note: Unsafe to dereference ip as we don't hold right refs/locks */
 
 	if (ip)
-		inode = gfs2_ilookup(sdp->sd_vfs, no_addr);
+		inode = gfs2_ilookup(sdp->sd_vfs, no_addr, 1);
 	else
 		inode = gfs2_lookup_by_inum(sdp, no_addr, NULL, GFS2_BLKST_UNLINKED);
 	if (inode && !IS_ERR(inode)) {
diff --git a/fs/gfs2/inode.c b/fs/gfs2/inode.c
index be6036a..48c1418 100644
--- a/fs/gfs2/inode.c
+++ b/fs/gfs2/inode.c
@@ -37,9 +37,61 @@
 #include "super.h"
 #include "glops.h"
 
-struct inode *gfs2_ilookup(struct super_block *sb, u64 no_addr)
+struct gfs2_skip_data {
+	u64 no_addr;
+	int skipped;
+	int non_block;
+};
+
+static int iget_test(struct inode *inode, void *opaque)
+{
+	struct gfs2_inode *ip = GFS2_I(inode);
+	struct gfs2_skip_data *data = opaque;
+
+	if (ip->i_no_addr == data->no_addr) {
+		if (data->non_block &&
+		    inode->i_state & (I_FREEING|I_CLEAR|I_WILL_FREE)) {
+			data->skipped = 1;
+			return 0;
+		}
+		return 1;
+	}
+	return 0;
+}
+
+static int iget_set(struct inode *inode, void *opaque)
 {
-	return ilookup(sb, (unsigned long)no_addr);
+	struct gfs2_inode *ip = GFS2_I(inode);
+	struct gfs2_skip_data *data = opaque;
+
+	if (data->skipped)
+		return -ENOENT;
+	inode->i_ino = (unsigned long)(data->no_addr);
+	ip->i_no_addr = data->no_addr;
+	return 0;
+}
+
+struct inode *gfs2_ilookup(struct super_block *sb, u64 no_addr, int non_block)
+{
+	unsigned long hash = (unsigned long)no_addr;
+	struct gfs2_skip_data data;
+
+	data.no_addr = no_addr;
+	data.skipped = 0;
+	data.non_block = non_block;
+	return ilookup5(sb, hash, iget_test, &data);
+}
+
+static struct inode *gfs2_iget(struct super_block *sb, u64 no_addr,
+			       int non_block)
+{
+	struct gfs2_skip_data data;
+	unsigned long hash = (unsigned long)no_addr;
+
+	data.no_addr = no_addr;
+	data.skipped = 0;
+	data.non_block = non_block;
+	return iget5_locked(sb, hash, iget_test, iget_set, &data);
 }
 
 /**
@@ -93,7 +145,7 @@ struct inode *gfs2_inode_lookup(struct super_block *sb, unsigned int type,
 	struct gfs2_glock *io_gl = NULL;
 	int error;
 
-	inode = iget_locked(sb, (unsigned long)no_addr);
+	inode = gfs2_iget(sb, no_addr, non_block);
 	if (!inode)
 		return ERR_PTR(-ENOMEM);
 
diff --git a/fs/gfs2/inode.h b/fs/gfs2/inode.h
index 22c27a8..ba4d949 100644
--- a/fs/gfs2/inode.h
+++ b/fs/gfs2/inode.h
@@ -99,7 +99,7 @@ extern struct inode *gfs2_inode_lookup(struct super_block *sb, unsigned type,
 extern struct inode *gfs2_lookup_by_inum(struct gfs2_sbd *sdp, u64 no_addr,
 					 u64 *no_formal_ino,
 					 unsigned int blktype);
-extern struct inode *gfs2_ilookup(struct super_block *sb, u64 no_addr);
+extern struct inode *gfs2_ilookup(struct super_block *sb, u64 no_addr, int nonblock);
 
 extern int gfs2_inode_refresh(struct gfs2_inode *ip);
 
-- 
2.5.5


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [Cluster-devel] [PATCH 2/7] Revert "GFS2: Don't filter out I_FREEING inodes anymore"
@ 2016-05-13 17:42   ` Andreas Gruenbacher
  0 siblings, 0 replies; 19+ messages in thread
From: Andreas Gruenbacher @ 2016-05-13 17:42 UTC (permalink / raw)
  To: cluster-devel.redhat.com

From: Bob Peterson <rpeterso@redhat.com>

This reverts commit ff34245d524a898eee6e013eb1ec165095277148.

Signed-off-by: Bob Peterson <rpeterso@redhat.com>
---
 fs/gfs2/export.c |  2 +-
 fs/gfs2/glock.c  |  2 +-
 fs/gfs2/inode.c  | 58 +++++++++++++++++++++++++++++++++++++++++++++++++++++---
 fs/gfs2/inode.h  |  2 +-
 4 files changed, 58 insertions(+), 6 deletions(-)

diff --git a/fs/gfs2/export.c b/fs/gfs2/export.c
index d5bda85..5d15e94 100644
--- a/fs/gfs2/export.c
+++ b/fs/gfs2/export.c
@@ -137,7 +137,7 @@ static struct dentry *gfs2_get_dentry(struct super_block *sb,
 	struct gfs2_sbd *sdp = sb->s_fs_info;
 	struct inode *inode;
 
-	inode = gfs2_ilookup(sb, inum->no_addr);
+	inode = gfs2_ilookup(sb, inum->no_addr, 0);
 	if (inode) {
 		if (GFS2_I(inode)->i_no_formal_ino != inum->no_formal_ino) {
 			iput(inode);
diff --git a/fs/gfs2/glock.c b/fs/gfs2/glock.c
index 3910cea..672de35 100644
--- a/fs/gfs2/glock.c
+++ b/fs/gfs2/glock.c
@@ -589,7 +589,7 @@ static void delete_work_func(struct work_struct *work)
 	/* Note: Unsafe to dereference ip as we don't hold right refs/locks */
 
 	if (ip)
-		inode = gfs2_ilookup(sdp->sd_vfs, no_addr);
+		inode = gfs2_ilookup(sdp->sd_vfs, no_addr, 1);
 	else
 		inode = gfs2_lookup_by_inum(sdp, no_addr, NULL, GFS2_BLKST_UNLINKED);
 	if (inode && !IS_ERR(inode)) {
diff --git a/fs/gfs2/inode.c b/fs/gfs2/inode.c
index be6036a..48c1418 100644
--- a/fs/gfs2/inode.c
+++ b/fs/gfs2/inode.c
@@ -37,9 +37,61 @@
 #include "super.h"
 #include "glops.h"
 
-struct inode *gfs2_ilookup(struct super_block *sb, u64 no_addr)
+struct gfs2_skip_data {
+	u64 no_addr;
+	int skipped;
+	int non_block;
+};
+
+static int iget_test(struct inode *inode, void *opaque)
+{
+	struct gfs2_inode *ip = GFS2_I(inode);
+	struct gfs2_skip_data *data = opaque;
+
+	if (ip->i_no_addr == data->no_addr) {
+		if (data->non_block &&
+		    inode->i_state & (I_FREEING|I_CLEAR|I_WILL_FREE)) {
+			data->skipped = 1;
+			return 0;
+		}
+		return 1;
+	}
+	return 0;
+}
+
+static int iget_set(struct inode *inode, void *opaque)
 {
-	return ilookup(sb, (unsigned long)no_addr);
+	struct gfs2_inode *ip = GFS2_I(inode);
+	struct gfs2_skip_data *data = opaque;
+
+	if (data->skipped)
+		return -ENOENT;
+	inode->i_ino = (unsigned long)(data->no_addr);
+	ip->i_no_addr = data->no_addr;
+	return 0;
+}
+
+struct inode *gfs2_ilookup(struct super_block *sb, u64 no_addr, int non_block)
+{
+	unsigned long hash = (unsigned long)no_addr;
+	struct gfs2_skip_data data;
+
+	data.no_addr = no_addr;
+	data.skipped = 0;
+	data.non_block = non_block;
+	return ilookup5(sb, hash, iget_test, &data);
+}
+
+static struct inode *gfs2_iget(struct super_block *sb, u64 no_addr,
+			       int non_block)
+{
+	struct gfs2_skip_data data;
+	unsigned long hash = (unsigned long)no_addr;
+
+	data.no_addr = no_addr;
+	data.skipped = 0;
+	data.non_block = non_block;
+	return iget5_locked(sb, hash, iget_test, iget_set, &data);
 }
 
 /**
@@ -93,7 +145,7 @@ struct inode *gfs2_inode_lookup(struct super_block *sb, unsigned int type,
 	struct gfs2_glock *io_gl = NULL;
 	int error;
 
-	inode = iget_locked(sb, (unsigned long)no_addr);
+	inode = gfs2_iget(sb, no_addr, non_block);
 	if (!inode)
 		return ERR_PTR(-ENOMEM);
 
diff --git a/fs/gfs2/inode.h b/fs/gfs2/inode.h
index 22c27a8..ba4d949 100644
--- a/fs/gfs2/inode.h
+++ b/fs/gfs2/inode.h
@@ -99,7 +99,7 @@ extern struct inode *gfs2_inode_lookup(struct super_block *sb, unsigned type,
 extern struct inode *gfs2_lookup_by_inum(struct gfs2_sbd *sdp, u64 no_addr,
 					 u64 *no_formal_ino,
 					 unsigned int blktype);
-extern struct inode *gfs2_ilookup(struct super_block *sb, u64 no_addr);
+extern struct inode *gfs2_ilookup(struct super_block *sb, u64 no_addr, int nonblock);
 
 extern int gfs2_inode_refresh(struct gfs2_inode *ip);
 
-- 
2.5.5



^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH 3/7] GFS2: Remove superfluous assignment
  2016-05-13 17:42 ` [Cluster-devel] " Andreas Gruenbacher
@ 2016-05-13 17:42   ` Andreas Gruenbacher
  -1 siblings, 0 replies; 19+ messages in thread
From: Andreas Gruenbacher @ 2016-05-13 17:42 UTC (permalink / raw)
  To: Alexander Viro, Bob Peterson, Steven Whitehouse
  Cc: Andreas Gruenbacher, cluster-devel, linux-fsdevel

This patch removes an unneeded assignment in gfs2_inode_lookup:
gfs2_iget already initializes GFS2_I(inode)->i_no_addr.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
---
 fs/gfs2/inode.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/fs/gfs2/inode.c b/fs/gfs2/inode.c
index 48c1418..80dd6ba 100644
--- a/fs/gfs2/inode.c
+++ b/fs/gfs2/inode.c
@@ -148,9 +148,7 @@ struct inode *gfs2_inode_lookup(struct super_block *sb, unsigned int type,
 	inode = gfs2_iget(sb, no_addr, non_block);
 	if (!inode)
 		return ERR_PTR(-ENOMEM);
-
 	ip = GFS2_I(inode);
-	ip->i_no_addr = no_addr;
 
 	if (inode->i_state & I_NEW) {
 		struct gfs2_sbd *sdp = GFS2_SB(inode);
-- 
2.5.5


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [Cluster-devel] [PATCH 3/7] GFS2: Remove superfluous assignment
@ 2016-05-13 17:42   ` Andreas Gruenbacher
  0 siblings, 0 replies; 19+ messages in thread
From: Andreas Gruenbacher @ 2016-05-13 17:42 UTC (permalink / raw)
  To: cluster-devel.redhat.com

This patch removes an unneeded assignment in gfs2_inode_lookup:
gfs2_iget already initializes GFS2_I(inode)->i_no_addr.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
---
 fs/gfs2/inode.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/fs/gfs2/inode.c b/fs/gfs2/inode.c
index 48c1418..80dd6ba 100644
--- a/fs/gfs2/inode.c
+++ b/fs/gfs2/inode.c
@@ -148,9 +148,7 @@ struct inode *gfs2_inode_lookup(struct super_block *sb, unsigned int type,
 	inode = gfs2_iget(sb, no_addr, non_block);
 	if (!inode)
 		return ERR_PTR(-ENOMEM);
-
 	ip = GFS2_I(inode);
-	ip->i_no_addr = no_addr;
 
 	if (inode->i_state & I_NEW) {
 		struct gfs2_sbd *sdp = GFS2_SB(inode);
-- 
2.5.5



^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH 4/7] GFS2: No need for non-blocking gfs2_ilookup in delete_work_func
  2016-05-13 17:42 ` [Cluster-devel] " Andreas Gruenbacher
@ 2016-05-13 17:42   ` Andreas Gruenbacher
  -1 siblings, 0 replies; 19+ messages in thread
From: Andreas Gruenbacher @ 2016-05-13 17:42 UTC (permalink / raw)
  To: Alexander Viro, Bob Peterson, Steven Whitehouse
  Cc: Andreas Gruenbacher, cluster-devel, linux-fsdevel

In delete_work_func, we are guaranteed not to be holding the glock of
the inode we are looking up.  If the inode we are looking for is being
freed, the lookup is free to block until the inode is gone.  (Freeing
the inode takes the glock, so we must not already be holding it.)

With that, no more non-blocking callers of gfs2_ilookup are left, and
the non-block parameter of gfs2_ilookup can be removed.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
---
 fs/gfs2/export.c | 2 +-
 fs/gfs2/glock.c  | 6 +++---
 fs/gfs2/inode.c  | 4 ++--
 fs/gfs2/inode.h  | 2 +-
 4 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/fs/gfs2/export.c b/fs/gfs2/export.c
index 5d15e94..d5bda85 100644
--- a/fs/gfs2/export.c
+++ b/fs/gfs2/export.c
@@ -137,7 +137,7 @@ static struct dentry *gfs2_get_dentry(struct super_block *sb,
 	struct gfs2_sbd *sdp = sb->s_fs_info;
 	struct inode *inode;
 
-	inode = gfs2_ilookup(sb, inum->no_addr, 0);
+	inode = gfs2_ilookup(sb, inum->no_addr);
 	if (inode) {
 		if (GFS2_I(inode)->i_no_formal_ino != inum->no_formal_ino) {
 			iput(inode);
diff --git a/fs/gfs2/glock.c b/fs/gfs2/glock.c
index 672de35..66d1216 100644
--- a/fs/gfs2/glock.c
+++ b/fs/gfs2/glock.c
@@ -576,7 +576,7 @@ static void delete_work_func(struct work_struct *work)
 	struct gfs2_glock *gl = container_of(work, struct gfs2_glock, gl_delete);
 	struct gfs2_sbd *sdp = gl->gl_name.ln_sbd;
 	struct gfs2_inode *ip;
-	struct inode *inode;
+	struct inode *inode = NULL;
 	u64 no_addr = gl->gl_name.ln_number;
 
 	/* If someone's using this glock to create a new dinode, the block must
@@ -589,8 +589,8 @@ static void delete_work_func(struct work_struct *work)
 	/* Note: Unsafe to dereference ip as we don't hold right refs/locks */
 
 	if (ip)
-		inode = gfs2_ilookup(sdp->sd_vfs, no_addr, 1);
-	else
+		inode = gfs2_ilookup(sdp->sd_vfs, no_addr);
+	if (IS_ERR_OR_NULL(inode))
 		inode = gfs2_lookup_by_inum(sdp, no_addr, NULL, GFS2_BLKST_UNLINKED);
 	if (inode && !IS_ERR(inode)) {
 		d_prune_aliases(inode);
diff --git a/fs/gfs2/inode.c b/fs/gfs2/inode.c
index 80dd6ba..3440219 100644
--- a/fs/gfs2/inode.c
+++ b/fs/gfs2/inode.c
@@ -71,14 +71,14 @@ static int iget_set(struct inode *inode, void *opaque)
 	return 0;
 }
 
-struct inode *gfs2_ilookup(struct super_block *sb, u64 no_addr, int non_block)
+struct inode *gfs2_ilookup(struct super_block *sb, u64 no_addr)
 {
 	unsigned long hash = (unsigned long)no_addr;
 	struct gfs2_skip_data data;
 
 	data.no_addr = no_addr;
 	data.skipped = 0;
-	data.non_block = non_block;
+	data.non_block = 0;
 	return ilookup5(sb, hash, iget_test, &data);
 }
 
diff --git a/fs/gfs2/inode.h b/fs/gfs2/inode.h
index ba4d949..22c27a8 100644
--- a/fs/gfs2/inode.h
+++ b/fs/gfs2/inode.h
@@ -99,7 +99,7 @@ extern struct inode *gfs2_inode_lookup(struct super_block *sb, unsigned type,
 extern struct inode *gfs2_lookup_by_inum(struct gfs2_sbd *sdp, u64 no_addr,
 					 u64 *no_formal_ino,
 					 unsigned int blktype);
-extern struct inode *gfs2_ilookup(struct super_block *sb, u64 no_addr, int nonblock);
+extern struct inode *gfs2_ilookup(struct super_block *sb, u64 no_addr);
 
 extern int gfs2_inode_refresh(struct gfs2_inode *ip);
 
-- 
2.5.5


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [Cluster-devel] [PATCH 4/7] GFS2: No need for non-blocking gfs2_ilookup in delete_work_func
@ 2016-05-13 17:42   ` Andreas Gruenbacher
  0 siblings, 0 replies; 19+ messages in thread
From: Andreas Gruenbacher @ 2016-05-13 17:42 UTC (permalink / raw)
  To: cluster-devel.redhat.com

In delete_work_func, we are guaranteed not to be holding the glock of
the inode we are looking up.  If the inode we are looking for is being
freed, the lookup is free to block until the inode is gone.  (Freeing
the inode takes the glock, so we must not already be holding it.)

With that, no more non-blocking callers of gfs2_ilookup are left, and
the non-block parameter of gfs2_ilookup can be removed.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
---
 fs/gfs2/export.c | 2 +-
 fs/gfs2/glock.c  | 6 +++---
 fs/gfs2/inode.c  | 4 ++--
 fs/gfs2/inode.h  | 2 +-
 4 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/fs/gfs2/export.c b/fs/gfs2/export.c
index 5d15e94..d5bda85 100644
--- a/fs/gfs2/export.c
+++ b/fs/gfs2/export.c
@@ -137,7 +137,7 @@ static struct dentry *gfs2_get_dentry(struct super_block *sb,
 	struct gfs2_sbd *sdp = sb->s_fs_info;
 	struct inode *inode;
 
-	inode = gfs2_ilookup(sb, inum->no_addr, 0);
+	inode = gfs2_ilookup(sb, inum->no_addr);
 	if (inode) {
 		if (GFS2_I(inode)->i_no_formal_ino != inum->no_formal_ino) {
 			iput(inode);
diff --git a/fs/gfs2/glock.c b/fs/gfs2/glock.c
index 672de35..66d1216 100644
--- a/fs/gfs2/glock.c
+++ b/fs/gfs2/glock.c
@@ -576,7 +576,7 @@ static void delete_work_func(struct work_struct *work)
 	struct gfs2_glock *gl = container_of(work, struct gfs2_glock, gl_delete);
 	struct gfs2_sbd *sdp = gl->gl_name.ln_sbd;
 	struct gfs2_inode *ip;
-	struct inode *inode;
+	struct inode *inode = NULL;
 	u64 no_addr = gl->gl_name.ln_number;
 
 	/* If someone's using this glock to create a new dinode, the block must
@@ -589,8 +589,8 @@ static void delete_work_func(struct work_struct *work)
 	/* Note: Unsafe to dereference ip as we don't hold right refs/locks */
 
 	if (ip)
-		inode = gfs2_ilookup(sdp->sd_vfs, no_addr, 1);
-	else
+		inode = gfs2_ilookup(sdp->sd_vfs, no_addr);
+	if (IS_ERR_OR_NULL(inode))
 		inode = gfs2_lookup_by_inum(sdp, no_addr, NULL, GFS2_BLKST_UNLINKED);
 	if (inode && !IS_ERR(inode)) {
 		d_prune_aliases(inode);
diff --git a/fs/gfs2/inode.c b/fs/gfs2/inode.c
index 80dd6ba..3440219 100644
--- a/fs/gfs2/inode.c
+++ b/fs/gfs2/inode.c
@@ -71,14 +71,14 @@ static int iget_set(struct inode *inode, void *opaque)
 	return 0;
 }
 
-struct inode *gfs2_ilookup(struct super_block *sb, u64 no_addr, int non_block)
+struct inode *gfs2_ilookup(struct super_block *sb, u64 no_addr)
 {
 	unsigned long hash = (unsigned long)no_addr;
 	struct gfs2_skip_data data;
 
 	data.no_addr = no_addr;
 	data.skipped = 0;
-	data.non_block = non_block;
+	data.non_block = 0;
 	return ilookup5(sb, hash, iget_test, &data);
 }
 
diff --git a/fs/gfs2/inode.h b/fs/gfs2/inode.h
index ba4d949..22c27a8 100644
--- a/fs/gfs2/inode.h
+++ b/fs/gfs2/inode.h
@@ -99,7 +99,7 @@ extern struct inode *gfs2_inode_lookup(struct super_block *sb, unsigned type,
 extern struct inode *gfs2_lookup_by_inum(struct gfs2_sbd *sdp, u64 no_addr,
 					 u64 *no_formal_ino,
 					 unsigned int blktype);
-extern struct inode *gfs2_ilookup(struct super_block *sb, u64 no_addr, int nonblock);
+extern struct inode *gfs2_ilookup(struct super_block *sb, u64 no_addr);
 
 extern int gfs2_inode_refresh(struct gfs2_inode *ip);
 
-- 
2.5.5



^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH 5/7] vfs: Introduce prepare_wait_on_freeing_inode
  2016-05-13 17:42 ` [Cluster-devel] " Andreas Gruenbacher
@ 2016-05-13 17:42   ` Andreas Gruenbacher
  -1 siblings, 0 replies; 19+ messages in thread
From: Andreas Gruenbacher @ 2016-05-13 17:42 UTC (permalink / raw)
  To: Alexander Viro, Bob Peterson, Steven Whitehouse
  Cc: Andreas Gruenbacher, cluster-devel, linux-fsdevel

Add function prepare_wait_on_freeing_inode: during an inode lookup,
freeing an inode can require taking a lock that the filesystem already
holds.  The lock must be dropped before we can wait for the inode to go
away; waiting inside find_inode / find_inode_fast would deadlock.

In that case, filesystems can use find_inode_nowait.  When an inode that
is being freed is found, they can prepare to wait for the inode to go
away inside find_inode_nowait, then drop the conflicting lock and wait
outside of find_inode_nowait.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
---
 fs/inode.c           | 17 ++++++++++++++---
 include/linux/fs.h   |  1 +
 include/linux/wait.h | 21 +++++++++++++++------
 3 files changed, 30 insertions(+), 9 deletions(-)

diff --git a/fs/inode.c b/fs/inode.c
index 69b8b52..2979a7f 100644
--- a/fs/inode.c
+++ b/fs/inode.c
@@ -1818,6 +1818,17 @@ int inode_needs_sync(struct inode *inode)
 }
 EXPORT_SYMBOL(inode_needs_sync);
 
+wait_queue_head_t *prepare_wait_on_freeing_inode(
+		struct inode *inode, struct wait_bit_queue *wait)
+{
+	wait_queue_head_t *wq = bit_waitqueue(&inode->i_state, __I_NEW);
+
+	init_wait_bit(wait, &inode->i_state, __I_NEW);
+	prepare_to_wait(wq, &wait->wait, TASK_UNINTERRUPTIBLE);
+	return wq;
+}
+EXPORT_SYMBOL(prepare_wait_on_freeing_inode);
+
 /*
  * If we try to find an inode in the inode hash while it is being
  * deleted, we have to wait until the filesystem completes its
@@ -1831,10 +1842,10 @@ EXPORT_SYMBOL(inode_needs_sync);
  */
 static void __wait_on_freeing_inode(struct inode *inode)
 {
+	struct wait_bit_queue wait;
 	wait_queue_head_t *wq;
-	DEFINE_WAIT_BIT(wait, &inode->i_state, __I_NEW);
-	wq = bit_waitqueue(&inode->i_state, __I_NEW);
-	prepare_to_wait(wq, &wait.wait, TASK_UNINTERRUPTIBLE);
+
+	wq = prepare_wait_on_freeing_inode(inode, &wait);
 	spin_unlock(&inode->i_lock);
 	spin_unlock(&inode_hash_lock);
 	schedule();
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 70e61b5..7cdf365 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -2629,6 +2629,7 @@ extern void address_space_init_once(struct address_space *mapping);
 extern struct inode * igrab(struct inode *);
 extern ino_t iunique(struct super_block *, ino_t);
 extern int inode_needs_sync(struct inode *inode);
+extern wait_queue_head_t *prepare_wait_on_freeing_inode(struct inode *inode, struct wait_bit_queue *wait);
 extern int generic_delete_inode(struct inode *inode);
 static inline int generic_drop_inode(struct inode *inode)
 {
diff --git a/include/linux/wait.h b/include/linux/wait.h
index 27d7a0a..45ef1d2 100644
--- a/include/linux/wait.h
+++ b/include/linux/wait.h
@@ -991,6 +991,14 @@ int wake_bit_function(wait_queue_t *wait, unsigned mode, int sync, void *key);
 
 #define DEFINE_WAIT(name) DEFINE_WAIT_FUNC(name, autoremove_wake_function)
 
+#define init_wait(wait)							\
+	do {								\
+		(wait)->private = current;				\
+		(wait)->func = autoremove_wake_function;		\
+		INIT_LIST_HEAD(&(wait)->task_list);			\
+		(wait)->flags = 0;					\
+	} while (0)
+
 #define DEFINE_WAIT_BIT(name, word, bit)				\
 	struct wait_bit_queue name = {					\
 		.key = __WAIT_BIT_KEY_INITIALIZER(word, bit),		\
@@ -1002,15 +1010,16 @@ int wake_bit_function(wait_queue_t *wait, unsigned mode, int sync, void *key);
 		},							\
 	}
 
-#define init_wait(wait)							\
+#define init_wait_bit(wbit, word, bit)					\
 	do {								\
-		(wait)->private = current;				\
-		(wait)->func = autoremove_wake_function;		\
-		INIT_LIST_HEAD(&(wait)->task_list);			\
-		(wait)->flags = 0;					\
+		(wbit)->key.flags = word;				\
+		(wbit)->key.bit_nr = bit;				\
+		(wbit)->wait.private = current;				\
+		(wbit)->wait.func = wake_bit_function;			\
+		INIT_LIST_HEAD(&(wbit)->wait.task_list);		\
+		(wbit)->wait.flags = 0;					\
 	} while (0)
 
-
 extern int bit_wait(struct wait_bit_key *, int);
 extern int bit_wait_io(struct wait_bit_key *, int);
 extern int bit_wait_timeout(struct wait_bit_key *, int);
-- 
2.5.5


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [Cluster-devel] [PATCH 5/7] vfs: Introduce prepare_wait_on_freeing_inode
@ 2016-05-13 17:42   ` Andreas Gruenbacher
  0 siblings, 0 replies; 19+ messages in thread
From: Andreas Gruenbacher @ 2016-05-13 17:42 UTC (permalink / raw)
  To: cluster-devel.redhat.com

Add function prepare_wait_on_freeing_inode: during an inode lookup,
freeing an inode can require taking a lock that the filesystem already
holds.  The lock must be dropped before we can wait for the inode to go
away; waiting inside find_inode / find_inode_fast would deadlock.

In that case, filesystems can use find_inode_nowait.  When an inode that
is being freed is found, they can prepare to wait for the inode to go
away inside find_inode_nowait, then drop the conflicting lock and wait
outside of find_inode_nowait.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
---
 fs/inode.c           | 17 ++++++++++++++---
 include/linux/fs.h   |  1 +
 include/linux/wait.h | 21 +++++++++++++++------
 3 files changed, 30 insertions(+), 9 deletions(-)

diff --git a/fs/inode.c b/fs/inode.c
index 69b8b52..2979a7f 100644
--- a/fs/inode.c
+++ b/fs/inode.c
@@ -1818,6 +1818,17 @@ int inode_needs_sync(struct inode *inode)
 }
 EXPORT_SYMBOL(inode_needs_sync);
 
+wait_queue_head_t *prepare_wait_on_freeing_inode(
+		struct inode *inode, struct wait_bit_queue *wait)
+{
+	wait_queue_head_t *wq = bit_waitqueue(&inode->i_state, __I_NEW);
+
+	init_wait_bit(wait, &inode->i_state, __I_NEW);
+	prepare_to_wait(wq, &wait->wait, TASK_UNINTERRUPTIBLE);
+	return wq;
+}
+EXPORT_SYMBOL(prepare_wait_on_freeing_inode);
+
 /*
  * If we try to find an inode in the inode hash while it is being
  * deleted, we have to wait until the filesystem completes its
@@ -1831,10 +1842,10 @@ EXPORT_SYMBOL(inode_needs_sync);
  */
 static void __wait_on_freeing_inode(struct inode *inode)
 {
+	struct wait_bit_queue wait;
 	wait_queue_head_t *wq;
-	DEFINE_WAIT_BIT(wait, &inode->i_state, __I_NEW);
-	wq = bit_waitqueue(&inode->i_state, __I_NEW);
-	prepare_to_wait(wq, &wait.wait, TASK_UNINTERRUPTIBLE);
+
+	wq = prepare_wait_on_freeing_inode(inode, &wait);
 	spin_unlock(&inode->i_lock);
 	spin_unlock(&inode_hash_lock);
 	schedule();
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 70e61b5..7cdf365 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -2629,6 +2629,7 @@ extern void address_space_init_once(struct address_space *mapping);
 extern struct inode * igrab(struct inode *);
 extern ino_t iunique(struct super_block *, ino_t);
 extern int inode_needs_sync(struct inode *inode);
+extern wait_queue_head_t *prepare_wait_on_freeing_inode(struct inode *inode, struct wait_bit_queue *wait);
 extern int generic_delete_inode(struct inode *inode);
 static inline int generic_drop_inode(struct inode *inode)
 {
diff --git a/include/linux/wait.h b/include/linux/wait.h
index 27d7a0a..45ef1d2 100644
--- a/include/linux/wait.h
+++ b/include/linux/wait.h
@@ -991,6 +991,14 @@ int wake_bit_function(wait_queue_t *wait, unsigned mode, int sync, void *key);
 
 #define DEFINE_WAIT(name) DEFINE_WAIT_FUNC(name, autoremove_wake_function)
 
+#define init_wait(wait)							\
+	do {								\
+		(wait)->private = current;				\
+		(wait)->func = autoremove_wake_function;		\
+		INIT_LIST_HEAD(&(wait)->task_list);			\
+		(wait)->flags = 0;					\
+	} while (0)
+
 #define DEFINE_WAIT_BIT(name, word, bit)				\
 	struct wait_bit_queue name = {					\
 		.key = __WAIT_BIT_KEY_INITIALIZER(word, bit),		\
@@ -1002,15 +1010,16 @@ int wake_bit_function(wait_queue_t *wait, unsigned mode, int sync, void *key);
 		},							\
 	}
 
-#define init_wait(wait)							\
+#define init_wait_bit(wbit, word, bit)					\
 	do {								\
-		(wait)->private = current;				\
-		(wait)->func = autoremove_wake_function;		\
-		INIT_LIST_HEAD(&(wait)->task_list);			\
-		(wait)->flags = 0;					\
+		(wbit)->key.flags = word;				\
+		(wbit)->key.bit_nr = bit;				\
+		(wbit)->wait.private = current;				\
+		(wbit)->wait.func = wake_bit_function;			\
+		INIT_LIST_HEAD(&(wbit)->wait.task_list);		\
+		(wbit)->wait.flags = 0;					\
 	} while (0)
 
-
 extern int bit_wait(struct wait_bit_key *, int);
 extern int bit_wait_io(struct wait_bit_key *, int);
 extern int bit_wait_timeout(struct wait_bit_key *, int);
-- 
2.5.5



^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH 6/7] GFS2: Use non-blocking wait in gfs2_iget
  2016-05-13 17:42 ` [Cluster-devel] " Andreas Gruenbacher
@ 2016-05-13 17:42   ` Andreas Gruenbacher
  -1 siblings, 0 replies; 19+ messages in thread
From: Andreas Gruenbacher @ 2016-05-13 17:42 UTC (permalink / raw)
  To: Alexander Viro, Bob Peterson, Steven Whitehouse
  Cc: Andreas Gruenbacher, cluster-devel, linux-fsdevel

Change gfs2_iget to use non-blocking lookups internally.  This will be
used to prevent glock deadlocks.

(Requires exporting __iget from fs/inode.c.)

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
---
 fs/gfs2/inode.c | 104 ++++++++++++++++++++++++++++++++++++--------------------
 fs/gfs2/inode.h |   5 +++
 fs/inode.c      |   1 +
 3 files changed, 74 insertions(+), 36 deletions(-)

diff --git a/fs/gfs2/inode.c b/fs/gfs2/inode.c
index 3440219..1fa4799 100644
--- a/fs/gfs2/inode.c
+++ b/fs/gfs2/inode.c
@@ -20,6 +20,7 @@
 #include <linux/fiemap.h>
 #include <linux/security.h>
 #include <asm/uaccess.h>
+#include <linux/writeback.h>
 
 #include "gfs2.h"
 #include "incore.h"
@@ -37,61 +38,92 @@
 #include "super.h"
 #include "glops.h"
 
-struct gfs2_skip_data {
+struct gfs2_match {
 	u64 no_addr;
-	int skipped;
-	int non_block;
+	struct gfs2_freeing_inode *freeing;
 };
 
-static int iget_test(struct inode *inode, void *opaque)
+/* gfs2_match_inode - Inode matching function
+ * @inode: inode pointer
+ * @l: hash value (unused, since it may not be able to hold our no_addr)
+ * @opaque: points to a gfs2_freeing_inode structure
+ */
+static int gfs2_match_inode(struct inode *inode, unsigned long l, void *opaque)
 {
-	struct gfs2_inode *ip = GFS2_I(inode);
-	struct gfs2_skip_data *data = opaque;
+	struct gfs2_match *match = opaque;
+	struct gfs2_freeing_inode *freeing = match->freeing;
 
-	if (ip->i_no_addr == data->no_addr) {
-		if (data->non_block &&
-		    inode->i_state & (I_FREEING|I_CLEAR|I_WILL_FREE)) {
-			data->skipped = 1;
-			return 0;
-		}
-		return 1;
+	if (GFS2_I(inode)->i_no_addr != match->no_addr)
+		return 0;
+
+	spin_lock(&inode->i_lock);
+	if (inode->i_state & (I_FREEING|I_WILL_FREE)) {
+		freeing->wq = prepare_wait_on_freeing_inode(inode,
+			&freeing->bit_wait);
+		spin_unlock(&inode->i_lock);
+		return -1;
 	}
-	return 0;
+	__iget(inode);
+	spin_unlock(&inode->i_lock);
+	return 1;
 }
 
-static int iget_set(struct inode *inode, void *opaque)
+static int gfs2_test_inode(struct inode *inode, void *opaque)
 {
-	struct gfs2_inode *ip = GFS2_I(inode);
-	struct gfs2_skip_data *data = opaque;
+	struct gfs2_match *match = opaque;
 
-	if (data->skipped)
-		return -ENOENT;
-	inode->i_ino = (unsigned long)(data->no_addr);
-	ip->i_no_addr = data->no_addr;
-	return 0;
+	return GFS2_I(inode)->i_no_addr == match->no_addr;
 }
 
 struct inode *gfs2_ilookup(struct super_block *sb, u64 no_addr)
 {
-	unsigned long hash = (unsigned long)no_addr;
-	struct gfs2_skip_data data;
+	struct gfs2_match match = {
+		.no_addr = no_addr,
+	};
 
-	data.no_addr = no_addr;
-	data.skipped = 0;
-	data.non_block = 0;
-	return ilookup5(sb, hash, iget_test, &data);
+	return ilookup5(sb, no_addr, gfs2_test_inode, &match);
 }
 
 static struct inode *gfs2_iget(struct super_block *sb, u64 no_addr,
 			       int non_block)
 {
-	struct gfs2_skip_data data;
-	unsigned long hash = (unsigned long)no_addr;
+	struct gfs2_freeing_inode freeing;
+	struct gfs2_match match = {
+		.no_addr = no_addr,
+		.freeing = &freeing,
+	};
+	struct inode *inode;
 
-	data.no_addr = no_addr;
-	data.skipped = 0;
-	data.non_block = non_block;
-	return iget5_locked(sb, hash, iget_test, iget_set, &data);
+	while (1) {
+		freeing.wq = NULL;
+		inode = find_inode_nowait(sb, no_addr,
+					  gfs2_match_inode, &match);
+		if (inode) {
+			wait_on_inode(inode);
+			return inode;
+		}
+		if (freeing.wq) {
+			if (non_block) {
+				finish_wait(freeing.wq, &freeing.bit_wait.wait);
+				return ERR_PTR(-EAGAIN);
+			}
+			schedule();
+			finish_wait(freeing.wq, &freeing.bit_wait.wait);
+			continue;
+		}
+
+		inode = new_inode(sb);
+		if (!inode)
+			return ERR_PTR(-ENOMEM);
+		inode->i_ino = no_addr;
+		GFS2_I(inode)->i_no_addr = no_addr;
+		if (insert_inode_locked4(inode, no_addr,
+					 gfs2_test_inode, &match) < 0) {
+			iput(inode);
+			continue;
+		}
+		return inode;
+	}
 }
 
 /**
@@ -146,8 +178,8 @@ struct inode *gfs2_inode_lookup(struct super_block *sb, unsigned int type,
 	int error;
 
 	inode = gfs2_iget(sb, no_addr, non_block);
-	if (!inode)
-		return ERR_PTR(-ENOMEM);
+	if (IS_ERR(inode))
+		return inode;
 	ip = GFS2_I(inode);
 
 	if (inode->i_state & I_NEW) {
diff --git a/fs/gfs2/inode.h b/fs/gfs2/inode.h
index 22c27a8..4863513 100644
--- a/fs/gfs2/inode.h
+++ b/fs/gfs2/inode.h
@@ -93,6 +93,11 @@ err:
 	return -EIO;
 }
 
+struct gfs2_freeing_inode {
+	wait_queue_head_t *wq;
+	struct wait_bit_queue bit_wait;
+};
+
 extern struct inode *gfs2_inode_lookup(struct super_block *sb, unsigned type, 
 				       u64 no_addr, u64 no_formal_ino,
 				       int non_block);
diff --git a/fs/inode.c b/fs/inode.c
index 2979a7f..2cb18bd 100644
--- a/fs/inode.c
+++ b/fs/inode.c
@@ -387,6 +387,7 @@ void __iget(struct inode *inode)
 {
 	atomic_inc(&inode->i_count);
 }
+EXPORT_SYMBOL(__iget);
 
 /*
  * get additional reference to inode; caller must already hold one.
-- 
2.5.5


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [Cluster-devel] [PATCH 6/7] GFS2: Use non-blocking wait in gfs2_iget
@ 2016-05-13 17:42   ` Andreas Gruenbacher
  0 siblings, 0 replies; 19+ messages in thread
From: Andreas Gruenbacher @ 2016-05-13 17:42 UTC (permalink / raw)
  To: cluster-devel.redhat.com

Change gfs2_iget to use non-blocking lookups internally.  This will be
used to prevent glock deadlocks.

(Requires exporting __iget from fs/inode.c.)

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
---
 fs/gfs2/inode.c | 104 ++++++++++++++++++++++++++++++++++++--------------------
 fs/gfs2/inode.h |   5 +++
 fs/inode.c      |   1 +
 3 files changed, 74 insertions(+), 36 deletions(-)

diff --git a/fs/gfs2/inode.c b/fs/gfs2/inode.c
index 3440219..1fa4799 100644
--- a/fs/gfs2/inode.c
+++ b/fs/gfs2/inode.c
@@ -20,6 +20,7 @@
 #include <linux/fiemap.h>
 #include <linux/security.h>
 #include <asm/uaccess.h>
+#include <linux/writeback.h>
 
 #include "gfs2.h"
 #include "incore.h"
@@ -37,61 +38,92 @@
 #include "super.h"
 #include "glops.h"
 
-struct gfs2_skip_data {
+struct gfs2_match {
 	u64 no_addr;
-	int skipped;
-	int non_block;
+	struct gfs2_freeing_inode *freeing;
 };
 
-static int iget_test(struct inode *inode, void *opaque)
+/* gfs2_match_inode - Inode matching function
+ * @inode: inode pointer
+ * @l: hash value (unused, since it may not be able to hold our no_addr)
+ * @opaque: points to a gfs2_freeing_inode structure
+ */
+static int gfs2_match_inode(struct inode *inode, unsigned long l, void *opaque)
 {
-	struct gfs2_inode *ip = GFS2_I(inode);
-	struct gfs2_skip_data *data = opaque;
+	struct gfs2_match *match = opaque;
+	struct gfs2_freeing_inode *freeing = match->freeing;
 
-	if (ip->i_no_addr == data->no_addr) {
-		if (data->non_block &&
-		    inode->i_state & (I_FREEING|I_CLEAR|I_WILL_FREE)) {
-			data->skipped = 1;
-			return 0;
-		}
-		return 1;
+	if (GFS2_I(inode)->i_no_addr != match->no_addr)
+		return 0;
+
+	spin_lock(&inode->i_lock);
+	if (inode->i_state & (I_FREEING|I_WILL_FREE)) {
+		freeing->wq = prepare_wait_on_freeing_inode(inode,
+			&freeing->bit_wait);
+		spin_unlock(&inode->i_lock);
+		return -1;
 	}
-	return 0;
+	__iget(inode);
+	spin_unlock(&inode->i_lock);
+	return 1;
 }
 
-static int iget_set(struct inode *inode, void *opaque)
+static int gfs2_test_inode(struct inode *inode, void *opaque)
 {
-	struct gfs2_inode *ip = GFS2_I(inode);
-	struct gfs2_skip_data *data = opaque;
+	struct gfs2_match *match = opaque;
 
-	if (data->skipped)
-		return -ENOENT;
-	inode->i_ino = (unsigned long)(data->no_addr);
-	ip->i_no_addr = data->no_addr;
-	return 0;
+	return GFS2_I(inode)->i_no_addr == match->no_addr;
 }
 
 struct inode *gfs2_ilookup(struct super_block *sb, u64 no_addr)
 {
-	unsigned long hash = (unsigned long)no_addr;
-	struct gfs2_skip_data data;
+	struct gfs2_match match = {
+		.no_addr = no_addr,
+	};
 
-	data.no_addr = no_addr;
-	data.skipped = 0;
-	data.non_block = 0;
-	return ilookup5(sb, hash, iget_test, &data);
+	return ilookup5(sb, no_addr, gfs2_test_inode, &match);
 }
 
 static struct inode *gfs2_iget(struct super_block *sb, u64 no_addr,
 			       int non_block)
 {
-	struct gfs2_skip_data data;
-	unsigned long hash = (unsigned long)no_addr;
+	struct gfs2_freeing_inode freeing;
+	struct gfs2_match match = {
+		.no_addr = no_addr,
+		.freeing = &freeing,
+	};
+	struct inode *inode;
 
-	data.no_addr = no_addr;
-	data.skipped = 0;
-	data.non_block = non_block;
-	return iget5_locked(sb, hash, iget_test, iget_set, &data);
+	while (1) {
+		freeing.wq = NULL;
+		inode = find_inode_nowait(sb, no_addr,
+					  gfs2_match_inode, &match);
+		if (inode) {
+			wait_on_inode(inode);
+			return inode;
+		}
+		if (freeing.wq) {
+			if (non_block) {
+				finish_wait(freeing.wq, &freeing.bit_wait.wait);
+				return ERR_PTR(-EAGAIN);
+			}
+			schedule();
+			finish_wait(freeing.wq, &freeing.bit_wait.wait);
+			continue;
+		}
+
+		inode = new_inode(sb);
+		if (!inode)
+			return ERR_PTR(-ENOMEM);
+		inode->i_ino = no_addr;
+		GFS2_I(inode)->i_no_addr = no_addr;
+		if (insert_inode_locked4(inode, no_addr,
+					 gfs2_test_inode, &match) < 0) {
+			iput(inode);
+			continue;
+		}
+		return inode;
+	}
 }
 
 /**
@@ -146,8 +178,8 @@ struct inode *gfs2_inode_lookup(struct super_block *sb, unsigned int type,
 	int error;
 
 	inode = gfs2_iget(sb, no_addr, non_block);
-	if (!inode)
-		return ERR_PTR(-ENOMEM);
+	if (IS_ERR(inode))
+		return inode;
 	ip = GFS2_I(inode);
 
 	if (inode->i_state & I_NEW) {
diff --git a/fs/gfs2/inode.h b/fs/gfs2/inode.h
index 22c27a8..4863513 100644
--- a/fs/gfs2/inode.h
+++ b/fs/gfs2/inode.h
@@ -93,6 +93,11 @@ err:
 	return -EIO;
 }
 
+struct gfs2_freeing_inode {
+	wait_queue_head_t *wq;
+	struct wait_bit_queue bit_wait;
+};
+
 extern struct inode *gfs2_inode_lookup(struct super_block *sb, unsigned type, 
 				       u64 no_addr, u64 no_formal_ino,
 				       int non_block);
diff --git a/fs/inode.c b/fs/inode.c
index 2979a7f..2cb18bd 100644
--- a/fs/inode.c
+++ b/fs/inode.c
@@ -387,6 +387,7 @@ void __iget(struct inode *inode)
 {
 	atomic_inc(&inode->i_count);
 }
+EXPORT_SYMBOL(__iget);
 
 /*
  * get additional reference to inode; caller must already hold one.
-- 
2.5.5



^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH 7/7] GFS2: Prevent deadlock in gfs2_lookup_by_inum
  2016-05-13 17:42 ` [Cluster-devel] " Andreas Gruenbacher
@ 2016-05-13 17:42   ` Andreas Gruenbacher
  -1 siblings, 0 replies; 19+ messages in thread
From: Andreas Gruenbacher @ 2016-05-13 17:42 UTC (permalink / raw)
  To: Alexander Viro, Bob Peterson, Steven Whitehouse
  Cc: Andreas Gruenbacher, cluster-devel, linux-fsdevel

In gfs2_lookup_by_inum, we must take the glock of a presumed inode
before we can determine if the block indeed still contains an inode,
then look up the inode.  When a lookup finds an inode that is being
freed, it usually waits until that inodes has gone before returning.
However, freeing the inode requires taking the inode glock, so we end up
deadlocking.

Fix that by changing gfs2_inode_lookup: instead of waiting for inodes
that are being freed, return the context necessary for waiting.  Then,
in gfs2_lookup_by_inum, drop the glock before waiting and retrying the
lookup.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
---
 fs/gfs2/dir.c        |  2 +-
 fs/gfs2/inode.c      | 42 +++++++++++++++++++++++++-----------------
 fs/gfs2/inode.h      |  2 +-
 fs/gfs2/ops_fstype.c |  2 +-
 4 files changed, 28 insertions(+), 20 deletions(-)

diff --git a/fs/gfs2/dir.c b/fs/gfs2/dir.c
index d4014af..eb54888 100644
--- a/fs/gfs2/dir.c
+++ b/fs/gfs2/dir.c
@@ -1660,7 +1660,7 @@ struct inode *gfs2_dir_search(struct inode *dir, const struct qstr *name,
 		brelse(bh);
 		if (fail_on_exist)
 			return ERR_PTR(-EEXIST);
-		inode = gfs2_inode_lookup(dir->i_sb, dtype, addr, formal_ino, 0);
+		inode = gfs2_inode_lookup(dir->i_sb, dtype, addr, formal_ino, NULL);
 		if (!IS_ERR(inode))
 			GFS2_I(inode)->i_rahead = rahead;
 		return inode;
diff --git a/fs/gfs2/inode.c b/fs/gfs2/inode.c
index 1fa4799..fb7ccb0 100644
--- a/fs/gfs2/inode.c
+++ b/fs/gfs2/inode.c
@@ -85,30 +85,29 @@ struct inode *gfs2_ilookup(struct super_block *sb, u64 no_addr)
 }
 
 static struct inode *gfs2_iget(struct super_block *sb, u64 no_addr,
-			       int non_block)
+			       struct gfs2_freeing_inode *freeing)
 {
-	struct gfs2_freeing_inode freeing;
-	struct gfs2_match match = {
-		.no_addr = no_addr,
-		.freeing = &freeing,
-	};
+	struct gfs2_freeing_inode wait_here;
+	struct gfs2_match match;
 	struct inode *inode;
 
+	if (!freeing)
+		freeing = &wait_here;
+	match.no_addr = no_addr;
+	match.freeing = freeing;
 	while (1) {
-		freeing.wq = NULL;
+		freeing->wq = NULL;
 		inode = find_inode_nowait(sb, no_addr,
 					  gfs2_match_inode, &match);
 		if (inode) {
 			wait_on_inode(inode);
 			return inode;
 		}
-		if (freeing.wq) {
-			if (non_block) {
-				finish_wait(freeing.wq, &freeing.bit_wait.wait);
+		if (freeing->wq) {
+			if (freeing != &wait_here)
 				return ERR_PTR(-EAGAIN);
-			}
 			schedule();
-			finish_wait(freeing.wq, &freeing.bit_wait.wait);
+			finish_wait(freeing->wq, &freeing->bit_wait.wait);
 			continue;
 		}
 
@@ -162,22 +161,23 @@ static void gfs2_set_iop(struct inode *inode)
 /**
  * gfs2_inode_lookup - Lookup an inode
  * @sb: The super block
- * @no_addr: The inode number
  * @type: The type of the inode
- * non_block: Can we block on inodes that are being freed?
+ * @no_addr: The inode number
+ * @freeing: Filled in when inode is being freed
  *
  * Returns: A VFS inode, or an error
  */
 
 struct inode *gfs2_inode_lookup(struct super_block *sb, unsigned int type,
-				u64 no_addr, u64 no_formal_ino, int non_block)
+				u64 no_addr, u64 no_formal_ino,
+				struct gfs2_freeing_inode *freeing)
 {
 	struct inode *inode;
 	struct gfs2_inode *ip;
 	struct gfs2_glock *io_gl = NULL;
 	int error;
 
-	inode = gfs2_iget(sb, no_addr, non_block);
+	inode = gfs2_iget(sb, no_addr, freeing);
 	if (IS_ERR(inode))
 		return inode;
 	ip = GFS2_I(inode);
@@ -237,11 +237,13 @@ fail:
 struct inode *gfs2_lookup_by_inum(struct gfs2_sbd *sdp, u64 no_addr,
 				  u64 *no_formal_ino, unsigned int blktype)
 {
+	struct gfs2_freeing_inode freeing;
 	struct super_block *sb = sdp->sd_vfs;
 	struct gfs2_holder i_gh;
 	struct inode *inode = NULL;
 	int error;
 
+repeat:
 	/* Must not read in block until block type is verified */
 	error = gfs2_glock_nq_num(sdp, no_addr, &gfs2_inode_glops,
 				  LM_ST_EXCLUSIVE, GL_SKIP, &i_gh);
@@ -252,7 +254,13 @@ struct inode *gfs2_lookup_by_inum(struct gfs2_sbd *sdp, u64 no_addr,
 	if (error)
 		goto fail;
 
-	inode = gfs2_inode_lookup(sb, DT_UNKNOWN, no_addr, 0, 1);
+	inode = gfs2_inode_lookup(sb, DT_UNKNOWN, no_addr, 0, &freeing);
+	if (inode == ERR_PTR(-EAGAIN)) {
+		gfs2_glock_dq_uninit(&i_gh);
+		schedule();
+		finish_wait(freeing.wq, &freeing.bit_wait.wait);
+		goto repeat;
+	}
 	if (IS_ERR(inode))
 		goto fail;
 
diff --git a/fs/gfs2/inode.h b/fs/gfs2/inode.h
index 4863513..f043601 100644
--- a/fs/gfs2/inode.h
+++ b/fs/gfs2/inode.h
@@ -100,7 +100,7 @@ struct gfs2_freeing_inode {
 
 extern struct inode *gfs2_inode_lookup(struct super_block *sb, unsigned type, 
 				       u64 no_addr, u64 no_formal_ino,
-				       int non_block);
+				       struct gfs2_freeing_inode *freeing);
 extern struct inode *gfs2_lookup_by_inum(struct gfs2_sbd *sdp, u64 no_addr,
 					 u64 *no_formal_ino,
 					 unsigned int blktype);
diff --git a/fs/gfs2/ops_fstype.c b/fs/gfs2/ops_fstype.c
index dbed9e2..cfc4158 100644
--- a/fs/gfs2/ops_fstype.c
+++ b/fs/gfs2/ops_fstype.c
@@ -454,7 +454,7 @@ static int gfs2_lookup_root(struct super_block *sb, struct dentry **dptr,
 	struct dentry *dentry;
 	struct inode *inode;
 
-	inode = gfs2_inode_lookup(sb, DT_DIR, no_addr, 0, 0);
+	inode = gfs2_inode_lookup(sb, DT_DIR, no_addr, 0, NULL);
 	if (IS_ERR(inode)) {
 		fs_err(sdp, "can't read in %s inode: %ld\n", name, PTR_ERR(inode));
 		return PTR_ERR(inode);
-- 
2.5.5


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [Cluster-devel] [PATCH 7/7] GFS2: Prevent deadlock in gfs2_lookup_by_inum
@ 2016-05-13 17:42   ` Andreas Gruenbacher
  0 siblings, 0 replies; 19+ messages in thread
From: Andreas Gruenbacher @ 2016-05-13 17:42 UTC (permalink / raw)
  To: cluster-devel.redhat.com

In gfs2_lookup_by_inum, we must take the glock of a presumed inode
before we can determine if the block indeed still contains an inode,
then look up the inode.  When a lookup finds an inode that is being
freed, it usually waits until that inodes has gone before returning.
However, freeing the inode requires taking the inode glock, so we end up
deadlocking.

Fix that by changing gfs2_inode_lookup: instead of waiting for inodes
that are being freed, return the context necessary for waiting.  Then,
in gfs2_lookup_by_inum, drop the glock before waiting and retrying the
lookup.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
---
 fs/gfs2/dir.c        |  2 +-
 fs/gfs2/inode.c      | 42 +++++++++++++++++++++++++-----------------
 fs/gfs2/inode.h      |  2 +-
 fs/gfs2/ops_fstype.c |  2 +-
 4 files changed, 28 insertions(+), 20 deletions(-)

diff --git a/fs/gfs2/dir.c b/fs/gfs2/dir.c
index d4014af..eb54888 100644
--- a/fs/gfs2/dir.c
+++ b/fs/gfs2/dir.c
@@ -1660,7 +1660,7 @@ struct inode *gfs2_dir_search(struct inode *dir, const struct qstr *name,
 		brelse(bh);
 		if (fail_on_exist)
 			return ERR_PTR(-EEXIST);
-		inode = gfs2_inode_lookup(dir->i_sb, dtype, addr, formal_ino, 0);
+		inode = gfs2_inode_lookup(dir->i_sb, dtype, addr, formal_ino, NULL);
 		if (!IS_ERR(inode))
 			GFS2_I(inode)->i_rahead = rahead;
 		return inode;
diff --git a/fs/gfs2/inode.c b/fs/gfs2/inode.c
index 1fa4799..fb7ccb0 100644
--- a/fs/gfs2/inode.c
+++ b/fs/gfs2/inode.c
@@ -85,30 +85,29 @@ struct inode *gfs2_ilookup(struct super_block *sb, u64 no_addr)
 }
 
 static struct inode *gfs2_iget(struct super_block *sb, u64 no_addr,
-			       int non_block)
+			       struct gfs2_freeing_inode *freeing)
 {
-	struct gfs2_freeing_inode freeing;
-	struct gfs2_match match = {
-		.no_addr = no_addr,
-		.freeing = &freeing,
-	};
+	struct gfs2_freeing_inode wait_here;
+	struct gfs2_match match;
 	struct inode *inode;
 
+	if (!freeing)
+		freeing = &wait_here;
+	match.no_addr = no_addr;
+	match.freeing = freeing;
 	while (1) {
-		freeing.wq = NULL;
+		freeing->wq = NULL;
 		inode = find_inode_nowait(sb, no_addr,
 					  gfs2_match_inode, &match);
 		if (inode) {
 			wait_on_inode(inode);
 			return inode;
 		}
-		if (freeing.wq) {
-			if (non_block) {
-				finish_wait(freeing.wq, &freeing.bit_wait.wait);
+		if (freeing->wq) {
+			if (freeing != &wait_here)
 				return ERR_PTR(-EAGAIN);
-			}
 			schedule();
-			finish_wait(freeing.wq, &freeing.bit_wait.wait);
+			finish_wait(freeing->wq, &freeing->bit_wait.wait);
 			continue;
 		}
 
@@ -162,22 +161,23 @@ static void gfs2_set_iop(struct inode *inode)
 /**
  * gfs2_inode_lookup - Lookup an inode
  * @sb: The super block
- * @no_addr: The inode number
  * @type: The type of the inode
- * non_block: Can we block on inodes that are being freed?
+ * @no_addr: The inode number
+ * @freeing: Filled in when inode is being freed
  *
  * Returns: A VFS inode, or an error
  */
 
 struct inode *gfs2_inode_lookup(struct super_block *sb, unsigned int type,
-				u64 no_addr, u64 no_formal_ino, int non_block)
+				u64 no_addr, u64 no_formal_ino,
+				struct gfs2_freeing_inode *freeing)
 {
 	struct inode *inode;
 	struct gfs2_inode *ip;
 	struct gfs2_glock *io_gl = NULL;
 	int error;
 
-	inode = gfs2_iget(sb, no_addr, non_block);
+	inode = gfs2_iget(sb, no_addr, freeing);
 	if (IS_ERR(inode))
 		return inode;
 	ip = GFS2_I(inode);
@@ -237,11 +237,13 @@ fail:
 struct inode *gfs2_lookup_by_inum(struct gfs2_sbd *sdp, u64 no_addr,
 				  u64 *no_formal_ino, unsigned int blktype)
 {
+	struct gfs2_freeing_inode freeing;
 	struct super_block *sb = sdp->sd_vfs;
 	struct gfs2_holder i_gh;
 	struct inode *inode = NULL;
 	int error;
 
+repeat:
 	/* Must not read in block until block type is verified */
 	error = gfs2_glock_nq_num(sdp, no_addr, &gfs2_inode_glops,
 				  LM_ST_EXCLUSIVE, GL_SKIP, &i_gh);
@@ -252,7 +254,13 @@ struct inode *gfs2_lookup_by_inum(struct gfs2_sbd *sdp, u64 no_addr,
 	if (error)
 		goto fail;
 
-	inode = gfs2_inode_lookup(sb, DT_UNKNOWN, no_addr, 0, 1);
+	inode = gfs2_inode_lookup(sb, DT_UNKNOWN, no_addr, 0, &freeing);
+	if (inode == ERR_PTR(-EAGAIN)) {
+		gfs2_glock_dq_uninit(&i_gh);
+		schedule();
+		finish_wait(freeing.wq, &freeing.bit_wait.wait);
+		goto repeat;
+	}
 	if (IS_ERR(inode))
 		goto fail;
 
diff --git a/fs/gfs2/inode.h b/fs/gfs2/inode.h
index 4863513..f043601 100644
--- a/fs/gfs2/inode.h
+++ b/fs/gfs2/inode.h
@@ -100,7 +100,7 @@ struct gfs2_freeing_inode {
 
 extern struct inode *gfs2_inode_lookup(struct super_block *sb, unsigned type, 
 				       u64 no_addr, u64 no_formal_ino,
-				       int non_block);
+				       struct gfs2_freeing_inode *freeing);
 extern struct inode *gfs2_lookup_by_inum(struct gfs2_sbd *sdp, u64 no_addr,
 					 u64 *no_formal_ino,
 					 unsigned int blktype);
diff --git a/fs/gfs2/ops_fstype.c b/fs/gfs2/ops_fstype.c
index dbed9e2..cfc4158 100644
--- a/fs/gfs2/ops_fstype.c
+++ b/fs/gfs2/ops_fstype.c
@@ -454,7 +454,7 @@ static int gfs2_lookup_root(struct super_block *sb, struct dentry **dptr,
 	struct dentry *dentry;
 	struct inode *inode;
 
-	inode = gfs2_inode_lookup(sb, DT_DIR, no_addr, 0, 0);
+	inode = gfs2_inode_lookup(sb, DT_DIR, no_addr, 0, NULL);
 	if (IS_ERR(inode)) {
 		fs_err(sdp, "can't read in %s inode: %ld\n", name, PTR_ERR(inode));
 		return PTR_ERR(inode);
-- 
2.5.5



^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: [PATCH 0/7] Allow lock dropping before waiting on inodes being freed
  2016-05-13 17:42 ` [Cluster-devel] " Andreas Gruenbacher
@ 2016-05-16 12:14   ` Andreas Gruenbacher
  -1 siblings, 0 replies; 19+ messages in thread
From: Andreas Gruenbacher @ 2016-05-16 12:14 UTC (permalink / raw)
  To: Alexander Viro, Bob Peterson, Steven Whitehouse
  Cc: Andreas Gruenbacher, cluster-devel, linux-fsdevel

On Fri, May 13, 2016 at 7:42 PM, Andreas Gruenbacher
<agruenba@redhat.com> wrote:
> This patch set my Bob and me avoids a deadlock in gfs2 when an inode is being
> freed: normally, when find_inode finds an inode in I_FREEING or I_WILL_FREE
> state, it waits for the inode to go away before returning
> (__wait_on_freeing_inode).  Freeing an inode on gfs2 requires taking the
> inode's glock.  However, in gfs2_lookup_by_inum, we need to be holding the
> inode glock while doing lookups, so we can deadlock with processes concurrently
> freeing inodes.
>
> This patch set allows gfs2_lookup_by_inum to drop the inode glock before
> waiting for inodes to go away.  We could try non-blocking lookups and wait for
> an arbitrary amount of time before retrying the lookup instead as well, but
> waiting for the actual event (i.e., the inode being freed to likely have
> disappeared) makes more sense.

Another approach we have tried was to use iget5_locked and to drop the
glock in the test callback if necessary. However, we'd have to
atomically test for freeing inodes, release the glock, and cause
iget5_locked (actually find_inode) to wait and repeat. The test
callback can only return whether the inode matches the one we're
looking for though; it cannot return an error code like -EBUSY.

I've looked through the existing instances of iget5_locked. All test
callbacks only return 0 or 1, so we could theoretically change
iget5_locked to deal with a test result of -EBUSY or so as well. There
still doesn't seem to be a way of cleanly implementing the waiting for
freeing inodes with this approach though: the test callback is called
without i_lock held; it would have to __iget the inode itself to
prevent if from becoming freeing; find_inode would then __iget the
inode a second time. Yuck.

Andreas

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [Cluster-devel] [PATCH 0/7] Allow lock dropping before waiting on inodes being freed
@ 2016-05-16 12:14   ` Andreas Gruenbacher
  0 siblings, 0 replies; 19+ messages in thread
From: Andreas Gruenbacher @ 2016-05-16 12:14 UTC (permalink / raw)
  To: cluster-devel.redhat.com

On Fri, May 13, 2016 at 7:42 PM, Andreas Gruenbacher
<agruenba@redhat.com> wrote:
> This patch set my Bob and me avoids a deadlock in gfs2 when an inode is being
> freed: normally, when find_inode finds an inode in I_FREEING or I_WILL_FREE
> state, it waits for the inode to go away before returning
> (__wait_on_freeing_inode).  Freeing an inode on gfs2 requires taking the
> inode's glock.  However, in gfs2_lookup_by_inum, we need to be holding the
> inode glock while doing lookups, so we can deadlock with processes concurrently
> freeing inodes.
>
> This patch set allows gfs2_lookup_by_inum to drop the inode glock before
> waiting for inodes to go away.  We could try non-blocking lookups and wait for
> an arbitrary amount of time before retrying the lookup instead as well, but
> waiting for the actual event (i.e., the inode being freed to likely have
> disappeared) makes more sense.

Another approach we have tried was to use iget5_locked and to drop the
glock in the test callback if necessary. However, we'd have to
atomically test for freeing inodes, release the glock, and cause
iget5_locked (actually find_inode) to wait and repeat. The test
callback can only return whether the inode matches the one we're
looking for though; it cannot return an error code like -EBUSY.

I've looked through the existing instances of iget5_locked. All test
callbacks only return 0 or 1, so we could theoretically change
iget5_locked to deal with a test result of -EBUSY or so as well. There
still doesn't seem to be a way of cleanly implementing the waiting for
freeing inodes with this approach though: the test callback is called
without i_lock held; it would have to __iget the inode itself to
prevent if from becoming freeing; find_inode would then __iget the
inode a second time. Yuck.

Andreas



^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 0/7] Allow lock dropping before waiting on inodes being freed
  2016-05-13 17:42 ` [Cluster-devel] " Andreas Gruenbacher
                   ` (8 preceding siblings ...)
  (?)
@ 2016-05-31 16:45 ` Bob Peterson
  -1 siblings, 0 replies; 19+ messages in thread
From: Bob Peterson @ 2016-05-31 16:45 UTC (permalink / raw)
  To: Miklos Szeredi, Alexander Viro
  Cc: linux-fsdevel, Andreas Gruenbacher, Steven Whitehouse

----- Original Message -----
| This patch set my Bob and me avoids a deadlock in gfs2 when an inode is being
| freed: normally, when find_inode finds an inode in I_FREEING or I_WILL_FREE
| state, it waits for the inode to go away before returning
| (__wait_on_freeing_inode).  Freeing an inode on gfs2 requires taking the
| inode's glock.  However, in gfs2_lookup_by_inum, we need to be holding the
| inode glock while doing lookups, so we can deadlock with processes
| concurrently
| freeing inodes.
| 
| This patch set allows gfs2_lookup_by_inum to drop the inode glock before
| waiting for inodes to go away.  We could try non-blocking lookups and wait
| for
| an arbitrary amount of time before retrying the lookup instead as well, but
| waiting for the actual event (i.e., the inode being freed to likely have
| disappeared) makes more sense.
| 
| Thanks,
| Andreas
| 
| Andreas Gruenbacher (5):
|   GFS2: Remove superfluous assignment
|   GFS2: No need for non-blocking gfs2_ilookup in delete_work_func
|   vfs: Introduce prepare_wait_on_freeing_inode
|   GFS2: Use non-blocking wait in gfs2_iget
|   GFS2: Prevent deadlock in gfs2_lookup_by_inum
| 
| Bob Peterson (2):
|   Revert "GFS2: Eliminate parameter non_block on gfs2_inode_lookup"
|   Revert "GFS2: Don't filter out I_FREEING inodes anymore"
| 
|  fs/gfs2/dir.c        |   2 +-
|  fs/gfs2/glock.c      |   4 +-
|  fs/gfs2/inode.c      | 109
|  ++++++++++++++++++++++++++++++++++++++++++++++-----
|  fs/gfs2/inode.h      |   8 +++-
|  fs/gfs2/ops_fstype.c |   2 +-
|  fs/inode.c           |  18 +++++++--
|  include/linux/fs.h   |   1 +
|  include/linux/wait.h |  21 +++++++---
|  8 files changed, 142 insertions(+), 23 deletions(-)
| 
| --
| 2.5.5

Hi Miklos and Al,

Did you two guys see the vfs-related changes contained in the patch set
sent by Andreas on 13 May, and resent by me on 24 May? We have not received
any comments about the patches, but since they affect VFS, I wanted to be
sure you two looked them over before I pushed them to the gfs2 for-next tree.
I just don't want to be accused of trying to slip this in under the radar or
you guys being taken by surprise.

Regards,

Bob Peterson
Red Hat File Systems

^ permalink raw reply	[flat|nested] 19+ messages in thread

end of thread, other threads:[~2016-05-31 16:45 UTC | newest]

Thread overview: 19+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-05-13 17:42 [PATCH 0/7] Allow lock dropping before waiting on inodes being freed Andreas Gruenbacher
2016-05-13 17:42 ` [Cluster-devel] " Andreas Gruenbacher
2016-05-13 17:42 ` [PATCH 1/7] Revert "GFS2: Eliminate parameter non_block on gfs2_inode_lookup" Andreas Gruenbacher
2016-05-13 17:42   ` [Cluster-devel] " Andreas Gruenbacher
2016-05-13 17:42 ` [PATCH 2/7] Revert "GFS2: Don't filter out I_FREEING inodes anymore" Andreas Gruenbacher
2016-05-13 17:42   ` [Cluster-devel] " Andreas Gruenbacher
2016-05-13 17:42 ` [PATCH 3/7] GFS2: Remove superfluous assignment Andreas Gruenbacher
2016-05-13 17:42   ` [Cluster-devel] " Andreas Gruenbacher
2016-05-13 17:42 ` [PATCH 4/7] GFS2: No need for non-blocking gfs2_ilookup in delete_work_func Andreas Gruenbacher
2016-05-13 17:42   ` [Cluster-devel] " Andreas Gruenbacher
2016-05-13 17:42 ` [PATCH 5/7] vfs: Introduce prepare_wait_on_freeing_inode Andreas Gruenbacher
2016-05-13 17:42   ` [Cluster-devel] " Andreas Gruenbacher
2016-05-13 17:42 ` [PATCH 6/7] GFS2: Use non-blocking wait in gfs2_iget Andreas Gruenbacher
2016-05-13 17:42   ` [Cluster-devel] " Andreas Gruenbacher
2016-05-13 17:42 ` [PATCH 7/7] GFS2: Prevent deadlock in gfs2_lookup_by_inum Andreas Gruenbacher
2016-05-13 17:42   ` [Cluster-devel] " Andreas Gruenbacher
2016-05-16 12:14 ` [PATCH 0/7] Allow lock dropping before waiting on inodes being freed Andreas Gruenbacher
2016-05-16 12:14   ` [Cluster-devel] " Andreas Gruenbacher
2016-05-31 16:45 ` Bob Peterson

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.