All of lore.kernel.org
 help / color / mirror / Atom feed
* [GFS2 PATCH 0/7] Fix glock deadlocks with deleted inodes
@ 2016-05-24 19:12 ` Bob Peterson
  0 siblings, 0 replies; 18+ messages in thread
From: Bob Peterson @ 2016-05-24 19:12 UTC (permalink / raw)
  To: cluster-devel, Alexander Viro, linux-fsdevel

I'm sending this to linux-fsdevel because one of the patches affects vfs.

GFS2 has a long-standing deadlock related to the transition of dinodes from
the Unlinked state to the Deleted state. The problem is that function
delete_work_func in glock.c called gfs2_lookup_by_inum, which locks the
inode glock before performing the rest of its checks, which means the lock
order is different from the normal lookups.

My original idea was to make delete_work_func call the normal lookup
function, and do away with the non-blocking lookups so they had the same
lock order. But we decided that was a bad idea. So the first two patches
revert the changes I had made earlier toward that end. The next five
patches implement a new design by Andreas Gruenbacher, which allows for
more flexibility in waiting for freeing inodes.

These missed transitions from unlinked to free also had another side-effect:
It often failed to acquire the lock and gave up, which meant that slowly
over time, the "df" value would get out of sync with "du." So it appeared
that the file system was more full than it was. (These unlinked dinodes
should, in theory, be reclaimed automatically, but they still threw the
numbers off, because the "unlinked" dinodes were not considered free space.)

This patch set has undergone rigorous testing that reliably recreated the
deadlock. Normally, the deadlock appeared in about an hour or two of
testing. With these patches, it's been running for weeks without a problem.

Signed-off-by: Bob Peterson <rpeterso@redhat.com>
---
Andreas Gruenbacher (5):
  GFS2: Remove superfluous assignment
  GFS2: No need for non-blocking gfs2_ilookup in delete_work_func
  vfs: Introduce prepare_wait_on_freeing_inode
  GFS2: Use non-blocking wait in gfs2_iget
  GFS2: Prevent deadlock in gfs2_lookup_by_inum

Bob Peterson (2):
  Revert "GFS2: Eliminate parameter non_block on gfs2_inode_lookup"
  Revert "GFS2: Don't filter out I_FREEING inodes anymore"

 fs/gfs2/dir.c        |   2 +-
 fs/gfs2/glock.c      |   4 +-
 fs/gfs2/inode.c      | 109 ++++++++++++++++++++++++++++++++++++++++++++++-----
 fs/gfs2/inode.h      |   8 +++-
 fs/gfs2/ops_fstype.c |   2 +-
 fs/inode.c           |  18 +++++++--
 include/linux/fs.h   |   1 +
 include/linux/wait.h |  21 +++++++---
 8 files changed, 142 insertions(+), 23 deletions(-)

-- 
2.5.5


^ permalink raw reply	[flat|nested] 18+ messages in thread

* [Cluster-devel] [GFS2 PATCH 0/7] Fix glock deadlocks with deleted inodes
@ 2016-05-24 19:12 ` Bob Peterson
  0 siblings, 0 replies; 18+ messages in thread
From: Bob Peterson @ 2016-05-24 19:12 UTC (permalink / raw)
  To: cluster-devel.redhat.com

I'm sending this to linux-fsdevel because one of the patches affects vfs.

GFS2 has a long-standing deadlock related to the transition of dinodes from
the Unlinked state to the Deleted state. The problem is that function
delete_work_func in glock.c called gfs2_lookup_by_inum, which locks the
inode glock before performing the rest of its checks, which means the lock
order is different from the normal lookups.

My original idea was to make delete_work_func call the normal lookup
function, and do away with the non-blocking lookups so they had the same
lock order. But we decided that was a bad idea. So the first two patches
revert the changes I had made earlier toward that end. The next five
patches implement a new design by Andreas Gruenbacher, which allows for
more flexibility in waiting for freeing inodes.

These missed transitions from unlinked to free also had another side-effect:
It often failed to acquire the lock and gave up, which meant that slowly
over time, the "df" value would get out of sync with "du." So it appeared
that the file system was more full than it was. (These unlinked dinodes
should, in theory, be reclaimed automatically, but they still threw the
numbers off, because the "unlinked" dinodes were not considered free space.)

This patch set has undergone rigorous testing that reliably recreated the
deadlock. Normally, the deadlock appeared in about an hour or two of
testing. With these patches, it's been running for weeks without a problem.

Signed-off-by: Bob Peterson <rpeterso@redhat.com>
---
Andreas Gruenbacher (5):
  GFS2: Remove superfluous assignment
  GFS2: No need for non-blocking gfs2_ilookup in delete_work_func
  vfs: Introduce prepare_wait_on_freeing_inode
  GFS2: Use non-blocking wait in gfs2_iget
  GFS2: Prevent deadlock in gfs2_lookup_by_inum

Bob Peterson (2):
  Revert "GFS2: Eliminate parameter non_block on gfs2_inode_lookup"
  Revert "GFS2: Don't filter out I_FREEING inodes anymore"

 fs/gfs2/dir.c        |   2 +-
 fs/gfs2/glock.c      |   4 +-
 fs/gfs2/inode.c      | 109 ++++++++++++++++++++++++++++++++++++++++++++++-----
 fs/gfs2/inode.h      |   8 +++-
 fs/gfs2/ops_fstype.c |   2 +-
 fs/inode.c           |  18 +++++++--
 include/linux/fs.h   |   1 +
 include/linux/wait.h |  21 +++++++---
 8 files changed, 142 insertions(+), 23 deletions(-)

-- 
2.5.5



^ permalink raw reply	[flat|nested] 18+ messages in thread

* [GFS2 PATCH 1/7] Revert "GFS2: Eliminate parameter non_block on gfs2_inode_lookup"
  2016-05-24 19:12 ` [Cluster-devel] " Bob Peterson
@ 2016-05-24 19:12   ` Bob Peterson
  -1 siblings, 0 replies; 18+ messages in thread
From: Bob Peterson @ 2016-05-24 19:12 UTC (permalink / raw)
  To: cluster-devel, Alexander Viro, linux-fsdevel

This reverts commit 73b462d2808d7cbca4d7886cf6aaed850640e6cd.

Signed-off-by: Bob Peterson <rpeterso@redhat.com>
---
 fs/gfs2/dir.c        | 2 +-
 fs/gfs2/inode.c      | 5 +++--
 fs/gfs2/inode.h      | 3 ++-
 fs/gfs2/ops_fstype.c | 2 +-
 4 files changed, 7 insertions(+), 5 deletions(-)

diff --git a/fs/gfs2/dir.c b/fs/gfs2/dir.c
index 4a01f30..d4014af 100644
--- a/fs/gfs2/dir.c
+++ b/fs/gfs2/dir.c
@@ -1660,7 +1660,7 @@ struct inode *gfs2_dir_search(struct inode *dir, const struct qstr *name,
 		brelse(bh);
 		if (fail_on_exist)
 			return ERR_PTR(-EEXIST);
-		inode = gfs2_inode_lookup(dir->i_sb, dtype, addr, formal_ino);
+		inode = gfs2_inode_lookup(dir->i_sb, dtype, addr, formal_ino, 0);
 		if (!IS_ERR(inode))
 			GFS2_I(inode)->i_rahead = rahead;
 		return inode;
diff --git a/fs/gfs2/inode.c b/fs/gfs2/inode.c
index 21dc784..f77976f 100644
--- a/fs/gfs2/inode.c
+++ b/fs/gfs2/inode.c
@@ -80,12 +80,13 @@ static void gfs2_set_iop(struct inode *inode)
  * @sb: The super block
  * @no_addr: The inode number
  * @type: The type of the inode
+ * non_block: Can we block on inodes that are being freed?
  *
  * Returns: A VFS inode, or an error
  */
 
 struct inode *gfs2_inode_lookup(struct super_block *sb, unsigned int type,
-				u64 no_addr, u64 no_formal_ino)
+				u64 no_addr, u64 no_formal_ino, int non_block)
 {
 	struct inode *inode;
 	struct gfs2_inode *ip;
@@ -169,7 +170,7 @@ struct inode *gfs2_lookup_by_inum(struct gfs2_sbd *sdp, u64 no_addr,
 	if (error)
 		goto fail;
 
-	inode = gfs2_inode_lookup(sb, DT_UNKNOWN, no_addr, 0);
+	inode = gfs2_inode_lookup(sb, DT_UNKNOWN, no_addr, 0, 1);
 	if (IS_ERR(inode))
 		goto fail;
 
diff --git a/fs/gfs2/inode.h b/fs/gfs2/inode.h
index e1af0d4..22c27a8 100644
--- a/fs/gfs2/inode.h
+++ b/fs/gfs2/inode.h
@@ -94,7 +94,8 @@ err:
 }
 
 extern struct inode *gfs2_inode_lookup(struct super_block *sb, unsigned type, 
-				       u64 no_addr, u64 no_formal_ino);
+				       u64 no_addr, u64 no_formal_ino,
+				       int non_block);
 extern struct inode *gfs2_lookup_by_inum(struct gfs2_sbd *sdp, u64 no_addr,
 					 u64 *no_formal_ino,
 					 unsigned int blktype);
diff --git a/fs/gfs2/ops_fstype.c b/fs/gfs2/ops_fstype.c
index 4546360..89472c4 100644
--- a/fs/gfs2/ops_fstype.c
+++ b/fs/gfs2/ops_fstype.c
@@ -454,7 +454,7 @@ static int gfs2_lookup_root(struct super_block *sb, struct dentry **dptr,
 	struct dentry *dentry;
 	struct inode *inode;
 
-	inode = gfs2_inode_lookup(sb, DT_DIR, no_addr, 0);
+	inode = gfs2_inode_lookup(sb, DT_DIR, no_addr, 0, 0);
 	if (IS_ERR(inode)) {
 		fs_err(sdp, "can't read in %s inode: %ld\n", name, PTR_ERR(inode));
 		return PTR_ERR(inode);
-- 
2.5.5


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [Cluster-devel] [GFS2 PATCH 1/7] Revert "GFS2: Eliminate parameter non_block on gfs2_inode_lookup"
@ 2016-05-24 19:12   ` Bob Peterson
  0 siblings, 0 replies; 18+ messages in thread
From: Bob Peterson @ 2016-05-24 19:12 UTC (permalink / raw)
  To: cluster-devel.redhat.com

This reverts commit 73b462d2808d7cbca4d7886cf6aaed850640e6cd.

Signed-off-by: Bob Peterson <rpeterso@redhat.com>
---
 fs/gfs2/dir.c        | 2 +-
 fs/gfs2/inode.c      | 5 +++--
 fs/gfs2/inode.h      | 3 ++-
 fs/gfs2/ops_fstype.c | 2 +-
 4 files changed, 7 insertions(+), 5 deletions(-)

diff --git a/fs/gfs2/dir.c b/fs/gfs2/dir.c
index 4a01f30..d4014af 100644
--- a/fs/gfs2/dir.c
+++ b/fs/gfs2/dir.c
@@ -1660,7 +1660,7 @@ struct inode *gfs2_dir_search(struct inode *dir, const struct qstr *name,
 		brelse(bh);
 		if (fail_on_exist)
 			return ERR_PTR(-EEXIST);
-		inode = gfs2_inode_lookup(dir->i_sb, dtype, addr, formal_ino);
+		inode = gfs2_inode_lookup(dir->i_sb, dtype, addr, formal_ino, 0);
 		if (!IS_ERR(inode))
 			GFS2_I(inode)->i_rahead = rahead;
 		return inode;
diff --git a/fs/gfs2/inode.c b/fs/gfs2/inode.c
index 21dc784..f77976f 100644
--- a/fs/gfs2/inode.c
+++ b/fs/gfs2/inode.c
@@ -80,12 +80,13 @@ static void gfs2_set_iop(struct inode *inode)
  * @sb: The super block
  * @no_addr: The inode number
  * @type: The type of the inode
+ * non_block: Can we block on inodes that are being freed?
  *
  * Returns: A VFS inode, or an error
  */
 
 struct inode *gfs2_inode_lookup(struct super_block *sb, unsigned int type,
-				u64 no_addr, u64 no_formal_ino)
+				u64 no_addr, u64 no_formal_ino, int non_block)
 {
 	struct inode *inode;
 	struct gfs2_inode *ip;
@@ -169,7 +170,7 @@ struct inode *gfs2_lookup_by_inum(struct gfs2_sbd *sdp, u64 no_addr,
 	if (error)
 		goto fail;
 
-	inode = gfs2_inode_lookup(sb, DT_UNKNOWN, no_addr, 0);
+	inode = gfs2_inode_lookup(sb, DT_UNKNOWN, no_addr, 0, 1);
 	if (IS_ERR(inode))
 		goto fail;
 
diff --git a/fs/gfs2/inode.h b/fs/gfs2/inode.h
index e1af0d4..22c27a8 100644
--- a/fs/gfs2/inode.h
+++ b/fs/gfs2/inode.h
@@ -94,7 +94,8 @@ err:
 }
 
 extern struct inode *gfs2_inode_lookup(struct super_block *sb, unsigned type, 
-				       u64 no_addr, u64 no_formal_ino);
+				       u64 no_addr, u64 no_formal_ino,
+				       int non_block);
 extern struct inode *gfs2_lookup_by_inum(struct gfs2_sbd *sdp, u64 no_addr,
 					 u64 *no_formal_ino,
 					 unsigned int blktype);
diff --git a/fs/gfs2/ops_fstype.c b/fs/gfs2/ops_fstype.c
index 4546360..89472c4 100644
--- a/fs/gfs2/ops_fstype.c
+++ b/fs/gfs2/ops_fstype.c
@@ -454,7 +454,7 @@ static int gfs2_lookup_root(struct super_block *sb, struct dentry **dptr,
 	struct dentry *dentry;
 	struct inode *inode;
 
-	inode = gfs2_inode_lookup(sb, DT_DIR, no_addr, 0);
+	inode = gfs2_inode_lookup(sb, DT_DIR, no_addr, 0, 0);
 	if (IS_ERR(inode)) {
 		fs_err(sdp, "can't read in %s inode: %ld\n", name, PTR_ERR(inode));
 		return PTR_ERR(inode);
-- 
2.5.5



^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [GFS2 PATCH 2/7] Revert "GFS2: Don't filter out I_FREEING inodes anymore"
  2016-05-24 19:12 ` [Cluster-devel] " Bob Peterson
@ 2016-05-24 19:12   ` Bob Peterson
  -1 siblings, 0 replies; 18+ messages in thread
From: Bob Peterson @ 2016-05-24 19:12 UTC (permalink / raw)
  To: cluster-devel, Alexander Viro, linux-fsdevel

This reverts commit ff34245d524a898eee6e013eb1ec165095277148.

Signed-off-by: Bob Peterson <rpeterso@redhat.com>
---
 fs/gfs2/export.c |  2 +-
 fs/gfs2/glock.c  |  2 +-
 fs/gfs2/inode.c  | 58 +++++++++++++++++++++++++++++++++++++++++++++++++++++---
 fs/gfs2/inode.h  |  2 +-
 4 files changed, 58 insertions(+), 6 deletions(-)

diff --git a/fs/gfs2/export.c b/fs/gfs2/export.c
index d5bda85..5d15e94 100644
--- a/fs/gfs2/export.c
+++ b/fs/gfs2/export.c
@@ -137,7 +137,7 @@ static struct dentry *gfs2_get_dentry(struct super_block *sb,
 	struct gfs2_sbd *sdp = sb->s_fs_info;
 	struct inode *inode;
 
-	inode = gfs2_ilookup(sb, inum->no_addr);
+	inode = gfs2_ilookup(sb, inum->no_addr, 0);
 	if (inode) {
 		if (GFS2_I(inode)->i_no_formal_ino != inum->no_formal_ino) {
 			iput(inode);
diff --git a/fs/gfs2/glock.c b/fs/gfs2/glock.c
index 706fd93..e648b07 100644
--- a/fs/gfs2/glock.c
+++ b/fs/gfs2/glock.c
@@ -589,7 +589,7 @@ static void delete_work_func(struct work_struct *work)
 	/* Note: Unsafe to dereference ip as we don't hold right refs/locks */
 
 	if (ip)
-		inode = gfs2_ilookup(sdp->sd_vfs, no_addr);
+		inode = gfs2_ilookup(sdp->sd_vfs, no_addr, 1);
 	else
 		inode = gfs2_lookup_by_inum(sdp, no_addr, NULL, GFS2_BLKST_UNLINKED);
 	if (inode && !IS_ERR(inode)) {
diff --git a/fs/gfs2/inode.c b/fs/gfs2/inode.c
index f77976f..30f70b3 100644
--- a/fs/gfs2/inode.c
+++ b/fs/gfs2/inode.c
@@ -37,9 +37,61 @@
 #include "super.h"
 #include "glops.h"
 
-struct inode *gfs2_ilookup(struct super_block *sb, u64 no_addr)
+struct gfs2_skip_data {
+	u64 no_addr;
+	int skipped;
+	int non_block;
+};
+
+static int iget_test(struct inode *inode, void *opaque)
+{
+	struct gfs2_inode *ip = GFS2_I(inode);
+	struct gfs2_skip_data *data = opaque;
+
+	if (ip->i_no_addr == data->no_addr) {
+		if (data->non_block &&
+		    inode->i_state & (I_FREEING|I_CLEAR|I_WILL_FREE)) {
+			data->skipped = 1;
+			return 0;
+		}
+		return 1;
+	}
+	return 0;
+}
+
+static int iget_set(struct inode *inode, void *opaque)
 {
-	return ilookup(sb, (unsigned long)no_addr);
+	struct gfs2_inode *ip = GFS2_I(inode);
+	struct gfs2_skip_data *data = opaque;
+
+	if (data->skipped)
+		return -ENOENT;
+	inode->i_ino = (unsigned long)(data->no_addr);
+	ip->i_no_addr = data->no_addr;
+	return 0;
+}
+
+struct inode *gfs2_ilookup(struct super_block *sb, u64 no_addr, int non_block)
+{
+	unsigned long hash = (unsigned long)no_addr;
+	struct gfs2_skip_data data;
+
+	data.no_addr = no_addr;
+	data.skipped = 0;
+	data.non_block = non_block;
+	return ilookup5(sb, hash, iget_test, &data);
+}
+
+static struct inode *gfs2_iget(struct super_block *sb, u64 no_addr,
+			       int non_block)
+{
+	struct gfs2_skip_data data;
+	unsigned long hash = (unsigned long)no_addr;
+
+	data.no_addr = no_addr;
+	data.skipped = 0;
+	data.non_block = non_block;
+	return iget5_locked(sb, hash, iget_test, iget_set, &data);
 }
 
 /**
@@ -93,7 +145,7 @@ struct inode *gfs2_inode_lookup(struct super_block *sb, unsigned int type,
 	struct gfs2_glock *io_gl = NULL;
 	int error;
 
-	inode = iget_locked(sb, (unsigned long)no_addr);
+	inode = gfs2_iget(sb, no_addr, non_block);
 	if (!inode)
 		return ERR_PTR(-ENOMEM);
 
diff --git a/fs/gfs2/inode.h b/fs/gfs2/inode.h
index 22c27a8..ba4d949 100644
--- a/fs/gfs2/inode.h
+++ b/fs/gfs2/inode.h
@@ -99,7 +99,7 @@ extern struct inode *gfs2_inode_lookup(struct super_block *sb, unsigned type,
 extern struct inode *gfs2_lookup_by_inum(struct gfs2_sbd *sdp, u64 no_addr,
 					 u64 *no_formal_ino,
 					 unsigned int blktype);
-extern struct inode *gfs2_ilookup(struct super_block *sb, u64 no_addr);
+extern struct inode *gfs2_ilookup(struct super_block *sb, u64 no_addr, int nonblock);
 
 extern int gfs2_inode_refresh(struct gfs2_inode *ip);
 
-- 
2.5.5


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [Cluster-devel] [GFS2 PATCH 2/7] Revert "GFS2: Don't filter out I_FREEING inodes anymore"
@ 2016-05-24 19:12   ` Bob Peterson
  0 siblings, 0 replies; 18+ messages in thread
From: Bob Peterson @ 2016-05-24 19:12 UTC (permalink / raw)
  To: cluster-devel.redhat.com

This reverts commit ff34245d524a898eee6e013eb1ec165095277148.

Signed-off-by: Bob Peterson <rpeterso@redhat.com>
---
 fs/gfs2/export.c |  2 +-
 fs/gfs2/glock.c  |  2 +-
 fs/gfs2/inode.c  | 58 +++++++++++++++++++++++++++++++++++++++++++++++++++++---
 fs/gfs2/inode.h  |  2 +-
 4 files changed, 58 insertions(+), 6 deletions(-)

diff --git a/fs/gfs2/export.c b/fs/gfs2/export.c
index d5bda85..5d15e94 100644
--- a/fs/gfs2/export.c
+++ b/fs/gfs2/export.c
@@ -137,7 +137,7 @@ static struct dentry *gfs2_get_dentry(struct super_block *sb,
 	struct gfs2_sbd *sdp = sb->s_fs_info;
 	struct inode *inode;
 
-	inode = gfs2_ilookup(sb, inum->no_addr);
+	inode = gfs2_ilookup(sb, inum->no_addr, 0);
 	if (inode) {
 		if (GFS2_I(inode)->i_no_formal_ino != inum->no_formal_ino) {
 			iput(inode);
diff --git a/fs/gfs2/glock.c b/fs/gfs2/glock.c
index 706fd93..e648b07 100644
--- a/fs/gfs2/glock.c
+++ b/fs/gfs2/glock.c
@@ -589,7 +589,7 @@ static void delete_work_func(struct work_struct *work)
 	/* Note: Unsafe to dereference ip as we don't hold right refs/locks */
 
 	if (ip)
-		inode = gfs2_ilookup(sdp->sd_vfs, no_addr);
+		inode = gfs2_ilookup(sdp->sd_vfs, no_addr, 1);
 	else
 		inode = gfs2_lookup_by_inum(sdp, no_addr, NULL, GFS2_BLKST_UNLINKED);
 	if (inode && !IS_ERR(inode)) {
diff --git a/fs/gfs2/inode.c b/fs/gfs2/inode.c
index f77976f..30f70b3 100644
--- a/fs/gfs2/inode.c
+++ b/fs/gfs2/inode.c
@@ -37,9 +37,61 @@
 #include "super.h"
 #include "glops.h"
 
-struct inode *gfs2_ilookup(struct super_block *sb, u64 no_addr)
+struct gfs2_skip_data {
+	u64 no_addr;
+	int skipped;
+	int non_block;
+};
+
+static int iget_test(struct inode *inode, void *opaque)
+{
+	struct gfs2_inode *ip = GFS2_I(inode);
+	struct gfs2_skip_data *data = opaque;
+
+	if (ip->i_no_addr == data->no_addr) {
+		if (data->non_block &&
+		    inode->i_state & (I_FREEING|I_CLEAR|I_WILL_FREE)) {
+			data->skipped = 1;
+			return 0;
+		}
+		return 1;
+	}
+	return 0;
+}
+
+static int iget_set(struct inode *inode, void *opaque)
 {
-	return ilookup(sb, (unsigned long)no_addr);
+	struct gfs2_inode *ip = GFS2_I(inode);
+	struct gfs2_skip_data *data = opaque;
+
+	if (data->skipped)
+		return -ENOENT;
+	inode->i_ino = (unsigned long)(data->no_addr);
+	ip->i_no_addr = data->no_addr;
+	return 0;
+}
+
+struct inode *gfs2_ilookup(struct super_block *sb, u64 no_addr, int non_block)
+{
+	unsigned long hash = (unsigned long)no_addr;
+	struct gfs2_skip_data data;
+
+	data.no_addr = no_addr;
+	data.skipped = 0;
+	data.non_block = non_block;
+	return ilookup5(sb, hash, iget_test, &data);
+}
+
+static struct inode *gfs2_iget(struct super_block *sb, u64 no_addr,
+			       int non_block)
+{
+	struct gfs2_skip_data data;
+	unsigned long hash = (unsigned long)no_addr;
+
+	data.no_addr = no_addr;
+	data.skipped = 0;
+	data.non_block = non_block;
+	return iget5_locked(sb, hash, iget_test, iget_set, &data);
 }
 
 /**
@@ -93,7 +145,7 @@ struct inode *gfs2_inode_lookup(struct super_block *sb, unsigned int type,
 	struct gfs2_glock *io_gl = NULL;
 	int error;
 
-	inode = iget_locked(sb, (unsigned long)no_addr);
+	inode = gfs2_iget(sb, no_addr, non_block);
 	if (!inode)
 		return ERR_PTR(-ENOMEM);
 
diff --git a/fs/gfs2/inode.h b/fs/gfs2/inode.h
index 22c27a8..ba4d949 100644
--- a/fs/gfs2/inode.h
+++ b/fs/gfs2/inode.h
@@ -99,7 +99,7 @@ extern struct inode *gfs2_inode_lookup(struct super_block *sb, unsigned type,
 extern struct inode *gfs2_lookup_by_inum(struct gfs2_sbd *sdp, u64 no_addr,
 					 u64 *no_formal_ino,
 					 unsigned int blktype);
-extern struct inode *gfs2_ilookup(struct super_block *sb, u64 no_addr);
+extern struct inode *gfs2_ilookup(struct super_block *sb, u64 no_addr, int nonblock);
 
 extern int gfs2_inode_refresh(struct gfs2_inode *ip);
 
-- 
2.5.5



^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [GFS2 PATCH 3/7] GFS2: Remove superfluous assignment
  2016-05-24 19:12 ` [Cluster-devel] " Bob Peterson
@ 2016-05-24 19:12   ` Bob Peterson
  -1 siblings, 0 replies; 18+ messages in thread
From: Bob Peterson @ 2016-05-24 19:12 UTC (permalink / raw)
  To: cluster-devel, Alexander Viro, linux-fsdevel

From: Andreas Gruenbacher <agruenba@redhat.com>

This patch removes an unneeded assignment in gfs2_inode_lookup:
gfs2_iget already initializes GFS2_I(inode)->i_no_addr.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
---
 fs/gfs2/inode.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/fs/gfs2/inode.c b/fs/gfs2/inode.c
index 30f70b3..a752110 100644
--- a/fs/gfs2/inode.c
+++ b/fs/gfs2/inode.c
@@ -148,9 +148,7 @@ struct inode *gfs2_inode_lookup(struct super_block *sb, unsigned int type,
 	inode = gfs2_iget(sb, no_addr, non_block);
 	if (!inode)
 		return ERR_PTR(-ENOMEM);
-
 	ip = GFS2_I(inode);
-	ip->i_no_addr = no_addr;
 
 	if (inode->i_state & I_NEW) {
 		struct gfs2_sbd *sdp = GFS2_SB(inode);
-- 
2.5.5


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [Cluster-devel] [GFS2 PATCH 3/7] GFS2: Remove superfluous assignment
@ 2016-05-24 19:12   ` Bob Peterson
  0 siblings, 0 replies; 18+ messages in thread
From: Bob Peterson @ 2016-05-24 19:12 UTC (permalink / raw)
  To: cluster-devel.redhat.com

From: Andreas Gruenbacher <agruenba@redhat.com>

This patch removes an unneeded assignment in gfs2_inode_lookup:
gfs2_iget already initializes GFS2_I(inode)->i_no_addr.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
---
 fs/gfs2/inode.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/fs/gfs2/inode.c b/fs/gfs2/inode.c
index 30f70b3..a752110 100644
--- a/fs/gfs2/inode.c
+++ b/fs/gfs2/inode.c
@@ -148,9 +148,7 @@ struct inode *gfs2_inode_lookup(struct super_block *sb, unsigned int type,
 	inode = gfs2_iget(sb, no_addr, non_block);
 	if (!inode)
 		return ERR_PTR(-ENOMEM);
-
 	ip = GFS2_I(inode);
-	ip->i_no_addr = no_addr;
 
 	if (inode->i_state & I_NEW) {
 		struct gfs2_sbd *sdp = GFS2_SB(inode);
-- 
2.5.5



^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [GFS2 PATCH 4/7] GFS2: No need for non-blocking gfs2_ilookup in delete_work_func
  2016-05-24 19:12 ` [Cluster-devel] " Bob Peterson
@ 2016-05-24 19:12   ` Bob Peterson
  -1 siblings, 0 replies; 18+ messages in thread
From: Bob Peterson @ 2016-05-24 19:12 UTC (permalink / raw)
  To: cluster-devel, Alexander Viro, linux-fsdevel

From: Andreas Gruenbacher <agruenba@redhat.com>

In delete_work_func, we are guaranteed not to be holding the glock of
the inode we are looking up.  If the inode we are looking for is being
freed, the lookup is free to block until the inode is gone.  (Freeing
the inode takes the glock, so we must not already be holding it.)

With that, no more non-blocking callers of gfs2_ilookup are left, and
the non-block parameter of gfs2_ilookup can be removed.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
---
 fs/gfs2/export.c | 2 +-
 fs/gfs2/glock.c  | 6 +++---
 fs/gfs2/inode.c  | 4 ++--
 fs/gfs2/inode.h  | 2 +-
 4 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/fs/gfs2/export.c b/fs/gfs2/export.c
index 5d15e94..d5bda85 100644
--- a/fs/gfs2/export.c
+++ b/fs/gfs2/export.c
@@ -137,7 +137,7 @@ static struct dentry *gfs2_get_dentry(struct super_block *sb,
 	struct gfs2_sbd *sdp = sb->s_fs_info;
 	struct inode *inode;
 
-	inode = gfs2_ilookup(sb, inum->no_addr, 0);
+	inode = gfs2_ilookup(sb, inum->no_addr);
 	if (inode) {
 		if (GFS2_I(inode)->i_no_formal_ino != inum->no_formal_ino) {
 			iput(inode);
diff --git a/fs/gfs2/glock.c b/fs/gfs2/glock.c
index e648b07..ce46375 100644
--- a/fs/gfs2/glock.c
+++ b/fs/gfs2/glock.c
@@ -576,7 +576,7 @@ static void delete_work_func(struct work_struct *work)
 	struct gfs2_glock *gl = container_of(work, struct gfs2_glock, gl_delete);
 	struct gfs2_sbd *sdp = gl->gl_name.ln_sbd;
 	struct gfs2_inode *ip;
-	struct inode *inode;
+	struct inode *inode = NULL;
 	u64 no_addr = gl->gl_name.ln_number;
 
 	/* If someone's using this glock to create a new dinode, the block must
@@ -589,8 +589,8 @@ static void delete_work_func(struct work_struct *work)
 	/* Note: Unsafe to dereference ip as we don't hold right refs/locks */
 
 	if (ip)
-		inode = gfs2_ilookup(sdp->sd_vfs, no_addr, 1);
-	else
+		inode = gfs2_ilookup(sdp->sd_vfs, no_addr);
+	if (IS_ERR_OR_NULL(inode))
 		inode = gfs2_lookup_by_inum(sdp, no_addr, NULL, GFS2_BLKST_UNLINKED);
 	if (inode && !IS_ERR(inode)) {
 		d_prune_aliases(inode);
diff --git a/fs/gfs2/inode.c b/fs/gfs2/inode.c
index a752110..5dff5da 100644
--- a/fs/gfs2/inode.c
+++ b/fs/gfs2/inode.c
@@ -71,14 +71,14 @@ static int iget_set(struct inode *inode, void *opaque)
 	return 0;
 }
 
-struct inode *gfs2_ilookup(struct super_block *sb, u64 no_addr, int non_block)
+struct inode *gfs2_ilookup(struct super_block *sb, u64 no_addr)
 {
 	unsigned long hash = (unsigned long)no_addr;
 	struct gfs2_skip_data data;
 
 	data.no_addr = no_addr;
 	data.skipped = 0;
-	data.non_block = non_block;
+	data.non_block = 0;
 	return ilookup5(sb, hash, iget_test, &data);
 }
 
diff --git a/fs/gfs2/inode.h b/fs/gfs2/inode.h
index ba4d949..22c27a8 100644
--- a/fs/gfs2/inode.h
+++ b/fs/gfs2/inode.h
@@ -99,7 +99,7 @@ extern struct inode *gfs2_inode_lookup(struct super_block *sb, unsigned type,
 extern struct inode *gfs2_lookup_by_inum(struct gfs2_sbd *sdp, u64 no_addr,
 					 u64 *no_formal_ino,
 					 unsigned int blktype);
-extern struct inode *gfs2_ilookup(struct super_block *sb, u64 no_addr, int nonblock);
+extern struct inode *gfs2_ilookup(struct super_block *sb, u64 no_addr);
 
 extern int gfs2_inode_refresh(struct gfs2_inode *ip);
 
-- 
2.5.5


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [Cluster-devel] [GFS2 PATCH 4/7] GFS2: No need for non-blocking gfs2_ilookup in delete_work_func
@ 2016-05-24 19:12   ` Bob Peterson
  0 siblings, 0 replies; 18+ messages in thread
From: Bob Peterson @ 2016-05-24 19:12 UTC (permalink / raw)
  To: cluster-devel.redhat.com

From: Andreas Gruenbacher <agruenba@redhat.com>

In delete_work_func, we are guaranteed not to be holding the glock of
the inode we are looking up.  If the inode we are looking for is being
freed, the lookup is free to block until the inode is gone.  (Freeing
the inode takes the glock, so we must not already be holding it.)

With that, no more non-blocking callers of gfs2_ilookup are left, and
the non-block parameter of gfs2_ilookup can be removed.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
---
 fs/gfs2/export.c | 2 +-
 fs/gfs2/glock.c  | 6 +++---
 fs/gfs2/inode.c  | 4 ++--
 fs/gfs2/inode.h  | 2 +-
 4 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/fs/gfs2/export.c b/fs/gfs2/export.c
index 5d15e94..d5bda85 100644
--- a/fs/gfs2/export.c
+++ b/fs/gfs2/export.c
@@ -137,7 +137,7 @@ static struct dentry *gfs2_get_dentry(struct super_block *sb,
 	struct gfs2_sbd *sdp = sb->s_fs_info;
 	struct inode *inode;
 
-	inode = gfs2_ilookup(sb, inum->no_addr, 0);
+	inode = gfs2_ilookup(sb, inum->no_addr);
 	if (inode) {
 		if (GFS2_I(inode)->i_no_formal_ino != inum->no_formal_ino) {
 			iput(inode);
diff --git a/fs/gfs2/glock.c b/fs/gfs2/glock.c
index e648b07..ce46375 100644
--- a/fs/gfs2/glock.c
+++ b/fs/gfs2/glock.c
@@ -576,7 +576,7 @@ static void delete_work_func(struct work_struct *work)
 	struct gfs2_glock *gl = container_of(work, struct gfs2_glock, gl_delete);
 	struct gfs2_sbd *sdp = gl->gl_name.ln_sbd;
 	struct gfs2_inode *ip;
-	struct inode *inode;
+	struct inode *inode = NULL;
 	u64 no_addr = gl->gl_name.ln_number;
 
 	/* If someone's using this glock to create a new dinode, the block must
@@ -589,8 +589,8 @@ static void delete_work_func(struct work_struct *work)
 	/* Note: Unsafe to dereference ip as we don't hold right refs/locks */
 
 	if (ip)
-		inode = gfs2_ilookup(sdp->sd_vfs, no_addr, 1);
-	else
+		inode = gfs2_ilookup(sdp->sd_vfs, no_addr);
+	if (IS_ERR_OR_NULL(inode))
 		inode = gfs2_lookup_by_inum(sdp, no_addr, NULL, GFS2_BLKST_UNLINKED);
 	if (inode && !IS_ERR(inode)) {
 		d_prune_aliases(inode);
diff --git a/fs/gfs2/inode.c b/fs/gfs2/inode.c
index a752110..5dff5da 100644
--- a/fs/gfs2/inode.c
+++ b/fs/gfs2/inode.c
@@ -71,14 +71,14 @@ static int iget_set(struct inode *inode, void *opaque)
 	return 0;
 }
 
-struct inode *gfs2_ilookup(struct super_block *sb, u64 no_addr, int non_block)
+struct inode *gfs2_ilookup(struct super_block *sb, u64 no_addr)
 {
 	unsigned long hash = (unsigned long)no_addr;
 	struct gfs2_skip_data data;
 
 	data.no_addr = no_addr;
 	data.skipped = 0;
-	data.non_block = non_block;
+	data.non_block = 0;
 	return ilookup5(sb, hash, iget_test, &data);
 }
 
diff --git a/fs/gfs2/inode.h b/fs/gfs2/inode.h
index ba4d949..22c27a8 100644
--- a/fs/gfs2/inode.h
+++ b/fs/gfs2/inode.h
@@ -99,7 +99,7 @@ extern struct inode *gfs2_inode_lookup(struct super_block *sb, unsigned type,
 extern struct inode *gfs2_lookup_by_inum(struct gfs2_sbd *sdp, u64 no_addr,
 					 u64 *no_formal_ino,
 					 unsigned int blktype);
-extern struct inode *gfs2_ilookup(struct super_block *sb, u64 no_addr, int nonblock);
+extern struct inode *gfs2_ilookup(struct super_block *sb, u64 no_addr);
 
 extern int gfs2_inode_refresh(struct gfs2_inode *ip);
 
-- 
2.5.5



^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [GFS2 PATCH 5/7] vfs: Introduce prepare_wait_on_freeing_inode
  2016-05-24 19:12 ` [Cluster-devel] " Bob Peterson
@ 2016-05-24 19:12   ` Bob Peterson
  -1 siblings, 0 replies; 18+ messages in thread
From: Bob Peterson @ 2016-05-24 19:12 UTC (permalink / raw)
  To: cluster-devel, Alexander Viro, linux-fsdevel

From: Andreas Gruenbacher <agruenba@redhat.com>

Add function prepare_wait_on_freeing_inode: during an inode lookup,
freeing an inode can require taking a lock that the filesystem already
holds.  The lock must be dropped before we can wait for the inode to go
away; waiting inside find_inode / find_inode_fast would deadlock.

In that case, filesystems can use find_inode_nowait.  When an inode that
is being freed is found, they can prepare to wait for the inode to go
away inside find_inode_nowait, then drop the conflicting lock and wait
outside of find_inode_nowait.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
---
 fs/inode.c           | 17 ++++++++++++++---
 include/linux/fs.h   |  1 +
 include/linux/wait.h | 21 +++++++++++++++------
 3 files changed, 30 insertions(+), 9 deletions(-)

diff --git a/fs/inode.c b/fs/inode.c
index 4ccbc21..e00fec4 100644
--- a/fs/inode.c
+++ b/fs/inode.c
@@ -1819,6 +1819,17 @@ int inode_needs_sync(struct inode *inode)
 }
 EXPORT_SYMBOL(inode_needs_sync);
 
+wait_queue_head_t *prepare_wait_on_freeing_inode(
+		struct inode *inode, struct wait_bit_queue *wait)
+{
+	wait_queue_head_t *wq = bit_waitqueue(&inode->i_state, __I_NEW);
+
+	init_wait_bit(wait, &inode->i_state, __I_NEW);
+	prepare_to_wait(wq, &wait->wait, TASK_UNINTERRUPTIBLE);
+	return wq;
+}
+EXPORT_SYMBOL(prepare_wait_on_freeing_inode);
+
 /*
  * If we try to find an inode in the inode hash while it is being
  * deleted, we have to wait until the filesystem completes its
@@ -1832,10 +1843,10 @@ EXPORT_SYMBOL(inode_needs_sync);
  */
 static void __wait_on_freeing_inode(struct inode *inode)
 {
+	struct wait_bit_queue wait;
 	wait_queue_head_t *wq;
-	DEFINE_WAIT_BIT(wait, &inode->i_state, __I_NEW);
-	wq = bit_waitqueue(&inode->i_state, __I_NEW);
-	prepare_to_wait(wq, &wait.wait, TASK_UNINTERRUPTIBLE);
+
+	wq = prepare_wait_on_freeing_inode(inode, &wait);
 	spin_unlock(&inode->i_lock);
 	spin_unlock(&inode_hash_lock);
 	schedule();
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 5f61431..fe129a7 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -2686,6 +2686,7 @@ extern void address_space_init_once(struct address_space *mapping);
 extern struct inode * igrab(struct inode *);
 extern ino_t iunique(struct super_block *, ino_t);
 extern int inode_needs_sync(struct inode *inode);
+extern wait_queue_head_t *prepare_wait_on_freeing_inode(struct inode *inode, struct wait_bit_queue *wait);
 extern int generic_delete_inode(struct inode *inode);
 static inline int generic_drop_inode(struct inode *inode)
 {
diff --git a/include/linux/wait.h b/include/linux/wait.h
index 27d7a0a..45ef1d2 100644
--- a/include/linux/wait.h
+++ b/include/linux/wait.h
@@ -991,6 +991,14 @@ int wake_bit_function(wait_queue_t *wait, unsigned mode, int sync, void *key);
 
 #define DEFINE_WAIT(name) DEFINE_WAIT_FUNC(name, autoremove_wake_function)
 
+#define init_wait(wait)							\
+	do {								\
+		(wait)->private = current;				\
+		(wait)->func = autoremove_wake_function;		\
+		INIT_LIST_HEAD(&(wait)->task_list);			\
+		(wait)->flags = 0;					\
+	} while (0)
+
 #define DEFINE_WAIT_BIT(name, word, bit)				\
 	struct wait_bit_queue name = {					\
 		.key = __WAIT_BIT_KEY_INITIALIZER(word, bit),		\
@@ -1002,15 +1010,16 @@ int wake_bit_function(wait_queue_t *wait, unsigned mode, int sync, void *key);
 		},							\
 	}
 
-#define init_wait(wait)							\
+#define init_wait_bit(wbit, word, bit)					\
 	do {								\
-		(wait)->private = current;				\
-		(wait)->func = autoremove_wake_function;		\
-		INIT_LIST_HEAD(&(wait)->task_list);			\
-		(wait)->flags = 0;					\
+		(wbit)->key.flags = word;				\
+		(wbit)->key.bit_nr = bit;				\
+		(wbit)->wait.private = current;				\
+		(wbit)->wait.func = wake_bit_function;			\
+		INIT_LIST_HEAD(&(wbit)->wait.task_list);		\
+		(wbit)->wait.flags = 0;					\
 	} while (0)
 
-
 extern int bit_wait(struct wait_bit_key *, int);
 extern int bit_wait_io(struct wait_bit_key *, int);
 extern int bit_wait_timeout(struct wait_bit_key *, int);
-- 
2.5.5


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [Cluster-devel] [GFS2 PATCH 5/7] vfs: Introduce prepare_wait_on_freeing_inode
@ 2016-05-24 19:12   ` Bob Peterson
  0 siblings, 0 replies; 18+ messages in thread
From: Bob Peterson @ 2016-05-24 19:12 UTC (permalink / raw)
  To: cluster-devel.redhat.com

From: Andreas Gruenbacher <agruenba@redhat.com>

Add function prepare_wait_on_freeing_inode: during an inode lookup,
freeing an inode can require taking a lock that the filesystem already
holds.  The lock must be dropped before we can wait for the inode to go
away; waiting inside find_inode / find_inode_fast would deadlock.

In that case, filesystems can use find_inode_nowait.  When an inode that
is being freed is found, they can prepare to wait for the inode to go
away inside find_inode_nowait, then drop the conflicting lock and wait
outside of find_inode_nowait.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
---
 fs/inode.c           | 17 ++++++++++++++---
 include/linux/fs.h   |  1 +
 include/linux/wait.h | 21 +++++++++++++++------
 3 files changed, 30 insertions(+), 9 deletions(-)

diff --git a/fs/inode.c b/fs/inode.c
index 4ccbc21..e00fec4 100644
--- a/fs/inode.c
+++ b/fs/inode.c
@@ -1819,6 +1819,17 @@ int inode_needs_sync(struct inode *inode)
 }
 EXPORT_SYMBOL(inode_needs_sync);
 
+wait_queue_head_t *prepare_wait_on_freeing_inode(
+		struct inode *inode, struct wait_bit_queue *wait)
+{
+	wait_queue_head_t *wq = bit_waitqueue(&inode->i_state, __I_NEW);
+
+	init_wait_bit(wait, &inode->i_state, __I_NEW);
+	prepare_to_wait(wq, &wait->wait, TASK_UNINTERRUPTIBLE);
+	return wq;
+}
+EXPORT_SYMBOL(prepare_wait_on_freeing_inode);
+
 /*
  * If we try to find an inode in the inode hash while it is being
  * deleted, we have to wait until the filesystem completes its
@@ -1832,10 +1843,10 @@ EXPORT_SYMBOL(inode_needs_sync);
  */
 static void __wait_on_freeing_inode(struct inode *inode)
 {
+	struct wait_bit_queue wait;
 	wait_queue_head_t *wq;
-	DEFINE_WAIT_BIT(wait, &inode->i_state, __I_NEW);
-	wq = bit_waitqueue(&inode->i_state, __I_NEW);
-	prepare_to_wait(wq, &wait.wait, TASK_UNINTERRUPTIBLE);
+
+	wq = prepare_wait_on_freeing_inode(inode, &wait);
 	spin_unlock(&inode->i_lock);
 	spin_unlock(&inode_hash_lock);
 	schedule();
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 5f61431..fe129a7 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -2686,6 +2686,7 @@ extern void address_space_init_once(struct address_space *mapping);
 extern struct inode * igrab(struct inode *);
 extern ino_t iunique(struct super_block *, ino_t);
 extern int inode_needs_sync(struct inode *inode);
+extern wait_queue_head_t *prepare_wait_on_freeing_inode(struct inode *inode, struct wait_bit_queue *wait);
 extern int generic_delete_inode(struct inode *inode);
 static inline int generic_drop_inode(struct inode *inode)
 {
diff --git a/include/linux/wait.h b/include/linux/wait.h
index 27d7a0a..45ef1d2 100644
--- a/include/linux/wait.h
+++ b/include/linux/wait.h
@@ -991,6 +991,14 @@ int wake_bit_function(wait_queue_t *wait, unsigned mode, int sync, void *key);
 
 #define DEFINE_WAIT(name) DEFINE_WAIT_FUNC(name, autoremove_wake_function)
 
+#define init_wait(wait)							\
+	do {								\
+		(wait)->private = current;				\
+		(wait)->func = autoremove_wake_function;		\
+		INIT_LIST_HEAD(&(wait)->task_list);			\
+		(wait)->flags = 0;					\
+	} while (0)
+
 #define DEFINE_WAIT_BIT(name, word, bit)				\
 	struct wait_bit_queue name = {					\
 		.key = __WAIT_BIT_KEY_INITIALIZER(word, bit),		\
@@ -1002,15 +1010,16 @@ int wake_bit_function(wait_queue_t *wait, unsigned mode, int sync, void *key);
 		},							\
 	}
 
-#define init_wait(wait)							\
+#define init_wait_bit(wbit, word, bit)					\
 	do {								\
-		(wait)->private = current;				\
-		(wait)->func = autoremove_wake_function;		\
-		INIT_LIST_HEAD(&(wait)->task_list);			\
-		(wait)->flags = 0;					\
+		(wbit)->key.flags = word;				\
+		(wbit)->key.bit_nr = bit;				\
+		(wbit)->wait.private = current;				\
+		(wbit)->wait.func = wake_bit_function;			\
+		INIT_LIST_HEAD(&(wbit)->wait.task_list);		\
+		(wbit)->wait.flags = 0;					\
 	} while (0)
 
-
 extern int bit_wait(struct wait_bit_key *, int);
 extern int bit_wait_io(struct wait_bit_key *, int);
 extern int bit_wait_timeout(struct wait_bit_key *, int);
-- 
2.5.5



^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [GFS2 PATCH 6/7] GFS2: Use non-blocking wait in gfs2_iget
  2016-05-24 19:12 ` [Cluster-devel] " Bob Peterson
@ 2016-05-24 19:12   ` Bob Peterson
  -1 siblings, 0 replies; 18+ messages in thread
From: Bob Peterson @ 2016-05-24 19:12 UTC (permalink / raw)
  To: cluster-devel, Alexander Viro, linux-fsdevel

From: Andreas Gruenbacher <agruenba@redhat.com>

Change gfs2_iget to use non-blocking lookups internally.  This will be
used to prevent glock deadlocks.

(Requires exporting __iget from fs/inode.c.)

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
---
 fs/gfs2/inode.c | 104 ++++++++++++++++++++++++++++++++++++--------------------
 fs/gfs2/inode.h |   5 +++
 fs/inode.c      |   1 +
 3 files changed, 74 insertions(+), 36 deletions(-)

diff --git a/fs/gfs2/inode.c b/fs/gfs2/inode.c
index 5dff5da..d1dde39 100644
--- a/fs/gfs2/inode.c
+++ b/fs/gfs2/inode.c
@@ -20,6 +20,7 @@
 #include <linux/fiemap.h>
 #include <linux/security.h>
 #include <asm/uaccess.h>
+#include <linux/writeback.h>
 
 #include "gfs2.h"
 #include "incore.h"
@@ -37,61 +38,92 @@
 #include "super.h"
 #include "glops.h"
 
-struct gfs2_skip_data {
+struct gfs2_match {
 	u64 no_addr;
-	int skipped;
-	int non_block;
+	struct gfs2_freeing_inode *freeing;
 };
 
-static int iget_test(struct inode *inode, void *opaque)
+/* gfs2_match_inode - Inode matching function
+ * @inode: inode pointer
+ * @l: hash value (unused, since it may not be able to hold our no_addr)
+ * @opaque: points to a gfs2_freeing_inode structure
+ */
+static int gfs2_match_inode(struct inode *inode, unsigned long l, void *opaque)
 {
-	struct gfs2_inode *ip = GFS2_I(inode);
-	struct gfs2_skip_data *data = opaque;
+	struct gfs2_match *match = opaque;
+	struct gfs2_freeing_inode *freeing = match->freeing;
 
-	if (ip->i_no_addr == data->no_addr) {
-		if (data->non_block &&
-		    inode->i_state & (I_FREEING|I_CLEAR|I_WILL_FREE)) {
-			data->skipped = 1;
-			return 0;
-		}
-		return 1;
+	if (GFS2_I(inode)->i_no_addr != match->no_addr)
+		return 0;
+
+	spin_lock(&inode->i_lock);
+	if (inode->i_state & (I_FREEING|I_WILL_FREE)) {
+		freeing->wq = prepare_wait_on_freeing_inode(inode,
+			&freeing->bit_wait);
+		spin_unlock(&inode->i_lock);
+		return -1;
 	}
-	return 0;
+	__iget(inode);
+	spin_unlock(&inode->i_lock);
+	return 1;
 }
 
-static int iget_set(struct inode *inode, void *opaque)
+static int gfs2_test_inode(struct inode *inode, void *opaque)
 {
-	struct gfs2_inode *ip = GFS2_I(inode);
-	struct gfs2_skip_data *data = opaque;
+	struct gfs2_match *match = opaque;
 
-	if (data->skipped)
-		return -ENOENT;
-	inode->i_ino = (unsigned long)(data->no_addr);
-	ip->i_no_addr = data->no_addr;
-	return 0;
+	return GFS2_I(inode)->i_no_addr == match->no_addr;
 }
 
 struct inode *gfs2_ilookup(struct super_block *sb, u64 no_addr)
 {
-	unsigned long hash = (unsigned long)no_addr;
-	struct gfs2_skip_data data;
+	struct gfs2_match match = {
+		.no_addr = no_addr,
+	};
 
-	data.no_addr = no_addr;
-	data.skipped = 0;
-	data.non_block = 0;
-	return ilookup5(sb, hash, iget_test, &data);
+	return ilookup5(sb, no_addr, gfs2_test_inode, &match);
 }
 
 static struct inode *gfs2_iget(struct super_block *sb, u64 no_addr,
 			       int non_block)
 {
-	struct gfs2_skip_data data;
-	unsigned long hash = (unsigned long)no_addr;
+	struct gfs2_freeing_inode freeing;
+	struct gfs2_match match = {
+		.no_addr = no_addr,
+		.freeing = &freeing,
+	};
+	struct inode *inode;
 
-	data.no_addr = no_addr;
-	data.skipped = 0;
-	data.non_block = non_block;
-	return iget5_locked(sb, hash, iget_test, iget_set, &data);
+	while (1) {
+		freeing.wq = NULL;
+		inode = find_inode_nowait(sb, no_addr,
+					  gfs2_match_inode, &match);
+		if (inode) {
+			wait_on_inode(inode);
+			return inode;
+		}
+		if (freeing.wq) {
+			if (non_block) {
+				finish_wait(freeing.wq, &freeing.bit_wait.wait);
+				return ERR_PTR(-EAGAIN);
+			}
+			schedule();
+			finish_wait(freeing.wq, &freeing.bit_wait.wait);
+			continue;
+		}
+
+		inode = new_inode(sb);
+		if (!inode)
+			return ERR_PTR(-ENOMEM);
+		inode->i_ino = no_addr;
+		GFS2_I(inode)->i_no_addr = no_addr;
+		if (insert_inode_locked4(inode, no_addr,
+					 gfs2_test_inode, &match) < 0) {
+			iput(inode);
+			continue;
+		}
+		return inode;
+	}
 }
 
 /**
@@ -146,8 +178,8 @@ struct inode *gfs2_inode_lookup(struct super_block *sb, unsigned int type,
 	int error;
 
 	inode = gfs2_iget(sb, no_addr, non_block);
-	if (!inode)
-		return ERR_PTR(-ENOMEM);
+	if (IS_ERR(inode))
+		return inode;
 	ip = GFS2_I(inode);
 
 	if (inode->i_state & I_NEW) {
diff --git a/fs/gfs2/inode.h b/fs/gfs2/inode.h
index 22c27a8..4863513 100644
--- a/fs/gfs2/inode.h
+++ b/fs/gfs2/inode.h
@@ -93,6 +93,11 @@ err:
 	return -EIO;
 }
 
+struct gfs2_freeing_inode {
+	wait_queue_head_t *wq;
+	struct wait_bit_queue bit_wait;
+};
+
 extern struct inode *gfs2_inode_lookup(struct super_block *sb, unsigned type, 
 				       u64 no_addr, u64 no_formal_ino,
 				       int non_block);
diff --git a/fs/inode.c b/fs/inode.c
index e00fec4..c18ace5 100644
--- a/fs/inode.c
+++ b/fs/inode.c
@@ -388,6 +388,7 @@ void __iget(struct inode *inode)
 {
 	atomic_inc(&inode->i_count);
 }
+EXPORT_SYMBOL(__iget);
 
 /*
  * get additional reference to inode; caller must already hold one.
-- 
2.5.5


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [Cluster-devel] [GFS2 PATCH 6/7] GFS2: Use non-blocking wait in gfs2_iget
@ 2016-05-24 19:12   ` Bob Peterson
  0 siblings, 0 replies; 18+ messages in thread
From: Bob Peterson @ 2016-05-24 19:12 UTC (permalink / raw)
  To: cluster-devel.redhat.com

From: Andreas Gruenbacher <agruenba@redhat.com>

Change gfs2_iget to use non-blocking lookups internally.  This will be
used to prevent glock deadlocks.

(Requires exporting __iget from fs/inode.c.)

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
---
 fs/gfs2/inode.c | 104 ++++++++++++++++++++++++++++++++++++--------------------
 fs/gfs2/inode.h |   5 +++
 fs/inode.c      |   1 +
 3 files changed, 74 insertions(+), 36 deletions(-)

diff --git a/fs/gfs2/inode.c b/fs/gfs2/inode.c
index 5dff5da..d1dde39 100644
--- a/fs/gfs2/inode.c
+++ b/fs/gfs2/inode.c
@@ -20,6 +20,7 @@
 #include <linux/fiemap.h>
 #include <linux/security.h>
 #include <asm/uaccess.h>
+#include <linux/writeback.h>
 
 #include "gfs2.h"
 #include "incore.h"
@@ -37,61 +38,92 @@
 #include "super.h"
 #include "glops.h"
 
-struct gfs2_skip_data {
+struct gfs2_match {
 	u64 no_addr;
-	int skipped;
-	int non_block;
+	struct gfs2_freeing_inode *freeing;
 };
 
-static int iget_test(struct inode *inode, void *opaque)
+/* gfs2_match_inode - Inode matching function
+ * @inode: inode pointer
+ * @l: hash value (unused, since it may not be able to hold our no_addr)
+ * @opaque: points to a gfs2_freeing_inode structure
+ */
+static int gfs2_match_inode(struct inode *inode, unsigned long l, void *opaque)
 {
-	struct gfs2_inode *ip = GFS2_I(inode);
-	struct gfs2_skip_data *data = opaque;
+	struct gfs2_match *match = opaque;
+	struct gfs2_freeing_inode *freeing = match->freeing;
 
-	if (ip->i_no_addr == data->no_addr) {
-		if (data->non_block &&
-		    inode->i_state & (I_FREEING|I_CLEAR|I_WILL_FREE)) {
-			data->skipped = 1;
-			return 0;
-		}
-		return 1;
+	if (GFS2_I(inode)->i_no_addr != match->no_addr)
+		return 0;
+
+	spin_lock(&inode->i_lock);
+	if (inode->i_state & (I_FREEING|I_WILL_FREE)) {
+		freeing->wq = prepare_wait_on_freeing_inode(inode,
+			&freeing->bit_wait);
+		spin_unlock(&inode->i_lock);
+		return -1;
 	}
-	return 0;
+	__iget(inode);
+	spin_unlock(&inode->i_lock);
+	return 1;
 }
 
-static int iget_set(struct inode *inode, void *opaque)
+static int gfs2_test_inode(struct inode *inode, void *opaque)
 {
-	struct gfs2_inode *ip = GFS2_I(inode);
-	struct gfs2_skip_data *data = opaque;
+	struct gfs2_match *match = opaque;
 
-	if (data->skipped)
-		return -ENOENT;
-	inode->i_ino = (unsigned long)(data->no_addr);
-	ip->i_no_addr = data->no_addr;
-	return 0;
+	return GFS2_I(inode)->i_no_addr == match->no_addr;
 }
 
 struct inode *gfs2_ilookup(struct super_block *sb, u64 no_addr)
 {
-	unsigned long hash = (unsigned long)no_addr;
-	struct gfs2_skip_data data;
+	struct gfs2_match match = {
+		.no_addr = no_addr,
+	};
 
-	data.no_addr = no_addr;
-	data.skipped = 0;
-	data.non_block = 0;
-	return ilookup5(sb, hash, iget_test, &data);
+	return ilookup5(sb, no_addr, gfs2_test_inode, &match);
 }
 
 static struct inode *gfs2_iget(struct super_block *sb, u64 no_addr,
 			       int non_block)
 {
-	struct gfs2_skip_data data;
-	unsigned long hash = (unsigned long)no_addr;
+	struct gfs2_freeing_inode freeing;
+	struct gfs2_match match = {
+		.no_addr = no_addr,
+		.freeing = &freeing,
+	};
+	struct inode *inode;
 
-	data.no_addr = no_addr;
-	data.skipped = 0;
-	data.non_block = non_block;
-	return iget5_locked(sb, hash, iget_test, iget_set, &data);
+	while (1) {
+		freeing.wq = NULL;
+		inode = find_inode_nowait(sb, no_addr,
+					  gfs2_match_inode, &match);
+		if (inode) {
+			wait_on_inode(inode);
+			return inode;
+		}
+		if (freeing.wq) {
+			if (non_block) {
+				finish_wait(freeing.wq, &freeing.bit_wait.wait);
+				return ERR_PTR(-EAGAIN);
+			}
+			schedule();
+			finish_wait(freeing.wq, &freeing.bit_wait.wait);
+			continue;
+		}
+
+		inode = new_inode(sb);
+		if (!inode)
+			return ERR_PTR(-ENOMEM);
+		inode->i_ino = no_addr;
+		GFS2_I(inode)->i_no_addr = no_addr;
+		if (insert_inode_locked4(inode, no_addr,
+					 gfs2_test_inode, &match) < 0) {
+			iput(inode);
+			continue;
+		}
+		return inode;
+	}
 }
 
 /**
@@ -146,8 +178,8 @@ struct inode *gfs2_inode_lookup(struct super_block *sb, unsigned int type,
 	int error;
 
 	inode = gfs2_iget(sb, no_addr, non_block);
-	if (!inode)
-		return ERR_PTR(-ENOMEM);
+	if (IS_ERR(inode))
+		return inode;
 	ip = GFS2_I(inode);
 
 	if (inode->i_state & I_NEW) {
diff --git a/fs/gfs2/inode.h b/fs/gfs2/inode.h
index 22c27a8..4863513 100644
--- a/fs/gfs2/inode.h
+++ b/fs/gfs2/inode.h
@@ -93,6 +93,11 @@ err:
 	return -EIO;
 }
 
+struct gfs2_freeing_inode {
+	wait_queue_head_t *wq;
+	struct wait_bit_queue bit_wait;
+};
+
 extern struct inode *gfs2_inode_lookup(struct super_block *sb, unsigned type, 
 				       u64 no_addr, u64 no_formal_ino,
 				       int non_block);
diff --git a/fs/inode.c b/fs/inode.c
index e00fec4..c18ace5 100644
--- a/fs/inode.c
+++ b/fs/inode.c
@@ -388,6 +388,7 @@ void __iget(struct inode *inode)
 {
 	atomic_inc(&inode->i_count);
 }
+EXPORT_SYMBOL(__iget);
 
 /*
  * get additional reference to inode; caller must already hold one.
-- 
2.5.5



^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [GFS2 PATCH 7/7] GFS2: Prevent deadlock in gfs2_lookup_by_inum
  2016-05-24 19:12 ` [Cluster-devel] " Bob Peterson
@ 2016-05-24 19:12   ` Bob Peterson
  -1 siblings, 0 replies; 18+ messages in thread
From: Bob Peterson @ 2016-05-24 19:12 UTC (permalink / raw)
  To: cluster-devel, Alexander Viro, linux-fsdevel

From: Andreas Gruenbacher <agruenba@redhat.com>

In gfs2_lookup_by_inum, we must take the glock of a presumed inode
before we can determine if the block indeed still contains an inode,
then look up the inode.  When a lookup finds an inode that is being
freed, it usually waits until that inodes has gone before returning.
However, freeing the inode requires taking the inode glock, so we end up
deadlocking.

Fix that by changing gfs2_inode_lookup: instead of waiting for inodes
that are being freed, return the context necessary for waiting.  Then,
in gfs2_lookup_by_inum, drop the glock before waiting and retrying the
lookup.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
---
 fs/gfs2/dir.c        |  2 +-
 fs/gfs2/inode.c      | 42 +++++++++++++++++++++++++-----------------
 fs/gfs2/inode.h      |  2 +-
 fs/gfs2/ops_fstype.c |  2 +-
 4 files changed, 28 insertions(+), 20 deletions(-)

diff --git a/fs/gfs2/dir.c b/fs/gfs2/dir.c
index d4014af..eb54888 100644
--- a/fs/gfs2/dir.c
+++ b/fs/gfs2/dir.c
@@ -1660,7 +1660,7 @@ struct inode *gfs2_dir_search(struct inode *dir, const struct qstr *name,
 		brelse(bh);
 		if (fail_on_exist)
 			return ERR_PTR(-EEXIST);
-		inode = gfs2_inode_lookup(dir->i_sb, dtype, addr, formal_ino, 0);
+		inode = gfs2_inode_lookup(dir->i_sb, dtype, addr, formal_ino, NULL);
 		if (!IS_ERR(inode))
 			GFS2_I(inode)->i_rahead = rahead;
 		return inode;
diff --git a/fs/gfs2/inode.c b/fs/gfs2/inode.c
index d1dde39..e61ba5b 100644
--- a/fs/gfs2/inode.c
+++ b/fs/gfs2/inode.c
@@ -85,30 +85,29 @@ struct inode *gfs2_ilookup(struct super_block *sb, u64 no_addr)
 }
 
 static struct inode *gfs2_iget(struct super_block *sb, u64 no_addr,
-			       int non_block)
+			       struct gfs2_freeing_inode *freeing)
 {
-	struct gfs2_freeing_inode freeing;
-	struct gfs2_match match = {
-		.no_addr = no_addr,
-		.freeing = &freeing,
-	};
+	struct gfs2_freeing_inode wait_here;
+	struct gfs2_match match;
 	struct inode *inode;
 
+	if (!freeing)
+		freeing = &wait_here;
+	match.no_addr = no_addr;
+	match.freeing = freeing;
 	while (1) {
-		freeing.wq = NULL;
+		freeing->wq = NULL;
 		inode = find_inode_nowait(sb, no_addr,
 					  gfs2_match_inode, &match);
 		if (inode) {
 			wait_on_inode(inode);
 			return inode;
 		}
-		if (freeing.wq) {
-			if (non_block) {
-				finish_wait(freeing.wq, &freeing.bit_wait.wait);
+		if (freeing->wq) {
+			if (freeing != &wait_here)
 				return ERR_PTR(-EAGAIN);
-			}
 			schedule();
-			finish_wait(freeing.wq, &freeing.bit_wait.wait);
+			finish_wait(freeing->wq, &freeing->bit_wait.wait);
 			continue;
 		}
 
@@ -162,22 +161,23 @@ static void gfs2_set_iop(struct inode *inode)
 /**
  * gfs2_inode_lookup - Lookup an inode
  * @sb: The super block
- * @no_addr: The inode number
  * @type: The type of the inode
- * non_block: Can we block on inodes that are being freed?
+ * @no_addr: The inode number
+ * @freeing: Filled in when inode is being freed
  *
  * Returns: A VFS inode, or an error
  */
 
 struct inode *gfs2_inode_lookup(struct super_block *sb, unsigned int type,
-				u64 no_addr, u64 no_formal_ino, int non_block)
+				u64 no_addr, u64 no_formal_ino,
+				struct gfs2_freeing_inode *freeing)
 {
 	struct inode *inode;
 	struct gfs2_inode *ip;
 	struct gfs2_glock *io_gl = NULL;
 	int error;
 
-	inode = gfs2_iget(sb, no_addr, non_block);
+	inode = gfs2_iget(sb, no_addr, freeing);
 	if (IS_ERR(inode))
 		return inode;
 	ip = GFS2_I(inode);
@@ -237,11 +237,13 @@ fail:
 struct inode *gfs2_lookup_by_inum(struct gfs2_sbd *sdp, u64 no_addr,
 				  u64 *no_formal_ino, unsigned int blktype)
 {
+	struct gfs2_freeing_inode freeing;
 	struct super_block *sb = sdp->sd_vfs;
 	struct gfs2_holder i_gh;
 	struct inode *inode = NULL;
 	int error;
 
+repeat:
 	/* Must not read in block until block type is verified */
 	error = gfs2_glock_nq_num(sdp, no_addr, &gfs2_inode_glops,
 				  LM_ST_EXCLUSIVE, GL_SKIP, &i_gh);
@@ -252,7 +254,13 @@ struct inode *gfs2_lookup_by_inum(struct gfs2_sbd *sdp, u64 no_addr,
 	if (error)
 		goto fail;
 
-	inode = gfs2_inode_lookup(sb, DT_UNKNOWN, no_addr, 0, 1);
+	inode = gfs2_inode_lookup(sb, DT_UNKNOWN, no_addr, 0, &freeing);
+	if (inode == ERR_PTR(-EAGAIN)) {
+		gfs2_glock_dq_uninit(&i_gh);
+		schedule();
+		finish_wait(freeing.wq, &freeing.bit_wait.wait);
+		goto repeat;
+	}
 	if (IS_ERR(inode))
 		goto fail;
 
diff --git a/fs/gfs2/inode.h b/fs/gfs2/inode.h
index 4863513..f043601 100644
--- a/fs/gfs2/inode.h
+++ b/fs/gfs2/inode.h
@@ -100,7 +100,7 @@ struct gfs2_freeing_inode {
 
 extern struct inode *gfs2_inode_lookup(struct super_block *sb, unsigned type, 
 				       u64 no_addr, u64 no_formal_ino,
-				       int non_block);
+				       struct gfs2_freeing_inode *freeing);
 extern struct inode *gfs2_lookup_by_inum(struct gfs2_sbd *sdp, u64 no_addr,
 					 u64 *no_formal_ino,
 					 unsigned int blktype);
diff --git a/fs/gfs2/ops_fstype.c b/fs/gfs2/ops_fstype.c
index 89472c4..8fef52d 100644
--- a/fs/gfs2/ops_fstype.c
+++ b/fs/gfs2/ops_fstype.c
@@ -454,7 +454,7 @@ static int gfs2_lookup_root(struct super_block *sb, struct dentry **dptr,
 	struct dentry *dentry;
 	struct inode *inode;
 
-	inode = gfs2_inode_lookup(sb, DT_DIR, no_addr, 0, 0);
+	inode = gfs2_inode_lookup(sb, DT_DIR, no_addr, 0, NULL);
 	if (IS_ERR(inode)) {
 		fs_err(sdp, "can't read in %s inode: %ld\n", name, PTR_ERR(inode));
 		return PTR_ERR(inode);
-- 
2.5.5


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [Cluster-devel] [GFS2 PATCH 7/7] GFS2: Prevent deadlock in gfs2_lookup_by_inum
@ 2016-05-24 19:12   ` Bob Peterson
  0 siblings, 0 replies; 18+ messages in thread
From: Bob Peterson @ 2016-05-24 19:12 UTC (permalink / raw)
  To: cluster-devel.redhat.com

From: Andreas Gruenbacher <agruenba@redhat.com>

In gfs2_lookup_by_inum, we must take the glock of a presumed inode
before we can determine if the block indeed still contains an inode,
then look up the inode.  When a lookup finds an inode that is being
freed, it usually waits until that inodes has gone before returning.
However, freeing the inode requires taking the inode glock, so we end up
deadlocking.

Fix that by changing gfs2_inode_lookup: instead of waiting for inodes
that are being freed, return the context necessary for waiting.  Then,
in gfs2_lookup_by_inum, drop the glock before waiting and retrying the
lookup.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
---
 fs/gfs2/dir.c        |  2 +-
 fs/gfs2/inode.c      | 42 +++++++++++++++++++++++++-----------------
 fs/gfs2/inode.h      |  2 +-
 fs/gfs2/ops_fstype.c |  2 +-
 4 files changed, 28 insertions(+), 20 deletions(-)

diff --git a/fs/gfs2/dir.c b/fs/gfs2/dir.c
index d4014af..eb54888 100644
--- a/fs/gfs2/dir.c
+++ b/fs/gfs2/dir.c
@@ -1660,7 +1660,7 @@ struct inode *gfs2_dir_search(struct inode *dir, const struct qstr *name,
 		brelse(bh);
 		if (fail_on_exist)
 			return ERR_PTR(-EEXIST);
-		inode = gfs2_inode_lookup(dir->i_sb, dtype, addr, formal_ino, 0);
+		inode = gfs2_inode_lookup(dir->i_sb, dtype, addr, formal_ino, NULL);
 		if (!IS_ERR(inode))
 			GFS2_I(inode)->i_rahead = rahead;
 		return inode;
diff --git a/fs/gfs2/inode.c b/fs/gfs2/inode.c
index d1dde39..e61ba5b 100644
--- a/fs/gfs2/inode.c
+++ b/fs/gfs2/inode.c
@@ -85,30 +85,29 @@ struct inode *gfs2_ilookup(struct super_block *sb, u64 no_addr)
 }
 
 static struct inode *gfs2_iget(struct super_block *sb, u64 no_addr,
-			       int non_block)
+			       struct gfs2_freeing_inode *freeing)
 {
-	struct gfs2_freeing_inode freeing;
-	struct gfs2_match match = {
-		.no_addr = no_addr,
-		.freeing = &freeing,
-	};
+	struct gfs2_freeing_inode wait_here;
+	struct gfs2_match match;
 	struct inode *inode;
 
+	if (!freeing)
+		freeing = &wait_here;
+	match.no_addr = no_addr;
+	match.freeing = freeing;
 	while (1) {
-		freeing.wq = NULL;
+		freeing->wq = NULL;
 		inode = find_inode_nowait(sb, no_addr,
 					  gfs2_match_inode, &match);
 		if (inode) {
 			wait_on_inode(inode);
 			return inode;
 		}
-		if (freeing.wq) {
-			if (non_block) {
-				finish_wait(freeing.wq, &freeing.bit_wait.wait);
+		if (freeing->wq) {
+			if (freeing != &wait_here)
 				return ERR_PTR(-EAGAIN);
-			}
 			schedule();
-			finish_wait(freeing.wq, &freeing.bit_wait.wait);
+			finish_wait(freeing->wq, &freeing->bit_wait.wait);
 			continue;
 		}
 
@@ -162,22 +161,23 @@ static void gfs2_set_iop(struct inode *inode)
 /**
  * gfs2_inode_lookup - Lookup an inode
  * @sb: The super block
- * @no_addr: The inode number
  * @type: The type of the inode
- * non_block: Can we block on inodes that are being freed?
+ * @no_addr: The inode number
+ * @freeing: Filled in when inode is being freed
  *
  * Returns: A VFS inode, or an error
  */
 
 struct inode *gfs2_inode_lookup(struct super_block *sb, unsigned int type,
-				u64 no_addr, u64 no_formal_ino, int non_block)
+				u64 no_addr, u64 no_formal_ino,
+				struct gfs2_freeing_inode *freeing)
 {
 	struct inode *inode;
 	struct gfs2_inode *ip;
 	struct gfs2_glock *io_gl = NULL;
 	int error;
 
-	inode = gfs2_iget(sb, no_addr, non_block);
+	inode = gfs2_iget(sb, no_addr, freeing);
 	if (IS_ERR(inode))
 		return inode;
 	ip = GFS2_I(inode);
@@ -237,11 +237,13 @@ fail:
 struct inode *gfs2_lookup_by_inum(struct gfs2_sbd *sdp, u64 no_addr,
 				  u64 *no_formal_ino, unsigned int blktype)
 {
+	struct gfs2_freeing_inode freeing;
 	struct super_block *sb = sdp->sd_vfs;
 	struct gfs2_holder i_gh;
 	struct inode *inode = NULL;
 	int error;
 
+repeat:
 	/* Must not read in block until block type is verified */
 	error = gfs2_glock_nq_num(sdp, no_addr, &gfs2_inode_glops,
 				  LM_ST_EXCLUSIVE, GL_SKIP, &i_gh);
@@ -252,7 +254,13 @@ struct inode *gfs2_lookup_by_inum(struct gfs2_sbd *sdp, u64 no_addr,
 	if (error)
 		goto fail;
 
-	inode = gfs2_inode_lookup(sb, DT_UNKNOWN, no_addr, 0, 1);
+	inode = gfs2_inode_lookup(sb, DT_UNKNOWN, no_addr, 0, &freeing);
+	if (inode == ERR_PTR(-EAGAIN)) {
+		gfs2_glock_dq_uninit(&i_gh);
+		schedule();
+		finish_wait(freeing.wq, &freeing.bit_wait.wait);
+		goto repeat;
+	}
 	if (IS_ERR(inode))
 		goto fail;
 
diff --git a/fs/gfs2/inode.h b/fs/gfs2/inode.h
index 4863513..f043601 100644
--- a/fs/gfs2/inode.h
+++ b/fs/gfs2/inode.h
@@ -100,7 +100,7 @@ struct gfs2_freeing_inode {
 
 extern struct inode *gfs2_inode_lookup(struct super_block *sb, unsigned type, 
 				       u64 no_addr, u64 no_formal_ino,
-				       int non_block);
+				       struct gfs2_freeing_inode *freeing);
 extern struct inode *gfs2_lookup_by_inum(struct gfs2_sbd *sdp, u64 no_addr,
 					 u64 *no_formal_ino,
 					 unsigned int blktype);
diff --git a/fs/gfs2/ops_fstype.c b/fs/gfs2/ops_fstype.c
index 89472c4..8fef52d 100644
--- a/fs/gfs2/ops_fstype.c
+++ b/fs/gfs2/ops_fstype.c
@@ -454,7 +454,7 @@ static int gfs2_lookup_root(struct super_block *sb, struct dentry **dptr,
 	struct dentry *dentry;
 	struct inode *inode;
 
-	inode = gfs2_inode_lookup(sb, DT_DIR, no_addr, 0, 0);
+	inode = gfs2_inode_lookup(sb, DT_DIR, no_addr, 0, NULL);
 	if (IS_ERR(inode)) {
 		fs_err(sdp, "can't read in %s inode: %ld\n", name, PTR_ERR(inode));
 		return PTR_ERR(inode);
-- 
2.5.5



^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: [Cluster-devel] [GFS2 PATCH 0/7] Fix glock deadlocks with deleted inodes
  2016-05-24 19:12 ` [Cluster-devel] " Bob Peterson
@ 2016-05-24 21:58   ` Andreas Gruenbacher
  -1 siblings, 0 replies; 18+ messages in thread
From: Andreas Gruenbacher @ 2016-05-24 21:58 UTC (permalink / raw)
  To: Bob Peterson; +Cc: cluster-devel, Alexander Viro, linux-fsdevel

On Tue, May 24, 2016 at 9:12 PM, Bob Peterson <rpeterso@redhat.com> wrote:
> I'm sending this to linux-fsdevel because one of the patches affects vfs.

FWIW, except for Bob's new cover letter, this is a repost of the
following patch set from May 13 -- which nobody has commented on so
far:

  http://permalink.gmane.org/gmane.linux.file-systems/108374

There are some more details about how we got to that solution in the first post.

Thanks,
Andreas

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [Cluster-devel] [GFS2 PATCH 0/7] Fix glock deadlocks with deleted inodes
@ 2016-05-24 21:58   ` Andreas Gruenbacher
  0 siblings, 0 replies; 18+ messages in thread
From: Andreas Gruenbacher @ 2016-05-24 21:58 UTC (permalink / raw)
  To: cluster-devel.redhat.com

On Tue, May 24, 2016 at 9:12 PM, Bob Peterson <rpeterso@redhat.com> wrote:
> I'm sending this to linux-fsdevel because one of the patches affects vfs.

FWIW, except for Bob's new cover letter, this is a repost of the
following patch set from May 13 -- which nobody has commented on so
far:

  http://permalink.gmane.org/gmane.linux.file-systems/108374

There are some more details about how we got to that solution in the first post.

Thanks,
Andreas



^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2016-05-24 21:58 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-05-24 19:12 [GFS2 PATCH 0/7] Fix glock deadlocks with deleted inodes Bob Peterson
2016-05-24 19:12 ` [Cluster-devel] " Bob Peterson
2016-05-24 19:12 ` [GFS2 PATCH 1/7] Revert "GFS2: Eliminate parameter non_block on gfs2_inode_lookup" Bob Peterson
2016-05-24 19:12   ` [Cluster-devel] " Bob Peterson
2016-05-24 19:12 ` [GFS2 PATCH 2/7] Revert "GFS2: Don't filter out I_FREEING inodes anymore" Bob Peterson
2016-05-24 19:12   ` [Cluster-devel] " Bob Peterson
2016-05-24 19:12 ` [GFS2 PATCH 3/7] GFS2: Remove superfluous assignment Bob Peterson
2016-05-24 19:12   ` [Cluster-devel] " Bob Peterson
2016-05-24 19:12 ` [GFS2 PATCH 4/7] GFS2: No need for non-blocking gfs2_ilookup in delete_work_func Bob Peterson
2016-05-24 19:12   ` [Cluster-devel] " Bob Peterson
2016-05-24 19:12 ` [GFS2 PATCH 5/7] vfs: Introduce prepare_wait_on_freeing_inode Bob Peterson
2016-05-24 19:12   ` [Cluster-devel] " Bob Peterson
2016-05-24 19:12 ` [GFS2 PATCH 6/7] GFS2: Use non-blocking wait in gfs2_iget Bob Peterson
2016-05-24 19:12   ` [Cluster-devel] " Bob Peterson
2016-05-24 19:12 ` [GFS2 PATCH 7/7] GFS2: Prevent deadlock in gfs2_lookup_by_inum Bob Peterson
2016-05-24 19:12   ` [Cluster-devel] " Bob Peterson
2016-05-24 21:58 ` [Cluster-devel] [GFS2 PATCH 0/7] Fix glock deadlocks with deleted inodes Andreas Gruenbacher
2016-05-24 21:58   ` Andreas Gruenbacher

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.