* [Cluster-devel] [[GFS2 PATCH] 0/3] Patches for gfs2_lookup_by_inum deadlock
@ 2016-06-15 14:46 Bob Peterson
  2016-06-15 14:46 ` [Cluster-devel] [[GFS2 PATCH] 1/3] gfs2: Fix gfs2_lookup_by_inum lock inversion Bob Peterson
                   ` (4 more replies)
  0 siblings, 5 replies; 9+ messages in thread
From: Bob Peterson @ 2016-06-15 14:46 UTC (permalink / raw)
  To: cluster-devel.redhat.com

Hi,

This is a set of three patches from Andreas Gruenbacher that fix the
gfs2_lookup_by_inum deadlock problem. I've been working with Andreas
for a while now, and we've both made several attempts to fix this
problem in the past, in regard to the transition of dinodes from the
"unlinked" to the "free" state. This is the latest attempt, and it
seems to be working well.

Our previous attempt made a change to vfs, but Al Viro didn't like
that, so it was scrapped in favor of this one, which is simpler and
confined to GFS2. It's similar in concept to the patch set I posted
on 18 December 2015.

It also fixes a problem on 32-bit architectures that was introduced
by a recent patch related to the same problem.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
---
Andreas Gruenbacher (3):
  gfs2: Fix gfs2_lookup_by_inum lock inversion
  gfs2: Get rid of gfs2_ilookup
  gfs2: Large-filesystem fix for 32-bit systems

 fs/gfs2/dir.c        |   3 +-
 fs/gfs2/export.c     |  11 ------
 fs/gfs2/glock.c      |   9 +----
 fs/gfs2/inode.c      | 103 ++++++++++++++++++++++++++++++++++++---------------
 fs/gfs2/inode.h      |   4 +-
 fs/gfs2/ops_fstype.c |   3 +-
 6 files changed, 81 insertions(+), 52 deletions(-)

-- 
2.5.5




* [Cluster-devel] [[GFS2 PATCH] 1/3] gfs2: Fix gfs2_lookup_by_inum lock inversion
  2016-06-15 14:46 [Cluster-devel] [[GFS2 PATCH] 0/3] Patches for gfs2_lookup_by_inum deadlock Bob Peterson
@ 2016-06-15 14:46 ` Bob Peterson
  2016-06-15 14:46 ` [Cluster-devel] [[GFS2 PATCH] 2/3] gfs2: Get rid of gfs2_ilookup Bob Peterson
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 9+ messages in thread
From: Bob Peterson @ 2016-06-15 14:46 UTC (permalink / raw)
  To: cluster-devel.redhat.com

From: Andreas Gruenbacher <agruenba@redhat.com>

The current gfs2_lookup_by_inum takes the glock of a presumed inode
identified by block number, verifies that the block is indeed an inode,
and then instantiates and reads the new inode via gfs2_inode_lookup.

However, instantiating a new inode may block on freeing a previous
instance of that inode (__wait_on_freeing_inode), and freeing an inode
requires taking the glock that is already held, leading to lock
inversion and deadlock.

Fix this by first instantiating the new inode, then verifying that the
block is an inode (if required), and then reading in the new inode, all
in gfs2_inode_lookup.
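
To illustrate why the ordering matters, here is a toy model of the two
lock orders. It is only a sketch and not GFS2 code: struct toy_state and
all toy_* functions are hypothetical names made up for this example, and
"deadlock" is modelled as a function returning false instead of blocking.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy state, not GFS2 code: one glock and one possibly-dying instance. */
struct toy_state {
	bool glock_held;	/* does the lookup currently hold the glock? */
	bool old_freeing;	/* is a previous instance still being freed? */
};

/* Freeing the old instance needs the glock, so it stalls while we hold it. */
static bool toy_free_old_instance(struct toy_state *s)
{
	if (s->glock_held)
		return false;
	s->old_freeing = false;
	return true;
}

/* Old order: glock first, then instantiate.  Returns false on deadlock. */
static bool toy_lookup_glock_first(struct toy_state *s)
{
	s->glock_held = true;
	if (s->old_freeing && !toy_free_old_instance(s))
		return false;	/* we wait for a free that waits for us */
	s->glock_held = false;
	return true;
}

/* New order: instantiate first, then take the glock. */
static bool toy_lookup_inode_first(struct toy_state *s)
{
	if (s->old_freeing && !toy_free_old_instance(s))
		return false;
	s->glock_held = true;	/* block type check and read would go here */
	s->glock_held = false;
	return true;
}
```

With a previous instance still being freed, toy_lookup_glock_first()
reports the inversion while toy_lookup_inode_first() completes, because
the free can take the glock before the lookup does.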

If the block we are looking for is not an inode, we discard the new
inode via iget_failed, which marks inodes as bad and unhashes them.
Other tasks waiting on that inode will get a bad inode back from
ilookup or iget_locked; in that case, they retry the lookup.
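
The retry can be sketched in isolation roughly as follows. This is a
stand-alone model under stated assumptions, not the kernel code: the
toy_cache, which simply hands out inodes from a fixed sequence, and every
toy_* name are hypothetical and exist only for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_inode {
	bool bad;		/* set when instantiation failed */
	int refcount;
};

/* Hypothetical lookup source: returns entries from a fixed sequence. */
struct toy_cache {
	struct toy_inode **seq;
	size_t len;
	size_t pos;
};

/* Returns the next entry with a reference taken, or NULL when exhausted. */
static struct toy_inode *toy_lookup(struct toy_cache *c)
{
	struct toy_inode *inode;

	if (c->pos >= c->len)
		return NULL;
	inode = c->seq[c->pos++];
	if (inode)
		inode->refcount++;
	return inode;
}

static void toy_put(struct toy_inode *inode)
{
	inode->refcount--;
}

/* Drop bad inodes and look again until we get a good one or nothing. */
static struct toy_inode *toy_lookup_retry(struct toy_cache *c)
{
	struct toy_inode *inode;

	for (;;) {
		inode = toy_lookup(c);
		if (!inode || !inode->bad)
			return inode;
		toy_put(inode);
	}
}
```

The point of the loop is that the reference on the bad inode is dropped
before looking again, so a failed instantiation by one task does not leak
a reference in the tasks that were waiting on it.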

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
---
 fs/gfs2/dir.c        |   3 +-
 fs/gfs2/glock.c      |   4 +-
 fs/gfs2/inode.c      | 101 +++++++++++++++++++++++++++++++++++++--------------
 fs/gfs2/inode.h      |   3 +-
 fs/gfs2/ops_fstype.c |   3 +-
 5 files changed, 81 insertions(+), 33 deletions(-)

diff --git a/fs/gfs2/dir.c b/fs/gfs2/dir.c
index 4a01f30..1b02665 100644
--- a/fs/gfs2/dir.c
+++ b/fs/gfs2/dir.c
@@ -1660,7 +1660,8 @@ struct inode *gfs2_dir_search(struct inode *dir, const struct qstr *name,
 		brelse(bh);
 		if (fail_on_exist)
 			return ERR_PTR(-EEXIST);
-		inode = gfs2_inode_lookup(dir->i_sb, dtype, addr, formal_ino);
+		inode = gfs2_inode_lookup(dir->i_sb, dtype, addr, formal_ino,
+					  GFS2_BLKST_FREE /* ignore */);
 		if (!IS_ERR(inode))
 			GFS2_I(inode)->i_rahead = rahead;
 		return inode;
diff --git a/fs/gfs2/glock.c b/fs/gfs2/glock.c
index 706fd93..ce46375 100644
--- a/fs/gfs2/glock.c
+++ b/fs/gfs2/glock.c
@@ -576,7 +576,7 @@ static void delete_work_func(struct work_struct *work)
 	struct gfs2_glock *gl = container_of(work, struct gfs2_glock, gl_delete);
 	struct gfs2_sbd *sdp = gl->gl_name.ln_sbd;
 	struct gfs2_inode *ip;
-	struct inode *inode;
+	struct inode *inode = NULL;
 	u64 no_addr = gl->gl_name.ln_number;
 
 	/* If someone's using this glock to create a new dinode, the block must
@@ -590,7 +590,7 @@ static void delete_work_func(struct work_struct *work)
 
 	if (ip)
 		inode = gfs2_ilookup(sdp->sd_vfs, no_addr);
-	else
+	if (IS_ERR_OR_NULL(inode))
 		inode = gfs2_lookup_by_inum(sdp, no_addr, NULL, GFS2_BLKST_UNLINKED);
 	if (inode && !IS_ERR(inode)) {
 		d_prune_aliases(inode);
diff --git a/fs/gfs2/inode.c b/fs/gfs2/inode.c
index 21dc784..6d5c6bb 100644
--- a/fs/gfs2/inode.c
+++ b/fs/gfs2/inode.c
@@ -39,7 +39,33 @@
 
 struct inode *gfs2_ilookup(struct super_block *sb, u64 no_addr)
 {
-	return ilookup(sb, (unsigned long)no_addr);
+	struct inode *inode;
+
+repeat:
+	inode = ilookup(sb, no_addr);
+	if (!inode)
+		return inode;
+	if (is_bad_inode(inode)) {
+		iput(inode);
+		goto repeat;
+	}
+	return inode;
+}
+
+static struct inode *gfs2_iget(struct super_block *sb, u64 no_addr)
+{
+	struct inode *inode;
+
+repeat:
+	inode = iget_locked(sb, no_addr);
+	if (!inode)
+		return inode;
+	if (is_bad_inode(inode)) {
+		iput(inode);
+		goto repeat;
+	}
+	GFS2_I(inode)->i_no_addr = no_addr;
+	return inode;
 }
 
 /**
@@ -78,26 +104,37 @@ static void gfs2_set_iop(struct inode *inode)
 /**
  * gfs2_inode_lookup - Lookup an inode
  * @sb: The super block
- * @no_addr: The inode number
  * @type: The type of the inode
+ * @no_addr: The inode number
+ * @no_formal_ino: The inode generation number
+ * @blktype: Requested block type (GFS2_BLKST_DINODE or GFS2_BLKST_UNLINKED;
+ *           GFS2_BLKST_FREE to indicate not to verify)
+ *
+ * If @type is DT_UNKNOWN, the inode type is fetched from disk.
+ *
+ * If @blktype is anything other than GFS2_BLKST_FREE (which is used as a
+ * placeholder because it doesn't otherwise make sense), the on-disk block type
+ * is verified to be @blktype.
  *
  * Returns: A VFS inode, or an error
  */
 
 struct inode *gfs2_inode_lookup(struct super_block *sb, unsigned int type,
-				u64 no_addr, u64 no_formal_ino)
+				u64 no_addr, u64 no_formal_ino,
+				unsigned int blktype)
 {
 	struct inode *inode;
 	struct gfs2_inode *ip;
 	struct gfs2_glock *io_gl = NULL;
+	struct gfs2_holder i_gh;
+	bool unlock = false;
 	int error;
 
-	inode = iget_locked(sb, (unsigned long)no_addr);
+	inode = gfs2_iget(sb, no_addr);
 	if (!inode)
 		return ERR_PTR(-ENOMEM);
 
 	ip = GFS2_I(inode);
-	ip->i_no_addr = no_addr;
 
 	if (inode->i_state & I_NEW) {
 		struct gfs2_sbd *sdp = GFS2_SB(inode);
@@ -112,10 +149,30 @@ struct inode *gfs2_inode_lookup(struct super_block *sb, unsigned int type,
 		if (unlikely(error))
 			goto fail_put;
 
+		if (type == DT_UNKNOWN || blktype != GFS2_BLKST_FREE) {
+			/*
+			 * The GL_SKIP flag indicates to skip reading the inode
+			 * block.  We read the inode with gfs2_inode_refresh
+			 * after possibly checking the block type.
+			 */
+			error = gfs2_glock_nq_init(ip->i_gl, LM_ST_EXCLUSIVE,
+						   GL_SKIP, &i_gh);
+			if (error)
+				goto fail_put;
+			unlock = true;
+
+			if (blktype != GFS2_BLKST_FREE) {
+				error = gfs2_check_blk_type(sdp, no_addr,
+							    blktype);
+				if (error)
+					goto fail_put;
+			}
+		}
+
 		set_bit(GIF_INVALID, &ip->i_flags);
 		error = gfs2_glock_nq_init(io_gl, LM_ST_SHARED, GL_EXACT, &ip->i_iopen_gh);
 		if (unlikely(error))
-			goto fail_iopen;
+			goto fail_put;
 
 		ip->i_iopen_gh.gh_gl->gl_object = ip;
 		gfs2_glock_put(io_gl);
@@ -134,6 +191,8 @@ struct inode *gfs2_inode_lookup(struct super_block *sb, unsigned int type,
 		unlock_new_inode(inode);
 	}
 
+	if (unlock)
+		gfs2_glock_dq_uninit(&i_gh);
 	return inode;
 
 fail_refresh:
@@ -141,10 +200,11 @@ fail_refresh:
 	ip->i_iopen_gh.gh_gl->gl_object = NULL;
 	gfs2_glock_dq_wait(&ip->i_iopen_gh);
 	gfs2_holder_uninit(&ip->i_iopen_gh);
-fail_iopen:
+fail_put:
 	if (io_gl)
 		gfs2_glock_put(io_gl);
-fail_put:
+	if (unlock)
+		gfs2_glock_dq_uninit(&i_gh);
 	ip->i_gl->gl_object = NULL;
 fail:
 	iget_failed(inode);
@@ -155,23 +215,12 @@ struct inode *gfs2_lookup_by_inum(struct gfs2_sbd *sdp, u64 no_addr,
 				  u64 *no_formal_ino, unsigned int blktype)
 {
 	struct super_block *sb = sdp->sd_vfs;
-	struct gfs2_holder i_gh;
-	struct inode *inode = NULL;
+	struct inode *inode;
 	int error;
 
-	/* Must not read in block until block type is verified */
-	error = gfs2_glock_nq_num(sdp, no_addr, &gfs2_inode_glops,
-				  LM_ST_EXCLUSIVE, GL_SKIP, &i_gh);
-	if (error)
-		return ERR_PTR(error);
-
-	error = gfs2_check_blk_type(sdp, no_addr, blktype);
-	if (error)
-		goto fail;
-
-	inode = gfs2_inode_lookup(sb, DT_UNKNOWN, no_addr, 0);
+	inode = gfs2_inode_lookup(sb, DT_UNKNOWN, no_addr, 0, blktype);
 	if (IS_ERR(inode))
-		goto fail;
+		return inode;
 
 	/* Two extra checks for NFS only */
 	if (no_formal_ino) {
@@ -182,16 +231,12 @@ struct inode *gfs2_lookup_by_inum(struct gfs2_sbd *sdp, u64 no_addr,
 		error = -EIO;
 		if (GFS2_I(inode)->i_diskflags & GFS2_DIF_SYSTEM)
 			goto fail_iput;
-
-		error = 0;
 	}
+	return inode;
 
-fail:
-	gfs2_glock_dq_uninit(&i_gh);
-	return error ? ERR_PTR(error) : inode;
 fail_iput:
 	iput(inode);
-	goto fail;
+	return ERR_PTR(error);
 }
 
 
diff --git a/fs/gfs2/inode.h b/fs/gfs2/inode.h
index e1af0d4..443b46c 100644
--- a/fs/gfs2/inode.h
+++ b/fs/gfs2/inode.h
@@ -94,7 +94,8 @@ err:
 }
 
 extern struct inode *gfs2_inode_lookup(struct super_block *sb, unsigned type, 
-				       u64 no_addr, u64 no_formal_ino);
+				       u64 no_addr, u64 no_formal_ino,
+				       unsigned int blktype);
 extern struct inode *gfs2_lookup_by_inum(struct gfs2_sbd *sdp, u64 no_addr,
 					 u64 *no_formal_ino,
 					 unsigned int blktype);
diff --git a/fs/gfs2/ops_fstype.c b/fs/gfs2/ops_fstype.c
index 4546360..b8f6fc9 100644
--- a/fs/gfs2/ops_fstype.c
+++ b/fs/gfs2/ops_fstype.c
@@ -454,7 +454,8 @@ static int gfs2_lookup_root(struct super_block *sb, struct dentry **dptr,
 	struct dentry *dentry;
 	struct inode *inode;
 
-	inode = gfs2_inode_lookup(sb, DT_DIR, no_addr, 0);
+	inode = gfs2_inode_lookup(sb, DT_DIR, no_addr, 0,
+				  GFS2_BLKST_FREE /* ignore */);
 	if (IS_ERR(inode)) {
 		fs_err(sdp, "can't read in %s inode: %ld\n", name, PTR_ERR(inode));
 		return PTR_ERR(inode);
-- 
2.5.5




* [Cluster-devel] [[GFS2 PATCH] 2/3] gfs2: Get rid of gfs2_ilookup
  2016-06-15 14:46 [Cluster-devel] [[GFS2 PATCH] 0/3] Patches for gfs2_lookup_by_inum deadlock Bob Peterson
  2016-06-15 14:46 ` [Cluster-devel] [[GFS2 PATCH] 1/3] gfs2: Fix gfs2_lookup_by_inum lock inversion Bob Peterson
@ 2016-06-15 14:46 ` Bob Peterson
  2016-06-16 15:51   ` Steven Whitehouse
  2016-06-15 14:46 ` [Cluster-devel] [[GFS2 PATCH] 3/3] gfs2: Large-filesystem fix for 32-bit systems Bob Peterson
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 9+ messages in thread
From: Bob Peterson @ 2016-06-15 14:46 UTC (permalink / raw)
  To: cluster-devel.redhat.com

From: Andreas Gruenbacher <agruenba@redhat.com>

Now that gfs2_lookup_by_inum only takes the inode glock for new inodes
(and not for cached inodes anymore), there no longer is a need to
optimize the cached-inode case in gfs2_get_dentry or delete_work_func,
and gfs2_ilookup can be removed.

In addition, gfs2_get_dentry wasn't checking the GFS2_DIF_SYSTEM flag in
i_diskflags in the gfs2_ilookup case (see gfs2_lookup_by_inum); this
inconsistency goes away as well.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
---
 fs/gfs2/export.c | 11 -----------
 fs/gfs2/glock.c  | 11 ++---------
 fs/gfs2/inode.c  | 15 ---------------
 fs/gfs2/inode.h  |  1 -
 4 files changed, 2 insertions(+), 36 deletions(-)

diff --git a/fs/gfs2/export.c b/fs/gfs2/export.c
index d5bda85..a332f3c 100644
--- a/fs/gfs2/export.c
+++ b/fs/gfs2/export.c
@@ -137,21 +137,10 @@ static struct dentry *gfs2_get_dentry(struct super_block *sb,
 	struct gfs2_sbd *sdp = sb->s_fs_info;
 	struct inode *inode;
 
-	inode = gfs2_ilookup(sb, inum->no_addr);
-	if (inode) {
-		if (GFS2_I(inode)->i_no_formal_ino != inum->no_formal_ino) {
-			iput(inode);
-			return ERR_PTR(-ESTALE);
-		}
-		goto out_inode;
-	}
-
 	inode = gfs2_lookup_by_inum(sdp, inum->no_addr, &inum->no_formal_ino,
 				    GFS2_BLKST_DINODE);
 	if (IS_ERR(inode))
 		return ERR_CAST(inode);
-
-out_inode:
 	return d_obtain_alias(inode);
 }
 
diff --git a/fs/gfs2/glock.c b/fs/gfs2/glock.c
index ce46375..1138a61 100644
--- a/fs/gfs2/glock.c
+++ b/fs/gfs2/glock.c
@@ -575,8 +575,7 @@ static void delete_work_func(struct work_struct *work)
 {
 	struct gfs2_glock *gl = container_of(work, struct gfs2_glock, gl_delete);
 	struct gfs2_sbd *sdp = gl->gl_name.ln_sbd;
-	struct gfs2_inode *ip;
-	struct inode *inode = NULL;
+	struct inode *inode;
 	u64 no_addr = gl->gl_name.ln_number;
 
 	/* If someone's using this glock to create a new dinode, the block must
@@ -585,13 +584,7 @@ static void delete_work_func(struct work_struct *work)
 	if (test_bit(GLF_INODE_CREATING, &gl->gl_flags))
 		goto out;
 
-	ip = gl->gl_object;
-	/* Note: Unsafe to dereference ip as we don't hold right refs/locks */
-
-	if (ip)
-		inode = gfs2_ilookup(sdp->sd_vfs, no_addr);
-	if (IS_ERR_OR_NULL(inode))
-		inode = gfs2_lookup_by_inum(sdp, no_addr, NULL, GFS2_BLKST_UNLINKED);
+	inode = gfs2_lookup_by_inum(sdp, no_addr, NULL, GFS2_BLKST_UNLINKED);
 	if (inode && !IS_ERR(inode)) {
 		d_prune_aliases(inode);
 		iput(inode);
diff --git a/fs/gfs2/inode.c b/fs/gfs2/inode.c
index 6d5c6bb..ebff26e 100644
--- a/fs/gfs2/inode.c
+++ b/fs/gfs2/inode.c
@@ -37,21 +37,6 @@
 #include "super.h"
 #include "glops.h"
 
-struct inode *gfs2_ilookup(struct super_block *sb, u64 no_addr)
-{
-	struct inode *inode;
-
-repeat:
-	inode = ilookup(sb, no_addr);
-	if (!inode)
-		return inode;
-	if (is_bad_inode(inode)) {
-		iput(inode);
-		goto repeat;
-	}
-	return inode;
-}
-
 static struct inode *gfs2_iget(struct super_block *sb, u64 no_addr)
 {
 	struct inode *inode;
diff --git a/fs/gfs2/inode.h b/fs/gfs2/inode.h
index 443b46c..7710dfd 100644
--- a/fs/gfs2/inode.h
+++ b/fs/gfs2/inode.h
@@ -99,7 +99,6 @@ extern struct inode *gfs2_inode_lookup(struct super_block *sb, unsigned type,
 extern struct inode *gfs2_lookup_by_inum(struct gfs2_sbd *sdp, u64 no_addr,
 					 u64 *no_formal_ino,
 					 unsigned int blktype);
-extern struct inode *gfs2_ilookup(struct super_block *sb, u64 no_addr);
 
 extern int gfs2_inode_refresh(struct gfs2_inode *ip);
 
-- 
2.5.5




* [Cluster-devel] [[GFS2 PATCH] 3/3] gfs2: Large-filesystem fix for 32-bit systems
  2016-06-15 14:46 [Cluster-devel] [[GFS2 PATCH] 0/3] Patches for gfs2_lookup_by_inum deadlock Bob Peterson
  2016-06-15 14:46 ` [Cluster-devel] [[GFS2 PATCH] 1/3] gfs2: Fix gfs2_lookup_by_inum lock inversion Bob Peterson
  2016-06-15 14:46 ` [Cluster-devel] [[GFS2 PATCH] 2/3] gfs2: Get rid of gfs2_ilookup Bob Peterson
@ 2016-06-15 14:46 ` Bob Peterson
  2016-06-16 15:48   ` Steven Whitehouse
  2016-06-17  9:40 ` [Cluster-devel] [PATCH] gfs2: Initialize iopen glock holder for new inodes Andreas Gruenbacher
  2016-06-27 15:20 ` [Cluster-devel] [[GFS2 PATCH] 0/3] Patches for gfs2_lookup_by_inum deadlock Bob Peterson
  4 siblings, 1 reply; 9+ messages in thread
From: Bob Peterson @ 2016-06-15 14:46 UTC (permalink / raw)
  To: cluster-devel.redhat.com

From: Andreas Gruenbacher <agruenba@redhat.com>

Commit ff34245d switched from iget5_locked to iget_locked among other
things, but iget_locked doesn't work for filesystems larger than 2^32
blocks on 32-bit systems.  Switch back to iget5_locked.  Filesystems
larger than 2^32 blocks are unlikely to work well on 32-bit systems
anyway, so this is mostly a code cleanliness fix.
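
As a rough illustration of the truncation problem: iget_locked keys the
lookup on an unsigned long, which is 32 bits wide on 32-bit systems,
while GFS2 block addresses are 64 bits wide. The sketch below models that
with uint32_t; ulong32_t, struct toy_inode and the toy_* helpers are
hypothetical names for this example only and are not part of the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Models "unsigned long" on a 32-bit system (assumption for this sketch). */
typedef uint32_t ulong32_t;

/* Hypothetical stand-in for a cached inode. */
struct toy_inode {
	ulong32_t hashval;	/* what a plain inode-number lookup keys on */
	uint64_t no_addr;	/* full 64-bit block address */
};

static ulong32_t toy_hashval(uint64_t no_addr)
{
	return (ulong32_t)no_addr;	/* upper 32 bits are dropped */
}

/* Match on the truncated number only. */
static bool toy_match_truncated(const struct toy_inode *inode, uint64_t no_addr)
{
	return inode->hashval == toy_hashval(no_addr);
}

/* Match on the truncated number plus a test on the full address. */
static bool toy_match_full(const struct toy_inode *inode, uint64_t no_addr)
{
	return inode->hashval == toy_hashval(no_addr) &&
	       inode->no_addr == no_addr;
}
```

Block addresses 0x100000005 and 0x5 collide in toy_match_truncated() but
not in toy_match_full(), which is the role the test callback plays when
switching to iget5_locked.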

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
---
 fs/gfs2/inode.c | 19 +++++++++++++++++--
 1 file changed, 17 insertions(+), 2 deletions(-)

diff --git a/fs/gfs2/inode.c b/fs/gfs2/inode.c
index ebff26e..481b649 100644
--- a/fs/gfs2/inode.c
+++ b/fs/gfs2/inode.c
@@ -37,19 +37,34 @@
 #include "super.h"
 #include "glops.h"
 
+static int iget_test(struct inode *inode, void *opaque)
+{
+	u64 no_addr = *(u64 *)opaque;
+
+	return GFS2_I(inode)->i_no_addr == no_addr;
+}
+
+static int iget_set(struct inode *inode, void *opaque)
+{
+	u64 no_addr = *(u64 *)opaque;
+
+	GFS2_I(inode)->i_no_addr = no_addr;
+	inode->i_ino = no_addr;
+	return 0;
+}
+
 static struct inode *gfs2_iget(struct super_block *sb, u64 no_addr)
 {
 	struct inode *inode;
 
 repeat:
-	inode = iget_locked(sb, no_addr);
+	inode = iget5_locked(sb, no_addr, iget_test, iget_set, &no_addr);
 	if (!inode)
 		return inode;
 	if (is_bad_inode(inode)) {
 		iput(inode);
 		goto repeat;
 	}
-	GFS2_I(inode)->i_no_addr = no_addr;
 	return inode;
 }
 
-- 
2.5.5




* [Cluster-devel] [[GFS2 PATCH] 3/3] gfs2: Large-filesystem fix for 32-bit systems
  2016-06-15 14:46 ` [Cluster-devel] [[GFS2 PATCH] 3/3] gfs2: Large-filesystem fix for 32-bit systems Bob Peterson
@ 2016-06-16 15:48   ` Steven Whitehouse
  0 siblings, 0 replies; 9+ messages in thread
From: Steven Whitehouse @ 2016-06-16 15:48 UTC (permalink / raw)
  To: cluster-devel.redhat.com

Hi,

Acked-by: Steven Whitehouse <swhiteho@redhat.com>

This is an obvious fix, so it definitely needs to go in.

Steve.

On 15/06/16 15:46, Bob Peterson wrote:
> From: Andreas Gruenbacher <agruenba@redhat.com>
>
> Commit ff34245d switched from iget5_locked to iget_locked among other
> things, but iget_locked doesn't work for filesystems larger than 2^32
> blocks on 32-bit systems.  Switch back to iget5_locked.  Filesystems
> larger than 2^32 blocks are unrealistic to work well on 32-bit systems,
> so this is mostly a code cleanliness fix.
>
> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
> Signed-off-by: Bob Peterson <rpeterso@redhat.com>
> ---
>   fs/gfs2/inode.c | 19 +++++++++++++++++--
>   1 file changed, 17 insertions(+), 2 deletions(-)
>
> diff --git a/fs/gfs2/inode.c b/fs/gfs2/inode.c
> index ebff26e..481b649 100644
> --- a/fs/gfs2/inode.c
> +++ b/fs/gfs2/inode.c
> @@ -37,19 +37,34 @@
>   #include "super.h"
>   #include "glops.h"
>   
> +static int iget_test(struct inode *inode, void *opaque)
> +{
> +	u64 no_addr = *(u64 *)opaque;
> +
> +	return GFS2_I(inode)->i_no_addr == no_addr;
> +}
> +
> +static int iget_set(struct inode *inode, void *opaque)
> +{
> +	u64 no_addr = *(u64 *)opaque;
> +
> +	GFS2_I(inode)->i_no_addr = no_addr;
> +	inode->i_ino = no_addr;
> +	return 0;
> +}
> +
>   static struct inode *gfs2_iget(struct super_block *sb, u64 no_addr)
>   {
>   	struct inode *inode;
>   
>   repeat:
> -	inode = iget_locked(sb, no_addr);
> +	inode = iget5_locked(sb, no_addr, iget_test, iget_set, &no_addr);
>   	if (!inode)
>   		return inode;
>   	if (is_bad_inode(inode)) {
>   		iput(inode);
>   		goto repeat;
>   	}
> -	GFS2_I(inode)->i_no_addr = no_addr;
>   	return inode;
>   }
>   




* [Cluster-devel] [[GFS2 PATCH] 2/3] gfs2: Get rid of gfs2_ilookup
  2016-06-15 14:46 ` [Cluster-devel] [[GFS2 PATCH] 2/3] gfs2: Get rid of gfs2_ilookup Bob Peterson
@ 2016-06-16 15:51   ` Steven Whitehouse
  0 siblings, 0 replies; 9+ messages in thread
From: Steven Whitehouse @ 2016-06-16 15:51 UTC (permalink / raw)
  To: cluster-devel.redhat.com

Hi,

For patches 1 & 2 in this series, how much testing have they had? These 
look very much as if they are going in the right direction, but we must 
check with Al Viro to be sure that this is what he had in mind, and also 
what the VFS level fix was that he mentioned.

So looking good, but needs a bit more work before it is ready for an Ack 
I think,

Steve.

On 15/06/16 15:46, Bob Peterson wrote:
> From: Andreas Gruenbacher <agruenba@redhat.com>
>
> Now that gfs2_lookup_by_inum only takes the inode glock for new inodes
> (and not for cached inodes anymore), there no longer is a need to
> optimize the cached-inode case in gfs2_get_dentry or delete_work_func,
> and gfs2_ilookup can be removed.
>
> In addition, gfs2_get_dentry wasn't checking the GFS2_DIF_SYSTEM flag in
> i_diskflags in the gfs2_ilookup case (see gfs2_lookup_by_inum); this
> inconsistency goes away as well.
>
> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
> Signed-off-by: Bob Peterson <rpeterso@redhat.com>
> ---
>   fs/gfs2/export.c | 11 -----------
>   fs/gfs2/glock.c  | 11 ++---------
>   fs/gfs2/inode.c  | 15 ---------------
>   fs/gfs2/inode.h  |  1 -
>   4 files changed, 2 insertions(+), 36 deletions(-)
>
> diff --git a/fs/gfs2/export.c b/fs/gfs2/export.c
> index d5bda85..a332f3c 100644
> --- a/fs/gfs2/export.c
> +++ b/fs/gfs2/export.c
> @@ -137,21 +137,10 @@ static struct dentry *gfs2_get_dentry(struct super_block *sb,
>   	struct gfs2_sbd *sdp = sb->s_fs_info;
>   	struct inode *inode;
>   
> -	inode = gfs2_ilookup(sb, inum->no_addr);
> -	if (inode) {
> -		if (GFS2_I(inode)->i_no_formal_ino != inum->no_formal_ino) {
> -			iput(inode);
> -			return ERR_PTR(-ESTALE);
> -		}
> -		goto out_inode;
> -	}
> -
>   	inode = gfs2_lookup_by_inum(sdp, inum->no_addr, &inum->no_formal_ino,
>   				    GFS2_BLKST_DINODE);
>   	if (IS_ERR(inode))
>   		return ERR_CAST(inode);
> -
> -out_inode:
>   	return d_obtain_alias(inode);
>   }
>   
> diff --git a/fs/gfs2/glock.c b/fs/gfs2/glock.c
> index ce46375..1138a61 100644
> --- a/fs/gfs2/glock.c
> +++ b/fs/gfs2/glock.c
> @@ -575,8 +575,7 @@ static void delete_work_func(struct work_struct *work)
>   {
>   	struct gfs2_glock *gl = container_of(work, struct gfs2_glock, gl_delete);
>   	struct gfs2_sbd *sdp = gl->gl_name.ln_sbd;
> -	struct gfs2_inode *ip;
> -	struct inode *inode = NULL;
> +	struct inode *inode;
>   	u64 no_addr = gl->gl_name.ln_number;
>   
>   	/* If someone's using this glock to create a new dinode, the block must
> @@ -585,13 +584,7 @@ static void delete_work_func(struct work_struct *work)
>   	if (test_bit(GLF_INODE_CREATING, &gl->gl_flags))
>   		goto out;
>   
> -	ip = gl->gl_object;
> -	/* Note: Unsafe to dereference ip as we don't hold right refs/locks */
> -
> -	if (ip)
> -		inode = gfs2_ilookup(sdp->sd_vfs, no_addr);
> -	if (IS_ERR_OR_NULL(inode))
> -		inode = gfs2_lookup_by_inum(sdp, no_addr, NULL, GFS2_BLKST_UNLINKED);
> +	inode = gfs2_lookup_by_inum(sdp, no_addr, NULL, GFS2_BLKST_UNLINKED);
>   	if (inode && !IS_ERR(inode)) {
>   		d_prune_aliases(inode);
>   		iput(inode);
> diff --git a/fs/gfs2/inode.c b/fs/gfs2/inode.c
> index 6d5c6bb..ebff26e 100644
> --- a/fs/gfs2/inode.c
> +++ b/fs/gfs2/inode.c
> @@ -37,21 +37,6 @@
>   #include "super.h"
>   #include "glops.h"
>   
> -struct inode *gfs2_ilookup(struct super_block *sb, u64 no_addr)
> -{
> -	struct inode *inode;
> -
> -repeat:
> -	inode = ilookup(sb, no_addr);
> -	if (!inode)
> -		return inode;
> -	if (is_bad_inode(inode)) {
> -		iput(inode);
> -		goto repeat;
> -	}
> -	return inode;
> -}
> -
>   static struct inode *gfs2_iget(struct super_block *sb, u64 no_addr)
>   {
>   	struct inode *inode;
> diff --git a/fs/gfs2/inode.h b/fs/gfs2/inode.h
> index 443b46c..7710dfd 100644
> --- a/fs/gfs2/inode.h
> +++ b/fs/gfs2/inode.h
> @@ -99,7 +99,6 @@ extern struct inode *gfs2_inode_lookup(struct super_block *sb, unsigned type,
>   extern struct inode *gfs2_lookup_by_inum(struct gfs2_sbd *sdp, u64 no_addr,
>   					 u64 *no_formal_ino,
>   					 unsigned int blktype);
> -extern struct inode *gfs2_ilookup(struct super_block *sb, u64 no_addr);
>   
>   extern int gfs2_inode_refresh(struct gfs2_inode *ip);
>   




* [Cluster-devel] [PATCH] gfs2: Initialize iopen glock holder for new inodes
  2016-06-15 14:46 [Cluster-devel] [[GFS2 PATCH] 0/3] Patches for gfs2_lookup_by_inum deadlock Bob Peterson
                   ` (2 preceding siblings ...)
  2016-06-15 14:46 ` [Cluster-devel] [[GFS2 PATCH] 3/3] gfs2: Large-filesystem fix for 32-bit systems Bob Peterson
@ 2016-06-17  9:40 ` Andreas Gruenbacher
  2016-06-17 13:41   ` Bob Peterson
  2016-06-27 15:20 ` [Cluster-devel] [[GFS2 PATCH] 0/3] Patches for gfs2_lookup_by_inum deadlock Bob Peterson
  4 siblings, 1 reply; 9+ messages in thread
From: Andreas Gruenbacher @ 2016-06-17  9:40 UTC (permalink / raw)
  To: cluster-devel.redhat.com

From: Bob Peterson <rpeterso@redhat.com>

In gfs2_init_inode_once, initialize inode->i_iopen_gh.gh_gl to NULL:
otherwise, when gfs2_inode_lookup fails, the iopen glock holder can
remain unset and iget_failed can end up accessing random memory.

It turned out that patch "gfs2: Fix gfs2_lookup_by_inum lock inversion" made
gfs2_inode_lookup fail in this way more often, and we started to see this kind
of failure.
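
A minimal sketch of the idea, assuming the teardown path tests the holder's
glock pointer before touching it (that assumption, and all toy_* names
below, are made up for this example and are not taken from the GFS2
source):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_glock {
	int users;
};

struct toy_holder {
	struct toy_glock *gh_gl;	/* NULL means "never acquired" */
};

struct toy_gfs2_inode {
	struct toy_holder i_iopen_gh;
};

/* One-time object setup: start with a well-defined, empty holder. */
static void toy_init_once(struct toy_gfs2_inode *ip)
{
	ip->i_iopen_gh.gh_gl = NULL;
}

/* Teardown releases the holder only if it was acquired.
 * Returns true if a glock was actually touched. */
static bool toy_teardown(struct toy_gfs2_inode *ip)
{
	struct toy_glock *gl = ip->i_iopen_gh.gh_gl;

	if (!gl)
		return false;
	gl->users--;
	ip->i_iopen_gh.gh_gl = NULL;
	return true;
}
```

Without the initialization in toy_init_once(), the NULL test in
toy_teardown() would read whatever happened to be in memory, which
corresponds to the random memory access described above.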

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
---
 fs/gfs2/main.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/fs/gfs2/main.c b/fs/gfs2/main.c
index f99f8e9..615f675 100644
--- a/fs/gfs2/main.c
+++ b/fs/gfs2/main.c
@@ -45,6 +45,7 @@ static void gfs2_init_inode_once(void *foo)
 	memset(&ip->i_res, 0, sizeof(ip->i_res));
 	RB_CLEAR_NODE(&ip->i_res.rs_node);
 	ip->i_hash_cache = NULL;
+	ip->i_iopen_gh.gh_gl = NULL;
 }
 
 static void gfs2_init_glock_once(void *foo)
-- 
2.5.5




* [Cluster-devel] [PATCH] gfs2: Initialize iopen glock holder for new inodes
  2016-06-17  9:40 ` [Cluster-devel] [PATCH] gfs2: Initialize iopen glock holder for new inodes Andreas Gruenbacher
@ 2016-06-17 13:41   ` Bob Peterson
  0 siblings, 0 replies; 9+ messages in thread
From: Bob Peterson @ 2016-06-17 13:41 UTC (permalink / raw)
  To: cluster-devel.redhat.com

----- Original Message -----
| From: Bob Peterson <rpeterso@redhat.com>
| 
| In gfs2_init_inode_once, initialize inode->i_iopen_gh.gh_gl to NULL:
| otherwise, when gfs2_inode_lookup fails, the iopen glock holder can
| remain unset and iget_failed can end up accessing random memory.
| 
| It turned out that patch "gfs2: Fix gfs2_lookup_by_inum lock inversion" made
| gfs2_inode_lookup fail in this way more often, and we started to see this kind
| of failure.
| 
| Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
| ---
|  fs/gfs2/main.c | 1 +
|  1 file changed, 1 insertion(+)
| 
| diff --git a/fs/gfs2/main.c b/fs/gfs2/main.c
| index f99f8e9..615f675 100644
| --- a/fs/gfs2/main.c
| +++ b/fs/gfs2/main.c
| @@ -45,6 +45,7 @@ static void gfs2_init_inode_once(void *foo)
|  	memset(&ip->i_res, 0, sizeof(ip->i_res));
|  	RB_CLEAR_NODE(&ip->i_res.rs_node);
|  	ip->i_hash_cache = NULL;
| +	ip->i_iopen_gh.gh_gl = NULL;
|  }
|  
|  static void gfs2_init_glock_once(void *foo)
| --
| 2.5.5
| 
| 
Hi,

Thanks. This is now applied to the for-next branch of the linux-gfs2 tree:
https://git.kernel.org/cgit/linux/kernel/git/gfs2/linux-gfs2.git/commit/fs/gfs2?h=for-next&id=1e875f5a95a28b5286165db9fa832b0773657ddb

Regards,

Bob Peterson
Red Hat File Systems




* [Cluster-devel] [[GFS2 PATCH] 0/3] Patches for gfs2_lookup_by_inum deadlock
  2016-06-15 14:46 [Cluster-devel] [[GFS2 PATCH] 0/3] Patches for gfs2_lookup_by_inum deadlock Bob Peterson
                   ` (3 preceding siblings ...)
  2016-06-17  9:40 ` [Cluster-devel] [PATCH] gfs2: Initialize iopen glock holder for new inodes Andreas Gruenbacher
@ 2016-06-27 15:20 ` Bob Peterson
  4 siblings, 0 replies; 9+ messages in thread
From: Bob Peterson @ 2016-06-27 15:20 UTC (permalink / raw)
  To: cluster-devel.redhat.com

----- Original Message -----
| Hi,
| 
| This is a set of three patches from Andreas Gruenbacher that fix the
| gfs2_lookup_by_inum deadlock problem. I've been working with Andreas
| for a while now, and we've both made several attempts to fix this
| problem in the past, in regard to the transition of dinodes from the
| "unlinked" to the "free" state. This is the latest attempt, and it
| seems to be working well.
| 
| Our previous attempt made a change to vfs, but Al Viro didn't like
| that, so it was scrapped in favor of this one, which is simpler and
| confined to GFS2. It's similar in concept to the patch set I posted
| on 18 December 2015.
| 
| It also fixes a problem for 32-bit architecture that was introduced
| by a recent patch related to the same problem.
| 
| Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
| Signed-off-by: Bob Peterson <rpeterso@redhat.com>
| ---
| Andreas Gruenbacher (3):
|   gfs2: Fix gfs2_lookup_by_inum lock inversion
|   gfs2: Get rid of gfs2_ilookup
|   gfs2: Large-filesystem fix for 32-bit systems
| 
|  fs/gfs2/dir.c        |   3 +-
|  fs/gfs2/export.c     |  11 ------
|  fs/gfs2/glock.c      |   9 +----
|  fs/gfs2/inode.c      | 103 ++++++++++++++++++++++++++++++++++++---------------
|  fs/gfs2/inode.h      |   4 +-
|  fs/gfs2/ops_fstype.c |   3 +-
|  6 files changed, 81 insertions(+), 52 deletions(-)
| 
| --
| 2.5.5
| 
| 
Hi,

Thanks. These are now applied to the for-next branch of the linux-gfs2 tree:
https://git.kernel.org/cgit/linux/kernel/git/gfs2/linux-gfs2.git/commit/fs?h=for-next&id=3ce37b2cb4917674fa5b776e857dcea94c0e0835
https://git.kernel.org/cgit/linux/kernel/git/gfs2/linux-gfs2.git/commit/fs?h=for-next&id=ec5ec66ba48bd3163110599359797858ac38e79b
https://git.kernel.org/cgit/linux/kernel/git/gfs2/linux-gfs2.git/commit/fs?h=for-next&id=cda9dd4207aeb29d0aa2298085cc2d1ebcb87e04

Regards,

Bob Peterson
Red Hat File Systems


