* [PATCH 0/4] xfs: resurrect realtime subvolume support on kernel 2.6.37
@ 2012-02-17 22:46 kdasu
2012-02-17 22:51 ` [PATCH 1/4] xfs: only lock the rt bitmap inode once per allocation kdasu
0 siblings, 1 reply; 11+ messages in thread
From: kdasu @ 2012-02-17 22:46 UTC (permalink / raw)
To: xfs
Fixed bimap inode deadlock existing on 2.6.37.
[PATCH 4/4] xfs: fix deadlock in xfs_rtfree_extent with kernel v2.6.37
Also added the realtime subvolume patches from the 2.6.39 release.
[PATCH 1/4] xfs: only lock the rt bitmap inode once per allocation
[PATCH 2/4] xfs: fix xfs_get_extsz_hint for a zero extent size hint
[PATCH 3/4] xfs: add lockdep annotations for the rt inodes
These fixes were needed to resurrect realtime subvolume support on
2.6.37 kernel.
I suspect the deadlock problem while freeing multiple extents
exits on 2.6.39 in the xfs_rtfree_entent() call.
Kamal
--
View this message in context: http://old.nabble.com/-PATCH-0-4--xfs%3A-resurrect-realtime-subvolume-support-on-kernel-2.6.37-tp33345988p33345988.html
Sent from the Xfs - General mailing list archive at Nabble.com.
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
^ permalink raw reply [flat|nested] 11+ messages in thread
* [PATCH 1/4] xfs: only lock the rt bitmap inode once per allocation
2012-02-17 22:46 [PATCH 0/4] xfs: resurrect realtime subvolume support on kernel 2.6.37 kdasu
@ 2012-02-17 22:51 ` kdasu
2012-02-17 22:55 ` [PATCH 2/4] xfs: fix xfs_get_extsz_hint for a zero extent size hint kdasu
0 siblings, 1 reply; 11+ messages in thread
From: kdasu @ 2012-02-17 22:51 UTC (permalink / raw)
To: xfs
Currently both xfs_rtpick_extent and xfs_rtallocate_extent call
xfs_trans_iget to grab and lock the rt bitmap inode, which results in a
deadlock since the removal of the lock recursion counters in commit
"xfs: simplify inode to transaction joining"
Fix this by acquiring and locking the inode in xfs_bmap_rtalloc before
calling into xfs_rtpick_extent and xfs_rtallocate_extent.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Alex Elder <aelder@sgi.com>
Signed-off-by: Kamal Dasu <kdasu.kdev@gmail.com>
---
fs/xfs/xfs_bmap.c | 11 +++++++++++
fs/xfs/xfs_rtalloc.c | 34 +++++++++++++---------------------
2 files changed, 24 insertions(+), 21 deletions(-)
diff --git a/fs/xfs/xfs_bmap.c b/fs/xfs/xfs_bmap.c
index 4111cd3..9d9970b 100644
--- a/fs/xfs/xfs_bmap.c
+++ b/fs/xfs/xfs_bmap.c
@@ -2316,6 +2316,7 @@ xfs_bmap_rtalloc(
xfs_extlen_t prod = 0; /* product factor for allocators */
xfs_extlen_t ralen = 0; /* realtime allocation length */
xfs_extlen_t align; /* minimum allocation alignment */
+ xfs_inode_t *ip; /* bitmap incore inode */
xfs_rtblock_t rtb;
kdasu@kdasu-VirtualBox:~/linux-2.6$ more
0001-xfs-only-lock-the-rt-bitmap-inode-once-per-allocatio.patch
>From f59ab29cc191a5955c4d44c0b92a537f981184c2 Mon Sep 17 00:00:00 2001
From: Christoph Hellwig <hch@infradead.org>
Date: Tue, 25 Jan 2011 09:06:19 +0000
Subject: [PATCH 1/4] xfs: only lock the rt bitmap inode once per allocation
Currently both xfs_rtpick_extent and xfs_rtallocate_extent call
xfs_trans_iget to grab and lock the rt bitmap inode, which results in a
deadlock since the removal of the lock recursion counters in commit
"xfs: simplify inode to transaction joining"
Fix this by acquiring and locking the inode in xfs_bmap_rtalloc before
calling into xfs_rtpick_extent and xfs_rtallocate_extent.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Alex Elder <aelder@sgi.com>
Signed-off-by: Kamal Dasu <kdasu.kdev@gmail.com>
---
fs/xfs/xfs_bmap.c | 11 +++++++++++
fs/xfs/xfs_rtalloc.c | 34 +++++++++++++---------------------
2 files changed, 24 insertions(+), 21 deletions(-)
diff --git a/fs/xfs/xfs_bmap.c b/fs/xfs/xfs_bmap.c
index 4111cd3..9d9970b 100644
--- a/fs/xfs/xfs_bmap.c
+++ b/fs/xfs/xfs_bmap.c
@@ -2316,6 +2316,7 @@ xfs_bmap_rtalloc(
xfs_extlen_t prod = 0; /* product factor for allocators */
xfs_extlen_t ralen = 0; /* realtime allocation length */
xfs_extlen_t align; /* minimum allocation alignment */
+ xfs_inode_t *ip; /* bitmap incore inode */
xfs_rtblock_t rtb;
mp = ap->ip->i_mount;
@@ -2348,6 +2349,16 @@ xfs_bmap_rtalloc(
*/
if (ralen * mp->m_sb.sb_rextsize >= MAXEXTLEN)
ralen = MAXEXTLEN / mp->m_sb.sb_rextsize;
+
+ /*
+ * Lock out other modifications to the RT bitmap inode.
+ */
+ error = xfs_trans_iget(mp, ap->tp, mp->m_sb.sb_rbmino, 0,
+ XFS_ILOCK_EXCL, &ip);
+ if (error)
+ return error;
+ ASSERT(ip == mp->m_rbmip);
+
/*
* If it's an allocation to an empty file at offset 0,
* pick an extent that will space things out in the rt area.
diff --git a/fs/xfs/xfs_rtalloc.c b/fs/xfs/xfs_rtalloc.c
index 12a1913..037fab1 100644
--- a/fs/xfs/xfs_rtalloc.c
+++ b/fs/xfs/xfs_rtalloc.c
@@ -2075,15 +2075,15 @@ xfs_rtallocate_extent(
xfs_extlen_t prod, /* extent product factor */
xfs_rtblock_t *rtblock) /* out: start block allocated */
{
+ xfs_mount_t *mp = tp->t_mountp;
int error; /* error value */
- xfs_inode_t *ip; /* inode for bitmap file */
- xfs_mount_t *mp; /* file system mount structure */
xfs_rtblock_t r; /* result allocated block */
xfs_fsblock_t sb; /* summary file block number */
xfs_buf_t *sumbp; /* summary file block buffer */
+ ASSERT(xfs_isilocked(mp->m_rbmip, XFS_ILOCK_EXCL));
ASSERT(minlen > 0 && minlen <= maxlen);
- mp = tp->t_mountp;
+
/*
* If prod is set then figure out what to do to minlen and maxlen.
*/
@@ -2099,12 +2099,7 @@ xfs_rtallocate_extent(
return 0;
}
}
- /*
- * Lock out other callers by grabbing the bitmap inode lock.
- */
- if ((error = xfs_trans_iget(mp, tp, mp->m_sb.sb_rbmino, 0,
- XFS_ILOCK_EXCL, &ip)))
- return error;
+
sumbp = NULL;
/*
* Allocate by size, or near another block, or exactly at some
block.
@@ -2123,11 +2118,12 @@ xfs_rtallocate_extent(
len, &sumbp, &sb, prod, &r);
break;
default:
+ error = EIO;
ASSERT(0);
}
- if (error) {
+ if (error)
return error;
- }
+
/*
* If it worked, update the superblock.
*/
@@ -2306,20 +2302,16 @@ xfs_rtpick_extent(
xfs_rtblock_t *pick) /* result rt extent */
{
xfs_rtblock_t b; /* result block */
- int error; /* error return value */
- xfs_inode_t *ip; /* bitmap incore inode */
int log2; /* log of sequence number */
__uint64_t resid; /* residual after log removed */
__uint64_t seq; /* sequence number of file creation
*/
__uint64_t *seqp; /* pointer to seqno in inode */
- if ((error = xfs_trans_iget(mp, tp, mp->m_sb.sb_rbmino, 0,
- XFS_ILOCK_EXCL, &ip)))
- return error;
+
sumbp = NULL;
/*
* Allocate by size, or near another block, or exactly at some
block.
@@ -2123,11 +2118,12 @@ xfs_rtallocate_extent(
len, &sumbp, &sb, prod, &r);
break;
default:
+ error = EIO;
ASSERT(0);
}
- if (error) {
+ if (error)
return error;
- }
+
/*
* If it worked, update the superblock.
*/
@@ -2306,20 +2302,16 @@ xfs_rtpick_extent(
xfs_rtblock_t *pick) /* result rt extent */
{
xfs_rtblock_t b; /* result block */
- int error; /* error return value */
- xfs_inode_t *ip; /* bitmap incore inode */
int log2; /* log of sequence number */
__uint64_t resid; /* residual after log removed */
__uint64_t seq; /* sequence number of file creation
*/
__uint64_t *seqp; /* pointer to seqno in inode */
- if ((error = xfs_trans_iget(mp, tp, mp->m_sb.sb_rbmino, 0,
- XFS_ILOCK_EXCL, &ip)))
- return error;
- ASSERT(ip == mp->m_rbmip);
- seqp = (__uint64_t *)&ip->i_d.di_atime;
- if (!(ip->i_d.di_flags & XFS_DIFLAG_NEWRTBM)) {
- ip->i_d.di_flags |= XFS_DIFLAG_NEWRTBM;
+ ASSERT(xfs_isilocked(mp->m_rbmip, XFS_ILOCK_EXCL));
+
+ seqp = (__uint64_t *)&mp->m_rbmip->i_d.di_atime;
+ if (!(mp->m_rbmip->i_d.di_flags & XFS_DIFLAG_NEWRTBM)) {
+ mp->m_rbmip->i_d.di_flags |= XFS_DIFLAG_NEWRTBM;
*seqp = 0;
}
seq = *seqp;
@@ -2335,7 +2327,7 @@ xfs_rtpick_extent(
b = mp->m_sb.sb_rextents - len;
}
*seqp = seq + 1;
- xfs_trans_log_inode(tp, ip, XFS_ILOG_CORE);
+ xfs_trans_log_inode(tp, mp->m_rbmip, XFS_ILOG_CORE);
*pick = b;
return 0;
}
--
1.7.5.4
--
View this message in context: http://old.nabble.com/-PATCH-0-4--xfs%3A-resurrect-realtime-subvolume-support-on-kernel-2.6.37-tp33345988p33346009.html
Sent from the Xfs - General mailing list archive at Nabble.com.
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
^ permalink raw reply related [flat|nested] 11+ messages in thread
* [PATCH 2/4] xfs: fix xfs_get_extsz_hint for a zero extent size hint
2012-02-17 22:51 ` [PATCH 1/4] xfs: only lock the rt bitmap inode once per allocation kdasu
@ 2012-02-17 22:55 ` kdasu
2012-02-17 22:58 ` [PATCH 3/4] xfs: add lockdep annotations for the rt inodes kdasu
0 siblings, 1 reply; 11+ messages in thread
From: kdasu @ 2012-02-17 22:55 UTC (permalink / raw)
To: xfs
We can easily set the extsize flag without setting an extent size
hint, or one that evaluates to zero. Historically the di_extsize
field was only used when it was non-zero, but the commit
"Cleanup inode extent size hint extraction"
broke this. Restore the old behaviour, thus fixing xfsqa 090 with
a debug kernel.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Alex Elder <aelder@sgi.com>
Signed-off-by: Kamal Dasu <kdasu.kdev@gmail.com>
---
fs/xfs/xfs_rw.c | 18 +++++-------------
1 files changed, 5 insertions(+), 13 deletions(-)
diff --git a/fs/xfs/xfs_rw.c b/fs/xfs/xfs_rw.c
index 56861d5..ccd3adf 100644
--- a/fs/xfs/xfs_rw.c
+++ b/fs/xfs/xfs_rw.c
@@ -173,17 +173,9 @@ xfs_extlen_t
xfs_get_extsz_hint(
struct xfs_inode *ip)
{
- xfs_extlen_t extsz;
-
- if (unlikely(XFS_IS_REALTIME_INODE(ip))) {
- extsz = (ip->i_d.di_flags & XFS_DIFLAG_EXTSIZE)
- ? ip->i_d.di_extsize
- : ip->i_mount->m_sb.sb_rextsize;
- ASSERT(extsz);
- } else {
- extsz = (ip->i_d.di_flags & XFS_DIFLAG_EXTSIZE)
- ? ip->i_d.di_extsize : 0;
- }
-
- return extsz;
+ if ((ip->i_d.di_flags & XFS_DIFLAG_EXTSIZE) && ip->i_d.di_extsize)
+ return ip->i_d.di_extsize;
+ if (XFS_IS_REALTIME_INODE(ip))
+ return ip->i_mount->m_sb.sb_rextsize;
+ return 0;
}
--
1.7.5.4
--
View this message in context: http://old.nabble.com/-PATCH-0-4--xfs%3A-resurrect-realtime-subvolume-support-on-kernel-2.6.37-tp33345988p33346035.html
Sent from the Xfs - General mailing list archive at Nabble.com.
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
^ permalink raw reply related [flat|nested] 11+ messages in thread
* [PATCH 3/4] xfs: add lockdep annotations for the rt inodes
2012-02-17 22:55 ` [PATCH 2/4] xfs: fix xfs_get_extsz_hint for a zero extent size hint kdasu
@ 2012-02-17 22:58 ` kdasu
2012-02-17 23:00 ` [PATCH 4/4] xfs: fix deadlock in xfs_rtfree_extent with kernel v2.6.37 kdasu
0 siblings, 1 reply; 11+ messages in thread
From: kdasu @ 2012-02-17 22:58 UTC (permalink / raw)
To: xfs
The rt bitmap and summary inodes do not participate in the normal inode
locking protocol. Instead the rt bitmap inode can be locked in any
transaction involving rt allocations, and the both of the rt inodes can
be locked at the same time. Add specific lockdep subclasses for the rt
inodes to prevent lockdep from blowing up.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Alex Elder <aelder@sgi.com>
Signed-off-by: Kamal Dasu <kdasu.kdev@gmail.com>
---
fs/xfs/xfs_bmap.c | 2 +-
fs/xfs/xfs_inode.h | 23 +++++++++++++++--------
fs/xfs/xfs_rtalloc.c | 16 ++++++++++------
3 files changed, 26 insertions(+), 15 deletions(-)
diff --git a/fs/xfs/xfs_bmap.c b/fs/xfs/xfs_bmap.c
index 9d9970b..36c317c 100644
--- a/fs/xfs/xfs_bmap.c
+++ b/fs/xfs/xfs_bmap.c
@@ -2354,7 +2354,7 @@ xfs_bmap_rtalloc(
* Lock out other modifications to the RT bitmap inode.
*/
error = xfs_trans_iget(mp, ap->tp, mp->m_sb.sb_rbmino, 0,
- XFS_ILOCK_EXCL, &ip);
+ XFS_ILOCK_EXCL | XFS_ILOCK_RTBITMAP, &ip);
if (error)
return error;
ASSERT(ip == mp->m_rbmip);
diff --git a/fs/xfs/xfs_inode.h b/fs/xfs/xfs_inode.h
index fb2ca2e..a9e82d4 100644
--- a/fs/xfs/xfs_inode.h
+++ b/fs/xfs/xfs_inode.h
@@ -408,28 +408,35 @@ static inline void xfs_ifunlock(xfs_inode_t *ip)
/*
* Flags for lockdep annotations.
*
- * XFS_I[O]LOCK_PARENT - for operations that require locking two inodes
- * (ie directory operations that require locking a directory inode and
- * an entry inode). The first inode gets locked with this flag so it
- * gets a lockdep subclass of 1 and the second lock will have a lockdep
- * subclass of 0.
+ * XFS_LOCK_PARENT - for directory operations that require locking a
+ * parent directory inode and a child entry inode. The parent gets locked
+ * with this flag so it gets a lockdep subclass of 1 and the child entry
+ * lock will have a lockdep subclass of 0.
+ *
+ * XFS_LOCK_RTBITMAP/XFS_LOCK_RTSUM - the realtime device bitmap and
summary
+ * inodes do not participate in the normal lock order, and thus have their
+ * own subclasses.
*
* XFS_LOCK_INUMORDER - for locking several inodes at the some time
* with xfs_lock_inodes(). This flag is used as the starting subclass
* and each subsequent lock acquired will increment the subclass by one.
- * So the first lock acquired will have a lockdep subclass of 2, the
- * second lock will have a lockdep subclass of 3, and so on. It is
+ * So the first lock acquired will have a lockdep subclass of 4, the
+ * second lock will have a lockdep subclass of 5, and so on. It is
* the responsibility of the class builder to shift this to the correct
* portion of the lock_mode lockdep mask.
*/
#define XFS_LOCK_PARENT 1
-#define XFS_LOCK_INUMORDER 2
+#define XFS_LOCK_RTBITMAP 2
+#define XFS_LOCK_RTSUM 3
+#define XFS_LOCK_INUMORDER 4
#define XFS_IOLOCK_SHIFT 16
#define XFS_IOLOCK_PARENT (XFS_LOCK_PARENT <<
XFS_IOLOCK_SHIFT)
#define XFS_ILOCK_SHIFT 24
#define XFS_ILOCK_PARENT (XFS_LOCK_PARENT << XFS_ILOCK_SHIFT)
+#define XFS_ILOCK_RTBITMAP (XFS_LOCK_RTBITMAP <<
XFS_ILOCK_SHIFT)
+#define XFS_ILOCK_RTSUM (XFS_LOCK_RTSUM << XFS_ILOCK_SHIFT)
#define XFS_IOLOCK_DEP_MASK 0x00ff0000
#define XFS_ILOCK_DEP_MASK 0xff000000
diff --git a/fs/xfs/xfs_rtalloc.c b/fs/xfs/xfs_rtalloc.c
index 037fab1..f592ac9 100644
--- a/fs/xfs/xfs_rtalloc.c
+++ b/fs/xfs/xfs_rtalloc.c
@@ -1972,8 +1972,10 @@ xfs_growfs_rt(
/*
* Lock out other callers by grabbing the bitmap inode lock.
*/
- if ((error = xfs_trans_iget(mp, tp, mp->m_sb.sb_rbmino, 0,
- XFS_ILOCK_EXCL, &ip)))
+ error = xfs_trans_iget(mp, tp, mp->m_sb.sb_rbmino, 0,
+ XFS_ILOCK_EXCL | XFS_ILOCK_RTBITMAP,
+ &ip);
+ if (error)
goto error_cancel;
ASSERT(ip == mp->m_rbmip);
/*
@@ -1986,8 +1988,9 @@ xfs_growfs_rt(
/*
* Get the summary inode into the transaction.
*/
- if ((error = xfs_trans_iget(mp, tp, mp->m_sb.sb_rsumino, 0,
- XFS_ILOCK_EXCL, &ip)))
+ error = xfs_trans_iget(mp, tp, mp->m_sb.sb_rsumino, 0,
+ XFS_ILOCK_EXCL | XFS_ILOCK_RTSUM,
&ip);
+ if (error)
goto error_cancel;
ASSERT(ip == mp->m_rsumip);
/*
@@ -2160,8 +2163,9 @@ xfs_rtfree_extent(
/*
* Synchronize by locking the bitmap inode.
*/
- if ((error = xfs_trans_iget(mp, tp, mp->m_sb.sb_rbmino, 0,
- XFS_ILOCK_EXCL, &ip)))
+ error = xfs_trans_iget(mp, tp, mp->m_sb.sb_rbmino, 0,
+ XFS_ILOCK_EXCL | XFS_ILOCK_RTBITMAP, &ip);
+ if (error)
return error;
#if defined(__KERNEL__) && defined(DEBUG)
/*
--
1.7.5.4
--
View this message in context: http://old.nabble.com/-PATCH-0-4--xfs%3A-resurrect-realtime-subvolume-support-on-kernel-2.6.37-tp33345988p33346043.html
Sent from the Xfs - General mailing list archive at Nabble.com.
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
^ permalink raw reply related [flat|nested] 11+ messages in thread
* [PATCH 4/4] xfs: fix deadlock in xfs_rtfree_extent with kernel v2.6.37
2012-02-17 22:58 ` [PATCH 3/4] xfs: add lockdep annotations for the rt inodes kdasu
@ 2012-02-17 23:00 ` kdasu
2012-02-19 22:41 ` Christoph Hellwig
0 siblings, 1 reply; 11+ messages in thread
From: kdasu @ 2012-02-17 23:00 UTC (permalink / raw)
To: xfs
On the 2.6.37 kernel, xfs_fs_evict_inode() leads to a deadlock when
freeing multiple realtime extents. On further debugging the root
cause it was determined to be recursive locking of the RT bitmap
inode during evict operation within the same task context.
The same vfs evict sequence is replayed by the xfs log recovery on
mounts on a reboot after the problem happens first time.
This problem exists on kernel v2.6.39 as well.
Call stack:
xfs_ilock <- simple task deadlock in the xfs_ilock(ip, XFS_ILOCK_EXCL)
re-acquired on second iteration when the inode is cached
xfs_iget_cache_hit
xfs_iget
xfs_trans_iget
xfs_rtfree_extent <- Call to xfs_trans_iget()
xfs_bmap_del_extent
xfs_bunmapi <- while loop based on number of extents to free
xfs_itruncate_finish
xfs_inactive
evict
The deadlock fix has two parts :
1) check if the inode is already locked in xfs_iget.c in the
xfs_iget_cache_hit() function. Do not acquire the inode lock again
if ip is already locked with the XFS_ILOCK_EXCL subclass.
We use the active transaction structure to detect if the inode is
already lokced.
2) In addition in xfs_trans_inode.c:xfs_trans_iget() prevent joining
already active transaction.
The above changes are also needed along with the backport of following
2.6.39 kernel patches to 2.6.37 kernel:
xfs: only lock the rt bitmap inode once per allocation
xfs: fix xfs_get_extsz_hint for a zero extent size hint
xfs: add lockdep annotations for the rt inodes
Signed-off-by: Kamal Dasu <kdasu.kdev@gmail.com>
---
fs/xfs/xfs_iget.c | 12 +++++++++++-
fs/xfs/xfs_trans_inode.c | 2 +-
2 files changed, 12 insertions(+), 2 deletions(-)
diff --git a/fs/xfs/xfs_iget.c b/fs/xfs/xfs_iget.c
index 0cdd269..f05bdc2 100644
--- a/fs/xfs/xfs_iget.c
+++ b/fs/xfs/xfs_iget.c
@@ -143,6 +143,7 @@ xfs_inode_free(
static int
xfs_iget_cache_hit(
struct xfs_perag *pag,
+ xfs_trans_t *tp,
struct xfs_inode *ip,
int flags,
int lock_flags) __releases(pag->pag_ici_lock)
@@ -234,6 +235,15 @@ xfs_iget_cache_hit(
trace_xfs_iget_hit(ip);
}
+ /* check inode already locked */
+ spin_lock(&ip->i_flags_lock);
+ if (tp && ip->i_transp == tp) {
+ if ((ip->i_itemp->ili_lock_flags & lock_flags) &
+ (XFS_ILOCK_EXCL))
+ lock_flags = 0;
+ }
+ spin_unlock(&ip->i_flags_lock);
+
if (lock_flags != 0)
xfs_ilock(ip, lock_flags);
@@ -379,7 +389,7 @@ again:
ip = radix_tree_lookup(&pag->pag_ici_root, agino);
if (ip) {
- error = xfs_iget_cache_hit(pag, ip, flags, lock_flags);
+ error = xfs_iget_cache_hit(pag, tp, ip, flags, lock_flags);
if (error)
goto out_error_or_again;
} else {
diff --git a/fs/xfs/xfs_trans_inode.c b/fs/xfs/xfs_trans_inode.c
index ccb3453..6f8db93 100644
--- a/fs/xfs/xfs_trans_inode.c
+++ b/fs/xfs/xfs_trans_inode.c
@@ -58,7 +58,7 @@ xfs_trans_iget(
int error;
error = xfs_iget(mp, tp, ino, flags, lock_flags, ipp);
- if (!error && tp) {
+ if (!error && tp && !((*ipp)->i_transp)) {
xfs_trans_ijoin(tp, *ipp);
(*ipp)->i_itemp->ili_lock_flags = lock_flags;
}
--
1.7.5.4
--
View this message in context: http://old.nabble.com/-PATCH-0-4--xfs%3A-resurrect-realtime-subvolume-support-on-kernel-2.6.37-tp33345988p33346051.html
Sent from the Xfs - General mailing list archive at Nabble.com.
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
^ permalink raw reply related [flat|nested] 11+ messages in thread
* Re: [PATCH 4/4] xfs: fix deadlock in xfs_rtfree_extent with kernel v2.6.37
2012-02-17 23:00 ` [PATCH 4/4] xfs: fix deadlock in xfs_rtfree_extent with kernel v2.6.37 kdasu
@ 2012-02-19 22:41 ` Christoph Hellwig
2012-02-21 17:22 ` Kamal Dasu
0 siblings, 1 reply; 11+ messages in thread
From: Christoph Hellwig @ 2012-02-19 22:41 UTC (permalink / raw)
To: kdasu; +Cc: xfs
On Fri, Feb 17, 2012 at 03:00:09PM -0800, kdasu wrote:
>
>
> On the 2.6.37 kernel, xfs_fs_evict_inode() leads to a deadlock when
> freeing multiple realtime extents. On further debugging the root
> cause it was determined to be recursive locking of the RT bitmap
> inode during evict operation within the same task context.
> The same vfs evict sequence is replayed by the xfs log recovery on
> mounts on a reboot after the problem happens first time.
> This problem exists on kernel v2.6.39 as well.
I think you're better off fixing this problem like I did for the
allocation side, that is:
- remove the xfs_ilock and xfs_trans_ijoin (or probably still
xfs_trans_iget in your version) from xfs_rtfree_extent, and
instead add asserts that the inode is locked and has an inode_item
attach to it.
- in xfs_bunmapi if we are dealing with an inode with the rt flag
bump the reference count on the inode there and attach it to the
transaction before calling into xfs_bmap_del_extent, similar to
what we do in xfs_bmap_rtalloc.
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 4/4] xfs: fix deadlock in xfs_rtfree_extent with kernel v2.6.37
2012-02-19 22:41 ` Christoph Hellwig
@ 2012-02-21 17:22 ` Kamal Dasu
2012-02-23 16:52 ` [PATCH 4/4] V2 " Kamal Dasu
0 siblings, 1 reply; 11+ messages in thread
From: Kamal Dasu @ 2012-02-21 17:22 UTC (permalink / raw)
To: xfs
Christoph Hellwig wrote:
>
> I think you're better off fixing this problem like I did for the
> allocation side, that is:
>
> - remove the xfs_ilock and xfs_trans_ijoin (or probably still
> xfs_trans_iget in your version) from xfs_rtfree_extent, and
> instead add asserts that the inode is locked and has an inode_item
> attach to it.
> - in xfs_bunmapi if we are dealing with an inode with the rt flag
> bump the reference count on the inode there and attach it to the
> transaction before calling into xfs_bmap_del_extent, similar to
> what we do in xfs_bmap_rtalloc.
>
I will make the change and test and send the new version of the patch.
BTW when you say reference counting the inode do you mean I should call
xfs_trans_ijoin_ref().
--
View this message in context: http://old.nabble.com/-PATCH-0-4--RFC-xfs%3A-resurrect-realtime-subvolume-support-on-kernel-2.6.37-tp33345988p33365485.html
Sent from the Xfs - General mailing list archive at Nabble.com.
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
^ permalink raw reply [flat|nested] 11+ messages in thread
* [PATCH 4/4] V2 xfs: fix deadlock in xfs_rtfree_extent with kernel v2.6.37
2012-02-21 17:22 ` Kamal Dasu
@ 2012-02-23 16:52 ` Kamal Dasu
2012-02-25 9:40 ` Christoph Hellwig
0 siblings, 1 reply; 11+ messages in thread
From: Kamal Dasu @ 2012-02-23 16:52 UTC (permalink / raw)
To: xfs
To fix the deadlock caused by recursively calling xfs_rtfree_extent
from xfs_bunmapi():
- removed xfs_trans_iget() from xfs_rtfree_extent(),
instead added asserts that the inode is locked and has an inode_item
attached to it.
- in xfs_bunmapi() when dealing with an inode with the rt flag
call xfs_ilock() and xfs_trans_ijoin() so that the
reference count is bumped on the inode and attached it to the
transaction before calling into xfs_bmap_del_extent, similar to
what we do in xfs_bmap_rtalloc.
Signed-off-by: Kamal Dasu <kdasu.kdev@gmail.com>
---
fs/xfs/xfs_bmap.c | 10 ++++++++++
fs/xfs/xfs_rtalloc.c | 13 ++++++-------
2 files changed, 16 insertions(+), 7 deletions(-)
diff --git a/fs/xfs/xfs_bmap.c b/fs/xfs/xfs_bmap.c
index 36c317c..103a253 100644
--- a/fs/xfs/xfs_bmap.c
+++ b/fs/xfs/xfs_bmap.c
@@ -5058,6 +5058,16 @@ xfs_bunmapi(
cur->bc_private.b.flags = 0;
} else
cur = NULL;
+
+ if (isrt) {
+ /*
+ * Synchronize by locking the bitmap inode.
+ */
+ xfs_ilock(mp->m_rbmip, XFS_ILOCK_EXCL);
+ xfs_trans_ijoin_ref(tp, mp->m_rbmip,
+ XFS_ILOCK_EXCL | XFS_ILOCK_RTBITMAP);
+ }
+
extno = 0;
while (bno != (xfs_fileoff_t)-1 && bno >= start && lastx >= 0 &&
(nexts == 0 || extno < nexts)) {
diff --git a/fs/xfs/xfs_rtalloc.c b/fs/xfs/xfs_rtalloc.c
index f592ac9..d0a48d4 100644
--- a/fs/xfs/xfs_rtalloc.c
+++ b/fs/xfs/xfs_rtalloc.c
@@ -2160,13 +2160,12 @@ xfs_rtfree_extent(
xfs_buf_t *sumbp; /* summary file block buffer */
mp = tp->t_mountp;
- /*
- * Synchronize by locking the bitmap inode.
- */
- error = xfs_trans_iget(mp, tp, mp->m_sb.sb_rbmino, 0,
- XFS_ILOCK_EXCL | XFS_ILOCK_RTBITMAP, &ip);
- if (error)
- return error;
+
+ ASSERT(mp->m_rbmip->i_itemp != NULL);
+ ASSERT(xfs_isilocked(mp->m_rbmip, XFS_ILOCK_EXCL));
+
+ ip = mp->m_rbmip;
+
#if defined(__KERNEL__) && defined(DEBUG)
/*
* Check to see that this whole range is currently allocated.
--
1.7.5.4
--
View this message in context: http://old.nabble.com/-PATCH-0-4--RFC-xfs%3A-resurrect-realtime-subvolume-support-on-kernel-2.6.37-tp33345988p33379323.html
Sent from the Xfs - General mailing list archive at Nabble.com.
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
^ permalink raw reply related [flat|nested] 11+ messages in thread
* Re: [PATCH 4/4] V2 xfs: fix deadlock in xfs_rtfree_extent with kernel v2.6.37
2012-02-23 16:52 ` [PATCH 4/4] V2 " Kamal Dasu
@ 2012-02-25 9:40 ` Christoph Hellwig
2012-02-25 15:46 ` Kamal Dasu
0 siblings, 1 reply; 11+ messages in thread
From: Christoph Hellwig @ 2012-02-25 9:40 UTC (permalink / raw)
To: Kamal Dasu; +Cc: xfs
On Thu, Feb 23, 2012 at 08:52:57AM -0800, Kamal Dasu wrote:
>
> To fix the deadlock caused by recursively calling xfs_rtfree_extent
> from xfs_bunmapi():
>
> - removed xfs_trans_iget() from xfs_rtfree_extent(),
> instead added asserts that the inode is locked and has an inode_item
> attached to it.
> - in xfs_bunmapi() when dealing with an inode with the rt flag
> call xfs_ilock() and xfs_trans_ijoin() so that the
> reference count is bumped on the inode and attached it to the
> transaction before calling into xfs_bmap_del_extent, similar to
> what we do in xfs_bmap_rtalloc.
>
> Signed-off-by: Kamal Dasu <kdasu.kdev@gmail.com>
This looks good, thanks a lot!
Do you have an easily reproducable testcase for this which we could
put into xfstests?
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 4/4] V2 xfs: fix deadlock in xfs_rtfree_extent with kernel v2.6.37
2012-02-25 9:40 ` Christoph Hellwig
@ 2012-02-25 15:46 ` Kamal Dasu
2012-02-28 8:36 ` Christoph Hellwig
0 siblings, 1 reply; 11+ messages in thread
From: Kamal Dasu @ 2012-02-25 15:46 UTC (permalink / raw)
To: xfs
Christoph,
I have not been able to create a simple test case for this yet.
Currently the only way I have is to use an a time shift recording
application
that stored video streams on a real-time subvolume. Sometimes when such
a stream is deleted I see the problem. I have not figured out how a test to
consistently get allocation where the bit map span multiple extents while
freeing the inode.
I am still trying to come with a simple test case.
If you have any ideas let me know I will be happy to try it out.
Kamal
Christoph Hellwig wrote:
>
> On Thu, Feb 23, 2012 at 08:52:57AM -0800, Kamal Dasu wrote:
>>
>> To fix the deadlock caused by recursively calling xfs_rtfree_extent
>> from xfs_bunmapi():
>>
>> - removed xfs_trans_iget() from xfs_rtfree_extent(),
>> instead added asserts that the inode is locked and has an inode_item
>> attached to it.
>> - in xfs_bunmapi() when dealing with an inode with the rt flag
>> call xfs_ilock() and xfs_trans_ijoin() so that the
>> reference count is bumped on the inode and attached it to the
>> transaction before calling into xfs_bmap_del_extent, similar to
>> what we do in xfs_bmap_rtalloc.
>>
>> Signed-off-by: Kamal Dasu <kdasu.kdev@gmail.com>
>
> This looks good, thanks a lot!
>
> Do you have an easily reproducable testcase for this which we could
> put into xfstests?
>
> _______________________________________________
> xfs mailing list
> xfs@oss.sgi.com
> http://oss.sgi.com/mailman/listinfo/xfs
>
>
--
View this message in context: http://old.nabble.com/-PATCH-0-4--RFC-xfs%3A-resurrect-realtime-subvolume-support-on-kernel-2.6.37-tp33345988p33390983.html
Sent from the Xfs - General mailing list archive at Nabble.com.
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 4/4] V2 xfs: fix deadlock in xfs_rtfree_extent with kernel v2.6.37
2012-02-25 15:46 ` Kamal Dasu
@ 2012-02-28 8:36 ` Christoph Hellwig
0 siblings, 0 replies; 11+ messages in thread
From: Christoph Hellwig @ 2012-02-28 8:36 UTC (permalink / raw)
To: Kamal Dasu; +Cc: xfs
On Sat, Feb 25, 2012 at 07:57:02AM -0800, Kamal Dasu wrote:
>
> Christoph,
>
> I have not been able to create a simple test case for this yet.
>
> Currently the only way I have is to use an a time shift recording
> application
> that stores video streams on a real-time subvolume. Sometimes when such
> a stream is deleted I see the problem. I have not figured out how to create
> a
> test to consistently get allocation where the bit map span multiple extents
> while
> freeing the inode.
>
> I am still trying to come with a simple test case.
> If you have any ideas let me know I will be happy to try it out.
>
> I have also posted a equivalent post for the 3.3 kernel branch.
> http://old.nabble.com/-PATCH--xfs%3A-fix-deadlock-in-xfs_rtfree_extent-with-kernel-v3.x-to33375114.html
That's the mail I replied to :)
Either way - I'd love to get a good test case for it eventually, but for
now let's put in the obvious fix even if we don't have a good
reproducer.
Alex, I'd suggest this is a 3.3-rc issue as it's a regression, even if
an old one.
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
^ permalink raw reply [flat|nested] 11+ messages in thread
end of thread, other threads:[~2012-02-28 8:36 UTC | newest]
Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-02-17 22:46 [PATCH 0/4] xfs: resurrect realtime subvolume support on kernel 2.6.37 kdasu
2012-02-17 22:51 ` [PATCH 1/4] xfs: only lock the rt bitmap inode once per allocation kdasu
2012-02-17 22:55 ` [PATCH 2/4] xfs: fix xfs_get_extsz_hint for a zero extent size hint kdasu
2012-02-17 22:58 ` [PATCH 3/4] xfs: add lockdep annotations for the rt inodes kdasu
2012-02-17 23:00 ` [PATCH 4/4] xfs: fix deadlock in xfs_rtfree_extent with kernel v2.6.37 kdasu
2012-02-19 22:41 ` Christoph Hellwig
2012-02-21 17:22 ` Kamal Dasu
2012-02-23 16:52 ` [PATCH 4/4] V2 " Kamal Dasu
2012-02-25 9:40 ` Christoph Hellwig
2012-02-25 15:46 ` Kamal Dasu
2012-02-28 8:36 ` Christoph Hellwig
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.