All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/4] xfs: resurrect realtime subvolume support on kernel 2.6.37
@ 2012-02-17 22:46 kdasu
  2012-02-17 22:51 ` [PATCH 1/4] xfs: only lock the rt bitmap inode once per allocation kdasu
  0 siblings, 1 reply; 11+ messages in thread
From: kdasu @ 2012-02-17 22:46 UTC (permalink / raw)
  To: xfs


Fixed bimap inode deadlock  existing on 2.6.37.

[PATCH 4/4]   xfs: fix deadlock in xfs_rtfree_extent with kernel v2.6.37

Also added the realtime subvolume patches from the 2.6.39 release.

[PATCH 1/4] xfs: only lock the rt bitmap inode once per allocation
[PATCH 2/4] xfs: fix xfs_get_extsz_hint for a zero extent size hint
[PATCH 3/4] xfs: add lockdep annotations for the rt inodes

These fixes were needed to resurrect realtime subvolume support on 
2.6.37 kernel.

I suspect the deadlock problem while freeing multiple extents 
exits  on 2.6.39  in the xfs_rtfree_entent() call.

Kamal
-- 
View this message in context: http://old.nabble.com/-PATCH-0-4--xfs%3A-resurrect-realtime-subvolume-support-on-kernel-2.6.37-tp33345988p33345988.html
Sent from the Xfs - General mailing list archive at Nabble.com.

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH 1/4] xfs: only lock the rt bitmap inode once per allocation
  2012-02-17 22:46 [PATCH 0/4] xfs: resurrect realtime subvolume support on kernel 2.6.37 kdasu
@ 2012-02-17 22:51 ` kdasu
  2012-02-17 22:55   ` [PATCH 2/4] xfs: fix xfs_get_extsz_hint for a zero extent size hint kdasu
  0 siblings, 1 reply; 11+ messages in thread
From: kdasu @ 2012-02-17 22:51 UTC (permalink / raw)
  To: xfs


Currently both xfs_rtpick_extent and xfs_rtallocate_extent call
xfs_trans_iget to grab and lock the rt bitmap inode, which results in a
deadlock since the removal of the lock recursion counters in commit

        "xfs: simplify inode to transaction joining"

Fix this by acquiring and locking the inode in xfs_bmap_rtalloc before
calling into xfs_rtpick_extent and xfs_rtallocate_extent.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Alex Elder <aelder@sgi.com>
Signed-off-by: Kamal Dasu <kdasu.kdev@gmail.com>
---
 fs/xfs/xfs_bmap.c    |   11 +++++++++++
 fs/xfs/xfs_rtalloc.c |   34 +++++++++++++---------------------
 2 files changed, 24 insertions(+), 21 deletions(-)

diff --git a/fs/xfs/xfs_bmap.c b/fs/xfs/xfs_bmap.c
index 4111cd3..9d9970b 100644
--- a/fs/xfs/xfs_bmap.c
+++ b/fs/xfs/xfs_bmap.c
@@ -2316,6 +2316,7 @@ xfs_bmap_rtalloc(
        xfs_extlen_t    prod = 0;       /* product factor for allocators */
        xfs_extlen_t    ralen = 0;      /* realtime allocation length */
        xfs_extlen_t    align;          /* minimum allocation alignment */
+       xfs_inode_t     *ip;            /* bitmap incore inode */
        xfs_rtblock_t   rtb;
kdasu@kdasu-VirtualBox:~/linux-2.6$ more
0001-xfs-only-lock-the-rt-bitmap-inode-once-per-allocatio.patch 
>From f59ab29cc191a5955c4d44c0b92a537f981184c2 Mon Sep 17 00:00:00 2001
From: Christoph Hellwig <hch@infradead.org>
Date: Tue, 25 Jan 2011 09:06:19 +0000
Subject: [PATCH 1/4] xfs: only lock the rt bitmap inode once per allocation

Currently both xfs_rtpick_extent and xfs_rtallocate_extent call
xfs_trans_iget to grab and lock the rt bitmap inode, which results in a
deadlock since the removal of the lock recursion counters in commit

        "xfs: simplify inode to transaction joining"

Fix this by acquiring and locking the inode in xfs_bmap_rtalloc before
calling into xfs_rtpick_extent and xfs_rtallocate_extent.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Alex Elder <aelder@sgi.com>
Signed-off-by: Kamal Dasu <kdasu.kdev@gmail.com>
---
 fs/xfs/xfs_bmap.c    |   11 +++++++++++
 fs/xfs/xfs_rtalloc.c |   34 +++++++++++++---------------------
 2 files changed, 24 insertions(+), 21 deletions(-)

diff --git a/fs/xfs/xfs_bmap.c b/fs/xfs/xfs_bmap.c
index 4111cd3..9d9970b 100644
--- a/fs/xfs/xfs_bmap.c
+++ b/fs/xfs/xfs_bmap.c
@@ -2316,6 +2316,7 @@ xfs_bmap_rtalloc(
        xfs_extlen_t    prod = 0;       /* product factor for allocators */
        xfs_extlen_t    ralen = 0;      /* realtime allocation length */
        xfs_extlen_t    align;          /* minimum allocation alignment */
+       xfs_inode_t     *ip;            /* bitmap incore inode */
        xfs_rtblock_t   rtb;
 
        mp = ap->ip->i_mount;
@@ -2348,6 +2349,16 @@ xfs_bmap_rtalloc(
         */
        if (ralen * mp->m_sb.sb_rextsize >= MAXEXTLEN)
                ralen = MAXEXTLEN / mp->m_sb.sb_rextsize;
+
+       /*
+        * Lock out other modifications to the RT bitmap inode.
+        */
+       error = xfs_trans_iget(mp, ap->tp, mp->m_sb.sb_rbmino, 0,
+                              XFS_ILOCK_EXCL, &ip);
+       if (error)
+               return error;
+       ASSERT(ip == mp->m_rbmip);
+
        /*
         * If it's an allocation to an empty file at offset 0,
         * pick an extent that will space things out in the rt area.
diff --git a/fs/xfs/xfs_rtalloc.c b/fs/xfs/xfs_rtalloc.c
index 12a1913..037fab1 100644
--- a/fs/xfs/xfs_rtalloc.c
+++ b/fs/xfs/xfs_rtalloc.c
@@ -2075,15 +2075,15 @@ xfs_rtallocate_extent(
        xfs_extlen_t    prod,           /* extent product factor */
        xfs_rtblock_t   *rtblock)       /* out: start block allocated */
 {
+       xfs_mount_t     *mp = tp->t_mountp;
        int             error;          /* error value */
-       xfs_inode_t     *ip;            /* inode for bitmap file */
-       xfs_mount_t     *mp;            /* file system mount structure */
        xfs_rtblock_t   r;              /* result allocated block */
        xfs_fsblock_t   sb;             /* summary file block number */
        xfs_buf_t       *sumbp;         /* summary file block buffer */
 
+       ASSERT(xfs_isilocked(mp->m_rbmip, XFS_ILOCK_EXCL));
        ASSERT(minlen > 0 && minlen <= maxlen);
-       mp = tp->t_mountp;
+
        /*
         * If prod is set then figure out what to do to minlen and maxlen.
         */
@@ -2099,12 +2099,7 @@ xfs_rtallocate_extent(
                        return 0;
                }
        }
-       /*
-        * Lock out other callers by grabbing the bitmap inode lock.
-        */
-       if ((error = xfs_trans_iget(mp, tp, mp->m_sb.sb_rbmino, 0,
-                                       XFS_ILOCK_EXCL, &ip)))
-               return error;
+
        sumbp = NULL;
        /*
         * Allocate by size, or near another block, or exactly at some
block.
@@ -2123,11 +2118,12 @@ xfs_rtallocate_extent(
                                len, &sumbp, &sb, prod, &r);
                break;
        default:
+               error = EIO;
                ASSERT(0);
        }
-       if (error) {
+       if (error)
                return error;
-       }
+
        /*
         * If it worked, update the superblock.
         */
@@ -2306,20 +2302,16 @@ xfs_rtpick_extent(
        xfs_rtblock_t   *pick)          /* result rt extent */
 {
        xfs_rtblock_t   b;              /* result block */
-       int             error;          /* error return value */
-       xfs_inode_t     *ip;            /* bitmap incore inode */
        int             log2;           /* log of sequence number */
        __uint64_t      resid;          /* residual after log removed */
        __uint64_t      seq;            /* sequence number of file creation
*/
        __uint64_t      *seqp;          /* pointer to seqno in inode */
 
-       if ((error = xfs_trans_iget(mp, tp, mp->m_sb.sb_rbmino, 0,
-                                       XFS_ILOCK_EXCL, &ip)))
-               return error;
+
        sumbp = NULL;
        /*
         * Allocate by size, or near another block, or exactly at some
block.
@@ -2123,11 +2118,12 @@ xfs_rtallocate_extent(
                                len, &sumbp, &sb, prod, &r);
                break;
        default:
+               error = EIO;
                ASSERT(0);
        }
-       if (error) {
+       if (error)
                return error;
-       }
+
        /*
         * If it worked, update the superblock.
         */
@@ -2306,20 +2302,16 @@ xfs_rtpick_extent(
        xfs_rtblock_t   *pick)          /* result rt extent */
 {
        xfs_rtblock_t   b;              /* result block */
-       int             error;          /* error return value */
-       xfs_inode_t     *ip;            /* bitmap incore inode */
        int             log2;           /* log of sequence number */
        __uint64_t      resid;          /* residual after log removed */
        __uint64_t      seq;            /* sequence number of file creation
*/
        __uint64_t      *seqp;          /* pointer to seqno in inode */
 
-       if ((error = xfs_trans_iget(mp, tp, mp->m_sb.sb_rbmino, 0,
-                                       XFS_ILOCK_EXCL, &ip)))
-               return error;
-       ASSERT(ip == mp->m_rbmip);
-       seqp = (__uint64_t *)&ip->i_d.di_atime;
-       if (!(ip->i_d.di_flags & XFS_DIFLAG_NEWRTBM)) {
-               ip->i_d.di_flags |= XFS_DIFLAG_NEWRTBM;
+       ASSERT(xfs_isilocked(mp->m_rbmip, XFS_ILOCK_EXCL));
+
+       seqp = (__uint64_t *)&mp->m_rbmip->i_d.di_atime;
+       if (!(mp->m_rbmip->i_d.di_flags & XFS_DIFLAG_NEWRTBM)) {
+               mp->m_rbmip->i_d.di_flags |= XFS_DIFLAG_NEWRTBM;
                *seqp = 0;
        }
        seq = *seqp;
@@ -2335,7 +2327,7 @@ xfs_rtpick_extent(
                        b = mp->m_sb.sb_rextents - len;
        }
        *seqp = seq + 1;
-       xfs_trans_log_inode(tp, ip, XFS_ILOG_CORE);
+       xfs_trans_log_inode(tp, mp->m_rbmip, XFS_ILOG_CORE);
        *pick = b;
        return 0;
 }
-- 
1.7.5.4
-- 
View this message in context: http://old.nabble.com/-PATCH-0-4--xfs%3A-resurrect-realtime-subvolume-support-on-kernel-2.6.37-tp33345988p33346009.html
Sent from the Xfs - General mailing list archive at Nabble.com.

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH 2/4] xfs: fix xfs_get_extsz_hint for a zero extent size hint
  2012-02-17 22:51 ` [PATCH 1/4] xfs: only lock the rt bitmap inode once per allocation kdasu
@ 2012-02-17 22:55   ` kdasu
  2012-02-17 22:58     ` [PATCH 3/4] xfs: add lockdep annotations for the rt inodes kdasu
  0 siblings, 1 reply; 11+ messages in thread
From: kdasu @ 2012-02-17 22:55 UTC (permalink / raw)
  To: xfs


We can easily set the extsize flag without setting an extent size
hint, or one that evaluates to zero.  Historically the di_extsize
field was only used when it was non-zero, but the commit

        "Cleanup inode extent size hint extraction"

broke this.  Restore the old behaviour, thus fixing xfsqa 090 with
a debug kernel.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Alex Elder <aelder@sgi.com>
Signed-off-by: Kamal Dasu <kdasu.kdev@gmail.com>
---
 fs/xfs/xfs_rw.c |   18 +++++-------------
 1 files changed, 5 insertions(+), 13 deletions(-)

diff --git a/fs/xfs/xfs_rw.c b/fs/xfs/xfs_rw.c
index 56861d5..ccd3adf 100644
--- a/fs/xfs/xfs_rw.c
+++ b/fs/xfs/xfs_rw.c
@@ -173,17 +173,9 @@ xfs_extlen_t
 xfs_get_extsz_hint(
        struct xfs_inode        *ip)
 {
-       xfs_extlen_t            extsz;
-
-       if (unlikely(XFS_IS_REALTIME_INODE(ip))) {
-               extsz = (ip->i_d.di_flags & XFS_DIFLAG_EXTSIZE)
-                               ? ip->i_d.di_extsize
-                               : ip->i_mount->m_sb.sb_rextsize;
-               ASSERT(extsz);
-       } else {
-               extsz = (ip->i_d.di_flags & XFS_DIFLAG_EXTSIZE)
-                               ? ip->i_d.di_extsize : 0;
-       }
-
-       return extsz;
+       if ((ip->i_d.di_flags & XFS_DIFLAG_EXTSIZE) && ip->i_d.di_extsize)
+               return ip->i_d.di_extsize;
+       if (XFS_IS_REALTIME_INODE(ip))
+               return ip->i_mount->m_sb.sb_rextsize;
+       return 0;
 }
-- 
1.7.5.4

-- 
View this message in context: http://old.nabble.com/-PATCH-0-4--xfs%3A-resurrect-realtime-subvolume-support-on-kernel-2.6.37-tp33345988p33346035.html
Sent from the Xfs - General mailing list archive at Nabble.com.

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH 3/4] xfs: add lockdep annotations for the rt inodes
  2012-02-17 22:55   ` [PATCH 2/4] xfs: fix xfs_get_extsz_hint for a zero extent size hint kdasu
@ 2012-02-17 22:58     ` kdasu
  2012-02-17 23:00       ` [PATCH 4/4] xfs: fix deadlock in xfs_rtfree_extent with kernel v2.6.37 kdasu
  0 siblings, 1 reply; 11+ messages in thread
From: kdasu @ 2012-02-17 22:58 UTC (permalink / raw)
  To: xfs


The rt bitmap and summary inodes do not participate in the normal inode
locking protocol.  Instead the rt bitmap inode can be locked in any
transaction involving rt allocations, and the both of the rt inodes can
be locked at the same time.  Add specific lockdep subclasses for the rt
inodes to prevent lockdep from blowing up.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Alex Elder <aelder@sgi.com>
Signed-off-by: Kamal Dasu <kdasu.kdev@gmail.com>
---
 fs/xfs/xfs_bmap.c    |    2 +-
 fs/xfs/xfs_inode.h   |   23 +++++++++++++++--------
 fs/xfs/xfs_rtalloc.c |   16 ++++++++++------
 3 files changed, 26 insertions(+), 15 deletions(-)

diff --git a/fs/xfs/xfs_bmap.c b/fs/xfs/xfs_bmap.c
index 9d9970b..36c317c 100644
--- a/fs/xfs/xfs_bmap.c
+++ b/fs/xfs/xfs_bmap.c
@@ -2354,7 +2354,7 @@ xfs_bmap_rtalloc(
         * Lock out other modifications to the RT bitmap inode.
         */
        error = xfs_trans_iget(mp, ap->tp, mp->m_sb.sb_rbmino, 0,
-                              XFS_ILOCK_EXCL, &ip);
+                              XFS_ILOCK_EXCL | XFS_ILOCK_RTBITMAP, &ip);
        if (error)
                return error;
        ASSERT(ip == mp->m_rbmip);
diff --git a/fs/xfs/xfs_inode.h b/fs/xfs/xfs_inode.h
index fb2ca2e..a9e82d4 100644
--- a/fs/xfs/xfs_inode.h
+++ b/fs/xfs/xfs_inode.h
@@ -408,28 +408,35 @@ static inline void xfs_ifunlock(xfs_inode_t *ip)
 /*
  * Flags for lockdep annotations.
  *
- * XFS_I[O]LOCK_PARENT - for operations that require locking two inodes
- * (ie directory operations that require locking a directory inode and
- * an entry inode).  The first inode gets locked with this flag so it
- * gets a lockdep subclass of 1 and the second lock will have a lockdep
- * subclass of 0.
+ * XFS_LOCK_PARENT - for directory operations that require locking a
+ * parent directory inode and a child entry inode.  The parent gets locked
+ * with this flag so it gets a lockdep subclass of 1 and the child entry
+ * lock will have a lockdep subclass of 0.
+ *
+ * XFS_LOCK_RTBITMAP/XFS_LOCK_RTSUM - the realtime device bitmap and
summary
+ * inodes do not participate in the normal lock order, and thus have their
+ * own subclasses.
  *
  * XFS_LOCK_INUMORDER - for locking several inodes at the some time
  * with xfs_lock_inodes().  This flag is used as the starting subclass
  * and each subsequent lock acquired will increment the subclass by one.
- * So the first lock acquired will have a lockdep subclass of 2, the
- * second lock will have a lockdep subclass of 3, and so on. It is
+ * So the first lock acquired will have a lockdep subclass of 4, the
+ * second lock will have a lockdep subclass of 5, and so on. It is
  * the responsibility of the class builder to shift this to the correct
  * portion of the lock_mode lockdep mask.
  */
 #define XFS_LOCK_PARENT                1
-#define XFS_LOCK_INUMORDER     2
+#define XFS_LOCK_RTBITMAP      2
+#define XFS_LOCK_RTSUM         3
+#define XFS_LOCK_INUMORDER     4
 
 #define XFS_IOLOCK_SHIFT       16
 #define        XFS_IOLOCK_PARENT       (XFS_LOCK_PARENT <<
XFS_IOLOCK_SHIFT)
 
 #define XFS_ILOCK_SHIFT                24
 #define        XFS_ILOCK_PARENT        (XFS_LOCK_PARENT << XFS_ILOCK_SHIFT)
+#define        XFS_ILOCK_RTBITMAP      (XFS_LOCK_RTBITMAP <<
XFS_ILOCK_SHIFT)
+#define        XFS_ILOCK_RTSUM         (XFS_LOCK_RTSUM << XFS_ILOCK_SHIFT)
 
 #define XFS_IOLOCK_DEP_MASK    0x00ff0000
 #define XFS_ILOCK_DEP_MASK     0xff000000
diff --git a/fs/xfs/xfs_rtalloc.c b/fs/xfs/xfs_rtalloc.c
index 037fab1..f592ac9 100644
--- a/fs/xfs/xfs_rtalloc.c
+++ b/fs/xfs/xfs_rtalloc.c
@@ -1972,8 +1972,10 @@ xfs_growfs_rt(
                /*
                 * Lock out other callers by grabbing the bitmap inode lock.
                 */
-               if ((error = xfs_trans_iget(mp, tp, mp->m_sb.sb_rbmino, 0,
-                                               XFS_ILOCK_EXCL, &ip)))
+               error = xfs_trans_iget(mp, tp, mp->m_sb.sb_rbmino, 0,
+                                      XFS_ILOCK_EXCL | XFS_ILOCK_RTBITMAP,
+                                      &ip);
+               if (error)
                        goto error_cancel;
                ASSERT(ip == mp->m_rbmip);
                /*
@@ -1986,8 +1988,9 @@ xfs_growfs_rt(
                /*
                 * Get the summary inode into the transaction.
                 */
-               if ((error = xfs_trans_iget(mp, tp, mp->m_sb.sb_rsumino, 0,
-                                               XFS_ILOCK_EXCL, &ip)))
+               error = xfs_trans_iget(mp, tp, mp->m_sb.sb_rsumino, 0,
+                                      XFS_ILOCK_EXCL | XFS_ILOCK_RTSUM,
&ip);
+               if (error)
                        goto error_cancel;
                ASSERT(ip == mp->m_rsumip);
                /*
@@ -2160,8 +2163,9 @@ xfs_rtfree_extent(
        /*
         * Synchronize by locking the bitmap inode.
         */
-       if ((error = xfs_trans_iget(mp, tp, mp->m_sb.sb_rbmino, 0,
-                                       XFS_ILOCK_EXCL, &ip)))
+       error = xfs_trans_iget(mp, tp, mp->m_sb.sb_rbmino, 0,
+                              XFS_ILOCK_EXCL | XFS_ILOCK_RTBITMAP, &ip);
+       if (error)
                return error;
 #if defined(__KERNEL__) && defined(DEBUG)
        /*
-- 
1.7.5.4


-- 
View this message in context: http://old.nabble.com/-PATCH-0-4--xfs%3A-resurrect-realtime-subvolume-support-on-kernel-2.6.37-tp33345988p33346043.html
Sent from the Xfs - General mailing list archive at Nabble.com.

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH 4/4] xfs: fix deadlock in xfs_rtfree_extent with kernel v2.6.37
  2012-02-17 22:58     ` [PATCH 3/4] xfs: add lockdep annotations for the rt inodes kdasu
@ 2012-02-17 23:00       ` kdasu
  2012-02-19 22:41         ` Christoph Hellwig
  0 siblings, 1 reply; 11+ messages in thread
From: kdasu @ 2012-02-17 23:00 UTC (permalink / raw)
  To: xfs



On the 2.6.37 kernel, xfs_fs_evict_inode() leads to a deadlock when
freeing multiple realtime extents. On further debugging the root
cause it was determined to be recursive locking of the RT bitmap
inode during evict operation within the same task context.
The same vfs evict sequence is replayed by the xfs log recovery on
mounts on a reboot after the problem happens first time.
This problem exists on kernel v2.6.39 as well.

Call stack:
xfs_ilock   <- simple task deadlock in the xfs_ilock(ip,  XFS_ILOCK_EXCL)
               re-acquired on second iteration when the inode is cached
xfs_iget_cache_hit
xfs_iget
xfs_trans_iget
xfs_rtfree_extent <- Call to xfs_trans_iget()
xfs_bmap_del_extent
xfs_bunmapi  <- while loop based on number of extents to free
xfs_itruncate_finish
xfs_inactive
evict

The deadlock fix has two parts :
1) check if the inode is already locked in xfs_iget.c in the
   xfs_iget_cache_hit() function. Do not acquire the inode lock again
   if ip is already locked with the  XFS_ILOCK_EXCL subclass.
   We use the active transaction structure to detect if the inode is
   already lokced.
2) In addition in xfs_trans_inode.c:xfs_trans_iget() prevent joining
   already active transaction.

The above changes are also needed along with the backport of following
2.6.39 kernel patches to 2.6.37 kernel:

xfs: only lock the rt bitmap inode once per allocation
xfs: fix xfs_get_extsz_hint for a zero extent size hint
xfs: add lockdep annotations for the rt inodes

Signed-off-by: Kamal Dasu <kdasu.kdev@gmail.com>
---
 fs/xfs/xfs_iget.c        |   12 +++++++++++-
 fs/xfs/xfs_trans_inode.c |    2 +-
 2 files changed, 12 insertions(+), 2 deletions(-)

diff --git a/fs/xfs/xfs_iget.c b/fs/xfs/xfs_iget.c
index 0cdd269..f05bdc2 100644
--- a/fs/xfs/xfs_iget.c
+++ b/fs/xfs/xfs_iget.c
@@ -143,6 +143,7 @@ xfs_inode_free(
 static int
 xfs_iget_cache_hit(
        struct xfs_perag        *pag,
+       xfs_trans_t             *tp,
        struct xfs_inode        *ip,
        int                     flags,
        int                     lock_flags) __releases(pag->pag_ici_lock)
@@ -234,6 +235,15 @@ xfs_iget_cache_hit(
                trace_xfs_iget_hit(ip);
        }
 
+       /* check inode already locked  */
+       spin_lock(&ip->i_flags_lock);
+       if (tp &&  ip->i_transp == tp) {
+               if ((ip->i_itemp->ili_lock_flags & lock_flags) &
+                       (XFS_ILOCK_EXCL))
+                       lock_flags = 0;
+       }
+       spin_unlock(&ip->i_flags_lock);
+
        if (lock_flags != 0)
                xfs_ilock(ip, lock_flags);
 
@@ -379,7 +389,7 @@ again:
        ip = radix_tree_lookup(&pag->pag_ici_root, agino);
 
        if (ip) {
-               error = xfs_iget_cache_hit(pag, ip, flags, lock_flags);
+               error = xfs_iget_cache_hit(pag, tp, ip, flags, lock_flags);
                if (error)
                        goto out_error_or_again;
        } else {
diff --git a/fs/xfs/xfs_trans_inode.c b/fs/xfs/xfs_trans_inode.c
index ccb3453..6f8db93 100644
--- a/fs/xfs/xfs_trans_inode.c
+++ b/fs/xfs/xfs_trans_inode.c
@@ -58,7 +58,7 @@ xfs_trans_iget(
        int                     error;
 
        error = xfs_iget(mp, tp, ino, flags, lock_flags, ipp);
-       if (!error && tp) {
+       if (!error && tp && !((*ipp)->i_transp)) {
                xfs_trans_ijoin(tp, *ipp);
                (*ipp)->i_itemp->ili_lock_flags = lock_flags;
        }
-- 
1.7.5.4

-- 
View this message in context: http://old.nabble.com/-PATCH-0-4--xfs%3A-resurrect-realtime-subvolume-support-on-kernel-2.6.37-tp33345988p33346051.html
Sent from the Xfs - General mailing list archive at Nabble.com.

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH 4/4] xfs: fix deadlock in xfs_rtfree_extent with kernel v2.6.37
  2012-02-17 23:00       ` [PATCH 4/4] xfs: fix deadlock in xfs_rtfree_extent with kernel v2.6.37 kdasu
@ 2012-02-19 22:41         ` Christoph Hellwig
  2012-02-21 17:22           ` Kamal Dasu
  0 siblings, 1 reply; 11+ messages in thread
From: Christoph Hellwig @ 2012-02-19 22:41 UTC (permalink / raw)
  To: kdasu; +Cc: xfs

On Fri, Feb 17, 2012 at 03:00:09PM -0800, kdasu wrote:
> 
> 
> On the 2.6.37 kernel, xfs_fs_evict_inode() leads to a deadlock when
> freeing multiple realtime extents. On further debugging the root
> cause it was determined to be recursive locking of the RT bitmap
> inode during evict operation within the same task context.
> The same vfs evict sequence is replayed by the xfs log recovery on
> mounts on a reboot after the problem happens first time.
> This problem exists on kernel v2.6.39 as well.

I think you're better off fixing this problem like I did for the
allocation side, that is:

 - remove the xfs_ilock and xfs_trans_ijoin (or probably still
   xfs_trans_iget in your version) from xfs_rtfree_extent, and
   instead add asserts that the inode is locked and has an inode_item
   attach to it.
 - in xfs_bunmapi if we are dealing with an inode with the rt flag
   bump the reference count on the inode there and attach it to the
   transaction before calling into xfs_bmap_del_extent, similar to
   what we do in xfs_bmap_rtalloc.

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 4/4] xfs: fix deadlock in xfs_rtfree_extent with kernel v2.6.37
  2012-02-19 22:41         ` Christoph Hellwig
@ 2012-02-21 17:22           ` Kamal Dasu
  2012-02-23 16:52             ` [PATCH 4/4] V2 " Kamal Dasu
  0 siblings, 1 reply; 11+ messages in thread
From: Kamal Dasu @ 2012-02-21 17:22 UTC (permalink / raw)
  To: xfs




Christoph Hellwig wrote:
> 
> I think you're better off fixing this problem like I did for the
> allocation side, that is:
> 
>  - remove the xfs_ilock and xfs_trans_ijoin (or probably still
>    xfs_trans_iget in your version) from xfs_rtfree_extent, and
>    instead add asserts that the inode is locked and has an inode_item
>    attach to it.
>  - in xfs_bunmapi if we are dealing with an inode with the rt flag
>    bump the reference count on the inode there and attach it to the
>    transaction before calling into xfs_bmap_del_extent, similar to
>    what we do in xfs_bmap_rtalloc.
> 

I will make the change and test and send the new version of the patch.
BTW when you say reference counting the inode do you mean I should call 
xfs_trans_ijoin_ref().


-- 
View this message in context: http://old.nabble.com/-PATCH-0-4--RFC-xfs%3A-resurrect-realtime-subvolume-support-on-kernel-2.6.37-tp33345988p33365485.html
Sent from the Xfs - General mailing list archive at Nabble.com.

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH 4/4] V2 xfs: fix deadlock in xfs_rtfree_extent with kernel v2.6.37
  2012-02-21 17:22           ` Kamal Dasu
@ 2012-02-23 16:52             ` Kamal Dasu
  2012-02-25  9:40               ` Christoph Hellwig
  0 siblings, 1 reply; 11+ messages in thread
From: Kamal Dasu @ 2012-02-23 16:52 UTC (permalink / raw)
  To: xfs


To fix the deadlock caused by recursively calling xfs_rtfree_extent
from xfs_bunmapi():

 - removed  xfs_trans_iget() from xfs_rtfree_extent(),
   instead added asserts that the inode is locked and has an inode_item
   attached to it.
 - in xfs_bunmapi() when dealing with an inode with the rt flag
   call xfs_ilock() and xfs_trans_ijoin() so that the
   reference count is bumped on the inode and attached it to the
   transaction before calling into xfs_bmap_del_extent, similar to
   what we do in xfs_bmap_rtalloc.

Signed-off-by: Kamal Dasu <kdasu.kdev@gmail.com>
---
 fs/xfs/xfs_bmap.c    |   10 ++++++++++
 fs/xfs/xfs_rtalloc.c |   13 ++++++-------
 2 files changed, 16 insertions(+), 7 deletions(-)

diff --git a/fs/xfs/xfs_bmap.c b/fs/xfs/xfs_bmap.c
index 36c317c..103a253 100644
--- a/fs/xfs/xfs_bmap.c
+++ b/fs/xfs/xfs_bmap.c
@@ -5058,6 +5058,16 @@ xfs_bunmapi(
                cur->bc_private.b.flags = 0;
        } else
                cur = NULL;
+
+       if (isrt) {
+               /*
+                * Synchronize by locking the bitmap inode.
+                */
+               xfs_ilock(mp->m_rbmip, XFS_ILOCK_EXCL);
+               xfs_trans_ijoin_ref(tp, mp->m_rbmip,
+                                   XFS_ILOCK_EXCL | XFS_ILOCK_RTBITMAP);
+       }
+
        extno = 0;
        while (bno != (xfs_fileoff_t)-1 && bno >= start && lastx >= 0 &&
               (nexts == 0 || extno < nexts)) {
diff --git a/fs/xfs/xfs_rtalloc.c b/fs/xfs/xfs_rtalloc.c
index f592ac9..d0a48d4 100644
--- a/fs/xfs/xfs_rtalloc.c
+++ b/fs/xfs/xfs_rtalloc.c
@@ -2160,13 +2160,12 @@ xfs_rtfree_extent(
        xfs_buf_t       *sumbp;         /* summary file block buffer */
 
        mp = tp->t_mountp;
-       /*
-        * Synchronize by locking the bitmap inode.
-        */
-       error = xfs_trans_iget(mp, tp, mp->m_sb.sb_rbmino, 0,
-                              XFS_ILOCK_EXCL | XFS_ILOCK_RTBITMAP, &ip);
-       if (error)
-               return error;
+
+       ASSERT(mp->m_rbmip->i_itemp != NULL);
+       ASSERT(xfs_isilocked(mp->m_rbmip, XFS_ILOCK_EXCL));
+
+       ip = mp->m_rbmip;
+
 #if defined(__KERNEL__) && defined(DEBUG)
        /*
         * Check to see that this whole range is currently allocated.
-- 
1.7.5.4

-- 
View this message in context: http://old.nabble.com/-PATCH-0-4--RFC-xfs%3A-resurrect-realtime-subvolume-support-on-kernel-2.6.37-tp33345988p33379323.html
Sent from the Xfs - General mailing list archive at Nabble.com.

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH 4/4] V2 xfs: fix deadlock in xfs_rtfree_extent with kernel v2.6.37
  2012-02-23 16:52             ` [PATCH 4/4] V2 " Kamal Dasu
@ 2012-02-25  9:40               ` Christoph Hellwig
  2012-02-25 15:46                 ` Kamal Dasu
  0 siblings, 1 reply; 11+ messages in thread
From: Christoph Hellwig @ 2012-02-25  9:40 UTC (permalink / raw)
  To: Kamal Dasu; +Cc: xfs

On Thu, Feb 23, 2012 at 08:52:57AM -0800, Kamal Dasu wrote:
> 
> To fix the deadlock caused by recursively calling xfs_rtfree_extent
> from xfs_bunmapi():
> 
>  - removed  xfs_trans_iget() from xfs_rtfree_extent(),
>    instead added asserts that the inode is locked and has an inode_item
>    attached to it.
>  - in xfs_bunmapi() when dealing with an inode with the rt flag
>    call xfs_ilock() and xfs_trans_ijoin() so that the
>    reference count is bumped on the inode and attached it to the
>    transaction before calling into xfs_bmap_del_extent, similar to
>    what we do in xfs_bmap_rtalloc.
> 
> Signed-off-by: Kamal Dasu <kdasu.kdev@gmail.com>

This looks good, thanks a lot!

Do you have an easily reproducable testcase for this which we could
put into xfstests?

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 4/4] V2 xfs: fix deadlock in xfs_rtfree_extent with kernel v2.6.37
  2012-02-25  9:40               ` Christoph Hellwig
@ 2012-02-25 15:46                 ` Kamal Dasu
  2012-02-28  8:36                   ` Christoph Hellwig
  0 siblings, 1 reply; 11+ messages in thread
From: Kamal Dasu @ 2012-02-25 15:46 UTC (permalink / raw)
  To: xfs


Christoph,

I have not been able to create a simple test case for this yet.

Currently the only way I have is to use an a time shift recording
application 
that stored video streams on a real-time subvolume.  Sometimes when such
a stream is deleted I see the problem. I have not  figured out how a test to 
consistently get allocation where the bit map span multiple extents while 
freeing  the inode. 

I am still trying to come with a simple test case. 
If you have any ideas let me know I will be happy to try it out.

Kamal



Christoph Hellwig wrote:
> 
> On Thu, Feb 23, 2012 at 08:52:57AM -0800, Kamal Dasu wrote:
>> 
>> To fix the deadlock caused by recursively calling xfs_rtfree_extent
>> from xfs_bunmapi():
>> 
>>  - removed  xfs_trans_iget() from xfs_rtfree_extent(),
>>    instead added asserts that the inode is locked and has an inode_item
>>    attached to it.
>>  - in xfs_bunmapi() when dealing with an inode with the rt flag
>>    call xfs_ilock() and xfs_trans_ijoin() so that the
>>    reference count is bumped on the inode and attached it to the
>>    transaction before calling into xfs_bmap_del_extent, similar to
>>    what we do in xfs_bmap_rtalloc.
>> 
>> Signed-off-by: Kamal Dasu <kdasu.kdev@gmail.com>
> 
> This looks good, thanks a lot!
> 
> Do you have an easily reproducable testcase for this which we could
> put into xfstests?
> 
> _______________________________________________
> xfs mailing list
> xfs@oss.sgi.com
> http://oss.sgi.com/mailman/listinfo/xfs
> 
> 

-- 
View this message in context: http://old.nabble.com/-PATCH-0-4--RFC-xfs%3A-resurrect-realtime-subvolume-support-on-kernel-2.6.37-tp33345988p33390983.html
Sent from the Xfs - General mailing list archive at Nabble.com.

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 4/4] V2 xfs: fix deadlock in xfs_rtfree_extent with kernel v2.6.37
  2012-02-25 15:46                 ` Kamal Dasu
@ 2012-02-28  8:36                   ` Christoph Hellwig
  0 siblings, 0 replies; 11+ messages in thread
From: Christoph Hellwig @ 2012-02-28  8:36 UTC (permalink / raw)
  To: Kamal Dasu; +Cc: xfs

On Sat, Feb 25, 2012 at 07:57:02AM -0800, Kamal Dasu wrote:
> 
> Christoph,
> 
> I have not been able to create a simple test case for this yet.
> 
> Currently the only way I have is to use an a time shift recording
> application 
> that stores video streams on a real-time subvolume.  Sometimes when such
> a stream is deleted I see the problem. I have not  figured out how to create
> a 
> test to consistently get allocation where the bit map span multiple extents
> while 
> freeing  the inode. 
> 
> I am still trying to come with a simple test case. 
> If you have any ideas let me know I will be happy to try it out.
> 
> I have also posted a equivalent post for the 3.3 kernel branch.
> http://old.nabble.com/-PATCH--xfs%3A-fix-deadlock-in-xfs_rtfree_extent-with-kernel-v3.x-to33375114.html

That's the mail I replied to :)

Either way - I'd love to get a good test case for it eventually, but for
now let's put in the obvious fix even if we don't have a good
reproducer.

Alex, I'd suggest this is a 3.3-rc issue as it's a regression, even if
an old one.

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2012-02-28  8:36 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-02-17 22:46 [PATCH 0/4] xfs: resurrect realtime subvolume support on kernel 2.6.37 kdasu
2012-02-17 22:51 ` [PATCH 1/4] xfs: only lock the rt bitmap inode once per allocation kdasu
2012-02-17 22:55   ` [PATCH 2/4] xfs: fix xfs_get_extsz_hint for a zero extent size hint kdasu
2012-02-17 22:58     ` [PATCH 3/4] xfs: add lockdep annotations for the rt inodes kdasu
2012-02-17 23:00       ` [PATCH 4/4] xfs: fix deadlock in xfs_rtfree_extent with kernel v2.6.37 kdasu
2012-02-19 22:41         ` Christoph Hellwig
2012-02-21 17:22           ` Kamal Dasu
2012-02-23 16:52             ` [PATCH 4/4] V2 " Kamal Dasu
2012-02-25  9:40               ` Christoph Hellwig
2012-02-25 15:46                 ` Kamal Dasu
2012-02-28  8:36                   ` Christoph Hellwig

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.