All of lore.kernel.org
 help / color / mirror / Atom feed
* Code Changes [Review Request]: xfs_trans_ail_delete() gets invoked with corrupted values – Causing Crash and Deadlock.
@ 2011-09-13 12:07 Amit Sahrawat
  2011-09-13 15:27 ` Code Changes [Review Request]: xfs_trans_ail_delete() gets invoked with corrupted values ? " Christoph Hellwig
  0 siblings, 1 reply; 6+ messages in thread
From: Amit Sahrawat @ 2011-09-13 12:07 UTC (permalink / raw)
  To: Christoph Hellwig, Dave Chinner, xfs

Architecture: ARM
Kernel: 2.6.38.13 (observed on 3.0.3 also)
Test Case: Unplug device during umount

BackTrace:
Unable to handle kernel NULL pointer dereference at virtual address 00000023
Process umount (pid: 220, stack limit = 0xe45f22e8)
Backtrace:
(xfs_log_move_tail+0x0/0x280) from [<c01ac3b0>]
(xfs_trans_ail_delete+0xbc/0x17c)
(xfs_trans_ail_delete+0x0/0x17c) from [<c01830c0>] (xfs_buf_iodone+0x64/0x7c)
(xfs_buf_iodone+0x0/0x7c) from [<c0183028>] (xfs_buf_do_callbacks+0x2c/0x3c)
(xfs_buf_do_callbacks+0x0/0x3c) from [<c0183260>]
(xfs_buf_iodone_callbacks+0x14c/0x174)
(xfs_buf_iodone_callbacks+0x0/0x174) from [<c01b660c>]
(xfs_buf_iodone_work+0x58/0x7c)
(xfs_buf_iodone_work+0x0/0x7c) from [<c01b66b4>] (xfs_buf_ioend+0x84/0x9c)
(xfs_buf_ioend+0x0/0x9c) from [<c01b6d60>] (xfs_bioerror+0x4c/0x54)
(xfs_bioerror+0x0/0x54) from [<c01b6dc4>] (xfs_bdstrat_cb+0x5c/0x6c)
(xfs_bdstrat_cb+0x0/0x6c) from [<c01b6858>] (xfs_flush_buftarg+0xe4/0x18c)
(xfs_flush_buftarg+0x0/0x18c) from [<c01b6920>] (xfs_free_buftarg+0x20/0x78)
(xfs_free_buftarg+0x0/0x78) from [<c01bdecc>] (xfs_close_devices+0x64/0x68)
(xfs_close_devices+0x0/0x68) from [<c01bdf44>] (xfs_fs_put_super+0x74/0x88)
(xfs_fs_put_super+0x0/0x88) from [<c00c3a9c>]
(generic_shutdown_super+0x7c/0x120)
(generic_shutdown_super+0x0/0x120) from [<c00c3b74>]
(kill_block_super+0x34/0x4c)
(kill_block_super+0x0/0x4c) from [<c00c2b54>]
(deactivate_locked_super+0x3c/0x5c)
(deactivate_locked_super+0x0/0x5c) from [<c00c2d60>]
(deactivate_super+0x5c/0x60)
(deactivate_super+0x0/0x60) from [<c00da79c>] (mntput_no_expire+0x9c/0xe8)
(mntput_no_expire+0x0/0xe8) from [<c00dae40>] (sys_umount+0x308/0x334)
(sys_umount+0x0/0x334) from [<c001ef80>] (ret_fast_syscall+0x0/0x30)
Code: e1a02007 e59f00bc e1a01006 eb055bd7 (e5d53023)
---[ end trace ddd9103bce1b5eae ]---
------------[ cut here ]------------

In the case of crash (back trace as shown), the log structure is
de-allocated as part of the xfs_log_unmount(), but there was a pending
callback which gets called after all these un-mount routine. Since,
log structure doesn’t hold valid values it causes OOPS when accessed.

In the function xfs_buf_iodone() : file xfs_buf_item.c

If Mount point and the log structure addresses are checked through
xfs_buf_t *bp, it shows correct values – valid address for mount point
and NULL for log(bp->b_mount,bp->b_mount->m_log). While if the same
are accessed through “ailp->xa_mount,ailp->xa_mount->m_log” – these
hold garbage values.

I have made few changes:
diff -Nurp linux-orig/fs/xfs/xfs_buf_item.c linux-Modified/fs/xfs/xfs_buf_item.c
--- linux-orig/fs/xfs/xfs_buf_item.c	2011-09-13 17:08:13.000000000 +0530
+++ linux-Modified/fs/xfs/xfs_buf_item.c	2011-09-13 17:05:43.000000000 +0530
@@ -1065,7 +1065,9 @@ xfs_buf_iodone(
 	 *
 	 * Either way, AIL is useless if we're forcing a shutdown.
 	 */
-	spin_lock(&ailp->xa_lock);
-	xfs_trans_ail_delete(ailp, (xfs_log_item_t *)bip);
-	xfs_buf_item_free(bip);
+	if(bp->b_mount->m_log){
+		spin_lock(&ailp->xa_lock);
+		xfs_trans_ail_delete(ailp, (xfs_log_item_t *)bip);
+		xfs_buf_item_free(bip);
+	}
 }
diff -Nurp linux-orig/fs/xfs/xfs_trans_ail.c
linux-Modified/fs/xfs/xfs_trans_ail.c
--- linux-orig/fs/xfs/xfs_trans_ail.c	2011-06-02 12:10:52.000000000 +0530
+++ linux-Modified/fs/xfs/xfs_trans_ail.c	2011-09-13 17:06:31.000000000 +0530
@@ -27,6 +27,7 @@
 #include "xfs_mount.h"
 #include "xfs_trans_priv.h"
 #include "xfs_error.h"
+#include "xfs_log_priv.h"

 STATIC void xfs_ail_insert(struct xfs_ail *, xfs_log_item_t *);
 STATIC xfs_log_item_t * xfs_ail_delete(struct xfs_ail *, xfs_log_item_t *);
@@ -619,6 +619,9 @@ xfs_trans_ail_destroy(
 	struct xfs_ail	*ailp = mp->m_ail;

 	xfsaild_stop(ailp);
+	mp->m_ail->xa_mount = NULL;
+        mp->m_ail = NULL;
+        mp->m_log->l_ailp = NULL;
 	kmem_free(ailp);
 }

Please share your opinion on these changes.

Thanks & Regards,
Amit Sahrawat

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Code Changes [Review Request]: xfs_trans_ail_delete() gets invoked with corrupted values ? Causing Crash and Deadlock.
  2011-09-13 12:07 Code Changes [Review Request]: xfs_trans_ail_delete() gets invoked with corrupted values – Causing Crash and Deadlock Amit Sahrawat
@ 2011-09-13 15:27 ` Christoph Hellwig
  2011-09-13 15:58   ` Amit Sahrawat
  0 siblings, 1 reply; 6+ messages in thread
From: Christoph Hellwig @ 2011-09-13 15:27 UTC (permalink / raw)
  To: Amit Sahrawat; +Cc: Christoph Hellwig, xfs

Hmm, I don't think this is the correct fix.  We should never have
buffers still around once we unmount the log.  Can you try the patch
below?

---
From: Christoph Hellwig <hch@lst.de>
Subject: xfs: fix buffer flushing during unmount

The code to flush buffers in the umount code is a bit iffy: we first flush
all delwri buffers out, but then might be able to queue up a new one when
logging the sb counts.  On a normal shutdown that one would get flushed
out when doing the synchronous superblock write in xfs_unmountfs_writesb,
but we skip that one if the filesystem has been shut down.

Fix this by moving the delwri list flushing until just before unmounting
the log, and while we're at it also remove the superflous delwri list
and buffer lru flusing for the rt and log device that can never have
cached or delwri buffers.

Signed-off-by: Christoph Hellwig <hch@lst.de>

Index: xfs/fs/xfs/xfs_mount.c
===================================================================
--- xfs.orig/fs/xfs/xfs_mount.c	2011-09-13 10:55:20.748087866 -0400
+++ xfs/fs/xfs/xfs_mount.c	2011-09-13 10:56:19.108088343 -0400
@@ -44,9 +44,6 @@
 #include "xfs_trace.h"
 
 
-STATIC void	xfs_unmountfs_wait(xfs_mount_t *);
-
-
 #ifdef HAVE_PERCPU_SB
 STATIC void	xfs_icsb_balance_counter(xfs_mount_t *, xfs_sb_field_t,
 						int);
@@ -1496,11 +1493,6 @@ xfs_unmountfs(
 	 */
 	xfs_log_force(mp, XFS_LOG_SYNC);
 
-	xfs_binval(mp->m_ddev_targp);
-	if (mp->m_rtdev_targp) {
-		xfs_binval(mp->m_rtdev_targp);
-	}
-
 	/*
 	 * Unreserve any blocks we have so that when we unmount we don't account
 	 * the reserved free space as used. This is really only necessary for
@@ -1526,7 +1518,16 @@ xfs_unmountfs(
 		xfs_warn(mp, "Unable to update superblock counters. "
 				"Freespace may not be correct on next mount.");
 	xfs_unmountfs_writesb(mp);
-	xfs_unmountfs_wait(mp); 		/* wait for async bufs */
+
+	/*
+	 * Make sure all buffers have been flushed and completed before
+	 * unmounting the log.
+	 */
+	error = xfs_flush_buftarg(mp->m_ddev_targp, 1);
+	if (error)
+		xfs_warn(mp, "%d busy buffers during unmount.", error);
+	xfs_wait_buftarg(mp->m_ddev_targp);
+
 	xfs_log_unmount_write(mp);
 	xfs_log_unmount(mp);
 	xfs_uuid_unmount(mp);
@@ -1537,16 +1538,6 @@ xfs_unmountfs(
 	xfs_free_perag(mp);
 }
 
-STATIC void
-xfs_unmountfs_wait(xfs_mount_t *mp)
-{
-	if (mp->m_logdev_targp != mp->m_ddev_targp)
-		xfs_wait_buftarg(mp->m_logdev_targp);
-	if (mp->m_rtdev_targp)
-		xfs_wait_buftarg(mp->m_rtdev_targp);
-	xfs_wait_buftarg(mp->m_ddev_targp);
-}
-
 int
 xfs_fs_writable(xfs_mount_t *mp)
 {

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Code Changes [Review Request]: xfs_trans_ail_delete() gets invoked with corrupted values ? Causing Crash and Deadlock.
  2011-09-13 15:27 ` Code Changes [Review Request]: xfs_trans_ail_delete() gets invoked with corrupted values ? " Christoph Hellwig
@ 2011-09-13 15:58   ` Amit Sahrawat
  2011-09-13 16:04     ` Christoph Hellwig
  0 siblings, 1 reply; 6+ messages in thread
From: Amit Sahrawat @ 2011-09-13 15:58 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: xfs

Yes, that was not the correct fix. But since I am getting that
regularly I had to first get over that to avoid crash.
The changes you have mentioned below seems appropriate- I will check
and update you more on this by tomorrow.
But I tried to this flushing as part of xlog_dealloc_log - it didn't
work over there  - i will confirm this.

Thanks & Regards,
Amit Sahrawat

On Tue, Sep 13, 2011 at 8:57 PM, Christoph Hellwig <hch@infradead.org> wrote:
> Hmm, I don't think this is the correct fix.  We should never have
> buffers still around once we unmount the log.  Can you try the patch
> below?
>
> ---
> From: Christoph Hellwig <hch@lst.de>
> Subject: xfs: fix buffer flushing during unmount
>
> The code to flush buffers in the umount code is a bit iffy: we first flush
> all delwri buffers out, but then might be able to queue up a new one when
> logging the sb counts.  On a normal shutdown that one would get flushed
> out when doing the synchronous superblock write in xfs_unmountfs_writesb,
> but we skip that one if the filesystem has been shut down.
>
> Fix this by moving the delwri list flushing until just before unmounting
> the log, and while we're at it also remove the superflous delwri list
> and buffer lru flusing for the rt and log device that can never have
> cached or delwri buffers.
>
> Signed-off-by: Christoph Hellwig <hch@lst.de>
>
> Index: xfs/fs/xfs/xfs_mount.c
> ===================================================================
> --- xfs.orig/fs/xfs/xfs_mount.c 2011-09-13 10:55:20.748087866 -0400
> +++ xfs/fs/xfs/xfs_mount.c      2011-09-13 10:56:19.108088343 -0400
> @@ -44,9 +44,6 @@
>  #include "xfs_trace.h"
>
>
> -STATIC void    xfs_unmountfs_wait(xfs_mount_t *);
> -
> -
>  #ifdef HAVE_PERCPU_SB
>  STATIC void    xfs_icsb_balance_counter(xfs_mount_t *, xfs_sb_field_t,
>                                                int);
> @@ -1496,11 +1493,6 @@ xfs_unmountfs(
>         */
>        xfs_log_force(mp, XFS_LOG_SYNC);
>
> -       xfs_binval(mp->m_ddev_targp);
> -       if (mp->m_rtdev_targp) {
> -               xfs_binval(mp->m_rtdev_targp);
> -       }
> -
>        /*
>         * Unreserve any blocks we have so that when we unmount we don't account
>         * the reserved free space as used. This is really only necessary for
> @@ -1526,7 +1518,16 @@ xfs_unmountfs(
>                xfs_warn(mp, "Unable to update superblock counters. "
>                                "Freespace may not be correct on next mount.");
>        xfs_unmountfs_writesb(mp);
> -       xfs_unmountfs_wait(mp);                 /* wait for async bufs */
> +
> +       /*
> +        * Make sure all buffers have been flushed and completed before
> +        * unmounting the log.
> +        */
> +       error = xfs_flush_buftarg(mp->m_ddev_targp, 1);
> +       if (error)
> +               xfs_warn(mp, "%d busy buffers during unmount.", error);
> +       xfs_wait_buftarg(mp->m_ddev_targp);
> +
>        xfs_log_unmount_write(mp);
>        xfs_log_unmount(mp);
>        xfs_uuid_unmount(mp);
> @@ -1537,16 +1538,6 @@ xfs_unmountfs(
>        xfs_free_perag(mp);
>  }
>
> -STATIC void
> -xfs_unmountfs_wait(xfs_mount_t *mp)
> -{
> -       if (mp->m_logdev_targp != mp->m_ddev_targp)
> -               xfs_wait_buftarg(mp->m_logdev_targp);
> -       if (mp->m_rtdev_targp)
> -               xfs_wait_buftarg(mp->m_rtdev_targp);
> -       xfs_wait_buftarg(mp->m_ddev_targp);
> -}
> -
>  int
>  xfs_fs_writable(xfs_mount_t *mp)
>  {
>

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Code Changes [Review Request]: xfs_trans_ail_delete() gets invoked with corrupted values ? Causing Crash and Deadlock.
  2011-09-13 15:58   ` Amit Sahrawat
@ 2011-09-13 16:04     ` Christoph Hellwig
  2011-09-14  9:22       ` Amit Sahrawat
  0 siblings, 1 reply; 6+ messages in thread
From: Christoph Hellwig @ 2011-09-13 16:04 UTC (permalink / raw)
  To: Amit Sahrawat; +Cc: Christoph Hellwig, xfs

On Tue, Sep 13, 2011 at 09:28:40PM +0530, Amit Sahrawat wrote:
> Yes, that was not the correct fix. But since I am getting that
> regularly I had to first get over that to avoid crash.

No problem.  I'm more than happy about any kind of analsys and debugging
I get.

> The changes you have mentioned below seems appropriate- I will check
> and update you more on this by tomorrow.
> But I tried to this flushing as part of xlog_dealloc_log - it didn't
> work over there  - i will confirm this.

Hmm.  In that case this one might not work either, but I'd love to
see your results.

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Code Changes [Review Request]: xfs_trans_ail_delete() gets invoked with corrupted values ? Causing Crash and Deadlock.
  2011-09-13 16:04     ` Christoph Hellwig
@ 2011-09-14  9:22       ` Amit Sahrawat
  2011-09-14 13:28         ` Christoph Hellwig
  0 siblings, 1 reply; 6+ messages in thread
From: Amit Sahrawat @ 2011-09-14  9:22 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: xfs

The patch is working fine.No issues observed.
Thanks.

Regards,
Amit Sahrawat

On Tue, Sep 13, 2011 at 9:34 PM, Christoph Hellwig <hch@infradead.org> wrote:
> On Tue, Sep 13, 2011 at 09:28:40PM +0530, Amit Sahrawat wrote:
>> Yes, that was not the correct fix. But since I am getting that
>> regularly I had to first get over that to avoid crash.
>
> No problem.  I'm more than happy about any kind of analsys and debugging
> I get.
>
>> The changes you have mentioned below seems appropriate- I will check
>> and update you more on this by tomorrow.
>> But I tried to this flushing as part of xlog_dealloc_log - it didn't
>> work over there  - i will confirm this.
>
> Hmm.  In that case this one might not work either, but I'd love to
> see your results.
>
>

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Code Changes [Review Request]: xfs_trans_ail_delete() gets invoked with corrupted values ? Causing Crash and Deadlock.
  2011-09-14  9:22       ` Amit Sahrawat
@ 2011-09-14 13:28         ` Christoph Hellwig
  0 siblings, 0 replies; 6+ messages in thread
From: Christoph Hellwig @ 2011-09-14 13:28 UTC (permalink / raw)
  To: Amit Sahrawat; +Cc: Christoph Hellwig, xfs

On Wed, Sep 14, 2011 at 02:52:08PM +0530, Amit Sahrawat wrote:
> The patch is working fine.No issues observed.
> Thanks.

Ok, I'll send it in for Linux 3.2.

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2011-09-14 13:28 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2011-09-13 12:07 Code Changes [Review Request]: xfs_trans_ail_delete() gets invoked with corrupted values – Causing Crash and Deadlock Amit Sahrawat
2011-09-13 15:27 ` Code Changes [Review Request]: xfs_trans_ail_delete() gets invoked with corrupted values ? " Christoph Hellwig
2011-09-13 15:58   ` Amit Sahrawat
2011-09-13 16:04     ` Christoph Hellwig
2011-09-14  9:22       ` Amit Sahrawat
2011-09-14 13:28         ` Christoph Hellwig

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.