* [regression, 3.0-rc] xfs: freeze hang in 068
@ 2011-07-11  1:03 Dave Chinner
  2011-07-11  9:51 ` Christoph Hellwig
  0 siblings, 1 reply; 3+ messages in thread
From: Dave Chinner @ 2011-07-11  1:03 UTC (permalink / raw)
  To: hch; +Cc: xfs

Christoph,

The recent changes to the active transaction accounting to close a
race on freeze can hang the freeze process and hence the filesystem.

SysRq : Show Blocked State
  task                        PC stack   pid father
xfs_io          D ffff88005b7a0f00  5592 32539  32535 0x00020000
 ffff88005bc9dd88 0000000000000082 ffff88005bc9dd28 0000000000000296
 ffff88005bc9c010 ffff88005bff4fe0 0000000000010f80 ffff88005bc9dfd8
 ffff88005bc9dfd8 0000000000010f80 ffff88005cd1dfc0 ffff88005bff4fe0
Call Trace:
 [<ffffffff8150c184>] schedule_timeout+0x97/0xbb
 [<ffffffff81072766>] ? lock_timer_base+0x4d/0x4d
 [<ffffffff8150c1c1>] schedule_timeout_uninterruptible+0x19/0x1b
 [<ffffffff81269786>] xfs_quiesce_attr+0x1d/0x7f
 [<ffffffff81266bb2>] xfs_fs_freeze+0x20/0x2e
 [<ffffffff8110db00>] freeze_super+0x8b/0xca
 [<ffffffff81118abc>] do_vfs_ioctl+0x1d0/0x45c
 [<ffffffff812a97b7>] ? do_raw_spin_unlock+0x8f/0x98
 [<ffffffff81102846>] ? virt_to_head_page+0x9/0x2c
 [<ffffffff81143cb0>] compat_sys_ioctl+0x33c/0x368
 [<ffffffff8110a0f3>] ? do_sys_open+0xee/0x100
 [<ffffffff81514960>] sysenter_dispatch+0x7/0x2e

This is waiting for mp->m_active_trans to reach zero.

fsstress        D 0000000000000000  5376 32541  32540 0x00020000
 ffff88005b68dd48 0000000000000086 ffff88005b68dce8 ffffffff812a97b7
 ffff88005b68c010 ffff88005cf8d7d0 0000000000010f80 ffff88005b68dfd8
 ffff88005b68dfd8 0000000000010f80 ffffffff81a0b020 ffff88005cf8d7d0
Call Trace:
 [<ffffffff812a97b7>] ? do_raw_spin_unlock+0x8f/0x98
 [<ffffffff81080c73>] ? prepare_to_wait+0x71/0x7c
 [<ffffffff81262d5d>] xfs_file_aio_write+0x10a/0x245
 [<ffffffff81080a31>] ? wake_up_bit+0x25/0x25
 [<ffffffff8110b52b>] do_sync_write+0xc6/0x103
 [<ffffffff810eec9e>] ? handle_mm_fault+0xff/0x111
 [<ffffffff8110be64>] vfs_write+0xa9/0x105
 [<ffffffff8110b055>] ? vfs_llseek+0x2e/0x30
 [<ffffffff8110bf79>] sys_write+0x45/0x6c
 [<ffffffff81514960>] sysenter_dispatch+0x7/0x2e

This is waiting for the filesystem to unfreeze.

fsstress        D 0000000000000000  5040 32542  32540 0x00020000
 ffff88005b62fc78 0000000000000082 ffff88005b62fc18 ffffffff812a97b7
 ffff88005b62e010 ffff88005be9c000 0000000000010f80 ffff88005b62ffd8
 ffff88005b62ffd8 0000000000010f80 ffff88005cf8efa0 ffff88005be9c000
Call Trace:
 [<ffffffff812a97b7>] ? do_raw_spin_unlock+0x8f/0x98
 [<ffffffff81080c73>] ? prepare_to_wait+0x71/0x7c
 [<ffffffff81255843>] _xfs_trans_alloc+0x89/0xee
 [<ffffffff81080a31>] ? wake_up_bit+0x25/0x25
 [<ffffffff8125875b>] xfs_trans_alloc+0x13/0x15
 [<ffffffff8125abcb>] xfs_change_file_space+0x1f9/0x2f0
 [<ffffffff81122dda>] ? mntput+0x21/0x23
 [<ffffffff81113e7c>] ? path_put+0x1d/0x21
 [<ffffffff81263301>] xfs_ioc_space+0xc2/0xd3
 [<ffffffff81208f92>] xfs_file_compat_ioctl+0x2e1/0x49b
 [<ffffffff81122cc8>] ? mntput_no_expire+0x50/0x141
 [<ffffffff81122dda>] ? mntput+0x21/0x23
 [<ffffffff8110efd8>] ? vfs_fstat+0x3b/0x45
 [<ffffffff81143b15>] compat_sys_ioctl+0x1a1/0x368
 [<ffffffff81514960>] sysenter_dispatch+0x7/0x2e

This has an active transaction reference (i.e. keeping
mp->m_active_trans > 0) and is waiting for the freeze to complete.

Basically the problem is this:

thread 1				freeze
					SB_FREEZE_WRITE
					sync_filesystem()
					SB_FREEZE_TRANS
					->freeze
xfs_trans_alloc
  atomic_inc(mp->m_active_trans)
  wait on (SB_FREEZE_TRANS)
					xfs_quiesce_attr()
					  while (mp->m_active_trans > 0)
						delay(1);

So effectively we cannot sleep waiting for SB_FREEZE_TRANS to go away
while holding an active transaction reference because the freeze
process does not set and check SB_FREEZE_TRANS/mp->m_active_trans
atomically.
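
The interleaving in the diagram above can be replayed as a small,
single-threaded model. This is only a sketch under stated assumptions,
not the kernel code: struct model, the enum values and all helper names
are hypothetical stand-ins for sb->s_frozen and mp->m_active_trans, and
the two predicates simply encode the wait conditions described above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins, not the kernel's real types. */
enum freeze_level { UNFROZEN, FREEZE_WRITE, FREEZE_TRANS };

struct model {
	enum freeze_level frozen;	/* models sb->s_frozen */
	int active_trans;		/* models mp->m_active_trans */
};

/* Transaction side, step 1: take the reference before waiting. */
static void trans_take_ref(struct model *m)
{
	m->active_trans++;
}

/* Transaction side, step 2: may continue only below FREEZE_TRANS. */
static bool trans_can_proceed(const struct model *m)
{
	return m->frozen < FREEZE_TRANS;
}

/* Freeze side: the quiesce loop exits only with no active transactions. */
static bool quiesce_can_finish(const struct model *m)
{
	return m->active_trans == 0;
}

/* Replays the diagram's ordering; returns true if both sides are stuck. */
bool replay_hang(void)
{
	struct model m = { UNFROZEN, 0 };

	m.frozen = FREEZE_WRITE;
	/* sync_filesystem() would run here */
	m.frozen = FREEZE_TRANS;

	/* thread 1 takes its reference after the level was raised */
	trans_take_ref(&m);
	assert(m.active_trans == 1);

	/*
	 * Thread 1 waits for the freeze level to drop, the freezer waits
	 * for the counter to drop. In this model neither side can change
	 * the state the other one is waiting on.
	 */
	return !trans_can_proceed(&m) && !quiesce_can_finish(&m);
}
```

Calling replay_hang() from a test harness returns true in this model:
the reference is held while waiting on the freeze level, and the
freezer is polling the counter, which matches the two blocked traces
above.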

I haven't put any thought into how to solve this problem yet, so I'd
suggest that at this late stage we need to revert 315fdfa (xfs: fix
filesystsem freeze race in xfs_trans_alloc) because the race it
fixes is far less critical (i.e. doesn't hang the filesystem) and
harder to hit than the regression introduced here.

I've reproduced this a couple of times now on a 1p/1.5GB x86_64
kernel/i686 userspace VM.

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

* Re: [regression, 3.0-rc] xfs: freeze hang in 068
  2011-07-11  1:03 [regression, 3.0-rc] xfs: freeze hang in 068 Dave Chinner
@ 2011-07-11  9:51 ` Christoph Hellwig
  2011-07-12  0:46   ` Dave Chinner
  0 siblings, 1 reply; 3+ messages in thread
From: Christoph Hellwig @ 2011-07-11  9:51 UTC (permalink / raw)
  To: Dave Chinner; +Cc: hch, xfs

On Mon, Jul 11, 2011 at 11:03:57AM +1000, Dave Chinner wrote:
> Christoph,
> 
> The recent changes to the active transaction accounting to close a
> race on freeze can hang the freeze process and hence the filesystem.

That commit isn't in 3.0-rc, but I guess it's just the subject line
that is incorrect.

> So effectively we cannot sleep waiting for SB_FREEZE_TRANS to go away
> while holding an active transaction reference because the freeze
> process does not set and check SB_FREEZE_TRANS/mp->m_active_trans
> atomically.
> 
> I haven't put any thought into how to solve this problem yet, so I'd
> suggest that at this late stage we need to revert 315fdfa (xfs: fix
> filesystsem freeze race in xfs_trans_alloc) because the race it
> fixes is far less critical (i.e. doesn't hang the filesystem) and
> harder to hit than the regression introduced here.

Yes, I guess we need to revert it for now.

* Re: [regression, 3.0-rc] xfs: freeze hang in 068
  2011-07-11  9:51 ` Christoph Hellwig
@ 2011-07-12  0:46   ` Dave Chinner
  0 siblings, 0 replies; 3+ messages in thread
From: Dave Chinner @ 2011-07-12  0:46 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: hch, xfs

On Mon, Jul 11, 2011 at 05:51:47AM -0400, Christoph Hellwig wrote:
> On Mon, Jul 11, 2011 at 11:03:57AM +1000, Dave Chinner wrote:
> > Christoph,
> > 
> > The recent changes to the active transaction accounting to close a
> > race on freeze can hang the freeze process and hence the filesystem.
> 
> That commit isn't in 3.0-rc, but I guess it's just the subject line
> that is incorrect.

Ah, yes. I'd merged everything into a common test tree based on
3.0-rc6, and then forgot I'd merged all the pending 3.1 stuff
when I'd tracked down the bug and wrote the email....

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com
