From: Dave Chinner <david@fromorbit.com>
To: Nick Piggin <npiggin@kernel.dk>
Cc: xfs@oss.sgi.com
Subject: Re: XFS hang in xlog_grant_log_space
Date: Wed, 28 Jul 2010 00:58:09 +1000
Message-ID: <20100727145808.GQ7362@dastard>
In-Reply-To: <20100727133038.GP7362@dastard>

On Tue, Jul 27, 2010 at 11:30:38PM +1000, Dave Chinner wrote:
> On Tue, Jul 27, 2010 at 09:36:26PM +1000, Nick Piggin wrote:
> > On Tue, Jul 27, 2010 at 06:06:32PM +1000, Nick Piggin wrote:
> > On this same system, same setup (vanilla kernel with sha given below),
> > I have now twice reproduced a complete hang in XFS. I can give more
> > information, test patches or options etc if required.
> > 
> > setup.sh looks like this:
> > #!/bin/bash
> > modprobe rd rd_size=$[2*1024*1024]
> > dd if=/dev/zero of=/dev/ram0 bs=4K
> > mkfs.xfs -f -l size=64m -d agcount=16 /dev/ram0
> > mount -o delaylog,logbsize=262144,nobarrier /dev/ram0 mnt
> > 
> > The 'dd' is required to ensure rd driver does not allocate pages
> > during IO (which can lead to out of memory deadlocks). Running just
> > involves changing into mnt directory and
> > 
> > while true
> > do
> >   sync
> >   echo 3 > /proc/sys/vm/drop_caches
> >   ../dbench -c ../loadfiles/client.txt -t20 8
> >   rm -rf clients
> > done
> > 
> > And wait for it to hang (happened in < 5 minutes here)
> ....
> > Call Trace:
> >  [<ffffffff812361f8>] xlog_grant_log_space+0x158/0x3d0
> 
> It's waiting on log space to be freed up. Either there's an
> accounting problem (possible), or you've got an xfslogd/xfsaild
> spinning and not making progress completing log IOs or pushing the
> tail of the log. I'll see if I can reproduce it.

Ok, I've just reproduced it. From some tracing:

touch-3340  [004] 1844935.582716: xfs_log_reserve: dev 1:0 type CREATE t_ocnt 2 t_cnt 2 t_curr_res 167148 t_unit_res 167148 t_flags XLOG_TIC_INITED|XLOG_TIC_PERM_RESERV reserve_headq 0xffff88010f489c78 write_headq 0x(null) grant_reserve_cycle 314 grant_reserve_bytes 24250680 grant_write_cycle 314 grant_write_bytes 24250680 curr_cycle 314 curr_block 44137 tail_cycle 313 tail_block 48532

The key part here is this:

curr_cycle 314 curr_block 44137 tail_cycle 313 tail_block 48532

This says the tail of the log is roughly 62MB behind the head, i.e.
the log is full and we are waiting for tail pushing to write the
item holding the tail in place to disk so it can then be moved
forward. That's better than an accounting problem, at least.
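
For anyone wanting to check the arithmetic, here is a rough sketch of
the head/tail distance calculation. It assumes the block numbers in the
trace are 512-byte basic blocks and takes the 64MB log size from the
mkfs.xfs command line quoted above; the function name is made up for
illustration and is not a kernel interface.

```python
BBSIZE = 512                   # assumption: trace block numbers are 512-byte basic blocks
LOG_BYTES = 64 * 1024 * 1024   # from "mkfs.xfs -l size=64m" in setup.sh
LOG_BLOCKS = LOG_BYTES // BBSIZE


def log_space_used(curr_cycle, curr_block, tail_cycle, tail_block,
                   log_blocks=LOG_BLOCKS):
    """Return the bytes between the tail and the head of a circular log."""
    if curr_cycle == tail_cycle:
        # Head and tail are in the same pass over the log.
        blocks = curr_block - tail_block
    elif curr_cycle == tail_cycle + 1:
        # Head has wrapped once: tail..end of log, plus start of log..head.
        blocks = (log_blocks - tail_block) + curr_block
    else:
        raise ValueError("head must be at most one cycle ahead of the tail")
    if not 0 <= blocks <= log_blocks:
        raise ValueError("head/tail positions are inconsistent")
    return blocks * BBSIZE


used = log_space_used(curr_cycle=314, curr_block=44137,
                      tail_cycle=313, tail_block=48532)
print(f"{used} bytes = {used / 2**20:.1f} MiB of {LOG_BYTES // 2**20} MiB")
```

With the values from the trace that comes to 126677 blocks, or just
under 62MB of a 64MB log, which matches the figure above.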

So what is holding the tail in place? The first item on the AIL
appears to be:

xfsaild/ram0-2997  [000] 1844800.800764: xfs_buf_cond_lock: dev 1:0 bno 0x280120 len 0x2000 hold 3 pincount 0 lock 0 flags ASYNC|DONE|STALE|PAGE_CACHE caller xfs_buf_item_trylock

A stale buffer. Given that the next objects show this trace:

xfsaild/ram0-2997  [000] 1844800.800767: xfs_ilock_nowait: dev 1:0 ino 0x500241 flags ILOCK_SHARED caller xfs_inode_item_trylock
xfsaild/ram0-2997  [000] 1844800.800768: xfs_buf_rele: dev 1:0 bno 0x280120 len 0x2000 hold 4 pincount 0 lock 0 flags ASYNC|DONE|STALE|PAGE_CACHE caller _xfs_buf_find
xfsaild/ram0-2997  [000] 1844800.800769: xfs_iunlock: dev 1:0 ino 0x500241 flags ILOCK_SHARED caller xfs_inode_item_pushbuf

we see the next item on the AIL is an inode, but the trace is
followed by a release on the original buffer. That tells me the
inode is flush locked and it returned XFS_ITEM_PUSHBUF to push the
inode buffer out. That results in xfs_inode_item_pushbuf() being
called, and that tries to lock the inode buffer to flush it.
xfs_buf_rele is called if the trylock on the buffer fails.

IOWs, this looks to be another problem with inode cluster freeing.

Ok, so we can't flush the buffer because it is locked. Why is it
locked? Well, that is unclear as yet. None of the blocked processes
should be holding an inode buffer locked, and a stale buffer should
be unlocked during transaction commit and not live longer than
the log IO that writes the transaction to disk. That is, it should
not get locked again before everything is freed up.

That's as much as I can get from post-mortem analysis - I need to
capture a trace that spans the lockup to catch what happens
to the buffer that we are hung on. That will have to wait until the
morning....
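
As an aside, once a longer trace exists, something like the following
could pull out every event touching the stuck buffer. This is only a
sketch: it assumes the text layout of the trace lines shown above
(task, [cpu], timestamp, event name, then space-separated key/value
pairs), and the regex and function name are illustrative rather than
part of any tracing tool.

```python
import re

# Assumed layout: "task-pid  [cpu] timestamp: event: key value key value ..."
TRACE_RE = re.compile(
    r"^\s*(?P<task>\S+)\s+\[(?P<cpu>\d+)\]\s+(?P<ts>\d+\.\d+):\s+"
    r"(?P<event>\w+):\s+(?P<rest>.*)$"
)


def events_for_buffer(lines, bno):
    """Return (timestamp, task, event) for trace lines whose bno matches."""
    matches = []
    for line in lines:
        m = TRACE_RE.match(line)
        if m is None:
            continue  # not a trace event line
        fields = m.group("rest").split()
        if "bno" not in fields:
            continue  # event does not refer to a buffer
        idx = fields.index("bno")
        if idx + 1 < len(fields) and int(fields[idx + 1], 16) == bno:
            matches.append((m.group("ts"), m.group("task"), m.group("event")))
    return matches


# The four xfsaild events quoted earlier in this mail, trimmed to fit.
SAMPLE = [
    "xfsaild/ram0-2997  [000] 1844800.800764: xfs_buf_cond_lock: dev 1:0 bno 0x280120 len 0x2000 hold 3",
    "xfsaild/ram0-2997  [000] 1844800.800767: xfs_ilock_nowait: dev 1:0 ino 0x500241 flags ILOCK_SHARED",
    "xfsaild/ram0-2997  [000] 1844800.800768: xfs_buf_rele: dev 1:0 bno 0x280120 len 0x2000 hold 4",
    "xfsaild/ram0-2997  [000] 1844800.800769: xfs_iunlock: dev 1:0 ino 0x500241 flags ILOCK_SHARED",
]

for ts, task, event in events_for_buffer(SAMPLE, 0x280120):
    print(ts, task, event)
```

Run over a trace that spans the hang, the same filter would list who
locks bno 0x280120 and in what order.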

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com
