linux-kernel.vger.kernel.org archive mirror
* BUG: XFS internal error xfs_trans_cancel in 2.6.27
@ 2008-10-09 10:36 Sean Purdy
  2008-10-09 23:02 ` Dave Chinner
  0 siblings, 1 reply; 5+ messages in thread
From: Sean Purdy @ 2008-10-09 10:36 UTC (permalink / raw)
  To: Linux Kernel Mailing List

Hi,


Further to the discussion (and patching) of an xfs_trans_cancel
issue in June, in kernel < 2.6.26

A similar issue came up on one disk of a 4 x 750GiB machine
with a 2.6.24 kernel.  So I installed 2.6.27-6 and gave it another try.
But I'm still seeing the same problem.  Remounting the drive each time
is fine, and xfs_check shows no errors.

The issue is reproducible, within a few minutes of marking the device
writable in our distributed file system (MogileFS).

There were no memory use issues and a memcheck test passed.

The disk in question is 94% full and has previously been at a
similar utilisation before dropping to around 60% and climbing back up.
Stored file sizes range from 1KB to 1GB.

So it could be a fragmentation issue.  But then the other three disks
on that machine have had a similar history.

Frustratingly, I then mounted the disk read-write elsewhere on
the same machine and copied a range of files to it, from 7672 bytes
to 800MB, and those copied fine.  Then I reintroduced the disk into the
MogileFS system and the issue recurred within a few minutes.
We're using lighttpd to read and write the files for the mogile system.

Output from df and dmesg below.


Disk is /dev/sdd1
Filesystem           1K-blocks      Used Available Use% Mounted on
/dev/sda10           611077760 579544432  31533328  95% /var/mogdata/dev182
/dev/sdb10           611077760 583449400  27628360  96% /var/mogdata/dev183
/dev/sdc1            732272128 686380752  45891376  94% /var/mogdata/dev184
/dev/sdd1            732272128 684888328  47383800  94% /var/mogdata/dev185

Filesystem            Inodes   IUsed   IFree IUse% Mounted on
/dev/sda10           126217920   76222 126141698    1% /var/mogdata/dev182
/dev/sdb10           110599968   78265 110521703    1% /var/mogdata/dev183
/dev/sdc1            183656960   82848 183574112    1% /var/mogdata/dev184
/dev/sdd1            189625760   72442 189553318    1% /var/mogdata/dev185


             total       used       free     shared    buffers     cached
Mem:       2071708    1240212     831496          0         40    1097796
-/+ buffers/cache:     142376    1929332
Swap:      1951800         56    1951744


[142880.364261] Filesystem "sdd1": XFS internal error xfs_trans_cancel at line 1164 of file /build/buildd/linux-2.6.27/fs/xfs/xfs_trans.c.  Caller 0xf8b0bd50
[142880.364305] Pid: 17672, comm: lighttpd Not tainted 2.6.27-6-server #1
[142880.364325]  [<f8ae4b03>] xfs_error_report+0x53/0x60 [xfs]
[142880.364369]  [<f8b0bd50>] ? xfs_mkdir+0x2d0/0x470 [xfs]
[142880.364395]  [<f8b05472>] xfs_trans_cancel+0xd2/0xf0 [xfs]
[142880.364423]  [<f8b0bd50>] ? xfs_mkdir+0x2d0/0x470 [xfs]
[142880.364447]  [<f8b0bd50>] xfs_mkdir+0x2d0/0x470 [xfs]
[142880.364483]  [<f8b173f7>] xfs_vn_mknod+0x1e7/0x290 [xfs]
[142880.364506]  [<f8b174ba>] xfs_vn_mkdir+0x1a/0x20 [xfs]
[142880.364520]  [<c01c5a16>] vfs_mkdir+0xa6/0x100
[142880.364526]  [<c038d88d>] ? _spin_lock+0xd/0x10
[142880.364532]  [<c01c790e>] sys_mkdirat+0xce/0xe0
[142880.364535]  [<c01bc04b>] ? fsnotify_access+0x6b/0x80
[142880.364540]  [<c01bcceb>] ? vfs_read+0xab/0x110
[142880.364543]  [<c01c7945>] sys_mkdir+0x25/0x30
[142880.364545]  [<c0109f03>] sysenter_do_call+0x12/0x2f
[142880.364552]  =======================
[142880.364556] xfs_force_shutdown(sdd1,0x8) called from line 1165 of file /build/buildd/linux-2.6.27/fs/xfs/xfs_trans.c.  Return address = 0xf8b0548a
[142880.364566] Filesystem "sdd1": Corruption of in-memory data detected.  Shutting down filesystem: sdd1
[142880.364589] Please umount the filesystem, and rectify the problem(s)
[142907.600040] Filesystem "sdd1": xfs_log_force: error 5 returned.


Thanks,

Sean


* Re: BUG: XFS internal error xfs_trans_cancel in 2.6.27
  2008-10-09 10:36 BUG: XFS internal error xfs_trans_cancel in 2.6.27 Sean Purdy
@ 2008-10-09 23:02 ` Dave Chinner
  2008-10-10  9:22   ` Sean Purdy
  2008-10-16 14:17   ` Sean Purdy
  0 siblings, 2 replies; 5+ messages in thread
From: Dave Chinner @ 2008-10-09 23:02 UTC (permalink / raw)
  To: Sean Purdy; +Cc: Linux Kernel Mailing List

On Thu, Oct 09, 2008 at 11:36:10AM +0100, Sean Purdy wrote:
> Hi,
> 
> 
> Further to the discussion (and patching) of an xfs_trans_cancel
> issue in June, in kernel < 2.6.26
> 
> A similar issue came up on one disk of a 4 x 750GiB machine
> with a 2.6.24 kernel.  So I installed 2.6.27-6 and gave it another try.
> But I'm still seeing the same problem.  Remounting the drive each time
> is fine, and xfs_check shows no errors.

2.6.27-6? You mean 2.6.27-rc6?

Anyway, you need to try this patch:

http://oss.sgi.com/archives/xfs/2008-10/msg00105.html

which I posted a few days ago that fixes the latest reproducible
case of this shutdown that I know of.

If that patch doesn't fix it, what you need to do is get the
filesystem into a state where this sequence of commands can be
repeated:

# mount <dev> <mntpt>
# cd <mntpt>/some/dir/in/fs/where/error/is/occurring
# touch <somefile>
[ filesystem shuts down ]

At that point, if you provide me with an xfs_metadump image and
the exact commands to reproduce the shutdown, I will be able to
find the problem and patch it pretty quickly.
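
As a rough sketch of the sequence above followed by the metadump capture
(not something posted in this thread): the device, mount point, directory,
output path and the file name "testfile" are all placeholders, and the
DRY_RUN switch is an assumption added so the commands can be reviewed
before anything touches the disk. xfs_metadump is run after unmounting
because it expects the filesystem to be unmounted or mounted read-only.

```shell
#!/bin/sh
# Sketch only: replay the reproduction sequence, then capture a metadata image.
# All four arguments are placeholders supplied by the caller.
# DRY_RUN defaults to 1, so commands are printed rather than executed.

run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"          # show the command that would run
    else
        "$@"               # actually run it (needs root)
    fi
}

reproduce_and_dump() {
    dev=$1 mnt=$2 dir=$3 out=$4
    run mount "$dev" "$mnt"
    run touch "$mnt/$dir/testfile"   # hypothetical file name; the step expected to trigger the shutdown
    run umount "$mnt"                # xfs_metadump wants the fs unmounted or read-only
    run xfs_metadump "$dev" "$out"   # copies metadata only, not file contents
}

# Run only when exactly four arguments are given.
if [ "$#" -eq 4 ]; then
    reproduce_and_dump "$@"
fi
```

Saved under a hypothetical name such as repro.sh, running
"sh repro.sh /dev/sdX1 /mnt/test some/dir /tmp/sdX1.metadump" only prints
the four commands; setting DRY_RUN=0 and running as root executes them.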

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com


* Re: BUG: XFS internal error xfs_trans_cancel in 2.6.27
  2008-10-09 23:02 ` Dave Chinner
@ 2008-10-10  9:22   ` Sean Purdy
  2008-10-16 14:17   ` Sean Purdy
  1 sibling, 0 replies; 5+ messages in thread
From: Sean Purdy @ 2008-10-10  9:22 UTC (permalink / raw)
  To: david; +Cc: Linux Kernel Mailing List

On Fri, 10 Oct 2008, Dave Chinner said:
> On Thu, Oct 09, 2008 at 11:36:10AM +0100, Sean Purdy wrote:
> > A similar issue came up on one disk of a 4 x 750GiB machine
> > with a 2.6.24 kernel.  So I installed 2.6.27-6 and gave it another try.
> 
> 2.6.27-6? You mean 2.6.27-rc6?

Yes, that's just how the Ubuntu maintainer packaged it.
 
> Anyway, you need to try this patch:

Will do.

Sean


* Re: BUG: XFS internal error xfs_trans_cancel in 2.6.27
  2008-10-09 23:02 ` Dave Chinner
  2008-10-10  9:22   ` Sean Purdy
@ 2008-10-16 14:17   ` Sean Purdy
  2008-10-16 22:56     ` Dave Chinner
  1 sibling, 1 reply; 5+ messages in thread
From: Sean Purdy @ 2008-10-16 14:17 UTC (permalink / raw)
  To: Linux Kernel Mailing List; +Cc: david

On Fri, 10 Oct 2008, Dave Chinner said:
> On Thu, Oct 09, 2008 at 11:36:10AM +0100, Sean Purdy wrote:
> > Hi,
> > 
> > Further to the discussion (and patching) of an xfs_trans_cancel
> > issue in June, in kernel < 2.6.26
> > 
> > A similar issue came up on one disk of a 4 x 750GiB machine
> > with a 2.6.24 kernel.  So I installed 2.6.27-6 and gave it another try.
> > But I'm still seeing the same problem.  Remounting the drive each time
> > is fine, and xfs_check shows no errors.
> 
> 2.6.27-6? You mean 2.6.27-rc6?

I meant rc9 as it happens, but never mind.
 
> Anyway, you need to try this patch:
> 
> http://oss.sgi.com/archives/xfs/2008-10/msg00105.html
> 
> which I posted a few days ago that fixes the latest reproducible
> case of this shutdown that I know of.

This patch worked fine, thanks very much!  Can we get it into
some stable version of 2.6.27?  Maybe I can put in a bug request
for the next Ubuntu kernel.


Sean


* Re: BUG: XFS internal error xfs_trans_cancel in 2.6.27
  2008-10-16 14:17   ` Sean Purdy
@ 2008-10-16 22:56     ` Dave Chinner
  0 siblings, 0 replies; 5+ messages in thread
From: Dave Chinner @ 2008-10-16 22:56 UTC (permalink / raw)
  To: Sean Purdy; +Cc: Linux Kernel Mailing List, xfs

On Thu, Oct 16, 2008 at 03:17:52PM +0100, Sean Purdy wrote:
> On Fri, 10 Oct 2008, Dave Chinner said:
> > On Thu, Oct 09, 2008 at 11:36:10AM +0100, Sean Purdy wrote:
> > > Hi,
> > > 
> > > Further to the discussion (and patching) of an xfs_trans_cancel
> > > issue in June, in kernel < 2.6.26
> > > 
> > > A similar issue came up on one disk of a 4 x 750GiB machine
> > > with a 2.6.24 kernel.  So I installed 2.6.27-6 and gave it another try.
> > > But I'm still seeing the same problem.  Remounting the drive each time
> > > is fine, and xfs_check shows no errors.
> > 
> > 2.6.27-6? You mean 2.6.27-rc6?
> 
> I meant rc9 as it happens, but never mind.
>  
> > Anyway, you need to try this patch:
> > 
> > http://oss.sgi.com/archives/xfs/2008-10/msg00105.html
> > 
> > which I posted a few days ago that fixes the latest reproducible
> > case of this shutdown that I know of.
> 
> This patch worked fine, thanks very much!

Great. It's good to know that multiple people have been hitting
this specific problem.

> Can we get it into
> some stable version of 2.6.27?  Maybe I can put in a bug request
> for the next Ubuntu kernel.

We haven't pushed it upstream for .28-rc1 yet as it is still being
QA'd. Once it is pushed we can consider it for .27-stable.

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

