* possible xfs corruption
@ 2010-12-07 11:49 blacknred
  2010-12-07 14:10 ` Emmanuel Florac
  2010-12-07 22:10 ` Dave Chinner
  0 siblings, 2 replies; 13+ messages in thread
From: blacknred @ 2010-12-07 11:49 UTC
  To: xfs


Hi...

I'm stuck with a storage issue on reboot. I initially suspected the storage,
but dmesg throws these errors, so now I'm wondering whether this is a
filesystem issue. Any thoughts as to what's going on here?


XFS: failed to locate log tail
XFS: log mount/recovery failed: error 117
XFS: log mount failed
XFS mounting filesystem cciss/c0d0
Filesystem "cciss/c0d0": XFS internal error xlog_clear_stale_blocks(2) at
line 1237 of file
/home/buildsvn/rpmbuild/BUILD/xfs-kmod-0.4/_kmod_build_PAE/xfs_log_recover.c. 
Caller 0xf8aa7892
 [<f8aa774e>] xlog_find_tail+0x91b/0xa4a [xfs]
 [<f8aa7892>] xlog_recover+0x15/0x20b [xfs]
 [<f8aa7892>] xlog_recover+0x15/0x20b [xfs]
 [<f8ab9000>] xfs_buf_get_empty+0x2a/0x33 [xfs]
 [<f8aa36e5>] xfs_log_mount+0x467/0x4ab [xfs]
 [<f8ab702d>] kmem_zalloc+0xd/0x34 [xfs]
 [<f8aa9d8a>] xfs_mountfs+0x9f5/0xdfb [xfs]
 [<f8a92287>] xfs_fs_vcmn_err+0x5f/0x83 [xfs]
 [<f8abeea5>] xfs_mountfs_check_barriers+0xb8/0xd0 [xfs]
 [<f8ab02d2>] xfs_mount+0x75e/0x82a [xfs]
 [<f8aafb74>] xfs_mount+0x0/0x82a [xfs]
 [<f8abf57f>] vfs_mount+0x17/0x1a [xfs]
 [<f8abf42b>] xfs_fs_fill_super+0x68/0x1a5 [xfs]
 [<c04ef1a0>] snprintf+0x1c/0x1f
 [<c04a9b23>] disk_name+0x56/0x60
 [<c047b131>] get_sb_bdev+0xc6/0x113
 [<f8abe9d3>] xfs_fs_get_sb+0x12/0x16 [xfs]
 [<f8abf3c3>] xfs_fs_fill_super+0x0/0x1a5 [xfs]
 [<c047abf6>] vfs_kern_mount+0x7d/0xf2
 [<c047ac9d>] do_kern_mount+0x25/0x36
 [<c048e6aa>] do_mount+0x5fb/0x66b
 [<c04c63ab>] avc_has_perm+0x3c/0x46
 [<c04c63ab>] avc_has_perm+0x3c/0x46
 [<c04c7c1e>] selinux_inode_alloc_security+0x50/0x7b
 [<c04c7619>] inode_doinit_with_dentry+0x8a/0x495
 [<c045ad0a>] get_page_from_freelist+0x96/0x370
 [<c045b04d>] __alloc_pages+0x69/0x2cf
 [<c048d597>] copy_mount_options+0x26/0x109
 [<c048e787>] sys_mount+0x6d/0xa5
 [<c0404ead>] sysenter_past_esp+0x56/0x79

Thanks in advance
David

-- 
View this message in context: http://old.nabble.com/possible-xfs-corruption-tp30395558p30395558.html
Sent from the Xfs - General mailing list archive at Nabble.com.

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs


* Re: possible xfs corruption
  2010-12-07 11:49 possible xfs corruption blacknred
@ 2010-12-07 14:10 ` Emmanuel Florac
  2010-12-07 15:11   ` blacknred
  2010-12-07 22:10 ` Dave Chinner
  1 sibling, 1 reply; 13+ messages in thread
From: Emmanuel Florac @ 2010-12-07 14:10 UTC
  To: blacknred; +Cc: xfs

On Tue, 7 Dec 2010 03:49:49 -0800 (PST)
blacknred <leo1783@hotmail.co.uk> wrote:

> I'm stuck with a storage issue on reboot. Initially doubted the
> storage, but dmesg throws these errors. Now wondering whether this is
> a fs issue? Any thoughts as to what's going on here?
> 

It looks like part of the filesystem is physically missing. What is
the underlying device? Apparently it's a Smart Array in some HP server.
Did you have a power failure?

-- 
------------------------------------------------------------------------
Emmanuel Florac     |   Direction technique
                    |   Intellique
                    |	<eflorac@intellique.com>
                    |   +33 1 78 94 84 02
------------------------------------------------------------------------


* Re: possible xfs corruption
  2010-12-07 14:10 ` Emmanuel Florac
@ 2010-12-07 15:11   ` blacknred
  2010-12-07 15:42     ` Emmanuel Florac
  2010-12-07 19:32     ` Michael Monnerie
  0 siblings, 2 replies; 13+ messages in thread
From: blacknred @ 2010-12-07 15:11 UTC
  To: xfs


Yes, it's a Smart Array in an HP ProLiant server.
No power failure; I just upgraded the controller firmware and rebooted the
server.



Emmanuel Florac wrote:
> 
> On Tue, 7 Dec 2010 03:49:49 -0800 (PST)
> blacknred <leo1783@hotmail.co.uk> wrote:
> 
>> I'm stuck with a storage issue on reboot. Initially doubted the
>> storage, but dmesg throws these errors. Now wondering whether this is
>> a fs issue? Any thoughts as to what's going on here?
>> 
> 
> It looks like part of the filesystem is physically missing. What is
> the underlying device? Apparently it's a Smart Array in some HP server.
> Did you have a power failure?

* Re: possible xfs corruption
  2010-12-07 15:11   ` blacknred
@ 2010-12-07 15:42     ` Emmanuel Florac
  2010-12-07 17:30       ` blacknred
  2010-12-07 19:32     ` Michael Monnerie
  1 sibling, 1 reply; 13+ messages in thread
From: Emmanuel Florac @ 2010-12-07 15:42 UTC
  To: blacknred; +Cc: xfs

On Tue, 7 Dec 2010 07:11:04 -0800 (PST)
blacknred <leo1783@hotmail.co.uk> wrote:

> Yes, It's a Smart Array in HP Proliant Server.
> No power failure, just upgraded the controller firmware and rebooted
> the server.....
> 

Uh oh, that stinks. Nothing in the firmware release notes? Do you have
any other filesystems on this array?

First, you could try to back up the filesystem metadata using

xfs_metadump -o -w /dev/DEVICE outfile.meta

It can be useful later in case things turn bad.
Does it output any errors? I/O errors in particular?

-- 
------------------------------------------------------------------------
Emmanuel Florac     |   Direction technique
                    |   Intellique
                    |	<eflorac@intellique.com>
                    |   +33 1 78 94 84 02
------------------------------------------------------------------------


* Re: possible xfs corruption
  2010-12-07 15:42     ` Emmanuel Florac
@ 2010-12-07 17:30       ` blacknred
  2010-12-07 18:15         ` Stan Hoeppner
  0 siblings, 1 reply; 13+ messages in thread
From: blacknred @ 2010-12-07 17:30 UTC
  To: xfs


It's all XFS, and I didn't see any I/O errors either, except that dmesg
is flooded with traces similar to the one I posted.
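
When dmesg is flooded with traces like this, it can help to collapse them
into one line per distinct error site before posting. The following is a
minimal Python sketch, not an XFS tool: the function name
summarize_xfs_errors and the regular expression are illustrative
assumptions, based only on the message format visible in the logs quoted
in this thread.

```python
import re
from collections import Counter

# Matches reports such as:
#   Filesystem "cciss/c0d0": XFS internal error xlog_clear_stale_blocks(2) at
#   line 1237 of file ...
# \s+ tolerates the line wrapping introduced when dmesg output is mailed.
XFS_ERROR_RE = re.compile(
    r'Filesystem "(?P<dev>[^"]+)": XFS internal error\s+'
    r'(?P<site>\w+\(\d+\))\s+at\s+line\s+(?P<line>\d+)'
)


def summarize_xfs_errors(log_text):
    """Count XFS internal error reports per (device, error site, source line)."""
    return Counter(
        (m.group("dev"), m.group("site"), int(m.group("line")))
        for m in XFS_ERROR_RE.finditer(log_text)
    )


# Small demonstration on text copied from the report above.
sample = (
    'Filesystem "cciss/c0d0": XFS internal error xlog_clear_stale_blocks(2) at\n'
    'line 1237 of file\n'
) * 3

for (dev, site, line), count in summarize_xfs_errors(sample).most_common():
    print(count, dev, site, "line", line)
```

In practice the text would come from saved dmesg output rather than the
inline sample; if every entry reports the same site and line, a single
trace is enough for the list.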

Emmanuel Florac wrote:
> 
> On Tue, 7 Dec 2010 07:11:04 -0800 (PST)
> blacknred <leo1783@hotmail.co.uk> wrote:
> 
>> Yes, It's a Smart Array in HP Proliant Server.
>> No power failure, just upgraded the controller firmware and rebooted
>> the server.....
>> 
> 
> Uh oh, that stinks. Nothing in the firmware release notes? Do you have
> any other filesystems on this array?
> 
> First, you could try to back up the filesystem metadata using
> 
> xfs_metadump -o -w /dev/DEVICE outfile.meta
> 
> It can be useful later in case things turn bad.
> Does it output any errors? I/O errors in particular?

* Re: possible xfs corruption
  2010-12-07 17:30       ` blacknred
@ 2010-12-07 18:15         ` Stan Hoeppner
  0 siblings, 0 replies; 13+ messages in thread
From: Stan Hoeppner @ 2010-12-07 18:15 UTC
  To: xfs

blacknred put forth on 12/7/2010 11:30 AM:
> 
> It's all xfs..... and didn't see any I/O errors as well...except that dmesg
> is flooded with similar traces to one i posted....

Out of the blue, one OP, with two Proliant servers having storage
problems, within hours of one another, the same day?

Long odds, that.

Did you walk a mile, flat footed, on deep shag carpet, while wearing a
wool sweater, in a room with zero humidity, then walk into the DC and
touch each of these servers, or something? ;)

-- 
Stan


* Re: possible xfs corruption
  2010-12-07 15:11   ` blacknred
  2010-12-07 15:42     ` Emmanuel Florac
@ 2010-12-07 19:32     ` Michael Monnerie
  1 sibling, 0 replies; 13+ messages in thread
From: Michael Monnerie @ 2010-12-07 19:32 UTC
  To: xfs; +Cc: blacknred


On Tuesday, 7 December 2010 blacknred wrote:
> Yes, It's a Smart Array in HP Proliant Server.
> No power failure, just upgraded the controller firmware and rebooted
> the server.....

Which server and controller is it, and which firmware version did you
upgrade from and to?

-- 
Kind regards,
Michael Monnerie, Ing. BSc

it-management Internet Services: Protéger
http://proteger.at [pronounced: Prot-e-schee]
Tel: +43 660 / 415 6531

// ****** Radio interview on the topic of spam ******
// http://www.it-podcast.at/archiv.html#podcast-100716
// 
// House for sale: http://zmi.at/langegg/


* Re: possible xfs corruption
  2010-12-07 11:49 possible xfs corruption blacknred
  2010-12-07 14:10 ` Emmanuel Florac
@ 2010-12-07 22:10 ` Dave Chinner
  2010-12-09  3:44   ` Eric Sandeen
  1 sibling, 1 reply; 13+ messages in thread
From: Dave Chinner @ 2010-12-07 22:10 UTC
  To: blacknred; +Cc: xfs

On Tue, Dec 07, 2010 at 03:49:49AM -0800, blacknred wrote:
> 
> Hi...
> 
> I'm stuck with a storage issue on reboot. Initially doubted the storage, but
> dmesg throws these errors. Now wondering whether this is a fs issue? Any
> thoughts as to what's going on here?
> 
> 
> XFS: failed to locate log tail
> XFS: log mount/recovery failed: error 117
> XFS: log mount failed
> XFS mounting filesystem cciss/c0d0
> Filesystem "cciss/c0d0": XFS internal error xlog_clear_stale_blocks(2) at
> line 1237 of file

Which indicates that the head and/or the tail of the log are not
valid. Can you provide the output of:

# xfs_logprint -d /dev/cciss/c0d0

So we can see what the head/tail values are in the log?
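
As background for the "error 117" in the quoted log: on Linux, errno 117
is EUCLEAN ("Structure needs cleaning"), which, as far as I know, Linux
XFS uses as its filesystem-corruption code (EFSCORRUPTED). A minimal
Python sketch for decoding such a number follows; describe_errno is a
hypothetical helper name, and errno numbering is platform-specific, so the
result is only meaningful on a Linux host.

```python
import errno
import os


def describe_errno(code):
    """Return 'number = NAME: message' for an errno value on this platform."""
    # errno.errorcode maps numbers to symbolic names known on this platform.
    name = errno.errorcode.get(code, "UNKNOWN")
    return f"{code} = {name}: {os.strerror(code)}"


# The value reported by "XFS: log mount/recovery failed: error 117".
print(describe_errno(117))
```

This only names the error class. The head and tail values from
xfs_logprint are still needed to see what is actually wrong with the log.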

> /home/buildsvn/rpmbuild/BUILD/xfs-kmod-0.4/_kmod_build_PAE/xfs_log_recover.c. 

CentOS kernel? How old?

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com


* Re: possible xfs corruption
  2010-12-07 22:10 ` Dave Chinner
@ 2010-12-09  3:44   ` Eric Sandeen
  0 siblings, 0 replies; 13+ messages in thread
From: Eric Sandeen @ 2010-12-09  3:44 UTC
  To: Dave Chinner; +Cc: blacknred, xfs

On 12/7/10 4:10 PM, Dave Chinner wrote:
> On Tue, Dec 07, 2010 at 03:49:49AM -0800, blacknred wrote:
>>
>> Hi...
>>
>> I'm stuck with a storage issue on reboot. Initially doubted the storage, but
>> dmesg throws these errors. Now wondering whether this is a fs issue? Any
>> thoughts as to what's going on here?
>>
>>
>> XFS: failed to locate log tail
>> XFS: log mount/recovery failed: error 117
>> XFS: log mount failed
>> XFS mounting filesystem cciss/c0d0
>> Filesystem "cciss/c0d0": XFS internal error xlog_clear_stale_blocks(2) at
>> line 1237 of file
> 
> Which indicates that the head and/or the tail of the log are not
> valid. Can you provide the output of:
> 
> # xfs_logprint -d /dev/cciss/c0d0
> 
> So we can see what the head/tail values are in the log?
> 
>> /home/buildsvn/rpmbuild/BUILD/xfs-kmod-0.4/_kmod_build_PAE/xfs_log_recover.c. 
> 
> CentOS kernel? How old?

Assuming it's CentOS 5, there's really no need to be using an xfs kmod there
anymore; the module shipped with the kernel in recent versions of the OS
is really the one you want to use.  That kmod is ancient and unmaintained.

Although I suspect the storage is more likely at fault here.  :)

-Eric


> Cheers,
> 
> Dave.


* Re: Possible XFS Corruption
  2004-08-02  4:20   ` Callan Tham
@ 2004-08-02  5:48     ` Nathan Scott
  0 siblings, 0 replies; 13+ messages in thread
From: Nathan Scott @ 2004-08-02  5:48 UTC
  To: Callan Tham; +Cc: linux-kernel, linux-xfs

On Mon, Aug 02, 2004 at 12:20:14PM +0800, Callan Tham wrote:
> On Mon, 2004-08-02 at 13:02, Nathan Scott wrote:
> > > I'm running a Gentoo-patched 2.6.7 kernel, and am experiencing possible
> > > XFS corruption on one of my partitions. I've included a sample of the
> > 
> > Is it reproducible with an unpatched kernel.org kernel?
> > 
> > thanks.
> 
> Hi Nathan,
> 
> Unfortunately, I am unable to test this with a vanilla kernel. However,

Oh?

> looking through the Gentoo patches, they did not touch any of the XFS
> code in a vanilla 2.6.7 kernel.

I would be surprised if they had.  A more likely source of
problems would be changes in the VM subsystem (XFS metadata
buffers are cached in the page cache).

> Is there any other way to diagnose this?

The failure you see is XFS reporting corruption in a directory
btree buffer which didn't have an appropriate magic number at
its start when read in from disk.  There are thousands of potential
reasons why that may have happened;  more often than not these
days it's an error that's occurred outside of XFS though, and XFS
is passing on the bad news.
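
To illustrate the kind of check that fails here, below is a minimal
Python sketch that classifies a raw directory/attribute block by its
magic number. It is not the kernel's code. The magic values and the header
offsets (a 32-bit magic at offset 0 for dir2 data-style blocks, a 16-bit
magic at offset 8 in the forw/back/magic block header for btree-style
blocks) are written from my recollection of the older (v4) XFS on-disk
format and should be treated as assumptions to verify against the kernel
headers. identify_dir_buffer is a hypothetical name.

```python
import struct

# ASSUMED magic numbers for the older (v4) XFS on-disk format; verify
# against the XFS headers of the kernel in question before relying on them.
DIR2_MAGICS = {  # 32-bit big-endian magic at offset 0
    0x58443242: "XFS_DIR2_BLOCK_MAGIC",  # ASCII "XD2B"
    0x58443244: "XFS_DIR2_DATA_MAGIC",   # ASCII "XD2D"
    0x58443246: "XFS_DIR2_FREE_MAGIC",   # ASCII "XD2F"
}
DA_BLKINFO_MAGICS = {  # 16-bit big-endian magic after forw and back (offset 8)
    0xFEBE: "XFS_DA_NODE_MAGIC",
    0xFBEE: "XFS_ATTR_LEAF_MAGIC",
    0xD2F1: "XFS_DIR2_LEAF1_MAGIC",
    0xD2FF: "XFS_DIR2_LEAFN_MAGIC",
}


def identify_dir_buffer(buf):
    """Return the name of a recognised magic, or None if the block looks bad."""
    if len(buf) < 12:
        raise ValueError("need at least 12 bytes of the block header")
    (magic32,) = struct.unpack_from(">I", buf, 0)
    (magic16,) = struct.unpack_from(">H", buf, 8)
    return DIR2_MAGICS.get(magic32) or DA_BLKINFO_MAGICS.get(magic16)


# A block of zeros carries no valid magic, which is the sort of buffer
# that would be reported as corrupt.
print(identify_dir_buffer(bytes(4096)))
```

A None result says only that the bytes read back were not a directory
block. It does not say whether the filesystem, the page cache, or the
storage put them there.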

If you can find a reproducible test case, you're half way there.
If you can find a reproducible test case on a kernel.org kernel,
you're 95% of the way there, cos then we can more easily help. ;)

cheers.

-- 
Nathan


* Re: Possible XFS Corruption
  2004-08-02  3:49 Possible XFS Corruption Callan Tham
@ 2004-08-02  5:02 ` Nathan Scott
  2004-08-02  4:20   ` Callan Tham
  0 siblings, 1 reply; 13+ messages in thread
From: Nathan Scott @ 2004-08-02  5:02 UTC
  To: Callan Tham; +Cc: linux-kernel

On Mon, Aug 02, 2004 at 11:49:05AM +0800, Callan Tham wrote:
> Hi list,
> 
> I'm running a Gentoo-patched 2.6.7 kernel, and am experiencing possible
> XFS corruption on one of my partitions. I've included a sample of the

Is it reproducible with an unpatched kernel.org kernel?

thanks.

-- 
Nathan


* Re: Possible XFS Corruption
  2004-08-02  5:02 ` Nathan Scott
@ 2004-08-02  4:20   ` Callan Tham
  2004-08-02  5:48     ` Nathan Scott
  0 siblings, 1 reply; 13+ messages in thread
From: Callan Tham @ 2004-08-02  4:20 UTC
  To: Nathan Scott; +Cc: linux-kernel

On Mon, 2004-08-02 at 13:02, Nathan Scott wrote:
> > I'm running a Gentoo-patched 2.6.7 kernel, and am experiencing possible
> > XFS corruption on one of my partitions. I've included a sample of the
> 
> Is it reproducible with an unpatched kernel.org kernel?
> 
> thanks.

Hi Nathan,

Unfortunately, I am unable to test this with a vanilla kernel. However,
looking through the Gentoo patches, they did not touch any of the XFS
code in a vanilla 2.6.7 kernel.

Is there any other way to diagnose this?

Thank you,

Callan


* Possible XFS Corruption
@ 2004-08-02  3:49 Callan Tham
  2004-08-02  5:02 ` Nathan Scott
  0 siblings, 1 reply; 13+ messages in thread
From: Callan Tham @ 2004-08-02  3:49 UTC
  To: linux-kernel

Hi list,

I'm running a Gentoo-patched 2.6.7 kernel, and am experiencing possible
XFS corruption on one of my partitions. I've included a sample of the
logs I've managed to see here:

Filesystem "hda1": XFS internal error xfs_da_do_buf(2) at line 2273 of
file fs/xfs/xfs_da_btree.c.  Caller 0xc02420ec
[<c0241b2f>] xfs_da_do_buf+0x41f/0x960
[<c02420ec>] xfs_da_read_buf+0x3c/0x40
[<c02420ec>] xfs_da_read_buf+0x3c/0x40
[<c02420ec>] xfs_da_read_buf+0x3c/0x40
[<c0249611>] xfs_dir2_leaf_lookup_int+0x41/0x280
[<c0249611>] xfs_dir2_leaf_lookup_int+0x41/0x280
[<c0132a80>] do_generic_mapping_read+0x180/0x3a0
[<c02340e9>] xfs_bmap_last_offset+0xa9/0x120
[<c0249540>] xfs_dir2_leaf_lookup+0x20/0xb0
[<c02441f2>] xfs_dir2_lookup+0x112/0x130
[<c0132ca0>] file_read_actor+0x0/0xe0
[<c0283687>] xfs_read+0x197/0x250
[<c02736dc>] xfs_dir_lookup_int+0x2c/0xe0
[<c02781e7>] xfs_lookup+0x37/0x70
[<c0282aaa>] linvfs_lookup+0x4a/0x90
[<c015700f>] real_lookup+0xaf/0xd0
[<c0157218>] do_lookup+0x68/0x80
[<c0157628>] link_path_walk+0x3f8/0x7d0
[<c0157c06>] path_lookup+0x66/0x110
[<c015489f>] open_exec+0x1f/0xe0
[<c015499c>] kernel_read+0x3c/0x50
[<c016f8cf>] load_elf_binary+0xaef/0xb70
[<c011bd3d>] mm_init+0x8d/0xc0
[<c0135d52>] buffered_rmqueue+0xd2/0x180
[<c0135ead>] __alloc_pages+0xad/0x340
[<c013605b>] __alloc_pages+0x25b/0x340
[<c0291628>] __copy_from_user_ll+0x58/0x60
[<c0155411>] search_binary_handler+0x51/0x1a0
[<c015570b>] do_execve+0x1ab/0x230
[<c01049de>] sys_execve+0x2e/0x60
[<c0105d07>] syscall_call+0x7/0xb

This is not the first time I have seen this error, and am wondering if
it is something anyone has experienced regularly. Any help in diagnosing
this problem is greatly appreciated.

Please CC me any replies, as I'm not subscribed to the list at this
address. Thanks in advance!

Callan

Thread overview: 13+ messages
2010-12-07 11:49 possible xfs corruption blacknred
2010-12-07 14:10 ` Emmanuel Florac
2010-12-07 15:11   ` blacknred
2010-12-07 15:42     ` Emmanuel Florac
2010-12-07 17:30       ` blacknred
2010-12-07 18:15         ` Stan Hoeppner
2010-12-07 19:32     ` Michael Monnerie
2010-12-07 22:10 ` Dave Chinner
2010-12-09  3:44   ` Eric Sandeen
  -- strict thread matches above, loose matches on Subject: below --
2004-08-02  3:49 Possible XFS Corruption Callan Tham
2004-08-02  5:02 ` Nathan Scott
2004-08-02  4:20   ` Callan Tham
2004-08-02  5:48     ` Nathan Scott
