* xfs corruption
@ 2015-09-03 11:09 Danny Shavit
  2015-09-03 13:22 ` Eric Sandeen
  0 siblings, 1 reply; 18+ messages in thread
From: Danny Shavit @ 2015-09-03 11:09 UTC (permalink / raw)
  To: xfs; +Cc: Alex Lyakas


[-- Attachment #1.1: Type: text/plain, Size: 632 bytes --]

Hi Dave,

We have a couple more xfs corruption cases that we would like to share:

1. This is an interesting one, since xfs reported corruption but when
running xfs_repair, no error was found.
Attached is the kernel log section regarding the corruption (6458).
Does xfs_repair explicitly read data from the disk? If so, it might be a
memory corruption. Are you familiar with such cases?

2. xfs corruption occurred suddenly with no apparent external event.
Attached are the xfs_repair and kernel logs.
The xfs metadump can be found at:
https://zadarastorage-public.s3.amazonaws.com/xfs/82.metadump.gz
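
For reference, the dump was produced with xfs_metadump before any repair was
attempted, roughly along these lines (device name as in the attached logs):

# xfs_metadump /dev/dm-82 82.metadump
# gzip 82.metadump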




-- 
Thanks,
Danny Shavit
ZadaraStorage

[-- Attachment #1.2: Type: text/html, Size: 1015 bytes --]

[-- Attachment #2: 6458-kernel.log --]
[-- Type: application/octet-stream, Size: 2688 bytes --]

The XFS volumes then entered a corrupted state:

Aug 27 01:01:34 vsa-0000014e-vc-0 kernel: [3507105.307743] XFS (dm-39): Internal error xfs_allocbt_verify at line 330 of file /mnt/share/builds/14.11--3.8.13-030813-generic/2015-04-29_10-45-42--14.11-1601-124/src/zadara-btrfs/fs/xfs/xfs_alloc_btree.c.  Caller 0xffffffffa064e9ce
Aug 27 01:01:34 vsa-0000014e-vc-0 kernel: [3507105.307743]
Aug 27 01:01:34 vsa-0000014e-vc-0 kernel: [3507105.314446] Pid: 25231, comm: kworker/0:0H Tainted: GF       W  O 3.8.13-030813-generic #201305111843
Aug 27 01:01:34 vsa-0000014e-vc-0 kernel: [3507105.314449] Call Trace:
Aug 27 01:01:34 vsa-0000014e-vc-0 kernel: [3507105.314487]  [<ffffffffa0631baf>] xfs_error_report+0x3f/0x50 [xfs]
Aug 27 01:01:34 vsa-0000014e-vc-0 kernel: [3507105.314502]  [<ffffffffa064e9ce>] ? xfs_allocbt_read_verify+0xe/0x10 [xfs]
Aug 27 01:01:34 vsa-0000014e-vc-0 kernel: [3507105.314514]  [<ffffffffa0631c1e>] xfs_corruption_error+0x5e/0x90 [xfs]
Aug 27 01:01:34 vsa-0000014e-vc-0 kernel: [3507105.314528]  [<ffffffffa064e862>] xfs_allocbt_verify+0x92/0x1e0 [xfs]
Aug 27 01:01:34 vsa-0000014e-vc-0 kernel: [3507105.314540]  [<ffffffffa064e9ce>] ? xfs_allocbt_read_verify+0xe/0x10 [xfs]
Aug 27 01:01:34 vsa-0000014e-vc-0 kernel: [3507105.314547]  [<ffffffff810135aa>] ? __switch_to+0x12a/0x4a0
Aug 27 01:01:34 vsa-0000014e-vc-0 kernel: [3507105.314551]  [<ffffffff81096cd8>] ? set_next_entity+0xa8/0xc0
Aug 27 01:01:34 vsa-0000014e-vc-0 kernel: [3507105.314566]  [<ffffffffa064e9ce>] xfs_allocbt_read_verify+0xe/0x10 [xfs]
Aug 27 01:01:34 vsa-0000014e-vc-0 kernel: [3507105.315251]  [<ffffffffa062f48f>] xfs_buf_iodone_work+0x3f/0xa0 [xfs]
Aug 27 01:01:34 vsa-0000014e-vc-0 kernel: [3507105.315255]  [<ffffffff81078b81>] process_one_work+0x141/0x490
Aug 27 01:01:34 vsa-0000014e-vc-0 kernel: [3507105.315257]  [<ffffffff81079b48>] worker_thread+0x168/0x400
Aug 27 01:01:34 vsa-0000014e-vc-0 kernel: [3507105.315259]  [<ffffffff810799e0>] ? manage_workers+0x120/0x120
Aug 27 01:01:34 vsa-0000014e-vc-0 kernel: [3507105.315262]  [<ffffffff8107f050>] kthread+0xc0/0xd0
Aug 27 01:01:34 vsa-0000014e-vc-0 kernel: [3507105.315265]  [<ffffffff8107ef90>] ? flush_kthread_worker+0xb0/0xb0
Aug 27 01:01:34 vsa-0000014e-vc-0 kernel: [3507105.315270]  [<ffffffff816f61ec>] ret_from_fork+0x7c/0xb0
Aug 27 01:01:34 vsa-0000014e-vc-0 kernel: [3507105.315273]  [<ffffffff8107ef90>] ? flush_kthread_worker+0xb0/0xb0
Aug 27 01:01:34 vsa-0000014e-vc-0 kernel: [3507105.315275] XFS (dm-39): Corruption detected. Unmount and run xfs_repair
Aug 27 01:01:34 vsa-0000014e-vc-0 kernel: [3507105.316706] XFS (dm-39): metadata I/O error: block 0x41a6eff8 ("xfs_trans_read_buf_map") error 117 numblks 8

[-- Attachment #3: 6442-82-xfs_repair.log --]
[-- Type: application/octet-stream, Size: 6009 bytes --]

root@vsa-00000110-vc-0:~# xfs_repair /dev/dm-82
Phase 1 - find and verify superblock...
Phase 2 - using internal log
        - zero log...
ERROR: The filesystem has valuable metadata changes in a log which needs to
be replayed.  Mount the filesystem to replay the log, and unmount it before
re-running xfs_repair.  If you are unable to mount the filesystem, then use
the -L option to destroy the log and attempt a repair.
Note that destroying the log may cause corruption -- please attempt a mount
of the filesystem before doing this.
root@vsa-00000110-vc-0:~# xfs_repair -L /dev/dm-82
Phase 1 - find and verify superblock...
Phase 2 - using internal log
        - zero log...
ALERT: The filesystem has valuable metadata changes in a log which is being
destroyed because the -L option was used.
        - scan filesystem freespace and inode maps...
agi unlinked bucket 1 is 12580353 in ag 3 (inode=213906945)
sb_icount 1226496, counted 1227776
sb_ifree 292180, counted 297082
sb_fdblocks 31182739, counted 55158044
        - found root inode chunk
Phase 3 - for each AG...
        - scan and clear agi unlinked lists...
        - process known inodes and perform inode discovery...
        - agno = 0
        - agno = 1
        - agno = 2
7f8d22a2c700: Badness in key lookup (length)
bp=(bno 84932992, len 16384 bytes) key=(bno 84932992, len 8192 bytes)
        - agno = 3
bad magic # 0xeabb123a in inode 213906945 (data fork) bmbt block 13369242
bad data fork in inode 213906945
cleared inode 213906945
clearing forw/back pointers in block 0 for attributes in inode 213906953
bad attribute leaf magic # 0xbc6c for dir ino 213906953
problem with attribute contents in inode 213906953
clearing inode 213906953 attributes
correcting nblocks for inode 213906953, was 66 - counted 65
clearing forw/back pointers in block 0 for attributes in inode 213906954
bad attribute leaf magic # 0xde72 for dir ino 213906954
problem with attribute contents in inode 213906954
clearing inode 213906954 attributes
correcting nblocks for inode 213906954, was 2 - counted 1
clearing forw/back pointers in block 0 for attributes in inode 213906960
bad attribute leaf magic # 0xd0eb for dir ino 213906960
problem with attribute contents in inode 213906960
clearing inode 213906960 attributes
correcting nblocks for inode 213906960, was 4 - counted 3
clearing forw/back pointers in block 0 for attributes in inode 213906961
bad attribute leaf magic # 0xb876 for dir ino 213906961
problem with attribute contents in inode 213906961
clearing inode 213906961 attributes
correcting nblocks for inode 213906961, was 5 - counted 4
        - agno = 4
        - agno = 5
clearing forw/back pointers in block 0 for attributes in inode 347235105
bad attribute leaf magic # 0xb033 for dir ino 347235105
problem with attribute contents in inode 347235105
clearing inode 347235105 attributes
correcting nblocks for inode 347235105, was 9 - counted 8
clearing forw/back pointers in block 0 for attributes in inode 347235106
bad attribute leaf magic # 0xe13 for dir ino 347235106
problem with attribute contents in inode 347235106
clearing inode 347235106 attributes
correcting nblocks for inode 347235106, was 9 - counted 8
        - agno = 6
        - agno = 7
clearing forw/back pointers in block 0 for attributes in inode 478759702
bad attribute leaf magic # 0xa065 for dir ino 478759702
problem with attribute contents in inode 478759702
clearing inode 478759702 attributes
correcting nblocks for inode 478759702, was 1561 - counted 1560
        - agno = 8
        - agno = 9
        - agno = 10
        - agno = 11
        - agno = 12
        - agno = 13
        - agno = 14
        - agno = 15
        - agno = 16
        - agno = 17
        - agno = 18
        - agno = 19
        - agno = 20
        - agno = 21
        - agno = 22
        - agno = 23
        - agno = 24
        - process newly discovered inodes...
Phase 4 - check for duplicate blocks...
        - setting up duplicate extent list...
        - check for inodes claiming duplicate blocks...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
bad magic # 0x58465342 in inode 213906953 (data fork) bmbt block 0
bad data fork in inode 213906953
cleared inode 213906953
bad attribute format 1 in inode 213906954, resetting value
bad attribute format 1 in inode 213906960, resetting value
bad attribute format 1 in inode 213906961, resetting value
        - agno = 4
        - agno = 5
bad attribute format 1 in inode 347235105, resetting value
bad attribute format 1 in inode 347235106, resetting value
        - agno = 6
        - agno = 7
bad magic # 0x58465342 in inode 478759702 (data fork) bmbt block 0
bad data fork in inode 478759702
cleared inode 478759702
        - agno = 8
        - agno = 9
        - agno = 10
        - agno = 11
        - agno = 12
        - agno = 13
        - agno = 14
        - agno = 15
        - agno = 16
        - agno = 17
        - agno = 18
        - agno = 19
        - agno = 20
        - agno = 21
        - agno = 22
        - agno = 23
        - agno = 24
Phase 5 - rebuild AG headers and trees...
        - reset superblock...
7f8d24478740: Badness in key lookup (length)
bp=(bno 0, len 4096 bytes) key=(bno 0, len 512 bytes)
Phase 6 - check inode connectivity...
        - resetting contents of realtime bitmap and summary inodes
        - traversing filesystem ...
entry "3247.png" in directory inode 201326924 points to free inode 213906953
bad hash table for directory inode 201326924 (no data entry): rebuilding
rebuilding directory inode 201326924
entry "0251050.NWB" in directory inode 469762366 points to free inode 478759702
rebuilding directory inode 469762366
        - traversal finished ...
        - moving disconnected inodes to lost+found ...
Phase 7 - verify and correct link counts...
done
root@vsa-00000110-vc-0:~# echo $?
0
root@vsa-00000110-vc-0:~# crm_mon
Connection to the CIB terminated
Reconnecting...root@vsa-00000110-vc-0:~# less /var/log/kern.log
root@vsa-00000110-vc-0:~#

[-- Attachment #4: dm-82-kernel.log --]
[-- Type: application/octet-stream, Size: 2549 bytes --]

Aug 22 23:24:48 vsa-00000110-vc-0 kernel: [4194599.685353] ffff88010ec36000: ea bb 12 3a 5f 44 01 a8 b9 2a 80 10 b3 a7 d5 af  ...:_D...*......
Aug 22 23:24:48 vsa-00000110-vc-0 kernel: [4194599.686568] XFS (dm-82): Internal error xfs_bmbt_verify at line 747 of file /mnt/share/builds/14.11--3.8.13-030813-generic/2015-06-17_03-30-37--14.11-1601-129/src/zadara-btrfs/fs/xfs/xfs_bmap_btree.c.  Caller 0xffffffffa07779ee
Aug 22 23:24:48 vsa-00000110-vc-0 kernel: [4194599.686568] 
Aug 22 23:24:48 vsa-00000110-vc-0 kernel: [4194599.689393] Pid: 17063, comm: kworker/0:1H Tainted: GF       W  O 3.8.13-030813-generic #201305111843
Aug 22 23:24:48 vsa-00000110-vc-0 kernel: [4194599.689395] Call Trace:
Aug 22 23:24:48 vsa-00000110-vc-0 kernel: [4194599.689443]  [<ffffffffa0746baf>] xfs_error_report+0x3f/0x50 [xfs]
Aug 22 23:24:48 vsa-00000110-vc-0 kernel: [4194599.689491]  [<ffffffffa07779ee>] ? xfs_bmbt_read_verify+0xe/0x10 [xfs]
Aug 22 23:24:48 vsa-00000110-vc-0 kernel: [4194599.689503]  [<ffffffffa0746c1e>] xfs_corruption_error+0x5e/0x90 [xfs]
Aug 22 23:24:48 vsa-00000110-vc-0 kernel: [4194599.689517]  [<ffffffffa0777867>] xfs_bmbt_verify+0x77/0x1e0 [xfs]
Aug 22 23:24:48 vsa-00000110-vc-0 kernel: [4194599.689535]  [<ffffffffa07779ee>] ? xfs_bmbt_read_verify+0xe/0x10 [xfs]
Aug 22 23:24:48 vsa-00000110-vc-0 kernel: [4194599.689548]  [<ffffffffa07779ee>] xfs_bmbt_read_verify+0xe/0x10 [xfs]
Aug 22 23:24:48 vsa-00000110-vc-0 kernel: [4194599.689558]  [<ffffffffa074448f>] xfs_buf_iodone_work+0x3f/0xa0 [xfs]
Aug 22 23:24:48 vsa-00000110-vc-0 kernel: [4194599.689564]  [<ffffffff81078b81>] process_one_work+0x141/0x490
Aug 22 23:24:48 vsa-00000110-vc-0 kernel: [4194599.689566]  [<ffffffff81079b48>] worker_thread+0x168/0x400
Aug 22 23:24:48 vsa-00000110-vc-0 kernel: [4194599.689569]  [<ffffffff810799e0>] ? manage_workers+0x120/0x120
Aug 22 23:24:48 vsa-00000110-vc-0 kernel: [4194599.689571]  [<ffffffff8107f050>] kthread+0xc0/0xd0
Aug 22 23:24:48 vsa-00000110-vc-0 kernel: [4194599.689574]  [<ffffffff8107ef90>] ? flush_kthread_worker+0xb0/0xb0
Aug 22 23:24:48 vsa-00000110-vc-0 kernel: [4194599.689579]  [<ffffffff816f61ec>] ret_from_fork+0x7c/0xb0
Aug 22 23:24:48 vsa-00000110-vc-0 kernel: [4194599.689582]  [<ffffffff8107ef90>] ? flush_kthread_worker+0xb0/0xb0
Aug 22 23:24:48 vsa-00000110-vc-0 kernel: [4194599.689584] XFS (dm-82): Corruption detected. Unmount and run xfs_repair
Aug 22 23:24:48 vsa-00000110-vc-0 kernel: [4194599.690508] XFS (dm-82): metadata I/O error: block 0x50ffb50 ("xfs_trans_read_buf_map") error 117 numblks 8

[-- Attachment #5: Type: text/plain, Size: 121 bytes --]

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: xfs corruption
  2015-09-03 11:09 xfs corruption Danny Shavit
@ 2015-09-03 13:22 ` Eric Sandeen
  2015-09-03 14:26   ` Danny Shavit
  0 siblings, 1 reply; 18+ messages in thread
From: Eric Sandeen @ 2015-09-03 13:22 UTC (permalink / raw)
  To: Danny Shavit, xfs; +Cc: Alex Lyakas

On 9/3/15 6:09 AM, Danny Shavit wrote:
> Hi Dave,
> 
> We couple of more xfs corruption that we would like to share:

On the same box as the one that seemed to be experiencing some
bit-flips in your earlier email?

As a general note: You are not providing enough information for
us to effectively help you.

http://xfs.org/index.php/XFS_FAQ#Q:_What_information_should_I_include_when_reporting_a_problem.3F

Kernel version?  xfsprogs version?  At a bare minimum...

Your dmesg snippets are edited.  You've provided what you feel is
important, omitting the parts that may actually be important or
informational.

You haven't described the sequence of events that led to these issues.

You haven't made clear what these attachments are; which repair log goes
with which kernel event?

Etc...

> 1. This is an interesting one, since xfs reported corruption but when
> running xfs_repair, no error was found. Attached is the kernel log
> section regarding the corruption (6458). Does xfs_repair explicitly
> read data from the disk? In such case it might be a memory
> corruption. Are you familiar with such cases?

Yes, xfs_repair opens the block device O_DIRECT.
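
If you want to confirm that on your systems, a check along these lines
(device name is just an example; -n keeps repair read-only) should show the
device being opened with O_DIRECT:

# strace -f -e trace=open xfs_repair -n /dev/dm-39 2>&1 | grep O_DIRECT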

Your 6458-kernel.log shows a failure in xfs_allocbt_verify(), right
after the allocation btree is read from disk, i.e. this is an in-kernel
metadata consistency check that is failing.

It also shows:

kworker/0:1H Tainted: GF       W 

So it's tainted:

  2: 'F' if any module was force loaded by "insmod -f", ' ' if all
     modules were loaded normally.

 10: 'W' if a warning has previously been issued by the kernel.
     (Though some warnings may set more specific taint flags.)
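
For reference, the cumulative taint mask of the running kernel can also be
read directly and decoded against the list above:

# cat /proc/sys/kernel/tainted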

You force-loaded a module?  And previous warnings were emitted (though we
can't see them in your edited dmesg).  
All bets are off.  If you had included the full dmesg, we might know 
more about what's going on, at least.

> 2. xfs corruption occurred suddenly with no apparent external event.
>  Attached are xfs_repair and kernel logs are. Xfs dump can be found
> in: https://zadarastorage-public.s3.amazonaws.com/xfs/82.metadump.gz

Your 6442-82-xfs_repair.log is from an xfs_repair -L, so of course it
is finding corruption, and the output is more or less meaningless
from a triage POV.  Repair said:

> Note that destroying the log may cause corruption -- please attempt a mount
> of the filesystem before doing this.

Why did you run it with -L? Did mount fail? If so how?

dm-82-kernel.log also shows a failing verifier, this time xfs_bmbt_verify,
when reading metadata from disk.

You've truncated other parts, though:

Aug 22 23:24:48 vsa-00000110-vc-0 kernel: [4194599.685353] ffff88010ec36000: ea bb 12 3a 5f 44 01 a8 b9 2a 80 10 b3 a7 d5 af  ...:_D...*
......

so there's not a ton to go on, just hints that there is more information
that's not provided.


-Eric

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: xfs corruption
  2015-09-03 13:22 ` Eric Sandeen
@ 2015-09-03 14:26   ` Danny Shavit
  2015-09-03 14:55     ` Eric Sandeen
  0 siblings, 1 reply; 18+ messages in thread
From: Danny Shavit @ 2015-09-03 14:26 UTC (permalink / raw)
  To: Eric Sandeen; +Cc: Alex Lyakas, xfs


[-- Attachment #1.1: Type: text/plain, Size: 3967 bytes --]

Hi Eric,

Thanks for the prompt response.
Sorry for the missing parts, I was wrongly assuming that everybody knows
our environment :-)

More information:
uname -a:  Linux vsa-00000142 3.8.13-030813-generic #201305111843 SMP Sat
May 11 22:44:40 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux
xfs_repair version 3.1.7

We are using a modified xfs. Mainly, we added some reporting features and
changed the discard operation to be aligned with the chunk sizes used in our
systems.
The modified code resides at
https://github.com/zadarastorage/zadara-xfs-pushback.

We were in a hurry at the time we ran xfs_repair with -L. That was not so
smart...
Anyway, the metadump was taken before running xfs_repair.
We will use the original xfs metadata to run xfs_repair after a mount and
get back with the results.
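
Roughly, the plan is along these lines (paths are placeholders and this is a
sketch rather than the exact commands; the mount step is what replays the
log):

# gunzip 82.metadump.gz
# xfs_mdrestore 82.metadump 82.img
# mount -o loop 82.img /mnt/tmp
# umount /mnt/tmp
# xfs_repair 82.img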

Regards,
Danny




On Thu, Sep 3, 2015 at 4:22 PM, Eric Sandeen <sandeen@sandeen.net> wrote:

> On 9/3/15 6:09 AM, Danny Shavit wrote:
> > Hi Dave,
> >
> > We couple of more xfs corruption that we would like to share:
>
> On the same box as the one that seemed to be experiencing some
> bit-flips in your earlier email?
>
> As a general note: You are not providing enough information for
> us to effectively help you.
>
>
> http://xfs.org/index.php/XFS_FAQ#Q:_What_information_should_I_include_when_reporting_a_problem.3F
>
> Kernel version?  xfsprogs version?  At a bare minimum...
>
> Your dmesg snippets are edited.  You've provided what you feel is
> important, omitting the parts that may actually be important or
> informational.
>
> You haven't described the sequence of events that led to these issues.
>
> You haven't made clear what these attachments are; which repair log goes
> with which kernel event?
>
> Etc...
>
> > 1. This is an interesting one, since xfs reported corruption but when
> > running xfs_repair, no error was found. Attached is the kernel log
> > section regarding the corruption (6458). Does xfs_repair explicitly
> > read data from the disk? In such case it might be a memory
> > corruption. Are you familiar with such cases?
>
> Yes, xfs_repair opens the block device O_DIRECT.
>
> your 6485-kernel.log shows a failure in xfs_allocbt_verify(), right
> after the allocation btree is read from disk.  i.e. this is an in-kernel
> metadata consistency check that is failing.
>
> It also shows:
>
> kworker/0:1H Tainted: GF       W
>
> So it's tainted:
>
>   2: 'F' if any module was force loaded by "insmod -f", ' ' if all
>      modules were loaded normally.
>
>  10: 'W' if a warning has previously been issued by the kernel.
>      (Though some warnings may set more specific taint flags.)
>
> You force-loaded a module?  And previous warnings were emitted (though we
> can't see them in your edited dmesg).
> All bets are off.  If you had included the full dmesg, we might know
> more about what's going on, at least.
>
> > 2. xfs corruption occurred suddenly with no apparent external event.
> >  Attached are xfs_repair and kernel logs are. Xfs dump can be found
> > in: https://zadarastorage-public.s3.amazonaws.com/xfs/82.metadump.gz
>
> Your 6442-82-xfs_repair.log is from an xfs_repair -L, so of course it
> is finding corruption, and the output is more or less meaningless
> from a triage POV.  Repair said:
>
> > Note that destroying the log may cause corruption -- please attempt a
> mount
> > of the filesystem before doing this.
>
> Why did you run it with -L? Did mount fail? If so how?
>
> dm-82-kernel.log also shows a failing verifier, this time xfs_bmbt_verify,
> when reading metadata from disk.
>
> You've truncated other parts, though:
>
> Aug 22 23:24:48 vsa-00000110-vc-0 kernel: [4194599.685353]
> ffff88010ec36000: ea bb 12 3a 5f 44 01 a8 b9 2a 80 10 b3 a7 d5 af
> ...:_D...*
> ......
>
> so there's not a ton to go on, just hints that there is more information
> that's not provided.
>
>
> -Eric
>



-- 
Regards,
Danny

[-- Attachment #1.2: Type: text/html, Size: 5726 bytes --]

[-- Attachment #2: Type: text/plain, Size: 121 bytes --]

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: xfs corruption
  2015-09-03 14:26   ` Danny Shavit
@ 2015-09-03 14:55     ` Eric Sandeen
  2015-09-03 16:14       ` Eric Sandeen
  0 siblings, 1 reply; 18+ messages in thread
From: Eric Sandeen @ 2015-09-03 14:55 UTC (permalink / raw)
  To: Danny Shavit; +Cc: Alex Lyakas, xfs

On 9/3/15 9:26 AM, Danny Shavit wrote:
> Hi Eric,
> 
> Thanks for the prompt response. Sorry for the missing parts, I was
> wrongly assuming that everybody knows our environment :-)

Maybe some do, but my brain is too small for that.  ;)

> More information: uname -a:  Linux vsa-00000142 3.8.13-030813-generic
> #201305111843 SMP Sat May 11 22:44:40 UTC 2013 x86_64 x86_64 x86_64
> GNU/Linux xfs_repair version 3.1.7
> 
> We are using modified xfs. Mainly, added some reporting features and
> changed discard operation to be aligned with chunk sizes used in our
> systems. The modified code resides at https://github.com/zadarastorage/zadara-xfs-pushback.

Interesting, thanks for the pointer.  I guess at this point I have to
ask, do you see these same problems without your modifications?

I'd really encourage Zadara to work on submitting some of these upstream,
if they are of general interest.  It'll get more review, more testing,
and will reduce your maintenance burden.  Obviously some of it may not
be desired upstream, but if you've solved a general problem, it'd be
very good to propose a patch for inclusion.

> We were in a hurry at the time we run xfs_repair with -L. Was not so
> smart... Any way, the xfs_dump was taken before running xfs_repair. 
> We will use the original xfs meta data to run xfs_repair after mount
> and get back with the results.

Ok, from the metadump I see that log replay fails due to the corruption:

[ 7708.169145] XFS (loop0): Mounting V4 Filesystem
[ 7708.178379] XFS (loop0): Starting recovery (logdev: internal)
[ 7708.185369] XFS (loop0): Metadata corruption detected at xfs_bmbt_read_verify+0x7e/0xc0 [xfs], block 0x50ffb50
[ 7708.195344] XFS (loop0): Unmount and run xfs_repair
[ 7708.200214] XFS (loop0): First 64 bytes of corrupted metadata buffer:
[ 7708.206638] ffff8802e5b9d000: ea bb 12 3a 5f 44 01 a8 b9 2a 80 10 b3 a7 d5 af  ...:_D...*......
[ 7708.215312] ffff8802e5b9d010: f6 b0 39 2d 08 54 7a ec 37 1b 94 b0 c2 37 23 1f  ..9-.Tz.7....7#.
[ 7708.223986] ffff8802e5b9d020: 54 62 b5 fd ff 63 95 01 4b 23 fc 5d 8b d4 7b 78  Tb...c..K#.]..{x
[ 7708.232662] ffff8802e5b9d030: 94 e6 fa cc e2 87 3d fe ab df b8 e9 e5 9b e5 da  ......=.........
[ 7708.241341] XFS (loop0): metadata I/O error: block 0x50ffb50 ("xfs_trans_read_buf_map") error 117 numblks 8
[ 7708.251058] XFS (loop0): xfs_do_force_shutdown(0x1) called from line 315 of file fs/xfs/xfs_trans_buf.c.  Return address = 0xffffffffa036c41a
[ 7708.263721] XFS (loop0): I/O Error Detected. Shutting down filesystem
[ 7708.270144] XFS (loop0): Please umount the filesystem and rectify the problem(s)
[ 7708.277533] XFS (loop0): Ending recovery (logdev: internal)
[ 7708.283095] SELinux: (dev loop0, type xfs) getxattr errno 5
[ 7708.288664] XFS (loop0): xfs_log_force: error -5 returned.
[ 7708.294136] XFS (loop0): Unmounting Filesystem


_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: xfs corruption
  2015-09-03 14:55     ` Eric Sandeen
@ 2015-09-03 16:14       ` Eric Sandeen
  2015-09-06 10:19         ` Alex Lyakas
  0 siblings, 1 reply; 18+ messages in thread
From: Eric Sandeen @ 2015-09-03 16:14 UTC (permalink / raw)
  To: Danny Shavit; +Cc: Alex Lyakas, xfs

On 9/3/15 9:55 AM, Eric Sandeen wrote:
> On 9/3/15 9:26 AM, Danny Shavit wrote:

...

>> We are using modified xfs. Mainly, added some reporting features and
>> changed discard operation to be aligned with chunk sizes used in our
>> systems. The modified code resides at https://github.com/zadarastorage/zadara-xfs-pushback.
> 
> Interesting, thanks for the pointer.  I guess at this point I have to
> ask, do you see these same problems without your modifications?

Have you ever mounted this filesystem on non-zadara kernels?

looking at
https://github.com/zadarastorage/zadara-xfs-pushback/commit/094df949fd080ede546bb7518405ab873a444823

you've changed the disk format w/o adding a feature flag,
which is pretty dangerous.

-Eric

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: xfs corruption
  2015-09-03 16:14       ` Eric Sandeen
@ 2015-09-06 10:19         ` Alex Lyakas
  2015-09-06 21:56           ` Eric Sandeen
  0 siblings, 1 reply; 18+ messages in thread
From: Alex Lyakas @ 2015-09-06 10:19 UTC (permalink / raw)
  To: Danny Shavit, Eric Sandeen; +Cc: xfs

Hi Eric,
Thank you for your comments.

Yes, we made the ACL limit change, being fully aware that this breaks 
compatibility with the mainline kernel and future mainline kernels. We mount 
our XFS filesystems with our kernel only. We are also aware that this change 
needs to be carefully forward-ported when we move to a newer kernel.

I have an additional question regarding the latest XFS corruption report:
kernel: [3507105.314446] Pid: 25231, comm: kworker/0:0H Tainted: GF       W 
O 3.8.13-030813-generic #201305111843
kernel: [3507105.314449] Call Trace:
kernel: [3507105.314487]  [<ffffffffa0631baf>] xfs_error_report+0x3f/0x50 
[xfs]
kernel: [3507105.314502]  [<ffffffffa064e9ce>] ? 
xfs_allocbt_read_verify+0xe/0x10 [xfs]
kernel: [3507105.314514]  [<ffffffffa0631c1e>] 
xfs_corruption_error+0x5e/0x90 [xfs]
kernel: [3507105.314528]  [<ffffffffa064e862>] xfs_allocbt_verify+0x92/0x1e0 
[xfs]
kernel: [3507105.314540]  [<ffffffffa064e9ce>] ? 
xfs_allocbt_read_verify+0xe/0x10 [xfs]
kernel: [3507105.314547]  [<ffffffff810135aa>] ? __switch_to+0x12a/0x4a0
kernel: [3507105.314551]  [<ffffffff81096cd8>] ? set_next_entity+0xa8/0xc0
kernel: [3507105.314566]  [<ffffffffa064e9ce>] 
xfs_allocbt_read_verify+0xe/0x10 [xfs]
kernel: [3507105.315251]  [<ffffffffa062f48f>] xfs_buf_iodone_work+0x3f/0xa0 
[xfs]
kernel: [3507105.315255]  [<ffffffff81078b81>] process_one_work+0x141/0x490
kernel: [3507105.315257]  [<ffffffff81079b48>] worker_thread+0x168/0x400
kernel: [3507105.315259]  [<ffffffff810799e0>] ? manage_workers+0x120/0x120
kernel: [3507105.315262]  [<ffffffff8107f050>] kthread+0xc0/0xd0
kernel: [3507105.315265]  [<ffffffff8107ef90>] ? 
flush_kthread_worker+0xb0/0xb0
kernel: [3507105.315270]  [<ffffffff816f61ec>] ret_from_fork+0x7c/0xb0
kernel: [3507105.315273]  [<ffffffff8107ef90>] ? 
flush_kthread_worker+0xb0/0xb0
kernel: [3507105.315275] XFS (dm-39): Corruption detected. Unmount and run 
xfs_repair
kernel: [3507105.316706] XFS (dm-39): metadata I/O error: block 0x41a6eff8 
("xfs_trans_read_buf_map") error 117 numblks 8

From looking at the XFS code, it appears that XFS read a metadata block from
disk, and discovered that it was corrupted. At this point, the system was
rebooted, and after reboot we prevented this particular XFS from mounting.
Then we ran xfs_metadump and xfs_repair. The latter found absolutely no
issues, and XFS was able to successfully mount and continue operation.

Can you think of a way to explain this?
Can you confirm that the above trace really means that XFS was reading its 
metadata from disk?
From the XFS code, I see that XFS does not use the Linux page cache for its
metadata (unlike btrfs, for example). Is my understanding correct?
(Otherwise, I could assume that somebody wrongly touched a page in the
page cache and messed up its in-memory content.)

Thanks,
Alex.





-----Original Message----- 
From: Eric Sandeen
Sent: 03 September, 2015 6:14 PM
To: Danny Shavit
Cc: Alex Lyakas ; xfs@oss.sgi.com
Subject: Re: xfs corruption

On 9/3/15 9:55 AM, Eric Sandeen wrote:
> On 9/3/15 9:26 AM, Danny Shavit wrote:

...

>> We are using modified xfs. Mainly, added some reporting features and
>> changed discard operation to be aligned with chunk sizes used in our
>> systems. The modified code resides at https://github.com/zadarastorage/zadara-xfs-pushback.
>
> Interesting, thanks for the pointer.  I guess at this point I have to
> ask, do you see these same problems without your modifications?

Have you ever mounted this filesystem on non-zadara kernels?

looking at
https://github.com/zadarastorage/zadara-xfs-pushback/commit/094df949fd080ede546bb7518405ab873a444823

you've changed the disk format w/o adding a feature flag,
which is pretty dangerous.

-Eric 

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: xfs corruption
  2015-09-06 10:19         ` Alex Lyakas
@ 2015-09-06 21:56           ` Eric Sandeen
  2015-09-07  8:30             ` Alex Lyakas
  0 siblings, 1 reply; 18+ messages in thread
From: Eric Sandeen @ 2015-09-06 21:56 UTC (permalink / raw)
  To: Alex Lyakas, Danny Shavit; +Cc: xfs

On 9/6/15 5:19 AM, Alex Lyakas wrote:
> Hi Eric,
> Thank you for your comments.
> 
> Yes, we made the ACL limit change, being fully aware that this breaks
> compatibility with the mainline kernel and future mainline kernels.
> We mount our XFS filesystems with our kernel only. We are also aware
> that this change needs to be carefully forward-ported, when we move
> to a newer kernel.

Ok, sorry for the lecture...  ;)  I did want to make sure it
hadn't been mounted on an unmodified kernel, though.

> I have an additional question regarding the latest XFS corruption report:
> kernel: [3507105.314446] Pid: 25231, comm: kworker/0:0H Tainted: GF       W O 3.8.13-030813-generic #201305111843
> kernel: [3507105.314449] Call Trace:
> kernel: [3507105.314487]  [<ffffffffa0631baf>] xfs_error_report+0x3f/0x50 [xfs]
> kernel: [3507105.314502]  [<ffffffffa064e9ce>] ? xfs_allocbt_read_verify+0xe/0x10 [xfs]
> kernel: [3507105.314514]  [<ffffffffa0631c1e>] xfs_corruption_error+0x5e/0x90 [xfs]
> kernel: [3507105.314528]  [<ffffffffa064e862>] xfs_allocbt_verify+0x92/0x1e0 [xfs]
> kernel: [3507105.314540]  [<ffffffffa064e9ce>] ? xfs_allocbt_read_verify+0xe/0x10 [xfs]
> kernel: [3507105.314547]  [<ffffffff810135aa>] ? __switch_to+0x12a/0x4a0
> kernel: [3507105.314551]  [<ffffffff81096cd8>] ? set_next_entity+0xa8/0xc0
> kernel: [3507105.314566]  [<ffffffffa064e9ce>] xfs_allocbt_read_verify+0xe/0x10 [xfs]
> kernel: [3507105.315251]  [<ffffffffa062f48f>] xfs_buf_iodone_work+0x3f/0xa0 [xfs]
> kernel: [3507105.315255]  [<ffffffff81078b81>] process_one_work+0x141/0x490
> kernel: [3507105.315257]  [<ffffffff81079b48>] worker_thread+0x168/0x400
> kernel: [3507105.315259]  [<ffffffff810799e0>] ? manage_workers+0x120/0x120
> kernel: [3507105.315262]  [<ffffffff8107f050>] kthread+0xc0/0xd0
> kernel: [3507105.315265]  [<ffffffff8107ef90>] ? flush_kthread_worker+0xb0/0xb0
> kernel: [3507105.315270]  [<ffffffff816f61ec>] ret_from_fork+0x7c/0xb0
> kernel: [3507105.315273]  [<ffffffff8107ef90>] ? flush_kthread_worker+0xb0/0xb0
> kernel: [3507105.315275] XFS (dm-39): Corruption detected. Unmount and run xfs_repair
> kernel: [3507105.316706] XFS (dm-39): metadata I/O error: block 0x41a6eff8 ("xfs_trans_read_buf_map") error 117 numblks 8
> 
> From looking at XFS code, it appears that XFS read metadata block
> from disk, and discovered that it was corrupted.

Yes.  Unfortunately the verifier didn't say what it thinks is wrong.

I'd have to look to see for sure, but I think that on your kernel version,
if you turn up the xfs error level sysctl, you should get a hexdump of the
first 64 bytes of the buffer when this happens, and that would hopefully
tell us enough to know what was wrong, and -

> At this point, the
> system was rebooted, and after reboot we prevented this particular
> XFS from mounting. Then we ran xfs-metadump and xfs-repair. The
> latter found absolutely no issues, and XFS was able to successfully
> mount and continue operation.

- and why repair found no issue

With the buffer dump, and then from that hopefully knowing what the verifier
didn't like, we could then check your repair version and be sure it is
performing the same checks as the verifier.
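
Something like the following should do it (the knob is the xfs error level
sysctl; the useful maximum may differ on your kernel):

# sysctl fs.xfs.error_level=11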

-Eric

> Can you think of a way to explain this?
> Can you confirm that the above trace really means that XFS was reading its metadata from disk?
> From XFS code, I see that XFS does not use Linux page cache for its
> metadata (unlike btrfs, for example). Is my understanding correct?
> (Otherwise, I could assume that somebody wrongly touched a page in
> the page-cache and messed up its in-memory content).
> 
> Thanks,
> Alex.
> 
> 
> 
> 
> 
> -----Original Message----- From: Eric Sandeen
> Sent: 03 September, 2015 6:14 PM
> To: Danny Shavit
> Cc: Alex Lyakas ; xfs@oss.sgi.com
> Subject: Re: xfs corruption
> 
> On 9/3/15 9:55 AM, Eric Sandeen wrote:
>> On 9/3/15 9:26 AM, Danny Shavit wrote:
> 
> ...
> 
>>> We are using modified xfs. Mainly, added some reporting features and
>>> changed discard operation to be aligned with chunk sizes used in our
>>> systems. The modified code resides at https://github.com/zadarastorage/zadara-xfs-pushback.
>>
>> Interesting, thanks for the pointer.  I guess at this point I have to
>> ask, do you see these same problems without your modifications?
> 
> Have you ever mounted this filesystem on non-zadara kernels?
> 
> looking at
> https://github.com/zadarastorage/zadara-xfs-pushback/commit/094df949fd080ede546bb7518405ab873a444823
> 
> you've changed the disk format w/o adding a feature flag,
> which is pretty dangerous.
> 
> -Eric

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: xfs corruption
  2015-09-06 21:56           ` Eric Sandeen
@ 2015-09-07  8:30             ` Alex Lyakas
  0 siblings, 0 replies; 18+ messages in thread
From: Alex Lyakas @ 2015-09-07  8:30 UTC (permalink / raw)
  To: Eric Sandeen; +Cc: Danny Shavit, xfs

Hi Eric,

This is what the verifier said; sorry for not posting it fully:
Aug 27 01:01:34 vsa-0000014e-vc-0 kernel: [3507105.306317] ffff88000617d000: 
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
Aug 27 01:01:34 vsa-0000014e-vc-0 kernel: [3507105.307743] XFS (dm-39): 
Internal error xfs_allocbt_verify at line 330 of file 
/mnt/share/builds/14.11--3.8.13-030813-generic/2015-04-29_10-45-42--14.11-1601-124/src/zadara-btrfs/fs/xfs/xfs_alloc_btree.c. 
Caller 0xffffffffa064e9ce
Aug 27 01:01:34 vsa-0000014e-vc-0 kernel: [3507105.307743]
Aug 27 01:01:34 vsa-0000014e-vc-0 kernel: [3507105.314446] Pid: 25231, comm: 
kworker/0:0H Tainted: GF       W  O 3.8.13-030813-generic #201305111843
Aug 27 01:01:34 vsa-0000014e-vc-0 kernel: [3507105.314449] Call Trace:
Aug 27 01:01:34 vsa-0000014e-vc-0 kernel: [3507105.314487] 
[<ffffffffa0631baf>] xfs_error_report+0x3f/0x50 [xfs]
Aug 27 01:01:34 vsa-0000014e-vc-0 kernel: [3507105.314502] 
[<ffffffffa064e9ce>] ? xfs_allocbt_read_verify+0xe/0x10 [xfs]
Aug 27 01:01:34 vsa-0000014e-vc-0 kernel: [3507105.314514] 
[<ffffffffa0631c1e>] xfs_corruption_error+0x5e/0x90 [xfs]
Aug 27 01:01:34 vsa-0000014e-vc-0 kernel: [3507105.314528] 
[<ffffffffa064e862>] xfs_allocbt_verify+0x92/0x1e0 [xfs]
Aug 27 01:01:34 vsa-0000014e-vc-0 kernel: [3507105.314540] 
[<ffffffffa064e9ce>] ? xfs_allocbt_read_verify+0xe/0x10 [xfs]
Aug 27 01:01:34 vsa-0000014e-vc-0 kernel: [3507105.314547] 
[<ffffffff810135aa>] ? __switch_to+0x12a/0x4a0
Aug 27 01:01:34 vsa-0000014e-vc-0 kernel: [3507105.314551] 
[<ffffffff81096cd8>] ? set_next_entity+0xa8/0xc0
Aug 27 01:01:34 vsa-0000014e-vc-0 kernel: [3507105.314566] 
[<ffffffffa064e9ce>] xfs_allocbt_read_verify+0xe/0x10 [xfs]
Aug 27 01:01:34 vsa-0000014e-vc-0 kernel: [3507105.315251] 
[<ffffffffa062f48f>] xfs_buf_iodone_work+0x3f/0xa0 [xfs]
Aug 27 01:01:34 vsa-0000014e-vc-0 kernel: [3507105.315255] 
[<ffffffff81078b81>] process_one_work+0x141/0x490
Aug 27 01:01:34 vsa-0000014e-vc-0 kernel: [3507105.315257] 
[<ffffffff81079b48>] worker_thread+0x168/0x400
Aug 27 01:01:34 vsa-0000014e-vc-0 kernel: [3507105.315259] 
[<ffffffff810799e0>] ? manage_workers+0x120/0x120
Aug 27 01:01:34 vsa-0000014e-vc-0 kernel: [3507105.315262] 
[<ffffffff8107f050>] kthread+0xc0/0xd0
Aug 27 01:01:34 vsa-0000014e-vc-0 kernel: [3507105.315265] 
[<ffffffff8107ef90>] ? flush_kthread_worker+0xb0/0xb0
Aug 27 01:01:34 vsa-0000014e-vc-0 kernel: [3507105.315270] 
[<ffffffff816f61ec>] ret_from_fork+0x7c/0xb0
Aug 27 01:01:34 vsa-0000014e-vc-0 kernel: [3507105.315273] 
[<ffffffff8107ef90>] ? flush_kthread_worker+0xb0/0xb0
Aug 27 01:01:34 vsa-0000014e-vc-0 kernel: [3507105.315275] XFS (dm-39): 
Corruption detected. Unmount and run xfs_repair
Aug 27 01:01:34 vsa-0000014e-vc-0 kernel: [3507105.316706] XFS (dm-39): 
metadata I/O error: block 0x41a6eff8 ("xfs_trans_read_buf_map") error 117 
numblks 8

The verifier function is [1]; line 330 is where it calls
XFS_CORRUPTION_ERROR.

xfs_repair version:
root@vsa-0000003f-vc-0:~# xfs_repair -V
xfs_repair version 3.1.7

xfsprogs is the stock version that ships with the Ubuntu 12.04 distribution
(we didn't mess with that ;).

Thanks for your help,
Alex.


[1]
static void
xfs_allocbt_verify(
    struct xfs_buf        *bp)
{
    struct xfs_mount    *mp = bp->b_target->bt_mount;
    struct xfs_btree_block    *block = XFS_BUF_TO_BLOCK(bp);
    struct xfs_perag    *pag = bp->b_pag;
    unsigned int        level;
    int            sblock_ok; /* block passes checks */

    /*
     * magic number and level verification
     *
     * During growfs operations, we can't verify the exact level as the
     * perag is not fully initialised and hence not attached to the buffer.
     * In this case, check against the maximum tree depth.
     */
    level = be16_to_cpu(block->bb_level);
    switch (block->bb_magic) {
    case cpu_to_be32(XFS_ABTB_MAGIC):
        if (pag)
            sblock_ok = level < pag->pagf_levels[XFS_BTNUM_BNOi];
        else
            sblock_ok = level < mp->m_ag_maxlevels;
        break;
    case cpu_to_be32(XFS_ABTC_MAGIC):
        if (pag)
            sblock_ok = level < pag->pagf_levels[XFS_BTNUM_CNTi];
        else
            sblock_ok = level < mp->m_ag_maxlevels;
        break;
    default:
        sblock_ok = 0;
        break;
    }

    /* numrecs verification */
    sblock_ok = sblock_ok &&
        be16_to_cpu(block->bb_numrecs) <= mp->m_alloc_mxr[level != 0];

    /* sibling pointer verification */
    sblock_ok = sblock_ok &&
        (block->bb_u.s.bb_leftsib == cpu_to_be32(NULLAGBLOCK) ||
         be32_to_cpu(block->bb_u.s.bb_leftsib) < mp->m_sb.sb_agblocks) &&
        block->bb_u.s.bb_leftsib &&
        (block->bb_u.s.bb_rightsib == cpu_to_be32(NULLAGBLOCK) ||
         be32_to_cpu(block->bb_u.s.bb_rightsib) < mp->m_sb.sb_agblocks) &&
        block->bb_u.s.bb_rightsib;

    if (!sblock_ok) {
        trace_xfs_btree_corrupt(bp, _RET_IP_);
        XFS_CORRUPTION_ERROR(__func__, XFS_ERRLEVEL_LOW, mp, block);
        xfs_buf_ioerror(bp, EFSCORRUPTED);
    }
}

-----Original Message----- 
From: Eric Sandeen
Sent: 06 September, 2015 11:56 PM
To: Alex Lyakas ; Danny Shavit
Cc: xfs@oss.sgi.com
Subject: Re: xfs corruption

On 9/6/15 5:19 AM, Alex Lyakas wrote:
> Hi Eric,
> Thank you for your comments.
>
> Yes, we made the ACL limit change, being fully aware that this breaks
> compatibility with the mainline kernel and future mainline kernels.
> We mount our XFS filesystems with our kernel only. We are also aware
> that this change needs to be carefully forward-ported, when we move
> to a newer kernel.

Ok, sorry for the lecture...  ;)  I did want to make sure it
hadn't been mounted on an unmodified kernel, though.

> I have an additional question regarding the latest XFS corruption report:
> kernel: [3507105.314446] Pid: 25231, comm: kworker/0:0H Tainted: GF 
> W O 3.8.13-030813-generic #201305111843
> kernel: [3507105.314449] Call Trace:
> kernel: [3507105.314487]  [<ffffffffa0631baf>] xfs_error_report+0x3f/0x50 
> [xfs]
> kernel: [3507105.314502]  [<ffffffffa064e9ce>] ? 
> xfs_allocbt_read_verify+0xe/0x10 [xfs]
> kernel: [3507105.314514]  [<ffffffffa0631c1e>] 
> xfs_corruption_error+0x5e/0x90 [xfs]
> kernel: [3507105.314528]  [<ffffffffa064e862>] 
> xfs_allocbt_verify+0x92/0x1e0 [xfs]
> kernel: [3507105.314540]  [<ffffffffa064e9ce>] ? 
> xfs_allocbt_read_verify+0xe/0x10 [xfs]
> kernel: [3507105.314547]  [<ffffffff810135aa>] ? __switch_to+0x12a/0x4a0
> kernel: [3507105.314551]  [<ffffffff81096cd8>] ? set_next_entity+0xa8/0xc0
> kernel: [3507105.314566]  [<ffffffffa064e9ce>] 
> xfs_allocbt_read_verify+0xe/0x10 [xfs]
> kernel: [3507105.315251]  [<ffffffffa062f48f>] 
> xfs_buf_iodone_work+0x3f/0xa0 [xfs]
> kernel: [3507105.315255]  [<ffffffff81078b81>] 
> process_one_work+0x141/0x490
> kernel: [3507105.315257]  [<ffffffff81079b48>] worker_thread+0x168/0x400
> kernel: [3507105.315259]  [<ffffffff810799e0>] ? 
> manage_workers+0x120/0x120
> kernel: [3507105.315262]  [<ffffffff8107f050>] kthread+0xc0/0xd0
> kernel: [3507105.315265]  [<ffffffff8107ef90>] ? 
> flush_kthread_worker+0xb0/0xb0
> kernel: [3507105.315270]  [<ffffffff816f61ec>] ret_from_fork+0x7c/0xb0
> kernel: [3507105.315273]  [<ffffffff8107ef90>] ? 
> flush_kthread_worker+0xb0/0xb0
> kernel: [3507105.315275] XFS (dm-39): Corruption detected. Unmount and run 
> xfs_repair
> kernel: [3507105.316706] XFS (dm-39): metadata I/O error: block 0x41a6eff8 
> ("xfs_trans_read_buf_map") error 117 numblks 8
>
> From looking at XFS code, it appears that XFS read metadata block
> from disk, and discovered that it was corrupted.

Yes.  Unfortunately the verifier didn't say what it thinks is wrong.

I'd have to look to see for sure, but I think that on your kernel version,
if you turn up the xfs error level sysctl, you should get a hexdump of the
first 64 bytes of the buffer when this happens, and that would hopefully
tell us enough to know what was wrong, and -

> At this point, the
> system was rebooted, and after reboot we prevented this particular
> XFS from mounting. Then we ran xfs-metadump and xfs-repair. The
> latter found absolutely no issues, and XFS was able to successfully
> mount and continue operation.

- and why repair found no issue

With the buffer dump, and then from that hopefully knowing what the verifier
didn't like, we could then check your repair version and be sure it is
performing the same checks as the verifier

-Eric

> Can you think of a way to explain this?
> Can you confirm that the above trace really means that XFS was reading its 
> metadata from disk?
> From XFS code, I see that XFS does not use Linux page cache for its
> metadata (unlike btrfs, for example). Is my understanding correct?
> (Otherwise, I could assume that somebody wrongly touched a page in
> the page-cache and messed up its in-memory content).
>
> Thanks,
> Alex.
>
>
>
>
>
> -----Original Message----- From: Eric Sandeen
> Sent: 03 September, 2015 6:14 PM
> To: Danny Shavit
> Cc: Alex Lyakas ; xfs@oss.sgi.com
> Subject: Re: xfs corruption
>
> On 9/3/15 9:55 AM, Eric Sandeen wrote:
>> On 9/3/15 9:26 AM, Danny Shavit wrote:
>
> ...
>
>>> We are using modified xfs. Mainly, added some reporting features and
>>> changed discard operation to be aligned with chunk sizes used in our
>>> systems. The modified code resides at https://github.com/zadarastorage/zadara-xfs-pushback.
>>
>> Interesting, thanks for the pointer.  I guess at this point I have to
>> ask, do you see these same problems without your modifications?
>
> Have you ever mounted this filesystem on non-zadara kernels?
>
> looking at
> https://github.com/zadarastorage/zadara-xfs-pushback/commit/094df949fd080ede546bb7518405ab873a444823
>
> you've changed the disk format w/o adding a feature flag,
> which is pretty dangerous.
>
> -Eric

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: XFS Corruption
  2016-02-24  6:12 XFS Corruption fangchen sun
@ 2016-02-24 22:23 ` Eric Sandeen
  0 siblings, 0 replies; 18+ messages in thread
From: Eric Sandeen @ 2016-02-24 22:23 UTC (permalink / raw)
  To: xfs

On 2/24/16 12:12 AM, fangchen sun wrote:
> Dear all:
> 
> I have a ceph object storage cluster, and choose XFS as the underlying file system.
> I recently ran into a problem that sometimes the function "setxattr()"  failed, 
> I can only umount the disk and repair it with "xfs_repair".
> 
> os: centos 6.5
> kernel version: 2.6.32

Problems with distribution kernels generally need to be reported and handled through
your distribution.  And when you do, providing the full unedited dmesg - which
should include an actual description of the error from xfs, rather than just
the backtrace below - and the results of the xfs_repair, will be necessary.
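
For example, something along these lines (device name taken from the quoted
log; adjust to your setup) would capture both:

# dmesg > dmesg-full.txt
# umount /dev/sdi1
# xfs_repair /dev/sdi1 2>&1 | tee xfs_repair.log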

Thanks,
-Eric
 
> the log for dmesg command:
> [41796028.532225] Pid: 1438740, comm: ceph-osd Not tainted 2.6.32-925.431.23.3.letv.el6.x86_64 #1
> [41796028.532227] Call Trace:
> [41796028.532255]  [<ffffffffa01e1e5f>] ? xfs_error_report+0x3f/0x50 [xfs]
> [41796028.532276]  [<ffffffffa01d506a>] ? xfs_da_read_buf+0x2a/0x30 [xfs]
> [41796028.532296]  [<ffffffffa01e1ece>] ? xfs_corruption_error+0x5e/0x90 [xfs]
> [41796028.532316]  [<ffffffffa01d4f4c>] ? xfs_da_do_buf+0x6cc/0x770 [xfs]
> [41796028.532335]  [<ffffffffa01d506a>] ? xfs_da_read_buf+0x2a/0x30 [xfs]
> [41796028.532359]  [<ffffffffa0206fc7>] ? kmem_zone_alloc+0x77/0xf0 [xfs]
> [41796028.532380]  [<ffffffffa01d506a>] ? xfs_da_read_buf+0x2a/0x30 [xfs]
> [41796028.532399]  [<ffffffffa01bc481>] ? xfs_attr_leaf_addname+0x61/0x3d0 [xfs]
> [41796028.532426]  [<ffffffffa01bc481>] ? xfs_attr_leaf_addname+0x61/0x3d0 [xfs]
> [41796028.532455]  [<ffffffffa01ff187>] ? xfs_trans_add_item+0x57/0x70 [xfs]
> [41796028.532476]  [<ffffffffa01cc208>] ? xfs_bmbt_get_all+0x18/0x20 [xfs]
> [41796028.532495]  [<ffffffffa01bcbb4>] ? xfs_attr_set_int+0x3c4/0x510 [xfs]
> [41796028.532517]  [<ffffffffa01d4f5b>] ? xfs_da_do_buf+0x6db/0x770 [xfs]
> [41796028.532536]  [<ffffffffa01bcd81>] ? xfs_attr_set+0x81/0x90 [xfs]
> [41796028.532560]  [<ffffffffa0216cc3>] ? __xfs_xattr_set+0x43/0x60 [xfs]
> [41796028.532584]  [<ffffffffa0216d31>] ? xfs_xattr_user_set+0x11/0x20 [xfs]
> [41796028.532592]  [<ffffffff811aee92>] ? generic_setxattr+0xa2/0xb0
> [41796028.532596]  [<ffffffff811b134e>] ? __vfs_setxattr_noperm+0x4e/0x160
> [41796028.532600]  [<ffffffff81196b77>] ? inode_permission+0xa7/0x100
> [41796028.532604]  [<ffffffff811b151c>] ? vfs_setxattr+0xbc/0xc0
> [41796028.532607]  [<ffffffff811b15f0>] ? setxattr+0xd0/0x150
> [41796028.532612]  [<ffffffff8105af80>] ? __dequeue_entity+0x30/0x50
> [41796028.532617]  [<ffffffff8100988e>] ? __switch_to+0x26e/0x320
> [41796028.532621]  [<ffffffff8118aec0>] ? __sb_start_write+0x80/0x120
> [41796028.532626]  [<ffffffff8152912e>] ? thread_return+0x4e/0x760
> [41796028.532630]  [<ffffffff811b171d>] ? sys_fsetxattr+0xad/0xd0
> [41796028.532633]  [<ffffffff8100b072>] ? system_call_fastpath+0x16/0x1b
> [41796028.532636] XFS (sdi1): Corruption detected. Unmount and run xfs_repair
> 
> Any comments will be much appreciated!
> 
> Best Regards!
> sunspot
> 
> 
> _______________________________________________
> xfs mailing list
> xfs@oss.sgi.com
> http://oss.sgi.com/mailman/listinfo/xfs
> 

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 18+ messages in thread

* XFS Corruption
@ 2016-02-24  6:12 fangchen sun
  2016-02-24 22:23 ` Eric Sandeen
  0 siblings, 1 reply; 18+ messages in thread
From: fangchen sun @ 2016-02-24  6:12 UTC (permalink / raw)
  To: xfs


[-- Attachment #1.1: Type: text/plain, Size: 2534 bytes --]

Dear all:

I have a ceph object storage cluster, and chose XFS as the underlying file
system.
I recently ran into a problem where the function "setxattr()" sometimes
fails, and I can only unmount the disk and repair it with "xfs_repair".

os: centos 6.5
kernel version: 2.6.32

the log for dmesg command:
[41796028.532225] Pid: 1438740, comm: ceph-osd Not tainted
2.6.32-925.431.23.3.letv.el6.x86_64 #1
[41796028.532227] Call Trace:
[41796028.532255]  [<ffffffffa01e1e5f>] ? xfs_error_report+0x3f/0x50 [xfs]
[41796028.532276]  [<ffffffffa01d506a>] ? xfs_da_read_buf+0x2a/0x30 [xfs]
[41796028.532296]  [<ffffffffa01e1ece>] ? xfs_corruption_error+0x5e/0x90
[xfs]
[41796028.532316]  [<ffffffffa01d4f4c>] ? xfs_da_do_buf+0x6cc/0x770 [xfs]
[41796028.532335]  [<ffffffffa01d506a>] ? xfs_da_read_buf+0x2a/0x30 [xfs]
[41796028.532359]  [<ffffffffa0206fc7>] ? kmem_zone_alloc+0x77/0xf0 [xfs]
[41796028.532380]  [<ffffffffa01d506a>] ? xfs_da_read_buf+0x2a/0x30 [xfs]
[41796028.532399]  [<ffffffffa01bc481>] ? xfs_attr_leaf_addname+0x61/0x3d0
[xfs]
[41796028.532426]  [<ffffffffa01bc481>] ? xfs_attr_leaf_addname+0x61/0x3d0
[xfs]
[41796028.532455]  [<ffffffffa01ff187>] ? xfs_trans_add_item+0x57/0x70 [xfs]
[41796028.532476]  [<ffffffffa01cc208>] ? xfs_bmbt_get_all+0x18/0x20 [xfs]
[41796028.532495]  [<ffffffffa01bcbb4>] ? xfs_attr_set_int+0x3c4/0x510 [xfs]
[41796028.532517]  [<ffffffffa01d4f5b>] ? xfs_da_do_buf+0x6db/0x770 [xfs]
[41796028.532536]  [<ffffffffa01bcd81>] ? xfs_attr_set+0x81/0x90 [xfs]
[41796028.532560]  [<ffffffffa0216cc3>] ? __xfs_xattr_set+0x43/0x60 [xfs]
[41796028.532584]  [<ffffffffa0216d31>] ? xfs_xattr_user_set+0x11/0x20 [xfs]
[41796028.532592]  [<ffffffff811aee92>] ? generic_setxattr+0xa2/0xb0
[41796028.532596]  [<ffffffff811b134e>] ? __vfs_setxattr_noperm+0x4e/0x160
[41796028.532600]  [<ffffffff81196b77>] ? inode_permission+0xa7/0x100
[41796028.532604]  [<ffffffff811b151c>] ? vfs_setxattr+0xbc/0xc0
[41796028.532607]  [<ffffffff811b15f0>] ? setxattr+0xd0/0x150
[41796028.532612]  [<ffffffff8105af80>] ? __dequeue_entity+0x30/0x50
[41796028.532617]  [<ffffffff8100988e>] ? __switch_to+0x26e/0x320
[41796028.532621]  [<ffffffff8118aec0>] ? __sb_start_write+0x80/0x120
[41796028.532626]  [<ffffffff8152912e>] ? thread_return+0x4e/0x760
[41796028.532630]  [<ffffffff811b171d>] ? sys_fsetxattr+0xad/0xd0
[41796028.532633]  [<ffffffff8100b072>] ? system_call_fastpath+0x16/0x1b
[41796028.532636] XFS (sdi1): Corruption detected. Unmount and run
xfs_repair

Any comments will be much appreciated!

Best Regards!
sunspot

[-- Attachment #1.2: Type: text/html, Size: 3620 bytes --]

[-- Attachment #2: Type: text/plain, Size: 121 bytes --]

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: XFS corruption
  2014-12-23  9:57         ` Alex Lyakas
@ 2014-12-23 20:36           ` Dave Chinner
  0 siblings, 0 replies; 18+ messages in thread
From: Dave Chinner @ 2014-12-23 20:36 UTC (permalink / raw)
  To: Alex Lyakas; +Cc: Brian Foster, Eric Sandeen, xfs

On Tue, Dec 23, 2014 at 11:57:13AM +0200, Alex Lyakas wrote:
> Hi Dave,
> 
> On Tue, Dec 23, 2014 at 2:39 AM, Dave Chinner <david@fromorbit.com> wrote:
> > commit 40194ecc6d78327d98e66de3213db96ca0a31e6f
> > Author: Ben Myers <bpm@sgi.com>
> > Date:   Fri Dec 6 12:30:11 2013 -0800
> >
> >     xfs: reinstate the ilock in xfs_readdir
> >
> >     Although it was removed in commit 051e7cd44ab8, ilock needs to be taken in
> >     xfs_readdir because we might have to read the extent list in from disk.  This
> >     keeps other threads from reading from or writing to the extent list while it i
> >     being read in and is still in a transitional state.
> >
> >     This has been associated with "Access to block zero" messages on directories
> >     with large numbers of extents resulting from excessive filesytem fragmentation
> >     as well as extent list corruption.  Unfortunately no test case at this point.
> >
> >     Signed-off-by: Ben Myers <bpm@sgi.com>
> >     Reviewed-by: Dave Chinner <dchinner@redhat.com>
> >
> > Seems to match the behaviour being seen.
> >
> > Alex, what type of inode is the one that is reporting the "access to
> > block zero" errors?
> I have just searched the relevant file system for this inode, but such
> inode was not found:(
> # find /export/XXX -mount -inum 1946454529
> did not find anything. Perhaps it got deleted since the incident.

It probably got cleared by xfs_repair because it was corrupt....

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: XFS corruption
  2014-12-23  0:39       ` Dave Chinner
@ 2014-12-23  9:57         ` Alex Lyakas
  2014-12-23 20:36           ` Dave Chinner
  0 siblings, 1 reply; 18+ messages in thread
From: Alex Lyakas @ 2014-12-23  9:57 UTC (permalink / raw)
  To: Dave Chinner; +Cc: Brian Foster, Eric Sandeen, xfs

Hi Dave,

On Tue, Dec 23, 2014 at 2:39 AM, Dave Chinner <david@fromorbit.com> wrote:
> On Mon, Dec 22, 2014 at 09:42:12AM -0500, Brian Foster wrote:
>> On Mon, Dec 22, 2014 at 10:08:18AM +1100, Dave Chinner wrote:
>> > On Sun, Dec 21, 2014 at 12:13:45PM -0600, Eric Sandeen wrote:
>> > > On 12/21/14 5:42 AM, Alex Lyakas wrote:
>> > > > Greetings,
>> > > > we encountered XFS corruption:
>> > >
>> > > > kernel: [774772.852316] ffff8801018c5000: 05 d1 fd 01 fd ff 2f ec 2f 8d 82 6a 81 fe c2 0f  .....././..j....
>> > >
>> > > There should have been 64 bytes of hexdump, not just the single line above, no?
>> >
>> > Yeah, really need the whole dmesg, because we've got readahead in
>> > the picture here so the number of times the corruption error is seen
>> > is actually important....
>> >
>> > >
>> > > > [813114.622928] IP: [<ffffffffa077bad9>] xfs_bmbt_get_all+0x9/0x20 [xfs]
>> > > > [813114.622928] PGD 0
>> > > > [813114.622928] Oops: 0000 [#1] SMP
>> > > > [813114.622928] CPU 2
>> > > > [813114.622928] Pid: 31120, comm: smbd Tainted: GF       W  O 3.8.13-030813-generic #201305111843 Bochs Bochs
>> > > > [813114.622928] RIP: 0010:[<ffffffffa077bad9>]  [<ffffffffa077bad9>] xfs_bmbt_get_all+0x9/0x20 [xfs]
>> > > > [813114.622928] RSP: 0018:ffff88010a193798  EFLAGS: 00010297
>> > > > [813114.622928] RAX: 0000000000000964 RBX: ffff880180fa9c38 RCX: ffffa5a5a5a5a5a5
>> >
>> > RCX implies gotp->br_startblock was not overwritten by the
>> > extent search. i.e. we've called xfs_bmap_search_multi_extents()
>> > but no extent was actually found.
>> >
>> > > > We analyzed several suspects, but all of them fall on disk addresses
>> > > > not near the corrupted disk address. I realize that running somewhat
>> > > > outdated kernel + our changes within XFSs, points back at us, but
>> > > > this is first time we see XFS corruption after about a year of this
>> > > > code being exercised. So posting here, just in case this is a known
>> > > > issue.
>> > >
>> > > well, xfs should _never_ oops, even if it encounters corruption.  So hopefully
>> > > we can work backwards from the trace above to what went wrong here.
>> > >
>> > > offhand, in xfs_bmap_search_multi_extents():
>> > >
>> > >         ep = xfs_iext_bno_to_ext(ifp, bno, &lastx);
>> > >         if (lastx > 0) {
>> > >                 xfs_bmbt_get_all(xfs_iext_get_ext(ifp, lastx - 1), prevp);
>> > >         }
>> > >         if (lastx < (ifp->if_bytes / (uint)sizeof(xfs_bmbt_rec_t))) {
>> > >                 xfs_bmbt_get_all(ep, gotp);
>> > >                 *eofp = 0;
>> > >
>> > > xfs_iext_bno_to_ext() can return NULL with lastx set to 0:
>> > >
>> > >         nextents = ifp->if_bytes / (uint)sizeof(xfs_bmbt_rec_t);
>> > >         if (nextents == 0) {
>> > >                 *idxp = 0;
>> > >                 return NULL;
>> > >         }
>> > >
>> > > (where idxp is the &lastx we sent in)
>> >
>> > > and if we do that, it sure seems like the "if lastx < ...." test will wind up
>> > > sending a null ep into xfs_bmbt_get_all, which would do a null ptr deref.
>> >
>> > No, it shouldn't, because for lastx to be set to 0 that way,
>> > ifp->if_bytes / (uint)sizeof(xfs_bmbt_rec_t) must be zero.
>> > Therefore, this:
>> >
>> >     if (lastx < (ifp->if_bytes / (uint)sizeof(xfs_bmbt_rec_t)))
>> >
>> > evaluates as:
>> >
>> >     if (0 < 0)
>> >
>> > which is not true, so we fall into the else case:
>> >
>> >     } else {
>> >                 if (lastx > 0) {
>> >                         *gotp = *prevp;
>> >                 }
>> >                 *eofp = 1;
>> >                 ep = NULL;
>> >         }
>> >         *lastxp = lastx;
>> >         return ep;
>> >
>> > Which basically overwrites *eofp and *lastxp, neither of which are
>> > NULL.
>> >
>> > However, the stack trace clearly shows we've just called
>> > xfs_bmap_search_multi_extents() - the "?" before the function name
>> > means it found the symbol in the stack, but not in the direct line
>> > of the frame pointers the current function stack points to.
>> >
>> > That makes me doubt the accuracy of the stack trace, because the
>> > only caller of xfs_bmap_search_multi_extents() is
>> > xfs_bmap_search_extents() and xfs_bmap_search_extents does not call
>> > xfs_bmbt_get_all() directly like the stack trace would lead us to
>> > believe. Hence I don't think we can trust the stack trace to be
>> > pointing us at the correct caller of xfs_bmbt_get_all(), which
>> > makes it really hard to isolate the cause...
>> >
>>
>> What seems strange to me here is why are we searching through extents
>> when the bmbt is presumed to be corrupt? I suppose we don't know for
>> sure whether the backtrace that panics is on the same inode, but the
>> fact that the panic is linked with the corruption errors suggests this
>> is likely.
>>
>> Digging through the current tot code to see how that might occur, I
>> noticed an XFS_ILOCK_EXCL assert in xfs_iread_extents() that doesn't
>> exist in 3.18.3. It looks like part of some fixes Christoph made a while
>> back, ending with the following commit in the commit log (see some of
>> the immediately prior commits as well):
>>
>> eef334e5776c xfs: assert that we hold the ilock for extent map access
>>
>> ... which suggests some paths were reading in inode extents without the
>> proper locking. That would appear to be problematic in its own right
>> given how XFS_IFEXTENTS is used. If that is the case, I wonder if
>> hitting that problem in combination with a bmbt that happens to be
>> corrupted is causing us to go off the rails? Just a theory... and
>> another reason it would be really nice to have a metadump. ;)
>
> commit 40194ecc6d78327d98e66de3213db96ca0a31e6f
> Author: Ben Myers <bpm@sgi.com>
> Date:   Fri Dec 6 12:30:11 2013 -0800
>
>     xfs: reinstate the ilock in xfs_readdir
>
>     Although it was removed in commit 051e7cd44ab8, ilock needs to be taken in
>     xfs_readdir because we might have to read the extent list in from disk.  This
>     keeps other threads from reading from or writing to the extent list while it is
>     being read in and is still in a transitional state.
>
>     This has been associated with "Access to block zero" messages on directories
>     with large numbers of extents resulting from excessive filesystem fragmentation
>     as well as extent list corruption.  Unfortunately no test case at this point.
>
>     Signed-off-by: Ben Myers <bpm@sgi.com>
>     Reviewed-by: Dave Chinner <dchinner@redhat.com>
>
> Seems to match the behaviour being seen.
>
> Alex, what type of inode is the one that is reporting the "access to
> block zero" errors?
I have just searched the relevant file system for this inode, but no such
inode was found :(
# find /export/XXX -mount -inum 1946454529
did not find anything. Perhaps it got deleted since the incident.
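
For what it's worth, the next step I can think of is querying the inode
directly with xfs_db, which does not depend on a directory entry still
existing -- a rough sketch, assuming the filesystem is on /dev/dm-72
(the device path is my guess):

# xfs_db -r -c "inode 1946454529" -c "print core.mode" /dev/dm-72
# xfs_db -r -c "inode 1946454529" -c "print core.format" /dev/dm-72

core.mode should show whether it was a directory or a regular file, and
core.format whether the data fork was in extents or btree format. Of
course, if xfs_repair already cleared the inode there may be little left
to see.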

Thanks again,
Alex.


>
> Cheers,
>
> Dave.
> --
> Dave Chinner
> david@fromorbit.com

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: XFS corruption
  2014-12-22 14:42     ` Brian Foster
@ 2014-12-23  0:39       ` Dave Chinner
  2014-12-23  9:57         ` Alex Lyakas
  0 siblings, 1 reply; 18+ messages in thread
From: Dave Chinner @ 2014-12-23  0:39 UTC (permalink / raw)
  To: Brian Foster; +Cc: Alex Lyakas, Eric Sandeen, xfs

On Mon, Dec 22, 2014 at 09:42:12AM -0500, Brian Foster wrote:
> On Mon, Dec 22, 2014 at 10:08:18AM +1100, Dave Chinner wrote:
> > On Sun, Dec 21, 2014 at 12:13:45PM -0600, Eric Sandeen wrote:
> > > On 12/21/14 5:42 AM, Alex Lyakas wrote:
> > > > Greetings,
> > > > we encountered XFS corruption:
> > > 
> > > > kernel: [774772.852316] ffff8801018c5000: 05 d1 fd 01 fd ff 2f ec 2f 8d 82 6a 81 fe c2 0f  .....././..j....    
> > > 
> > > There should have been 64 bytes of hexdump, not just the single line above, no?
> > 
> > Yeah, really need the whole dmesg, because we've got readahead in
> > the picture here so the number of times the corruption error is seen
> > is actually important....
> > 
> > > 
> > > > [813114.622928] IP: [<ffffffffa077bad9>] xfs_bmbt_get_all+0x9/0x20 [xfs]
> > > > [813114.622928] PGD 0
> > > > [813114.622928] Oops: 0000 [#1] SMP
> > > > [813114.622928] CPU 2
> > > > [813114.622928] Pid: 31120, comm: smbd Tainted: GF       W  O 3.8.13-030813-generic #201305111843 Bochs Bochs
> > > > [813114.622928] RIP: 0010:[<ffffffffa077bad9>]  [<ffffffffa077bad9>] xfs_bmbt_get_all+0x9/0x20 [xfs]
> > > > [813114.622928] RSP: 0018:ffff88010a193798  EFLAGS: 00010297
> > > > [813114.622928] RAX: 0000000000000964 RBX: ffff880180fa9c38 RCX: ffffa5a5a5a5a5a5
> > 
> > RCX implies gotp->br_startblock was not overwritten by the
> > extent search. i.e. we've called xfs_bmap_search_multi_extents()
> > but no extent was actually found.
> > 
> > > > We analyzed several suspects, but all of them fall on disk addresses
> > > > not near the corrupted disk address. I realize that running a somewhat
> > > > outdated kernel + our changes within XFS points back at us, but
> > > > this is the first time we have seen XFS corruption after about a year of this
> > > > code being exercised. So posting here, just in case this is a known
> > > > issue.
> > > 
> > > well, xfs should _never_ oops, even if it encounters corruption.  So hopefully
> > > we can work backwards from the trace above to what went wrong here.
> > > 
> > > offhand, in xfs_bmap_search_multi_extents():
> > > 
> > >         ep = xfs_iext_bno_to_ext(ifp, bno, &lastx);
> > >         if (lastx > 0) {
> > >                 xfs_bmbt_get_all(xfs_iext_get_ext(ifp, lastx - 1), prevp);
> > >         }
> > >         if (lastx < (ifp->if_bytes / (uint)sizeof(xfs_bmbt_rec_t))) {
> > >                 xfs_bmbt_get_all(ep, gotp);
> > >                 *eofp = 0;
> > > 
> > > xfs_iext_bno_to_ext() can return NULL with lastx set to 0:
> > > 
> > >         nextents = ifp->if_bytes / (uint)sizeof(xfs_bmbt_rec_t);
> > >         if (nextents == 0) {
> > >                 *idxp = 0;
> > >                 return NULL;
> > >         }
> > > 
> > > (where idxp is the &lastx we sent in)
> > 
> > > and if we do that, it sure seems like the "if lastx < ...." test will wind up
> > > sending a null ep into xfs_bmbt_get_all, which would do a null ptr deref.
> > 
> > No, it shouldn't, because for lastx to be set to 0 that way,
> > ifp->if_bytes / (uint)sizeof(xfs_bmbt_rec_t) must be zero.
> > Therefore, this:
> > 
> > 	if (lastx < (ifp->if_bytes / (uint)sizeof(xfs_bmbt_rec_t)))
> > 
> > evaluates as:
> > 
> > 	if (0 < 0)
> > 
> > which is not true, so we fall into the else case:
> > 
> > 	} else {
> >                 if (lastx > 0) {
> >                         *gotp = *prevp;
> >                 }
> >                 *eofp = 1;
> >                 ep = NULL;
> >         }
> >         *lastxp = lastx;
> >         return ep;
> > 
> > Which basically overwrites *eofp and *lastxp, neither of which are
> > NULL.
> > 
> > However, the stack trace clearly shows we've just called
> > xfs_bmap_search_multi_extents() - the "?" before the function name
> > means it found the symbol in the stack, but not in the direct line
> > of the frame pointers the current function stack points to.
> > 
> > That makes me doubt the accuracy of the stack trace, because the
> > only caller of xfs_bmap_search_multi_extents() is
> > xfs_bmap_search_extents() and xfs_bmap_search_extents does not call
> > xfs_bmbt_get_all() directly like the stack trace would lead us to
> > believe. Hence I don't think we can trust the stack trace to be
> > pointing us at the correct caller of xfs_bmbt_get_all(), which
> > makes it really hard to isolate the cause...
> > 
> 
> What seems strange to me here is why are we searching through extents
> when the bmbt is presumed to be corrupt? I suppose we don't know for
> sure whether the backtrace that panics is on the same inode, but the
> fact that the panic is linked with the corruption errors suggests this
> is likely.
> 
> Digging through the current tot code to see how that might occur, I
> noticed an XFS_ILOCK_EXCL assert in xfs_iread_extents() that doesn't
> exist in 3.18.3. It looks like part of some fixes Christoph made a while
> back, ending with the following commit in the commit log (see some of
> the immediately prior commits as well):
> 
> eef334e5776c xfs: assert that we hold the ilock for extent map access
> 
> ... which suggests some paths were reading in inode extents without the
> proper locking. That would appear to be problematic in its own right
> given how XFS_IFEXTENTS is used. If that is the case, I wonder if
> hitting that problem in combination with a bmbt that happens to be
> corrupted is causing us to go off the rails? Just a theory... and
> another reason it would be really nice to have a metadump. ;)

commit 40194ecc6d78327d98e66de3213db96ca0a31e6f
Author: Ben Myers <bpm@sgi.com>
Date:   Fri Dec 6 12:30:11 2013 -0800

    xfs: reinstate the ilock in xfs_readdir
    
    Although it was removed in commit 051e7cd44ab8, ilock needs to be taken in
    xfs_readdir because we might have to read the extent list in from disk.  This
    keeps other threads from reading from or writing to the extent list while it is
    being read in and is still in a transitional state.
    
    This has been associated with "Access to block zero" messages on directories
    with large numbers of extents resulting from excessive filesystem fragmentation
    as well as extent list corruption.  Unfortunately no test case at this point.
    
    Signed-off-by: Ben Myers <bpm@sgi.com>
    Reviewed-by: Dave Chinner <dchinner@redhat.com>

Seems to match the behaviour being seen.

Alex, what type of inode is the one that is reporting the "access to
block zero" errors?

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: XFS corruption
  2014-12-21 23:08   ` Dave Chinner
  2014-12-22 10:09     ` Alex Lyakas
@ 2014-12-22 14:42     ` Brian Foster
  2014-12-23  0:39       ` Dave Chinner
  1 sibling, 1 reply; 18+ messages in thread
From: Brian Foster @ 2014-12-22 14:42 UTC (permalink / raw)
  To: Dave Chinner; +Cc: Alex Lyakas, Eric Sandeen, xfs

On Mon, Dec 22, 2014 at 10:08:18AM +1100, Dave Chinner wrote:
> On Sun, Dec 21, 2014 at 12:13:45PM -0600, Eric Sandeen wrote:
> > On 12/21/14 5:42 AM, Alex Lyakas wrote:
> > > Greetings,
> > > we encountered XFS corruption:
> > 
> > > kernel: [774772.852316] ffff8801018c5000: 05 d1 fd 01 fd ff 2f ec 2f 8d 82 6a 81 fe c2 0f  .....././..j....    
> > 
> > There should have been 64 bytes of hexdump, not just the single line above, no?
> 
> Yeah, really need the whole dmesg, because we've got readahead in
> the picture here so the number of times the corruption error is seen
> is actually important....
> 
> > 
> > > [813114.622928] IP: [<ffffffffa077bad9>] xfs_bmbt_get_all+0x9/0x20 [xfs]
> > > [813114.622928] PGD 0
> > > [813114.622928] Oops: 0000 [#1] SMP
> > > [813114.622928] CPU 2
> > > [813114.622928] Pid: 31120, comm: smbd Tainted: GF       W  O 3.8.13-030813-generic #201305111843 Bochs Bochs
> > > [813114.622928] RIP: 0010:[<ffffffffa077bad9>]  [<ffffffffa077bad9>] xfs_bmbt_get_all+0x9/0x20 [xfs]
> > > [813114.622928] RSP: 0018:ffff88010a193798  EFLAGS: 00010297
> > > [813114.622928] RAX: 0000000000000964 RBX: ffff880180fa9c38 RCX: ffffa5a5a5a5a5a5
> 
> RCX implies gotp->br_startblock was not overwritten by the
> extent search. i.e. we've called xfs_bmap_search_multi_extents()
> but no extent was actually found.
> 
> > > We analyzed several suspects, but all of them fall on disk addresses
> > > not near the corrupted disk address. I realize that running a somewhat
> > > outdated kernel + our changes within XFS points back at us, but
> > > this is the first time we have seen XFS corruption after about a year of this
> > > code being exercised. So posting here, just in case this is a known
> > > issue.
> > 
> > well, xfs should _never_ oops, even if it encounters corruption.  So hopefully
> > we can work backwards from the trace above to what went wrong here.
> > 
> > offhand, in xfs_bmap_search_multi_extents():
> > 
> >         ep = xfs_iext_bno_to_ext(ifp, bno, &lastx);
> >         if (lastx > 0) {
> >                 xfs_bmbt_get_all(xfs_iext_get_ext(ifp, lastx - 1), prevp);
> >         }
> >         if (lastx < (ifp->if_bytes / (uint)sizeof(xfs_bmbt_rec_t))) {
> >                 xfs_bmbt_get_all(ep, gotp);
> >                 *eofp = 0;
> > 
> > xfs_iext_bno_to_ext() can return NULL with lastx set to 0:
> > 
> >         nextents = ifp->if_bytes / (uint)sizeof(xfs_bmbt_rec_t);
> >         if (nextents == 0) {
> >                 *idxp = 0;
> >                 return NULL;
> >         }
> > 
> > (where idxp is the &lastx we sent in)
> 
> > and if we do that, it sure seems like the "if lastx < ...." test will wind up
> > sending a null ep into xfs_bmbt_get_all, which would do a null ptr deref.
> 
> No, it shouldn't, because for lastx to be set to 0 that way,
> ifp->if_bytes / (uint)sizeof(xfs_bmbt_rec_t) must be zero.
> Therefore, this:
> 
> 	if (lastx < (ifp->if_bytes / (uint)sizeof(xfs_bmbt_rec_t)))
> 
> evaluates as:
> 
> 	if (0 < 0)
> 
> which is not true, so we fall into the else case:
> 
> 	} else {
>                 if (lastx > 0) {
>                         *gotp = *prevp;
>                 }
>                 *eofp = 1;
>                 ep = NULL;
>         }
>         *lastxp = lastx;
>         return ep;
> 
> Which basically overwrites *eofp and *lastxp, neither of which are
> NULL.
> 
> However, the stack trace clearly shows we've just called
> xfs_bmap_search_multi_extents() - the "?" before the function name
> means it found the symbol in the stack, but not in the direct line
> of the frame pointers the current function stack points to.
> 
> That makes me doubt the accuracy of the stack trace, because the
> only caller of xfs_bmap_search_multi_extents() is
> xfs_bmap_search_extents() and xfs_bmap_search_extents does not call
> xfs_bmbt_get_all() directly like the stack trace would lead us to
> believe. Hence I don't think we can trust the stack trace to be
> pointing us at the correct caller of xfs_bmbt_get_all(), which
> makes it really hard to isolate the cause...
> 

What seems strange to me here is why are we searching through extents
when the bmbt is presumed to be corrupt? I suppose we don't know for
sure whether the backtrace that panics is on the same inode, but the
fact that the panic is linked with the corruption errors suggests this
is likely.

Digging through the current tot code to see how that might occur, I
noticed an XFS_ILOCK_EXCL assert in xfs_iread_extents() that doesn't
exist in 3.18.3. It looks like part of some fixes Christoph made a while
back, ending with the following commit in the commit log (see some of
the immediately prior commits as well):

eef334e5776c xfs: assert that we hold the ilock for extent map access

... which suggests some paths were reading in inode extents without the
proper locking. That would appear to be problematic in its own right
given how XFS_IFEXTENTS is used. If that is the case, I wonder if
hitting that problem in combination with a bmbt that happens to be
corrupted is causing us to go off the rails? Just a theory... and
another reason it would be really nice to have a metadump. ;)
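
If one does turn up, it can be restored to a sparse file and poked at
offline, without needing the original volume -- roughly, with
placeholder file names:

# xfs_mdrestore corrupted.metadump corrupted.img
# xfs_repair -n -f corrupted.img

Since xfs_repair -n only reports what it would fix, that would at least
tell us whether the damage is really sitting on disk or is something the
running kernel manufactured.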

Brian

> Cheers,
> 
> Dave.
> -- 
> Dave Chinner
> david@fromorbit.com
> 
> _______________________________________________
> xfs mailing list
> xfs@oss.sgi.com
> http://oss.sgi.com/mailman/listinfo/xfs

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: XFS corruption
  2014-12-21 23:08   ` Dave Chinner
@ 2014-12-22 10:09     ` Alex Lyakas
  2014-12-22 14:42     ` Brian Foster
  1 sibling, 0 replies; 18+ messages in thread
From: Alex Lyakas @ 2014-12-22 10:09 UTC (permalink / raw)
  To: Dave Chinner; +Cc: Eric Sandeen, xfs

Hi Eric, Dave,

Thank you for looking at this.

On Mon, Dec 22, 2014 at 1:08 AM, Dave Chinner <david@fromorbit.com> wrote:
> On Sun, Dec 21, 2014 at 12:13:45PM -0600, Eric Sandeen wrote:
>> On 12/21/14 5:42 AM, Alex Lyakas wrote:
>> > Greetings,
>> > we encountered XFS corruption:
>>
>> > kernel: [774772.852316] ffff8801018c5000: 05 d1 fd 01 fd ff 2f ec 2f 8d 82 6a 81 fe c2 0f  .....././..j....
>>
>> There should have been 64 bytes of hexdump, not just the single line above, no?
>
> Yeah, really need the whole dmesg, because we've got readahead in
> the picture here so the number of times the corruption error is seen
> is actually important....
>

I uploaded the full dump, captured by our kmsg dumper here:
https://drive.google.com/file/d/0ByBy89zr3kJNUkRfRG9TMWVnVkU/view?usp=sharing

As far as I see, all the corruption warnings are the same, and they
all print only one line of hex dump. There are some additional
warnings, like:
[812756.915765] XFS (dm-72): Access to block zero in inode 1946454529
start_block: 0 start_off: 0 blkcnt: 0 extent-state: 0 lastx: 964
[812756.915765]
[812756.915772] XFS (dm-72): Access to block zero in inode 1946454529
start_block: 0 start_off: 0 blkcnt: 0 extent-state: 0 lastx: 964
[812756.915772]
[812756.915815] XFS (dm-72): Access to block zero in inode 1946454529
start_block: 0 start_off: 0 blkcnt: 0 extent-state: 0 lastx: 964

Two more log files (one prior to the crash and one from another VM
that took over after the crash). All corruption reports are the same.
https://drive.google.com/file/d/0ByBy89zr3kJNSHRCaUxDQnBEZHc/view?usp=sharing
https://drive.google.com/file/d/0ByBy89zr3kJNYk1hRTRaVDE4ZzA/view?usp=sharing

Unfortunately, I did not capture the output of xfs_repair. I also have
not captured the metadump. So I realize we do not have much to work
on.

Thanks!
Alex.


>>
>> > [813114.622928] IP: [<ffffffffa077bad9>] xfs_bmbt_get_all+0x9/0x20 [xfs]
>> > [813114.622928] PGD 0
>> > [813114.622928] Oops: 0000 [#1] SMP
>> > [813114.622928] CPU 2
>> > [813114.622928] Pid: 31120, comm: smbd Tainted: GF       W  O 3.8.13-030813-generic #201305111843 Bochs Bochs
>> > [813114.622928] RIP: 0010:[<ffffffffa077bad9>]  [<ffffffffa077bad9>] xfs_bmbt_get_all+0x9/0x20 [xfs]
>> > [813114.622928] RSP: 0018:ffff88010a193798  EFLAGS: 00010297
>> > [813114.622928] RAX: 0000000000000964 RBX: ffff880180fa9c38 RCX: ffffa5a5a5a5a5a5
>
> RCX implies gotp->br_startblock was not overwritten by the
> extent search. i.e. we've called xfs_bmap_search_multi_extents()
> but no extent was actually found.
>
>> > We analyzed several suspects, but all of them fall on disk addresses
>> > not near the corrupted disk address. I realize that running a somewhat
>> > outdated kernel + our changes within XFS points back at us, but
>> > this is the first time we have seen XFS corruption after about a year of this
>> > code being exercised. So posting here, just in case this is a known
>> > issue.
>>
>> well, xfs should _never_ oops, even if it encounters corruption.  So hopefully
>> we can work backwards from the trace above to what went wrong here.
>>
>> offhand, in xfs_bmap_search_multi_extents():
>>
>>         ep = xfs_iext_bno_to_ext(ifp, bno, &lastx);
>>         if (lastx > 0) {
>>                 xfs_bmbt_get_all(xfs_iext_get_ext(ifp, lastx - 1), prevp);
>>         }
>>         if (lastx < (ifp->if_bytes / (uint)sizeof(xfs_bmbt_rec_t))) {
>>                 xfs_bmbt_get_all(ep, gotp);
>>                 *eofp = 0;
>>
>> xfs_iext_bno_to_ext() can return NULL with lastx set to 0:
>>
>>         nextents = ifp->if_bytes / (uint)sizeof(xfs_bmbt_rec_t);
>>         if (nextents == 0) {
>>                 *idxp = 0;
>>                 return NULL;
>>         }
>>
>> (where idxp is the &lastx we sent in)
>
>> and if we do that, it sure seems like the "if lastx < ...." test will wind up
>> sending a null ep into xfs_bmbt_get_all, which would do a null ptr deref.
>
> No, it shouldn't, because for lastx to be set to 0 that way,
> ifp->if_bytes / (uint)sizeof(xfs_bmbt_rec_t) must be zero.
> Therefore, this:
>
>         if (lastx < (ifp->if_bytes / (uint)sizeof(xfs_bmbt_rec_t)))
>
> evaluates as:
>
>         if (0 < 0)
>
> which is not true, so we fall into the else case:
>
>         } else {
>                 if (lastx > 0) {
>                         *gotp = *prevp;
>                 }
>                 *eofp = 1;
>                 ep = NULL;
>         }
>         *lastxp = lastx;
>         return ep;
>
> Which basically overwrites *eofp and *lastxp, neither of which are
> NULL.
>
> However, the stack trace clearly shows we've just called
> xfs_bmap_search_multi_extents() - the "?" before the function name
> means it found the symbol in the stack, but not in the direct line
> of the frame pointers the current function stack points to.
>
> That makes me doubt the accuracy of the stack trace, because the
> only caller of xfs_bmap_search_multi_extents() is
> xfs_bmap_search_extents() and xfs_bmap_search_extents does not call
> xfs_bmbt_get_all() directly like the stack trace would lead us to
> believe. Hence I don't think we can trust the stack trace to be
> pointing us at the correct caller of xfs_bmbt_get_all(), which
> makes it really hard to isolate the cause...
>
> Cheers,
>
> Dave.
> --
> Dave Chinner
> david@fromorbit.com

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: XFS corruption
  2014-12-21 18:13 ` Eric Sandeen
@ 2014-12-21 23:08   ` Dave Chinner
  2014-12-22 10:09     ` Alex Lyakas
  2014-12-22 14:42     ` Brian Foster
  0 siblings, 2 replies; 18+ messages in thread
From: Dave Chinner @ 2014-12-21 23:08 UTC (permalink / raw)
  To: Eric Sandeen; +Cc: Alex Lyakas, xfs

On Sun, Dec 21, 2014 at 12:13:45PM -0600, Eric Sandeen wrote:
> On 12/21/14 5:42 AM, Alex Lyakas wrote:
> > Greetings,
> > we encountered XFS corruption:
> 
> > kernel: [774772.852316] ffff8801018c5000: 05 d1 fd 01 fd ff 2f ec 2f 8d 82 6a 81 fe c2 0f  .....././..j....    
> 
> There should have been 64 bytes of hexdump, not just the single line above, no?

Yeah, really need the whole dmesg, because we've got readahead in
the picture here so the number of times the corruption error is seen
is actually important....

> 
> > [813114.622928] IP: [<ffffffffa077bad9>] xfs_bmbt_get_all+0x9/0x20 [xfs]
> > [813114.622928] PGD 0
> > [813114.622928] Oops: 0000 [#1] SMP
> > [813114.622928] CPU 2
> > [813114.622928] Pid: 31120, comm: smbd Tainted: GF       W  O 3.8.13-030813-generic #201305111843 Bochs Bochs
> > [813114.622928] RIP: 0010:[<ffffffffa077bad9>]  [<ffffffffa077bad9>] xfs_bmbt_get_all+0x9/0x20 [xfs]
> > [813114.622928] RSP: 0018:ffff88010a193798  EFLAGS: 00010297
> > [813114.622928] RAX: 0000000000000964 RBX: ffff880180fa9c38 RCX: ffffa5a5a5a5a5a5

RCX implies gotp->br_startblock was not overwritten by the
extent search. i.e. we've called xfs_bmap_search_multi_extents()
but no extent was actually found.

> > We analyzed several suspects, but all of them fall on disk addresses
> > not near the corrupted disk address. I realize that running a somewhat
> > outdated kernel + our changes within XFS points back at us, but
> > this is the first time we have seen XFS corruption after about a year of this
> > code being exercised. So posting here, just in case this is a known
> > issue.
> 
> well, xfs should _never_ oops, even if it encounters corruption.  So hopefully
> we can work backwards from the trace above to what went wrong here.
> 
> offhand, in xfs_bmap_search_multi_extents():
> 
>         ep = xfs_iext_bno_to_ext(ifp, bno, &lastx);
>         if (lastx > 0) {
>                 xfs_bmbt_get_all(xfs_iext_get_ext(ifp, lastx - 1), prevp);
>         }
>         if (lastx < (ifp->if_bytes / (uint)sizeof(xfs_bmbt_rec_t))) {
>                 xfs_bmbt_get_all(ep, gotp);
>                 *eofp = 0;
> 
> xfs_iext_bno_to_ext() can return NULL with lastx set to 0:
> 
>         nextents = ifp->if_bytes / (uint)sizeof(xfs_bmbt_rec_t);
>         if (nextents == 0) {
>                 *idxp = 0;
>                 return NULL;
>         }
> 
> (where idxp is the &lastx we sent in)

> and if we do that, it sure seems like the "if lastx < ...." test will wind up
> sending a null ep into xfs_bmbt_get_all, which would do a null ptr deref.

No, it shouldn't, because for lastx to be set to 0 that way,
ifp->if_bytes / (uint)sizeof(xfs_bmbt_rec_t) must be zero.
Therefore, this:

	if (lastx < (ifp->if_bytes / (uint)sizeof(xfs_bmbt_rec_t)))

evaluates as:

	if (0 < 0)

which is not true, so we fall into the else case:

	} else {
                if (lastx > 0) {
                        *gotp = *prevp;
                }
                *eofp = 1;
                ep = NULL;
        }
        *lastxp = lastx;
        return ep;

Which basically overwrites *eofp and *lastxp, neither of which are
NULL.

However, the stack trace clearly shows we've just called
xfs_bmap_search_multi_extents() - the "?" before the function name
means it found the symbol in the stack, but not in the direct line
of the frame pointers the current function stack points to.

That makes me doubt the accuracy of the stack trace, because the
only caller of xfs_bmap_search_multi_extents() is
xfs_bmap_search_extents() and xfs_bmap_search_extents does not call
xfs_bmbt_get_all() directly like the stack trace would lead us to
believe. Hence I don't think we can trust the stack trace to be
pointing us at the correct caller of xfs_bmbt_get_all(), which
makes it really hard to isolate the cause...

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: XFS corruption
  2014-12-21 11:42 XFS corruption Alex Lyakas
@ 2014-12-21 18:13 ` Eric Sandeen
  2014-12-21 23:08   ` Dave Chinner
  0 siblings, 1 reply; 18+ messages in thread
From: Eric Sandeen @ 2014-12-21 18:13 UTC (permalink / raw)
  To: Alex Lyakas, xfs

On 12/21/14 5:42 AM, Alex Lyakas wrote:
> Greetings,
> we encountered XFS corruption:

> kernel: [774772.852316] ffff8801018c5000: 05 d1 fd 01 fd ff 2f ec 2f 8d 82 6a 81 fe c2 0f  .....././..j....    

There should have been 64 bytes of hexdump, not just the single line above, no?

> kernel: [774772.854820] XFS (dm-72): Internal error xfs_bmbt_verify at line 747 of file /mnt/share/builds/14.09--3.8.13-030813-generic/2014-11-30_15-47-58--14.09-1419-28/src/zadara-btrfs/fs/xfs/xfs_bmap_btree.c.  Caller 0xffffffffa077b6be

so, btree corruption

> kernel: [774772.854820]                                                                                        
> kernel: [774772.860766] Pid: 14643, comm: kworker/0:0H Tainted: GF       W  O 3.8.13-030813-generic #20130511184
> kernel: [774772.860771] Call Trace:                                                                            
> kernel: [774772.860909]  [<ffffffffa074abaf>] xfs_error_report+0x3f/0x50 [xfs]                                 
> kernel: [774772.860961]  [<ffffffffa077b6be>] ? xfs_bmbt_read_verify+0xe/0x10 [xfs]                            
> kernel: [774772.860985]  [<ffffffffa074ac1e>] xfs_corruption_error+0x5e/0x90 [xfs]                             
> kernel: [774772.861014]  [<ffffffffa077b537>] xfs_bmbt_verify+0x77/0x1e0 [xfs]                                 
> kernel: [774772.861047]  [<ffffffffa077b6be>] ? xfs_bmbt_read_verify+0xe/0x10 [xfs]                            
> kernel: [774772.861077]  [<ffffffff810135aa>] ? __switch_to+0x12a/0x4a0                                        
> kernel: [774772.861129]  [<ffffffff81096cd8>] ? set_next_entity+0xa8/0xc0                                      
> kernel: [774772.861145]  [<ffffffffa077b6be>] xfs_bmbt_read_verify+0xe/0x10 [xfs]                              
> kernel: [774772.861157]  [<ffffffffa074848f>] xfs_buf_iodone_work+0x3f/0xa0 [xfs]                              
> kernel: [774772.861161]  [<ffffffff81078b81>] process_one_work+0x141/0x490                                     
> kernel: [774772.861164]  [<ffffffff81079b48>] worker_thread+0x168/0x400                                        
> kernel: [774772.861166]  [<ffffffff810799e0>] ? manage_workers+0x120/0x120                                     
> kernel: [774772.861170]  [<ffffffff8107f050>] kthread+0xc0/0xd0                                                
> kernel: [774772.861172]  [<ffffffff8107ef90>] ? flush_kthread_worker+0xb0/0xb0                                 
> kernel: [774772.861193]  [<ffffffff816f61ec>] ret_from_fork+0x7c/0xb0                                          
> kernel: [774772.861199]  [<ffffffff8107ef90>] ? flush_kthread_worker+0xb0/0xb0                                 
> kernel: [774772.861318] XFS (dm-72): Corruption detected. Unmount and run xfs_repair                           
> kernel: [774772.863449] XFS (dm-72): metadata I/O error: block 0x2434e3e8 ("xfs_trans_read_buf_map") error 117 numblks 8
>  
> All the corruption reports were for the same block 0x2434e3e8, which according to the code is simply disk address (xfs_daddr_t) 607445992. So there was only one block corrupted.
>  
> Some time later, XFS crashed with:
> [813114.622928] NULL pointer dereference[813114.622928]  at 0000000000000008

ok that's worse.  ;)

> [813114.622928] IP: [<ffffffffa077bad9>] xfs_bmbt_get_all+0x9/0x20 [xfs]
> [813114.622928] PGD 0
> [813114.622928] Oops: 0000 [#1] SMP
> [813114.622928] CPU 2
> [813114.622928] Pid: 31120, comm: smbd Tainted: GF       W  O 3.8.13-030813-generic #201305111843 Bochs Bochs
> [813114.622928] RIP: 0010:[<ffffffffa077bad9>]  [<ffffffffa077bad9>] xfs_bmbt_get_all+0x9/0x20 [xfs]
> [813114.622928] RSP: 0018:ffff88010a193798  EFLAGS: 00010297
> [813114.622928] RAX: 0000000000000964 RBX: ffff880180fa9c38 RCX: ffffa5a5a5a5a5a5
> [813114.622928] RDX: ffff88010a193898 RSI: ffff88010a193898 RDI: 0000000000000000
> [813114.622928] RBP: ffff88010a1937f8 R08: ffff88010a193898 R09: ffff88010a1938b8
> [813114.622928] R10: ffffea0005de0940 R11: 0000000000004d0e R12: ffff88010a1938dc
> [813114.622928] R13: ffff88010a1938e0 R14: ffff88010a193898 R15: ffff88010a1938b8
> [813114.622928] FS:  00007eff2dc7e700(0000) GS:ffff88021fd00000(0000) knlGS:0000000000000000
> [813114.622928] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [813114.622928] CR2: 0000000000000008 CR3: 0000000109574000 CR4: 00000000001406e0
> [813114.622928] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [813114.622928] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> [813114.622928] Process smbd (pid: 31120, threadinfo ffff88010a192000, task ffff88011687ae80)
> [813114.622928] Stack:
> [813114.622928]  ffff88010a1937f8 ffffffffa076f85a ffffffffffffffff 0000000000000000
> [813114.622928]  ffffffff816ec509 000000000a193830 ffffffff816ed31d ffff88010a193898
> [813114.622928]  ffff880180fa9c00 0000000000000000 ffff88010a1938dc ffff88010a1938e0
> [813114.622928] Call Trace:
> [813114.622928]  [<ffffffffa076f85a>] ? xfs_bmap_search_multi_extents+0xaa/0x110 [xfs]
> [813114.622928]  [<ffffffff816ec509>] ? schedule+0x29/0x70
> [813114.622928]  [<ffffffff816ed31d>] ? rwsem_down_failed_common+0xcd/0x170
> [813114.622928]  [<ffffffffa076f92e>] xfs_bmap_search_extents+0x6e/0xf0 [xfs]
> [813114.622928]  [<ffffffffa0778d6c>] xfs_bmapi_read+0xfc/0x2f0 [xfs]
> [813114.622928]  [<ffffffffa0792a49>] ? xfs_ilock_map_shared+0x49/0x60 [xfs]
> [813114.622928]  [<ffffffffa07459a8>] __xfs_get_blocks+0xe8/0x550 [xfs]
> [813114.622928]  [<ffffffff8135d8c4>] ? call_rwsem_down_read_failed+0x14/0x30
> [813114.622928]  [<ffffffffa0745e41>] xfs_get_blocks+0x11/0x20 [xfs]
> [813114.622928]  [<ffffffff811d05b7>] block_read_full_page+0x127/0x360
> [813114.622928]  [<ffffffffa0745e30>] ? xfs_get_blocks_direct+0x20/0x20 [xfs]
> [813114.622928]  [<ffffffff811d9b0f>] do_mpage_readpage+0x35f/0x550
> [813114.622928]  [<ffffffff816f1025>] ? do_async_page_fault+0x35/0x90
> [813114.622928]  [<ffffffff816edd48>] ? async_page_fault+0x28/0x30
> [813114.622928]  [<ffffffff811d9d4f>] mpage_readpage+0x4f/0x70
> [813114.622928]  [<ffffffffa0745e30>] ? xfs_get_blocks_direct+0x20/0x20 [xfs]
> [813114.622928]  [<ffffffff81134da8>] ? file_read_actor+0x68/0x160
> [813114.622928]  [<ffffffff81134e04>] ? file_read_actor+0xc4/0x160
> [813114.622928]  [<ffffffff81354bfe>] ? radix_tree_lookup_slot+0xe/0x10
> [813114.622928]  [<ffffffffa07451b8>] xfs_vm_readpage+0x18/0x20 [xfs]
> [813114.622928]  [<ffffffff811364ad>] do_generic_file_read.constprop.31+0x10d/0x440
> [813114.622928]  [<ffffffff811374d1>] generic_file_aio_read+0xe1/0x220
> [813114.622928]  [<ffffffffa074fb98>] xfs_file_aio_read+0x1c8/0x330 [xfs]
> [813114.622928]  [<ffffffff8119ad93>] do_sync_read+0xa3/0xe0
> [813114.622928]  [<ffffffff8119b4d0>] vfs_read+0xb0/0x180
> [813114.622928]  [<ffffffff8119b77a>] sys_pread64+0x9a/0xa0
> [813114.622928]  [<ffffffff816f629d>] system_call_fastpath+0x1a/0x1f
> [813114.622928] Code: d8 4c 8b 65 e0 4c 8b 6d e8 4c 8b 75 f0 4c 8b 7d f8 c9 c3 66 66 66 66 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 55 48 89 f2 <48> 8b 77 08 48 8b 3f 48 89 e5 e8 48 f8 ff ff 5d c3 66 0f 1f 44
> [813114.622928] RIP  [<ffffffffa077bad9>] xfs_bmbt_get_all+0x9/0x20 [xfs]
> [813114.622928]  RSP <ffff88010a193798>
> [813114.622928] CR2: 0000000000000008
> [813114.721138] ---[ end trace cce2a358d4050d3d ]---
>  
> We are running XFS based on kernel 3.8.13, with our changes for
> large-block discard in
> https://github.com/zadarastorage/zadara-xfs-pushback.

hmmm... so a custom kernel, that makes it trickier.

> We analyzed several suspects, but all of them fall on disk addresses
> not near the corrupted disk address. I realize that running a somewhat
> outdated kernel + our changes within XFS points back at us, but
> this is the first time we have seen XFS corruption after about a year of this
> code being exercised. So posting here, just in case this is a known
> issue.

well, xfs should _never_ oops, even if it encounters corruption.  So hopefully
we can work backwards from the trace above to what went wrong here.

offhand, in xfs_bmap_search_multi_extents():

        ep = xfs_iext_bno_to_ext(ifp, bno, &lastx);
        if (lastx > 0) {
                xfs_bmbt_get_all(xfs_iext_get_ext(ifp, lastx - 1), prevp);
        }
        if (lastx < (ifp->if_bytes / (uint)sizeof(xfs_bmbt_rec_t))) {
                xfs_bmbt_get_all(ep, gotp);
                *eofp = 0;

xfs_iext_bno_to_ext() can return NULL with lastx set to 0:

        nextents = ifp->if_bytes / (uint)sizeof(xfs_bmbt_rec_t);
        if (nextents == 0) {
                *idxp = 0;
                return NULL;
        }

(where idxp is the &lastx we sent in)

and if we do that, it sure seems like the "if lastx < ...." test will wind up
sending a null ep into xfs_bmbt_get_all, which would do a null ptr deref.

> I must point out that xfs_repair was able to fix this, which was
> awesome!

do you have the xfs_repair output?

If you ever hit something like this again, capturing a metadump prior to repair,
if possible, would be great, so we might have a better reproducer.
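
Roughly, with the filesystem unmounted and using the dm-72 name from
your logs (substitute the real device path):

# xfs_metadump -g /dev/dm-72 /tmp/dm-72.metadump
# xfs_repair /dev/dm-72 2>&1 | tee /tmp/dm-72.repair.log

xfs_metadump copies only metadata and obfuscates file names by default,
so the result should be small enough to share.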

-Eric

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 18+ messages in thread

* XFS corruption
@ 2014-12-21 11:42 Alex Lyakas
  2014-12-21 18:13 ` Eric Sandeen
  0 siblings, 1 reply; 18+ messages in thread
From: Alex Lyakas @ 2014-12-21 11:42 UTC (permalink / raw)
  To: xfs


[-- Attachment #1.1: Type: text/plain, Size: 7724 bytes --]

Greetings,
we encountered XFS corruption:

kernel: [774772.852316] ffff8801018c5000: 05 d1 fd 01 fd ff 2f ec 2f 8d 82 6a 81 fe c2 0f  .....././..j....     
kernel: [774772.854820] XFS (dm-72): Internal error xfs_bmbt_verify at line 747 of file /mnt/share/builds/14.09--3.8.13-030813-generic/2014-11-30_15-47-58--14.09-1419-28/src/zadara-btrfs/fs/xfs/xfs_bmap_btree.c.  Caller 0xffffffffa077b6be
kernel: [774772.854820]                                                                                         
kernel: [774772.860766] Pid: 14643, comm: kworker/0:0H Tainted: GF       W  O 3.8.13-030813-generic #20130511184
kernel: [774772.860771] Call Trace:                                                                             
kernel: [774772.860909]  [<ffffffffa074abaf>] xfs_error_report+0x3f/0x50 [xfs]                                  
kernel: [774772.860961]  [<ffffffffa077b6be>] ? xfs_bmbt_read_verify+0xe/0x10 [xfs]                             
kernel: [774772.860985]  [<ffffffffa074ac1e>] xfs_corruption_error+0x5e/0x90 [xfs]                              
kernel: [774772.861014]  [<ffffffffa077b537>] xfs_bmbt_verify+0x77/0x1e0 [xfs]                                  
kernel: [774772.861047]  [<ffffffffa077b6be>] ? xfs_bmbt_read_verify+0xe/0x10 [xfs]                             
kernel: [774772.861077]  [<ffffffff810135aa>] ? __switch_to+0x12a/0x4a0                                         
kernel: [774772.861129]  [<ffffffff81096cd8>] ? set_next_entity+0xa8/0xc0                                       
kernel: [774772.861145]  [<ffffffffa077b6be>] xfs_bmbt_read_verify+0xe/0x10 [xfs]                               
kernel: [774772.861157]  [<ffffffffa074848f>] xfs_buf_iodone_work+0x3f/0xa0 [xfs]                               
kernel: [774772.861161]  [<ffffffff81078b81>] process_one_work+0x141/0x490                                      
kernel: [774772.861164]  [<ffffffff81079b48>] worker_thread+0x168/0x400                                         
kernel: [774772.861166]  [<ffffffff810799e0>] ? manage_workers+0x120/0x120                                      
kernel: [774772.861170]  [<ffffffff8107f050>] kthread+0xc0/0xd0                                                 
kernel: [774772.861172]  [<ffffffff8107ef90>] ? flush_kthread_worker+0xb0/0xb0                                  
kernel: [774772.861193]  [<ffffffff816f61ec>] ret_from_fork+0x7c/0xb0                                           
kernel: [774772.861199]  [<ffffffff8107ef90>] ? flush_kthread_worker+0xb0/0xb0                                  
kernel: [774772.861318] XFS (dm-72): Corruption detected. Unmount and run xfs_repair                            
kernel: [774772.863449] XFS (dm-72): metadata I/O error: block 0x2434e3e8 ("xfs_trans_read_buf_map") error 117 numblks 8

All the corruption reports were for the same block 0x2434e3e8, which according to the code is simply disk address (xfs_daddr_t) 607445992. So there was only one block corrupted.
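
If we hit this again we will try to decode that block with xfs_db before
running repair -- a rough sketch, assuming the volume is /dev/dm-72:

# xfs_db -r -c "convert daddr 607445992 fsblock" /dev/dm-72
# xfs_db -r -c "fsblock <value printed above>" -c "type bmapbtd" -c "print" /dev/dm-72

The first command maps the 512-byte disk address to a filesystem block
number; the second dumps that block interpreted as a bmap btree block,
which should make the fields the verifier disliked visible.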

Some time later, XFS crashed with:
[813114.622928] NULL pointer dereference[813114.622928]  at 0000000000000008
[813114.622928] IP: [<ffffffffa077bad9>] xfs_bmbt_get_all+0x9/0x20 [xfs]
[813114.622928] PGD 0 
[813114.622928] Oops: 0000 [#1] SMP 
[813114.622928] CPU 2 
[813114.622928] Pid: 31120, comm: smbd Tainted: GF       W  O 3.8.13-030813-generic #201305111843 Bochs Bochs
[813114.622928] RIP: 0010:[<ffffffffa077bad9>]  [<ffffffffa077bad9>] xfs_bmbt_get_all+0x9/0x20 [xfs]
[813114.622928] RSP: 0018:ffff88010a193798  EFLAGS: 00010297
[813114.622928] RAX: 0000000000000964 RBX: ffff880180fa9c38 RCX: ffffa5a5a5a5a5a5
[813114.622928] RDX: ffff88010a193898 RSI: ffff88010a193898 RDI: 0000000000000000
[813114.622928] RBP: ffff88010a1937f8 R08: ffff88010a193898 R09: ffff88010a1938b8
[813114.622928] R10: ffffea0005de0940 R11: 0000000000004d0e R12: ffff88010a1938dc
[813114.622928] R13: ffff88010a1938e0 R14: ffff88010a193898 R15: ffff88010a1938b8
[813114.622928] FS:  00007eff2dc7e700(0000) GS:ffff88021fd00000(0000) knlGS:0000000000000000
[813114.622928] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[813114.622928] CR2: 0000000000000008 CR3: 0000000109574000 CR4: 00000000001406e0
[813114.622928] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[813114.622928] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[813114.622928] Process smbd (pid: 31120, threadinfo ffff88010a192000, task ffff88011687ae80)
[813114.622928] Stack:
[813114.622928]  ffff88010a1937f8 ffffffffa076f85a ffffffffffffffff 0000000000000000
[813114.622928]  ffffffff816ec509 000000000a193830 ffffffff816ed31d ffff88010a193898
[813114.622928]  ffff880180fa9c00 0000000000000000 ffff88010a1938dc ffff88010a1938e0
[813114.622928] Call Trace:
[813114.622928]  [<ffffffffa076f85a>] ? xfs_bmap_search_multi_extents+0xaa/0x110 [xfs]
[813114.622928]  [<ffffffff816ec509>] ? schedule+0x29/0x70
[813114.622928]  [<ffffffff816ed31d>] ? rwsem_down_failed_common+0xcd/0x170
[813114.622928]  [<ffffffffa076f92e>] xfs_bmap_search_extents+0x6e/0xf0 [xfs]
[813114.622928]  [<ffffffffa0778d6c>] xfs_bmapi_read+0xfc/0x2f0 [xfs]
[813114.622928]  [<ffffffffa0792a49>] ? xfs_ilock_map_shared+0x49/0x60 [xfs]
[813114.622928]  [<ffffffffa07459a8>] __xfs_get_blocks+0xe8/0x550 [xfs]
[813114.622928]  [<ffffffff8135d8c4>] ? call_rwsem_down_read_failed+0x14/0x30
[813114.622928]  [<ffffffffa0745e41>] xfs_get_blocks+0x11/0x20 [xfs]
[813114.622928]  [<ffffffff811d05b7>] block_read_full_page+0x127/0x360
[813114.622928]  [<ffffffffa0745e30>] ? xfs_get_blocks_direct+0x20/0x20 [xfs]
[813114.622928]  [<ffffffff811d9b0f>] do_mpage_readpage+0x35f/0x550
[813114.622928]  [<ffffffff816f1025>] ? do_async_page_fault+0x35/0x90
[813114.622928]  [<ffffffff816edd48>] ? async_page_fault+0x28/0x30
[813114.622928]  [<ffffffff811d9d4f>] mpage_readpage+0x4f/0x70
[813114.622928]  [<ffffffffa0745e30>] ? xfs_get_blocks_direct+0x20/0x20 [xfs]
[813114.622928]  [<ffffffff81134da8>] ? file_read_actor+0x68/0x160
[813114.622928]  [<ffffffff81134e04>] ? file_read_actor+0xc4/0x160
[813114.622928]  [<ffffffff81354bfe>] ? radix_tree_lookup_slot+0xe/0x10
[813114.622928]  [<ffffffffa07451b8>] xfs_vm_readpage+0x18/0x20 [xfs]
[813114.622928]  [<ffffffff811364ad>] do_generic_file_read.constprop.31+0x10d/0x440
[813114.622928]  [<ffffffff811374d1>] generic_file_aio_read+0xe1/0x220
[813114.622928]  [<ffffffffa074fb98>] xfs_file_aio_read+0x1c8/0x330 [xfs]
[813114.622928]  [<ffffffff8119ad93>] do_sync_read+0xa3/0xe0
[813114.622928]  [<ffffffff8119b4d0>] vfs_read+0xb0/0x180
[813114.622928]  [<ffffffff8119b77a>] sys_pread64+0x9a/0xa0
[813114.622928]  [<ffffffff816f629d>] system_call_fastpath+0x1a/0x1f
[813114.622928] Code: d8 4c 8b 65 e0 4c 8b 6d e8 4c 8b 75 f0 4c 8b 7d f8 c9 c3 66 66 66 66 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 55 48 89 f2 <48> 8b 77 08 48 8b 3f 48 89 e5 e8 48 f8 ff ff 5d c3 66 0f 1f 44 
[813114.622928] RIP  [<ffffffffa077bad9>] xfs_bmbt_get_all+0x9/0x20 [xfs]
[813114.622928]  RSP <ffff88010a193798>
[813114.622928] CR2: 0000000000000008
[813114.721138] ---[ end trace cce2a358d4050d3d ]---

We are running XFS based on kernel 3.8.13, with our changes for large-block discard in https://github.com/zadarastorage/zadara-xfs-pushback.

We analyzed several suspects, but all of them fall on disk addresses not near the corrupted disk address. I realize that running a somewhat outdated kernel + our changes within XFS points back at us, but this is the first time we have seen XFS corruption after about a year of this code being exercised. So posting here, just in case this is a known issue.

I must point out that xfs_repair was able to fix this, which was awesome!

Thanks,
Alex.


[-- Attachment #1.2: Type: text/html, Size: 13688 bytes --]

[-- Attachment #2: Type: text/plain, Size: 121 bytes --]

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2016-02-24 22:24 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-09-03 11:09 xfs corruption Danny Shavit
2015-09-03 13:22 ` Eric Sandeen
2015-09-03 14:26   ` Danny Shavit
2015-09-03 14:55     ` Eric Sandeen
2015-09-03 16:14       ` Eric Sandeen
2015-09-06 10:19         ` Alex Lyakas
2015-09-06 21:56           ` Eric Sandeen
2015-09-07  8:30             ` Alex Lyakas
  -- strict thread matches above, loose matches on Subject: below --
2016-02-24  6:12 XFS Corruption fangchen sun
2016-02-24 22:23 ` Eric Sandeen
2014-12-21 11:42 XFS corruption Alex Lyakas
2014-12-21 18:13 ` Eric Sandeen
2014-12-21 23:08   ` Dave Chinner
2014-12-22 10:09     ` Alex Lyakas
2014-12-22 14:42     ` Brian Foster
2014-12-23  0:39       ` Dave Chinner
2014-12-23  9:57         ` Alex Lyakas
2014-12-23 20:36           ` Dave Chinner

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.