From: Eric Sandeen <sandeen@sandeen.net>
To: John Jore <john@jore.no>,
"linux-xfs@vger.kernel.org" <linux-xfs@vger.kernel.org>
Subject: Re: Bug in xfs_repair 5.4.0 / Unable to repair metadata corruption
Date: Sun, 9 Feb 2020 21:47:18 -0600
Message-ID: <74152f80-3a42-eab5-a95f-e29f03db46a9@sandeen.net>
In-Reply-To: <b2babb761ed24dc986abc3073c5c47fc@jore.no>
On 2/9/20 12:19 AM, John Jore wrote:
> Hi all,
>
> Not sure if this is the appropriate forum to report xfs_repair bugs? If wrong, please point me in the appropriate direction?
This is the place.
> I have a corrupted XFS volume which mounts fine, but xfs_repair is unable to repair it and volume eventually shuts down due to metadata corruption if writes are performed.
What does dmesg say when it shuts down?
>
> Originally I used xfs_repair from CentOS 8.1.1911, but cloned latest xfs_repair from git://git.kernel.org/pub/scm/fs/xfs/xfsprogs-dev.git (Today, Feb 9th, reports as version 5.4.0)
>
>
> Phase 3 - for each AG...
> - scan and clear agi unlinked lists...
> - 16:08:04: scanning agi unlinked lists - 64 of 64 allocation groups done
> - process known inodes and perform inode discovery...
> - agno = 45
> - agno = 15
> - agno = 0
> - agno = 30
> - agno = 60
> - agno = 46
> - agno = 16
> Metadata corruption detected at 0x4330e3, xfs_inode block 0x17312a3f0/0x2000
> - agno = 61
> - agno = 31
> - agno = 47
> - agno = 62
> - agno = 48
> - agno = 49
> - agno = 32
> - agno = 33
> - agno = 17
> - agno = 1
> bad magic number 0x0 on inode 18253615584
> bad version number 0x0 on inode 18253615584
> bad magic number 0x0 on inode 18253615585
> bad version number 0x0 on inode 18253615585
> bad magic number 0x0 on inode 18253615586
> .....
> bad magic number 0x0 on inode 18253615584, resetting magic number
> bad version number 0x0 on inode 18253615584, resetting version number
> bad magic number 0x0 on inode 18253615585, resetting magic number
> bad version number 0x0 on inode 18253615585, resetting version number
> bad magic number 0x0 on inode 18253615586, resetting magic number
> bad version number 0x0 on inode 18253615586, resetting version number
Looks like a whole chunk of inodes with, at the very least, zeroed magic and version numbers.
> ....
> - agno = 16
> - agno = 17
> Metadata corruption detected at 0x4330e3, xfs_inode block 0x17312a3f0/0x2000
> - agno = 18
> - agno = 19
> ...
> Phase 7 - verify and correct link counts...
> - 16:10:41: verify and correct link counts - 64 of 64 allocation groups done
> Metadata corruption detected at 0x433385, xfs_inode block 0x17312a3f0/0x2000
> libxfs_writebufr: write verifier failed on xfs_inode bno 0x17312a3f0/0x2000
This bit seems problematic. I guess it's unable to write the updated inode buffer
because the write verifier still finds corruption in it, which presumably is why
you keep tripping over the same corruption each time.
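To see where that buffer lives on the device, the "0x17312a3f0/0x2000" pair can be decoded with a little shell arithmetic. This is only a sketch, and it assumes the first number is a disk address in 512-byte basic blocks and the second is the buffer length in bytes; decode_buf is a made-up helper name, not something shipped with xfsprogs.

```shell
#!/bin/sh
# decode_buf ADDR/LEN
# Assumption: ADDR is a disk address in 512-byte basic blocks and
# LEN is the buffer length in bytes, both as printed by xfs_repair.
decode_buf() {
    daddr=${1%/*}   # part before the slash
    len=${1#*/}     # part after the slash
    echo "byte_offset=$((daddr * 512)) length_bytes=$((len)) sectors=$((len / 512))"
}

decode_buf 0x17312a3f0/0x2000
# prints: byte_offset=3187491201024 length_bytes=8192 sectors=16
```

Under that assumption the buffer is 8 KiB long and sits roughly 3.2 TB into the device, so the same 16 sectors are being flagged on every pass.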
> releasing dirty buffer (bulk) to free list!
>
>
>
> Does not matter how many times, I've lost count, I re-run xfs_repair, with, or without -d,
-d is for repairing a filesystem while it is mounted (read-only). I hope you are not doing that, are you?
> it never does repair the volume.
> Volume is a ~12TB LV built using 4x 4TB disks in RAID 5 using a 3Ware 9690SA controller.
Just to double check, are there any storage errors reported in dmesg?
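A rough way to pull the interesting lines out of the kernel log is sketched below. The patterns are illustrative guesses rather than a complete list of what a failing disk or controller can print, and storage_errors is a hypothetical helper name.

```shell
#!/bin/sh
# storage_errors: filter kernel log text on stdin for lines that
# look like block-layer I/O errors or XFS corruption/shutdown reports.
# The patterns are assumptions; extend them for your controller's driver.
storage_errors() {
    grep -iE 'i/o error|medium error|xfs.*(corrupt|shutdown|shutting down)'
}

# Usage (reads the kernel ring buffer, may need root):
#   dmesg | storage_errors
```

No output only means nothing matched these patterns, not that the storage is healthy.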
> Any suggestions or additional data I can provide?
If you are willing to provide an xfs_metadump to me (off-list) I will see if I can
reproduce it from the metadump.
# xfs_metadump /dev/$WHATEVER metadump.img
# bzip2 metadump.img
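Before running xfs_metadump (or xfs_repair without -d), it may be worth confirming how the device is currently mounted, if at all. The sketch below assumes the /proc/mounts field layout (device, mount point, type, options) and compares the device name literally, so an LV listed under its /dev/mapper/ name will not match a /dev/VG/LV path. mount_state is a hypothetical helper, not an xfsprogs tool.

```shell
#!/bin/sh
# mount_state DEVICE [MOUNTS_FILE]
# Prints "unmounted", "ro", or "rw" for DEVICE, reading MOUNTS_FILE
# (default /proc/mounts). Assumption: DEVICE is spelled exactly as it
# appears in the first column of the mounts file.
mount_state() {
    dev=$1
    mounts=${2:-/proc/mounts}
    awk -v dev="$dev" '
        $1 == dev {
            found = 1
            ro = 0
            # Field 4 is the comma-separated mount option list.
            n = split($4, opts, ",")
            for (i = 1; i <= n; i++)
                if (opts[i] == "ro") ro = 1
            if (!ro) rw_seen = 1
        }
        END {
            if (!found) print "unmounted"
            else if (rw_seen) print "rw"
            else print "ro"
        }
    ' "$mounts"
}

# Usage:
#   mount_state /dev/$WHATEVER
```

Anything other than "unmounted" or "ro" is a reason to stop and unmount first.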
-Eric
>
> John
>