From: Qu Wenruo <quwenruo.btrfs@gmx.com>
To: weldon@newfietech.com, linux-btrfs@vger.kernel.org
Subject: Re: BTRFS fails mount after power failure
Date: Tue, 24 Aug 2021 07:54:59 +0800
Message-ID: <0be8ec2b-7226-f3d1-a02b-608e757bda24@gmx.com>
In-Reply-To: <005201d79860$befd1b60$3cf75220$@newfietech.com>



On 2021/8/24 4:52 AM, weldon@newfietech.com wrote:
> Good day folks,
>
> I awoke this morning to find that my UPS had died overnight and my Ubuntu
> server with a 14.5TB (Raid 5) BTRFS volume went down with it.

RAID5 has a known write hole bug. Although it won't cause immediate
problems, every corrupted sector or unexpected power loss degrades the
array a little further.

With enough of that degradation, the whole array will eventually go
down.
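
(On a RAID56 array that still mounts, the usual advice is to scrub
after every unclean shutdown so that drifted parity gets rewritten
before it is ever needed, e.g. something like:

  # btrfs scrub start -Bd /mnt/data

with /mnt/data standing in for whatever the real mount point is. That
obviously doesn't help once the fs no longer mounts.)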

>  The machine
> rebooted fine and the hardware reports no errors, however the BTRFS volume
> will no longer mount.  The OS boots fine, the 14.5TB volume is for data
> storage only.  gparted shows the volume/partition,  and correctly reports
> space used as well as total size.  I've never encountered this type of issue
> over the past year while using btrfs and I'm not sure where to start.  A
> number of google search results express caution when attempting to
> recover/repair, so I'm hoping for some expert advice.
>
> My dmesg log exceeds the 100,000 bytes restriction, so I'm unable to attach
> it, so please ask if there's anything specific I can include otherwise.
>
> # uname -a
> Linux onyx 5.4.0-81-generic #91-Ubuntu SMP Thu Jul 15 19:09:17 UTC 2021
> x86_64 x86_64 x86_64 GNU/Linux
>
> # btrfs --version
> btrfs-progs v5.4.1
>
> # btrfs fi show
> Label: 'Data'  uuid: 7f500ee1-32b7-45a3-b1e9-deb7e1f59632
>          Total devices 1 FS bytes used 7.17TiB
>          devid    1 size 14.50TiB used 7.40TiB path /dev/sdb1
>
> # dmesg | grep sdb
> [    2.312875] sd 32:0:1:0: [sdb] Very big device. Trying to use READ
> CAPACITY(16).
> [    2.313010] sd 32:0:1:0: [sdb] 31138512896 512-byte logical blocks: (15.9
> TB/14.5 TiB)
> [    2.313062] sd 32:0:1:0: [sdb] Write Protect is off
> [    2.313065] sd 32:0:1:0: [sdb] Mode Sense: 61 00 00 00
> [    2.313116] sd 32:0:1:0: [sdb] Cache data unavailable
> [    2.313119] sd 32:0:1:0: [sdb] Assuming drive cache: write through
> [    2.333321] sd 32:0:1:0: [sdb] Very big device. Trying to use READ
> CAPACITY(16).
> [    2.396761]  sdb: sdb1
> [    2.397170] sd 32:0:1:0: [sdb] Very big device. Trying to use READ
> CAPACITY(16).
> [    2.397261] sd 32:0:1:0: [sdb] Attached SCSI disk
> [    4.709963] BTRFS: device label Data devid 1 transid 120260 /dev/sdb1
> [   21.849570] BTRFS info (device sdb1): disk space caching is enabled
> [   21.849573] BTRFS info (device sdb1): has skinny extents
> [   22.023224] BTRFS error (device sdb1): parent transid verify failed on
> 7939752886272 wanted 120260 found 120262
> [   22.047940] BTRFS error (device sdb1): parent transid verify failed on
> 7939752886272 wanted 120260 found 120265

This already shows a mismatch between the on-disk data and the data
recovered from parity: the two have drifted apart from each other,
which is exactly the write hole problem.

Furthermore, the block on disk carries a newer generation (120262 and
120265) than the 120260 the superblock expects, i.e. the disk has newer
data than we expect.

What's the device model? This looks like a misbehavior, but I'm not
sure whether it comes from the hardware or from the btrfs code.
Since RAID56 has been marked unsafe for quite a while, it hasn't
received much love or many fixes, so both cases are possible.
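
For example, something like either of these should show the model
(smartctl needs the smartmontools package installed):

  # lsblk -d -o NAME,MODEL,SERIAL /dev/sdb
  # smartctl -i /dev/sdb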

> [   22.047949] BTRFS warning (device sdb1): failed to read tree root
> [   22.089003] BTRFS error (device sdb1): open_ctree failed
>
> root@onyx:/home/weldon# btrfs-find-root /dev/sdb1
> parent transid verify failed on 7939752886272 wanted 120260 found 120262
> parent transid verify failed on 7939752886272 wanted 120260 found 120265
> parent transid verify failed on 7939752886272 wanted 120260 found 120265
> Ignoring transid failure
> WARNING: could not setup extent tree, skipping it
> Couldn't setup device tree
> Superblock thinks the generation is 120260
> Superblock thinks the level is 1
> Well block 7939758882816(gen: 120264 level: 1) seems good, but
> generation/level doesn't match, want gen: 120260 level: 1
> Well block 7939747938304(gen: 120263 level: 1) seems good, but
> generation/level doesn't match, want gen: 120260 level: 1
> Well block 7939756146688(gen: 120262 level: 1) seems good, but
> generation/level doesn't match, want gen: 120260 level: 1
> Well block 7939751559168(gen: 120261 level: 0) seems good, but
> generation/level doesn't match, want gen: 120260 level: 1
>
> *** A large selection of block references was removed due to character
> count... if needed, I can resend with the full output.
>
> Well block 1316967743488(gen: 1293 level: 0) seems good, but
> generation/level doesn't match, want gen: 120260 level: 1
> Well block 1316909662208(gen: 1283 level: 0) seems good, but
> generation/level doesn't match, want gen: 120260 level: 1
> Well block 1316908711936(gen: 1283 level: 0) seems good, but
> generation/level doesn't match, want gen: 120260 level: 1
> root@onyx:/home#
>
> Any help or assistance would be greatly appreciated.  Important data has
> been backed up, however if it's possible to recover without thrashing the
> entire volume, that would be preferred.

First things first: don't expect too much in the way of magically
bringing the fs back to a fully functional state.
Transid errors are always tricky for btrfs.
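
(If not tried already, a read-only mount using the backup roots should
be safe to attempt and sometimes gets past a stale tree root:

  # mount -o ro,usebackuproot /dev/sdb1 /mnt

where /mnt is just a placeholder mount point.)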


But for your case, my guess is that sdb1 does not have the latest
superblock: we have newer tree roots on disk, but an older superblock.

You may want to try "btrfs ins dump-tree" on all the involved disks and
see whether any of them has a newer superblock.
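
For reference, something along these lines (flag names as in current
btrfs-progs, adjust the device path as needed):

  # btrfs inspect-internal dump-super -fa /dev/sdb1

should print every superblock copy with its generation, and

  # btrfs inspect-internal dump-tree -b 7939758882816 /dev/sdb1

would dump one of the tree root candidates that btrfs-find-root
reported above.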

Thanks,
Qu
>
> Regards,
> Weldon
>
