On 2019/2/19 6:24 PM, Roderick Johnstone wrote:
> Hi
>
> This is on Fedora 28:
>
> # uname -a
> Linux mysystem.mydomain 4.20.7-100.fc28.x86_64 #1 SMP Wed Feb 6 19:17:09
> UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
>
> # btrfs --version
> btrfs-progs v4.17.1
>
> #   btrfs fi show
> Label: none  uuid: 56d0171a-440d-47ff-ad0f-f7f97df31f7b
>         Total devices 1 FS bytes used 7.39TiB
>         devid    1 size 9.10TiB used 7.50TiB path /dev/md2
>
>
> My btrfs filesystem is in a bad state after a partial disk failure on
> the md device (raid 6 array) the file system was on.
>
> One of the disks had bad blocks, but instead of being ejected from the
> array, the array hung up.

I'm a little curious why the RAID6 array hung up instead of ejecting the
bad disk.

> After rebooting to regain access and remove
> the bad disk I am in the following situation:
>
> # mount -t btrfs -o compress-force=zlib,noatime /dev/md2 /mnt/rmj
> mount: /mnt/rmj: wrong fs type, bad option, bad superblock on /dev/md2,
> missing codepage or helper program, or other error.
> # dmesg
> ...
>   264.527647] BTRFS info (device md2): force zlib compression, level 3
> [  264.955360] BTRFS error (device md2): parent transid verify failed on
> 5568287064064 wanted 254988 found 94122

It's 99% likely that some extent tree blocks got corrupted.

> [  264.964273] BTRFS error (device md2): open_ctree failed
>
> I can mount and access the filesystem with the usebackuproot option:
>
> # mount -t btrfs -o usebackuproot,compress-force=zlib,noatime /dev/md2
> /mnt/rmj
> [  307.542761] BTRFS info (device md2): trying to use backup root at
> mount time
> [  307.542768] BTRFS info (device md2): force zlib compression, level 3
> [  307.570897] BTRFS error (device md2): parent transid verify failed on
> 5568287064064 wanted 254988 found 94122
> [  307.570979] BTRFS error (device md2): parent transid verify failed on
> 5568287064064 wanted 254988 found 94122
> [  431.167149] BTRFS info (device md2): checking UUID tree
>
> But later after a umount there are these messages.
>
> # umount /mnt/rmj
> 2205.778998] BTRFS error (device md2): parent transid verify failed on
> 5568276393984 wanted 254986 found 94117
> [ 2205.779008] BTRFS: error (device md2) in __btrfs_free_extent:6831:
> errno=-5 IO failure
> [ 2205.779082] BTRFS info (device md2): forced readonly
> [ 2205.779087] BTRFS: error (device md2) in btrfs_run_delayed_refs:2978:
> errno=-5 IO failure
> [ 2205.779192] BTRFS warning (device md2): btrfs_uuid_scan_kthread
> failed -30

Again, this confirms the extent tree is corrupted.

>
> and a subsequent mount without the userbackuproot fails in the same way
> as before.
>
> I have a copy of the important directories, but would like to be able to
> repair the filesystem if possible,

You should be able to salvage most of the data, either with the
'usebackuproot' mount option plus a read-only (RO) mount, or with
btrfs-restore (a rough sketch of both is appended at the end of this
mail).

For full RW recovery, I don't think there is a good tool right now.
Extent tree repair is pretty tricky; in most cases the only method is
--init-extent-tree, but that functionality hasn't been exercised by many
users, and it only makes sense if all the other trees are OK.

So in short, RW recovery is close to impossible.

>
> Any advise around repairing the filesystem would be appreciated.

It's better to salvage your data first and then, if you like adventure,
try --init-extent-tree (also sketched at the end of this mail). If not,
just rebuild the array.

Thanks,
Qu

>
> Thanks.
>
> Roderick Johnstone
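
For reference, a rough sketch of the two salvage options mentioned above.
/mnt/rmj is the mount point from your mail; /mnt/salvage is only a
placeholder for wherever you want the copies to end up:

  # Option 1: mount read-only with the backup root and copy the data off.
  mount -t btrfs -o ro,usebackuproot /dev/md2 /mnt/rmj
  cp -a /mnt/rmj/. /mnt/salvage/
  umount /mnt/rmj

  # Option 2: btrfs-restore copies files out without mounting at all.
  # -D is a dry run (only lists what would be restored), -v is verbose,
  # -i ignores errors on individual files. Drop -D to copy for real.
  mkdir -p /mnt/salvage
  btrfs restore -D -v /dev/md2 /mnt/salvage
  btrfs restore -i -v /dev/md2 /mnt/salvage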
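
And if, once the data is safely copied away, you still want to try the
risky repair, the sequence would be roughly as follows (only a sketch,
and only on the unmounted filesystem):

  # Read-only check first, to see whether the damage really is limited
  # to the extent tree.
  btrfs check --readonly /dev/md2

  # Last resort: rebuild the extent tree. This is dangerous and can make
  # things worse if other trees are also damaged.
  btrfs check --init-extent-tree /dev/md2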