linux-btrfs.vger.kernel.org archive mirror
From: Qu Wenruo <quwenruo.btrfs@gmx.com>
To: Michael Wade <spikewade@gmail.com>
Cc: linux-btrfs@vger.kernel.org
Subject: Re: BTRFS RAID filesystem unmountable
Date: Sun, 29 Apr 2018 16:33:22 +0800
Message-ID: <bff1c7b3-5bac-9806-a376-3612057865cf@gmx.com>
In-Reply-To: <CAB+znrF_d+Hg_A9AMvWEB=S5eVAtYrvr2jUPcvR4FfB4hnCMWA@mail.gmail.com>


On 2018-04-29 16:11, Michael Wade wrote:
> Thanks Qu,
> 
> Please find attached the log file for the chunk recover command.

Strangely, btrfs chunk recovery found no extra chunks beyond the
current system chunk range.

Which means the chunk tree itself is corrupted.

Please dump the chunk tree with the latest btrfs-progs, which provides
the new --follow option:

# btrfs inspect dump-tree --follow -b 20800943685632 <device>

If that doesn't work, please provide the following binary dumps:

# dd if=<dev> of=/tmp/chunk_root.copy1 bs=1 count=32K skip=266325721088
# dd if=<dev> of=/tmp/chunk_root.copy2 bs=1 count=32K skip=266359275520
(We may need to repeat similar dumps several times, depending on the
output of the above dump.)
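
As an optional sanity check before sending them (assuming the two
offsets are the two DUP copies of the same tree block), the two dumps
should normally be identical:

# cmp /tmp/chunk_root.copy1 /tmp/chunk_root.copy2 && echo "copies match"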

Thanks,
Qu


> 
> Kind regards
> Michael
> 
> On 28 April 2018 at 12:38, Qu Wenruo <quwenruo.btrfs@gmx.com> wrote:
>>
>>
>> On 2018-04-28 17:37, Michael Wade wrote:
>>> Hi Qu,
>>>
>>> Thanks for your reply. I will investigate upgrading the kernel;
>>> however, I worry that future ReadyNAS firmware upgrades would fail on
>>> a newer kernel version (I don't have much Linux experience, so maybe
>>> my concerns are unfounded!?).
>>>
>>> I have attached the output of the dump super command.
>>>
>>> I did actually run chunk recover before, without the verbose option;
>>> it took around 24 hours to finish but did not resolve my issue. Happy
>>> to start that again if you need its output.
>>
>> The system chunk only contains the following chunks:
>> [0, 4194304]:           Initial temporary chunk, not used at all
>> [20971520, 29360128]:   System chunk created by mkfs, should be full
>>                         used up
>> [20800943685632, 20800977240064]:
>>                         The newly created large system chunk.
>>
>> The chunk root is still in the 2nd chunk and thus valid, but some of
>> its leaves are out of that range.
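>>
>> (For example, the logical address 3208757641216 from your dmesg falls
>> in none of the three ranges above, which is exactly why the kernel
>> reports "unable to find logical 3208757641216 len 4096".)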
>>
>> If you can't wait another 24h for chunk recovery to run, my advice
>> would be to move the disk(s) to another computer and use the latest
>> btrfs-progs to execute the following command:
>>
>> # btrfs inspect dump-tree -b 20800943685632 --follow
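>>
>> (With --follow, dump-tree also prints the children of the given block
>> recursively, so if the root node itself is readable we should see the
>> chunk items stored in its leaves.)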
>>
>> If we're lucky enough, we may read out the tree leaf containing the
>> new system chunk and save the day.
>>
>> Thanks,
>> Qu
>>
>>>
>>> Thanks so much for your help.
>>>
>>> Kind regards
>>> Michael
>>>
>>> On 28 April 2018 at 09:45, Qu Wenruo <quwenruo.btrfs@gmx.com> wrote:
>>>>
>>>>
>>>> On 2018-04-28 16:30, Michael Wade wrote:
>>>>> Hi all,
>>>>>
>>>>> I was hoping that someone would be able to help me resolve the issues
>>>>> I am having with my ReadyNAS BTRFS volume. Basically my trouble
>>>>> started after a power cut, subsequently the volume would not mount.
>>>>> Here are the details of my setup as it is at the moment:
>>>>>
>>>>> uname -a
>>>>> Linux QAI 4.4.116.alpine.1 #1 SMP Mon Feb 19 21:58:38 PST 2018 armv7l GNU/Linux
>>>>
>>>> The kernel is pretty old for btrfs.
>>>> I strongly recommend upgrading it.
>>>>
>>>>>
>>>>> btrfs --version
>>>>> btrfs-progs v4.12
>>>>
>>>> So are the user tools.
>>>>
>>>> Although I don't think that will be a big problem, as the needed
>>>> tools should be there.
>>>>
>>>>>
>>>>> btrfs fi show
>>>>> Label: '11baed92:data'  uuid: 20628cda-d98f-4f85-955c-932a367f8821
>>>>> Total devices 1 FS bytes used 5.12TiB
>>>>> devid    1 size 7.27TiB used 6.24TiB path /dev/md127
>>>>
>>>> So, it's btrfs on mdraid.
>>>> That normally makes things harder to debug, so I can only provide
>>>> advice from the btrfs side.
>>>> For the mdraid part, I can't guarantee anything.
>>>>
>>>>>
>>>>> Here are the relevant dmesg logs for the current state of the device:
>>>>>
>>>>> [   19.119391] md: md127 stopped.
>>>>> [   19.120841] md: bind<sdb3>
>>>>> [   19.121120] md: bind<sdc3>
>>>>> [   19.121380] md: bind<sda3>
>>>>> [   19.125535] md/raid:md127: device sda3 operational as raid disk 0
>>>>> [   19.125547] md/raid:md127: device sdc3 operational as raid disk 2
>>>>> [   19.125554] md/raid:md127: device sdb3 operational as raid disk 1
>>>>> [   19.126712] md/raid:md127: allocated 3240kB
>>>>> [   19.126778] md/raid:md127: raid level 5 active with 3 out of 3
>>>>> devices, algorithm 2
>>>>> [   19.126784] RAID conf printout:
>>>>> [   19.126789]  --- level:5 rd:3 wd:3
>>>>> [   19.126794]  disk 0, o:1, dev:sda3
>>>>> [   19.126799]  disk 1, o:1, dev:sdb3
>>>>> [   19.126804]  disk 2, o:1, dev:sdc3
>>>>> [   19.128118] md127: detected capacity change from 0 to 7991637573632
>>>>> [   19.395112] Adding 523708k swap on /dev/md1.  Priority:-1 extents:1
>>>>> across:523708k
>>>>> [   19.434956] BTRFS: device label 11baed92:data devid 1 transid
>>>>> 151800 /dev/md127
>>>>> [   19.739276] BTRFS info (device md127): setting nodatasum
>>>>> [   19.740440] BTRFS critical (device md127): unable to find logical
>>>>> 3208757641216 len 4096
>>>>> [   19.740450] BTRFS critical (device md127): unable to find logical
>>>>> 3208757641216 len 4096
>>>>> [   19.740498] BTRFS critical (device md127): unable to find logical
>>>>> 3208757641216 len 4096
>>>>> [   19.740512] BTRFS critical (device md127): unable to find logical
>>>>> 3208757641216 len 4096
>>>>> [   19.740552] BTRFS critical (device md127): unable to find logical
>>>>> 3208757641216 len 4096
>>>>> [   19.740560] BTRFS critical (device md127): unable to find logical
>>>>> 3208757641216 len 4096
>>>>> [   19.740576] BTRFS error (device md127): failed to read chunk root
>>>>
>>>> This shows it pretty clearly: btrfs fails to read the chunk root.
>>>> And according to the "len 4096" above, it's a pretty old fs, as it's
>>>> still using a 4K nodesize rather than the current 16K default.
>>>>
>>>> According to the above output, your superblock somehow lacks the
>>>> needed system chunk mapping, which is used to initialize the
>>>> logical-to-physical chunk mapping.
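>>>>
>>>> (Roughly speaking, for a SINGLE or DUP profile that mapping is
>>>> linear: a logical address L inside a chunk that starts at logical
>>>> offset C and is stored at physical offset P maps to P + (L - C).
>>>> The sys_chunk_array section of the dump-super output requested
>>>> below should show those C and P values.)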
>>>>
>>>> Please provide the following command output:
>>>>
>>>> # btrfs inspect dump-super -fFa /dev/md127
>>>>
>>>> Also, please consider running the following command and dumping all
>>>> of its output:
>>>>
>>>> # btrfs rescue chunk-recover -v /dev/md127
>>>>
>>>> Please note that the above command can take a long time to finish,
>>>> and if it works without problems, it may solve your issue.
>>>> But if it doesn't, its output will help me manually craft a fix for
>>>> your superblock.
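>>>>
>>>> To capture the full output for the list, you could pipe it through
>>>> tee, for example:
>>>>
>>>> # btrfs rescue chunk-recover -v /dev/md127 2>&1 | tee /tmp/chunk-recover.log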
>>>>
>>>> Thanks,
>>>> Qu
>>>>
>>>>
>>>>> [   19.783975] BTRFS error (device md127): open_ctree failed
>>>>>
>>>>> In an attempt to recover the volume myself I ran a few BTRFS
>>>>> commands, mostly using advice from here:
>>>>> https://lists.opensuse.org/opensuse/2017-02/msg00930.html. However,
>>>>> that actually seems to have made things worse, as I can no longer
>>>>> mount the file system, not even in readonly mode.
>>>>>
>>>>> So, starting from the beginning, here is a list of the things I have
>>>>> done so far (hopefully I remembered the order in which I ran them!):
>>>>>
>>>>> 1. Noticed that my backups to the NAS were not running (didn't get
>>>>> notified that the volume had basically "died")
>>>>> 2. ReadyNAS UI indicated that the volume was inactive.
>>>>> 3. SSHed onto the box and found that the first drive was not marked
>>>>> as operational (the log showed I/O errors / UNKOWN (0x2003)), so I
>>>>> replaced the disk and let the array resync.
>>>>> 4. After the resync the volume was still inaccessible, so I looked at
>>>>> the logs once more and saw something like the following, which seemed
>>>>> to indicate that the replay log had been corrupted when the power
>>>>> went out:
>>>>>
>>>>> BTRFS critical (device md127): corrupt leaf, non-root leaf's nritems
>>>>> is 0: block=232292352, root=7, slot=0
>>>>> BTRFS critical (device md127): corrupt leaf, non-root leaf's nritems
>>>>> is 0: block=232292352, root=7, slot=0
>>>>> BTRFS: error (device md127) in btrfs_replay_log:2524: errno=-5 IO
>>>>> failure (Failed to recover log tree)
>>>>> BTRFS error (device md127): pending csums is 155648
>>>>> BTRFS error (device md127): cleaner transaction attach returned -30
>>>>> BTRFS critical (device md127): corrupt leaf, non-root leaf's nritems
>>>>> is 0: block=232292352, root=7, slot=0
>>>>>
>>>>> 5. Then:
>>>>>
>>>>> btrfs rescue zero-log
>>>>>
>>>>> 6. I was then able to mount the volume in readonly mode and started
>>>>> a scrub:
>>>>>
>>>>> btrfs scrub start
>>>>>
>>>>> Which fixed some errors but not all:
>>>>>
>>>>> scrub status for 20628cda-d98f-4f85-955c-932a367f8821
>>>>>
>>>>> scrub started at Tue Apr 24 17:27:44 2018, running for 04:00:34
>>>>> total bytes scrubbed: 224.26GiB with 6 errors
>>>>> error details: csum=6
>>>>> corrected errors: 0, uncorrectable errors: 6, unverified errors: 0
>>>>>
>>>>> scrub status for 20628cda-d98f-4f85-955c-932a367f8821
>>>>> scrub started at Tue Apr 24 17:27:44 2018, running for 04:34:43
>>>>> total bytes scrubbed: 224.26GiB with 6 errors
>>>>> error details: csum=6
>>>>> corrected errors: 0, uncorrectable errors: 6, unverified errors: 0
>>>>>
>>>>> 7. Seeing this hanging, I rebooted the NAS.
>>>>> 8. I think this is when the volume stopped mounting at all.
>>>>> 9. Seeing log entries like these:
>>>>>
>>>>> BTRFS warning (device md127): checksum error at logical 20800943685632
>>>>> on dev /dev/md127, sector 520167424: metadata node (level 1) in tree 3
>>>>>
>>>>> I ran
>>>>>
>>>>> btrfs check --fix-crc
>>>>>
>>>>> And that brings us to where I am now: some seemingly corrupted BTRFS
>>>>> metadata, and I'm unable to mount the drive even with the recovery
>>>>> option.
>>>>>
>>>>> Any help you can give is much appreciated!
>>>>>
>>>>> Kind regards
>>>>> Michael
>>>>
>>




Thread overview: 16+ messages
2018-04-28  8:30 BTRFS RAID filesystem unmountable Michael Wade
2018-04-28  8:45 ` Qu Wenruo
2018-04-28  9:37   ` Michael Wade
2018-04-28 11:38     ` Qu Wenruo
     [not found]       ` <CAB+znrF_d+Hg_A9AMvWEB=S5eVAtYrvr2jUPcvR4FfB4hnCMWA@mail.gmail.com>
2018-04-29  8:33         ` Qu Wenruo [this message]
2018-04-29  8:59           ` Michael Wade
2018-04-29  9:33             ` Qu Wenruo
     [not found]               ` <CAB+znrEcW3+++ZBrB_ZGRFncssO-zffbJ6ug8_z0DJOhbp+vGA@mail.gmail.com>
2018-04-30  1:52                 ` Qu Wenruo
2018-04-30  3:02                 ` Qu Wenruo
2018-05-01 15:50                   ` Michael Wade
2018-05-02  1:31                     ` Qu Wenruo
2018-05-02  5:29                       ` Michael Wade
2018-05-04 16:18                         ` Michael Wade
2018-05-05  0:43                           ` Qu Wenruo
2018-05-19 11:43                             ` Michael Wade
     [not found]                               ` <CAB+znrFS=Xi+4tPS3szqZro1FdjnVcbe29UV9UMUUxsGL6NJUg@mail.gmail.com>
2018-12-06 23:26                                 ` Qu Wenruo
