linux-btrfs.vger.kernel.org archive mirror
From: Qu Wenruo <quwenruo.btrfs@gmx.com>
To: Patrick Dijkgraaf <bolderbast@duckstad.net>, linux-btrfs@vger.kernel.org
Subject: Re: Need help with potential ~45TB dataloss
Date: Mon, 3 Dec 2018 08:45:55 +0800	[thread overview]
Message-ID: <b905e50b-bfd8-348a-992a-e8fee7785eb9@gmx.com> (raw)
In-Reply-To: <19cde2a5-6a07-4c14-6e84-9496f91422d7@gmx.com>





On 2018/12/3 8:35 AM, Qu Wenruo wrote:
> 
> 
> On 2018/12/2 5:03 PM, Patrick Dijkgraaf wrote:
>> Hi Qu,
>>
>> Thanks for helping me!
>>
>> Please see the responses in-line.
>> Any suggestions based on this?
>>
>> Thanks!
>>
>>
>> On Sat, 2018-12-01 at 07:57 +0800, Qu Wenruo wrote:
>>> On 2018/11/30 9:53 PM, Patrick Dijkgraaf wrote:
>>>> Hi all,
>>>>
>>>> I have been a happy BTRFS user for quite some time. But now I'm
>>>> facing
>>>> a potential ~45TB dataloss... :-(
>>>> I hope someone can help!
>>>>
>>>> I have Server A and Server B. Both having a 20-devices BTRFS RAID6
>>>> filesystem.

I forgot one important thing here, especially for RAID6.

If one data device is corrupted, RAID6 will normally try to rebuild it
the RAID5 way (using the P parity only), and if another disk is also
corrupted, that rebuild may not recover the data correctly.

The current way to recover from such corruption is to try *all*
combinations.

IIRC Liu Bo posted such a patch, but it was not merged.

This means current RAID6 can only handle two missing devices in its best
case.
For silent corruption, it can only be as good as RAID5.

Thanks,
Qu
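
To make the "try all combinations" point concrete, here is a toy sketch
(plain Python, not btrfs code; simplified to a single RAID5-style XOR
parity, with crc32 standing in for btrfs checksums). When a device is
*missing* you know what to rebuild; when a stripe is *silently corrupted*
you do not know which one is bad, so you have to assume each candidate in
turn and validate the result against the checksum:

```python
import zlib

def xor_blocks(blocks):
    """XOR a list of equal-sized blocks together (RAID5-style P parity)."""
    out = bytearray(len(blocks[0]))
    for b in blocks:
        for i, byte in enumerate(b):
            out[i] ^= byte
    return bytes(out)

def recover_single_corruption(stripes, parity, expected_csum):
    """Assume each stripe in turn is the corrupted one, rebuild it from
    the other stripes plus parity, and accept the combination whose data
    matches the stored checksum. Returns the repaired list, or None if
    no single-stripe assumption works (i.e. more than one is bad)."""
    for bad in range(len(stripes)):
        others = [s for i, s in enumerate(stripes) if i != bad]
        rebuilt = xor_blocks(others + [parity])
        candidate = list(stripes)
        candidate[bad] = rebuilt
        if zlib.crc32(b"".join(candidate)) == expected_csum:
            return candidate
    return None

# Demo: silently corrupt one stripe, then recover without knowing which.
data = [b"aaaa", b"bbbb", b"cccc"]
parity = xor_blocks(data)
csum = zlib.crc32(b"".join(data))
data[1] = b"XXXX"                      # silent corruption, location unknown
fixed = recover_single_corruption(data, parity, csum)
assert fixed == [b"aaaa", b"bbbb", b"cccc"]
```

With RAID6's second (Q) parity the same idea extends to trying *pairs* of
candidate-bad stripes, which is exactly why the search space grows and why
a second corruption on top of a degraded array is so hard to recover.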

>>>> Because of known RAID5/6 risks, Server B was a backup
>>>> of
>>>> Server A.
>>>> After applying updates to server B and reboot, the FS would not
>>>> mount
>>>> anymore. Because it was "just" a backup. I decided to recreate the
>>>> FS
>>>> and perform a new backup. Later, I discovered that the FS was not
>>>> broken, but I faced this issue: 
>>>> https://patchwork.kernel.org/patch/10694997/
>>>>
>>>
>>> Sorry for the inconvenience.
>>>
>>> I didn't realize the max_chunk_size limit wasn't reliable at the
>>> time.
>>
>> No problem, I should not have jumped to the conclusion to recreate the
>> backup volume.
>>
>>>> Anyway, the FS was already recreated, so I needed to do a new
>>>> backup.
>>>> During the backup (using rsync -vah), Server A (the source)
>>>> encountered
>>>> an I/O error and my rsync failed. In an attempt to "quick fix" the
>>>> issue, I rebooted Server A after which the FS would not mount
>>>> anymore.
>>>
>>> Did you have any dmesg about that IO error?
>>
>> Yes there was. But I omitted capturing it... The system is now rebooted
>> and I can't retrieve it anymore. :-(
>>
>>> And how is the reboot scheduled? Forced power off or normal reboot
>>> command?
>>
>> The system was rebooted using a normal reboot command.
> 
> Then the problem is pretty serious.
> 
> Possibly the filesystem was already corrupted before the reboot.
> 
>>
>>>> I documented what I have tried, below. I have not yet tried
>>>> anything
>>>> except what is shown, because I am afraid of causing more harm to
>>>> the FS.
>>>
>>> Pretty clever; not running btrfs check --repair was a good move.
>>>
>>>> I hope somebody here can give me advice on how to (hopefully)
>>>> retrieve my data...
>>>>
>>>> Thanks in advance!
>>>>
>>>> ==========================================
>>>>
>>>> [root@cornelis ~]# btrfs fi show
>>>> Label: 'cornelis-btrfs'  uuid: ac643516-670e-40f3-aa4c-f329fc3795fd
>>>> 	Total devices 1 FS bytes used 463.92GiB
>>>> 	devid    1 size 800.00GiB used 493.02GiB path
>>>> /dev/mapper/cornelis-cornelis--btrfs
>>>>
>>>> Label: 'data'  uuid: 4c66fa8b-8fc6-4bba-9d83-02a2a1d69ad5
>>>> 	Total devices 20 FS bytes used 44.85TiB
>>>> 	devid    1 size 3.64TiB used 3.64TiB path /dev/sdn2
>>>> 	devid    2 size 3.64TiB used 3.64TiB path /dev/sdp2
>>>> 	devid    3 size 3.64TiB used 3.64TiB path /dev/sdu2
>>>> 	devid    4 size 3.64TiB used 3.64TiB path /dev/sdx2
>>>> 	devid    5 size 3.64TiB used 3.64TiB path /dev/sdh2
>>>> 	devid    6 size 3.64TiB used 3.64TiB path /dev/sdg2
>>>> 	devid    7 size 3.64TiB used 3.64TiB path /dev/sdm2
>>>> 	devid    8 size 3.64TiB used 3.64TiB path /dev/sdw2
>>>> 	devid    9 size 3.64TiB used 3.64TiB path /dev/sdj2
>>>> 	devid   10 size 3.64TiB used 3.64TiB path /dev/sdt2
>>>> 	devid   11 size 3.64TiB used 3.64TiB path /dev/sdk2
>>>> 	devid   12 size 3.64TiB used 3.64TiB path /dev/sdq2
>>>> 	devid   13 size 3.64TiB used 3.64TiB path /dev/sds2
>>>> 	devid   14 size 3.64TiB used 3.64TiB path /dev/sdf2
>>>> 	devid   15 size 7.28TiB used 588.80GiB path /dev/sdr2
>>>> 	devid   16 size 7.28TiB used 588.80GiB path /dev/sdo2
>>>> 	devid   17 size 7.28TiB used 588.80GiB path /dev/sdv2
>>>> 	devid   18 size 7.28TiB used 588.80GiB path /dev/sdi2
>>>> 	devid   19 size 7.28TiB used 588.80GiB path /dev/sdl2
>>>> 	devid   20 size 7.28TiB used 588.80GiB path /dev/sde2
>>>>
>>>> [root@cornelis ~]# mount /dev/sdn2 /mnt/data
>>>> mount: /mnt/data: wrong fs type, bad option, bad superblock on
>>>> /dev/sdn2, missing codepage or helper program, or other error.
>>>
>>> What is the dmesg of the mount failure?
>>
>> [Sun Dec  2 09:41:08 2018] BTRFS info (device sdn2): disk space caching
>> is enabled
>> [Sun Dec  2 09:41:08 2018] BTRFS info (device sdn2): has skinny extents
>> [Sun Dec  2 09:41:08 2018] BTRFS error (device sdn2): parent transid
>> verify failed on 46451963543552 wanted 114401 found 114173
>> [Sun Dec  2 09:41:08 2018] BTRFS critical (device sdn2): corrupt leaf:
>> root=2 block=46451963543552 slot=0, unexpected item end, have
>> 1387359977 expect 16283
> 
> OK, this shows that one of the copies has a mismatched generation while
> the other copy is completely corrupted.
> 
>> [Sun Dec  2 09:41:08 2018] BTRFS warning (device sdn2): failed to read
>> tree root
>> [Sun Dec  2 09:41:08 2018] BTRFS error (device sdn2): open_ctree failed
>>
>>> And have you tried -o ro,degraded ?
>>
>> Tried it just now, gives the exact same error.
>>
>>>> [root@cornelis ~]# btrfs check /dev/sdn2
>>>> Opening filesystem to check...
>>>> parent transid verify failed on 46451963543552 wanted 114401 found
>>>> 114173
>>>> parent transid verify failed on 46451963543552 wanted 114401 found
>>>> 114173
>>>> checksum verify failed on 46451963543552 found A8F2A769 wanted
>>>> 4C111ADF
>>>> checksum verify failed on 46451963543552 found 32153BE8 wanted
>>>> 8B07ABE4
>>>> checksum verify failed on 46451963543552 found 32153BE8 wanted
>>>> 8B07ABE4
>>>> bad tree block 46451963543552, bytenr mismatch,
>>>> want=46451963543552,
>>>> have=75208089814272
>>>> Couldn't read tree root
>>>
>>> Would you please also paste the output of "btrfs ins dump-super
>>> /dev/sdn2" ?
>>
>> [root@cornelis ~]# btrfs ins dump-super /dev/sdn2
>> superblock: bytenr=65536, device=/dev/sdn2
>> ---------------------------------------------------------
>> csum_type		0 (crc32c)
>> csum_size		4
>> csum			0x51725c39 [match]
>> bytenr			65536
>> flags			0x1
>> 			( WRITTEN )
>> magic			_BHRfS_M [match]
>> fsid			4c66fa8b-8fc6-4bba-9d83-02a2a1d69ad5
>> label			data
>> generation		114401
>> root			46451963543552
> 
> The bytenr matches with the dmesg, so it's tree root node corrupted.
> 
>> sys_array_size		513
>> chunk_root_generation	112769
>> root_level		1
>> chunk_root		22085632
>> chunk_root_level	1
>> log_root		46451935461376
>> log_root_transid	0
>> log_root_level		0
>> total_bytes		104020314161152
>> bytes_used		49308554543104
>> sectorsize		4096
>> nodesize		16384
>> leafsize (deprecated)		16384
>> stripesize		4096
>> root_dir		6
>> num_devices		20
>> compat_flags		0x0
>> compat_ro_flags		0x0
>> incompat_flags		0x1e1
>> 			( MIXED_BACKREF |
>> 			  BIG_METADATA |
>> 			  EXTENDED_IREF |
>> 			  RAID56 |
>> 			  SKINNY_METADATA )
>> cache_generation	114401
>> uuid_tree_generation	114401
>> dev_item.uuid		c6b44903-e849-4403-98c4-f3ba4d0b3fc3
>> dev_item.fsid		4c66fa8b-8fc6-4bba-9d83-02a2a1d69ad5 [match]
>> dev_item.type		0
>> dev_item.total_bytes	4000783007744
>> dev_item.bytes_used	4000781959168
>> dev_item.io_align	4096
>> dev_item.io_width	4096
>> dev_item.sector_size	4096
>> dev_item.devid		1
>> dev_item.dev_group	0
>> dev_item.seek_speed	0
>> dev_item.bandwidth	0
>> dev_item.generation	0
>>
>>> It looks like your tree root (or at least some tree root
>>> nodes/leaves) got corrupted.
>>>
>>>> ERROR: cannot open file system
>>>
>>> And since it's your tree root corrupted, you could also try
>>> "btrfs-find-root <device>" to try to get a good old copy of your tree
>>> root.
>>
>> The output is rather long. I pasted it here: 
>> https://pastebin.com/FkyBLgj9
>> I'm unsure what to look for in this output?
> 
> This shows all the candidate bytenrs of older tree roots.
> 
> We could use it to try to recover.
> 
> You could then try the following command and see if btrfs check can go
> further.
> 
>  # btrfs check -r 45462239363072 <device>
> 
> 
> And the following dump could also help:
> 
>  # btrfs ins dump-tree -b 45462239363072 --follow
> 
> Thanks,
> Qu
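
[Editorial sketch: the candidate scan above could be automated along these
lines. Untested; the device path, the mount point, and the "Well block
NNN(gen: ...)" output format are assumptions based on btrfs-progs v4.19,
so adjust the sed pattern to match your actual btrfs-find-root output.]

```shell
# Scan btrfs-find-root candidates and report the first tree root bytenr
# that btrfs check can open read-only. Assumed device path below.
DEV=/dev/sdn2

if [ -e "$DEV" ]; then
    btrfs-find-root "$DEV" 2>/dev/null |
      sed -n 's/^Well block \([0-9][0-9]*\).*/\1/p' |
      while read -r bytenr; do
        echo "trying tree root candidate at bytenr $bytenr"
        if btrfs check --readonly -r "$bytenr" "$DEV"; then
            echo "bytenr $bytenr passes; next: btrfs restore -t $bytenr $DEV /mnt/recovery"
            break
        fi
      done
else
    echo "device $DEV not present; nothing to scan"
fi
```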
> 
>>
>>> But I suspect the corruption happens before you noticed, thus the old
>>> tree root may not help much.
>>>
>>> Also, the output of "btrfs ins dump-tree -t root <device>" will help.
>>
>> Here it is:
>>
>> [root@cornelis ~]# btrfs ins dump-tree -t root /dev/sdn2
>> btrfs-progs v4.19 
>> parent transid verify failed on 46451963543552 wanted 114401 found
>> 114173
>> parent transid verify failed on 46451963543552 wanted 114401 found
>> 114173
>> checksum verify failed on 46451963543552 found A8F2A769 wanted 4C111ADF
>> checksum verify failed on 46451963543552 found 32153BE8 wanted 8B07ABE4
>> checksum verify failed on 46451963543552 found 32153BE8 wanted 8B07ABE4
>> bad tree block 46451963543552, bytenr mismatch, want=46451963543552,
>> have=75208089814272
>> Couldn't read tree root
>> ERROR: unable to open /dev/sdn2
>>
>>> Thanks,
>>> Qu
>>
>> No, thank YOU! :-)
>>
>>>> [root@cornelis ~]# btrfs restore /dev/sdn2 /mnt/data/
>>>> parent transid verify failed on 46451963543552 wanted 114401 found
>>>> 114173
>>>> parent transid verify failed on 46451963543552 wanted 114401 found
>>>> 114173
>>>> checksum verify failed on 46451963543552 found A8F2A769 wanted
>>>> 4C111ADF
>>>> checksum verify failed on 46451963543552 found 32153BE8 wanted
>>>> 8B07ABE4
>>>> checksum verify failed on 46451963543552 found 32153BE8 wanted
>>>> 8B07ABE4
>>>> bad tree block 46451963543552, bytenr mismatch,
>>>> want=46451963543552,
>>>> have=75208089814272
>>>> Couldn't read tree root
>>>> Could not open root, trying backup super
>>>> warning, device 14 is missing
>>>> warning, device 13 is missing
>>>> warning, device 12 is missing
>>>> warning, device 11 is missing
>>>> warning, device 10 is missing
>>>> warning, device 9 is missing
>>>> warning, device 8 is missing
>>>> warning, device 7 is missing
>>>> warning, device 6 is missing
>>>> warning, device 5 is missing
>>>> warning, device 4 is missing
>>>> warning, device 3 is missing
>>>> warning, device 2 is missing
>>>> checksum verify failed on 22085632 found 5630EA32 wanted 1AA6FFF0
>>>> checksum verify failed on 22085632 found 5630EA32 wanted 1AA6FFF0
>>>> bad tree block 22085632, bytenr mismatch, want=22085632,
>>>> have=1147797504
>>>> ERROR: cannot read chunk root
>>>> Could not open root, trying backup super
>>>> warning, device 14 is missing
>>>> warning, device 13 is missing
>>>> warning, device 12 is missing
>>>> warning, device 11 is missing
>>>> warning, device 10 is missing
>>>> warning, device 9 is missing
>>>> warning, device 8 is missing
>>>> warning, device 7 is missing
>>>> warning, device 6 is missing
>>>> warning, device 5 is missing
>>>> warning, device 4 is missing
>>>> warning, device 3 is missing
>>>> warning, device 2 is missing
>>>> checksum verify failed on 22085632 found 5630EA32 wanted 1AA6FFF0
>>>> checksum verify failed on 22085632 found 5630EA32 wanted 1AA6FFF0
>>>> bad tree block 22085632, bytenr mismatch, want=22085632,
>>>> have=1147797504
>>>> ERROR: cannot read chunk root
>>>> Could not open root, trying backup super
>>>>
>>>> [root@cornelis ~]# uname -r
>>>> 4.18.16-arch1-1-ARCH
>>>>
>>>> [root@cornelis ~]# btrfs --version
>>>> btrfs-progs v4.19
>>>>
> 



Thread overview: 14+ messages
2018-11-30 13:53 Need help with potential ~45TB dataloss Patrick Dijkgraaf
2018-11-30 23:57 ` Qu Wenruo
2018-12-02  9:03   ` Patrick Dijkgraaf
2018-12-02 20:14     ` Patrick Dijkgraaf
2018-12-02 20:30       ` Andrei Borzenkov
2018-12-03  5:58         ` Qu Wenruo
2018-12-04  3:16           ` Chris Murphy
2018-12-04 10:09             ` Patrick Dijkgraaf
2018-12-04 19:38               ` Chris Murphy
2018-12-09  9:28                 ` Patrick Dijkgraaf
2018-12-03  0:35     ` Qu Wenruo
2018-12-03  0:45       ` Qu Wenruo [this message]
2018-12-04  9:58       ` Patrick Dijkgraaf
2018-12-09  9:32         ` Patrick Dijkgraaf
