linux-btrfs.vger.kernel.org archive mirror
From: Qu Wenruo <quwenruo.btrfs@gmx.com>
To: Patrick Dijkgraaf <bolderbast@duckstad.net>, linux-btrfs@vger.kernel.org
Subject: Re: Need help with potential ~45TB dataloss
Date: Sat, 1 Dec 2018 07:57:28 +0800
Message-ID: <6ce9cd01-960f-af3d-0273-0b9abfa1d4f8@gmx.com>
In-Reply-To: <8bc37755da04dffae1a34cea2a06bcffdf2c75d7.camel@duckstad.net>

On 2018/11/30 9:53 PM, Patrick Dijkgraaf wrote:
> Hi all,
> 
> I have been a happy BTRFS user for quite some time. But now I'm facing
> a potential ~45TB dataloss... :-(
> I hope someone can help!
> 
> I have Server A and Server B. Both have a 20-device BTRFS RAID6
> filesystem. Because of known RAID5/6 risks, Server B was a backup of
> Server A.
> After applying updates to Server B and rebooting, the FS would not mount
> anymore. Because it was "just" a backup, I decided to recreate the FS
> and perform a new backup. Later, I discovered that the FS was not
> broken, but I faced this issue:
> https://patchwork.kernel.org/patch/10694997/

Sorry for the inconvenience.

I didn't realize at the time that the max_chunk_size limit wasn't reliable.

> 
> Anyway, the FS was already recreated, so I needed to do a new backup.
> During the backup (using rsync -vah), Server A (the source) encountered
> an I/O error and my rsync failed. In an attempt to "quick fix" the
> issue, I rebooted Server A after which the FS would not mount anymore.

Do you have any dmesg output from that I/O error?

And how was the reboot performed? Forced power-off or a normal reboot command?

> 
> I documented what I have tried, below. I have not yet tried anything
> except what is shown, because I am afraid of causing more harm to
> the FS.

Pretty clever; not running btrfs check --repair was a good move.

> I hope somebody here can give me advice on how to (hopefully)
> retrieve my data...
> 
> Thanks in advance!
> 
> ==========================================
> 
> [root@cornelis ~]# btrfs fi show
> Label: 'cornelis-btrfs'  uuid: ac643516-670e-40f3-aa4c-f329fc3795fd
> 	Total devices 1 FS bytes used 463.92GiB
> 	devid    1 size 800.00GiB used 493.02GiB path
> /dev/mapper/cornelis-cornelis--btrfs
> 
> Label: 'data'  uuid: 4c66fa8b-8fc6-4bba-9d83-02a2a1d69ad5
> 	Total devices 20 FS bytes used 44.85TiB
> 	devid    1 size 3.64TiB used 3.64TiB path /dev/sdn2
> 	devid    2 size 3.64TiB used 3.64TiB path /dev/sdp2
> 	devid    3 size 3.64TiB used 3.64TiB path /dev/sdu2
> 	devid    4 size 3.64TiB used 3.64TiB path /dev/sdx2
> 	devid    5 size 3.64TiB used 3.64TiB path /dev/sdh2
> 	devid    6 size 3.64TiB used 3.64TiB path /dev/sdg2
> 	devid    7 size 3.64TiB used 3.64TiB path /dev/sdm2
> 	devid    8 size 3.64TiB used 3.64TiB path /dev/sdw2
> 	devid    9 size 3.64TiB used 3.64TiB path /dev/sdj2
> 	devid   10 size 3.64TiB used 3.64TiB path /dev/sdt2
> 	devid   11 size 3.64TiB used 3.64TiB path /dev/sdk2
> 	devid   12 size 3.64TiB used 3.64TiB path /dev/sdq2
> 	devid   13 size 3.64TiB used 3.64TiB path /dev/sds2
> 	devid   14 size 3.64TiB used 3.64TiB path /dev/sdf2
> 	devid   15 size 7.28TiB used 588.80GiB path /dev/sdr2
> 	devid   16 size 7.28TiB used 588.80GiB path /dev/sdo2
> 	devid   17 size 7.28TiB used 588.80GiB path /dev/sdv2
> 	devid   18 size 7.28TiB used 588.80GiB path /dev/sdi2
> 	devid   19 size 7.28TiB used 588.80GiB path /dev/sdl2
> 	devid   20 size 7.28TiB used 588.80GiB path /dev/sde2
> 
> [root@cornelis ~]# mount /dev/sdn2 /mnt/data
> mount: /mnt/data: wrong fs type, bad option, bad superblock on
> /dev/sdn2, missing codepage or helper program, or other error.

What does dmesg show for the mount failure?

And have you tried mounting with -o ro,degraded?
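
A minimal sketch of that read-only attempt, assuming the device and mount point from the report (/dev/sdn2 and /mnt/data). The try_ro_degraded function and its RUN switch are hypothetical helpers for this illustration, not part of mount(8) or btrfs-progs. By default it only prints the command, so nothing touches the filesystem until RUN=1 is set.

```shell
#!/bin/sh
# Build (and optionally run) a read-only, degraded mount attempt.
# try_ro_degraded and RUN are hypothetical names for this sketch.
try_ro_degraded() {
    dev=$1
    mnt=$2
    cmd="mount -o ro,degraded $dev $mnt"
    if [ "${RUN:-0}" = "1" ]; then
        # On failure, show the most recent kernel messages, which is
        # where btrfs reports the actual reason the mount was refused.
        $cmd || dmesg | tail -n 50
    else
        # Dry run: print the command for review instead of executing it.
        echo "$cmd"
    fi
}

# Usage (dry run):   try_ro_degraded /dev/sdn2 /mnt/data
# Usage (for real):  RUN=1 try_ro_degraded /dev/sdn2 /mnt/data
```

The ro option keeps the attempt from writing to the filesystem, and the dmesg tail captures the kernel-side error that the generic "wrong fs type, bad option, bad superblock" message hides.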

> 
> [root@cornelis ~]# btrfs check /dev/sdn2
> Opening filesystem to check...
> parent transid verify failed on 46451963543552 wanted 114401 found
> 114173
> parent transid verify failed on 46451963543552 wanted 114401 found
> 114173
> checksum verify failed on 46451963543552 found A8F2A769 wanted 4C111ADF
> checksum verify failed on 46451963543552 found 32153BE8 wanted 8B07ABE4
> checksum verify failed on 46451963543552 found 32153BE8 wanted 8B07ABE4
> bad tree block 46451963543552, bytenr mismatch, want=46451963543552,
> have=75208089814272
> Couldn't read tree root

Would you please also paste the output of "btrfs ins dump-super /dev/sdn2"?

It looks like your tree root (or at least some of the tree root
nodes/leaves) got corrupted.

> ERROR: cannot open file system

And since it's your tree root that is corrupted, you could also try
"btrfs-find-root <device>" to look for a good older copy of your tree root.

But I suspect the corruption happened before you noticed it, so the old
tree root may not help much.
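
To see how far behind the on-disk block is, the "parent transid verify failed" lines in the check output can be summarized. This is only a sketch: transid_gaps is a hypothetical helper, and it assumes each message sits on a single line (as in terminal output, not as wrapped by a mail client).

```shell
#!/bin/sh
# Summarize "parent transid verify failed" messages read from stdin.
# transid_gaps is a hypothetical helper name for this sketch.
transid_gaps() {
    awk '/parent transid verify failed on/ {
        # Pick out the values following the "on", "wanted" and "found" keywords.
        for (i = 1; i <= NF; i++) {
            if ($i == "on")     bytenr = $(i + 1)
            if ($i == "wanted") wanted = $(i + 1)
            if ($i == "found")  found  = $(i + 1)
        }
        # Report each distinct (bytenr, wanted, found) combination once.
        key = bytenr " " wanted " " found
        if (!(key in seen)) {
            seen[key] = 1
            printf "bytenr=%s wanted=%d found=%d gap=%d\n", bytenr, wanted, found, wanted - found
        }
    }'
}

# Usage: btrfs check /dev/sdn2 2>&1 | transid_gaps
```

For the messages in the report, the block at 46451963543552 was found at generation 114173 while 114401 was wanted, a gap of 228 generations.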

Also, the output of "btrfs ins dump-tree -t root <device>" will help.
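
The three read-only diagnostics requested in this reply could be collected in one go, roughly as below. diag_commands and the output file names are assumptions made for this sketch. It only prints the command lines so they can be reviewed first, and the output directory is assumed to exist already on a different, healthy filesystem.

```shell
#!/bin/sh
# Print the read-only diagnostic commands asked for in this reply.
# diag_commands and the .txt file names are hypothetical choices.
diag_commands() {
    dev=$1
    outdir=${2:-.}
    # Superblock contents, including the tree root location and generation.
    echo "btrfs inspect-internal dump-super $dev > $outdir/dump-super.txt"
    # Search the device for older copies of the tree root.
    echo "btrfs-find-root $dev > $outdir/find-root.txt 2>&1"
    # Dump the root tree itself, as far as it can still be read.
    echo "btrfs inspect-internal dump-tree -t root $dev > $outdir/dump-tree-root.txt 2>&1"
}

# Usage (review only): diag_commands /dev/sdn2 /root/btrfs-diag
# Usage (execute):     diag_commands /dev/sdn2 /root/btrfs-diag | sh
```

"btrfs ins" in the text above is the abbreviated form of "btrfs inspect-internal"; the sketch spells it out.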

Thanks,
Qu
> 
> [root@cornelis ~]# btrfs restore /dev/sdn2 /mnt/data/
> parent transid verify failed on 46451963543552 wanted 114401 found
> 114173
> parent transid verify failed on 46451963543552 wanted 114401 found
> 114173
> checksum verify failed on 46451963543552 found A8F2A769 wanted 4C111ADF
> checksum verify failed on 46451963543552 found 32153BE8 wanted 8B07ABE4
> checksum verify failed on 46451963543552 found 32153BE8 wanted 8B07ABE4
> bad tree block 46451963543552, bytenr mismatch, want=46451963543552,
> have=75208089814272
> Couldn't read tree root
> Could not open root, trying backup super
> warning, device 14 is missing
> warning, device 13 is missing
> warning, device 12 is missing
> warning, device 11 is missing
> warning, device 10 is missing
> warning, device 9 is missing
> warning, device 8 is missing
> warning, device 7 is missing
> warning, device 6 is missing
> warning, device 5 is missing
> warning, device 4 is missing
> warning, device 3 is missing
> warning, device 2 is missing
> checksum verify failed on 22085632 found 5630EA32 wanted 1AA6FFF0
> checksum verify failed on 22085632 found 5630EA32 wanted 1AA6FFF0
> bad tree block 22085632, bytenr mismatch, want=22085632,
> have=1147797504
> ERROR: cannot read chunk root
> Could not open root, trying backup super
> warning, device 14 is missing
> warning, device 13 is missing
> warning, device 12 is missing
> warning, device 11 is missing
> warning, device 10 is missing
> warning, device 9 is missing
> warning, device 8 is missing
> warning, device 7 is missing
> warning, device 6 is missing
> warning, device 5 is missing
> warning, device 4 is missing
> warning, device 3 is missing
> warning, device 2 is missing
> checksum verify failed on 22085632 found 5630EA32 wanted 1AA6FFF0
> checksum verify failed on 22085632 found 5630EA32 wanted 1AA6FFF0
> bad tree block 22085632, bytenr mismatch, want=22085632,
> have=1147797504
> ERROR: cannot read chunk root
> Could not open root, trying backup super
> 
> [root@cornelis ~]# uname -r
> 4.18.16-arch1-1-ARCH
> 
> [root@cornelis ~]# btrfs --version
> btrfs-progs v4.19
> 