linux-btrfs.vger.kernel.org archive mirror
From: Qu Wenruo <quwenruo.btrfs@gmx.com>
To: Daniel Clarke <dan@e-dan.co.uk>, linux-btrfs@vger.kernel.org
Subject: Re: BTRFS unable to mount after one failed disk in RAID 1
Date: Wed, 21 Aug 2019 07:39:19 +0800
Message-ID: <cafc855e-b030-83ff-2984-dfb45a36d1b3@gmx.com>
In-Reply-To: <CAP-b2nNHVnfDyC2-F2pWtwUgjZxcqfwqYvNcBmknd5ZHauWoUw@mail.gmail.com>





On 2019/8/21 3:42 AM, Daniel Clarke wrote:
> Hi,
> 
> I'm having some trouble recovering my data after a single disk has
> failed in a raid1 two disk setup.
> 
> The original setup:
> mkfs.btrfs -L MASTER /dev/sdb1
> mount -o compress=zstd,noatime /dev/sdb1 /mnt/master
> btrfs subvolume create /mnt/master/home
> btrfs device add /dev/sdc1 /mnt/master
> btrfs balance start -dconvert=raid1 -mconvert=raid1 /mnt/master
> 
> Mount after in fstab:
> 
> UUID=70a651ab-4837-4891-9099-a6c8a52aa40f /mnt/master     btrfs
> defaults,noatime,compress=zstd 0      0
> 
> Was working fine for about 8 months, however I found the filesystem
> went to read only,

Please provide the dmesg from that event.

There is also a known bug where an aborted transaction can cause a race and
corrupt the fs.
Please also provide the kernel version that was running when the fs went RO.

> and after a restart, would not mount at all. A
> failed disk seems to be the cause.

Please provide the dmesg of the first mount failure as well.
It's hard to say with so little info.
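
If the full dmesg is long, a small filter can pull out just the btrfs
warnings and errors before posting. The following is only a minimal Python
sketch: the function name btrfs_messages and the regular expression are
assumptions made for illustration, based on the usual
"[timestamp] BTRFS <level> (device <dev>): <message>" shape of kernel log
lines.

```python
import re

# Assumed line shape:
# [ 4044.802426] BTRFS error (device sdc1): failed to read chunk tree: -2
BTRFS_LINE = re.compile(
    r"^\[\s*(?P<ts>\d+\.\d+)\]\s+BTRFS\s+(?P<level>\w+)\s+"
    r"\(device (?P<dev>[^)]+)\):\s*(?P<msg>.*)$"
)


def btrfs_messages(dmesg_text, levels=("warning", "error", "critical")):
    """Return (timestamp, level, device, message) tuples for matching BTRFS lines."""
    found = []
    for line in dmesg_text.splitlines():
        m = BTRFS_LINE.match(line.strip())
        if m and m.group("level") in levels:
            found.append(
                (float(m.group("ts")), m.group("level"), m.group("dev"), m.group("msg"))
            )
    return found


# Sample input in the kernel log format.
SAMPLE = """\
[ 4044.456472] BTRFS info (device sdc1): trying to use backup root at mount time
[ 4044.802426] BTRFS error (device sdc1): failed to read chunk tree: -2
[ 4044.863400] BTRFS error (device sdc1): open_ctree failed
"""

for ts, level, dev, msg in btrfs_messages(SAMPLE):
    print(f"{ts:.6f} {level} {dev}: {msg}")
```

The info-level line is dropped by default; pass levels=("info", "warning",
"error", "critical") to keep everything. The unfiltered log is still the
most useful thing to attach.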

> 
> I'm trying to get the files off the other disk, but it will not mount.
> 
> Some info:
> ~$ blkid /dev/sdc1
> /dev/sdc1: LABEL="MASTER" UUID="70a651ab-4837-4891-9099-a6c8a52aa40f"
> UUID_SUB="150986ba-521c-4eb0-85ec-9435edecaf2a" TYPE="btrfs"
> PARTUUID="50a736da-aba8-224a-8e82-f1322ede466f"
> 
> ~$ btrfs --version
> btrfs-progs v4.15.1

That's too old and too dangerous, especially since older btrfs-progs can
cause further corruption if it hits a BUG_ON() or aborts a transaction.
> 
> ~$ btrfs fi show
> warning, device 2 is missing
> bytenr mismatch, want=1057828618240, have=0
> Label: 'MASTER'  uuid: 70a651ab-4837-4891-9099-a6c8a52aa40f
> Total devices 2 FS bytes used 1001.59GiB
> devid    1 size 1.82TiB used 1003.03GiB path /dev/sdc1
> *** Some devices missing
> 
> Things I've tried:
> 
> ~$ mount -t btrfs -o ro,usebackuproot,compress=zstd /dev/sdc1 /mnt/maindisk
> mount: /mnt/maindisk: wrong fs type, bad option, bad superblock on
> /dev/sdc1, missing codepage or helper program, or other error.

You're already using ro, which means btrfs will try its best to mount
unless vital tree blocks are missing.

> 
> In dmesg:
> [ 4044.456472] BTRFS info (device sdc1): trying to use backup root at mount time
> [ 4044.456478] BTRFS info (device sdc1): use zstd compression, level 0
> [ 4044.456481] BTRFS info (device sdc1): disk space caching is enabled
> [ 4044.456482] BTRFS info (device sdc1): has skinny extents
> [ 4044.802419] BTRFS error (device sdc1): devid 2 uuid
> a3889c61-07b3-4165-bc37-e9918e41ea8d is missing
> [ 4044.802426] BTRFS error (device sdc1): failed to read chunk tree: -2

And that's the case here: chunk tree blocks are missing.
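
As a side note, the "-2" at the end of the "failed to read chunk tree" line
is a negative errno value returned by the kernel. A minimal Python sketch
for decoding it follows; the helper name decode_kernel_errno is made up for
illustration, and it assumes the script runs on a system whose errno
numbering matches the kernel that produced the log (e.g. Linux).

```python
import errno
import os


def decode_kernel_errno(code):
    """Map a negative kernel return code (e.g. -2) to (symbolic name, description)."""
    number = abs(code)
    name = errno.errorcode.get(number, "UNKNOWN")
    return name, os.strerror(number)


name, text = decode_kernel_errno(-2)
print(name, "-", text)
```

On Linux, errno 2 is ENOENT, which fits the preceding "devid 2 ... is
missing" message.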

Please provide the following dump:

# btrfs ins dump-super -Ffa /dev/sdc1

> [ 4044.863400] BTRFS error (device sdc1): open_ctree failed
> 
> Pretty much the same thing with other mount options, with same
> messages in dmesg.
> 
> ~$ btrfs check --init-extent-tree /dev/sdc1

Why are you doing that?! It has already been mentioned that --init-extent-tree is UNSAFE!

> warning, device 2 is missing
> Checking filesystem on /dev/sdc1
> UUID: 70a651ab-4837-4891-9099-a6c8a52aa40f
> Creating a new extent tree
> bytenr mismatch, want=1058577645568, have=0
> Error reading tree block
> error pinning down used bytes
> ERROR: attempt to start transaction over already running one

The transaction got aborted, which is exactly the situation where the fs
can get further corrupted.

The only good news is that we shouldn't have written much data, since this
happened during the tree pinning-down process, so there should be no
further damage.

> extent buffer leak: start 1768503115776 len 16384
> 
> ~$ btrfs rescue super-recover -v /dev/sdc1
> All Devices:
> Device: id = 1, name = /dev/sdc1
> 
> Before Recovering:
> [All good supers]:
> device name = /dev/sdc1
> superblock bytenr = 65536
> 
> device name = /dev/sdc1
> superblock bytenr = 67108864
> 
> device name = /dev/sdc1
> superblock bytenr = 274877906944
> 
> [All bad supers]:
> 
> All supers are valid, no need to recover
> 
> 
> ~$ sudo btrfs restore -mxs /dev/sdc1 /mnt/ssd1/
> warning, device 2 is missing
> bytenr mismatch, want=1057828618240, have=0
> Could not open root, trying backup super
> warning, device 2 is missing
> bytenr mismatch, want=1057828618240, have=0
> Could not open root, trying backup super
> warning, device 2 is missing
> bytenr mismatch, want=1057828618240, have=0
> Could not open root, trying backup super
> 
> ~$ btrfs check /dev/sdc1
> warning, device 2 is missing
> bytenr mismatch, want=1057828618240, have=0
> ERROR: cannot open file system
> 
> ~$ btrfs rescue zero-log /dev/sdc1
> warning, device 2 is missing
> bytenr mismatch, want=1057828618240, have=0
> ERROR: could not open ctree
> 
> I'm only interested in getting it read-only mounted so I can copy
> somewhere else. Any ideas you have are welcome!

It looks like some metadata tree blocks are still not in RAID1 mode.
I need that ins dump-super output to confirm.
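
For anyone who wants to pre-screen such a dump themselves, here is a rough
Python sketch that lists chunk entries lacking the RAID1 flag. It rests on
assumptions: that the chunk entries in the dump end with a field like
"type SYSTEM|RAID1", and that a missing RAID1 flag is what we are looking
for. The function name non_raid1_chunk_types and the sample lines are
hypothetical, not real output from this filesystem.

```python
import re

# Assumed shape of a chunk entry line in the dump, e.g.
#   length 33554432 owner 2 stripe_len 65536 type SYSTEM|RAID1
CHUNK_TYPE = re.compile(
    r"\btype\s+((?:DATA|METADATA|SYSTEM)[A-Za-z0-9|]*)\s*$"
)


def non_raid1_chunk_types(dump_text):
    """Return the chunk type strings that do not carry the RAID1 flag."""
    suspicious = []
    for line in dump_text.splitlines():
        m = CHUNK_TYPE.search(line)
        if m and "RAID1" not in m.group(1).split("|"):
            suspicious.append(m.group(1))
    return suspicious


# Hypothetical sample lines, for illustration only.
SAMPLE = """\
length 4194304 owner 2 stripe_len 65536 type SYSTEM
length 33554432 owner 2 stripe_len 65536 type SYSTEM|RAID1
"""

print(non_raid1_chunk_types(SAMPLE))
```

Any entry it reports would be a chunk with only one copy, which could
explain why the chunk tree cannot be read from the remaining device. The
full dump is still needed to confirm.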

Thanks,
Qu

> 
> Many Thanks,
> 
> Daniel Clarke
> 



