linux-btrfs.vger.kernel.org archive mirror
* failed to read block groups: -5; open_ctree failed
@ 2020-03-18  2:26 Liwei
  2020-03-18  8:00 ` Qu Wenruo
  0 siblings, 1 reply; 2+ messages in thread
From: Liwei @ 2020-03-18  2:26 UTC (permalink / raw)
  To: linux-btrfs

Hi list,
I'm getting the following log while trying to mount my filesystem:
[   23.403026] BTRFS: device label dstore devid 1 transid 1288839 /dev/dm-8
[   23.491459] BTRFS info (device dm-8): enabling auto defrag
[   23.491461] BTRFS info (device dm-8): disk space caching is enabled
[   23.717506] BTRFS info (device dm-8): bdev /dev/mapper/vg-dstore errs: wr 0, rd 728, flush 0, corrupt 16, gen 0
[   32.108724] BTRFS error (device dm-8): bad tree block start, want 39854304329728 have 0
[   32.110570] BTRFS error (device dm-8): bad tree block start, want 39854304329728 have 0
[   32.112030] BTRFS error (device dm-8): failed to read block groups: -5
[   32.273712] BTRFS error (device dm-8): open_ctree failed
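
For reference, a read-only mount against the backup tree roots is the
usual next step; a minimal sketch, assuming the device path from the
log above and /mnt/dstore as the mount point:

# mount -o ro,usebackuproot /dev/mapper/vg-dstore /mnt/dstore

As described below, older tree root generations all point at the same
zeroed node, so this fails in the same way.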

A check gives me:
# btrfs check /dev/mapper/recovery
Opening filesystem to check...
checksum verify failed on 39854304329728 found E4E3BDB6 wanted 00000000
checksum verify failed on 39854304329728 found E4E3BDB6 wanted 00000000
checksum verify failed on 39854304329728 found E4E3BDB6 wanted 00000000
checksum verify failed on 39854304329728 found E4E3BDB6 wanted 00000000
bad tree block 39854304329728, bytenr mismatch, want=39854304329728, have=0
ERROR: cannot open file system

The same thing happens when checking against the other superblock
copies; the superblocks themselves are not corrupted.
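
For completeness, checking against the backup superblock copies can be
done roughly like this (a sketch; copies 1 and 2 are the standard
backup locations, and dump-super -a prints all copies so they can be
compared):

# btrfs check --super 1 /dev/mapper/recovery
# btrfs check --super 2 /dev/mapper/recovery
# btrfs inspect-internal dump-super -a /dev/mapper/recovery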

The reason this happened is that a controller failure occurred while
expanding the underlying RAID6, causing some pretty nasty drive
dropouts. Looking through older generations of tree roots, I'm getting
the same zeroed node at 39854304329728.
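
For reference, older tree roots can be enumerated and inspected
roughly like this (a sketch; <bytenr> is a placeholder for a root
candidate reported by btrfs-find-root):

# btrfs-find-root /dev/mapper/recovery
# btrfs inspect-internal dump-tree -b <bytenr> /dev/mapper/recovery

Every generation I looked at ends up referencing the same zeroed
block.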

It seems like at some point md messed up recovering from the
controller failure (or rather, I messed up the recovery), and I am now
seeing a lot of zeroed-out/corrupted areas. Can someone confirm
whether that is the case, or whether the filesystem is just in some
weird state?

I'm not hung up about hosing the filesystem, as we have a complete
backup taken before the RAID expansion, but it would be great if I
could avoid restoring, since that will take a very long time.

Other obligatory information:
# uname -a
Linux dstore-1 4.19.0-4-amd64 #1 SMP Debian 4.19.28-2 (2019-03-15)
x86_64 GNU/Linux
# btrfs --version
btrfs-progs v4.20.1

Thank you very much!
Liwei


* Re: failed to read block groups: -5; open_ctree failed
  2020-03-18  2:26 failed to read block groups: -5; open_ctree failed Liwei
@ 2020-03-18  8:00 ` Qu Wenruo
  0 siblings, 0 replies; 2+ messages in thread
From: Qu Wenruo @ 2020-03-18  8:00 UTC (permalink / raw)
  To: Liwei, linux-btrfs


On 2020/3/18 10:26 AM, Liwei wrote:
> Hi list,
> I'm getting the following log while trying to mount my filesystem:
> [   23.403026] BTRFS: device label dstore devid 1 transid 1288839 /dev/dm-8
> [   23.491459] BTRFS info (device dm-8): enabling auto defrag
> [   23.491461] BTRFS info (device dm-8): disk space caching is enabled
> [   23.717506] BTRFS info (device dm-8): bdev /dev/mapper/vg-dstore errs: wr 0, rd 728, flush 0, corrupt 16, gen 0
> [   32.108724] BTRFS error (device dm-8): bad tree block start, want 39854304329728 have 0
> [   32.110570] BTRFS error (device dm-8): bad tree block start, want 39854304329728 have 0
> [   32.112030] BTRFS error (device dm-8): failed to read block groups: -5
> [   32.273712] BTRFS error (device dm-8): open_ctree failed

Extent tree corruption.

And it's not some small problem, but data loss: the on-disk data
there is completely wiped (all zeros).
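
If you want to double check on the raw device, something along these
lines can dump the raw bytes behind that logical address (a sketch;
16384 assumes the default nodesize, and the output path is arbitrary):

# btrfs-map-logical -l 39854304329728 -b 16384 -o /tmp/block.bin /dev/mapper/recovery
# hexdump -C /tmp/block.bin | head

btrfs-map-logical maps the logical address to its physical location(s)
and copies the raw bytes, so an all-zero dump confirms the wipe.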

> 
> A check gives me:
> #btrfs check /dev/mapper/recovery
> Opening filesystem to check...
> checksum verify failed on 39854304329728 found E4E3BDB6 wanted 00000000
> checksum verify failed on 39854304329728 found E4E3BDB6 wanted 00000000
> checksum verify failed on 39854304329728 found E4E3BDB6 wanted 00000000
> checksum verify failed on 39854304329728 found E4E3BDB6 wanted 00000000
> bad tree block 39854304329728, bytenr mismatch, want=39854304329728, have=0
> ERROR: cannot open file system
> 
> The same thing happens when checking against the other superblock
> copies; the superblocks themselves are not corrupted.

Superblocks are only 4K in size; you wouldn't expect them to contain
all your metadata, right?

> 
> The reason this happened is that a controller failure occurred while
> expanding the underlying RAID6, causing some pretty nasty drive
> dropouts. Looking through older generations of tree roots, I'm getting
> the same zeroed node at 39854304329728.
> 
> It seems like at some point md messed up recovering from the
> controller failure (or rather, I messed up the recovery), and I am now
> seeing a lot of zeroed-out/corrupted areas.

Yes, that's exactly the case.

> Can someone confirm whether that is the case, or whether the
> filesystem is just in some weird state?
> 
> I'm not hung up about hosing the filesystem, as we have a complete
> backup taken before the RAID expansion, but it would be great if I
> could avoid restoring, since that will take a very long time.

Since part of your on-disk data/metadata is wiped, I don't believe
the wiped metadata is limited to the extent tree.

But if you're really lucky and the wiped range is only in the extent
tree, btrfs restore would be able to recover most of your data.
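
A rough sketch of what that would look like (not a definitive
procedure; /mnt/restore is an assumed target with enough free space,
and -D does a listing-only dry run first):

# btrfs restore -D -v /dev/mapper/recovery /mnt/restore
# btrfs restore -v -i -m /dev/mapper/recovery /mnt/restore

If the default tree root is unreadable, btrfs-find-root can suggest an
older root bytenr to pass with -t.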

Thanks,
Qu

> 
> Other obligatory information:
> # uname -a
> Linux dstore-1 4.19.0-4-amd64 #1 SMP Debian 4.19.28-2 (2019-03-15)
> x86_64 GNU/Linux
> # btrfs --version
> btrfs-progs v4.20.1
> 
> Thank you very much!
> Liwei
> 



