linux-btrfs.vger.kernel.org archive mirror
* BTRFS corruption: open_ctree failed
@ 2019-01-03  0:26 b11g
  2019-01-03  4:52 ` Chris Murphy
  2019-01-11 12:29 ` b11g
  0 siblings, 2 replies; 11+ messages in thread
From: b11g @ 2019-01-03  0:26 UTC (permalink / raw)
  To: linux-btrfs

Hi all,

I have several BTRFS success stories, and I've been a happy user for quite a long time now. I was therefore surprised to face a BTRFS corruption on a system I'd just installed.

I use NixOS, unstable branch (Linux kernel 4.19.12). The system runs on an SSD with an ext4 boot partition, a simple btrfs root with some subvolumes, and some swap space used only for hibernation. I was working on my server as normal when I noticed that all of my BTRFS subvolumes had been remounted read-only (ro). Shortly afterwards, I started getting various I/O errors ("bus error" from journalctl, "I/O error" from ls, etc.). I halted the system with a hard reboot, and on the next boot the BTRFS partition would not mount. I suspected the corruption was disk-related, but smartctl shows no warnings for the disk, and the ext4 partition seems healthy.

These are the kernel messages logged when I attempt to mount the partition:
Jan 02 23:39:38 nixos kernel: BTRFS warning (device sdd2): sdd2 checksum verify failed on <L> wanted <A> found <B> level 0
Jan 02 23:39:38 nixos kernel: BTRFS error (device sdd2): failed to read block groups: -5
Jan 02 23:39:38 nixos systemd[1]: Started Cleanup of Temporary Directories.
Jan 02 23:39:38 nixos kernel: BTRFS error (device sdd2): open_ctree failed
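
The excerpt above has an unrelated systemd line mixed in with the BTRFS messages. A minimal sketch of how such a dump could be narrowed to the kernel's BTRFS lines follows; the function name btrfs_kernel_lines is hypothetical, and it assumes the "kernel: BTRFS" prefix seen in the log above:

```shell
#!/bin/sh
# Hypothetical helper: read journal text on stdin and keep only the
# kernel lines tagged "BTRFS" (prefix format assumed from the log above).
btrfs_kernel_lines() {
    grep 'kernel: BTRFS'
}

# Possible usage on a systemd machine (not executed here):
#   journalctl -b --no-pager | btrfs_kernel_lines
```

Matching on the "kernel: BTRFS" prefix rather than on the word "error" keeps the warning line too, which here is the one carrying the checksum details.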


Searching for the error code led me to these two recent threads:
https://www.spinics.net/lists/linux-btrfs/msg84973.html
https://www.spinics.net/lists/linux-btrfs/msg83833.html


Using btrfs-progs-4.15.1, "btrfs restore /dev/sdd2 /tmp/" fails with:
checksum verify failed on <N> found <A> wanted <B>
checksum verify failed on <N> found <A> wanted <B>
Csum didn't match
Could not open root, trying backup super
checksum verify failed on <N> found <A> wanted <B>
checksum verify failed on <N> found <A> wanted <B>
Csum didn't match
Could not open root, trying backup super
ERROR: superblock bytenr <X> is larger than device size <Y>
Could not open root, trying backup super

Using btrfs-progs-4.19.1, "btrfs restore /dev/sdd2 /tmp/" succeeds with some exceptions:
We have looped trying to restore files in /@/nix/store too many times to be making progress, stopping

I do not have much time for debugging the issue and I did not lose important data, so I tried a couple of commands suggested on the threads and in the docs (without fully understanding them):

"btrfs rescue zero-log /dev/sdd2":
checksum verify failed on <N> found <A> wanted <B>
checksum verify failed on <N> found <A> wanted <B>
Csum didn't match
ERROR: could not open ctree

"btrfs check --repair /dev/sdd2" (I know, I was not supposed to run this one):
Opening filesystem to check...
checksum verify failed on <N> found <A> wanted <B>
checksum verify failed on <N> found <A> wanted <B>
Csum didn't match
ERROR: could not open ctree

Same for "btrfs check --init-csum-tree /dev/sdd2".


I expect to wipe the disk and start clean in the next few days. I just wanted to report this in the hope that it helps development (sorry for the redaction). If you need more information, I'll be glad to help as I can!

Thank you for your work,
Cheers,
- b11g


* Re: BTRFS corruption: open_ctree failed
@ 2019-01-03  2:52 Tomasz Chmielewski
  2019-01-03  7:27 ` Andrea Gelmini
  0 siblings, 1 reply; 11+ messages in thread
From: Tomasz Chmielewski @ 2019-01-03  2:52 UTC (permalink / raw)
  To: Btrfs BTRFS

> I have several BTRFS success stories, and I've been a happy user for
> quite a long time now. I was therefore surprised to face a BTRFS
> corruption on a system I'd just installed.
> I use NixOS, unstable branch (linux kernel 4.19.12). The system runs on
> an SSD with an ext4 boot partition, a simple btrfs root with some
> subvolumes, [...]

Did you use 4.19.x kernels earlier than 4.19.8?

They had a bug that could corrupt filesystems (it was mostly reported by
ext4 users, but I saw it with other filesystems too, such as xfs and
btrfs):

https://www.phoronix.com/scan.php?page=news_item&px=Linux-4.19-4.20-BLK-MQ-Fix

Interestingly, btrfs in RAID mode would often detect and correct these 
corruptions.
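
A quick way to test a kernel release string against that range is sketched below. The function name is_affected_4_19 is hypothetical, and the sketch assumes the release string starts with MAJOR.MINOR.PATCH as in "4.19.12". It only looks at the currently running kernel; whether an older 4.19.x was booted earlier has to be checked separately.

```shell
#!/bin/sh
# Hypothetical helper: succeed (exit status 0) if the given release is a
# 4.19.x kernel older than 4.19.8, fail otherwise.
# Assumes the string begins with MAJOR.MINOR.PATCH, e.g. "4.19.5-generic".
is_affected_4_19() {
    ver=${1%%-*}                 # drop any "-suffix" after the version
    case "$ver" in
        4.19|4.19.*) ;;          # only the 4.19 series is of interest
        *) return 1 ;;
    esac
    patch=${ver#4.19}            # ".12" for "4.19.12", empty for "4.19"
    patch=${patch#.}             # "12"
    patch=${patch%%[!0-9]*}      # keep the leading digits only
    [ "${patch:-0}" -lt 8 ]
}

if is_affected_4_19 "$(uname -r)"; then
    echo "running kernel is in the 4.19.0-4.19.7 range"
else
    echo "running kernel is outside the 4.19.0-4.19.7 range"
fi
```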


Thread overview: 11+ messages
2019-01-03  0:26 BTRFS corruption: open_ctree failed b11g
2019-01-03  4:52 ` Chris Murphy
2019-01-03 13:55   ` b11g
2019-01-11 12:29 ` b11g
2019-01-03  2:52 Tomasz Chmielewski
2019-01-03  7:27 ` Andrea Gelmini
2019-01-03  7:43   ` Tomasz Chmielewski
2019-01-03  8:22     ` Andrea Gelmini
2019-01-03  8:29       ` Tomasz Chmielewski
2019-01-03  9:46         ` Andrea Gelmini
2019-01-03 14:32   ` b11g
