linux-btrfs.vger.kernel.org archive mirror
From: b11g <b11g@protonmail.com>
To: "linux-btrfs@vger.kernel.org" <linux-btrfs@vger.kernel.org>
Subject: Re: BTRFS corruption: open_ctree failed
Date: Fri, 11 Jan 2019 12:29:40 +0000
Message-ID: <j8LS7-Iz6uOg8fN_yZD_mcUT3J1jCHsk9nU18LpSGGcHDHVTHk6s8u5HFNAuB6lfcjzwtWD2o6awwLqY6-GQmOBBeJ6ZUiiybgohZUxJlwg=@protonmail.com>
In-Reply-To: <DID7NmLoqB1WBZ8Vo7rEMlL9cLWv7Bz9F04MkQYOLfZJyl04O44IUiFKdC_Tl4iSvHGLxpbwDwD9cHJ33t-Hkv8PKF0wmm7201RABA2kRAY=@protonmail.com>

Follow-up: the issue was a faulty DIMM. By some strange coincidence, only the memory allocated to disk caches appeared to be corrupted, with the rest of the system working flawlessly most of the time.

I would guess that BTRFS tried to self-heal based on the cached data, ultimately corrupting the file system beyond salvation?

If anyone gets here with similar problems: memtest your RAM before doing anything!
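
For anyone wondering why bad RAM shows up as "checksum verify failed" rather than as an obvious hardware error, here is a minimal sketch (not taken from the kernel or btrfs-progs). It assumes a made-up 4 KiB block and uses zlib's CRC-32 as a stand-in for the CRC32C that Btrfs uses by default; the flip_bit helper and the bit position are hypothetical. It only shows that a single inverted bit in a cached block is enough to make the stored and recomputed checksums disagree.

```python
import zlib


def flip_bit(data: bytes, bit_index: int) -> bytes:
    """Return a copy of data with exactly one bit inverted."""
    buf = bytearray(data)
    buf[bit_index // 8] ^= 1 << (bit_index % 8)
    return bytes(buf)


# Hypothetical 4 KiB block standing in for a metadata node held in RAM.
block = bytes(range(256)) * 16

# Checksum recorded while the data was still intact.
# Assumption: zlib.crc32 is CRC-32, used here in place of Btrfs's CRC32C.
wanted = zlib.crc32(block)

# Simulate a faulty memory cell inverting one bit while the block is cached.
damaged = flip_bit(block, 12345)
found = zlib.crc32(damaged)

print(f"wanted {wanted:08X} found {found:08X}")
print("checksum verify failed" if wanted != found else "checksum ok")
```

The checksum can tell that the block changed, but not whether the copy in memory or the copy on disk is the wrong one, which is why testing the RAM comes first.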

-b11g


‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
On Thursday, 3 January 2019 01:26, b11g <b11g@protonmail.com> wrote:

> Hi all,
>
> I have several BTRFS success stories, and I've been a happy user for quite a long time now. I was therefore surprised to face BTRFS corruption on a system I'd just installed.
>
> I use NixOS, unstable branch (Linux kernel 4.19.12). The system runs on an SSD with an ext4 boot partition, a simple btrfs root with some subvolumes, and some swap space used only for hibernation. I was working on my server as normal when I noticed all of my BTRFS subvolumes had been remounted ro. After a short time, I started getting various I/O errors ("bus error" from journalctl, "I/O error" from ls, etc.). I halted the system (hard reboot); on reboot, the BTRFS partition would not mount. I suspected the corruption to be disk-related, but smartctl does not show any warning for the disk, and the ext4 partition seems healthy.
>
> These are the kernel messages logged when I attempt to mount the partition:
> Jan 02 23:39:38 nixos kernel: BTRFS warning (device sdd2): sdd2 checksum verify failed on <L> wanted <A> found <B> level 0
> Jan 02 23:39:38 nixos kernel: BTRFS error (device sdd2): failed to read block groups: -5
> Jan 02 23:39:38 nixos systemd[1]: Started Cleanup of Temporary Directories.
> Jan 02 23:39:38 nixos kernel: BTRFS error (device sdd2): open_ctree failed
>
> Searching for the error code led me to these two recent threads:
> https://www.spinics.net/lists/linux-btrfs/msg84973.html
> https://www.spinics.net/lists/linux-btrfs/msg83833.html
>
> Using btrfs-progs-4.15.1, "btrfs restore /dev/sdd2 /tmp/" fails with:
> checksum verify failed on <N> found <A> wanted <B>
> checksum verify failed on <N> found <A> wanted <B>
> Csum didn't match
> Could not open root, trying backup super
> checksum verify failed on <N> found <A> wanted <B>
> checksum verify failed on <N> found <A> wanted <B>
> Csum didn't match
> Could not open root, trying backup super
> ERROR: superblock bytenr <X> is larger than device size <Y>
> Could not open root, trying backup super
>
> Using btrfs-progs-4.19.1, "btrfs restore /dev/sdd2 /tmp/" succeeds with some exceptions:
> We have looped trying to restore files in /@/nix/store too many times to be making progress, stopping
>
> I do not have much time to debug the issue and I did not lose important data, so I tried a couple of commands suggested in the threads and in the docs (without fully understanding them):
>
> "btrfs rescue zero-log /dev/sdd2":
> checksum verify failed on <N> found <A> wanted <B>
> checksum verify failed on <N> found <A> wanted <B>
> Csum didn't match
> ERROR: could not open ctree
>
> "btrfs check --repair /dev/sdd2" (I know, I was not supposed to run this one):
> Opening filesystem to check...
> checksum verify failed on <N> found <A> wanted <B>
> checksum verify failed on <N> found <A> wanted <B>
> Csum didn't match
> ERROR: could not open ctree
>
> Same for "btrfs check --init-csum-tree /dev/sdd2".
>
> I expect to wipe the disk and start clean in the next few days. I just wanted to report this in the hope that it helps development (sorry for the redaction). If you need more information, I'll be glad to help as I can!
>
> Thank you for your work,
> Cheers,
>
> -   b11g



Thread overview: 11+ messages
2019-01-03  0:26 BTRFS corruption: open_ctree failed b11g
2019-01-03  4:52 ` Chris Murphy
2019-01-03 13:55   ` b11g
2019-01-11 12:29 ` b11g [this message]
2019-01-03  2:52 Tomasz Chmielewski
2019-01-03  7:27 ` Andrea Gelmini
2019-01-03  7:43   ` Tomasz Chmielewski
2019-01-03  8:22     ` Andrea Gelmini
2019-01-03  8:29       ` Tomasz Chmielewski
2019-01-03  9:46         ` Andrea Gelmini
2019-01-03 14:32   ` b11g
