From: b11g <b11g@protonmail.com>
To: "linux-btrfs@vger.kernel.org" <linux-btrfs@vger.kernel.org>
Subject: Re: BTRFS corruption: open_ctree failed
Date: Fri, 11 Jan 2019 12:29:40 +0000
Message-ID: <j8LS7-Iz6uOg8fN_yZD_mcUT3J1jCHsk9nU18LpSGGcHDHVTHk6s8u5HFNAuB6lfcjzwtWD2o6awwLqY6-GQmOBBeJ6ZUiiybgohZUxJlwg=@protonmail.com>
In-Reply-To: <DID7NmLoqB1WBZ8Vo7rEMlL9cLWv7Bz9F04MkQYOLfZJyl04O44IUiFKdC_Tl4iSvHGLxpbwDwD9cHJ33t-Hkv8PKF0wmm7201RABA2kRAY=@protonmail.com>

Follow-up: the issue was a faulty DIMM. By some strange coincidence, only the space allocated to disk caches appeared to be corrupted, while the rest of the system worked flawlessly most of the time.

I would guess that BTRFS tried to self-heal based on the cached data, ultimately corrupting the file system beyond salvation?

If anyone gets here with similar problems: memtest your RAM before doing anything else!

-b11g


‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
On Thursday, 3 January 2019 01:26, b11g <b11g@protonmail.com> wrote:

> Hi all,
>
> I have several BTRFS success stories, and I've been a happy user for quite a long time now. I was therefore surprised to face a BTRFS corruption on a system I'd just installed.
>
> I use NixOS, unstable branch (Linux kernel 4.19.12). The system runs on an SSD with an ext4 boot partition, a simple btrfs root with some subvolumes, and some swap space only used for hibernation. I was working on my server as normal when I noticed that all of my BTRFS subvolumes had been remounted ro. After a short time, I started getting various IO errors ("bus error" from journalctl, "I/O error" from ls, etc.). I halted the system (hard reboot); on reboot, the BTRFS partition would not mount. I suspected the corruption to be disk-related, but smartctl does not show any warning for the disk, and the ext4 partition seems healthy.
>
> These are the kernel messages logged when I attempt to mount the partition:
> Jan 02 23:39:38 nixos kernel: BTRFS warning (device sdd2): sdd2 checksum verify failed on <L> wanted <A> found <B> level 0
> Jan 02 23:39:38 nixos kernel: BTRFS error (device sdd2): failed to read block groups: -5
> Jan 02 23:39:38 nixos systemd[1]: Started Cleanup of Temporary Directories.
> Jan 02 23:39:38 nixos kernel: BTRFS error (device sdd2): open_ctree failed
>
> Searching for the error code led me to these two recent threads:
> https://www.spinics.net/lists/linux-btrfs/msg84973.html
> https://www.spinics.net/lists/linux-btrfs/msg83833.html
>
> Using btrfs-progs-4.15.1, "btrfs restore /dev/sdd2 /tmp/" fails with:
> checksum verify failed on <N> found <A> wanted <B>
> checksum verify failed on <N> found <A> wanted <B>
> Csum didn't match
> Could not open root, trying backup super
> checksum verify failed on <N> found <A> wanted <B>
> checksum verify failed on <N> found <A> wanted <B>
> Csum didn't match
> Could not open root, trying backup super
> ERROR: superblock bytenr <X> is larger than device size <Y>
> Could not open root, trying backup super
>
> Using btrfs-progs-4.19.1, "btrfs restore /dev/sdd2 /tmp/" succeeds with some exceptions:
> We have looped trying to restore files in /@/nix/store too many times to be making progress, stopping
>
> I do not have much time for debugging the issue and I did not lose important data, so I tried a couple of commands suggested in the threads and in the docs (without fully understanding them):
>
> "btrfs rescue zero-log /dev/sdd2":
> checksum verify failed on <N> found <A> wanted <B>
> checksum verify failed on <N> found <A> wanted <B>
> Csum didn't match
> ERROR: could not open ctree
>
> "btrfs check --repair /dev/sdd2" (I know, I was not supposed to run this one):
> Opening filesystem to check...
> checksum verify failed on <N> found <A> wanted <B>
> checksum verify failed on <N> found <A> wanted <B>
> Csum didn't match
> ERROR: could not open ctree
>
> Same for "btrfs check --init-csum-tree /dev/sdd2".
>
> I expect to wipe the disk and start clean in the coming days; I just wanted to report this in the hope that it helps development (sorry for the redaction). If you need more information, I'll be glad to help as I can!
>
> Thank you for your work,
> Cheers,
>
> -   b11g
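
A note on the "failed to read block groups: -5" line in the kernel log quoted above: the kernel reports failures as negative errno values, and 5 is EIO (I/O error), which is consistent with the "I/O error" messages reported from ls. Below is a minimal Python sketch for decoding such codes. The helper name describe_kernel_errno is hypothetical, and the sketch assumes it runs on a machine that uses the same errno numbering as Linux.

```python
import errno
import os


def describe_kernel_errno(code: int) -> str:
    """Map a kernel-style return code (e.g. -5) to its symbolic name and message.

    Hypothetical helper for illustration; not part of btrfs-progs or the kernel.
    """
    number = abs(code)  # kernel log lines print errno values negated
    name = errno.errorcode.get(number, "UNKNOWN")
    return f"{name}: {os.strerror(number)}"


# The code printed by "failed to read block groups: -5"
print(describe_kernel_errno(-5))
```

The code only tells you that a read failed. It does not say whether the disk, the cable, or (as here) the RAM is at fault.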


