From: Qu Wenruo <quwenruo.btrfs@gmx.com>
To: Larkin Lowrey <llowrey@nuclearwinter.com>,
	Btrfs BTRFS <linux-btrfs@vger.kernel.org>
Subject: Re: Scrub aborts due to corrupt leaf
Date: Tue, 1 Jan 2019 08:12:06 +0800
Message-ID: <9dbfde05-4b20-4681-9286-3db0e8cf4f56@gmx.com>
In-Reply-To: <b8b50f6e-60ed-6988-0556-50c94c077099@nuclearwinter.com>



On 2018/12/31 11:52 PM, Larkin Lowrey wrote:
> On 10/11/2018 12:15 AM, Chris Murphy wrote:
>> Is this a 68T file system? Seems excessive.
>> Haha, by excessive I mean nuking such a big fs just for being unable
>> to remove the space tree. I'm quite sure the devs would like to get
>> that crashing bug fixed, anyway.
> 
> A second FS just started failing. I never had this much trouble with
> space cache v1.
> 
> This host had a DIMM failure a couple of weeks ago which caused the
> system to halt due to uncorrectable ECC error(s).

That looks like a very plausible cause of the corruption.

As with the strange items in the extent tree of your other fs, if your
memory is unreliable, any part of your filesystems may be corrupted.

And with memory corruption, the hotter a tree block is, the more likely
it is to become a victim.

In both cases, the corruption is in the extent tree, which matches the
symptom.

Please do a btrfs check on all your filesystems.
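
A minimal sketch of how that could be scripted, assuming the filesystems
are unmounted and their device paths are passed as arguments. The names
check_all, check_cmd and DRY_RUN are illustrative only and not part of
btrfs-progs; the only real command used is `btrfs check --readonly`,
which does not write to the device.

```shell
#!/bin/sh
# Sketch only: run a read-only btrfs check on each device given.
# Assumption: every device is unmounted. DRY_RUN defaults to 1, so the
# commands are printed rather than executed until it is set to 0.

# Print the check command for one device.
check_cmd() {
    printf 'btrfs check --readonly %s\n' "$1"
}

# Check (or, in dry-run mode, just list) every device passed in.
check_all() {
    for dev in "$@"; do
        if [ "${DRY_RUN:-1}" = 1 ]; then
            check_cmd "$dev"
        else
            # Keep going after a failure so every device gets checked.
            btrfs check --readonly "$dev" ||
                echo "btrfs check reported errors on $dev" >&2
        fi
    done
}
```

With the default dry-run setting, `check_all /dev/Cached/Nearline` only
prints the command it would run. Setting DRY_RUN=0 runs the checks for
real; none of them attempts a repair.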

Thanks,
Qu

> That was the only
> recent unsafe shutdown. Other than that, things have been running
> normally until today when the FS went read-only during backups. As with
> the other host, I tried to clear the space-cache (v2) before doing a
> 'check --repair' but got this:
> 
> [root@fubar ~]# btrfs check --clear-space-cache=v2 /dev/Cached/Nearline
> Opening filesystem to check...
> Checking filesystem on /dev/Cached/Nearline
> UUID: 68d31d5f-97a2-4a73-a398-c7c13ff439a5
> Clear free space cache v2
> checksum verify failed on 271262429573120 found 1BA4548E wanted D105DF84
> checksum verify failed on 271262429573120 found 1BA4548E wanted D105DF84
> bad tree block 271262429573120, bytenr mismatch, want=271262429573120,
> have=17478763091281320157
> checksum verify failed on 271262429573120 found 1BA4548E wanted D105DF84
> checksum verify failed on 271262429573120 found 1BA4548E wanted D105DF84
> bad tree block 271262429573120, bytenr mismatch, want=271262429573120,
> have=17478763091281320157
> checksum verify failed on 271262429573120 found 1BA4548E wanted D105DF84
> checksum verify failed on 271262429573120 found 1BA4548E wanted D105DF84
> bad tree block 271262429573120, bytenr mismatch, want=271262429573120,
> have=17478763091281320157
> checksum verify failed on 271262429573120 found 1BA4548E wanted D105DF84
> checksum verify failed on 271262429573120 found 1BA4548E wanted D105DF84
> bad tree block 271262429573120, bytenr mismatch, want=271262429573120,
> have=17478763091281320157
> checksum verify failed on 271262429573120 found 1BA4548E wanted D105DF84
> checksum verify failed on 271262429573120 found 1BA4548E wanted D105DF84
> bad tree block 271262429573120, bytenr mismatch, want=271262429573120,
> have=17478763091281320157
> checksum verify failed on 271262429573120 found 1BA4548E wanted D105DF84
> checksum verify failed on 271262429573120 found 1BA4548E wanted D105DF84
> bad tree block 271262429573120, bytenr mismatch, want=271262429573120,
> have=17478763091281320157
> checksum verify failed on 271262429573120 found 1BA4548E wanted D105DF84
> checksum verify failed on 271262429573120 found 1BA4548E wanted D105DF84
> bad tree block 271262429573120, bytenr mismatch, want=271262429573120,
> have=17478763091281320157
> checksum verify failed on 271262429573120 found 1BA4548E wanted D105DF84
> checksum verify failed on 271262429573120 found 1BA4548E wanted D105DF84
> bad tree block 271262429573120, bytenr mismatch, want=271262429573120,
> have=17478763091281320157
> checksum verify failed on 271262429573120 found 1BA4548E wanted D105DF84
> checksum verify failed on 271262429573120 found 1BA4548E wanted D105DF84
> bad tree block 271262429573120, bytenr mismatch, want=271262429573120,
> have=17478763091281320157
> checksum verify failed on 271262429573120 found 1BA4548E wanted D105DF84
> checksum verify failed on 271262429573120 found 1BA4548E wanted D105DF84
> bad tree block 271262429573120, bytenr mismatch, want=271262429573120,
> have=17478763091281320157
> checksum verify failed on 271262429573120 found 1BA4548E wanted D105DF84
> checksum verify failed on 271262429573120 found 1BA4548E wanted D105DF84
> bad tree block 271262429573120, bytenr mismatch, want=271262429573120,
> have=17478763091281320157
> checksum verify failed on 271262429573120 found 1BA4548E wanted D105DF84
> checksum verify failed on 271262429573120 found 1BA4548E wanted D105DF84
> bad tree block 271262429573120, bytenr mismatch, want=271262429573120,
> have=17478763091281320157
> extent-tree.c:2703: alloc_reserved_tree_block: BUG_ON `ret` triggered,
> value -17
> btrfs(+0x1ff96)[0x55eae7dc5f96]
> btrfs(+0x2109f)[0x55eae7dc709f]
> btrfs(+0x2115e)[0x55eae7dc715e]
> btrfs(+0x22054)[0x55eae7dc8054]
> btrfs(+0x22c57)[0x55eae7dc8c57]
> btrfs(btrfs_alloc_free_block+0xc2)[0x55eae7dcca72]
> btrfs(__btrfs_cow_block+0x18a)[0x55eae7dbc05a]
> btrfs(btrfs_cow_block+0x104)[0x55eae7dbc874]
> btrfs(btrfs_search_slot+0x35f)[0x55eae7dbf6cf]
> btrfs(btrfs_clear_free_space_tree+0x104)[0x55eae7de8b54]
> btrfs(cmd_check+0xb11)[0x55eae7e0ce31]
> btrfs(main+0x88)[0x55eae7dbaaa8]
> /lib64/libc.so.6(__libc_start_main+0xf3)[0x7fead8094413]
> btrfs(_start+0x2e)[0x55eae7dbabbe]
> Aborted (core dumped)
> 
> # btrfs fi show /public/nearline/
> Label: none  uuid: 68d31d5f-97a2-4a73-a398-c7c13ff439a5
>         Total devices 1 FS bytes used 61.09TiB
>         devid    1 size 65.25TiB used 61.45TiB path
> /dev/mapper/Cached-Nearline
> 
> # btrfs fi df /public/nearline/
> Data, single: total=61.39TiB, used=61.03TiB
> System, single: total=32.00MiB, used=6.59MiB
> Metadata, single: total=67.00GiB, used=65.85GiB
> GlobalReserve, single: total=512.00MiB, used=4.02MiB
> 
> # btrfs fi usage /public/nearline/
> Overall:
>     Device size:                  65.25TiB
>     Device allocated:             61.45TiB
>     Device unallocated:            3.79TiB
>     Device missing:                  0.00B
>     Used:                         61.09TiB
>     Free (estimated):              4.15TiB      (min: 4.15TiB)
>     Data ratio:                       1.00
>     Metadata ratio:                   1.00
>     Global reserve:              512.00MiB      (used: 4.02MiB)
> 
> Data,single: Size:61.39TiB, Used:61.03TiB
>    /dev/mapper/Cached-Nearline    61.39TiB
> 
> Metadata,single: Size:67.00GiB, Used:65.85GiB
>    /dev/mapper/Cached-Nearline    67.00GiB
> 
> System,single: Size:32.00MiB, Used:6.59MiB
>    /dev/mapper/Cached-Nearline    32.00MiB
> 
> Unallocated:
>    /dev/mapper/Cached-Nearline     3.79TiB
> 
> 4.19.10-300.fc29.x86_64
> btrfs-progs v4.17.1
> 
> I haven't nuked the other FS yet so I now have two that are either in
> the same or at least very similar states.
> 
> What additional information can I provide?
> 
> --Larkin



