From: Roman Mamedov <rm@romanrm.net>
To: Marc MERLIN <marc@merlins.org>
Cc: Btrfs BTRFS <linux-btrfs@vger.kernel.org>,
	Chris Mason <clm@fb.com>, Qu Wenruo <quwenruo@cn.fujitsu.com>,
	David Sterba <dsterba@suse.cz>
Subject: Re: btrfs check --repair: failed to repair damaged filesystem, aborting
Date: Wed, 3 May 2017 11:32:26 +0500
Message-ID: <20170503113226.757f492e@natsu>
In-Reply-To: <20170503061711.jowqa5wh3mtzjbww@merlins.org>

On Tue, 2 May 2017 23:17:11 -0700
Marc MERLIN <marc@merlins.org> wrote:

> On Tue, May 02, 2017 at 11:00:08PM -0700, Marc MERLIN wrote:
> > David,
> > 
> > I think you maintain btrfs-progs, but I'm not sure if you're in charge 
> > of check --repair.
> > Could you comment on the bottom of the mail, namely:
> > > failed to repair damaged filesystem, aborting
> > > So, I'm out of luck now, full wipe and 3-5 day rebuild?
>   
> Actually, another thought:
> Is there or should there be a way to repair around the bit that cannot
> be repaired?
> Separately, or not, can I locate which bits are causing the repair to
> fail and maybe get a pointer to the path/inode so that I can hopefully
> just delete those bad data structures (assuming deleting them is even
> possible and that the FS won't just go read only as I try to do that)

There is the "btrfs-corrupt-block" tool, which helped me kick Btrfsck
further along its course in a similar "unrepairable" situation:
https://www.spinics.net/lists/linux-btrfs/msg53061.html

In your case it appears that block 2899180224512 is giving it the most
trouble, so you could start by killing that one. From what I can tell, this
tool zeroes out the entire block, so Btrfsck can simply delete the reference
and forget it, rather than repeatedly trying to figure out solutions and
bailing out with "failed to repair damaged filesystem, aborting".
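
To see which block the check output complains about most, the log can be
tallied by block number. The following is only a minimal sketch: the function
name tally_blocks and the sample text are illustrative, and the three patterns
are copied from the message shapes in the quoted output, not from any
btrfs-progs interface.

```python
import re
from collections import Counter

# Message shapes taken from the quoted "btrfs check" output in this thread.
# They are assumptions about the log format, not a btrfs-progs API.
_PATTERNS = (
    re.compile(r"checksum verify failed on (\d+) "),
    re.compile(r"bytenr mismatch, want=(\d+),"),
    re.compile(r"parent transid verify failed on (\d+) "),
)


def tally_blocks(log_text):
    """Count how often each block number appears in error lines."""
    counts = Counter()
    for line in log_text.splitlines():
        for pattern in _PATTERNS:
            match = pattern.search(line)
            if match:
                counts[int(match.group(1))] += 1
                break  # one error message per line
    return counts


# A few lines copied from the quoted run, used here as sample input.
sample = """\
checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5
checksum verify failed on 2899180224512 found ABBE39B0 wanted E0735D0E
bytenr mismatch, want=2899180224512, have=3981076597540270796
checksum verify failed on 6983266418688 found 393B112A wanted 2B19CD5C
parent transid verify failed on 1671538819072 wanted 293964 found 293902
"""

# Print blocks from most to least frequently reported.
for block, hits in tally_blocks(sample).most_common():
    print(block, hits)
```

Because the patterns are searched anywhere in the line, quoted logs with a
leading "> > > " prefix can be fed in unchanged.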

Depending on what was stored in it, you may have either no visible effect, or
a complete filesystem failure, or anything in between. Hence if you want to
experiment with this, find a way to work on writable overlay snapshots (also
described in the linked message).
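
One common way to get such a writable overlay is the device-mapper "snapshot"
target, which sends all writes to a separate copy-on-write device and leaves
the original untouched. Whether this is exactly the method from the linked
message is an assumption. The sketch below only builds the table line and
prints a command, it does not run anything. The helper name snapshot_table,
the overlay name, /dev/loop0 and the sector count are all hypothetical
placeholders.

```python
def snapshot_table(origin, cow_device, sectors, persistent=False, chunk_sectors=8):
    """Build a device-mapper 'snapshot' table line.

    Layout: <start> <length> snapshot <origin> <COW device> <P|N> <chunksize>
    Lengths and chunk sizes are in 512-byte sectors.
    """
    if sectors <= 0 or chunk_sectors <= 0:
        raise ValueError("sectors and chunk_sectors must be positive")
    mode = "P" if persistent else "N"  # N: overlay is discarded on teardown
    return f"0 {sectors} snapshot {origin} {cow_device} {mode} {chunk_sectors}"


# Hypothetical values: the real sector count would come from
# "blockdev --getsz <device>", and /dev/loop0 stands in for a loop device
# backed by a sparse file that receives the writes.
table = snapshot_table("/dev/mapper/dshelf2", "/dev/loop0", sectors=1000)
print(f'dmsetup create dshelf2-overlay --table "{table}"')
```

Any experiment would then be pointed at the overlay device rather than at the
original, so a failed attempt costs only the overlay file.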

> Here is the full run if that helps:
> https://pastebin.com/STMFHty4
> 
> > Thanks,
> > Marc
> > 
> > Rest:
> > On Tue, May 02, 2017 at 11:47:22AM -0700, Marc MERLIN wrote:
> > > (cc trimmed)
> > > 
> > > The one in debian/unstable crashed:
> > > gargamel:~# btrfs --version
> > > btrfs-progs v4.7.3
> > > gargamel:~# btrfs check --repair /dev/mapper/dshelf2
> > > bytenr mismatch, want=2899180224512, have=3981076597540270796
> > > extent-tree.c:2721: alloc_reserved_tree_block: Assertion `ret` failed.
> > > btrfs[0x43e418]
> > > btrfs[0x43e43f]
> > > btrfs[0x43f276]
> > > btrfs[0x43f46f]
> > > btrfs[0x4407ef]
> > > btrfs[0x440963]
> > > btrfs(btrfs_inc_extent_ref+0x513)[0x44107a]
> > > btrfs[0x420053]
> > > btrfs[0x4265eb]
> > > btrfs(cmd_check+0x1111)[0x427d6d]
> > > btrfs(main+0x12f)[0x40a341]
> > > /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf1)[0x7f6b632e82b1]
> > > btrfs(_start+0x2a)[0x40a37a]
> > > 
> > > Ok, it's old, let's take git from today:
> > > gargamel:~# btrfs --version
> > > btrfs-progs v4.10.2
> > > As a note, 
> > > gargamel:~# btrfs check --mode=lowmem --repair /dev/mapper/dshelf2
> > > enabling repair mode
> > > ERROR: low memory mode doesn't support repair yet
> > > 
> > > As a note, a 32bit binary on a 64bit kernel:
> > > gargamel:~# btrfs check --repair /dev/mapper/dshelf2
> > > enabling repair mode
> > > Checking filesystem on /dev/mapper/dshelf2
> > > UUID: 03e9a50c-1ae6-4782-ab9c-5f310a98e653
> > > checking extents
> > > checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5
> > > checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5
> > > checksum verify failed on 2899180224512 found ABBE39B0 wanted E0735D0E
> > > checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5
> > > bytenr mismatch, want=2899180224512, have=3981076597540270796
> > > checksum verify failed on 1449488023552 found CECC36AF wanted 199FE6C5
> > > checksum verify failed on 1449488023552 found CECC36AF wanted 199FE6C5
> > > checksum verify failed on 1449544613888 found 895D691B wanted A0C64D2B
> > > checksum verify failed on 1449544613888 found 895D691B wanted A0C64D2B
> > > parent transid verify failed on 1671538819072 wanted 293964 found 293902
> > > parent transid verify failed on 1671538819072 wanted 293964 found 293902
> > > checksum verify failed on 1671603781632 found 18BC28D6 wanted 372655A0
> > > checksum verify failed on 1671603781632 found 18BC28D6 wanted 372655A0
> > > cmds-check.c:6291: add_data_backref: BUG_ON `!back` triggered, value 1
> > > Aborted
> > > 
> > > let's try again with a 64bit binary built from git:
> > > (...)
> > > Repaired extent references for 4227617038336
> > > ref mismatch on [4227872751616 4096] extent item 1, found 0
> > > Incorrect local backref count on 4227872751616 parent 3493071667200 owner 0
> > > offset 0 found 0 wanted 1 back 0x56470b18e7f0  
> > > Backref disk bytenr does not match extent record, bytenr=4227872751616, ref
> > > bytenr=0
> > > backpointer mismatch on [4227872751616 4096]
> > > owner ref check failed [4227872751616 4096]
> > > repair deleting extent record: key 4227872751616 168 4096
> > > checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5
> > > checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5
> > > checksum verify failed on 2899180224512 found ABBE39B0 wanted E0735D0E
> > > checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5
> > > bytenr mismatch, want=2899180224512, have=3981076597540270796
> > > checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5
> > > checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5
> > > checksum verify failed on 2899180224512 found ABBE39B0 wanted E0735D0E  
> > > checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5
> > > bytenr mismatch, want=2899180224512, have=3981076597540270796
> > > checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5
> > > checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5
> > > checksum verify failed on 2899180224512 found ABBE39B0 wanted E0735D0E
> > > checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5
> > > bytenr mismatch, want=2899180224512, have=3981076597540270796
> > > Repaired extent references for 4227872751616
> > > ref mismatch on [6674127745024 32768] extent item 0, found 1
> > > Backref 6674127745024 parent 7566652473344 owner 0 offset 0 num_refs 0 not
> > > found in extent tree
> > > Incorrect local backref count on 6674127745024 parent 7566652473344 owner 0
> > > offset 0 found 1 wanted 0 back 0x5648afda0f20  
> > > backpointer mismatch on [6674127745024 32768]
> > > checksum verify failed on 6983266418688 found 393B112A wanted 2B19CD5C
> > > checksum verify failed on 6983266418688 found 393B112A wanted 2B19CD5C
> > > checksum verify failed on 6983266418688 found BCBF9E15 wanted 785FF67E
> > > checksum verify failed on 6983266418688 found 393B112A wanted 2B19CD5C
> > > bytenr mismatch, want=6983266418688, have=13671317608077697645
> > > checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5
> > > checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5
> > > checksum verify failed on 2899180224512 found ABBE39B0 wanted E0735D0E
> > > checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5
> > > bytenr mismatch, want=2899180224512, have=3981076597540270796
> > > checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5
> > > checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5
> > > checksum verify failed on 2899180224512 found ABBE39B0 wanted E0735D0E
> > > checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5
> > > bytenr mismatch, want=2899180224512, have=3981076597540270796
> > > checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5
> > > checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5
> > > checksum verify failed on 2899180224512 found ABBE39B0 wanted E0735D0E
> > > checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5
> > > bytenr mismatch, want=2899180224512, have=3981076597540270796
> > > checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5
> > > checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5
> > > checksum verify failed on 2899180224512 found ABBE39B0 wanted E0735D0E
> > > checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5
> > > bytenr mismatch, want=2899180224512, have=3981076597540270796
> > > failed to repair damaged filesystem, aborting
> > > 
> > > 
> > > So, I'm out of luck now, full wipe and 3-5 day rebuild?
> > > 
> > > Thanks,
> > > Marc
> > > -- 
> > > "A mouse is a device used to point at the xterm you want to type in" - A.S.R.
> > > Microsoft is to operating systems ....
> > >                                       .... what McDonalds is to gourmet cooking
> > > Home page: http://marc.merlins.org/  
> > > --
> > > To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
> > > the body of a message to majordomo@vger.kernel.org
> > > More majordomo info at  http://vger.kernel.org/majordomo-info.html
> > > 
> > 
> > 
> 


-- 
With respect,
Roman
