From: Chris Murphy <lists@colorremedies.com>
To: "Agustín DallʼAlba" <agustin@dallalba.com.ar>
Cc: Nikolay Borisov <nborisov@suse.com>,
	Btrfs BTRFS <linux-btrfs@vger.kernel.org>,
	Zygo Blaxell <ce3g8jdj@umail.furryterror.org>
Subject: Re: raid10 corruption while removing failing disk
Date: Mon, 10 Aug 2020 19:48:19 -0600
Message-ID: <CAJCQCtS09avevjWuQcy-_9xBdA7zAo_4h81MaRS30btZDJNhAw@mail.gmail.com>
In-Reply-To: <9863afbc619489c98693a4e1d02a38136f5c2ef3.camel@dallalba.com.ar>

On Mon, Aug 10, 2020 at 7:19 PM Agustín DallʼAlba
<agustin@dallalba.com.ar> wrote:
>
> On Mon, 2020-08-10 at 11:21 +0300, Nikolay Borisov wrote:
> > This suggests you are hitting a known problem with reloc roots which
> > has been fixed in the latest upstream and lts (5.4) kernels:
> >
> > 044ca910276b btrfs: reloc: fix reloc root leak and NULL pointer
> > dereference (3 months ago) <Qu Wenruo>
> > 707de9c0806d btrfs: relocation: fix reloc_root lifespan and access (7
> > months ago) <Qu Wenruo>
> > 1fac4a54374f btrfs: relocation: fix use-after-free on dead relocation
> > roots (11 months ago) <Qu Wenruo>
> >
> >
> > So yes, try to update to latest stable kernel and re-run the device
> > remove. Also update your btrfs progs to latest 5.6 version and rerun
> > check again (by default it's a read-only operation, so it shouldn't
> > cause any more damage).
>
> I have tried again with the 5.8.0 kernel and btrfs-progs v5.7 (which
> I've compiled statically on a different machine and used only for btrfs
> device remove and btrfs check). The system still goes read-only when I
> attempt to remove the failing drive, but it doesn't oops in this
> version.
>
> This version of btrfs check finds many more problems, however the
> 'checksum verify failed' line looks suspicious: instead of `found
> BAB1746E wanted A8A48266` it prints `found 0000006E wanted 00000066`
> as if the checksums had been truncated to 8 bits before printing.


6E = 01101110
66 = 01100110

So the two printed values differ by a single bit, which looks like a bit
flip. I think newer versions of btrfs check only show the part of
found/wanted that differs; maybe someone can confirm that. 'btrfs check
--repair' can usually fix most cases of bit flips in metadata. I'm not
sure whether the other problems are related or fixable.
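
As a quick sanity check of that arithmetic, here is a minimal Python
sketch that XORs the two printed values and lists the bit positions where
they differ. The helper name differing_bits is made up for illustration
and is not part of btrfs-progs. It only compares the two bytes as printed;
it says nothing about where on disk or in memory a flip happened.

```python
def differing_bits(found: int, wanted: int) -> list[int]:
    """Return the bit positions (0 = least significant) where the values differ."""
    diff = found ^ wanted  # XOR leaves a 1 wherever the two inputs disagree
    return [i for i in range(diff.bit_length()) if (diff >> i) & 1]


found, wanted = 0x6E, 0x66  # the values from the 'checksum verify failed' line
print(f"{found:08b} ^ {wanted:08b} = {found ^ wanted:08b}")
print(differing_bits(found, wanted))  # a single entry means exactly one bit differs
```

For 6E and 66 the XOR is 00001000, so the only difference is bit 3.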

The top candidate is bad RAM, but it could be something else. Best to
start by running memtest86+, the newer the version the better. If it's
bad RAM, sometimes it'll find it in a couple of hours, and other times
it takes a few days, which is definitely annoying.


-- 
Chris Murphy
