From: Chris Murphy <lists@colorremedies.com>
To: "Agustín DallʼAlba" <agustin@dallalba.com.ar>
Cc: Chris Murphy <lists@colorremedies.com>,
	Btrfs BTRFS <linux-btrfs@vger.kernel.org>
Subject: Re: raid10 corruption while removing failing disk
Date: Tue, 11 Aug 2020 13:17:44 -0600	[thread overview]
Message-ID: <CAJCQCtSdJVw5o2hJ3OyE6-nvM2xpx=nRHLVNSgf9ydD2O--vMQ@mail.gmail.com> (raw)
In-Reply-To: <dc0bea2ee916ce4d1a53fe59869b7b7d8868f617.camel@dallalba.com.ar>

On Mon, Aug 10, 2020 at 11:06 PM Agustín DallʼAlba
<agustin@dallalba.com.ar> wrote:
>
> On Mon, 2020-08-10 at 20:34 -0600, Chris Murphy wrote:
> > On Mon, Aug 10, 2020 at 1:03 AM Agustín DallʼAlba
> > <agustin@dallalba.com.ar> wrote:
> > > Hello!
> > >
> > > The last quarterly scrub on our btrfs filesystem found a few bad
> > > sectors in one of its devices (/dev/sdd), and because there's nobody on
> > > site to replace the failing disk I decided to remove it from the array
> > > with `btrfs device remove` before the problem could get worse.
> >
> > It doesn't much matter if it gets worse, because you still have
> > redundancy on that dying drive until the moment it's completely toast.
> > And btrfs doesn't care if it's spewing read errors.
>
> By 'get worse', I mean another drive failing, and then we'd definitely
> lose data. Because of the pandemic there was (and still is) nobody on
> site to replace the drive, and I won't be able to go there for who
> knows how many months.

Fair point.

> I have a _partial_ dmesg of this time period. It's got a lot of gaps in
> between reboots. I'll send it to you without ccing the list. The
> failing drive is an atrocious WD green for which I forgot to set the
> idle3 timer, that doesn't support SCT ERC and lately just hangs forever
> and requires a power cycle. So there's no way around the slowness. It
> was added in a pinch a year ago because we needed more space. I
> probably should have asked someone to disconnect it and used 'remove
> missing'.

That drive should have '/sys/block/sda/device/timeout' set to at least
120, although I've seen folks on linux-raid@ suggest 180. I don't know
the actual maximum time these drives can spend in "deep recovery".
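For reference, the command timer can be inspected and raised at runtime
(the device name below is an example, and the change does not survive a
reboot; a udev rule is needed to make it permanent):

```shell
# Show the current SCSI command timer for the drive (kernel default: 30 s)
cat /sys/block/sdd/device/timeout

# Raise it so the kernel waits out the drive's internal error recovery
# instead of resetting the link mid-recovery; 180 s per linux-raid@ advice
echo 180 > /sys/block/sdd/device/timeout
```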

As the signal in a sector weakens, the reads get slower. You can
freshen the signal simply by rewriting data. Btrfs doesn't ever do
overwrites, but you can use 'btrfs balance' for this task. Once a year
seems reasonable, or as you notice reads becoming slower. And use a
filtered balance to avoid doing it all at once.
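A filtered balance can restrict each run by chunk count or chunk usage,
so the rewriting happens in small sessions instead of all at once. The
mount point and filter values here are examples, not a prescription:

```shell
# Rewrite at most 10 data chunks this run (the 'limit' filter), then
# repeat in later sessions until the whole filesystem has been refreshed
btrfs balance start -dlimit=10 /mnt/point

# Or target only chunks below a usage threshold, raising the threshold
# on later runs to eventually cover everything
btrfs balance start -dusage=50 -musage=50 /mnt/point
```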


>
> > > # btrfs check --force --readonly /dev/sda
> > > WARNING: filesystem mounted, continuing because of --force
> > > Checking filesystem on /dev/sda
> > > UUID: 4d3acf20-d408-49ab-b0a6-182396a9f27c
> > > checksum verify failed on 10919566688256 found BAB1746E wanted A8A48266
> > > checksum verify failed on 10919566688256 found BAB1746E wanted A8A48266
> >
> > So they aren't at all the same, that's unexpected.
>
> What do you mean by this?

I only fully understood what you meant by this:
>instead of `found BAB1746E wanted A8A48266` it prints `found 0000006E wanted 00000066`

once I re-read the first email that had the full 'btrfs check' output
from the old version. And yeah I don't know why they're different now.


> > My advice is to mount ro, backup (or two copies for important info),
> > and start with a new Btrfs file system and restore. It's not worth
> > repairing.
>
> Sigh, I was expecting I'd have to do this. At least no data was lost,
> and the system still functions even though it's read-only. Do you think
> check --repair is not worth trying? Everything of value is already
> backed up, but restoring it would take many hours of work.

Metadata, RAID10: total=9.00GiB, used=7.57GiB

Ballpark 8 hours for --repair given the metadata size and spinning
drives. Adding --init-extent-tree will add more time, and it's decently
likely to be needed here. The gotcha: you try --repair alone, it fixes
some things, but the extent tree still needs repairing anyway, and that
second pass could be another 8 hours. Or do you reach for the heavy
hammer right away and do both at once to save time? But the heavy
hammer is riskier.
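In command form, the two approaches look like this (device name is a
placeholder; run only against the unmounted filesystem, and only with
backups already in hand):

```shell
# Conservative first pass: fix only what --repair can handle on its own
btrfs check --repair /dev/sdX

# Heavy hammer: also rebuild the extent tree from scratch; riskier, and
# expect many additional hours on spinning drives
btrfs check --repair --init-extent-tree /dev/sdX
```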

Whether you repair or start over, you need a backup, plus two copies of
anything important. To attempt the repair you need to be prepared for
the possibility that things get worse. I'll argue strongly that it's a
bug if things do get worse (i.e. you can no longer mount read-only at
all), but as a risk assessment it has to be considered.


--
Chris Murphy


Thread overview: 17+ messages
2020-08-10  7:03 raid10 corruption while removing failing disk Agustín DallʼAlba
2020-08-10  7:22 ` Nikolay Borisov
2020-08-10  7:38   ` Martin Steigerwald
2020-08-10  7:51     ` Nikolay Borisov
2020-08-10  8:57       ` Martin Steigerwald
2020-08-11  1:30       ` Chris Murphy
2020-08-10  7:59     ` Agustín DallʼAlba
2020-08-10  8:21 ` Nikolay Borisov
2020-08-10 22:24   ` Zygo Blaxell
2020-08-11  1:18   ` Agustín DallʼAlba
2020-08-11  1:48     ` Chris Murphy
2020-08-11  2:34 ` Chris Murphy
2020-08-11  5:06   ` Agustín DallʼAlba
2020-08-11 19:17     ` Chris Murphy [this message]
2020-08-11 20:40       ` Agustín DallʼAlba
2020-08-12  3:03         ` Chris Murphy
2020-08-31 20:05       ` Agustín DallʼAlba
