From: Phil Turmel <philip@turmel.org>
To: Wols Lists <antlists@youngman.org.uk>, Nix <nix@esperi.org.uk>,
	NeilBrown <neilb@suse.com>
Cc: linux-raid <linux-raid@vger.kernel.org>
Subject: Re: RFC - Raid error detection and auto-recovery (was Fault tolerance with badblocks)
Date: Mon, 15 May 2017 18:31:55 -0400
Message-ID: <7ba308d7-6954-8cd9-e623-93b940c5e370@turmel.org>
In-Reply-To: <5919B0AC.30705@youngman.org.uk>

On 05/15/2017 09:44 AM, Wols Lists wrote:
> On 15/05/17 12:11, Nix wrote:
>> I think the point here is that we'd like some way to recover that lets
>> us get back to the most-likely-consistent state. However, on going over
>> the RAID-6 maths again I think I see where I was wrong. In the absence
>> of P, Q, P *or* Q or one of P and Q and a data stripe, you can
>> reconstruct the rest, but the only reason you can do that is because
>> they are either correct or absent: you can trust them if they're there,
>> and you cannot mistake a missing stripe for one that isn't missing.
> 
> The point of Peter Anvin's paper, though, was that it IS possible to
> correct raid-6 if ONE of P, Q, or a data stripe is corrupt.

If and only if it is known that all but the supposedly corrupt block
were written together (a complete stripe) and that nothing could have
perturbed the data between the original calculation of P and Q in the
CPU and the original transmission of all of these blocks to the member
drives.

Since incomplete writes and a whole host of hardware corruptions are
known to happen, you *don't* have enough information to automatically
repair.
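
To make the ambiguity concrete, here is a minimal sketch of the
single-error location step for one byte position of a stripe. It assumes
the usual RAID-6 field, GF(2^8) with polynomial 0x11d and generator 2;
the helper names (gf_mul, syndromes, locate) are made up for this
illustration and are not MD code.

```python
# GF(2^8) log/antilog tables: polynomial 0x11d, generator g = 2 (assumed).
EXP = [0] * 512
LOG = [0] * 256
x = 1
for i in range(255):
    EXP[i] = x
    LOG[x] = i
    x <<= 1
    if x & 0x100:
        x ^= 0x11d
for i in range(255, 512):
    EXP[i] = EXP[i - 255]  # doubled table so gf_mul needs no modulo


def gf_mul(a, b):
    """Multiply two field elements."""
    if a == 0 or b == 0:
        return 0
    return EXP[LOG[a] + LOG[b]]


def syndromes(data):
    """Return (P, Q) for one byte position across the data blocks."""
    p = q = 0
    for i, d in enumerate(data):
        p ^= d                    # P = D_0 ^ D_1 ^ ...
        q ^= gf_mul(EXP[i], d)    # Q = g^0*D_0 ^ g^1*D_1 ^ ...
    return p, q


def locate(data, p, q):
    """Name the one element that disagrees, ASSUMING exactly one is wrong.

    Returns "consistent", "P", "Q", a data block index, or "multiple".
    """
    p_calc, q_calc = syndromes(data)
    dp, dq = p ^ p_calc, q ^ q_calc
    if dp == 0 and dq == 0:
        return "consistent"
    if dp == 0:
        return "Q"
    if dq == 0:
        return "P"
    z = (LOG[dq] - LOG[dp]) % 255  # dq = g^z * dp for a single bad D_z
    return z if z < len(data) else "multiple"


old = [0x11, 0x22, 0x33, 0x44]
p, q = syndromes(old)

# Case 1: block 2 silently corrupted after a complete stripe write.
rotted = list(old)
rotted[2] ^= 0x5A
case_rot = locate(rotted, p, q)

# Case 2: block 1 legitimately rewritten, but the P and Q updates never
# reached the drives (incomplete write).
torn = list(old)
torn[1] = 0x99
case_torn = locate(torn, p, q)
```

In both cases the arithmetic names a data block (2 and 1 respectively).
In the second case the block it names holds the newest good data, and a
"repair" would silently roll the write back. The syndromes alone cannot
tell the two cases apart.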

The only unambiguous signal MD raid receives that a particular block is
corrupt is an Unrecoverable Read Error from a drive.  MD fixes these
from available redundancy.  All other sources of corruption require
assistance from an upper layer or from administrator input.

There's no magic wand, Wol.
