From: David Brown <david.brown@hesbynett.no>
To: Nix <nix@esperi.org.uk>, Wols Lists <antlists@youngman.org.uk>
Cc: "Ravi (Tom) Hale" <ravi@hale.ee>, linux-raid@vger.kernel.org
Subject: Re: Fault tolerance with badblocks
Date: Tue, 09 May 2017 09:37:33 +0200
Message-ID: <591171BD.3060707@hesbynett.no>
In-Reply-To: <87h90v8kt3.fsf@esperi.org.uk>

On 08/05/17 16:50, Nix wrote:

> 
> I wonder... scrubbing is not very useful with md, particularly with RAID
> 6, because it does no writes unless something mismatches, and on failure
> there is no attempt to determine which of the N disks is bad and rewrite
> its contents from the other devices (nor, as I understand it, does it
> clearly say which drive gave the error, so even failing it out and
> resyncing it is hard).
> 

Please read Neil Brown's article on this: "Smart or simple RAID
recovery?" <http://neil.brown.name/blog/20100211050355>

> If there was a way to get md to *rewrite* everything during scrub,
> rather than just checking, this might help (in addition to letting the
> drive refresh the magnetization of absolutely everything). "repair" mode
> appears to do no writes until an error is found, whereupon (on RAID 6)
> it proceeds to make a "repair" that is more likely than not to overwrite
> good data with bad. Optionally writing what's already there on non-error
> seems like it might be a worthwhile (and fairly simple) change.
> 

Scrubbing /does/ rewrite disk blocks - when necessary.  It does not do
it explicitly, but the disks handle this themselves.

To the processor, a disk block is 4K of data.  But to the disk and its
controllers, it is 4K plus a sizeable amount of error checking and
correcting bits.  Some are spread out within the block, some are
collected together at the end of the block.  The ECC system can handle a
large number of failed bits, either in lumps caused by a physical defect
on the disk surface, or spread out due to the slow decay of the magnetic
orientation, or hits by cosmic rays.

When the disk is asked to read a block, it pulls up the data and the ECC
bits, and uses these to check and reconstruct the 4K of data, along with
a measure of how many errors were corrected.  On modern high-capacity
drives, it is normal for some errors to be corrected on a read.  But if
the number of corrections exceeds a certain level, the firmware will
automatically trigger a re-write to the same sector, which is then
re-read.  If the error rate is now low, fine.  If it is still high, the
sector will be remapped by the disk.
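
As a rough illustration of the decision just described, here is a toy
Python model.  It is only a sketch under stated assumptions: the name
read_sector, the single numeric threshold, and its value are all
hypothetical.  Real firmware logic is vendor-specific and not published.

```python
# Hypothetical threshold: corrected-bit count above which the firmware
# rewrites the sector.  Real values are vendor-specific and not published.
REWRITE_THRESHOLD = 8


def read_sector(corrected_bits: int, corrected_bits_after_rewrite: int) -> str:
    """Return the action taken for one sector read in this toy model.

    corrected_bits: bits ECC had to correct on the initial read.
    corrected_bits_after_rewrite: bits ECC had to correct on the re-read
        that follows a rewrite (ignored if no rewrite happens).
    """
    if corrected_bits <= REWRITE_THRESHOLD:
        # Normal case: a few corrected bits, nothing else to do.
        return "none"
    # Too many corrections: rewrite in place, then re-read to verify.
    if corrected_bits_after_rewrite <= REWRITE_THRESHOLD:
        return "rewritten"
    # Still marginal after a fresh write: the medium itself is suspect.
    return "remapped"


# Three illustrative reads: healthy, refreshed by a rewrite, and remapped.
for first, second in [(2, 0), (20, 1), (20, 15)]:
    print(first, second, read_sector(first, second))
```

In every case the host still gets its 4K of data back; the rewrite or
remap happens inside the drive.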

So simply /reading/ the data, as far as the processor is concerned, will
cause re-writes as and when needed.
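
For reference, a read-only scrub of this kind can be started through
md's sysfs interface by writing "check" to the array's sync_action file.
A minimal Python sketch follows; the helper names and the sysfs_root
parameter are assumptions made for illustration, and "md0" in the usage
note is a hypothetical array name.

```python
from pathlib import Path


def md_attr(md_name: str, attr: str, sysfs_root: str = "/sys") -> Path:
    """Path of an md sysfs attribute, e.g. /sys/block/md0/md/sync_action."""
    return Path(sysfs_root) / "block" / md_name / "md" / attr


def start_check(md_name: str, sysfs_root: str = "/sys") -> None:
    """Ask md to read the whole array; mismatches are counted, not fixed."""
    md_attr(md_name, "sync_action", sysfs_root).write_text("check\n")


def mismatch_count(md_name: str, sysfs_root: str = "/sys") -> int:
    """Mismatch counter reported by md for the most recent check."""
    return int(md_attr(md_name, "mismatch_cnt", sysfs_root).read_text())
```

Calling start_check("md0") needs root, and mismatch_count("md0") is only
meaningful once the check has finished.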

