From: Reindl Harald <h.reindl@thelounge.net>
To: David Brown <david.brown@hesbynett.no>,
	Adam Goryachev <mailinglists@websitemanagers.com.au>,
	Jeff Allison <jeff.allison@allygray.2y.net>
Cc: linux-raid@vger.kernel.org
Subject: Re: proactive disk replacement
Date: Tue, 21 Mar 2017 14:24:08 +0100
Message-ID: <09f4c794-8b17-05f5-10b7-6a3fa515bfa9@thelounge.net>
In-Reply-To: <58D126EB.7060707@hesbynett.no>



On 21.03.2017 at 14:13, David Brown wrote:
> On 21/03/17 12:03, Reindl Harald wrote:
>>
>> On 21.03.2017 at 11:54, Adam Goryachev wrote:
> <snip>
>>
>>> In addition, you claim that a drive larger than 2TB is almost certainly
>>> going to suffer from a URE during recovery, yet this is exactly the
>>> situation you will be in when trying to recover a RAID10 with member
>>> devices 2TB or larger. A single URE on the surviving portion of the
>>> RAID1 will cause you to lose the entire RAID10 array. On the other hand,
>>> three UREs on the three remaining members of the RAID6 will not cause
>>> more than a hiccup (as long as there is no more than one URE in the
>>> same stripe, which I would argue is ... exceptionally unlikely).
>>
>> given that the disks in an array usually have the same age, errors on
>> another disk become more likely once one has failed - and recovery of
>> a RAID6 takes *many hours* of heavy IO on *all disks*, compared with
>> the much faster rebuild of a RAID1/10, so guess in which case a URE
>> is more likely
>>
>> additionally, why should the whole array fail just because a single
>> block gets lost? there is no parity which needs to be calculated, you
>> just lost a single block somewhere - RAID1/10 are far simpler in
>> their implementation
>
> If you have RAID1, and you have an URE, then the data can be recovered
> from the other half of that RAID1 pair.  If you have had a disk failure
> (manual removal for replacement, or a real failure), and you get an URE
> on the other half of that pair, then you lose data.
>
> With RAID6, you need an additional failure (either another full disk
> failure or an URE in the /same/ stripe) to lose data.  RAID6 has higher
> redundancy than two-way RAID1 - of this there is /no/ doubt.

yes, but with RAID5/RAID6 *all* surviving disks are involved in the 
rebuild; with a 10-disk RAID10 only one disk needs to be read and the 
data written to the new one - all other disks are not involved in the 
resync at all
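
to put rough numbers on that exposure - a minimal back-of-the-envelope 
sketch in Python, assuming the 1-in-10^14-bits URE rate quoted on 
consumer drive datasheets and 2TB member disks (neither figure comes 
from this thread, so adjust for your own hardware):

  import math

  URE_PER_BIT = 1e-14   # assumed datasheet rate, not a measured value
  TB = 10**12           # bytes

  def p_at_least_one_ure(bytes_read):
      """Probability of at least one URE while reading this many
      bytes, treating every bit as an independent trial."""
      bits = bytes_read * 8
      # -expm1(n * log1p(-p)) is 1 - (1-p)**n, computed stably
      return -math.expm1(bits * math.log1p(-URE_PER_BIT))

  # RAID10 rebuild: only the failed disk's mirror partner is read
  print(f"RAID10, read 1 x 2TB: {p_at_least_one_ure(2 * TB):.1%}")
  # degraded 10-disk RAID6 rebuild: all 9 surviving members are read
  print(f"RAID6,  read 9 x 2TB: {p_at_least_one_ure(9 * 2 * TB):.1%}")

that prints roughly 15% against 76%, so the RAID6 rebuild really does 
see far more UREs - but it only measures exposure: a URE hit during a 
degraded RAID6 rebuild is still repaired from the remaining redundancy, 
while a URE on the surviving half of a RAID1 pair is unrecoverable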

for most arrays the disks have a similar age and usage pattern, so 
when the first one fails it is likely that another will not be far 
behind - which is why rebuild load and recovery time matter
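
and to put a number on Adam's "same stripe" caveat above - again only 
a rough sketch, with an assumed 512KiB chunk size (md's current 
default) on top of the same assumed URE rate and disk size:

  import math

  URE_PER_BIT = 1e-14    # assumed rate, as above
  CHUNK = 512 * 1024     # assumed 512KiB chunk size
  DISK = 2 * 10**12      # 2TB members
  SURVIVORS = 9          # 10-disk RAID6 with one disk failed

  stripes = DISK // CHUNK
  # chance that reading a single chunk hits a URE
  p_chunk = -math.expm1(CHUNK * 8 * math.log1p(-URE_PER_BIT))
  # a degraded RAID6 has one redundancy left, so losing data needs
  # two UREs in the *same* stripe; approximate via the pair count
  p_stripe = math.comb(SURVIVORS, 2) * p_chunk ** 2
  print(f"P(data loss during rebuild) ~ {stripes * p_stripe:.1e}")

this comes out around 2e-07, one rebuild in a few million, so 
"exceptionally unlikely" seems fair - as long as the UREs really are 
independent, which on equally aged disks they may well not be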


Thread overview: 34+ messages
2017-03-20 12:47 proactive disk replacement Jeff Allison
2017-03-20 13:25 ` Reindl Harald
2017-03-20 14:59 ` Adam Goryachev
2017-03-20 15:04   ` Reindl Harald
2017-03-20 15:23     ` Adam Goryachev
2017-03-20 16:19       ` Wols Lists
2017-03-21  2:33   ` Jeff Allison
2017-03-21  9:54     ` Reindl Harald
2017-03-21 10:54       ` Adam Goryachev
2017-03-21 11:03         ` Reindl Harald
2017-03-21 11:34           ` Andreas Klauer
2017-03-21 12:03             ` Reindl Harald
2017-03-21 12:41               ` Andreas Klauer
2017-03-22  4:16                 ` NeilBrown
2017-03-21 11:56           ` Adam Goryachev
2017-03-21 12:10             ` Reindl Harald
2017-03-21 13:13           ` David Brown
2017-03-21 13:24             ` Reindl Harald [this message]
2017-03-21 14:15               ` David Brown
2017-03-21 15:25                 ` Wols Lists
2017-03-21 15:41                   ` David Brown
2017-03-21 16:49                     ` Phil Turmel
2017-03-22 13:53                       ` Gandalf Corvotempesta
2017-03-22 14:12                         ` David Brown
2017-03-22 14:32                         ` Phil Turmel
2017-03-21 11:55         ` Gandalf Corvotempesta
2017-03-21 13:02       ` David Brown
2017-03-21 13:26         ` Gandalf Corvotempesta
2017-03-21 14:26           ` David Brown
2017-03-21 15:31             ` Wols Lists
2017-03-21 17:00               ` Phil Turmel
2017-03-21 15:29         ` Wols Lists
2017-03-21 16:55         ` Phil Turmel
2017-03-22 14:51 ` John Stoffel
