From: Phil Turmel <philip@turmel.org>
To: Jon Nelson <jnelson-linux-raid@jamponi.net>,
	Mikael Abrahamsson <swmike@swm.pp.se>
Cc: Florian Lampel <florian.lampel@gmail.com>,
	LinuxRaid <linux-raid@vger.kernel.org>
Subject: Re: RAID6 dead on the water after Controller failure
Date: Sat, 15 Feb 2014 22:49:14 -0500
Message-ID: <5300353A.8060509@turmel.org>
In-Reply-To: <CAKuK5J3ry9SQPPJ_xGktZJFtoPQw1R+tE6eeRX7s6yTt+M79Zw@mail.gmail.com>

On 02/15/2014 06:23 PM, Jon Nelson wrote:
> On Sat, Feb 15, 2014 at 5:04 PM, Mikael Abrahamsson <swmike@swm.pp.se> wrote:
>> On Sat, 15 Feb 2014, Jon Nelson wrote:
>>
>>> Out of 12 drives, I thought RAID6 only offered a total of *2* failed
>>> devices. It seems to me that you have 7 devices in sync and 4 *almost* in
>>> sync. It's this "almost" part that has me confused. How can the raid run if
>>> the event count doesn't match? Wouldn't at least 10 out of 12 drives have to
>>> have the same event count to avoid data loss?
>>
>>
>> Correct. When you use --assemble --force you're basically telling mdadm "I
>> know what I'm doing and I'll take the risk of data loss or corruption". If
>> you assemble with a drive that was kicked long ago and has a
>> really far-off event count, you can really, really screw things up.
>>
>> Unless you use --force, mdadm won't assemble an array where the event count
>> doesn't match up.
> 
> Aha. So if you ran a check in this case, it would find some number of
> blocks that don't match up. What does MD do in that case?

Nothing, other than counting the mismatches in mismatch_cnt.  If you
then perform a "repair" scrub, it will regenerate P&Q from the data
blocks, period.  It makes no attempt to work out which block is wrong.
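
For anyone who wants to script this, here is a minimal sketch in Python
of driving a scrub through the md sysfs files (sync_action and
mismatch_cnt under /sys/block/<array>/md/).  The function names and the
sysfs_root parameter are my own inventions for illustration; writing to
sync_action needs root, and the array name is whatever yours is.

```python
from pathlib import Path


def md_dir(array, sysfs_root="/sys"):
    """Return the md attribute directory for an array such as 'md0'."""
    return Path(sysfs_root) / "block" / array / "md"


def start_scrub(array, action="check", sysfs_root="/sys"):
    """Ask md to start a scrub.

    'check' only counts mismatches; 'repair' also rewrites parity
    (P and Q on RAID6) from the data blocks.
    """
    if action not in ("check", "repair"):
        raise ValueError("action must be 'check' or 'repair'")
    (md_dir(array, sysfs_root) / "sync_action").write_text(action + "\n")


def scrub_is_idle(array, sysfs_root="/sys"):
    """True once sync_action reads back as 'idle'."""
    state = (md_dir(array, sysfs_root) / "sync_action").read_text()
    return state.strip() == "idle"


def read_mismatch_cnt(array, sysfs_root="/sys"):
    """Return the mismatch count reported by the last check/repair."""
    text = (md_dir(array, sysfs_root) / "mismatch_cnt").read_text()
    return int(text.strip())
```

Typical use would be start_scrub("md0"), poll scrub_is_idle("md0")
until it returns True, then look at read_mismatch_cnt("md0").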

If you think about it, five of the seven events are easily explained:
four dropped drives, plus cancellation of the rebuild onto what was then
/dev/sdh1.  So the opportunity for other corruption was small.  This is
precisely the scenario where forced assembly makes sense:  the drives
were kicked out for reasons other than actual drive failure, and
--force tells mdadm that those drives aren't really bad.  But I left out
the partially rebuilt drive because it really couldn't be trusted.
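
To make the arithmetic concrete, here is a rough sketch of the check
you can do by hand before deciding on --force: pull the "Events :" line
out of each member's mdadm --examine output and see how far apart the
counts are.  The helper names are hypothetical, the regex assumes the
count is printed as a plain integer, and "enough_if_forced" is only a
head count.  It is not a reimplementation of mdadm's own assembly logic.

```python
import re

EVENTS_RE = re.compile(r"^\s*Events\s*:\s*(\d+)\s*$", re.MULTILINE)


def parse_events(examine_output):
    """Extract the event count from one device's --examine text."""
    match = EVENTS_RE.search(examine_output)
    if match is None:
        raise ValueError("no 'Events :' line found")
    return int(match.group(1))


def summarize(events_by_dev, raid_devices, parity=2):
    """Compare event counts across the members you intend to assemble.

    events_by_dev: mapping of device name -> event count, leaving out
    any device you do not trust (e.g. a partially rebuilt one).
    raid_devices:  number of devices in the array (12 in this thread).
    parity:        redundancy of the level (2 for RAID6).
    """
    newest = max(events_by_dev.values())
    in_sync = sorted(d for d, e in events_by_dev.items() if e == newest)
    # How many events each stale device is behind the newest one.
    stale = {d: newest - e for d, e in events_by_dev.items() if e < newest}
    needed = raid_devices - parity
    return {
        "newest": newest,
        "in_sync": in_sync,
        "stale": stale,
        "needed": needed,
        "enough_without_force": len(in_sync) >= needed,
        "enough_if_forced": len(events_by_dev) >= needed,
    }
```

With made-up numbers shaped like this thread (seven members at one
count, four members seven events behind, the rebuild target omitted),
summarize(..., raid_devices=12) reports needed = 10, not enough members
without force, and enough if the four stale ones are accepted.  A small
gap with a known cause is the case for --force; a large or unexplained
gap is not.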

HTH,

Phil
