From: Billy Crook <billycrook@gmail.com>
To: Michael Evans <mjevans1983@gmail.com>
Cc: Phillip Susi <psusi@cfl.rr.com>, linux-raid@vger.kernel.org
Subject: Re: Two degraded mirror segments recombined out of sync for massive data loss
Date: Thu, 8 Apr 2010 09:58:10 -0500
Message-ID: <z2za43edf1b1004080758mbf88eeecg64b174d740573be9@mail.gmail.com>
In-Reply-To: <r2v4877c76c1004071421o14e66ae5s35b0bf0914fb737c@mail.gmail.com>

On Wed, Apr 7, 2010 at 16:21, Michael Evans <mjevans1983@gmail.com> wrote:
> It sounds like the last 'synced' time should be tracked, as well as
> the last modification time. If the two differ then it can be known
> that the contents has diverged since last sync.

I have perhaps a better solution:
Every time an event happens that could affect the coherency of the
components of an array (e.g. started, stopped, disk failed), a counter
is incremented on all of the components.  Then a random number is
written next to it (same number to all disks).
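
A minimal sketch of that stamping step, in Python for brevity. The
Superblock class, its field names, and stamp_event() are hypothetical
illustrations and not part of mdadm or the md superblock format; the
64-bit width of the random number is also just an assumption.

```python
import secrets
from dataclasses import dataclass


@dataclass
class Superblock:
    """Hypothetical per-component header (field names are illustrative)."""
    counter: int = 0
    nonce: int = 0


def stamp_event(present):
    """Record one coherency event on every component that is reachable.

    Each component's counter is incremented, and the same freshly drawn
    random number is written next to it on all of them.
    """
    nonce = secrets.randbits(64)  # assumed width; any large value works
    for sb in present:
        sb.counter += 1
        sb.nonce = nonce
    return nonce
```

Components that are missing when the event happens keep their old
counter and random number, which is what the assembly check relies on.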

On assembling an array:
For all components in the array, find the highest counter.

If enough disks with this counter are present and contain the same
random number, then it starts the array, and if a rebuild is necessary
to regain parity or the specified number of mirrors, the remaining
components of the same array are consumed for this purpose.

If multiple random numbers are present for this highest counter, then
the last modification time can be used to choose the most up to date
set.  That set comes online and overwrites the components with the
older modification time, or maybe mdadm just prompts the admin and
starts degraded.
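
The assembly rules above could be sketched roughly as follows. Header,
plan_assembly(), min_members and the returned action names are all
hypothetical, and the "refuse" outcome for too few components is my
own assumption; the choice between overwriting and prompting the admin
is left to the caller.

```python
from collections import defaultdict
from typing import NamedTuple


class Header(NamedTuple):
    """Hypothetical view of one component's header."""
    name: str
    counter: int
    nonce: int
    mtime: float


def plan_assembly(headers, min_members):
    """Return (action, active, others) for a set of component headers."""
    top = max(h.counter for h in headers)

    # Group the components holding the highest counter by random number.
    groups = defaultdict(list)
    for h in headers:
        if h.counter == top:
            groups[h.nonce].append(h)

    if len(groups) == 1:
        (members,) = groups.values()
        action = "start"
    else:
        # Several random numbers share the highest counter: the
        # components diverged. Prefer the most recently modified set.
        members = max(groups.values(),
                      key=lambda g: max(h.mtime for h in g))
        action = "diverged"

    if len(members) < min_members:
        return ("refuse", [], [h.name for h in headers])

    active = [h.name for h in members]
    # Everything else is a rebuild candidate (or awaits the admin).
    others = [h.name for h in headers if h.name not in active]
    return (action, active, others)
```

On "start" the others list holds stale components that can be consumed
for a rebuild; on "diverged" it holds the components that would be
overwritten, or left alone while the array runs degraded.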

This would catch the problem originally reported by Phillip because
the random numbers written to the components' headers would have been
written at different times, and so they would be different.  mdadm
would know by this that they had diverged regardless of any timestamp
or counter.
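
As a rough illustration, here is how I read the reported case from the
subject line: each half of a mirror gets started degraded on its own,
then both are recombined. The (counter, random number) tuples and the
bump() helper are hypothetical, and the starting counter of 7 is
arbitrary.

```python
import secrets


def bump(header):
    """Return the header after one coherency event on a lone component."""
    counter, _old_nonce = header
    return (counter + 1, secrets.randbits(64))


# Both halves of the mirror start out in sync.
in_sync = (7, secrets.randbits(64))
disk_a = in_sync
disk_b = in_sync

# First start: only A is present, so only A's header is updated.
disk_a = bump(disk_a)
# Second start: only B is present, so only B's header is updated.
disk_b = bump(disk_b)

same_counter = disk_a[0] == disk_b[0]  # counters match again
diverged = disk_a[1] != disk_b[1]      # random numbers do not
```

The counters agree, so a counter alone would not notice anything, but
the two random numbers were drawn independently and differ.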

