From: Phil Turmel <philip@turmel.org>
To: Florian Lampel <florian.lampel@gmail.com>
Cc: linux-raid@vger.kernel.org
Subject: Re: RAID6 dead on the water after Controller failure
Date: Sat, 15 Feb 2014 10:12:49 -0500
Message-ID: <52FF83F1.3030904@turmel.org>
In-Reply-To: <5269CCC7-A0A7-479E-9738-88C74CB19435@gmail.com>

Good morning Florian,

On 02/15/2014 07:31 AM, Florian Lampel wrote:
> Greetings,
> 
> First of all, thanks to Phil Turmel for pointing me in the right direction. I checked all the cables and, sure enough, the shielding on the system SSD's cable was halfway peeled off.

Very good.

> Anyway, the current state is as follows:
> 
> *) The missing HDDs came up right after the reboot, and I had to use the "bootdegraded=true" kernel option.
> *) All 12 drives are functional.
> 
> Here is a link to the requested output of 
> 
> --- mdadm -E /dev/sd[abcd]1 ---
> --- for x in /dev/sd[a-z] ; do echo $x : ; smartctl -x $x ; done ----
> 
> as well as
> 
> ---- mdadm --examine /dev/sd[abcdefghijklmnop]1 ------
> 
> Link:
> h__p://pastebin.com/v6yzn3KX

Device order has changed.  Summary (partition, drive serial, array role,
event count):

/dev/sda1: WD-WMC300595440 Device #4 @442
/dev/sdb1: WD-WMC300595880 Device #5 @442
/dev/sdc1: WD-WMC1T1521826 Device #6 @442
/dev/sdd1: WD-WMC300314126 spare
/dev/sde1: WD-WMC300595645 Device #8 @435
/dev/sdf1: WD-WMC300314217 Device #9 @435
/dev/sdg1: WD-WMC300595957 Device #10 @435
/dev/sdh1: WD-WMC300313432 Device #11 @435
/dev/sdj1: WD-WMC300312702 Device #0 @442
/dev/sdk1: WD-WMC300248734 Device #1 @442
/dev/sdl1: WD-WMC300314248 Device #2 @442
/dev/sdm1: WD-WMC300585843 Device #3 @442

and your SSD is now /dev/sdi.
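
If you want to regenerate the role and event-count columns of that
summary yourself, here is a minimal sketch.  It assumes v1.x metadata,
where "mdadm --examine" prints "Events :" and "Device Role :" lines for
each member.  The function name is mine, and the serial numbers are not
covered (those come from smartctl):

```shell
#!/bin/sh
# Condense `mdadm --examine` output (read from stdin) into one line per member:
#   <device> <role> @<event count>
# Assumes v1.x metadata, which prints "Events :" and "Device Role :" lines.
# Fields that are not found are reported as "?".
summarize_examine() {
    awk '
        function emit() { if (dev != "") printf "%s %s @%s\n", dev, role, events }
        /^\/dev\//      { emit(); dev = $1; sub(/:$/, "", dev); role = "?"; events = "?" }
        /Events :/      { events = $NF }
        /Device Role :/ { role = $0; sub(/^.*Device Role : */, "", role) }
        END             { emit() }
    '
}
```

Usage would be something like
"mdadm --examine /dev/sd[a-p]1 | summarize_examine".  Members whose
event counts lag behind the rest are the ones that dropped out first.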

> My findings:
> The Event count does differ, but not by much. As my next step, I would follow Phil Turmel's advice and reassemble the Array using the --force option, to be precise:
> 
> mdadm -Afv /dev/md0 /dev/sd[abcdefgjklm]1

Not quite.  What was 'h' is now 'd'.  Use:

mdadm -Afv /dev/md0 /dev/sd[abcefghjklm]1

> Could you please advise me whether this next step is all right to do now that we have new logs etc.?

Yes.  You may also need "mdadm --stop /dev/md0" first if your boot
process partially assembled the array already.
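
Put together, the stop-then-assemble sequence could be sketched as
below.  The function name and the /proc/mdstat test are my own
assumptions for illustration; the mdadm options are the ones given
above:

```shell
#!/bin/sh
# Usage: reassemble /dev/md0 <member partitions...>
reassemble() {
    md=$1; shift
    # Stop the array first if the boot process already (partially)
    # assembled it; /proc/mdstat lists such arrays as "md0 : ...".
    if grep -q "^${md#/dev/} " /proc/mdstat 2>/dev/null; then
        mdadm --stop "$md" || return 1
    fi
    # -A assemble, -f force (accept the slightly stale event counts),
    # -v verbose.
    mdadm -Afv "$md" "$@"
}
```

For this array that would be called as
"reassemble /dev/md0 /dev/sd[abcefghjklm]1".  Check /proc/mdstat
afterwards to confirm eleven of twelve members are active.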

After assembly, your array will be single-degraded but fully functional.
That would be a good time to back up any critical data that isn't
already in a backup.

Then you can add /dev/sdd1 back into the array and let it rebuild.
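
The add itself is "mdadm /dev/md0 --add /dev/sdd1".  To keep an eye on
the rebuild, a small helper along these lines may be handy.  It is only
a sketch, and it assumes the usual "recovery = NN.N%" progress line in
/proc/mdstat; the function name is made up:

```shell
#!/bin/sh
# Print the rebuild percentage from /proc/mdstat-style text on stdin,
# or nothing when no recovery is running.
rebuild_progress() {
    awk '/recovery =/ {
        for (i = 1; i <= NF; i++)
            if ($i == "recovery") { print $(i + 2); exit }
    }'
}
```

Run it as "rebuild_progress < /proc/mdstat".  Empty output means no
recovery is in progress, so either the rebuild has finished or it never
started.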

> Thanks in advance,
> Florian Lampel
> 
> PS: Thanks again to Phil for pointing out that --create would be madness.

One more thing:  your drives report never having a self-test run.  You
should have a cron job that triggers a long background self-test on a
regular basis.  Weekly, perhaps.
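
A minimal sketch of such a job, assuming smartmontools is installed.
The script path and schedule in the comment are placeholders, not
anything from your system:

```shell
#!/bin/sh
# Start a long (extended) background self-test on each drive given.
# Hypothetical /etc/cron.d entry, Saturdays at 02:00:
#   0 2 * * 6  root  /usr/local/sbin/smart-long-test
start_long_selftests() {
    for dev in "$@"; do
        # The test runs inside the drive; smartctl returns immediately.
        smartctl -t long "$dev" || echo "self-test not started on $dev" >&2
    done
}
```

Call it with the array members, e.g.
"start_long_selftests /dev/sd[a-h] /dev/sd[j-m]", and read the results
later with "smartctl -l selftest" on each drive.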

Similarly, you should have a cron job trigger an occasional "check"
scrub on the array, too.  Not at the same time as the self-tests,
though.  (I understand some distributions have this already.)
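
If yours doesn't, a scrub is started by writing "check" to the array's
sync_action file in sysfs.  A rough sketch follows; the function name
and the optional sysfs-root argument are assumptions of mine, the
latter only so the function can be tried against a scratch directory:

```shell
#!/bin/sh
# Usage: start_check md0 [sysfs-root]
# Starts a "check" scrub only if the array is currently idle.
start_check() {
    action="${2:-/sys}/block/$1/md/sync_action"
    if [ "$(cat "$action")" = "idle" ]; then
        echo check > "$action"
    else
        # Already resyncing, recovering or checking: leave it alone.
        echo "$1 is busy, not starting a check" >&2
        return 1
    fi
}
```

Schedule it on a different day from the self-tests.  When the check
completes, /sys/block/md0/md/mismatch_cnt shows how many sectors
disagreed.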

HTH,

Phil
