From: Peter van Es <vanes.peter@gmail.com>
To: linux-raid@vger.kernel.org
Subject: Help needed recovering from raid failure
Date: Mon, 27 Apr 2015 11:35:09 +0200
Message-ID: <4D8713B5-39E7-4EE2-898C-35DC0948B4CA@gmail.com>
Sorry for the long post...
I am running Ubuntu 14.04.2 LTS Server edition, 64-bit, with 4x 2.0TB drives in a raid-5 array.
The 4th drive was beginning to show read errors. Because it was the weekend, I could not go out
and buy a spare 2TB drive to replace the one that was beginning to fail.
I first got a fail event:
This is an automatically generated mail message from mdadm
running on bali
A Fail event had been detected on md device /dev/md/1.
It could be related to component device /dev/sdd2.
Faithfully yours, etc.
P.S. The /proc/mdstat file currently contains the following:
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
md1 : active raid5 sdc2[2] sdb2[1] sda2[0] sdd2[3](F)
5854290432 blocks super 1.2 level 5, 512k chunk, algorithm 2 [4/3] [UUU_]
md0 : active raid5 sdc1[2] sdd1[3] sdb1[1] sda1[0]
5850624 blocks super 1.2 level 5, 512k chunk, algorithm 2 [4/4] [UUUU]
unused devices: <none>
And then subsequently, around 18 hours later:
This is an automatically generated mail message from mdadm
running on bali
A DegradedArray event had been detected on md device /dev/md/1.
Faithfully yours, etc.
P.S. The /proc/mdstat file currently contains the following:
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
md1 : active raid5 sdc2[2] sdb2[1] sda2[0] sdd2[3](F)
5854290432 blocks super 1.2 level 5, 512k chunk, algorithm 2 [4/3] [UUU_]
md0 : active raid5 sdc1[2] sdd1[3] sdb1[1] sda1[0]
5850624 blocks super 1.2 level 5, 512k chunk, algorithm 2 [4/4] [UUUU]
unused devices: <none>
The server had taken the array offline at that point.
Needless to say, I can't boot the system anymore, as the boot drive is /dev/md0 and GRUB can't
get at it. I do need to recover the data (I know, but there's stuff on there I have no backup for--yet).
I booted Linux from a USB stick (which shows up as /dev/sdc1, hence the changed numbering),
in recovery mode. Below is the output of /proc/mdstat and
mdadm --examine. It looks like the /dev/sdd2 and /dev/sde2 partitions somehow took on the
superblock of the /dev/md127 device (my swap). Could that have been caused by booting from
the Ubuntu USB stick?
My plan... assemble a degraded array, with /dev/sde2 (the 4th drive, formerly known as /dev/sdd2) not in it.
Because the fail event put the file system in RO mode, I expect /dev/sdd2 (formerly /dev/sdc2) to be ok.
Then insert a new 2TB drive in slot 4, and let the system resync and recover.
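In commands, I imagine the plan would look something like the sketch below. This is untested; the device names are my guesses for the recovery environment (I'd verify each one against its Array UUID with mdadm --examine first), and it assumes the third data partition still carries the right superblock, which the --examine output below makes me doubt:

```shell
# Stop the arrays the rescue environment auto-assembled, so the member
# partitions are free (md127 may also need stopping if it has grabbed
# partitions that belong to md1 -- check /proc/mdstat first).
mdadm --stop /dev/md126

# Assemble the data array degraded, leaving out the failing 4th drive.
# --force may be needed because the event counts differ between members.
mdadm --assemble --force /dev/md1 /dev/sda2 /dev/sdb2 /dev/sdd2

# After fitting the replacement 2TB drive (say it appears as /dev/sdf,
# hypothetical name, partitioned like the others), add it and let the
# resync run:
mdadm --add /dev/md1 /dev/sdf2
cat /proc/mdstat   # watch the recovery progress
```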
I'm running xfs on the /dev/md1 device.
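If the array does assemble, my understanding is that I should check the filesystem read-only before trusting it, something like (untested):

```shell
# No-modify check of the xfs filesystem on the assembled array;
# xfs_repair -n reports problems without writing anything.
xfs_repair -n /dev/md1

# Or mount read-only first, to copy off the data I have no backup of:
mount -o ro /dev/md1 /mnt
```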
Questions:
1. Is this the wise course of action?
2. How exactly do I reassemble the array (/etc/mdadm.conf is inaccessible in recovery mode)?
3. What command-line options do I use, from the --examine output below, without screwing things up?
Any help or pointers gratefully accepted.
Peter van Es
/proc/mdstat (in recovery)
Personalities : [raid6] [raid5] [raid4] [linear] [multipath] [raid0] [raid1] [raid10]
md126 : inactive sdb2[1](S) sda2[0](S)
3902861312 blocks super 1.2
md127 : active raid5 sde2[5](S) sde1[3] sdb1[1] sda1[0] sdd1[2] sdd2[4](S)
5850624 blocks super 1.2 level 5, 512k chunk, algorithm 2 [4/4] [UUUU]
unused devices: <none>
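Given this state, I assume the first step is to map each partition to its array by Array UUID rather than trusting the device names, e.g. with a read-only loop like this (which is how I produced the output that follows):

```shell
# For each candidate member, print which array its superblock claims to
# belong to, its role, and its event count (read-only, changes nothing).
for d in /dev/sd[abde]2; do
    echo "== $d =="
    mdadm --examine "$d" | grep -E 'Array UUID|Device Role|Events'
done
```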
mdadm --examine /dev/sd[abde]2
/dev/sda2:
Magic : a92b4efc
Version : 1.2
Feature Map : 0x0
Array UUID : 1f28f7bb:7b3ecd41:ca0fa5d1:ccd008df
Name : ubuntu:1 (local to host ubuntu)
Creation Time : Wed Apr 1 22:27:58 2015
Raid Level : raid5
Raid Devices : 4
Avail Dev Size : 3902861312 (1861.03 GiB 1998.26 GB)
Array Size : 5854290432 (5583.09 GiB 5994.79 GB)
Used Dev Size : 3902860288 (1861.03 GiB 1998.26 GB)
Data Offset : 262144 sectors
Super Offset : 8 sectors
State : clean
Device UUID : 713e556d:ca104217:785db68a:d820a57b
Update Time : Sun Apr 26 05:59:13 2015
Checksum : fda151f9 - correct
Events : 18014
Layout : left-symmetric
Chunk Size : 512K
Device Role : Active device 0
Array State : AA.. ('A' == active, '.' == missing)
/dev/sdb2:
Magic : a92b4efc
Version : 1.2
Feature Map : 0x0
Array UUID : 1f28f7bb:7b3ecd41:ca0fa5d1:ccd008df
Name : ubuntu:1 (local to host ubuntu)
Creation Time : Wed Apr 1 22:27:58 2015
Raid Level : raid5
Raid Devices : 4
Avail Dev Size : 3902861312 (1861.03 GiB 1998.26 GB)
Array Size : 5854290432 (5583.09 GiB 5994.79 GB)
Used Dev Size : 3902860288 (1861.03 GiB 1998.26 GB)
Data Offset : 262144 sectors
Super Offset : 8 sectors
State : clean
Device UUID : f1e79609:79b7ac23:55197f70:e8fbfd58
Update Time : Sun Apr 26 05:59:13 2015
Checksum : 696f4e76 - correct
Events : 18014
Layout : left-symmetric
Chunk Size : 512K
Device Role : Active device 1
Array State : AA.. ('A' == active, '.' == missing)
/dev/sdd2:
Magic : a92b4efc
Version : 1.2
Feature Map : 0x0
Array UUID : dbe238a3:c7a528c1:a1b78589:276ecfcf
Name : ubuntu:0 (local to host ubuntu)
Creation Time : Wed Apr 1 22:27:42 2015
Raid Level : raid5
Raid Devices : 4
Avail Dev Size : 3903121408 (1861.15 GiB 1998.40 GB)
Array Size : 5850624 (5.58 GiB 5.99 GB)
Used Dev Size : 3900416 (1904.82 MiB 1997.01 MB)
Data Offset : 2048 sectors
Super Offset : 8 sectors
State : clean
Device UUID : 0f3f2b91:09cbb344:e52c4c4b:722d65c4
Update Time : Mon Apr 27 08:37:15 2015
Checksum : 7e241855 - correct
Events : 26
Layout : left-symmetric
Chunk Size : 512K
Device Role : spare
Array State : AAAA ('A' == active, '.' == missing)
/dev/sde2:
Magic : a92b4efc
Version : 1.2
Feature Map : 0x0
Array UUID : dbe238a3:c7a528c1:a1b78589:276ecfcf
Name : ubuntu:0 (local to host ubuntu)
Creation Time : Wed Apr 1 22:27:42 2015
Raid Level : raid5
Raid Devices : 4
Avail Dev Size : 3903121408 (1861.15 GiB 1998.40 GB)
Array Size : 5850624 (5.58 GiB 5.99 GB)
Used Dev Size : 3900416 (1904.82 MiB 1997.01 MB)
Data Offset : 2048 sectors
Super Offset : 8 sectors
State : clean
Device UUID : cdae3287:91168194:942ba99d:1a85c466
Update Time : Mon Apr 27 08:37:15 2015
Checksum : b8b529f3 - correct
Events : 26
Layout : left-symmetric
Chunk Size : 512K
Device Role : spare
Array State : AAAA ('A' == active, '.' == missing)
Thread overview: 7+ messages
2015-04-27 9:35 Peter van Es [this message]
2015-04-27 11:07 ` Help needed recovering from raid failure Mikael Abrahamsson
2015-04-28 22:26 ` NeilBrown
2015-04-29 18:17 Peter van Es
2015-04-29 23:27 ` NeilBrown
2015-04-30 19:25 ` Peter van Es
2015-05-01 2:31 ` NeilBrown