* Problem assembling a degraded RAID5
@ 2012-04-12 18:02 Martin Wegner
2012-04-12 20:56 ` Martin Wegner
0 siblings, 1 reply; 2+ messages in thread
From: Martin Wegner @ 2012-04-12 18:02 UTC (permalink / raw)
To: linux-raid
Hello.
I've had a disk "failure" in a raid5 containing 4 drives and 1 spare.
The raid5 was still reported as clean, but SMART data indicated that one
drive was failing. So I took these steps:
1. I shut down the system and replaced the failing drive with a new one.
2. Upon booting the system, another drive of this array was missing. I
thought it was the spare device and tried to start the array with
the remaining 3 devices (out of the 4 non-spare), but it didn't work.
All devices were set up as spares in the array (so I eventually also
tried --force, but still no luck). So I came to the conclusion that the
missing device was not the spare device.
3. I shut down the system again, re-checked all the cables,
re-installed the failing device and removed the new one. So the raid
array should be in the exact same state as before I removed the disk,
both physically and in terms of the data on the array.
But the raid5 array cannot be started anymore. mdadm reports that the
superblocks of the devices do not match.
Can anyone help me recover this raid array? I'm pretty desperate
at this point.
Here is the output of mdadm --examine ... for all member devices:
/dev/sda5:
Magic : a92b4efc
Version : 1.2
Feature Map : 0x0
Array UUID : 610fb4f8:02dab3e7:e2fbd8a5:4828a4b0
Name : garm:1 (local to host garm)
Creation Time : Fri Jul 1 20:02:44 2011
Raid Level : -unknown-
Raid Devices : 0
Avail Dev Size : 3907023024 (1863.01 GiB 2000.40 GB)
Data Offset : 2048 sectors
Super Offset : 8 sectors
State : active
Device UUID : b80889b3:d910f7cf:940fe571:45fdbd79
Update Time : Thu Apr 12 17:47:39 2012
Checksum : e5c307f8 - correct
Events : 2
Device Role : spare
Array State : ('A' == active, '.' == missing)
/dev/sdb5:
Magic : a92b4efc
Version : 1.2
Feature Map : 0x0
Array UUID : 610fb4f8:02dab3e7:e2fbd8a5:4828a4b0
Name : garm:1 (local to host garm)
Creation Time : Fri Jul 1 20:02:44 2011
Raid Level : -unknown-
Raid Devices : 0
Avail Dev Size : 3907023024 (1863.01 GiB 2000.40 GB)
Data Offset : 2048 sectors
Super Offset : 8 sectors
State : active
Device UUID : 5c645db4:15f5123c:54736b86:201f0767
Update Time : Thu Apr 12 17:47:39 2012
Checksum : 5482cd74 - correct
Events : 2
Device Role : spare
Array State : ('A' == active, '.' == missing)
/dev/sdg5:
Magic : a92b4efc
Version : 1.2
Feature Map : 0x1
Array UUID : 610fb4f8:02dab3e7:e2fbd8a5:4828a4b0
Name : garm:1 (local to host garm)
Creation Time : Fri Jul 1 20:02:44 2011
Raid Level : raid5
Raid Devices : 4
Avail Dev Size : 3907023024 (1863.01 GiB 2000.40 GB)
Array Size : 11721068256 (5589.04 GiB 6001.19 GB)
Used Dev Size : 3907022752 (1863.01 GiB 2000.40 GB)
Data Offset : 2048 sectors
Super Offset : 8 sectors
State : clean
Device UUID : 0ec23618:69bcb467:20fe2b20:5dedf2d6
Internal Bitmap : 2 sectors from superblock
Update Time : Thu Apr 12 17:14:54 2012
Checksum : edbcd80f - correct
Events : 21215
Layout : left-symmetric
Chunk Size : 16K
Device Role : Active device 0
Array State : AAAA ('A' == active, '.' == missing)
/dev/sdh5:
Magic : a92b4efc
Version : 1.2
Feature Map : 0x0
Array UUID : 610fb4f8:02dab3e7:e2fbd8a5:4828a4b0
Name : garm:1 (local to host garm)
Creation Time : Fri Jul 1 20:02:44 2011
Raid Level : -unknown-
Raid Devices : 0
Avail Dev Size : 3907023024 (1863.01 GiB 2000.40 GB)
Data Offset : 2048 sectors
Super Offset : 8 sectors
State : active
Device UUID : ff352ee9:4f8d881c:e5408fde:e6234761
Update Time : Thu Apr 12 17:47:39 2012
Checksum : bc2d09a6 - correct
Events : 2
Device Role : spare
Array State : ('A' == active, '.' == missing)
/dev/sdl5:
Magic : a92b4efc
Version : 1.2
Feature Map : 0x1
Array UUID : 610fb4f8:02dab3e7:e2fbd8a5:4828a4b0
Name : garm:1 (local to host garm)
Creation Time : Fri Jul 1 20:02:44 2011
Raid Level : raid5
Raid Devices : 4
Avail Dev Size : 3907023024 (1863.01 GiB 2000.40 GB)
Array Size : 11721068256 (5589.04 GiB 6001.19 GB)
Used Dev Size : 3907022752 (1863.01 GiB 2000.40 GB)
Data Offset : 2048 sectors
Super Offset : 8 sectors
State : clean
Device UUID : bdf903e3:88296c06:e340658c:4378ac7b
Internal Bitmap : 2 sectors from superblock
Update Time : Sun Apr 8 20:44:46 2012
Checksum : 682f35bb - correct
Events : 21215
Layout : left-symmetric
Chunk Size : 16K
Device Role : spare
Array State : AAAA ('A' == active, '.' == missing)
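The listings above are easier to compare side by side. A minimal sketch, assuming the concatenated `mdadm --examine` output is available as text: it splits the output per device and sorts by event count, so that members with stale or rewritten superblocks (Events : 2, Raid Level : -unknown-) stand out from those that still carry the raid5 metadata (Events : 21215). The function names and the shortened SAMPLE text are illustrative only.

```python
import re


def parse_examine(text):
    """Split concatenated `mdadm --examine` output into {device: {field: value}}."""
    devices = {}
    current = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        header = re.fullmatch(r"(/dev/\S+):", line)
        if header:
            current = devices.setdefault(header.group(1), {})
            continue
        if current is not None and " : " in line:
            key, value = line.split(" : ", 1)
            current[key.strip()] = value.strip()
    return devices


def summarize(devices):
    """Return (device, events, raid level, role) tuples, highest event count first."""
    rows = [
        (dev, int(f.get("Events", 0)), f.get("Raid Level", "?"), f.get("Device Role", "?"))
        for dev, f in devices.items()
    ]
    return sorted(rows, key=lambda row: (-row[1], row[0]))


# Shortened excerpt of the listings above, for illustration only.
SAMPLE = """\
/dev/sda5:
Raid Level : -unknown-
Events : 2
Device Role : spare
/dev/sdg5:
Raid Level : raid5
Events : 21215
Device Role : Active device 0
"""

if __name__ == "__main__":
    for row in summarize(parse_examine(SAMPLE)):
        print(*row)
```

Run against the full output, this would group sdg5 and sdl5 (Events 21215) apart from sda5, sdb5 and sdh5 (Events 2).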
Thanks in advance,
Martin Wegner
* Re: Problem assembling a degraded RAID5
From: Martin Wegner @ 2012-04-12 20:56 UTC (permalink / raw)
To: linux-raid
Hello.
I was able to gather some more data on the raid array:
Before removing the disk, /proc/mdstat showed this:
------------------------------------------------------------------------------
md1 : active raid5 sdh5[1] sdb5[2] sdc5[3] sdi5[0] sdm5[4](S)
5860534128 blocks super 1.2 level 5, 16k chunk, algorithm 2 [4/4]
[UUUU]
bitmap: 0/15 pages [0KB], 65536KB chunk
------------------------------------------------------------------------------
And <mdadm --detail /dev/md1> showed this:
------------------------------------------------------------------------------
/dev/md1:
Version : 1.2
Creation Time : Fri Jul 1 20:02:44 2011
Raid Level : raid5
Array Size : 5860534128 (5589.04 GiB 6001.19 GB)
Used Dev Size : 1953511376 (1863.01 GiB 2000.40 GB)
Raid Devices : 4
Total Devices : 5
Persistence : Superblock is persistent
Intent Bitmap : Internal
Update Time : Wed Apr 11 23:06:09 2012
State : active
Active Devices : 4
Working Devices : 5
Failed Devices : 0
Spare Devices : 1
Layout : left-symmetric
Chunk Size : 16K
Name : garm:1 (local to host garm)
UUID : 610fb4f8:02dab3e7:e2fbd8a5:4828a4b0
Events : 21215
Number Major Minor RaidDevice State
0 8 133 0 active sync /dev/sdi5
1 8 117 1 active sync /dev/sdh5
3 8 37 2 active sync /dev/sdc5
2 8 21 3 active sync /dev/sdb5
4 8 197 - spare /dev/sdm5
------------------------------------------------------------------------------
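The size figures in the two kinds of output look inconsistent at first (Array Size 11721068256 per member versus 5860534128 for the array), but they appear to differ only in units: the per-device listings count 512-byte sectors, while /proc/mdstat and the array listing count 1 KiB blocks. A minimal sketch of the arithmetic, using only numbers quoted in this thread and the fact that a raid5 with n members holds n-1 members' worth of data:

```python
SECTOR = 512   # bytes per sector, the unit of the per-device listings
KIB = 1024     # bytes per block, the unit of /proc/mdstat and the array listing


def raid5_array_kib(used_dev_kib, raid_devices):
    """Usable raid5 capacity: one member's worth of space goes to parity."""
    return used_dev_kib * (raid_devices - 1)


used_dev_sectors = 3907022752                     # "Used Dev Size" of sdg5
used_dev_kib = used_dev_sectors * SECTOR // KIB   # same size in KiB
array_kib = raid5_array_kib(used_dev_kib, 4)      # 4 raid devices
array_sectors = array_kib * KIB // SECTOR

if __name__ == "__main__":
    print(used_dev_kib)    # 1953511376, matches "Used Dev Size" of md1
    print(array_kib)       # 5860534128, matches "Array Size" of md1
    print(array_sectors)   # 11721068256, matches "Array Size" of sdg5
```

So the geometry recorded on sdg5 and sdl5 agrees with what the running array reported before the disk was removed.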
So, *after* my repair attempt, the member devices somehow got renamed to
sda5, sdb5, sdg5, sdh5 and sdl5.
The last device in alphabetical order, sdl5, still seems to be the spare
device according to <mdadm --examine ...>.
The sdg5 and sdh5 devices are the HDD models I started the raid5 with at
the beginning (as raid1 at that time). sdg5 reports as raid device 0
according to <mdadm --examine ...>. So I guess that sdg5 and sdh5 got
swapped - maybe because I swapped cables or something like that. But I
think that sdh5 has to be raid device 1, although it is not reporting that.
That leaves sda5 and sdb5, which may also be swapped, but they should
be raid devices 2 and 3.
For the original device order, this is all I could recover so far. Is
there any way to re-assemble the raid array with this information?
When searching for similar reports, I read that it may be possible to
re-create the array with <mdadm --create ...> if one knows the array's
metadata like level, chunksize, etc. and the original device order when
creating the array.
I think I have all the necessary metadata, but the above is all I could
recover about the device order.
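With sdg5 known to be raid device 0 and sdh5 guessed to be device 1, only a handful of orders remain. A minimal sketch, under those assumptions, that lists the candidate orders (most likely first) and builds the matching `mdadm --create` command lines from the metadata quoted above (level 5, 4 devices, 16K chunk, left-symmetric, 1.2 metadata). It only prints the commands and never runs them: `--create` rewrites the superblocks, so any real attempt belongs on copies or overlays of the disks. The data offset (2048 sectors) would also have to match; newer mdadm releases have a `--data-offset` option for this, but check the local man page before relying on it.

```python
from itertools import permutations


def candidate_orders(fixed_first, likely_second, others):
    """Return device orders with `fixed_first` in slot 0, likeliest guesses first."""
    orders = [(fixed_first, *perm) for perm in permutations([likely_second, *others])]
    # Stable sort: orders that keep `likely_second` in slot 1 move to the front.
    orders.sort(key=lambda order: order[1] != likely_second)
    return orders


def create_command(order, md="/dev/md1"):
    """Build (but do not run) an mdadm --create command for one device order."""
    return [
        "mdadm", "--create", md,
        "--assume-clean",            # do not start a resync over the old data
        "--metadata=1.2",
        "--level=5",
        "--raid-devices=4",
        "--chunk=16",                # chunk size in KiB
        "--layout=left-symmetric",
        *order,
    ]


if __name__ == "__main__":
    orders = candidate_orders("/dev/sdg5", "/dev/sdh5", ["/dev/sda5", "/dev/sdb5"])
    for order in orders:
        print(" ".join(create_command(order)))
```

This yields six orders in total, the first two being the ones where only sda5 and sdb5 are in doubt.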
On top of this raid5 I had a LUKS crypt device. In the thread [0] on
serverfault.com someone states that one can try <mdadm --create ...>
several times with different device orders and check whether a valid
RAID was re-created by looking for metadata on the RAID device. In my
case, I could check whether a valid LUKS header can be found.
Would this be a possibility?
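The LUKS check described here can be sketched as follows. A LUKS header starts with the six magic bytes "LUKS" 0xBA 0xBE, followed by a two-byte big-endian version number. The helper name `luks_version` is made up for this example, and the demo writes a fake header to a temporary file rather than touching a real device. One caveat, as far as the layout is understood here: with a left-symmetric raid5 the first chunk of the array lives on raid device 0, so finding the magic mainly confirms slot 0. Actually unlocking the volume with the passphrase reads the key material spread over the following chunks and should be a much stronger test of the order.

```python
import os
import tempfile

LUKS_MAGIC = b"LUKS\xba\xbe"


def luks_version(path):
    """Return the LUKS version found at the start of `path`, or None if absent."""
    with open(path, "rb") as dev:
        header = dev.read(8)
    if len(header) < 8 or header[:6] != LUKS_MAGIC:
        return None
    return int.from_bytes(header[6:8], "big")


if __name__ == "__main__":
    # Demo on a temporary file standing in for the assembled array device.
    with tempfile.NamedTemporaryFile(delete=False) as fake:
        fake.write(LUKS_MAGIC + b"\x00\x01" + b"\x00" * 504)
    try:
        print(luks_version(fake.name))
    finally:
        os.remove(fake.name)
```

On a real system the path would be the re-created array (for example /dev/md1), opened read-only as above.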
Is there any way I can recover the raid5 with this information?
I'd really appreciate any help with this issue.
Thanks,
Martin Wegner
[0]
http://serverfault.com/questions/347606/recover-raid-5-data-after-created-new-array-instead-of-re-using