* mdadm ignoring X as it reports Y as failed
@ 2013-07-06 19:06 Marek Jaros
2013-07-08 1:28 ` NeilBrown
2013-07-08 3:58 ` NeilBrown
0 siblings, 2 replies; 3+ messages in thread
From: Marek Jaros @ 2013-07-06 19:06 UTC
To: linux-raid
Hey everybody.
To keep it short: I have a RAID-5 md array, and just today 2 out of the 5
drives dropped out. It was a cable issue and has since been fixed. The
array was not being written to or otherwise in use, so no data has
been lost.
However, when I attempted to reassemble the array with
mdadm --assemble --force --verbose /dev/md0 /dev/sdc /dev/sdd /dev/sde /dev/sdf /dev/sdg
I got the following errors
mdadm: looking for devices for /dev/md0
mdadm: /dev/sdc is identified as a member of /dev/md0, slot 0.
mdadm: /dev/sdd is identified as a member of /dev/md0, slot 1.
mdadm: /dev/sde is identified as a member of /dev/md0, slot 2.
mdadm: /dev/sdf is identified as a member of /dev/md0, slot 3.
mdadm: /dev/sdg is identified as a member of /dev/md0, slot 4.
mdadm: ignoring /dev/sde as it reports /dev/sdc as failed
mdadm: ignoring /dev/sdf as it reports /dev/sdc as failed
mdadm: ignoring /dev/sdg as it reports /dev/sdc as failed
mdadm: added /dev/sdd to /dev/md0 as 1
mdadm: no uptodate device for slot 2 of /dev/md0
mdadm: no uptodate device for slot 3 of /dev/md0
mdadm: no uptodate device for slot 4 of /dev/md0
mdadm: added /dev/sdc to /dev/md0 as 0
mdadm: /dev/md0 assembled from 2 drives - not enough to start the array.
After running --examine* I found that the Array State field inside the
superblock does indeed mark the first two drives as missing. That is no
longer true, but I can't force mdadm to assemble the array or update the
superblock info.
So is there any way to force mdadm to assemble the array? Or perhaps to
edit the superblock info manually? I'd rather avoid having to recreate
the array from scratch.
Any help or pointers with more info are highly appreciated. Thank you.
Regards,
Marek Jaros
*--examine output here:
/dev/sdc:
Magic : a92b4efc
Version : 1.2
Feature Map : 0x0
Array UUID : 43222604:a15a957d:313d7b6c:d087ba6a
Name : YSGARD:0
Creation Time : Thu Apr 25 12:38:17 2013
Raid Level : raid5
Raid Devices : 5
Avail Dev Size : 1953263024 (931.39 GiB 1000.07 GB)
Array Size : 3906525184 (3725.55 GiB 4000.28 GB)
Used Dev Size : 1953262592 (931.39 GiB 1000.07 GB)
Data Offset : 262144 sectors
Super Offset : 8 sectors
State : clean
Device UUID : d2a4d8ed:b2c2bf12:6d553d16:7aabe804
Update Time : Sat Jul 6 14:43:14 2013
Checksum : 1b569235 - correct
Events : 2742
Layout : left-symmetric
Chunk Size : 512K
Device Role : Active device 0
Array State : AAAAA ('A' == active, '.' == missing)
/dev/sdd:
Magic : a92b4efc
Version : 1.2
Feature Map : 0x0
Array UUID : 43222604:a15a957d:313d7b6c:d087ba6a
Name : YSGARD:0
Creation Time : Thu Apr 25 12:38:17 2013
Raid Level : raid5
Raid Devices : 5
Avail Dev Size : 1953263024 (931.39 GiB 1000.07 GB)
Array Size : 3906525184 (3725.55 GiB 4000.28 GB)
Used Dev Size : 1953262592 (931.39 GiB 1000.07 GB)
Data Offset : 262144 sectors
Super Offset : 8 sectors
State : clean
Device UUID : f729ddca:adb2eaec:748ad0b6:0e321903
Update Time : Sat Jul 6 14:29:42 2013
Checksum : 7149bfc6 - correct
Events : 2742
Layout : left-symmetric
Chunk Size : 512K
Device Role : Active device 1
Array State : AAAAA ('A' == active, '.' == missing)
/dev/sde:
Magic : a92b4efc
Version : 1.2
Feature Map : 0x0
Array UUID : 43222604:a15a957d:313d7b6c:d087ba6a
Name : YSGARD:0
Creation Time : Thu Apr 25 12:38:17 2013
Raid Level : raid5
Raid Devices : 5
Avail Dev Size : 1953263024 (931.39 GiB 1000.07 GB)
Array Size : 3906525184 (3725.55 GiB 4000.28 GB)
Used Dev Size : 1953262592 (931.39 GiB 1000.07 GB)
Data Offset : 262144 sectors
Super Offset : 8 sectors
State : clean
Device UUID : 83aea2f7:dc6a7d68:0d875248:c54d2bc9
Update Time : Sat Jul 6 14:46:15 2013
Checksum : 713418b2 - correct
Events : 2742
Layout : left-symmetric
Chunk Size : 512K
Device Role : Active device 2
Array State : ..AAA ('A' == active, '.' == missing)
/dev/sdg:
Magic : a92b4efc
Version : 1.2
Feature Map : 0x0
Array UUID : 43222604:a15a957d:313d7b6c:d087ba6a
Name : YSGARD:0
Creation Time : Thu Apr 25 12:38:17 2013
Raid Level : raid5
Raid Devices : 5
Avail Dev Size : 1953263024 (931.39 GiB 1000.07 GB)
Array Size : 3906525184 (3725.55 GiB 4000.28 GB)
Used Dev Size : 1953262592 (931.39 GiB 1000.07 GB)
Data Offset : 262144 sectors
Super Offset : 8 sectors
State : clean
Device UUID : b8295773:a4a73e18:6a097100:15ecc829
Update Time : Sat Jul 6 14:46:15 2013
Checksum : b565f15d - correct
Events : 2742
Layout : left-symmetric
Chunk Size : 512K
Device Role : Active device 4
Array State : ..AAA ('A' == active, '.' == missing)
----------------------------------------------------------------
This message was sent using IMP, the Internet Messaging Program.
* Re: mdadm ignoring X as it reports Y as failed
From: NeilBrown @ 2013-07-08 1:28 UTC
To: Marek Jaros; +Cc: linux-raid
On Sat, 06 Jul 2013 21:06:14 +0200 "Marek Jaros" <mjaros1@nbox.cz> wrote:
> Hey everybody.
>
> To keep it short: I have a RAID-5 md array, and just today 2 out of the 5
> drives dropped out. It was a cable issue and has since been fixed. The
> array was not being written to or otherwise in use, so no data has
> been lost.
>
> However, when I attempted to reassemble the array with
>
> mdadm --assemble --force --verbose /dev/md0 /dev/sdc /dev/sdd /dev/sde /dev/sdf /dev/sdg
>
>
> I got the following errors
>
> mdadm: looking for devices for /dev/md0
> mdadm: /dev/sdc is identified as a member of /dev/md0, slot 0.
> mdadm: /dev/sdd is identified as a member of /dev/md0, slot 1.
> mdadm: /dev/sde is identified as a member of /dev/md0, slot 2.
> mdadm: /dev/sdf is identified as a member of /dev/md0, slot 3.
> mdadm: /dev/sdg is identified as a member of /dev/md0, slot 4.
> mdadm: ignoring /dev/sde as it reports /dev/sdc as failed
> mdadm: ignoring /dev/sdf as it reports /dev/sdc as failed
> mdadm: ignoring /dev/sdg as it reports /dev/sdc as failed
> mdadm: added /dev/sdd to /dev/md0 as 1
> mdadm: no uptodate device for slot 2 of /dev/md0
> mdadm: no uptodate device for slot 3 of /dev/md0
> mdadm: no uptodate device for slot 4 of /dev/md0
> mdadm: added /dev/sdc to /dev/md0 as 0
> mdadm: /dev/md0 assembled from 2 drives - not enough to start the array.
>
Try the same assemble command, but rearrange the devices so that one of sde,
sdf or sdg is listed first. That might work.
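A minimal sketch of that reordering, assuming the device names from the original command are unchanged. The variable names and the `reordered_cmd` helper are hypothetical, and nothing here is confirmed to fix the problem. The script only prints the rearranged command so it can be reviewed before being run by hand as root:

```shell
#!/bin/sh
# Sketch only: build the assemble command with the members that were
# "ignored" (sde, sdf, sdg) listed first, as suggested above.
# The command is printed, not executed.
ARRAY=/dev/md0
FIRST="/dev/sde /dev/sdf /dev/sdg"   # members that report sdc as failed
REST="/dev/sdc /dev/sdd"             # members that still report AAAAA

# Hypothetical helper: emit the reordered mdadm command line.
reordered_cmd() {
    echo "mdadm --assemble --force --verbose $ARRAY $FIRST $REST"
}

reordered_cmd
```

If a partially assembled /dev/md0 is still present from the earlier attempt, it would probably have to be stopped first (`mdadm --stop /dev/md0`) before a new assemble can claim the devices.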
The superblocks all have the same event count, but report different things
about which devices have failed. That shouldn't really happen, but it
obviously did.
I'll see if I can figure out how to make mdadm cope correctly with this
situation.
Thanks for the report,
NeilBrown
* Re: mdadm ignoring X as it reports Y as failed
From: NeilBrown @ 2013-07-08 3:58 UTC
To: Marek Jaros; +Cc: linux-raid
On Sat, 06 Jul 2013 21:06:14 +0200 "Marek Jaros" <mjaros1@nbox.cz> wrote:
> Hey everybody.
>
> To keep it short: I have a RAID-5 md array, and just today 2 out of the 5
> drives dropped out. It was a cable issue and has since been fixed. The
> array was not being written to or otherwise in use, so no data has
> been lost.
>
> However, when I attempted to reassemble the array with
>
> mdadm --assemble --force --verbose /dev/md0 /dev/sdc /dev/sdd /dev/sde /dev/sdf /dev/sdg
>
>
> I got the following errors
>
> mdadm: looking for devices for /dev/md0
> mdadm: /dev/sdc is identified as a member of /dev/md0, slot 0.
> mdadm: /dev/sdd is identified as a member of /dev/md0, slot 1.
> mdadm: /dev/sde is identified as a member of /dev/md0, slot 2.
> mdadm: /dev/sdf is identified as a member of /dev/md0, slot 3.
> mdadm: /dev/sdg is identified as a member of /dev/md0, slot 4.
> mdadm: ignoring /dev/sde as it reports /dev/sdc as failed
> mdadm: ignoring /dev/sdf as it reports /dev/sdc as failed
> mdadm: ignoring /dev/sdg as it reports /dev/sdc as failed
> mdadm: added /dev/sdd to /dev/md0 as 1
> mdadm: no uptodate device for slot 2 of /dev/md0
> mdadm: no uptodate device for slot 3 of /dev/md0
> mdadm: no uptodate device for slot 4 of /dev/md0
> mdadm: added /dev/sdc to /dev/md0 as 0
> mdadm: /dev/md0 assembled from 2 drives - not enough to start the array.
>
>
> After running --examine* I found that the Array State field inside the
> superblock does indeed mark the first two drives as missing. That is no
> longer true, but I can't force mdadm to assemble the array or update the
> superblock info.
>
> So is there any way to force mdadm to assemble the array? Or perhaps to
> edit the superblock info manually? I'd rather avoid having to recreate
> the array from scratch.
>
> Any help or pointers with more info are highly appreciated. Thank you.
>
Hi again,
could you tell me what kernel you are running? Because as far as I can tell
the state of the devices that you reported is impossible!
The interesting bit of the --examine output is:

/dev/sdc:
    Update Time : Sat Jul 6 14:43:14 2013
         Events : 2742
    Array State : AAAAA ('A' == active, '.' == missing)
/dev/sdd:
    Update Time : Sat Jul 6 14:29:42 2013
         Events : 2742
    Array State : AAAAA ('A' == active, '.' == missing)
/dev/sde:
    Update Time : Sat Jul 6 14:46:15 2013
         Events : 2742
    Array State : ..AAA ('A' == active, '.' == missing)
/dev/sdg:
    Update Time : Sat Jul 6 14:46:15 2013
         Events : 2742
    Array State : ..AAA ('A' == active, '.' == missing)

From this I can see that:

 at 14:29:42 everything was fine and all the superblocks were updated.
 at 14:43:14 everything still seemed to be fine and md tried to update the
             superblocks again (it does that from time to time) but failed
             to write to /dev/sdd. This would have triggered an error, so
             it would have marked sdd as faulty and updated the superblocks
             again.
             Probably when it tried, it found that the write to sdc failed
             too, so it marked that as faulty and tried again.
 at 14:46:15 it wrote out metadata to sde and sdg reporting that sdc and sdd
             were faulty.

Every time that it updates the superblock when the array is degraded it must
update the 'Events' count. However, the Events count at 14:46:15 (after 2
devices had failed) is the same as it was at 14:43:14, before anything had
failed. That is really wrong.
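
One way to spot this kind of inconsistency quickly is to boil the --examine output down to the three fields above. The following is only a sketch: the `summarize_examine` function name is hypothetical, and it assumes the field layout matches the output quoted in this thread. It prints one line per device (device, update time, Events, Array State) and warns when two devices share an Events count but disagree on Array State:

```shell
#!/bin/sh
# Sketch only: summarize `mdadm --examine` output read from stdin.
# Assumes the "Update Time", "Events" and "Array State" lines look like
# the ones quoted in this thread.
summarize_examine() {
    awk '
        /^\/dev\//                       { dev = $1; sub(/:$/, "", dev) }
        $1 == "Update" && $2 == "Time"   { when[dev] = $7 }
        $1 == "Events"                   { events[dev] = $3 }
        $1 == "Array" && $2 == "State"   { state[dev] = $4; order[++n] = dev }
        END {
            for (i = 1; i <= n; i++) {
                d = order[i]
                print d, when[d], events[d], state[d]
                key = events[d]
                if (!(key in seen)) {
                    # first device seen with this Events count
                    seen[key] = state[d]
                } else if (seen[key] != state[d]) {
                    # same Events count, different view of the array
                    conflict = 1
                }
            }
            if (conflict)
                print "WARNING: same Events count but different Array State"
        }
    '
}
```

It could be fed with something like `mdadm --examine /dev/sd[c-g] | summarize_examine` (run as root); for the output in this thread it would be expected to print the warning, since sdc/sdd and sde/sdg all report Events 2742 with different Array State values.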
Hence the question. I need to know if this is a bug that has already been
fixed (I cannot find a fix, but you never know), or if the bug is still
present and I need to hunt some more.
Thanks,
NeilBrown