* Fwd: Unable to re-assemble a raid 10 after it has FAILED.
       [not found] <CAEp9EKjgt1RTkYzhSX-UQN2-WhV3iCKPr_DTnHN_KmorvFNsvg@mail.gmail.com>
@ 2014-06-10 11:04 ` Alberto Morell
  2014-06-10 11:30   ` NeilBrown
  2014-06-10 20:28   ` Fwd: " John Stoffel
  0 siblings, 2 replies; 4+ messages in thread
From: Alberto Morell @ 2014-06-10 11:04 UTC (permalink / raw)
  To: linux-raid

Hi!

I have a RAID 10 device with 4 components. I make the array fail by
making two components fail (using "mdadm --set-faulty <device>"). When
I add the two devices back, they are added as spare devices. After that, I
can get the array active again only by re-creating it using
"mdadm --create --assume-clean...". Re-assembling the array does not
work and fails with this error message:

mdadm: failed to RUN_ARRAY /dev/md/hdd: Input/output error
mdadm: Not enough devices to start the array.

I am trying to automate the raid configuration and the "mdadm --create
..." option is not convenient as I would have to know all the creation
parameters.

Below is the command sequence.

Thanks in advance,

Alberto Morell.


[root@os2 raid_tools]# mdadm --version
mdadm - v3.2.6 - 25th October 2012

[root@os2 ~]# mdadm --create /dev/md/hdd --metadata=1.0 --auto=md --name=hdd \
--chunk=256 --bitmap=internal --bitmap-chunk=65536 --level=raid10 --run \
--raid-devices=4 /dev/sdc1 /dev/sdd1 /dev/sde1 /dev/sdf1

[root@os2 ~]# mdadm --set-faulty /dev/md/hdd /dev/sde1
mdadm: set /dev/sde1 faulty in /dev/md/hdd
[root@os2 ~]# mdadm --set-faulty /dev/md/hdd /dev/sdf1
mdadm: set /dev/sdf1 faulty in /dev/md/hdd

[root@os2 ~]# mdadm --remove /dev/md/hdd /dev/sde1
mdadm: hot removed /dev/sde1 from /dev/md/hdd
[root@os2 ~]# mdadm --remove /dev/md/hdd /dev/sdf1
mdadm: hot removed /dev/sdf1 from /dev/md/hdd
[root@os2 ~]# mdadm --add /dev/md/hdd /dev/sde1
mdadm: re-added /dev/sde1
[root@os2 ~]# mdadm --add /dev/md/hdd /dev/sdf1
mdadm: re-added /dev/sdf1

[root@os2 raid_tools]# mdadm --detail /dev/md/hdd
/dev/md/hdd:
        Version : 1.0
  Creation Time : Tue Jun 10 10:57:23 2014
     Raid Level : raid10
  Used Dev Size : 4193280 (4.00 GiB 4.29 GB)
   Raid Devices : 4
  Total Devices : 4
    Persistence : Superblock is persistent

    Update Time : Tue Jun 10 10:59:25 2014
          State : active, FAILED, Not Started
 Active Devices : 2
Working Devices : 4
 Failed Devices : 0
  Spare Devices : 2

         Layout : near=2
     Chunk Size : 256K

           Name : hdd
           UUID : 3398fe2d:48bfbff8:6ff2acaf:7cd020c5
         Events : 60

    Number   Major   Minor   RaidDevice State
       0       8       33        0      active sync   /dev/sdc1
       1       8       49        1      active sync   /dev/sdd1
       2       0        0        2      removed
       3       0        0        3      removed

       2       8       65        -      spare   /dev/sde1
       3       8       81        -      spare   /dev/sdf1
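
Since the goal is automation, the "State" line of the output above is probably the simplest thing for a script to test. The following is only a minimal sketch: the function names detail_state and is_failed are made up for illustration, and it assumes the "State : ..." line keeps the format shown in this output.

```shell
#!/bin/sh
# Hypothetical helper: reads `mdadm --detail` text on stdin and prints the
# value of its "State :" line, e.g. "active, FAILED, Not Started".
detail_state() {
    awk -F' : ' '/^ *State :/ { print $2; exit }'
}

# Hypothetical helper: succeeds when that state contains the word FAILED.
# Assumes the output format shown in this thread (mdadm v3.2.6).
is_failed() {
    detail_state | grep -q 'FAILED'
}
```

On a live system this could be used as `mdadm --detail /dev/md/hdd | is_failed && echo "array needs recovery"`.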

[root@os2 raid_tools]# mdadm --assemble /dev/md/hdd --force --run \
--uuid=3398fe2d:48bfbff8:6ff2acaf:7cd020c5 --verbose /dev/sdc1 \
/dev/sdd1 /dev/sde1 /dev/sdf1
mdadm: looking for devices for /dev/md/hdd
mdadm: /dev/sdc1 is identified as a member of /dev/md/hdd, slot 0.
mdadm: /dev/sdd1 is identified as a member of /dev/md/hdd, slot 1.
mdadm: /dev/sde1 is identified as a member of /dev/md/hdd, slot -1.
mdadm: /dev/sdf1 is identified as a member of /dev/md/hdd, slot -1.
mdadm: added /dev/sdd1 to /dev/md/hdd as 1
mdadm: no uptodate device for slot 2 of /dev/md/hdd
mdadm: no uptodate device for slot 3 of /dev/md/hdd
mdadm: added /dev/sde1 to /dev/md/hdd as -1
mdadm: added /dev/sdf1 to /dev/md/hdd as -1
mdadm: added /dev/sdc1 to /dev/md/hdd as 0
mdadm: failed to RUN_ARRAY /dev/md/hdd: Input/output error
mdadm: Not enough devices to start the array.
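
The "slot -1" lines in the verbose output above are the members that no longer hold a role in the array, which appears to be why slots 2 and 3 have no up-to-date device. A small sketch for picking them out in a script follows; the function name lost_role_members is hypothetical, and it assumes the message wording stays exactly as printed above.

```shell
#!/bin/sh
# Hypothetical helper: reads `mdadm --assemble --verbose` output on stdin
# and prints every member device that was identified in slot -1.
# Assumes the message wording shown in this thread (mdadm v3.2.6).
lost_role_members() {
    sed -n 's/^mdadm: \(.*\) is identified as a member of .*, slot -1\.$/\1/p'
}
```

Fed the output above, it would list /dev/sde1 and /dev/sdf1.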


* Re: Unable to re-assemble a raid 10 after it has FAILED.
  2014-06-10 11:04 ` Fwd: Unable to re-assemble a raid 10 after it has FAILED Alberto Morell
@ 2014-06-10 11:30   ` NeilBrown
  2014-06-11  7:24     ` Alberto Morell
  2014-06-10 20:28   ` Fwd: " John Stoffel
  1 sibling, 1 reply; 4+ messages in thread
From: NeilBrown @ 2014-06-10 11:30 UTC (permalink / raw)
  To: Alberto Morell; +Cc: linux-raid

On Tue, 10 Jun 2014 13:04:28 +0200 Alberto Morell <amp4tj@gmail.com> wrote:

> Hi!
> 
> I have a RAID 10 device with 4 components. I make the array fail by
> making two components fail (using "mdadm --set-faulty <device>"). When
> I add the two devices back, they are added as spare devices. After that, I
> can get the array active again only by re-creating it using
> "mdadm --create --assume-clean...". Re-assembling the array does not
> work and fails with this error message:
> 
> mdadm: failed to RUN_ARRAY /dev/md/hdd: Input/output error
> mdadm: Not enough devices to start the array.
> 
> I am trying to automate the raid configuration and the "mdadm --create
> ..." option is not convenient as I would have to know all the creation
> parameters.
> 
> Below is the command sequence.
> 
> Thanks in advance,
> 
> Alberto Morell.
> 
> 
> [root@os2 raid_tools]# mdadm --version
> mdadm - v3.2.6 - 25th October 2012
> 
> [root@os2 ~]# mdadm --create /dev/md/hdd --metadata=1.0 --auto=md --name=hdd \
> --chunk=256 --bitmap=internal --bitmap-chunk=65536 --level=raid10 --run \
> --raid-devices=4 /dev/sdc1 /dev/sdd1 /dev/sde1 /dev/sdf1
> 
> [root@os2 ~]# mdadm --set-faulty /dev/md/hdd /dev/sde1
> mdadm: set /dev/sde1 faulty in /dev/md/hdd
> [root@os2 ~]# mdadm --set-faulty /dev/md/hdd /dev/sdf1
> mdadm: set /dev/sdf1 faulty in /dev/md/hdd

So now your array is dead.  I assume you expected this.

> 
> [root@os2 ~]# mdadm --remove /dev/md/hdd /dev/sde1
> mdadm: hot removed /dev/sde1 from /dev/md/hdd
> [root@os2 ~]# mdadm --remove /dev/md/hdd /dev/sdf1
> mdadm: hot removed /dev/sdf1 from /dev/md/hdd
> [root@os2 ~]# mdadm --add /dev/md/hdd /dev/sde1
> mdadm: re-added /dev/sde1
> [root@os2 ~]# mdadm --add /dev/md/hdd /dev/sdf1
> mdadm: re-added /dev/sdf1

Adding good things to dead things doesn't make the dead thing undead.
You don't want to do this.

If an array is dead, the best thing to do is to stop it before trying to
re-add anything, and then (after checking that all the devices are really
working) assemble it with --force.
That is the best way to breathe life into a dead array.

NeilBrown


* Re: Fwd: Unable to re-assemble a raid 10 after it has FAILED.
  2014-06-10 11:04 ` Fwd: Unable to re-assemble a raid 10 after it has FAILED Alberto Morell
  2014-06-10 11:30   ` NeilBrown
@ 2014-06-10 20:28   ` John Stoffel
  1 sibling, 0 replies; 4+ messages in thread
From: John Stoffel @ 2014-06-10 20:28 UTC (permalink / raw)
  To: Alberto Morell; +Cc: linux-raid

>>>>> "Alberto" == Alberto Morell <amp4tj@gmail.com> writes:

Alberto> I have a RAID 10 device with 4 components. I make the array
Alberto> fail by making two components fail (using "mdadm
Alberto> --set-faulty <device>"). When I add the two devices back,
Alberto> they are added as spare devices. After that, I can get the array
Alberto> active again only by re-creating it using "mdadm
Alberto> --create --assume-clean...". Re-assembling the array does
Alberto> not work and fails with this error message:

Alberto> mdadm: failed to RUN_ARRAY /dev/md/hdd: Input/output error
Alberto> mdadm: Not enough devices to start the array.

Alberto> I am trying to automate the raid configuration and the "mdadm --create
Alberto> ..." option is not convenient as I would have to know all the creation
Alberto> parameters.

Alberto> Below is the command sequence.

Alberto> Thanks in advance,

Alberto> Alberto Morell.


Alberto> [root@os2 raid_tools]# mdadm --version
Alberto> mdadm - v3.2.6 - 25th October 2012

Alberto> [root@os2 ~]# mdadm --create /dev/md/hdd --metadata=1.0 --auto=md --name=hdd \
Alberto> --chunk=256 --bitmap=internal --bitmap-chunk=65536 --level=raid10 --run \
Alberto> --raid-devices=4 /dev/sdc1 /dev/sdd1 /dev/sde1 /dev/sdf1

Alberto> [root@os2 ~]# mdadm --set-faulty /dev/md/hdd /dev/sde1
Alberto> mdadm: set /dev/sde1 faulty in /dev/md/hdd
Alberto> [root@os2 ~]# mdadm --set-faulty /dev/md/hdd /dev/sdf1
Alberto> mdadm: set /dev/sdf1 faulty in /dev/md/hdd

You do realize that you killed the last two drives in the array,
right?  sdc1 and sdd1 are mirrored with each other, and sde1 and sdf1
are mirrored with each other.  When you kill both drives of a pair, the
entire array *must* go offline to try to save data.  That is not likely
to work, since the filesystem spread across all four drives is probably
toast now...

At least that's how I think RAID10 works by default.  If you use
one of the other layouts, you might have lucked out by not killing
both halves of a raid pair.
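
The pairing described above can be sketched as a small check. This is only an illustration under stated assumptions: it covers the "near" layout only (the array in this thread reports "Layout : near=2"), it assumes the number of raid devices is a multiple of the copy count so that copies sit on consecutive slots, and the function name raid10_near_survives is made up.

```shell
#!/bin/sh
# Sketch: does a RAID10 "near" array still have one copy of everything
# after losing the given slots?
# Assumption: <raid_devices> is a multiple of <copies>, so the copies of
# each chunk live on consecutive slots: (0,1), (2,3), ... for near=2.
# Usage: raid10_near_survives <raid_devices> <copies> <failed_slot>...
raid10_near_survives() {
    disks=$1
    copies=$2
    shift 2
    failed=" $* "
    first=0
    while [ "$first" -lt "$disks" ]; do
        alive=0
        i=0
        while [ "$i" -lt "$copies" ]; do
            slot=$((first + i))
            case "$failed" in
                *" $slot "*) ;;                   # this copy is gone
                *) alive=$((alive + 1)) ;;        # this copy is still there
            esac
            i=$((i + 1))
        done
        [ "$alive" -gt 0 ] || return 1            # a whole mirror group is lost
        first=$((first + copies))
    done
    return 0
}
```

With 4 devices and near=2, losing slots 2 and 3 (sde1 and sdf1 here) fails the check, while losing slots 1 and 2 would pass it.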

John


* Re: Unable to re-assemble a raid 10 after it has FAILED.
  2014-06-10 11:30   ` NeilBrown
@ 2014-06-11  7:24     ` Alberto Morell
  0 siblings, 0 replies; 4+ messages in thread
From: Alberto Morell @ 2014-06-11  7:24 UTC (permalink / raw)
  To: NeilBrown; +Cc: linux-raid

On Tue, Jun 10, 2014 at 1:30 PM, NeilBrown <neilb@suse.de> wrote:
> On Tue, 10 Jun 2014 13:04:28 +0200 Alberto Morell <amp4tj@gmail.com> wrote:
>
>> Hi!
>>
>> I have a RAID 10 device with 4 components. I make the array fail by
>> making two components fail (using "mdadm --set-faulty <device>"). When
>> I add the two devices back, they are added as spare devices. After that, I
>> can get the array active again only by re-creating it using
>> "mdadm --create --assume-clean...". Re-assembling the array does not
>> work and fails with this error message:
>>
>> mdadm: failed to RUN_ARRAY /dev/md/hdd: Input/output error
>> mdadm: Not enough devices to start the array.
>>
>> I am trying to automate the raid configuration and the "mdadm --create
>> ..." option is not convenient as I would have to know all the creation
>> parameters.
>>
>> Below is the command sequence.
>>
>> Thanks in advance,
>>
>> Alberto Morell.
>>
>>
>> [root@os2 raid_tools]# mdadm --version
>> mdadm - v3.2.6 - 25th October 2012
>>
>> [root@os2 ~]# mdadm --create /dev/md/hdd --metadata=1.0 --auto=md --name=hdd \
>> --chunk=256 --bitmap=internal --bitmap-chunk=65536 --level=raid10 --run \
>> --raid-devices=4 /dev/sdc1 /dev/sdd1 /dev/sde1 /dev/sdf1
>>
>> [root@os2 ~]# mdadm --set-faulty /dev/md/hdd /dev/sde1
>> mdadm: set /dev/sde1 faulty in /dev/md/hdd
>> [root@os2 ~]# mdadm --set-faulty /dev/md/hdd /dev/sdf1
>> mdadm: set /dev/sdf1 faulty in /dev/md/hdd
>
> So now your array is dead.  I assume you expected this.
>

>>
>> [root@os2 ~]# mdadm --remove /dev/md/hdd /dev/sde1
>> mdadm: hot removed /dev/sde1 from /dev/md/hdd
>> [root@os2 ~]# mdadm --remove /dev/md/hdd /dev/sdf1
>> mdadm: hot removed /dev/sdf1 from /dev/md/hdd
>> [root@os2 ~]# mdadm --add /dev/md/hdd /dev/sde1
>> mdadm: re-added /dev/sde1
>> [root@os2 ~]# mdadm --add /dev/md/hdd /dev/sdf1
>> mdadm: re-added /dev/sdf1
>
> Adding good things to dead things doesn't make the dead thing undead.
> You don't want to do this.
>
> If an array is dead, the best thing to do is to stop it before trying to
> re-add anything, and then (after checking that all the devices are really
> working) assemble it with --force.
> That is the best way to breathe life into a dead array.
>
> NeilBrown

That sequence solved the problem:
1. After the array is dead, stop it

[root@os2 raid_tools]# mdadm --stop /dev/md127
mdadm: stopped /dev/md127

2. Then, assemble the array with the proper components.

[root@os2 raid_tools]# mdadm --assemble /dev/md/hdd --force --run \
--uuid=085d48f6:e6e38d1c:a476006f:cd3a3e51 --verbose /dev/sdc1 \
/dev/sdd1 /dev/sde1 /dev/sdf1

The array started with only three components, but then I added the
fourth with a plain "mdadm --add".

(Before, I was re-adding the failed components first, and only after
that stopping and re-assembling the array.)
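
For automation, the two steps above could be wrapped roughly as follows. This is a sketch, not a tested recovery tool: the names run, revive_array and the RUN variable are made up for illustration, it assumes the name passed in refers to the same array that is currently running (here /dev/md/hdd and /dev/md127), and by default it only prints the commands because running them changes array state.

```shell
#!/bin/sh
# Hypothetical wrapper: print the command unless RUN=1 is set, in which
# case actually execute it.
run() {
    if [ "${RUN:-0}" = 1 ]; then
        "$@"
    else
        echo "$*"
    fi
}

# Sketch of the sequence from this thread: stop the dead array first,
# then force-assemble it from its members.
# Usage: revive_array <md-device> <member>...
revive_array() {
    md=$1
    shift
    run mdadm --stop "$md"
    run mdadm --assemble "$md" --force --run "$@"
    # If the array comes up with a member missing, it can be added back
    # afterwards with: mdadm --add <md-device> <member>
}
```

Calling `revive_array /dev/md/hdd /dev/sdc1 /dev/sdd1 /dev/sde1 /dev/sdf1` prints the two mdadm commands; prefixing it with `RUN=1` would execute them.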

Thanks a lot!

Alberto Morell.
