* RAID showing all devices as spares after partial unplug
       [not found] <CAB=7dhk0AV1dKL2cngt1eZXJwCVrfixfLE5z=J1i-7tqdL-6QA@mail.gmail.com>
@ 2011-09-17 20:39 ` Mike Hartman
  2011-09-17 22:16   ` Mike Hartman
  0 siblings, 1 reply; 12+ messages in thread
From: Mike Hartman @ 2011-09-17 20:39 UTC (permalink / raw)
  To: linux-raid

I have 11 drives in a RAID 6 array. 6 are plugged into one eSATA
enclosure, the other 4 are in another. These eSATA cables are prone to
loosening when I'm working on nearby hardware.

If that happens and I start the host up, big chunks of the array are
missing and things could get ugly, so I cooked up a custom startup
script that verifies each device is present before starting the array
with

  mdadm --assemble --no-degraded -u 4fd7659f:12044eff:ba25240d:de22249d /dev/md3

So I thought I was covered: if something got unplugged I would see the
array failing to start at boot, and I could shut down, fix the cables
and try again. However, I hit a new scenario today where one of the
plugs was loosened while everything was turned on.

The good news is that there should have been no activity on the array
when this happened, particularly write activity. It's a big media
partition and sees much less writing than reading, I'm the only one
who uses it, and I know I wasn't transferring anything. The system
also seems to have immediately marked the filesystem read-only,
because I discovered the issue when I went to write to it later and
got a "read-only filesystem" error. So I believe the state of the
drives should be the same - nothing should be out of sync.

However, I shut the system down, fixed the cables and brought it back
up. All the devices are detected by my script and it tries to start
the array with the command I posted above, but I've ended up with
this:

  md0 : inactive sdn1[1](S) sdj1[9](S) sdm1[10](S) sdl1[11](S)
        sdk1[12](S) md3p1[8](S) sdc1[6](S) sdd1[5](S) md1p1[4](S)
        sdf1[3](S) sdh1[0](S)
        16113893731 blocks super 1.2

Instead of all coming back up, or still showing the unplugged drives
missing, everything is a spare? I'm suitably disturbed.

It seems to me that if the data on the drives still reflects the
last-good data from the array (and since no writing was going on, it
should), then this is just a matter of some metadata getting messed up
and it should be fixable. Can someone please walk me through the
commands to do that?

Mike
--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 12+ messages in thread
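[Editor's note: Mike's startup script itself isn't shown in the thread. A minimal sketch of that kind of pre-assembly check might look like the following; the helper name `check_members` and the demo device path are my own illustration, not taken from the thread.]

```shell
#!/bin/sh
# Sketch of a pre-assembly safety check like the one Mike describes:
# refuse to assemble unless every member device node is present.

check_members() {
    # Succeed only if every argument exists as a block device.
    for dev in "$@"; do
        if [ ! -b "$dev" ]; then
            echo "missing: $dev"
            return 1
        fi
    done
    return 0
}

# Real usage would be something like:
#   check_members /dev/sd[cdfhjklmn]1 /dev/md1p1 /dev/md3p1 &&
#       mdadm --assemble --no-degraded \
#             -u 4fd7659f:12044eff:ba25240d:de22249d /dev/md0
# Demo against a device that does not exist:
check_members /dev/nonexistent0 || echo "would refuse to assemble"
```

As the rest of the thread shows, a check like this only guards against devices that are missing at boot; it does nothing for a cable that comes loose while the array is already running.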
* Re: RAID showing all devices as spares after partial unplug
  2011-09-17 20:39 ` RAID showing all devices as spares after partial unplug Mike Hartman
@ 2011-09-17 22:16   ` Mike Hartman
       [not found]     ` <CAB=7dhmFQ=Rtagj2j_22cnoS0A2yoKvJgaTM+ZiqDBqhPRooDQ@mail.gmail.com>
  0 siblings, 1 reply; 12+ messages in thread
From: Mike Hartman @ 2011-09-17 22:16 UTC (permalink / raw)
  To: linux-raid

I should add that the mdadm command in question actually ends in
/dev/md0, not /dev/md3 (that's for another array). So the device name
for the array I'm seeing in mdstat DOES match the one in the assemble
command.

On Sat, Sep 17, 2011 at 4:39 PM, Mike Hartman <mike@hartmanipulation.com> wrote:
> I have 11 drives in a RAID 6 array. 6 are plugged into one esata
> enclosure, the other 4 are in another. These esata cables are prone to
> loosening when I'm working on nearby hardware.
>
> [...]
[parent not found: <CAB=7dhmFQ=Rtagj2j_22cnoS0A2yoKvJgaTM+ZiqDBqhPRooDQ@mail.gmail.com>]
* Re: RAID showing all devices as spares after partial unplug
       [not found] ` <CAB=7dhmFQ=Rtagj2j_22cnoS0A2yoKvJgaTM+ZiqDBqhPRooDQ@mail.gmail.com>
@ 2011-09-18  1:16   ` Jim Schatzman
  2011-09-18  1:34     ` Mike Hartman
  2011-09-18 23:08     ` NeilBrown
  0 siblings, 2 replies; 12+ messages in thread
From: Jim Schatzman @ 2011-09-18 1:16 UTC (permalink / raw)
  To: Mike Hartman, linux-raid

Mike-

I have seen very similar problems. I regret that electronics engineers
cannot design more secure connectors. eSATA connectors are terrible -
they come loose at the slightest tug. For this reason, I am gradually
abandoning eSATA enclosures and going to internal drives only.
Fortunately, there are some inexpensive RAID chassis available now.

I tried the same thing as you: I removed the array(s) from mdadm.conf
and I wrote a script for "/etc/cron.reboot" which assembles the array,
"no-degraded". Doing this seems to minimize the damage caused by
drives going missing prior to a reboot. However, if the drives are
disconnected while Linux is up, then either the array will stay up but
some drives will become stale, or the array will be stopped. The
behavior I usually see is that all the drives that went offline now
become "spare".

It would be nice if md would just reassemble the array once all the
drives come back online. Unfortunately, it doesn't. I would run mdadm
-E against all the drives/partitions, verifying that the metadata all
indicates that they are/were part of the expected array.

At that point, you should be able to re-create the RAID. Be sure you
list the drives in the correct order. Once the array is going again,
mount the resulting partitions RO and verify that the data is o.k.
before going RW.

Jim

At 04:16 PM 9/17/2011, Mike Hartman wrote:
>I should add that the mdadm command in question actually ends in
>/dev/md0, not /dev/md3 (that's for another array). So the device name
>for the array I'm seeing in mdstat DOES match the one in the assemble
>command.
>
> [...]
* Re: RAID showing all devices as spares after partial unplug
  2011-09-18  1:16 ` Jim Schatzman
@ 2011-09-18  1:34   ` Mike Hartman
       [not found]     ` <CAB=7dh=PymkpqRLTWiNzD-+n=XwEWnPN8nQwXg1=UmiJmZ1b1w@mail.gmail.com>
  2011-09-18 23:08   ` NeilBrown
  1 sibling, 1 reply; 12+ messages in thread
From: Mike Hartman @ 2011-09-18 1:34 UTC (permalink / raw)
  To: Jim Schatzman; +Cc: linux-raid

On Sat, Sep 17, 2011 at 9:16 PM, Jim Schatzman
<james.schatzman@futurelabusa.com> wrote:
> [...] The behavior I usually see is that all the drives that went
> offline now become "spare".

That sounds similar, although I only had 4/11 go offline and now
they're ALL spare.

> It would be nice if md would just reassemble the array once all the
> drives come back online. Unfortunately, it doesn't. I would run mdadm
> -E against all the drives/partitions, verifying that the metadata all
> indicates that they are/were part of the expected array.

I ran mdadm -E and they all correctly appear as part of the array:

for d in /dev/sd[cdfhjklmn]1 /dev/md1p1 /dev/md3p1; do echo $d; mdadm -E $d | grep Role; done

/dev/sdc1
   Device Role : Active device 5
/dev/sdd1
   Device Role : Active device 4
/dev/sdf1
   Device Role : Active device 2
/dev/sdh1
   Device Role : Active device 0
/dev/sdj1
   Device Role : Active device 10
/dev/sdk1
   Device Role : Active device 7
/dev/sdl1
   Device Role : Active device 8
/dev/sdm1
   Device Role : Active device 9
/dev/sdn1
   Device Role : Active device 1
/dev/md1p1
   Device Role : Active device 3
/dev/md3p1
   Device Role : Active device 6

But they have varying event counts (although all pretty close together):

for d in /dev/sd[cdfhjklmn]1 /dev/md1p1 /dev/md3p1; do echo $d; mdadm -E $d | grep Event; done

/dev/sdc1
         Events : 1756743
/dev/sdd1
         Events : 1756743
/dev/sdf1
         Events : 1756737
/dev/sdh1
         Events : 1756737
/dev/sdj1
         Events : 1756743
/dev/sdk1
         Events : 1756743
/dev/sdl1
         Events : 1756743
/dev/sdm1
         Events : 1756743
/dev/sdn1
         Events : 1756743
/dev/md1p1
         Events : 1756737
/dev/md3p1
         Events : 1756740

And they don't seem to agree on the overall status of the array. The
ones that never went down seem to think the array is missing 4 nodes,
while the ones that went down seem to think all the nodes are good:

for d in /dev/sd[cdfhjklmn]1 /dev/md1p1 /dev/md3p1; do echo $d; mdadm -E $d | grep State; done

/dev/sdc1
          State : clean
    Array State : .A..AA.AAAA ('A' == active, '.' == missing)
/dev/sdd1
          State : clean
    Array State : .A..AA.AAAA ('A' == active, '.' == missing)
/dev/sdf1
          State : clean
    Array State : AAAAAAAAAAA ('A' == active, '.' == missing)
/dev/sdh1
          State : clean
    Array State : AAAAAAAAAAA ('A' == active, '.' == missing)
/dev/sdj1
          State : clean
    Array State : .A..AA.AAAA ('A' == active, '.' == missing)
/dev/sdk1
          State : clean
    Array State : .A..AA.AAAA ('A' == active, '.' == missing)
/dev/sdl1
          State : clean
    Array State : .A..AA.AAAA ('A' == active, '.' == missing)
/dev/sdm1
          State : clean
    Array State : .A..AA.AAAA ('A' == active, '.' == missing)
/dev/sdn1
          State : clean
    Array State : .A..AA.AAAA ('A' == active, '.' == missing)
/dev/md1p1
          State : clean
    Array State : AAAAAAAAAAA ('A' == active, '.' == missing)
/dev/md3p1
          State : clean
    Array State : .A..AAAAAAA ('A' == active, '.' == missing)

So it seems like overall the array is intact, I just need to convince
it of that fact.

> At that point, you should be able to re-create the RAID. Be sure you
> list the drives in the correct order. Once the array is going again,
> mount the resulting partitions RO and verify that the data is o.k.
> before going RW.

Could you be more specific about how exactly I should re-create the
RAID? Should I just do --assemble --force?

> Jim
>
> [...]
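[Editor's note: for reference, the "--assemble --force" variant Mike asks about would be spelled roughly as below. This is a sketch only, guarded by an echo so nothing executes; the member list is taken from his mdadm -E loop, and whether --force is safe here is exactly the open question in the thread.]

```shell
#!/bin/sh
# Sketch of the "--assemble --force" attempt asked about above.
# DRY_RUN=echo prints the commands instead of executing them;
# remove it only after double-checking the member list.
DRY_RUN=echo

# The inactive array must be stopped before it can be reassembled.
$DRY_RUN mdadm --stop /dev/md0
$DRY_RUN mdadm --assemble --force /dev/md0 \
    /dev/sd[cdfhjklmn]1 /dev/md1p1 /dev/md3p1
```

With --force, mdadm is allowed to bump the stale event counts (1756737/1756740 vs. 1756743 here) so the outdated members can rejoin, rather than leaving them as spares.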
[parent not found: <CAB=7dh=PymkpqRLTWiNzD-+n=XwEWnPN8nQwXg1=UmiJmZ1b1w@mail.gmail.com>]
* Re: RAID showing all devices as spares after partial unplug
       [not found] ` <CAB=7dh=PymkpqRLTWiNzD-+n=XwEWnPN8nQwXg1=UmiJmZ1b1w@mail.gmail.com>
@ 2011-09-18  2:57   ` Jim Schatzman
  2011-09-18  3:07     ` Mike Hartman
  0 siblings, 1 reply; 12+ messages in thread
From: Jim Schatzman @ 2011-09-18 2:57 UTC (permalink / raw)
  To: Mike Hartman; +Cc: linux-raid

Mike-

See my response below.

Good luck!

Jim

At 07:34 PM 9/17/2011, Mike Hartman wrote:
> [...]
>
>So it seems like overall the array is intact, I just need to convince
>it of that fact.
>
>> At that point, you should be able to re-create the RAID. Be sure you
>> list the drives in the correct order. Once the array is going again,
>> mount the resulting partitions RO and verify that the data is o.k.
>> before going RW.
>
>Could you be more specific about how exactly I should re-create the
>RAID? Should I just do --assemble --force?

--> No. As far as I know, you have to use "-C"/"--create". You need to
use exactly the same array parameters that were used to create the
array the first time. Same metadata version. Same stripe size. Same
RAID mode. Physical devices in the same order.

Why do you have to use "--create", and thus open the door for
catastrophic error? I have asked the same question myself. Maybe, if
more people ping Neil Brown on this, he may be willing to find another
way.

> [...]
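[Editor's note: the re-creation Jim describes, spelled out as a guarded sketch. Every parameter here is an assumption that must be confirmed from mdadm -E output before use - the chunk size of 512K in particular is a placeholder, not something reported in the thread. The device order follows the Device Role numbers Mike posted, and --assume-clean tells mdadm not to resync, so the on-disk data is left alone while the metadata is rewritten.]

```shell
#!/bin/sh
# DANGEROUS: re-creating over live members overwrites the old metadata.
# DRY_RUN=echo keeps this from executing; verify every parameter below
# against "mdadm -E" output first. --chunk=512 is a placeholder.
DRY_RUN=echo

# Devices listed in Device Role order 0..10 from Mike's mdadm -E output.
$DRY_RUN mdadm --create /dev/md0 --assume-clean \
    --metadata=1.2 --level=6 --raid-devices=11 --chunk=512 \
    /dev/sdh1 /dev/sdn1 /dev/sdf1 /dev/md1p1 /dev/sdd1 /dev/sdc1 \
    /dev/md3p1 /dev/sdk1 /dev/sdl1 /dev/sdm1 /dev/sdj1
```

Mounting read-only afterwards, as Jim advises, is the sanity check that the parameters were right: a wrong device order or chunk size shows up as a corrupt filesystem at that point, before anything has been written back.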
* Re: RAID showing all devices as spares after partial unplug 2011-09-18 2:57 ` Jim Schatzman @ 2011-09-18 3:07 ` Mike Hartman [not found] ` <CAB=7dh=9UcEWJjLbOvPLu1Ubij0X4i6+SQ-6L9VE5gHLvcJVcw@mail.gmail.com> 0 siblings, 1 reply; 12+ messages in thread From: Mike Hartman @ 2011-09-18 3:07 UTC (permalink / raw) To: Jim Schatzman; +Cc: linux-raid Yikes. That's a pretty terrifying prospect. On Sat, Sep 17, 2011 at 10:57 PM, Jim Schatzman <james.schatzman@futurelabusa.com> wrote: > Mike- > > See my response below. > > Good luck! > > Jim > > > At 07:34 PM 9/17/2011, Mike Hartman wrote: >>On Sat, Sep 17, 2011 at 9:16 PM, Jim Schatzman >><james.schatzman@futurelabusa.com> wrote: >>> Mike- >>> >>> I have seen very similar problems. I regret that electronics engineers cannot design more secure connectors. eSata connector are terrible - they come loose at the slightest tug. For this reason, I am gradually abandoning eSata enclosures and going to internal drives only. Fortunately, there are some inexpensive RAID chassis available now. >>> >>> I tried the same thing as you. I removed the array(s) from mdadm.conf and I wrote a script for "/etc/cron.reboot" which assembles the array, "no-degraded". Doing this seems to minimize the damage caused by drives prior to a reboot. However, if the drives are disconnected while Linux is up, then either the array will stay up but some drives will become stale or the array will be stopped. The behavior I usually see is that all the drives that went offline now become "spare". >>> >> >>That sounds similar, although I only had 4/11 go offline and now >>they're ALL spare. >> >>> It would be nice if md would just reassemble the array once all the drives come back online. Unfortunately, it doesn't. I would run mdadm -E against all the drives/partitions, verifying that the metadata all indicates that they are/were part of the expected array. 
>> >>I ran mdadm -E and they all correctly appear as part of the array: >> >>for d in /dev/sd[cdfhjklmn]1 /dev/md1p1 /dev/md3p1; do echo $d; mdadm >>-E $d | grep Role; done >> >>/dev/sdc1 >> Device Role : Active device 5 >>/dev/sdd1 >> Device Role : Active device 4 >>/dev/sdf1 >> Device Role : Active device 2 >>/dev/sdh1 >> Device Role : Active device 0 >>/dev/sdj1 >> Device Role : Active device 10 >>/dev/sdk1 >> Device Role : Active device 7 >>/dev/sdl1 >> Device Role : Active device 8 >>/dev/sdm1 >> Device Role : Active device 9 >>/dev/sdn1 >> Device Role : Active device 1 >>/dev/md1p1 >> Device Role : Active device 3 >>/dev/md3p1 >> Device Role : Active device 6 >> >>But they have varying event counts (although all pretty close together): >> >>for d in /dev/sd[cdfhjklmn]1 /dev/md1p1 /dev/md3p1; do echo $d; mdadm >>-E $d | grep Event; done >> >>/dev/sdc1 >> Events : 1756743 >>/dev/sdd1 >> Events : 1756743 >>/dev/sdf1 >> Events : 1756737 >>/dev/sdh1 >> Events : 1756737 >>/dev/sdj1 >> Events : 1756743 >>/dev/sdk1 >> Events : 1756743 >>/dev/sdl1 >> Events : 1756743 >>/dev/sdm1 >> Events : 1756743 >>/dev/sdn1 >> Events : 1756743 >>/dev/md1p1 >> Events : 1756737 >>/dev/md3p1 >> Events : 1756740 >> >>And they don't seem to agree on the overall status of the array. The >>ones that never went down seem to think the array is missing 4 nodes, >>while the ones that went down seem to think all the nodes are good: >> >>for d in /dev/sd[cdfhjklmn]1 /dev/md1p1 /dev/md3p1; do echo $d; mdadm >>-E $d | grep State; done >> >>/dev/sdc1 >> State : clean >> Array State : .A..AA.AAAA ('A' == active, '.' == missing) >>/dev/sdd1 >> State : clean >> Array State : .A..AA.AAAA ('A' == active, '.' == missing) >>/dev/sdf1 >> State : clean >> Array State : AAAAAAAAAAA ('A' == active, '.' == missing) >>/dev/sdh1 >> State : clean >> Array State : AAAAAAAAAAA ('A' == active, '.' == missing) >>/dev/sdj1 >> State : clean >> Array State : .A..AA.AAAA ('A' == active, '.' 
== missing) >>/dev/sdk1 >> State : clean >> Array State : .A..AA.AAAA ('A' == active, '.' == missing) >>/dev/sdl1 >> State : clean >> Array State : .A..AA.AAAA ('A' == active, '.' == missing) >>/dev/sdm1 >> State : clean >> Array State : .A..AA.AAAA ('A' == active, '.' == missing) >>/dev/sdn1 >> State : clean >> Array State : .A..AA.AAAA ('A' == active, '.' == missing) >>/dev/md1p1 >> State : clean >> Array State : AAAAAAAAAAA ('A' == active, '.' == missing) >>/dev/md3p1 >> State : clean >> Array State : .A..AAAAAAA ('A' == active, '.' == missing) >> >>So it seems like overall the array is intact, I just need to convince >>it of that fact. >> >>> At that point, you should be able ro re-create the RAID. Be sure you list the drives in the correct order. Once the array is going again, mount the resulting partitions RO and verify that the data is o.k. before going RW. >> >>Could you be more specific about how exactly I should re-create the >>RAID? Should I just do --assemble --force? > > > > --> No. As far as I know, you have to use "-C"/"--create". You need to use exactly the same array parameters that were used to create the array the first time. Same metadata version. Same stripe size. Raid mode the same. Physical devices in the same order. > > Why do you have to use "--create", and thus open the door for catastropic error?? I have asked the same question myself. Maybe, if more people ping Neil Brown on this, he may be willing to find another way. > > > >>> >>> Jim >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> At 04:16 PM 9/17/2011, Mike Hartman wrote: >>>>I should add that the mdadm command in question actually ends in >>>>/dev/md0, not /dev/md3 (that's for another array). So the device name >>>>for the array I'm seeing in mdstat DOES match the one in the assemble >>>>command. >>>> >>>>On Sat, Sep 17, 2011 at 4:39 PM, Mike Hartman <mike@hartmanipulation.com> wrote: >>>>> I have 11 drives in a RAID 6 array. 
6 are plugged into one esata >>>>> enclosure, the other 4 are in another. These esata cables are prone to >>>>> loosening when I'm working on nearby hardware. >>>>> >>>>> If that happens and I start the host up, big chunks of the array are >>>>> missing and things could get ugly. Thus I cooked up a custom startup >>>>> script that verifies each device is present before starting the array >>>>> with >>>>> >>>>> mdadm --assemble --no-degraded -u 4fd7659f:12044eff:ba25240d: >>>>> de22249d /dev/md3 >>>>> >>>>> So I thought I was covered. In case something got unplugged I would >>>>> see the array failing to start at boot and I could shut down, fix the >>>>> cables and try again. However, I hit a new scenario today where one of >>>>> the plugs was loosened while everything was turned on. >>>>> >>>>> The good news is that there should have been no activity on the array >>>>> when this happened, particularly write activity. It's a big media >>>>> partition and sees much less writing then reading. I'm also the only >>>>> one that uses it and I know I wasn't transferring anything. The system >>>>> also seems to have immediately marked the filesystem read-only, >>>>> because I discovered the issue when I went to write to it later and >>>>> got a "read-only filesystem" error. So I believe the state of the >>>>> drives should be the same - nothing should be out of sync. >>>>> >>>>> However, I shut the system down, fixed the cables and brought it back >>>>> up. All the devices are detected by my script and it tries to start >>>>> the array with the command I posted above, but I've ended up with >>>>> this: >>>>> >>>>> md0 : inactive sdn1[1](S) sdj1[9](S) sdm1[10](S) sdl1[11](S) >>>>> sdk1[12](S) md3p1[8](S) sdc1[6](S) sdd1[5](S) md1p1[4](S) sdf1[3](S) >>>>> sdh1[0](S) >>>>> 16113893731 blocks super 1.2 >>>>> >>>>> Instead of all coming back up, or still showing the unplugged drives >>>>> missing, everything is a spare? I'm suitably disturbed. 
>>>>>
>>>>> It seems to me that if the data on the drives still reflects the
>>>>> last-good data from the array (and since no writing was going on it
>>>>> should) then this is just a matter of some metadata getting messed up
>>>>> and it should be fixable. Can someone please walk me through the
>>>>> commands to do that?
>>>>>
>>>>> Mike
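A pre-assembly check like the one Mike describes can be sketched as a small POSIX shell function. The UUID below is the one from his assemble command; the member device list and the array name are illustrative, and the function only *prints* the command rather than running it:

```shell
#!/bin/sh
# Sketch of a startup check like Mike's: refuse to start the array
# unless every member block device is present. The UUID is from his
# command; the member list below is illustrative.
UUID="4fd7659f:12044eff:ba25240d:de22249d"

check_members() {
    for dev in "$@"; do
        if [ ! -b "$dev" ]; then
            echo "missing: $dev" >&2
            return 1
        fi
    done
    # All members present -- print the command this script would run.
    echo "mdadm --assemble --no-degraded -u $UUID /dev/md0"
}

# Dry run against an illustrative member list:
check_members /dev/sdc1 /dev/sdd1 || echo "refusing to assemble degraded" >&2
```

Running the printed command for real would require root, and — as this thread shows — keeping `--no-degraded` means a loose cable surfaces as a failed assemble rather than a silently degraded start.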
[parent not found: <CAB=7dh=9UcEWJjLbOvPLu1Ubij0X4i6+SQ-6L9VE5gHLvcJVcw@mail.gmail.com>]
* RE: RAID showing all devices as spares after partial unplug
  [not found] ` <CAB=7dh=9UcEWJjLbOvPLu1Ubij0X4i6+SQ-6L9VE5gHLvcJVcw@mail.gmail.com>
@ 2011-09-18  3:59 ` Mike Hartman
  2011-09-18 13:23   ` Phil Turmel
  0 siblings, 1 reply; 12+ messages in thread
From: Mike Hartman @ 2011-09-18 3:59 UTC (permalink / raw)
To: linux-raid

On Sat, Sep 17, 2011 at 11:07 PM, Mike Hartman
<mike@hartmanipulation.com> wrote:
> Yikes. That's a pretty terrifying prospect.
>
> On Sat, Sep 17, 2011 at 10:57 PM, Jim Schatzman
> <james.schatzman@futurelabusa.com> wrote:
>> Mike-
>>
>> See my response below.
>>
>> Good luck!
>>
>> Jim
>>
>> At 07:34 PM 9/17/2011, Mike Hartman wrote:
>>>On Sat, Sep 17, 2011 at 9:16 PM, Jim Schatzman
>>><james.schatzman@futurelabusa.com> wrote:
>>>> Mike-
>>>>
>>>> I have seen very similar problems. I regret that electronics engineers cannot design more secure connectors. eSata connectors are terrible - they come loose at the slightest tug. For this reason, I am gradually abandoning eSata enclosures and going to internal drives only. Fortunately, there are some inexpensive RAID chassis available now.
>>>>
>>>> I tried the same thing as you. I removed the array(s) from mdadm.conf and I wrote a script for "/etc/cron.reboot" which assembles the array, "no-degraded". Doing this seems to minimize the damage caused by drives prior to a reboot. However, if the drives are disconnected while Linux is up, then either the array will stay up but some drives will become stale or the array will be stopped. The behavior I usually see is that all the drives that went offline now become "spare".
>>>
>>>That sounds similar, although I only had 4/11 go offline and now
>>>they're ALL spare.
>>>
>>>> It would be nice if md would just reassemble the array once all the drives come back online. Unfortunately, it doesn't. I would run mdadm -E against all the drives/partitions, verifying that the metadata all indicates that they are/were part of the expected array.
>>>
>>>I ran mdadm -E and they all correctly appear as part of the array:
>>>
>>>for d in /dev/sd[cdfhjklmn]1 /dev/md1p1 /dev/md3p1; do echo $d; mdadm
>>>-E $d | grep Role; done
>>>
>>>/dev/sdc1
>>> Device Role : Active device 5
>>>/dev/sdd1
>>> Device Role : Active device 4
>>>/dev/sdf1
>>> Device Role : Active device 2
>>>/dev/sdh1
>>> Device Role : Active device 0
>>>/dev/sdj1
>>> Device Role : Active device 10
>>>/dev/sdk1
>>> Device Role : Active device 7
>>>/dev/sdl1
>>> Device Role : Active device 8
>>>/dev/sdm1
>>> Device Role : Active device 9
>>>/dev/sdn1
>>> Device Role : Active device 1
>>>/dev/md1p1
>>> Device Role : Active device 3
>>>/dev/md3p1
>>> Device Role : Active device 6
>>>
>>>But they have varying event counts (although all pretty close together):
>>>
>>>for d in /dev/sd[cdfhjklmn]1 /dev/md1p1 /dev/md3p1; do echo $d; mdadm
>>>-E $d | grep Event; done
>>>
>>>/dev/sdc1
>>> Events : 1756743
>>>/dev/sdd1
>>> Events : 1756743
>>>/dev/sdf1
>>> Events : 1756737
>>>/dev/sdh1
>>> Events : 1756737
>>>/dev/sdj1
>>> Events : 1756743
>>>/dev/sdk1
>>> Events : 1756743
>>>/dev/sdl1
>>> Events : 1756743
>>>/dev/sdm1
>>> Events : 1756743
>>>/dev/sdn1
>>> Events : 1756743
>>>/dev/md1p1
>>> Events : 1756737
>>>/dev/md3p1
>>> Events : 1756740
>>>
>>>And they don't seem to agree on the overall status of the array. The
>>>ones that never went down seem to think the array is missing 4 nodes,
>>>while the ones that went down seem to think all the nodes are good:
>>>
>>>for d in /dev/sd[cdfhjklmn]1 /dev/md1p1 /dev/md3p1; do echo $d; mdadm
>>>-E $d | grep State; done
>>>
>>>/dev/sdc1
>>> State : clean
>>> Array State : .A..AA.AAAA ('A' == active, '.' == missing)
>>>/dev/sdd1
>>> State : clean
>>> Array State : .A..AA.AAAA ('A' == active, '.' == missing)
>>>/dev/sdf1
>>> State : clean
>>> Array State : AAAAAAAAAAA ('A' == active, '.' == missing)
>>>/dev/sdh1
>>> State : clean
>>> Array State : AAAAAAAAAAA ('A' == active, '.'
== missing)
>>>/dev/sdj1
>>> State : clean
>>> Array State : .A..AA.AAAA ('A' == active, '.' == missing)
>>>/dev/sdk1
>>> State : clean
>>> Array State : .A..AA.AAAA ('A' == active, '.' == missing)
>>>/dev/sdl1
>>> State : clean
>>> Array State : .A..AA.AAAA ('A' == active, '.' == missing)
>>>/dev/sdm1
>>> State : clean
>>> Array State : .A..AA.AAAA ('A' == active, '.' == missing)
>>>/dev/sdn1
>>> State : clean
>>> Array State : .A..AA.AAAA ('A' == active, '.' == missing)
>>>/dev/md1p1
>>> State : clean
>>> Array State : AAAAAAAAAAA ('A' == active, '.' == missing)
>>>/dev/md3p1
>>> State : clean
>>> Array State : .A..AAAAAAA ('A' == active, '.' == missing)
>>>
>>>So it seems like overall the array is intact, I just need to convince
>>>it of that fact.
>>>
>>>> At that point, you should be able to re-create the RAID. Be sure you list the drives in the correct order. Once the array is going again, mount the resulting partitions RO and verify that the data is o.k. before going RW.
>>>
>>>Could you be more specific about how exactly I should re-create the
>>>RAID? Should I just do --assemble --force?
>>
>> --> No. As far as I know, you have to use "-C"/"--create". You need to use exactly the same array parameters that were used to create the array the first time. Same metadata version. Same stripe size. Raid mode the same. Physical devices in the same order.
>>
>> Why do you have to use "--create", and thus open the door for catastrophic error?? I have asked the same question myself. Maybe, if more people ping Neil Brown on this, he may be willing to find another way.

Is there any way to construct the exact create command using the info
given by mdadm -E? This array started as a RAID 5 that was reshaped
into a 6 and then grown many times, so I don't have a single original
create command lying around to reference. I know the devices and their
order (as previously listed) - are all the other options I need to
specify part of the -E output?
If so, can someone clarify how that maps into the command? Here's an
example output:

mdadm -E /dev/sdh1
/dev/sdh1:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x1
     Array UUID : 714c307e:71626854:2c2cc6c8:c67339a0
           Name : odin:0  (local to host odin)
  Creation Time : Sat Sep 4 12:52:59 2010
     Raid Level : raid6
   Raid Devices : 11

 Avail Dev Size : 2929691614 (1396.99 GiB 1500.00 GB)
     Array Size : 26367220224 (12572.87 GiB 13500.02 GB)
  Used Dev Size : 2929691136 (1396.99 GiB 1500.00 GB)
    Data Offset : 2048 sectors
   Super Offset : 8 sectors
          State : clean
    Device UUID : 384875df:23db9d35:f63202d0:01c03ba2

Internal Bitmap : 2 sectors from superblock
    Update Time : Thu Sep 15 05:10:57 2011
       Checksum : f679cecb - correct
         Events : 1756737

         Layout : left-symmetric
     Chunk Size : 256K

    Device Role : Active device 0
    Array State : AAAAAAAAAAA ('A' == active, '.' == missing)

Mike

>>>>
>>>> Jim
>>>>
>>>> At 04:16 PM 9/17/2011, Mike Hartman wrote:
>>>>>I should add that the mdadm command in question actually ends in
>>>>>/dev/md0, not /dev/md3 (that's for another array). So the device name
>>>>>for the array I'm seeing in mdstat DOES match the one in the assemble
>>>>>command.
>>>>>
>>>>>On Sat, Sep 17, 2011 at 4:39 PM, Mike Hartman <mike@hartmanipulation.com> wrote:
>>>>>> I have 11 drives in a RAID 6 array. 6 are plugged into one esata
>>>>>> enclosure, the other 4 are in another. These esata cables are prone to
>>>>>> loosening when I'm working on nearby hardware.
>>>>>>
>>>>>> If that happens and I start the host up, big chunks of the array are
>>>>>> missing and things could get ugly. Thus I cooked up a custom startup
>>>>>> script that verifies each device is present before starting the array
>>>>>> with
>>>>>>
>>>>>> mdadm --assemble --no-degraded -u 4fd7659f:12044eff:ba25240d:
>>>>>> de22249d /dev/md3
>>>>>>
>>>>>> So I thought I was covered.
>>>>>> In case something got unplugged I would
>>>>>> see the array failing to start at boot and I could shut down, fix the
>>>>>> cables and try again. However, I hit a new scenario today where one of
>>>>>> the plugs was loosened while everything was turned on.
>>>>>>
>>>>>> The good news is that there should have been no activity on the array
>>>>>> when this happened, particularly write activity. It's a big media
>>>>>> partition and sees much less writing than reading. I'm also the only
>>>>>> one that uses it and I know I wasn't transferring anything. The system
>>>>>> also seems to have immediately marked the filesystem read-only,
>>>>>> because I discovered the issue when I went to write to it later and
>>>>>> got a "read-only filesystem" error. So I believe the state of the
>>>>>> drives should be the same - nothing should be out of sync.
>>>>>>
>>>>>> However, I shut the system down, fixed the cables and brought it back
>>>>>> up. All the devices are detected by my script and it tries to start
>>>>>> the array with the command I posted above, but I've ended up with
>>>>>> this:
>>>>>>
>>>>>> md0 : inactive sdn1[1](S) sdj1[9](S) sdm1[10](S) sdl1[11](S)
>>>>>> sdk1[12](S) md3p1[8](S) sdc1[6](S) sdd1[5](S) md1p1[4](S) sdf1[3](S)
>>>>>> sdh1[0](S)
>>>>>> 16113893731 blocks super 1.2
>>>>>>
>>>>>> Instead of all coming back up, or still showing the unplugged drives
>>>>>> missing, everything is a spare? I'm suitably disturbed.
>>>>>>
>>>>>> It seems to me that if the data on the drives still reflects the
>>>>>> last-good data from the array (and since no writing was going on it
>>>>>> should) then this is just a matter of some metadata getting messed up
>>>>>> and it should be fixable. Can someone please walk me through the
>>>>>> commands to do that?
>>>>>>
>>>>>> Mike
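For what Mike asks here, the -E fields do map mechanically onto --create options. A sketch that parses the quoted /dev/sdh1 report into a *candidate* command — a hypothetical helper, not something from the list, and note the warning elsewhere in the thread that --create is a last resort and must carry --assume-clean:

```shell
#!/bin/sh
# Map `mdadm -E` fields onto `mdadm --create` options. The report text
# is the /dev/sdh1 example quoted above; in practice you would use
# report=$(mdadm -E /dev/sdh1). The printed command is a candidate
# only -- per the rest of the thread, try --assemble --force first.
report='     Version : 1.2
  Raid Level : raid6
Raid Devices : 11
      Layout : left-symmetric
  Chunk Size : 256K'

# Device order, per the Device Role listing earlier in the thread
# (roles 0-10): sdh1 sdn1 sdf1 md1p1 sdd1 sdc1 md3p1 sdk1 sdl1 sdm1 sdj1
cmd=$(printf '%s\n' "$report" | awk -F' *: *' '
/Version/      { meta  = $2 }
/Raid Level/   { level = $2 }
/Raid Devices/ { n     = $2 }
/Layout/       { lay   = $2 }
/Chunk Size/   { chunk = $2 }
END {
    printf "mdadm --create /dev/md0 --assume-clean --metadata=%s", meta
    printf " --level=%s --raid-devices=%s --layout=%s --chunk=%s", level, n, lay, chunk
}')
echo "$cmd"
```

The command still needs the member devices appended in role order, and the data offset and bitmap settings must also match the originals — which is exactly why the thread treats re-creating as the option of last resort.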
* Re: RAID showing all devices as spares after partial unplug
  2011-09-18  3:59 ` Mike Hartman
@ 2011-09-18 13:23   ` Phil Turmel
  2011-09-18 16:07     ` Mike Hartman
  0 siblings, 1 reply; 12+ messages in thread
From: Phil Turmel @ 2011-09-18 13:23 UTC (permalink / raw)
To: Mike Hartman; +Cc: linux-raid

Hi Mike,

On 09/17/2011 11:59 PM, Mike Hartman wrote:
> On Sat, Sep 17, 2011 at 11:07 PM, Mike Hartman
> <mike@hartmanipulation.com> wrote:
>> Yikes. That's a pretty terrifying prospect.

*Don't do it!*

"mdadm --create" in these situations is an absolute last resort.

First, try --assemble --force.

If needed, check the archives for the environment variable setting
that'll temporarily allow mdadm to ignore the event counts for more
--assemble and --assemble --force tries. (I can't remember the
variable name off the top of my head.)

Only if all of the above fails do you fall back to "--create", and
every single "--create" attempt *must* include "--assume-clean", or
your data is in grave danger.

Based on the output of the one full mdadm -E report, your array was
created with a recent version of mdadm, so you shouldn't have trouble
with data offsets. Please post the full "mdadm -E" report for each
drive if you want help putting together a --create command.

I'd also like to see the output of lsdrv[1], so there's a good record
of drive serial numbers vs. device names.

HTH,

Phil

[1] http://github.com/pturmel/lsdrv
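Phil's order of escalation can be sketched as a dry run. Here RUN=echo just prints each command instead of executing it; the member list follows the Device Role order reported earlier in the thread, and the final --create line is deliberately left commented out:

```shell
#!/bin/sh
# Phil's escalation order, as a dry run: RUN=echo prints each command
# instead of executing it (drop RUN= and run as root to do it for real).
# Member list follows the Device Role order reported earlier in the thread.
RUN=echo
ARRAY=/dev/md0
MEMBERS="/dev/sdh1 /dev/sdn1 /dev/sdf1 /dev/md1p1 /dev/sdd1 /dev/sdc1 /dev/md3p1 /dev/sdk1 /dev/sdl1 /dev/sdm1 /dev/sdj1"

$RUN mdadm --stop $ARRAY                       # clear any half-assembled state
$RUN mdadm --assemble $ARRAY $MEMBERS          # plain assemble first
$RUN mdadm --assemble --force $ARRAY $MEMBERS  # then force past the event-count gap
# Absolute last resort only, and never without --assume-clean:
#   mdadm --create $ARRAY --assume-clean ...
```

After a successful forced assemble, mounting the filesystem read-only for a sanity check before going read-write — as Jim suggested earlier — still applies.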
* Re: RAID showing all devices as spares after partial unplug
  2011-09-18 13:23 ` Phil Turmel
@ 2011-09-18 16:07   ` Mike Hartman
  2011-09-18 16:18     ` Phil Turmel
  0 siblings, 1 reply; 12+ messages in thread
From: Mike Hartman @ 2011-09-18 16:07 UTC (permalink / raw)
To: Phil Turmel; +Cc: linux-raid

Thanks Phil. --assemble --force is all it took - glad I held off. That
was my first instinct to try, but I was worried it would still leave
the drives as spare AND somehow mess up the metadata enough that it
wouldn't be recoverable, so I was afraid to touch it until someone
could confirm the approach. Seems like a silly reason to have the
array down for multiple days, but better safe than sorry with that
much data.

Thanks again to both you and Jim!

Mike

On Sun, Sep 18, 2011 at 9:23 AM, Phil Turmel <philip@turmel.org> wrote:
> Hi Mike,
>
> On 09/17/2011 11:59 PM, Mike Hartman wrote:
>> On Sat, Sep 17, 2011 at 11:07 PM, Mike Hartman
>> <mike@hartmanipulation.com> wrote:
>>> Yikes. That's a pretty terrifying prospect.
>
> *Don't do it!*
>
> "mdadm --create" in these situations is an absolute last resort.
>
> First, try --assemble --force.
>
> If needed, check the archives for the environment variable setting that'll temporarily allow mdadm to ignore the event counts for more --assemble and --assemble --force tries.
>
> (I can't remember the variable name off the top of my head.)
>
> Only if all of the above fails do you fall back to "--create", and every single "--create" attempt *must* include "--assume-clean", or your data is in grave danger.
>
> Based on the output of the one full mdadm -E report, your array was created with a recent version mdadm, so you shouldn't have trouble with data offsets. Please post the full "mdadm -E" report for each drive if you want help putting together a --create command.
>
> I'd also like to see the output of lsdrv[1], so there's a good record of drive serial numbers vs. device names.
>
> HTH,
>
> Phil
>
> [1] http://github.com/pturmel/lsdrv
* Re: RAID showing all devices as spares after partial unplug
  2011-09-18 16:07 ` Mike Hartman
@ 2011-09-18 16:18   ` Phil Turmel
  [not found]             ` <20110920010054.8DFFE581F7A@mail.futurelabusa.com>
  0 siblings, 1 reply; 12+ messages in thread
From: Phil Turmel @ 2011-09-18 16:18 UTC (permalink / raw)
To: Mike Hartman; +Cc: linux-raid

On 09/18/2011 12:07 PM, Mike Hartman wrote:
> Thanks Phil. --assemble --force is all it took - glad I held off. That
> was my first instinct to try, but I was worried it would still leave
> the drives as spare AND somehow mess up the metadata enough that it
> wouldn't be recoverable, so I was afraid to touch it until someone
> could confirm the approach. Seems like a silly reason to have the
> array down for multiple days, but better safe than sorry with that
> much data.

Good to hear!

This list's archives have a number of cases where premature use of
"--create" pushed a recoverable array over the edge, with resulting
grief for the owner. Not that it can't or shouldn't ever be done, but
the pitfalls have sharp stakes.

> Thanks again to both you and Jim!

You're welcome.

Phil
[parent not found: <20110920010054.8DFFE581F7A@mail.futurelabusa.com>]
* Re: RAID showing all devices as spares after partial unplug
  [not found] ` <20110920010054.8DFFE581F7A@mail.futurelabusa.com>
@ 2011-09-20  4:33 ` Phil Turmel
  0 siblings, 0 replies; 12+ messages in thread
From: Phil Turmel @ 2011-09-20 4:33 UTC (permalink / raw)
To: Jim Schatzman; +Cc: linux-raid

[added the list CC: back. Please use reply-to-all on kernel.org lists.]

On 09/19/2011 09:00 PM, Jim Schatzman wrote:
> Thanks to all. Some notes
>
> 1) I have never gotten "mdadm --assemble --force" to work as desired.
> Having tried this on the 6-8 occasions when I have temporarily
> disconnected some drives, all that I have seen is that the
> temporarily-disconnected drives/partitions get added as spares and
> that's not helpful, as far as I can see. I'll have to try it the next
> time and see if it works.

Seems to be dependent on dirty status of the array (write in
progress). Also, you should ensure the array is stopped before
assembling after reconnecting.

> 2) Thanks for reminding me about the --assume-clean with "mdadm
> --create" option. Very important. My bad for forgetting it.
>
> 3) This is the first time I have heard that it is possible to get
> mdadm/md to ignore the event counts in the metadata via environment
> variable. Can someone please provide the details?

I was mistaken... The variable I was thinking of only applies to
interrupted --grow operations: "MDADM_GROW_ALLOW_OLD".

> I freely acknowledge that forcing mdadm to do something abnormal
> risks losing data. My situation, like Mike's, has always (knock on
> wood) been when the array was up but idle. Two slightly different
> cases are (1) drives are disconnected when the system is up; (2)
> drives are disconnected when the system is powered down and then
> rebooted. Both situations have always occurred when enough drives are
> offlined that the array cannot function and gets stopped
> automatically.
> Both situations have always resulted in
> drives/partitions being marked as "spare" if the subsequent assembly
> is done without "--no-degraded".

Neil has already responded that this needs looking at. The key will be
the recognition of multiple simultaneous failures as not really a
drive problem, triggering some form of alternate recovery.

> Following Mike's procedure of removing the arrays from
> /etc/mdadm.conf and always assembling with "--no-degraded", the
> problem is eliminated in the case that drives are unplugged during
> power-off. However, if the drives are unplugged while the system is
> up, then I still have to jump through hoops (i.e., mdadm --create
> --assume-clean) to get the arrays back up. I haven't tried "mdadm
> --assemble --force" for several versions of md/mdadm, so maybe things
> have changed?

--assemble --force will always be at least as safe as --create
--assume-clean. Since it honors the recorded role numbers, it reduces
the chance of a typo letting a create happen with devices in the wrong
order.

Device naming on boot can vary, especially with recent kernels that
are capable of simultaneous probing. Using the original metadata
really helps in this case. It also helps when the mdadm version has
changed substantially since the array was created.

> For me, the fundamental problem has been the very insecure nature of
> eSata connectors. Poor design, in my opinion. The same kind of thing
> could occur, though, with an external enclosure if the power to the
> enclosure is lost.

Indeed. I haven't experienced the issue, though, as my arrays are all
internal. (so far...)

Phil.
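The variable Phil names has a narrow use. A dry-run sketch (RUN=echo; the backup-file path is illustrative) of the one case it covers — restarting an interrupted reshape whose backup file mdadm considers stale:

```shell
#!/bin/sh
# MDADM_GROW_ALLOW_OLD only matters when resuming an interrupted
# --grow/reshape and mdadm rejects the backup file as too old.
# RUN=echo makes this a dry run; the backup-file path is illustrative.
RUN=echo
$RUN env MDADM_GROW_ALLOW_OLD=1 mdadm --assemble /dev/md0 \
    --backup-file=/root/md0-grow.backup
```

It is not a general override for mismatched event counts in an unplug scenario like Mike's — for that, --assemble --force remains the tool.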
* Re: RAID showing all devices as spares after partial unplug
  2011-09-18  1:16 ` Jim Schatzman
  2011-09-18  1:34   ` Mike Hartman
@ 2011-09-18 23:08   ` NeilBrown
  1 sibling, 0 replies; 12+ messages in thread
From: NeilBrown @ 2011-09-18 23:08 UTC (permalink / raw)
To: Jim Schatzman; +Cc: Mike Hartman, linux-raid

On Sat, 17 Sep 2011 19:16:50 -0600 Jim Schatzman
<james.schatzman@futurelabusa.com> wrote:

> Mike-
>
> I have seen very similar problems. I regret that electronics
> engineers cannot design more secure connectors. eSata connectors are
> terrible - they come loose at the slightest tug. For this reason, I
> am gradually abandoning eSata enclosures and going to internal drives
> only. Fortunately, there are some inexpensive RAID chassis available
> now.
>
> I tried the same thing as you. I removed the array(s) from mdadm.conf
> and I wrote a script for "/etc/cron.reboot" which assembles the
> array, "no-degraded". Doing this seems to minimize the damage caused
> by drives prior to a reboot. However, if the drives are disconnected
> while Linux is up, then either the array will stay up but some drives
> will become stale or the array will be stopped. The behavior I
> usually see is that all the drives that went offline now become
> "spare".
>
> It would be nice if md would just reassemble the array once all the
> drives come back online. Unfortunately, it doesn't. I would run mdadm
> -E against all the drives/partitions, verifying that the metadata all
> indicates that they are/were part of the expected array. At that
> point, you should be able to re-create the RAID. Be sure you list the
> drives in the correct order. Once the array is going again, mount the
> resulting partitions RO and verify that the data is o.k. before going
> RW.

mdadm certainly can "just reassemble the array once all the drives
come ... online."
If you have udev configured to run "mdadm -I device-name" when a
device appears, then as soon as all required devices have appeared the
array will be started.

It would be good to have better handling of "half the devices
disappeared", particularly if this is noticed while trying to read or
while trying to mark the array "dirty" in preparation for write. If it
happens during a real 'write' it is a bit harder to handle cleanly.

I should add that to my list :-)

NeilBrown
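The incremental assembly Neil describes is normally wired up through a udev rule; a minimal sketch of such a rule (the file name and match keys here are illustrative — mdadm releases ship their own rules file):

```
# /etc/udev/rules.d/64-md-incremental.rules (illustrative sketch)
SUBSYSTEM=="block", ACTION=="add", ENV{ID_FS_TYPE}=="linux_raid_member", RUN+="/sbin/mdadm -I $env{DEVNAME}"
```

With something like this in place, each member that reappears is fed to "mdadm -I", and md starts the array as soon as the last required device shows up.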
end of thread, other threads:[~2011-09-20 4:33 UTC | newest] Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- [not found] <CAB=7dhk0AV1dKL2cngt1eZXJwCVrfixfLE5z=J1i-7tqdL-6QA@mail.gmail.com> 2011-09-17 20:39 ` RAID showing all devices as spares after partial unplug Mike Hartman 2011-09-17 22:16 ` Mike Hartman [not found] ` <CAB=7dhmFQ=Rtagj2j_22cnoS0A2yoKvJgaTM+ZiqDBqhPRooDQ@mail.g mail.com> 2011-09-18 1:16 ` Jim Schatzman 2011-09-18 1:34 ` Mike Hartman [not found] ` <CAB=7dh=PymkpqRLTWiNzD-+n=XwEWnPN8nQwXg1=UmiJmZ1b1w@mail.g mail.com> 2011-09-18 2:57 ` Jim Schatzman 2011-09-18 3:07 ` Mike Hartman [not found] ` <CAB=7dh=9UcEWJjLbOvPLu1Ubij0X4i6+SQ-6L9VE5gHLvcJVcw@mail.gmail.com> 2011-09-18 3:59 ` Mike Hartman 2011-09-18 13:23 ` Phil Turmel 2011-09-18 16:07 ` Mike Hartman 2011-09-18 16:18 ` Phil Turmel [not found] ` <20110920010054.8DFFE581F7A@mail.futurelabusa.com> 2011-09-20 4:33 ` Phil Turmel 2011-09-18 23:08 ` NeilBrown