* How to delay mdadm assembly until all component drives are recognized/ready?
@ 2017-05-10  3:13 Ram Ramesh
  2017-05-16  4:15 ` Ram Ramesh
  2017-05-16 21:25 ` NeilBrown
  0 siblings, 2 replies; 4+ messages in thread
From: Ram Ramesh @ 2017-05-10  3:13 UTC (permalink / raw)
  To: Linux Raid

Today, I noticed that my RAID6 md0 was assembled in a degraded state with 
two drives marked failed after a pm-suspend and restart. Both of these 
drives are attached to a SAS9211-8I controller; the other drives are 
attached to the motherboard. I have not seen this on a normal boot/reboot. 
Also, in this particular case, a mythtv recording was in progress when I 
suspended, so md0 was used as soon as the system resumed.

Upon inspection, it appears (I am not sure here) that mdadm assembled 
the array even before the drives were ready to be used. All I had to do 
was remove and re-add them to bring the array back to a "good" state. I 
am wondering if there is a way to tell mdadm to wait for all drives to 
be ready before assembling. Also, if there is something that I can add 
to my resume scripts that would help, please let me know.
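For reference, the manual fix was just a remove and re-add of each failed 
component, roughly like this (device names here are placeholders, not my 
actual drives):

```shell
# See which components are marked failed (F) in the array
cat /proc/mdstat

# Remove a failed component and add it back; md then resyncs it.
# /dev/sdX1 stands in for the actual failed partition.
mdadm /dev/md0 --remove /dev/sdX1
mdadm /dev/md0 --add /dev/sdX1

# Watch the resync progress
watch cat /proc/mdstat
```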

Kernel: Linux zym 3.13.0-106-generic #153-Ubuntu SMP
mdadm - v3.2.5 - 18th May 2012

The failed drives are an HGST NAS and a WD Gold, each with less than a 
year of use, so I doubt they are actually bad.

Ramesh




* Re: How to delay mdadm assembly until all component drives are recognized/ready?
  2017-05-10  3:13 How to delay mdadm assembly until all component drives are recognized/ready? Ram Ramesh
@ 2017-05-16  4:15 ` Ram Ramesh
  2017-05-16 21:25 ` NeilBrown
  1 sibling, 0 replies; 4+ messages in thread
From: Ram Ramesh @ 2017-05-16  4:15 UTC (permalink / raw)
  To: Linux Raid

On 05/09/2017 10:13 PM, Ram Ramesh wrote:
> Today, I noticed that my RAID6 md0 was assembled in a degraded state 
> with two drives marked failed after a pm-suspend and restart. Both 
> of these drives are attached to a SAS9211-8I controller; the other 
> drives are attached to the motherboard. I have not seen this on a 
> normal boot/reboot. Also, in this particular case, a mythtv recording 
> was in progress when I suspended, so md0 was used as soon as the 
> system resumed.
>
> Upon inspection, it appears (I am not sure here) that mdadm assembled 
> the array even before the drives were ready to be used. All I had to 
> do was remove and re-add them to bring the array back to a "good" 
> state. I am wondering if there is a way to tell mdadm to wait for all 
> drives to be ready before assembling. Also, if there is something that 
> I can add to my resume scripts that would help, please let me know.
>
> Kernel: Linux zym 3.13.0-106-generic #153-Ubuntu SMP
> mdadm - v3.2.5 - 18th May 2012
>
> The failed drives are an HGST NAS and a WD Gold, each with less than 
> a year of use, so I doubt they are actually bad.
>
> Ramesh
>
>
This happened again. I think no damage is done until something writes to 
the device; once a write fails, md marks the component disk "failed". If 
I can make it wait a short while before any writing happens, I think I 
should be OK. Is there something I can do in the thaw scripts that will 
delay the start of resume operations?
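Something along these lines is what I have in mind for a pm-utils hook; 
the file name and delay length are my guesses, and this is untested:

```shell
#!/bin/sh
# Hypothetical hook: /etc/pm/sleep.d/99_wait_for_md
# pm-utils invokes each hook with the action (suspend, resume,
# hibernate, thaw) as the first argument.
case "$1" in
    resume|thaw)
        # Give the SAS controller time to bring its drives back
        # before anything tries to write to /dev/md0.
        sleep 15
        ;;
esac
exit 0
```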

Ramesh



* Re: How to delay mdadm assembly until all component drives are recognized/ready?
  2017-05-10  3:13 How to delay mdadm assembly until all component drives are recognized/ready? Ram Ramesh
  2017-05-16  4:15 ` Ram Ramesh
@ 2017-05-16 21:25 ` NeilBrown
  2017-05-16 23:49   ` Ram Ramesh
  1 sibling, 1 reply; 4+ messages in thread
From: NeilBrown @ 2017-05-16 21:25 UTC (permalink / raw)
  To: Ram Ramesh, Linux Raid


On Tue, May 09 2017, Ram Ramesh wrote:

> Today, I noticed that my RAID6 md0 was assembled in a degraded state 
> with two drives marked failed after a pm-suspend and restart. Both 
> of these drives are attached to a SAS9211-8I controller; the other 
> drives are attached to the motherboard. I have not seen this on a 
> normal boot/reboot. Also, in this particular case, a mythtv recording 
> was in progress when I suspended, so md0 was used as soon as the 
> system resumed.
>
> Upon inspection, it appears (I am not sure here) that mdadm assembled 
> the array even before the drives were ready to be used. All I had to 
> do was remove and re-add them to bring the array back to a "good" 
> state. I am wondering if there is a way to tell mdadm to wait for all 
> drives to be ready before assembling. Also, if there is something that 
> I can add to my resume scripts that would help, please let me know.
>
> Kernel: Linux zym 3.13.0-106-generic #153-Ubuntu SMP
> mdadm - v3.2.5 - 18th May 2012
>
> The failed drives are an HGST NAS and a WD Gold, each with less than 
> a year of use, so I doubt they are actually bad.

This is a question that needs to be addressed by your distro.  mdadm
just does what it is told to do by init/udev/systemd scripts.

The preferred way for array startup to happen is that when udev
discovers a new device, "mdadm --incremental $DEV" is run, and mdadm
includes the device into an array as appropriate.  mdadm will not
normally activate the array until all expected devices have appeared.
After some timeout "mdadm -IRs" or "mdadm --run /dev/mdXX" can be run to
start the array even though it is degraded.

The udev-* scripts and systemd/* unit files provided with current
upstream mdadm do this, with a 30 second timeout.
If a given distro doesn't use these scripts, you need to take it up
with them.
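To illustrate, the upstream pieces boil down to something like this 
(simplified; the actual files are udev-md-raid-assembly.rules and the 
mdadm-last-resort@ timer/service units in the mdadm source tree):

```shell
# The udev rule effectively runs, for each new raid member:
#   ACTION=="add", ENV{ID_FS_TYPE}=="linux_raid_member", \
#     RUN+="/sbin/mdadm --incremental $devnode"
# i.e. each device is fed into its array as it appears, and the
# array is not started until all expected members are present:
mdadm --incremental /dev/sdb1

# After the timeout (30 seconds upstream), any array still
# waiting for members is started anyway, degraded if need be:
mdadm --incremental --run --scan    # short form: mdadm -IRs
```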

NeilBrown



* Re: How to delay mdadm assembly until all component drives are recognized/ready?
  2017-05-16 21:25 ` NeilBrown
@ 2017-05-16 23:49   ` Ram Ramesh
  0 siblings, 0 replies; 4+ messages in thread
From: Ram Ramesh @ 2017-05-16 23:49 UTC (permalink / raw)
  To: NeilBrown, Linux Raid

On 05/16/2017 04:25 PM, NeilBrown wrote:
> On Tue, May 09 2017, Ram Ramesh wrote:
>
>> Today, I noticed that my RAID6 md0 was assembled in a degraded state
>> with two drives marked failed after a pm-suspend and restart. Both
>> of these drives are attached to a SAS9211-8I controller; the other
>> drives are attached to the motherboard. I have not seen this on a
>> normal boot/reboot. Also, in this particular case, a mythtv recording
>> was in progress when I suspended, so md0 was used as soon as the
>> system resumed.
>>
>> Upon inspection, it appears (I am not sure here) that mdadm assembled
>> the array even before the drives were ready to be used. All I had to
>> do was remove and re-add them to bring the array back to a "good"
>> state. I am wondering if there is a way to tell mdadm to wait for all
>> drives to be ready before assembling. Also, if there is something that
>> I can add to my resume scripts that would help, please let me know.
>>
>> Kernel: Linux zym 3.13.0-106-generic #153-Ubuntu SMP
>> mdadm - v3.2.5 - 18th May 2012
>>
>> The failed drives are an HGST NAS and a WD Gold, each with less than
>> a year of use, so I doubt they are actually bad.
> This is a question that needs to be addressed by your distro.  mdadm
> just does what it is told to do by init/udev/systemd scripts.
>
> The preferred way for array startup to happen is that when udev
> discovers a new device, "mdadm --incremental $DEV" is run, and mdadm
> includes the device into an array as appropriate.  mdadm will not
> normally activate the array until all expected devices have appeared.
> After some timeout "mdadm -IRs" or "mdadm --run /dev/mdXX" can be run to
> start the array even though it is degraded.
>
> The udev-* scripts and systemd/* unit files provided with current
> upstream mdadm do this, with a 30 second timeout.
> If a given distro doesn't use these scripts, you need to take it up
> with them.
>
> NeilBrown
Neil,

   Thanks. I was hoping there was something I could add to mdadm.conf 
to make this work; that is why I asked here, as my mdadm expertise is 
limited. Anyway, it appears that the problem is due to ext4lazyinit, 
which accesses the md device immediately after resume. I will take this 
up with the distro folks. My machine badly needs an upgrade; I think it 
is time.

Ramesh

