* Special drives for Linux Raid?
@ 2011-11-07 13:29 Danilo Godec
  2011-11-07 13:49 ` Miles Fidelman
  0 siblings, 1 reply; 5+ messages in thread
From: Danilo Godec @ 2011-11-07 13:29 UTC
  To: linux-raid

Some manufacturers make 'special' versions of drives for RAID (WD RE4, 
Seagate SE, ...). Apparently the main difference is in error handling, 
where normal 'desktop' drives try hard to recover an error (up to 
several minutes) while RAID drives give up quickly (few seconds) so that 
the RAID controller can take over.

Are there any other known and significant differences?

Is usage of these special drives recommended with Linux md?


  Thanks, Danilo



* Re: Special drives for Linux Raid?
  2011-11-07 13:29 Special drives for Linux Raid? Danilo Godec
@ 2011-11-07 13:49 ` Miles Fidelman
  2011-11-07 14:57   ` David Brown
  0 siblings, 1 reply; 5+ messages in thread
From: Miles Fidelman @ 2011-11-07 13:49 UTC
  Cc: linux-raid

Danilo Godec wrote:
> Some manufacturers make 'special' versions of drives for RAID (WD RE4, 
> Seagate SE, ...). Apparently the main difference is in error handling, 
> where normal 'desktop' drives try hard to recover an error (up to 
> several minutes) while RAID drives give up quickly (few seconds) so 
> that the RAID controller can take over.
>
not so much "special" as "different"

the term to look for is "enterprise"

you've identified the key distinction:

- desktop drives assume that they have the only copy of your data, the 
on-board processor tries very hard to read and re-read until it returns 
your data ---- the result is that everything slows down

- if you have a raid array, you want a failing disk to give up and 
return, very quickly, so that the data can be read from a different drive

I learned this the hard way, when I had a server that just slowed way 
down to the point that it took 10 seconds or more to echo a keystroke.  
It took me a long time to figure out what was going on - and some rather 
painful false starts (trashed the o/s).

One important thing I discovered:  the md RAID driver does NOT consider 
a long time delay as a signal to fail a drive out of an array.  It's a 
really good idea to check /proc/mdstat and keep an eye on your drives' 
SMART data.  If Raw Read Error Rate goes above 0, start paying attention.
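
A minimal sketch of that kind of check, in Python. It assumes the 
attribute table layout printed by `smartctl -A` (smartmontools); the 
sample text, the list of watched attributes, and the function names are 
illustrative assumptions rather than anything prescribed in this thread. 
It parses captured output instead of running the tool, so it can be 
tried without a real drive.

```python
# Sample of the attribute table as printed by `smartctl -A` (values made up).
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       3
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       1
"""

# Attributes to watch; this particular selection is an assumption.
WATCHED = ("Raw_Read_Error_Rate", "Reallocated_Sector_Ct", "Current_Pending_Sector")


def parse_smart_attributes(text):
    """Return {attribute_name: raw_value} for rows with an integer raw value."""
    attrs = {}
    for line in text.splitlines():
        fields = line.split()
        # Attribute rows start with a numeric ID and have at least 10 columns;
        # the header row ("ID# ...") and blank lines are skipped here.
        if len(fields) < 10 or not fields[0].isdigit():
            continue
        try:
            attrs[fields[1]] = int(fields[9])
        except ValueError:
            continue  # raw value is not a plain integer; ignore this row
    return attrs


def attributes_needing_attention(attrs, watched=WATCHED):
    """Return the watched attribute names whose raw value is above 0."""
    return sorted(name for name in watched if attrs.get(name, 0) > 0)


if __name__ == "__main__":
    print(attributes_needing_attention(parse_smart_attributes(SAMPLE)))
```

Feeding it the real output of `smartctl -A /dev/sdX` from a periodic job 
would be one way to get an early warning; how each vendor encodes the 
raw values varies, so treat any threshold as a starting point.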

Miles Fidelman



-- 
In theory, there is no difference between theory and practice.
In<fnord>  practice, there is.   .... Yogi Berra




* Re: Special drives for Linux Raid?
  2011-11-07 13:49 ` Miles Fidelman
@ 2011-11-07 14:57   ` David Brown
  2011-11-07 18:00     ` Beolach
  0 siblings, 1 reply; 5+ messages in thread
From: David Brown @ 2011-11-07 14:57 UTC
  To: linux-raid

On 07/11/2011 14:49, Miles Fidelman wrote:
> Danilo Godec wrote:
>> Some manufacturers make 'special' versions of drives for RAID (WD RE4,
>> Seagate SE, ...). Apparently the main difference is in error handling,
>> where normal 'desktop' drives try hard to recover an error (up to
>> several minutes) while RAID drives give up quickly (few seconds) so
>> that the RAID controller can take over.
>>
> not so much "special" as "different"
>
> the term to look for is "enterprise"
>
> you've identified the key distinction:
>
> - desktop drives assume that they have the only copy of your data, the
> on-board processor tries very hard to read and re-read until it returns
> your data ---- the result is that everything slows down
>
> - if you have a raid array, you want a failing disk to give up and
> return, very quickly, so that the data can be read from a different drive
>
> I learned this the hard way, when I had a server that just slowed way
> down to the point that it took 10 seconds or more to echo a keystroke.
> It took me a long time to figure out what was going on - and some rather
> painful false starts (trashed the o/s).
>
> One important thing I discovered: the md RAID driver does NOT consider a
> long time delay as a signal to fail a drive out of an array. It's a
> really good idea to check /proc/mdstat and keep an eye on your drives'
> SMART data. If Raw Read Error Rate goes above 0, start paying attention.
>

As far as I know (and I hope I'll be corrected quickly if I'm wrong), 
when a drive fails to read from a sector, it will be considered a 
"failed" drive by the raid controller or software raid, and kicked out 
of the array.  The exception is the latest versions of md raid which 
support bad block lists.

If you are using a "raid" drive, which only re-tries for a couple of 
seconds, then a read failure will quickly return an error.  This limits 
the worst-case delay when reading from a failing drive.  But it also 
means that the drive won't try as hard as it can, and the drive will be 
kicked out of the array earlier.

With a "desktop" drive, worst case delays can be much longer, but you 
have a higher chance of getting your data off the disk.  That's always a 
good thing, even with raid.

If you use a hardware raid controller that requires "raid" drives, then 
long re-reads on a "desktop" drive will cause timeouts, and the drive 
will be kicked out of the array.
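
The timeout interaction described above can be sketched as a toy model 
in Python. The numbers (7 s for a "raid" drive, 120 s for a "desktop" 
drive, a 30 s controller timeout) are illustrative assumptions, not 
figures from any datasheet, and the function and constant names are 
hypothetical.

```python
# Assumed, illustrative limits in seconds (not taken from any datasheet).
RAID_DRIVE_RECOVERY_LIMIT_S = 7
DESKTOP_DRIVE_RECOVERY_LIMIT_S = 120
CONTROLLER_TIMEOUT_S = 30

DRIVE_REPORTS_ERROR = "drive reports a read error; controller can use redundancy"
DRIVE_DROPPED = "controller timeout expires first; drive is kicked out of the array"


def outcome_of_unreadable_sector(drive_recovery_limit_s, controller_timeout_s):
    """Model the worst case: the sector never becomes readable.

    The drive keeps retrying until its own recovery limit is reached, while
    the controller waits at most controller_timeout_s for an answer.
    """
    if drive_recovery_limit_s < controller_timeout_s:
        return DRIVE_REPORTS_ERROR
    return DRIVE_DROPPED


if __name__ == "__main__":
    for label, limit in (("raid", RAID_DRIVE_RECOVERY_LIMIT_S),
                         ("desktop", DESKTOP_DRIVE_RECOVERY_LIMIT_S)):
        print(label, "->", outcome_of_unreadable_sector(limit, CONTROLLER_TIMEOUT_S))
```

The model only covers a sector that never becomes readable; when a 
long retry eventually succeeds, the outcome depends on whether it 
finishes before the controller gives up.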


I don't believe it is very common to have long re-reads even with 
desktop drives - more commonly, the drive will either correct small 
errors quickly, or will have serious failures.  But obviously the 
situation does occur.

If you need to put limits on the worst-case read performance, then 
"raid" drives are the only way to go.  If not, then I would think 
"desktop" drives are a better choice in most cases.  Also note that 
since "desktop" drives are half the cost of "raid" drives, if you have 
the space in your system you can buy twice as many for the same price. 
That means better performance and/or better redundancy and/or more space 
and/or better value for money.




* Re: Special drives for Linux Raid?
  2011-11-07 14:57   ` David Brown
@ 2011-11-07 18:00     ` Beolach
  2011-11-07 18:28       ` David Brown
  0 siblings, 1 reply; 5+ messages in thread
From: Beolach @ 2011-11-07 18:00 UTC
  To: David Brown; +Cc: linux-raid

On Mon, Nov 7, 2011 at 07:57, David Brown <david@westcontrol.com> wrote:
> On 07/11/2011 14:49, Miles Fidelman wrote:
>>
>> Danilo Godec wrote:
>>>
>>> Some manufacturers make 'special' versions of drives for RAID (WD RE4,
>>> Seagate SE, ...). Apparently the main difference is in error handling,
>>> where normal 'desktop' drives try hard to recover an error (up to
>>> several minutes) while RAID drives give up quickly (few seconds) so
>>> that the RAID controller can take over.
>>>
>> not so much "special" as "different"
>>
>> the term to look for is "enterprise"
>>
>> you've identified the key distinction:
>>
>> - desktop drives assume that they have the only copy of your data, the
>> on-board processor tries very hard to read and re-read until it returns
>> your data ---- the result is that everything slows down
>>
>> - if you have a raid array, you want a failing disk to give up and
>> return, very quickly, so that the data can be read from a different drive
>>
>> I learned this the hard way, when I had a server that just slowed way
>> down to the point that it took 10 seconds or more to echo a keystroke.
>> It took me a long time to figure out what was going on - and some rather
>> painful false starts (trashed the o/s).
>>
>> One important thing I discovered: the md RAID driver does NOT consider a
>> long time delay as a signal to fail a drive out of an array. It's a
>> really good idea to check /proc/mdstat and keep an eye on your drives'
>> SMART data. If Raw Read Error Rate goes above 0, start paying attention.
>>
>
> As far as I know (and I hope I'll be corrected quickly if I'm wrong), when a
> drive fails to read from a sector, it will be considered a "failed" drive by
> the raid controller or software raid, and kicked out of the array.  The
> exception is the latest versions of md raid which support bad block lists.
>

I don't think that's quite correct - when a member drive of an MD RAID
returns a read error, MD tries to re-write the sector using the
redundancy from the other drives in the RAID.  It's only if a drive
returns a *write* error that the drive is failed.
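
As a rough illustration of the behaviour described above, here is a toy 
two-disk mirror in Python. It is a simplified model written for this 
explanation, not md's actual code: the Disk class, mirror_read(), and 
the assumption that a successful re-write makes the sector readable 
again are all hypothetical.

```python
class ReadError(Exception):
    pass


class WriteError(Exception):
    pass


class Disk:
    """Toy disk: a dict of sectors plus injectable read/write failures."""

    def __init__(self, name, data, unreadable=(), write_fails=False):
        self.name = name
        self.data = dict(data)
        self.unreadable = set(unreadable)
        self.write_fails = write_fails
        self.failed = False  # set when the array kicks this disk out

    def read(self, sector):
        if sector in self.unreadable:
            raise ReadError(f"{self.name}: cannot read sector {sector}")
        return self.data[sector]

    def write(self, sector, value):
        if self.write_fails:
            raise WriteError(f"{self.name}: cannot write sector {sector}")
        self.data[sector] = value
        # Assumption: a successful re-write makes the sector readable again.
        self.unreadable.discard(sector)


def mirror_read(primary, secondary, sector):
    """Read a sector; on a read error, recover from the mirror and re-write.

    Only a failed re-write marks the primary disk as failed.
    """
    try:
        return primary.read(sector)
    except ReadError:
        value = secondary.read(sector)      # use the redundancy
        try:
            primary.write(sector, value)    # try to repair the bad sector
        except WriteError:
            primary.failed = True           # write error: fail the disk
        return value


if __name__ == "__main__":
    good = Disk("sdb", {0: b"data"})
    flaky = Disk("sda", {0: b"data"}, unreadable={0})
    print(mirror_read(flaky, good, 0), flaky.failed)
```

In this model a read error alone never removes a disk; the disk is 
marked failed only when the repair write also fails.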


-- 
Conway S. Smith


* Re: Special drives for Linux Raid?
  2011-11-07 18:00     ` Beolach
@ 2011-11-07 18:28       ` David Brown
  0 siblings, 0 replies; 5+ messages in thread
From: David Brown @ 2011-11-07 18:28 UTC
  To: linux-raid

On 07/11/11 19:00, Beolach wrote:
> On Mon, Nov 7, 2011 at 07:57, David Brown <david@westcontrol.com> wrote:
>> On 07/11/2011 14:49, Miles Fidelman wrote:
>>>
>>> Danilo Godec wrote:
>>>>
>>>> Some manufacturers make 'special' versions of drives for RAID (WD RE4,
>>>> Seagate SE, ...). Apparently the main difference is in error handling,
>>>> where normal 'desktop' drives try hard to recover an error (up to
>>>> several minutes) while RAID drives give up quickly (few seconds) so
>>>> that the RAID controller can take over.
>>>>
>>> not so much "special" as "different"
>>>
>>> the term to look for is "enterprise"
>>>
>>> you've identified the key distinction:
>>>
>>> - desktop drives assume that they have the only copy of your data, the
>>> on-board processor tries very hard to read and re-read until it returns
>>> your data ---- the result is that everything slows down
>>>
>>> - if you have a raid array, you want a failing disk to give up and
>>> return, very quickly, so that the data can be read from a different drive
>>>
>>> I learned this the hard way, when I had a server that just slowed way
>>> down to the point that it took 10 seconds or more to echo a keystroke.
>>> It took me a long time to figure out what was going on - and some rather
>>> painful false starts (trashed the o/s).
>>>
>>> One important thing I discovered: the md RAID driver does NOT consider a
>>> long time delay as a signal to fail a drive out of an array. It's a
>>> really good idea to check /proc/mdstat and keep an eye on your drives'
>>> SMART data. If Raw Read Error Rate goes above 0, start paying attention.
>>>
>>
>> As far as I know (and I hope I'll be corrected quickly if I'm wrong), when a
>> drive fails to read from a sector, it will be considered a "failed" drive by
>> the raid controller or software raid, and kicked out of the array.  The
>> exception is the latest versions of md raid which support bad block lists.
>>
>
> I don't think that's quite correct - when a member drive of an MD RAID
> returns a read error, MD tries to re-write the sector using the
> redundancy from the other drives in the RAID.  It's only if a drive
> returns a *write* error that the drive is failed.
>

OK, thanks for correcting me here.

Do hardware raid cards typically do the same thing?

(I've only occasionally had disk failures in raid systems, and in every 
case the disk died totally, so I haven't tested this.)

