* mdadm and TLER (Time Limited Error Recovery)
@ 2009-09-08  0:35 Tim Rutter
  2009-09-08 14:17 ` Mario 'BitKoenig' Holbe
  0 siblings, 1 reply; 12+ messages in thread
From: Tim Rutter @ 2009-09-08  0:35 UTC (permalink / raw)
  To: linux-raid

There seem to be many people asking this question and getting only very
vague answers, so I'm looking for some clarification.

Is the TLER on drives like Western Digital's WD2002FYPS a problem or a
benefit for mdadm (RAID5/RAID6)?


* Re: mdadm and TLER (Time Limited Error Recovery)
  2009-09-08  0:35 mdadm and TLER (Time Limited Error Recovery) Tim Rutter
@ 2009-09-08 14:17 ` Mario 'BitKoenig' Holbe
  2009-09-08 18:48   ` Iustin Pop
  0 siblings, 1 reply; 12+ messages in thread
From: Mario 'BitKoenig' Holbe @ 2009-09-08 14:17 UTC (permalink / raw)
  To: linux-raid

Tim Rutter <timrutter@gmail.com> wrote:
> Is the TLER on drives like Western Digital's WD2002FYPS a problem or
> benefit for mdadm(RAID5/RAID6)?

Neither, with a very, very small drift toward "problem", IMHO.
A (not so little) while ago, when md did not automatically correct
read errors, the drift toward "problem" was a bit less small ;) This
could be the reason for some of the vague answers you mentioned.

Anyway, clarification...
The only reason for TLER (Time Limited Error Recovery) is to behave
"friendly" toward RAID controllers that time out disks.
In fact, md does not time out disks as many hardware RAID controllers do.
So, from md's point of view, TLER is useless, i.e. it has no benefit.

On the other hand, TLER leads to the disk not trying as hard to recover
from (read-)errors (i.e. get the data back) as it could - usually,
there's just no need to do it in a RAID, because another component (the
RAID controller) has a far easier way to get the data back (i.e. read it
from the other disk(s)).
Of course, there are the unusual cases, like degraded RAID or two disks
being unable to read this specific data. In these rare cases it would be
nice if the disk would do as much as it can to get the data back instead
of relying on the RAID controller. This is why I think there is a very
very small drift to "problem".


regards
   Mario
-- 
File names are infinite in length where infinity is set to 255 characters.
                                -- Peter Collinson, "The Unix File System"



* Re: mdadm and TLER (Time Limited Error Recovery)
  2009-09-08 14:17 ` Mario 'BitKoenig' Holbe
@ 2009-09-08 18:48   ` Iustin Pop
  2009-09-08 19:45     ` Mario 'BitKoenig' Holbe
  2009-09-09  1:33     ` Maurice Hilarius
  0 siblings, 2 replies; 12+ messages in thread
From: Iustin Pop @ 2009-09-08 18:48 UTC (permalink / raw)
  To: Mario 'BitKoenig' Holbe; +Cc: linux-raid

On Tue, Sep 08, 2009 at 04:17:58PM +0200, Mario 'BitKoenig' Holbe wrote:
> Tim Rutter <timrutter@gmail.com> wrote:
> > Is the TLER on drives like Western Digital's WD2002FYPS a problem or
> > benefit for mdadm(RAID5/RAID6)?
> 
> Neither nor with a very very small drift to "problem", IMHO.
> A (not so little) while ago when md did not automatically correct
> read-errors, the drift to "problem" was a bit less small ;) This could
> be the reason for some of the vague answers you mentioned.
> 
> Anyways, clarification...
> The only reason for TLER (Time Limited Error Recovery) is to behave
> "friendly" toward RAID controllers that timeout disks.
> In fact, md does not timeout disks as many Hardware RAID controllers do.
> So, from md's point of view, TLER is useless, i.e. it has no benefit.

I'm sorry, but I disagree here. *Especially* because md is used over
normal SATA controllers most of the time, TLER is beneficial: the drive
doesn't go catatonic for minutes at a time trying to recover a bad
sector, which would (because md doesn't time out disks) cause md to hang
the whole device. TLER allows md to see the error quickly and either
attempt to rewrite the bad sector (on a read error) or retry/fail the
disk (on a write error).
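A minimal sketch of how one might look at the kernel side of this
trade-off. md applies no timeout of its own, but the SCSI layer below it
keeps a per-command timeout for each disk, exposed in sysfs as
/sys/block/<dev>/device/timeout (in seconds, commonly 30 by default). The
function names and the SYSFS_ROOT override are hypothetical, introduced
only for illustration; this is not something the thread itself ran.

```shell
# Print the kernel's per-command timeout (in seconds) for a block device.
# SYSFS_ROOT is a hypothetical override so the function can be exercised
# without real hardware; it defaults to /sys.
cmd_timeout() {
    dev=$1
    file="${SYSFS_ROOT:-/sys}/block/$dev/device/timeout"
    if [ -r "$file" ]; then
        cat "$file"
    else
        echo "no timeout attribute for $dev" >&2
        return 1
    fi
}

# Succeed if a drive-side recovery limit (whole seconds) is shorter than
# the kernel-side timeout, i.e. the drive would report the error before
# the kernel gives up on the command.
drive_reports_first() {
    drive_limit=$1
    kernel_timeout=$2
    [ "$drive_limit" -lt "$kernel_timeout" ]
}
```

For example, `drive_reports_first 7 "$(cmd_timeout sda)"` succeeds when
a 7-second drive limit fits inside the kernel timeout for sda. A drive
without such a limit may keep retrying for longer than that timeout.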

Just my understanding of the md stack.

regards,
iustin


* Re: mdadm and TLER (Time Limited Error Recovery)
  2009-09-08 18:48   ` Iustin Pop
@ 2009-09-08 19:45     ` Mario 'BitKoenig' Holbe
  2009-09-09  1:33     ` Maurice Hilarius
  1 sibling, 0 replies; 12+ messages in thread
From: Mario 'BitKoenig' Holbe @ 2009-09-08 19:45 UTC (permalink / raw)
  To: linux-raid

Iustin Pop <iusty@k1024.org> wrote:
> I'm sorry but I disagree here. *Especially* because md is used over
> normal SATA controllers most of the time, TLER is beneficial because the
> drive doesn't go catatonic for minutes at a time trying to recover a bad
> sector, which would (because md doesn't timeout disks) cause md to hung
> up the whole device. TLER will allow md to see the error quickly and

Yes, that's right. So, as in most cases, it comes down to what the user
needs: quicker recovery from errors, or harder attempts to correct
them :)


regards
   Mario
-- 
Good, Fast, Cheap: Pick any two (you can't have all three).
                                            -- RFC 1925, 7a



* Re: mdadm and TLER (Time Limited Error Recovery)
  2009-09-08 18:48   ` Iustin Pop
  2009-09-08 19:45     ` Mario 'BitKoenig' Holbe
@ 2009-09-09  1:33     ` Maurice Hilarius
  2009-09-09  8:21       ` Simon Jackson
  1 sibling, 1 reply; 12+ messages in thread
From: Maurice Hilarius @ 2009-09-09  1:33 UTC (permalink / raw)
  To: Mario 'BitKoenig' Holbe; +Cc: linux-raid, iusty

Iustin Pop wrote:
> ..
>> Anyways, clarification...
>> The only reason for TLER (Time Limited Error Recovery) is to behave
>> "friendly" toward RAID controllers that timeout disks.
>> In fact, md does not timeout disks as many Hardware RAID controllers do.
>> So, from md's point of view, TLER is useless, i.e. it has no benefit.
>>     
>
> I'm sorry but I disagree here. *Especially* because md is used over
> normal SATA controllers most of the time, TLER is beneficial because the
> drive doesn't go catatonic for minutes at a time trying to recover a bad
> sector, which would (because md doesn't timeout disks) cause md to hung
> up the whole device. TLER will allow md to see the error quickly and
> attempt to rewrite (read) or retry/fail the disk (write) for a bad the
> sector.
>
> Just my understanding of the md stack.
>
> regards,
> iustin
>
>   
I agree.
Before WD implemented this, we would quite often see cases where a
perfectly good drive would get "kicked out" of a RAID as frequently
as, or even more often than, on a hardware RAID.
TLER management seems to have eliminated most of these cases.



-- 
Regards, Maurice


* RE: mdadm and TLER (Time Limited Error Recovery)
  2009-09-09  1:33     ` Maurice Hilarius
@ 2009-09-09  8:21       ` Simon Jackson
  2009-09-09  9:00         ` Majed B.
  2009-09-09 11:04         ` Mario 'BitKoenig' Holbe
  0 siblings, 2 replies; 12+ messages in thread
From: Simon Jackson @ 2009-09-09  8:21 UTC (permalink / raw)
  To: linux-raid

This sounds like an interesting proposition for the RAID 1 setups that I am using.  In a couple of cases I have seen unresponsive drives, retrying on a bad block, seemingly lock up my system, or at least slow its response significantly.

In my case I am using Seagate and Hitachi drives.  A look at Wikipedia indicates that Hitachi has something called "Command Completion Time Limit" and Seagate has "Error Recovery Control".

Can anyone please tell me how I would go about setting timeout values on these types of drive? Are there utility programs to do this, or a Linux command?

Thanks Simon.

--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


* Re: mdadm and TLER (Time Limited Error Recovery)
  2009-09-09  8:21       ` Simon Jackson
@ 2009-09-09  9:00         ` Majed B.
  2009-09-09 11:04         ` Mario 'BitKoenig' Holbe
  1 sibling, 0 replies; 12+ messages in thread
From: Majed B. @ 2009-09-09  9:00 UTC (permalink / raw)
  To: Simon Jackson; +Cc: linux-raid

If there's no specific utility from the manufacturer for Linux, you
might want to take a look at "sdparm"


-- 
       Majed B.

* Re: mdadm and TLER (Time Limited Error Recovery)
  2009-09-09  8:21       ` Simon Jackson
  2009-09-09  9:00         ` Majed B.
@ 2009-09-09 11:04         ` Mario 'BitKoenig' Holbe
  2009-09-10  9:26           ` Simon Jackson
  2009-09-15 15:36           ` Simon Jackson
  1 sibling, 2 replies; 12+ messages in thread
From: Mario 'BitKoenig' Holbe @ 2009-09-09 11:04 UTC (permalink / raw)
  To: linux-raid

Simon Jackson <sjackson@bluearc.com> wrote:
> In my case I am using Seagate and Hitachi drives.
> Please can anyone tell me how I would go about setting timeout values on these types of drive. Are there utility programs to do this or a Linux 

Well, my Seagates have an RTL (Recovery time limit (ms)) field in the rw
(Read write error recovery) mode page.

You could try something like `sdparm -s RTL=7000 /dev/sdX' to set it to
7 seconds. I don't know if it works, I haven't tested it, so use it at
your own risk! Make a backup first, it could blow up your disk or even
the universe :) Tell us if it worked :)
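As a hedged sketch of the suggestion above: since the RTL field is in
milliseconds, it may help to convert from seconds explicitly and to read
the current value before changing it. The helper names below are
hypothetical; the commands are only printed, not executed, and use
sdparm's long options --get and --set. Whether a given drive accepts
the change is not something this sketch can tell you.

```shell
# Convert whole seconds to the milliseconds the RTL field expects.
secs_to_ms() {
    echo $(( $1 * 1000 ))
}

# Print (do not run) the sdparm commands to inspect and then set RTL.
# Nothing here touches a disk; review the output before running it.
rtl_commands() {
    dev=$1
    secs=$2
    echo "sdparm --get=RTL $dev"
    echo "sdparm --set=RTL=$(secs_to_ms "$secs") $dev"
}
```

Calling `rtl_commands /dev/sdX 7` prints the read command followed by
the write command for a 7-second limit.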


regards
   Mario
-- 
() Ascii Ribbon Campaign
/\ Support plain text e-mail



* RE: mdadm and TLER (Time Limited Error Recovery)
  2009-09-09 11:04         ` Mario 'BitKoenig' Holbe
@ 2009-09-10  9:26           ` Simon Jackson
  2009-09-10  9:39             ` Majed B.
  2009-09-10  9:46             ` Robin Hill
  2009-09-15 15:36           ` Simon Jackson
  1 sibling, 2 replies; 12+ messages in thread
From: Simon Jackson @ 2009-09-10  9:26 UTC (permalink / raw)
  To: Mario 'BitKoenig' Holbe, linux-raid

Hmmm.  I do not have sdparm on my system. 

Linux 2.6.26-1-amd64 #1 SMP Sat Jan 10 17:57:00 UTC 2009 x86_64 GNU/Linux

Is this the same functionality as the hdparm utility? 


* Re: mdadm and TLER (Time Limited Error Recovery)
  2009-09-10  9:26           ` Simon Jackson
@ 2009-09-10  9:39             ` Majed B.
  2009-09-10  9:46             ` Robin Hill
  1 sibling, 0 replies; 12+ messages in thread
From: Majed B. @ 2009-09-10  9:39 UTC (permalink / raw)
  To: Simon Jackson; +Cc: Mario 'BitKoenig' Holbe, linux-raid

sdparm is similar to hdparm, but meant to deal with SCSI devices
(including SATA).

Your distribution's repository should have it; otherwise, download and
compile it.


-- 
       Majed B.

* Re: mdadm and TLER (Time Limited Error Recovery)
  2009-09-10  9:26           ` Simon Jackson
  2009-09-10  9:39             ` Majed B.
@ 2009-09-10  9:46             ` Robin Hill
  1 sibling, 0 replies; 12+ messages in thread
From: Robin Hill @ 2009-09-10  9:46 UTC (permalink / raw)
  To: linux-raid

On Thu Sep 10, 2009 at 10:26:35AM +0100, Simon Jackson wrote:

> Hmmm.  I do not have sdparm on my system. 
> 
> Linux 2.6.26-1-amd64 #1 SMP Sat Jan 10 17:57:00 UTC 2009 x86_64 GNU/Linux
> 
> Is this the same functionality as the hdparm utility? 
> 
Yes - it was written for SCSI drives rather than IDE drives.  Because
the two were targeted at different markets, SCSI drives have
(historically) provided a lot more low-level parameters for tweaking.
Newer ATA drives have been incorporating a lot of the same functionality
though, so it'll work (to some extent) with them as well.

The homepage is at http://sg.danny.cz/sg/sdparm.html

HTH,
    Robin
-- 
     ___        
    ( ' }     |       Robin Hill        <robin@robinhill.me.uk> |
   / / )      | Little Jim says ....                            |
  // !!       |      "He fallen in de water !!"                 |


* RE: mdadm and TLER (Time Limited Error Recovery)
  2009-09-09 11:04         ` Mario 'BitKoenig' Holbe
  2009-09-10  9:26           ` Simon Jackson
@ 2009-09-15 15:36           ` Simon Jackson
  1 sibling, 0 replies; 12+ messages in thread
From: Simon Jackson @ 2009-09-15 15:36 UTC (permalink / raw)
  To: Mario 'BitKoenig' Holbe, linux-raid

Did not appear to work.

$ sdparm -s "RTL=7000" /dev/sda
    /dev/sda: ATA       ST980818SM        3.AA
change_mode_page: failed setting page: Read write error recovery
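A rough sketch of how the output above could be interpreted. The
identification line reports the vendor as ATA, so the drive is reached
through the kernel's SCSI-to-ATA translation, and a plausible (but
unconfirmed) reason for the failure is that this mode page cannot be
changed that way. On ATA drives the comparable setting is SCT Error
Recovery Control, which smartctl exposes as `-l scterc` with values in
units of 100 ms. This assumes a smartmontools build that has that option
and a drive that supports SCT ERC; neither is confirmed in this thread.
The helper names and the DEVICE placeholder are hypothetical.

```shell
# Extract the vendor field from sdparm's identification line, e.g.
#   "    /dev/sda: ATA       ST980818SM        3.AA"
ident_vendor() {
    echo "$1" | awk '{ print $2 }'
}

# Print a suggested command for a 7-second limit, depending on whether
# the identification line says the drive is ATA. Nothing is executed.
suggest_tool() {
    if [ "$(ident_vendor "$1")" = "ATA" ]; then
        # 70 units of 100 ms = 7 s, for read and write respectively
        echo "smartctl -l scterc,70,70 DEVICE"
    else
        # RTL is in milliseconds
        echo "sdparm --set=RTL=7000 DEVICE"
    fi
}
```

Running `smartctl -l scterc DEVICE` without values only reports the
current setting, which is a safer first step than writing one.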

