* TLER / CCTL timeout handling
@ 2010-03-23 22:10 Nebojsa Trpkovic
  2010-03-24 21:54 ` Stefan /*St0fF*/ Hübner
  0 siblings, 1 reply; 3+ messages in thread
From: Nebojsa Trpkovic @ 2010-03-23 22:10 UTC (permalink / raw)
  To: linux-raid

Hello.

I've found an interesting text about TLER / CCTL
(http://en.wikipedia.org/wiki/Time-Limited_Error_Recovery )

on desktop class drives:
http://forums.storagereview.com/index.php/topic/28333-tler-cctl/

So, the question is:

If I make my drive report back that it failed to read the requested
sector, how will that report be handled?

Will Linux software RAID be aware of that report and take some action
(rebuilding the affected stripe or at least the whole array, reallocating
bad sectors along the way)?


Thank you.
Nebojsa Trpkovic




* Re: TLER / CCTL timeout handling
  2010-03-23 22:10 TLER / CCTL timeout handling Nebojsa Trpkovic
@ 2010-03-24 21:54 ` Stefan /*St0fF*/ Hübner
  2010-03-25  0:48   ` Nebojsa Trpkovic
  0 siblings, 1 reply; 3+ messages in thread
From: Stefan /*St0fF*/ Hübner @ 2010-03-24 21:54 UTC (permalink / raw)
  To: Nebojsa Trpkovic; +Cc: linux-raid

On 23.03.2010 23:10, Nebojsa Trpkovic wrote:
> Hello.
> 
> I've found an interesting text about TLER / CCTL
> (http://en.wikipedia.org/wiki/Time-Limited_Error_Recovery )

You might want to read further in the ATA8-ACS specification, in the
section on the SCT Command Transport.
> 
> on desktop class drives:
> http://forums.storagereview.com/index.php/topic/28333-tler-cctl/
> 
> So, the question is:
> 
> If I make my drive report back that it failed to read the requested
> sector, how will that report be handled?

As far as I understood from recent answers to my own questions: as expected.
> 
> Will Linux software RAID be aware of that report and start some action
> (rebuilding affected stripe or at least whole array, reallocating bad
> sectors along the way) ?
> 
Indeed.  Michael Tokarev answered me on 2/9/10:
"On failed _read_ it tries to
reconstruct data from other disk drives and writes the reconstructed
data back to the drive where read failed.  If the _write_ fails md will
drop the disk."

This means: if a read fails and the drive does not report back, the
subsequent reconstructing writes will fail, too.  The disk gets dropped
because it is (most probably) still busy with its error recovery for the
earlier read request and is therefore not responding.
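
To make the sequence concrete, here is a toy sketch in Python.  It is not
md's actual code: the Disk class, the xor() helper and read_with_repair()
are hypothetical names invented for this illustration, and it assumes a
simple RAID-5 style XOR parity across three member disks.

```python
from functools import reduce


class Disk:
    """Toy member disk (hypothetical): a list of equally sized blocks."""

    def __init__(self, blocks, bad=(), accepts_writes=True):
        self.blocks = list(blocks)
        self.bad = set(bad)                  # block indices that fail on read
        self.accepts_writes = accepts_writes
        self.dropped = False

    def read(self, i):
        if i in self.bad:
            raise IOError("media error")
        return self.blocks[i]

    def write(self, i, data):
        if not self.accepts_writes:          # e.g. drive busy, request times out
            raise IOError("write failed")
        self.blocks[i] = data
        self.bad.discard(i)                  # a successful rewrite clears the error


def xor(blocks):
    """XOR equally sized byte blocks together (RAID-5 style parity)."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))


def read_with_repair(disks, target, i):
    """Read block i from disks[target]; on a read error rebuild it."""
    disk = disks[target]
    try:
        return disk.read(i)
    except IOError:
        others = [d for k, d in enumerate(disks)
                  if k != target and not d.dropped]
        data = xor([d.read(i) for d in others])
        try:
            disk.write(i, data)              # write reconstructed data back
        except IOError:
            disk.dropped = True              # failed write: member is dropped
        return data
```

With a drive that answers the read with an error and then accepts the
write, the block is repaired in place.  With a drive that is still stuck
in its own recovery (accepts_writes=False here), the write-back fails and
the member is marked as dropped, although the caller still gets the data.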

If you enable ERC read timeouts, the drive will report a media error (or
something similar) but still honour the write request.  If you set the
ERC write timeout to a value that is neither too small nor too large
(i.e. it shouldn't make the write operation time out from the kernel's
point of view), the drive will either fix the pending sector or
reallocate it.  If the ERC write timeout is too small, the drive will
reallocate sectors very aggressively, which is not what you want: there
are very few spare sectors (only a few thousand, compared to the total
number of sectors).
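
The "not too large" condition can be checked mechanically.  The following
is only a sketch under stated assumptions: ERC values are given in tenths
of a second with 0 meaning disabled, the kernel's per-command timeout is
read from /sys/block/<dev>/device/timeout (usually 30 seconds), and the
5-second safety margin is an arbitrary choice made for this example.  The
function names are hypothetical.

```python
from pathlib import Path

DEFAULT_KERNEL_TIMEOUT_S = 30  # usual Linux SCSI command timeout


def kernel_timeout_seconds(device, sysfs_root="/sys/block"):
    """Return the kernel command timeout for e.g. 'sda', or the default."""
    path = Path(sysfs_root) / device / "device" / "timeout"
    try:
        return int(path.read_text().strip())
    except (OSError, ValueError):
        # Attribute missing or unreadable: fall back to the usual default.
        return DEFAULT_KERNEL_TIMEOUT_S


def erc_fits_kernel_timeout(erc_deciseconds, kernel_timeout_s, margin_s=5):
    """True if the drive gives up (ERC) safely before the kernel does."""
    if erc_deciseconds <= 0:
        # ERC disabled: the drive may keep retrying longer than the kernel waits.
        return False
    return erc_deciseconds / 10 + margin_s <= kernel_timeout_s


if __name__ == "__main__":
    timeout = kernel_timeout_seconds("sda")
    print("ERC of 7.0 s fits:", erc_fits_kernel_timeout(70, timeout))
```

This only covers the upper bound.  What counts as "too small" for the
write timeout depends on the drive and is not something this check can
decide.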
> 
> Thank you.
> Nebojsa Trpkovic

You're welcome, and all the best,
Stefan


* Re: TLER / CCTL timeout handling
  2010-03-24 21:54 ` Stefan /*St0fF*/ Hübner
@ 2010-03-25  0:48   ` Nebojsa Trpkovic
  0 siblings, 0 replies; 3+ messages in thread
From: Nebojsa Trpkovic @ 2010-03-25  0:48 UTC (permalink / raw)
  To: st0ff; +Cc: Stefan /*St0fF*/ Hübner, linux-raid

Thank you very much!

This is a great answer (it explains a lot) and great news (there's
nothing to worry about or hack around).

So now we desktop-drive users just have to wait for smartmontools 5.40,
or pull the source from SVN, and set some reasonable ERC read timeouts.

Is a value of 7 seconds considered a reasonable ERC read timeout?
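
For illustration, a minimal sketch of how such a timeout might be set.
It assumes smartctl's "-l scterc,READTIME,WRITETIME" option, which takes
both values in tenths of a second, so 7 seconds becomes 70.  The helper
name scterc_command() is made up for this example, and the code only
builds the command line; it does not run it.

```python
def scterc_command(device, read_seconds, write_seconds):
    """Build a smartctl argv that sets SCT ERC read/write timeouts."""
    read_ds = round(read_seconds * 10)    # seconds -> tenths of a second
    write_ds = round(write_seconds * 10)
    if read_ds < 0 or write_ds < 0:
        raise ValueError("timeouts must not be negative")
    return ["smartctl", "-l", f"scterc,{read_ds},{write_ds}", device]


if __name__ == "__main__":
    # Prints the command instead of executing it.
    print(" ".join(scterc_command("/dev/sda", 7, 7)))
```

The resulting list can be handed to subprocess.run() as root.  On many
drives the setting does not survive a power cycle, so it would have to be
reapplied at boot.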

Nebojsa
