* detection/correction of corruption with raid6
@ 2008-12-05 21:00 Redeeman
  2008-12-05 21:02 ` Justin Piszcz
  0 siblings, 1 reply; 26+ messages in thread
From: Redeeman @ 2008-12-05 21:00 UTC (permalink / raw)
  To: linux-raid

Hello.

I was looking at the PDFs linked to from the wiki, and found this:
http://kernel.org/pub/linux/kernel/people/hpa/raid6.pdf

More specifically, section 4, starting on page 8.

Am I understanding this correctly: with raid6, is Linux capable of
detecting that the content on one disk is corrupted, and of
reconstructing it from the remaining disks?


Best regards,
Kasper Sandberg



* Re: detection/correction of corruption with raid6
  2008-12-05 21:00 detection/correction of corruption with raid6 Redeeman
@ 2008-12-05 21:02 ` Justin Piszcz
  2008-12-05 21:06   ` Redeeman
  0 siblings, 1 reply; 26+ messages in thread
From: Justin Piszcz @ 2008-12-05 21:02 UTC (permalink / raw)
  To: Redeeman; +Cc: linux-raid



On Fri, 5 Dec 2008, Redeeman wrote:

> Hello.
>
> I was looking at the PDFs linked to from the wiki, and found this:
> http://kernel.org/pub/linux/kernel/people/hpa/raid6.pdf
>
> More specifically, section 4, starting on page 8.
>
> Am I understanding this correctly, in that with raid6, linux is capable
> of detecting if the content on 1 disk is corrupted, and reconstruct it
> from the remaining disks?

I ran md/raid6 for a while. Do you mean remap the bad sector on the
fly?  Linux/md raid does not do this, AFAIK.

But it can recover from a single or double disk failure; I have had
both happen.

Justin.


* Re: detection/correction of corruption with raid6
  2008-12-05 21:02 ` Justin Piszcz
@ 2008-12-05 21:06   ` Redeeman
  2008-12-05 21:09     ` Justin Piszcz
  0 siblings, 1 reply; 26+ messages in thread
From: Redeeman @ 2008-12-05 21:06 UTC (permalink / raw)
  To: Justin Piszcz; +Cc: linux-raid

On Fri, 2008-12-05 at 16:02 -0500, Justin Piszcz wrote:
> 
> On Fri, 5 Dec 2008, Redeeman wrote:
> 
> > Hello.
> >
> > I was looking at the PDFs linked to from the wiki, and found this:
> > http://kernel.org/pub/linux/kernel/people/hpa/raid6.pdf
> >
> > More specifically, section 4, starting on page 8.
> >
> > Am I understanding this correctly, in that with raid6, linux is capable
> > of detecting if the content on 1 disk is corrupted, and reconstruct it
> > from the remaining disks?
> 
> I ran md/raid6 for awhile, do you mean remap the bad sector on the fly? 
> Linux/md raid does not do this afaik.

No, I mean the case where one disk silently corrupts data.

> 
> But it can recover from a single or double disk failure, I have had both 
> happen.
> 
> Justin.

* Re: detection/correction of corruption with raid6
  2008-12-05 21:06   ` Redeeman
@ 2008-12-05 21:09     ` Justin Piszcz
  2008-12-05 21:12       ` Redeeman
  0 siblings, 1 reply; 26+ messages in thread
From: Justin Piszcz @ 2008-12-05 21:09 UTC (permalink / raw)
  To: Redeeman; +Cc: linux-raid



On Fri, 5 Dec 2008, Redeeman wrote:

> On Fri, 2008-12-05 at 16:02 -0500, Justin Piszcz wrote:
>>
>> On Fri, 5 Dec 2008, Redeeman wrote:
>>
>>> Hello.
>>>
>>> I was looking at the PDFs linked to from the wiki, and found this:
>>> http://kernel.org/pub/linux/kernel/people/hpa/raid6.pdf
>>>
>>> More specifically, section 4, starting on page 8.
>>>
>>> Am I understanding this correctly, in that with raid6, linux is capable
>>> of detecting if the content on 1 disk is corrupted, and reconstruct it
>>> from the remaining disks?
>>
>> I ran md/raid6 for awhile, do you mean remap the bad sector on the fly?
>> Linux/md raid does not do this afaik.
>
> No, i mean, if one disk does silent corruption

What would the error look like?  Both Linux/md and the 3ware manual
recommend you run a 'check' across the raid at least once a week
(3ware calls it raid-verify), and Debian's md setup runs a check once
a month, I believe, to catch these issues.

If you are asking whether a read error on a latent bad sector on one
disk will result in the data being read from the remaining disks, that
is a good question.
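
For reference, the scheduled check boils down, on Linux md, to poking
the array's sysfs interface; this is what the Debian monthly cron job
effectively does. A minimal Python equivalent (array name assumed,
root required):

# Kick off an md consistency scrub by hand: write "check" to the
# array's sync_action file.
ARRAY = "md0"  # assumed array name
with open(f"/sys/block/{ARRAY}/md/sync_action", "w") as f:
    f.write("check")
# When the scrub finishes, /sys/block/md0/md/mismatch_cnt holds a
# count of the inconsistencies it found.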

Justin.


* Re: detection/correction of corruption with raid6
  2008-12-05 21:09     ` Justin Piszcz
@ 2008-12-05 21:12       ` Redeeman
  2008-12-05 21:17         ` Justin Piszcz
  2008-12-05 21:30         ` Michał Przyłuski
  0 siblings, 2 replies; 26+ messages in thread
From: Redeeman @ 2008-12-05 21:12 UTC (permalink / raw)
  To: Justin Piszcz; +Cc: linux-raid

On Fri, 2008-12-05 at 16:09 -0500, Justin Piszcz wrote:
> 
> On Fri, 5 Dec 2008, Redeeman wrote:
> 
> > On Fri, 2008-12-05 at 16:02 -0500, Justin Piszcz wrote:
> >>
> >> On Fri, 5 Dec 2008, Redeeman wrote:
> >>
> >>> Hello.
> >>>
> >>> I was looking at the PDFs linked to from the wiki, and found this:
> >>> http://kernel.org/pub/linux/kernel/people/hpa/raid6.pdf
> >>>
> >>> More specifically, section 4, starting on page 8.
> >>>
> >>> Am I understanding this correctly, in that with raid6, linux is capable
> >>> of detecting if the content on 1 disk is corrupted, and reconstruct it
> >>> from the remaining disks?
> >>
> >> I ran md/raid6 for awhile, do you mean remap the bad sector on the fly?
> >> Linux/md raid does not do this afaik.
> >
> > No, i mean, if one disk does silent corruption
> 
> What would the error look like?  Both md/Linux & in the 3ware manual 
> recommend you run a 'check' across the raid at least once a week 
> (3ware/raid-verify) and md/Linux in Debian runs a check once a month I 
> believe to eliminate these issues.
> 
> If you are asking whether a read error of a latent sector from the one 
> disk will result it reading the data from the second disk that is a good
> question.

I'm asking: if one disk in a raid6 setup suddenly decides to flip a
few bits in some bytes, will md be able to detect that in a scan and
correct it? I can't see how it could do that with raid5, but maybe
raid6 can?
> 
> Justin.

* Re: detection/correction of corruption with raid6
  2008-12-05 21:12       ` Redeeman
@ 2008-12-05 21:17         ` Justin Piszcz
  2008-12-05 21:30         ` Michał Przyłuski
  1 sibling, 0 replies; 26+ messages in thread
From: Justin Piszcz @ 2008-12-05 21:17 UTC (permalink / raw)
  To: Redeeman; +Cc: linux-raid



On Fri, 5 Dec 2008, Redeeman wrote:

> On Fri, 2008-12-05 at 16:09 -0500, Justin Piszcz wrote:
>>
>> On Fri, 5 Dec 2008, Redeeman wrote:
>>
>>> On Fri, 2008-12-05 at 16:02 -0500, Justin Piszcz wrote:
>>>>
>>>> On Fri, 5 Dec 2008, Redeeman wrote:
>>>>
>>>>> Hello.
>>>>>
>>>>> I was looking at the PDFs linked to from the wiki, and found this:
>>>>> http://kernel.org/pub/linux/kernel/people/hpa/raid6.pdf
>>>>>
>>>>> More specifically, section 4, starting on page 8.
>>>>>
>>>>> Am I understanding this correctly, in that with raid6, linux is capable
>>>>> of detecting if the content on 1 disk is corrupted, and reconstruct it
>>>>> from the remaining disks?
>>>>
>>>> I ran md/raid6 for awhile, do you mean remap the bad sector on the fly?
>>>> Linux/md raid does not do this afaik.
>>>
>>> No, i mean, if one disk does silent corruption
>>
>> What would the error look like?  Both md/Linux & in the 3ware manual
>> recommend you run a 'check' across the raid at least once a week
>> (3ware/raid-verify) and md/Linux in Debian runs a check once a month I
>> believe to eliminate these issues.
>>
>> If you are asking whether a read error of a latent sector from the one
>> disk will result it reading the data from the second disk that is a good
>> question.
>
> im asking, if one disk in a raid6 setup suddenly decides to flip a few
> bits in some bytes, will it be able to detect that in a scan, and
> correct it? i cant see how it can do it on raid5, but maybe raid6?

I have never seen any kernel message showing, for example, "md/raid:
fixed sector 29383", and I have had 8-9 RMAs with Western Digital for
bad/failed VelociRaptors in RAID6. I had to go with raid6 because the
drives were so bad that sometimes 2 at a time would drop out of the
array.  However, in the instances where there were problems with just
one drive, md kicked it out of the array and the raid6 became degraded.


* Re: detection/correction of corruption with raid6
  2008-12-05 21:12       ` Redeeman
  2008-12-05 21:17         ` Justin Piszcz
@ 2008-12-05 21:30         ` Michał Przyłuski
  2008-12-05 22:12           ` Peter Rabbitson
  2008-12-12 15:31           ` Redeeman
  1 sibling, 2 replies; 26+ messages in thread
From: Michał Przyłuski @ 2008-12-05 21:30 UTC (permalink / raw)
  To: Redeeman; +Cc: linux-raid

Hi,

2008/12/5 Redeeman <redeeman@metanurb.dk>:
> On Fri, 2008-12-05 at 16:09 -0500, Justin Piszcz wrote:
>>
>> On Fri, 5 Dec 2008, Redeeman wrote:
>>
>> > On Fri, 2008-12-05 at 16:02 -0500, Justin Piszcz wrote:
>> >>
>> >> On Fri, 5 Dec 2008, Redeeman wrote:
>> >>
>> >>> Hello.
>> >>>
>> >>> I was looking at the PDFs linked to from the wiki, and found this:
>> >>> http://kernel.org/pub/linux/kernel/people/hpa/raid6.pdf
>> >>>
>> >>> More specifically, section 4, starting on page 8.
>> >>>
>> >>> Am I understanding this correctly, in that with raid6, linux is capable
>> >>> of detecting if the content on 1 disk is corrupted, and reconstruct it
>> >>> from the remaining disks?
>> >>
>> >> I ran md/raid6 for awhile, do you mean remap the bad sector on the fly?
>> >> Linux/md raid does not do this afaik.
>> >
>> > No, i mean, if one disk does silent corruption
>>
>> What would the error look like?  Both md/Linux & in the 3ware manual
>> recommend you run a 'check' across the raid at least once a week
>> (3ware/raid-verify) and md/Linux in Debian runs a check once a month I
>> believe to eliminate these issues.
>>
>> If you are asking whether a read error of a latent sector from the one
>> disk will result it reading the data from the second disk that is a good
>> question.
>
> im asking, if one disk in a raid6 setup suddenly decides to flip a few
> bits in some bytes, will it be able to detect that in a scan, and
> correct it? i cant see how it can do it on raid5, but maybe raid6?

No, not really.
I've been investigating silent corruption for quite a while now, and
it looks more or less like this.
During a "check" action it'll be detected. During normal operation it
won't be.
Normal (non-degraded) raid5/6 reads don't read parity (or the Q
syndrome); they just read data. So they have no idea that something
went bad.
Now, the worse news is that you cannot really fix it automagically,
even after it is detected by a "check". A "repair" will overwrite
parity and the Q syndrome with new values (new = calculated from what
appear to be the data blocks).

It is possible (by the theory of the Q syndrome, per the article you
linked) to detect which drive is doing the silent corruption with
raid6 (under the extra assumption that just one drive is doing it).
But it's not implemented.
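
For what it's worth, the locating math from section 4 of the paper is
simple enough to sketch. Here is a rough Python illustration of the
algebra (my own sketch, not code from the md driver): over GF(2^8)
with generator g = 2, a single corrupt data drive z turns the P
syndrome into the error byte E and the Q syndrome into g^z * E, so z
falls out of a discrete logarithm.

# GF(2^8) log/antilog tables for generator 0x02 modulo the RAID-6
# polynomial x^8 + x^4 + x^3 + x^2 + 1 (0x11d).
EXP = [0] * 512
LOG = [0] * 256
v = 1
for i in range(255):
    EXP[i] = v
    LOG[v] = i
    v <<= 1
    if v & 0x100:
        v ^= 0x11d
for i in range(255, 512):
    EXP[i] = EXP[i - 255]               # wrap so gf_mul needs no reduction

def gf_mul(a, b):
    if a == 0 or b == 0:
        return 0
    return EXP[LOG[a] + LOG[b]]

def locate_bad_drive(data, p_stored, q_stored):
    """data: one byte from each data drive at the same stripe offset.
    Returns the index of the single corrupt data drive, 'P' or 'Q' if
    a parity block itself is suspect, or None otherwise."""
    p = q = 0
    for z, d in enumerate(data):
        p ^= d                          # P = sum of D_z
        q ^= gf_mul(EXP[z], d)          # Q = sum of g^z * D_z
    p_syn = p ^ p_stored                # equals the error byte E
    q_syn = q ^ q_stored                # equals g^z * E
    if p_syn == 0 and q_syn == 0:
        return None                     # position is consistent
    if p_syn == 0:
        return 'Q'                      # only Q disagrees
    if q_syn == 0:
        return 'P'                      # only P disagrees
    z = (LOG[q_syn] - LOG[p_syn]) % 255 # log of q_syn / p_syn
    return z if z < len(data) else None # None: not attributable

Running this per byte across a stripe and requiring every mismatching
byte to point at the same z is exactly the extra assumption above: if
the computed z values disagree, more than one device is lying and the
error cannot be located.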

Greets,
Mike


* Re: detection/correction of corruption with raid6
  2008-12-05 21:30         ` Michał Przyłuski
@ 2008-12-05 22:12           ` Peter Rabbitson
  2008-12-05 22:26             ` Michał Przyłuski
  2008-12-12 15:31           ` Redeeman
  1 sibling, 1 reply; 26+ messages in thread
From: Peter Rabbitson @ 2008-12-05 22:12 UTC (permalink / raw)
  To: Michał Przyłuski; +Cc: Redeeman, linux-raid

Michał Przyłuski wrote:
> Hi,
> 
> 2008/12/5 Redeeman <redeeman@metanurb.dk>:
>> On Fri, 2008-12-05 at 16:09 -0500, Justin Piszcz wrote:
>>> On Fri, 5 Dec 2008, Redeeman wrote:
>>>
>>>> On Fri, 2008-12-05 at 16:02 -0500, Justin Piszcz wrote:
>>>>> On Fri, 5 Dec 2008, Redeeman wrote:
>>>>>
>>>>>> Hello.
>>>>>>
>>>>>> I was looking at the PDFs linked to from the wiki, and found this:
>>>>>> http://kernel.org/pub/linux/kernel/people/hpa/raid6.pdf
>>>>>>
>>>>>> More specifically, section 4, starting on page 8.
>>>>>>
>>>>>> Am I understanding this correctly, in that with raid6, linux is capable
>>>>>> of detecting if the content on 1 disk is corrupted, and reconstruct it
>>>>>> from the remaining disks?
>>>>> I ran md/raid6 for awhile, do you mean remap the bad sector on the fly?
>>>>> Linux/md raid does not do this afaik.
>>>> No, i mean, if one disk does silent corruption
>>> What would the error look like?  Both md/Linux & in the 3ware manual
>>> recommend you run a 'check' across the raid at least once a week
>>> (3ware/raid-verify) and md/Linux in Debian runs a check once a month I
>>> believe to eliminate these issues.
>>>
>>> If you are asking whether a read error of a latent sector from the one
>>> disk will result it reading the data from the second disk that is a good
>>> question.
>> im asking, if one disk in a raid6 setup suddenly decides to flip a few
>> bits in some bytes, will it be able to detect that in a scan, and
>> correct it? i cant see how it can do it on raid5, but maybe raid6?
> 
> No, not really.
> I've been investigating silent corruption for a quite a while now, and
> it looks more or less like this.
> During a "check" action it'll be detected. During normal operation -
> it won't be detected.
> Normal (non-degraded) raid5/6 reads don't read parity (or Q syndrome),
> they just read data. So they have no idea that something went bad.
> Now, worse news is that you cannot really fix it automagically, even
> after detecting by a "check" procedure. A "repair" will overwrite
> parity and Q syndrome, with new values (new = calculated from what it
> seems to be data blocks).
> 
> It is possible (by the theory of Q syndrome, per the article you
> linked) to detect which drive is doing a silent corruption with raid6
> (and with some extra assumption, that just one drive is doing that).
> But it's not implemented.
> 

I'd like to shamelessly bring in an older related thread:
http://marc.info/?l=linux-raid&m=120605458309825
http://marc.info/?l=linux-raid&m=120618020817057

Maybe someone will get inspired and actually write the damned thing :)

Cheers
Peter

* Re: detection/correction of corruption with raid6
  2008-12-05 22:12           ` Peter Rabbitson
@ 2008-12-05 22:26             ` Michał Przyłuski
  2008-12-05 22:43               ` Greg Freemyer
  0 siblings, 1 reply; 26+ messages in thread
From: Michał Przyłuski @ 2008-12-05 22:26 UTC (permalink / raw)
  To: Peter Rabbitson; +Cc: Redeeman, linux-raid

2008/12/5 Peter Rabbitson <rabbit+list@rabbit.us>:
> Michał Przyłuski wrote:
>> Hi,
>>
>> 2008/12/5 Redeeman <redeeman@metanurb.dk>:
>>> On Fri, 2008-12-05 at 16:09 -0500, Justin Piszcz wrote:
>>>> On Fri, 5 Dec 2008, Redeeman wrote:
>>>>
>>>>> On Fri, 2008-12-05 at 16:02 -0500, Justin Piszcz wrote:
>>>>>> On Fri, 5 Dec 2008, Redeeman wrote:
>>>>>>
>>>>>>> Hello.
>>>>>>>
>>>>>>> I was looking at the PDFs linked to from the wiki, and found this:
>>>>>>> http://kernel.org/pub/linux/kernel/people/hpa/raid6.pdf
>>>>>>
>>>>>>> More specifically, section 4, starting on page 8.
>>>>>>>
>>>>>>> Am I understanding this correctly, in that with raid6, linux is capable
>>>>>>> of detecting if the content on 1 disk is corrupted, and reconstruct it
>>>>>>> from the remaining disks?
>>>>>> I ran md/raid6 for awhile, do you mean remap the bad sector on the fly?
>>>>>> Linux/md raid does not do this afaik.
>>>>> No, i mean, if one disk does silent corruption
>>>> What would the error look like?  Both md/Linux & in the 3ware manual
>>>> recommend you run a 'check' across the raid at least once a week
>>>> (3ware/raid-verify) and md/Linux in Debian runs a check once a month I
>>>> believe to eliminate these issues.
>>>>
>>>> If you are asking whether a read error of a latent sector from the one
>>>> disk will result it reading the data from the second disk that is a good
>>>> question.
>>> im asking, if one disk in a raid6 setup suddenly decides to flip a few
>>> bits in some bytes, will it be able to detect that in a scan, and
>>> correct it? i cant see how it can do it on raid5, but maybe raid6?
>>
>> No, not really.
>> I've been investigating silent corruption for a quite a while now, and
>> it looks more or less like this.
>> During a "check" action it'll be detected. During normal operation -
>> it won't be detected.
>> Normal (non-degraded) raid5/6 reads don't read parity (or Q syndrome),
>> they just read data. So they have no idea that something went bad.
>> Now, worse news is that you cannot really fix it automagically, even
>> after detecting by a "check" procedure. A "repair" will overwrite
>> parity and Q syndrome, with new values (new = calculated from what it
>> seems to be data blocks).
>>
>> It is possible (by the theory of Q syndrome, per the article you
>> linked) to detect which drive is doing a silent corruption with raid6
>> (and with some extra assumption, that just one drive is doing that).
>> But it's not implemented.
>>
>
> I'd like to shamelessly bring in an older related thread:
> http://marc.info/?l=linux-raid&m=120605458309825
> http://marc.info/?l=linux-raid&m=120618020817057
>
> Maybe someone will get inspired, and will actually write the damned thing :)

I concur. Even without a "fix", just printing information about which
disk is suspected of the silent corruption would be helpful. One could
at least fail the disk and get rid of it. Still better than taking
wild guesses at what went wrong. I'm a silent-corruption maniac
myself, keeping md5s of most of my bigger/more important files, so my
judgment might not be fair.

Also, it seems this feature is asked about 3-4 times a year, which
probably makes it the second most requested feature after the numerous
reshape variations.
Regards,
Mike


* Re: detection/correction of corruption with raid6
  2008-12-05 22:26             ` Michał Przyłuski
@ 2008-12-05 22:43               ` Greg Freemyer
  2008-12-06  0:39                 ` Roger Heflin
  0 siblings, 1 reply; 26+ messages in thread
From: Greg Freemyer @ 2008-12-05 22:43 UTC (permalink / raw)
  To: Michał Przyłuski; +Cc: Peter Rabbitson, Redeeman, linux-raid

2008/12/5 Michał Przyłuski <mikylie@gmail.com>:
> 2008/12/5 Peter Rabbitson <rabbit+list@rabbit.us>:
>> Michał Przyłuski wrote:
>>> Hi,
>>>
>>> 2008/12/5 Redeeman <redeeman@metanurb.dk>:
>>>> On Fri, 2008-12-05 at 16:09 -0500, Justin Piszcz wrote:
>>>>> On Fri, 5 Dec 2008, Redeeman wrote:
>>>>>
>>>>>> On Fri, 2008-12-05 at 16:02 -0500, Justin Piszcz wrote:
>>>>>>> On Fri, 5 Dec 2008, Redeeman wrote:
>>>>>>>
>>>>>>>> Hello.
>>>>>>>>
>>>>>>>> I was looking at the PDFs linked to from the wiki, and found this:
>>>>>>>> http://kernel.org/pub/linux/kernel/people/hpa/raid6.pdf
>>>>>>>
>>>>>>>> More specifically, section 4, starting on page 8.
>>>>>>>>
>>>>>>>> Am I understanding this correctly, in that with raid6, linux is capable
>>>>>>>> of detecting if the content on 1 disk is corrupted, and reconstruct it
>>>>>>>> from the remaining disks?
>>>>>>> I ran md/raid6 for awhile, do you mean remap the bad sector on the fly?
>>>>>>> Linux/md raid does not do this afaik.
>>>>>> No, i mean, if one disk does silent corruption
>>>>> What would the error look like?  Both md/Linux & in the 3ware manual
>>>>> recommend you run a 'check' across the raid at least once a week
>>>>> (3ware/raid-verify) and md/Linux in Debian runs a check once a month I
>>>>> believe to eliminate these issues.
>>>>>
>>>>> If you are asking whether a read error of a latent sector from the one
>>>>> disk will result it reading the data from the second disk that is a good
>>>>> question.
>>>> im asking, if one disk in a raid6 setup suddenly decides to flip a few
>>>> bits in some bytes, will it be able to detect that in a scan, and
>>>> correct it? i cant see how it can do it on raid5, but maybe raid6?
>>>
>>> No, not really.
>>> I've been investigating silent corruption for a quite a while now, and
>>> it looks more or less like this.
>>> During a "check" action it'll be detected. During normal operation -
>>> it won't be detected.
>>> Normal (non-degraded) raid5/6 reads don't read parity (or Q syndrome),
>>> they just read data. So they have no idea that something went bad.
>>> Now, worse news is that you cannot really fix it automagically, even
>>> after detecting by a "check" procedure. A "repair" will overwrite
>>> parity and Q syndrome, with new values (new = calculated from what it
>>> seems to be data blocks).
>>>
>>> It is possible (by the theory of Q syndrome, per the article you
>>> linked) to detect which drive is doing a silent corruption with raid6
>>> (and with some extra assumption, that just one drive is doing that).
>>> But it's not implemented.
>>>
>>
>> I'd like to shamelessly bring in an older related thread:
>> http://marc.info/?l=linux-raid&m=120605458309825
>> http://marc.info/?l=linux-raid&m=120618020817057
>>
>> Maybe someone will get inspired, and will actually write the damned thing :)
>
> I concur. Even without a "fix", just printing information which disk
> is suspected of doing silent corruption will be helpful. One can at
> least, fail the disk, and get rid of it. Still better than taking wild
> guesses what went wrong. I'm a silent corruption maniac myself,
> keeping md5's of most bigger/more important files, so my judgment
> might not be fair.
>
> Also, it seems the feature is being asked about about 3-4 times a
> year, which is probably the second most requested feature after
> numerous reshape variations.
> Regards,
> Mike
>

I'm also very concerned about silent corruption, and we often "verify"
our critical large files by performing MD5 verification against a
known good value, especially when we make copies or move them from one
medium to another.
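
A minimal sketch of that kind of verification in Python (the manifest
format is assumed here: one "<md5hex>  <path>" line per file, as the
md5sum tool produces):

import hashlib

def verify_manifest(manifest_path):
    """Return the paths whose current MD5 no longer matches the
    known-good value recorded in the manifest."""
    bad = []
    with open(manifest_path) as manifest:
        for line in manifest:
            expected, path = line.strip().split(None, 1)
            h = hashlib.md5()
            with open(path, "rb") as f:
                # hash in 1 MiB chunks so large files fit in memory
                for chunk in iter(lambda: f.read(1 << 20), b""):
                    h.update(chunk)
            if h.hexdigest() != expected:
                bad.append(path)        # silent-corruption candidate
    return bad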

But in all the cases of silent corruption I've seen, it was never the
disk.  Instead I've seen it be the cable, the controller, bad memory,
or a bad power supply, but never the disk itself.  Not to say the
disk's own controller could not be the cause, just that I have not
seen it.

I did not read the relevant threads, but do they cover all of these
sources of silent corruption, or just the case where a disk is the
source?

Thanks
Greg
-- 
Greg Freemyer
Litigation Triage Solutions Specialist
http://www.linkedin.com/in/gregfreemyer
First 99 Days Litigation White Paper -
http://www.norcrossgroup.com/forms/whitepapers/99%20Days%20whitepaper.pdf

The Norcross Group
The Intersection of Evidence & Technology
http://www.norcrossgroup.com


* Re: detection/correction of corruption with raid6
  2008-12-05 22:43               ` Greg Freemyer
@ 2008-12-06  0:39                 ` Roger Heflin
  0 siblings, 0 replies; 26+ messages in thread
From: Roger Heflin @ 2008-12-06  0:39 UTC (permalink / raw)
  To: Greg Freemyer
  Cc: Michał Przyłuski, Peter Rabbitson, Redeeman, linux-raid

Greg Freemyer wrote:

> I'm also very concerned about silent corruption and we often "verify"
> our critical large files by  performing MD5 verifies against a known
> good value.  Especially when we make copies or move them from one
> media to another.
> 
> But in all the cases of silent corruption I've seen, it was never the
> disk.  Instead I've seen it be the cable, the controller, bad memory,
> bad power supply, but never the disk itself.  Not to say the disk
> controller could not be the cause, just that I have not seen it.
> 
> I did not read the relevant threads, but do they cover all of these
> sources of silent corruption, or just if a disk is the source?
> 
> Thanks
> Greg

I will second what Greg says; I have debugged a number of corruptions
related to filesystems.  I have never seen it be the disk. I have seen
3-4 different controllers corrupt data (a bad PCI/motherboard
interaction with two different manufacturers' controllers, plus one
outright bad controller).

And then the #1 issue is actual bad memory or a bad power supply in
the machine.  None of the actual cases I saw affected *ONLY* a single
disk; they affected all of the disks on the controller, so whatever
has to be done would almost have to be done at the filesystem level or
the application level.  The typical corruption is not data off of the
disk: the platters themselves (and the internals of the disk) appear
to have very, very good corruption detection and correction, and it is
really unlikely for a bad sector read to not get caught.  The PCI bus
only has parity (and parity errors on the PCI bus are likely not being
monitored unless you installed the edac_mc module), so 50% of the
errors that happen get missed.  This was one of the bad PCI/MB
interactions: one of the slots on a certain motherboard (all boards of
that specific model, with a couple of different companies' cards) *HAD*
to be throttled to not produce corrupt data every 1GB of reads or so.

And internally the controllers often have poor checking, and will miss
things if the controller goes bad.  The disks themselves appear to
have very good internal controls; I have never seen disk electronics
screw up and corrupt data either.

Basically, don't waste time worrying about a single disk corrupting
data silently; worry about everything after the disk first, as that is
the weakest link of everything and is far, far more likely to bite you.



* Re: detection/correction of corruption with raid6
  2008-12-05 21:30         ` Michał Przyłuski
  2008-12-05 22:12           ` Peter Rabbitson
@ 2008-12-12 15:31           ` Redeeman
  2008-12-16  2:33             ` Neil Brown
  1 sibling, 1 reply; 26+ messages in thread
From: Redeeman @ 2008-12-12 15:31 UTC (permalink / raw)
  To: Michał Przyłuski; +Cc: linux-raid

On Fri, 2008-12-05 at 22:30 +0100, Michał Przyłuski wrote:
> Hi,
> 
> 2008/12/5 Redeeman <redeeman@metanurb.dk>:
> > On Fri, 2008-12-05 at 16:09 -0500, Justin Piszcz wrote:
> >>
> >> On Fri, 5 Dec 2008, Redeeman wrote:
> >>
> >> > On Fri, 2008-12-05 at 16:02 -0500, Justin Piszcz wrote:
> >> >>
> >> >> On Fri, 5 Dec 2008, Redeeman wrote:
> >> >>
> >> >>> Hello.
> >> >>>
> >> >>> I was looking at the PDFs linked to from the wiki, and found this:
> >> >>> http://kernel.org/pub/linux/kernel/people/hpa/raid6.pdf
> >> >>>
> >> >>> More specifically, section 4, starting on page 8.
> >> >>>
> >> >>> Am I understanding this correctly, in that with raid6, linux is capable
> >> >>> of detecting if the content on 1 disk is corrupted, and reconstruct it
> >> >>> from the remaining disks?
> >> >>
> >> >> I ran md/raid6 for awhile, do you mean remap the bad sector on the fly?
> >> >> Linux/md raid does not do this afaik.
> >> >
> >> > No, i mean, if one disk does silent corruption
> >>
> >> What would the error look like?  Both md/Linux & in the 3ware manual
> >> recommend you run a 'check' across the raid at least once a week
> >> (3ware/raid-verify) and md/Linux in Debian runs a check once a month I
> >> believe to eliminate these issues.
> >>
> >> If you are asking whether a read error of a latent sector from the one
> >> disk will result it reading the data from the second disk that is a good
> >> question.
> >
> > im asking, if one disk in a raid6 setup suddenly decides to flip a few
> > bits in some bytes, will it be able to detect that in a scan, and
> > correct it? i cant see how it can do it on raid5, but maybe raid6?
> 
> No, not really.
> I've been investigating silent corruption for a quite a while now, and
> it looks more or less like this.
> During a "check" action it'll be detected. During normal operation -
> it won't be detected.
> Normal (non-degraded) raid5/6 reads don't read parity (or Q syndrome),
> they just read data. So they have no idea that something went bad.
> Now, worse news is that you cannot really fix it automagically, even
> after detecting by a "check" procedure. A "repair" will overwrite
> parity and Q syndrome, with new values (new = calculated from what it
> seems to be data blocks).
> 
> It is possible (by the theory of Q syndrome, per the article you
> linked) to detect which drive is doing a silent corruption with raid6
> (and with some extra assumption, that just one drive is doing that).
> But it's not implemented.

That's a shame; it seems like a KILLER feature, but I guess it's not
too simple to do, or it would have been done already :)

> 
> Greets,
> Mike

* Re: detection/correction of corruption with raid6
  2008-12-12 15:31           ` Redeeman
@ 2008-12-16  2:33             ` Neil Brown
  2008-12-16  6:33               ` Redeeman
  2008-12-16  7:59               ` Mattias Wadenstein
  0 siblings, 2 replies; 26+ messages in thread
From: Neil Brown @ 2008-12-16  2:33 UTC (permalink / raw)
  To: Redeeman; +Cc: Michał Przyłuski, linux-raid

On Friday December 12, redeeman@metanurb.dk wrote:
> > 
> > It is possible (by the theory of Q syndrome, per the article you
> > linked) to detect which drive is doing a silent corruption with raid6
> > (and with some extra assumption, that just one drive is doing that).
> > But it's not implemented.
> 
> thats a shame, it seems like a KILLER feature, but i guess its not too
> simple to do, or it would have been done already :)

The reason that it hasn't been done is not that it is difficult.
Certainly it is not trivial, but more complicated things have been
implemented.

The reason that it is not even on my TODO list is that I don't think
it is justifiable.

As has been said elsewhere in this thread, silent corruption is rarely
if ever caused by the storage device itself.  Drives tend to have
strong CRCs etc. which detect bit-flips with greater reliability than
the RAID6 algorithm would.

If the silent corruption comes from anywhere else in the system, it is
not clear what, if anything, should be done.
E.g. if the corruption was due to bad memory, there is no behaviour
that will reliably do the "right" thing.

In that case, the best that can be done is simply to log any error
that is found and let some human figure it out.  That is part of the
motivation for a monthly 'check'.

I like to think about raid in a similar way to thinking about security
issues (after all, we are dealing with data security).

So before implementing any mechanism that might enhance security, I
need to have a clear understanding of what the threat model is.  In
this case: what is the source of the corruption?
Then I need a clear understanding of how the enhancement neutralises
or logs the threat, and a credible explanation of why it won't
increase the risk from some other threat.

If silent corruption is an issue for you then you really need to be
doing checks at a much higher level than the md level.  A filesystem
that does checksums on all blocks (e.g. btrfs), or an application that
does them on all files (e.g. tripwire), is much more likely to be
beneficial than trying to leverage a side-effect of raid6.

I have a similar attitude to 3-way raid1 and voting on the result.  I
simply don't think it is the right solution.

NeilBrown


* Re: detection/correction of corruption with raid6
  2008-12-16  2:33             ` Neil Brown
@ 2008-12-16  6:33               ` Redeeman
  2008-12-16  7:59               ` Mattias Wadenstein
  1 sibling, 0 replies; 26+ messages in thread
From: Redeeman @ 2008-12-16  6:33 UTC (permalink / raw)
  To: Neil Brown; +Cc: Michał Przyłuski, linux-raid

On Tue, 2008-12-16 at 13:33 +1100, Neil Brown wrote:
> On Friday December 12, redeeman@metanurb.dk wrote:
> > > 
> > > It is possible (by the theory of Q syndrome, per the article you
> > > linked) to detect which drive is doing a silent corruption with raid6
> > > (and with some extra assumption, that just one drive is doing that).
> > > But it's not implemented.
> > 
> > thats a shame, it seems like a KILLER feature, but i guess its not too
> > simple to do, or it would have been done already :)
> 
> The reason that it hasn't been done is not that it is difficult.
> Certainly it is not trivial, but more complicated things have been
> implemented.
> 
> The reason that it is not even on my TODO list is that I don't think
> it is justifiable.
> 
> As has been said elsewhere in this thread, silent corruption is rarely
> if ever caused by the storage device.  They tend to have strong CRCs
> etc which detect bit-flips with greater reliability than the RAID6
> algorithm would detect them.
> 
> If the silent corruption comes from anywhere else in the system, it is
> not clear what if anything should be done.
> e.g. if the corruption was due to bad memory, there is no behaviour
> that will reliably do the "right" thing.

I respectfully disagree. Consider this example (and please correct me
if my assumptions are wrong):

You have a raid1 setup with 3 disks, or, just for the example, say 5
disks.

You then force a check, which detects that 4 disks have identical
data while 1 disk differs. Chance would dictate that the data is
SOMEHOW wrong on that 1 disk. That may well be the fault of the PCI
bus, RAM, or anything else, but in my mind it is very reasonable to
assume that the right thing is to copy the data which is identical on
4 of the 5 disks onto the remaining disk.

I would also argue, in the case of raid1, that if only 2 of the 5
disks had identical data and the remaining 3 all differed, it would
still be the best choice to "restore" the data which is identical the
most times; certainly, I cannot see any reason why it would be a worse
thing to do than just randomly selecting one dataset to "trust".
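
For illustration, the voting rule could look something like this in
Python (a rough sketch of the idea under the assumptions above;
nothing like it exists in md today):

from collections import Counter

def vote_on_mirrors(copies):
    """copies: the same logical block as read from each raid1 mirror.
    Returns (winning_block, suspect_indices), or (None, []) when no
    strict majority exists and a human has to decide."""
    block, count = Counter(copies).most_common(1)[0]
    if count <= len(copies) // 2:
        return None, []
    suspects = [i for i, c in enumerate(copies) if c != block]
    return block, suspects

# e.g. 4 of 5 mirrors agree, so mirror 3 would be rewritten:
# vote_on_mirrors([b"A", b"A", b"A", b"B", b"A"]) -> (b"A", [3])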

Granted, if these instances occur, it's something to be very concerned
about, and it surely requires a human to figure out what is causing
it, but that still doesn't mean md shouldn't try to do all it can to
keep the user's data intact.

As for raid6, as I understand it, you have the ability, with parity
and the Q syndrome etc., to arrive at the final data in 3 ways
involving different disks. This still allows a 2-out-of-3 versus
1-out-of-3 vote on the correct data, and I would still argue that it
is much more reasonable to conclude that 1 disk SOMEHOW has wrong data
than that two disks have the SAME wrong data.

I do get your point, however: if the corruption is in the controller
etc., it may actually occur that 2 disks have the same corruption.
Still, I would argue that in general this scheme would be better on
average, and since it's not possible to know 100% what causes these
things, I'd say it is the most logical and reasonable action to take.

Am I wrong?

> 
> In that case, the best that can be done is simply to log any error
> that is found and let some human figure it out.  That is part of the
> motivation for a monthly 'check'.
> 
> I like to think about raid in a similar way to thinking about security
> issues (after all, we are dealing with data security).
> 
> So before implementing any mechanism that might enhance security, I
> need to have a clear understanding of what the threat model is.  In
> this case, what is the source of corruption.
> Then I need a clear understanding on how the enhancement neutralises
> or logs the threat, and a credible explanation of why it won't increase
> the risk from some other threat.
> 
> If silent corruption is an issue for you then you really need to be
> doing checks at a much higher level than the md level.  A filesystem
> that does checksums on all blocks (e.g. btrfs), or an application that
> does them an all files (tripwire) are much more likely to be
> beneficial than trying to leverage a side-effect of raid6.
> 
> I have a similar attitude to 3-way raid1 and voting on the result.  I
> simply don't think it is the right solution.
> 
> NeilBrown

* Re: detection/correction of corruption with raid6
  2008-12-16  2:33             ` Neil Brown
  2008-12-16  6:33               ` Redeeman
@ 2008-12-16  7:59               ` Mattias Wadenstein
  2008-12-16 22:20                 ` Chris Worley
  1 sibling, 1 reply; 26+ messages in thread
From: Mattias Wadenstein @ 2008-12-16  7:59 UTC (permalink / raw)
  To: Neil Brown; +Cc: Redeeman, Michał Przyłuski, linux-raid

On Tue, 16 Dec 2008, Neil Brown wrote:

> On Friday December 12, redeeman@metanurb.dk wrote:
>>>
>>> It is possible (by the theory of Q syndrome, per the article you
>>> linked) to detect which drive is doing a silent corruption with raid6
>>> (and with some extra assumption, that just one drive is doing that).
>>> But it's not implemented.
>>
>> thats a shame, it seems like a KILLER feature, but i guess its not too
>> simple to do, or it would have been done already :)
>
> The reason that it hasn't been done is not that it is difficult.
> Certainly it is not trivial, but more complicated things have been
> implemented.
>
> The reason that it is not even on my TODO list is that I don't think
> it is justifiable.
>
> As has been said elsewhere in this thread, silent corruption is rarely
> if ever caused by the storage device.  They tend to have strong CRCs
> etc which detect bit-flips with greater reliability than the RAID6
> algorithm would detect them.

If by storage device you mean the actual disk, then yes, it seems that
way, at least in practice. In theory, hdd manufacturers only guarantee
you get a bitflip less often than once every 10^14 or 10^15 bits,
which is quite often: 10^14 bits is roughly the amount of data read
during a resync of a largish raid set.
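
To put a number on that, the arithmetic in Python:

# 10^14 bits expressed in terabytes:
print(10**14 / 8 / 1e12)  # -> 12.5 TB: about one full read of a
                          # dozen 1 TB drives during a resync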

In general, I agree that a checksumming filesystem is more important for 
data integrity. This is why all new fileservers around here are running 
Solaris+ZFS instead of Linux.

/Mattias Wadenstein


* Re: detection/correction of corruption with raid6
  2008-12-16  7:59               ` Mattias Wadenstein
@ 2008-12-16 22:20                 ` Chris Worley
  0 siblings, 0 replies; 26+ messages in thread
From: Chris Worley @ 2008-12-16 22:20 UTC (permalink / raw)
  To: linux-raid

On Tue, Dec 16, 2008 at 12:59 AM, Mattias Wadenstein <maswan@acc.umu.se> wrote:
> On Tue, 16 Dec 2008, Neil Brown wrote:
>> On Friday December 12, redeeman@metanurb.dk wrote:
>>
>> As has been said elsewhere in this thread, silent corruption is rarely
>> if ever caused by the storage device.  They tend to have strong CRCs
>> etc which detect bit-flips with greater reliability than the RAID6
>> algorithm would detect them.
>
> If by storage device, you mean the actual disk, then yes, it seems that way.
> At least in practice. In theory hdd manufacturers only guarantee you get a
> bitflip less often than once every 10^14 or 10^15 bits. Which is quite
> often. 10^14 bits is roughly the ammount of data read during a resync of a
> large:ish raidset.
>
> In general, I agree that a checksumming filesystem is more important for
> data integrity. This is why all new fileservers around here are running
> Solaris+ZFS instead of Linux.

I've seen many high-end arrays, such as those done by DDN
(www.datadirectnet.com), that spend idle time scouring the drives for
corruption.  They don't think it's insignificant.

I've had Linux RAID5 MD arrays with silently corrupted data that could
have been detected/corrected by idle scouring (before a disk goes bad
altogether, at which point the silently corrupted data on a "good"
disk becomes totally unrecoverable).

SSDs like FusionIO's have strong ECC built into every write (11 bits
per 240 bytes) that extends the usable life of NAND storage from 100K
writes per cell (using the industry-standard 1 bit per 512 bytes) into
decades (they constantly scour too).

Chris


* Re: detection/correction of corruption with raid6
  2008-12-19  8:40 piergiorgio.sartor
@ 2008-12-19 13:10 ` Redeeman
  0 siblings, 0 replies; 26+ messages in thread
From: Redeeman @ 2008-12-19 13:10 UTC (permalink / raw)
  To: piergiorgio.sartor; +Cc: neilb, linux-raid

On Fri, 2008-12-19 at 09:40 +0100, piergiorgio.sartor@nexgo.de wrote:
> Hi,
> 
> thanks for the answer.
> I've still some comments on the topic, see below.
> 
> > Suppose we agree that bit flips don't happen (undetected) on drive
> > media.  But that bit flips can happen elsewhere (memory.  IO Buss
> > etc).
> > 
> > And then suppose we discover that a bit-flip has happened.  What does
> > that tell us?
> > Maybe it tells us that our hardware is dodgey.  So it cannot be
> > trusted to reliably do anything we tell it.  So maybe we shouldn't
> > tell it to do anything. ??
> 
> Maybe I should try to clarify the concept.
> There are *two* use cases.
> One is the "check" and one is the "repair".
> As I already wrote, I do agree that "repair" needs some deeper
> thinking. It is easy to see cases where it could produce more
> damages.
> The "check" case is another story.
> In case of RAID-6 I would like, as RFE, to have in the logs some
> report on which "drive" or "data path" the mismatch occurs, when
> detectable.
> So, if the mismatch count says there are 1024 mismatches, then
> would be nice to know if they belong all to the same drive or not.
> In this case, it would be possible to fail/remove that one and
> check the hardware (change drive/cable/connector/etc.).
> 
> Ideally, at the end of the "check", the log should report how
> many mismatches, how many are "undeterminable" (multiple
> drive), how many could belong to a specific drive.
> This will help to to diagnose a problem, maybe reported by
> the CRC in the filesystem.

Agreed :)

> This is for the "check", about the "repair", the only possible
> change I could see is to offer the user, and we could check
> in this mailing list how many would like to have the possibility,
> the option to "reset the parity" of the array or "recalculate the
> data", with the warning that the second one can do more
> damage than already has.
Yes, there is of course the possibility of doing damage, but I think
if it's 2 vs 1, that's something most people would bet on, at least if
there are multiple occurrences all pointing at the same "1".

:)

> 
> Conclusion, for me, is that the "check" should be more
> clever, with RAID-6, and "repair/resync" *might* be more
> flexible (with warnings).


> 
> I take the opportunity to wish you all Merry Christmas
> and Happy New Year.

And to you too!
> 
> bye,
> 



* Re: detection/correction of corruption with raid6
@ 2008-12-19  8:40 piergiorgio.sartor
  2008-12-19 13:10 ` Redeeman
  0 siblings, 1 reply; 26+ messages in thread
From: piergiorgio.sartor @ 2008-12-19  8:40 UTC (permalink / raw)
  To: neilb; +Cc: linux-raid

Hi,

thanks for the answer.
I've still some comments on the topic, see below.

> Suppose we agree that bit flips don't happen (undetected) on drive
> media.  But that bit flips can happen elsewhere (memory.  IO Buss
> etc).
> 
> And then suppose we discover that a bit-flip has happened.  What does
> that tell us?
> Maybe it tells us that our hardware is dodgey.  So it cannot be
> trusted to reliably do anything we tell it.  So maybe we shouldn't
> tell it to do anything. ??

Maybe I should try to clarify the concept.
There are *two* use cases: one is the "check" and one is the "repair".
As I already wrote, I do agree that "repair" needs some deeper
thinking. It is easy to see cases where it could produce more
damage.
The "check" case is another story.
In the case of RAID-6 I would like, as an RFE, to have in the logs a
report of which "drive" or "data path" the mismatch occurred on, when
that is detectable.
So, if the mismatch count says there are 1024 mismatches, it would
be nice to know whether they all belong to the same drive or not.
In that case, it would be possible to fail/remove that drive and
check the hardware (change drive/cable/connector/etc.).

Ideally, at the end of the "check", the log should report how many
mismatches there were, how many are "undeterminable" (multiple
drives), and how many can be attributed to a specific drive.
This will help to diagnose a problem, maybe one first reported by
the CRC in the filesystem.
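
To make the wished-for report concrete, a rough Python sketch of the
bookkeeping (hypothetical, building on the locate_bad_drive() sketch
from earlier in the thread):

from collections import Counter

def check_report(stripe_samples):
    """stripe_samples: iterable of (data_bytes, p_byte, q_byte)
    tuples, one per checked position. Tallies mismatches per
    suspected drive index, or 'P'/'Q' for the parity blocks."""
    tally = Counter()
    for data, p, q in stripe_samples:
        suspect = locate_bad_drive(data, p, q)
        if suspect is not None:
            tally[suspect] += 1
        # locate_bad_drive() returns None both for clean positions
        # and for damage it cannot pin on one device; a real
        # implementation would count the latter as "undeterminable".
    return tally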

That is for the "check". For the "repair", the only possible change I
could see is to offer the user the option (and we could ask on this
mailing list how many would like to have the possibility) to either
"reset the parity" of the array or "recalculate the data", with the
warning that the second one can do more damage than has already been
done.

My conclusion is that the "check" should be more clever with RAID-6,
and "repair/resync" *might* be made more flexible (with warnings).

I take the opportunity to wish you all Merry Christmas
and Happy New Year.

bye,

-- 

pg



* Re: detection/correction of corruption with raid6
  2008-12-19  4:39     ` Neil Brown
@ 2008-12-19  5:38       ` Redeeman
  0 siblings, 0 replies; 26+ messages in thread
From: Redeeman @ 2008-12-19  5:38 UTC (permalink / raw)
  To: Neil Brown; +Cc: Piergiorgio Sartor, linux-raid

On Fri, 2008-12-19 at 15:39 +1100, Neil Brown wrote:
> On Wednesday December 17, piergiorgio.sartor@nexgo.de wrote:
> > On Tue, 2008-12-16 at 23:25 +0100, Redeeman wrote:
> > [...]
> > > > Why a RAID system might have inconsistencies?
> > > > Why do we have a "check" command at all, to run weekly or monthly?
> > > As previously stated in discussion, while most bitflips etc does not
> > > happen on disk(apparently), they do happen, whether its in ram, pci,
> > > controller etc...
> > 
> > Ah! You spoiled it! :-)
> > 
> > Actually I was waiting for an answer from Neil Brown.
> > 
> > Because I'm under the impression that if it is not the HD,
> > it does not count... See below...
> 
> Suppose we agree that bit flips don't happen (undetected) on drive
> media.  But that bit flips can happen elsewhere (memory.  IO Buss
> etc).
> 
> And then suppose we discover that a bit-flip has happened.  What does
> that tell us?
> Maybe it tells us that our hardware is dodgey.  So it cannot be
> trusted to reliably do anything we tell it.  So maybe we shouldn't
> tell it to do anything. ??
> 
> And when we find a corruption, we clear cannot know if it is corrupt
> on disk (a previous write went bad) or just in memory (e.g. a recent
> read was bad).
> In the latter case, writing anything to disk is probably the wrong
> thing to do.  In the former case it might be a good thing to do - if
> we can be fairly sure that the error happens very rarely.
> And of course we cannot know if it was due to a bad read or a bad
> write.  So the safe course is to not write anything to disk.
> 
> Where does that leave us?
> 
> About the only thing that makes sense is to always read all the blocks
> in a stripe, and to perform a consistency test before responding to
> any read request.  If an inconsistency is found, we log what we know,
> and only return data if we have some reason to believe something is
> still valid (e.g. a majority vote for raid1).
> 
> And for raid5/6, a write would require:
>   read whole stripe
>   check consistency
>   copy in new data
>   update parity
>   write out changed blocks
> 
> This is going to be a substantial slowdown.
> 
> And does it really increase your data security?  or is it like putting
> a lock on your front door but not on your back door?
> 
> I guess it would provide some protection against low-frequency errors
> in the controller/cable/drive.
> 
> But given the high cost and the fairly low value, I wonder how many
> people would really use it....

I was suggesting that we only go through all the hoops on user
request, e.g. via the addition of a "resync_majorityvote" action or
something like it.

I can see the wisdom in just doing normal reads/writes as now, for
speed, and only doing the additional logic on request.

> 
> > 
> > Final point. More or less one year ago the same topic popped up,
> > with similar discussion.
> > At the end of the thread someone was asking if patches are
> > accepted in order to implement this feature.
> > I could not find any answer to that question in the archive.
> > 
> > What is the idea? Are patches accepted? Rejected by default?
> 
> By default, patches are reviewed and discussed.  If they then get
> revised and tested and appear to be sensible and useful they will
> probably get accepted eventually.
> 
> A change of this magnitude would almost certainly need to go through
> several iterations of revision and have substantial testing before
> being accepted.
> 
> NeilBrown

* Re: detection/correction of corruption with raid6
  2008-12-17 21:52   ` Piergiorgio Sartor
@ 2008-12-19  4:39     ` Neil Brown
  2008-12-19  5:38       ` Redeeman
  0 siblings, 1 reply; 26+ messages in thread
From: Neil Brown @ 2008-12-19  4:39 UTC (permalink / raw)
  To: Piergiorgio Sartor; +Cc: linux-raid

On Wednesday December 17, piergiorgio.sartor@nexgo.de wrote:
> On Tue, 2008-12-16 at 23:25 +0100, Redeeman wrote:
> [...]
> > > Why a RAID system might have inconsistencies?
> > > Why do we have a "check" command at all, to run weekly or monthly?
> > As previously stated in discussion, while most bitflips etc does not
> > happen on disk(apparently), they do happen, whether its in ram, pci,
> > controller etc...
> 
> Ah! You spoiled it! :-)
> 
> Actually I was waiting for an answer from Neil Brown.
> 
> Because I'm under the impression that if it is not the HD,
> it does not count... See below...

Suppose we agree that bit flips don't happen (undetected) on drive
media, but that bit flips can happen elsewhere (memory, the IO bus,
etc.).

And then suppose we discover that a bit-flip has happened.  What does
that tell us?
Maybe it tells us that our hardware is dodgy.  So it cannot be
trusted to reliably do anything we tell it.  So maybe we shouldn't
tell it to do anything. ??

And when we find a corruption, we clearly cannot know whether it is
corrupt on disk (a previous write went bad) or just in memory (e.g. a
recent read was bad).
In the latter case, writing anything to disk is probably the wrong
thing to do.  In the former case it might be a good thing to do - if
we can be fairly sure that the error happens very rarely.
And of course we cannot know if it was due to a bad read or a bad
write.  So the safe course is to not write anything to disk.

Where does that leave us?

About the only thing that makes sense is to always read all the blocks
in a stripe, and to perform a consistency test before responding to
any read request.  If an inconsistency is found, we log what we know,
and only return data if we have some reason to believe something is
still valid (e.g. a majority vote for raid1).

And for raid5/6, a write would require:
  read whole stripe
  check consistency
  copy in new data
  update parity
  write out changed blocks

This is going to be a substantial slowdown.
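
In a toy in-memory model (Python; a sketch of the sequence above, not
md internals, with Q omitted to keep it short):

def xor_parity(blocks):
    """XOR all the data blocks together to form P."""
    p = bytearray(len(blocks[0]))
    for b in blocks:
        for i, byte in enumerate(b):
            p[i] ^= byte
    return bytes(p)

def verified_write(stripe, index, new_block):
    """stripe: {'data': [bytes, ...], 'p': bytes}."""
    # read whole stripe + check consistency
    if xor_parity(stripe['data']) != stripe['p']:
        raise IOError("stripe inconsistent: log it, don't write")
    stripe['data'][index] = new_block         # copy in new data
    stripe['p'] = xor_parity(stripe['data'])  # update parity
    # (a real array would now write out only the changed blocks)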

And does it really increase your data security?  or is it like putting
a lock on your front door but not on your back door?

I guess it would provide some protection against low-frequency errors
in the controller/cable/drive.

But given the high cost and the fairly low value, I wonder how many
people would really use it....

> 
> Final point. More or less one year ago the same topic popped up,
> with similar discussion.
> At the end of the thread someone was asking if patches are
> accepted in order to implement this feature.
> I could not find any answer to that question in the archive.
> 
> What is the idea? Are patches accepted? Rejected by default?

By default, patches are reviewed and discussed.  If they then get
revised and tested and appear to be sensible and useful they will
probably get accepted eventually.

A change of this magnitude would almost certainly need to go through
several iterations of revision and have substantial testing before
being accepted.

NeilBrown


* Re: detection/correction of corruption with raid6
  2008-12-16 22:25 ` Redeeman
@ 2008-12-17 21:52   ` Piergiorgio Sartor
  2008-12-19  4:39     ` Neil Brown
  0 siblings, 1 reply; 26+ messages in thread
From: Piergiorgio Sartor @ 2008-12-17 21:52 UTC (permalink / raw)
  To: linux-raid

On Tue, 2008-12-16 at 23:25 +0100, Redeeman wrote:
[...]
> > Why a RAID system might have inconsistencies?
> > Why do we have a "check" command at all, to run weekly or monthly?
> As previously stated in discussion, while most bitflips etc does not
> happen on disk(apparently), they do happen, whether its in ram, pci,
> controller etc...

Ah! You spoiled it! :-)

Actually I was waiting for an answer from Neil Brown.

Because I'm under the impression that if it is not the HD,
it does not count... See below...

> Also, i imagine its just to be on top of things, read and ensure stuff
> works.. (but this is pure speculation)

I still have some comments on the topic.

First of all, someone mentioned the CRC/EDAC capabilities in the
filesystem. While this would be advisable, there is a fundamental
problem with it: there is no information about which device could
have caused the error (in the RAID case).
The FS can report, and maybe correct, the data, but it is unaware of
the underlying hardware, so it does not help any further.
On the other end (not hand), there are the device drivers.
These may also report errors, but it can also happen that they just
deliver garbage, for several reasons.
The only component which can handle the problem is "md", since it is
the only one which knows the devices _and_ the data.

Second, as mentioned above, it seems to me that the RAID scope is
intentionally limited to pure HD failures.
Nowadays, one could build a RAID over usb-storage plus fw-sbp2
plus nbd plus esata.
The "HD" is no longer the physical thing; it is everything from the
specific driver on down.
If I stomp on the USB cable, detaching it, I would like the RAID to
react as if a real HD failure had occurred (and it actually does this
properly).

So, IMHO, the argument that "soft errors are improbable within the
HD" is of limited value, since they can happen elsewhere and should
count as if they had happened in the HD...


Final point. More or less one year ago the same topic popped up, with
a similar discussion.
At the end of the thread someone asked whether patches would be
accepted in order to implement this feature.
I could not find any answer to that question in the archive.

What is the idea? Are patches accepted? Rejected by default?

Not that I want to provide one, but I was just curious...

bye,

-- 

pg

* RE: detection/correction of corruption with raid6
       [not found]     ` <494960E8.8020407@tmr.com>
@ 2008-12-17 21:47       ` David Lethe
  0 siblings, 0 replies; 26+ messages in thread
From: David Lethe @ 2008-12-17 21:47 UTC (permalink / raw)
  To: Bill Davidsen; +Cc: Piergiorgio Sartor, linux-raid

From: Bill Davidsen [mailto:davidsen@tmr.com] 
Sent: Wednesday, December 17, 2008 2:28 PM
To: David Lethe
Cc: Piergiorgio Sartor; linux-raid@vger.kernel.org
Subject: Re: detection/correction of corruption with raid6

David Lethe wrote: 
-----Original Message-----
From: linux-raid-owner@vger.kernel.org [mailto:linux-raid-
owner@vger.kernel.org] On Behalf Of Bill Davidsen
Sent: Wednesday, December 17, 2008 8:49 AM
To: Piergiorgio Sartor
Cc: linux-raid@vger.kernel.org
Subject: Re: detection/correction of corruption with raid6

Piergiorgio Sartor wrote:
    
Why might a RAID system have inconsistencies?
Why do we have a "check" command at all, to run weekly or monthly?

      
Because alpha particles fly by, most systems don't have ECC memory, a
passing truck and noisy jet create a beat frequency that causes a once
in a century bit flip on a good cable connection, power line noise
creeps in, or maybe an angel farts.

My question is why we don't use available techniques to fix this since
we have the software to find it for us.

--
Bill Davidsen <davidsen@tmr.com>
  "Woe unto the statesman who makes war without a reason that will
still
  be valid when the war is over..." Otto von Bismark


--
To unsubscribe from this list: send the line "unsubscribe linux-raid"
in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
    

EE Times published results from a 10-year IBM study. They found that
alpha particles alone created a bit flip once a month per 256 MB of
DRAM at sea level. I can't remember what it was at high altitude, but
I think it was twice as bad. If you have 4 GB of RAM in your computer,
then even with parity memory you are going to have undetectable bit
flips several times a month.
  

Why undetectable with parity? Even with a Hamming code, not the most
modern, you could correct all one-bit errors and detect all two-bit
errors, using about 2+log2(M) parity bits for M data bits. Last I
checked, parity memory was 8d+1p, so you have 8 parity bits available
on a 64-bit fetch, which is enough. You would have to get three flips
in one fetch before you wouldn't see it, and using some of the better
ECC schemes I bet you would see three as well.
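
A quick sanity check of that arithmetic (a hedged sketch in Python;
the helper name is made up, and the bound is the standard Hamming
one, 2^r >= m + r + 1, plus one extra bit for double-error detection):

def check_bits(m):
    # smallest r with 2^r >= m + r + 1 (single-error correction)
    r = 1
    while 2 ** r < m + r + 1:
        r += 1
    return r

print(check_bits(64))      # 7 -> SEC on a 64-bit fetch
print(check_bits(64) + 1)  # 8 -> SECDED, matching the 8 bits available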


-- 
Bill Davidsen <davidsen@tmr.com>
  "Woe unto the statesman who makes war without a reason that will still
  be valid when the war is over..." Otto von Bismark 

There is much more to it than that.  Your analysis doesn't factor in
that DRAM must constantly refresh, even when it never gets read.

Here is a great paper that explains everything in detail, and it
avoids the calculus.

http://www.tezzaron.com/about/papers/soft_errors_1_1_secure.pdf

To take a few quotes from it:

"Memory errors occur mostly during read/write activity, so the SER rises
with memory speed and with the intensity of memory use; memory cycling
at 100 nanoseconds can give soft error rates 100 times that of memory
idling in refresh mode (15 microseconds). Error rates rise with
altitude: SER is 5 times as high at 2600 feet as at sea level, and 10
times as high in Denver (5280 feet) as at sea level. SRAM tested at
10,000 feet above sea level will record SERs that are 14 times the rate
tested at sea level."

"Quite aside from soft errors, particles with high energies can cause
permanent damage to memory cells. These 'hard' errors exhibit error
rates that are strongly related to soft error rates, variously
estimated at 2% of total errors..."

And from its conclusions:

"Soft errors are a matter of increasing concern as memories get larger
and memory technologies get smaller. Even using a relatively
conservative error rate (500 FIT/Mbit), a system with 1 GByte of RAM can
expect an error every two weeks; a hypothetical Terabyte system would
experience a soft error every few minutes. Existing ECC technologies can
greatly reduce the error rate, but they may have unacceptable tradeoffs
in power, speed, price, or size. 

Soft errors can be disastrous for systems with large memories, critical
applications, or high altitude locations. Some type of error
detection/correction is mandatory in these cases, in spite of the cost
in price and/or performance."
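
As a quick sanity check of the quoted figure (a back-of-the-envelope
sketch in Python; 1 FIT is one failure per 10^9 device-hours, and the
500 FIT/Mbit rate is just the paper's assumed number):

fit_per_mbit = 500
mbits = 1 * 1024 * 8                       # 1 GByte of RAM, in Mbit
failures_per_hour = fit_per_mbit * mbits / 1e9
print(1 / failures_per_hour / 24)          # ~10 days between errors

That lands at roughly ten days, which matches the paper's "an error
every two weeks".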

David

^ permalink raw reply	[flat|nested] 26+ messages in thread

* RE: detection/correction of corruption with raid6
  2008-12-17 14:48 ` Bill Davidsen
@ 2008-12-17 15:50   ` David Lethe
       [not found]     ` <494960E8.8020407@tmr.com>
  0 siblings, 1 reply; 26+ messages in thread
From: David Lethe @ 2008-12-17 15:50 UTC (permalink / raw)
  To: Bill Davidsen, Piergiorgio Sartor; +Cc: linux-raid

> -----Original Message-----
> From: linux-raid-owner@vger.kernel.org [mailto:linux-raid-
> owner@vger.kernel.org] On Behalf Of Bill Davidsen
> Sent: Wednesday, December 17, 2008 8:49 AM
> To: Piergiorgio Sartor
> Cc: linux-raid@vger.kernel.org
> Subject: Re: detection/correction of corruption with raid6
> 
> Piergiorgio Sartor wrote:
> > Why might a RAID system have inconsistencies?
> > Why do we have a "check" command at all, to run weekly or monthly?
> >
> 
> Because alpha particles fly by, most systems don't have ECC memory, a
> passing truck and noisy jet create a beat frequency that causes a once
> in a century bit flip on a good cable connection, power line noise
> creeps in, or maybe an angel farts.
> 
> My question is why we don't use available techniques to fix this since
> we have the software to find it for us.
> 
> --
> Bill Davidsen <davidsen@tmr.com>
>   "Woe unto the statesman who makes war without a reason that will
> still
>   be valid when the war is over..." Otto von Bismark
> 
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-raid"
> in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

EE Times published results from a 10-year IBM study. They found that
alpha particles alone created a bit flip once a month per 256 MB of
DRAM at sea level. I can't remember what it was at high altitude, but
I think it was twice as bad. If you have 4 GB of RAM in your computer,
then even with parity memory you are going to have undetectable bit
flips several times a month.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: detection/correction of corruption with raid6
  2008-12-16 21:58 Piergiorgio Sartor
  2008-12-16 22:25 ` Redeeman
@ 2008-12-17 14:48 ` Bill Davidsen
  2008-12-17 15:50   ` David Lethe
  1 sibling, 1 reply; 26+ messages in thread
From: Bill Davidsen @ 2008-12-17 14:48 UTC (permalink / raw)
  To: Piergiorgio Sartor; +Cc: linux-raid

Piergiorgio Sartor wrote:
> Why might a RAID system have inconsistencies?
> Why do we have a "check" command at all, to run weekly or monthly?
>   

Because alpha particles fly by, most systems don't have ECC memory, a 
passing truck and noisy jet create a beat frequency that causes a once 
in a century bit flip on a good cable connection, power line noise 
creeps in, or maybe an angel farts.

My question is why we don't use available techniques to fix this since 
we have the software to find it for us.

-- 
Bill Davidsen <davidsen@tmr.com>
  "Woe unto the statesman who makes war without a reason that will still
  be valid when the war is over..." Otto von Bismark 



^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: detection/correction of corruption with raid6
  2008-12-16 21:58 Piergiorgio Sartor
@ 2008-12-16 22:25 ` Redeeman
  2008-12-17 21:52   ` Piergiorgio Sartor
  2008-12-17 14:48 ` Bill Davidsen
  1 sibling, 1 reply; 26+ messages in thread
From: Redeeman @ 2008-12-16 22:25 UTC (permalink / raw)
  To: Piergiorgio Sartor; +Cc: linux-raid

On Tue, 2008-12-16 at 22:58 +0100, Piergiorgio Sartor wrote:
> Hi all,
> 
> while I do agree that the issue needs more in-depth thinking,
> I would like to tell a recent story that happened to me.
> 
> I was testing a RAID-6 array with 7 small HDs.
> Intention was to get used to different situations, repair,
> grow, fail, remove, etc.
> 
> After some playing, I started to check the files on the array
> and I found out that they were not (always) correct.
> So I started a check of the array, which returned 1000 or
> more mismatches.
> 
> After some investigation, I found out that one HD had a "flaky"
> interface: data was correctly written, but sometimes, randomly,
> reads returned some "wrong" bits (re-cabling solved the issue).
> 
> To check this with RAID-6, I could run the check with 6 disks,
> 7 times, each time with a different disk removed, until one run
> returned no mismatches.
> At this point, I knew which "data path" was defective.
> 
> It would have saved a lot of time if the check could have
> done this automatically...

Exactly! This is partly the point I am making too.

> 
> So my RFE would be, if possible, to try, during a RAID-6 check,
> to find out whether, and which, HD has the mismatch.
> Ideally, at the end of the check, the system log should show
> how many mismatches, if any, are likely to belong to which HD
> or are undetermined.
> This would help to diagnose the full data path and reduce
> testing time in case of problems.
> In case only one HD turns out to be problematic, it could be
> failed and removed, and the complete cabling, I/F and so on checked.
> Of course, this goes beyond the simple "HD failure protection"
> scope of RAID; nevertheless, I do not see why this possibility
> should be neglected, unless it is too complex/difficult to
> implement and maintain.
Yeah, I myself do not know how much more complicated this would make
things, but I would imagine it would be worth it...
> 
> Regarding the possibility of recovery, I have one question:
> 
> Why might a RAID system have inconsistencies?
> Why do we have a "check" command at all, to run weekly or monthly?
As previously stated in the discussion, while most bitflips etc. do not
(apparently) happen on disk, they do happen, whether it's in RAM, PCI,
controller, etc...

Also, I imagine it's just to stay on top of things, read and ensure
stuff works... (but this is pure speculation)
> 
> Thanks,
> 
> bye,
> 
> 


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: detection/correction of corruption with raid6
@ 2008-12-16 21:58 Piergiorgio Sartor
  2008-12-16 22:25 ` Redeeman
  2008-12-17 14:48 ` Bill Davidsen
  0 siblings, 2 replies; 26+ messages in thread
From: Piergiorgio Sartor @ 2008-12-16 21:58 UTC (permalink / raw)
  To: linux-raid

Hi all,

while I do agree that the issue needs more in-depth thinking,
I would like to tell a recent story that happened to me.

I was testing a RAID-6 array with 7 small HDs.
Intention was to get used to different situations, repair,
grow, fail, remove, etc.

After some playing, I started to check the files on the array
and I found out that they were not (always) correct.
So I started a check of the array, which returned 1000 or
more mismatches.

After some investigation, I found out that one HD had a "flaky"
interface: data was correctly written, but sometimes, randomly,
reads returned some "wrong" bits (re-cabling solved the issue).

To check this with RAID-6, I could run the check with 6 disks,
7 times, each time with a different disk removed, until one run
returned no mismatches.
At this point, I knew which "data path" was defective.

It would have saved a lot of time if the check could have
done this automatically...

So my RFE would be, if possible, to try, during a RAID-6 check,
to find out whether, and which, HD has the mismatch.
Ideally, at the end of the check, the system log should show
how many mismatches, if any, are likely to belong to which HD
or are undetermined.
This would help to diagnose the full data path and reduce
testing time in case of problems.
In case only one HD turns out to be problematic, it could be
failed and removed, and the complete cabling, I/F and so on checked.
Of course, this goes beyond the simple "HD failure protection"
scope of RAID; nevertheless, I do not see why this possibility
should be neglected, unless it is too complex/difficult to
implement and maintain.
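
For what it is worth, the math for locating the bad disk is already in
section 4 of hpa's raid6.pdf: if exactly one data disk z is corrupt,
then (Q xor Q') = g^z * (P xor P') in GF(2^8), where P' and Q' are
recomputed from the data. A minimal sketch in Python (all names here
are made up for illustration; md would of course do this in C, per
byte, across the whole stripe):

# GF(2^8) with the RAID-6 polynomial x^8+x^4+x^3+x^2+1 (0x11d), g = 2
def gf_mul(a, b):
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11d
        b >>= 1
    return r

GF_LOG = {}                     # discrete logs, base g = 2
x = 1
for i in range(255):
    GF_LOG[x] = i
    x = gf_mul(x, 2)

def syndromes(data):
    # P = xor of data bytes, Q = sum of data[i] * g^i (Horner's rule)
    p = q = 0
    for d in reversed(data):
        p ^= d
        q = gf_mul(q, 2) ^ d
    return p, q

def locate(data, p_stored, q_stored):
    p, q = syndromes(data)
    dp, dq = p_stored ^ p, q_stored ^ q
    if not dp and not dq:
        return None                             # stripe is consistent
    if dp and dq:
        z = (GF_LOG[dq] - GF_LOG[dp]) % 255     # dq = g^z * dp
        return z if z < len(data) else "multi"  # out of range: >1 error
    return "P" if dq == 0 else "Q"              # only one parity wrong

data = [0x11, 0x22, 0x33, 0x44, 0x55]   # one byte from each of 5 disks
p0, q0 = syndromes(data)
data[3] ^= 0x5a                          # silent corruption on disk 3
print(locate(data, p0, q0))              # -> 3; xor-ing P ^ P' back in
                                         # repairs the located byte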

Regarding the possibility of recovery, I have one question:

Why might a RAID system have inconsistencies?
Why do we have a "check" command at all, to run weekly or monthly?

Thanks,

bye,


-- 

piergiorgio sartor

^ permalink raw reply	[flat|nested] 26+ messages in thread

end of thread

Thread overview: 26+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2008-12-05 21:00 detection/correction of corruption with raid6 Redeeman
2008-12-05 21:02 ` Justin Piszcz
2008-12-05 21:06   ` Redeeman
2008-12-05 21:09     ` Justin Piszcz
2008-12-05 21:12       ` Redeeman
2008-12-05 21:17         ` Justin Piszcz
2008-12-05 21:30         ` Michał Przyłuski
2008-12-05 22:12           ` Peter Rabbitson
2008-12-05 22:26             ` Michał Przyłuski
2008-12-05 22:43               ` Greg Freemyer
2008-12-06  0:39                 ` Roger Heflin
2008-12-12 15:31           ` Redeeman
2008-12-16  2:33             ` Neil Brown
2008-12-16  6:33               ` Redeeman
2008-12-16  7:59               ` Mattias Wadenstein
2008-12-16 22:20                 ` Chris Worley
2008-12-16 21:58 Piergiorgio Sartor
2008-12-16 22:25 ` Redeeman
2008-12-17 21:52   ` Piergiorgio Sartor
2008-12-19  4:39     ` Neil Brown
2008-12-19  5:38       ` Redeeman
2008-12-17 14:48 ` Bill Davidsen
2008-12-17 15:50   ` David Lethe
     [not found]     ` <494960E8.8020407@tmr.com>
2008-12-17 21:47       ` David Lethe
2008-12-19  8:40 piergiorgio.sartor
2008-12-19 13:10 ` Redeeman
