* Triple-parity raid6
@ 2011-06-09  0:01 David Brown
  2011-06-09  1:49 ` NeilBrown
  2011-06-09 22:42 ` David Brown
  0 siblings, 2 replies; 22+ messages in thread
From: David Brown @ 2011-06-09  0:01 UTC (permalink / raw)
  To: linux-raid

Has anyone considered triple-parity raid6 ?  As far as I can see, it 
should not be significantly harder than normal raid6 - either  to 
implement, or for the processor at run-time.  Once you have the GF(2⁸) 
field arithmetic in place for raid6, it's just a matter of making 
another parity block in the same way but using a different generator:

P = D_0 + D_1 + D_2 + .. + D_(n-1)
Q = D_0 + g.D_1 + g².D_2 + .. + g^(n-1).D_(n-1)
R = D_0 + h.D_1 + h².D_2 + .. + h^(n-1).D_(n-1)

The raid6 implementation in mdraid uses g = 0x02 to generate the second 
parity (based on "The mathematics of RAID-6" - I haven't checked the 
source code).  You can make a third parity using h = 0x04 and then get a 
redundancy of 3 disks.  (Note - I haven't yet confirmed that this is 
valid for more than 100 data disks - I need to make my checker program 
more efficient first.)
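
To make that concrete, here is a minimal Python sketch of the maths (not
kernel code - the names are made up, and I'm assuming the same field
polynomial 0x11d that the raid6 paper uses), computing all three
syndromes by Horner's rule over one byte per data disk:

def gf_mul(a, b):
    # Multiply in GF(2^8), reduced by x^8 + x^4 + x^3 + x^2 + 1 (0x11d)
    p = 0
    for _ in range(8):
        if b & 1:
            p ^= a
        b >>= 1
        a <<= 1
        if a & 0x100:
            a ^= 0x11d
    return p

def pqr_syndromes(data):
    # data: one byte from each data disk, D_0 .. D_(n-1); returns (P, Q, R)
    p = q = r = 0
    for d in reversed(data):        # Horner's rule, from D_(n-1) down to D_0
        p ^= d                      # P: plain XOR (generator 1)
        q = gf_mul(q, 0x02) ^ d     # Q: generator g = 0x02
        r = gf_mul(r, 0x04) ^ d     # R: generator h = 0x04
    return p, q, r

For example, pqr_syndromes([0x11, 0x22, 0x33]) gives the three parity
bytes for a three-data-disk stripe; the real code would do the same
thing a whole block at a time.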

Rebuilding a disk, or running in degraded mode, is just an obvious 
extension to the current raid6 algorithms.  If you are missing three 
data blocks, the maths looks hard to start with - but if you express the 
equations as a set of linear equations and use standard matrix inversion 
techniques, it should not be hard to implement.  You only need to do 
this inversion once when you find that one or more disks have failed - 
then you pre-compute the multiplication tables in the same way as is 
done for raid6 today.
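
As a rough sketch of that inversion step (again just illustrative Python
with made-up helper names, assuming the 0x11d polynomial and generators
1, 2 and 4 - not a proposed implementation), recovering three missing
data bytes could look like this:

def gf_mul(a, b):
    # Multiply in GF(2^8), reduced by the raid6 polynomial 0x11d
    p = 0
    for _ in range(8):
        if b & 1:
            p ^= a
        b >>= 1
        a <<= 1
        if a & 0x100:
            a ^= 0x11d
    return p

def gf_pow(a, n):
    r = 1
    for _ in range(n):
        r = gf_mul(r, a)
    return r

def gf_inv(a):
    return gf_pow(a, 254)           # a^254 = a^-1 in GF(2^8), for a != 0

def recover3(data, p, q, r, missing):
    # data: one byte per data disk, with None at the three failed positions
    # missing: the indices of the three failed data disks
    gens = (1, 2, 4)
    # Fold the surviving disks out of the syndromes (XOR is both + and -)
    rhs = [p, q, r]
    for i, d in enumerate(data):
        if d is not None:
            for row, g in enumerate(gens):
                rhs[row] ^= gf_mul(gf_pow(g, i), d)
    # Coefficient matrix: generator powers at the failed indices
    m = [[gf_pow(g, x) for x in missing] for g in gens]
    # Gauss-Jordan elimination over GF(2^8)
    for col in range(3):
        piv = next(row for row in range(col, 3) if m[row][col])
        m[col], m[piv] = m[piv], m[col]
        rhs[col], rhs[piv] = rhs[piv], rhs[col]
        inv = gf_inv(m[col][col])
        m[col] = [gf_mul(inv, v) for v in m[col]]
        rhs[col] = gf_mul(inv, rhs[col])
        for row in range(3):
            if row != col and m[row][col]:
                f = m[row][col]
                m[row] = [a ^ gf_mul(f, b) for a, b in zip(m[row], m[col])]
                rhs[row] ^= gf_mul(f, rhs[col])
    # rhs now holds the three reconstructed bytes, in the order of 'missing'
    return dict(zip(missing, rhs))

In a real implementation the matrix would be inverted once per failure
pattern and the resulting coefficients turned into lookup tables, as
described above, so the per-byte work stays comparable to today's raid6
recovery.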

In normal use, calculating the R parity is no more demanding than 
calculating the Q parity.  And most rebuilds or degraded situations will 
only involve a single disk, and the data can thus be re-constructed 
using the P parity just like raid5 or two-parity raid6.


I'm sure there are situations where triple-parity raid6 would be 
appealing - it has already been implemented in ZFS, and it is only a 
matter of time before two-parity raid6 has a real probability of hitting 
an unrecoverable read error during a rebuild.


And of course, there is no particular reason to stop at three parity 
blocks - the maths can easily be generalised.  1, 2, 4 and 8 can be used 
as generators for quad-parity (checked up to 60 disks), and adding 16 
gives you quintuple parity (checked up to 30 disks) - but that's maybe 
getting a bit paranoid.


ref.:

<http://kernel.org/pub/linux/kernel/people/hpa/raid6.pdf>
<http://blogs.oracle.com/ahl/entry/acm_triple_parity_raid>
<http://queue.acm.org/detail.cfm?id=1670144>
<http://blogs.oracle.com/ahl/entry/triple_parity_raid_z>


mvh.,

David

--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: Triple-parity raid6
  2011-06-09  0:01 Triple-parity raid6 David Brown
@ 2011-06-09  1:49 ` NeilBrown
  2011-06-09 11:32   ` David Brown
  2011-06-09 22:42 ` David Brown
  1 sibling, 1 reply; 22+ messages in thread
From: NeilBrown @ 2011-06-09  1:49 UTC (permalink / raw)
  To: David Brown; +Cc: linux-raid

On Thu, 09 Jun 2011 02:01:06 +0200 David Brown <david.brown@hesbynett.no>
wrote:

> Has anyone considered triple-parity raid6 ?  As far as I can see, it 
> should not be significantly harder than normal raid6 - either  to 
> implement, or for the processor at run-time.  Once you have the GF(2⁸) 
> field arithmetic in place for raid6, it's just a matter of making 
> another parity block in the same way but using a different generator:
> 
> P = D_0 + D_1 + D_2 + .. + D_(n-1)
> Q = D_0 + g.D_1 + g².D_2 + .. + g^(n-1).D_(n-1)
> R = D_0 + h.D_1 + h².D_2 + .. + h^(n-1).D_(n-1)
> 
> The raid6 implementation in mdraid uses g = 0x02 to generate the second 
> parity (based on "The mathematics of RAID-6" - I haven't checked the 
> source code).  You can make a third parity using h = 0x04 and then get a 
> redundancy of 3 disks.  (Note - I haven't yet confirmed that this is 
> valid for more than 100 data disks - I need to make my checker program 
> more efficient first.)
> 
> Rebuilding a disk, or running in degraded mode, is just an obvious 
> extension to the current raid6 algorithms.  If you are missing three 
> data blocks, the maths looks hard to start with - but if you express the 
> equations as a set of linear equations and use standard matrix inversion 
> techniques, it should not be hard to implement.  You only need to do 
> this inversion once when you find that one or more disks have failed - 
> then you pre-compute the multiplication tables in the same way as is 
> done for raid6 today.
> 
> In normal use, calculating the R parity is no more demanding than 
> calculating the Q parity.  And most rebuilds or degraded situations will 
> only involve a single disk, and the data can thus be re-constructed 
> using the P parity just like raid5 or two-parity raid6.
> 
> 
> I'm sure there are situations where triple-parity raid6 would be 
> appealing - it has already been implemented in ZFS, and it is only a 
> matter of time before two-parity raid6 has a real probability of hitting 
> an unrecoverable read error during a rebuild.
> 
> 
> And of course, there is no particular reason to stop at three parity 
> blocks - the maths can easily be generalised.  1, 2, 4 and 8 can be used 
> as generators for quad-parity (checked up to 60 disks), and adding 16 
> gives you quintuple parity (checked up to 30 disks) - but that's maybe 
> getting a bit paranoid.
> 
> 
> ref.:
> 
> <http://kernel.org/pub/linux/kernel/people/hpa/raid6.pdf>
> <http://blogs.oracle.com/ahl/entry/acm_triple_parity_raid>
> <http://queue.acm.org/detail.cfm?id=1670144>
> <http://blogs.oracle.com/ahl/entry/triple_parity_raid_z>
> 

 -ENOPATCH  :-)

I have a series of patches nearly ready which removes a lot of the remaining
duplication in raid5.c between raid5 and raid6 paths.  So there will be
relatively few places where RAID5 and RAID6 do different things - only the
places where they *must* do different things.
After that, adding a new level or layout which has 'max_degraded == 3' would
be quite easy.
The most difficult part would be the enhancements to libraid6 to generate the
new 'syndrome', and to handle the different recovery possibilities.

So if you're not otherwise busy this weekend, a patch would be nice :-)

Thanks,
NeilBrown
--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: Triple-parity raid6
  2011-06-09  1:49 ` NeilBrown
@ 2011-06-09 11:32   ` David Brown
  2011-06-09 12:04     ` NeilBrown
  0 siblings, 1 reply; 22+ messages in thread
From: David Brown @ 2011-06-09 11:32 UTC (permalink / raw)
  To: linux-raid

On 09/06/2011 03:49, NeilBrown wrote:
> On Thu, 09 Jun 2011 02:01:06 +0200 David Brown<david.brown@hesbynett.no>
> wrote:
>
>> Has anyone considered triple-parity raid6 ?  As far as I can see, it
>> should not be significantly harder than normal raid6 - either  to
>> implement, or for the processor at run-time.  Once you have the GF(2⁸)
>> field arithmetic in place for raid6, it's just a matter of making
>> another parity block in the same way but using a different generator:
>>
>> P = D_0 + D_1 + D_2 + .. + D_(n-1)
>> Q = D_0 + g.D_1 + g².D_2 + .. + g^(n-1).D_(n-1)
>> R = D_0 + h.D_1 + h².D_2 + .. + h^(n-1).D_(n-1)
>>
>> The raid6 implementation in mdraid uses g = 0x02 to generate the second
>> parity (based on "The mathematics of RAID-6" - I haven't checked the
>> source code).  You can make a third parity using h = 0x04 and then get a
>> redundancy of 3 disks.  (Note - I haven't yet confirmed that this is
>> valid for more than 100 data disks - I need to make my checker program
>> more efficient first.)
>>
>> Rebuilding a disk, or running in degraded mode, is just an obvious
>> extension to the current raid6 algorithms.  If you are missing three
>> data blocks, the maths looks hard to start with - but if you express the
>> equations as a set of linear equations and use standard matrix inversion
>> techniques, it should not be hard to implement.  You only need to do
>> this inversion once when you find that one or more disks have failed -
>> then you pre-compute the multiplication tables in the same way as is
>> done for raid6 today.
>>
>> In normal use, calculating the R parity is no more demanding than
>> calculating the Q parity.  And most rebuilds or degraded situations will
>> only involve a single disk, and the data can thus be re-constructed
>> using the P parity just like raid5 or two-parity raid6.
>>
>>
>> I'm sure there are situations where triple-parity raid6 would be
>> appealing - it has already been implemented in ZFS, and it is only a
>> matter of time before two-parity raid6 has a real probability of hitting
>> an unrecoverable read error during a rebuild.
>>
>>
>> And of course, there is no particular reason to stop at three parity
>> blocks - the maths can easily be generalised.  1, 2, 4 and 8 can be used
>> as generators for quad-parity (checked up to 60 disks), and adding 16
>> gives you quintuple parity (checked up to 30 disks) - but that's maybe
>> getting a bit paranoid.
>>
>>
>> ref.:
>>
>> <http://kernel.org/pub/linux/kernel/people/hpa/raid6.pdf>
>> <http://blogs.oracle.com/ahl/entry/acm_triple_parity_raid>
>> <http://queue.acm.org/detail.cfm?id=1670144>
>> <http://blogs.oracle.com/ahl/entry/triple_parity_raid_z>
>>
>
>   -ENOPATCH  :-)
>
> I have a series of patches nearly ready which removes a lot of the remaining
> duplication in raid5.c between raid5 and raid6 paths.  So there will be
> relatively few places where RAID5 and RAID6 do different things - only the
> places where they *must* do different things.
> After that, adding a new level or layout which has 'max_degraded == 3' would
> be quite easy.
> The most difficult part would be the enhancements to libraid6 to generate the
> new 'syndrome', and to handle the different recovery possibilities.
>
> So if you're not otherwise busy this weekend, a patch would be nice :-)
>

I'm not going to promise any patches, but maybe I can help with the 
maths.  You say the difficult part is the syndrome calculations and 
recovery - I've got these bits figured out on paper and some 
quick-and-dirty python test code.  On the other hand, I don't really 
want to get into the md kernel code, or the mdadm code - I haven't done 
Linux kernel development before (I mostly program 8-bit microcontrollers 
- when I code on Linux, I use Python), and I fear it would take me a 
long time to get up to speed.

However, if the parity generation and recovery is neatly separated into 
a libraid6 library, the whole thing becomes much more tractable from my 
viewpoint.  Since I am new to this, can you tell me where I should get 
the current libraid6 code?  I'm sure google will find some sources for 
me, but I'd like to make sure I start with whatever version /you/ have.




--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: Triple-parity raid6
  2011-06-09 11:32   ` David Brown
@ 2011-06-09 12:04     ` NeilBrown
  2011-06-09 19:19       ` David Brown
                         ` (3 more replies)
  0 siblings, 4 replies; 22+ messages in thread
From: NeilBrown @ 2011-06-09 12:04 UTC (permalink / raw)
  To: David Brown; +Cc: linux-raid

On Thu, 09 Jun 2011 13:32:59 +0200 David Brown <david@westcontrol.com> wrote:

> On 09/06/2011 03:49, NeilBrown wrote:
> > On Thu, 09 Jun 2011 02:01:06 +0200 David Brown<david.brown@hesbynett.no>
> > wrote:
> >
> >> Has anyone considered triple-parity raid6 ?  As far as I can see, it
> >> should not be significantly harder than normal raid6 - either  to
> >> implement, or for the processor at run-time.  Once you have the GF(2⁸)
> >> field arithmetic in place for raid6, it's just a matter of making
> >> another parity block in the same way but using a different generator:
> >>
> >> P = D_0 + D_1 + D_2 + .. + D_(n-1)
> >> Q = D_0 + g.D_1 + g².D_2 + .. + g^(n-1).D_(n-1)
> >> R = D_0 + h.D_1 + h².D_2 + .. + h^(n-1).D_(n-1)
> >>
> >> The raid6 implementation in mdraid uses g = 0x02 to generate the second
> >> parity (based on "The mathematics of RAID-6" - I haven't checked the
> >> source code).  You can make a third parity using h = 0x04 and then get a
> >> redundancy of 3 disks.  (Note - I haven't yet confirmed that this is
> >> valid for more than 100 data disks - I need to make my checker program
> >> more efficient first.)
> >>
> >> Rebuilding a disk, or running in degraded mode, is just an obvious
> >> extension to the current raid6 algorithms.  If you are missing three
> >> data blocks, the maths looks hard to start with - but if you express the
> >> equations as a set of linear equations and use standard matrix inversion
> >> techniques, it should not be hard to implement.  You only need to do
> >> this inversion once when you find that one or more disks have failed -
> >> then you pre-compute the multiplication tables in the same way as is
> >> done for raid6 today.
> >>
> >> In normal use, calculating the R parity is no more demanding than
> >> calculating the Q parity.  And most rebuilds or degraded situations will
> >> only involve a single disk, and the data can thus be re-constructed
> >> using the P parity just like raid5 or two-parity raid6.
> >>
> >>
> >> I'm sure there are situations where triple-parity raid6 would be
> >> appealing - it has already been implemented in ZFS, and it is only a
> >> matter of time before two-parity raid6 has a real probability of hitting
> >> an unrecoverable read error during a rebuild.
> >>
> >>
> >> And of course, there is no particular reason to stop at three parity
> >> blocks - the maths can easily be generalised.  1, 2, 4 and 8 can be used
> >> as generators for quad-parity (checked up to 60 disks), and adding 16
> >> gives you quintuple parity (checked up to 30 disks) - but that's maybe
> >> getting a bit paranoid.
> >>
> >>
> >> ref.:
> >>
> >> <http://kernel.org/pub/linux/kernel/people/hpa/raid6.pdf>
> >> <http://blogs.oracle.com/ahl/entry/acm_triple_parity_raid>
> >> <http://queue.acm.org/detail.cfm?id=1670144>
> >> <http://blogs.oracle.com/ahl/entry/triple_parity_raid_z>
> >>
> >
> >   -ENOPATCH  :-)
> >
> > I have a series of patches nearly ready which removes a lot of the remaining
> > duplication in raid5.c between raid5 and raid6 paths.  So there will be
> > relatively few places where RAID5 and RAID6 do different things - only the
> > places where they *must* do different things.
> > After that, adding a new level or layout which has 'max_degraded == 3' would
> > be quite easy.
> > The most difficult part would be the enhancements to libraid6 to generate the
> > new 'syndrome', and to handle the different recovery possibilities.
> >
> > So if you're not otherwise busy this weekend, a patch would be nice :-)
> >
> 
> I'm not going to promise any patches, but maybe I can help with the 
> maths.  You say the difficult part is the syndrome calculations and 
> recovery - I've got these bits figured out on paper and some 
> quick-and-dirty python test code.  On the other hand, I don't really 
> want to get into the md kernel code, or the mdadm code - I haven't done 
> Linux kernel development before (I mostly program 8-bit microcontrollers 
> - when I code on Linux, I use Python), and I fear it would take me a 
> long time to get up to speed.
> 
> However, if the parity generation and recovery is neatly separated into 
> a libraid6 library, the whole thing becomes much more tractable from my 
> viewpoint.  Since I am new to this, can you tell me where I should get 
> the current libraid6 code?  I'm sure google will find some sources for 
> me, but I'd like to make sure I start with whatever version /you/ have.
> 
> 
> 
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-raid" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

You can see the current kernel code at:

http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=tree;f=lib/raid6;h=970c541a452d3b9983223d74b10866902f1a47c7;hb=HEAD


int.uc is the generic C code which 'unroll.awk' processes to make various
versions that unroll the loops different amounts to work with CPUs with
different numbers of registers.
Then there is sse1, sse2, altivec which provide the same functionality in
assembler which is optimised for various processors.

And 'recov' has the smarts for doing the reverse calculation when 2 data
blocks, or 1 data and P are missing.

Even if you don't feel up to implementing everything, a start might be
useful.  You never know when someone might jump up and offer to help.

NeilBrown
--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: Triple-parity raid6
  2011-06-09 12:04     ` NeilBrown
@ 2011-06-09 19:19       ` David Brown
  2011-06-10  3:22       ` Namhyung Kim
                         ` (2 subsequent siblings)
  3 siblings, 0 replies; 22+ messages in thread
From: David Brown @ 2011-06-09 19:19 UTC (permalink / raw)
  To: linux-raid

On 09/06/11 14:04, NeilBrown wrote:
> On Thu, 09 Jun 2011 13:32:59 +0200 David Brown<david@westcontrol.com>  wrote:
>
>> On 09/06/2011 03:49, NeilBrown wrote:
>>> On Thu, 09 Jun 2011 02:01:06 +0200 David Brown<david.brown@hesbynett.no>
>>> wrote:
>>>
>>>> Has anyone considered triple-parity raid6 ?  As far as I can see, it
>>>> should not be significantly harder than normal raid6 - either  to
>>>> implement, or for the processor at run-time.  Once you have the GF(2⁸)
>>>> field arithmetic in place for raid6, it's just a matter of making
>>>> another parity block in the same way but using a different generator:
>>>>
>>>> P = D_0 + D_1 + D_2 + .. + D_(n-1)
>>>> Q = D_0 + g.D_1 + g².D_2 + .. + g^(n-1).D_(n-1)
>>>> R = D_0 + h.D_1 + h².D_2 + .. + h^(n-1).D_(n-1)
>>>>
>>>> The raid6 implementation in mdraid uses g = 0x02 to generate the second
>>>> parity (based on "The mathematics of RAID-6" - I haven't checked the
>>>> source code).  You can make a third parity using h = 0x04 and then get a
>>>> redundancy of 3 disks.  (Note - I haven't yet confirmed that this is
>>>> valid for more than 100 data disks - I need to make my checker program
>>>> more efficient first.)
>>>>
>>>> Rebuilding a disk, or running in degraded mode, is just an obvious
>>>> extension to the current raid6 algorithms.  If you are missing three
>>>> data blocks, the maths looks hard to start with - but if you express the
>>>> equations as a set of linear equations and use standard matrix inversion
>>>> techniques, it should not be hard to implement.  You only need to do
>>>> this inversion once when you find that one or more disks have failed -
>>>> then you pre-compute the multiplication tables in the same way as is
>>>> done for raid6 today.
>>>>
>>>> In normal use, calculating the R parity is no more demanding than
>>>> calculating the Q parity.  And most rebuilds or degraded situations will
>>>> only involve a single disk, and the data can thus be re-constructed
>>>> using the P parity just like raid5 or two-parity raid6.
>>>>
>>>>
>>>> I'm sure there are situations where triple-parity raid6 would be
>>>> appealing - it has already been implemented in ZFS, and it is only a
>>>> matter of time before two-parity raid6 has a real probability of hitting
>>>> an unrecoverable read error during a rebuild.
>>>>
>>>>
>>>> And of course, there is no particular reason to stop at three parity
>>>> blocks - the maths can easily be generalised.  1, 2, 4 and 8 can be used
>>>> as generators for quad-parity (checked up to 60 disks), and adding 16
>>>> gives you quintuple parity (checked up to 30 disks) - but that's maybe
>>>> getting a bit paranoid.
>>>>
>>>>
>>>> ref.:
>>>>
>>>> <http://kernel.org/pub/linux/kernel/people/hpa/raid6.pdf>
>>>> <http://blogs.oracle.com/ahl/entry/acm_triple_parity_raid>
>>>> <http://queue.acm.org/detail.cfm?id=1670144>
>>>> <http://blogs.oracle.com/ahl/entry/triple_parity_raid_z>
>>>>
>>>
>>>    -ENOPATCH  :-)
>>>
>>> I have a series of patches nearly ready which removes a lot of the remaining
>>> duplication in raid5.c between raid5 and raid6 paths.  So there will be
>>> relatively few places where RAID5 and RAID6 do different things - only the
>>> places where they *must* do different things.
>>> After that, adding a new level or layout which has 'max_degraded == 3' would
>>> be quite easy.
>>> The most difficult part would be the enhancements to libraid6 to generate the
>>> new 'syndrome', and to handle the different recovery possibilities.
>>>
>>> So if you're not otherwise busy this weekend, a patch would be nice :-)
>>>
>>
>> I'm not going to promise any patches, but maybe I can help with the
>> maths.  You say the difficult part is the syndrome calculations and
>> recovery - I've got these bits figured out on paper and some
>> quick-and-dirty python test code.  On the other hand, I don't really
>> want to get into the md kernel code, or the mdadm code - I haven't done
>> Linux kernel development before (I mostly program 8-bit microcontrollers
>> - when I code on Linux, I use Python), and I fear it would take me a
>> long time to get up to speed.
>>
>> However, if the parity generation and recovery is neatly separated into
>> a libraid6 library, the whole thing becomes much more tractable from my
>> viewpoint.  Since I am new to this, can you tell me where I should get
>> the current libraid6 code?  I'm sure google will find some sources for
>> me, but I'd like to make sure I start with whatever version /you/ have.
>>
>>
>>
>>
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-raid" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>
> You can see the current kernel code at:
>
> http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=tree;f=lib/raid6;h=970c541a452d3b9983223d74b10866902f1a47c7;hb=HEAD
>
>
> int.uc is the generic C code which 'unroll.awk' processes to make various
> versions that unroll the loops different amounts to work with CPUs with
> different numbers of registers.
> Then there is sse1, sse2, altivec which provide the same functionality in
> assembler which is optimised for various processors.
>
> And 'recov' has the smarts for doing the reverse calculation when 2 data
> blocks, or 1 data and P are missing.
>
> Even if you don't feel up to implementing everything, a start might be
> useful.  You never know when someone might jump up and offer to help.
>
> NeilBrown

Monday is a holiday here in Norway, so I've got a long weekend.  I 
should get at least /some/ time to have a look at libraid6!

mvh.,

David


--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: Triple-parity raid6
  2011-06-09  0:01 Triple-parity raid6 David Brown
  2011-06-09  1:49 ` NeilBrown
@ 2011-06-09 22:42 ` David Brown
  1 sibling, 0 replies; 22+ messages in thread
From: David Brown @ 2011-06-09 22:42 UTC (permalink / raw)
  To: linux-raid

On 09/06/11 02:01, David Brown wrote:
> Has anyone considered triple-parity raid6 ? As far as I can see, it
> should not be significantly harder than normal raid6 - either to
> implement, or for the processor at run-time. Once you have the GF(2⁸)
> field arithmetic in place for raid6, it's just a matter of making
> another parity block in the same way but using a different generator:
>
> P = D_0 + D_1 + D_2 + .. + D_(n-1)
> Q = D_0 + g.D_1 + g².D_2 + .. + g^(n-1).D_(n-1)
> R = D_0 + h.D_1 + h².D_2 + .. + h^(n-1).D_(n-1)
>
> The raid6 implementation in mdraid uses g = 0x02 to generate the second
> parity (based on "The mathematics of RAID-6" - I haven't checked the
> source code). You can make a third parity using h = 0x04 and then get a
> redundancy of 3 disks. (Note - I haven't yet confirmed that this is
> valid for more than 100 data disks - I need to make my checker program
> more efficient first.)
>
> Rebuilding a disk, or running in degraded mode, is just an obvious
> extension to the current raid6 algorithms. If you are missing three data
> blocks, the maths looks hard to start with - but if you express the
> equations as a set of linear equations and use standard matrix inversion
> techniques, it should not be hard to implement. You only need to do this
> inversion once when you find that one or more disks have failed - then
> you pre-compute the multiplication tables in the same way as is done for
> raid6 today.
>
> In normal use, calculating the R parity is no more demanding than
> calculating the Q parity. And most rebuilds or degraded situations will
> only involve a single disk, and the data can thus be re-constructed
> using the P parity just like raid5 or two-parity raid6.
>
>
> I'm sure there are situations where triple-parity raid6 would be
> appealing - it has already been implemented in ZFS, and it is only a
> matter of time before two-parity raid6 has a real probability of hitting
> an unrecoverable read error during a rebuild.
>
>
> And of course, there is no particular reason to stop at three parity
> blocks - the maths can easily be generalised. 1, 2, 4 and 8 can be used
> as generators for quad-parity (checked up to 60 disks), and adding 16
> gives you quintuple parity (checked up to 30 disks) - but that's maybe
> getting a bit paranoid.
>
>
> ref.:
>
> <http://kernel.org/pub/linux/kernel/people/hpa/raid6.pdf>
> <http://blogs.oracle.com/ahl/entry/acm_triple_parity_raid>
> <http://queue.acm.org/detail.cfm?id=1670144>
> <http://blogs.oracle.com/ahl/entry/triple_parity_raid_z>
>
>
> mvh.,
>
> David
>

Just to follow up on my numbers here - I've now checked the validity of 
triple-parity using generators 1, 2 and 4 for up to 254 data disks 
(i.e., 257 disks altogether).  I've checked the validity of quad-parity 
up to 120 disks - checking the full 253 disks will probably take the 
machine most of the night.  I'm sure there is some mathematical way to 
prove this, and it could certainly be checked more efficiently than with 
a Python program - but my computer has more spare time than me!
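
(For what it's worth, one cheap speed-up for such a checker - just a
sketch, assuming the raid6 polynomial 0x11d, for which 0x02 is a
primitive element - is table-driven GF(2^8) multiplication:

GF_EXP = [0] * 510              # antilog table, doubled to avoid a modulo
GF_LOG = [0] * 256
x = 1
for i in range(255):
    GF_EXP[i] = x
    GF_LOG[x] = i
    x <<= 1                     # multiply by the primitive element 0x02
    if x & 0x100:
        x ^= 0x11d
for i in range(255, 510):
    GF_EXP[i] = GF_EXP[i - 255]

def gf_mul(a, b):
    # a.b = antilog(log a + log b), with zero handled specially
    if a == 0 or b == 0:
        return 0
    return GF_EXP[GF_LOG[a] + GF_LOG[b]]

That replaces a bit-by-bit multiply loop with two table lookups, which
makes a noticeable difference when testing millions of determinants.)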


--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: Triple-parity raid6
  2011-06-09 12:04     ` NeilBrown
  2011-06-09 19:19       ` David Brown
@ 2011-06-10  3:22       ` Namhyung Kim
  2011-06-10  8:45         ` David Brown
  2011-06-10  9:03       ` David Brown
  2011-06-10 13:56       ` Bill Davidsen
  3 siblings, 1 reply; 22+ messages in thread
From: Namhyung Kim @ 2011-06-10  3:22 UTC (permalink / raw)
  To: NeilBrown; +Cc: David Brown, linux-raid

NeilBrown <neilb@suse.de> writes:
> On Thu, 09 Jun 2011 13:32:59 +0200 David Brown <david@westcontrol.com> wrote:
>
> You can see the current kernel code at:
>
> http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=tree;f=lib/raid6;h=970c541a452d3b9983223d74b10866902f1a47c7;hb=HEAD
>
>
> int.uc is the generic C code which 'unroll.awk' processes to make various
> versions that unroll the loops different amounts to work with CPUs with
> different numbers of registers.
> Then there is sse1, sse2, altivec which provide the same functionality in
> assembler which is optimised for various processors.
>
> And 'recov' has the smarts for doing the reverse calculation when 2 data
> blocks, or 1 data and P are missing.
>
> Even if you don't feel up to implementing everything, a start might be
> useful.  You never know when someone might jump up and offer to help.
>

Maybe I could help David to some extent. :)

I'm gonna read the raid6 code next week and hope that there is room I
can help with, FWIW.

Thanks.


-- 
Regards,
Namhyung Kim

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: Triple-parity raid6
  2011-06-10  3:22       ` Namhyung Kim
@ 2011-06-10  8:45         ` David Brown
  2011-06-10 12:20           ` Christoph Dittmann
  0 siblings, 1 reply; 22+ messages in thread
From: David Brown @ 2011-06-10  8:45 UTC (permalink / raw)
  To: linux-raid

On 10/06/2011 05:22, Namhyung Kim wrote:
> NeilBrown<neilb@suse.de>  writes:
>> On Thu, 09 Jun 2011 13:32:59 +0200 David Brown<david@westcontrol.com>  wrote:
>>
>> You can see the current kernel code at:
>>
>> http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=tree;f=lib/raid6;h=970c541a452d3b9983223d74b10866902f1a47c7;hb=HEAD
>>
>>
>> int.uc is the generic C code which 'unroll.awk' processes to make various
>> versions that unroll the loops different amounts to work with CPUs with
>> different numbers of registers.
>> Then there is sse1, sse2, altivec which provide the same functionality in
>> assembler which is optimised for various processors.
>>
>> And 'recov' has the smarts for doing the reverse calculation when 2 data
>> blocks, or 1 data and P are missing.
>>
>> Even if you don't feel up to implementing everything, a start might be
>> useful.  You never know when someone might jump up and offer to help.
>>
>
> Maybe I could help David to some extent. :)
>
> I'm gonna read the raid6 code next week and hope that there is a room I
> can help with, FWIW.
>

No matter how far I manage to get, I'm going to need help from someone 
who knows the system and the development process, how the new functions 
would fit with the rest of the system, and not least people who can 
check and test the code.

Making multiple parity syndromes is easy enough mathematically:

For each parity block P_j, you have a generator g_j and calculate for d_i 
running over all data disks:

	P_j = sum((g_j ^ i) . d_i)

Raid5 parity uses g_0 = 1, so it is just the xor.
Raid6 uses g_0 = 1, g_1 = 2.

Any independent generators could be used, of course.  I am not sure how 
to prove that a set of generators is independent (except in the easy 
case of 2 generators, as shown in the raid6.pdf paper) - but brute force 
testing over all choices of dead disks is easy enough.

For Raid7, the obvious choice is g_2 = 4 - then we can re-use existing 
macros and optimisations.
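
In quick Python form (just a sketch of the formula above with made-up
names, assuming the 0x11d field polynomial - not a proposal for the
kernel side), the general case is simply:

def gf_mul(a, b):
    # Multiply in GF(2^8), reduced by the raid6 polynomial 0x11d
    p = 0
    for _ in range(8):
        if b & 1:
            p ^= a
        b >>= 1
        a <<= 1
        if a & 0x100:
            a ^= 0x11d
    return p

def syndromes(data, gens):
    # P_j = sum over i of (g_j ^ i) . d_i, evaluated by Horner's rule.
    # gens = (1,) is raid5, (1, 2) is raid6, (1, 2, 4) adds the third
    # parity, and so on.
    out = []
    for g in gens:
        s = 0
        for d in reversed(data):
            s = gf_mul(s, g) ^ d
        out.append(s)
    return out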


I've had a brief look at int.uc - the gen_syndrome function is nice and 
small, but that's before awk gets at it.  I haven't yet tried building 
any of this, but extending raid6_int to a third syndrome is, I hope, 
straightforward:

static void raid7_int$#_gen_syndrome(int disks, size_t bytes, void **ptrs)
{
         u8 **dptr = (u8 **)ptrs;
         u8 *p, *q, *r;
         int d, z, z0;

         unative_t wd$$, wq$$, wp$$, w1$$, w2$$, wr$$, w3$$, w4$$;

         z0 = disks - 4;         /* Highest data disk */
         p = dptr[z0+1];         /* XOR parity */
         q = dptr[z0+2];         /* RS syndrome */
         r = dptr[z0+3];         /* RS syndrome 2 */

         for ( d = 0 ; d < bytes ; d += NSIZE*$# ) {
                 wr$$ = wq$$ = wp$$ = *(unative_t *)&dptr[z0][d+$$*NSIZE];
                 for ( z = z0-1 ; z >= 0 ; z-- ) {
                         wd$$ = *(unative_t *)&dptr[z][d+$$*NSIZE];
                         wp$$ ^= wd$$;
                         /* wq = wq * 0x02, then add wd: Horner step for Q */
                         w2$$ = MASK(wq$$);
                         w1$$ = SHLBYTE(wq$$);
                         w2$$ &= NBYTES(0x1d);
                         w1$$ ^= w2$$;
                         wq$$ = w1$$ ^ wd$$;

                         /* wr = wr * 0x04, then add wd: two multiply-by-two steps for R */
                         w4$$ = MASK(wr$$);
                         w3$$ = SHLBYTE(wr$$);
                         w4$$ &= NBYTES(0x1d);
                         w3$$ ^= w4$$;
                         w4$$ = MASK(w3$$);
                         w3$$ = SHLBYTE(w3$$);
                         w4$$ &= NBYTES(0x1d);
                         w3$$ ^= w4$$;
                         wr$$ = w3$$ ^ wd$$;
                 }
                 *(unative_t *)&p[d+NSIZE*$$] = wp$$;
                 *(unative_t *)&q[d+NSIZE*$$] = wq$$;
                 *(unative_t *)&r[d+NSIZE*$$] = wr$$;
         }
}


I wrote the wr$$ calculations using a second set of working variables. 
I don't know (yet) what compiler options are used to generate the target 
code, nor what the awk'ed code looks like.  If the compiler can handle 
the scheduling to interlace the Q and R calculations and reduce pipeline 
delays, then that's great.  If not, then they can be manually interlaced 
if it helps the code.  But as I'm writing this blind (on a windows 
machine, no less), I don't know if it makes a difference.

As a general point regarding optimisations, is there a minimum level of 
gcc we can expect to have here?  And are there rules about compiler 
flags?  Later versions of gcc are getting very smart about 
vectorisation, and generating multiple versions of code that are 
selected automatically at run-time depending on the capabilities of the 
processor.  My hope is that the code can be greatly simplified by 
writing a single clear version in C that runs fast and takes advantage 
of the CPU, without needing to make explicit SSE, AltiVec, etc., 
versions.  Are there rules about which gcc attributes or extensions we 
can use?


Another question - what should we call this?  Raid 7?  Raid 6.3?  Raid 
7.3?  While we are in the mood for triple-parity Raid, we can easily 
extend it to quad-parity or more - the name should probably be flexible 
enough to allow that.




^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: Triple-parity raid6
  2011-06-09 12:04     ` NeilBrown
  2011-06-09 19:19       ` David Brown
  2011-06-10  3:22       ` Namhyung Kim
@ 2011-06-10  9:03       ` David Brown
  2011-06-10 13:56       ` Bill Davidsen
  3 siblings, 0 replies; 22+ messages in thread
From: David Brown @ 2011-06-10  9:03 UTC (permalink / raw)
  To: linux-raid

On 09/06/2011 14:04, NeilBrown wrote:
> On Thu, 09 Jun 2011 13:32:59 +0200 David Brown<david@westcontrol.com>  wrote:
>
>> On 09/06/2011 03:49, NeilBrown wrote:
>>> On Thu, 09 Jun 2011 02:01:06 +0200 David Brown<david.brown@hesbynett.no>
>>> wrote:
>>>
>>>> Has anyone considered triple-parity raid6 ?  As far as I can see, it
>>>> should not be significantly harder than normal raid6 - either  to
>>>> implement, or for the processor at run-time.  Once you have the GF(2⁸)
>>>> field arithmetic in place for raid6, it's just a matter of making
>>>> another parity block in the same way but using a different generator:
>>>>
>>>> P = D_0 + D_1 + D_2 + .. + D_(n-1)
>>>> Q = D_0 + g.D_1 + g².D_2 + .. + g^(n-1).D_(n-1)
>>>> R = D_0 + h.D_1 + h².D_2 + .. + h^(n-1).D_(n-1)
>>>>
>>>> The raid6 implementation in mdraid uses g = 0x02 to generate the second
>>>> parity (based on "The mathematics of RAID-6" - I haven't checked the
>>>> source code).  You can make a third parity using h = 0x04 and then get a
>>>> redundancy of 3 disks.  (Note - I haven't yet confirmed that this is
>>>> valid for more than 100 data disks - I need to make my checker program
>>>> more efficient first.)
>>>>
>>>> Rebuilding a disk, or running in degraded mode, is just an obvious
>>>> extension to the current raid6 algorithms.  If you are missing three
>>>> data blocks, the maths looks hard to start with - but if you express the
>>>> equations as a set of linear equations and use standard matrix inversion
>>>> techniques, it should not be hard to implement.  You only need to do
>>>> this inversion once when you find that one or more disks have failed -
>>>> then you pre-compute the multiplication tables in the same way as is
>>>> done for raid6 today.
>>>>
>>>> In normal use, calculating the R parity is no more demanding than
>>>> calculating the Q parity.  And most rebuilds or degraded situations will
>>>> only involve a single disk, and the data can thus be re-constructed
>>>> using the P parity just like raid5 or two-parity raid6.
>>>>
>>>>
>>>> I'm sure there are situations where triple-parity raid6 would be
>>>> appealing - it has already been implemented in ZFS, and it is only a
>>>> matter of time before two-parity raid6 has a real probability of hitting
>>>> an unrecoverable read error during a rebuild.
>>>>
>>>>
>>>> And of course, there is no particular reason to stop at three parity
>>>> blocks - the maths can easily be generalised.  1, 2, 4 and 8 can be used
>>>> as generators for quad-parity (checked up to 60 disks), and adding 16
>>>> gives you quintuple parity (checked up to 30 disks) - but that's maybe
>>>> getting a bit paranoid.
>>>>
>>>>
>>>> ref.:
>>>>
>>>> <http://kernel.org/pub/linux/kernel/people/hpa/raid6.pdf>
>>>> <http://blogs.oracle.com/ahl/entry/acm_triple_parity_raid>
>>>> <http://queue.acm.org/detail.cfm?id=1670144>
>>>> <http://blogs.oracle.com/ahl/entry/triple_parity_raid_z>
>>>>
>>>
>>>    -ENOPATCH  :-)
>>>
>>> I have a series of patches nearly ready which removes a lot of the remaining
>>> duplication in raid5.c between raid5 and raid6 paths.  So there will be
>>> relatively few places where RAID5 and RAID6 do different things - only the
>>> places where they *must* do different things.
>>> After that, adding a new level or layout which has 'max_degraded == 3' would
>>> be quite easy.
>>> The most difficult part would be the enhancements to libraid6 to generate the
>>> new 'syndrome', and to handle the different recovery possibilities.
>>>
>>> So if you're not otherwise busy this weekend, a patch would be nice :-)
>>>
>>
>> I'm not going to promise any patches, but maybe I can help with the
>> maths.  You say the difficult part is the syndrome calculations and
>> recovery - I've got these bits figured out on paper and some
>> quick-and-dirty python test code.  On the other hand, I don't really
>> want to get into the md kernel code, or the mdadm code - I haven't done
>> Linux kernel development before (I mostly program 8-bit microcontrollers
>> - when I code on Linux, I use Python), and I fear it would take me a
>> long time to get up to speed.
>>
>> However, if the parity generation and recovery is neatly separated into
>> a libraid6 library, the whole thing becomes much more tractable from my
>> viewpoint.  Since I am new to this, can you tell me where I should get
>> the current libraid6 code?  I'm sure google will find some sources for
>> me, but I'd like to make sure I start with whatever version /you/ have.
>>
>>
>>
>>
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-raid" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>
> You can see the current kernel code at:
>
> http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=tree;f=lib/raid6;h=970c541a452d3b9983223d74b10866902f1a47c7;hb=HEAD
>
>
> int.uc is the generic C code which 'unroll.awk' processes to make various
> versions that unroll the loops different amounts to work with CPUs with
> different numbers of registers.
> Then there is sse1, sse2, altivec which provide the same functionality in
> assembler which is optimised for various processors.
>
> And 'recov' has the smarts for doing the reverse calculation when 2 data
> blocks, or 1 data and P are missing.
>
> Even if you don't feel up to implementing everything, a start might be
> useful.  You never know when someone might jump up and offer to help.
>
> NeilBrown


When looking at recov.c, I see that the "raid6_dual_recov" function has 
no code for testing the data+Q failure case, since it is equivalent to 
raid5 recovery.  Should this not still be implemented here so that 
testing can be more complete?

Is there a general entry point for the recovery routines, which then 
decides which of raid6_2data_recov, raid6_datap_recov, or 
raid6_dual_recov is called?  With triple-parity raid, there are many 
more combinations - it would make sense for the library to have a single 
function like :

void raid7_3_recov(int disks, size_t bytes, int noOfFails,
		int *pFails, void **ptrs);

or even (to cover quad parity and more) :

void raid7_n_recov(int disks, int noOfParities, size_t bytes,
	int noOfFails, int *pFails, void **ptrs);






--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: Triple-parity raid6
  2011-06-10  8:45         ` David Brown
@ 2011-06-10 12:20           ` Christoph Dittmann
  2011-06-10 14:28             ` David Brown
  0 siblings, 1 reply; 22+ messages in thread
From: Christoph Dittmann @ 2011-06-10 12:20 UTC (permalink / raw)
  To: linux-raid

On 06/10/2011 10:45 AM, David Brown wrote:
> Making multiple parity syndromes is easy enough mathematically:

Adam Leventhal, who wrote the double parity and triple parity code for 
ZFS, mentioned on his blog [1] that going beyond triple parity poses 
significant challenges if write performance should not suffer.

In particular, it looks like a relevant math paper (originally) had a 
flaw in the claim that quad parity and above can be implemented just 
like triple parity.

I don't know the implementation you were going to use, nor am I 
knowledgeable about multi-parity in general. I only thought it might be 
relevant to add to the current discussion that other people had issues 
with implementing N-parity for N > 3.


Christoph

[1] http://blogs.oracle.com/ahl/entry/triple_parity_raid_z

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: Triple-parity raid6
  2011-06-09 12:04     ` NeilBrown
                         ` (2 preceding siblings ...)
  2011-06-10  9:03       ` David Brown
@ 2011-06-10 13:56       ` Bill Davidsen
  3 siblings, 0 replies; 22+ messages in thread
From: Bill Davidsen @ 2011-06-10 13:56 UTC (permalink / raw)
  To: NeilBrown; +Cc: David Brown, linux-raid

NeilBrown wrote:
> On Thu, 09 Jun 2011 13:32:59 +0200 David Brown<david@westcontrol.com>  wrote:
>
>    
>> On 09/06/2011 03:49, NeilBrown wrote:
>>      
>>>
>>>    -ENOPATCH  :-)
>>>
>>> I have a series of patches nearly ready which removes a lot of the remaining
>>> duplication in raid5.c between raid5 and raid6 paths.  So there will be
>>> relatively few places where RAID5 and RAID6 do different things - only the
>>> places where they *must* do different things.
>>>        
>>> After that, adding a new level or layout which has 'max_degraded == 3' would
>>> be quite easy.
>>> The most difficult part would be the enhancements to libraid6 to generate the
>>> new 'syndrome', and to handle the different recovery possibilities.
>>>
>>> So if you're not otherwise busy this weekend, a patch would be nice :-)
>>>
>>>        
>> I'm not going to promise any patches, but maybe I can help with the
>> maths.  You say the difficult part is the syndrome calculations and
>> recovery - I've got these bits figured out on paper and some
>> quick-and-dirty python test code.  On the other hand, I don't really
>> want to get into the md kernel code, or the mdadm code - I haven't done
>> Linux kernel development before (I mostly program 8-bit microcontrollers
>> - when I code on Linux, I use Python), and I fear it would take me a
>> long time to get up to speed.
>>
>> However, if the parity generation and recovery is neatly separated into
>> a libraid6 library, the whole thing becomes much more tractable from my
>> viewpoint.  Since I am new to this, can you tell me where I should get
>> the current libraid6 code?  I'm sure google will find some sources for
>> me, but I'd like to make sure I start with whatever version /you/ have.
>>      
> You can see the current kernel code at:
>
> http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=tree;f=lib/raid6;h=970c541a452d3b9983223d74b10866902f1a47c7;hb=HEAD
>
>
> int.uc is the generic C code which 'unroll.awk' processes to make various
> versions that unroll the loops different amounts to work with CPUs with
> different numbers of registers.
> Then there is sse1, sse2, altivec which provide the same functionality in
> assembler which is optimised for various processors.
>
>    
And at some point I'm sure one of the video card vendors will provide a 
hack to do it in the GPU in massively parallel fashion.

> And 'recov' has the smarts for doing the reverse calculation when 2 data
> blocks, or 1 data and P are missing.
>
> Even if you don't feel up to implementing everything, a start might be
> useful.  You never know when someone might jump up and offer to help.
>
>
>    


-- 
Bill Davidsen<davidsen@tmr.com>
   We are not out of the woods yet, but we know the direction and have
taken the first step. The steps are many, but finite in number, and if
we persevere we will reach our destination.  -me, 2010




^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: Triple-parity raid6
  2011-06-10 12:20           ` Christoph Dittmann
@ 2011-06-10 14:28             ` David Brown
  2011-06-11 10:13               ` Piergiorgio Sartor
  0 siblings, 1 reply; 22+ messages in thread
From: David Brown @ 2011-06-10 14:28 UTC (permalink / raw)
  To: linux-raid

On 10/06/2011 14:20, Christoph Dittmann wrote:
> On 06/10/2011 10:45 AM, David Brown wrote:
>> Making multiple parity syndromes is easy enough mathematically:
>
> Adam Leventhal, who wrote the double parity and triple parity code for
> ZFS, mentioned on his blog [1] that going beyond triple parity poses
> significant challenges if write performance should not suffer.
>
> In particular, it looks like a relevant math paper (originally) had a
> flaw in the claim that quad parity and above can be implemented just
> like triple parity.
>
> I don't know the implementation you were going to use, nor am I
> knowledgeable about multi-parity in general. I only thought it might be
> relevant to add to the current discussion that other people had issues
> with implementing N-parity for N > 3.
>
>

I've looked at Adam's blog article - in fact, stumbling over it when 
wandering around the 'net was one of my inspirations to think about 
multi-parity raid again.

But there are a few key differences between the description in the James 
Plank papers referenced, and the implementation I've looked at.

One point is that he is looking at GF(2^4), which quickly becomes a 
limiting factor.  It's hard to get enough independent parity syndromes, 
and you can easily start to think that things won't work for larger parity.

The other is the choice of factors for the syndromes.  Assuming I 
understand the papers correctly, the first paper is suggesting this 
arrangement (all arithmetic done in GF()):

P_0 = 1^0 . d_0  +  2^0 . d_1  +  3^0 . d_2  +  4^0 . d_3
P_1 = 1^1 . d_0  +  2^1 . d_1  +  3^1 . d_2  +  4^1 . d_3
P_2 = 1^2 . d_0  +  2^2 . d_1  +  3^2 . d_2  +  4^2 . d_3
P_3 = 1^3 . d_0  +  2^3 . d_1  +  3^3 . d_2  +  4^3 . d_3

The second paper changes it to:

P_0 = 1^0 . d_0  +  1^1 . d_1  +  1^2 . d_2  +  1^3 . d_3
P_1 = 2^0 . d_0  +  2^1 . d_1  +  2^2 . d_2  +  2^3 . d_3
P_2 = 3^0 . d_0  +  3^1 . d_1  +  3^2 . d_2  +  3^3 . d_3
P_3 = 4^0 . d_0  +  4^1 . d_1  +  4^2 . d_2  +  4^3 . d_3


For the first two parity blocks, this is the same as for Linux raid6:

P = d_0 + d_1 + d_2 + d_3
Q = g^0 . d_0  +  g^1 . d_1  +  g^2 . d_2  +  g^3 . d_3
(where g = 2)

But the third (and later) lines are not good - the restoration equations 
on multiple disk failure are not independent for all combinations of 
failed disks.  For example, if you have 20 data disks and 3 parity 
disks, there are 1140 different combinations of three data-disk 
failures.  Of these, 92 combinations lead to non-independent equations 
when you try to restore the data.


The equations I am using are:

P_0 = 1^0 . d_0  +  1^1 . d_1  +  1^2 . d_2  +  1^3 . d_3
P_1 = 2^0 . d_0  +  2^1 . d_1  +  2^2 . d_2  +  2^3 . d_3
P_2 = 4^0 . d_0  +  4^1 . d_1  +  4^2 . d_2  +  4^3 . d_3
P_3 = 8^0 . d_0  +  8^1 . d_1  +  8^2 . d_2  +  8^3 . d_3

P_4 would use powers of 16, etc.

I have checked that the restoration equations are solvable for all 
combinations of 3 disk failures for triple parity, and all combinations 
of 4 disk failures for quad parity.  The restoration equations can be 
written in matrix form and are dependent on the indexes of the failed 
disks.  To solve them, you simply invert the matrix and this gives you a 
linear formula for re-calculating each missing data block.  So to check 
that the data can be restored, you need to check that the determinant of 
this matrix is non-zero for all combinations of n-disk failures.
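
For what it's worth, that brute-force check only needs a few lines of
Python (a sketch with made-up names, assuming the raid6 polynomial 0x11d):

from itertools import combinations

def gf_mul(a, b):
    # Multiply in GF(2^8), reduced by the raid6 polynomial 0x11d
    p = 0
    for _ in range(8):
        if b & 1:
            p ^= a
        b >>= 1
        a <<= 1
        if a & 0x100:
            a ^= 0x11d
    return p

def gf_pow(a, n):
    r = 1
    for _ in range(n):
        r = gf_mul(r, a)
    return r

def gf_det(m):
    # Determinant over GF(2^8) by Gaussian elimination; addition is XOR,
    # and row swaps do not change the sign in characteristic 2.
    m = [row[:] for row in m]
    n = len(m)
    det = 1
    for col in range(n):
        piv = next((row for row in range(col, n) if m[row][col]), None)
        if piv is None:
            return 0
        m[col], m[piv] = m[piv], m[col]
        det = gf_mul(det, m[col][col])
        inv = gf_pow(m[col][col], 254)      # multiplicative inverse
        for row in range(col + 1, n):
            if m[row][col]:
                f = gf_mul(m[row][col], inv)
                m[row] = [a ^ gf_mul(f, b) for a, b in zip(m[row], m[col])]
    return det

def all_recoverable(ndisks, gens=(1, 2, 4)):
    # True if every combination of len(gens) failed data disks gives an
    # invertible recovery matrix for these generators
    k = len(gens)
    return all(gf_det([[gf_pow(g, x) for x in fails] for g in gens])
               for fails in combinations(range(ndisks), k))

For example, all_recoverable(254) is the triple-parity check over 254
data disks, and passing gens=(1, 2, 4, 8) gives the quad-parity version.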

I /believe/ these should work fine for at least up to P_7 (i.e., 8 
parity disks), but I haven't yet checked more than a few larger parity 
modes.  Checking all 4 disk failures from 253 disks took my inefficient 
python program about 45 minutes - to check for 8-disk failures would 
mean finding all combinations of 8 choices for up to 249 disks.  If my 
maths serves me right, that's 316,528,258,780,856 combinations - and for 
each one, I'd have to find the determinant of an 8x8 matrix over 
GF(2^8).  Fortunately, I don't think there'll be much call for 
octal-parity RAID in the near future :-)


Of course, all this assumes that my maths is correct !



^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: Triple-parity raid6
  2011-06-10 14:28             ` David Brown
@ 2011-06-11 10:13               ` Piergiorgio Sartor
  2011-06-11 11:51                 ` David Brown
  0 siblings, 1 reply; 22+ messages in thread
From: Piergiorgio Sartor @ 2011-06-11 10:13 UTC (permalink / raw)
  To: David Brown; +Cc: linux-raid

[snip]
> Of course, all this assumes that my maths is correct !

I would suggest to check out the Reed-Solomon thing
in the more friendly form of Vandermonde matrix.

It will be completely clear how to generate k parity
set with n data set (disk), so that n+k < 258 for the
GF(256) space.

It will also be much more clear how to re-construct
the data set in case of erasure (known data lost).

You can have a look, for reference, at:

http://lsirwww.epfl.ch/wdas2004/Presentations/manasse.ppt

If you search for something like "Reed Solomon Vandermonde"
you'll find even more information.

Hope this helps.

bye,

-- 

piergiorgio

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: Triple-parity raid6
  2011-06-11 10:13               ` Piergiorgio Sartor
@ 2011-06-11 11:51                 ` David Brown
  2011-06-11 13:18                   ` Piergiorgio Sartor
  0 siblings, 1 reply; 22+ messages in thread
From: David Brown @ 2011-06-11 11:51 UTC (permalink / raw)
  To: linux-raid

On 11/06/11 12:13, Piergiorgio Sartor wrote:
> [snip]
>> Of course, all this assumes that my maths is correct !
>
> I would suggest to check out the Reed-Solomon thing
> in the more friendly form of Vandermonde matrix.
>
> It will be completely clear how to generate k parity
> set with n data set (disk), so that n+k<  258 for the
> GF(256) space.
>
> It will also be much more clear how to re-construct
> the data set in case of erasure (known data lost).
>
> You can have a look, for reference, at:
>
> http://lsirwww.epfl.ch/wdas2004/Presentations/manasse.ppt
>
> If you search for something like "Reed Solomon Vandermonde"
> you'll find even more information.
>
> Hope this helps.
>
> bye,
>

That presentation is using Vandermonde matrices, which are the same as 
the ones used in James Plank's papers.  As far as I can see, these are 
limited in how well you can recover from missing disks (the presentation 
here says it only works for up to triple parity).  The Vandermonde 
matrices have the advantage that the determinants are easily calculated 
- I haven't yet figured out an analytical method of calculating the 
determinants in my equations, and have just used brute force checking. 
(My syndromes also have the advantage of being easy to calculate quickly.)


Still, I think the next step for me should be to write up the maths a 
bit more formally, rather than just hints in mailing list posts.  Then 
others can have a look, and have an opinion on whether I've got it right 
or not.  It makes sense to be sure the algorithms will work before 
spending much time implementing them!

I certainly /believe/ my maths is correct here - but it's nearly twenty 
years since I did much formal algebra.  I studied maths at university, 
but I don't use group theory often in my daily job as an embedded 
programmer.



^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: Triple-parity raid6
  2011-06-11 11:51                 ` David Brown
@ 2011-06-11 13:18                   ` Piergiorgio Sartor
  2011-06-11 14:53                     ` David Brown
  0 siblings, 1 reply; 22+ messages in thread
From: Piergiorgio Sartor @ 2011-06-11 13:18 UTC (permalink / raw)
  To: David Brown; +Cc: linux-raid

On Sat, Jun 11, 2011 at 01:51:12PM +0200, David Brown wrote:
> On 11/06/11 12:13, Piergiorgio Sartor wrote:
> >[snip]
> >>Of course, all this assumes that my maths is correct !
> >
> >I would suggest to check out the Reed-Solomon thing
> >in the more friendly form of Vandermonde matrix.
> >
> >It will be completely clear how to generate k parity
> >set with n data set (disk), so that n+k<  258 for the
> >GF(256) space.
> >
> >It will also be much more clear how to re-construct
> >the data set in case of erasure (known data lost).
> >
> >You can have a look, for reference, at:
> >
> >http://lsirwww.epfl.ch/wdas2004/Presentations/manasse.ppt
> >
> >If you search for something like "Reed Solomon Vandermonde"
> >you'll find even more information.
> >
> >Hope this helps.
> >
> >bye,
> >
> 
> That presentation is using Vandermonde matrices, which are the same
> as the ones used in James Plank's papers.  As far as I can see,
> these are limited in how well you can recover from missing disks
> (the presentation here says it only works for up to triple parity).

As far as I understood, 3 is only an example, it works
up to k lost disks, with n+k < 258 (or 259).
I mean, I do not see why it should not.

You've, of course, to know which disks are failed.
On the other hand, having k parities allows you to find
up to k/2 error positions.
This is bit more complicated, I guess.
You can search for Berlekamp-Massey Algorithm (and related)
in order to see how to *find* the errors.

> The Vandermonde matrices have the advantage that the determinants
> are easily calculated - I haven't yet figured out an analytical
> method of calculating the determinants in my equations, and have
> just used brute force checking. (My syndromes also have the
> advantage of being easy to calculate quickly.)

I think the point of Reed Solomon (with Vandermonde or Cauchy
matrices) is also that it generalize the parity concept.

This means you do not have to care if it is 2 or 3 or 7 or ...

In this way you can have as many parities as you like, up to
the limitation of Reed Solomon in GF(256).

> Still, I think the next step for me should be to write up the maths
> a bit more formally, rather than just hints in mailing list posts.
> Then others can have a look, and have an opinion on whether I've got
> it right or not.  It makes sense to be sure the algorithms will work
> before spending much time implementing them!

I tend to agree. At first you should set up the background
theory, then the algorithm, later the implementation and
eventually the optimization.
 
> I certainly /believe/ my maths is correct here - but it's nearly
> twenty years since I did much formal algebra.  I studied maths at
> university, but I don't use group theory often in my daily job as an
> embedded programmer.

Well, I, for sure, will stay tuned for your results!

bye,

-- 

piergiorgio

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: Triple-parity raid6
  2011-06-11 13:18                   ` Piergiorgio Sartor
@ 2011-06-11 14:53                     ` David Brown
  2011-06-11 15:05                       ` Joe Landman
  0 siblings, 1 reply; 22+ messages in thread
From: David Brown @ 2011-06-11 14:53 UTC (permalink / raw)
  To: linux-raid

On 11/06/11 15:18, Piergiorgio Sartor wrote:
> On Sat, Jun 11, 2011 at 01:51:12PM +0200, David Brown wrote:
>> On 11/06/11 12:13, Piergiorgio Sartor wrote:
>>> [snip]
>>>> Of course, all this assume that my maths is correct !
>>>
>>> I would suggest to check out the Reed-Solomon thing
>>> in the more friendly form of Vandermonde matrix.
>>>
>>> It will be completely clear how to generate k parity
> >> set with n data set (disk), so that n+k < 258 for the
>>> GF(256) space.
>>>
>>> It will also be much more clear how to re-construct
>>> the data set in case of erasure (known data lost).
>>>
>>> You can have a look, for reference, at:
>>>
>>> http://lsirwww.epfl.ch/wdas2004/Presentations/manasse.ppt
>>>
>>> If you search for something like "Reed Solomon Vandermonde"
>>> you'll find even more information.
>>>
>>> Hope this helps.
>>>
>>> bye,
>>>
>>
>> That presentation is using Vandermonde matrices, which are the same
>> as the ones used in James Plank's papers.  As far as I can see,
>> these are limited in how well you can recover from missing disks
>> (the presentation here says it only works for up to triple parity).
>
> As far as I understand, 3 is only an example; it works
> for up to k lost disks, with n+k < 258 (or 259).
> I mean, I do not see why it should not.
>

I also don't see why three parities should be a limitation - I think it 
must come down to the choice of syndrome calculations.  But the 
presentation you linked to specifically says on page 3 that it will 
explain "Why it stops at three erasures, and works only for GF(2^k)". 
I haven't investigated anything other than GF(2^8), since that is 
optimal for implementing raid (well, 2^1 is easier - but that only gives 
you raid5).  Unfortunately, the presentation doesn't give details there. 
Adam Leventhal's blog (mentioned earlier in this thread) also says that 
implementing triple parity for ZFS was relatively easy, but that going 
beyond three parity blocks was not.

> You have, of course, to know which disks have failed.

That's normally the case for disk applications.

> On the other hand, having k parities allows you to find
> up to k/2 error positions.
> This is a bit more complicated, I guess.
> You can search for Berlekamp-Massey Algorithm (and related)
> in order to see how to *find* the errors.
>

I've worked with ECC systems for data transmission and communications 
systems, when you don't know if there are any errors or where the errors 
might be.  But although there is a fair overlap of the theory here, 
there are big differences in the way you implement such checking and 
correction, and your priorities.  With raid, you know either that your 
block read is correct (because of the ECC handled at the drive firmware 
level), or incorrect.

To deal with unknown errors or error positions, you have to read in 
everything in a stripe and run your error checking for every read - that 
would be a significant run-time cost, which would normally be wasted (as 
the raid set is normally consistent).

One situation where that might be useful, however, is for scrubbing, or 
checking when the array is known to be inconsistent (such as after a 
power failure).  Neil has already argued that the simple approach of 
re-creating the parity blocks (rather than identifying incorrect blocks) 
is better, or at least no worse, than being "smart".  But the balance of 
that argument might change with more parity blocks.

<http://neil.brown.name/blog/20100211050355>
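
To make that concrete, the "smart" check for a single bad data block
using P and Q would look roughly like the sketch below (my own untested
illustration, byte-at-a-time; locate_bad_block is just a name I made
up).  As Piergiorgio noted, k parities can locate up to k/2 unknown bad
blocks, which is where the balance of that argument might change as the
parity count grows:

#include <stdint.h>
#include <stddef.h>

/* Multiply a byte by 2 in GF(2^8), polynomial 0x11d, as in raid6. */
static inline uint8_t gf_mul2(uint8_t v)
{
    return (uint8_t)((v << 1) ^ ((v & 0x80) ? 0x1d : 0));
}

/*
 * "Smart" scrub of one stripe with n data blocks of 'len' bytes,
 * assuming at most one block is bad.  Returns the index of the suspect
 * data disk, -1 if the stripe is consistent, or -2 if the mismatch does
 * not point at a single data disk (stale parity, or several bad blocks).
 */
static int locate_bad_block(int n, size_t len, uint8_t **data,
                            const uint8_t *p, const uint8_t *q)
{
    int suspect = -1;

    for (size_t b = 0; b < len; b++) {
        uint8_t pc = 0, qc = 0;

        /* Recompute P and Q for this byte column (Horner's rule). */
        for (int i = n - 1; i >= 0; i--) {
            qc = gf_mul2(qc) ^ data[i][b];
            pc ^= data[i][b];
        }

        uint8_t dp = pc ^ p[b], dq = qc ^ q[b];
        if (!dp && !dq)
            continue;               /* this byte is consistent       */
        if (!dp || !dq)
            return -2;              /* P or Q itself looks stale     */

        /* A single bad byte on data disk z gives dq = 2^z * dp. */
        int z = 0;
        uint8_t t = dp;
        while (t != dq && z < n) {
            t = gf_mul2(t);
            z++;
        }
        if (z == n)
            return -2;              /* no data disk position fits    */
        if (suspect == -1)
            suspect = z;
        else if (suspect != z)
            return -2;              /* two different positions       */
    }
    return suspect;
}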

>> The Vandermonde matrices have the advantage that the determinants
>> are easily calculated - I haven't yet figured out an analytical
>> method of calculating the determinants in my equations, and have
>> just used brute force checking. (My syndromes also have the
>> advantage of being easy to calculate quickly.)
>
> I think the point of Reed Solomon (with Vandermonde or Cauchy
> matrices) is also that it generalizes the parity concept.
>
> This means you do not have to care if it is 2 or 3 or 7 or ...
>
> In this way you can have as many parities as you like, up to
> the limitation of Reed Solomon in GF(256).
>

I agree.  However, I'm not sure there is much practical use in going 
beyond perhaps 4 parity blocks - at that stage you are probably better 
off dividing up your array, or (if you need more protection) using 
n-parity raid6 over raid1 mirror sets.

>> Still, I think the next step for me should be to write up the maths
>> a bit more formally, rather than just hints in mailing list posts.
>> Then others can have a look, and have an opinion on whether I've got
>> it right or not.  It makes sense to be sure the algorithms will work
>> before spending much time implementing them!
>
> I tend to agree. At first you should set up the background
> theory, then the algorithm, later the implementation and
> eventually the optimization.
>

Yes - we've already established that the implementation will be 
possible, and that there are people willing and able to help with it. 
And I believe that much of the optimisation can be handled by the 
compiler - gcc has come a long way since raid6 was first implemented in 
mdraid.

>> I certainly /believe/ my maths is correct here - but it's nearly
>> twenty years since I did much formal algebra.  I studied maths at
>> university, but I don't use group theory often in my daily job as an
>> embedded programmer.
>
> Well, I, for sure, will stay tuned for your results!
>
> bye,
>



^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: Triple-parity raid6
  2011-06-11 14:53                     ` David Brown
@ 2011-06-11 15:05                       ` Joe Landman
  2011-06-11 16:31                         ` David Brown
  0 siblings, 1 reply; 22+ messages in thread
From: Joe Landman @ 2011-06-11 15:05 UTC (permalink / raw)
  To: David Brown; +Cc: linux-raid

On 06/11/2011 10:53 AM, David Brown wrote:

> Yes - we've already established that the implementation will be
> possible, and that there are people willing and able to help with it.
> And I believe that much of the optimisation can be handled by the
> compiler - gcc has come a long way since raid6 was first implemented in
> mdraid.

Hmmm ... don't be overly reliant upon optimization from a compiler 
for your performance.  It makes sense to have a well-designed (and 
proven correct) algorithm first that is hand-optimizable.  There are 
many designs which are anathema to performance, and you have to be 
careful to avoid using them in your coding.

We are interested in working on this capability (and more generic 
capability) as well.

Is anyone in particular starting to design/code this?  Please let me know.

-- 
Joseph Landman, Ph.D
Founder and CEO
Scalable Informatics, Inc.
email: landman@scalableinformatics.com
web  : http://scalableinformatics.com
        http://scalableinformatics.com/sicluster
phone: +1 734 786 8423 x121
fax  : +1 866 888 3112
cell : +1 734 612 4615

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: Triple-parity raid6
  2011-06-11 15:05                       ` Joe Landman
@ 2011-06-11 16:31                         ` David Brown
  2011-06-11 16:57                           ` Joe Landman
  2011-06-11 17:14                           ` Joe Landman
  0 siblings, 2 replies; 22+ messages in thread
From: David Brown @ 2011-06-11 16:31 UTC (permalink / raw)
  To: linux-raid

On 11/06/11 17:05, Joe Landman wrote:
> On 06/11/2011 10:53 AM, David Brown wrote:
>
>> Yes - we've already established that the implementation will be
>> possible, and that there are people willing and able to help with it.
>> And I believe that much of the optimisation can be handled by the
>> compiler - gcc has come a long way since raid6 was first implemented in
>> mdraid.
>
> Hmmm ... don't be overly reliant upon optimization from a compiler
> for your performance. It makes sense to have a well-designed (and proven
> correct) algorithm first that is hand-optimizable. There are many
> designs which are anathema to performance, and you have to be careful to
> avoid using them in your coding.
>

Absolutely - it's the designer's job to pick a good algorithm, and the 
programmer's job to make a good implementation of it.  But it's the 
compiler's job to turn that into tight object code.  And for code like 
this, the algorithm designer has to work closely with the programmer (or 
be the same person, of course) to pick algorithms that can be 
implemented well.  Similarly, the programmer has to have a good 
understanding of the compiler and be able to understand the generated 
assembly, in order to get the best from the tools.

What has changed over the years is that there is no longer such a need 
for manual assembly code to get optimal speed out of the cpu.  While 
writing such assembly is fun, it is time-consuming to write and hard to 
maintain, especially for code that must run on so many different platforms.

> We are interested in working on this capability (and more generic
> capability) as well.
>
> Is anyone in particular starting to design/code this? Please let me know.
>

Well, I am currently trying to write up some of the maths - I started 
the thread because I had been playing around with the maths, and thought 
it should work.  I made a brief stab at writing a 
"raid7_int$#_gen_syndrome()" function, but I haven't done any testing 
with it (or even tried to compile it) - first I want to be sure of the 
algorithms.



^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: Triple-parity raid6
  2011-06-11 16:31                         ` David Brown
@ 2011-06-11 16:57                           ` Joe Landman
  2011-06-12  9:05                             ` David Brown
  2011-06-11 17:14                           ` Joe Landman
  1 sibling, 1 reply; 22+ messages in thread
From: Joe Landman @ 2011-06-11 16:57 UTC (permalink / raw)
  To: David Brown; +Cc: linux-raid

On 06/11/2011 12:31 PM, David Brown wrote:

> What has changed over the years is that there is no longer such a need
> for manual assembly code to get optimal speed out of the cpu. While

Hmmm ... I've done studies on this using an incredibly simple function 
(Riemann Zeta Function c.f. http://scalability.org/?p=470  ).  The short 
version is that hand optimized SSE2 is ~4x faster (for this case) than 
best optimization of high level code.  Hand optimized assembler is even 
better.

> writing such assembly is fun, it is time-consuming to write and hard to
> maintain, especially for code that must run on so many different platforms.

Yes, it is generally hard to write and maintain.  But it lets you get the 
rest of the language semantics out of the way.  If you look at the tests 
that Linux does when it starts up, you can see a fairly wide 
distribution in the performance.

raid5: using function: generic_sse (13356.000 MB/sec)
raid6: int64x1   3507 MB/s
raid6: int64x2   3886 MB/s
raid6: int64x4   3257 MB/s
raid6: int64x8   3054 MB/s
raid6: sse2x1    8347 MB/s
raid6: sse2x2    9695 MB/s
raid6: sse2x4   10972 MB/s

Some of these are hand coded assembly. See 
${KERNEL_SOURCE}/drivers/md/raid6sse2.c and look at the 
raid6_sse24_gen_syndrome code.

Really, getting the best performance out of the system requires a fairly 
deep understanding of how the processor/memory system operates.  These 
functions do use the SSE registers, but we can have only so many SSE 
operations in flight at once.  These processors can generally have quite 
a few simultaneous operations in flight, so knowledge of that, the mix 
of operations, and how they interact with the instruction scheduler in 
the hardware, is fairly essential to getting good performance.

>
>> We are interested in working on this capability (and more generic
>> capability) as well.
>>
>> Is anyone in particular starting to design/code this? Please let me know.
>>
>
> Well, I am currently trying to write up some of the maths - I started
> the thread because I had been playing around with the maths, and thought
> it should work. I made a brief stab at writing a
> "raid7_int$#_gen_syndrome()" function, but I haven't done any testing
> with it (or even tried to compile it) - first I want to be sure of the
> algorithms.

I've been coding various bits as "pseudocode" using Octave.  Makes 
checking with the built-in Galois functions pretty easy.

I haven't looked at the math behind the triple-parity syndrome calc yet, 
though I'd imagine someone has, and can write it down.  If someone 
hasn't done that yet, it's a good first step.  Then we can code the 
simple version from there with test drivers/cases, and then start 
optimizing the implementation.
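
As a sanity check of the math (rather than anything performance-minded),
a first test driver could be as small as the sketch below - plain C
rather than Octave, untested, and all the names are made up.  It computes
P, Q and R over a handful of data bytes, "erases" three of them, rebuilds
them by solving the 3x3 Vandermonde system in GF(256), and compares the
result with the originals:

#include <stdint.h>
#include <stdio.h>

static uint8_t gf_log[256], gf_exp[512];

static void gf_init(void)
{
    uint8_t x = 1;
    for (int i = 0; i < 255; i++) {
        gf_exp[i] = x;
        gf_log[x] = i;
        x = (uint8_t)((x << 1) ^ ((x & 0x80) ? 0x1d : 0));
    }
    for (int i = 255; i < 512; i++)
        gf_exp[i] = gf_exp[i - 255];
}

static uint8_t gf_mul(uint8_t a, uint8_t b)
{
    return (a && b) ? gf_exp[gf_log[a] + gf_log[b]] : 0;
}

/* Gauss-Jordan elimination on a 3x3 system with an augmented column. */
static void gf_solve3(uint8_t m[3][4])
{
    for (int i = 0; i < 3; i++) {
        int piv = i;
        while (piv < 3 && m[piv][i] == 0)
            piv++;
        if (piv == 3)
            return;     /* singular - cannot happen for distinct erasures
                           when all three parities are available */
        for (int k = 0; k < 4; k++) {
            uint8_t t = m[i][k]; m[i][k] = m[piv][k]; m[piv][k] = t;
        }
        uint8_t inv = gf_exp[255 - gf_log[m[i][i]]];
        for (int k = 0; k < 4; k++)
            m[i][k] = gf_mul(m[i][k], inv);
        for (int j = 0; j < 3; j++) {
            if (j == i)
                continue;
            uint8_t f = m[j][i];
            for (int k = 0; k < 4; k++)
                m[j][k] ^= gf_mul(f, m[i][k]);
        }
    }
}

int main(void)
{
    enum { N = 10 };                    /* data disks, one byte each   */
    uint8_t d[N] = { 0x12, 0x34, 0x56, 0x78, 0x9a,
                     0xbc, 0xde, 0xf0, 0x11, 0x22 };
    uint8_t syn[3] = { 0, 0, 0 };       /* P, Q, R                     */
    int lost[3] = { 1, 4, 7 };          /* the three erased positions  */

    gf_init();
    for (int r = 0; r < 3; r++)         /* row r uses generator 2^r    */
        for (int i = 0; i < N; i++)
            syn[r] ^= gf_mul(gf_exp[(r * i) % 255], d[i]);

    /* Build the 3x3 system from the surviving disks and the syndromes. */
    uint8_t m[3][4];
    for (int r = 0; r < 3; r++) {
        uint8_t rhs = syn[r];
        for (int i = 0; i < N; i++) {
            if (i == lost[0] || i == lost[1] || i == lost[2])
                continue;
            rhs ^= gf_mul(gf_exp[(r * i) % 255], d[i]);
        }
        for (int c = 0; c < 3; c++)
            m[r][c] = gf_exp[(r * lost[c]) % 255];
        m[r][3] = rhs;
    }
    gf_solve3(m);

    for (int c = 0; c < 3; c++)
        printf("disk %d: recovered 0x%02x, expected 0x%02x\n",
               lost[c], (unsigned)m[c][3], (unsigned)d[lost[c]]);
    return 0;
}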


-- 
Joseph Landman, Ph.D
Founder and CEO
Scalable Informatics, Inc.
email: landman@scalableinformatics.com
web  : http://scalableinformatics.com
        http://scalableinformatics.com/sicluster
phone: +1 734 786 8423 x121
fax  : +1 866 888 3112
cell : +1 734 612 4615

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: Triple-parity raid6
  2011-06-11 16:31                         ` David Brown
  2011-06-11 16:57                           ` Joe Landman
@ 2011-06-11 17:14                           ` Joe Landman
  2011-06-11 18:05                             ` David Brown
  1 sibling, 1 reply; 22+ messages in thread
From: Joe Landman @ 2011-06-11 17:14 UTC (permalink / raw)
  To: David Brown; +Cc: linux-raid

A quick note of caution (and someone from Netapp, feel free to speak up).

Netapp has a patent on triple-parity raid (c.f. 
http://www.freepatentsonline.com/7640484.html).  A quick look over this 
suggests that the major innovation is the layout and computation, which 
they simplified in a particular manner.  That is, I don't think their 
patent covers triple-parity RAID in general, but it does cover their 
implementation, and the diagonal parity with anti-diagonal parity 
(effectively counter-propagating or orthogonalized parity).

I am not sure what this means in a coding sense, other than not to use 
their techniques without a license to do so.  If Netapp wants to grant 
such a license, this would be good, but I suspect that it wouldn't be 
quite as simple as this.

Just a note so that we don't encounter problems.  I think it's very 
possible to avoid their IP, as it would be somewhat hard to claim 
ownership of the Galois field math behind RAID calculations.  They can 
(and do) claim a particular implementation and algorithm.

[also not trying to open the patents-on-code wars here, just pointing out 
the current situation]


-- 
Joseph Landman, Ph.D
Founder and CEO
Scalable Informatics, Inc.
email: landman@scalableinformatics.com
web  : http://scalableinformatics.com
        http://scalableinformatics.com/sicluster
phone: +1 734 786 8423 x121
fax  : +1 866 888 3112
cell : +1 734 612 4615

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: Triple-parity raid6
  2011-06-11 17:14                           ` Joe Landman
@ 2011-06-11 18:05                             ` David Brown
  0 siblings, 0 replies; 22+ messages in thread
From: David Brown @ 2011-06-11 18:05 UTC (permalink / raw)
  To: linux-raid

On 11/06/11 19:14, Joe Landman wrote:
> A quick note of caution (and someone from Netapp, feel free to speak up).
>
> Netapp has a patent on triple-parity raid (c.f.
> http://www.freepatentsonline.com/7640484.html). A quick look over this
> suggests that the major innovation is the layout and computation, which
> they simplified in a particular manner. That is, I don't think their
> patent covers triple-parity RAID in general, but it does cover their
> implementation, and the diagonal parity with anti-diagonal parity
> (effectively counter-propagating or orthogonalized parity).
>
> I am not sure what this means in a coding sense, other than not to use
> their techniques without a license to do so. If Netapp wants to grant
> such a license, this would be good, but I suspect that it wouldn't be
> quite as simple as this.
>
> Just a note so that we don't encounter problems. I think it's very
> possible to avoid their IP, as it would be somewhat hard to claim
> ownership of the Galois field math behind RAID calculations. They can
> (and do) claim a particular implementation and algorithm.
>
> [also not trying to open the patents-on-code wars here, just pointing out
> the current situation]
>
>

I've read a little about diagonal parities - I can see some advantage in 
their simplicity, but I think that they are a poor choice for raid. 
Raid5+ already suffers from performance issues because you often have to 
read a whole stripe at a time just to change a few blocks - with 
diagonal parity, you'd have to read a whole 2-D set of stripes.



^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: Triple-parity raid6
  2011-06-11 16:57                           ` Joe Landman
@ 2011-06-12  9:05                             ` David Brown
  0 siblings, 0 replies; 22+ messages in thread
From: David Brown @ 2011-06-12  9:05 UTC (permalink / raw)
  To: linux-raid

On 11/06/11 18:57, Joe Landman wrote:
> On 06/11/2011 12:31 PM, David Brown wrote:
>
>> What has changed over the years is that there is no longer such a need
>> for manual assembly code to get optimal speed out of the cpu. While
>
> Hmmm ... I've done studies on this using an incredibly simple function
> (Riemann Zeta Function c.f. http://scalability.org/?p=470 ). The short
> version is that hand optimized SSE2 is ~4x faster (for this case) than
> best optimization of high level code. Hand optimized assembler is even
> better.
>
>> writing such assembly is fun, it is time-consuming to write and hard to
>> maintain, especially for code that must run on so many different
>> platforms.
>
> Yes, it is generally hard to write and maintain. But it lets you get the
> rest of the language semantics out of the way. If you look at the tests
> that Linux does when it starts up, you can see a fairly wide
> distribution in the performance.
>
> raid5: using function: generic_sse (13356.000 MB/sec)
> raid6: int64x1 3507 MB/s
> raid6: int64x2 3886 MB/s
> raid6: int64x4 3257 MB/s
> raid6: int64x8 3054 MB/s
> raid6: sse2x1 8347 MB/s
> raid6: sse2x2 9695 MB/s
> raid6: sse2x4 10972 MB/s
>
> Some of these are hand coded assembly. See
> ${KERNEL_SOURCE}/drivers/md/raid6sse2.c and look at the
> raid6_sse24_gen_syndrome code.
>
> Really, getting the best performance out of the system requires a fairly
> deep understanding of how the processor/memory system operates. These
> functions do use the SSE registers, but we can have only so many SSE
> operations in flight at once. These processors can generally have quite
> a few simultaneous operations in flight, so knowledge of that, the mix
> of operations, and how they interact with the instruction scheduler in
> the hardware, is fairly essential to getting good performance.
>

I am not suggesting that hand-coding assembly won't make the 
calculations faster - just that better compiler optimisations (which 
will automatically make use of SSE instructions) will bring the generic 
code closer to the theoretical maximum.

Out of curiosity, have you re-tried your zeta function code using a more 
modern version of gcc?  A lot has happened with gcc since 4.1 - in 
particular, the "graphite" code in gcc 4.4 will make a big difference to 
code that loops through a lot of data (it re-arranges the loops to 
unroll inner blocks, and to make loop strides match cache sizes).


>>
>>> We are interested in working on this capability (and more generic
>>> capability) as well.
>>>
>>> Is anyone in particular starting to design/code this? Please let me
>>> know.
>>>
>>
>> Well, I am currently trying to write up some of the maths - I started
>> the thread because I had been playing around with the maths, and thought
>> it should work. I made a brief stab at writing a
>> "raid7_int$#_gen_syndrome()" function, but I haven't done any testing
>> with it (or even tried to compile it) - first I want to be sure of the
>> algorithms.
>
> I've been coding various bits as "pseudocode" using Octave. Makes
> checking with the built-in Galois functions pretty easy.
>
> I haven't looked at the math behind the triple-parity syndrome calc yet,
> though I'd imagine someone has, and can write it down. If someone hasn't
> done that yet, it's a good first step. Then we can code the simple
> version from there with test drivers/cases, and then start optimizing
> the implementation.
>
>



^ permalink raw reply	[flat|nested] 22+ messages in thread

end of thread, other threads:[~2011-06-12  9:05 UTC | newest]

Thread overview: 22+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2011-06-09  0:01 Triple-parity raid6 David Brown
2011-06-09  1:49 ` NeilBrown
2011-06-09 11:32   ` David Brown
2011-06-09 12:04     ` NeilBrown
2011-06-09 19:19       ` David Brown
2011-06-10  3:22       ` Namhyung Kim
2011-06-10  8:45         ` David Brown
2011-06-10 12:20           ` Christoph Dittmann
2011-06-10 14:28             ` David Brown
2011-06-11 10:13               ` Piergiorgio Sartor
2011-06-11 11:51                 ` David Brown
2011-06-11 13:18                   ` Piergiorgio Sartor
2011-06-11 14:53                     ` David Brown
2011-06-11 15:05                       ` Joe Landman
2011-06-11 16:31                         ` David Brown
2011-06-11 16:57                           ` Joe Landman
2011-06-12  9:05                             ` David Brown
2011-06-11 17:14                           ` Joe Landman
2011-06-11 18:05                             ` David Brown
2011-06-10  9:03       ` David Brown
2011-06-10 13:56       ` Bill Davidsen
2011-06-09 22:42 ` David Brown
