All of lore.kernel.org
 help / color / mirror / Atom feed
* erasure code and coefficients
@ 2014-06-29  9:30 Loic Dachary
  2014-06-29 18:38 ` Koleos Fuscus
  0 siblings, 1 reply; 9+ messages in thread
From: Loic Dachary @ 2014-06-29  9:30 UTC (permalink / raw)
  To: Andreas-Joachim Peters; +Cc: Ceph Development

[-- Attachment #1: Type: text/plain, Size: 826 bytes --]

Hi Andreas,

In http://anrg.usc.edu/~maheswaran/Xorbas.pdf I get the idea of computing local coding chunks the way it is implemented in https://github.com/ceph/ceph/pull/1921 (i.e. delegating encoding / decoding to other plugins). However, there are theoretical aspects of the paper that I do not understand and I'm hoping you can shed some light on it. In particular, I don't know what "coefficients" are about. For instance in the context of Figure 2 caption : "The main theoretical challenge is to choose the coeffi\x0ecients c(i) to maximize the fault tolerance of the code."

Would you recommend a paper to read to better understand this ? Also I'd like to understand what "coefficients" mean in the context of jerasure or if they do not apply.

Thanks for you help :-)

-- 
Loïc Dachary, Artisan Logiciel Libre


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 263 bytes --]

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: erasure code and coefficients
  2014-06-29  9:30 erasure code and coefficients Loic Dachary
@ 2014-06-29 18:38 ` Koleos Fuscus
  2014-06-29 18:44   ` Loic Dachary
  0 siblings, 1 reply; 9+ messages in thread
From: Koleos Fuscus @ 2014-06-29 18:38 UTC (permalink / raw)
  To: Loic Dachary; +Cc: Andreas-Joachim Peters, Ceph Development

Hello Loic,
Dimakis (one of the authors of xorbas) is talking about coefficients
because they want to find a way to reduce the storage overhead used
with LRC. In the simple case used in Fig. 2, a RS (k=10, m=4) has
14/10 storage overhead but when using LRC, the overhead increases to
17/10 because you also need to store s1, s2 and s3. Basically, the
idea is to find specific coefficients c1..c10 that permit to obtain s3
through s1 and s2. In other words, get some s1 and s2 that when xored
together give s3. If you find such coefficients, you don't need to
store s3 and the storage overhead of LRC is 1.6x instead of 1.7x.

Dimakis said that for the Reed Solomon implementation used in HDFS
RAID they can simple set all coefficients with value '1' and use xor.

This cannot be the case of the Reed Solomon implemented by you (I
understood is the jerasure library by Plank) but that I am not sure. I
guess we need the help of a mathematician or at least check and
compare both implementations.

Finally, apparently for xorbas they only implemented the configuration
RS(10,4) and not other combinations. Unfortunately, the wiki page of
the project is empty http://wiki.apache.org/hadoop/ErasureCode and the
main page says 'erasure coding under development'.

I recommend you to watch the xorbas presentation video
http://smahesh.com/HadoopUSC/ (a very clear explanation of xorbas) and
use the Dimakis wiki page to check the large collection of paper they
have: http://storagewiki.ece.utexas.edu/

Best,

koleosfuscus

________________________________________________________________
"My reply is: the software has no known bugs, therefore it has not
been updated."
Wietse Venema


On Sun, Jun 29, 2014 at 11:30 AM, Loic Dachary <loic@dachary.org> wrote:
> Hi Andreas,
>
> In http://anrg.usc.edu/~maheswaran/Xorbas.pdf I get the idea of computing local coding chunks the way it is implemented in https://github.com/ceph/ceph/pull/1921 (i.e. delegating encoding / decoding to other plugins). However, there are theoretical aspects of the paper that I do not understand and I'm hoping you can shed some light on it. In particular, I don't know what "coefficients" are about. For instance in the context of Figure 2 caption : "The main theoretical challenge is to choose the coeffi cients c(i) to maximize the fault tolerance of the code."
>
> Would you recommend a paper to read to better understand this ? Also I'd like to understand what "coefficients" mean in the context of jerasure or if they do not apply.
>
> Thanks for you help :-)
>
> --
> Loïc Dachary, Artisan Logiciel Libre
>
--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: erasure code and coefficients
  2014-06-29 18:38 ` Koleos Fuscus
@ 2014-06-29 18:44   ` Loic Dachary
  2014-06-30  8:18     ` Koleos Fuscus
  0 siblings, 1 reply; 9+ messages in thread
From: Loic Dachary @ 2014-06-29 18:44 UTC (permalink / raw)
  To: Koleos Fuscus; +Cc: Andreas-Joachim Peters, Ceph Development

[-- Attachment #1: Type: text/plain, Size: 3043 bytes --]

Hi koleofuscus,

Thanks for the explanation : it is very conforting to know that you understand this :-) At the risk of being thick, I must say that the very notion of "coefficient" eludes me. What are they ?

Cheers

On 29/06/2014 20:38, Koleos Fuscus wrote:
> Hello Loic,
> Dimakis (one of the authors of xorbas) is talking about coefficients
> because they want to find a way to reduce the storage overhead used
> with LRC. In the simple case used in Fig. 2, a RS (k=10, m=4) has
> 14/10 storage overhead but when using LRC, the overhead increases to
> 17/10 because you also need to store s1, s2 and s3. Basically, the
> idea is to find specific coefficients c1..c10 that permit to obtain s3
> through s1 and s2. In other words, get some s1 and s2 that when xored
> together give s3. If you find such coefficients, you don't need to
> store s3 and the storage overhead of LRC is 1.6x instead of 1.7x.
> 
> Dimakis said that for the Reed Solomon implementation used in HDFS
> RAID they can simple set all coefficients with value '1' and use xor.
> 
> This cannot be the case of the Reed Solomon implemented by you (I
> understood is the jerasure library by Plank) but that I am not sure. I
> guess we need the help of a mathematician or at least check and
> compare both implementations.
> 
> Finally, apparently for xorbas they only implemented the configuration
> RS(10,4) and not other combinations. Unfortunately, the wiki page of
> the project is empty http://wiki.apache.org/hadoop/ErasureCode and the
> main page says 'erasure coding under development'.
> 
> I recommend you to watch the xorbas presentation video
> http://smahesh.com/HadoopUSC/ (a very clear explanation of xorbas) and
> use the Dimakis wiki page to check the large collection of paper they
> have: http://storagewiki.ece.utexas.edu/
> 
> Best,
> 
> koleosfuscus
> 
> ________________________________________________________________
> "My reply is: the software has no known bugs, therefore it has not
> been updated."
> Wietse Venema
> 
> 
> On Sun, Jun 29, 2014 at 11:30 AM, Loic Dachary <loic@dachary.org> wrote:
>> Hi Andreas,
>>
>> In http://anrg.usc.edu/~maheswaran/Xorbas.pdf I get the idea of computing local coding chunks the way it is implemented in https://github.com/ceph/ceph/pull/1921 (i.e. delegating encoding / decoding to other plugins). However, there are theoretical aspects of the paper that I do not understand and I'm hoping you can shed some light on it. In particular, I don't know what "coefficients" are about. For instance in the context of Figure 2 caption : "The main theoretical challenge is to choose the coeffi cients c(i) to maximize the fault tolerance of the code."
>>
>> Would you recommend a paper to read to better understand this ? Also I'd like to understand what "coefficients" mean in the context of jerasure or if they do not apply.
>>
>> Thanks for you help :-)
>>
>> --
>> Loïc Dachary, Artisan Logiciel Libre
>>

-- 
Loïc Dachary, Artisan Logiciel Libre


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 263 bytes --]

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: erasure code and coefficients
  2014-06-29 18:44   ` Loic Dachary
@ 2014-06-30  8:18     ` Koleos Fuscus
  2014-06-30  9:06       ` Loic Dachary
  0 siblings, 1 reply; 9+ messages in thread
From: Koleos Fuscus @ 2014-06-30  8:18 UTC (permalink / raw)
  To: Loic Dachary; +Cc: Andreas-Joachim Peters, Ceph Development

Hi Loic,

I am happy to contribute with some clarifications. In fact,
erasure/reliability concepts are not blocking my progress with the
reliability model at ceph. It is the ceph model itself that has some
parts not clear to me, and nobody had time yet to review the state
model diagram that I published on the wiki. :(
 Anyway, regarding coefficients here is a bit of background.
Coefficients are the numbers that multiply your variables inside an
equation. In a toy example, to solve the equation ax^2+bx+c=0 you need
to find the coefficients a,b,c that make the equation valid.
In the context of Reed Solomon, the definition of coefficients is a
bit more confusing. In the original design, the message x is
interpreted as coefficients of a polynomial p. But in subsequents
interpretations the message x is seen as the values of the polynomial
p evaluated at the first k points a1..ak. Such interpretation is
apparently a bit less efficient but desirable because you can
construct a systematic code.
In the context of xorbas, you are constructing a code on top of Reed
Solomon. The codewords are seen as values, and the idea is to get
coefficients c1..c10 that also satisfy s1+s2+s3=0 (take this as a
missing introduction to my previous message)

Cheers,

koleosfuscus

________________________________________________________________
"My reply is: the software has no known bugs, therefore it has not
been updated."
Wietse Venema


On Sun, Jun 29, 2014 at 8:44 PM, Loic Dachary <loic@dachary.org> wrote:
> Hi koleofuscus,
>
> Thanks for the explanation : it is very conforting to know that you understand this :-) At the risk of being thick, I must say that the very notion of "coefficient" eludes me. What are they ?
>
> Cheers
>
> On 29/06/2014 20:38, Koleos Fuscus wrote:
>> Hello Loic,
>> Dimakis (one of the authors of xorbas) is talking about coefficients
>> because they want to find a way to reduce the storage overhead used
>> with LRC. In the simple case used in Fig. 2, a RS (k=10, m=4) has
>> 14/10 storage overhead but when using LRC, the overhead increases to
>> 17/10 because you also need to store s1, s2 and s3. Basically, the
>> idea is to find specific coefficients c1..c10 that permit to obtain s3
>> through s1 and s2. In other words, get some s1 and s2 that when xored
>> together give s3. If you find such coefficients, you don't need to
>> store s3 and the storage overhead of LRC is 1.6x instead of 1.7x.
>>
>> Dimakis said that for the Reed Solomon implementation used in HDFS
>> RAID they can simple set all coefficients with value '1' and use xor.
>>
>> This cannot be the case of the Reed Solomon implemented by you (I
>> understood is the jerasure library by Plank) but that I am not sure. I
>> guess we need the help of a mathematician or at least check and
>> compare both implementations.
>>
>> Finally, apparently for xorbas they only implemented the configuration
>> RS(10,4) and not other combinations. Unfortunately, the wiki page of
>> the project is empty http://wiki.apache.org/hadoop/ErasureCode and the
>> main page says 'erasure coding under development'.
>>
>> I recommend you to watch the xorbas presentation video
>> http://smahesh.com/HadoopUSC/ (a very clear explanation of xorbas) and
>> use the Dimakis wiki page to check the large collection of paper they
>> have: http://storagewiki.ece.utexas.edu/
>>
>> Best,
>>
>> koleosfuscus
>>
>> ________________________________________________________________
>> "My reply is: the software has no known bugs, therefore it has not
>> been updated."
>> Wietse Venema
>>
>>
>> On Sun, Jun 29, 2014 at 11:30 AM, Loic Dachary <loic@dachary.org> wrote:
>>> Hi Andreas,
>>>
>>> In http://anrg.usc.edu/~maheswaran/Xorbas.pdf I get the idea of computing local coding chunks the way it is implemented in https://github.com/ceph/ceph/pull/1921 (i.e. delegating encoding / decoding to other plugins). However, there are theoretical aspects of the paper that I do not understand and I'm hoping you can shed some light on it. In particular, I don't know what "coefficients" are about. For instance in the context of Figure 2 caption : "The main theoretical challenge is to choose the coeffi cients c(i) to maximize the fault tolerance of the code."
>>>
>>> Would you recommend a paper to read to better understand this ? Also I'd like to understand what "coefficients" mean in the context of jerasure or if they do not apply.
>>>
>>> Thanks for you help :-)
>>>
>>> --
>>> Loïc Dachary, Artisan Logiciel Libre
>>>
>
> --
> Loïc Dachary, Artisan Logiciel Libre
>
--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: erasure code and coefficients
  2014-06-30  8:18     ` Koleos Fuscus
@ 2014-06-30  9:06       ` Loic Dachary
  2014-06-30  9:26         ` Andreas Joachim Peters
  0 siblings, 1 reply; 9+ messages in thread
From: Loic Dachary @ 2014-06-30  9:06 UTC (permalink / raw)
  To: Koleos Fuscus; +Cc: Ceph Development

[-- Attachment #1: Type: text/plain, Size: 5084 bytes --]

Hi koleosfuscus,

It clarifies it enough to raise a question : where can I read code (or an algorithm if not code) that chose the coefficients desirable to implement what is suggested in the Xorbas paper ?

Cheers

On 30/06/2014 10:18, Koleos Fuscus wrote:
> Hi Loic,
> 
> I am happy to contribute with some clarifications. In fact,
> erasure/reliability concepts are not blocking my progress with the
> reliability model at ceph. It is the ceph model itself that has some
> parts not clear to me, and nobody had time yet to review the state
> model diagram that I published on the wiki. :(
>  Anyway, regarding coefficients here is a bit of background.
> Coefficients are the numbers that multiply your variables inside an
> equation. In a toy example, to solve the equation ax^2+bx+c=0 you need
> to find the coefficients a,b,c that make the equation valid.
> In the context of Reed Solomon, the definition of coefficients is a
> bit more confusing. In the original design, the message x is
> interpreted as coefficients of a polynomial p. But in subsequents
> interpretations the message x is seen as the values of the polynomial
> p evaluated at the first k points a1..ak. Such interpretation is
> apparently a bit less efficient but desirable because you can
> construct a systematic code.
> In the context of xorbas, you are constructing a code on top of Reed
> Solomon. The codewords are seen as values, and the idea is to get
> coefficients c1..c10 that also satisfy s1+s2+s3=0 (take this as a
> missing introduction to my previous message)
> 
> Cheers,
> 
> koleosfuscus
> 
> ________________________________________________________________
> "My reply is: the software has no known bugs, therefore it has not
> been updated."
> Wietse Venema
> 
> 
> On Sun, Jun 29, 2014 at 8:44 PM, Loic Dachary <loic@dachary.org> wrote:
>> Hi koleofuscus,
>>
>> Thanks for the explanation : it is very conforting to know that you understand this :-) At the risk of being thick, I must say that the very notion of "coefficient" eludes me. What are they ?
>>
>> Cheers
>>
>> On 29/06/2014 20:38, Koleos Fuscus wrote:
>>> Hello Loic,
>>> Dimakis (one of the authors of xorbas) is talking about coefficients
>>> because they want to find a way to reduce the storage overhead used
>>> with LRC. In the simple case used in Fig. 2, a RS (k=10, m=4) has
>>> 14/10 storage overhead but when using LRC, the overhead increases to
>>> 17/10 because you also need to store s1, s2 and s3. Basically, the
>>> idea is to find specific coefficients c1..c10 that permit to obtain s3
>>> through s1 and s2. In other words, get some s1 and s2 that when xored
>>> together give s3. If you find such coefficients, you don't need to
>>> store s3 and the storage overhead of LRC is 1.6x instead of 1.7x.
>>>
>>> Dimakis said that for the Reed Solomon implementation used in HDFS
>>> RAID they can simple set all coefficients with value '1' and use xor.
>>>
>>> This cannot be the case of the Reed Solomon implemented by you (I
>>> understood is the jerasure library by Plank) but that I am not sure. I
>>> guess we need the help of a mathematician or at least check and
>>> compare both implementations.
>>>
>>> Finally, apparently for xorbas they only implemented the configuration
>>> RS(10,4) and not other combinations. Unfortunately, the wiki page of
>>> the project is empty http://wiki.apache.org/hadoop/ErasureCode and the
>>> main page says 'erasure coding under development'.
>>>
>>> I recommend you to watch the xorbas presentation video
>>> http://smahesh.com/HadoopUSC/ (a very clear explanation of xorbas) and
>>> use the Dimakis wiki page to check the large collection of paper they
>>> have: http://storagewiki.ece.utexas.edu/
>>>
>>> Best,
>>>
>>> koleosfuscus
>>>
>>> ________________________________________________________________
>>> "My reply is: the software has no known bugs, therefore it has not
>>> been updated."
>>> Wietse Venema
>>>
>>>
>>> On Sun, Jun 29, 2014 at 11:30 AM, Loic Dachary <loic@dachary.org> wrote:
>>>> Hi Andreas,
>>>>
>>>> In http://anrg.usc.edu/~maheswaran/Xorbas.pdf I get the idea of computing local coding chunks the way it is implemented in https://github.com/ceph/ceph/pull/1921 (i.e. delegating encoding / decoding to other plugins). However, there are theoretical aspects of the paper that I do not understand and I'm hoping you can shed some light on it. In particular, I don't know what "coefficients" are about. For instance in the context of Figure 2 caption : "The main theoretical challenge is to choose the coeffi cients c(i) to maximize the fault tolerance of the code."
>>>>
>>>> Would you recommend a paper to read to better understand this ? Also I'd like to understand what "coefficients" mean in the context of jerasure or if they do not apply.
>>>>
>>>> Thanks for you help :-)
>>>>
>>>> --
>>>> Loïc Dachary, Artisan Logiciel Libre
>>>>
>>
>> --
>> Loïc Dachary, Artisan Logiciel Libre
>>

-- 
Loïc Dachary, Artisan Logiciel Libre


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 263 bytes --]

^ permalink raw reply	[flat|nested] 9+ messages in thread

* RE: erasure code and coefficients
  2014-06-30  9:06       ` Loic Dachary
@ 2014-06-30  9:26         ` Andreas Joachim Peters
  2014-06-30 10:34           ` Loic Dachary
  0 siblings, 1 reply; 9+ messages in thread
From: Andreas Joachim Peters @ 2014-06-30  9:26 UTC (permalink / raw)
  To: Loic Dachary, Koleos Fuscus; +Cc: Ceph Development

Hi Loic, 

i think the best is to read along the sources. It is very readable!

https://github.com/madiator/HadoopUSC/blob/developUSC/src/contrib/raid/src/java/org/apache/hadoop/raid/SimpleRegeneratingCode.java

If there is a high interest in this, you  could port the code from Java to C++ and use the GF-complete or Intel GF-multiplication implementation for the used GF multiplications.

Cheers Andreas.


________________________________________
From: ceph-devel-owner@vger.kernel.org [ceph-devel-owner@vger.kernel.org] on behalf of Loic Dachary [loic@dachary.org]
Sent: 30 June 2014 11:06
To: Koleos Fuscus
Cc: Ceph Development
Subject: Re: erasure code and coefficients

Hi koleosfuscus,

It clarifies it enough to raise a question : where can I read code (or an algorithm if not code) that chose the coefficients desirable to implement what is suggested in the Xorbas paper ?

Cheers

On 30/06/2014 10:18, Koleos Fuscus wrote:
> Hi Loic,
>
> I am happy to contribute with some clarifications. In fact,
> erasure/reliability concepts are not blocking my progress with the
> reliability model at ceph. It is the ceph model itself that has some
> parts not clear to me, and nobody had time yet to review the state
> model diagram that I published on the wiki. :(
>  Anyway, regarding coefficients here is a bit of background.
> Coefficients are the numbers that multiply your variables inside an
> equation. In a toy example, to solve the equation ax^2+bx+c=0 you need
> to find the coefficients a,b,c that make the equation valid.
> In the context of Reed Solomon, the definition of coefficients is a
> bit more confusing. In the original design, the message x is
> interpreted as coefficients of a polynomial p. But in subsequents
> interpretations the message x is seen as the values of the polynomial
> p evaluated at the first k points a1..ak. Such interpretation is
> apparently a bit less efficient but desirable because you can
> construct a systematic code.
> In the context of xorbas, you are constructing a code on top of Reed
> Solomon. The codewords are seen as values, and the idea is to get
> coefficients c1..c10 that also satisfy s1+s2+s3=0 (take this as a
> missing introduction to my previous message)
>
> Cheers,
>
> koleosfuscus
>
> ________________________________________________________________
> "My reply is: the software has no known bugs, therefore it has not
> been updated."
> Wietse Venema
>
>
> On Sun, Jun 29, 2014 at 8:44 PM, Loic Dachary <loic@dachary.org> wrote:
>> Hi koleofuscus,
>>
>> Thanks for the explanation : it is very conforting to know that you understand this :-) At the risk of being thick, I must say that the very notion of "coefficient" eludes me. What are they ?
>>
>> Cheers
>>
>> On 29/06/2014 20:38, Koleos Fuscus wrote:
>>> Hello Loic,
>>> Dimakis (one of the authors of xorbas) is talking about coefficients
>>> because they want to find a way to reduce the storage overhead used
>>> with LRC. In the simple case used in Fig. 2, a RS (k=10, m=4) has
>>> 14/10 storage overhead but when using LRC, the overhead increases to
>>> 17/10 because you also need to store s1, s2 and s3. Basically, the
>>> idea is to find specific coefficients c1..c10 that permit to obtain s3
>>> through s1 and s2. In other words, get some s1 and s2 that when xored
>>> together give s3. If you find such coefficients, you don't need to
>>> store s3 and the storage overhead of LRC is 1.6x instead of 1.7x.
>>>
>>> Dimakis said that for the Reed Solomon implementation used in HDFS
>>> RAID they can simple set all coefficients with value '1' and use xor.
>>>
>>> This cannot be the case of the Reed Solomon implemented by you (I
>>> understood is the jerasure library by Plank) but that I am not sure. I
>>> guess we need the help of a mathematician or at least check and
>>> compare both implementations.
>>>
>>> Finally, apparently for xorbas they only implemented the configuration
>>> RS(10,4) and not other combinations. Unfortunately, the wiki page of
>>> the project is empty http://wiki.apache.org/hadoop/ErasureCode and the
>>> main page says 'erasure coding under development'.
>>>
>>> I recommend you to watch the xorbas presentation video
>>> http://smahesh.com/HadoopUSC/ (a very clear explanation of xorbas) and
>>> use the Dimakis wiki page to check the large collection of paper they
>>> have: http://storagewiki.ece.utexas.edu/
>>>
>>> Best,
>>>
>>> koleosfuscus
>>>
>>> ________________________________________________________________
>>> "My reply is: the software has no known bugs, therefore it has not
>>> been updated."
>>> Wietse Venema
>>>
>>>
>>> On Sun, Jun 29, 2014 at 11:30 AM, Loic Dachary <loic@dachary.org> wrote:
>>>> Hi Andreas,
>>>>
>>>> In http://anrg.usc.edu/~maheswaran/Xorbas.pdf I get the idea of computing local coding chunks the way it is implemented in https://github.com/ceph/ceph/pull/1921 (i.e. delegating encoding / decoding to other plugins). However, there are theoretical aspects of the paper that I do not understand and I'm hoping you can shed some light on it. In particular, I don't know what "coefficients" are about. For instance in the context of Figure 2 caption : "The main theoretical challenge is to choose the coeffi cients c(i) to maximize the fault tolerance of the code."
>>>>
>>>> Would you recommend a paper to read to better understand this ? Also I'd like to understand what "coefficients" mean in the context of jerasure or if they do not apply.
>>>>
>>>> Thanks for you help :-)
>>>>
>>>> --
>>>> Loïc Dachary, Artisan Logiciel Libre
>>>>
>>
>> --
>> Loïc Dachary, Artisan Logiciel Libre
>>

--
Loïc Dachary, Artisan Logiciel Libre

--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: erasure code and coefficients
  2014-06-30  9:26         ` Andreas Joachim Peters
@ 2014-06-30 10:34           ` Loic Dachary
       [not found]             ` <CAGhffvxTXLfziX9YLifvufYtuBnFPoooKNL5nGaHU0QyZWt6yg@mail.gmail.com>
  0 siblings, 1 reply; 9+ messages in thread
From: Loic Dachary @ 2014-06-30 10:34 UTC (permalink / raw)
  To: Andreas Joachim Peters; +Cc: Ceph Development

[-- Attachment #1: Type: text/plain, Size: 8768 bytes --]

Hi Andreas,

TL;DR: which part of the code chooses the coefficients to maximize the fault tolerance of the code as suggested in the Xorbas paper ?

If I understand correctly, the locality is computed (i.e. how many local parity chunks are created) with:

https://github.com/madiator/HadoopUSC/blob/developUSC/src/contrib/raid/src/java/org/apache/hadoop/raid/SimpleRegeneratingCode.java#L84

This is different from specifying it explicitly (https://github.com/dachary/ceph/commit/91b9cc2be5851f7668d593bb69b2f4f089ea585a#diff-5518964bc98a094a784ce2d17a5b0cc1R11)

Then a RS encoding matrix is calculated 

https://github.com/madiator/HadoopUSC/blob/developUSC/src/contrib/raid/src/java/org/apache/hadoop/raid/SimpleRegeneratingCode.java#L100

and will be used when encoding an object

https://github.com/madiator/HadoopUSC/blob/developUSC/src/contrib/raid/src/java/org/apache/hadoop/raid/SimpleRegeneratingCode.java#L143

then each local parity chunk is calculated using a simpler function (XOR)

https://github.com/madiator/HadoopUSC/blob/developUSC/src/contrib/raid/src/java/org/apache/hadoop/raid/SimpleRegeneratingCode.java#L154

When decoding, the equivalent of the Ceph minimum_to_decode method is used to figure out which chunks are needed to recover the lost chunks

https://github.com/madiator/HadoopUSC/blob/developUSC/src/contrib/raid/src/java/org/apache/hadoop/raid/SimpleRegeneratingCode.java#L319

During recovery, if only one chunk is missing, it is recovered with the simpler function (XOR)

https://github.com/madiator/HadoopUSC/blob/developUSC/src/contrib/raid/src/java/org/apache/hadoop/raid/SimpleRegeneratingCode.java#L207
https://github.com/madiator/HadoopUSC/blob/developUSC/src/contrib/raid/src/java/org/apache/hadoop/raid/SimpleRegeneratingCode.java#L224
(both of which are combined recursively)

and if there is is a "conflict" (i.e. if local parity is not enough to recover), then the RS decoding function is used

https://github.com/madiator/HadoopUSC/blob/developUSC/src/contrib/raid/src/java/org/apache/hadoop/raid/SimpleRegeneratingCode.java#L266

I don't understand at all how local parity is related to RS parity, in the sense of "choosing the coefficients c(i) to maximize the fault tolerance of the code" as mentioned in the Xorbas paper : I must be missing something :-) I do understand how the code works though, which is troubling. It also is reassuring because the logic is very similar to what is proposed in https://github.com/ceph/ceph/pull/1921

Cheers

On 30/06/2014 11:26, Andreas Joachim Peters wrote:
> Hi Loic, 
> 
> i think the best is to read along the sources. It is very readable!
> 
> https://github.com/madiator/HadoopUSC/blob/developUSC/src/contrib/raid/src/java/org/apache/hadoop/raid/SimpleRegeneratingCode.java
> 
> If there is a high interest in this, you  could port the code from Java to C++ and use the GF-complete or Intel GF-multiplication implementation for the used GF multiplications.
> 
> Cheers Andreas.
> 
> 
> ________________________________________
> From: ceph-devel-owner@vger.kernel.org [ceph-devel-owner@vger.kernel.org] on behalf of Loic Dachary [loic@dachary.org]
> Sent: 30 June 2014 11:06
> To: Koleos Fuscus
> Cc: Ceph Development
> Subject: Re: erasure code and coefficients
> 
> Hi koleosfuscus,
> 
> It clarifies it enough to raise a question : where can I read code (or an algorithm if not code) that chose the coefficients desirable to implement what is suggested in the Xorbas paper ?
> 
> Cheers
> 
> On 30/06/2014 10:18, Koleos Fuscus wrote:
>> Hi Loic,
>>
>> I am happy to contribute with some clarifications. In fact,
>> erasure/reliability concepts are not blocking my progress with the
>> reliability model at ceph. It is the ceph model itself that has some
>> parts not clear to me, and nobody had time yet to review the state
>> model diagram that I published on the wiki. :(
>>  Anyway, regarding coefficients here is a bit of background.
>> Coefficients are the numbers that multiply your variables inside an
>> equation. In a toy example, to solve the equation ax^2+bx+c=0 you need
>> to find the coefficients a,b,c that make the equation valid.
>> In the context of Reed Solomon, the definition of coefficients is a
>> bit more confusing. In the original design, the message x is
>> interpreted as coefficients of a polynomial p. But in subsequents
>> interpretations the message x is seen as the values of the polynomial
>> p evaluated at the first k points a1..ak. Such interpretation is
>> apparently a bit less efficient but desirable because you can
>> construct a systematic code.
>> In the context of xorbas, you are constructing a code on top of Reed
>> Solomon. The codewords are seen as values, and the idea is to get
>> coefficients c1..c10 that also satisfy s1+s2+s3=0 (take this as a
>> missing introduction to my previous message)
>>
>> Cheers,
>>
>> koleosfuscus
>>
>> ________________________________________________________________
>> "My reply is: the software has no known bugs, therefore it has not
>> been updated."
>> Wietse Venema
>>
>>
>> On Sun, Jun 29, 2014 at 8:44 PM, Loic Dachary <loic@dachary.org> wrote:
>>> Hi koleofuscus,
>>>
>>> Thanks for the explanation : it is very conforting to know that you understand this :-) At the risk of being thick, I must say that the very notion of "coefficient" eludes me. What are they ?
>>>
>>> Cheers
>>>
>>> On 29/06/2014 20:38, Koleos Fuscus wrote:
>>>> Hello Loic,
>>>> Dimakis (one of the authors of xorbas) is talking about coefficients
>>>> because they want to find a way to reduce the storage overhead used
>>>> with LRC. In the simple case used in Fig. 2, a RS (k=10, m=4) has
>>>> 14/10 storage overhead but when using LRC, the overhead increases to
>>>> 17/10 because you also need to store s1, s2 and s3. Basically, the
>>>> idea is to find specific coefficients c1..c10 that permit to obtain s3
>>>> through s1 and s2. In other words, get some s1 and s2 that when xored
>>>> together give s3. If you find such coefficients, you don't need to
>>>> store s3 and the storage overhead of LRC is 1.6x instead of 1.7x.
>>>>
>>>> Dimakis said that for the Reed Solomon implementation used in HDFS
>>>> RAID they can simple set all coefficients with value '1' and use xor.
>>>>
>>>> This cannot be the case of the Reed Solomon implemented by you (I
>>>> understood is the jerasure library by Plank) but that I am not sure. I
>>>> guess we need the help of a mathematician or at least check and
>>>> compare both implementations.
>>>>
>>>> Finally, apparently for xorbas they only implemented the configuration
>>>> RS(10,4) and not other combinations. Unfortunately, the wiki page of
>>>> the project is empty http://wiki.apache.org/hadoop/ErasureCode and the
>>>> main page says 'erasure coding under development'.
>>>>
>>>> I recommend you to watch the xorbas presentation video
>>>> http://smahesh.com/HadoopUSC/ (a very clear explanation of xorbas) and
>>>> use the Dimakis wiki page to check the large collection of paper they
>>>> have: http://storagewiki.ece.utexas.edu/
>>>>
>>>> Best,
>>>>
>>>> koleosfuscus
>>>>
>>>> ________________________________________________________________
>>>> "My reply is: the software has no known bugs, therefore it has not
>>>> been updated."
>>>> Wietse Venema
>>>>
>>>>
>>>> On Sun, Jun 29, 2014 at 11:30 AM, Loic Dachary <loic@dachary.org> wrote:
>>>>> Hi Andreas,
>>>>>
>>>>> In http://anrg.usc.edu/~maheswaran/Xorbas.pdf I get the idea of computing local coding chunks the way it is implemented in https://github.com/ceph/ceph/pull/1921 (i.e. delegating encoding / decoding to other plugins). However, there are theoretical aspects of the paper that I do not understand and I'm hoping you can shed some light on it. In particular, I don't know what "coefficients" are about. For instance in the context of Figure 2 caption : "The main theoretical challenge is to choose the coeffi cients c(i) to maximize the fault tolerance of the code."
>>>>>
>>>>> Would you recommend a paper to read to better understand this ? Also I'd like to understand what "coefficients" mean in the context of jerasure or if they do not apply.
>>>>>
>>>>> Thanks for you help :-)
>>>>>
>>>>> --
>>>>> Loïc Dachary, Artisan Logiciel Libre
>>>>>
>>>
>>> --
>>> Loïc Dachary, Artisan Logiciel Libre
>>>
> 
> --
> Loïc Dachary, Artisan Logiciel Libre
> 
> --
> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 

-- 
Loïc Dachary, Artisan Logiciel Libre


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 263 bytes --]

^ permalink raw reply	[flat|nested] 9+ messages in thread

* FW: erasure code and coefficients
       [not found]               ` <53B158FC.8050303@dachary.org>
@ 2014-06-30 12:59                 ` Andreas Joachim Peters
  2014-06-30 13:32                   ` Loic Dachary
  0 siblings, 1 reply; 9+ messages in thread
From: Andreas Joachim Peters @ 2014-06-30 12:59 UTC (permalink / raw)
  To: ceph-devel, Loic Dachary

Hi Loic, 

I was reading through the code and I have got the impression that they have formulated something simple with a lot of mathematical formulas in this paper. Indeed all ci = 1 ... which makes it simple ...

The code seems to work  like this  e..g 10 + 2 + 4 :

Data = (1,2,3,4,5,6,7,8,9,10)

(RS1,RS2,RS3,RS4) = RS(1,2,3,4,5,5,6,7,8,9,10)

P1 = 1 ^ 2 ^ 3 ^ 4 ^ 5
P2 = 6 ^ 7 ^ 8 ^ 9 ^ 10
P3 = P1 ^ P2 = RS1 ^ RS2 ^ RS3 ^ RS4 ( no need to store that, since we store P1 and P2)

You can reconstruct a missing RS chunk with local parity like: RS1 = P1 ^ P2 ^ RS2 ^ RS3 ^ RS4

At least this is what I got out of the code ...  look at getSRCGroupNeighbors

What I don't know is, if P1 ^ P2 ^ P3 = 0 work with every encoding matrix or only for that particular one used.

Cheers Andreas.



>
> On Mon, Jun 30, 2014 at 12:34 PM, Loic Dachary <loic@dachary.org <mailto:loic@dachary.org>> wrote:
>
>     Hi Andreas,
>
>     TL;DR: which part of the code chooses the coefficients to maximize the fault tolerance of the code as suggested in the Xorbas paper ?
>
>     If I understand correctly, the locality is computed (i.e. how many local parity chunks are created) with:
>
>     https://github.com/madiator/HadoopUSC/blob/developUSC/src/contrib/raid/src/java/org/apache/hadoop/raid/SimpleRegeneratingCode.java#L84
>
>     This is different from specifying it explicitly (https://github.com/dachary/ceph/commit/91b9cc2be5851f7668d593bb69b2f4f089ea585a#diff-5518964bc98a094a784ce2d17a5b0cc1R11)
>
>     Then a RS encoding matrix is calculated
>
>     https://github.com/madiator/HadoopUSC/blob/developUSC/src/contrib/raid/src/java/org/apache/hadoop/raid/SimpleRegeneratingCode.java#L100
>
>     and will be used when encoding an object
>
>     https://github.com/madiator/HadoopUSC/blob/developUSC/src/contrib/raid/src/java/org/apache/hadoop/raid/SimpleRegeneratingCode.java#L143
>
>     then each local parity chunk is calculated using a simpler function (XOR)
>
>     https://github.com/madiator/HadoopUSC/blob/developUSC/src/contrib/raid/src/java/org/apache/hadoop/raid/SimpleRegeneratingCode.java#L154
>
>     When decoding, the equivalent of the Ceph minimum_to_decode method is used to figure out which chunks are needed to recover the lost chunks
>
>     https://github.com/madiator/HadoopUSC/blob/developUSC/src/contrib/raid/src/java/org/apache/hadoop/raid/SimpleRegeneratingCode.java#L319
>
>     During recovery, if only one chunk is missing, it is recovered with the simpler function (XOR)
>
>     https://github.com/madiator/HadoopUSC/blob/developUSC/src/contrib/raid/src/java/org/apache/hadoop/raid/SimpleRegeneratingCode.java#L207
>     https://github.com/madiator/HadoopUSC/blob/developUSC/src/contrib/raid/src/java/org/apache/hadoop/raid/SimpleRegeneratingCode.java#L224
>     (both of which are combined recursively)
>
>     and if there is is a "conflict" (i.e. if local parity is not enough to recover), then the RS decoding function is used
>
>     https://github.com/madiator/HadoopUSC/blob/developUSC/src/contrib/raid/src/java/org/apache/hadoop/raid/SimpleRegeneratingCode.java#L266
>
>     I don't understand at all how local parity is related to RS parity, in the sense of "choosing the coefficients c(i) to maximize the fault tolerance of the code" as mentioned in the Xorbas paper : I must be missing something :-) I do understand how the code works though, which is troubling. It also is reassuring because the logic is very similar to what is proposed in https://github.com/ceph/ceph/pull/1921
>
>     Cheers
>
>     On 30/06/2014 11:26, Andreas Joachim Peters wrote:
>     > Hi Loic,
>     >
>     > i think the best is to read along the sources. It is very readable!
>     >
>     > https://github.com/madiator/HadoopUSC/blob/developUSC/src/contrib/raid/src/java/org/apache/hadoop/raid/SimpleRegeneratingCode.java
>     >
>     > If there is a high interest in this, you  could port the code from Java to C++ and use the GF-complete or Intel GF-multiplication implementation for the used GF multiplications.
>     >
>     > Cheers Andreas.
>     >
>     >
>     > ________________________________________
>     > From: ceph-devel-owner@vger.kernel.org <mailto:ceph-devel-owner@vger.kernel.org> [ceph-devel-owner@vger.kernel.org <mailto:ceph-devel-owner@vger.kernel.org>] on behalf of Loic Dachary [loic@dachary.org <mailto:loic@dachary.org>]
>     > Sent: 30 June 2014 11:06
>     > To: Koleos Fuscus
>     > Cc: Ceph Development
>     > Subject: Re: erasure code and coefficients
>     >
>     > Hi koleosfuscus,
>     >
>     > It clarifies it enough to raise a question : where can I read code (or an algorithm if not code) that chose the coefficients desirable to implement what is suggested in the Xorbas paper ?
>     >
>     > Cheers
>     >
>     > On 30/06/2014 10:18, Koleos Fuscus wrote:
>     >> Hi Loic,
>     >>
>     >> I am happy to contribute with some clarifications. In fact,
>     >> erasure/reliability concepts are not blocking my progress with the
>     >> reliability model at ceph. It is the ceph model itself that has some
>     >> parts not clear to me, and nobody had time yet to review the state
>     >> model diagram that I published on the wiki. :(
>     >>  Anyway, regarding coefficients here is a bit of background.
>     >> Coefficients are the numbers that multiply your variables inside an
>     >> equation. In a toy example, to solve the equation ax^2+bx+c=0 you need
>     >> to find the coefficients a,b,c that make the equation valid.
>     >> In the context of Reed Solomon, the definition of coefficients is a
>     >> bit more confusing. In the original design, the message x is
>     >> interpreted as coefficients of a polynomial p. But in subsequents
>     >> interpretations the message x is seen as the values of the polynomial
>     >> p evaluated at the first k points a1..ak. Such interpretation is
>     >> apparently a bit less efficient but desirable because you can
>     >> construct a systematic code.
>     >> In the context of xorbas, you are constructing a code on top of Reed
>     >> Solomon. The codewords are seen as values, and the idea is to get
>     >> coefficients c1..c10 that also satisfy s1+s2+s3=0 (take this as a
>     >> missing introduction to my previous message)
>     >>
>     >> Cheers,
>     >>
>     >> koleosfuscus
>     >>
>     >> ________________________________________________________________
>     >> "My reply is: the software has no known bugs, therefore it has not
>     >> been updated."
>     >> Wietse Venema
>     >>
>     >>
>     >> On Sun, Jun 29, 2014 at 8:44 PM, Loic Dachary <loic@dachary.org <mailto:loic@dachary.org>> wrote:
>     >>> Hi koleofuscus,
>     >>>
>     >>> Thanks for the explanation : it is very conforting to know that you understand this :-) At the risk of being thick, I must say that the very notion of "coefficient" eludes me. What are they ?
>     >>>
>     >>> Cheers
>     >>>
>     >>> On 29/06/2014 20:38, Koleos Fuscus wrote:
>     >>>> Hello Loic,
>     >>>> Dimakis (one of the authors of xorbas) is talking about coefficients
>     >>>> because they want to find a way to reduce the storage overhead used
>     >>>> with LRC. In the simple case used in Fig. 2, a RS (k=10, m=4) has
>     >>>> 14/10 storage overhead but when using LRC, the overhead increases to
>     >>>> 17/10 because you also need to store s1, s2 and s3. Basically, the
>     >>>> idea is to find specific coefficients c1..c10 that permit to obtain s3
>     >>>> through s1 and s2. In other words, get some s1 and s2 that when xored
>     >>>> together give s3. If you find such coefficients, you don't need to
>     >>>> store s3 and the storage overhead of LRC is 1.6x instead of 1.7x.
>     >>>>
>     >>>> Dimakis said that for the Reed Solomon implementation used in HDFS
>     >>>> RAID they can simple set all coefficients with value '1' and use xor.
>     >>>>
>     >>>> This cannot be the case of the Reed Solomon implemented by you (I
>     >>>> understood is the jerasure library by Plank) but that I am not sure. I
>     >>>> guess we need the help of a mathematician or at least check and
>     >>>> compare both implementations.
>     >>>>
>     >>>> Finally, apparently for xorbas they only implemented the configuration
>     >>>> RS(10,4) and not other combinations. Unfortunately, the wiki page of
>     >>>> the project is empty http://wiki.apache.org/hadoop/ErasureCode and the
>     >>>> main page says 'erasure coding under development'.
>     >>>>
>     >>>> I recommend you to watch the xorbas presentation video
>     >>>> http://smahesh.com/HadoopUSC/ (a very clear explanation of xorbas) and
>     >>>> use the Dimakis wiki page to check the large collection of paper they
>     >>>> have: http://storagewiki.ece.utexas.edu/
>     >>>>
>     >>>> Best,
>     >>>>
>     >>>> koleosfuscus
>     >>>>
>     >>>> ________________________________________________________________
>     >>>> "My reply is: the software has no known bugs, therefore it has not
>     >>>> been updated."
>     >>>> Wietse Venema
>     >>>>
>     >>>>
>     >>>> On Sun, Jun 29, 2014 at 11:30 AM, Loic Dachary <loic@dachary.org <mailto:loic@dachary.org>> wrote:
>     >>>>> Hi Andreas,
>     >>>>>
>     >>>>> In http://anrg.usc.edu/~maheswaran/Xorbas.pdf I get the idea of computing local coding chunks the way it is implemented in https://github.com/ceph/ceph/pull/1921 (i.e. delegating encoding / decoding to other plugins). However, there are theoretical aspects of the paper that I do not understand and I'm hoping you can shed some light on it. In particular, I don't know what "coefficients" are about. For instance in the context of Figure 2 caption : "The main theoretical challenge is to choose the coeffi cients c(i) to maximize the fault tolerance of the code."
>     >>>>>
>     >>>>> Would you recommend a paper to read to better understand this ? Also I'd like to understand what "coefficients" mean in the context of jerasure or if they do not apply.
>     >>>>>
>     >>>>> Thanks for you help :-)
>     >>>>>
>     >>>>> --
>     >>>>> Loïc Dachary, Artisan Logiciel Libre
>     >>>>>
>     >>>
>     >>> --
>     >>> Loïc Dachary, Artisan Logiciel Libre
>     >>>
>     >
>     > --
>     > Loïc Dachary, Artisan Logiciel Libre
>     >
>     > --
>     > To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
>     > the body of a message to majordomo@vger.kernel.org <mailto:majordomo@vger.kernel.org>
>     > More majordomo info at  http://vger.kernel.org/majordomo-info.html
>     >
>
>     --
>     Loïc Dachary, Artisan Logiciel Libre
>
>

--
Loïc Dachary, Artisan Logiciel Libre

--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: FW: erasure code and coefficients
  2014-06-30 12:59                 ` FW: " Andreas Joachim Peters
@ 2014-06-30 13:32                   ` Loic Dachary
  0 siblings, 0 replies; 9+ messages in thread
From: Loic Dachary @ 2014-06-30 13:32 UTC (permalink / raw)
  To: Andreas Joachim Peters, ceph-devel

[-- Attachment #1: Type: text/plain, Size: 11840 bytes --]

Hi,

On 30/06/2014 14:59, Andreas Joachim Peters wrote:
> Hi Loic, 
> 
> I was reading through the code and I have got the impression that they have formulated something simple with a lot of mathematical formulas in this paper. Indeed all ci = 1 ... which makes it simple ...

It makes things simpler to understand from my perspective, cool :-)

> The code seems to work  like this  e..g 10 + 2 + 4 :
> 
> Data = (1,2,3,4,5,6,7,8,9,10)
> 
> (RS1,RS2,RS3,RS4) = RS(1,2,3,4,5,5,6,7,8,9,10)
> 
> P1 = 1 ^ 2 ^ 3 ^ 4 ^ 5
> P2 = 6 ^ 7 ^ 8 ^ 9 ^ 10
> P3 = P1 ^ P2 = RS1 ^ RS2 ^ RS3 ^ RS4 ( no need to store that, since we store P1 and P2)
> 

Using my own words, if local parity P1 and P2 is calculated using only the data chunks, it follows that P1 ^ P2 can also be used as an implied parity chunk for the RS parity chunks. But would that be true if P1 and P2 was calculated using a random subset of the data + coding chunks ? I.e if P1 = RS1 ^ 2 ^ 3 ^ 4 ^ 5 , can P1 ^ P2 = P3 be used to recover any of 1, RS2, RS3, RS4 ? In other words, does it make a difference that P3 is an implied parity for the parity chunks or can it be an implied parity for any data/coding chunks involved in RS ?

Cheers

> You can reconstruct a missing RS chunk with local parity like: RS1 = P1 ^ P2 ^ RS2 ^ RS3 ^ RS4
> 
> At least this is what I got out of the code ...  look at getSRCGroupNeighbors
> 
> What I don't know is, if P1 ^ P2 ^ P3 = 0 work with every encoding matrix or only for that particular one used.
> 
> Cheers Andreas.
> 
> 
> 
>>
>> On Mon, Jun 30, 2014 at 12:34 PM, Loic Dachary <loic@dachary.org <mailto:loic@dachary.org>> wrote:
>>
>>     Hi Andreas,
>>
>>     TL;DR: which part of the code chooses the coefficients to maximize the fault tolerance of the code as suggested in the Xorbas paper ?
>>
>>     If I understand correctly, the locality is computed (i.e. how many local parity chunks are created) with:
>>
>>     https://github.com/madiator/HadoopUSC/blob/developUSC/src/contrib/raid/src/java/org/apache/hadoop/raid/SimpleRegeneratingCode.java#L84
>>
>>     This is different from specifying it explicitly (https://github.com/dachary/ceph/commit/91b9cc2be5851f7668d593bb69b2f4f089ea585a#diff-5518964bc98a094a784ce2d17a5b0cc1R11)
>>
>>     Then a RS encoding matrix is calculated
>>
>>     https://github.com/madiator/HadoopUSC/blob/developUSC/src/contrib/raid/src/java/org/apache/hadoop/raid/SimpleRegeneratingCode.java#L100
>>
>>     and will be used when encoding an object
>>
>>     https://github.com/madiator/HadoopUSC/blob/developUSC/src/contrib/raid/src/java/org/apache/hadoop/raid/SimpleRegeneratingCode.java#L143
>>
>>     then each local parity chunk is calculated using a simpler function (XOR)
>>
>>     https://github.com/madiator/HadoopUSC/blob/developUSC/src/contrib/raid/src/java/org/apache/hadoop/raid/SimpleRegeneratingCode.java#L154
>>
>>     When decoding, the equivalent of the Ceph minimum_to_decode method is used to figure out which chunks are needed to recover the lost chunks
>>
>>     https://github.com/madiator/HadoopUSC/blob/developUSC/src/contrib/raid/src/java/org/apache/hadoop/raid/SimpleRegeneratingCode.java#L319
>>
>>     During recovery, if only one chunk is missing, it is recovered with the simpler function (XOR)
>>
>>     https://github.com/madiator/HadoopUSC/blob/developUSC/src/contrib/raid/src/java/org/apache/hadoop/raid/SimpleRegeneratingCode.java#L207
>>     https://github.com/madiator/HadoopUSC/blob/developUSC/src/contrib/raid/src/java/org/apache/hadoop/raid/SimpleRegeneratingCode.java#L224
>>     (both of which are combined recursively)
>>
>>     and if there is is a "conflict" (i.e. if local parity is not enough to recover), then the RS decoding function is used
>>
>>     https://github.com/madiator/HadoopUSC/blob/developUSC/src/contrib/raid/src/java/org/apache/hadoop/raid/SimpleRegeneratingCode.java#L266
>>
>>     I don't understand at all how local parity is related to RS parity, in the sense of "choosing the coefficients c(i) to maximize the fault tolerance of the code" as mentioned in the Xorbas paper : I must be missing something :-) I do understand how the code works though, which is troubling. It also is reassuring because the logic is very similar to what is proposed in https://github.com/ceph/ceph/pull/1921
>>
>>     Cheers
>>
>>     On 30/06/2014 11:26, Andreas Joachim Peters wrote:
>>     > Hi Loic,
>>     >
>>     > i think the best is to read along the sources. It is very readable!
>>     >
>>     > https://github.com/madiator/HadoopUSC/blob/developUSC/src/contrib/raid/src/java/org/apache/hadoop/raid/SimpleRegeneratingCode.java
>>     >
>>     > If there is a high interest in this, you  could port the code from Java to C++ and use the GF-complete or Intel GF-multiplication implementation for the used GF multiplications.
>>     >
>>     > Cheers Andreas.
>>     >
>>     >
>>     > ________________________________________
>>     > From: ceph-devel-owner@vger.kernel.org <mailto:ceph-devel-owner@vger.kernel.org> [ceph-devel-owner@vger.kernel.org <mailto:ceph-devel-owner@vger.kernel.org>] on behalf of Loic Dachary [loic@dachary.org <mailto:loic@dachary.org>]
>>     > Sent: 30 June 2014 11:06
>>     > To: Koleos Fuscus
>>     > Cc: Ceph Development
>>     > Subject: Re: erasure code and coefficients
>>     >
>>     > Hi koleosfuscus,
>>     >
>>     > It clarifies it enough to raise a question : where can I read code (or an algorithm if not code) that chose the coefficients desirable to implement what is suggested in the Xorbas paper ?
>>     >
>>     > Cheers
>>     >
>>     > On 30/06/2014 10:18, Koleos Fuscus wrote:
>>     >> Hi Loic,
>>     >>
>>     >> I am happy to contribute with some clarifications. In fact,
>>     >> erasure/reliability concepts are not blocking my progress with the
>>     >> reliability model at ceph. It is the ceph model itself that has some
>>     >> parts not clear to me, and nobody had time yet to review the state
>>     >> model diagram that I published on the wiki. :(
>>     >>  Anyway, regarding coefficients here is a bit of background.
>>     >> Coefficients are the numbers that multiply your variables inside an
>>     >> equation. In a toy example, to solve the equation ax^2+bx+c=0 you need
>>     >> to find the coefficients a,b,c that make the equation valid.
>>     >> In the context of Reed Solomon, the definition of coefficients is a
>>     >> bit more confusing. In the original design, the message x is
>>     >> interpreted as coefficients of a polynomial p. But in subsequents
>>     >> interpretations the message x is seen as the values of the polynomial
>>     >> p evaluated at the first k points a1..ak. Such interpretation is
>>     >> apparently a bit less efficient but desirable because you can
>>     >> construct a systematic code.
>>     >> In the context of xorbas, you are constructing a code on top of Reed
>>     >> Solomon. The codewords are seen as values, and the idea is to get
>>     >> coefficients c1..c10 that also satisfy s1+s2+s3=0 (take this as a
>>     >> missing introduction to my previous message)
>>     >>
>>     >> Cheers,
>>     >>
>>     >> koleosfuscus
>>     >>
>>     >> ________________________________________________________________
>>     >> "My reply is: the software has no known bugs, therefore it has not
>>     >> been updated."
>>     >> Wietse Venema
>>     >>
>>     >>
>>     >> On Sun, Jun 29, 2014 at 8:44 PM, Loic Dachary <loic@dachary.org <mailto:loic@dachary.org>> wrote:
>>     >>> Hi koleofuscus,
>>     >>>
>>     >>> Thanks for the explanation : it is very conforting to know that you understand this :-) At the risk of being thick, I must say that the very notion of "coefficient" eludes me. What are they ?
>>     >>>
>>     >>> Cheers
>>     >>>
>>     >>> On 29/06/2014 20:38, Koleos Fuscus wrote:
>>     >>>> Hello Loic,
>>     >>>> Dimakis (one of the authors of xorbas) is talking about coefficients
>>     >>>> because they want to find a way to reduce the storage overhead used
>>     >>>> with LRC. In the simple case used in Fig. 2, a RS (k=10, m=4) has
>>     >>>> 14/10 storage overhead but when using LRC, the overhead increases to
>>     >>>> 17/10 because you also need to store s1, s2 and s3. Basically, the
>>     >>>> idea is to find specific coefficients c1..c10 that permit to obtain s3
>>     >>>> through s1 and s2. In other words, get some s1 and s2 that when xored
>>     >>>> together give s3. If you find such coefficients, you don't need to
>>     >>>> store s3 and the storage overhead of LRC is 1.6x instead of 1.7x.
>>     >>>>
>>     >>>> Dimakis said that for the Reed Solomon implementation used in HDFS
>>     >>>> RAID they can simple set all coefficients with value '1' and use xor.
>>     >>>>
>>     >>>> This cannot be the case of the Reed Solomon implemented by you (I
>>     >>>> understood is the jerasure library by Plank) but that I am not sure. I
>>     >>>> guess we need the help of a mathematician or at least check and
>>     >>>> compare both implementations.
>>     >>>>
>>     >>>> Finally, apparently for xorbas they only implemented the configuration
>>     >>>> RS(10,4) and not other combinations. Unfortunately, the wiki page of
>>     >>>> the project is empty http://wiki.apache.org/hadoop/ErasureCode and the
>>     >>>> main page says 'erasure coding under development'.
>>     >>>>
>>     >>>> I recommend you to watch the xorbas presentation video
>>     >>>> http://smahesh.com/HadoopUSC/ (a very clear explanation of xorbas) and
>>     >>>> use the Dimakis wiki page to check the large collection of paper they
>>     >>>> have: http://storagewiki.ece.utexas.edu/
>>     >>>>
>>     >>>> Best,
>>     >>>>
>>     >>>> koleosfuscus
>>     >>>>
>>     >>>> ________________________________________________________________
>>     >>>> "My reply is: the software has no known bugs, therefore it has not
>>     >>>> been updated."
>>     >>>> Wietse Venema
>>     >>>>
>>     >>>>
>>     >>>> On Sun, Jun 29, 2014 at 11:30 AM, Loic Dachary <loic@dachary.org <mailto:loic@dachary.org>> wrote:
>>     >>>>> Hi Andreas,
>>     >>>>>
>>     >>>>> In http://anrg.usc.edu/~maheswaran/Xorbas.pdf I get the idea of computing local coding chunks the way it is implemented in https://github.com/ceph/ceph/pull/1921 (i.e. delegating encoding / decoding to other plugins). However, there are theoretical aspects of the paper that I do not understand and I'm hoping you can shed some light on it. In particular, I don't know what "coefficients" are about. For instance in the context of Figure 2 caption : "The main theoretical challenge is to choose the coeffi cients c(i) to maximize the fault tolerance of the code."
>>     >>>>>
>>     >>>>> Would you recommend a paper to read to better understand this ? Also I'd like to understand what "coefficients" mean in the context of jerasure or if they do not apply.
>>     >>>>>
>>     >>>>> Thanks for you help :-)
>>     >>>>>
>>     >>>>> --
>>     >>>>> Loïc Dachary, Artisan Logiciel Libre
>>     >>>>>
>>     >>>
>>     >>> --
>>     >>> Loïc Dachary, Artisan Logiciel Libre
>>     >>>
>>     >
>>     > --
>>     > Loïc Dachary, Artisan Logiciel Libre
>>     >
>>     > --
>>     > To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
>>     > the body of a message to majordomo@vger.kernel.org <mailto:majordomo@vger.kernel.org>
>>     > More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>     >
>>
>>     --
>>     Loïc Dachary, Artisan Logiciel Libre
>>
>>
> 
> --
> Loïc Dachary, Artisan Logiciel Libre
> 

-- 
Loïc Dachary, Artisan Logiciel Libre


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 263 bytes --]

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2014-06-30 13:32 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-06-29  9:30 erasure code and coefficients Loic Dachary
2014-06-29 18:38 ` Koleos Fuscus
2014-06-29 18:44   ` Loic Dachary
2014-06-30  8:18     ` Koleos Fuscus
2014-06-30  9:06       ` Loic Dachary
2014-06-30  9:26         ` Andreas Joachim Peters
2014-06-30 10:34           ` Loic Dachary
     [not found]             ` <CAGhffvxTXLfziX9YLifvufYtuBnFPoooKNL5nGaHU0QyZWt6yg@mail.gmail.com>
     [not found]               ` <53B158FC.8050303@dachary.org>
2014-06-30 12:59                 ` FW: " Andreas Joachim Peters
2014-06-30 13:32                   ` Loic Dachary

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.