* Markov models for Ceph
@ 2014-07-07 15:19 Koleos Fuscus
  2014-07-07 17:16 ` Loic Dachary
  0 siblings, 1 reply; 2+ messages in thread
From: Koleos Fuscus @ 2014-07-07 15:19 UTC (permalink / raw)
  To: Loic Dachary; +Cc: Kyle Bader, ceph-devel, Sage Weil

Hello Loic,

You asked previously:
In other words, is there a place where one could set things like "disk
fail % of the time" and "network is X Gb/s" and "repairing a disk
failure requires reading B bytes from M disks"? As far
as I understand, such factors cannot be expressed with a single
formula and this is why a Markov model is useful.

I think we need to run simulations to get a more precise estimate
of the reliability of an erasure coded system. Markov models are not
as flexible as you may think. Besides, solving the equations becomes
nontrivial when the number of components that may fail is large.
Maybe standard simulation is enough. As observed by Greenan
in his thesis, standard simulations have problems with rare events,
which may not be observed during simulation time. I don't know if we
should care about rare events when comparing methods.

Greenan released the software used for his thesis. It is written
entirely in Python.
http://www.kaymgee.com/Kevin_Greenan/Software.html

I found Greenan's tool while trying to validate the results of ceph-tool,
and the numbers are completely different:

For instance:

Parameters for ceph tool:
Disk type consumer, FIT1=2167, FIT2=2167
Size: 2000GiB
RAID-6
Replace 0h
Rebuild 6000MiB/s
Volumes:8
NRE model: ignore
Period: 10 years

(I used these numbers to compare with the model 2DFT.disk.model of Greenan's tool)

Parameters for Greenan's HFRS tool:
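
To compare the two parameter sets on a common footing, it may help to convert the ceph-tool inputs into per-hour rates like the lam_d and mu entries of an HFRS model file. The sketch below assumes the usual definition of FIT (failures per 10^9 device-hours) and that rebuild time is simply size divided by rebuild rate. The helper names are made up for illustration and belong to neither tool.

```python
# Sketch: convert ceph-tool style inputs to per-hour rates.
# Assumption: FIT = failures per 1e9 device-hours.
# The helper names below are hypothetical, not part of ceph-tool or HFRS.

DEVICE_HOURS_PER_FIT = 1e9


def fit_to_mttf_hours(fit):
    """Mean time to failure, in hours, for a given FIT rate."""
    return DEVICE_HOURS_PER_FIT / fit


def rebuild_hours(size_gib, rate_mib_per_s):
    """Time to rebuild one volume, assuming a constant rebuild rate."""
    seconds = size_gib * 1024 / rate_mib_per_s
    return seconds / 3600


mttf = fit_to_mttf_hours(2167)       # FIT1 = FIT2 = 2167
lam_d = 1 / mttf                     # per-disk failure rate, per hour
repair = rebuild_hours(2000, 6000)   # 2000GiB at 6000MiB/s
mu = 1 / repair                      # repair rate, per hour

print(f"MTTF = {mttf:.0f} h, lam_d = {lam_d:.3e} per hour")
print(f"rebuild = {repair:.4f} h, mu = {mu:.3f} per hour")
```

The MTTF comes out at roughly 461,000 hours and the rebuild time at under a tenth of an hour. Both can then be checked against the values in whichever HFRS model file is used for the comparison.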
python mm_solve.py -m 2DFT.disk.model -M

Results

CEPH:

    storage               durability    PL(site)  PL(copies)     PL(NRE)     PL(rep)    loss/PiB
    ----------            ----------  ----------  ----------  ----------  ----------  ----------
    RAID-6: 6+2             11-nines   0.000e+00   1.318e-12   0.000e+00   0.000e+00   9.887e+02


HFRS:

Analytic MTTDL:  4.06111903031e+12
*********************
Analytic prob. of failure: 2.15660e-08
*********************
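
The two HFRS figures can be related to each other with a short calculation. The sketch below assumes that the MTTDL is expressed in hours, that time to data loss is approximately exponential, and that the horizon is the same 10-year period given to the ceph tool (taken here as 87,600 hours). None of these assumptions is confirmed by the tool output itself.

```python
import math

mttdl_hours = 4.06111903031e12   # "Analytic MTTDL" reported above (hours assumed)
mission_hours = 10 * 365 * 24    # assumption: 10 years at 8760 hours per year


def prob_loss(mttdl, t):
    """P(data loss within t), assuming an exponential time to data loss."""
    # -expm1(-x) is 1 - exp(-x), computed accurately for very small x
    return -math.expm1(-t / mttdl)


p = prob_loss(mttdl_hours, mission_hours)
print(f"P(loss in 10 years) ~ {p:.3e}")
```

Under these assumptions the result is about 2.157e-08, close to the reported analytic probability of failure, and roughly four orders of magnitude above the PL(copies) value from the ceph tool.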

Could you check whether the parameters for ceph are correct and equivalent
to the HFRS model? Do you think it makes sense to include Greenan's tool?
Greenan has a number of models, including non-MDS codes. I am not sure
yet how we can describe the LRC code in this platform, but it might be
possible.

koleosfuscus

________________________________________________________________
"My reply is: the software has no known bugs, therefore it has not
been updated."
Wietse Venema


* Re: Markov models for Ceph
  2014-07-07 15:19 Markov models for Ceph Koleos Fuscus
@ 2014-07-07 17:16 ` Loic Dachary
  0 siblings, 0 replies; 2+ messages in thread
From: Loic Dachary @ 2014-07-07 17:16 UTC (permalink / raw)
  To: Koleos Fuscus; +Cc: ceph-devel

Hi koleosfuscus,

From http://www.kaymgee.com/Kevin_Greenan/Software_files/hfrs.tar downloaded from http://www.kaymgee.com/Kevin_Greenan/Software.html

In hfrs/models/weaver_8_8_3.disk.ber.model

[num states]
4
0 1 a failure
1 0 b repair
1 2 c failure
2 1 d repair
2 3 e failure
[assign]
a=N*lam_d
b=mu
c=(N-1)*lam_d
d=2*mu
e=(N-2)*lam_d
N=8
lam_d=(1/461386.)
mu=(1/12.)
[END]

is semi-human-parsable, but hfrs/models/weaver_8_8_3.disk.ber.model is less so:

[num states]
5
0 1 a failure
0 4 b failure
1 2 c failure
1 4 d failure
1 0 e repair
2 3 f failure
2 4 g failure
2 1 h repair
3 4 i failure
3 2 j repair
[assign]
a=(N-0)*lam_d*(1-0.000000)*(1-(0.000000*(1-(1-p)**(N-1))))
b=(N-0)*lam_d*(0.000000)+(N-0)*lam_d*(1-0.000000)*((0.000000*(1-(1-p)**(N-1))))
c=(N-1)*lam_d*(1-0.000000)*(1-(0.000000*(1-(1-p)**(N-2))))
d=(N-1)*lam_d*(0.000000)+(N-1)*lam_d*(1-0.000000)*((0.000000*(1-(1-p)**(N-2))))
e=1*mu
f=(N-2)*lam_d*(1-0.000000)*(1-(0.114286*(1-(1-p)**(N-3))))
g=(N-2)*lam_d*(0.000000)+(N-2)*lam_d*(1-0.000000)*((0.114286*(1-(1-p)**(N-3))))
h=2*mu
i=(N-3)*lam_d
j=3*mu
N=8
lam_d=(1/461386.)
mu=(1/12.)
p=0.0237
[END]

[Disk sector conditional fault tolerance]
[[0.0, 0.0, 0.0, 0.0, 0.0043956043956043956, 0.02197802197802198, 0.075924075924075921],
 [0.0, 0.0, 0.0, 0.01098901098901099, 0.057942057942057944, 0.19780219780219779, 1.0],
 [0.0, 0.0, 0.034632034632034632, 0.16623376623376623, 0.49494949494949497, 1.0, 1.0],
 [0.0, 0.11428571428571428, 0.44126984126984126, 0.98333333333333328, 1.0, 1.0, 1.0]]
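
For readers wondering what solving such a model involves, here is a minimal sketch, independent of mm_solve.py (whose internals are not shown in this thread), that computes the mean time to reach the last state of the first, four-state model above. It assumes state 3 is the data-loss state and that rates are per hour. The function name is made up for illustration, and exact fractions are used only because the rates differ by several orders of magnitude.

```python
from fractions import Fraction

# Rates copied from the [assign] block of the four-state model.
N = 8
lam_d = Fraction(1, 461386)   # per-disk failure rate (per hour, assumed)
mu = Fraction(1, 12)          # repair rate

# transitions[(src, dst)] = rate; state 3 is assumed to be data loss.
transitions = {
    (0, 1): N * lam_d,
    (1, 0): mu,
    (1, 2): (N - 1) * lam_d,
    (2, 1): 2 * mu,
    (2, 3): (N - 2) * lam_d,
}
TRANSIENT = [0, 1, 2]


def mean_time_to_absorption(transitions, transient):
    """Solve sum_j Q[i][j] * T[j] = -1 over the transient states."""
    n = len(transient)
    idx = {s: k for k, s in enumerate(transient)}
    # Augmented matrix [Q | -1]; Q is the generator restricted to transient states.
    a = [[Fraction(0)] * n + [Fraction(-1)] for _ in range(n)]
    for (src, dst), rate in transitions.items():
        if src in idx:
            a[idx[src]][idx[src]] -= rate
            if dst in idx:
                a[idx[src]][idx[dst]] += rate
    # Gauss-Jordan elimination in exact arithmetic.
    for col in range(n):
        pivot = next(r for r in range(col, n) if a[r][col] != 0)
        a[col], a[pivot] = a[pivot], a[col]
        a[col] = [x / a[col][col] for x in a[col]]
        for r in range(n):
            if r != col and a[r][col] != 0:
                factor = a[r][col]
                a[r] = [x - factor * y for x, y in zip(a[r], a[col])]
    return [a[k][n] for k in range(n)]


times = mean_time_to_absorption(transitions, TRANSIENT)
mttdl = float(times[0])
print(f"MTTDL from state 0: {mttdl:.4e} hours")
```

Run as is, this gives an MTTDL of about 4.0611e+12, which agrees with the "Analytic MTTDL" figure reported for 2DFT.disk.model in the original message. Note that mu=(1/12.) means a 12-hour repair, whereas rebuilding 2000GiB at 6000MiB/s would take under six minutes, so the repair setting may account for part of the gap between the two tools.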


Kevin writes that "The HFRS uses an extremely efficient mathematical technique, called importance sampling, which enables the observation of extremely low-probability events.  I have implemented (and derived in my thesis) efficient simulation algorithms under both exponential and Weibull failure/repairs.  The combination of these techniques, in addition to a custom Markov model solver, makes the HFRS an extremely useful tool for evaluating storage system reliability." This means you need to understand both https://en.wikipedia.org/wiki/Markov_model and https://en.wikipedia.org/wiki/Importance_sampling, as well as the semantics of the input file, which is documented in the README.
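
As a rough illustration of why importance sampling helps with rare events, here is a toy sketch. It is not the HFRS algorithm, only the general technique applied to a one-dimensional example: estimating P(X > 20) for X ~ Exp(1), an event of probability about 2e-9 that plain simulation with a few hundred thousand samples would almost never observe. The function name and parameters are made up for illustration.

```python
import math
import random


def rare_tail_prob(threshold, n_samples, proposal_mean, seed=0):
    """Estimate P(X > threshold) for X ~ Exp(1) by importance sampling.

    Samples are drawn from a heavier-tailed exponential proposal with the
    given mean, then reweighted by target_pdf / proposal_pdf.
    """
    rng = random.Random(seed)
    rate = 1.0 / proposal_mean
    total = 0.0
    for _ in range(n_samples):
        x = rng.expovariate(rate)
        if x > threshold:
            # likelihood ratio: exp(-x) / (rate * exp(-rate * x))
            total += math.exp(-x) / (rate * math.exp(-rate * x))
    return total / n_samples


threshold = 20.0
estimate = rare_tail_prob(threshold, n_samples=200_000, proposal_mean=threshold)
exact = math.exp(-threshold)
print(f"importance sampling: {estimate:.3e}, exact: {exact:.3e}")
```

The proposal makes the rare region common, and the weights correct for the bias this introduces. Simulating rare data-loss events in a storage system relies on the same idea, with failure and repair distributions in place of this single exponential.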

Nice find koleosfuscus :-)

Cheers


-- 
Loïc Dachary, Artisan Logiciel Libre

