* scrub randomization and load threshold
@ 2015-11-12  9:24 Dan van der Ster
  2015-11-12 13:29 ` Sage Weil
  0 siblings, 1 reply; 12+ messages in thread
From: Dan van der Ster @ 2015-11-12  9:24 UTC (permalink / raw)
  To: ceph-devel; +Cc: Herve Rousseau

Hi,

Firstly, we just had a look at the new
osd_scrub_interval_randomize_ratio option and found that it doesn't
really solve the deep scrubbing problem. Given the default options,

osd_scrub_min_interval = 60*60*24
osd_scrub_max_interval = 7*60*60*24
osd_scrub_interval_randomize_ratio = 0.5
osd_deep_scrub_interval = 60*60*24*7

we understand that the new option changes the min interval to the
range 1-1.5 days. However, this doesn't do anything for the thundering
herd of deep scrubs which will happen every 7 days. We've found a
configuration that should randomize deep scrubbing across two weeks,
e.g.:

osd_scrub_min_interval = 60*60*24*7
osd_scrub_max_interval = 100*60*60*24 // effectively disabling this option
osd_scrub_load_threshold = 10 // effectively disabling this option
osd_scrub_interval_randomize_ratio = 2.0
osd_deep_scrub_interval = 60*60*24*7

but that (a) doesn't allow shallow scrubs to run daily and (b) is so
far off the defaults that it's basically an abuse of the intended
behaviour.
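
For reference, here is roughly how we understand the new interval
randomization to behave. This is a standalone sketch, not the actual OSD
code: the formula interval = min * (1 + r * ratio) is our reading of the
option, and the variable names are only illustrative.

// Sketch of the assumed randomization: the effective shallow scrub
// interval is drawn uniformly from
//   [min_interval, min_interval * (1 + osd_scrub_interval_randomize_ratio)].
#include <cstdio>
#include <random>

int main() {
  std::mt19937 rng{std::random_device{}()};

  const double day = 60 * 60 * 24;
  const double osd_scrub_min_interval = day;               // default: 1 day
  const double osd_scrub_interval_randomize_ratio = 0.5;   // default

  std::uniform_real_distribution<double> r(0.0, osd_scrub_interval_randomize_ratio);

  // With the defaults this prints values between 1.0 and 1.5 days (the
  // range mentioned above); deep scrubs are untouched and still pile up
  // every osd_deep_scrub_interval.
  for (int i = 0; i < 5; ++i) {
    double interval = osd_scrub_min_interval * (1.0 + r(rng));
    std::printf("next shallow scrub due after %.2f days\n", interval / day);
  }
  return 0;
}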

So we'd like to simplify how deep scrubbing can be randomized. Our PR
(http://github.com/ceph/ceph/pull/6550) adds a new option,
osd_deep_scrub_randomize_ratio, which controls a coin flip to randomly
turn scrubs into deep scrubs. The default is tuned so that roughly 1 in 7
scrubs is run as a deep scrub.
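
A minimal sketch of the coin-flip idea follows; this is not the PR's
actual code, and the function name and signature are hypothetical, but it
shows the two rules we rely on: the random promotion, plus
osd_deep_scrub_interval kept as a hard upper bound.

#include <random>

// Decide whether a scheduled (shallow) scrub should be promoted to a
// deep scrub.
bool promote_to_deep_scrub(double deep_randomize_ratio,        // e.g. ~1/7
                           double secs_since_last_deep_scrub,
                           double osd_deep_scrub_interval,
                           std::mt19937 &rng) {
  // Hard upper bound: never let a PG go longer than the configured
  // deep scrub interval without a deep scrub.
  if (secs_since_last_deep_scrub > osd_deep_scrub_interval)
    return true;
  // Otherwise flip a biased coin; with daily shallow scrubs and a ratio
  // of roughly 1/7, about one scrub per week becomes a deep scrub.
  std::bernoulli_distribution coin(deep_randomize_ratio);
  return coin(rng);
}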

Secondly, we'd also like to discuss the osd_scrub_load_threshold
option, where we see two problems:
   - the default is so low that it disables all the shallow scrub
randomization on all but completely idle clusters.
   - finding the correct osd_scrub_load_threshold for a cluster is
unclear and difficult, and probably a moving target for most production
clusters.

Given those observations, IMHO the smart Ceph admin should set
osd_scrub_load_threshold = 10 or higher, to effectively disable that
functionality. In the spirit of having good defaults, I therefore
propose that we increase the default osd_scrub_load_threshold (to at
least 5.0) and consider removing the load threshold logic completely.
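
For reference, the existing gate boils down to something like this (a
simplified sketch of the current behaviour as we read it, not the literal
OSD.cc code):

#include <cstdlib>

// A single 1-minute loadavg compared against a fixed threshold; on any
// host doing real work the loadavg sits above the low default, so the
// randomized shallow scrubs never get a chance to run and everything
// waits for osd_scrub_max_interval instead.
bool scrub_load_below_threshold(double osd_scrub_load_threshold) {
  double loadavgs[1];
  if (getloadavg(loadavgs, 1) != 1)
    return false;  // couldn't read the load: don't scrub
  return loadavgs[0] < osd_scrub_load_threshold;
}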

Cheers,

Dan


* Re: scrub randomization and load threshold
  2015-11-12  9:24 scrub randomization and load threshold Dan van der Ster
@ 2015-11-12 13:29 ` Sage Weil
  2015-11-12 14:36   ` Dan van der Ster
  0 siblings, 1 reply; 12+ messages in thread
From: Sage Weil @ 2015-11-12 13:29 UTC (permalink / raw)
  To: Dan van der Ster; +Cc: ceph-devel, Herve Rousseau

On Thu, 12 Nov 2015, Dan van der Ster wrote:
> Hi,
> 
> Firstly, we just had a look at the new
> osd_scrub_interval_randomize_ratio option and found that it doesn't
> really solve the deep scrubbing problem. Given the default options,
> 
> osd_scrub_min_interval = 60*60*24
> osd_scrub_max_interval = 7*60*60*24
> osd_scrub_interval_randomize_ratio = 0.5
> osd_deep_scrub_interval = 60*60*24*7
> 
> we understand that the new option changes the min interval to the
> range 1-1.5 days. However, this doesn't do anything for the thundering
> herd of deep scrubs which will happen every 7 days. We've found a
> configuration that should randomize deep scrubbing across two weeks,
> e.g.:
> 
> osd_scrub_min_interval = 60*60*24*7
> osd_scrub_max_interval = 100*60*60*24 // effectively disabling this option
> osd_scrub_load_threshold = 10 // effectively disabling this option
> osd_scrub_interval_randomize_ratio = 2.0
> osd_deep_scrub_interval = 60*60*24*7
> 
> but that (a) doesn't allow shallow scrubs to run daily and (b) is so
> far off the defaults that its basically an abuse of the intended
> behaviour.
> 
> So we'd like to simplify how deep scrubbing can be randomized. Our PR
> (http://github.com/ceph/ceph/pull/6550) adds a new option
> osd_deep_scrub_randomize_ratio which  controls a coin flip to randomly
> turn scrubs into deep scrubs. The default is tuned so roughly 1 in 7
> scrubs will be run deeply.

The coin flip seems reasonable to me.  But wouldn't it also/instead make
sense to apply the randomize ratio to the deep_scrub_interval?  By just
adding in the random factor here:

https://github.com/ceph/ceph/pull/6550/files#diff-dfb9ddca0a3ee32b266623e8fa489626R3247

That is what I would have expected to happen, and if the coin flip is also 
there then you have two knobs controlling the same thing, which'll cause 
confusion...

> Secondly, we'd also like to discuss the osd_scrub_load_threshold
> option, where we see two problems:
>    - the default is so low that it disables all the shallow scrub
> randomization on all but completely idle clusters.
>    - finding the correct osd_scrub_load_threshold for a cluster is
> surely unclear/difficult and probably a moving target for most prod
> clusters.
> 
> Given those observations, IMHO the smart Ceph admin should set
> osd_scrub_load_threshold = 10 or higher, to effectively disable that
> functionality. In the spirit of having good defaults, I therefore
> propose that we increase the default osd_scrub_load_threshold (to at
> least 5.0) and consider removing the load threshold logic completely.

This sounds reasonable to me.  It would be great if we could use a 24-hour 
average as the baseline or something so that it was self-tuning (e.g., set 
threshold to .8 of daily average), but that's a bit trickier.  Generally 
all for self-tuning, though... too many knobs...

sage


* Re: scrub randomization and load threshold
  2015-11-12 13:29 ` Sage Weil
@ 2015-11-12 14:36   ` Dan van der Ster
  2015-11-12 15:10     ` Sage Weil
  0 siblings, 1 reply; 12+ messages in thread
From: Dan van der Ster @ 2015-11-12 14:36 UTC (permalink / raw)
  To: Sage Weil; +Cc: ceph-devel, Herve Rousseau

On Thu, Nov 12, 2015 at 2:29 PM, Sage Weil <sage@newdream.net> wrote:
> On Thu, 12 Nov 2015, Dan van der Ster wrote:
>> Hi,
>>
>> Firstly, we just had a look at the new
>> osd_scrub_interval_randomize_ratio option and found that it doesn't
>> really solve the deep scrubbing problem. Given the default options,
>>
>> osd_scrub_min_interval = 60*60*24
>> osd_scrub_max_interval = 7*60*60*24
>> osd_scrub_interval_randomize_ratio = 0.5
>> osd_deep_scrub_interval = 60*60*24*7
>>
>> we understand that the new option changes the min interval to the
>> range 1-1.5 days. However, this doesn't do anything for the thundering
>> herd of deep scrubs which will happen every 7 days. We've found a
>> configuration that should randomize deep scrubbing across two weeks,
>> e.g.:
>>
>> osd_scrub_min_interval = 60*60*24*7
>> osd_scrub_max_interval = 100*60*60*24 // effectively disabling this option
>> osd_scrub_load_threshold = 10 // effectively disabling this option
>> osd_scrub_interval_randomize_ratio = 2.0
>> osd_deep_scrub_interval = 60*60*24*7
>>
>> but that (a) doesn't allow shallow scrubs to run daily and (b) is so
>> far off the defaults that its basically an abuse of the intended
>> behaviour.
>>
>> So we'd like to simplify how deep scrubbing can be randomized. Our PR
>> (http://github.com/ceph/ceph/pull/6550) adds a new option
>> osd_deep_scrub_randomize_ratio which  controls a coin flip to randomly
>> turn scrubs into deep scrubs. The default is tuned so roughly 1 in 7
>> scrubs will be run deeply.
>
> The coin flip seems reasonable to me.  But wouldn't it also/instead make
> sense to apply the randomize ratio to the deep_scrub_interval?  My just
> adding in the random factor here:
>
> https://github.com/ceph/ceph/pull/6550/files#diff-dfb9ddca0a3ee32b266623e8fa489626R3247
>
> That is what I would have expected to happen, and if the coin flip is also
> there then you have two knobs controlling the same thing, which'll cause
> confusion...
>

That was our first idea. But that has a couple downsides:

  1.  If we use the random range for the deep scrub intervals, e.g.
deep every 1-1.5 weeks, we still get quite bursty scrubbing until it
randomizes over a period of many weeks/months. And I fear it might
even lead to lower frequency harmonics of many concurrent deep scrubs.
Using a coin flip guarantees uniformity starting immediately from time
zero.

  2. In our PR, osd_deep_scrub_interval is still used as an upper limit
on how long a PG can go without being deeply scrubbed. This way
there's no confusion, such as PGs going without a deep scrub for longer
than expected. (In general, I think this random range is unintuitive and
difficult to tune; e.g. see my 2-week deep scrubbing config above.)

For me, the most intuitive configuration (maintaining randomness) would be:

  a. drop the osd_scrub_interval_randomize_ratio because there is no
shallow scrub thundering herd problem (AFAIK), and it just complicates
the configuration. (But this is in a stable release now so I don't
know if you want to back it out).
  b. perform a (usually shallow) scrub every
osd_scrub_interval_(min/max), depending on a self-tuning load
threshold.
  c. at each (b), do a coin flip to occasionally turn it into a deep scrub.
  optionally: d. remove osd_deep_scrub_randomize_ratio and replace it
with osd_scrub_interval_min/osd_deep_scrub_interval (a rough sketch of
this combined flow follows).
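
Concretely, something like the following, with placeholder names: the
self-tuning load check from (b) is abstracted as a boolean, and none of
this is actual OSD scheduler code.

#include <random>

enum class ScrubKind { none, shallow, deep };

// One combined decision per PG per scheduling tick: scrub if due and the
// load permits (or if past the max interval), and promote the scrub to a
// deep scrub via the coin flip from (c) or the deep-interval cap.
ScrubKind next_scrub(double secs_since_last_scrub,
                     double secs_since_last_deep_scrub,
                     bool load_ok,                    // self-tuning load check, step (b)
                     double osd_scrub_min_interval,
                     double osd_scrub_max_interval,
                     double osd_deep_scrub_interval,
                     double deep_randomize_ratio,     // the coin from step (c)
                     std::mt19937 &rng) {
  // Past the max interval we scrub regardless of load; between min and
  // max we only scrub when the load check passes.
  bool due = secs_since_last_scrub > osd_scrub_max_interval ||
             (secs_since_last_scrub > osd_scrub_min_interval && load_ok);
  if (!due)
    return ScrubKind::none;
  // Deep scrub if the PG is overdue for one, or if the coin says so.
  std::bernoulli_distribution coin(deep_randomize_ratio);
  if (secs_since_last_deep_scrub > osd_deep_scrub_interval || coin(rng))
    return ScrubKind::deep;
  return ScrubKind::shallow;
}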

>> Secondly, we'd also like to discuss the osd_scrub_load_threshold
>> option, where we see two problems:
>>    - the default is so low that it disables all the shallow scrub
>> randomization on all but completely idle clusters.
>>    - finding the correct osd_scrub_load_threshold for a cluster is
>> surely unclear/difficult and probably a moving target for most prod
>> clusters.
>>
>> Given those observations, IMHO the smart Ceph admin should set
>> osd_scrub_load_threshold = 10 or higher, to effectively disable that
>> functionality. In the spirit of having good defaults, I therefore
>> propose that we increase the default osd_scrub_load_threshold (to at
>> least 5.0) and consider removing the load threshold logic completely.
>
> This sounds reasonable to me.  It would be great if we could use a 24-hour
> average as the baseline or something so that it was self-tuning (e.g., set
> threshold to .8 of daily average), but that's a bit trickier.  Generally
> all for self-tuning, though... too many knobs...

Yes, but we probably would need to make your 0.8 a function of the
stddev of the loadavg over a day, to handle clusters with flat
loadavgs as well as varying ones.

In order to randomly spread the deep scrubs across the week, it's
essential to give each PG many opportunities to scrub throughout the
week. If PGs are only shallow scrubbed once a week (at interval_max),
then every scrub would become a deep scrub and we again have the
thundering herd problem.

I'll push 5.0 for now.

-- dan


* Re: scrub randomization and load threshold
  2015-11-12 14:36   ` Dan van der Ster
@ 2015-11-12 15:10     ` Sage Weil
  2015-11-12 15:34       ` Dan van der Ster
  0 siblings, 1 reply; 12+ messages in thread
From: Sage Weil @ 2015-11-12 15:10 UTC (permalink / raw)
  To: Dan van der Ster; +Cc: ceph-devel, Herve Rousseau

On Thu, 12 Nov 2015, Dan van der Ster wrote:
> On Thu, Nov 12, 2015 at 2:29 PM, Sage Weil <sage@newdream.net> wrote:
> > On Thu, 12 Nov 2015, Dan van der Ster wrote:
> >> Hi,
> >>
> >> Firstly, we just had a look at the new
> >> osd_scrub_interval_randomize_ratio option and found that it doesn't
> >> really solve the deep scrubbing problem. Given the default options,
> >>
> >> osd_scrub_min_interval = 60*60*24
> >> osd_scrub_max_interval = 7*60*60*24
> >> osd_scrub_interval_randomize_ratio = 0.5
> >> osd_deep_scrub_interval = 60*60*24*7
> >>
> >> we understand that the new option changes the min interval to the
> >> range 1-1.5 days. However, this doesn't do anything for the thundering
> >> herd of deep scrubs which will happen every 7 days. We've found a
> >> configuration that should randomize deep scrubbing across two weeks,
> >> e.g.:
> >>
> >> osd_scrub_min_interval = 60*60*24*7
> >> osd_scrub_max_interval = 100*60*60*24 // effectively disabling this option
> >> osd_scrub_load_threshold = 10 // effectively disabling this option
> >> osd_scrub_interval_randomize_ratio = 2.0
> >> osd_deep_scrub_interval = 60*60*24*7
> >>
> >> but that (a) doesn't allow shallow scrubs to run daily and (b) is so
> >> far off the defaults that its basically an abuse of the intended
> >> behaviour.
> >>
> >> So we'd like to simplify how deep scrubbing can be randomized. Our PR
> >> (http://github.com/ceph/ceph/pull/6550) adds a new option
> >> osd_deep_scrub_randomize_ratio which  controls a coin flip to randomly
> >> turn scrubs into deep scrubs. The default is tuned so roughly 1 in 7
> >> scrubs will be run deeply.
> >
> > The coin flip seems reasonable to me.  But wouldn't it also/instead make
> > sense to apply the randomize ratio to the deep_scrub_interval?  My just
> > adding in the random factor here:
> >
> > https://github.com/ceph/ceph/pull/6550/files#diff-dfb9ddca0a3ee32b266623e8fa489626R3247
> >
> > That is what I would have expected to happen, and if the coin flip is also
> > there then you have two knobs controlling the same thing, which'll cause
> > confusion...
> >
> 
> That was our first idea. But that has a couple downsides:
> 
>   1.  If we use the random range for the deep scrub intervals, e.g.
> deep every 1-1.5 weeks, we still get quite bursty scrubbing until it
> randomizes over a period of many weeks/months. And I fear it might
> even lead to lower frequency harmonics of many concurrent deep scrubs.
> Using a coin flip guarantees uniformity starting immediately from time
> zero.
>
>   2. In our PR osd_deep_scrub_interval is still used as an upper limit
> on how long a PG can go without being deeply scrubbed. This way
> there's no confusion such as PGs going undeep-scrubbed longer than
> expected. (In general, I think this random range is unintuitive and
> difficult to tune (e.g. see my 2 week deep scrubbing config above).

Fair enough..
 
> For me, the most intuitive configuration (maintaining randomness) would be:
> 
>   a. drop the osd_scrub_interval_randomize_ratio because there is no
> shallow scrub thundering herd problem (AFAIK), and it just complicates
> the configuration. (But this is in a stable release now so I don't
> know if you want to back it out).

I'm inclined to leave it, even if it complicates config: just because we 
haven't noticed the shallow scrub thundering herd doesn't mean it doesn't 
exist, and I fully expect that it is there.  Also, if the shallow scrubs 
are lumpy and we're promoting some of them to deep scrubs, then the deep 
scrubs will be lumpy too.

>   b. perform a (usually shallow) scrub every
> osd_scrub_interval_(min/max) depending on a self-tuning load
> threshold.

Yep, although as you note we have some work to do to get there.  :)

>   c. do a coin flip each (b) to occasionally turn it into deep scrub.

Works for me.

>   optionally: d. remove osd_deep_scrub_randomize_ratio and replace it
> with  osd_scrub_interval_min/osd_deep_scrub_interval.

There is no osd_deep_scrub_randomize_ratio.  Do you mean replace 
osd_deep_scrub_interval with osd_deep_scrub_{min,max}_interval?

> >> Secondly, we'd also like to discuss the osd_scrub_load_threshold
> >> option, where we see two problems:
> >>    - the default is so low that it disables all the shallow scrub
> >> randomization on all but completely idle clusters.
> >>    - finding the correct osd_scrub_load_threshold for a cluster is
> >> surely unclear/difficult and probably a moving target for most prod
> >> clusters.
> >>
> >> Given those observations, IMHO the smart Ceph admin should set
> >> osd_scrub_load_threshold = 10 or higher, to effectively disable that
> >> functionality. In the spirit of having good defaults, I therefore
> >> propose that we increase the default osd_scrub_load_threshold (to at
> >> least 5.0) and consider removing the load threshold logic completely.
> >
> > This sounds reasonable to me.  It would be great if we could use a 24-hour
> > average as the baseline or something so that it was self-tuning (e.g., set
> > threshold to .8 of daily average), but that's a bit trickier.  Generally
> > all for self-tuning, though... too many knobs...
> 
> Yes, but we probably would need to make your 0.8 a function of the
> stddev of the loadavg over a day, to handle clusters with flat
> loadavgs as well as varying ones.
> 
> In order to randomly spread the deep scrubs across the week, it's
> essential to give each PG many opportunities to scrub throughout the
> week. If PGs are only shallow scrubbed once a week (at interval_max),
> then every scrub would become a deep scrub and we again have the
> thundering herd problem.
> 
> I'll push 5.0 for now.

Sounds good.

I would still love to see someone tackle the auto-tuning approach, 
though! :)

sage


* Re: scrub randomization and load threshold
  2015-11-12 15:10     ` Sage Weil
@ 2015-11-12 15:34       ` Dan van der Ster
  2015-11-16 14:25         ` Dan van der Ster
  0 siblings, 1 reply; 12+ messages in thread
From: Dan van der Ster @ 2015-11-12 15:34 UTC (permalink / raw)
  To: Sage Weil; +Cc: ceph-devel, Herve Rousseau

On Thu, Nov 12, 2015 at 4:10 PM, Sage Weil <sage@newdream.net> wrote:
> On Thu, 12 Nov 2015, Dan van der Ster wrote:
>> On Thu, Nov 12, 2015 at 2:29 PM, Sage Weil <sage@newdream.net> wrote:
>> > On Thu, 12 Nov 2015, Dan van der Ster wrote:
>> >> Hi,
>> >>
>> >> Firstly, we just had a look at the new
>> >> osd_scrub_interval_randomize_ratio option and found that it doesn't
>> >> really solve the deep scrubbing problem. Given the default options,
>> >>
>> >> osd_scrub_min_interval = 60*60*24
>> >> osd_scrub_max_interval = 7*60*60*24
>> >> osd_scrub_interval_randomize_ratio = 0.5
>> >> osd_deep_scrub_interval = 60*60*24*7
>> >>
>> >> we understand that the new option changes the min interval to the
>> >> range 1-1.5 days. However, this doesn't do anything for the thundering
>> >> herd of deep scrubs which will happen every 7 days. We've found a
>> >> configuration that should randomize deep scrubbing across two weeks,
>> >> e.g.:
>> >>
>> >> osd_scrub_min_interval = 60*60*24*7
>> >> osd_scrub_max_interval = 100*60*60*24 // effectively disabling this option
>> >> osd_scrub_load_threshold = 10 // effectively disabling this option
>> >> osd_scrub_interval_randomize_ratio = 2.0
>> >> osd_deep_scrub_interval = 60*60*24*7
>> >>
>> >> but that (a) doesn't allow shallow scrubs to run daily and (b) is so
>> >> far off the defaults that its basically an abuse of the intended
>> >> behaviour.
>> >>
>> >> So we'd like to simplify how deep scrubbing can be randomized. Our PR
>> >> (http://github.com/ceph/ceph/pull/6550) adds a new option
>> >> osd_deep_scrub_randomize_ratio which  controls a coin flip to randomly
>> >> turn scrubs into deep scrubs. The default is tuned so roughly 1 in 7
>> >> scrubs will be run deeply.
>> >
>> > The coin flip seems reasonable to me.  But wouldn't it also/instead make
>> > sense to apply the randomize ratio to the deep_scrub_interval?  My just
>> > adding in the random factor here:
>> >
>> > https://github.com/ceph/ceph/pull/6550/files#diff-dfb9ddca0a3ee32b266623e8fa489626R3247
>> >
>> > That is what I would have expected to happen, and if the coin flip is also
>> > there then you have two knobs controlling the same thing, which'll cause
>> > confusion...
>> >
>>
>> That was our first idea. But that has a couple downsides:
>>
>>   1.  If we use the random range for the deep scrub intervals, e.g.
>> deep every 1-1.5 weeks, we still get quite bursty scrubbing until it
>> randomizes over a period of many weeks/months. And I fear it might
>> even lead to lower frequency harmonics of many concurrent deep scrubs.
>> Using a coin flip guarantees uniformity starting immediately from time
>> zero.
>>
>>   2. In our PR osd_deep_scrub_interval is still used as an upper limit
>> on how long a PG can go without being deeply scrubbed. This way
>> there's no confusion such as PGs going undeep-scrubbed longer than
>> expected. (In general, I think this random range is unintuitive and
>> difficult to tune (e.g. see my 2 week deep scrubbing config above).
>
> Fair enough..
>
>> For me, the most intuitive configuration (maintaining randomness) would be:
>>
>>   a. drop the osd_scrub_interval_randomize_ratio because there is no
>> shallow scrub thundering herd problem (AFAIK), and it just complicates
>> the configuration. (But this is in a stable release now so I don't
>> know if you want to back it out).
>
> I'm inclined to leave it, even if it complicates config: just because we
> haven't noticed the shallow scrub thundering herd doesn't mean it doesn't
> exist, and I fully expect that it is there.  Also, if the shallow scrubs
> are lumpy and we're promoting some of them to deep scrubs, then the deep
> scrubs will be lumpy too.
>

Sounds good.

>>   b. perform a (usually shallow) scrub every
>> osd_scrub_interval_(min/max) depending on a self-tuning load
>> threshold.
>
> Yep, although as you note we have some work to do to get there.  :)
>
>>   c. do a coin flip each (b) to occasionally turn it into deep scrub.
>
> Works for me.
>
>>   optionally: d. remove osd_deep_scrub_randomize_ratio and replace it
>> with  osd_scrub_interval_min/osd_deep_scrub_interval.
>
> There is no osd_deep_scrub_randomize_ratio.  Do you mean replace
> osd_deep_scrub_interval with osd_deep_scrub_{min,max}_interval?

osd_deep_scrub_randomize_ratio is the new option we proposed in the
PR. We chose 0.15 because it's roughly 1/7 (i.e.
osd_scrub_interval_min/osd_deep_scrub_interval = 1/7 in the default
config). But the coin flip could use
osd_scrub_interval_min/osd_deep_scrub_interval instead of adding this
extra configurable.

My preference would be to keep it separately configurable.
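
As a back-of-envelope check of that choice (illustrative arithmetic only,
not code from the PR): with a scrub attempt roughly every
osd_scrub_interval_min and an independent promotion probability p per
attempt, the expected gap between deep scrubs is osd_scrub_interval_min / p.

#include <cstdio>

int main() {
  const double day = 60 * 60 * 24;
  const double osd_scrub_interval_min = day;   // default: daily scrub attempts
  const double p = 0.15;                       // proposed osd_deep_scrub_randomize_ratio
  // On average 1/p attempts until a deep scrub, i.e. ~6.7 days here,
  // close to the 7-day osd_deep_scrub_interval default.
  std::printf("expected deep scrub every %.1f days\n",
              osd_scrub_interval_min / p / day);
  return 0;
}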

>> >> Secondly, we'd also like to discuss the osd_scrub_load_threshold
>> >> option, where we see two problems:
>> >>    - the default is so low that it disables all the shallow scrub
>> >> randomization on all but completely idle clusters.
>> >>    - finding the correct osd_scrub_load_threshold for a cluster is
>> >> surely unclear/difficult and probably a moving target for most prod
>> >> clusters.
>> >>
>> >> Given those observations, IMHO the smart Ceph admin should set
>> >> osd_scrub_load_threshold = 10 or higher, to effectively disable that
>> >> functionality. In the spirit of having good defaults, I therefore
>> >> propose that we increase the default osd_scrub_load_threshold (to at
>> >> least 5.0) and consider removing the load threshold logic completely.
>> >
>> > This sounds reasonable to me.  It would be great if we could use a 24-hour
>> > average as the baseline or something so that it was self-tuning (e.g., set
>> > threshold to .8 of daily average), but that's a bit trickier.  Generally
>> > all for self-tuning, though... too many knobs...
>>
>> Yes, but we probably would need to make your 0.8 a function of the
>> stddev of the loadavg over a day, to handle clusters with flat
>> loadavgs as well as varying ones.
>>
>> In order to randomly spread the deep scrubs across the week, it's
>> essential to give each PG many opportunities to scrub throughout the
>> week. If PGs are only shallow scrubbed once a week (at interval_max),
>> then every scrub would become a deep scrub and we again have the
>> thundering herd problem.
>>
>> I'll push 5.0 for now.
>
> Sounds good.
>
> I would still love to see someone tackle the auto-tuning approach,
> though! :)

I should have some time next week to have a look, if nobody beats me to it.

-- dan

> sage


* Re: scrub randomization and load threshold
  2015-11-12 15:34       ` Dan van der Ster
@ 2015-11-16 14:25         ` Dan van der Ster
  2015-11-16 15:20           ` Sage Weil
  0 siblings, 1 reply; 12+ messages in thread
From: Dan van der Ster @ 2015-11-16 14:25 UTC (permalink / raw)
  To: Sage Weil; +Cc: ceph-devel, Herve Rousseau

On Thu, Nov 12, 2015 at 4:34 PM, Dan van der Ster <dan@vanderster.com> wrote:
> On Thu, Nov 12, 2015 at 4:10 PM, Sage Weil <sage@newdream.net> wrote:
>> On Thu, 12 Nov 2015, Dan van der Ster wrote:
>>> On Thu, Nov 12, 2015 at 2:29 PM, Sage Weil <sage@newdream.net> wrote:
>>> > On Thu, 12 Nov 2015, Dan van der Ster wrote:
>>> >> Hi,
>>> >>
>>> >> Firstly, we just had a look at the new
>>> >> osd_scrub_interval_randomize_ratio option and found that it doesn't
>>> >> really solve the deep scrubbing problem. Given the default options,
>>> >>
>>> >> osd_scrub_min_interval = 60*60*24
>>> >> osd_scrub_max_interval = 7*60*60*24
>>> >> osd_scrub_interval_randomize_ratio = 0.5
>>> >> osd_deep_scrub_interval = 60*60*24*7
>>> >>
>>> >> we understand that the new option changes the min interval to the
>>> >> range 1-1.5 days. However, this doesn't do anything for the thundering
>>> >> herd of deep scrubs which will happen every 7 days. We've found a
>>> >> configuration that should randomize deep scrubbing across two weeks,
>>> >> e.g.:
>>> >>
>>> >> osd_scrub_min_interval = 60*60*24*7
>>> >> osd_scrub_max_interval = 100*60*60*24 // effectively disabling this option
>>> >> osd_scrub_load_threshold = 10 // effectively disabling this option
>>> >> osd_scrub_interval_randomize_ratio = 2.0
>>> >> osd_deep_scrub_interval = 60*60*24*7
>>> >>
>>> >> but that (a) doesn't allow shallow scrubs to run daily and (b) is so
>>> >> far off the defaults that its basically an abuse of the intended
>>> >> behaviour.
>>> >>
>>> >> So we'd like to simplify how deep scrubbing can be randomized. Our PR
>>> >> (http://github.com/ceph/ceph/pull/6550) adds a new option
>>> >> osd_deep_scrub_randomize_ratio which  controls a coin flip to randomly
>>> >> turn scrubs into deep scrubs. The default is tuned so roughly 1 in 7
>>> >> scrubs will be run deeply.
>>> >
>>> > The coin flip seems reasonable to me.  But wouldn't it also/instead make
>>> > sense to apply the randomize ratio to the deep_scrub_interval?  My just
>>> > adding in the random factor here:
>>> >
>>> > https://github.com/ceph/ceph/pull/6550/files#diff-dfb9ddca0a3ee32b266623e8fa489626R3247
>>> >
>>> > That is what I would have expected to happen, and if the coin flip is also
>>> > there then you have two knobs controlling the same thing, which'll cause
>>> > confusion...
>>> >
>>>
>>> That was our first idea. But that has a couple downsides:
>>>
>>>   1.  If we use the random range for the deep scrub intervals, e.g.
>>> deep every 1-1.5 weeks, we still get quite bursty scrubbing until it
>>> randomizes over a period of many weeks/months. And I fear it might
>>> even lead to lower frequency harmonics of many concurrent deep scrubs.
>>> Using a coin flip guarantees uniformity starting immediately from time
>>> zero.
>>>
>>>   2. In our PR osd_deep_scrub_interval is still used as an upper limit
>>> on how long a PG can go without being deeply scrubbed. This way
>>> there's no confusion such as PGs going undeep-scrubbed longer than
>>> expected. (In general, I think this random range is unintuitive and
>>> difficult to tune (e.g. see my 2 week deep scrubbing config above).
>>
>> Fair enough..
>>
>>> For me, the most intuitive configuration (maintaining randomness) would be:
>>>
>>>   a. drop the osd_scrub_interval_randomize_ratio because there is no
>>> shallow scrub thundering herd problem (AFAIK), and it just complicates
>>> the configuration. (But this is in a stable release now so I don't
>>> know if you want to back it out).
>>
>> I'm inclined to leave it, even if it complicates config: just because we
>> haven't noticed the shallow scrub thundering herd doesn't mean it doesn't
>> exist, and I fully expect that it is there.  Also, if the shallow scrubs
>> are lumpy and we're promoting some of them to deep scrubs, then the deep
>> scrubs will be lumpy too.
>>
>
> Sounds good.
>
>>>   b. perform a (usually shallow) scrub every
>>> osd_scrub_interval_(min/max) depending on a self-tuning load
>>> threshold.
>>
>> Yep, although as you note we have some work to do to get there.  :)
>>
>>>   c. do a coin flip each (b) to occasionally turn it into deep scrub.
>>
>> Works for me.
>>
>>>   optionally: d. remove osd_deep_scrub_randomize_ratio and replace it
>>> with  osd_scrub_interval_min/osd_deep_scrub_interval.
>>
>> There is no osd_deep_scrub_randomize_ratio.  Do you mean replace
>> osd_deep_scrub_interval with osd_deep_scrub_{min,max}_interval?
>
> osd_deep_scrub_randomize_ratio is the new option we proposed in the
> PR. We chose 0.15 because it's roughly 1/7 (i.e.
> osd_scrub_interval_min/osd_deep_scrub_interval = 1/7 in the default
> config). But the coin flip could use
> osd_scrub_interval_min/osd_deep_scrub_interval instead of adding this
> extra configurable.
>
> My preference would be to keep it separately configurable.
>
>>> >> Secondly, we'd also like to discuss the osd_scrub_load_threshold
>>> >> option, where we see two problems:
>>> >>    - the default is so low that it disables all the shallow scrub
>>> >> randomization on all but completely idle clusters.
>>> >>    - finding the correct osd_scrub_load_threshold for a cluster is
>>> >> surely unclear/difficult and probably a moving target for most prod
>>> >> clusters.
>>> >>
>>> >> Given those observations, IMHO the smart Ceph admin should set
>>> >> osd_scrub_load_threshold = 10 or higher, to effectively disable that
>>> >> functionality. In the spirit of having good defaults, I therefore
>>> >> propose that we increase the default osd_scrub_load_threshold (to at
>>> >> least 5.0) and consider removing the load threshold logic completely.
>>> >
>>> > This sounds reasonable to me.  It would be great if we could use a 24-hour
>>> > average as the baseline or something so that it was self-tuning (e.g., set
>>> > threshold to .8 of daily average), but that's a bit trickier.  Generally
>>> > all for self-tuning, though... too many knobs...
>>>
>>> Yes, but we probably would need to make your 0.8 a function of the
>>> stddev of the loadavg over a day, to handle clusters with flat
>>> loadavgs as well as varying ones.
>>>
>>> In order to randomly spread the deep scrubs across the week, it's
>>> essential to give each PG many opportunities to scrub throughout the
>>> week. If PGs are only shallow scrubbed once a week (at interval_max),
>>> then every scrub would become a deep scrub and we again have the
>>> thundering herd problem.
>>>
>>> I'll push 5.0 for now.
>>
>> Sounds good.
>>
>> I would still love to see someone tackle the auto-tuning approach,
>> though! :)
>
> I should have some time next week to have a look, if nobody beat me to it.

Instead of keeping a 24hr loadavg, how about we allow scrubs whenever
the loadavg is decreasing (or below the threshold)? As long as the
1min loadavg is less than the 15min loadavg, we should be ok to allow
new scrubs. If you agree I'll add the patch below to my PR.

-- dan


diff --git a/src/osd/OSD.cc b/src/osd/OSD.cc
index 0562eed..464162d 100644
--- a/src/osd/OSD.cc
+++ b/src/osd/OSD.cc
@@ -6065,20 +6065,24 @@ bool OSD::scrub_time_permit(utime_t now)

 bool OSD::scrub_load_below_threshold()
 {
-  double loadavgs[1];
-  if (getloadavg(loadavgs, 1) != 1) {
+  double loadavgs[3];
+  if (getloadavg(loadavgs, 3) != 3) {
     dout(10) << __func__ << " couldn't read loadavgs\n" << dendl;
     return false;
   }

   if (loadavgs[0] >= cct->_conf->osd_scrub_load_threshold) {
-    dout(20) << __func__ << " loadavg " << loadavgs[0]
-            << " >= max " << cct->_conf->osd_scrub_load_threshold
-            << " = no, load too high" << dendl;
-    return false;
+    if (loadavgs[0] >= loadavgs[2]) {
+      dout(20) << __func__ << " loadavg " << loadavgs[0]
+              << " >= max " << cct->_conf->osd_scrub_load_threshold
+               << " and >= 15m avg " << loadavgs[2]
+              << " = no, load too high" << dendl;
+      return false;
+    }
   } else {
     dout(20) << __func__ << " loadavg " << loadavgs[0]
             << " < max " << cct->_conf->osd_scrub_load_threshold
+            << " or < 15 min avg " << loadavgs[2]
             << " = yes" << dendl;
     return true;
   }


* Re: scrub randomization and load threshold
  2015-11-16 14:25         ` Dan van der Ster
@ 2015-11-16 15:20           ` Sage Weil
  2015-11-16 15:32             ` Dan van der Ster
  0 siblings, 1 reply; 12+ messages in thread
From: Sage Weil @ 2015-11-16 15:20 UTC (permalink / raw)
  To: Dan van der Ster; +Cc: ceph-devel, Herve Rousseau

On Mon, 16 Nov 2015, Dan van der Ster wrote:
> Instead of keeping a 24hr loadavg, how about we allow scrubs whenever
> the loadavg is decreasing (or below the threshold)? As long as the
> 1min loadavg is less than the 15min loadavg, we should be ok to allow
> new scrubs. If you agree I'll add the patch below to my PR.

I like the simplicity of that, but I'm afraid it's going to just trigger a
feedback loop and oscillations on the host.  I.e., as soon as we see *any*
decrease, all osds on the host will start to scrub, which will push the
load up.  Once that round of PGs finishes, the load will start to drop
again, triggering another round.  This'll happen regardless of whether
we're in the peak hours or not, and the high-level goal (IMO at least) is
to do scrubbing in non-peak hours.

sage

> -- dan
> 
> 
> diff --git a/src/osd/OSD.cc b/src/osd/OSD.cc
> index 0562eed..464162d 100644
> --- a/src/osd/OSD.cc
> +++ b/src/osd/OSD.cc
> @@ -6065,20 +6065,24 @@ bool OSD::scrub_time_permit(utime_t now)
> 
>  bool OSD::scrub_load_below_threshold()
>  {
> -  double loadavgs[1];
> -  if (getloadavg(loadavgs, 1) != 1) {
> +  double loadavgs[3];
> +  if (getloadavg(loadavgs, 3) != 3) {
>      dout(10) << __func__ << " couldn't read loadavgs\n" << dendl;
>      return false;
>    }
> 
>    if (loadavgs[0] >= cct->_conf->osd_scrub_load_threshold) {
> -    dout(20) << __func__ << " loadavg " << loadavgs[0]
> -            << " >= max " << cct->_conf->osd_scrub_load_threshold
> -            << " = no, load too high" << dendl;
> -    return false;
> +    if (loadavgs[0] >= loadavgs[2]) {
> +      dout(20) << __func__ << " loadavg " << loadavgs[0]
> +              << " >= max " << cct->_conf->osd_scrub_load_threshold
> +               << " and >= 15m avg " << loadavgs[2]
> +              << " = no, load too high" << dendl;
> +      return false;
> +    }
>    } else {
>      dout(20) << __func__ << " loadavg " << loadavgs[0]
>              << " < max " << cct->_conf->osd_scrub_load_threshold
> +            << " or < 15 min avg " << loadavgs[2]
>              << " = yes" << dendl;
>      return true;
>    }
> 
> 


* Re: scrub randomization and load threshold
  2015-11-16 15:20           ` Sage Weil
@ 2015-11-16 15:32             ` Dan van der Ster
  2015-11-16 15:58               ` Dan van der Ster
  0 siblings, 1 reply; 12+ messages in thread
From: Dan van der Ster @ 2015-11-16 15:32 UTC (permalink / raw)
  To: Sage Weil; +Cc: ceph-devel, Herve Rousseau

On Mon, Nov 16, 2015 at 4:20 PM, Sage Weil <sage@newdream.net> wrote:
> On Mon, 16 Nov 2015, Dan van der Ster wrote:
>> Instead of keeping a 24hr loadavg, how about we allow scrubs whenever
>> the loadavg is decreasing (or below the threshold)? As long as the
>> 1min loadavg is less than the 15min loadavg, we should be ok to allow
>> new scrubs. If you agree I'll add the patch below to my PR.
>
> I like the simplicity of that, I'm afraid its going to just trigger a
> feedback loop and oscillations on the host.  I.e., as soo as we see *any*
> decrease, all osds on the host will start to scrub, which will push the
> load up.  Once that round of PGs finish, the load will start to drop
> again, triggering another round.  This'll happen regardless of whether
> we're in the peak hours or not, and the high-level goal (IMO at least) is
> to do scrubbing in non-peak hours.

We checked our OSDs' 24hr loadavg plots today and found that the
original idea of 0.8 * 24hr loadavg wouldn't leave many chances for
scrubs to run. So maybe if we used 0.9 or 1.0 it would be doable.

BTW, I realized there was a silly error in that earlier patch, and we
need an upper bound anyway, say the number of CPUs. So until your
response came I was working with this idea:
https://stikked.web.cern.ch/stikked/view/raw/5586a912

-- dan


* Re: scrub randomization and load threshold
  2015-11-16 15:32             ` Dan van der Ster
@ 2015-11-16 15:58               ` Dan van der Ster
  2015-11-16 17:06                 ` Dan van der Ster
  0 siblings, 1 reply; 12+ messages in thread
From: Dan van der Ster @ 2015-11-16 15:58 UTC (permalink / raw)
  To: Sage Weil; +Cc: ceph-devel, Herve Rousseau

On Mon, Nov 16, 2015 at 4:32 PM, Dan van der Ster <dan@vanderster.com> wrote:
> On Mon, Nov 16, 2015 at 4:20 PM, Sage Weil <sage@newdream.net> wrote:
>> On Mon, 16 Nov 2015, Dan van der Ster wrote:
>>> Instead of keeping a 24hr loadavg, how about we allow scrubs whenever
>>> the loadavg is decreasing (or below the threshold)? As long as the
>>> 1min loadavg is less than the 15min loadavg, we should be ok to allow
>>> new scrubs. If you agree I'll add the patch below to my PR.
>>
>> I like the simplicity of that, I'm afraid its going to just trigger a
>> feedback loop and oscillations on the host.  I.e., as soo as we see *any*
>> decrease, all osds on the host will start to scrub, which will push the
>> load up.  Once that round of PGs finish, the load will start to drop
>> again, triggering another round.  This'll happen regardless of whether
>> we're in the peak hours or not, and the high-level goal (IMO at least) is
>> to do scrubbing in non-peak hours.
>
> We checked our OSDs' 24hr loadavg plots today and found that the
> original idea of 0.8 * 24hr loadavg wouldn't leave many chances for
> scrubs to run. So maybe if we used 0.9 or 1.0 it would be doable.
>
> BTW, I realized there was a silly error in that earlier patch, and we
> anyway need an upper bound, say # cpus. So until your response came I
> was working with this idea:
> https://stikked.web.cern.ch/stikked/view/raw/5586a912

Sorry, that link is behind SSO. Here:

https://gist.github.com/dvanders/f3b08373af0f5957f589


* Re: scrub randomization and load threshold
  2015-11-16 15:58               ` Dan van der Ster
@ 2015-11-16 17:06                 ` Dan van der Ster
  2015-11-16 17:13                   ` Sage Weil
  0 siblings, 1 reply; 12+ messages in thread
From: Dan van der Ster @ 2015-11-16 17:06 UTC (permalink / raw)
  To: Sage Weil; +Cc: ceph-devel, Herve Rousseau

On Mon, Nov 16, 2015 at 4:58 PM, Dan van der Ster <dan@vanderster.com> wrote:
> On Mon, Nov 16, 2015 at 4:32 PM, Dan van der Ster <dan@vanderster.com> wrote:
>> On Mon, Nov 16, 2015 at 4:20 PM, Sage Weil <sage@newdream.net> wrote:
>>> On Mon, 16 Nov 2015, Dan van der Ster wrote:
>>>> Instead of keeping a 24hr loadavg, how about we allow scrubs whenever
>>>> the loadavg is decreasing (or below the threshold)? As long as the
>>>> 1min loadavg is less than the 15min loadavg, we should be ok to allow
>>>> new scrubs. If you agree I'll add the patch below to my PR.
>>>
>>> I like the simplicity of that, I'm afraid its going to just trigger a
>>> feedback loop and oscillations on the host.  I.e., as soo as we see *any*
>>> decrease, all osds on the host will start to scrub, which will push the
>>> load up.  Once that round of PGs finish, the load will start to drop
>>> again, triggering another round.  This'll happen regardless of whether
>>> we're in the peak hours or not, and the high-level goal (IMO at least) is
>>> to do scrubbing in non-peak hours.
>>
>> We checked our OSDs' 24hr loadavg plots today and found that the
>> original idea of 0.8 * 24hr loadavg wouldn't leave many chances for
>> scrubs to run. So maybe if we used 0.9 or 1.0 it would be doable.
>>
>> BTW, I realized there was a silly error in that earlier patch, and we
>> anyway need an upper bound, say # cpus. So until your response came I
>> was working with this idea:
>> https://stikked.web.cern.ch/stikked/view/raw/5586a912
>
> Sorry for SSO. Here:
>
> https://gist.github.com/dvanders/f3b08373af0f5957f589

Hi again. Here's a first shot at a daily loadavg heuristic:
https://github.com/ceph/ceph/commit/15474124a183c7e92f457f836f7008a2813aa672
I had to guess where it would be best to store the daily_loadavg
member and where to initialize it... please advise.

I took the conservative approach of triggering scrubs when either:
   1m loadavg < osd_scrub_load_threshold, or
   1m loadavg < 24hr loadavg && 1m loadavg < 15m loadavg
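
In other words, the combined check is roughly this (a simplified
restatement with placeholder names, not the actual OSD.cc patch;
loadavgs[0] and loadavgs[2] are the 1-minute and 15-minute system load
averages, and daily_loadavg is a running 24-hour average):

// Allow a scrub if the host is lightly loaded, or if the load is below
// its daily average and currently trending downwards.
bool scrub_load_ok(const double loadavgs[3], double daily_loadavg,
                   double osd_scrub_load_threshold) {
  if (loadavgs[0] < osd_scrub_load_threshold)
    return true;
  return loadavgs[0] < daily_loadavg && loadavgs[0] < loadavgs[2];
}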

The whole PR would become this:
https://github.com/ceph/ceph/compare/master...cernceph:wip-deepscrub-daily

-- Dan


* Re: scrub randomization and load threshold
  2015-11-16 17:06                 ` Dan van der Ster
@ 2015-11-16 17:13                   ` Sage Weil
  2015-11-16 17:30                     ` Dan van der Ster
  0 siblings, 1 reply; 12+ messages in thread
From: Sage Weil @ 2015-11-16 17:13 UTC (permalink / raw)
  To: Dan van der Ster; +Cc: ceph-devel, Herve Rousseau

On Mon, 16 Nov 2015, Dan van der Ster wrote:
> On Mon, Nov 16, 2015 at 4:58 PM, Dan van der Ster <dan@vanderster.com> wrote:
> > On Mon, Nov 16, 2015 at 4:32 PM, Dan van der Ster <dan@vanderster.com> wrote:
> >> On Mon, Nov 16, 2015 at 4:20 PM, Sage Weil <sage@newdream.net> wrote:
> >>> On Mon, 16 Nov 2015, Dan van der Ster wrote:
> >>>> Instead of keeping a 24hr loadavg, how about we allow scrubs whenever
> >>>> the loadavg is decreasing (or below the threshold)? As long as the
> >>>> 1min loadavg is less than the 15min loadavg, we should be ok to allow
> >>>> new scrubs. If you agree I'll add the patch below to my PR.
> >>>
> >>> I like the simplicity of that, I'm afraid its going to just trigger a
> >>> feedback loop and oscillations on the host.  I.e., as soo as we see *any*
> >>> decrease, all osds on the host will start to scrub, which will push the
> >>> load up.  Once that round of PGs finish, the load will start to drop
> >>> again, triggering another round.  This'll happen regardless of whether
> >>> we're in the peak hours or not, and the high-level goal (IMO at least) is
> >>> to do scrubbing in non-peak hours.
> >>
> >> We checked our OSDs' 24hr loadavg plots today and found that the
> >> original idea of 0.8 * 24hr loadavg wouldn't leave many chances for
> >> scrubs to run. So maybe if we used 0.9 or 1.0 it would be doable.
> >>
> >> BTW, I realized there was a silly error in that earlier patch, and we
> >> anyway need an upper bound, say # cpus. So until your response came I
> >> was working with this idea:
> >> https://stikked.web.cern.ch/stikked/view/raw/5586a912
> >
> > Sorry for SSO. Here:
> >
> > https://gist.github.com/dvanders/f3b08373af0f5957f589
> 
> Hi again. Here's a first shot at a daily loadavg heuristic:
> https://github.com/ceph/ceph/commit/15474124a183c7e92f457f836f7008a2813aa672
> I had to guess where it would be best to store the daily_loadavg
> member and where to initialize it... please advise.
> 
> I took the conservative approach of triggering scrubs when either:
>    1m loadavg < osd_scrub_load_threshold, or
>    1m loadavg < 24hr loadavg && 1m loadavg < 15m loadavg
> 
> The whole PR would become this:
> https://github.com/ceph/ceph/compare/master...cernceph:wip-deepscrub-daily

Looks reasonable to me!

I'm still a bit worried that the 1m < 15m thing will mean that on the 
completion of every scrub we have to wait ~1m before the next scrub 
starts.  Maybe that's okay, though... I'd say let's try this and adjust 
that later if it seems problematic (conservative == better).

sage


* Re: scrub randomization and load threshold
  2015-11-16 17:13                   ` Sage Weil
@ 2015-11-16 17:30                     ` Dan van der Ster
  0 siblings, 0 replies; 12+ messages in thread
From: Dan van der Ster @ 2015-11-16 17:30 UTC (permalink / raw)
  To: Sage Weil; +Cc: ceph-devel, Herve Rousseau

On Mon, Nov 16, 2015 at 6:13 PM, Sage Weil <sage@newdream.net> wrote:
> On Mon, 16 Nov 2015, Dan van der Ster wrote:
>> On Mon, Nov 16, 2015 at 4:58 PM, Dan van der Ster <dan@vanderster.com> wrote:
>> > On Mon, Nov 16, 2015 at 4:32 PM, Dan van der Ster <dan@vanderster.com> wrote:
>> >> On Mon, Nov 16, 2015 at 4:20 PM, Sage Weil <sage@newdream.net> wrote:
>> >>> On Mon, 16 Nov 2015, Dan van der Ster wrote:
>> >>>> Instead of keeping a 24hr loadavg, how about we allow scrubs whenever
>> >>>> the loadavg is decreasing (or below the threshold)? As long as the
>> >>>> 1min loadavg is less than the 15min loadavg, we should be ok to allow
>> >>>> new scrubs. If you agree I'll add the patch below to my PR.
>> >>>
>> >>> I like the simplicity of that, I'm afraid its going to just trigger a
>> >>> feedback loop and oscillations on the host.  I.e., as soo as we see *any*
>> >>> decrease, all osds on the host will start to scrub, which will push the
>> >>> load up.  Once that round of PGs finish, the load will start to drop
>> >>> again, triggering another round.  This'll happen regardless of whether
>> >>> we're in the peak hours or not, and the high-level goal (IMO at least) is
>> >>> to do scrubbing in non-peak hours.
>> >>
>> >> We checked our OSDs' 24hr loadavg plots today and found that the
>> >> original idea of 0.8 * 24hr loadavg wouldn't leave many chances for
>> >> scrubs to run. So maybe if we used 0.9 or 1.0 it would be doable.
>> >>
>> >> BTW, I realized there was a silly error in that earlier patch, and we
>> >> anyway need an upper bound, say # cpus. So until your response came I
>> >> was working with this idea:
>> >> https://stikked.web.cern.ch/stikked/view/raw/5586a912
>> >
>> > Sorry for SSO. Here:
>> >
>> > https://gist.github.com/dvanders/f3b08373af0f5957f589
>>
>> Hi again. Here's a first shot at a daily loadavg heuristic:
>> https://github.com/ceph/ceph/commit/15474124a183c7e92f457f836f7008a2813aa672
>> I had to guess where it would be best to store the daily_loadavg
>> member and where to initialize it... please advise.
>>
>> I took the conservative approach of triggering scrubs when either:
>>    1m loadavg < osd_scrub_load_threshold, or
>>    1m loadavg < 24hr loadavg && 1m loadavg < 15m loadavg
>>
>> The whole PR would become this:
>> https://github.com/ceph/ceph/compare/master...cernceph:wip-deepscrub-daily
>
> Looks reasonable to me!
>
> I'm still a bit worried that the 1m < 15m thing will mean that on the
> completion of every scrub we have to wait ~1m before the next scrub
> starts.  Maybe that's okay, though... I'd say let's try this and adjust
> that later if it seems problematic (conservative == better).
>
> sage

Great. I've updated the PR:  https://github.com/ceph/ceph/pull/6550

Cheers, Dan

