From: Adam Crume
Date: 2014-09-10 22:36 UTC
To: ceph-devel
Subject: RBD readahead strategies

I've been testing a few strategies for RBD readahead and wanted to
share my results as well as ask for input.

I have four sample workloads that I replayed at maximum speed with
rbd-replay.  boot-ide and boot-virtio are captured from booting a VM
with the image on the IDE and virtio buses, respectively.  Likewise,
grep-ide and grep-virtio are captured from a large grep run.  (I'm not
entirely sure why the IDE and virtio workloads are different, but part
of it is the number of pending requests allowed.)

The readahead strategies are:
- none: No readahead.
- plain: My initial implementation.  The readahead window doubles for
each readahead request, up to a limit, and resets when a random
request is detected.
- aligned: Same as above, but readahead requests are aligned with
object boundaries, when possible.
- eager: When activated, read to the end of the object.
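
To make 'plain' and 'aligned' concrete, here is a rough sketch of the
window logic.  This is not the actual librbd code; the struct and field
names are made up for illustration:

// Rough sketch of the 'plain'/'aligned' window logic.  Not the actual
// librbd implementation; names and structure are illustrative only.
#include <algorithm>
#include <cstdint>
#include <utility>

struct ReadaheadState {
  uint64_t trigger_requests = 10;     // sequential reads before readahead starts
  uint64_t max_readahead = 4 << 20;   // 4 MB cap on the window
  uint64_t object_size = 4 << 20;     // default RBD object size
  bool align_to_objects = false;      // false = 'plain', true = 'aligned'

  uint64_t next_offset = 0;           // offset we expect the next read at
  uint64_t sequential_count = 0;      // consecutive sequential reads seen
  uint64_t readahead_pos = 0;         // end of the region already read ahead
  uint64_t window = 0;                // current readahead window size

  // Feed each incoming read; returns {offset, length} of the readahead
  // request to issue, with length == 0 meaning "none".
  std::pair<uint64_t, uint64_t> update(uint64_t off, uint64_t len) {
    if (off != next_offset) {         // random request: reset the window
      sequential_count = 0;
      window = 0;
      readahead_pos = 0;
      next_offset = off + len;
      return {0, 0};
    }
    next_offset = off + len;
    if (++sequential_count < trigger_requests)
      return {0, 0};
    // Double the window on each readahead request, up to the limit.
    window = std::min(std::max<uint64_t>(window * 2, len), max_readahead);
    uint64_t ra_off = std::max(next_offset, readahead_pos);
    uint64_t ra_len = window;
    if (align_to_objects) {
      // Trim so the readahead does not cross the next object boundary.
      uint64_t boundary = (ra_off / object_size + 1) * object_size;
      ra_len = std::min(ra_len, boundary - ra_off);
    }
    readahead_pos = ra_off + ra_len;
    return {ra_off, ra_len};
  }
};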

For all of these, 10 sequential requests trigger readahead, the
maximum readahead size is 4 MB, and "rbd readahead disable after
bytes" is disabled (meaning that readahead is enabled for the entire
workload).  The object size is the default 4 MB, and data is striped
over a single object.  (Alignment with stripes or object sets is
ignored for now.)
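
For reference, that setup corresponds roughly to a ceph.conf like the
one below.  Only "rbd readahead disable after bytes" is named above; the
other two option names are my best guess from their descriptions, so
treat this as a sketch:

[client]
    rbd readahead trigger requests = 10     # sequential reads before readahead kicks in
    rbd readahead max bytes = 4194304       # 4 MB cap on a single readahead request
    rbd readahead disable after bytes = 0   # assuming 0 means "never disable"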

Here's the data:

workload      strategy   time (seconds)   RA ops   RA MB   read ops   read MB
boot-ide      none       46.22 +/- 0.41        0       0      57516       407
boot-ide      plain      11.42 +/- 0.25      281     203      57516       407
boot-ide      aligned    11.46 +/- 0.13      276     201      57516       407
boot-ide      eager      12.48 +/- 0.61      111     303      57516       407
boot-virtio   none        9.05 +/- 0.25        0       0      11851       393
boot-virtio   plain       8.05 +/- 0.38      451     221      11851       393
boot-virtio   aligned     7.86 +/- 0.27      452     213      11851       393
boot-virtio   eager       9.17 +/- 0.34      249     600      11851       393
grep-ide      none      138.55 +/- 1.67        0       0     130104      3044
grep-ide      plain     136.07 +/- 1.57      397     867     130104      3044
grep-ide      aligned   137.30 +/- 1.77      379     844     130104      3044
grep-ide      eager     138.77 +/- 1.52      346     993     130104      3044
grep-virtio   none      120.73 +/- 1.33        0       0     130061      2820
grep-virtio   plain     121.29 +/- 1.28     1186    1485     130061      2820
grep-virtio   aligned   123.32 +/- 1.29     1139    1409     130061      2820
grep-virtio   eager     127.75 +/- 1.32      842    2218     130061      2820

(The time is the mean wall-clock time +/- the margin of error with
99.7% confidence.  RA=readahead.)

Right off the bat, readahead is a huge improvement for the boot-ide
workload, which is no surprise because it issues 50,000 sequential,
single-sector reads.  (Why the early boot process is so inefficient is
open for speculation, but that's a real, natural workload.)
boot-virtio also sees an improvement, although not nearly so dramatic.
The grep workloads show no statistically significant improvement.

One conclusion I draw is that 'eager' is, well, too eager.  'aligned'
shows no statistically significant difference from 'plain', and
'plain' is no worse than 'none' (at statistically significant levels)
and sometimes better.

Should the readahead strategy be configurable, or should we just stick
with whichever seems the best one?  Is there anything big I'm missing?

Adam


From: Sage Weil
Date: 2014-09-12 3:05 UTC
To: Adam Crume
Cc: ceph-devel
Subject: Re: RBD readahead strategies

On Wed, 10 Sep 2014, Adam Crume wrote:
> [...]
> One conclusion I draw is that 'eager' is, well, too eager.  'aligned'
> shows no statistically significant difference from 'plain', and
> 'plain' is no worse than 'none' (at statistically significant levels)
> and sometimes better.
> 
> Should the readahead strategy be configurable, or should we just stick
> with whichever seems the best one?  Is there anything big I'm missing?

It seems like aligned, even if it is no faster from the client's 
perspective, will result in fewer IOs on the backend, right?  That makes 
me think we should go with that if we have to choose one.

Have you looked at what it might take to put the readahead logic in 
ObjectCacher somewhere, or in some other piece of shared code that would 
allow us to subsume the Client.cc readahead code as well?  Perhaps simply 
wrapping the readahead logic in a single class such that the calling code 
is super simple (just feeds in current offset and conditionally issues a 
readahead IO) would work as well.
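
Something like this minimal interface, maybe (just a sketch; the class 
and method names are made up, not existing Ceph code):

#include <cstdint>
#include <utility>

// Shared readahead helper; the window tracking inside update() would be
// along the lines of the doubling logic discussed earlier in the thread.
class Readahead {
 public:
  // Feed every incoming read.  Returns {offset, length} of a suggested
  // readahead IO; length == 0 means "don't read ahead".
  std::pair<uint64_t, uint64_t> update(uint64_t offset, uint64_t length);

  // Notify the helper that a readahead IO finished, so it can track what
  // is already cached or in flight and avoid overlapping requests.
  void readahead_complete(uint64_t offset, uint64_t length);
};

// The caller (librbd, Client.cc, ...) would then just do something like:
//
//   std::pair<uint64_t, uint64_t> ra = readahead.update(off, len);
//   if (ra.second > 0)
//     issue_async_read(ra.first, ra.second);  // result lands in the cache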

sage
