RBD readahead strategies
From: Adam Crume @ 2014-09-10 22:36 UTC
To: ceph-devel
I've been testing a few strategies for RBD readahead and wanted to
share my results as well as ask for input.
I have four sample workloads that I replayed at maximum speed with
rbd-replay. boot-ide and boot-virtio are captured from booting a VM
with the image on the IDE and virtio buses, respectively. Likewise,
grep-ide and grep-virtio are captured from a large grep run. (I'm not
entirely sure why the IDE and virtio workloads are different, but part
of it is the number of pending requests allowed.)
The readahead strategies are:
- none: No readahead.
- plain: My initial implementation. The readahead window doubles for
each readahead request, up to a limit, and resets when a random
request is detected.
- aligned: Same as above, but readahead requests are aligned with
object boundaries, when possible.
- eager: When activated, read to the end of the object.
For all of these, 10 sequential requests trigger readahead, the
maximum readahead size is 4 MB, and "rbd readahead disable after
bytes" is disabled (meaning that readahead is enabled for the entire
workload). The object size is the default 4 MB, and data is striped
over a single object. (Alignment with stripes or object sets is
ignored for now.)
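The window logic above can be sketched roughly like this. This is a toy Python model, not the actual librbd code; the class, the constant names, and the exact doubling/alignment details are my own illustration of the three strategies:

```python
# Illustrative sketch only -- not the librbd implementation.
TRIGGER_REQUESTS = 10             # sequential reads before readahead kicks in
MAX_READAHEAD = 4 * 1024 * 1024   # 4 MB window cap
OBJECT_SIZE = 4 * 1024 * 1024     # default RBD object size

class ReadaheadSketch:
    def __init__(self, strategy):
        self.strategy = strategy   # 'none', 'plain', 'aligned', or 'eager'
        self.sequential = 0        # consecutive sequential reads seen
        self.next_offset = None    # offset we expect the next read at
        self.window = 0            # current readahead window size

    def update(self, offset, length):
        """Feed one application read; return (ra_offset, ra_length) or None."""
        if self.strategy == 'none':
            return None
        if offset == self.next_offset:
            self.sequential += 1
        else:
            # Random request detected: reset the window.
            self.sequential = 1
            self.window = 0
        self.next_offset = offset + length
        if self.sequential < TRIGGER_REQUESTS:
            return None
        ra_offset = self.next_offset
        if self.strategy == 'eager':
            # When activated, read to the end of the current object.
            return (ra_offset, OBJECT_SIZE - (ra_offset % OBJECT_SIZE))
        # 'plain' and 'aligned': double the window each time, up to the cap.
        self.window = min(max(self.window * 2, length), MAX_READAHEAD)
        ra_length = self.window
        if self.strategy == 'aligned':
            # Trim the request so it ends on an object boundary when possible.
            aligned_end = ((ra_offset + ra_length) // OBJECT_SIZE) * OBJECT_SIZE
            if aligned_end > ra_offset:
                ra_length = aligned_end - ra_offset
        return (ra_offset, ra_length)
```

For example, a stream of 512-byte sequential reads produces no readahead for the first nine requests, then a 512-byte readahead that doubles on each subsequent sequential read.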
Here's the data:
workload     strategy  time (seconds)    RA ops  RA MB  read ops  read MB
boot-ide     none       46.22 +/- 0.41        0      0     57516      407
boot-ide     plain      11.42 +/- 0.25      281    203     57516      407
boot-ide     aligned    11.46 +/- 0.13      276    201     57516      407
boot-ide     eager      12.48 +/- 0.61      111    303     57516      407
boot-virtio  none        9.05 +/- 0.25        0      0     11851      393
boot-virtio  plain       8.05 +/- 0.38      451    221     11851      393
boot-virtio  aligned     7.86 +/- 0.27      452    213     11851      393
boot-virtio  eager       9.17 +/- 0.34      249    600     11851      393
grep-ide     none      138.55 +/- 1.67        0      0    130104     3044
grep-ide     plain     136.07 +/- 1.57      397    867    130104     3044
grep-ide     aligned   137.30 +/- 1.77      379    844    130104     3044
grep-ide     eager     138.77 +/- 1.52      346    993    130104     3044
grep-virtio  none      120.73 +/- 1.33        0      0    130061     2820
grep-virtio  plain     121.29 +/- 1.28     1186   1485    130061     2820
grep-virtio  aligned   123.32 +/- 1.29     1139   1409    130061     2820
grep-virtio  eager     127.75 +/- 1.32      842   2218    130061     2820
(The time is the mean wall-clock time +/- the margin of error with
99.7% confidence. RA=readahead.)
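For reference, a 99.7% margin of error is just three standard errors of the mean (the 3-sigma rule for normally distributed run times). A minimal sketch of the computation, with made-up run times rather than data from the runs above:

```python
# Toy example of the reported statistic; the sample values are illustrative.
import math
import statistics

def mean_with_margin(samples):
    """Return (mean, margin of error at ~99.7% confidence)."""
    mean = statistics.mean(samples)
    stderr = statistics.stdev(samples) / math.sqrt(len(samples))
    return mean, 3 * stderr  # 99.7% ~= +/- 3 standard errors

times = [11.2, 11.5, 11.4, 11.6, 11.3]  # illustrative run times in seconds
m, moe = mean_with_margin(times)
```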
Right off the bat, readahead is a huge improvement for the boot-ide
workload, which is no surprise because it issues 50,000 sequential,
single-sector reads. (Why the early boot process is so inefficient is
open for speculation, but that's a real, natural workload.)
boot-virtio also sees an improvement, although not nearly so dramatic.
The grep workloads show no statistically significant improvement.
One conclusion I draw is that 'eager' is, well, too eager. 'aligned'
shows no statistically significant difference from 'plain', and
'plain' is no worse than 'none' (at statistically significant levels)
and sometimes better.
Should the readahead strategy be configurable, or should we just stick
with whichever seems the best one? Is there anything big I'm missing?
Adam
Re: RBD readahead strategies
From: Sage Weil @ 2014-09-12 3:05 UTC
To: Adam Crume; +Cc: ceph-devel
On Wed, 10 Sep 2014, Adam Crume wrote:
> I've been testing a few strategies for RBD readahead and wanted to
> share my results as well as ask for input.
>
> I have four sample workloads that I replayed at maximum speed with
> rbd-replay. boot-ide and boot-virtio are captured from booting a VM
> with the image on the IDE and virtio buses, respectively. Likewise,
> grep-ide and grep-virtio are captured from a large grep run. (I'm not
> entirely sure why the IDE and virtio workloads are different, but part
> of it is the number of pending requests allowed.)
>
> The readahead strategies are:
> - none: No readahead.
> - plain: My initial implementation. The readahead window doubles for
> each readahead request, up to a limit, and resets when a random
> request is detected.
> - aligned: Same as above, but readahead requests are aligned with
> object boundaries, when possible.
> - eager: When activated, read to the end of the object.
>
> For all of these, 10 sequential requests trigger readahead, the
> maximum readahead size is 4 MB, and "rbd readahead disable after
> bytes" is disabled (meaning that readahead is enabled for the entire
> workload). The object size is the default 4 MB, and data is striped
> over a single object. (Alignment with stripes or object sets is
> ignored for now.)
>
> Here's the data:
>
> workload     strategy  time (seconds)    RA ops  RA MB  read ops  read MB
> boot-ide     none       46.22 +/- 0.41        0      0     57516      407
> boot-ide     plain      11.42 +/- 0.25      281    203     57516      407
> boot-ide     aligned    11.46 +/- 0.13      276    201     57516      407
> boot-ide     eager      12.48 +/- 0.61      111    303     57516      407
> boot-virtio  none        9.05 +/- 0.25        0      0     11851      393
> boot-virtio  plain       8.05 +/- 0.38      451    221     11851      393
> boot-virtio  aligned     7.86 +/- 0.27      452    213     11851      393
> boot-virtio  eager       9.17 +/- 0.34      249    600     11851      393
> grep-ide     none      138.55 +/- 1.67        0      0    130104     3044
> grep-ide     plain     136.07 +/- 1.57      397    867    130104     3044
> grep-ide     aligned   137.30 +/- 1.77      379    844    130104     3044
> grep-ide     eager     138.77 +/- 1.52      346    993    130104     3044
> grep-virtio  none      120.73 +/- 1.33        0      0    130061     2820
> grep-virtio  plain     121.29 +/- 1.28     1186   1485    130061     2820
> grep-virtio  aligned   123.32 +/- 1.29     1139   1409    130061     2820
> grep-virtio  eager     127.75 +/- 1.32      842   2218    130061     2820
>
> (The time is the mean wall-clock time +/- the margin of error with
> 99.7% confidence. RA=readahead.)
>
> Right off the bat, readahead is a huge improvement for the boot-ide
> workload, which is no surprise because it issues 50,000 sequential,
> single-sector reads. (Why the early boot process is so inefficient is
> open for speculation, but that's a real, natural workload.)
> boot-virtio also sees an improvement, although not nearly so dramatic.
> The grep workloads show no statistically significant improvement.
>
> One conclusion I draw is that 'eager' is, well, too eager. 'aligned'
> shows no statistically significant difference from 'plain', and
> 'plain' is no worse than 'none' (at statistically significant levels)
> and sometimes better.
>
> Should the readahead strategy be configurable, or should we just stick
> with whichever seems the best one? Is there anything big I'm missing?
Aligned seems like, even if it is no faster from the client's perspective,
it will result in fewer IOs on the backend, right? That makes me think we
should go with that if we have to choose one.
Have you looked at what it might take to put the readahead logic in
ObjectCacher somewhere, or in some other piece of shared code that would
allow us to subsume the Client.cc readahead code as well? Perhaps simply
wrapping the readahead logic in a single class, such that the calling code
is super simple (it just feeds in the current offset and conditionally
issues a readahead IO), would work as well.
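Something like this shape, say. The names (ReadaheadPolicy stand-in, issue_read, on_application_read) are hypothetical, just to illustrate how thin the call site could be:

```python
# Hypothetical sketch of a thin call site around a shared readahead class.
class FixedPolicy:
    """Trivial stand-in policy: always suggest reading the next `window` bytes."""
    def __init__(self, window):
        self.window = window

    def update(self, offset, length):
        # A real policy would track sequentiality; this one always fires.
        return (offset + length, self.window)

def on_application_read(policy, issue_read, offset, length):
    issue_read(offset, length)            # serve the application's read
    hint = policy.update(offset, length)  # shared policy decides about readahead
    if hint is not None:
        issue_read(*hint)                 # conditionally issue the readahead IO
```

The caller never sees the windowing state; swapping strategies would then be a matter of swapping policy objects.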
sage