* Hammer vs Jewel librbd performance testing and git bisection results
@ 2016-05-11 13:21 Mark Nelson
2016-05-11 13:35 ` Piotr Dałek
[not found] ` <f9f54455-8348-42b7-153b-e38850eed5e1-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
0 siblings, 2 replies; 9+ messages in thread
From: Mark Nelson @ 2016-05-11 13:21 UTC (permalink / raw)
To: ceph-devel, cbt-idqoXFIVOFJgJs9I8MT0rw,
ceph-users-idqoXFIVOFJgJs9I8MT0rw
Hi Guys,
we spent some time over the past week looking at hammer vs jewel RBD
performance in HDD only, HDD+NVMe journal, and NVMe cases with the
default filestore backend. We ran into a number of issues during
testing and I don't want to get into everything, but we were eventually
able to get a good set of data out of fio's bandwidth and latency logs.
Fio's log sampling intervals were not uniform, but Jens has since
written an experimental patch for fio that fixes this, which you can
find in this thread:
http://www.spinics.net/lists/fio/msg04713.html
We ended up writing a parser that works around this by reading
multiple fio bw/latency log files and computing aggregate data even
with non-uniform sample intervals. This was briefly part of CBT,
but was recently included upstream in fio itself. Armed with this, we
were able to get a seemingly accurate view of hammer vs jewel
performance across various IO sizes:
https://docs.google.com/spreadsheets/d/1MK09ZXufTUCgqa9jVJFO-J9oZWMKn7SnKN7NJ45fzTE/edit?usp=sharing
The gist of this is that Jewel is faster than Hammer for many random
workloads (Read, Write, and Mixed). There is one specific case where
performance degrades significantly: 64-128k sequential reads. We
couldn't find anything obviously wrong with these tests, so we spent
some time running git bisects between hammer and jewel with the NVMe
test configuration (these tests were faster to setup/run than the HDD
setup). We tested about 45 different commits with anywhere from 1-5
samples depending on how confident the results looked:
https://docs.google.com/spreadsheets/d/1hbsyNM5pr-ZwBuR7lqnphEd-4kQUid0C9eRyta3ohOA/edit?usp=sharing
There are several commits of interest that have a noticeable effect on
128K sequential read performance:
1) https://github.com/ceph/ceph/commit/3a7b5e3
This commit was the first to introduce anywhere from a 0-10%
performance decrease in the 128K sequential read tests. Primarily it
made performance lower on average and more variable.
2) https://github.com/ceph/ceph/commit/c474ee42
This commit had a very large impact, reducing performance by another 20-25%.
3) https://github.com/ceph/ceph/commit/66e7464
This was a fix that helped regain some of the performance loss due to
c474ee42, but didn't totally reclaim it.
4) 218bc2d - b85a5fe
Between commits 218bc2d and b85a5fe, there's a fair amount of
variability in the test results. It's possible that some of the commits
here are having an effect on performance, but it's difficult to tell.
Might be worth more investigation after other bottlenecks are removed.
5) https://github.com/ceph/ceph/commit/8aae868
The new AioImageRequestWQ appears to be the cause of the most recent
large reduction in 128K sequential read performance.
6) 8aae868 - 6f18f04
There may be some additional small performance impacts in these commits,
though it's difficult to tell which ones since most of the bisects had
to be skipped due to ceph failing to compile.
This is what we know so far, thanks for reading. :)
Mark
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: Hammer vs Jewel librbd performance testing and git bisection results
2016-05-11 13:21 Hammer vs Jewel librbd performance testing and git bisection results Mark Nelson
@ 2016-05-11 13:35 ` Piotr Dałek
2016-05-11 13:49 ` Matt Benjamin
[not found] ` <f9f54455-8348-42b7-153b-e38850eed5e1-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
1 sibling, 1 reply; 9+ messages in thread
From: Piotr Dałek @ 2016-05-11 13:35 UTC (permalink / raw)
To: ceph-devel
On Wed, May 11, 2016 at 08:21:18AM -0500, Mark Nelson wrote:
> Hi Guys,
>
> [..]
> The gist of this is that Jewel is faster than Hammer for many random
> workloads (Read, Write, and Mixed). There is one specific case
> where performance degrades significantly: 64-128k sequential reads.
> We couldn't find anything obviously wrong with these tests, so we
> spent some time running git bisects between hammer and jewel with
> the NVMe test configuration (these tests were faster to setup/run
> than the HDD setup). We tested about 45 different commits with
> anywhere from 1-5 samples depending on how confident the results
> looked:
>
> https://docs.google.com/spreadsheets/d/1hbsyNM5pr-ZwBuR7lqnphEd-4kQUid0C9eRyta3ohOA/edit?usp=sharing
>
> There are several commits of interest that have a noticeable effect
> on 128K sequential read performance:
>
> [..]
> 2) https://github.com/ceph/ceph/commit/c474ee42
>
> This commit had a very large impact, reducing performance by another 20-25%.
https://github.com/ceph/ceph/commit/c474ee42#diff-254555dde8dcfb7fb908791ab8214b92R318
I would check whether temporarily forcing unique_lock_name() to return
its argument (or some other constant) changes things. If so, a more
efficient way to construct the unique lock name may be in order.
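Roughly, the experiment looks like this (a simplified sketch, not the actual Ceph code; the real helper lives in librbd's utility code and its exact signature may differ):

```cpp
#include <sstream>
#include <string>

// Approximation of what the unique_lock_name() helper does: build a
// per-instance lock name by streaming the owner's address into a new
// ostringstream on every call, allocating a stream and a string each
// time -- on the I/O hot path.
std::string unique_lock_name(const std::string &name, void *address) {
  std::ostringstream oss;
  oss << name << " (" << address << ")";  // per-call stream + string alloc
  return oss.str();
}

// The proposed experiment: stub the function to just return its
// argument. If sequential-read performance recovers with this stub,
// the name construction itself is the cost worth optimizing.
std::string unique_lock_name_stub(const std::string &name, void * /*addr*/) {
  return name;
}
```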
--
Piotr Dałek
branch@predictor.org.pl
http://blog.predictor.org.pl
--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
* Re: Hammer vs Jewel librbd performance testing and git bisection results
2016-05-11 13:35 ` Piotr Dałek
@ 2016-05-11 13:49 ` Matt Benjamin
2016-05-11 15:22 ` Piotr Dałek
0 siblings, 1 reply; 9+ messages in thread
From: Matt Benjamin @ 2016-05-11 13:49 UTC (permalink / raw)
To: Piotr Dałek; +Cc: ceph-devel
Hi,
----- Original Message -----
> From: "Piotr Dałek" <branch@predictor.org.pl>
> To: ceph-devel@vger.kernel.org
> Sent: Wednesday, May 11, 2016 9:35:04 AM
> Subject: Re: Hammer vs Jewel librbd performance testing and git bisection results
>
> > https://docs.google.com/spreadsheets/d/1hbsyNM5pr-ZwBuR7lqnphEd-4kQUid0C9eRyta3ohOA/edit?usp=sharing
> >
> > There are several commits of interest that have a noticeable effect
> > on 128K sequential read performance:
> >
> > [..]
> > 2) https://github.com/ceph/ceph/commit/c474ee42
> >
> > This commit had a very large impact, reducing performance by another
> > 20-25%.
>
> https://github.com/ceph/ceph/commit/c474ee42#diff-254555dde8dcfb7fb908791ab8214b92R318
> I would check if temporarily forcing unique_lock_name() to return its arg
> (or other constant) would change things. If so, probably a more efficient way
> to construct unique lock name may be in order.
++
Naively, too, what unique_lock_name does amounts to: 1) creating an extra [small] std::string [better fixed, but not a likely root cause?]; 2) using Utils::stringify to hook the ostream operators of the type passed in, and doing that on a new sstream each time.
Maybe we should look horizontally at either speeding up or finding alternatives to stringify in other places?
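For the sake of discussion, the pattern and one possible cheaper alternative (names are illustrative; the real Utils::stringify and its call sites differ in detail):

```cpp
#include <cstdio>
#include <sstream>
#include <string>

// Roughly what a stringify helper amounts to: construct a fresh
// ostringstream per call just to format one value through its
// operator<<, then copy the result out as a std::string.
template <typename T>
std::string stringify(const T &v) {
  std::ostringstream oss;
  oss << v;
  return oss.str();
}

// One possible cheaper alternative for simple types: format into a
// stack buffer with snprintf, paying at most one string allocation
// and skipping the stream's locale/state machinery entirely.
std::string stringify_int(long v) {
  char buf[24];                        // 64-bit decimal + sign + NUL
  snprintf(buf, sizeof(buf), "%ld", v);
  return std::string(buf);
}
```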
Matt
>
> --
> Piotr Dałek
--
Matt Benjamin
Red Hat, Inc.
315 West Huron Street, Suite 140A
Ann Arbor, Michigan 48103
http://www.redhat.com/en/technologies/storage
tel. 734-707-0660
fax. 734-769-8938
cel. 734-216-5309
* Re: Hammer vs Jewel librbd performance testing and git bisection results
[not found] ` <f9f54455-8348-42b7-153b-e38850eed5e1-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
@ 2016-05-11 13:52 ` Jason Dillaman
2016-05-11 14:07 ` [ceph-users] " Mark Nelson
2016-05-11 14:19 ` [ceph-users] " Haomai Wang
0 siblings, 2 replies; 9+ messages in thread
From: Jason Dillaman @ 2016-05-11 13:52 UTC (permalink / raw)
To: Mark Nelson
Cc: ceph-devel, ceph-users-idqoXFIVOFJgJs9I8MT0rw,
cbt-idqoXFIVOFJgJs9I8MT0rw
Awesome work Mark! Comments / questions inline below:
On Wed, May 11, 2016 at 9:21 AM, Mark Nelson <mnelson-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> wrote:
> There are several commits of interest that have a noticeable effect on 128K
> sequential read performance:
>
>
> 1) https://github.com/ceph/ceph/commit/3a7b5e3
>
> This commit was the first that introduced anywhere from a 0-10% performance
> decrease in the 128K sequential read tests. Primarily it made performance
> lower on average and more variable.
This one is surprising to me since this change is also in Hammer
(cf6e1f50ea7b5c2fd6298be77c06ed4765d66611). When you are performing
the bisect, are you keeping the OSDs at the same version and only
swapping out librbd?
> 2) https://github.com/ceph/ceph/commit/c474ee42
>
> This commit had a very large impact, reducing performance by another 20-25%.
Definitely an area we should optimize given the number of
AioCompletions that are constructed.
> 3) https://github.com/ceph/ceph/commit/66e7464
>
> This was a fix that helped regain some of the performance loss due to
> c474ee42, but didn't totally reclaim it.
Odd -- since that effectively reverted c474ee42 (unique_lock_name)
within the IO path.
> 5) https://github.com/ceph/ceph/commit/8aae868
>
> The new AioImageRequestWQ appears to be the cause of the most recent large
> reduction in 128K sequential read performance.
We will have to investigate this -- AioImageRequestWQ is just a
wrapper around the same work queue used in the Hammer release.
--
Jason
* Re: [ceph-users] Hammer vs Jewel librbd performance testing and git bisection results
2016-05-11 13:52 ` Jason Dillaman
@ 2016-05-11 14:07 ` Mark Nelson
2016-05-11 14:19 ` Jason Dillaman
2016-05-11 14:19 ` [ceph-users] " Haomai Wang
1 sibling, 1 reply; 9+ messages in thread
From: Mark Nelson @ 2016-05-11 14:07 UTC (permalink / raw)
To: dillaman; +Cc: ceph-devel, cbt, ceph-users
On 05/11/2016 08:52 AM, Jason Dillaman wrote:
> Awesome work Mark! Comments / questions inline below:
>
> On Wed, May 11, 2016 at 9:21 AM, Mark Nelson <mnelson@redhat.com> wrote:
>> There are several commits of interest that have a noticeable effect on 128K
>> sequential read performance:
>>
>>
>> 1) https://github.com/ceph/ceph/commit/3a7b5e3
>>
>> This commit was the first that introduced anywhere from a 0-10% performance
>> decrease in the 128K sequential read tests. Primarily it made performance
>> lower on average and more variable.
>
> This one is surprising to me since this change is also in Hammer
> (cf6e1f50ea7b5c2fd6298be77c06ed4765d66611). When you are performing
> the bisect, are you keeping the OSDs at the same version and only
> swapping out librbd?
Nope, I had no idea when trying to track this down whether this was
100% librbd or whether there were other issues at play too, so the OSDs
and librbd are both changing. Having said that, I wouldn't expect there to be any
difference in the OSD code between afb896d and 3a7b5e3.
Given the variability in the results starting with 3a7b5e3, it might be
some kind of secondary effect. The highest performing samples were
still in the same ballpark as pre-3a7b5e3. I guess I would worry less
about this one right now.
>
>> 2) https://github.com/ceph/ceph/commit/c474ee42
>>
>> This commit had a very large impact, reducing performance by another 20-25%.
>
> Definitely an area we should optimize given the number of
> AioCompletions that are constructed.
>
>> 3) https://github.com/ceph/ceph/commit/66e7464
>>
>> This was a fix that helped regain some of the performance loss due to
>> c474ee42, but didn't totally reclaim it.
>
> Odd -- since that effectively reverted c474ee42 (unique_lock_name)
> within the IO path.
Perhaps 0024677 or 3ad19ae introduced another regression that was being
masked by c474ee42, and when 66e7464 improved the situation, the other
regression appeared?
>
>> 5) https://github.com/ceph/ceph/commit/8aae868
>>
>> The new AioImageRequestWQ appears to be the cause of the most recent large
>> reduction in 128K sequential read performance.
>
> We will have to investigate this -- AioImageRequestWQ is just a
> wrapper around the same work queue used in the Hammer release.
>
* Re: [ceph-users] Hammer vs Jewel librbd performance testing and git bisection results
2016-05-11 14:07 ` [ceph-users] " Mark Nelson
@ 2016-05-11 14:19 ` Jason Dillaman
[not found] ` <CA+aFP1DRcM5AwKpo6sU_14-YA764e9ZOHq=qtJWYkFN983O+kQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
0 siblings, 1 reply; 9+ messages in thread
From: Jason Dillaman @ 2016-05-11 14:19 UTC (permalink / raw)
To: Mark Nelson; +Cc: ceph-devel, cbt, ceph-users
On Wed, May 11, 2016 at 10:07 AM, Mark Nelson <mnelson@redhat.com> wrote:
> Perhaps 0024677 or 3ad19ae introduced another regression that was being
> masked by c474ee42, and when 66e7464 improved the situation, the other
> regression appeared?
0024677 is in Hammer as 7004149 and 3ad19ae is in Hammer as b38da480.
I opened two tickets [1] [2] to investigate further. Can you attach
the fio job you used?
[1] http://tracker.ceph.com/issues/15847
[2] http://tracker.ceph.com/issues/15848
Thanks,
--
Jason
* Re: [ceph-users] Hammer vs Jewel librbd performance testing and git bisection results
2016-05-11 13:52 ` Jason Dillaman
2016-05-11 14:07 ` [ceph-users] " Mark Nelson
@ 2016-05-11 14:19 ` Haomai Wang
1 sibling, 0 replies; 9+ messages in thread
From: Haomai Wang @ 2016-05-11 14:19 UTC (permalink / raw)
To: Jason Dillaman; +Cc: Mark Nelson, ceph-devel, ceph-users, cbt
On Wed, May 11, 2016 at 9:52 PM, Jason Dillaman <jdillama@redhat.com> wrote:
> Awesome work Mark! Comments / questions inline below:
>
> On Wed, May 11, 2016 at 9:21 AM, Mark Nelson <mnelson@redhat.com> wrote:
>> There are several commits of interest that have a noticeable effect on 128K
>> sequential read performance:
>>
>>
>> 1) https://github.com/ceph/ceph/commit/3a7b5e3
>>
>> This commit was the first that introduced anywhere from a 0-10% performance
>> decrease in the 128K sequential read tests. Primarily it made performance
>> lower on average and more variable.
>
> This one is surprising to me since this change is also in Hammer
> (cf6e1f50ea7b5c2fd6298be77c06ed4765d66611). When you are performing
> the bisect, are you keeping the OSDs at the same version and only
> swapping out librbd?
>
>> 2) https://github.com/ceph/ceph/commit/c474ee42
>>
>> This commit had a very large impact, reducing performance by another 20-25%.
>
> Definitely an area we should optimize given the number of
> AioCompletions that are constructed.
Previously I talked to Josh about the CPU time consumed by
librbd::AioCompletion. The mutex construction and destruction hurt a lot.
One idea is to create an object pool to cache these, or to add an API
that allows the user to reset an AioCompletion so it can be reused.
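A rough sketch of what such a pool could look like (hypothetical names; "Completion" stands in for librbd::AioCompletion, which carries much more state than this):

```cpp
#include <memory>
#include <mutex>
#include <utility>
#include <vector>

// Stand-in for AioCompletion: the per-I/O cost being discussed is
// constructing/destroying members like this mutex on every request.
struct Completion {
  std::mutex lock;
  int result = 0;
  void reset() { result = 0; }  // the proposed "reuse" API
};

class CompletionPool {
  std::mutex pool_lock;
  std::vector<std::unique_ptr<Completion>> free_list;
public:
  // Hand out a recycled completion when one is available, resetting its
  // state instead of reconstructing it (and its mutex) from scratch.
  std::unique_ptr<Completion> get() {
    std::lock_guard<std::mutex> l(pool_lock);
    if (!free_list.empty()) {
      auto c = std::move(free_list.back());
      free_list.pop_back();
      c->reset();
      return c;
    }
    return std::unique_ptr<Completion>(new Completion);
  }
  void put(std::unique_ptr<Completion> c) {
    std::lock_guard<std::mutex> l(pool_lock);
    free_list.push_back(std::move(c));
  }
};
```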
>
>> 3) https://github.com/ceph/ceph/commit/66e7464
>>
>> This was a fix that helped regain some of the performance loss due to
>> c474ee42, but didn't totally reclaim it.
>
> Odd -- since that effectively reverted c474ee42 (unique_lock_name)
> within the IO path.
>
>> 5) https://github.com/ceph/ceph/commit/8aae868
>>
>> The new AioImageRequestWQ appears to be the cause of the most recent large
>> reduction in 128K sequential read performance.
>
> We will have to investigate this -- AioImageRequestWQ is just a
> wrapper around the same work queue used in the Hammer release.
>
> --
> Jason
* Re: Hammer vs Jewel librbd performance testing and git bisection results
[not found] ` <CA+aFP1DRcM5AwKpo6sU_14-YA764e9ZOHq=qtJWYkFN983O+kQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2016-05-11 14:24 ` Mark Nelson
0 siblings, 0 replies; 9+ messages in thread
From: Mark Nelson @ 2016-05-11 14:24 UTC (permalink / raw)
To: dillaman-H+wXaHxf7aLQT0dZR+AlfA
Cc: ceph-devel, ceph-users-idqoXFIVOFJgJs9I8MT0rw,
cbt-idqoXFIVOFJgJs9I8MT0rw
On 05/11/2016 09:19 AM, Jason Dillaman wrote:
> On Wed, May 11, 2016 at 10:07 AM, Mark Nelson <mnelson-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> wrote:
>> Perhaps 0024677 or 3ad19ae introduced another regression that was being
>> masked by c474ee42, and when 66e7464 improved the situation, the other
>> regression appeared?
>
> 0024677 is in Hammer as 7004149 and 3ad19ae is in Hammer as b38da480.
> I opened two tickets [1] [2] to investigate further. Can you attach
> the fio job you used?
>
> [1] http://tracker.ceph.com/issues/15847
> [2] http://tracker.ceph.com/issues/15848
I can't give you a job file, but I can give you the command line
parameters used in the test:
/home/ubuntu/src/fio/fio --ioengine=rbd --clientname=admin
--pool=cbt-librbdfio --rbdname=cbt-librbdfio-`hostname -f`-0
--invalidate=0 --rw=read --runtime=300 --ramp_time=None --numjobs=1
--direct=1 --bs=131072B --iodepth=32 --end_fsync=0
--write_iops_log=/tmp/cbt/00000000/LibrbdFio/osd_ra-00004096/op_size-00131072/concurrent_procs-008/iodepth-032/read/output.0
--write_bw_log=/tmp/cbt/00000000/LibrbdFio/osd_ra-00004096/op_size-00131072/concurrent_procs-008/iodepth-032/read/output.0
--write_lat_log=/tmp/cbt/00000000/LibrbdFio/osd_ra-00004096/op_size-00131072/concurrent_procs-008/iodepth-032/read/output.0
--log_avg_msec=100 --name=librbdfio-`hostname -f`-0 >
/tmp/cbt/00000000/LibrbdFio/osd_ra-00004096/op_size-00131072/concurrent_procs-008/iodepth-032/read/output.0
Two of these were run concurrently on 4 clients (8 volumes total).
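For reference, the equivalent as a job file would look roughly like this (reconstructed by hand from the command line above, not the file CBT actually generated; the hostname and log paths are shortened placeholders, and ramp_time is omitted since the CLI passed the non-value "None"):

```ini
[global]
ioengine=rbd
clientname=admin
pool=cbt-librbdfio
invalidate=0
rw=read
runtime=300
numjobs=1
direct=1
bs=131072
iodepth=32
end_fsync=0
log_avg_msec=100

[librbdfio-client-0]
rbdname=cbt-librbdfio-client-0
write_iops_log=output.0
write_bw_log=output.0
write_lat_log=output.0
```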
>
> Thanks,
>
* Re: Hammer vs Jewel librbd performance testing and git bisection results
2016-05-11 13:49 ` Matt Benjamin
@ 2016-05-11 15:22 ` Piotr Dałek
0 siblings, 0 replies; 9+ messages in thread
From: Piotr Dałek @ 2016-05-11 15:22 UTC (permalink / raw)
To: Matt Benjamin; +Cc: ceph-devel
On Wed, May 11, 2016 at 09:49:20AM -0400, Matt Benjamin wrote:
> > > https://docs.google.com/spreadsheets/d/1hbsyNM5pr-ZwBuR7lqnphEd-4kQUid0C9eRyta3ohOA/edit?usp=sharing
> > >
> > > There are several commits of interest that have a noticeable effect
> > > on 128K sequential read performance:
> > >
> > > [..]
> > > 2) https://github.com/ceph/ceph/commit/c474ee42
> > >
> > > This commit had a very large impact, reducing performance by another
> > > 20-25%.
> >
> > https://github.com/ceph/ceph/commit/c474ee42#diff-254555dde8dcfb7fb908791ab8214b92R318
> > I would check if temporarily forcing unique_lock_name() to return its arg
> > (or other constant) would change things. If so, probably a more efficient way
> > to construct unique lock name may be in order.
>
> ++
>
> Naively, too, what unique_lock_name does amounts to: 1) creating an extra [small] std::string [better fixed, but not a likely root cause?]; 2) using Utils::stringify to hook the ostream operators of the type passed in, and doing that on a new sstream each time.
I don't think "stringify" alone is the source of the problems. It looks
like a convenience function to make object dumping easier, so there's no
real point in changing it; rather...
> Maybe we should look horizontally at either speeding up or finding alternatives to stringify in other places?
... its usage should be limited to non-performance-critical paths.
In this particular case, checking for lockdep and refusing to act when
it is disabled is probably the least intrusive way to fix things.
--
Piotr Dałek
branch@predictor.org.pl
http://blog.predictor.org.pl