* Hammer vs Jewel librbd performance testing and git bisection results
@ 2016-05-11 13:21 Mark Nelson
  2016-05-11 13:35 ` Piotr Dałek
       [not found] ` <f9f54455-8348-42b7-153b-e38850eed5e1-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
  0 siblings, 2 replies; 9+ messages in thread
From: Mark Nelson @ 2016-05-11 13:21 UTC (permalink / raw)
  To: ceph-devel, cbt-idqoXFIVOFJgJs9I8MT0rw,
	ceph-users-idqoXFIVOFJgJs9I8MT0rw

Hi Guys,

We spent some time over the past week looking at hammer vs jewel RBD 
performance in HDD only, HDD+NVMe journal, and NVMe cases with the 
default filestore backend.  We ran into a number of issues during 
testing that I won't get into here, but we were eventually able to get 
a good set of data out of fio's bandwidth and latency logs.  Fio's log 
sampling intervals were not uniform, but Jens has since written an 
experimental patch for fio that fixes this, which you can find in this 
thread:

http://www.spinics.net/lists/fio/msg04713.html

We ended up writing a parser that works around this by reading 
multiple fio bw/latency log files and computing aggregate data even 
when the sample intervals are non-uniform.  This was briefly part of 
CBT, but was recently included upstream in fio itself.  Armed with 
this, we were able to get a seemingly accurate view of hammer vs jewel 
performance across various IO sizes:

https://docs.google.com/spreadsheets/d/1MK09ZXufTUCgqa9jVJFO-J9oZWMKn7SnKN7NJ45fzTE/edit?usp=sharing
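
For anyone curious about the aggregation itself, the core idea is just to
weight each sample by the length of the interval it covers.  Below is a
minimal, hypothetical sketch of that calculation (not the actual CBT/fio
parser; the Sample layout is made up for illustration):

// Hypothetical sketch: interval-weighted averaging of fio bw log samples
// whose timestamps are not evenly spaced.  Not the real parser.
#include <cstdint>
#include <iostream>
#include <vector>

struct Sample {
  uint64_t msec;   // timestamp of the sample (ms since start of run)
  double value;    // bandwidth reported for the interval ending here (KB/s)
};

// Weight each sample by the duration it covers so that short and long
// intervals contribute proportionally to the aggregate.
double weighted_average(const std::vector<Sample> &log) {
  double weighted_sum = 0.0, total_time = 0.0;
  uint64_t prev = 0;
  for (const Sample &s : log) {
    double dt = static_cast<double>(s.msec - prev);
    weighted_sum += s.value * dt;
    total_time += dt;
    prev = s.msec;
  }
  return total_time > 0.0 ? weighted_sum / total_time : 0.0;
}

int main() {
  // Non-uniform intervals: 100ms, 250ms, 50ms.
  std::vector<Sample> log = {{100, 400.0}, {350, 380.0}, {400, 420.0}};
  std::cout << weighted_average(log) << " KB/s\n";  // prints 390
}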

The gist of this is that Jewel is faster than Hammer for many random 
workloads (Read, Write, and Mixed).  There is one specific case where 
performance degrades significantly: 64-128k sequential reads.  We 
couldn't find anything obviously wrong with these tests, so we spent 
some time running git bisects between hammer and jewel with the NVMe 
test configuration (these tests were faster to set up and run than the 
HDD setup).  We tested about 45 different commits with anywhere from 
1-5 samples depending on how confident the results looked:

https://docs.google.com/spreadsheets/d/1hbsyNM5pr-ZwBuR7lqnphEd-4kQUid0C9eRyta3ohOA/edit?usp=sharing

There are several commits of interest that have a noticeable effect on 
128K sequential read performance:


1) https://github.com/ceph/ceph/commit/3a7b5e3

This commit was the first that introduced anywhere from a 0-10% 
performance decrease in the 128K sequential read tests.  Primarily it 
made performance lower on average and more variable.


2) https://github.com/ceph/ceph/commit/c474ee42

This commit had a very large impact, reducing performance by another 20-25%.


3) https://github.com/ceph/ceph/commit/66e7464

This was a fix that helped regain some of the performance loss due to 
c474ee42, but didn't totally reclaim it.


4) 218bc2d - b85a5fe

Between commits 218bc2d and b85a5fe, there's a fair amount of 
variability in the test results.  It's possible that some of the commits 
here are having an effect on performance, but it's difficult to tell. 
Might be worth more investigation after other bottlenecks are removed.


5) https://github.com/ceph/ceph/commit/8aae868

The new AioImageRequestWQ appears to be the cause of the most recent 
large reduction in 128K sequential read performance.


6) 8aae868 - 6f18f04

There may be some additional small performance impacts in these commits, 
though it's difficult to tell which ones since most of the bisects had 
to be skipped due to ceph failing to compile.

This is what we know so far.  Thanks for reading. :)

Mark


* Re: Hammer vs Jewel librbd performance testing and git bisection results
  2016-05-11 13:21 Hammer vs Jewel librbd performance testing and git bisection results Mark Nelson
@ 2016-05-11 13:35 ` Piotr Dałek
  2016-05-11 13:49   ` Matt Benjamin
       [not found] ` <f9f54455-8348-42b7-153b-e38850eed5e1-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
  1 sibling, 1 reply; 9+ messages in thread
From: Piotr Dałek @ 2016-05-11 13:35 UTC (permalink / raw)
  To: ceph-devel

On Wed, May 11, 2016 at 08:21:18AM -0500, Mark Nelson wrote:
> Hi Guys,
> 
> [..]
> The gist of this is that Jewel is faster than Hammer for many random
> workloads (Read, Write, and Mixed).  There is one specific case
> where performance degrades significantly: 64-128k sequential reads.
> We couldn't find anything obviously wrong with these tests, so we
> spent some time running git bisects between hammer and jewel with
> the NVMe test configuration (these tests were faster to set up and
> run than the HDD setup).  We tested about 45 different commits with
> anywhere from 1-5 samples depending on how confident the results
> looked:
> 
> https://docs.google.com/spreadsheets/d/1hbsyNM5pr-ZwBuR7lqnphEd-4kQUid0C9eRyta3ohOA/edit?usp=sharing
> 
> There are several commits of interest that have a noticeable effect
> on 128K sequential read performance:
> 
> [..]
> 2) https://github.com/ceph/ceph/commit/c474ee42
> 
> This commit had a very large impact, reducing performance by another 20-25%.

https://github.com/ceph/ceph/commit/c474ee42#diff-254555dde8dcfb7fb908791ab8214b92R318
I would check whether temporarily forcing unique_lock_name() to return its
arg (or some other constant) changes things.  If so, a more efficient way
to construct the unique lock name may be in order.
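
Something like the following is what I have in mind for the quick test; a
rough sketch only, assuming a unique_lock_name(name, address) helper (the
exact signature in librbd may differ):

// Rough sketch of the experiment: short-circuit the per-instance name and
// see whether the 128K sequential read regression goes away.  Assumes a
// unique_lock_name(name, address) helper; the real librbd code may differ.
#include <sstream>
#include <string>

std::string unique_lock_name(const std::string &name, void *address) {
#if 1  // temporary experiment: skip the per-instance suffix entirely
  (void)address;
  return name;
#else  // roughly the current behaviour: append the object address
  std::ostringstream oss;
  oss << name << " (" << address << ")";
  return oss.str();
#endif
}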

-- 
Piotr Dałek
branch@predictor.org.pl
http://blog.predictor.org.pl


* Re: Hammer vs Jewel librbd performance testing and git bisection results
  2016-05-11 13:35 ` Piotr Dałek
@ 2016-05-11 13:49   ` Matt Benjamin
  2016-05-11 15:22     ` Piotr Dałek
  0 siblings, 1 reply; 9+ messages in thread
From: Matt Benjamin @ 2016-05-11 13:49 UTC (permalink / raw)
  To: Piotr Dałek; +Cc: ceph-devel

Hi,

----- Original Message -----
> From: "Piotr Dałek" <branch@predictor.org.pl>
> To: ceph-devel@vger.kernel.org
> Sent: Wednesday, May 11, 2016 9:35:04 AM
> Subject: Re: Hammer vs Jewel librbd performance testing and git bisection results
> 
> > https://docs.google.com/spreadsheets/d/1hbsyNM5pr-ZwBuR7lqnphEd-4kQUid0C9eRyta3ohOA/edit?usp=sharing
> > 
> > There are several commits of interest that have a noticeable effect
> > on 128K sequential read performance:
> > 
> > [..]
> > 2) https://github.com/ceph/ceph/commit/c474ee42
> > 
> > This commit had a very large impact, reducing performance by another
> > 20-25%.
> 
> https://github.com/ceph/ceph/commit/c474ee42#diff-254555dde8dcfb7fb908791ab8214b92R318
> I would check if temporarily forcing unique_lock_name() to return its arg
> (or other constant) would change things. If so, probably a more efficient way
> to construct unique lock name may be in order.

++

Naively, too, what unique_lock_name is doing amounts to:  1) creating an extra [small] std::string [better fixed, but not a likely root cause?], and 2) using Utils::stringify to hook the ostream operators of the type passed in, doing that on a new sstream.

Maybe we should look horizontally at either speeding up or finding alternatives to stringify in other places?
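
For reference, a stringify-style helper usually boils down to something
like the generic sketch below (not necessarily the exact Ceph
implementation), i.e. an ostringstream plus a fresh std::string on every
call:

// Generic sketch of a stringify-style helper (not necessarily the exact
// Ceph code): every call constructs an ostringstream, streams the value
// through operator<<, and copies the result into a new std::string.
#include <sstream>
#include <string>

template <typename T>
std::string stringify(const T &v) {
  std::ostringstream oss;
  oss << v;          // hooks whatever operator<< the type provides
  return oss.str();  // copies the buffer into a freshly allocated string
}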

Matt

> 
> --
> Piotr Dałek


-- 
Matt Benjamin
Red Hat, Inc.
315 West Huron Street, Suite 140A
Ann Arbor, Michigan 48103

http://www.redhat.com/en/technologies/storage

tel.  734-707-0660
fax.  734-769-8938
cel.  734-216-5309


* Re: Hammer vs Jewel librbd performance testing and git bisection results
       [not found] ` <f9f54455-8348-42b7-153b-e38850eed5e1-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
@ 2016-05-11 13:52   ` Jason Dillaman
  2016-05-11 14:07     ` [ceph-users] " Mark Nelson
  2016-05-11 14:19     ` [ceph-users] " Haomai Wang
  0 siblings, 2 replies; 9+ messages in thread
From: Jason Dillaman @ 2016-05-11 13:52 UTC (permalink / raw)
  To: Mark Nelson
  Cc: ceph-devel, ceph-users-idqoXFIVOFJgJs9I8MT0rw,
	cbt-idqoXFIVOFJgJs9I8MT0rw

Awesome work Mark!  Comments / questions inline below:

On Wed, May 11, 2016 at 9:21 AM, Mark Nelson <mnelson-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> wrote:
> There are several commits of interest that have a noticeable effect on 128K
> sequential read performance:
>
>
> 1) https://github.com/ceph/ceph/commit/3a7b5e3
>
> This commit was the first that introduced anywhere from a 0-10% performance
> decrease in the 128K sequential read tests.  Primarily it made performance
> lower on average and more variable.

This one is surprising to me since this change is also in Hammer
(cf6e1f50ea7b5c2fd6298be77c06ed4765d66611).  When you are performing
the bisect, are you keeping the OSDs at the same version and only
swapping out librbd?

> 2) https://github.com/ceph/ceph/commit/c474ee42
>
> This commit had a very large impact, reducing performance by another 20-25%.

Definitely an area we should optimize given the number of
AioCompletions that are constructed.

> 3) https://github.com/ceph/ceph/commit/66e7464
>
> This was a fix that helped regain some of the performance loss due to
> c474ee42, but didn't totally reclaim it.

Odd -- since that effectively reverted c474ee42 (unique_lock_name)
within the IO path.

> 5) https://github.com/ceph/ceph/commit/8aae868
>
> The new AioImageRequestWQ appears to be the cause of the most recent large
> reduction in 128K sequential read performance.

We will have to investigate this -- AioImageRequestWQ is just a
wrapper around the same work queue used in the Hammer release.

-- 
Jason


* Re: [ceph-users] Hammer vs Jewel librbd performance testing and git bisection results
  2016-05-11 13:52   ` Jason Dillaman
@ 2016-05-11 14:07     ` Mark Nelson
  2016-05-11 14:19       ` Jason Dillaman
  2016-05-11 14:19     ` [ceph-users] " Haomai Wang
  1 sibling, 1 reply; 9+ messages in thread
From: Mark Nelson @ 2016-05-11 14:07 UTC (permalink / raw)
  To: dillaman; +Cc: ceph-devel, cbt, ceph-users



On 05/11/2016 08:52 AM, Jason Dillaman wrote:
> Awesome work Mark!  Comments / questions inline below:
>
> On Wed, May 11, 2016 at 9:21 AM, Mark Nelson <mnelson@redhat.com> wrote:
>> There are several commits of interest that have a noticeable effect on 128K
>> sequential read performance:
>>
>>
>> 1) https://github.com/ceph/ceph/commit/3a7b5e3
>>
>> This commit was the first that introduced anywhere from a 0-10% performance
>> decrease in the 128K sequential read tests.  Primarily it made performance
>> lower on average and more variable.
>
> This one is surprising to me since this change is also in Hammer
> (cf6e1f50ea7b5c2fd6298be77c06ed4765d66611).  When you are performing
> the bisect, are you keeping the OSDs at the same version and only
> swapping out librbd?

Nope, I had no idea when trying to track this down if this was 100% 
librbd or if there were other issues at play too, so the OSDs and librbd 
are both changing.  Having said that, I wouldn't expect there to be any 
difference in the OSD code between afb896d and 3a7b5e3.

Given the variability in the results starting with 3a7b5e3, it might be 
some kind of secondary effect.  The highest performing samples were 
still in the same ballpark as pre-3a7b5e3.  I guess I would worry less 
about this one right now.

>
>> 2) https://github.com/ceph/ceph/commit/c474ee42
>>
>> This commit had a very large impact, reducing performance by another 20-25%.
>
> Definitely an area we should optimize given the number of
> AioCompletions that are constructed.
>
>> 3) https://github.com/ceph/ceph/commit/66e7464
>>
>> This was a fix that helped regain some of the performance loss due to
>> c474ee42, but didn't totally reclaim it.
>
> Odd -- since that effectively reverted c474ee42 (unique_lock_name)
> within the IO path.

Perhaps 0024677 or 3ad19ae introduced another regression that was being 
masked by c474ee42 and when 66e7464 improved the situation, the other 
regression appeared?

>
>> 5) https://github.com/ceph/ceph/commit/8aae868
>>
>> The new AioImageRequestWQ appears to be the cause of the most recent large
>> reduction in 128K sequential read performance.
>
> We will have to investigate this -- AioImageRequestWQ is just a
> wrapper around the same work queue used in the Hammer release.
>


* Re: [ceph-users] Hammer vs Jewel librbd performance testing and git bisection results
  2016-05-11 14:07     ` [ceph-users] " Mark Nelson
@ 2016-05-11 14:19       ` Jason Dillaman
       [not found]         ` <CA+aFP1DRcM5AwKpo6sU_14-YA764e9ZOHq=qtJWYkFN983O+kQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 9+ messages in thread
From: Jason Dillaman @ 2016-05-11 14:19 UTC (permalink / raw)
  To: Mark Nelson; +Cc: ceph-devel, cbt, ceph-users

On Wed, May 11, 2016 at 10:07 AM, Mark Nelson <mnelson@redhat.com> wrote:
> Perhaps 0024677 or 3ad19ae introduced another regression that was being
> masked by c474ee42 and when 66e7464 improved the situation, the other
> regression appeared?

0024677 is in Hammer as 7004149 and 3ad19ae is in Hammer as b38da480.
I opened two tickets [1] [2] to investigate further.  Can you attach
the fio job you used?

[1] http://tracker.ceph.com/issues/15847
[2] http://tracker.ceph.com/issues/15848

Thanks,

-- 
Jason


* Re: [ceph-users] Hammer vs Jewel librbd performance testing and git bisection results
  2016-05-11 13:52   ` Jason Dillaman
  2016-05-11 14:07     ` [ceph-users] " Mark Nelson
@ 2016-05-11 14:19     ` Haomai Wang
  1 sibling, 0 replies; 9+ messages in thread
From: Haomai Wang @ 2016-05-11 14:19 UTC (permalink / raw)
  To: Jason Dillaman; +Cc: Mark Nelson, ceph-devel, ceph-users, cbt

On Wed, May 11, 2016 at 9:52 PM, Jason Dillaman <jdillama@redhat.com> wrote:
> Awesome work Mark!  Comments / questions inline below:
>
> On Wed, May 11, 2016 at 9:21 AM, Mark Nelson <mnelson@redhat.com> wrote:
>> There are several commits of interest that have a noticeable effect on 128K
>> sequential read performance:
>>
>>
>> 1) https://github.com/ceph/ceph/commit/3a7b5e3
>>
>> This commit was the first that introduced anywhere from a 0-10% performance
>> decrease in the 128K sequential read tests.  Primarily it made performance
>> lower on average and more variable.
>
> This one is surprising to me since this change is also in Hammer
> (cf6e1f50ea7b5c2fd6298be77c06ed4765d66611).  When you are performing
> the bisect, are you keeping the OSDs at the same version and only
> swapping out librbd?
>
>> 2) https://github.com/ceph/ceph/commit/c474ee42
>>
>> This commit had a very large impact, reducing performance by another 20-25%.
>
> Definitely an area we should optimize given the number of
> AioCompletions that are constructed.

Previously I talked to Josh about the CPU time consumed by
librbd::AioCompletion.  The mutex construction and destruction hurt a lot.

One idea is to create an object pool to cache these, or to add an API that
lets the user reset an AioCompletion so it can be reused.
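
A trivial free list would illustrate the idea; this is a sketch only, with
a made-up Completion type rather than librbd's actual AioCompletion API:

// Illustrative sketch only: a tiny free list that recycles completion
// objects so the mutex inside them is constructed once and then reused.
// "Completion" is a made-up stand-in, not librbd's AioCompletion.
#include <mutex>
#include <vector>

struct Completion {
  std::mutex lock;                    // constructed once, reused across I/Os
  bool complete = false;
  void reset() { complete = false; }  // cheap re-init instead of re-construct
};

class CompletionPool {
  std::mutex pool_lock;
  std::vector<Completion*> free_list;
public:
  Completion* get() {
    std::lock_guard<std::mutex> l(pool_lock);
    if (free_list.empty())
      return new Completion();
    Completion* c = free_list.back();
    free_list.pop_back();
    c->reset();
    return c;
  }
  void put(Completion* c) {
    std::lock_guard<std::mutex> l(pool_lock);
    free_list.push_back(c);
  }
  ~CompletionPool() {
    for (Completion* c : free_list) delete c;
  }
};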

>
>> 3) https://github.com/ceph/ceph/commit/66e7464
>>
>> This was a fix that helped regain some of the performance loss due to
>> c474ee42, but didn't totally reclaim it.
>
> Odd -- since that effectively reverted c474ee42 (unique_lock_name)
> within the IO path.
>
>> 5) https://github.com/ceph/ceph/commit/8aae868
>>
>> The new AioImageRequestWQ appears to be the cause of the most recent large
>> reduction in 128K sequential read performance.
>
> We will have to investigate this -- AioImageRequestWQ is just a
> wrapper around the same work queue used in the Hammer release.
>
> --
> Jason


* Re: Hammer vs Jewel librbd performance testing and git bisection results
       [not found]         ` <CA+aFP1DRcM5AwKpo6sU_14-YA764e9ZOHq=qtJWYkFN983O+kQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2016-05-11 14:24           ` Mark Nelson
  0 siblings, 0 replies; 9+ messages in thread
From: Mark Nelson @ 2016-05-11 14:24 UTC (permalink / raw)
  To: dillaman-H+wXaHxf7aLQT0dZR+AlfA
  Cc: ceph-devel, ceph-users-idqoXFIVOFJgJs9I8MT0rw,
	cbt-idqoXFIVOFJgJs9I8MT0rw



On 05/11/2016 09:19 AM, Jason Dillaman wrote:
> On Wed, May 11, 2016 at 10:07 AM, Mark Nelson <mnelson-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> wrote:
>> Perhaps 0024677 or 3ad19ae introduced another regression that was being
>> masked by c474ee42 and when 66e7464 improved the situation, the other
>> regression appeared?
>
> 0024677 is in Hammer as 7004149 and 3ad19ae is in Hammer as b38da480.
> I opened two tickets [1] [2] to investigate further.  Can you attach
> the fio job you used?
>
> [1] http://tracker.ceph.com/issues/15847
> [2] http://tracker.ceph.com/issues/15848

I can't give you a job file, but I can give you the command line 
parameters used in the test:

/home/ubuntu/src/fio/fio --ioengine=rbd --clientname=admin \
  --pool=cbt-librbdfio --rbdname=cbt-librbdfio-`hostname -f`-0 \
  --invalidate=0 --rw=read --runtime=300 --ramp_time=None --numjobs=1 \
  --direct=1 --bs=131072B --iodepth=32 --end_fsync=0 \
  --write_iops_log=/tmp/cbt/00000000/LibrbdFio/osd_ra-00004096/op_size-00131072/concurrent_procs-008/iodepth-032/read/output.0 \
  --write_bw_log=/tmp/cbt/00000000/LibrbdFio/osd_ra-00004096/op_size-00131072/concurrent_procs-008/iodepth-032/read/output.0 \
  --write_lat_log=/tmp/cbt/00000000/LibrbdFio/osd_ra-00004096/op_size-00131072/concurrent_procs-008/iodepth-032/read/output.0 \
  --log_avg_msec=100 --name=librbdfio-`hostname -f`-0 \
  > /tmp/cbt/00000000/LibrbdFio/osd_ra-00004096/op_size-00131072/concurrent_procs-008/iodepth-032/read/output.0

Two of these were run concurrently on each of the 4 clients (8 volumes 
total).

>
> Thanks,
>


* Re: Hammer vs Jewel librbd performance testing and git bisection results
  2016-05-11 13:49   ` Matt Benjamin
@ 2016-05-11 15:22     ` Piotr Dałek
  0 siblings, 0 replies; 9+ messages in thread
From: Piotr Dałek @ 2016-05-11 15:22 UTC (permalink / raw)
  To: Matt Benjamin; +Cc: ceph-devel

On Wed, May 11, 2016 at 09:49:20AM -0400, Matt Benjamin wrote:
> > > https://docs.google.com/spreadsheets/d/1hbsyNM5pr-ZwBuR7lqnphEd-4kQUid0C9eRyta3ohOA/edit?usp=sharing
> > > 
> > > There are several commits of interest that have a noticeable effect
> > > on 128K sequential read performance:
> > > 
> > > [..]
> > > 2) https://github.com/ceph/ceph/commit/c474ee42
> > > 
> > > This commit had a very large impact, reducing performance by another
> > > 20-25%.
> > 
> > https://github.com/ceph/ceph/commit/c474ee42#diff-254555dde8dcfb7fb908791ab8214b92R318
> > I would check if temporarily forcing unique_lock_name() to return its arg
> > (or other constant) would change things. If so, probably a more efficient way
> > to construct unique lock name may be in order.
> 
> ++
> 
> Naively, too, what unique_lock_name is doing amounts to:  1) creating an extra [small] std::string [better fixed, but not a likely root cause?], and 2) using Utils::stringify to hook the ostream operators of the type passed in, doing that on a new sstream.
 
I don't think "stringify" alone is the source of problems.  It looks like a
convenience function to make object dumping easier, so there's no real point
in changing it; rather...

> Maybe we should look horizontally at either speeding up or finding alternatives to stringify in other places?

... its usage should be limited to non-performance-critical paths.
In this particular case, probably checking for lockdep and refusing to act 
when it is disabled sounds like the least intrusive way to fix things.
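
I.e. something along these lines; a sketch only, using a stand-in lockdep
flag (the real plumbing in librbd/common may look different):

// Sketch: only build the per-instance lock name when lockdep is actually
// enabled, otherwise return the static base name.  g_lockdep_enabled is a
// stand-in for the real lockdep toggle.
#include <sstream>
#include <string>

static bool g_lockdep_enabled = false;  // stand-in, not the real global

std::string maybe_unique_lock_name(const std::string &name, void *address) {
  if (!g_lockdep_enabled)
    return name;            // fast path: no extra string building at all
  std::ostringstream oss;   // slow path: lockdep needs per-instance names
  oss << name << " (" << address << ")";
  return oss.str();
}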

-- 
Piotr Dałek
branch@predictor.org.pl
http://blog.predictor.org.pl

