From: Blair Bethwaite <blair.bethwaite@gmail.com>
To: James Page <james.page@ubuntu.com>
Cc: Ceph Development <ceph-devel@vger.kernel.org>,
	mnelson@redhat.com, "Blinick,
	Stephen L" <stephen.l.blinick@intel.com>,
	Jay Vosburgh <jay.vosburgh@canonical.com>,
	Colin Ian King <colin.king@canonical.com>,
	Patricia Gaughen <patricia.gaughen@canonical.com>,
	Leann Ogasawara <leann.ogasawara@canonical.com>
Subject: Re: Memstore performance improvements v0.90 vs v0.87
Date: Fri, 20 Feb 2015 20:49:33 +1100	[thread overview]
Message-ID: <CA+z5DszfGOE3bE0M6ti-4GyRj2CXypsNRv1J=6Wx4b65NqE=Vg@mail.gmail.com> (raw)
In-Reply-To: <54E6F96B.9080202@ubuntu.com>

Hi James,

Interesting results, but did you do any tests with a NUMA system? IIUC
the original report was from a dual socket setup, and that'd
presumably be the standard setup for most folks (both OSD server and
client side).

Cheers,

On 20 February 2015 at 20:07, James Page <james.page@ubuntu.com> wrote:
>
> Hi All
>
> The Ubuntu Kernel team have spent the last few weeks investigating the
> apparent performance disparity between RHEL 7 and Ubuntu 14.04; we've
> focussed efforts in a few ways (see below).
>
> All testing has been done using the latest Firefly release.
>
> 1) Base network latency
>
> Jay Vosburgh looked at the base network latencies between RHEL 7 and
> Ubuntu 14.04; under default install, RHEL actually had slightly worse
> latency than Ubuntu due to the default enablement of a firewall;
> disabling this brought latency back in line between the two distributions:
>
> OS                      rtt min/avg/max/mdev
> Ubuntu 14.04 (3.13)     0.013/0.016/0.018/0.005 ms
> RHEL7 (3.10)            0.010/0.018/0.025/0.005 ms
>
> ...base network latency is pretty much the same.
>
> This testing was performed on a matched pair of Dell PowerEdge R610s,
> each configured with a single 4-core CPU and 8G of RAM.
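The rtt figures in the table above come straight from ping's summary line. As an aside, a minimal sketch of pulling those four numbers apart programmatically (assuming the standard iputils ping output format; the function name is mine):

```python
# Parse the "rtt min/avg/max/mdev" summary line printed by iputils ping.
# A sketch, assuming the standard iputils output format.

def parse_rtt_summary(line):
    """Return (min, avg, max, mdev) in milliseconds from a ping summary line."""
    # Example: "rtt min/avg/max/mdev = 0.013/0.016/0.018/0.005 ms"
    fields = line.split("=")[1].strip().split()[0]
    return tuple(float(v) for v in fields.split("/"))

ubuntu = parse_rtt_summary("rtt min/avg/max/mdev = 0.013/0.016/0.018/0.005 ms")
rhel = parse_rtt_summary("rtt min/avg/max/mdev = 0.010/0.018/0.025/0.005 ms")
print(ubuntu, rhel)
```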
>
> 2) Latency and performance in Ceph using Rados bench
>
> Colin King spent a number of days testing and analysing results using
> rados bench against a single node ceph deployment, configured with a
> single memory backed OSD, to see if we could reproduce the disparities
> reported.
>
> He ran 120 second OSD benchmarks on RHEL 7 as well as Ubuntu 14.04 LTS
> with a selection of kernels including 3.10 vanilla, 3.13.0-44 (release
> kernel), 3.16.0-30 (utopic HWE kernel), 3.18.0-12 (vivid HWE kernel)
> and 3.19-rc6 with 1, 16 and 128 client threads.  The data collected is
> available at [0].
>
> Each round of tests consisted of 15 runs, from which we computed
> average latency, latency deviation and latency distribution:
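The per-config aggregation described there (average, deviation, spread over 15 rounds) might look something like the following sketch; the sample numbers are invented for illustration, not taken from the runs at [0]:

```python
# Aggregate latency stats across repeated rados-bench rounds.
# A sketch with synthetic numbers; the real tests used 15 rounds per config.
import statistics

def summarize(latencies_ms):
    """Average, standard deviation, and min/max spread of per-round latencies."""
    return {
        "avg": statistics.mean(latencies_ms),
        "stdev": statistics.stdev(latencies_ms),
        "spread": (min(latencies_ms), max(latencies_ms)),
    }

# Hypothetical per-round average latencies (ms) for one kernel/thread-count combo.
rounds = [0.036, 0.037, 0.035, 0.038, 0.036]
print(summarize(rounds))
```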
>
>> 120 second x 1 thread
>
> Results all cluster around 0.04-0.05ms, with RHEL 7 averaging
> 0.044ms and recent Ubuntu kernels 0.036-0.037ms.  The older 3.10
> kernel in RHEL 7 does have slightly higher average latency.
>
>> 120 second x 16 threads
>
> Results all cluster around 0.6-0.7ms.  3.19.0-rc6 had a couple
> of 1.4ms outliers which pushed it out to be worse than RHEL 7. On the
> whole the Ubuntu 3.10-3.18 kernels are better than RHEL 7 by ~0.1ms.  RHEL
> shows a far higher standard deviation, due to its bimodal latency
> distribution, which to the casual observer may make it appear more
> "jittery".
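The point about a bimodal distribution inflating the standard deviation without moving the mean can be seen with a toy example (all numbers invented for illustration):

```python
# Two latency samples with the same 0.6ms mean: one clustered, one bimodal.
# The bimodal one has a far larger standard deviation, i.e. looks "jittery".
import statistics

unimodal = [0.60, 0.62, 0.58, 0.61, 0.59]  # clustered around 0.6 ms
bimodal = [0.40, 0.80, 0.40, 0.80, 0.60]   # two latency modes, same mean

print(statistics.stdev(unimodal), statistics.stdev(bimodal))
```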
>
>> 120 second x 128 threads
>
> Later kernels show a lower standard deviation than RHEL 7's 3.10
> kernel, suggesting less jitter in the stats.  With this many threads
> pounding the test, we get a wider spread of latencies, and with just
> 15 rounds the large amount of jitter makes it hard to discern any
> latency distribution pattern.  All systems show a latency of ~5ms.
> Taking the jitter into consideration, we think these results are not
> meaningful unless we repeat the tests with, say, 100 samples.
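The intuition for "more samples" is that the standard error of the mean shrinks as 1/sqrt(n), so going from 15 to 100 rounds tightens the estimate by roughly sqrt(100/15) ≈ 2.6x. A quick sketch (the 2.0ms per-round jitter figure is hypothetical, not measured):

```python
# Standard error of the mean shrinks as 1/sqrt(n): more rounds, tighter
# estimate. The jitter figure is a hypothetical stand-in, not a measurement.
import math

def standard_error(stdev_ms, n_rounds):
    return stdev_ms / math.sqrt(n_rounds)

jitter = 2.0  # hypothetical per-round stdev in ms at 128 threads
print(standard_error(jitter, 15), standard_error(jitter, 100))
```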
>
> 3) Conclusion
>
> We have not been able to show any major anomalies in Ceph on Ubuntu
> compared to RHEL 7 when using memstore.  Our current hypothesis is that
> one needs to run the OSD bench stressor many times to get a fair capture
> of system latency stats.  The reason for this is:
>
> * Latencies are very low with memstore, so any small jitter in
> scheduling etc will show up as a large distortion (as shown by the large
> standard deviations in the samples).
>
> * When memstore is heavily utilized, memory pressure causes the system
> to page heavily, so we are subject to paging delays that cause latency
> jitter.  Latency differences may come down to whether a random page is
> in memory or in swap, and with memstore this may cause the large
> perturbations we see when running just a single test.
>
> * We needed to make *many* tens of measurements to get a typical idea of
> average latency and the latency distributions.  Don't trust the results
> from just one test.
>
> * We ran the tests with a pool configured to 100 pgs and 100 pgps [1].
> One can get different results with different placement group configs.
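On that last point, a common rule of thumb for sizing placement groups (not from this thread, and worth checking against [1]) is to target roughly 100 PGs per OSD, divided by the pool's replica count, rounded up to a power of two:

```python
# Common PG-sizing rule of thumb: ~100 PGs per OSD / replica count,
# rounded up to the next power of two. Function name and the pgs_per_osd
# default are illustrative assumptions, not a Ceph API.

def suggested_pg_count(num_osds, replicas, pgs_per_osd=100):
    raw = num_osds * pgs_per_osd / replicas
    power = 1
    while power < raw:
        power *= 2  # round up to the next power of two
    return power

print(suggested_pg_count(num_osds=9, replicas=3))
```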
>
> I've CC'ed both Colin and Jay on this mail - so if anyone has any
> specific questions about the testing they can chime in with responses.
>
> Regards
>
> James
>
> [0] http://kernel.ubuntu.com/~cking/.ceph/ceph-benchmarks.ods
> [1] http://ceph.com/docs/master/rados/configuration/pool-pg-config-ref/
>
> --
> James Page
> Ubuntu and Debian Developer
> james.page@ubuntu.com
> jamespage@debian.org
> --
> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html



-- 
Cheers,
~Blairo
