From: Sage Weil <sweil@redhat.com>
To: Mark Nelson <mnelson@redhat.com>
Cc: "Chen, Xiaoxi" <xiaoxi.chen@intel.com>,
	"ceph-devel@vger.kernel.org" <ceph-devel@vger.kernel.org>
Subject: Re: newstore performance update
Date: Mon, 4 May 2015 11:08:26 -0700 (PDT)
Message-ID: <alpine.DEB.2.00.1505041106300.24939@cobra.newdream.net>
In-Reply-To: <5547B156.8060508@redhat.com>

On Mon, 4 May 2015, Mark Nelson wrote:
> On 05/01/2015 07:33 PM, Sage Weil wrote:
> > Ok, I think I figured out what was going on.  The db->submit_transaction()
> > call (from _txc_finish_io) was blocking when there was a
> > submit_transaction_sync() in progress.  This was making me hit a ceiling
> > of about 80 iops on my slow disk.  When I moved that into _kv_sync_thread
> > (just prior to the submit_transaction_sync() call) it jumps up to 300+
> > iops.
> > 
> > I pushed that to wip-newstore.
> > 
> > Further, if I drop the O_DSYNC, it goes up another 50% or so.  It'll take
> > a bit more coding to effectively batch the (implicit) fdatasync from the
> > O_DSYNC up, though, and capture some of that.  Next!
> > 
> > sage
> > 
> 
> Ran through a bunch of tests on 0c728ccc over the weekend:
> 
> http://nhm.ceph.com/newstore/5d96fe6f_vs_0c728ccc.pdf
> 
> The good news is that sequential writes on spinning disks are looking
> significantly better!  We went from 40x slower than filestore for small
> sequential IO to only about 30-40% slower, and we become faster than filestore
> at 64KB+ IO sizes.
> 
> 128KB-2MB sequential writes with data on spinning disk and rocksdb on SSD
> regressed.  Newstore is no longer really any faster than filestore for those
> IO sizes.  We saw something similar for random IO, where spinning-disk-only
> results improved and spinning disk + rocksdb on SSD regressed.
> 
> With everything on SSD, we saw small sequential writes improve and nearly all
> random writes regress.  Not sure how much these regressions are due to
> 0c728ccc vs other commits yet.

That's surprising!  I pushed a commit that makes this tunable:

 newstore sync submit transaction = false (default)

Can you see if setting that to true (effectively reverting my last change)
fixes the ssd regression?
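
To make the two settings concrete, here is a minimal Python sketch of the
control flow as described in this thread. It is a toy model, not newstore
code: FakeDB, Store, txc_finish_io and kv_sync_once are hypothetical names
chosen for illustration, and only the ordering of calls is modeled (no
threads, no disk I/O).

```python
from dataclasses import dataclass, field


@dataclass
class FakeDB:
    """Hypothetical stand-in for the key/value store; records calls in order."""
    log: list = field(default_factory=list)

    def submit_transaction(self, txn):
        self.log.append(("submit", txn))

    def submit_transaction_sync(self, txn):
        self.log.append(("sync", txn))


@dataclass
class Store:
    """Toy model of where submit_transaction() is issued from."""
    db: FakeDB
    sync_submit_transaction: bool = False  # mirrors the tunable's default
    kv_queue: list = field(default_factory=list)

    def txc_finish_io(self, txn):
        # true: submit directly from the I/O completion path (old behavior).
        if self.sync_submit_transaction:
            self.db.submit_transaction(txn)
        self.kv_queue.append(txn)

    def kv_sync_once(self):
        # false: submit the queued transactions here, just before the
        # synchronous commit (the behavior after the change).
        batch, self.kv_queue = self.kv_queue, []
        if not self.sync_submit_transaction:
            for txn in batch:
                self.db.submit_transaction(txn)
        self.db.submit_transaction_sync("commit")
        return batch


if __name__ == "__main__":
    for flag in (False, True):
        store = Store(FakeDB(), sync_submit_transaction=flag)
        store.txc_finish_io("a")
        store.txc_finish_io("b")
        print(flag, "before sync:", store.db.log)
        store.kv_sync_once()
        print(flag, "after sync: ", store.db.log)
```

With the flag false, nothing reaches the db until the sync pass runs; with
it true, each transaction is submitted as soon as its I/O completes.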

It may also be that this is a simple locking issue that we can fix in
rocksdb.  Again, the behavior I saw was that the db->submit_transaction()
call would block until the sync commit (from _kv_sync_thread) finished.
I would expect rocksdb to be more careful about that, so maybe there is
something else funny/subtle going on.
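
The symptom can be reproduced in miniature with a store that serializes
everything behind one lock. The sketch below is an assumption-laden toy, not
a description of how rocksdb actually locks: SingleLockDB and demo are
hypothetical names, and an Event stands in for the slow sync to disk.

```python
import threading


class SingleLockDB:
    """Hypothetical store where async and sync submits share one lock."""

    def __init__(self):
        self._lock = threading.Lock()
        self.events = []

    def submit_transaction(self, name):
        with self._lock:
            self.events.append(f"submit:{name}")

    def submit_transaction_sync(self, in_commit, release):
        with self._lock:
            in_commit.set()   # the commit has started and holds the lock
            release.wait()    # stands in for the slow sync to disk
            self.events.append("sync:done")


def demo():
    db = SingleLockDB()
    in_commit, release = threading.Event(), threading.Event()

    syncer = threading.Thread(
        target=db.submit_transaction_sync, args=(in_commit, release))
    syncer.start()
    in_commit.wait()          # make sure the sync commit is in progress

    submitter = threading.Thread(
        target=db.submit_transaction, args=("txc1",))
    submitter.start()
    submitter.join(timeout=0.2)
    blocked = submitter.is_alive()   # still waiting on the lock?

    release.set()             # let the sync commit finish
    syncer.join()
    submitter.join()
    return blocked, db.events


if __name__ == "__main__":
    print(demo())
```

In this model the plain submit cannot proceed until the sync commit releases
the lock, which matches the blocking described above. Whether rocksdb has a
comparable shared lock on this path is exactly the open question.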

sage
