From: Mark Nelson <mnelson@redhat.com>
To: Somnath Roy <Somnath.Roy@sandisk.com>,
	Sage Weil <sweil@redhat.com>,
	"ceph-devel@vger.kernel.org" <ceph-devel@vger.kernel.org>
Subject: Re: wip-denc
Date: Wed, 14 Sep 2016 15:37:56 -0500
Message-ID: <a4b8cc4a-aae4-7534-10f3-d48e862dbccd@redhat.com>
In-Reply-To: <BL2PR02MB2115859F0DB2F573F808BA14F4F10@BL2PR02MB2115.namprd02.prod.outlook.com>

Strange, it's working for me, and claims to be public for everyone.  :/

Mark

On 09/14/2016 03:35 PM, Somnath Roy wrote:
> Not able to access the graphs, Mark.
>
> -----Original Message-----
> From: ceph-devel-owner@vger.kernel.org [mailto:ceph-devel-owner@vger.kernel.org] On Behalf Of Mark Nelson
> Sent: Wednesday, September 14, 2016 1:32 PM
> To: Sage Weil; ceph-devel@vger.kernel.org
> Subject: Re: wip-denc
>
> On 09/13/2016 04:17 PM, Sage Weil wrote:
>> Hi everyone,
>>
>> Okay, I have a new wip-denc branch working and ready for some review:
>>
>> https://github.com/ceph/ceph/pull/11027
>>
>> Highlights:
>>
>> - This includes appender/iterator changes to buffer* to speed up
>> encoding and decoding (fewer bounds checks, simpler structures).
>>
>> - Accordingly, classes/types using the new style have different
>> argument types for encode/decode.  There is also a new bound_encode()
>> method that is used to calculate how big of a buffer to preallocate.
>>
>> - Most of the important helpers for doing types have new versions that
>> work with the new framework (e.g., the ENCODE_START macro has a new
>> DENC_START counterpart).
>>
>> - There is also a mechanism that lets you define the bound_encode,
>> encode, and decode methods all in one go using some template magic.
>> This only works for pretty simple types, but it is handy.  It looks like so:
>>
>>   struct foo_t {
>>     uint32_t a, b;
>>     ...
>>     DENC(foo_t, v, p) {
>>       DENC_START(1, 1, p);
>>       denc(v.a, p);
>>       denc(v.b, p);
>>       ...
>>       DENC_FINISH(p);
>>     }
>>   };
>>   WRITE_CLASS_DENC(foo_t)
>>
>>
>> - For new-style types, a new 'denc' function is defined that is overloaded
>> to do either bound_encode, encode, or decode (based on argument types).
>> That means that
>>
>>   ::denc(v, p);
>>
>> will work for size_t& p, bufferptr::iterator& p, or
>> bufferlist::contiguous_appender& p.  This facilitates the DENC
>> definitions above.
>>
>> - There is glue to invoke new-style encode/decode when old-style
>> encode() and decode() are invoked, provided a denc_traits<T> is defined.
>>
>> - Most of the common containers are there (list, vector, set, map,
>> pair), but others still need to be converted.
>>
>> - Currently, we're a bit aggressive about using the new-style over the
>> old-style when we have the chance.  For example, if you have
>>
>>   vector<int32_t> foo;
>>   ::encode(foo, bl);
>>
>> it will see that it knows how to do int32_t new-style and invoke the
>> new-style vector<> code.  I think this is going to be a net win, since
>> we avoid doing bounds checks on append for every element (and the
>> bound_encode is O(1) for these base types).  On the other hand, it is
>> currently smart enough to not use new-style for individual integer
>> types, like so
>>
>>   int32_t v;
>>   ::encode(v, bl);
>>
>> although I suspect after the optimizer gets done with it the generated
>> machine code is almost identical.
>>
>> - Most of the key bluestore types are converted over so that we can do
>> some benchmarking.
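
The overload-on-argument-type idea from the highlights above (one body serving bound_encode, encode, and decode) can be sketched outside Ceph roughly as follows. This is a minimal illustration under stated assumptions, not code from the wip-denc branch: `Appender`, `Reader`, and `denc_all` are hypothetical stand-ins for bufferlist::contiguous_appender, bufferptr::iterator, and the body a DENC definition would provide, and the sketch copies integers in host byte order with no version header.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical stand-in for a contiguous appender: writes into memory
// that the caller has already sized, so append() does no bounds check.
struct Appender {
  char* pos;
  void append(const void* src, std::size_t n) {
    std::memcpy(pos, src, n);
    pos += n;
  }
};

// Hypothetical stand-in for a read iterator over a contiguous buffer.
struct Reader {
  const char* pos;
  void copy(void* dst, std::size_t n) {
    std::memcpy(dst, pos, n);
    pos += n;
  }
};

// One name, three overloads; the type of the second argument picks the mode.
inline void denc(const std::uint32_t& v, std::size_t& p) { p += sizeof(v); }              // bound
inline void denc(const std::uint32_t& v, Appender& p) { p.append(&v, sizeof(v)); }        // encode
inline void denc(std::uint32_t& v, Reader& p) { p.copy(&v, sizeof(v)); }                  // decode

struct foo_t {
  std::uint32_t a = 0;
  std::uint32_t b = 0;

  // A single body lists the fields once. Instantiating it with
  // std::size_t, Appender, or Reader yields the three operations.
  template <typename T, typename P>
  static void denc_all(T& v, P& p) {
    denc(v.a, p);
    denc(v.b, p);
  }
};
```

Calling `foo_t::denc_all(v, n)` with a `std::size_t n` accumulates the size bound, the same call with an `Appender` writes the fields, and with a `Reader` it reads them back, so the field list is written only once.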
>>
>> An overview is at the top of the new denc.h header here:
>>
>> https://github.com/liewegas/ceph/blob/wip-denc/src/include/denc.h#L55
>>
>> I think I've captured the best of Allen's, Varada's, and Sam's various
>> approaches, but we'll see how it behaves.  Let me know what you think!
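
The point made in the highlights about vector<int32_t> (compute an O(1) bound, size the buffer once, then append every element without a per-element bounds check) can be illustrated with a small sketch. It assumes a plain std::vector<char> in place of a bufferlist; `encode_vec` and `decode_vec` are hypothetical names, and the layout (a 4-byte count followed by the elements in host byte order) is chosen for the illustration, not taken from Ceph.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// O(1) for fixed-width elements: a 4-byte count plus n 4-byte items.
inline std::size_t bound_encode(const std::vector<std::int32_t>& v) {
  return sizeof(std::uint32_t) + v.size() * sizeof(std::int32_t);
}

// Grow the output once, then write through a raw pointer with no
// further size checks inside the loop.
inline void encode_vec(const std::vector<std::int32_t>& v, std::vector<char>& out) {
  const std::size_t start = out.size();
  out.resize(start + bound_encode(v));  // the only allocation / size check
  char* p = out.data() + start;
  const std::uint32_t n = static_cast<std::uint32_t>(v.size());
  std::memcpy(p, &n, sizeof(n));
  p += sizeof(n);
  for (std::int32_t e : v) {
    std::memcpy(p, &e, sizeof(e));
    p += sizeof(e);
  }
}

// Decode validates the total length once up front, then copies elements.
// Returns false if the input is too short for the advertised count.
inline bool decode_vec(const std::vector<char>& in, std::vector<std::int32_t>& v) {
  std::uint32_t n = 0;
  if (in.size() < sizeof(n)) return false;
  std::memcpy(&n, in.data(), sizeof(n));
  if ((in.size() - sizeof(n)) / sizeof(std::int32_t) < n) return false;
  v.resize(n);
  const char* p = in.data() + sizeof(n);
  for (std::int32_t& e : v) {
    std::memcpy(&e, p, sizeof(e));
    p += sizeof(e);
  }
  return true;
}
```

An old-style encoder would instead check (and possibly grow) the buffer on every element append; here that cost is paid once per container, which is the trade the message describes.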
>
> Alright, made it through a round of benchmarking without crashing this time.  This is wip-denc + 11059 + 11014 on 4 NVMe cards split into 16 OSDs.  Need to add the additional memory reduction patches, but for now this gives us a bit of an idea where we are at. Scroll to the right for graphs.
>
> https://drive.google.com/uc?export=download&id=0B2gTBZrkrnpZNi1aU1htRDRDekk
>
> 1) Basically sequential reads look bad, but we've known that for a while and we can look at it again once the dust settles.  We've never been great compared to filestore, but something took a turn for the worse earlier this summer.
>
> 2) Sequential writes are looking pretty great, and have been since July after a bitmap allocator fix.
>
> 3) Random read performance has dropped pretty significantly recently.
> Sage thinks this might be the sharding.
>
> 4) Small random write performance is about twice as fast.  The sharding gets most of the credit, though I'd argue only indirectly: the real gain comes from the reduction in bufferlist appends, since we saw nearly the same improvement when we used the appender with the old code.  These tests continue to be CPU limited.
>
> 5) Sequential mixed read/write tests look pretty similar to the 7/28 tests.  The difference vs jewel bluestore seems to primarily be the bitmap allocator, but other changes might be having an effect as well.
>
> 6) Random mixed read/write tests have improved since 7/28 with the sharding and encode/decode changes.  Performance is much higher for larger IOs and a little slower for 4K IOs, but it's fairly competitive in these tests.
>
>>
>> Thanks-
>> sage
>>
