bufferlist appenders

* bufferlist appenders
@ 2016-08-12 14:27 Sage Weil
  2016-08-12 14:37 ` Mark Nelson
                   ` (2 more replies)
  0 siblings, 3 replies; 7+ messages in thread
From: Sage Weil @ 2016-08-12 14:27 UTC (permalink / raw)
  To: ceph-devel

A ton of time is the encoding/marshalling is spent doing bufferlist 
appends.  This is partly because the buffer code is doing lots of sanity 
range checks, and party because there are multiple layers that get range 
checks and length updates (bufferlist _len changes, 
and bufferlist::append_buffer (a ptr) gets it's length updated, at the 
very least).

To simplify and speed this up, I propose an 'appender' concept/type that 
is used for doing appends in a more efficient way.  It would be used 
like so:

 bufferlist bl;
 {
   bufferlist::safe_appender a = bl.get_safe_appender();
   ::encode(foo, a);
 }

or

 {
   bufferlist::unsafe_appender a = bl.get_unsafe_appender(1024);
   ::encode(foo, a);
 }

The appender keeps its own bufferptr that it copies data into.  The 
bufferptr isn't given to the bufferlist until the appender is destroyed 
(or flush() is called explicitly).  This means that appends are generally 
just a memcpy and a position pointer addition.  In the safe_appender case, 
we also do a range change and allocate a new buffer if necessary.  In the 
unsafe_appender case, it is the callers responsibility to say how big a 
buffer is preallocated.

I have a simple prototype here:

	https://github.com/ceph/ceph/pull/10700

It appears to be almost 10x faster when encoding a uint64_t in a loop!

[ RUN      ] BufferList.appender_bench
appending 1073741824 bytes
buffer::list::append 20.285963
buffer::list encode 19.719120
buffer::list::safe_appender::append 2.588926
buffer::list::safe_appender::append_v 2.837026
buffer::list::safe_appender encode 3.000614
buffer::list::unsafe_appender::append 2.452116
buffer::list::unsafe_appender::append_v 2.553745
buffer::list::unsafe_appender encode 2.200110
[       OK ] BufferList.appender_bench (55637 ms)

Interesting, unsafe isn't much faster than safe.  I suspect the CPU's 
branch prediction is just working really well there?

Anyway, thoughts on this?  Any suggestions for further improvement?

I think the next step is to figure out how to change our 
WRITE_CLASS_ENCODER macros and encode function work with both bufferlists 
and appenders so that it's easy to convert stuff over (and still work with 
a mix of bufferlist-based encoders and appender-based encoders).

sage

^ permalink raw reply	[flat|nested] 7+ messages in thread