RE: Bluestore read performance

* RE: Bluestore read performance
@ 2016-07-14 16:50 Somnath Roy
  2016-07-14 17:17 ` Igor Fedotov
  0 siblings, 1 reply; 12+ messages in thread
From: Somnath Roy @ 2016-07-14 16:50 UTC (permalink / raw)
  To: Mark Nelson (mnelson@redhat.com); +Cc: ceph-devel (ceph-devel@vger.kernel.org)

Mark,
As we discussed in today's meeting , I ran 100% RR with the following fio profile on a single image of 4TB. Did precondition the entire image with 1M seq write. I have total of 16 OSDs over 2 nodes.

[global]
ioengine=rbd
clientname=admin
pool=recovery_test
rbdname=recovery_image
invalidate=0    # mandatory
rw=randread
bs=4k
direct=1
time_based
runtime=30m
numjobs=8
group_reporting

[rbd_iodepth32]
iodepth=128

Here is the ceph.conf option I used for Bluestore.

       osd_op_num_threads_per_shard = 2
        osd_op_num_shards = 25

        bluestore_rocksdb_options = "max_write_buffer_number=16,min_write_buffer_number_to_merge=16,recycle_log_file_num=16,compaction_threads=32,flusher_threads=4,
       max_background_compactions=32,max_background_flushes=8,max_bytes_for_level_base=5368709120,write_buffer_size=83886080,level0_file_num_compaction_trigger=4,level0_slowdown_writes_trigger=400,level0_stop_writes_trigger=800"
        rocksdb_cache_size = 4294967296
        #bluestore_min_alloc_size = 16384
        bluestore_min_alloc_size = 4096
        bluestore_csum = false
        bluestore_csum_type = none
        bluestore_bluefs_buffered_io = false
        bluestore_max_ops = 30000
        bluestore_max_bytes = 629145600

Here is the output I got.

rbd_iodepth32: (g=0): rw=randread, bs=4K-4K/4K-4K/4K-4K, ioengine=rbd, iodepth=128
...
fio-2.1.11
Starting 8 processes
rbd engine: RBD version: 0.1.10
rbd engine: RBD version: 0.1.10
rbd engine: RBD version: 0.1.10
rbd engine: RBD version: 0.1.10
rbd engine: RBD version: 0.1.10
rbd engine: RBD version: 0.1.10
rbd engine: RBD version: 0.1.10
rbd engine: RBD version: 0.1.10
^Cbs: 8 (f=8): [r(8)] [9.4% done] [179.5MB/0KB/0KB /s] [45.1K/0/0 iops] [eta 27m:12s]
fio: terminating on signal 2

rbd_iodepth32: (groupid=0, jobs=8): err= 0: pid=1266211: Thu Jul 14 09:42:28 2016
  read : io=95898MB, bw=583425KB/s, iops=145856, runt=168316msec
    slat (usec): min=0, max=13967, avg= 4.56, stdev=38.79
    clat (usec): min=15, max=1949.3K, avg=6941.73, stdev=16018.84
     lat (usec): min=225, max=1949.3K, avg=6946.30, stdev=16018.92
    clat percentiles (usec):
     |  1.00th=[  876],  5.00th=[ 2024], 10.00th=[ 2672], 20.00th=[ 3312],
     | 30.00th=[ 3824], 40.00th=[ 4320], 50.00th=[ 5024], 60.00th=[ 5920],
     | 70.00th=[ 7072], 80.00th=[ 8768], 90.00th=[11840], 95.00th=[15040],
     | 99.00th=[22400], 99.50th=[27264], 99.90th=[248832], 99.95th=[366592],
     | 99.99th=[602112]

I was getting > 600MB/s  before memory started swapping for me and the fio output came down.
I never tested Bluestore read before, but, it is definitely lower than Filestore for me.
But, it is far better than you are getting it seems (?). Do you mind trying with the above ceph.conf option as well ?

My ceph version :
ceph version 11.0.0-536-g8df0c5b (8df0c5bcd90d80e9b309b2a9007b778f7b829edf)

Thanks & Regards
Somnath

PLEASE NOTE: The information contained in this electronic mail message is intended only for the use of the designated recipient(s) named above. If the reader of this message is not the intended recipient, you are hereby notified that you have received this message in error and that any review, dissemination, distribution, or copying of this message is strictly prohibited. If you have received this communication in error, please notify the sender by telephone or e-mail (as shown above) immediately and destroy any and all copies of this message in your possession (whether hard copies or electronically stored copies).

^ permalink raw reply	[flat|nested] 12+ messages in thread