From: "Norman.Kern" <norman.kern@gmx.com>
To: Coly Li <colyli@suse.de>
Cc: linux-block@vger.kernel.org, axboe@kernel.dk,
	linux-bcache@vger.kernel.org
Subject: Re: Large latency with bcache for Ceph OSD
Date: Thu, 25 Feb 2021 10:23:29 +0800	[thread overview]
Message-ID: <07bcb6c8-21e1-11de-d1f0-ffd417bd36ff@gmx.com> (raw)
In-Reply-To: <5867daf1-0960-39aa-1843-1a76c1e9a28d@suse.de>


On 2021/2/24 4:52 PM, Coly Li wrote:
> On 2/22/21 7:48 AM, Norman.Kern wrote:
>> Ping.
>>
>> I'm confused about sync I/O on bcache: why must sync I/O be written to the
>> backing device when the cache is persistent? It can cause some latency.
>>
>> @Coly, can you help me understand why bcache handles O_SYNC like this?
>>
>>
> Hmm, normally we won't observe the application's I/Os hitting the backing
> device, except for:
> - I/O bypass by SSD congestion
> - Sequential I/O request
> - Dirty buckets exceed the cutoff threshold
> - Write through mode
>
> Do you set the write/read congestion thresholds to 0?

Thanks for your reply.

I have set the thresholds to zero. Here are all my configs:

#make-bcache -C -b 4m -w 4k --discard --cache_replacement_policy=lru /dev/sdm
#make-bcache -B --writeback -w 4KiB /dev/sdn --wipe-bcache
congested_read_threshold_us = 0
congested_write_threshold_us = 0

# I also tried setting sequential_cutoff to 0, but it didn't solve the problem.

sequential_cutoff = 4194304
writeback_percent = 40
cache_mode = writeback
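
For reference, the settings above are applied through sysfs roughly like this. This is only a sketch: bcache0 is assumed to be the backing device created from /dev/sdn (the name may differ), and the cache-set UUID is the one shown in the output below.

CSET=d87713c6-2e76-4a09-8517-d48306468659        # cache set UUID (from the output below)
echo 0         > /sys/fs/bcache/$CSET/congested_read_threshold_us
echo 0         > /sys/fs/bcache/$CSET/congested_write_threshold_us
echo 4194304   > /sys/block/bcache0/bcache/sequential_cutoff    # bytes
echo 40        > /sys/block/bcache0/bcache/writeback_percent
echo writeback > /sys/block/bcache0/bcache/cache_mode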

I recreated the cluster, ran it for hours, and reproduced the problem. Then I checked the cache status:

root@WXS0106:/root/perf-tools# cat /sys/fs/bcache/d87713c6-2e76-4a09-8517-d48306468659/cache_available_percent
29
root@WXS0106:/root/perf-tools# cat /sys/fs/bcache/d87713c6-2e76-4a09-8517-d48306468659/internal/cutoff_writeback_sync
70
Did 'dirty buckets exceed the cutoff threshold' cause the problem? Are my configs wrong, or is there some other reason?
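
If I understand cache_available_percent correctly, 29% available means roughly 71% of the cache already holds dirty data, which is just above the cutoff_writeback_sync value of 70, so sync writes would be sent straight to the HDD. A rough way to check this from sysfs (only a sketch; bcache0 and cache0 are the default device names and may differ here):

CSET=d87713c6-2e76-4a09-8517-d48306468659
cat /sys/block/bcache0/bcache/dirty_data                       # dirty data held in the cache for this backing device
cat /sys/fs/bcache/$CSET/cache0/priority_stats                 # bucket usage; newer kernels also break out clean/dirty
cat /sys/block/bcache0/bcache/stats_total/bypassed             # total I/O that skipped the cache
cat /sys/block/bcache0/bcache/stats_total/cache_bypass_hits    # hits/misses for I/O intended to skip the cache
cat /sys/block/bcache0/bcache/stats_total/cache_bypass_misses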

>
> Coly Li
>
>> On 2021/2/18 3:56 PM, Norman.Kern wrote:
>>> Hi guys,
>>>
>>> I am testing Ceph with bcache, and I found that some I/O issued with O_SYNC
>>> is written back to the HDD, which causes large latency on the HDD. I traced
>>> the I/O with iosnoop:
>>>
>>> ./iosnoop -Q -ts -d '8,192'
>>>
>>> Tracing block I/O for 1 seconds (buffered)...
>>> STARTs          ENDs            COMM         PID    TYPE DEV    BLOCK        BYTES     LATms
>>> 1809296.292350  1809296.319052  tp_osd_tp    22191  R    8,192  4578940240   16384     26.70
>>> 1809296.292330  1809296.320974  tp_osd_tp    22191  R    8,192  4577938704   16384     28.64
>>> 1809296.292614  1809296.323292  tp_osd_tp    22191  R    8,192  4600404304   16384     30.68
>>> 1809296.292353  1809296.325300  tp_osd_tp    22191  R    8,192  4578343088   16384     32.95
>>> 1809296.292340  1809296.328013  tp_osd_tp    22191  R    8,192  4578055472   16384     35.67
>>> 1809296.292606  1809296.330518  tp_osd_tp    22191  R    8,192  4578581648   16384     37.91
>>> 1809295.169266  1809296.334041  bstore_kv_fi 17266  WS   8,192  4244996360   4096    1164.78
>>> 1809296.292618  1809296.336349  tp_osd_tp    22191  R    8,192  4602631760   16384     43.73
>>> 1809296.292618  1809296.338812  tp_osd_tp    22191  R    8,192  4602632976   16384     46.19
>>> 1809296.030103  1809296.342780  tp_osd_tp    22180  WS   8,192  4741276048   131072   312.68
>>> 1809296.292347  1809296.345045  tp_osd_tp    22191  R    8,192  4609037872   16384     52.70
>>> 1809296.292620  1809296.345109  tp_osd_tp    22191  R    8,192  4609037904   16384     52.49
>>> 1809296.292612  1809296.347251  tp_osd_tp    22191  R    8,192  4578937616   16384     54.64
>>> 1809296.292621  1809296.351136  tp_osd_tp    22191  R    8,192  4612654992   16384     58.51
>>> 1809296.292341  1809296.353428  tp_osd_tp    22191  R    8,192  4578220656   16384     61.09
>>> 1809296.292342  1809296.353864  tp_osd_tp    22191  R    8,192  4578220880   16384     61.52
>>> 1809295.167650  1809296.358510  bstore_kv_fi 17266  WS   8,192  4923695960   4096    1190.86
>>> 1809296.292347  1809296.361885  tp_osd_tp    22191  R    8,192  4607437136   16384     69.54
>>> 1809296.029363  1809296.367313  tp_osd_tp    22180  WS   8,192  4739824400   98304    337.95
>>> 1809296.292349  1809296.370245  tp_osd_tp    22191  R    8,192  4591379888   16384     77.90
>>> 1809296.292348  1809296.376273  tp_osd_tp    22191  R    8,192  4591289552   16384     83.92
>>> 1809296.292353  1809296.378659  tp_osd_tp    22191  R    8,192  4578248656   16384     86.31
>>> 1809296.292619  1809296.384835  tp_osd_tp    22191  R    8,192  4617494160   65536     92.22
>>> 1809295.165451  1809296.393715  bstore_kv_fi 17266  WS   8,192  1355703120   4096    1228.26
>>> 1809295.168595  1809296.401560  bstore_kv_fi 17266  WS   8,192  1122200      4096    1232.96
>>> 1809295.165221  1809296.408018  bstore_kv_fi 17266  WS   8,192  960656       4096    1242.80
>>> 1809295.166737  1809296.411505  bstore_kv_fi 17266  WS   8,192  57682504     4096    1244.77
>>> 1809296.292352  1809296.418123  tp_osd_tp    22191  R    8,192  4579459056   32768    125.77
>>>
>>> I'm confused about why writes with O_SYNC must be written back to the backing
>>> storage device. And after bcache has been in use for a while, the latency
>>> increases a lot (the SSD is not very busy). Are there any best practices
>>> for configuration?
>>>


Thread overview: 11+ messages
2021-02-18  7:56 Large latency with bcache for Ceph OSD Norman.Kern
2021-02-21 23:48 ` Norman.Kern
2021-02-24  8:52   ` Coly Li
2021-02-25  2:22     ` Norman.Kern
2021-02-25  2:23     ` Norman.Kern [this message]
2021-02-25 13:00       ` Norman.Kern
2021-02-25 14:44         ` Coly Li
2021-02-26  8:57           ` Norman.Kern
2021-02-26  9:54             ` Coly Li
2021-03-02  2:03               ` Norman.Kern
2021-03-02  5:30               ` Norman.Kern
