From: "Norman.Kern" <norman.kern@gmx.com>
To: Coly Li <colyli@suse.de>
Cc: linux-block@vger.kernel.org, axboe@kernel.dk,
linux-bcache@vger.kernel.org
Subject: Re: Large latency with bcache for Ceph OSD
Date: Thu, 25 Feb 2021 10:23:29 +0800
Message-ID: <07bcb6c8-21e1-11de-d1f0-ffd417bd36ff@gmx.com>
In-Reply-To: <5867daf1-0960-39aa-1843-1a76c1e9a28d@suse.de>
On 2021/2/24 4:52 PM, Coly Li wrote:
> On 2/22/21 7:48 AM, Norman.Kern wrote:
>> Ping.
>>
>> I'm confused about SYNC I/O on bcache. Why must SYNC I/O be written back
>> when the cache is persistent? It can cause extra latency.
>>
>> @Coly, can you help explain why bcache handles O_SYNC like this?
>>
>>
> Hmm, normally we won't observe application I/O reaching the backing
> device except in these cases:
> - I/O bypass caused by SSD congestion
> - Sequential I/O requests
> - Dirty buckets exceed the cutoff threshold
> - Write-through mode
>
> Did you set the write/read congestion thresholds to 0?
Thanks for your reply.
I have set the thresholds to zero. Here is the full configuration:
#make-bcache -C -b 4m -w 4k --discard --cache_replacement_policy=lru /dev/sdm
#make-bcache -B --writeback -w 4KiB /dev/sdn --wipe-bcache
congested_read_threshold_us = 0
congested_write_threshold_us = 0
# I tried to set sequential_cutoff to 0, but it didn't solve it.
sequential_cutoff = 4194304
writeback_percent = 40
cache_mode = writeback
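To collect values like these in one pass, something like the sketch below may help. It is only an illustration: `show_attrs` is a hypothetical helper, and the sysfs directories named in the usage comments (`/sys/block/bcache0/bcache` and `/sys/fs/bcache/<cset-uuid>`) are assumptions about where the attributes live on a given system.

```shell
#!/bin/sh
# Hypothetical helper: print "name = value" for each named attribute file
# found under a directory, and flag the ones that cannot be read.
# Usage: show_attrs DIR NAME...
show_attrs() {
    dir=$1
    shift
    for name in "$@"; do
        if [ -r "$dir/$name" ]; then
            printf '%s = %s\n' "$name" "$(cat "$dir/$name")"
        else
            printf '%s = <not readable>\n' "$name"
        fi
    done
}

# Example calls (paths are assumptions; bcache0 and the UUID are placeholders):
# show_attrs /sys/block/bcache0/bcache cache_mode sequential_cutoff writeback_percent
# show_attrs "/sys/fs/bcache/<cset-uuid>" congested_read_threshold_us congested_write_threshold_us
```

The values are printed exactly as sysfs reports them, so the formatting may differ from the hand-written list above.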
I rebuilt the cluster, ran it for hours, and reproduced the problem. I then checked the cache status:
root@WXS0106:/root/perf-tools# cat /sys/fs/bcache/d87713c6-2e76-4a09-8517-d48306468659/cache_available_percent
29
root@WXS0106:/root/perf-tools# cat /sys/fs/bcache/d87713c6-2e76-4a09-8517-d48306468659/internal/cutoff_writeback_sync
70
Is the "dirty buckets exceed the cutoff threshold" case what caused the problem? Is my configuration wrong, or is there another reason?
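The two readings above can be compared with a small sketch. It rests on an assumption drawn from reading the bcache writeback code, not on anything confirmed in this thread: that the in-use percentage is 100 minus `cache_available_percent`, and that writeback caching is skipped once in-use exceeds `cutoff_writeback_sync`. The helper name `writeback_state` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: compare cache usage against the sync writeback cutoff.
# ASSUMPTION: in-use percent = 100 - cache_available_percent, and bcache stops
# caching writes in writeback mode once in-use exceeds cutoff_writeback_sync.
writeback_state() {
    available=$1      # value read from cache_available_percent
    cutoff_sync=$2    # value read from internal/cutoff_writeback_sync
    in_use=$((100 - available))
    if [ "$in_use" -gt "$cutoff_sync" ]; then
        echo "in_use=${in_use}% > ${cutoff_sync}%: writes likely bypass the cache"
    else
        echo "in_use=${in_use}% <= ${cutoff_sync}%: sync writes can still be cached"
    fi
}

# Values reported above: cache_available_percent = 29, cutoff_writeback_sync = 70
writeback_state 29 70
# prints: in_use=71% > 70%: writes likely bypass the cache
```

If that assumption holds, 71% in use is just over the 70% cutoff, which would be consistent with the O_SYNC writes going to the HDD.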
>
> Coly Li
>
>> On 2021/2/18 3:56 PM, Norman.Kern wrote:
>>> Hi guys,
>>>
>>> I am testing Ceph with bcache. I found that some O_SYNC I/O is written
>>> back to the HDD, which causes large latency on the HDD. I traced the
>>> I/O with iosnoop:
>>>
>>> ./iosnoop -Q -ts -d '8,192'
>>>
>>> Tracing block I/O for 1 seconds (buffered)...
>>> STARTs         ENDs           COMM         PID   TYPE DEV   BLOCK      BYTES    LATms
>>> 1809296.292350 1809296.319052 tp_osd_tp    22191 R    8,192 4578940240 16384    26.70
>>> 1809296.292330 1809296.320974 tp_osd_tp    22191 R    8,192 4577938704 16384    28.64
>>> 1809296.292614 1809296.323292 tp_osd_tp    22191 R    8,192 4600404304 16384    30.68
>>> 1809296.292353 1809296.325300 tp_osd_tp    22191 R    8,192 4578343088 16384    32.95
>>> 1809296.292340 1809296.328013 tp_osd_tp    22191 R    8,192 4578055472 16384    35.67
>>> 1809296.292606 1809296.330518 tp_osd_tp    22191 R    8,192 4578581648 16384    37.91
>>> 1809295.169266 1809296.334041 bstore_kv_fi 17266 WS   8,192 4244996360 4096   1164.78
>>> 1809296.292618 1809296.336349 tp_osd_tp    22191 R    8,192 4602631760 16384    43.73
>>> 1809296.292618 1809296.338812 tp_osd_tp    22191 R    8,192 4602632976 16384    46.19
>>> 1809296.030103 1809296.342780 tp_osd_tp    22180 WS   8,192 4741276048 131072  312.68
>>> 1809296.292347 1809296.345045 tp_osd_tp    22191 R    8,192 4609037872 16384    52.70
>>> 1809296.292620 1809296.345109 tp_osd_tp    22191 R    8,192 4609037904 16384    52.49
>>> 1809296.292612 1809296.347251 tp_osd_tp    22191 R    8,192 4578937616 16384    54.64
>>> 1809296.292621 1809296.351136 tp_osd_tp    22191 R    8,192 4612654992 16384    58.51
>>> 1809296.292341 1809296.353428 tp_osd_tp    22191 R    8,192 4578220656 16384    61.09
>>> 1809296.292342 1809296.353864 tp_osd_tp    22191 R    8,192 4578220880 16384    61.52
>>> 1809295.167650 1809296.358510 bstore_kv_fi 17266 WS   8,192 4923695960 4096   1190.86
>>> 1809296.292347 1809296.361885 tp_osd_tp    22191 R    8,192 4607437136 16384    69.54
>>> 1809296.029363 1809296.367313 tp_osd_tp    22180 WS   8,192 4739824400 98304   337.95
>>> 1809296.292349 1809296.370245 tp_osd_tp    22191 R    8,192 4591379888 16384    77.90
>>> 1809296.292348 1809296.376273 tp_osd_tp    22191 R    8,192 4591289552 16384    83.92
>>> 1809296.292353 1809296.378659 tp_osd_tp    22191 R    8,192 4578248656 16384    86.31
>>> 1809296.292619 1809296.384835 tp_osd_tp    22191 R    8,192 4617494160 65536    92.22
>>> 1809295.165451 1809296.393715 bstore_kv_fi 17266 WS   8,192 1355703120 4096   1228.26
>>> 1809295.168595 1809296.401560 bstore_kv_fi 17266 WS   8,192 1122200    4096   1232.96
>>> 1809295.165221 1809296.408018 bstore_kv_fi 17266 WS   8,192 960656     4096   1242.80
>>> 1809295.166737 1809296.411505 bstore_kv_fi 17266 WS   8,192 57682504   4096   1244.77
>>> 1809296.292352 1809296.418123 tp_osd_tp    22191 R    8,192 4579459056 32768   125.77
>>>
>>> I'm confused: why must writes with O_SYNC be written back to the backing
>>> storage device? Also, after bcache had been in use for a while, the
>>> latency increased a lot, even though the SSD is not very busy. Are there
>>> any best practices for configuration?
>>>
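As a rough aid for reading a trace like the one quoted above, the sketch below counts requests and reports the worst latency per I/O type. It assumes each request sits on a single line with the nine columns of the `-ts` layout (STARTs ENDs COMM PID TYPE DEV BLOCK BYTES LATms). The function name `summarize_latency` is hypothetical, and wrapped output would need to be rejoined first.

```shell
#!/bin/sh
# Hypothetical helper: summarize iosnoop rows read from stdin.
# Assumes one request per line with nine whitespace-separated columns,
# where column 5 is TYPE and column 9 is the latency in milliseconds.
summarize_latency() {
    awk 'NF == 9 && $9 ~ /^[0-9.]+$/ {
             count[$5]++
             if ($9 + 0 > max[$5]) max[$5] = $9 + 0
         }
         END {
             for (t in count)
                 printf "%s count=%d max_ms=%.2f\n", t, count[t], max[t]
         }' | sort
}

# Example with three rows taken from the trace above:
printf '%s\n' \
    '1809296.292350 1809296.319052 tp_osd_tp 22191 R 8,192 4578940240 16384 26.70' \
    '1809295.169266 1809296.334041 bstore_kv_fi 17266 WS 8,192 4244996360 4096 1164.78' \
    '1809296.030103 1809296.342780 tp_osd_tp 22180 WS 8,192 4741276048 131072 312.68' |
    summarize_latency
# prints:
# R count=1 max_ms=26.70
# WS count=2 max_ms=1164.78
```

The header row is skipped because its last field is not numeric, and the final `sort` only makes the order of the summary lines stable.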