From: "Norman.Kern" <norman.kern@gmx.com>
To: Coly Li <colyli@suse.de>
Cc: linux-block@vger.kernel.org, axboe@kernel.dk,
	linux-bcache@vger.kernel.org
Subject: Re: Large latency with bcache for Ceph OSD
Date: Fri, 26 Feb 2021 16:57:42 +0800
Message-ID: <b808dde3-cb58-907b-4df0-e0eb2938b51e@gmx.com>
In-Reply-To: <96daa0bf-c8e1-a334-14cb-2d260aed5115@suse.de>


On 2021/2/25 10:44 PM, Coly Li wrote:
> On 2/25/21 9:00 PM, Norman.Kern wrote:
>> I ran a test:
> BTW, what is the version of your kernel and your bcache-tools, and
> which distribution are you running?
root@WXS0106:~# uname -a
Linux WXS0106 5.4.0-58-generic #64~18.04.1-Ubuntu SMP Wed Dec 9 17:11:11 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
root@WXS0106:~# cat /etc/os-release
NAME="Ubuntu"
VERSION="16.04 LTS (Xenial Xerus)"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 16.04 LTS"
VERSION_ID="16.04"
HOME_URL="http://www.ubuntu.com/"
SUPPORT_URL="http://help.ubuntu.com/"
BUG_REPORT_URL="http://bugs.launchpad.net/ubuntu/"
UBUNTU_CODENAME=xenial
root@WXS0106:~# dpkg -l bcache-tools
Desired=Unknown/Install/Remove/Purge/Hold
| Status=Not/Inst/Conf-files/Unpacked/halF-conf/Half-inst/trig-aWait/Trig-pend
|/ Err?=(none)/Reinst-required (Status,Err: uppercase=bad)
||/ Name                                  Version                 Architecture            Description
+++-=====================================-=======================-=======================-================================================================================
ii  bcache-tools                          1.0.8-2                 amd64                   bcache userspace tools

>
>> - Stop writing and wait for dirty data to be written back
>>
>> $ lsblk
>> NAME                                                                                                   MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
>> sdf                                                                                                      8:80   0   7.3T  0 disk
>> └─bcache0                                                                                              252:0    0   7.3T  0 disk
>>   └─ceph--32a481f9--313c--417e--aaf7--bdd74515fd86-osd--data--2f670929--3c8a--45dd--bcef--c60ce3ee08e1 253:1    0   7.3T  0 lvm 
>> sdd                                                                                                      8:48   0   7.3T  0 disk
>> sdb                                                                                                      8:16   0   7.3T  0 disk
>> sdk                                                                                                      8:160  0 893.8G  0 disk
>> └─bcache0                                                                                              252:0    0   7.3T  0 disk
>>   └─ceph--32a481f9--313c--417e--aaf7--bdd74515fd86-osd--data--2f670929--3c8a--45dd--bcef--c60ce3ee08e1 253:1    0   7.3T  0 lvm 
>> $ cat /sys/block/bcache0/bcache/dirty_data
>> 0.0k
>>
>> root@WXS0106:~# bcache-super-show /dev/sdf
>> sb.magic                ok
>> sb.first_sector         8 [match]
>> sb.csum                 71DA9CA968B4A625 [match]
>> sb.version              1 [backing device]
>>
>> dev.label               (empty)
>> dev.uuid                d07dc435-129d-477d-8378-a6af75199852
>> dev.sectors_per_block   8
>> dev.sectors_per_bucket  1024
>> dev.data.first_sector   16
>> dev.data.cache_mode     1 [writeback]
>> dev.data.cache_state    1 [clean]
>> cset.uuid               d87713c6-2e76-4a09-8517-d48306468659
>>
>> - check the available cache
>>
>> # cat /sys/fs/bcache/d87713c6-2e76-4a09-8517-d48306468659/cache_available_percent
>> 27
>>
> What is the content of
> /sys/fs/bcache/<cache-set-uuid>/cache0/priority_stats ? Can you paste
> it here too?
I forgot to get the info before I triggered gc... I think I can reproduce the problem, and when I do, I will collect the information.
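
For reference, this is what I plan to capture next time, before and after triggering gc (the paths assume the cset.uuid shown above, with the single cache device registered as cache0):

# cat /sys/fs/bcache/d87713c6-2e76-4a09-8517-d48306468659/cache0/priority_stats
# cat /sys/fs/bcache/d87713c6-2e76-4a09-8517-d48306468659/cache_available_percent
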
>
> There are no dirty blocks, but the cache still occupies 78% of the
> buckets. If you are using a 5.8+ kernel, then a gc is probably desired.
>
> You may try to trigger a gc by writing to
> /sys/fs/bcache/<cache-set-uuid>/internal/trigger_gc
>
After all dirty data had been written back, I triggered gc and the space was reclaimed.

root@WXS0106:~# cat /sys/block/bcache0/bcache/cache/cache_available_percent
30

root@WXS0106:~# echo 1 > /sys/block/bcache0/bcache/cache/internal/trigger_gc
root@WXS0106:~# cat /sys/block/bcache0/bcache/cache/cache_available_percent
97

Why must I trigger gc manually? Isn't it a default action of the bcache gc thread? And I found it only works when all dirty data has been written back.
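
Until that is clear, a workaround is to poke trigger_gc from a loop. Below is a minimal sketch (the 30% threshold and 60-second interval are arbitrary choices for illustration, not bcache defaults):

#!/bin/sh
# Poll cache_available_percent and force a gc run when it drops too low.
SYSFS=/sys/block/bcache0/bcache/cache
THRESHOLD=30   # percent; arbitrary, tune for the workload
while true; do
    avail=$(cat "$SYSFS/cache_available_percent")
    if [ "$avail" -lt "$THRESHOLD" ]; then
        echo 1 > "$SYSFS/internal/trigger_gc"
    fi
    sleep 60
done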

>> As the doc describes:
>>
>> cache_available_percent
>>     Percentage of cache device which doesn’t contain dirty data, and could potentially be used for writeback. This doesn’t mean this space isn’t used for clean cached data; the unused statistic (in priority_stats) is typically much lower.
>> When all dirty data has been written back, why is cache_available_percent not 100?
>>
>> And when I start the write I/O, the new writes don't replace the clean cache (does it think the cache is dirty now?), so the HDD sees large latency:
>>
>> ./bin/iosnoop -Q -d '8,80'
>>
>> <...>        73338  WS   8,80     3513701472   4096     217.69
>> <...>        73338  WS   8,80     3513759360   4096     448.80
>> <...>        73338  WS   8,80     3562211912   4096     511.69
>> <...>        73335  WS   8,80     3562212528   4096     505.08
>> <...>        73339  WS   8,80     3562213376   4096     501.19
>> <...>        73336  WS   8,80     3562213992   4096     511.16
>> <...>        73343  WS   8,80     3562214016   4096     511.74
>> <...>        73340  WS   8,80     3562214128   4096     512.95
>> <...>        73329  WS   8,80     3562214208   4096     510.48
>> <...>        73338  WS   8,80     3562214600   4096     518.64
>> <...>        73341  WS   8,80     3562214632   4096     519.09
>> <...>        73342  WS   8,80     3562214664   4096     518.28
>> <...>        73336  WS   8,80     3562214688   4096     519.27
>> <...>        73343  WS   8,80     3562214736   4096     528.31
>> <...>        73339  WS   8,80     3562214784   4096     530.13
>>
> I'm just wondering why the gc thread doesn't start running on its own ...
>
>
> Thanks.
>
> Coly Li
>
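
A quick way to summarize a longer capture like the iosnoop trace above (a sketch assuming the default perf-tools iosnoop columns, where the latency in ms is the last field of each data line; trace.txt is just a placeholder for the saved output):

awk '$NF ~ /^[0-9.]+$/ { sum += $NF; if ($NF > max) max = $NF; n++ }
     END { if (n) printf "avg=%.2f ms  max=%.2f ms  (%d I/Os)\n", sum/n, max, n }' trace.txt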
