From: Dongsheng Yang <dongsheng.yang@easystack.cn>
To: Dongdong Tao <dongdong.tao@canonical.com>
Cc: Dongdong Tao <tdd21151186@gmail.com>, Coly Li <colyli@suse.de>,
Gavin Guo <gavin.guo@canonical.com>,
Gerald Yang <gerald.yang@canonical.com>,
Trent Lloyd <trent.lloyd@canonical.com>,
Kent Overstreet <kent.overstreet@gmail.com>,
"open list:BCACHE (BLOCK LAYER CACHE)"
<linux-bcache@vger.kernel.org>,
open list <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH] bcache: consider the fragmentation when update the writeback rate
Date: Wed, 9 Dec 2020 15:49:26 +0800 [thread overview]
Message-ID: <ce04461c-ca5c-781d-7aad-cdad3ebadac2@easystack.cn> (raw)
In-Reply-To: <CAJS8hV+UmLFQVhuqUin1Ze6kLtAO7paXH95+9gaiZgM19VZe1A@mail.gmail.com>
在 2020/12/9 星期三 下午 12:48, Dongdong Tao 写道:
> Hi Dongsheng,
>
> I'm working on it, next step I'm gathering some testing data and
> upload (very sorry for the delay...)
> Thanks for the comment.
> One of the main concerns to alleviate this issue with the writeback
> process is that we need to minimize the impact on the client IO
> performance.
> writeback_percent by default is 10, start writeback when dirty buckets
> reached 10 percent might be a bit too aggressive, as the
> writeback_cutoff_sync is 70 percent.
> So i chose to start the writeback when dirty buckets reached 50
> percent so that this patch will only take effect after dirty buckets
> percent is above that
Agree with that's too aggressive to reuse writeback_percent, and that's
less flexable.
Okey, let's wait for your testing result.
Thanx
>
> Thanks,
> Dongdong
>
>
>
>
> On Wed, Dec 9, 2020 at 10:27 AM Dongsheng Yang
> <dongsheng.yang@easystack.cn> wrote:
>>
>> 在 2020/11/3 星期二 下午 8:42, Dongdong Tao 写道:
>>> From: dongdong tao <dongdong.tao@canonical.com>
>>>
>>> Current way to calculate the writeback rate only considered the
>>> dirty sectors, this usually works fine when the fragmentation
>>> is not high, but it will give us unreasonable small rate when
>>> we are under a situation that very few dirty sectors consumed
>>> a lot dirty buckets. In some case, the dirty bucekts can reached
>>> to CUTOFF_WRITEBACK_SYNC while the dirty data(sectors) noteven
>>> reached the writeback_percent, the writeback rate will still
>>> be the minimum value (4k), thus it will cause all the writes to be
>>> stucked in a non-writeback mode because of the slow writeback.
>>>
>>> This patch will try to accelerate the writeback rate when the
>>> fragmentation is high. It calculate the propotional_scaled value
>>> based on below:
>>> (dirty_sectors / writeback_rate_p_term_inverse) * fragment
>>> As we can see, the higher fragmentation will result a larger
>>> proportional_scaled value, thus cause a larger writeback rate.
>>> The fragment value is calculated based on below:
>>> (dirty_buckets * bucket_size) / dirty_sectors
>>> If you think about it, the value of fragment will be always
>>> inside [1, bucket_size].
>>>
>>> This patch only considers the fragmentation when the number of
>>> dirty_buckets reached to a dirty threshold(configurable by
>>> writeback_fragment_percent, default is 50), so bcache will
>>> remain the original behaviour before the dirty buckets reached
>>> the threshold.
>>>
>>> Signed-off-by: dongdong tao <dongdong.tao@canonical.com>
>>> ---
>>> drivers/md/bcache/bcache.h | 1 +
>>> drivers/md/bcache/sysfs.c | 6 ++++++
>>> drivers/md/bcache/writeback.c | 21 +++++++++++++++++++++
>>> 3 files changed, 28 insertions(+)
>>>
>>> diff --git a/drivers/md/bcache/bcache.h b/drivers/md/bcache/bcache.h
>>> index 1d57f48307e6..87632f7032b6 100644
>>> --- a/drivers/md/bcache/bcache.h
>>> +++ b/drivers/md/bcache/bcache.h
>>> @@ -374,6 +374,7 @@ struct cached_dev {
>>> unsigned int writeback_metadata:1;
>>> unsigned int writeback_running:1;
>>> unsigned char writeback_percent;
>>> + unsigned char writeback_fragment_percent;
>>> unsigned int writeback_delay;
>>>
>>> uint64_t writeback_rate_target;
>>> diff --git a/drivers/md/bcache/sysfs.c b/drivers/md/bcache/sysfs.c
>>> index 554e3afc9b68..69499113aef8 100644
>>> --- a/drivers/md/bcache/sysfs.c
>>> +++ b/drivers/md/bcache/sysfs.c
>>> @@ -115,6 +115,7 @@ rw_attribute(stop_when_cache_set_failed);
>>> rw_attribute(writeback_metadata);
>>> rw_attribute(writeback_running);
>>> rw_attribute(writeback_percent);
>>> +rw_attribute(writeback_fragment_percent);
>>
>> Hi Dongdong and Coly,
>>
>> What is the status about this patch? In my opinion, it is a problem
>> we need to solve,
>>
>> but can we just reuse the parameter of writeback_percent, rather than
>> introduce a new writeback_fragment_percent?
>>
>> That means the semantic of writeback_percent will act on dirty data
>> percent and dirty bucket percent.
>>
>> When we found there are dirty buckets more than (c->nbuckets *
>> writeback_percent), start the writeback.
>>
>>
>> Thanx
>>
>> Yang
>>
>>> rw_attribute(writeback_delay);
>>> rw_attribute(writeback_rate);
>>>
>>> @@ -197,6 +198,7 @@ SHOW(__bch_cached_dev)
>>> var_printf(writeback_running, "%i");
>>> var_print(writeback_delay);
>>> var_print(writeback_percent);
>>> + var_print(writeback_fragment_percent);
>>> sysfs_hprint(writeback_rate,
>>> wb ? atomic_long_read(&dc->writeback_rate.rate) << 9 : 0);
>>> sysfs_printf(io_errors, "%i", atomic_read(&dc->io_errors));
>>> @@ -308,6 +310,9 @@ STORE(__cached_dev)
>>> sysfs_strtoul_clamp(writeback_percent, dc->writeback_percent,
>>> 0, bch_cutoff_writeback);
>>>
>>> + sysfs_strtoul_clamp(writeback_fragment_percent, dc->writeback_fragment_percent,
>>> + 0, bch_cutoff_writeback_sync);
>>> +
>>> if (attr == &sysfs_writeback_rate) {
>>> ssize_t ret;
>>> long int v = atomic_long_read(&dc->writeback_rate.rate);
>>> @@ -498,6 +503,7 @@ static struct attribute *bch_cached_dev_files[] = {
>>> &sysfs_writeback_running,
>>> &sysfs_writeback_delay,
>>> &sysfs_writeback_percent,
>>> + &sysfs_writeback_fragment_percent,
>>> &sysfs_writeback_rate,
>>> &sysfs_writeback_rate_update_seconds,
>>> &sysfs_writeback_rate_i_term_inverse,
>>> diff --git a/drivers/md/bcache/writeback.c b/drivers/md/bcache/writeback.c
>>> index 3c74996978da..34babc89fdf3 100644
>>> --- a/drivers/md/bcache/writeback.c
>>> +++ b/drivers/md/bcache/writeback.c
>>> @@ -88,6 +88,26 @@ static void __update_writeback_rate(struct cached_dev *dc)
>>> int64_t integral_scaled;
>>> uint32_t new_rate;
>>>
>>> + /*
>>> + * We need to consider the number of dirty buckets as well
>>> + * when calculating the proportional_scaled, Otherwise we might
>>> + * have an unreasonable small writeback rate at a highly fragmented situation
>>> + * when very few dirty sectors consumed a lot dirty buckets, the
>>> + * worst case is when dirty_data reached writeback_percent and
>>> + * dirty buckets reached to cutoff_writeback_sync, but the rate
>>> + * still will be at the minimum value, which will cause the write
>>> + * stuck at a non-writeback mode.
>>> + */
>>> + struct cache_set *c = dc->disk.c;
>>> +
>>> + if (c->gc_stats.in_use > dc->writeback_fragment_percent && dirty > 0) {
>>> + int64_t dirty_buckets = (c->gc_stats.in_use * c->nbuckets) / 100;
>>> + int64_t fragment = (dirty_buckets * c->cache->sb.bucket_size) / dirty;
>>> +
>>> + proportional_scaled =
>>> + div_s64(dirty, dc->writeback_rate_p_term_inverse) * (fragment);
>>> + }
>>> +
>>> if ((error < 0 && dc->writeback_rate_integral > 0) ||
>>> (error > 0 && time_before64(local_clock(),
>>> dc->writeback_rate.next + NSEC_PER_MSEC))) {
>>> @@ -969,6 +989,7 @@ void bch_cached_dev_writeback_init(struct cached_dev *dc)
>>> dc->writeback_metadata = true;
>>> dc->writeback_running = false;
>>> dc->writeback_percent = 10;
>>> + dc->writeback_fragment_percent = 50;
>>> dc->writeback_delay = 30;
>>> atomic_long_set(&dc->writeback_rate.rate, 1024);
>>> dc->writeback_rate_minimum = 8;
next prev parent reply other threads:[~2020-12-09 7:50 UTC|newest]
Thread overview: 24+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-11-03 12:42 [PATCH] bcache: consider the fragmentation when update the writeback rate Dongdong Tao
2020-11-04 5:06 ` kernel test robot
2020-11-05 16:32 ` Coly Li
2020-11-10 4:19 ` Dongdong Tao
2020-11-11 8:33 ` Coly Li
2020-12-09 2:27 ` Dongsheng Yang
2020-12-09 4:48 ` Dongdong Tao
2020-12-09 7:49 ` Dongsheng Yang [this message]
[not found] ` <CAJS8hVLMUS1mdrwC8ovzvMO+HWf4xtXRCNJEghtbtW0g93Kh_g@mail.gmail.com>
2020-12-14 17:07 ` Coly Li
[not found] ` <CAJS8hVKMjec1cpe_zoeZAJrfY0Pq9bJ51eO6E+g8pgN9jV3Nmw@mail.gmail.com>
2020-12-21 8:08 ` Coly Li
2021-01-05 3:06 Dongdong Tao
[not found] ` <CAJS8hVK-ZCxJt=E3hwR0hmqPYL1T07_WC_nerb-dZodO+DqtDA@mail.gmail.com>
2021-01-05 4:33 ` Coly Li
[not found] ` <CAJS8hVL2B=RZr8H4jFbz=bX9k_E9ur7kTeue6BJwzm4pwv1+qQ@mail.gmail.com>
2021-01-08 4:05 ` Coly Li
2021-01-08 8:30 ` Dongdong Tao
2021-01-08 8:39 ` Coly Li
2021-01-08 8:47 ` Dongdong Tao
2021-01-14 4:55 ` Dongdong Tao
[not found] ` <CAJS8hVJDaREvpvG4iO+Xs-KQXQKFi7=k29TrG=NXqjyiPpUCZA@mail.gmail.com>
2021-01-14 10:05 ` Coly Li
2021-01-14 12:22 ` Dongdong Tao
2021-01-14 13:31 ` Coly Li
2021-01-14 15:35 ` Dongdong Tao
2021-01-17 7:11 ` kernel test robot
2021-01-19 12:56 Dongdong Tao
2021-01-19 14:06 ` Coly Li
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=ce04461c-ca5c-781d-7aad-cdad3ebadac2@easystack.cn \
--to=dongsheng.yang@easystack.cn \
--cc=colyli@suse.de \
--cc=dongdong.tao@canonical.com \
--cc=gavin.guo@canonical.com \
--cc=gerald.yang@canonical.com \
--cc=kent.overstreet@gmail.com \
--cc=linux-bcache@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=tdd21151186@gmail.com \
--cc=trent.lloyd@canonical.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).