From: Dongsheng Yang <dongsheng.yang@easystack.cn>
To: Dongdong Tao <dongdong.tao@canonical.com>
Cc: Dongdong Tao <tdd21151186@gmail.com>, Coly Li <colyli@suse.de>,
	Gavin Guo <gavin.guo@canonical.com>,
	Gerald Yang <gerald.yang@canonical.com>,
	Trent Lloyd <trent.lloyd@canonical.com>,
	Kent Overstreet <kent.overstreet@gmail.com>,
	"open list:BCACHE (BLOCK LAYER CACHE)"
	<linux-bcache@vger.kernel.org>,
	open list <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH] bcache: consider the fragmentation when update the writeback rate
Date: Wed, 9 Dec 2020 15:49:26 +0800	[thread overview]
Message-ID: <ce04461c-ca5c-781d-7aad-cdad3ebadac2@easystack.cn> (raw)
In-Reply-To: <CAJS8hV+UmLFQVhuqUin1Ze6kLtAO7paXH95+9gaiZgM19VZe1A@mail.gmail.com>


On Wed, Dec 9, 2020 at 12:48 PM, Dongdong Tao wrote:
> Hi Dongsheng,
>
> I'm working on it; the next step is to gather some testing data and
> upload it (very sorry for the delay...).
> Thanks for the comment.
> One of the main concerns when alleviating this issue through the
> writeback process is that we need to minimize the impact on client IO
> performance.
> writeback_percent is 10 by default; starting writeback when dirty
> buckets reach 10 percent might be a bit too aggressive, given that
> writeback_cutoff_sync is 70 percent.
> So I chose to start the writeback when dirty buckets reach 50
> percent, so that this patch only takes effect once the dirty bucket
> percentage is above that.

Agreed, reusing writeback_percent for this would be too aggressive, and
less flexible.

Okay, let's wait for your testing results.


Thanx

>
> Thanks,
> Dongdong
>
>
>
>
> On Wed, Dec 9, 2020 at 10:27 AM Dongsheng Yang
> <dongsheng.yang@easystack.cn> wrote:
>>
>> On Tue, Nov 3, 2020 at 8:42 PM, Dongdong Tao wrote:
>>> From: dongdong tao <dongdong.tao@canonical.com>
>>>
>>> The current way of calculating the writeback rate only considers the
>>> dirty sectors. This usually works fine when the fragmentation
>>> is not high, but it gives an unreasonably small rate when
>>> very few dirty sectors consume a lot of dirty buckets. In some
>>> cases, the dirty buckets can reach CUTOFF_WRITEBACK_SYNC while the
>>> dirty data (sectors) has not even reached the writeback_percent;
>>> the writeback rate will then still be the minimum value (4k), which
>>> causes all writes to get stuck in a non-writeback mode because of
>>> the slow writeback.
>>>
>>> This patch tries to accelerate the writeback rate when the
>>> fragmentation is high. It calculates the proportional_scaled value
>>> as follows:
>>> (dirty_sectors / writeback_rate_p_term_inverse) * fragment
>>> As we can see, higher fragmentation results in a larger
>>> proportional_scaled value, and thus a larger writeback rate.
>>> The fragment value is calculated as follows:
>>> (dirty_buckets * bucket_size) / dirty_sectors
>>> Since each dirty bucket holds between 1 and bucket_size dirty
>>> sectors, the value of fragment will always be inside [1, bucket_size].
>>>
>>> This patch only considers the fragmentation once the number of
>>> dirty buckets reaches a dirty threshold (configurable via
>>> writeback_fragment_percent, default 50), so bcache retains
>>> its original behaviour until the dirty buckets reach
>>> the threshold.
>>>
>>> Signed-off-by: dongdong tao <dongdong.tao@canonical.com>
>>> ---
>>>    drivers/md/bcache/bcache.h    |  1 +
>>>    drivers/md/bcache/sysfs.c     |  6 ++++++
>>>    drivers/md/bcache/writeback.c | 21 +++++++++++++++++++++
>>>    3 files changed, 28 insertions(+)
>>>
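For illustration, here is a rough standalone sketch of the arithmetic the
commit message above describes, written as plain userspace C rather than
kernel code. All numbers (bucket size, p_term_inverse, the dirty counts)
are invented; the point is only to show how the fragment multiplier lifts
the proportional term when few dirty sectors occupy many buckets.

/*
 * Illustrative sketch only, not bcache code: compute the "fragment"
 * multiplier and the scaled proportional term for a low- and a
 * high-fragmentation case, using made-up numbers.
 */
#include <stdio.h>
#include <stdint.h>

int main(void)
{
        const int64_t bucket_size = 1024;     /* sectors per bucket (example) */
        const int64_t p_term_inverse = 40;    /* writeback_rate_p_term_inverse */

        /* Case 1: low fragmentation - each dirty bucket is almost full. */
        int64_t dirty_buckets = 1000;
        int64_t dirty_sectors = 1000 * 1000;  /* ~1000 dirty sectors per bucket */
        int64_t fragment = (dirty_buckets * bucket_size) / dirty_sectors;
        printf("low  fragmentation: fragment=%lld p_scaled=%lld\n",
               (long long)fragment,
               (long long)(dirty_sectors / p_term_inverse * fragment));

        /* Case 2: high fragmentation - same bucket count, few dirty sectors. */
        dirty_sectors = 10 * 1000;            /* only ~10 dirty sectors per bucket */
        fragment = (dirty_buckets * bucket_size) / dirty_sectors;
        printf("high fragmentation: fragment=%lld p_scaled=%lld\n",
               (long long)fragment,
               (long long)(dirty_sectors / p_term_inverse * fragment));

        return 0;
}

Without the fragment multiplier, the high-fragmentation case would get a
proportional term of only 10000 / 40 = 250, roughly 100x smaller than the
low-fragmentation case (25000), even though both occupy the same 1000
dirty buckets; with the multiplier, the two cases come out comparable.
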
>>> diff --git a/drivers/md/bcache/bcache.h b/drivers/md/bcache/bcache.h
>>> index 1d57f48307e6..87632f7032b6 100644
>>> --- a/drivers/md/bcache/bcache.h
>>> +++ b/drivers/md/bcache/bcache.h
>>> @@ -374,6 +374,7 @@ struct cached_dev {
>>>        unsigned int            writeback_metadata:1;
>>>        unsigned int            writeback_running:1;
>>>        unsigned char           writeback_percent;
>>> +     unsigned char           writeback_fragment_percent;
>>>        unsigned int            writeback_delay;
>>>
>>>        uint64_t                writeback_rate_target;
>>> diff --git a/drivers/md/bcache/sysfs.c b/drivers/md/bcache/sysfs.c
>>> index 554e3afc9b68..69499113aef8 100644
>>> --- a/drivers/md/bcache/sysfs.c
>>> +++ b/drivers/md/bcache/sysfs.c
>>> @@ -115,6 +115,7 @@ rw_attribute(stop_when_cache_set_failed);
>>>    rw_attribute(writeback_metadata);
>>>    rw_attribute(writeback_running);
>>>    rw_attribute(writeback_percent);
>>> +rw_attribute(writeback_fragment_percent);
>>
>> Hi Dongdong and Coly,
>>
>>       What is the status of this patch? In my opinion, it is a problem
>> we need to solve, but can we just reuse the parameter
>> writeback_percent rather than introducing a new
>> writeback_fragment_percent?
>>
>> That means the semantics of writeback_percent would apply to both the
>> dirty data percentage and the dirty bucket percentage.
>>
>> When we find that there are more dirty buckets than (c->nbuckets *
>> writeback_percent), start the writeback.
>>
>>
>> Thanx
>>
>> Yang
>>
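As a throwaway illustration of the suggestion quoted above (not a proposed
patch): a plain userspace sketch of reusing writeback_percent as a
dirty-bucket threshold instead of adding a new knob. The names and numbers
are invented, and it relies on the same approximation the patch below uses,
i.e. treating gc_stats.in_use as the bucket-usage percentage.

#include <stdio.h>
#include <stdint.h>
#include <stdbool.h>

int main(void)
{
        /* Invented numbers; the names mirror fields in the patch context. */
        uint64_t nbuckets = 100000;           /* c->nbuckets */
        unsigned int in_use = 55;             /* c->gc_stats.in_use, a percentage */
        unsigned int writeback_percent = 10;  /* dc->writeback_percent */

        /* The suggestion: writeback_percent also acts as a threshold on
         * dirty buckets, so no separate writeback_fragment_percent knob. */
        uint64_t bucket_threshold = nbuckets * writeback_percent / 100;
        uint64_t buckets_in_use = nbuckets * in_use / 100;
        bool start_writeback = buckets_in_use > bucket_threshold;

        printf("threshold=%llu buckets, in use=%llu buckets -> start writeback: %s\n",
               (unsigned long long)bucket_threshold,
               (unsigned long long)buckets_in_use,
               start_writeback ? "yes" : "no");
        return 0;
}
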
>>>    rw_attribute(writeback_delay);
>>>    rw_attribute(writeback_rate);
>>>
>>> @@ -197,6 +198,7 @@ SHOW(__bch_cached_dev)
>>>        var_printf(writeback_running,   "%i");
>>>        var_print(writeback_delay);
>>>        var_print(writeback_percent);
>>> +     var_print(writeback_fragment_percent);
>>>        sysfs_hprint(writeback_rate,
>>>                     wb ? atomic_long_read(&dc->writeback_rate.rate) << 9 : 0);
>>>        sysfs_printf(io_errors,         "%i", atomic_read(&dc->io_errors));
>>> @@ -308,6 +310,9 @@ STORE(__cached_dev)
>>>        sysfs_strtoul_clamp(writeback_percent, dc->writeback_percent,
>>>                            0, bch_cutoff_writeback);
>>>
>>> +     sysfs_strtoul_clamp(writeback_fragment_percent, dc->writeback_fragment_percent,
>>> +                         0, bch_cutoff_writeback_sync);
>>> +
>>>        if (attr == &sysfs_writeback_rate) {
>>>                ssize_t ret;
>>>                long int v = atomic_long_read(&dc->writeback_rate.rate);
>>> @@ -498,6 +503,7 @@ static struct attribute *bch_cached_dev_files[] = {
>>>        &sysfs_writeback_running,
>>>        &sysfs_writeback_delay,
>>>        &sysfs_writeback_percent,
>>> +     &sysfs_writeback_fragment_percent,
>>>        &sysfs_writeback_rate,
>>>        &sysfs_writeback_rate_update_seconds,
>>>        &sysfs_writeback_rate_i_term_inverse,
>>> diff --git a/drivers/md/bcache/writeback.c b/drivers/md/bcache/writeback.c
>>> index 3c74996978da..34babc89fdf3 100644
>>> --- a/drivers/md/bcache/writeback.c
>>> +++ b/drivers/md/bcache/writeback.c
>>> @@ -88,6 +88,26 @@ static void __update_writeback_rate(struct cached_dev *dc)
>>>        int64_t integral_scaled;
>>>        uint32_t new_rate;
>>>
>>> +     /*
>>> +      * We need to consider the number of dirty buckets as well
>>> +      * when calculating the proportional_scaled value. Otherwise we might
>>> +      * get an unreasonably small writeback rate in a highly fragmented
>>> +      * situation where very few dirty sectors consume a lot of dirty
>>> +      * buckets. The worst case is when dirty_data has reached
>>> +      * writeback_percent and dirty buckets have reached cutoff_writeback_sync,
>>> +      * but the rate is still at the minimum value, which will cause
>>> +      * writes to get stuck in a non-writeback mode.
>>> +      */
>>> +     struct cache_set *c = dc->disk.c;
>>> +
>>> +     if (c->gc_stats.in_use > dc->writeback_fragment_percent && dirty > 0) {
>>> +             int64_t dirty_buckets = (c->gc_stats.in_use * c->nbuckets) / 100;
>>> +             int64_t fragment = (dirty_buckets *  c->cache->sb.bucket_size) / dirty;
>>> +
>>> +             proportional_scaled =
>>> +             div_s64(dirty, dc->writeback_rate_p_term_inverse) * (fragment);
>>> +     }
>>> +
>>>        if ((error < 0 && dc->writeback_rate_integral > 0) ||
>>>            (error > 0 && time_before64(local_clock(),
>>>                         dc->writeback_rate.next + NSEC_PER_MSEC))) {
>>> @@ -969,6 +989,7 @@ void bch_cached_dev_writeback_init(struct cached_dev *dc)
>>>        dc->writeback_metadata          = true;
>>>        dc->writeback_running           = false;
>>>        dc->writeback_percent           = 10;
>>> +     dc->writeback_fragment_percent  = 50;
>>>        dc->writeback_delay             = 30;
>>>        atomic_long_set(&dc->writeback_rate.rate, 1024);
>>>        dc->writeback_rate_minimum      = 8;

Thread overview: 24+ messages
2020-11-03 12:42 [PATCH] bcache: consider the fragmentation when update the writeback rate Dongdong Tao
2020-11-04  5:06 ` kernel test robot
2020-11-05 16:32 ` Coly Li
2020-11-10  4:19   ` Dongdong Tao
2020-11-11  8:33     ` Coly Li
2020-12-09  2:27 ` Dongsheng Yang
2020-12-09  4:48   ` Dongdong Tao
2020-12-09  7:49     ` Dongsheng Yang [this message]
     [not found]       ` <CAJS8hVLMUS1mdrwC8ovzvMO+HWf4xtXRCNJEghtbtW0g93Kh_g@mail.gmail.com>
2020-12-14 17:07         ` Coly Li
     [not found]           ` <CAJS8hVKMjec1cpe_zoeZAJrfY0Pq9bJ51eO6E+g8pgN9jV3Nmw@mail.gmail.com>
2020-12-21  8:08             ` Coly Li
2021-01-05  3:06 Dongdong Tao
     [not found] ` <CAJS8hVK-ZCxJt=E3hwR0hmqPYL1T07_WC_nerb-dZodO+DqtDA@mail.gmail.com>
2021-01-05  4:33   ` Coly Li
     [not found]     ` <CAJS8hVL2B=RZr8H4jFbz=bX9k_E9ur7kTeue6BJwzm4pwv1+qQ@mail.gmail.com>
2021-01-08  4:05       ` Coly Li
2021-01-08  8:30         ` Dongdong Tao
2021-01-08  8:39           ` Coly Li
2021-01-08  8:47             ` Dongdong Tao
2021-01-14  4:55               ` Dongdong Tao
     [not found]               ` <CAJS8hVJDaREvpvG4iO+Xs-KQXQKFi7=k29TrG=NXqjyiPpUCZA@mail.gmail.com>
2021-01-14 10:05                 ` Coly Li
2021-01-14 12:22                   ` Dongdong Tao
2021-01-14 13:31                     ` Coly Li
2021-01-14 15:35                       ` Dongdong Tao
2021-01-17  7:11 ` kernel test robot
2021-01-19 12:56 Dongdong Tao
2021-01-19 14:06 ` Coly Li
