All of lore.kernel.org
 help / color / mirror / Atom feed
From: tang.junhui@zte.com.cn
To: mlyle@lyle.org
Cc: linux-bcache@vger.kernel.org, linux-block@vger.kernel.org,
	tang.junhui@zte.com.cn
Subject: Re: [PATCH v3] bcache: fix writeback target calc on large devices
Date: Sat, 6 Jan 2018 15:29:57 +0800	[thread overview]
Message-ID: <1515223797-25169-1-git-send-email-tang.junhui@zte.com.cn> (raw)

From: Tang Junhui <tang.junhui@zte.com.cn>

Hello Mike,

I thought twice, and feel this patch is a little complex and still not very accurate for
small backend devices. I think we can resolve it like this:

     uint64_t cache_dirty_target =
         div_u64(cache_sectors * dc->writeback_percent, 100);

-    int64_t target = div64_u64(cache_dirty_target * bdev_sectors(dc->bdev),
-                   c->cached_dev_sectors);
+    int64_t target = div64_u64(cache_dirty_target,
+                    div64_u64(c->cached_dev_sectors, bdev_sectors(dc->bdev)));

c->cached_dev_sectors is always bigger than bdev_sectors(dc->bdev), so 
div64_u64(c->cached_dev_sectors, bdev_sectors(dc->bdev) would be never smaller than 1,
and this expression has no multiplication operation, so we do not need to worry
about overflows.

we also can multiply a value(2^10) to avoid loss of accuracy in division like bellow:
(it would support all device smaller than 2^54 BYTE).
+    int64_t target = div64_u64(cache_dirty_target<<10,                            
+                    div64_u64(c->cached_dev_sectors<<10, bdev_sectors(dc->bdev))); 

How do you think about?

> Bcache needs to scale the dirty data in the cache over the multiple
> backing disks in order to calculate writeback rates for each.
> The previous code did this by multiplying the target number of dirty
> sectors by the backing device size, and expected it to fit into a
> uint64_t; this blows up on relatively small backing devices.
> 
> The new approach figures out the bdev's share in 16384ths of the overall
> cached data.  This is chosen to cope well when bdevs drastically vary in
> size and to ensure that bcache can cross the petabyte boundary for each
> backing device.
> 
> This has been improved based on Tang Junhui's feedback to ensure that
> every device gets a share of dirty data, no matter how small it is
> compared to the total backing pool.
> 
> Reported-by: Jack Douglas <jack@douglastechnology.co.uk>
> Signed-off-by: Michael Lyle <mlyle@lyle.org>
> ---
>  drivers/md/bcache/writeback.c | 35 +++++++++++++++++++++++++++++++----
>  1 file changed, 31 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/md/bcache/writeback.c b/drivers/md/bcache/writeback.c
> index 61f24c04cebd..c7e35180091e 100644
> --- a/drivers/md/bcache/writeback.c
> +++ b/drivers/md/bcache/writeback.c
> @@ -18,17 +18,43 @@
>  #include <trace/events/bcache.h>
>  
>  /* Rate limiting */
> -
> -static void __update_writeback_rate(struct cached_dev *dc)
> +static uint64_t __calc_target_rate(struct cached_dev *dc)
>  {
>      struct cache_set *c = dc->disk.c;
> +
> +    /*
> +     * This is the size of the cache, minus the amount used for
> +     * flash-only devices
> +     */
>      uint64_t cache_sectors = c->nbuckets * c->sb.bucket_size -
>                  bcache_flash_devs_sectors_dirty(c);
> +
> +    /*
> +     * Unfortunately there is no control of global dirty data.  If the
> +     * user states that they want 10% dirty data in the cache, and has,
> +     * e.g., 5 backing volumes of equal size, we try and ensure each
> +     * backing volume uses about 2% of the cache.
> +     *
> +     * 16384 is chosen here as something that each backing device should
> +     * be a reasonable fraction of the share, and not to blow up until
> +     * individual backing devices are a petabyte.
> +     */
> +    uint32_t bdev_share_per16k =
> +        div64_u64(16384 * bdev_sectors(dc->bdev),
> +                c->cached_dev_sectors);
> +
>      uint64_t cache_dirty_target =
>          div_u64(cache_sectors * dc->writeback_percent, 100);
> -    int64_t target = div64_u64(cache_dirty_target * bdev_sectors(dc->bdev),
> -                   c->cached_dev_sectors);
>  
> +    /* Ensure each backing dev gets at least 1/16384th dirty share */
> +    if (bdev_share_per16k < 1)
> +        bdev_share_per16k = 1;
> +
> +    return div_u64(cache_dirty_target * bdev_share_per16k, 16384);
> +}
> +
> +static void __update_writeback_rate(struct cached_dev *dc)
> +{
>      /*
>       * PI controller:
>       * Figures out the amount that should be written per second.
> @@ -49,6 +75,7 @@ static void __update_writeback_rate(struct cached_dev *dc)
>       * This acts as a slow, long-term average that is not subject to
>       * variations in usage like the p term.
>       */
> +    int64_t target = __calc_target_rate(dc);
>      int64_t dirty = bcache_dev_sectors_dirty(&dc->disk);
>      int64_t error = dirty - target;
>      int64_t proportional_scaled =
> -- 
> 2.14.1

Thanks,
Tang Junhui

             reply	other threads:[~2018-01-06  7:28 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-01-06  7:29 tang.junhui [this message]
2018-01-06 17:47 ` [PATCH v3] bcache: fix writeback target calc on large devices Michael Lyle
  -- strict thread matches above, loose matches on Subject: below --
2018-01-08  2:46 tang.junhui
2018-01-08 17:58 ` Michael Lyle
2018-01-05 21:17 Michael Lyle
2018-01-05 21:20 ` Michael Lyle

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1515223797-25169-1-git-send-email-tang.junhui@zte.com.cn \
    --to=tang.junhui@zte.com.cn \
    --cc=linux-bcache@vger.kernel.org \
    --cc=linux-block@vger.kernel.org \
    --cc=mlyle@lyle.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.