From: Michael Lyle <mlyle@lyle.org>
To: linux-bcache@vger.kernel.org, linux-block@vger.kernel.org
Cc: axboe@fb.com, Michael Lyle <mlyle@lyle.org>
Subject: [416 PATCH 13/13] bcache: fix writeback target calc on large devices
Date: Mon,  8 Jan 2018 12:21:30 -0800
Message-ID: <20180108202130.31303-14-mlyle@lyle.org>
In-Reply-To: <20180108202130.31303-1-mlyle@lyle.org>

Bcache needs to scale the dirty data in the cache over the multiple
backing disks in order to calculate a writeback rate for each.
The previous code did this by multiplying the target number of dirty
sectors by the backing device size, and expected the product to fit into
a uint64_t; this overflows on relatively small backing devices.
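
As a rough userspace illustration of the failure described above (not
kernel code), the sketch below redoes the old arithmetic with an explicit
overflow check. The names old_target() and mul_overflows_u64() are
hypothetical and exist only for this example; sectors are assumed to be
512 bytes.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: true if a * b does not fit in a uint64_t. */
static bool mul_overflows_u64(uint64_t a, uint64_t b)
{
	return a != 0 && b > UINT64_MAX / a;
}

/*
 * Old-style target, following the lines removed by this patch:
 *   (cache_dirty_target * bdev_sectors) / cached_dev_sectors
 * Returns false when the intermediate product would wrap (or when the
 * divisor is zero), true and the target in *target otherwise.
 */
static bool old_target(uint64_t cache_sectors, unsigned int writeback_percent,
		       uint64_t bdev_sectors, uint64_t cached_dev_sectors,
		       uint64_t *target)
{
	uint64_t cache_dirty_target = cache_sectors * writeback_percent / 100;

	if (cached_dev_sectors == 0 ||
	    mul_overflows_u64(cache_dirty_target, bdev_sectors))
		return false;

	*target = cache_dirty_target * bdev_sectors / cached_dev_sectors;
	return true;
}
```

Under these assumptions, a 1 TiB cache (1 << 31 sectors) at 10% dirty
with a single 1 TiB backing device is fine, but the same cache in front
of a 64 TiB backing device (1 << 37 sectors) already wraps the product.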

The new approach figures out the bdev's share, in 16384ths, of the total
size of all cached devices.  This granularity is chosen to cope well when
bdevs vary drastically in size and to ensure that bcache can cross the
petabyte boundary for each backing device.

This has been improved based on Tang Junhui's feedback to ensure that
every device gets a share of dirty data, no matter how small it is
compared to the total backing pool.
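
The share calculation and the minimum-share clamp can be sketched in
userspace as follows. This is only an approximation of what the patch
does: calc_target() is a hypothetical stand-in that takes plain integers
instead of a struct cached_dev, and the zero-divisor guard is an addition
for the example, not something the patch contains.

```c
#include <stdint.h>

#define WRITEBACK_SHARE_SHIFT 14	/* shares are expressed in 16384ths */

/*
 * Hypothetical userspace model of the new target calculation.
 * cache_dirty_target:  dirty sectors wanted across the whole cache
 * bdev_sectors:        size of this backing device
 * cached_dev_sectors:  total size of all backing devices
 */
static uint64_t calc_target(uint64_t cache_dirty_target,
			    uint64_t bdev_sectors,
			    uint64_t cached_dev_sectors)
{
	uint64_t bdev_share;

	/* Guard added for this example only */
	if (cached_dev_sectors == 0)
		return 0;

	bdev_share = (bdev_sectors << WRITEBACK_SHARE_SHIFT) /
		     cached_dev_sectors;

	/* Ensure each backing dev gets at least one dirty share */
	if (bdev_share < 1)
		bdev_share = 1;

	return (cache_dirty_target * bdev_share) >> WRITEBACK_SHARE_SHIFT;
}
```

With a dirty target of 1638400 sectors, two equal devices each come out
at 819200, while a 1-sector device in a pool of 1 << 20 sectors is
clamped to one share and gets 100 sectors rather than 0.  In this model
the left shift only wraps once a single device reaches 1 << 50 sectors,
which is 512 PiB with 512-byte sectors.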

The existing mechanism is very limited; this is purely a bug fix to
remove limits on volume size.  However, a further change is still needed
to make this "fair" over many volumes where some are idle.

Reported-by: Jack Douglas <jack@douglastechnology.co.uk>
Signed-off-by: Michael Lyle <mlyle@lyle.org>
Reviewed-by: Tang Junhui <tang.junhui@zte.com.cn>
---
 drivers/md/bcache/writeback.c | 31 +++++++++++++++++++++++++++----
 drivers/md/bcache/writeback.h |  7 +++++++
 2 files changed, 34 insertions(+), 4 deletions(-)

diff --git a/drivers/md/bcache/writeback.c b/drivers/md/bcache/writeback.c
index 31b0a292a619..51306a19ab03 100644
--- a/drivers/md/bcache/writeback.c
+++ b/drivers/md/bcache/writeback.c
@@ -18,17 +18,39 @@
 #include <trace/events/bcache.h>
 
 /* Rate limiting */
-
-static void __update_writeback_rate(struct cached_dev *dc)
+static uint64_t __calc_target_rate(struct cached_dev *dc)
 {
 	struct cache_set *c = dc->disk.c;
+
+	/*
+	 * This is the size of the cache, minus the amount used for
+	 * flash-only devices
+	 */
 	uint64_t cache_sectors = c->nbuckets * c->sb.bucket_size -
 				bcache_flash_devs_sectors_dirty(c);
+
+	/*
+	 * Unfortunately there is no control of global dirty data.  If the
+	 * user states that they want 10% dirty data in the cache, and has,
+	 * e.g., 5 backing volumes of equal size, we try and ensure each
+	 * backing volume uses about 2% of the cache for dirty data.
+	 */
+	uint32_t bdev_share =
+		div64_u64(bdev_sectors(dc->bdev) << WRITEBACK_SHARE_SHIFT,
+				c->cached_dev_sectors);
+
 	uint64_t cache_dirty_target =
 		div_u64(cache_sectors * dc->writeback_percent, 100);
-	int64_t target = div64_u64(cache_dirty_target * bdev_sectors(dc->bdev),
-				   c->cached_dev_sectors);
 
+	/* Ensure each backing dev gets at least one dirty share */
+	if (bdev_share < 1)
+		bdev_share = 1;
+
+	return (cache_dirty_target * bdev_share) >> WRITEBACK_SHARE_SHIFT;
+}
+
+static void __update_writeback_rate(struct cached_dev *dc)
+{
 	/*
 	 * PI controller:
 	 * Figures out the amount that should be written per second.
@@ -49,6 +71,7 @@ static void __update_writeback_rate(struct cached_dev *dc)
 	 * This acts as a slow, long-term average that is not subject to
 	 * variations in usage like the p term.
 	 */
+	int64_t target = __calc_target_rate(dc);
 	int64_t dirty = bcache_dev_sectors_dirty(&dc->disk);
 	int64_t error = dirty - target;
 	int64_t proportional_scaled =
diff --git a/drivers/md/bcache/writeback.h b/drivers/md/bcache/writeback.h
index f102b1f9bc51..66f1c527fa24 100644
--- a/drivers/md/bcache/writeback.h
+++ b/drivers/md/bcache/writeback.h
@@ -8,6 +8,13 @@
 #define MAX_WRITEBACKS_IN_PASS  5
 #define MAX_WRITESIZE_IN_PASS   5000	/* *512b */
 
+/*
+ * 14 (16384ths) is chosen here so that each backing device's share is
+ * a reasonable fraction of the whole, and so that the calculation does
+ * not overflow until individual backing devices reach a petabyte.
+ */
+#define WRITEBACK_SHARE_SHIFT   14
+
 static inline uint64_t bcache_dev_sectors_dirty(struct bcache_device *d)
 {
 	uint64_t i, ret = 0;
-- 
2.14.1
