linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Shaohua Li <shli@fb.com>
To: <linux-block@vger.kernel.org>, <linux-kernel@vger.kernel.org>
Cc: <kernel-team@fb.com>, <axboe@fb.com>, <tj@kernel.org>,
	<vgoyal@redhat.com>
Subject: [PATCH V5 17/17] blk-throttle: add latency target support
Date: Thu, 15 Dec 2016 12:33:08 -0800	[thread overview]
Message-ID: <99757f2dd713e63fc74ea8ae004b1c50380ec718.1481833017.git.shli@fb.com> (raw)
In-Reply-To: <cover.1481833017.git.shli@fb.com>

One hard problem adding .low limit is to detect idle cgroup. If one
cgroup doesn't dispatch enough IO against its low limit, we must have a
mechanism to determine if other cgroups dispatch more IO. We added the
think time detection mechanism before, but it doesn't work for all
workloads. Here we add a latency based approach.

We already have mechanism to calculate latency threshold for each IO
size. For every IO dispatched from a cgorup, we compare its latency
against its threshold and record the info. If most IO latency is below
threshold (in the code I use 75%), the cgroup could be treated idle and
other cgroups can dispatch more IO.

Currently this latency target check is only for SSD as we can't
calcualte the latency target for hard disk. And this is only for cgroup
leaf node so far.

Signed-off-by: Shaohua Li <shli@fb.com>
---
 block/blk-throttle.c | 41 ++++++++++++++++++++++++++++++++++++-----
 1 file changed, 36 insertions(+), 5 deletions(-)

diff --git a/block/blk-throttle.c b/block/blk-throttle.c
index 1dc707a..915ebf5 100644
--- a/block/blk-throttle.c
+++ b/block/blk-throttle.c
@@ -162,6 +162,10 @@ struct throtl_grp {
 	u64 checked_last_finish_time;
 	u64 avg_ttime;
 	u64 idle_ttime_threshold;
+
+	unsigned int bio_cnt; /* total bios */
+	unsigned int bad_bio_cnt; /* bios exceeding latency threshold */
+	unsigned long bio_cnt_reset_time;
 };
 
 /* We measure latency for request size from <= 4k to >= 1M */
@@ -1688,11 +1692,14 @@ static bool throtl_tg_is_idle(struct throtl_grp *tg)
 	 * - single idle is too long, longer than a fixed value (in case user
 	 *   configure a too big threshold) or 4 times of slice
 	 * - average think time is more than threshold
+	 * - IO latency is largely below threshold
 	 */
 	u64 time = (u64)jiffies_to_usecs(4 * tg->td->throtl_slice) * 1000;
 	time = min_t(u64, MAX_IDLE_TIME, time);
 	return ktime_get_ns() - tg->last_finish_time > time ||
-	       tg->avg_ttime > tg->idle_ttime_threshold;
+	       tg->avg_ttime > tg->idle_ttime_threshold ||
+	       (tg->latency_target && tg->bio_cnt &&
+		tg->bad_bio_cnt * 5 < tg->bio_cnt);
 }
 
 static bool throtl_tg_can_upgrade(struct throtl_grp *tg)
@@ -2170,12 +2177,36 @@ void blk_throtl_bio_endio(struct bio *bio)
 
 	start_time = blk_stat_time(&bio->bi_issue_stat);
 	finish_time = __blk_stat_time(finish_time);
-	if (start_time && finish_time > start_time &&
-	    tg->td->track_bio_latency == 1 &&
-	    !(bio->bi_issue_stat.stat & SKIP_TRACK)) {
-		lat = finish_time - start_time;
+	if (!start_time || finish_time <= start_time)
+		return;
+
+	lat = finish_time - start_time;
+	if (tg->td->track_bio_latency == 1 &&
+	    !(bio->bi_issue_stat.stat & SKIP_TRACK))
 		throtl_track_latency(tg->td, blk_stat_size(&bio->bi_issue_stat),
 			bio_op(bio), lat);
+
+	if (tg->latency_target) {
+		int bucket;
+		unsigned int threshold;
+
+		bucket = request_bucket_index(
+			blk_stat_size(&bio->bi_issue_stat));
+		threshold = tg->td->avg_buckets[bucket].latency +
+			tg->latency_target;
+		if (lat > threshold)
+			tg->bad_bio_cnt++;
+		/*
+		 * Not race free, could get wrong count, which means cgroups
+		 * will be throttled
+		 */
+		tg->bio_cnt++;
+	}
+
+	if (time_after(jiffies, tg->bio_cnt_reset_time) || tg->bio_cnt > 1024) {
+		tg->bio_cnt_reset_time = tg->td->throtl_slice + jiffies;
+		tg->bio_cnt /= 2;
+		tg->bad_bio_cnt /= 2;
 	}
 }
 
-- 
2.9.3

  parent reply	other threads:[~2016-12-15 20:33 UTC|newest]

Thread overview: 35+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-12-15 20:32 [PATCH V5 00/17] blk-throttle: add .low limit Shaohua Li
2016-12-15 20:32 ` [PATCH V5 01/17] blk-throttle: use U64_MAX/UINT_MAX to replace -1 Shaohua Li
2016-12-15 20:32 ` [PATCH V5 02/17] blk-throttle: prepare support multiple limits Shaohua Li
2016-12-15 20:32 ` [PATCH V5 03/17] blk-throttle: add .low interface Shaohua Li
2017-01-09 16:35   ` Tejun Heo
2016-12-15 20:32 ` [PATCH V5 04/17] blk-throttle: configure bps/iops limit for cgroup in low limit Shaohua Li
2017-01-09 17:35   ` Tejun Heo
2016-12-15 20:32 ` [PATCH V5 05/17] blk-throttle: add upgrade logic for LIMIT_LOW state Shaohua Li
2017-01-09 18:40   ` Tejun Heo
2017-01-09 19:46     ` Tejun Heo
2016-12-15 20:32 ` [PATCH V5 06/17] blk-throttle: add downgrade logic Shaohua Li
2016-12-15 20:32 ` [PATCH V5 07/17] blk-throttle: make sure expire time isn't too big Shaohua Li
2017-01-09 19:54   ` Tejun Heo
2016-12-15 20:32 ` [PATCH V5 08/17] blk-throttle: make throtl_slice tunable Shaohua Li
2017-01-09 20:08   ` Tejun Heo
2016-12-15 20:33 ` [PATCH V5 09/17] blk-throttle: detect completed idle cgroup Shaohua Li
2017-01-09 20:13   ` Tejun Heo
2016-12-15 20:33 ` [PATCH V5 10/17] blk-throttle: make bandwidth change smooth Shaohua Li
2017-01-09 20:28   ` Tejun Heo
2016-12-15 20:33 ` [PATCH V5 11/17] blk-throttle: add a simple idle detection Shaohua Li
2017-01-09 20:56   ` Tejun Heo
2016-12-15 20:33 ` [PATCH V5 12/17] blk-throttle: add interface to configure idle time threshold Shaohua Li
2017-01-09 20:58   ` Tejun Heo
2016-12-15 20:33 ` [PATCH V5 13/17] blk-throttle: ignore idle cgroup limit Shaohua Li
2017-01-09 21:01   ` Tejun Heo
2016-12-15 20:33 ` [PATCH V5 14/17] blk-throttle: add interface for per-cgroup target latency Shaohua Li
2017-01-09 21:14   ` Tejun Heo
2016-12-15 20:33 ` [PATCH V5 15/17] block: track request size in blk_issue_stat Shaohua Li
2016-12-16  2:01   ` kbuild test robot
2017-01-09 21:17   ` Tejun Heo
2016-12-15 20:33 ` [PATCH V5 16/17] blk-throttle: add a mechanism to estimate IO latency Shaohua Li
2017-01-09 21:39   ` Tejun Heo
2016-12-15 20:33 ` Shaohua Li [this message]
2017-01-09 21:46 ` [PATCH V5 00/17] blk-throttle: add .low limit Tejun Heo
2017-01-09 22:27   ` Shaohua Li

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=99757f2dd713e63fc74ea8ae004b1c50380ec718.1481833017.git.shli@fb.com \
    --to=shli@fb.com \
    --cc=axboe@fb.com \
    --cc=kernel-team@fb.com \
    --cc=linux-block@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=tj@kernel.org \
    --cc=vgoyal@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).