All of lore.kernel.org
 help / color / mirror / Atom feed
From: Tejun Heo <tj@kernel.org>
To: Shaohua Li <shli@fb.com>
Cc: linux-block@vger.kernel.org, linux-kernel@vger.kernel.org,
	kernel-team@fb.com, axboe@fb.com, vgoyal@redhat.com
Subject: Re: [PATCH V5 16/17] blk-throttle: add a mechanism to estimate IO latency
Date: Mon, 9 Jan 2017 16:39:19 -0500	[thread overview]
Message-ID: <20170109213919.GU12827@mtj.duckdns.org> (raw)
In-Reply-To: <53ba1d9ed13cf6d3fd5a51ad84ae3219571d6c2f.1481833017.git.shli@fb.com>

Hello,

On Thu, Dec 15, 2016 at 12:33:07PM -0800, Shaohua Li wrote:
> User configures latency target, but the latency threshold for each
> request size isn't fixed. For a SSD, the IO latency highly depends on
> request size. To calculate latency threshold, we sample some data, eg,
> average latency for request size 4k, 8k, 16k, 32k .. 1M. The latency
> threshold of each request size will be the sample latency (I'll call it
> base latency) plus latency target. For example, the base latency for
> request size 4k is 80us and user configures latency target 60us. The 4k
> latency threshold will be 80 + 60 = 140us.

Ah okay, the user configures the extra latency.  Yeah, this is way
better than treating what the user configures as the target latency
for 4k IOs.

> @@ -25,6 +25,8 @@ static int throtl_quantum = 32;
>  #define DFL_IDLE_THRESHOLD_HD (1000 * 1000) /* 1 ms */
>  #define MAX_IDLE_TIME (500L * 1000 * 1000) /* 500 ms */
>  
> +#define SKIP_TRACK (((u64)1) << BLK_STAT_RES_SHIFT)

SKIP_LATENCY?

> +static void throtl_update_latency_buckets(struct throtl_data *td)
> +{
> +	struct avg_latency_bucket avg_latency[LATENCY_BUCKET_SIZE];
> +	int i, cpu;
> +	u64 last_latency = 0;
> +	u64 latency;
> +
> +	if (!blk_queue_nonrot(td->queue))
> +		return;
> +	if (time_before(jiffies, td->last_calculate_time + HZ))
> +		return;
> +	td->last_calculate_time = jiffies;
> +
> +	memset(avg_latency, 0, sizeof(avg_latency));
> +	for (i = 0; i < LATENCY_BUCKET_SIZE; i++) {
> +		struct latency_bucket *tmp = &td->tmp_buckets[i];
> +
> +		for_each_possible_cpu(cpu) {
> +			struct latency_bucket *bucket;
> +
> +			/* this isn't race free, but ok in practice */
> +			bucket = per_cpu_ptr(td->latency_buckets, cpu);
> +			tmp->total_latency += bucket[i].total_latency;
> +			tmp->samples += bucket[i].samples;

Heh, this *can* lead to surprising results (like reading zero for a
value larger than 2^32) on 32bit machines due to split updates, and if
we're using nanosecs, those surprises have a chance, albeit low, of
happening every four secs, which is a bit unsettling.  If we have to
use nanosecs, let's please use u64_stats_sync.  If we're okay with
microsecs, ulongs should be fine.

>  void blk_throtl_bio_endio(struct bio *bio)
>  {
>  	struct throtl_grp *tg;
> +	u64 finish_time;
> +	u64 start_time;
> +	u64 lat;
>  
>  	tg = bio->bi_cg_private;
>  	if (!tg)
>  		return;
>  	bio->bi_cg_private = NULL;
>  
> -	tg->last_finish_time = ktime_get_ns();
> +	finish_time = ktime_get_ns();
> +	tg->last_finish_time = finish_time;
> +
> +	start_time = blk_stat_time(&bio->bi_issue_stat);
> +	finish_time = __blk_stat_time(finish_time);
> +	if (start_time && finish_time > start_time &&
> +	    tg->td->track_bio_latency == 1 &&
> +	    !(bio->bi_issue_stat.stat & SKIP_TRACK)) {

Heh, can't we collapse some of the conditions?  e.g. flip SKIP_TRACK
to TRACK_LATENCY and set it iff the td has track_bio_latency set and
also the bio has start time set?

> @@ -2106,6 +2251,12 @@ int blk_throtl_init(struct request_queue *q)
>  	td = kzalloc_node(sizeof(*td), GFP_KERNEL, q->node);
>  	if (!td)
>  		return -ENOMEM;
> +	td->latency_buckets = __alloc_percpu(sizeof(struct latency_bucket) *
> +		LATENCY_BUCKET_SIZE, __alignof__(u64));
> +	if (!td->latency_buckets) {
> +		kfree(td);
> +		return -ENOMEM;
> +	}
>  
>  	INIT_WORK(&td->dispatch_work, blk_throtl_dispatch_work_fn);
>  	throtl_service_queue_init(&td->service_queue);
> @@ -2119,10 +2270,13 @@ int blk_throtl_init(struct request_queue *q)
>  	td->low_upgrade_time = jiffies;
>  	td->low_downgrade_time = jiffies;
>  
> +	td->track_bio_latency = UINT_MAX;

I don't think using 0, 1, UINT_MAX as enums is good for readability.

Thanks.

-- 
tejun

  reply	other threads:[~2017-01-09 21:39 UTC|newest]

Thread overview: 35+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-12-15 20:32 [PATCH V5 00/17] blk-throttle: add .low limit Shaohua Li
2016-12-15 20:32 ` [PATCH V5 01/17] blk-throttle: use U64_MAX/UINT_MAX to replace -1 Shaohua Li
2016-12-15 20:32 ` [PATCH V5 02/17] blk-throttle: prepare support multiple limits Shaohua Li
2016-12-15 20:32 ` [PATCH V5 03/17] blk-throttle: add .low interface Shaohua Li
2017-01-09 16:35   ` Tejun Heo
2016-12-15 20:32 ` [PATCH V5 04/17] blk-throttle: configure bps/iops limit for cgroup in low limit Shaohua Li
2017-01-09 17:35   ` Tejun Heo
2016-12-15 20:32 ` [PATCH V5 05/17] blk-throttle: add upgrade logic for LIMIT_LOW state Shaohua Li
2017-01-09 18:40   ` Tejun Heo
2017-01-09 19:46     ` Tejun Heo
2016-12-15 20:32 ` [PATCH V5 06/17] blk-throttle: add downgrade logic Shaohua Li
2016-12-15 20:32 ` [PATCH V5 07/17] blk-throttle: make sure expire time isn't too big Shaohua Li
2017-01-09 19:54   ` Tejun Heo
2016-12-15 20:32 ` [PATCH V5 08/17] blk-throttle: make throtl_slice tunable Shaohua Li
2017-01-09 20:08   ` Tejun Heo
2016-12-15 20:33 ` [PATCH V5 09/17] blk-throttle: detect completed idle cgroup Shaohua Li
2017-01-09 20:13   ` Tejun Heo
2016-12-15 20:33 ` [PATCH V5 10/17] blk-throttle: make bandwidth change smooth Shaohua Li
2017-01-09 20:28   ` Tejun Heo
2016-12-15 20:33 ` [PATCH V5 11/17] blk-throttle: add a simple idle detection Shaohua Li
2017-01-09 20:56   ` Tejun Heo
2016-12-15 20:33 ` [PATCH V5 12/17] blk-throttle: add interface to configure idle time threshold Shaohua Li
2017-01-09 20:58   ` Tejun Heo
2016-12-15 20:33 ` [PATCH V5 13/17] blk-throttle: ignore idle cgroup limit Shaohua Li
2017-01-09 21:01   ` Tejun Heo
2016-12-15 20:33 ` [PATCH V5 14/17] blk-throttle: add interface for per-cgroup target latency Shaohua Li
2017-01-09 21:14   ` Tejun Heo
2016-12-15 20:33 ` [PATCH V5 15/17] block: track request size in blk_issue_stat Shaohua Li
2016-12-16  2:01   ` kbuild test robot
2017-01-09 21:17   ` Tejun Heo
2016-12-15 20:33 ` [PATCH V5 16/17] blk-throttle: add a mechanism to estimate IO latency Shaohua Li
2017-01-09 21:39   ` Tejun Heo [this message]
2016-12-15 20:33 ` [PATCH V5 17/17] blk-throttle: add latency target support Shaohua Li
2017-01-09 21:46 ` [PATCH V5 00/17] blk-throttle: add .low limit Tejun Heo
2017-01-09 22:27   ` Shaohua Li

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20170109213919.GU12827@mtj.duckdns.org \
    --to=tj@kernel.org \
    --cc=axboe@fb.com \
    --cc=kernel-team@fb.com \
    --cc=linux-block@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=shli@fb.com \
    --cc=vgoyal@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.