From: Tejun Heo <tj@kernel.org>
To: Shaohua Li <shli@fb.com>
Cc: linux-block@vger.kernel.org, linux-kernel@vger.kernel.org,
	kernel-team@fb.com, axboe@fb.com, vgoyal@redhat.com
Subject: Re: [PATCH V5 05/17] blk-throttle: add upgrade logic for LIMIT_LOW state
Date: Mon, 9 Jan 2017 13:40:53 -0500
Message-ID: <20170109184053.GG12827@mtj.duckdns.org>
In-Reply-To: <75685643afd126cbccefe894ca56fd5dd83fe8cf.1481833017.git.shli@fb.com>

Hello, Shaohua.

On Thu, Dec 15, 2016 at 12:32:56PM -0800, Shaohua Li wrote:
> For a cgroup hierarchy, there are two cases. Children has lower low
> limit than parent. Parent's low limit is meaningless. If children's
> bps/iops cross low limit, we can upgrade queue state. The other case is
> children has higher low limit than parent. Children's low limit is
> meaningless. As long as parent's bps/iops cross low limit, we can
> upgrade queue state.

The above isn't completely accurate as the parent should consider the
sum of what's currently being used in the children.
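
As a rough illustration of that point, here is a minimal userspace
sketch, not kernel code.  The function name, the plain array of
per-child bps values, and the ">=" comparison are all assumptions made
for the example; only the use of U64_MAX as "no limit configured"
comes from the series.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: decide whether a parent has crossed its low bps
 * limit by looking at the sum of what its children currently use.
 * UINT64_MAX models "no low limit configured".
 */
static bool parent_crossed_low(const uint64_t *child_bps, size_t n,
			       uint64_t parent_low_bps)
{
	uint64_t sum = 0;
	size_t i;

	assert(child_bps != NULL || n == 0);

	if (parent_low_bps == UINT64_MAX)
		return false;	/* no low limit, nothing to cross */

	for (i = 0; i < n; i++) {
		/* saturate instead of wrapping around */
		if (child_bps[i] > UINT64_MAX - sum)
			sum = UINT64_MAX;
		else
			sum += child_bps[i];
	}

	/* ">=" as the meaning of "cross" is an assumption */
	return sum >= parent_low_bps;
}
```

With children using 60 and 50 and a parent low limit of 100, neither
child reaches the limit alone but the parent's aggregate does.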

> +static bool throtl_tg_can_upgrade(struct throtl_grp *tg)
> +{
> +	struct throtl_service_queue *sq = &tg->service_queue;
> +	bool read_limit, write_limit;
> +
> +	/*
> +	 * if cgroup reaches low/max limit (max >= low), it's ok to next
> +	 * limit
> +	 */
> +	read_limit = tg->bps[READ][LIMIT_LOW] != U64_MAX ||
> +		     tg->iops[READ][LIMIT_LOW] != UINT_MAX;
> +	write_limit = tg->bps[WRITE][LIMIT_LOW] != U64_MAX ||
> +		      tg->iops[WRITE][LIMIT_LOW] != UINT_MAX;
> +	if (read_limit && sq->nr_queued[READ] &&
> +	    (!write_limit || sq->nr_queued[WRITE]))
> +		return true;
> +	if (write_limit && sq->nr_queued[WRITE] &&
> +	    (!read_limit || sq->nr_queued[READ]))
> +		return true;

I think it'd be great to explain the above.  It was a bit difficult
for me to follow.  It's also interesting because we're tying the state
transitions for reads and writes together.  blk-throttle has been
handling reads and writes independently, but now the mode switch from
low to max is shared across both.  I suppose it could be fine, but
would it be complex to separate them out?  It's weird to make this one
state shared across reads and writes when the others aren't.  Or was
the sharing intentional?
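
To make the condition easier to follow, here is a small userspace model
of the two if statements quoted above.  The struct and function names
are made up for illustration.  The first function keeps the same
boolean structure as the patch; the second is only a guess at what a
per-direction test could look like and is not from the series.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the state throtl_tg_can_upgrade() reads. */
struct tg_model {
	bool read_limit;		/* a low bps/iops limit is set for READ */
	bool write_limit;		/* a low bps/iops limit is set for WRITE */
	unsigned int nr_queued_read;	/* models sq->nr_queued[READ] */
	unsigned int nr_queued_write;	/* models sq->nr_queued[WRITE] */
};

/*
 * Same boolean structure as the quoted code.  It reduces to: at least
 * one direction has a low limit, and every direction that has a low
 * limit also has bios queued.
 */
static bool tg_model_can_upgrade(const struct tg_model *tg)
{
	assert(tg != NULL);

	if (tg->read_limit && tg->nr_queued_read &&
	    (!tg->write_limit || tg->nr_queued_write))
		return true;
	if (tg->write_limit && tg->nr_queued_write &&
	    (!tg->read_limit || tg->nr_queued_read))
		return true;
	return false;
}

/* Assumed per-direction variant: each direction decides on its own. */
static bool tg_model_dir_can_upgrade(bool has_limit, unsigned int nr_queued)
{
	return has_limit && nr_queued > 0;
}
```

In the shared form, a group with both limits set and only reads queued
cannot upgrade, while the per-direction form would let the read side
upgrade alone.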

> +	return false;
> +}
> +
> +static bool throtl_hierarchy_can_upgrade(struct throtl_grp *tg)
> +{
> +	while (true) {
> +		if (throtl_tg_can_upgrade(tg))
> +			return true;
> +		tg = sq_to_tg(tg->service_queue.parent_sq);
> +		if (!tg || (cgroup_subsys_on_dfl(io_cgrp_subsys) &&
> +				!tg_to_blkg(tg)->parent))
> +			return false;

Isn't the low limit v2 only?  Do we need the on_dfl test this deep?

> +	}
> +	return false;
> +}
> +
> +static bool throtl_can_upgrade(struct throtl_data *td,
> +	struct throtl_grp *this_tg)
> +{
> +	struct cgroup_subsys_state *pos_css;
> +	struct blkcg_gq *blkg;
> +
> +	if (td->limit_index != LIMIT_LOW)
> +		return false;
> +
> +	rcu_read_lock();
> +	blkg_for_each_descendant_post(blkg, pos_css, td->queue->root_blkg) {
> +		struct throtl_grp *tg = blkg_to_tg(blkg);
> +
> +		if (tg == this_tg)
> +			continue;
> +		if (!list_empty(&tg_to_blkg(tg)->blkcg->css.children))
> +			continue;
> +		if (!throtl_hierarchy_can_upgrade(tg)) {
> +			rcu_read_unlock();
> +			return false;
> +		}
> +	}
> +	rcu_read_unlock();
> +	return true;
> +}

So, if all cgroups with a low limit are over their limits (have
commands queued in the delay queue), the state can be upgraded, right?
Yeah, that seems correct to me.  The patch description didn't seem to
match it though.  Can you please update the description accordingly?
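
For reference, the walk in the quoted functions can be modelled in
userspace roughly as below.  The node array, the field names and both
function names are invented for the sketch.  It also simplifies two
things: the ancestor walk goes all the way to the root instead of
applying the on_dfl cutoff, and the td->limit_index check is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flattened cgroup tree; parent is an index, -1 for root. */
struct node {
	int parent;
	bool is_leaf;
	bool over_low;	/* models the throtl_tg_can_upgrade() result */
};

/* True if the node or any of its ancestors is over its low limit. */
static bool hierarchy_over_low(const struct node *n, int i)
{
	for (; i >= 0; i = n[i].parent)
		if (n[i].over_low)
			return true;
	return false;
}

/*
 * Upgrade is allowed only if every leaf other than @skip passes the
 * hierarchy check.  Pass skip = -1 to consider all leaves.
 */
static bool can_upgrade(const struct node *n, size_t count, int skip)
{
	size_t i;

	assert(n != NULL);

	for (i = 0; i < count; i++) {
		if ((int)i == skip || !n[i].is_leaf)
			continue;
		if (!hierarchy_over_low(n, (int)i))
			return false;
	}
	return true;
}
```

A single leaf with nothing over the limit anywhere on its path to the
root is enough to block the upgrade.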

Thanks.

-- 
tejun
