linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Junxiao Bi <junxiao.bi@oracle.com>
To: Hillf Danton <hdanton@sina.com>
Cc: linux-block@vger.kernel.org, linux-kernel@vger.kernel.org,
	axboe@kernel.dk
Subject: Re: [PATCH] block: fix io hung by block throttle
Date: Sun, 18 Apr 2021 23:09:41 -0700	[thread overview]
Message-ID: <1c3a2874-35bd-3ff0-3088-1a6ea968d016@oracle.com> (raw)
In-Reply-To: <20210418123342.13740-1-hdanton@sina.com>


On 4/18/21 5:33 AM, Hillf Danton wrote:
> On Sat, 17 Apr 2021 14:37:57  Junxiao Bi  wrote:
>> On 4/17/21 3:10 AM, Hillf Danton wrote:
>>> +	if (acquire_inflight_cb(rqw, private_data))
>> This function is to increase atomic variable rq_wait->inflight.
> You are right.
>
>> What's the mutex for?
> It cuts the race between we peek at the sleepers on rqw->wait while they are
> coming and going, and we cant update rqw->inflight without making sure there
> are no sleepers.

Why? I think checking the sleeper in original code is for a fast path.

For wbt, acquire_inflight_cb is wbt_inflight_cb where atomic_inc_below 
is used to update rqw->inflight. I don't see why a mutex is needed for 
this atomic operation.

>
> With the mutex in place, in addition to the certainty of !sleepers, we can
> avoid the race between us and waker in terms of updating inflight by removing
> the invokation of acquire_inflight_cb in the wakeup callback, and the bonus is
> we no longer need the wakeup cb and the rq_qos_wait_data because the more
> traditional wait_event() can do the job.
>
> Finally we can dump the cleanup_cb_t.
>
> +++ b/block/blk-rq-qos.c
> @@ -200,96 +200,24 @@ bool rq_depth_scale_down(struct rq_depth
>   	return true;
>   }
>   
> -struct rq_qos_wait_data {
> -	struct wait_queue_entry wq;
> -	struct task_struct *task;
> -	struct rq_wait *rqw;
> -	acquire_inflight_cb_t *cb;
> -	void *private_data;
> -	bool got_token;
> -};
> -
> -static int rq_qos_wake_function(struct wait_queue_entry *curr,
> -				unsigned int mode, int wake_flags, void *key)
> -{
> -	struct rq_qos_wait_data *data = container_of(curr,
> -						     struct rq_qos_wait_data,
> -						     wq);
> -
> -	/*
> -	 * If we fail to get a budget, return -1 to interrupt the wake up loop
> -	 * in __wake_up_common.
> -	 */
> -	if (!data->cb(data->rqw, data->private_data))
> -		return -1;
> -
> -	data->got_token = true;
> -	smp_wmb();
> -	list_del_init(&curr->entry);
> -	wake_up_process(data->task);
> -	return 1;
> -}
> -
>   /**
>    * rq_qos_wait - throttle on a rqw if we need to
>    * @rqw: rqw to throttle on
>    * @private_data: caller provided specific data
>    * @acquire_inflight_cb: inc the rqw->inflight counter if we can
> - * @cleanup_cb: the callback to cleanup in case we race with a waker
>    *
>    * This provides a uniform place for the rq_qos users to do their throttling.
>    * Since you can end up with a lot of things sleeping at once, this manages the
>    * waking up based on the resources available.  The acquire_inflight_cb should
>    * inc the rqw->inflight if we have the ability to do so, or return false if not
>    * and then we will sleep until the room becomes available.
> - *
> - * cleanup_cb is in case that we race with a waker and need to cleanup the
> - * inflight count accordingly.
>    */
>   void rq_qos_wait(struct rq_wait *rqw, void *private_data,
> -		 acquire_inflight_cb_t *acquire_inflight_cb,
> -		 cleanup_cb_t *cleanup_cb)
> +		 acquire_inflight_cb_t *acquire_inflight_cb)
>   {
> -	struct rq_qos_wait_data data = {
> -		.wq = {
> -			.func	= rq_qos_wake_function,
> -			.entry	= LIST_HEAD_INIT(data.wq.entry),
> -		},
> -		.task = current,
> -		.rqw = rqw,
> -		.cb = acquire_inflight_cb,
> -		.private_data = private_data,
> -	};
> -	bool has_sleeper;
> -
> -	has_sleeper = wq_has_sleeper(&rqw->wait);
> -	if (!has_sleeper && acquire_inflight_cb(rqw, private_data))
> -		return;
> -
> -	prepare_to_wait_exclusive(&rqw->wait, &data.wq, TASK_UNINTERRUPTIBLE);
> -	has_sleeper = !wq_has_single_sleeper(&rqw->wait);
> -	do {
> -		/* The memory barrier in set_task_state saves us here. */
> -		if (data.got_token)
> -			break;
> -		if (!has_sleeper && acquire_inflight_cb(rqw, private_data)) {
> -			finish_wait(&rqw->wait, &data.wq);
> -
> -			/*
> -			 * We raced with wbt_wake_function() getting a token,
> -			 * which means we now have two. Put our local token
> -			 * and wake anyone else potentially waiting for one.
> -			 */
> -			smp_rmb();
> -			if (data.got_token)
> -				cleanup_cb(rqw, private_data);
> -			break;
> -		}
> -		io_schedule();
> -		has_sleeper = true;
> -		set_current_state(TASK_UNINTERRUPTIBLE);
> -	} while (1);
> -	finish_wait(&rqw->wait, &data.wq);
> +	mutex_lock(&rqw->throttle_mutex);
> +	wait_event(rqw->wait, acquire_inflight_cb(rqw, private_data));
> +	mutex_unlock(&rqw->throttle_mutex);

This will break the throttle? There is a inflight io limitation. With 
this change, there can be only one io inflight whatever the limit is.

Thanks,

Junxiao.

>   }
>   
>   void rq_qos_exit(struct request_queue *q)

  parent reply	other threads:[~2021-04-19  6:11 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-04-14 21:18 [PATCH] block: fix io hung by block throttle Junxiao Bi
     [not found] ` <20210417101008.4132-1-hdanton@sina.com>
2021-04-17 21:37   ` Junxiao Bi
     [not found]   ` <20210418123342.13740-1-hdanton@sina.com>
2021-04-19  6:09     ` Junxiao Bi [this message]
2021-04-19 16:39       ` Junxiao Bi
     [not found] ` <20210415041153.577-1-hdanton@sina.com>
2021-04-15  5:01   ` Junxiao Bi
2021-04-21 21:28   ` Junxiao Bi
2021-04-23  2:55 ` [block] 658f2fb7d2: fxmark.hdd_f2fs_dbench_client_72_directio.works/sec -21.4% regression kernel test robot
2021-04-23  5:26   ` Junxiao Bi
2021-04-28 21:08   ` Junxiao Bi

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1c3a2874-35bd-3ff0-3088-1a6ea968d016@oracle.com \
    --to=junxiao.bi@oracle.com \
    --cc=axboe@kernel.dk \
    --cc=hdanton@sina.com \
    --cc=linux-block@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).