All of lore.kernel.org
 help / color / mirror / Atom feed
From: Damien Le Moal <Damien.LeMoal@wdc.com>
To: Bart Van Assche <bvanassche@acm.org>, Jens Axboe <axboe@kernel.dk>
Cc: "linux-block@vger.kernel.org" <linux-block@vger.kernel.org>,
	Christoph Hellwig <hch@lst.de>, Jaegeuk Kim <jaegeuk@kernel.org>,
	Hannes Reinecke <hare@suse.de>, Ming Lei <ming.lei@redhat.com>,
	Johannes Thumshirn <Johannes.Thumshirn@wdc.com>,
	Himanshu Madhani <himanshu.madhani@oracle.com>
Subject: Re: [PATCH 14/14] block/mq-deadline: Prioritize high-priority requests
Date: Wed, 9 Jun 2021 05:10:53 +0000	[thread overview]
Message-ID: <DM6PR04MB7081853B525156C4F4FF0C0AE7369@DM6PR04MB7081.namprd04.prod.outlook.com> (raw)
In-Reply-To: 20210608230703.19510-15-bvanassche@acm.org

On 2021/06/09 8:07, Bart Van Assche wrote:
> While one or more requests with a certain I/O priority are pending, do not
> dispatch lower priority requests. Dispatch lower priority requests anyway
> after the "aging" time has expired.
> 
> This patch has been tested as follows:
> 
> modprobe scsi_debug ndelay=1000000 max_queue=16 &&
> sd='' &&
> while [ -z "$sd" ]; do
>   sd=/dev/$(basename /sys/bus/pseudo/drivers/scsi_debug/adapter*/host*/target*/*/block/*)
> done &&
> echo $((100*1000)) > /sys/block/$sd/queue/iosched/aging_expire &&
> cd /sys/fs/cgroup/blkio/ &&
> echo $$ >cgroup.procs &&
> echo 2 >blkio.prio.class &&
> mkdir -p hipri &&
> cd hipri &&
> echo 1 >blkio.prio.class &&
> { max-iops -a1 -d32 -j1 -e mq-deadline $sd >& ~/low-pri.txt & } &&
> echo $$ >cgroup.procs &&
> max-iops -a1 -d32 -j1 -e mq-deadline $sd >& ~/hi-pri.txt
> 
> Result:
> * 11000 IOPS for the high-priority job
> *   400 IOPS for the low-priority job
> 
> If the aging expiry time is changed from 100s into 0, the IOPS results change
> into 6712 and 6796 IOPS.
> 
> The max-iops script is a script that runs fio with the following arguments:
> --bs=4K --gtod_reduce=1 --ioengine=libaio --ioscheduler=${arg_e} --runtime=60
> --norandommap --rw=read --thread --buffered=0 --numjobs=${arg_j}
> --iodepth=${arg_d} --iodepth_batch_submit=${arg_a}
> --iodepth_batch_complete=$((arg_d / 2)) --name=${positional_argument_1}
> --filename=${positional_argument_1}
> 
> Cc: Damien Le Moal <damien.lemoal@wdc.com>
> Cc: Hannes Reinecke <hare@suse.de>
> Cc: Christoph Hellwig <hch@lst.de>
> Cc: Ming Lei <ming.lei@redhat.com>
> Cc: Johannes Thumshirn <johannes.thumshirn@wdc.com>
> Cc: Himanshu Madhani <himanshu.madhani@oracle.com>
> Signed-off-by: Bart Van Assche <bvanassche@acm.org>
> ---
>  block/mq-deadline-main.c | 42 +++++++++++++++++++++++++++++++++++-----
>  1 file changed, 37 insertions(+), 5 deletions(-)
> 
> diff --git a/block/mq-deadline-main.c b/block/mq-deadline-main.c
> index 1b2b6c1de5b8..cecdb475a610 100644
> --- a/block/mq-deadline-main.c
> +++ b/block/mq-deadline-main.c
> @@ -32,6 +32,11 @@
>   */
>  static const int read_expire = HZ / 2;  /* max time before a read is submitted. */
>  static const int write_expire = 5 * HZ; /* ditto for writes, these limits are SOFT! */
> +/*
> + * Time after which to dispatch lower priority requests even if higher
> + * priority requests are pending.
> + */
> +static const int aging_expire = 10 * HZ;
>  static const int writes_starved = 2;    /* max times reads can starve a write */
>  static const int fifo_batch = 16;       /* # of sequential requests treated as one
>  				     by the above parameters. For throughput. */
> @@ -90,6 +95,7 @@ struct deadline_data {
>  	int writes_starved;
>  	int front_merges;
>  	u32 async_depth;
> +	int aging_expire;
>  
>  	spinlock_t lock;
>  	spinlock_t zone_lock;
> @@ -363,10 +369,10 @@ deadline_next_request(struct deadline_data *dd, enum dd_prio prio,
>  
>  /*
>   * deadline_dispatch_requests selects the best request according to
> - * read/write expire, fifo_batch, etc
> + * read/write expire, fifo_batch, etc and with a start time <= @latest.
>   */
>  static struct request *__dd_dispatch_request(struct deadline_data *dd,
> -					     enum dd_prio prio)
> +					enum dd_prio prio, u64 latest_start_ns)
>  {
>  	struct request *rq, *next_rq;
>  	enum dd_data_dir data_dir;
> @@ -378,6 +384,8 @@ static struct request *__dd_dispatch_request(struct deadline_data *dd,
>  	if (!list_empty(&dd->dispatch[prio])) {
>  		rq = list_first_entry(&dd->dispatch[prio], struct request,
>  				      queuelist);
> +		if (rq->start_time_ns > latest_start_ns)
> +			return NULL;
>  		list_del_init(&rq->queuelist);
>  		goto done;
>  	}
> @@ -457,6 +465,8 @@ static struct request *__dd_dispatch_request(struct deadline_data *dd,
>  	dd->batching = 0;
>  
>  dispatch_request:
> +	if (rq->start_time_ns > latest_start_ns)
> +		return NULL;
>  	/*
>  	 * rq is the selected appropriate request.
>  	 */
> @@ -493,15 +503,32 @@ static struct request *__dd_dispatch_request(struct deadline_data *dd,
>  static struct request *dd_dispatch_request(struct blk_mq_hw_ctx *hctx)
>  {
>  	struct deadline_data *dd = hctx->queue->elevator->elevator_data;
> -	struct request *rq;
> +	const u64 now_ns = ktime_get_ns();
> +	struct request *rq = NULL;
>  	enum dd_prio prio;
>  
>  	spin_lock(&dd->lock);
> -	for (prio = 0; prio <= DD_PRIO_MAX; prio++) {
> -		rq = __dd_dispatch_request(dd, prio);
> +	/*
> +	 * Start with dispatching requests whose deadline expired more than
> +	 * aging_expire jiffies ago.
> +	 */
> +	for (prio = DD_BE_PRIO; prio <= DD_PRIO_MAX; prio++) {
> +		rq = __dd_dispatch_request(dd, prio, now_ns -
> +					   jiffies_to_nsecs(dd->aging_expire));
>  		if (rq)
> +			goto unlock;
> +	}
> +	/*
> +	 * Next, dispatch requests in priority order. Ignore lower priority
> +	 * requests if any higher priority requests are pending.
> +	 */
> +	for (prio = 0; prio <= DD_PRIO_MAX; prio++) {
> +		rq = __dd_dispatch_request(dd, prio, now_ns);
> +		if (rq || dd_queued(dd, prio))
>  			break;
>  	}
> +
> +unlock:
>  	spin_unlock(&dd->lock);
>  
>  	return rq;
> @@ -607,6 +634,7 @@ static int dd_init_sched(struct request_queue *q, struct elevator_type *e)
>  	dd->writes_starved = writes_starved;
>  	dd->front_merges = 1;
>  	dd->fifo_batch = fifo_batch;
> +	dd->aging_expire = aging_expire;
>  	spin_lock_init(&dd->lock);
>  	spin_lock_init(&dd->zone_lock);
>  
> @@ -725,6 +753,7 @@ static void dd_insert_request(struct blk_mq_hw_ctx *hctx, struct request *rq,
>  	trace_block_rq_insert(rq);
>  
>  	if (at_head) {
> +		rq->fifo_time = jiffies;
>  		list_add(&rq->queuelist, &dd->dispatch[prio]);
>  	} else {
>  		deadline_add_rq_rb(dd, rq);
> @@ -841,6 +870,7 @@ static ssize_t __FUNC(struct elevator_queue *e, char *page)		\
>  #define SHOW_JIFFIES(__FUNC, __VAR) SHOW_INT(__FUNC, jiffies_to_msecs(__VAR))
>  SHOW_JIFFIES(deadline_read_expire_show, dd->fifo_expire[DD_READ]);
>  SHOW_JIFFIES(deadline_write_expire_show, dd->fifo_expire[DD_WRITE]);
> +SHOW_JIFFIES(deadline_aging_expire_show, dd->aging_expire);
>  SHOW_INT(deadline_writes_starved_show, dd->writes_starved);
>  SHOW_INT(deadline_front_merges_show, dd->front_merges);
>  SHOW_INT(deadline_fifo_batch_show, dd->fifo_batch);
> @@ -869,6 +899,7 @@ static ssize_t __FUNC(struct elevator_queue *e, const char *page, size_t count)
>  	STORE_FUNCTION(__FUNC, __PTR, MIN, MAX, msecs_to_jiffies)
>  STORE_JIFFIES(deadline_read_expire_store, &dd->fifo_expire[DD_READ], 0, INT_MAX);
>  STORE_JIFFIES(deadline_write_expire_store, &dd->fifo_expire[DD_WRITE], 0, INT_MAX);
> +STORE_JIFFIES(deadline_aging_expire_store, &dd->aging_expire, 0, INT_MAX);
>  STORE_INT(deadline_writes_starved_store, &dd->writes_starved, INT_MIN, INT_MAX);
>  STORE_INT(deadline_front_merges_store, &dd->front_merges, 0, 1);
>  STORE_INT(deadline_fifo_batch_store, &dd->fifo_batch, 0, INT_MAX);
> @@ -885,6 +916,7 @@ static struct elv_fs_entry deadline_attrs[] = {
>  	DD_ATTR(writes_starved),
>  	DD_ATTR(front_merges),
>  	DD_ATTR(fifo_batch),
> +	DD_ATTR(aging_expire),
>  	__ATTR_NULL
>  };
>  
> 

Looks good.

Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>


-- 
Damien Le Moal
Western Digital Research

  reply	other threads:[~2021-06-09  5:10 UTC|newest]

Thread overview: 45+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-06-08 23:06 [PATCH 00/14] Improve I/O priority support Bart Van Assche
2021-06-08 23:06 ` [PATCH 01/14] block/Kconfig: Make the BLK_WBT and BLK_WBT_MQ entries consecutive Bart Van Assche
2021-06-09  4:14   ` Damien Le Moal
2021-06-09  7:03   ` Johannes Thumshirn
2021-06-10  6:02   ` Hannes Reinecke
2021-06-08 23:06 ` [PATCH 02/14] block/blk-cgroup: Swap the blk_throtl_init() and blk_iolatency_init() calls Bart Van Assche
2021-06-09  4:19   ` Damien Le Moal
2021-06-09  7:05   ` Johannes Thumshirn
2021-06-10  6:02   ` Hannes Reinecke
2021-06-08 23:06 ` [PATCH 03/14] block/blk-rq-qos: Move a function from a header file into a C file Bart Van Assche
2021-06-09  4:22   ` Damien Le Moal
2021-06-10  6:03   ` Hannes Reinecke
2021-06-08 23:06 ` [PATCH 04/14] block: Introduce the ioprio rq-qos policy Bart Van Assche
2021-06-09  4:40   ` Damien Le Moal
2021-06-09 16:48     ` Bart Van Assche
2021-06-10  0:29       ` Damien Le Moal
2021-06-10  6:20   ` Hannes Reinecke
2021-06-10 17:14     ` Bart Van Assche
2021-06-08 23:06 ` [PATCH 05/14] block/mq-deadline: Add several comments Bart Van Assche
2021-06-10  6:21   ` Hannes Reinecke
2021-06-08 23:06 ` [PATCH 06/14] block/mq-deadline: Add two lockdep_assert_held() statements Bart Van Assche
2021-06-10  6:21   ` Hannes Reinecke
2021-06-08 23:06 ` [PATCH 07/14] block/mq-deadline: Remove two local variables Bart Van Assche
2021-06-10  6:22   ` Hannes Reinecke
2021-06-08 23:06 ` [PATCH 08/14] block/mq-deadline: Rename dd_init_queue() and dd_exit_queue() Bart Van Assche
2021-06-08 23:06 ` [PATCH 09/14] block/mq-deadline: Improve compile-time argument checking Bart Van Assche
2021-06-08 23:06 ` [PATCH 10/14] block/mq-deadline: Improve the sysfs show and store macros Bart Van Assche
2021-06-09  4:46   ` Damien Le Moal
2021-06-09 16:54     ` Bart Van Assche
2021-06-10  0:31       ` Damien Le Moal
2021-06-09  7:11   ` Johannes Thumshirn
2021-06-09 16:59     ` Bart Van Assche
2021-06-10  6:23   ` Hannes Reinecke
2021-06-08 23:07 ` [PATCH 11/14] block/mq-deadline: Reserve 25% of scheduler tags for synchronous requests Bart Van Assche
2021-06-10  6:26   ` Hannes Reinecke
2021-06-08 23:07 ` [PATCH 12/14] block/mq-deadline: Add I/O priority support Bart Van Assche
2021-06-09  5:03   ` Damien Le Moal
2021-06-09 17:25     ` Bart Van Assche
2021-06-10  0:39       ` Damien Le Moal
2021-06-10  6:35   ` Hannes Reinecke
2021-06-08 23:07 ` [PATCH 13/14] block/mq-deadline: Add cgroup support Bart Van Assche
2021-06-10  6:37   ` Hannes Reinecke
2021-06-08 23:07 ` [PATCH 14/14] block/mq-deadline: Prioritize high-priority requests Bart Van Assche
2021-06-09  5:10   ` Damien Le Moal [this message]
2021-06-10  6:43   ` Hannes Reinecke

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=DM6PR04MB7081853B525156C4F4FF0C0AE7369@DM6PR04MB7081.namprd04.prod.outlook.com \
    --to=damien.lemoal@wdc.com \
    --cc=Johannes.Thumshirn@wdc.com \
    --cc=axboe@kernel.dk \
    --cc=bvanassche@acm.org \
    --cc=hare@suse.de \
    --cc=hch@lst.de \
    --cc=himanshu.madhani@oracle.com \
    --cc=jaegeuk@kernel.org \
    --cc=linux-block@vger.kernel.org \
    --cc=ming.lei@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.