From: Ming Lei <ming.lei@redhat.com>
To: Jens Axboe <axboe@fb.com>,
	linux-block@vger.kernel.org,
	Christoph Hellwig <hch@infradead.org>
Cc: Bart Van Assche <bart.vanassche@sandisk.com>,
	Laurence Oberman <loberman@redhat.com>,
	Paolo Valente <paolo.valente@linaro.org>,
	Mel Gorman <mgorman@techsingularity.net>,
	Ming Lei <ming.lei@redhat.com>
Subject: [PATCH V3 05/14] blk-mq-sched: improve dispatching from sw queue
Date: Sun, 27 Aug 2017 00:33:23 +0800
Message-ID: <20170826163332.28971-6-ming.lei@redhat.com>
In-Reply-To: <20170826163332.28971-1-ming.lei@redhat.com>

SCSI devices use a host-wide tagset, and the shared
driver tag space is often quite big. Meanwhile,
there is also a queue depth for each LUN
(.cmd_per_lun), which is often small.

So lots of requests may stay in the sw queue, and we
always flush all of those belonging to the same hw
queue and dispatch them all to the driver.
Unfortunately, this easily causes queue busy because
of the small per-LUN queue depth. Once these requests
are flushed out, they have to stay in hctx->dispatch,
no bio can be merged into them any more, and
sequential I/O performance is hurt.

This patch improves dispatching from the sw queue when
there is a per-request-queue queue depth by taking
requests one by one from the sw queue, just as an
I/O scheduler does.

Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
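The effect described in the commit message can be illustrated with a
small userspace model. This is only a sketch under stated assumptions:
"depth" stands in for the per-LUN queue depth, "queued" for the number
of requests waiting in the sw queues of one hw queue, and every name
below (struct outcome, flush_all, one_by_one) is hypothetical and does
not exist in the kernel.

```c
#include <assert.h>

/*
 * Userspace model only; none of these names exist in the kernel.
 * It counts where requests end up after one dispatch attempt.
 */
struct outcome {
	int issued;	/* accepted by the device */
	int stranded;	/* parked on the dispatch list, no longer mergeable */
	int in_swq;	/* still in a sw queue, still mergeable */
};

/* Old behaviour: flush every sw queue, then try to issue everything. */
static struct outcome flush_all(int queued, int depth)
{
	struct outcome o;

	assert(queued >= 0 && depth >= 0);
	o.issued = queued < depth ? queued : depth;
	o.stranded = queued - o.issued;
	o.in_swq = 0;
	return o;
}

/* New behaviour: dequeue one request at a time, stop at the first busy. */
static struct outcome one_by_one(int queued, int depth)
{
	struct outcome o = { 0, 0, queued };

	assert(queued >= 0 && depth >= 0);
	while (o.in_swq > 0) {
		o.in_swq--;		/* request leaves its sw queue */
		if (o.issued == depth) {
			o.stranded++;	/* device busy: request parks on dispatch list */
			break;
		}
		o.issued++;
	}
	return o;
}
```

With 32 queued requests and a depth of 3, the model leaves 29 requests
on the dispatch list when flushing everything, but only one when taking
requests one by one; the remaining 28 stay in the sw queue where bios
can still be merged into them.
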
 block/blk-mq-sched.c   | 61 +++++++++++++++++++++++++++++++++++++++++++++-----
 include/linux/blk-mq.h |  2 ++
 2 files changed, 57 insertions(+), 6 deletions(-)

diff --git a/block/blk-mq-sched.c b/block/blk-mq-sched.c
index f69752961a34..735e432294ab 100644
--- a/block/blk-mq-sched.c
+++ b/block/blk-mq-sched.c
@@ -89,9 +89,9 @@ static bool blk_mq_sched_restart_hctx(struct blk_mq_hw_ctx *hctx)
 	return false;
 }
 
-static void blk_mq_do_dispatch(struct request_queue *q,
-			       struct elevator_queue *e,
-			       struct blk_mq_hw_ctx *hctx)
+static void blk_mq_do_dispatch_sched(struct request_queue *q,
+				     struct elevator_queue *e,
+				     struct blk_mq_hw_ctx *hctx)
 {
 	LIST_HEAD(rq_list);
 
@@ -105,6 +105,42 @@ static void blk_mq_do_dispatch(struct request_queue *q,
 	} while (blk_mq_dispatch_rq_list(q, &rq_list));
 }
 
+static struct blk_mq_ctx *blk_mq_next_ctx(struct blk_mq_hw_ctx *hctx,
+					  struct blk_mq_ctx *ctx)
+{
+	unsigned idx = ctx->index_hw;
+
+	if (++idx == hctx->nr_ctx)
+		idx = 0;
+
+	return hctx->ctxs[idx];
+}
+
+static void blk_mq_do_dispatch_ctx(struct request_queue *q,
+				   struct blk_mq_hw_ctx *hctx)
+{
+	LIST_HEAD(rq_list);
+	struct blk_mq_ctx *ctx = READ_ONCE(hctx->dispatch_from);
+	bool dispatched = true;
+
+	do {
+		struct request *rq;
+
+		rq = blk_mq_dispatch_rq_from_ctx(hctx, ctx);
+		if (!rq)
+			break;
+		list_add(&rq->queuelist, &rq_list);
+
+		/* round robin for fair dispatch */
+		ctx = blk_mq_next_ctx(hctx, rq->mq_ctx);
+
+		dispatched = blk_mq_dispatch_rq_list(q, &rq_list);
+	} while (dispatched);
+
+	if (!dispatched)
+		WRITE_ONCE(hctx->dispatch_from, blk_mq_next_ctx(hctx, ctx));
+}
+
 void blk_mq_sched_dispatch_requests(struct blk_mq_hw_ctx *hctx)
 {
 	struct request_queue *q = hctx->queue;
@@ -142,18 +178,31 @@ void blk_mq_sched_dispatch_requests(struct blk_mq_hw_ctx *hctx)
 	if (!list_empty(&rq_list)) {
 		blk_mq_sched_mark_restart_hctx(hctx);
 		do_sched_dispatch = blk_mq_dispatch_rq_list(q, &rq_list);
-	} else if (!has_sched_dispatch) {
+	} else if (!has_sched_dispatch && !q->queue_depth) {
+		/*
+		 * If there is no per-request_queue depth, we
+		 * flush all requests in this hw queue; otherwise
+		 * we pick up requests one by one from the sw queue
+		 * to avoid messing up I/O merging when dispatch
+		 * is busy, which is easily triggered by a
+		 * per-request_queue queue depth.
+		 */
 		blk_mq_flush_busy_ctxs(hctx, &rq_list);
 		blk_mq_dispatch_rq_list(q, &rq_list);
 	}
 
+	if (!do_sched_dispatch)
+		return;
+
 	/*
 	 * We want to dispatch from the scheduler if we had no work left
 	 * on the dispatch list, OR if we did have work but weren't able
 	 * to make progress.
 	 */
-	if (do_sched_dispatch && has_sched_dispatch)
-		blk_mq_do_dispatch(q, e, hctx);
+	if (has_sched_dispatch)
+		blk_mq_do_dispatch_sched(q, e, hctx);
+	else
+		blk_mq_do_dispatch_ctx(q, hctx);
 }
 
 bool blk_mq_sched_try_merge(struct request_queue *q, struct bio *bio,
diff --git a/include/linux/blk-mq.h b/include/linux/blk-mq.h
index 50c6485cb04f..7b7a366a97f3 100644
--- a/include/linux/blk-mq.h
+++ b/include/linux/blk-mq.h
@@ -30,6 +30,8 @@ struct blk_mq_hw_ctx {
 
 	struct sbitmap		ctx_map;
 
+	struct blk_mq_ctx	*dispatch_from;
+
 	struct blk_mq_ctx	**ctxs;
 	unsigned int		nr_ctx;
 
-- 
2.9.5
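
The round-robin walk over sw queues performed by blk_mq_next_ctx() and
blk_mq_do_dispatch_ctx() can be sketched in userspace as follows. This
is an illustrative model, not the kernel code: it works on array
indices instead of struct blk_mq_ctx pointers, "budget" stands in for
free device slots, and all model_* names and NR_CTX are hypothetical.
As a simplification it always saves the cursor, whereas the patch
updates hctx->dispatch_from only when dispatch reports busy.

```c
#include <assert.h>
#include <stddef.h>

#define NR_CTX 4	/* hypothetical number of sw queues mapped to one hw queue */

/* Hypothetical stand-in for a hw queue: pending[i] counts requests in sw queue i. */
struct model_hctx {
	int pending[NR_CTX];
	unsigned int dispatch_from;	/* index of the sw queue to start from */
};

/* Same wrap-around step as blk_mq_next_ctx(), on indices instead of pointers. */
static unsigned int model_next_ctx(unsigned int idx)
{
	if (++idx == NR_CTX)
		idx = 0;
	return idx;
}

/*
 * Take one request from the first non-empty sw queue at or after "start",
 * wrapping around once. Returns the index it came from, or -1 if all are empty.
 */
static int model_dequeue(struct model_hctx *h, unsigned int start)
{
	unsigned int i, idx = start;

	for (i = 0; i < NR_CTX; i++) {
		if (h->pending[idx] > 0) {
			h->pending[idx]--;
			return (int)idx;
		}
		idx = model_next_ctx(idx);
	}
	return -1;
}

/*
 * Dispatch up to "budget" requests, recording the source sw queue of
 * each one in order[]. Returns the number of requests dispatched.
 */
static size_t model_dispatch(struct model_hctx *h, int budget,
			     int *order, size_t max)
{
	unsigned int ctx = h->dispatch_from;
	size_t n = 0;

	assert(budget >= 0);
	while (budget > 0 && n < max) {
		int src = model_dequeue(h, ctx);

		if (src < 0)
			break;
		order[n++] = src;
		budget--;
		/* round robin: continue after the queue just served */
		ctx = model_next_ctx((unsigned int)src);
	}
	h->dispatch_from = ctx;	/* resume here on the next run */
	return n;
}
```

Starting the next search after the sw queue that was just served keeps
one busy submitter from monopolizing the small per-LUN depth, which is
the fairness property the "round robin for fair dispatch" comment in
the patch refers to.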

Thread overview: 28+ messages
2017-08-26 16:33 [PATCH V3 00/14] blk-mq-sched: improve SCSI-MQ performance Ming Lei
2017-08-26 16:33 ` [PATCH V3 01/14] blk-mq-sched: fix scheduler bad performance Ming Lei
2017-08-26 16:33 ` [PATCH V3 02/14] sbitmap: introduce __sbitmap_for_each_set() Ming Lei
2017-08-30 15:55   ` Bart Van Assche
2017-08-31  3:33     ` Ming Lei
2017-08-26 16:33 ` [PATCH V3 03/14] blk-mq: introduce blk_mq_dispatch_rq_from_ctx() Ming Lei
2017-08-30 16:01   ` Bart Van Assche
2017-08-26 16:33 ` [PATCH V3 04/14] blk-mq-sched: move actual dispatching into one helper Ming Lei
2017-08-26 16:33 ` Ming Lei [this message]
2017-08-30 16:34   ` [PATCH V3 05/14] blk-mq-sched: improve dispatching from sw queue Bart Van Assche
2017-08-31  3:43     ` Ming Lei
2017-08-31 20:36       ` Bart Van Assche
2017-08-26 16:33 ` [PATCH V3 06/14] blk-mq-sched: don't dequeue request until all in ->dispatch are flushed Ming Lei
2017-08-30 17:11   ` Bart Van Assche
2017-08-31  4:01     ` Ming Lei
2017-08-31 21:00       ` Bart Van Assche
2017-09-01  3:02         ` Ming Lei
2017-09-01 18:19           ` Bart Van Assche
2017-08-26 16:33 ` [PATCH V3 07/14] blk-mq-sched: introduce blk_mq_sched_queue_depth() Ming Lei
2017-08-26 16:33 ` [PATCH V3 08/14] blk-mq-sched: use q->queue_depth as hint for q->nr_requests Ming Lei
2017-08-26 16:33 ` [PATCH V3 09/14] block: introduce rqhash helpers Ming Lei
2017-08-26 16:33 ` [PATCH V3 10/14] block: move actual bio merge code into __elv_merge Ming Lei
2017-08-26 16:33 ` [PATCH V3 11/14] block: add check on elevator for supporting bio merge via hashtable from blk-mq sw queue Ming Lei
2017-08-26 16:33 ` [PATCH V3 12/14] block: introduce .last_merge and .hash to blk_mq_ctx Ming Lei
2017-08-26 16:33 ` [PATCH V3 13/14] blk-mq-sched: refactor blk_mq_sched_try_merge() Ming Lei
2017-08-30 17:17   ` Bart Van Assche
2017-08-31  4:03     ` Ming Lei
2017-08-26 16:33 ` [PATCH V3 14/14] blk-mq: improve bio merge from blk-mq sw queue Ming Lei
