linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: longli@linuxonhyperv.com
To: Jens Axboe <axboe@kernel.dk>, Ming Lei <ming.lei@redhat.com>,
	Sagi Grimberg <sagi@grimberg.me>,
	Keith Busch <keith.busch@intel.com>,
	Bart Van Assche <bvanassche@acm.org>,
	Damien Le Moal <damien.lemoal@wdc.com>,
	linux-block@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: Long Li <longli@microsoft.com>
Subject: [PATCH] blk-mq: avoid repeatedly scheduling the same work to run hardware queue
Date: Thu, 14 Nov 2019 14:51:24 -0800	[thread overview]
Message-ID: <1573771884-38879-1-git-send-email-longli@linuxonhyperv.com> (raw)

From: Long Li <longli@microsoft.com>

SCSI layer calls blk_mq_run_hw_queues() in scsi_end_request(), for every
completed I/O. blk_mq_run_hw_queues() in turn schedules some works to run
the hardware queues.

The actual work is queued by mod_delayed_work_on(), it turns out the cost of
this function is high on locking and CPU usage, when the I/O workload has
high queue depth. Most of these calls are not necessary since the queue is
already scheduled to run, and has not run yet.

This patch tries to solve this problem by avoiding scheduling work when it's
already scheduled.

Benchmark results:
The following tests are run on a RAM backed virtual disk on Hyper-V, with 8
FIO jobs with 4k random read I/O. The test numbers are for IOPS.

queue_depth	pre-patch	after-patch	improvement
16		190k		190k		0%
64		235k		240k		2%
256		180k		256k		42%
1024		156k		250k		60%

Signed-off-by: Long Li <longli@microsoft.com>
---
 block/blk-mq.c         | 12 ++++++++++++
 include/linux/blk-mq.h |  1 +
 2 files changed, 13 insertions(+)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index ec791156e9cc..a882bd65167a 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -1476,6 +1476,16 @@ static void __blk_mq_delay_run_hw_queue(struct blk_mq_hw_ctx *hctx, bool async,
 		put_cpu();
 	}
 
+	/*
+	 * Queue a work to run queue. If this is a non-delay run and the
+	 * work is already scheduled, avoid scheduling the same work again.
+	 */
+	if (!msecs) {
+		if (test_bit(BLK_MQ_S_WORK_QUEUED, &hctx->state))
+			return;
+		set_bit(BLK_MQ_S_WORK_QUEUED, &hctx->state);
+	}
+
 	kblockd_mod_delayed_work_on(blk_mq_hctx_next_cpu(hctx), &hctx->run_work,
 				    msecs_to_jiffies(msecs));
 }
@@ -1561,6 +1571,7 @@ void blk_mq_stop_hw_queue(struct blk_mq_hw_ctx *hctx)
 	cancel_delayed_work(&hctx->run_work);
 
 	set_bit(BLK_MQ_S_STOPPED, &hctx->state);
+	clear_bit(BLK_MQ_S_WORK_QUEUED, &hctx->state);
 }
 EXPORT_SYMBOL(blk_mq_stop_hw_queue);
 
@@ -1626,6 +1637,7 @@ static void blk_mq_run_work_fn(struct work_struct *work)
 	struct blk_mq_hw_ctx *hctx;
 
 	hctx = container_of(work, struct blk_mq_hw_ctx, run_work.work);
+	clear_bit(BLK_MQ_S_WORK_QUEUED, &hctx->state);
 
 	/*
 	 * If we are stopped, don't run the queue.
diff --git a/include/linux/blk-mq.h b/include/linux/blk-mq.h
index 0bf056de5cc3..98269d3fd141 100644
--- a/include/linux/blk-mq.h
+++ b/include/linux/blk-mq.h
@@ -234,6 +234,7 @@ enum {
 	BLK_MQ_S_STOPPED	= 0,
 	BLK_MQ_S_TAG_ACTIVE	= 1,
 	BLK_MQ_S_SCHED_RESTART	= 2,
+	BLK_MQ_S_WORK_QUEUED	= 3,
 
 	BLK_MQ_MAX_DEPTH	= 10240,
 
-- 
2.20.1


             reply	other threads:[~2019-11-14 22:51 UTC|newest]

Thread overview: 3+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-11-14 22:51 longli [this message]
2019-11-15  2:13 ` [PATCH] blk-mq: avoid repeatedly scheduling the same work to run hardware queue Damien Le Moal
2019-11-15 23:25   ` Long Li

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1573771884-38879-1-git-send-email-longli@linuxonhyperv.com \
    --to=longli@linuxonhyperv.com \
    --cc=axboe@kernel.dk \
    --cc=bvanassche@acm.org \
    --cc=damien.lemoal@wdc.com \
    --cc=keith.busch@intel.com \
    --cc=linux-block@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=longli@microsoft.com \
    --cc=ming.lei@redhat.com \
    --cc=sagi@grimberg.me \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).