* [PATCH V2 0/5] blk-mq: support to use hw tag for scheduling
@ 2017-05-03 19:58 Ming Lei
  2017-05-03 19:58 ` [PATCH V2 1/5] blk-mq: introduce BLK_MQ_F_SCHED_USE_HW_TAG Ming Lei
                   ` (4 more replies)
  0 siblings, 5 replies; 8+ messages in thread
From: Ming Lei @ 2017-05-03 19:58 UTC (permalink / raw)
  To: Jens Axboe, linux-block; +Cc: Bart Van Assche, Omar Sandoval, Ming Lei

Hi,

This patchset introduces the BLK_MQ_F_SCHED_USE_HW_TAG flag, which
allows hardware tags to be used directly for I/O scheduling if the
queue's depth is big enough. This way we avoid allocating extra tags
and a request pool for the I/O scheduler, and the scheduler tag
allocation/release is saved in the I/O submit path.

V2:
	- fix oops when kyber is used
	- move dumping the new flag into patch 1
	- support using hw tags with shared tags
	- update hctx->sched_tags when BLK_MQ_F_SCHED_USE_HW_TAG
	is changed
	- clear the flag in patch of blk_mq_exit_sched()
	- don't update q->nr_requests when updating hw queue's depth
	- fix blk_mq_get_queue_depth()

Thanks to Bart, Omar, Jens and others for reviewing.


Thanks,
Ming

Ming Lei (5):
  blk-mq: introduce BLK_MQ_F_SCHED_USE_HW_TAG
  blk-mq: introduce blk_mq_get_queue_depth()
  blk-mq: don't update q->nr_requests when updating hw queue's depth
  blk-mq: use hw tag for scheduling if hw tag space is big enough
  blk-mq: allow to use hw tag for shared tags

 block/blk-mq-debugfs.c |  1 +
 block/blk-mq-sched.c   | 34 ++++++++++++++++-----
 block/blk-mq-sched.h   | 24 +++++++++++++++
 block/blk-mq.c         | 82 +++++++++++++++++++++++++++++++++++++++++++++-----
 block/blk-mq.h         | 12 ++++++++
 block/kyber-iosched.c  |  7 ++++-
 include/linux/blk-mq.h |  1 +
 7 files changed, 144 insertions(+), 17 deletions(-)

-- 
2.9.3


* [PATCH V2 1/5] blk-mq: introduce BLK_MQ_F_SCHED_USE_HW_TAG
  2017-05-03 19:58 [PATCH V2 0/5] blk-mq: support to use hw tag for scheduling Ming Lei
@ 2017-05-03 19:58 ` Ming Lei
  2017-05-03 19:58 ` [PATCH V2 2/5] blk-mq: introduce blk_mq_get_queue_depth() Ming Lei
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 8+ messages in thread
From: Ming Lei @ 2017-05-03 19:58 UTC (permalink / raw)
  To: Jens Axboe, linux-block; +Cc: Bart Van Assche, Omar Sandoval, Ming Lei

When a blk-mq I/O scheduler is used, we need two tags to
submit one request. One is the scheduler tag, used for
allocating the request and scheduling I/O; the other is the
driver tag, used for dispatching I/O to the hardware/driver.
This introduces one extra per-queue allocation for both the tags
and the request pool, and may not be as efficient as the 'none'
scheduler case.

Also, we currently put a default per-hctx limit on schedulable
requests, and this limit may be a bottleneck for some devices,
especially when these devices have a large tag space.

This patch introduces BLK_MQ_F_SCHED_USE_HW_TAG so that we can
use hardware/driver tags directly for I/O scheduling if the
device's hardware tag space is big enough. Then we can avoid
the extra resource allocation and make I/O submission more
efficient.
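
The tag handling described above can be sketched as a small userspace
model. This is only an illustration, not the kernel code: struct model_rq,
model_alloc() and model_get_driver_tag() are hypothetical names, and the
model covers only the case where the tag was taken from the hardware tag
space. With the flag set, the hw tag is parked in internal_tag at
allocation time and promoted to tag at dispatch, so no second allocation
is needed.

```c
#include <assert.h>
#include <stdbool.h>

#define NO_TAG (-1)

/* Hypothetical stand-in for the two tag fields of struct request. */
struct model_rq {
	int tag;          /* driver tag, NO_TAG until dispatch */
	int internal_tag; /* scheduler tag, or the parked hw tag */
};

/* Allocation: record an already-allocated hw tag in the request. */
static void model_alloc(struct model_rq *rq, int hw_tag, bool sched_use_hw_tag)
{
	assert(hw_tag >= 0);

	if (sched_use_hw_tag) {
		/* Postpone setting rq->tag until dispatch. */
		rq->tag = NO_TAG;
		rq->internal_tag = hw_tag;
	} else {
		rq->tag = hw_tag;
		rq->internal_tag = NO_TAG;
	}
}

/*
 * Dispatch: returns true if the request holds a driver tag without
 * a further allocation, false if a real allocation would be needed.
 */
static bool model_get_driver_tag(struct model_rq *rq, bool sched_use_hw_tag)
{
	if (rq->tag != NO_TAG)
		return true;

	if (sched_use_hw_tag) {
		/* The hw tag was buffered in internal_tag. */
		rq->tag = rq->internal_tag;
		return true;
	}

	return false;
}
```

In this model a request allocated with hw tag 7 and the flag set carries
tag == -1 until model_get_driver_tag() runs, after which tag == 7.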

Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
 block/blk-mq-debugfs.c |  1 +
 block/blk-mq-sched.c   | 10 +++++++++-
 block/blk-mq.c         | 35 +++++++++++++++++++++++++++++------
 block/kyber-iosched.c  |  7 ++++++-
 include/linux/blk-mq.h |  1 +
 5 files changed, 46 insertions(+), 8 deletions(-)

diff --git a/block/blk-mq-debugfs.c b/block/blk-mq-debugfs.c
index bcd2a7d4a3a5..bc390847a60d 100644
--- a/block/blk-mq-debugfs.c
+++ b/block/blk-mq-debugfs.c
@@ -220,6 +220,7 @@ static const char *const hctx_flag_name[] = {
 	[ilog2(BLK_MQ_F_SG_MERGE)]	= "SG_MERGE",
 	[ilog2(BLK_MQ_F_BLOCKING)]	= "BLOCKING",
 	[ilog2(BLK_MQ_F_NO_SCHED)]	= "NO_SCHED",
+	[ilog2(BLK_MQ_F_SCHED_USE_HW_TAG)]	= "SCHED_USE_HW_TAG",
 };
 
 static int hctx_flags_show(struct seq_file *m, void *v)
diff --git a/block/blk-mq-sched.c b/block/blk-mq-sched.c
index e79e9f18d7c2..817c97c88942 100644
--- a/block/blk-mq-sched.c
+++ b/block/blk-mq-sched.c
@@ -83,7 +83,12 @@ struct request *blk_mq_sched_get_request(struct request_queue *q,
 		data->hctx = blk_mq_map_queue(q, data->ctx->cpu);
 
 	if (e) {
-		data->flags |= BLK_MQ_REQ_INTERNAL;
+		/*
+		 * If BLK_MQ_F_SCHED_USE_HW_TAG is set, we use hardware
+		 * tag for IO scheduler directly.
+		 */
+		if (!(data->hctx->flags & BLK_MQ_F_SCHED_USE_HW_TAG))
+			data->flags |= BLK_MQ_REQ_INTERNAL;
 
 		/*
 		 * Flush requests are special and go directly to the
@@ -429,6 +434,9 @@ static int blk_mq_sched_alloc_tags(struct request_queue *q,
 	struct blk_mq_tag_set *set = q->tag_set;
 	int ret;
 
+	if (hctx->flags & BLK_MQ_F_SCHED_USE_HW_TAG)
+		return 0;
+
 	hctx->sched_tags = blk_mq_alloc_rq_map(set, hctx_idx, q->nr_requests,
 					       set->reserved_tags);
 	if (!hctx->sched_tags)
diff --git a/block/blk-mq.c b/block/blk-mq.c
index fb6738954b7d..095099df041f 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -263,9 +263,19 @@ struct request *__blk_mq_alloc_request(struct blk_mq_alloc_data *data,
 				rq->rq_flags = RQF_MQ_INFLIGHT;
 				atomic_inc(&data->hctx->nr_active);
 			}
-			rq->tag = tag;
-			rq->internal_tag = -1;
-			data->hctx->tags->rqs[rq->tag] = rq;
+			data->hctx->tags->rqs[tag] = rq;
+
+			/*
+			 * If we use hw tag for scheduling, postpone setting
+			 * rq->tag in blk_mq_get_driver_tag().
+			 */
+			if (data->hctx->flags & BLK_MQ_F_SCHED_USE_HW_TAG) {
+				rq->tag = -1;
+				rq->internal_tag = tag;
+			} else {
+				rq->tag = tag;
+				rq->internal_tag = -1;
+			}
 		}
 
 		blk_mq_rq_ctx_init(data->q, data->ctx, rq, op);
@@ -365,7 +375,7 @@ void __blk_mq_finish_request(struct blk_mq_hw_ctx *hctx, struct blk_mq_ctx *ctx,
 	clear_bit(REQ_ATOM_POLL_SLEPT, &rq->atomic_flags);
 	if (rq->tag != -1)
 		blk_mq_put_tag(hctx, hctx->tags, ctx, rq->tag);
-	if (sched_tag != -1)
+	if (sched_tag != -1 && !(hctx->flags & BLK_MQ_F_SCHED_USE_HW_TAG))
 		blk_mq_put_tag(hctx, hctx->sched_tags, ctx, sched_tag);
 	blk_mq_sched_restart(hctx);
 	blk_queue_exit(q);
@@ -869,6 +879,12 @@ bool blk_mq_get_driver_tag(struct request *rq, struct blk_mq_hw_ctx **hctx,
 	if (rq->tag != -1)
 		goto done;
 
+	/* we buffered driver tag in rq->internal_tag */
+	if (data.hctx->flags & BLK_MQ_F_SCHED_USE_HW_TAG) {
+		rq->tag = rq->internal_tag;
+		goto done;
+	}
+
 	if (blk_mq_tag_is_reserved(data.hctx->sched_tags, rq->internal_tag))
 		data.flags |= BLK_MQ_REQ_RESERVED;
 
@@ -890,9 +906,15 @@ bool blk_mq_get_driver_tag(struct request *rq, struct blk_mq_hw_ctx **hctx,
 static void __blk_mq_put_driver_tag(struct blk_mq_hw_ctx *hctx,
 				    struct request *rq)
 {
-	blk_mq_put_tag(hctx, hctx->tags, rq->mq_ctx, rq->tag);
+	unsigned tag = rq->tag;
+
 	rq->tag = -1;
 
+	if (hctx->flags & BLK_MQ_F_SCHED_USE_HW_TAG)
+		return;
+
+	blk_mq_put_tag(hctx, hctx->tags, rq->mq_ctx, tag);
+
 	if (rq->rq_flags & RQF_MQ_INFLIGHT) {
 		rq->rq_flags &= ~RQF_MQ_INFLIGHT;
 		atomic_dec(&hctx->nr_active);
@@ -2852,7 +2874,8 @@ bool blk_mq_poll(struct request_queue *q, blk_qc_t cookie)
 		blk_flush_plug_list(plug, false);
 
 	hctx = q->queue_hw_ctx[blk_qc_t_to_queue_num(cookie)];
-	if (!blk_qc_t_is_internal(cookie))
+	if (!blk_qc_t_is_internal(cookie) || (hctx->flags &
+			BLK_MQ_F_SCHED_USE_HW_TAG))
 		rq = blk_mq_tag_to_rq(hctx->tags, blk_qc_t_to_tag(cookie));
 	else {
 		rq = blk_mq_tag_to_rq(hctx->sched_tags, blk_qc_t_to_tag(cookie));
diff --git a/block/kyber-iosched.c b/block/kyber-iosched.c
index 3b0090bc5dd1..1968050c8515 100644
--- a/block/kyber-iosched.c
+++ b/block/kyber-iosched.c
@@ -275,8 +275,13 @@ static unsigned int kyber_sched_tags_shift(struct kyber_queue_data *kqd)
 	/*
 	 * All of the hardware queues have the same depth, so we can just grab
 	 * the shift of the first one.
+	 *
+	 * Hardware tags may be used for scheduling.
 	 */
-	return kqd->q->queue_hw_ctx[0]->sched_tags->bitmap_tags.sb.shift;
+	if (kqd->q->queue_hw_ctx[0]->sched_tags)
+		return kqd->q->queue_hw_ctx[0]->sched_tags->bitmap_tags.sb.shift;
+	else
+		return kqd->q->queue_hw_ctx[0]->tags->bitmap_tags.sb.shift;
 }
 
 static struct kyber_queue_data *kyber_queue_data_alloc(struct request_queue *q)
diff --git a/include/linux/blk-mq.h b/include/linux/blk-mq.h
index 7aa1ca5fe659..3597ad40ecc3 100644
--- a/include/linux/blk-mq.h
+++ b/include/linux/blk-mq.h
@@ -162,6 +162,7 @@ enum {
 	BLK_MQ_F_SG_MERGE	= 1 << 2,
 	BLK_MQ_F_BLOCKING	= 1 << 5,
 	BLK_MQ_F_NO_SCHED	= 1 << 6,
+	BLK_MQ_F_SCHED_USE_HW_TAG	= 1 << 7,
 	BLK_MQ_F_ALLOC_POLICY_START_BIT = 8,
 	BLK_MQ_F_ALLOC_POLICY_BITS = 1,
 
-- 
2.9.3


* [PATCH V2 2/5] blk-mq: introduce blk_mq_get_queue_depth()
  2017-05-03 19:58 [PATCH V2 0/5] blk-mq: support to use hw tag for scheduling Ming Lei
  2017-05-03 19:58 ` [PATCH V2 1/5] blk-mq: introduce BLK_MQ_F_SCHED_USE_HW_TAG Ming Lei
@ 2017-05-03 19:58 ` Ming Lei
  2017-05-03 19:58 ` [PATCH V2 3/5] blk-mq: don't update q->nr_requests when updating hw queue's depth Ming Lei
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 8+ messages in thread
From: Ming Lei @ 2017-05-03 19:58 UTC (permalink / raw)
  To: Jens Axboe, linux-block; +Cc: Bart Van Assche, Omar Sandoval, Ming Lei

The hardware queue depth can be resized via blk_mq_update_nr_requests(),
so introduce this helper to retrieve the queue's current depth easily.

Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
 block/blk-mq.c | 12 ++++++++++++
 block/blk-mq.h |  1 +
 2 files changed, 13 insertions(+)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index 095099df041f..be475ad112ec 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -2120,6 +2120,18 @@ static void blk_mq_map_swqueue(struct request_queue *q,
 	}
 }
 
+/*
+ * Queue depth can be changed via blk_mq_update_nr_requests(),
+ * so use this helper to retrieve queue's depth.
+ */
+int blk_mq_get_queue_depth(struct request_queue *q)
+{
+	/* All queues have same queue depth */
+	struct blk_mq_tags	*tags = q->tag_set->tags[0];
+
+	return tags->bitmap_tags.sb.depth + tags->breserved_tags.sb.depth;
+}
+
 static void queue_set_hctx_shared(struct request_queue *q, bool shared)
 {
 	struct blk_mq_hw_ctx *hctx;
diff --git a/block/blk-mq.h b/block/blk-mq.h
index 2814a14e529c..8085d5989cf5 100644
--- a/block/blk-mq.h
+++ b/block/blk-mq.h
@@ -166,6 +166,7 @@ void __blk_mq_finish_request(struct blk_mq_hw_ctx *hctx, struct blk_mq_ctx *ctx,
 void blk_mq_finish_request(struct request *rq);
 struct request *__blk_mq_alloc_request(struct blk_mq_alloc_data *data,
 					unsigned int op);
+int blk_mq_get_queue_depth(struct request_queue *q);
 
 static inline bool blk_mq_hctx_stopped(struct blk_mq_hw_ctx *hctx)
 {
-- 
2.9.3


* [PATCH V2 3/5] blk-mq: don't update q->nr_requests when updating hw queue's depth
  2017-05-03 19:58 [PATCH V2 0/5] blk-mq: support to use hw tag for scheduling Ming Lei
  2017-05-03 19:58 ` [PATCH V2 1/5] blk-mq: introduce BLK_MQ_F_SCHED_USE_HW_TAG Ming Lei
  2017-05-03 19:58 ` [PATCH V2 2/5] blk-mq: introduce blk_mq_get_queue_depth() Ming Lei
@ 2017-05-03 19:58 ` Ming Lei
  2017-05-03 19:58 ` [PATCH V2 4/5] blk-mq: use hw tag for scheduling if hw tag space is big enough Ming Lei
  2017-05-03 19:58 ` [PATCH V2 5/5] blk-mq: allow to use hw tag for shared tags Ming Lei
  4 siblings, 0 replies; 8+ messages in thread
From: Ming Lei @ 2017-05-03 19:58 UTC (permalink / raw)
  To: Jens Axboe, linux-block; +Cc: Bart Van Assche, Omar Sandoval, Ming Lei

With an MQ scheduler, q->nr_requests represents the scheduler
queue depth, so don't update it when only the hw queue's depth is updated.
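
As a rough illustration of the rule above, here is a userspace sketch
with hypothetical names (struct model_queue, model_update_nr_requests());
it ignores error handling and the per-hctx loop of the real function.
A resize request only changes nr_requests when scheduler tags exist;
otherwise it only resizes the hw queue.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified view of a request queue. */
struct model_queue {
	unsigned int nr_requests; /* scheduler queue depth */
	unsigned int hw_depth;    /* hardware queue depth */
	bool has_sched_tags;      /* scheduler tags are allocated */
};

static void model_update_nr_requests(struct model_queue *q, unsigned int nr)
{
	assert(nr > 0);

	if (q->has_sched_tags) {
		/* Scheduler tags were resized: nr_requests follows. */
		q->nr_requests = nr;
	} else {
		/* Only the hw queue was resized: leave nr_requests alone. */
		q->hw_depth = nr;
	}
}
```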

Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
 block/blk-mq.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index be475ad112ec..681bf33d8de8 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -2643,6 +2643,7 @@ int blk_mq_update_nr_requests(struct request_queue *q, unsigned int nr)
 	struct blk_mq_tag_set *set = q->tag_set;
 	struct blk_mq_hw_ctx *hctx;
 	int i, ret;
+	bool sched = false;
 
 	if (!set)
 		return -EINVAL;
@@ -2664,12 +2665,13 @@ int blk_mq_update_nr_requests(struct request_queue *q, unsigned int nr)
 		} else {
 			ret = blk_mq_tag_update_depth(hctx, &hctx->sched_tags,
 							nr, true);
+			sched = true;
 		}
 		if (ret)
 			break;
 	}
 
-	if (!ret)
+	if (!ret && sched)
 		q->nr_requests = nr;
 
 	blk_mq_unfreeze_queue(q);
-- 
2.9.3


* [PATCH V2 4/5] blk-mq: use hw tag for scheduling if hw tag space is big enough
  2017-05-03 19:58 [PATCH V2 0/5] blk-mq: support to use hw tag for scheduling Ming Lei
                   ` (2 preceding siblings ...)
  2017-05-03 19:58 ` [PATCH V2 3/5] blk-mq: don't update q->nr_requests when updating hw queue's depth Ming Lei
@ 2017-05-03 19:58 ` Ming Lei
  2017-05-03 20:14   ` Jens Axboe
  2017-05-03 19:58 ` [PATCH V2 5/5] blk-mq: allow to use hw tag for shared tags Ming Lei
  4 siblings, 1 reply; 8+ messages in thread
From: Ming Lei @ 2017-05-03 19:58 UTC (permalink / raw)
  To: Jens Axboe, linux-block; +Cc: Bart Van Assche, Omar Sandoval, Ming Lei

When the tag space of a device is big enough, we use hw tags
directly for I/O scheduling.

For now, the decision is made when the hw queue depth is not less
than q->nr_requests and the tag set isn't shared.

Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
 block/blk-mq-sched.c | 24 +++++++++++++++++-------
 block/blk-mq-sched.h | 22 ++++++++++++++++++++++
 block/blk-mq.c       | 32 ++++++++++++++++++++++++++++++--
 3 files changed, 69 insertions(+), 9 deletions(-)

diff --git a/block/blk-mq-sched.c b/block/blk-mq-sched.c
index 817c97c88942..e25a2837d9f0 100644
--- a/block/blk-mq-sched.c
+++ b/block/blk-mq-sched.c
@@ -416,9 +416,9 @@ void blk_mq_sched_insert_requests(struct request_queue *q,
 	blk_mq_run_hw_queue(hctx, run_queue_async);
 }
 
-static void blk_mq_sched_free_tags(struct blk_mq_tag_set *set,
-				   struct blk_mq_hw_ctx *hctx,
-				   unsigned int hctx_idx)
+void blk_mq_sched_free_tags(struct blk_mq_tag_set *set,
+			    struct blk_mq_hw_ctx *hctx,
+			    unsigned int hctx_idx)
 {
 	if (hctx->sched_tags) {
 		blk_mq_free_rqs(set, hctx->sched_tags, hctx_idx);
@@ -427,9 +427,9 @@ static void blk_mq_sched_free_tags(struct blk_mq_tag_set *set,
 	}
 }
 
-static int blk_mq_sched_alloc_tags(struct request_queue *q,
-				   struct blk_mq_hw_ctx *hctx,
-				   unsigned int hctx_idx)
+int blk_mq_sched_alloc_tags(struct request_queue *q,
+			    struct blk_mq_hw_ctx *hctx,
+			    unsigned int hctx_idx)
 {
 	struct blk_mq_tag_set *set = q->tag_set;
 	int ret;
@@ -455,8 +455,10 @@ static void blk_mq_sched_tags_teardown(struct request_queue *q)
 	struct blk_mq_hw_ctx *hctx;
 	int i;
 
-	queue_for_each_hw_ctx(q, hctx, i)
+	queue_for_each_hw_ctx(q, hctx, i) {
+		hctx->flags &= ~BLK_MQ_F_SCHED_USE_HW_TAG;
 		blk_mq_sched_free_tags(set, hctx, i);
+	}
 }
 
 int blk_mq_sched_init_hctx(struct request_queue *q, struct blk_mq_hw_ctx *hctx,
@@ -505,6 +507,7 @@ int blk_mq_init_sched(struct request_queue *q, struct elevator_type *e)
 	struct elevator_queue *eq;
 	unsigned int i;
 	int ret;
+	bool auto_hw_tag;
 
 	if (!e) {
 		q->elevator = NULL;
@@ -517,7 +520,14 @@ int blk_mq_init_sched(struct request_queue *q, struct elevator_type *e)
 	 */
 	q->nr_requests = 2 * BLKDEV_MAX_RQ;
 
+	auto_hw_tag = blk_mq_sched_may_use_hw_tag(q);
+
 	queue_for_each_hw_ctx(q, hctx, i) {
+		if (auto_hw_tag)
+			hctx->flags |= BLK_MQ_F_SCHED_USE_HW_TAG;
+		else
+			hctx->flags &= ~BLK_MQ_F_SCHED_USE_HW_TAG;
+
 		ret = blk_mq_sched_alloc_tags(q, hctx, i);
 		if (ret)
 			goto err;
diff --git a/block/blk-mq-sched.h b/block/blk-mq-sched.h
index edafb5383b7b..241d23c18181 100644
--- a/block/blk-mq-sched.h
+++ b/block/blk-mq-sched.h
@@ -35,6 +35,13 @@ void blk_mq_sched_exit_hctx(struct request_queue *q, struct blk_mq_hw_ctx *hctx,
 
 int blk_mq_sched_init(struct request_queue *q);
 
+void blk_mq_sched_free_tags(struct blk_mq_tag_set *set,
+			    struct blk_mq_hw_ctx *hctx,
+			    unsigned int hctx_idx);
+int blk_mq_sched_alloc_tags(struct request_queue *q,
+			    struct blk_mq_hw_ctx *hctx,
+			    unsigned int hctx_idx);
+
 static inline bool
 blk_mq_sched_bio_merge(struct request_queue *q, struct bio *bio)
 {
@@ -129,4 +136,19 @@ static inline bool blk_mq_sched_needs_restart(struct blk_mq_hw_ctx *hctx)
 	return test_bit(BLK_MQ_S_SCHED_RESTART, &hctx->state);
 }
 
+/*
+ * If this queue has enough hardware tags and doesn't share tags with
+ * other queues, just use hw tag directly for scheduling.
+ */
+static inline bool blk_mq_sched_may_use_hw_tag(struct request_queue *q)
+{
+	if (q->tag_set->flags & BLK_MQ_F_TAG_SHARED)
+		return false;
+
+	if (blk_mq_get_queue_depth(q) < q->nr_requests)
+		return false;
+
+	return true;
+}
+
 #endif
diff --git a/block/blk-mq.c b/block/blk-mq.c
index 681bf33d8de8..0d9433680b2a 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -2132,6 +2132,31 @@ int blk_mq_get_queue_depth(struct request_queue *q)
 	return tags->bitmap_tags.sb.depth + tags->breserved_tags.sb.depth;
 }
 
+static void blk_mq_update_sched_flag(struct request_queue *q)
+{
+	struct blk_mq_hw_ctx *hctx;
+	int i;
+
+	if (!q->elevator)
+		return;
+
+	if (!blk_mq_sched_may_use_hw_tag(q))
+		queue_for_each_hw_ctx(q, hctx, i) {
+			hctx->flags &= ~BLK_MQ_F_SCHED_USE_HW_TAG;
+			if (!hctx->sched_tags) {
+				if (blk_mq_sched_alloc_tags(q, hctx, i))
+					goto force_use_hw_tag;
+			}
+		}
+	else
+ force_use_hw_tag:
+		queue_for_each_hw_ctx(q, hctx, i) {
+			hctx->flags |= BLK_MQ_F_SCHED_USE_HW_TAG;
+			if (hctx->sched_tags)
+				blk_mq_sched_free_tags(q->tag_set, hctx, i);
+		}
+}
+
 static void queue_set_hctx_shared(struct request_queue *q, bool shared)
 {
 	struct blk_mq_hw_ctx *hctx;
@@ -2671,8 +2696,11 @@ int blk_mq_update_nr_requests(struct request_queue *q, unsigned int nr)
 			break;
 	}
 
-	if (!ret && sched)
-		q->nr_requests = nr;
+	if (!ret) {
+		if (sched)
+			q->nr_requests = nr;
+		blk_mq_update_sched_flag(q);
+	}
 
 	blk_mq_unfreeze_queue(q);
 
-- 
2.9.3


* [PATCH V2 5/5] blk-mq: allow to use hw tag for shared tags
  2017-05-03 19:58 [PATCH V2 0/5] blk-mq: support to use hw tag for scheduling Ming Lei
                   ` (3 preceding siblings ...)
  2017-05-03 19:58 ` [PATCH V2 4/5] blk-mq: use hw tag for scheduling if hw tag space is big enough Ming Lei
@ 2017-05-03 19:58 ` Ming Lei
  4 siblings, 0 replies; 8+ messages in thread
From: Ming Lei @ 2017-05-03 19:58 UTC (permalink / raw)
  To: Jens Axboe, linux-block; +Cc: Bart Van Assche, Omar Sandoval, Ming Lei

In the case of shared tags, hctx_may_queue() limits
the maximum number of requests allocated to one hw
queue to .queue_depth / active_queues.

So allow using hw tags in this case too,
if .queue_depth / shared_queues is not less than
q->nr_requests.

This can cover some SCSI devices too, such as virtio-scsi
in its default configuration.
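
The check described above can be written as a standalone predicate. This
is a sketch with a hypothetical name (model_may_use_hw_tag()); an unshared
tag set is modelled as nr_shared == 1, and the division is integer
division as in the patch.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Returns true when each queue's share of the hw tag space is at
 * least as large as the scheduler queue depth (nr_requests).
 */
static bool model_may_use_hw_tag(unsigned int queue_depth,
				 unsigned int nr_shared,
				 unsigned int nr_requests)
{
	assert(nr_shared >= 1);

	return queue_depth / nr_shared >= nr_requests;
}
```

For example, assuming nr_requests is 256, a tag set of depth 1024 shared
by 4 queues qualifies (1024 / 4 = 256), while the same tag set shared by
5 queues does not (1024 / 5 = 204).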

Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
 block/blk-mq-sched.h | 10 ++++++----
 block/blk-mq.c       |  1 +
 block/blk-mq.h       | 11 +++++++++++
 3 files changed, 18 insertions(+), 4 deletions(-)

diff --git a/block/blk-mq-sched.h b/block/blk-mq-sched.h
index 241d23c18181..b0c124b59e17 100644
--- a/block/blk-mq-sched.h
+++ b/block/blk-mq-sched.h
@@ -137,15 +137,17 @@ static inline bool blk_mq_sched_needs_restart(struct blk_mq_hw_ctx *hctx)
 }
 
 /*
- * If this queue has enough hardware tags and doesn't share tags with
- * other queues, just use hw tag directly for scheduling.
+ * If this queue has enough hardware tags, just use hw tag directly
+ * for scheduling.
  */
 static inline bool blk_mq_sched_may_use_hw_tag(struct request_queue *q)
 {
+	int nr_shared = 1;
+
 	if (q->tag_set->flags & BLK_MQ_F_TAG_SHARED)
-		return false;
+		nr_shared = blk_mq_get_shared_queues(q);
 
-	if (blk_mq_get_queue_depth(q) < q->nr_requests)
+	if ((blk_mq_get_queue_depth(q) / nr_shared) < q->nr_requests)
 		return false;
 
 	return true;
diff --git a/block/blk-mq.c b/block/blk-mq.c
index 0d9433680b2a..c4ca4336fa69 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -2179,6 +2179,7 @@ static void blk_mq_update_tag_set_depth(struct blk_mq_tag_set *set, bool shared)
 	list_for_each_entry(q, &set->tag_list, tag_set_list) {
 		blk_mq_freeze_queue(q);
 		queue_set_hctx_shared(q, shared);
+		blk_mq_update_sched_flag(q);
 		blk_mq_unfreeze_queue(q);
 	}
 }
diff --git a/block/blk-mq.h b/block/blk-mq.h
index 8085d5989cf5..351f266ec3a7 100644
--- a/block/blk-mq.h
+++ b/block/blk-mq.h
@@ -178,4 +178,15 @@ static inline bool blk_mq_hw_queue_mapped(struct blk_mq_hw_ctx *hctx)
 	return hctx->nr_ctx && hctx->tags;
 }
 
+/* return how many queues shared tag set with me */
+static inline int blk_mq_get_shared_queues(struct request_queue *q)
+{
+	struct blk_mq_tag_set *set = q->tag_set;
+	int nr = 0;
+
+	list_for_each_entry_rcu(q, &set->tag_list, tag_set_list)
+		nr++;
+	return nr;
+}
+
 #endif
-- 
2.9.3


* Re: [PATCH V2 4/5] blk-mq: use hw tag for scheduling if hw tag space is big enough
  2017-05-03 19:58 ` [PATCH V2 4/5] blk-mq: use hw tag for scheduling if hw tag space is big enough Ming Lei
@ 2017-05-03 20:14   ` Jens Axboe
  2017-05-04  2:12     ` Ming Lei
  0 siblings, 1 reply; 8+ messages in thread
From: Jens Axboe @ 2017-05-03 20:14 UTC (permalink / raw)
  To: Ming Lei; +Cc: linux-block, Bart Van Assche, Omar Sandoval

On Thu, May 04 2017, Ming Lei wrote:
> diff --git a/block/blk-mq-sched.h b/block/blk-mq-sched.h
> index edafb5383b7b..241d23c18181 100644
> --- a/block/blk-mq-sched.h
> +++ b/block/blk-mq-sched.h
> @@ -129,4 +136,19 @@ static inline bool blk_mq_sched_needs_restart(struct blk_mq_hw_ctx *hctx)
>  	return test_bit(BLK_MQ_S_SCHED_RESTART, &hctx->state);
>  }
>  
> +/*
> + * If this queue has enough hardware tags and doesn't share tags with
> + * other queues, just use hw tag directly for scheduling.
> + */
> +static inline bool blk_mq_sched_may_use_hw_tag(struct request_queue *q)
> +{
> +	if (q->tag_set->flags & BLK_MQ_F_TAG_SHARED)
> +		return false;
> +
> +	if (blk_mq_get_queue_depth(q) < q->nr_requests)
> +		return false;
> +
> +	return true;
> +}
> +

Let's put that in block/blk-mq-sched.c instead, especially since it
grows more code in the next patch.

-- 
Jens Axboe


* Re: [PATCH V2 4/5] blk-mq: use hw tag for scheduling if hw tag space is big enough
  2017-05-03 20:14   ` Jens Axboe
@ 2017-05-04  2:12     ` Ming Lei
  0 siblings, 0 replies; 8+ messages in thread
From: Ming Lei @ 2017-05-04  2:12 UTC (permalink / raw)
  To: Jens Axboe; +Cc: linux-block, Bart Van Assche, Omar Sandoval

On Wed, May 03, 2017 at 02:14:45PM -0600, Jens Axboe wrote:
> On Thu, May 04 2017, Ming Lei wrote:
> > diff --git a/block/blk-mq-sched.h b/block/blk-mq-sched.h
> > index edafb5383b7b..241d23c18181 100644
> > --- a/block/blk-mq-sched.h
> > +++ b/block/blk-mq-sched.h
> > @@ -129,4 +136,19 @@ static inline bool blk_mq_sched_needs_restart(struct blk_mq_hw_ctx *hctx)
> >  	return test_bit(BLK_MQ_S_SCHED_RESTART, &hctx->state);
> >  }
> >  
> > +/*
> > + * If this queue has enough hardware tags and doesn't share tags with
> > + * other queues, just use hw tag directly for scheduling.
> > + */
> > +static inline bool blk_mq_sched_may_use_hw_tag(struct request_queue *q)
> > +{
> > +	if (q->tag_set->flags & BLK_MQ_F_TAG_SHARED)
> > +		return false;
> > +
> > +	if (blk_mq_get_queue_depth(q) < q->nr_requests)
> > +		return false;
> > +
> > +	return true;
> > +}
> > +
> 
> Let's put that in block/blk-mq-sched.c instead, especially since it
> grows more code in the next patch.

OK, will do it in V3.

Thanks,
Ming


