* [PATCH v2 0/3] blk-mq: Optimise blk_mq_queue_tag_busy_iter() for shared tags
@ 2021-12-06 12:49 John Garry
  2021-12-06 12:49 ` [PATCH v2 1/3] blk-mq: Drop busy_iter_fn blk_mq_hw_ctx argument John Garry
                   ` (4 more replies)
  0 siblings, 5 replies; 10+ messages in thread
From: John Garry @ 2021-12-06 12:49 UTC (permalink / raw)
  To: axboe
  Cc: linux-block, linux-kernel, ming.lei, kashyap.desai, hare, John Garry

In [0] Kashyap reports high CPU usage for blk_mq_queue_tag_busy_iter()
and its callees when using shared tags.

Indeed, blk_mq_queue_tag_busy_iter() is less than optimal after the move
to shared tags, but it was not optimal previously either.

This series optimises it by having only a single iteration (one each for
regular and reserved tags) over the shared tags, instead of an iteration
per HW queue.

[0] https://lore.kernel.org/linux-block/e4e92abbe9d52bcba6b8cc6c91c442cc@mail.gmail.com/
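
The rough shape of the change to blk_mq_queue_tag_busy_iter() is sketched
below (simplified from patch 3/3, which has the real code):

        /* Before: iterate the tags once per hardware queue */
        queue_for_each_hw_ctx(q, hctx, i) {
                struct blk_mq_tags *tags = hctx->tags;

                if (!blk_mq_hw_queue_mapped(hctx))
                        continue;
                if (tags->nr_reserved_tags)
                        bt_for_each(hctx, &tags->breserved_tags, fn, priv, true);
                bt_for_each(hctx, &tags->bitmap_tags, fn, priv, false);
        }

        /* After: for shared tags, iterate the shared tags only once */
        if (blk_mq_is_shared_tags(q->tag_set->flags)) {
                struct blk_mq_tags *tags = q->tag_set->shared_tags;

                if (tags->nr_reserved_tags)
                        bt_for_each(NULL, q, &tags->breserved_tags, fn, priv, true);
                bt_for_each(NULL, q, &tags->bitmap_tags, fn, priv, false);
        }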

Differences to v1:
- Add tested-by and reviewed-by tags
- Reformat 3/3 a bit to keep <= 80 char lines
	- I kept the RB tags, so please check and let me know if not ok
	  with the changes

John Garry (3):
  blk-mq: Drop busy_iter_fn blk_mq_hw_ctx argument
  blk-mq: Delete busy_iter_fn
  blk-mq: Optimise blk_mq_queue_tag_busy_iter() for shared tags

 block/blk-mq-tag.c     | 65 ++++++++++++++++++++++++++++--------------
 block/blk-mq-tag.h     |  2 +-
 block/blk-mq.c         | 17 ++++++-----
 include/linux/blk-mq.h |  2 --
 4 files changed, 53 insertions(+), 33 deletions(-)

-- 
2.26.2


* [PATCH v2 1/3] blk-mq: Drop busy_iter_fn blk_mq_hw_ctx argument
  2021-12-06 12:49 [PATCH v2 0/3] blk-mq: Optimise blk_mq_queue_tag_busy_iter() for shared tags John Garry
@ 2021-12-06 12:49 ` John Garry
  2021-12-06 12:49 ` [PATCH v2 2/3] blk-mq: Delete busy_iter_fn John Garry
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 10+ messages in thread
From: John Garry @ 2021-12-06 12:49 UTC (permalink / raw)
  To: axboe
  Cc: linux-block, linux-kernel, ming.lei, kashyap.desai, hare, John Garry

The only user of the busy_iter_fn blk_mq_hw_ctx argument is
blk_mq_rq_inflight().

Function blk_mq_rq_inflight() uses the hctx to find the associated request
queue to match against the request. However, this same check is already
done in the caller bt_iter(), so drop this check.

With that change there are no more users of the busy_iter_fn blk_mq_hw_ctx
argument, so drop the argument.
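
For reference, the check in bt_iter() that already covers this (see the
hunk below) is:

        if (rq->q == hctx->queue && rq->mq_hctx == hctx)
                ret = iter_data->fn(rq, iter_data->data, reserved);

so re-checking rq->q against hctx->queue in blk_mq_rq_inflight() adds
nothing.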

Signed-off-by: John Garry <john.garry@huawei.com>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Tested-by: Kashyap Desai <kashyap.desai@broadcom.com>
---
 block/blk-mq-tag.c     |  2 +-
 block/blk-mq.c         | 17 ++++++++---------
 include/linux/blk-mq.h |  3 +--
 3 files changed, 10 insertions(+), 12 deletions(-)

diff --git a/block/blk-mq-tag.c b/block/blk-mq-tag.c
index 380e2dd31bfc..d3cf91d764d5 100644
--- a/block/blk-mq-tag.c
+++ b/block/blk-mq-tag.c
@@ -254,7 +254,7 @@ static bool bt_iter(struct sbitmap *bitmap, unsigned int bitnr, void *data)
 		return true;
 
 	if (rq->q == hctx->queue && rq->mq_hctx == hctx)
-		ret = iter_data->fn(hctx, rq, iter_data->data, reserved);
+		ret = iter_data->fn(rq, iter_data->data, reserved);
 	blk_mq_put_rq_ref(rq);
 	return ret;
 }
diff --git a/block/blk-mq.c b/block/blk-mq.c
index 22ec21aa0c22..433fbb40dd6e 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -127,8 +127,7 @@ struct mq_inflight {
 	unsigned int inflight[2];
 };
 
-static bool blk_mq_check_inflight(struct blk_mq_hw_ctx *hctx,
-				  struct request *rq, void *priv,
+static bool blk_mq_check_inflight(struct request *rq, void *priv,
 				  bool reserved)
 {
 	struct mq_inflight *mi = priv;
@@ -1308,14 +1307,15 @@ void blk_mq_delay_kick_requeue_list(struct request_queue *q,
 }
 EXPORT_SYMBOL(blk_mq_delay_kick_requeue_list);
 
-static bool blk_mq_rq_inflight(struct blk_mq_hw_ctx *hctx, struct request *rq,
-			       void *priv, bool reserved)
+static bool blk_mq_rq_inflight(struct request *rq, void *priv,
+			       bool reserved)
 {
 	/*
-	 * If we find a request that isn't idle and the queue matches,
-	 * we know the queue is busy. Return false to stop the iteration.
+	 * If we find a request that isn't idle we know the queue is busy
+	 * as it's checked in the iter.
+	 * Return false to stop the iteration.
 	 */
-	if (blk_mq_request_started(rq) && rq->q == hctx->queue) {
+	if (blk_mq_request_started(rq)) {
 		bool *busy = priv;
 
 		*busy = true;
@@ -1377,8 +1377,7 @@ void blk_mq_put_rq_ref(struct request *rq)
 		__blk_mq_free_request(rq);
 }
 
-static bool blk_mq_check_expired(struct blk_mq_hw_ctx *hctx,
-		struct request *rq, void *priv, bool reserved)
+static bool blk_mq_check_expired(struct request *rq, void *priv, bool reserved)
 {
 	unsigned long *next = priv;
 
diff --git a/include/linux/blk-mq.h b/include/linux/blk-mq.h
index ecdc049b52fa..17ebf29e42d8 100644
--- a/include/linux/blk-mq.h
+++ b/include/linux/blk-mq.h
@@ -470,8 +470,7 @@ struct blk_mq_queue_data {
 	bool last;
 };
 
-typedef bool (busy_iter_fn)(struct blk_mq_hw_ctx *, struct request *, void *,
-		bool);
+typedef bool (busy_iter_fn)(struct request *, void *, bool);
 typedef bool (busy_tag_iter_fn)(struct request *, void *, bool);
 
 /**
-- 
2.26.2


* [PATCH v2 2/3] blk-mq: Delete busy_iter_fn
  2021-12-06 12:49 [PATCH v2 0/3] blk-mq: Optimise blk_mq_queue_tag_busy_iter() for shared tags John Garry
  2021-12-06 12:49 ` [PATCH v2 1/3] blk-mq: Drop busy_iter_fn blk_mq_hw_ctx argument John Garry
@ 2021-12-06 12:49 ` John Garry
  2021-12-06 12:49 ` [PATCH v2 3/3] blk-mq: Optimise blk_mq_queue_tag_busy_iter() for shared tags John Garry
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 10+ messages in thread
From: John Garry @ 2021-12-06 12:49 UTC (permalink / raw)
  To: axboe
  Cc: linux-block, linux-kernel, ming.lei, kashyap.desai, hare, John Garry

Typedefs busy_iter_fn and busy_tag_iter_fn are now identical, so delete
busy_iter_fn to reduce duplication.

It would be nicer to delete busy_tag_iter_fn, as the name busy_iter_fn is
less specific.

However, busy_tag_iter_fn is used in many different parts of the tree,
unlike busy_iter_fn which is only used in block/, so take the
straightforward path now; we can rename treewide later.

Signed-off-by: John Garry <john.garry@huawei.com>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Tested-by: Kashyap Desai <kashyap.desai@broadcom.com>
---
 block/blk-mq-tag.c     | 6 +++---
 block/blk-mq-tag.h     | 2 +-
 include/linux/blk-mq.h | 1 -
 3 files changed, 4 insertions(+), 5 deletions(-)

diff --git a/block/blk-mq-tag.c b/block/blk-mq-tag.c
index d3cf91d764d5..58b80d4b7a07 100644
--- a/block/blk-mq-tag.c
+++ b/block/blk-mq-tag.c
@@ -215,7 +215,7 @@ void blk_mq_put_tags(struct blk_mq_tags *tags, int *tag_array, int nr_tags)
 
 struct bt_iter_data {
 	struct blk_mq_hw_ctx *hctx;
-	busy_iter_fn *fn;
+	busy_tag_iter_fn *fn;
 	void *data;
 	bool reserved;
 };
@@ -274,7 +274,7 @@ static bool bt_iter(struct sbitmap *bitmap, unsigned int bitnr, void *data)
  *		bitmap_tags member of struct blk_mq_tags.
  */
 static void bt_for_each(struct blk_mq_hw_ctx *hctx, struct sbitmap_queue *bt,
-			busy_iter_fn *fn, void *data, bool reserved)
+			busy_tag_iter_fn *fn, void *data, bool reserved)
 {
 	struct bt_iter_data iter_data = {
 		.hctx = hctx,
@@ -457,7 +457,7 @@ EXPORT_SYMBOL(blk_mq_tagset_wait_completed_request);
  * called for all requests on all queues that share that tag set and not only
  * for requests associated with @q.
  */
-void blk_mq_queue_tag_busy_iter(struct request_queue *q, busy_iter_fn *fn,
+void blk_mq_queue_tag_busy_iter(struct request_queue *q, busy_tag_iter_fn *fn,
 		void *priv)
 {
 	struct blk_mq_hw_ctx *hctx;
diff --git a/block/blk-mq-tag.h b/block/blk-mq-tag.h
index df787b5a23bd..5668e28be0b7 100644
--- a/block/blk-mq-tag.h
+++ b/block/blk-mq-tag.h
@@ -28,7 +28,7 @@ extern void blk_mq_tag_resize_shared_tags(struct blk_mq_tag_set *set,
 extern void blk_mq_tag_update_sched_shared_tags(struct request_queue *q);
 
 extern void blk_mq_tag_wakeup_all(struct blk_mq_tags *tags, bool);
-void blk_mq_queue_tag_busy_iter(struct request_queue *q, busy_iter_fn *fn,
+void blk_mq_queue_tag_busy_iter(struct request_queue *q, busy_tag_iter_fn *fn,
 		void *priv);
 void blk_mq_all_tag_iter(struct blk_mq_tags *tags, busy_tag_iter_fn *fn,
 		void *priv);
diff --git a/include/linux/blk-mq.h b/include/linux/blk-mq.h
index 17ebf29e42d8..772f8f921526 100644
--- a/include/linux/blk-mq.h
+++ b/include/linux/blk-mq.h
@@ -470,7 +470,6 @@ struct blk_mq_queue_data {
 	bool last;
 };
 
-typedef bool (busy_iter_fn)(struct request *, void *, bool);
 typedef bool (busy_tag_iter_fn)(struct request *, void *, bool);
 
 /**
-- 
2.26.2


* [PATCH v2 3/3] blk-mq: Optimise blk_mq_queue_tag_busy_iter() for shared tags
  2021-12-06 12:49 [PATCH v2 0/3] blk-mq: Optimise blk_mq_queue_tag_busy_iter() for shared tags John Garry
  2021-12-06 12:49 ` [PATCH v2 1/3] blk-mq: Drop busy_iter_fn blk_mq_hw_ctx argument John Garry
  2021-12-06 12:49 ` [PATCH v2 2/3] blk-mq: Delete busy_iter_fn John Garry
@ 2021-12-06 12:49 ` John Garry
  2021-12-06 20:10   ` John Garry
  2021-12-06 19:07 ` [PATCH v2 0/3] " Jens Axboe
  2021-12-07 16:18 ` Jens Axboe
  4 siblings, 1 reply; 10+ messages in thread
From: John Garry @ 2021-12-06 12:49 UTC (permalink / raw)
  To: axboe
  Cc: linux-block, linux-kernel, ming.lei, kashyap.desai, hare, John Garry

Kashyap reports high CPU usage in blk_mq_queue_tag_busy_iter() and its
callees on a megaraid SAS RAID card since moving to shared tags [0].

Previously, when shared tags were implemented as a shared sbitmap, this
function was less than optimal since we would iterate through all tags for
all hctx's, yet only ever match up to the tagset depth number of rqs.

Since the change to shared tags, things are even less efficient if we have
parallel callers of blk_mq_queue_tag_busy_iter(). This is because in
bt_iter() -> blk_mq_find_and_get_req() there would be more contention on
accessing each request ref and tags->lock since they are now shared among
all HW queues.

Optimise by having separate calls to bt_for_each() when we're using
shared tags. In this case we no longer pass a hctx, as it is no longer
relevant, and we teach bt_iter() about this.
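
Concretely, bt_iter() now looks up the tags from the tagset when shared
tags are in use, and only compares the hctx when one was passed
(excerpted from the hunk below):

        if (blk_mq_is_shared_tags(set->flags))
                tags = set->shared_tags;
        else
                tags = hctx->tags;
        ...
        if (rq->q == q && (!hctx || rq->mq_hctx == hctx))
                ret = iter_data->fn(rq, iter_data->data, reserved);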

Ming suggested something along the lines of this change, albeit with a
different implementation.

[0] https://lore.kernel.org/linux-block/e4e92abbe9d52bcba6b8cc6c91c442cc@mail.gmail.com/

Signed-off-by: John Garry <john.garry@huawei.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Reported-and-tested-by: Kashyap Desai <kashyap.desai@broadcom.com>
---
 block/blk-mq-tag.c | 59 ++++++++++++++++++++++++++++++++--------------
 1 file changed, 41 insertions(+), 18 deletions(-)

diff --git a/block/blk-mq-tag.c b/block/blk-mq-tag.c
index 58b80d4b7a07..e55a6834c9a6 100644
--- a/block/blk-mq-tag.c
+++ b/block/blk-mq-tag.c
@@ -215,6 +215,7 @@ void blk_mq_put_tags(struct blk_mq_tags *tags, int *tag_array, int nr_tags)
 
 struct bt_iter_data {
 	struct blk_mq_hw_ctx *hctx;
+	struct request_queue *q;
 	busy_tag_iter_fn *fn;
 	void *data;
 	bool reserved;
@@ -238,11 +239,18 @@ static bool bt_iter(struct sbitmap *bitmap, unsigned int bitnr, void *data)
 {
 	struct bt_iter_data *iter_data = data;
 	struct blk_mq_hw_ctx *hctx = iter_data->hctx;
-	struct blk_mq_tags *tags = hctx->tags;
+	struct request_queue *q = iter_data->q;
+	struct blk_mq_tag_set *set = q->tag_set;
 	bool reserved = iter_data->reserved;
+	struct blk_mq_tags *tags;
 	struct request *rq;
 	bool ret = true;
 
+	if (blk_mq_is_shared_tags(set->flags))
+		tags = set->shared_tags;
+	else
+		tags = hctx->tags;
+
 	if (!reserved)
 		bitnr += tags->nr_reserved_tags;
 	/*
@@ -253,7 +261,7 @@ static bool bt_iter(struct sbitmap *bitmap, unsigned int bitnr, void *data)
 	if (!rq)
 		return true;
 
-	if (rq->q == hctx->queue && rq->mq_hctx == hctx)
+	if (rq->q == q && (!hctx || rq->mq_hctx == hctx))
 		ret = iter_data->fn(rq, iter_data->data, reserved);
 	blk_mq_put_rq_ref(rq);
 	return ret;
@@ -262,6 +270,7 @@ static bool bt_iter(struct sbitmap *bitmap, unsigned int bitnr, void *data)
 /**
  * bt_for_each - iterate over the requests associated with a hardware queue
  * @hctx:	Hardware queue to examine.
+ * @q:		Request queue to examine.
  * @bt:		sbitmap to examine. This is either the breserved_tags member
  *		or the bitmap_tags member of struct blk_mq_tags.
  * @fn:		Pointer to the function that will be called for each request
@@ -273,14 +282,16 @@ static bool bt_iter(struct sbitmap *bitmap, unsigned int bitnr, void *data)
  * @reserved:	Indicates whether @bt is the breserved_tags member or the
  *		bitmap_tags member of struct blk_mq_tags.
  */
-static void bt_for_each(struct blk_mq_hw_ctx *hctx, struct sbitmap_queue *bt,
-			busy_tag_iter_fn *fn, void *data, bool reserved)
+static void bt_for_each(struct blk_mq_hw_ctx *hctx, struct request_queue *q,
+			struct sbitmap_queue *bt, busy_tag_iter_fn *fn,
+			void *data, bool reserved)
 {
 	struct bt_iter_data iter_data = {
 		.hctx = hctx,
 		.fn = fn,
 		.data = data,
 		.reserved = reserved,
+		.q = q,
 	};
 
 	sbitmap_for_each_set(&bt->sb, bt_iter, &iter_data);
@@ -460,9 +471,6 @@ EXPORT_SYMBOL(blk_mq_tagset_wait_completed_request);
 void blk_mq_queue_tag_busy_iter(struct request_queue *q, busy_tag_iter_fn *fn,
 		void *priv)
 {
-	struct blk_mq_hw_ctx *hctx;
-	int i;
-
 	/*
 	 * __blk_mq_update_nr_hw_queues() updates nr_hw_queues and queue_hw_ctx
 	 * while the queue is frozen. So we can use q_usage_counter to avoid
@@ -471,19 +479,34 @@ void blk_mq_queue_tag_busy_iter(struct request_queue *q, busy_tag_iter_fn *fn,
 	if (!percpu_ref_tryget(&q->q_usage_counter))
 		return;
 
-	queue_for_each_hw_ctx(q, hctx, i) {
-		struct blk_mq_tags *tags = hctx->tags;
-
-		/*
-		 * If no software queues are currently mapped to this
-		 * hardware queue, there's nothing to check
-		 */
-		if (!blk_mq_hw_queue_mapped(hctx))
-			continue;
+	if (blk_mq_is_shared_tags(q->tag_set->flags)) {
+		struct blk_mq_tags *tags = q->tag_set->shared_tags;
+		struct sbitmap_queue *bresv = &tags->breserved_tags;
+		struct sbitmap_queue *btags = &tags->bitmap_tags;
 
 		if (tags->nr_reserved_tags)
-			bt_for_each(hctx, &tags->breserved_tags, fn, priv, true);
-		bt_for_each(hctx, &tags->bitmap_tags, fn, priv, false);
+			bt_for_each(NULL, q, bresv, fn, priv, true);
+		bt_for_each(NULL, q, btags, fn, priv, false);
+	} else {
+		struct blk_mq_hw_ctx *hctx;
+		int i;
+
+		queue_for_each_hw_ctx(q, hctx, i) {
+			struct blk_mq_tags *tags = hctx->tags;
+			struct sbitmap_queue *bresv = &tags->breserved_tags;
+			struct sbitmap_queue *btags = &tags->bitmap_tags;
+
+			/*
+			 * If no software queues are currently mapped to this
+			 * hardware queue, there's nothing to check
+			 */
+			if (!blk_mq_hw_queue_mapped(hctx))
+				continue;
+
+			if (tags->nr_reserved_tags)
+				bt_for_each(hctx, q, bresv, fn, priv, true);
+			bt_for_each(hctx, q, btags, fn, priv, false);
+		}
 	}
 	blk_queue_exit(q);
 }
-- 
2.26.2


* Re: [PATCH v2 0/3] blk-mq: Optimise blk_mq_queue_tag_busy_iter() for shared tags
  2021-12-06 12:49 [PATCH v2 0/3] blk-mq: Optimise blk_mq_queue_tag_busy_iter() for shared tags John Garry
                   ` (2 preceding siblings ...)
  2021-12-06 12:49 ` [PATCH v2 3/3] blk-mq: Optimise blk_mq_queue_tag_busy_iter() for shared tags John Garry
@ 2021-12-06 19:07 ` Jens Axboe
  2021-12-06 19:34   ` John Garry
  2021-12-07 16:18 ` Jens Axboe
  4 siblings, 1 reply; 10+ messages in thread
From: Jens Axboe @ 2021-12-06 19:07 UTC (permalink / raw)
  To: John Garry; +Cc: linux-block, linux-kernel, ming.lei, kashyap.desai, hare

On 12/6/21 5:49 AM, John Garry wrote:
> In [0] Kashyap reports high CPU usage for blk_mq_queue_tag_busy_iter()
> and callees for shared tags.
> 
> Indeed blk_mq_queue_tag_busy_iter() would be less optimum for moving to
> shared tags, but it was not optimum previously.
> 
> This series optimises by having only a single iter (per regular and resv
> tags) for the shared tags, instead of an iter per HW queue.
> 
> [0] https://lore.kernel.org/linux-block/e4e92abbe9d52bcba6b8cc6c91c442cc@mail.gmail.com/

The patch(es) are missing Fixes tags.

-- 
Jens Axboe


* Re: [PATCH v2 0/3] blk-mq: Optimise blk_mq_queue_tag_busy_iter() for shared tags
  2021-12-06 19:07 ` [PATCH v2 0/3] " Jens Axboe
@ 2021-12-06 19:34   ` John Garry
  2021-12-06 19:48     ` Jens Axboe
  0 siblings, 1 reply; 10+ messages in thread
From: John Garry @ 2021-12-06 19:34 UTC (permalink / raw)
  To: Jens Axboe; +Cc: linux-block, linux-kernel, ming.lei, kashyap.desai, hare

On 06/12/2021 19:07, Jens Axboe wrote:
> On 12/6/21 5:49 AM, John Garry wrote:
>> In [0] Kashyap reports high CPU usage for blk_mq_queue_tag_busy_iter()
>> and callees for shared tags.
>>
>> Indeed blk_mq_queue_tag_busy_iter() would be less optimum for moving to
>> shared tags, but it was not optimum previously.
>>
>> This series optimises by having only a single iter (per regular and resv
>> tags) for the shared tags, instead of an iter per HW queue.
>>
>> [0] https://lore.kernel.org/linux-block/e4e92abbe9d52bcba6b8cc6c91c442cc@mail.gmail.com/

Hi Jens,

> The patch(es) are missing Fixes tags.

The first two patches aren't fixes, but are general development. As for
the last, it probably should go as a fix for 5.16, but I was not sure how
you would feel about that - it's not a trivial change, we're late in the
cycle, and Kashyap was happy for 5.17.

Let me know if the last could be accepted as a fix and I'll re-send 
separately with a fixes tag.

Thanks,
John

* Re: [PATCH v2 0/3] blk-mq: Optimise blk_mq_queue_tag_busy_iter() for shared tags
  2021-12-06 19:34   ` John Garry
@ 2021-12-06 19:48     ` Jens Axboe
  2021-12-06 19:59       ` John Garry
  0 siblings, 1 reply; 10+ messages in thread
From: Jens Axboe @ 2021-12-06 19:48 UTC (permalink / raw)
  To: John Garry; +Cc: linux-block, linux-kernel, ming.lei, kashyap.desai, hare

On 12/6/21 12:34 PM, John Garry wrote:
> On 06/12/2021 19:07, Jens Axboe wrote:
>> On 12/6/21 5:49 AM, John Garry wrote:
>>> In [0] Kashyap reports high CPU usage for blk_mq_queue_tag_busy_iter()
>>> and callees for shared tags.
>>>
>>> Indeed blk_mq_queue_tag_busy_iter() would be less optimum for moving to
>>> shared tags, but it was not optimum previously.
>>>
>>> This series optimises by having only a single iter (per regular and resv
>>> tags) for the shared tags, instead of an iter per HW queue.
>>>
>>> [0] https://lore.kernel.org/linux-block/e4e92abbe9d52bcba6b8cc6c91c442cc@mail.gmail.com/
> 
> Hi Jens,
> 
>> The patch(es) are missing Fixes tags.
> 
> The first two patches aren't fixes, but are general dev. As for the 
> last, it prob should go as a fix for 5.16, but I was not sure how you 
> would feel about that - it's not a trivial change, we're late in the 
> cycle, and Kashyap was happy for 5.17 .
> 
> Let me know if the last could be accepted as a fix and I'll re-send 
> separately with a fixes tag.

Regardless of whether it's going into 5.16 or 5.17 it should have a
fixes tag.

-- 
Jens Axboe


* Re: [PATCH v2 0/3] blk-mq: Optimise blk_mq_queue_tag_busy_iter() for shared tags
  2021-12-06 19:48     ` Jens Axboe
@ 2021-12-06 19:59       ` John Garry
  0 siblings, 0 replies; 10+ messages in thread
From: John Garry @ 2021-12-06 19:59 UTC (permalink / raw)
  To: Jens Axboe; +Cc: linux-block, linux-kernel, ming.lei, kashyap.desai, hare

On 06/12/2021 19:48, Jens Axboe wrote:
>> The first two patches aren't fixes, but are general dev. As for the
>> last, it prob should go as a fix for 5.16, but I was not sure how you
>> would feel about that - it's not a trivial change, we're late in the
>> cycle, and Kashyap was happy for 5.17 .
>>

Hi Jens,

>> Let me know if the last could be accepted as a fix and I'll re-send
>> separately with a fixes tag.
> Regardless of whether it's going into 5.16 or 5.17 it should have a
> fixes tag.

ok, so patch 3/3 would fix commit e155b0c238b2 ("blk-mq: Use shared tags 
for shared sbitmap support")

I'll reply to that one with a Fixes tag and I think that b4 would pick
it up in that case.

Cheers,
John

* Re: [PATCH v2 3/3] blk-mq: Optimise blk_mq_queue_tag_busy_iter() for shared tags
  2021-12-06 12:49 ` [PATCH v2 3/3] blk-mq: Optimise blk_mq_queue_tag_busy_iter() for shared tags John Garry
@ 2021-12-06 20:10   ` John Garry
  0 siblings, 0 replies; 10+ messages in thread
From: John Garry @ 2021-12-06 20:10 UTC (permalink / raw)
  To: axboe; +Cc: linux-block, linux-kernel, ming.lei, kashyap.desai, hare

On 06/12/2021 12:49, John Garry wrote:
> Kashyap reports high CPU usage in blk_mq_queue_tag_busy_iter() and callees
> using megaraid SAS RAID card since moving to shared tags [0].
> 
> Previously, when shared tags was shared sbitmap, this function was less
> than optimum since we would iter through all tags for all hctx's,
> yet only ever match upto tagset depth number of rqs.
> 
> Since the change to shared tags, things are even less efficient if we have
> parallel callers of blk_mq_queue_tag_busy_iter(). This is because in
> bt_iter() -> blk_mq_find_and_get_req() there would be more contention on
> accessing each request ref and tags->lock since they are now shared among
> all HW queues.
> 
> Optimise by having separate calls to bt_for_each() for when we're using
> shared tags. In this case no longer pass a hctx, as it is no longer
> relevant, and teach bt_iter() about this.
> 
> Ming suggested something along the lines of this change, apart from a
> different implementation.
> 
> [0] https://lore.kernel.org/linux-block/e4e92abbe9d52bcba6b8cc6c91c442cc@mail.gmail.com/
> 

Fixes: e155b0c238b2 ("blk-mq: Use shared tags for shared sbitmap support")

> Signed-off-by: John Garry <john.garry@huawei.com>
> Reviewed-by: Hannes Reinecke <hare@suse.de>
> Reviewed-by: Ming Lei <ming.lei@redhat.com>
> Reported-and-tested-by: Kashyap Desai <kashyap.desai@broadcom.com>


* Re: [PATCH v2 0/3] blk-mq: Optimise blk_mq_queue_tag_busy_iter() for shared tags
  2021-12-06 12:49 [PATCH v2 0/3] blk-mq: Optimise blk_mq_queue_tag_busy_iter() for shared tags John Garry
                   ` (3 preceding siblings ...)
  2021-12-06 19:07 ` [PATCH v2 0/3] " Jens Axboe
@ 2021-12-07 16:18 ` Jens Axboe
  4 siblings, 0 replies; 10+ messages in thread
From: Jens Axboe @ 2021-12-07 16:18 UTC (permalink / raw)
  To: John Garry; +Cc: linux-block, kashyap.desai, linux-kernel, hare, ming.lei

On Mon, 6 Dec 2021 20:49:47 +0800, John Garry wrote:
> In [0] Kashyap reports high CPU usage for blk_mq_queue_tag_busy_iter()
> and callees for shared tags.
> 
> Indeed blk_mq_queue_tag_busy_iter() would be less optimum for moving to
> shared tags, but it was not optimum previously.
> 
> This series optimises by having only a single iter (per regular and resv
> tags) for the shared tags, instead of an iter per HW queue.
> 
> [...]

Applied, thanks!

[1/3] blk-mq: Drop busy_iter_fn blk_mq_hw_ctx argument
      (no commit info)
[2/3] blk-mq: Delete busy_iter_fn
      (no commit info)
[3/3] blk-mq: Optimise blk_mq_queue_tag_busy_iter() for shared tags
      (no commit info)

Best regards,
-- 
Jens Axboe


