* [PATCHSET v2 0/6] Various block layer optimizations
@ 2021-10-18 17:51 Jens Axboe
  2021-10-18 17:51 ` [PATCH 1/6] block: don't call blk_status_to_errno in blk_update_request Jens Axboe
                   ` (5 more replies)
  0 siblings, 6 replies; 13+ messages in thread
From: Jens Axboe @ 2021-10-18 17:51 UTC (permalink / raw)
  To: linux-block; +Cc: hch

Hi,

Here's v2 of this patchset, with the already-reviewed patches pulled out.
Since v1:

- Use a different method for the blk_status_to_errno() conversion (Christoph)
- Split the plug patch into 3 pieces

-- 
Jens Axboe




* [PATCH 1/6] block: don't call blk_status_to_errno in blk_update_request
  2021-10-18 17:51 [PATCHSET v2 0/6] Various block layer optimizations Jens Axboe
@ 2021-10-18 17:51 ` Jens Axboe
  2021-10-18 17:51 ` [PATCH 2/6] block: return whether or not to unplug through boolean Jens Axboe
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 13+ messages in thread
From: Jens Axboe @ 2021-10-18 17:51 UTC (permalink / raw)
  To: linux-block; +Cc: hch, Jens Axboe

From: Christoph Hellwig <hch@lst.de>

We only need to call blk_status_to_errno() to resolve the blk_status_t ->
errno mapping for tracing, so move the conversion into the tracepoints,
which are not called at all when tracing isn't enabled.
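
As an aside on why this is free in the disabled case: a tracepoint call
site is guarded by a static branch, so the conversion only runs once the
tracepoint is actually enabled. A simplified sketch of the idea (not the
real macro expansion from include/linux/tracepoint.h; the key and callee
names below are placeholders):

        static inline void trace_block_rq_complete(struct request *rq,
                                                   blk_status_t error,
                                                   unsigned int nr_bytes)
        {
                /* Patched to a no-op while the tracepoint is disabled */
                if (static_branch_unlikely(&block_rq_complete_key))
                        __do_trace_block_rq_complete(rq,
                                        blk_status_to_errno(error), nr_bytes);
        }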

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 block/blk-mq.c               | 2 +-
 include/trace/events/block.h | 6 +++---
 kernel/trace/blktrace.c      | 7 ++++---
 3 files changed, 8 insertions(+), 7 deletions(-)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index f7c610f5f3e7..e3ef55f76701 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -672,7 +672,7 @@ bool blk_update_request(struct request *req, blk_status_t error,
 {
 	int total_bytes;
 
-	trace_block_rq_complete(req, blk_status_to_errno(error), nr_bytes);
+	trace_block_rq_complete(req, error, nr_bytes);
 
 	if (!req->bio)
 		return false;
diff --git a/include/trace/events/block.h b/include/trace/events/block.h
index cc5ab96a7471..a95daa4d4caa 100644
--- a/include/trace/events/block.h
+++ b/include/trace/events/block.h
@@ -114,7 +114,7 @@ TRACE_EVENT(block_rq_requeue,
  */
 TRACE_EVENT(block_rq_complete,
 
-	TP_PROTO(struct request *rq, int error, unsigned int nr_bytes),
+	TP_PROTO(struct request *rq, blk_status_t error, unsigned int nr_bytes),
 
 	TP_ARGS(rq, error, nr_bytes),
 
@@ -122,7 +122,7 @@ TRACE_EVENT(block_rq_complete,
 		__field(  dev_t,	dev			)
 		__field(  sector_t,	sector			)
 		__field(  unsigned int,	nr_sector		)
-		__field(  int,		error			)
+		__field(  int	,	error			)
 		__array(  char,		rwbs,	RWBS_LEN	)
 		__dynamic_array( char,	cmd,	1		)
 	),
@@ -131,7 +131,7 @@ TRACE_EVENT(block_rq_complete,
 		__entry->dev	   = rq->rq_disk ? disk_devt(rq->rq_disk) : 0;
 		__entry->sector    = blk_rq_pos(rq);
 		__entry->nr_sector = nr_bytes >> 9;
-		__entry->error     = error;
+		__entry->error     = blk_status_to_errno(error);
 
 		blk_fill_rwbs(__entry->rwbs, rq->cmd_flags);
 		__get_str(cmd)[0] = '\0';
diff --git a/kernel/trace/blktrace.c b/kernel/trace/blktrace.c
index fa91f398f28b..1183c88634aa 100644
--- a/kernel/trace/blktrace.c
+++ b/kernel/trace/blktrace.c
@@ -816,7 +816,7 @@ blk_trace_request_get_cgid(struct request *rq)
  *     Records an action against a request. Will log the bio offset + size.
  *
  **/
-static void blk_add_trace_rq(struct request *rq, int error,
+static void blk_add_trace_rq(struct request *rq, blk_status_t error,
 			     unsigned int nr_bytes, u32 what, u64 cgid)
 {
 	struct blk_trace *bt;
@@ -834,7 +834,8 @@ static void blk_add_trace_rq(struct request *rq, int error,
 		what |= BLK_TC_ACT(BLK_TC_FS);
 
 	__blk_add_trace(bt, blk_rq_trace_sector(rq), nr_bytes, req_op(rq),
-			rq->cmd_flags, what, error, 0, NULL, cgid);
+			rq->cmd_flags, what, blk_status_to_errno(error), 0,
+			NULL, cgid);
 	rcu_read_unlock();
 }
 
@@ -863,7 +864,7 @@ static void blk_add_trace_rq_requeue(void *ignore, struct request *rq)
 }
 
 static void blk_add_trace_rq_complete(void *ignore, struct request *rq,
-			int error, unsigned int nr_bytes)
+			blk_status_t error, unsigned int nr_bytes)
 {
 	blk_add_trace_rq(rq, error, nr_bytes, BLK_TA_COMPLETE,
 			 blk_trace_request_get_cgid(rq));
-- 
2.33.1



* [PATCH 2/6] block: return whether or not to unplug through boolean
  2021-10-18 17:51 [PATCHSET v2 0/6] Various block layer optimizations Jens Axboe
  2021-10-18 17:51 ` [PATCH 1/6] block: don't call blk_status_to_errno in blk_update_request Jens Axboe
@ 2021-10-18 17:51 ` Jens Axboe
  2021-10-18 18:00   ` Christoph Hellwig
  2021-10-18 17:51 ` [PATCH 3/6] block: get rid of plug list sorting Jens Axboe
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 13+ messages in thread
From: Jens Axboe @ 2021-10-18 17:51 UTC (permalink / raw)
  To: linux-block; +Cc: hch, Jens Axboe

Instead of returning the request from the same queue through a request
pointer, use a boolean to indicate that one exists, and have the caller
retrieve it from the plug list itself.
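
To make the new contract concrete, the consumer side now looks roughly
like this (a condensed sketch of the blk_mq_submit_bio() hunk below):

        bool same_queue_rq = false;

        if (blk_attempt_plug_merge(q, bio, nr_segs, &same_queue_rq))
                return;
        ...
        /*
         * The flag only records that the plug list holds a request for
         * this queue; the request itself is fetched from the list at
         * the point of use instead of being carried as a pointer.
         */
        if (same_queue_rq) {
                struct request *next_rq;

                next_rq = list_last_entry(&plug->mq_list, struct request,
                                          queuelist);
                /* unplug: issue next_rq directly */
        }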

Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 block/blk-merge.c | 11 +++++------
 block/blk-mq.c    | 16 +++++++++-------
 block/blk.h       |  2 +-
 3 files changed, 15 insertions(+), 14 deletions(-)

diff --git a/block/blk-merge.c b/block/blk-merge.c
index ec727234ac48..c273b58378ce 100644
--- a/block/blk-merge.c
+++ b/block/blk-merge.c
@@ -1067,9 +1067,8 @@ static enum bio_merge_status blk_attempt_bio_merge(struct request_queue *q,
  * @q: request_queue new bio is being queued at
  * @bio: new bio being queued
  * @nr_segs: number of segments in @bio
- * @same_queue_rq: pointer to &struct request that gets filled in when
- * another request associated with @q is found on the plug list
- * (optional, may be %NULL)
+ * @same_queue_rq: output value, will be true if there's an existing request
+ * from the passed in @q already in the plug list
  *
  * Determine whether @bio being queued on @q can be merged with the previous
  * request on %current's plugged list.  Returns %true if merge was successful,
@@ -1085,7 +1084,7 @@ static enum bio_merge_status blk_attempt_bio_merge(struct request_queue *q,
  * Caller must ensure !blk_queue_nomerges(q) beforehand.
  */
 bool blk_attempt_plug_merge(struct request_queue *q, struct bio *bio,
-		unsigned int nr_segs, struct request **same_queue_rq)
+		unsigned int nr_segs, bool *same_queue_rq)
 {
 	struct blk_plug *plug;
 	struct request *rq;
@@ -1096,12 +1095,12 @@ bool blk_attempt_plug_merge(struct request_queue *q, struct bio *bio,
 
 	/* check the previously added entry for a quick merge attempt */
 	rq = list_last_entry(&plug->mq_list, struct request, queuelist);
-	if (rq->q == q && same_queue_rq) {
+	if (rq->q == q) {
 		/*
 		 * Only blk-mq multiple hardware queues case checks the rq in
 		 * the same queue, there should be only one such rq in a queue
 		 */
-		*same_queue_rq = rq;
+		*same_queue_rq = true;
 	}
 	if (blk_attempt_bio_merge(q, rq, bio, nr_segs, false) == BIO_MERGE_OK)
 		return true;
diff --git a/block/blk-mq.c b/block/blk-mq.c
index e3ef55f76701..d957b6812a98 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -2422,7 +2422,7 @@ void blk_mq_submit_bio(struct bio *bio)
 	const int is_flush_fua = op_is_flush(bio->bi_opf);
 	struct request *rq;
 	struct blk_plug *plug;
-	struct request *same_queue_rq = NULL;
+	bool same_queue_rq = false;
 	unsigned int nr_segs = 1;
 	blk_status_t ret;
 
@@ -2515,6 +2515,8 @@ void blk_mq_submit_bio(struct bio *bio)
 		/* Insert the request at the IO scheduler queue */
 		blk_mq_sched_insert_request(rq, false, true, true);
 	} else if (plug && !blk_queue_nomerges(q)) {
+		struct request *next_rq = NULL;
+
 		/*
 		 * We do limited plugging. If the bio can be merged, do that.
 		 * Otherwise the existing request in the plug list will be
@@ -2522,19 +2524,19 @@ void blk_mq_submit_bio(struct bio *bio)
 		 * The plug list might get flushed before this. If that happens,
 		 * the plug list is empty, and same_queue_rq is invalid.
 		 */
-		if (list_empty(&plug->mq_list))
-			same_queue_rq = NULL;
 		if (same_queue_rq) {
-			list_del_init(&same_queue_rq->queuelist);
+			next_rq = list_last_entry(&plug->mq_list,
+							struct request,
+							queuelist);
+			list_del_init(&next_rq->queuelist);
 			plug->rq_count--;
 		}
 		blk_add_rq_to_plug(plug, rq);
 		trace_block_plug(q);
 
-		if (same_queue_rq) {
+		if (next_rq) {
 			trace_block_unplug(q, 1, true);
-			blk_mq_try_issue_directly(same_queue_rq->mq_hctx,
-						  same_queue_rq);
+			blk_mq_try_issue_directly(next_rq->mq_hctx, next_rq);
 		}
 	} else if ((q->nr_hw_queues > 1 && is_sync) ||
 		   !rq->mq_hctx->dispatch_busy) {
diff --git a/block/blk.h b/block/blk.h
index e80350327e6d..b9729c12fd62 100644
--- a/block/blk.h
+++ b/block/blk.h
@@ -218,7 +218,7 @@ void blk_add_timer(struct request *req);
 void blk_print_req_error(struct request *req, blk_status_t status);
 
 bool blk_attempt_plug_merge(struct request_queue *q, struct bio *bio,
-		unsigned int nr_segs, struct request **same_queue_rq);
+		unsigned int nr_segs, bool *same_queue_rq);
 bool blk_bio_list_merge(struct request_queue *q, struct list_head *list,
 			struct bio *bio, unsigned int nr_segs);
 
-- 
2.33.1



* [PATCH 3/6] block: get rid of plug list sorting
  2021-10-18 17:51 [PATCHSET v2 0/6] Various block layer optimizations Jens Axboe
  2021-10-18 17:51 ` [PATCH 1/6] block: don't call blk_status_to_errno in blk_update_request Jens Axboe
  2021-10-18 17:51 ` [PATCH 2/6] block: return whether or not to unplug through boolean Jens Axboe
@ 2021-10-18 17:51 ` Jens Axboe
  2021-10-18 18:00   ` Christoph Hellwig
  2021-10-18 17:51 ` [PATCH 4/6] block: change plugging to use a singly linked list Jens Axboe
                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 13+ messages in thread
From: Jens Axboe @ 2021-10-18 17:51 UTC (permalink / raw)
  To: linux-block; +Cc: hch, Jens Axboe

Even if we have multiple queues in the plug list, the chances that they
are heavily interspersed are minimal. Don't bother spending CPU cycles
sorting the list.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 block/blk-mq.c | 19 -------------------
 1 file changed, 19 deletions(-)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index d957b6812a98..58774267dd95 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -19,7 +19,6 @@
 #include <linux/smp.h>
 #include <linux/interrupt.h>
 #include <linux/llist.h>
-#include <linux/list_sort.h>
 #include <linux/cpu.h>
 #include <linux/cache.h>
 #include <linux/sched/sysctl.h>
@@ -2151,20 +2150,6 @@ void blk_mq_insert_requests(struct blk_mq_hw_ctx *hctx, struct blk_mq_ctx *ctx,
 	spin_unlock(&ctx->lock);
 }
 
-static int plug_rq_cmp(void *priv, const struct list_head *a,
-		       const struct list_head *b)
-{
-	struct request *rqa = container_of(a, struct request, queuelist);
-	struct request *rqb = container_of(b, struct request, queuelist);
-
-	if (rqa->mq_ctx != rqb->mq_ctx)
-		return rqa->mq_ctx > rqb->mq_ctx;
-	if (rqa->mq_hctx != rqb->mq_hctx)
-		return rqa->mq_hctx > rqb->mq_hctx;
-
-	return blk_rq_pos(rqa) > blk_rq_pos(rqb);
-}
-
 void blk_mq_flush_plug_list(struct blk_plug *plug, bool from_schedule)
 {
 	LIST_HEAD(list);
@@ -2172,10 +2157,6 @@ void blk_mq_flush_plug_list(struct blk_plug *plug, bool from_schedule)
 	if (list_empty(&plug->mq_list))
 		return;
 	list_splice_init(&plug->mq_list, &list);
-
-	if (plug->rq_count > 2 && plug->multiple_queues)
-		list_sort(NULL, &list, plug_rq_cmp);
-
 	plug->rq_count = 0;
 
 	do {
-- 
2.33.1



* [PATCH 4/6] block: change plugging to use a singly linked list
  2021-10-18 17:51 [PATCHSET v2 0/6] Various block layer optimizations Jens Axboe
                   ` (2 preceding siblings ...)
  2021-10-18 17:51 ` [PATCH 3/6] block: get rid of plug list sorting Jens Axboe
@ 2021-10-18 17:51 ` Jens Axboe
  2021-10-19  5:57   ` Christoph Hellwig
  2021-10-18 17:51 ` [PATCH 5/6] block: move blk_mq_tag_to_rq() inline Jens Axboe
  2021-10-18 17:51 ` [PATCH 6/6] block: align blkdev_dio inlined bio to a cacheline Jens Axboe
  5 siblings, 1 reply; 13+ messages in thread
From: Jens Axboe @ 2021-10-18 17:51 UTC (permalink / raw)
  To: linux-block; +Cc: hch, Jens Axboe

Use a singly linked list for the blk_plug. This saves 8 bytes in the
blk_plug struct, and makes for faster list manipulation than a doubly
linked list. As we don't rely on the doubly linked list for anything,
singly linked is just fine.

This yields a bump in default (merging enabled) performance from 7.0M
to 7.1M IOPS, and ~7.5M IOPS with merging disabled.
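
For reference, the singly linked list is driven by the rq_list_*()
helpers in include/linux/blkdev.h, which chain requests through
rq->rq_next. Condensed (see the header for the exact definitions):

        #define rq_list_add(listptr, rq) do {           \
                (rq)->rq_next = *(listptr);             \
                *(listptr) = rq;                        \
        } while (0)

        #define rq_list_pop(listptr)                    \
        ({                                              \
                struct request *__req = *(listptr);     \
                if (__req)                              \
                        *(listptr) = __req->rq_next;    \
                __req;                                  \
        })

        #define rq_list_peek(listptr)   (*(listptr))
        #define rq_list_empty(list)     ((list) == NULL)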

Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 block/blk-core.c       |   5 +-
 block/blk-merge.c      |   4 +-
 block/blk-mq.c         | 140 ++++++++++++++++++++++++++++++-----------
 include/linux/blkdev.h |   6 +-
 4 files changed, 113 insertions(+), 42 deletions(-)

diff --git a/block/blk-core.c b/block/blk-core.c
index d0c2e11411d0..e6ad5b51d0c3 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -1550,11 +1550,12 @@ void blk_start_plug_nr_ios(struct blk_plug *plug, unsigned short nr_ios)
 	if (tsk->plug)
 		return;
 
-	INIT_LIST_HEAD(&plug->mq_list);
+	plug->mq_list = NULL;
 	plug->cached_rq = NULL;
 	plug->nr_ios = min_t(unsigned short, nr_ios, BLK_MAX_REQUEST_COUNT);
 	plug->rq_count = 0;
 	plug->multiple_queues = false;
+	plug->has_elevator = false;
 	plug->nowait = false;
 	INIT_LIST_HEAD(&plug->cb_list);
 
@@ -1640,7 +1641,7 @@ void blk_flush_plug_list(struct blk_plug *plug, bool from_schedule)
 {
 	flush_plug_callbacks(plug, from_schedule);
 
-	if (!list_empty(&plug->mq_list))
+	if (!rq_list_empty(plug->mq_list))
 		blk_mq_flush_plug_list(plug, from_schedule);
 	if (unlikely(!from_schedule && plug->cached_rq))
 		blk_mq_free_plug_rqs(plug);
diff --git a/block/blk-merge.c b/block/blk-merge.c
index c273b58378ce..3e6fa449caff 100644
--- a/block/blk-merge.c
+++ b/block/blk-merge.c
@@ -1090,11 +1090,11 @@ bool blk_attempt_plug_merge(struct request_queue *q, struct bio *bio,
 	struct request *rq;
 
 	plug = blk_mq_plug(q, bio);
-	if (!plug || list_empty(&plug->mq_list))
+	if (!plug || rq_list_empty(plug->mq_list))
 		return false;
 
 	/* check the previously added entry for a quick merge attempt */
-	rq = list_last_entry(&plug->mq_list, struct request, queuelist);
+	rq = rq_list_peek(&plug->mq_list);
 	if (rq->q == q) {
 		/*
 		 * Only blk-mq multiple hardware queues case checks the rq in
diff --git a/block/blk-mq.c b/block/blk-mq.c
index 58774267dd95..3d5010d93059 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -2150,36 +2150,106 @@ void blk_mq_insert_requests(struct blk_mq_hw_ctx *hctx, struct blk_mq_ctx *ctx,
 	spin_unlock(&ctx->lock);
 }
 
+static void blk_mq_commit_rqs(struct blk_mq_hw_ctx *hctx, int *queued,
+			      bool from_schedule)
+{
+	if (hctx->queue->mq_ops->commit_rqs) {
+		trace_block_unplug(hctx->queue, *queued, !from_schedule);
+		hctx->queue->mq_ops->commit_rqs(hctx);
+	}
+	*queued = 0;
+}
+
+static void blk_mq_plug_issue_direct(struct blk_plug *plug, bool from_schedule)
+{
+	struct blk_mq_hw_ctx *hctx = NULL;
+	struct request *rq;
+	int queued = 0;
+	int errors = 0;
+
+	while ((rq = rq_list_pop(&plug->mq_list))) {
+		bool last = rq_list_empty(plug->mq_list);
+		blk_status_t ret;
+
+		if (hctx != rq->mq_hctx) {
+			if (hctx)
+				blk_mq_commit_rqs(hctx, &queued, from_schedule);
+			hctx = rq->mq_hctx;
+		}
+
+		ret = blk_mq_request_issue_directly(rq, last);
+		switch (ret) {
+		case BLK_STS_OK:
+			queued++;
+			break;
+		case BLK_STS_RESOURCE:
+		case BLK_STS_DEV_RESOURCE:
+			blk_mq_request_bypass_insert(rq, false, last);
+			blk_mq_commit_rqs(hctx, &queued, from_schedule);
+			return;
+		default:
+			blk_mq_end_request(rq, ret);
+			errors++;
+			break;
+		}
+	}
+
+	/*
+	 * If we didn't flush the entire list, we could have told the driver
+	 * there was more coming, but that turned out to be a lie.
+	 */
+	if (errors)
+		blk_mq_commit_rqs(hctx, &queued, from_schedule);
+}
+
 void blk_mq_flush_plug_list(struct blk_plug *plug, bool from_schedule)
 {
+	struct blk_mq_hw_ctx *this_hctx;
+	struct blk_mq_ctx *this_ctx;
+	unsigned int depth;
 	LIST_HEAD(list);
 
-	if (list_empty(&plug->mq_list))
+	if (rq_list_empty(plug->mq_list))
 		return;
-	list_splice_init(&plug->mq_list, &list);
 	plug->rq_count = 0;
 
+	if (!plug->multiple_queues && !plug->has_elevator) {
+		blk_mq_plug_issue_direct(plug, from_schedule);
+		if (rq_list_empty(plug->mq_list))
+			return;
+	}
+
+	this_hctx = NULL;
+	this_ctx = NULL;
+	depth = 0;
 	do {
-		struct list_head rq_list;
-		struct request *rq, *head_rq = list_entry_rq(list.next);
-		struct list_head *pos = &head_rq->queuelist; /* skip first */
-		struct blk_mq_hw_ctx *this_hctx = head_rq->mq_hctx;
-		struct blk_mq_ctx *this_ctx = head_rq->mq_ctx;
-		unsigned int depth = 1;
-
-		list_for_each_continue(pos, &list) {
-			rq = list_entry_rq(pos);
-			BUG_ON(!rq->q);
-			if (rq->mq_hctx != this_hctx || rq->mq_ctx != this_ctx)
-				break;
-			depth++;
+		struct request *rq;
+
+		rq = rq_list_pop(&plug->mq_list);
+
+		if (!this_hctx) {
+			this_hctx = rq->mq_hctx;
+			this_ctx = rq->mq_ctx;
+		} else if (this_hctx != rq->mq_hctx || this_ctx != rq->mq_ctx) {
+			trace_block_unplug(this_hctx->queue, depth,
+						!from_schedule);
+			blk_mq_sched_insert_requests(this_hctx, this_ctx,
+						&list, from_schedule);
+			depth = 0;
+			this_hctx = rq->mq_hctx;
+			this_ctx = rq->mq_ctx;
+
 		}
 
-		list_cut_before(&rq_list, &list, pos);
-		trace_block_unplug(head_rq->q, depth, !from_schedule);
-		blk_mq_sched_insert_requests(this_hctx, this_ctx, &rq_list,
+		list_add(&rq->queuelist, &list);
+		depth++;
+	} while (!rq_list_empty(plug->mq_list));
+
+	if (!list_empty(&list)) {
+		trace_block_unplug(this_hctx->queue, depth, !from_schedule);
+		blk_mq_sched_insert_requests(this_hctx, this_ctx, &list,
 						from_schedule);
-	} while(!list_empty(&list));
+	}
 }
 
 static void blk_mq_bio_to_request(struct request *rq, struct bio *bio,
@@ -2359,16 +2429,17 @@ void blk_mq_try_issue_list_directly(struct blk_mq_hw_ctx *hctx,
 
 static void blk_add_rq_to_plug(struct blk_plug *plug, struct request *rq)
 {
-	list_add_tail(&rq->queuelist, &plug->mq_list);
-	plug->rq_count++;
-	if (!plug->multiple_queues && !list_is_singular(&plug->mq_list)) {
-		struct request *tmp;
+	if (!plug->multiple_queues) {
+		struct request *nxt = rq_list_peek(&plug->mq_list);
 
-		tmp = list_first_entry(&plug->mq_list, struct request,
-						queuelist);
-		if (tmp->q != rq->q)
+		if (nxt && nxt->q != rq->q)
 			plug->multiple_queues = true;
 	}
+	if (!plug->has_elevator && (rq->rq_flags & RQF_ELV))
+		plug->has_elevator = true;
+	rq->rq_next = NULL;
+	rq_list_add(&plug->mq_list, rq);
+	plug->rq_count++;
 }
 
 /*
@@ -2480,13 +2551,15 @@ void blk_mq_submit_bio(struct bio *bio)
 		unsigned int request_count = plug->rq_count;
 		struct request *last = NULL;
 
-		if (!request_count)
+		if (!request_count) {
 			trace_block_plug(q);
-		else
-			last = list_entry_rq(plug->mq_list.prev);
+		} else if (!blk_queue_nomerges(q)) {
+			last = rq_list_peek(&plug->mq_list);
+			if (blk_rq_bytes(last) < BLK_PLUG_FLUSH_SIZE)
+				last = NULL;
+		}
 
-		if (request_count >= blk_plug_max_rq_count(plug) || (last &&
-		    blk_rq_bytes(last) >= BLK_PLUG_FLUSH_SIZE)) {
+		if (request_count >= blk_plug_max_rq_count(plug) || last) {
 			blk_flush_plug_list(plug, false);
 			trace_block_plug(q);
 		}
@@ -2506,10 +2579,7 @@ void blk_mq_submit_bio(struct bio *bio)
 		 * the plug list is empty, and same_queue_rq is invalid.
 		 */
 		if (same_queue_rq) {
-			next_rq = list_last_entry(&plug->mq_list,
-							struct request,
-							queuelist);
-			list_del_init(&next_rq->queuelist);
+			next_rq = rq_list_pop(&plug->mq_list);
 			plug->rq_count--;
 		}
 		blk_add_rq_to_plug(plug, rq);
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index abe721591e80..2e93682f8f68 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -711,7 +711,7 @@ extern void blk_set_queue_dying(struct request_queue *);
  * schedule() where blk_schedule_flush_plug() is called.
  */
 struct blk_plug {
-	struct list_head mq_list; /* blk-mq requests */
+	struct request *mq_list; /* blk-mq requests */
 
 	/* if ios_left is > 1, we can batch tag/rq allocations */
 	struct request *cached_rq;
@@ -720,6 +720,7 @@ struct blk_plug {
 	unsigned short rq_count;
 
 	bool multiple_queues;
+	bool has_elevator;
 	bool nowait;
 
 	struct list_head cb_list; /* md requires an unplug callback */
@@ -760,8 +761,7 @@ static inline bool blk_needs_flush_plug(struct task_struct *tsk)
 	struct blk_plug *plug = tsk->plug;
 
 	return plug &&
-		 (!list_empty(&plug->mq_list) ||
-		 !list_empty(&plug->cb_list));
+		 (plug->mq_list || !list_empty(&plug->cb_list));
 }
 
 int blkdev_issue_flush(struct block_device *bdev);
-- 
2.33.1



* [PATCH 5/6] block: move blk_mq_tag_to_rq() inline
  2021-10-18 17:51 [PATCHSET v2 0/6] Various block layer optimizations Jens Axboe
                   ` (3 preceding siblings ...)
  2021-10-18 17:51 ` [PATCH 4/6] block: change plugging to use a singly linked list Jens Axboe
@ 2021-10-18 17:51 ` Jens Axboe
  2021-10-18 18:01   ` Christoph Hellwig
  2021-10-18 17:51 ` [PATCH 6/6] block: align blkdev_dio inlined bio to a cacheline Jens Axboe
  5 siblings, 1 reply; 13+ messages in thread
From: Jens Axboe @ 2021-10-18 17:51 UTC (permalink / raw)
  To: linux-block; +Cc: hch, Jens Axboe

This is in the fast path of driver issue or completion, and it's a single
bounds check plus array index operation. Move it inline to avoid a
function call for it.

This does mean moving struct blk_mq_tags out of the private block layer
headers and into the public blk-mq header, but there's not really much
in there.
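
A typical beneficiary is a driver completion handler along these lines
(hypothetical driver code, for illustration only):

        /*
         * Resolving a hardware completion tag back to its request is
         * now a bounds check plus an array index (and a prefetch),
         * without a call into the block layer.
         */
        static void mydrv_handle_completion(struct blk_mq_tags *tags, u16 tag)
        {
                struct request *rq = blk_mq_tag_to_rq(tags, tag);

                if (rq)
                        blk_mq_complete_request(rq);
        }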

Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 block/blk-mq-tag.h     | 23 -----------------------
 block/blk-mq.c         | 11 -----------
 include/linux/blk-mq.h | 36 +++++++++++++++++++++++++++++++++++-
 3 files changed, 35 insertions(+), 35 deletions(-)

diff --git a/block/blk-mq-tag.h b/block/blk-mq-tag.h
index 78ae2fb8e2a4..df787b5a23bd 100644
--- a/block/blk-mq-tag.h
+++ b/block/blk-mq-tag.h
@@ -4,29 +4,6 @@
 
 struct blk_mq_alloc_data;
 
-/*
- * Tag address space map.
- */
-struct blk_mq_tags {
-	unsigned int nr_tags;
-	unsigned int nr_reserved_tags;
-
-	atomic_t active_queues;
-
-	struct sbitmap_queue bitmap_tags;
-	struct sbitmap_queue breserved_tags;
-
-	struct request **rqs;
-	struct request **static_rqs;
-	struct list_head page_list;
-
-	/*
-	 * used to clear request reference in rqs[] before freeing one
-	 * request pool
-	 */
-	spinlock_t lock;
-};
-
 extern struct blk_mq_tags *blk_mq_init_tags(unsigned int nr_tags,
 					unsigned int reserved_tags,
 					int node, int alloc_policy);
diff --git a/block/blk-mq.c b/block/blk-mq.c
index 3d5010d93059..9a73e2c3ce32 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -1112,17 +1112,6 @@ void blk_mq_delay_kick_requeue_list(struct request_queue *q,
 }
 EXPORT_SYMBOL(blk_mq_delay_kick_requeue_list);
 
-struct request *blk_mq_tag_to_rq(struct blk_mq_tags *tags, unsigned int tag)
-{
-	if (tag < tags->nr_tags) {
-		prefetch(tags->rqs[tag]);
-		return tags->rqs[tag];
-	}
-
-	return NULL;
-}
-EXPORT_SYMBOL(blk_mq_tag_to_rq);
-
 static bool blk_mq_rq_inflight(struct blk_mq_hw_ctx *hctx, struct request *rq,
 			       void *priv, bool reserved)
 {
diff --git a/include/linux/blk-mq.h b/include/linux/blk-mq.h
index 656fe34bdb6c..6cf35de151a9 100644
--- a/include/linux/blk-mq.h
+++ b/include/linux/blk-mq.h
@@ -7,6 +7,7 @@
 #include <linux/srcu.h>
 #include <linux/lockdep.h>
 #include <linux/scatterlist.h>
+#include <linux/prefetch.h>
 
 struct blk_mq_tags;
 struct blk_flush_queue;
@@ -675,7 +676,40 @@ struct request *blk_mq_alloc_request(struct request_queue *q, unsigned int op,
 struct request *blk_mq_alloc_request_hctx(struct request_queue *q,
 		unsigned int op, blk_mq_req_flags_t flags,
 		unsigned int hctx_idx);
-struct request *blk_mq_tag_to_rq(struct blk_mq_tags *tags, unsigned int tag);
+
+/*
+ * Tag address space map.
+ */
+struct blk_mq_tags {
+	unsigned int nr_tags;
+	unsigned int nr_reserved_tags;
+
+	atomic_t active_queues;
+
+	struct sbitmap_queue bitmap_tags;
+	struct sbitmap_queue breserved_tags;
+
+	struct request **rqs;
+	struct request **static_rqs;
+	struct list_head page_list;
+
+	/*
+	 * used to clear request reference in rqs[] before freeing one
+	 * request pool
+	 */
+	spinlock_t lock;
+};
+
+static inline struct request *blk_mq_tag_to_rq(struct blk_mq_tags *tags,
+					       unsigned int tag)
+{
+	if (tag < tags->nr_tags) {
+		prefetch(tags->rqs[tag]);
+		return tags->rqs[tag];
+	}
+
+	return NULL;
+}
 
 enum {
 	BLK_MQ_UNIQUE_TAG_BITS = 16,
-- 
2.33.1



* [PATCH 6/6] block: align blkdev_dio inlined bio to a cacheline
  2021-10-18 17:51 [PATCHSET v2 0/6] Various block layer optimizations Jens Axboe
                   ` (4 preceding siblings ...)
  2021-10-18 17:51 ` [PATCH 5/6] block: move blk_mq_tag_to_rq() inline Jens Axboe
@ 2021-10-18 17:51 ` Jens Axboe
  2021-10-18 18:01   ` Christoph Hellwig
  5 siblings, 1 reply; 13+ messages in thread
From: Jens Axboe @ 2021-10-18 17:51 UTC (permalink / raw)
  To: linux-block; +Cc: hch, Jens Axboe

We get all sorts of unreliable and funky performance results, since
struct bio is designed to be cacheline aligned, which it is not when
inlined in struct blkdev_dio like this. Align the embedded bio to a
cacheline boundary.
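
The effect can be sanity-checked at build time, e.g. (illustrative
only, not part of this patch):

        /*
         * ____cacheline_aligned_in_smp rounds the offset of the
         * embedded bio up to the next cacheline boundary
         * (SMP_CACHE_BYTES, usually 64 bytes), instead of wherever the
         * preceding members happen to end.
         */
        static_assert(offsetof(struct blkdev_dio, bio) % SMP_CACHE_BYTES == 0,
                      "inlined bio must start on a cacheline boundary");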

Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 block/fops.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/block/fops.c b/block/fops.c
index 2ae8a7bd2b84..e30082ea219a 100644
--- a/block/fops.c
+++ b/block/fops.c
@@ -137,7 +137,7 @@ struct blkdev_dio {
 	size_t			size;
 	atomic_t		ref;
 	unsigned int		flags;
-	struct bio		bio;
+	struct bio		bio ____cacheline_aligned_in_smp;
 };
 
 static struct bio_set blkdev_dio_pool;
-- 
2.33.1



* Re: [PATCH 2/6] block: return whether or not to unplug through boolean
  2021-10-18 17:51 ` [PATCH 2/6] block: return whether or not to unplug through boolean Jens Axboe
@ 2021-10-18 18:00   ` Christoph Hellwig
  0 siblings, 0 replies; 13+ messages in thread
From: Christoph Hellwig @ 2021-10-18 18:00 UTC (permalink / raw)
  To: Jens Axboe; +Cc: linux-block, hch

Looks good,

Reviewed-by: Christoph Hellwig <hch@lst.de>


* Re: [PATCH 3/6] block: get rid of plug list sorting
  2021-10-18 17:51 ` [PATCH 3/6] block: get rid of plug list sorting Jens Axboe
@ 2021-10-18 18:00   ` Christoph Hellwig
  0 siblings, 0 replies; 13+ messages in thread
From: Christoph Hellwig @ 2021-10-18 18:00 UTC (permalink / raw)
  To: Jens Axboe; +Cc: linux-block, hch

On Mon, Oct 18, 2021 at 11:51:06AM -0600, Jens Axboe wrote:
> Even if we have multiple queues in the plug list, the chances that they
> are heavily interspersed are minimal. Don't bother spending CPU cycles
> sorting the list.
> 
> Signed-off-by: Jens Axboe <axboe@kernel.dk>

Looks good,

Reviewed-by: Christoph Hellwig <hch@lst.de>


* Re: [PATCH 6/6] block: align blkdev_dio inlined bio to a cacheline
  2021-10-18 17:51 ` [PATCH 6/6] block: align blkdev_dio inlined bio to a cacheline Jens Axboe
@ 2021-10-18 18:01   ` Christoph Hellwig
  0 siblings, 0 replies; 13+ messages in thread
From: Christoph Hellwig @ 2021-10-18 18:01 UTC (permalink / raw)
  To: Jens Axboe; +Cc: linux-block, hch

Looks good,

Reviewed-by: Christoph Hellwig <hch@lst.de>


* Re: [PATCH 5/6] block: move blk_mq_tag_to_rq() inline
  2021-10-18 17:51 ` [PATCH 5/6] block: move blk_mq_tag_to_rq() inline Jens Axboe
@ 2021-10-18 18:01   ` Christoph Hellwig
  0 siblings, 0 replies; 13+ messages in thread
From: Christoph Hellwig @ 2021-10-18 18:01 UTC (permalink / raw)
  To: Jens Axboe; +Cc: linux-block, hch

On Mon, Oct 18, 2021 at 11:51:08AM -0600, Jens Axboe wrote:
> This is in the fast path of driver issue or completion, and it's a single
> bounds check plus array index operation. Move it inline to avoid a
> function call for it.
> 
> This does mean moving struct blk_mq_tags out of the private block layer
> headers and into the public blk-mq header, but there's not really much
> in there.

I don't really like exposing more data structures if we can avoid it,
but if this makes a big enough difference:

Reviewed-by: Christoph Hellwig <hch@lst.de>


* Re: [PATCH 4/6] block: change plugging to use a singly linked list
  2021-10-18 17:51 ` [PATCH 4/6] block: change plugging to use a singly linked list Jens Axboe
@ 2021-10-19  5:57   ` Christoph Hellwig
  2021-10-19 11:58     ` Jens Axboe
  0 siblings, 1 reply; 13+ messages in thread
From: Christoph Hellwig @ 2021-10-19  5:57 UTC (permalink / raw)
  To: Jens Axboe; +Cc: linux-block, hch

On Mon, Oct 18, 2021 at 11:51:07AM -0600, Jens Axboe wrote:
> Use a singly linked list for the blk_plug. This saves 8 bytes in the
> blk_plug struct, and makes for faster list manipulation than a doubly
> linked list. As we don't rely on the doubly linked list for anything,
> singly linked is just fine.
> 
> This yields a bump in default (merging enabled) performance from 7.0M
> to 7.1M IOPS, and ~7.5M IOPS with merging disabled.

I still find this very hard to review.  It doesn't just switch the
list implementation but also changes how this code works.  Especially
in blk_mq_flush_plug_list, but also in blk_mq_submit_bio.  I think
this needs to be further split up.


* Re: [PATCH 4/6] block: change plugging to use a singly linked list
  2021-10-19  5:57   ` Christoph Hellwig
@ 2021-10-19 11:58     ` Jens Axboe
  0 siblings, 0 replies; 13+ messages in thread
From: Jens Axboe @ 2021-10-19 11:58 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: linux-block

On 10/18/21 11:57 PM, Christoph Hellwig wrote:
> On Mon, Oct 18, 2021 at 11:51:07AM -0600, Jens Axboe wrote:
>> Use a singly linked list for the blk_plug. This saves 8 bytes in the
>> blk_plug struct, and makes for faster list manipulation than a doubly
>> linked list. As we don't rely on the doubly linked list for anything,
>> singly linked is just fine.
>>
>> This yields a bump in default (merging enabled) performance from 7.0M
>> to 7.1M IOPS, and ~7.5M IOPS with merging disabled.
> 
> I still find this very hard to review.  It doesn't just switch the
> list implementation but also changes how this code works.  Especially
> in blk_mq_flush_plug_list, but also in blk_mq_submit_bio.  I think
> this needs to be further split up.

OK, I'll split this one up a bit more.

-- 
Jens Axboe


