* [PATCHSET v2 0/6] Various block layer optimizations
@ 2021-10-18 17:51 Jens Axboe
2021-10-18 17:51 ` [PATCH 1/6] block: don't call blk_status_to_errno in blk_update_request Jens Axboe
` (5 more replies)
0 siblings, 6 replies; 13+ messages in thread
From: Jens Axboe @ 2021-10-18 17:51 UTC (permalink / raw)
To: linux-block; +Cc: hch
Hi,
Here's v2 of this patchset, with the reviewed ones pulled out.
Since v1:
- Use different method for blk_status_to_errno() (Christoph)
- Split plug patch into 3 pieces
--
Jens Axboe
* [PATCH 1/6] block: don't call blk_status_to_errno in blk_update_request
From: Jens Axboe @ 2021-10-18 17:51 UTC (permalink / raw)
To: linux-block; +Cc: hch, Jens Axboe
From: Christoph Hellwig <hch@lst.de>
We only need blk_status_to_errno() to resolve the blk_status_t -> errno
mapping for tracing, so move the conversion into the tracepoints, which
are not called at all when tracing isn't enabled.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
block/blk-mq.c | 2 +-
include/trace/events/block.h | 6 +++---
kernel/trace/blktrace.c | 7 ++++---
3 files changed, 8 insertions(+), 7 deletions(-)
diff --git a/block/blk-mq.c b/block/blk-mq.c
index f7c610f5f3e7..e3ef55f76701 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -672,7 +672,7 @@ bool blk_update_request(struct request *req, blk_status_t error,
{
int total_bytes;
- trace_block_rq_complete(req, blk_status_to_errno(error), nr_bytes);
+ trace_block_rq_complete(req, error, nr_bytes);
if (!req->bio)
return false;
diff --git a/include/trace/events/block.h b/include/trace/events/block.h
index cc5ab96a7471..a95daa4d4caa 100644
--- a/include/trace/events/block.h
+++ b/include/trace/events/block.h
@@ -114,7 +114,7 @@ TRACE_EVENT(block_rq_requeue,
*/
TRACE_EVENT(block_rq_complete,
- TP_PROTO(struct request *rq, int error, unsigned int nr_bytes),
+ TP_PROTO(struct request *rq, blk_status_t error, unsigned int nr_bytes),
TP_ARGS(rq, error, nr_bytes),
@@ -122,7 +122,7 @@ TRACE_EVENT(block_rq_complete,
__field( dev_t, dev )
__field( sector_t, sector )
__field( unsigned int, nr_sector )
- __field( int, error )
+ __field( int , error )
__array( char, rwbs, RWBS_LEN )
__dynamic_array( char, cmd, 1 )
),
@@ -131,7 +131,7 @@ TRACE_EVENT(block_rq_complete,
__entry->dev = rq->rq_disk ? disk_devt(rq->rq_disk) : 0;
__entry->sector = blk_rq_pos(rq);
__entry->nr_sector = nr_bytes >> 9;
- __entry->error = error;
+ __entry->error = blk_status_to_errno(error);
blk_fill_rwbs(__entry->rwbs, rq->cmd_flags);
__get_str(cmd)[0] = '\0';
diff --git a/kernel/trace/blktrace.c b/kernel/trace/blktrace.c
index fa91f398f28b..1183c88634aa 100644
--- a/kernel/trace/blktrace.c
+++ b/kernel/trace/blktrace.c
@@ -816,7 +816,7 @@ blk_trace_request_get_cgid(struct request *rq)
* Records an action against a request. Will log the bio offset + size.
*
**/
-static void blk_add_trace_rq(struct request *rq, int error,
+static void blk_add_trace_rq(struct request *rq, blk_status_t error,
unsigned int nr_bytes, u32 what, u64 cgid)
{
struct blk_trace *bt;
@@ -834,7 +834,8 @@ static void blk_add_trace_rq(struct request *rq, int error,
what |= BLK_TC_ACT(BLK_TC_FS);
__blk_add_trace(bt, blk_rq_trace_sector(rq), nr_bytes, req_op(rq),
- rq->cmd_flags, what, error, 0, NULL, cgid);
+ rq->cmd_flags, what, blk_status_to_errno(error), 0,
+ NULL, cgid);
rcu_read_unlock();
}
@@ -863,7 +864,7 @@ static void blk_add_trace_rq_requeue(void *ignore, struct request *rq)
}
static void blk_add_trace_rq_complete(void *ignore, struct request *rq,
- int error, unsigned int nr_bytes)
+ blk_status_t error, unsigned int nr_bytes)
{
blk_add_trace_rq(rq, error, nr_bytes, BLK_TA_COMPLETE,
blk_trace_request_get_cgid(rq));
--
2.33.1
* [PATCH 2/6] block: return whether or not to unplug through boolean
From: Jens Axboe @ 2021-10-18 17:51 UTC (permalink / raw)
To: linux-block; +Cc: hch, Jens Axboe
Instead of returning the same-queue request through a request pointer,
use a boolean to indicate whether one was found.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
block/blk-merge.c | 11 +++++------
block/blk-mq.c | 16 +++++++++-------
block/blk.h | 2 +-
3 files changed, 15 insertions(+), 14 deletions(-)
diff --git a/block/blk-merge.c b/block/blk-merge.c
index ec727234ac48..c273b58378ce 100644
--- a/block/blk-merge.c
+++ b/block/blk-merge.c
@@ -1067,9 +1067,8 @@ static enum bio_merge_status blk_attempt_bio_merge(struct request_queue *q,
* @q: request_queue new bio is being queued at
* @bio: new bio being queued
* @nr_segs: number of segments in @bio
- * @same_queue_rq: pointer to &struct request that gets filled in when
- * another request associated with @q is found on the plug list
- * (optional, may be %NULL)
+ * @same_queue_rq: output value, will be true if there's an existing request
+ * from the passed in @q already in the plug list
*
* Determine whether @bio being queued on @q can be merged with the previous
* request on %current's plugged list. Returns %true if merge was successful,
@@ -1085,7 +1084,7 @@ static enum bio_merge_status blk_attempt_bio_merge(struct request_queue *q,
* Caller must ensure !blk_queue_nomerges(q) beforehand.
*/
bool blk_attempt_plug_merge(struct request_queue *q, struct bio *bio,
- unsigned int nr_segs, struct request **same_queue_rq)
+ unsigned int nr_segs, bool *same_queue_rq)
{
struct blk_plug *plug;
struct request *rq;
@@ -1096,12 +1095,12 @@ bool blk_attempt_plug_merge(struct request_queue *q, struct bio *bio,
/* check the previously added entry for a quick merge attempt */
rq = list_last_entry(&plug->mq_list, struct request, queuelist);
- if (rq->q == q && same_queue_rq) {
+ if (rq->q == q) {
/*
* Only blk-mq multiple hardware queues case checks the rq in
* the same queue, there should be only one such rq in a queue
*/
- *same_queue_rq = rq;
+ *same_queue_rq = true;
}
if (blk_attempt_bio_merge(q, rq, bio, nr_segs, false) == BIO_MERGE_OK)
return true;
diff --git a/block/blk-mq.c b/block/blk-mq.c
index e3ef55f76701..d957b6812a98 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -2422,7 +2422,7 @@ void blk_mq_submit_bio(struct bio *bio)
const int is_flush_fua = op_is_flush(bio->bi_opf);
struct request *rq;
struct blk_plug *plug;
- struct request *same_queue_rq = NULL;
+ bool same_queue_rq = false;
unsigned int nr_segs = 1;
blk_status_t ret;
@@ -2515,6 +2515,8 @@ void blk_mq_submit_bio(struct bio *bio)
/* Insert the request at the IO scheduler queue */
blk_mq_sched_insert_request(rq, false, true, true);
} else if (plug && !blk_queue_nomerges(q)) {
+ struct request *next_rq = NULL;
+
/*
* We do limited plugging. If the bio can be merged, do that.
* Otherwise the existing request in the plug list will be
@@ -2522,19 +2524,19 @@ void blk_mq_submit_bio(struct bio *bio)
* The plug list might get flushed before this. If that happens,
* the plug list is empty, and same_queue_rq is invalid.
*/
- if (list_empty(&plug->mq_list))
- same_queue_rq = NULL;
if (same_queue_rq) {
- list_del_init(&same_queue_rq->queuelist);
+ next_rq = list_last_entry(&plug->mq_list,
+ struct request,
+ queuelist);
+ list_del_init(&next_rq->queuelist);
plug->rq_count--;
}
blk_add_rq_to_plug(plug, rq);
trace_block_plug(q);
- if (same_queue_rq) {
+ if (next_rq) {
trace_block_unplug(q, 1, true);
- blk_mq_try_issue_directly(same_queue_rq->mq_hctx,
- same_queue_rq);
+ blk_mq_try_issue_directly(next_rq->mq_hctx, next_rq);
}
} else if ((q->nr_hw_queues > 1 && is_sync) ||
!rq->mq_hctx->dispatch_busy) {
diff --git a/block/blk.h b/block/blk.h
index e80350327e6d..b9729c12fd62 100644
--- a/block/blk.h
+++ b/block/blk.h
@@ -218,7 +218,7 @@ void blk_add_timer(struct request *req);
void blk_print_req_error(struct request *req, blk_status_t status);
bool blk_attempt_plug_merge(struct request_queue *q, struct bio *bio,
- unsigned int nr_segs, struct request **same_queue_rq);
+ unsigned int nr_segs, bool *same_queue_rq);
bool blk_bio_list_merge(struct request_queue *q, struct list_head *list,
struct bio *bio, unsigned int nr_segs);
--
2.33.1
* [PATCH 3/6] block: get rid of plug list sorting
From: Jens Axboe @ 2021-10-18 17:51 UTC (permalink / raw)
To: linux-block; +Cc: hch, Jens Axboe
Even if we have multiple queues in the plug list, the chances that they
are heavily interspersed are minimal. Don't bother spending CPU cycles
sorting the list.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
block/blk-mq.c | 19 -------------------
1 file changed, 19 deletions(-)
diff --git a/block/blk-mq.c b/block/blk-mq.c
index d957b6812a98..58774267dd95 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -19,7 +19,6 @@
#include <linux/smp.h>
#include <linux/interrupt.h>
#include <linux/llist.h>
-#include <linux/list_sort.h>
#include <linux/cpu.h>
#include <linux/cache.h>
#include <linux/sched/sysctl.h>
@@ -2151,20 +2150,6 @@ void blk_mq_insert_requests(struct blk_mq_hw_ctx *hctx, struct blk_mq_ctx *ctx,
spin_unlock(&ctx->lock);
}
-static int plug_rq_cmp(void *priv, const struct list_head *a,
- const struct list_head *b)
-{
- struct request *rqa = container_of(a, struct request, queuelist);
- struct request *rqb = container_of(b, struct request, queuelist);
-
- if (rqa->mq_ctx != rqb->mq_ctx)
- return rqa->mq_ctx > rqb->mq_ctx;
- if (rqa->mq_hctx != rqb->mq_hctx)
- return rqa->mq_hctx > rqb->mq_hctx;
-
- return blk_rq_pos(rqa) > blk_rq_pos(rqb);
-}
-
void blk_mq_flush_plug_list(struct blk_plug *plug, bool from_schedule)
{
LIST_HEAD(list);
@@ -2172,10 +2157,6 @@ void blk_mq_flush_plug_list(struct blk_plug *plug, bool from_schedule)
if (list_empty(&plug->mq_list))
return;
list_splice_init(&plug->mq_list, &list);
-
- if (plug->rq_count > 2 && plug->multiple_queues)
- list_sort(NULL, &list, plug_rq_cmp);
-
plug->rq_count = 0;
do {
--
2.33.1
* [PATCH 4/6] block: change plugging to use a singly linked list
From: Jens Axboe @ 2021-10-18 17:51 UTC (permalink / raw)
To: linux-block; +Cc: hch, Jens Axboe
Use a singly linked list for the blk_plug. This saves 8 bytes in the
blk_plug struct, and makes for faster list manipulations than doubly
linked lists. As we don't use the doubly linked lists for anything,
singly linked is just fine.
This yields a bump in default (merging enabled) performance from 7.0
to 7.1M IOPS, and ~7.5M IOPS with merging disabled.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
block/blk-core.c | 5 +-
block/blk-merge.c | 4 +-
block/blk-mq.c | 140 ++++++++++++++++++++++++++++++-----------
include/linux/blkdev.h | 6 +-
4 files changed, 113 insertions(+), 42 deletions(-)
diff --git a/block/blk-core.c b/block/blk-core.c
index d0c2e11411d0..e6ad5b51d0c3 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -1550,11 +1550,12 @@ void blk_start_plug_nr_ios(struct blk_plug *plug, unsigned short nr_ios)
if (tsk->plug)
return;
- INIT_LIST_HEAD(&plug->mq_list);
+ plug->mq_list = NULL;
plug->cached_rq = NULL;
plug->nr_ios = min_t(unsigned short, nr_ios, BLK_MAX_REQUEST_COUNT);
plug->rq_count = 0;
plug->multiple_queues = false;
+ plug->has_elevator = false;
plug->nowait = false;
INIT_LIST_HEAD(&plug->cb_list);
@@ -1640,7 +1641,7 @@ void blk_flush_plug_list(struct blk_plug *plug, bool from_schedule)
{
flush_plug_callbacks(plug, from_schedule);
- if (!list_empty(&plug->mq_list))
+ if (!rq_list_empty(plug->mq_list))
blk_mq_flush_plug_list(plug, from_schedule);
if (unlikely(!from_schedule && plug->cached_rq))
blk_mq_free_plug_rqs(plug);
diff --git a/block/blk-merge.c b/block/blk-merge.c
index c273b58378ce..3e6fa449caff 100644
--- a/block/blk-merge.c
+++ b/block/blk-merge.c
@@ -1090,11 +1090,11 @@ bool blk_attempt_plug_merge(struct request_queue *q, struct bio *bio,
struct request *rq;
plug = blk_mq_plug(q, bio);
- if (!plug || list_empty(&plug->mq_list))
+ if (!plug || rq_list_empty(plug->mq_list))
return false;
/* check the previously added entry for a quick merge attempt */
- rq = list_last_entry(&plug->mq_list, struct request, queuelist);
+ rq = rq_list_peek(&plug->mq_list);
if (rq->q == q) {
/*
* Only blk-mq multiple hardware queues case checks the rq in
diff --git a/block/blk-mq.c b/block/blk-mq.c
index 58774267dd95..3d5010d93059 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -2150,36 +2150,106 @@ void blk_mq_insert_requests(struct blk_mq_hw_ctx *hctx, struct blk_mq_ctx *ctx,
spin_unlock(&ctx->lock);
}
+static void blk_mq_commit_rqs(struct blk_mq_hw_ctx *hctx, int *queued,
+ bool from_schedule)
+{
+ if (hctx->queue->mq_ops->commit_rqs) {
+ trace_block_unplug(hctx->queue, *queued, !from_schedule);
+ hctx->queue->mq_ops->commit_rqs(hctx);
+ }
+ *queued = 0;
+}
+
+static void blk_mq_plug_issue_direct(struct blk_plug *plug, bool from_schedule)
+{
+ struct blk_mq_hw_ctx *hctx = NULL;
+ struct request *rq;
+ int queued = 0;
+ int errors = 0;
+
+ while ((rq = rq_list_pop(&plug->mq_list))) {
+ bool last = rq_list_empty(plug->mq_list);
+ blk_status_t ret;
+
+ if (hctx != rq->mq_hctx) {
+ if (hctx)
+ blk_mq_commit_rqs(hctx, &queued, from_schedule);
+ hctx = rq->mq_hctx;
+ }
+
+ ret = blk_mq_request_issue_directly(rq, last);
+ switch (ret) {
+ case BLK_STS_OK:
+ queued++;
+ break;
+ case BLK_STS_RESOURCE:
+ case BLK_STS_DEV_RESOURCE:
+ blk_mq_request_bypass_insert(rq, false, last);
+ blk_mq_commit_rqs(hctx, &queued, from_schedule);
+ return;
+ default:
+ blk_mq_end_request(rq, ret);
+ errors++;
+ break;
+ }
+ }
+
+ /*
+ * If we didn't flush the entire list, we could have told the driver
+ * there was more coming, but that turned out to be a lie.
+ */
+ if (errors)
+ blk_mq_commit_rqs(hctx, &queued, from_schedule);
+}
+
void blk_mq_flush_plug_list(struct blk_plug *plug, bool from_schedule)
{
+ struct blk_mq_hw_ctx *this_hctx;
+ struct blk_mq_ctx *this_ctx;
+ unsigned int depth;
LIST_HEAD(list);
- if (list_empty(&plug->mq_list))
+ if (rq_list_empty(plug->mq_list))
return;
- list_splice_init(&plug->mq_list, &list);
plug->rq_count = 0;
+ if (!plug->multiple_queues && !plug->has_elevator) {
+ blk_mq_plug_issue_direct(plug, from_schedule);
+ if (rq_list_empty(plug->mq_list))
+ return;
+ }
+
+ this_hctx = NULL;
+ this_ctx = NULL;
+ depth = 0;
do {
- struct list_head rq_list;
- struct request *rq, *head_rq = list_entry_rq(list.next);
- struct list_head *pos = &head_rq->queuelist; /* skip first */
- struct blk_mq_hw_ctx *this_hctx = head_rq->mq_hctx;
- struct blk_mq_ctx *this_ctx = head_rq->mq_ctx;
- unsigned int depth = 1;
-
- list_for_each_continue(pos, &list) {
- rq = list_entry_rq(pos);
- BUG_ON(!rq->q);
- if (rq->mq_hctx != this_hctx || rq->mq_ctx != this_ctx)
- break;
- depth++;
+ struct request *rq;
+
+ rq = rq_list_pop(&plug->mq_list);
+
+ if (!this_hctx) {
+ this_hctx = rq->mq_hctx;
+ this_ctx = rq->mq_ctx;
+ } else if (this_hctx != rq->mq_hctx || this_ctx != rq->mq_ctx) {
+ trace_block_unplug(this_hctx->queue, depth,
+ !from_schedule);
+ blk_mq_sched_insert_requests(this_hctx, this_ctx,
+ &list, from_schedule);
+ depth = 0;
+ this_hctx = rq->mq_hctx;
+ this_ctx = rq->mq_ctx;
+
}
- list_cut_before(&rq_list, &list, pos);
- trace_block_unplug(head_rq->q, depth, !from_schedule);
- blk_mq_sched_insert_requests(this_hctx, this_ctx, &rq_list,
+ list_add(&rq->queuelist, &list);
+ depth++;
+ } while (!rq_list_empty(plug->mq_list));
+
+ if (!list_empty(&list)) {
+ trace_block_unplug(this_hctx->queue, depth, !from_schedule);
+ blk_mq_sched_insert_requests(this_hctx, this_ctx, &list,
from_schedule);
- } while(!list_empty(&list));
+ }
}
static void blk_mq_bio_to_request(struct request *rq, struct bio *bio,
@@ -2359,16 +2429,17 @@ void blk_mq_try_issue_list_directly(struct blk_mq_hw_ctx *hctx,
static void blk_add_rq_to_plug(struct blk_plug *plug, struct request *rq)
{
- list_add_tail(&rq->queuelist, &plug->mq_list);
- plug->rq_count++;
- if (!plug->multiple_queues && !list_is_singular(&plug->mq_list)) {
- struct request *tmp;
+ if (!plug->multiple_queues) {
+ struct request *nxt = rq_list_peek(&plug->mq_list);
- tmp = list_first_entry(&plug->mq_list, struct request,
- queuelist);
- if (tmp->q != rq->q)
+ if (nxt && nxt->q != rq->q)
plug->multiple_queues = true;
}
+ if (!plug->has_elevator && (rq->rq_flags & RQF_ELV))
+ plug->has_elevator = true;
+ rq->rq_next = NULL;
+ rq_list_add(&plug->mq_list, rq);
+ plug->rq_count++;
}
/*
@@ -2480,13 +2551,15 @@ void blk_mq_submit_bio(struct bio *bio)
unsigned int request_count = plug->rq_count;
struct request *last = NULL;
- if (!request_count)
+ if (!request_count) {
trace_block_plug(q);
- else
- last = list_entry_rq(plug->mq_list.prev);
+ } else if (!blk_queue_nomerges(q)) {
+ last = rq_list_peek(&plug->mq_list);
+ if (blk_rq_bytes(last) < BLK_PLUG_FLUSH_SIZE)
+ last = NULL;
+ }
- if (request_count >= blk_plug_max_rq_count(plug) || (last &&
- blk_rq_bytes(last) >= BLK_PLUG_FLUSH_SIZE)) {
+ if (request_count >= blk_plug_max_rq_count(plug) || last) {
blk_flush_plug_list(plug, false);
trace_block_plug(q);
}
@@ -2506,10 +2579,7 @@ void blk_mq_submit_bio(struct bio *bio)
* the plug list is empty, and same_queue_rq is invalid.
*/
if (same_queue_rq) {
- next_rq = list_last_entry(&plug->mq_list,
- struct request,
- queuelist);
- list_del_init(&next_rq->queuelist);
+ next_rq = rq_list_pop(&plug->mq_list);
plug->rq_count--;
}
blk_add_rq_to_plug(plug, rq);
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index abe721591e80..2e93682f8f68 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -711,7 +711,7 @@ extern void blk_set_queue_dying(struct request_queue *);
* schedule() where blk_schedule_flush_plug() is called.
*/
struct blk_plug {
- struct list_head mq_list; /* blk-mq requests */
+ struct request *mq_list; /* blk-mq requests */
/* if ios_left is > 1, we can batch tag/rq allocations */
struct request *cached_rq;
@@ -720,6 +720,7 @@ struct blk_plug {
unsigned short rq_count;
bool multiple_queues;
+ bool has_elevator;
bool nowait;
struct list_head cb_list; /* md requires an unplug callback */
@@ -760,8 +761,7 @@ static inline bool blk_needs_flush_plug(struct task_struct *tsk)
struct blk_plug *plug = tsk->plug;
return plug &&
- (!list_empty(&plug->mq_list) ||
- !list_empty(&plug->cb_list));
+ (plug->mq_list || !list_empty(&plug->cb_list));
}
int blkdev_issue_flush(struct block_device *bdev);
--
2.33.1
* [PATCH 5/6] block: move blk_mq_tag_to_rq() inline
From: Jens Axboe @ 2021-10-18 17:51 UTC (permalink / raw)
To: linux-block; +Cc: hch, Jens Axboe
This is in the fast path of driver issue or completion, and it's a single
array index operation. Move it inline to avoid a function call for it.
This does mean making struct blk_mq_tags block layer public, but there's
not really much in there.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
block/blk-mq-tag.h | 23 -----------------------
block/blk-mq.c | 11 -----------
include/linux/blk-mq.h | 36 +++++++++++++++++++++++++++++++++++-
3 files changed, 35 insertions(+), 35 deletions(-)
diff --git a/block/blk-mq-tag.h b/block/blk-mq-tag.h
index 78ae2fb8e2a4..df787b5a23bd 100644
--- a/block/blk-mq-tag.h
+++ b/block/blk-mq-tag.h
@@ -4,29 +4,6 @@
struct blk_mq_alloc_data;
-/*
- * Tag address space map.
- */
-struct blk_mq_tags {
- unsigned int nr_tags;
- unsigned int nr_reserved_tags;
-
- atomic_t active_queues;
-
- struct sbitmap_queue bitmap_tags;
- struct sbitmap_queue breserved_tags;
-
- struct request **rqs;
- struct request **static_rqs;
- struct list_head page_list;
-
- /*
- * used to clear request reference in rqs[] before freeing one
- * request pool
- */
- spinlock_t lock;
-};
-
extern struct blk_mq_tags *blk_mq_init_tags(unsigned int nr_tags,
unsigned int reserved_tags,
int node, int alloc_policy);
diff --git a/block/blk-mq.c b/block/blk-mq.c
index 3d5010d93059..9a73e2c3ce32 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -1112,17 +1112,6 @@ void blk_mq_delay_kick_requeue_list(struct request_queue *q,
}
EXPORT_SYMBOL(blk_mq_delay_kick_requeue_list);
-struct request *blk_mq_tag_to_rq(struct blk_mq_tags *tags, unsigned int tag)
-{
- if (tag < tags->nr_tags) {
- prefetch(tags->rqs[tag]);
- return tags->rqs[tag];
- }
-
- return NULL;
-}
-EXPORT_SYMBOL(blk_mq_tag_to_rq);
-
static bool blk_mq_rq_inflight(struct blk_mq_hw_ctx *hctx, struct request *rq,
void *priv, bool reserved)
{
diff --git a/include/linux/blk-mq.h b/include/linux/blk-mq.h
index 656fe34bdb6c..6cf35de151a9 100644
--- a/include/linux/blk-mq.h
+++ b/include/linux/blk-mq.h
@@ -7,6 +7,7 @@
#include <linux/srcu.h>
#include <linux/lockdep.h>
#include <linux/scatterlist.h>
+#include <linux/prefetch.h>
struct blk_mq_tags;
struct blk_flush_queue;
@@ -675,7 +676,40 @@ struct request *blk_mq_alloc_request(struct request_queue *q, unsigned int op,
struct request *blk_mq_alloc_request_hctx(struct request_queue *q,
unsigned int op, blk_mq_req_flags_t flags,
unsigned int hctx_idx);
-struct request *blk_mq_tag_to_rq(struct blk_mq_tags *tags, unsigned int tag);
+
+/*
+ * Tag address space map.
+ */
+struct blk_mq_tags {
+ unsigned int nr_tags;
+ unsigned int nr_reserved_tags;
+
+ atomic_t active_queues;
+
+ struct sbitmap_queue bitmap_tags;
+ struct sbitmap_queue breserved_tags;
+
+ struct request **rqs;
+ struct request **static_rqs;
+ struct list_head page_list;
+
+ /*
+ * used to clear request reference in rqs[] before freeing one
+ * request pool
+ */
+ spinlock_t lock;
+};
+
+static inline struct request *blk_mq_tag_to_rq(struct blk_mq_tags *tags,
+ unsigned int tag)
+{
+ if (tag < tags->nr_tags) {
+ prefetch(tags->rqs[tag]);
+ return tags->rqs[tag];
+ }
+
+ return NULL;
+}
enum {
BLK_MQ_UNIQUE_TAG_BITS = 16,
--
2.33.1
* [PATCH 6/6] block: align blkdev_dio inlined bio to a cacheline
From: Jens Axboe @ 2021-10-18 17:51 UTC (permalink / raw)
To: linux-block; +Cc: hch, Jens Axboe
We get all sorts of unreliable and funky results since struct bio is
designed to be cacheline aligned, which it is not when inlined like
this.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
block/fops.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/block/fops.c b/block/fops.c
index 2ae8a7bd2b84..e30082ea219a 100644
--- a/block/fops.c
+++ b/block/fops.c
@@ -137,7 +137,7 @@ struct blkdev_dio {
size_t size;
atomic_t ref;
unsigned int flags;
- struct bio bio;
+ struct bio bio ____cacheline_aligned_in_smp;
};
static struct bio_set blkdev_dio_pool;
--
2.33.1
* Re: [PATCH 2/6] block: return whether or not to unplug through boolean
From: Christoph Hellwig @ 2021-10-18 18:00 UTC (permalink / raw)
To: Jens Axboe; +Cc: linux-block, hch
Looks good,
Reviewed-by: Christoph Hellwig <hch@lst.de>
* Re: [PATCH 3/6] block: get rid of plug list sorting
From: Christoph Hellwig @ 2021-10-18 18:00 UTC (permalink / raw)
To: Jens Axboe; +Cc: linux-block, hch
On Mon, Oct 18, 2021 at 11:51:06AM -0600, Jens Axboe wrote:
> Even if we have multiple queues in the plug list, the chances that they
> are heavily interspersed are minimal. Don't bother spending CPU cycles
> sorting the list.
>
> Signed-off-by: Jens Axboe <axboe@kernel.dk>
Looks good,
Reviewed-by: Christoph Hellwig <hch@lst.de>
* Re: [PATCH 6/6] block: align blkdev_dio inlined bio to a cacheline
From: Christoph Hellwig @ 2021-10-18 18:01 UTC (permalink / raw)
To: Jens Axboe; +Cc: linux-block, hch
Looks good,
Reviewed-by: Christoph Hellwig <hch@lst.de>
* Re: [PATCH 5/6] block: move blk_mq_tag_to_rq() inline
From: Christoph Hellwig @ 2021-10-18 18:01 UTC (permalink / raw)
To: Jens Axboe; +Cc: linux-block, hch
On Mon, Oct 18, 2021 at 11:51:08AM -0600, Jens Axboe wrote:
> This is in the fast path of driver issue or completion, and it's a single
> array index operation. Move it inline to avoid a function call for it.
>
> This does mean making struct blk_mq_tags block layer public, but there's
> not really much in there.
I don't really like exposing more data structures if we can avoid it,
but if this makes a big enough difference:
Reviewed-by: Christoph Hellwig <hch@lst.de>
* Re: [PATCH 4/6] block: change plugging to use a singly linked list
From: Christoph Hellwig @ 2021-10-19 5:57 UTC (permalink / raw)
To: Jens Axboe; +Cc: linux-block, hch
On Mon, Oct 18, 2021 at 11:51:07AM -0600, Jens Axboe wrote:
> Use a singly linked list for the blk_plug. This saves 8 bytes in the
> blk_plug struct, and makes for faster list manipulations than doubly
> linked lists. As we don't use the doubly linked lists for anything,
> singly linked is just fine.
>
> This yields a bump in default (merging enabled) performance from 7.0
> to 7.1M IOPS, and ~7.5M IOPS with merging disabled.
I still find this very hard to review. It doesn't just switch the
list implementation but also does change how this code works. Especially
in blk_mq_flush_plug_list, but also in blk_mq_submit_bio. I think
this needs to be further split up.
* Re: [PATCH 4/6] block: change plugging to use a singly linked list
From: Jens Axboe @ 2021-10-19 11:58 UTC (permalink / raw)
To: Christoph Hellwig; +Cc: linux-block
On 10/18/21 11:57 PM, Christoph Hellwig wrote:
> On Mon, Oct 18, 2021 at 11:51:07AM -0600, Jens Axboe wrote:
>> Use a singly linked list for the blk_plug. This saves 8 bytes in the
>> blk_plug struct, and makes for faster list manipulations than doubly
>> linked lists. As we don't use the doubly linked lists for anything,
>> singly linked is just fine.
>>
>> This yields a bump in default (merging enabled) performance from 7.0
>> to 7.1M IOPS, and ~7.5M IOPS with merging disabled.
>
> I still find this very hard to review. It doesn't just switch the
> list implementation but also does change how this code works. Especially
> in blk_mq_flush_plug_list, but also in blk_mq_submit_bio. I think
> this needs to be further split up.
OK, I'll split this one up a bit more.
--
Jens Axboe