All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH for-next] block: fix hctx checks for batch allocation
@ 2023-01-17 11:42 Pavel Begunkov
  2023-01-17 16:56 ` Jens Axboe
                   ` (2 more replies)
  0 siblings, 3 replies; 4+ messages in thread
From: Pavel Begunkov @ 2023-01-17 11:42 UTC (permalink / raw)
  To: Jens Axboe, linux-block; +Cc: Pavel Begunkov

When there are no read queues read requests will be assigned a
default queue on allocation. However, blk_mq_get_cached_request() is not
prepared for that and will fail all attempts to grab read requests from
the cache. Worst case it doubles the number of requests allocated,
roughly half of which will be returned by blk_mq_free_plug_rqs().

It only affects batched allocations and so is io_uring specific.
For reference, QD8 t/io_uring benchmark improves by 20-35%.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
---

It might be a good idea to always use HCTX_TYPE_DEFAULT, so the cache
always can accomodate combined write with read reqs.

 block/blk-mq.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index 2c49b4151da1..9d463f7563bc 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -2890,6 +2890,7 @@ static inline struct request *blk_mq_get_cached_request(struct request_queue *q,
 		struct blk_plug *plug, struct bio **bio, unsigned int nsegs)
 {
 	struct request *rq;
+	enum hctx_type type, hctx_type;
 
 	if (!plug)
 		return NULL;
@@ -2902,7 +2903,10 @@ static inline struct request *blk_mq_get_cached_request(struct request_queue *q,
 		return NULL;
 	}
 
-	if (blk_mq_get_hctx_type((*bio)->bi_opf) != rq->mq_hctx->type)
+	type = blk_mq_get_hctx_type((*bio)->bi_opf);
+	hctx_type = rq->mq_hctx->type;
+	if (type != hctx_type &&
+	    !(type == HCTX_TYPE_READ && hctx_type == HCTX_TYPE_DEFAULT))
 		return NULL;
 	if (op_is_flush(rq->cmd_flags) != op_is_flush((*bio)->bi_opf))
 		return NULL;
-- 
2.38.1


^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [PATCH for-next] block: fix hctx checks for batch allocation
  2023-01-17 11:42 [PATCH for-next] block: fix hctx checks for batch allocation Pavel Begunkov
@ 2023-01-17 16:56 ` Jens Axboe
  2023-01-17 17:22 ` Jens Axboe
  2023-01-18  5:36 ` Christoph Hellwig
  2 siblings, 0 replies; 4+ messages in thread
From: Jens Axboe @ 2023-01-17 16:56 UTC (permalink / raw)
  To: Pavel Begunkov, linux-block

On 1/17/23 4:42?AM, Pavel Begunkov wrote:
> When there are no read queues read requests will be assigned a
> default queue on allocation. However, blk_mq_get_cached_request() is not
> prepared for that and will fail all attempts to grab read requests from
> the cache. Worst case it doubles the number of requests allocated,
> roughly half of which will be returned by blk_mq_free_plug_rqs().
> 
> It only affects batched allocations and so is io_uring specific.
> For reference, QD8 t/io_uring benchmark improves by 20-35%.

This does make a big difference for me. Usual peak test (24 drives), and
I get 63-65M IOPS with IRQ based IO. With the patch:

polled=0, fixedbufs=1/0, register_files=1, buffered=0, QD=128
Engine=io_uring, sq_ring=128, cq_ring=128
IOPS=64.79M, BW=31.64GiB/s, IOS/call=32/31
IOPS=73.45M, BW=35.86GiB/s, IOS/call=32/32
IOPS=73.70M, BW=35.99GiB/s, IOS/call=31/31
IOPS=74.57M, BW=36.41GiB/s, IOS/call=31/31
IOPS=75.18M, BW=36.71GiB/s, IOS/call=31/31
IOPS=74.33M, BW=36.29GiB/s, IOS/call=32/32
IOPS=74.53M, BW=36.39GiB/s, IOS/call=32/32
IOPS=74.61M, BW=36.43GiB/s, IOS/call=32/32

which is 15-19% better.

> It might be a good idea to always use HCTX_TYPE_DEFAULT, so the cache
> always can accomodate combined write with read reqs.

I think it makes sense to do so, particularly now that we have support
for not just polled IO.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH for-next] block: fix hctx checks for batch allocation
  2023-01-17 11:42 [PATCH for-next] block: fix hctx checks for batch allocation Pavel Begunkov
  2023-01-17 16:56 ` Jens Axboe
@ 2023-01-17 17:22 ` Jens Axboe
  2023-01-18  5:36 ` Christoph Hellwig
  2 siblings, 0 replies; 4+ messages in thread
From: Jens Axboe @ 2023-01-17 17:22 UTC (permalink / raw)
  To: linux-block, Pavel Begunkov


On Tue, 17 Jan 2023 11:42:15 +0000, Pavel Begunkov wrote:
> When there are no read queues read requests will be assigned a
> default queue on allocation. However, blk_mq_get_cached_request() is not
> prepared for that and will fail all attempts to grab read requests from
> the cache. Worst case it doubles the number of requests allocated,
> roughly half of which will be returned by blk_mq_free_plug_rqs().
> 
> It only affects batched allocations and so is io_uring specific.
> For reference, QD8 t/io_uring benchmark improves by 20-35%.
> 
> [...]

Applied, thanks!

[1/1] block: fix hctx checks for batch allocation
      (no commit info)

Best regards,
-- 
Jens Axboe




^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH for-next] block: fix hctx checks for batch allocation
  2023-01-17 11:42 [PATCH for-next] block: fix hctx checks for batch allocation Pavel Begunkov
  2023-01-17 16:56 ` Jens Axboe
  2023-01-17 17:22 ` Jens Axboe
@ 2023-01-18  5:36 ` Christoph Hellwig
  2 siblings, 0 replies; 4+ messages in thread
From: Christoph Hellwig @ 2023-01-18  5:36 UTC (permalink / raw)
  To: Pavel Begunkov; +Cc: Jens Axboe, linux-block

On Tue, Jan 17, 2023 at 11:42:15AM +0000, Pavel Begunkov wrote:
> It might be a good idea to always use HCTX_TYPE_DEFAULT, so the cache
> always can accomodate combined write with read reqs.

I suspect we'll just need a separate cache for each HCTX_TYPE.

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2023-01-18  5:36 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-01-17 11:42 [PATCH for-next] block: fix hctx checks for batch allocation Pavel Begunkov
2023-01-17 16:56 ` Jens Axboe
2023-01-17 17:22 ` Jens Axboe
2023-01-18  5:36 ` Christoph Hellwig

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.