* [PATCH for-next v1 0/2] enable pcpu bio-cache for IRQ uring-passthru I/O
       [not found] <CGME20230117120741epcas5p2c7d2a20edd0f09bdff585fbe95bdadd9@epcas5p2.samsung.com>
@ 2023-01-17 12:06 ` Anuj Gupta
       [not found]   ` <CGME20230117120752epcas5p2f01ed01d190357f35dda4505fadea02b@epcas5p2.samsung.com>
                     ` (3 more replies)
  0 siblings, 4 replies; 6+ messages in thread
From: Anuj Gupta @ 2023-01-17 12:06 UTC (permalink / raw)
  To: axboe, hch, kbusch, asml.silence
  Cc: linux-nvme, linux-block, gost.dev, Anuj Gupta

This series extends bio pcpu caching to normal / IRQ-driven
uring-passthru I/Os. Earlier, only polled uring-passthru I/Os could
leverage the bio-cache. After the series from Pavel[1], the bio-cache
can be leveraged by normal / IRQ-driven I/Os as well. t/io_uring with an
Optane SSD setup shows a +7.21% IOPS improvement for batches of 32
requests.

[1] https://lore.kernel.org/io-uring/cover.1666347703.git.asml.silence@gmail.com/

IRQ, 128/32/32, cache off

# taskset -c 0 t/io_uring -b512 -d128 -c32 -s32 -p0 -F1 -B1 -P0 -O0 -u1 -n1 /dev/ng0n1
submitter=0, tid=13207, file=/dev/ng0n1, node=-1
polled=0, fixedbufs=1/0, register_files=1, buffered=1, QD=128
Engine=io_uring, sq_ring=128, cq_ring=128
IOPS=3.05M, BW=1488MiB/s, IOS/call=32/31
IOPS=3.04M, BW=1483MiB/s, IOS/call=32/31
IOPS=3.03M, BW=1477MiB/s, IOS/call=32/32
IOPS=3.03M, BW=1481MiB/s, IOS/call=32/32
^CExiting on signal
Maximum IOPS=3.05M

IRQ, 128/32/32, cache on

# taskset -c 0 t/io_uring -b512 -d128 -c32 -s32 -p0 -F1 -B1 -P0 -O0 -u1 -n1 /dev/ng0n1
submitter=0, tid=6755, file=/dev/ng0n1, node=-1
polled=0, fixedbufs=1/0, register_files=1, buffered=1, QD=128
Engine=io_uring, sq_ring=128, cq_ring=128
IOPS=3.27M, BW=1596MiB/s, IOS/call=32/31
IOPS=3.27M, BW=1595MiB/s, IOS/call=32/32
IOPS=3.26M, BW=1592MiB/s, IOS/call=32/31
IOPS=3.26M, BW=1593MiB/s, IOS/call=32/32
^CExiting on signal
Maximum IOPS=3.27M

Anuj Gupta (2):
  nvme: set REQ_ALLOC_CACHE for uring-passthru request
  block: extend bio-cache for non-polled requests

 block/blk-map.c           | 6 ++----
 drivers/nvme/host/ioctl.c | 4 ++--
 2 files changed, 4 insertions(+), 6 deletions(-)

-- 
2.25.1


^ permalink raw reply	[flat|nested] 6+ messages in thread

* [PATCH for-next v1 1/2] nvme: set REQ_ALLOC_CACHE for uring-passthru request
       [not found]   ` <CGME20230117120752epcas5p2f01ed01d190357f35dda4505fadea02b@epcas5p2.samsung.com>
@ 2023-01-17 12:06     ` Anuj Gupta
  0 siblings, 0 replies; 6+ messages in thread
From: Anuj Gupta @ 2023-01-17 12:06 UTC (permalink / raw)
  To: axboe, hch, kbusch, asml.silence
  Cc: linux-nvme, linux-block, gost.dev, Anuj Gupta, Kanchan Joshi

This patch sets the REQ_ALLOC_CACHE flag for uring-passthru requests.
This is a prep patch so that normal / IRQ-driven uring-passthru
I/Os can also leverage the bio-cache.

Signed-off-by: Anuj Gupta <anuj20.g@samsung.com>
Signed-off-by: Kanchan Joshi <joshi.k@samsung.com>
---
 drivers/nvme/host/ioctl.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/nvme/host/ioctl.c b/drivers/nvme/host/ioctl.c
index 06f52db34be9..ffaabf16dd4c 100644
--- a/drivers/nvme/host/ioctl.c
+++ b/drivers/nvme/host/ioctl.c
@@ -554,7 +554,7 @@ static int nvme_uring_cmd_io(struct nvme_ctrl *ctrl, struct nvme_ns *ns,
 	struct nvme_uring_data d;
 	struct nvme_command c;
 	struct request *req;
-	blk_opf_t rq_flags = 0;
+	blk_opf_t rq_flags = REQ_ALLOC_CACHE;
 	blk_mq_req_flags_t blk_flags = 0;
 	void *meta = NULL;
 	int ret;
@@ -590,7 +590,7 @@ static int nvme_uring_cmd_io(struct nvme_ctrl *ctrl, struct nvme_ns *ns,
 	d.timeout_ms = READ_ONCE(cmd->timeout_ms);
 
 	if (issue_flags & IO_URING_F_NONBLOCK) {
-		rq_flags = REQ_NOWAIT;
+		rq_flags |= REQ_NOWAIT;
 		blk_flags = BLK_MQ_REQ_NOWAIT;
 	}
 	if (issue_flags & IO_URING_F_IOPOLL)
-- 
2.25.1



* [PATCH for-next v1 2/2] block: extend bio-cache for non-polled requests
       [not found]   ` <CGME20230117120802epcas5p4a9d1fca9d49140649a4152856b54f81f@epcas5p4.samsung.com>
@ 2023-01-17 12:06     ` Anuj Gupta
  0 siblings, 0 replies; 6+ messages in thread
From: Anuj Gupta @ 2023-01-17 12:06 UTC (permalink / raw)
  To: axboe, hch, kbusch, asml.silence
  Cc: linux-nvme, linux-block, gost.dev, Anuj Gupta, Kanchan Joshi

This patch modifies the existing check so that the bio-cache is not
limited to polled (iopoll) requests.

Signed-off-by: Anuj Gupta <anuj20.g@samsung.com>
Signed-off-by: Kanchan Joshi <joshi.k@samsung.com>
---
 block/blk-map.c | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/block/blk-map.c b/block/blk-map.c
index f2135e6ee8f6..9ee4be4ba2f1 100644
--- a/block/blk-map.c
+++ b/block/blk-map.c
@@ -247,10 +247,8 @@ static struct bio *blk_rq_map_bio_alloc(struct request *rq,
 {
 	struct bio *bio;
 
-	if (rq->cmd_flags & REQ_POLLED) {
-		blk_opf_t opf = rq->cmd_flags | REQ_ALLOC_CACHE;
-
-		bio = bio_alloc_bioset(NULL, nr_vecs, opf, gfp_mask,
+	if (rq->cmd_flags & REQ_ALLOC_CACHE) {
+		bio = bio_alloc_bioset(NULL, nr_vecs, rq->cmd_flags, gfp_mask,
 					&fs_bio_set);
 		if (!bio)
 			return NULL;
-- 
2.25.1



* Re: [PATCH for-next v1 0/2] enable pcpu bio-cache for IRQ uring-passthru I/O
  2023-01-17 12:06 ` [PATCH for-next v1 0/2] enable pcpu bio-cache for IRQ uring-passthru I/O Anuj Gupta
       [not found]   ` <CGME20230117120752epcas5p2f01ed01d190357f35dda4505fadea02b@epcas5p2.samsung.com>
       [not found]   ` <CGME20230117120802epcas5p4a9d1fca9d49140649a4152856b54f81f@epcas5p4.samsung.com>
@ 2023-01-17 17:11   ` Jens Axboe
  2023-01-18  9:14     ` Kanchan Joshi
  2023-01-17 17:23   ` Jens Axboe
  3 siblings, 1 reply; 6+ messages in thread
From: Jens Axboe @ 2023-01-17 17:11 UTC (permalink / raw)
  To: Anuj Gupta, hch, kbusch, asml.silence; +Cc: linux-nvme, linux-block, gost.dev

On 1/17/23 5:06 AM, Anuj Gupta wrote:
> This series extends bio pcpu caching for normal / IRQ-driven
> uring-passthru I/Os. Earlier, only polled uring-passthru I/Os could
> leverage bio-cache. After the series from Pavel[1], bio-cache can be
> leveraged by normal / IRQ driven I/Os as well. t/io_uring with an Optane
> SSD setup shows +7.21% for batches of 32 requests.
> 
> [1] https://lore.kernel.org/io-uring/cover.1666347703.git.asml.silence@gmail.com/
> 
> IRQ, 128/32/32, cache off

Tests here -

before:

polled=0, fixedbufs=1/0, register_files=1, buffered=1, QD=128
Engine=io_uring, sq_ring=128, cq_ring=128
IOPS=62.88M, BW=30.70GiB/s, IOS/call=32/31
IOPS=62.95M, BW=30.74GiB/s, IOS/call=32/31
IOPS=62.52M, BW=30.53GiB/s, IOS/call=32/32
IOPS=62.61M, BW=30.57GiB/s, IOS/call=31/32
IOPS=62.52M, BW=30.53GiB/s, IOS/call=32/31
IOPS=62.40M, BW=30.47GiB/s, IOS/call=32/32

after:

polled=0, fixedbufs=1/0, register_files=1, buffered=1, QD=128
Engine=io_uring, sq_ring=128, cq_ring=128
IOPS=76.58M, BW=37.39GiB/s, IOS/call=31/31
IOPS=79.42M, BW=38.78GiB/s, IOS/call=32/32
IOPS=78.06M, BW=38.12GiB/s, IOS/call=31/31
IOPS=77.64M, BW=37.91GiB/s, IOS/call=32/31
IOPS=77.17M, BW=37.68GiB/s, IOS/call=32/32
IOPS=76.73M, BW=37.47GiB/s, IOS/call=31/31
IOPS=76.94M, BW=37.57GiB/s, IOS/call=32/31

Note that this includes Pavel's fix as well:

https://lore.kernel.org/linux-block/80d4511011d7d4751b4cf6375c4e38f237d935e3.1673955390.git.asml.silence@gmail.com/

But this mirrors the improvement seen on the non-passthrough side as
well. I'd say that's a pass :-)

-- 
Jens Axboe



* Re: [PATCH for-next v1 0/2] enable pcpu bio-cache for IRQ uring-passthru I/O
  2023-01-17 12:06 ` [PATCH for-next v1 0/2] enable pcpu bio-cache for IRQ uring-passthru I/O Anuj Gupta
                     ` (2 preceding siblings ...)
  2023-01-17 17:11   ` [PATCH for-next v1 0/2] enable pcpu bio-cache for IRQ uring-passthru I/O Jens Axboe
@ 2023-01-17 17:23   ` Jens Axboe
  3 siblings, 0 replies; 6+ messages in thread
From: Jens Axboe @ 2023-01-17 17:23 UTC (permalink / raw)
  To: hch, kbusch, asml.silence, Anuj Gupta; +Cc: linux-nvme, linux-block, gost.dev


On Tue, 17 Jan 2023 17:36:36 +0530, Anuj Gupta wrote:
> This series extends bio pcpu caching for normal / IRQ-driven
> uring-passthru I/Os. Earlier, only polled uring-passthru I/Os could
> leverage bio-cache. After the series from Pavel[1], bio-cache can be
> leveraged by normal / IRQ driven I/Os as well. t/io_uring with an Optane
> SSD setup shows +7.21% for batches of 32 requests.
> 
> [1] https://lore.kernel.org/io-uring/cover.1666347703.git.asml.silence@gmail.com/
> 
> [...]

Applied, thanks!

[1/2] nvme: set REQ_ALLOC_CACHE for uring-passthru request
      commit: 988136a307157de9e6e9d27ee9f7ea24ee374f32
[2/2] block: extend bio-cache for non-polled requests
      commit: 934f178446b11f621ab52e83211ebf399896db47

Best regards,
-- 
Jens Axboe





* Re: [PATCH for-next v1 0/2] enable pcpu bio-cache for IRQ uring-passthru I/O
  2023-01-17 17:11   ` [PATCH for-next v1 0/2] enable pcpu bio-cache for IRQ uring-passthru I/O Jens Axboe
@ 2023-01-18  9:14     ` Kanchan Joshi
  0 siblings, 0 replies; 6+ messages in thread
From: Kanchan Joshi @ 2023-01-18  9:14 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Anuj Gupta, hch, kbusch, asml.silence, linux-nvme, linux-block, gost.dev


On Tue, Jan 17, 2023 at 10:11:08AM -0700, Jens Axboe wrote:
>On 1/17/23 5:06 AM, Anuj Gupta wrote:
>> This series extends bio pcpu caching for normal / IRQ-driven
>> uring-passthru I/Os. Earlier, only polled uring-passthru I/Os could
>> leverage bio-cache. After the series from Pavel[1], bio-cache can be
>> leveraged by normal / IRQ driven I/Os as well. t/io_uring with an Optane
>> SSD setup shows +7.21% for batches of 32 requests.
>>
>> [1] https://lore.kernel.org/io-uring/cover.1666347703.git.asml.silence@gmail.com/
>>
>> IRQ, 128/32/32, cache off
>
>Tests here -
>
>before:
>
>polled=0, fixedbufs=1/0, register_files=1, buffered=1, QD=128
>Engine=io_uring, sq_ring=128, cq_ring=128
>IOPS=62.88M, BW=30.70GiB/s, IOS/call=32/31
>IOPS=62.95M, BW=30.74GiB/s, IOS/call=32/31
>IOPS=62.52M, BW=30.53GiB/s, IOS/call=32/32
>IOPS=62.61M, BW=30.57GiB/s, IOS/call=31/32
>IOPS=62.52M, BW=30.53GiB/s, IOS/call=32/31
>IOPS=62.40M, BW=30.47GiB/s, IOS/call=32/32
>
>after:
>
>polled=0, fixedbufs=1/0, register_files=1, buffered=1, QD=128
>Engine=io_uring, sq_ring=128, cq_ring=128
>IOPS=76.58M, BW=37.39GiB/s, IOS/call=31/31
>IOPS=79.42M, BW=38.78GiB/s, IOS/call=32/32
>IOPS=78.06M, BW=38.12GiB/s, IOS/call=31/31
>IOPS=77.64M, BW=37.91GiB/s, IOS/call=32/31
>IOPS=77.17M, BW=37.68GiB/s, IOS/call=32/32
>IOPS=76.73M, BW=37.47GiB/s, IOS/call=31/31
>IOPS=76.94M, BW=37.57GiB/s, IOS/call=32/31
>
>Note that this includes Pavel's fix as well:
>
>https://lore.kernel.org/linux-block/80d4511011d7d4751b4cf6375c4e38f237d935e3.1673955390.git.asml.silence@gmail.com/

So I was wondering whether we need this fix for the passthru path too.
We do not.
For the block path, blk_mq_get_cached_request() encountered a
mismatch since the type was different (read vs default).
For the passthru path, blk_mq_alloc_cached_request() sees no mismatch,
since a passthrough opf is not treated as a read (default vs default).



