* [PATCH for-next v1 0/2] enable pcpu bio-cache for IRQ uring-passthru I/O
From: Anuj Gupta @ 2023-01-17 12:06 UTC (permalink / raw)
To: axboe, hch, kbusch, asml.silence
Cc: linux-nvme, linux-block, gost.dev, Anuj Gupta
This series extends bio pcpu caching for normal / IRQ-driven
uring-passthru I/Os. Earlier, only polled uring-passthru I/Os could
leverage bio-cache. After the series from Pavel[1], bio-cache can be
leveraged by normal / IRQ driven I/Os as well. t/io_uring with an Optane
SSD setup shows +7.21% for batches of 32 requests.
[1] https://lore.kernel.org/io-uring/cover.1666347703.git.asml.silence@gmail.com/
IRQ, 128/32/32, cache off
# taskset -c 0 t/io_uring -b512 -d128 -c32 -s32 -p0 -F1 -B1 -P0 -O0 -u1 -n1 /dev/ng0n1
submitter=0, tid=13207, file=/dev/ng0n1, node=-1
polled=0, fixedbufs=1/0, register_files=1, buffered=1, QD=128
Engine=io_uring, sq_ring=128, cq_ring=128
IOPS=3.05M, BW=1488MiB/s, IOS/call=32/31
IOPS=3.04M, BW=1483MiB/s, IOS/call=32/31
IOPS=3.03M, BW=1477MiB/s, IOS/call=32/32
IOPS=3.03M, BW=1481MiB/s, IOS/call=32/32
^CExiting on signal
Maximum IOPS=3.05M
IRQ, 128/32/32, cache on
# taskset -c 0 t/io_uring -b512 -d128 -c32 -s32 -p0 -F1 -B1 -P0 -O0 -u1 -n1 /dev/ng0n1
submitter=0, tid=6755, file=/dev/ng0n1, node=-1
polled=0, fixedbufs=1/0, register_files=1, buffered=1, QD=128
Engine=io_uring, sq_ring=128, cq_ring=128
IOPS=3.27M, BW=1596MiB/s, IOS/call=32/31
IOPS=3.27M, BW=1595MiB/s, IOS/call=32/32
IOPS=3.26M, BW=1592MiB/s, IOS/call=32/31
IOPS=3.26M, BW=1593MiB/s, IOS/call=32/32
^CExiting on signal
Maximum IOPS=3.27M
Anuj Gupta (2):
nvme: set REQ_ALLOC_CACHE for uring-passthru request
block: extend bio-cache for non-polled requests
block/blk-map.c | 6 ++----
drivers/nvme/host/ioctl.c | 4 ++--
2 files changed, 4 insertions(+), 6 deletions(-)
--
2.25.1
^ permalink raw reply [flat|nested] 6+ messages in thread
* [PATCH for-next v1 1/2] nvme: set REQ_ALLOC_CACHE for uring-passthru request
From: Anuj Gupta @ 2023-01-17 12:06 UTC (permalink / raw)
To: axboe, hch, kbusch, asml.silence
Cc: linux-nvme, linux-block, gost.dev, Anuj Gupta, Kanchan Joshi
This patch sets the REQ_ALLOC_CACHE flag for uring-passthru requests.
This is a prep patch so that normal / IRQ-driven uring-passthru
I/Os can also leverage the bio-cache.
Signed-off-by: Anuj Gupta <anuj20.g@samsung.com>
Signed-off-by: Kanchan Joshi <joshi.k@samsung.com>
---
drivers/nvme/host/ioctl.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/nvme/host/ioctl.c b/drivers/nvme/host/ioctl.c
index 06f52db34be9..ffaabf16dd4c 100644
--- a/drivers/nvme/host/ioctl.c
+++ b/drivers/nvme/host/ioctl.c
@@ -554,7 +554,7 @@ static int nvme_uring_cmd_io(struct nvme_ctrl *ctrl, struct nvme_ns *ns,
struct nvme_uring_data d;
struct nvme_command c;
struct request *req;
- blk_opf_t rq_flags = 0;
+ blk_opf_t rq_flags = REQ_ALLOC_CACHE;
blk_mq_req_flags_t blk_flags = 0;
void *meta = NULL;
int ret;
@@ -590,7 +590,7 @@ static int nvme_uring_cmd_io(struct nvme_ctrl *ctrl, struct nvme_ns *ns,
d.timeout_ms = READ_ONCE(cmd->timeout_ms);
if (issue_flags & IO_URING_F_NONBLOCK) {
- rq_flags = REQ_NOWAIT;
+ rq_flags |= REQ_NOWAIT;
blk_flags = BLK_MQ_REQ_NOWAIT;
}
if (issue_flags & IO_URING_F_IOPOLL)
--
2.25.1
* [PATCH for-next v1 2/2] block: extend bio-cache for non-polled requests
From: Anuj Gupta @ 2023-01-17 12:06 UTC (permalink / raw)
To: axboe, hch, kbusch, asml.silence
Cc: linux-nvme, linux-block, gost.dev, Anuj Gupta, Kanchan Joshi
This patch modifies the existing check so that the bio-cache is not
limited to polled I/O.
Signed-off-by: Anuj Gupta <anuj20.g@samsung.com>
Signed-off-by: Kanchan Joshi <joshi.k@samsung.com>
---
block/blk-map.c | 6 ++----
1 file changed, 2 insertions(+), 4 deletions(-)
diff --git a/block/blk-map.c b/block/blk-map.c
index f2135e6ee8f6..9ee4be4ba2f1 100644
--- a/block/blk-map.c
+++ b/block/blk-map.c
@@ -247,10 +247,8 @@ static struct bio *blk_rq_map_bio_alloc(struct request *rq,
{
struct bio *bio;
- if (rq->cmd_flags & REQ_POLLED) {
- blk_opf_t opf = rq->cmd_flags | REQ_ALLOC_CACHE;
-
- bio = bio_alloc_bioset(NULL, nr_vecs, opf, gfp_mask,
+ if (rq->cmd_flags & REQ_ALLOC_CACHE) {
+ bio = bio_alloc_bioset(NULL, nr_vecs, rq->cmd_flags, gfp_mask,
&fs_bio_set);
if (!bio)
return NULL;
--
2.25.1
* Re: [PATCH for-next v1 0/2] enable pcpu bio-cache for IRQ uring-passthru I/O
From: Jens Axboe @ 2023-01-17 17:11 UTC (permalink / raw)
To: Anuj Gupta, hch, kbusch, asml.silence; +Cc: linux-nvme, linux-block, gost.dev
On 1/17/23 5:06 AM, Anuj Gupta wrote:
> This series extends bio pcpu caching for normal / IRQ-driven
> uring-passthru I/Os. Earlier, only polled uring-passthru I/Os could
> leverage bio-cache. After the series from Pavel[1], bio-cache can be
> leveraged by normal / IRQ driven I/Os as well. t/io_uring with an Optane
> SSD setup shows +7.21% for batches of 32 requests.
>
> [1] https://lore.kernel.org/io-uring/cover.1666347703.git.asml.silence@gmail.com/
>
> IRQ, 128/32/32, cache off
Tests here -
before:
polled=0, fixedbufs=1/0, register_files=1, buffered=1, QD=128
Engine=io_uring, sq_ring=128, cq_ring=128
IOPS=62.88M, BW=30.70GiB/s, IOS/call=32/31
IOPS=62.95M, BW=30.74GiB/s, IOS/call=32/31
IOPS=62.52M, BW=30.53GiB/s, IOS/call=32/32
IOPS=62.61M, BW=30.57GiB/s, IOS/call=31/32
IOPS=62.52M, BW=30.53GiB/s, IOS/call=32/31
IOPS=62.40M, BW=30.47GiB/s, IOS/call=32/32
after:
polled=0, fixedbufs=1/0, register_files=1, buffered=1, QD=128
Engine=io_uring, sq_ring=128, cq_ring=128
IOPS=76.58M, BW=37.39GiB/s, IOS/call=31/31
IOPS=79.42M, BW=38.78GiB/s, IOS/call=32/32
IOPS=78.06M, BW=38.12GiB/s, IOS/call=31/31
IOPS=77.64M, BW=37.91GiB/s, IOS/call=32/31
IOPS=77.17M, BW=37.68GiB/s, IOS/call=32/32
IOPS=76.73M, BW=37.47GiB/s, IOS/call=31/31
IOPS=76.94M, BW=37.57GiB/s, IOS/call=32/31
Note that this includes Pavel's fix as well:
https://lore.kernel.org/linux-block/80d4511011d7d4751b4cf6375c4e38f237d935e3.1673955390.git.asml.silence@gmail.com/
But this mirrors the improvement seen on the non-passthrough side as
well. I'd say that's a pass :-)
--
Jens Axboe
* Re: [PATCH for-next v1 0/2] enable pcpu bio-cache for IRQ uring-passthru I/O
From: Jens Axboe @ 2023-01-17 17:23 UTC (permalink / raw)
To: hch, kbusch, asml.silence, Anuj Gupta; +Cc: linux-nvme, linux-block, gost.dev
On Tue, 17 Jan 2023 17:36:36 +0530, Anuj Gupta wrote:
> This series extends bio pcpu caching for normal / IRQ-driven
> uring-passthru I/Os. Earlier, only polled uring-passthru I/Os could
> leverage bio-cache. After the series from Pavel[1], bio-cache can be
> leveraged by normal / IRQ driven I/Os as well. t/io_uring with an Optane
> SSD setup shows +7.21% for batches of 32 requests.
>
> [1] https://lore.kernel.org/io-uring/cover.1666347703.git.asml.silence@gmail.com/
>
> [...]
Applied, thanks!
[1/2] nvme: set REQ_ALLOC_CACHE for uring-passthru request
commit: 988136a307157de9e6e9d27ee9f7ea24ee374f32
[2/2] block: extend bio-cache for non-polled requests
commit: 934f178446b11f621ab52e83211ebf399896db47
Best regards,
--
Jens Axboe
* Re: [PATCH for-next v1 0/2] enable pcpu bio-cache for IRQ uring-passthru I/O
From: Kanchan Joshi @ 2023-01-18 9:14 UTC (permalink / raw)
To: Jens Axboe
Cc: Anuj Gupta, hch, kbusch, asml.silence, linux-nvme, linux-block, gost.dev
On Tue, Jan 17, 2023 at 10:11:08AM -0700, Jens Axboe wrote:
>On 1/17/23 5:06 AM, Anuj Gupta wrote:
>> This series extends bio pcpu caching for normal / IRQ-driven
>> uring-passthru I/Os. Earlier, only polled uring-passthru I/Os could
>> leverage bio-cache. After the series from Pavel[1], bio-cache can be
>> leveraged by normal / IRQ driven I/Os as well. t/io_uring with an Optane
>> SSD setup shows +7.21% for batches of 32 requests.
>>
>> [1] https://lore.kernel.org/io-uring/cover.1666347703.git.asml.silence@gmail.com/
>>
>> IRQ, 128/32/32, cache off
>
>Tests here -
>
>before:
>
>polled=0, fixedbufs=1/0, register_files=1, buffered=1, QD=128
>Engine=io_uring, sq_ring=128, cq_ring=128
>IOPS=62.88M, BW=30.70GiB/s, IOS/call=32/31
>IOPS=62.95M, BW=30.74GiB/s, IOS/call=32/31
>IOPS=62.52M, BW=30.53GiB/s, IOS/call=32/32
>IOPS=62.61M, BW=30.57GiB/s, IOS/call=31/32
>IOPS=62.52M, BW=30.53GiB/s, IOS/call=32/31
>IOPS=62.40M, BW=30.47GiB/s, IOS/call=32/32
>
>after:
>
>polled=0, fixedbufs=1/0, register_files=1, buffered=1, QD=128
>Engine=io_uring, sq_ring=128, cq_ring=128
>IOPS=76.58M, BW=37.39GiB/s, IOS/call=31/31
>IOPS=79.42M, BW=38.78GiB/s, IOS/call=32/32
>IOPS=78.06M, BW=38.12GiB/s, IOS/call=31/31
>IOPS=77.64M, BW=37.91GiB/s, IOS/call=32/31
>IOPS=77.17M, BW=37.68GiB/s, IOS/call=32/32
>IOPS=76.73M, BW=37.47GiB/s, IOS/call=31/31
>IOPS=76.94M, BW=37.57GiB/s, IOS/call=32/31
>
>Note that this includes Pavel's fix as well:
>
>https://lore.kernel.org/linux-block/80d4511011d7d4751b4cf6375c4e38f237d935e3.1673955390.git.asml.silence@gmail.com/
So I was wondering whether we need this fix for the passthru path too.
We do not.
In the block path, blk_mq_get_cached_request() hit a mismatch because
the hctx type differed (read vs default).
For passthru, blk_mq_alloc_cached_request() sees no mismatch, since a
passthrough opf is not treated as a read (default vs default).