* [PATCHSET 0/3] passthru block optimizations
@ 2022-08-06 15:20 Jens Axboe
  2022-08-06 15:20 ` [PATCH 1/3] block: shrink rq_map_data a bit Jens Axboe
                   ` (2 more replies)
  0 siblings, 3 replies; 9+ messages in thread
From: Jens Axboe @ 2022-08-06 15:20 UTC (permalink / raw)
  To: linux-block; +Cc: joshi.k, kbusch

Hi,

Currently passthru IO is slower than bdev O_DIRECT. One of the reasons
is that we do two allocations for each IO:

- One alloc+free for the page array for mapping the data
- One alloc+free of the bio

Let passthru IO dip into the bio cache to eliminate the bio alloc+free,
and use UIO_FASTIOV to gate whether we need to alloc+free the page array
for mapping purposes.

This closes about half of the gap between passthru and bdev dio for me.
If we can sanely wire up completion batching for passthru, then that
would almost fully close the gap. Outside of that, the main missing
feature for passthru is the ability to use registered buffers with
io_uring, as the per-io get_user_pages() is a large cycle consumer as
well.
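
For reference, the per-IO pattern this series targets looks roughly like
this (an illustrative sketch of the map/unmap cost, not the literal
blk-map.c code):

        /* map: two allocations per passthru IO */
        bio = bio_kmalloc(nr_vecs, gfp_mask);           /* alloc #1: the bio */
        bytes = iov_iter_get_pages_alloc(iter, &pages,
                                LONG_MAX, &offs);       /* alloc #2: page array */
        ...
        kvfree(pages);          /* page array freed once pages are added */

        /* later, at blk_rq_unmap_user() time */
        bio_uninit(bio);
        kfree(bio);             /* free #1 */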

-- 
Jens Axboe




* [PATCH 1/3] block: shrink rq_map_data a bit
  2022-08-06 15:20 [PATCHSET 0/3] passthru block optimizations Jens Axboe
@ 2022-08-06 15:20 ` Jens Axboe
  2022-08-07  9:26   ` Chaitanya Kulkarni
  2022-08-06 15:20 ` [PATCH 2/3] block: enable bio caching use for passthru IO Jens Axboe
  2022-08-06 15:20 ` [PATCH 3/3] block: use on-stack page vec for <= UIO_FASTIOV Jens Axboe
  2 siblings, 1 reply; 9+ messages in thread
From: Jens Axboe @ 2022-08-06 15:20 UTC (permalink / raw)
  To: linux-block; +Cc: joshi.k, kbusch, Jens Axboe

We don't need full ints for several of these members. Change the
page_order and nr_entries to unsigned shorts, and the true/false from_user
and null_mapped to booleans.

This shrinks the struct from 32 to 24 bytes on 64-bit archs.
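
For reference, assuming typical 64-bit sizes and alignment, the math works
out like this (annotated copies of the struct, matching the diff below):

        /* before: 8 + 4 + 4 + 8 + 4 + 4 = 32 bytes */
        struct rq_map_data {
                struct page **pages;            /* 8 */
                int page_order;                 /* 4 */
                int nr_entries;                 /* 4 */
                unsigned long offset;           /* 8 */
                int null_mapped;                /* 4 */
                int from_user;                  /* 4 */
        };

        /* after: 8 + 8 + 2 + 2 + 1 + 1 (+ 2 pad) = 24 bytes */
        struct rq_map_data {
                struct page **pages;            /* 8 */
                unsigned long offset;           /* 8 */
                unsigned short page_order;      /* 2 */
                unsigned short nr_entries;      /* 2 */
                bool null_mapped;               /* 1 */
                bool from_user;                 /* 1 */
        };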

Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 block/blk-map.c        | 2 +-
 include/linux/blk-mq.h | 8 ++++----
 2 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/block/blk-map.c b/block/blk-map.c
index df8b066cd548..4043c5809cd4 100644
--- a/block/blk-map.c
+++ b/block/blk-map.c
@@ -158,7 +158,7 @@ static int bio_copy_user_iov(struct request *rq, struct rq_map_data *map_data,
 	bio_init(bio, NULL, bio->bi_inline_vecs, nr_pages, req_op(rq));
 
 	if (map_data) {
-		nr_pages = 1 << map_data->page_order;
+		nr_pages = 1U << map_data->page_order;
 		i = map_data->offset / PAGE_SIZE;
 	}
 	while (len) {
diff --git a/include/linux/blk-mq.h b/include/linux/blk-mq.h
index effee1dc715a..1f21590439d4 100644
--- a/include/linux/blk-mq.h
+++ b/include/linux/blk-mq.h
@@ -964,11 +964,11 @@ blk_status_t blk_insert_cloned_request(struct request *rq);
 
 struct rq_map_data {
 	struct page **pages;
-	int page_order;
-	int nr_entries;
 	unsigned long offset;
-	int null_mapped;
-	int from_user;
+	unsigned short page_order;
+	unsigned short nr_entries;
+	bool null_mapped;
+	bool from_user;
 };
 
 int blk_rq_map_user(struct request_queue *, struct request *,
-- 
2.35.1



* [PATCH 2/3] block: enable bio caching use for passthru IO
  2022-08-06 15:20 [PATCHSET 0/3] passthru block optimizations Jens Axboe
  2022-08-06 15:20 ` [PATCH 1/3] block: shrink rq_map_data a bit Jens Axboe
@ 2022-08-06 15:20 ` Jens Axboe
  2022-08-07  9:27   ` Chaitanya Kulkarni
  2022-08-07 18:08   ` Kanchan Joshi
  2022-08-06 15:20 ` [PATCH 3/3] block: use on-stack page vec for <= UIO_FASTIOV Jens Axboe
  2 siblings, 2 replies; 9+ messages in thread
From: Jens Axboe @ 2022-08-06 15:20 UTC (permalink / raw)
  To: linux-block; +Cc: joshi.k, kbusch, Jens Axboe

bdev-based polled O_DIRECT is currently quite a bit faster than
passthru on the same device, and one of the reasons is that we're not
able to use the bio caching for passthru IO.

If REQ_POLLED is set on the request, use the fs bio set for grabbing a
bio from the caches, if available. This saves 5-6% of CPU overhead
for polled passthru IO.
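
The one rule when mixing allocators is that the free side must match the
allocation side; a minimal sketch of that pairing (the actual helper is
bio_map_put() in the diff below):

        /* free must match how the bio was allocated */
        if (bio->bi_opf & REQ_ALLOC_CACHE) {
                bio_put(bio);           /* recycle into the per-cpu bio cache */
        } else {
                bio_uninit(bio);        /* came from bio_kmalloc() */
                kfree(bio);
        }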

Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 block/blk-map.c | 33 +++++++++++++++++++++++++--------
 1 file changed, 25 insertions(+), 8 deletions(-)

diff --git a/block/blk-map.c b/block/blk-map.c
index 4043c5809cd4..5da03f2614eb 100644
--- a/block/blk-map.c
+++ b/block/blk-map.c
@@ -231,6 +231,16 @@ static int bio_copy_user_iov(struct request *rq, struct rq_map_data *map_data,
 	return ret;
 }
 
+static void bio_map_put(struct bio *bio)
+{
+	if (bio->bi_opf & REQ_ALLOC_CACHE) {
+		bio_put(bio);
+	} else {
+		bio_uninit(bio);
+		kfree(bio);
+	}
+}
+
 static int bio_map_user_iov(struct request *rq, struct iov_iter *iter,
 		gfp_t gfp_mask)
 {
@@ -243,10 +253,19 @@ static int bio_map_user_iov(struct request *rq, struct iov_iter *iter,
 	if (!iov_iter_count(iter))
 		return -EINVAL;
 
-	bio = bio_kmalloc(nr_vecs, gfp_mask);
-	if (!bio)
-		return -ENOMEM;
-	bio_init(bio, NULL, bio->bi_inline_vecs, nr_vecs, req_op(rq));
+	if (rq->cmd_flags & REQ_POLLED) {
+		blk_opf_t opf = rq->cmd_flags | REQ_ALLOC_CACHE;
+
+		bio = bio_alloc_bioset(NULL, nr_vecs, opf, gfp_mask,
+					&fs_bio_set);
+		if (!bio)
+			return -ENOMEM;
+	} else {
+		bio = bio_kmalloc(nr_vecs, gfp_mask);
+		if (!bio)
+			return -ENOMEM;
+		bio_init(bio, NULL, bio->bi_inline_vecs, nr_vecs, req_op(rq));
+	}
 
 	while (iov_iter_count(iter)) {
 		struct page **pages;
@@ -304,8 +323,7 @@ static int bio_map_user_iov(struct request *rq, struct iov_iter *iter,
 
  out_unmap:
 	bio_release_pages(bio, false);
-	bio_uninit(bio);
-	kfree(bio);
+	bio_map_put(bio);
 	return ret;
 }
 
@@ -610,8 +628,7 @@ int blk_rq_unmap_user(struct bio *bio)
 
 		next_bio = bio;
 		bio = bio->bi_next;
-		bio_uninit(next_bio);
-		kfree(next_bio);
+		bio_map_put(next_bio);
 	}
 
 	return ret;
-- 
2.35.1



* [PATCH 3/3] block: use on-stack page vec for <= UIO_FASTIOV
  2022-08-06 15:20 [PATCHSET 0/3] passthru block optimizations Jens Axboe
  2022-08-06 15:20 ` [PATCH 1/3] block: shrink rq_map_data a bit Jens Axboe
  2022-08-06 15:20 ` [PATCH 2/3] block: enable bio caching use for passthru IO Jens Axboe
@ 2022-08-06 15:20 ` Jens Axboe
  2022-08-07  9:30   ` Chaitanya Kulkarni
  2 siblings, 1 reply; 9+ messages in thread
From: Jens Axboe @ 2022-08-06 15:20 UTC (permalink / raw)
  To: linux-block; +Cc: joshi.k, kbusch, Jens Axboe

Avoid a kmalloc+kfree for each page array if we're only mapping a few
pages. An alloc+free for each IO is quite expensive, and it's pretty
pointless if we're only dealing with 1 or a few vecs.

Use UIO_FASTIOV like we do in other spots to set a sane limit on how big
an IO we want to avoid allocations for.
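
For scale: UIO_FASTIOV is 8 (per include/uapi/linux/uio.h), so the
on-stack path costs just 64 bytes of stack on 64-bit:

        /* UIO_FASTIOV == 8, so the on-stack array is 8 pointers */
        struct page *stack_pages[UIO_FASTIOV];  /* 8 * 8 = 64 bytes */

        /* IOs with fewer vecs than that now map without hitting the allocator */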

Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 block/blk-map.c | 14 +++++++++++---
 1 file changed, 11 insertions(+), 3 deletions(-)

diff --git a/block/blk-map.c b/block/blk-map.c
index 5da03f2614eb..d0ff80a9902e 100644
--- a/block/blk-map.c
+++ b/block/blk-map.c
@@ -268,12 +268,19 @@ static int bio_map_user_iov(struct request *rq, struct iov_iter *iter,
 	}
 
 	while (iov_iter_count(iter)) {
-		struct page **pages;
+		struct page **pages, *stack_pages[UIO_FASTIOV];
 		ssize_t bytes;
 		size_t offs, added = 0;
 		int npages;
 
-		bytes = iov_iter_get_pages_alloc(iter, &pages, LONG_MAX, &offs);
+		if (nr_vecs < ARRAY_SIZE(stack_pages)) {
+			pages = stack_pages;
+			bytes = iov_iter_get_pages(iter, pages, LONG_MAX,
+							nr_vecs, &offs);
+		} else {
+			bytes = iov_iter_get_pages_alloc(iter, &pages, LONG_MAX,
+							&offs);
+		}
 		if (unlikely(bytes <= 0)) {
 			ret = bytes ? bytes : -EFAULT;
 			goto out_unmap;
@@ -310,7 +317,8 @@ static int bio_map_user_iov(struct request *rq, struct iov_iter *iter,
 		 */
 		while (j < npages)
 			put_page(pages[j++]);
-		kvfree(pages);
+		if (pages != stack_pages)
+			kvfree(pages);
 		/* couldn't stuff something into bio? */
 		if (bytes)
 			break;
-- 
2.35.1



* Re: [PATCH 1/3] block: shrink rq_map_data a bit
  2022-08-06 15:20 ` [PATCH 1/3] block: shrink rq_map_data a bit Jens Axboe
@ 2022-08-07  9:26   ` Chaitanya Kulkarni
  0 siblings, 0 replies; 9+ messages in thread
From: Chaitanya Kulkarni @ 2022-08-07  9:26 UTC (permalink / raw)
  To: Jens Axboe; +Cc: joshi.k, linux-block, kbusch

On 8/6/22 08:20, Jens Axboe wrote:
> We don't need full ints for several of these members. Change the
> page_order and nr_entries to unsigned shorts, and the true/false from_user
> and null_mapped to booleans.
> 
> This shrinks the struct from 32 to 24 bytes on 64-bit archs.
> 
> Signed-off-by: Jens Axboe <axboe@kernel.dk>


Looks good.

Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>

-ck




* Re: [PATCH 2/3] block: enable bio caching use for passthru IO
  2022-08-06 15:20 ` [PATCH 2/3] block: enable bio caching use for passthru IO Jens Axboe
@ 2022-08-07  9:27   ` Chaitanya Kulkarni
  2022-08-07 18:08   ` Kanchan Joshi
  1 sibling, 0 replies; 9+ messages in thread
From: Chaitanya Kulkarni @ 2022-08-07  9:27 UTC (permalink / raw)
  To: Jens Axboe, linux-block; +Cc: joshi.k, kbusch

On 8/6/22 08:20, Jens Axboe wrote:
> bdev-based polled O_DIRECT is currently quite a bit faster than
> passthru on the same device, and one of the reasons is that we're not
> able to use the bio caching for passthru IO.
> 
> If REQ_POLLED is set on the request, use the fs bio set for grabbing a
> bio from the caches, if available. This saves 5-6% of CPU overhead
> for polled passthru IO.
> 
> Signed-off-by: Jens Axboe <axboe@kernel.dk>

Looks good.

Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>

-ck




* Re: [PATCH 3/3] block: use on-stack page vec for <= UIO_FASTIOV
  2022-08-06 15:20 ` [PATCH 3/3] block: use on-stack page vec for <= UIO_FASTIOV Jens Axboe
@ 2022-08-07  9:30   ` Chaitanya Kulkarni
  0 siblings, 0 replies; 9+ messages in thread
From: Chaitanya Kulkarni @ 2022-08-07  9:30 UTC (permalink / raw)
  To: Jens Axboe, linux-block; +Cc: joshi.k, kbusch

On 8/6/22 08:20, Jens Axboe wrote:
> Avoid a kmalloc+kfree for each page array if we're only mapping a few
> pages. An alloc+free for each IO is quite expensive, and it's pretty
> pointless if we're only dealing with 1 or a few vecs.
> 
> Use UIO_FASTIOV like we do in other spots to set a sane limit on how big
> an IO we want to avoid allocations for.
> 
> Signed-off-by: Jens Axboe <axboe@kernel.dk>
> ---
>   block/blk-map.c | 14 +++++++++++---
>   1 file changed, 11 insertions(+), 3 deletions(-)
> 


Looks good.

Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>

-ck




* Re: [PATCH 2/3] block: enable bio caching use for passthru IO
  2022-08-06 15:20 ` [PATCH 2/3] block: enable bio caching use for passthru IO Jens Axboe
  2022-08-07  9:27   ` Chaitanya Kulkarni
@ 2022-08-07 18:08   ` Kanchan Joshi
  2022-08-07 18:45     ` Jens Axboe
  1 sibling, 1 reply; 9+ messages in thread
From: Kanchan Joshi @ 2022-08-07 18:08 UTC (permalink / raw)
  To: Jens Axboe; +Cc: linux-block, kbusch


On Sat, Aug 06, 2022 at 09:20:03AM -0600, Jens Axboe wrote:
>bdev-based polled O_DIRECT is currently quite a bit faster than
>passthru on the same device, and one of the reasons is that we're not
>able to use the bio caching for passthru IO.
>
>If REQ_POLLED is set on the request, use the fs bio set for grabbing a
>bio from the caches, if available. This saves 5-6% of CPU overhead
>for polled passthru IO.

For the passthru path, the bio is always freed in task context (and not
in irq context), so must this be tied to polled IO only?





* Re: [PATCH 2/3] block: enable bio caching use for passthru IO
  2022-08-07 18:08   ` Kanchan Joshi
@ 2022-08-07 18:45     ` Jens Axboe
  0 siblings, 0 replies; 9+ messages in thread
From: Jens Axboe @ 2022-08-07 18:45 UTC (permalink / raw)
  To: Kanchan Joshi; +Cc: linux-block, kbusch

On 8/7/22 12:08 PM, Kanchan Joshi wrote:
> On Sat, Aug 06, 2022 at 09:20:03AM -0600, Jens Axboe wrote:
>> bdev-based polled O_DIRECT is currently quite a bit faster than
>> passthru on the same device, and one of the reasons is that we're not
>> able to use the bio caching for passthru IO.
>>
>> If REQ_POLLED is set on the request, use the fs bio set for grabbing a
>> bio from the caches, if available. This saves 5-6% of CPU overhead
>> for polled passthru IO.
> 
> For the passthru path, the bio is always freed in task context (and not
> in irq context), so must this be tied to polled IO only?

Right, that's why it's tied to polled - polled completions stay in task
context, which is what makes recycling into the cache safe. If polling
gets cleared, then the bio will be freed normally on completion rather
than inserted into the cache.

I do have patches for irq bio caching too, that'll work fine with
io_uring:

https://git.kernel.dk/cgit/linux-block/commit/?h=perf-wip&id=ab3d4371227a34a5561e4d594a17baaad03bf1b7

I'll post that too; it would be nice if we can figure out a clean way to
do this. I have posted it before, iirc.

-- 
Jens Axboe


