* [RFC for-next v2 0/4] enable pcpu bio caching for IRQ I/O
@ 2022-10-18 19:50 Pavel Begunkov
  2022-10-18 19:50 ` [RFC for-next v2 1/4] bio: safeguard REQ_ALLOC_CACHE bio put Pavel Begunkov
                   ` (5 more replies)
  0 siblings, 6 replies; 13+ messages in thread
From: Pavel Begunkov @ 2022-10-18 19:50 UTC (permalink / raw)
  To: Jens Axboe, linux-block
  Cc: io-uring, linux-kernel, linux-fsdevel, Pavel Begunkov

This series implements per-CPU (pcpu) bio caching for normal, IRQ-driven
I/O, extending REQ_ALLOC_CACHE, which is currently limited to iopoll. The
allocation side still only works from non-IRQ context, which is why it's
not enabled by default, but turning it on for other users (e.g.
filesystems) is just a matter of passing a flag.

t/io_uring with an Optane SSD setup showed an IOPS gain of +7% for batches
of 32 requests and +4.3% for batches of 8.

IRQ, 128/32/32, cache off
IOPS=59.08M, BW=28.84GiB/s, IOS/call=31/31
IOPS=59.30M, BW=28.96GiB/s, IOS/call=32/32
IOPS=59.97M, BW=29.28GiB/s, IOS/call=31/31
IOPS=59.92M, BW=29.26GiB/s, IOS/call=32/32
IOPS=59.81M, BW=29.20GiB/s, IOS/call=32/31

IRQ, 128/32/32, cache on
IOPS=64.05M, BW=31.27GiB/s, IOS/call=32/31
IOPS=64.22M, BW=31.36GiB/s, IOS/call=32/32
IOPS=64.04M, BW=31.27GiB/s, IOS/call=31/31
IOPS=63.16M, BW=30.84GiB/s, IOS/call=32/32

IRQ, 32/8/8, cache off
IOPS=50.60M, BW=24.71GiB/s, IOS/call=7/8
IOPS=50.22M, BW=24.52GiB/s, IOS/call=8/7
IOPS=49.54M, BW=24.19GiB/s, IOS/call=8/8
IOPS=50.07M, BW=24.45GiB/s, IOS/call=7/7
IOPS=50.46M, BW=24.64GiB/s, IOS/call=8/8

IRQ, 32/8/8, cache on
IOPS=51.39M, BW=25.09GiB/s, IOS/call=8/7
IOPS=52.52M, BW=25.64GiB/s, IOS/call=7/8
IOPS=52.57M, BW=25.67GiB/s, IOS/call=8/8
IOPS=52.58M, BW=25.67GiB/s, IOS/call=8/7
IOPS=52.61M, BW=25.69GiB/s, IOS/call=8/8

The main part is in patch 3. It would be great to take patch 1 separately
for 6.1 as an extra safety measure.

v2: fix botched splicing threshold checks

Pavel Begunkov (4):
  bio: safeguard REQ_ALLOC_CACHE bio put
  bio: split pcpu cache part of bio_put into a helper
  block/bio: add pcpu caching for non-polling bio_put
  io_uring/rw: enable bio caches for IRQ rw

 block/bio.c   | 94 ++++++++++++++++++++++++++++++++++++++++-----------
 io_uring/rw.c |  3 +-
 2 files changed, 76 insertions(+), 21 deletions(-)

-- 
2.38.0


^ permalink raw reply	[flat|nested] 13+ messages in thread

* [RFC for-next v2 1/4] bio: safeguard REQ_ALLOC_CACHE bio put
  2022-10-18 19:50 [RFC for-next v2 0/4] enable pcpu bio caching for IRQ I/O Pavel Begunkov
@ 2022-10-18 19:50 ` Pavel Begunkov
  2022-10-20  8:26   ` Christoph Hellwig
  2022-10-18 19:50 ` [RFC for-next v2 2/4] bio: split pcpu cache part of bio_put into a helper Pavel Begunkov
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 13+ messages in thread
From: Pavel Begunkov @ 2022-10-18 19:50 UTC (permalink / raw)
  To: Jens Axboe, linux-block
  Cc: io-uring, linux-kernel, linux-fsdevel, Pavel Begunkov

bio_put() with REQ_ALLOC_CACHE assumes that it is not executed from
IRQ context. Let's add a warning if the invariant is not respected,
especially since there are a couple of places that remove REQ_POLLED by
hand without also clearing REQ_ALLOC_CACHE.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
---
 block/bio.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/block/bio.c b/block/bio.c
index 7cb7d2ff139b..5b4594daa259 100644
--- a/block/bio.c
+++ b/block/bio.c
@@ -741,7 +741,7 @@ void bio_put(struct bio *bio)
 			return;
 	}
 
-	if (bio->bi_opf & REQ_ALLOC_CACHE) {
+	if ((bio->bi_opf & REQ_ALLOC_CACHE) && !WARN_ON_ONCE(in_interrupt())) {
 		struct bio_alloc_cache *cache;
 
 		bio_uninit(bio);
-- 
2.38.0



* [RFC for-next v2 2/4] bio: split pcpu cache part of bio_put into a helper
  2022-10-18 19:50 [RFC for-next v2 0/4] enable pcpu bio caching for IRQ I/O Pavel Begunkov
  2022-10-18 19:50 ` [RFC for-next v2 1/4] bio: safeguard REQ_ALLOC_CACHE bio put Pavel Begunkov
@ 2022-10-18 19:50 ` Pavel Begunkov
  2022-10-20  8:30   ` Christoph Hellwig
  2022-10-18 19:50 ` [RFC for-next v2 3/4] block/bio: add pcpu caching for non-polling bio_put Pavel Begunkov
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 13+ messages in thread
From: Pavel Begunkov @ 2022-10-18 19:50 UTC (permalink / raw)
  To: Jens Axboe, linux-block
  Cc: io-uring, linux-kernel, linux-fsdevel, Pavel Begunkov

Extract a helper out of bio_put for recycling into percpu caches.
It's a preparation patch without functional changes.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
---
 block/bio.c | 38 +++++++++++++++++++++++++-------------
 1 file changed, 25 insertions(+), 13 deletions(-)

diff --git a/block/bio.c b/block/bio.c
index 5b4594daa259..ac16cc154476 100644
--- a/block/bio.c
+++ b/block/bio.c
@@ -725,6 +725,28 @@ static void bio_alloc_cache_destroy(struct bio_set *bs)
 	bs->cache = NULL;
 }
 
+static inline void bio_put_percpu_cache(struct bio *bio)
+{
+	struct bio_alloc_cache *cache;
+
+	cache = per_cpu_ptr(bio->bi_pool->cache, get_cpu());
+	bio_uninit(bio);
+
+	if ((bio->bi_opf & REQ_POLLED) && !WARN_ON_ONCE(in_interrupt())) {
+		bio->bi_next = cache->free_list;
+		cache->free_list = bio;
+		cache->nr++;
+	} else {
+		put_cpu();
+		bio_free(bio);
+		return;
+	}
+
+	if (cache->nr > ALLOC_CACHE_MAX + ALLOC_CACHE_SLACK)
+		bio_alloc_cache_prune(cache, ALLOC_CACHE_SLACK);
+	put_cpu();
+}
+
 /**
  * bio_put - release a reference to a bio
  * @bio:   bio to release reference to
@@ -740,20 +762,10 @@ void bio_put(struct bio *bio)
 		if (!atomic_dec_and_test(&bio->__bi_cnt))
 			return;
 	}
-
-	if ((bio->bi_opf & REQ_ALLOC_CACHE) && !WARN_ON_ONCE(in_interrupt())) {
-		struct bio_alloc_cache *cache;
-
-		bio_uninit(bio);
-		cache = per_cpu_ptr(bio->bi_pool->cache, get_cpu());
-		bio->bi_next = cache->free_list;
-		cache->free_list = bio;
-		if (++cache->nr > ALLOC_CACHE_MAX + ALLOC_CACHE_SLACK)
-			bio_alloc_cache_prune(cache, ALLOC_CACHE_SLACK);
-		put_cpu();
-	} else {
+	if (bio->bi_opf & REQ_ALLOC_CACHE)
+		bio_put_percpu_cache(bio);
+	else
 		bio_free(bio);
-	}
 }
 EXPORT_SYMBOL(bio_put);
 
-- 
2.38.0



* [RFC for-next v2 3/4] block/bio: add pcpu caching for non-polling bio_put
  2022-10-18 19:50 [RFC for-next v2 0/4] enable pcpu bio caching for IRQ I/O Pavel Begunkov
  2022-10-18 19:50 ` [RFC for-next v2 1/4] bio: safeguard REQ_ALLOC_CACHE bio put Pavel Begunkov
  2022-10-18 19:50 ` [RFC for-next v2 2/4] bio: split pcpu cache part of bio_put into a helper Pavel Begunkov
@ 2022-10-18 19:50 ` Pavel Begunkov
  2022-10-20  8:31   ` Christoph Hellwig
  2022-10-18 19:50 ` [RFC for-next v2 4/4] io_uring/rw: enable bio caches for IRQ rw Pavel Begunkov
                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 13+ messages in thread
From: Pavel Begunkov @ 2022-10-18 19:50 UTC (permalink / raw)
  To: Jens Axboe, linux-block
  Cc: io-uring, linux-kernel, linux-fsdevel, Pavel Begunkov

This patch extends REQ_ALLOC_CACHE to IRQ completions, whereas currently
it is limited to iopoll. Instead of guarding the list with IRQ toggling
on alloc, which is expensive, it keeps an additional IRQ-safe list from
which bios are spliced in batches to amortise the overhead. The put side
does toggle IRQs, but in many cases they are already disabled, so the
toggle is cheap.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
---
 block/bio.c | 64 ++++++++++++++++++++++++++++++++++++++++++++---------
 1 file changed, 53 insertions(+), 11 deletions(-)

diff --git a/block/bio.c b/block/bio.c
index ac16cc154476..c2dda2759df5 100644
--- a/block/bio.c
+++ b/block/bio.c
@@ -25,9 +25,15 @@
 #include "blk-rq-qos.h"
 #include "blk-cgroup.h"
 
+#define ALLOC_CACHE_THRESHOLD	16
+#define ALLOC_CACHE_SLACK	64
+#define ALLOC_CACHE_MAX		512
+
 struct bio_alloc_cache {
 	struct bio		*free_list;
+	struct bio		*free_list_irq;
 	unsigned int		nr;
+	unsigned int		nr_irq;
 };
 
 static struct biovec_slab {
@@ -408,6 +414,22 @@ static void punt_bios_to_rescuer(struct bio_set *bs)
 	queue_work(bs->rescue_workqueue, &bs->rescue_work);
 }
 
+static void bio_alloc_irq_cache_splice(struct bio_alloc_cache *cache)
+{
+	unsigned long flags;
+
+	/* cache->free_list must be empty */
+	if (WARN_ON_ONCE(cache->free_list))
+		return;
+
+	local_irq_save(flags);
+	cache->free_list = cache->free_list_irq;
+	cache->free_list_irq = NULL;
+	cache->nr += cache->nr_irq;
+	cache->nr_irq = 0;
+	local_irq_restore(flags);
+}
+
 static struct bio *bio_alloc_percpu_cache(struct block_device *bdev,
 		unsigned short nr_vecs, blk_opf_t opf, gfp_t gfp,
 		struct bio_set *bs)
@@ -417,9 +439,17 @@ static struct bio *bio_alloc_percpu_cache(struct block_device *bdev,
 
 	cache = per_cpu_ptr(bs->cache, get_cpu());
 	if (!cache->free_list) {
-		put_cpu();
-		return NULL;
+		if (READ_ONCE(cache->nr_irq) < ALLOC_CACHE_THRESHOLD) {
+			put_cpu();
+			return NULL;
+		}
+		bio_alloc_irq_cache_splice(cache);
+		if (!cache->free_list) {
+			put_cpu();
+			return NULL;
+		}
 	}
+
 	bio = cache->free_list;
 	cache->free_list = bio->bi_next;
 	cache->nr--;
@@ -676,11 +706,8 @@ void guard_bio_eod(struct bio *bio)
 	bio_truncate(bio, maxsector << 9);
 }
 
-#define ALLOC_CACHE_MAX		512
-#define ALLOC_CACHE_SLACK	 64
-
-static void bio_alloc_cache_prune(struct bio_alloc_cache *cache,
-				  unsigned int nr)
+static int __bio_alloc_cache_prune(struct bio_alloc_cache *cache,
+				   unsigned int nr)
 {
 	unsigned int i = 0;
 	struct bio *bio;
@@ -692,6 +719,17 @@ static void bio_alloc_cache_prune(struct bio_alloc_cache *cache,
 		if (++i == nr)
 			break;
 	}
+	return i;
+}
+
+static void bio_alloc_cache_prune(struct bio_alloc_cache *cache,
+				  unsigned int nr)
+{
+	nr -= __bio_alloc_cache_prune(cache, nr);
+	if (!READ_ONCE(cache->free_list)) {
+		bio_alloc_irq_cache_splice(cache);
+		__bio_alloc_cache_prune(cache, nr);
+	}
 }
 
 static int bio_cpu_dead(unsigned int cpu, struct hlist_node *node)
@@ -728,6 +766,7 @@ static void bio_alloc_cache_destroy(struct bio_set *bs)
 static inline void bio_put_percpu_cache(struct bio *bio)
 {
 	struct bio_alloc_cache *cache;
+	unsigned long flags;
 
 	cache = per_cpu_ptr(bio->bi_pool->cache, get_cpu());
 	bio_uninit(bio);
@@ -737,12 +776,15 @@ static inline void bio_put_percpu_cache(struct bio *bio)
 		cache->free_list = bio;
 		cache->nr++;
 	} else {
-		put_cpu();
-		bio_free(bio);
-		return;
+		local_irq_save(flags);
+		bio->bi_next = cache->free_list_irq;
+		cache->free_list_irq = bio;
+		cache->nr_irq++;
+		local_irq_restore(flags);
 	}
 
-	if (cache->nr > ALLOC_CACHE_MAX + ALLOC_CACHE_SLACK)
+	if (READ_ONCE(cache->nr_irq) + cache->nr >
+	    ALLOC_CACHE_MAX + ALLOC_CACHE_SLACK)
 		bio_alloc_cache_prune(cache, ALLOC_CACHE_SLACK);
 	put_cpu();
 }
-- 
2.38.0



* [RFC for-next v2 4/4] io_uring/rw: enable bio caches for IRQ rw
  2022-10-18 19:50 [RFC for-next v2 0/4] enable pcpu bio caching for IRQ I/O Pavel Begunkov
                   ` (2 preceding siblings ...)
  2022-10-18 19:50 ` [RFC for-next v2 3/4] block/bio: add pcpu caching for non-polling bio_put Pavel Begunkov
@ 2022-10-18 19:50 ` Pavel Begunkov
  2022-10-20  8:32 ` [RFC for-next v2 0/4] enable pcpu bio caching for IRQ I/O Christoph Hellwig
  2022-10-20 12:50 ` (subset) " Jens Axboe
  5 siblings, 0 replies; 13+ messages in thread
From: Pavel Begunkov @ 2022-10-18 19:50 UTC (permalink / raw)
  To: Jens Axboe, linux-block
  Cc: io-uring, linux-kernel, linux-fsdevel, Pavel Begunkov

Now we can use IOCB_ALLOC_CACHE not only for iopoll'ed reads/writes but
also for normal IRQ-driven I/O.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
---
 io_uring/rw.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/io_uring/rw.c b/io_uring/rw.c
index 100de2626e47..ff609b762742 100644
--- a/io_uring/rw.c
+++ b/io_uring/rw.c
@@ -667,6 +667,7 @@ static int io_rw_init_file(struct io_kiocb *req, fmode_t mode)
 	ret = kiocb_set_rw_flags(kiocb, rw->flags);
 	if (unlikely(ret))
 		return ret;
+	kiocb->ki_flags |= IOCB_ALLOC_CACHE;
 
 	/*
 	 * If the file is marked O_NONBLOCK, still allow retry for it if it
@@ -682,7 +683,7 @@ static int io_rw_init_file(struct io_kiocb *req, fmode_t mode)
 			return -EOPNOTSUPP;
 
 		kiocb->private = NULL;
-		kiocb->ki_flags |= IOCB_HIPRI | IOCB_ALLOC_CACHE;
+		kiocb->ki_flags |= IOCB_HIPRI;
 		kiocb->ki_complete = io_complete_rw_iopoll;
 		req->iopoll_completed = 0;
 	} else {
-- 
2.38.0



* Re: [RFC for-next v2 1/4] bio: safeguard REQ_ALLOC_CACHE bio put
  2022-10-18 19:50 ` [RFC for-next v2 1/4] bio: safeguard REQ_ALLOC_CACHE bio put Pavel Begunkov
@ 2022-10-20  8:26   ` Christoph Hellwig
  0 siblings, 0 replies; 13+ messages in thread
From: Christoph Hellwig @ 2022-10-20  8:26 UTC (permalink / raw)
  To: Pavel Begunkov
  Cc: Jens Axboe, linux-block, io-uring, linux-kernel, linux-fsdevel

On Tue, Oct 18, 2022 at 08:50:55PM +0100, Pavel Begunkov wrote:
> bio_put() with REQ_ALLOC_CACHE assumes that it is not executed from
> IRQ context. Let's add a warning if the invariant is not respected,
> especially since there are a couple of places that remove REQ_POLLED by
> hand without also clearing REQ_ALLOC_CACHE.

Looks good:

Reviewed-by: Christoph Hellwig <hch@lst.de>


* Re: [RFC for-next v2 2/4] bio: split pcpu cache part of bio_put into a helper
  2022-10-18 19:50 ` [RFC for-next v2 2/4] bio: split pcpu cache part of bio_put into a helper Pavel Begunkov
@ 2022-10-20  8:30   ` Christoph Hellwig
  0 siblings, 0 replies; 13+ messages in thread
From: Christoph Hellwig @ 2022-10-20  8:30 UTC (permalink / raw)
  To: Pavel Begunkov
  Cc: Jens Axboe, linux-block, io-uring, linux-kernel, linux-fsdevel

> +	if ((bio->bi_opf & REQ_POLLED) && !WARN_ON_ONCE(in_interrupt())) {
> +		bio->bi_next = cache->free_list;
> +		cache->free_list = bio;
> +		cache->nr++;
> +	} else {
> +		put_cpu();
> +		bio_free(bio);
> +		return;
> +	}

This reads a little strange with the return in an else.  Why not:

	if (!(bio->bi_opf & REQ_POLLED) || WARN_ON_ONCE(in_interrupt())) {
		put_cpu();
		bio_free(bio);
		return;
	}

	bio->bi_next = cache->free_list;
	cache->free_list = bio;
	cache->nr++;

but given that the simple free case doesn't care about what CPU we're
on or the per-cpu cache pointer, I think we can simply move the

	cache = per_cpu_ptr(bio->bi_pool->cache, get_cpu());

after the above return as well.



* Re: [RFC for-next v2 3/4] block/bio: add pcpu caching for non-polling bio_put
  2022-10-18 19:50 ` [RFC for-next v2 3/4] block/bio: add pcpu caching for non-polling bio_put Pavel Begunkov
@ 2022-10-20  8:31   ` Christoph Hellwig
  2022-10-20 12:26     ` Pavel Begunkov
  0 siblings, 1 reply; 13+ messages in thread
From: Christoph Hellwig @ 2022-10-20  8:31 UTC (permalink / raw)
  To: Pavel Begunkov
  Cc: Jens Axboe, linux-block, io-uring, linux-kernel, linux-fsdevel

> +	unsigned long flags;
>  
>  	cache = per_cpu_ptr(bio->bi_pool->cache, get_cpu());
>  	bio_uninit(bio);
> @@ -737,12 +776,15 @@ static inline void bio_put_percpu_cache(struct bio *bio)
>  		cache->free_list = bio;
>  		cache->nr++;
>  	} else {
> -		put_cpu();
> -		bio_free(bio);
> -		return;
> +		local_irq_save(flags);
> +		bio->bi_next = cache->free_list_irq;
> +		cache->free_list_irq = bio;
> +		cache->nr_irq++;
> +		local_irq_restore(flags);
>  	}

Ok, I guess with that my previous comments don't make quite
as much sense any more.  I think you can keep flags local in
the branch here, though.


* Re: [RFC for-next v2 0/4] enable pcpu bio caching for IRQ I/O
  2022-10-18 19:50 [RFC for-next v2 0/4] enable pcpu bio caching for IRQ I/O Pavel Begunkov
                   ` (3 preceding siblings ...)
  2022-10-18 19:50 ` [RFC for-next v2 4/4] io_uring/rw: enable bio caches for IRQ rw Pavel Begunkov
@ 2022-10-20  8:32 ` Christoph Hellwig
  2022-10-20 12:40   ` Pavel Begunkov
  2022-10-20 12:50 ` (subset) " Jens Axboe
  5 siblings, 1 reply; 13+ messages in thread
From: Christoph Hellwig @ 2022-10-20  8:32 UTC (permalink / raw)
  To: Pavel Begunkov
  Cc: Jens Axboe, linux-block, io-uring, linux-kernel, linux-fsdevel

On Tue, Oct 18, 2022 at 08:50:54PM +0100, Pavel Begunkov wrote:
> This series implements bio pcpu caching for normal / IRQ-driven I/O
> extending REQ_ALLOC_CACHE currently limited to iopoll. The allocation side
> still only works from non-irq context, which is the reason it's not enabled
> by default, but turning it on for other users (e.g. filesystems) is
> a matter of passing a flag.
> 
> t/io_uring with an Optane SSD setup showed +7% for batches of 32 requests
> and +4.3% for batches of 8.

This looks much nicer to me than the previous attempt exposing the bio
internals to io_uring, thanks.


* Re: [RFC for-next v2 3/4] block/bio: add pcpu caching for non-polling bio_put
  2022-10-20  8:31   ` Christoph Hellwig
@ 2022-10-20 12:26     ` Pavel Begunkov
  0 siblings, 0 replies; 13+ messages in thread
From: Pavel Begunkov @ 2022-10-20 12:26 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Jens Axboe, linux-block, io-uring, linux-kernel, linux-fsdevel

On 10/20/22 09:31, Christoph Hellwig wrote:
>> +	unsigned long flags;
>>   
>>   	cache = per_cpu_ptr(bio->bi_pool->cache, get_cpu());
>>   	bio_uninit(bio);
>> @@ -737,12 +776,15 @@ static inline void bio_put_percpu_cache(struct bio *bio)
>>   		cache->free_list = bio;
>>   		cache->nr++;
>>   	} else {
>> -		put_cpu();
>> -		bio_free(bio);
>> -		return;
>> +		local_irq_save(flags);
>> +		bio->bi_next = cache->free_list_irq;
>> +		cache->free_list_irq = bio;
>> +		cache->nr_irq++;
>> +		local_irq_restore(flags);
>>   	}
> 
> Ok, I guess with that my previous comments don't make quite
> as much sense any more.  I think you can keep flags local in

Yeah, a little bit of oracle coding

> the branch here, though.

Not like it makes any difference but can move it

-- 
Pavel Begunkov


* Re: [RFC for-next v2 0/4] enable pcpu bio caching for IRQ I/O
  2022-10-20  8:32 ` [RFC for-next v2 0/4] enable pcpu bio caching for IRQ I/O Christoph Hellwig
@ 2022-10-20 12:40   ` Pavel Begunkov
  2022-10-20 12:53     ` Jens Axboe
  0 siblings, 1 reply; 13+ messages in thread
From: Pavel Begunkov @ 2022-10-20 12:40 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Jens Axboe, linux-block, io-uring, linux-kernel, linux-fsdevel

On 10/20/22 09:32, Christoph Hellwig wrote:
> On Tue, Oct 18, 2022 at 08:50:54PM +0100, Pavel Begunkov wrote:
>> This series implements bio pcpu caching for normal / IRQ-driven I/O
>> extending REQ_ALLOC_CACHE currently limited to iopoll. The allocation side
>> still only works from non-irq context, which is the reason it's not enabled
>> by default, but turning it on for other users (e.g. filesystems) is
>> a matter of passing a flag.
>>
>> t/io_uring with an Optane SSD setup showed +7% for batches of 32 requests
>> and +4.3% for batches of 8.
> 
> This looks much nicer to me than the previous attempt exposing the bio
> internals to io_uring, thanks.

Yeah, I saw the one Jens posted before but I wanted this one to be more
generic, i.e. applicable not only to io_uring. Thanks for taking a look.

-- 
Pavel Begunkov


* Re: (subset) [RFC for-next v2 0/4] enable pcpu bio caching for IRQ I/O
  2022-10-18 19:50 [RFC for-next v2 0/4] enable pcpu bio caching for IRQ I/O Pavel Begunkov
                   ` (4 preceding siblings ...)
  2022-10-20  8:32 ` [RFC for-next v2 0/4] enable pcpu bio caching for IRQ I/O Christoph Hellwig
@ 2022-10-20 12:50 ` Jens Axboe
  5 siblings, 0 replies; 13+ messages in thread
From: Jens Axboe @ 2022-10-20 12:50 UTC (permalink / raw)
  To: linux-block, Pavel Begunkov; +Cc: io-uring, linux-kernel, linux-fsdevel

On Tue, 18 Oct 2022 20:50:54 +0100, Pavel Begunkov wrote:
> This series implements bio pcpu caching for normal / IRQ-driven I/O
> extending REQ_ALLOC_CACHE currently limited to iopoll. The allocation side
> still only works from non-irq context, which is the reason it's not enabled
> by default, but turning it on for other users (e.g. filesystems) is
> a matter of passing a flag.
> 
> t/io_uring with an Optane SSD setup showed +7% for batches of 32 requests
> and +4.3% for batches of 8.
> 
> [...]

Applied, thanks!

[1/4] bio: safeguard REQ_ALLOC_CACHE bio put
      commit: d4347d50407daea6237872281ece64c4bdf1ec99

Best regards,
-- 
Jens Axboe




* Re: [RFC for-next v2 0/4] enable pcpu bio caching for IRQ I/O
  2022-10-20 12:40   ` Pavel Begunkov
@ 2022-10-20 12:53     ` Jens Axboe
  0 siblings, 0 replies; 13+ messages in thread
From: Jens Axboe @ 2022-10-20 12:53 UTC (permalink / raw)
  To: Pavel Begunkov, Christoph Hellwig
  Cc: linux-block, io-uring, linux-kernel, linux-fsdevel

On 10/20/22 5:40 AM, Pavel Begunkov wrote:
> On 10/20/22 09:32, Christoph Hellwig wrote:
>> On Tue, Oct 18, 2022 at 08:50:54PM +0100, Pavel Begunkov wrote:
>>> This series implements bio pcpu caching for normal / IRQ-driven I/O
>>> extending REQ_ALLOC_CACHE currently limited to iopoll. The allocation side
>>> still only works from non-irq context, which is the reason it's not enabled
>>> by default, but turning it on for other users (e.g. filesystems) is
>>> a matter of passing a flag.
>>>
>>> t/io_uring with an Optane SSD setup showed +7% for batches of 32 requests
>>> and +4.3% for batches of 8.
>>
>> This looks much nicer to me than the previous attempt exposing the bio
>> internals to io_uring, thanks.
> 
> Yeah, I saw the one Jens posted before but I wanted this one to be more
> generic, i.e. applicable not only to io_uring. Thanks for taking a look.

It is indeed better like that, also because we can get rid of the alloc
cache flag long term and just have it be the way that bio allocations
work.

-- 
Jens Axboe



