* [PATCH v2 0/2] bio put in-IRQ caching optimisation
@ 2024-02-07 14:14 Pavel Begunkov
  2024-02-07 14:14 ` [PATCH v2 1/2] block: extend bio caching to task context Pavel Begunkov
                   ` (2 more replies)
  0 siblings, 3 replies; 4+ messages in thread
From: Pavel Begunkov @ 2024-02-07 14:14 UTC (permalink / raw)
  To: linux-block; +Cc: Jens Axboe, asml.silence, hch

Patch 1 is a preparatory patch that enables caching of !IOPOLL bios
when they are put from task context.

Patch 2 optimises out local_irq_{save,restore}() from bio_put_percpu_cache()
for in-IRQ completions.

v2: Extend caching to the task context

    Move the error path to the end of bio_put_percpu_cache(). It looks
    uglier, but I'm happy to make the change as long as it aligns with
    community standards and helps other people.

Pavel Begunkov (2):
  block: extend bio caching to task context
  block: optimise in irq bio put caching

 block/bio.c | 26 ++++++++++++++------------
 1 file changed, 14 insertions(+), 12 deletions(-)

-- 
2.43.0



* [PATCH v2 1/2] block: extend bio caching to task context
  2024-02-07 14:14 [PATCH v2 0/2] bio put in-IRQ caching optimisation Pavel Begunkov
@ 2024-02-07 14:14 ` Pavel Begunkov
  2024-02-07 14:14 ` [PATCH v2 2/2] block: optimise in irq bio put caching Pavel Begunkov
  2024-02-08 17:19 ` [PATCH v2 0/2] bio put in-IRQ caching optimisation Jens Axboe
  2 siblings, 0 replies; 4+ messages in thread
From: Pavel Begunkov @ 2024-02-07 14:14 UTC (permalink / raw)
  To: linux-block; +Cc: Jens Axboe, asml.silence, hch

bio_put_percpu_cache() puts all non-iopoll bios into the irq-safe list,
which entails disabling irqs. The overhead of that is not that bad when
interrupts are already off, but it gets worse otherwise. We can optimise
it when we're in task context by using ->free_list directly, just as
the IOPOLL path does.
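
For readers less familiar with the cache layout, the task-context path
amounts to a plain LIFO push onto a singly linked list. Below is a
minimal userspace sketch of that push, not the kernel code: struct
bio_model, struct cache_model and cache_push() are hypothetical names
made up for illustration, and the per-CPU pinning done by
get_cpu()/put_cpu() is left out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct bio: only the list link is modelled. */
struct bio_model {
	struct bio_model *next;
};

/* Hypothetical stand-in for the task-context half of the per-CPU cache. */
struct cache_model {
	struct bio_model *free_list;	/* head of the LIFO list */
	unsigned int nr;		/* number of cached entries */
};

/*
 * Push a bio onto the task-context list. The sketch assumes the caller
 * is the only user of this list on the current CPU, so no irq masking
 * is modelled.
 */
static void cache_push(struct cache_model *cache, struct bio_model *bio)
{
	assert(cache != NULL && bio != NULL);

	bio->next = cache->free_list;
	cache->free_list = bio;
	cache->nr++;
}
```

Pushing two entries leaves the most recently pushed one at the head of
the list, which is the same ordering the diff below produces with
bi_next and ->free_list.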

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
---
 block/bio.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/block/bio.c b/block/bio.c
index b9642a41f286..8da941974f88 100644
--- a/block/bio.c
+++ b/block/bio.c
@@ -770,8 +770,9 @@ static inline void bio_put_percpu_cache(struct bio *bio)
 
 	bio_uninit(bio);
 
-	if ((bio->bi_opf & REQ_POLLED) && !WARN_ON_ONCE(in_interrupt())) {
+	if (in_task()) {
 		bio->bi_next = cache->free_list;
+		/* Not necessary but helps not to iopoll already freed bios */
 		bio->bi_bdev = NULL;
 		cache->free_list = bio;
 		cache->nr++;
-- 
2.43.0



* [PATCH v2 2/2] block: optimise in irq bio put caching
  2024-02-07 14:14 [PATCH v2 0/2] bio put in-IRQ caching optimisation Pavel Begunkov
  2024-02-07 14:14 ` [PATCH v2 1/2] block: extend bio caching to task context Pavel Begunkov
@ 2024-02-07 14:14 ` Pavel Begunkov
  2024-02-08 17:19 ` [PATCH v2 0/2] bio put in-IRQ caching optimisation Jens Axboe
  2 siblings, 0 replies; 4+ messages in thread
From: Pavel Begunkov @ 2024-02-07 14:14 UTC (permalink / raw)
  To: linux-block; +Cc: Jens Axboe, asml.silence, hch

When adding a bio to ->free_list_irq we protect the list by disabling
irqs. It's likely they're already disabled and the performance of
local_irq_{save,restore}() is decent, but it's not zero cost.

Let's only use the irq cache when we're serving a hard irq, which
allows us to remove local_irq_{save,restore}(), and fall back to
bio_free() in all remaining cases.
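
The resulting three-way decision can be sketched in userspace as
follows. This is an illustrative model under stated assumptions, not
the kernel implementation: enum ctx_model, enum put_result, struct
cache_counts, put_model() and MODEL_CACHE_MAX are hypothetical names,
the execution context is passed in as a parameter instead of being
queried with in_task()/in_hardirq(), and the assert() merely stands in
for lockdep_assert_irqs_disabled().

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical cap standing in for ALLOC_CACHE_MAX; value is illustrative. */
#define MODEL_CACHE_MAX 256U

/* Hypothetical execution contexts; CTX_OTHER covers everything else. */
enum ctx_model { CTX_TASK, CTX_HARDIRQ, CTX_OTHER };

/* Where the bio ended up. */
enum put_result { PUT_TASK_LIST, PUT_IRQ_LIST, PUT_FREED };

/* Only the two counters of the per-CPU cache are modelled. */
struct cache_counts {
	unsigned int nr;	/* entries on the task-context list */
	unsigned int nr_irq;	/* entries on the irq list */
};

static enum put_result put_model(struct cache_counts *c,
				 enum ctx_model ctx, bool irqs_off)
{
	/* Cache is over its cap: free instead of caching. */
	if (c->nr_irq + c->nr > MODEL_CACHE_MAX)
		return PUT_FREED;

	if (ctx == CTX_TASK) {
		c->nr++;
		return PUT_TASK_LIST;
	}

	if (ctx == CTX_HARDIRQ) {
		/* The irq list is only safe to touch with irqs already off. */
		assert(irqs_off);
		c->nr_irq++;
		return PUT_IRQ_LIST;
	}

	/* Any other context falls back to freeing the bio. */
	return PUT_FREED;
}
```

In this model neither caching branch toggles the irq state, which is
the property the patch is after.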

Profiles indicate that the bio_put() cost is reduced by a factor of ~3.5
(1.76% -> 0.49%), and the total throughput of a CPU-bound benchmark
improves by around 1% (t/io_uring with high QD and several drives).

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
---
 block/bio.c | 23 ++++++++++++-----------
 1 file changed, 12 insertions(+), 11 deletions(-)

diff --git a/block/bio.c b/block/bio.c
index 8da941974f88..00847ff1415c 100644
--- a/block/bio.c
+++ b/block/bio.c
@@ -762,30 +762,31 @@ static inline void bio_put_percpu_cache(struct bio *bio)
 	struct bio_alloc_cache *cache;
 
 	cache = per_cpu_ptr(bio->bi_pool->cache, get_cpu());
-	if (READ_ONCE(cache->nr_irq) + cache->nr > ALLOC_CACHE_MAX) {
-		put_cpu();
-		bio_free(bio);
-		return;
-	}
-
-	bio_uninit(bio);
+	if (READ_ONCE(cache->nr_irq) + cache->nr > ALLOC_CACHE_MAX)
+		goto out_free;
 
 	if (in_task()) {
+		bio_uninit(bio);
 		bio->bi_next = cache->free_list;
 		/* Not necessary but helps not to iopoll already freed bios */
 		bio->bi_bdev = NULL;
 		cache->free_list = bio;
 		cache->nr++;
-	} else {
-		unsigned long flags;
+	} else if (in_hardirq()) {
+		lockdep_assert_irqs_disabled();
 
-		local_irq_save(flags);
+		bio_uninit(bio);
 		bio->bi_next = cache->free_list_irq;
 		cache->free_list_irq = bio;
 		cache->nr_irq++;
-		local_irq_restore(flags);
+	} else {
+		goto out_free;
 	}
 	put_cpu();
+	return;
+out_free:
+	put_cpu();
+	bio_free(bio);
 }
 
 /**
-- 
2.43.0



* Re: [PATCH v2 0/2] bio put in-IRQ caching optimisation
  2024-02-07 14:14 [PATCH v2 0/2] bio put in-IRQ caching optimisation Pavel Begunkov
  2024-02-07 14:14 ` [PATCH v2 1/2] block: extend bio caching to task context Pavel Begunkov
  2024-02-07 14:14 ` [PATCH v2 2/2] block: optimise in irq bio put caching Pavel Begunkov
@ 2024-02-08 17:19 ` Jens Axboe
  2 siblings, 0 replies; 4+ messages in thread
From: Jens Axboe @ 2024-02-08 17:19 UTC (permalink / raw)
  To: linux-block, Pavel Begunkov; +Cc: hch


On Wed, 07 Feb 2024 14:14:27 +0000, Pavel Begunkov wrote:
> Patch 1 is a preparation patch, which enables caching of !IOPOLL bios
> for the task context execution.
> 
> Patch 2 optimise out local_irq_{save,restore}() from bio_put_percpu_cache()
> for in-IRQ completions.
> 
> v2: Extend caching to the task context
> 
> [...]

Applied, thanks!

[1/2] block: extend bio caching to task context
      commit: c9f5f3aa19c617fe85085b19abbf7a9a077336d0
[2/2] block: optimise in irq bio put caching
      commit: e516c3fc6c182736aec5418a73f15199640491e2

Best regards,
-- 
Jens Axboe



