io-uring.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCHSET 0/5] Allow batching of inline completions
@ 2020-06-23 15:16 Jens Axboe
  2020-06-23 15:16 ` [PATCH 1/5] io_uring: provide generic io_req_complete() helper Jens Axboe
                   ` (4 more replies)
  0 siblings, 5 replies; 6+ messages in thread
From: Jens Axboe @ 2020-06-23 15:16 UTC (permalink / raw)
  To: io-uring; +Cc: xuanzhuo, Dust.li

As Xuan Zhuo <xuanzhuo@linux.alibaba.com> reported here:

https://lore.kernel.org/io-uring/34ecb5c9-5822-827f-6e7b-973bea543569@kernel.dk/T/#me32d6897f976e8268284ff5cbdb3696010c2b7ba

we can do a bit better when dealing with inline completions from the
submission path. This patchset cleans up the standard completion
logic, then builds on top of that to allow collecting completions done
at submission time. This allows io_uring to amortize the cost of needing
to grab the completion lock, and updating the CQ ring as well.

On a silly t/io_uring NOP test on my laptop, this brings about a 20%
increase in performance. Xuan Zhuo reports that it changes his SQPOLL
based UDP processing (running at 800K PPS) profile from:

17.97% [kernel] [k] copy_user_generic_unrolled
13.92% [kernel] [k] io_commit_cqring
11.04% [kernel] [k] __io_cqring_fill_event
10.33% [kernel] [k] udp_recvmsg
 5.94% [kernel] [k] skb_release_data
 4.31% [kernel] [k] udp_rmem_release
 2.68% [kernel] [k] __check_object_size
 2.24% [kernel] [k] __slab_free
 2.22% [kernel] [k] _raw_spin_lock_bh
 2.21% [kernel] [k] kmem_cache_free
 2.13% [kernel] [k] free_pcppages_bulk
 1.83% [kernel] [k] io_submit_sqes
 1.38% [kernel] [k] page_frag_free
 1.31% [kernel] [k] inet_recvmsg

to

19.99% [kernel] [k] copy_user_generic_unrolled
11.63% [kernel] [k] skb_release_data
 9.36% [kernel] [k] udp_rmem_release
 8.64% [kernel] [k] udp_recvmsg
 6.21% [kernel] [k] __slab_free
 4.39% [kernel] [k] __check_object_size
 3.64% [kernel] [k] free_pcppages_bulk
 2.41% [kernel] [k] kmem_cache_free
 2.00% [kernel] [k] io_submit_sqes
 1.95% [kernel] [k] page_frag_free
 1.54% [kernel] [k] io_put_req
[...]
 0.07% [kernel] [k] io_commit_cqring
 0.44% [kernel] [k] __io_cqring_fill_event

which looks much nicer.

Patches are against my for-5.9/io_uring branch.

-- 
Jens Axboe



^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2020-06-23 15:16 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-06-23 15:16 [PATCHSET 0/5] Allow batching of inline completions Jens Axboe
2020-06-23 15:16 ` [PATCH 1/5] io_uring: provide generic io_req_complete() helper Jens Axboe
2020-06-23 15:16 ` [PATCH 2/5] io_uring: add 'io_comp_state' to struct io_submit_state Jens Axboe
2020-06-23 15:16 ` [PATCH 3/5] io_uring: pass down completion state on the issue side Jens Axboe
2020-06-23 15:16 ` [PATCH 4/5] io_uring: pass in completion state to appropriate issue side handlers Jens Axboe
2020-06-23 15:16 ` [PATCH 5/5] io_uring: enable READ/WRITE to use deferred completions Jens Axboe

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).