On 28/12/2019 20:03, Jens Axboe wrote:
> On 12/28/19 4:15 AM, Pavel Begunkov wrote:
>> On 28/12/2019 14:13, Pavel Begunkov wrote:
>>> percpu_ref_tryget() has its own overhead. Instead getting a reference
>>> for each request, grab a bunch once per io_submit_sqes().
>>>
>>> ~5% throughput boost for a "submit and wait 128 nops" benchmark.
>>>
>>> Signed-off-by: Pavel Begunkov
>>> ---
>>>  fs/io_uring.c | 26 +++++++++++++++++---------
>>>  1 file changed, 17 insertions(+), 9 deletions(-)
>>>
>>> diff --git a/fs/io_uring.c b/fs/io_uring.c
>>> index 7fc1158bf9a4..404946080e86 100644
>>> --- a/fs/io_uring.c
>>> +++ b/fs/io_uring.c
>>> @@ -1080,9 +1080,6 @@ static struct io_kiocb *io_get_req(struct io_ring_ctx *ctx,
>>>  	gfp_t gfp = GFP_KERNEL | __GFP_NOWARN;
>>>  	struct io_kiocb *req;
>>>
>>> -	if (!percpu_ref_tryget(&ctx->refs))
>>> -		return NULL;
>>> -
>>>  	if (!state) {
>>>  		req = kmem_cache_alloc(req_cachep, gfp);
>>>  		if (unlikely(!req))
>>> @@ -1141,6 +1138,14 @@ static void io_free_req_many(struct io_ring_ctx *ctx, void **reqs, int *nr)
>>>  	}
>>>  }
>>>
>>> +static void __io_req_free_empty(struct io_kiocb *req)
>>
>> If anybody have better naming (or a better approach at all), I'm all ears.
>
> __io_req_do_free()?

Not quite clear what's the difference with __io_req_free() then

> I think that's better than the empty, not quite sure what that means.

Probably, so. It was kind of "request without a bound sqe".
Does io_free_{hollow,empty}_req() sound better?

> If you're fine with that, I can just make that edit when applying.
> The rest looks fine to me now.
>

Please do

-- 
Pavel Begunkov
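
For readers following the thread, the idea in the commit message (take the references for a whole submission batch up front instead of one per request) can be sketched in userspace C. This is only an illustration and not the code from the patch, whose diff is truncated in the quote above. struct ctx, ctx_tryget_many(), ctx_put_many() and submit_batch() are hypothetical names, and a single atomic counter stands in for the kernel's percpu_ref, which works differently internally.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a ring context that owns a reference count.
 * A plain atomic counter is assumed here purely to show the pattern. */
struct ctx {
	atomic_long refs;
	atomic_bool dying;
};

/* Take nr references with one atomic operation. Fails once teardown has
 * started. Simplified: a real implementation must close the window
 * between the check and the add. */
static bool ctx_tryget_many(struct ctx *c, long nr)
{
	if (atomic_load(&c->dying))
		return false;
	atomic_fetch_add(&c->refs, nr);
	return true;
}

/* Drop nr references with one atomic operation. */
static void ctx_put_many(struct ctx *c, long nr)
{
	atomic_fetch_sub(&c->refs, nr);
}

/* Submit up to nr requests, of which only `ok` can actually be set up
 * (for example because allocation fails part-way through).
 * Returns the number submitted, or -1 if no references could be taken. */
static long submit_batch(struct ctx *c, long nr, long ok)
{
	long submitted = 0;

	assert(nr >= 0);

	/* One reference grab for the whole batch, not one per request. */
	if (!ctx_tryget_many(c, nr))
		return -1;

	/* Each request created here owns one of the batch references. */
	while (submitted < nr && submitted < ok)
		submitted++;

	/* Hand back references reserved for requests never created. */
	if (submitted < nr)
		ctx_put_many(c, nr - submitted);

	return submitted;
}
```

With a zero-initialised context, submit_batch(&c, 128, 128) leaves 128 references held, one per in-flight request. A batch that stops early returns the surplus in a single put, so the count always matches the number of live requests.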