Re: [PATCH] io_uring: use task_work for links if possible

From: Jens Axboe <axboe@kernel.dk>
To: Pavel Begunkov <asml.silence@gmail.com>,
	io-uring <io-uring@vger.kernel.org>
Subject: Re: [PATCH] io_uring: use task_work for links if possible
Date: Fri, 26 Jun 2020 14:43:52 -0600	[thread overview]
Message-ID: <22c72f8a-e80d-67ea-4f89-264238e5810d@kernel.dk> (raw)
In-Reply-To: <f6ad4ae4-dc7a-1f39-d4da-40b5d6c04d04@gmail.com>

On 6/26/20 2:27 PM, Pavel Begunkov wrote:
> On 25/06/2020 21:27, Jens Axboe wrote:
>> Currently links are always done in an async fashion, unless we
>> catch them inline after we successfully complete a request without
>> having to resort to blocking. This isn't necessarily the most efficient
>> approach, it'd be more ideal if we could just use the task_work handling
>> for this.
>>
>> Outside of saving an async jump, we can also do less prep work for
>> these kinds of requests.
>>
>> Running dependent links from the task_work handler yields some nice
>> performance benefits. As an example, examples/link-cp from the liburing
>> repository uses read+write links to implement a copy operation. Without
>> this patch, the a cache fold 4G file read from a VM runs in about
>> 3 seconds:
> 
> A few comments below
> 
> 
>>> diff --git a/fs/io_uring.c b/fs/io_uring.c
>> index 0bba12e4e559..389274a078c8 100644
>> --- a/fs/io_uring.c
>> +++ b/fs/io_uring.c
>> @@ -898,6 +898,7 @@ enum io_mem_account {
> ...
>> +static void __io_req_task_submit(struct io_kiocb *req)
>> +{
>> +	struct io_ring_ctx *ctx = req->ctx;
>> +
>> +	__set_current_state(TASK_RUNNING);
>> +	if (!io_sq_thread_acquire_mm(ctx, req)) {
>> +		mutex_lock(&ctx->uring_lock);
>> +		__io_queue_sqe(req, NULL, NULL);
>> +		mutex_unlock(&ctx->uring_lock);
>> +	} else {
>> +		__io_req_task_cancel(req, -EFAULT);
>> +	}
>> +}
>> +
>> +static void io_req_task_submit(struct callback_head *cb)
>> +{
>> +	struct io_kiocb *req = container_of(cb, struct io_kiocb, task_work);
>> +
>> +	__io_req_task_submit(req);
>> +}
>> +
>> +static void io_req_task_queue(struct io_kiocb *req)
>> +{
>> +	struct task_struct *tsk = req->task;
>> +	int ret;
>> +
>> +	init_task_work(&req->task_work, io_req_task_submit);
>> +
>> +	ret = task_work_add(tsk, &req->task_work, true);
>> +	if (unlikely(ret)) {
>> +		init_task_work(&req->task_work, io_req_task_cancel);
> 
> Why not to kill it off here? It just was nxt, so shouldn't anything like
> NOCANCEL

We don't necessarily know the context, and we'd at least need to check
REQ_F_COMP_LOCKED and propagate. I think it gets ugly really quick,
better to just punt it to clean context.

>> +		tsk = io_wq_get_task(req->ctx->io_wq);
>> +		task_work_add(tsk, &req->task_work, true);
>> +	}
>> +	wake_up_process(tsk);
>> +}
>> +
>>  static void io_free_req(struct io_kiocb *req)
>>  {
>>  	struct io_kiocb *nxt = NULL;
>> @@ -1671,8 +1758,12 @@ static void io_free_req(struct io_kiocb *req)
>>  	io_req_find_next(req, &nxt);
>>  	__io_free_req(req);
>>  
>> -	if (nxt)
>> -		io_queue_async_work(nxt);
>> +	if (nxt) {
>> +		if (nxt->flags & REQ_F_WORK_INITIALIZED)
>> +			io_queue_async_work(nxt);
> 
> Don't think it will work. E.g. io_close_prep() may have set
> REQ_F_WORK_INITIALIZED but without io_req_work_grab_env().

This really doesn't change the existing path, it just makes sure we
don't do io_req_task_queue() on something that has already modified
->work (and hence, ->task_work). This might miss cases where we have
only cleared it and done nothing else, but that just means we'll have
cases that we could potentially improve the effiency of down the line.

>> -static inline void req_set_fail_links(struct io_kiocb *req)
>> -{
>> -	if ((req->flags & (REQ_F_LINK | REQ_F_HARDLINK)) == REQ_F_LINK)
>> -		req->flags |= REQ_F_FAIL_LINK;
>> -}
>> -
> 
> I think it'd be nicer in 2 patches, first moving io_sq_thread_drop_mm, etc. up.
> And the second one doing actual work. 

Yeah I agree...

>>  static void io_complete_rw_common(struct kiocb *kiocb, long res,
>>  				  struct io_comp_state *cs)
>>  {
>> @@ -2035,35 +2120,6 @@ static void io_complete_rw_common(struct kiocb *kiocb, long res,
>>  	__io_req_complete(req, res, cflags, cs);
>>  }
>>  
> ...
>>  	switch (req->opcode) {
>>  	case IORING_OP_NOP:
>> @@ -5347,7 +5382,7 @@ static int io_req_defer(struct io_kiocb *req, const struct io_uring_sqe *sqe)
>>  	if (!req->io) {
>>  		if (io_alloc_async_ctx(req))
>>  			return -EAGAIN;
>> -		ret = io_req_defer_prep(req, sqe);
>> +		ret = io_req_defer_prep(req, sqe, true);
> 
> Why head of a link is for_async?

True, that could be false instead.

Since these are just minor things, we can do a fix on top. I don't want
to reshuffle this unless I have to.

-- 
Jens Axboe