Linux-Fsdevel Archive on lore.kernel.org
 help / color / Atom feed
From: Pavel Begunkov <asml.silence@gmail.com>
To: Jens Axboe <axboe@kernel.dk>, io-uring@vger.kernel.org
Cc: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-mm@kvack.org
Subject: Re: [PATCH 12/12] io_uring: support true async buffered reads, if file provides it
Date: Tue, 26 May 2020 10:44:00 +0300
Message-ID: <a8212987-bd06-5c67-73d7-e77a654df4ac@gmail.com> (raw)
In-Reply-To: <717e474a-5168-8e1e-2e02-c1bdff007bd9@kernel.dk>

On 25/05/2020 22:59, Jens Axboe wrote:
> On 5/25/20 1:29 AM, Pavel Begunkov wrote:
>> On 23/05/2020 21:57, Jens Axboe wrote:
>>> If the file is flagged with FMODE_BUF_RASYNC, then we don't have to punt
>>> the buffered read to an io-wq worker. Instead we can rely on page
>>> unlocking callbacks to support retry based async IO. This is a lot more
>>> efficient than doing async thread offload.
>>>
>>> The retry is done similarly to how we handle poll based retry. From
>>> the unlock callback, we simply queue the retry to a task_work based
>>> handler.
>>>
>>> Signed-off-by: Jens Axboe <axboe@kernel.dk>
>>> ---
>>>  fs/io_uring.c | 99 +++++++++++++++++++++++++++++++++++++++++++++++++++
>>>  1 file changed, 99 insertions(+)
>>>
>> ...
>>> +
>>> +	init_task_work(&rw->task_work, io_async_buf_retry);
>>> +	/* submit ref gets dropped, acquire a new one */
>>> +	refcount_inc(&req->refs);
>>> +	tsk = req->task;
>>> +	ret = task_work_add(tsk, &rw->task_work, true);
>>> +	if (unlikely(ret)) {
>>> +		/* queue just for cancelation */
>>> +		init_task_work(&rw->task_work, io_async_buf_cancel);
>>> +		tsk = io_wq_get_task(req->ctx->io_wq);
>>
>> IIRC, task will be put somewhere around io_free_req(). Then shouldn't here be
>> some juggling with reassigning req->task with task_{get,put}()?
> 
> Not sure I follow? Yes, we'll put this task again when the request
> is freed, but not sure what you mean with juggling?

I meant something like:

...
/* queue just for cancelation */
init_task_work(&rw->task_work, io_async_buf_cancel);
+ put_task_struct(req->task);
+ req->task = get_task_struct(io_wq_task);


but, thinking twice, if I got the whole idea right, it should be ok as is --
io-wq won't go away before the request anyway, and leaving req->task pinned down
for a bit is not a problem.

>>> +		task_work_add(tsk, &rw->task_work, true);
>>> +	}
>>> +	wake_up_process(tsk);
>>> +	return 1;
>>> +}
>> ...
>>>  static int io_read(struct io_kiocb *req, bool force_nonblock)
>>>  {
>>>  	struct iovec inline_vecs[UIO_FASTIOV], *iovec = inline_vecs;
>>> @@ -2601,6 +2696,7 @@ static int io_read(struct io_kiocb *req, bool force_nonblock)
>>>  	if (!ret) {
>>>  		ssize_t ret2;
>>>  
>>> +retry:
>>>  		if (req->file->f_op->read_iter)
>>>  			ret2 = call_read_iter(req->file, kiocb, &iter);
>>>  		else
>>> @@ -2619,6 +2715,9 @@ static int io_read(struct io_kiocb *req, bool force_nonblock)
>>>  			if (!(req->flags & REQ_F_NOWAIT) &&
>>>  			    !file_can_poll(req->file))
>>>  				req->flags |= REQ_F_MUST_PUNT;
>>> +			if (io_rw_should_retry(req))
>>
>> It looks like a state machine with IOCB_WAITQ and gotos. Wouldn't it be cleaner
>> to call call_read_iter()/loop_rw_iter() here directly instead of "goto retry" ?
> 
> We could, probably making that part a separate helper then. How about the
> below incremental?

IMHO, it was easy to get lost with such implicit state switching.
Looks better now! See a small comment below.

> 
>> BTW, can this async stuff return -EAGAIN ?
> 
> Probably? Prefer not to make any definitive calls on that being possible or
> not, as it's sure to disappoint. If it does and IOCB_WAITQ is already set,
> then we'll punt to a thread like before.

Sounds reasonable

> 
> 
> diff --git a/fs/io_uring.c b/fs/io_uring.c
> index a5a4d9602915..669dccd81207 100644
> --- a/fs/io_uring.c
> +++ b/fs/io_uring.c
> @@ -2677,6 +2677,13 @@ static bool io_rw_should_retry(struct io_kiocb *req)
>  	return false;
>  }
>  
> +static int __io_read(struct io_kiocb *req, struct iov_iter *iter)
> +{
> +	if (req->file->f_op->read_iter)
> +		return call_read_iter(req->file, &req->rw.kiocb, iter);
> +	return loop_rw_iter(READ, req->file, &req->rw.kiocb, iter);
> +}
> +
>  static int io_read(struct io_kiocb *req, bool force_nonblock)
>  {
>  	struct iovec inline_vecs[UIO_FASTIOV], *iovec = inline_vecs;
> @@ -2710,11 +2717,7 @@ static int io_read(struct io_kiocb *req, bool force_nonblock)
>  	if (!ret) {
>  		ssize_t ret2;
>  
> -retry:
> -		if (req->file->f_op->read_iter)
> -			ret2 = call_read_iter(req->file, kiocb, &iter);
> -		else
> -			ret2 = loop_rw_iter(READ, req->file, kiocb, &iter);
> +		ret2 = __io_read(req, &iter);
>  
>  		/* Catch -EAGAIN return for forced non-blocking submission */
>  		if (!force_nonblock || ret2 != -EAGAIN) {
> @@ -2729,8 +2732,11 @@ static int io_read(struct io_kiocb *req, bool force_nonblock)
>  			if (!(req->flags & REQ_F_NOWAIT) &&
>  			    !file_can_poll(req->file))
>  				req->flags |= REQ_F_MUST_PUNT;
> -			if (io_rw_should_retry(req))
> -				goto retry;
> +			if (io_rw_should_retry(req)) {
> +				ret2 = __io_read(req, &iter);
> +				if (ret2 != -EAGAIN)
> +					goto out_free;

"goto out_free" returns ret=0, so someone should add a cqe

if (ret2 != -EAGAIN) {
	kiocb_done(kiocb, ret2);
	goto free_out;
}


> +			}
>  			kiocb->ki_flags &= ~IOCB_WAITQ;
>  			return -EAGAIN;
>  		}
> 

-- 
Pavel Begunkov

  reply index

Thread overview: 29+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-05-23 18:57 [PATCHSET v2 0/12] Add support for async buffered reads Jens Axboe
2020-05-23 18:57 ` [PATCH 01/12] block: read-ahead submission should imply no-wait as well Jens Axboe
2020-05-23 18:57 ` [PATCH 02/12] mm: allow read-ahead with IOCB_NOWAIT set Jens Axboe
2020-05-23 18:57 ` [PATCH 03/12] mm: abstract out wake_page_match() from wake_page_function() Jens Axboe
2020-05-23 18:57 ` [PATCH 04/12] mm: add support for async page locking Jens Axboe
2020-05-23 18:57 ` [PATCH 05/12] mm: support async buffered reads in generic_file_buffered_read() Jens Axboe
2020-05-24 14:05   ` Trond Myklebust
2020-05-24 16:30     ` Jens Axboe
2020-05-24 16:40       ` Jens Axboe
2020-05-24 17:11         ` Trond Myklebust
2020-05-24 17:12           ` Jens Axboe
2020-05-23 18:57 ` [PATCH 06/12] fs: add FMODE_BUF_RASYNC Jens Axboe
2020-05-23 18:57 ` [PATCH 07/12] ext4: flag as supporting buffered async reads Jens Axboe
2020-05-23 18:57 ` [PATCH 08/12] block: flag block devices as supporting IOCB_WAITQ Jens Axboe
2020-05-23 18:57 ` [PATCH 09/12] xfs: flag files as supporting buffered async reads Jens Axboe
2020-05-23 18:57 ` [PATCH 10/12] btrfs: " Jens Axboe
2020-05-23 18:57 ` [PATCH 11/12] mm: add kiocb_wait_page_queue_init() helper Jens Axboe
2020-05-23 18:57 ` [PATCH 12/12] io_uring: support true async buffered reads, if file provides it Jens Axboe
2020-05-25  7:29   ` Pavel Begunkov
2020-05-25 19:59     ` Jens Axboe
2020-05-26  7:44       ` Pavel Begunkov [this message]
2020-05-26 13:50         ` Jens Axboe
2020-05-26  7:38   ` Pavel Begunkov
2020-05-26 13:47     ` Jens Axboe
2020-05-23 19:20 ` [PATCHSET v2 0/12] Add support for async buffered reads Jens Axboe
2020-05-24  9:46   ` Chris Panayis
2020-05-24 19:24     ` Jens Axboe
2020-05-24 19:21 [PATCHSET v4 " Jens Axboe
2020-05-24 19:22 ` [PATCH 12/12] io_uring: support true async buffered reads, if file provides it Jens Axboe
2020-05-26 19:51 [PATCHSET v5 0/12] Add support for async buffered reads Jens Axboe
2020-05-26 19:51 ` [PATCH 12/12] io_uring: support true async buffered reads, if file provides it Jens Axboe

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=a8212987-bd06-5c67-73d7-e77a654df4ac@gmail.com \
    --to=asml.silence@gmail.com \
    --cc=axboe@kernel.dk \
    --cc=io-uring@vger.kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Linux-Fsdevel Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/linux-fsdevel/0 linux-fsdevel/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 linux-fsdevel linux-fsdevel/ https://lore.kernel.org/linux-fsdevel \
		linux-fsdevel@vger.kernel.org
	public-inbox-index linux-fsdevel

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.linux-fsdevel


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git