All of lore.kernel.org
 help / color / mirror / Atom feed
From: Pavel Begunkov <asml.silence@gmail.com>
To: Olivier Langlois <olivier@trillion01.com>,
	Jens Axboe <axboe@kernel.dk>,
	io-uring@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH] io_uring: store back buffer in case of failure
Date: Tue, 15 Jun 2021 11:01:18 +0100	[thread overview]
Message-ID: <93256513-08d8-5b15-aa98-c1e83af60b54@gmail.com> (raw)
In-Reply-To: <60c83c12.1c69fb81.e3bea.0806SMTPIN_ADDED_MISSING@mx.google.com>

On 6/11/21 8:46 PM, Olivier Langlois wrote:
> the man page says the following:
> 
>     If succesful, the resulting CQE will have IORING_CQE_F_BUFFER set
> in the flags part of the struct, and the upper IORING_CQE_BUFFER_SHIFT
> bits will contain the ID of the selected buffers.
> 
> in order to respect this contract, the buffer is stored back in case of
> an error. There are several reasons to do that:
> 
> 1. doing otherwise is counter-intuitive and error-prone (I cannot think
> of a single example of a syscall failing and still require the user to
> free the allocated resources). Especially when the documention
> explicitly mention that this is the behavior to expect.
> 
> 2. it is inefficient because the buffer is unneeded since there is no
> data to transfer back to the user and the buffer will need to be
> returned back to io_uring to avoid a leak.
> 
> Signed-off-by: Olivier Langlois <olivier@trillion01.com>
> ---
>  fs/io_uring.c | 97 +++++++++++++++++++++++++++++++++------------------
>  1 file changed, 64 insertions(+), 33 deletions(-)
> 
> diff --git a/fs/io_uring.c b/fs/io_uring.c
> index 42380ed563c4..502d7cd81a8c 100644
> --- a/fs/io_uring.c
> +++ b/fs/io_uring.c
[...]
> +static unsigned int io_put_kbuf(struct io_kiocb *req, struct io_buffer *kbuf,
> +				u16 bgid, long res, unsigned int issue_flags)
>  {
>  	unsigned int cflags;
> +	struct io_ring_ctx *ctx = req->ctx;
> +	struct io_buffer *head;
> +	bool force_nonblock = issue_flags & IO_URING_F_NONBLOCK;
>  
>  	cflags = kbuf->bid << IORING_CQE_BUFFER_SHIFT;
>  	cflags |= IORING_CQE_F_BUFFER;
>  	req->flags &= ~REQ_F_BUFFER_SELECTED;
> -	kfree(kbuf);
> +
> +	/*
> +	 * Theoritically, res == 0 could be included as well but that would
> +	 * break the contract established in the man page saying that
> +	 * a buffer is returned if the operation is successful.
> +	 */
> +	if (unlikely(res < 0)) {
> +		io_ring_submit_lock(ctx, !force_nonblock);

io_complete_rw() is called from an IRQ context, so it can't sleep/wait.

> +
> +		lockdep_assert_held(&ctx->uring_lock);
> +
> +		head = xa_load(&ctx->io_buffers, bgid);
> +		if (head) {
> +			list_add_tail(&kbuf->list, &head->list);
> +			cflags = 0;
> +		} else {
> +			INIT_LIST_HEAD(&kbuf->list);
> +			if (!xa_insert(&ctx->io_buffers, bgid, kbuf, GFP_KERNEL))
> +				cflags = 0;
> +		}
> +		io_ring_submit_unlock(ctx, !force_nonblock);
> +	}
> +	if (cflags)
> +		kfree(kbuf);
>  	return cflags;
>  }
>  
[...]
> -static inline void io_rw_done(struct kiocb *kiocb, ssize_t ret)
> +static inline void io_rw_done(struct kiocb *kiocb, ssize_t ret,
> +			      unsigned int issue_flags)
>  {
>  	switch (ret) {
>  	case -EIOCBQUEUED:
> @@ -2728,7 +2775,7 @@ static inline void io_rw_done(struct kiocb *kiocb, ssize_t ret)
>  		ret = -EINTR;
>  		fallthrough;
>  	default:
> -		kiocb->ki_complete(kiocb, ret, 0);
> +		kiocb->ki_complete(kiocb, ret, issue_flags);

Don't remember what the second argument of .ki_complete is for,
but it definitely should not be used for issue_flags. E.g. because
we get two different meanings for it depending on a context --
either a result from the block layer or issue_flags.

-- 
Pavel Begunkov

       reply	other threads:[~2021-06-15 10:01 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <60c83c12.1c69fb81.e3bea.0806SMTPIN_ADDED_MISSING@mx.google.com>
2021-06-15 10:01 ` Pavel Begunkov [this message]
2021-06-15 21:51   ` [PATCH] io_uring: store back buffer in case of failure Jens Axboe
2021-06-16 13:42     ` Olivier Langlois
2021-06-16 14:01       ` Pavel Begunkov
2021-06-16 14:44         ` Jens Axboe
2021-06-16 15:37           ` Pavel Begunkov
2021-06-18 21:26             ` Olivier Langlois

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=93256513-08d8-5b15-aa98-c1e83af60b54@gmail.com \
    --to=asml.silence@gmail.com \
    --cc=axboe@kernel.dk \
    --cc=io-uring@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=olivier@trillion01.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.