Linux-Block Archive on lore.kernel.org
From: Pavel Begunkov <asml.silence@gmail.com>
To: Jens Axboe <axboe@kernel.dk>, io-uring@vger.kernel.org
Cc: linux-block@vger.kernel.org, jannh@google.com
Subject: Re: [PATCH 3/3] io_uring: add support for backlogged CQ ring
Date: Sat, 9 Nov 2019 15:33:18 +0300
Message-ID: <f185bc90-da47-473e-f533-162fed2a872d@gmail.com>
In-Reply-To: <e9469ed1-dec0-c8ee-ee0a-5e81ee10d1bc@gmail.com>

On 11/9/2019 3:25 PM, Pavel Begunkov wrote:
> On 11/7/2019 7:00 PM, Jens Axboe wrote:
>> Currently we drop completion events, if the CQ ring is full. That's fine
>> for requests with bounded completion times, but it may make it harder to
>> use io_uring with networked IO where request completion times are
>> generally unbounded. Or with POLL, for example, which is also unbounded.
>>
>> This patch adds IORING_SETUP_CQ_NODROP, which changes the behavior a bit
>> for CQ ring overflows. First of all, it doesn't overflow the ring, it
>> simply stores a backlog of completions that we weren't able to put into
>> the CQ ring. To prevent the backlog from growing indefinitely, if the
>> backlog is non-empty, we apply back pressure on IO submissions. Any
>> attempt to submit new IO with a non-empty backlog will get an -EBUSY
>> return from the kernel. This is a signal to the application that it has
>> backlogged CQ events, and that it must reap those before being allowed
>> to submit more IO.
>>
>> Signed-off-by: Jens Axboe <axboe@kernel.dk>
>> ---
>>  fs/io_uring.c                 | 103 ++++++++++++++++++++++++++++------
>>  include/uapi/linux/io_uring.h |   1 +
>>  2 files changed, 87 insertions(+), 17 deletions(-)
>>
>> +static void io_cqring_overflow_flush(struct io_ring_ctx *ctx, bool force)
>> +{
>> +	struct io_rings *rings = ctx->rings;
>> +	struct io_uring_cqe *cqe;
>> +	struct io_kiocb *req;
>> +	unsigned long flags;
>> +	LIST_HEAD(list);
>> +
>> +	if (list_empty_careful(&ctx->cq_overflow_list))
>> +		return;
>> +	if (ctx->cached_cq_tail - READ_ONCE(rings->cq.head) ==
>> +	    rings->cq_ring_entries)
>> +		return;
>> +
>> +	spin_lock_irqsave(&ctx->completion_lock, flags);
>> +
>> +	while (!list_empty(&ctx->cq_overflow_list)) {
>> +		cqe = io_get_cqring(ctx);
>> +		if (!cqe && !force)
>> +			break;
>> +
>> +		req = list_first_entry(&ctx->cq_overflow_list, struct io_kiocb,
>> +						list);
>> +		list_move(&req->list, &list);
>> +		if (cqe) {
>> +			WRITE_ONCE(cqe->user_data, req->user_data);
>> +			WRITE_ONCE(cqe->res, req->result);
>> +			WRITE_ONCE(cqe->flags, 0);
>> +		}
> 
> Hmm, second thought. We should account overflow here.
> 
Clarification: we should account for the overflow in the (!cqe) case.

i.e.
if (!cqe) { // else
	WRITE_ONCE(ctx->rings->cq_overflow,
			atomic_inc_return(&ctx->cached_cq_overflow));
}

>> +	}
>> +
>> +	io_commit_cqring(ctx);
>> +	spin_unlock_irqrestore(&ctx->completion_lock, flags);
>> +	io_cqring_ev_posted(ctx);
>> +
>> +	while (!list_empty(&list)) {
>> +		req = list_first_entry(&list, struct io_kiocb, list);
>> +		list_del(&req->list);
>> +		io_put_req(req, NULL);
>> +	}
>> +}
>> +
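
To make the suggested accounting easier to see outside the kernel, here is
a minimal userspace sketch of the flush loop. It is only a model under
stated assumptions, not the patch itself: struct ring, get_cqe(),
overflow_flush(), CQ_ENTRIES and MAX_BACKLOG are hypothetical names, the
backlog is a plain array rather than ctx->cq_overflow_list, and there is
no locking or memory ordering. The only point is the else branch, which
counts a completion that a forced flush has to drop.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define CQ_ENTRIES  2u	/* ring size, power of two (assumed for the demo) */
#define MAX_BACKLOG 8u	/* backlog capacity (assumed for the demo) */

struct cqe {
	unsigned long long user_data;
	int res;
};

struct ring {
	struct cqe cqes[CQ_ENTRIES];
	unsigned head;		/* consumer index, advanced by the application */
	unsigned tail;		/* producer index, advanced by the flush side */
	unsigned overflow;	/* stands in for rings->cq_overflow */
	struct cqe backlog[MAX_BACKLOG];	/* stands in for cq_overflow_list */
	size_t backlog_len;
};

/* Return the next free CQ slot, or NULL if the ring is full. */
static struct cqe *get_cqe(struct ring *r)
{
	if (r->tail - r->head == CQ_ENTRIES)
		return NULL;
	return &r->cqes[r->tail++ & (CQ_ENTRIES - 1)];
}

/*
 * Move backlogged completions into the ring. Without 'force' the loop
 * stops as soon as the ring is full. With 'force' the remaining entries
 * are dropped, and every drop is counted in r->overflow.
 */
static void overflow_flush(struct ring *r, bool force)
{
	size_t i = 0;

	assert(r->backlog_len <= MAX_BACKLOG);

	while (i < r->backlog_len) {
		struct cqe *cqe = get_cqe(r);

		if (!cqe && !force)
			break;
		if (cqe)
			*cqe = r->backlog[i];
		else
			r->overflow++;	/* account the dropped completion */
		i++;
	}

	/* Keep the entries that were not consumed at the front. */
	for (size_t j = i; j < r->backlog_len; j++)
		r->backlog[j - i] = r->backlog[j];
	r->backlog_len -= i;
}
```

With a two-entry ring and five backlogged completions, a non-forced
overflow_flush() moves two entries and leaves three in the backlog. A
forced flush while the ring is still full drops those three and leaves
the overflow counter at 3, so the application can tell that events were
lost.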

-- 
Pavel Begunkov


Thread overview: 9+ messages
2019-11-07 16:00 [PATCHSET v3 0/3] io_uring CQ ring backpressure Jens Axboe
2019-11-07 16:00 ` [PATCH 1/3] io_uring: make io_cqring_events() take 'ctx' as argument Jens Axboe
2019-11-07 16:00 ` [PATCH 2/3] io_uring: pass in io_kiocb to fill/add CQ handlers Jens Axboe
2019-11-07 16:00 ` [PATCH 3/3] io_uring: add support for backlogged CQ ring Jens Axboe
2019-11-09 12:25   ` Pavel Begunkov
2019-11-09 12:33     ` Pavel Begunkov [this message]
2019-11-09 14:14       ` Jens Axboe
2019-11-08  9:26 ` [PATCHSET v3 0/3] io_uring CQ ring backpressure Pavel Begunkov
  -- strict thread matches above, loose matches on Subject: below --
2019-11-06 23:53 [PATCHSET v2 " Jens Axboe
2019-11-06 23:53 ` [PATCH 3/3] io_uring: add support for backlogged CQ ring Jens Axboe
