Linux-Block Archive on lore.kernel.org
 help / color / Atom feed
From: Jens Axboe <axboe@kernel.dk>
To: Pavel Begunkov <asml.silence@gmail.com>, io-uring@vger.kernel.org
Cc: linux-block@vger.kernel.org, jannh@google.com
Subject: Re: [PATCH 3/3] io_uring: add support for backlogged CQ ring
Date: Sat, 9 Nov 2019 07:14:23 -0700
Message-ID: <39b387c8-9440-124a-e491-5847f1d68d2c@kernel.dk> (raw)
In-Reply-To: <f185bc90-da47-473e-f533-162fed2a872d@gmail.com>

On 11/9/19 5:33 AM, Pavel Begunkov wrote:
> On 11/9/2019 3:25 PM, Pavel Begunkov wrote:
>> On 11/7/2019 7:00 PM, Jens Axboe wrote:
>>> Currently we drop completion events, if the CQ ring is full. That's fine
>>> for requests with bounded completion times, but it may make it harder to
>>> use io_uring with networked IO where request completion times are
>>> generally unbounded. Or with POLL, for example, which is also unbounded.
>>>
>>> This patch adds IORING_SETUP_CQ_NODROP, which changes the behavior a bit
>>> for CQ ring overflows. First of all, it doesn't overflow the ring, it
>>> simply stores a backlog of completions that we weren't able to put into
>>> the CQ ring. To prevent the backlog from growing indefinitely, if the
>>> backlog is non-empty, we apply back pressure on IO submissions. Any
>>> attempt to submit new IO with a non-empty backlog will get an -EBUSY
>>> return from the kernel. This is a signal to the application that it has
>>> backlogged CQ events, and that it must reap those before being allowed
>>> to submit more IO.>
>>> Signed-off-by: Jens Axboe <axboe@kernel.dk>
>>> ---
>>>   fs/io_uring.c                 | 103 ++++++++++++++++++++++++++++------
>>>   include/uapi/linux/io_uring.h |   1 +
>>>   2 files changed, 87 insertions(+), 17 deletions(-)
>>>
>>> +static void io_cqring_overflow_flush(struct io_ring_ctx *ctx, bool force)
>>> +{
>>> +	struct io_rings *rings = ctx->rings;
>>> +	struct io_uring_cqe *cqe;
>>> +	struct io_kiocb *req;
>>> +	unsigned long flags;
>>> +	LIST_HEAD(list);
>>> +
>>> +	if (list_empty_careful(&ctx->cq_overflow_list))
>>> +		return;
>>> +	if (ctx->cached_cq_tail - READ_ONCE(rings->cq.head) ==
>>> +	    rings->cq_ring_entries)
>>> +		return;
>>> +
>>> +	spin_lock_irqsave(&ctx->completion_lock, flags);
>>> +
>>> +	while (!list_empty(&ctx->cq_overflow_list)) {
>>> +		cqe = io_get_cqring(ctx);
>>> +		if (!cqe && !force)
>>> +			break;> +
>>> +		req = list_first_entry(&ctx->cq_overflow_list, struct io_kiocb,
>>> +						list);
>>> +		list_move(&req->list, &list);
>>> +		if (cqe) {
>>> +			WRITE_ONCE(cqe->user_data, req->user_data);
>>> +			WRITE_ONCE(cqe->res, req->result);
>>> +			WRITE_ONCE(cqe->flags, 0);
>>> +		}
>>
>> Hmm, second thought. We should account overflow here.
>>
> Clarification: We should account overflow in case of (!cqe).
> 
> i.e.
> if (!cqe) { // else
> 	WRITE_ONCE(ctx->rings->cq_overflow,
> 			atomic_inc_return(&ctx->cached_cq_overflow));
> }

Ah yes, we should, even if this is only the flush path. I'll send out
a patch for that, unless you beat me to it.

-- 
Jens Axboe


  reply index

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-11-07 16:00 [PATCHSET v3 0/3] io_uring CQ ring backpressure Jens Axboe
2019-11-07 16:00 ` [PATCH 1/3] io_uring: make io_cqring_events() take 'ctx' as argument Jens Axboe
2019-11-07 16:00 ` [PATCH 2/3] io_uring: pass in io_kiocb to fill/add CQ handlers Jens Axboe
2019-11-07 16:00 ` [PATCH 3/3] io_uring: add support for backlogged CQ ring Jens Axboe
2019-11-09 12:25   ` Pavel Begunkov
2019-11-09 12:33     ` Pavel Begunkov
2019-11-09 14:14       ` Jens Axboe [this message]
2019-11-08  9:26 ` [PATCHSET v3 0/3] io_uring CQ ring backpressure Pavel Begunkov
  -- strict thread matches above, loose matches on Subject: below --
2019-11-06 23:53 [PATCHSET v2 " Jens Axboe
2019-11-06 23:53 ` [PATCH 3/3] io_uring: add support for backlogged CQ ring Jens Axboe

Reply instructions:

You may reply publically to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=39b387c8-9440-124a-e491-5847f1d68d2c@kernel.dk \
    --to=axboe@kernel.dk \
    --cc=asml.silence@gmail.com \
    --cc=io-uring@vger.kernel.org \
    --cc=jannh@google.com \
    --cc=linux-block@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Linux-Block Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/linux-block/0 linux-block/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 linux-block linux-block/ https://lore.kernel.org/linux-block \
		linux-block@vger.kernel.org
	public-inbox-index linux-block

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.linux-block


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git