IO-Uring Archive on lore.kernel.org
From: "Carter Li 李通洲" <carter.li@eoitek.com>
To: Pavel Begunkov <asml.silence@gmail.com>
Cc: io-uring <io-uring@vger.kernel.org>
Subject: Re: [FEATURE REQUEST] Specify a sqe won't generate a cqe
Date: Fri, 14 Feb 2020 11:27:58 +0000
Message-ID: <7C48911C-9C0F-42E1-90DA-7C277E37D986@eoitek.com>
In-Reply-To: <30d88cf3-527e-4396-4934-fff13c449a80@gmail.com>


> On Feb 14, 2020, at 6:34 PM, Pavel Begunkov <asml.silence@gmail.com> wrote:
> 
> On 2/14/2020 11:29 AM, Carter Li 李通洲 wrote:
>> To implement io_uring_wait_cqe_timeout, we introduced a magic number
>> called `LIBURING_UDATA_TIMEOUT`. The problem is that not only must we
>> make sure that users never set sqe->user_data to
>> LIBURING_UDATA_TIMEOUT, but we also add extra complexity to
>> filter out TIMEOUT cqes.
>> 
>> Former discussion: https://github.com/axboe/liburing/issues/53
>> 
>> I’m suggesting introducing a new SQE flag called IOSQE_IGNORE_CQE
>> to solve this problem.
>> 
>> An sqe tagged with the IOSQE_IGNORE_CQE flag won’t generate a cqe
>> on completion, so IORING_OP_TIMEOUT cqes can be filtered out on the
>> kernel side.
>> 
>> In addition, `IOSQE_IGNORE_CQE` can be used to save cq ring space.
>> 
>> For example, in a `POLL_ADD(POLLIN)->READ/RECV` link chain, people
>> usually don’t care what the result of `POLL_ADD` is (since it will
>> always be POLLIN), so `IOSQE_IGNORE_CQE` can be set on `POLL_ADD` to
>> save a lot of cq ring space.
>> 
>> Besides POLL_ADD, people usually don’t care about the result of
>> POLL_REMOVE/TIMEOUT_REMOVE/ASYNC_CANCEL/CLOSE. These operations can
>> also be tagged with IOSQE_IGNORE_CQE.
>> 
>> Thoughts?
>> 
> 
> I like the idea! And that's one of my TODOs for the eBPF plans.
> Let me list my use cases, so we can think how to extend it a bit.
> 
> 1. In case of link failure, we need to reap all the -ECANCELED CQEs,
> analyse them and resubmit the rest. It's quite inconvenient. We may
> want to have CQEs only for requests that were not cancelled.
> 
> 2. When a chain succeeds, in most cases you already know the results
> of all intermediate CQEs, but you still need to reap and match them.
> I'd prefer to have only 1 CQE per link, either for the first
> failed request or for the last request in the chain.
> 
> These two could remove a lot of processing overhead from userspace.

I couldn't agree more!

Another problem is that a task waiting in io_uring_enter is woken up on
completion of every operation in a link, which results in unnecessary
context switches. Once woken, users have nothing to do but issue another
io_uring_enter syscall to wait for completion of the entire link chain.
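
To make the userspace cost concrete, here is a minimal sketch of the
bookkeeping an application does today after reaping every CQE of a link
chain. It does not call liburing; `struct chain_cqe` is a hypothetical
stand-in for the `user_data` and `res` fields of a cqe, and both helper
names are made up for illustration. It assumes the CQEs are stored in
link order and that requests following a failed link member complete
with -ECANCELED, as described in the quoted use cases.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the user_data and res fields of a cqe. */
struct chain_cqe {
    uint64_t user_data;
    int32_t res;
};

/*
 * Return the index of the single CQE that summarizes a link chain:
 * the first failed request, or the last request if all succeeded.
 * Returns -1 for an empty chain.
 */
static long chain_summary_index(const struct chain_cqe *cqes, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (cqes[i].res < 0)
            return (long)i;
    }
    return n ? (long)(n - 1) : -1;
}

/* Count the requests that completed with -ECANCELED and may need resubmitting. */
static size_t chain_cancelled_count(const struct chain_cqe *cqes, size_t n)
{
    size_t cancelled = 0;

    for (size_t i = 0; i < n; i++) {
        if (cqes[i].res == -ECANCELED)
            cancelled++;
    }
    return cancelled;
}
```

With one CQE per link, or with the intermediate CQEs suppressed, both
loops would disappear from the application.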

> 
> 3. If we generate requests from eBPF, even the notion of a per-request
> event may break.
> - eBPF creating new requests would also need to specify user_data, and
>  this may be problematic from the user's perspective.
> - we may want to not generate CQEs automatically, but let eBPF do it.
> 
> -- 
> Pavel Begunkov
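
For reference, the user_data filtering mentioned in the original request
can be sketched roughly as below. This is an illustration, not liburing's
code: `EXAMPLE_UDATA_TIMEOUT`, `struct seen_cqe` and `drop_timeout_cqes`
are hypothetical names, and the sentinel value is an assumption chosen
for the example.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sentinel; user requests must never use this user_data. */
#define EXAMPLE_UDATA_TIMEOUT UINT64_MAX

/* Hypothetical stand-in for the user_data and res fields of a cqe. */
struct seen_cqe {
    uint64_t user_data;
    int32_t res;
};

/*
 * Compact the array in place, dropping CQEs that belong to the internal
 * timeout request. Returns the number of CQEs left for the caller.
 */
static size_t drop_timeout_cqes(struct seen_cqe *cqes, size_t n)
{
    size_t kept = 0;

    for (size_t i = 0; i < n; i++) {
        if (cqes[i].user_data == EXAMPLE_UDATA_TIMEOUT)
            continue; /* internal timeout completion, hide it */
        cqes[kept++] = cqes[i];
    }
    return kept;
}
```

A user request that happens to carry the sentinel value would be dropped
by this filter as well, which is the collision the request wants to
avoid by suppressing the cqe on the kernel side.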



Thread overview: 6+ messages
2020-02-14  8:29 Carter Li 李通洲
2020-02-14 10:34 ` Pavel Begunkov
2020-02-14 11:27   ` Carter Li 李通洲 [this message]
2020-02-14 12:52     ` Pavel Begunkov
2020-02-14 13:27       ` Carter Li 李通洲
2020-02-14 14:16         ` Pavel Begunkov
