IO-Uring Archive on
 help / color / Atom feed
From: Jann Horn <>
To: Jens Axboe <>, Pavel Begunkov <>
Cc: io-uring <>
Subject: Re: [PATCH 3/7] io_uring: don't wait when under-submitting
Date: Wed, 18 Dec 2019 00:55:36 +0100
Message-ID: <> (raw)
In-Reply-To: <>

On Tue, Dec 17, 2019 at 11:54 PM Jens Axboe <> wrote:
> There is no reliable way to submit and wait in a single syscall, as
> io_submit_sqes() may under-consume sqes (in case of an early error).
> Then it will wait for not-yet-submitted requests, deadlocking the user
> in most cases.
> In such cases adjust min_complete, so it won't wait for more than
> what have been submitted in the current io_uring_enter() call. It
> may be less than total in-flight, but that up to a user to handle.
> Signed-off-by: Pavel Begunkov <>
> Signed-off-by: Jens Axboe <>
>         if (flags & IORING_ENTER_GETEVENTS) {
>                 unsigned nr_events = 0;
>                 min_complete = min(min_complete, ctx->cq_entries);
> +               if (submitted != to_submit)
> +                       min_complete = min(min_complete, (u32)submitted);

Hm. Let's say someone submits two requests, first an ACCEPT request
that might stall indefinitely and then a WRITE to a file on disk that
is expected to complete quickly; and the caller uses min_complete=1
because they want to wait for the WRITE op. But now the submission of
the WRITE fails, io_uring_enter() computes min_complete=min(1, 1)=1,
and it blocks on the ACCEPT op. That would be bad, right?

If the usecase I described is valid, I think it might make more sense
to do something like this:

u32 missing_submissions = to_submit - submitted;
min_complete = min(min_complete, ctx->cq_entries);
if ((flags & IORING_ENTER_GETEVENTS) && missing_submissions < min_complete) {
  min_complete -= missing_submissions;

In other words: If we do a partially successful submission, only wait
as long as we know that userspace definitely wants us to wait for one
of the pending requests; and once we can't tell whether userspace
intended to wait longer, return to userspace and let the user decide.

Or it might make sense to just ignore IORING_ENTER_GETEVENTS
completely in the partial submission case, in case userspace wants to
immediately react to the failed request by writing out an error
message to a socket or whatever. This case probably isn't
performance-critical, right? And it would simplify things a bit.

  reply index

Thread overview: 13+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-12-17 22:54 [PATCHSET] io_uring fixes for 5.5 Jens Axboe
2019-12-17 22:54 ` [PATCH 1/7] io_uring: fix stale comment and a few typos Jens Axboe
2019-12-17 22:54 ` [PATCH 2/7] io_uring: fix sporadic -EFAULT from IORING_OP_RECVMSG Jens Axboe
2019-12-17 22:54 ` [PATCH 3/7] io_uring: don't wait when under-submitting Jens Axboe
2019-12-17 23:55   ` Jann Horn [this message]
2019-12-18  0:06     ` Jens Axboe
2019-12-18  9:38       ` Pavel Begunkov
2019-12-18 13:02         ` Jens Axboe
2019-12-18 13:09           ` Pavel Begunkov
2019-12-17 22:54 ` [PATCH 4/7] io_uring: fix pre-prepped issue with force_nonblock == true Jens Axboe
2019-12-17 22:54 ` [PATCH 5/7] io_uring: remove 'sqe' parameter to the OP helpers that take it Jens Axboe
2019-12-17 22:54 ` [PATCH 6/7] io_uring: any deferred command must have stable sqe data Jens Axboe
2019-12-17 22:54 ` [PATCH 7/7] io_uring: make HARDLINK imply LINK Jens Axboe

Reply instructions:

You may reply publically to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='' \ \ \ \ \

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

IO-Uring Archive on

Archives are clonable:
	git clone --mirror io-uring/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 io-uring io-uring/ \
	public-inbox-index io-uring

Example config snippet for mirrors

Newsgroup available over NNTP:

AGPL code for this site: git clone