linux-block.vger.kernel.org archive mirror
From: Jens Axboe <axboe@kernel.dk>
To: "Stefan Bühler" <source@stbuehler.de>,
	linux-block@vger.kernel.org, linux-fsdevel@vger.kernel.org
Subject: Re: io_uring: not good enough for release
Date: Tue, 23 Apr 2019 14:31:51 -0600
Message-ID: <37071226-375a-07a6-d3d3-21323145de71@kernel.dk>
In-Reply-To: <366484f9-cc5b-e477-6cc5-6c65f21afdcb@stbuehler.de>

On 4/23/19 1:06 PM, Stefan Bühler wrote:
> Hi,
> 
> now that I've got some of my rust code running with io_uring I don't
> think io_uring is ready.
> 
> If marking it as EXPERIMENTAL (and not "default y") is considered a
> clear flag for "API might still change" I'd recommend going for that.

That might be an option, but I don't think we need to do that. We've
still got at least a few weeks, and the only issue mentioned below that's
really a change that would warrant something like that is easily doable
now. All it needs is agreement.

> Here is my current issue list:
> 
> ---
> 
> 1. An error for a submission should be returned as completion for that
> submission.  Please don't break my main event loop with strange error
> codes just because a single operation is broken/not supported/...

So that's the case I was referring to above. We can just make that change,
there's absolutely no reason to have errors passed back through a different
channel.
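
As a sketch of what that looks like from the application side: every
request, failed or not, shows up as one completion carrying -errno in
res, so the event loop handles the error per request instead of seeing
it from io_uring_enter(). The struct below copies the user_data/res/flags
layout of struct io_uring_cqe so it builds without kernel headers; the
names fake_cqe, loop_stats and dispatch_cqe are made up for this example
and are not kernel or liburing API.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Same field layout as struct io_uring_cqe, redefined here so the
 * sketch builds without kernel headers. */
struct fake_cqe {
	uint64_t user_data;	/* copied from the submission */
	int32_t  res;		/* result, or -errno on failure */
	uint32_t flags;
};

/* Hypothetical bookkeeping for the example. */
struct loop_stats {
	unsigned ok;
	unsigned failed;
};

/* Hypothetical per-completion handler: a failed request is accounted
 * to its own user_data and the loop simply keeps going. */
static void dispatch_cqe(const struct fake_cqe *cqe, struct loop_stats *st)
{
	if (cqe->res < 0) {
		fprintf(stderr, "req %llu failed: %s\n",
			(unsigned long long)cqe->user_data,
			strerror(-cqe->res));
		st->failed++;
		return;
	}
	st->ok++;
}
```

With this shape, one unsupported or broken operation costs the loop a
single failed completion and nothing else.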

> 2. {read,write}_iter and FMODE_NOWAIT / IOCB_NOWAIT is broken at the vfs
> layer: vfs_{read,write} should set IOCB_NOWAIT if O_NONBLOCK is set when
> they call {read,write}_iter (i.e. init_sync_kiocb/iocb_flags needs to
> convert the flag).
> 
> And all {read,write}_iter should check IOCB_NOWAIT instead of O_NONBLOCK
> (hi there pipe.c!), and set FMODE_NOWAIT if they support IOCB_NOWAIT.
> 
> {read,write}_iter should only queue the IOCB though if is_sync_kiocb()
> returns false (i.e. if ki_callback is set).

That's a trivial fix. I agree that it should be done.

> Because right now an IORING_OP_READV on a blocking pipe *blocks*
> io_uring_enter, and on a non-blocking pipe completes with EAGAIN all the
> time.
> 
> So io_uring (readv) doesn't even work on a pipe!  (At least
> IORING_OP_POLL_ADD is working...)

It works, but it blocks. That can be argued as broken, and I agree that
it is, but it's important to make the distinction!
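
For reference, the non-blocking half of this is plain pipe semantics
and can be seen without io_uring at all: readv() on an empty O_NONBLOCK
pipe fails with EAGAIN, and the usual way around it is to wait for
POLLIN first, which is roughly the role IORING_OP_POLL_ADD plays today.
A minimal userspace sketch follows; the helper names are made up for
this example.

```c
#include <errno.h>
#include <fcntl.h>
#include <poll.h>
#include <sys/uio.h>
#include <unistd.h>

/* Create a pipe and set O_NONBLOCK on both ends. Returns 0 or -errno. */
static int make_nonblocking_pipe(int p[2])
{
	if (pipe(p) < 0)
		return -errno;
	for (int i = 0; i < 2; i++) {
		int fl = fcntl(p[i], F_GETFL);
		if (fl < 0 || fcntl(p[i], F_SETFL, fl | O_NONBLOCK) < 0)
			return -errno;
	}
	return 0;
}

/* Single readv() attempt. Returns bytes read, or -errno. */
static ssize_t try_readv(int fd, void *buf, size_t len)
{
	struct iovec iov = { .iov_base = buf, .iov_len = len };
	ssize_t ret = readv(fd, &iov, 1);

	return ret < 0 ? -errno : ret;
}

/* Wait for readability first, then read. Returns bytes read,
 * -ETIMEDOUT if nothing arrived in time, or -errno. */
static ssize_t poll_then_readv(int fd, void *buf, size_t len, int timeout_ms)
{
	struct pollfd pfd = { .fd = fd, .events = POLLIN };
	int n = poll(&pfd, 1, timeout_ms);

	if (n < 0)
		return -errno;
	if (n == 0)
		return -ETIMEDOUT;
	return try_readv(fd, buf, len);
}
```

On an empty pipe try_readv() returns -EAGAIN right away; once the other
end has written something, poll_then_readv() returns the data.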

> As another side note: timerfd doesn't have read_iter, so needs
> IORING_OP_POLL_ADD too... :(
> 
> (Also RWF_NOWAIT doesn't work in io_uring right now: IOCB_NOWAIT is
> always removed in the workqueue context, and I don't see an early EAGAIN
> completion).

That's a case I didn't consider, that you'd want to see EAGAIN after
it's been punted. Once punted, we're not going to return EAGAIN since
we can now block. Not sure how you'd want to handle that any better...

> 3. io_file_supports_async should check for FMODE_NOWAIT instead of using
> some hard-coded magic checks.

We probably just need to err on the side of caution there, and suffer
the extra async punts.

> 4. io_prep_rw shouldn't disable force_nonblock if FMODE_NOWAIT isn't
> available; it should return EAGAIN instead and let the workqueue handle it.

Agreed.

> I'm guessing especially 2. has something to do with why aio never took
> off - so maybe it's time to fix the underlying issues first.

It only really works for a subset of it, but we should ensure that it's
caught and always punted so we don't end up with io_uring_enter() blocking.
That should be the key goal. For regular file writes, that should be easy
enough to do. But it should end up being an optimization to what we have,
getting rid of an unnecessary async indirection, instead of having cases
where io_uring_enter() blocks.
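
A rough userspace model of that flow, purely for illustration: the
submit path tries the request without blocking, and anything that
cannot complete inline gets -EAGAIN and is handed to a context that is
allowed to block. All names here are hypothetical, can_nowait merely
stands in for FMODE_NOWAIT support, and a plain array stands in for the
workqueue; this is not how fs/io_uring.c is written.

```c
#include <errno.h>
#include <stddef.h>

#define MAX_PUNTED 16

struct req {
	int id;
	int can_nowait;	/* stands in for FMODE_NOWAIT support */
	int data_ready;	/* would the op complete without blocking? */
	int result;	/* set to 1 once the request has completed */
};

struct punt_queue {
	struct req *reqs[MAX_PUNTED];
	size_t n;
};

/* Issue path: must never block. Returns 0 if the request completed
 * inline, -EAGAIN if it has to be punted. */
static int issue_nonblock(struct req *r)
{
	if (!r->can_nowait || !r->data_ready)
		return -EAGAIN;
	r->result = 1;
	return 0;
}

/* Submit path: returns 1 if completed inline, 0 if punted,
 * -EBUSY if the punt queue is full. */
static int submit(struct req *r, struct punt_queue *q)
{
	if (issue_nonblock(r) == 0)
		return 1;
	if (q->n == MAX_PUNTED)
		return -EBUSY;
	q->reqs[q->n++] = r;
	return 0;
}

/* Worker side: blocking is fine here. Returns how many punted
 * requests were completed. */
static size_t run_punted(struct punt_queue *q)
{
	size_t done = q->n;

	for (size_t i = 0; i < q->n; i++)
		q->reqs[i]->result = 1;
	q->n = 0;
	return done;
}
```

The inline path is then only a shortcut around the punt, and the submit
call itself never waits on IO.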

> I'd be happy to contribute a few patches to those issues if there is an
> agreement what the result should look like :)

Pretty sure folks would be happy to see that :-)

> I have one other question: is there any way to cancel an IO read/write
> operation? I don't think closing io_uring has any effect, what about
> closing the files I'm reading/writing?  (Adding cancelation to kiocb
> sounds like a non-trivial task; and I don't think it already supports it.)

There is no way to do that. If you look at existing aio, nobody supports
that either. Hence io_uring doesn't export any sort of cancellation outside
of the poll case where we can handle it internally to io_uring.

If you look at storage, then generally IO doesn't wait around in the stack,
it's issued. Most hardware only supports queue-abort-like cancellation,
which isn't useful at all.

So I don't think that will ever happen.

> So cleanup in general seems hard to me: do I have to wait for all
> read/write operations to complete so I can safely free all buffers
> before I close the event loop?

The ring exit waits for IO to complete already.

-- 
Jens Axboe

