linux-kernel.vger.kernel.org archive mirror
From: Linus Torvalds <torvalds@linux-foundation.org>
To: Al Viro <viro@zeniv.linux.org.uk>, Jens Axboe <axboe@kernel.dk>,
	Pavel Begunkov <asml.silence@gmail.com>
Cc: Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>
Subject: Re: [git pull] iov_iter fixes
Date: Thu, 9 Sep 2021 12:37:53 -0700
Message-ID: <CAHk-=wiacKV4Gh-MYjteU0LwNBSGpWrK-Ov25HdqB1ewinrFPg@mail.gmail.com>
In-Reply-To: <YTmL/plKyujwhoaR@zeniv-ca.linux.org.uk>

On Wed, Sep 8, 2021 at 9:24 PM Al Viro <viro@zeniv.linux.org.uk> wrote:
>
>         Fixes for io-uring handling of iov_iter reexpands

Ugh.

I have pulled this, because I understand what it does and I agree it
fixes a bug, but it really feels very very hacky and wrong to me.

It really smells like io-uring is doing a "iov_iter_revert()" using a
number that it pulls incorrectly out of its arse.

So when io-uring does that

                iov_iter_revert(iter, io_size - iov_iter_count(iter));

what it *really* wants to do is just basically "iov_iter_reset(iter)".

And that's basically what that addition of that "iov_iter_reexpand()"
tries to effectively do.

Wouldn't it be better to have a function that does exactly that?

Alternatively (and I'm cc'ing Jens) is it not possible for the
io-uring code to know how many bytes it *actually* used, rather than
saying that "ok, the iter originally had X bytes, now it has Y bytes,
so it must have used X-Y bytes" which was actively wrong for the case
where something ended up truncating the IO for some reason.
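
To make that arithmetic concrete, here is a toy model of the problem. It
is not kernel code: struct toy_iter, the toy_*() helpers, the "truncated"
field and the -1 return from toy_revert() are all assumptions of this
sketch, loosely named after the iov_iter helpers they imitate. It only
tries to show why "original size minus remaining count" over-counts once
something has truncated the iterator, and two ways the state could be
recovered (re-expand before reverting, or restore a saved snapshot).

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for struct iov_iter over one flat buffer (hypothetical). */
struct toy_iter {
        size_t offset;     /* bytes consumed so far */
        size_t count;      /* bytes still available */
        size_t truncated;  /* bytes hidden by toy_truncate(); sketch-only bookkeeping */
};

/* Like iov_iter_truncate(): cap the remaining count, remembering how much was cut. */
static void toy_truncate(struct toy_iter *i, size_t max)
{
        if (i->count > max) {
                i->truncated += i->count - max;
                i->count = max;
        }
}

/* Like iov_iter_advance(): consume bytes from the front. */
static void toy_advance(struct toy_iter *i, size_t bytes)
{
        assert(bytes <= i->count);
        i->offset += bytes;
        i->count -= bytes;
}

/*
 * Like iov_iter_revert(), except that it refuses to rewind past the start.
 * The -1 return is an assumption of this sketch, added to make the bug visible.
 */
static int toy_revert(struct toy_iter *i, size_t bytes)
{
        if (bytes > i->offset)
                return -1;
        i->offset -= bytes;
        i->count += bytes;
        return 0;
}

/* A callee that hits an IO limit, consumes some bytes, and never re-expands. */
static void toy_short_io(struct toy_iter *i, size_t limit, size_t done)
{
        toy_truncate(i, limit);
        toy_advance(i, done);
}

/* The pattern quoted above: infer "bytes used" from the original size. */
static int revert_by_guess(struct toy_iter *i, size_t io_size)
{
        return toy_revert(i, io_size - i->count);
}

/* Undo the truncation first, so that io_size - count is the real consumption. */
static int revert_after_reexpand(struct toy_iter *i, size_t io_size)
{
        i->count += i->truncated;
        i->truncated = 0;
        return toy_revert(i, io_size - i->count);
}

/* The "reset" alternative: restore a snapshot taken before the IO started. */
static void toy_reset(struct toy_iter *i, const struct toy_iter *saved)
{
        *i = *saved;
}
```

With a 100-byte iterator that gets truncated to 60 bytes and then has 20
bytes consumed, the remaining count is 40, so the guess comes out as 60
bytes "used" when only 20 were. In this model revert_by_guess() fails
because it would rewind past the start, while revert_after_reexpand() and
toy_reset() both bring the iterator back to offset 0 with 100 bytes left.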

Because I note that io-uring does that

        /* may have left rw->iter inconsistent on -EIOCBQUEUED */
        iov_iter_revert(&rw->iter, req->result - iov_iter_count(&rw->iter));

in io_resubmit_prep() too, and that you guys missed that it's the
exact same issue, and needs that exact same iov_iter_reexpand().

That "req->result" is once again the *original* length, and the above
code once again mis-handles the case of "oh, the iov got truncated
because of some IO limit".

So I've pulled this, but I think it is

 (a) ugly nasty

 (b) incomplete and misses a case

and needs more thought. At the VERY least it needs that
iov_iter_reexpand() in io_resubmit_prep() too, I think.

I'd like the comments expanded too. In particular that

                /* some cases will consume bytes even on error returns */

really should expand on the "some cases" thing, and why such an error
isn't fatal but should be retried asynchronously blindly like this?

Because I think _that_ is part of the fundamental issue here - the
io_uring code tries to just blindly re-submit the whole thing, and it
does it very badly and actually incorrectly.

Or am I missing something?

           Linus

