archive mirror
 help / color / mirror / Atom feed
From: Jens Axboe <>
To: Linus Torvalds <>
Cc: io-uring <>,
	linux-fsdevel <>,
	Al Viro <>
Subject: Re: [PATCHSET v3 0/3] Add ability to save/restore iov_iter state
Date: Wed, 15 Sep 2021 19:15:55 -0600	[thread overview]
Message-ID: <> (raw)
In-Reply-To: <>

On 9/15/21 4:42 PM, Jens Axboe wrote:
> On 9/15/21 1:40 PM, Jens Axboe wrote:
>> On 9/15/21 1:26 PM, Linus Torvalds wrote:
>>> On Wed, Sep 15, 2021 at 11:46 AM Jens Axboe <> wrote:
>>>>    The usual tests
>>>> do end up hitting the -EAGAIN path quite easily for certain device
>>>> types, but not the short read/write.
>>> No way to do something like "read in file to make sure it's cached,
>>> then invalidate caches from position X with POSIX_FADV_DONTNEED, then
>>> do a read that crosses that cached/uncached boundary"?
>>> To at least verify that "partly synchronous, but partly punted to
>>> async" case?
>>> Or were you talking about some other situation?
>> No that covers some of it, and that happens naturally with buffered IO.
>> The typical case is -EAGAIN on the first try, then you get a partial
>> or all of it the next loop, and then done or continue. I tend to run
>> fio verification workloads for that, as you get all the flexibility
>> of fio with the data verification. And there are tests in there that run
>> DONTNEED in parallel with buffered IO, exactly to catch some of these
>> csaes. But they don't verify the data, generally.
>> In that sense buffered is a lot easier than O_DIRECT, as it's easier to
>> provoke these cases. And that does hit all the save/restore parts and
>> looping, and if you do it with registered buffers then you get to work
>> with bvec iter as well. O_DIRECT may get you -EAGAIN for low queue depth
>> devices, but it'll never do a short read/write after that. 
>> But that's not in the regressions tests. I'll write a test case
>> that can go with the liburing regressions for it.
> OK I wrote one, quick'n dirty. It's written as a liburing test, which
> means it can take no arguments (in which case it creates a 128MB file),
> or it can take an argument and it'll use that argument as the file. We
> fill the first 128MB of the file with known data, basically the offset
> of the file. Then we read it back in any of the following ways:
> 1) Using non-vectored read
> 2) Using vectored read, segments that fit in UIO_FASTIOV
> 3) Using vectored read, segments larger than UIO_FASTIOV
> This catches all the different cases for a read.
> We do that with both buffered and O_DIRECT, and before each pass, we
> randomly DONTNEED either the first, middle, or end part of each segment
> in the read size.
> I ran this on my laptop, and I found this:
> axboe@p1 ~/gi/liburing (master)> test/file-verify                                0.100s
> bad read 229376, read 3
> Buffered novec test failed
> axboe@p1 ~/gi/liburing (master)> test/file-verify                                0.213s
> bad read 294912, read 0
> Buffered novec test failed
> which is because I'm running the iov_iter.2 stuff, and we're hitting
> that double accounting issue that I mentioned in the cover letter for
> this series. That's why the read return is larger than we ask for
> (128K). Running it on the current branch passes:
> [root@archlinux liburing]# for i in $(seq 10); do test/file-verify; done
> [root@archlinux liburing]# 
> (this is in my test vm that I run on the laptop for kernel testing,
> hence the root and different hostname).
> I will add this as a liburing regression test case. Probably needs a bit
> of cleaning up first, it was just a quick prototype as I thought your
> suggestion was a good one. Will probably change it to run at a higher
> queue depth than just the 1 it does now.

Cleaned it up a bit, and added registered buffer support as well (which
is another variant over non-vectored reads) and queued IO support as

and it's now part of the regular testing. Here's my usual run:

Running test file-verify                                            3 sec
Running test file-verify /dev/nvme0n1p2                             3 sec
Running test file-verify /dev/nvme1n1p1                             3 sec
Running test file-verify /dev/sdc2                                  Test file-verify timed out (may not be a failure)
Running test file-verify /dev/dm-0                                  3 sec
Running test file-verify /data/file                                 3 sec

Note that the sdc2 timeout isn't a failure, it's just that emulation on
qemu is slow enough that it takes 1min20s to run and I time out tests
after 60s in the harness to prevent something stalling forever.

Jens Axboe

  reply	other threads:[~2021-09-16  1:16 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-09-15 16:29 Jens Axboe
2021-09-15 16:29 ` [PATCH 1/3] iov_iter: add helper to save " Jens Axboe
2021-09-15 16:29 ` [PATCH 2/3] io_uring: use iov_iter state save/restore helpers Jens Axboe
2021-09-15 16:29 ` [PATCH 3/3] Revert "iov_iter: track truncated size" Jens Axboe
2021-09-15 18:32 ` [PATCHSET v3 0/3] Add ability to save/restore iov_iter state Linus Torvalds
2021-09-15 18:46   ` Jens Axboe
2021-09-15 19:26     ` Linus Torvalds
2021-09-15 19:40       ` Jens Axboe
2021-09-15 22:42         ` Jens Axboe
2021-09-16  1:15           ` Jens Axboe [this message]
2021-09-16  4:47 ` Al Viro
2021-09-16 16:10   ` Jens Axboe

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \ \ \ \ \ \ \
    --subject='Re: [PATCHSET v3 0/3] Add ability to save/restore iov_iter state' \

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).