io-uring.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* Large number of empty reads on 5.9-rc2 under moderate load
@ 2020-08-24 10:40 Dmitry Shulyak
  2020-08-24 10:46 ` Jens Axboe
  0 siblings, 1 reply; 13+ messages in thread
From: Dmitry Shulyak @ 2020-08-24 10:40 UTC (permalink / raw)
  To: io-uring

In the program, I am submitting a large number of concurrent read
requests with o_direct. In both scenarios the number of concurrent
read requests is limited to 20 000, with only difference being that
for 512b total number of reads is 8millions and for 8kb - 1million. On
5.8.3 I didn't see any empty reads at all.

BenchmarkReadAt/uring_512-8              8000000              1879
ns/op         272.55 MB/s
BenchmarkReadAt/uring_8192-8             1000000             18178
ns/op         450.65 MB/s

I am seeing the same numbers in iotop, so pretty confident that the
benchmark is fine. Below is a version with regular syscalls and
threads (note that this is with golang):

BenchmarkReadAt/os_512-256               8000000              4393
ns/op         116.55 MB/s
BenchmarkReadAt/os_8192-256              1000000             18811
ns/op         435.48 MB/s

I run the same program on 5.9-rc.2 and noticed that for workload with
8kb buffer and 1mill reads I had to make more than 7 millions retries,
which obviously makes the program very slow. For 512b and 8million
reads there were only 22 000 retries, but it is still very slow for
some other reason.

BenchmarkReadAt/uring_512-8  8000000       8432 ns/op   60.72 MB/s
BenchmarkReadAt/uring_8192-8 1000000      42603 ns/op 192.29 MB/s

In iotop i am seeing a huge increase for 8kb, actual disk read goes up
to 2gb/s, which looks somewhat suspicious given that my ssd should
support only 450mb/s. If I will lower the number of concurrent
requests to 1000, then there are almost no empty reads and numbers for
8kb go back to the same level I saw with 5.8.3.

Is it a regression or should I throttle submissions?

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: Large number of empty reads on 5.9-rc2 under moderate load
  2020-08-24 10:40 Large number of empty reads on 5.9-rc2 under moderate load Dmitry Shulyak
@ 2020-08-24 10:46 ` Jens Axboe
  2020-08-24 11:09   ` Dmitry Shulyak
  0 siblings, 1 reply; 13+ messages in thread
From: Jens Axboe @ 2020-08-24 10:46 UTC (permalink / raw)
  To: Dmitry Shulyak, io-uring

On 8/24/20 4:40 AM, Dmitry Shulyak wrote:
> In the program, I am submitting a large number of concurrent read
> requests with o_direct. In both scenarios the number of concurrent
> read requests is limited to 20 000, with only difference being that
> for 512b total number of reads is 8millions and for 8kb - 1million. On
> 5.8.3 I didn't see any empty reads at all.
> 
> BenchmarkReadAt/uring_512-8              8000000              1879
> ns/op         272.55 MB/s
> BenchmarkReadAt/uring_8192-8             1000000             18178
> ns/op         450.65 MB/s
> 
> I am seeing the same numbers in iotop, so pretty confident that the
> benchmark is fine. Below is a version with regular syscalls and
> threads (note that this is with golang):
> 
> BenchmarkReadAt/os_512-256               8000000              4393
> ns/op         116.55 MB/s
> BenchmarkReadAt/os_8192-256              1000000             18811
> ns/op         435.48 MB/s
> 
> I run the same program on 5.9-rc.2 and noticed that for workload with
> 8kb buffer and 1mill reads I had to make more than 7 millions retries,
> which obviously makes the program very slow. For 512b and 8million
> reads there were only 22 000 retries, but it is still very slow for
> some other reason.
> 
> BenchmarkReadAt/uring_512-8  8000000       8432 ns/op   60.72 MB/s
> BenchmarkReadAt/uring_8192-8 1000000      42603 ns/op 192.29 MB/s
> 
> In iotop i am seeing a huge increase for 8kb, actual disk read goes up
> to 2gb/s, which looks somewhat suspicious given that my ssd should
> support only 450mb/s. If I will lower the number of concurrent
> requests to 1000, then there are almost no empty reads and numbers for
> 8kb go back to the same level I saw with 5.8.3.
> 
> Is it a regression or should I throttle submissions?

Since it's performing worse than 5.8, sounds like there is. How can we
reproduce this?

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: Large number of empty reads on 5.9-rc2 under moderate load
  2020-08-24 10:46 ` Jens Axboe
@ 2020-08-24 11:09   ` Dmitry Shulyak
  2020-08-24 14:06     ` Jens Axboe
  0 siblings, 1 reply; 13+ messages in thread
From: Dmitry Shulyak @ 2020-08-24 11:09 UTC (permalink / raw)
  To: Jens Axboe; +Cc: io-uring

library that i am using https://github.com/dshulyak/uring
It requires golang 1.14, if installed, benchmark can be run with:
go test ./fs -run=xx -bench=BenchmarkReadAt/uring_8 -benchtime=1000000x
go test ./fs -run=xx -bench=BenchmarkReadAt/uring_5 -benchtime=8000000x

note that it will setup uring instance per cpu, with shared worker pool.
it will take me too much time to implement repro in c, but in general
i am simply submitting multiple concurrent
read requests and watching read rate.



On Mon, 24 Aug 2020 at 13:46, Jens Axboe <axboe@kernel.dk> wrote:
>
> On 8/24/20 4:40 AM, Dmitry Shulyak wrote:
> > In the program, I am submitting a large number of concurrent read
> > requests with o_direct. In both scenarios the number of concurrent
> > read requests is limited to 20 000, with only difference being that
> > for 512b total number of reads is 8millions and for 8kb - 1million. On
> > 5.8.3 I didn't see any empty reads at all.
> >
> > BenchmarkReadAt/uring_512-8              8000000              1879
> > ns/op         272.55 MB/s
> > BenchmarkReadAt/uring_8192-8             1000000             18178
> > ns/op         450.65 MB/s
> >
> > I am seeing the same numbers in iotop, so pretty confident that the
> > benchmark is fine. Below is a version with regular syscalls and
> > threads (note that this is with golang):
> >
> > BenchmarkReadAt/os_512-256               8000000              4393
> > ns/op         116.55 MB/s
> > BenchmarkReadAt/os_8192-256              1000000             18811
> > ns/op         435.48 MB/s
> >
> > I run the same program on 5.9-rc.2 and noticed that for workload with
> > 8kb buffer and 1mill reads I had to make more than 7 millions retries,
> > which obviously makes the program very slow. For 512b and 8million
> > reads there were only 22 000 retries, but it is still very slow for
> > some other reason.
> >
> > BenchmarkReadAt/uring_512-8  8000000       8432 ns/op   60.72 MB/s
> > BenchmarkReadAt/uring_8192-8 1000000      42603 ns/op 192.29 MB/s
> >
> > In iotop i am seeing a huge increase for 8kb, actual disk read goes up
> > to 2gb/s, which looks somewhat suspicious given that my ssd should
> > support only 450mb/s. If I will lower the number of concurrent
> > requests to 1000, then there are almost no empty reads and numbers for
> > 8kb go back to the same level I saw with 5.8.3.
> >
> > Is it a regression or should I throttle submissions?
>
> Since it's performing worse than 5.8, sounds like there is. How can we
> reproduce this?
>
> --
> Jens Axboe
>

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: Large number of empty reads on 5.9-rc2 under moderate load
  2020-08-24 11:09   ` Dmitry Shulyak
@ 2020-08-24 14:06     ` Jens Axboe
  2020-08-24 14:45       ` Jens Axboe
  0 siblings, 1 reply; 13+ messages in thread
From: Jens Axboe @ 2020-08-24 14:06 UTC (permalink / raw)
  To: Dmitry Shulyak; +Cc: io-uring

On 8/24/20 5:09 AM, Dmitry Shulyak wrote:
> library that i am using https://github.com/dshulyak/uring
> It requires golang 1.14, if installed, benchmark can be run with:
> go test ./fs -run=xx -bench=BenchmarkReadAt/uring_8 -benchtime=1000000x
> go test ./fs -run=xx -bench=BenchmarkReadAt/uring_5 -benchtime=8000000x
> 
> note that it will setup uring instance per cpu, with shared worker pool.
> it will take me too much time to implement repro in c, but in general
> i am simply submitting multiple concurrent
> read requests and watching read rate.

I'm fine with trying your Go version, but I can into a bit of trouble:

axboe@amd ~/g/go-uring (master)> 
go test ./fs -run=xx -bench=BenchmarkReadAt/uring_8 -benchtime=1000000x
# github.com/dshulyak/uring/fixed
fixed/allocator.go:38:48: error: incompatible type for field 2 in struct construction (cannot use type uint64 as type syscall.Iovec_len_t)
   38 |  iovec := []syscall.Iovec{{Base: &mem[0], Len: uint64(size)}}
      |                                                ^
FAIL	github.com/dshulyak/uring/fs [build failed]
FAIL
axboe@amd ~/g/go-uring (master)> go version
go version go1.14.6 gccgo (Ubuntu 10.2.0-5ubuntu1~20.04) 10.2.0 linux/amd64

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: Large number of empty reads on 5.9-rc2 under moderate load
  2020-08-24 14:06     ` Jens Axboe
@ 2020-08-24 14:45       ` Jens Axboe
  2020-08-24 15:33         ` Dmitry Shulyak
  0 siblings, 1 reply; 13+ messages in thread
From: Jens Axboe @ 2020-08-24 14:45 UTC (permalink / raw)
  To: Dmitry Shulyak; +Cc: io-uring

On 8/24/20 8:06 AM, Jens Axboe wrote:
> On 8/24/20 5:09 AM, Dmitry Shulyak wrote:
>> library that i am using https://github.com/dshulyak/uring
>> It requires golang 1.14, if installed, benchmark can be run with:
>> go test ./fs -run=xx -bench=BenchmarkReadAt/uring_8 -benchtime=1000000x
>> go test ./fs -run=xx -bench=BenchmarkReadAt/uring_5 -benchtime=8000000x
>>
>> note that it will setup uring instance per cpu, with shared worker pool.
>> it will take me too much time to implement repro in c, but in general
>> i am simply submitting multiple concurrent
>> read requests and watching read rate.
> 
> I'm fine with trying your Go version, but I can into a bit of trouble:
> 
> axboe@amd ~/g/go-uring (master)> 
> go test ./fs -run=xx -bench=BenchmarkReadAt/uring_8 -benchtime=1000000x
> # github.com/dshulyak/uring/fixed
> fixed/allocator.go:38:48: error: incompatible type for field 2 in struct construction (cannot use type uint64 as type syscall.Iovec_len_t)
>    38 |  iovec := []syscall.Iovec{{Base: &mem[0], Len: uint64(size)}}
>       |                                                ^
> FAIL	github.com/dshulyak/uring/fs [build failed]
> FAIL
> axboe@amd ~/g/go-uring (master)> go version
> go version go1.14.6 gccgo (Ubuntu 10.2.0-5ubuntu1~20.04) 10.2.0 linux/amd64

Alright, got it working. What device are you running this on? And am I
correct in assuming you get short reads, or rather 0 reads? What file
system?

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: Large number of empty reads on 5.9-rc2 under moderate load
  2020-08-24 14:45       ` Jens Axboe
@ 2020-08-24 15:33         ` Dmitry Shulyak
  2020-08-24 16:10           ` Jens Axboe
  0 siblings, 1 reply; 13+ messages in thread
From: Dmitry Shulyak @ 2020-08-24 15:33 UTC (permalink / raw)
  To: Jens Axboe; +Cc: io-uring

Was going to look into this.
I am getting 0 reads. This is on some old kingston ssd, ext4.

On Mon, 24 Aug 2020 at 17:45, Jens Axboe <axboe@kernel.dk> wrote:
>
> On 8/24/20 8:06 AM, Jens Axboe wrote:
> > On 8/24/20 5:09 AM, Dmitry Shulyak wrote:
> >> library that i am using https://github.com/dshulyak/uring
> >> It requires golang 1.14, if installed, benchmark can be run with:
> >> go test ./fs -run=xx -bench=BenchmarkReadAt/uring_8 -benchtime=1000000x
> >> go test ./fs -run=xx -bench=BenchmarkReadAt/uring_5 -benchtime=8000000x
> >>
> >> note that it will setup uring instance per cpu, with shared worker pool.
> >> it will take me too much time to implement repro in c, but in general
> >> i am simply submitting multiple concurrent
> >> read requests and watching read rate.
> >
> > I'm fine with trying your Go version, but I can into a bit of trouble:
> >
> > axboe@amd ~/g/go-uring (master)>
> > go test ./fs -run=xx -bench=BenchmarkReadAt/uring_8 -benchtime=1000000x
> > # github.com/dshulyak/uring/fixed
> > fixed/allocator.go:38:48: error: incompatible type for field 2 in struct construction (cannot use type uint64 as type syscall.Iovec_len_t)
> >    38 |  iovec := []syscall.Iovec{{Base: &mem[0], Len: uint64(size)}}
> >       |                                                ^
> > FAIL  github.com/dshulyak/uring/fs [build failed]
> > FAIL
> > axboe@amd ~/g/go-uring (master)> go version
> > go version go1.14.6 gccgo (Ubuntu 10.2.0-5ubuntu1~20.04) 10.2.0 linux/amd64
>
> Alright, got it working. What device are you running this on? And am I
> correct in assuming you get short reads, or rather 0 reads? What file
> system?
>
> --
> Jens Axboe
>

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: Large number of empty reads on 5.9-rc2 under moderate load
  2020-08-24 15:33         ` Dmitry Shulyak
@ 2020-08-24 16:10           ` Jens Axboe
  2020-08-24 16:13             ` Dmitry Shulyak
  0 siblings, 1 reply; 13+ messages in thread
From: Jens Axboe @ 2020-08-24 16:10 UTC (permalink / raw)
  To: Dmitry Shulyak; +Cc: io-uring

On 8/24/20 9:33 AM, Dmitry Shulyak wrote:
> On Mon, 24 Aug 2020 at 17:45, Jens Axboe <axboe@kernel.dk> wrote:
>>
>> On 8/24/20 8:06 AM, Jens Axboe wrote:
>>> On 8/24/20 5:09 AM, Dmitry Shulyak wrote:
>>>> library that i am using https://github.com/dshulyak/uring
>>>> It requires golang 1.14, if installed, benchmark can be run with:
>>>> go test ./fs -run=xx -bench=BenchmarkReadAt/uring_8 -benchtime=1000000x
>>>> go test ./fs -run=xx -bench=BenchmarkReadAt/uring_5 -benchtime=8000000x
>>>>
>>>> note that it will setup uring instance per cpu, with shared worker pool.
>>>> it will take me too much time to implement repro in c, but in general
>>>> i am simply submitting multiple concurrent
>>>> read requests and watching read rate.
>>>
>>> I'm fine with trying your Go version, but I can into a bit of trouble:
>>>
>>> axboe@amd ~/g/go-uring (master)>
>>> go test ./fs -run=xx -bench=BenchmarkReadAt/uring_8 -benchtime=1000000x
>>> # github.com/dshulyak/uring/fixed
>>> fixed/allocator.go:38:48: error: incompatible type for field 2 in struct construction (cannot use type uint64 as type syscall.Iovec_len_t)
>>>    38 |  iovec := []syscall.Iovec{{Base: &mem[0], Len: uint64(size)}}
>>>       |                                                ^
>>> FAIL  github.com/dshulyak/uring/fs [build failed]
>>> FAIL
>>> axboe@amd ~/g/go-uring (master)> go version
>>> go version go1.14.6 gccgo (Ubuntu 10.2.0-5ubuntu1~20.04) 10.2.0 linux/amd64
>>
>> Alright, got it working. What device are you running this on? And am I
>> correct in assuming you get short reads, or rather 0 reads? What file
>> system?
>
> Was going to look into this.
> I am getting 0 reads. This is on some old kingston ssd, ext4.

I can't seem to reproduce this. I do see some cqe->res == 0 completes,
but those appear to be NOPs. And they trigger at the start and end. I'll
keep poking.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: Large number of empty reads on 5.9-rc2 under moderate load
  2020-08-24 16:10           ` Jens Axboe
@ 2020-08-24 16:13             ` Dmitry Shulyak
  2020-08-24 16:18               ` Jens Axboe
  0 siblings, 1 reply; 13+ messages in thread
From: Dmitry Shulyak @ 2020-08-24 16:13 UTC (permalink / raw)
  To: Jens Axboe; +Cc: io-uring

Nops are used for draining and closing rings at the end of benchmarks.
It also appears in the beginning because of the way golang runs benchmarks...

On Mon, 24 Aug 2020 at 19:10, Jens Axboe <axboe@kernel.dk> wrote:
>
> On 8/24/20 9:33 AM, Dmitry Shulyak wrote:
> > On Mon, 24 Aug 2020 at 17:45, Jens Axboe <axboe@kernel.dk> wrote:
> >>
> >> On 8/24/20 8:06 AM, Jens Axboe wrote:
> >>> On 8/24/20 5:09 AM, Dmitry Shulyak wrote:
> >>>> library that i am using https://github.com/dshulyak/uring
> >>>> It requires golang 1.14, if installed, benchmark can be run with:
> >>>> go test ./fs -run=xx -bench=BenchmarkReadAt/uring_8 -benchtime=1000000x
> >>>> go test ./fs -run=xx -bench=BenchmarkReadAt/uring_5 -benchtime=8000000x
> >>>>
> >>>> note that it will setup uring instance per cpu, with shared worker pool.
> >>>> it will take me too much time to implement repro in c, but in general
> >>>> i am simply submitting multiple concurrent
> >>>> read requests and watching read rate.
> >>>
> >>> I'm fine with trying your Go version, but I can into a bit of trouble:
> >>>
> >>> axboe@amd ~/g/go-uring (master)>
> >>> go test ./fs -run=xx -bench=BenchmarkReadAt/uring_8 -benchtime=1000000x
> >>> # github.com/dshulyak/uring/fixed
> >>> fixed/allocator.go:38:48: error: incompatible type for field 2 in struct construction (cannot use type uint64 as type syscall.Iovec_len_t)
> >>>    38 |  iovec := []syscall.Iovec{{Base: &mem[0], Len: uint64(size)}}
> >>>       |                                                ^
> >>> FAIL  github.com/dshulyak/uring/fs [build failed]
> >>> FAIL
> >>> axboe@amd ~/g/go-uring (master)> go version
> >>> go version go1.14.6 gccgo (Ubuntu 10.2.0-5ubuntu1~20.04) 10.2.0 linux/amd64
> >>
> >> Alright, got it working. What device are you running this on? And am I
> >> correct in assuming you get short reads, or rather 0 reads? What file
> >> system?
> >
> > Was going to look into this.
> > I am getting 0 reads. This is on some old kingston ssd, ext4.
>
> I can't seem to reproduce this. I do see some cqe->res == 0 completes,
> but those appear to be NOPs. And they trigger at the start and end. I'll
> keep poking.
>
> --
> Jens Axboe
>

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: Large number of empty reads on 5.9-rc2 under moderate load
  2020-08-24 16:13             ` Dmitry Shulyak
@ 2020-08-24 16:18               ` Jens Axboe
  2020-08-24 17:44                 ` Jens Axboe
  0 siblings, 1 reply; 13+ messages in thread
From: Jens Axboe @ 2020-08-24 16:18 UTC (permalink / raw)
  To: Dmitry Shulyak; +Cc: io-uring

On 8/24/20 10:13 AM, Dmitry Shulyak wrote:
> On Mon, 24 Aug 2020 at 19:10, Jens Axboe <axboe@kernel.dk> wrote:
>>
>> On 8/24/20 9:33 AM, Dmitry Shulyak wrote:
>>> On Mon, 24 Aug 2020 at 17:45, Jens Axboe <axboe@kernel.dk> wrote:
>>>>
>>>> On 8/24/20 8:06 AM, Jens Axboe wrote:
>>>>> On 8/24/20 5:09 AM, Dmitry Shulyak wrote:
>>>>>> library that i am using https://github.com/dshulyak/uring
>>>>>> It requires golang 1.14, if installed, benchmark can be run with:
>>>>>> go test ./fs -run=xx -bench=BenchmarkReadAt/uring_8 -benchtime=1000000x
>>>>>> go test ./fs -run=xx -bench=BenchmarkReadAt/uring_5 -benchtime=8000000x
>>>>>>
>>>>>> note that it will setup uring instance per cpu, with shared worker pool.
>>>>>> it will take me too much time to implement repro in c, but in general
>>>>>> i am simply submitting multiple concurrent
>>>>>> read requests and watching read rate.
>>>>>
>>>>> I'm fine with trying your Go version, but I can into a bit of trouble:
>>>>>
>>>>> axboe@amd ~/g/go-uring (master)>
>>>>> go test ./fs -run=xx -bench=BenchmarkReadAt/uring_8 -benchtime=1000000x
>>>>> # github.com/dshulyak/uring/fixed
>>>>> fixed/allocator.go:38:48: error: incompatible type for field 2 in struct construction (cannot use type uint64 as type syscall.Iovec_len_t)
>>>>>    38 |  iovec := []syscall.Iovec{{Base: &mem[0], Len: uint64(size)}}
>>>>>       |                                                ^
>>>>> FAIL  github.com/dshulyak/uring/fs [build failed]
>>>>> FAIL
>>>>> axboe@amd ~/g/go-uring (master)> go version
>>>>> go version go1.14.6 gccgo (Ubuntu 10.2.0-5ubuntu1~20.04) 10.2.0 linux/amd64
>>>>
>>>> Alright, got it working. What device are you running this on? And am I
>>>> correct in assuming you get short reads, or rather 0 reads? What file
>>>> system?
>>>
>>> Was going to look into this.
>>> I am getting 0 reads. This is on some old kingston ssd, ext4.
>>
>> I can't seem to reproduce this. I do see some cqe->res == 0 completes,
>> but those appear to be NOPs. And they trigger at the start and end. I'll
>> keep poking.
>
> Nops are used for draining and closing rings at the end of benchmarks.
> It also appears in the beginning because of the way golang runs
> benchmarks...

OK, just checking if it was expected.

But I can reproduce it now, turns out I was running XFS and that doesn't
trigger it. With ext4, I do see zero sized read completions. I'll keep
poking.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: Large number of empty reads on 5.9-rc2 under moderate load
  2020-08-24 16:18               ` Jens Axboe
@ 2020-08-24 17:44                 ` Jens Axboe
  2020-08-25  8:52                   ` Dmitry Shulyak
  0 siblings, 1 reply; 13+ messages in thread
From: Jens Axboe @ 2020-08-24 17:44 UTC (permalink / raw)
  To: Dmitry Shulyak; +Cc: io-uring

On 8/24/20 10:18 AM, Jens Axboe wrote:
> On 8/24/20 10:13 AM, Dmitry Shulyak wrote:
>> On Mon, 24 Aug 2020 at 19:10, Jens Axboe <axboe@kernel.dk> wrote:
>>>
>>> On 8/24/20 9:33 AM, Dmitry Shulyak wrote:
>>>> On Mon, 24 Aug 2020 at 17:45, Jens Axboe <axboe@kernel.dk> wrote:
>>>>>
>>>>> On 8/24/20 8:06 AM, Jens Axboe wrote:
>>>>>> On 8/24/20 5:09 AM, Dmitry Shulyak wrote:
>>>>>>> library that i am using https://github.com/dshulyak/uring
>>>>>>> It requires golang 1.14, if installed, benchmark can be run with:
>>>>>>> go test ./fs -run=xx -bench=BenchmarkReadAt/uring_8 -benchtime=1000000x
>>>>>>> go test ./fs -run=xx -bench=BenchmarkReadAt/uring_5 -benchtime=8000000x
>>>>>>>
>>>>>>> note that it will setup uring instance per cpu, with shared worker pool.
>>>>>>> it will take me too much time to implement repro in c, but in general
>>>>>>> i am simply submitting multiple concurrent
>>>>>>> read requests and watching read rate.
>>>>>>
>>>>>> I'm fine with trying your Go version, but I can into a bit of trouble:
>>>>>>
>>>>>> axboe@amd ~/g/go-uring (master)>
>>>>>> go test ./fs -run=xx -bench=BenchmarkReadAt/uring_8 -benchtime=1000000x
>>>>>> # github.com/dshulyak/uring/fixed
>>>>>> fixed/allocator.go:38:48: error: incompatible type for field 2 in struct construction (cannot use type uint64 as type syscall.Iovec_len_t)
>>>>>>    38 |  iovec := []syscall.Iovec{{Base: &mem[0], Len: uint64(size)}}
>>>>>>       |                                                ^
>>>>>> FAIL  github.com/dshulyak/uring/fs [build failed]
>>>>>> FAIL
>>>>>> axboe@amd ~/g/go-uring (master)> go version
>>>>>> go version go1.14.6 gccgo (Ubuntu 10.2.0-5ubuntu1~20.04) 10.2.0 linux/amd64
>>>>>
>>>>> Alright, got it working. What device are you running this on? And am I
>>>>> correct in assuming you get short reads, or rather 0 reads? What file
>>>>> system?
>>>>
>>>> Was going to look into this.
>>>> I am getting 0 reads. This is on some old kingston ssd, ext4.
>>>
>>> I can't seem to reproduce this. I do see some cqe->res == 0 completes,
>>> but those appear to be NOPs. And they trigger at the start and end. I'll
>>> keep poking.
>>
>> Nops are used for draining and closing rings at the end of benchmarks.
>> It also appears in the beginning because of the way golang runs
>> benchmarks...
> 
> OK, just checking if it was expected.
> 
> But I can reproduce it now, turns out I was running XFS and that doesn't
> trigger it. With ext4, I do see zero sized read completions. I'll keep
> poking.

Can you try with this? Looks like some cases will consume bytes from the
iterator even if they ultimately return an error. If we've consumed bytes
but need to trigger retry, ensure we revert the consumed bytes.


diff --git a/fs/io_uring.c b/fs/io_uring.c
index 91e2cc8414f9..609b4996a4e9 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -3152,6 +3152,8 @@ static int io_read(struct io_kiocb *req, bool force_nonblock,
 	} else if (ret == -EAGAIN) {
 		if (!force_nonblock)
 			goto done;
+		/* some cases will consume bytes even on error returns */
+		iov_iter_revert(iter, iov_count - iov_iter_count(iter));
 		ret = io_setup_async_rw(req, iovec, inline_vecs, iter, false);
 		if (ret)
 			goto out_free;
@@ -3293,6 +3295,8 @@ static int io_write(struct io_kiocb *req, bool force_nonblock,
 	if (!force_nonblock || ret2 != -EAGAIN) {
 		kiocb_done(kiocb, ret2, cs);
 	} else {
+		/* some cases will consume bytes even on error returns */
+		iov_iter_revert(iter, iov_count - iov_iter_count(iter));
 copy_iov:
 		ret = io_setup_async_rw(req, iovec, inline_vecs, iter, false);
 		if (!ret)

-- 
Jens Axboe


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: Large number of empty reads on 5.9-rc2 under moderate load
  2020-08-24 17:44                 ` Jens Axboe
@ 2020-08-25  8:52                   ` Dmitry Shulyak
  2020-08-25 13:39                     ` Jens Axboe
  0 siblings, 1 reply; 13+ messages in thread
From: Dmitry Shulyak @ 2020-08-25  8:52 UTC (permalink / raw)
  To: Jens Axboe; +Cc: io-uring

this patch fixes the issue with 0 reads. there seems to be a
regression that is not specific to uring,
regular syscall reads slowed down noticeably.

On Mon, 24 Aug 2020 at 20:44, Jens Axboe <axboe@kernel.dk> wrote:
>
> On 8/24/20 10:18 AM, Jens Axboe wrote:
> > On 8/24/20 10:13 AM, Dmitry Shulyak wrote:
> >> On Mon, 24 Aug 2020 at 19:10, Jens Axboe <axboe@kernel.dk> wrote:
> >>>
> >>> On 8/24/20 9:33 AM, Dmitry Shulyak wrote:
> >>>> On Mon, 24 Aug 2020 at 17:45, Jens Axboe <axboe@kernel.dk> wrote:
> >>>>>
> >>>>> On 8/24/20 8:06 AM, Jens Axboe wrote:
> >>>>>> On 8/24/20 5:09 AM, Dmitry Shulyak wrote:
> >>>>>>> library that i am using https://github.com/dshulyak/uring
> >>>>>>> It requires golang 1.14, if installed, benchmark can be run with:
> >>>>>>> go test ./fs -run=xx -bench=BenchmarkReadAt/uring_8 -benchtime=1000000x
> >>>>>>> go test ./fs -run=xx -bench=BenchmarkReadAt/uring_5 -benchtime=8000000x
> >>>>>>>
> >>>>>>> note that it will setup uring instance per cpu, with shared worker pool.
> >>>>>>> it will take me too much time to implement repro in c, but in general
> >>>>>>> i am simply submitting multiple concurrent
> >>>>>>> read requests and watching read rate.
> >>>>>>
> >>>>>> I'm fine with trying your Go version, but I can into a bit of trouble:
> >>>>>>
> >>>>>> axboe@amd ~/g/go-uring (master)>
> >>>>>> go test ./fs -run=xx -bench=BenchmarkReadAt/uring_8 -benchtime=1000000x
> >>>>>> # github.com/dshulyak/uring/fixed
> >>>>>> fixed/allocator.go:38:48: error: incompatible type for field 2 in struct construction (cannot use type uint64 as type syscall.Iovec_len_t)
> >>>>>>    38 |  iovec := []syscall.Iovec{{Base: &mem[0], Len: uint64(size)}}
> >>>>>>       |                                                ^
> >>>>>> FAIL  github.com/dshulyak/uring/fs [build failed]
> >>>>>> FAIL
> >>>>>> axboe@amd ~/g/go-uring (master)> go version
> >>>>>> go version go1.14.6 gccgo (Ubuntu 10.2.0-5ubuntu1~20.04) 10.2.0 linux/amd64
> >>>>>
> >>>>> Alright, got it working. What device are you running this on? And am I
> >>>>> correct in assuming you get short reads, or rather 0 reads? What file
> >>>>> system?
> >>>>
> >>>> Was going to look into this.
> >>>> I am getting 0 reads. This is on some old kingston ssd, ext4.
> >>>
> >>> I can't seem to reproduce this. I do see some cqe->res == 0 completes,
> >>> but those appear to be NOPs. And they trigger at the start and end. I'll
> >>> keep poking.
> >>
> >> Nops are used for draining and closing rings at the end of benchmarks.
> >> It also appears in the beginning because of the way golang runs
> >> benchmarks...
> >
> > OK, just checking if it was expected.
> >
> > But I can reproduce it now, turns out I was running XFS and that doesn't
> > trigger it. With ext4, I do see zero sized read completions. I'll keep
> > poking.
>
> Can you try with this? Looks like some cases will consume bytes from the
> iterator even if they ultimately return an error. If we've consumed bytes
> but need to trigger retry, ensure we revert the consumed bytes.
>
>
> diff --git a/fs/io_uring.c b/fs/io_uring.c
> index 91e2cc8414f9..609b4996a4e9 100644
> --- a/fs/io_uring.c
> +++ b/fs/io_uring.c
> @@ -3152,6 +3152,8 @@ static int io_read(struct io_kiocb *req, bool force_nonblock,
>         } else if (ret == -EAGAIN) {
>                 if (!force_nonblock)
>                         goto done;
> +               /* some cases will consume bytes even on error returns */
> +               iov_iter_revert(iter, iov_count - iov_iter_count(iter));
>                 ret = io_setup_async_rw(req, iovec, inline_vecs, iter, false);
>                 if (ret)
>                         goto out_free;
> @@ -3293,6 +3295,8 @@ static int io_write(struct io_kiocb *req, bool force_nonblock,
>         if (!force_nonblock || ret2 != -EAGAIN) {
>                 kiocb_done(kiocb, ret2, cs);
>         } else {
> +               /* some cases will consume bytes even on error returns */
> +               iov_iter_revert(iter, iov_count - iov_iter_count(iter));
>  copy_iov:
>                 ret = io_setup_async_rw(req, iovec, inline_vecs, iter, false);
>                 if (!ret)
>
> --
> Jens Axboe
>

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: Large number of empty reads on 5.9-rc2 under moderate load
  2020-08-25  8:52                   ` Dmitry Shulyak
@ 2020-08-25 13:39                     ` Jens Axboe
  2020-08-25 14:14                       ` Dmitry Shulyak
  0 siblings, 1 reply; 13+ messages in thread
From: Jens Axboe @ 2020-08-25 13:39 UTC (permalink / raw)
  To: Dmitry Shulyak; +Cc: io-uring

On 8/25/20 2:52 AM, Dmitry Shulyak wrote:
> this patch fixes the issue with 0 reads. there seems to be a
> regression that is not specific to uring,
> regular syscall reads slowed down noticeably.

Do you have a test case? Related to specific system calls, or just
overall? My initial suspicion would be Yet Another one of the
security fixes...

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: Large number of empty reads on 5.9-rc2 under moderate load
  2020-08-25 13:39                     ` Jens Axboe
@ 2020-08-25 14:14                       ` Dmitry Shulyak
  0 siblings, 0 replies; 13+ messages in thread
From: Dmitry Shulyak @ 2020-08-25 14:14 UTC (permalink / raw)
  To: Jens Axboe; +Cc: io-uring

I was doing readv with the same tool that i shared earlier:
go test ./fs -run=xx -bench=BenchmarkReadAt/os_5 -benchtime=8000000x -cpu=256

On 5.8.3 it is consistently > 100 mb/s, on 5.9-rc2 ~50 mb/s on my device.
with uring i was getting > 250 mb/s, and now ~70 mb/s.

double checked if there is any difference with writev, and i don't see any.


On Tue, 25 Aug 2020 at 16:39, Jens Axboe <axboe@kernel.dk> wrote:
>
> On 8/25/20 2:52 AM, Dmitry Shulyak wrote:
> > this patch fixes the issue with 0 reads. there seems to be a
> > regression that is not specific to uring,
> > regular syscall reads slowed down noticeably.
>
> Do you have a test case? Related to specific system calls, or just
> overall? My initial suspicion would be Yet Another one of the
> security fixes...
>
> --
> Jens Axboe
>

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2020-08-25 14:15 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-08-24 10:40 Large number of empty reads on 5.9-rc2 under moderate load Dmitry Shulyak
2020-08-24 10:46 ` Jens Axboe
2020-08-24 11:09   ` Dmitry Shulyak
2020-08-24 14:06     ` Jens Axboe
2020-08-24 14:45       ` Jens Axboe
2020-08-24 15:33         ` Dmitry Shulyak
2020-08-24 16:10           ` Jens Axboe
2020-08-24 16:13             ` Dmitry Shulyak
2020-08-24 16:18               ` Jens Axboe
2020-08-24 17:44                 ` Jens Axboe
2020-08-25  8:52                   ` Dmitry Shulyak
2020-08-25 13:39                     ` Jens Axboe
2020-08-25 14:14                       ` Dmitry Shulyak

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).