* time_based not working with randread
@ 2018-05-25 19:20 Paolo Valente
  2018-05-27 14:24 ` Sitsofe Wheeler
  0 siblings, 1 reply; 9+ messages in thread
From: Paolo Valente @ 2018-05-25 19:20 UTC (permalink / raw)
  To: fio

Hi,
if I run this job (even with the latest GitHub version of fio) on an SSD:
[global]
 ioengine=sync
 time_based=1
 runtime=20
 readwrite=randread
 size=100m
 numjobs=1
 invalidate=1
[job1]

then, after a little time (I think after 100MB have been read), fio reports a nonsensically large value for the throughput, while a simple iostat shows that no I/O is going on. By just replacing time_based with loops, i.e., with a job file like this:

[global]
 ioengine=sync
 loops=1000
 readwrite=randread
 size=100m
 numjobs=1
 invalidate=1
[job1]

the problem disappears.

In a similar vein, there is no problem with sequential reads, even in time_based mode.

Looks like fio re-reads the file from the cache with time_based randread.
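
For reference, a rough way to check whether the reads are being served
from the page cache (the device and job-file names below are placeholders,
not taken from the runs above):

# sync; echo 3 > /proc/sys/vm/drop_caches   (drop clean cached data, needs root)
# fio job.fio &                             (the time_based randread job above)
# iostat -x 1 sdX                           (r/s collapses towards 0 once the 100MB file is cached)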

Thanks,
Paolo


* Re: time_based not working with randread
  2018-05-25 19:20 time_based not working with randread Paolo Valente
@ 2018-05-27 14:24 ` Sitsofe Wheeler
  2018-05-31  4:10   ` Sitsofe Wheeler
  2018-05-31  8:55   ` Paolo Valente
  0 siblings, 2 replies; 9+ messages in thread
From: Sitsofe Wheeler @ 2018-05-27 14:24 UTC (permalink / raw)
  To: Paolo Valente; +Cc: fio, Jens Axboe

Hi Paolo!

On 25 May 2018 at 20:20, Paolo Valente <paolo.valente@linaro.org> wrote:
> Hi,
> if I run this job (even with the last version GitHub version of fio) on an SSD:
> [global]
>  ioengine=sync
>  time_based=1
>  runtime=20
>  readwrite=randread
>  size=100m
>  numjobs=1
>  invalidate=1
> [job1]
>
> then, after little time (I think after 100MB have been read), fio reports a nonsensically large value for the throughput, while a simple iostat shows that no I/O is going on. By just replacing time_based with loops, i.e., with a job file like this:
>
> [global]
>  ioengine=sync
>  loops=1000
>  readwrite=randread
>  size=100m
>  numjobs=1
>  invalidate=1
> [job1]
>
> the problem disappears.

I've taken a stab at fixing this over in
https://github.com/sitsofe/fio/tree/random_reinvalidate - does that
solve the issue for you too? There's an argument that the code should
only do this loop invalidation when time_based is set and not when
td->o.file_service_type is __FIO_FSERVICE_NONUNIFORM but I kind of
wanted to keep this commit small.

Jens: any thoughts on whether we should add another if guard so that
loop_cache_invalidate() is only called when time_based is set? The spot
in question is around
https://github.com/sitsofe/fio/commit/854d8e626e7008df43640b5e08bf980fe30a6037#diff-fa5026e1991e27a64e5921f64f6cf2d6R329

> In a similar vein, there is no problem with sequential reads, even in time_based mode.
>
> Looks like fio re-reads the file from cache in with time_based randread.

I know I'm "teaching grandmother to suck eggs" given that you're the
author of BFQ, but just in case...

This issue happens with loops=1000 too, and I believe it's down to
readahead. Basically, fio tries to invalidate the cache, but the cache
is also being repopulated by readahead, so in the end some data ends up
being served from the cache:

# modprobe null_blk completion_nsec=100000 irqmode=2
#
# ./fio --ioengine=sync --runtime=10 --size=100m
--filename=/dev/nullb0 --time_based --name=cached --rw=read --stats=0
--invalidate=1
cached: (g=0): rw=read, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T)
4096B-4096B, ioengine=sync, iodepth=1
fio-3.6-40-g854d
Starting 1 process
Jobs: 1 (f=1): [R(1)][100.0%][r=2233MiB/s,w=0KiB/s][r=572k,w=0
IOPS][eta 00m:00s]

Run status group 0 (all jobs):

Disk stats (read/write):
  nullb0: ios=266158/0, merge=0/0, ticks=26611/0, in_queue=26492, util=82.39%
# ./fio --ioengine=sync --runtime=10 --size=100m
--filename=/dev/nullb0 --time_based --name=cached --rw=read --stats=0
--invalidate=1 --fadvise=random
cached: (g=0): rw=read, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T)
4096B-4096B, ioengine=sync, iodepth=1
fio-3.6-40-g854d
Starting 1 process
Jobs: 1 (f=1): [R(1)][100.0%][r=37.3MiB/s,w=0KiB/s][r=9561,w=0
IOPS][eta 00m:00s]

Run status group 0 (all jobs):

Disk stats (read/write):
  nullb0: ios=94364/0, merge=0/0, ticks=9615/0, in_queue=9612, util=96.15%
# ./fio --ioengine=sync --runtime=10 --size=100m
--filename=/dev/nullb0 --time_based --name=cached --rw=read --stats=0
--invalidate=0
cached: (g=0): rw=read, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T)
4096B-4096B, ioengine=sync, iodepth=1
fio-3.6-40-g854d
Starting 1 process
Jobs: 1 (f=1): [R(1)][100.0%][r=3888MiB/s,w=0KiB/s][r=995k,w=0
IOPS][eta 00m:00s]

Run status group 0 (all jobs):

Disk stats (read/write):
  nullb0: ios=1208/0, merge=0/0, ticks=122/0, in_queue=121, util=0.37%
#
# echo 0 > /sys/block/nullb0/queue/read_ahead_kb
#
# ./fio --ioengine=sync --runtime=10 --size=100m
--filename=/dev/nullb0 --time_based --name=cached --rw=read --stats=0
--invalidate=1
cached: (g=0): rw=read, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T)
4096B-4096B, ioengine=sync, iodepth=1
fio-3.6-40-g854d
Starting 1 process
Jobs: 1 (f=1): [R(1)][100.0%][r=37.0MiB/s,w=0KiB/s][r=9478,w=0
IOPS][eta 00m:00s]

Run status group 0 (all jobs):

Disk stats (read/write):
  nullb0: ios=93518/0, merge=0/0, ticks=9654/0, in_queue=9653, util=96.56%
# ./fio --ioengine=sync --runtime=10 --size=100m
--filename=/dev/nullb0 --time_based --name=cached --rw=read --stats=0
--invalidate=0
cached: (g=0): rw=read, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T)
4096B-4096B, ioengine=sync, iodepth=1
fio-3.6-40-g854d
Starting 1 process
Jobs: 1 (f=1): [R(1)][100.0%][r=3894MiB/s,w=0KiB/s][r=997k,w=0
IOPS][eta 00m:00s]

Run status group 0 (all jobs):

Disk stats (read/write):
  nullb0: ios=25600/0, merge=0/0, ticks=2650/0, in_queue=2649, util=26.50%

Note: readahead (if set) can still take place even when the I/O
scheduler has been set to none.
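
As an aside, a rough sketch of equivalent knobs (not taken from the runs
above): the per-device readahead can also be inspected and cleared with
blockdev, independently of the scheduler setting.

# cat /sys/block/nullb0/queue/scheduler     (shows the active I/O scheduler)
# blockdev --getra /dev/nullb0              (readahead in 512-byte sectors)
# blockdev --setra 0 /dev/nullb0            (same effect as read_ahead_kb = 0)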

-- 
Sitsofe | http://sucs.org/~sits/



* Re: time_based not working with randread
  2018-05-27 14:24 ` Sitsofe Wheeler
@ 2018-05-31  4:10   ` Sitsofe Wheeler
  2018-05-31  8:55   ` Paolo Valente
  1 sibling, 0 replies; 9+ messages in thread
From: Sitsofe Wheeler @ 2018-05-31  4:10 UTC (permalink / raw)
  To: Paolo Valente; +Cc: fio, Jens Axboe

Hi Paolo,

I'm just chasing this up - do you know whether the commit in
https://github.com/sitsofe/fio/tree/random_reinvalidate helped?

On 27 May 2018 at 15:24, Sitsofe Wheeler <sitsofe@gmail.com> wrote:
> Hi Paolo!
>
> On 25 May 2018 at 20:20, Paolo Valente <paolo.valente@linaro.org> wrote:
>> Hi,
>> if I run this job (even with the last version GitHub version of fio) on an SSD:
>> [global]
>>  ioengine=sync
>>  time_based=1
>>  runtime=20
>>  readwrite=randread
>>  size=100m
>>  numjobs=1
>>  invalidate=1
>> [job1]
>>
>> then, after little time (I think after 100MB have been read), fio reports a nonsensically large value for the throughput, while a simple iostat shows that no I/O is going on. By just replacing time_based with loops, i.e., with a job file like this:
>>
>> [global]
>>  ioengine=sync
>>  loops=1000
>>  readwrite=randread
>>  size=100m
>>  numjobs=1
>>  invalidate=1
>> [job1]
>>
>> the problem disappears.
>
> I've taken a stab at fixing this over in
> https://github.com/sitsofe/fio/tree/random_reinvalidate - does that
> solve the issue for you too? There's an argument that the code should
> only do this loop invalidation when time_based is set and not when
> td->o.file_service_type is __FIO_FSERVICE_NONUNIFORM but I kind of
> wanted to keep this commit small.
>
> Jens: any thoughts on whether we should add another if guard to ensure
> we only call loop_cache_invalidate() with time_based around
> https://github.com/sitsofe/fio/commit/854d8e626e7008df43640b5e08bf980fe30a6037#diff-fa5026e1991e27a64e5921f64f6cf2d6R329
> ?
>
>> In a similar vein, there is no problem with sequential reads, even in time_based mode.
>>
>> Looks like fio re-reads the file from cache in with time_based randread.
>
> I know I'm "Teaching grandmother to suck eggs" given that you're the
> author of BFQ but just in case...
>
> This issue happens on loops=1000 too and I believe it's down to
> readahead. Basically fio is trying to invalidate the cache but said
> cache is also being populated by readahead and in the end some data
> ends up reused from the cache:
>
> # modprobe null_blk completion_nsec=100000 irqmode=2
> #
> # ./fio --ioengine=sync --runtime=10 --size=100m
> --filename=/dev/nullb0 --time_based --name=cached --rw=read --stats=0
> --invalidate=1
> cached: (g=0): rw=read, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T)
> 4096B-4096B, ioengine=sync, iodepth=1
> fio-3.6-40-g854d
> Starting 1 process
> Jobs: 1 (f=1): [R(1)][100.0%][r=2233MiB/s,w=0KiB/s][r=572k,w=0
> IOPS][eta 00m:00s]
>
> Run status group 0 (all jobs):
>
> Disk stats (read/write):
>   nullb0: ios=266158/0, merge=0/0, ticks=26611/0, in_queue=26492, util=82.39%
> # ./fio --ioengine=sync --runtime=10 --size=100m
> --filename=/dev/nullb0 --time_based --name=cached --rw=read --stats=0
> --invalidate=1 --fadvise=random
> cached: (g=0): rw=read, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T)
> 4096B-4096B, ioengine=sync, iodepth=1
> fio-3.6-40-g854d
> Starting 1 process
> Jobs: 1 (f=1): [R(1)][100.0%][r=37.3MiB/s,w=0KiB/s][r=9561,w=0
> IOPS][eta 00m:00s]
>
> Run status group 0 (all jobs):
>
> Disk stats (read/write):
>   nullb0: ios=94364/0, merge=0/0, ticks=9615/0, in_queue=9612, util=96.15%
> # ./fio --ioengine=sync --runtime=10 --size=100m
> --filename=/dev/nullb0 --time_based --name=cached --rw=read --stats=0
> --invalidate=0
> cached: (g=0): rw=read, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T)
> 4096B-4096B, ioengine=sync, iodepth=1
> fio-3.6-40-g854d
> Starting 1 process
> Jobs: 1 (f=1): [R(1)][100.0%][r=3888MiB/s,w=0KiB/s][r=995k,w=0
> IOPS][eta 00m:00s]
>
> Run status group 0 (all jobs):
>
> Disk stats (read/write):
>   nullb0: ios=1208/0, merge=0/0, ticks=122/0, in_queue=121, util=0.37%
> #
> # echo 0 > /sys/block/nullb0/queue/read_ahead_kb
> #
> # ./fio --ioengine=sync --runtime=10 --size=100m
> --filename=/dev/nullb0 --time_based --name=cached --rw=read --stats=0
> --invalidate=1
> cached: (g=0): rw=read, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T)
> 4096B-4096B, ioengine=sync, iodepth=1
> fio-3.6-40-g854d
> Starting 1 process
> Jobs: 1 (f=1): [R(1)][100.0%][r=37.0MiB/s,w=0KiB/s][r=9478,w=0
> IOPS][eta 00m:00s]
>
> Run status group 0 (all jobs):
>
> Disk stats (read/write):
>   nullb0: ios=93518/0, merge=0/0, ticks=9654/0, in_queue=9653, util=96.56%
> # ./fio --ioengine=sync --runtime=10 --size=100m
> --filename=/dev/nullb0 --time_based --name=cached --rw=read --stats=0
> --invalidate=0
> cached: (g=0): rw=read, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T)
> 4096B-4096B, ioengine=sync, iodepth=1
> fio-3.6-40-g854d
> Starting 1 process
> Jobs: 1 (f=1): [R(1)][100.0%][r=3894MiB/s,w=0KiB/s][r=997k,w=0
> IOPS][eta 00m:00s]
>
> Run status group 0 (all jobs):
>
> Disk stats (read/write):
>   nullb0: ios=25600/0, merge=0/0, ticks=2650/0, in_queue=2649, util=26.50%
>
> Note: readahead (if set) can still take place even when the
> ioscheduler has been set to none.

-- 
Sitsofe | http://sucs.org/~sits/



* Re: time_based not working with randread
  2018-05-27 14:24 ` Sitsofe Wheeler
  2018-05-31  4:10   ` Sitsofe Wheeler
@ 2018-05-31  8:55   ` Paolo Valente
  2018-05-31 14:38     ` Jens Axboe
  1 sibling, 1 reply; 9+ messages in thread
From: Paolo Valente @ 2018-05-31  8:55 UTC (permalink / raw)
  To: Sitsofe Wheeler; +Cc: fio, Jens Axboe



> Il giorno 27 mag 2018, alle ore 16:24, Sitsofe Wheeler <sitsofe@gmail.com> ha scritto:
> 
> Hi Paolo!
> 
> On 25 May 2018 at 20:20, Paolo Valente <paolo.valente@linaro.org> wrote:
>> Hi,
>> if I run this job (even with the last version GitHub version of fio) on an SSD:
>> [global]
>> ioengine=sync
>> time_based=1
>> runtime=20
>> readwrite=randread
>> size=100m
>> numjobs=1
>> invalidate=1
>> [job1]
>> 
>> then, after little time (I think after 100MB have been read), fio reports a nonsensically large value for the throughput, while a simple iostat shows that no I/O is going on. By just replacing time_based with loops, i.e., with a job file like this:
>> 
>> [global]
>> ioengine=sync
>> loops=1000
>> readwrite=randread
>> size=100m
>> numjobs=1
>> invalidate=1
>> [job1]
>> 
>> the problem disappears.
> 
> I've taken a stab at fixing this over in
> https://github.com/sitsofe/fio/tree/random_reinvalidate - does that
> solve the issue for you too?

Nope :(

> ...
> I know I'm "Teaching grandmother to suck eggs" given that you're the
> author of BFQ but just in case...
> 
> This issue happens on loops=1000 too and I believe it's down to
> readahead.

I'm afraid there is a misunderstanding on this, grandson :)

As I wrote, this problem does not occur with loops=1000.  My
impression is that, with loops, as well as with time_based and
sequential read, fio does invalidate the cache every time it restarts
reading the same file, while with time_based and randread it does not
(or maybe it tries to, but fails for some reason).

Thanks,
grandma

> Basically fio is trying to invalidate the cache but said
> cache is also being populated by readahead and in the end some data
> ends up reused from the cache:
> 
> # modprobe null_blk completion_nsec=100000 irqmode=2
> #
> # ./fio --ioengine=sync --runtime=10 --size=100m
> --filename=/dev/nullb0 --time_based --name=cached --rw=read --stats=0
> --invalidate=1
> cached: (g=0): rw=read, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T)
> 4096B-4096B, ioengine=sync, iodepth=1
> fio-3.6-40-g854d
> Starting 1 process
> Jobs: 1 (f=1): [R(1)][100.0%][r=2233MiB/s,w=0KiB/s][r=572k,w=0
> IOPS][eta 00m:00s]
> 
> Run status group 0 (all jobs):
> 
> Disk stats (read/write):
>  nullb0: ios=266158/0, merge=0/0, ticks=26611/0, in_queue=26492, util=82.39%
> # ./fio --ioengine=sync --runtime=10 --size=100m
> --filename=/dev/nullb0 --time_based --name=cached --rw=read --stats=0
> --invalidate=1 --fadvise=random
> cached: (g=0): rw=read, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T)
> 4096B-4096B, ioengine=sync, iodepth=1
> fio-3.6-40-g854d
> Starting 1 process
> Jobs: 1 (f=1): [R(1)][100.0%][r=37.3MiB/s,w=0KiB/s][r=9561,w=0
> IOPS][eta 00m:00s]
> 
> Run status group 0 (all jobs):
> 
> Disk stats (read/write):
>  nullb0: ios=94364/0, merge=0/0, ticks=9615/0, in_queue=9612, util=96.15%
> # ./fio --ioengine=sync --runtime=10 --size=100m
> --filename=/dev/nullb0 --time_based --name=cached --rw=read --stats=0
> --invalidate=0
> cached: (g=0): rw=read, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T)
> 4096B-4096B, ioengine=sync, iodepth=1
> fio-3.6-40-g854d
> Starting 1 process
> Jobs: 1 (f=1): [R(1)][100.0%][r=3888MiB/s,w=0KiB/s][r=995k,w=0
> IOPS][eta 00m:00s]
> 
> Run status group 0 (all jobs):
> 
> Disk stats (read/write):
>  nullb0: ios=1208/0, merge=0/0, ticks=122/0, in_queue=121, util=0.37%
> #
> # echo 0 > /sys/block/nullb0/queue/read_ahead_kb
> #
> # ./fio --ioengine=sync --runtime=10 --size=100m
> --filename=/dev/nullb0 --time_based --name=cached --rw=read --stats=0
> --invalidate=1
> cached: (g=0): rw=read, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T)
> 4096B-4096B, ioengine=sync, iodepth=1
> fio-3.6-40-g854d
> Starting 1 process
> Jobs: 1 (f=1): [R(1)][100.0%][r=37.0MiB/s,w=0KiB/s][r=9478,w=0
> IOPS][eta 00m:00s]
> 
> Run status group 0 (all jobs):
> 
> Disk stats (read/write):
>  nullb0: ios=93518/0, merge=0/0, ticks=9654/0, in_queue=9653, util=96.56%
> # ./fio --ioengine=sync --runtime=10 --size=100m
> --filename=/dev/nullb0 --time_based --name=cached --rw=read --stats=0
> --invalidate=0
> cached: (g=0): rw=read, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T)
> 4096B-4096B, ioengine=sync, iodepth=1
> fio-3.6-40-g854d
> Starting 1 process
> Jobs: 1 (f=1): [R(1)][100.0%][r=3894MiB/s,w=0KiB/s][r=997k,w=0
> IOPS][eta 00m:00s]
> 
> Run status group 0 (all jobs):
> 
> Disk stats (read/write):
>  nullb0: ios=25600/0, merge=0/0, ticks=2650/0, in_queue=2649, util=26.50%
> 
> Note: readahead (if set) can still take place even when the
> ioscheduler has been set to none.
> 
> -- 
> Sitsofe | http://sucs.org/~sits/




* Re: time_based not working with randread
  2018-05-31  8:55   ` Paolo Valente
@ 2018-05-31 14:38     ` Jens Axboe
  2018-05-31 14:49       ` Paolo Valente
  0 siblings, 1 reply; 9+ messages in thread
From: Jens Axboe @ 2018-05-31 14:38 UTC (permalink / raw)
  To: Paolo Valente, Sitsofe Wheeler; +Cc: fio

On 5/31/18 2:55 AM, Paolo Valente wrote:
> 
> 
>> Il giorno 27 mag 2018, alle ore 16:24, Sitsofe Wheeler <sitsofe@gmail.com> ha scritto:
>>
>> Hi Paolo!
>>
>> On 25 May 2018 at 20:20, Paolo Valente <paolo.valente@linaro.org> wrote:
>>> Hi,
>>> if I run this job (even with the last version GitHub version of fio) on an SSD:
>>> [global]
>>> ioengine=sync
>>> time_based=1
>>> runtime=20
>>> readwrite=randread
>>> size=100m
>>> numjobs=1
>>> invalidate=1
>>> [job1]
>>>
>>> then, after little time (I think after 100MB have been read), fio reports a nonsensically large value for the throughput, while a simple iostat shows that no I/O is going on. By just replacing time_based with loops, i.e., with a job file like this:
>>>
>>> [global]
>>> ioengine=sync
>>> loops=1000
>>> readwrite=randread
>>> size=100m
>>> numjobs=1
>>> invalidate=1
>>> [job1]
>>>
>>> the problem disappears.
>>
>> I've taken a stab at fixing this over in
>> https://github.com/sitsofe/fio/tree/random_reinvalidate - does that
>> solve the issue for you too?
> 
> Nope :(
> 
>> ...
>> I know I'm "Teaching grandmother to suck eggs" given that you're the
>> author of BFQ but just in case...
>>
>> This issue happens on loops=1000 too and I believe it's down to
>> readahead.
> 
> I'm afraid there is a misunderstanding on this, grandson :)
> 
> As I wrote, this problem does not occur with loops=1000.  My
> impression is that, with loops, as well as with time-based and read,
> fio does invalidate the cache every time it restarts reading the same
> file, while with time_based and randread it does not (or maybe it
> tries to, but fails for some reason).

This is basically by design. loops will go through the full
open+invalidate, whereas time_based will just keep chugging
along. Once your 100MB is in the page cache, no more I/O will
be done, as reads are just served from there.
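
As a rough aside (not something from the original runs): if the goal is
for a time_based randread job to keep hitting the device rather than the
page cache, direct I/O sidesteps the page cache entirely. The options
below are standard fio options; the device path is a placeholder.

# fio --name=job1 --ioengine=sync --direct=1 --time_based --runtime=20 \
      --rw=randread --size=100m --filename=/dev/sdX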

-- 
Jens Axboe




* Re: time_based not working with randread
  2018-05-31 14:38     ` Jens Axboe
@ 2018-05-31 14:49       ` Paolo Valente
  2018-05-31 15:07         ` Jens Axboe
  0 siblings, 1 reply; 9+ messages in thread
From: Paolo Valente @ 2018-05-31 14:49 UTC (permalink / raw)
  To: Jens Axboe; +Cc: Sitsofe Wheeler, fio



> Il giorno 31 mag 2018, alle ore 16:38, Jens Axboe <axboe@kernel.dk> ha scritto:
> 
> On 5/31/18 2:55 AM, Paolo Valente wrote:
>> 
>> 
>>> Il giorno 27 mag 2018, alle ore 16:24, Sitsofe Wheeler <sitsofe@gmail.com> ha scritto:
>>> 
>>> Hi Paolo!
>>> 
>>> On 25 May 2018 at 20:20, Paolo Valente <paolo.valente@linaro.org> wrote:
>>>> Hi,
>>>> if I run this job (even with the last version GitHub version of fio) on an SSD:
>>>> [global]
>>>> ioengine=sync
>>>> time_based=1
>>>> runtime=20
>>>> readwrite=randread
>>>> size=100m
>>>> numjobs=1
>>>> invalidate=1
>>>> [job1]
>>>> 
>>>> then, after little time (I think after 100MB have been read), fio reports a nonsensically large value for the throughput, while a simple iostat shows that no I/O is going on. By just replacing time_based with loops, i.e., with a job file like this:
>>>> 
>>>> [global]
>>>> ioengine=sync
>>>> loops=1000
>>>> readwrite=randread
>>>> size=100m
>>>> numjobs=1
>>>> invalidate=1
>>>> [job1]
>>>> 
>>>> the problem disappears.
>>> 
>>> I've taken a stab at fixing this over in
>>> https://github.com/sitsofe/fio/tree/random_reinvalidate - does that
>>> solve the issue for you too?
>> 
>> Nope :(
>> 
>>> ...
>>> I know I'm "Teaching grandmother to suck eggs" given that you're the
>>> author of BFQ but just in case...
>>> 
>>> This issue happens on loops=1000 too and I believe it's down to
>>> readahead.
>> 
>> I'm afraid there is a misunderstanding on this, grandson :)
>> 
>> As I wrote, this problem does not occur with loops=1000.  My
>> impression is that, with loops, as well as with time-based and read,
>> fio does invalidate the cache every time it restarts reading the same
>> file, while with time_based and randread it does not (or maybe it
>> tries to, but fails for some reason).
> 
> This is basically by design. loop will go through the full
> open+invalidate, whereas time_based will just keep chugging
> along. Once your 100mb is in page cache, then no more IO
> will be done, as reads are just served from there.
> 

Such a design confused me.  Highlighting this deviation (between
loops and time_based) somewhere might help other dull people like me.

Thanks,
Paolo

> -- 
> Jens Axboe




* Re: time_based not working with randread
  2018-05-31 14:49       ` Paolo Valente
@ 2018-05-31 15:07         ` Jens Axboe
  2018-06-04 15:15           ` Paolo Valente
  0 siblings, 1 reply; 9+ messages in thread
From: Jens Axboe @ 2018-05-31 15:07 UTC (permalink / raw)
  To: Paolo Valente; +Cc: Sitsofe Wheeler, fio

On 5/31/18 8:49 AM, Paolo Valente wrote:
> 
> 
>> Il giorno 31 mag 2018, alle ore 16:38, Jens Axboe <axboe@kernel.dk> ha scritto:
>>
>> On 5/31/18 2:55 AM, Paolo Valente wrote:
>>>
>>>
>>>> Il giorno 27 mag 2018, alle ore 16:24, Sitsofe Wheeler <sitsofe@gmail.com> ha scritto:
>>>>
>>>> Hi Paolo!
>>>>
>>>> On 25 May 2018 at 20:20, Paolo Valente <paolo.valente@linaro.org> wrote:
>>>>> Hi,
>>>>> if I run this job (even with the last version GitHub version of fio) on an SSD:
>>>>> [global]
>>>>> ioengine=sync
>>>>> time_based=1
>>>>> runtime=20
>>>>> readwrite=randread
>>>>> size=100m
>>>>> numjobs=1
>>>>> invalidate=1
>>>>> [job1]
>>>>>
>>>>> then, after little time (I think after 100MB have been read), fio reports a nonsensically large value for the throughput, while a simple iostat shows that no I/O is going on. By just replacing time_based with loops, i.e., with a job file like this:
>>>>>
>>>>> [global]
>>>>> ioengine=sync
>>>>> loops=1000
>>>>> readwrite=randread
>>>>> size=100m
>>>>> numjobs=1
>>>>> invalidate=1
>>>>> [job1]
>>>>>
>>>>> the problem disappears.
>>>>
>>>> I've taken a stab at fixing this over in
>>>> https://github.com/sitsofe/fio/tree/random_reinvalidate - does that
>>>> solve the issue for you too?
>>>
>>> Nope :(
>>>
>>>> ...
>>>> I know I'm "Teaching grandmother to suck eggs" given that you're the
>>>> author of BFQ but just in case...
>>>>
>>>> This issue happens on loops=1000 too and I believe it's down to
>>>> readahead.
>>>
>>> I'm afraid there is a misunderstanding on this, grandson :)
>>>
>>> As I wrote, this problem does not occur with loops=1000.  My
>>> impression is that, with loops, as well as with time-based and read,
>>> fio does invalidate the cache every time it restarts reading the same
>>> file, while with time_based and randread it does not (or maybe it
>>> tries to, but fails for some reason).
>>
>> This is basically by design. loop will go through the full
>> open+invalidate, whereas time_based will just keep chugging
>> along. Once your 100mb is in page cache, then no more IO
>> will be done, as reads are just served from there.
>>
> 
> Such a design confused me.  Highlighting somewhere this deviation
> (between loops and time_based) might help other dull people like me.

Actually, I'm misremembering, and we did fix this up. But it looks
like I botched the fix; try pulling a new update and it should
work for you. Fix:

http://git.kernel.dk/cgit/fio/commit/?id=80f021501fda6a6244672bb89dd8221a61cee54b
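
A quick sketch of one way to pick up the fix (the GitHub mirror URL is
an assumption here; use whichever fio remote you already track):

# git clone https://github.com/axboe/fio.git && cd fio   (or git pull in an existing clone)
# ./configure && make
# ./fio --version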

-- 
Jens Axboe




* Re: time_based not working with randread
  2018-05-31 15:07         ` Jens Axboe
@ 2018-06-04 15:15           ` Paolo Valente
  2018-06-04 15:17             ` Jens Axboe
  0 siblings, 1 reply; 9+ messages in thread
From: Paolo Valente @ 2018-06-04 15:15 UTC (permalink / raw)
  To: Jens Axboe; +Cc: Sitsofe Wheeler, fio



> Il giorno 31 mag 2018, alle ore 17:07, Jens Axboe <axboe@kernel.dk> ha scritto:
> 
> On 5/31/18 8:49 AM, Paolo Valente wrote:
>> 
>> 
>>> Il giorno 31 mag 2018, alle ore 16:38, Jens Axboe <axboe@kernel.dk> ha scritto:
>>> 
>>> On 5/31/18 2:55 AM, Paolo Valente wrote:
>>>> 
>>>> 
>>>>> Il giorno 27 mag 2018, alle ore 16:24, Sitsofe Wheeler <sitsofe@gmail.com> ha scritto:
>>>>> 
>>>>> Hi Paolo!
>>>>> 
>>>>> On 25 May 2018 at 20:20, Paolo Valente <paolo.valente@linaro.org> wrote:
>>>>>> Hi,
>>>>>> if I run this job (even with the last version GitHub version of fio) on an SSD:
>>>>>> [global]
>>>>>> ioengine=sync
>>>>>> time_based=1
>>>>>> runtime=20
>>>>>> readwrite=randread
>>>>>> size=100m
>>>>>> numjobs=1
>>>>>> invalidate=1
>>>>>> [job1]
>>>>>> 
>>>>>> then, after little time (I think after 100MB have been read), fio reports a nonsensically large value for the throughput, while a simple iostat shows that no I/O is going on. By just replacing time_based with loops, i.e., with a job file like this:
>>>>>> 
>>>>>> [global]
>>>>>> ioengine=sync
>>>>>> loops=1000
>>>>>> readwrite=randread
>>>>>> size=100m
>>>>>> numjobs=1
>>>>>> invalidate=1
>>>>>> [job1]
>>>>>> 
>>>>>> the problem disappears.
>>>>> 
>>>>> I've taken a stab at fixing this over in
>>>>> https://github.com/sitsofe/fio/tree/random_reinvalidate - does that
>>>>> solve the issue for you too?
>>>> 
>>>> Nope :(
>>>> 
>>>>> ...
>>>>> I know I'm "Teaching grandmother to suck eggs" given that you're the
>>>>> author of BFQ but just in case...
>>>>> 
>>>>> This issue happens on loops=1000 too and I believe it's down to
>>>>> readahead.
>>>> 
>>>> I'm afraid there is a misunderstanding on this, grandson :)
>>>> 
>>>> As I wrote, this problem does not occur with loops=1000.  My
>>>> impression is that, with loops, as well as with time-based and read,
>>>> fio does invalidate the cache every time it restarts reading the same
>>>> file, while with time_based and randread it does not (or maybe it
>>>> tries to, but fails for some reason).
>>> 
>>> This is basically by design. loop will go through the full
>>> open+invalidate, whereas time_based will just keep chugging
>>> along. Once your 100mb is in page cache, then no more IO
>>> will be done, as reads are just served from there.
>>> 
>> 
>> Such a design confused me.  Highlighting somewhere this deviation
>> (between loops and time_based) might help other dull people like me.
> 
> Actually, I'm misremembering, and we did fix this up. But looks
> like I botched the fix, try pulling a new update and it should
> work for you. Fix:
> 
> http://git.kernel.dk/cgit/fio/commit/?id=80f021501fda6a6244672bb89dd8221a61cee54b

It does work!

A Tested-by and/or a Reported-by tag would have been nice to have, but I guess it's too late.

Thanks,
Paolo

> 
> -- 
> Jens Axboe




* Re: time_based not working with randread
  2018-06-04 15:15           ` Paolo Valente
@ 2018-06-04 15:17             ` Jens Axboe
  0 siblings, 0 replies; 9+ messages in thread
From: Jens Axboe @ 2018-06-04 15:17 UTC (permalink / raw)
  To: Paolo Valente; +Cc: Sitsofe Wheeler, fio

On 6/4/18 9:15 AM, Paolo Valente wrote:
> 
> 
>> Il giorno 31 mag 2018, alle ore 17:07, Jens Axboe <axboe@kernel.dk> ha scritto:
>>
>> On 5/31/18 8:49 AM, Paolo Valente wrote:
>>>
>>>
>>>> Il giorno 31 mag 2018, alle ore 16:38, Jens Axboe <axboe@kernel.dk> ha scritto:
>>>>
>>>> On 5/31/18 2:55 AM, Paolo Valente wrote:
>>>>>
>>>>>
>>>>>> Il giorno 27 mag 2018, alle ore 16:24, Sitsofe Wheeler <sitsofe@gmail.com> ha scritto:
>>>>>>
>>>>>> Hi Paolo!
>>>>>>
>>>>>> On 25 May 2018 at 20:20, Paolo Valente <paolo.valente@linaro.org> wrote:
>>>>>>> Hi,
>>>>>>> if I run this job (even with the last version GitHub version of fio) on an SSD:
>>>>>>> [global]
>>>>>>> ioengine=sync
>>>>>>> time_based=1
>>>>>>> runtime=20
>>>>>>> readwrite=randread
>>>>>>> size=100m
>>>>>>> numjobs=1
>>>>>>> invalidate=1
>>>>>>> [job1]
>>>>>>>
>>>>>>> then, after little time (I think after 100MB have been read), fio reports a nonsensically large value for the throughput, while a simple iostat shows that no I/O is going on. By just replacing time_based with loops, i.e., with a job file like this:
>>>>>>>
>>>>>>> [global]
>>>>>>> ioengine=sync
>>>>>>> loops=1000
>>>>>>> readwrite=randread
>>>>>>> size=100m
>>>>>>> numjobs=1
>>>>>>> invalidate=1
>>>>>>> [job1]
>>>>>>>
>>>>>>> the problem disappears.
>>>>>>
>>>>>> I've taken a stab at fixing this over in
>>>>>> https://github.com/sitsofe/fio/tree/random_reinvalidate - does that
>>>>>> solve the issue for you too?
>>>>>
>>>>> Nope :(
>>>>>
>>>>>> ...
>>>>>> I know I'm "Teaching grandmother to suck eggs" given that you're the
>>>>>> author of BFQ but just in case...
>>>>>>
>>>>>> This issue happens on loops=1000 too and I believe it's down to
>>>>>> readahead.
>>>>>
>>>>> I'm afraid there is a misunderstanding on this, grandson :)
>>>>>
>>>>> As I wrote, this problem does not occur with loops=1000.  My
>>>>> impression is that, with loops, as well as with time-based and read,
>>>>> fio does invalidate the cache every time it restarts reading the same
>>>>> file, while with time_based and randread it does not (or maybe it
>>>>> tries to, but fails for some reason).
>>>>
>>>> This is basically by design. loop will go through the full
>>>> open+invalidate, whereas time_based will just keep chugging
>>>> along. Once your 100mb is in page cache, then no more IO
>>>> will be done, as reads are just served from there.
>>>>
>>>
>>> Such a design confused me.  Highlighting somewhere this deviation
>>> (between loops and time_based) might help other dull people like me.
>>
>> Actually, I'm misremembering, and we did fix this up. But looks
>> like I botched the fix, try pulling a new update and it should
>> work for you. Fix:
>>
>> http://git.kernel.dk/cgit/fio/commit/?id=80f021501fda6a6244672bb89dd8221a61cee54b
> 
> It does work!
> 
> A tested-by and/or a reported-by would have been helpful for me, but I guess it's too late.

Yeah, but depending on the fix, I don't usually wait around for those.
If it's obvious enough, I just commit it, especially if I can reproduce
and verify it myself. Thanks for confirming, though.

-- 
Jens Axboe



