fio.vger.kernel.org archive mirror
* Problems with interpreting FIO output
@ 2023-02-19  5:42 seyed mohammad
  2023-02-19 17:32 ` Adam Horshack
  0 siblings, 1 reply; 4+ messages in thread
From: seyed mohammad @ 2023-02-19  5:42 UTC (permalink / raw)
  To: fio

I've run a random read/write FIO test with 400 jobs to benchmark a
RAID setup. The overall IOPS figure and the per-interval iops
statistics (the avg value) diverge; how should this be interpreted?
Should these two values converge?

Test results for read:
job1: (groupid=0, jobs=400): err= 0: pid=25204:
  read: IOPS=199k, BW=778MiB/s (816MB/s)(23.8GiB/31279msec)
    slat (usec): min=3, max=1401.7k, avg=1423.58, stdev=38712.03
    clat (usec): min=60, max=2683.7k, avg=86310.45, stdev=289174.76
     lat (usec): min=70, max=2683.7k, avg=87730.40, stdev=291369.20
    clat percentiles (usec):
     |  1.00th=[    239],  5.00th=[    562], 10.00th=[    717],
     | 20.00th=[    816], 30.00th=[    865], 40.00th=[    906],
     | 50.00th=[    947], 60.00th=[    988], 70.00th=[   1057],
     | 80.00th=[   1156], 90.00th=[   2147], 95.00th=[ 935330],
     | 99.00th=[1266680], 99.50th=[1300235], 99.90th=[1350566],
     | 99.95th=[1384121], 99.99th=[1803551]
   bw (  MiB/s): min=  228, max= 3556, per=100.00%, avg=1553.73, stdev= 2.08, samples=12469
   iops        : min=58376, max=910489, avg=397711.45, stdev=532.63, samples=12469

Test results for write:
 write: IOPS=85.1k, BW=334MiB/s (350MB/s)(10.2GiB/31279msec); 0 zone resets
    slat (usec): min=7, max=1397.5k, avg=1430.09, stdev=38675.97
    clat (usec): min=14, max=2684.1k, avg=97398.93, stdev=306662.71
     lat (usec): min=28, max=2684.1k, avg=98825.72, stdev=308705.23
    clat percentiles (usec):
     |  1.00th=[    314],  5.00th=[    635], 10.00th=[    717],
     | 20.00th=[    791], 30.00th=[    840], 40.00th=[    881],
     | 50.00th=[    922], 60.00th=[    971], 70.00th=[   1057],
     | 80.00th=[   1270], 90.00th=[  10290], 95.00th=[1044382],
     | 99.00th=[1266680], 99.50th=[1300235], 99.90th=[1350566],
     | 99.95th=[1384121], 99.99th=[1769997]
   bw (  KiB/s): min=98765, max=1568174, per=100.00%, avg=681623.43, stdev=924.60, samples=12469
   iops        : min=24675, max=391998, avg=170363.71, stdev=231.14, samples=12469


The FIO job:
[global]
name=4k random read 4 ios
ioengine=libaio
direct=1
readwrite=randrw
rwmixread=70
iodepth=64
buffered=0
size=400M
runtime=30
time_based
randrepeat=0
norandommap
refill_buffers
ramp_time=10
bs=4k
numjobs=400
group_reporting=1
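
(For illustration only, not part of the original mail: a job file like the one above could be saved as, say, "randrw.fio", a placeholder name, and run as shown below; the group totals and the per-interval iops statistics both appear in the normal output.)

# Hypothetical invocation; "randrw.fio" is a stand-in for the job file above.
fio randrw.fio --output=results.txt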


* Re: Problems with interpreting FIO output
  2023-02-19  5:42 Problems with interpreting FIO output seyed mohammad
@ 2023-02-19 17:32 ` Adam Horshack
  2023-02-21 15:06   ` seyed mohammad
  0 siblings, 1 reply; 4+ messages in thread
From: Adam Horshack @ 2023-02-19 17:32 UTC (permalink / raw)
  To: fio

Hi Seyed,

As you probably know, the overall IOPS value is total I/Os divided by total runtime, while the iops line reports statistics over the per-interval averages. The default averaging interval for iops is 500ms (--iopsavgtime=500).

fio is reporting 12,469 interval samples for your run. I would expect 400 (jobs) x 31.279 (actual runtime in secs) x 2 (samples per second, i.e. 1000/500) = 25,023 interval samples. It's not unusual for the total number of samples to fall short of the expected count under heavy load. Your iops average is roughly 2x your reported IOPS (397k vs 199k for reads), and your sample count is roughly half of expected, which implies the two discrepancies are related.
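
(For illustration only, not part of the original mail: a minimal Python sketch of the sample-count arithmetic above, using the figures quoted from the run.)

# Sketch of the sample-count arithmetic; the values are the ones quoted above.
jobs = 400
runtime_s = 31.279             # actual runtime (31279 msec on the read line)
samples_per_sec = 1000 / 500   # one iops sample per --iopsavgtime=500 ms

expected = jobs * runtime_s * samples_per_sec
captured = 12469
print(round(expected))                # ~25023 expected interval samples
print(round(captured / expected, 2))  # ~0.5: only about half were captured

# IOPS (199k) is total I/Os / total time, so it is unaffected by sampling;
# the iops avg (397k) is the mean of the captured interval samples, so a
# ~50% sample shortfall lines up with the ~2x gap between the two figures.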

I was able to reproduce your IOPS vs iops discrepancy on a ramdisk using 100 vs 1,000 jobs. Read IOPS for a 5-second run:

100 jobs: #samples=910 vs #expected=1,000, IOPS=84.3k vs iops=83.4k
1000 jobs: #samples=1,470 vs #expected=10,000, IOPS=107k vs iops=172k
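
(A guess at the kind of command used for that reproduction; every parameter below is an assumption on my part rather than Adam's actual test, and /dev/shm stands in for the ramdisk, which is why direct I/O is left off.)

# Hypothetical repro command; all values are assumptions, not Adam's actual test.
# Needs roughly numjobs * size of free space under /dev/shm.
fio --name=repro --ioengine=libaio --rw=randread --bs=4k --iodepth=64 \
    --numjobs=1000 --size=4M --runtime=5 --time_based --direct=0 \
    --directory=/dev/shm --group_reporting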

If I enable iops logging with a 500ms interval (--write_iops_log and --log_avg_msec=500), I get the following samples for one of the jobs:

100-job run:
563, 717, 0, 0, 0
1062, 759, 0, 0, 0
1562, 784, 0, 0, 0
2062, 772, 0, 0, 0
2562, 774, 0, 0, 0
3062, 769, 0, 0, 0
3563, 778, 0, 0, 0
4064, 706, 0, 0, 0
4563, 757, 0, 0, 0

1,000-job run:
892, 128, 0, 0, 0
2660, 95, 0, 0, 0

The first column is the elapsed time in ms, the second is the average iops over that interval. Notice the 1,000-job run has far fewer samples, and some jobs in that run have no samples at all. The IOPS vs iops discrepancy is likely sampling error from too few samples being captured due to CPU load. I'm not too familiar with the back-end stats capture, but I believe it runs as a single thread; perhaps its priority could be boosted.
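
(As an aside, not from the original mail: a small Python sketch one could use to spot late or missing samples in such a per-job iops log. The filename is hypothetical, and the five columns are assumed to be time in msec, value, direction, block size, and offset, as in the samples above.)

# Rough sketch: flag iops-log samples that arrive late, assuming a 500 ms interval.
import csv

interval_ms = 500
prev_t = 0
with open("test_iops.1.log") as f:       # hypothetical per-job log file name
    for row in csv.reader(f):
        t, value = int(row[0]), int(row[1])
        gap = t - prev_t
        if gap > 1.5 * interval_ms:
            print(f"late sample at {t} ms (gap {gap} ms), iops={value}")
        prev_t = t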

Adam


* Re: Problems with interpreting FIO output
  2023-02-19 17:32 ` Adam Horshack
@ 2023-02-21 15:06   ` seyed mohammad
  2023-02-22 18:52     ` Adam Horshack
  0 siblings, 1 reply; 4+ messages in thread
From: seyed mohammad @ 2023-02-21 15:06 UTC (permalink / raw)
  To: Adam Horshack; +Cc: fio

Hi Adam,
Just to give a little background: this happened while I was testing my
software RAID10 (mdadm) array alongside BTRFS. I was checking the
system load with htop and it did not shoot through the roof. Many
thanks for the test results. The same experiment with other
filesystems showed much higher system loads.


* Re: Problems with interpreting FIO output
  2023-02-21 15:06   ` seyed mohammad
@ 2023-02-22 18:52     ` Adam Horshack
  0 siblings, 0 replies; 4+ messages in thread
From: Adam Horshack @ 2023-02-22 18:52 UTC (permalink / raw)
  To: fio

Hi Seyed,

The fact that the number of samples is much lower than expected indicates fio's stats worker thread is CPU-constrained, or is not waking as frequently as it should to properly sample the 500ms intervals.

If you can run your test with the following options, zip up the output files in /tmp/fio, and post a link to the zip in a new issue on the repository, I can have a look:

--write_iops_log=/tmp/fio/test --iopsavgtime=500 --log_avg_msec=500 --output-format="json" --output=/tmp/fio/results.json

"mkdir /tmp/fio" prior to execution

Link to open an issue: https://github.com/axboe/fio/issues

Adam




end of thread, other threads:[~2023-02-22 18:52 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-02-19  5:42 Problems with interpreting FIO output seyed mohammad
2023-02-19 17:32 ` Adam Horshack
2023-02-21 15:06   ` seyed mohammad
2023-02-22 18:52     ` Adam Horshack
