* FW: re; fio / test result diffs
       [not found] ` <SJ0PR84MB1434CE895980F3953C93BA6FF4529@SJ0PR84MB1434.NAMPRD84.PROD.OUTLOOK.COM>
@ 2022-01-12 19:30   ` Gibson, Thomas
  2022-01-15 16:04     ` Sitsofe Wheeler
  0 siblings, 1 reply; 4+ messages in thread
From: Gibson, Thomas @ 2022-01-12 19:30 UTC (permalink / raw)
  To: fio

[-- Attachment #1: Type: text/plain, Size: 1782 bytes --]


My company builds SD-WAN network appliances and uses SSDs and NVMe drives for several key purposes.

To qualify upcoming new disks for our systems, we typically run 2-hour fio test runs for the following workloads:

seqrd, seqwr, seqrw, seqrd_seqwr
randrd, randwr, randrw, randrd_randwr, randrd_seqwr

We've noticed that the single-workload tests (e.g. seqrd, seqwr) show higher numbers than their counterparts in the mixed-workload tests (e.g. seqrd_seqwr), but we don't understand why.  This may be perfectly normal, but we don't understand how the testing works well enough to explain it.  And if it's not normal, we'd like to know what factors might account for it.

I've included a table of test data below.  You'll notice, as an example, that the seq read and seq write numbers are much higher than the seq read part of seqrd_seqwr, and even higher than seqrw.

I've also attached a package of fio job and test execution files in case that helps.

Also, prior to each test run we do a prefill write to the disk and clear the buffer cache, in case that helps.
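
Roughly, the prefill and cache clear look like this (simplified sketch for illustration only; the exact commands are in the attached files, and the device name /dev/sde is just an example):

  # sequential prefill of the whole device with direct I/O
  fio --name=prefill --filename=/dev/sde --rw=write --bs=1M --direct=1 --ioengine=libaio --iodepth=16
  # flush dirty pages and drop the page cache before the timed run
  sync
  echo 3 > /proc/sys/vm/drop_caches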

  Workload                         SSSTC_CVB-8D120_FW_CZJG801
  Seq Read                         533 MiB/s
  Seq Write                        317 MiB/s
  Seq Read/Write                   138 MiB/s & 138 MiB/s   ; why are the values lower here?
  Seq Read, Seq Write              152 MiB/s & 152 MiB/s   ; why are the values lower here?
  Rand Read                        532 MiB/s
  Rand Write                       253 MiB/s
  Rand Read/Write                  129 MiB/s & 129 MiB/s   ; same issue
  Rand Read, Rand Write            136 MiB/s & 136 MiB/s   ; same issue
  Rand Read, Seq Write             145 MiB/s & 145 MiB/s   ; same issue

Any help or info would be appreciated.
Tom Gibson
thomas.gibson@hpe.com
HPE/Aruba
HW/SW test engineer
Gilroy, CA.

[-- Attachment #2: fio-testfiles.tar --]
[-- Type: application/x-tar, Size: 30720 bytes --]

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: FW: re; fio / test result diffs
  2022-01-12 19:30   ` FW: re; fio / test result diffs Gibson, Thomas
@ 2022-01-15 16:04     ` Sitsofe Wheeler
  2022-01-15 22:41       ` Gibson, Thomas
  0 siblings, 1 reply; 4+ messages in thread
From: Sitsofe Wheeler @ 2022-01-15 16:04 UTC (permalink / raw)
  To: Gibson, Thomas; +Cc: fio

On Thu, 13 Jan 2022 at 00:57, Gibson, Thomas <thomas.gibson@hpe.com> wrote:
>
>
> My company builds SD-WAN network appliances and uses SSDs and NVMe drives for several key purposes.
>
> To qualify upcoming new disks for our systems, we typically run 2-hour fio test runs for the following workloads:
>
> seqrd, seqwr, seqrw, seqrd_seqwr
> randrd, randwr, randrw, randrd_randwr, randrd_seqwr
>
> We've noticed that the single-workload tests (e.g. seqrd, seqwr) show higher numbers than their counterparts in the mixed-workload tests (e.g. seqrd_seqwr), but we don't understand why.  This may be perfectly normal, but we don't understand how the testing works well enough to explain it.  And if it's not normal, we'd like to know what factors might account for it.
>
> I've included a table of test data below.  You'll notice, as an example, that the seq read and seq write numbers are much higher than the seq read part of seqrd_seqwr, and even higher than seqrw.
>
> I've also attached a package of fio job and test execution files in case that helps.
>
> Also, prior to each test run we do a prefill write to the disk and clear the buffer cache, in case that helps.
>
>   Workload                         SSSTC_CVB-8D120_FW_CZJG801
>   Seq Read                         533 MiB/s
>   Seq Write                        317 MiB/s
>   Seq Read/Write                   138 MiB/s & 138 MiB/s   ; why are the values lower here?

An example job from your tarball (included here because it's easier to read)

; Random Read,Sequential Write
[global]
ioengine=libaio
direct=1
iodepth=16
randrepeat=0
bs=256000
time_based
runtime=7200
log_avg_msec=500
[SSSTC_CVB-8D120_FW_CZJG801_2ndrun_r640-RandRd]
filename=/dev/sde
write_bw_log=SSSTC_CVB-8D120_FW_CZJG801_2ndrun_r640-randrd
write_iops_log=SSSTC_CVB-8D120_FW_CZJG801_2ndrun_r640-randrd
write_lat_log=SSSTC_CVB-8D120_FW_CZJG801_2ndrun_r640-randrd
rw=randread
[SSSTC_CVB-8D120_FW_CZJG801_2ndrun_r640-SeqWr]
filename=/dev/sde
write_bw_log=SSSTC_CVB-8D120_FW_CZJG801_2ndrun_r640-seqwr
write_iops_log=SSSTC_CVB-8D120_FW_CZJG801_2ndrun_r640-seqwr
write_lat_log=SSSTC_CVB-8D120_FW_CZJG801_2ndrun_r640-seqwr
rw=write
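
Note that the two sections are separate fio jobs, and fio starts every job
in a file at the same time by default, so here the random-read and
sequential-write streams hit /dev/sde concurrently, each with its own
iodepth=16 queue. If you ever want the two phases to run back to back
instead, a stonewall in the second section would do that - illustrative
sketch only, not taken from your files:

[SSSTC_CVB-8D120_FW_CZJG801_2ndrun_r640-SeqWr]
; wait for the preceding job to finish before starting the write phase
stonewall
filename=/dev/sde
rw=write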

What does fio's summary output look like for that job? On Linux, fio
prints disk and CPU utilisation information in its summary when the
job finishes - what does it say? Alternatively take a look at the
"iostat -xzh 1" output while the job is running and see what the disk
utilisation is like.
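
For example, something like the following narrows the report to just that
disk (device name taken from the job file above; assumes the sysstat
iostat):

iostat -xzh 1 sde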

-- 
Sitsofe

^ permalink raw reply	[flat|nested] 4+ messages in thread

* RE: FW: re; fio / test result diffs
  2022-01-15 16:04     ` Sitsofe Wheeler
@ 2022-01-15 22:41       ` Gibson, Thomas
  2022-01-17 20:03         ` Sitsofe Wheeler
  0 siblings, 1 reply; 4+ messages in thread
From: Gibson, Thomas @ 2022-01-15 22:41 UTC (permalink / raw)
  To: Sitsofe Wheeler; +Cc: fio

[-- Attachment #1: Type: text/plain, Size: 2882 bytes --]

I wasn't sure if I should reply to you directly or not, so I did for now,
but I've cc'd the fio mailing list as well.

I've attached a tarfile of raw test data from the fio runs for each test.
tom...

-----Original Message-----
From: Sitsofe Wheeler <sitsofe@gmail.com> 
Sent: Saturday, January 15, 2022 8:04 AM
To: Gibson, Thomas <thomas.gibson@hpe.com>
Cc: fio@vger.kernel.org
Subject: Re: FW: re; fio / test result diffs

On Thu, 13 Jan 2022 at 00:57, Gibson, Thomas <thomas.gibson@hpe.com> wrote:
>
>
> My company builds SD-WAN network appliances and uses SSDs and NVMe drives for several key purposes.
>
> To qualify upcoming new disks for our systems, we typically run 2-hour fio test runs for the following workloads:
>
> seqrd, seqwr, seqrw, seqrd_seqwr
> randrd, randwr, randrw, randrd_randwr, randrd_seqwr
>
> We've noticed that the single-workload tests (e.g. seqrd, seqwr) show higher numbers than their counterparts in the mixed-workload tests (e.g. seqrd_seqwr), but we don't understand why.  This may be perfectly normal, but we don't understand how the testing works well enough to explain it.  And if it's not normal, we'd like to know what factors might account for it.
>
> I've included a table of test data below.  You'll notice, as an example, that the seq read and seq write numbers are much higher than the seq read part of seqrd_seqwr, and even higher than seqrw.
>
> I've also attached a package of fio job and test execution files in case that helps.
>
> Also, prior to each test run we do a prefill write to the disk and clear the buffer cache, in case that helps.
>
>   Workload                         SSSTC_CVB-8D120_FW_CZJG801
>   Seq Read                         533 MiB/s
>   Seq Write                        317 MiB/s
>   Seq Read/Write                   138 MiB/s & 138 MiB/s   ; why are the values lower here?

An example job from your tarball (included here because it's easier to read)

; Random Read,Sequential Write
[global]
ioengine=libaio
direct=1
iodepth=16
randrepeat=0
bs=256000
time_based
runtime=7200
log_avg_msec=500
[SSSTC_CVB-8D120_FW_CZJG801_2ndrun_r640-RandRd]
filename=/dev/sde
write_bw_log=SSSTC_CVB-8D120_FW_CZJG801_2ndrun_r640-randrd
write_iops_log=SSSTC_CVB-8D120_FW_CZJG801_2ndrun_r640-randrd
write_lat_log=SSSTC_CVB-8D120_FW_CZJG801_2ndrun_r640-randrd
rw=randread
[SSSTC_CVB-8D120_FW_CZJG801_2ndrun_r640-SeqWr]
filename=/dev/sde
write_bw_log=SSSTC_CVB-8D120_FW_CZJG801_2ndrun_r640-seqwr
write_iops_log=SSSTC_CVB-8D120_FW_CZJG801_2ndrun_r640-seqwr
write_lat_log=SSSTC_CVB-8D120_FW_CZJG801_2ndrun_r640-seqwr
rw=write

What does fio's summary output look like for that job? On Linux, fio prints disk and CPU utilisation information in its summary when the job finishes - what does it say? Alternatively take a look at the "iostat -xzh 1" output while the job is running and see what the disk utilisation is like.

--
Sitsofe

[-- Attachment #2: fio-raw-test-data.tar --]
[-- Type: application/x-tar, Size: 51200 bytes --]

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: FW: re; fio / test result diffs
  2022-01-15 22:41       ` Gibson, Thomas
@ 2022-01-17 20:03         ` Sitsofe Wheeler
  0 siblings, 0 replies; 4+ messages in thread
From: Sitsofe Wheeler @ 2022-01-17 20:03 UTC (permalink / raw)
  To: Gibson, Thomas; +Cc: fio

On Sat, 15 Jan 2022 at 22:41, Gibson, Thomas <thomas.gibson@hpe.com> wrote:
>
> I wasn't sure if I should reply to you directly or not, so I did for now,
> but I've cc'd the fio mailing list as well.
>
> I've attached a tarfile of raw test data from the fio runs for each test.
> tom...

From SSSTC_CVB-8D120_FW_CZJG801_2ndrun_r640-randrd_seqwr.txt:
[...]
SSSTC_CVB-8D120_FW_CZJG801_2ndrun_r640-RandRd: (groupid=0, jobs=1):
err= 0: pid=42215: Mon Dec 20 16:53:45 2021
   read: IOPS=597, BW=146MiB/s (153MB/s)(1026GiB/7200022msec)
    slat (nsec): min=5512, max=99489, avg=8598.67, stdev=981.29
    clat (usec): min=1156, max=329247, avg=26747.47, stdev=8216.72
     lat (usec): min=1223, max=329255, avg=26756.16, stdev=8216.72
[...]
  cpu          : usr=0.15%, sys=0.64%, ctx=4305429, majf=0, minf=1267
SSSTC_CVB-8D120_FW_CZJG801_2ndrun_r640-SeqWr: (groupid=0, jobs=1):
err= 0: pid=42216: Mon Dec 20 16:53:45 2021
  write: IOPS=597, BW=146MiB/s (153MB/s)(1026GiB/7200021msec)
    slat (usec): min=7, max=143, avg=14.13, stdev= 2.23
    clat (msec): min=9, max=329, avg=26.74, stdev= 8.21
     lat (msec): min=9, max=329, avg=26.76, stdev= 8.21
[...]
  cpu          : usr=0.51%, sys=0.59%, ctx=4137284, majf=0, minf=270
[...]
Disk stats (read/write):
  sde: ios=4305116/4304491, merge=0/0, ticks=115166190/115134060,
in_queue=18446744069644883850, util=100.00%

You have plenty of CPU left but your disk utilisation is 100%, so for
whatever reason the disk can't accept more I/O. Your problem is that
something below the kernel is bottlenecking...

My only thought is that maybe your mixed workloads are better at
exhausting the disk's cache than a solo workload is?
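
As a rough sanity check on the numbers above: the mixed run moves about
146 + 146 = 292 MiB/s in total, which is already below the 317 MiB/s you
measured for a solo sequential write, so once writes are in the mix the
drive seems unable to sustain even its solo write rate.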

-- 
Sitsofe

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2022-01-17 20:04 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <SJ0PR84MB1434F97AE046381C181A3514F4529@SJ0PR84MB1434.NAMPRD84.PROD.OUTLOOK.COM>
     [not found] ` <SJ0PR84MB1434CE895980F3953C93BA6FF4529@SJ0PR84MB1434.NAMPRD84.PROD.OUTLOOK.COM>
2022-01-12 19:30   ` FW: re; fio / test result diffs Gibson, Thomas
2022-01-15 16:04     ` Sitsofe Wheeler
2022-01-15 22:41       ` Gibson, Thomas
2022-01-17 20:03         ` Sitsofe Wheeler
