Hello fio users. This is my first attempt at using fio, and I'm puzzled by the results I obtained while running it on a ZFS server I built recently. I'm running these commands mostly out of curiosity, just to understand, in a ballpark range, what the server is capable of. Here are my naive attempts and results.

Context: FreeBSD 11.2, ZFS, compression enabled, deduplication disabled, 120 x 10TB SAS HDDs, no SSD caching, 768GB of DDR4 memory, 2x Intel Xeon Silver 4110 processors. All commands were run directly on the FreeBSD server, with fio-3.8.

*_Test # 1:_* Use dd to create a 16 GiB file with random contents.

root@delorean:/sec_stor/backup/fiotest # time dd if=/dev/random of=./test_file bs=128k count=131072
131072+0 records in
131072+0 records out
17179869184 bytes transferred in 137.402721 secs (125032962 bytes/sec)
0.031u 136.915s 2:17.40 99.6% 30+172k 4+131072io 0pf+0w

This shows that I was able to create a 16 GiB file with random data in 137 secs, a throughput of approx. *119 MiB/s.*

*_Test # 2:_* Use fio to perform sequential writes on a 16 GiB file with default (random) buffer contents.

root@delorean:/sec_stor/backup/fiotest/fio-master # ./fio --name=seqwrite --rw=write --bs=128k --numjobs=1 --size=16G --runtime=60 --iodepth=1 --group_reporting
seqwrite: (g=0): rw=write, bs=(R) 128KiB-128KiB, (W) 128KiB-128KiB, (T) 128KiB-128KiB, ioengine=psync, iodepth=1
fio-3.8
Starting 1 process
Jobs: 1 (f=1): [W(1)][100.0%][r=0KiB/s,w=2188MiB/s][r=0,w=17.5k IOPS][eta 00m:00s]
seqwrite: (groupid=0, jobs=1): err= 0: pid=8693: Wed Jul 25 14:11:54 2018
  write: IOPS=16.5k, BW=2065MiB/s (2165MB/s)(16.0GiB/7935msec)
    clat (usec): min=18, max=17174, avg=58.18, stdev=285.68
     lat (usec): min=18, max=17180, avg=59.94, stdev=285.88
    clat percentiles (usec):
     |  1.00th=[   23],  5.00th=[   39], 10.00th=[   41], 20.00th=[   42],
     | 30.00th=[   43], 40.00th=[   44], 50.00th=[   45], 60.00th=[   47],
     | 70.00th=[   51], 80.00th=[   57], 90.00th=[   73], 95.00th=[   89],
     | 99.00th=[  135], 99.50th=[  165], 99.90th=[  396], 99.95th=[10421],
     | 99.99th=[14877]
   bw (  MiB/s): min= 1648, max= 2388, per=99.00%, avg=2044.07, stdev=218.90, samples=15
   iops        : min=13185, max=19108, avg=16352.07, stdev=1751.17, samples=15
  lat (usec)   : 20=0.33%, 50=67.57%, 100=28.78%, 250=3.12%, 500=0.13%
  lat (usec)   : 750=0.01%, 1000=0.01%
  lat (msec)   : 2=0.01%, 4=0.01%, 20=0.05%
  cpu          : usr=5.08%, sys=90.27%, ctx=40602, majf=0, minf=0
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=0,131072,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1

Run status group 0 (all jobs):
  WRITE: bw=2065MiB/s (2165MB/s), 2065MiB/s-2065MiB/s (2165MB/s-2165MB/s), io=16.0GiB (17.2GB), run=7935-7935msec

root@delorean:/sec_stor/backup/fiotest/fio-master # du -sh seqwrite.0.0
 16G    seqwrite.0.0

As you can see, it hits a throughput of 2065 MiB/s.

Question: Why such a drastic difference between dd and this? My understanding is that dd must have used a create call to create the file, then issued open and write calls to put the random data into the open file, and finally closed it. I chose a block size of 128k to match the ZFS default record size. The fio test should be measuring just the write calls, with everything else the same. Why is there so much difference in throughput?

While that question was still open in my mind, I ventured into testing with compressible data.
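(A note to myself for a follow-up, in case it helps frame the question: I plan to measure two things separately. These are untested sketches, and the exact numbers in the comments are guesses, not results.)

# 1) How fast can this box read /dev/random by itself, with no file writes?
time dd if=/dev/random of=/dev/null bs=128k count=131072

# 2) The same fio run as Test #2, but with end_fsync=1 so the reported time
#    also covers flushing whatever ZFS buffered in memory out to the disks:
./fio --name=seqwrite --rw=write --bs=128k --numjobs=1 --size=16G --runtime=60 --iodepth=1 --end_fsync=1 --group_reporting

If (1) alone tops out near 120 MiB/s, or if (2) reports much lower bandwidth than Test #2, that would at least tell me whether the gap comes from the random-number source, from write buffering, or from both.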
*_Test # 3:_* Run fio with buffer_compress_percentage=0 (0% compressible data). My expectation was that this would match the results of test #2. But it didn't actually create 0% compressible data; more like 100% compressible data!

root@delorean:/sec_stor/backup/fiotest/fio-master # ./fio --name=seqwrite --rw=write --bs=128k --numjobs=1 --size=16G --runtime=60 --iodepth=1 --buffer_compress_percentage=0 --buffer_pattern=0xdeadbeef --refill_buffers --group_reporting
seqwrite: (g=0): rw=write, bs=(R) 128KiB-128KiB, (W) 128KiB-128KiB, (T) 128KiB-128KiB, ioengine=psync, iodepth=1
fio-3.8
Starting 1 process
Jobs: 1 (f=1): [W(1)][100.0%][r=0KiB/s,w=2470MiB/s][r=0,w=19.8k IOPS][eta 00m:00s]
seqwrite: (groupid=0, jobs=1): err= 0: pid=10204: Wed Jul 25 14:14:22 2018
  write: IOPS=19.8k, BW=2477MiB/s (2597MB/s)(16.0GiB/6615msec)
    clat (usec): min=25, max=529, avg=45.34, stdev= 7.71
     lat (usec): min=25, max=532, avg=45.39, stdev= 7.73
    clat percentiles (usec):
     |  1.00th=[   34],  5.00th=[   39], 10.00th=[   41], 20.00th=[   42],
     | 30.00th=[   42], 40.00th=[   43], 50.00th=[   44], 60.00th=[   45],
     | 70.00th=[   46], 80.00th=[   49], 90.00th=[   54], 95.00th=[   59],
     | 99.00th=[   71], 99.50th=[   77], 99.90th=[  106], 99.95th=[  121],
     | 99.99th=[  239]
   bw (  MiB/s): min= 2404, max= 2513, per=99.54%, avg=2465.45, stdev=34.80, samples=13
   iops        : min=19235, max=20109, avg=19723.23, stdev=278.34, samples=13
  lat (usec)   : 50=83.01%, 100=16.84%, 250=0.15%, 500=0.01%, 750=0.01%
  cpu          : usr=13.09%, sys=86.82%, ctx=305, majf=0, minf=0
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=0,131072,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1

Run status group 0 (all jobs):
  WRITE: bw=2477MiB/s (2597MB/s), 2477MiB/s-2477MiB/s (2597MB/s-2597MB/s), io=16.0GiB (17.2GB), run=6615-6615msec

root@delorean:/sec_stor/backup/fiotest/fio-master # du -sh seqwrite.0.0
1.1G    seqwrite.0.0

As you can see, this test created a file that occupies only 1.1 GiB on disk, when I was expecting something close to 16 GiB. Why? What options did I miss or specify wrongly?
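(One variant I plan to retry, on the assumption, and it is only my guess at this point, that passing buffer_pattern=0xdeadbeef fills every buffer with a short repeating pattern that ZFS compresses almost completely, regardless of what buffer_compress_percentage asks for. The sketch below simply drops the pattern and keeps refill_buffers; I have not run it yet.)

./fio --name=seqwrite --rw=write --bs=128k --numjobs=1 --size=16G --runtime=60 --iodepth=1 --buffer_compress_percentage=0 --refill_buffers --group_reporting

If du still shows a tiny on-disk size after this, then my reading of the buffer options is wrong somewhere else.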
I just couldn't stop, so I went on with another test.

*_Test # 4:_* Run fio with buffer_compress_percentage=50 (50% compressible data). It looks like this one did create a roughly 50% compressible dataset, but it yielded relatively low throughput.

root@delorean:/sec_stor/backup/fiotest/fio-master # ./fio --name=seqwrite --rw=write --bs=128k --numjobs=1 --size=16G --runtime=60 --iodepth=1 --buffer_compress_percentage=50 --buffer_pattern=0xdeadbeef --refill_buffers --group_reporting
seqwrite: (g=0): rw=write, bs=(R) 128KiB-128KiB, (W) 128KiB-128KiB, (T) 128KiB-128KiB, ioengine=psync, iodepth=1
fio-3.8
Starting 1 process
Jobs: 1 (f=1): [W(1)][95.7%][r=0KiB/s,w=715MiB/s][r=0,w=5721 IOPS][eta 00m:01s]
seqwrite: (groupid=0, jobs=1): err= 0: pid=13543: Wed Jul 25 14:18:41 2018
  write: IOPS=5729, BW=716MiB/s (751MB/s)(16.0GiB/22876msec)
    clat (usec): min=26, max=1228, avg=46.59, stdev=19.26
     lat (usec): min=26, max=1229, avg=46.79, stdev=19.73
    clat percentiles (usec):
     |  1.00th=[   35],  5.00th=[   39], 10.00th=[   40], 20.00th=[   42],
     | 30.00th=[   42], 40.00th=[   43], 50.00th=[   44], 60.00th=[   44],
     | 70.00th=[   45], 80.00th=[   47], 90.00th=[   52], 95.00th=[   64],
     | 99.00th=[  120], 99.50th=[  153], 99.90th=[  269], 99.95th=[  359],
     | 99.99th=[  627]
   bw (  KiB/s): min=689820, max=748346, per=99.16%, avg=727266.49, stdev=13577.99, samples=45
   iops        : min= 5389, max= 5846, avg=5681.29, stdev=106.15, samples=45
  lat (usec)   : 50=88.05%, 100=10.38%, 250=1.45%, 500=0.10%, 750=0.02%
  lat (usec)   : 1000=0.01%
  lat (msec)   : 2=0.01%
  cpu          : usr=70.79%, sys=27.64%, ctx=24546, majf=0, minf=0
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=0,131072,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1

Run status group 0 (all jobs):
  WRITE: bw=716MiB/s (751MB/s), 716MiB/s-716MiB/s (751MB/s-751MB/s), io=16.0GiB (17.2GB), run=22876-22876msec

root@delorean:/sec_stor/backup/fiotest/fio-master # du -sh seqwrite.0.0
9.2G    seqwrite.0.0

As you see, a 9.2 GiB on-disk size seems reasonable for 50% compressible data. But why only 716 MiB/s, compared to 2065 MiB/s in test #2 (data that did not compress)? At 50% compression I understand there is work for the CPUs to compress the data inline, but wouldn't that cost be offset by the reduction in the amount of data that actually has to be written to the disks? Is this kind of drop in throughput typical in the field when compression is enabled?

Apologies in advance if I missed anything obvious in the documentation. This is my first use of fio.

Thanks,
--
  Aravindh Sampathkumar
  aravindh@fastmail.com