* "No I/O performed by <engine>" reporting bug?
@ 2018-03-23 16:08 Kris Davis
  2018-03-23 21:45 ` Sitsofe Wheeler
  0 siblings, 1 reply; 12+ messages in thread
From: Kris Davis @ 2018-03-23 16:08 UTC (permalink / raw)
  To: fio; +Cc: Itay Ben Yaacov

In fio version 2.20 a check seems to have been added to backend.c which may falsely report that no I/O has occurred.
Currently, at line 1856 of backend.c there is the following test block, which doesn't appear to exist in versions prior to 2.20 (unless an equivalent resided elsewhere).

	/*
	 * If td ended up with no I/O when it should have had,
	 * then something went wrong unless FIO_NOIO or FIO_DISKLESSIO.
	 * (Are we not missing other flags that can be ignored ?)
	 */
	if ((td->o.size || td->o.io_size) && !ddir_rw_sum(bytes_done) &&
	    !did_some_io && !td->o.create_only &&
	    !(td_ioengine_flagged(td, FIO_NOIO) ||
	      td_ioengine_flagged(td, FIO_DISKLESSIO)))
		log_err("%s: No I/O performed by %s, "
			 "perhaps try --debug=io option for details?\n",
			 td->o.name, td->io_ops->name);


In the following fio test the error message is logged even though everything appears to have run OK (fio was built from main, below, but the results appear to be the same with at least fio version 2.21).
(It's also interesting to note that the estimated "eta" is grossly inaccurate [captured while running, below] - but that is a different issue, I think.)
You can see the "Random-read-4K-QD1: No I/O performed by libaio, perhaps try --debug=io option for details?" error message below, which appears to be false.

    $ fio --ioengine=libaio --loops=32 --direct=1 --numjobs=1 --norandommap --randrepeat=0 --size=16GB --filename=/dev/sdb --name=Random-read-4K-QD1 --rw=randread --bs=4K --iodepth=1 --number_ios=8192
    Random-read-4K-QD1: (g=0): rw=randread, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=1
    fio-3.5-60-g69c5
    Starting 1 process
    Jobs: 1 (f=1): [r(1)][0.2%][r=30.9MiB/s,w=0KiB/s][r=7898,w=0 IOPS][eta 04h:45m:35s]

    Random-read-4K-QD1: No I/O performed by libaio, perhaps try --debug=io option for details?

    Random-read-4K-QD1: (groupid=0, jobs=1): err= 0: pid=21471: Fri Mar 23 10:49:12 2018
       read: IOPS=7844, BW=30.6MiB/s (32.1MB/s)(1024MiB/33418msec)
        slat (nsec): min=4369, max=29947, avg=4781.26, stdev=279.14
        clat (usec): min=28, max=7778, avg=121.24, stdev=91.57
         lat (usec): min=36, max=7783, avg=126.18, stdev=91.57
        clat percentiles (usec):
         |  1.00th=[   32],  5.00th=[   33], 10.00th=[  116], 20.00th=[  117],
         | 30.00th=[  118], 40.00th=[  120], 50.00th=[  121], 60.00th=[  122],
         | 70.00th=[  133], 80.00th=[  135], 90.00th=[  137], 95.00th=[  139],
         | 99.00th=[  141], 99.50th=[  141], 99.90th=[ 1336], 99.95th=[ 2868],
         | 99.99th=[ 2966]
       bw (  KiB/s): min=24424, max=32624, per=100.00%, avg=31379.52, stdev=1196.81, samples=66
       iops        : min= 6106, max= 8156, avg=7844.88, stdev=299.20, samples=66
      lat (usec)   : 50=6.68%, 100=0.09%, 250=93.03%, 500=0.08%, 750=0.02%
      lat (usec)   : 1000=0.01%
      lat (msec)   : 2=0.01%, 4=0.09%, 10=0.01%
      cpu          : usr=3.20%, sys=4.90%, ctx=262146, majf=0, minf=206
      IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
         submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
         complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
         issued rwts: total=262144,0,0,0 short=0,0,0,0 dropped=0,0,0,0
         latency   : target=0, window=0, percentile=100.00%, depth=1

    Run status group 0 (all jobs):
       READ: bw=30.6MiB/s (32.1MB/s), 30.6MiB/s-30.6MiB/s (32.1MB/s-32.1MB/s), io=1024MiB (1074MB), run=33418-33418msec

    Disk stats (read/write):
      sdb: ios=261972/0, merge=0/0, ticks=31245/0, in_queue=31188, util=93.13%


I added a print message which showed that both ddir_rw_sum(bytes_done) and did_some_io were zero at that point in the code.
I haven't yet dug further into how the bytes_done and did_some_io values are maintained.
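
For reference, a print along these lines (placement and exact form hypothetical; fio's log_info() is one option) is enough to confirm both values just ahead of the check quoted above:

	/* hypothetical debug print, inserted just before the "No I/O performed" check */
	log_info("%s: bytes_done sum=%llu did_some_io=%d\n",
		 td->o.name, (unsigned long long) ddir_rw_sum(bytes_done),
		 did_some_io);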


Also, I noticed that if option "--io_size=32M" was used instead of "number_ios=8192", then the eta appeared to be accurate - so the eta issue does "appear" to be independent - I can open a separate discussion or issue if desired.

Thanks

Kris Davis




* Re: "No I/O performed by <engine>" reporting bug?
  2018-03-23 16:08 "No I/O performed by <engine>" reporting bug? Kris Davis
@ 2018-03-23 21:45 ` Sitsofe Wheeler
  2018-03-24 17:32   ` Sitsofe Wheeler
  0 siblings, 1 reply; 12+ messages in thread
From: Sitsofe Wheeler @ 2018-03-23 21:45 UTC (permalink / raw)
  To: Kris Davis; +Cc: fio, Itay Ben Yaacov

On 23 March 2018 at 16:08, Kris Davis <Kris.Davis@wdc.com> wrote:
> In fio version 2.20 a test seems to have been added to backend.c which "may" falsely report that no I/O has occurred.
> Currently, at line 1856 of backend.c there is the following test block, which doesn't appear to exist in versions prior to 2.20, unless the equivalent resided elsewhere.
>
>         /*
>          * If td ended up with no I/O when it should have had,
>          * then something went wrong unless FIO_NOIO or FIO_DISKLESSIO.
>          * (Are we not missing other flags that can be ignored ?)
>          */
>         if ((td->o.size || td->o.io_size) && !ddir_rw_sum(bytes_done) &&
>             !did_some_io && !td->o.create_only &&
>             !(td_ioengine_flagged(td, FIO_NOIO) ||
>               td_ioengine_flagged(td, FIO_DISKLESSIO)))
>                 log_err("%s: No I/O performed by %s, "
>                          "perhaps try --debug=io option for details?\n",
>                          td->o.name, td->io_ops->name);
>
>
> In the following fio test, the error message is logged, even though it appears that everything has run ok (fio built from main below, but results appear to be the same with at least fio version 2.21)
> (It's also interesting to note that the estimated "eta" is also grossly inaccurate [captured while running below] - but that is a different issue I think).
> You can see the " Random-read-4K-QD1: No I/O performed by libaio, perhaps try --debug=io option for details?" error message below, which "appears" to be false.
>
>     $ fio --ioengine=libaio --loops=32 --direct=1 --numjobs=1 --norandommap --randrepeat=0 --size=16GB --filename=/dev/sdb --name=Random-read-4K-QD1 --rw=randread --bs=4K --iodepth=1 --number_ios=8192
>     Random-read-4K-QD1: (g=0): rw=randread, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=1
>     fio-3.5-60-g69c5
>     Starting 1 process
>     Jobs: 1 (f=1): [r(1)][0.2%][r=30.9MiB/s,w=0KiB/s][r=7898,w=0 IOPS][eta 04h:45m:35s]
>
>     Random-read-4K-QD1: No I/O performed by libaio, perhaps try --debug=io option for details?
>
[...]
>
> I added a print message which showed that both ddir_rw_sum(bytes_done) and did_some_io values were zero at that point in the code.
> I didn't try to dig into the operation of the bytes_done and did_some_io values further, yet.
>
>
> Also, I noticed that if option "--io_size=32M" was used instead of "number_ios=8192", then the eta appeared to be accurate - so the eta issue does "appear" to be independent - I can open a separate discussion or issue if desired.

Just quickly looking I think this is down to number_ios=<val> being
paired with loops=32 but I haven't had time to dig further...

-- 
Sitsofe | http://sucs.org/~sits/


* Re: "No I/O performed by <engine>" reporting bug?
  2018-03-23 21:45 ` Sitsofe Wheeler
@ 2018-03-24 17:32   ` Sitsofe Wheeler
  2018-03-26 15:49     ` Kris Davis
  0 siblings, 1 reply; 12+ messages in thread
From: Sitsofe Wheeler @ 2018-03-24 17:32 UTC (permalink / raw)
  To: Kris Davis; +Cc: fio, Itay Ben Yaacov

On 23 March 2018 at 21:45, Sitsofe Wheeler <sitsofe@gmail.com> wrote:
> On 23 March 2018 at 16:08, Kris Davis <Kris.Davis@wdc.com> wrote:
>> In fio version 2.20 a test seems to have been added to backend.c which "may" falsely report that no I/O has occurred.
[...]
>> In the following fio test, the error message is logged, even though it appears that everything has run ok (fio built from main below, but results appear to be the same with at least fio version 2.21)
>> (It's also interesting to note that the estimated "eta" is also grossly inaccurate [captured while running below] - but that is a different issue I think).

Yeah that's going to be a separate issue. thread_eta() in eta.c
doesn't look at the number of I/Os, only at the estimated total bytes
and how far fio has progressed through them.
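
From a quick skim the estimate looks to be roughly

    eta ~= elapsed * (bytes_total - bytes_done) / bytes_done

with bytes_total derived from size/io_size (scaled by loops) rather than
from number_ios, which would explain the enormous figure you captured.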

>> You can see the " Random-read-4K-QD1: No I/O performed by libaio, perhaps try --debug=io option for details?" error message below, which "appears" to be false.
>>
>>     $ fio --ioengine=libaio --loops=32 --direct=1 --numjobs=1 --norandommap --randrepeat=0 --size=16GB --filename=/dev/sdb --name=Random-read-4K-QD1 --rw=randread --bs=4K --iodepth=1 --number_ios=8192
>>     Random-read-4K-QD1: (g=0): rw=randread, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=1
>>     fio-3.5-60-g69c5
>>     Starting 1 process
>>     Jobs: 1 (f=1): [r(1)][0.2%][r=30.9MiB/s,w=0KiB/s][r=7898,w=0 IOPS][eta 04h:45m:35s]
>>
>>     Random-read-4K-QD1: No I/O performed by libaio, perhaps try --debug=io option for details?
>>
> [...]
>>
>> I added a print message which showed that both ddir_rw_sum(bytes_done) and did_some_io values were zero at that point in the code.
>> I didn't try to dig into the operation of the bytes_done and did_some_io values further, yet.
>>
>>
>> Also, I noticed that if option "--io_size=32M" was used instead of "number_ios=8192", then the eta appeared to be accurate - so the eta issue does "appear" to be independent - I can open a separate discussion or issue if desired.

A separate discussion would be good.

> Just quickly looking I think this is down to number_ios=<val> being
> paired with loops=32 but I haven't had time to dig further...

After getting the wrong end of the stick and wondering whether Jens was
trying to make number_ios only account for the current loop's I/O in
https://github.com/axboe/fio/commit/cf8a46a75f38afe07ebf823210f95543999e3116
("Make 'loops' honor full write+verify loops"), I think one of the
changes there was just a simple mistake made while fixing up other areas.
Can you see if https://github.com/sitsofe/fio/tree/loops_numberios
fixes what you were seeing?

-- 
Sitsofe | http://sucs.org/~sits/


* RE: "No I/O performed by <engine>" reporting bug?
  2018-03-24 17:32   ` Sitsofe Wheeler
@ 2018-03-26 15:49     ` Kris Davis
  2018-03-26 16:01       ` Sitsofe Wheeler
  0 siblings, 1 reply; 12+ messages in thread
From: Kris Davis @ 2018-03-26 15:49 UTC (permalink / raw)
  To: Sitsofe Wheeler; +Cc: fio, Itay Ben Yaacov

Sitsofe, 

Thanks.  I gave it a try and no longer see the error message.  However, it doesn't appear that the loop count is used any longer when the number_ios option is set.  The following run takes only about a second:

$ fio --ioengine=libaio --loops=32 --direct=1 --numjobs=1 --norandommap --randrepeat=0 --size=16GB --filename=/dev/sdb --name=Random-read-4K-QD1 --rw=randread --bs=4K --iodepth=1 --number_ios=8192
Random-read-4K-QD1: (g=0): rw=randread, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=1
fio-3.5
Starting 1 process
Jobs: 1 (f=1)
Random-read-4K-QD1: (groupid=0, jobs=1): err= 0: pid=31578: Mon Mar 26 10:37:06 2018
   read: IOPS=7433, BW=29.0MiB/s (30.4MB/s)(32.0MiB/1102msec)
    slat (nsec): min=4564, max=43459, avg=6011.65, stdev=662.43
...
Run status group 0 (all jobs):
   READ: bw=29.0MiB/s (30.4MB/s), 29.0MiB/s-29.0MiB/s (30.4MB/s-30.4MB/s), io=32.0MiB (33.6MB), run=1102-1102msec



But, if I use io_size=32MB, it does actually run for about 30 seconds as expected:

$  fio --ioengine=libaio --loops=32 --direct=1 --numjobs=1 --norandommap --randrepeat=0 --size=16GB --filename=/dev/sdb --name=Random-read-4K-QD1 --rw=randread --bs=4K --iodepth=1 --io_size=32MB
Random-read-4K-QD1: (g=0): rw=randread, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=1
fio-3.5
Starting 1 process
Random-read-4K-QD1: No I/O performed by libaio, perhaps try --debug=io option for details?

Random-read-4K-QD1: (groupid=0, jobs=1): err= 0: pid=31998: Mon Mar 26 10:46:45 2018
   read: IOPS=7790, BW=30.4MiB/s (31.9MB/s)(1024MiB/33650msec)
    slat (nsec): min=4443, max=43457, avg=4831.99, stdev=286.39
...
Run status group 0 (all jobs):
   READ: bw=30.4MiB/s (31.9MB/s), 30.4MiB/s-30.4MiB/s (31.9MB/s-31.9MB/s), io=1024MiB (1074MB), run=33650-33650msec

Disk stats (read/write):
  sdb: ios=262130/0, merge=0/0, ticks=31453/0, in_queue=31396, util=93.06%


Thanks again.

Kris Davis

-----Original Message-----
From: Sitsofe Wheeler [mailto:sitsofe@gmail.com] 
Sent: Saturday, March 24, 2018 12:33 PM
To: Kris Davis <Kris.Davis@wdc.com>
Cc: fio@vger.kernel.org; Itay Ben Yaacov <Itay.BenYaacov@wdc.com>
Subject: Re: "No I/O performed by <engine>" reporting bug?

On 23 March 2018 at 21:45, Sitsofe Wheeler <sitsofe@gmail.com> wrote:
> On 23 March 2018 at 16:08, Kris Davis <Kris.Davis@wdc.com> wrote:
>> In fio version 2.20 a test seems to have been added to backend.c which "may" falsely report that no I/O has occurred.
[...]
>> In the following fio test, the error message is logged, even though 
>> it appears that everything has run ok (fio built from main below, but results appear to be the same with at least fio version 2.21) (It's also interesting to note that the estimated "eta" is also grossly inaccurate [captured while running below] - but that is a different issue I think).

Yeah that's going to be a separate issue. The thread_eta() in eta.c doesn't look at the number of I/Os only the guessed total bytes and how far through fio has progressed through that.

>> You can see the " Random-read-4K-QD1: No I/O performed by libaio, perhaps try --debug=io option for details?" error message below, which "appears" to be false.
>>
>>     $ fio --ioengine=libaio --loops=32 --direct=1 --numjobs=1 --norandommap --randrepeat=0 --size=16GB --filename=/dev/sdb --name=Random-read-4K-QD1 --rw=randread --bs=4K --iodepth=1 --number_ios=8192
>>     Random-read-4K-QD1: (g=0): rw=randread, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=1
>>     fio-3.5-60-g69c5
>>     Starting 1 process
>>     Jobs: 1 (f=1): [r(1)][0.2%][r=30.9MiB/s,w=0KiB/s][r=7898,w=0 
>> IOPS][eta 04h:45m:35s]
>>
>>     Random-read-4K-QD1: No I/O performed by libaio, perhaps try --debug=io option for details?
>>
> [...]
>>
>> I added a print message which showed that both ddir_rw_sum(bytes_done) and did_some_io values were zero at that point in the code.
>> I didn't try to dig into the operation of the bytes_done and did_some_io values further, yet.
>>
>>
>> Also, I noticed that if option "--io_size=32M" was used instead of "number_ios=8192", then the eta appeared to be accurate - so the eta issue does "appear" to be independent - I can open a separate discussion or issue if desired.

A separate discussion would be good.

> Just quickly looking I think this is down to number_ios=<val> being 
> paired with loops=32 but I haven't had time to dig further...

After getting the wrong end of the stick and wondering if Jens was trying to make number_ios only account for the current loop's I/O with
https://github.com/axboe/fio/commit/cf8a46a75f38afe07ebf823210f95543999e3116
("Make 'loops' honor full write+verify loops") I think one of the changes there was just a simple mistake while fixing other areas up.
Can you see if https://github.com/sitsofe/fio/tree/loops_numberios
fixes what you were seeing?

--
Sitsofe | http://sucs.org/~sits/


* Re: "No I/O performed by <engine>" reporting bug?
  2018-03-26 15:49     ` Kris Davis
@ 2018-03-26 16:01       ` Sitsofe Wheeler
  2018-03-26 16:14         ` Kris Davis
  0 siblings, 1 reply; 12+ messages in thread
From: Sitsofe Wheeler @ 2018-03-26 16:01 UTC (permalink / raw)
  To: Kris Davis; +Cc: fio, Itay Ben Yaacov

Hi Kris,

On 26 March 2018 at 16:49, Kris Davis <Kris.Davis@wdc.com> wrote:
>
> Thanks.   I gave it a try and no longer see the error message.   However, it doesn't appear that the loop count is being used any longer when number_ios option is set.  The following runs is less about a second:
>
> $ fio --ioengine=libaio --loops=32 --direct=1 --numjobs=1 --norandommap --randrepeat=0 --size=16GB --filename=/dev/sdb --name=Random-read-4K-QD1 --rw=randread --bs=4K --iodepth=1 --number_ios=8192
> Random-read-4K-QD1: (g=0): rw=randread, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=1
> fio-3.5
> Starting 1 process
> Jobs: 1 (f=1)
> Random-read-4K-QD1: (groupid=0, jobs=1): err= 0: pid=31578: Mon Mar 26 10:37:06 2018
>    read: IOPS=7433, BW=29.0MiB/s (30.4MB/s)(32.0MiB/1102msec)
>     slat (nsec): min=4564, max=43459, avg=6011.65, stdev=662.43
> ...
> Run status group 0 (all jobs):
>    READ: bw=29.0MiB/s (30.4MB/s), 29.0MiB/s-29.0MiB/s (30.4MB/s-30.4MB/s), io=32.0MiB (33.6MB), run=1102-1102msec

Is this a bad thing? I was aiming for that behaviour (for number_ios to
behave per job rather than per loop)...

> But, if I use io_size=32MB, it does actually run for about 30 seconds as expected:
>
> $  fio --ioengine=libaio --loops=32 --direct=1 --numjobs=1 --norandommap --randrepeat=0 --size=16GB --filename=/dev/sdb --name=Random-read-4K-QD1 --rw=randread --bs=4K --iodepth=1 --io_size=32MB
> Random-read-4K-QD1: (g=0): rw=randread, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=1
> fio-3.5
> Starting 1 process
> Random-read-4K-QD1: No I/O performed by libaio, perhaps try --debug=io option for details?

^^^ Have you found another problem here?

> Random-read-4K-QD1: (groupid=0, jobs=1): err= 0: pid=31998: Mon Mar 26 10:46:45 2018
>    read: IOPS=7790, BW=30.4MiB/s (31.9MB/s)(1024MiB/33650msec)
>     slat (nsec): min=4443, max=43457, avg=4831.99, stdev=286.39
> ...
> Run status group 0 (all jobs):
>    READ: bw=30.4MiB/s (31.9MB/s), 30.4MiB/s-30.4MiB/s (31.9MB/s-31.9MB/s), io=1024MiB (1074MB), run=33650-33650msec
>
> Disk stats (read/write):
>   sdb: ios=262130/0, merge=0/0, ticks=31453/0, in_queue=31396, util=93.06%

-- 
Sitsofe | http://sucs.org/~sits/


* RE: "No I/O performed by <engine>" reporting bug?
  2018-03-26 16:01       ` Sitsofe Wheeler
@ 2018-03-26 16:14         ` Kris Davis
  2018-03-26 17:27           ` Sitsofe Wheeler
  0 siblings, 1 reply; 12+ messages in thread
From: Kris Davis @ 2018-03-26 16:14 UTC (permalink / raw)
  To: Sitsofe Wheeler; +Cc: fio, Itay Ben Yaacov

Sitsofe, 

> Is this a bad thing? I was aiming for that behaviour (for number_ios to behave per job rather than per loop)...

Oh, I didn't catch that.  I was assuming that number_ios was analogous to io_size.  That is, indicating what each "loop" would do.  Wouldn't you be changing the current behavior?  My prior test with number_ios and loops was taking about 30 seconds as expected.

Thanks  

Kris Davis 
Western Digital Corporation
Email: kris.davis@wdc.com
Office: +1-507-322-2376

-----Original Message-----
From: Sitsofe Wheeler [mailto:sitsofe@gmail.com] 
Sent: Monday, March 26, 2018 11:01 AM
To: Kris Davis <Kris.Davis@wdc.com>
Cc: fio@vger.kernel.org; Itay Ben Yaacov <Itay.BenYaacov@wdc.com>
Subject: Re: "No I/O performed by <engine>" reporting bug?

Hi Kris,

On 26 March 2018 at 16:49, Kris Davis <Kris.Davis@wdc.com> wrote:
>
> Thanks.   I gave it a try and no longer see the error message.   However, it doesn't appear that the loop count is being used any longer when number_ios option is set.  The following runs is less about a second:
>
> $ fio --ioengine=libaio --loops=32 --direct=1 --numjobs=1 
> --norandommap --randrepeat=0 --size=16GB --filename=/dev/sdb 
> --name=Random-read-4K-QD1 --rw=randread --bs=4K --iodepth=1 
> --number_ios=8192
> Random-read-4K-QD1: (g=0): rw=randread, bs=(R) 4096B-4096B, (W) 
> 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=1
> fio-3.5
> Starting 1 process
> Jobs: 1 (f=1)
> Random-read-4K-QD1: (groupid=0, jobs=1): err= 0: pid=31578: Mon Mar 26 10:37:06 2018
>    read: IOPS=7433, BW=29.0MiB/s (30.4MB/s)(32.0MiB/1102msec)
>     slat (nsec): min=4564, max=43459, avg=6011.65, stdev=662.43 ...
> Run status group 0 (all jobs):
>    READ: bw=29.0MiB/s (30.4MB/s), 29.0MiB/s-29.0MiB/s 
> (30.4MB/s-30.4MB/s), io=32.0MiB (33.6MB), run=1102-1102msec

Is this a bad thing? I was aiming for that behaviour (for number_ios to behave per job rather than per loop)...

> But, if I use io_size=32MB, it does actually run for about 30 seconds as expected:
>
> $  fio --ioengine=libaio --loops=32 --direct=1 --numjobs=1 
> --norandommap --randrepeat=0 --size=16GB --filename=/dev/sdb 
> --name=Random-read-4K-QD1 --rw=randread --bs=4K --iodepth=1 
> --io_size=32MB
> Random-read-4K-QD1: (g=0): rw=randread, bs=(R) 4096B-4096B, (W) 
> 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=1
> fio-3.5
> Starting 1 process
> Random-read-4K-QD1: No I/O performed by libaio, perhaps try --debug=io option for details?

^^^ Have you found another problem here?

> Random-read-4K-QD1: (groupid=0, jobs=1): err= 0: pid=31998: Mon Mar 26 10:46:45 2018
>    read: IOPS=7790, BW=30.4MiB/s (31.9MB/s)(1024MiB/33650msec)
>     slat (nsec): min=4443, max=43457, avg=4831.99, stdev=286.39 ...
> Run status group 0 (all jobs):
>    READ: bw=30.4MiB/s (31.9MB/s), 30.4MiB/s-30.4MiB/s 
> (31.9MB/s-31.9MB/s), io=1024MiB (1074MB), run=33650-33650msec
>
> Disk stats (read/write):
>   sdb: ios=262130/0, merge=0/0, ticks=31453/0, in_queue=31396, 
> util=93.06%

--
Sitsofe | http://sucs.org/~sits/


* Re: "No I/O performed by <engine>" reporting bug?
  2018-03-26 16:14         ` Kris Davis
@ 2018-03-26 17:27           ` Sitsofe Wheeler
  2018-03-26 17:40             ` Kris Davis
  0 siblings, 1 reply; 12+ messages in thread
From: Sitsofe Wheeler @ 2018-03-26 17:27 UTC (permalink / raw)
  To: Kris Davis; +Cc: fio, Itay Ben Yaacov

It's unclear that number_ios is supposed to be per loop, and it's
actually challenging to get per-loop number_ios behaviour in a way
that is consistent and will also work for separate-job verification. I
started down that route originally and then realised that verifying
loops where you don't write everything is problematic. As it stood,
what seemed to be happening was that number_ios was being multiplied
by the total loops, but a loop wasn't actually ended when that loop's
number_ios was exceeded. For example, you might have expected a
sequential write to just keep rewriting the start of the file when
using number_ios with loops greater than 1, but this was not the
case - it would carry on from where it left off on the second loop
until it reached the end of the file and only wrap then.
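
(As a concrete illustration of that old behaviour - numbers hypothetical - a
job like "fio --name=seqwrap --filename=/tmp/fio.tmp --size=1M --bs=4k
--rw=write --number_ios=64 --loops=2" would not rewrite the first 256KiB on
its second loop; it would continue from 256KiB and only wrap once it hit the
1MiB end of the file.)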

Quick question: what are you expecting number_ios coupled with
loops>=2 to do when it's per loop? Bear in mind that unlike io_size,
number_ios is documented as not extending jobs...

On 26 March 2018 at 17:14, Kris Davis <Kris.Davis@wdc.com> wrote:
> Sitsofe,
>
>> Is this a bad thing? I was aiming for that behaviour (for number_ios to behave per job rather than per loop)...
>
> Oh, I didn't catch that.  I was assuming that number_ios was analogous to io_size.  That is, indicating what each "loop" would do.  Wouldn't you be changing the current behavior?  My prior test with number_ios and loops was taking about 30 seconds as expected.
>
> Thanks
>
> Kris Davis
> Western Digital Coporation
> Email: kris.davis@wdc.com
> Office:: +1-507-322-2376
>
> -----Original Message-----
> From: Sitsofe Wheeler [mailto:sitsofe@gmail.com]
> Sent: Monday, March 26, 2018 11:01 AM
> To: Kris Davis <Kris.Davis@wdc.com>
> Cc: fio@vger.kernel.org; Itay Ben Yaacov <Itay.BenYaacov@wdc.com>
> Subject: Re: "No I/O performed by <engine>" reporting bug?
>
> Hi Kris,
>
> On 26 March 2018 at 16:49, Kris Davis <Kris.Davis@wdc.com> wrote:
>>
>> Thanks.   I gave it a try and no longer see the error message.   However, it doesn't appear that the loop count is being used any longer when number_ios option is set.  The following runs is less about a second:
>>
>> $ fio --ioengine=libaio --loops=32 --direct=1 --numjobs=1
>> --norandommap --randrepeat=0 --size=16GB --filename=/dev/sdb
>> --name=Random-read-4K-QD1 --rw=randread --bs=4K --iodepth=1
>> --number_ios=8192
>> Random-read-4K-QD1: (g=0): rw=randread, bs=(R) 4096B-4096B, (W)
>> 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=1
>> fio-3.5
>> Starting 1 process
>> Jobs: 1 (f=1)
>> Random-read-4K-QD1: (groupid=0, jobs=1): err= 0: pid=31578: Mon Mar 26 10:37:06 2018
>>    read: IOPS=7433, BW=29.0MiB/s (30.4MB/s)(32.0MiB/1102msec)
>>     slat (nsec): min=4564, max=43459, avg=6011.65, stdev=662.43 ...
>> Run status group 0 (all jobs):
>>    READ: bw=29.0MiB/s (30.4MB/s), 29.0MiB/s-29.0MiB/s
>> (30.4MB/s-30.4MB/s), io=32.0MiB (33.6MB), run=1102-1102msec
>
> Is this a bad thing? I was aiming for that behaviour (for number_ios to behave per job rather than per loop)...
>
>> But, if I use io_size=32MB, it does actually run for about 30 seconds as expected:
>>
>> $  fio --ioengine=libaio --loops=32 --direct=1 --numjobs=1
>> --norandommap --randrepeat=0 --size=16GB --filename=/dev/sdb
>> --name=Random-read-4K-QD1 --rw=randread --bs=4K --iodepth=1
>> --io_size=32MB
>> Random-read-4K-QD1: (g=0): rw=randread, bs=(R) 4096B-4096B, (W)
>> 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=1
>> fio-3.5
>> Starting 1 process
>> Random-read-4K-QD1: No I/O performed by libaio, perhaps try --debug=io option for details?
>
> ^^^ Have you found another problem here?
>
>> Random-read-4K-QD1: (groupid=0, jobs=1): err= 0: pid=31998: Mon Mar 26 10:46:45 2018
>>    read: IOPS=7790, BW=30.4MiB/s (31.9MB/s)(1024MiB/33650msec)
>>     slat (nsec): min=4443, max=43457, avg=4831.99, stdev=286.39 ...
>> Run status group 0 (all jobs):
>>    READ: bw=30.4MiB/s (31.9MB/s), 30.4MiB/s-30.4MiB/s
>> (31.9MB/s-31.9MB/s), io=1024MiB (1074MB), run=33650-33650msec
>>
>> Disk stats (read/write):
>>   sdb: ios=262130/0, merge=0/0, ticks=31453/0, in_queue=31396,
>> util=93.06%
>
> --
> Sitsofe | http://sucs.org/~sits/



-- 
Sitsofe | http://sucs.org/~sits/


* RE: "No I/O performed by <engine>" reporting bug?
  2018-03-26 17:27           ` Sitsofe Wheeler
@ 2018-03-26 17:40             ` Kris Davis
  2018-03-26 19:51               ` Sitsofe Wheeler
  0 siblings, 1 reply; 12+ messages in thread
From: Kris Davis @ 2018-03-26 17:40 UTC (permalink / raw)
  To: Sitsofe Wheeler; +Cc: fio, Itay Ben Yaacov

Sitsofe, 

I guess I was expecting that loops just means to repeat whatever the job was doing previously that number of times.  That's what I conclude from the "loops" description: "Run the specified number of iterations of this job. Used to repeat the same workload a given number of times. Defaults to 1."

I was expecting that each loop would start sequential operations back at 0 or the offset, and redo any start_delay or ramp_time each time.  If that is not the case, I'm either missing something in the doc, or it could probably use further clarification.  If io_size is used rather than number_ios - does the starting LBA get reset after each loop?

I don't really have any strong opinion on how number_ios works (we can use io_size instead); it just seems like it was working differently before, and otherwise it could just use further clarification. Do you know of any other "quantity" type of option that would operate independently of the loops option?  (I think this is the first time I've seen loops used.)

Thanks

Kris Davis

-----Original Message-----
From: Sitsofe Wheeler [mailto:sitsofe@gmail.com] 
Sent: Monday, March 26, 2018 12:27 PM
To: Kris Davis <Kris.Davis@wdc.com>
Cc: fio@vger.kernel.org; Itay Ben Yaacov <Itay.BenYaacov@wdc.com>
Subject: Re: "No I/O performed by <engine>" reporting bug?

It's unclear that number_ios is supposed to be per loop and it's actually challenging to get number_ios per loop behaviour in a way that is consistent and will also work for seperate job verification. I started down that route originally and then realised verification of loops where you don't exactly everything is problematic. As it stood what seemed to be happening was the number_ios were being increased by total loops but a loop wasn't actually being ended when that loop's numerios were exceeded. For example, you might have expected a sequential write with to just keep rewriting the start the of the file when using number_ios and loops greater than 1 but this was not the case - it would carry on from where it left off on the second loop until it reached the end of the file and only wrap then.

Quick question: what are you expecting number_ios coupled with
loops>=2 to do when it's per loop? Bear in mind that unlike io_size,
number_ios is documented as not extending jobs...

On 26 March 2018 at 17:14, Kris Davis <Kris.Davis@wdc.com> wrote:
> Sitsofe,
>
>> Is this a bad thing? I was aiming for that behaviour (for number_ios to behave per job rather than per loop)...
>
> Oh, I didn't catch that.  I was assuming that number_ios was analogous to io_size.  That is, indicating what each "loop" would do.  Wouldn't you be changing the current behavior?  My prior test with number_ios and loops was taking about 30 seconds as expected.
>
> Thanks
>
> Kris Davis
> Western Digital Coporation
> Email: kris.davis@wdc.com
> Office:: +1-507-322-2376
>
> -----Original Message-----
> From: Sitsofe Wheeler [mailto:sitsofe@gmail.com]
> Sent: Monday, March 26, 2018 11:01 AM
> To: Kris Davis <Kris.Davis@wdc.com>
> Cc: fio@vger.kernel.org; Itay Ben Yaacov <Itay.BenYaacov@wdc.com>
> Subject: Re: "No I/O performed by <engine>" reporting bug?
>
> Hi Kris,
>
> On 26 March 2018 at 16:49, Kris Davis <Kris.Davis@wdc.com> wrote:
>>
>> Thanks.   I gave it a try and no longer see the error message.   However, it doesn't appear that the loop count is being used any longer when number_ios option is set.  The following runs is less about a second:
>>
>> $ fio --ioengine=libaio --loops=32 --direct=1 --numjobs=1 
>> --norandommap --randrepeat=0 --size=16GB --filename=/dev/sdb
>> --name=Random-read-4K-QD1 --rw=randread --bs=4K --iodepth=1
>> --number_ios=8192
>> Random-read-4K-QD1: (g=0): rw=randread, bs=(R) 4096B-4096B, (W) 
>> 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=1
>> fio-3.5
>> Starting 1 process
>> Jobs: 1 (f=1)
>> Random-read-4K-QD1: (groupid=0, jobs=1): err= 0: pid=31578: Mon Mar 26 10:37:06 2018
>>    read: IOPS=7433, BW=29.0MiB/s (30.4MB/s)(32.0MiB/1102msec)
>>     slat (nsec): min=4564, max=43459, avg=6011.65, stdev=662.43 ...
>> Run status group 0 (all jobs):
>>    READ: bw=29.0MiB/s (30.4MB/s), 29.0MiB/s-29.0MiB/s 
>> (30.4MB/s-30.4MB/s), io=32.0MiB (33.6MB), run=1102-1102msec
>
> Is this a bad thing? I was aiming for that behaviour (for number_ios to behave per job rather than per loop)...
>
>> But, if I use io_size=32MB, it does actually run for about 30 seconds as expected:
>>
>> $  fio --ioengine=libaio --loops=32 --direct=1 --numjobs=1 
>> --norandommap --randrepeat=0 --size=16GB --filename=/dev/sdb
>> --name=Random-read-4K-QD1 --rw=randread --bs=4K --iodepth=1 
>> --io_size=32MB
>> Random-read-4K-QD1: (g=0): rw=randread, bs=(R) 4096B-4096B, (W) 
>> 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=1
>> fio-3.5
>> Starting 1 process
>> Random-read-4K-QD1: No I/O performed by libaio, perhaps try --debug=io option for details?
>
> ^^^ Have you found another problem here?
>
>> Random-read-4K-QD1: (groupid=0, jobs=1): err= 0: pid=31998: Mon Mar 26 10:46:45 2018
>>    read: IOPS=7790, BW=30.4MiB/s (31.9MB/s)(1024MiB/33650msec)
>>     slat (nsec): min=4443, max=43457, avg=4831.99, stdev=286.39 ...
>> Run status group 0 (all jobs):
>>    READ: bw=30.4MiB/s (31.9MB/s), 30.4MiB/s-30.4MiB/s 
>> (31.9MB/s-31.9MB/s), io=1024MiB (1074MB), run=33650-33650msec
>>
>> Disk stats (read/write):
>>   sdb: ios=262130/0, merge=0/0, ticks=31453/0, in_queue=31396, 
>> util=93.06%
>
> --
> Sitsofe | http://sucs.org/~sits/



--
Sitsofe | http://sucs.org/~sits/


* Re: "No I/O performed by <engine>" reporting bug?
  2018-03-26 17:40             ` Kris Davis
@ 2018-03-26 19:51               ` Sitsofe Wheeler
  2018-03-27 15:23                 ` Kris Davis
  0 siblings, 1 reply; 12+ messages in thread
From: Sitsofe Wheeler @ 2018-03-26 19:51 UTC (permalink / raw)
  To: Kris Davis; +Cc: fio, Itay Ben Yaacov

On 26 March 2018 at 18:40, Kris Davis <Kris.Davis@wdc.com> wrote:
>
> I guess I'm was expecting that loops just means to repeat whatever the job was doing previously that number of times.  I guess that what I conclude from the "loops" description: "Run the specified number of iterations of this job. Used to repeat the same workload a given number of times. Defaults to 1."
>
> I was expecting that each loop would start sequential operations back at 0 or the offset, and redo any start_delay or ramp_time each time.  If that is not the case, I'm

loops looks like a way to get a job to wrap I/O around to the beginning
when it gets to the end of the file. Certain operations do happen when
that wraparound/restart occurs (e.g. the map of regions used is reset,
and the offset is reset for sequential I/O) but not every job option is
affected (e.g. start_delay and ramp_time will only happen once even
when loops>=2).

> either missing something in the doc, or it could probably use further clarification.  If io_size is used rather than number_ios - does the starting LBA get reset after each loop?

Looking at some examples I see this:

$ ./fio --name=test-loops --filename=/tmp/fio.tmp --size=4 --bs=1
--loops=2 --io_size=3 --debug=io | grep 'complete:\|cache'
io       7649  declare unneeded cache /tmp/fio.tmp: 0/4
io       7649  complete: io_u 0xb78500: off=0x0,len=0x1,ddir=0,file=/tmp/fio.tmp
io       7649  complete: io_u 0xb78500: off=0x1,len=0x1,ddir=0,file=/tmp/fio.tmp
io       7649  complete: io_u 0xb78500: off=0x2,len=0x1,ddir=0,file=/tmp/fio.tmp
io       7649  complete: io_u 0xb78500: off=0x3,len=0x1,ddir=0,file=/tmp/fio.tmp
io       7649  declare unneeded cache /tmp/fio.tmp: 0/4
io       7649  complete: io_u 0xb78500: off=0x0,len=0x1,ddir=0,file=/tmp/fio.tmp
io       7649  complete: io_u 0xb78500: off=0x1,len=0x1,ddir=0,file=/tmp/fio.tmp

$ ./fio --name=test-loops --filename=/tmp/fio.tmp --size=4 --bs=1
--loops=3 --number_ios=2 --debug=io | grep 'complete:\|cache'
io       7657  declare unneeded cache /tmp/fio.tmp: 0/4
io       7657  complete: io_u 0x2310500: off=0x0,len=0x1,ddir=0,file=/tmp/fio.tmp
io       7657  complete: io_u 0x2310500: off=0x1,len=0x1,ddir=0,file=/tmp/fio.tmp
io       7657  complete: io_u 0x2310500: off=0x2,len=0x1,ddir=0,file=/tmp/fio.tmp
io       7657  complete: io_u 0x2310500: off=0x3,len=0x1,ddir=0,file=/tmp/fio.tmp
io       7657  declare unneeded cache /tmp/fio.tmp: 0/4
io       7657  complete: io_u 0x2310500: off=0x0,len=0x1,ddir=0,file=/tmp/fio.tmp
io       7657  complete: io_u 0x2310500: off=0x1,len=0x1,ddir=0,file=/tmp/fio.tmp

Basically a loop only ends when I/O gets to the end of the file (and
we see the cache being declared unneeded when each loop starts), but
the interaction with number_ios/io_size seems limited to just
multiplying them by loops.
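
(That multiplication is visible in the output above: io_size=3 x loops=2 and
number_ios=2 x loops=3 both come to 6 I/Os of bs=1, matching the six
"complete" lines in each run, with the offset wrapping back to 0 once the
4-byte file is exhausted.)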

Jens: originally number_ios/io_size would just end the entire job
early regardless of what loop you were on but that seems to have
changed since cf8a46a75f38afe07ebf823210f95543999e3116 ("Make 'loops'
honor full write+verify loops"). Is the intent for number_ios/io_size
to be invariant of loops or to be per loop? What does it mean for a
loop if io_size is bigger than the size of the file?

> I don't really have any strong opinion on how number_ios works (we can use io_size instead), it just seems like it was working differently before, and otherwise could just use further clarification. Do you know of any other "quantity" type of options would operate independent of the loops option?  (I think this is the first time I've seen loop used).

runtime is the other one (we've covered io_size, number_ios and loops)
and that's loop invariant. In fact runtime with time_based essentially
disables looking at the loop counter.
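
(For example - values hypothetical - something like "./fio --name=test-timed
--filename=/tmp/fio.tmp --size=4 --bs=1 --time_based --runtime=5 --loops=2"
just keeps wrapping over the 4-byte file until the 5 seconds are up,
regardless of loops.)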

-- 
Sitsofe | http://sucs.org/~sits/


* RE: "No I/O performed by <engine>" reporting bug?
  2018-03-26 19:51               ` Sitsofe Wheeler
@ 2018-03-27 15:23                 ` Kris Davis
  2018-03-27 16:48                   ` Sitsofe Wheeler
  2018-03-27 16:56                   ` Sitsofe Wheeler
  0 siblings, 2 replies; 12+ messages in thread
From: Kris Davis @ 2018-03-27 15:23 UTC (permalink / raw)
  To: Sitsofe Wheeler; +Cc: fio, Itay Ben Yaacov

Sitsofe, 

I'm wondering if I've completely misunderstood the meaning of "looping".  I was thinking it basically meant to "repeat" the job specification a number of times, but from your description of the impacts, that appears to be wrong.   

Looking at other verbiage in the doc, "loops" says to "repeat same workload a given number of times".  Under time_based, it says "it will simply loop over the same workload as many times as runtime allows".  I'm wondering if the word 'loop' really means the same thing in these two descriptions.  If nothing is really "reset" between loop iterations, then it seems to me that these two uses of 'loop' mean the same thing.

Sorry if I'm missing the point, but if the "loops" option really doesn't affect other aspects like the runtime, ramp_time or resetting LBAs and such, I'm confused about when "loops" would be used - what does it accomplish that isn't available through other options?

Thanks

Kris Davis

-----Original Message-----
From: Sitsofe Wheeler [mailto:sitsofe@gmail.com] 
Sent: Monday, March 26, 2018 2:52 PM
To: Kris Davis <Kris.Davis@wdc.com>
Cc: fio@vger.kernel.org; Itay Ben Yaacov <Itay.BenYaacov@wdc.com>
Subject: Re: "No I/O performed by <engine>" reporting bug?

On 26 March 2018 at 18:40, Kris Davis <Kris.Davis@wdc.com> wrote:
>
> I guess I'm was expecting that loops just means to repeat whatever the job was doing previously that number of times.  I guess that what I conclude from the "loops" description: "Run the specified number of iterations of this job. Used to repeat the same workload a given number of times. Defaults to 1."
>
> I was expecting that each loop would start sequential operations back 
> at 0 or the offset, and redo any start_delay or ramp_time each time.  
> If that is not the case, I'm

loops looks like a way to get a job to wrap I/O around to beginning it when it gets to the end of the file. Certain operations do happen when that wraparound/restart occurs (e.g. resetting the map of regions used/offset being reset on sequential I/O) but not every job option is impacted (i.e. start_delay and ramp_time will only happen once even when loops>=2).

> either missing something in the doc, or it could probably use further clarification.  If io_size is used rather than number_ios - does the starting LBA get reset after each loop?

Looking at some examples I see this:

$ ./fio --name=test-loops --filename=/tmp/fio.tmp --size=4 --bs=1
--loops=2 --io_size=3 --debug=io | grep 'complete:\|cache'
io       7649  declare unneeded cache /tmp/fio.tmp: 0/4
io       7649  complete: io_u 0xb78500: off=0x0,len=0x1,ddir=0,file=/tmp/fio.tmp
io       7649  complete: io_u 0xb78500: off=0x1,len=0x1,ddir=0,file=/tmp/fio.tmp
io       7649  complete: io_u 0xb78500: off=0x2,len=0x1,ddir=0,file=/tmp/fio.tmp
io       7649  complete: io_u 0xb78500: off=0x3,len=0x1,ddir=0,file=/tmp/fio.tmp
io       7649  declare unneeded cache /tmp/fio.tmp: 0/4
io       7649  complete: io_u 0xb78500: off=0x0,len=0x1,ddir=0,file=/tmp/fio.tmp
io       7649  complete: io_u 0xb78500: off=0x1,len=0x1,ddir=0,file=/tmp/fio.tmp

$ $ ./fio --name=test-loops --filename=/tmp/fio.tmp --size=4 --bs=1
--loops=3 --number_ios=2 --debug=io | grep 'complete:\|cache'
io       7657  declare unneeded cache /tmp/fio.tmp: 0/4
io       7657  complete: io_u 0x2310500:
off=0x0,len=0x1,ddir=0,file=/tmp/fio.tmp
io       7657  complete: io_u 0x2310500:
off=0x1,len=0x1,ddir=0,file=/tmp/fio.tmp
io       7657  complete: io_u 0x2310500:
off=0x2,len=0x1,ddir=0,file=/tmp/fio.tmp
io       7657  complete: io_u 0x2310500:
off=0x3,len=0x1,ddir=0,file=/tmp/fio.tmp
io       7657  declare unneeded cache /tmp/fio.tmp: 0/4
io       7657  complete: io_u 0x2310500:
off=0x0,len=0x1,ddir=0,file=/tmp/fio.tmp
io       7657  complete: io_u 0x2310500:
off=0x1,len=0x1,ddir=0,file=/tmp/fio.tmp

Basically a loop only ends when I/O gets to the end of the file (and we see the cache being declared unneeded when the loop starts) but the interaction with number_ios/io_size seems limited to just multiplying them by loops.

Jens: originally number_ios/io_size would just end the entire job early regardless of what loop you were on but that seems to have changed since cf8a46a75f38afe07ebf823210f95543999e3116 ("Make 'loops'
honor full write+verify loops"). Is the intent for number_ios/io_size to be invariant of loops or to be per loop? What does it mean for a loop if io_size is bigger than the size of the file?

> I don't really have any strong opinion on how number_ios works (we can use io_size instead), it just seems like it was working differently before, and otherwise could just use further clarification. Do you know of any other "quantity" type of options would operate independent of the loops option?  (I think this is the first time I've seen loop used).

runtime is the other one (we've covered io_size, number_ios and loops) and that's loop invariant. In fact runtime with time_based essentially disables looking at the loop counter.

--
Sitsofe | http://sucs.org/~sits/


* Re: "No I/O performed by <engine>" reporting bug?
  2018-03-27 15:23                 ` Kris Davis
@ 2018-03-27 16:48                   ` Sitsofe Wheeler
  2018-03-27 16:56                   ` Sitsofe Wheeler
  1 sibling, 0 replies; 12+ messages in thread
From: Sitsofe Wheeler @ 2018-03-27 16:48 UTC (permalink / raw)
  To: Jens Axboe; +Cc: fio, Itay Ben Yaacov, Kris Davis

(CCing Jens)

On 27 March 2018 at 16:23, Kris Davis <Kris.Davis@wdc.com> wrote:
> Sitsofe,
>
> I'm wondering if I've completely misunderstood the meaning of "looping".  I was thinking it basically meant to "repeat" the job specification a number of times, but from your description of the impacts, that appears to be wrong.
>
> Looking at other verbiage in the doc, "loops" say to "repeat same workload a given number of times".  Under time_based, it says "it will simply loop over the same workload as many times as runtime allows".    I'm wondering if the word 'loop' is really means the same thing in these two descriptions.  If nothing is really "reset" between loop iterations, then it seems to me that these two uses of 'loop' mean the same thing.
>
> Sorry if I'm missing the point, but if the "loops" option really doesn't effect other aspects like the runtime, ramp_time or resetting LBAs and such, I'm confused when "loops" would be used - what does it accomplish that isn't available through other options?
>
> Thanks
>
> Kris Davis
>
> -----Original Message-----
> From: Sitsofe Wheeler [mailto:sitsofe@gmail.com]
> Sent: Monday, March 26, 2018 2:52 PM
> To: Kris Davis <Kris.Davis@wdc.com>
> Cc: fio@vger.kernel.org; Itay Ben Yaacov <Itay.BenYaacov@wdc.com>
> Subject: Re: "No I/O performed by <engine>" reporting bug?
>
> On 26 March 2018 at 18:40, Kris Davis <Kris.Davis@wdc.com> wrote:
>>
>> I guess I'm was expecting that loops just means to repeat whatever the job was doing previously that number of times.  I guess that what I conclude from the "loops" description: "Run the specified number of iterations of this job. Used to repeat the same workload a given number of times. Defaults to 1."
>>
>> I was expecting that each loop would start sequential operations back
>> at 0 or the offset, and redo any start_delay or ramp_time each time.
>> If that is not the case, I'm
>
> loops looks like a way to get a job to wrap I/O around to beginning it when it gets to the end of the file. Certain operations do happen when that wraparound/restart occurs (e.g. resetting the map of regions used/offset being reset on sequential I/O) but not every job option is impacted (i.e. start_delay and ramp_time will only happen once even when loops>=2).
>
>> either missing something in the doc, or it could probably use further clarification.  If io_size is used rather than number_ios - does the starting LBA get reset after each loop?
>
> Looking at some examples I see this:
>
> $ ./fio --name=test-loops --filename=/tmp/fio.tmp --size=4 --bs=1
> --loops=2 --io_size=3 --debug=io | grep 'complete:\|cache'
> io       7649  declare unneeded cache /tmp/fio.tmp: 0/4
> io       7649  complete: io_u 0xb78500: off=0x0,len=0x1,ddir=0,file=/tmp/fio.tmp
> io       7649  complete: io_u 0xb78500: off=0x1,len=0x1,ddir=0,file=/tmp/fio.tmp
> io       7649  complete: io_u 0xb78500: off=0x2,len=0x1,ddir=0,file=/tmp/fio.tmp
> io       7649  complete: io_u 0xb78500: off=0x3,len=0x1,ddir=0,file=/tmp/fio.tmp
> io       7649  declare unneeded cache /tmp/fio.tmp: 0/4
> io       7649  complete: io_u 0xb78500: off=0x0,len=0x1,ddir=0,file=/tmp/fio.tmp
> io       7649  complete: io_u 0xb78500: off=0x1,len=0x1,ddir=0,file=/tmp/fio.tmp
>
> $ $ ./fio --name=test-loops --filename=/tmp/fio.tmp --size=4 --bs=1
> --loops=3 --number_ios=2 --debug=io | grep 'complete:\|cache'
> io       7657  declare unneeded cache /tmp/fio.tmp: 0/4
> io       7657  complete: io_u 0x2310500:
> off=0x0,len=0x1,ddir=0,file=/tmp/fio.tmp
> io       7657  complete: io_u 0x2310500:
> off=0x1,len=0x1,ddir=0,file=/tmp/fio.tmp
> io       7657  complete: io_u 0x2310500:
> off=0x2,len=0x1,ddir=0,file=/tmp/fio.tmp
> io       7657  complete: io_u 0x2310500:
> off=0x3,len=0x1,ddir=0,file=/tmp/fio.tmp
> io       7657  declare unneeded cache /tmp/fio.tmp: 0/4
> io       7657  complete: io_u 0x2310500:
> off=0x0,len=0x1,ddir=0,file=/tmp/fio.tmp
> io       7657  complete: io_u 0x2310500:
> off=0x1,len=0x1,ddir=0,file=/tmp/fio.tmp
>
> Basically a loop only ends when I/O gets to the end of the file (and we see the cache being declared unneeded when the loop starts) but the interaction with number_ios/io_size seems limited to just multiplying them by loops.
>
> Jens: originally number_ios/io_size would just end the entire job early regardless of what loop you were on but that seems to have changed since cf8a46a75f38afe07ebf823210f95543999e3116 ("Make 'loops'
> honor full write+verify loops"). Is the intent for number_ios/io_size to be invariant of loops or to be per loop? What does it mean for a loop if io_size is bigger than the size of the file?
>
>> I don't really have any strong opinion on how number_ios works (we can use io_size instead), it just seems like it was working differently before, and otherwise could just use further clarification. Do you know of any other "quantity" type of options would operate independent of the loops option?  (I think this is the first time I've seen loop used).
>
> runtime is the other one (we've covered io_size, number_ios and loops) and that's loop invariant. In fact runtime with time_based essentially disables looking at the loop counter.


-- 
Sitsofe | http://sucs.org/~sits/



* Re: "No I/O performed by <engine>" reporting bug?
  2018-03-27 15:23                 ` Kris Davis
  2018-03-27 16:48                   ` Sitsofe Wheeler
@ 2018-03-27 16:56                   ` Sitsofe Wheeler
  1 sibling, 0 replies; 12+ messages in thread
From: Sitsofe Wheeler @ 2018-03-27 16:56 UTC (permalink / raw)
  To: Kris Davis; +Cc: fio, Itay Ben Yaacov

On 27 March 2018 at 16:23, Kris Davis <Kris.Davis@wdc.com> wrote:
>
> I'm wondering if I've completely misunderstood the meaning of "looping".  I was thinking it basically meant to "repeat" the job specification a number of times, but from your description of the impacts, that appears to be wrong.
>
> Looking at other verbiage in the doc, "loops" say to "repeat same workload a given number of times".  Under time_based, it says "it will simply loop over the same workload as many times as runtime allows".    I'm wondering if the word 'loop' is really means the same thing in these two descriptions.  If nothing is really "reset" between loop iterations, then it seems to me that these two uses of 'loop' mean the same thing.

I'd argue both reset the offset when the end of the file is reached
when you're doing sequential I/O, and they reset the areas available to
be used when doing random I/O that is also using a randommap. Certain
specific things can also be set to happen on each loop too (e.g. see
http://fio.readthedocs.io/en/latest/fio_doc.html#cmdoption-arg-unlink-each-loop
). I think do_verify verification passes also run after each write job
loop but I'd have to check.

> Sorry if I'm missing the point, but if the "loops" option really doesn't effect other aspects like the runtime, ramp_time or resetting LBAs and such, I'm confused when "loops" would be used - what does it accomplish that isn't available through other options?

It saves some calculation. Imagine I want a run where I write the
entire file twice but I don't know its size - then I could use loops=2.
LBAs are definitely being reset in simple loops jobs...
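
(For instance - a hypothetical job - "./fio --name=write-twice
--filename=/dev/sdb --direct=1 --rw=write --bs=1M --loops=2" writes the whole
device twice without having to know its size in order to compute an io_size.)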

-- 
Sitsofe | http://sucs.org/~sits/
