* [regression] very inaccurate %util of iostat
@ 2020-03-24  3:19 Ming Lei
  2020-03-24  3:53 ` Mike Snitzer
  0 siblings, 1 reply; 5+ messages in thread

From: Ming Lei @ 2020-03-24  3:19 UTC (permalink / raw)
  To: linux-block, Jens Axboe, Mike Snitzer, mpatocka

Hi Guys,

Commit 5b18b5a73760 ("block: delete part_round_stats and switch to less
precise counting") changed the calculation of 'io_ticks' significantly.

In theory, io_ticks counts the time during which any IO is in flight or
in queue, so it has to rely on counting in-flight IOs.

However, commit 5b18b5a73760 changed io_ticks accounting to the
following:

	stamp = READ_ONCE(part->stamp);
	if (unlikely(stamp != now)) {
		if (likely(cmpxchg(&part->stamp, stamp, now) == stamp))
			__part_stat_add(part, io_ticks, 1);
	}

This does not use any in-flight IO information: it simply adds 1
whenever the stamp differs from the previous one, whether or not any IO
is in flight.

As a result, even under very heavy disk IO, %util can stay far below
100%, especially on HDDs, because IO latency can be much more than 1ms
with HZ=1000, so the above calculation is very inaccurate.

Another extreme example: if IOs take a long time to complete, such as
during an IO stall, %util may show 0% utilization instead of 100%.

Thanks,
Ming

^ permalink raw reply	[flat|nested] 5+ messages in thread
* Re: very inaccurate %util of iostat
  2020-03-24  3:19 [regression] very inaccurate %util of iostat Ming Lei
@ 2020-03-24  3:53 ` Mike Snitzer
  2020-10-23  2:48   ` Weiping Zhang
  0 siblings, 1 reply; 5+ messages in thread

From: Mike Snitzer @ 2020-03-24  3:53 UTC (permalink / raw)
  To: Ming Lei
  Cc: linux-block, Jens Axboe, mpatocka, Konstantin Khlebnikov

On Mon, Mar 23 2020 at 11:19pm -0400,
Ming Lei <ming.lei@redhat.com> wrote:

> Hi Guys,
>
> Commit 5b18b5a73760 ("block: delete part_round_stats and switch to less precise counting")
> changes calculation of 'io_ticks' a lot.
>
> In theory, io_ticks counts the time when there is any IO in-flight or in-queue,
> so it has to rely on in-flight counting of IO.
>
> However, commit 5b18b5a73760 changes io_ticks's accounting into the
> following way:
>
> 	stamp = READ_ONCE(part->stamp);
> 	if (unlikely(stamp != now)) {
> 		if (likely(cmpxchg(&part->stamp, stamp, now) == stamp))
> 			__part_stat_add(part, io_ticks, 1);
> 	}
>
> So this way doesn't use any in-flight IO's info, simply adding 1 if stamp
> changes compared with previous stamp, no matter if there is any in-flight
> IO or not.
>
> Now when there is very heavy IO on disks, %util is still much less than
> 100%, especially on HDD, the reason could be that IO latency can be much more
> than 1ms in case of 1000HZ, so the above calculation is very inaccurate.
>
> Another extreme example is that if IOs take long time to complete, such
> as IO stall, %util may show 0% utilization, instead of 100%.

Hi Ming,

Your email triggered a memory of someone else (Konstantin Khlebnikov)
having reported and fixed this relatively recently; please see this
patchset: https://lkml.org/lkml/2020/3/2/336

Obviously this needs fixing.  If you have time to review/polish the
proposed patches, that'd be great.

Mike

^ permalink raw reply	[flat|nested] 5+ messages in thread
* Re: very inaccurate %util of iostat
  2020-03-24  3:53 ` Mike Snitzer
@ 2020-10-23  2:48   ` Weiping Zhang
  2020-10-23  5:50     ` Weiping Zhang
  0 siblings, 1 reply; 5+ messages in thread

From: Weiping Zhang @ 2020-10-23  2:48 UTC (permalink / raw)
  To: Mike Snitzer
  Cc: Ming Lei, linux-block, Jens Axboe, mpatocka, Konstantin Khlebnikov

On Tue, Mar 24, 2020 at 11:54 AM Mike Snitzer <snitzer@redhat.com> wrote:
>
> On Mon, Mar 23 2020 at 11:19pm -0400,
> Ming Lei <ming.lei@redhat.com> wrote:
>
> > Hi Guys,
> >
> > Commit 5b18b5a73760 ("block: delete part_round_stats and switch to less precise counting")
> > changes calculation of 'io_ticks' a lot.
> >
> > In theory, io_ticks counts the time when there is any IO in-flight or in-queue,
> > so it has to rely on in-flight counting of IO.
> >
> > However, commit 5b18b5a73760 changes io_ticks's accounting into the
> > following way:
> >
> > 	stamp = READ_ONCE(part->stamp);
> > 	if (unlikely(stamp != now)) {
> > 		if (likely(cmpxchg(&part->stamp, stamp, now) == stamp))
> > 			__part_stat_add(part, io_ticks, 1);
> > 	}
> >
> > So this way doesn't use any in-flight IO's info, simply adding 1 if stamp
> > changes compared with previous stamp, no matter if there is any in-flight
> > IO or not.
> >
> > Now when there is very heavy IO on disks, %util is still much less than
> > 100%, especially on HDD, the reason could be that IO latency can be much more
> > than 1ms in case of 1000HZ, so the above calculation is very inaccurate.
> >
> > Another extreme example is that if IOs take long time to complete, such
> > as IO stall, %util may show 0% utilization, instead of 100%.
>
> Hi Ming,
>
> Your email triggered a memory of someone else (Konstantin Khlebnikov)
> having reported and fixed this relatively recently, please see this
> patchset: https://lkml.org/lkml/2020/3/2/336
>
> Obviously this needs fixing. If you have time to review/polish the
> proposed patches that'd be great.
>
> Mike
>

Hi,

Commit 5b18b5a73760 makes io.util larger than the real value when the IO
inflight count is <= 1; even with commit 2b8bd423614, the problem still
exists.

static void update_io_ticks(struct hd_struct *part, unsigned long now, bool end)
{
	unsigned long stamp;
again:
	stamp = READ_ONCE(part->stamp);
	if (unlikely(stamp != now)) {
		if (likely(cmpxchg(&part->stamp, stamp, now) == stamp))
			__part_stat_add(part, io_ticks, end ? now - stamp : 1);
	}
}

When a new IO starts, blk_account_io_start => update_io_ticks adds 1
jiffy to io_ticks even if there was no IO in flight before, so it always
adds an extra jiffy. So we need to know whether there was any inflight
IO before.

Before commit 5b18b5a73760, io_ticks was not incremented if there was no
inflight IO when a new IO started:

static void part_round_stats_single(struct request_queue *q,
				    struct hd_struct *part, unsigned long now,
				    unsigned int inflight)
{
	if (inflight) {
		__part_stat_add(part, time_in_queue,
				inflight * (now - part->stamp));
		__part_stat_add(part, io_ticks, (now - part->stamp));
	}
	part->stamp = now;
}

Reproduce:
fio -name=test -ioengine=sync -bs=4K -rw=write \
	-filename=/home/test.fio.log -size=100M -time_based=1 -direct=1 \
	-runtime=300 -rate=2m,2m

^ permalink raw reply	[flat|nested] 5+ messages in thread
* Re: very inaccurate %util of iostat
  2020-10-23  2:48 ` Weiping Zhang
@ 2020-10-23  5:50   ` Weiping Zhang
       [not found]     ` <156351603710306@mail.yandex-team.ru>
  0 siblings, 1 reply; 5+ messages in thread

From: Weiping Zhang @ 2020-10-23  5:50 UTC (permalink / raw)
  To: Mike Snitzer
  Cc: Ming Lei, linux-block, Jens Axboe, mpatocka, Konstantin Khlebnikov

On Fri, Oct 23, 2020 at 10:48 AM Weiping Zhang <zwp10758@gmail.com> wrote:
>
> On Tue, Mar 24, 2020 at 11:54 AM Mike Snitzer <snitzer@redhat.com> wrote:
> >
> > On Mon, Mar 23 2020 at 11:19pm -0400,
> > Ming Lei <ming.lei@redhat.com> wrote:
> >
> > > Hi Guys,
> > >
> > > Commit 5b18b5a73760 ("block: delete part_round_stats and switch to less precise counting")
> > > changes calculation of 'io_ticks' a lot.
> > >
> > > In theory, io_ticks counts the time when there is any IO in-flight or in-queue,
> > > so it has to rely on in-flight counting of IO.
> > >
> > > However, commit 5b18b5a73760 changes io_ticks's accounting into the
> > > following way:
> > >
> > > 	stamp = READ_ONCE(part->stamp);
> > > 	if (unlikely(stamp != now)) {
> > > 		if (likely(cmpxchg(&part->stamp, stamp, now) == stamp))
> > > 			__part_stat_add(part, io_ticks, 1);
> > > 	}
> > >
> > > So this way doesn't use any in-flight IO's info, simply adding 1 if stamp
> > > changes compared with previous stamp, no matter if there is any in-flight
> > > IO or not.
> > >
> > > Now when there is very heavy IO on disks, %util is still much less than
> > > 100%, especially on HDD, the reason could be that IO latency can be much more
> > > than 1ms in case of 1000HZ, so the above calculation is very inaccurate.
> > >
> > > Another extreme example is that if IOs take long time to complete, such
> > > as IO stall, %util may show 0% utilization, instead of 100%.
> >
> > Hi Ming,
> >
> > Your email triggered a memory of someone else (Konstantin Khlebnikov)
> > having reported and fixed this relatively recently, please see this
> > patchset: https://lkml.org/lkml/2020/3/2/336
> >
> > Obviously this needs fixing. If you have time to review/polish the
> > proposed patches that'd be great.
> >
> > Mike
> >
>
> Hi,
>
> commit 5b18b5a73760 makes io.util larger than the real, when IO
> inflight count <= 1,
> even with the commit 2b8bd423614, the problem is exist too.
>
> static void update_io_ticks(struct hd_struct *part, unsigned long now, bool end)
> {
> 	unsigned long stamp;
> again:
> 	stamp = READ_ONCE(part->stamp);
> 	if (unlikely(stamp != now)) {
> 		if (likely(cmpxchg(&part->stamp, stamp, now) == stamp))
> 			__part_stat_add(part, io_ticks, end ? now - stamp : 1);
> 	}
>
> when a new start, blk_account_io_start => update_io_ticks and add 1 jiffy to
> io_ticks, even there is no IO before, so it will always add an extra 1 jiffy.
> So we should know is there any inflight IO before.
>
> Before commit 5b18b5a73760,
> The io_ticks will not be added, if there is no inflight when start a new IO.
> static void part_round_stats_single(struct request_queue *q,
> 				    struct hd_struct *part, unsigned long now,
> 				    unsigned int inflight)
> {
> 	if (inflight) {
> 		__part_stat_add(part, time_in_queue,
> 				inflight * (now - part->stamp));
> 		__part_stat_add(part, io_ticks, (now - part->stamp));
> 	}
> 	part->stamp = now;
> }
>
> Reproduce:
> fio -name=test -ioengine=sync -bs=4K -rw=write
> -filename=/home/test.fio.log -size=100M -time_based=1 -direct=1
> -runtime=300 -rate=2m,2m

Let me explain in more detail.

I ran the following command on the same host with different kernel
versions:

fio -name=test -ioengine=sync -bs=4K -rw=write \
	-filename=/home/test.fio.log -size=100M -time_based=1 -direct=1 \
	-runtime=300 -rate=2m,2m

If we run fio in sync direct-IO mode, IOs are processed one by one; you
can see that 512 IOs complete per second.
kernel: 4.19.0

Device:  rrqm/s wrqm/s   r/s     w/s  rMB/s  wMB/s avgrq-sz avgqu-sz  await r_await w_await  svctm  %util
vda        0.00   0.00  0.00  512.00   0.00   2.00     8.00     0.21   0.40    0.00    0.40   0.40  20.60

The average IO latency is 0.4ms, so the disk time spent in one second
should be 0.4 * 512 = 204.8 ms; that means %util should be about 20%.

kernel: commit f9893351ac

On the latest kernel at commit f9893351ac I got the following numbers.
Because update_io_ticks adds an extra 1 jiffy (1ms) for every IO, the
effective per-IO accounting becomes 1 + 0.4 = 1.4ms, and
1.4 * 512 = 716.8ms, so %util shows about 72%.

Device     r/s     w/s  rMB/s  wMB/s rrqm/s wrqm/s %rrqm %wrqm r_await w_await aqu-sz rareq-sz wareq-sz  svctm  %util
vda       0.00  512.00   0.00   2.00   0.00   0.00  0.00  0.00    0.00    0.40   0.20     0.00     4.00   1.41  72.10

There is a big gap in %util.

^ permalink raw reply	[flat|nested] 5+ messages in thread
[parent not found: <156351603710306@mail.yandex-team.ru>]
* Re: very inaccurate %util of iostat
       [not found] ` <156351603710306@mail.yandex-team.ru>
@ 2020-10-27  4:26   ` Weiping Zhang
  0 siblings, 0 replies; 5+ messages in thread

From: Weiping Zhang @ 2020-10-27  4:26 UTC (permalink / raw)
  To: Konstantin Khlebnikov
  Cc: Mike Snitzer, Ming Lei, linux-block, Jens Axboe, mpatocka

On Mon, Oct 26, 2020 at 7:17 PM Konstantin Khlebnikov
<khlebnikov@yandex-team.ru> wrote:
>
> 23.10.2020, 08:51, "Weiping Zhang" <zwp10758@gmail.com>:
> > On Fri, Oct 23, 2020 at 10:48 AM Weiping Zhang <zwp10758@gmail.com> wrote:
> > > On Tue, Mar 24, 2020 at 11:54 AM Mike Snitzer <snitzer@redhat.com> wrote:
> > > >
> > > > On Mon, Mar 23 2020 at 11:19pm -0400,
> > > > Ming Lei <ming.lei@redhat.com> wrote:
> > > >
> > > > > Hi Guys,
> > > > >
> > > > > Commit 5b18b5a73760 ("block: delete part_round_stats and switch to less precise counting")
> > > > > changes calculation of 'io_ticks' a lot.
> > > > >
> > > > > In theory, io_ticks counts the time when there is any IO in-flight or in-queue,
> > > > > so it has to rely on in-flight counting of IO.
> > > > >
> > > > > However, commit 5b18b5a73760 changes io_ticks's accounting into the
> > > > > following way:
> > > > >
> > > > > 	stamp = READ_ONCE(part->stamp);
> > > > > 	if (unlikely(stamp != now)) {
> > > > > 		if (likely(cmpxchg(&part->stamp, stamp, now) == stamp))
> > > > > 			__part_stat_add(part, io_ticks, 1);
> > > > > 	}
> > > > >
> > > > > So this way doesn't use any in-flight IO's info, simply adding 1 if stamp
> > > > > changes compared with previous stamp, no matter if there is any in-flight
> > > > > IO or not.
> > > > >
> > > > > Now when there is very heavy IO on disks, %util is still much less than
> > > > > 100%, especially on HDD, the reason could be that IO latency can be much more
> > > > > than 1ms in case of 1000HZ, so the above calculation is very inaccurate.
> > > > >
> > > > > Another extreme example is that if IOs take long time to complete, such
> > > > > as IO stall, %util may show 0% utilization, instead of 100%.
> > > >
> > > > Hi Ming,
> > > >
> > > > Your email triggered a memory of someone else (Konstantin Khlebnikov)
> > > > having reported and fixed this relatively recently, please see this
> > > > patchset: https://lkml.org/lkml/2020/3/2/336
> > > >
> > > > Obviously this needs fixing. If you have time to review/polish the
> > > > proposed patches that'd be great.
> > > >
> > > > Mike
> > >
> > > Hi,
> > >
> > > commit 5b18b5a73760 makes io.util larger than the real, when IO
> > > inflight count <= 1,
> > > even with the commit 2b8bd423614, the problem is exist too.
> > >
> > > static void update_io_ticks(struct hd_struct *part, unsigned long now, bool end)
> > > {
> > > 	unsigned long stamp;
> > > again:
> > > 	stamp = READ_ONCE(part->stamp);
> > > 	if (unlikely(stamp != now)) {
> > > 		if (likely(cmpxchg(&part->stamp, stamp, now) == stamp))
> > > 			__part_stat_add(part, io_ticks, end ? now - stamp : 1);
> > > 	}
> > >
> > > when a new start, blk_account_io_start => update_io_ticks and add 1 jiffy to
> > > io_ticks, even there is no IO before, so it will always add an extra 1 jiffy.
> > > So we should know is there any inflight IO before.
> > >
> > > Before commit 5b18b5a73760,
> > > The io_ticks will not be added, if there is no inflight when start a new IO.
> > > static void part_round_stats_single(struct request_queue *q,
> > > 				    struct hd_struct *part, unsigned long now,
> > > 				    unsigned int inflight)
> > > {
> > > 	if (inflight) {
> > > 		__part_stat_add(part, time_in_queue,
> > > 				inflight * (now - part->stamp));
> > > 		__part_stat_add(part, io_ticks, (now - part->stamp));
> > > 	}
> > > 	part->stamp = now;
> > > }
> > >
> > > Reproduce:
> > > fio -name=test -ioengine=sync -bs=4K -rw=write
> > > -filename=/home/test.fio.log -size=100M -time_based=1 -direct=1
> > > -runtime=300 -rate=2m,2m
> >
> > Let's explain in more detail.
> >
> > I run the following command on a host, with different kernel version.
> >
> > fio -name=test -ioengine=sync -bs=4K -rw=write
> > -filename=/home/test.fio.log -size=100M -time_based=1 -direct=1
> > -runtime=300 -rate=2m,2m
> >
> > If we run fio in a sync direct io mode, IO will be proccessed one by one,
> > you can see that there are 512 IOs completed in one second.
> >
> > kernel: 4.19.0
> >
> > Device:  rrqm/s wrqm/s   r/s     w/s  rMB/s  wMB/s avgrq-sz avgqu-sz  await r_await w_await  svctm  %util
> > vda        0.00   0.00  0.00  512.00   0.00   2.00     8.00     0.21   0.40    0.00    0.40   0.40  20.60
> >
> > The averate io.latency is 0.4ms
> > So the disk time cost in one second should be 0.4 * 512 = 204.8 ms,
> > that means, %util should be 20%.
> >
> > kernel: commit f9893351ac
> > In the latest kernel commit f9893351ac, I got the follow number,
> > Becase update_io_ticks will add a extra 1 jiffy(1ms) for every IO, the
> > io.latency will be 1 + 0.4 = 1.4ms,
> > 1.4 * 512 = 716.8ms, so the %util show it about 72%.
> >
> > Device     r/s     w/s  rMB/s  wMB/s rrqm/s wrqm/s %rrqm %wrqm r_await w_await aqu-sz rareq-sz wareq-sz  svctm  %util
> > vda       0.00  512.00   0.00   2.00   0.00   0.00  0.00  0.00    0.00    0.40   0.20     0.00     4.00   1.41  72.10
> >
> > There is a big gap for %util.
>
> Yep. The meaning of this statistic has changed: it now approximates the
> count of jiffies in which at least one request was running, i.e. every
> tiny request accounts a whole jiffy in which it started or finished.

When Q2Q >= 1 jiffy, an extra jiffy is added at each IO start, which is
really not reasonable:

modprobe null_blk submit_queues=8 queue_mode=2 irqmode=2 completion_nsec=100000

fio -name=test -ioengine=sync -bs=4K -rw=write -filename=/dev/nullb0 \
	-size=100M -time_based=1 -direct=1 -runtime=300 -rate=4m &

               w/s   w_await   %util
-----------------------------------------------
before patch  1024      0.15     100
after patch   1024      0.15    14.5

The gap is really unacceptable.

The test patch: https://github.com/dublio/linux/commits/fix_util_v5

> %util is still useful - it shows how disk activity is grouped across
> time.
>
> You could get precise "utilization" in the form of average queue depth
> from the read/write time counters.

It's hard to work out how much inflight IOs overlap each other when the
device has a high queue depth.

^ permalink raw reply	[flat|nested] 5+ messages in thread
end of thread, other threads:[~2020-10-27  4:26 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-03-24  3:19 [regression] very inaccurate %util of iostat Ming Lei
2020-03-24  3:53 ` Mike Snitzer
2020-10-23  2:48   ` Weiping Zhang
2020-10-23  5:50     ` Weiping Zhang
     [not found]       ` <156351603710306@mail.yandex-team.ru>
2020-10-27  4:26         ` Weiping Zhang