linux-block.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Weiping Zhang <zwp10758@gmail.com>
To: Mike Snitzer <snitzer@redhat.com>
Cc: Ming Lei <ming.lei@redhat.com>,
	linux-block@vger.kernel.org, Jens Axboe <axboe@kernel.dk>,
	mpatocka@redhat.com,
	Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Subject: Re: very inaccurate %util of iostat
Date: Fri, 23 Oct 2020 13:50:49 +0800	[thread overview]
Message-ID: <CAA70yB73hVgt-KppSYaXuh_yOpmJ0zQi3hWP-236a0oUTnOxpg@mail.gmail.com> (raw)
In-Reply-To: <CAA70yB5mFgKKBbJRQ=3Nv+XbVXbSUjmnAGOsUtBT2+s9f4az+Q@mail.gmail.com>

On Fri, Oct 23, 2020 at 10:48 AM Weiping Zhang <zwp10758@gmail.com> wrote:
>
> On Tue, Mar 24, 2020 at 11:54 AM Mike Snitzer <snitzer@redhat.com> wrote:
> >
> > On Mon, Mar 23 2020 at 11:19pm -0400,
> > Ming Lei <ming.lei@redhat.com> wrote:
> >
> > > Hi Guys,
> > >
> > > Commit 5b18b5a73760 ("block: delete part_round_stats and switch to less precise counting")
> > > changes calculation of 'io_ticks' a lot.
> > >
> > > In theory, io_ticks counts the time when there is any IO in-flight or in-queue,
> > > so it has to rely on in-flight counting of IO.
> > >
> > > However, commit 5b18b5a73760 changes io_ticks's accounting into the
> > > following way:
> > >
> > >       stamp = READ_ONCE(part->stamp);
> > >       if (unlikely(stamp != now)) {
> > >               if (likely(cmpxchg(&part->stamp, stamp, now) == stamp))
> > >                       __part_stat_add(part, io_ticks, 1);
> > >       }
> > >
> > > So this way doesn't use any in-flight IO's info, simply adding 1 if stamp
> > > changes compared with previous stamp, no matter if there is any in-flight
> > > IO or not.
> > >
> > > Now when there is very heavy IO on disks, %util is still much less than
> > > 100%, especially on HDD, the reason could be that IO latency can be much more
> > > than 1ms in case of 1000HZ, so the above calculation is very inaccurate.
> > >
> > > Another extreme example is that if IOs take long time to complete, such
> > > as IO stall, %util may show 0% utilization, instead of 100%.
> >
> > Hi Ming,
> >
> > Your email triggered a memory of someone else (Konstantin Khlebnikov)
> > having reported and fixed this relatively recently, please see this
> > patchset: https://lkml.org/lkml/2020/3/2/336
> >
> > Obviously this needs fixing.  If you have time to review/polish the
> > proposed patches that'd be great.
> >
> > Mike
> >
>
> Hi,
>
> commit 5b18b5a73760   makes io.util larger than the real, when IO
> inflight count <= 1,
> even with the commit 2b8bd423614, the problem is exist too.
>
> static void update_io_ticks(struct hd_struct *part, unsigned long now, bool end)
> {
>         unsigned long stamp;
> again:
>         stamp = READ_ONCE(part->stamp);
>         if (unlikely(stamp != now)) {
>                 if (likely(cmpxchg(&part->stamp, stamp, now) == stamp))
>                         __part_stat_add(part, io_ticks, end ? now - stamp : 1);
>         }
>
> when a new start, blk_account_io_start => update_io_ticks and add 1 jiffy to
> io_ticks, even there is no IO before, so it will always add an extra 1 jiffy.
> So we should know is there any inflight IO before.
>
>
>
> Before commit 5b18b5a73760,
> The io_ticks will not be added, if there is no inflight when start a new IO.
> static void part_round_stats_single(struct request_queue *q,
>                                     struct hd_struct *part, unsigned long now,
>                                     unsigned int inflight)
> {
>         if (inflight) {
>                 __part_stat_add(part, time_in_queue,
>                                 inflight * (now - part->stamp));
>                 __part_stat_add(part, io_ticks, (now - part->stamp));
>         }
>         part->stamp = now;
> }
>
>
> Reproduce:
> fio -name=test -ioengine=sync -bs=4K -rw=write
> -filename=/home/test.fio.log -size=100M -time_based=1 -direct=1
> -runtime=300 -rate=2m,2m



Let's explain in more detail.

I run the following command on a host, with different kernel version.

fio -name=test -ioengine=sync -bs=4K -rw=write
-filename=/home/test.fio.log -size=100M -time_based=1 -direct=1
-runtime=300 -rate=2m,2m

If we run fio in a sync direct io mode, IO will be proccessed one by one,
you can see that there are 512 IOs completed in one second.


kernel: 4.19.0

Device:         rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s
avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
vda               0.00     0.00    0.00  512.00     0.00     2.00
8.00     0.21    0.40    0.00    0.40   0.40  20.60

The averate io.latency is 0.4ms
So the disk time cost in one second should be 0.4 * 512 = 204.8 ms,
that means, %util should be 20%.


kernel: commit f9893351ac
In the latest kernel commit f9893351ac, I got the follow number,
Becase update_io_ticks will add a extra 1 jiffy(1ms) for every IO, the
io.latency io.latency will be 1 + 0.4 = 1.4ms,
1.4 * 512 = 716.8ms, so the %util show it about 72%.

Device            r/s     w/s     rMB/s     wMB/s   rrqm/s   wrqm/s
%rrqm  %wrqm r_await w_await aqu-sz rareq-sz wareq-sz  svctm  %util
vda              0.00  512.00      0.00      2.00     0.00     0.00
0.00   0.00    0.00    0.40   0.20     0.00     4.00   1.41  72.10

There is a big gap for %util.

  reply	other threads:[~2020-10-23  5:51 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-03-24  3:19 [regression] very inaccurate %util of iostat Ming Lei
2020-03-24  3:53 ` Mike Snitzer
2020-10-23  2:48   ` Weiping Zhang
2020-10-23  5:50     ` Weiping Zhang [this message]
     [not found]       ` <156351603710306@mail.yandex-team.ru>
2020-10-27  4:26         ` Weiping Zhang

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAA70yB73hVgt-KppSYaXuh_yOpmJ0zQi3hWP-236a0oUTnOxpg@mail.gmail.com \
    --to=zwp10758@gmail.com \
    --cc=axboe@kernel.dk \
    --cc=khlebnikov@yandex-team.ru \
    --cc=linux-block@vger.kernel.org \
    --cc=ming.lei@redhat.com \
    --cc=mpatocka@redhat.com \
    --cc=snitzer@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).