From: Weiping Zhang <zwp10758@gmail.com>
To: Ming Lei <ming.lei@redhat.com>
Cc: Jens Axboe <axboe@kernel.dk>, Mike Snitzer <snitzer@redhat.com>,
	mpatocka@redhat.com, linux-block@vger.kernel.org
Subject: Re: [PATCH v5 0/2] fix inaccurate io_ticks
Date: Thu, 26 Nov 2020 19:23:19 +0800	[thread overview]
Message-ID: <CAA70yB76dWneSJvtyA=1BFSJit3jzyvBAzJQYO-UHo-KACSZ9g@mail.gmail.com> (raw)
In-Reply-To: <CAA70yB4c_mBxr3ftDd1omU=Piozxw2jKM0nyMmOP9P_hOYjNMQ@mail.gmail.com>

Ping

On Wed, Nov 18, 2020 at 1:55 PM Weiping Zhang <zwp10758@gmail.com> wrote:
>
> On Tue, Nov 17, 2020 at 3:40 PM Ming Lei <ming.lei@redhat.com> wrote:
> >
> > On Tue, Nov 17, 2020 at 12:59:46PM +0800, Weiping Zhang wrote:
> > > On Tue, Nov 17, 2020 at 11:28 AM Ming Lei <ming.lei@redhat.com> wrote:
> > > >
> > > > On Tue, Nov 17, 2020 at 11:01:49AM +0800, Weiping Zhang wrote:
> > > > > Hi Jens,
> > > > >
> > > > > Ping
> > > >
> > > > Hello Weiping,
> > > >
> > > > Not sure we have to fix this issue: adding blk_mq_queue_inflight()
> > > > back to the IO path brings a cost which turns out to be visible, and
> > > > I did get a soft lockup report on Azure NVMe because of this kind of
> > > > cost.
> > > >
> > > Have you tested v5? This patch is different from v1: v1 gets the
> > > inflight count for each IO, while v5 has been changed to get the
> > > inflight count once per jiffy.
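> > >
> > > Roughly, the idea in v5 looks like this (a simplified sketch of the
> > > approach, not the exact patch; the helper names follow the in-tree
> > > code):
> > >
> > > static void update_io_ticks(struct hd_struct *part, unsigned long now)
> > > {
> > >         unsigned long stamp = READ_ONCE(part->stamp);
> > >
> > >         /* At most one inflight check per jiffy, not one per IO. */
> > >         if (unlikely(stamp != now) &&
> > >             likely(cmpxchg(&part->stamp, stamp, now) == stamp) &&
> > >             blk_mq_queue_inflight(part_to_disk(part)->queue))
> > >                 __part_stat_add(part, io_ticks, now - stamp);
> > > }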
> >
> > I meant the issue can be reproduced on kernels before 5b18b5a73760
> > ("block: delete part_round_stats and switch to less precise counting").
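> >
> > For context, the pre-5b18b5a73760 code did roughly the following on
> > every IO (a simplified sketch; the exact helpers varied by release):
> >
> > static void part_round_stats(struct request_queue *q, struct hd_struct *part)
> > {
> >         unsigned long now = jiffies;
> >         unsigned int inflight;
> >
> >         if (now == part->stamp)
> >                 return;
> >
> >         /* Counting inflight IOs on every request is the costly part. */
> >         inflight = part_in_flight(q, part);
> >         if (inflight) {
> >                 __part_stat_add(part, time_in_queue,
> >                                 inflight * (now - part->stamp));
> >                 __part_stat_add(part, io_ticks, now - part->stamp);
> >         }
> >         part->stamp = now;
> > }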
> >
> > Also, do we really need to fix this issue? I understand that device
> > utilization becomes inaccurate at very small loads, but is it really
> > worth adding runtime overhead to the fast path to fix it?
> >
> Hello Ming,
>
> The problem is that it is hard for the user to know how busy the disk is:
> a small load shows high utilization, and a heavy load also shows high
> utilization, which makes %util meaningless.
>
> The following test case shows a big gap for the same workload:
>
> modprobe null_blk submit_queues=8 queue_mode=2 irqmode=2 completion_nsec=100000
> fio -name=test -ioengine=sync -bs=4K -rw=write -filename=/dev/nullb0 \
>     -size=100M -time_based=1 -direct=1 -runtime=300 -rate=4m &
>
>                   w/s    w_await   %util
> -----------------------------------------
> before patch     1024       0.15     100
> after  patch     1024       0.15    14.5
>
> I know that for very fast disks, adding such accounting to the fast path
> is harmful; maybe we could add an interface to enable/disable io_ticks
> accounting, like what /sys/block/<disk>/queue/iostat does.
>
> e.g. /sys/block/<disk>/queue/iostat_io_ticks:
> writing 0 to it would disable io_ticks accounting entirely.
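>
> A minimal sketch of such a store hook, assuming a new queue flag (the
> QUEUE_FLAG_IO_STAT_TICKS name and the attribute name are made up here):
>
> static ssize_t queue_io_ticks_store(struct request_queue *q,
>                                     const char *page, size_t count)
> {
>         unsigned long val;
>         ssize_t ret;
>
>         ret = queue_var_store(&val, page, count);
>         if (ret < 0)
>                 return ret;
>
>         /* 0 disables io_ticks accounting; anything else enables it. */
>         if (val)
>                 blk_queue_flag_set(QUEUE_FLAG_IO_STAT_TICKS, q);
>         else
>                 blk_queue_flag_clear(QUEUE_FLAG_IO_STAT_TICKS, q);
>         return count;
> }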
>
> Or is there any other good idea?
>
> > >
> > > For v5, can we reproduce it on null_blk?
> >
> > No, I just saw the report on Azure NVMe.
> >
> > >
> > > > BTW, supposing the IO accounting issue needs to be fixed, I am just
> > > > wondering why not simply revert 5b18b5a73760 ("block: delete
> > > > part_round_stats and switch to less precise counting"); the original
> > > > approach had worked for decades.
> > > >
> > > This patch is better than before: it breaks early when it finds an
> > > inflight IO on any cpu; only in the worst case (the IO is running on
> > > the last cpu) does it iterate over all cpus.
> >
> Yes, that's the worst case.
> Actually v5 has two improvements compared to before 5b18b5a73760:
> 1. for IO end, v5 does not get the inflight count
> 2. for IO start, v5 just finds the first inflight IO on any cpu; only in
> the worst case does it do the same work as before.
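>
> The early break relies on the tag iterator stopping once the callback
> returns false; roughly what the in-tree blk_mq_queue_inflight() does
> (a simplified sketch):
>
> static bool blk_mq_rq_inflight(struct blk_mq_hw_ctx *hctx, struct request *rq,
>                                void *priv, bool reserved)
> {
>         /* One started request is enough; return false to stop iterating. */
>         if (blk_mq_request_started(rq) && rq->q == hctx->queue) {
>                 bool *busy = priv;
>
>                 *busy = true;
>                 return false;
>         }
>         return true;
> }
>
> bool blk_mq_queue_inflight(struct request_queue *q)
> {
>         bool busy = false;
>
>         blk_mq_queue_tag_busy_iter(q, blk_mq_rq_inflight, &busy);
>         return busy;
> }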
>
> > Please see the following case:
> >
> > 1) one device has 256 hw queues, the system has 256 CPU cores, and
> > each hw queue's depth is 1k.
> >
> > 2) there isn't any IO load on CPUs 0 ~ 254
> >
> > 3) heavy IO load is run on CPU 255
> >
> > So with your trick the code still needs to iterate over hw queues 0 to
> > 254, and that load isn't something which can be ignored, especially
> > since it is just for IO accounting.
> >
> >
> > Thanks,
> > Ming
> >
> Thanks

Thread overview: 9+ messages
2020-10-27  4:54 [PATCH v5 0/2] fix inaccurate io_ticks Weiping Zhang
2020-11-04  3:26 ` Weiping Zhang
2020-11-17  3:01   ` Weiping Zhang
2020-11-17  3:27     ` Ming Lei
2020-11-17  4:59       ` Weiping Zhang
2020-11-17  7:40         ` Ming Lei
2020-11-18  5:55           ` Weiping Zhang
2020-11-26 11:23             ` Weiping Zhang [this message]
2020-12-17 17:03               ` Weiping Zhang
