From: Ming Lei <ming.lei@redhat.com>
To: Gulam Mohamed <gulam.mohamed@oracle.com>
Cc: linux-block@vger.kernel.org, axboe@kernel.dk,
philipp.reisner@linbit.com, lars.ellenberg@linbit.com,
christoph.boehmwalder@linbit.com, minchan@kernel.org,
ngupta@vflare.org, senozhatsky@chromium.org, colyli@suse.de,
kent.overstreet@gmail.com, agk@redhat.com, snitzer@kernel.org,
dm-devel@redhat.com, song@kernel.org, dan.j.williams@intel.com,
vishal.l.verma@intel.com, dave.jiang@intel.com,
ira.weiny@intel.com, junxiao.bi@oracle.com,
martin.petersen@oracle.com, kch@nvidia.com,
drbd-dev@lists.linbit.com, linux-kernel@vger.kernel.org,
linux-bcache@vger.kernel.org, linux-raid@vger.kernel.org,
nvdimm@lists.linux.dev, konrad.wilk@oracle.com,
joe.jin@oracle.com, ming.lei@redhat.com
Subject: Re: [RFC for-6.2/block V2] block: Change the granularity of io ticks from ms to ns
Date: Thu, 8 Dec 2022 08:36:49 +0800
Message-ID: <Y5ExoZ+7Am6Nm8+h@T590>
In-Reply-To: <20221207223204.22459-1-gulam.mohamed@oracle.com>

On Wed, Dec 07, 2022 at 10:32:04PM +0000, Gulam Mohamed wrote:
> As per the review comment from Jens Axboe, I am re-sending this patch
> against "for-6.2/block".
>
>
> Use ktime to change the granularity of IO accounting in block layer from
> milli-seconds to nano-seconds to get the proper latency values for the
> devices whose latency is in micro-seconds. After changing the granularity
> to nano-seconds the iostat command, which was showing incorrect values for
> %util, is now showing correct values.

Please add the theory explaining why nanosecond granularity gives
correct accounting.

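One possible line of reasoning, sketched here as a toy model and not
taken from the patch: assume io_ticks is kept in whole jiffies, that an
I/O starting in a jiffy not yet accounted is charged one full jiffy, and
that an I/O completing in a later jiffy is charged the jiffies elapsed
since the last update. The function name, HZ=1000, and the 70us latency
are illustrative assumptions; the 1ms think time comes from the fio job
quoted in this mail.

```python
def simulate(total_us=1_000_000, latency_us=70, think_us=1000, hz=1000):
    """Toy model: compare jiffy-granular busy time with exact busy time.

    Returns (coarse_util_percent, exact_util_percent) for one job at
    queue depth 1 that sleeps think_us after every completed I/O.
    """
    jiffy_us = 1_000_000 // hz
    stamp = 0      # last jiffy already accounted (assumed scheme)
    io_ticks = 0   # busy time in jiffies, as the coarse counter sees it
    busy_us = 0    # exact busy time in microseconds
    t = 0
    while t + latency_us <= total_us:
        # I/O start: a new jiffy is charged in full, however short the I/O.
        now = t // jiffy_us
        if now > stamp:
            io_ticks += 1
            stamp = now
        # I/O end: charge any jiffies that elapsed while it was in flight.
        end = t + latency_us
        now = end // jiffy_us
        if now > stamp:
            io_ticks += now - stamp
            stamp = now
        busy_us += latency_us
        t = end + think_us
    coarse = 100.0 * io_ticks * jiffy_us / total_us
    exact = 100.0 * busy_us / total_us
    return coarse, exact


coarse_util, exact_util = simulate()
print(f"jiffy-granular: {coarse_util:.1f}%  exact: {exact_util:.1f}%")
```

In this model nearly every jiffy sees one I/O start, so the coarse
counter reports close to 100% even though the device is busy well under
10% of the time. Whether the real accounting behaves this way should be
confirmed against the block layer code.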
>
> We did not work on the patch to drop the logic for
> STAT_PRECISE_TIMESTAMPS yet. Will do it if this patch is ok.
>
> The iostat command was run after starting the fio with following command
> on an NVME disk. For the same fio command, the iostat %util was showing
> ~100% for the disks whose latencies are in the range of microseconds.
> With the kernel changes (granularity to nano-seconds), the %util was
> showing correct values. Following are the details of the test and their
> output:
>
> fio command
> -----------
> [global]
> bs=128K
> iodepth=1
> direct=1
> ioengine=libaio
> group_reporting
> time_based
> runtime=90
> thinktime=1ms
> numjobs=1
> name=raw-write
> rw=randrw
> ignore_error=EIO:EIO
> [job1]
> filename=/dev/nvme0n1
>
> Correct values after kernel changes:
> ====================================
> iostat output
> -------------
> iostat -d /dev/nvme0n1 -x 1
>
> Device  r_await w_await aqu-sz rareq-sz wareq-sz svctm %util
> nvme0n1    0.08    0.05   0.06   128.00   128.00  0.07  6.50
>
> Device  r_await w_await aqu-sz rareq-sz wareq-sz svctm %util
> nvme0n1    0.08    0.06   0.06   128.00   128.00  0.07  6.30
>
> Device  r_await w_await aqu-sz rareq-sz wareq-sz svctm %util
> nvme0n1    0.06    0.05   0.06   128.00   128.00  0.06  5.70
>
> From fio
> --------
> Read Latency: clat (usec): min=32, max=2335, avg=79.54, stdev=29.95
> Write Latency: clat (usec): min=38, max=130, avg=57.76, stdev= 3.25

Can you explain a bit why the above %util is correct?

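A rough sanity check one could apply to the numbers quoted above, under
stated assumptions: with iodepth=1, the device is taken to be busy for
about one completion latency per I/O and idle for the 1ms think time.
The 50/50 read/write mix is an assumption (fio's default for randrw),
and using clat as the busy time ignores submission overhead. The
function name is hypothetical.

```python
def expected_util_percent(read_clat_us, write_clat_us, thinktime_us,
                          read_fraction=0.5):
    """Estimate %util for a queue-depth-1 job with think time.

    Busy time per I/O is approximated by the average completion
    latency; idle time per I/O is the think time.
    """
    avg_clat_us = (read_fraction * read_clat_us
                   + (1.0 - read_fraction) * write_clat_us)
    return 100.0 * avg_clat_us / (avg_clat_us + thinktime_us)


# Average clat values from the fio output quoted above, thinktime=1ms.
estimate = expected_util_percent(79.54, 57.76, 1000.0)
print(f"expected %util: {estimate:.1f}")
```

This gives roughly 6.4%, which is in the same range as the 5.70 to 6.50
reported by iostat after the change. It is only an estimate and does
not replace an explanation from the patch author.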
BTW, %util is usually not important for SSDs, please see 'man iostat':

    %util
        Percentage of elapsed time during which I/O requests were
        issued to the device (bandwidth utilization for the device).
        Device saturation occurs when this value is close to 100% for
        devices serving requests serially. But for devices serving
        requests in parallel, such as RAID arrays and modern SSDs,
        this number does not reflect their performance limits.
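
For reference, a minimal sketch of how a %util figure of this kind can
be derived from two samples of the "time spent doing I/Os" counter in
/proc/diskstats (the 13th whitespace-separated field, in milliseconds).
The helper names and the sample lines are made up for illustration;
this is not iostat's actual code.

```python
def io_ticks_ms(diskstats_line):
    """Return (device_name, io_ticks_ms) from one /proc/diskstats line."""
    fields = diskstats_line.split()
    # fields[0..2] are major, minor, name; fields[12] is the time spent
    # doing I/Os, in milliseconds.
    return fields[2], int(fields[12])


def util_percent(before_ms, after_ms, interval_ms):
    """Share of the sampling interval during which the device was busy."""
    return 100.0 * (after_ms - before_ms) / interval_ms


# Two hypothetical samples taken one second apart.
sample_a = "259 0 nvme0n1 1000 0 256000 80 1000 0 256000 58 0 5000 138"
sample_b = "259 0 nvme0n1 1467 0 375552 117 1468 0 375808 85 0 5065 202"

_, ticks_a = io_ticks_ms(sample_a)
_, ticks_b = io_ticks_ms(sample_b)
print(f"%util: {util_percent(ticks_a, ticks_b, 1000):.2f}")
```

Since the counter only records whether any request was in flight, the
result says nothing about how many requests a parallel device could
have served in the same time, which is the point the man page makes.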

Thanks,
Ming