stable.vger.kernel.org archive mirror
From: Greg KH <gregkh@linuxfoundation.org>
To: "Rantala, Tommi T. (Nokia - FI/Espoo)" <tommi.t.rantala@nokia.com>
Cc: "axboe@kernel.dk" <axboe@kernel.dk>,
	"osandov@fb.com" <osandov@fb.com>,
	"stable@vger.kernel.org" <stable@vger.kernel.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"linux-block@vger.kernel.org" <linux-block@vger.kernel.org>
Subject: Re: 4.19 LTS high /proc/diskstats io_ticks
Date: Wed, 25 Mar 2020 11:07:36 +0100
Message-ID: <20200325100736.GA3083079@kroah.com>
In-Reply-To: <564f7f3718cdc85f841d27a358a43aee4ca239d6.camel@nokia.com>

On Wed, Mar 25, 2020 at 10:02:41AM +0000, Rantala, Tommi T. (Nokia - FI/Espoo) wrote:
> Hi,
> 
> Tools like sar and iostat are reporting abnormally high %util with 4.19.y
> running in VM (the disk is almost idle):
> 
>   $ sar -dp
>   Linux 4.19.107-1.x86_64   03/25/20    _x86_64_   (6 CPU)
> 
>   00:00:00        DEV       tps      ...     %util
>   00:10:00        vda      0.55      ...     98.07
>   ...
>   10:00:00        vda      0.44      ...     99.74
>   Average:        vda      0.48      ...     98.98
> 
> The numbers look reasonable for the partition:
> 
>   # iostat -x -p ALL 1 1
>   Linux 4.19.107-1.x86_64   03/25/20    _x86_64_  (6 CPU)
> 
>   avg-cpu:  %user   %nice %system %iowait  %steal   %idle
>             10.51    0.00    8.58    0.05    0.11   80.75
> 
>   Device            r/s     ...  %util
>   vda              0.02     ...  98.25
>   vda1             0.01     ...   0.09
> 
> 
> Lots of io_ticks in /proc/diskstats:
> 
> # cat /proc/uptime
> 45787.03 229321.29
> 
> # grep vda /proc/diskstats
>  253      0 vda 760 0 38498 731 28165 43212 1462928 157514 0 44690263 44812032 0 0 0 0
>  253      1 vda1 350 0 19074 293 26169 43212 1462912 154931 0 41560 150998 0 0 0 0
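
The figures quoted above can be cross-checked by hand. What follows is a minimal sketch, not taken from the thread: it assumes the documented /proc/diskstats layout, in which the tenth statistics field after the device name is io_ticks (milliseconds spent doing I/O), and divides that by the uptime from /proc/uptime. The helper names parse_io_ticks() and busy_percent() are hypothetical.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Pull the device name and io_ticks (10th stat field, in ms) out of one
 * /proc/diskstats line. Returns 0 on success, -1 if the line is malformed.
 */
int parse_io_ticks(const char *line, char name[32], unsigned long long *io_ticks)
{
    unsigned int major, minor;
    unsigned long long f[10];
    int n = sscanf(line,
                   "%u %u %31s %llu %llu %llu %llu %llu %llu %llu %llu %llu %llu",
                   &major, &minor, name,
                   &f[0], &f[1], &f[2], &f[3], &f[4],
                   &f[5], &f[6], &f[7], &f[8], &f[9]);

    if (n != 13)
        return -1;
    *io_ticks = f[9];
    return 0;
}

/* Share of the uptime during which the device was accounted as busy. */
double busy_percent(unsigned long long io_ticks_ms, double uptime_s)
{
    assert(uptime_s > 0.0);
    return 100.0 * (double)io_ticks_ms / (uptime_s * 1000.0);
}
```

With the values above (io_ticks 44690263 ms for vda, 41560 ms for vda1, uptime 45787.03 s) this gives roughly 97.6% for the whole disk and roughly 0.09% for the partition, in line with the since-boot iostat output. Note that sar and iostat interval reports use deltas between two samples rather than totals since boot.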
> 
> 
> Other people are apparently seeing this too with 4.19:
> https://kudzia.eu/b/2019/09/iostat-x-1-reporting-100-utilization-of-nearly-idle-nvme-drives/
> 
> 
> I also see this only in 4.19.y and bisected to this (based on the Fixes
> tag, this should have been taken to 4.14 too...):
> 
> commit 6131837b1de66116459ef4413e26fdbc70d066dc
> Author: Omar Sandoval <osandov@fb.com>
> Date:   Thu Apr 26 00:21:58 2018 -0700
> 
>   blk-mq: count allocated but not started requests in iostats inflight
> 
>   In the legacy block case, we increment the counter right after we
>   allocate the request, not when the driver handles it. In both the legacy
>   and blk-mq cases, part_inc_in_flight() is called from
>   blk_account_io_start() right after we've allocated the request. blk-mq
>   only considers started requests as inflight, but this is
>   inconsistent with the legacy definition and the intention in the code.
>   This removes the started condition and instead counts all allocated
>   requests.
> 
>   Fixes: f299b7c7a9de ("blk-mq: provide internal in-flight variant")
>   Signed-off-by: Omar Sandoval <osandov@fb.com>
>   Signed-off-by: Jens Axboe <axboe@kernel.dk>
> 
> diff --git a/block/blk-mq.c b/block/blk-mq.c
> index c3621453ad87..5450cbc61f8d 100644
> --- a/block/blk-mq.c
> +++ b/block/blk-mq.c
> @@ -95,18 +95,15 @@ static void blk_mq_check_inflight(struct blk_mq_hw_ctx *hctx,
>  {
>         struct mq_inflight *mi = priv;
>  
> -       if (blk_mq_rq_state(rq) == MQ_RQ_IN_FLIGHT) {
> -               /*
> -                * index[0] counts the specific partition that was asked
> -                * for. index[1] counts the ones that are active on the
> -                * whole device, so increment that if mi->part is indeed
> -                * a partition, and not a whole device.
> -                */
> -               if (rq->part == mi->part)
> -                       mi->inflight[0]++;
> -               if (mi->part->partno)
> -                       mi->inflight[1]++;
> -       }
> +       /*
> +        * index[0] counts the specific partition that was asked for. index[1]
> +        * counts the ones that are active on the whole device, so increment
> +        * that if mi->part is indeed a partition, and not a whole device.
> +        */
> +       if (rq->part == mi->part)
> +               mi->inflight[0]++;
> +       if (mi->part->partno)
> +               mi->inflight[1]++;
>  }
>  
>  void blk_mq_in_flight(struct request_queue *q, struct hd_struct *part,
> 
> 
> 
> If I get it right, when the disk is idle, and some request is allocated,
> part_round_stats() with this commit will now add all ticks between
> previous I/O and current time (now - part->stamp) to io_ticks.
> 
> Before the commit, part_round_stats() would only update part->stamp when
> called after request allocation.
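
The hypothesis above can be illustrated with a toy model. This is a simplified sketch under stated assumptions, not the kernel code: struct part_model, round_stats() and simulate() are hypothetical names, time is an abstract tick counter, and the only behaviour modelled is "add (now - stamp) to io_ticks when anything is in flight, then update stamp". The count_allocated flag stands for whether a freshly allocated, not yet started request is counted as in flight at I/O start.

```c
#include <assert.h>

/* Toy stand-in for the per-partition accounting state. */
struct part_model {
    unsigned long stamp;    /* time of the last accounting update */
    unsigned long io_ticks; /* accumulated "busy" time */
};

/* Simplified rounding step: charge the elapsed time only if I/O is in flight. */
static void round_stats(struct part_model *p, unsigned long now,
                        unsigned int inflight)
{
    assert(now >= p->stamp); /* model time never goes backwards */
    if (inflight)
        p->io_ticks += now - p->stamp;
    p->stamp = now;
}

/*
 * Issue n_requests one at a time on an otherwise idle disk. Each request is
 * preceded by `gap` idle ticks and takes `service` ticks to complete.
 * Returns the accumulated io_ticks.
 */
unsigned long simulate(int count_allocated, unsigned int n_requests,
                       unsigned long gap, unsigned long service)
{
    struct part_model p = { 0, 0 };
    unsigned long now = 0;
    unsigned int i;

    for (i = 0; i < n_requests; i++) {
        now += gap; /* disk sits idle */

        /* I/O start: the request has been allocated but not yet started. */
        round_stats(&p, now, count_allocated ? 1 : 0);

        now += service; /* request is being serviced */

        /* I/O completion: the request is in flight under either definition. */
        round_stats(&p, now, 1);
    }
    return p.io_ticks;
}
```

In this model, ten requests with 999 idle ticks before each and 1 tick of service time span 10000 ticks in total. simulate(0, 10, 999, 1) accumulates only the 10 ticks of service time, while simulate(1, 10, 999, 1) accumulates all 10000 ticks, which is the kind of near-100% utilization on an almost idle disk described in the report.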

So this is "false" reporting?  There's really no load?

> Any thoughts how to best fix this in 4.19?
> I see the io_ticks accounting has been reworked in 5.0, do we need to
> backport those to 4.19, or any ill effects if this commit is reverted in
> 4.19?

Do you see this issue in 5.4?  What's keeping you from moving to 5.4.y?

And if this isn't a real issue, is that a problem too?

As you can test this, if you have a set of patches backported that could
resolve it, can you send them to us?

thanks,

greg k-h

Thread overview: 4+ messages
2020-03-25 10:02 4.19 LTS high /proc/diskstats io_ticks Rantala, Tommi T. (Nokia - FI/Espoo)
2020-03-25 10:07 ` Greg KH [this message]
2020-03-25 11:22   ` Rantala, Tommi T. (Nokia - FI/Espoo)
2020-03-25 15:24     ` Jens Axboe