From: Jens Axboe <axboe@kernel.dk>
To: Paolo Valente <paolo.valente@linaro.org>,
	Mike Galbraith <efault@gmx.de>, Christoph Hellwig <hch@lst.de>
Cc: linux-block <linux-block@vger.kernel.org>,
	Ulf Hansson <ulf.hansson@linaro.org>,
	LKML <linux-kernel@vger.kernel.org>,
	Linus Walleij <linus.walleij@linaro.org>,
	Oleksandr Natalenko <oleksandr@natalenko.name>
Subject: Re: bug in tag handling in blk-mq?
Date: Mon, 7 May 2018 10:39:12 -0600
Message-ID: <7760d23b-7a4c-a645-1c7a-da7569bb44dc@kernel.dk>
In-Reply-To: <999DF2B3-4EE8-4BDF-89C5-EB0C2D8BF69E@linaro.org>

On 5/7/18 8:03 AM, Paolo Valente wrote:
> Hi Jens, Christoph, all,
> Mike Galbraith has been experiencing hangs, on blk_mq_get_tag, only
> with bfq [1].  Symptoms seem to clearly point to a problem in I/O-tag
> handling, triggered by bfq because it limits the number of tags for
> async and sync write requests (in bfq_limit_depth).
> 
> Fortunately, I just happened to find a way to apparently confirm it.
> With the following one-liner for block/bfq-iosched.c:
> 
> @@ -554,8 +554,7 @@ static void bfq_limit_depth(unsigned int op, struct blk_mq_alloc_data *data)
>         if (unlikely(bfqd->sb_shift != bt->sb.shift))
>                 bfq_update_depths(bfqd, bt);
>  
> -       data->shallow_depth =
> -               bfqd->word_depths[!!bfqd->wr_busy_queues][op_is_sync(op)];
> +       data->shallow_depth = 1;
>  
>         bfq_log(bfqd, "[%s] wr_busy %d sync %d depth %u",
>                         __func__, bfqd->wr_busy_queues, op_is_sync(op),
> 
> Mike's machine now crashes soon and systematically, while nothing bad
> happens on my machines, even with heavy workloads (apart from an
> expected throughput drop).
> 
> This change simply reduces to 1 the maximum possible value for the sum
> of the number of async requests and of sync write requests.
> 
> This email is basically a request for help to knowledgeable people.  To
> start, here are my first doubts/questions:
> 1) Just to be certain, I guess it is not normal for blk-mq to hang if
> at most one async or sync write request can be in flight, right?
> 2) Do you have any hints on where I could look to chase this bug?
> Of course, the bug may be in bfq, i.e., it may be some unrelated
> bfq bug that indirectly causes this hang in blk-mq.  But it is hard
> for me to understand how.

CC Omar, since he implemented the shallow part. But we'll need some
traces to show where we are hung, probably also the contents of the
/sys/kernel/debug/block/<dev>/ directory. For the crash mentioned, a
trace as well. Otherwise we'll be wasting a lot of time on this.
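
One possible way to collect that directory in a form that can be pasted
into a reply is sketched below. It assumes debugfs is mounted at
/sys/kernel/debug, that you run it as root, and that find and grep
support the options used; dump_blk_debugfs is a hypothetical helper name
and sda is only a placeholder device.

```shell
#!/bin/sh
# Sketch: print every readable file under a blk-mq debugfs directory
# as "path:line" entries, so the whole state fits in one attachment.
# dump_blk_debugfs is a hypothetical helper name, not an existing tool.
dump_blk_debugfs() {
    dir="$1"
    if [ ! -d "$dir" ]; then
        echo "no such directory: $dir (is debugfs mounted?)" >&2
        return 1
    fi
    # -a treats every file as text, -H prefixes each line with the
    # file name, and the pattern "." matches every non-empty line.
    # Read errors on individual files are silenced.
    find "$dir" -type f -exec grep -aH . {} \; 2>/dev/null
    return 0
}

# Example (placeholder device name, run while the hang is in progress):
#   dump_blk_debugfs /sys/kernel/debug/block/sda > blk-debugfs.txt
```

Taking the dump while the tasks are actually stuck matters, since the
files reflect the live state at the time they are read.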

Is there a reproducer?

-- 
Jens Axboe
