From: Doug Anderson <dianders@chromium.org>
To: Paolo Valente <paolo.valente@linaro.org>
Cc: Jens Axboe <axboe@kernel.dk>, Ming Lei <ming.lei@redhat.com>,
	"James E.J. Bottomley" <jejb@linux.ibm.com>,
	"Martin K. Petersen" <martin.petersen@oracle.com>,
	linux-block <linux-block@vger.kernel.org>,
	Guenter Roeck <groeck@chromium.org>,
	linux-scsi@vger.kernel.org, Salman Qazi <sqazi@google.com>,
	LKML <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH 2/2] scsi: core: Fix stall if two threads request budget at the same time
Date: Thu, 2 Apr 2020 08:52:59 -0700
Message-ID: <CAD=FV=XX+S==bz1OQFYyQxNmsPdMQoJ7AJSs++D9dP_r0Rc93A@mail.gmail.com>
In-Reply-To: <1D4B63FC-FA4B-4C73-B70F-6639CC41D3A6@linaro.org>

Hi,

On Wed, Apr 1, 2020 at 12:48 AM Paolo Valente <paolo.valente@linaro.org> wrote:
>
> >> 28977309us - PID 2167 got the budget.
> >> 28977518us - BFQ told PID 2167 that there was nothing to dispatch.
> >> 28977702us - BFQ idle timer fires.
> >> 28977725us - We start to try to dispatch as a result of BFQ's idle timer.
> >> 28977732us - Dispatching that was a result of BFQ's idle timer can't get
> >>             budget and thus does nothing.
> >> 28977780us - PID 2167 put the budget and exits since there was nothing
> >>             to dispatch.
> >>
> >> This is only one particular trace, but in this case BFQ did attempt to
> >> rerun the queue after it returned NULL, but that ran almost
> >> immediately after it returned NULL and thus ran into the race.  :(
> >
> > OK, and then it doesn't trigger again? It's key that it keeps doing this
> > timeout and re-dispatch if it fails, not just once.
> >
>
> The goal of BFQ's timer is to make BFQ switch from non-work-conserving
> to work-conserving mode, just because not doing so would cause a
> stall.  In contrast, it sounds a little weird that an I/O scheduler
> systematically kicks I/O periodically (how can BFQ know when no more
> kicking is needed?).  IOW, it doesn't seem very robust that blk-mq may
> need a series of periodic kicks to finally restart, like a flooded
> engine.
>
> Compared with this solution, I'd still prefer one where BFQ doesn't
> trigger this blk-mq stall at all.

I spent more time thinking about this / testing things.  Probably the
best summary of my thoughts can be found at
<https://crbug.com/1061950#c79>.  The quick summary is that I believe
the problem is that BFQ has faith that, when it calls
blk_mq_run_hw_queues(), it will eventually be called back at least
once to dispatch.  That doesn't always happen due to the race we're
trying to solve here.  If we fix the race / make
blk_mq_run_hw_queues() reliable, then I don't think there's a need for
BFQ to implement some type of timeout/retry mechanism.
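
To make the race concrete, here is roughly the pattern involved
(simplified sketch, not the actual blk-mq code paths;
get_dispatch_budget() / put_dispatch_budget() are stand-in names for
the real budget helpers):

static bool dispatch_one(struct blk_mq_hw_ctx *hctx)
{
	struct elevator_queue *e = hctx->queue->elevator;
	struct request *rq;

	if (!get_dispatch_budget(hctx))		/* stand-in helper */
		return false;	/* someone else holds the budget */

	rq = e->type->ops.dispatch_request(hctx);
	if (!rq) {
		/*
		 * The scheduler had nothing at this instant.  If it
		 * becomes ready in this window and its kick via
		 * blk_mq_run_hw_queues() runs right now, the kick
		 * fails to get budget and does nothing.  We then put
		 * the budget and exit, and the request is stranded
		 * until something else happens to run the queue.
		 */
		put_dispatch_budget(hctx);	/* stand-in helper */
		return false;
	}
	/* ... hand rq to the driver ... */
	return true;
}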


> > But BFQ really should be smarter here. It's the same caller etc that
> > asks whether it has work and whether it can dispatch, yet the answer is
> > different. That's just kind of silly, and it'd make more sense if BFQ
> > actually implemented the ->has_work() as a "would I actually dispatch
> > for this guy, now".

I prototyped this and I think it would solve the problem (though I
haven't had time to do extensive testing yet).  It certainly makes
BFQ's has_work() more expensive in some cases, but it might be worth
it.  Someone set up to do benchmarking would need to say for sure.
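
Roughly, the idea was something like the below (untested sketch, not
the exact prototype; would_bfq_dispatch_now() is a hypothetical
side-effect-free variant of the selection logic, since
bfq_select_queue() itself expires and switches in-service queues as
it runs):

static bool bfq_has_work(struct blk_mq_hw_ctx *hctx)
{
	struct bfq_data *bfqd = hctx->queue->elevator->elevator_data;
	bool has_work;

	if (!list_empty_careful(&bfqd->dispatch))
		return true;

	/*
	 * The expensive part: instead of the cheap "is any bfq_queue
	 * busy?" check, take bfqd->lock and ask whether dispatch
	 * would actually hand out a request right now.
	 */
	spin_lock_irq(&bfqd->lock);
	has_work = would_bfq_dispatch_now(bfqd);	/* hypothetical */
	spin_unlock_irq(&bfqd->lock);

	return has_work;
}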

However, I think I've figured out an inexpensive / lightweight
solution that means we can let has_work() be inexact.  It's mostly the
same as this patch but implemented at the blk-mq layer (not the SCSI
layer) and doesn't add a spinlock.  I'll post a v2 and you can see if
you hate it or if it looks OK.  You can find it at:

https://lore.kernel.org/r/20200402085050.v2.2.I28278ef8ea27afc0ec7e597752a6d4e58c16176f@changeid
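
The gist (hand-wavy sketch only, see the patch above for the real
thing; BLK_MQ_S_NEED_DISPATCH is a made-up flag name): remember that
someone failed to get budget, and have whoever releases the budget
without dispatching re-run the queue.

/* On the side that loses the budget race: */
if (!blk_mq_get_dispatch_budget(hctx)) {
	set_bit(BLK_MQ_S_NEED_DISPATCH, &hctx->state);
	/*
	 * Re-check: the holder may have put the budget between our
	 * failed attempt and the set_bit(), in which case nobody
	 * would ever see the flag.
	 */
	if (!blk_mq_get_dispatch_budget(hctx))
		return;	/* holder will see the flag and rerun */
	clear_bit(BLK_MQ_S_NEED_DISPATCH, &hctx->state);
}

/* On the side that puts the budget without dispatching anything: */
blk_mq_put_dispatch_budget(hctx);
if (test_and_clear_bit(BLK_MQ_S_NEED_DISPATCH, &hctx->state))
	blk_mq_run_hw_queues(hctx->queue, true);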

-Doug

Thread overview: 15+ messages
2020-03-30 14:49 [PATCH 0/2] blk-mq: Fix two causes of IO stalls found in reboot testing Douglas Anderson
2020-03-30 14:49 ` [PATCH 1/2] blk-mq: In blk_mq_dispatch_rq_list() "no budget" is a reason to kick Douglas Anderson
2020-03-30 14:49 ` [PATCH 2/2] scsi: core: Fix stall if two threads request budget at the same time Douglas Anderson
2020-03-31  1:41   ` Ming Lei
2020-03-31  2:15     ` Doug Anderson
2020-03-31  2:58       ` Ming Lei
2020-03-31  4:05         ` Doug Anderson
2020-03-31 18:07     ` Paolo Valente
2020-03-31 18:26       ` Jens Axboe
2020-03-31 23:51         ` Doug Anderson
2020-04-01  1:21           ` Jens Axboe
2020-04-01  7:49             ` Paolo Valente
2020-04-02 15:52               ` Doug Anderson [this message]
2020-04-01  2:04           ` Ming Lei
2020-04-01  2:32             ` Doug Anderson
