All of lore.kernel.org
 help / color / mirror / Atom feed
From: Paolo Valente <paolo.valente@linaro.org>
To: Doug Anderson <dianders@chromium.org>
Cc: Ming Lei <ming.lei@redhat.com>, Jens Axboe <axboe@kernel.dk>,
	"James E.J. Bottomley" <jejb@linux.ibm.com>,
	"Martin K. Petersen" <martin.petersen@oracle.com>,
	Salman Qazi <sqazi@google.com>,
	linux-block <linux-block@vger.kernel.org>,
	linux-scsi@vger.kernel.org, Guenter Roeck <groeck@chromium.org>,
	Ajay Joshi <ajay.joshi@wdc.com>, Arnd Bergmann <arnd@arndb.de>,
	Bart Van Assche <bvanassche@acm.org>,
	Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>,
	Damien Le Moal <damien.lemoal@wdc.com>,
	Hou Tao <houtao1@huawei.com>,
	Pavel Begunkov <asml.silence@gmail.com>,
	Tejun Heo <tj@kernel.org>, LKML <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH v2 2/2] blk-mq: Rerun dispatching in the case of budget contention
Date: Tue, 7 Apr 2020 12:43:58 +0200	[thread overview]
Message-ID: <B7C03D1F-7048-4FDF-AAAF-BCD0F95132E6@linaro.org> (raw)
In-Reply-To: <CAD=FV=UK=4OW2Q5i2FhrJw_+A-q+R=K8E5ui-PCQXvYhDY3ZHw@mail.gmail.com>



> Il giorno 5 apr 2020, alle ore 18:16, Doug Anderson <dianders@chromium.org> ha scritto:
> 
> Hi,
> 
> On Sun, Apr 5, 2020 at 7:55 AM Paolo Valente <paolo.valente@linaro.org> wrote:
>> 
>>> Il giorno 5 apr 2020, alle ore 16:00, Doug Anderson <dianders@chromium.org> ha scritto:
>>> 
>>> Hi,
>>> 
>>> On Sun, Apr 5, 2020 at 2:15 AM Ming Lei <ming.lei@redhat.com> wrote:
>>>> 
>>>> OK, looks it isn't specific on BFQ any more.
>>>> 
>>>> Follows another candidate approach for this issue, given it is so hard
>>>> to trigger, we can make it more reliable by rerun queue when has_work()
>>>> returns true after ops->dispath_request() returns NULL.
>>>> 
>>>> diff --git a/block/blk-mq-sched.c b/block/blk-mq-sched.c
>>>> index 74cedea56034..4408e5d4fcd8 100644
>>>> --- a/block/blk-mq-sched.c
>>>> +++ b/block/blk-mq-sched.c
>>>> @@ -80,6 +80,7 @@ void blk_mq_sched_restart(struct blk_mq_hw_ctx *hctx)
>>>>       blk_mq_run_hw_queue(hctx, true);
>>>> }
>>>> 
>>>> +#define BLK_MQ_BUDGET_DELAY    3               /* ms units */
>>>> /*
>>>> * Only SCSI implements .get_budget and .put_budget, and SCSI restarts
>>>> * its queue by itself in its completion handler, so we don't need to
>>>> @@ -103,6 +104,9 @@ static void blk_mq_do_dispatch_sched(struct blk_mq_hw_ctx *hctx)
>>>>               rq = e->type->ops.dispatch_request(hctx);
>>>>               if (!rq) {
>>>>                       blk_mq_put_dispatch_budget(hctx);
>>>> +
>>>> +                       if (e->type->ops.has_work && e->type->ops.has_work(hctx))
>>>> +                               blk_mq_delay_run_hw_queue(hctx, BLK_MQ_BUDGET_DELAY);
>>> 
>>> I agree that your patch should solve the race.  With the current BFQ's
>>> has_work() it's a bit of a disaster though. It will essentially put
>>> blk-mq into a busy-wait loop (with a 3 ms delay between each poll)
>>> while BFQ's has_work() says "true" but BFQ doesn't dispatch anything.
>>> 
>>> ...so I guess the question that still needs to be answered: does
>>> has_work() need to be exact?  If so then we need the patch you propose
>>> plus one to BFQ.  If not, we should continue along the lines of my
>>> patch.
>>> 
>> 
>> Some more comments.  BFQ's I/O plugging lasts 9 ms by default.  So,
>> with this last Ming's patch, BFQ may happen to be polled every 3ms,
>> for at most three times.
> 
> Ah!  I did not know this.  OK, then Ming's patch seems like it should
> work.  If nothing else it should fix the problem.  If this ends up
> making BFQ chew up too much CPU time then presumably someone will
> notice and BFQ's has_work() can be improved.
> 
> Ming: how do you want to proceed?  Do you want to formally post the
> patch?  Do you want me to post a v3 of my series where I place patch
> #2 with your patch?  Do you want authorship (which implies adding your
> Signed-off-by)?
> 
> 
>> On the opposite end, making bfq_has_work plugging aware costs more
>> complexity, and possibly one more lock.  While avoiding the above
>> occasional polling, this may imply a lot of overhead or CPU stalls on
>> every dispatch.
> 
> I still think it would be interesting to run performance tests with my
> proof-of-concept solution for has_work().  Even if it's not ideal,
> knowing whether performance increased, decreased, or stayed the same
> would give information about how much more effort should be put into
> this.
> 

Why not?  It is however hard to hope that we add only negligible
overhead and CPU stalls if we move from one lock-protected section per
I/O-request dispatch, to two or more lock-protected sections per
request (has_work may be invoked several times per request).

At any rate, if useful, one of the scripts in my S benchmark suite can
also measure max IOPS (when limited only by I/O processing) [1].  The
script is for Linux distros; I don't know whether it works in your
environments of interest, Doug.

Paolo

[1] https://github.com/Algodev-github/S/blob/master/throughput-sync/throughput-sync.sh

> -Doug


  reply	other threads:[~2020-04-07 10:42 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-04-02 15:51 [PATCH v2 0/2] blk-mq: Fix two causes of IO stalls found in reboot testing Douglas Anderson
2020-04-02 15:51 ` [PATCH v2 1/2] blk-mq: In blk_mq_dispatch_rq_list() "no budget" is a reason to kick Douglas Anderson
2020-04-03  1:55   ` Ming Lei
2020-04-02 15:51 ` [PATCH v2 2/2] blk-mq: Rerun dispatching in the case of budget contention Douglas Anderson
2020-04-03  1:33   ` Ming Lei
2020-04-03 15:10     ` Doug Anderson
2020-04-03 15:49       ` Doug Anderson
2020-04-05  9:14         ` Ming Lei
2020-04-05 14:00           ` Doug Anderson
2020-04-05 14:57             ` Paolo Valente
2020-04-05 16:16               ` Doug Anderson
2020-04-07 10:43                 ` Paolo Valente [this message]
2020-04-05 16:26           ` Doug Anderson
2020-04-07  2:14             ` Ming Lei
2020-04-07 22:04               ` Doug Anderson

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=B7C03D1F-7048-4FDF-AAAF-BCD0F95132E6@linaro.org \
    --to=paolo.valente@linaro.org \
    --cc=ajay.joshi@wdc.com \
    --cc=arnd@arndb.de \
    --cc=asml.silence@gmail.com \
    --cc=axboe@kernel.dk \
    --cc=bvanassche@acm.org \
    --cc=chaitanya.kulkarni@wdc.com \
    --cc=damien.lemoal@wdc.com \
    --cc=dianders@chromium.org \
    --cc=groeck@chromium.org \
    --cc=houtao1@huawei.com \
    --cc=jejb@linux.ibm.com \
    --cc=linux-block@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-scsi@vger.kernel.org \
    --cc=martin.petersen@oracle.com \
    --cc=ming.lei@redhat.com \
    --cc=sqazi@google.com \
    --cc=tj@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.