From: Shinichiro Kawasaki <shinichiro.kawasaki@wdc.com>
To: Ming Lei <ming.lei@redhat.com>
Cc: "linux-block@vger.kernel.org" <linux-block@vger.kernel.org>,
	Jens Axboe <axboe@kernel.dk>,
	Damien Le Moal <Damien.LeMoal@wdc.com>
Subject: Re: commit 01e99aeca397 causes longer runtime of block/004
Date: Fri, 6 Mar 2020 06:06:23 +0000
Message-ID: <20200306060622.t2jl7qkzvkwvvcbx@shindev.dhcp.fujisawa.hgst.com>
In-Reply-To: <20200305024808.GA26733@ming.t460p>

On Mar 05, 2020 / 10:48, Ming Lei wrote:
> Hi Shinichiro,
> 
> On Thu, Mar 05, 2020 at 01:19:02AM +0000, Shinichiro Kawasaki wrote:
> > On Mar 04, 2020 / 17:53, Ming Lei wrote:
> > > On Wed, Mar 04, 2020 at 06:11:37AM +0000, Shinichiro Kawasaki wrote:
> > > > On Mar 04, 2020 / 11:46, Ming Lei wrote:
> > > > > On Wed, Mar 04, 2020 at 02:38:43AM +0000, Shinichiro Kawasaki wrote:
> > > > > > I noticed that blktests block/004 takes longer runtime with 5.6-rc4 than
> > > > > > 5.6-rc3, and found that the commit 01e99aeca397 ("blk-mq: insert passthrough
> > > > > > request into hctx->dispatch directly") triggers it.
> > > > > > 
> > > > > > The longer runtime was observed with dm-linear device which maps SATA SMR HDD
> > > > > > connected via AHCI. It was not observed with dm-linear on SAS/SATA SMR HDDs
> > > > > > connected via SAS-HBA. Not observed with dm-linear on non-SMR HDDs either.
> > > > > > 
> > > > > > > Before the commit, block/004 took around 130 seconds. After the commit, it takes
> > > > > > > around 300 seconds. I need to dig into the details to understand why the
> > > > > > > commit makes the test case take longer.
> > > > > > 
> > > > > > The test case block/004 does "flush intensive workload". Is this longer runtime
> > > > > > expected?
> > > > > 
> > > > > The following patch might address this issue:
> > > > > 
> > > > > https://lore.kernel.org/linux-block/20200207190416.99928-1-sqazi@google.com/#t
> > > > > 
> > > > > Please test and provide us the result.
> > > > > 
> > > > > thanks,
> > > > > Ming
> > > > >
> > > > 
> > > > Hi Ming,
> > > > 
> > > > > I applied the patch to 5.6-rc4 but still observed the longer runtime of
> > > > > block/004. It still takes around 300 seconds.
> > > 
> > > Hello Shinichiro,
> > > 
> > > > block/004 only sends 1564 sync randwrites, so even 130s seems quite
> > > > slow.
> > > 
> > > > There are two related effects in that commit for your issue:
> > > 
> > > 1) 'at_head' is applied in blk_mq_sched_insert_request() for flush
> > > request
> > > 
> > > 2) all IO is added back to tail of hctx->dispatch after .queue_rq()
> > > returns STS_RESOURCE
> > > 
> > > > It seems more related to 2), given that you can't reproduce the issue on
> > > > SAS.
> > > 
> > > So please test the following two patches, and see which one makes a
> > > difference for you.
> > > 
> > > > BTW, neither patch looks reasonable as a fix; they are just for narrowing
> > > > down the issue.
> > > 
> > > 1) patch 1
> > > 
> > > diff --git a/block/blk-mq-sched.c b/block/blk-mq-sched.c
> > > index 856356b1619e..86137c75283c 100644
> > > --- a/block/blk-mq-sched.c
> > > +++ b/block/blk-mq-sched.c
> > > @@ -398,7 +398,7 @@ void blk_mq_sched_insert_request(struct request *rq, bool at_head,
> > >  	WARN_ON(e && (rq->tag != -1));
> > >  
> > >  	if (blk_mq_sched_bypass_insert(hctx, !!e, rq)) {
> > > -		blk_mq_request_bypass_insert(rq, at_head, false);
> > > +		blk_mq_request_bypass_insert(rq, true, false);
> > >  		goto run;
> > >  	}
> > 
> > Ming, thank you for the trial patches.
> > > This "patch 1" reduced the runtime to about the same as rc3.
> > 
> > > 
> > > 
> > > 2) patch 2
> > > diff --git a/block/blk-mq.c b/block/blk-mq.c
> > > index d92088dec6c3..447d5cb39832 100644
> > > --- a/block/blk-mq.c
> > > +++ b/block/blk-mq.c
> > > @@ -1286,7 +1286,7 @@ bool blk_mq_dispatch_rq_list(struct request_queue *q, struct list_head *list,
> > >  			q->mq_ops->commit_rqs(hctx);
> > >  
> > >  		spin_lock(&hctx->lock);
> > > -		list_splice_tail_init(list, &hctx->dispatch);
> > > +		list_splice_init(list, &hctx->dispatch);
> > >  		spin_unlock(&hctx->lock);
> > >  
> > >  		/*
> > 
> > This patch 2 didn't reduce the runtime.
> > 
> > > I hope this report helps.
> 
> Your feedback does help. Please test the following patch:

Hi Ming, thank you for the patch. I applied it on top of rc4 and confirmed
that it reduces the runtime to about the same as rc3. Good.

-- 
Best Regards,
Shin'ichiro Kawasaki

> 
> diff --git a/block/blk-flush.c b/block/blk-flush.c
> index 5cc775bdb06a..68957802f96f 100644
> --- a/block/blk-flush.c
> +++ b/block/blk-flush.c
> @@ -334,7 +334,7 @@ static void blk_kick_flush(struct request_queue *q, struct blk_flush_queue *fq,
>  	flush_rq->rq_disk = first_rq->rq_disk;
>  	flush_rq->end_io = flush_end_io;
>  
> -	blk_flush_queue_rq(flush_rq, false);
> +	blk_flush_queue_rq(flush_rq, true);
>  }
>  
>  static void mq_flush_data_end_io(struct request *rq, blk_status_t error)
> diff --git a/block/blk-mq.c b/block/blk-mq.c
> index d92088dec6c3..56d61b693f2e 100644
> --- a/block/blk-mq.c
> +++ b/block/blk-mq.c
> @@ -724,6 +724,8 @@ static void blk_mq_requeue_work(struct work_struct *work)
>  	spin_unlock_irq(&q->requeue_lock);
>  
>  	list_for_each_entry_safe(rq, next, &rq_list, queuelist) {
> +		bool at_head = !!(rq->rq_flags & RQF_SOFTBARRIER);
> +
>  		if (!(rq->rq_flags & (RQF_SOFTBARRIER | RQF_DONTPREP)))
>  			continue;
>  
> @@ -735,9 +737,9 @@ static void blk_mq_requeue_work(struct work_struct *work)
>  		 * merge.
>  		 */
>  		if (rq->rq_flags & RQF_DONTPREP)
> -			blk_mq_request_bypass_insert(rq, false, false);
> +			blk_mq_request_bypass_insert(rq, at_head, false);
>  		else
> -			blk_mq_sched_insert_request(rq, true, false, false);
> +			blk_mq_sched_insert_request(rq, at_head, false, false);
>  	}
>  
>  	while (!list_empty(&rq_list)) {
> 
> Thanks,
> Ming
> 
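
For readers following the thread, here is a toy model (not kernel code) of
the effect that the 'at_head' argument discussed above can have. It only
illustrates queue ordering: a flush request inserted at the tail of a
dispatch list sits behind every request already queued, while one inserted
at the head goes first. All names (toy_queue, toy_insert, toy_position,
toy_flush_wait) are hypothetical and exist only for this sketch; the real
code uses struct list_head and hctx->dispatch, and the actual cause of the
block/004 slowdown may involve more than ordering.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: a small fixed-size array, not the kernel's list_head. */
#define TOY_QUEUE_CAP 16

struct toy_queue {
	int ids[TOY_QUEUE_CAP];	/* request ids; index 0 is dispatched first */
	size_t len;
};

/*
 * Insert a request id at the head (at_head != 0) or at the tail.
 * Returns 0 on success, -1 if the queue is full.
 */
int toy_insert(struct toy_queue *q, int id, int at_head)
{
	size_t i;

	assert(q != NULL);
	if (q->len >= TOY_QUEUE_CAP)
		return -1;

	if (at_head) {
		/* Shift existing entries back by one to free slot 0. */
		for (i = q->len; i > 0; i--)
			q->ids[i] = q->ids[i - 1];
		q->ids[0] = id;
	} else {
		q->ids[q->len] = id;
	}
	q->len++;
	return 0;
}

/* Number of requests dispatched before 'id', or -1 if it is not queued. */
int toy_position(const struct toy_queue *q, int id)
{
	size_t i;

	assert(q != NULL);
	for (i = 0; i < q->len; i++)
		if (q->ids[i] == id)
			return (int)i;
	return -1;
}

/*
 * Queue 'backlog' ordinary requests (ids 1..backlog) at the tail, then
 * insert one flush request (id 0) and report how many requests are
 * ahead of it. Returns -1 if the toy queue overflows.
 */
int toy_flush_wait(int backlog, int at_head)
{
	struct toy_queue q = { .len = 0 };
	int id;

	for (id = 1; id <= backlog; id++)
		if (toy_insert(&q, id, 0) != 0)
			return -1;
	if (toy_insert(&q, 0, at_head) != 0)
		return -1;
	return toy_position(&q, 0);
}
```

With a backlog of 8 requests, toy_flush_wait(8, 0) reports 8 requests ahead
of the flush and toy_flush_wait(8, 1) reports none. This is only meant to
show the ordering difference between the two insertion modes.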
