linux-kernel.vger.kernel.org archive mirror
* Very slow qemu device access
@ 2020-08-07 17:44 Matthew Wilcox
  2020-08-09  2:40 ` Ming Lei
From: Matthew Wilcox @ 2020-08-07 17:44 UTC
  To: Ming Lei; +Cc: Jens Axboe, linux-block, linux-kernel, linux-fsdevel


Everything starts going very slowly after this commit:

commit 37f4a24c2469a10a4c16c641671bd766e276cf9f (refs/bisect/bad)
Author: Ming Lei <ming.lei@redhat.com>
Date:   Tue Jun 30 22:03:57 2020 +0800

    blk-mq: centralise related handling into blk_mq_get_driver_tag
    
    Move .nr_active update and request assignment into blk_mq_get_driver_tag(),
    all are good to do during getting driver tag.
    
    Meantime blk-flush related code is simplified and flush request needn't
    to update the request table manually any more.
    
    Signed-off-by: Ming Lei <ming.lei@redhat.com>
    Cc: Christoph Hellwig <hch@infradead.org>
    Signed-off-by: Jens Axboe <axboe@kernel.dk>

By the time xfstests gets to generic/007, tasks are blocking while trying
to get tags:

root@bobo-kvm:~# cat /proc/9530/stack
[<0>] blk_mq_get_tag+0x109/0x250
[<0>] __blk_mq_alloc_request+0x67/0xf0
[<0>] blk_mq_submit_bio+0xee/0x560
[<0>] submit_bio_noacct+0x3a3/0x410
[<0>] submit_bio+0x33/0xf0
[<0>] submit_bh_wbc.isra.0+0x139/0x160
[<0>] block_read_full_page+0x357/0x4a0
[<0>] blkdev_readpage+0x13/0x20
[<0>] do_read_cache_page+0x557/0x860
...

Maybe tags aren't getting freed properly?  Or waiters aren't being woken
up promptly?

(That trace is from current Linus head; I bisected back to this commit.)
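
Each frame in /proc/<pid>/stack has the form "[<addr>] symbol+0xoffset/0xsize". As a minimal sketch of how such a line could be split up when scanning many tasks for the same blocking point, the user-space helper below parses one frame. The names struct stack_frame, parse_stack_line() and frame_is() are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* One parsed frame from /proc/<pid>/stack (hypothetical helper type). */
struct stack_frame {
	char symbol[128];     /* function name, e.g. "blk_mq_get_tag" */
	unsigned int offset;  /* offset into the function */
	unsigned int size;    /* total size of the function */
};

/*
 * Parse a line such as "[<0>] blk_mq_get_tag+0x109/0x250".
 * Returns 0 on success and -1 if the line is not a stack frame.
 */
int parse_stack_line(const char *line, struct stack_frame *out)
{
	assert(line != NULL && out != NULL);

	/* Skip the bracketed address, then read symbol, offset and size. */
	int n = sscanf(line, "[<%*[^>]>] %127[^+]+%x/%x",
		       out->symbol, &out->offset, &out->size);

	return n == 3 ? 0 : -1;
}

/* True if the frame belongs to the named function. */
int frame_is(const struct stack_frame *f, const char *name)
{
	return strcmp(f->symbol, name) == 0;
}
```

Feeding the first line of the trace above to parse_stack_line() and then calling frame_is(&frame, "blk_mq_get_tag") is one way to confirm that a task is waiting in the tag allocator.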


* Re: Very slow qemu device access
  2020-08-07 17:44 Very slow qemu device access Matthew Wilcox
@ 2020-08-09  2:40 ` Ming Lei
  2020-08-09 14:25   ` Matthew Wilcox
From: Ming Lei @ 2020-08-09  2:40 UTC
  To: Matthew Wilcox; +Cc: Jens Axboe, linux-block, linux-kernel, linux-fsdevel

Hello Matthew,

On Fri, Aug 07, 2020 at 06:44:16PM +0100, Matthew Wilcox wrote:
> 
> Everything starts going very slowly after this commit:
> 
> commit 37f4a24c2469a10a4c16c641671bd766e276cf9f (refs/bisect/bad)
> Author: Ming Lei <ming.lei@redhat.com>
> Date:   Tue Jun 30 22:03:57 2020 +0800
> 
>     blk-mq: centralise related handling into blk_mq_get_driver_tag

Yes, the above is a known bad commit, which was reverted in
4e2f62e566b5 ("Revert "blk-mq: put driver tag when this request is completed"").

The fixed version of 'blk-mq: centralise related handling into blk_mq_get_driver_tag'
was finally merged as 568f27006577 ("blk-mq: centralise related handling into blk_mq_get_driver_tag").

So please test either 4e2f62e566b5 or 568f27006577 and see whether the
issue still occurs.


Thanks,
Ming



* Re: Very slow qemu device access
  2020-08-09  2:40 ` Ming Lei
@ 2020-08-09 14:25   ` Matthew Wilcox
  2020-08-10  3:10     ` Ming Lei
From: Matthew Wilcox @ 2020-08-09 14:25 UTC
  To: Ming Lei; +Cc: Jens Axboe, linux-block, linux-kernel, linux-fsdevel

On Sun, Aug 09, 2020 at 10:40:05AM +0800, Ming Lei wrote:
> Hello Matthew,
> 
> On Fri, Aug 07, 2020 at 06:44:16PM +0100, Matthew Wilcox wrote:
> > 
> > Everything starts going very slowly after this commit:
> > 
> > commit 37f4a24c2469a10a4c16c641671bd766e276cf9f (refs/bisect/bad)
> > Author: Ming Lei <ming.lei@redhat.com>
> > Date:   Tue Jun 30 22:03:57 2020 +0800
> > 
> >     blk-mq: centralise related handling into blk_mq_get_driver_tag
> 
> Yeah, the above is one known bad commit, which is reverted in
> 4e2f62e566b5 ("Revert "blk-mq: put driver tag when this request is completed"")
> 
> Finally the fixed patch of 'blk-mq: centralise related handling into blk_mq_get_driver_tag'
> is merged as 568f27006577 ("blk-mq: centralise related handling into blk_mq_get_driver_tag").
> 
> So please test either 4e2f62e566b5 or 568f27006577 and see if there is
> such issue.

4e2f62e566b5 is good
568f27006577 is bad

As before, the stack points to the tag code:

# cat /proc/9986/stack
[<0>] blk_mq_get_tag+0x109/0x250
[<0>] __blk_mq_alloc_request+0x67/0xf0
[<0>] blk_mq_submit_bio+0xee/0x560
[<0>] submit_bio_noacct+0x3a3/0x410
[<0>] submit_bio+0x33/0xf0

It's not nice to leave these little landmines in the git history for
bisect to fall into ;-(


* Re: Very slow qemu device access
  2020-08-09 14:25   ` Matthew Wilcox
@ 2020-08-10  3:10     ` Ming Lei
  2020-08-10  3:22       ` Matthew Wilcox
From: Ming Lei @ 2020-08-10  3:10 UTC
  To: Matthew Wilcox; +Cc: Jens Axboe, linux-block, linux-kernel, linux-fsdevel

On Sun, Aug 09, 2020 at 03:25:22PM +0100, Matthew Wilcox wrote:
> On Sun, Aug 09, 2020 at 10:40:05AM +0800, Ming Lei wrote:
> > Hello Matthew,
> > 
> > On Fri, Aug 07, 2020 at 06:44:16PM +0100, Matthew Wilcox wrote:
> > > 
> > > Everything starts going very slowly after this commit:
> > > 
> > > commit 37f4a24c2469a10a4c16c641671bd766e276cf9f (refs/bisect/bad)
> > > Author: Ming Lei <ming.lei@redhat.com>
> > > Date:   Tue Jun 30 22:03:57 2020 +0800
> > > 
> > >     blk-mq: centralise related handling into blk_mq_get_driver_tag
> > 
> > Yeah, the above is one known bad commit, which is reverted in
> > 4e2f62e566b5 ("Revert "blk-mq: put driver tag when this request is completed"")
> > 
> > Finally the fixed patch of 'blk-mq: centralise related handling into blk_mq_get_driver_tag'
> > is merged as 568f27006577 ("blk-mq: centralise related handling into blk_mq_get_driver_tag").
> > 
> > So please test either 4e2f62e566b5 or 568f27006577 and see if there is
> > such issue.
> 
> 4e2f62e566b5 is good
> 568f27006577 is bad

Please try the following patch. We shouldn't take the flush request
into account in driver tag allocation, because it always shares the
data request's tag:

From d508415eee08940ff9c78efe0eddddf594afdb94 Mon Sep 17 00:00:00 2001
From: Ming Lei <ming.lei@redhat.com>
Date: Mon, 10 Aug 2020 11:06:15 +0800
Subject: [PATCH] block: don't double account of flush request's driver tag

In case of none scheduler, we share data request's driver tag for
flush request, so have to mark the flush request as INFLIGHT for
avoiding double account of this driver tag.

Signed-off-by: Ming Lei <ming.lei@redhat.com>
Fixes: 568f27006577 ("blk-mq: centralise related handling into blk_mq_get_driver_tag")
Reported-by: Matthew Wilcox <willy@infradead.org>
---
 block/blk-flush.c | 11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/block/blk-flush.c b/block/blk-flush.c
index 6e1543c10493..53abb5c73d99 100644
--- a/block/blk-flush.c
+++ b/block/blk-flush.c
@@ -308,9 +308,16 @@ static void blk_kick_flush(struct request_queue *q, struct blk_flush_queue *fq,
 	flush_rq->mq_ctx = first_rq->mq_ctx;
 	flush_rq->mq_hctx = first_rq->mq_hctx;
 
-	if (!q->elevator)
+	if (!q->elevator) {
 		flush_rq->tag = first_rq->tag;
-	else
+
+		/*
+		 * We borrow data request's driver tag, so have to mark
+		 * this flush request as INFLIGHT for avoiding double
+		 * account of this driver tag
+		 */
+		flush_rq->rq_flags |= RQF_MQ_INFLIGHT;
+	} else
 		flush_rq->internal_tag = first_rq->internal_tag;
 
 	flush_rq->cmd_flags = REQ_OP_FLUSH | REQ_PREFLUSH;
-- 
2.25.2

 

thanks,
Ming



* Re: Very slow qemu device access
  2020-08-10  3:10     ` Ming Lei
@ 2020-08-10  3:22       ` Matthew Wilcox
From: Matthew Wilcox @ 2020-08-10  3:22 UTC
  To: Ming Lei; +Cc: Jens Axboe, linux-block, linux-kernel, linux-fsdevel

On Mon, Aug 10, 2020 at 11:10:49AM +0800, Ming Lei wrote:
> On Sun, Aug 09, 2020 at 03:25:22PM +0100, Matthew Wilcox wrote:
> > On Sun, Aug 09, 2020 at 10:40:05AM +0800, Ming Lei wrote:
> > > Hello Matthew,
> > > 
> > > On Fri, Aug 07, 2020 at 06:44:16PM +0100, Matthew Wilcox wrote:
> > > > 
> > > > Everything starts going very slowly after this commit:
> > > > 
> > > > commit 37f4a24c2469a10a4c16c641671bd766e276cf9f (refs/bisect/bad)
> > > > Author: Ming Lei <ming.lei@redhat.com>
> > > > Date:   Tue Jun 30 22:03:57 2020 +0800
> > > > 
> > > >     blk-mq: centralise related handling into blk_mq_get_driver_tag
> > > 
> > > Yeah, the above is one known bad commit, which is reverted in
> > > 4e2f62e566b5 ("Revert "blk-mq: put driver tag when this request is completed"")
> > > 
> > > Finally the fixed patch of 'blk-mq: centralise related handling into blk_mq_get_driver_tag'
> > > is merged as 568f27006577 ("blk-mq: centralise related handling into blk_mq_get_driver_tag").
> > > 
> > > So please test either 4e2f62e566b5 or 568f27006577 and see if there is
> > > such issue.
> > 
> > 4e2f62e566b5 is good
> > 568f27006577 is bad
> 
> Please try the following patch, and we shouldn't take flush request
> account into driver tag allocation, because it always shares the
> data request's tag:
> 
> From d508415eee08940ff9c78efe0eddddf594afdb94 Mon Sep 17 00:00:00 2001
> From: Ming Lei <ming.lei@redhat.com>
> Date: Mon, 10 Aug 2020 11:06:15 +0800
> Subject: [PATCH] block: don't double account of flush request's driver tag
> 
> In case of none scheduler, we share data request's driver tag for
> flush request, so have to mark the flush request as INFLIGHT for
> avoiding double account of this driver tag.

Yes, this fixes the problem.  Thanks!


