linux-block.vger.kernel.org archive mirror
* [PATCH RESEND] blk-mq: insert request not through ->queue_rq into sw/scheduler queue
@ 2020-08-18  9:07 Ming Lei
  2020-08-18 14:50 ` Jens Axboe
  0 siblings, 1 reply; 5+ messages in thread
From: Ming Lei @ 2020-08-18  9:07 UTC
  To: Jens Axboe
  Cc: linux-block, Ming Lei, Christoph Hellwig, Bart Van Assche, Mike Snitzer

Commit c616cbee97ae ("blk-mq: punt failed direct issue to dispatch list")
was supposed to add only requests that have been through ->queue_rq() to
the hw queue dispatch list; however, it also adds requests that merely
ran out of budget or a driver tag. That basically bypasses request
merging and causes too many requests to be dispatched to the LLD, so
%system is unnecessarily increased.

Fix this issue by inserting requests that have not been through
->queue_rq() into the sw/scheduler queue instead. This is safe because
->queue_rq() has not been called on such a request yet.

High %system can be observed on Azure storvsc devices, and even soft
lockups have been observed. This patch reduces %system during heavy
sequential IO and decreases the risk of soft lockups.

Fixes: c616cbee97ae ("blk-mq: punt failed direct issue to dispatch list")
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Bart Van Assche <bvanassche@acm.org>
Cc: Mike Snitzer <snitzer@redhat.com>
---
 block/blk-mq.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index 5ac80bfac325..f50c38ccac3c 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -2039,7 +2039,8 @@ static blk_status_t __blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
 	if (bypass_insert)
 		return BLK_STS_RESOURCE;
 
-	blk_mq_request_bypass_insert(rq, false, run_queue);
+	blk_mq_sched_insert_request(rq, false, run_queue, false);
+
 	return BLK_STS_OK;
 }
 
-- 
2.25.2
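
For reference, with this hunk applied __blk_mq_try_issue_directly() reads
roughly as follows (a simplified sketch of the v5.8-era code: hctx locking
notes are trimmed and the comments are added for this summary). Every
budget or driver-tag failure reaches the insert label before ->queue_rq()
has been called:

static blk_status_t __blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
                                                struct request *rq,
                                                blk_qc_t *cookie,
                                                bool bypass_insert, bool last)
{
        struct request_queue *q = rq->q;
        bool run_queue = true;

        /* Stopped or quiesced queues always take the insert path. */
        if (blk_mq_hctx_stopped(hctx) || blk_queue_quiesced(q)) {
                run_queue = false;
                bypass_insert = false;
                goto insert;
        }

        /* With an elevator attached, normal issue goes through it. */
        if (q->elevator && !bypass_insert)
                goto insert;

        /* Out of dispatch budget: ->queue_rq() has not been called. */
        if (!blk_mq_get_dispatch_budget(hctx))
                goto insert;

        /* Out of driver tags: likewise, ->queue_rq() never ran. */
        if (!blk_mq_get_driver_tag(rq)) {
                blk_mq_put_dispatch_budget(hctx);
                goto insert;
        }

        return __blk_mq_issue_directly(hctx, rq, cookie, last);
insert:
        if (bypass_insert)
                return BLK_STS_RESOURCE;

        /* After this patch: back to the scheduler, where it can merge. */
        blk_mq_sched_insert_request(rq, false, run_queue, false);

        return BLK_STS_OK;
}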



* Re: [PATCH RESEND] blk-mq: insert request not through ->queue_rq into sw/scheduler queue
  2020-08-18  9:07 [PATCH RESEND] blk-mq: insert request not through ->queue_rq into sw/scheduler queue Ming Lei
@ 2020-08-18 14:50 ` Jens Axboe
  2020-08-18 15:20   ` Mike Snitzer
  0 siblings, 1 reply; 5+ messages in thread
From: Jens Axboe @ 2020-08-18 14:50 UTC
  To: Ming Lei; +Cc: linux-block, Christoph Hellwig, Bart Van Assche, Mike Snitzer

On 8/18/20 2:07 AM, Ming Lei wrote:
> Commit c616cbee97ae ("blk-mq: punt failed direct issue to dispatch list")
> was supposed to add only requests that have been through ->queue_rq() to
> the hw queue dispatch list; however, it also adds requests that merely
> ran out of budget or a driver tag. That basically bypasses request
> merging and causes too many requests to be dispatched to the LLD, so
> %system is unnecessarily increased.
> 
> Fix this issue by inserting requests that have not been through
> ->queue_rq() into the sw/scheduler queue instead. This is safe because
> ->queue_rq() has not been called on such a request yet.
> 
> High %system can be observed on Azure storvsc devices, and even soft
> lockups have been observed. This patch reduces %system during heavy
> sequential IO and decreases the risk of soft lockups.

Applied, thanks Ming.

-- 
Jens Axboe



* Re: [PATCH RESEND] blk-mq: insert request not through ->queue_rq into sw/scheduler queue
  2020-08-18 14:50 ` Jens Axboe
@ 2020-08-18 15:20   ` Mike Snitzer
  2020-08-18 23:52     ` Ming Lei
  0 siblings, 1 reply; 5+ messages in thread
From: Mike Snitzer @ 2020-08-18 15:20 UTC
  To: Jens Axboe, Ming Lei
  Cc: linux-block, Christoph Hellwig, Bart Van Assche, dm-devel

On Tue, Aug 18 2020 at 10:50am -0400,
Jens Axboe <axboe@kernel.dk> wrote:

> On 8/18/20 2:07 AM, Ming Lei wrote:
> > Commit c616cbee97ae ("blk-mq: punt failed direct issue to dispatch list")
> > was supposed to add only requests that have been through ->queue_rq() to
> > the hw queue dispatch list; however, it also adds requests that merely
> > ran out of budget or a driver tag. That basically bypasses request
> > merging and causes too many requests to be dispatched to the LLD, so
> > %system is unnecessarily increased.
> > 
> > Fix this issue by inserting requests that have not been through
> > ->queue_rq() into the sw/scheduler queue instead. This is safe because
> > ->queue_rq() has not been called on such a request yet.
> > 
> > High %system can be observed on Azure storvsc devices, and even soft
> > lockups have been observed. This patch reduces %system during heavy
> > sequential IO and decreases the risk of soft lockups.
> 
> Applied, thanks Ming.

Hmm, strikes me as strange that this is occurring given the direct
insertion into blk-mq queue (bypassing scheduler) is meant to avoid 2
layers of IO merging when dm-multipath is stacked on blk-mq path(s).  The
dm-mpath IO scheduler does all merging and underlying paths' blk-mq
request_queues are meant to just dispatch the top-level's requests.

So this change concerns me.  Feels like this design has broken down.

Could be that some other entry point was added for the
__blk_mq_try_issue_directly() code?  And it needs to be untangled away
from the dm-multipath use-case?

Apologies for not responding to this patch until now.

Mike



* Re: [PATCH RESEND] blk-mq: insert request not through ->queue_rq into sw/scheduler queue
  2020-08-18 15:20   ` Mike Snitzer
@ 2020-08-18 23:52     ` Ming Lei
  2020-08-19  0:20       ` Mike Snitzer
  0 siblings, 1 reply; 5+ messages in thread
From: Ming Lei @ 2020-08-18 23:52 UTC
  To: Mike Snitzer
  Cc: Jens Axboe, linux-block, Christoph Hellwig, Bart Van Assche, dm-devel

On Tue, Aug 18, 2020 at 11:20:22AM -0400, Mike Snitzer wrote:
> On Tue, Aug 18 2020 at 10:50am -0400,
> Jens Axboe <axboe@kernel.dk> wrote:
> 
> > On 8/18/20 2:07 AM, Ming Lei wrote:
> > > Commit c616cbee97ae ("blk-mq: punt failed direct issue to dispatch list")
> > > was supposed to add only requests that have been through ->queue_rq() to
> > > the hw queue dispatch list; however, it also adds requests that merely
> > > ran out of budget or a driver tag. That basically bypasses request
> > > merging and causes too many requests to be dispatched to the LLD, so
> > > %system is unnecessarily increased.
> > > 
> > > Fix this issue by inserting requests that have not been through
> > > ->queue_rq() into the sw/scheduler queue instead. This is safe because
> > > ->queue_rq() has not been called on such a request yet.
> > > 
> > > High %system can be observed on Azure storvsc devices, and even soft
> > > lockups have been observed. This patch reduces %system during heavy
> > > sequential IO and decreases the risk of soft lockups.
> > 
> > Applied, thanks Ming.
> 
> Hmm, strikes me as strange that this is occurring given the direct
> insertion into blk-mq queue (bypassing scheduler) is meant to avoid 2
> layers of IO merging when dm-multipath is stacked on blk-mq path(s).  The
> dm-mpath IO scheduler does all merging and underlying paths' blk-mq
> request_queues are meant to just dispatch the top-level's requests.
> 
> So this change concerns me.  Feels like this design has broken down.
> 

'bypass_insert' is 'true' when blk_insert_cloned_request() is
called from device mapper code, so this patch doesn't affect dm.
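
For reference, that entry point looks roughly like this in the v5.8-era
tree (a simplified sketch; srcu locking around the call is trimmed). Note
the hardcoded 'true' passed as bypass_insert, which is why a budget or
driver-tag failure is returned to the caller as BLK_STS_RESOURCE instead
of being inserted into the underlying queue's scheduler:

/* Used by blk_insert_cloned_request() for dm-mpath clones. */
blk_status_t blk_mq_request_issue_directly(struct request *rq, bool last)
{
        blk_qc_t unused_cookie;

        return __blk_mq_try_issue_directly(rq->mq_hctx, rq, &unused_cookie,
                                           true, last);
}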

> Could be that some other entry point was added for the
> __blk_mq_try_issue_directly() code?  And it needs to be untangled away
> from the dm-multipath use-case?

__blk_mq_try_issue_directly() can be called from blk-mq directly; that
is the case this patch addresses. If a request can't be queued to the
LLD because it ran out of budget or a driver tag, it should be added to
the scheduler queue to improve IO merging, and meanwhile we avoid
dispatching too many requests to the hardware.
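
The plain blk-mq entry point passes bypass_insert=false instead (again a
simplified sketch of the v5.8-era code, locking trimmed). After this
patch, a budget or driver-tag failure is inserted into the sw/scheduler
queue inside __blk_mq_try_issue_directly() itself, so a BLK_STS_RESOURCE
seen here can only come from a request that really did go through
->queue_rq(), and punting it to the hw dispatch list remains correct:

static void blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
                                      struct request *rq, blk_qc_t *cookie)
{
        blk_status_t ret = __blk_mq_try_issue_directly(hctx, rq, cookie,
                                                       false, true);

        /* ->queue_rq() ran and reported a resource shortage: punt to
         * the hw dispatch list, as commit c616cbee97ae intended. */
        if (ret == BLK_STS_RESOURCE || ret == BLK_STS_DEV_RESOURCE)
                blk_mq_request_bypass_insert(rq, false, true);
        else if (ret != BLK_STS_OK)
                blk_mq_end_request(rq, ret);
}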


Thanks,
Ming



* Re: [PATCH RESEND] blk-mq: insert request not through ->queue_rq into sw/scheduler queue
  2020-08-18 23:52     ` Ming Lei
@ 2020-08-19  0:20       ` Mike Snitzer
  0 siblings, 0 replies; 5+ messages in thread
From: Mike Snitzer @ 2020-08-19  0:20 UTC
  To: Ming Lei
  Cc: Jens Axboe, linux-block, Christoph Hellwig, Bart Van Assche, dm-devel

On Tue, Aug 18 2020 at  7:52pm -0400,
Ming Lei <ming.lei@redhat.com> wrote:

> On Tue, Aug 18, 2020 at 11:20:22AM -0400, Mike Snitzer wrote:
> > On Tue, Aug 18 2020 at 10:50am -0400,
> > Jens Axboe <axboe@kernel.dk> wrote:
> > 
> > > On 8/18/20 2:07 AM, Ming Lei wrote:
> > > > Commit c616cbee97ae ("blk-mq: punt failed direct issue to dispatch list")
> > > > was supposed to add only requests that have been through ->queue_rq() to
> > > > the hw queue dispatch list; however, it also adds requests that merely
> > > > ran out of budget or a driver tag. That basically bypasses request
> > > > merging and causes too many requests to be dispatched to the LLD, so
> > > > %system is unnecessarily increased.
> > > > 
> > > > Fix this issue by inserting requests that have not been through
> > > > ->queue_rq() into the sw/scheduler queue instead. This is safe because
> > > > ->queue_rq() has not been called on such a request yet.
> > > > 
> > > > High %system can be observed on Azure storvsc devices, and even soft
> > > > lockups have been observed. This patch reduces %system during heavy
> > > > sequential IO and decreases the risk of soft lockups.
> > > 
> > > Applied, thanks Ming.
> > 
> > Hmm, strikes me as strange that this is occurring given the direct
> > insertion into blk-mq queue (bypassing scheduler) is meant to avoid 2
> > layers of IO merging when dm-multipath is stacked on blk-mq path(s).  The
> > dm-mpath IO scheduler does all merging and underlying paths' blk-mq
> > request_queues are meant to just dispatch the top-level's requests.
> > 
> > So this change concerns me.  Feels like this design has broken down.
> > 
> 
> 'bypass_insert' is 'true' when blk_insert_cloned_request() is
> called from device mapper code, so this patch doesn't affect dm.

Great.
 
> > Could be that some other entry point was added for the
> > __blk_mq_try_issue_directly() code?  And it needs to be untangled away
> > from the dm-multipath use-case?
> 
> __blk_mq_try_issue_directly() can be called from blk-mq directly; that
> is the case this patch addresses. If a request can't be queued to the
> LLD because it ran out of budget or a driver tag, it should be added to
> the scheduler queue to improve IO merging, and meanwhile we avoid
> dispatching too many requests to the hardware.

I see, so if a retry is needed it's best to attempt merging again.

Thanks for the explanation.

Mike


