linux-block.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH RESEND] blk-mq: order adding requests to hctx->dispatch and checking SCHED_RESTART
@ 2020-08-17 10:01 Ming Lei
  2020-08-17 10:15 ` Christoph Hellwig
                   ` (2 more replies)
  0 siblings, 3 replies; 5+ messages in thread
From: Ming Lei @ 2020-08-17 10:01 UTC (permalink / raw)
  To: Jens Axboe
  Cc: linux-block, Ming Lei, Bart Van Assche, Christoph Hellwig,
	David Jeffery, kernel test robot, stable

SCHED_RESTART code path is relied to re-run queue for dispatch requests
in hctx->dispatch. Meantime the SCHED_RSTART flag is checked when adding
requests to hctx->dispatch.

memory barriers have to be used for ordering the following two pair of OPs:

1) adding requests to hctx->dispatch and checking SCHED_RESTART in
blk_mq_dispatch_rq_list()

2) clearing SCHED_RESTART and checking if there is request in hctx->dispatch
in blk_mq_sched_restart().

Without the added memory barrier, either:

1) blk_mq_sched_restart() may miss requests added to hctx->dispatch meantime
blk_mq_dispatch_rq_list() observes SCHED_RESTART, and not run queue in
dispatch side

or

2) blk_mq_dispatch_rq_list still sees SCHED_RESTART, and not run queue
in dispatch side, meantime checking if there is request in
hctx->dispatch from blk_mq_sched_restart() is missed.

IO hang in ltp/fs_fill test is reported by kernel test robot:

	https://lkml.org/lkml/2020/7/26/77

Turns out it is caused by the above out-of-order OPs. And the IO hang
can't be observed any more after applying this patch.

Cc: Bart Van Assche <bvanassche@acm.org>
Cc: Christoph Hellwig <hch@lst.de>
Cc: David Jeffery <djeffery@redhat.com>
Reported-by: kernel test robot <rong.a.chen@intel.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
 block/blk-mq-sched.c | 9 +++++++++
 block/blk-mq.c       | 9 +++++++++
 2 files changed, 18 insertions(+)

diff --git a/block/blk-mq-sched.c b/block/blk-mq-sched.c
index a19cdf159b75..d2790e5b06d1 100644
--- a/block/blk-mq-sched.c
+++ b/block/blk-mq-sched.c
@@ -78,6 +78,15 @@ void blk_mq_sched_restart(struct blk_mq_hw_ctx *hctx)
 		return;
 	clear_bit(BLK_MQ_S_SCHED_RESTART, &hctx->state);
 
+	/*
+	 * Order clearing SCHED_RESTART and list_empty_careful(&hctx->dispatch)
+	 * in blk_mq_run_hw_queue(). Its pair is the barrier in
+	 * blk_mq_dispatch_rq_list(). So dispatch code won't see SCHED_RESTART,
+	 * meantime new request added to hctx->dispatch is missed to check in
+	 * blk_mq_run_hw_queue().
+	 */
+	smp_mb();
+
 	blk_mq_run_hw_queue(hctx, true);
 }
 
diff --git a/block/blk-mq.c b/block/blk-mq.c
index 0015a1892153..6c1c3ad175a9 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -1437,6 +1437,15 @@ bool blk_mq_dispatch_rq_list(struct blk_mq_hw_ctx *hctx, struct list_head *list,
 		list_splice_tail_init(list, &hctx->dispatch);
 		spin_unlock(&hctx->lock);
 
+		/*
+		 * Order adding requests to hctx->dispatch and checking
+		 * SCHED_RESTART flag. The pair of this smp_mb() is the one
+		 * in blk_mq_sched_restart(). Avoid restart code path to
+		 * miss the new added requests to hctx->dispatch, meantime
+		 * SCHED_RESTART is observed here.
+		 */
+		smp_mb();
+
 		/*
 		 * If SCHED_RESTART was set by the caller of this function and
 		 * it is no longer set that means that it was cleared by another
-- 
2.25.2


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH RESEND] blk-mq: order adding requests to hctx->dispatch and checking SCHED_RESTART
  2020-08-17 10:01 [PATCH RESEND] blk-mq: order adding requests to hctx->dispatch and checking SCHED_RESTART Ming Lei
@ 2020-08-17 10:15 ` Christoph Hellwig
  2020-08-17 11:06   ` Ming Lei
  2020-08-17 13:58 ` Jens Axboe
  2020-08-26 13:53 ` Sasha Levin
  2 siblings, 1 reply; 5+ messages in thread
From: Christoph Hellwig @ 2020-08-17 10:15 UTC (permalink / raw)
  To: Ming Lei
  Cc: Jens Axboe, linux-block, Bart Van Assche, Christoph Hellwig,
	David Jeffery, kernel test robot, stable

On Mon, Aug 17, 2020 at 06:01:15PM +0800, Ming Lei wrote:
> SCHED_RESTART code path is relied to re-run queue for dispatch requests
> in hctx->dispatch. Meantime the SCHED_RSTART flag is checked when adding
> requests to hctx->dispatch.
> 
> memory barriers have to be used for ordering the following two pair of OPs:
> 
> 1) adding requests to hctx->dispatch and checking SCHED_RESTART in
> blk_mq_dispatch_rq_list()
> 
> 2) clearing SCHED_RESTART and checking if there is request in hctx->dispatch
> in blk_mq_sched_restart().
> 
> Without the added memory barrier, either:
> 
> 1) blk_mq_sched_restart() may miss requests added to hctx->dispatch meantime
> blk_mq_dispatch_rq_list() observes SCHED_RESTART, and not run queue in
> dispatch side
> 
> or
> 
> 2) blk_mq_dispatch_rq_list still sees SCHED_RESTART, and not run queue
> in dispatch side, meantime checking if there is request in
> hctx->dispatch from blk_mq_sched_restart() is missed.
> 
> IO hang in ltp/fs_fill test is reported by kernel test robot:
> 
> 	https://lkml.org/lkml/2020/7/26/77
> 
> Turns out it is caused by the above out-of-order OPs. And the IO hang
> can't be observed any more after applying this patch.
> 
> Cc: Bart Van Assche <bvanassche@acm.org>
> Cc: Christoph Hellwig <hch@lst.de>
> Cc: David Jeffery <djeffery@redhat.com>
> Reported-by: kernel test robot <rong.a.chen@intel.com>
> Cc: <stable@vger.kernel.org>
> Signed-off-by: Ming Lei <ming.lei@redhat.com>

Can you add a Fixes: tag so that the commit gets backported?

Otherwise looks good:

Reviewed-by: Christoph Hellwig <hch@lst.de>

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH RESEND] blk-mq: order adding requests to hctx->dispatch and checking SCHED_RESTART
  2020-08-17 10:15 ` Christoph Hellwig
@ 2020-08-17 11:06   ` Ming Lei
  0 siblings, 0 replies; 5+ messages in thread
From: Ming Lei @ 2020-08-17 11:06 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Jens Axboe, linux-block, Bart Van Assche, David Jeffery,
	kernel test robot, stable

On Mon, Aug 17, 2020 at 12:15:39PM +0200, Christoph Hellwig wrote:
> On Mon, Aug 17, 2020 at 06:01:15PM +0800, Ming Lei wrote:
> > SCHED_RESTART code path is relied to re-run queue for dispatch requests
> > in hctx->dispatch. Meantime the SCHED_RSTART flag is checked when adding
> > requests to hctx->dispatch.
> > 
> > memory barriers have to be used for ordering the following two pair of OPs:
> > 
> > 1) adding requests to hctx->dispatch and checking SCHED_RESTART in
> > blk_mq_dispatch_rq_list()
> > 
> > 2) clearing SCHED_RESTART and checking if there is request in hctx->dispatch
> > in blk_mq_sched_restart().
> > 
> > Without the added memory barrier, either:
> > 
> > 1) blk_mq_sched_restart() may miss requests added to hctx->dispatch meantime
> > blk_mq_dispatch_rq_list() observes SCHED_RESTART, and not run queue in
> > dispatch side
> > 
> > or
> > 
> > 2) blk_mq_dispatch_rq_list still sees SCHED_RESTART, and not run queue
> > in dispatch side, meantime checking if there is request in
> > hctx->dispatch from blk_mq_sched_restart() is missed.
> > 
> > IO hang in ltp/fs_fill test is reported by kernel test robot:
> > 
> > 	https://lkml.org/lkml/2020/7/26/77
> > 
> > Turns out it is caused by the above out-of-order OPs. And the IO hang
> > can't be observed any more after applying this patch.
> > 
> > Cc: Bart Van Assche <bvanassche@acm.org>
> > Cc: Christoph Hellwig <hch@lst.de>
> > Cc: David Jeffery <djeffery@redhat.com>
> > Reported-by: kernel test robot <rong.a.chen@intel.com>
> > Cc: <stable@vger.kernel.org>
> > Signed-off-by: Ming Lei <ming.lei@redhat.com>
> 
> Can you add a Fixes: tag so that the commit gets backported?

Fixes: bd166ef183c2 ("blk-mq-sched: add framework for MQ capable IO schedulers")


Thanks, 
Ming


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH RESEND] blk-mq: order adding requests to hctx->dispatch and checking SCHED_RESTART
  2020-08-17 10:01 [PATCH RESEND] blk-mq: order adding requests to hctx->dispatch and checking SCHED_RESTART Ming Lei
  2020-08-17 10:15 ` Christoph Hellwig
@ 2020-08-17 13:58 ` Jens Axboe
  2020-08-26 13:53 ` Sasha Levin
  2 siblings, 0 replies; 5+ messages in thread
From: Jens Axboe @ 2020-08-17 13:58 UTC (permalink / raw)
  To: Ming Lei
  Cc: linux-block, Bart Van Assche, Christoph Hellwig, David Jeffery,
	kernel test robot, stable

On 8/17/20 3:01 AM, Ming Lei wrote:
> SCHED_RESTART code path is relied to re-run queue for dispatch requests
> in hctx->dispatch. Meantime the SCHED_RSTART flag is checked when adding
> requests to hctx->dispatch.
> 
> memory barriers have to be used for ordering the following two pair of OPs:
> 
> 1) adding requests to hctx->dispatch and checking SCHED_RESTART in
> blk_mq_dispatch_rq_list()
> 
> 2) clearing SCHED_RESTART and checking if there is request in hctx->dispatch
> in blk_mq_sched_restart().
> 
> Without the added memory barrier, either:
> 
> 1) blk_mq_sched_restart() may miss requests added to hctx->dispatch meantime
> blk_mq_dispatch_rq_list() observes SCHED_RESTART, and not run queue in
> dispatch side
> 
> or
> 
> 2) blk_mq_dispatch_rq_list still sees SCHED_RESTART, and not run queue
> in dispatch side, meantime checking if there is request in
> hctx->dispatch from blk_mq_sched_restart() is missed.
> 
> IO hang in ltp/fs_fill test is reported by kernel test robot:
> 
> 	https://lkml.org/lkml/2020/7/26/77
> 
> Turns out it is caused by the above out-of-order OPs. And the IO hang
> can't be observed any more after applying this patch.

Applied, thanks.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH RESEND] blk-mq: order adding requests to hctx->dispatch and checking SCHED_RESTART
  2020-08-17 10:01 [PATCH RESEND] blk-mq: order adding requests to hctx->dispatch and checking SCHED_RESTART Ming Lei
  2020-08-17 10:15 ` Christoph Hellwig
  2020-08-17 13:58 ` Jens Axboe
@ 2020-08-26 13:53 ` Sasha Levin
  2 siblings, 0 replies; 5+ messages in thread
From: Sasha Levin @ 2020-08-26 13:53 UTC (permalink / raw)
  To: Sasha Levin, Ming Lei, Jens Axboe
  Cc: linux-block, Ming Lei, Bart Van Assche, Christoph Hellwig,
	David Jeffery, stable, stable

Hi

[This is an automated email]

This commit has been processed because it contains a -stable tag.
The stable tag indicates that it's relevant for the following trees: all

The bot has tested the following trees: v5.8.2, v5.7.16, v5.4.59, v4.19.140, v4.14.193, v4.9.232, v4.4.232.

v5.8.2: Build OK!
v5.7.16: Build OK!
v5.4.59: Build OK!
v4.19.140: Build OK!
v4.14.193: Failed to apply! Possible dependencies:
    1f460b63d4b3 ("blk-mq: don't restart queue when .get_budget returns BLK_STS_RESOURCE")
    358a3a6bccb7 ("blk-mq: don't handle TAG_SHARED in restart")
    97889f9ac24f ("blk-mq: remove synchronize_rcu() from blk_mq_del_queue_tag_set()")
    b347689ffbca ("blk-mq-sched: improve dispatching from sw queue")
    caf8eb0d604a ("blk-mq-sched: move actual dispatching into one helper")
    de1482974080 ("blk-mq: introduce .get_budget and .put_budget in blk_mq_ops")

v4.9.232: Failed to apply! Possible dependencies:
    8e8320c9315c ("blk-mq: fix performance regression with shared tags")
    97889f9ac24f ("blk-mq: remove synchronize_rcu() from blk_mq_del_queue_tag_set()")
    bd166ef183c2 ("blk-mq-sched: add framework for MQ capable IO schedulers")
    cf43e6be865a ("block: add scalable completion tracking of requests")
    e806402130c9 ("block: split out request-only flags into a new namespace")
    f1ba82616c33 ("blk-mq: pass bio to blk_mq_sched_get_rq_priv")

v4.4.232: Failed to apply! Possible dependencies:
    511cbce2ff8b ("irq_poll: make blk-iopoll available outside the block layer")
    6f3b0e8bcf3c ("blk-mq: add a flags parameter to blk_mq_alloc_request")
    88459642cba4 ("blk-mq: abstract tag allocation out into sbitmap library")
    8e8320c9315c ("blk-mq: fix performance regression with shared tags")
    9467f85960a3 ("blk-mq/cpu-notif: Convert to new hotplug state machine")
    97889f9ac24f ("blk-mq: remove synchronize_rcu() from blk_mq_del_queue_tag_set()")
    9e0e252a048b ("badblocks: Add core badblock management code")
    bd166ef183c2 ("blk-mq-sched: add framework for MQ capable IO schedulers")
    cf43e6be865a ("block: add scalable completion tracking of requests")
    e57690fe009b ("blk-mq: don't overwrite rq->mq_ctx")
    f1ba82616c33 ("blk-mq: pass bio to blk_mq_sched_get_rq_priv")


NOTE: The patch will not be queued to stable trees until it is upstream.

How should we proceed with this patch?

-- 
Thanks
Sasha

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2020-08-26 13:55 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-08-17 10:01 [PATCH RESEND] blk-mq: order adding requests to hctx->dispatch and checking SCHED_RESTART Ming Lei
2020-08-17 10:15 ` Christoph Hellwig
2020-08-17 11:06   ` Ming Lei
2020-08-17 13:58 ` Jens Axboe
2020-08-26 13:53 ` Sasha Levin

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).