All of lore.kernel.org
 help / color / mirror / Atom feed
From: Ming Lei <ming.lei@redhat.com>
To: linux-nvme@lists.infradead.org, Christoph Hellwig <hch@lst.de>,
	Jens Axboe <axboe@fb.com>,
	linux-block@vger.kernel.org
Cc: Bart Van Assche <bart.vanassche@sandisk.com>,
	Keith Busch <keith.busch@intel.com>,
	Sagi Grimberg <sagi@grimberg.me>, Yi Zhang <yi.zhang@redhat.com>,
	Johannes Thumshirn <jthumshirn@suse.de>,
	Ming Lei <ming.lei@redhat.com>
Subject: [PATCH V2 3/6] blk-mq: quiesce queue during switching io sched and updating nr_requests
Date: Thu, 14 Dec 2017 10:31:00 +0800	[thread overview]
Message-ID: <20171214023103.18272-4-ming.lei@redhat.com> (raw)
In-Reply-To: <20171214023103.18272-1-ming.lei@redhat.com>

Dispatch may still be in-progress after queue is frozen, so we have to
quiesce queue before switching IO scheduler and updating nr_requests.

Also when switching io schedulers, blk_mq_run_hw_queue() may still be
called somewhere(such as from nvme_reset_work()), and io scheduler's
per-hctx data may not be setup yet, so cause oops even inside
blk_mq_hctx_has_pending(), such as it can be run just between:

        ret = e->ops.mq.init_sched(q, e);
AND
        ret = e->ops.mq.init_hctx(hctx, i)

inside blk_mq_init_sched().

This reverts commit 7a148c2fcff8330(block: don't call blk_mq_quiesce_queue()
after queue is frozen) basically, and makes sure blk_mq_hctx_has_pending
won't be called if queue is quiesced.

Fixes: 7a148c2fcff83309(block: don't call blk_mq_quiesce_queue() after queue is frozen)
Reported-by: Yi Zhang <yi.zhang@redhat.com>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
 block/blk-mq.c   | 27 ++++++++++++++++++++++++++-
 block/elevator.c |  2 ++
 2 files changed, 28 insertions(+), 1 deletion(-)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index 5d69c8075339..85954a0b4394 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -1296,7 +1296,30 @@ EXPORT_SYMBOL(blk_mq_delay_run_hw_queue);
 
 bool blk_mq_run_hw_queue(struct blk_mq_hw_ctx *hctx, bool async)
 {
-	if (blk_mq_hctx_has_pending(hctx)) {
+	int srcu_idx;
+	bool need_run;
+
+	/*
+	 * When queue is quiesced, we may be switching io scheduler, or
+	 * updating nr_hw_queues, or other things, and we can't run queue
+	 * any more, even __blk_mq_hctx_has_pending() can't be called safely.
+	 *
+	 * And queue will be rerun in blk_mq_unquiesce_queue() if it is
+	 * quiesced.
+	 */
+	if (!(hctx->flags & BLK_MQ_F_BLOCKING)) {
+		rcu_read_lock();
+		need_run = !blk_queue_quiesced(hctx->queue) &&
+			blk_mq_hctx_has_pending(hctx);
+		rcu_read_unlock();
+	} else {
+		srcu_idx = srcu_read_lock(hctx->queue_rq_srcu);
+		need_run = !blk_queue_quiesced(hctx->queue) &&
+			blk_mq_hctx_has_pending(hctx);
+		srcu_read_unlock(hctx->queue_rq_srcu, srcu_idx);
+	}
+
+	if (need_run) {
 		__blk_mq_delay_run_hw_queue(hctx, async, 0);
 		return true;
 	}
@@ -2721,6 +2744,7 @@ int blk_mq_update_nr_requests(struct request_queue *q, unsigned int nr)
 		return -EINVAL;
 
 	blk_mq_freeze_queue(q);
+	blk_mq_quiesce_queue(q);
 
 	ret = 0;
 	queue_for_each_hw_ctx(q, hctx, i) {
@@ -2744,6 +2768,7 @@ int blk_mq_update_nr_requests(struct request_queue *q, unsigned int nr)
 	if (!ret)
 		q->nr_requests = nr;
 
+	blk_mq_unquiesce_queue(q);
 	blk_mq_unfreeze_queue(q);
 
 	return ret;
diff --git a/block/elevator.c b/block/elevator.c
index 7bda083d5968..138faeb08a7c 100644
--- a/block/elevator.c
+++ b/block/elevator.c
@@ -968,6 +968,7 @@ static int elevator_switch_mq(struct request_queue *q,
 	int ret;
 
 	blk_mq_freeze_queue(q);
+	blk_mq_quiesce_queue(q);
 
 	if (q->elevator) {
 		if (q->elevator->registered)
@@ -994,6 +995,7 @@ static int elevator_switch_mq(struct request_queue *q,
 		blk_add_trace_msg(q, "elv switch: none");
 
 out:
+	blk_mq_unquiesce_queue(q);
 	blk_mq_unfreeze_queue(q);
 	return ret;
 }
-- 
2.9.5

WARNING: multiple messages have this Message-ID (diff)
From: ming.lei@redhat.com (Ming Lei)
Subject: [PATCH V2 3/6] blk-mq: quiesce queue during switching io sched and updating nr_requests
Date: Thu, 14 Dec 2017 10:31:00 +0800	[thread overview]
Message-ID: <20171214023103.18272-4-ming.lei@redhat.com> (raw)
In-Reply-To: <20171214023103.18272-1-ming.lei@redhat.com>

Dispatch may still be in-progress after queue is frozen, so we have to
quiesce queue before switching IO scheduler and updating nr_requests.

Also when switching io schedulers, blk_mq_run_hw_queue() may still be
called somewhere(such as from nvme_reset_work()), and io scheduler's
per-hctx data may not be setup yet, so cause oops even inside
blk_mq_hctx_has_pending(), such as it can be run just between:

        ret = e->ops.mq.init_sched(q, e);
AND
        ret = e->ops.mq.init_hctx(hctx, i)

inside blk_mq_init_sched().

This reverts commit 7a148c2fcff8330(block: don't call blk_mq_quiesce_queue()
after queue is frozen) basically, and makes sure blk_mq_hctx_has_pending
won't be called if queue is quiesced.

Fixes: 7a148c2fcff83309(block: don't call blk_mq_quiesce_queue() after queue is frozen)
Reported-by: Yi Zhang <yi.zhang at redhat.com>
Signed-off-by: Ming Lei <ming.lei at redhat.com>
---
 block/blk-mq.c   | 27 ++++++++++++++++++++++++++-
 block/elevator.c |  2 ++
 2 files changed, 28 insertions(+), 1 deletion(-)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index 5d69c8075339..85954a0b4394 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -1296,7 +1296,30 @@ EXPORT_SYMBOL(blk_mq_delay_run_hw_queue);
 
 bool blk_mq_run_hw_queue(struct blk_mq_hw_ctx *hctx, bool async)
 {
-	if (blk_mq_hctx_has_pending(hctx)) {
+	int srcu_idx;
+	bool need_run;
+
+	/*
+	 * When queue is quiesced, we may be switching io scheduler, or
+	 * updating nr_hw_queues, or other things, and we can't run queue
+	 * any more, even __blk_mq_hctx_has_pending() can't be called safely.
+	 *
+	 * And queue will be rerun in blk_mq_unquiesce_queue() if it is
+	 * quiesced.
+	 */
+	if (!(hctx->flags & BLK_MQ_F_BLOCKING)) {
+		rcu_read_lock();
+		need_run = !blk_queue_quiesced(hctx->queue) &&
+			blk_mq_hctx_has_pending(hctx);
+		rcu_read_unlock();
+	} else {
+		srcu_idx = srcu_read_lock(hctx->queue_rq_srcu);
+		need_run = !blk_queue_quiesced(hctx->queue) &&
+			blk_mq_hctx_has_pending(hctx);
+		srcu_read_unlock(hctx->queue_rq_srcu, srcu_idx);
+	}
+
+	if (need_run) {
 		__blk_mq_delay_run_hw_queue(hctx, async, 0);
 		return true;
 	}
@@ -2721,6 +2744,7 @@ int blk_mq_update_nr_requests(struct request_queue *q, unsigned int nr)
 		return -EINVAL;
 
 	blk_mq_freeze_queue(q);
+	blk_mq_quiesce_queue(q);
 
 	ret = 0;
 	queue_for_each_hw_ctx(q, hctx, i) {
@@ -2744,6 +2768,7 @@ int blk_mq_update_nr_requests(struct request_queue *q, unsigned int nr)
 	if (!ret)
 		q->nr_requests = nr;
 
+	blk_mq_unquiesce_queue(q);
 	blk_mq_unfreeze_queue(q);
 
 	return ret;
diff --git a/block/elevator.c b/block/elevator.c
index 7bda083d5968..138faeb08a7c 100644
--- a/block/elevator.c
+++ b/block/elevator.c
@@ -968,6 +968,7 @@ static int elevator_switch_mq(struct request_queue *q,
 	int ret;
 
 	blk_mq_freeze_queue(q);
+	blk_mq_quiesce_queue(q);
 
 	if (q->elevator) {
 		if (q->elevator->registered)
@@ -994,6 +995,7 @@ static int elevator_switch_mq(struct request_queue *q,
 		blk_add_trace_msg(q, "elv switch: none");
 
 out:
+	blk_mq_unquiesce_queue(q);
 	blk_mq_unfreeze_queue(q);
 	return ret;
 }
-- 
2.9.5

  parent reply	other threads:[~2017-12-14  2:32 UTC|newest]

Thread overview: 36+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-12-14  2:30 [PATCH V2 0/6] blk-mq: fix race related with device deletion/reset/switching sched Ming Lei
2017-12-14  2:30 ` Ming Lei
2017-12-14  2:30 ` [PATCH V2 1/6] blk-mq: quiesce queue before freeing queue Ming Lei
2017-12-14  2:30   ` Ming Lei
2017-12-29  9:54   ` Christoph Hellwig
2017-12-29  9:54     ` Christoph Hellwig
2017-12-14  2:30 ` [PATCH V2 2/6] blk-mq: support concurrent blk_mq_quiesce_queue() Ming Lei
2017-12-14  2:30   ` Ming Lei
2017-12-29  9:58   ` Christoph Hellwig
2017-12-29  9:58     ` Christoph Hellwig
2018-01-02  3:01     ` Ming Lei
2018-01-02  3:01       ` Ming Lei
2017-12-14  2:31 ` Ming Lei [this message]
2017-12-14  2:31   ` [PATCH V2 3/6] blk-mq: quiesce queue during switching io sched and updating nr_requests Ming Lei
2017-12-29  9:58   ` Christoph Hellwig
2017-12-29  9:58     ` Christoph Hellwig
2017-12-14  2:31 ` [PATCH V2 4/6] blk-mq: avoid to map CPU into stale hw queue Ming Lei
2017-12-14  2:31   ` Ming Lei
2017-12-29  9:59   ` Christoph Hellwig
2017-12-29  9:59     ` Christoph Hellwig
2017-12-14  2:31 ` [PATCH V2 5/6] blk-mq: fix race between updating nr_hw_queues and switching io sched Ming Lei
2017-12-14  2:31   ` Ming Lei
2017-12-29  9:59   ` Christoph Hellwig
2017-12-29  9:59     ` Christoph Hellwig
2017-12-14  2:31 ` [PATCH V2 6/6] nvme-pci: remove .init_request callback Ming Lei
2017-12-14  2:31   ` Ming Lei
2017-12-21  8:20   ` Sagi Grimberg
2017-12-21  8:20     ` Sagi Grimberg
2017-12-21  8:36     ` Johannes Thumshirn
2017-12-21  8:36       ` Johannes Thumshirn
2017-12-22  1:34     ` Ming Lei
2017-12-22  1:34       ` Ming Lei
2017-12-24  8:50       ` Sagi Grimberg
2017-12-24  8:50         ` Sagi Grimberg
2017-12-19 15:30 ` [PATCH V2 0/6] blk-mq: fix race related with device deletion/reset/switching sched Yi Zhang
2017-12-19 15:30   ` Yi Zhang

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20171214023103.18272-4-ming.lei@redhat.com \
    --to=ming.lei@redhat.com \
    --cc=axboe@fb.com \
    --cc=bart.vanassche@sandisk.com \
    --cc=hch@lst.de \
    --cc=jthumshirn@suse.de \
    --cc=keith.busch@intel.com \
    --cc=linux-block@vger.kernel.org \
    --cc=linux-nvme@lists.infradead.org \
    --cc=sagi@grimberg.me \
    --cc=yi.zhang@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.