From: Ming Lei <ming.lei@redhat.com>
To: linux-nvme@lists.infradead.org, Christoph Hellwig <hch@lst.de>, Jens Axboe <axboe@fb.com>, linux-block@vger.kernel.org
Cc: Bart Van Assche <bart.vanassche@sandisk.com>, Keith Busch <keith.busch@intel.com>, Sagi Grimberg <sagi@grimberg.me>, Yi Zhang <yi.zhang@redhat.com>, Johannes Thumshirn <jthumshirn@suse.de>, Ming Lei <ming.lei@redhat.com>
Subject: [PATCH V2 3/6] blk-mq: quiesce queue during switching io sched and updating nr_requests
Date: Thu, 14 Dec 2017 10:31:00 +0800
Message-ID: <20171214023103.18272-4-ming.lei@redhat.com>
In-Reply-To: <20171214023103.18272-1-ming.lei@redhat.com>

Dispatch may still be in progress after the queue is frozen, so we have
to quiesce the queue before switching the IO scheduler or updating
nr_requests.

Also, when switching IO schedulers, blk_mq_run_hw_queue() may still be
called from elsewhere (such as from nvme_reset_work()) while the IO
scheduler's per-hctx data is not set up yet, which can cause an oops
even inside blk_mq_hctx_has_pending(). For example, it can run just
between:

	ret = e->ops.mq.init_sched(q, e);
AND
	ret = e->ops.mq.init_hctx(hctx, i)

inside blk_mq_init_sched().

This basically reverts commit 7a148c2fcff8330 (block: don't call
blk_mq_quiesce_queue() after queue is frozen), and makes sure
blk_mq_hctx_has_pending() won't be called while the queue is quiesced.

Fixes: 7a148c2fcff83309(block: don't call blk_mq_quiesce_queue() after queue is frozen)
Reported-by: Yi Zhang <yi.zhang@redhat.com>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
 block/blk-mq.c   | 27 ++++++++++++++++++++++++++-
 block/elevator.c |  2 ++
 2 files changed, 28 insertions(+), 1 deletion(-)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index 5d69c8075339..85954a0b4394 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -1296,7 +1296,30 @@ EXPORT_SYMBOL(blk_mq_delay_run_hw_queue);
 
 bool blk_mq_run_hw_queue(struct blk_mq_hw_ctx *hctx, bool async)
 {
-	if (blk_mq_hctx_has_pending(hctx)) {
+	int srcu_idx;
+	bool need_run;
+
+	/*
+	 * When queue is quiesced, we may be switching io scheduler, or
+	 * updating nr_hw_queues, or other things, and we can't run queue
+	 * any more, even __blk_mq_hctx_has_pending() can't be called safely.
+	 *
+	 * And queue will be rerun in blk_mq_unquiesce_queue() if it is
+	 * quiesced.
+	 */
+	if (!(hctx->flags & BLK_MQ_F_BLOCKING)) {
+		rcu_read_lock();
+		need_run = !blk_queue_quiesced(hctx->queue) &&
+			blk_mq_hctx_has_pending(hctx);
+		rcu_read_unlock();
+	} else {
+		srcu_idx = srcu_read_lock(hctx->queue_rq_srcu);
+		need_run = !blk_queue_quiesced(hctx->queue) &&
+			blk_mq_hctx_has_pending(hctx);
+		srcu_read_unlock(hctx->queue_rq_srcu, srcu_idx);
+	}
+
+	if (need_run) {
 		__blk_mq_delay_run_hw_queue(hctx, async, 0);
 		return true;
 	}
@@ -2721,6 +2744,7 @@ int blk_mq_update_nr_requests(struct request_queue *q, unsigned int nr)
 		return -EINVAL;
 
 	blk_mq_freeze_queue(q);
+	blk_mq_quiesce_queue(q);
 
 	ret = 0;
 	queue_for_each_hw_ctx(q, hctx, i) {
@@ -2744,6 +2768,7 @@ int blk_mq_update_nr_requests(struct request_queue *q, unsigned int nr)
 	if (!ret)
 		q->nr_requests = nr;
 
+	blk_mq_unquiesce_queue(q);
 	blk_mq_unfreeze_queue(q);
 
 	return ret;
diff --git a/block/elevator.c b/block/elevator.c
index 7bda083d5968..138faeb08a7c 100644
--- a/block/elevator.c
+++ b/block/elevator.c
@@ -968,6 +968,7 @@ static int elevator_switch_mq(struct request_queue *q,
 	int ret;
 
 	blk_mq_freeze_queue(q);
+	blk_mq_quiesce_queue(q);
 
 	if (q->elevator) {
 		if (q->elevator->registered)
@@ -994,6 +995,7 @@ static int elevator_switch_mq(struct request_queue *q,
 		blk_add_trace_msg(q, "elv switch: none");
 
 out:
+	blk_mq_unquiesce_queue(q);
 	blk_mq_unfreeze_queue(q);
 	return ret;
 }
-- 
2.9.5
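
The guard this patch adds to blk_mq_run_hw_queue() can be illustrated
with a small user-space model. This is only a sketch under stated
assumptions: every toy_* name below is hypothetical and is not a kernel
API, the model is single-threaded, and it leaves out the RCU/SRCU
read-side sections that the real patch places around the check. It shows
one thing only: because the quiesced test comes first and && short-circuits,
the per-hctx scheduler data is never touched while the queue is quiesced.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct request_queue; not the kernel type. */
struct toy_queue {
	bool quiesced;		/* models blk_queue_quiesced(q) */
};

/* Hypothetical stand-in for struct blk_mq_hw_ctx. */
struct toy_hctx {
	struct toy_queue *queue;
	int *sched_data;	/* models per-hctx scheduler data; NULL
				 * while a scheduler switch is in flight */
};

/* Models blk_mq_hctx_has_pending(): it dereferences scheduler data. */
static bool toy_has_pending(const struct toy_hctx *hctx)
{
	return *hctx->sched_data > 0;
}

/*
 * Models the guarded check: the quiesced test runs first, so
 * toy_has_pending() is skipped entirely while the queue is quiesced.
 */
static bool toy_need_run(const struct toy_hctx *hctx)
{
	return !hctx->queue->quiesced && toy_has_pending(hctx);
}

/* Model the quiesce/unquiesce pair wrapped around a scheduler switch. */
static void toy_quiesce(struct toy_queue *q)
{
	q->quiesced = true;
}

static void toy_unquiesce(struct toy_queue *q)
{
	q->quiesced = false;
}
```

In this model, calling toy_need_run() between toy_quiesce() and
toy_unquiesce() is safe even when sched_data is NULL, which mirrors the
window between init_sched() and init_hctx() described in the commit
message. Without the quiesced test, the same call would dereference NULL.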