From: Ming Lei <ming.lei@redhat.com>
To: Jens Axboe <axboe@kernel.dk>, Christoph Hellwig <hch@lst.de>, linux-block@vger.kernel.org
Cc: Sagi Grimberg <sagi@grimberg.me>, linux-nvme@lists.infradead.org, Keith Busch <kbusch@kernel.org>, Ming Lei <ming.lei@redhat.com>
Subject: [PATCH 5/5] blk-mq: support nested blk_mq_quiesce_queue()
Date: Wed, 29 Sep 2021 12:15:59 +0800
Message-ID: <20210929041559.701102-6-ming.lei@redhat.com>
In-Reply-To: <20210929041559.701102-1-ming.lei@redhat.com>

It turns out that blk_mq_freeze_queue() isn't stronger[1] than
blk_mq_quiesce_queue(), because dispatch may still be in progress after
the queue is frozen. In several cases, such as switching the I/O
scheduler or updating nr_requests and wbt latency, we still need to
quiesce the queue as a supplement to freezing it.

As we need to extend the uses of blk_mq_quiesce_queue(), we inevitably
need to support nested quiesce. In particular, we can't let unquiesce
happen while a quiesce originating from another context is still in
effect.

This patch introduces q->quiesce_depth to deal with concurrent quiesce,
and we only unquiesce the queue when the call is the last one from all
contexts.

One kernel panic has been reported[2] when running a stress test on
dm-mpath that updates nr_requests while suspending the queue, and a
similar issue should exist in almost all drivers which use
quiesce/unquiesce.

[1] https://marc.info/?l=linux-block&m=150993988115872&w=2
[2] https://listman.redhat.com/archives/dm-devel/2021-September/msg00189.html

Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
 block/blk-mq.c         | 20 +++++++++++++++++---
 include/linux/blkdev.h |  2 ++
 2 files changed, 19 insertions(+), 3 deletions(-)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index 21bf4c3f0825..10f8a3d4e3a1 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -209,7 +209,12 @@ EXPORT_SYMBOL_GPL(blk_mq_unfreeze_queue);
  */
 void blk_mq_quiesce_queue_nowait(struct request_queue *q)
 {
-	blk_queue_flag_set(QUEUE_FLAG_QUIESCED, q);
+	unsigned long flags;
+
+	spin_lock_irqsave(&q->queue_lock, flags);
+	if (!q->quiesce_depth++)
+		blk_queue_flag_set(QUEUE_FLAG_QUIESCED, q);
+	spin_unlock_irqrestore(&q->queue_lock, flags);
 }
 EXPORT_SYMBOL_GPL(blk_mq_quiesce_queue_nowait);
 
@@ -250,10 +255,19 @@ EXPORT_SYMBOL_GPL(blk_mq_quiesce_queue);
  */
 void blk_mq_unquiesce_queue(struct request_queue *q)
 {
-	blk_queue_flag_clear(QUEUE_FLAG_QUIESCED, q);
+	unsigned long flags;
+	bool run_queue = false;
+
+	spin_lock_irqsave(&q->queue_lock, flags);
+	if (q->quiesce_depth > 0 && !--q->quiesce_depth) {
+		blk_queue_flag_clear(QUEUE_FLAG_QUIESCED, q);
+		run_queue = true;
+	}
+	spin_unlock_irqrestore(&q->queue_lock, flags);
 
 	/* dispatch requests which are inserted during quiescing */
-	blk_mq_run_hw_queues(q, true);
+	if (run_queue)
+		blk_mq_run_hw_queues(q, true);
 }
 EXPORT_SYMBOL_GPL(blk_mq_unquiesce_queue);
 
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 0e960d74615e..74c60e2d61f9 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -315,6 +315,8 @@ struct request_queue {
 	 */
 	struct mutex		mq_freeze_lock;
 
+	int			quiesce_depth;
+
 	struct blk_mq_tag_set	*tag_set;
 	struct list_head	tag_set_list;
 	struct bio_set		bio_split;
-- 
2.31.1
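
For readers who want to try the counting logic outside the kernel, here
is a minimal userspace sketch of the depth-counted quiesce/unquiesce
pairing that the patch describes. It is an illustration under stated
assumptions, not the kernel implementation: struct model_queue,
model_init(), model_quiesce(), model_unquiesce() and the run_count field
are hypothetical names, a pthread mutex stands in for q->queue_lock, and
a plain bool stands in for QUEUE_FLAG_QUIESCED.

```c
/*
 * Userspace model of nested quiesce counting. All names here are
 * hypothetical stand-ins for illustration; none of them are kernel APIs.
 */
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

struct model_queue {
	pthread_mutex_t lock;	/* stands in for q->queue_lock */
	int quiesce_depth;	/* number of outstanding quiesce calls */
	bool quiesced;		/* stands in for QUEUE_FLAG_QUIESCED */
	int run_count;		/* times the hw queues would be re-run */
};

static void model_init(struct model_queue *q)
{
	pthread_mutex_init(&q->lock, NULL);
	q->quiesce_depth = 0;
	q->quiesced = false;
	q->run_count = 0;
}

/* Only the first (outermost) quiesce sets the flag. */
static void model_quiesce(struct model_queue *q)
{
	pthread_mutex_lock(&q->lock);
	assert(q->quiesce_depth >= 0);
	if (!q->quiesce_depth++)
		q->quiesced = true;
	pthread_mutex_unlock(&q->lock);
}

/*
 * Only the last unquiesce clears the flag and re-runs the queue.
 * An unbalanced call (depth already 0) is ignored.
 */
static void model_unquiesce(struct model_queue *q)
{
	bool run_queue = false;

	pthread_mutex_lock(&q->lock);
	if (q->quiesce_depth > 0 && !--q->quiesce_depth) {
		q->quiesced = false;
		run_queue = true;
	}
	pthread_mutex_unlock(&q->lock);

	/* models dispatching requests inserted while quiesced */
	if (run_queue)
		q->run_count++;
}
```

In this model, two model_quiesce() calls followed by one
model_unquiesce() leave the queue quiesced; only the second
model_unquiesce() clears the flag and bumps run_count. A further,
unbalanced model_unquiesce() changes nothing because of the depth > 0
check.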