linux-block.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH V2 0/4] block: fix concurrent quiesce
@ 2021-11-09  7:11 Ming Lei
  2021-11-09  7:11 ` [PATCH V2 1/4] blk-mq: add one API for waiting until quiesce is done Ming Lei
                   ` (5 more replies)
  0 siblings, 6 replies; 7+ messages in thread
From: Ming Lei @ 2021-11-09  7:11 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Yi Zhang, linux-block, linux-nvme, Martin K . Petersen,
	linux-scsi, James Bottomley, Ming Lei

Hi Jens,

Convert SCSI into balanced quiesce and unquiesce by using atomic
variable as suggested by James, meantime fix previous nvme conversion by
adding one new API because we have to wait until the started quiesce is
done.

V2:
	- add comment on scsi's change, as suggested by James, 3/4
	- add reviewed-by tag, 4/4

Ming Lei (4):
  blk-mq: add one API for waiting until quiesce is done
  scsi: avoid to quiesce sdev->request_queue two times
  scsi: make sure that request queue queiesce and unquiesce balanced
  nvme: wait until quiesce is done

 block/blk-mq.c             | 28 ++++++++++++-----
 drivers/nvme/host/core.c   |  4 +++
 drivers/scsi/scsi_lib.c    | 62 ++++++++++++++++++++++++--------------
 include/linux/blk-mq.h     |  1 +
 include/scsi/scsi_device.h |  1 +
 5 files changed, 66 insertions(+), 30 deletions(-)

-- 
2.31.1


^ permalink raw reply	[flat|nested] 7+ messages in thread

* [PATCH V2 1/4] blk-mq: add one API for waiting until quiesce is done
  2021-11-09  7:11 [PATCH V2 0/4] block: fix concurrent quiesce Ming Lei
@ 2021-11-09  7:11 ` Ming Lei
  2021-11-09  7:11 ` [PATCH V2 2/4] scsi: avoid to quiesce sdev->request_queue two times Ming Lei
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: Ming Lei @ 2021-11-09  7:11 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Yi Zhang, linux-block, linux-nvme, Martin K . Petersen,
	linux-scsi, James Bottomley, Ming Lei

Some drivers(NVMe, SCSI) need to call quiesce and unquiesce in pair, but it
is hard to switch to this style, so these drivers need one atomic flag for
helping to balance quiesce and unquiesce.

When quiesce is in-progress, the driver still needs to wait until
the quiesce is done, so add API of blk_mq_wait_quiesce_done() for
these drivers.

Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
 block/blk-mq.c         | 28 ++++++++++++++++++++--------
 include/linux/blk-mq.h |  1 +
 2 files changed, 21 insertions(+), 8 deletions(-)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index 6358131cfc28..629cf421417f 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -251,22 +251,18 @@ void blk_mq_quiesce_queue_nowait(struct request_queue *q)
 EXPORT_SYMBOL_GPL(blk_mq_quiesce_queue_nowait);
 
 /**
- * blk_mq_quiesce_queue() - wait until all ongoing dispatches have finished
+ * blk_mq_wait_quiesce_done() - wait until in-progress quiesce is done
  * @q: request queue.
  *
- * Note: this function does not prevent that the struct request end_io()
- * callback function is invoked. Once this function is returned, we make
- * sure no dispatch can happen until the queue is unquiesced via
- * blk_mq_unquiesce_queue().
+ * Note: it is driver's responsibility for making sure that quiesce has
+ * been started.
  */
-void blk_mq_quiesce_queue(struct request_queue *q)
+void blk_mq_wait_quiesce_done(struct request_queue *q)
 {
 	struct blk_mq_hw_ctx *hctx;
 	unsigned int i;
 	bool rcu = false;
 
-	blk_mq_quiesce_queue_nowait(q);
-
 	queue_for_each_hw_ctx(q, hctx, i) {
 		if (hctx->flags & BLK_MQ_F_BLOCKING)
 			synchronize_srcu(hctx->srcu);
@@ -276,6 +272,22 @@ void blk_mq_quiesce_queue(struct request_queue *q)
 	if (rcu)
 		synchronize_rcu();
 }
+EXPORT_SYMBOL_GPL(blk_mq_wait_quiesce_done);
+
+/**
+ * blk_mq_quiesce_queue() - wait until all ongoing dispatches have finished
+ * @q: request queue.
+ *
+ * Note: this function does not prevent that the struct request end_io()
+ * callback function is invoked. Once this function is returned, we make
+ * sure no dispatch can happen until the queue is unquiesced via
+ * blk_mq_unquiesce_queue().
+ */
+void blk_mq_quiesce_queue(struct request_queue *q)
+{
+	blk_mq_quiesce_queue_nowait(q);
+	blk_mq_wait_quiesce_done(q);
+}
 EXPORT_SYMBOL_GPL(blk_mq_quiesce_queue);
 
 /*
diff --git a/include/linux/blk-mq.h b/include/linux/blk-mq.h
index 8682663e7368..2949d9ac7484 100644
--- a/include/linux/blk-mq.h
+++ b/include/linux/blk-mq.h
@@ -798,6 +798,7 @@ void blk_mq_start_hw_queues(struct request_queue *q);
 void blk_mq_start_stopped_hw_queue(struct blk_mq_hw_ctx *hctx, bool async);
 void blk_mq_start_stopped_hw_queues(struct request_queue *q, bool async);
 void blk_mq_quiesce_queue(struct request_queue *q);
+void blk_mq_wait_quiesce_done(struct request_queue *q);
 void blk_mq_unquiesce_queue(struct request_queue *q);
 void blk_mq_delay_run_hw_queue(struct blk_mq_hw_ctx *hctx, unsigned long msecs);
 void blk_mq_run_hw_queue(struct blk_mq_hw_ctx *hctx, bool async);
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [PATCH V2 2/4] scsi: avoid to quiesce sdev->request_queue two times
  2021-11-09  7:11 [PATCH V2 0/4] block: fix concurrent quiesce Ming Lei
  2021-11-09  7:11 ` [PATCH V2 1/4] blk-mq: add one API for waiting until quiesce is done Ming Lei
@ 2021-11-09  7:11 ` Ming Lei
  2021-11-09  7:11 ` [PATCH V2 3/4] scsi: make sure that request queue queiesce and unquiesce balanced Ming Lei
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: Ming Lei @ 2021-11-09  7:11 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Yi Zhang, linux-block, linux-nvme, Martin K . Petersen,
	linux-scsi, James Bottomley, Ming Lei

For fixing queue quiesce race between driver and block layer(elevator
switch, update nr_requests, ...), we need to support concurrent quiesce
and unquiesce, which requires the two to be balanced.

blk_mq_quiesce_queue() calls blk_mq_quiesce_queue_nowait() for updating
quiesce depth and marking the flag, then scsi_internal_device_block() calls
blk_mq_quiesce_queue_nowait() two times actually.

Fix the double quiesce and keep quiesce and unquiesce balanced.

Reported-by: Yi Zhang <yi.zhang@redhat.com>
Fixes: e70feb8b3e68 ("blk-mq: support concurrent queue quiesce/unquiesce")
Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
 drivers/scsi/scsi_lib.c | 29 ++++++++++++++---------------
 1 file changed, 14 insertions(+), 15 deletions(-)

diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index 9c2b99e12ce3..1cd3ef9056d5 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -2645,6 +2645,14 @@ scsi_target_resume(struct scsi_target *starget)
 }
 EXPORT_SYMBOL(scsi_target_resume);
 
+static int __scsi_internal_device_block_nowait(struct scsi_device *sdev)
+{
+	if (scsi_device_set_state(sdev, SDEV_BLOCK))
+		return scsi_device_set_state(sdev, SDEV_CREATED_BLOCK);
+
+	return 0;
+}
+
 /**
  * scsi_internal_device_block_nowait - try to transition to the SDEV_BLOCK state
  * @sdev: device to block
@@ -2661,24 +2669,16 @@ EXPORT_SYMBOL(scsi_target_resume);
  */
 int scsi_internal_device_block_nowait(struct scsi_device *sdev)
 {
-	struct request_queue *q = sdev->request_queue;
-	int err = 0;
-
-	err = scsi_device_set_state(sdev, SDEV_BLOCK);
-	if (err) {
-		err = scsi_device_set_state(sdev, SDEV_CREATED_BLOCK);
-
-		if (err)
-			return err;
-	}
+	int ret = __scsi_internal_device_block_nowait(sdev);
 
 	/*
 	 * The device has transitioned to SDEV_BLOCK.  Stop the
 	 * block layer from calling the midlayer with this device's
 	 * request queue.
 	 */
-	blk_mq_quiesce_queue_nowait(q);
-	return 0;
+	if (!ret)
+		blk_mq_quiesce_queue_nowait(sdev->request_queue);
+	return ret;
 }
 EXPORT_SYMBOL_GPL(scsi_internal_device_block_nowait);
 
@@ -2699,13 +2699,12 @@ EXPORT_SYMBOL_GPL(scsi_internal_device_block_nowait);
  */
 static int scsi_internal_device_block(struct scsi_device *sdev)
 {
-	struct request_queue *q = sdev->request_queue;
 	int err;
 
 	mutex_lock(&sdev->state_mutex);
-	err = scsi_internal_device_block_nowait(sdev);
+	err = __scsi_internal_device_block_nowait(sdev);
 	if (err == 0)
-		blk_mq_quiesce_queue(q);
+		blk_mq_quiesce_queue(sdev->request_queue);
 	mutex_unlock(&sdev->state_mutex);
 
 	return err;
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [PATCH V2 3/4] scsi: make sure that request queue queiesce and unquiesce balanced
  2021-11-09  7:11 [PATCH V2 0/4] block: fix concurrent quiesce Ming Lei
  2021-11-09  7:11 ` [PATCH V2 1/4] blk-mq: add one API for waiting until quiesce is done Ming Lei
  2021-11-09  7:11 ` [PATCH V2 2/4] scsi: avoid to quiesce sdev->request_queue two times Ming Lei
@ 2021-11-09  7:11 ` Ming Lei
  2021-11-09  7:11 ` [PATCH V2 4/4] nvme: wait until quiesce is done Ming Lei
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: Ming Lei @ 2021-11-09  7:11 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Yi Zhang, linux-block, linux-nvme, Martin K . Petersen,
	linux-scsi, James Bottomley, Ming Lei

For fixing queue quiesce race between driver and block layer(elevator
switch, update nr_requests, ...), we need to support concurrent quiesce
and unquiesce, which requires the two call balanced.

It isn't easy to audit that in all scsi drivers, especially the two may
be called from different contexts, so do it in scsi core with one
per-device atomic variable to balance quiesce and unquiesce.

Reported-by: Yi Zhang <yi.zhang@redhat.com>
Fixes: e70feb8b3e68 ("blk-mq: support concurrent queue quiesce/unquiesce")
Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
 drivers/scsi/scsi_lib.c    | 37 ++++++++++++++++++++++++++++---------
 include/scsi/scsi_device.h |  1 +
 2 files changed, 29 insertions(+), 9 deletions(-)

diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index 1cd3ef9056d5..9e3bf028f95a 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -2653,6 +2653,32 @@ static int __scsi_internal_device_block_nowait(struct scsi_device *sdev)
 	return 0;
 }
 
+void scsi_start_queue(struct scsi_device *sdev)
+{
+	if (cmpxchg(&sdev->queue_stopped, 1, 0))
+		blk_mq_unquiesce_queue(sdev->request_queue);
+}
+
+static void scsi_stop_queue(struct scsi_device *sdev, bool nowait)
+{
+	/*
+	 * The atomic variable of ->queue_stopped covers that
+	 * blk_mq_quiesce_queue* is balanced with blk_mq_unquiesce_queue.
+	 *
+	 * However, we still need to wait until quiesce is done
+	 * in case that queue has been stopped.
+	 */
+	if (!cmpxchg(&sdev->queue_stopped, 0, 1)) {
+		if (nowait)
+			blk_mq_quiesce_queue_nowait(sdev->request_queue);
+		else
+			blk_mq_quiesce_queue(sdev->request_queue);
+	} else {
+		if (!nowait)
+			blk_mq_wait_quiesce_done(sdev->request_queue);
+	}
+}
+
 /**
  * scsi_internal_device_block_nowait - try to transition to the SDEV_BLOCK state
  * @sdev: device to block
@@ -2677,7 +2703,7 @@ int scsi_internal_device_block_nowait(struct scsi_device *sdev)
 	 * request queue.
 	 */
 	if (!ret)
-		blk_mq_quiesce_queue_nowait(sdev->request_queue);
+		scsi_stop_queue(sdev, true);
 	return ret;
 }
 EXPORT_SYMBOL_GPL(scsi_internal_device_block_nowait);
@@ -2704,19 +2730,12 @@ static int scsi_internal_device_block(struct scsi_device *sdev)
 	mutex_lock(&sdev->state_mutex);
 	err = __scsi_internal_device_block_nowait(sdev);
 	if (err == 0)
-		blk_mq_quiesce_queue(sdev->request_queue);
+		scsi_stop_queue(sdev, false);
 	mutex_unlock(&sdev->state_mutex);
 
 	return err;
 }
 
-void scsi_start_queue(struct scsi_device *sdev)
-{
-	struct request_queue *q = sdev->request_queue;
-
-	blk_mq_unquiesce_queue(q);
-}
-
 /**
  * scsi_internal_device_unblock_nowait - resume a device after a block request
  * @sdev:	device to resume
diff --git a/include/scsi/scsi_device.h b/include/scsi/scsi_device.h
index 430b73bd02ac..d1c6fc83b1e3 100644
--- a/include/scsi/scsi_device.h
+++ b/include/scsi/scsi_device.h
@@ -207,6 +207,7 @@ struct scsi_device {
 					 * creation time */
 	unsigned ignore_media_change:1; /* Ignore MEDIA CHANGE on resume */
 
+	unsigned int queue_stopped;	/* request queue is quiesced */
 	bool offline_already;		/* Device offline message logged */
 
 	atomic_t disk_events_disable_depth; /* disable depth for disk events */
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [PATCH V2 4/4] nvme: wait until quiesce is done
  2021-11-09  7:11 [PATCH V2 0/4] block: fix concurrent quiesce Ming Lei
                   ` (2 preceding siblings ...)
  2021-11-09  7:11 ` [PATCH V2 3/4] scsi: make sure that request queue queiesce and unquiesce balanced Ming Lei
@ 2021-11-09  7:11 ` Ming Lei
  2021-11-09 14:41 ` [PATCH V2 0/4] block: fix concurrent quiesce Martin K. Petersen
  2021-11-09 15:14 ` Jens Axboe
  5 siblings, 0 replies; 7+ messages in thread
From: Ming Lei @ 2021-11-09  7:11 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Yi Zhang, linux-block, linux-nvme, Martin K . Petersen,
	linux-scsi, James Bottomley, Ming Lei, Keith Busch

NVMe uses one atomic flag to check if quiesce is needed. If quiesce is
started, the helper returns immediately. This way is wrong, since we
have to wait until quiesce is done.

Fixes: e70feb8b3e68 ("blk-mq: support concurrent queue quiesce/unquiesce")
Reviewed-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
 drivers/nvme/host/core.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 838b5e2058be..4b5de8f5435a 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -4518,6 +4518,8 @@ static void nvme_stop_ns_queue(struct nvme_ns *ns)
 {
 	if (!test_and_set_bit(NVME_NS_STOPPED, &ns->flags))
 		blk_mq_quiesce_queue(ns->queue);
+	else
+		blk_mq_wait_quiesce_done(ns->queue);
 }
 
 /*
@@ -4637,6 +4639,8 @@ void nvme_stop_admin_queue(struct nvme_ctrl *ctrl)
 {
 	if (!test_and_set_bit(NVME_CTRL_ADMIN_Q_STOPPED, &ctrl->flags))
 		blk_mq_quiesce_queue(ctrl->admin_q);
+	else
+		blk_mq_wait_quiesce_done(ctrl->admin_q);
 }
 EXPORT_SYMBOL_GPL(nvme_stop_admin_queue);
 
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH V2 0/4] block: fix concurrent quiesce
  2021-11-09  7:11 [PATCH V2 0/4] block: fix concurrent quiesce Ming Lei
                   ` (3 preceding siblings ...)
  2021-11-09  7:11 ` [PATCH V2 4/4] nvme: wait until quiesce is done Ming Lei
@ 2021-11-09 14:41 ` Martin K. Petersen
  2021-11-09 15:14 ` Jens Axboe
  5 siblings, 0 replies; 7+ messages in thread
From: Martin K. Petersen @ 2021-11-09 14:41 UTC (permalink / raw)
  To: Ming Lei
  Cc: Jens Axboe, Yi Zhang, linux-block, linux-nvme,
	Martin K . Petersen, linux-scsi, James Bottomley


Ming,

> Convert SCSI into balanced quiesce and unquiesce by using atomic
> variable as suggested by James, meantime fix previous nvme conversion
> by adding one new API because we have to wait until the started
> quiesce is done.

Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>

-- 
Martin K. Petersen	Oracle Linux Engineering

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH V2 0/4] block: fix concurrent quiesce
  2021-11-09  7:11 [PATCH V2 0/4] block: fix concurrent quiesce Ming Lei
                   ` (4 preceding siblings ...)
  2021-11-09 14:41 ` [PATCH V2 0/4] block: fix concurrent quiesce Martin K. Petersen
@ 2021-11-09 15:14 ` Jens Axboe
  5 siblings, 0 replies; 7+ messages in thread
From: Jens Axboe @ 2021-11-09 15:14 UTC (permalink / raw)
  To: Ming Lei
  Cc: linux-scsi, Martin K . Petersen, linux-block, Yi Zhang,
	linux-nvme, James Bottomley

On Tue, 9 Nov 2021 15:11:40 +0800, Ming Lei wrote:
> Convert SCSI into balanced quiesce and unquiesce by using atomic
> variable as suggested by James, meantime fix previous nvme conversion by
> adding one new API because we have to wait until the started quiesce is
> done.
> 
> V2:
> 	- add comment on scsi's change, as suggested by James, 3/4
> 	- add reviewed-by tag, 4/4
> 
> [...]

Applied, thanks!

[1/4] blk-mq: add one API for waiting until quiesce is done
      commit: 9ef4d0209cbadb63656a7aa29fde49c27ab2b9bf
[2/4] scsi: avoid to quiesce sdev->request_queue two times
      commit: d2b9f12b0f7cf95c43f5fd4a18688d958d39e423
[3/4] scsi: make sure that request queue queiesce and unquiesce balanced
      commit: 93542fbfa7b726d053c01a9399577c03968c4f6b
[4/4] nvme: wait until quiesce is done
      commit: 26af1cd00364ce20dbec66b93ef42f9d42dc6953

Best regards,
-- 
Jens Axboe



^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2021-11-09 15:15 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-11-09  7:11 [PATCH V2 0/4] block: fix concurrent quiesce Ming Lei
2021-11-09  7:11 ` [PATCH V2 1/4] blk-mq: add one API for waiting until quiesce is done Ming Lei
2021-11-09  7:11 ` [PATCH V2 2/4] scsi: avoid to quiesce sdev->request_queue two times Ming Lei
2021-11-09  7:11 ` [PATCH V2 3/4] scsi: make sure that request queue queiesce and unquiesce balanced Ming Lei
2021-11-09  7:11 ` [PATCH V2 4/4] nvme: wait until quiesce is done Ming Lei
2021-11-09 14:41 ` [PATCH V2 0/4] block: fix concurrent quiesce Martin K. Petersen
2021-11-09 15:14 ` Jens Axboe

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).