* [PATCH RESEND v4 0/8] Handle update hardware queues and queue freeze more carefully
@ 2021-08-02 11:26 ` Daniel Wagner
  0 siblings, 0 replies; 59+ messages in thread
From: Daniel Wagner @ 2021-08-02 11:26 UTC (permalink / raw)
  To: linux-nvme
  Cc: linux-kernel, James Smart, Keith Busch, Ming Lei, Sagi Grimberg,
	Hannes Reinecke, Wen Xiong, Daniel Wagner

[forgot to add the linux-nvme mailing list]

Hi,

This updated version makes sure the unfreeze call is done when
recreating the queues. I was able to reproduce hanging I/Os when we go
into error recovery mode for FC and TCP [1]. Unfortunately, I don't
have access to an RDMA setup to verify, but as the code is identical
to the TCP code, RDMA is likely to suffer from the same problem.
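
The underlying problem is that nvme_start_freeze() and nvme_unfreeze()
act on a freeze depth counter and therefore have to be paired. A
minimal sketch of the intended pairing across teardown and reconnect
(transport details elided; ctrl, tagset and the queue count variables
stand in for the per-transport ones):

    /* teardown / error recovery */
    nvme_start_freeze(ctrl);    /* freeze depth +1; new requests wait */
    nvme_stop_queues(ctrl);     /* quiesce: stop dispatching */

    /* reconnect, after the queues have been recreated */
    if (nr_io_queues != prior_ioq_cnt)
            blk_mq_update_nr_hw_queues(tagset, nr_io_queues);
    nvme_unfreeze(ctrl);        /* freeze depth -1; must pair with the freeze */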

Thanks,
Daniel

[1] https://lore.kernel.org/linux-nvme/20210730094907.5vg7qebggttibogz@beryllium.lan/
    https://lore.kernel.org/linux-nvme/20210730113415.wezsrvxv5cu4yz4x@beryllium.lan/


v1:
 - https://lore.kernel.org/linux-nvme/20210625101649.49296-1-dwagner@suse.de/
v2:
 - https://lore.kernel.org/linux-nvme/20210708092755.15660-1-dwagner@suse.de/
 - reviewed tags collected
 - added 'update hardware queues' for all transports
 - added fix for fc hang in nvme_wait_freeze_timeout
v3:
 - https://lore.kernel.org/linux-nvme/20210720124353.127959-1-dwagner@suse.de/
 - dropped 'nvme-fc: Freeze queues before destroying them'
 - added James' two patches
v4:
 - added 'nvme-*: Unfreeze queues on reconnect'
 - added Hannes' reviewed tags
 

Daniel Wagner (5):
  nvme-fc: Update hardware queues before using them
  nvme-rdma: Update number of hardware queues before using them
  nvme-fc: Wait with a timeout for queue to freeze
  nvme-tcp: Unfreeze queues on reconnect
  nvme-rdma: Unfreeze queues on reconnect

Hannes Reinecke (1):
  nvme-tcp: Update number of hardware queues before using them

James Smart (2):
  nvme-fc: avoid race between time out and tear down
  nvme-fc: fix controller reset hang during traffic

 drivers/nvme/host/fc.c   | 28 +++++++++++++++++++---------
 drivers/nvme/host/rdma.c | 15 ++++++++-------
 drivers/nvme/host/tcp.c  | 18 +++++++++---------
 3 files changed, 36 insertions(+), 25 deletions(-)

-- 
2.29.2


* [PATCH v4 1/8] nvme-fc: Update hardware queues before using them
  2021-08-02 11:26 ` Daniel Wagner
@ 2021-08-02 11:26   ` Daniel Wagner
  -1 siblings, 0 replies; 59+ messages in thread
From: Daniel Wagner @ 2021-08-02 11:26 UTC (permalink / raw)
  To: linux-nvme
  Cc: linux-kernel, James Smart, Keith Busch, Ming Lei, Sagi Grimberg,
	Hannes Reinecke, Wen Xiong, Daniel Wagner, James Smart

In case the number of hardware queues changes, update the tagset and
the ctx-to-hctx mapping first, before using the mapping to recreate
and connect the IO queues.
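
Connecting the IO queues allocates requests against the tagset, so the
ctx-to-hctx mapping has to be consistent with the new queue count
before it is used. The resulting order in nvme_fc_recreate_io_queues()
is roughly (error handling elided):

    if (prior_ioq_cnt != nr_io_queues)
            blk_mq_update_nr_hw_queues(&ctrl->tag_set, nr_io_queues);

    nvme_fc_create_hw_io_queues(ctrl, ctrl->ctrl.sqsize + 1);
    nvme_fc_connect_io_queues(ctrl, ctrl->ctrl.sqsize + 1);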

Reviewed-by: James Smart <jsmart2021@gmail.com>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Daniel Wagner <dwagner@suse.de>
---
 drivers/nvme/host/fc.c | 16 ++++++++--------
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/drivers/nvme/host/fc.c b/drivers/nvme/host/fc.c
index 7f462af1b02a..8a903769364f 100644
--- a/drivers/nvme/host/fc.c
+++ b/drivers/nvme/host/fc.c
@@ -2951,14 +2951,6 @@ nvme_fc_recreate_io_queues(struct nvme_fc_ctrl *ctrl)
 	if (ctrl->ctrl.queue_count == 1)
 		return 0;
 
-	ret = nvme_fc_create_hw_io_queues(ctrl, ctrl->ctrl.sqsize + 1);
-	if (ret)
-		goto out_free_io_queues;
-
-	ret = nvme_fc_connect_io_queues(ctrl, ctrl->ctrl.sqsize + 1);
-	if (ret)
-		goto out_delete_hw_queues;
-
 	if (prior_ioq_cnt != nr_io_queues) {
 		dev_info(ctrl->ctrl.device,
 			"reconnect: revising io queue count from %d to %d\n",
@@ -2968,6 +2960,14 @@ nvme_fc_recreate_io_queues(struct nvme_fc_ctrl *ctrl)
 		nvme_unfreeze(&ctrl->ctrl);
 	}
 
+	ret = nvme_fc_create_hw_io_queues(ctrl, ctrl->ctrl.sqsize + 1);
+	if (ret)
+		goto out_free_io_queues;
+
+	ret = nvme_fc_connect_io_queues(ctrl, ctrl->ctrl.sqsize + 1);
+	if (ret)
+		goto out_delete_hw_queues;
+
 	return 0;
 
 out_delete_hw_queues:
-- 
2.29.2


* [PATCH v4 2/8] nvme-tcp: Update number of hardware queues before using them
  2021-08-02 11:26 ` Daniel Wagner
@ 2021-08-02 11:26   ` Daniel Wagner
  -1 siblings, 0 replies; 59+ messages in thread
From: Daniel Wagner @ 2021-08-02 11:26 UTC (permalink / raw)
  To: linux-nvme
  Cc: linux-kernel, James Smart, Keith Busch, Ming Lei, Sagi Grimberg,
	Hannes Reinecke, Wen Xiong, Daniel Wagner

From: Hannes Reinecke <hare@suse.de>

When the number of hardware queues changes during a reset, we should
update the tagset before using it.
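
Condensed, the reworked nvme_tcp_configure_io_queues() then looks as
follows (error paths and the freeze-timeout handling elided):

    u32 prior_q_cnt = ctrl->queue_count;

    nvme_tcp_alloc_io_queues(ctrl);
    if (new) {
            /* fresh controller: allocate tag set and connect_q */
    } else if (prior_q_cnt != ctrl->queue_count) {
            /* wait for the frozen queues, then resize the tag set */
            blk_mq_update_nr_hw_queues(ctrl->tagset, ctrl->queue_count - 1);
            nvme_unfreeze(ctrl);
    }

    return nvme_tcp_start_io_queues(ctrl); /* only with a consistent tag set */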

Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Daniel Wagner <dwagner@suse.de>
---
 drivers/nvme/host/tcp.c | 14 ++++++--------
 1 file changed, 6 insertions(+), 8 deletions(-)

diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c
index 0a97ba02f61e..32268f24f62a 100644
--- a/drivers/nvme/host/tcp.c
+++ b/drivers/nvme/host/tcp.c
@@ -1789,6 +1789,7 @@ static void nvme_tcp_destroy_io_queues(struct nvme_ctrl *ctrl, bool remove)
 static int nvme_tcp_configure_io_queues(struct nvme_ctrl *ctrl, bool new)
 {
 	int ret;
+	u32 prior_q_cnt = ctrl->queue_count;
 
 	ret = nvme_tcp_alloc_io_queues(ctrl);
 	if (ret)
@@ -1806,14 +1807,7 @@ static int nvme_tcp_configure_io_queues(struct nvme_ctrl *ctrl, bool new)
 			ret = PTR_ERR(ctrl->connect_q);
 			goto out_free_tag_set;
 		}
-	}
-
-	ret = nvme_tcp_start_io_queues(ctrl);
-	if (ret)
-		goto out_cleanup_connect_q;
-
-	if (!new) {
-		nvme_start_queues(ctrl);
+	} else if (prior_q_cnt != ctrl->queue_count) {
 		if (!nvme_wait_freeze_timeout(ctrl, NVME_IO_TIMEOUT)) {
 			/*
 			 * If we timed out waiting for freeze we are likely to
@@ -1828,6 +1822,10 @@ static int nvme_tcp_configure_io_queues(struct nvme_ctrl *ctrl, bool new)
 		nvme_unfreeze(ctrl);
 	}
 
+	ret = nvme_tcp_start_io_queues(ctrl);
+	if (ret)
+		goto out_cleanup_connect_q;
+
 	return 0;
 
 out_wait_freeze_timed_out:
-- 
2.29.2


* [PATCH v4 3/8] nvme-rdma: Update number of hardware queues before using them
  2021-08-02 11:26 ` Daniel Wagner
@ 2021-08-02 11:26   ` Daniel Wagner
  -1 siblings, 0 replies; 59+ messages in thread
From: Daniel Wagner @ 2021-08-02 11:26 UTC (permalink / raw)
  To: linux-nvme
  Cc: linux-kernel, James Smart, Keith Busch, Ming Lei, Sagi Grimberg,
	Hannes Reinecke, Wen Xiong, Daniel Wagner

When the number of hardware queues changes during a reset, we should
update the tagset before using it.

Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Daniel Wagner <dwagner@suse.de>
---
 drivers/nvme/host/rdma.c | 13 ++++++-------
 1 file changed, 6 insertions(+), 7 deletions(-)

diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c
index 69ae67652f38..de2a8950d282 100644
--- a/drivers/nvme/host/rdma.c
+++ b/drivers/nvme/host/rdma.c
@@ -965,6 +965,7 @@ static void nvme_rdma_destroy_io_queues(struct nvme_rdma_ctrl *ctrl,
 static int nvme_rdma_configure_io_queues(struct nvme_rdma_ctrl *ctrl, bool new)
 {
 	int ret;
+	u32 prior_q_cnt = ctrl->ctrl.queue_count;
 
 	ret = nvme_rdma_alloc_io_queues(ctrl);
 	if (ret)
@@ -982,13 +983,7 @@ static int nvme_rdma_configure_io_queues(struct nvme_rdma_ctrl *ctrl, bool new)
 			ret = PTR_ERR(ctrl->ctrl.connect_q);
 			goto out_free_tag_set;
 		}
-	}
-
-	ret = nvme_rdma_start_io_queues(ctrl);
-	if (ret)
-		goto out_cleanup_connect_q;
-
-	if (!new) {
+	} else if (prior_q_cnt != ctrl->ctrl.queue_count) {
 		nvme_start_queues(&ctrl->ctrl);
 		if (!nvme_wait_freeze_timeout(&ctrl->ctrl, NVME_IO_TIMEOUT)) {
 			/*
@@ -1004,6 +999,10 @@ static int nvme_rdma_configure_io_queues(struct nvme_rdma_ctrl *ctrl, bool new)
 		nvme_unfreeze(&ctrl->ctrl);
 	}
 
+	ret = nvme_rdma_start_io_queues(ctrl);
+	if (ret)
+		goto out_cleanup_connect_q;
+
 	return 0;
 
 out_wait_freeze_timed_out:
-- 
2.29.2


* [PATCH v4 4/8] nvme-fc: Wait with a timeout for queue to freeze
  2021-08-02 11:26 ` Daniel Wagner
@ 2021-08-02 11:26   ` Daniel Wagner
  -1 siblings, 0 replies; 59+ messages in thread
From: Daniel Wagner @ 2021-08-02 11:26 UTC (permalink / raw)
  To: linux-nvme
  Cc: linux-kernel, James Smart, Keith Busch, Ming Lei, Sagi Grimberg,
	Hannes Reinecke, Wen Xiong, Daniel Wagner, James Smart

Do not wait indefinitely for all queues to freeze. Instead, use a
timeout and abort the operation if we get stuck.
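
For reference, nvme_wait_freeze_timeout() returns the remaining time,
i.e. 0 when the freeze did not complete in time; a rough sketch of its
core loop (locking elided, see drivers/nvme/host/core.c):

    list_for_each_entry(ns, &ctrl->namespaces, list) {
            timeout = blk_mq_freeze_queue_wait_timeout(ns->queue, timeout);
            if (timeout <= 0)
                    break;  /* expired before all queues were frozen */
    }
    return timeout;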

Reviewed-by: James Smart <jsmart2021@gmail.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Daniel Wagner <dwagner@suse.de>
---
 drivers/nvme/host/fc.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/drivers/nvme/host/fc.c b/drivers/nvme/host/fc.c
index 8a903769364f..dbb8ad816df8 100644
--- a/drivers/nvme/host/fc.c
+++ b/drivers/nvme/host/fc.c
@@ -2955,7 +2955,14 @@ nvme_fc_recreate_io_queues(struct nvme_fc_ctrl *ctrl)
 		dev_info(ctrl->ctrl.device,
 			"reconnect: revising io queue count from %d to %d\n",
 			prior_ioq_cnt, nr_io_queues);
-		nvme_wait_freeze(&ctrl->ctrl);
+		if (!nvme_wait_freeze_timeout(&ctrl->ctrl, NVME_IO_TIMEOUT)) {
+			/*
+			 * If we timed out waiting for freeze we are likely to
+			 * be stuck.  Fail the controller initialization just
+			 * to be safe.
+			 */
+			return -ENODEV;
+		}
 		blk_mq_update_nr_hw_queues(&ctrl->tag_set, nr_io_queues);
 		nvme_unfreeze(&ctrl->ctrl);
 	}
-- 
2.29.2


* [PATCH v4 5/8] nvme-fc: avoid race between time out and tear down
  2021-08-02 11:26 ` Daniel Wagner
@ 2021-08-02 11:26   ` Daniel Wagner
  -1 siblings, 0 replies; 59+ messages in thread
From: Daniel Wagner @ 2021-08-02 11:26 UTC (permalink / raw)
  To: linux-nvme
  Cc: linux-kernel, James Smart, Keith Busch, Ming Lei, Sagi Grimberg,
	Hannes Reinecke, Wen Xiong, James Smart, Chao Leng,
	Daniel Wagner

From: James Smart <jsmart2021@gmail.com>

To avoid a race between time out and tear down, in the tear down
process we first quiesce the queue, and then delete the timer and
cancel the time out work for the queue.

This patch merges the admin and io sync ops into the queue teardown logic
as shown in the RDMA patch 3017013dcc "nvme-rdma: avoid race between time
out and tear down". There is no teardown_lock in nvme-fc.
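
The resulting teardown ordering for both the IO and the admin side is
shown below, condensed from the diff; the sync step guarantees that no
time out handler is still running once we start terminating exchanges:

    /* IO queues */
    nvme_stop_queues(&ctrl->ctrl);          /* quiesce: no new dispatch */
    nvme_sync_io_queues(&ctrl->ctrl);       /* del timer, cancel timeout work */
    blk_mq_tagset_busy_iter(&ctrl->tag_set,
                    nvme_fc_terminate_exchange, &ctrl->ctrl);

    /* admin queue, same ordering */
    blk_mq_quiesce_queue(ctrl->ctrl.admin_q);
    blk_sync_queue(ctrl->ctrl.admin_q);
    blk_mq_tagset_busy_iter(&ctrl->admin_tag_set,
                    nvme_fc_terminate_exchange, &ctrl->ctrl);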

Signed-off-by: James Smart <jsmart2021@gmail.com>
CC: Chao Leng <lengchao@huawei.com>
Tested-by: Daniel Wagner <dwagner@suse.de>
[dwagner: updated commit id referenced in commit message]
Reviewed-by: Daniel Wagner <dwagner@suse.de>
Reviewed-by: Hannes Reinecke <hare@suse.de>
---
 drivers/nvme/host/fc.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/nvme/host/fc.c b/drivers/nvme/host/fc.c
index dbb8ad816df8..133b87db4f1d 100644
--- a/drivers/nvme/host/fc.c
+++ b/drivers/nvme/host/fc.c
@@ -2487,6 +2487,7 @@ __nvme_fc_abort_outstanding_ios(struct nvme_fc_ctrl *ctrl, bool start_queues)
 	 */
 	if (ctrl->ctrl.queue_count > 1) {
 		nvme_stop_queues(&ctrl->ctrl);
+		nvme_sync_io_queues(&ctrl->ctrl);
 		blk_mq_tagset_busy_iter(&ctrl->tag_set,
 				nvme_fc_terminate_exchange, &ctrl->ctrl);
 		blk_mq_tagset_wait_completed_request(&ctrl->tag_set);
@@ -2510,6 +2511,7 @@ __nvme_fc_abort_outstanding_ios(struct nvme_fc_ctrl *ctrl, bool start_queues)
 	 * clean up the admin queue. Same thing as above.
 	 */
 	blk_mq_quiesce_queue(ctrl->ctrl.admin_q);
+	blk_sync_queue(ctrl->ctrl.admin_q);
 	blk_mq_tagset_busy_iter(&ctrl->admin_tag_set,
 				nvme_fc_terminate_exchange, &ctrl->ctrl);
 	blk_mq_tagset_wait_completed_request(&ctrl->admin_tag_set);
-- 
2.29.2


* [PATCH v4 6/8] nvme-fc: fix controller reset hang during traffic
  2021-08-02 11:26 ` Daniel Wagner
@ 2021-08-02 11:26   ` Daniel Wagner
  -1 siblings, 0 replies; 59+ messages in thread
From: Daniel Wagner @ 2021-08-02 11:26 UTC (permalink / raw)
  To: linux-nvme
  Cc: linux-kernel, James Smart, Keith Busch, Ming Lei, Sagi Grimberg,
	Hannes Reinecke, Wen Xiong, James Smart, Daniel Wagner

From: James Smart <jsmart2021@gmail.com>

commit fe35ec58f0d3 ("block: update hctx map when use multiple maps")
exposed an issue where we may hang trying to wait for queue freeze
during I/O. We call blk_mq_update_nr_hw_queues which may attempt to freeze
the queue. However, we never started a queue freeze when starting the
reset, which means that there are inflight requests that entered the
queue which we will not complete once the queue is quiesced.

So start a freeze before we quiesce the queue, and unfreeze the queue
after we successfully connected the I/O queues (the unfreeze is already
present in the code). blk_mq_update_nr_hw_queues will be called only
after we are sure that the queue was already frozen.

This follows how the PCI driver handles resets.

This patch adds the logic introduced in commit 9f98772ba307 ("nvme-rdma:
fix controller reset hang during traffic").
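
Together with the freeze timeout from patch 4, the reset path then
follows this pattern (condensed; error handling elided):

    /* __nvme_fc_abort_outstanding_ios(): teardown */
    if (ctrl->ctrl.queue_count > 1) {
            nvme_start_freeze(&ctrl->ctrl); /* block new requests */
            nvme_stop_queues(&ctrl->ctrl);  /* quiesce dispatch */
    }

    /* nvme_fc_recreate_io_queues(): reconnect */
    if (prior_ioq_cnt != nr_io_queues) {
            if (!nvme_wait_freeze_timeout(&ctrl->ctrl, NVME_IO_TIMEOUT))
                    return -ENODEV;
            blk_mq_update_nr_hw_queues(&ctrl->tag_set, nr_io_queues);
    }
    nvme_unfreeze(&ctrl->ctrl);             /* pairs with the start_freeze */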

Signed-off-by: James Smart <jsmart2021@gmail.com>
CC: Sagi Grimberg <sagi@grimberg.me>
[dwagner: call nvme_unfreeze() unconditionally in
          nvme_fc_recreate_io_queues() to match the nvme_start_freeze()]
Tested-by: Daniel Wagner <dwagner@suse.de>
Reviewed-by: Daniel Wagner <dwagner@suse.de>
---
 drivers/nvme/host/fc.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/nvme/host/fc.c b/drivers/nvme/host/fc.c
index 133b87db4f1d..b292af0fd655 100644
--- a/drivers/nvme/host/fc.c
+++ b/drivers/nvme/host/fc.c
@@ -2486,6 +2486,7 @@ __nvme_fc_abort_outstanding_ios(struct nvme_fc_ctrl *ctrl, bool start_queues)
 	 * (but with error status).
 	 */
 	if (ctrl->ctrl.queue_count > 1) {
+		nvme_start_freeze(&ctrl->ctrl);
 		nvme_stop_queues(&ctrl->ctrl);
 		nvme_sync_io_queues(&ctrl->ctrl);
 		blk_mq_tagset_busy_iter(&ctrl->tag_set,
@@ -2966,8 +2967,8 @@ nvme_fc_recreate_io_queues(struct nvme_fc_ctrl *ctrl)
 			return -ENODEV;
 		}
 		blk_mq_update_nr_hw_queues(&ctrl->tag_set, nr_io_queues);
-		nvme_unfreeze(&ctrl->ctrl);
 	}
+	nvme_unfreeze(&ctrl->ctrl);
 
 	ret = nvme_fc_create_hw_io_queues(ctrl, ctrl->ctrl.sqsize + 1);
 	if (ret)
-- 
2.29.2


* [PATCH v4 7/8] nvme-tcp: Unfreeze queues on reconnect
  2021-08-02 11:26 ` Daniel Wagner
@ 2021-08-02 11:26   ` Daniel Wagner
  -1 siblings, 0 replies; 59+ messages in thread
From: Daniel Wagner @ 2021-08-02 11:26 UTC (permalink / raw)
  To: linux-nvme
  Cc: linux-kernel, James Smart, Keith Busch, Ming Lei, Sagi Grimberg,
	Hannes Reinecke, Wen Xiong, Daniel Wagner

During queue teardown in nvme_tcp_teardown_io_queues(), freeze is
called unconditionally. When we reconnect, we need to pair the freeze
with an unfreeze to avoid hanging I/Os. For newly created connections
this is not needed.
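
Since the freeze depth is a counter, an unfreeze without a preceding
freeze would drive it negative, so the unfreeze is guarded to the
reconnect case:

    if (!new)
            nvme_unfreeze(ctrl);    /* pairs with the freeze done in teardown */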

Fixes: 2875b0aecabe ("nvme-tcp: fix controller reset hang during traffic")
Signed-off-by: Daniel Wagner <dwagner@suse.de>
---
 drivers/nvme/host/tcp.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c
index 32268f24f62a..097f7dd10ed3 100644
--- a/drivers/nvme/host/tcp.c
+++ b/drivers/nvme/host/tcp.c
@@ -1819,9 +1819,11 @@ static int nvme_tcp_configure_io_queues(struct nvme_ctrl *ctrl, bool new)
 		}
 		blk_mq_update_nr_hw_queues(ctrl->tagset,
 			ctrl->queue_count - 1);
-		nvme_unfreeze(ctrl);
 	}
 
+	if (!new)
+		nvme_unfreeze(ctrl);
+
 	ret = nvme_tcp_start_io_queues(ctrl);
 	if (ret)
 		goto out_cleanup_connect_q;
-- 
2.29.2


* [PATCH v4 8/8] nvme-rdma: Unfreeze queues on reconnect
  2021-08-02 11:26 ` Daniel Wagner
@ 2021-08-02 11:26   ` Daniel Wagner
  -1 siblings, 0 replies; 59+ messages in thread
From: Daniel Wagner @ 2021-08-02 11:26 UTC (permalink / raw)
  To: linux-nvme
  Cc: linux-kernel, James Smart, Keith Busch, Ming Lei, Sagi Grimberg,
	Hannes Reinecke, Wen Xiong, Daniel Wagner

During queue teardown in nvme_rdma_teardown_io_queues(), freeze is
called unconditionally. When we reconnect, we need to pair the freeze
with an unfreeze to avoid hanging I/Os. For newly created connections
this is not needed.

Fixes: 9f98772ba307 ("nvme-rdma: fix controller reset hang during traffic")
Signed-off-by: Daniel Wagner <dwagner@suse.de>
---
 drivers/nvme/host/rdma.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c
index de2a8950d282..21a8a5353af0 100644
--- a/drivers/nvme/host/rdma.c
+++ b/drivers/nvme/host/rdma.c
@@ -901,6 +901,8 @@ static int nvme_rdma_configure_admin_queue(struct nvme_rdma_ctrl *ctrl,
 			error = PTR_ERR(ctrl->ctrl.admin_q);
 			goto out_cleanup_fabrics_q;
 		}
+	} else {
+		nvme_unfreeze(&ctrl->ctrl);
 	}
 
 	error = nvme_rdma_start_queue(ctrl, 0);
-- 
2.29.2


* Re: [PATCH v4 1/8] nvme-fc: Update hardware queues before using them
  2021-08-02 11:26   ` Daniel Wagner
@ 2021-08-02 19:34     ` Himanshu Madhani
  -1 siblings, 0 replies; 59+ messages in thread
From: Himanshu Madhani @ 2021-08-02 19:34 UTC (permalink / raw)
  To: Daniel Wagner, linux-nvme
  Cc: linux-kernel, James Smart, Keith Busch, Ming Lei, Sagi Grimberg,
	Hannes Reinecke, Wen Xiong, James Smart



On 8/2/21 6:26 AM, Daniel Wagner wrote:
> In case the number of hardware queues changes, update the tagset and
> the ctx-to-hctx mapping first, before using the mapping to recreate
> and connect the IO queues.
> 
> Reviewed-by: James Smart <jsmart2021@gmail.com>
> Reviewed-by: Ming Lei <ming.lei@redhat.com>
> Reviewed-by: Hannes Reinecke <hare@suse.de>
> Signed-off-by: Daniel Wagner <dwagner@suse.de>
> ---
>   drivers/nvme/host/fc.c | 16 ++++++++--------
>   1 file changed, 8 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/nvme/host/fc.c b/drivers/nvme/host/fc.c
> index 7f462af1b02a..8a903769364f 100644
> --- a/drivers/nvme/host/fc.c
> +++ b/drivers/nvme/host/fc.c
> @@ -2951,14 +2951,6 @@ nvme_fc_recreate_io_queues(struct nvme_fc_ctrl *ctrl)
>   	if (ctrl->ctrl.queue_count == 1)
>   		return 0;
>   
> -	ret = nvme_fc_create_hw_io_queues(ctrl, ctrl->ctrl.sqsize + 1);
> -	if (ret)
> -		goto out_free_io_queues;
> -
> -	ret = nvme_fc_connect_io_queues(ctrl, ctrl->ctrl.sqsize + 1);
> -	if (ret)
> -		goto out_delete_hw_queues;
> -
>   	if (prior_ioq_cnt != nr_io_queues) {
>   		dev_info(ctrl->ctrl.device,
>   			"reconnect: revising io queue count from %d to %d\n",
> @@ -2968,6 +2960,14 @@ nvme_fc_recreate_io_queues(struct nvme_fc_ctrl *ctrl)
>   		nvme_unfreeze(&ctrl->ctrl);
>   	}
>   
> +	ret = nvme_fc_create_hw_io_queues(ctrl, ctrl->ctrl.sqsize + 1);
> +	if (ret)
> +		goto out_free_io_queues;
> +
> +	ret = nvme_fc_connect_io_queues(ctrl, ctrl->ctrl.sqsize + 1);
> +	if (ret)
> +		goto out_delete_hw_queues;
> +
>   	return 0;
>   
>   out_delete_hw_queues:
> 

Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>

-- 
Himanshu Madhani                                Oracle Linux Engineering

* Re: [PATCH v4 4/8] nvme-fc: Wait with a timeout for queue to freeze
  2021-08-02 11:26   ` Daniel Wagner
@ 2021-08-02 19:36     ` Himanshu Madhani
  -1 siblings, 0 replies; 59+ messages in thread
From: Himanshu Madhani @ 2021-08-02 19:36 UTC (permalink / raw)
  To: Daniel Wagner, linux-nvme
  Cc: linux-kernel, James Smart, Keith Busch, Ming Lei, Sagi Grimberg,
	Hannes Reinecke, Wen Xiong, James Smart



On 8/2/21 6:26 AM, Daniel Wagner wrote:
> Do not wait indefinitely for all queues to freeze. Instead, use a
> timeout and abort the operation if we get stuck.
> 
> Reviewed-by: James Smart <jsmart2021@gmail.com>
> Reviewed-by: Hannes Reinecke <hare@suse.de>
> Signed-off-by: Daniel Wagner <dwagner@suse.de>
> ---
>   drivers/nvme/host/fc.c | 9 ++++++++-
>   1 file changed, 8 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/nvme/host/fc.c b/drivers/nvme/host/fc.c
> index 8a903769364f..dbb8ad816df8 100644
> --- a/drivers/nvme/host/fc.c
> +++ b/drivers/nvme/host/fc.c
> @@ -2955,7 +2955,14 @@ nvme_fc_recreate_io_queues(struct nvme_fc_ctrl *ctrl)
>   		dev_info(ctrl->ctrl.device,
>   			"reconnect: revising io queue count from %d to %d\n",
>   			prior_ioq_cnt, nr_io_queues);
> -		nvme_wait_freeze(&ctrl->ctrl);
> +		if (!nvme_wait_freeze_timeout(&ctrl->ctrl, NVME_IO_TIMEOUT)) {
> +			/*
> +			 * If we timed out waiting for freeze we are likely to
> +			 * be stuck.  Fail the controller initialization just
> +			 * to be safe.
> +			 */
> +			return -ENODEV;
> +		}
>   		blk_mq_update_nr_hw_queues(&ctrl->tag_set, nr_io_queues);
>   		nvme_unfreeze(&ctrl->ctrl);
>   	}
> 

Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>

-- 
Himanshu Madhani                                Oracle Linux Engineering

* Re: [PATCH v4 5/8] nvme-fc: avoid race between time out and tear down
  2021-08-02 11:26   ` Daniel Wagner
@ 2021-08-02 19:38     ` Himanshu Madhani
  -1 siblings, 0 replies; 59+ messages in thread
From: Himanshu Madhani @ 2021-08-02 19:38 UTC (permalink / raw)
  To: Daniel Wagner, linux-nvme
  Cc: linux-kernel, James Smart, Keith Busch, Ming Lei, Sagi Grimberg,
	Hannes Reinecke, Wen Xiong, James Smart, Chao Leng



On 8/2/21 6:26 AM, Daniel Wagner wrote:
> From: James Smart <jsmart2021@gmail.com>
> 
> To avoid a race between time out and tear down, in the tear down
> process we first quiesce the queue, and then delete the timer and
> cancel the time out work for the queue.
> 
> This patch merges the admin and io sync ops into the queue teardown logic
> as shown in the RDMA patch 3017013dcc "nvme-rdma: avoid race between time
> out and tear down". There is no teardown_lock in nvme-fc.
> 
> Signed-off-by: James Smart <jsmart2021@gmail.com>
> CC: Chao Leng <lengchao@huawei.com>
> Tested-by: Daniel Wagner <dwagner@suse.de>
> [dwagner: updated commit id referenced in commit message]
> Reviewed-by: Daniel Wagner <dwagner@suse.de>
> Reviewed-by: Hannes Reinecke <hare@suse.de>
> ---
>   drivers/nvme/host/fc.c | 2 ++
>   1 file changed, 2 insertions(+)
> 
> diff --git a/drivers/nvme/host/fc.c b/drivers/nvme/host/fc.c
> index dbb8ad816df8..133b87db4f1d 100644
> --- a/drivers/nvme/host/fc.c
> +++ b/drivers/nvme/host/fc.c
> @@ -2487,6 +2487,7 @@ __nvme_fc_abort_outstanding_ios(struct nvme_fc_ctrl *ctrl, bool start_queues)
>   	 */
>   	if (ctrl->ctrl.queue_count > 1) {
>   		nvme_stop_queues(&ctrl->ctrl);
> +		nvme_sync_io_queues(&ctrl->ctrl);
>   		blk_mq_tagset_busy_iter(&ctrl->tag_set,
>   				nvme_fc_terminate_exchange, &ctrl->ctrl);
>   		blk_mq_tagset_wait_completed_request(&ctrl->tag_set);
> @@ -2510,6 +2511,7 @@ __nvme_fc_abort_outstanding_ios(struct nvme_fc_ctrl *ctrl, bool start_queues)
>   	 * clean up the admin queue. Same thing as above.
>   	 */
>   	blk_mq_quiesce_queue(ctrl->ctrl.admin_q);
> +	blk_sync_queue(ctrl->ctrl.admin_q);
>   	blk_mq_tagset_busy_iter(&ctrl->admin_tag_set,
>   				nvme_fc_terminate_exchange, &ctrl->ctrl);
>   	blk_mq_tagset_wait_completed_request(&ctrl->admin_tag_set);
> 

Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>

-- 
Himanshu Madhani                                Oracle Linux Engineering

* Re: [PATCH v4 6/8] nvme-fc: fix controller reset hang during traffic
  2021-08-02 11:26   ` Daniel Wagner
@ 2021-08-02 19:39     ` Himanshu Madhani
  -1 siblings, 0 replies; 59+ messages in thread
From: Himanshu Madhani @ 2021-08-02 19:39 UTC (permalink / raw)
  To: Daniel Wagner, linux-nvme
  Cc: linux-kernel, James Smart, Keith Busch, Ming Lei, Sagi Grimberg,
	Hannes Reinecke, Wen Xiong, James Smart



On 8/2/21 6:26 AM, Daniel Wagner wrote:
> From: James Smart <jsmart2021@gmail.com>
> 
> commit fe35ec58f0d3 ("block: update hctx map when use multiple maps")
> exposed an issue where we may hang trying to wait for queue freeze
> during I/O. We call blk_mq_update_nr_hw_queues which may attempt to freeze
> the queue. However, we never started a queue freeze when starting the
> reset, which means that there are inflight requests that entered the
> queue which we will not complete once the queue is quiesced.
> 
> So start a freeze before we quiesce the queue, and unfreeze the queue
> after we successfully connected the I/O queues (the unfreeze is already
> present in the code). blk_mq_update_nr_hw_queues will be called only
> after we are sure that the queue was already frozen.
> 
> This follows how the PCI driver handles resets.
> 
> This patch adds the logic introduced in commit 9f98772ba307 ("nvme-rdma:
> fix controller reset hang during traffic").
> 
> Signed-off-by: James Smart <jsmart2021@gmail.com>
> CC: Sagi Grimberg <sagi@grimberg.me>
> [dwagner: call nvme_unfreeze() unconditionally in
>            nvme_fc_recreate_io_queues() to match the nvme_start_freeze()]
> Tested-by: Daniel Wagner <dwagner@suse.de>
> Reviewed-by: Daniel Wagner <dwagner@suse.de>
> ---
>   drivers/nvme/host/fc.c | 3 ++-
>   1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/nvme/host/fc.c b/drivers/nvme/host/fc.c
> index 133b87db4f1d..b292af0fd655 100644
> --- a/drivers/nvme/host/fc.c
> +++ b/drivers/nvme/host/fc.c
> @@ -2486,6 +2486,7 @@ __nvme_fc_abort_outstanding_ios(struct nvme_fc_ctrl *ctrl, bool start_queues)
>   	 * (but with error status).
>   	 */
>   	if (ctrl->ctrl.queue_count > 1) {
> +		nvme_start_freeze(&ctrl->ctrl);
>   		nvme_stop_queues(&ctrl->ctrl);
>   		nvme_sync_io_queues(&ctrl->ctrl);
>   		blk_mq_tagset_busy_iter(&ctrl->tag_set,
> @@ -2966,8 +2967,8 @@ nvme_fc_recreate_io_queues(struct nvme_fc_ctrl *ctrl)
>   			return -ENODEV;
>   		}
>   		blk_mq_update_nr_hw_queues(&ctrl->tag_set, nr_io_queues);
> -		nvme_unfreeze(&ctrl->ctrl);
>   	}
> +	nvme_unfreeze(&ctrl->ctrl);
>   
>   	ret = nvme_fc_create_hw_io_queues(ctrl, ctrl->ctrl.sqsize + 1);
>   	if (ret)
> 

Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>

-- 
Himanshu Madhani                                Oracle Linux Engineering

* Re: [PATCH v4 6/8] nvme-fc: fix controller reset hang during traffic
  2021-08-02 11:26   ` Daniel Wagner
@ 2021-08-04  7:23     ` Hannes Reinecke
  -1 siblings, 0 replies; 59+ messages in thread
From: Hannes Reinecke @ 2021-08-04  7:23 UTC (permalink / raw)
  To: Daniel Wagner, linux-nvme
  Cc: linux-kernel, James Smart, Keith Busch, Ming Lei, Sagi Grimberg,
	Wen Xiong, James Smart

On 8/2/21 1:26 PM, Daniel Wagner wrote:
> From: James Smart <jsmart2021@gmail.com>
> 
> commit fe35ec58f0d3 ("block: update hctx map when use multiple maps")
> exposed an issue where we may hang trying to wait for queue freeze
> during I/O. We call blk_mq_update_nr_hw_queues which may attempt to freeze
> the queue. However, we never started a queue freeze when starting the
> reset, which means that there are inflight requests that entered the
> queue which we will not complete once the queue is quiesced.
> 
> So start a freeze before we quiesce the queue, and unfreeze the queue
> after we successfully connected the I/O queues (the unfreeze is already
> present in the code). blk_mq_update_nr_hw_queues will be called only
> after we are sure that the queue was already frozen.
> 
> This follows how the PCI driver handles resets.
> 
> This patch adds the logic introduced in commit 9f98772ba307 ("nvme-rdma:
> fix controller reset hang during traffic").
> 
> Signed-off-by: James Smart <jsmart2021@gmail.com>
> CC: Sagi Grimberg <sagi@grimberg.me>
> [dwagner: call nvme_unfreeze() unconditionally in
>           nvme_fc_recreate_io_queues() to match the nvme_start_freeze()]
> Tested-by: Daniel Wagner <dwagner@suse.de>
> Reviewed-by: Daniel Wagner <dwagner@suse.de>
> ---
>  drivers/nvme/host/fc.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/nvme/host/fc.c b/drivers/nvme/host/fc.c
> index 133b87db4f1d..b292af0fd655 100644
> --- a/drivers/nvme/host/fc.c
> +++ b/drivers/nvme/host/fc.c
> @@ -2486,6 +2486,7 @@ __nvme_fc_abort_outstanding_ios(struct nvme_fc_ctrl *ctrl, bool start_queues)
>  	 * (but with error status).
>  	 */
>  	if (ctrl->ctrl.queue_count > 1) {
> +		nvme_start_freeze(&ctrl->ctrl);
>  		nvme_stop_queues(&ctrl->ctrl);
>  		nvme_sync_io_queues(&ctrl->ctrl);
>  		blk_mq_tagset_busy_iter(&ctrl->tag_set,
> @@ -2966,8 +2967,8 @@ nvme_fc_recreate_io_queues(struct nvme_fc_ctrl *ctrl)
>  			return -ENODEV;
>  		}
>  		blk_mq_update_nr_hw_queues(&ctrl->tag_set, nr_io_queues);
> -		nvme_unfreeze(&ctrl->ctrl);
>  	}
> +	nvme_unfreeze(&ctrl->ctrl);
>  
>  	ret = nvme_fc_create_hw_io_queues(ctrl, ctrl->ctrl.sqsize + 1);
>  	if (ret)
> 
There is still an imbalance now, as we're always calling
'nvme_unfreeze()' (irrespective of the number of queues), but only
call 'nvme_start_freeze()' if we have more than one queue.

This might lead to an imbalance on the mq_freeze_depth counter.
Wouldn't it be better to move the call to 'nvme_start_freeze()' out of
the if() condition to avoid the imbalance?

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		           Kernel Storage Architect
hare@suse.de			                  +49 911 74053 688
SUSE Software Solutions Germany GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), GF: Felix Imendörffer

^ permalink raw reply	[flat|nested] 59+ messages in thread
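
To make the imbalance concrete, here is a minimal userspace model of the
freeze/unfreeze pairing (illustration only; the names are borrowed from
blk-mq, the counter logic is deliberately simplified, and this is not
kernel code):

#include <assert.h>
#include <stdio.h>

static int mq_freeze_depth;             /* models q->mq_freeze_depth */

static void nvme_start_freeze(void) { mq_freeze_depth++; }
static void nvme_unfreeze(void)     { mq_freeze_depth--; }

static void reset_path(int queue_count)
{
        if (queue_count > 1)            /* conditional freeze ... */
                nvme_start_freeze();
        /* ... teardown and reconnect elided ... */
        nvme_unfreeze();                /* ... unconditional unfreeze */
}

int main(void)
{
        reset_path(1);                  /* controller with admin queue only */
        printf("freeze depth after reset: %d\n", mq_freeze_depth);
        assert(mq_freeze_depth == 0);   /* trips: the depth is now -1 */
        return 0;
}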

* Re: [PATCH v4 8/8] nvme-rdma: Unfreeze queues on reconnect
  2021-08-02 11:26   ` Daniel Wagner
@ 2021-08-04  7:25     ` Hannes Reinecke
  -1 siblings, 0 replies; 59+ messages in thread
From: Hannes Reinecke @ 2021-08-04  7:25 UTC (permalink / raw)
  To: Daniel Wagner, linux-nvme
  Cc: linux-kernel, James Smart, Keith Busch, Ming Lei, Sagi Grimberg,
	Wen Xiong

On 8/2/21 1:26 PM, Daniel Wagner wrote:
> During the queue teardown in nvme_rdma_teardown_io_queues() freeze is
> called unconditionally. When we reconnect, we need to pair the freeze
> with an unfreeze to avoid hanging I/Os. For newly created connections
> this is not needed.
> 
> Fixes: 9f98772ba307 ("nvme-rdma: fix controller reset hang during traffic")
> Signed-off-by: Daniel Wagner <dwagner@suse.de>
> ---
>  drivers/nvme/host/rdma.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c
> index de2a8950d282..21a8a5353af0 100644
> --- a/drivers/nvme/host/rdma.c
> +++ b/drivers/nvme/host/rdma.c
> @@ -901,6 +901,8 @@ static int nvme_rdma_configure_admin_queue(struct nvme_rdma_ctrl *ctrl,
>  			error = PTR_ERR(ctrl->ctrl.admin_q);
>  			goto out_cleanup_fabrics_q;
>  		}
> +	} else {
> +		nvme_unfreeze(&ctrl->ctrl);
>  	}
>  
>  	error = nvme_rdma_start_queue(ctrl, 0);
> 
Reviewed-by: Hannes Reinecke <hare@suse.de>

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		           Kernel Storage Architect
hare@suse.de			                  +49 911 74053 688
SUSE Software Solutions Germany GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), GF: Felix Imendörffer

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH v4 6/8] nvme-fc: fix controller reset hang during traffic
  2021-08-04  7:23     ` Hannes Reinecke
@ 2021-08-04  8:08       ` Daniel Wagner
  -1 siblings, 0 replies; 59+ messages in thread
From: Daniel Wagner @ 2021-08-04  8:08 UTC (permalink / raw)
  To: Hannes Reinecke
  Cc: linux-nvme, linux-kernel, James Smart, Keith Busch, Ming Lei,
	Sagi Grimberg, Wen Xiong, James Smart

On Wed, Aug 04, 2021 at 09:23:49AM +0200, Hannes Reinecke wrote:
> There still is an imbalance now, as we're always calling
> 'nvme_unfreeze()' (irrespective of the number of queues), but will only
> call 'nvme_start_freeze()' if we have more than one queue.
> 
> This might lead to an imbalance on the mq_freeze_depth counter.
> Wouldn't it be better to move the call to 'nvme_start_freeze()' out of
> the if() condition to avoid the imbalance?

What about something like nvme_is_frozen(), which would avoid the
open-coded tracking of the queue state as it is right now?

^ permalink raw reply	[flat|nested] 59+ messages in thread
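
For reference, a hypothetical nvme_is_frozen() along those lines could
look like the sketch below. This helper does not exist in mainline; it
assumes the nvme_ctrl namespaces list of that era and that peeking at
q->mq_freeze_depth from a driver is acceptable:

/*
 * Hypothetical helper (not in mainline): report whether any of the
 * controller's namespace queues currently holds a freeze reference,
 * so reset paths could pair freeze/unfreeze without open-coded state.
 */
static bool nvme_is_frozen(struct nvme_ctrl *ctrl)
{
        struct nvme_ns *ns;
        bool frozen = false;

        down_read(&ctrl->namespaces_rwsem);
        list_for_each_entry(ns, &ctrl->namespaces, list) {
                if (ns->queue->mq_freeze_depth > 0) {
                        frozen = true;
                        break;
                }
        }
        up_read(&ctrl->namespaces_rwsem);

        return frozen;
}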

* Re: [PATCH v4 2/8] nvme-tcp: Update number of hardware queues before using them
  2021-08-02 11:26   ` Daniel Wagner
@ 2021-08-06 19:57     ` Sagi Grimberg
  -1 siblings, 0 replies; 59+ messages in thread
From: Sagi Grimberg @ 2021-08-06 19:57 UTC (permalink / raw)
  To: Daniel Wagner, linux-nvme
  Cc: linux-kernel, James Smart, Keith Busch, Ming Lei,
	Hannes Reinecke, Wen Xiong


> From: Hannes Reinecke <hare@suse.de>
> 
> When the number of hardware queues changes during a reset, we should
> update the tagset before using it.
> 
> Signed-off-by: Hannes Reinecke <hare@suse.de>
> Signed-off-by: Daniel Wagner <dwagner@suse.de>
> ---
>   drivers/nvme/host/tcp.c | 14 ++++++--------
>   1 file changed, 6 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c
> index 0a97ba02f61e..32268f24f62a 100644
> --- a/drivers/nvme/host/tcp.c
> +++ b/drivers/nvme/host/tcp.c
> @@ -1789,6 +1789,7 @@ static void nvme_tcp_destroy_io_queues(struct nvme_ctrl *ctrl, bool remove)
>   static int nvme_tcp_configure_io_queues(struct nvme_ctrl *ctrl, bool new)
>   {
>   	int ret;
> +	u32 prior_q_cnt = ctrl->queue_count;
>   
>   	ret = nvme_tcp_alloc_io_queues(ctrl);
>   	if (ret)
> @@ -1806,14 +1807,7 @@ static int nvme_tcp_configure_io_queues(struct nvme_ctrl *ctrl, bool new)
>   			ret = PTR_ERR(ctrl->connect_q);
>   			goto out_free_tag_set;
>   		}
> -	}
> -
> -	ret = nvme_tcp_start_io_queues(ctrl);
> -	if (ret)
> -		goto out_cleanup_connect_q;
> -
> -	if (!new) {
> -		nvme_start_queues(ctrl);
> +	} else if (prior_q_cnt != ctrl->queue_count) {

So if the queue count did not change, we don't wait to make sure
the queue's q_usage_counter ref made it to zero? What guarantees that it
did?

>   		if (!nvme_wait_freeze_timeout(ctrl, NVME_IO_TIMEOUT)) {
>   			/*
>   			 * If we timed out waiting for freeze we are likely to
> @@ -1828,6 +1822,10 @@ static int nvme_tcp_configure_io_queues(struct nvme_ctrl *ctrl, bool new)
>   		nvme_unfreeze(ctrl);
>   	}
>   
> +	ret = nvme_tcp_start_io_queues(ctrl);
> +	if (ret)
> +		goto out_cleanup_connect_q;
> +

Did you test this with heavy I/O, a reset loop, and an ifdown/ifup loop?

If we unquiesce and unfreeze before we start the queues the pending I/Os
may resume before the connect and not allow the connect to make forward
progress.

>   	return 0;
>   
>   out_wait_freeze_timed_out:
> 

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH v4 8/8] nvme-rdma: Unfreeze queues on reconnect
  2021-08-02 11:26   ` Daniel Wagner
@ 2021-08-06 19:59     ` Sagi Grimberg
  -1 siblings, 0 replies; 59+ messages in thread
From: Sagi Grimberg @ 2021-08-06 19:59 UTC (permalink / raw)
  To: Daniel Wagner, linux-nvme
  Cc: linux-kernel, James Smart, Keith Busch, Ming Lei,
	Hannes Reinecke, Wen Xiong


> During the queue teardown in nvme_rdma_teardown_io_queues() freeze is
> called unconditionally. When we reconnect, we need to pair the freeze
> with an unfreeze to avoid hanging I/Os. For newly created connections
> this is not needed.
> 
> Fixes: 9f98772ba307 ("nvme-rdma: fix controller reset hang during traffic")
> Signed-off-by: Daniel Wagner <dwagner@suse.de>
> ---
>   drivers/nvme/host/rdma.c | 2 ++
>   1 file changed, 2 insertions(+)
> 
> diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c
> index de2a8950d282..21a8a5353af0 100644
> --- a/drivers/nvme/host/rdma.c
> +++ b/drivers/nvme/host/rdma.c
> @@ -901,6 +901,8 @@ static int nvme_rdma_configure_admin_queue(struct nvme_rdma_ctrl *ctrl,
>   			error = PTR_ERR(ctrl->ctrl.admin_q);
>   			goto out_cleanup_fabrics_q;
>   		}
> +	} else {
> +		nvme_unfreeze(&ctrl->ctrl);

That seems misplaced.. unfreezing the I/O queues when setting up the 
admin queue?

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH v4 2/8] nvme-tcp: Update number of hardware queues before using them
  2021-08-06 19:57     ` Sagi Grimberg
@ 2021-08-09  8:52       ` Daniel Wagner
  -1 siblings, 0 replies; 59+ messages in thread
From: Daniel Wagner @ 2021-08-09  8:52 UTC (permalink / raw)
  To: Sagi Grimberg
  Cc: linux-nvme, linux-kernel, James Smart, Keith Busch, Ming Lei,
	Hannes Reinecke, Wen Xiong

Hi Sagi,

On Fri, Aug 06, 2021 at 12:57:17PM -0700, Sagi Grimberg wrote:
> > -	ret = nvme_tcp_start_io_queues(ctrl);
> > -	if (ret)
> > -		goto out_cleanup_connect_q;
> > -
> > -	if (!new) {
> > -		nvme_start_queues(ctrl);
> > +	} else if (prior_q_cnt != ctrl->queue_count) {
> 
> > So if the queue count did not change, we don't wait to make sure
> > the queue's q_usage_counter ref made it to zero? What guarantees that it
> did?

Hmm, good point. We should always call nvme_wait_freeze_timeout()
for !new queues. Is this what you are implying?


> >   		if (!nvme_wait_freeze_timeout(ctrl, NVME_IO_TIMEOUT)) {
> >   			/*
> >   			 * If we timed out waiting for freeze we are likely to
> > @@ -1828,6 +1822,10 @@ static int nvme_tcp_configure_io_queues(struct nvme_ctrl *ctrl, bool new)
> >   		nvme_unfreeze(ctrl);
> >   	}
> > +	ret = nvme_tcp_start_io_queues(ctrl);
> > +	if (ret)
> > +		goto out_cleanup_connect_q;
> > +
> 
> Did you test this with heavy I/O, a reset loop, and an ifdown/ifup
> loop?

Not sure if this qualifies as heavy I/O (on an 80 CPU machine):

fio --rw=readwrite --name=test --filename=/dev/nvme16n1 --size=50M \
    --direct=1 --bs=4k --numjobs=40 --group_reporting --runtime=4h \
    --time_based

and then I installed iptables rules to block the traffic on the
controller side. With this test it is pretty easy to get
the host to hang. Let me know which tests you would like to see
from me; I am glad to try to get them running.

> If we unquiesce and unfreeze before we start the queues the pending I/Os
> may resume before the connect and not allow the connect to make forward
> progress.

So the unfreeze should happen after the connect call? What about the
newly created queues? Do they not suffer from the same problem? Isn't
the NVME_TCP_Q_LIVE flag enough?

Daniel

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH v4 8/8] nvme-rdma: Unfreeze queues on reconnect
  2021-08-06 19:59     ` Sagi Grimberg
@ 2021-08-09  8:58       ` Daniel Wagner
  -1 siblings, 0 replies; 59+ messages in thread
From: Daniel Wagner @ 2021-08-09  8:58 UTC (permalink / raw)
  To: Sagi Grimberg
  Cc: linux-nvme, linux-kernel, James Smart, Keith Busch, Ming Lei,
	Hannes Reinecke, Wen Xiong

Hi Sagi,

On Fri, Aug 06, 2021 at 12:59:15PM -0700, Sagi Grimberg wrote:
> 
> > During the queue teardown in nvme_rdma_teardown_io_queues() freeze is
> > called unconditionally. When we reconnect, we need to pair the freeze
> > with an unfreeze to avoid hanging I/Os. For newly created connections
> > this is not needed.
> > 
> > Fixes: 9f98772ba307 ("nvme-rdma: fix controller reset hang during traffic")
> > Signed-off-by: Daniel Wagner <dwagner@suse.de>
> > ---
> >   drivers/nvme/host/rdma.c | 2 ++
> >   1 file changed, 2 insertions(+)
> > 
> > diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c
> > index de2a8950d282..21a8a5353af0 100644
> > --- a/drivers/nvme/host/rdma.c
> > +++ b/drivers/nvme/host/rdma.c
> > @@ -901,6 +901,8 @@ static int nvme_rdma_configure_admin_queue(struct nvme_rdma_ctrl *ctrl,
> >   			error = PTR_ERR(ctrl->ctrl.admin_q);
> >   			goto out_cleanup_fabrics_q;
> >   		}
> > +	} else {
> > +		nvme_unfreeze(&ctrl->ctrl);
> 
> That seems misplaced.. unfreezing the I/O queues when setting up the admin
> queue?

Indeed. After looking at it again, this should be almost identical to
the tcp.c fix in nvme_rdma_configure_io_queues.

BTW, I am working on getting an RDMA test setup running. Hopefully I have
all the right licenses on the array.

Daniel

^ permalink raw reply	[flat|nested] 59+ messages in thread
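
A condensed sketch of where that pairing would sit in
nvme_rdma_configure_io_queues(), mirroring the tcp change (not the
literal function; tag set handling and most error paths are omitted):

/*
 * Sketch only: the unfreeze belongs in the !new branch of the I/O
 * queue setup, pairing with the freeze taken by the previous
 * nvme_rdma_teardown_io_queues().
 */
static int nvme_rdma_configure_io_queues_sketch(struct nvme_rdma_ctrl *ctrl,
                                                bool new)
{
        int ret;

        ret = nvme_rdma_alloc_io_queues(ctrl);
        if (ret)
                return ret;

        ret = nvme_rdma_start_io_queues(ctrl);
        if (ret)
                return ret;

        if (!new) {
                nvme_start_queues(&ctrl->ctrl); /* unquiesce so the freeze can drain */
                if (!nvme_wait_freeze_timeout(&ctrl->ctrl, NVME_IO_TIMEOUT))
                        return -ENODEV;         /* frozen queues never drained */
                blk_mq_update_nr_hw_queues(ctrl->ctrl.tagset,
                                           ctrl->ctrl.queue_count - 1);
                nvme_unfreeze(&ctrl->ctrl);     /* pairs with the teardown freeze */
        }

        return 0;
}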

* Re: [PATCH v4 2/8] nvme-tcp: Update number of hardware queues before using them
  2021-08-09  8:52       ` Daniel Wagner
@ 2021-08-11  1:00         ` Sagi Grimberg
  -1 siblings, 0 replies; 59+ messages in thread
From: Sagi Grimberg @ 2021-08-11  1:00 UTC (permalink / raw)
  To: Daniel Wagner
  Cc: linux-nvme, linux-kernel, James Smart, Keith Busch, Ming Lei,
	Hannes Reinecke, Wen Xiong



On 8/9/21 1:52 AM, Daniel Wagner wrote:
> Hi Sagi,
> 
> On Fri, Aug 06, 2021 at 12:57:17PM -0700, Sagi Grimberg wrote:
>>> -	ret = nvme_tcp_start_io_queues(ctrl);
>>> -	if (ret)
>>> -		goto out_cleanup_connect_q;
>>> -
>>> -	if (!new) {
>>> -		nvme_start_queues(ctrl);
>>> +	} else if (prior_q_cnt != ctrl->queue_count) {
>>
>> So if the queue count did not change, we don't wait to make sure
>> the queue's q_usage_counter ref made it to zero? What guarantees that it
>> did?
> 
> Hmm, good point. We should always call nvme_wait_freeze_timeout()
> for !new queues. Is this what you are implying?

I think we should always wait for the freeze to complete.

> 
> 
>>>    		if (!nvme_wait_freeze_timeout(ctrl, NVME_IO_TIMEOUT)) {
>>>    			/*
>>>    			 * If we timed out waiting for freeze we are likely to
>>> @@ -1828,6 +1822,10 @@ static int nvme_tcp_configure_io_queues(struct nvme_ctrl *ctrl, bool new)
>>>    		nvme_unfreeze(ctrl);
>>>    	}
>>> +	ret = nvme_tcp_start_io_queues(ctrl);
>>> +	if (ret)
>>> +		goto out_cleanup_connect_q;
>>> +
>>
>> Did you test this with heavy I/O, a reset loop, and an ifdown/ifup
>> loop?
> 
> Not sure if this classifies as heavy I/O (on 80 CPU machine)
> 
> fio --rw=readwrite --name=test --filename=/dev/nvme16n1 --size=50M \
>      --direct=1 --bs=4k --numjobs=40 --group_reporting --runtime=4h \
>      --time_based
> 
> and then I installed iptables rules to block the traffic on the
> controller side. With this test it is pretty easy to get
> the host to hang. Let me know which tests you would like to see
> from me; I am glad to try to get them running.

Lets add iodepth=128

>> If we unquiesce and unfreeze before we start the queues the pending I/Os
>> may resume before the connect and not allow the connect to make forward
>> progress.
> 
> So the unfreeze should happen after the connect call? What about the
> newly created queues? Do they not suffer from the same problem? Isn't
> the NVME_TCP_Q_LIVE flag enough?

Q_LIVE will prevent the transport itself from queueing; however, when
multipath is not used, the transport will return BLK_STS_RESOURCE,
which will immediately trigger re-submission in an endless loop, and
that can prevent forward progress. It is also consistent with what
nvme-pci does.

^ permalink raw reply	[flat|nested] 59+ messages in thread
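
A condensed sketch of the failure mode described above (hypothetical,
not the actual nvme_tcp_queue_rq() body):

/*
 * Sketch only: why gating on Q_LIVE alone is not enough.  Without
 * multipath there is no failover path, so BLK_STS_RESOURCE just makes
 * blk-mq re-run the hw queue and resubmit the same request, which can
 * spin and starve the connect path.
 */
static blk_status_t nvme_tcp_queue_rq_sketch(struct blk_mq_hw_ctx *hctx,
                                             const struct blk_mq_queue_data *bd)
{
        struct nvme_tcp_queue *queue = hctx->driver_data;

        if (!test_bit(NVME_TCP_Q_LIVE, &queue->flags))
                return BLK_STS_RESOURCE;        /* immediate retry loop */

        /* ... normal submission path elided ... */
        return BLK_STS_OK;
}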

* Re: [PATCH v4 6/8] nvme-fc: fix controller reset hang during traffic
  2021-08-04  8:08       ` Daniel Wagner
@ 2021-08-11  1:05         ` Sagi Grimberg
  -1 siblings, 0 replies; 59+ messages in thread
From: Sagi Grimberg @ 2021-08-11  1:05 UTC (permalink / raw)
  To: Daniel Wagner, Hannes Reinecke
  Cc: linux-nvme, linux-kernel, James Smart, Keith Busch, Ming Lei,
	Wen Xiong, James Smart


> On Wed, Aug 04, 2021 at 09:23:49AM +0200, Hannes Reinecke wrote:
>> There still is an imbalance now, as we're always calling
>> 'nvme_unfreeze()' (irrespective of the number of queues), but will only
>> call 'nvme_start_freeze()' if we have more than one queue.
>>
>> This might lead to an imbalance on the mq_freeze_depth counter.
>> Wouldn't it be better to move the call to 'nvme_start_freeze()' out of
>> the if() condition to avoid the imbalance?
> 
> What about something like nvme_is_frozen(), which would avoid the
> open-coded tracking of the queue state as it is right now?

Why is there a conditional freeze?

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH v4 2/8] nvme-tcp: Update number of hardware queues before using them
  2021-08-11  1:00         ` Sagi Grimberg
@ 2021-08-11  1:07           ` Keith Busch
  -1 siblings, 0 replies; 59+ messages in thread
From: Keith Busch @ 2021-08-11  1:07 UTC (permalink / raw)
  To: Sagi Grimberg
  Cc: Daniel Wagner, linux-nvme, linux-kernel, James Smart, Ming Lei,
	Hannes Reinecke, Wen Xiong

On Tue, Aug 10, 2021 at 06:00:37PM -0700, Sagi Grimberg wrote:
> 
> 
> On 8/9/21 1:52 AM, Daniel Wagner wrote:
> > Hi Sagi,
> > 
> > On Fri, Aug 06, 2021 at 12:57:17PM -0700, Sagi Grimberg wrote:
> > > > -	ret = nvme_tcp_start_io_queues(ctrl);
> > > > -	if (ret)
> > > > -		goto out_cleanup_connect_q;
> > > > -
> > > > -	if (!new) {
> > > > -		nvme_start_queues(ctrl);
> > > > +	} else if (prior_q_cnt != ctrl->queue_count) {
> > > 
> > > So if the queue count did not change, we don't wait to make sure
> > > the queue's q_usage_counter ref made it to zero? What guarantees that it
> > > did?
> > 
> > Hmm, good point. We should always call nvme_wait_freeze_timeout()
> > for !new queues. Is this what you are implying?
> 
> I think we should always wait for the freeze to complete.

Don't the queues need to be started in order for the freeze to complete?
Any enqueued requests on the quiesced queues will never complete this
way, so the wait_freeze() will be stuck, right? If so, I think the
nvme_start_queues() was in the correct place already.

^ permalink raw reply	[flat|nested] 59+ messages in thread
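
Putting that ordering into code form, a condensed sketch of the
reconnect path (names as used in this thread; not the literal
nvme_tcp_configure_io_queues()):

/*
 * Sketch only: waiting for the freeze while the queues are still
 * quiesced deadlocks, because parked requests can never drop their
 * q_usage_counter reference.  Hence: connect, unquiesce, then wait.
 */
static int nvme_tcp_reconnect_io_queues_sketch(struct nvme_ctrl *ctrl)
{
        int ret;

        ret = nvme_tcp_start_io_queues(ctrl);   /* connect, mark Q_LIVE */
        if (ret)
                return ret;

        nvme_start_queues(ctrl);                /* unquiesce before waiting */

        if (!nvme_wait_freeze_timeout(ctrl, NVME_IO_TIMEOUT))
                return -ENODEV;                 /* freeze never drained */

        blk_mq_update_nr_hw_queues(ctrl->tagset, ctrl->queue_count - 1);
        nvme_unfreeze(ctrl);                    /* pairs with the teardown freeze */

        return 0;
}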

* Re: [PATCH v4 2/8] nvme-tcp: Update number of hardware queues before using them
  2021-08-11  1:07           ` Keith Busch
@ 2021-08-11  5:57             ` Sagi Grimberg
  -1 siblings, 0 replies; 59+ messages in thread
From: Sagi Grimberg @ 2021-08-11  5:57 UTC (permalink / raw)
  To: Keith Busch
  Cc: Daniel Wagner, linux-nvme, linux-kernel, James Smart, Ming Lei,
	Hannes Reinecke, Wen Xiong


>> On 8/9/21 1:52 AM, Daniel Wagner wrote:
>>> Hi Sagi,
>>>
>>> On Fri, Aug 06, 2021 at 12:57:17PM -0700, Sagi Grimberg wrote:
>>>>> -	ret = nvme_tcp_start_io_queues(ctrl);
>>>>> -	if (ret)
>>>>> -		goto out_cleanup_connect_q;
>>>>> -
>>>>> -	if (!new) {
>>>>> -		nvme_start_queues(ctrl);
>>>>> +	} else if (prior_q_cnt != ctrl->queue_count) {
>>>>
>>>> So if the queue count did not change, we don't wait to make sure
>>>> the queue's q_usage_counter ref made it to zero? What guarantees that it
>>>> did?
>>>
>>> Hmm, good point. We should always call nvme_wait_freeze_timeout()
>>> for !new queues. Is this what you are implying?
>>
>> I think we should always wait for the freeze to complete.
> 
> Don't the queues need to be started in order for the freeze to complete?
> Any enqueued requests on the quiesced queues will never complete this
> way, so the wait_freeze() will be stuck, right? If so, I think the
> nvme_start_queues() was in the correct place already.

Exactly what I was trying to point out (poorly though)

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH v4 2/8] nvme-tcp: Update number of hardware queues before using them
  2021-08-11  5:57             ` Sagi Grimberg
@ 2021-08-11 10:25               ` Daniel Wagner
  -1 siblings, 0 replies; 59+ messages in thread
From: Daniel Wagner @ 2021-08-11 10:25 UTC (permalink / raw)
  To: Sagi Grimberg
  Cc: Keith Busch, linux-nvme, linux-kernel, James Smart, Ming Lei,
	Hannes Reinecke, Wen Xiong

On Tue, Aug 10, 2021 at 10:57:58PM -0700, Sagi Grimberg wrote:
> > > I think we should always wait for the freeze to complete.
> > 
> > Don't the queues need to be started in order for the freeze to complete?
> > Any enqueued requests on the quiesced queues will never complete this
> > way, so the wait_freeze() will be stuck, right? If so, I think the
> > nvme_start_queues() was in the correct place already.
> 
> Exactly what I was trying to point out (poorly though)

Thanks for explaining. I think I got the general idea of what the different
states are doing and what the transitions are now (famous last words).

Anyway, the first three patches are the result of debugging the case of
'prior_ioq_cnt != nr_io_queues'. Starting the queues before updating the
number of queues looks strange.

I suppose in the case 'prior_ioq_cnt > nr_io_queues',
nvme_tcp_start_io_queues() should be successful and we do the
blk_mq_update_nr_hw_queues(). In the other case we should land in
error recovery.

Wouldn't it make sense to always exercise the error recovery path if we
detect 'prior_ioq_cnt != nr_io_queues' and reduce the number of things
which can go wrong?

Daniel

^ permalink raw reply	[flat|nested] 59+ messages in thread
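
A hypothetical sketch of that suggestion (illustration only; the helper
and its caller do not exist in this form):

/*
 * Hypothetical: bail out to error recovery whenever the queue count
 * changed across a reconnect, instead of patching the tag set while
 * requests may be in flight.
 */
static int nvme_check_io_queue_count_sketch(struct nvme_ctrl *ctrl,
                                            u32 prior_ioq_cnt, u32 nr_io_queues)
{
        if (prior_ioq_cnt == nr_io_queues)
                return 0;

        dev_info(ctrl->device,
                 "reconnect: io queue count changed %u -> %u, entering recovery\n",
                 prior_ioq_cnt, nr_io_queues);
        return -ENODEV;         /* caller tears down and retries cleanly */
}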

* Re: [PATCH v4 6/8] nvme-fc: fix controller reset hang during traffic
  2021-08-11  1:05         ` Sagi Grimberg
@ 2021-08-11 10:30           ` Daniel Wagner
  -1 siblings, 0 replies; 59+ messages in thread
From: Daniel Wagner @ 2021-08-11 10:30 UTC (permalink / raw)
  To: Sagi Grimberg
  Cc: Hannes Reinecke, linux-nvme, linux-kernel, James Smart,
	Keith Busch, Ming Lei, Wen Xiong, James Smart

On Tue, Aug 10, 2021 at 06:05:04PM -0700, Sagi Grimberg wrote:
> Why is there a conditional freeze?

The freeze only happens if I/O queues have been created/started. If I
understand this correctly, this is the corner case where we were only able
to establish the admin queue and failed later to set up the I/O queues.

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH v4 6/8] nvme-fc: fix controller reset hang during traffic
  2021-08-04  7:23     ` Hannes Reinecke
@ 2021-08-12 20:03       ` James Smart
  -1 siblings, 0 replies; 59+ messages in thread
From: James Smart @ 2021-08-12 20:03 UTC (permalink / raw)
  To: Hannes Reinecke, Daniel Wagner, linux-nvme
  Cc: linux-kernel, James Smart, Keith Busch, Ming Lei, Sagi Grimberg,
	Wen Xiong

On 8/4/2021 12:23 AM, Hannes Reinecke wrote:
> On 8/2/21 1:26 PM, Daniel Wagner wrote:
>> From: James Smart <jsmart2021@gmail.com>
>>
>> commit fe35ec58f0d3 ("block: update hctx map when use multiple maps")
>> exposed an issue where we may hang trying to wait for queue freeze
>> during I/O. We call blk_mq_update_nr_hw_queues which may attempt to freeze
>> the queue. However, we never started a queue freeze when starting the
>> reset, which means that we have inflight pending requests that entered the
>> queue that we will not complete once the queue is quiesced.
>>
>> So start a freeze before we quiesce the queue, and unfreeze the queue
>> after we successfully connected the I/O queues (the unfreeze is already
>> present in the code). blk_mq_update_nr_hw_queues will be called only
>> after we are sure that the queue was already frozen.
>>
>> This follows how the pci driver handles resets.
>>
>> This patch adds the logic introduced in commit 9f98772ba307 ("nvme-rdma: fix
>> controller reset hang during traffic").
>>
>> Signed-off-by: James Smart <jsmart2021@gmail.com>
>> CC: Sagi Grimberg <sagi@grimberg.me>
>> [dwagner: call nvme_unfreeze() unconditionally in
>>            nvme_fc_recreate_io_queues() to match the nvme_start_freeze()]
>> Tested-by: Daniel Wagner <dwagner@suse.de>
>> Reviewed-by: Daniel Wagner <dwagner@suse.de>
>> ---
>>   drivers/nvme/host/fc.c | 3 ++-
>>   1 file changed, 2 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/nvme/host/fc.c b/drivers/nvme/host/fc.c
>> index 133b87db4f1d..b292af0fd655 100644
>> --- a/drivers/nvme/host/fc.c
>> +++ b/drivers/nvme/host/fc.c
>> @@ -2486,6 +2486,7 @@ __nvme_fc_abort_outstanding_ios(struct nvme_fc_ctrl *ctrl, bool start_queues)
>>   	 * (but with error status).
>>   	 */
>>   	if (ctrl->ctrl.queue_count > 1) {
>> +		nvme_start_freeze(&ctrl->ctrl);
>>   		nvme_stop_queues(&ctrl->ctrl);
>>   		nvme_sync_io_queues(&ctrl->ctrl);
>>   		blk_mq_tagset_busy_iter(&ctrl->tag_set,
>> @@ -2966,8 +2967,8 @@ nvme_fc_recreate_io_queues(struct nvme_fc_ctrl *ctrl)
>>   			return -ENODEV;
>>   		}
>>   		blk_mq_update_nr_hw_queues(&ctrl->tag_set, nr_io_queues);
>> -		nvme_unfreeze(&ctrl->ctrl);
>>   	}
>> +	nvme_unfreeze(&ctrl->ctrl);
>>   
>>   	ret = nvme_fc_create_hw_io_queues(ctrl, ctrl->ctrl.sqsize + 1);
>>   	if (ret)
>>
> There still is an imbalance now, as we're always calling
> 'nvme_unfreeze()' (irrespective of the number of queues), but will only
> call 'nvme_start_freeze()' if we have more than one queue.
> 
> This might lead to an imbalance on the mq_freeze_depth counter.
> Wouldn't it be better to move the call to 'nvme_start_freeze()' out of
> the if() condition to avoid the imbalance?
> 
> Cheers,

Daniel,

try this. It moves the location of the freeze so it will always pair
with the unfreeze.

--- fc.c	2021-08-12 12:33:33.273278611 -0700
+++ fc.c.new	2021-08-12 13:01:16.185817238 -0700
@@ -2965,9 +2965,10 @@ nvme_fc_recreate_io_queues(struct nvme_f
  			prior_ioq_cnt, nr_io_queues);
  		nvme_wait_freeze(&ctrl->ctrl);
  		blk_mq_update_nr_hw_queues(&ctrl->tag_set, nr_io_queues);
-		nvme_unfreeze(&ctrl->ctrl);
  	}

+	nvme_unfreeze(&ctrl->ctrl);
+
  	return 0;

  out_delete_hw_queues:
@@ -3206,6 +3207,9 @@ nvme_fc_delete_association(struct nvme_f
  	ctrl->iocnt = 0;
  	spin_unlock_irqrestore(&ctrl->lock, flags);

+	if (ctrl->ctrl.queue_count > 1)
+		nvme_start_freeze(&ctrl->ctrl);
+
  	__nvme_fc_abort_outstanding_ios(ctrl, false);

  	/* kill the aens as they are a separate path */


^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH v4 6/8] nvme-fc: fix controller reset hang during traffic
  2021-08-12 20:03       ` James Smart
@ 2021-08-18 11:43         ` Daniel Wagner
  -1 siblings, 0 replies; 59+ messages in thread
From: Daniel Wagner @ 2021-08-18 11:43 UTC (permalink / raw)
  To: James Smart
  Cc: Hannes Reinecke, linux-nvme, linux-kernel, James Smart,
	Keith Busch, Ming Lei, Sagi Grimberg, Wen Xiong

On Thu, Aug 12, 2021 at 01:03:07PM -0700, James Smart wrote:
> try this. It moves the location of the freeze so it will always pair
> with the unfreeze.

Yes, this works. Will you send a proper patch out, or should I just pick
up this patch and add a commit message?

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH v4 6/8] nvme-fc: fix controller reset hang during traffic
  2021-08-18 11:43         ` Daniel Wagner
@ 2021-08-18 11:49           ` Daniel Wagner
  -1 siblings, 0 replies; 59+ messages in thread
From: Daniel Wagner @ 2021-08-18 11:49 UTC (permalink / raw)
  To: James Smart
  Cc: Hannes Reinecke, linux-nvme, linux-kernel, James Smart,
	Keith Busch, Ming Lei, Sagi Grimberg, Wen Xiong

On Wed, Aug 18, 2021 at 01:43:09PM +0200, Daniel Wagner wrote:
> On Thu, Aug 12, 2021 at 01:03:07PM -0700, James Smart wrote:
> > try this. It moves the location of the freeze so it will always pair
> > with the unfreeze.
> 
> Yes, this works. Will you send a proper patch out, or should I just pick
> up this patch and add a commit message?

Forget it, the commit message is still the same. Let me respin this series.

^ permalink raw reply	[flat|nested] 59+ messages in thread

* [PATCH v4 4/8] nvme-fc: Wait with a timeout for queue to freeze
  2021-08-02  9:14 [PATCH v4 0/8] Handle update hardware queues and queue freeze more carefully Daniel Wagner
@ 2021-08-02  9:14 ` Daniel Wagner
  0 siblings, 0 replies; 59+ messages in thread
From: Daniel Wagner @ 2021-08-02  9:14 UTC (permalink / raw)
  To: linux-kernel
  Cc: James Smart, Keith Busch, Ming Lei, Sagi Grimberg,
	Hannes Reinecke, Wen Xiong, Daniel Wagner, James Smart

Do not wait indefinitely for all queues to freeze. Instead, use a
timeout and abort the operation if we get stuck.

Reviewed-by: James Smart <jsmart2021@gmail.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Daniel Wagner <dwagner@suse.de>
---
 drivers/nvme/host/fc.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/drivers/nvme/host/fc.c b/drivers/nvme/host/fc.c
index 8a903769364f..dbb8ad816df8 100644
--- a/drivers/nvme/host/fc.c
+++ b/drivers/nvme/host/fc.c
@@ -2955,7 +2955,14 @@ nvme_fc_recreate_io_queues(struct nvme_fc_ctrl *ctrl)
 		dev_info(ctrl->ctrl.device,
 			"reconnect: revising io queue count from %d to %d\n",
 			prior_ioq_cnt, nr_io_queues);
-		nvme_wait_freeze(&ctrl->ctrl);
+		if (!nvme_wait_freeze_timeout(&ctrl->ctrl, NVME_IO_TIMEOUT)) {
+			/*
+			 * If we timed out waiting for freeze we are likely to
+			 * be stuck.  Fail the controller initialization just
+			 * to be safe.
+			 */
+			return -ENODEV;
+		}
 		blk_mq_update_nr_hw_queues(&ctrl->tag_set, nr_io_queues);
 		nvme_unfreeze(&ctrl->ctrl);
 	}
-- 
2.29.2
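
For context, the hunk above swaps the unbounded nvme_wait_freeze() for
the timed variant: a namespace queue that never drains can then no
longer wedge the reconnect path forever, and a return of 0 fails the
controller initialization with -ENODEV instead. From memory the core
helper has roughly the following shape; treat it as a sketch, not as a
verbatim copy of drivers/nvme/host/core.c.

	/*
	 * Wait for every namespace queue to become frozen, burning
	 * down one shared timeout budget; returns the time left, or
	 * <= 0 if the budget expired first.
	 */
	int nvme_wait_freeze_timeout(struct nvme_ctrl *ctrl, long timeout)
	{
		struct nvme_ns *ns;

		down_read(&ctrl->namespaces_rwsem);
		list_for_each_entry(ns, &ctrl->namespaces, list) {
			timeout = blk_mq_freeze_queue_wait_timeout(ns->queue,
								   timeout);
			if (timeout <= 0)
				break;
		}
		up_read(&ctrl->namespaces_rwsem);
		return timeout;
	}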


^ permalink raw reply related	[flat|nested] 59+ messages in thread

end of thread, other threads:[~2021-08-18 11:49 UTC | newest]

Thread overview: 59+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-08-02 11:26 [PATCH RESEND v4 0/8] Handle update hardware queues and queue freeze more carefully Daniel Wagner
2021-08-02 11:26 ` Daniel Wagner
2021-08-02 11:26 ` [PATCH v4 1/8] nvme-fc: Update hardware queues before using them Daniel Wagner
2021-08-02 11:26   ` Daniel Wagner
2021-08-02 19:34   ` Himanshu Madhani
2021-08-02 19:34     ` Himanshu Madhani
2021-08-02 11:26 ` [PATCH v4 2/8] nvme-tcp: Update number of " Daniel Wagner
2021-08-02 11:26   ` Daniel Wagner
2021-08-06 19:57   ` Sagi Grimberg
2021-08-06 19:57     ` Sagi Grimberg
2021-08-09  8:52     ` Daniel Wagner
2021-08-09  8:52       ` Daniel Wagner
2021-08-11  1:00       ` Sagi Grimberg
2021-08-11  1:00         ` Sagi Grimberg
2021-08-11  1:07         ` Keith Busch
2021-08-11  1:07           ` Keith Busch
2021-08-11  5:57           ` Sagi Grimberg
2021-08-11  5:57             ` Sagi Grimberg
2021-08-11 10:25             ` Daniel Wagner
2021-08-11 10:25               ` Daniel Wagner
2021-08-02 11:26 ` [PATCH v4 3/8] nvme-rdma: " Daniel Wagner
2021-08-02 11:26   ` Daniel Wagner
2021-08-02 11:26 ` [PATCH v4 4/8] nvme-fc: Wait with a timeout for queue to freeze Daniel Wagner
2021-08-02 11:26   ` Daniel Wagner
2021-08-02 19:36   ` Himanshu Madhani
2021-08-02 19:36     ` Himanshu Madhani
2021-08-02 11:26 ` [PATCH v4 5/8] nvme-fc: avoid race between time out and tear down Daniel Wagner
2021-08-02 11:26   ` Daniel Wagner
2021-08-02 19:38   ` Himanshu Madhani
2021-08-02 19:38     ` Himanshu Madhani
2021-08-02 11:26 ` [PATCH v4 6/8] nvme-fc: fix controller reset hang during traffic Daniel Wagner
2021-08-02 11:26   ` Daniel Wagner
2021-08-02 19:39   ` Himanshu Madhani
2021-08-02 19:39     ` Himanshu Madhani
2021-08-04  7:23   ` Hannes Reinecke
2021-08-04  7:23     ` Hannes Reinecke
2021-08-04  8:08     ` Daniel Wagner
2021-08-04  8:08       ` Daniel Wagner
2021-08-11  1:05       ` Sagi Grimberg
2021-08-11  1:05         ` Sagi Grimberg
2021-08-11 10:30         ` Daniel Wagner
2021-08-11 10:30           ` Daniel Wagner
2021-08-12 20:03     ` James Smart
2021-08-12 20:03       ` James Smart
2021-08-18 11:43       ` Daniel Wagner
2021-08-18 11:43         ` Daniel Wagner
2021-08-18 11:49         ` Daniel Wagner
2021-08-18 11:49           ` Daniel Wagner
2021-08-02 11:26 ` [PATCH v4 7/8] nvme-tcp: Unfreeze queues on reconnect Daniel Wagner
2021-08-02 11:26   ` Daniel Wagner
2021-08-02 11:26 ` [PATCH v4 8/8] nvme-rdma: " Daniel Wagner
2021-08-02 11:26   ` Daniel Wagner
2021-08-04  7:25   ` Hannes Reinecke
2021-08-04  7:25     ` Hannes Reinecke
2021-08-06 19:59   ` Sagi Grimberg
2021-08-06 19:59     ` Sagi Grimberg
2021-08-09  8:58     ` Daniel Wagner
2021-08-09  8:58       ` Daniel Wagner
  -- strict thread matches above, loose matches on Subject: below --
2021-08-02  9:14 [PATCH v4 0/8] Handle update hardware queues and queue freeze more carefully Daniel Wagner
2021-08-02  9:14 ` [PATCH v4 4/8] nvme-fc: Wait with a timeout for queue to freeze Daniel Wagner

This is an external index of several public inboxes;
see the mirroring instructions for how to clone and mirror
all data and code used by this external index.