linux-nvme.lists.infradead.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2 0/5] Handle update hardware queues and queue freeze more carefully
@ 2021-07-08  9:27 Daniel Wagner
  2021-07-08  9:27 ` [PATCH v2 1/5] nvme-fc: Update hardware queues before using them Daniel Wagner
                   ` (4 more replies)
  0 siblings, 5 replies; 11+ messages in thread
From: Daniel Wagner @ 2021-07-08  9:27 UTC (permalink / raw)
  To: linux-nvme
  Cc: linux-kernel, James Smart, Keith Busch, Jens Axboe, Ming Lei,
	Sagi Grimberg, Daniel Wagner

Hi,

I've tested this on top of Ming's patches 'blk-mq: fix
blk_mq_alloc_request_hctx'[1] which fixes all problems (including the
hanger in nvme_wait_freeze()).

Thanks,
Danie

[1] https://lore.kernel.org/linux-nvme/20210629074951.1981284-1-ming.lei@redhat.com/

v1:
 - https://lore.kernel.org/linux-nvme/20210625101649.49296-1-dwagner@suse.de/
v2:
 - reviewed tags collected
 - added 'update hardware queues' for all transport
 - added fix for fc hanger in nvme_wait_freeze_timeout


Initial cover letter:

this is a followup on the crash I reported in

  https://lore.kernel.org/linux-block/20210608183339.70609-1-dwagner@suse.de/

By moving the hardware check up the crash was gone. Unfortuntatly, I
don't understand why this fixes the crash. The per-cpu access is
crashing but I can't see why the blk_mq_update_nr_hw_queues() is
fixing this problem.

Even though I can't explain why it fixes it, I think it makes sense to
update the hardware queue mapping bevore we recreate the IO
queues. Thus I avoided in the commit message to say it fixes
something.

Also during testing I observed the we hang indivinetly in
blk_mq_freeze_queue_wait(). Again I can't explain why we get stuck
there but given a common pattern for the nvme_wait_freeze() is to use
it with a timeout I think the timeout should be used too :)

Anyway, someone with more undertanding of the stack can explain the
problems.


Daniel Wagner (4):
  nvme-fc: Update hardware queues before using them
  nvme-rdma: Update number of hardware queues before using them
  nvme-fc: Wait with a timeout for queue to freeze
  nvme-fc: Freeze queues before destroying them

Hannes Reinecke (1):
  nvme-tcp: Update number of hardware queues before using them

 drivers/nvme/host/fc.c   | 26 +++++++++++++++++---------
 drivers/nvme/host/rdma.c | 13 ++++++-------
 drivers/nvme/host/tcp.c  | 14 ++++++--------
 3 files changed, 29 insertions(+), 24 deletions(-)

-- 
2.29.2


_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH v2 1/5] nvme-fc: Update hardware queues before using them
  2021-07-08  9:27 [PATCH v2 0/5] Handle update hardware queues and queue freeze more carefully Daniel Wagner
@ 2021-07-08  9:27 ` Daniel Wagner
  2021-07-08 10:08   ` Hannes Reinecke
  2021-07-08  9:27 ` [PATCH v2 2/5] nvme-tcp: Update number of " Daniel Wagner
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 11+ messages in thread
From: Daniel Wagner @ 2021-07-08  9:27 UTC (permalink / raw)
  To: linux-nvme
  Cc: linux-kernel, James Smart, Keith Busch, Jens Axboe, Ming Lei,
	Sagi Grimberg, Daniel Wagner, James Smart

In case the number of hardware queues changes, do the update the
tagset and ctx to hctx first before using the mapping to recreate and
connnect the IO queues.

Reviewed-by: James Smart <jsmart2021@gmail.com>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Daniel Wagner <dwagner@suse.de>
---
 drivers/nvme/host/fc.c | 16 ++++++++--------
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/drivers/nvme/host/fc.c b/drivers/nvme/host/fc.c
index c087cf7a6e1f..d0eb81387d4e 100644
--- a/drivers/nvme/host/fc.c
+++ b/drivers/nvme/host/fc.c
@@ -2952,14 +2952,6 @@ nvme_fc_recreate_io_queues(struct nvme_fc_ctrl *ctrl)
 	if (ctrl->ctrl.queue_count == 1)
 		return 0;
 
-	ret = nvme_fc_create_hw_io_queues(ctrl, ctrl->ctrl.sqsize + 1);
-	if (ret)
-		goto out_free_io_queues;
-
-	ret = nvme_fc_connect_io_queues(ctrl, ctrl->ctrl.sqsize + 1);
-	if (ret)
-		goto out_delete_hw_queues;
-
 	if (prior_ioq_cnt != nr_io_queues) {
 		dev_info(ctrl->ctrl.device,
 			"reconnect: revising io queue count from %d to %d\n",
@@ -2969,6 +2961,14 @@ nvme_fc_recreate_io_queues(struct nvme_fc_ctrl *ctrl)
 		nvme_unfreeze(&ctrl->ctrl);
 	}
 
+	ret = nvme_fc_create_hw_io_queues(ctrl, ctrl->ctrl.sqsize + 1);
+	if (ret)
+		goto out_free_io_queues;
+
+	ret = nvme_fc_connect_io_queues(ctrl, ctrl->ctrl.sqsize + 1);
+	if (ret)
+		goto out_delete_hw_queues;
+
 	return 0;
 
 out_delete_hw_queues:
-- 
2.29.2


_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH v2 2/5] nvme-tcp: Update number of hardware queues before using them
  2021-07-08  9:27 [PATCH v2 0/5] Handle update hardware queues and queue freeze more carefully Daniel Wagner
  2021-07-08  9:27 ` [PATCH v2 1/5] nvme-fc: Update hardware queues before using them Daniel Wagner
@ 2021-07-08  9:27 ` Daniel Wagner
  2021-07-08  9:27 ` [PATCH v2 3/5] nvme-rdma: " Daniel Wagner
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 11+ messages in thread
From: Daniel Wagner @ 2021-07-08  9:27 UTC (permalink / raw)
  To: linux-nvme
  Cc: linux-kernel, James Smart, Keith Busch, Jens Axboe, Ming Lei,
	Sagi Grimberg, Hannes Reinecke, Daniel Wagner

From: Hannes Reinecke <hare@suse.de>

When the number of hardware queues changes during resetting we should
update the tagset first before using it.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Daniel Wagner <dwagner@suse.de>
---
 drivers/nvme/host/tcp.c | 14 ++++++--------
 1 file changed, 6 insertions(+), 8 deletions(-)

diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c
index 0125463b7d77..a0f15e94c7a6 100644
--- a/drivers/nvme/host/tcp.c
+++ b/drivers/nvme/host/tcp.c
@@ -1784,6 +1784,7 @@ static void nvme_tcp_destroy_io_queues(struct nvme_ctrl *ctrl, bool remove)
 static int nvme_tcp_configure_io_queues(struct nvme_ctrl *ctrl, bool new)
 {
 	int ret;
+	u32 prior_q_cnt = ctrl->queue_count;
 
 	ret = nvme_tcp_alloc_io_queues(ctrl);
 	if (ret)
@@ -1801,14 +1802,7 @@ static int nvme_tcp_configure_io_queues(struct nvme_ctrl *ctrl, bool new)
 			ret = PTR_ERR(ctrl->connect_q);
 			goto out_free_tag_set;
 		}
-	}
-
-	ret = nvme_tcp_start_io_queues(ctrl);
-	if (ret)
-		goto out_cleanup_connect_q;
-
-	if (!new) {
-		nvme_start_queues(ctrl);
+	} else if (prior_q_cnt != ctrl->queue_count) {
 		if (!nvme_wait_freeze_timeout(ctrl, NVME_IO_TIMEOUT)) {
 			/*
 			 * If we timed out waiting for freeze we are likely to
@@ -1823,6 +1817,10 @@ static int nvme_tcp_configure_io_queues(struct nvme_ctrl *ctrl, bool new)
 		nvme_unfreeze(ctrl);
 	}
 
+	ret = nvme_tcp_start_io_queues(ctrl);
+	if (ret)
+		goto out_cleanup_connect_q;
+
 	return 0;
 
 out_wait_freeze_timed_out:
-- 
2.29.2


_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH v2 3/5] nvme-rdma: Update number of hardware queues before using them
  2021-07-08  9:27 [PATCH v2 0/5] Handle update hardware queues and queue freeze more carefully Daniel Wagner
  2021-07-08  9:27 ` [PATCH v2 1/5] nvme-fc: Update hardware queues before using them Daniel Wagner
  2021-07-08  9:27 ` [PATCH v2 2/5] nvme-tcp: Update number of " Daniel Wagner
@ 2021-07-08  9:27 ` Daniel Wagner
  2021-07-08  9:27 ` [PATCH v2 4/5] nvme-fc: Wait with a timeout for queue to freeze Daniel Wagner
  2021-07-08  9:27 ` [PATCH v2 5/5] nvme-fc: Freeze queues before destroying them Daniel Wagner
  4 siblings, 0 replies; 11+ messages in thread
From: Daniel Wagner @ 2021-07-08  9:27 UTC (permalink / raw)
  To: linux-nvme
  Cc: linux-kernel, James Smart, Keith Busch, Jens Axboe, Ming Lei,
	Sagi Grimberg, Daniel Wagner

When the number of hardware queues changes during resetting we should
update the tagset first before using it.

Signed-off-by: Daniel Wagner <dwagner@suse.de>
---
 drivers/nvme/host/rdma.c | 13 ++++++-------
 1 file changed, 6 insertions(+), 7 deletions(-)

diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c
index 3a296fd34bef..9825112bd9f4 100644
--- a/drivers/nvme/host/rdma.c
+++ b/drivers/nvme/host/rdma.c
@@ -967,6 +967,7 @@ static void nvme_rdma_destroy_io_queues(struct nvme_rdma_ctrl *ctrl,
 static int nvme_rdma_configure_io_queues(struct nvme_rdma_ctrl *ctrl, bool new)
 {
 	int ret;
+	u32 prior_q_cnt = ctrl->ctrl.queue_count;
 
 	ret = nvme_rdma_alloc_io_queues(ctrl);
 	if (ret)
@@ -984,13 +985,7 @@ static int nvme_rdma_configure_io_queues(struct nvme_rdma_ctrl *ctrl, bool new)
 			ret = PTR_ERR(ctrl->ctrl.connect_q);
 			goto out_free_tag_set;
 		}
-	}
-
-	ret = nvme_rdma_start_io_queues(ctrl);
-	if (ret)
-		goto out_cleanup_connect_q;
-
-	if (!new) {
+	} else if (prior_q_cnt != ctrl->ctrl.queue_count) {
 		nvme_start_queues(&ctrl->ctrl);
 		if (!nvme_wait_freeze_timeout(&ctrl->ctrl, NVME_IO_TIMEOUT)) {
 			/*
@@ -1006,6 +1001,10 @@ static int nvme_rdma_configure_io_queues(struct nvme_rdma_ctrl *ctrl, bool new)
 		nvme_unfreeze(&ctrl->ctrl);
 	}
 
+	ret = nvme_rdma_start_io_queues(ctrl);
+	if (ret)
+		goto out_cleanup_connect_q;
+
 	return 0;
 
 out_wait_freeze_timed_out:
-- 
2.29.2


_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH v2 4/5] nvme-fc: Wait with a timeout for queue to freeze
  2021-07-08  9:27 [PATCH v2 0/5] Handle update hardware queues and queue freeze more carefully Daniel Wagner
                   ` (2 preceding siblings ...)
  2021-07-08  9:27 ` [PATCH v2 3/5] nvme-rdma: " Daniel Wagner
@ 2021-07-08  9:27 ` Daniel Wagner
  2021-07-08 10:12   ` Hannes Reinecke
  2021-07-08  9:27 ` [PATCH v2 5/5] nvme-fc: Freeze queues before destroying them Daniel Wagner
  4 siblings, 1 reply; 11+ messages in thread
From: Daniel Wagner @ 2021-07-08  9:27 UTC (permalink / raw)
  To: linux-nvme
  Cc: linux-kernel, James Smart, Keith Busch, Jens Axboe, Ming Lei,
	Sagi Grimberg, Daniel Wagner, James Smart

Do not wait indifinitly for all queues to freeze. Instead use a
timeout and abort the operation if we get stuck.

Reviewed-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Daniel Wagner <dwagner@suse.de>
---
 drivers/nvme/host/fc.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/drivers/nvme/host/fc.c b/drivers/nvme/host/fc.c
index d0eb81387d4e..8e1fc3796735 100644
--- a/drivers/nvme/host/fc.c
+++ b/drivers/nvme/host/fc.c
@@ -2956,7 +2956,14 @@ nvme_fc_recreate_io_queues(struct nvme_fc_ctrl *ctrl)
 		dev_info(ctrl->ctrl.device,
 			"reconnect: revising io queue count from %d to %d\n",
 			prior_ioq_cnt, nr_io_queues);
-		nvme_wait_freeze(&ctrl->ctrl);
+		if (!nvme_wait_freeze_timeout(&ctrl->ctrl, NVME_IO_TIMEOUT)) {
+			/*
+			 * If we timed out waiting for freeze we are likely to
+			 * be stuck.  Fail the controller initialization just
+			 * to be safe.
+			 */
+			return -ENODEV;
+		}
 		blk_mq_update_nr_hw_queues(&ctrl->tag_set, nr_io_queues);
 		nvme_unfreeze(&ctrl->ctrl);
 	}
-- 
2.29.2


_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH v2 5/5] nvme-fc: Freeze queues before destroying them
  2021-07-08  9:27 [PATCH v2 0/5] Handle update hardware queues and queue freeze more carefully Daniel Wagner
                   ` (3 preceding siblings ...)
  2021-07-08  9:27 ` [PATCH v2 4/5] nvme-fc: Wait with a timeout for queue to freeze Daniel Wagner
@ 2021-07-08  9:27 ` Daniel Wagner
  2021-07-08 10:14   ` Hannes Reinecke
  2021-07-09 16:14   ` James Smart
  4 siblings, 2 replies; 11+ messages in thread
From: Daniel Wagner @ 2021-07-08  9:27 UTC (permalink / raw)
  To: linux-nvme
  Cc: linux-kernel, James Smart, Keith Busch, Jens Axboe, Ming Lei,
	Sagi Grimberg, Daniel Wagner

nvme_wait_freeze_timeout() in nvme_fc_recreate_io_queues() needs to be
paired with a nvme_start_freeze(). Without freezing first we will always
timeout in nvme_wait_freeze_timeout().

Note there is a similiar fix for RDMA 9f98772ba307 ("nvme-rdma: fix
controller reset hang during traffic") which happens to follow the PCI
strategy how to handle resetting the queues.

Signed-off-by: Daniel Wagner <dwagner@suse.de>
---
 drivers/nvme/host/fc.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/nvme/host/fc.c b/drivers/nvme/host/fc.c
index 8e1fc3796735..a38b01485939 100644
--- a/drivers/nvme/host/fc.c
+++ b/drivers/nvme/host/fc.c
@@ -3249,6 +3249,7 @@ nvme_fc_delete_association(struct nvme_fc_ctrl *ctrl)
 		nvme_fc_xmt_ls_rsp(disls);
 
 	if (ctrl->ctrl.tagset) {
+		nvme_start_freeze(&ctrl->ctrl);
 		nvme_fc_delete_hw_io_queues(ctrl);
 		nvme_fc_free_io_queues(ctrl);
 	}
-- 
2.29.2


_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH v2 1/5] nvme-fc: Update hardware queues before using them
  2021-07-08  9:27 ` [PATCH v2 1/5] nvme-fc: Update hardware queues before using them Daniel Wagner
@ 2021-07-08 10:08   ` Hannes Reinecke
  0 siblings, 0 replies; 11+ messages in thread
From: Hannes Reinecke @ 2021-07-08 10:08 UTC (permalink / raw)
  To: Daniel Wagner, linux-nvme
  Cc: linux-kernel, James Smart, Keith Busch, Jens Axboe, Ming Lei,
	Sagi Grimberg, James Smart

On 7/8/21 11:27 AM, Daniel Wagner wrote:
> In case the number of hardware queues changes, do the update the
> tagset and ctx to hctx first before using the mapping to recreate and
> connnect the IO queues.
> 
> Reviewed-by: James Smart <jsmart2021@gmail.com>
> Reviewed-by: Ming Lei <ming.lei@redhat.com>
> Signed-off-by: Daniel Wagner <dwagner@suse.de>
> ---
>   drivers/nvme/host/fc.c | 16 ++++++++--------
>   1 file changed, 8 insertions(+), 8 deletions(-)
> 
Reviewed-by: Hannes Reinecke <hare@suse.de>

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                Kernel Storage Architect
hare@suse.de                              +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), Geschäftsführer: Felix Imendörffer

_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v2 4/5] nvme-fc: Wait with a timeout for queue to freeze
  2021-07-08  9:27 ` [PATCH v2 4/5] nvme-fc: Wait with a timeout for queue to freeze Daniel Wagner
@ 2021-07-08 10:12   ` Hannes Reinecke
  0 siblings, 0 replies; 11+ messages in thread
From: Hannes Reinecke @ 2021-07-08 10:12 UTC (permalink / raw)
  To: Daniel Wagner, linux-nvme
  Cc: linux-kernel, James Smart, Keith Busch, Jens Axboe, Ming Lei,
	Sagi Grimberg, James Smart

On 7/8/21 11:27 AM, Daniel Wagner wrote:
> Do not wait indifinitly for all queues to freeze. Instead use a
> timeout and abort the operation if we get stuck.
> 
> Reviewed-by: James Smart <jsmart2021@gmail.com>
> Signed-off-by: Daniel Wagner <dwagner@suse.de>
> ---
>   drivers/nvme/host/fc.c | 9 ++++++++-
>   1 file changed, 8 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/nvme/host/fc.c b/drivers/nvme/host/fc.c
> index d0eb81387d4e..8e1fc3796735 100644
> --- a/drivers/nvme/host/fc.c
> +++ b/drivers/nvme/host/fc.c
> @@ -2956,7 +2956,14 @@ nvme_fc_recreate_io_queues(struct nvme_fc_ctrl *ctrl)
>   		dev_info(ctrl->ctrl.device,
>   			"reconnect: revising io queue count from %d to %d\n",
>   			prior_ioq_cnt, nr_io_queues);
> -		nvme_wait_freeze(&ctrl->ctrl);
> +		if (!nvme_wait_freeze_timeout(&ctrl->ctrl, NVME_IO_TIMEOUT)) {
> +			/*
> +			 * If we timed out waiting for freeze we are likely to
> +			 * be stuck.  Fail the controller initialization just
> +			 * to be safe.
> +			 */
> +			return -ENODEV;

For controller reset we're using '-ENOTCONN'; maybe it's worthwhile to 
use the same error code here.
But that's just a minor detail.

> +		}
>   		blk_mq_update_nr_hw_queues(&ctrl->tag_set, nr_io_queues);
>   		nvme_unfreeze(&ctrl->ctrl);
>   	}
> 
Reviewed-by: Hannes Reinecke <hare@suse.de>

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                Kernel Storage Architect
hare@suse.de                              +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), Geschäftsführer: Felix Imendörffer

_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v2 5/5] nvme-fc: Freeze queues before destroying them
  2021-07-08  9:27 ` [PATCH v2 5/5] nvme-fc: Freeze queues before destroying them Daniel Wagner
@ 2021-07-08 10:14   ` Hannes Reinecke
  2021-07-09 16:14   ` James Smart
  1 sibling, 0 replies; 11+ messages in thread
From: Hannes Reinecke @ 2021-07-08 10:14 UTC (permalink / raw)
  To: Daniel Wagner, linux-nvme
  Cc: linux-kernel, James Smart, Keith Busch, Jens Axboe, Ming Lei,
	Sagi Grimberg

On 7/8/21 11:27 AM, Daniel Wagner wrote:
> nvme_wait_freeze_timeout() in nvme_fc_recreate_io_queues() needs to be
> paired with a nvme_start_freeze(). Without freezing first we will always
> timeout in nvme_wait_freeze_timeout().
> 
> Note there is a similiar fix for RDMA 9f98772ba307 ("nvme-rdma: fix
> controller reset hang during traffic") which happens to follow the PCI
> strategy how to handle resetting the queues.
> 
> Signed-off-by: Daniel Wagner <dwagner@suse.de>
> ---
>   drivers/nvme/host/fc.c | 1 +
>   1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/nvme/host/fc.c b/drivers/nvme/host/fc.c
> index 8e1fc3796735..a38b01485939 100644
> --- a/drivers/nvme/host/fc.c
> +++ b/drivers/nvme/host/fc.c
> @@ -3249,6 +3249,7 @@ nvme_fc_delete_association(struct nvme_fc_ctrl *ctrl)
>   		nvme_fc_xmt_ls_rsp(disls);
>   
>   	if (ctrl->ctrl.tagset) {
> +		nvme_start_freeze(&ctrl->ctrl);
>   		nvme_fc_delete_hw_io_queues(ctrl);
>   		nvme_fc_free_io_queues(ctrl);
>   	}
> 
Please add a comment here about the pairing. We've missed it once, so we 
should make it clear why it has to be placed here.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                Kernel Storage Architect
hare@suse.de                              +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), Geschäftsführer: Felix Imendörffer

_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v2 5/5] nvme-fc: Freeze queues before destroying them
  2021-07-08  9:27 ` [PATCH v2 5/5] nvme-fc: Freeze queues before destroying them Daniel Wagner
  2021-07-08 10:14   ` Hannes Reinecke
@ 2021-07-09 16:14   ` James Smart
  2021-07-09 16:42     ` Daniel Wagner
  1 sibling, 1 reply; 11+ messages in thread
From: James Smart @ 2021-07-09 16:14 UTC (permalink / raw)
  To: Daniel Wagner, linux-nvme
  Cc: linux-kernel, James Smart, Keith Busch, Jens Axboe, Ming Lei,
	Sagi Grimberg

On 7/8/2021 2:27 AM, Daniel Wagner wrote:
> nvme_wait_freeze_timeout() in nvme_fc_recreate_io_queues() needs to be
> paired with a nvme_start_freeze(). Without freezing first we will always
> timeout in nvme_wait_freeze_timeout().
> 
> Note there is a similiar fix for RDMA 9f98772ba307 ("nvme-rdma: fix
> controller reset hang during traffic") which happens to follow the PCI
> strategy how to handle resetting the queues.
> 
> Signed-off-by: Daniel Wagner <dwagner@suse.de>
> ---
>   drivers/nvme/host/fc.c | 1 +
>   1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/nvme/host/fc.c b/drivers/nvme/host/fc.c
> index 8e1fc3796735..a38b01485939 100644
> --- a/drivers/nvme/host/fc.c
> +++ b/drivers/nvme/host/fc.c
> @@ -3249,6 +3249,7 @@ nvme_fc_delete_association(struct nvme_fc_ctrl *ctrl)
>   		nvme_fc_xmt_ls_rsp(disls);
>   
>   	if (ctrl->ctrl.tagset) {
> +		nvme_start_freeze(&ctrl->ctrl);
>   		nvme_fc_delete_hw_io_queues(ctrl);
>   		nvme_fc_free_io_queues(ctrl);
>   	}
> 

Thanks for the note. that definitely helped follow what is being 
attempted. I also agree with Hannes that the comment from the rdma patch 
should also be present to understand what's going on.

Looking at the patch - this is not done in the same place or manner as 
rdma. Freezing and stoppage is prior to cancelling and that doesn't 
correspond where this was added (this is after all cancellations). We 
also seem to be missing a nvme_sync_io_queues() call in the sequence as 
well. So I believe there's more work to be done on this patch.  I'll see 
what I can do.

We really need to see about a common layer for transports. So much we do 
is similar. We were ok at the start, but we've drifted apart over time 
and the requirements to the core layer aren't propogating to all transports.

-- james

_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v2 5/5] nvme-fc: Freeze queues before destroying them
  2021-07-09 16:14   ` James Smart
@ 2021-07-09 16:42     ` Daniel Wagner
  0 siblings, 0 replies; 11+ messages in thread
From: Daniel Wagner @ 2021-07-09 16:42 UTC (permalink / raw)
  To: James Smart
  Cc: linux-nvme, linux-kernel, James Smart, Keith Busch, Jens Axboe,
	Ming Lei, Sagi Grimberg

Hi James,

On Fri, Jul 09, 2021 at 09:14:07AM -0700, James Smart wrote:
> Thanks for the note. that definitely helped follow what is being attempted.
> I also agree with Hannes that the comment from the rdma patch should also be
> present to understand what's going on.

Sure will do. Though this has to wait until I am back from holiday
though :)

> Looking at the patch - this is not done in the same place or manner as rdma.
> Freezing and stoppage is prior to cancelling and that doesn't correspond
> where this was added (this is after all cancellations). We also seem to be
> missing a nvme_sync_io_queues() call in the sequence as well. So I believe
> there's more work to be done on this patch.  I'll see what I can do.

Thanks!

> We really need to see about a common layer for transports. So much we do is
> similar. We were ok at the start, but we've drifted apart over time and the
> requirements to the core layer aren't propogating to all transports.

Yes, makes a lot of sense to me to sync the transports implementation a
bit more.

Thanks,
Daniel

_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2021-07-09 16:43 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-07-08  9:27 [PATCH v2 0/5] Handle update hardware queues and queue freeze more carefully Daniel Wagner
2021-07-08  9:27 ` [PATCH v2 1/5] nvme-fc: Update hardware queues before using them Daniel Wagner
2021-07-08 10:08   ` Hannes Reinecke
2021-07-08  9:27 ` [PATCH v2 2/5] nvme-tcp: Update number of " Daniel Wagner
2021-07-08  9:27 ` [PATCH v2 3/5] nvme-rdma: " Daniel Wagner
2021-07-08  9:27 ` [PATCH v2 4/5] nvme-fc: Wait with a timeout for queue to freeze Daniel Wagner
2021-07-08 10:12   ` Hannes Reinecke
2021-07-08  9:27 ` [PATCH v2 5/5] nvme-fc: Freeze queues before destroying them Daniel Wagner
2021-07-08 10:14   ` Hannes Reinecke
2021-07-09 16:14   ` James Smart
2021-07-09 16:42     ` Daniel Wagner

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).