linux-nvme.lists.infradead.org archive mirror
* [PATCH v3 0/3]  Handle number of queue changes
@ 2022-08-29  8:38 Daniel Wagner
  2022-08-29  8:38 ` [PATCH v3 1/3] nvmet: Expose max queues to sysfs Daniel Wagner
                   ` (4 more replies)
  0 siblings, 5 replies; 25+ messages in thread
From: Daniel Wagner @ 2022-08-29  8:38 UTC (permalink / raw)
  To: linux-nvme; +Cc: Sagi Grimberg, Hannes Reinecke, Chao Leng, Daniel Wagner

Updated this series to a proper patch series with Hannes and Sagi's
feedback addressed.

I tested this with nvme-tcp but due to lack of hardware the nvme-rdma
is only compile tested.

Daniel


From the previous cover letter:

We got a report from our customer that a firmware upgrade on the
storage array is able to 'break' the host. This is caused by a change
in the number of queues the target supports after a reconnect.

Let's assume the number of queues is 8 and all is working fine. Then
the connection is dropped and the host starts trying to reconnect.
Eventually this succeeds, but now the new number of queues is 10:

nvme0: creating 8 I/O queues.
nvme0: mapped 8/0/0 default/read/poll queues.
nvme0: new ctrl: NQN "nvmet-test", addr 10.100.128.29:4420
nvme0: queue 0: timeout request 0x0 type 4
nvme0: starting error recovery
nvme0: failed nvme_keep_alive_end_io error=10
nvme0: Reconnecting in 10 seconds...
nvme0: failed to connect socket: -110
nvme0: Failed reconnect attempt 1
nvme0: Reconnecting in 10 seconds...
nvme0: creating 10 I/O queues.
nvme0: Connect command failed, error wo/DNR bit: -16389
nvme0: failed to connect queue: 9 ret=-5
nvme0: Failed reconnect attempt 2

As you can see, queue number 9 is not able to connect.

As the order of starting and unfreezing is important, we can't just
move the start of the queues after the tagset update. So my stupid
idea was to start just the old number of queues first and then the rest.
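
In (pseudo) code the idea looks roughly like this (a condensed sketch
using the nvme-tcp names from the patches below; the real patches also
keep the existing freeze/unfreeze handling in between):

	/* start only the queues the current tagset already knows about */
	nr_queues = min(ctrl->tagset->nr_hw_queues + 1, ctrl->queue_count);
	ret = nvme_tcp_start_io_queues(ctrl, 1, nr_queues);
	if (ret)
		goto out_cleanup_connect_q;

	/* ... wait for freeze, blk_mq_update_nr_hw_queues(), unfreeze ... */

	/* then bring up any queues added by this reconnect */
	ret = nvme_tcp_start_io_queues(ctrl, nr_queues,
				       ctrl->tagset->nr_hw_queues + 1);
	if (ret)
		goto out_wait_freeze_timed_out;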

This seems to work:

nvme nvme0: creating 4 I/O queues.
nvme nvme0: mapped 4/0/0 default/read/poll queues.
nvme_tcp_start_io_queues nr_hw_queues 4 queue_count 5 qcnt 5
nvme_tcp_start_io_queues nr_hw_queues 4 queue_count 5 qcnt 5
nvme nvme0: new ctrl: NQN "nvmet-test", addr 10.100.128.29:4420
nvme nvme0: queue 0: timeout request 0x0 type 4
nvme nvme0: starting error recovery
nvme0: Keep Alive(0x18), Unknown (sct 0x3 / sc 0x71)
nvme nvme0: failed nvme_keep_alive_end_io error=10
nvme nvme0: Reconnecting in 10 seconds...
nvme nvme0: creating 6 I/O queues.
nvme_tcp_start_io_queues nr_hw_queues 4 queue_count 7 qcnt 5
nvme nvme0: mapped 6/0/0 default/read/poll queues.
nvme_tcp_start_io_queues nr_hw_queues 6 queue_count 7 qcnt 7
nvme nvme0: Successfully reconnected (1 attempt)
nvme nvme0: starting error recovery
nvme0: Keep Alive(0x18), Unknown (sct 0x3 / sc 0x71)
nvme nvme0: failed nvme_keep_alive_end_io error=10
nvme nvme0: Reconnecting in 10 seconds...
nvme nvme0: creating 4 I/O queues.
nvme_tcp_start_io_queues nr_hw_queues 6 queue_count 5 qcnt 5
nvme nvme0: mapped 4/0/0 default/read/poll queues.
nvme_tcp_start_io_queues nr_hw_queues 4 queue_count 5 qcnt 5
nvme nvme0: Successfully reconnected (1 attempt)


changes:
v3:
 - only allow max_qid changes when port is disabled
 - use ctrl->tagset and avoid code churn
 - use correct error label for goto
v2:
 - removed debug logging
 - pass in queue range idx as argument to nvme_tcp_start_io_queues
v1:
 - https://lore.kernel.org/linux-nvme/20220812142824.17766-1-dwagner@suse.de/

Daniel Wagner (3):
  nvmet: Expose max queues to sysfs
  nvme-tcp: Handle number of queue changes
  nvme-rdma: Handle number of queue changes

 drivers/nvme/host/rdma.c       | 26 +++++++++++++++++++++-----
 drivers/nvme/host/tcp.c        | 26 +++++++++++++++++++++-----
 drivers/nvme/target/configfs.c | 28 ++++++++++++++++++++++++++++
 3 files changed, 70 insertions(+), 10 deletions(-)

-- 
2.37.2




* [PATCH v3 1/3] nvmet: Expose max queues to sysfs
  2022-08-29  8:38 [PATCH v3 0/3] Handle number of queue changes Daniel Wagner
@ 2022-08-29  8:38 ` Daniel Wagner
  2022-08-29  9:01   ` Sagi Grimberg
  2022-08-29  9:06   ` Hannes Reinecke
  2022-08-29  8:38 ` [PATCH v3 2/3] nvme-tcp: Handle number of queue changes Daniel Wagner
                   ` (3 subsequent siblings)
  4 siblings, 2 replies; 25+ messages in thread
From: Daniel Wagner @ 2022-08-29  8:38 UTC (permalink / raw)
  To: linux-nvme; +Cc: Sagi Grimberg, Hannes Reinecke, Chao Leng, Daniel Wagner

Allow setting the maximum number of queues the target supports. This is
useful for testing the reconnect handling of the host when the number of
supported queues changes.
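
With this in place the limit can be changed at runtime through configfs
before the port is (re-)enabled, e.g. something like

  echo 2 > /sys/kernel/config/nvmet/subsystems/<subsys-nqn>/attr_qid_max

(the exact path depends on where configfs is mounted and on the
subsystem name).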

Signed-off-by: Daniel Wagner <dwagner@suse.de>
---
 drivers/nvme/target/configfs.c | 28 ++++++++++++++++++++++++++++
 1 file changed, 28 insertions(+)

diff --git a/drivers/nvme/target/configfs.c b/drivers/nvme/target/configfs.c
index 2bcd60758919..a415c2987393 100644
--- a/drivers/nvme/target/configfs.c
+++ b/drivers/nvme/target/configfs.c
@@ -1281,6 +1281,33 @@ static ssize_t nvmet_subsys_attr_pi_enable_store(struct config_item *item,
 CONFIGFS_ATTR(nvmet_subsys_, attr_pi_enable);
 #endif
 
+static ssize_t nvmet_subsys_attr_qid_max_show(struct config_item *item,
+					      char *page)
+{
+	return snprintf(page, PAGE_SIZE, "%u\n", to_subsys(item)->max_qid);
+}
+
+static ssize_t nvmet_subsys_attr_qid_max_store(struct config_item *item,
+					       const char *page, size_t cnt)
+{
+	u16 qid_max;
+
+	if (nvmet_is_port_enabled(port, __func__))
+		return -EACCES;
+
+	if (sscanf(page, "%hu\n", &qid_max) != 1)
+		return -EINVAL;
+
+	if (qid_max < 1 || qid_max > NVMET_NR_QUEUES)
+		return -EINVAL;
+
+	down_write(&nvmet_config_sem);
+	to_subsys(item)->max_qid = qid_max;
+	up_write(&nvmet_config_sem);
+	return cnt;
+}
+CONFIGFS_ATTR(nvmet_subsys_, attr_qid_max);
+
 static struct configfs_attribute *nvmet_subsys_attrs[] = {
 	&nvmet_subsys_attr_attr_allow_any_host,
 	&nvmet_subsys_attr_attr_version,
@@ -1291,6 +1318,7 @@ static struct configfs_attribute *nvmet_subsys_attrs[] = {
 #ifdef CONFIG_BLK_DEV_INTEGRITY
 	&nvmet_subsys_attr_attr_pi_enable,
 #endif
+	&nvmet_subsys_attr_attr_qid_max,
 	NULL,
 };
 
-- 
2.37.2




* [PATCH v3 2/3] nvme-tcp: Handle number of queue changes
  2022-08-29  8:38 [PATCH v3 0/3] Handle number of queue changes Daniel Wagner
  2022-08-29  8:38 ` [PATCH v3 1/3] nvmet: Expose max queues to sysfs Daniel Wagner
@ 2022-08-29  8:38 ` Daniel Wagner
  2022-08-29  9:01   ` Sagi Grimberg
  2022-08-29  9:07   ` Hannes Reinecke
  2022-08-29  8:38 ` [PATCH v3 3/3] nvme-rdma: " Daniel Wagner
                   ` (2 subsequent siblings)
  4 siblings, 2 replies; 25+ messages in thread
From: Daniel Wagner @ 2022-08-29  8:38 UTC (permalink / raw)
  To: linux-nvme; +Cc: Sagi Grimberg, Hannes Reinecke, Chao Leng, Daniel Wagner

On reconnect, the number of queues might have changed.

In the case where we have more queues available than previously, we try
to access queues which are not initialized yet.

In the other case, where we have fewer queues than previously, the
connection attempt will fail because the target doesn't support the
old number of queues and we end up in a reconnect loop.

Thus, only start the queues which are currently present in the tagset,
limited by the number of available queues. Once the tagset has been
updated, we can start any newly added queues.

Signed-off-by: Daniel Wagner <dwagner@suse.de>
---
 drivers/nvme/host/tcp.c | 26 +++++++++++++++++++++-----
 1 file changed, 21 insertions(+), 5 deletions(-)

diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c
index 044da18c06f5..d97b0b6369b5 100644
--- a/drivers/nvme/host/tcp.c
+++ b/drivers/nvme/host/tcp.c
@@ -1762,11 +1762,12 @@ static void nvme_tcp_stop_io_queues(struct nvme_ctrl *ctrl)
 		nvme_tcp_stop_queue(ctrl, i);
 }
 
-static int nvme_tcp_start_io_queues(struct nvme_ctrl *ctrl)
+static int nvme_tcp_start_io_queues(struct nvme_ctrl *ctrl,
+				    int first, int last)
 {
 	int i, ret;
 
-	for (i = 1; i < ctrl->queue_count; i++) {
+	for (i = first; i < last; i++) {
 		ret = nvme_tcp_start_queue(ctrl, i);
 		if (ret)
 			goto out_stop_queues;
@@ -1775,7 +1776,7 @@ static int nvme_tcp_start_io_queues(struct nvme_ctrl *ctrl)
 	return 0;
 
 out_stop_queues:
-	for (i--; i >= 1; i--)
+	for (i--; i >= first; i--)
 		nvme_tcp_stop_queue(ctrl, i);
 	return ret;
 }
@@ -1901,7 +1902,7 @@ static void nvme_tcp_destroy_io_queues(struct nvme_ctrl *ctrl, bool remove)
 
 static int nvme_tcp_configure_io_queues(struct nvme_ctrl *ctrl, bool new)
 {
-	int ret;
+	int ret, nr_queues;
 
 	ret = nvme_tcp_alloc_io_queues(ctrl);
 	if (ret)
@@ -1917,7 +1918,13 @@ static int nvme_tcp_configure_io_queues(struct nvme_ctrl *ctrl, bool new)
 			goto out_free_tag_set;
 	}
 
-	ret = nvme_tcp_start_io_queues(ctrl);
+	/*
+	 * Only start IO queues for which we have allocated the tagset
+	 * and limited it to the available queues. On reconnects, the
+	 * queue number might have changed.
+	 */
+	nr_queues = min(ctrl->tagset->nr_hw_queues + 1, ctrl->queue_count);
+	ret = nvme_tcp_start_io_queues(ctrl, 1, nr_queues);
 	if (ret)
 		goto out_cleanup_connect_q;
 
@@ -1937,6 +1944,15 @@ static int nvme_tcp_configure_io_queues(struct nvme_ctrl *ctrl, bool new)
 		nvme_unfreeze(ctrl);
 	}
 
+	/*
+	 * If the number of queues has increased (reconnect case)
+	 * start all new queues now.
+	 */
+	ret = nvme_tcp_start_io_queues(ctrl, nr_queues,
+				       ctrl->tagset->nr_hw_queues + 1);
+	if (ret)
+		goto out_wait_freeze_timed_out;
+
 	return 0;
 
 out_wait_freeze_timed_out:
-- 
2.37.2




* [PATCH v3 3/3] nvme-rdma: Handle number of queue changes
  2022-08-29  8:38 [PATCH v3 0/3] Handle number of queue changes Daniel Wagner
  2022-08-29  8:38 ` [PATCH v3 1/3] nvmet: Expose max queues to sysfs Daniel Wagner
  2022-08-29  8:38 ` [PATCH v3 2/3] nvme-tcp: Handle number of queue changes Daniel Wagner
@ 2022-08-29  8:38 ` Daniel Wagner
  2022-08-29  9:02   ` Sagi Grimberg
                     ` (2 more replies)
  2022-08-29  9:07 ` [PATCH v3 0/3] " Hannes Reinecke
  2022-08-29 18:44 ` James Smart
  4 siblings, 3 replies; 25+ messages in thread
From: Daniel Wagner @ 2022-08-29  8:38 UTC (permalink / raw)
  To: linux-nvme; +Cc: Sagi Grimberg, Hannes Reinecke, Chao Leng, Daniel Wagner

On reconnect, the number of queues might have changed.

In the case where we have more queues available than previously, we try
to access queues which are not initialized yet.

In the other case, where we have fewer queues than previously, the
connection attempt will fail because the target doesn't support the
old number of queues and we end up in a reconnect loop.

Thus, only start the queues which are currently present in the tagset,
limited by the number of available queues. Once the tagset has been
updated, we can start any newly added queues.

Signed-off-by: Daniel Wagner <dwagner@suse.de>
---
 drivers/nvme/host/rdma.c | 26 +++++++++++++++++++++-----
 1 file changed, 21 insertions(+), 5 deletions(-)

diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c
index 3100643be299..0d1d29e644d9 100644
--- a/drivers/nvme/host/rdma.c
+++ b/drivers/nvme/host/rdma.c
@@ -696,11 +696,12 @@ static int nvme_rdma_start_queue(struct nvme_rdma_ctrl *ctrl, int idx)
 	return ret;
 }
 
-static int nvme_rdma_start_io_queues(struct nvme_rdma_ctrl *ctrl)
+static int nvme_rdma_start_io_queues(struct nvme_rdma_ctrl *ctrl,
+				     int first, int last)
 {
 	int i, ret = 0;
 
-	for (i = 1; i < ctrl->ctrl.queue_count; i++) {
+	for (i = first; i < last; i++) {
 		ret = nvme_rdma_start_queue(ctrl, i);
 		if (ret)
 			goto out_stop_queues;
@@ -709,7 +710,7 @@ static int nvme_rdma_start_io_queues(struct nvme_rdma_ctrl *ctrl)
 	return 0;
 
 out_stop_queues:
-	for (i--; i >= 1; i--)
+	for (i--; i >= first; i--)
 		nvme_rdma_stop_queue(&ctrl->queues[i]);
 	return ret;
 }
@@ -964,7 +965,7 @@ static void nvme_rdma_destroy_io_queues(struct nvme_rdma_ctrl *ctrl,
 
 static int nvme_rdma_configure_io_queues(struct nvme_rdma_ctrl *ctrl, bool new)
 {
-	int ret;
+	int ret, nr_queues;
 
 	ret = nvme_rdma_alloc_io_queues(ctrl);
 	if (ret)
@@ -980,7 +981,13 @@ static int nvme_rdma_configure_io_queues(struct nvme_rdma_ctrl *ctrl, bool new)
 			goto out_free_tag_set;
 	}
 
-	ret = nvme_rdma_start_io_queues(ctrl);
+	/*
+	 * Only start IO queues for which we have allocated the tagset
+	 * and limited it to the available queues. On reconnects, the
+	 * queue number might have changed.
+	 */
+	nr_queues = min(ctrl->tag_set.nr_hw_queues + 1, ctrl->ctrl.queue_count);
+	ret = nvme_rdma_start_io_queues(ctrl, 1, nr_queues);
 	if (ret)
 		goto out_cleanup_connect_q;
 
@@ -1000,6 +1007,15 @@ static int nvme_rdma_configure_io_queues(struct nvme_rdma_ctrl *ctrl, bool new)
 		nvme_unfreeze(&ctrl->ctrl);
 	}
 
+	/*
+	 * If the number of queues has increased (reconnect case)
+	 * start all new queues now.
+	 */
+	ret = nvme_rdma_start_io_queues(ctrl, nr_queues,
+					ctrl->tag_set.nr_hw_queues + 1);
+	if (ret)
+		goto out_wait_freeze_timed_out;
+
 	return 0;
 
 out_wait_freeze_timed_out:
-- 
2.37.2




* Re: [PATCH v3 1/3] nvmet: Expose max queues to sysfs
  2022-08-29  8:38 ` [PATCH v3 1/3] nvmet: Expose max queues to sysfs Daniel Wagner
@ 2022-08-29  9:01   ` Sagi Grimberg
  2022-08-29  9:06   ` Hannes Reinecke
  1 sibling, 0 replies; 25+ messages in thread
From: Sagi Grimberg @ 2022-08-29  9:01 UTC (permalink / raw)
  To: Daniel Wagner, linux-nvme; +Cc: Hannes Reinecke, Chao Leng

s/sysfs/configfs/



* Re: [PATCH v3 2/3] nvme-tcp: Handle number of queue changes
  2022-08-29  8:38 ` [PATCH v3 2/3] nvme-tcp: Handle number of queue changes Daniel Wagner
@ 2022-08-29  9:01   ` Sagi Grimberg
  2022-08-29  9:07   ` Hannes Reinecke
  1 sibling, 0 replies; 25+ messages in thread
From: Sagi Grimberg @ 2022-08-29  9:01 UTC (permalink / raw)
  To: Daniel Wagner, linux-nvme; +Cc: Hannes Reinecke, Chao Leng

Reviewed-by: Sagi Grimberg <sagi@grimberg.me>



* Re: [PATCH v3 3/3] nvme-rdma: Handle number of queue changes
  2022-08-29  8:38 ` [PATCH v3 3/3] nvme-rdma: " Daniel Wagner
@ 2022-08-29  9:02   ` Sagi Grimberg
  2022-08-29  9:07   ` Hannes Reinecke
  2022-08-29  9:22   ` Chao Leng
  2 siblings, 0 replies; 25+ messages in thread
From: Sagi Grimberg @ 2022-08-29  9:02 UTC (permalink / raw)
  To: Daniel Wagner, linux-nvme; +Cc: Hannes Reinecke, Chao Leng

Reviewed-by: Sagi Grimberg <sagi@grimberg.me>



* Re: [PATCH v3 1/3] nvmet: Expose max queues to sysfs
  2022-08-29  8:38 ` [PATCH v3 1/3] nvmet: Expose max queues to sysfs Daniel Wagner
  2022-08-29  9:01   ` Sagi Grimberg
@ 2022-08-29  9:06   ` Hannes Reinecke
  1 sibling, 0 replies; 25+ messages in thread
From: Hannes Reinecke @ 2022-08-29  9:06 UTC (permalink / raw)
  To: Daniel Wagner, linux-nvme; +Cc: Sagi Grimberg, Chao Leng

On 8/29/22 10:38, Daniel Wagner wrote:
> Allow to set the max queues the target supports. This is useful for
> testing the reconnect attempt of the host with changing numbers of
> supported queues.
> 
> Signed-off-by: Daniel Wagner <dwagner@suse.de>
> ---
>   drivers/nvme/target/configfs.c | 28 ++++++++++++++++++++++++++++
>   1 file changed, 28 insertions(+)
> 
> diff --git a/drivers/nvme/target/configfs.c b/drivers/nvme/target/configfs.c
> index 2bcd60758919..a415c2987393 100644
> --- a/drivers/nvme/target/configfs.c
> +++ b/drivers/nvme/target/configfs.c
> @@ -1281,6 +1281,33 @@ static ssize_t nvmet_subsys_attr_pi_enable_store(struct config_item *item,
>   CONFIGFS_ATTR(nvmet_subsys_, attr_pi_enable);
>   #endif
>   
> +static ssize_t nvmet_subsys_attr_qid_max_show(struct config_item *item,
> +					      char *page)
> +{
> +	return snprintf(page, PAGE_SIZE, "%u\n", to_subsys(item)->max_qid);
> +}
> +
> +static ssize_t nvmet_subsys_attr_qid_max_store(struct config_item *item,
> +					       const char *page, size_t cnt)
> +{
> +	u16 qid_max;
> +
> +	if (nvmet_is_port_enabled(port, __func__))
> +		return -EACCES;
> +
> +	if (sscanf(page, "%hu\n", &qid_max) != 1)
> +		return -EINVAL;
> +
> +	if (qid_max < 1 || qid_max > NVMET_NR_QUEUES)
> +		return -EINVAL;
> +
> +	down_write(&nvmet_config_sem);
> +	to_subsys(item)->max_qid = qid_max;
> +	up_write(&nvmet_config_sem);
> +	return cnt;
> +}
> +CONFIGFS_ATTR(nvmet_subsys_, attr_qid_max);
> +
>   static struct configfs_attribute *nvmet_subsys_attrs[] = {
>   	&nvmet_subsys_attr_attr_allow_any_host,
>   	&nvmet_subsys_attr_attr_version,
> @@ -1291,6 +1318,7 @@ static struct configfs_attribute *nvmet_subsys_attrs[] = {
>   #ifdef CONFIG_BLK_DEV_INTEGRITY
>   	&nvmet_subsys_attr_attr_pi_enable,
>   #endif
> +	&nvmet_subsys_attr_attr_qid_max,
>   	NULL,
>   };
>   
Minor nit: I would add the new field prior to the 'ifdef' stuff; as it 
stands the layout of the resulting structure changes depending on the 
configuration, which tends to cause all sorts of issues for distributions.
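I.e. something along these lines (sketch, other entries elided):

static struct configfs_attribute *nvmet_subsys_attrs[] = {
	&nvmet_subsys_attr_attr_allow_any_host,
	&nvmet_subsys_attr_attr_version,
	/* ... */
	&nvmet_subsys_attr_attr_qid_max,
#ifdef CONFIG_BLK_DEV_INTEGRITY
	&nvmet_subsys_attr_attr_pi_enable,
#endif
	NULL,
};
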
But otherwise:

Reviewed-by: Hannes Reinecke <hare@suse.de>

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		           Kernel Storage Architect
hare@suse.de			                  +49 911 74053 688
SUSE Software Solutions Germany GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), GF: Felix Imendörffer



* Re: [PATCH v3 0/3] Handle number of queue changes
  2022-08-29  8:38 [PATCH v3 0/3] Handle number of queue changes Daniel Wagner
                   ` (2 preceding siblings ...)
  2022-08-29  8:38 ` [PATCH v3 3/3] nvme-rdma: " Daniel Wagner
@ 2022-08-29  9:07 ` Hannes Reinecke
  2022-08-29  9:17   ` Daniel Wagner
  2022-08-29 18:56   ` James Smart
  2022-08-29 18:44 ` James Smart
  4 siblings, 2 replies; 25+ messages in thread
From: Hannes Reinecke @ 2022-08-29  9:07 UTC (permalink / raw)
  To: Daniel Wagner, linux-nvme; +Cc: Sagi Grimberg, Chao Leng

On 8/29/22 10:38, Daniel Wagner wrote:
> Updated this series to a proper patch series with Hannes and Sagi's
> feedback addressed.
> 
> I tested this with nvme-tcp but due to lack of hardware the nvme-rdma
> is only compile tested.
> 

One does wonder: what about FC?
Does it suffer from the same problems?

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		           Kernel Storage Architect
hare@suse.de			                  +49 911 74053 688
SUSE Software Solutions Germany GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), GF: Felix Imendörffer



* Re: [PATCH v3 2/3] nvme-tcp: Handle number of queue changes
  2022-08-29  8:38 ` [PATCH v3 2/3] nvme-tcp: Handle number of queue changes Daniel Wagner
  2022-08-29  9:01   ` Sagi Grimberg
@ 2022-08-29  9:07   ` Hannes Reinecke
  1 sibling, 0 replies; 25+ messages in thread
From: Hannes Reinecke @ 2022-08-29  9:07 UTC (permalink / raw)
  To: Daniel Wagner, linux-nvme; +Cc: Sagi Grimberg, Chao Leng

On 8/29/22 10:38, Daniel Wagner wrote:
> On reconnect, the number of queues might have changed.
> 
> In the case where we have more queues available than previously we try
> to access queues which are not initialized yet.
> 
> The other case where we have less queues than previously, the
> connection attempt will fail because the target doesn't support the
> old number of queues and we end up in a reconnect loop.
> 
> Thus, only start queues which are currently present in the tagset
> limited by the number of available queues. Then we update the tagset
> and we can start any new queue.
> 
> Signed-off-by: Daniel Wagner <dwagner@suse.de>
> ---
>   drivers/nvme/host/tcp.c | 26 +++++++++++++++++++++-----
>   1 file changed, 21 insertions(+), 5 deletions(-)
> 
Reviewed-by: Hannes Reinecke <hare@suse.de>

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		           Kernel Storage Architect
hare@suse.de			                  +49 911 74053 688
SUSE Software Solutions Germany GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), GF: Felix Imendörffer



* Re: [PATCH v3 3/3] nvme-rdma: Handle number of queue changes
  2022-08-29  8:38 ` [PATCH v3 3/3] nvme-rdma: " Daniel Wagner
  2022-08-29  9:02   ` Sagi Grimberg
@ 2022-08-29  9:07   ` Hannes Reinecke
  2022-08-29  9:22   ` Chao Leng
  2 siblings, 0 replies; 25+ messages in thread
From: Hannes Reinecke @ 2022-08-29  9:07 UTC (permalink / raw)
  To: Daniel Wagner, linux-nvme; +Cc: Sagi Grimberg, Chao Leng

On 8/29/22 10:38, Daniel Wagner wrote:
> On reconnect, the number of queues might have changed.
> 
> In the case where we have more queues available than previously we try
> to access queues which are not initialized yet.
> 
> The other case where we have less queues than previously, the
> connection attempt will fail because the target doesn't support the
> old number of queues and we end up in a reconnect loop.
> 
> Thus, only start queues which are currently present in the tagset
> limited by the number of available queues. Then we update the tagset
> and we can start any new queue.
> 
> Signed-off-by: Daniel Wagner <dwagner@suse.de>
> ---
>   drivers/nvme/host/rdma.c | 26 +++++++++++++++++++++-----
>   1 file changed, 21 insertions(+), 5 deletions(-)
> 
Reviewed-by: Hannes Reinecke <hare@suse.de>

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		           Kernel Storage Architect
hare@suse.de			                  +49 911 74053 688
SUSE Software Solutions Germany GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), GF: Felix Imendörffer



* Re: [PATCH v3 0/3] Handle number of queue changes
  2022-08-29  9:07 ` [PATCH v3 0/3] " Hannes Reinecke
@ 2022-08-29  9:17   ` Daniel Wagner
  2022-08-29  9:20     ` Hannes Reinecke
  2022-08-29 18:56   ` James Smart
  1 sibling, 1 reply; 25+ messages in thread
From: Daniel Wagner @ 2022-08-29  9:17 UTC (permalink / raw)
  To: Hannes Reinecke; +Cc: linux-nvme, Sagi Grimberg, Chao Leng

On Mon, Aug 29, 2022 at 11:07:00AM +0200, Hannes Reinecke wrote:
> One does wonder: what about FC?
> Does it suffer from the same problems?

No idea, I'll look into it.



* Re: [PATCH v3 0/3] Handle number of queue changes
  2022-08-29  9:17   ` Daniel Wagner
@ 2022-08-29  9:20     ` Hannes Reinecke
  0 siblings, 0 replies; 25+ messages in thread
From: Hannes Reinecke @ 2022-08-29  9:20 UTC (permalink / raw)
  To: Daniel Wagner; +Cc: linux-nvme, Sagi Grimberg, Chao Leng

On 8/29/22 11:17, Daniel Wagner wrote:
> On Mon, Aug 29, 2022 at 11:07:00AM +0200, Hannes Reinecke wrote:
>> One does wonder: what about FC?
>> Does it suffer from the same problems?
> 
> No idea, I'll look into it.

And while we're at it, a blktest would be grand...

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		           Kernel Storage Architect
hare@suse.de			                  +49 911 74053 688
SUSE Software Solutions Germany GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), GF: Felix Imendörffer



* Re: [PATCH v3 3/3] nvme-rdma: Handle number of queue changes
  2022-08-29  8:38 ` [PATCH v3 3/3] nvme-rdma: " Daniel Wagner
  2022-08-29  9:02   ` Sagi Grimberg
  2022-08-29  9:07   ` Hannes Reinecke
@ 2022-08-29  9:22   ` Chao Leng
  2022-08-29  9:32     ` Daniel Wagner
  2 siblings, 1 reply; 25+ messages in thread
From: Chao Leng @ 2022-08-29  9:22 UTC (permalink / raw)
  To: Daniel Wagner, linux-nvme; +Cc: Sagi Grimberg, Hannes Reinecke



On 2022/8/29 16:38, Daniel Wagner wrote:
> On reconnect, the number of queues might have changed.
> 
> In the case where we have more queues available than previously we try
> to access queues which are not initialized yet.
> 
> The other case where we have less queues than previously, the
> connection attempt will fail because the target doesn't support the
> old number of queues and we end up in a reconnect loop.
> 
> Thus, only start queues which are currently present in the tagset
> limited by the number of available queues. Then we update the tagset
> and we can start any new queue.
> 
> Signed-off-by: Daniel Wagner <dwagner@suse.de>
> ---
>   drivers/nvme/host/rdma.c | 26 +++++++++++++++++++++-----
>   1 file changed, 21 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c
> index 3100643be299..0d1d29e644d9 100644
> --- a/drivers/nvme/host/rdma.c
> +++ b/drivers/nvme/host/rdma.c
> @@ -696,11 +696,12 @@ static int nvme_rdma_start_queue(struct nvme_rdma_ctrl *ctrl, int idx)
>   	return ret;
>   }
>   
> -static int nvme_rdma_start_io_queues(struct nvme_rdma_ctrl *ctrl)
> +static int nvme_rdma_start_io_queues(struct nvme_rdma_ctrl *ctrl,
> +				     int first, int last)
>   {
>   	int i, ret = 0;
>   
> -	for (i = 1; i < ctrl->ctrl.queue_count; i++) {
> +	for (i = first; i < last; i++) {
>   		ret = nvme_rdma_start_queue(ctrl, i);
>   		if (ret)
>   			goto out_stop_queues;
> @@ -709,7 +710,7 @@ static int nvme_rdma_start_io_queues(struct nvme_rdma_ctrl *ctrl)
>   	return 0;
>   
>   out_stop_queues:
> -	for (i--; i >= 1; i--)
> +	for (i--; i >= first; i--)
>   		nvme_rdma_stop_queue(&ctrl->queues[i]);
>   	return ret;
>   }
> @@ -964,7 +965,7 @@ static void nvme_rdma_destroy_io_queues(struct nvme_rdma_ctrl *ctrl,
>   
>   static int nvme_rdma_configure_io_queues(struct nvme_rdma_ctrl *ctrl, bool new)
>   {
> -	int ret;
> +	int ret, nr_queues;
>   
>   	ret = nvme_rdma_alloc_io_queues(ctrl);
>   	if (ret)
> @@ -980,7 +981,13 @@ static int nvme_rdma_configure_io_queues(struct nvme_rdma_ctrl *ctrl, bool new)
>   			goto out_free_tag_set;
>   	}
>   
> -	ret = nvme_rdma_start_io_queues(ctrl);
> +	/*
> +	 * Only start IO queues for which we have allocated the tagset
> +	 * and limitted it to the available queues. On reconnects, the
> +	 * queue number might have changed.
> +	 */
> +	nr_queues = min(ctrl->tag_set.nr_hw_queues + 1, nctrl->queue_count);
> +	ret = nvme_rdma_start_io_queues(ctrl, 1, nr_queues);
>   	if (ret)
>   		goto out_cleanup_connect_q;
>   
> @@ -1000,6 +1007,15 @@ static int nvme_rdma_configure_io_queues(struct nvme_rdma_ctrl *ctrl, bool new)
>   		nvme_unfreeze(&ctrl->ctrl);
>   	}
>   
> +	/*
> +	 * If the number of queues has increased (reconnect case)
> +	 * start all new queues now.
> +	 */
> +	ret = nvme_rdma_start_io_queues(ctrl, nr_queues,
> +					ctrl->tag_set.nr_hw_queues + 1);
> +	if (ret)
> +		goto out_wait_freeze_timed_out;
If this fails, the queues started by nvme_rdma_start_io_queues(ctrl, 1, nr_queues)
need to be cleaned up.
> +
>   	return 0;
>   
>   out_wait_freeze_timed_out:
> 



* Re: [PATCH v3 3/3] nvme-rdma: Handle number of queue changes
  2022-08-29  9:22   ` Chao Leng
@ 2022-08-29  9:32     ` Daniel Wagner
  2022-08-29 13:16       ` Chao Leng
  0 siblings, 1 reply; 25+ messages in thread
From: Daniel Wagner @ 2022-08-29  9:32 UTC (permalink / raw)
  To: Chao Leng; +Cc: linux-nvme, Sagi Grimberg, Hannes Reinecke

On Mon, Aug 29, 2022 at 05:22:50PM +0800, Chao Leng wrote:
> > +	ret = nvme_rdma_start_io_queues(ctrl, nr_queues,
> > +					ctrl->tag_set.nr_hw_queues + 1);
> > +	if (ret)
> > +		goto out_wait_freeze_timed_out;
> nvme_rdma_start_io_queues(ctrl, 1, nr_queues) need to clean.

the teardown path in nvme_rdma_stop_io_queues() is using

   test_and_clear_bit(NVME_RDMA_Q_LIVE, &queue->flags)

when iterating over all nr_hw_queues, so any queue that was already
started gets stopped and the rest are simply skipped.
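
i.e. roughly (simplified, locking omitted):

static void nvme_rdma_stop_queue(struct nvme_rdma_queue *queue)
{
	if (!test_and_clear_bit(NVME_RDMA_Q_LIVE, &queue->flags))
		return;			/* queue was never started */
	__nvme_rdma_stop_queue(queue);
}

static void nvme_rdma_stop_io_queues(struct nvme_rdma_ctrl *ctrl)
{
	int i;

	for (i = 1; i < ctrl->ctrl.queue_count; i++)
		nvme_rdma_stop_queue(&ctrl->queues[i]);
}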



* Re: [PATCH v3 3/3] nvme-rdma: Handle number of queue changes
  2022-08-29  9:32     ` Daniel Wagner
@ 2022-08-29 13:16       ` Chao Leng
  2022-08-30  8:21         ` Sagi Grimberg
  0 siblings, 1 reply; 25+ messages in thread
From: Chao Leng @ 2022-08-29 13:16 UTC (permalink / raw)
  To: Daniel Wagner; +Cc: linux-nvme, Sagi Grimberg, Hannes Reinecke



On 2022/8/29 17:32, Daniel Wagner wrote:
> On Mon, Aug 29, 2022 at 05:22:50PM +0800, Chao Leng wrote:
>>> +	ret = nvme_rdma_start_io_queues(ctrl, nr_queues,
>>> +					ctrl->tag_set.nr_hw_queues + 1);
>>> +	if (ret)
>>> +		goto out_wait_freeze_timed_out;
>> nvme_rdma_start_io_queues(ctrl, 1, nr_queues) need to clean.
> 
> the teardown path in nvme_rdma_stop_io_queus() is using
> 
>     test_and_clear_bit(NVME_RDMA_Q_LIVE, &queue->flags)
> 
> when iterating over all nr_hw_queues.
I mean that we would need to reallocate ctrl->queues for
nvme_rdma_start_io_queues(ctrl, 1, nr_queues).
Otherwise, unallocated memory will be accessed when
the new queue_count is bigger than the old one.
> .
> 



* Re: [PATCH v3 0/3] Handle number of queue changes
  2022-08-29  8:38 [PATCH v3 0/3] Handle number of queue changes Daniel Wagner
                   ` (3 preceding siblings ...)
  2022-08-29  9:07 ` [PATCH v3 0/3] " Hannes Reinecke
@ 2022-08-29 18:44 ` James Smart
  2022-08-31  9:05   ` Daniel Wagner
  4 siblings, 1 reply; 25+ messages in thread
From: James Smart @ 2022-08-29 18:44 UTC (permalink / raw)
  To: Daniel Wagner, linux-nvme; +Cc: Sagi Grimberg, Hannes Reinecke, Chao Leng

On 8/29/2022 1:38 AM, Daniel Wagner wrote:
> Updated this series to a proper patch series with Hannes and Sagi's
> feedback addressed.
> 
> I tested this with nvme-tcp but due to lack of hardware the nvme-rdma
> is only compile tested.
> 
> Daniel
> 
> 
>  From the previous cover letter:
> 
> We got a report from our customer that a firmware upgrade on the
> storage array is able to 'break' host. This is caused a change of
> number of queues which the target supports after a reconnect.
> 
> Let's assume the number of queues is 8 and all is working fine. Then
> the connection is dropped and the host starts to try to
> reconnect. Eventually, this succeeds but now the new number of queues
> is 10:
> 
> nvme0: creating 8 I/O queues.
> nvme0: mapped 8/0/0 default/read/poll queues.
> nvme0: new ctrl: NQN "nvmet-test", addr 10.100.128.29:4420
> nvme0: queue 0: timeout request 0x0 type 4
> nvme0: starting error recovery
> nvme0: failed nvme_keep_alive_end_io error=10
> nvme0: Reconnecting in 10 seconds...
> nvme0: failed to connect socket: -110
> nvme0: Failed reconnect attempt 1
> nvme0: Reconnecting in 10 seconds...
> nvme0: creating 10 I/O queues.
> nvme0: Connect command failed, error wo/DNR bit: -16389
> nvme0: failed to connect queue: 9 ret=-5
> nvme0: Failed reconnect attempt 2
> 
> As you can see queue number 9 is not able to connect.
> 
> As the order of starting and unfreezing is important we can't just
> move the start of the queues after the tagset update. So my stupid
> idea was to start just the older number of queues and then the rest.

Since you're in the area....

I recommend adding something to ensure that, after a reconnect, if
I/O queues were present in the prior association, the new controller
must return support for at least 1 I/O queue or the reconnect fails and
retries. This was a bug we hit on a subsystem in FC.
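
Something along these lines (only a sketch; prior_ioq_cnt and
nr_io_queues stand in for whatever the transport tracks, modeled on
what we ended up doing in FC):

	if (prior_ioq_cnt && !nr_io_queues) {
		dev_info(ctrl->device,
			 "reconnect: at least 1 I/O queue required (had %u)\n",
			 prior_ioq_cnt);
		return -ENOSPC;	/* fail this attempt, let it retry */
	}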

-- james




* Re: [PATCH v3 0/3] Handle number of queue changes
  2022-08-29  9:07 ` [PATCH v3 0/3] " Hannes Reinecke
  2022-08-29  9:17   ` Daniel Wagner
@ 2022-08-29 18:56   ` James Smart
  2022-08-30  7:57     ` Sagi Grimberg
  1 sibling, 1 reply; 25+ messages in thread
From: James Smart @ 2022-08-29 18:56 UTC (permalink / raw)
  To: Hannes Reinecke, Daniel Wagner, linux-nvme; +Cc: Sagi Grimberg, Chao Leng

On 8/29/2022 2:07 AM, Hannes Reinecke wrote:
> On 8/29/22 10:38, Daniel Wagner wrote:
>> Updated this series to a proper patch series with Hannes and Sagi's
>> feedback addressed.
>>
>> I tested this with nvme-tcp but due to lack of hardware the nvme-rdma
>> is only compile tested.
>>
> 
> One does wonder: what about FC?
> Does it suffer from the same problems?
> 
> Cheers,
> 
> Hannes


Yep, wondering too. I don't think so... FC does do this differently.

We don't realloc io queues nor tag_sets.  We reuse the allocations and 
tag_set originally created in the 1st successful association for the 
controller.

On reconnect, we set the queue count, then call 
blk_mq_update_nr_hw_queues() if it changed before we get into the loops 
to change queue states.

So FC's call to update nr_hw_queues() is much earlier than rdma/tcp today.
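
Roughly (a simplified sketch of the FC reconnect path):

	prior_ioq_cnt = ctrl->ctrl.queue_count - 1;

	nvme_set_queue_count(&ctrl->ctrl, &nr_io_queues);
	ctrl->ctrl.queue_count = nr_io_queues + 1;

	if (prior_ioq_cnt != nr_io_queues)
		blk_mq_update_nr_hw_queues(&ctrl->tag_set, nr_io_queues);

	/* only then (re)create and connect the individual hw queues */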

-- james




* Re: [PATCH v3 0/3] Handle number of queue changes
  2022-08-29 18:56   ` James Smart
@ 2022-08-30  7:57     ` Sagi Grimberg
  2022-08-30 14:38       ` James Smart
  0 siblings, 1 reply; 25+ messages in thread
From: Sagi Grimberg @ 2022-08-30  7:57 UTC (permalink / raw)
  To: James Smart, Hannes Reinecke, Daniel Wagner, linux-nvme; +Cc: Chao Leng


>>> Updated this series to a proper patch series with Hannes and Sagi's
>>> feedback addressed.
>>>
>>> I tested this with nvme-tcp but due to lack of hardware the nvme-rdma
>>> is only compile tested.
>>>
>>
>> One does wonder: what about FC?
>> Does it suffer from the same problems?
>>
>> Cheers,
>>
>> Hannes
> 
> 
> Yep, wondering too. I don't think so... FC does do this differently.
> 
> We don't realloc io queues nor tag_sets.  We reuse the allocations and 
> tag_set originally created in the 1st successful association for the 
> controller.
> 
> On reconnect, we set the queue count, then call 
> blk_mq_update_nr_hw_queues() if it changed before we get into the loops 
> to change queue states.

That is the same as tcp/rdma. We don't realloc tagsets or queues, we
just reinitialize the queues based on the queue_count.

> So FC's call to update nr_hw_queues() is much earlier than rdma/tcp today.

The only difference is that you don't freeze the request queues when
tearing down the controller, so you can allocate/start all the queues in
one go.

In pci/rdma/tcp, we start by freezing the request queues to address a 
hang that happens with multiple queue-maps (default/read/poll).
See 2875b0aecabe ("nvme-tcp: fix controller reset hang during traffic").
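
For reference, the (simplified) ordering in nvme_tcp_configure_io_queues()
for the reset/reconnect case is, with the matching nvme_start_freeze()
done earlier in the teardown path:

	nvme_start_queues(ctrl);
	if (!nvme_wait_freeze_timeout(ctrl, NVME_IO_TIMEOUT)) {
		/* frozen state was never reached, give up and retry later */
		ret = -ENODEV;
		goto out_wait_freeze_timed_out;
	}
	blk_mq_update_nr_hw_queues(ctrl->tagset, ctrl->queue_count - 1);
	nvme_unfreeze(ctrl);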

I don't think that fc supports multiple queue maps, but in case the
number of queues changes, blk_mq_update_nr_hw_queues() will still attempt
to freeze the request queues, which may lead to a hang if some requests
cannot complete (because the queues are quiesced at this time). However,
I see that fc starts the queues at the end of nvme_fc_delete_association
(which is a bit strange because the same can be achieved by passing
start_queues=true to __nvme_fc_abort_outstanding_ios).

But that is the main difference, tcp/rdma does not start the queues when
tearing down a controller in a reset, only after we re-establish the 
queues. I think this was needed to support non-mpath configurations,
where IOs do not failover. Maybe that is a legacy thing now for fabrics 
though...



* Re: [PATCH v3 3/3] nvme-rdma: Handle number of queue changes
  2022-08-29 13:16       ` Chao Leng
@ 2022-08-30  8:21         ` Sagi Grimberg
  2022-08-30 12:22           ` Chao Leng
  0 siblings, 1 reply; 25+ messages in thread
From: Sagi Grimberg @ 2022-08-30  8:21 UTC (permalink / raw)
  To: Chao Leng, Daniel Wagner; +Cc: linux-nvme, Hannes Reinecke


>> On Mon, Aug 29, 2022 at 05:22:50PM +0800, Chao Leng wrote:
>>>> +    ret = nvme_rdma_start_io_queues(ctrl, nr_queues,
>>>> +                    ctrl->tag_set.nr_hw_queues + 1);
>>>> +    if (ret)
>>>> +        goto out_wait_freeze_timed_out;
>>> nvme_rdma_start_io_queues(ctrl, 1, nr_queues) need to clean.
>>
>> the teardown path in nvme_rdma_stop_io_queus() is using
>>
>>     test_and_clear_bit(NVME_RDMA_Q_LIVE, &queue->flags)
>>
>> when iterating over all nr_hw_queues.
> I mean that we need realloc the ctrl->queues for
> nvme_rdma_start_io_queues(ctrl, 1, nr_queues).
> Otherwise, unallocated memory will be accessed when
> the new queue_count is bigger than the old one.

That can't happen. ctrl->queues is allocated at the start
of the controller lifetime and will never exceed this queue
count.
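
Roughly what nvme_rdma_create_ctrl() does:

	ctrl->ctrl.queue_count = opts->nr_io_queues + opts->nr_write_queues +
				opts->nr_poll_queues + 1;
	...
	ctrl->queues = kcalloc(ctrl->ctrl.queue_count, sizeof(*ctrl->queues),
				GFP_KERNEL);

i.e. the array is sized once for the maximum the user asked for, and a
later reconnect can only ever use up to that many entries.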



* Re: [PATCH v3 3/3] nvme-rdma: Handle number of queue changes
  2022-08-30  8:21         ` Sagi Grimberg
@ 2022-08-30 12:22           ` Chao Leng
  2022-08-30 12:39             ` Chao Leng
  2022-08-30 13:22             ` Sagi Grimberg
  0 siblings, 2 replies; 25+ messages in thread
From: Chao Leng @ 2022-08-30 12:22 UTC (permalink / raw)
  To: Sagi Grimberg, Daniel Wagner; +Cc: linux-nvme, Hannes Reinecke



On 2022/8/30 16:21, Sagi Grimberg wrote:
> 
>>> On Mon, Aug 29, 2022 at 05:22:50PM +0800, Chao Leng wrote:
>>>>> +    ret = nvme_rdma_start_io_queues(ctrl, nr_queues,
>>>>> +                    ctrl->tag_set.nr_hw_queues + 1);
>>>>> +    if (ret)
>>>>> +        goto out_wait_freeze_timed_out;
>>>> nvme_rdma_start_io_queues(ctrl, 1, nr_queues) need to clean.
>>>
>>> the teardown path in nvme_rdma_stop_io_queus() is using
>>>
>>>     test_and_clear_bit(NVME_RDMA_Q_LIVE, &queue->flags)
>>>
>>> when iterating over all nr_hw_queues.
>> I mean that we need realloc the ctrl->queues for
>> nvme_rdma_start_io_queues(ctrl, 1, nr_queues).
>> Otherwise, unallocated memory will be accessed when
>> the new queue_count is bigger than the old one.
> 
> That can't happen. ctrl->queues is allocated at the start
> of the controller lifetime and will never exceed this queue
> count.
I don't understand why it can't happen.
nvme_rdma_start_io_queues(ctrl, nr_queues, ctrl->tag_set.nr_hw_queues + 1)
is designed to deal with the scenario where the new queue_count is
bigger than the old one, so it will access unallocated memory.
ctrl->queues is only allocated for the first queue count at the start
of the controller lifetime; if the new queue count is bigger than the
first one, ctrl->queues will be too small.
> .



* Re: [PATCH v3 3/3] nvme-rdma: Handle number of queue changes
  2022-08-30 12:22           ` Chao Leng
@ 2022-08-30 12:39             ` Chao Leng
  2022-08-30 13:22             ` Sagi Grimberg
  1 sibling, 0 replies; 25+ messages in thread
From: Chao Leng @ 2022-08-30 12:39 UTC (permalink / raw)
  To: linux-nvme



On 2022/8/30 20:22, Chao Leng wrote:
> 
> 
> On 2022/8/30 16:21, Sagi Grimberg wrote:
>>
>>>> On Mon, Aug 29, 2022 at 05:22:50PM +0800, Chao Leng wrote:
>>>>>> +    ret = nvme_rdma_start_io_queues(ctrl, nr_queues,
>>>>>> +                    ctrl->tag_set.nr_hw_queues + 1);
>>>>>> +    if (ret)
>>>>>> +        goto out_wait_freeze_timed_out;
>>>>> nvme_rdma_start_io_queues(ctrl, 1, nr_queues) need to clean.
>>>>
>>>> the teardown path in nvme_rdma_stop_io_queus() is using
>>>>
>>>>     test_and_clear_bit(NVME_RDMA_Q_LIVE, &queue->flags)
>>>>
>>>> when iterating over all nr_hw_queues.
>>> I mean that we need realloc the ctrl->queues for
>>> nvme_rdma_start_io_queues(ctrl, 1, nr_queues).
>>> Otherwise, unallocated memory will be accessed when
>>> the new queue_count is bigger than the old one.
>>
>> That can't happen. ctrl->queues is allocated at the start
>> of the controller lifetime and will never exceed this queue
>> count.
> I don't understand why it can't happen.
> nvme_rdma_start_io_queues(ctrl, nr_queues, ctrl->tag_set.nr_hw_queues + 1)
> is designed for dealing with the scenario that the new queue_count is
> bigger than the old one. it will access the unallocated memory.
> ctrl->queues is just allocated the first queue count buffers at the start
> of the controller lifetime, if the new queue count is bigger than the
> first queue count, ctrl-queues will be insufficient.
Please ignore this.
ctrl->queues is allocated with the maximum number of entries,
so you are right.
>> .
> 
> .



* Re: [PATCH v3 3/3] nvme-rdma: Handle number of queue changes
  2022-08-30 12:22           ` Chao Leng
  2022-08-30 12:39             ` Chao Leng
@ 2022-08-30 13:22             ` Sagi Grimberg
  1 sibling, 0 replies; 25+ messages in thread
From: Sagi Grimberg @ 2022-08-30 13:22 UTC (permalink / raw)
  To: Chao Leng, Daniel Wagner; +Cc: linux-nvme, Hannes Reinecke


>>>> when iterating over all nr_hw_queues.
>>> I mean that we need realloc the ctrl->queues for
>>> nvme_rdma_start_io_queues(ctrl, 1, nr_queues).
>>> Otherwise, unallocated memory will be accessed when
>>> the new queue_count is bigger than the old one.
>>
>> That can't happen. ctrl->queues is allocated at the start
>> of the controller lifetime and will never exceed this queue
>> count.
> I don't understand why it can't happen.
> nvme_rdma_start_io_queues(ctrl, nr_queues, ctrl->tag_set.nr_hw_queues + 1)
> is designed for dealing with the scenario that the new queue_count is
> bigger than the old one. it will access the unallocated memory.
> ctrl->queues is just allocated the first queue count buffers at the start
> of the controller lifetime, if the new queue count is bigger than the
> first queue count, ctrl-queues will be insufficient.

queue_count cannot be larger than what the user asked for, and that is
the number of entries ctrl->queues is spanning.



* Re: [PATCH v3 0/3] Handle number of queue changes
  2022-08-30  7:57     ` Sagi Grimberg
@ 2022-08-30 14:38       ` James Smart
  0 siblings, 0 replies; 25+ messages in thread
From: James Smart @ 2022-08-30 14:38 UTC (permalink / raw)
  To: Sagi Grimberg, Hannes Reinecke, Daniel Wagner, linux-nvme; +Cc: Chao Leng

On 8/30/2022 12:57 AM, Sagi Grimberg wrote:
> 
>>> One does wonder: what about FC?
>>> Does it suffer from the same problems?
>>>
>>> Cheers,
>>>
>>> Hannes
>>
>>
>> Yep, wondering too. I don't think so... FC does do this differently.
>>
>> We don't realloc io queues nor tag_sets.  We reuse the allocations and 
>> tag_set originally created in the 1st successful association for the 
>> controller.
>>
>> On reconnect, we set the queue count, then call 
>> blk_mq_update_nr_hw_queues() if it changed before we get into the 
>> loops to change queue states.
> 
> That is the same as tcp/rdma. We don't realloc tagsets or queues, we
> just reinitialize the queues based on the queue_count.

yeah - true.

> 
>> So FC's call to update nr_hw_queues() is much earlier than rdma/tcp 
>> today.
> 
> The only difference is that you don't freeze the request queues when
> tearing down the controller, so you can allocate/start all the queues in
> one go.

This is related to the way FC actually has to do things on the wire to
terminate I/Os before we can declare outstanding I/O dead, and while
that is in flight on the wire the request queues needed to be unfrozen
for other paths to hit the reject states.

When I look at the mods, I still think it comes down to when the
transports do the update. Rdma/TCP's mods are specifically about
start_io_queues, which loops on (the now changed) queue_count, having to
be called before update nr_hw_queues is called to update it on the
tag_set. The patch limits the initial call to what's in the (existing,
pre-change) tag_set and adds a second call to start all queues in the
now revised tag_set.

FC, as it doesn't call such a loop routine, updates the tag_set before
it can get into trouble.

> 
> In pci/rdma/tcp, we start by freezing the request queues to address a 
> hang that happens with multiple queue-maps (default/read/poll).
> See 2875b0aecabe ("nvme-tcp: fix controller reset hang during traffic").
> 
> I don't think that fc supports multiple queue maps, but in case the

Nope we don't. Isn't as meaningful based on the way the transport works.

> number of queue changes, blk_mq_update_nr_hw_queues() will still attempt
> to freeze the request queues, which may lead to a hang if some requests
> may not be able to complete (because the queues are quiesced at this
> time). However, I see that fc starts the queues in the end of
> nvme_fc_delete_association (which is a bit strange because the same can
> be achieved by passing start_queues=true to
> __nvme_fc_abort_outstanding_ios.

Yep - code fragmented a little over time.


> But that is the main difference, tcp/rdma does not start the queues when
> tearing down a controller in a reset, only after we re-establish the 
> queues. I think this was needed to support a non-mpath configurations,
> where IOs do not failover. Maybe that is a legacy thing now for fabrics 
> though...

agree. thus the ordering difference vs update.

-- james




* Re: [PATCH v3 0/3] Handle number of queue changes
  2022-08-29 18:44 ` James Smart
@ 2022-08-31  9:05   ` Daniel Wagner
  0 siblings, 0 replies; 25+ messages in thread
From: Daniel Wagner @ 2022-08-31  9:05 UTC (permalink / raw)
  To: James Smart; +Cc: linux-nvme, Sagi Grimberg, Hannes Reinecke, Chao Leng

On Mon, Aug 29, 2022 at 11:44:41AM -0700, James Smart wrote:
> Since you're in the area....
> 
> I recommend adding something to ensure that after a reconnect that if I/O
> queues were present in the prior association, the new controller must return
> support for at least 1 io queue or the reconnect fails and retries. This was
> a bug we hit on a subsystem in FC.

I suppose this is the fix for the problem you describe:

  834d3710a093 ("nvme-fc: reject reconnect if io queue count is reduced to zero")

A test with v4 gives this:

 nvmet: adding nsid 1 to subsystem blktests-subsystem-1
 nvmet_tcp: enabling port 0 (127.0.0.1:4420)
 nvmet: creating nvm controller 1 for subsystem blktests-subsystem-1 for NQN nqn.2014-08.org.nvmexpress:uuid:.
 nvme nvme1: creating 8 I/O queues.
 nvme nvme1: mapped 8/0/0 default/read/poll queues.
 nvme nvme1: new ctrl: NQN "blktests-subsystem-1", addr 127.0.0.1:4420
 nvmet: remove subystem from port
 nvme nvme1: failed to send request -32
 nvme nvme1: failed nvme_keep_alive_end_io error=4
 nvme nvme1: starting error recovery
 nvme nvme1: Reconnecting in 10 seconds...
 nvmet: set qid_max to 0
 nvmet_tcp: enabling port 0 (127.0.0.1:4420)
 nvmet: creating nvm controller 1 for subsystem blktests-subsystem-1 for NQN nqn.2014-08.org.nvmexpress:uuid:.
 nvme nvme1: creating 1 I/O queues.
 nvmet: invalid queue id (1)
 nvme nvme1: Connect Invalid SQE Parameter, qid 1
 nvme nvme1: failed to connect queue: 1 ret=16770
 nvme nvme1: Failed reconnect attempt 1
 nvme nvme1: Reconnecting in 10 seconds...
 nvmet: remove subsystem from port
 nvme nvme1: failed to connect socket: -111
 nvme nvme1: Failed reconnect attempt 2
 nvme nvme1: Reconnecting in 10 seconds...
 nvmet: set qid_max to 2
 nvmet_tcp: enabling port 0 (127.0.0.1:4420)
 nvmet: creating nvm controller 1 for subsystem blktests-subsystem-1 for NQN nqn.2014-08.org.nvmexpress:uuid:.
 nvme nvme1: creating 2 I/O queues.
 nvme nvme1: mapped 2/0/0 default/read/poll queues.
 nvme nvme1: Successfully reconnected (3 attempt)

It looks like the host is able to recover correctly even though the IO
queue count is 0.




Thread overview: 25+ messages
2022-08-29  8:38 [PATCH v3 0/3] Handle number of queue changes Daniel Wagner
2022-08-29  8:38 ` [PATCH v3 1/3] nvmet: Expose max queues to sysfs Daniel Wagner
2022-08-29  9:01   ` Sagi Grimberg
2022-08-29  9:06   ` Hannes Reinecke
2022-08-29  8:38 ` [PATCH v3 2/3] nvme-tcp: Handle number of queue changes Daniel Wagner
2022-08-29  9:01   ` Sagi Grimberg
2022-08-29  9:07   ` Hannes Reinecke
2022-08-29  8:38 ` [PATCH v3 3/3] nvme-rdma: " Daniel Wagner
2022-08-29  9:02   ` Sagi Grimberg
2022-08-29  9:07   ` Hannes Reinecke
2022-08-29  9:22   ` Chao Leng
2022-08-29  9:32     ` Daniel Wagner
2022-08-29 13:16       ` Chao Leng
2022-08-30  8:21         ` Sagi Grimberg
2022-08-30 12:22           ` Chao Leng
2022-08-30 12:39             ` Chao Leng
2022-08-30 13:22             ` Sagi Grimberg
2022-08-29  9:07 ` [PATCH v3 0/3] " Hannes Reinecke
2022-08-29  9:17   ` Daniel Wagner
2022-08-29  9:20     ` Hannes Reinecke
2022-08-29 18:56   ` James Smart
2022-08-30  7:57     ` Sagi Grimberg
2022-08-30 14:38       ` James Smart
2022-08-29 18:44 ` James Smart
2022-08-31  9:05   ` Daniel Wagner
