* [PATCH v2 0/3] Handle number of queue changes
@ 2022-08-23  7:44 Daniel Wagner
  2022-08-23  7:44 ` [PATCH v2 1/3] nvmet: Expose max queues to sysfs Daniel Wagner
                   ` (2 more replies)
  0 siblings, 3 replies; 19+ messages in thread
From: Daniel Wagner @ 2022-08-23  7:44 UTC (permalink / raw)
  To: linux-nvme; +Cc: Sagi Grimberg, Daniel Wagner

Updated this to a proper patch series with Sagi's feedback
addressed.

This version updates the caller side of nvme_tcp_start_io_queues() to
decide which queues need to be started, instead of making
nvme_tcp_start_io_queues() re-entrant safe.

I tested this properly with nvme-tcp, but due to lack of hardware the
nvme-rdma part is only compile tested.

Daniel


From the previous cover letter:

We got a report from our customer that a firmware upgrade on the
storage array is able to 'break' the host. This is caused by a change
in the number of queues the target supports after a reconnect.

Let's assume the number of queues is 8 and all is working fine. Then
the connection is dropped and the host starts trying to
reconnect. Eventually, this succeeds, but now the new number of queues
is 10:

nvme0: creating 8 I/O queues.
nvme0: mapped 8/0/0 default/read/poll queues.
nvme0: new ctrl: NQN "nvmet-test", addr 10.100.128.29:4420
nvme0: queue 0: timeout request 0x0 type 4
nvme0: starting error recovery
nvme0: failed nvme_keep_alive_end_io error=10
nvme0: Reconnecting in 10 seconds...
nvme0: failed to connect socket: -110
nvme0: Failed reconnect attempt 1
nvme0: Reconnecting in 10 seconds...
nvme0: creating 10 I/O queues.
nvme0: Connect command failed, error wo/DNR bit: -16389
nvme0: failed to connect queue: 9 ret=-5
nvme0: Failed reconnect attempt 2

As you can see, queue number 9 is not able to connect.

As the order of starting and unfreezing is important, we can't just
move the start of the queues after the tagset update. So my stupid
idea was to start just the old number of queues and then the rest.

This seems to work:

nvme nvme0: creating 4 I/O queues.
nvme nvme0: mapped 4/0/0 default/read/poll queues.
nvme_tcp_start_io_queues nr_hw_queues 4 queue_count 5 qcnt 5
nvme_tcp_start_io_queues nr_hw_queues 4 queue_count 5 qcnt 5
nvme nvme0: new ctrl: NQN "nvmet-test", addr 10.100.128.29:4420
nvme nvme0: queue 0: timeout request 0x0 type 4
nvme nvme0: starting error recovery
nvme0: Keep Alive(0x18), Unknown (sct 0x3 / sc 0x71)
nvme nvme0: failed nvme_keep_alive_end_io error=10
nvme nvme0: Reconnecting in 10 seconds...
nvme nvme0: creating 6 I/O queues.
nvme_tcp_start_io_queues nr_hw_queues 4 queue_count 7 qcnt 5
nvme nvme0: mapped 6/0/0 default/read/poll queues.
nvme_tcp_start_io_queues nr_hw_queues 6 queue_count 7 qcnt 7
nvme nvme0: Successfully reconnected (1 attempt)
nvme nvme0: starting error recovery
nvme0: Keep Alive(0x18), Unknown (sct 0x3 / sc 0x71)
nvme nvme0: failed nvme_keep_alive_end_io error=10
nvme nvme0: Reconnecting in 10 seconds...
nvme nvme0: creating 4 I/O queues.
nvme_tcp_start_io_queues nr_hw_queues 6 queue_count 5 qcnt 5
nvme nvme0: mapped 4/0/0 default/read/poll queues.
nvme_tcp_start_io_queues nr_hw_queues 4 queue_count 5 qcnt 5
nvme nvme0: Successfully reconnected (1 attempt)
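
To spell out the arithmetic behind the qcnt values in the log above:
on the reconnect from 4 to 6 I/O queues, the old tagset still has
nr_hw_queues = 4 while the new queue_count is 7 (admin queue plus six
I/O queues), so the two start phases work out roughly as:

	/* phase 1: only the queues the old tagset knows about */
	nr_queues = min(tag_set.nr_hw_queues + 1, queue_count);  /* min(5, 7) = 5 */
	nvme_tcp_start_io_queues(ctrl, 1, nr_queues);             /* starts queues 1..4 */

	/* phase 2: after blk_mq_update_nr_hw_queues() grew the tagset to 6 */
	nvme_tcp_start_io_queues(ctrl, nr_queues, tag_set.nr_hw_queues + 1);  /* 5..6 */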


changes:
v2:
 - removed debug logging
 - pass in queue range idx as argument to nvme_tcp_start_io_queues
v1:
 - https://lore.kernel.org/linux-nvme/20220812142824.17766-1-dwagner@suse.de/

Daniel Wagner (3):
  nvmet: Expose max queues to sysfs
  nvme-tcp: Handle number of queue changes
  nvme-rdma: Handle number of queue changes

 drivers/nvme/host/rdma.c       | 26 ++++++++++++---
 drivers/nvme/host/tcp.c        | 59 ++++++++++++++++++++++------------
 drivers/nvme/target/configfs.c | 25 ++++++++++++++
 3 files changed, 84 insertions(+), 26 deletions(-)

-- 
2.37.1




* [PATCH v2 1/3] nvmet: Expose max queues to sysfs
  2022-08-23  7:44 [PATCH v2 0/3] Handle number of queue changes Daniel Wagner
@ 2022-08-23  7:44 ` Daniel Wagner
  2022-08-25 16:20   ` Hannes Reinecke
  2022-08-28 12:12   ` Sagi Grimberg
  2022-08-23  7:44 ` [PATCH v2 2/3] nvme-tcp: Handle number of queue changes Daniel Wagner
  2022-08-23  7:44 ` [PATCH v2 3/3] nvme-rdma: " Daniel Wagner
  2 siblings, 2 replies; 19+ messages in thread
From: Daniel Wagner @ 2022-08-23  7:44 UTC (permalink / raw)
  To: linux-nvme; +Cc: Sagi Grimberg, Daniel Wagner

Allow setting the maximum number of queues the target supports. This
is useful for testing the reconnect behaviour of the host with changing
numbers of supported queues.

Signed-off-by: Daniel Wagner <dwagner@suse.de>
---
 drivers/nvme/target/configfs.c | 25 +++++++++++++++++++++++++
 1 file changed, 25 insertions(+)

diff --git a/drivers/nvme/target/configfs.c b/drivers/nvme/target/configfs.c
index 2bcd60758919..ba9f3209f848 100644
--- a/drivers/nvme/target/configfs.c
+++ b/drivers/nvme/target/configfs.c
@@ -1281,6 +1281,30 @@ static ssize_t nvmet_subsys_attr_pi_enable_store(struct config_item *item,
 CONFIGFS_ATTR(nvmet_subsys_, attr_pi_enable);
 #endif
 
+static ssize_t nvmet_subsys_attr_qid_max_show(struct config_item *item,
+					      char *page)
+{
+	return snprintf(page, PAGE_SIZE, "%u\n", to_subsys(item)->max_qid);
+}
+
+static ssize_t nvmet_subsys_attr_qid_max_store(struct config_item *item,
+					       const char *page, size_t cnt)
+{
+	u16 qid_max;
+
+	if (sscanf(page, "%hu\n", &qid_max) != 1)
+		return -EINVAL;
+
+	if (qid_max == 0)
+		return -EINVAL;
+
+	down_write(&nvmet_config_sem);
+	to_subsys(item)->max_qid = qid_max;
+	up_write(&nvmet_config_sem);
+	return cnt;
+}
+CONFIGFS_ATTR(nvmet_subsys_, attr_qid_max);
+
 static struct configfs_attribute *nvmet_subsys_attrs[] = {
 	&nvmet_subsys_attr_attr_allow_any_host,
 	&nvmet_subsys_attr_attr_version,
@@ -1291,6 +1315,7 @@ static struct configfs_attribute *nvmet_subsys_attrs[] = {
 #ifdef CONFIG_BLK_DEV_INTEGRITY
 	&nvmet_subsys_attr_attr_pi_enable,
 #endif
+	&nvmet_subsys_attr_attr_qid_max,
 	NULL,
 };
 
-- 
2.37.1
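
For reference, with this patch the new attribute shows up next to the
other subsystem attributes in configfs, so (assuming the usual configfs
mount point and the nvmet-test subsystem from the cover letter) the
limit can be changed with something like:

	echo 4 > /sys/kernel/config/nvmet/subsystems/nvmet-test/attr_qid_max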




* [PATCH v2 2/3] nvme-tcp: Handle number of queue changes
  2022-08-23  7:44 [PATCH v2 0/3] Handle number of queue changes Daniel Wagner
  2022-08-23  7:44 ` [PATCH v2 1/3] nvmet: Expose max queues to sysfs Daniel Wagner
@ 2022-08-23  7:44 ` Daniel Wagner
  2022-08-28 12:25   ` Sagi Grimberg
  2022-08-23  7:44 ` [PATCH v2 3/3] nvme-rdma: " Daniel Wagner
  2 siblings, 1 reply; 19+ messages in thread
From: Daniel Wagner @ 2022-08-23  7:44 UTC (permalink / raw)
  To: linux-nvme; +Cc: Sagi Grimberg, Daniel Wagner

On reconnect, the number of queues might have changed.

In the case where we have more queues available than previously, we
try to access queues which are not initialized yet.

In the other case, where we have fewer queues than previously, the
connection attempt will fail because the target doesn't support the
old number of queues, and we end up in a reconnect loop.

Thus, only start the queues which are currently present in the tagset,
limited by the number of available queues. Then we update the tagset
and can start any new queues.

Signed-off-by: Daniel Wagner <dwagner@suse.de>
---
 drivers/nvme/host/tcp.c | 59 ++++++++++++++++++++++++++---------------
 1 file changed, 38 insertions(+), 21 deletions(-)

diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c
index 044da18c06f5..93206215e381 100644
--- a/drivers/nvme/host/tcp.c
+++ b/drivers/nvme/host/tcp.c
@@ -1762,11 +1762,12 @@ static void nvme_tcp_stop_io_queues(struct nvme_ctrl *ctrl)
 		nvme_tcp_stop_queue(ctrl, i);
 }
 
-static int nvme_tcp_start_io_queues(struct nvme_ctrl *ctrl)
+static int nvme_tcp_start_io_queues(struct nvme_ctrl *ctrl,
+				    int first, int last)
 {
 	int i, ret;
 
-	for (i = 1; i < ctrl->queue_count; i++) {
+	for (i = first; i < last; i++) {
 		ret = nvme_tcp_start_queue(ctrl, i);
 		if (ret)
 			goto out_stop_queues;
@@ -1775,7 +1776,7 @@ static int nvme_tcp_start_io_queues(struct nvme_ctrl *ctrl)
 	return 0;
 
 out_stop_queues:
-	for (i--; i >= 1; i--)
+	for (i--; i >= first; i--)
 		nvme_tcp_stop_queue(ctrl, i);
 	return ret;
 }
@@ -1899,31 +1900,38 @@ static void nvme_tcp_destroy_io_queues(struct nvme_ctrl *ctrl, bool remove)
 	nvme_tcp_free_io_queues(ctrl);
 }
 
-static int nvme_tcp_configure_io_queues(struct nvme_ctrl *ctrl, bool new)
+static int nvme_tcp_configure_io_queues(struct nvme_ctrl *nctrl, bool new)
 {
-	int ret;
+	struct nvme_tcp_ctrl *ctrl = to_tcp_ctrl(nctrl);
+	int ret, nr_queues;
 
-	ret = nvme_tcp_alloc_io_queues(ctrl);
+	ret = nvme_tcp_alloc_io_queues(nctrl);
 	if (ret)
 		return ret;
 
 	if (new) {
-		ret = nvme_tcp_alloc_tag_set(ctrl);
+		ret = nvme_tcp_alloc_tag_set(nctrl);
 		if (ret)
 			goto out_free_io_queues;
 
-		ret = nvme_ctrl_init_connect_q(ctrl);
+		ret = nvme_ctrl_init_connect_q(nctrl);
 		if (ret)
 			goto out_free_tag_set;
 	}
 
-	ret = nvme_tcp_start_io_queues(ctrl);
+	/*
+	 * Only start IO queues for which we have allocated the tagset
+	 * and limited it to the available queues. On reconnects, the
+	 * queue number might have changed.
+	 */
+	nr_queues = min(ctrl->tag_set.nr_hw_queues + 1, nctrl->queue_count);
+	ret = nvme_tcp_start_io_queues(nctrl, 1, nr_queues);
 	if (ret)
 		goto out_cleanup_connect_q;
 
 	if (!new) {
-		nvme_start_queues(ctrl);
-		if (!nvme_wait_freeze_timeout(ctrl, NVME_IO_TIMEOUT)) {
+		nvme_start_queues(nctrl);
+		if (!nvme_wait_freeze_timeout(nctrl, NVME_IO_TIMEOUT)) {
 			/*
 			 * If we timed out waiting for freeze we are likely to
 			 * be stuck.  Fail the controller initialization just
@@ -1932,26 +1940,35 @@ static int nvme_tcp_configure_io_queues(struct nvme_ctrl *ctrl, bool new)
 			ret = -ENODEV;
 			goto out_wait_freeze_timed_out;
 		}
-		blk_mq_update_nr_hw_queues(ctrl->tagset,
-			ctrl->queue_count - 1);
-		nvme_unfreeze(ctrl);
+		blk_mq_update_nr_hw_queues(nctrl->tagset,
+			nctrl->queue_count - 1);
+		nvme_unfreeze(nctrl);
 	}
 
+	/*
+	 * If the number of queues has increased (reconnect case)
+	 * start all new queues now.
+	 */
+	ret = nvme_tcp_start_io_queues(nctrl, nr_queues,
+				       ctrl->tag_set.nr_hw_queues + 1);
+	if (ret)
+		goto out_cleanup_connect_q;
+
 	return 0;
 
 out_wait_freeze_timed_out:
-	nvme_stop_queues(ctrl);
-	nvme_sync_io_queues(ctrl);
-	nvme_tcp_stop_io_queues(ctrl);
+	nvme_stop_queues(nctrl);
+	nvme_sync_io_queues(nctrl);
+	nvme_tcp_stop_io_queues(nctrl);
 out_cleanup_connect_q:
-	nvme_cancel_tagset(ctrl);
+	nvme_cancel_tagset(nctrl);
 	if (new)
-		blk_mq_destroy_queue(ctrl->connect_q);
+		blk_mq_destroy_queue(nctrl->connect_q);
 out_free_tag_set:
 	if (new)
-		blk_mq_free_tag_set(ctrl->tagset);
+		blk_mq_free_tag_set(nctrl->tagset);
 out_free_io_queues:
-	nvme_tcp_free_io_queues(ctrl);
+	nvme_tcp_free_io_queues(nctrl);
 	return ret;
 }
 
-- 
2.37.1
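
Since the hunks are spread over the function, the resulting flow of
nvme_tcp_configure_io_queues() with this patch applied is roughly
(allocation and error handling omitted):

	nr_queues = min(ctrl->tag_set.nr_hw_queues + 1, nctrl->queue_count);
	nvme_tcp_start_io_queues(nctrl, 1, nr_queues);	/* queues the old tagset knows */
	if (!new) {
		nvme_start_queues(nctrl);
		nvme_wait_freeze_timeout(nctrl, NVME_IO_TIMEOUT);
		blk_mq_update_nr_hw_queues(nctrl->tagset, nctrl->queue_count - 1);
		nvme_unfreeze(nctrl);
	}
	/* queues added by the reconnect, if any */
	nvme_tcp_start_io_queues(nctrl, nr_queues, ctrl->tag_set.nr_hw_queues + 1);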




* [PATCH v2 3/3] nvme-rdma: Handle number of queue changes
  2022-08-23  7:44 [PATCH v2 0/3] Handle number of queue changes Daniel Wagner
  2022-08-23  7:44 ` [PATCH v2 1/3] nvmet: Expose max queues to sysfs Daniel Wagner
  2022-08-23  7:44 ` [PATCH v2 2/3] nvme-tcp: Handle number of queue changes Daniel Wagner
@ 2022-08-23  7:44 ` Daniel Wagner
  2022-08-25 10:08   ` Chao Leng
  2 siblings, 1 reply; 19+ messages in thread
From: Daniel Wagner @ 2022-08-23  7:44 UTC (permalink / raw)
  To: linux-nvme; +Cc: Sagi Grimberg, Daniel Wagner

On reconnect, the number of queues might have changed.

In the case where we have more queues available than previously, we
try to access queues which are not initialized yet.

In the other case, where we have fewer queues than previously, the
connection attempt will fail because the target doesn't support the
old number of queues, and we end up in a reconnect loop.

Thus, only start the queues which are currently present in the tagset,
limited by the number of available queues. Then we update the tagset
and can start any new queues.

Signed-off-by: Daniel Wagner <dwagner@suse.de>
---
 drivers/nvme/host/rdma.c | 26 +++++++++++++++++++++-----
 1 file changed, 21 insertions(+), 5 deletions(-)

diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c
index 3100643be299..386674d7c0e6 100644
--- a/drivers/nvme/host/rdma.c
+++ b/drivers/nvme/host/rdma.c
@@ -696,11 +696,12 @@ static int nvme_rdma_start_queue(struct nvme_rdma_ctrl *ctrl, int idx)
 	return ret;
 }
 
-static int nvme_rdma_start_io_queues(struct nvme_rdma_ctrl *ctrl)
+static int nvme_rdma_start_io_queues(struct nvme_rdma_ctrl *ctrl,
+				     int first, int last)
 {
 	int i, ret = 0;
 
-	for (i = 1; i < ctrl->ctrl.queue_count; i++) {
+	for (i = first; i < last; i++) {
 		ret = nvme_rdma_start_queue(ctrl, i);
 		if (ret)
 			goto out_stop_queues;
@@ -709,7 +710,7 @@ static int nvme_rdma_start_io_queues(struct nvme_rdma_ctrl *ctrl)
 	return 0;
 
 out_stop_queues:
-	for (i--; i >= 1; i--)
+	for (i--; i >= first; i--)
 		nvme_rdma_stop_queue(&ctrl->queues[i]);
 	return ret;
 }
@@ -964,7 +965,7 @@ static void nvme_rdma_destroy_io_queues(struct nvme_rdma_ctrl *ctrl,
 
 static int nvme_rdma_configure_io_queues(struct nvme_rdma_ctrl *ctrl, bool new)
 {
-	int ret;
+	int ret, nr_queues;
 
 	ret = nvme_rdma_alloc_io_queues(ctrl);
 	if (ret)
@@ -980,7 +981,13 @@ static int nvme_rdma_configure_io_queues(struct nvme_rdma_ctrl *ctrl, bool new)
 			goto out_free_tag_set;
 	}
 
-	ret = nvme_rdma_start_io_queues(ctrl);
+	/*
+	 * Only start IO queues for which we have allocated the tagset
+	 * and limited it to the available queues. On reconnects, the
+	 * queue number might have changed.
+	 */
+	nr_queues = min(ctrl->tag_set.nr_hw_queues + 1, ctrl->ctrl.queue_count);
+	ret = nvme_rdma_start_io_queues(ctrl, 1, nr_queues);
 	if (ret)
 		goto out_cleanup_connect_q;
 
@@ -1000,6 +1007,15 @@ static int nvme_rdma_configure_io_queues(struct nvme_rdma_ctrl *ctrl, bool new)
 		nvme_unfreeze(&ctrl->ctrl);
 	}
 
+	/*
+	 * If the number of queues has increased (reconnect case)
+	 * start all new queues now.
+	 */
+	ret = nvme_rdma_start_io_queues(ctrl, nr_queues,
+					ctrl->tag_set.nr_hw_queues + 1);
+	if (ret)
+		goto out_cleanup_connect_q;
+
 	return 0;
 
 out_wait_freeze_timed_out:
-- 
2.37.1




* Re: [PATCH v2 3/3] nvme-rdma: Handle number of queue changes
  2022-08-23  7:44 ` [PATCH v2 3/3] nvme-rdma: " Daniel Wagner
@ 2022-08-25 10:08   ` Chao Leng
  2022-08-25 10:55     ` Daniel Wagner
  0 siblings, 1 reply; 19+ messages in thread
From: Chao Leng @ 2022-08-25 10:08 UTC (permalink / raw)
  To: Daniel Wagner, linux-nvme; +Cc: Sagi Grimberg



On 2022/8/23 15:44, Daniel Wagner wrote:
> On reconnect, the number of queues might have changed.
> 
> In the case where we have more queues available than previously we try
> to access queues which are not initialized yet.
> 
> The other case where we have less queues than previously, the
> connection attempt will fail because the target doesn't support the
> old number of queues and we end up in a reconnect loop.
> 
> Thus, only start queues which are currently present in the tagset
> limited by the number of available queues. Then we update the tagset
> and we can start any new queue.
> 
> Signed-off-by: Daniel Wagner <dwagner@suse.de>
> ---
>   drivers/nvme/host/rdma.c | 26 +++++++++++++++++++++-----
>   1 file changed, 21 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c
> index 3100643be299..386674d7c0e6 100644
> --- a/drivers/nvme/host/rdma.c
> +++ b/drivers/nvme/host/rdma.c
> @@ -696,11 +696,12 @@ static int nvme_rdma_start_queue(struct nvme_rdma_ctrl *ctrl, int idx)
>   	return ret;
>   }
>   
> -static int nvme_rdma_start_io_queues(struct nvme_rdma_ctrl *ctrl)
> +static int nvme_rdma_start_io_queues(struct nvme_rdma_ctrl *ctrl,
> +				     int first, int last)
>   {
>   	int i, ret = 0;
>   
> -	for (i = 1; i < ctrl->ctrl.queue_count; i++) {
> +	for (i = first; i < last; i++) {
>   		ret = nvme_rdma_start_queue(ctrl, i);
>   		if (ret)
>   			goto out_stop_queues;
> @@ -709,7 +710,7 @@ static int nvme_rdma_start_io_queues(struct nvme_rdma_ctrl *ctrl)
>   	return 0;
>   
>   out_stop_queues:
> -	for (i--; i >= 1; i--)
> +	for (i--; i >= first; i--)
>   		nvme_rdma_stop_queue(&ctrl->queues[i]);
>   	return ret;
>   }
> @@ -964,7 +965,7 @@ static void nvme_rdma_destroy_io_queues(struct nvme_rdma_ctrl *ctrl,
>   
>   static int nvme_rdma_configure_io_queues(struct nvme_rdma_ctrl *ctrl, bool new)
>   {
> -	int ret;
> +	int ret, nr_queues;
>   
>   	ret = nvme_rdma_alloc_io_queues(ctrl);
>   	if (ret)
> @@ -980,7 +981,13 @@ static int nvme_rdma_configure_io_queues(struct nvme_rdma_ctrl *ctrl, bool new)
>   			goto out_free_tag_set;
>   	}
>   
> -	ret = nvme_rdma_start_io_queues(ctrl);
> +	/*
> +	 * Only start IO queues for which we have allocated the tagset
> +	 * and limited it to the available queues. On reconnects, the
> +	 * queue number might have changed.
> +	 */
> +	nr_queues = min(ctrl->tag_set.nr_hw_queues + 1, ctrl->ctrl.queue_count);
> +	ret = nvme_rdma_start_io_queues(ctrl, 1, nr_queues);
>   	if (ret)
>   		goto out_cleanup_connect_q;
>   
> @@ -1000,6 +1007,15 @@ static int nvme_rdma_configure_io_queues(struct nvme_rdma_ctrl *ctrl, bool new)
>   		nvme_unfreeze(&ctrl->ctrl);
>   	}


>   
> +	/*
> +	 * If the number of queues has increased (reconnect case)
> +	 * start all new queues now.
> +	 */
> +	ret = nvme_rdma_start_io_queues(ctrl, nr_queues,
> +					ctrl->tag_set.nr_hw_queues + 1);
> +	if (ret)
> +		goto out_cleanup_connect_q;
> +
Now the code looks weird.
Maybe we can do like this:
first blk_mq_update_nr_hw_queues, and then nvme_rdma_start_io_queues.
>   	return 0;
>   
>   out_wait_freeze_timed_out:
> 



* Re: [PATCH v2 3/3] nvme-rdma: Handle number of queue changes
  2022-08-25 10:08   ` Chao Leng
@ 2022-08-25 10:55     ` Daniel Wagner
  2022-08-26  1:10       ` Chao Leng
  0 siblings, 1 reply; 19+ messages in thread
From: Daniel Wagner @ 2022-08-25 10:55 UTC (permalink / raw)
  To: Chao Leng; +Cc: linux-nvme, Sagi Grimberg

On Thu, Aug 25, 2022 at 06:08:10PM +0800, Chao Leng wrote:
> > +	/*
> > +	 * If the number of queues has increased (reconnect case)
> > +	 * start all new queues now.
> > +	 */
> > +	ret = nvme_rdma_start_io_queues(ctrl, nr_queues,
> > +					ctrl->tag_set.nr_hw_queues + 1);
> > +	if (ret)
> > +		goto out_cleanup_connect_q;
> > +
> Now the code looks weird.
> Maybe we can do like this:
> first blk_mq_update_nr_hw_queues, and then nvme_rdma_start_io_queues.

We have to start the existing queues before going into the 'if (!new)'
part. That's why the starting of the queues is split into two steps.




* Re: [PATCH v2 1/3] nvmet: Expose max queues to sysfs
  2022-08-23  7:44 ` [PATCH v2 1/3] nvmet: Expose max queues to sysfs Daniel Wagner
@ 2022-08-25 16:20   ` Hannes Reinecke
  2022-08-29  7:57     ` Daniel Wagner
  2022-08-28 12:12   ` Sagi Grimberg
  1 sibling, 1 reply; 19+ messages in thread
From: Hannes Reinecke @ 2022-08-25 16:20 UTC (permalink / raw)
  To: Daniel Wagner, linux-nvme; +Cc: Sagi Grimberg

On 8/23/22 09:44, Daniel Wagner wrote:
> Allow to set the max queues the target supports. This is useful for
> testing the reconnect attempt of the host with chaning numbers of
> supported queues.
> 
> Signed-off-by: Daniel Wagner <dwagner@suse.de>
> ---
>   drivers/nvme/target/configfs.c | 25 +++++++++++++++++++++++++
>   1 file changed, 25 insertions(+)
> 
> diff --git a/drivers/nvme/target/configfs.c b/drivers/nvme/target/configfs.c
> index 2bcd60758919..ba9f3209f848 100644
> --- a/drivers/nvme/target/configfs.c
> +++ b/drivers/nvme/target/configfs.c
> @@ -1281,6 +1281,30 @@ static ssize_t nvmet_subsys_attr_pi_enable_store(struct config_item *item,
>   CONFIGFS_ATTR(nvmet_subsys_, attr_pi_enable);
>   #endif
>   
> +static ssize_t nvmet_subsys_attr_qid_max_show(struct config_item *item,
> +					      char *page)
> +{
> +	return snprintf(page, PAGE_SIZE, "%u\n", to_subsys(item)->max_qid);
> +}
> +
> +static ssize_t nvmet_subsys_attr_qid_max_store(struct config_item *item,
> +					       const char *page, size_t cnt)
> +{
> +	u16 qid_max;
> +
> +	if (sscanf(page, "%hu\n", &qid_max) != 1)
> +		return -EINVAL;
> +
> +	if (qid_max == 0)
> +		return -EINVAL;
> +
> +	down_write(&nvmet_config_sem);
> +	to_subsys(item)->max_qid = qid_max;
> +	up_write(&nvmet_config_sem);

I would have expected this to trigger a reconnect; as it stands it 
requires the user to explicitly do a reconnect via nvme-cli, and there's 
a fair chance the number of queues will be specified there.
Hmm?

> +	return cnt;
> +}
> +CONFIGFS_ATTR(nvmet_subsys_, attr_qid_max);
> +
>   static struct configfs_attribute *nvmet_subsys_attrs[] = {
>   	&nvmet_subsys_attr_attr_allow_any_host,
>   	&nvmet_subsys_attr_attr_version,
> @@ -1291,6 +1315,7 @@ static struct configfs_attribute *nvmet_subsys_attrs[] = {
>   #ifdef CONFIG_BLK_DEV_INTEGRITY
>   	&nvmet_subsys_attr_attr_pi_enable,
>   #endif
> +	&nvmet_subsys_attr_attr_qid_max,
>   	NULL,
>   };
>   

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                Kernel Storage Architect
hare@suse.de                              +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), Geschäftsführer: Ivo Totev, Andrew
Myers, Andrew McDonald, Martje Boudien Moerman



* Re: [PATCH v2 3/3] nvme-rdma: Handle number of queue changes
  2022-08-25 10:55     ` Daniel Wagner
@ 2022-08-26  1:10       ` Chao Leng
  2022-08-26  6:30         ` Daniel Wagner
  0 siblings, 1 reply; 19+ messages in thread
From: Chao Leng @ 2022-08-26  1:10 UTC (permalink / raw)
  To: Daniel Wagner; +Cc: linux-nvme, Sagi Grimberg



On 2022/8/25 18:55, Daniel Wagner wrote:
> On Thu, Aug 25, 2022 at 06:08:10PM +0800, Chao Leng wrote:
>>> +	/*
>>> +	 * If the number of queues has increased (reconnect case)
>>> +	 * start all new queues now.
>>> +	 */
>>> +	ret = nvme_rdma_start_io_queues(ctrl, nr_queues,
>>> +					ctrl->tag_set.nr_hw_queues + 1);
>>> +	if (ret)
>>> +		goto out_cleanup_connect_q;
>>> +
>> Now the code looks weird.
>> Maybe we can do like this:
>> first blk_mq_update_nr_hw_queues, and then nvme_rdma_start_io_queues.
> 
> We have to start the exiting queues before going into the 'if (!new)'
> part. That's why the start of queues is splited into two steps.
Indeed it is not necessary.
It's just a little negative: some requests will fail and then retry
or fail over. I think it is acceptable.
> 
> .
> 



* Re: [PATCH v2 3/3] nvme-rdma: Handle number of queue changes
  2022-08-26  1:10       ` Chao Leng
@ 2022-08-26  6:30         ` Daniel Wagner
  2022-08-26  7:02           ` Chao Leng
  2022-08-26  7:31           ` Chao Leng
  0 siblings, 2 replies; 19+ messages in thread
From: Daniel Wagner @ 2022-08-26  6:30 UTC (permalink / raw)
  To: Chao Leng; +Cc: linux-nvme, Sagi Grimberg

On Fri, Aug 26, 2022 at 09:10:04AM +0800, Chao Leng wrote:
> On 2022/8/25 18:55, Daniel Wagner wrote:
> > On Thu, Aug 25, 2022 at 06:08:10PM +0800, Chao Leng wrote:
> > > > +	/*
> > > > +	 * If the number of queues has increased (reconnect case)
> > > > +	 * start all new queues now.
> > > > +	 */
> > > > +	ret = nvme_rdma_start_io_queues(ctrl, nr_queues,
> > > > +					ctrl->tag_set.nr_hw_queues + 1);
> > > > +	if (ret)
> > > > +		goto out_cleanup_connect_q;
> > > > +
> > > Now the code looks weird.
> > > Maybe we can do like this:
> > > first blk_mq_update_nr_hw_queues, and then nvme_rdma_start_io_queues.
> > 
> > We have to start the exiting queues before going into the 'if (!new)'
> > part. That's why the start of queues is splited into two steps.
> Indeed it is not necessary.
> It's just a little negative: some request will failed, and then retry
> or failover. I think it is acceptable.

The first version made nvme_rdma_start_io_queues() re-entrant and hence
we could just call nvme_rdma_start_io_queues() twice without the max
queue logic here.

After seeing both versions I tend to say the first one keeps the
'weird' stuff closer together and doesn't make the call side of
nvme_rdma_start_io_queues() do the counting. So my personal preference
is to go with v1.

Maybe there is another way, but I haven't figured it out yet.



* Re: [PATCH v2 3/3] nvme-rdma: Handle number of queue changes
  2022-08-26  6:30         ` Daniel Wagner
@ 2022-08-26  7:02           ` Chao Leng
  2022-08-26  7:31           ` Chao Leng
  1 sibling, 0 replies; 19+ messages in thread
From: Chao Leng @ 2022-08-26  7:02 UTC (permalink / raw)
  To: Daniel Wagner; +Cc: linux-nvme, Sagi Grimberg



On 2022/8/26 14:30, Daniel Wagner wrote:
> On Fri, Aug 26, 2022 at 09:10:04AM +0800, Chao Leng wrote:
>> On 2022/8/25 18:55, Daniel Wagner wrote:
>>> On Thu, Aug 25, 2022 at 06:08:10PM +0800, Chao Leng wrote:
>>>>> +	/*
>>>>> +	 * If the number of queues has increased (reconnect case)
>>>>> +	 * start all new queues now.
>>>>> +	 */
>>>>> +	ret = nvme_rdma_start_io_queues(ctrl, nr_queues,
>>>>> +					ctrl->tag_set.nr_hw_queues + 1);
>>>>> +	if (ret)
>>>>> +		goto out_cleanup_connect_q;
>>>>> +
>>>> Now the code looks weird.
>>>> Maybe we can do like this:
>>>> first blk_mq_update_nr_hw_queues, and then nvme_rdma_start_io_queues.
>>>
>>> We have to start the exiting queues before going into the 'if (!new)'
>>> part. That's why the start of queues is splited into two steps.
>> Indeed it is not necessary.
>> It's just a little negative: some request will failed, and then retry
>> or failover. I think it is acceptable.
> 
> The first version made nvme_rdma_start_io_queues() re-entrant and hence
> we just could call nvme_rdma_start_io_queues() twice without the max
> queue logic here.
> 
> After seeing both version I tend to do say the first one keeps the
> 'wierd' stuff more closer together and doesn't make the callside of
> nvme_rdma_start_io_queues() do the counting. So my personal preference
> is to go with v1.
The second nvme_rdma_start_io_queues() of patch v1 should not go to
out_cleanup_connect_q when it fails. This results in incomplete error handling.
> 
> Maybe there is another way but I haven't figured it out yet.
> .
> 



* Re: [PATCH v2 3/3] nvme-rdma: Handle number of queue changes
  2022-08-26  6:30         ` Daniel Wagner
  2022-08-26  7:02           ` Chao Leng
@ 2022-08-26  7:31           ` Chao Leng
  2022-08-26 11:33             ` Daniel Wagner
  1 sibling, 1 reply; 19+ messages in thread
From: Chao Leng @ 2022-08-26  7:31 UTC (permalink / raw)
  To: Daniel Wagner; +Cc: linux-nvme, Sagi Grimberg



On 2022/8/26 14:30, Daniel Wagner wrote:
> On Fri, Aug 26, 2022 at 09:10:04AM +0800, Chao Leng wrote:
>> On 2022/8/25 18:55, Daniel Wagner wrote:
>>> On Thu, Aug 25, 2022 at 06:08:10PM +0800, Chao Leng wrote:
>>>>> +	/*
>>>>> +	 * If the number of queues has increased (reconnect case)
>>>>> +	 * start all new queues now.
>>>>> +	 */
>>>>> +	ret = nvme_rdma_start_io_queues(ctrl, nr_queues,
>>>>> +					ctrl->tag_set.nr_hw_queues + 1);
>>>>> +	if (ret)
>>>>> +		goto out_cleanup_connect_q;
>>>>> +
>>>> Now the code looks weird.
>>>> Maybe we can do like this:
>>>> first blk_mq_update_nr_hw_queues, and then nvme_rdma_start_io_queues.
>>>
>>> We have to start the exiting queues before going into the 'if (!new)'
>>> part. That's why the start of queues is splited into two steps.
>> Indeed it is not necessary.
>> It's just a little negative: some request will failed, and then retry
>> or failover. I think it is acceptable.
> 
> The first version made nvme_rdma_start_io_queues() re-entrant and hence
> we just could call nvme_rdma_start_io_queues() twice without the max
> queue logic here.
> 
> After seeing both version I tend to do say the first one keeps the
> 'wierd' stuff more closer together and doesn't make the callside of
> nvme_rdma_start_io_queues() do the counting. So my personal preference
I don't understand "do the counting".
Show the code:
---
  drivers/nvme/host/rdma.c | 9 ++++-----
  1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c
index 7d01fb770284..8dfb79726e13 100644
--- a/drivers/nvme/host/rdma.c
+++ b/drivers/nvme/host/rdma.c
@@ -980,10 +980,6 @@ static int nvme_rdma_configure_io_queues(struct nvme_rdma_ctrl *ctrl, bool new)
                         goto out_free_tag_set;
         }

-       ret = nvme_rdma_start_io_queues(ctrl);
-       if (ret)
-               goto out_cleanup_connect_q;
-
         if (!new) {
                 nvme_start_queues(&ctrl->ctrl);
                 if (!nvme_wait_freeze_timeout(&ctrl->ctrl, NVME_IO_TIMEOUT)) {
@@ -1000,13 +996,16 @@ static int nvme_rdma_configure_io_queues(struct nvme_rdma_ctrl *ctrl, bool new)
                 nvme_unfreeze(&ctrl->ctrl);
         }

+       ret = nvme_rdma_start_io_queues(ctrl);
+       if (ret)
+               goto out_wait_freeze_timed_out;
+
         return 0;

  out_wait_freeze_timed_out:
         nvme_stop_queues(&ctrl->ctrl);
         nvme_sync_io_queues(&ctrl->ctrl);
         nvme_rdma_stop_io_queues(ctrl);
-out_cleanup_connect_q:
         nvme_cancel_tagset(&ctrl->ctrl);
         if (new)
                 blk_cleanup_queue(ctrl->ctrl.connect_q);
-- 
> is to go with v1.
> 
> Maybe there is another way but I haven't figured it out yet.
> .
> 



* Re: [PATCH v2 3/3] nvme-rdma: Handle number of queue changes
  2022-08-26  7:31           ` Chao Leng
@ 2022-08-26 11:33             ` Daniel Wagner
  2022-08-27  1:02               ` Chao Leng
  0 siblings, 1 reply; 19+ messages in thread
From: Daniel Wagner @ 2022-08-26 11:33 UTC (permalink / raw)
  To: Chao Leng; +Cc: linux-nvme, Sagi Grimberg

On Fri, Aug 26, 2022 at 03:31:15PM +0800, Chao Leng wrote:
> > After seeing both version I tend to do say the first one keeps the
> > 'wierd' stuff more closer together and doesn't make the callside of
> > nvme_rdma_start_io_queues() do the counting. So my personal preference
> I don't understand "do the counting".

Sorry. I meant we first start only the queues which have resources
allocated (nr_queues in my patch). And then we only need to start any
potentially added queues.

> Show the code:
> ---
>  drivers/nvme/host/rdma.c | 9 ++++-----
>  1 file changed, 4 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c
> index 7d01fb770284..8dfb79726e13 100644
> --- a/drivers/nvme/host/rdma.c
> +++ b/drivers/nvme/host/rdma.c
> @@ -980,10 +980,6 @@ static int nvme_rdma_configure_io_queues(struct nvme_rdma_ctrl *ctrl, bool new)
>                         goto out_free_tag_set;
>         }
> 
> -       ret = nvme_rdma_start_io_queues(ctrl);
> -       if (ret)
> -               goto out_cleanup_connect_q;

Again, these need to start so that...

> -
>         if (!new) {
>                 nvme_start_queues(&ctrl->ctrl);
>                 if (!nvme_wait_freeze_timeout(&ctrl->ctrl,

... this here has a chance to work.



* Re: [PATCH v2 3/3] nvme-rdma: Handle number of queue changes
  2022-08-26 11:33             ` Daniel Wagner
@ 2022-08-27  1:02               ` Chao Leng
  2022-08-28 12:20                 ` Sagi Grimberg
  0 siblings, 1 reply; 19+ messages in thread
From: Chao Leng @ 2022-08-27  1:02 UTC (permalink / raw)
  To: Daniel Wagner; +Cc: linux-nvme, Sagi Grimberg



On 2022/8/26 19:33, Daniel Wagner wrote:
> On Fri, Aug 26, 2022 at 03:31:15PM +0800, Chao Leng wrote:
>>> After seeing both version I tend to do say the first one keeps the
>>> 'wierd' stuff more closer together and doesn't make the callside of
>>> nvme_rdma_start_io_queues() do the counting. So my personal preference
>> I don't understand "do the counting".
> 
> Sorry. I meant we fist start only queues for which have resources
> allocated (nr_queues in my patch). And then we only need to start
> potentially added queues.
> 
>> Show the code:
>> ---
>>   drivers/nvme/host/rdma.c | 9 ++++-----
>>   1 file changed, 4 insertions(+), 5 deletions(-)
>>
>> diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c
>> index 7d01fb770284..8dfb79726e13 100644
>> --- a/drivers/nvme/host/rdma.c
>> +++ b/drivers/nvme/host/rdma.c
>> @@ -980,10 +980,6 @@ static int nvme_rdma_configure_io_queues(struct nvme_rdma_ctrl *ctrl, bool new)
>>                          goto out_free_tag_set;
>>          }
>>
>> -       ret = nvme_rdma_start_io_queues(ctrl);
>> -       if (ret)
>> -               goto out_cleanup_connect_q;
> 
> Again, these need to start so that...
> 
>> -
>>          if (!new) {
>>                  nvme_start_queues(&ctrl->ctrl);
>>                  if (!nvme_wait_freeze_timeout(&ctrl->ctrl,
> 
> ... this here has a chance to work.
Some requests will be submitted, will fail, and then retry
or fail over. It is similar to nvme_cancel_tagset in nvme_rdma_teardown_io_queues.
I think it is acceptable.
> .
> 



* Re: [PATCH v2 1/3] nvmet: Expose max queues to sysfs
  2022-08-23  7:44 ` [PATCH v2 1/3] nvmet: Expose max queues to sysfs Daniel Wagner
  2022-08-25 16:20   ` Hannes Reinecke
@ 2022-08-28 12:12   ` Sagi Grimberg
  1 sibling, 0 replies; 19+ messages in thread
From: Sagi Grimberg @ 2022-08-28 12:12 UTC (permalink / raw)
  To: Daniel Wagner, linux-nvme


> Allow to set the max queues the target supports. This is useful for
> testing the reconnect attempt of the host with chaning numbers of
> supported queues.
> 
> Signed-off-by: Daniel Wagner <dwagner@suse.de>
> ---
>   drivers/nvme/target/configfs.c | 25 +++++++++++++++++++++++++
>   1 file changed, 25 insertions(+)
> 
> diff --git a/drivers/nvme/target/configfs.c b/drivers/nvme/target/configfs.c
> index 2bcd60758919..ba9f3209f848 100644
> --- a/drivers/nvme/target/configfs.c
> +++ b/drivers/nvme/target/configfs.c
> @@ -1281,6 +1281,30 @@ static ssize_t nvmet_subsys_attr_pi_enable_store(struct config_item *item,
>   CONFIGFS_ATTR(nvmet_subsys_, attr_pi_enable);
>   #endif
>   
> +static ssize_t nvmet_subsys_attr_qid_max_show(struct config_item *item,
> +					      char *page)
> +{
> +	return snprintf(page, PAGE_SIZE, "%u\n", to_subsys(item)->max_qid);
> +}
> +
> +static ssize_t nvmet_subsys_attr_qid_max_store(struct config_item *item,
> +					       const char *page, size_t cnt)
> +{
> +	u16 qid_max;
> +
> +	if (sscanf(page, "%hu\n", &qid_max) != 1)
> +		return -EINVAL;
> +
> +	if (qid_max == 0)
> +		return -EINVAL;

This needs an upper limit.
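
Something along these lines would do, assuming NVMET_NR_QUEUES is the
right cap to reuse here:

	/* reject 0 and anything above the target's queue limit (assumed: NVMET_NR_QUEUES) */
	if (qid_max < 1 || qid_max > NVMET_NR_QUEUES)
		return -EINVAL;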



* Re: [PATCH v2 3/3] nvme-rdma: Handle number of queue changes
  2022-08-27  1:02               ` Chao Leng
@ 2022-08-28 12:20                 ` Sagi Grimberg
  2022-08-29  8:02                   ` Chao Leng
  0 siblings, 1 reply; 19+ messages in thread
From: Sagi Grimberg @ 2022-08-28 12:20 UTC (permalink / raw)
  To: Chao Leng, Daniel Wagner; +Cc: linux-nvme


>> On Fri, Aug 26, 2022 at 03:31:15PM +0800, Chao Leng wrote:
>>>> After seeing both version I tend to do say the first one keeps the
>>>> 'wierd' stuff more closer together and doesn't make the callside of
>>>> nvme_rdma_start_io_queues() do the counting. So my personal preference
>>> I don't understand "do the counting".
>>
>> Sorry. I meant we fist start only queues for which have resources
>> allocated (nr_queues in my patch). And then we only need to start
>> potentially added queues.
>>
>>> Show the code:
>>> ---
>>>   drivers/nvme/host/rdma.c | 9 ++++-----
>>>   1 file changed, 4 insertions(+), 5 deletions(-)
>>>
>>> diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c
>>> index 7d01fb770284..8dfb79726e13 100644
>>> --- a/drivers/nvme/host/rdma.c
>>> +++ b/drivers/nvme/host/rdma.c
>>> @@ -980,10 +980,6 @@ static int nvme_rdma_configure_io_queues(struct 
>>> nvme_rdma_ctrl *ctrl, bool new)
>>>                          goto out_free_tag_set;
>>>          }
>>>
>>> -       ret = nvme_rdma_start_io_queues(ctrl);
>>> -       if (ret)
>>> -               goto out_cleanup_connect_q;
>>
>> Again, these need to start so that...
>>
>>> -
>>>          if (!new) {
>>>                  nvme_start_queues(&ctrl->ctrl);
>>>                  if (!nvme_wait_freeze_timeout(&ctrl->ctrl,
>>
>> ... this here has a chance to work.
> Some request will be submited, and will failed, and then retry
> or failover. It is similar to nvme_cancel_tagset in 
> nvme_rdma_teardown_io_queues.
> I think it is acceptable.

Not really...

In order for the queue freeze to complete, all pending IOs need
to complete or error out, and that cannot be guaranteed without
restarting the queues as some may be waiting on tags and need to
be restarted in order to complete.

See 9f98772ba307 ("nvme-rdma: fix controller reset hang during traffic")
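
For reference, this is the ordering constraint the patch in this thread
honours (simplified):

	nvme_rdma_start_io_queues(ctrl, 1, nr_queues);	/* must come first: lets pending IO complete */
	nvme_start_queues(&ctrl->ctrl);
	nvme_wait_freeze_timeout(&ctrl->ctrl, NVME_IO_TIMEOUT);
	blk_mq_update_nr_hw_queues(ctrl->ctrl.tagset, ctrl->ctrl.queue_count - 1);
	nvme_unfreeze(&ctrl->ctrl);
	nvme_rdma_start_io_queues(ctrl, nr_queues, ctrl->tag_set.nr_hw_queues + 1);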



* Re: [PATCH v2 2/3] nvme-tcp: Handle number of queue changes
  2022-08-23  7:44 ` [PATCH v2 2/3] nvme-tcp: Handle number of queue changes Daniel Wagner
@ 2022-08-28 12:25   ` Sagi Grimberg
  2022-08-29  7:46     ` Daniel Wagner
  0 siblings, 1 reply; 19+ messages in thread
From: Sagi Grimberg @ 2022-08-28 12:25 UTC (permalink / raw)
  To: Daniel Wagner, linux-nvme



On 8/23/22 10:44, Daniel Wagner wrote:
> On reconnect, the number of queues might have changed.
> 
> In the case where we have more queues available than previously we try
> to access queues which are not initialized yet.
> 
> The other case where we have less queues than previously, the
> connection attempt will fail because the target doesn't support the
> old number of queues and we end up in a reconnect loop.
> 
> Thus, only start queues which are currently present in the tagset
> limited by the number of available queues. Then we update the tagset
> and we can start any new queue.
> 
> Signed-off-by: Daniel Wagner <dwagner@suse.de>
> ---
>   drivers/nvme/host/tcp.c | 59 ++++++++++++++++++++++++++---------------
>   1 file changed, 38 insertions(+), 21 deletions(-)
> 
> diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c
> index 044da18c06f5..93206215e381 100644
> --- a/drivers/nvme/host/tcp.c
> +++ b/drivers/nvme/host/tcp.c
> @@ -1762,11 +1762,12 @@ static void nvme_tcp_stop_io_queues(struct nvme_ctrl *ctrl)
>   		nvme_tcp_stop_queue(ctrl, i);
>   }
>   
> -static int nvme_tcp_start_io_queues(struct nvme_ctrl *ctrl)
> +static int nvme_tcp_start_io_queues(struct nvme_ctrl *ctrl,
> +				    int first, int last)
>   {
>   	int i, ret;
>   
> -	for (i = 1; i < ctrl->queue_count; i++) {
> +	for (i = first; i < last; i++) {
>   		ret = nvme_tcp_start_queue(ctrl, i);
>   		if (ret)
>   			goto out_stop_queues;
> @@ -1775,7 +1776,7 @@ static int nvme_tcp_start_io_queues(struct nvme_ctrl *ctrl)
>   	return 0;
>   
>   out_stop_queues:
> -	for (i--; i >= 1; i--)
> +	for (i--; i >= first; i--)
>   		nvme_tcp_stop_queue(ctrl, i);
>   	return ret;
>   }
> @@ -1899,31 +1900,38 @@ static void nvme_tcp_destroy_io_queues(struct nvme_ctrl *ctrl, bool remove)
>   	nvme_tcp_free_io_queues(ctrl);
>   }
>   
> -static int nvme_tcp_configure_io_queues(struct nvme_ctrl *ctrl, bool new)
> +static int nvme_tcp_configure_io_queues(struct nvme_ctrl *nctrl, bool new)

Let's not change the name here, it creates some churn...

>   {
> -	int ret;
> +	struct nvme_tcp_ctrl *ctrl = to_tcp_ctrl(nctrl);

Why do you need the tcp ctrl?

> +	int ret, nr_queues;
>   
> -	ret = nvme_tcp_alloc_io_queues(ctrl);
> +	ret = nvme_tcp_alloc_io_queues(nctrl);
>   	if (ret)
>   		return ret;
>   
>   	if (new) {
> -		ret = nvme_tcp_alloc_tag_set(ctrl);
> +		ret = nvme_tcp_alloc_tag_set(nctrl);
>   		if (ret)
>   			goto out_free_io_queues;
>   
> -		ret = nvme_ctrl_init_connect_q(ctrl);
> +		ret = nvme_ctrl_init_connect_q(nctrl);
>   		if (ret)
>   			goto out_free_tag_set;
>   	}
>   
> -	ret = nvme_tcp_start_io_queues(ctrl);
> +	/*
> +	 * Only start IO queues for which we have allocated the tagset
> > +	 * and limited it to the available queues. On reconnects, the
> +	 * queue number might have changed.
> +	 */
> +	nr_queues = min(ctrl->tag_set.nr_hw_queues + 1, nctrl->queue_count);

	nr_queues = min(ctrl->tagset->nr_hw_queues + 1, ctrl->queue_count);

> +	ret = nvme_tcp_start_io_queues(nctrl, 1, nr_queues);
>   	if (ret)
>   		goto out_cleanup_connect_q;
>   
>   	if (!new) {
> -		nvme_start_queues(ctrl);
> -		if (!nvme_wait_freeze_timeout(ctrl, NVME_IO_TIMEOUT)) {
> +		nvme_start_queues(nctrl);
> +		if (!nvme_wait_freeze_timeout(nctrl, NVME_IO_TIMEOUT)) {
>   			/*
>   			 * If we timed out waiting for freeze we are likely to
>   			 * be stuck.  Fail the controller initialization just
> @@ -1932,26 +1940,35 @@ static int nvme_tcp_configure_io_queues(struct nvme_ctrl *ctrl, bool new)
>   			ret = -ENODEV;
>   			goto out_wait_freeze_timed_out;
>   		}
> -		blk_mq_update_nr_hw_queues(ctrl->tagset,
> -			ctrl->queue_count - 1);
> -		nvme_unfreeze(ctrl);
> +		blk_mq_update_nr_hw_queues(nctrl->tagset,
> +			nctrl->queue_count - 1);
> +		nvme_unfreeze(nctrl);
>   	}
>   
> +	/*
> +	 * If the number of queues has increased (reconnect case)
> +	 * start all new queues now.
> +	 */
> +	ret = nvme_tcp_start_io_queues(nctrl, nr_queues,
> +				       ctrl->tag_set.nr_hw_queues + 1);
> +	if (ret)
> +		goto out_cleanup_connect_q;

You need to go to out_wait_freeze_timed_out, as pending IOs have
possibly been restarted by nvme_unfreeze...

> +
>   	return 0;
>   
>   out_wait_freeze_timed_out:
> -	nvme_stop_queues(ctrl);
> -	nvme_sync_io_queues(ctrl);
> -	nvme_tcp_stop_io_queues(ctrl);
> +	nvme_stop_queues(nctrl);
> +	nvme_sync_io_queues(nctrl);
> +	nvme_tcp_stop_io_queues(nctrl);
>   out_cleanup_connect_q:
> -	nvme_cancel_tagset(ctrl);
> +	nvme_cancel_tagset(nctrl);
>   	if (new)
> -		blk_mq_destroy_queue(ctrl->connect_q);
> +		blk_mq_destroy_queue(nctrl->connect_q);
>   out_free_tag_set:
>   	if (new)
> -		blk_mq_free_tag_set(ctrl->tagset);
> +		blk_mq_free_tag_set(nctrl->tagset);
>   out_free_io_queues:
> -	nvme_tcp_free_io_queues(ctrl);
> +	nvme_tcp_free_io_queues(nctrl);
>   	return ret;
>   }
>   



* Re: [PATCH v2 2/3] nvme-tcp: Handle number of queue changes
  2022-08-28 12:25   ` Sagi Grimberg
@ 2022-08-29  7:46     ` Daniel Wagner
  0 siblings, 0 replies; 19+ messages in thread
From: Daniel Wagner @ 2022-08-29  7:46 UTC (permalink / raw)
  To: Sagi Grimberg; +Cc: linux-nvme

On Sun, Aug 28, 2022 at 03:25:09PM +0300, Sagi Grimberg wrote:
> > -static int nvme_tcp_configure_io_queues(struct nvme_ctrl *ctrl, bool new)
> > +static int nvme_tcp_configure_io_queues(struct nvme_ctrl *nctrl, bool new)
> 
> Lets not change the name here, it creates some churn...

Sure thing

> 
> >   {
> > -	int ret;
> > +	struct nvme_tcp_ctrl *ctrl = to_tcp_ctrl(nctrl);
> 
> Why do you need the tcp ctrl?

Because I was blind and didn't see ctrl->tagset ...

> > +	if (ret)
> > +		goto out_cleanup_connect_q;
> 
> You need to go to out_wait_freeze_timed_out as IOs pending have
> possibly restarted with nvme_unfreeze...

Copy & paste error.

Thanks for reviewing!
Daniel



* Re: [PATCH v2 1/3] nvmet: Expose max queues to sysfs
  2022-08-25 16:20   ` Hannes Reinecke
@ 2022-08-29  7:57     ` Daniel Wagner
  0 siblings, 0 replies; 19+ messages in thread
From: Daniel Wagner @ 2022-08-29  7:57 UTC (permalink / raw)
  To: Hannes Reinecke; +Cc: linux-nvme, Sagi Grimberg

> > +static ssize_t nvmet_subsys_attr_qid_max_store(struct config_item *item,
> > +					       const char *page, size_t cnt)
> > +{
> > +	u16 qid_max;
> > +
> > +	if (sscanf(page, "%hu\n", &qid_max) != 1)
> > +		return -EINVAL;
> > +
> > +	if (qid_max == 0)
> > +		return -EINVAL;
> > +
> > +	down_write(&nvmet_config_sem);
> > +	to_subsys(item)->max_qid = qid_max;
> > +	up_write(&nvmet_config_sem);
> 
> I would have expected this to trigger a reconnect; as it stands it requires
> the user to explicitly do a reconnect via nvme-cli, and there's a fair
> chance the number of queues will be specified there.
> Hmm?

Good point. Currently we do not allow changing the attributes when the
port is enabled. I think we should do the same here, because we currently
don't have an interface for triggering a reconnect without going through
the fatal error handler.



* Re: [PATCH v2 3/3] nvme-rdma: Handle number of queue changes
  2022-08-28 12:20                 ` Sagi Grimberg
@ 2022-08-29  8:02                   ` Chao Leng
  0 siblings, 0 replies; 19+ messages in thread
From: Chao Leng @ 2022-08-29  8:02 UTC (permalink / raw)
  To: Sagi Grimberg, Daniel Wagner; +Cc: linux-nvme



On 2022/8/28 20:20, Sagi Grimberg wrote:
> 
>>> On Fri, Aug 26, 2022 at 03:31:15PM +0800, Chao Leng wrote:
>>>>> After seeing both version I tend to do say the first one keeps the
>>>>> 'wierd' stuff more closer together and doesn't make the callside of
>>>>> nvme_rdma_start_io_queues() do the counting. So my personal preference
>>>> I don't understand "do the counting".
>>>
>>> Sorry. I meant we fist start only queues for which have resources
>>> allocated (nr_queues in my patch). And then we only need to start
>>> potentially added queues.
>>>
>>>> Show the code:
>>>> ---
>>>>   drivers/nvme/host/rdma.c | 9 ++++-----
>>>>   1 file changed, 4 insertions(+), 5 deletions(-)
>>>>
>>>> diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c
>>>> index 7d01fb770284..8dfb79726e13 100644
>>>> --- a/drivers/nvme/host/rdma.c
>>>> +++ b/drivers/nvme/host/rdma.c
>>>> @@ -980,10 +980,6 @@ static int nvme_rdma_configure_io_queues(struct nvme_rdma_ctrl *ctrl, bool new)
>>>>                          goto out_free_tag_set;
>>>>          }
>>>>
>>>> -       ret = nvme_rdma_start_io_queues(ctrl);
>>>> -       if (ret)
>>>> -               goto out_cleanup_connect_q;
>>>
>>> Again, these need to start so that...
>>>
>>>> -
>>>>          if (!new) {
>>>>                  nvme_start_queues(&ctrl->ctrl);
>>>>                  if (!nvme_wait_freeze_timeout(&ctrl->ctrl,
>>>
>>> ... this here has a chance to work.
>> Some request will be submited, and will failed, and then retry
>> or failover. It is similar to nvme_cancel_tagset in nvme_rdma_teardown_io_queues.
>> I think it is acceptable.
> 
> Not really...
> 
> In order for the queue freeze to complete, all pending IOs need
> to complete or error out, and that cannot be guaranteed without
> restarting the queues as some may be waiting on tags and need to
> be restarted in order to complete.
> 
> See 9f98772ba307 ("nvme-rdma: fix controller reset hang during traffic")
Yes, we need to restart the queues before waiting for the queue freeze.
But it is not necessary to call nvme_rdma_start_io_queues before nvme_start_queues.
Certainly there is a downside: the requests which are waiting on tags
will all fail.
> .


