linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v6 0/3]  Handle update hardware queues and queue freeze more carefully
@ 2021-08-23 11:23 Daniel Wagner
  2021-08-23 11:23 ` [PATCH v6 1/3] nvme-fc: Update hardware queues before using them Daniel Wagner
                   ` (2 more replies)
  0 siblings, 3 replies; 7+ messages in thread
From: Daniel Wagner @ 2021-08-23 11:23 UTC (permalink / raw)
  To: linux-nvme
  Cc: linux-kernel, James Smart, Keith Busch, Ming Lei, Sagi Grimberg,
	Hannes Reinecke, Wen Xiong, Himanshu Madhani, Daniel Wagner

Hi,

After our last discussion in v5, the nvme_freeze_start() is gone
(James provided a new patch). I also updated the commit message of the
first patch which adds the imported bit why we need to update the
number queues first.

Anyway, I think we figured out the details in this path and I am quiet
confident that we nailed it now (yeah, famous last words).

Daniel

v1:
 - https://lore.kernel.org/linux-nvme/20210625101649.49296-1-dwagner@suse.de/
v2:
 - https://lore.kernel.org/linux-nvme/20210708092755.15660-1-dwagner@suse.de/
 - reviewed tags collected
 - added 'update hardware queues' for all transport
 - added fix for fc hanger in nvme_wait_freeze_timeout
v3:
 - https://lore.kernel.org/linux-nvme/20210720124353.127959-1-dwagner@suse.de/
 - dropped 'nvme-fc: Freeze queues before destroying them'
 - added James' two patches
v4:
 - https://lore.kernel.org/linux-nvme/20210802112658.75875-1-dwagner@suse.de/
 - added 'nvme-*: Unfreeze queues on reconnect'
 - added Hannes' reviewed tags
v5:
 - https://lore.kernel.org/linux-nvme/20210818120530.130501-1-dwagner@suse.de/
 - dropped non nvme-fc patches
 - updated 'nvme-fc: fix controller reset hang during traffic'
v6:
 - updated commit message 'nvme-fc: Update hardware queues before using them'
 - dropped 'nvme-fc: fix controller reset hang during traffic'
 - added 'nvme-fc: Remove freeze/unfreeze around update_nr_hw_queues'

Daniel Wagner (1):
  nvme-fc: Update hardware queues before using them

James Smart (2):
  nvme-fc: avoid race between time out and tear down
  nvme-fc: Remove freeze/unfreeze around update_nr_hw_queues

 drivers/nvme/host/fc.c | 18 +++++++++---------
 1 file changed, 9 insertions(+), 9 deletions(-)

-- 
2.29.2


^ permalink raw reply	[flat|nested] 7+ messages in thread

* [PATCH v6 1/3] nvme-fc: Update hardware queues before using them
  2021-08-23 11:23 [PATCH v6 0/3] Handle update hardware queues and queue freeze more carefully Daniel Wagner
@ 2021-08-23 11:23 ` Daniel Wagner
  2021-08-23 11:23 ` [PATCH v6 2/3] nvme-fc: avoid race between time out and tear down Daniel Wagner
  2021-08-23 11:23 ` [PATCH v6 3/3] nvme-fc: Remove freeze/unfreeze around update_nr_hw_queues Daniel Wagner
  2 siblings, 0 replies; 7+ messages in thread
From: Daniel Wagner @ 2021-08-23 11:23 UTC (permalink / raw)
  To: linux-nvme
  Cc: linux-kernel, James Smart, Keith Busch, Ming Lei, Sagi Grimberg,
	Hannes Reinecke, Wen Xiong, Himanshu Madhani, Daniel Wagner,
	James Smart

In case the number of hardware queues changes, we need to update the
tagset and the mapping of ctx to hctx first.

If we try to create and connect the I/O queues first, this operation
will fail (target will reject the connect call due to the wrong number
of queues) and hence we bail out of the recreate function. Then we
will to try the very same operation again, thus we don't make any
progress.

Reviewed-by: James Smart <jsmart2021@gmail.com>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Daniel Wagner <dwagner@suse.de>
---
 drivers/nvme/host/fc.c | 16 ++++++++--------
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/drivers/nvme/host/fc.c b/drivers/nvme/host/fc.c
index 7f462af1b02a..8a903769364f 100644
--- a/drivers/nvme/host/fc.c
+++ b/drivers/nvme/host/fc.c
@@ -2951,14 +2951,6 @@ nvme_fc_recreate_io_queues(struct nvme_fc_ctrl *ctrl)
 	if (ctrl->ctrl.queue_count == 1)
 		return 0;
 
-	ret = nvme_fc_create_hw_io_queues(ctrl, ctrl->ctrl.sqsize + 1);
-	if (ret)
-		goto out_free_io_queues;
-
-	ret = nvme_fc_connect_io_queues(ctrl, ctrl->ctrl.sqsize + 1);
-	if (ret)
-		goto out_delete_hw_queues;
-
 	if (prior_ioq_cnt != nr_io_queues) {
 		dev_info(ctrl->ctrl.device,
 			"reconnect: revising io queue count from %d to %d\n",
@@ -2968,6 +2960,14 @@ nvme_fc_recreate_io_queues(struct nvme_fc_ctrl *ctrl)
 		nvme_unfreeze(&ctrl->ctrl);
 	}
 
+	ret = nvme_fc_create_hw_io_queues(ctrl, ctrl->ctrl.sqsize + 1);
+	if (ret)
+		goto out_free_io_queues;
+
+	ret = nvme_fc_connect_io_queues(ctrl, ctrl->ctrl.sqsize + 1);
+	if (ret)
+		goto out_delete_hw_queues;
+
 	return 0;
 
 out_delete_hw_queues:
-- 
2.29.2


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [PATCH v6 2/3] nvme-fc: avoid race between time out and tear down
  2021-08-23 11:23 [PATCH v6 0/3] Handle update hardware queues and queue freeze more carefully Daniel Wagner
  2021-08-23 11:23 ` [PATCH v6 1/3] nvme-fc: Update hardware queues before using them Daniel Wagner
@ 2021-08-23 11:23 ` Daniel Wagner
  2021-08-23 11:23 ` [PATCH v6 3/3] nvme-fc: Remove freeze/unfreeze around update_nr_hw_queues Daniel Wagner
  2 siblings, 0 replies; 7+ messages in thread
From: Daniel Wagner @ 2021-08-23 11:23 UTC (permalink / raw)
  To: linux-nvme
  Cc: linux-kernel, James Smart, Keith Busch, Ming Lei, Sagi Grimberg,
	Hannes Reinecke, Wen Xiong, Himanshu Madhani, James Smart,
	Chao Leng, Daniel Wagner

From: James Smart <jsmart2021@gmail.com>

To avoid race between time out and tear down, in tear down process,
first we quiesce the queue, and then delete the timer and cancel
the time out work for the queue.

This patch merges the admin and io sync ops into the queue teardown logic
as shown in the RDMA patch 3017013dcc "nvme-rdma: avoid race between time
out and tear down". There is no teardown_lock in nvme-fc.

Signed-off-by: James Smart <jsmart2021@gmail.com>
CC: Chao Leng <lengchao@huawei.com>
Tested-by: Daniel Wagner <dwagner@suse.de>
Reviewed-by: Daniel Wagner <dwagner@suse.de>
Reviewed-by: Hannes Reinecke <hare@suse.de>
---
 drivers/nvme/host/fc.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/nvme/host/fc.c b/drivers/nvme/host/fc.c
index 8a903769364f..48aaa753be44 100644
--- a/drivers/nvme/host/fc.c
+++ b/drivers/nvme/host/fc.c
@@ -2487,6 +2487,7 @@ __nvme_fc_abort_outstanding_ios(struct nvme_fc_ctrl *ctrl, bool start_queues)
 	 */
 	if (ctrl->ctrl.queue_count > 1) {
 		nvme_stop_queues(&ctrl->ctrl);
+		nvme_sync_io_queues(&ctrl->ctrl);
 		blk_mq_tagset_busy_iter(&ctrl->tag_set,
 				nvme_fc_terminate_exchange, &ctrl->ctrl);
 		blk_mq_tagset_wait_completed_request(&ctrl->tag_set);
@@ -2510,6 +2511,7 @@ __nvme_fc_abort_outstanding_ios(struct nvme_fc_ctrl *ctrl, bool start_queues)
 	 * clean up the admin queue. Same thing as above.
 	 */
 	blk_mq_quiesce_queue(ctrl->ctrl.admin_q);
+	blk_sync_queue(ctrl->ctrl.admin_q);
 	blk_mq_tagset_busy_iter(&ctrl->admin_tag_set,
 				nvme_fc_terminate_exchange, &ctrl->ctrl);
 	blk_mq_tagset_wait_completed_request(&ctrl->admin_tag_set);
-- 
2.29.2


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [PATCH v6 3/3] nvme-fc: Remove freeze/unfreeze around update_nr_hw_queues
  2021-08-23 11:23 [PATCH v6 0/3] Handle update hardware queues and queue freeze more carefully Daniel Wagner
  2021-08-23 11:23 ` [PATCH v6 1/3] nvme-fc: Update hardware queues before using them Daniel Wagner
  2021-08-23 11:23 ` [PATCH v6 2/3] nvme-fc: avoid race between time out and tear down Daniel Wagner
@ 2021-08-23 11:23 ` Daniel Wagner
  2021-08-24 20:38   ` Sagi Grimberg
  2 siblings, 1 reply; 7+ messages in thread
From: Daniel Wagner @ 2021-08-23 11:23 UTC (permalink / raw)
  To: linux-nvme
  Cc: linux-kernel, James Smart, Keith Busch, Ming Lei, Sagi Grimberg,
	Hannes Reinecke, Wen Xiong, Himanshu Madhani, James Smart,
	Daniel Wagner

From: James Smart <jsmart2021@gmail.com>

Remove the freeze/unfreeze around changes to the number of hardware
queues. Study and retest has indicated there are no ios that can be
active at this point so there is nothing to freeze.

This patch primarily reverts 883837ed0f1f "nvme-fc: wait for queues to
freeze before calling update_hr_hw_queues". It's not an exact revert as
it leaves the adjusting of hw queues only if the count changes.

Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Daniel Wagner <dwagner@suse.de>
---
 drivers/nvme/host/fc.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/drivers/nvme/host/fc.c b/drivers/nvme/host/fc.c
index 48aaa753be44..b71d0c2d4d31 100644
--- a/drivers/nvme/host/fc.c
+++ b/drivers/nvme/host/fc.c
@@ -2957,9 +2957,7 @@ nvme_fc_recreate_io_queues(struct nvme_fc_ctrl *ctrl)
 		dev_info(ctrl->ctrl.device,
 			"reconnect: revising io queue count from %d to %d\n",
 			prior_ioq_cnt, nr_io_queues);
-		nvme_wait_freeze(&ctrl->ctrl);
 		blk_mq_update_nr_hw_queues(&ctrl->tag_set, nr_io_queues);
-		nvme_unfreeze(&ctrl->ctrl);
 	}
 
 	ret = nvme_fc_create_hw_io_queues(ctrl, ctrl->ctrl.sqsize + 1);
-- 
2.29.2


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH v6 3/3] nvme-fc: Remove freeze/unfreeze around update_nr_hw_queues
  2021-08-23 11:23 ` [PATCH v6 3/3] nvme-fc: Remove freeze/unfreeze around update_nr_hw_queues Daniel Wagner
@ 2021-08-24 20:38   ` Sagi Grimberg
  2021-08-25 13:04     ` Daniel Wagner
  0 siblings, 1 reply; 7+ messages in thread
From: Sagi Grimberg @ 2021-08-24 20:38 UTC (permalink / raw)
  To: Daniel Wagner, linux-nvme
  Cc: linux-kernel, James Smart, Keith Busch, Ming Lei,
	Hannes Reinecke, Wen Xiong, Himanshu Madhani, James Smart


> From: James Smart <jsmart2021@gmail.com>
> 
> Remove the freeze/unfreeze around changes to the number of hardware
> queues. Study and retest has indicated there are no ios that can be
> active at this point so there is nothing to freeze.
> 
> This patch primarily reverts 883837ed0f1f

Bogus commit ID.

  "nvme-fc: wait for queues to
> freeze before calling update_hr_hw_queues". It's not an exact revert as
> it leaves the adjusting of hw queues only if the count changes.

I see that fc doesn't freeze the queues, so it obviously wrong to
unfreeze them. But is it correct to not freeze the queues?

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v6 3/3] nvme-fc: Remove freeze/unfreeze around update_nr_hw_queues
  2021-08-24 20:38   ` Sagi Grimberg
@ 2021-08-25 13:04     ` Daniel Wagner
  2021-08-25 16:03       ` James Smart
  0 siblings, 1 reply; 7+ messages in thread
From: Daniel Wagner @ 2021-08-25 13:04 UTC (permalink / raw)
  To: Sagi Grimberg
  Cc: linux-nvme, linux-kernel, James Smart, Keith Busch, Ming Lei,
	Hannes Reinecke, Wen Xiong, Himanshu Madhani, James Smart

On Tue, Aug 24, 2021 at 01:38:20PM -0700, Sagi Grimberg wrote:
> > freeze before calling update_hr_hw_queues". It's not an exact revert as
> > it leaves the adjusting of hw queues only if the count changes.
> 
> I see that fc doesn't freeze the queues, so it obviously wrong to
> unfreeze them. But is it correct to not freeze the queues?

nvme-fc is draining the queues in the error recovery path
(__nvme_fc_abort_outstanding_ios). There are no request in the queues
hence we don't have to freeze.

The only reason to keep the freeze/unfreeze in this path would be to
make it look alike the other transport. But it would be a no-op.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v6 3/3] nvme-fc: Remove freeze/unfreeze around update_nr_hw_queues
  2021-08-25 13:04     ` Daniel Wagner
@ 2021-08-25 16:03       ` James Smart
  0 siblings, 0 replies; 7+ messages in thread
From: James Smart @ 2021-08-25 16:03 UTC (permalink / raw)
  To: Daniel Wagner, Sagi Grimberg
  Cc: linux-nvme, linux-kernel, Keith Busch, Ming Lei, Hannes Reinecke,
	Wen Xiong, Himanshu Madhani, James Smart

[-- Attachment #1: Type: text/plain, Size: 1563 bytes --]



On 8/25/2021 6:04 AM, Daniel Wagner wrote:
> On Tue, Aug 24, 2021 at 01:38:20PM -0700, Sagi Grimberg wrote:
>>> freeze before calling update_hr_hw_queues". It's not an exact revert as
>>> it leaves the adjusting of hw queues only if the count changes.
>> I see that fc doesn't freeze the queues, so it obviously wrong to
>> unfreeze them. But is it correct to not freeze the queues?
> nvme-fc is draining the queues in the error recovery path
> (__nvme_fc_abort_outstanding_ios). There are no request in the queues
> hence we don't have to freeze.
>
> The only reason to keep the freeze/unfreeze in this path would be to
> make it look alike the other transport. But it would be a no-op.
Yep.

And updated commit is :  88e837ed0f1f

-- james

-- 
This electronic communication and the information and any files transmitted 
with it, or attached to it, are confidential and are intended solely for 
the use of the individual or entity to whom it is addressed and may contain 
information that is confidential, legally privileged, protected by privacy 
laws, or otherwise restricted from disclosure to anyone else. If you are 
not the intended recipient or the person responsible for delivering the 
e-mail to the intended recipient, you are hereby notified that any use, 
copying, distributing, dissemination, forwarding, printing, or copying of 
this e-mail is strictly prohibited. If you received this e-mail in error, 
please return the e-mail to the sender, delete it from your computer, and 
destroy any printed copy of it.

[-- Attachment #2: S/MIME Cryptographic Signature --]
[-- Type: application/pkcs7-signature, Size: 4206 bytes --]

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2021-08-25 16:04 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-08-23 11:23 [PATCH v6 0/3] Handle update hardware queues and queue freeze more carefully Daniel Wagner
2021-08-23 11:23 ` [PATCH v6 1/3] nvme-fc: Update hardware queues before using them Daniel Wagner
2021-08-23 11:23 ` [PATCH v6 2/3] nvme-fc: avoid race between time out and tear down Daniel Wagner
2021-08-23 11:23 ` [PATCH v6 3/3] nvme-fc: Remove freeze/unfreeze around update_nr_hw_queues Daniel Wagner
2021-08-24 20:38   ` Sagi Grimberg
2021-08-25 13:04     ` Daniel Wagner
2021-08-25 16:03       ` James Smart

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).