* [PATCH AUTOSEL 5.8 67/72] nvme-tcp: fix controller reset hang during traffic
       [not found] <20200808233542.3617339-1-sashal@kernel.org>
@ 2020-08-08 23:35 ` Sasha Levin
  2020-08-08 23:35 ` [PATCH AUTOSEL 5.8 68/72] nvme-rdma: " Sasha Levin
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 6+ messages in thread
From: Sasha Levin @ 2020-08-08 23:35 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Sasha Levin, Sagi Grimberg, linux-nvme, Christoph Hellwig

From: Sagi Grimberg <sagi@grimberg.me>

[ Upstream commit 2875b0aecabe2f081a8432e2bc85b85df0529490 ]

commit fe35ec58f0d3 ("block: update hctx map when use multiple maps")
exposed an issue where we may hang trying to wait for queue freeze
during I/O. We call blk_mq_update_nr_hw_queues which in case of multiple
queue maps (which we have now for default/read/poll) is attempting to
freeze the queue. However we never started queue freeze when starting the
reset, which means that we have inflight pending requests that entered the
queue that we will not complete once the queue is quiesced.

So start a freeze before we quiesce the queue, and unfreeze the queue
after we successfully connected the I/O queues (and make sure to call
blk_mq_update_nr_hw_queues only after we are sure that the queue was
already frozen).

This follows how the PCI driver handles resets.
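
The ordering problem described above can be sketched with a toy userspace
model in C. This is not kernel code: struct toy_queue and every toy_*
function are hypothetical names made up for this sketch, and "draining" is
reduced to zeroing a counter. It only tries to show why a freeze wait on a
quiesced queue with requests in flight cannot finish, while starting the
freeze before quiescing and restarting the queues before the wait can.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only; none of these names exist in the kernel. */
struct toy_queue {
	int freeze_depth;	/* > 0: new requests may not enter */
	int inflight;		/* requests that entered and have not completed */
	bool quiesced;		/* true: entered requests are not dispatched */
};

/* A request may only enter while no freeze has been started. */
bool toy_enter(struct toy_queue *q)
{
	if (q->freeze_depth > 0)
		return false;
	q->inflight++;
	return true;
}

/* Returns true if a freeze wait would finish: a quiesced queue cannot drain. */
bool toy_wait_freeze(struct toy_queue *q)
{
	if (!q->quiesced)
		q->inflight = 0;
	return q->inflight == 0;
}

/* Old ordering: quiesce only, then the hw-queue update freezes internally. */
bool toy_reset_old(struct toy_queue *q)
{
	q->quiesced = true;
	q->freeze_depth++;
	if (!toy_wait_freeze(q))
		return false;	/* the real code would wait here forever */
	q->freeze_depth--;
	q->quiesced = false;
	return true;
}

/* New ordering: freeze, quiesce, reconnect, restart, wait, update, unfreeze. */
bool toy_reset_new(struct toy_queue *q)
{
	q->freeze_depth++;	/* teardown: start the freeze first */
	q->quiesced = true;	/* then quiesce */
	q->quiesced = false;	/* I/O queues reconnected: restart them */
	if (!toy_wait_freeze(q))
		return false;
	/* the hw-queue count update would go here, on a frozen queue */
	assert(q->freeze_depth > 0);
	q->freeze_depth--;
	return true;
}
```

With three requests in flight, toy_reset_old() reports that the wait cannot
complete, while toy_reset_new() drains them and leaves the freeze depth at
zero.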

Fixes: fe35ec58f0d3 ("block: update hctx map when use multiple maps")
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/nvme/host/tcp.c | 12 +++++++++---
 1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c
index f3a91818167b1..83bb329d4113a 100644
--- a/drivers/nvme/host/tcp.c
+++ b/drivers/nvme/host/tcp.c
@@ -1744,15 +1744,20 @@ static int nvme_tcp_configure_io_queues(struct nvme_ctrl *ctrl, bool new)
 			ret = PTR_ERR(ctrl->connect_q);
 			goto out_free_tag_set;
 		}
-	} else {
-		blk_mq_update_nr_hw_queues(ctrl->tagset,
-			ctrl->queue_count - 1);
 	}
 
 	ret = nvme_tcp_start_io_queues(ctrl);
 	if (ret)
 		goto out_cleanup_connect_q;
 
+	if (!new) {
+		nvme_start_queues(ctrl);
+		nvme_wait_freeze(ctrl);
+		blk_mq_update_nr_hw_queues(ctrl->tagset,
+			ctrl->queue_count - 1);
+		nvme_unfreeze(ctrl);
+	}
+
 	return 0;
 
 out_cleanup_connect_q:
@@ -1857,6 +1862,7 @@ static void nvme_tcp_teardown_io_queues(struct nvme_ctrl *ctrl,
 {
 	if (ctrl->queue_count <= 1)
 		return;
+	nvme_start_freeze(ctrl);
 	nvme_stop_queues(ctrl);
 	nvme_tcp_stop_io_queues(ctrl);
 	if (ctrl->tagset) {
-- 
2.25.1


_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme


* [PATCH AUTOSEL 5.8 68/72] nvme-rdma: fix controller reset hang during traffic
       [not found] <20200808233542.3617339-1-sashal@kernel.org>
  2020-08-08 23:35 ` [PATCH AUTOSEL 5.8 67/72] nvme-tcp: fix controller reset hang during traffic Sasha Levin
@ 2020-08-08 23:35 ` Sasha Levin
  2020-08-08 23:35 ` [PATCH AUTOSEL 5.8 69/72] nvme-multipath: fix logic for non-optimized paths Sasha Levin
  2020-08-08 23:35 ` [PATCH AUTOSEL 5.8 70/72] nvme-multipath: do not fall back to __nvme_find_path() " Sasha Levin
  3 siblings, 0 replies; 6+ messages in thread
From: Sasha Levin @ 2020-08-08 23:35 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Sasha Levin, Sagi Grimberg, linux-nvme, Christoph Hellwig

From: Sagi Grimberg <sagi@grimberg.me>

[ Upstream commit 9f98772ba307dd89a3d17dc2589f213d3972fc64 ]

commit fe35ec58f0d3 ("block: update hctx map when use multiple maps")
exposed an issue where we may hang trying to wait for queue freeze
during I/O. We call blk_mq_update_nr_hw_queues which in case of multiple
queue maps (which we have now for default/read/poll) is attempting to
freeze the queue. However we never started queue freeze when starting the
reset, which means that we have inflight pending requests that entered the
queue that we will not complete once the queue is quiesced.

So start a freeze before we quiesce the queue, and unfreeze the queue
after we successfully connected the I/O queues (and make sure to call
blk_mq_update_nr_hw_queues only after we are sure that the queue was
already frozen).

This follows how the PCI driver handles resets.

Fixes: fe35ec58f0d3 ("block: update hctx map when use multiple maps")
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/nvme/host/rdma.c | 12 +++++++++---
 1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c
index 13506a87a4444..af0cfd25ed7a4 100644
--- a/drivers/nvme/host/rdma.c
+++ b/drivers/nvme/host/rdma.c
@@ -941,15 +941,20 @@ static int nvme_rdma_configure_io_queues(struct nvme_rdma_ctrl *ctrl, bool new)
 			ret = PTR_ERR(ctrl->ctrl.connect_q);
 			goto out_free_tag_set;
 		}
-	} else {
-		blk_mq_update_nr_hw_queues(&ctrl->tag_set,
-			ctrl->ctrl.queue_count - 1);
 	}
 
 	ret = nvme_rdma_start_io_queues(ctrl);
 	if (ret)
 		goto out_cleanup_connect_q;
 
+	if (!new) {
+		nvme_start_queues(&ctrl->ctrl);
+		nvme_wait_freeze(&ctrl->ctrl);
+		blk_mq_update_nr_hw_queues(ctrl->ctrl.tagset,
+			ctrl->ctrl.queue_count - 1);
+		nvme_unfreeze(&ctrl->ctrl);
+	}
+
 	return 0;
 
 out_cleanup_connect_q:
@@ -982,6 +987,7 @@ static void nvme_rdma_teardown_io_queues(struct nvme_rdma_ctrl *ctrl,
 		bool remove)
 {
 	if (ctrl->ctrl.queue_count > 1) {
+		nvme_start_freeze(&ctrl->ctrl);
 		nvme_stop_queues(&ctrl->ctrl);
 		nvme_rdma_stop_io_queues(ctrl);
 		if (ctrl->ctrl.tagset) {
-- 
2.25.1



* [PATCH AUTOSEL 5.8 69/72] nvme-multipath: fix logic for non-optimized paths
       [not found] <20200808233542.3617339-1-sashal@kernel.org>
  2020-08-08 23:35 ` [PATCH AUTOSEL 5.8 67/72] nvme-tcp: fix controller reset hang during traffic Sasha Levin
  2020-08-08 23:35 ` [PATCH AUTOSEL 5.8 68/72] nvme-rdma: " Sasha Levin
@ 2020-08-08 23:35 ` Sasha Levin
  2020-08-08 23:35 ` [PATCH AUTOSEL 5.8 70/72] nvme-multipath: do not fall back to __nvme_find_path() " Sasha Levin
  3 siblings, 0 replies; 6+ messages in thread
From: Sasha Levin @ 2020-08-08 23:35 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Sasha Levin, Sagi Grimberg, linux-nvme, Martin Wilck,
	Hannes Reinecke, Christoph Hellwig

From: Martin Wilck <mwilck@suse.com>

[ Upstream commit 3f6e3246db0e6f92e784965d9d0edb8abe6c6b74 ]

Handle the special case where we have exactly one optimized path,
which we should keep using in this case.
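
As a rough illustration of that special case, here is a toy userspace
sketch in C. It is not the kernel function: the array-based path list,
struct toy_path, enum toy_ana and toy_round_robin() are all hypothetical,
and the shape of the search loop is an assumption, since the diff below
only shows its last lines. The recheck_old flag stands in for the check
this patch adds.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only; these names are made up for this sketch. */
enum toy_ana { TOY_OPTIMIZED, TOY_NONOPTIMIZED };

struct toy_path {
	enum toy_ana state;
	bool disabled;
};

/*
 * Pick the next path after index 'old', wrapping around the array.
 * Returns the chosen index, or -1 if no usable path was found.
 */
int toy_round_robin(const struct toy_path *p, int n, int old, bool recheck_old)
{
	int fallback = -1;

	assert(n > 0 && old >= 0 && old < n);

	/* Visit every path except 'old', starting with its successor. */
	for (int i = (old + 1) % n; i != old; i = (i + 1) % n) {
		if (p[i].disabled)
			continue;
		if (p[i].state == TOY_OPTIMIZED)
			return i;
		if (p[i].state == TOY_NONOPTIMIZED)
			fallback = i;
	}

	/* No other optimized path found: keep 'old' if it is still optimized. */
	if (recheck_old && !p[old].disabled && p[old].state == TOY_OPTIMIZED)
		return old;

	return fallback;
}
```

With one optimized path at index 0 and two non-optimized ones, the sketch
returns a non-optimized index when recheck_old is false and index 0 when it
is true.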

Fixes: 75c10e732724 ("nvme-multipath: round-robin I/O policy")
Signed-off-by: Martin Wilck <mwilck@suse.com>
Signed-off-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/nvme/host/multipath.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/drivers/nvme/host/multipath.c b/drivers/nvme/host/multipath.c
index 66509472fe06a..fe8f7f123fac7 100644
--- a/drivers/nvme/host/multipath.c
+++ b/drivers/nvme/host/multipath.c
@@ -246,6 +246,12 @@ static struct nvme_ns *nvme_round_robin_path(struct nvme_ns_head *head,
 			fallback = ns;
 	}
 
+	/* No optimized path found, re-check the current path */
+	if (!nvme_path_is_disabled(old) &&
+	    old->ana_state == NVME_ANA_OPTIMIZED) {
+		found = old;
+		goto out;
+	}
 	if (!fallback)
 		return NULL;
 	found = fallback;
-- 
2.25.1



* [PATCH AUTOSEL 5.8 70/72] nvme-multipath: do not fall back to __nvme_find_path() for non-optimized paths
       [not found] <20200808233542.3617339-1-sashal@kernel.org>
                   ` (2 preceding siblings ...)
  2020-08-08 23:35 ` [PATCH AUTOSEL 5.8 69/72] nvme-multipath: fix logic for non-optimized paths Sasha Levin
@ 2020-08-08 23:35 ` Sasha Levin
  2020-08-10 15:37   ` Martin Wilck
  3 siblings, 1 reply; 6+ messages in thread
From: Sasha Levin @ 2020-08-08 23:35 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Sasha Levin, Sagi Grimberg, linux-nvme, Martin Wilck,
	Hannes Reinecke, Christoph Hellwig

From: Hannes Reinecke <hare@suse.de>

[ Upstream commit fbd6a42d8932e172921c7de10468a2e12c34846b ]

When nvme_round_robin_path() finds a valid namespace we should be using it;
falling back to __nvme_find_path() for non-optimized paths will cause the
result from nvme_round_robin_path() to be ignored for non-optimized paths.
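
The difference in control flow can be sketched with a toy userspace model
in C. This is not kernel code: struct toy_head, its rr_choice and
slow_choice fields (standing in for whatever nvme_round_robin_path() and
__nvme_find_path() would return) and the toy_find_* functions are
hypothetical names for illustration only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only; these names are made up for this sketch. */
enum toy_policy { TOY_NUMA, TOY_RR };

struct toy_path {
	int id;
	bool optimized;
};

struct toy_head {
	const struct toy_path *current;		/* cached current path, may be NULL */
	const struct toy_path *rr_choice;	/* stand-in for the round-robin result */
	const struct toy_path *slow_choice;	/* stand-in for the full lookup result */
	enum toy_policy policy;
};

/* Old flow: a non-optimized round-robin result is overridden by the lookup. */
const struct toy_path *toy_find_old(const struct toy_head *h)
{
	const struct toy_path *ns;

	assert(h != NULL);
	ns = h->current;
	if (h->policy == TOY_RR && ns)
		ns = h->rr_choice;
	if (!ns || !ns->optimized)
		ns = h->slow_choice;
	return ns;
}

/* New flow: under round-robin, the round-robin result is returned as is. */
const struct toy_path *toy_find_new(const struct toy_head *h)
{
	const struct toy_path *ns;

	assert(h != NULL);
	ns = h->current;
	if (!ns)
		return h->slow_choice;
	if (h->policy == TOY_RR)
		return h->rr_choice;
	if (!ns->optimized)
		return h->slow_choice;
	return ns;
}
```

With two non-optimized paths, the old flow keeps returning the lookup
result, while the new flow returns the path the round-robin step picked.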

Fixes: 75c10e732724 ("nvme-multipath: round-robin I/O policy")
Signed-off-by: Martin Wilck <mwilck@suse.com>
Signed-off-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/nvme/host/multipath.c | 11 +++++++----
 1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/drivers/nvme/host/multipath.c b/drivers/nvme/host/multipath.c
index fe8f7f123fac7..57d51148e71b6 100644
--- a/drivers/nvme/host/multipath.c
+++ b/drivers/nvme/host/multipath.c
@@ -272,10 +272,13 @@ inline struct nvme_ns *nvme_find_path(struct nvme_ns_head *head)
 	struct nvme_ns *ns;
 
 	ns = srcu_dereference(head->current_path[node], &head->srcu);
-	if (READ_ONCE(head->subsys->iopolicy) == NVME_IOPOLICY_RR && ns)
-		ns = nvme_round_robin_path(head, node, ns);
-	if (unlikely(!ns || !nvme_path_is_optimized(ns)))
-		ns = __nvme_find_path(head, node);
+	if (unlikely(!ns))
+		return __nvme_find_path(head, node);
+
+	if (READ_ONCE(head->subsys->iopolicy) == NVME_IOPOLICY_RR)
+		return nvme_round_robin_path(head, node, ns);
+	if (unlikely(!nvme_path_is_optimized(ns)))
+		return __nvme_find_path(head, node);
 	return ns;
 }
 
-- 
2.25.1



* Re: [PATCH AUTOSEL 5.8 70/72] nvme-multipath: do not fall back to __nvme_find_path() for non-optimized paths
  2020-08-08 23:35 ` [PATCH AUTOSEL 5.8 70/72] nvme-multipath: do not fall back to __nvme_find_path() " Sasha Levin
@ 2020-08-10 15:37   ` Martin Wilck
  2020-08-16 13:50     ` Sasha Levin
  0 siblings, 1 reply; 6+ messages in thread
From: Martin Wilck @ 2020-08-10 15:37 UTC (permalink / raw)
  To: Sasha Levin, linux-kernel, stable
  Cc: Christoph Hellwig, Hannes Reinecke, linux-nvme, Sagi Grimberg

On Sat, 2020-08-08 at 19:35 -0400, Sasha Levin wrote:
> From: Hannes Reinecke <hare@suse.de>
> 
> [ Upstream commit fbd6a42d8932e172921c7de10468a2e12c34846b ]
> 
> When nvme_round_robin_path() finds a valid namespace we should be
> using it;
> falling back to __nvme_find_path() for non-optimized paths will cause
> the
> result from nvme_round_robin_path() to be ignored for non-optimized
> paths.
> 
> Fixes: 75c10e732724 ("nvme-multipath: round-robin I/O policy")
> Signed-off-by: Martin Wilck <mwilck@suse.com>
> Signed-off-by: Hannes Reinecke <hare@suse.de>
> Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> Signed-off-by: Sasha Levin <sashal@kernel.org>
> ---
>  drivers/nvme/host/multipath.c | 11 +++++++----
>  1 file changed, 7 insertions(+), 4 deletions(-)

Hello Sasha,

this patch needs a fix that I recently submitted to linux-nvme.
The same holds for the respective 5.7 and 5.4 AUTOSEL patches.

http://lists.infradead.org/pipermail/linux-nvme/2020-August/018570.html

Regards,
Martin




* Re: [PATCH AUTOSEL 5.8 70/72] nvme-multipath: do not fall back to __nvme_find_path() for non-optimized paths
  2020-08-10 15:37   ` Martin Wilck
@ 2020-08-16 13:50     ` Sasha Levin
  0 siblings, 0 replies; 6+ messages in thread
From: Sasha Levin @ 2020-08-16 13:50 UTC (permalink / raw)
  To: Martin Wilck
  Cc: Sagi Grimberg, linux-kernel, linux-nvme, stable,
	Christoph Hellwig, Hannes Reinecke

On Mon, Aug 10, 2020 at 05:37:54PM +0200, Martin Wilck wrote:
>On Sat, 2020-08-08 at 19:35 -0400, Sasha Levin wrote:
>> From: Hannes Reinecke <hare@suse.de>
>>
>> [ Upstream commit fbd6a42d8932e172921c7de10468a2e12c34846b ]
>>
>> When nvme_round_robin_path() finds a valid namespace we should be
>> using it;
>> falling back to __nvme_find_path() for non-optimized paths will cause
>> the
>> result from nvme_round_robin_path() to be ignored for non-optimized
>> paths.
>>
>> Fixes: 75c10e732724 ("nvme-multipath: round-robin I/O policy")
>> Signed-off-by: Martin Wilck <mwilck@suse.com>
>> Signed-off-by: Hannes Reinecke <hare@suse.de>
>> Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
>> Signed-off-by: Christoph Hellwig <hch@lst.de>
>> Signed-off-by: Sasha Levin <sashal@kernel.org>
>> ---
>>  drivers/nvme/host/multipath.c | 11 +++++++----
>>  1 file changed, 7 insertions(+), 4 deletions(-)
>
>Hello Sasha,
>
>this patch needs a fix that I recently submitted to linux-nvme.
>The same holds for the respective 5.7 and 5.4 AUTOSEL patches.
>
>http://lists.infradead.org/pipermail/linux-nvme/2020-August/018570.html

I'll grab it too, thanks!

-- 
Thanks,
Sasha

