All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH RESEND 0/3] nvme/mpath: fix missed namespaces in ana state update
@ 2021-09-12 18:54 Anton Eidelman
  2021-09-12 18:54 ` [PATCH 1/3] nvme/multipath: fix failure to update ns ana state Anton Eidelman
                   ` (2 more replies)
  0 siblings, 3 replies; 35+ messages in thread
From: Anton Eidelman @ 2021-09-12 18:54 UTC (permalink / raw)
  To: linux-nvme, hch, kbusch, sagi, axboe; +Cc: Anton Eidelman

Fixed two issues in nvme_update_ana_state() that caused ana_work
to miss existing namespaces and consequently a failure to update
the namespace ANA state based on the ANA log page.

1) A plain bug: we skipped an nsid in desc->nsids in a certain
   combination of nsids present and nsids reports in the ANA log,
   and failed to match this nsid to an existing namespace.
2) Unhandled situation when scan_work appended new namespaces to
   ctrl->namespaces and did not sort the list yet.
   In such transient state ana_work would fail to match nsids
   to those new namespaces.

Both issues potentially caused some namespaces to get stuck
in an incorrect ANA state, e.g. to never become live.

Anton Eidelman (3):
  nvme/multipath: fix failure to update ns ana state
  nvme/multipath: cosmetic: keep ns nsid locally
  nvme/multipath: fix stale ana state for namespaces just added by scan
    work

 drivers/nvme/host/multipath.c | 17 ++++++++++++++---
 1 file changed, 14 insertions(+), 3 deletions(-)

-- 
2.25.1


_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 35+ messages in thread

* [PATCH 1/3] nvme/multipath: fix failure to update ns ana state
  2021-09-12 18:54 [PATCH RESEND 0/3] nvme/mpath: fix missed namespaces in ana state update Anton Eidelman
@ 2021-09-12 18:54 ` Anton Eidelman
  2021-09-12 18:54 ` [PATCH 2/3] nvme/multipath: cosmetic: keep ns nsid locally Anton Eidelman
  2021-09-12 18:54 ` [PATCH 3/3] nvme/multipath: fix stale ana state for namespaces just added by scan work Anton Eidelman
  2 siblings, 0 replies; 35+ messages in thread
From: Anton Eidelman @ 2021-09-12 18:54 UTC (permalink / raw)
  To: linux-nvme, hch, kbusch, sagi, axboe; +Cc: Anton Eidelman

nvme_update_ana_state() has a deficiency that results
in failure to update the ana state for a namespace
in the following case:
nsid's in ctrl->namespaces: 1, 3, 4
nsid's in desc->nsids: 1, 2, 3, 4

Loop iteration 0:
    ns index = 0, n = 0, ns->head->ns_id = 1, nsid = 1, MATCH.
Loop iteration 1:
    ns index = 1, n = 1, ns->head->ns_id = 3, nsid = 2, NO MATCH.
Loop iteration 2:
    ns index = 2, n = 2, ns->head->ns_id = 4, nsid = 4, MATCH.

Result: missed nsid=3 and did not update its ana state.

Solution: when ns->head->ns_id is higher than nsid,
increment n and RETRY with the SAME ns.

Signed-off-by: Anton Eidelman <anton@lightbitslabs.com>
---
 drivers/nvme/host/multipath.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/drivers/nvme/host/multipath.c b/drivers/nvme/host/multipath.c
index 5d7bc58a27bd..e8ccdd398f78 100644
--- a/drivers/nvme/host/multipath.c
+++ b/drivers/nvme/host/multipath.c
@@ -600,14 +600,17 @@ static int nvme_update_ana_state(struct nvme_ctrl *ctrl,
 
 	down_read(&ctrl->namespaces_rwsem);
 	list_for_each_entry(ns, &ctrl->namespaces, list) {
-		unsigned nsid = le32_to_cpu(desc->nsids[n]);
-
+		unsigned nsid;
+again:
+		nsid = le32_to_cpu(desc->nsids[n]);
 		if (ns->head->ns_id < nsid)
 			continue;
 		if (ns->head->ns_id == nsid)
 			nvme_update_ns_ana_state(desc, ns);
 		if (++n == nr_nsids)
 			break;
+		if (ns->head->ns_id > nsid)
+			goto again;
 	}
 	up_read(&ctrl->namespaces_rwsem);
 	return 0;
-- 
2.25.1


_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH 2/3] nvme/multipath: cosmetic: keep ns nsid locally
  2021-09-12 18:54 [PATCH RESEND 0/3] nvme/mpath: fix missed namespaces in ana state update Anton Eidelman
  2021-09-12 18:54 ` [PATCH 1/3] nvme/multipath: fix failure to update ns ana state Anton Eidelman
@ 2021-09-12 18:54 ` Anton Eidelman
  2021-09-12 18:54 ` [PATCH 3/3] nvme/multipath: fix stale ana state for namespaces just added by scan work Anton Eidelman
  2 siblings, 0 replies; 35+ messages in thread
From: Anton Eidelman @ 2021-09-12 18:54 UTC (permalink / raw)
  To: linux-nvme, hch, kbusch, sagi, axboe; +Cc: Anton Eidelman

Keep the nsid of the current namespace in a local variable.
Change the type to unsigned int to make checkpatch happy.

Signed-off-by: Anton Eidelman <anton@lightbitslabs.com>
---
 drivers/nvme/host/multipath.c | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/drivers/nvme/host/multipath.c b/drivers/nvme/host/multipath.c
index e8ccdd398f78..a51561d67b93 100644
--- a/drivers/nvme/host/multipath.c
+++ b/drivers/nvme/host/multipath.c
@@ -600,16 +600,18 @@ static int nvme_update_ana_state(struct nvme_ctrl *ctrl,
 
 	down_read(&ctrl->namespaces_rwsem);
 	list_for_each_entry(ns, &ctrl->namespaces, list) {
-		unsigned nsid;
+		unsigned int nsid;
+		unsigned int ns_nsid = ns->head->ns_id;
+
 again:
 		nsid = le32_to_cpu(desc->nsids[n]);
-		if (ns->head->ns_id < nsid)
+		if (ns_nsid < nsid)
 			continue;
-		if (ns->head->ns_id == nsid)
+		if (ns_nsid == nsid)
 			nvme_update_ns_ana_state(desc, ns);
 		if (++n == nr_nsids)
 			break;
-		if (ns->head->ns_id > nsid)
+		if (ns_nsid > nsid)
 			goto again;
 	}
 	up_read(&ctrl->namespaces_rwsem);
-- 
2.25.1


_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH 3/3] nvme/multipath: fix stale ana state for namespaces just added by scan work
  2021-09-12 18:54 [PATCH RESEND 0/3] nvme/mpath: fix missed namespaces in ana state update Anton Eidelman
  2021-09-12 18:54 ` [PATCH 1/3] nvme/multipath: fix failure to update ns ana state Anton Eidelman
  2021-09-12 18:54 ` [PATCH 2/3] nvme/multipath: cosmetic: keep ns nsid locally Anton Eidelman
@ 2021-09-12 18:54 ` Anton Eidelman
  2021-09-13 15:30   ` [PATCH v2 0/3] nvme/mpath: fix missed namespaces in ana state update Anton Eidelman
  2 siblings, 1 reply; 35+ messages in thread
From: Anton Eidelman @ 2021-09-12 18:54 UTC (permalink / raw)
  To: linux-nvme, hch, kbusch, sagi, axboe; +Cc: Anton Eidelman

Scan work initially adds new namespaces to ctrl->namespaces TAIL.
They make the list unordered temporarily until nvme_scan_work()
finally sorts the list.

In case nvme_update_ana_state() runs while the list is unsorted,
the recently added namespaces are missed and their ana state
may remain not updated forever if timing between scan work and ana work
is unfortunate, e.g.
Initial state: namespaces = {2, 3}
scan_work: adds nsid=1: namespaces = {2, 3, 1}
scan_work: finds nsid=1 is still Inaccessible
ana_work: log page has nsids = {1, 2, 3, 4}, all Optimized.
ana_work: updates nsids {2, 3} but fails to find nsid=1 in namespaces.
scan_work: adds nsid=4: namespaces = {2, 3, 1, 4}
scan_work: finds nsid=4 is Optimized: sets it live.
scan_work: completes an sorts namespaces = {1, 2, 3, 4}
Result: nsid=1 will remain in Inaccessible state.

Solution:
In order to preserve the way ctrl->namespaces is updated and sorted,
make nvme_update_ana_state() deal with the case where ctrl->namespaces
is not fully sorted and has new namespaces appended with potentially
lower nsids.
nvme_update_ana_state() keeps track of the nsid seen in the list,
detects the unsorted case (rare), and restarts scanning of desc->nsids.

Signed-off-by: Anton Eidelman <anton@lightbitslabs.com>
---
 drivers/nvme/host/multipath.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/drivers/nvme/host/multipath.c b/drivers/nvme/host/multipath.c
index a51561d67b93..1ad8dc8adb86 100644
--- a/drivers/nvme/host/multipath.c
+++ b/drivers/nvme/host/multipath.c
@@ -587,6 +587,7 @@ static int nvme_update_ana_state(struct nvme_ctrl *ctrl,
 	u32 nr_nsids = le32_to_cpu(desc->nnsids), n = 0;
 	unsigned *nr_change_groups = data;
 	struct nvme_ns *ns;
+	unsigned int last_ns_nsid = 0;
 
 	dev_dbg(ctrl->device, "ANA group %d: %s.\n",
 			le32_to_cpu(desc->grpid),
@@ -603,6 +604,11 @@ static int nvme_update_ana_state(struct nvme_ctrl *ctrl,
 		unsigned int nsid;
 		unsigned int ns_nsid = ns->head->ns_id;
 
+		if (ns_nsid < last_ns_nsid) {
+			/* Detected unsorted ctrl->namespaces: re-scan desc->nsids */
+			last_ns_nsid = ns_nsid;
+			n = 0;
+		}
 again:
 		nsid = le32_to_cpu(desc->nsids[n]);
 		if (ns_nsid < nsid)
-- 
2.25.1


_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH v2 0/3] nvme/mpath: fix missed namespaces in ana state update
  2021-09-12 18:54 ` [PATCH 3/3] nvme/multipath: fix stale ana state for namespaces just added by scan work Anton Eidelman
@ 2021-09-13 15:30   ` Anton Eidelman
  2021-09-13 15:30     ` [PATCH v2 1/3] nvme/multipath: fix failure to update ns ana state Anton Eidelman
                       ` (4 more replies)
  0 siblings, 5 replies; 35+ messages in thread
From: Anton Eidelman @ 2021-09-13 15:30 UTC (permalink / raw)
  To: linux-nvme, hch, kbusch, sagi, axboe; +Cc: Anton Eidelman

Fixed two issues in nvme_update_ana_state() that caused ana_work
to miss existing namespaces and consequently a failure to update
the namespace ANA state based on the ANA log page.

1) A plain bug: we skipped an nsid in desc->nsids in a certain
   combination of nsids present and nsids reports in the ANA log,
   and failed to match this nsid to an existing namespace.
2) Unhandled situation when scan_work appended new namespaces to
   ctrl->namespaces and did not sort the list yet.
   In such transient state ana_work would fail to match nsids
   to those new namespaces.

Both issues potentially caused some namespaces to get stuck
in an incorrect ANA state, e.g. to never become live.

Anton Eidelman (3):
  nvme/multipath: fix failure to update ns ana state
  nvme/multipath: cosmetic: keep ns nsid locally
  nvme/multipath: fix stale ana state for namespaces just added by scan
    work

 drivers/nvme/host/multipath.c | 17 ++++++++++++++---
 1 file changed, 14 insertions(+), 3 deletions(-)

-- 
2.25.1


_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 35+ messages in thread

* [PATCH v2 1/3] nvme/multipath: fix failure to update ns ana state
  2021-09-13 15:30   ` [PATCH v2 0/3] nvme/mpath: fix missed namespaces in ana state update Anton Eidelman
@ 2021-09-13 15:30     ` Anton Eidelman
  2021-09-13 16:01       ` Christoph Hellwig
                         ` (2 more replies)
  2021-09-13 15:30     ` [PATCH v2 2/3] nvme/multipath: cosmetic: keep ns nsid locally Anton Eidelman
                       ` (3 subsequent siblings)
  4 siblings, 3 replies; 35+ messages in thread
From: Anton Eidelman @ 2021-09-13 15:30 UTC (permalink / raw)
  To: linux-nvme, hch, kbusch, sagi, axboe; +Cc: Anton Eidelman

nvme_update_ana_state() has a deficiency that results
in failure to update the ana state for a namespace
in the following case:
nsid's in ctrl->namespaces: 1, 3, 4
nsid's in desc->nsids: 1, 2, 3, 4

Loop iteration 0:
    ns index = 0, n = 0, ns->head->ns_id = 1, nsid = 1, MATCH.
Loop iteration 1:
    ns index = 1, n = 1, ns->head->ns_id = 3, nsid = 2, NO MATCH.
Loop iteration 2:
    ns index = 2, n = 2, ns->head->ns_id = 4, nsid = 4, MATCH.

Result: missed nsid=3 and did not update its ana state.

Solution: when ns->head->ns_id is higher than nsid,
increment n and RETRY with the SAME ns.

Signed-off-by: Anton Eidelman <anton@lightbitslabs.com>
---
 drivers/nvme/host/multipath.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/drivers/nvme/host/multipath.c b/drivers/nvme/host/multipath.c
index 5d7bc58a27bd..e8ccdd398f78 100644
--- a/drivers/nvme/host/multipath.c
+++ b/drivers/nvme/host/multipath.c
@@ -600,14 +600,17 @@ static int nvme_update_ana_state(struct nvme_ctrl *ctrl,
 
 	down_read(&ctrl->namespaces_rwsem);
 	list_for_each_entry(ns, &ctrl->namespaces, list) {
-		unsigned nsid = le32_to_cpu(desc->nsids[n]);
-
+		unsigned nsid;
+again:
+		nsid = le32_to_cpu(desc->nsids[n]);
 		if (ns->head->ns_id < nsid)
 			continue;
 		if (ns->head->ns_id == nsid)
 			nvme_update_ns_ana_state(desc, ns);
 		if (++n == nr_nsids)
 			break;
+		if (ns->head->ns_id > nsid)
+			goto again;
 	}
 	up_read(&ctrl->namespaces_rwsem);
 	return 0;
-- 
2.25.1


_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH v2 2/3] nvme/multipath: cosmetic: keep ns nsid locally
  2021-09-13 15:30   ` [PATCH v2 0/3] nvme/mpath: fix missed namespaces in ana state update Anton Eidelman
  2021-09-13 15:30     ` [PATCH v2 1/3] nvme/multipath: fix failure to update ns ana state Anton Eidelman
@ 2021-09-13 15:30     ` Anton Eidelman
  2021-09-13 16:02       ` Christoph Hellwig
  2021-09-13 15:30     ` [PATCH v2 3/3] nvme/multipath: fix stale ana state for namespaces just added by scan work Anton Eidelman
                       ` (2 subsequent siblings)
  4 siblings, 1 reply; 35+ messages in thread
From: Anton Eidelman @ 2021-09-13 15:30 UTC (permalink / raw)
  To: linux-nvme, hch, kbusch, sagi, axboe; +Cc: Anton Eidelman

Keep the nsid of the current namespace in a local variable.
Change the type to unsigned int to make checkpatch happy.

Signed-off-by: Anton Eidelman <anton@lightbitslabs.com>
---
 drivers/nvme/host/multipath.c | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/drivers/nvme/host/multipath.c b/drivers/nvme/host/multipath.c
index e8ccdd398f78..a51561d67b93 100644
--- a/drivers/nvme/host/multipath.c
+++ b/drivers/nvme/host/multipath.c
@@ -600,16 +600,18 @@ static int nvme_update_ana_state(struct nvme_ctrl *ctrl,
 
 	down_read(&ctrl->namespaces_rwsem);
 	list_for_each_entry(ns, &ctrl->namespaces, list) {
-		unsigned nsid;
+		unsigned int nsid;
+		unsigned int ns_nsid = ns->head->ns_id;
+
 again:
 		nsid = le32_to_cpu(desc->nsids[n]);
-		if (ns->head->ns_id < nsid)
+		if (ns_nsid < nsid)
 			continue;
-		if (ns->head->ns_id == nsid)
+		if (ns_nsid == nsid)
 			nvme_update_ns_ana_state(desc, ns);
 		if (++n == nr_nsids)
 			break;
-		if (ns->head->ns_id > nsid)
+		if (ns_nsid > nsid)
 			goto again;
 	}
 	up_read(&ctrl->namespaces_rwsem);
-- 
2.25.1


_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH v2 3/3] nvme/multipath: fix stale ana state for namespaces just added by scan work
  2021-09-13 15:30   ` [PATCH v2 0/3] nvme/mpath: fix missed namespaces in ana state update Anton Eidelman
  2021-09-13 15:30     ` [PATCH v2 1/3] nvme/multipath: fix failure to update ns ana state Anton Eidelman
  2021-09-13 15:30     ` [PATCH v2 2/3] nvme/multipath: cosmetic: keep ns nsid locally Anton Eidelman
@ 2021-09-13 15:30     ` Anton Eidelman
  2021-09-13 16:03       ` Christoph Hellwig
  2021-09-13 16:07       ` Keith Busch
  2021-09-13 15:32     ` [PATCH v2 0/3] nvme/mpath: fix missed namespaces in ana state update Christoph Hellwig
  2021-09-13 15:42     ` [PATCH v2 0/3] nvme/mpath: fix missed namespaces in ana state update FIXED Anton Eidelman
  4 siblings, 2 replies; 35+ messages in thread
From: Anton Eidelman @ 2021-09-13 15:30 UTC (permalink / raw)
  To: linux-nvme, hch, kbusch, sagi, axboe; +Cc: Anton Eidelman

Scan work initially adds new namespaces to ctrl->namespaces TAIL.
They make the list unordered temporarily until nvme_scan_work()
finally sorts the list.

In case nvme_update_ana_state() runs while the list is unsorted,
the recently added namespaces are missed and their ana state
may remain not updated forever if timing between scan work and ana work
is unfortunate, e.g.
Initial state: namespaces = {2, 3}
scan_work: adds nsid=1: namespaces = {2, 3, 1}
scan_work: finds nsid=1 is still Inaccessible
ana_work: log page has nsids = {1, 2, 3, 4}, all Optimized.
ana_work: updates nsids {2, 3} but fails to find nsid=1 in namespaces.
scan_work: adds nsid=4: namespaces = {2, 3, 1, 4}
scan_work: finds nsid=4 is Optimized: sets it live.
scan_work: completes an sorts namespaces = {1, 2, 3, 4}
Result: nsid=1 will remain in Inaccessible state.

Solution:
In order to preserve the way ctrl->namespaces is updated and sorted,
make nvme_update_ana_state() deal with the case where ctrl->namespaces
is not fully sorted and has new namespaces appended with potentially
lower nsids.
nvme_update_ana_state() keeps track of the nsid seen in the list,
detects the unsorted case (rare), and restarts scanning of desc->nsids.

Signed-off-by: Anton Eidelman <anton@lightbitslabs.com>
---
 drivers/nvme/host/multipath.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/drivers/nvme/host/multipath.c b/drivers/nvme/host/multipath.c
index a51561d67b93..0a50439a8c41 100644
--- a/drivers/nvme/host/multipath.c
+++ b/drivers/nvme/host/multipath.c
@@ -587,6 +587,7 @@ static int nvme_update_ana_state(struct nvme_ctrl *ctrl,
 	u32 nr_nsids = le32_to_cpu(desc->nnsids), n = 0;
 	unsigned *nr_change_groups = data;
 	struct nvme_ns *ns;
+	unsigned int last_ns_nsid = 0;
 
 	dev_dbg(ctrl->device, "ANA group %d: %s.\n",
 			le32_to_cpu(desc->grpid),
@@ -603,6 +604,11 @@ static int nvme_update_ana_state(struct nvme_ctrl *ctrl,
 		unsigned int nsid;
 		unsigned int ns_nsid = ns->head->ns_id;
 
+		if (ns_nsid < last_ns_nsid) {
+			/* Unsorted ctrl->namespaces: re-scan desc->nsids */
+			n = 0;
+		}
+		last_ns_nsid = ns_nsid;
 again:
 		nsid = le32_to_cpu(desc->nsids[n]);
 		if (ns_nsid < nsid)
-- 
2.25.1


_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* Re: [PATCH v2 0/3] nvme/mpath: fix missed namespaces in ana state update
  2021-09-13 15:30   ` [PATCH v2 0/3] nvme/mpath: fix missed namespaces in ana state update Anton Eidelman
                       ` (2 preceding siblings ...)
  2021-09-13 15:30     ` [PATCH v2 3/3] nvme/multipath: fix stale ana state for namespaces just added by scan work Anton Eidelman
@ 2021-09-13 15:32     ` Christoph Hellwig
  2021-09-13 15:42     ` [PATCH v2 0/3] nvme/mpath: fix missed namespaces in ana state update FIXED Anton Eidelman
  4 siblings, 0 replies; 35+ messages in thread
From: Christoph Hellwig @ 2021-09-13 15:32 UTC (permalink / raw)
  To: Anton Eidelman; +Cc: linux-nvme, hch, kbusch, sagi, axboe, Anton Eidelman

On Mon, Sep 13, 2021 at 09:30:20AM -0600, Anton Eidelman wrote:
> Fixed two issues in nvme_update_ana_state() that caused ana_work
> to miss existing namespaces and consequently a failure to update
> the namespace ANA state based on the ANA log page.
> 
> 1) A plain bug: we skipped an nsid in desc->nsids in a certain
>    combination of nsids present and nsids reports in the ANA log,
>    and failed to match this nsid to an existing namespace.
> 2) Unhandled situation when scan_work appended new namespaces to
>    ctrl->namespaces and did not sort the list yet.
>    In such transient state ana_work would fail to match nsids
>    to those new namespaces.
> 
> Both issues potentially caused some namespaces to get stuck
> in an incorrect ANA state, e.g. to never become live.

What are the changes to v1?  That should always go into the cover letter.

_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 35+ messages in thread

* [PATCH v2 0/3] nvme/mpath: fix missed namespaces in ana state update FIXED
  2021-09-13 15:30   ` [PATCH v2 0/3] nvme/mpath: fix missed namespaces in ana state update Anton Eidelman
                       ` (3 preceding siblings ...)
  2021-09-13 15:32     ` [PATCH v2 0/3] nvme/mpath: fix missed namespaces in ana state update Christoph Hellwig
@ 2021-09-13 15:42     ` Anton Eidelman
  2021-09-13 21:46       ` [PATCH v3 0/3] nvme/mpath: fix missed namespaces in ana state update Anton Eidelman
  4 siblings, 1 reply; 35+ messages in thread
From: Anton Eidelman @ 2021-09-13 15:42 UTC (permalink / raw)
  To: linux-nvme, hch, kbusch, sagi, axboe; +Cc: Anton Eidelman

Fixed two issues in nvme_update_ana_state() that caused ana_work
to miss existing namespaces and consequently a failure to update
the namespace ANA state based on the ANA log page.

1) A plain bug: we skipped an nsid in desc->nsids in a certain
   combination of nsids present and nsids reports in the ANA log,
   and failed to match this nsid to an existing namespace.
2) Unhandled situation when scan_work appended new namespaces to
   ctrl->namespaces and did not sort the list yet.
   In such transient state ana_work would fail to match nsids
   to those new namespaces.

Both issues potentially caused some namespaces to get stuck
in an incorrect ANA state, e.g. to never become live.

CHANGES FROM V1:
- 3/3: last_ns_nsid was not updated in every iteration, which is wrong

Anton Eidelman (3):
  nvme/multipath: fix failure to update ns ana state
  nvme/multipath: cosmetic: keep ns nsid locally
  nvme/multipath: fix stale ana state for namespaces just added by scan
    work

 drivers/nvme/host/multipath.c | 17 ++++++++++++++---
 1 file changed, 14 insertions(+), 3 deletions(-)

-- 
2.25.1


_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v2 1/3] nvme/multipath: fix failure to update ns ana state
  2021-09-13 15:30     ` [PATCH v2 1/3] nvme/multipath: fix failure to update ns ana state Anton Eidelman
@ 2021-09-13 16:01       ` Christoph Hellwig
  2021-09-13 19:29       ` Sagi Grimberg
  2021-09-14  8:27       ` Christoph Hellwig
  2 siblings, 0 replies; 35+ messages in thread
From: Christoph Hellwig @ 2021-09-13 16:01 UTC (permalink / raw)
  To: Anton Eidelman; +Cc: linux-nvme, hch, kbusch, sagi, axboe, Anton Eidelman

I kinda hate the goto, but couldn't come up with anything better,
so this looks good.

_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v2 2/3] nvme/multipath: cosmetic: keep ns nsid locally
  2021-09-13 15:30     ` [PATCH v2 2/3] nvme/multipath: cosmetic: keep ns nsid locally Anton Eidelman
@ 2021-09-13 16:02       ` Christoph Hellwig
  2021-09-13 20:19         ` Anton Eidelman
  0 siblings, 1 reply; 35+ messages in thread
From: Christoph Hellwig @ 2021-09-13 16:02 UTC (permalink / raw)
  To: Anton Eidelman; +Cc: linux-nvme, hch, kbusch, sagi, axboe, Anton Eidelman

On Mon, Sep 13, 2021 at 09:30:22AM -0600, Anton Eidelman wrote:
> Keep the nsid of the current namespace in a local variable.
> Change the type to unsigned int to make checkpatch happy.
> 
> Signed-off-by: Anton Eidelman <anton@lightbitslabs.com>

I can't really see any improvement here.  Also checkpath as so often
these days is fundamentally wrong here.

_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v2 3/3] nvme/multipath: fix stale ana state for namespaces just added by scan work
  2021-09-13 15:30     ` [PATCH v2 3/3] nvme/multipath: fix stale ana state for namespaces just added by scan work Anton Eidelman
@ 2021-09-13 16:03       ` Christoph Hellwig
  2021-09-13 16:08         ` Keith Busch
  2021-09-13 16:07       ` Keith Busch
  1 sibling, 1 reply; 35+ messages in thread
From: Christoph Hellwig @ 2021-09-13 16:03 UTC (permalink / raw)
  To: Anton Eidelman
  Cc: linux-nvme, hch, kbusch, sagi, axboe, Anton Eidelman, Chaitanya Kulkarni

On Mon, Sep 13, 2021 at 09:30:23AM -0600, Anton Eidelman wrote:
> Scan work initially adds new namespaces to ctrl->namespaces TAIL.
> They make the list unordered temporarily until nvme_scan_work()
> finally sorts the list.
> 
> Solution:
> In order to preserve the way ctrl->namespaces is updated and sorted,
> make nvme_update_ana_state() deal with the case where ctrl->namespaces
> is not fully sorted and has new namespaces appended with potentially
> lower nsids.
> nvme_update_ana_state() keeps track of the nsid seen in the list,
> detects the unsorted case (rare), and restarts scanning of desc->nsids.

Can we dust off the patch from Chaitanya to switch to an always sorted
xarray instead of papering over this delayed sort?

_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v2 3/3] nvme/multipath: fix stale ana state for namespaces just added by scan work
  2021-09-13 15:30     ` [PATCH v2 3/3] nvme/multipath: fix stale ana state for namespaces just added by scan work Anton Eidelman
  2021-09-13 16:03       ` Christoph Hellwig
@ 2021-09-13 16:07       ` Keith Busch
  2021-09-13 16:11         ` Christoph Hellwig
  1 sibling, 1 reply; 35+ messages in thread
From: Keith Busch @ 2021-09-13 16:07 UTC (permalink / raw)
  To: Anton Eidelman; +Cc: linux-nvme, hch, sagi, axboe, Anton Eidelman

On Mon, Sep 13, 2021 at 09:30:23AM -0600, Anton Eidelman wrote:
> Scan work initially adds new namespaces to ctrl->namespaces TAIL.
> They make the list unordered temporarily until nvme_scan_work()
> finally sorts the list.
> 
> In case nvme_update_ana_state() runs while the list is unsorted,
> the recently added namespaces are missed and their ana state
> may remain not updated forever if timing between scan work and ana work
> is unfortunate, e.g.
> Initial state: namespaces = {2, 3}
> scan_work: adds nsid=1: namespaces = {2, 3, 1}
> scan_work: finds nsid=1 is still Inaccessible
> ana_work: log page has nsids = {1, 2, 3, 4}, all Optimized.
> ana_work: updates nsids {2, 3} but fails to find nsid=1 in namespaces.
> scan_work: adds nsid=4: namespaces = {2, 3, 1, 4}
> scan_work: finds nsid=4 is Optimized: sets it live.
> scan_work: completes an sorts namespaces = {1, 2, 3, 4}
> Result: nsid=1 will remain in Inaccessible state.
> 
> Solution:
> In order to preserve the way ctrl->namespaces is updated and sorted,
> make nvme_update_ana_state() deal with the case where ctrl->namespaces
> is not fully sorted and has new namespaces appended with potentially
> lower nsids.
> nvme_update_ana_state() keeps track of the nsid seen in the list,
> detects the unsorted case (rare), and restarts scanning of desc->nsids.
> 
> Signed-off-by: Anton Eidelman <anton@lightbitslabs.com>

I guess this works for the structures we have in place, but I think we
should replace the sorted list and with xarray. Then you can get
something simpler to follow like:

---
diff --git a/drivers/nvme/host/multipath.c b/drivers/nvme/host/multipath.c
index 5d7bc58a27bd..2b4bd624bded 100644
--- a/drivers/nvme/host/multipath.c
+++ b/drivers/nvme/host/multipath.c
@@ -595,21 +595,15 @@ static int nvme_update_ana_state(struct nvme_ctrl *ctrl,
 	if (desc->state == NVME_ANA_CHANGE)
 		(*nr_change_groups)++;
 
-	if (!nr_nsids)
-		return 0;
-
-	down_read(&ctrl->namespaces_rwsem);
-	list_for_each_entry(ns, &ctrl->namespaces, list) {
-		unsigned nsid = le32_to_cpu(desc->nsids[n]);
+	while (n < nr_nsids) {
+		unsigned nsid = le32_to_cpu(desc->nsids[n++]);
 
-		if (ns->head->ns_id < nsid)
-			continue;
-		if (ns->head->ns_id == nsid)
+		ns = nvme_find_get_ns(ctrl, nsid);
+		if (ns) {
 			nvme_update_ns_ana_state(desc, ns);
-		if (++n == nr_nsids)
-			break;
+			nvme_put_ns(ns);
+		}
 	}
-	up_read(&ctrl->namespaces_rwsem);
 	return 0;
 }
 
--

_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* Re: [PATCH v2 3/3] nvme/multipath: fix stale ana state for namespaces just added by scan work
  2021-09-13 16:03       ` Christoph Hellwig
@ 2021-09-13 16:08         ` Keith Busch
  2021-09-14  6:24           ` Sagi Grimberg
  0 siblings, 1 reply; 35+ messages in thread
From: Keith Busch @ 2021-09-13 16:08 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Anton Eidelman, linux-nvme, sagi, axboe, Anton Eidelman,
	Chaitanya Kulkarni

On Mon, Sep 13, 2021 at 06:03:10PM +0200, Christoph Hellwig wrote:
> On Mon, Sep 13, 2021 at 09:30:23AM -0600, Anton Eidelman wrote:
> > Scan work initially adds new namespaces to ctrl->namespaces TAIL.
> > They make the list unordered temporarily until nvme_scan_work()
> > finally sorts the list.
> > 
> > Solution:
> > In order to preserve the way ctrl->namespaces is updated and sorted,
> > make nvme_update_ana_state() deal with the case where ctrl->namespaces
> > is not fully sorted and has new namespaces appended with potentially
> > lower nsids.
> > nvme_update_ana_state() keeps track of the nsid seen in the list,
> > detects the unsorted case (rare), and restarts scanning of desc->nsids.
> 
> Can we dust off the patch from Chaitanya to switch to an always sorted
> xarray instead of papering over this delayed sort?

Yeah, that's what I was going to say. :)

_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v2 3/3] nvme/multipath: fix stale ana state for namespaces just added by scan work
  2021-09-13 16:07       ` Keith Busch
@ 2021-09-13 16:11         ` Christoph Hellwig
  2021-09-13 16:24           ` Keith Busch
  2021-09-13 16:46           ` Anton Eidelman
  0 siblings, 2 replies; 35+ messages in thread
From: Christoph Hellwig @ 2021-09-13 16:11 UTC (permalink / raw)
  To: Keith Busch; +Cc: Anton Eidelman, linux-nvme, hch, sagi, axboe, Anton Eidelman

On Mon, Sep 13, 2021 at 09:07:56AM -0700, Keith Busch wrote:
> I guess this works for the structures we have in place, but I think we
> should replace the sorted list and with xarray. Then you can get
> something simpler to follow like:

I guess that's fine for the short term.  It mean we'll have to walk
the list for every namespace in the decriptor, but I've not actually
seen deployments with crazy numbers of namespaces yet.

_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v2 3/3] nvme/multipath: fix stale ana state for namespaces just added by scan work
  2021-09-13 16:11         ` Christoph Hellwig
@ 2021-09-13 16:24           ` Keith Busch
  2021-09-13 16:46           ` Anton Eidelman
  1 sibling, 0 replies; 35+ messages in thread
From: Keith Busch @ 2021-09-13 16:24 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Anton Eidelman, linux-nvme, sagi, axboe, Anton Eidelman

On Mon, Sep 13, 2021 at 06:11:04PM +0200, Christoph Hellwig wrote:
> On Mon, Sep 13, 2021 at 09:07:56AM -0700, Keith Busch wrote:
> > I guess this works for the structures we have in place, but I think we
> > should replace the sorted list and with xarray. Then you can get
> > something simpler to follow like:
> 
> I guess that's fine for the short term.  It mean we'll have to walk
> the list for every namespace in the decriptor, but I've not actually
> seen deployments with crazy numbers of namespaces yet.

Oh, my suggestion depends on using the xarray implementation since the
current nvme_find_get_ns() also depends on a sorted list. Lookup
time with an xarray doesn't change with the number of namespaces, so
even a controller with lots of them should be fine if someone wants to make
such a device.

_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v2 3/3] nvme/multipath: fix stale ana state for namespaces just added by scan work
  2021-09-13 16:11         ` Christoph Hellwig
  2021-09-13 16:24           ` Keith Busch
@ 2021-09-13 16:46           ` Anton Eidelman
  1 sibling, 0 replies; 35+ messages in thread
From: Anton Eidelman @ 2021-09-13 16:46 UTC (permalink / raw)
  To: Christoph Hellwig, linux-nvme, kbusch, sagi, axboe

On Mon, Sep 13, 2021 at 06:11:04PM +0200, Christoph Hellwig wrote:
> On Mon, Sep 13, 2021 at 09:07:56AM -0700, Keith Busch wrote:
> > I guess this works for the structures we have in place, but I think we
> > should replace the sorted list and with xarray. Then you can get
> > something simpler to follow like:
> 
> I guess that's fine for the short term.  It mean we'll have to walk
> the list for every namespace in the decriptor, but I've not actually
> seen deployments with crazy numbers of namespaces yet.

This is definitely a short term solution that does not require
us to change the data structures or address sorting at the end of scan_work.

I believe the impact of re-scanning desc->nsids will be minimal because
- It only happens when ana_work catches the namespaces list in a
  transient state with a few new namespaces appended
- The number of desc->nsids (one ANA group) should be much smaller
  than the total number of namespaces.

Thanks,
Anton

_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v2 1/3] nvme/multipath: fix failure to update ns ana state
  2021-09-13 15:30     ` [PATCH v2 1/3] nvme/multipath: fix failure to update ns ana state Anton Eidelman
  2021-09-13 16:01       ` Christoph Hellwig
@ 2021-09-13 19:29       ` Sagi Grimberg
  2021-09-14  8:27       ` Christoph Hellwig
  2 siblings, 0 replies; 35+ messages in thread
From: Sagi Grimberg @ 2021-09-13 19:29 UTC (permalink / raw)
  To: Anton Eidelman, linux-nvme, hch, kbusch, axboe; +Cc: Anton Eidelman

Reviewed-by: Sagi Grimberg <sagi@grimberg.me>

_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v2 2/3] nvme/multipath: cosmetic: keep ns nsid locally
  2021-09-13 16:02       ` Christoph Hellwig
@ 2021-09-13 20:19         ` Anton Eidelman
  2021-09-13 20:57           ` Keith Busch
  0 siblings, 1 reply; 35+ messages in thread
From: Anton Eidelman @ 2021-09-13 20:19 UTC (permalink / raw)
  To: Christoph Hellwig, linux-nvme, kbusch, sagi, axboe

On Mon, Sep 13, 2021 at 06:02:20PM +0200, Christoph Hellwig wrote:
> On Mon, Sep 13, 2021 at 09:30:22AM -0600, Anton Eidelman wrote:
> > Keep the nsid of the current namespace in a local variable.
> > Change the type to unsigned int to make checkpatch happy.
> > 
> > Signed-off-by: Anton Eidelman <anton@lightbitslabs.com>
> 
> I can't really see any improvement here.  Also checkpath as so often
> these days is fundamentally wrong here.

Do we you prefer to scrap this commit?

I just wanted to get rid of multiple occurances of "ns->head->ns_id",
and just use a local variable that only changes
once in the for_each iteration.

_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v2 2/3] nvme/multipath: cosmetic: keep ns nsid locally
  2021-09-13 20:19         ` Anton Eidelman
@ 2021-09-13 20:57           ` Keith Busch
  2021-09-14  5:44             ` Christoph Hellwig
  0 siblings, 1 reply; 35+ messages in thread
From: Keith Busch @ 2021-09-13 20:57 UTC (permalink / raw)
  To: Anton Eidelman; +Cc: Christoph Hellwig, linux-nvme, sagi, axboe

On Mon, Sep 13, 2021 at 02:19:35PM -0600, Anton Eidelman wrote:
> On Mon, Sep 13, 2021 at 06:02:20PM +0200, Christoph Hellwig wrote:
> > On Mon, Sep 13, 2021 at 09:30:22AM -0600, Anton Eidelman wrote:
> > > Keep the nsid of the current namespace in a local variable.
> > > Change the type to unsigned int to make checkpatch happy.
> > > 
> > > Signed-off-by: Anton Eidelman <anton@lightbitslabs.com>
> > 
> > I can't really see any improvement here.  Also checkpath as so often
> > these days is fundamentally wrong here.
> 
> Do we you prefer to scrap this commit?
> 
> I just wanted to get rid of multiple occurances of "ns->head->ns_id",
> and just use a local variable that only changes
> once in the for_each iteration.

I personally think "unsigned" is a well defined type in C and we should
ignore check-patch complaints on its use. Otherwise, I don't have a
problem with the patch.

_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 35+ messages in thread

* [PATCH v3 0/3] nvme/mpath: fix missed namespaces in ana state update
  2021-09-13 15:42     ` [PATCH v2 0/3] nvme/mpath: fix missed namespaces in ana state update FIXED Anton Eidelman
@ 2021-09-13 21:46       ` Anton Eidelman
  2021-09-13 21:46         ` [PATCH v3 1/3] nvme/multipath: fix failure to update ns ana state Anton Eidelman
                           ` (2 more replies)
  0 siblings, 3 replies; 35+ messages in thread
From: Anton Eidelman @ 2021-09-13 21:46 UTC (permalink / raw)
  To: linux-nvme, hch, kbusch, sagi, axboe; +Cc: Anton Eidelman

Fixed two issues in nvme_update_ana_state() that caused ana_work
to miss existing namespaces and consequently a failure to update
the namespace ANA state based on the ANA log page.

1) A plain bug: we skipped an nsid in desc->nsids in a certain
   combination of nsids present and nsids reports in the ANA log,
   and failed to match this nsid to an existing namespace.
2) Unhandled situation when scan_work appended new namespaces to
   ctrl->namespaces and did not sort the list yet.
   In such transient state ana_work would fail to match nsids
   to those new namespaces.

Both issues potentially caused some namespaces to get stuck
in an incorrect ANA state, e.g. to never become live.

CHANGES in V2:
- 3/3: last_ns_nsid was not updated in every iteration, which is wrong
CHANGES in V3:
- 2/3: use unsigned instead of unsigned int (ignore checkpatch on this)
- 3/3: use unsigned instead of unsigned int for last_ns_nsid

Anton Eidelman (3):
  nvme/multipath: fix failure to update ns ana state
  nvme/multipath: cosmetic: keep ns nsid locally
  nvme/multipath: fix stale ana state for namespaces just added by scan
    work

 drivers/nvme/host/multipath.c | 17 ++++++++++++++---
 1 file changed, 14 insertions(+), 3 deletions(-)

-- 
2.25.1


_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 35+ messages in thread

* [PATCH v3 1/3] nvme/multipath: fix failure to update ns ana state
  2021-09-13 21:46       ` [PATCH v3 0/3] nvme/mpath: fix missed namespaces in ana state update Anton Eidelman
@ 2021-09-13 21:46         ` Anton Eidelman
  2021-09-13 21:46         ` [PATCH v3 2/3] nvme/multipath: cosmetic: keep ns nsid locally Anton Eidelman
  2021-09-13 21:46         ` [PATCH v3 3/3] nvme/multipath: fix stale ana state for namespaces just added by scan work Anton Eidelman
  2 siblings, 0 replies; 35+ messages in thread
From: Anton Eidelman @ 2021-09-13 21:46 UTC (permalink / raw)
  To: linux-nvme, hch, kbusch, sagi, axboe; +Cc: Anton Eidelman

nvme_update_ana_state() has a deficiency that results
in failure to update the ana state for a namespace
in the following case:
nsid's in ctrl->namespaces: 1, 3, 4
nsid's in desc->nsids: 1, 2, 3, 4

Loop iteration 0:
    ns index = 0, n = 0, ns->head->ns_id = 1, nsid = 1, MATCH.
Loop iteration 1:
    ns index = 1, n = 1, ns->head->ns_id = 3, nsid = 2, NO MATCH.
Loop iteration 2:
    ns index = 2, n = 2, ns->head->ns_id = 4, nsid = 4, MATCH.

Result: missed nsid=3 and did not update its ana state.

Solution: when ns->head->ns_id is higher than nsid,
increment n and RETRY with the SAME ns.

Signed-off-by: Anton Eidelman <anton@lightbitslabs.com>
---
 drivers/nvme/host/multipath.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/drivers/nvme/host/multipath.c b/drivers/nvme/host/multipath.c
index 5d7bc58a27bd..e8ccdd398f78 100644
--- a/drivers/nvme/host/multipath.c
+++ b/drivers/nvme/host/multipath.c
@@ -600,14 +600,17 @@ static int nvme_update_ana_state(struct nvme_ctrl *ctrl,
 
 	down_read(&ctrl->namespaces_rwsem);
 	list_for_each_entry(ns, &ctrl->namespaces, list) {
-		unsigned nsid = le32_to_cpu(desc->nsids[n]);
-
+		unsigned nsid;
+again:
+		nsid = le32_to_cpu(desc->nsids[n]);
 		if (ns->head->ns_id < nsid)
 			continue;
 		if (ns->head->ns_id == nsid)
 			nvme_update_ns_ana_state(desc, ns);
 		if (++n == nr_nsids)
 			break;
+		if (ns->head->ns_id > nsid)
+			goto again;
 	}
 	up_read(&ctrl->namespaces_rwsem);
 	return 0;
-- 
2.25.1


_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH v3 2/3] nvme/multipath: cosmetic: keep ns nsid locally
  2021-09-13 21:46       ` [PATCH v3 0/3] nvme/mpath: fix missed namespaces in ana state update Anton Eidelman
  2021-09-13 21:46         ` [PATCH v3 1/3] nvme/multipath: fix failure to update ns ana state Anton Eidelman
@ 2021-09-13 21:46         ` Anton Eidelman
  2021-09-13 21:46         ` [PATCH v3 3/3] nvme/multipath: fix stale ana state for namespaces just added by scan work Anton Eidelman
  2 siblings, 0 replies; 35+ messages in thread
From: Anton Eidelman @ 2021-09-13 21:46 UTC (permalink / raw)
  To: linux-nvme, hch, kbusch, sagi, axboe; +Cc: Anton Eidelman

Keep the nsid of the current namespace in a local variable,
in order to avoid multiple occurrances of "ns->head->ns_id".

Signed-off-by: Anton Eidelman <anton@lightbitslabs.com>
---
 drivers/nvme/host/multipath.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/drivers/nvme/host/multipath.c b/drivers/nvme/host/multipath.c
index e8ccdd398f78..e1ce80c12815 100644
--- a/drivers/nvme/host/multipath.c
+++ b/drivers/nvme/host/multipath.c
@@ -601,15 +601,17 @@ static int nvme_update_ana_state(struct nvme_ctrl *ctrl,
 	down_read(&ctrl->namespaces_rwsem);
 	list_for_each_entry(ns, &ctrl->namespaces, list) {
 		unsigned nsid;
+		unsigned ns_nsid = ns->head->ns_id;
+
 again:
 		nsid = le32_to_cpu(desc->nsids[n]);
-		if (ns->head->ns_id < nsid)
+		if (ns_nsid < nsid)
 			continue;
-		if (ns->head->ns_id == nsid)
+		if (ns_nsid == nsid)
 			nvme_update_ns_ana_state(desc, ns);
 		if (++n == nr_nsids)
 			break;
-		if (ns->head->ns_id > nsid)
+		if (ns_nsid > nsid)
 			goto again;
 	}
 	up_read(&ctrl->namespaces_rwsem);
-- 
2.25.1


_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH v3 3/3] nvme/multipath: fix stale ana state for namespaces just added by scan work
  2021-09-13 21:46       ` [PATCH v3 0/3] nvme/mpath: fix missed namespaces in ana state update Anton Eidelman
  2021-09-13 21:46         ` [PATCH v3 1/3] nvme/multipath: fix failure to update ns ana state Anton Eidelman
  2021-09-13 21:46         ` [PATCH v3 2/3] nvme/multipath: cosmetic: keep ns nsid locally Anton Eidelman
@ 2021-09-13 21:46         ` Anton Eidelman
  2 siblings, 0 replies; 35+ messages in thread
From: Anton Eidelman @ 2021-09-13 21:46 UTC (permalink / raw)
  To: linux-nvme, hch, kbusch, sagi, axboe; +Cc: Anton Eidelman

Scan work initially adds new namespaces to ctrl->namespaces TAIL.
They make the list unordered temporarily until nvme_scan_work()
finally sorts the list.

In case nvme_update_ana_state() runs while the list is unsorted,
the recently added namespaces are missed and their ana state
may remain not updated forever if timing between scan work and ana work
is unfortunate, e.g.
Initial state: namespaces = {2, 3}
scan_work: adds nsid=1: namespaces = {2, 3, 1}
scan_work: finds nsid=1 is still Inaccessible
ana_work: log page has nsids = {1, 2, 3, 4}, all Optimized.
ana_work: updates nsids {2, 3} but fails to find nsid=1 in namespaces.
scan_work: adds nsid=4: namespaces = {2, 3, 1, 4}
scan_work: finds nsid=4 is Optimized: sets it live.
scan_work: completes an sorts namespaces = {1, 2, 3, 4}
Result: nsid=1 will remain in Inaccessible state.

Solution:
In order to preserve the way ctrl->namespaces is updated and sorted,
make nvme_update_ana_state() deal with the case where ctrl->namespaces
is not fully sorted and has new namespaces appended with potentially
lower nsids.
nvme_update_ana_state() keeps track of the nsid seen in the list,
detects the unsorted case (rare), and restarts scanning of desc->nsids.

Signed-off-by: Anton Eidelman <anton@lightbitslabs.com>
---
 drivers/nvme/host/multipath.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/drivers/nvme/host/multipath.c b/drivers/nvme/host/multipath.c
index e1ce80c12815..14504a71711c 100644
--- a/drivers/nvme/host/multipath.c
+++ b/drivers/nvme/host/multipath.c
@@ -587,6 +587,7 @@ static int nvme_update_ana_state(struct nvme_ctrl *ctrl,
 	u32 nr_nsids = le32_to_cpu(desc->nnsids), n = 0;
 	unsigned *nr_change_groups = data;
 	struct nvme_ns *ns;
+	unsigned last_ns_nsid = 0;
 
 	dev_dbg(ctrl->device, "ANA group %d: %s.\n",
 			le32_to_cpu(desc->grpid),
@@ -603,6 +604,11 @@ static int nvme_update_ana_state(struct nvme_ctrl *ctrl,
 		unsigned nsid;
 		unsigned ns_nsid = ns->head->ns_id;
 
+		if (ns_nsid < last_ns_nsid) {
+			/* Unsorted ctrl->namespaces: re-scan desc->nsids */
+			n = 0;
+		}
+		last_ns_nsid = ns_nsid;
 again:
 		nsid = le32_to_cpu(desc->nsids[n]);
 		if (ns_nsid < nsid)
-- 
2.25.1


_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* Re: [PATCH v2 2/3] nvme/multipath: cosmetic: keep ns nsid locally
  2021-09-13 20:57           ` Keith Busch
@ 2021-09-14  5:44             ` Christoph Hellwig
  0 siblings, 0 replies; 35+ messages in thread
From: Christoph Hellwig @ 2021-09-14  5:44 UTC (permalink / raw)
  To: Keith Busch; +Cc: Anton Eidelman, Christoph Hellwig, linux-nvme, sagi, axboe

On Mon, Sep 13, 2021 at 01:57:17PM -0700, Keith Busch wrote:
> > > I can't really see any improvement here.  Also checkpath as so often
> > > these days is fundamentally wrong here.
> > 
> > Do we you prefer to scrap this commit?
> > 
> > I just wanted to get rid of multiple occurances of "ns->head->ns_id",
> > and just use a local variable that only changes
> > once in the for_each iteration.
> 
> I personally think "unsigned" is a well defined type in C and we should
> ignore check-patch complaints on its use.

Agreed.  If we want to cleanup nsids (and at this point I don't), we
should move them all over to u32 instead..

> Otherwise, I don't have a
> problem with the patch.

Well, it is obviously correct, but I also don't see the point of it.

_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v2 3/3] nvme/multipath: fix stale ana state for namespaces just added by scan work
  2021-09-13 16:08         ` Keith Busch
@ 2021-09-14  6:24           ` Sagi Grimberg
  2021-09-14  6:30             ` Christoph Hellwig
  0 siblings, 1 reply; 35+ messages in thread
From: Sagi Grimberg @ 2021-09-14  6:24 UTC (permalink / raw)
  To: Keith Busch, Christoph Hellwig
  Cc: Anton Eidelman, linux-nvme, axboe, Anton Eidelman, Chaitanya Kulkarni


>>> Scan work initially adds new namespaces to ctrl->namespaces TAIL.
>>> They make the list unordered temporarily until nvme_scan_work()
>>> finally sorts the list.
>>>
>>> Solution:
>>> In order to preserve the way ctrl->namespaces is updated and sorted,
>>> make nvme_update_ana_state() deal with the case where ctrl->namespaces
>>> is not fully sorted and has new namespaces appended with potentially
>>> lower nsids.
>>> nvme_update_ana_state() keeps track of the nsid seen in the list,
>>> detects the unsorted case (rare), and restarts scanning of desc->nsids.
>>
>> Can we dust off the patch from Chaitanya to switch to an always sorted
>> xarray instead of papering over this delayed sort?
> 
> Yeah, that's what I was going to say. :)

I agree it should not happen with xarray as inserts preserve sorting,
but that patch has the non-trivial removal of the namespaces_rwsem which
synchronizes in a lot in a lot of error recovery flows. It's very
possible that xarray will introduce regressions.

Anton, is it possible to rerun tests with the xarray patch applied?
The latest one is:
[PATCH V6] nvme-core: use xarray for ctrl ns tracking

_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v2 3/3] nvme/multipath: fix stale ana state for namespaces just added by scan work
  2021-09-14  6:24           ` Sagi Grimberg
@ 2021-09-14  6:30             ` Christoph Hellwig
  2021-09-14  6:40               ` Christoph Hellwig
                                 ` (2 more replies)
  0 siblings, 3 replies; 35+ messages in thread
From: Christoph Hellwig @ 2021-09-14  6:30 UTC (permalink / raw)
  To: Sagi Grimberg
  Cc: Keith Busch, Christoph Hellwig, Anton Eidelman, linux-nvme,
	axboe, Anton Eidelman, Chaitanya Kulkarni

On Tue, Sep 14, 2021 at 09:24:30AM +0300, Sagi Grimberg wrote:
> I agree it should not happen with xarray as inserts preserve sorting,
> but that patch has the non-trivial removal of the namespaces_rwsem which
> synchronizes in a lot in a lot of error recovery flows. It's very
> possible that xarray will introduce regressions.

I've actually started looking over the series earlier today, and staring
it I'm pretty sure the namespaces_rwsem is completely broken.  There
is nothing preventing the namespaces from going away in these loops.

But I can't see why we don't just always do a sorted insertation into
the ->namespaces list.  Just trying to figure out the history of the
delayed sort ATM.

_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v2 3/3] nvme/multipath: fix stale ana state for namespaces just added by scan work
  2021-09-14  6:30             ` Christoph Hellwig
@ 2021-09-14  6:40               ` Christoph Hellwig
  2021-09-14  7:09               ` Sagi Grimberg
  2021-09-14 14:24               ` Keith Busch
  2 siblings, 0 replies; 35+ messages in thread
From: Christoph Hellwig @ 2021-09-14  6:40 UTC (permalink / raw)
  To: Sagi Grimberg
  Cc: Keith Busch, Christoph Hellwig, Anton Eidelman, linux-nvme,
	axboe, Anton Eidelman, Chaitanya Kulkarni

On Tue, Sep 14, 2021 at 08:30:52AM +0200, Christoph Hellwig wrote:
> On Tue, Sep 14, 2021 at 09:24:30AM +0300, Sagi Grimberg wrote:
> > I agree it should not happen with xarray as inserts preserve sorting,
> > but that patch has the non-trivial removal of the namespaces_rwsem which
> > synchronizes in a lot in a lot of error recovery flows. It's very
> > possible that xarray will introduce regressions.
> 
> I've actually started looking over the series earlier today, and staring
> it I'm pretty sure the namespaces_rwsem is completely broken.  There
> is nothing preventing the namespaces from going away in these loops.
> 
> But I can't see why we don't just always do a sorted insertation into
> the ->namespaces list.  Just trying to figure out the history of the
> delayed sort ATM.

Anton, can you try this instead of your last patch?


diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 97f8211cf92c1..3e3c835777458 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -13,7 +13,6 @@
 #include <linux/kernel.h>
 #include <linux/module.h>
 #include <linux/backing-dev.h>
-#include <linux/list_sort.h>
 #include <linux/slab.h>
 #include <linux/types.h>
 #include <linux/pr.h>
@@ -3716,15 +3715,6 @@ static int nvme_init_ns_head(struct nvme_ns *ns, unsigned nsid,
 	return ret;
 }
 
-static int ns_cmp(void *priv, const struct list_head *a,
-		const struct list_head *b)
-{
-	struct nvme_ns *nsa = container_of(a, struct nvme_ns, list);
-	struct nvme_ns *nsb = container_of(b, struct nvme_ns, list);
-
-	return nsa->head->ns_id - nsb->head->ns_id;
-}
-
 struct nvme_ns *nvme_find_get_ns(struct nvme_ctrl *ctrl, unsigned nsid)
 {
 	struct nvme_ns *ns, *ret = NULL;
@@ -3745,6 +3735,23 @@ struct nvme_ns *nvme_find_get_ns(struct nvme_ctrl *ctrl, unsigned nsid)
 }
 EXPORT_SYMBOL_NS_GPL(nvme_find_get_ns, NVME_TARGET_PASSTHRU);
 
+/*
+ * Add the namespace to the controller list while keeping the list ordered.
+ */
+static void nvme_ns_add_to_ctrl_list(struct nvme_ns *ns)
+{
+	struct nvme_ns *tmp;
+
+	list_for_each_entry_reverse(tmp, &ns->ctrl->namespaces, list) {
+		if (tmp->head->ns_id < ns->head->ns_id) {
+			list_add(&tmp->list, &ns->ctrl->namespaces);
+			return;
+		}
+	}
+
+	list_add_tail(&ns->list, &ns->ctrl->namespaces);
+}
+
 static void nvme_alloc_ns(struct nvme_ctrl *ctrl, unsigned nsid,
 		struct nvme_ns_ids *ids)
 {
@@ -3794,10 +3801,9 @@ static void nvme_alloc_ns(struct nvme_ctrl *ctrl, unsigned nsid,
 	if (nvme_update_ns_info(ns, id))
 		goto out_unlink_ns;
 
-	down_write(&ctrl->namespaces_rwsem);
-	list_add_tail(&ns->list, &ctrl->namespaces);
-	up_write(&ctrl->namespaces_rwsem);
-
+	down_write(&ns->ctrl->namespaces_rwsem);
+	nvme_ns_add_to_ctrl_list(ns);
+	up_write(&ns->ctrl->namespaces_rwsem);
 	nvme_get_ctrl(ctrl);
 
 	if (device_add_disk(ctrl->device, ns->disk, nvme_ns_id_attr_groups))
@@ -4082,10 +4088,6 @@ static void nvme_scan_work(struct work_struct *work)
 	if (nvme_scan_ns_list(ctrl) != 0)
 		nvme_scan_ns_sequential(ctrl);
 	mutex_unlock(&ctrl->scan_lock);
-
-	down_write(&ctrl->namespaces_rwsem);
-	list_sort(NULL, &ctrl->namespaces, ns_cmp);
-	up_write(&ctrl->namespaces_rwsem);
 }
 
 /*

_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* Re: [PATCH v2 3/3] nvme/multipath: fix stale ana state for namespaces just added by scan work
  2021-09-14  6:30             ` Christoph Hellwig
  2021-09-14  6:40               ` Christoph Hellwig
@ 2021-09-14  7:09               ` Sagi Grimberg
  2021-09-14  7:32                 ` Christoph Hellwig
  2021-09-14 14:24               ` Keith Busch
  2 siblings, 1 reply; 35+ messages in thread
From: Sagi Grimberg @ 2021-09-14  7:09 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Keith Busch, Anton Eidelman, linux-nvme, axboe, Anton Eidelman,
	Chaitanya Kulkarni


>> I agree it should not happen with xarray as inserts preserve sorting,
>> but that patch has the non-trivial removal of the namespaces_rwsem which
>> synchronizes in a lot in a lot of error recovery flows. It's very
>> possible that xarray will introduce regressions.
> 
> I've actually started looking over the series earlier today, and staring
> it I'm pretty sure the namespaces_rwsem is completely broken.  There
> is nothing preventing the namespaces from going away in these loops.

Where? the ns itself? the ns is removed from ctrl->namespaces list under
this rwsem (nvme_ns_remove).

> But I can't see why we don't just always do a sorted insertation into
> the ->namespaces list.  Just trying to figure out the history of the
> delayed sort ATM.

Don't know the history.

_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v2 3/3] nvme/multipath: fix stale ana state for namespaces just added by scan work
  2021-09-14  7:09               ` Sagi Grimberg
@ 2021-09-14  7:32                 ` Christoph Hellwig
  2021-09-14 16:45                   ` Anton Eidelman
  0 siblings, 1 reply; 35+ messages in thread
From: Christoph Hellwig @ 2021-09-14  7:32 UTC (permalink / raw)
  To: Sagi Grimberg
  Cc: Christoph Hellwig, Keith Busch, Anton Eidelman, linux-nvme,
	axboe, Anton Eidelman, Chaitanya Kulkarni

On Tue, Sep 14, 2021 at 10:09:43AM +0300, Sagi Grimberg wrote:
>
>>> I agree it should not happen with xarray as inserts preserve sorting,
>>> but that patch has the non-trivial removal of the namespaces_rwsem which
>>> synchronizes in a lot in a lot of error recovery flows. It's very
>>> possible that xarray will introduce regressions.
>>
>> I've actually started looking over the series earlier today, and staring
>> it I'm pretty sure the namespaces_rwsem is completely broken.  There
>> is nothing preventing the namespaces from going away in these loops.
>
> Where? the ns itself? the ns is removed from ctrl->namespaces list under
> this rwsem (nvme_ns_remove).

Sorry, with the series I meant the xarray conversion that removes the
rwsem entirely.

>> But I can't see why we don't just always do a sorted insertation into
>> the ->namespaces list.  Just trying to figure out the history of the
>> delayed sort ATM.
>
> Don't know the history.

git-blame teels me it showed up with the initial ID ns list support
from Keith, so not all that much history here.

_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v2 1/3] nvme/multipath: fix failure to update ns ana state
  2021-09-13 15:30     ` [PATCH v2 1/3] nvme/multipath: fix failure to update ns ana state Anton Eidelman
  2021-09-13 16:01       ` Christoph Hellwig
  2021-09-13 19:29       ` Sagi Grimberg
@ 2021-09-14  8:27       ` Christoph Hellwig
  2 siblings, 0 replies; 35+ messages in thread
From: Christoph Hellwig @ 2021-09-14  8:27 UTC (permalink / raw)
  To: Anton Eidelman; +Cc: linux-nvme, hch, kbusch, sagi, axboe, Anton Eidelman

I've applied this patch for now while we're still pondering the sorting
of the namespace list.

_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v2 3/3] nvme/multipath: fix stale ana state for namespaces just added by scan work
  2021-09-14  6:30             ` Christoph Hellwig
  2021-09-14  6:40               ` Christoph Hellwig
  2021-09-14  7:09               ` Sagi Grimberg
@ 2021-09-14 14:24               ` Keith Busch
  2 siblings, 0 replies; 35+ messages in thread
From: Keith Busch @ 2021-09-14 14:24 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Sagi Grimberg, Anton Eidelman, linux-nvme, axboe, Anton Eidelman,
	Chaitanya Kulkarni

On Tue, Sep 14, 2021 at 08:30:52AM +0200, Christoph Hellwig wrote:
> On Tue, Sep 14, 2021 at 09:24:30AM +0300, Sagi Grimberg wrote:
> > I agree it should not happen with xarray as inserts preserve sorting,
> > but that patch has the non-trivial removal of the namespaces_rwsem which
> > synchronizes in a lot in a lot of error recovery flows. It's very
> > possible that xarray will introduce regressions.
> 
> I've actually started looking over the series earlier today, and staring
> it I'm pretty sure the namespaces_rwsem is completely broken.  There
> is nothing preventing the namespaces from going away in these loops.
> 
> But I can't see why we don't just always do a sorted insertation into
> the ->namespaces list.  Just trying to figure out the history of the
> delayed sort ATM.

There doesn't appear to be any reason for the delayed list sort. It
looks correct to do a ordered list insert from the beginning.

_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v2 3/3] nvme/multipath: fix stale ana state for namespaces just added by scan work
  2021-09-14  7:32                 ` Christoph Hellwig
@ 2021-09-14 16:45                   ` Anton Eidelman
  2021-09-15  9:32                     ` Sagi Grimberg
  0 siblings, 1 reply; 35+ messages in thread
From: Anton Eidelman @ 2021-09-14 16:45 UTC (permalink / raw)
  To: Christoph Hellwig, Keith Busch, linux-nvme, axboe,
	Chaitanya Kulkarni, Sagi Grimberg

On Tue, Sep 14, 2021 at 09:32:36AM +0200, Christoph Hellwig wrote:
> On Tue, Sep 14, 2021 at 10:09:43AM +0300, Sagi Grimberg wrote:
> >
> >>> I agree it should not happen with xarray as inserts preserve sorting,
> >>> but that patch has the non-trivial removal of the namespaces_rwsem which
> >>> synchronizes in a lot in a lot of error recovery flows. It's very
> >>> possible that xarray will introduce regressions.
> >>
> >> I've actually started looking over the series earlier today, and staring
> >> it I'm pretty sure the namespaces_rwsem is completely broken.  There
> >> is nothing preventing the namespaces from going away in these loops.
> >
> > Where? the ns itself? the ns is removed from ctrl->namespaces list under
> > this rwsem (nvme_ns_remove).
> 
> Sorry, with the series I meant the xarray conversion that removes the
> rwsem entirely.
> 
> >> But I can't see why we don't just always do a sorted insertation into
> >> the ->namespaces list.  Just trying to figure out the history of the
> >> delayed sort ATM.
> >
> > Don't know the history.
> 
> git-blame teels me it showed up with the initial ID ns list support
> from Keith, so not all that much history here.

The main reason I suggest applying this as a short term fix
is that it is well isolated and is easy to cherry-pick into the LTS/stable versions.

_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v2 3/3] nvme/multipath: fix stale ana state for namespaces just added by scan work
  2021-09-14 16:45                   ` Anton Eidelman
@ 2021-09-15  9:32                     ` Sagi Grimberg
  0 siblings, 0 replies; 35+ messages in thread
From: Sagi Grimberg @ 2021-09-15  9:32 UTC (permalink / raw)
  To: Anton Eidelman, Christoph Hellwig, Keith Busch, linux-nvme,
	axboe, Chaitanya Kulkarni


>>>>> I agree it should not happen with xarray as inserts preserve sorting,
>>>>> but that patch has the non-trivial removal of the namespaces_rwsem which
>>>>> synchronizes in a lot in a lot of error recovery flows. It's very
>>>>> possible that xarray will introduce regressions.
>>>>
>>>> I've actually started looking over the series earlier today, and staring
>>>> it I'm pretty sure the namespaces_rwsem is completely broken.  There
>>>> is nothing preventing the namespaces from going away in these loops.
>>>
>>> Where? the ns itself? the ns is removed from ctrl->namespaces list under
>>> this rwsem (nvme_ns_remove).
>>
>> Sorry, with the series I meant the xarray conversion that removes the
>> rwsem entirely.
>>
>>>> But I can't see why we don't just always do a sorted insertation into
>>>> the ->namespaces list.  Just trying to figure out the history of the
>>>> delayed sort ATM.
>>>
>>> Don't know the history.
>>
>> git-blame teels me it showed up with the initial ID ns list support
>> from Keith, so not all that much history here.
> 
> The main reason I suggest applying this as a short term fix
> is that it is well isolated and is easy to cherry-pick into the LTS/stable versions.

I think that Christoph's version of keeping the list sorted is pretty
self-contained and can probably apply on stable kernels.

_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 35+ messages in thread

end of thread, other threads:[~2021-09-15  9:33 UTC | newest]

Thread overview: 35+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-09-12 18:54 [PATCH RESEND 0/3] nvme/mpath: fix missed namespaces in ana state update Anton Eidelman
2021-09-12 18:54 ` [PATCH 1/3] nvme/multipath: fix failure to update ns ana state Anton Eidelman
2021-09-12 18:54 ` [PATCH 2/3] nvme/multipath: cosmetic: keep ns nsid locally Anton Eidelman
2021-09-12 18:54 ` [PATCH 3/3] nvme/multipath: fix stale ana state for namespaces just added by scan work Anton Eidelman
2021-09-13 15:30   ` [PATCH v2 0/3] nvme/mpath: fix missed namespaces in ana state update Anton Eidelman
2021-09-13 15:30     ` [PATCH v2 1/3] nvme/multipath: fix failure to update ns ana state Anton Eidelman
2021-09-13 16:01       ` Christoph Hellwig
2021-09-13 19:29       ` Sagi Grimberg
2021-09-14  8:27       ` Christoph Hellwig
2021-09-13 15:30     ` [PATCH v2 2/3] nvme/multipath: cosmetic: keep ns nsid locally Anton Eidelman
2021-09-13 16:02       ` Christoph Hellwig
2021-09-13 20:19         ` Anton Eidelman
2021-09-13 20:57           ` Keith Busch
2021-09-14  5:44             ` Christoph Hellwig
2021-09-13 15:30     ` [PATCH v2 3/3] nvme/multipath: fix stale ana state for namespaces just added by scan work Anton Eidelman
2021-09-13 16:03       ` Christoph Hellwig
2021-09-13 16:08         ` Keith Busch
2021-09-14  6:24           ` Sagi Grimberg
2021-09-14  6:30             ` Christoph Hellwig
2021-09-14  6:40               ` Christoph Hellwig
2021-09-14  7:09               ` Sagi Grimberg
2021-09-14  7:32                 ` Christoph Hellwig
2021-09-14 16:45                   ` Anton Eidelman
2021-09-15  9:32                     ` Sagi Grimberg
2021-09-14 14:24               ` Keith Busch
2021-09-13 16:07       ` Keith Busch
2021-09-13 16:11         ` Christoph Hellwig
2021-09-13 16:24           ` Keith Busch
2021-09-13 16:46           ` Anton Eidelman
2021-09-13 15:32     ` [PATCH v2 0/3] nvme/mpath: fix missed namespaces in ana state update Christoph Hellwig
2021-09-13 15:42     ` [PATCH v2 0/3] nvme/mpath: fix missed namespaces in ana state update FIXED Anton Eidelman
2021-09-13 21:46       ` [PATCH v3 0/3] nvme/mpath: fix missed namespaces in ana state update Anton Eidelman
2021-09-13 21:46         ` [PATCH v3 1/3] nvme/multipath: fix failure to update ns ana state Anton Eidelman
2021-09-13 21:46         ` [PATCH v3 2/3] nvme/multipath: cosmetic: keep ns nsid locally Anton Eidelman
2021-09-13 21:46         ` [PATCH v3 3/3] nvme/multipath: fix stale ana state for namespaces just added by scan work Anton Eidelman

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.