* [PATCH v6 0/7] nvme controller reset and namespace scan work race conditions
@ 2019-08-13  6:42 Sagi Grimberg
  2019-08-13  6:42 ` [PATCH v6 1/7] nvme: fail cancelled commands with NVME_SC_HOST_PATH_ERROR Sagi Grimberg
                   ` (7 more replies)
  0 siblings, 8 replies; 18+ messages in thread
From: Sagi Grimberg @ 2019-08-13  6:42 UTC (permalink / raw)


Hey Hannes and Keith,

This series handles the reset and scanning race saga.

The approach is to have the relevant admin commands return a proper
status code that reflects a transport error, and to not remove the
namespace if that is indeed the case.

This gives us a reliable way to know whether revalidate_disk failed
due to a transport error or not.
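
In rough outline, the scan-side decision this series leads to looks
like the following sketch (simplified from patch 7 below; the helper
nvme_err_is_transient() is hypothetical shorthand for the open-coded
errno checks there):

	int ret = revalidate_disk(ns->disk);

	if (ret && !nvme_err_is_transient(ret))
		nvme_ns_remove(ns);	/* genuine error from the controller */
	/* otherwise keep the ns; a later scan will retry */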

I am able to reproduce this race with the following command (using
tcp/rdma):
    for j in `seq 50`; do
        nvme connect-all
        for i in `seq 50`; do
            nvme reset /dev/nvme0
        done
        nvme disconnect-all
    done

With this patch set (plus two more tcp/rdma transport-specific patches
that address other issues) I was able to pass the test without
reproducing the hang that you reported, Hannes.

Changes from v5:
- don't return blk_status_t from nvme_submit_user_cmd

Changes from v4:
- return nvme_error_status in __nvme_submit_sync_cmd and cast to
  errno in nvme_identify_ns
- modified status print in nvme_report_ns_ids

Changes from v3:
- return blk_status_to_errno instead of blk_status_t in sync cmds
- check for normal return errno from revalidate_disk, this covers
  transport errors, but also spurious allocation errors and any
  type of transient errors.

Changes from v2:
- added fc patch from James (can you test please?)
- made nvme_identify_ns return id or PTR_ERR (Hannes)

Changes from v1:
- different approach

James Smart (1):
  nvme-fc: Fail transport errors with NVME_SC_HOST_PATH

Sagi Grimberg (6):
  nvme: fail cancelled commands with NVME_SC_HOST_PATH_ERROR
  nvme: return a proper status for sync commands failure
  nvme: make nvme_identify_ns propagate errors back
  nvme: make nvme_report_ns_ids propagate error back
  nvme-tcp: fail command with NVME_SC_HOST_PATH_ERROR send failed
  nvme: fix ns removal hang when failing to revalidate due to a
    transient error

 drivers/nvme/host/core.c | 60 ++++++++++++++++++++++++++++------------
 drivers/nvme/host/fc.c   | 37 ++++++++++++++++++++-----
 drivers/nvme/host/tcp.c  |  2 +-
 3 files changed, 74 insertions(+), 25 deletions(-)

-- 
2.17.1

* [PATCH v6 1/7] nvme: fail cancelled commands with NVME_SC_HOST_PATH_ERROR
  2019-08-13  6:42 [PATCH v6 0/7] nvme controller reset and namespace scan work race conditions Sagi Grimberg
@ 2019-08-13  6:42 ` Sagi Grimberg
  2019-08-15 21:03   ` James Smart
  2019-08-13  6:42 ` [PATCH v6 2/7] nvme: return a proper status for sync commands failure Sagi Grimberg
                   ` (6 subsequent siblings)
  7 siblings, 1 reply; 18+ messages in thread
From: Sagi Grimberg @ 2019-08-13  6:42 UTC (permalink / raw)


NVME_SC_ABORT_REQ means that the request was aborted due to
an abort command being received. In our case, this is a transport
cancellation, so a host path error is much more appropriate.

Also, convert NVME_SC_HOST_PATH_ERROR to BLK_STS_TRANSPORT so
that callers can understand that the status is a transport-related
error. This will be used by the ns scanning code to tell whether it
got an error from the controller or whether the controller happens
to be unreachable via the transport.
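
As an illustrative sketch (not part of this patch), a consumer of the
request status can then do something like:

	blk_status_t status = nvme_error_status(req);

	if (status == BLK_STS_TRANSPORT) {
		/* host/transport path failure: the controller itself
		 * may be fine, so don't treat this as a controller
		 * error when deciding namespace state
		 */
	}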

Reviewed-by: Minwoo Im <minwoo.im.dev at gmail.com>
Reviewed-by: Hannes Reinecke <hare at suse.com>
Signed-off-by: Sagi Grimberg <sagi at grimberg.me>
---
 drivers/nvme/host/core.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 6956041224ec..867c8977eb3e 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -228,6 +228,8 @@ static blk_status_t nvme_error_status(struct request *req)
 		return BLK_STS_PROTECTION;
 	case NVME_SC_RESERVATION_CONFLICT:
 		return BLK_STS_NEXUS;
+	case NVME_SC_HOST_PATH_ERROR:
+		return BLK_STS_TRANSPORT;
 	default:
 		return BLK_STS_IOERR;
 	}
@@ -294,7 +296,7 @@ bool nvme_cancel_request(struct request *req, void *data, bool reserved)
 	dev_dbg_ratelimited(((struct nvme_ctrl *) data)->device,
 				"Cancelling I/O %d", req->tag);
 
-	nvme_req(req)->status = NVME_SC_ABORT_REQ;
+	nvme_req(req)->status = NVME_SC_HOST_PATH_ERROR;
 	blk_mq_complete_request_sync(req);
 	return true;
 }
-- 
2.17.1

* [PATCH v6 2/7] nvme: return a proper status for sync commands failure
  2019-08-13  6:42 [PATCH v6 0/7] nvme controller reset and namespace scan work race conditions Sagi Grimberg
  2019-08-13  6:42 ` [PATCH v6 1/7] nvme: fail cancelled commands with NVME_SC_HOST_PATH_ERROR Sagi Grimberg
@ 2019-08-13  6:42 ` Sagi Grimberg
  2019-08-13  7:09   ` Hannes Reinecke
  2019-08-15 21:05   ` James Smart
  2019-08-13  6:43 ` [PATCH v6 3/7] nvme: make nvme_identify_ns propagate errors back Sagi Grimberg
                   ` (5 subsequent siblings)
  7 siblings, 2 replies; 18+ messages in thread
From: Sagi Grimberg @ 2019-08-13  6:42 UTC (permalink / raw)


Callers should not rely on the raw nvme status; instead, return the
more appropriate blk_status_t.
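
A caller then interprets the return value of nvme_submit_sync_cmd()
roughly as follows (an illustrative sketch, not part of the patch):

	ret = nvme_submit_sync_cmd(ctrl->admin_q, &c, id, sizeof(*id));
	if (ret < 0) {
		/* Linux errno, e.g. -EINTR for a cancelled request */
	} else if (ret > 0) {
		/* blk_status_t, e.g. BLK_STS_TRANSPORT for path errors */
	} else {
		/* 0: success */
	}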

Signed-off-by: Sagi Grimberg <sagi at grimberg.me>
---
 drivers/nvme/host/core.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 867c8977eb3e..f9bc10407f1b 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -774,7 +774,7 @@ static void nvme_execute_rq_polled(struct request_queue *q,
 
 /*
  * Returns 0 on success.  If the result is negative, it's a Linux error code;
- * if the result is positive, it's an NVM Express status code
+ * if the result is positive, it's a blk_status_t status code
  */
 int __nvme_submit_sync_cmd(struct request_queue *q, struct nvme_command *cmd,
 		union nvme_result *result, void *buffer, unsigned bufflen,
@@ -805,7 +805,7 @@ int __nvme_submit_sync_cmd(struct request_queue *q, struct nvme_command *cmd,
 	if (nvme_req(req)->flags & NVME_REQ_CANCELLED)
 		ret = -EINTR;
 	else
-		ret = nvme_req(req)->status;
+		ret = nvme_error_status(req);
  out:
 	blk_mq_free_request(req);
 	return ret;
-- 
2.17.1

* [PATCH v6 3/7] nvme: make nvme_identify_ns propagate errors back
  2019-08-13  6:42 [PATCH v6 0/7] nvme controller reset and namespace scan work race conditions Sagi Grimberg
  2019-08-13  6:42 ` [PATCH v6 1/7] nvme: fail cancelled commands with NVME_SC_HOST_PATH_ERROR Sagi Grimberg
  2019-08-13  6:42 ` [PATCH v6 2/7] nvme: return a proper status for sync commands failure Sagi Grimberg
@ 2019-08-13  6:43 ` Sagi Grimberg
  2019-08-15 21:10   ` James Smart
  2019-08-13  6:43 ` [PATCH v6 4/7] nvme: make nvme_report_ns_ids propagate error back Sagi Grimberg
                   ` (4 subsequent siblings)
  7 siblings, 1 reply; 18+ messages in thread
From: Sagi Grimberg @ 2019-08-13  6:43 UTC (permalink / raw)


Right now, callers of nvme_identify_ns only know that it failed, but
not why. Make nvme_identify_ns propagate the error back. Note that if
the return status from nvme_submit_sync_cmd() is positive, we convert
it from a blk_status_t to a negative errno.
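
Callers then follow the usual ERR_PTR convention (illustrative,
mirroring the hunks below):

	id = nvme_identify_ns(ctrl, nsid);
	if (IS_ERR(id))
		return PTR_ERR(id);	/* always a negative errno */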

Reviewed-by: Minwoo Im <minwoo.im.dev at gmail.com>
Reviewed-by: Hannes Reinecke <hare at suse.com>
Signed-off-by: Sagi Grimberg <sagi at grimberg.me>
---
 drivers/nvme/host/core.c | 15 +++++++++------
 1 file changed, 9 insertions(+), 6 deletions(-)

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index f9bc10407f1b..4cc21567c43f 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -1110,13 +1110,16 @@ static struct nvme_id_ns *nvme_identify_ns(struct nvme_ctrl *ctrl,
 
 	id = kmalloc(sizeof(*id), GFP_KERNEL);
 	if (!id)
-		return NULL;
+		return ERR_PTR(-ENOMEM);
 
 	error = nvme_submit_sync_cmd(ctrl->admin_q, &c, id, sizeof(*id));
 	if (error) {
 		dev_warn(ctrl->device, "Identify namespace failed (%d)\n", error);
 		kfree(id);
-		return NULL;
+		if (error > 0)
+			return ERR_PTR(blk_status_to_errno(error));
+		else
+			return ERR_PTR(error);
 	}
 
 	return id;
@@ -1750,8 +1753,8 @@ static int nvme_revalidate_disk(struct gendisk *disk)
 	}
 
 	id = nvme_identify_ns(ctrl, ns->head->ns_id);
-	if (!id)
-		return -ENODEV;
+	if (IS_ERR(id))
+		return PTR_ERR(id);
 
 	if (id->ncap == 0) {
 		ret = -ENODEV;
@@ -3339,8 +3342,8 @@ static int nvme_alloc_ns(struct nvme_ctrl *ctrl, unsigned nsid)
 	nvme_set_queue_limits(ctrl, ns->queue);
 
 	id = nvme_identify_ns(ctrl, nsid);
-	if (!id) {
-		ret = -EIO;
+	if (IS_ERR(id)) {
+		ret = PTR_ERR(id);
 		goto out_free_queue;
 	}
 
-- 
2.17.1

* [PATCH v6 4/7] nvme: make nvme_report_ns_ids propagate error back
  2019-08-13  6:42 [PATCH v6 0/7] nvme controller reset and namespace scan work race conditions Sagi Grimberg
                   ` (2 preceding siblings ...)
  2019-08-13  6:43 ` [PATCH v6 3/7] nvme: make nvme_identify_ns propagate errors back Sagi Grimberg
@ 2019-08-13  6:43 ` Sagi Grimberg
  2019-08-15 21:11   ` James Smart
  2019-08-13  6:43 ` [PATCH v6 5/7] nvme-tcp: fail command with NVME_SC_HOST_PATH_ERROR send failed Sagi Grimberg
                   ` (3 subsequent siblings)
  7 siblings, 1 reply; 18+ messages in thread
From: Sagi Grimberg @ 2019-08-13  6:43 UTC (permalink / raw)


Also make the callers check the return status and propagate it back
accordingly, and print the return status on failure.

Reviewed-by: Hannes Reinecke <hare at suse.com>
Signed-off-by: Sagi Grimberg <sagi at grimberg.me>
---
 drivers/nvme/host/core.c | 24 ++++++++++++++++++------
 1 file changed, 18 insertions(+), 6 deletions(-)

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 4cc21567c43f..10e237f80800 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -1608,9 +1608,11 @@ static void nvme_config_write_zeroes(struct gendisk *disk, struct nvme_ns *ns)
 	blk_queue_max_write_zeroes_sectors(disk->queue, max_sectors);
 }
 
-static void nvme_report_ns_ids(struct nvme_ctrl *ctrl, unsigned int nsid,
+static int nvme_report_ns_ids(struct nvme_ctrl *ctrl, unsigned int nsid,
 		struct nvme_id_ns *id, struct nvme_ns_ids *ids)
 {
+	int ret = 0;
+
 	memset(ids, 0, sizeof(*ids));
 
 	if (ctrl->vs >= NVME_VS(1, 1, 0))
@@ -1621,10 +1623,12 @@ static void nvme_report_ns_ids(struct nvme_ctrl *ctrl, unsigned int nsid,
 		 /* Don't treat error as fatal we potentially
 		  * already have a NGUID or EUI-64
 		  */
-		if (nvme_identify_ns_descs(ctrl, nsid, ids))
+		ret = nvme_identify_ns_descs(ctrl, nsid, ids);
+		if (ret)
 			dev_warn(ctrl->device,
-				 "%s: Identify Descriptors failed\n", __func__);
+				 "Identify Descriptors failed (%d)\n", ret);
 	}
+	return ret;
 }
 
 static bool nvme_ns_ids_valid(struct nvme_ns_ids *ids)
@@ -1762,7 +1766,10 @@ static int nvme_revalidate_disk(struct gendisk *disk)
 	}
 
 	__nvme_revalidate_disk(disk, id);
-	nvme_report_ns_ids(ctrl, ns->head->ns_id, id, &ids);
+	ret = nvme_report_ns_ids(ctrl, ns->head->ns_id, id, &ids);
+	if (ret)
+		goto out;
+
 	if (!nvme_ns_ids_equal(&ns->head->ids, &ids)) {
 		dev_err(ctrl->device,
 			"identifiers changed for nsid %d\n", ns->head->ns_id);
@@ -3187,7 +3194,9 @@ static struct nvme_ns_head *nvme_alloc_ns_head(struct nvme_ctrl *ctrl,
 	head->ns_id = nsid;
 	kref_init(&head->ref);
 
-	nvme_report_ns_ids(ctrl, nsid, id, &head->ids);
+	ret = nvme_report_ns_ids(ctrl, nsid, id, &head->ids);
+	if (ret)
+		goto out_cleanup_srcu;
 
 	ret = __nvme_check_ids(ctrl->subsys, head);
 	if (ret) {
@@ -3235,7 +3244,10 @@ static int nvme_init_ns_head(struct nvme_ns *ns, unsigned nsid,
 	} else {
 		struct nvme_ns_ids ids;
 
-		nvme_report_ns_ids(ctrl, nsid, id, &ids);
+		ret = nvme_report_ns_ids(ctrl, nsid, id, &ids);
+		if (ret)
+			goto out_unlock;
+
 		if (!nvme_ns_ids_equal(&head->ids, &ids)) {
 			dev_err(ctrl->device,
 				"IDs don't match for shared namespace %d\n",
-- 
2.17.1

* [PATCH v6 5/7] nvme-tcp: fail command with NVME_SC_HOST_PATH_ERROR send failed
  2019-08-13  6:42 [PATCH v6 0/7] nvme controller reset and namespace scan work race conditions Sagi Grimberg
                   ` (3 preceding siblings ...)
  2019-08-13  6:43 ` [PATCH v6 4/7] nvme: make nvme_report_ns_ids propagate error back Sagi Grimberg
@ 2019-08-13  6:43 ` Sagi Grimberg
  2019-08-15 21:11   ` James Smart
  2019-08-13  6:43 ` [PATCH v6 6/7] nvme-fc: Fail transport errors with NVME_SC_HOST_PATH Sagi Grimberg
                   ` (2 subsequent siblings)
  7 siblings, 1 reply; 18+ messages in thread
From: Sagi Grimberg @ 2019-08-13  6:43 UTC (permalink / raw)


This is a more appropriate error status for a transport error
detected by us (the host).

Reviewed-by: Hannes Reinecke <hare at suse.com>
Signed-off-by: Sagi Grimberg <sagi at grimberg.me>
---
 drivers/nvme/host/tcp.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c
index 2de33c4c5d14..7daf39f31b08 100644
--- a/drivers/nvme/host/tcp.c
+++ b/drivers/nvme/host/tcp.c
@@ -842,7 +842,7 @@ static inline void nvme_tcp_done_send_req(struct nvme_tcp_queue *queue)
 
 static void nvme_tcp_fail_request(struct nvme_tcp_request *req)
 {
-	nvme_tcp_end_request(blk_mq_rq_from_pdu(req), NVME_SC_DATA_XFER_ERROR);
+	nvme_tcp_end_request(blk_mq_rq_from_pdu(req), NVME_SC_HOST_PATH_ERROR);
 }
 
 static int nvme_tcp_try_send_data(struct nvme_tcp_request *req)
-- 
2.17.1

* [PATCH v6 6/7] nvme-fc: Fail transport errors with NVME_SC_HOST_PATH
  2019-08-13  6:42 [PATCH v6 0/7] nvme controller reset and namespace scan work race conditions Sagi Grimberg
                   ` (4 preceding siblings ...)
  2019-08-13  6:43 ` [PATCH v6 5/7] nvme-tcp: fail command with NVME_SC_HOST_PATH_ERROR send failed Sagi Grimberg
@ 2019-08-13  6:43 ` Sagi Grimberg
  2019-08-15 21:12   ` James Smart
  2019-08-13  6:43 ` [PATCH v6 7/7] nvme: fix ns removal hang when failing to revalidate due to a transient error Sagi Grimberg
  2019-08-15 17:45 ` [PATCH v6 0/7] nvme controller reset and namespace scan work race conditions Sagi Grimberg
  7 siblings, 1 reply; 18+ messages in thread
From: Sagi Grimberg @ 2019-08-13  6:43 UTC (permalink / raw)


From: James Smart <james.smart@broadcom.com>

NVME_SC_INTERNAL should indicate internal controller errors, not
host transport errors. With this change, host transport errors
propagate to the upper layers (essentially nvme core) and are
interpreted as transport errors, which should not be taken into
account for namespace state or condition.

Reviewed-by: Hannes Reinecke <hare at suse.com>
Signed-off-by: Sagi Grimberg <sagi at grimberg.me>
---
 drivers/nvme/host/fc.c | 37 ++++++++++++++++++++++++++++++-------
 1 file changed, 30 insertions(+), 7 deletions(-)

diff --git a/drivers/nvme/host/fc.c b/drivers/nvme/host/fc.c
index 3f3725d1eca1..c289f46f6d13 100644
--- a/drivers/nvme/host/fc.c
+++ b/drivers/nvme/host/fc.c
@@ -1608,9 +1608,13 @@ nvme_fc_fcpio_done(struct nvmefc_fcp_req *req)
 				sizeof(op->rsp_iu), DMA_FROM_DEVICE);
 
 	if (opstate == FCPOP_STATE_ABORTED)
-		status = cpu_to_le16(NVME_SC_ABORT_REQ << 1);
-	else if (freq->status)
-		status = cpu_to_le16(NVME_SC_INTERNAL << 1);
+		status = cpu_to_le16(NVME_SC_HOST_PATH_ERROR << 1);
+	else if (freq->status) {
+		status = cpu_to_le16(NVME_SC_HOST_PATH_ERROR << 1);
+		dev_info(ctrl->ctrl.device,
+			"NVME-FC{%d}: io failed due to lldd error %d\n",
+			ctrl->cnum, freq->status);
+	}
 
 	/*
 	 * For the linux implementation, if we have an unsuccesful
@@ -1637,8 +1641,13 @@ nvme_fc_fcpio_done(struct nvmefc_fcp_req *req)
 		 * no payload in the CQE by the transport.
 		 */
 		if (freq->transferred_length !=
-			be32_to_cpu(op->cmd_iu.data_len)) {
-			status = cpu_to_le16(NVME_SC_INTERNAL << 1);
+		    be32_to_cpu(op->cmd_iu.data_len)) {
+			status = cpu_to_le16(NVME_SC_HOST_PATH_ERROR << 1);
+			dev_info(ctrl->ctrl.device,
+				"NVME-FC{%d}: io failed due to bad transfer "
+				"length: %d vs expected %d\n",
+				ctrl->cnum, freq->transferred_length,
+				be32_to_cpu(op->cmd_iu.data_len));
 			goto done;
 		}
 		result.u64 = 0;
@@ -1655,7 +1664,17 @@ nvme_fc_fcpio_done(struct nvmefc_fcp_req *req)
 					freq->transferred_length ||
 			     op->rsp_iu.status_code ||
 			     sqe->common.command_id != cqe->command_id)) {
-			status = cpu_to_le16(NVME_SC_INTERNAL << 1);
+			status = cpu_to_le16(NVME_SC_HOST_PATH_ERROR << 1);
+			dev_info(ctrl->ctrl.device,
+				"NVME-FC{%d}: io failed due to bad NVMe_ERSP: "
+				"iu len %d, xfr len %d vs %d, status code "
+				"%d, cmdid %d vs %d\n",
+				ctrl->cnum, be16_to_cpu(op->rsp_iu.iu_len),
+				be32_to_cpu(op->rsp_iu.xfrd_len),
+				freq->transferred_length,
+				op->rsp_iu.status_code,
+				sqe->common.command_id,
+				cqe->command_id);
 			goto done;
 		}
 		result = cqe->result;
@@ -1663,7 +1682,11 @@ nvme_fc_fcpio_done(struct nvmefc_fcp_req *req)
 		break;
 
 	default:
-		status = cpu_to_le16(NVME_SC_INTERNAL << 1);
+		status = cpu_to_le16(NVME_SC_HOST_PATH_ERROR << 1);
+		dev_info(ctrl->ctrl.device,
+			"NVME-FC{%d}: io failed due to odd NVMe_xRSP iu "
+			"len %d\n",
+			ctrl->cnum, freq->rcv_rsplen);
 		goto done;
 	}
 
-- 
2.17.1

* [PATCH v6 7/7] nvme: fix ns removal hang when failing to revalidate due to a transient error
  2019-08-13  6:42 [PATCH v6 0/7] nvme controller reset and namespace scan work race conditions Sagi Grimberg
                   ` (5 preceding siblings ...)
  2019-08-13  6:43 ` [PATCH v6 6/7] nvme-fc: Fail transport errors with NVME_SC_HOST_PATH Sagi Grimberg
@ 2019-08-13  6:43 ` Sagi Grimberg
  2019-08-15 21:17   ` James Smart
  2019-08-15 17:45 ` [PATCH v6 0/7] nvme controller reset and namespace scan work race conditions Sagi Grimberg
  7 siblings, 1 reply; 18+ messages in thread
From: Sagi Grimberg @ 2019-08-13  6:43 UTC (permalink / raw)


If a controller reset is racing with a namespace revalidation, the
revalidation (admin) I/O will surely fail, but we should not remove
the namespace, as we will execute the I/O again when the controller
is back up. The same goes for spurious allocation errors (-ENOMEM).

Fix this by checking the specific error code that revalidate_disk
returns; if it is a transient error (for example -ENOLINK, which
correlates to BLK_STS_TRANSPORT, or -ENOMEM, which correlates to
BLK_STS_RESOURCE or an allocation failure), do not remove the
namespace: it will either recover when the controller is back up and
a subsequent scan is scheduled, or the controller is going away and
the namespaces will be removed anyway.

This fixes a hang when namespace scanning races with a controller
reset, and also spurious I/O errors in path failover conditions where
the controller reset races with the namespace scan work with
multipath enabled.
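
For reference, the errno values checked in the patch correspond to the
blk_status_to_errno() translations in the kernels this series targets;
written as a helper (hypothetical shorthand for the open-coded check
below):

	static bool nvme_err_is_transient(int ret)
	{
		return ret == -ENOLINK ||	/* BLK_STS_TRANSPORT */
		       ret == -ENOMEM  ||	/* BLK_STS_RESOURCE or alloc failure */
		       ret == -EAGAIN  ||	/* BLK_STS_AGAIN */
		       ret == -EBUSY;		/* BLK_STS_DEV_RESOURCE */
	}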

Reported-by: Hannes Reinecke  <hare at suse.de>
Reviewed-by: Hannes Reinecke <hare at suse.com>
Signed-off-by: Sagi Grimberg <sagi at grimberg.me>
---
 drivers/nvme/host/core.c | 13 +++++++++++--
 1 file changed, 11 insertions(+), 2 deletions(-)

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 10e237f80800..ce93fbebff61 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -3457,8 +3457,17 @@ static void nvme_validate_ns(struct nvme_ctrl *ctrl, unsigned nsid)
 
 	ns = nvme_find_get_ns(ctrl, nsid);
 	if (ns) {
-		if (ns->disk && revalidate_disk(ns->disk))
-			nvme_ns_remove(ns);
+		if (ns->disk) {
+			int ret = revalidate_disk(ns->disk);
+
+			/*
+			 * remove the ns only if the return status is
+			 * not a transient error.
+			 */
+			if (ret && ret != -ENOLINK && ret != -ENOMEM &&
+			    ret != -EAGAIN && ret != -EBUSY)
+				nvme_ns_remove(ns);
+		}
 		nvme_put_ns(ns);
 	} else
 		nvme_alloc_ns(ctrl, nsid);
-- 
2.17.1

* [PATCH v6 2/7] nvme: return a proper status for sync commands failure
  2019-08-13  6:42 ` [PATCH v6 2/7] nvme: return a proper status for sync commands failure Sagi Grimberg
@ 2019-08-13  7:09   ` Hannes Reinecke
  2019-08-15 21:05   ` James Smart
  1 sibling, 0 replies; 18+ messages in thread
From: Hannes Reinecke @ 2019-08-13  7:09 UTC (permalink / raw)


On 8/13/19 8:42 AM, Sagi Grimberg wrote:
> Callers should not rely on the raw nvme status; instead, return the
> more appropriate blk_status_t.
> 
> Signed-off-by: Sagi Grimberg <sagi at grimberg.me>
> ---
>  drivers/nvme/host/core.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
> index 867c8977eb3e..f9bc10407f1b 100644
> --- a/drivers/nvme/host/core.c
> +++ b/drivers/nvme/host/core.c
> @@ -774,7 +774,7 @@ static void nvme_execute_rq_polled(struct request_queue *q,
>  
>  /*
>   * Returns 0 on success.  If the result is negative, it's a Linux error code;
> - * if the result is positive, it's an NVM Express status code
> + * if the result is positive, it's a blk_status_t status code
>   */
>  int __nvme_submit_sync_cmd(struct request_queue *q, struct nvme_command *cmd,
>  		union nvme_result *result, void *buffer, unsigned bufflen,
> @@ -805,7 +805,7 @@ int __nvme_submit_sync_cmd(struct request_queue *q, struct nvme_command *cmd,
>  	if (nvme_req(req)->flags & NVME_REQ_CANCELLED)
>  		ret = -EINTR;
>  	else
> -		ret = nvme_req(req)->status;
> +		ret = nvme_error_status(req);
>   out:
>  	blk_mq_free_request(req);
>  	return ret;
> 
Reviewed-by: Hannes Reinecke <hare at suse.com>

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		   Teamlead Storage & Networking
hare at suse.de			               +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: Felix Imendörffer, Mary Higgins, Sri Rasiah
HRB 21284 (AG Nürnberg)

* [PATCH v6 0/7] nvme controller reset and namespace scan work race conditions
  2019-08-13  6:42 [PATCH v6 0/7] nvme controller reset and namespace scan work race conditions Sagi Grimberg
                   ` (6 preceding siblings ...)
  2019-08-13  6:43 ` [PATCH v6 7/7] nvme: fix ns removal hang when failing to revalidate due to a transient error Sagi Grimberg
@ 2019-08-15 17:45 ` Sagi Grimberg
  2019-08-15 20:01   ` James Smart
  7 siblings, 1 reply; 18+ messages in thread
From: Sagi Grimberg @ 2019-08-15 17:45 UTC (permalink / raw)


Hannes, Keith, James,

I think this series got the blessing from Hannes and Keith, but
I also want to hear from James.

Also, Hannes, did you happen to give this a try? It would help a lot
to know whether it fixes the issues you were seeing.

* [PATCH v6 0/7] nvme controller reset and namespace scan work race conditions
  2019-08-15 17:45 ` [PATCH v6 0/7] nvme controller reset and namespace scan work race conditions Sagi Grimberg
@ 2019-08-15 20:01   ` James Smart
  0 siblings, 0 replies; 18+ messages in thread
From: James Smart @ 2019-08-15 20:01 UTC (permalink / raw)


On 8/15/2019 10:45 AM, Sagi Grimberg wrote:
> Hannes, Keith, James,
>
> I think this series got the blessing from Hannes and Keith, but
> I also want to hear from James.
>
> Also, Hannes, did you happen to give this a try? It would help a lot
> to know whether it fixes the issues you were seeing.

I'll look through it shortly.

* [PATCH v6 1/7] nvme: fail cancelled commands with NVME_SC_HOST_PATH_ERROR
  2019-08-13  6:42 ` [PATCH v6 1/7] nvme: fail cancelled commands with NVME_SC_HOST_PATH_ERROR Sagi Grimberg
@ 2019-08-15 21:03   ` James Smart
  0 siblings, 0 replies; 18+ messages in thread
From: James Smart @ 2019-08-15 21:03 UTC (permalink / raw)




On 8/12/2019 11:42 PM, Sagi Grimberg wrote:
> NVME_SC_ABORT_REQ means that the request was aborted due to
> an abort command being received. In our case, this is a transport
> cancellation, so a host path error is much more appropriate.
>
> Also, convert NVME_SC_HOST_PATH_ERROR to BLK_STS_TRANSPORT so
> that callers can understand that the status is a transport-related
> error. This will be used by the ns scanning code to tell whether it
> got an error from the controller or whether the controller happens
> to be unreachable via the transport.
>
> Reviewed-by: Minwoo Im <minwoo.im.dev at gmail.com>
> Reviewed-by: Hannes Reinecke <hare at suse.com>
> Signed-off-by: Sagi Grimberg <sagi at grimberg.me>
> ---
>   drivers/nvme/host/core.c | 4 +++-
>   1 file changed, 3 insertions(+), 1 deletion(-)
>

Reviewed-by: James Smart <james.smart at broadcom.com>

-- james

* [PATCH v6 2/7] nvme: return a proper status for sync commands failure
  2019-08-13  6:42 ` [PATCH v6 2/7] nvme: return a proper status for sync commands failure Sagi Grimberg
  2019-08-13  7:09   ` Hannes Reinecke
@ 2019-08-15 21:05   ` James Smart
  1 sibling, 0 replies; 18+ messages in thread
From: James Smart @ 2019-08-15 21:05 UTC (permalink / raw)




On 8/12/2019 11:42 PM, Sagi Grimberg wrote:
> Callers should not rely on the raw nvme status; instead, return the
> more appropriate blk_status_t.
>
> Signed-off-by: Sagi Grimberg <sagi at grimberg.me>
> ---
>   drivers/nvme/host/core.c | 4 ++--
>   1 file changed, 2 insertions(+), 2 deletions(-)
>
>


Reviewed-by: James Smart <james.smart at broadcom.com>

-- james

* [PATCH v6 3/7] nvme: make nvme_identify_ns propagate errors back
  2019-08-13  6:43 ` [PATCH v6 3/7] nvme: make nvme_identify_ns propagate errors back Sagi Grimberg
@ 2019-08-15 21:10   ` James Smart
  0 siblings, 0 replies; 18+ messages in thread
From: James Smart @ 2019-08-15 21:10 UTC (permalink / raw)




On 8/12/2019 11:43 PM, Sagi Grimberg wrote:
> Right now, callers of nvme_identify_ns only know that it failed, but
> not why. Make nvme_identify_ns propagate the error back. Note that if
> the return status from nvme_submit_sync_cmd() is positive, we convert
> it from a blk_status_t to a negative errno.
>
> Reviewed-by: Minwoo Im <minwoo.im.dev at gmail.com>
> Reviewed-by: Hannes Reinecke <hare at suse.com>
> Signed-off-by: Sagi Grimberg <sagi at grimberg.me>
> ---
>


Reviewed-by: James Smart <james.smart at broadcom.com>

-- james

* [PATCH v6 4/7] nvme: make nvme_report_ns_ids propagate error back
  2019-08-13  6:43 ` [PATCH v6 4/7] nvme: make nvme_report_ns_ids propagate error back Sagi Grimberg
@ 2019-08-15 21:11   ` James Smart
  0 siblings, 0 replies; 18+ messages in thread
From: James Smart @ 2019-08-15 21:11 UTC (permalink / raw)




On 8/12/2019 11:43 PM, Sagi Grimberg wrote:
> Also make the callers check the return status and propagate it back
> accordingly, and print the return status on failure.
>
> Reviewed-by: Hannes Reinecke <hare at suse.com>
> Signed-off-by: Sagi Grimberg <sagi at grimberg.me>
> ---
>   drivers/nvme/host/core.c | 24 ++++++++++++++++++------
>   1 file changed, 18 insertions(+), 6 deletions(-)
>
>


Reviewed-by: James Smart <james.smart at broadcom.com>

-- james

* [PATCH v6 5/7] nvme-tcp: fail command with NVME_SC_HOST_PATH_ERROR send failed
  2019-08-13  6:43 ` [PATCH v6 5/7] nvme-tcp: fail command with NVME_SC_HOST_PATH_ERROR send failed Sagi Grimberg
@ 2019-08-15 21:11   ` James Smart
  0 siblings, 0 replies; 18+ messages in thread
From: James Smart @ 2019-08-15 21:11 UTC (permalink / raw)




On 8/12/2019 11:43 PM, Sagi Grimberg wrote:
> This is a more appropriate error status for a transport error
> detected by us (the host).
>
> Reviewed-by: Hannes Reinecke <hare at suse.com>
> Signed-off-by: Sagi Grimberg <sagi at grimberg.me>
> ---
>

Reviewed-by: James Smart <james.smart at broadcom.com>

-- james

* [PATCH v6 6/7] nvme-fc: Fail transport errors with NVME_SC_HOST_PATH
  2019-08-13  6:43 ` [PATCH v6 6/7] nvme-fc: Fail transport errors with NVME_SC_HOST_PATH Sagi Grimberg
@ 2019-08-15 21:12   ` James Smart
  0 siblings, 0 replies; 18+ messages in thread
From: James Smart @ 2019-08-15 21:12 UTC (permalink / raw)




On 8/12/2019 11:43 PM, Sagi Grimberg wrote:
> From: James Smart <james.smart at broadcom.com>
>
> NVME_SC_INTERNAL should indicate internal controller errors, not
> host transport errors. With this change, host transport errors
> propagate to the upper layers (essentially nvme core) and are
> interpreted as transport errors, which should not be taken into
> account for namespace state or condition.
>
> Reviewed-by: Hannes Reinecke <hare at suse.com>
> Signed-off-by: Sagi Grimberg <sagi at grimberg.me>
> ---
>

Reviewed-by: James Smart <james.smart at broadcom.com>

-- james

* [PATCH v6 7/7] nvme: fix ns removal hang when failing to revalidate due to a transient error
  2019-08-13  6:43 ` [PATCH v6 7/7] nvme: fix ns removal hang when failing to revalidate due to a transient error Sagi Grimberg
@ 2019-08-15 21:17   ` James Smart
  0 siblings, 0 replies; 18+ messages in thread
From: James Smart @ 2019-08-15 21:17 UTC (permalink / raw)




On 8/12/2019 11:43 PM, Sagi Grimberg wrote:
> If a controller reset is racing with a namespace revalidation, the
> revalidation (admin) I/O will surely fail, but we should not remove
> the namespace, as we will execute the I/O again when the controller
> is back up. The same goes for spurious allocation errors (-ENOMEM).
>
> Fix this by checking the specific error code that revalidate_disk
> returns; if it is a transient error (for example -ENOLINK, which
> correlates to BLK_STS_TRANSPORT, or -ENOMEM, which correlates to
> BLK_STS_RESOURCE or an allocation failure), do not remove the
> namespace: it will either recover when the controller is back up and
> a subsequent scan is scheduled, or the controller is going away and
> the namespaces will be removed anyway.
>
> This fixes a hang when namespace scanning races with a controller
> reset, and also spurious I/O errors in path failover conditions where
> the controller reset races with the namespace scan work with
> multipath enabled.
>
> Reported-by: Hannes Reinecke  <hare at suse.de>
> Reviewed-by: Hannes Reinecke <hare at suse.com>
> Signed-off-by: Sagi Grimberg <sagi at grimberg.me>
>

This looks fine: Reviewed-by: James Smart <james.smart at broadcom.com>

Q: do we need to do something about nvme_update_formats(), which does
a nvme_set_queue_dying() if nvme_revalidate_disk() fails? It's not
removal, but...

-- james
