Linux-Block Archive on lore.kernel.org
 help / color / Atom feed
* [PATCH v2 0/4] nvme: improve error handling and ana_state to work well with dm-multipath
@ 2021-04-15 23:15 Mike Snitzer
  2021-04-15 23:15 ` [PATCH v2 1/4] nvme: return BLK_STS_DO_NOT_RETRY if the DNR bit is set Mike Snitzer
                   ` (3 more replies)
  0 siblings, 4 replies; 17+ messages in thread
From: Mike Snitzer @ 2021-04-15 23:15 UTC (permalink / raw)
  To: Christoph Hellwig, Jens Axboe; +Cc: dm-devel, linux-block, linux-nvme

Hi,

I just rebased and tweaked these 4 patches ontop of v5.12-rc7.

Been carrying an older ~5.9 based version in RHEL8 (since RHEL8.3,
August 2020; yes the patchset I just mistakenly emailed instead of
this patchset):
https://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm.git/log/?h=nvme-error-handling-fixes/for-5.9

RHEL9 is coming, would really prefer that these changes land upstream
rather than carry them within RHEL.

All review/feedback welcome.

Thanks,
Mike

Chao Leng (1):
  nvme: allow local retry for requests with REQ_FAILFAST_TRANSPORT set

Mike Snitzer (3):
  nvme: return BLK_STS_DO_NOT_RETRY if the DNR bit is set
  nvme: introduce FAILUP handling for REQ_FAILFAST_TRANSPORT
  nvme: decouple basic ANA log page re-read support from native multipathing

 drivers/nvme/host/core.c      | 45 +++++++++++++++++++++++++++++++++++++------
 drivers/nvme/host/multipath.c | 16 ++++++++++-----
 drivers/nvme/host/nvme.h      |  4 ++++
 include/linux/blk_types.h     |  8 ++++++++
 4 files changed, 62 insertions(+), 11 deletions(-)

-- 
2.15.0


^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH v2 1/4] nvme: return BLK_STS_DO_NOT_RETRY if the DNR bit is set
  2021-04-15 23:15 [PATCH v2 0/4] nvme: improve error handling and ana_state to work well with dm-multipath Mike Snitzer
@ 2021-04-15 23:15 ` Mike Snitzer
  2021-04-15 23:48   ` Chaitanya Kulkarni
  2021-04-16 13:51   ` Hannes Reinecke
  2021-04-15 23:15 ` [PATCH v2 2/4] nvme: allow local retry for requests with REQ_FAILFAST_TRANSPORT set Mike Snitzer
                   ` (2 subsequent siblings)
  3 siblings, 2 replies; 17+ messages in thread
From: Mike Snitzer @ 2021-04-15 23:15 UTC (permalink / raw)
  To: Christoph Hellwig, Jens Axboe; +Cc: dm-devel, linux-block, linux-nvme

If the DNR bit is set we should not retry the command.

We care about the retryable vs not retryable distinction at the block
layer so propagate the equivalent of the DNR bit by introducing
BLK_STS_DO_NOT_RETRY. Update blk_path_error() to _not_ retry if it
is set.

This change runs with the suggestion made here:
https://lore.kernel.org/linux-nvme/20190813170144.GA10269@lst.de/

Suggested-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
---
 drivers/nvme/host/core.c  | 3 +++
 include/linux/blk_types.h | 8 ++++++++
 2 files changed, 11 insertions(+)

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 0896e21642be..540d6fd8ffef 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -237,6 +237,9 @@ static void nvme_delete_ctrl_sync(struct nvme_ctrl *ctrl)
 
 static blk_status_t nvme_error_status(u16 status)
 {
+	if (unlikely(status & NVME_SC_DNR))
+		return BLK_STS_DO_NOT_RETRY;
+
 	switch (status & 0x7ff) {
 	case NVME_SC_SUCCESS:
 		return BLK_STS_OK;
diff --git a/include/linux/blk_types.h b/include/linux/blk_types.h
index db026b6ec15a..1ca724948c56 100644
--- a/include/linux/blk_types.h
+++ b/include/linux/blk_types.h
@@ -142,6 +142,13 @@ typedef u8 __bitwise blk_status_t;
  */
 #define BLK_STS_ZONE_ACTIVE_RESOURCE	((__force blk_status_t)16)
 
+/*
+ * BLK_STS_DO_NOT_RETRY is returned from the driver in the completion path
+ * if the device returns a status indicating that if the same command is
+ * re-submitted it is expected to fail.
+ */
+#define BLK_STS_DO_NOT_RETRY	((__force blk_status_t)17)
+
 /**
  * blk_path_error - returns true if error may be path related
  * @error: status the request was completed with
@@ -157,6 +164,7 @@ typedef u8 __bitwise blk_status_t;
 static inline bool blk_path_error(blk_status_t error)
 {
 	switch (error) {
+	case BLK_STS_DO_NOT_RETRY:
 	case BLK_STS_NOTSUPP:
 	case BLK_STS_NOSPC:
 	case BLK_STS_TARGET:
-- 
2.15.0


^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH v2 2/4] nvme: allow local retry for requests with REQ_FAILFAST_TRANSPORT set
  2021-04-15 23:15 [PATCH v2 0/4] nvme: improve error handling and ana_state to work well with dm-multipath Mike Snitzer
  2021-04-15 23:15 ` [PATCH v2 1/4] nvme: return BLK_STS_DO_NOT_RETRY if the DNR bit is set Mike Snitzer
@ 2021-04-15 23:15 ` Mike Snitzer
  2021-04-16 14:01   ` Hannes Reinecke
  2021-04-15 23:15 ` [PATCH v2 3/4] nvme: introduce FAILUP handling for REQ_FAILFAST_TRANSPORT Mike Snitzer
  2021-04-15 23:15 ` [PATCH v2 4/4] nvme: decouple basic ANA log page re-read support from native multipathing Mike Snitzer
  3 siblings, 1 reply; 17+ messages in thread
From: Mike Snitzer @ 2021-04-15 23:15 UTC (permalink / raw)
  To: Christoph Hellwig, Jens Axboe
  Cc: dm-devel, linux-block, linux-nvme, Chao Leng

From: Chao Leng <lengchao@huawei.com>

REQ_FAILFAST_TRANSPORT was designed for SCSI, because the SCSI protocol
does not define the local retry mechanism. SCSI implements a fuzzy
local retry mechanism, so REQ_FAILFAST_TRANSPORT is needed to allow
higher-level multipathing software to perform failover/retry.

NVMe is different with SCSI about this. It defines a local retry
mechanism and path error codes, so NVMe should retry local for non
path error. If path related error, whether to retry and how to retry
is still determined by higher-level multipathing's failover.

Unlike SCSI, NVMe shouldn't prevent retry if REQ_FAILFAST_TRANSPORT
because NVMe's local retry is needed -- as is NVMe specific logic to
categorize whether an error is path related.

In this way, the mechanism of NVMe multipath or other multipath are
now equivalent. The mechanism is: non path related error will be
retried locally, path related error is handled by multipath.

Signed-off-by: Chao Leng <lengchao@huawei.com>
[snitzer: edited header for grammar and clarity, also added code comment]
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
---
 drivers/nvme/host/core.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 540d6fd8ffef..4134cf3c7e48 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -306,7 +306,14 @@ static inline enum nvme_disposition nvme_decide_disposition(struct request *req)
 	if (likely(nvme_req(req)->status == 0))
 		return COMPLETE;
 
-	if (blk_noretry_request(req) ||
+	/*
+	 * REQ_FAILFAST_TRANSPORT is set by upper layer software that
+	 * handles multipathing. Unlike SCSI, NVMe's error handling was
+	 * specifically designed to handle local retry for non-path errors.
+	 * As such, allow NVMe's local retry mechanism to be used for
+	 * requests marked with REQ_FAILFAST_TRANSPORT.
+	 */
+	if ((req->cmd_flags & (REQ_FAILFAST_DEV | REQ_FAILFAST_DRIVER)) ||
 	    (nvme_req(req)->status & NVME_SC_DNR) ||
 	    nvme_req(req)->retries >= nvme_max_retries)
 		return COMPLETE;
-- 
2.15.0


^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH v2 3/4] nvme: introduce FAILUP handling for REQ_FAILFAST_TRANSPORT
  2021-04-15 23:15 [PATCH v2 0/4] nvme: improve error handling and ana_state to work well with dm-multipath Mike Snitzer
  2021-04-15 23:15 ` [PATCH v2 1/4] nvme: return BLK_STS_DO_NOT_RETRY if the DNR bit is set Mike Snitzer
  2021-04-15 23:15 ` [PATCH v2 2/4] nvme: allow local retry for requests with REQ_FAILFAST_TRANSPORT set Mike Snitzer
@ 2021-04-15 23:15 ` Mike Snitzer
  2021-04-16 14:07   ` Hannes Reinecke
  2021-04-16 20:52   ` Ewan D. Milne
  2021-04-15 23:15 ` [PATCH v2 4/4] nvme: decouple basic ANA log page re-read support from native multipathing Mike Snitzer
  3 siblings, 2 replies; 17+ messages in thread
From: Mike Snitzer @ 2021-04-15 23:15 UTC (permalink / raw)
  To: Christoph Hellwig, Jens Axboe; +Cc: dm-devel, linux-block, linux-nvme

If REQ_FAILFAST_TRANSPORT is set it means the driver should not retry
IO that completed with transport errors. REQ_FAILFAST_TRANSPORT is
set by multipathing software (e.g. dm-multipath) before it issues IO.

Update NVMe to allow failover of requests marked with either
REQ_NVME_MPATH or REQ_FAILFAST_TRANSPORT. This allows such requests
to be given a disposition of either FAILOVER or FAILUP respectively.
FAILUP handling ensures a retryable error is returned up from NVMe.

Introduce nvme_failup_req() for use in nvme_complete_rq() if
nvme_decide_disposition() returns FAILUP. nvme_failup_req() ensures
the request is completed with a retryable IO error when appropriate.
__nvme_end_req() was factored out for use by both nvme_end_req() and
nvme_failup_req().

Signed-off-by: Mike Snitzer <snitzer@redhat.com>
---
 drivers/nvme/host/core.c | 31 ++++++++++++++++++++++++++-----
 1 file changed, 26 insertions(+), 5 deletions(-)

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 4134cf3c7e48..10375197dd53 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -299,6 +299,7 @@ enum nvme_disposition {
 	COMPLETE,
 	RETRY,
 	FAILOVER,
+	FAILUP,
 };
 
 static inline enum nvme_disposition nvme_decide_disposition(struct request *req)
@@ -318,10 +319,11 @@ static inline enum nvme_disposition nvme_decide_disposition(struct request *req)
 	    nvme_req(req)->retries >= nvme_max_retries)
 		return COMPLETE;
 
-	if (req->cmd_flags & REQ_NVME_MPATH) {
+	if (req->cmd_flags & (REQ_NVME_MPATH | REQ_FAILFAST_TRANSPORT)) {
 		if (nvme_is_path_error(nvme_req(req)->status) ||
 		    blk_queue_dying(req->q))
-			return FAILOVER;
+			return (req->cmd_flags & REQ_NVME_MPATH) ?
+				FAILOVER : FAILUP;
 	} else {
 		if (blk_queue_dying(req->q))
 			return COMPLETE;
@@ -330,10 +332,8 @@ static inline enum nvme_disposition nvme_decide_disposition(struct request *req)
 	return RETRY;
 }
 
-static inline void nvme_end_req(struct request *req)
+static inline void __nvme_end_req(struct request *req, blk_status_t status)
 {
-	blk_status_t status = nvme_error_status(nvme_req(req)->status);
-
 	if (IS_ENABLED(CONFIG_BLK_DEV_ZONED) &&
 	    req_op(req) == REQ_OP_ZONE_APPEND)
 		req->__sector = nvme_lba_to_sect(req->q->queuedata,
@@ -343,6 +343,24 @@ static inline void nvme_end_req(struct request *req)
 	blk_mq_end_request(req, status);
 }
 
+static inline void nvme_end_req(struct request *req)
+{
+	__nvme_end_req(req, nvme_error_status(nvme_req(req)->status));
+}
+
+static void nvme_failup_req(struct request *req)
+{
+	blk_status_t status = nvme_error_status(nvme_req(req)->status);
+
+	if (WARN_ON_ONCE(!blk_path_error(status))) {
+		pr_debug("Request meant for failover but blk_status_t (errno=%d) was not retryable.\n",
+			 blk_status_to_errno(status));
+		status = BLK_STS_IOERR;
+	}
+
+	__nvme_end_req(req, status);
+}
+
 void nvme_complete_rq(struct request *req)
 {
 	trace_nvme_complete_rq(req);
@@ -361,6 +379,9 @@ void nvme_complete_rq(struct request *req)
 	case FAILOVER:
 		nvme_failover_req(req);
 		return;
+	case FAILUP:
+		nvme_failup_req(req);
+		return;
 	}
 }
 EXPORT_SYMBOL_GPL(nvme_complete_rq);
-- 
2.15.0


^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH v2 4/4] nvme: decouple basic ANA log page re-read support from native multipathing
  2021-04-15 23:15 [PATCH v2 0/4] nvme: improve error handling and ana_state to work well with dm-multipath Mike Snitzer
                   ` (2 preceding siblings ...)
  2021-04-15 23:15 ` [PATCH v2 3/4] nvme: introduce FAILUP handling for REQ_FAILFAST_TRANSPORT Mike Snitzer
@ 2021-04-15 23:15 ` Mike Snitzer
  3 siblings, 0 replies; 17+ messages in thread
From: Mike Snitzer @ 2021-04-15 23:15 UTC (permalink / raw)
  To: Christoph Hellwig, Jens Axboe; +Cc: dm-devel, linux-block, linux-nvme

Whether or not ANA is present is a choice of the target implementation;
the host (and whether it supports multipathing) has _zero_ influence on
this. If the target declares a path as 'inaccessible' the path _is_
inaccessible to the host. As such, ANA support should be functional
even if native multipathing is not.

Introduce ability to always re-read ANA log page as required due to ANA
error and make current ANA state available via sysfs -- even if native
multipathing is disabled on the host (e.g. nvme_core.multipath=N).
This is achieved by factoring out nvme_update_ana() and calling it in
nvme_complete_rq() for all FAILOVER requests.

This affords userspace access to the current ANA state independent of
which layer might be doing multipathing. This makes 'nvme list-subsys'
show ANA state for all NVMe subsystems with multiple controllers. It
also allows userspace multipath-tools to rely on the NVMe driver for
ANA support while dm-multipath takes care of multipathing.

And as always, if embedded NVMe users do not want any performance
overhead associated with ANA or native NVMe multipathing they can
disable CONFIG_NVME_MULTIPATH.

Signed-off-by: Mike Snitzer <snitzer@redhat.com>
---
 drivers/nvme/host/core.c      |  2 ++
 drivers/nvme/host/multipath.c | 16 +++++++++++-----
 drivers/nvme/host/nvme.h      |  4 ++++
 3 files changed, 17 insertions(+), 5 deletions(-)

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 10375197dd53..1c6dc3a6c24d 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -352,6 +352,8 @@ static void nvme_failup_req(struct request *req)
 {
 	blk_status_t status = nvme_error_status(nvme_req(req)->status);
 
+	nvme_update_ana(req);
+
 	if (WARN_ON_ONCE(!blk_path_error(status))) {
 		pr_debug("Request meant for failover but blk_status_t (errno=%d) was not retryable.\n",
 			 blk_status_to_errno(status));
diff --git a/drivers/nvme/host/multipath.c b/drivers/nvme/host/multipath.c
index a1d476e1ac02..7d94250264aa 100644
--- a/drivers/nvme/host/multipath.c
+++ b/drivers/nvme/host/multipath.c
@@ -65,23 +65,29 @@ void nvme_set_disk_name(char *disk_name, struct nvme_ns *ns,
 	}
 }
 
-void nvme_failover_req(struct request *req)
+void nvme_update_ana(struct request *req)
 {
 	struct nvme_ns *ns = req->q->queuedata;
 	u16 status = nvme_req(req)->status & 0x7ff;
-	unsigned long flags;
-
-	nvme_mpath_clear_current_path(ns);
 
 	/*
 	 * If we got back an ANA error, we know the controller is alive but not
-	 * ready to serve this namespace.  Kick of a re-read of the ANA
+	 * ready to serve this namespace.  Kick off a re-read of the ANA
 	 * information page, and just try any other available path for now.
 	 */
 	if (nvme_is_ana_error(status) && ns->ctrl->ana_log_buf) {
 		set_bit(NVME_NS_ANA_PENDING, &ns->flags);
 		queue_work(nvme_wq, &ns->ctrl->ana_work);
 	}
+}
+
+void nvme_failover_req(struct request *req)
+{
+	struct nvme_ns *ns = req->q->queuedata;
+	unsigned long flags;
+
+	nvme_mpath_clear_current_path(ns);
+	nvme_update_ana(req);
 
 	spin_lock_irqsave(&ns->head->requeue_lock, flags);
 	blk_steal_bios(&ns->head->requeue_list, req);
diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
index 07b34175c6ce..4eed8536625c 100644
--- a/drivers/nvme/host/nvme.h
+++ b/drivers/nvme/host/nvme.h
@@ -664,6 +664,7 @@ void nvme_mpath_start_freeze(struct nvme_subsystem *subsys);
 void nvme_set_disk_name(char *disk_name, struct nvme_ns *ns,
 			struct nvme_ctrl *ctrl, int *flags);
 void nvme_failover_req(struct request *req);
+void nvme_update_ana(struct request *req);
 void nvme_kick_requeue_lists(struct nvme_ctrl *ctrl);
 int nvme_mpath_alloc_disk(struct nvme_ctrl *ctrl,struct nvme_ns_head *head);
 void nvme_mpath_add_disk(struct nvme_ns *ns, struct nvme_id_ns *id);
@@ -714,6 +715,9 @@ static inline void nvme_set_disk_name(char *disk_name, struct nvme_ns *ns,
 static inline void nvme_failover_req(struct request *req)
 {
 }
+static inline void nvme_update_ana(struct request *req)
+{
+}
 static inline void nvme_kick_requeue_lists(struct nvme_ctrl *ctrl)
 {
 }
-- 
2.15.0


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v2 1/4] nvme: return BLK_STS_DO_NOT_RETRY if the DNR bit is set
  2021-04-15 23:15 ` [PATCH v2 1/4] nvme: return BLK_STS_DO_NOT_RETRY if the DNR bit is set Mike Snitzer
@ 2021-04-15 23:48   ` Chaitanya Kulkarni
  2021-04-16 13:51   ` Hannes Reinecke
  1 sibling, 0 replies; 17+ messages in thread
From: Chaitanya Kulkarni @ 2021-04-15 23:48 UTC (permalink / raw)
  To: Mike Snitzer, Christoph Hellwig, Jens Axboe
  Cc: dm-devel, linux-block, linux-nvme

On 4/15/21 16:27, Mike Snitzer wrote:
> If the DNR bit is set we should not retry the command.
>
> We care about the retryable vs not retryable distinction at the block
> layer so propagate the equivalent of the DNR bit by introducing
> BLK_STS_DO_NOT_RETRY. Update blk_path_error() to _not_ retry if it
> is set.
>
> This change runs with the suggestion made here:
> https://lore.kernel.org/linux-nvme/20190813170144.GA10269@lst.de/
>
> Suggested-by: Christoph Hellwig <hch@lst.de>
> Signed-off-by: Mike Snitzer <snitzer@redhat.com>

Looks good.

Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>



^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v2 1/4] nvme: return BLK_STS_DO_NOT_RETRY if the DNR bit is set
  2021-04-15 23:15 ` [PATCH v2 1/4] nvme: return BLK_STS_DO_NOT_RETRY if the DNR bit is set Mike Snitzer
  2021-04-15 23:48   ` Chaitanya Kulkarni
@ 2021-04-16 13:51   ` Hannes Reinecke
  1 sibling, 0 replies; 17+ messages in thread
From: Hannes Reinecke @ 2021-04-16 13:51 UTC (permalink / raw)
  To: Mike Snitzer, Christoph Hellwig, Jens Axboe
  Cc: dm-devel, linux-block, linux-nvme

On 4/16/21 1:15 AM, Mike Snitzer wrote:
> If the DNR bit is set we should not retry the command.
> 
> We care about the retryable vs not retryable distinction at the block
> layer so propagate the equivalent of the DNR bit by introducing
> BLK_STS_DO_NOT_RETRY. Update blk_path_error() to _not_ retry if it
> is set.
> 
> This change runs with the suggestion made here:
> https://lore.kernel.org/linux-nvme/20190813170144.GA10269@lst.de/
> 
> Suggested-by: Christoph Hellwig <hch@lst.de>
> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
> ---
>  drivers/nvme/host/core.c  | 3 +++
>  include/linux/blk_types.h | 8 ++++++++
>  2 files changed, 11 insertions(+)
> 
> diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
> index 0896e21642be..540d6fd8ffef 100644
> --- a/drivers/nvme/host/core.c
> +++ b/drivers/nvme/host/core.c
> @@ -237,6 +237,9 @@ static void nvme_delete_ctrl_sync(struct nvme_ctrl *ctrl)
>  
>  static blk_status_t nvme_error_status(u16 status)
>  {
> +	if (unlikely(status & NVME_SC_DNR))
> +		return BLK_STS_DO_NOT_RETRY;
> +
>  	switch (status & 0x7ff) {
>  	case NVME_SC_SUCCESS:
>  		return BLK_STS_OK;
> diff --git a/include/linux/blk_types.h b/include/linux/blk_types.h
> index db026b6ec15a..1ca724948c56 100644
> --- a/include/linux/blk_types.h
> +++ b/include/linux/blk_types.h
> @@ -142,6 +142,13 @@ typedef u8 __bitwise blk_status_t;
>   */
>  #define BLK_STS_ZONE_ACTIVE_RESOURCE	((__force blk_status_t)16)
>  
> +/*
> + * BLK_STS_DO_NOT_RETRY is returned from the driver in the completion path
> + * if the device returns a status indicating that if the same command is
> + * re-submitted it is expected to fail.
> + */
> +#define BLK_STS_DO_NOT_RETRY	((__force blk_status_t)17)
> +
>  /**
>   * blk_path_error - returns true if error may be path related
>   * @error: status the request was completed with
> @@ -157,6 +164,7 @@ typedef u8 __bitwise blk_status_t;
>  static inline bool blk_path_error(blk_status_t error)
>  {
>  	switch (error) {
> +	case BLK_STS_DO_NOT_RETRY:
>  	case BLK_STS_NOTSUPP:
>  	case BLK_STS_NOSPC:
>  	case BLK_STS_TARGET:
> 
Reviewed-by: Hannes Reinecke <hare@suse.de>

Hannes
-- 
Dr. Hannes Reinecke		        Kernel Storage Architect
hare@suse.de			               +49 911 74053 688
SUSE Software Solutions Germany GmbH, 90409 Nürnberg
GF: F. Imendörffer, HRB 36809 (AG Nürnberg)

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v2 2/4] nvme: allow local retry for requests with REQ_FAILFAST_TRANSPORT set
  2021-04-15 23:15 ` [PATCH v2 2/4] nvme: allow local retry for requests with REQ_FAILFAST_TRANSPORT set Mike Snitzer
@ 2021-04-16 14:01   ` Hannes Reinecke
  2021-04-16 14:53     ` Mike Snitzer
  0 siblings, 1 reply; 17+ messages in thread
From: Hannes Reinecke @ 2021-04-16 14:01 UTC (permalink / raw)
  To: Mike Snitzer, Christoph Hellwig, Jens Axboe
  Cc: dm-devel, linux-block, linux-nvme, Chao Leng

On 4/16/21 1:15 AM, Mike Snitzer wrote:
> From: Chao Leng <lengchao@huawei.com>
> 
> REQ_FAILFAST_TRANSPORT was designed for SCSI, because the SCSI protocol
> does not define the local retry mechanism. SCSI implements a fuzzy
> local retry mechanism, so REQ_FAILFAST_TRANSPORT is needed to allow
> higher-level multipathing software to perform failover/retry.
> 
> NVMe is different with SCSI about this. It defines a local retry
> mechanism and path error codes, so NVMe should retry local for non
> path error. If path related error, whether to retry and how to retry
> is still determined by higher-level multipathing's failover.
> 
> Unlike SCSI, NVMe shouldn't prevent retry if REQ_FAILFAST_TRANSPORT
> because NVMe's local retry is needed -- as is NVMe specific logic to
> categorize whether an error is path related.
> 
> In this way, the mechanism of NVMe multipath or other multipath are
> now equivalent. The mechanism is: non path related error will be
> retried locally, path related error is handled by multipath.
> 
> Signed-off-by: Chao Leng <lengchao@huawei.com>
> [snitzer: edited header for grammar and clarity, also added code comment]
> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
> ---
>  drivers/nvme/host/core.c | 9 ++++++++-
>  1 file changed, 8 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
> index 540d6fd8ffef..4134cf3c7e48 100644
> --- a/drivers/nvme/host/core.c
> +++ b/drivers/nvme/host/core.c
> @@ -306,7 +306,14 @@ static inline enum nvme_disposition nvme_decide_disposition(struct request *req)
>  	if (likely(nvme_req(req)->status == 0))
>  		return COMPLETE;
>  
> -	if (blk_noretry_request(req) ||
> +	/*
> +	 * REQ_FAILFAST_TRANSPORT is set by upper layer software that
> +	 * handles multipathing. Unlike SCSI, NVMe's error handling was
> +	 * specifically designed to handle local retry for non-path errors.
> +	 * As such, allow NVMe's local retry mechanism to be used for
> +	 * requests marked with REQ_FAILFAST_TRANSPORT.
> +	 */
> +	if ((req->cmd_flags & (REQ_FAILFAST_DEV | REQ_FAILFAST_DRIVER)) ||
>  	    (nvme_req(req)->status & NVME_SC_DNR) ||
>  	    nvme_req(req)->retries >= nvme_max_retries)
>  		return COMPLETE;
> 
Huh?

#define blk_noretry_request(rq) \
        ((rq)->cmd_flags & (REQ_FAILFAST_DEV|REQ_FAILFAST_TRANSPORT| \
                             REQ_FAILFAST_DRIVER))

making the only _actual_ change in your patch _not_ evaluating the
REQ_FAILFAST_DRIVER, which incidentally is only used by the NVMe core.
So what is it you're trying to solve?

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		        Kernel Storage Architect
hare@suse.de			               +49 911 74053 688
SUSE Software Solutions Germany GmbH, 90409 Nürnberg
GF: F. Imendörffer, HRB 36809 (AG Nürnberg)

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v2 3/4] nvme: introduce FAILUP handling for REQ_FAILFAST_TRANSPORT
  2021-04-15 23:15 ` [PATCH v2 3/4] nvme: introduce FAILUP handling for REQ_FAILFAST_TRANSPORT Mike Snitzer
@ 2021-04-16 14:07   ` Hannes Reinecke
  2021-04-16 15:03     ` Mike Snitzer
  2021-04-16 20:03     ` [dm-devel] " Ewan D. Milne
  2021-04-16 20:52   ` Ewan D. Milne
  1 sibling, 2 replies; 17+ messages in thread
From: Hannes Reinecke @ 2021-04-16 14:07 UTC (permalink / raw)
  To: Mike Snitzer, Christoph Hellwig, Jens Axboe
  Cc: dm-devel, linux-block, linux-nvme

On 4/16/21 1:15 AM, Mike Snitzer wrote:
> If REQ_FAILFAST_TRANSPORT is set it means the driver should not retry
> IO that completed with transport errors. REQ_FAILFAST_TRANSPORT is
> set by multipathing software (e.g. dm-multipath) before it issues IO.
> 
> Update NVMe to allow failover of requests marked with either
> REQ_NVME_MPATH or REQ_FAILFAST_TRANSPORT. This allows such requests
> to be given a disposition of either FAILOVER or FAILUP respectively.
> FAILUP handling ensures a retryable error is returned up from NVMe.
> 
> Introduce nvme_failup_req() for use in nvme_complete_rq() if
> nvme_decide_disposition() returns FAILUP. nvme_failup_req() ensures
> the request is completed with a retryable IO error when appropriate.
> __nvme_end_req() was factored out for use by both nvme_end_req() and
> nvme_failup_req().
> 
> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
> ---
>  drivers/nvme/host/core.c | 31 ++++++++++++++++++++++++++-----
>  1 file changed, 26 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
> index 4134cf3c7e48..10375197dd53 100644
> --- a/drivers/nvme/host/core.c
> +++ b/drivers/nvme/host/core.c
> @@ -299,6 +299,7 @@ enum nvme_disposition {
>  	COMPLETE,
>  	RETRY,
>  	FAILOVER,
> +	FAILUP,
>  };
>  
>  static inline enum nvme_disposition nvme_decide_disposition(struct request *req)
> @@ -318,10 +319,11 @@ static inline enum nvme_disposition nvme_decide_disposition(struct request *req)
>  	    nvme_req(req)->retries >= nvme_max_retries)
>  		return COMPLETE;
>  
> -	if (req->cmd_flags & REQ_NVME_MPATH) {
> +	if (req->cmd_flags & (REQ_NVME_MPATH | REQ_FAILFAST_TRANSPORT)) {
>  		if (nvme_is_path_error(nvme_req(req)->status) ||
>  		    blk_queue_dying(req->q))
> -			return FAILOVER;
> +			return (req->cmd_flags & REQ_NVME_MPATH) ?
> +				FAILOVER : FAILUP;
>  	} else {
>  		if (blk_queue_dying(req->q))
>  			return COMPLETE;
> @@ -330,10 +332,8 @@ static inline enum nvme_disposition nvme_decide_disposition(struct request *req)
>  	return RETRY;
>  }
>  
> -static inline void nvme_end_req(struct request *req)
> +static inline void __nvme_end_req(struct request *req, blk_status_t status)
>  {
> -	blk_status_t status = nvme_error_status(nvme_req(req)->status);
> -
>  	if (IS_ENABLED(CONFIG_BLK_DEV_ZONED) &&
>  	    req_op(req) == REQ_OP_ZONE_APPEND)
>  		req->__sector = nvme_lba_to_sect(req->q->queuedata,
> @@ -343,6 +343,24 @@ static inline void nvme_end_req(struct request *req)
>  	blk_mq_end_request(req, status);
>  }
>  
> +static inline void nvme_end_req(struct request *req)
> +{
> +	__nvme_end_req(req, nvme_error_status(nvme_req(req)->status));
> +}
> +
> +static void nvme_failup_req(struct request *req)
> +{
> +	blk_status_t status = nvme_error_status(nvme_req(req)->status);
> +
> +	if (WARN_ON_ONCE(!blk_path_error(status))) {
> +		pr_debug("Request meant for failover but blk_status_t (errno=%d) was not retryable.\n",
> +			 blk_status_to_errno(status));
> +		status = BLK_STS_IOERR;
> +	}
> +
> +	__nvme_end_req(req, status);
> +}
> +
>  void nvme_complete_rq(struct request *req)
>  {
>  	trace_nvme_complete_rq(req);
> @@ -361,6 +379,9 @@ void nvme_complete_rq(struct request *req)
>  	case FAILOVER:
>  		nvme_failover_req(req);
>  		return;
> +	case FAILUP:
> +		nvme_failup_req(req);
> +		return;
>  	}
>  }
>  EXPORT_SYMBOL_GPL(nvme_complete_rq);
> 

Hmm. Quite convoluted, methinks.
Shouldn't this achieve the same thing?

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index e89ec2522ca6..8c36a2196b66 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -303,8 +303,10 @@ static inline enum nvme_disposition
nvme_decide_disposition(struct request *req)
        if (likely(nvme_req(req)->status == 0))
                return COMPLETE;

-       if (blk_noretry_request(req) ||
-           (nvme_req(req)->status & NVME_SC_DNR) ||
+       if (blk_noretry_request(req))
+               nvme_req(req)->status |= NVME_SC_DNR;
+
+       if ((nvme_req(req)->status & NVME_SC_DNR) ||
            nvme_req(req)->retries >= nvme_max_retries)
                return COMPLETE;


Cheers,

Hannes
-- 
Dr. Hannes Reinecke		        Kernel Storage Architect
hare@suse.de			               +49 911 74053 688
SUSE Software Solutions Germany GmbH, 90409 Nürnberg
GF: F. Imendörffer, HRB 36809 (AG Nürnberg)

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v2 2/4] nvme: allow local retry for requests with REQ_FAILFAST_TRANSPORT set
  2021-04-16 14:01   ` Hannes Reinecke
@ 2021-04-16 14:53     ` Mike Snitzer
  2021-04-16 15:20       ` Hannes Reinecke
  0 siblings, 1 reply; 17+ messages in thread
From: Mike Snitzer @ 2021-04-16 14:53 UTC (permalink / raw)
  To: Hannes Reinecke
  Cc: Christoph Hellwig, Jens Axboe, dm-devel, linux-block, linux-nvme,
	Chao Leng

On Fri, Apr 16 2021 at 10:01am -0400,
Hannes Reinecke <hare@suse.de> wrote:

> On 4/16/21 1:15 AM, Mike Snitzer wrote:
> > From: Chao Leng <lengchao@huawei.com>
> > 
> > REQ_FAILFAST_TRANSPORT was designed for SCSI, because the SCSI protocol
> > does not define the local retry mechanism. SCSI implements a fuzzy
> > local retry mechanism, so REQ_FAILFAST_TRANSPORT is needed to allow
> > higher-level multipathing software to perform failover/retry.
> > 
> > NVMe is different with SCSI about this. It defines a local retry
> > mechanism and path error codes, so NVMe should retry local for non
> > path error. If path related error, whether to retry and how to retry
> > is still determined by higher-level multipathing's failover.
> > 
> > Unlike SCSI, NVMe shouldn't prevent retry if REQ_FAILFAST_TRANSPORT
> > because NVMe's local retry is needed -- as is NVMe specific logic to
> > categorize whether an error is path related.
> > 
> > In this way, the mechanism of NVMe multipath or other multipath are
> > now equivalent. The mechanism is: non path related error will be
> > retried locally, path related error is handled by multipath.
> > 
> > Signed-off-by: Chao Leng <lengchao@huawei.com>
> > [snitzer: edited header for grammar and clarity, also added code comment]
> > Signed-off-by: Mike Snitzer <snitzer@redhat.com>
> > ---
> >  drivers/nvme/host/core.c | 9 ++++++++-
> >  1 file changed, 8 insertions(+), 1 deletion(-)
> > 
> > diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
> > index 540d6fd8ffef..4134cf3c7e48 100644
> > --- a/drivers/nvme/host/core.c
> > +++ b/drivers/nvme/host/core.c
> > @@ -306,7 +306,14 @@ static inline enum nvme_disposition nvme_decide_disposition(struct request *req)
> >  	if (likely(nvme_req(req)->status == 0))
> >  		return COMPLETE;
> >  
> > -	if (blk_noretry_request(req) ||
> > +	/*
> > +	 * REQ_FAILFAST_TRANSPORT is set by upper layer software that
> > +	 * handles multipathing. Unlike SCSI, NVMe's error handling was
> > +	 * specifically designed to handle local retry for non-path errors.
> > +	 * As such, allow NVMe's local retry mechanism to be used for
> > +	 * requests marked with REQ_FAILFAST_TRANSPORT.
> > +	 */
> > +	if ((req->cmd_flags & (REQ_FAILFAST_DEV | REQ_FAILFAST_DRIVER)) ||
> >  	    (nvme_req(req)->status & NVME_SC_DNR) ||
> >  	    nvme_req(req)->retries >= nvme_max_retries)
> >  		return COMPLETE;
> > 
> Huh?
> 
> #define blk_noretry_request(rq) \
>         ((rq)->cmd_flags & (REQ_FAILFAST_DEV|REQ_FAILFAST_TRANSPORT| \
>                              REQ_FAILFAST_DRIVER))
> 
> making the only _actual_ change in your patch _not_ evaluating the
> REQ_FAILFAST_DRIVER, which incidentally is only used by the NVMe core.

No, not sure how you got there. I'd have thought the 5 references to
"REQ_FAILFAST_TRANSPORT" would've been sufficient ;)

This patch makes it so requests marked with REQ_FAILFAST_TRANSPORT are
allowed to use NVMe's local retry (that is required for non-transport
errors).

> So what is it you're trying to solve?

What the patch header, code and code comment detail.

Mike


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v2 3/4] nvme: introduce FAILUP handling for REQ_FAILFAST_TRANSPORT
  2021-04-16 14:07   ` Hannes Reinecke
@ 2021-04-16 15:03     ` Mike Snitzer
  2021-04-16 16:23       ` Hannes Reinecke
  2021-04-16 20:03     ` [dm-devel] " Ewan D. Milne
  1 sibling, 1 reply; 17+ messages in thread
From: Mike Snitzer @ 2021-04-16 15:03 UTC (permalink / raw)
  To: Hannes Reinecke
  Cc: Christoph Hellwig, Jens Axboe, dm-devel, linux-block, linux-nvme

On Fri, Apr 16 2021 at 10:07am -0400,
Hannes Reinecke <hare@suse.de> wrote:

> On 4/16/21 1:15 AM, Mike Snitzer wrote:
> > If REQ_FAILFAST_TRANSPORT is set it means the driver should not retry
> > IO that completed with transport errors. REQ_FAILFAST_TRANSPORT is
> > set by multipathing software (e.g. dm-multipath) before it issues IO.
> > 
> > Update NVMe to allow failover of requests marked with either
> > REQ_NVME_MPATH or REQ_FAILFAST_TRANSPORT. This allows such requests
> > to be given a disposition of either FAILOVER or FAILUP respectively.
> > FAILUP handling ensures a retryable error is returned up from NVMe.
> > 
> > Introduce nvme_failup_req() for use in nvme_complete_rq() if
> > nvme_decide_disposition() returns FAILUP. nvme_failup_req() ensures
> > the request is completed with a retryable IO error when appropriate.
> > __nvme_end_req() was factored out for use by both nvme_end_req() and
> > nvme_failup_req().
> > 
> > Signed-off-by: Mike Snitzer <snitzer@redhat.com>
> > ---
> >  drivers/nvme/host/core.c | 31 ++++++++++++++++++++++++++-----
> >  1 file changed, 26 insertions(+), 5 deletions(-)
> > 
> > diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
> > index 4134cf3c7e48..10375197dd53 100644
> > --- a/drivers/nvme/host/core.c
> > +++ b/drivers/nvme/host/core.c
> > @@ -299,6 +299,7 @@ enum nvme_disposition {
> >  	COMPLETE,
> >  	RETRY,
> >  	FAILOVER,
> > +	FAILUP,
> >  };
> >  
> >  static inline enum nvme_disposition nvme_decide_disposition(struct request *req)
> > @@ -318,10 +319,11 @@ static inline enum nvme_disposition nvme_decide_disposition(struct request *req)
> >  	    nvme_req(req)->retries >= nvme_max_retries)
> >  		return COMPLETE;
> >  
> > -	if (req->cmd_flags & REQ_NVME_MPATH) {
> > +	if (req->cmd_flags & (REQ_NVME_MPATH | REQ_FAILFAST_TRANSPORT)) {
> >  		if (nvme_is_path_error(nvme_req(req)->status) ||
> >  		    blk_queue_dying(req->q))
> > -			return FAILOVER;
> > +			return (req->cmd_flags & REQ_NVME_MPATH) ?
> > +				FAILOVER : FAILUP;
> >  	} else {
> >  		if (blk_queue_dying(req->q))
> >  			return COMPLETE;
> > @@ -330,10 +332,8 @@ static inline enum nvme_disposition nvme_decide_disposition(struct request *req)
> >  	return RETRY;
> >  }
> >  
> > -static inline void nvme_end_req(struct request *req)
> > +static inline void __nvme_end_req(struct request *req, blk_status_t status)
> >  {
> > -	blk_status_t status = nvme_error_status(nvme_req(req)->status);
> > -
> >  	if (IS_ENABLED(CONFIG_BLK_DEV_ZONED) &&
> >  	    req_op(req) == REQ_OP_ZONE_APPEND)
> >  		req->__sector = nvme_lba_to_sect(req->q->queuedata,
> > @@ -343,6 +343,24 @@ static inline void nvme_end_req(struct request *req)
> >  	blk_mq_end_request(req, status);
> >  }
> >  
> > +static inline void nvme_end_req(struct request *req)
> > +{
> > +	__nvme_end_req(req, nvme_error_status(nvme_req(req)->status));
> > +}
> > +
> > +static void nvme_failup_req(struct request *req)
> > +{
> > +	blk_status_t status = nvme_error_status(nvme_req(req)->status);
> > +
> > +	if (WARN_ON_ONCE(!blk_path_error(status))) {
> > +		pr_debug("Request meant for failover but blk_status_t (errno=%d) was not retryable.\n",
> > +			 blk_status_to_errno(status));
> > +		status = BLK_STS_IOERR;
> > +	}
> > +
> > +	__nvme_end_req(req, status);
> > +}
> > +
> >  void nvme_complete_rq(struct request *req)
> >  {
> >  	trace_nvme_complete_rq(req);
> > @@ -361,6 +379,9 @@ void nvme_complete_rq(struct request *req)
> >  	case FAILOVER:
> >  		nvme_failover_req(req);
> >  		return;
> > +	case FAILUP:
> > +		nvme_failup_req(req);
> > +		return;
> >  	}
> >  }
> >  EXPORT_SYMBOL_GPL(nvme_complete_rq);
> > 
> 
> Hmm. Quite convoluted, methinks.

Maybe you didn't read the header or patch?

I'm cool with critical review when it is clear the reviewer fully
understands the patch but... ;)

> Shouldn't this achieve the same thing?
> 
> diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
> index e89ec2522ca6..8c36a2196b66 100644
> --- a/drivers/nvme/host/core.c
> +++ b/drivers/nvme/host/core.c
> @@ -303,8 +303,10 @@ static inline enum nvme_disposition
> nvme_decide_disposition(struct request *req)
>         if (likely(nvme_req(req)->status == 0))
>                 return COMPLETE;
> 
> -       if (blk_noretry_request(req) ||
> -           (nvme_req(req)->status & NVME_SC_DNR) ||
> +       if (blk_noretry_request(req))
> +               nvme_req(req)->status |= NVME_SC_DNR;
> +
> +       if ((nvme_req(req)->status & NVME_SC_DNR) ||
>             nvme_req(req)->retries >= nvme_max_retries)
>                 return COMPLETE;

Definitely won't achieve the same. And especially not with patch 1/4
("nvme: return BLK_STS_DO_NOT_RETRY if the DNR bit is set") that you
gave your Reviewed-by to earlier.

Instead of "FAILUP", I thought about using "FAILUP_AND_OVER" to convey
that this is a variant of failover.  Meaning it takes the same patch as
nvme "FAILOVER" until the very end; where it does REQ_FAILFAST_TRANSPORT
specific work detailed in nvme_failup_req().

And then patch 4/4 makes further use of nvme_failup_req() by adding a
call to the factored out nvme_update_ana().

Mike


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v2 2/4] nvme: allow local retry for requests with REQ_FAILFAST_TRANSPORT set
  2021-04-16 14:53     ` Mike Snitzer
@ 2021-04-16 15:20       ` Hannes Reinecke
  2021-04-16 15:32         ` Mike Snitzer
  0 siblings, 1 reply; 17+ messages in thread
From: Hannes Reinecke @ 2021-04-16 15:20 UTC (permalink / raw)
  To: Mike Snitzer
  Cc: Christoph Hellwig, Jens Axboe, dm-devel, linux-block, linux-nvme,
	Chao Leng

On 4/16/21 4:53 PM, Mike Snitzer wrote:
> On Fri, Apr 16 2021 at 10:01am -0400,
> Hannes Reinecke <hare@suse.de> wrote:
> 
>> On 4/16/21 1:15 AM, Mike Snitzer wrote:
>>> From: Chao Leng <lengchao@huawei.com>
>>>
>>> REQ_FAILFAST_TRANSPORT was designed for SCSI, because the SCSI protocol
>>> does not define the local retry mechanism. SCSI implements a fuzzy
>>> local retry mechanism, so REQ_FAILFAST_TRANSPORT is needed to allow
>>> higher-level multipathing software to perform failover/retry.
>>>
>>> NVMe is different with SCSI about this. It defines a local retry
>>> mechanism and path error codes, so NVMe should retry local for non
>>> path error. If path related error, whether to retry and how to retry
>>> is still determined by higher-level multipathing's failover.
>>>
>>> Unlike SCSI, NVMe shouldn't prevent retry if REQ_FAILFAST_TRANSPORT
>>> because NVMe's local retry is needed -- as is NVMe specific logic to
>>> categorize whether an error is path related.
>>>
>>> In this way, the mechanism of NVMe multipath or other multipath are
>>> now equivalent. The mechanism is: non path related error will be
>>> retried locally, path related error is handled by multipath.
>>>
>>> Signed-off-by: Chao Leng <lengchao@huawei.com>
>>> [snitzer: edited header for grammar and clarity, also added code comment]
>>> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
>>> ---
>>>  drivers/nvme/host/core.c | 9 ++++++++-
>>>  1 file changed, 8 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
>>> index 540d6fd8ffef..4134cf3c7e48 100644
>>> --- a/drivers/nvme/host/core.c
>>> +++ b/drivers/nvme/host/core.c
>>> @@ -306,7 +306,14 @@ static inline enum nvme_disposition nvme_decide_disposition(struct request *req)
>>>  	if (likely(nvme_req(req)->status == 0))
>>>  		return COMPLETE;
>>>  
>>> -	if (blk_noretry_request(req) ||
>>> +	/*
>>> +	 * REQ_FAILFAST_TRANSPORT is set by upper layer software that
>>> +	 * handles multipathing. Unlike SCSI, NVMe's error handling was
>>> +	 * specifically designed to handle local retry for non-path errors.
>>> +	 * As such, allow NVMe's local retry mechanism to be used for
>>> +	 * requests marked with REQ_FAILFAST_TRANSPORT.
>>> +	 */
>>> +	if ((req->cmd_flags & (REQ_FAILFAST_DEV | REQ_FAILFAST_DRIVER)) ||
>>>  	    (nvme_req(req)->status & NVME_SC_DNR) ||
>>>  	    nvme_req(req)->retries >= nvme_max_retries)
>>>  		return COMPLETE;
>>>
>> Huh?
>>
>> #define blk_noretry_request(rq) \
>>         ((rq)->cmd_flags & (REQ_FAILFAST_DEV|REQ_FAILFAST_TRANSPORT| \
>>                              REQ_FAILFAST_DRIVER))
>>
>> making the only _actual_ change in your patch _not_ evaluating the
>> REQ_FAILFAST_DRIVER, which incidentally is only used by the NVMe core.
> 
> No, not sure how you got there. I'd have thought the 5 references to
> "REQ_FAILFAST_TRANSPORT" would've been sufficient ;)
> 

Ah. Misread stuff. You're excluding the REQ_FAILFAST_TRANSPORT here.
But then it's _actually_ similar to the next patch (which I've also
commented).

Wouldn't it be better to fold them into one patch and discuss things
together; especially as my comment to the next one might actually
achieve the same thing?

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		        Kernel Storage Architect
hare@suse.de			               +49 911 74053 688
SUSE Software Solutions Germany GmbH, 90409 Nürnberg
GF: F. Imendörffer, HRB 36809 (AG Nürnberg)

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v2 2/4] nvme: allow local retry for requests with REQ_FAILFAST_TRANSPORT set
  2021-04-16 15:20       ` Hannes Reinecke
@ 2021-04-16 15:32         ` Mike Snitzer
  0 siblings, 0 replies; 17+ messages in thread
From: Mike Snitzer @ 2021-04-16 15:32 UTC (permalink / raw)
  To: Hannes Reinecke
  Cc: Christoph Hellwig, Jens Axboe, dm-devel, linux-block, linux-nvme,
	Chao Leng

On Fri, Apr 16 2021 at 11:20am -0400,
Hannes Reinecke <hare@suse.de> wrote:

> On 4/16/21 4:53 PM, Mike Snitzer wrote:
> > On Fri, Apr 16 2021 at 10:01am -0400,
> > Hannes Reinecke <hare@suse.de> wrote:
> > 
> >> On 4/16/21 1:15 AM, Mike Snitzer wrote:
> >>> From: Chao Leng <lengchao@huawei.com>
> >>>
> >>> REQ_FAILFAST_TRANSPORT was designed for SCSI, because the SCSI protocol
> >>> does not define the local retry mechanism. SCSI implements a fuzzy
> >>> local retry mechanism, so REQ_FAILFAST_TRANSPORT is needed to allow
> >>> higher-level multipathing software to perform failover/retry.
> >>>
> >>> NVMe is different with SCSI about this. It defines a local retry
> >>> mechanism and path error codes, so NVMe should retry local for non
> >>> path error. If path related error, whether to retry and how to retry
> >>> is still determined by higher-level multipathing's failover.
> >>>
> >>> Unlike SCSI, NVMe shouldn't prevent retry if REQ_FAILFAST_TRANSPORT
> >>> because NVMe's local retry is needed -- as is NVMe specific logic to
> >>> categorize whether an error is path related.
> >>>
> >>> In this way, the mechanism of NVMe multipath or other multipath are
> >>> now equivalent. The mechanism is: non path related error will be
> >>> retried locally, path related error is handled by multipath.
> >>>
> >>> Signed-off-by: Chao Leng <lengchao@huawei.com>
> >>> [snitzer: edited header for grammar and clarity, also added code comment]
> >>> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
> >>> ---
> >>>  drivers/nvme/host/core.c | 9 ++++++++-
> >>>  1 file changed, 8 insertions(+), 1 deletion(-)
> >>>
> >>> diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
> >>> index 540d6fd8ffef..4134cf3c7e48 100644
> >>> --- a/drivers/nvme/host/core.c
> >>> +++ b/drivers/nvme/host/core.c
> >>> @@ -306,7 +306,14 @@ static inline enum nvme_disposition nvme_decide_disposition(struct request *req)
> >>>  	if (likely(nvme_req(req)->status == 0))
> >>>  		return COMPLETE;
> >>>  
> >>> -	if (blk_noretry_request(req) ||
> >>> +	/*
> >>> +	 * REQ_FAILFAST_TRANSPORT is set by upper layer software that
> >>> +	 * handles multipathing. Unlike SCSI, NVMe's error handling was
> >>> +	 * specifically designed to handle local retry for non-path errors.
> >>> +	 * As such, allow NVMe's local retry mechanism to be used for
> >>> +	 * requests marked with REQ_FAILFAST_TRANSPORT.
> >>> +	 */
> >>> +	if ((req->cmd_flags & (REQ_FAILFAST_DEV | REQ_FAILFAST_DRIVER)) ||
> >>>  	    (nvme_req(req)->status & NVME_SC_DNR) ||
> >>>  	    nvme_req(req)->retries >= nvme_max_retries)
> >>>  		return COMPLETE;
> >>>
> >> Huh?
> >>
> >> #define blk_noretry_request(rq) \
> >>         ((rq)->cmd_flags & (REQ_FAILFAST_DEV|REQ_FAILFAST_TRANSPORT| \
> >>                              REQ_FAILFAST_DRIVER))
> >>
> >> making the only _actual_ change in your patch _not_ evaluating the
> >> REQ_FAILFAST_DRIVER, which incidentally is only used by the NVMe core.
> > 
> > No, not sure how you got there. I'd have thought the 5 references to
> > "REQ_FAILFAST_TRANSPORT" would've been sufficient ;)
> > 
> 
> Ah. Misread stuff. You're excluding the REQ_FAILFAST_TRANSPORT here.
> But then it's _actually_ similar to the next patch (which I've also
> commented).
> 
> Wouldn't it be better to fold them into one patch and discuss things
> together; especially as my comment to the next one might actually
> achieve the same thing?

2 discrete things. This patch enables local retry.
Patch 3 allows proper failover via upper layer multipathing.

And as I replied, your suggestion about using DNR doesn't achieve the
same thing (said as much in reply to the patch 3 thread).

Mike


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v2 3/4] nvme: introduce FAILUP handling for REQ_FAILFAST_TRANSPORT
  2021-04-16 15:03     ` Mike Snitzer
@ 2021-04-16 16:23       ` Hannes Reinecke
  0 siblings, 0 replies; 17+ messages in thread
From: Hannes Reinecke @ 2021-04-16 16:23 UTC (permalink / raw)
  To: Mike Snitzer
  Cc: Christoph Hellwig, Jens Axboe, dm-devel, linux-block, linux-nvme

On 4/16/21 5:03 PM, Mike Snitzer wrote:
> On Fri, Apr 16 2021 at 10:07am -0400,
> Hannes Reinecke <hare@suse.de> wrote:
> 
>> On 4/16/21 1:15 AM, Mike Snitzer wrote:
>>> If REQ_FAILFAST_TRANSPORT is set it means the driver should not retry
>>> IO that completed with transport errors. REQ_FAILFAST_TRANSPORT is
>>> set by multipathing software (e.g. dm-multipath) before it issues IO.
>>>
>>> Update NVMe to allow failover of requests marked with either
>>> REQ_NVME_MPATH or REQ_FAILFAST_TRANSPORT. This allows such requests
>>> to be given a disposition of either FAILOVER or FAILUP respectively.
>>> FAILUP handling ensures a retryable error is returned up from NVMe.
>>>
>>> Introduce nvme_failup_req() for use in nvme_complete_rq() if
>>> nvme_decide_disposition() returns FAILUP. nvme_failup_req() ensures
>>> the request is completed with a retryable IO error when appropriate.
>>> __nvme_end_req() was factored out for use by both nvme_end_req() and
>>> nvme_failup_req().
>>>
>>> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
>>> ---
>>>  drivers/nvme/host/core.c | 31 ++++++++++++++++++++++++++-----
>>>  1 file changed, 26 insertions(+), 5 deletions(-)
>>>
>>> diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
>>> index 4134cf3c7e48..10375197dd53 100644
>>> --- a/drivers/nvme/host/core.c
>>> +++ b/drivers/nvme/host/core.c
>>> @@ -299,6 +299,7 @@ enum nvme_disposition {
>>>  	COMPLETE,
>>>  	RETRY,
>>>  	FAILOVER,
>>> +	FAILUP,
>>>  };
>>>  
>>>  static inline enum nvme_disposition nvme_decide_disposition(struct request *req)
>>> @@ -318,10 +319,11 @@ static inline enum nvme_disposition nvme_decide_disposition(struct request *req)
>>>  	    nvme_req(req)->retries >= nvme_max_retries)
>>>  		return COMPLETE;
>>>  
>>> -	if (req->cmd_flags & REQ_NVME_MPATH) {
>>> +	if (req->cmd_flags & (REQ_NVME_MPATH | REQ_FAILFAST_TRANSPORT)) {
>>>  		if (nvme_is_path_error(nvme_req(req)->status) ||
>>>  		    blk_queue_dying(req->q))
>>> -			return FAILOVER;
>>> +			return (req->cmd_flags & REQ_NVME_MPATH) ?
>>> +				FAILOVER : FAILUP;
>>>  	} else {
>>>  		if (blk_queue_dying(req->q))
>>>  			return COMPLETE;
>>> @@ -330,10 +332,8 @@ static inline enum nvme_disposition nvme_decide_disposition(struct request *req)
>>>  	return RETRY;
>>>  }
>>>  
>>> -static inline void nvme_end_req(struct request *req)
>>> +static inline void __nvme_end_req(struct request *req, blk_status_t status)
>>>  {
>>> -	blk_status_t status = nvme_error_status(nvme_req(req)->status);
>>> -
>>>  	if (IS_ENABLED(CONFIG_BLK_DEV_ZONED) &&
>>>  	    req_op(req) == REQ_OP_ZONE_APPEND)
>>>  		req->__sector = nvme_lba_to_sect(req->q->queuedata,
>>> @@ -343,6 +343,24 @@ static inline void nvme_end_req(struct request *req)
>>>  	blk_mq_end_request(req, status);
>>>  }
>>>  
>>> +static inline void nvme_end_req(struct request *req)
>>> +{
>>> +	__nvme_end_req(req, nvme_error_status(nvme_req(req)->status));
>>> +}
>>> +
>>> +static void nvme_failup_req(struct request *req)
>>> +{
>>> +	blk_status_t status = nvme_error_status(nvme_req(req)->status);
>>> +
>>> +	if (WARN_ON_ONCE(!blk_path_error(status))) {
>>> +		pr_debug("Request meant for failover but blk_status_t (errno=%d) was not retryable.\n",
>>> +			 blk_status_to_errno(status));
>>> +		status = BLK_STS_IOERR;
>>> +	}
>>> +
>>> +	__nvme_end_req(req, status);
>>> +}
>>> +
>>>  void nvme_complete_rq(struct request *req)
>>>  {
>>>  	trace_nvme_complete_rq(req);
>>> @@ -361,6 +379,9 @@ void nvme_complete_rq(struct request *req)
>>>  	case FAILOVER:
>>>  		nvme_failover_req(req);
>>>  		return;
>>> +	case FAILUP:
>>> +		nvme_failup_req(req);
>>> +		return;
>>>  	}
>>>  }
>>>  EXPORT_SYMBOL_GPL(nvme_complete_rq);
>>>
>>
>> Hmm. Quite convoluted, methinks.
> 
> Maybe you didn't read the header or patch?
> 
> I'm cool with critical review when it is clear the reviewer fully
> understands the patch but... ;)
> 
>> Shouldn't this achieve the same thing?
>>
>> diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
>> index e89ec2522ca6..8c36a2196b66 100644
>> --- a/drivers/nvme/host/core.c
>> +++ b/drivers/nvme/host/core.c
>> @@ -303,8 +303,10 @@ static inline enum nvme_disposition
>> nvme_decide_disposition(struct request *req)
>>         if (likely(nvme_req(req)->status == 0))
>>                 return COMPLETE;
>>
>> -       if (blk_noretry_request(req) ||
>> -           (nvme_req(req)->status & NVME_SC_DNR) ||
>> +       if (blk_noretry_request(req))
>> +               nvme_req(req)->status |= NVME_SC_DNR;
>> +
>> +       if ((nvme_req(req)->status & NVME_SC_DNR) ||
>>             nvme_req(req)->retries >= nvme_max_retries)
>>                 return COMPLETE;
> 
> Definitely won't achieve the same. And especially not with patch 1/4
> ("nvme: return BLK_STS_DO_NOT_RETRY if the DNR bit is set") that you
> gave your Reviewed-by to earlier.
> 
Ah. Right. Sorry.

> Instead of "FAILUP", I thought about using "FAILUP_AND_OVER" to convey
> that this is a variant of failover.  Meaning it takes the same patch as
> nvme "FAILOVER" until the very end; where it does REQ_FAILFAST_TRANSPORT
> specific work detailed in nvme_failup_req().
> 
All very intricate; will need to check the patches in their combined
version.
Not deliberately stalling, mind you, just wanting to figure out what the
net result will be.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		        Kernel Storage Architect
hare@suse.de			               +49 911 74053 688
SUSE Software Solutions Germany GmbH, 90409 Nürnberg
GF: F. Imendörffer, HRB 36809 (AG Nürnberg)

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [dm-devel] [PATCH v2 3/4] nvme: introduce FAILUP handling for REQ_FAILFAST_TRANSPORT
  2021-04-16 14:07   ` Hannes Reinecke
  2021-04-16 15:03     ` Mike Snitzer
@ 2021-04-16 20:03     ` Ewan D. Milne
  1 sibling, 0 replies; 17+ messages in thread
From: Ewan D. Milne @ 2021-04-16 20:03 UTC (permalink / raw)
  To: Hannes Reinecke, Mike Snitzer, Christoph Hellwig, Jens Axboe
  Cc: linux-block, dm-devel, linux-nvme

On Fri, 2021-04-16 at 16:07 +0200, Hannes Reinecke wrote:
> 
> Hmm. Quite convoluted, methinks.
> Shouldn't this achieve the same thing?
> 
> diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
> index e89ec2522ca6..8c36a2196b66 100644
> --- a/drivers/nvme/host/core.c
> +++ b/drivers/nvme/host/core.c
> @@ -303,8 +303,10 @@ static inline enum nvme_disposition
> nvme_decide_disposition(struct request *req)
>         if (likely(nvme_req(req)->status == 0))
>                 return COMPLETE;
> 
> -       if (blk_noretry_request(req) ||
> -           (nvme_req(req)->status & NVME_SC_DNR) ||
> +       if (blk_noretry_request(req))
> +               nvme_req(req)->status |= NVME_SC_DNR;
> +
> +       if ((nvme_req(req)->status & NVME_SC_DNR) ||
>             nvme_req(req)->retries >= nvme_max_retries)
>                 return COMPLETE;
> 

I am not in favor of altering ->status to set DNR jus to
simplify the following conditional.

-Ewan


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v2 3/4] nvme: introduce FAILUP handling for REQ_FAILFAST_TRANSPORT
  2021-04-15 23:15 ` [PATCH v2 3/4] nvme: introduce FAILUP handling for REQ_FAILFAST_TRANSPORT Mike Snitzer
  2021-04-16 14:07   ` Hannes Reinecke
@ 2021-04-16 20:52   ` Ewan D. Milne
  2021-04-16 21:44     ` Mike Snitzer
  1 sibling, 1 reply; 17+ messages in thread
From: Ewan D. Milne @ 2021-04-16 20:52 UTC (permalink / raw)
  To: Mike Snitzer, Christoph Hellwig, Jens Axboe
  Cc: dm-devel, linux-block, linux-nvme

On Thu, 2021-04-15 at 19:15 -0400, Mike Snitzer wrote:
> If REQ_FAILFAST_TRANSPORT is set it means the driver should not retry
> IO that completed with transport errors. REQ_FAILFAST_TRANSPORT is
> set by multipathing software (e.g. dm-multipath) before it issues IO.
> 
> Update NVMe to allow failover of requests marked with either
> REQ_NVME_MPATH or REQ_FAILFAST_TRANSPORT. This allows such requests
> to be given a disposition of either FAILOVER or FAILUP respectively.
> FAILUP handling ensures a retryable error is returned up from NVMe.
> 
> Introduce nvme_failup_req() for use in nvme_complete_rq() if
> nvme_decide_disposition() returns FAILUP. nvme_failup_req() ensures
> the request is completed with a retryable IO error when appropriate.
> __nvme_end_req() was factored out for use by both nvme_end_req() and
> nvme_failup_req().
> 
> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
> ---
>  drivers/nvme/host/core.c | 31 ++++++++++++++++++++++++++-----
>  1 file changed, 26 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
> index 4134cf3c7e48..10375197dd53 100644
> --- a/drivers/nvme/host/core.c
> +++ b/drivers/nvme/host/core.c
> @@ -299,6 +299,7 @@ enum nvme_disposition {
>  	COMPLETE,
>  	RETRY,
>  	FAILOVER,
> +	FAILUP,
>  };
>  
>  static inline enum nvme_disposition nvme_decide_disposition(struct
> request *req)
> @@ -318,10 +319,11 @@ static inline enum nvme_disposition
> nvme_decide_disposition(struct request *req)
>  	    nvme_req(req)->retries >= nvme_max_retries)
>  		return COMPLETE;
>  
> -	if (req->cmd_flags & REQ_NVME_MPATH) {
> +	if (req->cmd_flags & (REQ_NVME_MPATH | REQ_FAILFAST_TRANSPORT))
> {
>  		if (nvme_is_path_error(nvme_req(req)->status) ||
>  		    blk_queue_dying(req->q))
> -			return FAILOVER;
> +			return (req->cmd_flags & REQ_NVME_MPATH) ?
> +				FAILOVER : FAILUP;
>  	} else {
>  		if (blk_queue_dying(req->q))
>  			return COMPLETE;
> @@ -330,10 +332,8 @@ static inline enum nvme_disposition
> nvme_decide_disposition(struct request *req)
>  	return RETRY;
>  }
>  
> -static inline void nvme_end_req(struct request *req)
> +static inline void __nvme_end_req(struct request *req, blk_status_t
> status)
>  {
> -	blk_status_t status = nvme_error_status(nvme_req(req)->status);
> -
>  	if (IS_ENABLED(CONFIG_BLK_DEV_ZONED) &&
>  	    req_op(req) == REQ_OP_ZONE_APPEND)
>  		req->__sector = nvme_lba_to_sect(req->q->queuedata,
> @@ -343,6 +343,24 @@ static inline void nvme_end_req(struct request
> *req)
>  	blk_mq_end_request(req, status);
>  }
>  
> +static inline void nvme_end_req(struct request *req)
> +{
> +	__nvme_end_req(req, nvme_error_status(nvme_req(req)->status));
> +}
> +
> +static void nvme_failup_req(struct request *req)
> +{
> +	blk_status_t status = nvme_error_status(nvme_req(req)->status);
> +
> +	if (WARN_ON_ONCE(!blk_path_error(status))) {
> +		pr_debug("Request meant for failover but blk_status_t
> (errno=%d) was not retryable.\n",
> +			 blk_status_to_errno(status));
> +		status = BLK_STS_IOERR;
> +	}
> +
> +	__nvme_end_req(req, status);

It seems like adding __nvme_end_req() was unnecessary, since
nvme_end_req() just calls it and does nothing else?

-Ewan

> +}
> +
>  void nvme_complete_rq(struct request *req)
>  {
>  	trace_nvme_complete_rq(req);
> @@ -361,6 +379,9 @@ void nvme_complete_rq(struct request *req)
>  	case FAILOVER:
>  		nvme_failover_req(req);
>  		return;
> +	case FAILUP:
> +		nvme_failup_req(req);
> +		return;
>  	}
>  }
>  EXPORT_SYMBOL_GPL(nvme_complete_rq);


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v2 3/4] nvme: introduce FAILUP handling for REQ_FAILFAST_TRANSPORT
  2021-04-16 20:52   ` Ewan D. Milne
@ 2021-04-16 21:44     ` Mike Snitzer
  0 siblings, 0 replies; 17+ messages in thread
From: Mike Snitzer @ 2021-04-16 21:44 UTC (permalink / raw)
  To: Ewan D. Milne
  Cc: Christoph Hellwig, Jens Axboe, dm-devel, linux-block, linux-nvme

On Fri, Apr 16 2021 at  4:52pm -0400,
Ewan D. Milne <emilne@redhat.com> wrote:

> On Thu, 2021-04-15 at 19:15 -0400, Mike Snitzer wrote:
> > If REQ_FAILFAST_TRANSPORT is set it means the driver should not retry
> > IO that completed with transport errors. REQ_FAILFAST_TRANSPORT is
> > set by multipathing software (e.g. dm-multipath) before it issues IO.
> > 
> > Update NVMe to allow failover of requests marked with either
> > REQ_NVME_MPATH or REQ_FAILFAST_TRANSPORT. This allows such requests
> > to be given a disposition of either FAILOVER or FAILUP respectively.
> > FAILUP handling ensures a retryable error is returned up from NVMe.
> > 
> > Introduce nvme_failup_req() for use in nvme_complete_rq() if
> > nvme_decide_disposition() returns FAILUP. nvme_failup_req() ensures
> > the request is completed with a retryable IO error when appropriate.
> > __nvme_end_req() was factored out for use by both nvme_end_req() and
> > nvme_failup_req().
> > 
> > Signed-off-by: Mike Snitzer <snitzer@redhat.com>
> > ---
> >  drivers/nvme/host/core.c | 31 ++++++++++++++++++++++++++-----
> >  1 file changed, 26 insertions(+), 5 deletions(-)
> > 
> > diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
> > index 4134cf3c7e48..10375197dd53 100644
> > --- a/drivers/nvme/host/core.c
> > +++ b/drivers/nvme/host/core.c
> > @@ -299,6 +299,7 @@ enum nvme_disposition {
> >  	COMPLETE,
> >  	RETRY,
> >  	FAILOVER,
> > +	FAILUP,
> >  };
> >  
> >  static inline enum nvme_disposition nvme_decide_disposition(struct
> > request *req)
> > @@ -318,10 +319,11 @@ static inline enum nvme_disposition
> > nvme_decide_disposition(struct request *req)
> >  	    nvme_req(req)->retries >= nvme_max_retries)
> >  		return COMPLETE;
> >  
> > -	if (req->cmd_flags & REQ_NVME_MPATH) {
> > +	if (req->cmd_flags & (REQ_NVME_MPATH | REQ_FAILFAST_TRANSPORT))
> > {
> >  		if (nvme_is_path_error(nvme_req(req)->status) ||
> >  		    blk_queue_dying(req->q))
> > -			return FAILOVER;
> > +			return (req->cmd_flags & REQ_NVME_MPATH) ?
> > +				FAILOVER : FAILUP;
> >  	} else {
> >  		if (blk_queue_dying(req->q))
> >  			return COMPLETE;
> > @@ -330,10 +332,8 @@ static inline enum nvme_disposition
> > nvme_decide_disposition(struct request *req)
> >  	return RETRY;
> >  }
> >  
> > -static inline void nvme_end_req(struct request *req)
> > +static inline void __nvme_end_req(struct request *req, blk_status_t
> > status)
> >  {
> > -	blk_status_t status = nvme_error_status(nvme_req(req)->status);
> > -
> >  	if (IS_ENABLED(CONFIG_BLK_DEV_ZONED) &&
> >  	    req_op(req) == REQ_OP_ZONE_APPEND)
> >  		req->__sector = nvme_lba_to_sect(req->q->queuedata,
> > @@ -343,6 +343,24 @@ static inline void nvme_end_req(struct request
> > *req)
> >  	blk_mq_end_request(req, status);
> >  }
> >  
> > +static inline void nvme_end_req(struct request *req)
> > +{
> > +	__nvme_end_req(req, nvme_error_status(nvme_req(req)->status));
> > +}
> > +
> > +static void nvme_failup_req(struct request *req)
> > +{
> > +	blk_status_t status = nvme_error_status(nvme_req(req)->status);
> > +
> > +	if (WARN_ON_ONCE(!blk_path_error(status))) {
> > +		pr_debug("Request meant for failover but blk_status_t
> > (errno=%d) was not retryable.\n",
> > +			 blk_status_to_errno(status));
> > +		status = BLK_STS_IOERR;
> > +	}
> > +
> > +	__nvme_end_req(req, status);
> 
> It seems like adding __nvme_end_req() was unnecessary, since
> nvme_end_req() just calls it and does nothing else?

The __nvme_end_req helper allowed the status to be passed in, making it
easy to override status for unlikley edge-case where the BLK_STS is
!blk_path_error() but should be given nvme_is_path_error() returned
true.

Anyway, I can avoid the need to factor out __nvme_end_req() and just:
  nvme_req(req)->status = NVME_SC_HOST_PATH_ERROR;
  nvme_end_req(req);

Thanks,
Mike


^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, back to index

Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-04-15 23:15 [PATCH v2 0/4] nvme: improve error handling and ana_state to work well with dm-multipath Mike Snitzer
2021-04-15 23:15 ` [PATCH v2 1/4] nvme: return BLK_STS_DO_NOT_RETRY if the DNR bit is set Mike Snitzer
2021-04-15 23:48   ` Chaitanya Kulkarni
2021-04-16 13:51   ` Hannes Reinecke
2021-04-15 23:15 ` [PATCH v2 2/4] nvme: allow local retry for requests with REQ_FAILFAST_TRANSPORT set Mike Snitzer
2021-04-16 14:01   ` Hannes Reinecke
2021-04-16 14:53     ` Mike Snitzer
2021-04-16 15:20       ` Hannes Reinecke
2021-04-16 15:32         ` Mike Snitzer
2021-04-15 23:15 ` [PATCH v2 3/4] nvme: introduce FAILUP handling for REQ_FAILFAST_TRANSPORT Mike Snitzer
2021-04-16 14:07   ` Hannes Reinecke
2021-04-16 15:03     ` Mike Snitzer
2021-04-16 16:23       ` Hannes Reinecke
2021-04-16 20:03     ` [dm-devel] " Ewan D. Milne
2021-04-16 20:52   ` Ewan D. Milne
2021-04-16 21:44     ` Mike Snitzer
2021-04-15 23:15 ` [PATCH v2 4/4] nvme: decouple basic ANA log page re-read support from native multipathing Mike Snitzer

Linux-Block Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/linux-block/0 linux-block/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 linux-block linux-block/ https://lore.kernel.org/linux-block \
		linux-block@vger.kernel.org
	public-inbox-index linux-block

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.linux-block


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git