linux-scsi.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v7 0/5] scsi: Support to handle Intermittent errors
@ 2020-11-11  4:58 Muneendra
  2020-11-11  4:58 ` [PATCH v7 1/5] scsi: Added a new error code DID_TRANSPORT_MARGINAL in scsi.h Muneendra
                   ` (5 more replies)
  0 siblings, 6 replies; 26+ messages in thread
From: Muneendra @ 2020-11-11  4:58 UTC (permalink / raw)
  To: linux-scsi, michael.christie, hare; +Cc: jsmart2021, emilne, mkumar, Muneendra

[-- Attachment #1: Type: text/plain, Size: 4131 bytes --]

This patch adds a support to prevent retries of all the
io's after an abort succeeds on a particular device when transport
connectivity to the device is encountering intermittent errors.

Intermittent connectivity is a condition that can be detected by transport
fabric notifications. A service can monitor the ELS notifications and
take action on all the outstanding io's of a scsi device at that instant.

This feature is intended to be used when the device is part of a multipath
environment. When the service detects the poor connectivity, the multipath
path can be placed in a marginal path group and ignored further io
operations.

After placing a path in the marginal path group,the daemon sets the
port_state to Marginal which sets bit in scmd->state for all the
io's on that particular device with the new sysfs interface
provided in this patch.This prevent retries of all the
io's if an io hits a scsi timeout which inturn issues an abort.
On Abort succeeds on a marginal path the io will be immediately retried on
another active path.On abort fails then the things escalates to existing
target reset sg interface recovery process.

Below is the interface provided to set the port state to Marginal
and Online.
echo "Marginal" >> /sys/class/fc_remote_ports/rport-X\:Y-Z/port_state
echo "Online" >> /sys/class/fc_remote_ports/rport-X\:Y-Z/port_state


The patches were cut against  5.11/scsi-queue tree

---
v7:

Added New routine in scsi_host_template to decide if a cmd is
retryable instead of checking the same using  SCMD_NORETRIES_ABORT
bit as the cmd retry part can be checked by validating the port state.

Removed the changes related to SCMD_NORETRIES_ABORT bit.

Added a new function fc_eh_should_retry_cmd to check whether the cmd
should be retried based on the rport state.

Reoreder the patch

The patches were cut against  5.11/scsi-queue tree


v6:
Reordered the patches to make patch ordering and more logical.

v5:
Added the DID_TRANSPORT_MARGINAL case to scsi_decide_disposition

Made changes to clear the SCMD_NORETRIES_ABORT bit if the port_state
has changed from marginal to online due to port_delete and port_add
as we need the normal cmd retry behaviour while we are calling the
eh handlers.

Made changes in fc_scsi_scan_rport as we are checking FC_PORTSTATE_ONLINE
instead of FC_PORTSTATE_ONLINE and FC_PORTSTATE_MARGINAL


v4:
Made changes in fc_eh_timed_out callout to set the SCMD_NORETRIES_ABORT if port
state is marginal 

With this change, we  removed the code  to loop over running commands
and fc_remote_port_chkready changes to set the SCMD_NORETRIES_ABORT 

Removed the scsi_cmd argument for fc_remote_port_chkready
and reverted back the patches that addressed this change(argument)

Removed unnecessary comments
Handle the return of errors on failure.

v3:
Removed the port_state from starget attributes.
Enabled the store functionality for port_state under remote port
Added a new argument to scsi_cmd  to fc_remote_port_chkready
Used the existing scsi command iterators scsi_host_busy_iter.
Rearranged the patches
Added new patches to add new argument for fc_remote_port_chkready

v2:
Added new error code DID_TRANSPORT_MARGINAL to handle marginal errors.
Added a new rport_state FC_PORTSTATE_MARGINAL and also added a new
sysfs interface port_state to set the port_state to marginal.
Added the support in lpfc to handle the marginal state.


*** BLURB HERE ***

Muneendra (5):
  scsi: Added a new error code DID_TRANSPORT_MARGINAL in scsi.h
  scsi: No retries on abort success
  scsi_transport_fc: Added a new rport state FC_PORTSTATE_MARGINAL
  scsi_transport_fc: Added store fucntionality to set the rport
    port_state using sysfs
  scsi:lpfc: Added support for eh_should_retry_cmd

 drivers/scsi/lpfc/lpfc_scsi.c    |   1 +
 drivers/scsi/scsi_error.c        |  23 +++++-
 drivers/scsi/scsi_lib.c          |   1 +
 drivers/scsi/scsi_transport_fc.c | 118 ++++++++++++++++++++++++++-----
 include/scsi/scsi.h              |   1 +
 include/scsi/scsi_host.h         |   6 ++
 include/scsi/scsi_transport_fc.h |   4 +-
 7 files changed, 133 insertions(+), 21 deletions(-)

-- 
2.26.2


[-- Attachment #2: S/MIME Cryptographic Signature --]
[-- Type: application/pkcs7-signature, Size: 4177 bytes --]

^ permalink raw reply	[flat|nested] 26+ messages in thread

* [PATCH v7 1/5] scsi: Added a new error code DID_TRANSPORT_MARGINAL in scsi.h
  2020-11-11  4:58 [PATCH v7 0/5] scsi: Support to handle Intermittent errors Muneendra
@ 2020-11-11  4:58 ` Muneendra
  2020-11-16  8:16   ` Hannes Reinecke
                     ` (2 more replies)
  2020-11-11  4:58 ` [PATCH v7 2/5] scsi: No retries on abort success Muneendra
                   ` (4 subsequent siblings)
  5 siblings, 3 replies; 26+ messages in thread
From: Muneendra @ 2020-11-11  4:58 UTC (permalink / raw)
  To: linux-scsi, michael.christie, hare; +Cc: jsmart2021, emilne, mkumar, Muneendra

[-- Attachment #1: Type: text/plain, Size: 2754 bytes --]

Added a new error code DID_TRANSPORT_MARGINAL to handle marginal
errors in scsi.h

Added a code in scsi_result_to_blk_status to translate
a new error DID_TRANSPORT_MARGINAL to the corresponding blk_status_t
i.e BLK_STS_TRANSPORT

Added DID_TRANSPORT_MARGINAL case to scsi_decide_disposition

Signed-off-by: Muneendra <muneendra.kumar@broadcom.com>

---
v7:
Rearranged the patch by moving the DID_TRANSPORT_MARGINAL
and the changes with respect to the same to this patch
from the previous patch2 in v6

Removed the previuos patch patch1 in v6 as in the
current approach there is no need of this bit SCMD_NORETRIES_ABORT

v6:
Rearranged the patch by merging second hunk of the patch2 in v5
to this patch

v5:
added the DID_TRANSPORT_MARGINAL case to
scsi_decide_disposition
v4:
Modified the comments in the code appropriately

v3:
Merged  first part of the previous patch(v2 patch3) with
this patch.

v2:
set the hostbyte as DID_TRANSPORT_MARGINAL instead of
DID_TRANSPORT_FAILFAST.
---
 drivers/scsi/scsi_error.c | 6 ++++++
 drivers/scsi/scsi_lib.c   | 1 +
 include/scsi/scsi.h       | 1 +
 3 files changed, 8 insertions(+)

diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
index f11f51e2465f..28056ee498b3 100644
--- a/drivers/scsi/scsi_error.c
+++ b/drivers/scsi/scsi_error.c
@@ -1861,6 +1861,12 @@ int scsi_decide_disposition(struct scsi_cmnd *scmd)
 		 * the fast io fail tmo fired), so send IO directly upwards.
 		 */
 		return SUCCESS;
+	case DID_TRANSPORT_MARGINAL:
+		/*
+		 * caller has decided not to do retries on
+		 * abort success, so send IO directly upwards
+		 */
+		return SUCCESS;
 	case DID_ERROR:
 		if (msg_byte(scmd->result) == COMMAND_COMPLETE &&
 		    status_byte(scmd->result) == RESERVATION_CONFLICT)
diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index 20a357563d3d..ce1e2adaca36 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -629,6 +629,7 @@ static blk_status_t scsi_result_to_blk_status(struct scsi_cmnd *cmd, int result)
 			return BLK_STS_OK;
 		return BLK_STS_IOERR;
 	case DID_TRANSPORT_FAILFAST:
+	case DID_TRANSPORT_MARGINAL:
 		return BLK_STS_TRANSPORT;
 	case DID_TARGET_FAILURE:
 		set_host_byte(cmd, DID_OK);
diff --git a/include/scsi/scsi.h b/include/scsi/scsi.h
index 5339baadc082..5b287ad8b727 100644
--- a/include/scsi/scsi.h
+++ b/include/scsi/scsi.h
@@ -159,6 +159,7 @@ static inline int scsi_is_wlun(u64 lun)
 				 * paths might yield different results */
 #define DID_ALLOC_FAILURE 0x12  /* Space allocation on the device failed */
 #define DID_MEDIUM_ERROR  0x13  /* Medium error */
+#define DID_TRANSPORT_MARGINAL 0x14 /* Transport marginal errors */
 #define DRIVER_OK       0x00	/* Driver status                           */
 
 /*
-- 
2.26.2


[-- Attachment #2: S/MIME Cryptographic Signature --]
[-- Type: application/pkcs7-signature, Size: 4177 bytes --]

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH v7 2/5] scsi: No retries on abort success
  2020-11-11  4:58 [PATCH v7 0/5] scsi: Support to handle Intermittent errors Muneendra
  2020-11-11  4:58 ` [PATCH v7 1/5] scsi: Added a new error code DID_TRANSPORT_MARGINAL in scsi.h Muneendra
@ 2020-11-11  4:58 ` Muneendra
  2020-11-16  8:22   ` Hannes Reinecke
                     ` (2 more replies)
  2020-11-11  4:58 ` [PATCH v7 3/5] scsi_transport_fc: Added a new rport state FC_PORTSTATE_MARGINAL Muneendra
                   ` (3 subsequent siblings)
  5 siblings, 3 replies; 26+ messages in thread
From: Muneendra @ 2020-11-11  4:58 UTC (permalink / raw)
  To: linux-scsi, michael.christie, hare; +Cc: jsmart2021, emilne, mkumar, Muneendra

[-- Attachment #1: Type: text/plain, Size: 3592 bytes --]

Added a new optional routine eh_should_retry_cmd
in scsi_host_template that allows the transport to decide if a
cmd is retryable.Return true if the transport is in a state the
cmd should be retried on.

Added a new interface scsi_eh_should_retry_cmd which checks and
calls the new routine eh_should_retry_cmd.

Made changes in scmd_eh_abort_handler and scsi_eh_flush_done_q which
calls the scsi_eh_should_retry_cmd to check whether the
command needs to be retried.

The above changes were done based on a patch by Mike Christie.

Signed-off-by: Muneendra <muneendra.kumar@broadcom.com>

---
v7:
Added New routine in scsi_host_template to decide if a cmd is
retryable instead of checking the same using  SCMD_NORETRIES_ABORT
bit as the cmd retry part can be checked by validating the port state.

Moved the DID_TRANSPORT_MARGINAL changes to previous patch
for reordering

v6:
Rearranged the patch by merging second hunk of the patch2 in v5
to this patch

v5:
added the DID_TRANSPORT_MARGINAL case to
scsi_decide_disposition
v4:
Modified the comments in the code appropriately

v3:
Merged  first part of the previous patch(v2 patch3) with
this patch.

v2:
set the hostbyte as DID_TRANSPORT_MARGINAL instead of
DID_TRANSPORT_FAILFAST.
---
 drivers/scsi/scsi_error.c | 17 +++++++++++++++--
 include/scsi/scsi_host.h  |  6 ++++++
 2 files changed, 21 insertions(+), 2 deletions(-)

diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
index 28056ee498b3..1cdfa5a8ca09 100644
--- a/drivers/scsi/scsi_error.c
+++ b/drivers/scsi/scsi_error.c
@@ -124,6 +124,17 @@ static bool scsi_cmd_retry_allowed(struct scsi_cmnd *cmd)
 	return ++cmd->retries <= cmd->allowed;
 }
 
+static bool scsi_eh_should_retry_cmd(struct scsi_cmnd *cmd)
+{
+	struct scsi_device *sdev = cmd->device;
+	struct Scsi_Host *host = sdev->host;
+
+	if (host->hostt->eh_should_retry_cmd)
+		return  host->hostt->eh_should_retry_cmd(cmd);
+
+	return true;
+}
+
 /**
  * scmd_eh_abort_handler - Handle command aborts
  * @work:	command to be aborted.
@@ -159,7 +170,8 @@ scmd_eh_abort_handler(struct work_struct *work)
 						    "eh timeout, not retrying "
 						    "aborted command\n"));
 			} else if (!scsi_noretry_cmd(scmd) &&
-				   scsi_cmd_retry_allowed(scmd)) {
+				   scsi_cmd_retry_allowed(scmd) &&
+				scsi_eh_should_retry_cmd(scmd)) {
 				SCSI_LOG_ERROR_RECOVERY(3,
 					scmd_printk(KERN_WARNING, scmd,
 						    "retry aborted command\n"));
@@ -2111,7 +2123,8 @@ void scsi_eh_flush_done_q(struct list_head *done_q)
 	list_for_each_entry_safe(scmd, next, done_q, eh_entry) {
 		list_del_init(&scmd->eh_entry);
 		if (scsi_device_online(scmd->device) &&
-		    !scsi_noretry_cmd(scmd) && scsi_cmd_retry_allowed(scmd)) {
+		    !scsi_noretry_cmd(scmd) && scsi_cmd_retry_allowed(scmd) &&
+			scsi_eh_should_retry_cmd(scmd)) {
 			SCSI_LOG_ERROR_RECOVERY(3,
 				scmd_printk(KERN_INFO, scmd,
 					     "%s: flush retry cmd\n",
diff --git a/include/scsi/scsi_host.h b/include/scsi/scsi_host.h
index 701f178b20ae..e30fd963b97d 100644
--- a/include/scsi/scsi_host.h
+++ b/include/scsi/scsi_host.h
@@ -314,6 +314,12 @@ struct scsi_host_template {
 	 * Status: OPTIONAL
 	 */
 	enum blk_eh_timer_return (*eh_timed_out)(struct scsi_cmnd *);
+	/*
+	 * Optional routine that allows the transport to decide if a cmd
+	 * is retryable. Return true if the transport is in a state the
+	 * cmd should be retried on.
+	 */
+	bool (*eh_should_retry_cmd)(struct scsi_cmnd *scmd);
 
 	/* This is an optional routine that allows transport to initiate
 	 * LLD adapter or firmware reset using sysfs attribute.
-- 
2.26.2


[-- Attachment #2: S/MIME Cryptographic Signature --]
[-- Type: application/pkcs7-signature, Size: 4177 bytes --]

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH v7 3/5] scsi_transport_fc: Added a new rport state FC_PORTSTATE_MARGINAL
  2020-11-11  4:58 [PATCH v7 0/5] scsi: Support to handle Intermittent errors Muneendra
  2020-11-11  4:58 ` [PATCH v7 1/5] scsi: Added a new error code DID_TRANSPORT_MARGINAL in scsi.h Muneendra
  2020-11-11  4:58 ` [PATCH v7 2/5] scsi: No retries on abort success Muneendra
@ 2020-11-11  4:58 ` Muneendra
  2020-11-16  8:19   ` Hannes Reinecke
                     ` (2 more replies)
  2020-11-11  4:58 ` [PATCH v7 4/5] scsi_transport_fc: Added store fucntionality to set the rport port_state using sysfs Muneendra
                   ` (2 subsequent siblings)
  5 siblings, 3 replies; 26+ messages in thread
From: Muneendra @ 2020-11-11  4:58 UTC (permalink / raw)
  To: linux-scsi, michael.christie, hare; +Cc: jsmart2021, emilne, mkumar, Muneendra

[-- Attachment #1: Type: text/plain, Size: 7204 bytes --]

Added a new rport state FC_PORTSTATE_MARGINAL.

Added a new interface fc_eh_should_retry_cmd which Checks if the cmd
should be retried or not by checking the rport state.
If the rport state is marginal it returns
false to make sure there won't be any retries on the cmd.

Also made changes in fc_remote_port_delete,fc_user_scan_tgt,
fc_timeout_deleted_rport functions  to handle the new rport state
FC_PORTSTATE_MARGINAL.

Signed-off-by: Muneendra <muneendra.kumar@broadcom.com>

---
v7:
Removed the changes related to SCMD_NORETRIES_ABORT bit.

Added a new function fc_eh_should_retry_cmd to check whether the cmd
should be retried based on the rport state.

v6:
No change

v5:
Made changes to clear the SCMD_NORETRIES_ABORT bit if the port_state
has changed from marginal to online due to port_delete and port_add
as we need the normal cmd retry behaviour

Made changes in fc_scsi_scan_rport as we are checking FC_PORTSTATE_ONLINE
instead of FC_PORTSTATE_ONLINE and FC_PORTSTATE_MARGINAL

v4:
Made changes in fc_eh_timed_out to call fc_rport_chkmarginal_set_noretries
so that SCMD_NORETRIES_ABORT bit in cmd->state is set if rport state
is marginal.

Removed the newly added scsi_cmd argument to fc_remote_port_chkready
as the current patch handles only SCSI EH timeout/abort case.

v3:
Rearranged the patch so that all the changes with respect to new
rport state is part of this patch.
Added a new argument to scsi_cmd  to fc_remote_port_chkready

v2:
New patch
---
 drivers/scsi/scsi_transport_fc.c | 62 +++++++++++++++++++++++---------
 include/scsi/scsi_transport_fc.h |  4 ++-
 2 files changed, 49 insertions(+), 17 deletions(-)

diff --git a/drivers/scsi/scsi_transport_fc.c b/drivers/scsi/scsi_transport_fc.c
index a926e8f9e56e..ffd25195ae62 100644
--- a/drivers/scsi/scsi_transport_fc.c
+++ b/drivers/scsi/scsi_transport_fc.c
@@ -148,20 +148,23 @@ fc_enum_name_search(host_event_code, fc_host_event_code,
 static struct {
 	enum fc_port_state	value;
 	char			*name;
+	int			matchlen;
 } fc_port_state_names[] = {
-	{ FC_PORTSTATE_UNKNOWN,		"Unknown" },
-	{ FC_PORTSTATE_NOTPRESENT,	"Not Present" },
-	{ FC_PORTSTATE_ONLINE,		"Online" },
-	{ FC_PORTSTATE_OFFLINE,		"Offline" },
-	{ FC_PORTSTATE_BLOCKED,		"Blocked" },
-	{ FC_PORTSTATE_BYPASSED,	"Bypassed" },
-	{ FC_PORTSTATE_DIAGNOSTICS,	"Diagnostics" },
-	{ FC_PORTSTATE_LINKDOWN,	"Linkdown" },
-	{ FC_PORTSTATE_ERROR,		"Error" },
-	{ FC_PORTSTATE_LOOPBACK,	"Loopback" },
-	{ FC_PORTSTATE_DELETED,		"Deleted" },
+	{ FC_PORTSTATE_UNKNOWN,		"Unknown", 7},
+	{ FC_PORTSTATE_NOTPRESENT,	"Not Present", 11 },
+	{ FC_PORTSTATE_ONLINE,		"Online", 6 },
+	{ FC_PORTSTATE_OFFLINE,		"Offline", 7 },
+	{ FC_PORTSTATE_BLOCKED,		"Blocked", 7 },
+	{ FC_PORTSTATE_BYPASSED,	"Bypassed", 8 },
+	{ FC_PORTSTATE_DIAGNOSTICS,	"Diagnostics", 11 },
+	{ FC_PORTSTATE_LINKDOWN,	"Linkdown", 8 },
+	{ FC_PORTSTATE_ERROR,		"Error", 5 },
+	{ FC_PORTSTATE_LOOPBACK,	"Loopback", 8 },
+	{ FC_PORTSTATE_DELETED,		"Deleted", 7 },
+	{ FC_PORTSTATE_MARGINAL,	"Marginal", 8 },
 };
 fc_enum_name_search(port_state, fc_port_state, fc_port_state_names)
+fc_enum_name_match(port_state, fc_port_state, fc_port_state_names)
 #define FC_PORTSTATE_MAX_NAMELEN	20
 
 
@@ -2509,7 +2512,8 @@ fc_user_scan_tgt(struct Scsi_Host *shost, uint channel, uint id, u64 lun)
 		if (rport->scsi_target_id == -1)
 			continue;
 
-		if (rport->port_state != FC_PORTSTATE_ONLINE)
+		if ((rport->port_state != FC_PORTSTATE_ONLINE) &&
+			(rport->port_state != FC_PORTSTATE_MARGINAL))
 			continue;
 
 		if ((channel == rport->channel) &&
@@ -3373,7 +3377,8 @@ fc_remote_port_delete(struct fc_rport  *rport)
 
 	spin_lock_irqsave(shost->host_lock, flags);
 
-	if (rport->port_state != FC_PORTSTATE_ONLINE) {
+	if ((rport->port_state != FC_PORTSTATE_ONLINE) &&
+		(rport->port_state != FC_PORTSTATE_MARGINAL)) {
 		spin_unlock_irqrestore(shost->host_lock, flags);
 		return;
 	}
@@ -3515,7 +3520,8 @@ fc_timeout_deleted_rport(struct work_struct *work)
 	 * target, validate it still is. If not, tear down the
 	 * scsi_target on it.
 	 */
-	if ((rport->port_state == FC_PORTSTATE_ONLINE) &&
+	if (((rport->port_state == FC_PORTSTATE_ONLINE) ||
+		(rport->port_state == FC_PORTSTATE_MARGINAL)) &&
 	    (rport->scsi_target_id != -1) &&
 	    !(rport->roles & FC_PORT_ROLE_FCP_TARGET)) {
 		dev_printk(KERN_ERR, &rport->dev,
@@ -3658,7 +3664,8 @@ fc_scsi_scan_rport(struct work_struct *work)
 	struct fc_internal *i = to_fc_internal(shost->transportt);
 	unsigned long flags;
 
-	if ((rport->port_state == FC_PORTSTATE_ONLINE) &&
+	if (((rport->port_state == FC_PORTSTATE_ONLINE) ||
+		(rport->port_state == FC_PORTSTATE_MARGINAL)) &&
 	    (rport->roles & FC_PORT_ROLE_FCP_TARGET) &&
 	    !(i->f->disable_target_scan)) {
 		scsi_scan_target(&rport->dev, rport->channel,
@@ -3731,6 +3738,28 @@ int fc_block_scsi_eh(struct scsi_cmnd *cmnd)
 }
 EXPORT_SYMBOL(fc_block_scsi_eh);
 
+/*
+ * fc_eh_should_retry_cmd - Checks if the cmd should be retried or not
+ * @scmd:        The SCSI command to be checked
+ *
+ * This checks the rport state to decide if a cmd is
+ * retryable.
+ *
+ * Returns: true if the rport state is not in marginal state.
+ */
+bool fc_eh_should_retry_cmd(struct scsi_cmnd *scmd)
+{
+	struct fc_rport *rport = starget_to_rport(scsi_target(scmd->device));
+
+	if ((rport->port_state != FC_PORTSTATE_ONLINE) &&
+		(scmd->request->cmd_flags & REQ_FAILFAST_TRANSPORT)) {
+		set_host_byte(scmd, DID_TRANSPORT_MARGINAL);
+		return false;
+	}
+	return true;
+}
+EXPORT_SYMBOL_GPL(fc_eh_should_retry_cmd);
+
 /**
  * fc_vport_setup - allocates and creates a FC virtual port.
  * @shost:	scsi host the virtual port is connected to.
@@ -4162,7 +4191,8 @@ static blk_status_t fc_bsg_rport_prep(struct fc_rport *rport)
 	    !(rport->flags & FC_RPORT_FAST_FAIL_TIMEDOUT))
 		return BLK_STS_RESOURCE;
 
-	if (rport->port_state != FC_PORTSTATE_ONLINE)
+	if ((rport->port_state != FC_PORTSTATE_ONLINE) &&
+		(rport->port_state != FC_PORTSTATE_MARGINAL))
 		return BLK_STS_IOERR;
 
 	return BLK_STS_OK;
diff --git a/include/scsi/scsi_transport_fc.h b/include/scsi/scsi_transport_fc.h
index c759b29e46c7..14214ee121ad 100644
--- a/include/scsi/scsi_transport_fc.h
+++ b/include/scsi/scsi_transport_fc.h
@@ -67,6 +67,7 @@ enum fc_port_state {
 	FC_PORTSTATE_ERROR,
 	FC_PORTSTATE_LOOPBACK,
 	FC_PORTSTATE_DELETED,
+	FC_PORTSTATE_MARGINAL,
 };
 
 
@@ -742,7 +743,6 @@ struct fc_function_template {
 	unsigned long	disable_target_scan:1;
 };
 
-
 /**
  * fc_remote_port_chkready - called to validate the remote port state
  *   prior to initiating io to the port.
@@ -758,6 +758,7 @@ fc_remote_port_chkready(struct fc_rport *rport)
 
 	switch (rport->port_state) {
 	case FC_PORTSTATE_ONLINE:
+	case FC_PORTSTATE_MARGINAL:
 		if (rport->roles & FC_PORT_ROLE_FCP_TARGET)
 			result = 0;
 		else if (rport->flags & FC_RPORT_DEVLOSS_PENDING)
@@ -839,6 +840,7 @@ int fc_vport_terminate(struct fc_vport *vport);
 int fc_block_rport(struct fc_rport *rport);
 int fc_block_scsi_eh(struct scsi_cmnd *cmnd);
 enum blk_eh_timer_return fc_eh_timed_out(struct scsi_cmnd *scmd);
+bool fc_eh_should_retry_cmd(struct scsi_cmnd *scmd);
 
 static inline struct Scsi_Host *fc_bsg_to_shost(struct bsg_job *job)
 {
-- 
2.26.2


[-- Attachment #2: S/MIME Cryptographic Signature --]
[-- Type: application/pkcs7-signature, Size: 4177 bytes --]

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH v7 4/5] scsi_transport_fc: Added store fucntionality to set the rport port_state using sysfs
  2020-11-11  4:58 [PATCH v7 0/5] scsi: Support to handle Intermittent errors Muneendra
                   ` (2 preceding siblings ...)
  2020-11-11  4:58 ` [PATCH v7 3/5] scsi_transport_fc: Added a new rport state FC_PORTSTATE_MARGINAL Muneendra
@ 2020-11-11  4:58 ` Muneendra
  2020-11-16  8:20   ` Hannes Reinecke
                     ` (2 more replies)
  2020-11-11  4:58 ` [PATCH v7 5/5] scsi:lpfc: Added support for eh_should_retry_cmd Muneendra
  2020-12-08  5:00 ` [PATCH v7 0/5] scsi: Support to handle Intermittent errors Muneendra Kumar M
  5 siblings, 3 replies; 26+ messages in thread
From: Muneendra @ 2020-11-11  4:58 UTC (permalink / raw)
  To: linux-scsi, michael.christie, hare; +Cc: jsmart2021, emilne, mkumar, Muneendra

[-- Attachment #1: Type: text/plain, Size: 3881 bytes --]

Added a store functionality to set rport port_state using sysfs
under  fc_remote_ports/rport-*/port_state

With this functionality the user can move the port_state from
Marginal -> Online and Online->Marginal.

On Marginal :This interface will set SCMD_NORETRIES_ABORT bit in
scmd->state for all the pending io's on the scsi device associated
with target port.

On Online :This interface will clear SCMD_NORETRIES_ABORT bit in
scmd->state for all the pending io's on the scsi device associated
with target port.

Below is the interface provided to set the port state to Marginal
and Online.

echo "Marginal" >> /sys/class/fc_remote_ports/rport-X\:Y-Z/port_state
echo "Online" >> /sys/class/fc_remote_ports/rport-X\:Y-Z/port_state

Signed-off-by: Muneendra <muneendra.kumar@broadcom.com>

---
v7:
No change

v6:
No change

v5:
No change

v4:
Addressed the error reported by kernel test robot
Removed the code needed to traverse all the devices under rport
to set/clear SCMD_NORETRIES_ABORT
Removed unncessary comments.
Return the error values on failure while setting the port_state

v3:
Removed the port_state from starget attributes.
Enabled the store functionality for port_state under remote port.
used the starget_for_each_device to traverse around all the devices
under rport

v2:
Changed from a noretries_abort attribute under fc_transport/target*/ to
port_state for changing the port_state to a marginal state
---
 drivers/scsi/scsi_transport_fc.c | 56 ++++++++++++++++++++++++++++++--
 1 file changed, 54 insertions(+), 2 deletions(-)

diff --git a/drivers/scsi/scsi_transport_fc.c b/drivers/scsi/scsi_transport_fc.c
index ffd25195ae62..d378ca4a60fe 100644
--- a/drivers/scsi/scsi_transport_fc.c
+++ b/drivers/scsi/scsi_transport_fc.c
@@ -1238,7 +1238,59 @@ show_fc_rport_roles (struct device *dev, struct device_attribute *attr,
 static FC_DEVICE_ATTR(rport, roles, S_IRUGO,
 		show_fc_rport_roles, NULL);
 
-fc_private_rport_rd_enum_attr(port_state, FC_PORTSTATE_MAX_NAMELEN);
+static ssize_t fc_rport_set_marginal_state(struct device *dev,
+						struct device_attribute *attr,
+						const char *buf, size_t count)
+{
+	struct fc_rport *rport = transport_class_to_rport(dev);
+	enum fc_port_state port_state;
+	int ret = 0;
+
+	ret = get_fc_port_state_match(buf, &port_state);
+	if (ret)
+		return -EINVAL;
+	if (port_state == FC_PORTSTATE_MARGINAL) {
+		/*
+		 * Change the state to marginal only if the
+		 * current rport state is Online
+		 * Allow only Online->marginal
+		 */
+		if (rport->port_state == FC_PORTSTATE_ONLINE)
+			rport->port_state = port_state;
+		else
+			return -EINVAL;
+	} else if (port_state == FC_PORTSTATE_ONLINE) {
+		/*
+		 * Change the state to Online only if the
+		 * current rport state is Marginal
+		 * Allow only  MArginal->Online
+		 */
+		if (rport->port_state == FC_PORTSTATE_MARGINAL)
+			rport->port_state = port_state;
+		else
+			return -EINVAL;
+	} else
+		return -EINVAL;
+	return count;
+}
+
+static ssize_t
+show_fc_rport_port_state(struct device *dev,
+				struct device_attribute *attr, char *buf)
+{
+	const char *name;
+	struct fc_rport *rport = transport_class_to_rport(dev);
+
+	name = get_fc_port_state_name(rport->port_state);
+	if (!name)
+		return -EINVAL;
+
+	return snprintf(buf, 20, "%s\n", name);
+}
+
+static FC_DEVICE_ATTR(rport, port_state, 0444 | 0200,
+			show_fc_rport_port_state, fc_rport_set_marginal_state);
+
 fc_private_rport_rd_attr(scsi_target_id, "%d\n", 20);
 
 /*
@@ -2681,7 +2733,7 @@ fc_attach_transport(struct fc_function_template *ft)
 	SETUP_PRIVATE_RPORT_ATTRIBUTE_RD(port_name);
 	SETUP_PRIVATE_RPORT_ATTRIBUTE_RD(port_id);
 	SETUP_PRIVATE_RPORT_ATTRIBUTE_RD(roles);
-	SETUP_PRIVATE_RPORT_ATTRIBUTE_RD(port_state);
+	SETUP_PRIVATE_RPORT_ATTRIBUTE_RW(port_state);
 	SETUP_PRIVATE_RPORT_ATTRIBUTE_RD(scsi_target_id);
 	SETUP_PRIVATE_RPORT_ATTRIBUTE_RW(fast_io_fail_tmo);
 
-- 
2.26.2


[-- Attachment #2: S/MIME Cryptographic Signature --]
[-- Type: application/pkcs7-signature, Size: 4177 bytes --]

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH v7 5/5] scsi:lpfc: Added support for eh_should_retry_cmd
  2020-11-11  4:58 [PATCH v7 0/5] scsi: Support to handle Intermittent errors Muneendra
                   ` (3 preceding siblings ...)
  2020-11-11  4:58 ` [PATCH v7 4/5] scsi_transport_fc: Added store fucntionality to set the rport port_state using sysfs Muneendra
@ 2020-11-11  4:58 ` Muneendra
  2020-11-16  8:23   ` Hannes Reinecke
                     ` (2 more replies)
  2020-12-08  5:00 ` [PATCH v7 0/5] scsi: Support to handle Intermittent errors Muneendra Kumar M
  5 siblings, 3 replies; 26+ messages in thread
From: Muneendra @ 2020-11-11  4:58 UTC (permalink / raw)
  To: linux-scsi, michael.christie, hare; +Cc: jsmart2021, emilne, mkumar, Muneendra

[-- Attachment #1: Type: text/plain, Size: 780 bytes --]

Added support to handle eh_should_retry_cmd in lpfc_template.

Signed-off-by: Muneendra <muneendra.kumar@broadcom.com>

---
v7:
New patch
---
 drivers/scsi/lpfc/lpfc_scsi.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/scsi/lpfc/lpfc_scsi.c b/drivers/scsi/lpfc/lpfc_scsi.c
index 4ffdfd2c8604..dbc80e6423ea 100644
--- a/drivers/scsi/lpfc/lpfc_scsi.c
+++ b/drivers/scsi/lpfc/lpfc_scsi.c
@@ -6040,6 +6040,7 @@ struct scsi_host_template lpfc_template = {
 	.info			= lpfc_info,
 	.queuecommand		= lpfc_queuecommand,
 	.eh_timed_out		= fc_eh_timed_out,
+	.eh_should_retry_cmd    = fc_eh_should_retry_cmd,
 	.eh_abort_handler	= lpfc_abort_handler,
 	.eh_device_reset_handler = lpfc_device_reset_handler,
 	.eh_target_reset_handler = lpfc_target_reset_handler,
-- 
2.26.2


[-- Attachment #2: S/MIME Cryptographic Signature --]
[-- Type: application/pkcs7-signature, Size: 4177 bytes --]

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* Re: [PATCH v7 1/5] scsi: Added a new error code DID_TRANSPORT_MARGINAL in scsi.h
  2020-11-11  4:58 ` [PATCH v7 1/5] scsi: Added a new error code DID_TRANSPORT_MARGINAL in scsi.h Muneendra
@ 2020-11-16  8:16   ` Hannes Reinecke
  2020-11-23 19:45   ` Ewan D. Milne
  2020-11-24 17:42   ` Himanshu Madhani
  2 siblings, 0 replies; 26+ messages in thread
From: Hannes Reinecke @ 2020-11-16  8:16 UTC (permalink / raw)
  To: Muneendra, linux-scsi, michael.christie; +Cc: jsmart2021, emilne, mkumar

On 11/11/20 5:58 AM, Muneendra wrote:
> Added a new error code DID_TRANSPORT_MARGINAL to handle marginal
> errors in scsi.h
> 
> Added a code in scsi_result_to_blk_status to translate
> a new error DID_TRANSPORT_MARGINAL to the corresponding blk_status_t
> i.e BLK_STS_TRANSPORT
> 
> Added DID_TRANSPORT_MARGINAL case to scsi_decide_disposition
> 
> Signed-off-by: Muneendra <muneendra.kumar@broadcom.com>
> 
> ---
> v7:
> Rearranged the patch by moving the DID_TRANSPORT_MARGINAL
> and the changes with respect to the same to this patch
> from the previous patch2 in v6
> 
> Removed the previuos patch patch1 in v6 as in the
> current approach there is no need of this bit SCMD_NORETRIES_ABORT
> 
> v6:
> Rearranged the patch by merging second hunk of the patch2 in v5
> to this patch
> 
> v5:
> added the DID_TRANSPORT_MARGINAL case to
> scsi_decide_disposition
> v4:
> Modified the comments in the code appropriately
> 
> v3:
> Merged  first part of the previous patch(v2 patch3) with
> this patch.
> 
> v2:
> set the hostbyte as DID_TRANSPORT_MARGINAL instead of
> DID_TRANSPORT_FAILFAST.
> ---
>   drivers/scsi/scsi_error.c | 6 ++++++
>   drivers/scsi/scsi_lib.c   | 1 +
>   include/scsi/scsi.h       | 1 +
>   3 files changed, 8 insertions(+)
> 
> diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
> index f11f51e2465f..28056ee498b3 100644
> --- a/drivers/scsi/scsi_error.c
> +++ b/drivers/scsi/scsi_error.c
> @@ -1861,6 +1861,12 @@ int scsi_decide_disposition(struct scsi_cmnd *scmd)
>   		 * the fast io fail tmo fired), so send IO directly upwards.
>   		 */
>   		return SUCCESS;
> +	case DID_TRANSPORT_MARGINAL:
> +		/*
> +		 * caller has decided not to do retries on
> +		 * abort success, so send IO directly upwards
> +		 */
> +		return SUCCESS;
>   	case DID_ERROR:
>   		if (msg_byte(scmd->result) == COMMAND_COMPLETE &&
>   		    status_byte(scmd->result) == RESERVATION_CONFLICT)
> diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
> index 20a357563d3d..ce1e2adaca36 100644
> --- a/drivers/scsi/scsi_lib.c
> +++ b/drivers/scsi/scsi_lib.c
> @@ -629,6 +629,7 @@ static blk_status_t scsi_result_to_blk_status(struct scsi_cmnd *cmd, int result)
>   			return BLK_STS_OK;
>   		return BLK_STS_IOERR;
>   	case DID_TRANSPORT_FAILFAST:
> +	case DID_TRANSPORT_MARGINAL:
>   		return BLK_STS_TRANSPORT;
>   	case DID_TARGET_FAILURE:
>   		set_host_byte(cmd, DID_OK);
> diff --git a/include/scsi/scsi.h b/include/scsi/scsi.h
> index 5339baadc082..5b287ad8b727 100644
> --- a/include/scsi/scsi.h
> +++ b/include/scsi/scsi.h
> @@ -159,6 +159,7 @@ static inline int scsi_is_wlun(u64 lun)
>   				 * paths might yield different results */
>   #define DID_ALLOC_FAILURE 0x12  /* Space allocation on the device failed */
>   #define DID_MEDIUM_ERROR  0x13  /* Medium error */
> +#define DID_TRANSPORT_MARGINAL 0x14 /* Transport marginal errors */
>   #define DRIVER_OK       0x00	/* Driver status                           */
>   
>   /*
> 
Reviewed-by: Hannes Reinecke <hare@suse.de>

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                Kernel Storage Architect
hare@suse.de                              +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), Geschäftsführer: Felix Imendörffer

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v7 3/5] scsi_transport_fc: Added a new rport state FC_PORTSTATE_MARGINAL
  2020-11-11  4:58 ` [PATCH v7 3/5] scsi_transport_fc: Added a new rport state FC_PORTSTATE_MARGINAL Muneendra
@ 2020-11-16  8:19   ` Hannes Reinecke
  2020-11-17  7:43     ` Muneendra Kumar M
  2020-11-23 20:01     ` Ewan D. Milne
  2020-11-23 19:47   ` Ewan D. Milne
  2020-11-24 17:43   ` Himanshu Madhani
  2 siblings, 2 replies; 26+ messages in thread
From: Hannes Reinecke @ 2020-11-16  8:19 UTC (permalink / raw)
  To: Muneendra, linux-scsi, michael.christie; +Cc: jsmart2021, emilne, mkumar

On 11/11/20 5:58 AM, Muneendra wrote:
> Added a new rport state FC_PORTSTATE_MARGINAL.
> 
> Added a new interface fc_eh_should_retry_cmd which Checks if the cmd
> should be retried or not by checking the rport state.
> If the rport state is marginal it returns
> false to make sure there won't be any retries on the cmd.
> 
> Also made changes in fc_remote_port_delete,fc_user_scan_tgt,
> fc_timeout_deleted_rport functions  to handle the new rport state
> FC_PORTSTATE_MARGINAL.
> 
> Signed-off-by: Muneendra <muneendra.kumar@broadcom.com>
> 
> ---
> v7:
> Removed the changes related to SCMD_NORETRIES_ABORT bit.
> 
> Added a new function fc_eh_should_retry_cmd to check whether the cmd
> should be retried based on the rport state.
> 
> v6:
> No change
> 
> v5:
> Made changes to clear the SCMD_NORETRIES_ABORT bit if the port_state
> has changed from marginal to online due to port_delete and port_add
> as we need the normal cmd retry behaviour
> 
> Made changes in fc_scsi_scan_rport as we are checking FC_PORTSTATE_ONLINE
> instead of FC_PORTSTATE_ONLINE and FC_PORTSTATE_MARGINAL
> 
> v4:
> Made changes in fc_eh_timed_out to call fc_rport_chkmarginal_set_noretries
> so that SCMD_NORETRIES_ABORT bit in cmd->state is set if rport state
> is marginal.
> 
> Removed the newly added scsi_cmd argument to fc_remote_port_chkready
> as the current patch handles only SCSI EH timeout/abort case.
> 
> v3:
> Rearranged the patch so that all the changes with respect to new
> rport state is part of this patch.
> Added a new argument to scsi_cmd  to fc_remote_port_chkready
> 
> v2:
> New patch
> ---
>   drivers/scsi/scsi_transport_fc.c | 62 +++++++++++++++++++++++---------
>   include/scsi/scsi_transport_fc.h |  4 ++-
>   2 files changed, 49 insertions(+), 17 deletions(-)
> 
> diff --git a/drivers/scsi/scsi_transport_fc.c b/drivers/scsi/scsi_transport_fc.c
> index a926e8f9e56e..ffd25195ae62 100644
> --- a/drivers/scsi/scsi_transport_fc.c
> +++ b/drivers/scsi/scsi_transport_fc.c
> @@ -148,20 +148,23 @@ fc_enum_name_search(host_event_code, fc_host_event_code,
>   static struct {
>   	enum fc_port_state	value;
>   	char			*name;
> +	int			matchlen;
>   } fc_port_state_names[] = {
> -	{ FC_PORTSTATE_UNKNOWN,		"Unknown" },
> -	{ FC_PORTSTATE_NOTPRESENT,	"Not Present" },
> -	{ FC_PORTSTATE_ONLINE,		"Online" },
> -	{ FC_PORTSTATE_OFFLINE,		"Offline" },
> -	{ FC_PORTSTATE_BLOCKED,		"Blocked" },
> -	{ FC_PORTSTATE_BYPASSED,	"Bypassed" },
> -	{ FC_PORTSTATE_DIAGNOSTICS,	"Diagnostics" },
> -	{ FC_PORTSTATE_LINKDOWN,	"Linkdown" },
> -	{ FC_PORTSTATE_ERROR,		"Error" },
> -	{ FC_PORTSTATE_LOOPBACK,	"Loopback" },
> -	{ FC_PORTSTATE_DELETED,		"Deleted" },
> +	{ FC_PORTSTATE_UNKNOWN,		"Unknown", 7},
> +	{ FC_PORTSTATE_NOTPRESENT,	"Not Present", 11 },
> +	{ FC_PORTSTATE_ONLINE,		"Online", 6 },
> +	{ FC_PORTSTATE_OFFLINE,		"Offline", 7 },
> +	{ FC_PORTSTATE_BLOCKED,		"Blocked", 7 },
> +	{ FC_PORTSTATE_BYPASSED,	"Bypassed", 8 },
> +	{ FC_PORTSTATE_DIAGNOSTICS,	"Diagnostics", 11 },
> +	{ FC_PORTSTATE_LINKDOWN,	"Linkdown", 8 },
> +	{ FC_PORTSTATE_ERROR,		"Error", 5 },
> +	{ FC_PORTSTATE_LOOPBACK,	"Loopback", 8 },
> +	{ FC_PORTSTATE_DELETED,		"Deleted", 7 },
> +	{ FC_PORTSTATE_MARGINAL,	"Marginal", 8 },

Why did you append the length of the string here?
This doesn't have anything to do with this patch, but rather is an 
improvement/modification of the original code, and should be delegated 
to a separate patch.

>   };
>   fc_enum_name_search(port_state, fc_port_state, fc_port_state_names)
> +fc_enum_name_match(port_state, fc_port_state, fc_port_state_names)
>   #define FC_PORTSTATE_MAX_NAMELEN	20
>   
>   
> @@ -2509,7 +2512,8 @@ fc_user_scan_tgt(struct Scsi_Host *shost, uint channel, uint id, u64 lun)
>   		if (rport->scsi_target_id == -1)
>   			continue;
>   
> -		if (rport->port_state != FC_PORTSTATE_ONLINE)
> +		if ((rport->port_state != FC_PORTSTATE_ONLINE) &&
> +			(rport->port_state != FC_PORTSTATE_MARGINAL))
>   			continue;
>   
>   		if ((channel == rport->channel) &&
> @@ -3373,7 +3377,8 @@ fc_remote_port_delete(struct fc_rport  *rport)
>   
>   	spin_lock_irqsave(shost->host_lock, flags);
>   
> -	if (rport->port_state != FC_PORTSTATE_ONLINE) {
> +	if ((rport->port_state != FC_PORTSTATE_ONLINE) &&
> +		(rport->port_state != FC_PORTSTATE_MARGINAL)) {
>   		spin_unlock_irqrestore(shost->host_lock, flags);
>   		return;
>   	}
> @@ -3515,7 +3520,8 @@ fc_timeout_deleted_rport(struct work_struct *work)
>   	 * target, validate it still is. If not, tear down the
>   	 * scsi_target on it.
>   	 */
> -	if ((rport->port_state == FC_PORTSTATE_ONLINE) &&
> +	if (((rport->port_state == FC_PORTSTATE_ONLINE) ||
> +		(rport->port_state == FC_PORTSTATE_MARGINAL)) &&
>   	    (rport->scsi_target_id != -1) &&
>   	    !(rport->roles & FC_PORT_ROLE_FCP_TARGET)) {
>   		dev_printk(KERN_ERR, &rport->dev,
> @@ -3658,7 +3664,8 @@ fc_scsi_scan_rport(struct work_struct *work)
>   	struct fc_internal *i = to_fc_internal(shost->transportt);
>   	unsigned long flags;
>   
> -	if ((rport->port_state == FC_PORTSTATE_ONLINE) &&
> +	if (((rport->port_state == FC_PORTSTATE_ONLINE) ||
> +		(rport->port_state == FC_PORTSTATE_MARGINAL)) &&
>   	    (rport->roles & FC_PORT_ROLE_FCP_TARGET) &&
>   	    !(i->f->disable_target_scan)) {
>   		scsi_scan_target(&rport->dev, rport->channel,
> @@ -3731,6 +3738,28 @@ int fc_block_scsi_eh(struct scsi_cmnd *cmnd)
>   }
>   EXPORT_SYMBOL(fc_block_scsi_eh);
>   
> +/*
> + * fc_eh_should_retry_cmd - Checks if the cmd should be retried or not
> + * @scmd:        The SCSI command to be checked
> + *
> + * This checks the rport state to decide if a cmd is
> + * retryable.
> + *
> + * Returns: true if the rport state is not in marginal state.
> + */
> +bool fc_eh_should_retry_cmd(struct scsi_cmnd *scmd)
> +{
> +	struct fc_rport *rport = starget_to_rport(scsi_target(scmd->device));
> +
> +	if ((rport->port_state != FC_PORTSTATE_ONLINE) &&
> +		(scmd->request->cmd_flags & REQ_FAILFAST_TRANSPORT)) {
> +		set_host_byte(scmd, DID_TRANSPORT_MARGINAL);
> +		return false;
> +	}
> +	return true;
> +}
> +EXPORT_SYMBOL_GPL(fc_eh_should_retry_cmd);
> +
>   /**
>    * fc_vport_setup - allocates and creates a FC virtual port.
>    * @shost:	scsi host the virtual port is connected to.
> @@ -4162,7 +4191,8 @@ static blk_status_t fc_bsg_rport_prep(struct fc_rport *rport)
>   	    !(rport->flags & FC_RPORT_FAST_FAIL_TIMEDOUT))
>   		return BLK_STS_RESOURCE;
>   
> -	if (rport->port_state != FC_PORTSTATE_ONLINE)
> +	if ((rport->port_state != FC_PORTSTATE_ONLINE) &&
> +		(rport->port_state != FC_PORTSTATE_MARGINAL))
>   		return BLK_STS_IOERR;
>   
>   	return BLK_STS_OK;
> diff --git a/include/scsi/scsi_transport_fc.h b/include/scsi/scsi_transport_fc.h
> index c759b29e46c7..14214ee121ad 100644
> --- a/include/scsi/scsi_transport_fc.h
> +++ b/include/scsi/scsi_transport_fc.h
> @@ -67,6 +67,7 @@ enum fc_port_state {
>   	FC_PORTSTATE_ERROR,
>   	FC_PORTSTATE_LOOPBACK,
>   	FC_PORTSTATE_DELETED,
> +	FC_PORTSTATE_MARGINAL,
>   };
>   
>   
> @@ -742,7 +743,6 @@ struct fc_function_template {
>   	unsigned long	disable_target_scan:1;
>   };
>   
> -
>   /**
>    * fc_remote_port_chkready - called to validate the remote port state
>    *   prior to initiating io to the port.
> @@ -758,6 +758,7 @@ fc_remote_port_chkready(struct fc_rport *rport)
>   
>   	switch (rport->port_state) {
>   	case FC_PORTSTATE_ONLINE:
> +	case FC_PORTSTATE_MARGINAL:
>   		if (rport->roles & FC_PORT_ROLE_FCP_TARGET)
>   			result = 0;
>   		else if (rport->flags & FC_RPORT_DEVLOSS_PENDING)
> @@ -839,6 +840,7 @@ int fc_vport_terminate(struct fc_vport *vport);
>   int fc_block_rport(struct fc_rport *rport);
>   int fc_block_scsi_eh(struct scsi_cmnd *cmnd);
>   enum blk_eh_timer_return fc_eh_timed_out(struct scsi_cmnd *scmd);
> +bool fc_eh_should_retry_cmd(struct scsi_cmnd *scmd);
>   
>   static inline struct Scsi_Host *fc_bsg_to_shost(struct bsg_job *job)
>   {
> 
Other than that:

Reviewed-by: Hannes Reinecke <hare@suse.de>

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                Kernel Storage Architect
hare@suse.de                              +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), Geschäftsführer: Felix Imendörffer

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v7 4/5] scsi_transport_fc: Added store fucntionality to set the rport port_state using sysfs
  2020-11-11  4:58 ` [PATCH v7 4/5] scsi_transport_fc: Added store fucntionality to set the rport port_state using sysfs Muneendra
@ 2020-11-16  8:20   ` Hannes Reinecke
  2020-11-23 19:47   ` Ewan D. Milne
  2020-11-24 17:44   ` Himanshu Madhani
  2 siblings, 0 replies; 26+ messages in thread
From: Hannes Reinecke @ 2020-11-16  8:20 UTC (permalink / raw)
  To: Muneendra, linux-scsi, michael.christie; +Cc: jsmart2021, emilne, mkumar

On 11/11/20 5:58 AM, Muneendra wrote:
> Added a store functionality to set rport port_state using sysfs
> under  fc_remote_ports/rport-*/port_state
> 
> With this functionality the user can move the port_state from
> Marginal -> Online and Online->Marginal.
> 
> On Marginal :This interface will set SCMD_NORETRIES_ABORT bit in
> scmd->state for all the pending io's on the scsi device associated
> with target port.
> 
> On Online :This interface will clear SCMD_NORETRIES_ABORT bit in
> scmd->state for all the pending io's on the scsi device associated
> with target port.
> 
> Below is the interface provided to set the port state to Marginal
> and Online.
> 
> echo "Marginal" >> /sys/class/fc_remote_ports/rport-X\:Y-Z/port_state
> echo "Online" >> /sys/class/fc_remote_ports/rport-X\:Y-Z/port_state
> 
> Signed-off-by: Muneendra <muneendra.kumar@broadcom.com>
> 
> ---
> v7:
> No change
> 
> v6:
> No change
> 
> v5:
> No change
> 
> v4:
> Addressed the error reported by kernel test robot
> Removed the code needed to traverse all the devices under rport
> to set/clear SCMD_NORETRIES_ABORT
> Removed unncessary comments.
> Return the error values on failure while setting the port_state
> 
> v3:
> Removed the port_state from starget attributes.
> Enabled the store functionality for port_state under remote port.
> used the starget_for_each_device to traverse around all the devices
> under rport
> 
> v2:
> Changed from a noretries_abort attribute under fc_transport/target*/ to
> port_state for changing the port_state to a marginal state
> ---
>   drivers/scsi/scsi_transport_fc.c | 56 ++++++++++++++++++++++++++++++--
>   1 file changed, 54 insertions(+), 2 deletions(-)
> 
Reviewed-by: Hannes Reinecke <hare@suse.de>

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                Kernel Storage Architect
hare@suse.de                              +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), Geschäftsführer: Felix Imendörffer

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v7 2/5] scsi: No retries on abort success
  2020-11-11  4:58 ` [PATCH v7 2/5] scsi: No retries on abort success Muneendra
@ 2020-11-16  8:22   ` Hannes Reinecke
  2020-11-23 19:45   ` Ewan D. Milne
  2020-11-24 17:43   ` Himanshu Madhani
  2 siblings, 0 replies; 26+ messages in thread
From: Hannes Reinecke @ 2020-11-16  8:22 UTC (permalink / raw)
  To: Muneendra, linux-scsi, michael.christie; +Cc: jsmart2021, emilne, mkumar

On 11/11/20 5:58 AM, Muneendra wrote:
> Added a new optional routine eh_should_retry_cmd
> in scsi_host_template that allows the transport to decide if a
> cmd is retryable.Return true if the transport is in a state the
> cmd should be retried on.
> 
> Added a new interface scsi_eh_should_retry_cmd which checks and
> calls the new routine eh_should_retry_cmd.
> 
> Made changes in scmd_eh_abort_handler and scsi_eh_flush_done_q which
> calls the scsi_eh_should_retry_cmd to check whether the
> command needs to be retried.
> 
> The above changes were done based on a patch by Mike Christie.
> 
> Signed-off-by: Muneendra <muneendra.kumar@broadcom.com>
> 
> ---
> v7:
> Added New routine in scsi_host_template to decide if a cmd is
> retryable instead of checking the same using  SCMD_NORETRIES_ABORT
> bit as the cmd retry part can be checked by validating the port state.
> 
> Moved the DID_TRANSPORT_MARGINAL changes to previous patch
> for reordering
> 
> v6:
> Rearranged the patch by merging second hunk of the patch2 in v5
> to this patch
> 
> v5:
> added the DID_TRANSPORT_MARGINAL case to
> scsi_decide_disposition
> v4:
> Modified the comments in the code appropriately
> 
> v3:
> Merged  first part of the previous patch(v2 patch3) with
> this patch.
> 
> v2:
> set the hostbyte as DID_TRANSPORT_MARGINAL instead of
> DID_TRANSPORT_FAILFAST.
> ---
>   drivers/scsi/scsi_error.c | 17 +++++++++++++++--
>   include/scsi/scsi_host.h  |  6 ++++++
>   2 files changed, 21 insertions(+), 2 deletions(-)
> 
Reviewed-by: Hannes Reinecke <hare@suse.de>

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                Kernel Storage Architect
hare@suse.de                              +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), Geschäftsführer: Felix Imendörffer

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v7 5/5] scsi:lpfc: Added support for eh_should_retry_cmd
  2020-11-11  4:58 ` [PATCH v7 5/5] scsi:lpfc: Added support for eh_should_retry_cmd Muneendra
@ 2020-11-16  8:23   ` Hannes Reinecke
  2020-11-23 19:51     ` Ewan D. Milne
  2020-11-23 19:48   ` Ewan D. Milne
  2020-11-24 17:46   ` Himanshu Madhani
  2 siblings, 1 reply; 26+ messages in thread
From: Hannes Reinecke @ 2020-11-16  8:23 UTC (permalink / raw)
  To: Muneendra, linux-scsi, michael.christie; +Cc: jsmart2021, emilne, mkumar

On 11/11/20 5:58 AM, Muneendra wrote:
> Added support to handle eh_should_retry_cmd in lpfc_template.
> 
> Signed-off-by: Muneendra <muneendra.kumar@broadcom.com>
> 
> ---
> v7:
> New patch
> ---
>   drivers/scsi/lpfc/lpfc_scsi.c | 1 +
>   1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/scsi/lpfc/lpfc_scsi.c b/drivers/scsi/lpfc/lpfc_scsi.c
> index 4ffdfd2c8604..dbc80e6423ea 100644
> --- a/drivers/scsi/lpfc/lpfc_scsi.c
> +++ b/drivers/scsi/lpfc/lpfc_scsi.c
> @@ -6040,6 +6040,7 @@ struct scsi_host_template lpfc_template = {
>   	.info			= lpfc_info,
>   	.queuecommand		= lpfc_queuecommand,
>   	.eh_timed_out		= fc_eh_timed_out,
> +	.eh_should_retry_cmd    = fc_eh_should_retry_cmd,
>   	.eh_abort_handler	= lpfc_abort_handler,
>   	.eh_device_reset_handler = lpfc_device_reset_handler,
>   	.eh_target_reset_handler = lpfc_target_reset_handler,
> 
I guess this change is pretty generic, and should be made to every FC 
HBA driver. But probably not your immediate scope, I do agree :-)

Reviewed-by: Hannes Reinecke <hare@suse.de>

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                Kernel Storage Architect
hare@suse.de                              +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), Geschäftsführer: Felix Imendörffer

^ permalink raw reply	[flat|nested] 26+ messages in thread

* RE: [PATCH v7 3/5] scsi_transport_fc: Added a new rport state FC_PORTSTATE_MARGINAL
  2020-11-16  8:19   ` Hannes Reinecke
@ 2020-11-17  7:43     ` Muneendra Kumar M
  2020-11-23 20:01     ` Ewan D. Milne
  1 sibling, 0 replies; 26+ messages in thread
From: Muneendra Kumar M @ 2020-11-17  7:43 UTC (permalink / raw)
  To: Hannes Reinecke, linux-scsi, michael.christie; +Cc: jsmart2021, emilne, mkumar

[-- Attachment #1: Type: text/plain, Size: 1136 bytes --]

Hi Hannes,
Thanks for the review.
> +	{ FC_PORTSTATE_LOOPBACK,	"Loopback", 8 },
> +	{ FC_PORTSTATE_DELETED,		"Deleted", 7 },
> +	{ FC_PORTSTATE_MARGINAL,	"Marginal", 8 },

>Why did you append the length of the string here?
>This doesn't have anything to do with this patch, but rather is an
>improvement/modification of the original code, and should be delegated to a
>separate patch.
[Muneendra] fc_enum_name_match needs the string length for validation.
get_fc_port_state_match  will use this as part of setting the port_state in
sysfs for validation(next patch 4/5).

>   };
>   fc_enum_name_search(port_state, fc_port_state, fc_port_state_names)
> +fc_enum_name_match(port_state, fc_port_state, fc_port_state_names)
>   #define FC_PORTSTATE_MAX_NAMELEN	20
>
....
...

>Other than that:

>Reviewed-by: Hannes Reinecke <hare@suse.de>

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                Kernel Storage Architect
hare@suse.de                              +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg HRB 36809 (AG
Nürnberg), Geschäftsführer: Felix Imendörffer

[-- Attachment #2: S/MIME Cryptographic Signature --]
[-- Type: application/pkcs7-signature, Size: 4177 bytes --]

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v7 1/5] scsi: Added a new error code DID_TRANSPORT_MARGINAL in scsi.h
  2020-11-11  4:58 ` [PATCH v7 1/5] scsi: Added a new error code DID_TRANSPORT_MARGINAL in scsi.h Muneendra
  2020-11-16  8:16   ` Hannes Reinecke
@ 2020-11-23 19:45   ` Ewan D. Milne
  2020-11-24 17:42   ` Himanshu Madhani
  2 siblings, 0 replies; 26+ messages in thread
From: Ewan D. Milne @ 2020-11-23 19:45 UTC (permalink / raw)
  To: Muneendra, linux-scsi

On Wed, 2020-11-11 at 10:28 +0530, Muneendra wrote:
> Added a new error code DID_TRANSPORT_MARGINAL to handle marginal
> errors in scsi.h
> 
> Added a code in scsi_result_to_blk_status to translate
> a new error DID_TRANSPORT_MARGINAL to the corresponding blk_status_t
> i.e BLK_STS_TRANSPORT
> 
> Added DID_TRANSPORT_MARGINAL case to scsi_decide_disposition
> 
> Signed-off-by: Muneendra <muneendra.kumar@broadcom.com>
> 
> ---
> v7:
> Rearranged the patch by moving the DID_TRANSPORT_MARGINAL
> and the changes with respect to the same to this patch
> from the previous patch2 in v6
> 
> Removed the previuos patch patch1 in v6 as in the
> current approach there is no need of this bit SCMD_NORETRIES_ABORT
> 
> v6:
> Rearranged the patch by merging second hunk of the patch2 in v5
> to this patch
> 
> v5:
> added the DID_TRANSPORT_MARGINAL case to
> scsi_decide_disposition
> v4:
> Modified the comments in the code appropriately
> 
> v3:
> Merged  first part of the previous patch(v2 patch3) with
> this patch.
> 
> v2:
> set the hostbyte as DID_TRANSPORT_MARGINAL instead of
> DID_TRANSPORT_FAILFAST.
> ---
>  drivers/scsi/scsi_error.c | 6 ++++++
>  drivers/scsi/scsi_lib.c   | 1 +
>  include/scsi/scsi.h       | 1 +
>  3 files changed, 8 insertions(+)
> 
> diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
> index f11f51e2465f..28056ee498b3 100644
> --- a/drivers/scsi/scsi_error.c
> +++ b/drivers/scsi/scsi_error.c
> @@ -1861,6 +1861,12 @@ int scsi_decide_disposition(struct scsi_cmnd
> *scmd)
>  		 * the fast io fail tmo fired), so send IO directly
> upwards.
>  		 */
>  		return SUCCESS;
> +	case DID_TRANSPORT_MARGINAL:
> +		/*
> +		 * caller has decided not to do retries on
> +		 * abort success, so send IO directly upwards
> +		 */
> +		return SUCCESS;
>  	case DID_ERROR:
>  		if (msg_byte(scmd->result) == COMMAND_COMPLETE &&
>  		    status_byte(scmd->result) == RESERVATION_CONFLICT)
> diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
> index 20a357563d3d..ce1e2adaca36 100644
> --- a/drivers/scsi/scsi_lib.c
> +++ b/drivers/scsi/scsi_lib.c
> @@ -629,6 +629,7 @@ static blk_status_t
> scsi_result_to_blk_status(struct scsi_cmnd *cmd, int result)
>  			return BLK_STS_OK;
>  		return BLK_STS_IOERR;
>  	case DID_TRANSPORT_FAILFAST:
> +	case DID_TRANSPORT_MARGINAL:
>  		return BLK_STS_TRANSPORT;
>  	case DID_TARGET_FAILURE:
>  		set_host_byte(cmd, DID_OK);
> diff --git a/include/scsi/scsi.h b/include/scsi/scsi.h
> index 5339baadc082..5b287ad8b727 100644
> --- a/include/scsi/scsi.h
> +++ b/include/scsi/scsi.h
> @@ -159,6 +159,7 @@ static inline int scsi_is_wlun(u64 lun)
>  				 * paths might yield different results
> */
>  #define DID_ALLOC_FAILURE 0x12  /* Space allocation on the device
> failed */
>  #define DID_MEDIUM_ERROR  0x13  /* Medium error */
> +#define DID_TRANSPORT_MARGINAL 0x14 /* Transport marginal errors */
>  #define DRIVER_OK       0x00	/* Driver
> status                           */
>  
>  /*

Reviewed-by: Ewan D. Milne <emilne@redhat.com>



^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v7 2/5] scsi: No retries on abort success
  2020-11-11  4:58 ` [PATCH v7 2/5] scsi: No retries on abort success Muneendra
  2020-11-16  8:22   ` Hannes Reinecke
@ 2020-11-23 19:45   ` Ewan D. Milne
  2020-11-24 17:43   ` Himanshu Madhani
  2 siblings, 0 replies; 26+ messages in thread
From: Ewan D. Milne @ 2020-11-23 19:45 UTC (permalink / raw)
  To: Muneendra, linux-scsi

On Wed, 2020-11-11 at 10:28 +0530, Muneendra wrote:
> Added a new optional routine eh_should_retry_cmd
> in scsi_host_template that allows the transport to decide if a
> cmd is retryable.Return true if the transport is in a state the
> cmd should be retried on.
> 
> Added a new interface scsi_eh_should_retry_cmd which checks and
> calls the new routine eh_should_retry_cmd.
> 
> Made changes in scmd_eh_abort_handler and scsi_eh_flush_done_q which
> calls the scsi_eh_should_retry_cmd to check whether the
> command needs to be retried.
> 
> The above changes were done based on a patch by Mike Christie.
> 
> Signed-off-by: Muneendra <muneendra.kumar@broadcom.com>
> 
> ---
> v7:
> Added New routine in scsi_host_template to decide if a cmd is
> retryable instead of checking the same using  SCMD_NORETRIES_ABORT
> bit as the cmd retry part can be checked by validating the port
> state.
> 
> Moved the DID_TRANSPORT_MARGINAL changes to previous patch
> for reordering
> 
> v6:
> Rearranged the patch by merging second hunk of the patch2 in v5
> to this patch
> 
> v5:
> added the DID_TRANSPORT_MARGINAL case to
> scsi_decide_disposition
> v4:
> Modified the comments in the code appropriately
> 
> v3:
> Merged  first part of the previous patch(v2 patch3) with
> this patch.
> 
> v2:
> set the hostbyte as DID_TRANSPORT_MARGINAL instead of
> DID_TRANSPORT_FAILFAST.
> ---
>  drivers/scsi/scsi_error.c | 17 +++++++++++++++--
>  include/scsi/scsi_host.h  |  6 ++++++
>  2 files changed, 21 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
> index 28056ee498b3..1cdfa5a8ca09 100644
> --- a/drivers/scsi/scsi_error.c
> +++ b/drivers/scsi/scsi_error.c
> @@ -124,6 +124,17 @@ static bool scsi_cmd_retry_allowed(struct
> scsi_cmnd *cmd)
>  	return ++cmd->retries <= cmd->allowed;
>  }
>  
> +static bool scsi_eh_should_retry_cmd(struct scsi_cmnd *cmd)
> +{
> +	struct scsi_device *sdev = cmd->device;
> +	struct Scsi_Host *host = sdev->host;
> +
> +	if (host->hostt->eh_should_retry_cmd)
> +		return  host->hostt->eh_should_retry_cmd(cmd);
> +
> +	return true;
> +}
> +
>  /**
>   * scmd_eh_abort_handler - Handle command aborts
>   * @work:	command to be aborted.
> @@ -159,7 +170,8 @@ scmd_eh_abort_handler(struct work_struct *work)
>  						    "eh timeout, not
> retrying "
>  						    "aborted
> command\n"));
>  			} else if (!scsi_noretry_cmd(scmd) &&
> -				   scsi_cmd_retry_allowed(scmd)) {
> +				   scsi_cmd_retry_allowed(scmd) &&
> +				scsi_eh_should_retry_cmd(scmd)) {
>  				SCSI_LOG_ERROR_RECOVERY(3,
>  					scmd_printk(KERN_WARNING, scmd,
>  						    "retry aborted
> command\n"));
> @@ -2111,7 +2123,8 @@ void scsi_eh_flush_done_q(struct list_head
> *done_q)
>  	list_for_each_entry_safe(scmd, next, done_q, eh_entry) {
>  		list_del_init(&scmd->eh_entry);
>  		if (scsi_device_online(scmd->device) &&
> -		    !scsi_noretry_cmd(scmd) &&
> scsi_cmd_retry_allowed(scmd)) {
> +		    !scsi_noretry_cmd(scmd) &&
> scsi_cmd_retry_allowed(scmd) &&
> +			scsi_eh_should_retry_cmd(scmd)) {
>  			SCSI_LOG_ERROR_RECOVERY(3,
>  				scmd_printk(KERN_INFO, scmd,
>  					     "%s: flush retry cmd\n",
> diff --git a/include/scsi/scsi_host.h b/include/scsi/scsi_host.h
> index 701f178b20ae..e30fd963b97d 100644
> --- a/include/scsi/scsi_host.h
> +++ b/include/scsi/scsi_host.h
> @@ -314,6 +314,12 @@ struct scsi_host_template {
>  	 * Status: OPTIONAL
>  	 */
>  	enum blk_eh_timer_return (*eh_timed_out)(struct scsi_cmnd *);
> +	/*
> +	 * Optional routine that allows the transport to decide if a
> cmd
> +	 * is retryable. Return true if the transport is in a state the
> +	 * cmd should be retried on.
> +	 */
> +	bool (*eh_should_retry_cmd)(struct scsi_cmnd *scmd);
>  
>  	/* This is an optional routine that allows transport to
> initiate
>  	 * LLD adapter or firmware reset using sysfs attribute.

Reviewed-by: Ewan D. Milne <emilne@redhat.com>



^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v7 3/5] scsi_transport_fc: Added a new rport state FC_PORTSTATE_MARGINAL
  2020-11-11  4:58 ` [PATCH v7 3/5] scsi_transport_fc: Added a new rport state FC_PORTSTATE_MARGINAL Muneendra
  2020-11-16  8:19   ` Hannes Reinecke
@ 2020-11-23 19:47   ` Ewan D. Milne
  2020-11-24 17:43   ` Himanshu Madhani
  2 siblings, 0 replies; 26+ messages in thread
From: Ewan D. Milne @ 2020-11-23 19:47 UTC (permalink / raw)
  To: Muneendra, linux-scsi; +Cc: jsmart2021, mkumar

On Wed, 2020-11-11 at 10:28 +0530, Muneendra wrote:
> Added a new rport state FC_PORTSTATE_MARGINAL.
> 
> Added a new interface fc_eh_should_retry_cmd which Checks if the cmd
> should be retried or not by checking the rport state.
> If the rport state is marginal it returns
> false to make sure there won't be any retries on the cmd.
> 
> Also made changes in fc_remote_port_delete,fc_user_scan_tgt,
> fc_timeout_deleted_rport functions  to handle the new rport state
> FC_PORTSTATE_MARGINAL.
> 
> Signed-off-by: Muneendra <muneendra.kumar@broadcom.com>
> 
> ---
> v7:
> Removed the changes related to SCMD_NORETRIES_ABORT bit.
> 
> Added a new function fc_eh_should_retry_cmd to check whether the cmd
> should be retried based on the rport state.
> 
> v6:
> No change
> 
> v5:
> Made changes to clear the SCMD_NORETRIES_ABORT bit if the port_state
> has changed from marginal to online due to port_delete and port_add
> as we need the normal cmd retry behaviour
> 
> Made changes in fc_scsi_scan_rport as we are checking
> FC_PORTSTATE_ONLINE
> instead of FC_PORTSTATE_ONLINE and FC_PORTSTATE_MARGINAL
> 
> v4:
> Made changes in fc_eh_timed_out to call
> fc_rport_chkmarginal_set_noretries
> so that SCMD_NORETRIES_ABORT bit in cmd->state is set if rport state
> is marginal.
> 
> Removed the newly added scsi_cmd argument to fc_remote_port_chkready
> as the current patch handles only SCSI EH timeout/abort case.
> 
> v3:
> Rearranged the patch so that all the changes with respect to new
> rport state is part of this patch.
> Added a new argument to scsi_cmd  to fc_remote_port_chkready
> 
> v2:
> New patch
> ---
>  drivers/scsi/scsi_transport_fc.c | 62 +++++++++++++++++++++++-------
> --
>  include/scsi/scsi_transport_fc.h |  4 ++-
>  2 files changed, 49 insertions(+), 17 deletions(-)
> 
> diff --git a/drivers/scsi/scsi_transport_fc.c
> b/drivers/scsi/scsi_transport_fc.c
> index a926e8f9e56e..ffd25195ae62 100644
> --- a/drivers/scsi/scsi_transport_fc.c
> +++ b/drivers/scsi/scsi_transport_fc.c
> @@ -148,20 +148,23 @@ fc_enum_name_search(host_event_code,
> fc_host_event_code,
>  static struct {
>  	enum fc_port_state	value;
>  	char			*name;
> +	int			matchlen;
>  } fc_port_state_names[] = {
> -	{ FC_PORTSTATE_UNKNOWN,		"Unknown" },
> -	{ FC_PORTSTATE_NOTPRESENT,	"Not Present" },
> -	{ FC_PORTSTATE_ONLINE,		"Online" },
> -	{ FC_PORTSTATE_OFFLINE,		"Offline" },
> -	{ FC_PORTSTATE_BLOCKED,		"Blocked" },
> -	{ FC_PORTSTATE_BYPASSED,	"Bypassed" },
> -	{ FC_PORTSTATE_DIAGNOSTICS,	"Diagnostics" },
> -	{ FC_PORTSTATE_LINKDOWN,	"Linkdown" },
> -	{ FC_PORTSTATE_ERROR,		"Error" },
> -	{ FC_PORTSTATE_LOOPBACK,	"Loopback" },
> -	{ FC_PORTSTATE_DELETED,		"Deleted" },
> +	{ FC_PORTSTATE_UNKNOWN,		"Unknown", 7},
> +	{ FC_PORTSTATE_NOTPRESENT,	"Not Present", 11 },
> +	{ FC_PORTSTATE_ONLINE,		"Online", 6 },
> +	{ FC_PORTSTATE_OFFLINE,		"Offline", 7 },
> +	{ FC_PORTSTATE_BLOCKED,		"Blocked", 7 },
> +	{ FC_PORTSTATE_BYPASSED,	"Bypassed", 8 },
> +	{ FC_PORTSTATE_DIAGNOSTICS,	"Diagnostics", 11 },
> +	{ FC_PORTSTATE_LINKDOWN,	"Linkdown", 8 },
> +	{ FC_PORTSTATE_ERROR,		"Error", 5 },
> +	{ FC_PORTSTATE_LOOPBACK,	"Loopback", 8 },
> +	{ FC_PORTSTATE_DELETED,		"Deleted", 7 },
> +	{ FC_PORTSTATE_MARGINAL,	"Marginal", 8 },
>  };
>  fc_enum_name_search(port_state, fc_port_state, fc_port_state_names)
> +fc_enum_name_match(port_state, fc_port_state, fc_port_state_names)
>  #define FC_PORTSTATE_MAX_NAMELEN	20
>  
>  
> @@ -2509,7 +2512,8 @@ fc_user_scan_tgt(struct Scsi_Host *shost, uint
> channel, uint id, u64 lun)
>  		if (rport->scsi_target_id == -1)
>  			continue;
>  
> -		if (rport->port_state != FC_PORTSTATE_ONLINE)
> +		if ((rport->port_state != FC_PORTSTATE_ONLINE) &&
> +			(rport->port_state != FC_PORTSTATE_MARGINAL))
>  			continue;
>  
>  		if ((channel == rport->channel) &&
> @@ -3373,7 +3377,8 @@ fc_remote_port_delete(struct fc_rport  *rport)
>  
>  	spin_lock_irqsave(shost->host_lock, flags);
>  
> -	if (rport->port_state != FC_PORTSTATE_ONLINE) {
> +	if ((rport->port_state != FC_PORTSTATE_ONLINE) &&
> +		(rport->port_state != FC_PORTSTATE_MARGINAL)) {
>  		spin_unlock_irqrestore(shost->host_lock, flags);
>  		return;
>  	}
> @@ -3515,7 +3520,8 @@ fc_timeout_deleted_rport(struct work_struct
> *work)
>  	 * target, validate it still is. If not, tear down the
>  	 * scsi_target on it.
>  	 */
> -	if ((rport->port_state == FC_PORTSTATE_ONLINE) &&
> +	if (((rport->port_state == FC_PORTSTATE_ONLINE) ||
> +		(rport->port_state == FC_PORTSTATE_MARGINAL)) &&
>  	    (rport->scsi_target_id != -1) &&
>  	    !(rport->roles & FC_PORT_ROLE_FCP_TARGET)) {
>  		dev_printk(KERN_ERR, &rport->dev,
> @@ -3658,7 +3664,8 @@ fc_scsi_scan_rport(struct work_struct *work)
>  	struct fc_internal *i = to_fc_internal(shost->transportt);
>  	unsigned long flags;
>  
> -	if ((rport->port_state == FC_PORTSTATE_ONLINE) &&
> +	if (((rport->port_state == FC_PORTSTATE_ONLINE) ||
> +		(rport->port_state == FC_PORTSTATE_MARGINAL)) &&
>  	    (rport->roles & FC_PORT_ROLE_FCP_TARGET) &&
>  	    !(i->f->disable_target_scan)) {
>  		scsi_scan_target(&rport->dev, rport->channel,
> @@ -3731,6 +3738,28 @@ int fc_block_scsi_eh(struct scsi_cmnd *cmnd)
>  }
>  EXPORT_SYMBOL(fc_block_scsi_eh);
>  
> +/*
> + * fc_eh_should_retry_cmd - Checks if the cmd should be retried or
> not
> + * @scmd:        The SCSI command to be checked
> + *
> + * This checks the rport state to decide if a cmd is
> + * retryable.
> + *
> + * Returns: true if the rport state is not in marginal state.
> + */
> +bool fc_eh_should_retry_cmd(struct scsi_cmnd *scmd)
> +{
> +	struct fc_rport *rport = starget_to_rport(scsi_target(scmd-
> >device));
> +
> +	if ((rport->port_state != FC_PORTSTATE_ONLINE) &&
> +		(scmd->request->cmd_flags & REQ_FAILFAST_TRANSPORT)) {
> +		set_host_byte(scmd, DID_TRANSPORT_MARGINAL);
> +		return false;
> +	}
> +	return true;
> +}
> +EXPORT_SYMBOL_GPL(fc_eh_should_retry_cmd);
> +
>  /**
>   * fc_vport_setup - allocates and creates a FC virtual port.
>   * @shost:	scsi host the virtual port is connected to.
> @@ -4162,7 +4191,8 @@ static blk_status_t fc_bsg_rport_prep(struct
> fc_rport *rport)
>  	    !(rport->flags & FC_RPORT_FAST_FAIL_TIMEDOUT))
>  		return BLK_STS_RESOURCE;
>  
> -	if (rport->port_state != FC_PORTSTATE_ONLINE)
> +	if ((rport->port_state != FC_PORTSTATE_ONLINE) &&
> +		(rport->port_state != FC_PORTSTATE_MARGINAL))
>  		return BLK_STS_IOERR;
>  
>  	return BLK_STS_OK;
> diff --git a/include/scsi/scsi_transport_fc.h
> b/include/scsi/scsi_transport_fc.h
> index c759b29e46c7..14214ee121ad 100644
> --- a/include/scsi/scsi_transport_fc.h
> +++ b/include/scsi/scsi_transport_fc.h
> @@ -67,6 +67,7 @@ enum fc_port_state {
>  	FC_PORTSTATE_ERROR,
>  	FC_PORTSTATE_LOOPBACK,
>  	FC_PORTSTATE_DELETED,
> +	FC_PORTSTATE_MARGINAL,
>  };
>  
>  
> @@ -742,7 +743,6 @@ struct fc_function_template {
>  	unsigned long	disable_target_scan:1;
>  };
>  
> -
>  /**
>   * fc_remote_port_chkready - called to validate the remote port
> state
>   *   prior to initiating io to the port.
> @@ -758,6 +758,7 @@ fc_remote_port_chkready(struct fc_rport *rport)
>  
>  	switch (rport->port_state) {
>  	case FC_PORTSTATE_ONLINE:
> +	case FC_PORTSTATE_MARGINAL:
>  		if (rport->roles & FC_PORT_ROLE_FCP_TARGET)
>  			result = 0;
>  		else if (rport->flags & FC_RPORT_DEVLOSS_PENDING)
> @@ -839,6 +840,7 @@ int fc_vport_terminate(struct fc_vport *vport);
>  int fc_block_rport(struct fc_rport *rport);
>  int fc_block_scsi_eh(struct scsi_cmnd *cmnd);
>  enum blk_eh_timer_return fc_eh_timed_out(struct scsi_cmnd *scmd);
> +bool fc_eh_should_retry_cmd(struct scsi_cmnd *scmd);
>  
>  static inline struct Scsi_Host *fc_bsg_to_shost(struct bsg_job *job)
>  {

Reviewed-by: Ewan D. Milne <emilne@redhat.com>



^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v7 4/5] scsi_transport_fc: Added store fucntionality to set the rport port_state using sysfs
  2020-11-11  4:58 ` [PATCH v7 4/5] scsi_transport_fc: Added store fucntionality to set the rport port_state using sysfs Muneendra
  2020-11-16  8:20   ` Hannes Reinecke
@ 2020-11-23 19:47   ` Ewan D. Milne
  2020-11-24 17:44   ` Himanshu Madhani
  2 siblings, 0 replies; 26+ messages in thread
From: Ewan D. Milne @ 2020-11-23 19:47 UTC (permalink / raw)
  To: Muneendra, linux-scsi, michael.christie, hare; +Cc: jsmart2021, mkumar

On Wed, 2020-11-11 at 10:28 +0530, Muneendra wrote:
> Added a store functionality to set rport port_state using sysfs
> under  fc_remote_ports/rport-*/port_state
> 
> With this functionality the user can move the port_state from
> Marginal -> Online and Online->Marginal.
> 
> On Marginal :This interface will set SCMD_NORETRIES_ABORT bit in
> scmd->state for all the pending io's on the scsi device associated
> with target port.
> 
> On Online :This interface will clear SCMD_NORETRIES_ABORT bit in
> scmd->state for all the pending io's on the scsi device associated
> with target port.
> 
> Below is the interface provided to set the port state to Marginal
> and Online.
> 
> echo "Marginal" >> /sys/class/fc_remote_ports/rport-X\:Y-Z/port_state
> echo "Online" >> /sys/class/fc_remote_ports/rport-X\:Y-Z/port_state
> 
> Signed-off-by: Muneendra <muneendra.kumar@broadcom.com>
> 
> ---
> v7:
> No change
> 
> v6:
> No change
> 
> v5:
> No change
> 
> v4:
> Addressed the error reported by kernel test robot
> Removed the code needed to traverse all the devices under rport
> to set/clear SCMD_NORETRIES_ABORT
> Removed unncessary comments.
> Return the error values on failure while setting the port_state
> 
> v3:
> Removed the port_state from starget attributes.
> Enabled the store functionality for port_state under remote port.
> used the starget_for_each_device to traverse around all the devices
> under rport
> 
> v2:
> Changed from a noretries_abort attribute under fc_transport/target*/
> to
> port_state for changing the port_state to a marginal state
> ---
>  drivers/scsi/scsi_transport_fc.c | 56
> ++++++++++++++++++++++++++++++--
>  1 file changed, 54 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/scsi/scsi_transport_fc.c
> b/drivers/scsi/scsi_transport_fc.c
> index ffd25195ae62..d378ca4a60fe 100644
> --- a/drivers/scsi/scsi_transport_fc.c
> +++ b/drivers/scsi/scsi_transport_fc.c
> @@ -1238,7 +1238,59 @@ show_fc_rport_roles (struct device *dev,
> struct device_attribute *attr,
>  static FC_DEVICE_ATTR(rport, roles, S_IRUGO,
>  		show_fc_rport_roles, NULL);
>  
> -fc_private_rport_rd_enum_attr(port_state, FC_PORTSTATE_MAX_NAMELEN);
> +static ssize_t fc_rport_set_marginal_state(struct device *dev,
> +						struct device_attribute
> *attr,
> +						const char *buf, size_t
> count)
> +{
> +	struct fc_rport *rport = transport_class_to_rport(dev);
> +	enum fc_port_state port_state;
> +	int ret = 0;
> +
> +	ret = get_fc_port_state_match(buf, &port_state);
> +	if (ret)
> +		return -EINVAL;
> +	if (port_state == FC_PORTSTATE_MARGINAL) {
> +		/*
> +		 * Change the state to marginal only if the
> +		 * current rport state is Online
> +		 * Allow only Online->marginal
> +		 */
> +		if (rport->port_state == FC_PORTSTATE_ONLINE)
> +			rport->port_state = port_state;
> +		else
> +			return -EINVAL;
> +	} else if (port_state == FC_PORTSTATE_ONLINE) {
> +		/*
> +		 * Change the state to Online only if the
> +		 * current rport state is Marginal
> +		 * Allow only  MArginal->Online
> +		 */
> +		if (rport->port_state == FC_PORTSTATE_MARGINAL)
> +			rport->port_state = port_state;
> +		else
> +			return -EINVAL;
> +	} else
> +		return -EINVAL;
> +	return count;
> +}
> +
> +static ssize_t
> +show_fc_rport_port_state(struct device *dev,
> +				struct device_attribute *attr, char
> *buf)
> +{
> +	const char *name;
> +	struct fc_rport *rport = transport_class_to_rport(dev);
> +
> +	name = get_fc_port_state_name(rport->port_state);
> +	if (!name)
> +		return -EINVAL;
> +
> +	return snprintf(buf, 20, "%s\n", name);
> +}
> +
> +static FC_DEVICE_ATTR(rport, port_state, 0444 | 0200,
> +			show_fc_rport_port_state,
> fc_rport_set_marginal_state);
> +
>  fc_private_rport_rd_attr(scsi_target_id, "%d\n", 20);
>  
>  /*
> @@ -2681,7 +2733,7 @@ fc_attach_transport(struct fc_function_template
> *ft)
>  	SETUP_PRIVATE_RPORT_ATTRIBUTE_RD(port_name);
>  	SETUP_PRIVATE_RPORT_ATTRIBUTE_RD(port_id);
>  	SETUP_PRIVATE_RPORT_ATTRIBUTE_RD(roles);
> -	SETUP_PRIVATE_RPORT_ATTRIBUTE_RD(port_state);
> +	SETUP_PRIVATE_RPORT_ATTRIBUTE_RW(port_state);
>  	SETUP_PRIVATE_RPORT_ATTRIBUTE_RD(scsi_target_id);
>  	SETUP_PRIVATE_RPORT_ATTRIBUTE_RW(fast_io_fail_tmo);
>  

Reviewed-by: Ewan D. Milne <emilne@redhat.com>



^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v7 5/5] scsi:lpfc: Added support for eh_should_retry_cmd
  2020-11-11  4:58 ` [PATCH v7 5/5] scsi:lpfc: Added support for eh_should_retry_cmd Muneendra
  2020-11-16  8:23   ` Hannes Reinecke
@ 2020-11-23 19:48   ` Ewan D. Milne
  2020-11-24 17:46   ` Himanshu Madhani
  2 siblings, 0 replies; 26+ messages in thread
From: Ewan D. Milne @ 2020-11-23 19:48 UTC (permalink / raw)
  To: Muneendra, linux-scsi; +Cc: jsmart2021, mkumar

On Wed, 2020-11-11 at 10:28 +0530, Muneendra wrote:
> Added support to handle eh_should_retry_cmd in lpfc_template.
> 
> Signed-off-by: Muneendra <muneendra.kumar@broadcom.com>
> 
> ---
> v7:
> New patch
> ---
>  drivers/scsi/lpfc/lpfc_scsi.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/scsi/lpfc/lpfc_scsi.c
> b/drivers/scsi/lpfc/lpfc_scsi.c
> index 4ffdfd2c8604..dbc80e6423ea 100644
> --- a/drivers/scsi/lpfc/lpfc_scsi.c
> +++ b/drivers/scsi/lpfc/lpfc_scsi.c
> @@ -6040,6 +6040,7 @@ struct scsi_host_template lpfc_template = {
>  	.info			= lpfc_info,
>  	.queuecommand		= lpfc_queuecommand,
>  	.eh_timed_out		= fc_eh_timed_out,
> +	.eh_should_retry_cmd    = fc_eh_should_retry_cmd,
>  	.eh_abort_handler	= lpfc_abort_handler,
>  	.eh_device_reset_handler = lpfc_device_reset_handler,
>  	.eh_target_reset_handler = lpfc_target_reset_handler,

Reviewed-by: Ewan D. Milne <emilne@redhat.com>



^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v7 5/5] scsi:lpfc: Added support for eh_should_retry_cmd
  2020-11-16  8:23   ` Hannes Reinecke
@ 2020-11-23 19:51     ` Ewan D. Milne
  0 siblings, 0 replies; 26+ messages in thread
From: Ewan D. Milne @ 2020-11-23 19:51 UTC (permalink / raw)
  To: Hannes Reinecke, Muneendra, linux-scsi, michael.christie
  Cc: jsmart2021, mkumar

On Mon, 2020-11-16 at 09:23 +0100, Hannes Reinecke wrote:
> On 11/11/20 5:58 AM, Muneendra wrote:
> > Added support to handle eh_should_retry_cmd in lpfc_template.
> > 
> > Signed-off-by: Muneendra <muneendra.kumar@broadcom.com>
> > 
> > ---
> > v7:
> > New patch
> > ---
> >   drivers/scsi/lpfc/lpfc_scsi.c | 1 +
> >   1 file changed, 1 insertion(+)
> > 
> > diff --git a/drivers/scsi/lpfc/lpfc_scsi.c
> > b/drivers/scsi/lpfc/lpfc_scsi.c
> > index 4ffdfd2c8604..dbc80e6423ea 100644
> > --- a/drivers/scsi/lpfc/lpfc_scsi.c
> > +++ b/drivers/scsi/lpfc/lpfc_scsi.c
> > @@ -6040,6 +6040,7 @@ struct scsi_host_template lpfc_template = {
> >   	.info			= lpfc_info,
> >   	.queuecommand		= lpfc_queuecommand,
> >   	.eh_timed_out		= fc_eh_timed_out,
> > +	.eh_should_retry_cmd    = fc_eh_should_retry_cmd,
> >   	.eh_abort_handler	= lpfc_abort_handler,
> >   	.eh_device_reset_handler = lpfc_device_reset_handler,
> >   	.eh_target_reset_handler = lpfc_target_reset_handler,
> > 
> 
> I guess this change is pretty generic, and should be made to every
> FC 
> HBA driver. But probably not your immediate scope, I do agree :-)

Yes, I think this is better left to the other driver maintainers
as it should be tested if it is called via the host template.

-Ewan

> 
> Reviewed-by: Hannes Reinecke <hare@suse.de>
> 
> Cheers,
> 
> Hannes


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v7 3/5] scsi_transport_fc: Added a new rport state FC_PORTSTATE_MARGINAL
  2020-11-16  8:19   ` Hannes Reinecke
  2020-11-17  7:43     ` Muneendra Kumar M
@ 2020-11-23 20:01     ` Ewan D. Milne
  1 sibling, 0 replies; 26+ messages in thread
From: Ewan D. Milne @ 2020-11-23 20:01 UTC (permalink / raw)
  To: Hannes Reinecke, Muneendra, linux-scsi, michael.christie
  Cc: jsmart2021, mkumar

On Mon, 2020-11-16 at 09:19 +0100, Hannes Reinecke wrote:
> On 11/11/20 5:58 AM, Muneendra wrote:
> > Added a new rport state FC_PORTSTATE_MARGINAL.
> > 
> > Added a new interface fc_eh_should_retry_cmd which Checks if the
> > cmd
> > should be retried or not by checking the rport state.
> > If the rport state is marginal it returns
> > false to make sure there won't be any retries on the cmd.
> > 
> > Also made changes in fc_remote_port_delete,fc_user_scan_tgt,
> > fc_timeout_deleted_rport functions  to handle the new rport state
> > FC_PORTSTATE_MARGINAL.
> > 
> > Signed-off-by: Muneendra <muneendra.kumar@broadcom.com>
> > 
> > ---
> > v7:
> > Removed the changes related to SCMD_NORETRIES_ABORT bit.
> > 
> > Added a new function fc_eh_should_retry_cmd to check whether the
> > cmd
> > should be retried based on the rport state.
> > 
> > v6:
> > No change
> > 
> > v5:
> > Made changes to clear the SCMD_NORETRIES_ABORT bit if the
> > port_state
> > has changed from marginal to online due to port_delete and port_add
> > as we need the normal cmd retry behaviour
> > 
> > Made changes in fc_scsi_scan_rport as we are checking
> > FC_PORTSTATE_ONLINE
> > instead of FC_PORTSTATE_ONLINE and FC_PORTSTATE_MARGINAL
> > 
> > v4:
> > Made changes in fc_eh_timed_out to call
> > fc_rport_chkmarginal_set_noretries
> > so that SCMD_NORETRIES_ABORT bit in cmd->state is set if rport
> > state
> > is marginal.
> > 
> > Removed the newly added scsi_cmd argument to
> > fc_remote_port_chkready
> > as the current patch handles only SCSI EH timeout/abort case.
> > 
> > v3:
> > Rearranged the patch so that all the changes with respect to new
> > rport state is part of this patch.
> > Added a new argument to scsi_cmd  to fc_remote_port_chkready
> > 
> > v2:
> > New patch
> > ---
> >   drivers/scsi/scsi_transport_fc.c | 62 +++++++++++++++++++++++--
> > -------
> >   include/scsi/scsi_transport_fc.h |  4 ++-
> >   2 files changed, 49 insertions(+), 17 deletions(-)
> > 
> > diff --git a/drivers/scsi/scsi_transport_fc.c
> > b/drivers/scsi/scsi_transport_fc.c
> > index a926e8f9e56e..ffd25195ae62 100644
> > --- a/drivers/scsi/scsi_transport_fc.c
> > +++ b/drivers/scsi/scsi_transport_fc.c
> > @@ -148,20 +148,23 @@ fc_enum_name_search(host_event_code,
> > fc_host_event_code,
> >   static struct {
> >   	enum fc_port_state	value;
> >   	char			*name;
> > +	int			matchlen;
> >   } fc_port_state_names[] = {
> > -	{ FC_PORTSTATE_UNKNOWN,		"Unknown" },
> > -	{ FC_PORTSTATE_NOTPRESENT,	"Not Present" },
> > -	{ FC_PORTSTATE_ONLINE,		"Online" },
> > -	{ FC_PORTSTATE_OFFLINE,		"Offline" },
> > -	{ FC_PORTSTATE_BLOCKED,		"Blocked" },
> > -	{ FC_PORTSTATE_BYPASSED,	"Bypassed" },
> > -	{ FC_PORTSTATE_DIAGNOSTICS,	"Diagnostics" },
> > -	{ FC_PORTSTATE_LINKDOWN,	"Linkdown" },
> > -	{ FC_PORTSTATE_ERROR,		"Error" },
> > -	{ FC_PORTSTATE_LOOPBACK,	"Loopback" },
> > -	{ FC_PORTSTATE_DELETED,		"Deleted" },
> > +	{ FC_PORTSTATE_UNKNOWN,		"Unknown", 7},
> > +	{ FC_PORTSTATE_NOTPRESENT,	"Not Present", 11 },
> > +	{ FC_PORTSTATE_ONLINE,		"Online", 6 },
> > +	{ FC_PORTSTATE_OFFLINE,		"Offline", 7 },
> > +	{ FC_PORTSTATE_BLOCKED,		"Blocked", 7 },
> > +	{ FC_PORTSTATE_BYPASSED,	"Bypassed", 8 },
> > +	{ FC_PORTSTATE_DIAGNOSTICS,	"Diagnostics", 11 },
> > +	{ FC_PORTSTATE_LINKDOWN,	"Linkdown", 8 },
> > +	{ FC_PORTSTATE_ERROR,		"Error", 5 },
> > +	{ FC_PORTSTATE_LOOPBACK,	"Loopback", 8 },
> > +	{ FC_PORTSTATE_DELETED,		"Deleted", 7 },
> > +	{ FC_PORTSTATE_MARGINAL,	"Marginal", 8 },
> 
> Why did you append the length of the string here?
> This doesn't have anything to do with this patch, but rather is an 
> improvement/modification of the original code, and should be
> delegated 
> to a separate patch.

It's because the attribute is now writeable, the use of the
fc_enum_name_match() macro on port_state needs the length.

The tgtid_bind_type attribute uses the same mechanism already.

-Ewan

> 
> >   };
> >   fc_enum_name_search(port_state, fc_port_state,
> > fc_port_state_names)
> > +fc_enum_name_match(port_state, fc_port_state, fc_port_state_names)
> >   #define FC_PORTSTATE_MAX_NAMELEN	20
> >   
> >   
> > @@ -2509,7 +2512,8 @@ fc_user_scan_tgt(struct Scsi_Host *shost,
> > uint channel, uint id, u64 lun)
> >   		if (rport->scsi_target_id == -1)
> >   			continue;
> >   
> > -		if (rport->port_state != FC_PORTSTATE_ONLINE)
> > +		if ((rport->port_state != FC_PORTSTATE_ONLINE) &&
> > +			(rport->port_state != FC_PORTSTATE_MARGINAL))
> >   			continue;
> >   
> >   		if ((channel == rport->channel) &&
> > @@ -3373,7 +3377,8 @@ fc_remote_port_delete(struct
> > fc_rport  *rport)
> >   
> >   	spin_lock_irqsave(shost->host_lock, flags);
> >   
> > -	if (rport->port_state != FC_PORTSTATE_ONLINE) {
> > +	if ((rport->port_state != FC_PORTSTATE_ONLINE) &&
> > +		(rport->port_state != FC_PORTSTATE_MARGINAL)) {
> >   		spin_unlock_irqrestore(shost->host_lock, flags);
> >   		return;
> >   	}
> > @@ -3515,7 +3520,8 @@ fc_timeout_deleted_rport(struct work_struct
> > *work)
> >   	 * target, validate it still is. If not, tear down the
> >   	 * scsi_target on it.
> >   	 */
> > -	if ((rport->port_state == FC_PORTSTATE_ONLINE) &&
> > +	if (((rport->port_state == FC_PORTSTATE_ONLINE) ||
> > +		(rport->port_state == FC_PORTSTATE_MARGINAL)) &&
> >   	    (rport->scsi_target_id != -1) &&
> >   	    !(rport->roles & FC_PORT_ROLE_FCP_TARGET)) {
> >   		dev_printk(KERN_ERR, &rport->dev,
> > @@ -3658,7 +3664,8 @@ fc_scsi_scan_rport(struct work_struct *work)
> >   	struct fc_internal *i = to_fc_internal(shost->transportt);
> >   	unsigned long flags;
> >   
> > -	if ((rport->port_state == FC_PORTSTATE_ONLINE) &&
> > +	if (((rport->port_state == FC_PORTSTATE_ONLINE) ||
> > +		(rport->port_state == FC_PORTSTATE_MARGINAL)) &&
> >   	    (rport->roles & FC_PORT_ROLE_FCP_TARGET) &&
> >   	    !(i->f->disable_target_scan)) {
> >   		scsi_scan_target(&rport->dev, rport->channel,
> > @@ -3731,6 +3738,28 @@ int fc_block_scsi_eh(struct scsi_cmnd *cmnd)
> >   }
> >   EXPORT_SYMBOL(fc_block_scsi_eh);
> >   
> > +/*
> > + * fc_eh_should_retry_cmd - Checks if the cmd should be retried or
> > not
> > + * @scmd:        The SCSI command to be checked
> > + *
> > + * This checks the rport state to decide if a cmd is
> > + * retryable.
> > + *
> > + * Returns: true if the rport state is not in marginal state.
> > + */
> > +bool fc_eh_should_retry_cmd(struct scsi_cmnd *scmd)
> > +{
> > +	struct fc_rport *rport = starget_to_rport(scsi_target(scmd-
> > >device));
> > +
> > +	if ((rport->port_state != FC_PORTSTATE_ONLINE) &&
> > +		(scmd->request->cmd_flags & REQ_FAILFAST_TRANSPORT)) {
> > +		set_host_byte(scmd, DID_TRANSPORT_MARGINAL);
> > +		return false;
> > +	}
> > +	return true;
> > +}
> > +EXPORT_SYMBOL_GPL(fc_eh_should_retry_cmd);
> > +
> >   /**
> >    * fc_vport_setup - allocates and creates a FC virtual port.
> >    * @shost:	scsi host the virtual port is connected to.
> > @@ -4162,7 +4191,8 @@ static blk_status_t fc_bsg_rport_prep(struct
> > fc_rport *rport)
> >   	    !(rport->flags & FC_RPORT_FAST_FAIL_TIMEDOUT))
> >   		return BLK_STS_RESOURCE;
> >   
> > -	if (rport->port_state != FC_PORTSTATE_ONLINE)
> > +	if ((rport->port_state != FC_PORTSTATE_ONLINE) &&
> > +		(rport->port_state != FC_PORTSTATE_MARGINAL))
> >   		return BLK_STS_IOERR;
> >   
> >   	return BLK_STS_OK;
> > diff --git a/include/scsi/scsi_transport_fc.h
> > b/include/scsi/scsi_transport_fc.h
> > index c759b29e46c7..14214ee121ad 100644
> > --- a/include/scsi/scsi_transport_fc.h
> > +++ b/include/scsi/scsi_transport_fc.h
> > @@ -67,6 +67,7 @@ enum fc_port_state {
> >   	FC_PORTSTATE_ERROR,
> >   	FC_PORTSTATE_LOOPBACK,
> >   	FC_PORTSTATE_DELETED,
> > +	FC_PORTSTATE_MARGINAL,
> >   };
> >   
> >   
> > @@ -742,7 +743,6 @@ struct fc_function_template {
> >   	unsigned long	disable_target_scan:1;
> >   };
> >   
> > -
> >   /**
> >    * fc_remote_port_chkready - called to validate the remote port
> > state
> >    *   prior to initiating io to the port.
> > @@ -758,6 +758,7 @@ fc_remote_port_chkready(struct fc_rport *rport)
> >   
> >   	switch (rport->port_state) {
> >   	case FC_PORTSTATE_ONLINE:
> > +	case FC_PORTSTATE_MARGINAL:
> >   		if (rport->roles & FC_PORT_ROLE_FCP_TARGET)
> >   			result = 0;
> >   		else if (rport->flags & FC_RPORT_DEVLOSS_PENDING)
> > @@ -839,6 +840,7 @@ int fc_vport_terminate(struct fc_vport *vport);
> >   int fc_block_rport(struct fc_rport *rport);
> >   int fc_block_scsi_eh(struct scsi_cmnd *cmnd);
> >   enum blk_eh_timer_return fc_eh_timed_out(struct scsi_cmnd *scmd);
> > +bool fc_eh_should_retry_cmd(struct scsi_cmnd *scmd);
> >   
> >   static inline struct Scsi_Host *fc_bsg_to_shost(struct bsg_job
> > *job)
> >   {
> > 
> 
> Other than that:
> 
> Reviewed-by: Hannes Reinecke <hare@suse.de>
> 
> Cheers,
> 
> Hannes


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v7 1/5] scsi: Added a new error code DID_TRANSPORT_MARGINAL in scsi.h
  2020-11-11  4:58 ` [PATCH v7 1/5] scsi: Added a new error code DID_TRANSPORT_MARGINAL in scsi.h Muneendra
  2020-11-16  8:16   ` Hannes Reinecke
  2020-11-23 19:45   ` Ewan D. Milne
@ 2020-11-24 17:42   ` Himanshu Madhani
  2 siblings, 0 replies; 26+ messages in thread
From: Himanshu Madhani @ 2020-11-24 17:42 UTC (permalink / raw)
  To: Muneendra; +Cc: linux-scsi, michael.christie, hare, jsmart2021, emilne, mkumar



> On Nov 10, 2020, at 10:58 PM, Muneendra <muneendra.kumar@broadcom.com> wrote:
> 
> Added a new error code DID_TRANSPORT_MARGINAL to handle marginal
> errors in scsi.h
> 
> Added a code in scsi_result_to_blk_status to translate
> a new error DID_TRANSPORT_MARGINAL to the corresponding blk_status_t
> i.e BLK_STS_TRANSPORT
> 
> Added DID_TRANSPORT_MARGINAL case to scsi_decide_disposition
> 
> Signed-off-by: Muneendra <muneendra.kumar@broadcom.com>
> 
> ---
> v7:
> Rearranged the patch by moving the DID_TRANSPORT_MARGINAL
> and the changes with respect to the same to this patch
> from the previous patch2 in v6
> 
> Removed the previuos patch patch1 in v6 as in the
> current approach there is no need of this bit SCMD_NORETRIES_ABORT
> 
> v6:
> Rearranged the patch by merging second hunk of the patch2 in v5
> to this patch
> 
> v5:
> added the DID_TRANSPORT_MARGINAL case to
> scsi_decide_disposition
> v4:
> Modified the comments in the code appropriately
> 
> v3:
> Merged  first part of the previous patch(v2 patch3) with
> this patch.
> 
> v2:
> set the hostbyte as DID_TRANSPORT_MARGINAL instead of
> DID_TRANSPORT_FAILFAST.
> ---
> drivers/scsi/scsi_error.c | 6 ++++++
> drivers/scsi/scsi_lib.c   | 1 +
> include/scsi/scsi.h       | 1 +
> 3 files changed, 8 insertions(+)
> 
> diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
> index f11f51e2465f..28056ee498b3 100644
> --- a/drivers/scsi/scsi_error.c
> +++ b/drivers/scsi/scsi_error.c
> @@ -1861,6 +1861,12 @@ int scsi_decide_disposition(struct scsi_cmnd *scmd)
> 		 * the fast io fail tmo fired), so send IO directly upwards.
> 		 */
> 		return SUCCESS;
> +	case DID_TRANSPORT_MARGINAL:
> +		/*
> +		 * caller has decided not to do retries on
> +		 * abort success, so send IO directly upwards
> +		 */
> +		return SUCCESS;
> 	case DID_ERROR:
> 		if (msg_byte(scmd->result) == COMMAND_COMPLETE &&
> 		    status_byte(scmd->result) == RESERVATION_CONFLICT)
> diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
> index 20a357563d3d..ce1e2adaca36 100644
> --- a/drivers/scsi/scsi_lib.c
> +++ b/drivers/scsi/scsi_lib.c
> @@ -629,6 +629,7 @@ static blk_status_t scsi_result_to_blk_status(struct scsi_cmnd *cmd, int result)
> 			return BLK_STS_OK;
> 		return BLK_STS_IOERR;
> 	case DID_TRANSPORT_FAILFAST:
> +	case DID_TRANSPORT_MARGINAL:
> 		return BLK_STS_TRANSPORT;
> 	case DID_TARGET_FAILURE:
> 		set_host_byte(cmd, DID_OK);
> diff --git a/include/scsi/scsi.h b/include/scsi/scsi.h
> index 5339baadc082..5b287ad8b727 100644
> --- a/include/scsi/scsi.h
> +++ b/include/scsi/scsi.h
> @@ -159,6 +159,7 @@ static inline int scsi_is_wlun(u64 lun)
> 				 * paths might yield different results */
> #define DID_ALLOC_FAILURE 0x12  /* Space allocation on the device failed */
> #define DID_MEDIUM_ERROR  0x13  /* Medium error */
> +#define DID_TRANSPORT_MARGINAL 0x14 /* Transport marginal errors */
> #define DRIVER_OK       0x00	/* Driver status                           */
> 
> /*
> -- 
> 2.26.2
> 

Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>

--
Himanshu Madhani	 Oracle Linux Engineering


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v7 2/5] scsi: No retries on abort success
  2020-11-11  4:58 ` [PATCH v7 2/5] scsi: No retries on abort success Muneendra
  2020-11-16  8:22   ` Hannes Reinecke
  2020-11-23 19:45   ` Ewan D. Milne
@ 2020-11-24 17:43   ` Himanshu Madhani
  2 siblings, 0 replies; 26+ messages in thread
From: Himanshu Madhani @ 2020-11-24 17:43 UTC (permalink / raw)
  To: Muneendra; +Cc: linux-scsi, michael.christie, hare, jsmart2021, emilne, mkumar



> On Nov 10, 2020, at 10:58 PM, Muneendra <muneendra.kumar@broadcom.com> wrote:
> 
> Added a new optional routine eh_should_retry_cmd
> in scsi_host_template that allows the transport to decide if a
> cmd is retryable.Return true if the transport is in a state the
> cmd should be retried on.
> 
> Added a new interface scsi_eh_should_retry_cmd which checks and
> calls the new routine eh_should_retry_cmd.
> 
> Made changes in scmd_eh_abort_handler and scsi_eh_flush_done_q which
> calls the scsi_eh_should_retry_cmd to check whether the
> command needs to be retried.
> 
> The above changes were done based on a patch by Mike Christie.
> 
> Signed-off-by: Muneendra <muneendra.kumar@broadcom.com>
> 
> ---
> v7:
> Added New routine in scsi_host_template to decide if a cmd is
> retryable instead of checking the same using  SCMD_NORETRIES_ABORT
> bit as the cmd retry part can be checked by validating the port state.
> 
> Moved the DID_TRANSPORT_MARGINAL changes to previous patch
> for reordering
> 
> v6:
> Rearranged the patch by merging second hunk of the patch2 in v5
> to this patch
> 
> v5:
> added the DID_TRANSPORT_MARGINAL case to
> scsi_decide_disposition
> v4:
> Modified the comments in the code appropriately
> 
> v3:
> Merged  first part of the previous patch(v2 patch3) with
> this patch.
> 
> v2:
> set the hostbyte as DID_TRANSPORT_MARGINAL instead of
> DID_TRANSPORT_FAILFAST.
> ---
> drivers/scsi/scsi_error.c | 17 +++++++++++++++--
> include/scsi/scsi_host.h  |  6 ++++++
> 2 files changed, 21 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
> index 28056ee498b3..1cdfa5a8ca09 100644
> --- a/drivers/scsi/scsi_error.c
> +++ b/drivers/scsi/scsi_error.c
> @@ -124,6 +124,17 @@ static bool scsi_cmd_retry_allowed(struct scsi_cmnd *cmd)
> 	return ++cmd->retries <= cmd->allowed;
> }
> 
> +static bool scsi_eh_should_retry_cmd(struct scsi_cmnd *cmd)
> +{
> +	struct scsi_device *sdev = cmd->device;
> +	struct Scsi_Host *host = sdev->host;
> +
> +	if (host->hostt->eh_should_retry_cmd)
> +		return  host->hostt->eh_should_retry_cmd(cmd);
> +
> +	return true;
> +}
> +
> /**
>  * scmd_eh_abort_handler - Handle command aborts
>  * @work:	command to be aborted.
> @@ -159,7 +170,8 @@ scmd_eh_abort_handler(struct work_struct *work)
> 						    "eh timeout, not retrying "
> 						    "aborted command\n"));
> 			} else if (!scsi_noretry_cmd(scmd) &&
> -				   scsi_cmd_retry_allowed(scmd)) {
> +				   scsi_cmd_retry_allowed(scmd) &&
> +				scsi_eh_should_retry_cmd(scmd)) {
> 				SCSI_LOG_ERROR_RECOVERY(3,
> 					scmd_printk(KERN_WARNING, scmd,
> 						    "retry aborted command\n"));
> @@ -2111,7 +2123,8 @@ void scsi_eh_flush_done_q(struct list_head *done_q)
> 	list_for_each_entry_safe(scmd, next, done_q, eh_entry) {
> 		list_del_init(&scmd->eh_entry);
> 		if (scsi_device_online(scmd->device) &&
> -		    !scsi_noretry_cmd(scmd) && scsi_cmd_retry_allowed(scmd)) {
> +		    !scsi_noretry_cmd(scmd) && scsi_cmd_retry_allowed(scmd) &&
> +			scsi_eh_should_retry_cmd(scmd)) {
> 			SCSI_LOG_ERROR_RECOVERY(3,
> 				scmd_printk(KERN_INFO, scmd,
> 					     "%s: flush retry cmd\n",
> diff --git a/include/scsi/scsi_host.h b/include/scsi/scsi_host.h
> index 701f178b20ae..e30fd963b97d 100644
> --- a/include/scsi/scsi_host.h
> +++ b/include/scsi/scsi_host.h
> @@ -314,6 +314,12 @@ struct scsi_host_template {
> 	 * Status: OPTIONAL
> 	 */
> 	enum blk_eh_timer_return (*eh_timed_out)(struct scsi_cmnd *);
> +	/*
> +	 * Optional routine that allows the transport to decide if a cmd
> +	 * is retryable. Return true if the transport is in a state the
> +	 * cmd should be retried on.
> +	 */
> +	bool (*eh_should_retry_cmd)(struct scsi_cmnd *scmd);
> 
> 	/* This is an optional routine that allows transport to initiate
> 	 * LLD adapter or firmware reset using sysfs attribute.
> -- 
> 2.26.2
> 

Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>

--
Himanshu Madhani	 Oracle Linux Engineering


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v7 3/5] scsi_transport_fc: Added a new rport state FC_PORTSTATE_MARGINAL
  2020-11-11  4:58 ` [PATCH v7 3/5] scsi_transport_fc: Added a new rport state FC_PORTSTATE_MARGINAL Muneendra
  2020-11-16  8:19   ` Hannes Reinecke
  2020-11-23 19:47   ` Ewan D. Milne
@ 2020-11-24 17:43   ` Himanshu Madhani
  2 siblings, 0 replies; 26+ messages in thread
From: Himanshu Madhani @ 2020-11-24 17:43 UTC (permalink / raw)
  To: Muneendra; +Cc: linux-scsi, Mike Christie, hare, jsmart2021, emilne, mkumar



> On Nov 10, 2020, at 10:58 PM, Muneendra <muneendra.kumar@broadcom.com> wrote:
> 
> Added a new rport state FC_PORTSTATE_MARGINAL.
> 
> Added a new interface fc_eh_should_retry_cmd which Checks if the cmd
> should be retried or not by checking the rport state.
> If the rport state is marginal it returns
> false to make sure there won't be any retries on the cmd.
> 
> Also made changes in fc_remote_port_delete,fc_user_scan_tgt,
> fc_timeout_deleted_rport functions  to handle the new rport state
> FC_PORTSTATE_MARGINAL.
> 
> Signed-off-by: Muneendra <muneendra.kumar@broadcom.com>
> 
> ---
> v7:
> Removed the changes related to SCMD_NORETRIES_ABORT bit.
> 
> Added a new function fc_eh_should_retry_cmd to check whether the cmd
> should be retried based on the rport state.
> 
> v6:
> No change
> 
> v5:
> Made changes to clear the SCMD_NORETRIES_ABORT bit if the port_state
> has changed from marginal to online due to port_delete and port_add
> as we need the normal cmd retry behaviour
> 
> Made changes in fc_scsi_scan_rport as we are checking FC_PORTSTATE_ONLINE
> instead of FC_PORTSTATE_ONLINE and FC_PORTSTATE_MARGINAL
> 
> v4:
> Made changes in fc_eh_timed_out to call fc_rport_chkmarginal_set_noretries
> so that SCMD_NORETRIES_ABORT bit in cmd->state is set if rport state
> is marginal.
> 
> Removed the newly added scsi_cmd argument to fc_remote_port_chkready
> as the current patch handles only SCSI EH timeout/abort case.
> 
> v3:
> Rearranged the patch so that all the changes with respect to new
> rport state is part of this patch.
> Added a new argument to scsi_cmd  to fc_remote_port_chkready
> 
> v2:
> New patch
> ---
> drivers/scsi/scsi_transport_fc.c | 62 +++++++++++++++++++++++---------
> include/scsi/scsi_transport_fc.h |  4 ++-
> 2 files changed, 49 insertions(+), 17 deletions(-)
> 
> diff --git a/drivers/scsi/scsi_transport_fc.c b/drivers/scsi/scsi_transport_fc.c
> index a926e8f9e56e..ffd25195ae62 100644
> --- a/drivers/scsi/scsi_transport_fc.c
> +++ b/drivers/scsi/scsi_transport_fc.c
> @@ -148,20 +148,23 @@ fc_enum_name_search(host_event_code, fc_host_event_code,
> static struct {
> 	enum fc_port_state	value;
> 	char			*name;
> +	int			matchlen;
> } fc_port_state_names[] = {
> -	{ FC_PORTSTATE_UNKNOWN,		"Unknown" },
> -	{ FC_PORTSTATE_NOTPRESENT,	"Not Present" },
> -	{ FC_PORTSTATE_ONLINE,		"Online" },
> -	{ FC_PORTSTATE_OFFLINE,		"Offline" },
> -	{ FC_PORTSTATE_BLOCKED,		"Blocked" },
> -	{ FC_PORTSTATE_BYPASSED,	"Bypassed" },
> -	{ FC_PORTSTATE_DIAGNOSTICS,	"Diagnostics" },
> -	{ FC_PORTSTATE_LINKDOWN,	"Linkdown" },
> -	{ FC_PORTSTATE_ERROR,		"Error" },
> -	{ FC_PORTSTATE_LOOPBACK,	"Loopback" },
> -	{ FC_PORTSTATE_DELETED,		"Deleted" },
> +	{ FC_PORTSTATE_UNKNOWN,		"Unknown", 7},
> +	{ FC_PORTSTATE_NOTPRESENT,	"Not Present", 11 },
> +	{ FC_PORTSTATE_ONLINE,		"Online", 6 },
> +	{ FC_PORTSTATE_OFFLINE,		"Offline", 7 },
> +	{ FC_PORTSTATE_BLOCKED,		"Blocked", 7 },
> +	{ FC_PORTSTATE_BYPASSED,	"Bypassed", 8 },
> +	{ FC_PORTSTATE_DIAGNOSTICS,	"Diagnostics", 11 },
> +	{ FC_PORTSTATE_LINKDOWN,	"Linkdown", 8 },
> +	{ FC_PORTSTATE_ERROR,		"Error", 5 },
> +	{ FC_PORTSTATE_LOOPBACK,	"Loopback", 8 },
> +	{ FC_PORTSTATE_DELETED,		"Deleted", 7 },
> +	{ FC_PORTSTATE_MARGINAL,	"Marginal", 8 },
> };
> fc_enum_name_search(port_state, fc_port_state, fc_port_state_names)
> +fc_enum_name_match(port_state, fc_port_state, fc_port_state_names)
> #define FC_PORTSTATE_MAX_NAMELEN	20
> 
> 
> @@ -2509,7 +2512,8 @@ fc_user_scan_tgt(struct Scsi_Host *shost, uint channel, uint id, u64 lun)
> 		if (rport->scsi_target_id == -1)
> 			continue;
> 
> -		if (rport->port_state != FC_PORTSTATE_ONLINE)
> +		if ((rport->port_state != FC_PORTSTATE_ONLINE) &&
> +			(rport->port_state != FC_PORTSTATE_MARGINAL))
> 			continue;
> 
> 		if ((channel == rport->channel) &&
> @@ -3373,7 +3377,8 @@ fc_remote_port_delete(struct fc_rport  *rport)
> 
> 	spin_lock_irqsave(shost->host_lock, flags);
> 
> -	if (rport->port_state != FC_PORTSTATE_ONLINE) {
> +	if ((rport->port_state != FC_PORTSTATE_ONLINE) &&
> +		(rport->port_state != FC_PORTSTATE_MARGINAL)) {
> 		spin_unlock_irqrestore(shost->host_lock, flags);
> 		return;
> 	}
> @@ -3515,7 +3520,8 @@ fc_timeout_deleted_rport(struct work_struct *work)
> 	 * target, validate it still is. If not, tear down the
> 	 * scsi_target on it.
> 	 */
> -	if ((rport->port_state == FC_PORTSTATE_ONLINE) &&
> +	if (((rport->port_state == FC_PORTSTATE_ONLINE) ||
> +		(rport->port_state == FC_PORTSTATE_MARGINAL)) &&
> 	    (rport->scsi_target_id != -1) &&
> 	    !(rport->roles & FC_PORT_ROLE_FCP_TARGET)) {
> 		dev_printk(KERN_ERR, &rport->dev,
> @@ -3658,7 +3664,8 @@ fc_scsi_scan_rport(struct work_struct *work)
> 	struct fc_internal *i = to_fc_internal(shost->transportt);
> 	unsigned long flags;
> 
> -	if ((rport->port_state == FC_PORTSTATE_ONLINE) &&
> +	if (((rport->port_state == FC_PORTSTATE_ONLINE) ||
> +		(rport->port_state == FC_PORTSTATE_MARGINAL)) &&
> 	    (rport->roles & FC_PORT_ROLE_FCP_TARGET) &&
> 	    !(i->f->disable_target_scan)) {
> 		scsi_scan_target(&rport->dev, rport->channel,
> @@ -3731,6 +3738,28 @@ int fc_block_scsi_eh(struct scsi_cmnd *cmnd)
> }
> EXPORT_SYMBOL(fc_block_scsi_eh);
> 
> +/*
> + * fc_eh_should_retry_cmd - Checks if the cmd should be retried or not
> + * @scmd:        The SCSI command to be checked
> + *
> + * This checks the rport state to decide if a cmd is
> + * retryable.
> + *
> + * Returns: true if the rport state is not in marginal state.
> + */
> +bool fc_eh_should_retry_cmd(struct scsi_cmnd *scmd)
> +{
> +	struct fc_rport *rport = starget_to_rport(scsi_target(scmd->device));
> +
> +	if ((rport->port_state != FC_PORTSTATE_ONLINE) &&
> +		(scmd->request->cmd_flags & REQ_FAILFAST_TRANSPORT)) {
> +		set_host_byte(scmd, DID_TRANSPORT_MARGINAL);
> +		return false;
> +	}
> +	return true;
> +}
> +EXPORT_SYMBOL_GPL(fc_eh_should_retry_cmd);
> +
> /**
>  * fc_vport_setup - allocates and creates a FC virtual port.
>  * @shost:	scsi host the virtual port is connected to.
> @@ -4162,7 +4191,8 @@ static blk_status_t fc_bsg_rport_prep(struct fc_rport *rport)
> 	    !(rport->flags & FC_RPORT_FAST_FAIL_TIMEDOUT))
> 		return BLK_STS_RESOURCE;
> 
> -	if (rport->port_state != FC_PORTSTATE_ONLINE)
> +	if ((rport->port_state != FC_PORTSTATE_ONLINE) &&
> +		(rport->port_state != FC_PORTSTATE_MARGINAL))
> 		return BLK_STS_IOERR;
> 
> 	return BLK_STS_OK;
> diff --git a/include/scsi/scsi_transport_fc.h b/include/scsi/scsi_transport_fc.h
> index c759b29e46c7..14214ee121ad 100644
> --- a/include/scsi/scsi_transport_fc.h
> +++ b/include/scsi/scsi_transport_fc.h
> @@ -67,6 +67,7 @@ enum fc_port_state {
> 	FC_PORTSTATE_ERROR,
> 	FC_PORTSTATE_LOOPBACK,
> 	FC_PORTSTATE_DELETED,
> +	FC_PORTSTATE_MARGINAL,
> };
> 
> 
> @@ -742,7 +743,6 @@ struct fc_function_template {
> 	unsigned long	disable_target_scan:1;
> };
> 
> -
> /**
>  * fc_remote_port_chkready - called to validate the remote port state
>  *   prior to initiating io to the port.
> @@ -758,6 +758,7 @@ fc_remote_port_chkready(struct fc_rport *rport)
> 
> 	switch (rport->port_state) {
> 	case FC_PORTSTATE_ONLINE:
> +	case FC_PORTSTATE_MARGINAL:
> 		if (rport->roles & FC_PORT_ROLE_FCP_TARGET)
> 			result = 0;
> 		else if (rport->flags & FC_RPORT_DEVLOSS_PENDING)
> @@ -839,6 +840,7 @@ int fc_vport_terminate(struct fc_vport *vport);
> int fc_block_rport(struct fc_rport *rport);
> int fc_block_scsi_eh(struct scsi_cmnd *cmnd);
> enum blk_eh_timer_return fc_eh_timed_out(struct scsi_cmnd *scmd);
> +bool fc_eh_should_retry_cmd(struct scsi_cmnd *scmd);
> 
> static inline struct Scsi_Host *fc_bsg_to_shost(struct bsg_job *job)
> {
> -- 
> 2.26.2
> 

Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>

--
Himanshu Madhani	 Oracle Linux Engineering


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v7 4/5] scsi_transport_fc: Added store fucntionality to set the rport port_state using sysfs
  2020-11-11  4:58 ` [PATCH v7 4/5] scsi_transport_fc: Added store fucntionality to set the rport port_state using sysfs Muneendra
  2020-11-16  8:20   ` Hannes Reinecke
  2020-11-23 19:47   ` Ewan D. Milne
@ 2020-11-24 17:44   ` Himanshu Madhani
  2 siblings, 0 replies; 26+ messages in thread
From: Himanshu Madhani @ 2020-11-24 17:44 UTC (permalink / raw)
  To: Muneendra; +Cc: linux-scsi, michael.christie, hare, jsmart2021, emilne, mkumar



> On Nov 10, 2020, at 10:58 PM, Muneendra <muneendra.kumar@broadcom.com> wrote:
> 
> Added a store functionality to set rport port_state using sysfs
> under  fc_remote_ports/rport-*/port_state
> 
> With this functionality the user can move the port_state from
> Marginal -> Online and Online->Marginal.
> 
> On Marginal :This interface will set SCMD_NORETRIES_ABORT bit in
> scmd->state for all the pending io's on the scsi device associated
> with target port.
> 
> On Online :This interface will clear SCMD_NORETRIES_ABORT bit in
> scmd->state for all the pending io's on the scsi device associated
> with target port.
> 
> Below is the interface provided to set the port state to Marginal
> and Online.
> 
> echo "Marginal" >> /sys/class/fc_remote_ports/rport-X\:Y-Z/port_state
> echo "Online" >> /sys/class/fc_remote_ports/rport-X\:Y-Z/port_state
> 
> Signed-off-by: Muneendra <muneendra.kumar@broadcom.com>
> 
> ---
> v7:
> No change
> 
> v6:
> No change
> 
> v5:
> No change
> 
> v4:
> Addressed the error reported by kernel test robot
> Removed the code needed to traverse all the devices under rport
> to set/clear SCMD_NORETRIES_ABORT
> Removed unncessary comments.
> Return the error values on failure while setting the port_state
> 
> v3:
> Removed the port_state from starget attributes.
> Enabled the store functionality for port_state under remote port.
> used the starget_for_each_device to traverse around all the devices
> under rport
> 
> v2:
> Changed from a noretries_abort attribute under fc_transport/target*/ to
> port_state for changing the port_state to a marginal state
> ---
> drivers/scsi/scsi_transport_fc.c | 56 ++++++++++++++++++++++++++++++--
> 1 file changed, 54 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/scsi/scsi_transport_fc.c b/drivers/scsi/scsi_transport_fc.c
> index ffd25195ae62..d378ca4a60fe 100644
> --- a/drivers/scsi/scsi_transport_fc.c
> +++ b/drivers/scsi/scsi_transport_fc.c
> @@ -1238,7 +1238,59 @@ show_fc_rport_roles (struct device *dev, struct device_attribute *attr,
> static FC_DEVICE_ATTR(rport, roles, S_IRUGO,
> 		show_fc_rport_roles, NULL);
> 
> -fc_private_rport_rd_enum_attr(port_state, FC_PORTSTATE_MAX_NAMELEN);
> +static ssize_t fc_rport_set_marginal_state(struct device *dev,
> +						struct device_attribute *attr,
> +						const char *buf, size_t count)
> +{
> +	struct fc_rport *rport = transport_class_to_rport(dev);
> +	enum fc_port_state port_state;
> +	int ret = 0;
> +
> +	ret = get_fc_port_state_match(buf, &port_state);
> +	if (ret)
> +		return -EINVAL;
> +	if (port_state == FC_PORTSTATE_MARGINAL) {
> +		/*
> +		 * Change the state to marginal only if the
> +		 * current rport state is Online
> +		 * Allow only Online->marginal
> +		 */
> +		if (rport->port_state == FC_PORTSTATE_ONLINE)
> +			rport->port_state = port_state;
> +		else
> +			return -EINVAL;
> +	} else if (port_state == FC_PORTSTATE_ONLINE) {
> +		/*
> +		 * Change the state to Online only if the
> +		 * current rport state is Marginal
> +		 * Allow only  MArginal->Online
> +		 */
> +		if (rport->port_state == FC_PORTSTATE_MARGINAL)
> +			rport->port_state = port_state;
> +		else
> +			return -EINVAL;
> +	} else
> +		return -EINVAL;
> +	return count;
> +}
> +
> +static ssize_t
> +show_fc_rport_port_state(struct device *dev,
> +				struct device_attribute *attr, char *buf)
> +{
> +	const char *name;
> +	struct fc_rport *rport = transport_class_to_rport(dev);
> +
> +	name = get_fc_port_state_name(rport->port_state);
> +	if (!name)
> +		return -EINVAL;
> +
> +	return snprintf(buf, 20, "%s\n", name);
> +}
> +
> +static FC_DEVICE_ATTR(rport, port_state, 0444 | 0200,
> +			show_fc_rport_port_state, fc_rport_set_marginal_state);
> +
> fc_private_rport_rd_attr(scsi_target_id, "%d\n", 20);
> 
> /*
> @@ -2681,7 +2733,7 @@ fc_attach_transport(struct fc_function_template *ft)
> 	SETUP_PRIVATE_RPORT_ATTRIBUTE_RD(port_name);
> 	SETUP_PRIVATE_RPORT_ATTRIBUTE_RD(port_id);
> 	SETUP_PRIVATE_RPORT_ATTRIBUTE_RD(roles);
> -	SETUP_PRIVATE_RPORT_ATTRIBUTE_RD(port_state);
> +	SETUP_PRIVATE_RPORT_ATTRIBUTE_RW(port_state);
> 	SETUP_PRIVATE_RPORT_ATTRIBUTE_RD(scsi_target_id);
> 	SETUP_PRIVATE_RPORT_ATTRIBUTE_RW(fast_io_fail_tmo);
> 
> -- 
> 2.26.2
> 

Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>

--
Himanshu Madhani	 Oracle Linux Engineering


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v7 5/5] scsi:lpfc: Added support for eh_should_retry_cmd
  2020-11-11  4:58 ` [PATCH v7 5/5] scsi:lpfc: Added support for eh_should_retry_cmd Muneendra
  2020-11-16  8:23   ` Hannes Reinecke
  2020-11-23 19:48   ` Ewan D. Milne
@ 2020-11-24 17:46   ` Himanshu Madhani
  2 siblings, 0 replies; 26+ messages in thread
From: Himanshu Madhani @ 2020-11-24 17:46 UTC (permalink / raw)
  To: Muneendra; +Cc: linux-scsi, Mike Christie, hare, jsmart2021, emilne, mkumar



> On Nov 10, 2020, at 10:58 PM, Muneendra <muneendra.kumar@broadcom.com> wrote:
> 
> Added support to handle eh_should_retry_cmd in lpfc_template.
> 
> Signed-off-by: Muneendra <muneendra.kumar@broadcom.com>
> 
> ---
> v7:
> New patch
> ---
> drivers/scsi/lpfc/lpfc_scsi.c | 1 +
> 1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/scsi/lpfc/lpfc_scsi.c b/drivers/scsi/lpfc/lpfc_scsi.c
> index 4ffdfd2c8604..dbc80e6423ea 100644
> --- a/drivers/scsi/lpfc/lpfc_scsi.c
> +++ b/drivers/scsi/lpfc/lpfc_scsi.c
> @@ -6040,6 +6040,7 @@ struct scsi_host_template lpfc_template = {
> 	.info			= lpfc_info,
> 	.queuecommand		= lpfc_queuecommand,
> 	.eh_timed_out		= fc_eh_timed_out,
> +	.eh_should_retry_cmd    = fc_eh_should_retry_cmd,
> 	.eh_abort_handler	= lpfc_abort_handler,
> 	.eh_device_reset_handler = lpfc_device_reset_handler,
> 	.eh_target_reset_handler = lpfc_target_reset_handler,
> -- 
> 2.26.2
> 

This is generic change for LLD, I expect other drivers to add this to their respective code as well. 

So with that in mind ..  

Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>

--
Himanshu Madhani	 Oracle Linux Engineering


^ permalink raw reply	[flat|nested] 26+ messages in thread

* RE: [PATCH v7 0/5] scsi: Support to handle Intermittent errors
  2020-11-11  4:58 [PATCH v7 0/5] scsi: Support to handle Intermittent errors Muneendra
                   ` (4 preceding siblings ...)
  2020-11-11  4:58 ` [PATCH v7 5/5] scsi:lpfc: Added support for eh_should_retry_cmd Muneendra
@ 2020-12-08  5:00 ` Muneendra Kumar M
  2020-12-08  5:14   ` Martin K. Petersen
  5 siblings, 1 reply; 26+ messages in thread
From: Muneendra Kumar M @ 2020-12-08  5:00 UTC (permalink / raw)
  To: linux-scsi, martin.petersen, Hannes Reinecke; +Cc: jsmart2021, emilne, mkumar

[-- Attachment #1: Type: text/plain, Size: 5470 bytes --]

Hi Martin,
Could you please let us know if the patch series can get committed to the
scsi tree .
This patch series has been reviewed by Hannes,Ewan and Himanshu.


Thanks,
Muneendra.

-----Original Message-----
From: Muneendra [mailto:muneendra.kumar@broadcom.com]
Sent: Wednesday, November 11, 2020 10:28 AM
To: linux-scsi@vger.kernel.org; michael.christie@oracle.com; hare@suse.de
Cc: jsmart2021@gmail.com; emilne@redhat.com; mkumar@redhat.com; Muneendra
<muneendra.kumar@broadcom.com>
Subject: [PATCH v7 0/5] scsi: Support to handle Intermittent errors

This patch adds a support to prevent retries of all the io's after an
abort succeeds on a particular device when transport connectivity to the
device is encountering intermittent errors.

Intermittent connectivity is a condition that can be detected by transport
fabric notifications. A service can monitor the ELS notifications and take
action on all the outstanding io's of a scsi device at that instant.

This feature is intended to be used when the device is part of a multipath
environment. When the service detects the poor connectivity, the multipath
path can be placed in a marginal path group and ignored further io
operations.

After placing a path in the marginal path group,the daemon sets the
port_state to Marginal which sets bit in scmd->state for all the io's on
that particular device with the new sysfs interface provided in this
patch.This prevent retries of all the io's if an io hits a scsi timeout
which inturn issues an abort.
On Abort succeeds on a marginal path the io will be immediately retried on
another active path.On abort fails then the things escalates to existing
target reset sg interface recovery process.

Below is the interface provided to set the port state to Marginal and
Online.
echo "Marginal" >> /sys/class/fc_remote_ports/rport-X\:Y-Z/port_state
echo "Online" >> /sys/class/fc_remote_ports/rport-X\:Y-Z/port_state


The patches were cut against  5.11/scsi-queue tree

---
v7:

Added New routine in scsi_host_template to decide if a cmd is retryable
instead of checking the same using  SCMD_NORETRIES_ABORT bit as the cmd
retry part can be checked by validating the port state.

Removed the changes related to SCMD_NORETRIES_ABORT bit.

Added a new function fc_eh_should_retry_cmd to check whether the cmd
should be retried based on the rport state.

Reoreder the patch

The patches were cut against  5.11/scsi-queue tree


v6:
Reordered the patches to make patch ordering and more logical.

v5:
Added the DID_TRANSPORT_MARGINAL case to scsi_decide_disposition

Made changes to clear the SCMD_NORETRIES_ABORT bit if the port_state has
changed from marginal to online due to port_delete and port_add as we need
the normal cmd retry behaviour while we are calling the eh handlers.

Made changes in fc_scsi_scan_rport as we are checking FC_PORTSTATE_ONLINE
instead of FC_PORTSTATE_ONLINE and FC_PORTSTATE_MARGINAL


v4:
Made changes in fc_eh_timed_out callout to set the SCMD_NORETRIES_ABORT if
port state is marginal

With this change, we  removed the code  to loop over running commands and
fc_remote_port_chkready changes to set the SCMD_NORETRIES_ABORT

Removed the scsi_cmd argument for fc_remote_port_chkready and reverted
back the patches that addressed this change(argument)

Removed unnecessary comments
Handle the return of errors on failure.

v3:
Removed the port_state from starget attributes.
Enabled the store functionality for port_state under remote port Added a
new argument to scsi_cmd  to fc_remote_port_chkready Used the existing
scsi command iterators scsi_host_busy_iter.
Rearranged the patches
Added new patches to add new argument for fc_remote_port_chkready

v2:
Added new error code DID_TRANSPORT_MARGINAL to handle marginal errors.
Added a new rport_state FC_PORTSTATE_MARGINAL and also added a new sysfs
interface port_state to set the port_state to marginal.
Added the support in lpfc to handle the marginal state.


*** BLURB HERE ***

Muneendra (5):
  scsi: Added a new error code DID_TRANSPORT_MARGINAL in scsi.h
  scsi: No retries on abort success
  scsi_transport_fc: Added a new rport state FC_PORTSTATE_MARGINAL
  scsi_transport_fc: Added store fucntionality to set the rport
    port_state using sysfs
  scsi:lpfc: Added support for eh_should_retry_cmd

 drivers/scsi/lpfc/lpfc_scsi.c    |   1 +
 drivers/scsi/scsi_error.c        |  23 +++++-
 drivers/scsi/scsi_lib.c          |   1 +
 drivers/scsi/scsi_transport_fc.c | 118 ++++++++++++++++++++++++++-----
 include/scsi/scsi.h              |   1 +
 include/scsi/scsi_host.h         |   6 ++
 include/scsi/scsi_transport_fc.h |   4 +-
 7 files changed, 133 insertions(+), 21 deletions(-)

--
2.26.2

-- 
This electronic communication and the information and any files transmitted 
with it, or attached to it, are confidential and are intended solely for 
the use of the individual or entity to whom it is addressed and may contain 
information that is confidential, legally privileged, protected by privacy 
laws, or otherwise restricted from disclosure to anyone else. If you are 
not the intended recipient or the person responsible for delivering the 
e-mail to the intended recipient, you are hereby notified that any use, 
copying, distributing, dissemination, forwarding, printing, or copying of 
this e-mail is strictly prohibited. If you received this e-mail in error, 
please return the e-mail to the sender, delete it from your computer, and 
destroy any printed copy of it.

[-- Attachment #2: S/MIME Cryptographic Signature --]
[-- Type: application/pkcs7-signature, Size: 4177 bytes --]

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v7 0/5] scsi: Support to handle Intermittent errors
  2020-12-08  5:00 ` [PATCH v7 0/5] scsi: Support to handle Intermittent errors Muneendra Kumar M
@ 2020-12-08  5:14   ` Martin K. Petersen
  0 siblings, 0 replies; 26+ messages in thread
From: Martin K. Petersen @ 2020-12-08  5:14 UTC (permalink / raw)
  To: Muneendra Kumar M
  Cc: linux-scsi, martin.petersen, Hannes Reinecke, jsmart2021, emilne, mkumar


Hi Muneendra,

> Could you please let us know if the patch series can get committed to
> the scsi tree .  This patch series has been reviewed by Hannes,Ewan
> and Himanshu.

Core SCSI changes need to land in the early -rc releases so your series
will be a candidate for 5.12. Please rebase on top of 5.11-rc1 when that
is out in a couple of weeks.

-- 
Martin K. Petersen	Oracle Linux Engineering

^ permalink raw reply	[flat|nested] 26+ messages in thread

end of thread, other threads:[~2020-12-08  5:16 UTC | newest]

Thread overview: 26+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-11-11  4:58 [PATCH v7 0/5] scsi: Support to handle Intermittent errors Muneendra
2020-11-11  4:58 ` [PATCH v7 1/5] scsi: Added a new error code DID_TRANSPORT_MARGINAL in scsi.h Muneendra
2020-11-16  8:16   ` Hannes Reinecke
2020-11-23 19:45   ` Ewan D. Milne
2020-11-24 17:42   ` Himanshu Madhani
2020-11-11  4:58 ` [PATCH v7 2/5] scsi: No retries on abort success Muneendra
2020-11-16  8:22   ` Hannes Reinecke
2020-11-23 19:45   ` Ewan D. Milne
2020-11-24 17:43   ` Himanshu Madhani
2020-11-11  4:58 ` [PATCH v7 3/5] scsi_transport_fc: Added a new rport state FC_PORTSTATE_MARGINAL Muneendra
2020-11-16  8:19   ` Hannes Reinecke
2020-11-17  7:43     ` Muneendra Kumar M
2020-11-23 20:01     ` Ewan D. Milne
2020-11-23 19:47   ` Ewan D. Milne
2020-11-24 17:43   ` Himanshu Madhani
2020-11-11  4:58 ` [PATCH v7 4/5] scsi_transport_fc: Added store fucntionality to set the rport port_state using sysfs Muneendra
2020-11-16  8:20   ` Hannes Reinecke
2020-11-23 19:47   ` Ewan D. Milne
2020-11-24 17:44   ` Himanshu Madhani
2020-11-11  4:58 ` [PATCH v7 5/5] scsi:lpfc: Added support for eh_should_retry_cmd Muneendra
2020-11-16  8:23   ` Hannes Reinecke
2020-11-23 19:51     ` Ewan D. Milne
2020-11-23 19:48   ` Ewan D. Milne
2020-11-24 17:46   ` Himanshu Madhani
2020-12-08  5:00 ` [PATCH v7 0/5] scsi: Support to handle Intermittent errors Muneendra Kumar M
2020-12-08  5:14   ` Martin K. Petersen

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).