All of lore.kernel.org
 help / color / mirror / Atom feed
* [patch v4 0/5] scsi: Support to handle Intermittent errors
@ 2020-10-22 12:34 Muneendra
  2020-10-22 12:34 ` [patch v4 1/5] scsi: Added a new definition in scsi_cmnd.h Muneendra
                   ` (4 more replies)
  0 siblings, 5 replies; 14+ messages in thread
From: Muneendra @ 2020-10-22 12:34 UTC (permalink / raw)
  To: linux-scsi, michael.christie, hare; +Cc: jsmart2021, emilne, mkumar, Muneendra

[-- Attachment #1: Type: text/plain, Size: 3099 bytes --]

This patch adds a support to prevent retries of all the
io's after an abort succeeds on a particular device when transport
connectivity to the device is encountering intermittent errors.

Intermittent connectivity is a condition that can be detected by transport
fabric notifications. A service can monitor the ELS notifications and
take action on all the outstanding io's of a scsi device at that instant.

This feature is intended to be used when the device is part of a multipath
environment. When the service detects the poor connectivity, the multipath
path can be placed in a marginal path group and ignored further io
operations.

After placing a path in the marginal path group,the daemon sets the
port_state to Marginal which sets bit in scmd->state for all the
io's on that particular device with the new sysfs interface
provided in this patch.This prevent retries of all the
io's if an io hits a scsi timeout which inturn issues an abort.
On Abort succeeds on a marginal path the io will be immediately retried on
another active path.On abort fails then the things escalates to existing
target reset sg interface recovery process.

Below is the interface provided to set the port state to Marginal
and Online.
echo "Marginal" >> /sys/class/fc_remote_ports/rport-X\:Y-Z/port_state
echo "Online" >> /sys/class/fc_remote_ports/rport-X\:Y-Z/port_state


The patches were cut against  5.10/scsi-queue tree

---
v4:
Made changes in fc_eh_timed_out callout to set the SCMD_NORETRIES_ABORT if port
state is marginal 

With this change, we  removed the code  to loop over running commands
and fc_remote_port_chkready changes to set the SCMD_NORETRIES_ABORT 

Removed the scsi_cmd argument for fc_remote_port_chkready
and reverted back the patches that addressed this change(argument)

Removed unnecessary comments
Handle the return of errors on failure.

v3:
Removed the port_state from starget attributes.
Enabled the store functionality for port_state under remote port
Added a new argument to scsi_cmd  to fc_remote_port_chkready
Used the existing scsi command iterators scsi_host_busy_iter.
Rearranged the patches
Added new patches to add new argument for fc_remote_port_chkready

v2:
Added new error code DID_TRANSPORT_MARGINAL to handle marginal errors.
Added a new rport_state FC_PORTSTATE_MARGINAL and also added a new
sysfs interface port_state to set the port_state to marginal.
Added the support in lpfc to handle the marginal state.



Muneendra (5):
  scsi: Added a new definition in scsi_cmnd.h
  scsi: Added a new error code in scsi.h
  scsi: No retries on abort success
  scsi_transport_fc: Added a new rport state FC_PORTSTATE_MARGINAL
  scsi_transport_fc: Added store fucntionality to set the rport
    port_state using sysfs

 drivers/scsi/scsi_error.c        | 10 ++++
 drivers/scsi/scsi_lib.c          |  2 +
 drivers/scsi/scsi_transport_fc.c | 97 ++++++++++++++++++++++++++------
 include/scsi/scsi.h              |  1 +
 include/scsi/scsi_cmnd.h         |  3 +
 include/scsi/scsi_transport_fc.h | 19 +++++++
 6 files changed, 114 insertions(+), 18 deletions(-)

-- 
2.26.2


[-- Attachment #2: S/MIME Cryptographic Signature --]
[-- Type: application/pkcs7-signature, Size: 4177 bytes --]

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [patch v4 1/5] scsi: Added a new definition in scsi_cmnd.h
  2020-10-22 12:34 [patch v4 0/5] scsi: Support to handle Intermittent errors Muneendra
@ 2020-10-22 12:34 ` Muneendra
  2020-10-22 12:34 ` [patch v4 2/5] scsi: Added a new error code in scsi.h Muneendra
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 14+ messages in thread
From: Muneendra @ 2020-10-22 12:34 UTC (permalink / raw)
  To: linux-scsi, michael.christie, hare; +Cc: jsmart2021, emilne, mkumar, Muneendra

[-- Attachment #1: Type: text/plain, Size: 807 bytes --]

Added a new definition SCMD_NORETRIES_ABORT in scsi_cmnd.h

The SCMD_NORETRIES_ABORT defines the third bit postion
in scmd->state

Signed-off-by: Muneendra <muneendra.kumar@broadcom.com>

---
v4:
No Change

v3:
No change

v2:
Modified the commit log
---
 include/scsi/scsi_cmnd.h | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/include/scsi/scsi_cmnd.h b/include/scsi/scsi_cmnd.h
index 69ade4fb71aa..8dec4ec6bd5f 100644
--- a/include/scsi/scsi_cmnd.h
+++ b/include/scsi/scsi_cmnd.h
@@ -64,6 +64,9 @@ struct scsi_pointer {
 /* for scmd->state */
 #define SCMD_STATE_COMPLETE	0
 #define SCMD_STATE_INFLIGHT	1
+#define SCMD_NORETRIES_ABORT	2 /* If this bit is set then there won't be any
+				   * retries of scmd on abort success
+				   */
 
 struct scsi_cmnd {
 	struct scsi_request req;
-- 
2.26.2


[-- Attachment #2: S/MIME Cryptographic Signature --]
[-- Type: application/pkcs7-signature, Size: 4177 bytes --]

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [patch v4 2/5] scsi: Added a new error code in scsi.h
  2020-10-22 12:34 [patch v4 0/5] scsi: Support to handle Intermittent errors Muneendra
  2020-10-22 12:34 ` [patch v4 1/5] scsi: Added a new definition in scsi_cmnd.h Muneendra
@ 2020-10-22 12:34 ` Muneendra
  2020-10-26 11:45   ` Ewan D. Milne
  2020-10-22 12:34 ` [patch v4 3/5] scsi: No retries on abort success Muneendra
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 14+ messages in thread
From: Muneendra @ 2020-10-22 12:34 UTC (permalink / raw)
  To: linux-scsi, michael.christie, hare; +Cc: jsmart2021, emilne, mkumar, Muneendra

[-- Attachment #1: Type: text/plain, Size: 1423 bytes --]

Added a new error code DID_TRANSPORT_MARGINAL to handle marginal
errors in scsi.h

Clearing the SCMD_NORETRIES_ABORT bit in state flag before
blk_mq_start_request

Signed-off-by: Muneendra <muneendra.kumar@broadcom.com>

---
v4:
No change

v3:
Rearranged the patch by merging second hunk of the previous(v2)
patch3 to this patch

v2:
Newpatch
---
 drivers/scsi/scsi_lib.c | 1 +
 include/scsi/scsi.h     | 1 +
 2 files changed, 2 insertions(+)

diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index 1a2e9bab42ef..2b5dea07498e 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -1660,6 +1660,7 @@ static blk_status_t scsi_queue_rq(struct blk_mq_hw_ctx *hctx,
 		req->rq_flags |= RQF_DONTPREP;
 	} else {
 		clear_bit(SCMD_STATE_COMPLETE, &cmd->state);
+		clear_bit(SCMD_NORETRIES_ABORT, &cmd->state);
 	}
 
 	cmd->flags &= SCMD_PRESERVED_FLAGS;
diff --git a/include/scsi/scsi.h b/include/scsi/scsi.h
index 5339baadc082..5b287ad8b727 100644
--- a/include/scsi/scsi.h
+++ b/include/scsi/scsi.h
@@ -159,6 +159,7 @@ static inline int scsi_is_wlun(u64 lun)
 				 * paths might yield different results */
 #define DID_ALLOC_FAILURE 0x12  /* Space allocation on the device failed */
 #define DID_MEDIUM_ERROR  0x13  /* Medium error */
+#define DID_TRANSPORT_MARGINAL 0x14 /* Transport marginal errors */
 #define DRIVER_OK       0x00	/* Driver status                           */
 
 /*
-- 
2.26.2


[-- Attachment #2: S/MIME Cryptographic Signature --]
[-- Type: application/pkcs7-signature, Size: 4177 bytes --]

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [patch v4 3/5] scsi: No retries on abort success
  2020-10-22 12:34 [patch v4 0/5] scsi: Support to handle Intermittent errors Muneendra
  2020-10-22 12:34 ` [patch v4 1/5] scsi: Added a new definition in scsi_cmnd.h Muneendra
  2020-10-22 12:34 ` [patch v4 2/5] scsi: Added a new error code in scsi.h Muneendra
@ 2020-10-22 12:34 ` Muneendra
  2020-10-22 12:34 ` [patch v4 4/5] scsi_transport_fc: Added a new rport state FC_PORTSTATE_MARGINAL Muneendra
  2020-10-22 12:34 ` [patch v4 5/5] scsi_transport_fc: Added store fucntionality to set the rport port_state using sysfs Muneendra
  4 siblings, 0 replies; 14+ messages in thread
From: Muneendra @ 2020-10-22 12:34 UTC (permalink / raw)
  To: linux-scsi, michael.christie, hare; +Cc: jsmart2021, emilne, mkumar, Muneendra

[-- Attachment #1: Type: text/plain, Size: 2085 bytes --]

Made an additional check in scsi_noretry_cmd to verify whether user has
decided not to do retries on abort success by setting the
SCMD_NORETRIES_ABORT bit

If SCMD_NORETRIES_ABORT bit is set we are making sure there won't be any
retries done on the same path and also setting the host byte as
DID_TRANSPORT_MARGINAL so that the error can be propogated as recoverable
transport error to the blk layers.

Added a code in scsi_result_to_blk_status to translate
a new error DID_TRANSPORT_MARGINAL to the corresponding blk_status_t
i.e BLK_STS_TRANSPORT

Signed-off-by: Muneendra <muneendra.kumar@broadcom.com>

---
v4:
Modified the comments in the code appropriately

v3:
Merged  first part of the previous patch(v2 patch3) with
this patch.

v2:
set the hostbyte as DID_TRANSPORT_MARGINAL instead of
DID_TRANSPORT_FAILFAST.
---
 drivers/scsi/scsi_error.c | 10 ++++++++++
 drivers/scsi/scsi_lib.c   |  1 +
 2 files changed, 11 insertions(+)

diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
index ae80daa5d831..5c016270bda2 100644
--- a/drivers/scsi/scsi_error.c
+++ b/drivers/scsi/scsi_error.c
@@ -1763,6 +1763,16 @@ int scsi_noretry_cmd(struct scsi_cmnd *scmd)
 		return 0;
 
 check_type:
+	/*
+	 * Check whether caller has decided not to do retries on
+	 * abort success by checking the SCMD_NORETRIES_ABORT bit
+	 */
+	if ((test_bit(SCMD_NORETRIES_ABORT, &scmd->state)) &&
+		(scmd->request->cmd_flags & REQ_FAILFAST_TRANSPORT)) {
+		set_host_byte(scmd, DID_TRANSPORT_MARGINAL);
+		return 1;
+	}
+
 	/*
 	 * assume caller has checked sense and determined
 	 * the check condition was retryable.
diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index 2b5dea07498e..9606bad1542f 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -629,6 +629,7 @@ static blk_status_t scsi_result_to_blk_status(struct scsi_cmnd *cmd, int result)
 			return BLK_STS_OK;
 		return BLK_STS_IOERR;
 	case DID_TRANSPORT_FAILFAST:
+	case DID_TRANSPORT_MARGINAL:
 		return BLK_STS_TRANSPORT;
 	case DID_TARGET_FAILURE:
 		set_host_byte(cmd, DID_OK);
-- 
2.26.2


[-- Attachment #2: S/MIME Cryptographic Signature --]
[-- Type: application/pkcs7-signature, Size: 4177 bytes --]

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [patch v4 4/5] scsi_transport_fc: Added a new rport state FC_PORTSTATE_MARGINAL
  2020-10-22 12:34 [patch v4 0/5] scsi: Support to handle Intermittent errors Muneendra
                   ` (2 preceding siblings ...)
  2020-10-22 12:34 ` [patch v4 3/5] scsi: No retries on abort success Muneendra
@ 2020-10-22 12:34 ` Muneendra
  2020-10-26 11:48   ` Ewan D. Milne
  2020-10-26 17:14   ` Mike Christie
  2020-10-22 12:34 ` [patch v4 5/5] scsi_transport_fc: Added store fucntionality to set the rport port_state using sysfs Muneendra
  4 siblings, 2 replies; 14+ messages in thread
From: Muneendra @ 2020-10-22 12:34 UTC (permalink / raw)
  To: linux-scsi, michael.christie, hare; +Cc: jsmart2021, emilne, mkumar, Muneendra

[-- Attachment #1: Type: text/plain, Size: 6470 bytes --]

Added a new rport state FC_PORTSTATE_MARGINAL.

Added a new inline function fc_rport_chkmarginal_set_noretries
which will set the SCMD_NORETRIES_ABORT bit in cmd->state if rport state
is marginal.

Made changes in fc_eh_timed_out to call fc_rport_chkmarginal_set_noretries
Also made changes in fc_remote_port_delete,fc_user_scan_tgt,
fc_timeout_deleted_rport functions  to handle the new rport state
FC_PORTSTATE_MARGINAL.

Signed-off-by: Muneendra <muneendra.kumar@broadcom.com>

---
v4:
Made changes in fc_eh_timed_out to call fc_rport_chkmarginal_set_noretries
so that SCMD_NORETRIES_ABORT bit in cmd->state is set if rport state
is marginal.

Removed the newly added scsi_cmd argument to fc_remote_port_chkready
as the current patch handles only SCSI EH timeout/abort case.

v3:
Rearranged the patch so that all the changes with respect to new
rport state is part of this patch.
Added a new argument to scsi_cmd  to fc_remote_port_chkready

v2:
New patch
---
 drivers/scsi/scsi_transport_fc.c | 41 +++++++++++++++++++-------------
 include/scsi/scsi_transport_fc.h | 19 +++++++++++++++
 2 files changed, 44 insertions(+), 16 deletions(-)

diff --git a/drivers/scsi/scsi_transport_fc.c b/drivers/scsi/scsi_transport_fc.c
index 2ff7f06203da..fcb38068e2a4 100644
--- a/drivers/scsi/scsi_transport_fc.c
+++ b/drivers/scsi/scsi_transport_fc.c
@@ -142,20 +142,23 @@ fc_enum_name_search(host_event_code, fc_host_event_code,
 static struct {
 	enum fc_port_state	value;
 	char			*name;
+	int			matchlen;
 } fc_port_state_names[] = {
-	{ FC_PORTSTATE_UNKNOWN,		"Unknown" },
-	{ FC_PORTSTATE_NOTPRESENT,	"Not Present" },
-	{ FC_PORTSTATE_ONLINE,		"Online" },
-	{ FC_PORTSTATE_OFFLINE,		"Offline" },
-	{ FC_PORTSTATE_BLOCKED,		"Blocked" },
-	{ FC_PORTSTATE_BYPASSED,	"Bypassed" },
-	{ FC_PORTSTATE_DIAGNOSTICS,	"Diagnostics" },
-	{ FC_PORTSTATE_LINKDOWN,	"Linkdown" },
-	{ FC_PORTSTATE_ERROR,		"Error" },
-	{ FC_PORTSTATE_LOOPBACK,	"Loopback" },
-	{ FC_PORTSTATE_DELETED,		"Deleted" },
+	{ FC_PORTSTATE_UNKNOWN,		"Unknown", 7},
+	{ FC_PORTSTATE_NOTPRESENT,	"Not Present", 11 },
+	{ FC_PORTSTATE_ONLINE,		"Online", 6 },
+	{ FC_PORTSTATE_OFFLINE,		"Offline", 7 },
+	{ FC_PORTSTATE_BLOCKED,		"Blocked", 7 },
+	{ FC_PORTSTATE_BYPASSED,	"Bypassed", 8 },
+	{ FC_PORTSTATE_DIAGNOSTICS,	"Diagnostics", 11 },
+	{ FC_PORTSTATE_LINKDOWN,	"Linkdown", 8 },
+	{ FC_PORTSTATE_ERROR,		"Error", 5 },
+	{ FC_PORTSTATE_LOOPBACK,	"Loopback", 8 },
+	{ FC_PORTSTATE_DELETED,		"Deleted", 7 },
+	{ FC_PORTSTATE_MARGINAL,	"Marginal", 8 },
 };
 fc_enum_name_search(port_state, fc_port_state, fc_port_state_names)
+fc_enum_name_match(port_state, fc_port_state, fc_port_state_names)
 #define FC_PORTSTATE_MAX_NAMELEN	20
 
 
@@ -2071,6 +2074,7 @@ fc_eh_timed_out(struct scsi_cmnd *scmd)
 {
 	struct fc_rport *rport = starget_to_rport(scsi_target(scmd->device));
 
+	fc_rport_chkmarginal_set_noretries(rport, scmd);
 	if (rport->port_state == FC_PORTSTATE_BLOCKED)
 		return BLK_EH_RESET_TIMER;
 
@@ -2095,7 +2099,8 @@ fc_user_scan_tgt(struct Scsi_Host *shost, uint channel, uint id, u64 lun)
 		if (rport->scsi_target_id == -1)
 			continue;
 
-		if (rport->port_state != FC_PORTSTATE_ONLINE)
+		if ((rport->port_state != FC_PORTSTATE_ONLINE) &&
+			(rport->port_state != FC_PORTSTATE_MARGINAL))
 			continue;
 
 		if ((channel == rport->channel) &&
@@ -2958,7 +2963,8 @@ fc_remote_port_delete(struct fc_rport  *rport)
 
 	spin_lock_irqsave(shost->host_lock, flags);
 
-	if (rport->port_state != FC_PORTSTATE_ONLINE) {
+	if ((rport->port_state != FC_PORTSTATE_ONLINE) &&
+		(rport->port_state != FC_PORTSTATE_MARGINAL)) {
 		spin_unlock_irqrestore(shost->host_lock, flags);
 		return;
 	}
@@ -3100,7 +3106,8 @@ fc_timeout_deleted_rport(struct work_struct *work)
 	 * target, validate it still is. If not, tear down the
 	 * scsi_target on it.
 	 */
-	if ((rport->port_state == FC_PORTSTATE_ONLINE) &&
+	if (((rport->port_state == FC_PORTSTATE_ONLINE) ||
+		(rport->port_state == FC_PORTSTATE_MARGINAL)) &&
 	    (rport->scsi_target_id != -1) &&
 	    !(rport->roles & FC_PORT_ROLE_FCP_TARGET)) {
 		dev_printk(KERN_ERR, &rport->dev,
@@ -3243,7 +3250,8 @@ fc_scsi_scan_rport(struct work_struct *work)
 	struct fc_internal *i = to_fc_internal(shost->transportt);
 	unsigned long flags;
 
-	if ((rport->port_state == FC_PORTSTATE_ONLINE) &&
+	if (((rport->port_state == FC_PORTSTATE_ONLINE) ||
+		(rport->port_state == FC_PORTSTATE_ONLINE)) &&
 	    (rport->roles & FC_PORT_ROLE_FCP_TARGET) &&
 	    !(i->f->disable_target_scan)) {
 		scsi_scan_target(&rport->dev, rport->channel,
@@ -3747,7 +3755,8 @@ static blk_status_t fc_bsg_rport_prep(struct fc_rport *rport)
 	    !(rport->flags & FC_RPORT_FAST_FAIL_TIMEDOUT))
 		return BLK_STS_RESOURCE;
 
-	if (rport->port_state != FC_PORTSTATE_ONLINE)
+	if ((rport->port_state != FC_PORTSTATE_ONLINE) &&
+		(rport->port_state != FC_PORTSTATE_MARGINAL))
 		return BLK_STS_IOERR;
 
 	return BLK_STS_OK;
diff --git a/include/scsi/scsi_transport_fc.h b/include/scsi/scsi_transport_fc.h
index 1c7dd35cb7a0..829bade13b89 100644
--- a/include/scsi/scsi_transport_fc.h
+++ b/include/scsi/scsi_transport_fc.h
@@ -14,6 +14,7 @@
 #include <linux/bsg-lib.h>
 #include <asm/unaligned.h>
 #include <scsi/scsi.h>
+#include <scsi/scsi_cmnd.h>
 #include <scsi/scsi_netlink.h>
 #include <scsi/scsi_host.h>
 
@@ -67,6 +68,7 @@ enum fc_port_state {
 	FC_PORTSTATE_ERROR,
 	FC_PORTSTATE_LOOPBACK,
 	FC_PORTSTATE_DELETED,
+	FC_PORTSTATE_MARGINAL,
 };
 
 
@@ -707,6 +709,22 @@ struct fc_function_template {
 	unsigned long	disable_target_scan:1;
 };
 
+/**
+ * fc_rport_chkmarginal_set_noretries - Set the SCMD_NORETRIES_ABORT bit
+ * in cmd->state if port state is marginal
+ * @rport:	remote port to be checked
+ * @scmd:	scsi_cmd to set/clear the SCMD_NORETRIES_ABORT bit on Marginal state
+ **/
+static inline void
+fc_rport_chkmarginal_set_noretries(struct fc_rport *rport, struct scsi_cmnd *cmd)
+{
+	if ((rport->port_state == FC_PORTSTATE_MARGINAL) &&
+		 (cmd->request->cmd_flags & REQ_FAILFAST_TRANSPORT))
+		set_bit(SCMD_NORETRIES_ABORT, &cmd->state);
+	else
+		clear_bit(SCMD_NORETRIES_ABORT, &cmd->state);
+
+}
 
 /**
  * fc_remote_port_chkready - called to validate the remote port state
@@ -723,6 +741,7 @@ fc_remote_port_chkready(struct fc_rport *rport)
 
 	switch (rport->port_state) {
 	case FC_PORTSTATE_ONLINE:
+	case FC_PORTSTATE_MARGINAL:
 		if (rport->roles & FC_PORT_ROLE_FCP_TARGET)
 			result = 0;
 		else if (rport->flags & FC_RPORT_DEVLOSS_PENDING)
-- 
2.26.2


[-- Attachment #2: S/MIME Cryptographic Signature --]
[-- Type: application/pkcs7-signature, Size: 4177 bytes --]

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [patch v4 5/5] scsi_transport_fc: Added store fucntionality to set the rport port_state using sysfs
  2020-10-22 12:34 [patch v4 0/5] scsi: Support to handle Intermittent errors Muneendra
                   ` (3 preceding siblings ...)
  2020-10-22 12:34 ` [patch v4 4/5] scsi_transport_fc: Added a new rport state FC_PORTSTATE_MARGINAL Muneendra
@ 2020-10-22 12:34 ` Muneendra
  4 siblings, 0 replies; 14+ messages in thread
From: Muneendra @ 2020-10-22 12:34 UTC (permalink / raw)
  To: linux-scsi, michael.christie, hare; +Cc: jsmart2021, emilne, mkumar, Muneendra

[-- Attachment #1: Type: text/plain, Size: 3881 bytes --]

Added a store functionality to set rport port_state using sysfs
under  fc_remote_ports/rport-*/port_state

With this functionality the user can move the port_state from
Marginal -> Online and Online->Marginal.

On Marginal :This interface will set SCMD_NORETRIES_ABORT bit in
scmd->state for all the pending io's on the scsi device associated
with target port.

On Online :This interface will clear SCMD_NORETRIES_ABORT bit in
scmd->state for all the pending io's on the scsi device associated
with target port.

Below is the interface provided to set the port state to Marginal
and Online.

echo "Marginal" >> /sys/class/fc_remote_ports/rport-X\:Y-Z/port_state
echo "Online" >> /sys/class/fc_remote_ports/rport-X\:Y-Z/port_state

Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Muneendra <muneendra.kumar@broadcom.com>

---
v4:
Addressed the error reported by kernel test robot
Removed the code needed to traverse all the devices under rport
to set/clear SCMD_NORETRIES_ABORT
Removed unncessary comments.
Return the error values on failure while setting the port_state

v3:
Removed the port_state from starget attributes.
Enabled the store functionality for port_state under remote port.
used the starget_for_each_device to traverse around all the devices
under rport

v2:
Changed from a noretries_abort attribute under fc_transport/target*/ to
port_state for changing the port_state to a marginal state
---
 drivers/scsi/scsi_transport_fc.c | 56 ++++++++++++++++++++++++++++++--
 1 file changed, 54 insertions(+), 2 deletions(-)

diff --git a/drivers/scsi/scsi_transport_fc.c b/drivers/scsi/scsi_transport_fc.c
index fcb38068e2a4..41587b1a49fb 100644
--- a/drivers/scsi/scsi_transport_fc.c
+++ b/drivers/scsi/scsi_transport_fc.c
@@ -943,7 +943,59 @@ show_fc_rport_roles (struct device *dev, struct device_attribute *attr,
 static FC_DEVICE_ATTR(rport, roles, S_IRUGO,
 		show_fc_rport_roles, NULL);
 
-fc_private_rport_rd_enum_attr(port_state, FC_PORTSTATE_MAX_NAMELEN);
+static ssize_t fc_rport_set_marginal_state(struct device *dev,
+						struct device_attribute *attr,
+						const char *buf, size_t count)
+{
+	struct fc_rport *rport = transport_class_to_rport(dev);
+	enum fc_port_state port_state;
+	int ret = 0;
+
+	ret = get_fc_port_state_match(buf, &port_state);
+	if (ret)
+		return -EINVAL;
+	if (port_state == FC_PORTSTATE_MARGINAL) {
+		/*
+		 * Change the state to marginal only if the
+		 * current rport state is Online
+		 * Allow only Online->marginal
+		 */
+		if (rport->port_state == FC_PORTSTATE_ONLINE)
+			rport->port_state = port_state;
+		else
+			return -EINVAL;
+	} else if (port_state == FC_PORTSTATE_ONLINE) {
+		/*
+		 * Change the state to Online only if the
+		 * current rport state is Marginal
+		 * Allow only  MArginal->Online
+		 */
+		if (rport->port_state == FC_PORTSTATE_MARGINAL)
+			rport->port_state = port_state;
+		else
+			return -EINVAL;
+	} else
+		return -EINVAL;
+	return count;
+}
+
+static ssize_t
+show_fc_rport_port_state(struct device *dev,
+				struct device_attribute *attr, char *buf)
+{
+	const char *name;
+	struct fc_rport *rport = transport_class_to_rport(dev);
+
+	name = get_fc_port_state_name(rport->port_state);
+	if (!name)
+		return -EINVAL;
+
+	return snprintf(buf, 20, "%s\n", name);
+}
+
+static FC_DEVICE_ATTR(rport, port_state, 0444 | 0200,
+			show_fc_rport_port_state, fc_rport_set_marginal_state);
+
 fc_private_rport_rd_attr(scsi_target_id, "%d\n", 20);
 
 /*
@@ -2267,7 +2319,7 @@ fc_attach_transport(struct fc_function_template *ft)
 	SETUP_PRIVATE_RPORT_ATTRIBUTE_RD(port_name);
 	SETUP_PRIVATE_RPORT_ATTRIBUTE_RD(port_id);
 	SETUP_PRIVATE_RPORT_ATTRIBUTE_RD(roles);
-	SETUP_PRIVATE_RPORT_ATTRIBUTE_RD(port_state);
+	SETUP_PRIVATE_RPORT_ATTRIBUTE_RW(port_state);
 	SETUP_PRIVATE_RPORT_ATTRIBUTE_RD(scsi_target_id);
 	SETUP_PRIVATE_RPORT_ATTRIBUTE_RW(fast_io_fail_tmo);
 
-- 
2.26.2


[-- Attachment #2: S/MIME Cryptographic Signature --]
[-- Type: application/pkcs7-signature, Size: 4177 bytes --]

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [patch v4 2/5] scsi: Added a new error code in scsi.h
  2020-10-22 12:34 ` [patch v4 2/5] scsi: Added a new error code in scsi.h Muneendra
@ 2020-10-26 11:45   ` Ewan D. Milne
  2020-10-26 20:24     ` Muneendra Kumar M
  0 siblings, 1 reply; 14+ messages in thread
From: Ewan D. Milne @ 2020-10-26 11:45 UTC (permalink / raw)
  To: Muneendra, linux-scsi, michael.christie, hare; +Cc: jsmart2021, mkumar

Don't you need to add the DID_TRANSPORT_MARGINAL case to
scsi_decide_disposition() ?   DID_TRANSPORT_FAILFAST returns
SUCCESS but the default case returns ERROR.

-Ewan


On Thu, 2020-10-22 at 18:04 +0530, Muneendra wrote:
> Added a new error code DID_TRANSPORT_MARGINAL to handle marginal
> errors in scsi.h
> 
> Clearing the SCMD_NORETRIES_ABORT bit in state flag before
> blk_mq_start_request
> 
> Signed-off-by: Muneendra <muneendra.kumar@broadcom.com>
> 
> ---
> v4:
> No change
> 
> v3:
> Rearranged the patch by merging second hunk of the previous(v2)
> patch3 to this patch
> 
> v2:
> Newpatch
> ---
>  drivers/scsi/scsi_lib.c | 1 +
>  include/scsi/scsi.h     | 1 +
>  2 files changed, 2 insertions(+)
> 
> diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
> index 1a2e9bab42ef..2b5dea07498e 100644
> --- a/drivers/scsi/scsi_lib.c
> +++ b/drivers/scsi/scsi_lib.c
> @@ -1660,6 +1660,7 @@ static blk_status_t scsi_queue_rq(struct
> blk_mq_hw_ctx *hctx,
>  		req->rq_flags |= RQF_DONTPREP;
>  	} else {
>  		clear_bit(SCMD_STATE_COMPLETE, &cmd->state);
> +		clear_bit(SCMD_NORETRIES_ABORT, &cmd->state);
>  	}
>  
>  	cmd->flags &= SCMD_PRESERVED_FLAGS;
> diff --git a/include/scsi/scsi.h b/include/scsi/scsi.h
> index 5339baadc082..5b287ad8b727 100644
> --- a/include/scsi/scsi.h
> +++ b/include/scsi/scsi.h
> @@ -159,6 +159,7 @@ static inline int scsi_is_wlun(u64 lun)
>  				 * paths might yield different results
> */
>  #define DID_ALLOC_FAILURE 0x12  /* Space allocation on the device
> failed */
>  #define DID_MEDIUM_ERROR  0x13  /* Medium error */
> +#define DID_TRANSPORT_MARGINAL 0x14 /* Transport marginal errors */
>  #define DRIVER_OK       0x00	/* Driver
> status                           */
>  
>  /*


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [patch v4 4/5] scsi_transport_fc: Added a new rport state FC_PORTSTATE_MARGINAL
  2020-10-22 12:34 ` [patch v4 4/5] scsi_transport_fc: Added a new rport state FC_PORTSTATE_MARGINAL Muneendra
@ 2020-10-26 11:48   ` Ewan D. Milne
  2020-10-26 20:24     ` Muneendra Kumar M
  2020-10-26 17:14   ` Mike Christie
  1 sibling, 1 reply; 14+ messages in thread
From: Ewan D. Milne @ 2020-10-26 11:48 UTC (permalink / raw)
  To: Muneendra, linux-scsi, michael.christie, hare; +Cc: jsmart2021, mkumar

See below.  I think you wanted to check for FC_PORTSTATE_MARGINAL
in the code you added to fc_scsi_scan_rport().  Instead the code
tests for FC_PORTSTATE_ONLINE twice.

-Ewan

On Thu, 2020-10-22 at 18:04 +0530, Muneendra wrote:
> Added a new rport state FC_PORTSTATE_MARGINAL.
> 
> Added a new inline function fc_rport_chkmarginal_set_noretries
> which will set the SCMD_NORETRIES_ABORT bit in cmd->state if rport
> state
> is marginal.
> 
> Made changes in fc_eh_timed_out to call
> fc_rport_chkmarginal_set_noretries
> Also made changes in fc_remote_port_delete,fc_user_scan_tgt,
> fc_timeout_deleted_rport functions  to handle the new rport state
> FC_PORTSTATE_MARGINAL.
> 
> Signed-off-by: Muneendra <muneendra.kumar@broadcom.com>
> 
> ---
> v4:
> Made changes in fc_eh_timed_out to call
> fc_rport_chkmarginal_set_noretries
> so that SCMD_NORETRIES_ABORT bit in cmd->state is set if rport state
> is marginal.
> 
> Removed the newly added scsi_cmd argument to fc_remote_port_chkready
> as the current patch handles only SCSI EH timeout/abort case.
> 
> v3:
> Rearranged the patch so that all the changes with respect to new
> rport state is part of this patch.
> Added a new argument to scsi_cmd  to fc_remote_port_chkready
> 
> v2:
> New patch
> ---
>  drivers/scsi/scsi_transport_fc.c | 41 +++++++++++++++++++-----------
> --
>  include/scsi/scsi_transport_fc.h | 19 +++++++++++++++
>  2 files changed, 44 insertions(+), 16 deletions(-)
> 
> diff --git a/drivers/scsi/scsi_transport_fc.c
> b/drivers/scsi/scsi_transport_fc.c
> index 2ff7f06203da..fcb38068e2a4 100644
> --- a/drivers/scsi/scsi_transport_fc.c
> +++ b/drivers/scsi/scsi_transport_fc.c
> @@ -142,20 +142,23 @@ fc_enum_name_search(host_event_code,
> fc_host_event_code,
>  static struct {
>  	enum fc_port_state	value;
>  	char			*name;
> +	int			matchlen;
>  } fc_port_state_names[] = {
> -	{ FC_PORTSTATE_UNKNOWN,		"Unknown" },
> -	{ FC_PORTSTATE_NOTPRESENT,	"Not Present" },
> -	{ FC_PORTSTATE_ONLINE,		"Online" },
> -	{ FC_PORTSTATE_OFFLINE,		"Offline" },
> -	{ FC_PORTSTATE_BLOCKED,		"Blocked" },
> -	{ FC_PORTSTATE_BYPASSED,	"Bypassed" },
> -	{ FC_PORTSTATE_DIAGNOSTICS,	"Diagnostics" },
> -	{ FC_PORTSTATE_LINKDOWN,	"Linkdown" },
> -	{ FC_PORTSTATE_ERROR,		"Error" },
> -	{ FC_PORTSTATE_LOOPBACK,	"Loopback" },
> -	{ FC_PORTSTATE_DELETED,		"Deleted" },
> +	{ FC_PORTSTATE_UNKNOWN,		"Unknown", 7},
> +	{ FC_PORTSTATE_NOTPRESENT,	"Not Present", 11 },
> +	{ FC_PORTSTATE_ONLINE,		"Online", 6 },
> +	{ FC_PORTSTATE_OFFLINE,		"Offline", 7 },
> +	{ FC_PORTSTATE_BLOCKED,		"Blocked", 7 },
> +	{ FC_PORTSTATE_BYPASSED,	"Bypassed", 8 },
> +	{ FC_PORTSTATE_DIAGNOSTICS,	"Diagnostics", 11 },
> +	{ FC_PORTSTATE_LINKDOWN,	"Linkdown", 8 },
> +	{ FC_PORTSTATE_ERROR,		"Error", 5 },
> +	{ FC_PORTSTATE_LOOPBACK,	"Loopback", 8 },
> +	{ FC_PORTSTATE_DELETED,		"Deleted", 7 },
> +	{ FC_PORTSTATE_MARGINAL,	"Marginal", 8 },
>  };
>  fc_enum_name_search(port_state, fc_port_state, fc_port_state_names)
> +fc_enum_name_match(port_state, fc_port_state, fc_port_state_names)
>  #define FC_PORTSTATE_MAX_NAMELEN	20
>  
>  
> @@ -2071,6 +2074,7 @@ fc_eh_timed_out(struct scsi_cmnd *scmd)
>  {
>  	struct fc_rport *rport = starget_to_rport(scsi_target(scmd-
> >device));
>  
> +	fc_rport_chkmarginal_set_noretries(rport, scmd);
>  	if (rport->port_state == FC_PORTSTATE_BLOCKED)
>  		return BLK_EH_RESET_TIMER;
>  
> @@ -2095,7 +2099,8 @@ fc_user_scan_tgt(struct Scsi_Host *shost, uint
> channel, uint id, u64 lun)
>  		if (rport->scsi_target_id == -1)
>  			continue;
>  
> -		if (rport->port_state != FC_PORTSTATE_ONLINE)
> +		if ((rport->port_state != FC_PORTSTATE_ONLINE) &&
> +			(rport->port_state != FC_PORTSTATE_MARGINAL))
>  			continue;
>  
>  		if ((channel == rport->channel) &&
> @@ -2958,7 +2963,8 @@ fc_remote_port_delete(struct fc_rport  *rport)
>  
>  	spin_lock_irqsave(shost->host_lock, flags);
>  
> -	if (rport->port_state != FC_PORTSTATE_ONLINE) {
> +	if ((rport->port_state != FC_PORTSTATE_ONLINE) &&
> +		(rport->port_state != FC_PORTSTATE_MARGINAL)) {
>  		spin_unlock_irqrestore(shost->host_lock, flags);
>  		return;
>  	}
> @@ -3100,7 +3106,8 @@ fc_timeout_deleted_rport(struct work_struct
> *work)
>  	 * target, validate it still is. If not, tear down the
>  	 * scsi_target on it.
>  	 */
> -	if ((rport->port_state == FC_PORTSTATE_ONLINE) &&
> +	if (((rport->port_state == FC_PORTSTATE_ONLINE) ||
> +		(rport->port_state == FC_PORTSTATE_MARGINAL)) &&
>  	    (rport->scsi_target_id != -1) &&
>  	    !(rport->roles & FC_PORT_ROLE_FCP_TARGET)) {
>  		dev_printk(KERN_ERR, &rport->dev,
> @@ -3243,7 +3250,8 @@ fc_scsi_scan_rport(struct work_struct *work)
>  	struct fc_internal *i = to_fc_internal(shost->transportt);
>  	unsigned long flags;
>  
> -	if ((rport->port_state == FC_PORTSTATE_ONLINE) &&
> +	if (((rport->port_state == FC_PORTSTATE_ONLINE) ||
> +		(rport->port_state == FC_PORTSTATE_ONLINE)) &&

I think the second line should have been FC_PORTSTATE_MARGINAL.


>  	    (rport->roles & FC_PORT_ROLE_FCP_TARGET) &&
>  	    !(i->f->disable_target_scan)) {
>  		scsi_scan_target(&rport->dev, rport->channel,
> @@ -3747,7 +3755,8 @@ static blk_status_t fc_bsg_rport_prep(struct
> fc_rport *rport)
>  	    !(rport->flags & FC_RPORT_FAST_FAIL_TIMEDOUT))
>  		return BLK_STS_RESOURCE;
>  
> -	if (rport->port_state != FC_PORTSTATE_ONLINE)
> +	if ((rport->port_state != FC_PORTSTATE_ONLINE) &&
> +		(rport->port_state != FC_PORTSTATE_MARGINAL))
>  		return BLK_STS_IOERR;
>  
>  	return BLK_STS_OK;
> diff --git a/include/scsi/scsi_transport_fc.h
> b/include/scsi/scsi_transport_fc.h
> index 1c7dd35cb7a0..829bade13b89 100644
> --- a/include/scsi/scsi_transport_fc.h
> +++ b/include/scsi/scsi_transport_fc.h
> @@ -14,6 +14,7 @@
>  #include <linux/bsg-lib.h>
>  #include <asm/unaligned.h>
>  #include <scsi/scsi.h>
> +#include <scsi/scsi_cmnd.h>
>  #include <scsi/scsi_netlink.h>
>  #include <scsi/scsi_host.h>
>  
> @@ -67,6 +68,7 @@ enum fc_port_state {
>  	FC_PORTSTATE_ERROR,
>  	FC_PORTSTATE_LOOPBACK,
>  	FC_PORTSTATE_DELETED,
> +	FC_PORTSTATE_MARGINAL,
>  };
>  
>  
> @@ -707,6 +709,22 @@ struct fc_function_template {
>  	unsigned long	disable_target_scan:1;
>  };
>  
> +/**
> + * fc_rport_chkmarginal_set_noretries - Set the SCMD_NORETRIES_ABORT
> bit
> + * in cmd->state if port state is marginal
> + * @rport:	remote port to be checked
> + * @scmd:	scsi_cmd to set/clear the SCMD_NORETRIES_ABORT bit on
> Marginal state
> + **/
> +static inline void
> +fc_rport_chkmarginal_set_noretries(struct fc_rport *rport, struct
> scsi_cmnd *cmd)
> +{
> +	if ((rport->port_state == FC_PORTSTATE_MARGINAL) &&
> +		 (cmd->request->cmd_flags & REQ_FAILFAST_TRANSPORT))
> +		set_bit(SCMD_NORETRIES_ABORT, &cmd->state);
> +	else
> +		clear_bit(SCMD_NORETRIES_ABORT, &cmd->state);
> +
> +}
>  
>  /**
>   * fc_remote_port_chkready - called to validate the remote port
> state
> @@ -723,6 +741,7 @@ fc_remote_port_chkready(struct fc_rport *rport)
>  
>  	switch (rport->port_state) {
>  	case FC_PORTSTATE_ONLINE:
> +	case FC_PORTSTATE_MARGINAL:
>  		if (rport->roles & FC_PORT_ROLE_FCP_TARGET)
>  			result = 0;
>  		else if (rport->flags & FC_RPORT_DEVLOSS_PENDING)


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [patch v4 4/5] scsi_transport_fc: Added a new rport state FC_PORTSTATE_MARGINAL
  2020-10-22 12:34 ` [patch v4 4/5] scsi_transport_fc: Added a new rport state FC_PORTSTATE_MARGINAL Muneendra
  2020-10-26 11:48   ` Ewan D. Milne
@ 2020-10-26 17:14   ` Mike Christie
  2020-10-29 11:53     ` Muneendra Kumar M
  1 sibling, 1 reply; 14+ messages in thread
From: Mike Christie @ 2020-10-26 17:14 UTC (permalink / raw)
  To: Muneendra, linux-scsi, hare; +Cc: jsmart2021, emilne, mkumar

On 10/22/20 7:34 AM, Muneendra wrote:
> @@ -2071,6 +2074,7 @@ fc_eh_timed_out(struct scsi_cmnd *scmd)
>  {
>  	struct fc_rport *rport = starget_to_rport(scsi_target(scmd->device));
>  
> +	fc_rport_chkmarginal_set_noretries(rport, scmd);
>  	if (rport->port_state == FC_PORTSTATE_BLOCKED)
>  		return BLK_EH_RESET_TIMER;

If we are in port state marginal above, then we will try to abort
the cmd, but if while doing the abort we call fc_remote_port_delete and
fc_remote_port_add then the port state will be online when the EH
callouts complete. In this case, the port state is online in the end, but
we would fail the command like it was in marginal.

>  
> @@ -2095,7 +2099,8 @@ fc_user_scan_tgt(struct Scsi_Host *shost, uint channel, uint id, u64 lun)
>  		if (rport->scsi_target_id == -1)
>  			continue;
>  
> -		if (rport->port_state != FC_PORTSTATE_ONLINE)
> +		if ((rport->port_state != FC_PORTSTATE_ONLINE) &&
> +			(rport->port_state != FC_PORTSTATE_MARGINAL))
>  			continue;
>  
>  		if ((channel == rport->channel) &&
> @@ -2958,7 +2963,8 @@ fc_remote_port_delete(struct fc_rport  *rport)
>  
>  	spin_lock_irqsave(shost->host_lock, flags);
>  
> -	if (rport->port_state != FC_PORTSTATE_ONLINE) {
> +	if ((rport->port_state != FC_PORTSTATE_ONLINE) &&
> +		(rport->port_state != FC_PORTSTATE_MARGINAL)) {
>  		spin_unlock_irqrestore(shost->host_lock, flags);
>  		return;

It looks like if fc_remote_port_delete is called, then we will
allow that function to set the port_state to blocked. If the
problem is resolved then fc_remote_port_add will set the state
to online. So it would look like the port state is now ok in the
kernel, but would userspace still have it in the marginal port group?

Did you want this behavior or did you want it to stay in marginal
until your daemon marks it as online?

^ permalink raw reply	[flat|nested] 14+ messages in thread

* RE: [patch v4 4/5] scsi_transport_fc: Added a new rport state FC_PORTSTATE_MARGINAL
  2020-10-26 11:48   ` Ewan D. Milne
@ 2020-10-26 20:24     ` Muneendra Kumar M
  0 siblings, 0 replies; 14+ messages in thread
From: Muneendra Kumar M @ 2020-10-26 20:24 UTC (permalink / raw)
  To: Ewan D. Milne, linux-scsi, michael.christie, hare; +Cc: jsmart2021, mkumar

[-- Attachment #1: Type: text/plain, Size: 7539 bytes --]

Hi ewan,
Thanks for the input.
I will change this.

Regards,
Muneendra.

-----Original Message-----
From: Ewan D. Milne [mailto:emilne@redhat.com]
Sent: Monday, October 26, 2020 5:18 PM
To: Muneendra <muneendra.kumar@broadcom.com>; linux-scsi@vger.kernel.org;
michael.christie@oracle.com; hare@suse.de
Cc: jsmart2021@gmail.com; mkumar@redhat.com
Subject: Re: [patch v4 4/5] scsi_transport_fc: Added a new rport state
FC_PORTSTATE_MARGINAL

See below.  I think you wanted to check for FC_PORTSTATE_MARGINAL in the
code you added to fc_scsi_scan_rport().  Instead the code tests for
FC_PORTSTATE_ONLINE twice.

-Ewan

On Thu, 2020-10-22 at 18:04 +0530, Muneendra wrote:
> Added a new rport state FC_PORTSTATE_MARGINAL.
>
> Added a new inline function fc_rport_chkmarginal_set_noretries
> which will set the SCMD_NORETRIES_ABORT bit in cmd->state if rport
> state is marginal.
>
> Made changes in fc_eh_timed_out to call
> fc_rport_chkmarginal_set_noretries
> Also made changes in fc_remote_port_delete,fc_user_scan_tgt,
> fc_timeout_deleted_rport functions  to handle the new rport state
> FC_PORTSTATE_MARGINAL.
>
> Signed-off-by: Muneendra <muneendra.kumar@broadcom.com>
>
> ---
> v4:
> Made changes in fc_eh_timed_out to call
> fc_rport_chkmarginal_set_noretries
> so that SCMD_NORETRIES_ABORT bit in cmd->state is set if rport state
> is marginal.
>
> Removed the newly added scsi_cmd argument to fc_remote_port_chkready
> as the current patch handles only SCSI EH timeout/abort case.
>
> v3:
> Rearranged the patch so that all the changes with respect to new rport
> state is part of this patch.
> Added a new argument to scsi_cmd  to fc_remote_port_chkready
>
> v2:
> New patch
> ---
>  drivers/scsi/scsi_transport_fc.c | 41 +++++++++++++++++++-----------
> --
>  include/scsi/scsi_transport_fc.h | 19 +++++++++++++++
>  2 files changed, 44 insertions(+), 16 deletions(-)
>
> diff --git a/drivers/scsi/scsi_transport_fc.c
> b/drivers/scsi/scsi_transport_fc.c
> index 2ff7f06203da..fcb38068e2a4 100644
> --- a/drivers/scsi/scsi_transport_fc.c
> +++ b/drivers/scsi/scsi_transport_fc.c
> @@ -142,20 +142,23 @@ fc_enum_name_search(host_event_code,
> fc_host_event_code,
>  static struct {
>  	enum fc_port_state	value;
>  	char			*name;
> +	int			matchlen;
>  } fc_port_state_names[] = {
> -	{ FC_PORTSTATE_UNKNOWN,		"Unknown" },
> -	{ FC_PORTSTATE_NOTPRESENT,	"Not Present" },
> -	{ FC_PORTSTATE_ONLINE,		"Online" },
> -	{ FC_PORTSTATE_OFFLINE,		"Offline" },
> -	{ FC_PORTSTATE_BLOCKED,		"Blocked" },
> -	{ FC_PORTSTATE_BYPASSED,	"Bypassed" },
> -	{ FC_PORTSTATE_DIAGNOSTICS,	"Diagnostics" },
> -	{ FC_PORTSTATE_LINKDOWN,	"Linkdown" },
> -	{ FC_PORTSTATE_ERROR,		"Error" },
> -	{ FC_PORTSTATE_LOOPBACK,	"Loopback" },
> -	{ FC_PORTSTATE_DELETED,		"Deleted" },
> +	{ FC_PORTSTATE_UNKNOWN,		"Unknown", 7},
> +	{ FC_PORTSTATE_NOTPRESENT,	"Not Present", 11 },
> +	{ FC_PORTSTATE_ONLINE,		"Online", 6 },
> +	{ FC_PORTSTATE_OFFLINE,		"Offline", 7 },
> +	{ FC_PORTSTATE_BLOCKED,		"Blocked", 7 },
> +	{ FC_PORTSTATE_BYPASSED,	"Bypassed", 8 },
> +	{ FC_PORTSTATE_DIAGNOSTICS,	"Diagnostics", 11 },
> +	{ FC_PORTSTATE_LINKDOWN,	"Linkdown", 8 },
> +	{ FC_PORTSTATE_ERROR,		"Error", 5 },
> +	{ FC_PORTSTATE_LOOPBACK,	"Loopback", 8 },
> +	{ FC_PORTSTATE_DELETED,		"Deleted", 7 },
> +	{ FC_PORTSTATE_MARGINAL,	"Marginal", 8 },
>  };
>  fc_enum_name_search(port_state, fc_port_state, fc_port_state_names)
> +fc_enum_name_match(port_state, fc_port_state, fc_port_state_names)
>  #define FC_PORTSTATE_MAX_NAMELEN	20
>
>
> @@ -2071,6 +2074,7 @@ fc_eh_timed_out(struct scsi_cmnd *scmd)  {
>  	struct fc_rport *rport = starget_to_rport(scsi_target(scmd-
> >device));
>
> +	fc_rport_chkmarginal_set_noretries(rport, scmd);
>  	if (rport->port_state == FC_PORTSTATE_BLOCKED)
>  		return BLK_EH_RESET_TIMER;
>
> @@ -2095,7 +2099,8 @@ fc_user_scan_tgt(struct Scsi_Host *shost, uint
> channel, uint id, u64 lun)
>  		if (rport->scsi_target_id == -1)
>  			continue;
>
> -		if (rport->port_state != FC_PORTSTATE_ONLINE)
> +		if ((rport->port_state != FC_PORTSTATE_ONLINE) &&
> +			(rport->port_state != FC_PORTSTATE_MARGINAL))
>  			continue;
>
>  		if ((channel == rport->channel) &&
> @@ -2958,7 +2963,8 @@ fc_remote_port_delete(struct fc_rport  *rport)
>
>  	spin_lock_irqsave(shost->host_lock, flags);
>
> -	if (rport->port_state != FC_PORTSTATE_ONLINE) {
> +	if ((rport->port_state != FC_PORTSTATE_ONLINE) &&
> +		(rport->port_state != FC_PORTSTATE_MARGINAL)) {
>  		spin_unlock_irqrestore(shost->host_lock, flags);
>  		return;
>  	}
> @@ -3100,7 +3106,8 @@ fc_timeout_deleted_rport(struct work_struct
> *work)
>  	 * target, validate it still is. If not, tear down the
>  	 * scsi_target on it.
>  	 */
> -	if ((rport->port_state == FC_PORTSTATE_ONLINE) &&
> +	if (((rport->port_state == FC_PORTSTATE_ONLINE) ||
> +		(rport->port_state == FC_PORTSTATE_MARGINAL)) &&
>  	    (rport->scsi_target_id != -1) &&
>  	    !(rport->roles & FC_PORT_ROLE_FCP_TARGET)) {
>  		dev_printk(KERN_ERR, &rport->dev,
> @@ -3243,7 +3250,8 @@ fc_scsi_scan_rport(struct work_struct *work)
>  	struct fc_internal *i = to_fc_internal(shost->transportt);
>  	unsigned long flags;
>
> -	if ((rport->port_state == FC_PORTSTATE_ONLINE) &&
> +	if (((rport->port_state == FC_PORTSTATE_ONLINE) ||
> +		(rport->port_state == FC_PORTSTATE_ONLINE)) &&

I think the second line should have been FC_PORTSTATE_MARGINAL.


>  	    (rport->roles & FC_PORT_ROLE_FCP_TARGET) &&
>  	    !(i->f->disable_target_scan)) {
>  		scsi_scan_target(&rport->dev, rport->channel, @@ -3747,7 +3755,8 @@
> static blk_status_t fc_bsg_rport_prep(struct fc_rport *rport)
>  	    !(rport->flags & FC_RPORT_FAST_FAIL_TIMEDOUT))
>  		return BLK_STS_RESOURCE;
>
> -	if (rport->port_state != FC_PORTSTATE_ONLINE)
> +	if ((rport->port_state != FC_PORTSTATE_ONLINE) &&
> +		(rport->port_state != FC_PORTSTATE_MARGINAL))
>  		return BLK_STS_IOERR;
>
>  	return BLK_STS_OK;
> diff --git a/include/scsi/scsi_transport_fc.h
> b/include/scsi/scsi_transport_fc.h
> index 1c7dd35cb7a0..829bade13b89 100644
> --- a/include/scsi/scsi_transport_fc.h
> +++ b/include/scsi/scsi_transport_fc.h
> @@ -14,6 +14,7 @@
>  #include <linux/bsg-lib.h>
>  #include <asm/unaligned.h>
>  #include <scsi/scsi.h>
> +#include <scsi/scsi_cmnd.h>
>  #include <scsi/scsi_netlink.h>
>  #include <scsi/scsi_host.h>
>
> @@ -67,6 +68,7 @@ enum fc_port_state {
>  	FC_PORTSTATE_ERROR,
>  	FC_PORTSTATE_LOOPBACK,
>  	FC_PORTSTATE_DELETED,
> +	FC_PORTSTATE_MARGINAL,
>  };
>
>
> @@ -707,6 +709,22 @@ struct fc_function_template {
>  	unsigned long	disable_target_scan:1;
>  };
>
> +/**
> + * fc_rport_chkmarginal_set_noretries - Set the SCMD_NORETRIES_ABORT
> bit
> + * in cmd->state if port state is marginal
> + * @rport:	remote port to be checked
> + * @scmd:	scsi_cmd to set/clear the SCMD_NORETRIES_ABORT bit on
> Marginal state
> + **/
> +static inline void
> +fc_rport_chkmarginal_set_noretries(struct fc_rport *rport, struct
> scsi_cmnd *cmd)
> +{
> +	if ((rport->port_state == FC_PORTSTATE_MARGINAL) &&
> +		 (cmd->request->cmd_flags & REQ_FAILFAST_TRANSPORT))
> +		set_bit(SCMD_NORETRIES_ABORT, &cmd->state);
> +	else
> +		clear_bit(SCMD_NORETRIES_ABORT, &cmd->state);
> +
> +}
>
>  /**
>   * fc_remote_port_chkready - called to validate the remote port state
> @@ -723,6 +741,7 @@ fc_remote_port_chkready(struct fc_rport *rport)
>
>  	switch (rport->port_state) {
>  	case FC_PORTSTATE_ONLINE:
> +	case FC_PORTSTATE_MARGINAL:
>  		if (rport->roles & FC_PORT_ROLE_FCP_TARGET)
>  			result = 0;
>  		else if (rport->flags & FC_RPORT_DEVLOSS_PENDING)

[-- Attachment #2: S/MIME Cryptographic Signature --]
[-- Type: application/pkcs7-signature, Size: 4177 bytes --]

^ permalink raw reply	[flat|nested] 14+ messages in thread

* RE: [patch v4 2/5] scsi: Added a new error code in scsi.h
  2020-10-26 11:45   ` Ewan D. Milne
@ 2020-10-26 20:24     ` Muneendra Kumar M
  0 siblings, 0 replies; 14+ messages in thread
From: Muneendra Kumar M @ 2020-10-26 20:24 UTC (permalink / raw)
  To: Ewan D. Milne, linux-scsi, michael.christie, hare; +Cc: jsmart2021, mkumar

[-- Attachment #1: Type: text/plain, Size: 2140 bytes --]

Hi Ewan,
Thanks for the input .
I will add this change.

Regards,
Muneendra.

-----Original Message-----
From: Ewan D. Milne [mailto:emilne@redhat.com]
Sent: Monday, October 26, 2020 5:16 PM
To: Muneendra <muneendra.kumar@broadcom.com>; linux-scsi@vger.kernel.org;
michael.christie@oracle.com; hare@suse.de
Cc: jsmart2021@gmail.com; mkumar@redhat.com
Subject: Re: [patch v4 2/5] scsi: Added a new error code in scsi.h

Don't you need to add the DID_TRANSPORT_MARGINAL case to
scsi_decide_disposition() ?   DID_TRANSPORT_FAILFAST returns
SUCCESS but the default case returns ERROR.

-Ewan


On Thu, 2020-10-22 at 18:04 +0530, Muneendra wrote:
> Added a new error code DID_TRANSPORT_MARGINAL to handle marginal
> errors in scsi.h
>
> Clearing the SCMD_NORETRIES_ABORT bit in state flag before
> blk_mq_start_request
>
> Signed-off-by: Muneendra <muneendra.kumar@broadcom.com>
>
> ---
> v4:
> No change
>
> v3:
> Rearranged the patch by merging second hunk of the previous(v2)
> patch3 to this patch
>
> v2:
> Newpatch
> ---
>  drivers/scsi/scsi_lib.c | 1 +
>  include/scsi/scsi.h     | 1 +
>  2 files changed, 2 insertions(+)
>
> diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c index
> 1a2e9bab42ef..2b5dea07498e 100644
> --- a/drivers/scsi/scsi_lib.c
> +++ b/drivers/scsi/scsi_lib.c
> @@ -1660,6 +1660,7 @@ static blk_status_t scsi_queue_rq(struct
> blk_mq_hw_ctx *hctx,
>  		req->rq_flags |= RQF_DONTPREP;
>  	} else {
>  		clear_bit(SCMD_STATE_COMPLETE, &cmd->state);
> +		clear_bit(SCMD_NORETRIES_ABORT, &cmd->state);
>  	}
>
>  	cmd->flags &= SCMD_PRESERVED_FLAGS;
> diff --git a/include/scsi/scsi.h b/include/scsi/scsi.h index
> 5339baadc082..5b287ad8b727 100644
> --- a/include/scsi/scsi.h
> +++ b/include/scsi/scsi.h
> @@ -159,6 +159,7 @@ static inline int scsi_is_wlun(u64 lun)
>  				 * paths might yield different results */  #define
> DID_ALLOC_FAILURE 0x12  /* Space allocation on the device failed */
> #define DID_MEDIUM_ERROR  0x13  /* Medium error */
> +#define DID_TRANSPORT_MARGINAL 0x14 /* Transport marginal errors */
>  #define DRIVER_OK       0x00	/* Driver
> status                           */
>
>  /*

[-- Attachment #2: S/MIME Cryptographic Signature --]
[-- Type: application/pkcs7-signature, Size: 4177 bytes --]

^ permalink raw reply	[flat|nested] 14+ messages in thread

* RE: [patch v4 4/5] scsi_transport_fc: Added a new rport state FC_PORTSTATE_MARGINAL
  2020-10-26 17:14   ` Mike Christie
@ 2020-10-29 11:53     ` Muneendra Kumar M
  2020-10-29 16:20       ` Mike Christie
  0 siblings, 1 reply; 14+ messages in thread
From: Muneendra Kumar M @ 2020-10-29 11:53 UTC (permalink / raw)
  To: Mike Christie, linux-scsi, hare; +Cc: jsmart2021, emilne, mkumar

[-- Attachment #1: Type: text/plain, Size: 2404 bytes --]

Hi Mike,
Below are my replies.

-----Original Message-----
From: Mike Christie [mailto:michael.christie@oracle.com]
Sent: Monday, October 26, 2020 10:45 PM
To: Muneendra <muneendra.kumar@broadcom.com>; linux-scsi@vger.kernel.org;
hare@suse.de
Cc: jsmart2021@gmail.com; emilne@redhat.com; mkumar@redhat.com
Subject: Re: [patch v4 4/5] scsi_transport_fc: Added a new rport state
FC_PORTSTATE_MARGINAL

On 10/22/20 7:34 AM, Muneendra wrote:
> @@ -2071,6 +2074,7 @@ fc_eh_timed_out(struct scsi_cmnd *scmd)  {
>  	struct fc_rport *rport =
> starget_to_rport(scsi_target(scmd->device));
>
> +	fc_rport_chkmarginal_set_noretries(rport, scmd);
>  	if (rport->port_state == FC_PORTSTATE_BLOCKED)
>  		return BLK_EH_RESET_TIMER;

>If we are in port state marginal above, then we will try to abort the cmd,
>but if while doing the abort we call fc_remote_port_delete and
>fc_remote_port_add then the port state will be online when the EH callouts
>complete. In >this case, the port state is online in the end, but we would
>fail the command like it was in marginal.
[Muneendra] I have to  make sure the flag is set after the check for blocked
state.  If blocked, it's returning BLK_EH_RESET_TIMER, so it will restart
the eh
timer. The io will "sit out" like this, pending, until either the adapter
fails it back due to logout or io completion, or fastio fail or
rport devloss timesout and invokes the abort handler to force abort .

> +		(rport->port_state != FC_PORTSTATE_MARGINAL)) {
>  		spin_unlock_irqrestore(shost->host_lock, flags);
>  		return;

>It looks like if fc_remote_port_delete is called, then we will allow that
>function to set the port_state to blocked. If the problem is resolved then
>fc_remote_port_add will set the state to online. So it would look like the
>port state is >now ok in the kernel, but would userspace still have it in
>the marginal port group?

>Did you want this behavior or did you want it to stay in marginal until
>your daemon marks it as online?
[Muneendra] We need this behavior.User daemon
should not depend on the rport_state to move a path from marginal path
 group.It should only depends on RSCN and LINKUP events/manual
intervention. events that we look out (rscn for target-side cable  bounces
and link up/down for initiator cable bounces) will result in
port state changes - so although we don't drive one from the other, they are
correlated.

Regards,
Muneendra.

[-- Attachment #2: S/MIME Cryptographic Signature --]
[-- Type: application/pkcs7-signature, Size: 4177 bytes --]

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [patch v4 4/5] scsi_transport_fc: Added a new rport state FC_PORTSTATE_MARGINAL
  2020-10-29 11:53     ` Muneendra Kumar M
@ 2020-10-29 16:20       ` Mike Christie
  2020-10-29 16:56         ` Muneendra Kumar M
  0 siblings, 1 reply; 14+ messages in thread
From: Mike Christie @ 2020-10-29 16:20 UTC (permalink / raw)
  To: Muneendra Kumar M, linux-scsi, hare; +Cc: jsmart2021, emilne, mkumar

On 10/29/20 6:53 AM, Muneendra Kumar M wrote:
> Hi Mike,
> Below are my replies.
> 
> -----Original Message-----
> From: Mike Christie [mailto:michael.christie@oracle.com]
> Sent: Monday, October 26, 2020 10:45 PM
> To: Muneendra <muneendra.kumar@broadcom.com>; linux-scsi@vger.kernel.org;
> hare@suse.de
> Cc: jsmart2021@gmail.com; emilne@redhat.com; mkumar@redhat.com
> Subject: Re: [patch v4 4/5] scsi_transport_fc: Added a new rport state
> FC_PORTSTATE_MARGINAL
> 
> On 10/22/20 7:34 AM, Muneendra wrote:
>> @@ -2071,6 +2074,7 @@ fc_eh_timed_out(struct scsi_cmnd *scmd)  {
>>   	struct fc_rport *rport =
>> starget_to_rport(scsi_target(scmd->device));
>>
>> +	fc_rport_chkmarginal_set_noretries(rport, scmd);
>>   	if (rport->port_state == FC_PORTSTATE_BLOCKED)
>>   		return BLK_EH_RESET_TIMER;
> 
>> If we are in port state marginal above, then we will try to abort the cmd,
>> but if while doing the abort we call fc_remote_port_delete and
>> fc_remote_port_add then the port state will be online when the EH callouts
>> complete. In >this case, the port state is online in the end, but we would
>> fail the command like it was in marginal.
> [Muneendra] I have to  make sure the flag is set after the check for blocked
> state.  If blocked, it's returning BLK_EH_RESET_TIMER, so it will restart
> the eh
> timer. The io will "sit out" like this, pending, until either the adapter
> fails it back due to logout or io completion, or fastio fail or
> rport devloss timesout and invokes the abort handler to force abort .

Hey,

I'm not sure if we are talking about the same thing. If port state is 
marginal above, then we set the NORETRIES bit then return BLK_EH_DONE 
which will start up the scsi eh_abort_handler and if that fails the rest 
of the scsi eh_*_handlers.

While we are calling the eh handlers, if the driver does a 
fc_remote_port_delete then fc_remote_port_add we still have the 
NORETRIES bit set, so when we return from the eh_*_handlers we will fail 
the IO upwards.

I was trying to ask if you wanted the IO failed upwards in that case. 
Because the port state went to online, did you want the normal (cleared 
NOTRIES bit) cmd retry behavior? It sounds like below you want the 
cleared NORETRIED bit behavior, right?


> 
>> +		(rport->port_state != FC_PORTSTATE_MARGINAL)) {
>>   		spin_unlock_irqrestore(shost->host_lock, flags);
>>   		return;
> 
>> It looks like if fc_remote_port_delete is called, then we will allow that
>> function to set the port_state to blocked. If the problem is resolved then
>> fc_remote_port_add will set the state to online. So it would look like the
>> port state is >now ok in the kernel, but would userspace still have it in
>> the marginal port group?
> 
>> Did you want this behavior or did you want it to stay in marginal until
>> your daemon marks it as online?
> [Muneendra] We need this behavior.User daemon
> should not depend on the rport_state to move a path from marginal path
>   group.It should only depends on RSCN and LINKUP events/manual
> intervention. events that we look out (rscn for target-side cable  bounces
> and link up/down for initiator cable bounces) will result in
> port state changes - so although we don't drive one from the other, they are
> correlated.
> 
> Regards,
> Muneendra.
> 


^ permalink raw reply	[flat|nested] 14+ messages in thread

* RE: [patch v4 4/5] scsi_transport_fc: Added a new rport state FC_PORTSTATE_MARGINAL
  2020-10-29 16:20       ` Mike Christie
@ 2020-10-29 16:56         ` Muneendra Kumar M
  0 siblings, 0 replies; 14+ messages in thread
From: Muneendra Kumar M @ 2020-10-29 16:56 UTC (permalink / raw)
  To: Mike Christie, linux-scsi, hare; +Cc: jsmart2021, emilne, mkumar

[-- Attachment #1: Type: text/plain, Size: 2457 bytes --]

Hi Mike,

> [Muneendra] I have to  make sure the flag is set after the check for
> blocked state.  If blocked, it's returning BLK_EH_RESET_TIMER, so it
> will restart the eh timer. The io will "sit out" like this, pending,
> until either the adapter fails it back due to logout or io completion,
> or fastio fail or rport devloss timesout and invokes the abort handler
> to force abort .

>Hey,

>I'm not sure if we are talking about the same thing. If port state is
>marginal above, then we set the NORETRIES bit then return BLK_EH_DONE which
>will start up the scsi eh_abort_handler and if that fails the rest of the
>scsi >eh_*_handlers.

>While we are calling the eh handlers, if the driver does a
>fc_remote_port_delete then fc_remote_port_add we still have the NORETRIES
>bit set, so when we return from the eh_*_handlers we will fail the IO
>upwards.

>I was trying to ask if you wanted the IO failed upwards in that case.
>Because the port state went to online, did you want the normal (cleared
>NOTRIES bit) cmd retry behavior? It sounds like below you want the cleared
>NORETRIED bit behavior, right?
[Muneendra]Yes we need the normal cmd behavior(clear noretries bit) when the
portstate went to normal.
I think to achieve this we can  clear the noretries bit in fc_block_scsi_eh/
fc_block_rport .
As this is the last place where the individual abort_handler checks for
blocked state. Is this fine?

Regards,
Muneendra.

>
>> +		(rport->port_state != FC_PORTSTATE_MARGINAL)) {
>>   		spin_unlock_irqrestore(shost->host_lock, flags);
>>   		return;
>
>> It looks like if fc_remote_port_delete is called, then we will allow
>> that function to set the port_state to blocked. If the problem is
>> resolved then fc_remote_port_add will set the state to online. So it
>> would look like the port state is >now ok in the kernel, but would
>> userspace still have it in the marginal port group?
>
>> Did you want this behavior or did you want it to stay in marginal
>> until your daemon marks it as online?
> [Muneendra] We need this behavior.User daemon should not depend on the
> rport_state to move a path from marginal path
>   group.It should only depends on RSCN and LINKUP events/manual
> intervention. events that we look out (rscn for target-side cable
> bounces and link up/down for initiator cable bounces) will result in
> port state changes - so although we don't drive one from the other,
> they are correlated.
>
> Regards,
> Muneendra.
>

[-- Attachment #2: S/MIME Cryptographic Signature --]
[-- Type: application/pkcs7-signature, Size: 4177 bytes --]

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2020-10-29 16:56 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-10-22 12:34 [patch v4 0/5] scsi: Support to handle Intermittent errors Muneendra
2020-10-22 12:34 ` [patch v4 1/5] scsi: Added a new definition in scsi_cmnd.h Muneendra
2020-10-22 12:34 ` [patch v4 2/5] scsi: Added a new error code in scsi.h Muneendra
2020-10-26 11:45   ` Ewan D. Milne
2020-10-26 20:24     ` Muneendra Kumar M
2020-10-22 12:34 ` [patch v4 3/5] scsi: No retries on abort success Muneendra
2020-10-22 12:34 ` [patch v4 4/5] scsi_transport_fc: Added a new rport state FC_PORTSTATE_MARGINAL Muneendra
2020-10-26 11:48   ` Ewan D. Milne
2020-10-26 20:24     ` Muneendra Kumar M
2020-10-26 17:14   ` Mike Christie
2020-10-29 11:53     ` Muneendra Kumar M
2020-10-29 16:20       ` Mike Christie
2020-10-29 16:56         ` Muneendra Kumar M
2020-10-22 12:34 ` [patch v4 5/5] scsi_transport_fc: Added store fucntionality to set the rport port_state using sysfs Muneendra

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.