All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v3 00/17] scsi: Support to handle Intermittent errors
@ 2020-10-15  3:27 Muneendra
  2020-10-15  3:27 ` [PATCH v3 01/17] scsi: Added a new definition in scsi_cmnd.h Muneendra
                   ` (16 more replies)
  0 siblings, 17 replies; 46+ messages in thread
From: Muneendra @ 2020-10-15  3:27 UTC (permalink / raw)
  To: linux-scsi, hare; +Cc: jsmart2021, emilne, mkumar, Muneendra

[-- Attachment #1: Type: text/plain, Size: 4087 bytes --]

This patch adds a support to prevent retries of all the
io's after an abort succeeds on a particular device when transport
connectivity to the device is encountering intermittent errors.

Intermittent connectivity is a condition that can be detected by transport
fabric notifications. A service can monitor the ELS notifications and
take action on all the outstanding io's of a scsi device at that instant.

This feature is intended to be used when the device is part of a multipath
environment. When the service detects the poor connectivity, the multipath
path can be placed in a marginal path group and ignored further io
operations.

After placing a path in the marginal path group,the daemon sets the
port_state to Marginal which sets bit in scmd->state for all the
io's on that particular device with the new sysfs interface
provided in this patch.This prevent retries of all the
io's if an io hits a scsi timeout which inturn issues an abort.
On Abort succeeds on a marginal path the io will be immediately retried on
another active path.On abort fails then the things escalates to existing
target reset sg interface recovery process.

Below is the interface provided to set the port state to Marginal
and Online.
echo "Marginal" >> /sys/class/fc_remote_ports/rport-X\:Y-Z/port_state
echo "Online" >> /sys/class/fc_remote_ports/rport-X\:Y-Z/port_state


The patches were cut against  5.10/scsi-queue tree

---

v3:
Removed the port_state from starget attributes.
Enabled the store functionality for port_state under remote port
Added a new argument to scsi_cmd  to fc_remote_port_chkready
Used the existing scsi command iterators scsi_host_busy_iter.
Rearranged the patches
Added new patches to add new argument for fc_remote_port_chkready

v2:
Added new error code DID_TRANSPORT_MARGINAL to handle marginal errors.
Added a new rport_state FC_PORTSTATE_MARGINAL and also added a new
sysfs interface port_state to set the port_state to marginal.
Added the support in lpfc to handle the marginal state.


Muneendra (17):
  scsi: Added a new definition in scsi_cmnd.h
  scsi: Added a new error code in scsi.h
  scsi: No retries on abort success
  scsi: Added routine to set/clear SCMD_NORETRIES_ABORT bit for
    outstanding io on scsi_dev
  scsi_transport_fc: Added a new rport state FC_PORTSTATE_MARGINAL
  scsi_transport_fc: Added store fucntionality to set the rport
    port_state using sysfs
  scsi:lpfc: Added changes to fc_remote_port_chkready
  scsi:qla2xx: Added changes to fc_remote_port_chkready
  scsi:qedf: Added changes to fc_remote_port_chkready
  scsi:libfc: Added changes to fc_remote_port_chkready
  scsi:ibmvfc: Added changes to fc_remote_port_chkready
  scsi:fnic: Added changes to fc_remote_port_chkready
  scsi:bnx2fc: Added changes to fc_remote_port_chkready
  scsi:csio: Added changes to fc_remote_port_chkready
  scsi:bfa: Added changes to fc_remote_port_chkready
  scsi:zfcp: Added changes to fc_remote_port_chkready
  scsi:mpt: Added changes to fc_remote_port_chkready

 drivers/message/fusion/mptfc.c    |   6 +-
 drivers/s390/scsi/zfcp_scsi.c     |   2 +-
 drivers/scsi/bfa/bfad_im.c        |   4 +-
 drivers/scsi/bnx2fc/bnx2fc_els.c  |   2 +-
 drivers/scsi/bnx2fc/bnx2fc_io.c   |   2 +-
 drivers/scsi/csiostor/csio_scsi.c |   6 +-
 drivers/scsi/fnic/fnic_main.c     |   2 +-
 drivers/scsi/fnic/fnic_scsi.c     |   6 +-
 drivers/scsi/ibmvscsi/ibmvfc.c    |  28 +++++--
 drivers/scsi/libfc/fc_fcp.c       |   4 +-
 drivers/scsi/lpfc/lpfc_scsi.c     |   4 +-
 drivers/scsi/qedf/qedf_els.c      |   2 +-
 drivers/scsi/qedf/qedf_io.c       |   4 +-
 drivers/scsi/qedf/qedf_main.c     |   2 +-
 drivers/scsi/qla2xxx/qla_os.c     |   6 +-
 drivers/scsi/scsi_error.c         |  75 ++++++++++++++++++
 drivers/scsi/scsi_lib.c           |   2 +
 drivers/scsi/scsi_priv.h          |   2 +
 drivers/scsi/scsi_transport_fc.c  | 125 +++++++++++++++++++++++++-----
 include/scsi/scsi.h               |   1 +
 include/scsi/scsi_cmnd.h          |   3 +
 include/scsi/scsi_transport_fc.h  |  26 ++++++-
 22 files changed, 264 insertions(+), 50 deletions(-)

-- 
2.26.2


[-- Attachment #2: S/MIME Cryptographic Signature --]
[-- Type: application/pkcs7-signature, Size: 4177 bytes --]

^ permalink raw reply	[flat|nested] 46+ messages in thread

* [PATCH v3 01/17] scsi: Added a new definition in scsi_cmnd.h
  2020-10-15  3:27 [PATCH v3 00/17] scsi: Support to handle Intermittent errors Muneendra
@ 2020-10-15  3:27 ` Muneendra
  2020-10-15  3:27 ` [PATCH v3 02/17] scsi: Added a new error code in scsi.h Muneendra
                   ` (15 subsequent siblings)
  16 siblings, 0 replies; 46+ messages in thread
From: Muneendra @ 2020-10-15  3:27 UTC (permalink / raw)
  To: linux-scsi, hare; +Cc: jsmart2021, emilne, mkumar, Muneendra

[-- Attachment #1: Type: text/plain, Size: 792 bytes --]

Added a new definition SCMD_NORETRIES_ABORT in scsi_cmnd.h

The SCMD_NORETRIES_ABORT defines the third bit postion
in scmd->state

Signed-off-by: Muneendra <muneendra.kumar@broadcom.com>

---
v3:
No change

v2:
Modified the commit log
---
 include/scsi/scsi_cmnd.h | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/include/scsi/scsi_cmnd.h b/include/scsi/scsi_cmnd.h
index 69ade4fb71aa..8dec4ec6bd5f 100644
--- a/include/scsi/scsi_cmnd.h
+++ b/include/scsi/scsi_cmnd.h
@@ -64,6 +64,9 @@ struct scsi_pointer {
 /* for scmd->state */
 #define SCMD_STATE_COMPLETE	0
 #define SCMD_STATE_INFLIGHT	1
+#define SCMD_NORETRIES_ABORT	2 /* If this bit is set then there won't be any
+				   * retries of scmd on abort success
+				   */
 
 struct scsi_cmnd {
 	struct scsi_request req;
-- 
2.26.2


[-- Attachment #2: S/MIME Cryptographic Signature --]
[-- Type: application/pkcs7-signature, Size: 4177 bytes --]

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH v3 02/17] scsi: Added a new error code in scsi.h
  2020-10-15  3:27 [PATCH v3 00/17] scsi: Support to handle Intermittent errors Muneendra
  2020-10-15  3:27 ` [PATCH v3 01/17] scsi: Added a new definition in scsi_cmnd.h Muneendra
@ 2020-10-15  3:27 ` Muneendra
  2020-10-15  3:27 ` [PATCH v3 03/17] scsi: No retries on abort success Muneendra
                   ` (14 subsequent siblings)
  16 siblings, 0 replies; 46+ messages in thread
From: Muneendra @ 2020-10-15  3:27 UTC (permalink / raw)
  To: linux-scsi, hare; +Cc: jsmart2021, emilne, mkumar, Muneendra

[-- Attachment #1: Type: text/plain, Size: 1408 bytes --]

Added a new error code DID_TRANSPORT_MARGINAL to handle marginal
errors in scsi.h

Clearing the SCMD_NORETRIES_ABORT bit in state flag before
blk_mq_start_request

Signed-off-by: Muneendra <muneendra.kumar@broadcom.com>

---
v3:
Rearranged the patch by merging second hunk of the previous(v2)
patch3 to this patch

v2:
Newpatch
---
 drivers/scsi/scsi_lib.c | 1 +
 include/scsi/scsi.h     | 1 +
 2 files changed, 2 insertions(+)

diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index 1a2e9bab42ef..2b5dea07498e 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -1660,6 +1660,7 @@ static blk_status_t scsi_queue_rq(struct blk_mq_hw_ctx *hctx,
 		req->rq_flags |= RQF_DONTPREP;
 	} else {
 		clear_bit(SCMD_STATE_COMPLETE, &cmd->state);
+		clear_bit(SCMD_NORETRIES_ABORT, &cmd->state);
 	}
 
 	cmd->flags &= SCMD_PRESERVED_FLAGS;
diff --git a/include/scsi/scsi.h b/include/scsi/scsi.h
index 5339baadc082..5b287ad8b727 100644
--- a/include/scsi/scsi.h
+++ b/include/scsi/scsi.h
@@ -159,6 +159,7 @@ static inline int scsi_is_wlun(u64 lun)
 				 * paths might yield different results */
 #define DID_ALLOC_FAILURE 0x12  /* Space allocation on the device failed */
 #define DID_MEDIUM_ERROR  0x13  /* Medium error */
+#define DID_TRANSPORT_MARGINAL 0x14 /* Transport marginal errors */
 #define DRIVER_OK       0x00	/* Driver status                           */
 
 /*
-- 
2.26.2


[-- Attachment #2: S/MIME Cryptographic Signature --]
[-- Type: application/pkcs7-signature, Size: 4177 bytes --]

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH v3 03/17] scsi: No retries on abort success
  2020-10-15  3:27 [PATCH v3 00/17] scsi: Support to handle Intermittent errors Muneendra
  2020-10-15  3:27 ` [PATCH v3 01/17] scsi: Added a new definition in scsi_cmnd.h Muneendra
  2020-10-15  3:27 ` [PATCH v3 02/17] scsi: Added a new error code in scsi.h Muneendra
@ 2020-10-15  3:27 ` Muneendra
  2020-10-16 18:37   ` Mike Christie
  2020-10-19 19:05   ` Mike Christie
  2020-10-15  3:27 ` [PATCH v3 04/17] scsi: Added routine to set/clear SCMD_NORETRIES_ABORT bit for outstanding io on scsi_dev Muneendra
                   ` (13 subsequent siblings)
  16 siblings, 2 replies; 46+ messages in thread
From: Muneendra @ 2020-10-15  3:27 UTC (permalink / raw)
  To: linux-scsi, hare; +Cc: jsmart2021, emilne, mkumar, Muneendra

[-- Attachment #1: Type: text/plain, Size: 2031 bytes --]

Made an additional check in scsi_noretry_cmd to verify whether user has
decided not to do retries on abort success by setting the
SCMD_NORETRIES_ABORT bit

If SCMD_NORETRIES_ABORT bit is set we are making sure there won't be any
retries done on the same path and also setting the host byte as
DID_TRANSPORT_MARGINAL so that the error can be propogated as recoverable
transport error to the blk layers.

Added a code in scsi_result_to_blk_status to translate
a new error DID_TRANSPORT_MARGINAL to the corresponding blk_status_t
i.e BLK_STS_TRANSPORT

Signed-off-by: Muneendra <muneendra.kumar@broadcom.com>

---
v3:
Merged  first part of the previous patch(v2 patch3) with
this patch.

v2:
set the hostbyte as DID_TRANSPORT_MARGINAL instead of
DID_TRANSPORT_FAILFAST.
---
 drivers/scsi/scsi_error.c | 10 ++++++++++
 drivers/scsi/scsi_lib.c   |  1 +
 2 files changed, 11 insertions(+)

diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
index ae80daa5d831..aa30c1c9e9db 100644
--- a/drivers/scsi/scsi_error.c
+++ b/drivers/scsi/scsi_error.c
@@ -1763,6 +1763,16 @@ int scsi_noretry_cmd(struct scsi_cmnd *scmd)
 		return 0;
 
 check_type:
+	/*
+	 * Check whether caller has decided not to do retries on
+	 * abort success by setting the SCMD_NORETRIES_ABORT bit
+	 */
+	if ((test_bit(SCMD_NORETRIES_ABORT, &scmd->state)) &&
+		(scmd->request->cmd_flags & REQ_FAILFAST_TRANSPORT)) {
+		set_host_byte(scmd, DID_TRANSPORT_MARGINAL);
+		return 1;
+	}
+
 	/*
 	 * assume caller has checked sense and determined
 	 * the check condition was retryable.
diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index 2b5dea07498e..9606bad1542f 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -629,6 +629,7 @@ static blk_status_t scsi_result_to_blk_status(struct scsi_cmnd *cmd, int result)
 			return BLK_STS_OK;
 		return BLK_STS_IOERR;
 	case DID_TRANSPORT_FAILFAST:
+	case DID_TRANSPORT_MARGINAL:
 		return BLK_STS_TRANSPORT;
 	case DID_TARGET_FAILURE:
 		set_host_byte(cmd, DID_OK);
-- 
2.26.2


[-- Attachment #2: S/MIME Cryptographic Signature --]
[-- Type: application/pkcs7-signature, Size: 4177 bytes --]

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH v3 04/17] scsi: Added routine to set/clear SCMD_NORETRIES_ABORT bit for outstanding io on scsi_dev
  2020-10-15  3:27 [PATCH v3 00/17] scsi: Support to handle Intermittent errors Muneendra
                   ` (2 preceding siblings ...)
  2020-10-15  3:27 ` [PATCH v3 03/17] scsi: No retries on abort success Muneendra
@ 2020-10-15  3:27 ` Muneendra
  2020-10-15  3:27 ` [PATCH v3 05/17] scsi_transport_fc: Added a new rport state FC_PORTSTATE_MARGINAL Muneendra
                   ` (12 subsequent siblings)
  16 siblings, 0 replies; 46+ messages in thread
From: Muneendra @ 2020-10-15  3:27 UTC (permalink / raw)
  To: linux-scsi, hare; +Cc: jsmart2021, emilne, mkumar, Muneendra

[-- Attachment #1: Type: text/plain, Size: 3948 bytes --]

Added a new routine scsi_chg_noretries_abort_io_device().
This functions accepts two arguments Scsi_device and a bool(set).

When set is passed as 1
this routine will set SCMD_NORETRIES_ABORT bit in
scmd->state for all the io's on the scsi device associated
with remote port.

When set is passed as 0
This routine  will clear SCMD_NORETRIES_ABORT bit in
scmd->state for all the io's on the scsi device associated
with remote port.

Export the symbol so the routine can be called by scsi_transport_fc.c

Added new function declaration scsi_chg_noretries_abort_io_device in
scsi_priv.h

Signed-off-by: Muneendra <muneendra.kumar@broadcom.com>

---
v3:
Used the existing scsi command iterators scsi_host_busy_iter.
Set the SCMD_NORETRIES_ABORT to every command instead of only the inflight ios.
Modified the scsi_chg_noretries_abort_io_device argument type from int to bool

v2:
Renamed the below functions as
scsi_set_noretries_abort_io_device ->scsi_chg_noretries_abort_io_device
__scsi_set_noretries_abort_io_device->__scsi_set_noretries_abort_io_device
which accepts the value as an arg to set/clear the SCMD_NORETRIES_ABORT bit
---
 drivers/scsi/scsi_error.c | 65 +++++++++++++++++++++++++++++++++++++++
 drivers/scsi/scsi_priv.h  |  2 ++
 2 files changed, 67 insertions(+)

diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
index aa30c1c9e9db..70e70fdfb00c 100644
--- a/drivers/scsi/scsi_error.c
+++ b/drivers/scsi/scsi_error.c
@@ -279,6 +279,71 @@ void scsi_eh_scmd_add(struct scsi_cmnd *scmd)
 	call_rcu(&scmd->rcu, scsi_eh_inc_host_failed);
 }
 
+static bool
+scsi_clear_noretries_abort_io(struct scsi_cmnd *scmd, void *priv, bool reserved)
+{
+	struct scsi_device *sdev = scmd->device;
+
+	/* only clear SCMD_NORETRIES_ABORT on ios on a specific sdev */
+	if (sdev != priv)
+		return true;
+
+	/*Clear the SCMD_NORETRIES_ABORT bit*/
+	clear_bit(SCMD_NORETRIES_ABORT, &scmd->state);
+	return true;
+}
+
+static bool
+scsi_set_noretries_abort_io(struct scsi_cmnd *scmd, void *priv, bool reserved)
+{
+	struct scsi_device *sdev = scmd->device;
+
+	/* only set SCMD_NORETRIES_ABORT on ios on a specific sdev */
+	if (sdev != priv)
+		return true;
+	/* we don't want this command reissued on abort success
+	 * so set SCMD_NORETRIES_ABORT bit to ensure it
+	 * won't get reissued
+	 */
+	set_bit(SCMD_NORETRIES_ABORT, &scmd->state);
+	return true;
+}
+
+static int
+__scsi_chg_noretries_abort_io_device(struct scsi_device *sdev, bool set)
+{
+
+	if (sdev->sdev_state != SDEV_RUNNING)
+		return -EINVAL;
+
+	blk_mq_quiesce_queue(sdev->request_queue);
+	if (set)
+		scsi_host_busy_iter(sdev->host, scsi_set_noretries_abort_io, sdev);
+	else
+		scsi_host_busy_iter(sdev->host, scsi_clear_noretries_abort_io, sdev);
+
+	blk_mq_unquiesce_queue(sdev->request_queue);
+	return 0;
+}
+
+/*
+ * scsi_chg_noretries_abort_io_device - set/clear the SCMD_NORETRIES_ABORT
+ * bit for all the pending io's on a device
+ * @sdev:	scsi_device
+ */
+int
+scsi_chg_noretries_abort_io_device(struct scsi_device *sdev, bool set)
+{
+	struct Scsi_Host *shost = sdev->host;
+	int ret  = -EINVAL;
+
+	mutex_lock(&shost->scan_mutex);
+	ret = __scsi_chg_noretries_abort_io_device(sdev, set);
+	mutex_unlock(&shost->scan_mutex);
+	return ret;
+}
+EXPORT_SYMBOL(scsi_chg_noretries_abort_io_device);
+
 /**
  * scsi_times_out - Timeout function for normal scsi commands.
  * @req:	request that is timing out.
diff --git a/drivers/scsi/scsi_priv.h b/drivers/scsi/scsi_priv.h
index 180636d54982..9ccd3b716cf9 100644
--- a/drivers/scsi/scsi_priv.h
+++ b/drivers/scsi/scsi_priv.h
@@ -82,6 +82,8 @@ void scsi_eh_ready_devs(struct Scsi_Host *shost,
 int scsi_eh_get_sense(struct list_head *work_q,
 		      struct list_head *done_q);
 int scsi_noretry_cmd(struct scsi_cmnd *scmd);
+extern int scsi_chg_noretries_abort_io_device(struct scsi_device *sdev,
+			bool set);
 
 /* scsi_lib.c */
 extern int scsi_maybe_unblock_host(struct scsi_device *sdev);
-- 
2.26.2


[-- Attachment #2: S/MIME Cryptographic Signature --]
[-- Type: application/pkcs7-signature, Size: 4177 bytes --]

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH v3 05/17] scsi_transport_fc: Added a new rport state FC_PORTSTATE_MARGINAL
  2020-10-15  3:27 [PATCH v3 00/17] scsi: Support to handle Intermittent errors Muneendra
                   ` (3 preceding siblings ...)
  2020-10-15  3:27 ` [PATCH v3 04/17] scsi: Added routine to set/clear SCMD_NORETRIES_ABORT bit for outstanding io on scsi_dev Muneendra
@ 2020-10-15  3:27 ` Muneendra
  2020-10-16 19:52   ` Mike Christie
  2020-10-15  3:27 ` [PATCH v3 06/17] scsi_transport_fc: Added store fucntionality to set the rport port_state using sysfs Muneendra
                   ` (11 subsequent siblings)
  16 siblings, 1 reply; 46+ messages in thread
From: Muneendra @ 2020-10-15  3:27 UTC (permalink / raw)
  To: linux-scsi, hare; +Cc: jsmart2021, emilne, mkumar, Muneendra

[-- Attachment #1: Type: text/plain, Size: 6673 bytes --]

Added a new rport state FC_PORTSTATE_MARGINAL.

Added a new inline function fc_rport_chkmarginal_set_noretries
which will set the SCMD_NORETRIES_ABORT bit in cmd->state if rport state
is marginal.

Added a new argumet scsi_cmnd to the function fc_remote_port_chkready.
Made changes in fc_remote_port_chkready function to treat marginal and
online as same.If scsi_cmd is passed fc_rport_chkmarginal_set_noretries
will be called.

Also made changes in fc_remote_port_delete,fc_user_scan_tgt,
fc_timeout_deleted_rport functions  to handle the new rport state
FC_PORTSTATE_MARGINAL.

Signed-off-by: Muneendra <muneendra.kumar@broadcom.com>

---
v3:
Rearranged the patch so that all the changes with respect to new
rport state is part of this patch.
Added a new argument to scsi_cmd  to fc_remote_port_chkready

v2:
New patch
---
 drivers/scsi/scsi_transport_fc.c | 40 +++++++++++++++++++-------------
 include/scsi/scsi_transport_fc.h | 26 ++++++++++++++++++++-
 2 files changed, 49 insertions(+), 17 deletions(-)

diff --git a/drivers/scsi/scsi_transport_fc.c b/drivers/scsi/scsi_transport_fc.c
index 2ff7f06203da..df66a51d62e6 100644
--- a/drivers/scsi/scsi_transport_fc.c
+++ b/drivers/scsi/scsi_transport_fc.c
@@ -142,20 +142,23 @@ fc_enum_name_search(host_event_code, fc_host_event_code,
 static struct {
 	enum fc_port_state	value;
 	char			*name;
+	int			matchlen;
 } fc_port_state_names[] = {
-	{ FC_PORTSTATE_UNKNOWN,		"Unknown" },
-	{ FC_PORTSTATE_NOTPRESENT,	"Not Present" },
-	{ FC_PORTSTATE_ONLINE,		"Online" },
-	{ FC_PORTSTATE_OFFLINE,		"Offline" },
-	{ FC_PORTSTATE_BLOCKED,		"Blocked" },
-	{ FC_PORTSTATE_BYPASSED,	"Bypassed" },
-	{ FC_PORTSTATE_DIAGNOSTICS,	"Diagnostics" },
-	{ FC_PORTSTATE_LINKDOWN,	"Linkdown" },
-	{ FC_PORTSTATE_ERROR,		"Error" },
-	{ FC_PORTSTATE_LOOPBACK,	"Loopback" },
-	{ FC_PORTSTATE_DELETED,		"Deleted" },
+	{ FC_PORTSTATE_UNKNOWN,		"Unknown", 7},
+	{ FC_PORTSTATE_NOTPRESENT,	"Not Present", 11 },
+	{ FC_PORTSTATE_ONLINE,		"Online", 6 },
+	{ FC_PORTSTATE_OFFLINE,		"Offline", 7 },
+	{ FC_PORTSTATE_BLOCKED,		"Blocked", 7 },
+	{ FC_PORTSTATE_BYPASSED,	"Bypassed", 8 },
+	{ FC_PORTSTATE_DIAGNOSTICS,	"Diagnostics", 11 },
+	{ FC_PORTSTATE_LINKDOWN,	"Linkdown", 8 },
+	{ FC_PORTSTATE_ERROR,		"Error", 5 },
+	{ FC_PORTSTATE_LOOPBACK,	"Loopback", 8 },
+	{ FC_PORTSTATE_DELETED,		"Deleted", 7 },
+	{ FC_PORTSTATE_MARGINAL,	"Marginal", 8 },
 };
 fc_enum_name_search(port_state, fc_port_state, fc_port_state_names)
+fc_enum_name_match(port_state, fc_port_state, fc_port_state_names)
 #define FC_PORTSTATE_MAX_NAMELEN	20
 
 
@@ -2095,7 +2098,8 @@ fc_user_scan_tgt(struct Scsi_Host *shost, uint channel, uint id, u64 lun)
 		if (rport->scsi_target_id == -1)
 			continue;
 
-		if (rport->port_state != FC_PORTSTATE_ONLINE)
+		if ((rport->port_state != FC_PORTSTATE_ONLINE) &&
+			(rport->port_state != FC_PORTSTATE_MARGINAL))
 			continue;
 
 		if ((channel == rport->channel) &&
@@ -2958,7 +2962,8 @@ fc_remote_port_delete(struct fc_rport  *rport)
 
 	spin_lock_irqsave(shost->host_lock, flags);
 
-	if (rport->port_state != FC_PORTSTATE_ONLINE) {
+	if ((rport->port_state != FC_PORTSTATE_ONLINE) &&
+		(rport->port_state != FC_PORTSTATE_MARGINAL)) {
 		spin_unlock_irqrestore(shost->host_lock, flags);
 		return;
 	}
@@ -3100,7 +3105,8 @@ fc_timeout_deleted_rport(struct work_struct *work)
 	 * target, validate it still is. If not, tear down the
 	 * scsi_target on it.
 	 */
-	if ((rport->port_state == FC_PORTSTATE_ONLINE) &&
+	if (((rport->port_state == FC_PORTSTATE_ONLINE) ||
+		(rport->port_state == FC_PORTSTATE_MARGINAL)) &&
 	    (rport->scsi_target_id != -1) &&
 	    !(rport->roles & FC_PORT_ROLE_FCP_TARGET)) {
 		dev_printk(KERN_ERR, &rport->dev,
@@ -3243,7 +3249,8 @@ fc_scsi_scan_rport(struct work_struct *work)
 	struct fc_internal *i = to_fc_internal(shost->transportt);
 	unsigned long flags;
 
-	if ((rport->port_state == FC_PORTSTATE_ONLINE) &&
+	if (((rport->port_state == FC_PORTSTATE_ONLINE) ||
+		(rport->port_state == FC_PORTSTATE_ONLINE)) &&
 	    (rport->roles & FC_PORT_ROLE_FCP_TARGET) &&
 	    !(i->f->disable_target_scan)) {
 		scsi_scan_target(&rport->dev, rport->channel,
@@ -3747,7 +3754,8 @@ static blk_status_t fc_bsg_rport_prep(struct fc_rport *rport)
 	    !(rport->flags & FC_RPORT_FAST_FAIL_TIMEDOUT))
 		return BLK_STS_RESOURCE;
 
-	if (rport->port_state != FC_PORTSTATE_ONLINE)
+	if ((rport->port_state != FC_PORTSTATE_ONLINE) &&
+		(rport->port_state != FC_PORTSTATE_MARGINAL))
 		return BLK_STS_IOERR;
 
 	return BLK_STS_OK;
diff --git a/include/scsi/scsi_transport_fc.h b/include/scsi/scsi_transport_fc.h
index 1c7dd35cb7a0..7d010324c1e3 100644
--- a/include/scsi/scsi_transport_fc.h
+++ b/include/scsi/scsi_transport_fc.h
@@ -14,6 +14,7 @@
 #include <linux/bsg-lib.h>
 #include <asm/unaligned.h>
 #include <scsi/scsi.h>
+#include <scsi/scsi_cmnd.h>
 #include <scsi/scsi_netlink.h>
 #include <scsi/scsi_host.h>
 
@@ -67,6 +68,7 @@ enum fc_port_state {
 	FC_PORTSTATE_ERROR,
 	FC_PORTSTATE_LOOPBACK,
 	FC_PORTSTATE_DELETED,
+	FC_PORTSTATE_MARGINAL,
 };
 
 
@@ -707,6 +709,23 @@ struct fc_function_template {
 	unsigned long	disable_target_scan:1;
 };
 
+/**
+ * fc_rport_chkmarginal_set_noretries - Set the SCMD_NORETRIES_ABORT bit
+ * in cmd->state if port state is marginal prior to initiating
+ * io to the port.
+ * @rport:	remote port to be checked
+ * @scmd:	scsi_cmd to set/clear the SCMD_NORETRIES_ABORT bit on Marginal state
+ **/
+static inline void
+fc_rport_chkmarginal_set_noretries(struct fc_rport *rport, struct scsi_cmnd *cmd)
+{
+	if ((rport->port_state == FC_PORTSTATE_MARGINAL) &&
+		 (cmd->request->cmd_flags & REQ_FAILFAST_TRANSPORT))
+		set_bit(SCMD_NORETRIES_ABORT, &cmd->state);
+	else
+		clear_bit(SCMD_NORETRIES_ABORT, &cmd->state);
+
+}
 
 /**
  * fc_remote_port_chkready - called to validate the remote port state
@@ -715,20 +734,25 @@ struct fc_function_template {
  * Returns a scsi result code that can be returned by the LLDD.
  *
  * @rport:	remote port to be checked
+ * @cmd:	scsi_cmd to set/clear the SCMD_NORETRIES_ABORT bit on Marginal state
  **/
 static inline int
-fc_remote_port_chkready(struct fc_rport *rport)
+fc_remote_port_chkready(struct fc_rport *rport, struct scsi_cmnd *cmd)
 {
 	int result;
 
 	switch (rport->port_state) {
 	case FC_PORTSTATE_ONLINE:
+	case FC_PORTSTATE_MARGINAL:
 		if (rport->roles & FC_PORT_ROLE_FCP_TARGET)
 			result = 0;
 		else if (rport->flags & FC_RPORT_DEVLOSS_PENDING)
 			result = DID_IMM_RETRY << 16;
 		else
 			result = DID_NO_CONNECT << 16;
+
+		if (cmd)
+			fc_rport_chkmarginal_set_noretries(rport, cmd);
 		break;
 	case FC_PORTSTATE_BLOCKED:
 		if (rport->flags & FC_RPORT_FAST_FAIL_TIMEDOUT)
-- 
2.26.2


[-- Attachment #2: S/MIME Cryptographic Signature --]
[-- Type: application/pkcs7-signature, Size: 4177 bytes --]

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH v3 06/17] scsi_transport_fc: Added store fucntionality to set the rport port_state using sysfs
  2020-10-15  3:27 [PATCH v3 00/17] scsi: Support to handle Intermittent errors Muneendra
                   ` (4 preceding siblings ...)
  2020-10-15  3:27 ` [PATCH v3 05/17] scsi_transport_fc: Added a new rport state FC_PORTSTATE_MARGINAL Muneendra
@ 2020-10-15  3:27 ` Muneendra
  2020-10-15 14:05     ` kernel test robot
  2020-10-16 18:34   ` Mike Christie
  2020-10-15  3:27 ` [PATCH v3 07/17] scsi:lpfc: Added changes to fc_remote_port_chkready Muneendra
                   ` (10 subsequent siblings)
  16 siblings, 2 replies; 46+ messages in thread
From: Muneendra @ 2020-10-15  3:27 UTC (permalink / raw)
  To: linux-scsi, hare; +Cc: jsmart2021, emilne, mkumar, Muneendra

[-- Attachment #1: Type: text/plain, Size: 4469 bytes --]

Added a store functionality to set rport port_state using sysfs
under  fc_remote_ports/rport-*/port_state

With this functionality the user can move the port_state from
Marginal->Online and Online->Marginal.

On Marginal :This interface will set SCMD_NORETRIES_ABORT bit in
scmd->state for all the io's on the scsi device associated
with remote port.

On Online :This interface will clear SCMD_NORETRIES_ABORT bit in
scmd->state for all the pending io's on the scsi device associated
with remote port.

Below is the interface provided to set the port state to Marginal
and Online.

echo "Marginal" >> /sys/class/fc_remote_ports/rport-X\:Y-Z/port_state
echo "Online" >> /sys/class/fc_remote_ports/rport-X\:Y-Z/port_state

Signed-off-by: Muneendra <muneendra.kumar@broadcom.com>

---
v3:
Removed the port_state from starget attributes.
Enabled the store functionality for port_state under remote port.
used the starget_for_each_device to traverse around all the devices
under rport

v2:
Changed from a noretries_abort attribute under fc_transport/target*/ to
port_state for changing the port_state to a marginal state
---
 drivers/scsi/scsi_transport_fc.c | 85 +++++++++++++++++++++++++++++++-
 1 file changed, 83 insertions(+), 2 deletions(-)

diff --git a/drivers/scsi/scsi_transport_fc.c b/drivers/scsi/scsi_transport_fc.c
index df66a51d62e6..ac5283b645a6 100644
--- a/drivers/scsi/scsi_transport_fc.c
+++ b/drivers/scsi/scsi_transport_fc.c
@@ -943,7 +943,88 @@ show_fc_rport_roles (struct device *dev, struct device_attribute *attr,
 static FC_DEVICE_ATTR(rport, roles, S_IRUGO,
 		show_fc_rport_roles, NULL);
 
-fc_private_rport_rd_enum_attr(port_state, FC_PORTSTATE_MAX_NAMELEN);
+static void
+device_chg_noretries_abort(struct scsi_device *sdev, void *data)
+{
+	scsi_chg_noretries_abort_io_device(sdev, *(bool *)data);
+}
+
+static int
+target_chg_noretries_abort(struct device *dev, void *data)
+{
+	if (scsi_is_target_device(dev))
+		starget_for_each_device(to_scsi_target(dev), data,
+					device_chg_noretries_abort);
+	return 0;
+}
+
+static void scsi_target_chg_noretries_abort(struct device *dev, bool set)
+{
+	if (scsi_is_target_device(dev))
+		starget_for_each_device(to_scsi_target(dev), &set,
+					device_chg_noretries_abort);
+	else
+		device_for_each_child(dev, &set, target_chg_noretries_abort);
+}
+
+/*
+ * Sets port_state to Marginal/Online.
+ * On Marginal it Sets  no retries on abort in scmd->state for all
+ * outstanding io of all the scsi_devs
+ * This only allows ONLINE->MARGINAL and MARGINAL->ONLINE
+ */
+static ssize_t fc_rport_set_marginal_state(struct device *dev,
+						struct device_attribute *attr,
+						const char *buf, size_t count)
+{
+	struct fc_rport *rport = transport_class_to_rport(dev);
+	enum fc_port_state port_state;
+	int ret = 0;
+
+	ret = get_fc_port_state_match(buf, &port_state);
+
+	if (port_state == FC_PORTSTATE_MARGINAL) {
+		/*
+		 * Change the state to marginal only if the
+		 * current rport state is Online
+		 * Allow only Online->marginal
+		 */
+		if (rport->port_state == FC_PORTSTATE_ONLINE) {
+			rport->port_state = port_state;
+			scsi_target_chg_noretries_abort(&rport->dev, 1);
+		}
+	} else if (port_state == FC_PORTSTATE_ONLINE) {
+		/*
+		 * Change the state to Online only if the
+		 * current rport state is Marginal
+		 * Allow only  MArginal->Online
+		 */
+		if (rport->port_state == FC_PORTSTATE_MARGINAL) {
+			rport->port_state = port_state;
+			scsi_target_chg_noretries_abort(&rport->dev, 0);
+		}
+	} else
+		return -EINVAL;
+	return count;
+}
+
+static ssize_t
+show_fc_rport_port_state(struct device *dev,
+				struct device_attribute *attr, char *buf)
+{
+	const char *name;
+	struct fc_rport *rport = transport_class_to_rport(dev);
+
+	name = get_fc_port_state_name(rport->port_state);
+	if (!name)
+		return -EINVAL;
+
+	return snprintf(buf, 20, "%s\n", name);
+}
+
+static FC_DEVICE_ATTR(rport, port_state, 0444 | 0200,
+			show_fc_rport_port_state, fc_rport_set_marginal_state);
+
 fc_private_rport_rd_attr(scsi_target_id, "%d\n", 20);
 
 /*
@@ -2266,7 +2347,7 @@ fc_attach_transport(struct fc_function_template *ft)
 	SETUP_PRIVATE_RPORT_ATTRIBUTE_RD(port_name);
 	SETUP_PRIVATE_RPORT_ATTRIBUTE_RD(port_id);
 	SETUP_PRIVATE_RPORT_ATTRIBUTE_RD(roles);
-	SETUP_PRIVATE_RPORT_ATTRIBUTE_RD(port_state);
+	SETUP_PRIVATE_RPORT_ATTRIBUTE_RW(port_state);
 	SETUP_PRIVATE_RPORT_ATTRIBUTE_RD(scsi_target_id);
 	SETUP_PRIVATE_RPORT_ATTRIBUTE_RW(fast_io_fail_tmo);
 
-- 
2.26.2


[-- Attachment #2: S/MIME Cryptographic Signature --]
[-- Type: application/pkcs7-signature, Size: 4177 bytes --]

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH v3 07/17] scsi:lpfc: Added changes to fc_remote_port_chkready
  2020-10-15  3:27 [PATCH v3 00/17] scsi: Support to handle Intermittent errors Muneendra
                   ` (5 preceding siblings ...)
  2020-10-15  3:27 ` [PATCH v3 06/17] scsi_transport_fc: Added store fucntionality to set the rport port_state using sysfs Muneendra
@ 2020-10-15  3:27 ` Muneendra
  2020-10-15  3:27 ` [PATCH v3 08/17] scsi:qla2xx: " Muneendra
                   ` (9 subsequent siblings)
  16 siblings, 0 replies; 46+ messages in thread
From: Muneendra @ 2020-10-15  3:27 UTC (permalink / raw)
  To: linux-scsi, hare; +Cc: jsmart2021, emilne, mkumar, Muneendra

[-- Attachment #1: Type: text/plain, Size: 1226 bytes --]

Added changes to pass a new argument to fc_remote_port_chkready

Signed-off-by: Muneendra <muneendra.kumar@broadcom.com>

---
v3:
Removed calling fc_rport_chkmarginal_set_noretries to set the
SCMD_NORETRIES_ABORT bit.

Modified  the fc_remote_port_chkready fucntion by passing
new argument scsi_cmd which internally checks and sets the same.

v2:
New Patch
---
 drivers/scsi/lpfc/lpfc_scsi.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/scsi/lpfc/lpfc_scsi.c b/drivers/scsi/lpfc/lpfc_scsi.c
index 5e802c8b22a9..35231dc30a99 100644
--- a/drivers/scsi/lpfc/lpfc_scsi.c
+++ b/drivers/scsi/lpfc/lpfc_scsi.c
@@ -4521,7 +4521,7 @@ lpfc_queuecommand(struct Scsi_Host *shost, struct scsi_cmnd *cmnd)
 	if (unlikely(!rdata) || unlikely(!rport))
 		goto out_fail_command;
 
-	err = fc_remote_port_chkready(rport);
+	err = fc_remote_port_chkready(rport, cmnd);
 	if (err) {
 		cmnd->result = err;
 		goto out_fail_command;
@@ -5519,7 +5519,7 @@ lpfc_slave_alloc(struct scsi_device *sdev)
 	unsigned long flags;
 	struct lpfc_name target_wwpn;
 
-	if (!rport || fc_remote_port_chkready(rport))
+	if (!rport || fc_remote_port_chkready(rport, NULL))
 		return -ENXIO;
 
 	if (phba->cfg_fof) {
-- 
2.26.2


[-- Attachment #2: S/MIME Cryptographic Signature --]
[-- Type: application/pkcs7-signature, Size: 4177 bytes --]

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH v3 08/17] scsi:qla2xx: Added changes to fc_remote_port_chkready
  2020-10-15  3:27 [PATCH v3 00/17] scsi: Support to handle Intermittent errors Muneendra
                   ` (6 preceding siblings ...)
  2020-10-15  3:27 ` [PATCH v3 07/17] scsi:lpfc: Added changes to fc_remote_port_chkready Muneendra
@ 2020-10-15  3:27 ` Muneendra
  2020-10-15  3:27 ` [PATCH v3 09/17] scsi:qedf: " Muneendra
                   ` (8 subsequent siblings)
  16 siblings, 0 replies; 46+ messages in thread
From: Muneendra @ 2020-10-15  3:27 UTC (permalink / raw)
  To: linux-scsi, hare; +Cc: jsmart2021, emilne, mkumar, Muneendra

[-- Attachment #1: Type: text/plain, Size: 1394 bytes --]

Added changes to pass a new argument to fc_remote_port_chkready

Signed-off-by: Muneendra <muneendra.kumar@broadcom.com>

---

v3:
New Patch
---
 drivers/scsi/qla2xxx/qla_os.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/scsi/qla2xxx/qla_os.c b/drivers/scsi/qla2xxx/qla_os.c
index 2bb015b58609..f2be1667cc37 100644
--- a/drivers/scsi/qla2xxx/qla_os.c
+++ b/drivers/scsi/qla2xxx/qla_os.c
@@ -867,7 +867,7 @@ qla2xxx_queuecommand(struct Scsi_Host *host, struct scsi_cmnd *cmd)
 		goto qc24_fail_command;
 	}
 
-	rval = fc_remote_port_chkready(rport);
+	rval = fc_remote_port_chkready(rport, cmd);
 	if (rval) {
 		cmd->result = rval;
 		ql_dbg(ql_dbg_io + ql_dbg_verbose, vha, 0x3003,
@@ -958,7 +958,7 @@ qla2xxx_mqueuecommand(struct Scsi_Host *host, struct scsi_cmnd *cmd,
 	srb_t *sp;
 	int rval;
 
-	rval = rport ? fc_remote_port_chkready(rport) : FC_PORTSTATE_OFFLINE;
+	rval = rport ? fc_remote_port_chkready(rport, cmd) : FC_PORTSTATE_OFFLINE;
 	if (rval) {
 		cmd->result = rval;
 		ql_dbg(ql_dbg_io + ql_dbg_verbose, vha, 0x3076,
@@ -1847,7 +1847,7 @@ qla2xxx_slave_alloc(struct scsi_device *sdev)
 {
 	struct fc_rport *rport = starget_to_rport(scsi_target(sdev));
 
-	if (!rport || fc_remote_port_chkready(rport))
+	if (!rport || fc_remote_port_chkready(rport, NULL))
 		return -ENXIO;
 
 	sdev->hostdata = *(fc_port_t **)rport->dd_data;
-- 
2.26.2


[-- Attachment #2: S/MIME Cryptographic Signature --]
[-- Type: application/pkcs7-signature, Size: 4177 bytes --]

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH v3 09/17] scsi:qedf: Added changes to fc_remote_port_chkready
  2020-10-15  3:27 [PATCH v3 00/17] scsi: Support to handle Intermittent errors Muneendra
                   ` (7 preceding siblings ...)
  2020-10-15  3:27 ` [PATCH v3 08/17] scsi:qla2xx: " Muneendra
@ 2020-10-15  3:27 ` Muneendra
  2020-10-15  3:27 ` [PATCH v3 10/17] scsi:libfc: " Muneendra
                   ` (7 subsequent siblings)
  16 siblings, 0 replies; 46+ messages in thread
From: Muneendra @ 2020-10-15  3:27 UTC (permalink / raw)
  To: linux-scsi, hare; +Cc: jsmart2021, emilne, mkumar, Muneendra

[-- Attachment #1: Type: text/plain, Size: 2198 bytes --]

Added changes to pass a new argument to fc_remote_port_chkready

Signed-off-by: Muneendra <muneendra.kumar@broadcom.com>

---
v3:
New Patch
---
 drivers/scsi/qedf/qedf_els.c  | 2 +-
 drivers/scsi/qedf/qedf_io.c   | 4 ++--
 drivers/scsi/qedf/qedf_main.c | 2 +-
 3 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/scsi/qedf/qedf_els.c b/drivers/scsi/qedf/qedf_els.c
index 625e58ccb8c8..278921555def 100644
--- a/drivers/scsi/qedf/qedf_els.c
+++ b/drivers/scsi/qedf/qedf_els.c
@@ -35,7 +35,7 @@ static int qedf_initiate_els(struct qedf_rport *fcport, unsigned int op,
 
 	QEDF_INFO(&(qedf->dbg_ctx), QEDF_LOG_ELS, "Sending ELS\n");
 
-	rc = fc_remote_port_chkready(fcport->rport);
+	rc = fc_remote_port_chkready(fcport->rport, NULL);
 	if (rc) {
 		QEDF_ERR(&(qedf->dbg_ctx), "els 0x%x: rport not ready\n", op);
 		rc = -EAGAIN;
diff --git a/drivers/scsi/qedf/qedf_io.c b/drivers/scsi/qedf/qedf_io.c
index 4869ef813dc4..0673b28ede2b 100644
--- a/drivers/scsi/qedf/qedf_io.c
+++ b/drivers/scsi/qedf/qedf_io.c
@@ -975,7 +975,7 @@ qedf_queuecommand(struct Scsi_Host *host, struct scsi_cmnd *sc_cmd)
 		return 0;
 	}
 
-	rval = fc_remote_port_chkready(rport);
+	rval = fc_remote_port_chkready(rport, sc_cmd);
 	if (rval) {
 		QEDF_INFO(&qedf->dbg_ctx, QEDF_LOG_IO,
 			  "fc_remote_port_chkready failed=0x%x for port_id=0x%06x.\n",
@@ -2438,7 +2438,7 @@ int qedf_initiate_tmf(struct scsi_cmnd *sc_cmd, u8 tm_flags)
 			 io_req, io_req->xid, ref_cnt);
 	}
 
-	rval = fc_remote_port_chkready(rport);
+	rval = fc_remote_port_chkready(rport, sc_cmd);
 	if (rval) {
 		QEDF_ERR(NULL, "device_reset rport not ready\n");
 		rc = FAILED;
diff --git a/drivers/scsi/qedf/qedf_main.c b/drivers/scsi/qedf/qedf_main.c
index 46d185cb9ea8..59990400be08 100644
--- a/drivers/scsi/qedf/qedf_main.c
+++ b/drivers/scsi/qedf/qedf_main.c
@@ -761,7 +761,7 @@ static int qedf_eh_abort(struct scsi_cmnd *sc_cmd)
 		goto drop_rdata_kref;
 	}
 
-	if (fc_remote_port_chkready(rport)) {
+	if (fc_remote_port_chkready(rport, sc_cmd)) {
 		refcount = kref_read(&io_req->refcount);
 		QEDF_ERR(&qedf->dbg_ctx,
 			 "rport not ready, io_req=%p, xid=0x%x sc_cmd=%p op=0x%02x, refcount=%d, port_id=%06x\n",
-- 
2.26.2


[-- Attachment #2: S/MIME Cryptographic Signature --]
[-- Type: application/pkcs7-signature, Size: 4177 bytes --]

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH v3 10/17] scsi:libfc: Added changes to fc_remote_port_chkready
  2020-10-15  3:27 [PATCH v3 00/17] scsi: Support to handle Intermittent errors Muneendra
                   ` (8 preceding siblings ...)
  2020-10-15  3:27 ` [PATCH v3 09/17] scsi:qedf: " Muneendra
@ 2020-10-15  3:27 ` Muneendra
  2020-10-15  3:27 ` [PATCH v3 11/17] scsi:ibmvfc: " Muneendra
                   ` (6 subsequent siblings)
  16 siblings, 0 replies; 46+ messages in thread
From: Muneendra @ 2020-10-15  3:27 UTC (permalink / raw)
  To: linux-scsi, hare; +Cc: jsmart2021, emilne, mkumar, Muneendra

[-- Attachment #1: Type: text/plain, Size: 1034 bytes --]

Added changes to pass a new argument to fc_remote_port_chkready

Signed-off-by: Muneendra <muneendra.kumar@broadcom.com>

---
v3:
New Patch
---
 drivers/scsi/libfc/fc_fcp.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/scsi/libfc/fc_fcp.c b/drivers/scsi/libfc/fc_fcp.c
index e11d4f002bd4..5b7a85dbc6c1 100644
--- a/drivers/scsi/libfc/fc_fcp.c
+++ b/drivers/scsi/libfc/fc_fcp.c
@@ -1867,7 +1867,7 @@ int fc_queuecommand(struct Scsi_Host *shost, struct scsi_cmnd *sc_cmd)
 	int rc = 0;
 	struct fc_stats *stats;
 
-	rval = fc_remote_port_chkready(rport);
+	rval = fc_remote_port_chkready(rport, sc_cmd);
 	if (rval) {
 		sc_cmd->result = rval;
 		sc_cmd->scsi_done(sc_cmd);
@@ -2239,7 +2239,7 @@ int fc_slave_alloc(struct scsi_device *sdev)
 {
 	struct fc_rport *rport = starget_to_rport(scsi_target(sdev));
 
-	if (!rport || fc_remote_port_chkready(rport))
+	if (!rport || fc_remote_port_chkready(rport, NULL))
 		return -ENXIO;
 
 	scsi_change_queue_depth(sdev, FC_FCP_DFLT_QUEUE_DEPTH);
-- 
2.26.2


[-- Attachment #2: S/MIME Cryptographic Signature --]
[-- Type: application/pkcs7-signature, Size: 4177 bytes --]

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH v3 11/17] scsi:ibmvfc: Added changes to fc_remote_port_chkready
  2020-10-15  3:27 [PATCH v3 00/17] scsi: Support to handle Intermittent errors Muneendra
                   ` (9 preceding siblings ...)
  2020-10-15  3:27 ` [PATCH v3 10/17] scsi:libfc: " Muneendra
@ 2020-10-15  3:27 ` Muneendra
  2020-10-15  3:27 ` [PATCH v3 12/17] scsi:fnic: " Muneendra
                   ` (5 subsequent siblings)
  16 siblings, 0 replies; 46+ messages in thread
From: Muneendra @ 2020-10-15  3:27 UTC (permalink / raw)
  To: linux-scsi, hare; +Cc: jsmart2021, emilne, mkumar, Muneendra

[-- Attachment #1: Type: text/plain, Size: 1958 bytes --]

Added changes to pass a new argument to fc_remote_port_chkready

Also fixed the checkpatch error
"do not use assignment in if condition"

Signed-off-by: Muneendra <muneendra.kumar@broadcom.com>

---
v3:
New Patch
---
 drivers/scsi/ibmvscsi/ibmvfc.c | 28 +++++++++++++++++++++++-----
 1 file changed, 23 insertions(+), 5 deletions(-)

diff --git a/drivers/scsi/ibmvscsi/ibmvfc.c b/drivers/scsi/ibmvscsi/ibmvfc.c
index e09e0310b4c8..0c9334e8f036 100644
--- a/drivers/scsi/ibmvscsi/ibmvfc.c
+++ b/drivers/scsi/ibmvscsi/ibmvfc.c
@@ -1663,8 +1663,15 @@ static int ibmvfc_queuecommand_lck(struct scsi_cmnd *cmnd,
 	struct ibmvfc_event *evt;
 	int rc;
 
-	if (unlikely((rc = fc_remote_port_chkready(rport))) ||
-	    unlikely((rc = ibmvfc_host_chkready(vhost)))) {
+	rc = fc_remote_port_chkready(rport, cmnd);
+	if (unlikely(rc)) {
+		cmnd->result = rc;
+		done(cmnd);
+		return 0;
+	}
+
+	rc = ibmvfc_host_chkready(vhost);
+	if (unlikely(rc)) {
 		cmnd->result = rc;
 		done(cmnd);
 		return 0;
@@ -1934,8 +1941,19 @@ static int ibmvfc_bsg_request(struct bsg_job *job)
 
 	spin_lock_irqsave(vhost->host->host_lock, flags);
 
-	if (unlikely(rc || (rport && (rc = fc_remote_port_chkready(rport)))) ||
-	    unlikely((rc = ibmvfc_host_chkready(vhost)))) {
+	if (unlikely(rc)) {
+		spin_unlock_irqrestore(vhost->host->host_lock, flags);
+		goto out;
+	}
+
+	rc = fc_remote_port_chkready(rport, NULL);
+	if (rport && rc) {
+		spin_unlock_irqrestore(vhost->host->host_lock, flags);
+		goto out;
+	}
+
+	rc = ibmvfc_host_chkready(vhost);
+	if (unlikely(rc)) {
 		spin_unlock_irqrestore(vhost->host->host_lock, flags);
 		goto out;
 	}
@@ -2910,7 +2928,7 @@ static int ibmvfc_slave_alloc(struct scsi_device *sdev)
 	struct ibmvfc_host *vhost = shost_priv(shost);
 	unsigned long flags = 0;
 
-	if (!rport || fc_remote_port_chkready(rport))
+	if (!rport || fc_remote_port_chkready(rport, NULL))
 		return -ENXIO;
 
 	spin_lock_irqsave(shost->host_lock, flags);
-- 
2.26.2


[-- Attachment #2: S/MIME Cryptographic Signature --]
[-- Type: application/pkcs7-signature, Size: 4177 bytes --]

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH v3 12/17] scsi:fnic: Added changes to fc_remote_port_chkready
  2020-10-15  3:27 [PATCH v3 00/17] scsi: Support to handle Intermittent errors Muneendra
                   ` (10 preceding siblings ...)
  2020-10-15  3:27 ` [PATCH v3 11/17] scsi:ibmvfc: " Muneendra
@ 2020-10-15  3:27 ` Muneendra
  2020-10-15  3:27 ` [PATCH v3 13/17] scsi:bnx2fc: " Muneendra
                   ` (4 subsequent siblings)
  16 siblings, 0 replies; 46+ messages in thread
From: Muneendra @ 2020-10-15  3:27 UTC (permalink / raw)
  To: linux-scsi, hare; +Cc: jsmart2021, emilne, mkumar, Muneendra

[-- Attachment #1: Type: text/plain, Size: 1969 bytes --]

Added changes to pass a new argument to fc_remote_port_chkready

Signed-off-by: Muneendra <muneendra.kumar@broadcom.com>

---
v3:
New Patch
---
 drivers/scsi/fnic/fnic_main.c | 2 +-
 drivers/scsi/fnic/fnic_scsi.c | 6 +++---
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/scsi/fnic/fnic_main.c b/drivers/scsi/fnic/fnic_main.c
index 5f8a7ef8f6a8..f312b4be2846 100644
--- a/drivers/scsi/fnic/fnic_main.c
+++ b/drivers/scsi/fnic/fnic_main.c
@@ -100,7 +100,7 @@ static int fnic_slave_alloc(struct scsi_device *sdev)
 {
 	struct fc_rport *rport = starget_to_rport(scsi_target(sdev));
 
-	if (!rport || fc_remote_port_chkready(rport))
+	if (!rport || fc_remote_port_chkready(rport, NULL))
 		return -ENXIO;
 
 	scsi_change_queue_depth(sdev, fnic_max_qdepth);
diff --git a/drivers/scsi/fnic/fnic_scsi.c b/drivers/scsi/fnic/fnic_scsi.c
index d1f7b84bbfe8..943b2bf7aaa4 100644
--- a/drivers/scsi/fnic/fnic_scsi.c
+++ b/drivers/scsi/fnic/fnic_scsi.c
@@ -452,7 +452,7 @@ static int fnic_queuecommand_lck(struct scsi_cmnd *sc, void (*done)(struct scsi_
 		return 0;
 	}
 
-	ret = fc_remote_port_chkready(rport);
+	ret = fc_remote_port_chkready(rport, sc);
 	if (ret) {
 		FNIC_SCSI_DBG(KERN_DEBUG, fnic->lport->host,
 				"rport is not ready\n");
@@ -1938,7 +1938,7 @@ int fnic_abort_cmd(struct scsi_cmnd *sc)
 	 * port is up, then send abts to the remote port to terminate
 	 * the IO. Else, just locally terminate the IO in the firmware
 	 */
-	if (fc_remote_port_chkready(rport) == 0)
+	if (fc_remote_port_chkready(rport, sc) == 0)
 		task_req = FCPIO_ITMF_ABT_TASK;
 	else {
 		atomic64_inc(&fnic_stats->misc_stats.rport_not_ready);
@@ -2364,7 +2364,7 @@ int fnic_device_reset(struct scsi_cmnd *sc)
 		goto fnic_device_reset_end;
 
 	/* Check if remote port up */
-	if (fc_remote_port_chkready(rport)) {
+	if (fc_remote_port_chkready(rport, sc)) {
 		atomic64_inc(&fnic_stats->misc_stats.rport_not_ready);
 		goto fnic_device_reset_end;
 	}
-- 
2.26.2


[-- Attachment #2: S/MIME Cryptographic Signature --]
[-- Type: application/pkcs7-signature, Size: 4177 bytes --]

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH v3 13/17] scsi:bnx2fc: Added changes to fc_remote_port_chkready
  2020-10-15  3:27 [PATCH v3 00/17] scsi: Support to handle Intermittent errors Muneendra
                   ` (11 preceding siblings ...)
  2020-10-15  3:27 ` [PATCH v3 12/17] scsi:fnic: " Muneendra
@ 2020-10-15  3:27 ` Muneendra
  2020-10-15  3:27 ` [PATCH v3 14/17] scsi:csio: " Muneendra
                   ` (3 subsequent siblings)
  16 siblings, 0 replies; 46+ messages in thread
From: Muneendra @ 2020-10-15  3:27 UTC (permalink / raw)
  To: linux-scsi, hare; +Cc: jsmart2021, emilne, mkumar, Muneendra

[-- Attachment #1: Type: text/plain, Size: 1236 bytes --]

Added changes to pass a new argument to fc_remote_port_chkready

Signed-off-by: Muneendra <muneendra.kumar@broadcom.com>

---
v3:
New Patch
---
 drivers/scsi/bnx2fc/bnx2fc_els.c | 2 +-
 drivers/scsi/bnx2fc/bnx2fc_io.c  | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/scsi/bnx2fc/bnx2fc_els.c b/drivers/scsi/bnx2fc/bnx2fc_els.c
index 754f2e82d955..aea0e2e6c8b4 100644
--- a/drivers/scsi/bnx2fc/bnx2fc_els.c
+++ b/drivers/scsi/bnx2fc/bnx2fc_els.c
@@ -686,7 +686,7 @@ static int bnx2fc_initiate_els(struct bnx2fc_rport *tgt, unsigned int op,
 	u32 did, sid;
 	u16 xid;
 
-	rc = fc_remote_port_chkready(rport);
+	rc = fc_remote_port_chkready(rport, NULL);
 	if (rc) {
 		printk(KERN_ERR PFX "els 0x%x: rport not ready\n", op);
 		rc = -EINVAL;
diff --git a/drivers/scsi/bnx2fc/bnx2fc_io.c b/drivers/scsi/bnx2fc/bnx2fc_io.c
index 1a0dc18d6915..bed00287f8f1 100644
--- a/drivers/scsi/bnx2fc/bnx2fc_io.c
+++ b/drivers/scsi/bnx2fc/bnx2fc_io.c
@@ -1849,7 +1849,7 @@ int bnx2fc_queuecommand(struct Scsi_Host *host,
 	int rc = 0;
 	int rval;
 
-	rval = fc_remote_port_chkready(rport);
+	rval = fc_remote_port_chkready(rport, sc_cmd);
 	if (rval) {
 		sc_cmd->result = rval;
 		sc_cmd->scsi_done(sc_cmd);
-- 
2.26.2


[-- Attachment #2: S/MIME Cryptographic Signature --]
[-- Type: application/pkcs7-signature, Size: 4177 bytes --]

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH v3 14/17] scsi:csio: Added changes to fc_remote_port_chkready
  2020-10-15  3:27 [PATCH v3 00/17] scsi: Support to handle Intermittent errors Muneendra
                   ` (12 preceding siblings ...)
  2020-10-15  3:27 ` [PATCH v3 13/17] scsi:bnx2fc: " Muneendra
@ 2020-10-15  3:27 ` Muneendra
  2020-10-15  3:27 ` [PATCH v3 15/17] scsi:bfa: " Muneendra
                   ` (2 subsequent siblings)
  16 siblings, 0 replies; 46+ messages in thread
From: Muneendra @ 2020-10-15  3:27 UTC (permalink / raw)
  To: linux-scsi, hare; +Cc: jsmart2021, emilne, mkumar, Muneendra

[-- Attachment #1: Type: text/plain, Size: 1477 bytes --]

Added changes to pass a new argument to fc_remote_port_chkready

Signed-off-by: Muneendra <muneendra.kumar@broadcom.com>

---
v3:
New Patch
---
 drivers/scsi/csiostor/csio_scsi.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/scsi/csiostor/csio_scsi.c b/drivers/scsi/csiostor/csio_scsi.c
index 55e74da2f3cb..2df093b49808 100644
--- a/drivers/scsi/csiostor/csio_scsi.c
+++ b/drivers/scsi/csiostor/csio_scsi.c
@@ -1788,7 +1788,7 @@ csio_queuecommand(struct Scsi_Host *host, struct scsi_cmnd *cmnd)
 
 	sqset = &hw->sqset[ln->portid][blk_mq_rq_cpu(cmnd->request)];
 
-	nr = fc_remote_port_chkready(rport);
+	nr = fc_remote_port_chkready(rport, cmnd);
 	if (nr) {
 		cmnd->result = nr;
 		CSIO_INC_STATS(scsim, n_rn_nr_error);
@@ -2095,7 +2095,7 @@ csio_eh_lun_reset_handler(struct scsi_cmnd *cmnd)
 	 * the former case, since LUN reset is a TMF I/O on the wire, and we
 	 * need a valid session to issue it.
 	 */
-	if (fc_remote_port_chkready(rn->rport)) {
+	if (fc_remote_port_chkready(rn->rport, cmnd)) {
 		csio_err(hw,
 			 "LUN reset cannot be issued on non-ready"
 			 " remote node ssni:0x%x (LUN:%llu)\n",
@@ -2223,7 +2223,7 @@ csio_slave_alloc(struct scsi_device *sdev)
 {
 	struct fc_rport *rport = starget_to_rport(scsi_target(sdev));
 
-	if (!rport || fc_remote_port_chkready(rport))
+	if (!rport || fc_remote_port_chkready(rport, NULL))
 		return -ENXIO;
 
 	sdev->hostdata = *((struct csio_lnode **)(rport->dd_data));
-- 
2.26.2


[-- Attachment #2: S/MIME Cryptographic Signature --]
[-- Type: application/pkcs7-signature, Size: 4177 bytes --]

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH v3 15/17] scsi:bfa: Added changes to fc_remote_port_chkready
  2020-10-15  3:27 [PATCH v3 00/17] scsi: Support to handle Intermittent errors Muneendra
                   ` (13 preceding siblings ...)
  2020-10-15  3:27 ` [PATCH v3 14/17] scsi:csio: " Muneendra
@ 2020-10-15  3:27 ` Muneendra
  2020-10-15  3:27 ` [PATCH v3 16/17] scsi:zfcp: " Muneendra
  2020-10-15  3:27 ` [PATCH v3 17/17] scsi:mpt: " Muneendra
  16 siblings, 0 replies; 46+ messages in thread
From: Muneendra @ 2020-10-15  3:27 UTC (permalink / raw)
  To: linux-scsi, hare; +Cc: jsmart2021, emilne, mkumar, Muneendra

[-- Attachment #1: Type: text/plain, Size: 1062 bytes --]

Added changes to pass a new argument to fc_remote_port_chkready

Signed-off-by: Muneendra <muneendra.kumar@broadcom.com>

---
v3:
New Patch
---
 drivers/scsi/bfa/bfad_im.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/scsi/bfa/bfad_im.c b/drivers/scsi/bfa/bfad_im.c
index 22f06be2606f..550ce8cdbce2 100644
--- a/drivers/scsi/bfa/bfad_im.c
+++ b/drivers/scsi/bfa/bfad_im.c
@@ -956,7 +956,7 @@ bfad_im_slave_alloc(struct scsi_device *sdev)
 	struct bfad_itnim_data_s *itnim_data;
 	struct bfa_s *bfa;
 
-	if (!rport || fc_remote_port_chkready(rport))
+	if (!rport || fc_remote_port_chkready(rport, NULL))
 		return -ENXIO;
 
 	itnim_data = (struct bfad_itnim_data_s *) rport->dd_data;
@@ -1213,7 +1213,7 @@ bfad_im_queuecommand_lck(struct scsi_cmnd *cmnd, void (*done) (struct scsi_cmnd
 	int       sg_cnt = 0;
 	struct fc_rport *rport = starget_to_rport(scsi_target(cmnd->device));
 
-	rc = fc_remote_port_chkready(rport);
+	rc = fc_remote_port_chkready(rport, cmnd);
 	if (rc) {
 		cmnd->result = rc;
 		done(cmnd);
-- 
2.26.2


[-- Attachment #2: S/MIME Cryptographic Signature --]
[-- Type: application/pkcs7-signature, Size: 4177 bytes --]

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH v3 16/17] scsi:zfcp: Added changes to fc_remote_port_chkready
  2020-10-15  3:27 [PATCH v3 00/17] scsi: Support to handle Intermittent errors Muneendra
                   ` (14 preceding siblings ...)
  2020-10-15  3:27 ` [PATCH v3 15/17] scsi:bfa: " Muneendra
@ 2020-10-15  3:27 ` Muneendra
  2020-10-22 16:50   ` Benjamin Block
  2020-10-15  3:27 ` [PATCH v3 17/17] scsi:mpt: " Muneendra
  16 siblings, 1 reply; 46+ messages in thread
From: Muneendra @ 2020-10-15  3:27 UTC (permalink / raw)
  To: linux-scsi, hare; +Cc: jsmart2021, emilne, mkumar, Muneendra

[-- Attachment #1: Type: text/plain, Size: 779 bytes --]

Added changes to pass a new argument to fc_remote_port_chkready

Signed-off-by: Muneendra <muneendra.kumar@broadcom.com>

---
v3:
New Patch
---
 drivers/s390/scsi/zfcp_scsi.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/s390/scsi/zfcp_scsi.c b/drivers/s390/scsi/zfcp_scsi.c
index d58bf79892f2..732e15e3a839 100644
--- a/drivers/s390/scsi/zfcp_scsi.c
+++ b/drivers/s390/scsi/zfcp_scsi.c
@@ -74,7 +74,7 @@ int zfcp_scsi_queuecommand(struct Scsi_Host *shost, struct scsi_cmnd *scpnt)
 	scpnt->result = 0;
 	scpnt->host_scribble = NULL;
 
-	scsi_result = fc_remote_port_chkready(rport);
+	scsi_result = fc_remote_port_chkready(rport, scpnt);
 	if (unlikely(scsi_result)) {
 		scpnt->result = scsi_result;
 		zfcp_dbf_scsi_fail_send(scpnt);
-- 
2.26.2


[-- Attachment #2: S/MIME Cryptographic Signature --]
[-- Type: application/pkcs7-signature, Size: 4177 bytes --]

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH v3 17/17] scsi:mpt: Added changes to fc_remote_port_chkready
  2020-10-15  3:27 [PATCH v3 00/17] scsi: Support to handle Intermittent errors Muneendra
                   ` (15 preceding siblings ...)
  2020-10-15  3:27 ` [PATCH v3 16/17] scsi:zfcp: " Muneendra
@ 2020-10-15  3:27 ` Muneendra
  16 siblings, 0 replies; 46+ messages in thread
From: Muneendra @ 2020-10-15  3:27 UTC (permalink / raw)
  To: linux-scsi, hare; +Cc: jsmart2021, emilne, mkumar, Muneendra

[-- Attachment #1: Type: text/plain, Size: 1453 bytes --]

Added changes to pass a new argument to fc_remote_port_chkready

Signed-off-by: Muneendra <muneendra.kumar@broadcom.com>

---
v3:
New Patch
---
 drivers/message/fusion/mptfc.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/message/fusion/mptfc.c b/drivers/message/fusion/mptfc.c
index f92b0433f599..e6f7fa891455 100644
--- a/drivers/message/fusion/mptfc.c
+++ b/drivers/message/fusion/mptfc.c
@@ -199,7 +199,7 @@ mptfc_block_error_handler(struct scsi_cmnd *SCpnt,
 	hd = shost_priv(SCpnt->device->host);
 	ioc = hd->ioc;
 	spin_lock_irqsave(shost->host_lock, flags);
-	while ((ready = fc_remote_port_chkready(rport) >> 16) == DID_IMM_RETRY
+	while ((ready = fc_remote_port_chkready(rport, SCpnt) >> 16) == DID_IMM_RETRY
 	 || (loops > 0 && ioc->active == 0)) {
 		spin_unlock_irqrestore(shost->host_lock, flags);
 		dfcprintk (ioc, printk(MYIOC_s_DEBUG_FMT
@@ -606,7 +606,7 @@ mptfc_slave_alloc(struct scsi_device *sdev)
 	starget = scsi_target(sdev);
 	rport = starget_to_rport(starget);
 
-	if (!rport || fc_remote_port_chkready(rport))
+	if (!rport || fc_remote_port_chkready(rport, NULL))
 		return -ENXIO;
 
 	hd = shost_priv(sdev->host);
@@ -653,7 +653,7 @@ mptfc_qcmd(struct Scsi_Host *shost, struct scsi_cmnd *SCpnt)
 		return 0;
 	}
 
-	err = fc_remote_port_chkready(rport);
+	err = fc_remote_port_chkready(rport, SCpnt);
 	if (unlikely(err)) {
 		SCpnt->result = err;
 		SCpnt->scsi_done(SCpnt);
-- 
2.26.2


[-- Attachment #2: S/MIME Cryptographic Signature --]
[-- Type: application/pkcs7-signature, Size: 4177 bytes --]

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* Re: [PATCH v3 06/17] scsi_transport_fc: Added store fucntionality to set the rport port_state using sysfs
  2020-10-15  3:27 ` [PATCH v3 06/17] scsi_transport_fc: Added store fucntionality to set the rport port_state using sysfs Muneendra
@ 2020-10-15 14:05     ` kernel test robot
  2020-10-16 18:34   ` Mike Christie
  1 sibling, 0 replies; 46+ messages in thread
From: kernel test robot @ 2020-10-15 14:05 UTC (permalink / raw)
  To: Muneendra, linux-scsi, hare
  Cc: kbuild-all, jsmart2021, emilne, mkumar, Muneendra

[-- Attachment #1: Type: text/plain, Size: 4384 bytes --]

Hi Muneendra,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on scsi/for-next]
[also build test WARNING on mkp-scsi/for-next next-20201015]
[cannot apply to v5.9]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/0day-ci/linux/commits/Muneendra/scsi-Support-to-handle-Intermittent-errors/20201015-182315
base:   https://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi.git for-next
config: m68k-allmodconfig (attached as .config)
compiler: m68k-linux-gcc (GCC) 9.3.0
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # https://github.com/0day-ci/linux/commit/6bb032ada0c283207bbbf0d17a5d8091f49ebb03
        git remote add linux-review https://github.com/0day-ci/linux
        git fetch --no-tags linux-review Muneendra/scsi-Support-to-handle-Intermittent-errors/20201015-182315
        git checkout 6bb032ada0c283207bbbf0d17a5d8091f49ebb03
        # save the attached .config to linux build tree
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross ARCH=m68k 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>

All warnings (new ones prefixed by >>):

   In file included from include/linux/kernel.h:11,
                    from include/linux/list.h:9,
                    from include/linux/module.h:12,
                    from drivers/scsi/scsi_transport_fc.c:11:
   include/linux/scatterlist.h: In function 'sg_set_buf':
   arch/m68k/include/asm/page_mm.h:169:49: warning: ordered comparison of pointer with null pointer [-Wextra]
     169 | #define virt_addr_valid(kaddr) ((void *)(kaddr) >= (void *)PAGE_OFFSET && (void *)(kaddr) < high_memory)
         |                                                 ^~
   include/linux/compiler.h:78:42: note: in definition of macro 'unlikely'
      78 | # define unlikely(x) __builtin_expect(!!(x), 0)
         |                                          ^
   include/linux/scatterlist.h:143:2: note: in expansion of macro 'BUG_ON'
     143 |  BUG_ON(!virt_addr_valid(buf));
         |  ^~~~~~
   include/linux/scatterlist.h:143:10: note: in expansion of macro 'virt_addr_valid'
     143 |  BUG_ON(!virt_addr_valid(buf));
         |          ^~~~~~~~~~~~~~~
   drivers/scsi/scsi_transport_fc.c: In function 'fc_rport_set_marginal_state':
>> drivers/scsi/scsi_transport_fc.c:982:6: warning: variable 'ret' set but not used [-Wunused-but-set-variable]
     982 |  int ret = 0;
         |      ^~~

vim +/ret +982 drivers/scsi/scsi_transport_fc.c

   969	
   970	/*
   971	 * Sets port_state to Marginal/Online.
   972	 * On Marginal it Sets  no retries on abort in scmd->state for all
   973	 * outstanding io of all the scsi_devs
   974	 * This only allows ONLINE->MARGINAL and MARGINAL->ONLINE
   975	 */
   976	static ssize_t fc_rport_set_marginal_state(struct device *dev,
   977							struct device_attribute *attr,
   978							const char *buf, size_t count)
   979	{
   980		struct fc_rport *rport = transport_class_to_rport(dev);
   981		enum fc_port_state port_state;
 > 982		int ret = 0;
   983	
   984		ret = get_fc_port_state_match(buf, &port_state);
   985	
   986		if (port_state == FC_PORTSTATE_MARGINAL) {
   987			/*
   988			 * Change the state to marginal only if the
   989			 * current rport state is Online
   990			 * Allow only Online->marginal
   991			 */
   992			if (rport->port_state == FC_PORTSTATE_ONLINE) {
   993				rport->port_state = port_state;
   994				scsi_target_chg_noretries_abort(&rport->dev, 1);
   995			}
   996		} else if (port_state == FC_PORTSTATE_ONLINE) {
   997			/*
   998			 * Change the state to Online only if the
   999			 * current rport state is Marginal
  1000			 * Allow only  MArginal->Online
  1001			 */
  1002			if (rport->port_state == FC_PORTSTATE_MARGINAL) {
  1003				rport->port_state = port_state;
  1004				scsi_target_chg_noretries_abort(&rport->dev, 0);
  1005			}
  1006		} else
  1007			return -EINVAL;
  1008		return count;
  1009	}
  1010	

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 57804 bytes --]

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v3 06/17] scsi_transport_fc: Added store fucntionality to set the rport port_state using sysfs
@ 2020-10-15 14:05     ` kernel test robot
  0 siblings, 0 replies; 46+ messages in thread
From: kernel test robot @ 2020-10-15 14:05 UTC (permalink / raw)
  To: kbuild-all

[-- Attachment #1: Type: text/plain, Size: 4486 bytes --]

Hi Muneendra,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on scsi/for-next]
[also build test WARNING on mkp-scsi/for-next next-20201015]
[cannot apply to v5.9]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/0day-ci/linux/commits/Muneendra/scsi-Support-to-handle-Intermittent-errors/20201015-182315
base:   https://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi.git for-next
config: m68k-allmodconfig (attached as .config)
compiler: m68k-linux-gcc (GCC) 9.3.0
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # https://github.com/0day-ci/linux/commit/6bb032ada0c283207bbbf0d17a5d8091f49ebb03
        git remote add linux-review https://github.com/0day-ci/linux
        git fetch --no-tags linux-review Muneendra/scsi-Support-to-handle-Intermittent-errors/20201015-182315
        git checkout 6bb032ada0c283207bbbf0d17a5d8091f49ebb03
        # save the attached .config to linux build tree
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross ARCH=m68k 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>

All warnings (new ones prefixed by >>):

   In file included from include/linux/kernel.h:11,
                    from include/linux/list.h:9,
                    from include/linux/module.h:12,
                    from drivers/scsi/scsi_transport_fc.c:11:
   include/linux/scatterlist.h: In function 'sg_set_buf':
   arch/m68k/include/asm/page_mm.h:169:49: warning: ordered comparison of pointer with null pointer [-Wextra]
     169 | #define virt_addr_valid(kaddr) ((void *)(kaddr) >= (void *)PAGE_OFFSET && (void *)(kaddr) < high_memory)
         |                                                 ^~
   include/linux/compiler.h:78:42: note: in definition of macro 'unlikely'
      78 | # define unlikely(x) __builtin_expect(!!(x), 0)
         |                                          ^
   include/linux/scatterlist.h:143:2: note: in expansion of macro 'BUG_ON'
     143 |  BUG_ON(!virt_addr_valid(buf));
         |  ^~~~~~
   include/linux/scatterlist.h:143:10: note: in expansion of macro 'virt_addr_valid'
     143 |  BUG_ON(!virt_addr_valid(buf));
         |          ^~~~~~~~~~~~~~~
   drivers/scsi/scsi_transport_fc.c: In function 'fc_rport_set_marginal_state':
>> drivers/scsi/scsi_transport_fc.c:982:6: warning: variable 'ret' set but not used [-Wunused-but-set-variable]
     982 |  int ret = 0;
         |      ^~~

vim +/ret +982 drivers/scsi/scsi_transport_fc.c

   969	
   970	/*
   971	 * Sets port_state to Marginal/Online.
   972	 * On Marginal it Sets  no retries on abort in scmd->state for all
   973	 * outstanding io of all the scsi_devs
   974	 * This only allows ONLINE->MARGINAL and MARGINAL->ONLINE
   975	 */
   976	static ssize_t fc_rport_set_marginal_state(struct device *dev,
   977							struct device_attribute *attr,
   978							const char *buf, size_t count)
   979	{
   980		struct fc_rport *rport = transport_class_to_rport(dev);
   981		enum fc_port_state port_state;
 > 982		int ret = 0;
   983	
   984		ret = get_fc_port_state_match(buf, &port_state);
   985	
   986		if (port_state == FC_PORTSTATE_MARGINAL) {
   987			/*
   988			 * Change the state to marginal only if the
   989			 * current rport state is Online
   990			 * Allow only Online->marginal
   991			 */
   992			if (rport->port_state == FC_PORTSTATE_ONLINE) {
   993				rport->port_state = port_state;
   994				scsi_target_chg_noretries_abort(&rport->dev, 1);
   995			}
   996		} else if (port_state == FC_PORTSTATE_ONLINE) {
   997			/*
   998			 * Change the state to Online only if the
   999			 * current rport state is Marginal
  1000			 * Allow only  MArginal->Online
  1001			 */
  1002			if (rport->port_state == FC_PORTSTATE_MARGINAL) {
  1003				rport->port_state = port_state;
  1004				scsi_target_chg_noretries_abort(&rport->dev, 0);
  1005			}
  1006		} else
  1007			return -EINVAL;
  1008		return count;
  1009	}
  1010	

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all(a)lists.01.org

[-- Attachment #2: config.gz --]
[-- Type: application/gzip, Size: 57804 bytes --]

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v3 06/17] scsi_transport_fc: Added store fucntionality to set the rport port_state using sysfs
  2020-10-15  3:27 ` [PATCH v3 06/17] scsi_transport_fc: Added store fucntionality to set the rport port_state using sysfs Muneendra
  2020-10-15 14:05     ` kernel test robot
@ 2020-10-16 18:34   ` Mike Christie
  2020-10-19 10:52     ` Muneendra Kumar M
  1 sibling, 1 reply; 46+ messages in thread
From: Mike Christie @ 2020-10-16 18:34 UTC (permalink / raw)
  To: Muneendra, linux-scsi, hare; +Cc: jsmart2021, emilne, mkumar

On 10/14/20 10:27 PM, Muneendra wrote:
> Added a store functionality to set rport port_state using sysfs
> under  fc_remote_ports/rport-*/port_state
> 
> With this functionality the user can move the port_state from
> Marginal->Online and Online->Marginal.
> 
> On Marginal :This interface will set SCMD_NORETRIES_ABORT bit in
> scmd->state for all the io's on the scsi device associated
> with remote port.
> 
> On Online :This interface will clear SCMD_NORETRIES_ABORT bit in
> scmd->state for all the pending io's on the scsi device associated
> with remote port.
> 
> Below is the interface provided to set the port state to Marginal
> and Online.
> 
> echo "Marginal" >> /sys/class/fc_remote_ports/rport-X\:Y-Z/port_state
> echo "Online" >> /sys/class/fc_remote_ports/rport-X\:Y-Z/port_state
> 
> Signed-off-by: Muneendra <muneendra.kumar@broadcom.com>
> 
> ---
> v3:
> Removed the port_state from starget attributes.
> Enabled the store functionality for port_state under remote port.
> used the starget_for_each_device to traverse around all the devices
> under rport
> 
> v2:
> Changed from a noretries_abort attribute under fc_transport/target*/ to
> port_state for changing the port_state to a marginal state
> ---
>   drivers/scsi/scsi_transport_fc.c | 85 +++++++++++++++++++++++++++++++-
>   1 file changed, 83 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/scsi/scsi_transport_fc.c b/drivers/scsi/scsi_transport_fc.c
> index df66a51d62e6..ac5283b645a6 100644
> --- a/drivers/scsi/scsi_transport_fc.c
> +++ b/drivers/scsi/scsi_transport_fc.c
> @@ -943,7 +943,88 @@ show_fc_rport_roles (struct device *dev, struct device_attribute *attr,
>   static FC_DEVICE_ATTR(rport, roles, S_IRUGO,
>   		show_fc_rport_roles, NULL);
>   
> -fc_private_rport_rd_enum_attr(port_state, FC_PORTSTATE_MAX_NAMELEN);
> +static void
> +device_chg_noretries_abort(struct scsi_device *sdev, void *data)
> +{
> +	scsi_chg_noretries_abort_io_device(sdev, *(bool *)data);
> +}
> +
> +static int
> +target_chg_noretries_abort(struct device *dev, void *data)
> +{
> +	if (scsi_is_target_device(dev))
> +		starget_for_each_device(to_scsi_target(dev), data,
> +					device_chg_noretries_abort);
> +	return 0;
> +}
> +
> +static void scsi_target_chg_noretries_abort(struct device *dev, bool set)
> +{
> +	if (scsi_is_target_device(dev))
> +		starget_for_each_device(to_scsi_target(dev), &set,
> +					device_chg_noretries_abort);
> +	else
> +		device_for_each_child(dev, &set, target_chg_noretries_abort);
> +}
> +
> +/*
> + * Sets port_state to Marginal/Online.
> + * On Marginal it Sets  no retries on abort in scmd->state for all
> + * outstanding io of all the scsi_devs
> + * This only allows ONLINE->MARGINAL and MARGINAL->ONLINE

The above comments are not needed since not counting comments the code 
is almost the same number of lines as the comments. Plus the comments 
below say the same thing. And the functions/fields/variables you are 
using below are pretty clear in what they are doing.

> + */
> +static ssize_t fc_rport_set_marginal_state(struct device *dev,
> +						struct device_attribute *attr,
> +						const char *buf, size_t count)
> +{
> +	struct fc_rport *rport = transport_class_to_rport(dev);
> +	enum fc_port_state port_state;
> +	int ret = 0;
> +
> +	ret = get_fc_port_state_match(buf, &port_state);

The kernel test robot mentioned this.

> +
> +	if (port_state == FC_PORTSTATE_MARGINAL) {
> +		/*
> +		 * Change the state to marginal only if the
> +		 * current rport state is Online
> +		 * Allow only Online->marginal
> +		 */
> +		if (rport->port_state == FC_PORTSTATE_ONLINE) {
> +			rport->port_state = port_state;
> +			scsi_target_chg_noretries_abort(&rport->dev, 1);
> +		}

Should this return a failure if port_state is not online?


> +	} else if (port_state == FC_PORTSTATE_ONLINE) {
> +		/*
> +		 * Change the state to Online only if the
> +		 * current rport state is Marginal
> +		 * Allow only  MArginal->Online
> +		 */
> +		if (rport->port_state == FC_PORTSTATE_MARGINAL) {
> +			rport->port_state = port_state;
> +			scsi_target_chg_noretries_abort(&rport->dev, 0);
> +		}

Same here.

> +	} else
> +		return -EINVAL;
> +	return count;
> +}
> +
> +static ssize_t
> +show_fc_rport_port_state(struct device *dev,
> +				struct device_attribute *attr, char *buf)
> +{
> +	const char *name;
> +	struct fc_rport *rport = transport_class_to_rport(dev);
> +
> +	name = get_fc_port_state_name(rport->port_state);
> +	if (!name)
> +		return -EINVAL;
> +
> +	return snprintf(buf, 20, "%s\n", name);
> +}
> +
> +static FC_DEVICE_ATTR(rport, port_state, 0444 | 0200,
> +			show_fc_rport_port_state, fc_rport_set_marginal_state);
> +
>   fc_private_rport_rd_attr(scsi_target_id, "%d\n", 20);
>   
>   /*
> @@ -2266,7 +2347,7 @@ fc_attach_transport(struct fc_function_template *ft)
>   	SETUP_PRIVATE_RPORT_ATTRIBUTE_RD(port_name);
>   	SETUP_PRIVATE_RPORT_ATTRIBUTE_RD(port_id);
>   	SETUP_PRIVATE_RPORT_ATTRIBUTE_RD(roles);
> -	SETUP_PRIVATE_RPORT_ATTRIBUTE_RD(port_state);
> +	SETUP_PRIVATE_RPORT_ATTRIBUTE_RW(port_state);
>   	SETUP_PRIVATE_RPORT_ATTRIBUTE_RD(scsi_target_id);
>   	SETUP_PRIVATE_RPORT_ATTRIBUTE_RW(fast_io_fail_tmo);
>   
> 


^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v3 03/17] scsi: No retries on abort success
  2020-10-15  3:27 ` [PATCH v3 03/17] scsi: No retries on abort success Muneendra
@ 2020-10-16 18:37   ` Mike Christie
  2020-10-19 19:05   ` Mike Christie
  1 sibling, 0 replies; 46+ messages in thread
From: Mike Christie @ 2020-10-16 18:37 UTC (permalink / raw)
  To: Muneendra, linux-scsi, hare; +Cc: jsmart2021, emilne, mkumar

On 10/14/20 10:27 PM, Muneendra wrote:
> Made an additional check in scsi_noretry_cmd to verify whether user has
> decided not to do retries on abort success by setting the
> SCMD_NORETRIES_ABORT bit
> 
> If SCMD_NORETRIES_ABORT bit is set we are making sure there won't be any
> retries done on the same path and also setting the host byte as
> DID_TRANSPORT_MARGINAL so that the error can be propogated as recoverable
> transport error to the blk layers.
> 
> Added a code in scsi_result_to_blk_status to translate
> a new error DID_TRANSPORT_MARGINAL to the corresponding blk_status_t
> i.e BLK_STS_TRANSPORT
> 
> Signed-off-by: Muneendra <muneendra.kumar@broadcom.com>
> 
> ---
> v3:
> Merged  first part of the previous patch(v2 patch3) with
> this patch.
> 
> v2:
> set the hostbyte as DID_TRANSPORT_MARGINAL instead of
> DID_TRANSPORT_FAILFAST.
> ---
>   drivers/scsi/scsi_error.c | 10 ++++++++++
>   drivers/scsi/scsi_lib.c   |  1 +
>   2 files changed, 11 insertions(+)
> 
> diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
> index ae80daa5d831..aa30c1c9e9db 100644
> --- a/drivers/scsi/scsi_error.c
> +++ b/drivers/scsi/scsi_error.c
> @@ -1763,6 +1763,16 @@ int scsi_noretry_cmd(struct scsi_cmnd *scmd)
>   		return 0;
>   
>   check_type:
> +	/*
> +	 * Check whether caller has decided not to do retries on
> +	 * abort success by setting the SCMD_NORETRIES_ABORT bit

The comment seems wrong here because we are not setting that bit.

> +	 */
> +	if ((test_bit(SCMD_NORETRIES_ABORT, &scmd->state)) &&
> +		(scmd->request->cmd_flags & REQ_FAILFAST_TRANSPORT)) {
> +		set_host_byte(scmd, DID_TRANSPORT_MARGINAL);
> +		return 1;
> +	}
> +
>   	/*
>   	 * assume caller has checked sense and determined
>   	 * the check condition was retryable.
> diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
> index 2b5dea07498e..9606bad1542f 100644
> --- a/drivers/scsi/scsi_lib.c
> +++ b/drivers/scsi/scsi_lib.c
> @@ -629,6 +629,7 @@ static blk_status_t scsi_result_to_blk_status(struct scsi_cmnd *cmd, int result)
>   			return BLK_STS_OK;
>   		return BLK_STS_IOERR;
>   	case DID_TRANSPORT_FAILFAST:
> +	case DID_TRANSPORT_MARGINAL:
>   		return BLK_STS_TRANSPORT;
>   	case DID_TARGET_FAILURE:
>   		set_host_byte(cmd, DID_OK);
> 


^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v3 05/17] scsi_transport_fc: Added a new rport state FC_PORTSTATE_MARGINAL
  2020-10-15  3:27 ` [PATCH v3 05/17] scsi_transport_fc: Added a new rport state FC_PORTSTATE_MARGINAL Muneendra
@ 2020-10-16 19:52   ` Mike Christie
  2020-10-19 10:47     ` Muneendra Kumar M
  0 siblings, 1 reply; 46+ messages in thread
From: Mike Christie @ 2020-10-16 19:52 UTC (permalink / raw)
  To: Muneendra, linux-scsi, hare; +Cc: jsmart2021, emilne, mkumar

On 10/14/20 10:27 PM, Muneendra wrote:
> Added a new rport state FC_PORTSTATE_MARGINAL.
> 
> Added a new inline function fc_rport_chkmarginal_set_noretries
> which will set the SCMD_NORETRIES_ABORT bit in cmd->state if rport state
> is marginal.
> 
> Added a new argumet scsi_cmnd to the function fc_remote_port_chkready.
> Made changes in fc_remote_port_chkready function to treat marginal and
> online as same.If scsi_cmd is passed fc_rport_chkmarginal_set_noretries
> will be called.
> 
> Also made changes in fc_remote_port_delete,fc_user_scan_tgt,
> fc_timeout_deleted_rport functions  to handle the new rport state
> FC_PORTSTATE_MARGINAL.
> 
> Signed-off-by: Muneendra <muneendra.kumar@broadcom.com>
> 
> ---
> v3:
> Rearranged the patch so that all the changes with respect to new
> rport state is part of this patch.
> Added a new argument to scsi_cmd  to fc_remote_port_chkready
> 
> v2:
> New patch
> ---
>   drivers/scsi/scsi_transport_fc.c | 40 +++++++++++++++++++-------------
>   include/scsi/scsi_transport_fc.h | 26 ++++++++++++++++++++-
>   2 files changed, 49 insertions(+), 17 deletions(-)
> 
> diff --git a/drivers/scsi/scsi_transport_fc.c b/drivers/scsi/scsi_transport_fc.c
> index 2ff7f06203da..df66a51d62e6 100644
> --- a/drivers/scsi/scsi_transport_fc.c
> +++ b/drivers/scsi/scsi_transport_fc.c
> @@ -142,20 +142,23 @@ fc_enum_name_search(host_event_code, fc_host_event_code,
>   static struct {
>   	enum fc_port_state	value;
>   	char			*name;
> +	int			matchlen;
>   } fc_port_state_names[] = {
> -	{ FC_PORTSTATE_UNKNOWN,		"Unknown" },
> -	{ FC_PORTSTATE_NOTPRESENT,	"Not Present" },
> -	{ FC_PORTSTATE_ONLINE,		"Online" },
> -	{ FC_PORTSTATE_OFFLINE,		"Offline" },
> -	{ FC_PORTSTATE_BLOCKED,		"Blocked" },
> -	{ FC_PORTSTATE_BYPASSED,	"Bypassed" },
> -	{ FC_PORTSTATE_DIAGNOSTICS,	"Diagnostics" },
> -	{ FC_PORTSTATE_LINKDOWN,	"Linkdown" },
> -	{ FC_PORTSTATE_ERROR,		"Error" },
> -	{ FC_PORTSTATE_LOOPBACK,	"Loopback" },
> -	{ FC_PORTSTATE_DELETED,		"Deleted" },
> +	{ FC_PORTSTATE_UNKNOWN,		"Unknown", 7},
> +	{ FC_PORTSTATE_NOTPRESENT,	"Not Present", 11 },
> +	{ FC_PORTSTATE_ONLINE,		"Online", 6 },
> +	{ FC_PORTSTATE_OFFLINE,		"Offline", 7 },
> +	{ FC_PORTSTATE_BLOCKED,		"Blocked", 7 },
> +	{ FC_PORTSTATE_BYPASSED,	"Bypassed", 8 },
> +	{ FC_PORTSTATE_DIAGNOSTICS,	"Diagnostics", 11 },
> +	{ FC_PORTSTATE_LINKDOWN,	"Linkdown", 8 },
> +	{ FC_PORTSTATE_ERROR,		"Error", 5 },
> +	{ FC_PORTSTATE_LOOPBACK,	"Loopback", 8 },
> +	{ FC_PORTSTATE_DELETED,		"Deleted", 7 },
> +	{ FC_PORTSTATE_MARGINAL,	"Marginal", 8 },
>   };
>   fc_enum_name_search(port_state, fc_port_state, fc_port_state_names)
> +fc_enum_name_match(port_state, fc_port_state, fc_port_state_names)
>   #define FC_PORTSTATE_MAX_NAMELEN	20
>   
>   
> @@ -2095,7 +2098,8 @@ fc_user_scan_tgt(struct Scsi_Host *shost, uint channel, uint id, u64 lun)
>   		if (rport->scsi_target_id == -1)
>   			continue;
>   
> -		if (rport->port_state != FC_PORTSTATE_ONLINE)
> +		if ((rport->port_state != FC_PORTSTATE_ONLINE) &&
> +			(rport->port_state != FC_PORTSTATE_MARGINAL))
>   			continue;
>   
>   		if ((channel == rport->channel) &&
> @@ -2958,7 +2962,8 @@ fc_remote_port_delete(struct fc_rport  *rport)
>   
>   	spin_lock_irqsave(shost->host_lock, flags);
>   
> -	if (rport->port_state != FC_PORTSTATE_ONLINE) {
> +	if ((rport->port_state != FC_PORTSTATE_ONLINE) &&
> +		(rport->port_state != FC_PORTSTATE_MARGINAL)) {
>   		spin_unlock_irqrestore(shost->host_lock, flags);
>   		return;
>   	}
> @@ -3100,7 +3105,8 @@ fc_timeout_deleted_rport(struct work_struct *work)
>   	 * target, validate it still is. If not, tear down the
>   	 * scsi_target on it.
>   	 */
> -	if ((rport->port_state == FC_PORTSTATE_ONLINE) &&
> +	if (((rport->port_state == FC_PORTSTATE_ONLINE) ||
> +		(rport->port_state == FC_PORTSTATE_MARGINAL)) &&
>   	    (rport->scsi_target_id != -1) &&
>   	    !(rport->roles & FC_PORT_ROLE_FCP_TARGET)) {
>   		dev_printk(KERN_ERR, &rport->dev,
> @@ -3243,7 +3249,8 @@ fc_scsi_scan_rport(struct work_struct *work)
>   	struct fc_internal *i = to_fc_internal(shost->transportt);
>   	unsigned long flags;
>   
> -	if ((rport->port_state == FC_PORTSTATE_ONLINE) &&
> +	if (((rport->port_state == FC_PORTSTATE_ONLINE) ||
> +		(rport->port_state == FC_PORTSTATE_ONLINE)) &&
>   	    (rport->roles & FC_PORT_ROLE_FCP_TARGET) &&
>   	    !(i->f->disable_target_scan)) {
>   		scsi_scan_target(&rport->dev, rport->channel,
> @@ -3747,7 +3754,8 @@ static blk_status_t fc_bsg_rport_prep(struct fc_rport *rport)
>   	    !(rport->flags & FC_RPORT_FAST_FAIL_TIMEDOUT))
>   		return BLK_STS_RESOURCE;
>   
> -	if (rport->port_state != FC_PORTSTATE_ONLINE)
> +	if ((rport->port_state != FC_PORTSTATE_ONLINE) &&
> +		(rport->port_state != FC_PORTSTATE_MARGINAL))
>   		return BLK_STS_IOERR;
>   
>   	return BLK_STS_OK;
> diff --git a/include/scsi/scsi_transport_fc.h b/include/scsi/scsi_transport_fc.h
> index 1c7dd35cb7a0..7d010324c1e3 100644
> --- a/include/scsi/scsi_transport_fc.h
> +++ b/include/scsi/scsi_transport_fc.h
> @@ -14,6 +14,7 @@
>   #include <linux/bsg-lib.h>
>   #include <asm/unaligned.h>
>   #include <scsi/scsi.h>
> +#include <scsi/scsi_cmnd.h>
>   #include <scsi/scsi_netlink.h>
>   #include <scsi/scsi_host.h>
>   
> @@ -67,6 +68,7 @@ enum fc_port_state {
>   	FC_PORTSTATE_ERROR,
>   	FC_PORTSTATE_LOOPBACK,
>   	FC_PORTSTATE_DELETED,
> +	FC_PORTSTATE_MARGINAL,
>   };
>   
>   
> @@ -707,6 +709,23 @@ struct fc_function_template {
>   	unsigned long	disable_target_scan:1;
>   };
>   
> +/**
> + * fc_rport_chkmarginal_set_noretries - Set the SCMD_NORETRIES_ABORT bit
> + * in cmd->state if port state is marginal prior to initiating
> + * io to the port.
> + * @rport:	remote port to be checked
> + * @scmd:	scsi_cmd to set/clear the SCMD_NORETRIES_ABORT bit on Marginal state
> + **/
> +static inline void
> +fc_rport_chkmarginal_set_noretries(struct fc_rport *rport, struct scsi_cmnd *cmd)
> +{
> +	if ((rport->port_state == FC_PORTSTATE_MARGINAL) &&
> +		 (cmd->request->cmd_flags & REQ_FAILFAST_TRANSPORT))
> +		set_bit(SCMD_NORETRIES_ABORT, &cmd->state);
> +	else
> +		clear_bit(SCMD_NORETRIES_ABORT, &cmd->state);
> +
> +}
>   
>   /**
>    * fc_remote_port_chkready - called to validate the remote port state
> @@ -715,20 +734,25 @@ struct fc_function_template {
>    * Returns a scsi result code that can be returned by the LLDD.
>    *
>    * @rport:	remote port to be checked
> + * @cmd:	scsi_cmd to set/clear the SCMD_NORETRIES_ABORT bit on Marginal state
>    **/
>   static inline int
> -fc_remote_port_chkready(struct fc_rport *rport)
> +fc_remote_port_chkready(struct fc_rport *rport, struct scsi_cmnd *cmd)
>   {
>   	int result;
>   
>   	switch (rport->port_state) {
>   	case FC_PORTSTATE_ONLINE:
> +	case FC_PORTSTATE_MARGINAL:
>   		if (rport->roles & FC_PORT_ROLE_FCP_TARGET)
>   			result = 0;
>   		else if (rport->flags & FC_RPORT_DEVLOSS_PENDING)
>   			result = DID_IMM_RETRY << 16;
>   		else
>   			result = DID_NO_CONNECT << 16;
> +
> +		if (cmd)
> +			fc_rport_chkmarginal_set_noretries(rport, cmd);

I was just wondering why don't you do drop all the SCMD_NORETRIES_ABORT 
code and then in this function when you check for the marginal state do:

result = DID_TRANSPORT_MARGINAL;

?

So you would do:

1. Userspace calls in scsi_transport_fc and sets the port state to marginal.
2. New commands would see the above check and complete the command woth 
DID_TRANSPORT_MARGINAL.
3. If a command is retried by the scsi layer we would end up hitting 
this function and see the check and end up faling the IO like in #2. It 
would be similar to what happens when the dev loss or fast io fail 
timers fire.

^ permalink raw reply	[flat|nested] 46+ messages in thread

* RE: [PATCH v3 05/17] scsi_transport_fc: Added a new rport state FC_PORTSTATE_MARGINAL
  2020-10-16 19:52   ` Mike Christie
@ 2020-10-19 10:47     ` Muneendra Kumar M
  2020-10-19 16:10       ` Michael Christie
  0 siblings, 1 reply; 46+ messages in thread
From: Muneendra Kumar M @ 2020-10-19 10:47 UTC (permalink / raw)
  To: Mike Christie, linux-scsi, hare; +Cc: jsmart2021, emilne, mkumar

[-- Attachment #1: Type: text/plain, Size: 1993 bytes --]

Hi Mike,
Thanks for the review.

>> +fc_remote_port_chkready(struct fc_rport *rport, struct scsi_cmnd
>> +*cmd)
> >  {
>   >	int result;
>  >
>   >	switch (rport->port_state) {
>   >	case FC_PORTSTATE_ONLINE:
> >+	case FC_PORTSTATE_MARGINAL:
>   >		if (rport->roles & FC_PORT_ROLE_FCP_TARGET)
>   >			result = 0;
>   >		else if (rport->flags & FC_RPORT_DEVLOSS_PENDING)
>>   			result = DID_IMM_RETRY << 16;
>>   		else
>>   			result = DID_NO_CONNECT << 16;
> >+
> >+		if (cmd)
> >+			fc_rport_chkmarginal_set_noretries(rport, cmd);

>I was just wondering why don't you do drop all the SCMD_NORETRIES_ABORT
>code and then in this function when you check for the marginal state do:

>result = DID_TRANSPORT_MARGINAL;

?
[Muneendra] Correct me if my understanding is correct?
You mean to say ,if the SCMD_NORETRIES_ABORT is set end up failing the IO
with DID_TRANSPORT_MARGINAL instead of retry?
I assume that your below point 3 talks the same.
Let me know if this is correct.

>So you would do:

>1. Userspace calls in scsi_transport_fc and sets the port state to
>marginal.
2. New commands would see the above check and complete the command with
DID_TRANSPORT_MARGINAL.
[Muneendra]Correct me if my understanding is right.
You mean to say if the port_state is set to marginal return all the new
commands with DID_TRANSPORT_MARGINAL  (without checking
SCMD_NORETRIES_ABORT)?

If yes then below are my concerns
In marginal state  what we want is the ios to be attempted at least once .

In marginal state we cannot drop all the commands as one of the concern we
have is if a target is non-mpio based targets then we will be dropping it
without any attempt.
With this we will be even dropping the TUR commands also which unnecessarily
lead to error handling.


3. If a command is retried by the scsi layer we would end up hitting this
function and see the check and end up faling the IO like in #2. It would be
similar to what happens when the dev loss or fast io fail timers fire.


[Muneendra]

[-- Attachment #2: S/MIME Cryptographic Signature --]
[-- Type: application/pkcs7-signature, Size: 4177 bytes --]

^ permalink raw reply	[flat|nested] 46+ messages in thread

* RE: [PATCH v3 06/17] scsi_transport_fc: Added store fucntionality to set the rport port_state using sysfs
  2020-10-16 18:34   ` Mike Christie
@ 2020-10-19 10:52     ` Muneendra Kumar M
  0 siblings, 0 replies; 46+ messages in thread
From: Muneendra Kumar M @ 2020-10-19 10:52 UTC (permalink / raw)
  To: Mike Christie, linux-scsi, hare; +Cc: jsmart2021, emilne, mkumar

[-- Attachment #1: Type: text/plain, Size: 859 bytes --]

HI Mike,
Thanks for the review.

> +/*
> + * Sets port_state to Marginal/Online.
> + * On Marginal it Sets  no retries on abort in scmd->state for all
> + * outstanding io of all the scsi_devs
> + * This only allows ONLINE->MARGINAL and MARGINAL->ONLINE

The above comments are not needed since not counting comments the code is
almost the same number of lines as the comments. Plus the comments below say
the same thing. And the functions/fields/variables you are using below are
pretty clear in what they are doing.
[Munendra] Agreed.
> +		if (rport->port_state == FC_PORTSTATE_ONLINE) {
> +			rport->port_state = port_state;
> +			scsi_target_chg_noretries_abort(&rport->dev, 1);
> +		}

Should this return a failure if port_state is not online?
[Muneendra]Agreed. Thanks for pointing it .I will add these changes in my
next version.


Regards,
Muneendra.

[-- Attachment #2: S/MIME Cryptographic Signature --]
[-- Type: application/pkcs7-signature, Size: 4177 bytes --]

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v3 05/17] scsi_transport_fc: Added a new rport state FC_PORTSTATE_MARGINAL
  2020-10-19 10:47     ` Muneendra Kumar M
@ 2020-10-19 16:10       ` Michael Christie
  2020-10-19 16:19         ` Michael Christie
  0 siblings, 1 reply; 46+ messages in thread
From: Michael Christie @ 2020-10-19 16:10 UTC (permalink / raw)
  To: Muneendra Kumar M; +Cc: linux-scsi, Hannes Reinecke, jsmart2021, emilne, mkumar



> On Oct 19, 2020, at 5:47 AM, Muneendra Kumar M <muneendra.kumar@broadcom.com> wrote:
> 
> Hi Mike,
> Thanks for the review.
> 
>>> +fc_remote_port_chkready(struct fc_rport *rport, struct scsi_cmnd
>>> +*cmd)
>>> {
>>> 	int result;
>>> 
>>> 	switch (rport->port_state) {
>>> 	case FC_PORTSTATE_ONLINE:
>>> +	case FC_PORTSTATE_MARGINAL:
>>> 		if (rport->roles & FC_PORT_ROLE_FCP_TARGET)
>>> 			result = 0;
>>> 		else if (rport->flags & FC_RPORT_DEVLOSS_PENDING)
>>>  			result = DID_IMM_RETRY << 16;
>>>  		else
>>>  			result = DID_NO_CONNECT << 16;
>>> +
>>> +		if (cmd)
>>> +			fc_rport_chkmarginal_set_noretries(rport, cmd);
> 
>> I was just wondering why don't you do drop all the SCMD_NORETRIES_ABORT
>> code and then in this function when you check for the marginal state do:
> 
>> result = DID_TRANSPORT_MARGINAL;
> 
> ?
> [Muneendra] Correct me if my understanding is correct?
> You mean to say ,if the SCMD_NORETRIES_ABORT is set end up failing the IO
> with DID_TRANSPORT_MARGINAL instead of retry?

Sort of. I was saying to just drop the SCMD_NORETRIES_ABORT related code
and when you detect the port state here fail the IO.

> I assume that your below point 3 talks the same.
> Let me know if this is correct.
> 
>> So you would do:
> 
>> 1. Userspace calls in scsi_transport_fc and sets the port state to
>> marginal.
> 2. New commands would see the above check and complete the command with
> DID_TRANSPORT_MARGINAL.
> [Muneendra]Correct me if my understanding is right.
> You mean to say if the port_state is set to marginal return all the new
> commands with DID_TRANSPORT_MARGINAL  (without checking
> SCMD_NORETRIES_ABORT)?

Yes.

> 
> If yes then below are my concerns
> In marginal state  what we want is the ios to be attempted at least once .
> 
> In marginal state we cannot drop all the commands as one of the concern we
> have is if a target is non-mpio based targets then we will be dropping it
> without any attempt.
> With this we will be even dropping the TUR commands also which unnecessarily
> lead to error handling.
> 

I’m a little confused. In the 0/17 email you wrote:

-----

This feature is intended to be used when the device is part of a multipath
environment. When the service detects the poor connectivity, the multipath
path can be placed in a marginal path group and ignored further io
operations.

After placing a path in the marginal path group,the daemon sets the
port_state to Marginal which sets bit in scmd->state for all the

——

So it sounded like this is would only be used for mpio enabled paths.

Is this the daemon you mention in the 0/17 email:

https://github.com/brocade/bsn-fc-txptd

?

I might be completely misunderstanding you though. If the above link
is for your daemon then going off the 0/17 email and the GitHub README it
sounds like you are doing:

1. bsn-fc-txptd gets ELS, and moves affected paths to new marginal path group.
2. The non-marginal paths stay in the active path group and are continued to
be used like normal.
3. The paths in the marginal path group are not used until we get link up
or another ELS that indicates the paths are ok.

For these outstanding IOs on marginal paths, it sounds like you are flushing
the existing IO on them.mOnce they complete they will be in the marginal group
and not be used until #3.

So it’s not clear to me if you know the path is not optimal and might hit
a timeout, and you are not going to use it once the existing IO completes why
even try to send it? I mean in this setup, new commands that are entering
the dm-multipath layer will not be sent to these marginal paths right?

Also for the TUR comment, what TUR do you mean exactly? From the SCSI EH?
 





^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v3 05/17] scsi_transport_fc: Added a new rport state FC_PORTSTATE_MARGINAL
  2020-10-19 16:10       ` Michael Christie
@ 2020-10-19 16:19         ` Michael Christie
  2020-10-19 17:16           ` Hannes Reinecke
  0 siblings, 1 reply; 46+ messages in thread
From: Michael Christie @ 2020-10-19 16:19 UTC (permalink / raw)
  To: Muneendra Kumar M; +Cc: linux-scsi, Hannes Reinecke, jsmart2021, emilne, mkumar



> On Oct 19, 2020, at 11:10 AM, Michael Christie <michael.christie@oracle.com> wrote:
> 
> 
> So it’s not clear to me if you know the path is not optimal and might hit
> a timeout, and you are not going to use it once the existing IO completes why
> even try to send it? I mean in this setup, new commands that are entering
> the dm-multipath layer will not be sent to these marginal paths right?


Oh yeah, to be clear I meant why try to send it on the marginal path when you are
setting up the path groups so they are not used and only the optimal paths are used.
When the driver/scsi layer fails the IO then the multipath layer will make sure it
goes on a optimal path right so you do not have to worry about hitting a cmd timeout
and firing off the scsi eh.

However, one other question I had though, is are you setting up multipathd so the
marginal paths are used if the optimal ones were to fail (like the optimal paths hit a
link down, dev_loss_tmo or fast_io_fail fires, etc) or will they be treated
like failed paths?

So could you end up with 3 groups:

1. Active optimal paths
2. Marginal
3. failed

If the paths in 1 move to 3, then does multipathd handle it like a all paths down
or does multipathd switch to #2?


^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v3 05/17] scsi_transport_fc: Added a new rport state FC_PORTSTATE_MARGINAL
  2020-10-19 16:19         ` Michael Christie
@ 2020-10-19 17:16           ` Hannes Reinecke
  2020-10-19 17:31             ` Muneendra Kumar M
  2020-10-19 18:03             ` Muneendra Kumar M
  0 siblings, 2 replies; 46+ messages in thread
From: Hannes Reinecke @ 2020-10-19 17:16 UTC (permalink / raw)
  To: Michael Christie, Muneendra Kumar M
  Cc: linux-scsi, jsmart2021, emilne, mkumar

On 10/19/20 6:19 PM, Michael Christie wrote:
> 
> 
>> On Oct 19, 2020, at 11:10 AM, Michael Christie <michael.christie@oracle.com> wrote:
>>
>>
>> So it’s not clear to me if you know the path is not optimal and might hit
>> a timeout, and you are not going to use it once the existing IO completes why
>> even try to send it? I mean in this setup, new commands that are entering
>> the dm-multipath layer will not be sent to these marginal paths right?
> 
> 
> Oh yeah, to be clear I meant why try to send it on the marginal path when you are
> setting up the path groups so they are not used and only the optimal paths are used.
> When the driver/scsi layer fails the IO then the multipath layer will make sure it
> goes on a optimal path right so you do not have to worry about hitting a cmd timeout
> and firing off the scsi eh.
> 
> However, one other question I had though, is are you setting up multipathd so the
> marginal paths are used if the optimal ones were to fail (like the optimal paths hit a
> link down, dev_loss_tmo or fast_io_fail fires, etc) or will they be treated
> like failed paths?
> 
> So could you end up with 3 groups:
> 
> 1. Active optimal paths
> 2. Marginal
> 3. failed
> 
> If the paths in 1 move to 3, then does multipathd handle it like a all paths down
> or does multipathd switch to #2?
> 
Actually, marginal path work similar to the ALUA non-optimized state.
Yes, the system can sent I/O to it, but it'd be preferable for the I/O 
to be moved somewhere else.
If there is no other path (or no better path), yeah, tough.

Hence the answer would be 2)

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                Kernel Storage Architect
hare@suse.de                              +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), Geschäftsführer: Felix Imendörffer

^ permalink raw reply	[flat|nested] 46+ messages in thread

* RE: [PATCH v3 05/17] scsi_transport_fc: Added a new rport state FC_PORTSTATE_MARGINAL
  2020-10-19 17:16           ` Hannes Reinecke
@ 2020-10-19 17:31             ` Muneendra Kumar M
  2020-10-19 18:55               ` Mike Christie
  2020-10-19 18:03             ` Muneendra Kumar M
  1 sibling, 1 reply; 46+ messages in thread
From: Muneendra Kumar M @ 2020-10-19 17:31 UTC (permalink / raw)
  To: Hannes Reinecke, Michael Christie; +Cc: linux-scsi, jsmart2021, emilne, mkumar

[-- Attachment #1: Type: text/plain, Size: 1307 bytes --]

Hi Michael,


>
>
> Oh yeah, to be clear I meant why try to send it on the marginal path
> when you are setting up the path groups so they are not used and only the
> optimal paths are used.
> When the driver/scsi layer fails the IO then the multipath layer will
> make sure it goes on a optimal path right so you do not have to worry
> about hitting a cmd timeout and firing off the scsi eh.
>
> However, one other question I had though, is are you setting up
> multipathd so the marginal paths are used if the optimal ones were to
> fail (like the optimal paths hit a link down, dev_loss_tmo or
> fast_io_fail fires, etc) or will they be treated like failed paths?
>
> So could you end up with 3 groups:
>
> 1. Active optimal paths
> 2. Marginal
> 3. failed
>
> If the paths in 1 move to 3, then does multipathd handle it like a all
> paths down or does multipathd switch to #2?
>
>Actually, marginal path work similar to the ALUA non-optimized state.
>Yes, the system can sent I/O to it, but it'd be preferable for the I/O to
>be moved somewhere else.
>If there is no other path (or no better path), yeah, tough.

>Hence the answer would be 2)


[Muneendra]As Hannes mentioned if there are no active paths, the marginal
paths will be moved to normal and the system will send the io.

Regards,
Muneendra.

[-- Attachment #2: S/MIME Cryptographic Signature --]
[-- Type: application/pkcs7-signature, Size: 4177 bytes --]

^ permalink raw reply	[flat|nested] 46+ messages in thread

* RE: [PATCH v3 05/17] scsi_transport_fc: Added a new rport state FC_PORTSTATE_MARGINAL
  2020-10-19 17:16           ` Hannes Reinecke
  2020-10-19 17:31             ` Muneendra Kumar M
@ 2020-10-19 18:03             ` Muneendra Kumar M
  2020-10-19 18:44               ` Mike Christie
  1 sibling, 1 reply; 46+ messages in thread
From: Muneendra Kumar M @ 2020-10-19 18:03 UTC (permalink / raw)
  To: Hannes Reinecke, Michael Christie; +Cc: linux-scsi, jsmart2021, emilne, mkumar

[-- Attachment #1: Type: text/plain, Size: 2220 bytes --]

Hi Michael,
Regarding the TUR (Test Unit Ready)command which I was mentioning .
Multipath daemon issues TUR commands on a regular intervals to check the
path status.
When a port_state is set to marginal we are not suppose to end up failing
the cmd  with DID_TRANSPORT_MARGINAL with out proceeding it.
This may  leads to give wrong health status.
Hannes/James Correct me if this is wrong.

Regards,
Muneendra.

-----Original Message-----
From: Muneendra Kumar M [mailto:muneendra.kumar@broadcom.com]
Sent: Monday, October 19, 2020 11:01 PM
To: 'Hannes Reinecke' <hare@suse.de>; 'Michael Christie'
<michael.christie@oracle.com>
Cc: 'linux-scsi@vger.kernel.org' <linux-scsi@vger.kernel.org>;
'jsmart2021@gmail.com' <jsmart2021@gmail.com>; 'emilne@redhat.com'
<emilne@redhat.com>; 'mkumar@redhat.com' <mkumar@redhat.com>
Subject: RE: [PATCH v3 05/17] scsi_transport_fc: Added a new rport state
FC_PORTSTATE_MARGINAL

Hi Michael,


>
>
> Oh yeah, to be clear I meant why try to send it on the marginal path
> when you are setting up the path groups so they are not used and only the
> optimal paths are used.
> When the driver/scsi layer fails the IO then the multipath layer will
> make sure it goes on a optimal path right so you do not have to worry
> about hitting a cmd timeout and firing off the scsi eh.
>
> However, one other question I had though, is are you setting up
> multipathd so the marginal paths are used if the optimal ones were to
> fail (like the optimal paths hit a link down, dev_loss_tmo or
> fast_io_fail fires, etc) or will they be treated like failed paths?
>
> So could you end up with 3 groups:
>
> 1. Active optimal paths
> 2. Marginal
> 3. failed
>
> If the paths in 1 move to 3, then does multipathd handle it like a all
> paths down or does multipathd switch to #2?
>
>Actually, marginal path work similar to the ALUA non-optimized state.
>Yes, the system can sent I/O to it, but it'd be preferable for the I/O to
>be moved somewhere else.
>If there is no other path (or no better path), yeah, tough.

>Hence the answer would be 2)


[Muneendra]As Hannes mentioned if there are no active paths, the marginal
paths will be moved to normal and the system will send the io.

Regards,
Muneendra.

[-- Attachment #2: S/MIME Cryptographic Signature --]
[-- Type: application/pkcs7-signature, Size: 4177 bytes --]

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v3 05/17] scsi_transport_fc: Added a new rport state FC_PORTSTATE_MARGINAL
  2020-10-19 18:03             ` Muneendra Kumar M
@ 2020-10-19 18:44               ` Mike Christie
  2020-10-19 18:55                 ` Muneendra Kumar M
  0 siblings, 1 reply; 46+ messages in thread
From: Mike Christie @ 2020-10-19 18:44 UTC (permalink / raw)
  To: Muneendra Kumar M, Hannes Reinecke; +Cc: linux-scsi, jsmart2021, emilne, mkumar

On 10/19/20 1:03 PM, Muneendra Kumar M wrote:
> Hi Michael,
> Regarding the TUR (Test Unit Ready)command which I was mentioning .
> Multipath daemon issues TUR commands on a regular intervals to check the
> path status.
> When a port_state is set to marginal we are not suppose to end up failing
> the cmd  with DID_TRANSPORT_MARGINAL with out proceeding it.
> This may  leads to give wrong health status.


If your daemon works such that you only move paths from marginal to 
active if you get an ELS indicating the path is ok or you get a link up, 
then why have multipathd send path tester IO to the paths in the 
marginal path group? They do not do anything do they?



> Hannes/James Correct me if this is wrong.
> 
> Regards,
> Muneendra.
> 
> -----Original Message-----
> From: Muneendra Kumar M [mailto:muneendra.kumar@broadcom.com]
> Sent: Monday, October 19, 2020 11:01 PM
> To: 'Hannes Reinecke' <hare@suse.de>; 'Michael Christie'
> <michael.christie@oracle.com>
> Cc: 'linux-scsi@vger.kernel.org' <linux-scsi@vger.kernel.org>;
> 'jsmart2021@gmail.com' <jsmart2021@gmail.com>; 'emilne@redhat.com'
> <emilne@redhat.com>; 'mkumar@redhat.com' <mkumar@redhat.com>
> Subject: RE: [PATCH v3 05/17] scsi_transport_fc: Added a new rport state
> FC_PORTSTATE_MARGINAL
> 
> Hi Michael,
> 
> 
>>
>>
>> Oh yeah, to be clear I meant why try to send it on the marginal path
>> when you are setting up the path groups so they are not used and only the
>> optimal paths are used.
>> When the driver/scsi layer fails the IO then the multipath layer will
>> make sure it goes on a optimal path right so you do not have to worry
>> about hitting a cmd timeout and firing off the scsi eh.
>>
>> However, one other question I had though, is are you setting up
>> multipathd so the marginal paths are used if the optimal ones were to
>> fail (like the optimal paths hit a link down, dev_loss_tmo or
>> fast_io_fail fires, etc) or will they be treated like failed paths?
>>
>> So could you end up with 3 groups:
>>
>> 1. Active optimal paths
>> 2. Marginal
>> 3. failed
>>
>> If the paths in 1 move to 3, then does multipathd handle it like a all
>> paths down or does multipathd switch to #2?
>>
>> Actually, marginal path work similar to the ALUA non-optimized state.
>> Yes, the system can sent I/O to it, but it'd be preferable for the I/O to
>> be moved somewhere else.
>> If there is no other path (or no better path), yeah, tough.
> 
>> Hence the answer would be 2)
> 
> 
> [Muneendra]As Hannes mentioned if there are no active paths, the marginal
> paths will be moved to normal and the system will send the io.
> 
> Regards,
> Muneendra.
> 


^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v3 05/17] scsi_transport_fc: Added a new rport state FC_PORTSTATE_MARGINAL
  2020-10-19 17:31             ` Muneendra Kumar M
@ 2020-10-19 18:55               ` Mike Christie
  2020-10-19 19:03                 ` Muneendra Kumar M
  2020-10-20  6:03                 ` Hannes Reinecke
  0 siblings, 2 replies; 46+ messages in thread
From: Mike Christie @ 2020-10-19 18:55 UTC (permalink / raw)
  To: Muneendra Kumar M, Hannes Reinecke; +Cc: linux-scsi, jsmart2021, emilne, mkumar

On 10/19/20 12:31 PM, Muneendra Kumar M wrote:
> Hi Michael,
> 
> 
>>
>>
>> Oh yeah, to be clear I meant why try to send it on the marginal path
>> when you are setting up the path groups so they are not used and only the
>> optimal paths are used.
>> When the driver/scsi layer fails the IO then the multipath layer will
>> make sure it goes on a optimal path right so you do not have to worry
>> about hitting a cmd timeout and firing off the scsi eh.
>>
>> However, one other question I had though, is are you setting up
>> multipathd so the marginal paths are used if the optimal ones were to
>> fail (like the optimal paths hit a link down, dev_loss_tmo or
>> fast_io_fail fires, etc) or will they be treated like failed paths?
>>
>> So could you end up with 3 groups:
>>
>> 1. Active optimal paths
>> 2. Marginal
>> 3. failed
>>
>> If the paths in 1 move to 3, then does multipathd handle it like a all
>> paths down or does multipathd switch to #2?
>>
>> Actually, marginal path work similar to the ALUA non-optimized state.
>> Yes, the system can sent I/O to it, but it'd be preferable for the I/O to
>> be moved somewhere else.
>> If there is no other path (or no better path), yeah, tough.
> 
>> Hence the answer would be 2)
> 
> 
> [Muneendra]As Hannes mentioned if there are no active paths, the marginal
> paths will be moved to normal and the system will send the io.
What do you mean by normal?

- You don't mean the the fc remote port state will change to online right?

- Do you just mean the the marginal path group will become the active 
group in the dm-multipath layer?

^ permalink raw reply	[flat|nested] 46+ messages in thread

* RE: [PATCH v3 05/17] scsi_transport_fc: Added a new rport state FC_PORTSTATE_MARGINAL
  2020-10-19 18:44               ` Mike Christie
@ 2020-10-19 18:55                 ` Muneendra Kumar M
  2020-10-20 16:48                   ` Mike Christie
  2020-10-20 18:14                   ` Benjamin Marzinski
  0 siblings, 2 replies; 46+ messages in thread
From: Muneendra Kumar M @ 2020-10-19 18:55 UTC (permalink / raw)
  To: Mike Christie, Hannes Reinecke
  Cc: linux-scsi, jsmart2021, emilne, mkumar, Benjamin Marzinski

[-- Attachment #1: Type: text/plain, Size: 3294 bytes --]

Hi Micheal,
AFIK As long as the paths are available irrespective of  the path is moved
to marginal path group or not multipathd  will keep sending the send path
tester IO (TUR) to check the health status.

Benjamin,
Please let me know if iam wrong.

Regards,
Muneendra.


-----Original Message-----
From: Mike Christie [mailto:michael.christie@oracle.com]
Sent: Tuesday, October 20, 2020 12:15 AM
To: Muneendra Kumar M <muneendra.kumar@broadcom.com>; Hannes Reinecke
<hare@suse.de>
Cc: linux-scsi@vger.kernel.org; jsmart2021@gmail.com; emilne@redhat.com;
mkumar@redhat.com
Subject: Re: [PATCH v3 05/17] scsi_transport_fc: Added a new rport state
FC_PORTSTATE_MARGINAL

On 10/19/20 1:03 PM, Muneendra Kumar M wrote:
> Hi Michael,
> Regarding the TUR (Test Unit Ready)command which I was mentioning .
> Multipath daemon issues TUR commands on a regular intervals to check
> the path status.
> When a port_state is set to marginal we are not suppose to end up
> failing the cmd  with DID_TRANSPORT_MARGINAL with out proceeding it.
> This may  leads to give wrong health status.


If your daemon works such that you only move paths from marginal to active
if you get an ELS indicating the path is ok or you get a link up, then why
have multipathd send path tester IO to the paths in the marginal path group?
They do not do anything do they?



> Hannes/James Correct me if this is wrong.
>
> Regards,
> Muneendra.
>
> -----Original Message-----
> From: Muneendra Kumar M [mailto:muneendra.kumar@broadcom.com]
> Sent: Monday, October 19, 2020 11:01 PM
> To: 'Hannes Reinecke' <hare@suse.de>; 'Michael Christie'
> <michael.christie@oracle.com>
> Cc: 'linux-scsi@vger.kernel.org' <linux-scsi@vger.kernel.org>;
> 'jsmart2021@gmail.com' <jsmart2021@gmail.com>; 'emilne@redhat.com'
> <emilne@redhat.com>; 'mkumar@redhat.com' <mkumar@redhat.com>
> Subject: RE: [PATCH v3 05/17] scsi_transport_fc: Added a new rport
> state FC_PORTSTATE_MARGINAL
>
> Hi Michael,
>
>
>>
>>
>> Oh yeah, to be clear I meant why try to send it on the marginal path
>> when you are setting up the path groups so they are not used and only
>> the optimal paths are used.
>> When the driver/scsi layer fails the IO then the multipath layer will
>> make sure it goes on a optimal path right so you do not have to worry
>> about hitting a cmd timeout and firing off the scsi eh.
>>
>> However, one other question I had though, is are you setting up
>> multipathd so the marginal paths are used if the optimal ones were to
>> fail (like the optimal paths hit a link down, dev_loss_tmo or
>> fast_io_fail fires, etc) or will they be treated like failed paths?
>>
>> So could you end up with 3 groups:
>>
>> 1. Active optimal paths
>> 2. Marginal
>> 3. failed
>>
>> If the paths in 1 move to 3, then does multipathd handle it like a
>> all paths down or does multipathd switch to #2?
>>
>> Actually, marginal path work similar to the ALUA non-optimized state.
>> Yes, the system can sent I/O to it, but it'd be preferable for the
>> I/O to be moved somewhere else.
>> If there is no other path (or no better path), yeah, tough.
>
>> Hence the answer would be 2)
>
>
> [Muneendra]As Hannes mentioned if there are no active paths, the
> marginal paths will be moved to normal and the system will send the io.
>
> Regards,
> Muneendra.
>

[-- Attachment #2: S/MIME Cryptographic Signature --]
[-- Type: application/pkcs7-signature, Size: 4177 bytes --]

^ permalink raw reply	[flat|nested] 46+ messages in thread

* RE: [PATCH v3 05/17] scsi_transport_fc: Added a new rport state FC_PORTSTATE_MARGINAL
  2020-10-19 18:55               ` Mike Christie
@ 2020-10-19 19:03                 ` Muneendra Kumar M
  2020-10-20 18:24                   ` Benjamin Marzinski
  2020-10-20  6:03                 ` Hannes Reinecke
  1 sibling, 1 reply; 46+ messages in thread
From: Muneendra Kumar M @ 2020-10-19 19:03 UTC (permalink / raw)
  To: Mike Christie, Hannes Reinecke
  Cc: linux-scsi, jsmart2021, emilne, mkumar, Benjamin Marzinski

[-- Attachment #1: Type: text/plain, Size: 438 bytes --]

HI Michael,

> [Muneendra]As Hannes mentioned if there are no active paths, the
> marginal paths will be moved to normal and the system will send the io.
>What do you mean by normal?

>- You don't mean the the fc remote port state will change to online right?
Fc remote port state will not change to online.

- Do you just mean the the marginal path group will become the active group
in the dm-multipath layer?

Yes

Regards,
Muneendra.

[-- Attachment #2: S/MIME Cryptographic Signature --]
[-- Type: application/pkcs7-signature, Size: 4177 bytes --]

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v3 03/17] scsi: No retries on abort success
  2020-10-15  3:27 ` [PATCH v3 03/17] scsi: No retries on abort success Muneendra
  2020-10-16 18:37   ` Mike Christie
@ 2020-10-19 19:05   ` Mike Christie
  2020-10-19 19:26     ` Muneendra Kumar M
  1 sibling, 1 reply; 46+ messages in thread
From: Mike Christie @ 2020-10-19 19:05 UTC (permalink / raw)
  To: Muneendra, linux-scsi, hare; +Cc: jsmart2021, emilne, mkumar

On 10/14/20 10:27 PM, Muneendra wrote:
>   check_type:
> +	/*
> +	 * Check whether caller has decided not to do retries on
> +	 * abort success by setting the SCMD_NORETRIES_ABORT bit
> +	 */
> +	if ((test_bit(SCMD_NORETRIES_ABORT, &scmd->state)) &&
> +		(scmd->request->cmd_flags & REQ_FAILFAST_TRANSPORT)) {
> +		set_host_byte(scmd, DID_TRANSPORT_MARGINAL);

Hey, one other thing that might be confusing me is this check and the 
naming. The 0/17 email description and the SCMD_NORETRIES_ABORT name 
makes me think we want to only run this if the cmd timedout and we went 
through the SCSI EH TMF operations. However, I think this will end up 
failing other errors with DID_TRANSPORT_MARGINAL right?

Did you want just the the SCSI EH timeout/abort case to hit this or any 
errors that hit this code path?

> +		return 1;
> +	}
> +

^ permalink raw reply	[flat|nested] 46+ messages in thread

* RE: [PATCH v3 03/17] scsi: No retries on abort success
  2020-10-19 19:05   ` Mike Christie
@ 2020-10-19 19:26     ` Muneendra Kumar M
  2020-10-20 19:53       ` Mike Christie
  0 siblings, 1 reply; 46+ messages in thread
From: Muneendra Kumar M @ 2020-10-19 19:26 UTC (permalink / raw)
  To: Mike Christie, linux-scsi, hare; +Cc: jsmart2021, emilne, mkumar

[-- Attachment #1: Type: text/plain, Size: 876 bytes --]

Hi Michael,

>   check_type:
> +	/*
> +	 * Check whether caller has decided not to do retries on
> +	 * abort success by setting the SCMD_NORETRIES_ABORT bit
> +	 */
> +	if ((test_bit(SCMD_NORETRIES_ABORT, &scmd->state)) &&
> +		(scmd->request->cmd_flags & REQ_FAILFAST_TRANSPORT)) {
> +		set_host_byte(scmd, DID_TRANSPORT_MARGINAL);

>Hey, one other thing that might be confusing me is this check and the
>naming. The 0/17 email description and the SCMD_NORETRIES_ABORT name makes
>me think we want to only run this if the cmd timedout and we went >through
>the SCSI EH TMF operations. However, I think this will end up failing other
>errors with DID_TRANSPORT_MARGINAL right?

>Did you want just the the SCSI EH timeout/abort case to hit this or any
>errors that hit this code path?
[Muneendra]At present we want SCSI EH timeout/abort case to hit this.

Regards,
Muneendra.

[-- Attachment #2: S/MIME Cryptographic Signature --]
[-- Type: application/pkcs7-signature, Size: 4177 bytes --]

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v3 05/17] scsi_transport_fc: Added a new rport state FC_PORTSTATE_MARGINAL
  2020-10-19 18:55               ` Mike Christie
  2020-10-19 19:03                 ` Muneendra Kumar M
@ 2020-10-20  6:03                 ` Hannes Reinecke
  1 sibling, 0 replies; 46+ messages in thread
From: Hannes Reinecke @ 2020-10-20  6:03 UTC (permalink / raw)
  To: Mike Christie, Muneendra Kumar M; +Cc: linux-scsi, jsmart2021, emilne, mkumar

On 10/19/20 8:55 PM, Mike Christie wrote:
> On 10/19/20 12:31 PM, Muneendra Kumar M wrote:
>> Hi Michael,
>>
>>
>>>
>>>
>>> Oh yeah, to be clear I meant why try to send it on the marginal path
>>> when you are setting up the path groups so they are not used and only 
>>> the
>>> optimal paths are used.
>>> When the driver/scsi layer fails the IO then the multipath layer will
>>> make sure it goes on a optimal path right so you do not have to worry
>>> about hitting a cmd timeout and firing off the scsi eh.
>>>
>>> However, one other question I had though, is are you setting up
>>> multipathd so the marginal paths are used if the optimal ones were to
>>> fail (like the optimal paths hit a link down, dev_loss_tmo or
>>> fast_io_fail fires, etc) or will they be treated like failed paths?
>>>
>>> So could you end up with 3 groups:
>>>
>>> 1. Active optimal paths
>>> 2. Marginal
>>> 3. failed
>>>
>>> If the paths in 1 move to 3, then does multipathd handle it like a all
>>> paths down or does multipathd switch to #2?
>>>
>>> Actually, marginal path work similar to the ALUA non-optimized state.
>>> Yes, the system can sent I/O to it, but it'd be preferable for the 
>>> I/O to
>>> be moved somewhere else.
>>> If there is no other path (or no better path), yeah, tough.
>>
>>> Hence the answer would be 2)
>>
>>
>> [Muneendra]As Hannes mentioned if there are no active paths, the marginal
>> paths will be moved to normal and the system will send the io.
> What do you mean by normal?
> 
> - You don't mean the the fc remote port state will change to online right?
> 
> - Do you just mean the the marginal path group will become the active 
> group in the dm-multipath layer?

Actually, the latter is what I had in mind.

The paths should stay in 'marginal' until some admin interaction did 
take place.
That would be either a link reset (ie the fabric has been rescanned due 
to an RSCN event), or an admin resetting the state to 'normal' manually.
The daemons should never be moving the port out of 'marginal'.

So it really just influences the path grouping in multipathd, and 
multipath should switch to the marginal path group if all running paths 
are gone.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                Kernel Storage Architect
hare@suse.de                              +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), Geschäftsführer: Felix Imendörffer

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v3 05/17] scsi_transport_fc: Added a new rport state FC_PORTSTATE_MARGINAL
  2020-10-19 18:55                 ` Muneendra Kumar M
@ 2020-10-20 16:48                   ` Mike Christie
  2020-10-20 17:19                     ` Muneendra Kumar M
  2020-10-20 18:14                   ` Benjamin Marzinski
  1 sibling, 1 reply; 46+ messages in thread
From: Mike Christie @ 2020-10-20 16:48 UTC (permalink / raw)
  To: Muneendra Kumar M, Hannes Reinecke
  Cc: linux-scsi, jsmart2021, emilne, mkumar, Benjamin Marzinski

On 10/19/20 1:55 PM, Muneendra Kumar M wrote:
> Hi Micheal,
> AFIK As long as the paths are available irrespective of  the path is moved
> to marginal path group or not multipathd  will keep sending the send path
> tester IO (TUR) to check the health status.
> 

You can change the multipathd code.


> Benjamin,
> Please let me know if iam wrong >
> Regards,
> Muneendra.
> 
> 
> -----Original Message-----
> From: Mike Christie [mailto:michael.christie@oracle.com]
> Sent: Tuesday, October 20, 2020 12:15 AM
> To: Muneendra Kumar M <muneendra.kumar@broadcom.com>; Hannes Reinecke
> <hare@suse.de>
> Cc: linux-scsi@vger.kernel.org; jsmart2021@gmail.com; emilne@redhat.com;
> mkumar@redhat.com
> Subject: Re: [PATCH v3 05/17] scsi_transport_fc: Added a new rport state
> FC_PORTSTATE_MARGINAL
> 
> On 10/19/20 1:03 PM, Muneendra Kumar M wrote:
>> Hi Michael,
>> Regarding the TUR (Test Unit Ready)command which I was mentioning .
>> Multipath daemon issues TUR commands on a regular intervals to check
>> the path status.
>> When a port_state is set to marginal we are not suppose to end up
>> failing the cmd  with DID_TRANSPORT_MARGINAL with out proceeding it.
>> This may  leads to give wrong health status.
> 
> 
> If your daemon works such that you only move paths from marginal to active
> if you get an ELS indicating the path is ok or you get a link up, then why
> have multipathd send path tester IO to the paths in the marginal path group?
> They do not do anything do they?
> 
> 
> 
>> Hannes/James Correct me if this is wrong.
>>
>> Regards,
>> Muneendra.
>>
>> -----Original Message-----
>> From: Muneendra Kumar M [mailto:muneendra.kumar@broadcom.com]
>> Sent: Monday, October 19, 2020 11:01 PM
>> To: 'Hannes Reinecke' <hare@suse.de>; 'Michael Christie'
>> <michael.christie@oracle.com>
>> Cc: 'linux-scsi@vger.kernel.org' <linux-scsi@vger.kernel.org>;
>> 'jsmart2021@gmail.com' <jsmart2021@gmail.com>; 'emilne@redhat.com'
>> <emilne@redhat.com>; 'mkumar@redhat.com' <mkumar@redhat.com>
>> Subject: RE: [PATCH v3 05/17] scsi_transport_fc: Added a new rport
>> state FC_PORTSTATE_MARGINAL
>>
>> Hi Michael,
>>
>>
>>>
>>>
>>> Oh yeah, to be clear I meant why try to send it on the marginal path
>>> when you are setting up the path groups so they are not used and only
>>> the optimal paths are used.
>>> When the driver/scsi layer fails the IO then the multipath layer will
>>> make sure it goes on a optimal path right so you do not have to worry
>>> about hitting a cmd timeout and firing off the scsi eh.
>>>
>>> However, one other question I had though, is are you setting up
>>> multipathd so the marginal paths are used if the optimal ones were to
>>> fail (like the optimal paths hit a link down, dev_loss_tmo or
>>> fast_io_fail fires, etc) or will they be treated like failed paths?
>>>
>>> So could you end up with 3 groups:
>>>
>>> 1. Active optimal paths
>>> 2. Marginal
>>> 3. failed
>>>
>>> If the paths in 1 move to 3, then does multipathd handle it like a
>>> all paths down or does multipathd switch to #2?
>>>
>>> Actually, marginal path work similar to the ALUA non-optimized state.
>>> Yes, the system can sent I/O to it, but it'd be preferable for the
>>> I/O to be moved somewhere else.
>>> If there is no other path (or no better path), yeah, tough.
>>
>>> Hence the answer would be 2)
>>
>>
>> [Muneendra]As Hannes mentioned if there are no active paths, the
>> marginal paths will be moved to normal and the system will send the io.
>>
>> Regards,
>> Muneendra.
>>


^ permalink raw reply	[flat|nested] 46+ messages in thread

* RE: [PATCH v3 05/17] scsi_transport_fc: Added a new rport state FC_PORTSTATE_MARGINAL
  2020-10-20 16:48                   ` Mike Christie
@ 2020-10-20 17:19                     ` Muneendra Kumar M
  2020-10-20 18:44                       ` Benjamin Marzinski
  0 siblings, 1 reply; 46+ messages in thread
From: Muneendra Kumar M @ 2020-10-20 17:19 UTC (permalink / raw)
  To: Mike Christie, Hannes Reinecke
  Cc: linux-scsi, jsmart2021, emilne, mkumar, Benjamin Marzinski

[-- Attachment #1: Type: text/plain, Size: 3671 bytes --]

HI Michael,

> AFIK As long as the paths are available irrespective of  the path is
> moved to marginal path group or not multipathd  will keep sending the
> send path tester IO (TUR) to check the health status.
>

>You can change the multipathd code.
You mean to say don't send the TUR commands for the devices under marginal
path groups ?

At present the multipathd checks the device state. If the device state is
"running" then the check_path
Will issue a TUR commands at regular intervals to check the path health
status.

Regards,
Muneendra.



> -----Original Message-----
> From: Mike Christie [mailto:michael.christie@oracle.com]
> Sent: Tuesday, October 20, 2020 12:15 AM
> To: Muneendra Kumar M <muneendra.kumar@broadcom.com>; Hannes Reinecke
> <hare@suse.de>
> Cc: linux-scsi@vger.kernel.org; jsmart2021@gmail.com;
> emilne@redhat.com; mkumar@redhat.com
> Subject: Re: [PATCH v3 05/17] scsi_transport_fc: Added a new rport
> state FC_PORTSTATE_MARGINAL
>
> On 10/19/20 1:03 PM, Muneendra Kumar M wrote:
>> Hi Michael,
>> Regarding the TUR (Test Unit Ready)command which I was mentioning .
>> Multipath daemon issues TUR commands on a regular intervals to check
>> the path status.
>> When a port_state is set to marginal we are not suppose to end up
>> failing the cmd  with DID_TRANSPORT_MARGINAL with out proceeding it.
>> This may  leads to give wrong health status.
>
>
> If your daemon works such that you only move paths from marginal to
> active if you get an ELS indicating the path is ok or you get a link
> up, then why have multipathd send path tester IO to the paths in the
> marginal path group?
> They do not do anything do they?
>
>
>
>> Hannes/James Correct me if this is wrong.
>>
>> Regards,
>> Muneendra.
>>
>> -----Original Message-----
>> From: Muneendra Kumar M [mailto:muneendra.kumar@broadcom.com]
>> Sent: Monday, October 19, 2020 11:01 PM
>> To: 'Hannes Reinecke' <hare@suse.de>; 'Michael Christie'
>> <michael.christie@oracle.com>
>> Cc: 'linux-scsi@vger.kernel.org' <linux-scsi@vger.kernel.org>;
>> 'jsmart2021@gmail.com' <jsmart2021@gmail.com>; 'emilne@redhat.com'
>> <emilne@redhat.com>; 'mkumar@redhat.com' <mkumar@redhat.com>
>> Subject: RE: [PATCH v3 05/17] scsi_transport_fc: Added a new rport
>> state FC_PORTSTATE_MARGINAL
>>
>> Hi Michael,
>>
>>
>>>
>>>
>>> Oh yeah, to be clear I meant why try to send it on the marginal path
>>> when you are setting up the path groups so they are not used and
>>> only the optimal paths are used.
>>> When the driver/scsi layer fails the IO then the multipath layer
>>> will make sure it goes on a optimal path right so you do not have to
>>> worry about hitting a cmd timeout and firing off the scsi eh.
>>>
>>> However, one other question I had though, is are you setting up
>>> multipathd so the marginal paths are used if the optimal ones were
>>> to fail (like the optimal paths hit a link down, dev_loss_tmo or
>>> fast_io_fail fires, etc) or will they be treated like failed paths?
>>>
>>> So could you end up with 3 groups:
>>>
>>> 1. Active optimal paths
>>> 2. Marginal
>>> 3. failed
>>>
>>> If the paths in 1 move to 3, then does multipathd handle it like a
>>> all paths down or does multipathd switch to #2?
>>>
>>> Actually, marginal path work similar to the ALUA non-optimized state.
>>> Yes, the system can sent I/O to it, but it'd be preferable for the
>>> I/O to be moved somewhere else.
>>> If there is no other path (or no better path), yeah, tough.
>>
>>> Hence the answer would be 2)
>>
>>
>> [Muneendra]As Hannes mentioned if there are no active paths, the
>> marginal paths will be moved to normal and the system will send the io.
>>
>> Regards,
>> Muneendra.
>>

[-- Attachment #2: S/MIME Cryptographic Signature --]
[-- Type: application/pkcs7-signature, Size: 4177 bytes --]

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v3 05/17] scsi_transport_fc: Added a new rport state FC_PORTSTATE_MARGINAL
  2020-10-19 18:55                 ` Muneendra Kumar M
  2020-10-20 16:48                   ` Mike Christie
@ 2020-10-20 18:14                   ` Benjamin Marzinski
  1 sibling, 0 replies; 46+ messages in thread
From: Benjamin Marzinski @ 2020-10-20 18:14 UTC (permalink / raw)
  To: Muneendra Kumar M
  Cc: Mike Christie, Hannes Reinecke, linux-scsi, jsmart2021, emilne, mkumar

On Tue, Oct 20, 2020 at 12:25:42AM +0530, Muneendra Kumar M wrote:
> Hi Micheal,
> AFIK As long as the paths are available irrespective of  the path is moved
> to marginal path group or not multipathd  will keep sending the send path
> tester IO (TUR) to check the health status.
> 
> Benjamin,
> Please let me know if iam wrong.

You are correct. If a path is part of a multipath device, multipathd
will run periodic checks on it.

-Ben

> 
> Regards,
> Muneendra.
> 
> 
> -----Original Message-----
> From: Mike Christie [mailto:michael.christie@oracle.com]
> Sent: Tuesday, October 20, 2020 12:15 AM
> To: Muneendra Kumar M <muneendra.kumar@broadcom.com>; Hannes Reinecke
> <hare@suse.de>
> Cc: linux-scsi@vger.kernel.org; jsmart2021@gmail.com; emilne@redhat.com;
> mkumar@redhat.com
> Subject: Re: [PATCH v3 05/17] scsi_transport_fc: Added a new rport state
> FC_PORTSTATE_MARGINAL
> 
> On 10/19/20 1:03 PM, Muneendra Kumar M wrote:
> > Hi Michael,
> > Regarding the TUR (Test Unit Ready)command which I was mentioning .
> > Multipath daemon issues TUR commands on a regular intervals to check
> > the path status.
> > When a port_state is set to marginal we are not suppose to end up
> > failing the cmd  with DID_TRANSPORT_MARGINAL with out proceeding it.
> > This may  leads to give wrong health status.
> 
> 
> If your daemon works such that you only move paths from marginal to active
> if you get an ELS indicating the path is ok or you get a link up, then why
> have multipathd send path tester IO to the paths in the marginal path group?
> They do not do anything do they?
> 
> 
> 
> > Hannes/James Correct me if this is wrong.
> >
> > Regards,
> > Muneendra.
> >
> > -----Original Message-----
> > From: Muneendra Kumar M [mailto:muneendra.kumar@broadcom.com]
> > Sent: Monday, October 19, 2020 11:01 PM
> > To: 'Hannes Reinecke' <hare@suse.de>; 'Michael Christie'
> > <michael.christie@oracle.com>
> > Cc: 'linux-scsi@vger.kernel.org' <linux-scsi@vger.kernel.org>;
> > 'jsmart2021@gmail.com' <jsmart2021@gmail.com>; 'emilne@redhat.com'
> > <emilne@redhat.com>; 'mkumar@redhat.com' <mkumar@redhat.com>
> > Subject: RE: [PATCH v3 05/17] scsi_transport_fc: Added a new rport
> > state FC_PORTSTATE_MARGINAL
> >
> > Hi Michael,
> >
> >
> >>
> >>
> >> Oh yeah, to be clear I meant why try to send it on the marginal path
> >> when you are setting up the path groups so they are not used and only
> >> the optimal paths are used.
> >> When the driver/scsi layer fails the IO then the multipath layer will
> >> make sure it goes on a optimal path right so you do not have to worry
> >> about hitting a cmd timeout and firing off the scsi eh.
> >>
> >> However, one other question I had though, is are you setting up
> >> multipathd so the marginal paths are used if the optimal ones were to
> >> fail (like the optimal paths hit a link down, dev_loss_tmo or
> >> fast_io_fail fires, etc) or will they be treated like failed paths?
> >>
> >> So could you end up with 3 groups:
> >>
> >> 1. Active optimal paths
> >> 2. Marginal
> >> 3. failed
> >>
> >> If the paths in 1 move to 3, then does multipathd handle it like a
> >> all paths down or does multipathd switch to #2?
> >>
> >> Actually, marginal path work similar to the ALUA non-optimized state.
> >> Yes, the system can sent I/O to it, but it'd be preferable for the
> >> I/O to be moved somewhere else.
> >> If there is no other path (or no better path), yeah, tough.
> >
> >> Hence the answer would be 2)
> >
> >
> > [Muneendra]As Hannes mentioned if there are no active paths, the
> > marginal paths will be moved to normal and the system will send the io.
> >
> > Regards,
> > Muneendra.
> >



^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v3 05/17] scsi_transport_fc: Added a new rport state FC_PORTSTATE_MARGINAL
  2020-10-19 19:03                 ` Muneendra Kumar M
@ 2020-10-20 18:24                   ` Benjamin Marzinski
  0 siblings, 0 replies; 46+ messages in thread
From: Benjamin Marzinski @ 2020-10-20 18:24 UTC (permalink / raw)
  To: Muneendra Kumar M
  Cc: Mike Christie, Hannes Reinecke, linux-scsi, jsmart2021, emilne, mkumar

On Tue, Oct 20, 2020 at 12:33:39AM +0530, Muneendra Kumar M wrote:
> HI Michael,
> 
> > [Muneendra]As Hannes mentioned if there are no active paths, the
> > marginal paths will be moved to normal and the system will send the io.
> >What do you mean by normal?
> 
> >- You don't mean the the fc remote port state will change to online right?
> Fc remote port state will not change to online.
> 
> - Do you just mean the the marginal path group will become the active group
> in the dm-multipath layer?
> 
> Yes
> 

Correct. The path group is still marginal. Being marginal simply means
that the path group has a lower priority than all the non-marginal path
groups.  However, if there are no usable paths in any non-marginal path
group, then a marginal path group will become the active path group (the
path group that the kernel will send IO to).

-Ben

> Regards,
> Muneendra.



^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v3 05/17] scsi_transport_fc: Added a new rport state FC_PORTSTATE_MARGINAL
  2020-10-20 17:19                     ` Muneendra Kumar M
@ 2020-10-20 18:44                       ` Benjamin Marzinski
  0 siblings, 0 replies; 46+ messages in thread
From: Benjamin Marzinski @ 2020-10-20 18:44 UTC (permalink / raw)
  To: Muneendra Kumar M
  Cc: Mike Christie, Hannes Reinecke, linux-scsi, jsmart2021, emilne, mkumar

On Tue, Oct 20, 2020 at 10:49:56PM +0530, Muneendra Kumar M wrote:
> HI Michael,
> 
> > AFIK As long as the paths are available irrespective of  the path is
> > moved to marginal path group or not multipathd  will keep sending the
> > send path tester IO (TUR) to check the health status.
> >
> 
> >You can change the multipathd code.
> You mean to say don't send the TUR commands for the devices under marginal
> path groups ?

Multipath does need to keep checking the marginal paths. For one thing,
just because a path is marginal, doesn't mean it belongs in the same
path group as every other marginal path. If you have an active/passive
array, then it's quite possible that you will have multiple marginal
path groups. A path's priority is updated when it is checked, and many
devices use this to determine the correct path groups. Also the checker
is what determines if a path is in a ghost (passive) state, which tells
multipath which path group to try next. For another thing, if all the
non-marginal paths fail, then the marginal path group will also be the
active path group, and we definitely want to be checking the paths on
the active path group.

-Ben

> 
> At present the multipathd checks the device state. If the device state is
> "running" then the check_path
> Will issue a TUR commands at regular intervals to check the path health
> status.
> 
> Regards,
> Muneendra.
> 
> 
> 
> > -----Original Message-----
> > From: Mike Christie [mailto:michael.christie@oracle.com]
> > Sent: Tuesday, October 20, 2020 12:15 AM
> > To: Muneendra Kumar M <muneendra.kumar@broadcom.com>; Hannes Reinecke
> > <hare@suse.de>
> > Cc: linux-scsi@vger.kernel.org; jsmart2021@gmail.com;
> > emilne@redhat.com; mkumar@redhat.com
> > Subject: Re: [PATCH v3 05/17] scsi_transport_fc: Added a new rport
> > state FC_PORTSTATE_MARGINAL
> >
> > On 10/19/20 1:03 PM, Muneendra Kumar M wrote:
> >> Hi Michael,
> >> Regarding the TUR (Test Unit Ready)command which I was mentioning .
> >> Multipath daemon issues TUR commands on a regular intervals to check
> >> the path status.
> >> When a port_state is set to marginal we are not suppose to end up
> >> failing the cmd  with DID_TRANSPORT_MARGINAL with out proceeding it.
> >> This may  leads to give wrong health status.
> >
> >
> > If your daemon works such that you only move paths from marginal to
> > active if you get an ELS indicating the path is ok or you get a link
> > up, then why have multipathd send path tester IO to the paths in the
> > marginal path group?
> > They do not do anything do they?
> >
> >
> >
> >> Hannes/James Correct me if this is wrong.
> >>
> >> Regards,
> >> Muneendra.
> >>
> >> -----Original Message-----
> >> From: Muneendra Kumar M [mailto:muneendra.kumar@broadcom.com]
> >> Sent: Monday, October 19, 2020 11:01 PM
> >> To: 'Hannes Reinecke' <hare@suse.de>; 'Michael Christie'
> >> <michael.christie@oracle.com>
> >> Cc: 'linux-scsi@vger.kernel.org' <linux-scsi@vger.kernel.org>;
> >> 'jsmart2021@gmail.com' <jsmart2021@gmail.com>; 'emilne@redhat.com'
> >> <emilne@redhat.com>; 'mkumar@redhat.com' <mkumar@redhat.com>
> >> Subject: RE: [PATCH v3 05/17] scsi_transport_fc: Added a new rport
> >> state FC_PORTSTATE_MARGINAL
> >>
> >> Hi Michael,
> >>
> >>
> >>>
> >>>
> >>> Oh yeah, to be clear I meant why try to send it on the marginal path
> >>> when you are setting up the path groups so they are not used and
> >>> only the optimal paths are used.
> >>> When the driver/scsi layer fails the IO then the multipath layer
> >>> will make sure it goes on a optimal path right so you do not have to
> >>> worry about hitting a cmd timeout and firing off the scsi eh.
> >>>
> >>> However, one other question I had though, is are you setting up
> >>> multipathd so the marginal paths are used if the optimal ones were
> >>> to fail (like the optimal paths hit a link down, dev_loss_tmo or
> >>> fast_io_fail fires, etc) or will they be treated like failed paths?
> >>>
> >>> So could you end up with 3 groups:
> >>>
> >>> 1. Active optimal paths
> >>> 2. Marginal
> >>> 3. failed
> >>>
> >>> If the paths in 1 move to 3, then does multipathd handle it like a
> >>> all paths down or does multipathd switch to #2?
> >>>
> >>> Actually, marginal path work similar to the ALUA non-optimized state.
> >>> Yes, the system can sent I/O to it, but it'd be preferable for the
> >>> I/O to be moved somewhere else.
> >>> If there is no other path (or no better path), yeah, tough.
> >>
> >>> Hence the answer would be 2)
> >>
> >>
> >> [Muneendra]As Hannes mentioned if there are no active paths, the
> >> marginal paths will be moved to normal and the system will send the io.
> >>
> >> Regards,
> >> Muneendra.
> >>



^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v3 03/17] scsi: No retries on abort success
  2020-10-19 19:26     ` Muneendra Kumar M
@ 2020-10-20 19:53       ` Mike Christie
  2020-10-20 20:18         ` Mike Christie
  0 siblings, 1 reply; 46+ messages in thread
From: Mike Christie @ 2020-10-20 19:53 UTC (permalink / raw)
  To: Muneendra Kumar M, linux-scsi, hare; +Cc: jsmart2021, emilne, mkumar

On 10/19/20 2:26 PM, Muneendra Kumar M wrote:
> Hi Michael,
> 
>>   check_type:
>> +	/*
>> +	 * Check whether caller has decided not to do retries on
>> +	 * abort success by setting the SCMD_NORETRIES_ABORT bit
>> +	 */
>> +	if ((test_bit(SCMD_NORETRIES_ABORT, &scmd->state)) &&
>> +		(scmd->request->cmd_flags & REQ_FAILFAST_TRANSPORT)) {
>> +		set_host_byte(scmd, DID_TRANSPORT_MARGINAL);
> 
>> Hey, one other thing that might be confusing me is this check and the
>> naming. The 0/17 email description and the SCMD_NORETRIES_ABORT name makes
>> me think we want to only run this if the cmd timedout and we went >through
>> the SCSI EH TMF operations. However, I think this will end up failing other
>> errors with DID_TRANSPORT_MARGINAL right?
> 
>> Did you want just the the SCSI EH timeout/abort case to hit this or any
>> errors that hit this code path?
> [Muneendra]At present we want SCSI EH timeout/abort case to hit this.

What about adding a new eh callout that the transport classes can implement.
They can check the port state at this time and decide if the cmd should be
retried or failed. It would be similar to fc_eh_timed_out for example.

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v3 03/17] scsi: No retries on abort success
  2020-10-20 19:53       ` Mike Christie
@ 2020-10-20 20:18         ` Mike Christie
  2020-10-21 18:51           ` Muneendra Kumar M
  0 siblings, 1 reply; 46+ messages in thread
From: Mike Christie @ 2020-10-20 20:18 UTC (permalink / raw)
  To: Muneendra Kumar M, linux-scsi, hare; +Cc: jsmart2021, emilne, mkumar

On 10/20/20 2:53 PM, Mike Christie wrote:
> On 10/19/20 2:26 PM, Muneendra Kumar M wrote:
>> Hi Michael,
>>
>>>   check_type:
>>> +	/*
>>> +	 * Check whether caller has decided not to do retries on
>>> +	 * abort success by setting the SCMD_NORETRIES_ABORT bit
>>> +	 */
>>> +	if ((test_bit(SCMD_NORETRIES_ABORT, &scmd->state)) &&
>>> +		(scmd->request->cmd_flags & REQ_FAILFAST_TRANSPORT)) {
>>> +		set_host_byte(scmd, DID_TRANSPORT_MARGINAL);
>>
>>> Hey, one other thing that might be confusing me is this check and the
>>> naming. The 0/17 email description and the SCMD_NORETRIES_ABORT name makes
>>> me think we want to only run this if the cmd timedout and we went >through
>>> the SCSI EH TMF operations. However, I think this will end up failing other
>>> errors with DID_TRANSPORT_MARGINAL right?
>>
>>> Did you want just the the SCSI EH timeout/abort case to hit this or any
>>> errors that hit this code path?
>> [Muneendra]At present we want SCSI EH timeout/abort case to hit this.
> 
> What about adding a new eh callout that the transport classes can implement.
> They can check the port state at this time and decide if the cmd should be
> retried or failed. It would be similar to fc_eh_timed_out for example.
> 

Or, in the fc_eh_timed_out callout set the SCMD_NORETRIES_ABORT if port
state is marginal and it indicates it wants to abort the cmd. In
scsi_noretry_cmd above then you know we got there, because the cmd
timedout and we tried/were going to abort it. We don't need the code to
loop over running commands, the chkready changes, etc.

^ permalink raw reply	[flat|nested] 46+ messages in thread

* RE: [PATCH v3 03/17] scsi: No retries on abort success
  2020-10-20 20:18         ` Mike Christie
@ 2020-10-21 18:51           ` Muneendra Kumar M
  0 siblings, 0 replies; 46+ messages in thread
From: Muneendra Kumar M @ 2020-10-21 18:51 UTC (permalink / raw)
  To: Mike Christie, linux-scsi, hare; +Cc: jsmart2021, emilne, mkumar

[-- Attachment #1: Type: text/plain, Size: 447 bytes --]

Hi Michael,
Thanks for the input.

>Or, in the fc_eh_timed_out callout set the SCMD_NORETRIES_ABORT if port
>state is marginal and it indicates it wants to abort the cmd. In
>scsi_noretry_cmd above then you know we got there, because the cmd timedout
>and we >tried/were going to abort it. We don't need the code to loop over
>running commands, the chkready changes, etc.

Agreed.
I will incorporate all your review comments.

Regards,
Muneendra.

[-- Attachment #2: S/MIME Cryptographic Signature --]
[-- Type: application/pkcs7-signature, Size: 4177 bytes --]

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v3 16/17] scsi:zfcp: Added changes to fc_remote_port_chkready
  2020-10-15  3:27 ` [PATCH v3 16/17] scsi:zfcp: " Muneendra
@ 2020-10-22 16:50   ` Benjamin Block
  0 siblings, 0 replies; 46+ messages in thread
From: Benjamin Block @ 2020-10-22 16:50 UTC (permalink / raw)
  To: Muneendra; +Cc: linux-scsi, hare, jsmart2021, emilne, mkumar, Steffen Maier

Would be good if you could address the driver mails to the driver
maintainers :) I was out of office, but we might still miss it
occasionally.

On Thu, Oct 15, 2020 at 08:57:41AM +0530, Muneendra wrote:
> Added changes to pass a new argument to fc_remote_port_chkready
> 
> Signed-off-by: Muneendra <muneendra.kumar@broadcom.com>
> 
> ---
> v3:
> New Patch
> ---
>  drivers/s390/scsi/zfcp_scsi.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/s390/scsi/zfcp_scsi.c b/drivers/s390/scsi/zfcp_scsi.c
> index d58bf79892f2..732e15e3a839 100644
> --- a/drivers/s390/scsi/zfcp_scsi.c
> +++ b/drivers/s390/scsi/zfcp_scsi.c
> @@ -74,7 +74,7 @@ int zfcp_scsi_queuecommand(struct Scsi_Host *shost, struct scsi_cmnd *scpnt)
>  	scpnt->result = 0;
>  	scpnt->host_scribble = NULL;
>  
> -	scsi_result = fc_remote_port_chkready(rport);
> +	scsi_result = fc_remote_port_chkready(rport, scpnt);
>  	if (unlikely(scsi_result)) {
>  		scpnt->result = scsi_result;
>  		zfcp_dbf_scsi_fail_send(scpnt);
> -- 
> 2.26.2
> 

This change looks fine to me for zfcp.

Reviewed-by: Benjamin Block <bblock@linux.ibm.com>


-- 
Best Regards, Benjamin Block  / Linux on IBM Z Kernel Development / IBM Systems
IBM Deutschland Research & Development GmbH    /    https://www.ibm.com/privacy
Vorsitz. AufsR.: Gregor Pillen         /        Geschäftsführung: Dirk Wittkopp
Sitz der Gesellschaft: Böblingen / Registergericht: AmtsG Stuttgart, HRB 243294

^ permalink raw reply	[flat|nested] 46+ messages in thread

end of thread, other threads:[~2020-10-22 16:50 UTC | newest]

Thread overview: 46+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-10-15  3:27 [PATCH v3 00/17] scsi: Support to handle Intermittent errors Muneendra
2020-10-15  3:27 ` [PATCH v3 01/17] scsi: Added a new definition in scsi_cmnd.h Muneendra
2020-10-15  3:27 ` [PATCH v3 02/17] scsi: Added a new error code in scsi.h Muneendra
2020-10-15  3:27 ` [PATCH v3 03/17] scsi: No retries on abort success Muneendra
2020-10-16 18:37   ` Mike Christie
2020-10-19 19:05   ` Mike Christie
2020-10-19 19:26     ` Muneendra Kumar M
2020-10-20 19:53       ` Mike Christie
2020-10-20 20:18         ` Mike Christie
2020-10-21 18:51           ` Muneendra Kumar M
2020-10-15  3:27 ` [PATCH v3 04/17] scsi: Added routine to set/clear SCMD_NORETRIES_ABORT bit for outstanding io on scsi_dev Muneendra
2020-10-15  3:27 ` [PATCH v3 05/17] scsi_transport_fc: Added a new rport state FC_PORTSTATE_MARGINAL Muneendra
2020-10-16 19:52   ` Mike Christie
2020-10-19 10:47     ` Muneendra Kumar M
2020-10-19 16:10       ` Michael Christie
2020-10-19 16:19         ` Michael Christie
2020-10-19 17:16           ` Hannes Reinecke
2020-10-19 17:31             ` Muneendra Kumar M
2020-10-19 18:55               ` Mike Christie
2020-10-19 19:03                 ` Muneendra Kumar M
2020-10-20 18:24                   ` Benjamin Marzinski
2020-10-20  6:03                 ` Hannes Reinecke
2020-10-19 18:03             ` Muneendra Kumar M
2020-10-19 18:44               ` Mike Christie
2020-10-19 18:55                 ` Muneendra Kumar M
2020-10-20 16:48                   ` Mike Christie
2020-10-20 17:19                     ` Muneendra Kumar M
2020-10-20 18:44                       ` Benjamin Marzinski
2020-10-20 18:14                   ` Benjamin Marzinski
2020-10-15  3:27 ` [PATCH v3 06/17] scsi_transport_fc: Added store fucntionality to set the rport port_state using sysfs Muneendra
2020-10-15 14:05   ` kernel test robot
2020-10-15 14:05     ` kernel test robot
2020-10-16 18:34   ` Mike Christie
2020-10-19 10:52     ` Muneendra Kumar M
2020-10-15  3:27 ` [PATCH v3 07/17] scsi:lpfc: Added changes to fc_remote_port_chkready Muneendra
2020-10-15  3:27 ` [PATCH v3 08/17] scsi:qla2xx: " Muneendra
2020-10-15  3:27 ` [PATCH v3 09/17] scsi:qedf: " Muneendra
2020-10-15  3:27 ` [PATCH v3 10/17] scsi:libfc: " Muneendra
2020-10-15  3:27 ` [PATCH v3 11/17] scsi:ibmvfc: " Muneendra
2020-10-15  3:27 ` [PATCH v3 12/17] scsi:fnic: " Muneendra
2020-10-15  3:27 ` [PATCH v3 13/17] scsi:bnx2fc: " Muneendra
2020-10-15  3:27 ` [PATCH v3 14/17] scsi:csio: " Muneendra
2020-10-15  3:27 ` [PATCH v3 15/17] scsi:bfa: " Muneendra
2020-10-15  3:27 ` [PATCH v3 16/17] scsi:zfcp: " Muneendra
2020-10-22 16:50   ` Benjamin Block
2020-10-15  3:27 ` [PATCH v3 17/17] scsi:mpt: " Muneendra

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.