[PATCH 1/5] Fix to cleanup aborted IO to avoid device being offlined by mid-layer

All of lore.kernel.org
 help / color / mirror / Atom feed

* [PATCH 1/5] Fix to cleanup aborted IO to avoid device being offlined by mid-layer
@ 2016-03-14 18:14 Satish Kharat
  2016-03-14 18:14 ` [PATCH 2/5] Cleanup the I/O that has timed out and is used to issue LUN reset Satish Kharat
                   ` (4 more replies)
  0 siblings, 5 replies; 15+ messages in thread
From: Satish Kharat @ 2016-03-14 18:14 UTC (permalink / raw)
  To: satishkh, linux-scsi

If an I/O times out and an abort issued by host, if the abort is
successful we need to set scsi status as DID_ABORT. Or else the
mid-layer error handler which looks for this error code, will
offline the device. Also if the original I/O is not found in fnic
firmware, we will consider the abort as successful.
Fnic driver version changed from 1.6.0.17a to 1.6.0.19,
version 1.6.0.18 has been skipped

Signed-off-by: Satish Kharat <satishkh@cisco.com>
---
 drivers/scsi/fnic/fnic.h      |  2 +-
 drivers/scsi/fnic/fnic_scsi.c | 32 +++++++++++++++++++++++++++-----
 2 files changed, 28 insertions(+), 6 deletions(-)

diff --git a/drivers/scsi/fnic/fnic.h b/drivers/scsi/fnic/fnic.h
index ce129e5..52a53f8 100644
--- a/drivers/scsi/fnic/fnic.h
+++ b/drivers/scsi/fnic/fnic.h
@@ -39,7 +39,7 @@
 
 #define DRV_NAME		"fnic"
 #define DRV_DESCRIPTION		"Cisco FCoE HBA Driver"
-#define DRV_VERSION		"1.6.0.17a"
+#define DRV_VERSION		"1.6.0.19"
 #define PFX			DRV_NAME ": "
 #define DFX                     DRV_NAME "%d: "
 
diff --git a/drivers/scsi/fnic/fnic_scsi.c b/drivers/scsi/fnic/fnic_scsi.c
index 266b909..b9732b1 100644
--- a/drivers/scsi/fnic/fnic_scsi.c
+++ b/drivers/scsi/fnic/fnic_scsi.c
@@ -1092,6 +1092,11 @@ static void fnic_fcpio_itmf_cmpl_handler(struct fnic *fnic,
 				atomic64_inc(
 					&term_stats->terminate_fw_timeouts);
 			break;
+		case FCPIO_ITMF_REJECTED:
+			FNIC_SCSI_DBG(KERN_INFO, fnic->lport->host,
+				"abort reject recd. id %d\n",
+				(int)(id & FNIC_TAG_MASK));
+			break;
 		case FCPIO_IO_NOT_FOUND:
 			if (CMD_FLAGS(sc) & FNIC_IO_ABTS_ISSUED)
 				atomic64_inc(&abts_stats->abort_io_not_found);
@@ -1112,9 +1117,14 @@ static void fnic_fcpio_itmf_cmpl_handler(struct fnic *fnic,
 			spin_unlock_irqrestore(io_lock, flags);
 			return;
 		}
+
 		CMD_ABTS_STATUS(sc) = hdr_status;
 		CMD_FLAGS(sc) |= FNIC_IO_ABT_TERM_DONE;
 
+		/* If the status is IO not found consider it as success */
+		if (hdr_status == FCPIO_IO_NOT_FOUND)
+			CMD_ABTS_STATUS(sc) = FCPIO_SUCCESS;
+
 		atomic64_dec(&fnic_stats->io_stats.active_ios);
 		if (atomic64_read(&fnic->io_cmpl_skip))
 			atomic64_dec(&fnic->io_cmpl_skip);
@@ -1927,21 +1937,33 @@ int fnic_abort_cmd(struct scsi_cmnd *sc)
 
 	CMD_STATE(sc) = FNIC_IOREQ_ABTS_COMPLETE;
 
+	start_time = io_req->start_time;
 	/*
 	 * firmware completed the abort, check the status,
-	 * free the io_req irrespective of failure or success
+	 * free the io_req if successful. If abort fails,
+	 * Device reset will clean the I/O.
 	 */
-	if (CMD_ABTS_STATUS(sc) != FCPIO_SUCCESS)
+	if (CMD_ABTS_STATUS(sc) == FCPIO_SUCCESS) {
+		CMD_SP(sc) = NULL;
+	}
+	else
+	{
 		ret = FAILED;
-
-	CMD_SP(sc) = NULL;
+		spin_unlock_irqrestore(io_lock, flags);
+		goto fnic_abort_cmd_end;
+	}
 
 	spin_unlock_irqrestore(io_lock, flags);
 
-	start_time = io_req->start_time;
 	fnic_release_ioreq_buf(fnic, io_req, sc);
 	mempool_free(io_req, fnic->io_req_pool);
 
+	if (sc->scsi_done) {
+	/* Call SCSI completion function to complete the IO */
+		sc->result = (DID_ABORT << 16);
+		sc->scsi_done(sc);
+	}
+
 fnic_abort_cmd_end:
 	FNIC_TRACE(fnic_abort_cmd, sc->device->host->host_no,
 		  sc->request->tag, sc,
-- 
2.4.3


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 2/5] Cleanup the I/O that has timed out and is used to issue LUN reset
  2016-03-14 18:14 [PATCH 1/5] Fix to cleanup aborted IO to avoid device being offlined by mid-layer Satish Kharat
@ 2016-03-14 18:14 ` Satish Kharat
  2016-03-14 20:14   ` Ewan Milne
  2016-03-15  9:53   ` Christoph Hellwig
  2016-03-14 18:14 ` [PATCH 3/5] Using rport->dd_data to check rport online instead of rport_lookup Satish Kharat
                   ` (3 subsequent siblings)
  4 siblings, 2 replies; 15+ messages in thread
From: Satish Kharat @ 2016-03-14 18:14 UTC (permalink / raw)
  To: satishkh, linux-scsi

In case of LUN reset, the device reset command is issued with one of
the I/Os that has timed out on that LUN. The change is to also return
this I/O with error status set to DID_RESET.
Fnic driver version changed from 1.6.0.19 to 1.6.0.20

Signed-off-by: Satish Kharat <satishkh@cisco.com>
---
 drivers/scsi/fnic/fnic.h      |  2 +-
 drivers/scsi/fnic/fnic_scsi.c | 35 +++++++++++++++++++++++++++--------
 2 files changed, 28 insertions(+), 9 deletions(-)

diff --git a/drivers/scsi/fnic/fnic.h b/drivers/scsi/fnic/fnic.h
index 52a53f8..1023eae 100644
--- a/drivers/scsi/fnic/fnic.h
+++ b/drivers/scsi/fnic/fnic.h
@@ -39,7 +39,7 @@
 
 #define DRV_NAME		"fnic"
 #define DRV_DESCRIPTION		"Cisco FCoE HBA Driver"
-#define DRV_VERSION		"1.6.0.19"
+#define DRV_VERSION		"1.6.0.20"
 #define PFX			DRV_NAME ": "
 #define DFX                     DRV_NAME "%d: "
 
diff --git a/drivers/scsi/fnic/fnic_scsi.c b/drivers/scsi/fnic/fnic_scsi.c
index b9732b1..a3e0f69 100644
--- a/drivers/scsi/fnic/fnic_scsi.c
+++ b/drivers/scsi/fnic/fnic_scsi.c
@@ -2041,7 +2041,9 @@ lr_io_req_end:
  * successfully aborted, 1 otherwise
  */
 static int fnic_clean_pending_aborts(struct fnic *fnic,
-				     struct scsi_cmnd *lr_sc)
+				     struct scsi_cmnd *lr_sc,
+					 bool new_sc)
+
 {
 	int tag, abt_tag;
 	struct fnic_io_req *io_req;
@@ -2059,10 +2061,10 @@ static int fnic_clean_pending_aborts(struct fnic *fnic,
 		spin_lock_irqsave(io_lock, flags);
 		sc = scsi_host_find_tag(fnic->lport->host, tag);
 		/*
-		 * ignore this lun reset cmd or cmds that do not belong to
-		 * this lun
+		 * ignore this lun reset cmd if issued using new SC
+		 * or cmds that do not belong to this lun
 		 */
-		if (!sc || sc == lr_sc || sc->device != lun_dev) {
+		if (!sc || ((sc == lr_sc) && new_sc) || sc->device != lun_dev) {
 			spin_unlock_irqrestore(io_lock, flags);
 			continue;
 		}
@@ -2168,11 +2170,27 @@ static int fnic_clean_pending_aborts(struct fnic *fnic,
 			goto clean_pending_aborts_end;
 		}
 		CMD_STATE(sc) = FNIC_IOREQ_ABTS_COMPLETE;
-		CMD_SP(sc) = NULL;
+
+		/* original sc used for lr is handled by dev reset code */
+		if (sc != lr_sc)
+			CMD_SP(sc) = NULL;
 		spin_unlock_irqrestore(io_lock, flags);
 
-		fnic_release_ioreq_buf(fnic, io_req, sc);
-		mempool_free(io_req, fnic->io_req_pool);
+		/* original sc used for lr is handled by dev reset code */
+		if (sc != lr_sc) {
+			fnic_release_ioreq_buf(fnic, io_req, sc);
+			mempool_free(io_req, fnic->io_req_pool);
+		}
+
+		/*
+		 * Any IO is returned during reset, it needs to call scsi_done
+		 * to return the scsi_cmnd to upper layer.
+		 */
+		if (sc->scsi_done) {
+			/* Set result to let upper SCSI layer retry */
+			sc->result = DID_RESET << 16;
+			sc->scsi_done(sc);
+		}
 	}
 
 	schedule_timeout(msecs_to_jiffies(2 * fnic->config.ed_tov));
@@ -2266,6 +2284,7 @@ int fnic_device_reset(struct scsi_cmnd *sc)
 	int tag = 0;
 	DECLARE_COMPLETION_ONSTACK(tm_done);
 	int tag_gen_flag = 0;   /*to track tags allocated by fnic driver*/
+	bool new_sc = 0;
 
 	/* Wait for rport to unblock */
 	fc_block_scsi_eh(sc);
@@ -2452,7 +2471,7 @@ int fnic_device_reset(struct scsi_cmnd *sc)
 	 * the lun reset cmd. If all cmds get cleaned, the lun reset
 	 * succeeds
 	 */
-	if (fnic_clean_pending_aborts(fnic, sc)) {
+	if (fnic_clean_pending_aborts(fnic, sc, new_sc)) {
 		spin_lock_irqsave(io_lock, flags);
 		io_req = (struct fnic_io_req *)CMD_SP(sc);
 		FNIC_SCSI_DBG(KERN_DEBUG, fnic->lport->host,
-- 
2.4.3


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 3/5] Using rport->dd_data to check rport online instead of rport_lookup.
  2016-03-14 18:14 [PATCH 1/5] Fix to cleanup aborted IO to avoid device being offlined by mid-layer Satish Kharat
  2016-03-14 18:14 ` [PATCH 2/5] Cleanup the I/O that has timed out and is used to issue LUN reset Satish Kharat
@ 2016-03-14 18:14 ` Satish Kharat
  2016-03-14 20:15   ` Ewan Milne
  2016-03-14 18:14 ` [PATCH 4/5] Setting scsi host template to indicate that fnic does not support multiqueue Satish Kharat
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 15+ messages in thread
From: Satish Kharat @ 2016-03-14 18:14 UTC (permalink / raw)
  To: satishkh, linux-scsi

When issuing I/O we check if rport is online through libfc
rport_lookup() function which needs to be protected by mutex lock
that cannot acquired in I/O context. The change is to use midlayer
remote port’s dd_data which is preserved until its devloss timeout
and no protection is required.

Fnic driver version changed from 1.6.0.20 to 1.6.0.21

Signed-off-by: Satish Kharat <satishkh@cisco.com>
---
 drivers/scsi/fnic/fnic.h      |  2 +-
 drivers/scsi/fnic/fnic_scsi.c | 20 +++++++++++---------
 2 files changed, 12 insertions(+), 10 deletions(-)

diff --git a/drivers/scsi/fnic/fnic.h b/drivers/scsi/fnic/fnic.h
index 1023eae..9ddc920 100644
--- a/drivers/scsi/fnic/fnic.h
+++ b/drivers/scsi/fnic/fnic.h
@@ -39,7 +39,7 @@
 
 #define DRV_NAME		"fnic"
 #define DRV_DESCRIPTION		"Cisco FCoE HBA Driver"
-#define DRV_VERSION		"1.6.0.20"
+#define DRV_VERSION		"1.6.0.21"
 #define PFX			DRV_NAME ": "
 #define DFX                     DRV_NAME "%d: "
 
diff --git a/drivers/scsi/fnic/fnic_scsi.c b/drivers/scsi/fnic/fnic_scsi.c
index a3e0f69..8492143 100644
--- a/drivers/scsi/fnic/fnic_scsi.c
+++ b/drivers/scsi/fnic/fnic_scsi.c
@@ -439,7 +439,6 @@ static int fnic_queuecommand_lck(struct scsi_cmnd *sc, void (*done)(struct scsi_
 	int sg_count = 0;
 	unsigned long flags = 0;
 	unsigned long ptr;
-	struct fc_rport_priv *rdata;
 	spinlock_t *io_lock = NULL;
 	int io_lock_acquired = 0;
 
@@ -455,14 +454,17 @@ static int fnic_queuecommand_lck(struct scsi_cmnd *sc, void (*done)(struct scsi_
 		return 0;
 	}
 
-	rdata = lp->tt.rport_lookup(lp, rport->port_id);
-	if (!rdata || (rdata->rp_state == RPORT_ST_DELETE)) {
-		FNIC_SCSI_DBG(KERN_DEBUG, fnic->lport->host,
-			"returning IO as rport is removed\n");
-		atomic64_inc(&fnic_stats->misc_stats.rport_not_ready);
-		sc->result = DID_NO_CONNECT;
-		done(sc);
-		return 0;
+	if (rport) {
+		struct fc_rport_libfc_priv *rp = rport->dd_data;
+
+		if (!rp || rp->rp_state != RPORT_ST_READY) {
+			FNIC_SCSI_DBG(KERN_DEBUG, fnic->lport->host,
+				"returning DID_NO_CONNECT for IO as rport is removed\n");
+			atomic64_inc(&fnic_stats->misc_stats.rport_not_ready);
+			sc->result = DID_NO_CONNECT;
+			done(sc);
+			return 0;
+		}
 	}
 
 	if (lp->state != LPORT_ST_READY || !(lp->link_up))
-- 
2.4.3

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 4/5] Setting scsi host template to indicate that fnic does not support multiqueue
  2016-03-14 18:14 [PATCH 1/5] Fix to cleanup aborted IO to avoid device being offlined by mid-layer Satish Kharat
  2016-03-14 18:14 ` [PATCH 2/5] Cleanup the I/O that has timed out and is used to issue LUN reset Satish Kharat
  2016-03-14 18:14 ` [PATCH 3/5] Using rport->dd_data to check rport online instead of rport_lookup Satish Kharat
@ 2016-03-14 18:14 ` Satish Kharat
  2016-03-14 20:16   ` Ewan Milne
                     ` (2 more replies)
  2016-03-14 18:14 ` [PATCH 5/5] Leftshift returned scsi_cmnd error code 16 bits Satish Kharat
  2016-03-14 19:59 ` [PATCH 1/5] Fix to cleanup aborted IO to avoid device being offlined by mid-layer Ewan Milne
  4 siblings, 3 replies; 15+ messages in thread
From: Satish Kharat @ 2016-03-14 18:14 UTC (permalink / raw)
  To: satishkh, linux-scsi

Fnic does not support mutiqueue. Setting the scsi_host_template in
fnic_host_template to indicate that.

Changing fnic_version from 1.6.0.21 to 1.6.0.22

Signed-off-by: Satish Kharat <satishkh@cisco.com>
---
 drivers/scsi/fnic/fnic.h      | 2 +-
 drivers/scsi/fnic/fnic_main.c | 1 +
 2 files changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/scsi/fnic/fnic.h b/drivers/scsi/fnic/fnic.h
index 9ddc920..35264c7 100644
--- a/drivers/scsi/fnic/fnic.h
+++ b/drivers/scsi/fnic/fnic.h
@@ -39,7 +39,7 @@
 
 #define DRV_NAME		"fnic"
 #define DRV_DESCRIPTION		"Cisco FCoE HBA Driver"
-#define DRV_VERSION		"1.6.0.21"
+#define DRV_VERSION		"1.6.0.22"
 #define PFX			DRV_NAME ": "
 #define DFX                     DRV_NAME "%d: "
 
diff --git a/drivers/scsi/fnic/fnic_main.c b/drivers/scsi/fnic/fnic_main.c
index 58ce902..fa8a121 100644
--- a/drivers/scsi/fnic/fnic_main.c
+++ b/drivers/scsi/fnic/fnic_main.c
@@ -119,6 +119,7 @@ static struct scsi_host_template fnic_host_template = {
 	.max_sectors = 0xffff,
 	.shost_attrs = fnic_attrs,
 	.track_queue_depth = 1,
+	.disable_blk_mq = 1,
 };
 
 static void
-- 
2.4.3


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 5/5] Leftshift returned scsi_cmnd error code 16 bits
  2016-03-14 18:14 [PATCH 1/5] Fix to cleanup aborted IO to avoid device being offlined by mid-layer Satish Kharat
                   ` (2 preceding siblings ...)
  2016-03-14 18:14 ` [PATCH 4/5] Setting scsi host template to indicate that fnic does not support multiqueue Satish Kharat
@ 2016-03-14 18:14 ` Satish Kharat
  2016-03-14 20:17   ` Ewan Milne
  2016-03-14 19:59 ` [PATCH 1/5] Fix to cleanup aborted IO to avoid device being offlined by mid-layer Ewan Milne
  4 siblings, 1 reply; 15+ messages in thread
From: Satish Kharat @ 2016-03-14 18:14 UTC (permalink / raw)
  To: satishkh, linux-scsi

The the scsi_cmnd error code is expected to be in the left 16 bits
of the result field. Change is to correct this.

Signed-off-by: Satish Kharat <satishkh@cisco.com>
---
 drivers/scsi/fnic/fnic.h      | 2 +-
 drivers/scsi/fnic/fnic_scsi.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/scsi/fnic/fnic.h b/drivers/scsi/fnic/fnic.h
index 35264c7..e4e22f1 100644
--- a/drivers/scsi/fnic/fnic.h
+++ b/drivers/scsi/fnic/fnic.h
@@ -39,7 +39,7 @@
 
 #define DRV_NAME		"fnic"
 #define DRV_DESCRIPTION		"Cisco FCoE HBA Driver"
-#define DRV_VERSION		"1.6.0.22"
+#define DRV_VERSION		"1.6.0.24"
 #define PFX			DRV_NAME ": "
 #define DFX                     DRV_NAME "%d: "
 
diff --git a/drivers/scsi/fnic/fnic_scsi.c b/drivers/scsi/fnic/fnic_scsi.c
index 8492143..92a4ca3 100644
--- a/drivers/scsi/fnic/fnic_scsi.c
+++ b/drivers/scsi/fnic/fnic_scsi.c
@@ -461,7 +461,7 @@ static int fnic_queuecommand_lck(struct scsi_cmnd *sc, void (*done)(struct scsi_
 			FNIC_SCSI_DBG(KERN_DEBUG, fnic->lport->host,
 				"returning DID_NO_CONNECT for IO as rport is removed\n");
 			atomic64_inc(&fnic_stats->misc_stats.rport_not_ready);
-			sc->result = DID_NO_CONNECT;
+			sc->result = DID_NO_CONNECT<<16;
 			done(sc);
 			return 0;
 		}
-- 
2.4.3


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH 1/5] Fix to cleanup aborted IO to avoid device being offlined by mid-layer
  2016-03-14 18:14 [PATCH 1/5] Fix to cleanup aborted IO to avoid device being offlined by mid-layer Satish Kharat
                   ` (3 preceding siblings ...)
  2016-03-14 18:14 ` [PATCH 5/5] Leftshift returned scsi_cmnd error code 16 bits Satish Kharat
@ 2016-03-14 19:59 ` Ewan Milne
  2016-03-16 22:38   ` Satish Kharat (satishkh)
  4 siblings, 1 reply; 15+ messages in thread
From: Ewan Milne @ 2016-03-14 19:59 UTC (permalink / raw)
  To: Satish Kharat; +Cc: linux-scsi

On Mon, 2016-03-14 at 11:14 -0700, Satish Kharat wrote:
> If an I/O times out and an abort issued by host, if the abort is
> successful we need to set scsi status as DID_ABORT. Or else the
> mid-layer error handler which looks for this error code, will
> offline the device. Also if the original I/O is not found in fnic
> firmware, we will consider the abort as successful.
> Fnic driver version changed from 1.6.0.17a to 1.6.0.19,
> version 1.6.0.18 has been skipped
> 
> Signed-off-by: Satish Kharat <satishkh@cisco.com>
> ---
>  drivers/scsi/fnic/fnic.h      |  2 +-
>  drivers/scsi/fnic/fnic_scsi.c | 32 +++++++++++++++++++++++++++-----
>  2 files changed, 28 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/scsi/fnic/fnic.h b/drivers/scsi/fnic/fnic.h
> index ce129e5..52a53f8 100644
> --- a/drivers/scsi/fnic/fnic.h
> +++ b/drivers/scsi/fnic/fnic.h
> @@ -39,7 +39,7 @@
>  
>  #define DRV_NAME		"fnic"
>  #define DRV_DESCRIPTION		"Cisco FCoE HBA Driver"
> -#define DRV_VERSION		"1.6.0.17a"
> +#define DRV_VERSION		"1.6.0.19"
>  #define PFX			DRV_NAME ": "
>  #define DFX                     DRV_NAME "%d: "
>  
> diff --git a/drivers/scsi/fnic/fnic_scsi.c b/drivers/scsi/fnic/fnic_scsi.c
> index 266b909..b9732b1 100644
> --- a/drivers/scsi/fnic/fnic_scsi.c
> +++ b/drivers/scsi/fnic/fnic_scsi.c
> @@ -1092,6 +1092,11 @@ static void fnic_fcpio_itmf_cmpl_handler(struct fnic *fnic,
>  				atomic64_inc(
>  					&term_stats->terminate_fw_timeouts);
>  			break;
> +		case FCPIO_ITMF_REJECTED:
> +			FNIC_SCSI_DBG(KERN_INFO, fnic->lport->host,
> +				"abort reject recd. id %d\n",
> +				(int)(id & FNIC_TAG_MASK));
> +			break;
>  		case FCPIO_IO_NOT_FOUND:
>  			if (CMD_FLAGS(sc) & FNIC_IO_ABTS_ISSUED)
>  				atomic64_inc(&abts_stats->abort_io_not_found);
> @@ -1112,9 +1117,14 @@ static void fnic_fcpio_itmf_cmpl_handler(struct fnic *fnic,
>  			spin_unlock_irqrestore(io_lock, flags);
>  			return;
>  		}
> +
>  		CMD_ABTS_STATUS(sc) = hdr_status;
>  		CMD_FLAGS(sc) |= FNIC_IO_ABT_TERM_DONE;
>  
> +		/* If the status is IO not found consider it as success */
> +		if (hdr_status == FCPIO_IO_NOT_FOUND)
> +			CMD_ABTS_STATUS(sc) = FCPIO_SUCCESS;
> +

Minor nitpick:  I would have done:

		if (hdr_status == FCP_IO_NOT_FOUND)
			CMD_ABTS_STATUS(sc) = FCPIO_SUCCESS;
		else
			CMD_ABTS_STATUS(sc) = hdr_status;

rather than overwriting the assignment made several lines above, it is
more explicit and less prone to errors if the code is later changed.

>  		atomic64_dec(&fnic_stats->io_stats.active_ios);
>  		if (atomic64_read(&fnic->io_cmpl_skip))
>  			atomic64_dec(&fnic->io_cmpl_skip);
> @@ -1927,21 +1937,33 @@ int fnic_abort_cmd(struct scsi_cmnd *sc)
>  
>  	CMD_STATE(sc) = FNIC_IOREQ_ABTS_COMPLETE;
>  
> +	start_time = io_req->start_time;
>  	/*
>  	 * firmware completed the abort, check the status,
> -	 * free the io_req irrespective of failure or success
> +	 * free the io_req if successful. If abort fails,
> +	 * Device reset will clean the I/O.
>  	 */
> -	if (CMD_ABTS_STATUS(sc) != FCPIO_SUCCESS)
> +	if (CMD_ABTS_STATUS(sc) == FCPIO_SUCCESS) {
> +		CMD_SP(sc) = NULL;
> +	}
> +	else
> +	{
>  		ret = FAILED;
> -
> -	CMD_SP(sc) = NULL;
> +		spin_unlock_irqrestore(io_lock, flags);
> +		goto fnic_abort_cmd_end;
> +	}
>  
>  	spin_unlock_irqrestore(io_lock, flags);
>  
> -	start_time = io_req->start_time;

What was the point of moving the assignment of start_time?
Other places in the code still set it just before the call to
fnic_release_ioreq_buf().

>  	fnic_release_ioreq_buf(fnic, io_req, sc);
>  	mempool_free(io_req, fnic->io_req_pool);
>  
> +	if (sc->scsi_done) {
> +	/* Call SCSI completion function to complete the IO */
> +		sc->result = (DID_ABORT << 16);
> +		sc->scsi_done(sc);
> +	}
> +
>  fnic_abort_cmd_end:
>  	FNIC_TRACE(fnic_abort_cmd, sc->device->host->host_no,
>  		  sc->request->tag, sc,

Despite the above comments, I guess this looks OK.

Reviewed-by: Ewan D. Milne <emilne@redhat.com>



^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 2/5] Cleanup the I/O that has timed out and is used to issue LUN reset
  2016-03-14 18:14 ` [PATCH 2/5] Cleanup the I/O that has timed out and is used to issue LUN reset Satish Kharat
@ 2016-03-14 20:14   ` Ewan Milne
  2016-03-15  9:53   ` Christoph Hellwig
  1 sibling, 0 replies; 15+ messages in thread
From: Ewan Milne @ 2016-03-14 20:14 UTC (permalink / raw)
  To: Satish Kharat; +Cc: linux-scsi

On Mon, 2016-03-14 at 11:14 -0700, Satish Kharat wrote:
> In case of LUN reset, the device reset command is issued with one of
> the I/Os that has timed out on that LUN. The change is to also return
> this I/O with error status set to DID_RESET.
> Fnic driver version changed from 1.6.0.19 to 1.6.0.20
> 
> Signed-off-by: Satish Kharat <satishkh@cisco.com>
> ---
>  drivers/scsi/fnic/fnic.h      |  2 +-
>  drivers/scsi/fnic/fnic_scsi.c | 35 +++++++++++++++++++++++++++--------
>  2 files changed, 28 insertions(+), 9 deletions(-)
> 
> diff --git a/drivers/scsi/fnic/fnic.h b/drivers/scsi/fnic/fnic.h
> index 52a53f8..1023eae 100644
> --- a/drivers/scsi/fnic/fnic.h
> +++ b/drivers/scsi/fnic/fnic.h
> @@ -39,7 +39,7 @@
>  
>  #define DRV_NAME		"fnic"
>  #define DRV_DESCRIPTION		"Cisco FCoE HBA Driver"
> -#define DRV_VERSION		"1.6.0.19"
> +#define DRV_VERSION		"1.6.0.20"
>  #define PFX			DRV_NAME ": "
>  #define DFX                     DRV_NAME "%d: "
>  
> diff --git a/drivers/scsi/fnic/fnic_scsi.c b/drivers/scsi/fnic/fnic_scsi.c
> index b9732b1..a3e0f69 100644
> --- a/drivers/scsi/fnic/fnic_scsi.c
> +++ b/drivers/scsi/fnic/fnic_scsi.c
> @@ -2041,7 +2041,9 @@ lr_io_req_end:
>   * successfully aborted, 1 otherwise
>   */
>  static int fnic_clean_pending_aborts(struct fnic *fnic,
> -				     struct scsi_cmnd *lr_sc)
> +				     struct scsi_cmnd *lr_sc,
> +					 bool new_sc)
> +

I don't see that this or any of the subsequent patches do anything with
this argument except pass 0 to it.  So why add it?

>  {
>  	int tag, abt_tag;
>  	struct fnic_io_req *io_req;
> @@ -2059,10 +2061,10 @@ static int fnic_clean_pending_aborts(struct fnic *fnic,
>  		spin_lock_irqsave(io_lock, flags);
>  		sc = scsi_host_find_tag(fnic->lport->host, tag);
>  		/*
> -		 * ignore this lun reset cmd or cmds that do not belong to
> -		 * this lun
> +		 * ignore this lun reset cmd if issued using new SC
> +		 * or cmds that do not belong to this lun
>  		 */
> -		if (!sc || sc == lr_sc || sc->device != lun_dev) {
> +		if (!sc || ((sc == lr_sc) && new_sc) || sc->device != lun_dev) {
>  			spin_unlock_irqrestore(io_lock, flags);
>  			continue;
>  		}
> @@ -2168,11 +2170,27 @@ static int fnic_clean_pending_aborts(struct fnic *fnic,
>  			goto clean_pending_aborts_end;
>  		}
>  		CMD_STATE(sc) = FNIC_IOREQ_ABTS_COMPLETE;
> -		CMD_SP(sc) = NULL;
> +
> +		/* original sc used for lr is handled by dev reset code */
> +		if (sc != lr_sc)
> +			CMD_SP(sc) = NULL;
>  		spin_unlock_irqrestore(io_lock, flags);
>  
> -		fnic_release_ioreq_buf(fnic, io_req, sc);
> -		mempool_free(io_req, fnic->io_req_pool);
> +		/* original sc used for lr is handled by dev reset code */
> +		if (sc != lr_sc) {
> +			fnic_release_ioreq_buf(fnic, io_req, sc);
> +			mempool_free(io_req, fnic->io_req_pool);
> +		}
> +
> +		/*
> +		 * Any IO is returned during reset, it needs to call scsi_done
> +		 * to return the scsi_cmnd to upper layer.
> +		 */
> +		if (sc->scsi_done) {
> +			/* Set result to let upper SCSI layer retry */
> +			sc->result = DID_RESET << 16;
> +			sc->scsi_done(sc);
> +		}
>  	}
>  
>  	schedule_timeout(msecs_to_jiffies(2 * fnic->config.ed_tov));
> @@ -2266,6 +2284,7 @@ int fnic_device_reset(struct scsi_cmnd *sc)
>  	int tag = 0;
>  	DECLARE_COMPLETION_ONSTACK(tm_done);
>  	int tag_gen_flag = 0;   /*to track tags allocated by fnic driver*/
> +	bool new_sc = 0;
>  
>  	/* Wait for rport to unblock */
>  	fc_block_scsi_eh(sc);
> @@ -2452,7 +2471,7 @@ int fnic_device_reset(struct scsi_cmnd *sc)
>  	 * the lun reset cmd. If all cmds get cleaned, the lun reset
>  	 * succeeds
>  	 */
> -	if (fnic_clean_pending_aborts(fnic, sc)) {
> +	if (fnic_clean_pending_aborts(fnic, sc, new_sc)) {
>  		spin_lock_irqsave(io_lock, flags);
>  		io_req = (struct fnic_io_req *)CMD_SP(sc);
>  		FNIC_SCSI_DBG(KERN_DEBUG, fnic->lport->host,

Reviewed-by: Ewan D. Milne <emilne@redhat.com>



^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 3/5] Using rport->dd_data to check rport online instead of rport_lookup.
  2016-03-14 18:14 ` [PATCH 3/5] Using rport->dd_data to check rport online instead of rport_lookup Satish Kharat
@ 2016-03-14 20:15   ` Ewan Milne
  0 siblings, 0 replies; 15+ messages in thread
From: Ewan Milne @ 2016-03-14 20:15 UTC (permalink / raw)
  To: Satish Kharat; +Cc: linux-scsi

On Mon, 2016-03-14 at 11:14 -0700, Satish Kharat wrote:
> When issuing I/O we check if rport is online through libfc
> rport_lookup() function which needs to be protected by mutex lock
> that cannot acquired in I/O context. The change is to use midlayer
> remote port’s dd_data which is preserved until its devloss timeout
> and no protection is required.
> 
> Fnic driver version changed from 1.6.0.20 to 1.6.0.21
> 
> Signed-off-by: Satish Kharat <satishkh@cisco.com>
> ---
>  drivers/scsi/fnic/fnic.h      |  2 +-
>  drivers/scsi/fnic/fnic_scsi.c | 20 +++++++++++---------
>  2 files changed, 12 insertions(+), 10 deletions(-)
> 
> diff --git a/drivers/scsi/fnic/fnic.h b/drivers/scsi/fnic/fnic.h
> index 1023eae..9ddc920 100644
> --- a/drivers/scsi/fnic/fnic.h
> +++ b/drivers/scsi/fnic/fnic.h
> @@ -39,7 +39,7 @@
>  
>  #define DRV_NAME		"fnic"
>  #define DRV_DESCRIPTION		"Cisco FCoE HBA Driver"
> -#define DRV_VERSION		"1.6.0.20"
> +#define DRV_VERSION		"1.6.0.21"
>  #define PFX			DRV_NAME ": "
>  #define DFX                     DRV_NAME "%d: "
>  
> diff --git a/drivers/scsi/fnic/fnic_scsi.c b/drivers/scsi/fnic/fnic_scsi.c
> index a3e0f69..8492143 100644
> --- a/drivers/scsi/fnic/fnic_scsi.c
> +++ b/drivers/scsi/fnic/fnic_scsi.c
> @@ -439,7 +439,6 @@ static int fnic_queuecommand_lck(struct scsi_cmnd *sc, void (*done)(struct scsi_
>  	int sg_count = 0;
>  	unsigned long flags = 0;
>  	unsigned long ptr;
> -	struct fc_rport_priv *rdata;
>  	spinlock_t *io_lock = NULL;
>  	int io_lock_acquired = 0;
>  
> @@ -455,14 +454,17 @@ static int fnic_queuecommand_lck(struct scsi_cmnd *sc, void (*done)(struct scsi_
>  		return 0;
>  	}
>  
> -	rdata = lp->tt.rport_lookup(lp, rport->port_id);
> -	if (!rdata || (rdata->rp_state == RPORT_ST_DELETE)) {
> -		FNIC_SCSI_DBG(KERN_DEBUG, fnic->lport->host,
> -			"returning IO as rport is removed\n");
> -		atomic64_inc(&fnic_stats->misc_stats.rport_not_ready);
> -		sc->result = DID_NO_CONNECT;

You fixed this in patch 5/5 to << 16, might as well just do it right
in the first patch here and eliminate that one.

> -		done(sc);
> -		return 0;
> +	if (rport) {
> +		struct fc_rport_libfc_priv *rp = rport->dd_data;
> +
> +		if (!rp || rp->rp_state != RPORT_ST_READY) {
> +			FNIC_SCSI_DBG(KERN_DEBUG, fnic->lport->host,
> +				"returning DID_NO_CONNECT for IO as rport is removed\n");
> +			atomic64_inc(&fnic_stats->misc_stats.rport_not_ready);
> +			sc->result = DID_NO_CONNECT;
> +			done(sc);
> +			return 0;
> +		}
>  	}
>  
>  	if (lp->state != LPORT_ST_READY || !(lp->link_up))

Reviewed-by: Ewan D. Milne <emilne@redhat.com>


--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 4/5] Setting scsi host template to indicate that fnic does not support multiqueue
  2016-03-14 18:14 ` [PATCH 4/5] Setting scsi host template to indicate that fnic does not support multiqueue Satish Kharat
@ 2016-03-14 20:16   ` Ewan Milne
  2016-03-15  9:46   ` Johannes Thumshirn
  2016-03-15  9:51   ` Christoph Hellwig
  2 siblings, 0 replies; 15+ messages in thread
From: Ewan Milne @ 2016-03-14 20:16 UTC (permalink / raw)
  To: Satish Kharat; +Cc: linux-scsi

On Mon, 2016-03-14 at 11:14 -0700, Satish Kharat wrote:
> Fnic does not support mutiqueue. Setting the scsi_host_template in
> fnic_host_template to indicate that.
> 
> Changing fnic_version from 1.6.0.21 to 1.6.0.22
> 
> Signed-off-by: Satish Kharat <satishkh@cisco.com>
> ---
>  drivers/scsi/fnic/fnic.h      | 2 +-
>  drivers/scsi/fnic/fnic_main.c | 1 +
>  2 files changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/scsi/fnic/fnic.h b/drivers/scsi/fnic/fnic.h
> index 9ddc920..35264c7 100644
> --- a/drivers/scsi/fnic/fnic.h
> +++ b/drivers/scsi/fnic/fnic.h
> @@ -39,7 +39,7 @@
>  
>  #define DRV_NAME		"fnic"
>  #define DRV_DESCRIPTION		"Cisco FCoE HBA Driver"
> -#define DRV_VERSION		"1.6.0.21"
> +#define DRV_VERSION		"1.6.0.22"
>  #define PFX			DRV_NAME ": "
>  #define DFX                     DRV_NAME "%d: "
>  
> diff --git a/drivers/scsi/fnic/fnic_main.c b/drivers/scsi/fnic/fnic_main.c
> index 58ce902..fa8a121 100644
> --- a/drivers/scsi/fnic/fnic_main.c
> +++ b/drivers/scsi/fnic/fnic_main.c
> @@ -119,6 +119,7 @@ static struct scsi_host_template fnic_host_template = {
>  	.max_sectors = 0xffff,
>  	.shost_attrs = fnic_attrs,
>  	.track_queue_depth = 1,
> +	.disable_blk_mq = 1,
>  };
>  
>  static void

Reviewed-by: Ewan D. Milne <emilne@redhat.com>



^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 5/5] Leftshift returned scsi_cmnd error code 16 bits
  2016-03-14 18:14 ` [PATCH 5/5] Leftshift returned scsi_cmnd error code 16 bits Satish Kharat
@ 2016-03-14 20:17   ` Ewan Milne
  0 siblings, 0 replies; 15+ messages in thread
From: Ewan Milne @ 2016-03-14 20:17 UTC (permalink / raw)
  To: Satish Kharat; +Cc: linux-scsi

On Mon, 2016-03-14 at 11:14 -0700, Satish Kharat wrote:
> The the scsi_cmnd error code is expected to be in the left 16 bits
> of the result field. Change is to correct this.
> 
> Signed-off-by: Satish Kharat <satishkh@cisco.com>
> ---
>  drivers/scsi/fnic/fnic.h      | 2 +-
>  drivers/scsi/fnic/fnic_scsi.c | 2 +-
>  2 files changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/scsi/fnic/fnic.h b/drivers/scsi/fnic/fnic.h
> index 35264c7..e4e22f1 100644
> --- a/drivers/scsi/fnic/fnic.h
> +++ b/drivers/scsi/fnic/fnic.h
> @@ -39,7 +39,7 @@
>  
>  #define DRV_NAME		"fnic"
>  #define DRV_DESCRIPTION		"Cisco FCoE HBA Driver"
> -#define DRV_VERSION		"1.6.0.22"
> +#define DRV_VERSION		"1.6.0.24"
>  #define PFX			DRV_NAME ": "
>  #define DFX                     DRV_NAME "%d: "
>  
> diff --git a/drivers/scsi/fnic/fnic_scsi.c b/drivers/scsi/fnic/fnic_scsi.c
> index 8492143..92a4ca3 100644
> --- a/drivers/scsi/fnic/fnic_scsi.c
> +++ b/drivers/scsi/fnic/fnic_scsi.c
> @@ -461,7 +461,7 @@ static int fnic_queuecommand_lck(struct scsi_cmnd *sc, void (*done)(struct scsi_
>  			FNIC_SCSI_DBG(KERN_DEBUG, fnic->lport->host,
>  				"returning DID_NO_CONNECT for IO as rport is removed\n");
>  			atomic64_inc(&fnic_stats->misc_stats.rport_not_ready);
> -			sc->result = DID_NO_CONNECT;
> +			sc->result = DID_NO_CONNECT<<16;
>  			done(sc);
>  			return 0;
>  		}

Should just fix patch 3/5 instead.

Reviewed-by: Ewan D. Milne <emilne@redhat.com>



^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 4/5] Setting scsi host template to indicate that fnic does not support multiqueue
  2016-03-14 18:14 ` [PATCH 4/5] Setting scsi host template to indicate that fnic does not support multiqueue Satish Kharat
  2016-03-14 20:16   ` Ewan Milne
@ 2016-03-15  9:46   ` Johannes Thumshirn
  2016-03-15  9:51   ` Christoph Hellwig
  2 siblings, 0 replies; 15+ messages in thread
From: Johannes Thumshirn @ 2016-03-15  9:46 UTC (permalink / raw)
  To: Satish Kharat; +Cc: linux-scsi

On Mon, Mar 14, 2016 at 11:14:21AM -0700, Satish Kharat wrote:
> Fnic does not support mutiqueue. Setting the scsi_host_template in
> fnic_host_template to indicate that.
> 
> Changing fnic_version from 1.6.0.21 to 1.6.0.22
> 
> Signed-off-by: Satish Kharat <satishkh@cisco.com>
> ---
>  drivers/scsi/fnic/fnic.h      | 2 +-
>  drivers/scsi/fnic/fnic_main.c | 1 +
>  2 files changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/scsi/fnic/fnic.h b/drivers/scsi/fnic/fnic.h
> index 9ddc920..35264c7 100644
> --- a/drivers/scsi/fnic/fnic.h
> +++ b/drivers/scsi/fnic/fnic.h
> @@ -39,7 +39,7 @@
>  
>  #define DRV_NAME		"fnic"
>  #define DRV_DESCRIPTION		"Cisco FCoE HBA Driver"
> -#define DRV_VERSION		"1.6.0.21"
> +#define DRV_VERSION		"1.6.0.22"
>  #define PFX			DRV_NAME ": "
>  #define DFX                     DRV_NAME "%d: "
>  
> diff --git a/drivers/scsi/fnic/fnic_main.c b/drivers/scsi/fnic/fnic_main.c
> index 58ce902..fa8a121 100644
> --- a/drivers/scsi/fnic/fnic_main.c
> +++ b/drivers/scsi/fnic/fnic_main.c
> @@ -119,6 +119,7 @@ static struct scsi_host_template fnic_host_template = {
>  	.max_sectors = 0xffff,
>  	.shost_attrs = fnic_attrs,
>  	.track_queue_depth = 1,
> +	.disable_blk_mq = 1,
>  };
>  
>  static void
> -- 
> 2.4.3
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

Can you please remove the check for shost_use_blk_mq() in fnic_scsi.c as well,
as it's a NOP?

Thanks,
	Johannes

-- 
Johannes Thumshirn                                          Storage
jthumshirn@suse.de                                +49 911 74053 689
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: Felix Imendörffer, Jane Smithard, Graham Norton
HRB 21284 (AG Nürnberg)
Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 4/5] Setting scsi host template to indicate that fnic does not support multiqueue
  2016-03-14 18:14 ` [PATCH 4/5] Setting scsi host template to indicate that fnic does not support multiqueue Satish Kharat
  2016-03-14 20:16   ` Ewan Milne
  2016-03-15  9:46   ` Johannes Thumshirn
@ 2016-03-15  9:51   ` Christoph Hellwig
  2 siblings, 0 replies; 15+ messages in thread
From: Christoph Hellwig @ 2016-03-15  9:51 UTC (permalink / raw)
  To: Satish Kharat; +Cc: linux-scsi

Sorry, but this patch is bullshit.  Why doesn't it support using blk-mq?
And why didn't I see a single attempt from you to fix it first?

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 2/5] Cleanup the I/O that has timed out and is used to issue LUN reset
  2016-03-14 18:14 ` [PATCH 2/5] Cleanup the I/O that has timed out and is used to issue LUN reset Satish Kharat
  2016-03-14 20:14   ` Ewan Milne
@ 2016-03-15  9:53   ` Christoph Hellwig
  1 sibling, 0 replies; 15+ messages in thread
From: Christoph Hellwig @ 2016-03-15  9:53 UTC (permalink / raw)
  To: Satish Kharat; +Cc: linux-scsi


The right fix is to get rid of fnic_clean_pending_aborts entirely.
It's not the low level drivers business to clean up pending commands.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 1/5] Fix to cleanup aborted IO to avoid device being offlined by mid-layer
  2016-03-14 19:59 ` [PATCH 1/5] Fix to cleanup aborted IO to avoid device being offlined by mid-layer Ewan Milne
@ 2016-03-16 22:38   ` Satish Kharat (satishkh)
  2016-03-17 17:51     ` Ewan D. Milne
  0 siblings, 1 reply; 15+ messages in thread
From: Satish Kharat (satishkh) @ 2016-03-16 22:38 UTC (permalink / raw)
  To: Ewan Milne; +Cc: linux-scsi

Thanks for the review comments. I incorporated the comments, below is the updated patch: 


If an I/O times out and an abort issued by host, if the abort is
successful we need to set scsi status as DID_ABORT. Or else the
mid-layer error handler which looks for this error code, will
offline the device. Also if the original I/O is not found in fnic
firmware, we will consider the abort as successful.
The start_time assignment is moved because of the new goto.
Fnic driver version changed from 1.6.0.17a to 1.6.0.19,
version 1.6.0.18 has been skipped

Signed-off-by: Satish Kharat <satishkh@cisco.com>
---
 drivers/scsi/fnic/fnic.h      |  2 +-
 drivers/scsi/fnic/fnic_scsi.c | 35 +++++++++++++++++++++++++++++------
 2 files changed, 30 insertions(+), 7 deletions(-)

diff --git a/drivers/scsi/fnic/fnic.h b/drivers/scsi/fnic/fnic.h
index ce129e5..52a53f8 100644
--- a/drivers/scsi/fnic/fnic.h
+++ b/drivers/scsi/fnic/fnic.h
@@ -39,7 +39,7 @@
 
 #define DRV_NAME		"fnic"
 #define DRV_DESCRIPTION		"Cisco FCoE HBA Driver"
-#define DRV_VERSION		"1.6.0.17a"
+#define DRV_VERSION		"1.6.0.19"
 #define PFX			DRV_NAME ": "
 #define DFX                     DRV_NAME "%d: "
 
diff --git a/drivers/scsi/fnic/fnic_scsi.c b/drivers/scsi/fnic/fnic_scsi.c
index 266b909..b732fa3 100644
--- a/drivers/scsi/fnic/fnic_scsi.c
+++ b/drivers/scsi/fnic/fnic_scsi.c
@@ -1092,6 +1092,11 @@ static void fnic_fcpio_itmf_cmpl_handler(struct fnic *fnic,
 				atomic64_inc(
 					&term_stats->terminate_fw_timeouts);
 			break;
+		case FCPIO_ITMF_REJECTED:
+			FNIC_SCSI_DBG(KERN_INFO, fnic->lport->host,
+				"abort reject recd. id %d\n",
+				(int)(id & FNIC_TAG_MASK));
+			break;
 		case FCPIO_IO_NOT_FOUND:
 			if (CMD_FLAGS(sc) & FNIC_IO_ABTS_ISSUED)
 				atomic64_inc(&abts_stats->abort_io_not_found);
@@ -1112,9 +1117,15 @@ static void fnic_fcpio_itmf_cmpl_handler(struct fnic *fnic,
 			spin_unlock_irqrestore(io_lock, flags);
 			return;
 		}
-		CMD_ABTS_STATUS(sc) = hdr_status;
+
 		CMD_FLAGS(sc) |= FNIC_IO_ABT_TERM_DONE;
 
+		/* If the status is IO not found consider it as success */
+		if (hdr_status == FCPIO_IO_NOT_FOUND)
+			CMD_ABTS_STATUS(sc) = FCPIO_SUCCESS;
+		else
+			CMD_ABTS_STATUS(sc) = hdr_status;
+
 		atomic64_dec(&fnic_stats->io_stats.active_ios);
 		if (atomic64_read(&fnic->io_cmpl_skip))
 			atomic64_dec(&fnic->io_cmpl_skip);
@@ -1927,21 +1938,33 @@ int fnic_abort_cmd(struct scsi_cmnd *sc)
 
 	CMD_STATE(sc) = FNIC_IOREQ_ABTS_COMPLETE;
 
+	start_time = io_req->start_time;
 	/*
 	 * firmware completed the abort, check the status,
-	 * free the io_req irrespective of failure or success
+	 * free the io_req if successful. If abort fails,
+	 * Device reset will clean the I/O.
 	 */
-	if (CMD_ABTS_STATUS(sc) != FCPIO_SUCCESS)
+	if (CMD_ABTS_STATUS(sc) == FCPIO_SUCCESS) {
+		CMD_SP(sc) = NULL;
+	}
+	else
+	{
 		ret = FAILED;
-
-	CMD_SP(sc) = NULL;
+		spin_unlock_irqrestore(io_lock, flags);
+		goto fnic_abort_cmd_end;
+	}
 
 	spin_unlock_irqrestore(io_lock, flags);
 
-	start_time = io_req->start_time;
 	fnic_release_ioreq_buf(fnic, io_req, sc);
 	mempool_free(io_req, fnic->io_req_pool);
 
+	if (sc->scsi_done) {
+	/* Call SCSI completion function to complete the IO */
+		sc->result = (DID_ABORT << 16);
+		sc->scsi_done(sc);
+	}
+
 fnic_abort_cmd_end:
 	FNIC_TRACE(fnic_abort_cmd, sc->device->host->host_no,
 		  sc->request->tag, sc,
-- 
2.4.3

________________________________________
From: Ewan Milne <emilne@redhat.com>
Sent: Monday, March 14, 2016 12:59 PM
To: Satish Kharat (satishkh)
Cc: linux-scsi@vger.kernel.org
Subject: Re: [PATCH 1/5] Fix to cleanup aborted IO to avoid device being offlined by mid-layer

On Mon, 2016-03-14 at 11:14 -0700, Satish Kharat wrote:
> If an I/O times out and an abort issued by host, if the abort is
> successful we need to set scsi status as DID_ABORT. Or else the
> mid-layer error handler which looks for this error code, will
> offline the device. Also if the original I/O is not found in fnic
> firmware, we will consider the abort as successful.
> Fnic driver version changed from 1.6.0.17a to 1.6.0.19,
> version 1.6.0.18 has been skipped
>
> Signed-off-by: Satish Kharat <satishkh@cisco.com>
> ---
>  drivers/scsi/fnic/fnic.h      |  2 +-
>  drivers/scsi/fnic/fnic_scsi.c | 32 +++++++++++++++++++++++++++-----
>  2 files changed, 28 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/scsi/fnic/fnic.h b/drivers/scsi/fnic/fnic.h
> index ce129e5..52a53f8 100644
> --- a/drivers/scsi/fnic/fnic.h
> +++ b/drivers/scsi/fnic/fnic.h
> @@ -39,7 +39,7 @@
>
>  #define DRV_NAME             "fnic"
>  #define DRV_DESCRIPTION              "Cisco FCoE HBA Driver"
> -#define DRV_VERSION          "1.6.0.17a"
> +#define DRV_VERSION          "1.6.0.19"
>  #define PFX                  DRV_NAME ": "
>  #define DFX                     DRV_NAME "%d: "
>
> diff --git a/drivers/scsi/fnic/fnic_scsi.c b/drivers/scsi/fnic/fnic_scsi.c
> index 266b909..b9732b1 100644
> --- a/drivers/scsi/fnic/fnic_scsi.c
> +++ b/drivers/scsi/fnic/fnic_scsi.c
> @@ -1092,6 +1092,11 @@ static void fnic_fcpio_itmf_cmpl_handler(struct fnic *fnic,
>                               atomic64_inc(
>                                       &term_stats->terminate_fw_timeouts);
>                       break;
> +             case FCPIO_ITMF_REJECTED:
> +                     FNIC_SCSI_DBG(KERN_INFO, fnic->lport->host,
> +                             "abort reject recd. id %d\n",
> +                             (int)(id & FNIC_TAG_MASK));
> +                     break;
>               case FCPIO_IO_NOT_FOUND:
>                       if (CMD_FLAGS(sc) & FNIC_IO_ABTS_ISSUED)
>                               atomic64_inc(&abts_stats->abort_io_not_found);
> @@ -1112,9 +1117,14 @@ static void fnic_fcpio_itmf_cmpl_handler(struct fnic *fnic,
>                       spin_unlock_irqrestore(io_lock, flags);
>                       return;
>               }
> +
>               CMD_ABTS_STATUS(sc) = hdr_status;
>               CMD_FLAGS(sc) |= FNIC_IO_ABT_TERM_DONE;
>
> +             /* If the status is IO not found consider it as success */
> +             if (hdr_status == FCPIO_IO_NOT_FOUND)
> +                     CMD_ABTS_STATUS(sc) = FCPIO_SUCCESS;
> +

Minor nitpick:  I would have done:

                if (hdr_status == FCP_IO_NOT_FOUND)
                        CMD_ABTS_STATUS(sc) = FCPIO_SUCCESS;
                else
                        CMD_ABTS_STATUS(sc) = hdr_status;

rather than overwriting the assignment made several lines above, it is
more explicit and less prone to errors if the code is later changed.

>               atomic64_dec(&fnic_stats->io_stats.active_ios);
>               if (atomic64_read(&fnic->io_cmpl_skip))
>                       atomic64_dec(&fnic->io_cmpl_skip);
> @@ -1927,21 +1937,33 @@ int fnic_abort_cmd(struct scsi_cmnd *sc)
>
>       CMD_STATE(sc) = FNIC_IOREQ_ABTS_COMPLETE;
>
> +     start_time = io_req->start_time;
>       /*
>        * firmware completed the abort, check the status,
> -      * free the io_req irrespective of failure or success
> +      * free the io_req if successful. If abort fails,
> +      * Device reset will clean the I/O.
>        */
> -     if (CMD_ABTS_STATUS(sc) != FCPIO_SUCCESS)
> +     if (CMD_ABTS_STATUS(sc) == FCPIO_SUCCESS) {
> +             CMD_SP(sc) = NULL;
> +     }
> +     else
> +     {
>               ret = FAILED;
> -
> -     CMD_SP(sc) = NULL;
> +             spin_unlock_irqrestore(io_lock, flags);
> +             goto fnic_abort_cmd_end;
> +     }
>
>       spin_unlock_irqrestore(io_lock, flags);
>
> -     start_time = io_req->start_time;

What was the point of moving the assignment of start_time?
Other places in the code still set it just before the call to
fnic_release_ioreq_buf().

>       fnic_release_ioreq_buf(fnic, io_req, sc);
>       mempool_free(io_req, fnic->io_req_pool);
>
> +     if (sc->scsi_done) {
> +     /* Call SCSI completion function to complete the IO */
> +             sc->result = (DID_ABORT << 16);
> +             sc->scsi_done(sc);
> +     }
> +
>  fnic_abort_cmd_end:
>       FNIC_TRACE(fnic_abort_cmd, sc->device->host->host_no,
>                 sc->request->tag, sc,

Despite the above comments, I guess this looks OK.

Reviewed-by: Ewan D. Milne <emilne@redhat.com>


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH 1/5] Fix to cleanup aborted IO to avoid device being offlined by mid-layer
  2016-03-16 22:38   ` Satish Kharat (satishkh)
@ 2016-03-17 17:51     ` Ewan D. Milne
  0 siblings, 0 replies; 15+ messages in thread
From: Ewan D. Milne @ 2016-03-17 17:51 UTC (permalink / raw)
  To: Satish Kharat (satishkh); +Cc: linux-scsi

On Wed, 2016-03-16 at 22:38 +0000, Satish Kharat (satishkh) wrote:
> Thanks for the review comments. I incorporated the comments, below is the updated patch: 
> 
> 
> If an I/O times out and an abort issued by host, if the abort is
> successful we need to set scsi status as DID_ABORT. Or else the
> mid-layer error handler which looks for this error code, will
> offline the device. Also if the original I/O is not found in fnic
> firmware, we will consider the abort as successful.
> The start_time assignment is moved because of the new goto.
> Fnic driver version changed from 1.6.0.17a to 1.6.0.19,
> version 1.6.0.18 has been skipped
> 
> Signed-off-by: Satish Kharat <satishkh@cisco.com>
> ---
>  drivers/scsi/fnic/fnic.h      |  2 +-
>  drivers/scsi/fnic/fnic_scsi.c | 35 +++++++++++++++++++++++++++++------
>  2 files changed, 30 insertions(+), 7 deletions(-)
> 
> diff --git a/drivers/scsi/fnic/fnic.h b/drivers/scsi/fnic/fnic.h
> index ce129e5..52a53f8 100644
> --- a/drivers/scsi/fnic/fnic.h
> +++ b/drivers/scsi/fnic/fnic.h
> @@ -39,7 +39,7 @@
>  
>  #define DRV_NAME		"fnic"
>  #define DRV_DESCRIPTION		"Cisco FCoE HBA Driver"
> -#define DRV_VERSION		"1.6.0.17a"
> +#define DRV_VERSION		"1.6.0.19"
>  #define PFX			DRV_NAME ": "
>  #define DFX                     DRV_NAME "%d: "
>  
> diff --git a/drivers/scsi/fnic/fnic_scsi.c b/drivers/scsi/fnic/fnic_scsi.c
> index 266b909..b732fa3 100644
> --- a/drivers/scsi/fnic/fnic_scsi.c
> +++ b/drivers/scsi/fnic/fnic_scsi.c
> @@ -1092,6 +1092,11 @@ static void fnic_fcpio_itmf_cmpl_handler(struct fnic *fnic,
>  				atomic64_inc(
>  					&term_stats->terminate_fw_timeouts);
>  			break;
> +		case FCPIO_ITMF_REJECTED:
> +			FNIC_SCSI_DBG(KERN_INFO, fnic->lport->host,
> +				"abort reject recd. id %d\n",
> +				(int)(id & FNIC_TAG_MASK));
> +			break;
>  		case FCPIO_IO_NOT_FOUND:
>  			if (CMD_FLAGS(sc) & FNIC_IO_ABTS_ISSUED)
>  				atomic64_inc(&abts_stats->abort_io_not_found);
> @@ -1112,9 +1117,15 @@ static void fnic_fcpio_itmf_cmpl_handler(struct fnic *fnic,
>  			spin_unlock_irqrestore(io_lock, flags);
>  			return;
>  		}
> -		CMD_ABTS_STATUS(sc) = hdr_status;
> +
>  		CMD_FLAGS(sc) |= FNIC_IO_ABT_TERM_DONE;
>  
> +		/* If the status is IO not found consider it as success */
> +		if (hdr_status == FCPIO_IO_NOT_FOUND)
> +			CMD_ABTS_STATUS(sc) = FCPIO_SUCCESS;
> +		else
> +			CMD_ABTS_STATUS(sc) = hdr_status;
> +
>  		atomic64_dec(&fnic_stats->io_stats.active_ios);
>  		if (atomic64_read(&fnic->io_cmpl_skip))
>  			atomic64_dec(&fnic->io_cmpl_skip);
> @@ -1927,21 +1938,33 @@ int fnic_abort_cmd(struct scsi_cmnd *sc)
>  
>  	CMD_STATE(sc) = FNIC_IOREQ_ABTS_COMPLETE;
>  
> +	start_time = io_req->start_time;
>  	/*
>  	 * firmware completed the abort, check the status,
> -	 * free the io_req irrespective of failure or success
> +	 * free the io_req if successful. If abort fails,
> +	 * Device reset will clean the I/O.
>  	 */
> -	if (CMD_ABTS_STATUS(sc) != FCPIO_SUCCESS)
> +	if (CMD_ABTS_STATUS(sc) == FCPIO_SUCCESS) {
> +		CMD_SP(sc) = NULL;
> +	}
> +	else
> +	{
>  		ret = FAILED;
> -
> -	CMD_SP(sc) = NULL;
> +		spin_unlock_irqrestore(io_lock, flags);
> +		goto fnic_abort_cmd_end;
> +	}
>  
>  	spin_unlock_irqrestore(io_lock, flags);
>  
> -	start_time = io_req->start_time;
>  	fnic_release_ioreq_buf(fnic, io_req, sc);
>  	mempool_free(io_req, fnic->io_req_pool);
>  
> +	if (sc->scsi_done) {
> +	/* Call SCSI completion function to complete the IO */
> +		sc->result = (DID_ABORT << 16);
> +		sc->scsi_done(sc);
> +	}
> +
>  fnic_abort_cmd_end:
>  	FNIC_TRACE(fnic_abort_cmd, sc->device->host->host_no,
>  		  sc->request->tag, sc,
> -- 
> 2.4.3

Reviewed-by: Ewan D. Milne <emilne@redhat.com>

> 
> ________________________________________
> From: Ewan Milne <emilne@redhat.com>
> Sent: Monday, March 14, 2016 12:59 PM
> To: Satish Kharat (satishkh)
> Cc: linux-scsi@vger.kernel.org
> Subject: Re: [PATCH 1/5] Fix to cleanup aborted IO to avoid device being offlined by mid-layer
> 
> On Mon, 2016-03-14 at 11:14 -0700, Satish Kharat wrote:
> > If an I/O times out and an abort issued by host, if the abort is
> > successful we need to set scsi status as DID_ABORT. Or else the
> > mid-layer error handler which looks for this error code, will
> > offline the device. Also if the original I/O is not found in fnic
> > firmware, we will consider the abort as successful.
> > Fnic driver version changed from 1.6.0.17a to 1.6.0.19,
> > version 1.6.0.18 has been skipped
> >
> > Signed-off-by: Satish Kharat <satishkh@cisco.com>
> > ---
> >  drivers/scsi/fnic/fnic.h      |  2 +-
> >  drivers/scsi/fnic/fnic_scsi.c | 32 +++++++++++++++++++++++++++-----
> >  2 files changed, 28 insertions(+), 6 deletions(-)
> >
> > diff --git a/drivers/scsi/fnic/fnic.h b/drivers/scsi/fnic/fnic.h
> > index ce129e5..52a53f8 100644
> > --- a/drivers/scsi/fnic/fnic.h
> > +++ b/drivers/scsi/fnic/fnic.h
> > @@ -39,7 +39,7 @@
> >
> >  #define DRV_NAME             "fnic"
> >  #define DRV_DESCRIPTION              "Cisco FCoE HBA Driver"
> > -#define DRV_VERSION          "1.6.0.17a"
> > +#define DRV_VERSION          "1.6.0.19"
> >  #define PFX                  DRV_NAME ": "
> >  #define DFX                     DRV_NAME "%d: "
> >
> > diff --git a/drivers/scsi/fnic/fnic_scsi.c b/drivers/scsi/fnic/fnic_scsi.c
> > index 266b909..b9732b1 100644
> > --- a/drivers/scsi/fnic/fnic_scsi.c
> > +++ b/drivers/scsi/fnic/fnic_scsi.c
> > @@ -1092,6 +1092,11 @@ static void fnic_fcpio_itmf_cmpl_handler(struct fnic *fnic,
> >                               atomic64_inc(
> >                                       &term_stats->terminate_fw_timeouts);
> >                       break;
> > +             case FCPIO_ITMF_REJECTED:
> > +                     FNIC_SCSI_DBG(KERN_INFO, fnic->lport->host,
> > +                             "abort reject recd. id %d\n",
> > +                             (int)(id & FNIC_TAG_MASK));
> > +                     break;
> >               case FCPIO_IO_NOT_FOUND:
> >                       if (CMD_FLAGS(sc) & FNIC_IO_ABTS_ISSUED)
> >                               atomic64_inc(&abts_stats->abort_io_not_found);
> > @@ -1112,9 +1117,14 @@ static void fnic_fcpio_itmf_cmpl_handler(struct fnic *fnic,
> >                       spin_unlock_irqrestore(io_lock, flags);
> >                       return;
> >               }
> > +
> >               CMD_ABTS_STATUS(sc) = hdr_status;
> >               CMD_FLAGS(sc) |= FNIC_IO_ABT_TERM_DONE;
> >
> > +             /* If the status is IO not found consider it as success */
> > +             if (hdr_status == FCPIO_IO_NOT_FOUND)
> > +                     CMD_ABTS_STATUS(sc) = FCPIO_SUCCESS;
> > +
> 
> Minor nitpick:  I would have done:
> 
>                 if (hdr_status == FCP_IO_NOT_FOUND)
>                         CMD_ABTS_STATUS(sc) = FCPIO_SUCCESS;
>                 else
>                         CMD_ABTS_STATUS(sc) = hdr_status;
> 
> rather than overwriting the assignment made several lines above, it is
> more explicit and less prone to errors if the code is later changed.
> 
> >               atomic64_dec(&fnic_stats->io_stats.active_ios);
> >               if (atomic64_read(&fnic->io_cmpl_skip))
> >                       atomic64_dec(&fnic->io_cmpl_skip);
> > @@ -1927,21 +1937,33 @@ int fnic_abort_cmd(struct scsi_cmnd *sc)
> >
> >       CMD_STATE(sc) = FNIC_IOREQ_ABTS_COMPLETE;
> >
> > +     start_time = io_req->start_time;
> >       /*
> >        * firmware completed the abort, check the status,
> > -      * free the io_req irrespective of failure or success
> > +      * free the io_req if successful. If abort fails,
> > +      * Device reset will clean the I/O.
> >        */
> > -     if (CMD_ABTS_STATUS(sc) != FCPIO_SUCCESS)
> > +     if (CMD_ABTS_STATUS(sc) == FCPIO_SUCCESS) {
> > +             CMD_SP(sc) = NULL;
> > +     }
> > +     else
> > +     {
> >               ret = FAILED;
> > -
> > -     CMD_SP(sc) = NULL;
> > +             spin_unlock_irqrestore(io_lock, flags);
> > +             goto fnic_abort_cmd_end;
> > +     }
> >
> >       spin_unlock_irqrestore(io_lock, flags);
> >
> > -     start_time = io_req->start_time;
> 
> What was the point of moving the assignment of start_time?
> Other places in the code still set it just before the call to
> fnic_release_ioreq_buf().
> 
> >       fnic_release_ioreq_buf(fnic, io_req, sc);
> >       mempool_free(io_req, fnic->io_req_pool);
> >
> > +     if (sc->scsi_done) {
> > +     /* Call SCSI completion function to complete the IO */
> > +             sc->result = (DID_ABORT << 16);
> > +             sc->scsi_done(sc);
> > +     }
> > +
> >  fnic_abort_cmd_end:
> >       FNIC_TRACE(fnic_abort_cmd, sc->device->host->host_no,
> >                 sc->request->tag, sc,
> 
> Despite the above comments, I guess this looks OK.
> 
> Reviewed-by: Ewan D. Milne <emilne@redhat.com>
> 



^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2016-03-17 17:51 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-03-14 18:14 [PATCH 1/5] Fix to cleanup aborted IO to avoid device being offlined by mid-layer Satish Kharat
2016-03-14 18:14 ` [PATCH 2/5] Cleanup the I/O that has timed out and is used to issue LUN reset Satish Kharat
2016-03-14 20:14   ` Ewan Milne
2016-03-15  9:53   ` Christoph Hellwig
2016-03-14 18:14 ` [PATCH 3/5] Using rport->dd_data to check rport online instead of rport_lookup Satish Kharat
2016-03-14 20:15   ` Ewan Milne
2016-03-14 18:14 ` [PATCH 4/5] Setting scsi host template to indicate that fnic does not support multiqueue Satish Kharat
2016-03-14 20:16   ` Ewan Milne
2016-03-15  9:46   ` Johannes Thumshirn
2016-03-15  9:51   ` Christoph Hellwig
2016-03-14 18:14 ` [PATCH 5/5] Leftshift returned scsi_cmnd error code 16 bits Satish Kharat
2016-03-14 20:17   ` Ewan Milne
2016-03-14 19:59 ` [PATCH 1/5] Fix to cleanup aborted IO to avoid device being offlined by mid-layer Ewan Milne
2016-03-16 22:38   ` Satish Kharat (satishkh)
2016-03-17 17:51     ` Ewan D. Milne

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.