linux-scsi.vger.kernel.org archive mirror
From: James Smart <jsmart2021@gmail.com>
To: linux-scsi@vger.kernel.org
Cc: James Smart <jsmart2021@gmail.com>,
	Dick Kennedy <dick.kennedy@broadcom.com>
Subject: [PATCH 13/20] lpfc: Fix list corruption in lpfc_sli_get_iocbq
Date: Sat, 21 Sep 2019 20:58:59 -0700
Message-ID: <20190922035906.10977-14-jsmart2021@gmail.com>
In-Reply-To: <20190922035906.10977-1-jsmart2021@gmail.com>

Investigation showed a double free of a CT iocb during execution
of lpfc_offline_prep and lpfc_offline. The prep routine issued
aborts for some CT iocbs, but the aborts did not complete fast
enough for a subsequent routine that waits for completion. The
driver therefore proceeded to lpfc_offline, which releases any
pending iocbs. The completions for the aborts were then received
and released the CT iocbs a second time.

The aborts were not slow because of their time on the wire or in
the adapter. The delay came from the lpfc_work_done routine, which
requires the adapter state to be UP before it calls
lpfc_sli_handle_slow_ring_event() to process the completions,
while the prep routine takes the link down as part of its
processing.
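
The gating problem described above can be sketched in isolation. This is
a toy model, not driver code: the toy_link_state enum, the toy_hba struct
and toy_gate() are hypothetical names, and the enum values are assumptions
chosen only so that "down" orders below "up".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical link states; only the ordering DOWN < UP matters here. */
enum toy_link_state { TOY_LINK_INIT = 0, TOY_LINK_DOWN = 1, TOY_LINK_UP = 2 };

/* Hypothetical stand-in for the adapter state consulted by the worker. */
struct toy_hba {
	enum toy_link_state link_state;
	bool loopback;	/* stands in for the loopback exception in the check */
};

/*
 * Return true when the worker would process slow-ring completions.
 * 'threshold' is the minimum link state that lets completions through.
 */
static bool toy_gate(const struct toy_hba *hba, enum toy_link_state threshold)
{
	assert(hba != NULL);
	return hba->link_state >= threshold || hba->loopback;
}
```

With the link down, toy_gate(&hba, TOY_LINK_UP) is false, so a queued
abort completion would sit unprocessed, while toy_gate(&hba, TOY_LINK_DOWN)
is true and lets it through.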

To fix this:
- Prevent the offline routine from releasing iocbs that have had aborts
  issued on them, and defer the release to the abort completions. This
  also means the driver fully waits for the completions.
  Given this change, the recognition of "driver-generated" status,
  which then releases the iocb, is no longer valid, so this patch
  reverts the change made in
  commit 296012285c90 ("scsi: lpfc: Fix leak of ELS completions on adapter reset").
- Modify lpfc_work_done to allow slow path completions while the link
  is down, so that the abort completions aren't ignored.
- Update the FDMI path to recognize a CT request that fails
  due to the port being unusable. This stops FDMI retries. FDMI
  will be restarted on the next link up.
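
The ownership rule in the first item can be illustrated with a small
standalone model. Everything here is hypothetical (struct toy_iocb and
the toy_* helpers are not the driver's types or functions); it only
counts how often each request is released when a flush runs before a
late abort completion.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an iocb; not the driver's struct lpfc_iocbq. */
struct toy_iocb {
	bool driver_aborted;	/* abort issued; its completion owns the release */
	int release_count;	/* times this iocb was returned to the pool */
};

/* Return an iocb to the (imaginary) free pool. */
static void toy_release(struct toy_iocb *io)
{
	assert(io != NULL);
	io->release_count++;
}

/* Flush pending iocbs; optionally leave aborted ones to their completion. */
static void toy_flush(struct toy_iocb *ios, size_t n, bool skip_aborted)
{
	for (size_t i = 0; i < n; i++) {
		if (skip_aborted && ios[i].driver_aborted)
			continue;
		toy_release(&ios[i]);
	}
}

/* Late abort completion: releases the iocb that was aborted. */
static void toy_abort_cmpl(struct toy_iocb *io)
{
	if (io->driver_aborted)
		toy_release(io);
}

/* Run flush, then the late completions; report the worst release count. */
static int toy_max_releases(bool skip_aborted)
{
	struct toy_iocb ios[2] = {
		{ .driver_aborted = true },
		{ .driver_aborted = false },
	};
	int max = 0;

	toy_flush(ios, 2, skip_aborted);
	for (size_t i = 0; i < 2; i++)
		toy_abort_cmpl(&ios[i]);
	for (size_t i = 0; i < 2; i++)
		if (ios[i].release_count > max)
			max = ios[i].release_count;
	return max;
}
```

In this model toy_max_releases(false) reports 2, the double release,
and toy_max_releases(true) reports 1, since the flush skips the aborted
request and only its completion releases it.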

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
---
 drivers/scsi/lpfc/lpfc_ct.c      | 6 ++++++
 drivers/scsi/lpfc/lpfc_els.c     | 3 +++
 drivers/scsi/lpfc/lpfc_hbadisc.c | 5 ++++-
 drivers/scsi/lpfc/lpfc_sli.c     | 3 ---
 4 files changed, 13 insertions(+), 4 deletions(-)

diff --git a/drivers/scsi/lpfc/lpfc_ct.c b/drivers/scsi/lpfc/lpfc_ct.c
index 25e86706e207..f883fac2d2b1 100644
--- a/drivers/scsi/lpfc/lpfc_ct.c
+++ b/drivers/scsi/lpfc/lpfc_ct.c
@@ -1868,6 +1868,12 @@ lpfc_cmpl_ct_disc_fdmi(struct lpfc_hba *phba, struct lpfc_iocbq *cmdiocb,
 		if (irsp->ulpStatus == IOSTAT_LOCAL_REJECT) {
 			switch ((irsp->un.ulpWord[4] & IOERR_PARAM_MASK)) {
 			case IOERR_SLI_ABORTED:
+			case IOERR_SLI_DOWN:
+				/* Driver aborted this IO.  No retry as error
+				 * is likely Offline->Online or some adapter
+				 * error.  Recovery will try again.
+				 */
+				break;
 			case IOERR_ABORT_IN_PROGRESS:
 			case IOERR_SEQUENCE_TIMEOUT:
 			case IOERR_ILLEGAL_FRAME:
diff --git a/drivers/scsi/lpfc/lpfc_els.c b/drivers/scsi/lpfc/lpfc_els.c
index 55ab37572e92..bd8109b2a083 100644
--- a/drivers/scsi/lpfc/lpfc_els.c
+++ b/drivers/scsi/lpfc/lpfc_els.c
@@ -8019,6 +8019,9 @@ lpfc_els_flush_cmd(struct lpfc_vport *vport)
 		if (piocb->vport != vport)
 			continue;
 
+		if (piocb->iocb_flag & LPFC_DRIVER_ABORTED)
+			continue;
+
 		/* On the ELS ring we can have ELS_REQUESTs or
 		 * GEN_REQUESTs waiting for a response.
 		 */
diff --git a/drivers/scsi/lpfc/lpfc_hbadisc.c b/drivers/scsi/lpfc/lpfc_hbadisc.c
index f483b3aea22b..808ad666bb1b 100644
--- a/drivers/scsi/lpfc/lpfc_hbadisc.c
+++ b/drivers/scsi/lpfc/lpfc_hbadisc.c
@@ -700,7 +700,10 @@ lpfc_work_done(struct lpfc_hba *phba)
 			if (!(phba->hba_flag & HBA_SP_QUEUE_EVT))
 				set_bit(LPFC_DATA_READY, &phba->data_flags);
 		} else {
-			if (phba->link_state >= LPFC_LINK_UP ||
+			/* Driver could have abort request completed in queue
+			 * when link goes down.  Allow for this transition.
+			 */
+			if (phba->link_state >= LPFC_LINK_DOWN ||
 			    phba->link_flag & LS_MDS_LOOPBACK) {
 				pring->flag &= ~LPFC_DEFERRED_RING_EVENT;
 				lpfc_sli_handle_slow_ring_event(phba, pring,
diff --git a/drivers/scsi/lpfc/lpfc_sli.c b/drivers/scsi/lpfc/lpfc_sli.c
index 412cd8c56d90..ff261c0c738a 100644
--- a/drivers/scsi/lpfc/lpfc_sli.c
+++ b/drivers/scsi/lpfc/lpfc_sli.c
@@ -11090,9 +11090,6 @@ lpfc_sli_abort_els_cmpl(struct lpfc_hba *phba, struct lpfc_iocbq *cmdiocb,
 				irsp->ulpStatus, irsp->un.ulpWord[4]);
 
 		spin_unlock_irq(&phba->hbalock);
-		if (irsp->ulpStatus == IOSTAT_LOCAL_REJECT &&
-		    irsp->un.ulpWord[4] == IOERR_SLI_ABORTED)
-			lpfc_sli_release_iocbq(phba, abort_iocb);
 	}
 release_iocb:
 	lpfc_sli_release_iocbq(phba, cmdiocb);
-- 
2.13.7

