linux-scsi.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH AUTOSEL 5.18 035/159] scsi: lpfc: Move cfg_log_verbose check before calling lpfc_dmp_dbg()
       [not found] <20220530132425.1929512-1-sashal@kernel.org>
@ 2022-05-30 13:22 ` Sasha Levin
  2022-05-30 13:22 ` [PATCH AUTOSEL 5.18 036/159] scsi: lpfc: Fix SCSI I/O completion and abort handler deadlock Sasha Levin
                   ` (10 subsequent siblings)
  11 siblings, 0 replies; 14+ messages in thread
From: Sasha Levin @ 2022-05-30 13:22 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: James Smart, Justin Tee, Martin K . Petersen, Sasha Levin,
	james.smart, dick.kennedy, jejb, linux-scsi

From: James Smart <jsmart2021@gmail.com>

[ Upstream commit e294647b1aed4247fe52851f3a3b2b19ae906228 ]

In an attempt to log message 0126 with LOG_TRACE_EVENT, the following hard
lockup call trace hangs the system.

Call Trace:
 _raw_spin_lock_irqsave+0x32/0x40
 lpfc_dmp_dbg.part.32+0x28/0x220 [lpfc]
 lpfc_cmpl_els_fdisc+0x145/0x460 [lpfc]
 lpfc_sli_cancel_jobs+0x92/0xd0 [lpfc]
 lpfc_els_flush_cmd+0x43c/0x670 [lpfc]
 lpfc_els_flush_all_cmd+0x37/0x60 [lpfc]
 lpfc_sli4_async_event_proc+0x956/0x1720 [lpfc]
 lpfc_do_work+0x1485/0x1d70 [lpfc]
 kthread+0x112/0x130
 ret_from_fork+0x1f/0x40
Kernel panic - not syncing: Hard LOCKUP

The same CPU tries to claim the phba->port_list_lock twice.

Move the cfg_log_verbose checks as part of the lpfc_printf_vlog() and
lpfc_printf_log() macros before calling lpfc_dmp_dbg().  There is no need
to take the phba->port_list_lock within lpfc_dmp_dbg().

Link: https://lore.kernel.org/r/20220412222008.126521-3-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/scsi/lpfc/lpfc_init.c   | 29 +----------------------------
 drivers/scsi/lpfc/lpfc_logmsg.h |  6 +++---
 2 files changed, 4 insertions(+), 31 deletions(-)

diff --git a/drivers/scsi/lpfc/lpfc_init.c b/drivers/scsi/lpfc/lpfc_init.c
index 461d333b1b3a..f9cd4b72d949 100644
--- a/drivers/scsi/lpfc/lpfc_init.c
+++ b/drivers/scsi/lpfc/lpfc_init.c
@@ -15700,34 +15700,7 @@ void lpfc_dmp_dbg(struct lpfc_hba *phba)
 	unsigned int temp_idx;
 	int i;
 	int j = 0;
-	unsigned long rem_nsec, iflags;
-	bool log_verbose = false;
-	struct lpfc_vport *port_iterator;
-
-	/* Don't dump messages if we explicitly set log_verbose for the
-	 * physical port or any vport.
-	 */
-	if (phba->cfg_log_verbose)
-		return;
-
-	spin_lock_irqsave(&phba->port_list_lock, iflags);
-	list_for_each_entry(port_iterator, &phba->port_list, listentry) {
-		if (port_iterator->load_flag & FC_UNLOADING)
-			continue;
-		if (scsi_host_get(lpfc_shost_from_vport(port_iterator))) {
-			if (port_iterator->cfg_log_verbose)
-				log_verbose = true;
-
-			scsi_host_put(lpfc_shost_from_vport(port_iterator));
-
-			if (log_verbose) {
-				spin_unlock_irqrestore(&phba->port_list_lock,
-						       iflags);
-				return;
-			}
-		}
-	}
-	spin_unlock_irqrestore(&phba->port_list_lock, iflags);
+	unsigned long rem_nsec;
 
 	if (atomic_cmpxchg(&phba->dbg_log_dmping, 0, 1) != 0)
 		return;
diff --git a/drivers/scsi/lpfc/lpfc_logmsg.h b/drivers/scsi/lpfc/lpfc_logmsg.h
index 7d480c798794..a5aafe230c74 100644
--- a/drivers/scsi/lpfc/lpfc_logmsg.h
+++ b/drivers/scsi/lpfc/lpfc_logmsg.h
@@ -73,7 +73,7 @@ do { \
 #define lpfc_printf_vlog(vport, level, mask, fmt, arg...) \
 do { \
 	{ if (((mask) & (vport)->cfg_log_verbose) || (level[1] <= '3')) { \
-		if ((mask) & LOG_TRACE_EVENT) \
+		if ((mask) & LOG_TRACE_EVENT && !(vport)->cfg_log_verbose) \
 			lpfc_dmp_dbg((vport)->phba); \
 		dev_printk(level, &((vport)->phba->pcidev)->dev, "%d:(%d):" \
 			   fmt, (vport)->phba->brd_no, vport->vpi, ##arg);  \
@@ -89,11 +89,11 @@ do { \
 				 (phba)->pport->cfg_log_verbose : \
 				 (phba)->cfg_log_verbose; \
 	if (((mask) & log_verbose) || (level[1] <= '3')) { \
-		if ((mask) & LOG_TRACE_EVENT) \
+		if ((mask) & LOG_TRACE_EVENT && !log_verbose) \
 			lpfc_dmp_dbg(phba); \
 		dev_printk(level, &((phba)->pcidev)->dev, "%d:" \
 			fmt, phba->brd_no, ##arg); \
-	} else  if (!(phba)->cfg_log_verbose)\
+	} else if (!log_verbose)\
 		lpfc_dbg_print(phba, "%d:" fmt, phba->brd_no, ##arg); \
 	} \
 } while (0)
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH AUTOSEL 5.18 036/159] scsi: lpfc: Fix SCSI I/O completion and abort handler deadlock
       [not found] <20220530132425.1929512-1-sashal@kernel.org>
  2022-05-30 13:22 ` [PATCH AUTOSEL 5.18 035/159] scsi: lpfc: Move cfg_log_verbose check before calling lpfc_dmp_dbg() Sasha Levin
@ 2022-05-30 13:22 ` Sasha Levin
  2022-05-30 13:22 ` [PATCH AUTOSEL 5.18 037/159] scsi: lpfc: Fix null pointer dereference after failing to issue FLOGI and PLOGI Sasha Levin
                   ` (9 subsequent siblings)
  11 siblings, 0 replies; 14+ messages in thread
From: Sasha Levin @ 2022-05-30 13:22 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: James Smart, Justin Tee, Martin K . Petersen, Sasha Levin,
	james.smart, dick.kennedy, jejb, linux-scsi

From: James Smart <jsmart2021@gmail.com>

[ Upstream commit 03cbbd7c2f5ee288f648f4aeedc765a181188553 ]

During stress I/O tests with 500+ vports, hard LOCKUP call traces are
observed.

CPU A:
 native_queued_spin_lock_slowpath+0x192
 _raw_spin_lock_irqsave+0x32
 lpfc_handle_fcp_err+0x4c6
 lpfc_fcp_io_cmd_wqe_cmpl+0x964
 lpfc_sli4_fp_handle_cqe+0x266
 __lpfc_sli4_process_cq+0x105
 __lpfc_sli4_hba_process_cq+0x3c
 lpfc_cq_poll_hdler+0x16
 irq_poll_softirq+0x76
 __softirqentry_text_start+0xe4
 irq_exit+0xf7
 do_IRQ+0x7f

CPU B:
 native_queued_spin_lock_slowpath+0x5b
 _raw_spin_lock+0x1c
 lpfc_abort_handler+0x13e
 scmd_eh_abort_handler+0x85
 process_one_work+0x1a7
 worker_thread+0x30
 kthread+0x112
 ret_from_fork+0x1f

Diagram of lockup:

CPUA                            CPUB
----                            ----
lpfc_cmd->buf_lock
                            phba->hbalock
                            lpfc_cmd->buf_lock
phba->hbalock

Fix by reordering the taking of the lpfc_cmd->buf_lock and phba->hbalock in
lpfc_abort_handler routine so that it tries to take the lpfc_cmd->buf_lock
first before phba->hbalock.

Link: https://lore.kernel.org/r/20220412222008.126521-7-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/scsi/lpfc/lpfc_scsi.c | 33 +++++++++++++++------------------
 1 file changed, 15 insertions(+), 18 deletions(-)

diff --git a/drivers/scsi/lpfc/lpfc_scsi.c b/drivers/scsi/lpfc/lpfc_scsi.c
index ba9dbb51b75f..c4fa7d68fe03 100644
--- a/drivers/scsi/lpfc/lpfc_scsi.c
+++ b/drivers/scsi/lpfc/lpfc_scsi.c
@@ -5864,25 +5864,25 @@ lpfc_abort_handler(struct scsi_cmnd *cmnd)
 	if (!lpfc_cmd)
 		return ret;
 
-	spin_lock_irqsave(&phba->hbalock, flags);
+	/* Guard against IO completion being called at same time */
+	spin_lock_irqsave(&lpfc_cmd->buf_lock, flags);
+
+	spin_lock(&phba->hbalock);
 	/* driver queued commands are in process of being flushed */
 	if (phba->hba_flag & HBA_IOQ_FLUSH) {
 		lpfc_printf_vlog(vport, KERN_WARNING, LOG_FCP,
 			"3168 SCSI Layer abort requested I/O has been "
 			"flushed by LLD.\n");
 		ret = FAILED;
-		goto out_unlock;
+		goto out_unlock_hba;
 	}
 
-	/* Guard against IO completion being called at same time */
-	spin_lock(&lpfc_cmd->buf_lock);
-
 	if (!lpfc_cmd->pCmd) {
 		lpfc_printf_vlog(vport, KERN_WARNING, LOG_FCP,
 			 "2873 SCSI Layer I/O Abort Request IO CMPL Status "
 			 "x%x ID %d LUN %llu\n",
 			 SUCCESS, cmnd->device->id, cmnd->device->lun);
-		goto out_unlock_buf;
+		goto out_unlock_hba;
 	}
 
 	iocb = &lpfc_cmd->cur_iocbq;
@@ -5890,7 +5890,7 @@ lpfc_abort_handler(struct scsi_cmnd *cmnd)
 		pring_s4 = phba->sli4_hba.hdwq[iocb->hba_wqidx].io_wq->pring;
 		if (!pring_s4) {
 			ret = FAILED;
-			goto out_unlock_buf;
+			goto out_unlock_hba;
 		}
 		spin_lock(&pring_s4->ring_lock);
 	}
@@ -5923,8 +5923,8 @@ lpfc_abort_handler(struct scsi_cmnd *cmnd)
 			 "3389 SCSI Layer I/O Abort Request is pending\n");
 		if (phba->sli_rev == LPFC_SLI_REV4)
 			spin_unlock(&pring_s4->ring_lock);
-		spin_unlock(&lpfc_cmd->buf_lock);
-		spin_unlock_irqrestore(&phba->hbalock, flags);
+		spin_unlock(&phba->hbalock);
+		spin_unlock_irqrestore(&lpfc_cmd->buf_lock, flags);
 		goto wait_for_cmpl;
 	}
 
@@ -5945,15 +5945,13 @@ lpfc_abort_handler(struct scsi_cmnd *cmnd)
 	if (ret_val != IOCB_SUCCESS) {
 		/* Indicate the IO is not being aborted by the driver. */
 		lpfc_cmd->waitq = NULL;
-		spin_unlock(&lpfc_cmd->buf_lock);
-		spin_unlock_irqrestore(&phba->hbalock, flags);
 		ret = FAILED;
-		goto out;
+		goto out_unlock_hba;
 	}
 
 	/* no longer need the lock after this point */
-	spin_unlock(&lpfc_cmd->buf_lock);
-	spin_unlock_irqrestore(&phba->hbalock, flags);
+	spin_unlock(&phba->hbalock);
+	spin_unlock_irqrestore(&lpfc_cmd->buf_lock, flags);
 
 	if (phba->cfg_poll & DISABLE_FCP_RING_INT)
 		lpfc_sli_handle_fast_ring_event(phba,
@@ -5988,10 +5986,9 @@ lpfc_abort_handler(struct scsi_cmnd *cmnd)
 out_unlock_ring:
 	if (phba->sli_rev == LPFC_SLI_REV4)
 		spin_unlock(&pring_s4->ring_lock);
-out_unlock_buf:
-	spin_unlock(&lpfc_cmd->buf_lock);
-out_unlock:
-	spin_unlock_irqrestore(&phba->hbalock, flags);
+out_unlock_hba:
+	spin_unlock(&phba->hbalock);
+	spin_unlock_irqrestore(&lpfc_cmd->buf_lock, flags);
 out:
 	lpfc_printf_vlog(vport, KERN_WARNING, LOG_FCP,
 			 "0749 SCSI Layer I/O Abort Request Status x%x ID %d "
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH AUTOSEL 5.18 037/159] scsi: lpfc: Fix null pointer dereference after failing to issue FLOGI and PLOGI
       [not found] <20220530132425.1929512-1-sashal@kernel.org>
  2022-05-30 13:22 ` [PATCH AUTOSEL 5.18 035/159] scsi: lpfc: Move cfg_log_verbose check before calling lpfc_dmp_dbg() Sasha Levin
  2022-05-30 13:22 ` [PATCH AUTOSEL 5.18 036/159] scsi: lpfc: Fix SCSI I/O completion and abort handler deadlock Sasha Levin
@ 2022-05-30 13:22 ` Sasha Levin
  2022-05-30 13:22 ` [PATCH AUTOSEL 5.18 038/159] scsi: lpfc: Protect memory leak for NPIV ports sending PLOGI_RJT Sasha Levin
                   ` (8 subsequent siblings)
  11 siblings, 0 replies; 14+ messages in thread
From: Sasha Levin @ 2022-05-30 13:22 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: James Smart, Justin Tee, Martin K . Petersen, Sasha Levin,
	james.smart, dick.kennedy, jejb, linux-scsi

From: James Smart <jsmart2021@gmail.com>

[ Upstream commit 577a942df3de2666f6947bdd3a5c9e8d30073424 ]

If lpfc_issue_els_flogi() fails and returns non-zero status, the node
reference count is decremented to trigger the release of the nodelist
structure. However, if there is a prior registration or dev-loss-evt work
pending, the node may be released prematurely.  When dev-loss-evt
completes, the released node is referenced causing a use-after-free null
pointer dereference.

Similarly, when processing non-zero ELS PLOGI completion status in
lpfc_cmpl_els_plogi(), the ndlp flags are checked for a transport
registration before triggering node removal.  If dev-loss-evt work is
pending, the node may be released prematurely and a subsequent call to
lpfc_dev_loss_tmo_handler() results in a use after free ndlp dereference.

Add test for pending dev-loss before decrementing the node reference count
for FLOGI, PLOGI, PRLI, and ADISC handling.

Link: https://lore.kernel.org/r/20220412222008.126521-9-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/scsi/lpfc/lpfc_els.c | 51 +++++++++++++++++++++++++-----------
 1 file changed, 35 insertions(+), 16 deletions(-)

diff --git a/drivers/scsi/lpfc/lpfc_els.c b/drivers/scsi/lpfc/lpfc_els.c
index 872a26376ccb..46a01a51b207 100644
--- a/drivers/scsi/lpfc/lpfc_els.c
+++ b/drivers/scsi/lpfc/lpfc_els.c
@@ -1532,10 +1532,13 @@ lpfc_initial_flogi(struct lpfc_vport *vport)
 	}
 
 	if (lpfc_issue_els_flogi(vport, ndlp, 0)) {
-		/* This decrement of reference count to node shall kick off
-		 * the release of the node.
+		/* A node reference should be retained while registered with a
+		 * transport or dev-loss-evt work is pending.
+		 * Otherwise, decrement node reference to trigger release.
 		 */
-		lpfc_nlp_put(ndlp);
+		if (!(ndlp->fc4_xpt_flags & (SCSI_XPT_REGD | NVME_XPT_REGD)) &&
+		    !(ndlp->nlp_flag & NLP_IN_DEV_LOSS))
+			lpfc_nlp_put(ndlp);
 		return 0;
 	}
 	return 1;
@@ -1578,10 +1581,13 @@ lpfc_initial_fdisc(struct lpfc_vport *vport)
 	}
 
 	if (lpfc_issue_els_fdisc(vport, ndlp, 0)) {
-		/* decrement node reference count to trigger the release of
-		 * the node.
+		/* A node reference should be retained while registered with a
+		 * transport or dev-loss-evt work is pending.
+		 * Otherwise, decrement node reference to trigger release.
 		 */
-		lpfc_nlp_put(ndlp);
+		if (!(ndlp->fc4_xpt_flags & (SCSI_XPT_REGD | NVME_XPT_REGD)) &&
+		    !(ndlp->nlp_flag & NLP_IN_DEV_LOSS))
+			lpfc_nlp_put(ndlp);
 		return 0;
 	}
 	return 1;
@@ -1983,6 +1989,7 @@ lpfc_cmpl_els_plogi(struct lpfc_hba *phba, struct lpfc_iocbq *cmdiocb,
 	int disc;
 	struct serv_parm *sp = NULL;
 	u32 ulp_status, ulp_word4, did, iotag;
+	bool release_node = false;
 
 	/* we pass cmdiocb to state machine which needs rspiocb as well */
 	cmdiocb->context_un.rsp_iocb = rspiocb;
@@ -2071,19 +2078,21 @@ lpfc_cmpl_els_plogi(struct lpfc_hba *phba, struct lpfc_iocbq *cmdiocb,
 			spin_unlock_irq(&ndlp->lock);
 			goto out;
 		}
-		spin_unlock_irq(&ndlp->lock);
 
 		/* No PLOGI collision and the node is not registered with the
 		 * scsi or nvme transport. It is no longer an active node. Just
 		 * start the device remove process.
 		 */
 		if (!(ndlp->fc4_xpt_flags & (SCSI_XPT_REGD | NVME_XPT_REGD))) {
-			spin_lock_irq(&ndlp->lock);
 			ndlp->nlp_flag &= ~NLP_NPR_2B_DISC;
-			spin_unlock_irq(&ndlp->lock);
+			if (!(ndlp->nlp_flag & NLP_IN_DEV_LOSS))
+				release_node = true;
+		}
+		spin_unlock_irq(&ndlp->lock);
+
+		if (release_node)
 			lpfc_disc_state_machine(vport, ndlp, cmdiocb,
 						NLP_EVT_DEVICE_RM);
-		}
 	} else {
 		/* Good status, call state machine */
 		prsp = list_entry(((struct lpfc_dmabuf *)
@@ -2294,6 +2303,7 @@ lpfc_cmpl_els_prli(struct lpfc_hba *phba, struct lpfc_iocbq *cmdiocb,
 	u32 loglevel;
 	u32 ulp_status;
 	u32 ulp_word4;
+	bool release_node = false;
 
 	/* we pass cmdiocb to state machine which needs rspiocb as well */
 	cmdiocb->context_un.rsp_iocb = rspiocb;
@@ -2370,14 +2380,18 @@ lpfc_cmpl_els_prli(struct lpfc_hba *phba, struct lpfc_iocbq *cmdiocb,
 		 * it is no longer an active node.  Otherwise devloss
 		 * handles the final cleanup.
 		 */
+		spin_lock_irq(&ndlp->lock);
 		if (!(ndlp->fc4_xpt_flags & (SCSI_XPT_REGD | NVME_XPT_REGD)) &&
 		    !ndlp->fc4_prli_sent) {
-			spin_lock_irq(&ndlp->lock);
 			ndlp->nlp_flag &= ~NLP_NPR_2B_DISC;
-			spin_unlock_irq(&ndlp->lock);
+			if (!(ndlp->nlp_flag & NLP_IN_DEV_LOSS))
+				release_node = true;
+		}
+		spin_unlock_irq(&ndlp->lock);
+
+		if (release_node)
 			lpfc_disc_state_machine(vport, ndlp, cmdiocb,
 						NLP_EVT_DEVICE_RM);
-		}
 	} else {
 		/* Good status, call state machine.  However, if another
 		 * PRLI is outstanding, don't call the state machine
@@ -2749,6 +2763,7 @@ lpfc_cmpl_els_adisc(struct lpfc_hba *phba, struct lpfc_iocbq *cmdiocb,
 	struct lpfc_nodelist *ndlp;
 	int  disc;
 	u32 ulp_status, ulp_word4, tmo;
+	bool release_node = false;
 
 	/* we pass cmdiocb to state machine which needs rspiocb as well */
 	cmdiocb->context_un.rsp_iocb = rspiocb;
@@ -2815,13 +2830,17 @@ lpfc_cmpl_els_adisc(struct lpfc_hba *phba, struct lpfc_iocbq *cmdiocb,
 		 * transport, it is no longer an active node. Otherwise
 		 * devloss handles the final cleanup.
 		 */
+		spin_lock_irq(&ndlp->lock);
 		if (!(ndlp->fc4_xpt_flags & (SCSI_XPT_REGD | NVME_XPT_REGD))) {
-			spin_lock_irq(&ndlp->lock);
 			ndlp->nlp_flag &= ~NLP_NPR_2B_DISC;
-			spin_unlock_irq(&ndlp->lock);
+			if (!(ndlp->nlp_flag & NLP_IN_DEV_LOSS))
+				release_node = true;
+		}
+		spin_unlock_irq(&ndlp->lock);
+
+		if (release_node)
 			lpfc_disc_state_machine(vport, ndlp, cmdiocb,
 						NLP_EVT_DEVICE_RM);
-		}
 	} else
 		/* Good status, call state machine */
 		lpfc_disc_state_machine(vport, ndlp, cmdiocb,
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH AUTOSEL 5.18 038/159] scsi: lpfc: Protect memory leak for NPIV ports sending PLOGI_RJT
       [not found] <20220530132425.1929512-1-sashal@kernel.org>
                   ` (2 preceding siblings ...)
  2022-05-30 13:22 ` [PATCH AUTOSEL 5.18 037/159] scsi: lpfc: Fix null pointer dereference after failing to issue FLOGI and PLOGI Sasha Levin
@ 2022-05-30 13:22 ` Sasha Levin
  2022-05-30 13:22 ` [PATCH AUTOSEL 5.18 039/159] scsi: lpfc: Fix call trace observed during I/O with CMF enabled Sasha Levin
                   ` (7 subsequent siblings)
  11 siblings, 0 replies; 14+ messages in thread
From: Sasha Levin @ 2022-05-30 13:22 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: James Smart, Justin Tee, Martin K . Petersen, Sasha Levin,
	james.smart, dick.kennedy, jejb, linux-scsi

From: James Smart <jsmart2021@gmail.com>

[ Upstream commit 672d1cb40551ea9c95efad43ab6d45e4ab4e015f ]

There is a potential memory leak in lpfc_ignore_els_cmpl() and
lpfc_els_rsp_reject() that was allocated from NPIV PLOGI_RJT
(lpfc_rcv_plogi()'s login_mbox).

Check if cmdiocb->context_un.mbox was allocated in lpfc_ignore_els_cmpl(),
and then free it back to phba->mbox_mem_pool along with mbox->ctx_buf for
service parameters.

For lpfc_els_rsp_reject() failure, free both the ctx_buf for service
parameters and the login_mbox.

Link: https://lore.kernel.org/r/20220412222008.126521-10-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/scsi/lpfc/lpfc_nportdisc.c | 10 ++++++++--
 drivers/scsi/lpfc/lpfc_sli.c       | 17 +++++++++++++++++
 2 files changed, 25 insertions(+), 2 deletions(-)

diff --git a/drivers/scsi/lpfc/lpfc_nportdisc.c b/drivers/scsi/lpfc/lpfc_nportdisc.c
index c4e1a07066a2..4b065c51ee1b 100644
--- a/drivers/scsi/lpfc/lpfc_nportdisc.c
+++ b/drivers/scsi/lpfc/lpfc_nportdisc.c
@@ -614,9 +614,15 @@ lpfc_rcv_plogi(struct lpfc_vport *vport, struct lpfc_nodelist *ndlp,
 		stat.un.b.lsRjtRsnCode = LSRJT_INVALID_CMD;
 		stat.un.b.lsRjtRsnCodeExp = LSEXP_NOTHING_MORE;
 		rc = lpfc_els_rsp_reject(vport, stat.un.lsRjtError, cmdiocb,
-			ndlp, login_mbox);
-		if (rc)
+					 ndlp, login_mbox);
+		if (rc) {
+			mp = (struct lpfc_dmabuf *)login_mbox->ctx_buf;
+			if (mp) {
+				lpfc_mbuf_free(phba, mp->virt, mp->phys);
+				kfree(mp);
+			}
 			mempool_free(login_mbox, phba->mbox_mem_pool);
+		}
 		return 1;
 	}
 
diff --git a/drivers/scsi/lpfc/lpfc_sli.c b/drivers/scsi/lpfc/lpfc_sli.c
index 6adaf79e67cc..09a45f8ecf3f 100644
--- a/drivers/scsi/lpfc/lpfc_sli.c
+++ b/drivers/scsi/lpfc/lpfc_sli.c
@@ -12066,6 +12066,8 @@ lpfc_ignore_els_cmpl(struct lpfc_hba *phba, struct lpfc_iocbq *cmdiocb,
 {
 	struct lpfc_nodelist *ndlp = NULL;
 	IOCB_t *irsp;
+	LPFC_MBOXQ_t *mbox;
+	struct lpfc_dmabuf *mp;
 	u32 ulp_command, ulp_status, ulp_word4, iotag;
 
 	ulp_command = get_job_cmnd(phba, cmdiocb);
@@ -12077,6 +12079,21 @@ lpfc_ignore_els_cmpl(struct lpfc_hba *phba, struct lpfc_iocbq *cmdiocb,
 	} else {
 		irsp = &rspiocb->iocb;
 		iotag = irsp->ulpIoTag;
+
+		/* It is possible a PLOGI_RJT for NPIV ports to get aborted.
+		 * The MBX_REG_LOGIN64 mbox command is freed back to the
+		 * mbox_mem_pool here.
+		 */
+		if (cmdiocb->context_un.mbox) {
+			mbox = cmdiocb->context_un.mbox;
+			mp = (struct lpfc_dmabuf *)mbox->ctx_buf;
+			if (mp) {
+				lpfc_mbuf_free(phba, mp->virt, mp->phys);
+				kfree(mp);
+			}
+			mempool_free(mbox, phba->mbox_mem_pool);
+			cmdiocb->context_un.mbox = NULL;
+		}
 	}
 
 	/* ELS cmd tag <ulpIoTag> completes */
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH AUTOSEL 5.18 039/159] scsi: lpfc: Fix call trace observed during I/O with CMF enabled
       [not found] <20220530132425.1929512-1-sashal@kernel.org>
                   ` (3 preceding siblings ...)
  2022-05-30 13:22 ` [PATCH AUTOSEL 5.18 038/159] scsi: lpfc: Protect memory leak for NPIV ports sending PLOGI_RJT Sasha Levin
@ 2022-05-30 13:22 ` Sasha Levin
  2022-05-30 13:22 ` [PATCH AUTOSEL 5.18 058/159] scsi: megaraid: Fix error check return value of register_chrdev() Sasha Levin
                   ` (6 subsequent siblings)
  11 siblings, 0 replies; 14+ messages in thread
From: Sasha Levin @ 2022-05-30 13:22 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: James Smart, Justin Tee, Martin K . Petersen, Sasha Levin,
	james.smart, dick.kennedy, jejb, linux-scsi

From: James Smart <jsmart2021@gmail.com>

[ Upstream commit d6d45f67a11136cb88a70a29ab22ea6db8ae6bd5 ]

The following was seen with CMF enabled:

BUG: using smp_processor_id() in preemptible
code: systemd-udevd/31711
kernel: caller is lpfc_update_cmf_cmd+0x214/0x420  [lpfc]
kernel: CPU: 12 PID: 31711 Comm: systemd-udevd
kernel: Call Trace:
kernel: <TASK>
kernel: dump_stack_lvl+0x44/0x57
kernel: check_preemption_disabled+0xbf/0xe0
kernel: lpfc_update_cmf_cmd+0x214/0x420 [lpfc]
kernel: lpfc_nvme_fcp_io_submit+0x23b4/0x4df0 [lpfc]

this_cpu_ptr() calls smp_processor_id() in a preemptible context.

Fix by using per_cpu_ptr() with raw_smp_processor_id() instead.

Link: https://lore.kernel.org/r/20220412222008.126521-16-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/scsi/lpfc/lpfc_scsi.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/scsi/lpfc/lpfc_scsi.c b/drivers/scsi/lpfc/lpfc_scsi.c
index c4fa7d68fe03..f617a2ef6b0f 100644
--- a/drivers/scsi/lpfc/lpfc_scsi.c
+++ b/drivers/scsi/lpfc/lpfc_scsi.c
@@ -3835,7 +3835,7 @@ lpfc_update_cmf_cmpl(struct lpfc_hba *phba,
 		else
 			time = div_u64(time + 500, 1000); /* round it */
 
-		cgs = this_cpu_ptr(phba->cmf_stat);
+		cgs = per_cpu_ptr(phba->cmf_stat, raw_smp_processor_id());
 		atomic64_add(size, &cgs->rcv_bytes);
 		atomic64_add(time, &cgs->rx_latency);
 		atomic_inc(&cgs->rx_io_cnt);
@@ -3879,7 +3879,7 @@ lpfc_update_cmf_cmd(struct lpfc_hba *phba, uint32_t size)
 			atomic_set(&phba->rx_max_read_cnt, size);
 	}
 
-	cgs = this_cpu_ptr(phba->cmf_stat);
+	cgs = per_cpu_ptr(phba->cmf_stat, raw_smp_processor_id());
 	atomic64_add(size, &cgs->total_bytes);
 	return 0;
 }
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH AUTOSEL 5.18 058/159] scsi: megaraid: Fix error check return value of register_chrdev()
       [not found] <20220530132425.1929512-1-sashal@kernel.org>
                   ` (4 preceding siblings ...)
  2022-05-30 13:22 ` [PATCH AUTOSEL 5.18 039/159] scsi: lpfc: Fix call trace observed during I/O with CMF enabled Sasha Levin
@ 2022-05-30 13:22 ` Sasha Levin
  2022-05-30 13:22 ` [PATCH AUTOSEL 5.18 060/159] scsi: ufs: Use pm_runtime_resume_and_get() instead of pm_runtime_get_sync() Sasha Levin
                   ` (5 subsequent siblings)
  11 siblings, 0 replies; 14+ messages in thread
From: Sasha Levin @ 2022-05-30 13:22 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Lv Ruyi, Zeal Robot, Martin K . Petersen, Sasha Levin,
	kashyap.desai, sumit.saxena, shivasharan.srikanteshwara, jejb,
	megaraidlinux.pdl, linux-scsi

From: Lv Ruyi <lv.ruyi@zte.com.cn>

[ Upstream commit c5acd61dbb32b6bda0f3a354108f2b8dcb788985 ]

If major equals 0, register_chrdev() returns an error code when it fails.
This function dynamically allocates a major and returns its number on
success, so we should use "< 0" to check it instead of "!".

Link: https://lore.kernel.org/r/20220418105755.2558828-1-lv.ruyi@zte.com.cn
Reported-by: Zeal Robot <zealci@zte.com.cn>
Signed-off-by: Lv Ruyi <lv.ruyi@zte.com.cn>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/scsi/megaraid.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/scsi/megaraid.c b/drivers/scsi/megaraid.c
index a5d8cee2d510..bf491af9f0d6 100644
--- a/drivers/scsi/megaraid.c
+++ b/drivers/scsi/megaraid.c
@@ -4607,7 +4607,7 @@ static int __init megaraid_init(void)
 	 * major number allocation.
 	 */
 	major = register_chrdev(0, "megadev_legacy", &megadev_fops);
-	if (!major) {
+	if (major < 0) {
 		printk(KERN_WARNING
 				"megaraid: failed to register char device\n");
 	}
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH AUTOSEL 5.18 060/159] scsi: ufs: Use pm_runtime_resume_and_get() instead of pm_runtime_get_sync()
       [not found] <20220530132425.1929512-1-sashal@kernel.org>
                   ` (5 preceding siblings ...)
  2022-05-30 13:22 ` [PATCH AUTOSEL 5.18 058/159] scsi: megaraid: Fix error check return value of register_chrdev() Sasha Levin
@ 2022-05-30 13:22 ` Sasha Levin
  2022-05-30 13:22 ` [PATCH AUTOSEL 5.18 061/159] scsi: lpfc: Fix resource leak in lpfc_sli4_send_seq_to_ulp() Sasha Levin
                   ` (4 subsequent siblings)
  11 siblings, 0 replies; 14+ messages in thread
From: Sasha Levin @ 2022-05-30 13:22 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Minghao Chi, Zeal Robot, Martin K . Petersen, Sasha Levin, jejb,
	linux-scsi

From: Minghao Chi <chi.minghao@zte.com.cn>

[ Upstream commit 75b8715e20a20bc7b4844835e4035543a2674200 ]

Using pm_runtime_resume_and_get() to replace pm_runtime_get_sync() and
pm_runtime_put_noidle(). This change is just to simplify the code, no
actual functional changes.

Link: https://lore.kernel.org/r/20220420090353.2588804-1-chi.minghao@zte.com.cn
Reported-by: Zeal Robot <zealci@zte.com.cn>
Signed-off-by: Minghao Chi <chi.minghao@zte.com.cn>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/scsi/ufs/ti-j721e-ufs.c | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/drivers/scsi/ufs/ti-j721e-ufs.c b/drivers/scsi/ufs/ti-j721e-ufs.c
index eafe0db98d54..122d650d0810 100644
--- a/drivers/scsi/ufs/ti-j721e-ufs.c
+++ b/drivers/scsi/ufs/ti-j721e-ufs.c
@@ -29,11 +29,9 @@ static int ti_j721e_ufs_probe(struct platform_device *pdev)
 		return PTR_ERR(regbase);
 
 	pm_runtime_enable(dev);
-	ret = pm_runtime_get_sync(dev);
-	if (ret < 0) {
-		pm_runtime_put_noidle(dev);
+	ret = pm_runtime_resume_and_get(dev);
+	if (ret < 0)
 		goto disable_pm;
-	}
 
 	/* Select MPHY refclk frequency */
 	clk = devm_clk_get(dev, NULL);
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH AUTOSEL 5.18 061/159] scsi: lpfc: Fix resource leak in lpfc_sli4_send_seq_to_ulp()
       [not found] <20220530132425.1929512-1-sashal@kernel.org>
                   ` (6 preceding siblings ...)
  2022-05-30 13:22 ` [PATCH AUTOSEL 5.18 060/159] scsi: ufs: Use pm_runtime_resume_and_get() instead of pm_runtime_get_sync() Sasha Levin
@ 2022-05-30 13:22 ` Sasha Levin
  2022-05-30 13:22 ` [PATCH AUTOSEL 5.18 074/159] scsi: target: tcmu: Fix possible data corruption Sasha Levin
                   ` (3 subsequent siblings)
  11 siblings, 0 replies; 14+ messages in thread
From: Sasha Levin @ 2022-05-30 13:22 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: James Smart, Justin Tee, Martin K . Petersen, Sasha Levin,
	james.smart, dick.kennedy, jejb, linux-scsi

From: James Smart <jsmart2021@gmail.com>

[ Upstream commit 646db1a560f44236b7278b822ca99a1d3b6ea72c ]

If no handler is found in lpfc_complete_unsol_iocb() to match the rctl of a
received frame, the frame is dropped and resources are leaked.

Fix by returning resources when discarding an unhandled frame type.  Update
lpfc_fc_frame_check() handling of NOP basic link service.

Link: https://lore.kernel.org/r/20220426181419.9154-1-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/scsi/lpfc/lpfc_sli.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/scsi/lpfc/lpfc_sli.c b/drivers/scsi/lpfc/lpfc_sli.c
index 09a45f8ecf3f..a174e06bd96e 100644
--- a/drivers/scsi/lpfc/lpfc_sli.c
+++ b/drivers/scsi/lpfc/lpfc_sli.c
@@ -18124,7 +18124,6 @@ lpfc_fc_frame_check(struct lpfc_hba *phba, struct fc_frame_header *fc_hdr)
 	case FC_RCTL_ELS_REP:	/* extended link services reply */
 	case FC_RCTL_ELS4_REQ:	/* FC-4 ELS request */
 	case FC_RCTL_ELS4_REP:	/* FC-4 ELS reply */
-	case FC_RCTL_BA_NOP:  	/* basic link service NOP */
 	case FC_RCTL_BA_ABTS: 	/* basic link service abort */
 	case FC_RCTL_BA_RMC: 	/* remove connection */
 	case FC_RCTL_BA_ACC:	/* basic accept */
@@ -18145,6 +18144,7 @@ lpfc_fc_frame_check(struct lpfc_hba *phba, struct fc_frame_header *fc_hdr)
 		fc_vft_hdr = (struct fc_vft_header *)fc_hdr;
 		fc_hdr = &((struct fc_frame_header *)fc_vft_hdr)[1];
 		return lpfc_fc_frame_check(phba, fc_hdr);
+	case FC_RCTL_BA_NOP:	/* basic link service NOP */
 	default:
 		goto drop;
 	}
@@ -18959,12 +18959,14 @@ lpfc_sli4_send_seq_to_ulp(struct lpfc_vport *vport,
 	if (!lpfc_complete_unsol_iocb(phba,
 				      phba->sli4_hba.els_wq->pring,
 				      iocbq, fc_hdr->fh_r_ctl,
-				      fc_hdr->fh_type))
+				      fc_hdr->fh_type)) {
 		lpfc_printf_log(phba, KERN_ERR, LOG_TRACE_EVENT,
 				"2540 Ring %d handler: unexpected Rctl "
 				"x%x Type x%x received\n",
 				LPFC_ELS_RING,
 				fc_hdr->fh_r_ctl, fc_hdr->fh_type);
+		lpfc_in_buf_free(phba, &seq_dmabuf->dbuf);
+	}
 
 	/* Free iocb created in lpfc_prep_seq */
 	list_for_each_entry_safe(curr_iocb, next_iocb,
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH AUTOSEL 5.18 074/159] scsi: target: tcmu: Fix possible data corruption
       [not found] <20220530132425.1929512-1-sashal@kernel.org>
                   ` (7 preceding siblings ...)
  2022-05-30 13:22 ` [PATCH AUTOSEL 5.18 061/159] scsi: lpfc: Fix resource leak in lpfc_sli4_send_seq_to_ulp() Sasha Levin
@ 2022-05-30 13:22 ` Sasha Levin
  2022-05-30 15:47   ` Bodo Stroesser
  2022-05-30 13:23 ` [PATCH AUTOSEL 5.18 092/159] scsi: hisi_sas: Undo RPM resume for failed notify phy event for v3 HW Sasha Levin
                   ` (2 subsequent siblings)
  11 siblings, 1 reply; 14+ messages in thread
From: Sasha Levin @ 2022-05-30 13:22 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Xiaoguang Wang, Bodo Stroesser, Martin K . Petersen, Sasha Levin,
	linux-scsi, target-devel

From: Xiaoguang Wang <xiaoguang.wang@linux.alibaba.com>

[ Upstream commit bb9b9eb0ae2e9d3f6036f0ad907c3a83dcd43485 ]

When tcmu_vma_fault() gets a page successfully, before the current context
completes page fault procedure, find_free_blocks() may run and call
unmap_mapping_range() to unmap the page. Assume that when
find_free_blocks() initially completes and the previous page fault
procedure starts to run again and completes, then one truncated page has
been mapped to userspace. But note that tcmu_vma_fault() has gotten a
refcount for the page so any other subsystem won't be able to use the page
unless the userspace address is unmapped later.

If another command subsequently runs and needs to extend dbi_thresh it may
reuse the corresponding slot for the previous page in data_bitmap. Then
though we'll allocate new page for this slot in data_area, no page fault
will happen because we have a valid map and the real request's data will be
lost.

Filesystem implementations will also run into this issue but they usually
lock the page when vm_operations_struct->fault gets a page and unlock the
page after finish_fault() completes. For truncate filesystems lock pages in
truncate_inode_pages() to protect against racing wrt. page faults.

To fix this possible data corruption scenario we can apply a method similar
to the filesystems.  For pages that are to be freed, tcmu_blocks_release()
locks and unlocks. Make tcmu_vma_fault() also lock found page under
cmdr_lock. At the same time, since tcmu_vma_fault() gets an extra page
refcount, tcmu_blocks_release() won't free pages if pages are in page fault
procedure, which means it is safe to call tcmu_blocks_release() before
unmap_mapping_range().

With these changes tcmu_blocks_release() will wait for all page faults to
be completed before calling unmap_mapping_range(). And later, if
unmap_mapping_range() is called, it will ensure stale mappings are removed.

Link: https://lore.kernel.org/r/20220421023735.9018-1-xiaoguang.wang@linux.alibaba.com
Reviewed-by: Bodo Stroesser <bostroesser@gmail.com>
Signed-off-by: Xiaoguang Wang <xiaoguang.wang@linux.alibaba.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/target/target_core_user.c | 40 ++++++++++++++++++++++++++++---
 1 file changed, 37 insertions(+), 3 deletions(-)

diff --git a/drivers/target/target_core_user.c b/drivers/target/target_core_user.c
index fd7267baa707..b1fd06edea59 100644
--- a/drivers/target/target_core_user.c
+++ b/drivers/target/target_core_user.c
@@ -20,6 +20,7 @@
 #include <linux/configfs.h>
 #include <linux/mutex.h>
 #include <linux/workqueue.h>
+#include <linux/pagemap.h>
 #include <net/genetlink.h>
 #include <scsi/scsi_common.h>
 #include <scsi/scsi_proto.h>
@@ -1667,6 +1668,26 @@ static u32 tcmu_blocks_release(struct tcmu_dev *udev, unsigned long first,
 	xas_lock(&xas);
 	xas_for_each(&xas, page, (last + 1) * udev->data_pages_per_blk - 1) {
 		xas_store(&xas, NULL);
+		/*
+		 * While reaching here there may be page faults occurring on
+		 * the to-be-released pages. A race condition may occur if
+		 * unmap_mapping_range() is called before page faults on these
+		 * pages have completed; a valid but stale map is created.
+		 *
+		 * If another command subsequently runs and needs to extend
+		 * dbi_thresh, it may reuse the slot corresponding to the
+		 * previous page in data_bitmap. Though we will allocate a new
+		 * page for the slot in data_area, no page fault will happen
+		 * because we have a valid map. Therefore the command's data
+		 * will be lost.
+		 *
+		 * We lock and unlock pages that are to be released to ensure
+		 * all page faults have completed. This way
+		 * unmap_mapping_range() can ensure stale maps are cleanly
+		 * removed.
+		 */
+		lock_page(page);
+		unlock_page(page);
 		__free_page(page);
 		pages_freed++;
 	}
@@ -1822,6 +1843,7 @@ static struct page *tcmu_try_get_data_page(struct tcmu_dev *udev, uint32_t dpi)
 	page = xa_load(&udev->data_pages, dpi);
 	if (likely(page)) {
 		get_page(page);
+		lock_page(page);
 		mutex_unlock(&udev->cmdr_lock);
 		return page;
 	}
@@ -1863,6 +1885,7 @@ static vm_fault_t tcmu_vma_fault(struct vm_fault *vmf)
 	struct page *page;
 	unsigned long offset;
 	void *addr;
+	vm_fault_t ret = 0;
 
 	int mi = tcmu_find_mem_index(vmf->vma);
 	if (mi < 0)
@@ -1887,10 +1910,11 @@ static vm_fault_t tcmu_vma_fault(struct vm_fault *vmf)
 		page = tcmu_try_get_data_page(udev, dpi);
 		if (!page)
 			return VM_FAULT_SIGBUS;
+		ret = VM_FAULT_LOCKED;
 	}
 
 	vmf->page = page;
-	return 0;
+	return ret;
 }
 
 static const struct vm_operations_struct tcmu_vm_ops = {
@@ -3205,12 +3229,22 @@ static void find_free_blocks(void)
 			udev->dbi_max = block;
 		}
 
+		/*
+		 * Release the block pages.
+		 *
+		 * Also note that since tcmu_vma_fault() gets an extra page
+		 * refcount, tcmu_blocks_release() won't free pages if pages
+		 * are mapped. This means it is safe to call
+		 * tcmu_blocks_release() before unmap_mapping_range() which
+		 * drops the refcount of any pages it unmaps and thus releases
+		 * them.
+		 */
+		pages_freed = tcmu_blocks_release(udev, start, end - 1);
+
 		/* Here will truncate the data area from off */
 		off = udev->data_off + (loff_t)start * udev->data_blk_size;
 		unmap_mapping_range(udev->inode->i_mapping, off, 0, 1);
 
-		/* Release the block pages */
-		pages_freed = tcmu_blocks_release(udev, start, end - 1);
 		mutex_unlock(&udev->cmdr_lock);
 
 		total_pages_freed += pages_freed;
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH AUTOSEL 5.18 092/159] scsi: hisi_sas: Undo RPM resume for failed notify phy event for v3 HW
       [not found] <20220530132425.1929512-1-sashal@kernel.org>
                   ` (8 preceding siblings ...)
  2022-05-30 13:22 ` [PATCH AUTOSEL 5.18 074/159] scsi: target: tcmu: Fix possible data corruption Sasha Levin
@ 2022-05-30 13:23 ` Sasha Levin
  2022-05-30 13:23 ` [PATCH AUTOSEL 5.18 093/159] scsi: lpfc: Inhibit aborts if external loopback plug is inserted Sasha Levin
  2022-05-30 13:23 ` [PATCH AUTOSEL 5.18 094/159] scsi: lpfc: Alter FPIN stat accounting logic Sasha Levin
  11 siblings, 0 replies; 14+ messages in thread
From: Sasha Levin @ 2022-05-30 13:23 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Xiang Chen, Yihang Li, John Garry, Martin K . Petersen,
	Sasha Levin, jejb, linux-scsi

From: Xiang Chen <chenxiang66@hisilicon.com>

[ Upstream commit 9b5387fe5af38116b452259d87cd66594b6277c1 ]

If we fail to notify the phy up event then undo the RPM resume, as the phy
up notify event handling pairs with that RPM resume.

Link: https://lore.kernel.org/r/1651839939-101188-1-git-send-email-john.garry@huawei.com
Reported-by: Yihang Li <liyihang6@hisilicon.com>
Tested-by: Yihang Li <liyihang6@hisilicon.com>
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/scsi/hisi_sas/hisi_sas_v3_hw.c | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c b/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c
index 79f87d7c3e68..7d819fc0395e 100644
--- a/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c
+++ b/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c
@@ -1563,9 +1563,15 @@ static irqreturn_t phy_up_v3_hw(int phy_no, struct hisi_hba *hisi_hba)
 
 	phy->port_id = port_id;
 
-	/* Call pm_runtime_put_sync() with pairs in hisi_sas_phyup_pm_work() */
+	/*
+	 * Call pm_runtime_get_noresume() which pairs with
+	 * hisi_sas_phyup_pm_work() -> pm_runtime_put_sync().
+	 * For failure call pm_runtime_put() as we are in a hardirq context.
+	 */
 	pm_runtime_get_noresume(dev);
-	hisi_sas_notify_phy_event(phy, HISI_PHYE_PHY_UP_PM);
+	res = hisi_sas_notify_phy_event(phy, HISI_PHYE_PHY_UP_PM);
+	if (!res)
+		pm_runtime_put(dev);
 
 	res = IRQ_HANDLED;
 
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH AUTOSEL 5.18 093/159] scsi: lpfc: Inhibit aborts if external loopback plug is inserted
       [not found] <20220530132425.1929512-1-sashal@kernel.org>
                   ` (9 preceding siblings ...)
  2022-05-30 13:23 ` [PATCH AUTOSEL 5.18 092/159] scsi: hisi_sas: Undo RPM resume for failed notify phy event for v3 HW Sasha Levin
@ 2022-05-30 13:23 ` Sasha Levin
  2022-05-30 13:23 ` [PATCH AUTOSEL 5.18 094/159] scsi: lpfc: Alter FPIN stat accounting logic Sasha Levin
  11 siblings, 0 replies; 14+ messages in thread
From: Sasha Levin @ 2022-05-30 13:23 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: James Smart, Justin Tee, Martin K . Petersen, Sasha Levin,
	james.smart, dick.kennedy, jejb, linux-scsi

From: James Smart <jsmart2021@gmail.com>

[ Upstream commit ead76d4c09b89f4c8d632648026a476a5a34fde8 ]

After running a short external loopback test, when the external loopback is
removed and a normal cable inserted that is directly connected to a target
device, the system oops in the llpfc_set_rrq_active() routine.

When the loopback was inserted an FLOGI was transmit. As we're looped back,
we receive the FLOGI request. The FLOGI is ABTS'd as we recognize the same
wppn thus understand it's a loopback. However, as the ABTS sends address
information the port is not set to (fffffe), the ABTS is dropped on the
wire. A short 1 frame loopback test is run and completes before the ABTS
times out. The looback is unplugged and the new cable plugged in, and the
an FLOGI to the new device occurs and completes. Due to a mixup in ref
counting the completion of the new FLOGI releases the fabric ndlp. Then the
original ABTS completes and references the released ndlp generating the
oops.

Correct by no-op'ing the ABTS when in loopback mode (it will be dropped
anyway). Added a flag to track the mode to recognize when it should be
no-op'd.

Link: https://lore.kernel.org/r/20220506035519.50908-5-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/scsi/lpfc/lpfc.h         |  1 +
 drivers/scsi/lpfc/lpfc_els.c     | 12 ++++++++++++
 drivers/scsi/lpfc/lpfc_hbadisc.c |  3 +++
 drivers/scsi/lpfc/lpfc_sli.c     |  8 +++++---
 4 files changed, 21 insertions(+), 3 deletions(-)

diff --git a/drivers/scsi/lpfc/lpfc.h b/drivers/scsi/lpfc/lpfc.h
index 0025760230e5..da5e91a91151 100644
--- a/drivers/scsi/lpfc/lpfc.h
+++ b/drivers/scsi/lpfc/lpfc.h
@@ -1025,6 +1025,7 @@ struct lpfc_hba {
 #define LS_MDS_LINK_DOWN      0x8	/* MDS Diagnostics Link Down */
 #define LS_MDS_LOOPBACK       0x10	/* MDS Diagnostics Link Up (Loopback) */
 #define LS_CT_VEN_RPA         0x20	/* Vendor RPA sent to switch */
+#define LS_EXTERNAL_LOOPBACK  0x40	/* External loopback plug inserted */
 
 	uint32_t hba_flag;	/* hba generic flags */
 #define HBA_ERATT_HANDLED	0x1 /* This flag is set when eratt handled */
diff --git a/drivers/scsi/lpfc/lpfc_els.c b/drivers/scsi/lpfc/lpfc_els.c
index 46a01a51b207..9545a35f0777 100644
--- a/drivers/scsi/lpfc/lpfc_els.c
+++ b/drivers/scsi/lpfc/lpfc_els.c
@@ -1387,6 +1387,9 @@ lpfc_issue_els_flogi(struct lpfc_vport *vport, struct lpfc_nodelist *ndlp,
 
 	phba->hba_flag |= (HBA_FLOGI_ISSUED | HBA_FLOGI_OUTSTANDING);
 
+	/* Clear external loopback plug detected flag */
+	phba->link_flag &= ~LS_EXTERNAL_LOOPBACK;
+
 	/* Check for a deferred FLOGI ACC condition */
 	if (phba->defer_flogi_acc_flag) {
 		/* lookup ndlp for received FLOGI */
@@ -8182,6 +8185,9 @@ lpfc_els_rcv_flogi(struct lpfc_vport *vport, struct lpfc_iocbq *cmdiocb,
 	uint32_t fc_flag = 0;
 	uint32_t port_state = 0;
 
+	/* Clear external loopback plug detected flag */
+	phba->link_flag &= ~LS_EXTERNAL_LOOPBACK;
+
 	cmd = *lp++;
 	sp = (struct serv_parm *) lp;
 
@@ -8233,6 +8239,12 @@ lpfc_els_rcv_flogi(struct lpfc_vport *vport, struct lpfc_iocbq *cmdiocb,
 			return 1;
 		}
 
+		/* External loopback plug insertion detected */
+		phba->link_flag |= LS_EXTERNAL_LOOPBACK;
+
+		lpfc_printf_vlog(vport, KERN_INFO, LOG_ELS | LOG_LIBDFC,
+				 "1119 External Loopback plug detected\n");
+
 		/* abort the flogi coming back to ourselves
 		 * due to external loopback on the port.
 		 */
diff --git a/drivers/scsi/lpfc/lpfc_hbadisc.c b/drivers/scsi/lpfc/lpfc_hbadisc.c
index 2b877dff5ed4..6b6b3790d7b5 100644
--- a/drivers/scsi/lpfc/lpfc_hbadisc.c
+++ b/drivers/scsi/lpfc/lpfc_hbadisc.c
@@ -1221,6 +1221,9 @@ lpfc_linkdown(struct lpfc_hba *phba)
 
 	phba->defer_flogi_acc_flag = false;
 
+	/* Clear external loopback plug detected flag */
+	phba->link_flag &= ~LS_EXTERNAL_LOOPBACK;
+
 	spin_lock_irq(&phba->hbalock);
 	phba->fcf.fcf_flag &= ~(FCF_AVAILABLE | FCF_SCAN_DONE);
 	spin_unlock_irq(&phba->hbalock);
diff --git a/drivers/scsi/lpfc/lpfc_sli.c b/drivers/scsi/lpfc/lpfc_sli.c
index a174e06bd96e..11f907278f09 100644
--- a/drivers/scsi/lpfc/lpfc_sli.c
+++ b/drivers/scsi/lpfc/lpfc_sli.c
@@ -12202,7 +12202,8 @@ lpfc_sli_issue_abort_iotag(struct lpfc_hba *phba, struct lpfc_sli_ring *pring,
 
 	if (phba->link_state < LPFC_LINK_UP ||
 	    (phba->sli_rev == LPFC_SLI_REV4 &&
-	     phba->sli4_hba.link_state.status == LPFC_FC_LA_TYPE_LINK_DOWN))
+	     phba->sli4_hba.link_state.status == LPFC_FC_LA_TYPE_LINK_DOWN) ||
+	    (phba->link_flag & LS_EXTERNAL_LOOPBACK))
 		ia = true;
 	else
 		ia = false;
@@ -12661,7 +12662,8 @@ lpfc_sli_abort_taskmgmt(struct lpfc_vport *vport, struct lpfc_sli_ring *pring,
 		ndlp = lpfc_cmd->rdata->pnode;
 
 		if (lpfc_is_link_up(phba) &&
-		    (ndlp && ndlp->nlp_state == NLP_STE_MAPPED_NODE))
+		    (ndlp && ndlp->nlp_state == NLP_STE_MAPPED_NODE) &&
+		    !(phba->link_flag & LS_EXTERNAL_LOOPBACK))
 			ia = false;
 		else
 			ia = true;
@@ -21126,7 +21128,7 @@ lpfc_sli4_issue_abort_iotag(struct lpfc_hba *phba, struct lpfc_iocbq *cmdiocb,
 	abtswqe = &abtsiocb->wqe;
 	memset(abtswqe, 0, sizeof(*abtswqe));
 
-	if (!lpfc_is_link_up(phba))
+	if (!lpfc_is_link_up(phba) || (phba->link_flag & LS_EXTERNAL_LOOPBACK))
 		bf_set(abort_cmd_ia, &abtswqe->abort_cmd, 1);
 	bf_set(abort_cmd_criteria, &abtswqe->abort_cmd, T_XRI_TAG);
 	abtswqe->abort_cmd.rsrvd5 = 0;
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH AUTOSEL 5.18 094/159] scsi: lpfc: Alter FPIN stat accounting logic
       [not found] <20220530132425.1929512-1-sashal@kernel.org>
                   ` (10 preceding siblings ...)
  2022-05-30 13:23 ` [PATCH AUTOSEL 5.18 093/159] scsi: lpfc: Inhibit aborts if external loopback plug is inserted Sasha Levin
@ 2022-05-30 13:23 ` Sasha Levin
  11 siblings, 0 replies; 14+ messages in thread
From: Sasha Levin @ 2022-05-30 13:23 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: James Smart, Justin Tee, Martin K . Petersen, Sasha Levin,
	james.smart, dick.kennedy, jejb, linux-scsi

From: James Smart <jsmart2021@gmail.com>

[ Upstream commit e6f51041450282a8668af3a8fc5c7744e81a447c ]

When configuring CMF management based on signals instead of FPINs, FPIN
alarm and warning statistics are not tracked.

Change the behavior so that FPIN alarms and warnings are always tracked
regardless of the configured mode.

Similar changes are made in the CMF signal stat accounting logic.  Upon
receipt of a signal, only track signaled alarms and warnings. FPIN stats
should not be incremented upon receipt of a signal.

Link: https://lore.kernel.org/r/20220506035519.50908-11-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/scsi/lpfc/lpfc_els.c  | 49 +++++++++++------------------------
 drivers/scsi/lpfc/lpfc_init.c | 22 ++--------------
 2 files changed, 17 insertions(+), 54 deletions(-)

diff --git a/drivers/scsi/lpfc/lpfc_els.c b/drivers/scsi/lpfc/lpfc_els.c
index 9545a35f0777..892b3da1ba45 100644
--- a/drivers/scsi/lpfc/lpfc_els.c
+++ b/drivers/scsi/lpfc/lpfc_els.c
@@ -3877,9 +3877,6 @@ lpfc_least_capable_settings(struct lpfc_hba *phba,
 {
 	u32 rsp_sig_cap = 0, drv_sig_cap = 0;
 	u32 rsp_sig_freq_cyc = 0, rsp_sig_freq_scale = 0;
-	struct lpfc_cgn_info *cp;
-	u32 crc;
-	u16 sig_freq;
 
 	/* Get rsp signal and frequency capabilities.  */
 	rsp_sig_cap = be32_to_cpu(pcgd->xmt_signal_capability);
@@ -3935,25 +3932,7 @@ lpfc_least_capable_settings(struct lpfc_hba *phba,
 		}
 	}
 
-	if (!phba->cgn_i)
-		return;
-
-	/* Update signal frequency in congestion info buffer */
-	cp = (struct lpfc_cgn_info *)phba->cgn_i->virt;
-
-	/* Frequency (in ms) Signal Warning/Signal Congestion Notifications
-	 * are received by the HBA
-	 */
-	sig_freq = phba->cgn_sig_freq;
-
-	if (phba->cgn_reg_signal == EDC_CG_SIG_WARN_ONLY)
-		cp->cgn_warn_freq = cpu_to_le16(sig_freq);
-	if (phba->cgn_reg_signal == EDC_CG_SIG_WARN_ALARM) {
-		cp->cgn_alarm_freq = cpu_to_le16(sig_freq);
-		cp->cgn_warn_freq = cpu_to_le16(sig_freq);
-	}
-	crc = lpfc_cgn_calc_crc32(cp, LPFC_CGN_INFO_SZ, LPFC_CGN_CRC32_SEED);
-	cp->cgn_info_crc = cpu_to_le32(crc);
+	/* We are NOT recording signal frequency in congestion info buffer */
 	return;
 
 out_no_support:
@@ -9971,11 +9950,14 @@ lpfc_els_rcv_fpin_cgn(struct lpfc_hba *phba, struct fc_tlv_desc *tlv)
 			/* Take action here for an Alarm event */
 			if (phba->cmf_active_mode != LPFC_CFG_OFF) {
 				if (phba->cgn_reg_fpin & LPFC_CGN_FPIN_ALARM) {
-					/* Track of alarm cnt for cgn_info */
-					atomic_inc(&phba->cgn_fabric_alarm_cnt);
 					/* Track of alarm cnt for SYNC_WQE */
 					atomic_inc(&phba->cgn_sync_alarm_cnt);
 				}
+				/* Track alarm cnt for cgn_info regardless
+				 * of whether CMF is configured for Signals
+				 * or FPINs.
+				 */
+				atomic_inc(&phba->cgn_fabric_alarm_cnt);
 				goto cleanup;
 			}
 			break;
@@ -9983,11 +9965,14 @@ lpfc_els_rcv_fpin_cgn(struct lpfc_hba *phba, struct fc_tlv_desc *tlv)
 			/* Take action here for a Warning event */
 			if (phba->cmf_active_mode != LPFC_CFG_OFF) {
 				if (phba->cgn_reg_fpin & LPFC_CGN_FPIN_WARN) {
-					/* Track of warning cnt for cgn_info */
-					atomic_inc(&phba->cgn_fabric_warn_cnt);
 					/* Track of warning cnt for SYNC_WQE */
 					atomic_inc(&phba->cgn_sync_warn_cnt);
 				}
+				/* Track warning cnt and freq for cgn_info
+				 * regardless of whether CMF is configured for
+				 * Signals or FPINs.
+				 */
+				atomic_inc(&phba->cgn_fabric_warn_cnt);
 cleanup:
 				/* Save frequency in ms */
 				phba->cgn_fpin_frequency =
@@ -9996,14 +9981,10 @@ lpfc_els_rcv_fpin_cgn(struct lpfc_hba *phba, struct fc_tlv_desc *tlv)
 				if (phba->cgn_i) {
 					cp = (struct lpfc_cgn_info *)
 						phba->cgn_i->virt;
-					if (phba->cgn_reg_fpin &
-						LPFC_CGN_FPIN_ALARM)
-						cp->cgn_alarm_freq =
-							cpu_to_le16(value);
-					if (phba->cgn_reg_fpin &
-						LPFC_CGN_FPIN_WARN)
-						cp->cgn_warn_freq =
-							cpu_to_le16(value);
+					cp->cgn_alarm_freq =
+						cpu_to_le16(value);
+					cp->cgn_warn_freq =
+						cpu_to_le16(value);
 					crc = lpfc_cgn_calc_crc32
 						(cp,
 						LPFC_CGN_INFO_SZ,
diff --git a/drivers/scsi/lpfc/lpfc_init.c b/drivers/scsi/lpfc/lpfc_init.c
index f9cd4b72d949..011849c1ed3c 100644
--- a/drivers/scsi/lpfc/lpfc_init.c
+++ b/drivers/scsi/lpfc/lpfc_init.c
@@ -5866,21 +5866,8 @@ lpfc_cgn_save_evt_cnt(struct lpfc_hba *phba)
 
 	/* Use the frequency found in the last rcv'ed FPIN */
 	value = phba->cgn_fpin_frequency;
-	if (phba->cgn_reg_fpin & LPFC_CGN_FPIN_WARN)
-		cp->cgn_warn_freq = cpu_to_le16(value);
-	if (phba->cgn_reg_fpin & LPFC_CGN_FPIN_ALARM)
-		cp->cgn_alarm_freq = cpu_to_le16(value);
-
-	/* Frequency (in ms) Signal Warning/Signal Congestion Notifications
-	 * are received by the HBA
-	 */
-	value = phba->cgn_sig_freq;
-
-	if (phba->cgn_reg_signal == EDC_CG_SIG_WARN_ONLY ||
-	    phba->cgn_reg_signal == EDC_CG_SIG_WARN_ALARM)
-		cp->cgn_warn_freq = cpu_to_le16(value);
-	if (phba->cgn_reg_signal == EDC_CG_SIG_WARN_ALARM)
-		cp->cgn_alarm_freq = cpu_to_le16(value);
+	cp->cgn_warn_freq = cpu_to_le16(value);
+	cp->cgn_alarm_freq = cpu_to_le16(value);
 
 	lvalue = lpfc_cgn_calc_crc32(cp, LPFC_CGN_INFO_SZ,
 				     LPFC_CGN_CRC32_SEED);
@@ -6595,9 +6582,6 @@ lpfc_sli4_async_sli_evt(struct lpfc_hba *phba, struct lpfc_acqe_sli *acqe_sli)
 		/* Alarm overrides warning, so check that first */
 		if (cgn_signal->alarm_cnt) {
 			if (phba->cgn_reg_signal == EDC_CG_SIG_WARN_ALARM) {
-				/* Keep track of alarm cnt for cgn_info */
-				atomic_add(cgn_signal->alarm_cnt,
-					   &phba->cgn_fabric_alarm_cnt);
 				/* Keep track of alarm cnt for CMF_SYNC_WQE */
 				atomic_add(cgn_signal->alarm_cnt,
 					   &phba->cgn_sync_alarm_cnt);
@@ -6606,8 +6590,6 @@ lpfc_sli4_async_sli_evt(struct lpfc_hba *phba, struct lpfc_acqe_sli *acqe_sli)
 			/* signal action needs to be taken */
 			if (phba->cgn_reg_signal == EDC_CG_SIG_WARN_ONLY ||
 			    phba->cgn_reg_signal == EDC_CG_SIG_WARN_ALARM) {
-				/* Keep track of warning cnt for cgn_info */
-				atomic_add(cnt, &phba->cgn_fabric_warn_cnt);
 				/* Keep track of warning cnt for CMF_SYNC_WQE */
 				atomic_add(cnt, &phba->cgn_sync_warn_cnt);
 			}
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH AUTOSEL 5.18 074/159] scsi: target: tcmu: Fix possible data corruption
  2022-05-30 13:22 ` [PATCH AUTOSEL 5.18 074/159] scsi: target: tcmu: Fix possible data corruption Sasha Levin
@ 2022-05-30 15:47   ` Bodo Stroesser
  2022-06-05 13:16     ` Sasha Levin
  0 siblings, 1 reply; 14+ messages in thread
From: Bodo Stroesser @ 2022-05-30 15:47 UTC (permalink / raw)
  To: Sasha Levin, linux-kernel, stable
  Cc: Xiaoguang Wang, Martin K . Petersen, linux-scsi, target-devel

Sasha,

the below patch introduces a new bug, which is fixed by commit
   325d5c5fb216 ("scsi: target: tcmu: Avoid holding XArray lock when calling lock_page")
Please consider adding this further fix.

For my understanding: commit 325d5c5fb216 contains a "Fixes:"
tag. So I'd expect it to be added automatically.
Is there still something missing in the commit?

Thank you,
Bodo


On 30.05.22 15:22, Sasha Levin wrote:
> From: Xiaoguang Wang <xiaoguang.wang@linux.alibaba.com>
> 
> [ Upstream commit bb9b9eb0ae2e9d3f6036f0ad907c3a83dcd43485 ]
> 
> When tcmu_vma_fault() gets a page successfully, before the current context
> completes page fault procedure, find_free_blocks() may run and call
> unmap_mapping_range() to unmap the page. Assume that when
> find_free_blocks() initially completes and the previous page fault
> procedure starts to run again and completes, then one truncated page has
> been mapped to userspace. But note that tcmu_vma_fault() has gotten a
> refcount for the page so any other subsystem won't be able to use the page
> unless the userspace address is unmapped later.
> 
> If another command subsequently runs and needs to extend dbi_thresh it may
> reuse the corresponding slot for the previous page in data_bitmap. Then
> though we'll allocate new page for this slot in data_area, no page fault
> will happen because we have a valid map and the real request's data will be
> lost.
> 
> Filesystem implementations will also run into this issue but they usually
> lock the page when vm_operations_struct->fault gets a page and unlock the
> page after finish_fault() completes. For truncate filesystems lock pages in
> truncate_inode_pages() to protect against racing wrt. page faults.
> 
> To fix this possible data corruption scenario we can apply a method similar
> to the filesystems.  For pages that are to be freed, tcmu_blocks_release()
> locks and unlocks. Make tcmu_vma_fault() also lock found page under
> cmdr_lock. At the same time, since tcmu_vma_fault() gets an extra page
> refcount, tcmu_blocks_release() won't free pages if pages are in page fault
> procedure, which means it is safe to call tcmu_blocks_release() before
> unmap_mapping_range().
> 
> With these changes tcmu_blocks_release() will wait for all page faults to
> be completed before calling unmap_mapping_range(). And later, if
> unmap_mapping_range() is called, it will ensure stale mappings are removed.
> 
> Link: https://lore.kernel.org/r/20220421023735.9018-1-xiaoguang.wang@linux.alibaba.com
> Reviewed-by: Bodo Stroesser <bostroesser@gmail.com>
> Signed-off-by: Xiaoguang Wang <xiaoguang.wang@linux.alibaba.com>
> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
> Signed-off-by: Sasha Levin <sashal@kernel.org>
> ---
>   drivers/target/target_core_user.c | 40 ++++++++++++++++++++++++++++---
>   1 file changed, 37 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/target/target_core_user.c b/drivers/target/target_core_user.c
> index fd7267baa707..b1fd06edea59 100644
> --- a/drivers/target/target_core_user.c
> +++ b/drivers/target/target_core_user.c
> @@ -20,6 +20,7 @@
>   #include <linux/configfs.h>
>   #include <linux/mutex.h>
>   #include <linux/workqueue.h>
> +#include <linux/pagemap.h>
>   #include <net/genetlink.h>
>   #include <scsi/scsi_common.h>
>   #include <scsi/scsi_proto.h>
> @@ -1667,6 +1668,26 @@ static u32 tcmu_blocks_release(struct tcmu_dev *udev, unsigned long first,
>   	xas_lock(&xas);
>   	xas_for_each(&xas, page, (last + 1) * udev->data_pages_per_blk - 1) {
>   		xas_store(&xas, NULL);
> +		/*
> +		 * While reaching here there may be page faults occurring on
> +		 * the to-be-released pages. A race condition may occur if
> +		 * unmap_mapping_range() is called before page faults on these
> +		 * pages have completed; a valid but stale map is created.
> +		 *
> +		 * If another command subsequently runs and needs to extend
> +		 * dbi_thresh, it may reuse the slot corresponding to the
> +		 * previous page in data_bitmap. Though we will allocate a new
> +		 * page for the slot in data_area, no page fault will happen
> +		 * because we have a valid map. Therefore the command's data
> +		 * will be lost.
> +		 *
> +		 * We lock and unlock pages that are to be released to ensure
> +		 * all page faults have completed. This way
> +		 * unmap_mapping_range() can ensure stale maps are cleanly
> +		 * removed.
> +		 */
> +		lock_page(page);
> +		unlock_page(page);
>   		__free_page(page);
>   		pages_freed++;
>   	}
> @@ -1822,6 +1843,7 @@ static struct page *tcmu_try_get_data_page(struct tcmu_dev *udev, uint32_t dpi)
>   	page = xa_load(&udev->data_pages, dpi);
>   	if (likely(page)) {
>   		get_page(page);
> +		lock_page(page);
>   		mutex_unlock(&udev->cmdr_lock);
>   		return page;
>   	}
> @@ -1863,6 +1885,7 @@ static vm_fault_t tcmu_vma_fault(struct vm_fault *vmf)
>   	struct page *page;
>   	unsigned long offset;
>   	void *addr;
> +	vm_fault_t ret = 0;
>   
>   	int mi = tcmu_find_mem_index(vmf->vma);
>   	if (mi < 0)
> @@ -1887,10 +1910,11 @@ static vm_fault_t tcmu_vma_fault(struct vm_fault *vmf)
>   		page = tcmu_try_get_data_page(udev, dpi);
>   		if (!page)
>   			return VM_FAULT_SIGBUS;
> +		ret = VM_FAULT_LOCKED;
>   	}
>   
>   	vmf->page = page;
> -	return 0;
> +	return ret;
>   }
>   
>   static const struct vm_operations_struct tcmu_vm_ops = {
> @@ -3205,12 +3229,22 @@ static void find_free_blocks(void)
>   			udev->dbi_max = block;
>   		}
>   
> +		/*
> +		 * Release the block pages.
> +		 *
> +		 * Also note that since tcmu_vma_fault() gets an extra page
> +		 * refcount, tcmu_blocks_release() won't free pages if pages
> +		 * are mapped. This means it is safe to call
> +		 * tcmu_blocks_release() before unmap_mapping_range() which
> +		 * drops the refcount of any pages it unmaps and thus releases
> +		 * them.
> +		 */
> +		pages_freed = tcmu_blocks_release(udev, start, end - 1);
> +
>   		/* Here will truncate the data area from off */
>   		off = udev->data_off + (loff_t)start * udev->data_blk_size;
>   		unmap_mapping_range(udev->inode->i_mapping, off, 0, 1);
>   
> -		/* Release the block pages */
> -		pages_freed = tcmu_blocks_release(udev, start, end - 1);
>   		mutex_unlock(&udev->cmdr_lock);
>   
>   		total_pages_freed += pages_freed;

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH AUTOSEL 5.18 074/159] scsi: target: tcmu: Fix possible data corruption
  2022-05-30 15:47   ` Bodo Stroesser
@ 2022-06-05 13:16     ` Sasha Levin
  0 siblings, 0 replies; 14+ messages in thread
From: Sasha Levin @ 2022-06-05 13:16 UTC (permalink / raw)
  To: Bodo Stroesser
  Cc: linux-kernel, stable, Xiaoguang Wang, Martin K . Petersen,
	linux-scsi, target-devel

On Mon, May 30, 2022 at 05:47:12PM +0200, Bodo Stroesser wrote:
>Sasha,
>
>the below patch introduces a new bug, which is fixed by commit
>  325d5c5fb216 ("scsi: target: tcmu: Avoid holding XArray lock when calling lock_page")
>Please consider adding this further fix.
>
>For my understanding: commit 325d5c5fb216 contains a "Fixes:"
>tag. So I'd expect it to be added automatically.
>Is there still something missing in the commit?

I'll make sure it's added along with this one, thanks!

-- 
Thanks,
Sasha

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2022-06-05 13:17 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <20220530132425.1929512-1-sashal@kernel.org>
2022-05-30 13:22 ` [PATCH AUTOSEL 5.18 035/159] scsi: lpfc: Move cfg_log_verbose check before calling lpfc_dmp_dbg() Sasha Levin
2022-05-30 13:22 ` [PATCH AUTOSEL 5.18 036/159] scsi: lpfc: Fix SCSI I/O completion and abort handler deadlock Sasha Levin
2022-05-30 13:22 ` [PATCH AUTOSEL 5.18 037/159] scsi: lpfc: Fix null pointer dereference after failing to issue FLOGI and PLOGI Sasha Levin
2022-05-30 13:22 ` [PATCH AUTOSEL 5.18 038/159] scsi: lpfc: Protect memory leak for NPIV ports sending PLOGI_RJT Sasha Levin
2022-05-30 13:22 ` [PATCH AUTOSEL 5.18 039/159] scsi: lpfc: Fix call trace observed during I/O with CMF enabled Sasha Levin
2022-05-30 13:22 ` [PATCH AUTOSEL 5.18 058/159] scsi: megaraid: Fix error check return value of register_chrdev() Sasha Levin
2022-05-30 13:22 ` [PATCH AUTOSEL 5.18 060/159] scsi: ufs: Use pm_runtime_resume_and_get() instead of pm_runtime_get_sync() Sasha Levin
2022-05-30 13:22 ` [PATCH AUTOSEL 5.18 061/159] scsi: lpfc: Fix resource leak in lpfc_sli4_send_seq_to_ulp() Sasha Levin
2022-05-30 13:22 ` [PATCH AUTOSEL 5.18 074/159] scsi: target: tcmu: Fix possible data corruption Sasha Levin
2022-05-30 15:47   ` Bodo Stroesser
2022-06-05 13:16     ` Sasha Levin
2022-05-30 13:23 ` [PATCH AUTOSEL 5.18 092/159] scsi: hisi_sas: Undo RPM resume for failed notify phy event for v3 HW Sasha Levin
2022-05-30 13:23 ` [PATCH AUTOSEL 5.18 093/159] scsi: lpfc: Inhibit aborts if external loopback plug is inserted Sasha Levin
2022-05-30 13:23 ` [PATCH AUTOSEL 5.18 094/159] scsi: lpfc: Alter FPIN stat accounting logic Sasha Levin

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).