stable.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/2] scsi/ufs: Cherry-pick 2 fixes for null pointer into 5.4.y only
@ 2021-11-16 10:48 Orson Zhai
  2021-11-16 10:48 ` [PATCH 1/2] scsi: ufs: Fix interrupt error message for shared interrupts Orson Zhai
  2021-11-16 10:48 ` [PATCH 2/2] scsi: ufs: Fix tm request when non-fatal error happens Orson Zhai
  0 siblings, 2 replies; 5+ messages in thread
From: Orson Zhai @ 2021-11-16 10:48 UTC (permalink / raw)
  To: Greg Kroah-Hartman, stable
  Cc: Sasha Levin, Can Guo, Jaegeuk Kim, Martin K . Petersen,
	Avri Altman, Adrian Hunter, orson.zhai, Orson Zhai

From: Orson Zhai <orson.zhai@unisoc.com>

Hi Greg,

Following 2 patches were merged into 5.10.y but not in 5.4.y.
We've found kernel crashes on our devices with 5.4 stable caused by missing them.

Please feel free to add them into the stable queue for 5.4.y if no issue.

Thanks,
Orson

Adrian Hunter (1):
  scsi: ufs: Fix interrupt error message for shared interrupts

Jaegeuk Kim (1):
  scsi: ufs: Fix tm request when non-fatal error happens

 drivers/scsi/ufs/ufshcd.c | 18 +++++++++++++-----
 1 file changed, 13 insertions(+), 5 deletions(-)

-- 
2.7.4


^ permalink raw reply	[flat|nested] 5+ messages in thread

* [PATCH 1/2] scsi: ufs: Fix interrupt error message for shared interrupts
  2021-11-16 10:48 [PATCH 0/2] scsi/ufs: Cherry-pick 2 fixes for null pointer into 5.4.y only Orson Zhai
@ 2021-11-16 10:48 ` Orson Zhai
  2021-11-17 17:55   ` Greg Kroah-Hartman
  2021-11-16 10:48 ` [PATCH 2/2] scsi: ufs: Fix tm request when non-fatal error happens Orson Zhai
  1 sibling, 1 reply; 5+ messages in thread
From: Orson Zhai @ 2021-11-16 10:48 UTC (permalink / raw)
  To: Greg Kroah-Hartman, stable
  Cc: Sasha Levin, Can Guo, Jaegeuk Kim, Martin K . Petersen,
	Avri Altman, Adrian Hunter, orson.zhai, Orson Zhai

From: Adrian Hunter <adrian.hunter@intel.com>

[ Upstream commit 6337f58cec030b34ced435b3d9d7d29d63c96e36 ]

The interrupt might be shared, in which case it is not an error for the
interrupt handler to be called when the interrupt status is zero, so don't
print the message unless there was enabled interrupt status.

Change-Id: Ic18aa63b43d9479a62e8e664a73e70380669b109
Link: https://lore.kernel.org/r/20200811133936.19171-1-adrian.hunter@intel.com
Fixes: 9333d7757348 ("scsi: ufs: Fix irq return code")
Reviewed-by: Avri Altman <avri.altman@wdc.com>
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Orson Zhai <orson.zhai@unisoc.com>
---
 drivers/scsi/ufs/ufshcd.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
index 24396f4..a5d4ee6 100644
--- a/drivers/scsi/ufs/ufshcd.c
+++ b/drivers/scsi/ufs/ufshcd.c
@@ -5661,7 +5661,7 @@ static irqreturn_t ufshcd_intr(int irq, void *__hba)
 		intr_status = ufshcd_readl(hba, REG_INTERRUPT_STATUS);
 	}
 
-	if (retval == IRQ_NONE) {
+	if (enabled_intr_status && retval == IRQ_NONE) {
 		dev_err(hba->dev, "%s: Unhandled interrupt 0x%08x\n",
 					__func__, intr_status);
 		ufshcd_dump_regs(hba, 0, UFSHCI_REG_SPACE_SIZE, "host_regs: ");
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* [PATCH 2/2] scsi: ufs: Fix tm request when non-fatal error happens
  2021-11-16 10:48 [PATCH 0/2] scsi/ufs: Cherry-pick 2 fixes for null pointer into 5.4.y only Orson Zhai
  2021-11-16 10:48 ` [PATCH 1/2] scsi: ufs: Fix interrupt error message for shared interrupts Orson Zhai
@ 2021-11-16 10:48 ` Orson Zhai
  2021-11-17 17:58   ` Greg Kroah-Hartman
  1 sibling, 1 reply; 5+ messages in thread
From: Orson Zhai @ 2021-11-16 10:48 UTC (permalink / raw)
  To: Greg Kroah-Hartman, stable
  Cc: Sasha Levin, Can Guo, Jaegeuk Kim, Martin K . Petersen,
	Avri Altman, Adrian Hunter, orson.zhai, Orson Zhai

From: Jaegeuk Kim <jaegeuk@kernel.org>

[ Upstream commit eeb1b55b6e25c5f7265ff45cd050f3bc2cc423a4 ]

When non-fatal error like line-reset happens, ufshcd_err_handler() starts
to abort tasks by ufshcd_try_to_abort_task(). When it tries to issue a task
management request, we hit two warnings:

WARNING: CPU: 7 PID: 7 at block/blk-core.c:630 blk_get_request+0x68/0x70
WARNING: CPU: 4 PID: 157 at block/blk-mq-tag.c:82 blk_mq_get_tag+0x438/0x46c

After fixing the above warnings we hit another tm_cmd timeout which may be
caused by unstable controller state:

__ufshcd_issue_tm_cmd: task management cmd 0x80 timed-out

Then, ufshcd_err_handler() enters full reset, and kernel gets stuck. It
turned out ufshcd_print_trs() printed too many messages on console which
requires CPU locks. Likewise hba->silence_err_logs, we need to avoid too
verbose messages. This is actually not an error case.

Change-Id: I8a422b1f0e3152191f576548cc371a1a41115f59
Link: https://lore.kernel.org/r/20210107185316.788815-3-jaegeuk@kernel.org
Fixes: 69a6c269c097 ("scsi: ufs: Use blk_{get,put}_request() to allocate and free TMFs")
Reviewed-by: Can Guo <cang@codeaurora.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Orson Zhai <orson.zhai@unisoc.com>
---
 drivers/scsi/ufs/ufshcd.c | 18 +++++++++++++-----
 1 file changed, 13 insertions(+), 5 deletions(-)

diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
index a5d4ee6..4004506 100644
--- a/drivers/scsi/ufs/ufshcd.c
+++ b/drivers/scsi/ufs/ufshcd.c
@@ -4748,7 +4748,8 @@ ufshcd_transfer_rsp_status(struct ufs_hba *hba, struct ufshcd_lrb *lrbp)
 		break;
 	} /* end of switch */
 
-	if ((host_byte(result) != DID_OK) && !hba->silence_err_logs)
+	if ((host_byte(result) != DID_OK) &&
+	    (host_byte(result) != DID_REQUEUE) && !hba->silence_err_logs)
 		ufshcd_print_trs(hba, 1 << lrbp->task_tag, true);
 	return result;
 }
@@ -5661,9 +5662,13 @@ static irqreturn_t ufshcd_intr(int irq, void *__hba)
 		intr_status = ufshcd_readl(hba, REG_INTERRUPT_STATUS);
 	}
 
-	if (enabled_intr_status && retval == IRQ_NONE) {
-		dev_err(hba->dev, "%s: Unhandled interrupt 0x%08x\n",
-					__func__, intr_status);
+	if (enabled_intr_status && retval == IRQ_NONE &&
+				!ufshcd_eh_in_progress(hba)) {
+		dev_err(hba->dev, "%s: Unhandled interrupt 0x%08x (0x%08x, 0x%08x)\n",
+					__func__,
+					intr_status,
+					hba->ufs_stats.last_intr_status,
+					enabled_intr_status);
 		ufshcd_dump_regs(hba, 0, UFSHCI_REG_SPACE_SIZE, "host_regs: ");
 	}
 
@@ -5705,7 +5710,10 @@ static int __ufshcd_issue_tm_cmd(struct ufs_hba *hba,
 	/*
 	 * blk_get_request() is used here only to get a free tag.
 	 */
-	req = blk_get_request(q, REQ_OP_DRV_OUT, BLK_MQ_REQ_RESERVED);
+	req = blk_get_request(q, REQ_OP_DRV_OUT, 0);
+	if (IS_ERR(req))
+		return PTR_ERR(req);
+
 	req->end_io_data = &wait;
 	ufshcd_hold(hba, false);
 
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH 1/2] scsi: ufs: Fix interrupt error message for shared interrupts
  2021-11-16 10:48 ` [PATCH 1/2] scsi: ufs: Fix interrupt error message for shared interrupts Orson Zhai
@ 2021-11-17 17:55   ` Greg Kroah-Hartman
  0 siblings, 0 replies; 5+ messages in thread
From: Greg Kroah-Hartman @ 2021-11-17 17:55 UTC (permalink / raw)
  To: Orson Zhai
  Cc: stable, Sasha Levin, Can Guo, Jaegeuk Kim, Martin K . Petersen,
	Avri Altman, Adrian Hunter, orson.zhai, Orson Zhai

On Tue, Nov 16, 2021 at 06:48:30PM +0800, Orson Zhai wrote:
> From: Adrian Hunter <adrian.hunter@intel.com>
> 
> [ Upstream commit 6337f58cec030b34ced435b3d9d7d29d63c96e36 ]
> 
> The interrupt might be shared, in which case it is not an error for the
> interrupt handler to be called when the interrupt status is zero, so don't
> print the message unless there was enabled interrupt status.
> 
> Change-Id: Ic18aa63b43d9479a62e8e664a73e70380669b109

Why is this here?  :(

I'll go delete it...


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH 2/2] scsi: ufs: Fix tm request when non-fatal error happens
  2021-11-16 10:48 ` [PATCH 2/2] scsi: ufs: Fix tm request when non-fatal error happens Orson Zhai
@ 2021-11-17 17:58   ` Greg Kroah-Hartman
  0 siblings, 0 replies; 5+ messages in thread
From: Greg Kroah-Hartman @ 2021-11-17 17:58 UTC (permalink / raw)
  To: Orson Zhai
  Cc: stable, Sasha Levin, Can Guo, Jaegeuk Kim, Martin K . Petersen,
	Avri Altman, Adrian Hunter, orson.zhai, Orson Zhai

On Tue, Nov 16, 2021 at 06:48:31PM +0800, Orson Zhai wrote:
> From: Jaegeuk Kim <jaegeuk@kernel.org>
> 
> [ Upstream commit eeb1b55b6e25c5f7265ff45cd050f3bc2cc423a4 ]
> 
> When non-fatal error like line-reset happens, ufshcd_err_handler() starts
> to abort tasks by ufshcd_try_to_abort_task(). When it tries to issue a task
> management request, we hit two warnings:
> 
> WARNING: CPU: 7 PID: 7 at block/blk-core.c:630 blk_get_request+0x68/0x70
> WARNING: CPU: 4 PID: 157 at block/blk-mq-tag.c:82 blk_mq_get_tag+0x438/0x46c
> 
> After fixing the above warnings we hit another tm_cmd timeout which may be
> caused by unstable controller state:
> 
> __ufshcd_issue_tm_cmd: task management cmd 0x80 timed-out
> 
> Then, ufshcd_err_handler() enters full reset, and kernel gets stuck. It
> turned out ufshcd_print_trs() printed too many messages on console which
> requires CPU locks. Likewise hba->silence_err_logs, we need to avoid too
> verbose messages. This is actually not an error case.
> 
> Change-Id: I8a422b1f0e3152191f576548cc371a1a41115f59
> Link: https://lore.kernel.org/r/20210107185316.788815-3-jaegeuk@kernel.org
> Fixes: 69a6c269c097 ("scsi: ufs: Use blk_{get,put}_request() to allocate and free TMFs")
> Reviewed-by: Can Guo <cang@codeaurora.org>
> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
> Signed-off-by: Sasha Levin <sashal@kernel.org>
> Signed-off-by: Orson Zhai <orson.zhai@unisoc.com>
> ---
>  drivers/scsi/ufs/ufshcd.c | 18 +++++++++++++-----
>  1 file changed, 13 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
> index a5d4ee6..4004506 100644
> --- a/drivers/scsi/ufs/ufshcd.c
> +++ b/drivers/scsi/ufs/ufshcd.c
> @@ -4748,7 +4748,8 @@ ufshcd_transfer_rsp_status(struct ufs_hba *hba, struct ufshcd_lrb *lrbp)
>  		break;
>  	} /* end of switch */
>  
> -	if ((host_byte(result) != DID_OK) && !hba->silence_err_logs)
> +	if ((host_byte(result) != DID_OK) &&
> +	    (host_byte(result) != DID_REQUEUE) && !hba->silence_err_logs)
>  		ufshcd_print_trs(hba, 1 << lrbp->task_tag, true);
>  	return result;
>  }
> @@ -5661,9 +5662,13 @@ static irqreturn_t ufshcd_intr(int irq, void *__hba)
>  		intr_status = ufshcd_readl(hba, REG_INTERRUPT_STATUS);
>  	}
>  
> -	if (enabled_intr_status && retval == IRQ_NONE) {
> -		dev_err(hba->dev, "%s: Unhandled interrupt 0x%08x\n",
> -					__func__, intr_status);
> +	if (enabled_intr_status && retval == IRQ_NONE &&
> +				!ufshcd_eh_in_progress(hba)) {
> +		dev_err(hba->dev, "%s: Unhandled interrupt 0x%08x (0x%08x, 0x%08x)\n",
> +					__func__,
> +					intr_status,
> +					hba->ufs_stats.last_intr_status,
> +					enabled_intr_status);
>  		ufshcd_dump_regs(hba, 0, UFSHCI_REG_SPACE_SIZE, "host_regs: ");
>  	}
>  
> @@ -5705,7 +5710,10 @@ static int __ufshcd_issue_tm_cmd(struct ufs_hba *hba,
>  	/*
>  	 * blk_get_request() is used here only to get a free tag.
>  	 */
> -	req = blk_get_request(q, REQ_OP_DRV_OUT, BLK_MQ_REQ_RESERVED);
> +	req = blk_get_request(q, REQ_OP_DRV_OUT, 0);
> +	if (IS_ERR(req))
> +		return PTR_ERR(req);
> +
>  	req->end_io_data = &wait;
>  	ufshcd_hold(hba, false);
>  
> -- 
> 2.7.4
> 

This commit does not build :(

Did you test it?

Please fix up and resend AFTER testing it.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2021-11-17 17:58 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-11-16 10:48 [PATCH 0/2] scsi/ufs: Cherry-pick 2 fixes for null pointer into 5.4.y only Orson Zhai
2021-11-16 10:48 ` [PATCH 1/2] scsi: ufs: Fix interrupt error message for shared interrupts Orson Zhai
2021-11-17 17:55   ` Greg Kroah-Hartman
2021-11-16 10:48 ` [PATCH 2/2] scsi: ufs: Fix tm request when non-fatal error happens Orson Zhai
2021-11-17 17:58   ` Greg Kroah-Hartman

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).