All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v1] scsi: ufs: fix tm cmd timeout/ISR racing issue
@ 2021-11-11  9:49 ` peter.wang
  0 siblings, 0 replies; 7+ messages in thread
From: peter.wang @ 2021-11-11  9:49 UTC (permalink / raw)
  To: stanley.chu, linux-scsi, martin.petersen, avri.altman, alim.akhtar, jejb
  Cc: wsd_upstream, linux-mediatek, peter.wang, chun-hung.wu,
	alice.chao, cc.chou, chaotian.jing, jiajie.hao, powen.kao,
	jonathan.hsu, qilin.tan, lin.gui, mikebi

From: Peter Wang <peter.wang@mediatek.com>

When tmc 100 ms timeout and recevied tmc complete ISR concurrently,
Bug happen because complete NULL poiner and KE.
Fix this racing issue by check NULL and use host_lock protect.

Signed-off-by: Peter Wang <peter.wang@mediatek.com>
---
 drivers/scsi/ufs/ufshcd.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
index 5c6a58a666d2..6821ceb6783e 100644
--- a/drivers/scsi/ufs/ufshcd.c
+++ b/drivers/scsi/ufs/ufshcd.c
@@ -6442,7 +6442,8 @@ static irqreturn_t ufshcd_tmc_handler(struct ufs_hba *hba)
 		struct request *req = hba->tmf_rqs[tag];
 		struct completion *c = req->end_io_data;
 
-		complete(c);
+		if (c)
+			complete(c);
 		ret = IRQ_HANDLED;
 	}
 	spin_unlock_irqrestore(hba->host->host_lock, flags);
@@ -6597,7 +6598,10 @@ static int __ufshcd_issue_tm_cmd(struct ufs_hba *hba,
 		 * Make sure that ufshcd_compl_tm() does not trigger a
 		 * use-after-free.
 		 */
+		spin_lock_irqsave(hba->host->host_lock, flags);
 		req->end_io_data = NULL;
+		spin_unlock_irqrestore(hba->host->host_lock, flags);
+
 		ufshcd_add_tm_upiu_trace(hba, task_tag, UFS_TM_ERR);
 		dev_err(hba->dev, "%s: task management cmd 0x%.2x timed-out\n",
 				__func__, tm_function);
-- 
2.18.0


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [PATCH v1] scsi: ufs: fix tm cmd timeout/ISR racing issue
@ 2021-11-11  9:49 ` peter.wang
  0 siblings, 0 replies; 7+ messages in thread
From: peter.wang @ 2021-11-11  9:49 UTC (permalink / raw)
  To: stanley.chu, linux-scsi, martin.petersen, avri.altman, alim.akhtar, jejb
  Cc: wsd_upstream, linux-mediatek, peter.wang, chun-hung.wu,
	alice.chao, cc.chou, chaotian.jing, jiajie.hao, powen.kao,
	jonathan.hsu, qilin.tan, lin.gui, mikebi

From: Peter Wang <peter.wang@mediatek.com>

When tmc 100 ms timeout and recevied tmc complete ISR concurrently,
Bug happen because complete NULL poiner and KE.
Fix this racing issue by check NULL and use host_lock protect.

Signed-off-by: Peter Wang <peter.wang@mediatek.com>
---
 drivers/scsi/ufs/ufshcd.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
index 5c6a58a666d2..6821ceb6783e 100644
--- a/drivers/scsi/ufs/ufshcd.c
+++ b/drivers/scsi/ufs/ufshcd.c
@@ -6442,7 +6442,8 @@ static irqreturn_t ufshcd_tmc_handler(struct ufs_hba *hba)
 		struct request *req = hba->tmf_rqs[tag];
 		struct completion *c = req->end_io_data;
 
-		complete(c);
+		if (c)
+			complete(c);
 		ret = IRQ_HANDLED;
 	}
 	spin_unlock_irqrestore(hba->host->host_lock, flags);
@@ -6597,7 +6598,10 @@ static int __ufshcd_issue_tm_cmd(struct ufs_hba *hba,
 		 * Make sure that ufshcd_compl_tm() does not trigger a
 		 * use-after-free.
 		 */
+		spin_lock_irqsave(hba->host->host_lock, flags);
 		req->end_io_data = NULL;
+		spin_unlock_irqrestore(hba->host->host_lock, flags);
+
 		ufshcd_add_tm_upiu_trace(hba, task_tag, UFS_TM_ERR);
 		dev_err(hba->dev, "%s: task management cmd 0x%.2x timed-out\n",
 				__func__, tm_function);
-- 
2.18.0


_______________________________________________
Linux-mediatek mailing list
Linux-mediatek@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-mediatek

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH v1] scsi: ufs: fix tm cmd timeout/ISR racing issue
  2021-11-11  9:49 ` peter.wang
@ 2021-11-15 19:49   ` Bart Van Assche
  -1 siblings, 0 replies; 7+ messages in thread
From: Bart Van Assche @ 2021-11-15 19:49 UTC (permalink / raw)
  To: peter.wang, stanley.chu, linux-scsi, martin.petersen,
	avri.altman, alim.akhtar, jejb
  Cc: wsd_upstream, linux-mediatek, chun-hung.wu, alice.chao, cc.chou,
	chaotian.jing, jiajie.hao, powen.kao, jonathan.hsu, qilin.tan,
	lin.gui, mikebi

On 11/11/21 1:49 AM, peter.wang@mediatek.com wrote:
> From: Peter Wang <peter.wang@mediatek.com>
> 
> When tmc 100 ms timeout and recevied tmc complete ISR concurrently,
> Bug happen because complete NULL poiner and KE.
> Fix this racing issue by check NULL and use host_lock protect.
> 
> Signed-off-by: Peter Wang <peter.wang@mediatek.com>
> ---
>   drivers/scsi/ufs/ufshcd.c | 6 +++++-
>   1 file changed, 5 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
> index 5c6a58a666d2..6821ceb6783e 100644
> --- a/drivers/scsi/ufs/ufshcd.c
> +++ b/drivers/scsi/ufs/ufshcd.c
> @@ -6442,7 +6442,8 @@ static irqreturn_t ufshcd_tmc_handler(struct ufs_hba *hba)
>   		struct request *req = hba->tmf_rqs[tag];
>   		struct completion *c = req->end_io_data;
>   
> -		complete(c);
> +		if (c)
> +			complete(c);
>   		ret = IRQ_HANDLED;
>   	}
>   	spin_unlock_irqrestore(hba->host->host_lock, flags);
> @@ -6597,7 +6598,10 @@ static int __ufshcd_issue_tm_cmd(struct ufs_hba *hba,
>   		 * Make sure that ufshcd_compl_tm() does not trigger a
>   		 * use-after-free.
>   		 */
> +		spin_lock_irqsave(hba->host->host_lock, flags);
>   		req->end_io_data = NULL;
> +		spin_unlock_irqrestore(hba->host->host_lock, flags);
> +
>   		ufshcd_add_tm_upiu_trace(hba, task_tag, UFS_TM_ERR);
>   		dev_err(hba->dev, "%s: task management cmd 0x%.2x timed-out\n",
>   				__func__, tm_function);

Isn't this already addressed by Adrian Hunter's patches? See also
https://lore.kernel.org/linux-scsi/20211108064815.569494-1-adrian.hunter@intel.com/

Thanks,

Bart.



_______________________________________________
Linux-mediatek mailing list
Linux-mediatek@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-mediatek

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v1] scsi: ufs: fix tm cmd timeout/ISR racing issue
@ 2021-11-15 19:49   ` Bart Van Assche
  0 siblings, 0 replies; 7+ messages in thread
From: Bart Van Assche @ 2021-11-15 19:49 UTC (permalink / raw)
  To: peter.wang, stanley.chu, linux-scsi, martin.petersen,
	avri.altman, alim.akhtar, jejb
  Cc: wsd_upstream, linux-mediatek, chun-hung.wu, alice.chao, cc.chou,
	chaotian.jing, jiajie.hao, powen.kao, jonathan.hsu, qilin.tan,
	lin.gui, mikebi

On 11/11/21 1:49 AM, peter.wang@mediatek.com wrote:
> From: Peter Wang <peter.wang@mediatek.com>
> 
> When tmc 100 ms timeout and recevied tmc complete ISR concurrently,
> Bug happen because complete NULL poiner and KE.
> Fix this racing issue by check NULL and use host_lock protect.
> 
> Signed-off-by: Peter Wang <peter.wang@mediatek.com>
> ---
>   drivers/scsi/ufs/ufshcd.c | 6 +++++-
>   1 file changed, 5 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
> index 5c6a58a666d2..6821ceb6783e 100644
> --- a/drivers/scsi/ufs/ufshcd.c
> +++ b/drivers/scsi/ufs/ufshcd.c
> @@ -6442,7 +6442,8 @@ static irqreturn_t ufshcd_tmc_handler(struct ufs_hba *hba)
>   		struct request *req = hba->tmf_rqs[tag];
>   		struct completion *c = req->end_io_data;
>   
> -		complete(c);
> +		if (c)
> +			complete(c);
>   		ret = IRQ_HANDLED;
>   	}
>   	spin_unlock_irqrestore(hba->host->host_lock, flags);
> @@ -6597,7 +6598,10 @@ static int __ufshcd_issue_tm_cmd(struct ufs_hba *hba,
>   		 * Make sure that ufshcd_compl_tm() does not trigger a
>   		 * use-after-free.
>   		 */
> +		spin_lock_irqsave(hba->host->host_lock, flags);
>   		req->end_io_data = NULL;
> +		spin_unlock_irqrestore(hba->host->host_lock, flags);
> +
>   		ufshcd_add_tm_upiu_trace(hba, task_tag, UFS_TM_ERR);
>   		dev_err(hba->dev, "%s: task management cmd 0x%.2x timed-out\n",
>   				__func__, tm_function);

Isn't this already addressed by Adrian Hunter's patches? See also
https://lore.kernel.org/linux-scsi/20211108064815.569494-1-adrian.hunter@intel.com/

Thanks,

Bart.



^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v1] scsi: ufs: fix tm cmd timeout/ISR racing issue
  2021-11-15 19:49   ` Bart Van Assche
  (?)
@ 2021-11-16  6:57   ` Peter Wang
  2021-11-16 17:28       ` Bart Van Assche
  -1 siblings, 1 reply; 7+ messages in thread
From: Peter Wang @ 2021-11-16  6:57 UTC (permalink / raw)
  To: Bart Van Assche, stanley.chu, linux-scsi, martin.petersen,
	avri.altman, alim.akhtar, jejb
  Cc: wsd_upstream, linux-mediatek, chun-hung.wu, alice.chao, cc.chou,
	chaotian.jing, jiajie.hao, powen.kao, jonathan.hsu, qilin.tan,
	lin.gui, mikebi


On 11/16/21 3:49 AM, Bart Van Assche wrote:
> On 11/11/21 1:49 AM, peter.wang@mediatek.com wrote:
>> From: Peter Wang <peter.wang@mediatek.com>
>>
>> When tmc 100 ms timeout and recevied tmc complete ISR concurrently,
>> Bug happen because complete NULL poiner and KE.
>> Fix this racing issue by check NULL and use host_lock protect.
>>
>> Signed-off-by: Peter Wang <peter.wang@mediatek.com>
>> ---
>>   drivers/scsi/ufs/ufshcd.c | 6 +++++-
>>   1 file changed, 5 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
>> index 5c6a58a666d2..6821ceb6783e 100644
>> --- a/drivers/scsi/ufs/ufshcd.c
>> +++ b/drivers/scsi/ufs/ufshcd.c
>> @@ -6442,7 +6442,8 @@ static irqreturn_t ufshcd_tmc_handler(struct 
>> ufs_hba *hba)
>>           struct request *req = hba->tmf_rqs[tag];
>>           struct completion *c = req->end_io_data;
>>   -        complete(c);
>> +        if (c)
>> +            complete(c);
>>           ret = IRQ_HANDLED;
>>       }
>>       spin_unlock_irqrestore(hba->host->host_lock, flags);
>> @@ -6597,7 +6598,10 @@ static int __ufshcd_issue_tm_cmd(struct 
>> ufs_hba *hba,
>>            * Make sure that ufshcd_compl_tm() does not trigger a
>>            * use-after-free.
>>            */
>> +        spin_lock_irqsave(hba->host->host_lock, flags);
>>           req->end_io_data = NULL;
>> +        spin_unlock_irqrestore(hba->host->host_lock, flags);
>> +
>>           ufshcd_add_tm_upiu_trace(hba, task_tag, UFS_TM_ERR);
>>           dev_err(hba->dev, "%s: task management cmd 0x%.2x 
>> timed-out\n",
>>                   __func__, tm_function);
>
> Isn't this already addressed by Adrian Hunter's patches? See also
> https://urldefense.com/v3/__https://lore.kernel.org/linux-scsi/20211108064815.569494-1-adrian.hunter@intel.com/__;!!CTRNKA9wMg0ARbw!zttcrXBZgCk261BxtN67hFHTMRzOwcDr1IVH8znRw4I0POKCxUijARo7H3btU8SfRQ$ 
>
> Thanks,
>
> Bart.
>
Hi Bart,

Yes, I will drop this patch.

By the way, we observe that 100ms TMC timeout value may not enough for 
some device, maybe we need enlarge this value?


Thanks

Peter


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v1] scsi: ufs: fix tm cmd timeout/ISR racing issue
  2021-11-16  6:57   ` Peter Wang
@ 2021-11-16 17:28       ` Bart Van Assche
  0 siblings, 0 replies; 7+ messages in thread
From: Bart Van Assche @ 2021-11-16 17:28 UTC (permalink / raw)
  To: Peter Wang, stanley.chu, linux-scsi, martin.petersen,
	avri.altman, alim.akhtar, jejb
  Cc: wsd_upstream, linux-mediatek, chun-hung.wu, alice.chao, cc.chou,
	chaotian.jing, jiajie.hao, powen.kao, jonathan.hsu, qilin.tan,
	lin.gui, mikebi

On 11/15/21 22:57, Peter Wang wrote:
> By the way, we observe that 100ms TMC timeout value may not enough for
> some device, maybe we need enlarge this value?

Is that the TM_CMD_TIMEOUT constant? It surprises me that 100 ms is not 
enough. Will increasing that constant have a negative impact on the 
error handler in case it hits a task management timeout?

Thanks,

Bart.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v1] scsi: ufs: fix tm cmd timeout/ISR racing issue
@ 2021-11-16 17:28       ` Bart Van Assche
  0 siblings, 0 replies; 7+ messages in thread
From: Bart Van Assche @ 2021-11-16 17:28 UTC (permalink / raw)
  To: Peter Wang, stanley.chu, linux-scsi, martin.petersen,
	avri.altman, alim.akhtar, jejb
  Cc: wsd_upstream, linux-mediatek, chun-hung.wu, alice.chao, cc.chou,
	chaotian.jing, jiajie.hao, powen.kao, jonathan.hsu, qilin.tan,
	lin.gui, mikebi

On 11/15/21 22:57, Peter Wang wrote:
> By the way, we observe that 100ms TMC timeout value may not enough for
> some device, maybe we need enlarge this value?

Is that the TM_CMD_TIMEOUT constant? It surprises me that 100 ms is not 
enough. Will increasing that constant have a negative impact on the 
error handler in case it hits a task management timeout?

Thanks,

Bart.

_______________________________________________
Linux-mediatek mailing list
Linux-mediatek@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-mediatek

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2021-11-16 17:28 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-11-11  9:49 [PATCH v1] scsi: ufs: fix tm cmd timeout/ISR racing issue peter.wang
2021-11-11  9:49 ` peter.wang
2021-11-15 19:49 ` Bart Van Assche
2021-11-15 19:49   ` Bart Van Assche
2021-11-16  6:57   ` Peter Wang
2021-11-16 17:28     ` Bart Van Assche
2021-11-16 17:28       ` Bart Van Assche

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.