linux-scsi.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [RESEND RFC PATCH v1] scsi: ufs: add retries for SSU
       [not found] <CGME20200717074740epcas2p2b1c8e7bf7dc28f13c5a9999373f4601b@epcas2p2.samsung.com>
@ 2020-07-17  7:39 ` Lee Sang Hyun
  2020-07-18 20:30   ` Bart Van Assche
  0 siblings, 1 reply; 5+ messages in thread
From: Lee Sang Hyun @ 2020-07-17  7:39 UTC (permalink / raw)
  To: linux-scsi, alim.akhtar, avri.altman, jejb, martin.petersen,
	beanhuo, asutoshd, cang, bvanassche, grant.jung, sc.suh,
	hy50.seo, sh425.lee, kwmad.kim

Retry of SSU is not working because of UNIT_ATTENTION if SSU is failed.
So, more retry is needed.
More than 3 times of SSU(START STOP UNIT) retries are needed
to prevent a platform watchdog which is because of IO stuck.
Host sends SSU to device during resume to wake-up UFS device.
And system(host) can not do IO operations if SSU is failed.

There are no responses from UFS device and
ufshcd_eh_reset_handler() is called in the below log.
We need to do 3 times of retries to clear UNIT ATTENTION
which is because of HW RESET at this situation.

<3>[  636.089575]  [0: kworker/u16:11: 3578] ufshcd-qcom 1d84000.ufshc: ufshcd_abort: tag:0, cmd:0x1b, retries 2
<3>[  636.089898]  [0: kworker/u16:11: 3578] ufshcd-qcom 1d84000.ufshc: ufshcd_eh_host_reset_handler: reset in progress - 2
...
<4>[  638.271463]  [1:    kworker/1:1: 3552] scsi 0:0:0:49488: START_STOP failed for power mode: 1, result 30000
...
<6>[ 1056.481268]  [3:   msm_watchdog:   79]  current proc : 79 msm_watchdog
...
<4>[ 1056.576026]  [3:   msm_watchdog:   79] Call trace:
<4>[ 1056.576138]  [3:   msm_watchdog:   79]  __switch_to+0xb4/0xc0
<4>[ 1056.576267]  [3:   msm_watchdog:   79]  __schedule+0x8d8/0xc70
<4>[ 1056.576404]  [3:   msm_watchdog:   79]  io_schedule+0x8c/0xc0
<4>[ 1056.576530]  [3:   msm_watchdog:   79]  bit_wait_io+0x14/0x60
<4>[ 1056.576658]  [3:   msm_watchdog:   79]  out_of_line_wait_on_bit+0xd8/0x158
<4>[ 1056.576817]  [3:   msm_watchdog:   79]  __wait_on_buffer+0x24/0x30
<4>[ 1056.576962]  [3:   msm_watchdog:   79]  jbd2_journal_commit_transaction+0xf24/0x1970
<4>[ 1056.577134]  [3:   msm_watchdog:   79]  kjournald2+0x1e0/0x258
<4>[ 1056.577264]  [3:   msm_watchdog:   79]  kthread+0x110/0x120
<4>[ 1056.577389]  [3:   msm_watchdog:   79]  ret_from_fork+0x10/0x18
...
<0>[ 1056.784682]  [3:   msm_watchdog:   79] Kernel panic - not syncing: Platform Watchdog can't update sync_cnt
...

Change-Id: I933f774e04456226d760536200e9f079a5d5b987
Signed-off-by: Lee Sang Hyun <sh425.lee@samsung.com>
---
 drivers/scsi/ufs/ufshcd.c | 24 ++++++++++++++++--------
 1 file changed, 16 insertions(+), 8 deletions(-)

diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
index ad4fc82..30cee8c 100644
--- a/drivers/scsi/ufs/ufshcd.c
+++ b/drivers/scsi/ufs/ufshcd.c
@@ -100,6 +100,9 @@
 /* Default value of wait time before gating device ref clock */
 #define UFSHCD_REF_CLK_GATING_WAIT_US 0xFF /* microsecs */
 
+/* SSU retries */
+#define SSU_RETRIES 3
+
 #define ufshcd_toggle_vreg(_dev, _vreg, _on)				\
 	({                                                              \
 		int _ret;                                               \
@@ -7977,6 +7980,7 @@ static int ufshcd_set_dev_pwr_mode(struct ufs_hba *hba,
 	struct scsi_device *sdp;
 	unsigned long flags;
 	int ret;
+	int retries;
 
 	spin_lock_irqsave(hba->host->host_lock, flags);
 	sdp = hba->sdev_ufs_device;
@@ -8016,14 +8020,18 @@ static int ufshcd_set_dev_pwr_mode(struct ufs_hba *hba,
 	 * callbacks hence set the RQF_PM flag so that it doesn't resume the
 	 * already suspended childs.
 	 */
-	ret = scsi_execute(sdp, cmd, DMA_NONE, NULL, 0, NULL, &sshdr,
-			START_STOP_TIMEOUT, 0, 0, RQF_PM, NULL);
-	if (ret) {
-		sdev_printk(KERN_WARNING, sdp,
-			    "START_STOP failed for power mode: %d, result %x\n",
-			    pwr_mode, ret);
-		if (driver_byte(ret) == DRIVER_SENSE)
-			scsi_print_sense_hdr(sdp, NULL, &sshdr);
+	for (retries = 0; retries < SSU_RETRIES; retries++) {
+		ret = scsi_execute(sdp, cmd, DMA_NONE, NULL, 0, NULL, &sshdr,
+				START_STOP_TIMEOUT, 0, 0, RQF_PM, NULL);
+		if (ret) {
+			sdev_printk(KERN_WARNING, sdp,
+				    "START_STOP failed for power mode: %d, result %x\n",
+				    pwr_mode, ret);
+			if (driver_byte(ret) == DRIVER_SENSE)
+				scsi_print_sense_hdr(sdp, NULL, &sshdr);
+		} else {
+			break;
+		}
 	}
 
 	if (!ret)
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [RESEND RFC PATCH v1] scsi: ufs: add retries for SSU
  2020-07-17  7:39 ` [RESEND RFC PATCH v1] scsi: ufs: add retries for SSU Lee Sang Hyun
@ 2020-07-18 20:30   ` Bart Van Assche
  2020-07-22  9:14     ` Stanley Chu
  0 siblings, 1 reply; 5+ messages in thread
From: Bart Van Assche @ 2020-07-18 20:30 UTC (permalink / raw)
  To: Lee Sang Hyun, linux-scsi, alim.akhtar, avri.altman, jejb,
	martin.petersen, beanhuo, asutoshd, cang, grant.jung, sc.suh,
	hy50.seo, kwmad.kim

On 2020-07-17 00:39, Lee Sang Hyun wrote:
> -	ret = scsi_execute(sdp, cmd, DMA_NONE, NULL, 0, NULL, &sshdr,
> -			START_STOP_TIMEOUT, 0, 0, RQF_PM, NULL);
> -	if (ret) {
> -		sdev_printk(KERN_WARNING, sdp,
> -			    "START_STOP failed for power mode: %d, result %x\n",
> -			    pwr_mode, ret);
> -		if (driver_byte(ret) == DRIVER_SENSE)
> -			scsi_print_sense_hdr(sdp, NULL, &sshdr);
> +	for (retries = 0; retries < SSU_RETRIES; retries++) {
> +		ret = scsi_execute(sdp, cmd, DMA_NONE, NULL, 0, NULL, &sshdr,
> +				START_STOP_TIMEOUT, 0, 0, RQF_PM, NULL);
> +		if (ret) {
> +			sdev_printk(KERN_WARNING, sdp,
> +				    "START_STOP failed for power mode: %d, result %x\n",
> +				    pwr_mode, ret);
> +			if (driver_byte(ret) == DRIVER_SENSE)
> +				scsi_print_sense_hdr(sdp, NULL, &sshdr);
> +		} else {
> +			break;
> +		}

The ninth argument of scsi_execute() is called 'retries'. Wouldn't it be
better to pass a nonzero value as the 'retries' argument of
scsi_execute() instead of adding a loop around the scsi_execute() call?

Thanks,

Bart.

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [RESEND RFC PATCH v1] scsi: ufs: add retries for SSU
  2020-07-18 20:30   ` Bart Van Assche
@ 2020-07-22  9:14     ` Stanley Chu
  2020-07-22 15:10       ` Bart Van Assche
  0 siblings, 1 reply; 5+ messages in thread
From: Stanley Chu @ 2020-07-22  9:14 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: Lee Sang Hyun, linux-scsi, alim.akhtar, avri.altman, jejb,
	martin.petersen, beanhuo, asutoshd, cang, grant.jung, sc.suh,
	hy50.seo, kwmad.kim

Hi Bart,

On Sat, 2020-07-18 at 13:30 -0700, Bart Van Assche wrote:
> On 2020-07-17 00:39, Lee Sang Hyun wrote:
> > -	ret = scsi_execute(sdp, cmd, DMA_NONE, NULL, 0, NULL, &sshdr,
> > -			START_STOP_TIMEOUT, 0, 0, RQF_PM, NULL);
> > -	if (ret) {
> > -		sdev_printk(KERN_WARNING, sdp,
> > -			    "START_STOP failed for power mode: %d, result %x\n",
> > -			    pwr_mode, ret);
> > -		if (driver_byte(ret) == DRIVER_SENSE)
> > -			scsi_print_sense_hdr(sdp, NULL, &sshdr);
> > +	for (retries = 0; retries < SSU_RETRIES; retries++) {
> > +		ret = scsi_execute(sdp, cmd, DMA_NONE, NULL, 0, NULL, &sshdr,
> > +				START_STOP_TIMEOUT, 0, 0, RQF_PM, NULL);
> > +		if (ret) {
> > +			sdev_printk(KERN_WARNING, sdp,
> > +				    "START_STOP failed for power mode: %d, result %x\n",
> > +				    pwr_mode, ret);
> > +			if (driver_byte(ret) == DRIVER_SENSE)
> > +				scsi_print_sense_hdr(sdp, NULL, &sshdr);
> > +		} else {
> > +			break;
> > +		}
> 
> The ninth argument of scsi_execute() is called 'retries'. Wouldn't it be
> better to pass a nonzero value as the 'retries' argument of
> scsi_execute() instead of adding a loop around the scsi_execute() call?

If a SCSI command issued via scsi_execute() encounters "timeout" or
"check condition", scsi_noretry_cmd() will return 1 (true) because
blk_rq_is_passthrough() is true due to REQ_OP_SCSI_IN or REQ_OP_SCSI_OUT
flag was set to this SCSI command by scsi_execute(). Therefore even a
non-zero 'retries' value is assigned while calling scsi_execute(), the
failed command has no chance to be retried since the decision will be
no-retry by scsi_noretry_cmd().

(Take command timeout as example)
scsi_times_out()->scsi_abort_command()->scmd_eh_abort_handler(), here
scsi_noretry_cmd() returns 1, and then command will be finished (with
error code) without retry.

In scsi_noretry_cmd(), there is a comment message in section
"check_type" as below

	/*
	 * assume caller has checked sense and determined
	 * the check condition was retryable.
	 */

I am not sure if "timeout" and "check condition" cases in such SCSI
commands issued via scsi_execute() are specially designed to be unable
to retry.

Would you have any suggestions if LLD drivers would like to retry these
kinds of SCSI commands?

Thanks a lot,
Stanley Chu


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [RESEND RFC PATCH v1] scsi: ufs: add retries for SSU
  2020-07-22  9:14     ` Stanley Chu
@ 2020-07-22 15:10       ` Bart Van Assche
  2020-07-25  3:10         ` Stanley Chu
  0 siblings, 1 reply; 5+ messages in thread
From: Bart Van Assche @ 2020-07-22 15:10 UTC (permalink / raw)
  To: Stanley Chu
  Cc: Lee Sang Hyun, linux-scsi, alim.akhtar, avri.altman, jejb,
	martin.petersen, beanhuo, asutoshd, cang, grant.jung, sc.suh,
	hy50.seo, kwmad.kim

On 2020-07-22 02:14, Stanley Chu wrote:
> Hi Bart,
> 
> On Sat, 2020-07-18 at 13:30 -0700, Bart Van Assche wrote:
>> On 2020-07-17 00:39, Lee Sang Hyun wrote:
>>> -	ret = scsi_execute(sdp, cmd, DMA_NONE, NULL, 0, NULL, &sshdr,
>>> -			START_STOP_TIMEOUT, 0, 0, RQF_PM, NULL);
>>> -	if (ret) {
>>> -		sdev_printk(KERN_WARNING, sdp,
>>> -			    "START_STOP failed for power mode: %d, result %x\n",
>>> -			    pwr_mode, ret);
>>> -		if (driver_byte(ret) == DRIVER_SENSE)
>>> -			scsi_print_sense_hdr(sdp, NULL, &sshdr);
>>> +	for (retries = 0; retries < SSU_RETRIES; retries++) {
>>> +		ret = scsi_execute(sdp, cmd, DMA_NONE, NULL, 0, NULL, &sshdr,
>>> +				START_STOP_TIMEOUT, 0, 0, RQF_PM, NULL);
>>> +		if (ret) {
>>> +			sdev_printk(KERN_WARNING, sdp,
>>> +				    "START_STOP failed for power mode: %d, result %x\n",
>>> +				    pwr_mode, ret);
>>> +			if (driver_byte(ret) == DRIVER_SENSE)
>>> +				scsi_print_sense_hdr(sdp, NULL, &sshdr);
>>> +		} else {
>>> +			break;
>>> +		}
>>
>> The ninth argument of scsi_execute() is called 'retries'. Wouldn't it be
>> better to pass a nonzero value as the 'retries' argument of
>> scsi_execute() instead of adding a loop around the scsi_execute() call?
> 
> If a SCSI command issued via scsi_execute() encounters "timeout" or
> "check condition", scsi_noretry_cmd() will return 1 (true) because
> blk_rq_is_passthrough() is true due to REQ_OP_SCSI_IN or REQ_OP_SCSI_OUT
> flag was set to this SCSI command by scsi_execute(). Therefore even a
> non-zero 'retries' value is assigned while calling scsi_execute(), the
> failed command has no chance to be retried since the decision will be
> no-retry by scsi_noretry_cmd().
> 
> (Take command timeout as example)
> scsi_times_out()->scsi_abort_command()->scmd_eh_abort_handler(), here
> scsi_noretry_cmd() returns 1, and then command will be finished (with
> error code) without retry.
> 
> In scsi_noretry_cmd(), there is a comment message in section
> "check_type" as below
> 
> 	/*
> 	 * assume caller has checked sense and determined
> 	 * the check condition was retryable.
> 	 */
> 
> I am not sure if "timeout" and "check condition" cases in such SCSI
> commands issued via scsi_execute() are specially designed to be unable
> to retry.
> 
> Would you have any suggestions if LLD drivers would like to retry these
> kinds of SCSI commands?

How about summarizing the above explanation of why the 'retry' argument
of scsi_execute() cannot be used in a two or three line comment and to
add that comment above the loop introduced by this patch?

Thanks,

Bart.

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [RESEND RFC PATCH v1] scsi: ufs: add retries for SSU
  2020-07-22 15:10       ` Bart Van Assche
@ 2020-07-25  3:10         ` Stanley Chu
  0 siblings, 0 replies; 5+ messages in thread
From: Stanley Chu @ 2020-07-25  3:10 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: Stanley Chu, Lee Sang Hyun, linux-scsi, Alim Akhtar, Avri Altman,
	James E.J. Bottomley, Martin K. Petersen, Bean Huo (beanhuo),
	Asutosh Das, cang, grant.jung, sc.suh, hy50.seo, Kiwoong Kim

Hi Bart,

Thanks for your suggestions.

Hi Sang Hyun,

Bart Van Assche <bvanassche@acm.org> 於 2020年7月22日 週三 下午11:11寫道:
>
> On 2020-07-22 02:14, Stanley Chu wrote:
> > Hi Bart,
> >
> > On Sat, 2020-07-18 at 13:30 -0700, Bart Van Assche wrote:
> >> On 2020-07-17 00:39, Lee Sang Hyun wrote:
> >>> -   ret = scsi_execute(sdp, cmd, DMA_NONE, NULL, 0, NULL, &sshdr,
> >>> -                   START_STOP_TIMEOUT, 0, 0, RQF_PM, NULL);
> >>> -   if (ret) {
> >>> -           sdev_printk(KERN_WARNING, sdp,
> >>> -                       "START_STOP failed for power mode: %d, result %x\n",
> >>> -                       pwr_mode, ret);
> >>> -           if (driver_byte(ret) == DRIVER_SENSE)
> >>> -                   scsi_print_sense_hdr(sdp, NULL, &sshdr);
> >>> +   for (retries = 0; retries < SSU_RETRIES; retries++) {
> >>> +           ret = scsi_execute(sdp, cmd, DMA_NONE, NULL, 0, NULL, &sshdr,
> >>> +                           START_STOP_TIMEOUT, 0, 0, RQF_PM, NULL);
> >>> +           if (ret) {
> >>> +                   sdev_printk(KERN_WARNING, sdp,
> >>> +                               "START_STOP failed for power mode: %d, result %x\n",
> >>> +                               pwr_mode, ret);
> >>> +                   if (driver_byte(ret) == DRIVER_SENSE)
> >>> +                           scsi_print_sense_hdr(sdp, NULL, &sshdr);
> >>> +           } else {
> >>> +                   break;
> >>> +           }
> >>
> >> The ninth argument of scsi_execute() is called 'retries'. Wouldn't it be
> >> better to pass a nonzero value as the 'retries' argument of
> >> scsi_execute() instead of adding a loop around the scsi_execute() call?
> >
> > If a SCSI command issued via scsi_execute() encounters "timeout" or
> > "check condition", scsi_noretry_cmd() will return 1 (true) because
> > blk_rq_is_passthrough() is true due to REQ_OP_SCSI_IN or REQ_OP_SCSI_OUT
> > flag was set to this SCSI command by scsi_execute(). Therefore even a
> > non-zero 'retries' value is assigned while calling scsi_execute(), the
> > failed command has no chance to be retried since the decision will be
> > no-retry by scsi_noretry_cmd().
> >
> > (Take command timeout as example)
> > scsi_times_out()->scsi_abort_command()->scmd_eh_abort_handler(), here
> > scsi_noretry_cmd() returns 1, and then command will be finished (with
> > error code) without retry.
> >
> > In scsi_noretry_cmd(), there is a comment message in section
> > "check_type" as below
> >
> >       /*
> >        * assume caller has checked sense and determined
> >        * the check condition was retryable.
> >        */
> >
> > I am not sure if "timeout" and "check condition" cases in such SCSI
> > commands issued via scsi_execute() are specially designed to be unable
> > to retry.
> >
> > Would you have any suggestions if LLD drivers would like to retry these
> > kinds of SCSI commands?
>
> How about summarizing the above explanation of why the 'retry' argument
> of scsi_execute() cannot be used in a two or three line comment and to
> add that comment above the loop introduced by this patch?
>

I like this patch since this could also recover the "SSU timeout" case
we met before.
Could you please make this as formal patch with Bart's suggestion?

Thanks,
Stanley Chu

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2020-07-25  3:10 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <CGME20200717074740epcas2p2b1c8e7bf7dc28f13c5a9999373f4601b@epcas2p2.samsung.com>
2020-07-17  7:39 ` [RESEND RFC PATCH v1] scsi: ufs: add retries for SSU Lee Sang Hyun
2020-07-18 20:30   ` Bart Van Assche
2020-07-22  9:14     ` Stanley Chu
2020-07-22 15:10       ` Bart Van Assche
2020-07-25  3:10         ` Stanley Chu

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).