* [RESEND RFC PATCH v1] scsi: ufs: add retries for SSU [not found] <CGME20200717074740epcas2p2b1c8e7bf7dc28f13c5a9999373f4601b@epcas2p2.samsung.com> @ 2020-07-17 7:39 ` Lee Sang Hyun 2020-07-18 20:30 ` Bart Van Assche 0 siblings, 1 reply; 5+ messages in thread From: Lee Sang Hyun @ 2020-07-17 7:39 UTC (permalink / raw) To: linux-scsi, alim.akhtar, avri.altman, jejb, martin.petersen, beanhuo, asutoshd, cang, bvanassche, grant.jung, sc.suh, hy50.seo, sh425.lee, kwmad.kim Retry of SSU is not working because of UNIT_ATTENTION if SSU is failed. So, more retry is needed. More than 3 times of SSU(START STOP UNIT) retries are needed to prevent a platform watchdog which is because of IO stuck. Host sends SSU to device during resume to wake-up UFS device. And system(host) can not do IO operations if SSU is failed. There are no responses from UFS device and ufshcd_eh_reset_handler() is called in the below log. We need to do 3 times of retries to clear UNIT ATTENTION which is because of HW RESET at this situation. <3>[ 636.089575] [0: kworker/u16:11: 3578] ufshcd-qcom 1d84000.ufshc: ufshcd_abort: tag:0, cmd:0x1b, retries 2 <3>[ 636.089898] [0: kworker/u16:11: 3578] ufshcd-qcom 1d84000.ufshc: ufshcd_eh_host_reset_handler: reset in progress - 2 ... <4>[ 638.271463] [1: kworker/1:1: 3552] scsi 0:0:0:49488: START_STOP failed for power mode: 1, result 30000 ... <6>[ 1056.481268] [3: msm_watchdog: 79] current proc : 79 msm_watchdog ... <4>[ 1056.576026] [3: msm_watchdog: 79] Call trace: <4>[ 1056.576138] [3: msm_watchdog: 79] __switch_to+0xb4/0xc0 <4>[ 1056.576267] [3: msm_watchdog: 79] __schedule+0x8d8/0xc70 <4>[ 1056.576404] [3: msm_watchdog: 79] io_schedule+0x8c/0xc0 <4>[ 1056.576530] [3: msm_watchdog: 79] bit_wait_io+0x14/0x60 <4>[ 1056.576658] [3: msm_watchdog: 79] out_of_line_wait_on_bit+0xd8/0x158 <4>[ 1056.576817] [3: msm_watchdog: 79] __wait_on_buffer+0x24/0x30 <4>[ 1056.576962] [3: msm_watchdog: 79] jbd2_journal_commit_transaction+0xf24/0x1970 <4>[ 1056.577134] [3: msm_watchdog: 79] kjournald2+0x1e0/0x258 <4>[ 1056.577264] [3: msm_watchdog: 79] kthread+0x110/0x120 <4>[ 1056.577389] [3: msm_watchdog: 79] ret_from_fork+0x10/0x18 ... <0>[ 1056.784682] [3: msm_watchdog: 79] Kernel panic - not syncing: Platform Watchdog can't update sync_cnt ... Change-Id: I933f774e04456226d760536200e9f079a5d5b987 Signed-off-by: Lee Sang Hyun <sh425.lee@samsung.com> --- drivers/scsi/ufs/ufshcd.c | 24 ++++++++++++++++-------- 1 file changed, 16 insertions(+), 8 deletions(-) diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c index ad4fc82..30cee8c 100644 --- a/drivers/scsi/ufs/ufshcd.c +++ b/drivers/scsi/ufs/ufshcd.c @@ -100,6 +100,9 @@ /* Default value of wait time before gating device ref clock */ #define UFSHCD_REF_CLK_GATING_WAIT_US 0xFF /* microsecs */ +/* SSU retries */ +#define SSU_RETRIES 3 + #define ufshcd_toggle_vreg(_dev, _vreg, _on) \ ({ \ int _ret; \ @@ -7977,6 +7980,7 @@ static int ufshcd_set_dev_pwr_mode(struct ufs_hba *hba, struct scsi_device *sdp; unsigned long flags; int ret; + int retries; spin_lock_irqsave(hba->host->host_lock, flags); sdp = hba->sdev_ufs_device; @@ -8016,14 +8020,18 @@ static int ufshcd_set_dev_pwr_mode(struct ufs_hba *hba, * callbacks hence set the RQF_PM flag so that it doesn't resume the * already suspended childs. */ - ret = scsi_execute(sdp, cmd, DMA_NONE, NULL, 0, NULL, &sshdr, - START_STOP_TIMEOUT, 0, 0, RQF_PM, NULL); - if (ret) { - sdev_printk(KERN_WARNING, sdp, - "START_STOP failed for power mode: %d, result %x\n", - pwr_mode, ret); - if (driver_byte(ret) == DRIVER_SENSE) - scsi_print_sense_hdr(sdp, NULL, &sshdr); + for (retries = 0; retries < SSU_RETRIES; retries++) { + ret = scsi_execute(sdp, cmd, DMA_NONE, NULL, 0, NULL, &sshdr, + START_STOP_TIMEOUT, 0, 0, RQF_PM, NULL); + if (ret) { + sdev_printk(KERN_WARNING, sdp, + "START_STOP failed for power mode: %d, result %x\n", + pwr_mode, ret); + if (driver_byte(ret) == DRIVER_SENSE) + scsi_print_sense_hdr(sdp, NULL, &sshdr); + } else { + break; + } } if (!ret) -- 2.7.4 ^ permalink raw reply related [flat|nested] 5+ messages in thread
* Re: [RESEND RFC PATCH v1] scsi: ufs: add retries for SSU 2020-07-17 7:39 ` [RESEND RFC PATCH v1] scsi: ufs: add retries for SSU Lee Sang Hyun @ 2020-07-18 20:30 ` Bart Van Assche 2020-07-22 9:14 ` Stanley Chu 0 siblings, 1 reply; 5+ messages in thread From: Bart Van Assche @ 2020-07-18 20:30 UTC (permalink / raw) To: Lee Sang Hyun, linux-scsi, alim.akhtar, avri.altman, jejb, martin.petersen, beanhuo, asutoshd, cang, grant.jung, sc.suh, hy50.seo, kwmad.kim On 2020-07-17 00:39, Lee Sang Hyun wrote: > - ret = scsi_execute(sdp, cmd, DMA_NONE, NULL, 0, NULL, &sshdr, > - START_STOP_TIMEOUT, 0, 0, RQF_PM, NULL); > - if (ret) { > - sdev_printk(KERN_WARNING, sdp, > - "START_STOP failed for power mode: %d, result %x\n", > - pwr_mode, ret); > - if (driver_byte(ret) == DRIVER_SENSE) > - scsi_print_sense_hdr(sdp, NULL, &sshdr); > + for (retries = 0; retries < SSU_RETRIES; retries++) { > + ret = scsi_execute(sdp, cmd, DMA_NONE, NULL, 0, NULL, &sshdr, > + START_STOP_TIMEOUT, 0, 0, RQF_PM, NULL); > + if (ret) { > + sdev_printk(KERN_WARNING, sdp, > + "START_STOP failed for power mode: %d, result %x\n", > + pwr_mode, ret); > + if (driver_byte(ret) == DRIVER_SENSE) > + scsi_print_sense_hdr(sdp, NULL, &sshdr); > + } else { > + break; > + } The ninth argument of scsi_execute() is called 'retries'. Wouldn't it be better to pass a nonzero value as the 'retries' argument of scsi_execute() instead of adding a loop around the scsi_execute() call? Thanks, Bart. ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [RESEND RFC PATCH v1] scsi: ufs: add retries for SSU 2020-07-18 20:30 ` Bart Van Assche @ 2020-07-22 9:14 ` Stanley Chu 2020-07-22 15:10 ` Bart Van Assche 0 siblings, 1 reply; 5+ messages in thread From: Stanley Chu @ 2020-07-22 9:14 UTC (permalink / raw) To: Bart Van Assche Cc: Lee Sang Hyun, linux-scsi, alim.akhtar, avri.altman, jejb, martin.petersen, beanhuo, asutoshd, cang, grant.jung, sc.suh, hy50.seo, kwmad.kim Hi Bart, On Sat, 2020-07-18 at 13:30 -0700, Bart Van Assche wrote: > On 2020-07-17 00:39, Lee Sang Hyun wrote: > > - ret = scsi_execute(sdp, cmd, DMA_NONE, NULL, 0, NULL, &sshdr, > > - START_STOP_TIMEOUT, 0, 0, RQF_PM, NULL); > > - if (ret) { > > - sdev_printk(KERN_WARNING, sdp, > > - "START_STOP failed for power mode: %d, result %x\n", > > - pwr_mode, ret); > > - if (driver_byte(ret) == DRIVER_SENSE) > > - scsi_print_sense_hdr(sdp, NULL, &sshdr); > > + for (retries = 0; retries < SSU_RETRIES; retries++) { > > + ret = scsi_execute(sdp, cmd, DMA_NONE, NULL, 0, NULL, &sshdr, > > + START_STOP_TIMEOUT, 0, 0, RQF_PM, NULL); > > + if (ret) { > > + sdev_printk(KERN_WARNING, sdp, > > + "START_STOP failed for power mode: %d, result %x\n", > > + pwr_mode, ret); > > + if (driver_byte(ret) == DRIVER_SENSE) > > + scsi_print_sense_hdr(sdp, NULL, &sshdr); > > + } else { > > + break; > > + } > > The ninth argument of scsi_execute() is called 'retries'. Wouldn't it be > better to pass a nonzero value as the 'retries' argument of > scsi_execute() instead of adding a loop around the scsi_execute() call? If a SCSI command issued via scsi_execute() encounters "timeout" or "check condition", scsi_noretry_cmd() will return 1 (true) because blk_rq_is_passthrough() is true due to REQ_OP_SCSI_IN or REQ_OP_SCSI_OUT flag was set to this SCSI command by scsi_execute(). Therefore even a non-zero 'retries' value is assigned while calling scsi_execute(), the failed command has no chance to be retried since the decision will be no-retry by scsi_noretry_cmd(). (Take command timeout as example) scsi_times_out()->scsi_abort_command()->scmd_eh_abort_handler(), here scsi_noretry_cmd() returns 1, and then command will be finished (with error code) without retry. In scsi_noretry_cmd(), there is a comment message in section "check_type" as below /* * assume caller has checked sense and determined * the check condition was retryable. */ I am not sure if "timeout" and "check condition" cases in such SCSI commands issued via scsi_execute() are specially designed to be unable to retry. Would you have any suggestions if LLD drivers would like to retry these kinds of SCSI commands? Thanks a lot, Stanley Chu ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [RESEND RFC PATCH v1] scsi: ufs: add retries for SSU 2020-07-22 9:14 ` Stanley Chu @ 2020-07-22 15:10 ` Bart Van Assche 2020-07-25 3:10 ` Stanley Chu 0 siblings, 1 reply; 5+ messages in thread From: Bart Van Assche @ 2020-07-22 15:10 UTC (permalink / raw) To: Stanley Chu Cc: Lee Sang Hyun, linux-scsi, alim.akhtar, avri.altman, jejb, martin.petersen, beanhuo, asutoshd, cang, grant.jung, sc.suh, hy50.seo, kwmad.kim On 2020-07-22 02:14, Stanley Chu wrote: > Hi Bart, > > On Sat, 2020-07-18 at 13:30 -0700, Bart Van Assche wrote: >> On 2020-07-17 00:39, Lee Sang Hyun wrote: >>> - ret = scsi_execute(sdp, cmd, DMA_NONE, NULL, 0, NULL, &sshdr, >>> - START_STOP_TIMEOUT, 0, 0, RQF_PM, NULL); >>> - if (ret) { >>> - sdev_printk(KERN_WARNING, sdp, >>> - "START_STOP failed for power mode: %d, result %x\n", >>> - pwr_mode, ret); >>> - if (driver_byte(ret) == DRIVER_SENSE) >>> - scsi_print_sense_hdr(sdp, NULL, &sshdr); >>> + for (retries = 0; retries < SSU_RETRIES; retries++) { >>> + ret = scsi_execute(sdp, cmd, DMA_NONE, NULL, 0, NULL, &sshdr, >>> + START_STOP_TIMEOUT, 0, 0, RQF_PM, NULL); >>> + if (ret) { >>> + sdev_printk(KERN_WARNING, sdp, >>> + "START_STOP failed for power mode: %d, result %x\n", >>> + pwr_mode, ret); >>> + if (driver_byte(ret) == DRIVER_SENSE) >>> + scsi_print_sense_hdr(sdp, NULL, &sshdr); >>> + } else { >>> + break; >>> + } >> >> The ninth argument of scsi_execute() is called 'retries'. Wouldn't it be >> better to pass a nonzero value as the 'retries' argument of >> scsi_execute() instead of adding a loop around the scsi_execute() call? > > If a SCSI command issued via scsi_execute() encounters "timeout" or > "check condition", scsi_noretry_cmd() will return 1 (true) because > blk_rq_is_passthrough() is true due to REQ_OP_SCSI_IN or REQ_OP_SCSI_OUT > flag was set to this SCSI command by scsi_execute(). Therefore even a > non-zero 'retries' value is assigned while calling scsi_execute(), the > failed command has no chance to be retried since the decision will be > no-retry by scsi_noretry_cmd(). > > (Take command timeout as example) > scsi_times_out()->scsi_abort_command()->scmd_eh_abort_handler(), here > scsi_noretry_cmd() returns 1, and then command will be finished (with > error code) without retry. > > In scsi_noretry_cmd(), there is a comment message in section > "check_type" as below > > /* > * assume caller has checked sense and determined > * the check condition was retryable. > */ > > I am not sure if "timeout" and "check condition" cases in such SCSI > commands issued via scsi_execute() are specially designed to be unable > to retry. > > Would you have any suggestions if LLD drivers would like to retry these > kinds of SCSI commands? How about summarizing the above explanation of why the 'retry' argument of scsi_execute() cannot be used in a two or three line comment and to add that comment above the loop introduced by this patch? Thanks, Bart. ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [RESEND RFC PATCH v1] scsi: ufs: add retries for SSU 2020-07-22 15:10 ` Bart Van Assche @ 2020-07-25 3:10 ` Stanley Chu 0 siblings, 0 replies; 5+ messages in thread From: Stanley Chu @ 2020-07-25 3:10 UTC (permalink / raw) To: Bart Van Assche Cc: Stanley Chu, Lee Sang Hyun, linux-scsi, Alim Akhtar, Avri Altman, James E.J. Bottomley, Martin K. Petersen, Bean Huo (beanhuo), Asutosh Das, cang, grant.jung, sc.suh, hy50.seo, Kiwoong Kim Hi Bart, Thanks for your suggestions. Hi Sang Hyun, Bart Van Assche <bvanassche@acm.org> 於 2020年7月22日 週三 下午11:11寫道: > > On 2020-07-22 02:14, Stanley Chu wrote: > > Hi Bart, > > > > On Sat, 2020-07-18 at 13:30 -0700, Bart Van Assche wrote: > >> On 2020-07-17 00:39, Lee Sang Hyun wrote: > >>> - ret = scsi_execute(sdp, cmd, DMA_NONE, NULL, 0, NULL, &sshdr, > >>> - START_STOP_TIMEOUT, 0, 0, RQF_PM, NULL); > >>> - if (ret) { > >>> - sdev_printk(KERN_WARNING, sdp, > >>> - "START_STOP failed for power mode: %d, result %x\n", > >>> - pwr_mode, ret); > >>> - if (driver_byte(ret) == DRIVER_SENSE) > >>> - scsi_print_sense_hdr(sdp, NULL, &sshdr); > >>> + for (retries = 0; retries < SSU_RETRIES; retries++) { > >>> + ret = scsi_execute(sdp, cmd, DMA_NONE, NULL, 0, NULL, &sshdr, > >>> + START_STOP_TIMEOUT, 0, 0, RQF_PM, NULL); > >>> + if (ret) { > >>> + sdev_printk(KERN_WARNING, sdp, > >>> + "START_STOP failed for power mode: %d, result %x\n", > >>> + pwr_mode, ret); > >>> + if (driver_byte(ret) == DRIVER_SENSE) > >>> + scsi_print_sense_hdr(sdp, NULL, &sshdr); > >>> + } else { > >>> + break; > >>> + } > >> > >> The ninth argument of scsi_execute() is called 'retries'. Wouldn't it be > >> better to pass a nonzero value as the 'retries' argument of > >> scsi_execute() instead of adding a loop around the scsi_execute() call? > > > > If a SCSI command issued via scsi_execute() encounters "timeout" or > > "check condition", scsi_noretry_cmd() will return 1 (true) because > > blk_rq_is_passthrough() is true due to REQ_OP_SCSI_IN or REQ_OP_SCSI_OUT > > flag was set to this SCSI command by scsi_execute(). Therefore even a > > non-zero 'retries' value is assigned while calling scsi_execute(), the > > failed command has no chance to be retried since the decision will be > > no-retry by scsi_noretry_cmd(). > > > > (Take command timeout as example) > > scsi_times_out()->scsi_abort_command()->scmd_eh_abort_handler(), here > > scsi_noretry_cmd() returns 1, and then command will be finished (with > > error code) without retry. > > > > In scsi_noretry_cmd(), there is a comment message in section > > "check_type" as below > > > > /* > > * assume caller has checked sense and determined > > * the check condition was retryable. > > */ > > > > I am not sure if "timeout" and "check condition" cases in such SCSI > > commands issued via scsi_execute() are specially designed to be unable > > to retry. > > > > Would you have any suggestions if LLD drivers would like to retry these > > kinds of SCSI commands? > > How about summarizing the above explanation of why the 'retry' argument > of scsi_execute() cannot be used in a two or three line comment and to > add that comment above the loop introduced by this patch? > I like this patch since this could also recover the "SSU timeout" case we met before. Could you please make this as formal patch with Bart's suggestion? Thanks, Stanley Chu ^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2020-07-25 3:10 UTC | newest] Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- [not found] <CGME20200717074740epcas2p2b1c8e7bf7dc28f13c5a9999373f4601b@epcas2p2.samsung.com> 2020-07-17 7:39 ` [RESEND RFC PATCH v1] scsi: ufs: add retries for SSU Lee Sang Hyun 2020-07-18 20:30 ` Bart Van Assche 2020-07-22 9:14 ` Stanley Chu 2020-07-22 15:10 ` Bart Van Assche 2020-07-25 3:10 ` Stanley Chu
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).