[PATCH v6] ufs: core: wlun suspend SSU/enter hibern8 fail recovery

* [PATCH v6] ufs: core: wlun suspend SSU/enter hibern8 fail recovery
@ 2022-12-08  7:25 peter.wang
  2022-12-08  8:02 ` Greg KH
  2022-12-14  3:16 ` Martin K. Petersen
  0 siblings, 2 replies; 6+ messages in thread
From: peter.wang @ 2022-12-08  7:25 UTC (permalink / raw)
  To: stanley.chu, linux-scsi, martin.petersen, avri.altman, alim.akhtar, jejb
  Cc: wsd_upstream, linux-mediatek, peter.wang, chun-hung.wu,
	alice.chao, cc.chou, chaotian.jing, jiajie.hao, powen.kao,
	qilin.tan, lin.gui, tun-yu.yu, eddie.huang, naomi.chu, stable

From: Peter Wang <peter.wang@mediatek.com>

When SSU or hibern8 entry fails in the wlun suspend flow, trigger the
error handler and return -EBUSY to break the suspend.
Otherwise, the wlun runtime PM status becomes an error and the consumer
gets stuck in the runtime-suspended state.

Fixes: b294ff3e3449 ("scsi: ufs: core: Enable power management for wlun")
Cc: stable@vger.kernel.org
Signed-off-by: Peter Wang <peter.wang@mediatek.com>
Reviewed-by: Stanley Chu <stanley.chu@mediatek.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Adrian Hunter <adrian.hunter@intel.com>
---
 drivers/ufs/core/ufshcd.c | 26 ++++++++++++++++++++++++++
 1 file changed, 26 insertions(+)

diff --git a/drivers/ufs/core/ufshcd.c b/drivers/ufs/core/ufshcd.c
index b1f59a5fe632..31ed3fdb5266 100644
--- a/drivers/ufs/core/ufshcd.c
+++ b/drivers/ufs/core/ufshcd.c
@@ -6070,6 +6070,14 @@ void ufshcd_schedule_eh_work(struct ufs_hba *hba)
 	}
 }
 
+static void ufshcd_force_error_recovery(struct ufs_hba *hba)
+{
+	spin_lock_irq(hba->host->host_lock);
+	hba->force_reset = true;
+	ufshcd_schedule_eh_work(hba);
+	spin_unlock_irq(hba->host->host_lock);
+}
+
 static void ufshcd_clk_scaling_allow(struct ufs_hba *hba, bool allow)
 {
 	down_write(&hba->clk_scaling_lock);
@@ -9049,6 +9057,15 @@ static int __ufshcd_wl_suspend(struct ufs_hba *hba, enum ufs_pm_op pm_op)
 
 		if (!hba->dev_info.b_rpm_dev_flush_capable) {
 			ret = ufshcd_set_dev_pwr_mode(hba, req_dev_pwr_mode);
+			if (ret && pm_op != UFS_SHUTDOWN_PM) {
+				/*
+				 * If return err in suspend flow, IO will hang.
+				 * Trigger error handler and break suspend for
+				 * error recovery.
+				 */
+				ufshcd_force_error_recovery(hba);
+				ret = -EBUSY;
+			}
 			if (ret)
 				goto enable_scaling;
 		}
@@ -9060,6 +9077,15 @@ static int __ufshcd_wl_suspend(struct ufs_hba *hba, enum ufs_pm_op pm_op)
 	 */
 	check_for_bkops = !ufshcd_is_ufs_dev_deepsleep(hba);
 	ret = ufshcd_link_state_transition(hba, req_link_state, check_for_bkops);
+	if (ret && pm_op != UFS_SHUTDOWN_PM) {
+		/*
+		 * If return err in suspend flow, IO will hang.
+		 * Trigger error handler and break suspend for
+		 * error recovery.
+		 */
+		ufshcd_force_error_recovery(hba);
+		ret = -EBUSY;
+	}
 	if (ret)
 		goto set_dev_active;
 
-- 
2.18.0
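
The control flow added by the two hunks above can be sketched outside the kernel as follows. This is a minimal userspace model, not the driver code: every `model_*` name, the `struct model_hba` fields, and the injected step results are hypothetical stand-ins for illustration, and the `host_lock` locking done by the real helper is left out. It only shows that a failure in either step schedules error recovery and is reported as -EBUSY, except on the shutdown path, where the original error is passed through.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace model only; none of these types are the kernel's. */
enum model_pm_op { MODEL_RUNTIME_PM, MODEL_SYSTEM_PM, MODEL_SHUTDOWN_PM };

struct model_hba {
	bool force_reset;  /* mirrors hba->force_reset in the patch */
	int eh_scheduled;  /* counts stand-in calls for ufshcd_schedule_eh_work() */
	int ssu_result;    /* injected result of the SSU (power mode) step */
	int link_result;   /* injected result of the link transition step */
};

/* Stand-in for ufshcd_force_error_recovery(); locking is omitted here. */
static void model_force_error_recovery(struct model_hba *hba)
{
	hba->force_reset = true;
	hba->eh_scheduled++;
}

/* Models only the two error checks the patch adds to the suspend flow. */
int model_wl_suspend(struct model_hba *hba, enum model_pm_op pm_op)
{
	int ret;

	assert(hba != NULL);

	ret = hba->ssu_result;
	if (ret && pm_op != MODEL_SHUTDOWN_PM) {
		/* Failure outside shutdown: recover and report "busy". */
		model_force_error_recovery(hba);
		ret = -EBUSY;
	}
	if (ret)
		return ret;

	ret = hba->link_result;
	if (ret && pm_op != MODEL_SHUTDOWN_PM) {
		model_force_error_recovery(hba);
		ret = -EBUSY;
	}
	return ret;
}
```

Calling `model_wl_suspend()` with a failing `ssu_result` and `MODEL_RUNTIME_PM` returns -EBUSY and sets `force_reset`; the same failure with `MODEL_SHUTDOWN_PM` returns the injected error unchanged and schedules nothing.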


* Re: [PATCH v6] ufs: core: wlun suspend SSU/enter hibern8 fail recovery
  2022-12-08  7:25 [PATCH v6] ufs: core: wlun suspend SSU/enter hibern8 fail recovery peter.wang
@ 2022-12-08  8:02 ` Greg KH
  2022-12-14  3:16 ` Martin K. Petersen
  1 sibling, 0 replies; 6+ messages in thread
From: Greg KH @ 2022-12-08  8:02 UTC (permalink / raw)
  To: peter.wang
  Cc: stanley.chu, linux-scsi, martin.petersen, avri.altman,
	alim.akhtar, jejb, wsd_upstream, linux-mediatek, chun-hung.wu,
	alice.chao, cc.chou, chaotian.jing, jiajie.hao, powen.kao,
	qilin.tan, lin.gui, tun-yu.yu, eddie.huang, naomi.chu, stable

On Thu, Dec 08, 2022 at 03:25:20PM +0800, peter.wang@mediatek.com wrote:
> From: Peter Wang <peter.wang@mediatek.com>
> 
> When SSU or hibern8 entry fails in the wlun suspend flow, trigger the
> error handler and return -EBUSY to break the suspend.
> Otherwise, the wlun runtime PM status becomes an error and the consumer
> gets stuck in the runtime-suspended state.
> 
> Fixes: b294ff3e3449 ("scsi: ufs: core: Enable power management for wlun")
> Cc: stable@vger.kernel.org
> Signed-off-by: Peter Wang <peter.wang@mediatek.com>
> Reviewed-by: Stanley Chu <stanley.chu@mediatek.com>
> Reviewed-by: Bart Van Assche <bvanassche@acm.org>
> Reviewed-by: Adrian Hunter <adrian.hunter@intel.com>

Hi,

This is the friendly patch-bot of Greg Kroah-Hartman.  You have sent him
a patch that has triggered this response.  He used to manually respond
to these common problems, but in order to save his sanity (he kept
writing the same thing over and over, yet to different people), I was
created.  Hopefully you will not take offence and will fix the problem
in your patch and resubmit it so that it can be accepted into the Linux
kernel tree.

You are receiving this message because of the following common error(s)
as indicated below:

- This looks like a new version of a previously submitted patch, but you
  did not list below the --- line any changes from the previous version.
  Please read the section entitled "The canonical patch format" in the
  kernel file, Documentation/SubmittingPatches for what needs to be done
  here to properly describe this.

If you wish to discuss this problem further, or you have questions about
how to resolve this issue, please feel free to respond to this email and
Greg will reply once he has dug out from the pending patches received
from other developers.

thanks,

greg k-h's patch email bot

* Re: [PATCH v6] ufs: core: wlun suspend SSU/enter hibern8 fail recovery
  2022-12-08  7:25 [PATCH v6] ufs: core: wlun suspend SSU/enter hibern8 fail recovery peter.wang
  2022-12-08  8:02 ` Greg KH
@ 2022-12-14  3:16 ` Martin K. Petersen
  2022-12-20 21:00   ` Daniil Lunev
  1 sibling, 1 reply; 6+ messages in thread
From: Martin K. Petersen @ 2022-12-14  3:16 UTC (permalink / raw)
  To: peter.wang
  Cc: stanley.chu, linux-scsi, martin.petersen, avri.altman,
	alim.akhtar, jejb, wsd_upstream, linux-mediatek, chun-hung.wu,
	alice.chao, cc.chou, chaotian.jing, jiajie.hao, powen.kao,
	qilin.tan, lin.gui, tun-yu.yu, eddie.huang, naomi.chu, stable


Peter,

> When SSU or hibern8 entry fails in the wlun suspend flow, trigger the
> error handler and return -EBUSY to break the suspend.  Otherwise, the
> wlun runtime PM status becomes an error and the consumer gets stuck in
> the runtime-suspended state.

Applied to 6.2/scsi-staging, thanks!

-- 
Martin K. Petersen	Oracle Linux Engineering

* Re: [PATCH v6] ufs: core: wlun suspend SSU/enter hibern8 fail recovery
  2022-12-14  3:16 ` Martin K. Petersen
@ 2022-12-20 21:00   ` Daniil Lunev
  2022-12-21  5:59     ` Peter Wang (王信友)
  0 siblings, 1 reply; 6+ messages in thread
From: Daniil Lunev @ 2022-12-20 21:00 UTC (permalink / raw)
  To: Martin K. Petersen
  Cc: peter.wang, stanley.chu, linux-scsi, avri.altman, alim.akhtar,
	jejb, wsd_upstream, linux-mediatek, chun-hung.wu, alice.chao,
	cc.chou, chaotian.jing, jiajie.hao, powen.kao, qilin.tan,
	lin.gui, tun-yu.yu, eddie.huang, naomi.chu, stable

> Applied to 6.2/scsi-staging, thanks!

There is an interesting side effect of the patch in this iteration
(which I am not sure was present in the past iteration I tried):
if the device auto-suspends while running a purge, the controller is
seemingly reset and thus the purge is aborted (with no patch at all
it hangs).
That might be OK behaviour though - it will just make it an explicit
requirement to disable runtime suspend during the management
operation.

localhost ~ # ufs-utils fl -t 6 -e -p /dev/bsg/ufs-bsg0
localhost ~ # ufs-utils attr -a -p /dev/bsg/ufs-bsg0 | grep bPurgeStatus
bPurgeStatus               := 0x00

[   25.801980] ufs_device_wlun 0:0:0:49488: START_STOP failed for power mode: 2, result 2
[   25.802002] ufs_device_wlun 0:0:0:49488: Sense Key : Not Ready [current]
[   25.802009] ufs_device_wlun 0:0:0:49488: Add. Sense: No additional sense information
[   25.802020] ufs_device_wlun 0:0:0:49488: ufshcd_wl_runtime_suspend failed: -16

* Re: [PATCH v6] ufs: core: wlun suspend SSU/enter hibern8 fail recovery
  2022-12-20 21:00   ` Daniil Lunev
@ 2022-12-21  5:59     ` Peter Wang (王信友)
  2023-01-02 22:05       ` Daniil Lunev
  0 siblings, 1 reply; 6+ messages in thread
From: Peter Wang (王信友) @ 2022-12-21  5:59 UTC (permalink / raw)
  To: dlunev, martin.petersen
  Cc: linux-mediatek, Jiajie Hao (郝加节),
	CC Chou (周志杰),
	Eddie Huang (黃智傑),
	Alice Chao (趙珮均),
	jejb, wsd_upstream, avri.altman, stable,
	Lin Gui (桂林),
	Chun-Hung Wu (巫駿宏),
	linux-scsi, alim.akhtar, Tun-yu Yu (游敦聿),
	Chaotian Jing (井朝天),
	Powen Kao (高伯文),
	Naomi Chu (朱詠田),
	Stanley Chu (朱原陞),
	Qilin Tan (谭麒麟)

On Wed, 2022-12-21 at 08:00 +1100, Daniil Lunev wrote:
> > Applied to 6.2/scsi-staging, thanks!
> 
> There is an interesting side effect of the patch in this iteration
> (which I am not sure was present in the past iteration I tried):
> if the device auto-suspends while running a purge, the controller is
> seemingly reset and thus the purge is aborted (with no patch at all
> it hangs).
> That might be OK behaviour though - it will just make it an explicit
> requirement to disable runtime suspend during the management
> operation.
> 

Hi Daniil,

I am not sure whether this is the same reason we see SSU (sleep) fail.
But without this patch, system IO hangs while a purge is ongoing,
which is no better.
I also have another idea about rpm and purge.

To disable runtime suspend while a purge operation is ongoing:
1. Disable rpm when fPurgeEnable is set, poll bPurgeStatus until it
becomes 0, and then re-enable rpm.
   But polling bPurgeStatus will extend the rpm timer, so we don't
really need to disable rpm, right?
2. Check bPurgeStatus when entering runtime suspend, and return EBUSY
if bPurgeStatus is not 0 to break the suspend.
   This is the correct design for telling the rpm framework that the
driver is busy with purge and that suspend is inappropriate.
   But it should be similar to the current flow, which returns EBUSY
when SSU fails?

So, with the current design, if the purge initiator does not want to
see rpm EBUSY, then they should poll bPurgeStatus.
What do you think?

Thanks.
BR
Peter
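
Option 2 above can be sketched as a small userspace model. This is only an illustration under stated assumptions: `struct model_dev`, its fields, and `model_runtime_suspend()` are hypothetical names, the purge status is a plain field rather than an attribute read from the device, and the only rule taken from the mail is "return EBUSY if bPurgeStatus is not 0".

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for the device; not a kernel structure. */
struct model_dev {
	unsigned char purge_status; /* stand-in for bPurgeStatus */
	int suspended;              /* 1 once the model has suspended */
};

/*
 * Models a runtime-suspend entry point that refuses to suspend while a
 * purge is in flight. In a real driver the status would have to be
 * queried from the device; here it is just read from the struct.
 */
int model_runtime_suspend(struct model_dev *dev)
{
	assert(dev != NULL);

	if (dev->purge_status != 0)
		return -EBUSY; /* tell the caller to retry later */

	dev->suspended = 1;
	return 0;
}
```

With `purge_status` non-zero the model returns -EBUSY and leaves `suspended` at 0; once the status reads 0 the same call succeeds. Whether an extra attribute query on every suspend attempt is acceptable in the real driver is not settled in this thread.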

> localhost ~ # ufs-utils fl -t 6 -e -p /dev/bsg/ufs-bsg0
> localhost ~ # ufs-utils attr -a -p /dev/bsg/ufs-bsg0 | grep
> bPurgeStatus
> bPurgeStatus               := 0x00
> 
> [   25.801980] ufs_device_wlun 0:0:0:49488: START_STOP failed for
> power mode: 2, result 2
> [   25.802002] ufs_device_wlun 0:0:0:49488: Sense Key : Not Ready
> [current]
> [   25.802009] ufs_device_wlun 0:0:0:49488: Add. Sense: No additional
> sense information
> [   25.802020] ufs_device_wlun 0:0:0:49488: ufshcd_wl_runtime_suspend
> failed: -16

* Re: [PATCH v6] ufs: core: wlun suspend SSU/enter hibern8 fail recovery
  2022-12-21  5:59     ` Peter Wang (王信友)
@ 2023-01-02 22:05       ` Daniil Lunev
  0 siblings, 0 replies; 6+ messages in thread
From: Daniil Lunev @ 2023-01-02 22:05 UTC (permalink / raw)
  To: Peter Wang (王信友)
  Cc: martin.petersen, linux-mediatek,
	Jiajie Hao (郝加节),
	CC Chou (周志杰),
	Eddie Huang (黃智傑),
	Alice Chao (趙珮均),
	jejb, wsd_upstream, avri.altman, stable,
	Lin Gui (桂林),
	Chun-Hung Wu (巫駿宏),
	linux-scsi, alim.akhtar, Tun-yu Yu (游敦聿),
	Chaotian Jing (井朝天),
	Powen Kao (高伯文),
	Naomi Chu (朱詠田),
	Stanley Chu (朱原陞),
	Qilin Tan (谭麒麟)

On Wed, Dec 21, 2022 at 4:59 PM Peter Wang (王信友)
<peter.wang@mediatek.com> wrote:
> But if without this patch when purge is ongoing, system IO will hang,
> this is no better.
Yes, that is why I am just pointing this out as a matter of fact, not as a bug.
It is arguable whether resetting the controller in the deadlock situation is the
proper thing to do, but it might be the next best thing, so I don't argue with
that either.

> So, with current design, if purge initiator do not want to see rpm
> EBUSY, then he should polling bPurgeStatus.
> What do you think?

I am actually not sure whether management operations extend the timeout - they
go through the bsg interface, and I am not sure it properly resets the timeouts
on all possible nexus interfaces; I need to check that.
But even if it does, there are two problems:
* If you make the kernel poll that parameter, it will actually make the
  application level miss the completion code (since after the completion is
  queried once, it will return Not Started afterwards).
* Application polling is race prone. We set runtime suspend to 100ms, so
  depending on scheduling quirks it may miss the event.

--Daniil
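
The first problem above can be illustrated with a small userspace model. It assumes the read-once behaviour exactly as described in this mail (a completed status is reported once and reads as Not Started afterwards) and has not been checked against the UFS specification. The enum names and their numeric values, `struct model_purge`, and `model_read_purge_status()` are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical status values for the model; not taken from the spec. */
enum model_purge_status {
	MODEL_PURGE_NOT_STARTED = 0,
	MODEL_PURGE_IN_PROGRESS,
	MODEL_PURGE_COMPLETED
};

struct model_purge {
	enum model_purge_status state;
};

/*
 * Models a status read that consumes the completion: the first reader
 * after the purge finishes sees COMPLETED, and the state then falls
 * back to NOT_STARTED for every later reader.
 */
enum model_purge_status model_read_purge_status(struct model_purge *p)
{
	enum model_purge_status seen;

	assert(p != NULL);

	seen = p->state;
	if (seen == MODEL_PURGE_COMPLETED)
		p->state = MODEL_PURGE_NOT_STARTED;
	return seen;
}
```

If a kernel-side poll performs the first read after completion, a later read from the application returns `MODEL_PURGE_NOT_STARTED`, so the application cannot tell a finished purge from one that never ran.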
