linux-mediatek.lists.infradead.org archive mirror
* [PATCH v7] scsi: ufs: Quiesce all scsi devices before shutdown
@ 2020-08-03 10:04 Stanley Chu
  2020-08-03 11:50 ` Can Guo
                   ` (2 more replies)
  0 siblings, 3 replies; 9+ messages in thread
From: Stanley Chu @ 2020-08-03 10:04 UTC (permalink / raw)
  To: linux-scsi, martin.petersen, avri.altman, alim.akhtar, jejb,
	cang, bvanassche
  Cc: Stanley Chu, andy.teng, cc.chou, chun-hung.wu, kuohong.wang,
	linux-kernel, jiajie.hao, linux-mediatek, peter.wang,
	matthias.bgg, beanhuo, chaotian.jing, linux-arm-kernel, asutoshd

Currently, I/O requests can still be submitted to the UFS device while
UFS is executing its shutdown flow. This may lead to racing scenarios
like the one below, and the system may finally crash due to unclocked
register accesses.

To fix this kind of issue, in ufshcd_shutdown(),

1. Use pm_runtime_get_sync() instead of resuming the UFS device
   "internally" via ufshcd_runtime_resume(), so that the runtime PM
   framework can manage and prevent concurrent runtime operations
   triggered by incoming I/O requests.

2. Quiesce all SCSI devices to block all I/O requests after the device
   is resumed.

Example of a racing scenario, while the UFS device is runtime-suspended:

Thread #1: Executing UFS shutdown flow, e.g.,
           ufshcd_suspend(UFS_SHUTDOWN_PM)

Thread #2: Executing runtime resume flow triggered by I/O request,
           e.g., ufshcd_resume(UFS_RUNTIME_PM)

This breaks the assumption that UFS PM flows cannot run concurrently,
and unexpected racing behavior may happen.

Signed-off-by: Stanley Chu <stanley.chu@mediatek.com>
---
Changes:
  - Since v6:
	- Quiesce all SCSI devices.
  - Since v4:
	- Use pm_runtime_get_sync() instead of resuming UFS device by ufshcd_runtime_resume() "internally".
---
 drivers/scsi/ufs/ufshcd.c | 27 ++++++++++++++++++++++-----
 1 file changed, 22 insertions(+), 5 deletions(-)

diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
index 307622284239..7cb220b3fde0 100644
--- a/drivers/scsi/ufs/ufshcd.c
+++ b/drivers/scsi/ufs/ufshcd.c
@@ -8640,6 +8640,7 @@ EXPORT_SYMBOL(ufshcd_runtime_idle);
 int ufshcd_shutdown(struct ufs_hba *hba)
 {
 	int ret = 0;
+	struct scsi_target *starget;
 
 	if (!hba->is_powered)
 		goto out;
@@ -8647,11 +8648,27 @@ int ufshcd_shutdown(struct ufs_hba *hba)
 	if (ufshcd_is_ufs_dev_poweroff(hba) && ufshcd_is_link_off(hba))
 		goto out;
 
-	if (pm_runtime_suspended(hba->dev)) {
-		ret = ufshcd_runtime_resume(hba);
-		if (ret)
-			goto out;
-	}
+	/*
+	 * Let runtime PM framework manage and prevent concurrent runtime
+	 * operations with shutdown flow.
+	 */
+	pm_runtime_get_sync(hba->dev);
+
+	/*
+	 * Quiesce all SCSI devices to prevent any non-PM requests from
+	 * being sent from the block layer during and after shutdown.
+	 *
+	 * Here we cannot use blk_cleanup_queue() since PM requests
+	 * (with the BLK_MQ_REQ_PREEMPT flag) still need to be sent
+	 * through the block layer. Therefore SCSI commands queued after
+	 * the scsi_target_quiesce() call returns will block until
+	 * blk_cleanup_queue() is called.
+	 *
+	 * Besides, scsi_target_"un"quiesce (e.g., scsi_target_resume) can
+	 * be skipped since shutdown is a one-way flow.
+	 */
+	list_for_each_entry(starget, &hba->host->__targets, siblings)
+		scsi_target_quiesce(starget);
 
 	ret = ufshcd_suspend(hba, UFS_SHUTDOWN_PM);
 out:
-- 
2.18.0

* Re: [PATCH v7] scsi: ufs: Quiesce all scsi devices before shutdown
  2020-08-03 10:04 [PATCH v7] scsi: ufs: Quiesce all scsi devices before shutdown Stanley Chu
@ 2020-08-03 11:50 ` Can Guo
  2020-08-03 12:04   ` Can Guo
  2020-08-03 12:51 ` Can Guo
  2020-08-03 16:04 ` Bart Van Assche
  2 siblings, 1 reply; 9+ messages in thread
From: Can Guo @ 2020-08-03 11:50 UTC (permalink / raw)
  To: Stanley Chu
  Cc: jiajie.hao, linux-scsi, martin.petersen, andy.teng, jejb,
	chun-hung.wu, kuohong.wang, linux-kernel, asutoshd, avri.altman,
	linux-mediatek, peter.wang, alim.akhtar, matthias.bgg, beanhuo,
	chaotian.jing, cc.chou, linux-arm-kernel, bvanassche

Hi Stanley,

On 2020-08-03 18:04, Stanley Chu wrote:
> Currently I/O request could be still submitted to UFS device while
> UFS is working on shutdown flow. This may lead to racing as below
> scenarios and finally system may crash due to unclocked register
> accesses.
> 
> To fix this kind of issues, in ufshcd_shutdown(),
> 
> 1. Use pm_runtime_get_sync() instead of resuming UFS device by
>    ufshcd_runtime_resume() "internally" to let runtime PM framework
>    manage and prevent concurrent runtime operations by incoming I/O
>    requests.
> 
> 2. Specifically quiesce all SCSI devices to block all I/O requests
>    after device is resumed.
> 
> Example of racing scenario: While UFS device is runtime-suspended
> 
> Thread #1: Executing UFS shutdown flow, e.g.,
>            ufshcd_suspend(UFS_SHUTDOWN_PM)
> 
> Thread #2: Executing runtime resume flow triggered by I/O request,
>            e.g., ufshcd_resume(UFS_RUNTIME_PM)
> 
> This breaks the assumption that UFS PM flows can not be running
> concurrently and some unexpected racing behavior may happen.
> 
> Signed-off-by: Stanley Chu <stanley.chu@mediatek.com>
> ---
> Changes:
>   - Since v6:
> 	- Do quiesce to all SCSI devices.
>   - Since v4:
> 	- Use pm_runtime_get_sync() instead of resuming UFS device by
> ufshcd_runtime_resume() "internally".
> ---
>  drivers/scsi/ufs/ufshcd.c | 27 ++++++++++++++++++++++-----
>  1 file changed, 22 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
> index 307622284239..7cb220b3fde0 100644
> --- a/drivers/scsi/ufs/ufshcd.c
> +++ b/drivers/scsi/ufs/ufshcd.c
> @@ -8640,6 +8640,7 @@ EXPORT_SYMBOL(ufshcd_runtime_idle);
>  int ufshcd_shutdown(struct ufs_hba *hba)
>  {
>  	int ret = 0;
> +	struct scsi_target *starget;
> 
>  	if (!hba->is_powered)
>  		goto out;
> @@ -8647,11 +8648,27 @@ int ufshcd_shutdown(struct ufs_hba *hba)
>  	if (ufshcd_is_ufs_dev_poweroff(hba) && ufshcd_is_link_off(hba))
>  		goto out;
> 
> -	if (pm_runtime_suspended(hba->dev)) {
> -		ret = ufshcd_runtime_resume(hba);
> -		if (ret)
> -			goto out;
> -	}
> +	/*
> +	 * Let runtime PM framework manage and prevent concurrent runtime
> +	 * operations with shutdown flow.
> +	 */
> +	pm_runtime_get_sync(hba->dev);
> +
> +	/*
> +	 * Quiesce all SCSI devices to prevent any non-PM requests sending
> +	 * from block layer during and after shutdown.
> +	 *
> +	 * Here we can not use blk_cleanup_queue() since PM requests
> +	 * (with BLK_MQ_REQ_PREEMPT flag) are still required to be sent
> +	 * through block layer. Therefore SCSI command queued after the
> +	 * scsi_target_quiesce() call returned will block until
> +	 * blk_cleanup_queue() is called.
> +	 *
> +	 * Besides, scsi_target_"un"quiesce (e.g., scsi_target_resume) can
> +	 * be ignored since shutdown is one-way flow.
> +	 */
> +	list_for_each_entry(starget, &hba->host->__targets, siblings)
> +		scsi_target_quiesce(starget);
> 

Sorry for misleading you to scsi_target_quiesce(); maybe the below is better.

     shost_for_each_device(sdev, hba->host)
         scsi_device_quiesce(sdev);

We may need to discuss more about this quiesce part since I missed 
something.

After we quiesce the SCSI devices, only PM requests are allowed, but it
is still not safe: [1] PM requests can still pass through, and [2] there
can be tasks/requests present in the doorbells before the devices are
quiesced. So the tasks/requests in [1] and [2] can still be in flight
while ufshcd_suspend is running.

How about quiescing only the UFS device well-known SCSI device and using
blk_mq_freeze_queue() on the other SCSI devices? blk_mq_freeze_queue() can
eliminate the risks mentioned in [1] and [2].

      shost_for_each_device(sdev, hba->host) {
          if (sdev == hba->sdev_ufs_device)
               scsi_device_quiesce(sdev);
          else
               blk_mq_freeze_queue(sdev->request_queue);
      }

If blk_mq_freeze_queue() is not allowed to be used by an LLD (I think we
can use it, as I recall Bart used it in one of his changes to UFS
scaling), we may need to make changes like the ones below. [1] makes sure
no more PM requests are sent to the SCSI devices, and [2] makes sure the
doorbells are cleared before invoking ufshcd_suspend.

     shost_for_each_device(sdev, hba->host) {
         scsi_autopm_get_device(sdev); [1]
         scsi_device_quiesce(sdev);
     }

     ufshcd_wait_for_doorbell_clr(hba, U64_MAX); [2]

Please let me know your ideas, thanks!

Regards,

Can Guo.

>  	ret = ufshcd_suspend(hba, UFS_SHUTDOWN_PM);
>  out:


* Re: [PATCH v7] scsi: ufs: Quiesce all scsi devices before shutdown
  2020-08-03 11:50 ` Can Guo
@ 2020-08-03 12:04   ` Can Guo
  0 siblings, 0 replies; 9+ messages in thread
From: Can Guo @ 2020-08-03 12:04 UTC (permalink / raw)
  To: Stanley Chu
  Cc: jiajie.hao, linux-scsi, martin.petersen, andy.teng, jejb,
	chun-hung.wu, kuohong.wang, linux-kernel, asutoshd, avri.altman,
	linux-mediatek, peter.wang, alim.akhtar, matthias.bgg, beanhuo,
	chaotian.jing, cc.chou, linux-arm-kernel, bvanassche

Slightly updated my comments

On 2020-08-03 19:50, Can Guo wrote:
> Hi Stanley,
> 
> On 2020-08-03 18:04, Stanley Chu wrote:
>> Currently I/O request could be still submitted to UFS device while
>> UFS is working on shutdown flow. This may lead to racing as below
>> scenarios and finally system may crash due to unclocked register
>> accesses.
>> 
>> To fix this kind of issues, in ufshcd_shutdown(),
>> 
>> 1. Use pm_runtime_get_sync() instead of resuming UFS device by
>>    ufshcd_runtime_resume() "internally" to let runtime PM framework
>>    manage and prevent concurrent runtime operations by incoming I/O
>>    requests.
>> 
>> 2. Specifically quiesce all SCSI devices to block all I/O requests
>>    after device is resumed.
>> 
>> Example of racing scenario: While UFS device is runtime-suspended
>> 
>> Thread #1: Executing UFS shutdown flow, e.g.,
>>            ufshcd_suspend(UFS_SHUTDOWN_PM)
>> 
>> Thread #2: Executing runtime resume flow triggered by I/O request,
>>            e.g., ufshcd_resume(UFS_RUNTIME_PM)
>> 
>> This breaks the assumption that UFS PM flows can not be running
>> concurrently and some unexpected racing behavior may happen.
>> 
>> Signed-off-by: Stanley Chu <stanley.chu@mediatek.com>
>> ---
>> Changes:
>>   - Since v6:
>> 	- Do quiesce to all SCSI devices.
>>   - Since v4:
>> 	- Use pm_runtime_get_sync() instead of resuming UFS device by
>> ufshcd_runtime_resume() "internally".
>> ---
>>  drivers/scsi/ufs/ufshcd.c | 27 ++++++++++++++++++++++-----
>>  1 file changed, 22 insertions(+), 5 deletions(-)
>> 
>> diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
>> index 307622284239..7cb220b3fde0 100644
>> --- a/drivers/scsi/ufs/ufshcd.c
>> +++ b/drivers/scsi/ufs/ufshcd.c
>> @@ -8640,6 +8640,7 @@ EXPORT_SYMBOL(ufshcd_runtime_idle);
>>  int ufshcd_shutdown(struct ufs_hba *hba)
>>  {
>>  	int ret = 0;
>> +	struct scsi_target *starget;
>> 
>>  	if (!hba->is_powered)
>>  		goto out;
>> @@ -8647,11 +8648,27 @@ int ufshcd_shutdown(struct ufs_hba *hba)
>>  	if (ufshcd_is_ufs_dev_poweroff(hba) && ufshcd_is_link_off(hba))
>>  		goto out;
>> 
>> -	if (pm_runtime_suspended(hba->dev)) {
>> -		ret = ufshcd_runtime_resume(hba);
>> -		if (ret)
>> -			goto out;
>> -	}
>> +	/*
>> +	 * Let runtime PM framework manage and prevent concurrent runtime
>> +	 * operations with shutdown flow.
>> +	 */
>> +	pm_runtime_get_sync(hba->dev);
>> +
>> +	/*
>> +	 * Quiesce all SCSI devices to prevent any non-PM requests sending
>> +	 * from block layer during and after shutdown.
>> +	 *
>> +	 * Here we can not use blk_cleanup_queue() since PM requests
>> +	 * (with BLK_MQ_REQ_PREEMPT flag) are still required to be sent
>> +	 * through block layer. Therefore SCSI command queued after the
>> +	 * scsi_target_quiesce() call returned will block until
>> +	 * blk_cleanup_queue() is called.
>> +	 *
>> +	 * Besides, scsi_target_"un"quiesce (e.g., scsi_target_resume) can
>> +	 * be ignored since shutdown is one-way flow.
>> +	 */
>> +	list_for_each_entry(starget, &hba->host->__targets, siblings)
>> +		scsi_target_quiesce(starget);
>> 
> 
> Sorry for misleading you to scsi_target_quiesce(), maybe below is 
> better.
> 
>     shost_for_each_device(sdev, hba->host)
>         scsi_device_quiesce(sdev);
> 
> We may need to discuss more about this quiesce part since I missed 
> something.
> 
> After we quiesce the scsi devices, only PM requests are allowed, but it
> is still not safe: [1] PM requests can still pass through, [2] there 
> can
> be tasks/reqs present in doorbells before the devices are quiesced. So,
> these tasks/reqs in [1] and [2] can still be flying in parallel while
> ufshcd_suspend is running.
> 
> How about only quiescing the UFS device well known scsi device but 
> using
> freeze_queue to the other scsi devices? blk_mq_freeze_queue can 
> eliminate
> the risks mentioned in [1] and [2].
> 
>      shost_for_each_device(sdev, hba->host) {
>          if (sdev == hba->sdev_ufs_device)
>               scsi_device_quiesce(sdev);
>          else
>               blk_mq_freeze_queue(sdev->request_queue);
>      }
> 
> IF blk_mq_freeze_queue is not allowed to be used by LLD (I think we can
> use it as I recalled Bart used to use it in one of his changes to UFS 
> scaling),
> we can use scsi_remove_device instead, it changes scsi device's state 
> to
> SDEV_DEL and calls blk_cleanup_queue.
> 
> We can also make changes like below. [1] is to make sure no more PM 
> requests
> sent to scsi devices, [2] is make sure doorbells are cleared before 
> invoke
> ufshcd_suspend.
> 
>     shost_for_each_device(sdev, hba->host) {
>         scsi_autopm_get_device(sdev); [1]
>         scsi_device_quiesce(sdev);
>     }
> 
>     ufshcd_wait_for_doorbell_clr(hba, U64_MAX); [2]
> 
> Please let me know which one you prefer or if you have better idea, 
> thanks!
> 
> Regards,
> 
> Can Guo.
> 
>>  	ret = ufshcd_suspend(hba, UFS_SHUTDOWN_PM);
>>  out:


* Re: [PATCH v7] scsi: ufs: Quiesce all scsi devices before shutdown
  2020-08-03 10:04 [PATCH v7] scsi: ufs: Quiesce all scsi devices before shutdown Stanley Chu
  2020-08-03 11:50 ` Can Guo
@ 2020-08-03 12:51 ` Can Guo
  2020-08-03 16:04 ` Bart Van Assche
  2 siblings, 0 replies; 9+ messages in thread
From: Can Guo @ 2020-08-03 12:51 UTC (permalink / raw)
  To: Stanley Chu
  Cc: jiajie.hao, linux-scsi, martin.petersen, andy.teng, jejb,
	chun-hung.wu, kuohong.wang, linux-kernel, asutoshd, avri.altman,
	linux-mediatek, peter.wang, alim.akhtar, matthias.bgg, beanhuo,
	chaotian.jing, cc.chou, linux-arm-kernel, bvanassche

Hi Stanley,

Sorry for the noise, please ignore my previous two mails and let's
focus on this one.

On 2020-08-03 18:04, Stanley Chu wrote:
> Currently I/O request could be still submitted to UFS device while
> UFS is working on shutdown flow. This may lead to racing as below
> scenarios and finally system may crash due to unclocked register
> accesses.
> 
> To fix this kind of issues, in ufshcd_shutdown(),
> 
> 1. Use pm_runtime_get_sync() instead of resuming UFS device by
>    ufshcd_runtime_resume() "internally" to let runtime PM framework
>    manage and prevent concurrent runtime operations by incoming I/O
>    requests.
> 
> 2. Specifically quiesce all SCSI devices to block all I/O requests
>    after device is resumed.
> 
> Example of racing scenario: While UFS device is runtime-suspended
> 
> Thread #1: Executing UFS shutdown flow, e.g.,
>            ufshcd_suspend(UFS_SHUTDOWN_PM)
> 
> Thread #2: Executing runtime resume flow triggered by I/O request,
>            e.g., ufshcd_resume(UFS_RUNTIME_PM)
> 
> This breaks the assumption that UFS PM flows can not be running
> concurrently and some unexpected racing behavior may happen.
> 
> Signed-off-by: Stanley Chu <stanley.chu@mediatek.com>
> ---
> Changes:
>   - Since v6:
> 	- Do quiesce to all SCSI devices.
>   - Since v4:
> 	- Use pm_runtime_get_sync() instead of resuming UFS device by
> ufshcd_runtime_resume() "internally".
> ---
>  drivers/scsi/ufs/ufshcd.c | 27 ++++++++++++++++++++++-----
>  1 file changed, 22 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
> index 307622284239..7cb220b3fde0 100644
> --- a/drivers/scsi/ufs/ufshcd.c
> +++ b/drivers/scsi/ufs/ufshcd.c
> @@ -8640,6 +8640,7 @@ EXPORT_SYMBOL(ufshcd_runtime_idle);
>  int ufshcd_shutdown(struct ufs_hba *hba)
>  {
>  	int ret = 0;
> +	struct scsi_target *starget;
> 
>  	if (!hba->is_powered)
>  		goto out;
> @@ -8647,11 +8648,27 @@ int ufshcd_shutdown(struct ufs_hba *hba)
>  	if (ufshcd_is_ufs_dev_poweroff(hba) && ufshcd_is_link_off(hba))
>  		goto out;
> 
> -	if (pm_runtime_suspended(hba->dev)) {
> -		ret = ufshcd_runtime_resume(hba);
> -		if (ret)
> -			goto out;
> -	}
> +	/*
> +	 * Let runtime PM framework manage and prevent concurrent runtime
> +	 * operations with shutdown flow.
> +	 */
> +	pm_runtime_get_sync(hba->dev);
> +
> +	/*
> +	 * Quiesce all SCSI devices to prevent any non-PM requests sending
> +	 * from block layer during and after shutdown.
> +	 *
> +	 * Here we can not use blk_cleanup_queue() since PM requests
> +	 * (with BLK_MQ_REQ_PREEMPT flag) are still required to be sent
> +	 * through block layer. Therefore SCSI command queued after the
> +	 * scsi_target_quiesce() call returned will block until
> +	 * blk_cleanup_queue() is called.
> +	 *
> +	 * Besides, scsi_target_"un"quiesce (e.g., scsi_target_resume) can
> +	 * be ignored since shutdown is one-way flow.
> +	 */
> +	list_for_each_entry(starget, &hba->host->__targets, siblings)
> +		scsi_target_quiesce(starget);
> 

Sorry for misleading you to scsi_target_quiesce(); maybe the below is better.

     shost_for_each_device(sdev, hba->host)
         scsi_device_quiesce(sdev);

We may need to discuss more about this quiesce part since I missed 
something.

After we quiesce the SCSI devices, only PM requests are allowed, but it
is still not safe: PM requests can still pass through.

How about quiescing only the UFS device well-known SCSI device and using
blk_mq_freeze_queue() on the other SCSI devices? blk_mq_freeze_queue() can
eliminate the risk.

      shost_for_each_device(sdev, hba->host) {
          if (sdev == hba->sdev_ufs_device)
               scsi_device_quiesce(sdev);
          else
               blk_mq_freeze_queue(sdev->request_queue);
      }

If blk_mq_freeze_queue() is not allowed to be used by an LLD (I think we
can use it, as I recall Bart used it in one of his changes to UFS
scaling), we can use scsi_remove_device() instead; it changes the SCSI
device's state to SDEV_DEL and calls blk_cleanup_queue().

We can also use scsi_autopm_get_device() like below. It makes sure no
more PM requests are sent to the SCSI devices (since PM requests are
only sent during PM ops).

     shost_for_each_device(sdev, hba->host) {
         scsi_autopm_get_device(sdev);
         scsi_device_quiesce(sdev);
     }

Please let me know which one you prefer or if you have better ideas,
thanks!

Regards,

Can Guo.

>  	ret = ufshcd_suspend(hba, UFS_SHUTDOWN_PM);
>  out:


* Re: [PATCH v7] scsi: ufs: Quiesce all scsi devices before shutdown
  2020-08-03 10:04 [PATCH v7] scsi: ufs: Quiesce all scsi devices before shutdown Stanley Chu
  2020-08-03 11:50 ` Can Guo
  2020-08-03 12:51 ` Can Guo
@ 2020-08-03 16:04 ` Bart Van Assche
  2020-08-04  3:19   ` [SPAM]Re: " Chaotian Jing
  2020-08-13  8:55   ` Stanley Chu
  2 siblings, 2 replies; 9+ messages in thread
From: Bart Van Assche @ 2020-08-03 16:04 UTC (permalink / raw)
  To: Stanley Chu, linux-scsi, martin.petersen, avri.altman,
	alim.akhtar, jejb, cang
  Cc: andy.teng, cc.chou, chun-hung.wu, kuohong.wang, linux-kernel,
	jiajie.hao, linux-mediatek, peter.wang, matthias.bgg, beanhuo,
	chaotian.jing, linux-arm-kernel, asutoshd

On 2020-08-03 03:04, Stanley Chu wrote:
> diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
> index 307622284239..7cb220b3fde0 100644
> --- a/drivers/scsi/ufs/ufshcd.c
> +++ b/drivers/scsi/ufs/ufshcd.c
> @@ -8640,6 +8640,7 @@ EXPORT_SYMBOL(ufshcd_runtime_idle);
>  int ufshcd_shutdown(struct ufs_hba *hba)
>  {
>  	int ret = 0;
> +	struct scsi_target *starget;
>  
>  	if (!hba->is_powered)
>  		goto out;
> @@ -8647,11 +8648,27 @@ int ufshcd_shutdown(struct ufs_hba *hba)
>  	if (ufshcd_is_ufs_dev_poweroff(hba) && ufshcd_is_link_off(hba))
>  		goto out;
>  
> -	if (pm_runtime_suspended(hba->dev)) {
> -		ret = ufshcd_runtime_resume(hba);
> -		if (ret)
> -			goto out;
> -	}
> +	/*
> +	 * Let runtime PM framework manage and prevent concurrent runtime
> +	 * operations with shutdown flow.
> +	 */
> +	pm_runtime_get_sync(hba->dev);
> +
> +	/*
> +	 * Quiesce all SCSI devices to prevent any non-PM requests sending
> +	 * from block layer during and after shutdown.
> +	 *
> +	 * Here we can not use blk_cleanup_queue() since PM requests
> +	 * (with BLK_MQ_REQ_PREEMPT flag) are still required to be sent
> +	 * through block layer. Therefore SCSI command queued after the
> +	 * scsi_target_quiesce() call returned will block until
> +	 * blk_cleanup_queue() is called.
> +	 *
> +	 * Besides, scsi_target_"un"quiesce (e.g., scsi_target_resume) can
> +	 * be ignored since shutdown is one-way flow.
> +	 */
> +	list_for_each_entry(starget, &hba->host->__targets, siblings)
> +		scsi_target_quiesce(starget);
>  
>  	ret = ufshcd_suspend(hba, UFS_SHUTDOWN_PM);
>  out:

This seems wrong to me. Since ufshcd_shutdown() shuts down the link I think
it should call scsi_remove_device() instead of scsi_target_quiesce().

Thanks,

Bart.




* Re: [SPAM]Re: [PATCH v7] scsi: ufs: Quiesce all scsi devices before shutdown
  2020-08-03 16:04 ` Bart Van Assche
@ 2020-08-04  3:19   ` Chaotian Jing
  2020-08-04  3:46     ` Bart Van Assche
  2020-08-13  8:55   ` Stanley Chu
  1 sibling, 1 reply; 9+ messages in thread
From: Chaotian Jing @ 2020-08-04  3:19 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: beanhuo, linux-scsi, martin.petersen, andy.teng, jejb,
	chun-hung.wu, kuohong.wang, linux-kernel, asutoshd, avri.altman,
	cang, linux-mediatek, peter.wang, alim.akhtar, jiajie.hao,
	Stanley Chu, cc.chou, linux-arm-kernel, matthias.bgg

On Mon, 2020-08-03 at 09:04 -0700, Bart Van Assche wrote:
> On 2020-08-03 03:04, Stanley Chu wrote:
> > diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
> > index 307622284239..7cb220b3fde0 100644
> > --- a/drivers/scsi/ufs/ufshcd.c
> > +++ b/drivers/scsi/ufs/ufshcd.c
> > @@ -8640,6 +8640,7 @@ EXPORT_SYMBOL(ufshcd_runtime_idle);
> >  int ufshcd_shutdown(struct ufs_hba *hba)
> >  {
> >  	int ret = 0;
> > +	struct scsi_target *starget;
> >  
> >  	if (!hba->is_powered)
> >  		goto out;
> > @@ -8647,11 +8648,27 @@ int ufshcd_shutdown(struct ufs_hba *hba)
> >  	if (ufshcd_is_ufs_dev_poweroff(hba) && ufshcd_is_link_off(hba))
> >  		goto out;
> >  
> > -	if (pm_runtime_suspended(hba->dev)) {
> > -		ret = ufshcd_runtime_resume(hba);
> > -		if (ret)
> > -			goto out;
> > -	}
> > +	/*
> > +	 * Let runtime PM framework manage and prevent concurrent runtime
> > +	 * operations with shutdown flow.
> > +	 */
> > +	pm_runtime_get_sync(hba->dev);
> > +
> > +	/*
> > +	 * Quiesce all SCSI devices to prevent any non-PM requests sending
> > +	 * from block layer during and after shutdown.
> > +	 *
> > +	 * Here we can not use blk_cleanup_queue() since PM requests
> > +	 * (with BLK_MQ_REQ_PREEMPT flag) are still required to be sent
> > +	 * through block layer. Therefore SCSI command queued after the
> > +	 * scsi_target_quiesce() call returned will block until
> > +	 * blk_cleanup_queue() is called.
> > +	 *
> > +	 * Besides, scsi_target_"un"quiesce (e.g., scsi_target_resume) can
> > +	 * be ignored since shutdown is one-way flow.
> > +	 */
> > +	list_for_each_entry(starget, &hba->host->__targets, siblings)
> > +		scsi_target_quiesce(starget);
> >  
> >  	ret = ufshcd_suspend(hba, UFS_SHUTDOWN_PM);
> >  out:
> 
> This seems wrong to me. Since ufshcd_shutdown() shuts down the link I think
> it should call scsi_remove_device() instead of scsi_target_quiesce().
> 
> Thanks,
> 
> Bart.
> 
Hi Bart & Can & Stanley,

I have a question about this:
thread A is running the shutdown flow, but thread B can still access
UFS (sda/sdb/sdc...), is that expected? If thread B still accesses
UFS (sda/sdb/sdc...) after sd_shutdown() has completed, it makes
sd_shutdown() pointless, because sd_resume() will send an SSU to
start sda/sdb/sdc again.

So can we avoid this and ensure that there are no requests to sda after
sda's shutdown() has completed?

So, is it possible to modify sd_shutdown()? Take "sda" for example:
after the sync cache && SSU to stop sda, call blk_cleanup_queue(); that
would ensure no runtime resume of sda and no more new requests to sda,
as sketched below.
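
A rough sketch of what I mean (based on my reading of the current sd
driver; the sd_shutdown() details and the blk_cleanup_queue() call site
are assumptions, not a tested change):

	static void sd_shutdown(struct device *dev)
	{
		struct scsi_disk *sdkp = dev_get_drvdata(dev);

		if (!sdkp)
			return;

		/* Existing behaviour: sync the cache and stop the unit. */
		if (sdkp->WCE && sdkp->media_present)
			sd_sync_cache(sdkp, NULL);

		if (system_state != SYSTEM_RESTART &&
		    sdkp->device->manage_start_stop)
			sd_start_stop_device(sdkp, 0);

		/*
		 * Hypothetical addition: tear down the request queue so no
		 * new requests and no runtime resume can reach sda after
		 * its shutdown has completed.
		 */
		blk_cleanup_queue(sdkp->device->request_queue);
	}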

Then the UFSHCI host driver's shutdown() would not need to, and should
not, handle the sda/sdb/sdc queues and device states, because those
devices (sda/sdb/sdc) have already completed their own shutdown. As part
of Can's comment suggests, UFSHCI's shutdown() should only handle
hba->sdev_ufs_device.


* Re: [SPAM]Re: [PATCH v7] scsi: ufs: Quiesce all scsi devices before shutdown
  2020-08-04  3:19   ` [SPAM]Re: " Chaotian Jing
@ 2020-08-04  3:46     ` Bart Van Assche
  0 siblings, 0 replies; 9+ messages in thread
From: Bart Van Assche @ 2020-08-04  3:46 UTC (permalink / raw)
  To: Chaotian Jing
  Cc: beanhuo, linux-scsi, martin.petersen, andy.teng, jejb,
	chun-hung.wu, kuohong.wang, linux-kernel, asutoshd, avri.altman,
	cang, linux-mediatek, peter.wang, alim.akhtar, jiajie.hao,
	Stanley Chu, cc.chou, linux-arm-kernel, matthias.bgg

On 2020-08-03 20:19, Chaotian Jing wrote:
> I have a question about this:
> thread A is running the shutdown flow, but thread B is still access
> UFS(sda/sdb/sdc..), is it expected ? after the sd_shutdown() completed,
> if thread B still has access to UFS(sda/sdb/sdc...), it will make the
> sd_shutdown() make no sense because the sd_resume() will send ssu to
> start sda/sdb/sdc.
> 
> so can we avoid this and ensure that there is no request to sda after
> sda's shutdown() is completed ?
> 
> so that is it possible to modify the sd_shutdown() ? take "sda" for
> example: after sync cache && ssu to stop sda, do blk_cleanup_queue()
> then it will ensure no runtime resume of sda and no more new requests to
> sda.
> 
> then, for UFSHCI host driver, its shutdown() no need and should not
> handle the sda/sdb/sdc's queue and device status, because these
> devices(sda/sdb/sdc) has already complete its shutdown.
> just like part of Can's comment, UFSHCI's shutdown() should only handle
> hba->sdev_ufs_device.

My understanding is that ufshcd_shutdown() is only called if no
recovery is possible, e.g. from the pci_driver.shutdown callback. Hence
the proposal to call scsi_remove_device() from inside ufshcd_shutdown().
It may be necessary to call scsi_target_unblock(...,
SDEV_TRANSPORT_OFFLINE) first to flush queued I/O. Other contexts that
may submit I/O while ufshcd_shutdown() is in progress, e.g. the sd
driver, are expected to hold a reference on the SCSI device with
scsi_device_get() / scsi_device_put(). I think the sd driver already
does that. In other words, no changes should be necessary in the sd
driver.
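
In code, the sequence I have in mind is roughly the following (a sketch
only, not tested; locking and error handling are omitted):

	struct scsi_device *sdev;

	/*
	 * Fail/flush queued I/O by taking the transport offline for all
	 * targets below the host.
	 */
	scsi_target_unblock(&hba->host->shost_gendev, SDEV_TRANSPORT_OFFLINE);

	/*
	 * Then remove the SCSI devices so sd cannot submit new commands;
	 * concurrent submitters are expected to hold their own
	 * scsi_device_get() reference.
	 */
	shost_for_each_device(sdev, hba->host)
		scsi_remove_device(sdev);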

Thanks,

Bart.


* Re: [PATCH v7] scsi: ufs: Quiesce all scsi devices before shutdown
  2020-08-03 16:04 ` Bart Van Assche
  2020-08-04  3:19   ` [SPAM]Re: " Chaotian Jing
@ 2020-08-13  8:55   ` Stanley Chu
  2020-08-14  2:52     ` Bart Van Assche
  1 sibling, 1 reply; 9+ messages in thread
From: Stanley Chu @ 2020-08-13  8:55 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: Jiajie Hao (郝加节),
	linux-scsi, martin.petersen, Andy Teng (鄧如宏),
	jejb, Chun-Hung Wu (巫駿宏),
	Kuohong Wang (王國鴻),
	linux-kernel, avri.altman, cang, linux-mediatek,
	Peter Wang (王信友),
	alim.akhtar, matthias.bgg, asutoshd,
	Chaotian Jing (井朝天),
	CC Chou (周志杰),
	linux-arm-kernel, beanhuo

Hi Bart, Can, Chaotian,

I really appreciate your comments and suggestions; please see the update below.

On Tue, 2020-08-04 at 00:04 +0800, Bart Van Assche wrote:
> On 2020-08-03 03:04, Stanley Chu wrote:
> > diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
> > index 307622284239..7cb220b3fde0 100644
> > --- a/drivers/scsi/ufs/ufshcd.c
> > +++ b/drivers/scsi/ufs/ufshcd.c
> > @@ -8640,6 +8640,7 @@ EXPORT_SYMBOL(ufshcd_runtime_idle);
> >  int ufshcd_shutdown(struct ufs_hba *hba)
> >  {
> >  	int ret = 0;
> > +	struct scsi_target *starget;
> >  
> >  	if (!hba->is_powered)
> >  		goto out;
> > @@ -8647,11 +8648,27 @@ int ufshcd_shutdown(struct ufs_hba *hba)
> >  	if (ufshcd_is_ufs_dev_poweroff(hba) && ufshcd_is_link_off(hba))
> >  		goto out;
> >  
> > -	if (pm_runtime_suspended(hba->dev)) {
> > -		ret = ufshcd_runtime_resume(hba);
> > -		if (ret)
> > -			goto out;
> > -	}
> > +	/*
> > +	 * Let runtime PM framework manage and prevent concurrent runtime
> > +	 * operations with shutdown flow.
> > +	 */
> > +	pm_runtime_get_sync(hba->dev);
> > +
> > +	/*
> > +	 * Quiesce all SCSI devices to prevent any non-PM requests sending
> > +	 * from block layer during and after shutdown.
> > +	 *
> > +	 * Here we can not use blk_cleanup_queue() since PM requests
> > +	 * (with BLK_MQ_REQ_PREEMPT flag) are still required to be sent
> > +	 * through block layer. Therefore SCSI command queued after the
> > +	 * scsi_target_quiesce() call returned will block until
> > +	 * blk_cleanup_queue() is called.
> > +	 *
> > +	 * Besides, scsi_target_"un"quiesce (e.g., scsi_target_resume) can
> > +	 * be ignored since shutdown is one-way flow.
> > +	 */
> > +	list_for_each_entry(starget, &hba->host->__targets, siblings)
> > +		scsi_target_quiesce(starget);
> >  
> >  	ret = ufshcd_suspend(hba, UFS_SHUTDOWN_PM);
> >  out:
> 
> This seems wrong to me. Since ufshcd_shutdown() shuts down the link I think
> it should call scsi_remove_device() instead of scsi_target_quiesce().

I tried many ways to come up with the final solution. Currently, two
options are being considered:

== Option 1 ==
	pm_runtime_get_sync(hba->dev);

	shost_for_each_device(sdev, hba->host) {
		scsi_autopm_get_device(sdev);
		if (sdev == hba->sdev_ufs_device)
			scsi_device_quiesce(sdev);
		else
			scsi_remove_device(sdev);
	}

	ret = ufshcd_suspend(hba, UFS_SHUTDOWN_PM);

	scsi_remove_device(hba->sdev_ufs_device);

Note: using scsi_autopm_get_device() instead of pm_runtime_disable()
prevents a noisy message from the check below,

	WARN_ON_ONCE(sdev->quiesced_by && sdev->quiesced_by != current);

in
https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/drivers/scsi/scsi_lib.c#n2515

This warning shows up if we try to quiesce a runtime-suspended SCSI
device, which is possible during our new shutdown flow. Using
scsi_autopm_get_device() to resume all SCSI devices first prevents it.

In addition, sd_shutdown() is normally executed before
ufshcd_shutdown(). If scsi_remove_device() is invoked from
ufshcd_shutdown(), sd_shutdown() will be executed again for a SCSI disk
via

[  131.398977]  sd_shutdown+0x44/0x118
[  131.399416]  sd_remove+0x5c/0xc4
[  131.399824]  device_release_driver_internal+0x1c4/0x2e4
[  131.400481]  device_release_driver+0x18/0x24
[  131.401018]  bus_remove_device+0x108/0x134
[  131.401533]  device_del+0x2dc/0x630
[  131.401973]  __scsi_remove_device+0xc0/0x174
[  131.402510]  scsi_remove_device+0x30/0x48
[  131.403014]  ufshcd_shutdown+0xc8/0x138

In this case, we can see that the SYNCHRONIZE_CACHE command is sent to the
same SCSI device twice, which is kind of weird during the shutdown flow.

Moreover, considering the performance of ufshcd_shutdown(), Option 1
obviously degrades the latency a lot because of scsi_remove_device().
Please see the "Performance Measurement" data below.

Compared with Option 1, Option 2 below is simpler and still effective,
so it may be a better compromise.

== Option 2  ==
	pm_runtime_get_sync(hba->dev);

	shost_for_each_device(sdev, hba->host) {
		scsi_autopm_get_device(sdev);
		scsi_device_quiesce(sdev);
	}

== Performance Measurement ==
As-Is: < 5 ms
Option 1: 850 ms
Option 2: 60 ms

What would you prefer? Or would you have any further suggestions?

Thanks,

Stanley Chu


* Re: [PATCH v7] scsi: ufs: Quiesce all scsi devices before shutdown
  2020-08-13  8:55   ` Stanley Chu
@ 2020-08-14  2:52     ` Bart Van Assche
  0 siblings, 0 replies; 9+ messages in thread
From: Bart Van Assche @ 2020-08-14  2:52 UTC (permalink / raw)
  To: Stanley Chu
  Cc: Jiajie Hao (郝加节),
	linux-scsi, martin.petersen, Andy Teng (鄧如宏),
	jejb, Chun-Hung Wu (巫駿宏),
	Kuohong Wang (王國鴻),
	linux-kernel, avri.altman, cang, linux-mediatek,
	Peter Wang (王信友),
	alim.akhtar, matthias.bgg, asutoshd,
	Chaotian Jing (井朝天),
	CC Chou (周志杰),
	linux-arm-kernel, beanhuo

On 2020-08-13 01:55, Stanley Chu wrote:
> I tried many ways to come out the final solution. Currently two options
> are considered,
> 
> == Option 1 ==
> 	pm_runtime_get_sync(hba->dev);
> 
> 	shost_for_each_device(sdev, hba->host) {
> 		scsi_autopm_get_device(sdev);
> 		if (sdev == hba->sdev_ufs_device)
> 			scsi_device_quiesce(sdev);
> 		else
> 			scsi_remove_device(sdev);
> 	}
> 
> 	ret = ufshcd_suspend(hba, UFS_SHUTDOWN_PM);
> 
> 	scsi_remove_device(hba->sdev_ufs_device);
> 
> Note. Using scsi_autopm_get_device() instead of pm_runtime_disable()
> is to prevent noisy message by below checking,
> 
> 	WARN_ON_ONCE(sdev->quiesced_by && sdev->quiesced_by != current);
> 
> in
> https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/drivers/scsi/scsi_lib.c#n2515
> 
> This warning shows up if we try to quiesce a runtime-suspended SCSI
> device. This is possible during our new shutdown flow. Using
> scsi_autopm_get_device() to resume all SCSI devices first can prevent
> it.
> 
> In addition, normally sd_shutdown() would be executed prior than
> ufshcd_shutdown(). If scsi_remove_device() is invoked by
> ufshcd_shutdown(), sd_shutdown() will be executed again for a SCSI disk
> by
> 
> [  131.398977]  sd_shutdown+0x44/0x118
> [  131.399416]  sd_remove+0x5c/0xc4
> [  131.399824]  device_release_driver_internal+0x1c4/0x2e4
> [  131.400481]  device_release_driver+0x18/0x24
> [  131.401018]  bus_remove_device+0x108/0x134
> [  131.401533]  device_del+0x2dc/0x630
> [  131.401973]  __scsi_remove_device+0xc0/0x174
> [  131.402510]  scsi_remove_device+0x30/0x48
> [  131.403014]  ufshcd_shutdown+0xc8/0x138
> 
> In this case, we could see SYNCHRONIZE_CACHE command will be sent to the
> same SCSI device twice. This is kind of wired during shutdown flow.
> 
> Moreover, in consideration of performance of ufshcd_shutdown(), Option 1
> obviously degrades the latency a lot by scsi_remove_device(). Please see
> the "Performance Measurement" data below.
> 
> Compared Option 2, this way is simpler and also effective. This way may
> be a better compromise.
> 
> == Option 2  ==
> 	pm_runtime_get_sync(hba->dev);
> 
> 	shost_for_each_device(sdev, hba->host) {
> 		scsi_autopm_get_device(sdev);
> 		scsi_device_quiesce(sdev);
> 	}
> 
> == Performance Measurement ==
> As-Is: < 5 ms
> Option 1: 850 ms
> Option 2: 60 ms
> 
> What would you prefer? Or would you have any further suggestions?

Hi Stanley,

Thanks for the detailed report and also for having shared timing information.

The approach of option 2 seems wrong to me because the SCSI devices are not
removed. My concern is that option (2) could cause the sd driver to send SYNC
and/or STOP commands to the device after its PCIe resources have been freed,
resulting in a crash.

Please take a look at the output of the following command:

$ git grep -nHA10 'struct pci_driver.* = {$' */scsi |
  sed -e 's/-/:/' -e 's/-/:/' |
  grep ':[[:blank:]]*\.remove'

It seems to me that other SCSI LLDs do at least the following in their PCIe
removal callback:

1. Call scsi_remove_host()
2. Call scsi_host_put()
3. Call pci_disable_device()

Would that approach work for UFS? Would offlining the UFS LUNs (SDEV_OFFLINE)
before calling the above functions make SCSI host removal faster? See also
scsi_prep_state_check().
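
As a rough sketch (untested; placement, locking and a platform rather
than PCIe host are assumed), the UFS variant could look roughly like
this:

	struct scsi_device *sdev;

	/*
	 * Optionally offline the LUNs first so removal does not wait on
	 * outstanding I/O (simplified: state locking elided).
	 */
	shost_for_each_device(sdev, hba->host)
		scsi_device_set_state(sdev, SDEV_OFFLINE);

	scsi_remove_host(hba->host);	/* unbind sd and remove all LUNs */
	scsi_host_put(hba->host);	/* drop the host reference */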

Thanks,

Bart.
