[PATCH V2 0/3] scsi: ufs: Let devices remain runtime suspended during system suspend


* [PATCH V2 0/3] scsi: ufs: Let devices remain runtime suspended during system suspend
@ 2021-09-03  9:56 Adrian Hunter
  2021-09-03  9:56 ` [PATCH V2 1/3] scsi: ufs: Fix error handler clear ua deadlock Adrian Hunter
                   ` (2 more replies)
  0 siblings, 3 replies; 8+ messages in thread
From: Adrian Hunter @ 2021-09-03  9:56 UTC
  To: Martin K . Petersen
  Cc: James E . J . Bottomley, Bean Huo, Avri Altman, Alim Akhtar,
	Can Guo, Asutosh Das, Bart Van Assche, Manivannan Sadhasivam,
	Wei Li, linux-scsi

Hi

UFS devices can remain runtime suspended at system suspend time if
the runtime suspended state already matches the state that system
suspend would use.  Add support for that, first fixing the impediments.


Changes in V2:

    scsi: ufs: Let devices remain runtime suspended during system suspend

	The ufs-hisi driver handles runtime PM (RPM) and system PM (SPM)
	differently; that is now made explicit by a new parameter to
	suspend prepare.


Adrian Hunter (3):
      scsi: ufs: Fix error handler clear ua deadlock
      scsi: ufs: Fix runtime PM dependencies getting broken
      scsi: ufs: Let devices remain runtime suspended during system suspend

 drivers/scsi/scsi_pm.c      | 16 ++++++---
 drivers/scsi/ufs/ufs-hisi.c |  8 ++++-
 drivers/scsi/ufs/ufshcd.c   | 87 +++++++++++++++++++++++++++++++--------------
 drivers/scsi/ufs/ufshcd.h   | 12 ++++++-
 include/scsi/scsi_device.h  |  1 +
 5 files changed, 90 insertions(+), 34 deletions(-)


Regards
Adrian


* [PATCH V2 1/3] scsi: ufs: Fix error handler clear ua deadlock
  2021-09-03  9:56 [PATCH V2 0/3] scsi: ufs: Let devices remain runtime suspended during system suspend Adrian Hunter
@ 2021-09-03  9:56 ` Adrian Hunter
  2021-09-03 20:29   ` Bart Van Assche
  2021-09-03  9:56 ` [PATCH V2 2/3] scsi: ufs: Fix runtime PM dependencies getting broken Adrian Hunter
  2021-09-03  9:56 ` [PATCH V2 3/3] scsi: ufs: Let devices remain runtime suspended during system suspend Adrian Hunter
  2 siblings, 1 reply; 8+ messages in thread
From: Adrian Hunter @ 2021-09-03  9:56 UTC
  To: Martin K . Petersen
  Cc: James E . J . Bottomley, Bean Huo, Avri Altman, Alim Akhtar,
	Can Guo, Asutosh Das, Bart Van Assche, Manivannan Sadhasivam,
	Wei Li, linux-scsi

There is no guarantee to be able to enter the queue if requests are
blocked. That is because freezing the queue will block entry to the
queue, but freezing also waits for outstanding requests which can make
no progress while the queue is blocked.

That situation can happen when the error handler issues requests to
clear unit attention condition. The deadlock is very unlikely, and the
error handler can be expected to clear ua at some point anyway, so the
simple solution is not to wait to enter the queue.

Additionally, note that the RPMB queue might not be entered because
it is runtime suspended, but in that case ua will be cleared at RPMB
runtime resume.

Fixes: aa53f580e67b49 ("scsi: ufs: Minor adjustments to error handling")
Cc: stable@vger.kernel.org
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Reviewed-by: Asutosh Das <asutoshd@codeaurora.org>
---
 drivers/scsi/ufs/ufshcd.c | 33 +++++++++++++++++++--------------
 1 file changed, 19 insertions(+), 14 deletions(-)

diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
index 67889d74761c..52fb059efa77 100644
--- a/drivers/scsi/ufs/ufshcd.c
+++ b/drivers/scsi/ufs/ufshcd.c
@@ -224,7 +224,7 @@ static int ufshcd_reset_and_restore(struct ufs_hba *hba);
 static int ufshcd_eh_host_reset_handler(struct scsi_cmnd *cmd);
 static int ufshcd_clear_tm_cmd(struct ufs_hba *hba, int tag);
 static void ufshcd_hba_exit(struct ufs_hba *hba);
-static int ufshcd_clear_ua_wluns(struct ufs_hba *hba);
+static int ufshcd_clear_ua_wluns(struct ufs_hba *hba, bool nowait);
 static int ufshcd_probe_hba(struct ufs_hba *hba, bool async);
 static int ufshcd_setup_clocks(struct ufs_hba *hba, bool on);
 static int ufshcd_uic_hibern8_enter(struct ufs_hba *hba);
@@ -4110,7 +4110,7 @@ int ufshcd_link_recovery(struct ufs_hba *hba)
 		dev_err(hba->dev, "%s: link recovery failed, err %d",
 			__func__, ret);
 	else
-		ufshcd_clear_ua_wluns(hba);
+		ufshcd_clear_ua_wluns(hba, false);
 
 	return ret;
 }
@@ -5974,7 +5974,7 @@ static void ufshcd_err_handling_unprepare(struct ufs_hba *hba)
 	ufshcd_release(hba);
 	if (ufshcd_is_clkscaling_supported(hba))
 		ufshcd_clk_scaling_suspend(hba, false);
-	ufshcd_clear_ua_wluns(hba);
+	ufshcd_clear_ua_wluns(hba, true);
 	ufshcd_rpm_put(hba);
 }
 
@@ -7907,7 +7907,7 @@ static int ufshcd_add_lus(struct ufs_hba *hba)
 	if (ret)
 		goto out;
 
-	ufshcd_clear_ua_wluns(hba);
+	ufshcd_clear_ua_wluns(hba, false);
 
 	/* Initialize devfreq after UFS device is detected */
 	if (ufshcd_is_clkscaling_supported(hba)) {
@@ -7943,7 +7943,8 @@ static void ufshcd_request_sense_done(struct request *rq, blk_status_t error)
 }
 
 static int
-ufshcd_request_sense_async(struct ufs_hba *hba, struct scsi_device *sdev)
+ufshcd_request_sense_async(struct ufs_hba *hba, struct scsi_device *sdev,
+			   bool nowait)
 {
 	/*
 	 * Some UFS devices clear unit attention condition only if the sense
@@ -7951,6 +7952,7 @@ ufshcd_request_sense_async(struct ufs_hba *hba, struct scsi_device *sdev)
 	 */
 	static const u8 cmd[6] = {REQUEST_SENSE, 0, 0, 0, UFS_SENSE_SIZE, 0};
 	struct scsi_request *rq;
+	blk_mq_req_flags_t flags;
 	struct request *req;
 	char *buffer;
 	int ret;
@@ -7959,8 +7961,8 @@ ufshcd_request_sense_async(struct ufs_hba *hba, struct scsi_device *sdev)
 	if (!buffer)
 		return -ENOMEM;
 
-	req = blk_get_request(sdev->request_queue, REQ_OP_DRV_IN,
-			      /*flags=*/BLK_MQ_REQ_PM);
+	flags = BLK_MQ_REQ_PM | (nowait ? BLK_MQ_REQ_NOWAIT : 0);
+	req = blk_get_request(sdev->request_queue, REQ_OP_DRV_IN, flags);
 	if (IS_ERR(req)) {
 		ret = PTR_ERR(req);
 		goto out_free;
@@ -7990,7 +7992,7 @@ ufshcd_request_sense_async(struct ufs_hba *hba, struct scsi_device *sdev)
 	return ret;
 }
 
-static int ufshcd_clear_ua_wlun(struct ufs_hba *hba, u8 wlun)
+static int ufshcd_clear_ua_wlun(struct ufs_hba *hba, u8 wlun, bool nowait)
 {
 	struct scsi_device *sdp;
 	unsigned long flags;
@@ -8016,7 +8018,10 @@ static int ufshcd_clear_ua_wlun(struct ufs_hba *hba, u8 wlun)
 	if (ret)
 		goto out_err;
 
-	ret = ufshcd_request_sense_async(hba, sdp);
+	ret = ufshcd_request_sense_async(hba, sdp, nowait);
+	if (nowait && ret && wlun == UFS_UPIU_RPMB_WLUN &&
+	    pm_runtime_suspended(&sdp->sdev_gendev))
+		ret = 0; /* RPMB runtime resume will clear UAC */
 	scsi_device_put(sdp);
 out_err:
 	if (ret)
@@ -8025,16 +8030,16 @@ static int ufshcd_clear_ua_wlun(struct ufs_hba *hba, u8 wlun)
 	return ret;
 }
 
-static int ufshcd_clear_ua_wluns(struct ufs_hba *hba)
+static int ufshcd_clear_ua_wluns(struct ufs_hba *hba, bool nowait)
 {
 	int ret = 0;
 
 	if (!hba->wlun_dev_clr_ua)
 		goto out;
 
-	ret = ufshcd_clear_ua_wlun(hba, UFS_UPIU_UFS_DEVICE_WLUN);
+	ret = ufshcd_clear_ua_wlun(hba, UFS_UPIU_UFS_DEVICE_WLUN, nowait);
 	if (!ret)
-		ret = ufshcd_clear_ua_wlun(hba, UFS_UPIU_RPMB_WLUN);
+		ret = ufshcd_clear_ua_wlun(hba, UFS_UPIU_RPMB_WLUN, nowait);
 	if (!ret)
 		hba->wlun_dev_clr_ua = false;
 out:
@@ -8656,7 +8661,7 @@ static int ufshcd_set_dev_pwr_mode(struct ufs_hba *hba,
 	 */
 	hba->host->eh_noresume = 1;
 	if (hba->wlun_dev_clr_ua)
-		ufshcd_clear_ua_wlun(hba, UFS_UPIU_UFS_DEVICE_WLUN);
+		ufshcd_clear_ua_wlun(hba, UFS_UPIU_UFS_DEVICE_WLUN, false);
 
 	cmd[4] = pwr_mode << 4;
 
@@ -9825,7 +9830,7 @@ static inline int ufshcd_clear_rpmb_uac(struct ufs_hba *hba)
 
 	if (!hba->wlun_rpmb_clr_ua)
 		return 0;
-	ret = ufshcd_clear_ua_wlun(hba, UFS_UPIU_RPMB_WLUN);
+	ret = ufshcd_clear_ua_wlun(hba, UFS_UPIU_RPMB_WLUN, false);
 	if (!ret)
 		hba->wlun_rpmb_clr_ua = 0;
 	return ret;
-- 
2.17.1
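
Editorial sketch, not part of the posted patch: the two behavioural changes
in the diff above (adding the no-wait flag to the request flags, and
forgiving a failed no-wait attempt on a runtime-suspended RPMB WLUN) can be
modelled in plain userspace C. The MODEL_* constants and model_* functions
are hypothetical stand-ins invented for illustration; their values do not
correspond to the kernel's BLK_MQ_REQ_* definitions.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-ins for BLK_MQ_REQ_NOWAIT / BLK_MQ_REQ_PM (values are illustrative). */
#define MODEL_REQ_NOWAIT (1u << 0)
#define MODEL_REQ_PM     (1u << 1)

/* Mirrors: flags = BLK_MQ_REQ_PM | (nowait ? BLK_MQ_REQ_NOWAIT : 0); */
static unsigned int model_request_flags(bool nowait)
{
	unsigned int flags = MODEL_REQ_PM | (nowait ? MODEL_REQ_NOWAIT : 0);

	/* The PM flag is always present, with or without nowait. */
	assert(flags & MODEL_REQ_PM);
	return flags;
}

/*
 * Mirrors the result filtering added to ufshcd_clear_ua_wlun(): a failure
 * is forgiven only for a non-waiting attempt on an RPMB WLUN that is
 * runtime suspended, because RPMB runtime resume clears UAC later.
 */
static int model_clear_ua_result(int ret, bool nowait, bool is_rpmb,
				 bool runtime_suspended)
{
	if (nowait && ret && is_rpmb && runtime_suspended)
		return 0;
	return ret;
}
```

In this model only the error-handler path would pass nowait = true; every
other caller keeps the waiting behaviour and sees any error unchanged.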



* [PATCH V2 2/3] scsi: ufs: Fix runtime PM dependencies getting broken
  2021-09-03  9:56 [PATCH V2 0/3] scsi: ufs: Let devices remain runtime suspended during system suspend Adrian Hunter
  2021-09-03  9:56 ` [PATCH V2 1/3] scsi: ufs: Fix error handler clear ua deadlock Adrian Hunter
@ 2021-09-03  9:56 ` Adrian Hunter
  2021-09-03  9:56 ` [PATCH V2 3/3] scsi: ufs: Let devices remain runtime suspended during system suspend Adrian Hunter
  2 siblings, 0 replies; 8+ messages in thread
From: Adrian Hunter @ 2021-09-03  9:56 UTC
  To: Martin K . Petersen
  Cc: James E . J . Bottomley, Bean Huo, Avri Altman, Alim Akhtar,
	Can Guo, Asutosh Das, Bart Van Assche, Manivannan Sadhasivam,
	Wei Li, linux-scsi

UFS SCSI devices make use of device links to establish PM dependencies.
However, SCSI PM will force devices' runtime PM state to be active during
system resume. That can break runtime PM dependencies for UFS devices.
Fix by adding a flag 'preserve_rpm' to let UFS SCSI devices opt out of
the unwanted behaviour.

Fixes: b294ff3e34490f ("scsi: ufs: core: Enable power management for wlun")
Cc: stable@vger.kernel.org
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
 drivers/scsi/scsi_pm.c     | 16 +++++++++++-----
 drivers/scsi/ufs/ufshcd.c  |  1 +
 include/scsi/scsi_device.h |  1 +
 3 files changed, 13 insertions(+), 5 deletions(-)

diff --git a/drivers/scsi/scsi_pm.c b/drivers/scsi/scsi_pm.c
index 3717eea37ecb..0557c1ad304d 100644
--- a/drivers/scsi/scsi_pm.c
+++ b/drivers/scsi/scsi_pm.c
@@ -73,13 +73,22 @@ static int scsi_dev_type_resume(struct device *dev,
 		int (*cb)(struct device *, const struct dev_pm_ops *))
 {
 	const struct dev_pm_ops *pm = dev->driver ? dev->driver->pm : NULL;
+	struct scsi_device *sdev = NULL;
+	bool preserve_rpm = false;
 	int err = 0;
 
+	if (scsi_is_sdev_device(dev)) {
+		sdev = to_scsi_device(dev);
+		preserve_rpm = sdev->preserve_rpm;
+		if (preserve_rpm && pm_runtime_suspended(dev))
+			return 0;
+	}
+
 	err = cb(dev, pm);
 	scsi_device_resume(to_scsi_device(dev));
 	dev_dbg(dev, "scsi resume: %d\n", err);
 
-	if (err == 0) {
+	if (err == 0 && !preserve_rpm) {
 		pm_runtime_disable(dev);
 		err = pm_runtime_set_active(dev);
 		pm_runtime_enable(dev);
@@ -91,11 +100,8 @@ static int scsi_dev_type_resume(struct device *dev,
 		 *
 		 * The resume hook will correct runtime PM status of the disk.
 		 */
-		if (!err && scsi_is_sdev_device(dev)) {
-			struct scsi_device *sdev = to_scsi_device(dev);
-
+		if (!err && sdev)
 			blk_set_runtime_active(sdev->request_queue);
-		}
 	}
 
 	return err;
diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
index 52fb059efa77..57ed4b93b949 100644
--- a/drivers/scsi/ufs/ufshcd.c
+++ b/drivers/scsi/ufs/ufshcd.c
@@ -5016,6 +5016,7 @@ static int ufshcd_slave_configure(struct scsi_device *sdev)
 		pm_runtime_get_noresume(&sdev->sdev_gendev);
 	else if (ufshcd_is_rpm_autosuspend_allowed(hba))
 		sdev->rpm_autosuspend = 1;
+	sdev->preserve_rpm = 1;
 
 	ufshcd_crypto_setup_rq_keyslot_manager(hba, q);
 
diff --git a/include/scsi/scsi_device.h b/include/scsi/scsi_device.h
index 09a17f6e93a7..47eb30a6b7b2 100644
--- a/include/scsi/scsi_device.h
+++ b/include/scsi/scsi_device.h
@@ -197,6 +197,7 @@ struct scsi_device {
 	unsigned no_read_disc_info:1;	/* Avoid READ_DISC_INFO cmds */
 	unsigned no_read_capacity_16:1; /* Avoid READ_CAPACITY_16 cmds */
 	unsigned try_rc_10_first:1;	/* Try READ_CAPACACITY_10 first */
+	unsigned preserve_rpm:1;	/* Preserve runtime PM */
 	unsigned security_supported:1;	/* Supports Security Protocols */
 	unsigned is_visible:1;	/* is the device visible in sysfs */
 	unsigned wce_default_on:1;	/* Cache is ON by default */
-- 
2.17.1
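
Editorial sketch, not part of the posted patch: the control flow that the
diff above gives scsi_dev_type_resume() can be modelled in userspace C as
follows. struct model_dev and model_dev_type_resume() are hypothetical and
heavily simplified; in particular the model ignores the possibility that
forcing the runtime PM status to active fails.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified device state; not a kernel structure. */
struct model_dev {
	bool is_sdev;           /* models scsi_is_sdev_device() */
	bool preserve_rpm;      /* models sdev->preserve_rpm */
	bool runtime_suspended; /* models pm_runtime_suspended() */
	bool resumed_by_cb;     /* set when the resume callback ran */
	bool forced_active;     /* set when RPM status was forced to active */
};

/* cb_err stands in for the return value of the driver's resume callback. */
static int model_dev_type_resume(struct model_dev *d, int cb_err)
{
	bool preserve_rpm;

	assert(d != NULL);
	preserve_rpm = d->is_sdev && d->preserve_rpm;

	/* Opted-out devices that are runtime suspended are left alone. */
	if (preserve_rpm && d->runtime_suspended)
		return 0;

	d->resumed_by_cb = true;

	/* Only devices that did not opt out get their RPM status forced. */
	if (cb_err == 0 && !preserve_rpm) {
		d->runtime_suspended = false;
		d->forced_active = true;
	}
	return cb_err;
}
```

The point of the model is that a device with preserve_rpm set never has its
runtime PM status rewritten behind its back, so a device-link dependency on
that status stays consistent across system resume.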



* [PATCH V2 3/3] scsi: ufs: Let devices remain runtime suspended during system suspend
  2021-09-03  9:56 [PATCH V2 0/3] scsi: ufs: Let devices remain runtime suspended during system suspend Adrian Hunter
  2021-09-03  9:56 ` [PATCH V2 1/3] scsi: ufs: Fix error handler clear ua deadlock Adrian Hunter
  2021-09-03  9:56 ` [PATCH V2 2/3] scsi: ufs: Fix runtime PM dependencies getting broken Adrian Hunter
@ 2021-09-03  9:56 ` Adrian Hunter
  2 siblings, 0 replies; 8+ messages in thread
From: Adrian Hunter @ 2021-09-03  9:56 UTC
  To: Martin K . Petersen
  Cc: James E . J . Bottomley, Bean Huo, Avri Altman, Alim Akhtar,
	Can Guo, Asutosh Das, Bart Van Assche, Manivannan Sadhasivam,
	Wei Li, linux-scsi

If the UFS Device WLUN is runtime suspended and is already in the power
mode, link state and b_rpm_dev_flush_capable (BKOP or WB buffer flush etc)
state that system suspend would put it in, then it can remain runtime
suspended instead of being runtime resumed and then system suspended.

The following patches have cleared the way for that to happen:
  scsi: ufs: Fix runtime PM dependencies getting broken
  scsi: ufs: Fix error handler clear ua deadlock

So amend the logic accordingly.

Note, the ufs-hisi driver handles runtime PM (RPM) and system PM (SPM)
differently; that is now made explicit by a new parameter to suspend
prepare.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---


Changes in V2:

	The ufs-hisi driver handles runtime PM (RPM) and system PM (SPM)
	differently; that is now made explicit by a new parameter to
	suspend prepare.


 drivers/scsi/ufs/ufs-hisi.c |  8 +++++-
 drivers/scsi/ufs/ufshcd.c   | 53 ++++++++++++++++++++++++++++---------
 drivers/scsi/ufs/ufshcd.h   | 12 ++++++++-
 3 files changed, 58 insertions(+), 15 deletions(-)

diff --git a/drivers/scsi/ufs/ufs-hisi.c b/drivers/scsi/ufs/ufs-hisi.c
index 6b706de8354b..4a08fb35642c 100644
--- a/drivers/scsi/ufs/ufs-hisi.c
+++ b/drivers/scsi/ufs/ufs-hisi.c
@@ -396,6 +396,12 @@ static int ufs_hisi_pwr_change_notify(struct ufs_hba *hba,
 	return ret;
 }
 
+static int ufs_hisi_suspend_prepare(struct device *dev)
+{
+	/* RPM and SPM are different. Refer ufs_hisi_suspend() */
+	return __ufshcd_suspend_prepare(dev, false);
+}
+
 static int ufs_hisi_suspend(struct ufs_hba *hba, enum ufs_pm_op pm_op)
 {
 	struct ufs_hisi_host *host = ufshcd_get_variant(hba);
@@ -574,7 +580,7 @@ static int ufs_hisi_remove(struct platform_device *pdev)
 static const struct dev_pm_ops ufs_hisi_pm_ops = {
 	SET_SYSTEM_SLEEP_PM_OPS(ufshcd_system_suspend, ufshcd_system_resume)
 	SET_RUNTIME_PM_OPS(ufshcd_runtime_suspend, ufshcd_runtime_resume, NULL)
-	.prepare	 = ufshcd_suspend_prepare,
+	.prepare	 = ufs_hisi_suspend_prepare,
 	.complete	 = ufshcd_resume_complete,
 };
 
diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
index 57ed4b93b949..453fbb8753e2 100644
--- a/drivers/scsi/ufs/ufshcd.c
+++ b/drivers/scsi/ufs/ufshcd.c
@@ -9722,14 +9722,30 @@ void ufshcd_resume_complete(struct device *dev)
 		ufshcd_rpm_put(hba);
 		hba->complete_put = false;
 	}
-	if (hba->rpmb_complete_put) {
-		ufshcd_rpmb_rpm_put(hba);
-		hba->rpmb_complete_put = false;
-	}
 }
 EXPORT_SYMBOL_GPL(ufshcd_resume_complete);
 
-int ufshcd_suspend_prepare(struct device *dev)
+static bool ufshcd_rpm_ok_for_spm(struct ufs_hba *hba)
+{
+	struct device *dev = &hba->sdev_ufs_device->sdev_gendev;
+	enum ufs_dev_pwr_mode dev_pwr_mode;
+	enum uic_link_state link_state;
+	unsigned long flags;
+	bool res;
+
+	spin_lock_irqsave(&dev->power.lock, flags);
+	dev_pwr_mode = ufs_get_pm_lvl_to_dev_pwr_mode(hba->spm_lvl);
+	link_state = ufs_get_pm_lvl_to_link_pwr_state(hba->spm_lvl);
+	res = pm_runtime_suspended(dev) &&
+	      hba->curr_dev_pwr_mode == dev_pwr_mode &&
+	      hba->uic_link_state == link_state &&
+	      !hba->dev_info.b_rpm_dev_flush_capable;
+	spin_unlock_irqrestore(&dev->power.lock, flags);
+
+	return res;
+}
+
+int __ufshcd_suspend_prepare(struct device *dev, bool rpm_ok_for_spm)
 {
 	struct ufs_hba *hba = dev_get_drvdata(dev);
 	int ret;
@@ -9741,19 +9757,30 @@ int ufshcd_suspend_prepare(struct device *dev)
 	 * Refer ufshcd_resume_complete()
 	 */
 	if (hba->sdev_ufs_device) {
-		ret = ufshcd_rpm_get_sync(hba);
-		if (ret < 0 && ret != -EACCES) {
-			ufshcd_rpm_put(hba);
-			return ret;
+		/* Prevent runtime suspend */
+		ufshcd_rpm_get_noresume(hba);
+		/*
+		 * Check if already runtime suspended in same state as system
+		 * suspend would be.
+		 */
+		if (!rpm_ok_for_spm || !ufshcd_rpm_ok_for_spm(hba)) {
+			/* RPM state is not ok for SPM, so runtime resume */
+			ret = ufshcd_rpm_resume(hba);
+			if (ret < 0 && ret != -EACCES) {
+				ufshcd_rpm_put(hba);
+				return ret;
+			}
 		}
 		hba->complete_put = true;
 	}
-	if (hba->sdev_rpmb) {
-		ufshcd_rpmb_rpm_get_sync(hba);
-		hba->rpmb_complete_put = true;
-	}
 	return 0;
 }
+EXPORT_SYMBOL_GPL(__ufshcd_suspend_prepare);
+
+int ufshcd_suspend_prepare(struct device *dev)
+{
+	return __ufshcd_suspend_prepare(dev, true);
+}
 EXPORT_SYMBOL_GPL(ufshcd_suspend_prepare);
 
 #ifdef CONFIG_PM_SLEEP
diff --git a/drivers/scsi/ufs/ufshcd.h b/drivers/scsi/ufs/ufshcd.h
index 4723f27a55d1..1dc8024d5211 100644
--- a/drivers/scsi/ufs/ufshcd.h
+++ b/drivers/scsi/ufs/ufshcd.h
@@ -915,7 +915,6 @@ struct ufs_hba {
 #endif
 	u32 luns_avail;
 	bool complete_put;
-	bool rpmb_complete_put;
 };
 
 /* Returns true if clocks can be gated. Otherwise false */
@@ -1175,6 +1174,7 @@ int ufshcd_exec_raw_upiu_cmd(struct ufs_hba *hba,
 
 int ufshcd_wb_toggle(struct ufs_hba *hba, bool enable);
 int ufshcd_suspend_prepare(struct device *dev);
+int __ufshcd_suspend_prepare(struct device *dev, bool rpm_ok_for_spm);
 void ufshcd_resume_complete(struct device *dev);
 
 /* Wrapper functions for safely calling variant operations */
@@ -1383,6 +1383,16 @@ static inline int ufshcd_rpm_put_sync(struct ufs_hba *hba)
 	return pm_runtime_put_sync(&hba->sdev_ufs_device->sdev_gendev);
 }
 
+static inline void ufshcd_rpm_get_noresume(struct ufs_hba *hba)
+{
+	pm_runtime_get_noresume(&hba->sdev_ufs_device->sdev_gendev);
+}
+
+static inline int ufshcd_rpm_resume(struct ufs_hba *hba)
+{
+	return pm_runtime_resume(&hba->sdev_ufs_device->sdev_gendev);
+}
+
 static inline int ufshcd_rpm_put(struct ufs_hba *hba)
 {
 	return pm_runtime_put(&hba->sdev_ufs_device->sdev_gendev);
-- 
2.17.1
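
Editorial sketch, not part of the posted patch: the decision that the diff
above adds to suspend prepare can be modelled in userspace C. struct
model_hba and the model_* functions are hypothetical; the spm_* fields stand
in for the values that the patch derives from hba->spm_lvl, and the locking
done by the real ufshcd_rpm_ok_for_spm() is omitted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified host state; not the kernel's struct ufs_hba. */
struct model_hba {
	bool runtime_suspended;     /* Device WLUN is runtime suspended */
	int curr_dev_pwr_mode;      /* current device power mode */
	int spm_dev_pwr_mode;       /* power mode system suspend would use */
	int curr_link_state;        /* current link state */
	int spm_link_state;         /* link state system suspend would use */
	bool rpm_dev_flush_capable; /* models b_rpm_dev_flush_capable */
};

/* Models ufshcd_rpm_ok_for_spm(): all four conditions must hold. */
static bool model_rpm_ok_for_spm(const struct model_hba *h)
{
	assert(h != NULL);
	return h->runtime_suspended &&
	       h->curr_dev_pwr_mode == h->spm_dev_pwr_mode &&
	       h->curr_link_state == h->spm_link_state &&
	       !h->rpm_dev_flush_capable;
}

/*
 * Models the check in __ufshcd_suspend_prepare(): returns true when the
 * device must be runtime resumed before system suspend. Passing
 * rpm_ok_for_spm = false (as the ufs-hisi wrapper does) always resumes.
 */
static bool model_prepare_needs_resume(const struct model_hba *h,
				       bool rpm_ok_for_spm)
{
	return !rpm_ok_for_spm || !model_rpm_ok_for_spm(h);
}
```

A driver whose runtime and system suspend paths differ would pass false and
keep the old always-resume behaviour; other drivers skip the resume only
when every compared field already matches.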



* Re: [PATCH V2 1/3] scsi: ufs: Fix error handler clear ua deadlock
  2021-09-03  9:56 ` [PATCH V2 1/3] scsi: ufs: Fix error handler clear ua deadlock Adrian Hunter
@ 2021-09-03 20:29   ` Bart Van Assche
  2021-09-05  9:51     ` Adrian Hunter
  0 siblings, 1 reply; 8+ messages in thread
From: Bart Van Assche @ 2021-09-03 20:29 UTC
  To: Adrian Hunter, Martin K . Petersen
  Cc: James E . J . Bottomley, Bean Huo, Avri Altman, Alim Akhtar,
	Can Guo, Asutosh Das, Manivannan Sadhasivam, Wei Li, linux-scsi

On 9/3/21 2:56 AM, Adrian Hunter wrote:
> There is no guarantee to be able to enter the queue if requests are
> blocked. That is because freezing the queue will block entry to the
> queue, but freezing also waits for outstanding requests which can make
> no progress while the queue is blocked.
> 
> That situation can happen when the error handler issues requests to
> clear unit attention condition. The deadlock is very unlikely, and the
> error handler can be expected to clear ua at some point anyway, so the
> simple solution is not to wait to enter the queue.
> 
> Additionally, note that the RPMB queue might not be entered because
> it is runtime suspended, but in that case ua will be cleared at RPMB
> runtime resume.

The only ufshcd_clear_ua_wluns() call that I am aware of and that is 
related to error handling is the call in 
ufshcd_err_handling_unprepare(). That call happens after 
ufshcd_scsi_unblock_requests() has been called so how can it be involved 
in a deadlock?

Additionally, the ufshcd_scsi_block_requests() and 
ufshcd_scsi_unblock_requests() calls can be removed from 
ufshcd_err_handling_prepare() and ufshcd_err_handling_unprepare(). These 
calls are no longer necessary since patch "scsi: ufs: Synchronize SCSI 
and UFS error handling".

Thanks,

Bart.


* Re: [PATCH V2 1/3] scsi: ufs: Fix error handler clear ua deadlock
  2021-09-03 20:29   ` Bart Van Assche
@ 2021-09-05  9:51     ` Adrian Hunter
  2021-09-07  0:37       ` Bart Van Assche
  0 siblings, 1 reply; 8+ messages in thread
From: Adrian Hunter @ 2021-09-05  9:51 UTC
  To: Bart Van Assche, Martin K . Petersen
  Cc: James E . J . Bottomley, Bean Huo, Avri Altman, Alim Akhtar,
	Can Guo, Asutosh Das, Manivannan Sadhasivam, Wei Li, linux-scsi

On 3/09/21 11:29 pm, Bart Van Assche wrote:
> On 9/3/21 2:56 AM, Adrian Hunter wrote:
>> There is no guarantee to be able to enter the queue if requests are
>> blocked. That is because freezing the queue will block entry to the
>> queue, but freezing also waits for outstanding requests which can make
>> no progress while the queue is blocked.
>>
>> That situation can happen when the error handler issues requests to
>> clear unit attention condition. The deadlock is very unlikely, and the
>> error handler can be expected to clear ua at some point anyway, so the
>> simple solution is not to wait to enter the queue.
>>
>> Additionally, note that the RPMB queue might not be entered because
>> it is runtime suspended, but in that case ua will be cleared at RPMB
>> runtime resume.
> 
> The only ufshcd_clear_ua_wluns() call that I am aware of and that is related to error handling is the call in ufshcd_err_handling_unprepare(). That call happens after ufshcd_scsi_unblock_requests() has been called so how can it be involved in a deadlock?

That is a very good question.  I went back to reproduce the deadlock again, and it is because, in addition, ufshcd_state is UFSHCD_STATE_EH_SCHEDULED_FATAL.  So I have updated the commit message accordingly in V3.

> 
> Additionally, the ufshcd_scsi_block_requests() and ufshcd_scsi_unblock_requests() calls can be removed from ufshcd_err_handling_prepare() and ufshcd_err_handling_unprepare(). These calls are no longer necessary since patch "scsi: ufs: Synchronize SCSI and UFS error handling".

As has been noted, that commit introduces several new deadlocks - and will presumably cause the deadlock this patch addresses, even if ufshcd_state is not UFSHCD_STATE_EH_SCHEDULED_FATAL.

It is perhaps more appropriate to revert "scsi: ufs: Synchronize SCSI and UFS error handling" for v5.15 and try to get things sorted out for v5.16.  What do you think?


* Re: [PATCH V2 1/3] scsi: ufs: Fix error handler clear ua deadlock
  2021-09-05  9:51     ` Adrian Hunter
@ 2021-09-07  0:37       ` Bart Van Assche
  2021-09-07 11:06         ` Adrian Hunter
  0 siblings, 1 reply; 8+ messages in thread
From: Bart Van Assche @ 2021-09-07  0:37 UTC
  To: Adrian Hunter, Martin K . Petersen
  Cc: James E . J . Bottomley, Bean Huo, Avri Altman, Alim Akhtar,
	Can Guo, Asutosh Das, Manivannan Sadhasivam, Wei Li, linux-scsi

On 9/5/21 02:51, Adrian Hunter wrote:
> On 3/09/21 11:29 pm, Bart Van Assche wrote:
>> On 9/3/21 2:56 AM, Adrian Hunter wrote:
>>> There is no guarantee to be able to enter the queue if requests
>>> are blocked. That is because freezing the queue will block entry
>>> to the queue, but freezing also waits for outstanding requests
>>> which can make no progress while the queue is blocked.
>>> 
>>> That situation can happen when the error handler issues requests
>>> to clear unit attention condition. The deadlock is very unlikely,
>>> and the error handler can be expected to clear ua at some point
>>> anyway, so the simple solution is not to wait to enter the
>>> queue.
>>> 
>>> Additionally, note that the RPMB queue might not be entered
>>> because it is runtime suspended, but in that case ua will be
>>> cleared at RPMB runtime resume.
>> 
>> The only ufshcd_clear_ua_wluns() call that I am aware of and that
>> is related to error handling is the call in
>> ufshcd_err_handling_unprepare(). That call happens after
>> ufshcd_scsi_unblock_requests() has been called so how can it be
>> involved in a deadlock?
> 
> That is a very good question.  I went back to reproduce the deadlock
> again, and it is because, in addition, ufshcd_state is
> UFSHCD_STATE_EH_SCHEDULED_FATAL.  So I have updated the commit
> message accordingly in V3.
>
>> Additionally, the ufshcd_scsi_block_requests() and
>> ufshcd_scsi_unblock_requests() calls can be removed from
>> ufshcd_err_handling_prepare() and ufshcd_err_handling_unprepare().
>> These calls are no longer necessary since patch "scsi: ufs:
>> Synchronize SCSI and UFS error handling".
> 
> As has been noted, that commit introduces several new deadlocks - and
> will presumably cause the deadlock this patch addresses, even if
> ufshcd_state is not UFSHCD_STATE_EH_SCHEDULED_FATAL.
> 
> It is perhaps more appropriate to revert "scsi: ufs: Synchronize SCSI
> and UFS error handling" for v5.15 and try to get things sorted out
> for v5.16.  What do you think?

Reverting that patch would be a step backwards because it would make it 
again possible that the SCSI EH and UFS EH run concurrently and obstruct 
each other.

Does the above mean that "if (hba->pm_op_in_progress)" should be removed 
from the following code in ufshcd_queuecommand()?

	case UFSHCD_STATE_EH_SCHEDULED_FATAL:
		if (hba->pm_op_in_progress) {
			hba->force_reset = true;
			set_host_byte(cmd, DID_BAD_TARGET);
			cmd->scsi_done(cmd);
			goto out;
		}

Thanks,

Bart.


* Re: [PATCH V2 1/3] scsi: ufs: Fix error handler clear ua deadlock
  2021-09-07  0:37       ` Bart Van Assche
@ 2021-09-07 11:06         ` Adrian Hunter
  0 siblings, 0 replies; 8+ messages in thread
From: Adrian Hunter @ 2021-09-07 11:06 UTC
  To: Bart Van Assche, Martin K . Petersen
  Cc: James E . J . Bottomley, Bean Huo, Avri Altman, Alim Akhtar,
	Can Guo, Asutosh Das, Manivannan Sadhasivam, Wei Li, linux-scsi

On 7/09/21 3:37 am, Bart Van Assche wrote:
> On 9/5/21 02:51, Adrian Hunter wrote:
>> On 3/09/21 11:29 pm, Bart Van Assche wrote:
>>> On 9/3/21 2:56 AM, Adrian Hunter wrote:
>>>> There is no guarantee to be able to enter the queue if requests
>>>> are blocked. That is because freezing the queue will block entry
>>>> to the queue, but freezing also waits for outstanding requests
>>>> which can make no progress while the queue is blocked.
>>>>
>>>> That situation can happen when the error handler issues requests
>>>> to clear unit attention condition. The deadlock is very unlikely,
>>>> and the error handler can be expected to clear ua at some point
>>>> anyway, so the simple solution is not to wait to enter the
>>>> queue.
>>>>
>>>> Additionally, note that the RPMB queue might not be entered
>>>> because it is runtime suspended, but in that case ua will be
>>>> cleared at RPMB runtime resume.
>>>
>>> The only ufshcd_clear_ua_wluns() call that I am aware of and that
>>> is related to error handling is the call in
>>> ufshcd_err_handling_unprepare(). That call happens after
>>> ufshcd_scsi_unblock_requests() has been called so how can it be
>>> involved in a deadlock?
>>
>> That is a very good question.  I went back to reproduce the deadlock
>> again, and it is because, in addition, ufshcd_state is
>> UFSHCD_STATE_EH_SCHEDULED_FATAL.  So I have updated the commit
>> message accordingly in V3.
>>
>>> Additionally, the ufshcd_scsi_block_requests() and
>>> ufshcd_scsi_unblock_requests() calls can be removed from
>>> ufshcd_err_handling_prepare() and ufshcd_err_handling_unprepare().
>>> These calls are no longer necessary since patch "scsi: ufs:
>>> Synchronize SCSI and UFS error handling".
>>
>> As has been noted, that commit introduces several new deadlocks - and
>> will presumably cause the deadlock this patch addresses, even if
>> ufshcd_state is not UFSHCD_STATE_EH_SCHEDULED_FATAL.
>>
>> It is perhaps more appropriate to revert "scsi: ufs: Synchronize SCSI
>> and UFS error handling" for v5.15 and try to get things sorted out
>> for v5.16.  What do you think?
> 
> Reverting that patch would be a step backwards because it would make it again possible that the SCSI EH and UFS EH run concurrently and obstruct each other.

I wouldn't say it is a step backwards, just a step forwards the driver is not ready for.

For me, the change causes deadlocks so it is a regression.

I have never seen SCSI EH cause a problem, but AFAICT it is not needed because the UFS driver's error handler is always scheduled when needed.

As a temporary workaround until the driver is ready for SCSI EH, interference between SCSI EH and UFS EH could presumably be avoided by setting eh_strategy_handler to an empty function.

> 
> Does the above mean that "if (hba->pm_op_in_progress)" should be removed from the following code in ufshcd_queuecommand()?
> 
>     case UFSHCD_STATE_EH_SCHEDULED_FATAL:
>         if (hba->pm_op_in_progress) {
>             hba->force_reset = true;
>             set_host_byte(cmd, DID_BAD_TARGET);
>             cmd->scsi_done(cmd);
>             goto out;
>         }

It seems to me that removing "if (hba->pm_op_in_progress)" would cause errors for requests that had not in fact even been issued.

