All of lore.kernel.org
 help / color / mirror / Atom feed
From: Adrian Hunter <adrian.hunter@intel.com>
To: "Asutosh Das (asd)" <asutoshd@codeaurora.org>,
	cang@codeaurora.org, martin.petersen@oracle.com,
	linux-scsi@vger.kernel.org
Cc: linux-arm-msm@vger.kernel.org,
	Alim Akhtar <alim.akhtar@samsung.com>,
	Avri Altman <avri.altman@wdc.com>,
	"James E.J. Bottomley" <jejb@linux.ibm.com>,
	Krzysztof Kozlowski <krzk@kernel.org>,
	Stanley Chu <stanley.chu@mediatek.com>,
	Andy Gross <agross@kernel.org>,
	Bjorn Andersson <bjorn.andersson@linaro.org>,
	Steven Rostedt <rostedt@goodmis.org>,
	Ingo Molnar <mingo@redhat.com>,
	Matthias Brugger <matthias.bgg@gmail.com>,
	Lee Jones <lee.jones@linaro.org>, Bean Huo <beanhuo@micron.com>,
	Kiwoong Kim <kwmad.kim@samsung.com>,
	Colin Ian King <colin.king@canonical.com>,
	Wei Yongjun <weiyongjun1@huawei.com>, Yue Hu <huyue2@yulong.com>,
	Bart van Assche <bvanassche@acm.org>,
	"Gustavo A. R. Silva" <gustavoars@kernel.org>,
	Dinghao Liu <dinghao.liu@zju.edu.cn>,
	Jaegeuk Kim <jaegeuk@kernel.org>,
	Satya Tangirala <satyat@google.com>,
	open list <linux-kernel@vger.kernel.org>,
	"moderated list:ARM/SAMSUNG S3C,
	S5P AND EXYNOS ARM ARCHITECTURES" 
	<linux-arm-kernel@lists.infradead.org>,
	"open list:ARM/SAMSUNG S3C,
	S5P AND EXYNOS ARM ARCHITECTURES" 
	<linux-samsung-soc@vger.kernel.org>,
	"moderated list:UNIVERSAL FLASH STORAGE HOST CONTROLLER
	DRIVER..."  <linux-mediatek@lists.infradead.org>
Subject: Re: [PATCH v20 1/2] scsi: ufs: Enable power management for wlun
Date: Fri, 23 Apr 2021 11:01:40 +0300	[thread overview]
Message-ID: <973e0bbb-ac2d-7196-2e25-37aee2b77b46@intel.com> (raw)
In-Reply-To: <e6f3946a-dbe7-6b42-e43c-d3f8d705c732@intel.com>

On 23/04/21 9:18 am, Adrian Hunter wrote:
> On 23/04/21 7:23 am, Adrian Hunter wrote:
>> On 22/04/21 7:38 pm, Asutosh Das (asd) wrote:
>>> On 4/20/2021 12:42 AM, Adrian Hunter wrote:
>>>> On 20/04/21 7:15 am, Adrian Hunter wrote:
>>>>> On 20/04/21 12:53 am, Asutosh Das (asd) wrote:
>>>>>> On 4/19/2021 11:37 AM, Adrian Hunter wrote:
>>>>>>> On 16/04/21 10:49 pm, Asutosh Das wrote:
>>>>>>>>
>>>>>>>> Co-developed-by: Can Guo <cang@codeaurora.org>
>>>>>>>> Signed-off-by: Can Guo <cang@codeaurora.org>
>>>>>>>> Signed-off-by: Asutosh Das <asutoshd@codeaurora.org>
>>>>>>>> ---
>>>>>>>
>>>>>>> I came across 3 issues while testing.  See comments below.
>>>>>>>
>>>>>> Hi Adrian
>>>>>> Thanks for the comments.
>>>>>>> <SNIP>
>>>>>>>
>>>>>>>> @@ -5794,7 +5839,7 @@ static void ufshcd_err_handling_unprepare(struct ufs_hba *hba)
>>>>>>>>        if (ufshcd_is_clkscaling_supported(hba))
>>>>>>>>            ufshcd_clk_scaling_suspend(hba, false);
>>>>>>>>        ufshcd_clear_ua_wluns(hba);
>>>>>>>
>>>>>>> ufshcd_clear_ua_wluns() deadlocks trying to clear UFS_UPIU_RPMB_WLUN
>>>>>>> if sdev_rpmb is suspended and sdev_ufs_device is suspending.
>>>>>>> e.g. ufshcd_wl_suspend() is waiting on host_sem while ufshcd_err_handler()
>>>>>>> is running, at which point sdev_rpmb has already suspended.
>>>>>>>
>>>>>> Umm, I didn't understand this deadlock.
>>>>>> When you say, sdev_rpmb is suspended, does it mean runtime_suspended?
>>>>>> sdev_ufs_device is suspending - this can't be runtime_suspending, while ufshcd_err_handling_unprepare is running.
>>>>>>
>>>>>> If you've a call-stack of this deadlock, please can you share it with me. I'll also try to reproduce this.
>>>>>
>>>>> Yes it is system suspend. sdev_rpmb has suspended, sdev_ufs_device is waiting on host_sem.
>>>>> ufshcd_err_handler() holds host_sem. ufshcd_clear_ua_wlun(UFS_UPIU_RPMB_WLUN) gets stuck.
>>>>> I will get some call-stacks.
>>>>
>>> Hi Adrian,
>>>
>>> Thanks for the call stacks.
>>> From the current information, I can't say for sure why it'd get stuck in blk_queue_enter().
>>
>> I presume SCSI is leaving the RPMB WLUN device runtime suspended and consequently the queue status is RPM_SUSPENDED
>>
>>>
>>> I tried reproducing this issue on my setup yesterday but couldn't.
>>> Here's what I did:
>>> 1. sdev_rpmb is RPM_SUSPENDED, checked before initiating system suspend
>>> 2. sdev_ufs_device is RPM_RESUMED
>>> 3. I triggered system suspend (echo mem > /sys/power/state) and scheduled the error handler from ufshcd_wl_suspend().
>>> 4. Waited until error handler ran and then ufshcd_wl_suspend() blocks on host_sem.
>>> 5. The ufshcd_clear_wa_wlun(UFS_UPIU_RPMB_WLUN) went through fine.
>>>
>>> Do you've some specific steps to reproduce this or a script, perhaps? If so, please can you share it with me. I will try again.
>>
>> I was using a device that gives occasional errors, but I will what see I can do.
> 
> I have attached 2 patches to reproduce the issue, and the test script I used.
> Note it is using down_timeout() so it doesn't lock up completely.
> The kernel messages are:
> 
> [   79.385456] ufshcd_wl_suspend: doing ufshcd_schedule_eh_work
> [   79.385511] ufshcd_wl_suspend: sleeping 1000 ms
> [   79.386916] ufshcd_err_handler: start
> [   79.386979] ufshcd_err_handler: got sem
> [   79.387053] ufshcd_err_handling_unprepare: start
> [   79.387302] ufshcd_clear_ua_wluns: before ufshcd_clear_ua_wlun(hba, UFS_UPIU_RPMB_WLUN)
> [   80.435878] ufshcd_wl_suspend: trying to get sem
> [   85.683876] ufshcd_wl_suspend: failed to get sem
> [   85.683993] ufs_device_wlun 0:0:0:49488: PM: failed to suspend async: error -22
> [   85.686901] ufshcd_clear_ua_wluns: after ufshcd_clear_ua_wlun(hba, UFS_UPIU_RPMB_WLUN), ret 0
> [   85.687025] ufshcd_err_handling_unprepare: finish
> [   85.687090] ufshcd_err_handler: finish 
> 
> 

I think we also need to runtime resume RPMB WLUN before system suspend.
e.g.

+static int ufshcd_rpmb_rpm_get_sync(struct ufs_hba *hba)
+{
+	return pm_runtime_get_sync(&hba->sdev_rpmb->sdev_gendev);
+}
+
+static int ufshcd_rpmb_rpm_put(struct ufs_hba *hba)
+{
+	return pm_runtime_put(&hba->sdev_rpmb->sdev_gendev);
+}
+
 void ufshcd_resume_complete(struct device *dev)
 {
 	struct ufs_hba *hba = dev_get_drvdata(dev);
 
+	if (hba->rpmb_complete_put) {
+		hba->rpmb_complete_put = false;
+		ufshcd_rpmb_rpm_put(hba);
+	}
 	if (hba->complete_put) {
 		hba->complete_put = false;
 		ufshcd_rpm_put(hba);
@@ -9611,6 +9625,11 @@ int ufshcd_suspend_prepare(struct device *dev)
 		return ret;
 	}
 	hba->complete_put = true;
+
+	if (hba->sdev_rpmb) {
+		ufshcd_rpmb_rpm_get_sync(hba);
+		hba->rpmb_complete_put = true;
+	}
 	return 0;
 }

That also avoids another issue: if RPMB WLUN is runtime suspended at system resume, we have to skip clearing UAC, but SCSI PM will force the runtime status to RPM_ACTIVE after system resume, so the UAC never gets cleared in that case.

Furthermore, it seems better not to report errors from RPMB resume and instead let the error handler sort it out.
So, with the above change, we can simplify a bit:

-static int ufshcd_rpmb_runtime_resume(struct device *dev)
-{
-	struct ufs_hba *hba = wlun_dev_to_hba(dev);
-
-	if (hba->sdev_rpmb)
-		return ufshcd_clear_rpmb_uac(hba);
-	return 0;
-}
-
 static int ufshcd_rpmb_resume(struct device *dev)
 {
 	struct ufs_hba *hba = wlun_dev_to_hba(dev);
 
-	if (hba->sdev_rpmb && !pm_runtime_suspended(dev))
-		return ufshcd_clear_rpmb_uac(hba);
+	if (hba->sdev_rpmb)
+		ufshcd_clear_rpmb_uac(hba);
 	return 0;
 }
 
 static const struct dev_pm_ops ufs_rpmb_pm_ops = {
-	SET_RUNTIME_PM_OPS(NULL, ufshcd_rpmb_runtime_resume, NULL)
+	SET_RUNTIME_PM_OPS(NULL, ufshcd_rpmb_resume, NULL)
 	SET_SYSTEM_SLEEP_PM_OPS(NULL, ufshcd_rpmb_resume)
 };



WARNING: multiple messages have this Message-ID
From: Adrian Hunter <adrian.hunter@intel.com>
To: "Asutosh Das (asd)" <asutoshd@codeaurora.org>,
	cang@codeaurora.org, martin.petersen@oracle.com,
	linux-scsi@vger.kernel.org
Cc: linux-arm-msm@vger.kernel.org,
	Alim Akhtar <alim.akhtar@samsung.com>,
	Avri Altman <avri.altman@wdc.com>,
	"James E.J. Bottomley" <jejb@linux.ibm.com>,
	Krzysztof Kozlowski <krzk@kernel.org>,
	Stanley Chu <stanley.chu@mediatek.com>,
	Andy Gross <agross@kernel.org>,
	Bjorn Andersson <bjorn.andersson@linaro.org>,
	Steven Rostedt <rostedt@goodmis.org>,
	Ingo Molnar <mingo@redhat.com>,
	Matthias Brugger <matthias.bgg@gmail.com>,
	Lee Jones <lee.jones@linaro.org>, Bean Huo <beanhuo@micron.com>,
	Kiwoong Kim <kwmad.kim@samsung.com>,
	Colin Ian King <colin.king@canonical.com>,
	Wei Yongjun <weiyongjun1@huawei.com>, Yue Hu <huyue2@yulong.com>,
	Bart van Assche <bvanassche@acm.org>,
	"Gustavo A. R. Silva" <gustavoars@kernel.org>,
	Dinghao Liu <dinghao.liu@zju.edu.cn>,
	Jaegeuk Kim <jaegeuk@kernel.org>,
	Satya Tangirala <satyat@google.com>,
	open list <linux-kernel@vger.kernel.org>,
	"moderated list:ARM/SAMSUNG S3C,
	S5P AND EXYNOS ARM ARCHITECTURES"
	<linux-arm-kernel@lists.infradead.org>,
	"open list:ARM/SAMSUNG S3C,
	S5P AND EXYNOS ARM ARCHITECTURES"
	<linux-samsung-soc@vger.kernel.org>,
	"moderated list:UNIVERSAL FLASH STORAGE HOST CONTROLLER
	DRIVER..." <linux-mediatek@lists.infradead.org>
Subject: Re: [PATCH v20 1/2] scsi: ufs: Enable power management for wlun
Date: Fri, 23 Apr 2021 11:01:40 +0300	[thread overview]
Message-ID: <973e0bbb-ac2d-7196-2e25-37aee2b77b46@intel.com> (raw)
In-Reply-To: <e6f3946a-dbe7-6b42-e43c-d3f8d705c732@intel.com>

On 23/04/21 9:18 am, Adrian Hunter wrote:
> On 23/04/21 7:23 am, Adrian Hunter wrote:
>> On 22/04/21 7:38 pm, Asutosh Das (asd) wrote:
>>> On 4/20/2021 12:42 AM, Adrian Hunter wrote:
>>>> On 20/04/21 7:15 am, Adrian Hunter wrote:
>>>>> On 20/04/21 12:53 am, Asutosh Das (asd) wrote:
>>>>>> On 4/19/2021 11:37 AM, Adrian Hunter wrote:
>>>>>>> On 16/04/21 10:49 pm, Asutosh Das wrote:
>>>>>>>>
>>>>>>>> Co-developed-by: Can Guo <cang@codeaurora.org>
>>>>>>>> Signed-off-by: Can Guo <cang@codeaurora.org>
>>>>>>>> Signed-off-by: Asutosh Das <asutoshd@codeaurora.org>
>>>>>>>> ---
>>>>>>>
>>>>>>> I came across 3 issues while testing.  See comments below.
>>>>>>>
>>>>>> Hi Adrian
>>>>>> Thanks for the comments.
>>>>>>> <SNIP>
>>>>>>>
>>>>>>>> @@ -5794,7 +5839,7 @@ static void ufshcd_err_handling_unprepare(struct ufs_hba *hba)
>>>>>>>>        if (ufshcd_is_clkscaling_supported(hba))
>>>>>>>>            ufshcd_clk_scaling_suspend(hba, false);
>>>>>>>>        ufshcd_clear_ua_wluns(hba);
>>>>>>>
>>>>>>> ufshcd_clear_ua_wluns() deadlocks trying to clear UFS_UPIU_RPMB_WLUN
>>>>>>> if sdev_rpmb is suspended and sdev_ufs_device is suspending.
>>>>>>> e.g. ufshcd_wl_suspend() is waiting on host_sem while ufshcd_err_handler()
>>>>>>> is running, at which point sdev_rpmb has already suspended.
>>>>>>>
>>>>>> Umm, I didn't understand this deadlock.
>>>>>> When you say, sdev_rpmb is suspended, does it mean runtime_suspended?
>>>>>> sdev_ufs_device is suspending - this can't be runtime_suspending, while ufshcd_err_handling_unprepare is running.
>>>>>>
>>>>>> If you've a call-stack of this deadlock, please can you share it with me. I'll also try to reproduce this.
>>>>>
>>>>> Yes it is system suspend. sdev_rpmb has suspended, sdev_ufs_device is waiting on host_sem.
>>>>> ufshcd_err_handler() holds host_sem. ufshcd_clear_ua_wlun(UFS_UPIU_RPMB_WLUN) gets stuck.
>>>>> I will get some call-stacks.
>>>>
>>> Hi Adrian,
>>>
>>> Thanks for the call stacks.
>>> From the current information, I can't say for sure why it'd get stuck in blk_queue_enter().
>>
>> I presume SCSI is leaving the RPMB WLUN device runtime suspended and consequently the queue status is RPM_SUSPENDED
>>
>>>
>>> I tried reproducing this issue on my setup yesterday but couldn't.
>>> Here's what I did:
>>> 1. sdev_rpmb is RPM_SUSPENDED, checked before initiating system suspend
>>> 2. sdev_ufs_device is RPM_RESUMED
>>> 3. I triggered system suspend (echo mem > /sys/power/state) and scheduled the error handler from ufshcd_wl_suspend().
>>> 4. Waited until error handler ran and then ufshcd_wl_suspend() blocks on host_sem.
>>> 5. The ufshcd_clear_wa_wlun(UFS_UPIU_RPMB_WLUN) went through fine.
>>>
>>> Do you've some specific steps to reproduce this or a script, perhaps? If so, please can you share it with me. I will try again.
>>
>> I was using a device that gives occasional errors, but I will what see I can do.
> 
> I have attached 2 patches to reproduce the issue, and the test script I used.
> Note it is using down_timeout() so it doesn't lock up completely.
> The kernel messages are:
> 
> [   79.385456] ufshcd_wl_suspend: doing ufshcd_schedule_eh_work
> [   79.385511] ufshcd_wl_suspend: sleeping 1000 ms
> [   79.386916] ufshcd_err_handler: start
> [   79.386979] ufshcd_err_handler: got sem
> [   79.387053] ufshcd_err_handling_unprepare: start
> [   79.387302] ufshcd_clear_ua_wluns: before ufshcd_clear_ua_wlun(hba, UFS_UPIU_RPMB_WLUN)
> [   80.435878] ufshcd_wl_suspend: trying to get sem
> [   85.683876] ufshcd_wl_suspend: failed to get sem
> [   85.683993] ufs_device_wlun 0:0:0:49488: PM: failed to suspend async: error -22
> [   85.686901] ufshcd_clear_ua_wluns: after ufshcd_clear_ua_wlun(hba, UFS_UPIU_RPMB_WLUN), ret 0
> [   85.687025] ufshcd_err_handling_unprepare: finish
> [   85.687090] ufshcd_err_handler: finish 
> 
> 

I think we also need to runtime resume RPMB WLUN before system suspend.
e.g.

+static int ufshcd_rpmb_rpm_get_sync(struct ufs_hba *hba)
+{
+	return pm_runtime_get_sync(&hba->sdev_rpmb->sdev_gendev);
+}
+
+static int ufshcd_rpmb_rpm_put(struct ufs_hba *hba)
+{
+	return pm_runtime_put(&hba->sdev_rpmb->sdev_gendev);
+}
+
 void ufshcd_resume_complete(struct device *dev)
 {
 	struct ufs_hba *hba = dev_get_drvdata(dev);
 
+	if (hba->rpmb_complete_put) {
+		hba->rpmb_complete_put = false;
+		ufshcd_rpmb_rpm_put(hba);
+	}
 	if (hba->complete_put) {
 		hba->complete_put = false;
 		ufshcd_rpm_put(hba);
@@ -9611,6 +9625,11 @@ int ufshcd_suspend_prepare(struct device *dev)
 		return ret;
 	}
 	hba->complete_put = true;
+
+	if (hba->sdev_rpmb) {
+		ufshcd_rpmb_rpm_get_sync(hba);
+		hba->rpmb_complete_put = true;
+	}
 	return 0;
 }

That also avoids another issue: if RPMB WLUN is runtime suspended at system resume, we have to skip clearing UAC, but SCSI PM will force the runtime status to RPM_ACTIVE after system resume, so the UAC never gets cleared in that case.

Furthermore, it seems better not to report errors from RPMB resume and instead let the error handler sort it out.
So, with the above change, we can simplify a bit:

-static int ufshcd_rpmb_runtime_resume(struct device *dev)
-{
-	struct ufs_hba *hba = wlun_dev_to_hba(dev);
-
-	if (hba->sdev_rpmb)
-		return ufshcd_clear_rpmb_uac(hba);
-	return 0;
-}
-
 static int ufshcd_rpmb_resume(struct device *dev)
 {
 	struct ufs_hba *hba = wlun_dev_to_hba(dev);
 
-	if (hba->sdev_rpmb && !pm_runtime_suspended(dev))
-		return ufshcd_clear_rpmb_uac(hba);
+	if (hba->sdev_rpmb)
+		ufshcd_clear_rpmb_uac(hba);
 	return 0;
 }
 
 static const struct dev_pm_ops ufs_rpmb_pm_ops = {
-	SET_RUNTIME_PM_OPS(NULL, ufshcd_rpmb_runtime_resume, NULL)
+	SET_RUNTIME_PM_OPS(NULL, ufshcd_rpmb_resume, NULL)
 	SET_SYSTEM_SLEEP_PM_OPS(NULL, ufshcd_rpmb_resume)
 };



_______________________________________________
Linux-mediatek mailing list
Linux-mediatek@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-mediatek

WARNING: multiple messages have this Message-ID
From: Adrian Hunter <adrian.hunter@intel.com>
To: "Asutosh Das (asd)" <asutoshd@codeaurora.org>,
	cang@codeaurora.org, martin.petersen@oracle.com,
	linux-scsi@vger.kernel.org
Cc: linux-arm-msm@vger.kernel.org,
	Alim Akhtar <alim.akhtar@samsung.com>,
	Avri Altman <avri.altman@wdc.com>,
	"James E.J. Bottomley" <jejb@linux.ibm.com>,
	Krzysztof Kozlowski <krzk@kernel.org>,
	Stanley Chu <stanley.chu@mediatek.com>,
	Andy Gross <agross@kernel.org>,
	Bjorn Andersson <bjorn.andersson@linaro.org>,
	Steven Rostedt <rostedt@goodmis.org>,
	Ingo Molnar <mingo@redhat.com>,
	Matthias Brugger <matthias.bgg@gmail.com>,
	Lee Jones <lee.jones@linaro.org>, Bean Huo <beanhuo@micron.com>,
	Kiwoong Kim <kwmad.kim@samsung.com>,
	Colin Ian King <colin.king@canonical.com>,
	Wei Yongjun <weiyongjun1@huawei.com>, Yue Hu <huyue2@yulong.com>,
	Bart van Assche <bvanassche@acm.org>,
	"Gustavo A. R. Silva" <gustavoars@kernel.org>,
	Dinghao Liu <dinghao.liu@zju.edu.cn>,
	Jaegeuk Kim <jaegeuk@kernel.org>,
	Satya Tangirala <satyat@google.com>,
	open list <linux-kernel@vger.kernel.org>,
	"moderated list:ARM/SAMSUNG S3C,
	S5P AND EXYNOS ARM ARCHITECTURES"
	<linux-arm-kernel@lists.infradead.org>,
	"open list:ARM/SAMSUNG S3C,
	S5P AND EXYNOS ARM ARCHITECTURES"
	<linux-samsung-soc@vger.kernel.org>,
	"moderated list:UNIVERSAL FLASH STORAGE HOST CONTROLLER
	DRIVER..." <linux-mediatek@lists.infradead.org>
Subject: Re: [PATCH v20 1/2] scsi: ufs: Enable power management for wlun
Date: Fri, 23 Apr 2021 11:01:40 +0300	[thread overview]
Message-ID: <973e0bbb-ac2d-7196-2e25-37aee2b77b46@intel.com> (raw)
In-Reply-To: <e6f3946a-dbe7-6b42-e43c-d3f8d705c732@intel.com>

On 23/04/21 9:18 am, Adrian Hunter wrote:
> On 23/04/21 7:23 am, Adrian Hunter wrote:
>> On 22/04/21 7:38 pm, Asutosh Das (asd) wrote:
>>> On 4/20/2021 12:42 AM, Adrian Hunter wrote:
>>>> On 20/04/21 7:15 am, Adrian Hunter wrote:
>>>>> On 20/04/21 12:53 am, Asutosh Das (asd) wrote:
>>>>>> On 4/19/2021 11:37 AM, Adrian Hunter wrote:
>>>>>>> On 16/04/21 10:49 pm, Asutosh Das wrote:
>>>>>>>>
>>>>>>>> Co-developed-by: Can Guo <cang@codeaurora.org>
>>>>>>>> Signed-off-by: Can Guo <cang@codeaurora.org>
>>>>>>>> Signed-off-by: Asutosh Das <asutoshd@codeaurora.org>
>>>>>>>> ---
>>>>>>>
>>>>>>> I came across 3 issues while testing.  See comments below.
>>>>>>>
>>>>>> Hi Adrian
>>>>>> Thanks for the comments.
>>>>>>> <SNIP>
>>>>>>>
>>>>>>>> @@ -5794,7 +5839,7 @@ static void ufshcd_err_handling_unprepare(struct ufs_hba *hba)
>>>>>>>>        if (ufshcd_is_clkscaling_supported(hba))
>>>>>>>>            ufshcd_clk_scaling_suspend(hba, false);
>>>>>>>>        ufshcd_clear_ua_wluns(hba);
>>>>>>>
>>>>>>> ufshcd_clear_ua_wluns() deadlocks trying to clear UFS_UPIU_RPMB_WLUN
>>>>>>> if sdev_rpmb is suspended and sdev_ufs_device is suspending.
>>>>>>> e.g. ufshcd_wl_suspend() is waiting on host_sem while ufshcd_err_handler()
>>>>>>> is running, at which point sdev_rpmb has already suspended.
>>>>>>>
>>>>>> Umm, I didn't understand this deadlock.
>>>>>> When you say, sdev_rpmb is suspended, does it mean runtime_suspended?
>>>>>> sdev_ufs_device is suspending - this can't be runtime_suspending, while ufshcd_err_handling_unprepare is running.
>>>>>>
>>>>>> If you've a call-stack of this deadlock, please can you share it with me. I'll also try to reproduce this.
>>>>>
>>>>> Yes it is system suspend. sdev_rpmb has suspended, sdev_ufs_device is waiting on host_sem.
>>>>> ufshcd_err_handler() holds host_sem. ufshcd_clear_ua_wlun(UFS_UPIU_RPMB_WLUN) gets stuck.
>>>>> I will get some call-stacks.
>>>>
>>> Hi Adrian,
>>>
>>> Thanks for the call stacks.
>>> From the current information, I can't say for sure why it'd get stuck in blk_queue_enter().
>>
>> I presume SCSI is leaving the RPMB WLUN device runtime suspended and consequently the queue status is RPM_SUSPENDED
>>
>>>
>>> I tried reproducing this issue on my setup yesterday but couldn't.
>>> Here's what I did:
>>> 1. sdev_rpmb is RPM_SUSPENDED, checked before initiating system suspend
>>> 2. sdev_ufs_device is RPM_RESUMED
>>> 3. I triggered system suspend (echo mem > /sys/power/state) and scheduled the error handler from ufshcd_wl_suspend().
>>> 4. Waited until error handler ran and then ufshcd_wl_suspend() blocks on host_sem.
>>> 5. The ufshcd_clear_wa_wlun(UFS_UPIU_RPMB_WLUN) went through fine.
>>>
>>> Do you've some specific steps to reproduce this or a script, perhaps? If so, please can you share it with me. I will try again.
>>
>> I was using a device that gives occasional errors, but I will what see I can do.
> 
> I have attached 2 patches to reproduce the issue, and the test script I used.
> Note it is using down_timeout() so it doesn't lock up completely.
> The kernel messages are:
> 
> [   79.385456] ufshcd_wl_suspend: doing ufshcd_schedule_eh_work
> [   79.385511] ufshcd_wl_suspend: sleeping 1000 ms
> [   79.386916] ufshcd_err_handler: start
> [   79.386979] ufshcd_err_handler: got sem
> [   79.387053] ufshcd_err_handling_unprepare: start
> [   79.387302] ufshcd_clear_ua_wluns: before ufshcd_clear_ua_wlun(hba, UFS_UPIU_RPMB_WLUN)
> [   80.435878] ufshcd_wl_suspend: trying to get sem
> [   85.683876] ufshcd_wl_suspend: failed to get sem
> [   85.683993] ufs_device_wlun 0:0:0:49488: PM: failed to suspend async: error -22
> [   85.686901] ufshcd_clear_ua_wluns: after ufshcd_clear_ua_wlun(hba, UFS_UPIU_RPMB_WLUN), ret 0
> [   85.687025] ufshcd_err_handling_unprepare: finish
> [   85.687090] ufshcd_err_handler: finish 
> 
> 

I think we also need to runtime resume RPMB WLUN before system suspend.
e.g.

+static int ufshcd_rpmb_rpm_get_sync(struct ufs_hba *hba)
+{
+	return pm_runtime_get_sync(&hba->sdev_rpmb->sdev_gendev);
+}
+
+static int ufshcd_rpmb_rpm_put(struct ufs_hba *hba)
+{
+	return pm_runtime_put(&hba->sdev_rpmb->sdev_gendev);
+}
+
 void ufshcd_resume_complete(struct device *dev)
 {
 	struct ufs_hba *hba = dev_get_drvdata(dev);
 
+	if (hba->rpmb_complete_put) {
+		hba->rpmb_complete_put = false;
+		ufshcd_rpmb_rpm_put(hba);
+	}
 	if (hba->complete_put) {
 		hba->complete_put = false;
 		ufshcd_rpm_put(hba);
@@ -9611,6 +9625,11 @@ int ufshcd_suspend_prepare(struct device *dev)
 		return ret;
 	}
 	hba->complete_put = true;
+
+	if (hba->sdev_rpmb) {
+		ufshcd_rpmb_rpm_get_sync(hba);
+		hba->rpmb_complete_put = true;
+	}
 	return 0;
 }

That also avoids another issue: if RPMB WLUN is runtime suspended at system resume, we have to skip clearing UAC, but SCSI PM will force the runtime status to RPM_ACTIVE after system resume, so the UAC never gets cleared in that case.

Furthermore, it seems better not to report errors from RPMB resume and instead let the error handler sort it out.
So, with the above change, we can simplify a bit:

-static int ufshcd_rpmb_runtime_resume(struct device *dev)
-{
-	struct ufs_hba *hba = wlun_dev_to_hba(dev);
-
-	if (hba->sdev_rpmb)
-		return ufshcd_clear_rpmb_uac(hba);
-	return 0;
-}
-
 static int ufshcd_rpmb_resume(struct device *dev)
 {
 	struct ufs_hba *hba = wlun_dev_to_hba(dev);
 
-	if (hba->sdev_rpmb && !pm_runtime_suspended(dev))
-		return ufshcd_clear_rpmb_uac(hba);
+	if (hba->sdev_rpmb)
+		ufshcd_clear_rpmb_uac(hba);
 	return 0;
 }
 
 static const struct dev_pm_ops ufs_rpmb_pm_ops = {
-	SET_RUNTIME_PM_OPS(NULL, ufshcd_rpmb_runtime_resume, NULL)
+	SET_RUNTIME_PM_OPS(NULL, ufshcd_rpmb_resume, NULL)
 	SET_SYSTEM_SLEEP_PM_OPS(NULL, ufshcd_rpmb_resume)
 };



_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

  reply	other threads:[~2021-04-23  8:01 UTC|newest]

Thread overview: 31+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-04-16 19:49 [PATCH v20 0/2] Enable power management for ufs wlun Asutosh Das
2021-04-16 19:49 ` Asutosh Das
2021-04-16 19:49 ` Asutosh Das
2021-04-16 19:49 ` [PATCH v20 1/2] scsi: ufs: Enable power management for wlun Asutosh Das
2021-04-16 19:49   ` Asutosh Das
2021-04-16 19:49   ` Asutosh Das
2021-04-19 18:37   ` Adrian Hunter
2021-04-19 18:37     ` Adrian Hunter
2021-04-19 18:37     ` Adrian Hunter
2021-04-19 21:53     ` Asutosh Das (asd)
2021-04-19 21:53       ` Asutosh Das (asd)
2021-04-20  4:15       ` Adrian Hunter
2021-04-20  4:15         ` Adrian Hunter
2021-04-20  4:15         ` Adrian Hunter
2021-04-20  7:42         ` Adrian Hunter
2021-04-20  7:42           ` Adrian Hunter
2021-04-20  7:42           ` Adrian Hunter
2021-04-22 16:38           ` Asutosh Das (asd)
2021-04-22 16:38             ` Asutosh Das (asd)
2021-04-23  4:23             ` Adrian Hunter
2021-04-23  4:23               ` Adrian Hunter
2021-04-23  4:23               ` Adrian Hunter
2021-04-23  6:18               ` Adrian Hunter
2021-04-23  6:18                 ` Adrian Hunter
2021-04-23  6:18                 ` Adrian Hunter
2021-04-23  8:01                 ` Adrian Hunter [this message]
2021-04-23  8:01                   ` Adrian Hunter
2021-04-23  8:01                   ` Adrian Hunter
2021-04-23 23:44                   ` Asutosh Das (asd)
2021-04-23 23:44                     ` Asutosh Das (asd)
2021-04-16 19:49 ` [PATCH v20 2/2] ufs: sysfs: Resume the proper scsi device Asutosh Das

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=973e0bbb-ac2d-7196-2e25-37aee2b77b46@intel.com \
    --to=adrian.hunter@intel.com \
    --cc=agross@kernel.org \
    --cc=alim.akhtar@samsung.com \
    --cc=asutoshd@codeaurora.org \
    --cc=avri.altman@wdc.com \
    --cc=beanhuo@micron.com \
    --cc=bjorn.andersson@linaro.org \
    --cc=bvanassche@acm.org \
    --cc=cang@codeaurora.org \
    --cc=colin.king@canonical.com \
    --cc=dinghao.liu@zju.edu.cn \
    --cc=gustavoars@kernel.org \
    --cc=huyue2@yulong.com \
    --cc=jaegeuk@kernel.org \
    --cc=jejb@linux.ibm.com \
    --cc=krzk@kernel.org \
    --cc=kwmad.kim@samsung.com \
    --cc=lee.jones@linaro.org \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-arm-msm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mediatek@lists.infradead.org \
    --cc=linux-samsung-soc@vger.kernel.org \
    --cc=linux-scsi@vger.kernel.org \
    --cc=martin.petersen@oracle.com \
    --cc=matthias.bgg@gmail.com \
    --cc=mingo@redhat.com \
    --cc=rostedt@goodmis.org \
    --cc=satyat@google.com \
    --cc=stanley.chu@mediatek.com \
    --cc=weiyongjun1@huawei.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.