From: Can Guo <cang@codeaurora.org>
To: Bart Van Assche <bvanassche@acm.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>,
asutoshd@codeaurora.org, nguyenb@codeaurora.org,
hongwus@codeaurora.org, ziqichen@codeaurora.org,
linux-scsi@vger.kernel.org, kernel-team@android.com,
Alim Akhtar <alim.akhtar@samsung.com>,
Avri Altman <avri.altman@wdc.com>,
"James E.J. Bottomley" <jejb@linux.ibm.com>,
"Martin K. Petersen" <martin.petersen@oracle.com>,
Stanley Chu <stanley.chu@mediatek.com>,
Bean Huo <beanhuo@micron.com>, Jaegeuk Kim <jaegeuk@kernel.org>,
open list <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH v3 5/9] scsi: ufs: Simplify error handling preparation
Date: Sat, 12 Jun 2021 17:49:26 +0800
Message-ID: <d3c57c8e52f7a251b5c536a893b1f101@codeaurora.org>
In-Reply-To: <645c0e3c83c8917a8fd5c0493c5815a0@codeaurora.org>

Hi Bart,

On 2021-06-12 14:46, Can Guo wrote:
> On 2021-06-12 04:58, Bart Van Assche wrote:
>> On 6/10/21 8:01 PM, Can Guo wrote:
>>> Previously, without commit cb7e6f05fce67c965194ac04467e1ba7bc70b069,
>>> ufshcd_resume() may turn off pwr and clk due to UFS errors, e.g., link
>>> transition failure and SSU error/abort (and these UFS errors would
>>> invoke error handling). When error handling kicks in, it should
>>> re-enable the pwr and clk before proceeding. Now, commit
>>> cb7e6f05fce67c965194ac04467e1ba7bc70b069 makes ufshcd_resume()
>>> purely control pwr and clk, meaning if ufshcd_resume() fails, there
>>> is nothing we can do about it - pwr or clk enabling must have failed,
>>> and it is not because of a UFS error. This is why I am removing the
>>> re-enabling of pwr/clk in error handling prepare.
>>
>> Why are link transition failures handled in the error handler instead
>> of in the context where these errors are detected (ufshcd_resume())? Is
>> it even possible to recover from a link transition failure or does this
>> perhaps indicate a broken UFS controller?
>
> Basically, almost all UFS failures are caused by errors in the underlying
> layers, i.e., UIC errors, including link transition failures. And
> according to the UFSHCI spec, SW should do a full reset to recover from
> them, just as it handles any other fatal UIC error. All UIC errors are
> detected by HW and reported by the IRQ handler.
>
> UFSHCI Spec Ver. 31
> 8.2.7 Hibernate Enter/Exit Error Handling
> Hibernate Enter/Exit Error occurs when the UniPro link is broken. When
> this condition occurs, host software should reset the host controller
> by setting register HCE to ‘0’, re-initialize the host controller by
> setting register HCE to ‘1’, and then start link startup sequence as
> shown in Figure 16.
>
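
The recovery sequence in the quoted section (HCE to 0, HCE to 1, then link startup) can be sketched as a small userspace model. This is only an illustration: struct model_hba and the model_* functions are hypothetical stand-ins invented here, not the kernel's struct ufs_hba or any ufshcd function, and no real register access is performed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a host controller, NOT the kernel's struct ufs_hba. */
struct model_hba {
	uint32_t hce;     /* models the HCE register: 0 = disabled, 1 = enabled */
	bool link_up;     /* models the UniPro link state */
	int reset_count;  /* number of full resets performed so far */
};

/* Models a write to HCE. Disabling the controller also takes the link down. */
void model_hce_write(struct model_hba *hba, uint32_t val)
{
	assert(hba);
	hba->hce = val & 1;
	if (!hba->hce)
		hba->link_up = false;
}

/* Models the link startup sequence, which is only valid while HCE is 1. */
int model_link_startup(struct model_hba *hba)
{
	assert(hba);
	if (!hba->hce)
		return -1;
	hba->link_up = true;
	return 0;
}

/* Full reset in the order the quoted text gives: HCE 0, HCE 1, link startup. */
int model_full_reset(struct model_hba *hba)
{
	model_hce_write(hba, 0);
	model_hce_write(hba, 1);
	hba->reset_count++;
	return model_link_startup(hba);
}
```

In this model, calling model_full_reset() on a controller whose link is down leaves HCE at 1 with the link up again, while attempting model_link_startup() with HCE at 0 fails.
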
>>
>>>> but what I really wonder is why we don't just do recovery directly
>>>> in __ufshcd_wl_suspend() and __ufshcd_wl_resume() and strip all
>>>> the PM complexity out of ufshcd_err_handling()?
>>
>> +1
>
> I've explained why I chose not to do this in my last reply to Adrian.
> Please kindly check it.
>
>>
>>> For system suspend/resume, since error handling is of the same nature
>>> as user access, we are using host_sem to avoid concurrency of
>>> error handling and system suspend/resume.
>>
>> Why is host_sem used for that purpose instead of lock_system_sleep()
>> and unlock_system_sleep()?
>>
>
> I was aware of it, but the situation is that host_sem is also used to
> avoid concurrency among user access, error handling and shutdown, so
> I think it is better to just use host_sem anyway to simplify the locking;
> otherwise user access and error handling would have to take both
> system_transition_mutex and host_sem.

On second thought, I will take your suggestion to use lock_system_sleep()
and unlock_system_sleep() in the error handler and remove the host_sem used
in suspend/resume, which makes the code more readable by keeping the
changes within the error handler itself. However, please note that host_sem
will still be used to avoid concurrency among user access, the error
handler and shutdown.
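
As a rough illustration of the locking split described above, here is a minimal userspace sketch using pthread mutexes. Everything named model_* is a hypothetical stand-in: the real lock_system_sleep(), system_transition_mutex and host_sem are kernel primitives that are not available in userspace, and the order in which the error handler takes the two locks is an assumption made for the sketch, not something stated in this thread.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical userspace stand-ins for the kernel locks named in this thread. */
pthread_mutex_t model_system_transition_mutex = PTHREAD_MUTEX_INITIALIZER;
pthread_mutex_t model_host_sem = PTHREAD_MUTEX_INITIALIZER;

int model_eh_runs;         /* completed error handler passes */
int model_pm_in_progress;  /* nonzero while a modeled suspend/resume runs */

void model_lock_system_sleep(void)
{
	pthread_mutex_lock(&model_system_transition_mutex);
}

void model_unlock_system_sleep(void)
{
	pthread_mutex_unlock(&model_system_transition_mutex);
}

/* Models system suspend/resume: it no longer touches host_sem. */
void model_system_suspend_resume(void)
{
	pthread_mutex_lock(&model_system_transition_mutex);
	model_pm_in_progress = 1;
	/* ... suspend, then resume, would happen here ... */
	model_pm_in_progress = 0;
	pthread_mutex_unlock(&model_system_transition_mutex);
}

/* Models the error handler: both locks are taken (order is an assumption). */
void model_err_handler(void)
{
	model_lock_system_sleep();            /* excludes system suspend/resume */
	pthread_mutex_lock(&model_host_sem);  /* excludes user access and shutdown */
	assert(!model_pm_in_progress);        /* no PM transition can be running */
	model_eh_runs++;
	pthread_mutex_unlock(&model_host_sem);
	model_unlock_system_sleep();
}

/* Models user access (e.g. a sysfs read): only host_sem is needed. */
int model_user_access(void)
{
	int seen;

	pthread_mutex_lock(&model_host_sem);
	seen = model_eh_runs;
	pthread_mutex_unlock(&model_host_sem);
	return seen;
}
```

The point of the split is that user access only ever needs model_host_sem, while the exclusion against system suspend/resume stays entirely inside model_err_handler().
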

Thanks,

Can Guo.
>
> Thanks,
>
> Can Guo.
>
>> Thanks,
>>
>> Bart.