All of lore.kernel.org
 help / color / mirror / Atom feed
From: Can Guo <cang@codeaurora.org>
To: Bart Van Assche <bvanassche@acm.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>,
	asutoshd@codeaurora.org, nguyenb@codeaurora.org,
	hongwus@codeaurora.org, ziqichen@codeaurora.org,
	linux-scsi@vger.kernel.org, kernel-team@android.com,
	Alim Akhtar <alim.akhtar@samsung.com>,
	Avri Altman <avri.altman@wdc.com>,
	"James E.J. Bottomley" <jejb@linux.ibm.com>,
	"Martin K. Petersen" <martin.petersen@oracle.com>,
	Stanley Chu <stanley.chu@mediatek.com>,
	Bean Huo <beanhuo@micron.com>, Jaegeuk Kim <jaegeuk@kernel.org>,
	open list <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH v4 06/10] scsi: ufs: Remove host_sem used in suspend/resume
Date: Tue, 29 Jun 2021 14:23:28 +0800	[thread overview]
Message-ID: <60a5496863100976b74d8c376c9e9cb0@codeaurora.org> (raw)
In-Reply-To: <c7d9e12d-f966-44c6-27dc-4004143398aa@acm.org>

On 2021-06-29 01:31, Bart Van Assche wrote:
> On 6/28/21 1:17 AM, Can Guo wrote:
>> On 2021-06-25 01:11, Bart Van Assche wrote:
>>> On 6/23/21 11:31 PM, Can Guo wrote:
>>>> Using back host_sem in suspend_prepare()/resume_complete() won't 
>>>> have
>>>> this problem of deadlock, right?
>>> 
>>> Although that would solve the deadlock discussed in this email 
>>> thread, it
>>> wouldn't solve the issue of potential adverse interactions of the UFS
>>> error handler and the SCSI error handler running concurrently.
>> 
>> I think I've explained it before, paste it here -
>> 
>> ufshcd_eh_host_reset_handler() invokes ufshcd_err_handler() and 
>> flushes it,
>> so SCSI error handler and UFS error handler can safely run together.
> 
> That code path is the exception. Do you agree that the following three
> functions all invoke the ufshcd_err_handler() function asynchronously?
> * ufshcd_uic_pwr_ctrl()
> * ufshcd_check_errors()
> * ufshcd_abort()
> 

I agree, but I don't see what's wrong with that. Any context can invoke
ufs error handler asynchronously and ufs error handler prepare makes
sure error handler can work safely, i.e., stopping PM ops/gating/scaling
in error handler prepare makes sure no one shall call 
ufshcd_uic_pwr_ctrl()
ever again. And ufshcd_check_errors() and ufshcd_abort() are OK to run
concurrently with UFS error handler.

>>> How about using the
>>> standard approach for invoking the UFS error handler instead of using
>>> a custom
>>> mechanism, e.g. by using something like the (untested) patch below? 
>>> This
>>> approach guarantees that the UFS error handler is only activated 
>>> after
>>> all
>>> pending SCSI commands have failed or timed out and also guarantees
>>> that no new
>>> SCSI commands will be queued while the UFS error handler is in
>>> progress (see
>>> also scsi_host_queue_ready()).
>> 
>> Per my understanding, SCSI error handling is scsi cmd based, meaning 
>> it
>> only works when certain SCSI cmds failed [ ... ]
> That is not completely correct. The SCSI error handler is activated if
> either all pending commands have failed or if it is scheduled
> explicitly. Please take a look at the host_eh_scheduled member 
> variable,
> how it is used and also at scsi_schedule_eh(). The scsi_schedule_eh()
> function was introduced in 2006 and that the ATA code uses it since 
> then
> to activate the SCSI error handler even if no commands are pending. See
> also the patch "SCSI: make scsi_implement_eh() generic API for SCSI
> transports".
> 
>> However, most UFS (UIC) errors happens during gear scaling, clk gating
>> and suspend/resume (due to power mode changes and/or hibern8
>> enter/exit), during which there is NO scsi cmds in UFS driver at all
>> (because these contexts start only when there is no ongoing data
>> transactions).
> 
> Activating the SCSI error handler if no SCSI commands are in progress 
> is
> supported by scsi_schedule_eh().
> 
>> Thus, scsi_unjam_host() won't even call scsi_eh_ready_devs() because
>> scsi_eh_get_sense() always returns TRUE in these cases (eh_work_q is
>> empty).
> 
> Please take another look at the patch in my previous message. There is 
> a
> scsi_transport_template instance in that patch. The eh_strategy_handler
> defined in a SCSI transport template is called *instead* of
> scsi_unjam_host(). In other words, scsi_unjam_host() won't be called if
> my patch would be applied to the UFS driver.
> 
> Please let me know if you need more information.

Sorry that I missed the change of scsi_transport_template() in your 
previous
message. I can understand that you want to invoke UFS error hander by 
invoking
SCSI error handler, but I didn't go that far because I saw you changed
pm_runtime_get_sync() to pm_runtime_get_noresume() in ufs error handler 
prepare.
How can that change make sure that the device is not suspending or 
resuming
while error handler is running?

Thanks,

Can Guo.

> 
> Bart.

  reply	other threads:[~2021-06-29  6:23 UTC|newest]

Thread overview: 59+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-06-23  7:34 [PATCH v4 00/10] Complementary changes for error handling Can Guo
2021-06-23  7:35 ` [PATCH v4 01/10] scsi: ufs: Rename flags pm_op_in_progress and is_sys_suspended Can Guo
2021-06-23 20:05   ` Bart Van Assche
2021-06-23 20:57     ` Bart Van Assche
2021-06-24  2:02       ` Can Guo
2021-06-24  2:34         ` Can Guo
2021-06-24  6:04         ` Adrian Hunter
2021-06-23 20:42   ` Bjorn Andersson
2021-06-23 22:41     ` Bart Van Assche
2021-06-24  2:04     ` Can Guo
2021-06-24 17:32   ` Bart Van Assche
2021-06-24 23:42   ` Bart Van Assche
2021-06-28  7:01     ` Can Guo
2021-06-28  7:35       ` Can Guo
2021-06-28 17:07       ` Bart Van Assche
2021-06-23  7:35 ` [PATCH v4 02/10] scsi: ufs: Add " Can Guo
2021-06-23 12:33   ` Adrian Hunter
2021-06-24  2:05     ` Can Guo
2021-06-23 20:59   ` Bart Van Assche
2021-06-24  2:07     ` Can Guo
2021-06-24 17:35   ` Bart Van Assche
2021-06-28  7:11     ` Can Guo
2021-06-23  7:35 ` [PATCH v4 03/10] scsi: ufs: Update the return value of supplier pm ops Can Guo
2021-06-23 21:08   ` Bart Van Assche
2021-06-24  2:11     ` Can Guo
2021-06-23  7:35 ` [PATCH v4 04/10] scsi: ufs: Enable IRQ after enabling clocks in error handling preparation Can Guo
2021-06-23 21:20   ` Bart Van Assche
2021-06-23  7:35 ` [PATCH 05/10] scsi: ufs: Complete the cmd before returning in queuecommand Can Guo
2021-06-23  7:39   ` Can Guo
2021-06-23  7:35 ` [PATCH v4 05/10] scsi: ufs: Remove a redundant tag check in ufshcd_queuecommand() Can Guo
2021-06-23 21:24   ` Bart Van Assche
2021-06-23  7:35 ` [PATCH v4 06/10] scsi: ufs: Remove host_sem used in suspend/resume Can Guo
2021-06-23 14:30   ` Adrian Hunter
2021-06-24  2:16     ` Can Guo
2021-06-24  5:52       ` Adrian Hunter
2021-06-24  6:12         ` Can Guo
2021-06-24  6:23           ` Adrian Hunter
2021-06-24  6:31             ` Can Guo
2021-06-24 10:04               ` Adrian Hunter
2021-06-28  7:26                 ` Can Guo
2021-07-07 19:04                   ` Adrian Hunter
2021-06-24 17:11               ` Bart Van Assche
2021-06-28  8:17                 ` Can Guo
2021-06-28 17:31                   ` Bart Van Assche
2021-06-29  6:23                     ` Can Guo [this message]
2021-06-29 18:01                       ` Bart Van Assche
2021-06-29 21:50                         ` Can Guo
2021-06-23  7:35 ` [PATCH v4 07/10] scsi: ufs: Simplify error handling preparation Can Guo
2021-06-23 21:30   ` Bart Van Assche
2021-06-23  7:35 ` [PATCH v4 08/10] scsi: ufs: Update ufshcd_recover_pm_error() Can Guo
2021-06-23  7:35 ` [PATCH v4 09/10] scsi: ufs: Update the fast abort path in ufshcd_abort() for PM requests Can Guo
2021-06-23 21:33   ` Bart Van Assche
2021-06-24  4:16     ` Can Guo
2021-06-24 16:57       ` Bart Van Assche
2021-06-23  7:35 ` [PATCH v4 10/10] scsi: ufs: Apply more limitations to user access Can Guo
2021-06-23 21:51   ` Bart Van Assche
2021-06-24  2:23     ` Can Guo
2021-06-24 22:25       ` Bart Van Assche
2021-06-28  7:16         ` Can Guo

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=60a5496863100976b74d8c376c9e9cb0@codeaurora.org \
    --to=cang@codeaurora.org \
    --cc=adrian.hunter@intel.com \
    --cc=alim.akhtar@samsung.com \
    --cc=asutoshd@codeaurora.org \
    --cc=avri.altman@wdc.com \
    --cc=beanhuo@micron.com \
    --cc=bvanassche@acm.org \
    --cc=hongwus@codeaurora.org \
    --cc=jaegeuk@kernel.org \
    --cc=jejb@linux.ibm.com \
    --cc=kernel-team@android.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-scsi@vger.kernel.org \
    --cc=martin.petersen@oracle.com \
    --cc=nguyenb@codeaurora.org \
    --cc=stanley.chu@mediatek.com \
    --cc=ziqichen@codeaurora.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.