linux-scsi.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Can Guo <cang@codeaurora.org>
To: Bart Van Assche <bvanassche@acm.org>
Cc: asutoshd@codeaurora.org, nguyenb@codeaurora.org,
	hongwus@codeaurora.org, linux-scsi@vger.kernel.org,
	kernel-team@android.com, Stanley Chu <stanley.chu@mediatek.com>,
	Alim Akhtar <alim.akhtar@samsung.com>,
	Avri Altman <avri.altman@wdc.com>,
	"James E.J. Bottomley" <jejb@linux.ibm.com>,
	"Martin K. Petersen" <martin.petersen@oracle.com>,
	Matthias Brugger <matthias.bgg@gmail.com>,
	Bean Huo <beanhuo@micron.com>, Jaegeuk Kim <jaegeuk@kernel.org>,
	Adrian Hunter <adrian.hunter@intel.com>,
	Kiwoong Kim <kwmad.kim@samsung.com>,
	Satya Tangirala <satyat@google.com>,
	open list <linux-kernel@vger.kernel.org>,
	"moderated list:ARM/Mediatek SoC support" 
	<linux-arm-kernel@lists.infradead.org>,
	"moderated list:ARM/Mediatek SoC support" 
	<linux-mediatek@lists.infradead.org>
Subject: Re: [PATCH v1 2/3] scsi: ufs: Optimize host lock on transfer requests send/compl paths
Date: Wed, 23 Jun 2021 10:04:39 +0800	[thread overview]
Message-ID: <e8de57b920c9246f5a41aa44c5a4fc36@codeaurora.org> (raw)
In-Reply-To: <a29164e1-ab7d-6dbc-0fb9-029f203de735@acm.org>

Hi Bart,

On 2021-06-17 10:49, Bart Van Assche wrote:
> On 5/24/21 1:36 AM, Can Guo wrote:
>> @@ -2688,6 +2705,43 @@ static int ufshcd_queuecommand(struct Scsi_Host 
>> *host, struct scsi_cmnd *cmd)
>> +	case UFSHCD_STATE_EH_SCHEDULED_FATAL:
>> +		/*
>> +		 * pm_runtime_get_sync() is used at error handling preparation
>> +		 * stage. If a scsi cmd, e.g. the SSU cmd, is sent from hba's
>> +		 * PM ops, it can never be finished if we let SCSI layer keep
>> +		 * retrying it, which gets err handler stuck forever. Neither
>> +		 * can we let the scsi cmd pass through, because UFS is in bad
>> +		 * state, the scsi cmd may eventually time out, which will get
>> +		 * err handler blocked for too long. So, just fail the scsi cmd
>> +		 * sent from PM ops, err handler can recover PM error anyways.
>> +		 */
>> +		if (hba->pm_op_in_progress) {
>> +			hba->force_reset = true;
>> +			set_host_byte(cmd, DID_BAD_TARGET);
>> +			cmd->scsi_done(cmd);
>> +			goto out;
>> +		}
>> +		fallthrough;
> 
> Hi Can,
> 
> I know that this patch only moves the above code and that the above 
> code
> has not been introduced by this patch. Anyway, is my understanding
> correct that ufshcd_err_handler() can change the host controller state
> from UFSHCD_STATE_EH_SCHEDULED_FATAL into UFSHCD_STATE_RESET and next
> into UFSHCD_STATE_OPERATIONAL? If so, if the above code completes a 
> READ
> with status DID_BAD_TARGET and if recovery by the error handler
> succeeds, will that cause the filesystem above the UFS driver to change
> into read-only mode? If the above code completes a WRITE with status
> DID_BAD_TARGET, will that cause data corruption? Is there any other
> solution to prevent data corruption than merging the
> UFSHCD_STATE_EH_SCHEDULED_FATAL and UFSHCD_STATE_EH_SCHEDULED_NON_FATAL
> back into a single state and changing the ufshcd_rpm_get_sync(hba) call
> in ufshcd_err_handling_prepare() into a pm_runtime_get_noresume() call?
> 

Here, when hba->pm_op_in_progress is true, there cannot be READ or WRITE
command since hba is resuming or suspending. When fatal erorr happens, 
the
DID_BAD_TARGET above is intend to let the SSU (or whatever PM requests
blocking suspend/resume) fail fast (neither returning HOST_BUSY nor 
letting
the cmd pass through can achieve such purpose), so that error handling 
prepare
won't get stuck [1] when it calls

lock_system_sleep()
runtime_pm_get_sync()

The reason why I split UFSHCD_STATE_EH_SCHEDULED to 
UFSHCD_STATE_EH_SCHEDULED_FATAL
and UFSHCD_STATE_EH_SCHEDULED_NON_FATAL is that

1. For non-fatal errors, HW can recover by itself, so when host state is
UFSHCD_STATE_EH_SCHEDULED_NON_FATAL, cmd can still passthrough.

2. When non-fatal error (LINE-RESET for example) happens, error handler 
only
needs to do a power mode transition without a full reset. If we only 
have one
state, returning HOST_BUSY will get error handling prepare stuck [1], 
while
fast failing SSU cmds shall make error handler do a full reset (which 
goes
too far for non-fatal errors).

Thanks,

Can Guo.

> Thanks,
> 
> Bart.

  reply	other threads:[~2021-06-23  2:05 UTC|newest]

Thread overview: 30+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-05-24  8:36 [PATCH v1 0/3] Optimize host lock on TR send/compl paths and utilize UTRLCNR Can Guo
2021-05-24  8:36 ` [PATCH v1 1/3] scsi: ufs: Remove a redundant command completion logic in error handler Can Guo
2021-05-24 16:43   ` Bart Van Assche
2021-05-25  4:15   ` Stanley Chu
2021-05-31  7:14   ` Bean Huo
2021-05-24  8:36 ` [PATCH v1 2/3] scsi: ufs: Optimize host lock on transfer requests send/compl paths Can Guo
2021-05-24 11:25   ` kernel test robot
2021-06-08 17:53     ` Nathan Chancellor
2021-06-09  1:01       ` Can Guo
2021-05-24 20:10   ` Bart Van Assche
2021-05-25  1:34     ` Asutosh Das (asd)
2021-05-25  8:24       ` Avri Altman
2021-05-28  7:30         ` Avri Altman
2021-06-02 21:18           ` Asutosh Das (asd)
2021-05-25  1:40     ` Can Guo
2021-05-25 16:40       ` Bart Van Assche
2021-05-31 16:04   ` Bean Huo
2021-06-02  2:14     ` Can Guo
2021-06-03  0:18     ` Bart Van Assche
2021-06-03  2:54   ` Stanley Chu
2021-06-04  1:49     ` Can Guo
2021-06-17  2:49   ` Bart Van Assche
2021-06-23  2:04     ` Can Guo [this message]
2021-06-28 22:58   ` Bart Van Assche
2021-06-29  5:41     ` Can Guo
2021-07-01 15:57       ` Bart Van Assche
2021-05-24  8:36 ` [PATCH v1 3/3] scsi: ufs: Utilize Transfer Request List Completion Notification Register Can Guo
2021-05-31 16:05   ` Bean Huo
2021-06-03  2:54   ` Stanley Chu
2021-06-16  3:48 ` [PATCH v1 0/3] Optimize host lock on TR send/compl paths and utilize UTRLCNR Martin K. Petersen

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=e8de57b920c9246f5a41aa44c5a4fc36@codeaurora.org \
    --to=cang@codeaurora.org \
    --cc=adrian.hunter@intel.com \
    --cc=alim.akhtar@samsung.com \
    --cc=asutoshd@codeaurora.org \
    --cc=avri.altman@wdc.com \
    --cc=beanhuo@micron.com \
    --cc=bvanassche@acm.org \
    --cc=hongwus@codeaurora.org \
    --cc=jaegeuk@kernel.org \
    --cc=jejb@linux.ibm.com \
    --cc=kernel-team@android.com \
    --cc=kwmad.kim@samsung.com \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mediatek@lists.infradead.org \
    --cc=linux-scsi@vger.kernel.org \
    --cc=martin.petersen@oracle.com \
    --cc=matthias.bgg@gmail.com \
    --cc=nguyenb@codeaurora.org \
    --cc=satyat@google.com \
    --cc=stanley.chu@mediatek.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).