From: Adrian Hunter <adrian.hunter@intel.com>
To: Bart Van Assche <bvanassche@acm.org>,
	"Martin K . Petersen" <martin.petersen@oracle.com>
Cc: "James E . J . Bottomley" <jejb@linux.ibm.com>,
	Bean Huo <huobean@gmail.com>, Avri Altman <avri.altman@wdc.com>,
	Alim Akhtar <alim.akhtar@samsung.com>,
	Can Guo <cang@codeaurora.org>,
	Asutosh Das <asutoshd@codeaurora.org>,
	Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>,
	Wei Li <liwei213@huawei.com>,
	linux-scsi@vger.kernel.org
Subject: Re: [PATCH V3 1/3] scsi: ufs: Fix error handler clear ua deadlock
Date: Mon, 13 Sep 2021 11:53:21 +0300
Message-ID: <fae15188-2c1d-b953-f6e4-6e5ac1902b24@intel.com>
In-Reply-To: <9220f68e-dc5e-9520-6e55-2a4d86809b44@acm.org>

On 13/09/21 6:17 am, Bart Van Assche wrote:
> On 9/11/21 09:47, Adrian Hunter wrote:
>> On 8/09/21 1:36 am, Bart Van Assche wrote:
>>> --- a/drivers/scsi/ufs/ufshcd.c
>>> +++ b/drivers/scsi/ufs/ufshcd.c
>>> @@ -2707,6 +2707,14 @@ static int ufshcd_queuecommand(struct Scsi_Host *host, struct scsi_cmnd *cmd)
>>>          }
>>>          fallthrough;
>>>      case UFSHCD_STATE_RESET:
>>> +        /*
>>> +         * The SCSI error handler only starts after all pending commands
>>> +         * have failed or timed out. Complete commands with
>>> +         * DID_IMM_RETRY to allow the error handler to start
>>> +         * if it has been scheduled.
>>> +         */
>>> +        set_host_byte(cmd, DID_IMM_RETRY);
>>> +        cmd->scsi_done(cmd);
>> 
>> Setting non-zero return value, in this case "err =
>> SCSI_MLQUEUE_HOST_BUSY" will anyway cause scsi_dec_host_busy(), so
>> does this make any difference?
> 
> The return value should be changed into 0 since returning
> SCSI_MLQUEUE_HOST_BUSY is only allowed if cmd->scsi_done(cmd) has not
> yet been called.
> 
> I expect that setting the host byte to DID_IMM_RETRY and calling
> scsi_done will make a difference, otherwise I wouldn't have suggested
> this. As explained in my previous email doing that triggers the SCSI
> command completion and resubmission paths. Resubmission only happens
> if the SCSI error handler has not yet been scheduled. The SCSI error
> handler is scheduled after for all pending commands scsi_done() has
> been called or a timeout occurred. In other words, setting the host
> byte to DID_IMM_RETRY and calling scsi_done() makes it possible for
> the error handler to be scheduled, something that won't happen if
> ufshcd_queuecommand() systematically returns SCSI_MLQUEUE_HOST_BUSY.

Not getting it, sorry. :-(

The error handler itself sets UFSHCD_STATE_RESET, and it never exits
with the state still set to UFSHCD_STATE_RESET. So in that case there
is no need to start the error handler: it is already running.

The error handler is always scheduled after setting 
UFSHCD_STATE_EH_SCHEDULED_FATAL.

scsi_queue_rq() calls scsi_dec_host_busy() for any non-zero return
value from scsi_dispatch_cmd(), such as SCSI_MLQUEUE_HOST_BUSY, i.e.:

	reason = scsi_dispatch_cmd(cmd);
	if (reason) {
		scsi_set_blocked(cmd, reason);
		ret = BLK_STS_RESOURCE;
		goto out_dec_host_busy;
	}

	return BLK_STS_OK;

out_dec_host_busy:
	scsi_dec_host_busy(shost, cmd);

And that will wake the error handler:

static void scsi_dec_host_busy(struct Scsi_Host *shost, struct scsi_cmnd *cmd)
{
	unsigned long flags;

	rcu_read_lock();
	__clear_bit(SCMD_STATE_INFLIGHT, &cmd->state);
	if (unlikely(scsi_host_in_recovery(shost))) {
		spin_lock_irqsave(shost->host_lock, flags);
		if (shost->host_failed || shost->host_eh_scheduled)
			scsi_eh_wakeup(shost);
		spin_unlock_irqrestore(shost->host_lock, flags);
	}
	rcu_read_unlock();
}

Note that scsi_host_queue_ready() won't let any requests through
when scsi_host_in_recovery(), so the potential problem is with
requests that have already been successfully submitted to the
UFS driver but have not completed. The change you suggest
does not help with that.
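
To make the above concrete, here is a rough userspace sketch of the
accounting, not kernel code. All toy_* names are hypothetical, and the
wakeup condition (busy count equal to failed count) is my simplified
reading of scsi_eh_wakeup(), so treat it as an assumption:

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for the few struct Scsi_Host fields discussed here. */
struct toy_host {
	int busy;		/* commands counted as in flight */
	int failed;		/* commands handed over to error handling */
	bool eh_scheduled;	/* models shost->host_eh_scheduled */
	bool in_recovery;	/* models scsi_host_in_recovery() */
	bool eh_woken;		/* set once the error handler would be woken */
};

/* Assumption: the handler is woken only when busy == failed. */
static void toy_eh_wakeup(struct toy_host *h)
{
	if (h->busy == h->failed)
		h->eh_woken = true;
}

/* Models scsi_dec_host_busy(): drop the count, then try to wake the EH. */
static void toy_dec_host_busy(struct toy_host *h)
{
	assert(h->busy > 0);
	h->busy--;
	if (h->in_recovery && (h->failed || h->eh_scheduled))
		toy_eh_wakeup(h);
}

/* Models scsi_host_queue_ready(): nothing gets through during recovery. */
static bool toy_host_queue_ready(struct toy_host *h)
{
	if (h->in_recovery)
		return false;
	h->busy++;
	return true;
}

/* Models the scsi_queue_rq() tail: non-zero lld_ret drops the busy count. */
static bool toy_dispatch(struct toy_host *h, int lld_ret)
{
	if (lld_ret) {
		toy_dec_host_busy(h);
		return false;
	}
	return true;
}

/* Models scheduling the error handler without a failed command. */
static void toy_schedule_eh(struct toy_host *h)
{
	h->in_recovery = true;
	h->eh_scheduled = true;
	toy_eh_wakeup(h);
}

/* Models command completion, which also ends in scsi_dec_host_busy(). */
static void toy_complete(struct toy_host *h)
{
	toy_dec_host_busy(h);
}
```

In this model a command that is rejected with a non-zero return after
recovery has started already wakes the handler, whereas a command that
the driver accepted earlier keeps the handler asleep until
toy_complete() is called for it.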

That seems like another problem with the patch 
"scsi: ufs: Synchronize SCSI and UFS error handling".


> In the latter case the block layer timer is reset over and over
> again. See also the blk_mq_start_request() in scsi_queue_rq(). One
> could wonder whether this is really what the SCSI core should do if a
> SCSI LLD keeps returning the SCSI_MLQUEUE_HOST_BUSY status code ...
> 
> Bart.

