All of lore.kernel.org
 help / color / mirror / Atom feed
From: Mike Christie <michael.christie@oracle.com>
To: Bart Van Assche <bvanassche@acm.org>,
	"Martin K . Petersen" <martin.petersen@oracle.com>
Cc: Jaegeuk Kim <jaegeuk@kernel.org>,
	linux-scsi@vger.kernel.org,
	Adrian Hunter <adrian.hunter@intel.com>,
	Christoph Hellwig <hch@lst.de>, Ming Lei <ming.lei@redhat.com>,
	John Garry <john.garry@huawei.com>,
	Hannes Reinecke <hare@suse.de>,
	"James E.J. Bottomley" <jejb@linux.ibm.com>
Subject: Re: [PATCH v3 3/8] scsi: core: Support failing requests while recovering
Date: Mon, 3 Oct 2022 12:27:06 -0500	[thread overview]
Message-ID: <9160ed4f-95a8-42ff-366e-f52d7816d8c3@oracle.com> (raw)
In-Reply-To: <20220929220021.247097-4-bvanassche@acm.org>

On 9/29/22 5:00 PM, Bart Van Assche wrote:
> The current behavior for SCSI commands submitted while error recovery
> is ongoing is to retry command submission after error recovery has
> finished. See also the scsi_host_in_recovery() check in
> scsi_host_queue_ready(). Add support for failing SCSI commands while
> host recovery is in progress. This functionality will be used to fix a
> deadlock in the UFS driver.
> 
> Cc: Christoph Hellwig <hch@lst.de>
> Cc: Ming Lei <ming.lei@redhat.com>
> Cc: John Garry <john.garry@huawei.com>
> Cc: Mike Christie <michael.christie@oracle.com>
> Cc: Hannes Reinecke <hare@suse.de>
> Signed-off-by: Bart Van Assche <bvanassche@acm.org>
> ---
>  drivers/scsi/scsi_lib.c | 8 +++++---
>  1 file changed, 5 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
> index 473d9403f0c1..ecd2ce3815df 100644
> --- a/drivers/scsi/scsi_lib.c
> +++ b/drivers/scsi/scsi_lib.c
> @@ -1337,9 +1337,6 @@ static inline int scsi_host_queue_ready(struct request_queue *q,
>  				   struct scsi_device *sdev,
>  				   struct scsi_cmnd *cmd)
>  {
> -	if (scsi_host_in_recovery(shost))
> -		return 0;
> -
>  	if (atomic_read(&shost->host_blocked) > 0) {
>  		if (scsi_host_busy(shost) > 0)
>  			goto starved;
> @@ -1727,6 +1724,11 @@ static blk_status_t scsi_queue_rq(struct blk_mq_hw_ctx *hctx,
>  	ret = BLK_STS_RESOURCE;
>  	if (!scsi_target_queue_ready(shost, sdev))
>  		goto out_put_budget;
> +	if (unlikely(scsi_host_in_recovery(shost))) {
> +		if (req->cmd_flags & REQ_FAILFAST_MASK)
> +			ret = BLK_STS_OFFLINE;
> +		goto out_dec_target_busy;
> +	}
>  	if (!scsi_host_queue_ready(q, shost, sdev, cmd))
>  		goto out_dec_target_busy;
>  

This might add a regression to dm-multipath or it at least makes
the behavior difficult to understand for users and support people.

If there is a transport issue and the cmd times out and the abort
does as well or fails, then we would start the scsi eh recovery. When the
driver/transport class figures out it's a transport issue we will block
the scsi_device. So before this patch we would requeue the cmd then we
would wait until the fast_io_fail (FC) or replacement_timer (iscsi) to
fire before failing commands upwards and failing paths.

With this patch we can end up doing 2 or 3 things depending on timing:
1. If the cmd hits the code above we will fail the command and cause a
path failure.

2. If driver/transport blocks the scsi_device first then we would hit the
scsi_device state check in scsi_queue_rq and not fail the path like before.

3. With or without your patch, if dm_mq_queue_rq calls the busy callback
first (this does blk_lld_busy -> scsi_mq_lld_busy -> scsi_host_in_recovery)
then it might see all the paths in recovery. It considers this a temp condition
and will requeue cmds. So in this case we will not fail the path.

I'm not sure what the correct fix is. Maybe not fast fail REQ_FAILFAST_TRANSPORT
above, or maybe add a new FAILFAST type that you could use?

  reply	other threads:[~2022-10-03 17:28 UTC|newest]

Thread overview: 27+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-09-29 22:00 [PATCH v3 0/8] Fix a deadlock in the UFS driver Bart Van Assche
2022-09-29 22:00 ` [PATCH v3 1/8] scsi: core: Fix a race between scsi_done() and scsi_timeout() Bart Van Assche
2022-09-30  0:17   ` Mike Christie
2022-09-30  0:32     ` Bart Van Assche
2022-09-29 22:00 ` [PATCH v3 2/8] scsi: core: Change the return type of .eh_timed_out() Bart Van Assche
2022-09-29 22:00 ` [PATCH v3 3/8] scsi: core: Support failing requests while recovering Bart Van Assche
2022-10-03 17:27   ` Mike Christie [this message]
2022-10-03 19:20     ` Bart Van Assche
2022-09-29 22:00 ` [PATCH v3 4/8] scsi: ufs: Remove an outdated comment Bart Van Assche
2022-09-30 10:26   ` Bean Huo
2022-10-03  5:55   ` Adrian Hunter
2022-09-29 22:00 ` [PATCH v3 5/8] scsi: ufs: Use 'else' in ufshcd_set_dev_pwr_mode() Bart Van Assche
2022-09-30 10:28   ` Bean Huo
2022-10-03  5:58   ` Adrian Hunter
2022-09-29 22:00 ` [PATCH v3 6/8] scsi: ufs: Try harder to change the power mode Bart Van Assche
2022-09-30 12:14   ` Bean Huo
2022-10-03  6:10   ` Adrian Hunter
2022-10-03 16:38     ` Bart Van Assche
2022-09-29 22:00 ` [PATCH v3 7/8] scsi: ufs: Track system suspend / resume activity Bart Van Assche
2022-10-03  6:10   ` Adrian Hunter
2022-09-29 22:00 ` [PATCH v3 8/8] scsi: ufs: Fix a deadlock between PM and the SCSI error handler Bart Van Assche
2022-09-30 15:03   ` Bean Huo
2022-09-30 17:15     ` Bart Van Assche
2022-10-02 21:21       ` Bean Huo
2022-10-04  0:21         ` Bart Van Assche
2022-10-03  6:18   ` Adrian Hunter
2022-10-03 16:45     ` Bart Van Assche

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=9160ed4f-95a8-42ff-366e-f52d7816d8c3@oracle.com \
    --to=michael.christie@oracle.com \
    --cc=adrian.hunter@intel.com \
    --cc=bvanassche@acm.org \
    --cc=hare@suse.de \
    --cc=hch@lst.de \
    --cc=jaegeuk@kernel.org \
    --cc=jejb@linux.ibm.com \
    --cc=john.garry@huawei.com \
    --cc=linux-scsi@vger.kernel.org \
    --cc=martin.petersen@oracle.com \
    --cc=ming.lei@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.