From: Hannes Reinecke <hare@suse.de>
To: Bart Van Assche <bvanassche@acm.org>,
	"Martin K . Petersen" <martin.petersen@oracle.com>
Cc: "James E . J . Bottomley" <jejb@linux.vnet.ibm.com>,
	Jens Axboe <axboe@kernel.dk>, Christoph Hellwig <hch@lst.de>,
	Ming Lei <ming.lei@redhat.com>,
	linux-scsi@vger.kernel.org, linux-block@vger.kernel.org,
	Alan Stern <stern@rowland.harvard.edu>,
	Can Guo <cang@codeaurora.org>,
	Stanley Chu <stanley.chu@mediatek.com>,
	"Rafael J . Wysocki" <rafael.j.wysocki@intel.com>
Subject: Re: [PATCH v4 5/9] scsi: Do not wait for a request in scsi_eh_lock_door()
Date: Thu, 3 Dec 2020 08:18:57 +0100
Message-ID: <b56cf3af-940f-62ed-2a79-eb80599e2f44@suse.de>
In-Reply-To: <6e5fbc73-881e-69c7-54ce-381b8b695b3c@acm.org>

On 12/3/20 6:10 AM, Bart Van Assche wrote:
> On 12/1/20 11:06 PM, Hannes Reinecke wrote:
>> On 11/30/20 3:46 AM, Bart Van Assche wrote:
>>> diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
>>> index d94449188270..6de6e1bf3dcb 100644
>>> --- a/drivers/scsi/scsi_error.c
>>> +++ b/drivers/scsi/scsi_error.c
>>> @@ -1993,7 +1993,12 @@ static void scsi_eh_lock_door(struct scsi_device *sdev)
>>>        struct request *req;
>>>        struct scsi_request *rq;
>>> -    req = blk_get_request(sdev->request_queue, REQ_OP_SCSI_IN, 0);
>>> +    /*
>>> +     * It is not guaranteed that a request is available nor that
>>> +     * sdev->request_queue is unfrozen. Hence the BLK_MQ_REQ_NOWAIT below.
>>> +     */
>>> +    req = blk_get_request(sdev->request_queue, REQ_OP_SCSI_IN,
>>> +                  BLK_MQ_REQ_NOWAIT);
>>>        if (IS_ERR(req))
>>>            return;
>>>        rq = scsi_req(req);
>>>
>>
>> Well ... had been thinking about that one, too.
>> The idea of this function is that prior to SCSI EH the device was locked
>> via scsi_set_medium_removal(). And during SCSI EH the device might have
>> become unlocked, so we need to lock it again.
>> However, scsi_set_medium_removal() not only issues the
>> PREVENT_ALLOW_MEDIUM_REMOVAL command, but also sets the 'locked' flag
>> based on the result.
>> So if we fail to get a request here, shouldn't we unset the 'locked'
>> flag, too?
> 
> Probably not. My interpretation of the 'locked' flag is that it
> represents the door state before error handling began. The following
> code in the SCSI error handler restores the door state after a bus reset:
> 
> 	if (scsi_device_online(sdev) && sdev->was_reset && sdev->locked) {
> 		scsi_eh_lock_door(sdev);
> 		sdev->was_reset = 0;
> 	}
> 
>> And what does happen if we fail here? There is no return value, hence
>> SCSI EH might run to completion, and the system will continue
>> with an unlocked door ...
>> Not sure if that's a good idea.
> 
> How about applying the following patch on top of patch 5/9?
> 
> diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
> index 6de6e1bf3dcb..feac7262e40e 100644
> --- a/drivers/scsi/scsi_error.c
> +++ b/drivers/scsi/scsi_error.c
> @@ -1988,7 +1988,7 @@ static void eh_lock_door_done(struct request *req, blk_status_t status)
>    * 	We queue up an asynchronous "ALLOW MEDIUM REMOVAL" request on the
>    * 	head of the devices request queue, and continue.
>    */
> -static void scsi_eh_lock_door(struct scsi_device *sdev)
> +static int scsi_eh_lock_door(struct scsi_device *sdev)
>   {
>   	struct request *req;
>   	struct scsi_request *rq;
> @@ -2000,7 +2000,7 @@ static void scsi_eh_lock_door(struct scsi_device *sdev)
>   	req = blk_get_request(sdev->request_queue, REQ_OP_SCSI_IN,
>   			      BLK_MQ_REQ_NOWAIT);
>   	if (IS_ERR(req))
> -		return;
> +		return PTR_ERR(req);
>   	rq = scsi_req(req);
> 
>   	rq->cmd[0] = ALLOW_MEDIUM_REMOVAL;
> @@ -2016,6 +2016,7 @@ static void scsi_eh_lock_door(struct scsi_device *sdev)
>   	rq->retries = 5;
> 
>   	blk_execute_rq_nowait(req->q, NULL, req, 1, eh_lock_door_done);
> +	return 0;
>   }
> 
>   /**
> @@ -2037,8 +2038,8 @@ static void scsi_restart_operations(struct Scsi_Host *shost)
>   	 * is no point trying to lock the door of an off-line device.
>   	 */
>   	shost_for_each_device(sdev, shost) {
> -		if (scsi_device_online(sdev) && sdev->was_reset && sdev->locked) {
> -			scsi_eh_lock_door(sdev);
> +		if (scsi_device_online(sdev) && sdev->was_reset &&
> +		    sdev->locked && scsi_eh_lock_door(sdev) == 0) {
>   			sdev->was_reset = 0;
>   		}
>   	}
> 
I probably didn't make myself clear.
As per SBC (in this case, sbc3r36) the effects of
PREVENT_ALLOW_MEDIUM_REMOVAL are reset by a successful LUN Reset,
Hard Reset, Power-On Reset, or an I_T Nexus loss. These events map
nicely onto SCSI EH, so after a successful SCSI EH the door will be
unlocked (which is why we need to call scsi_eh_lock_door()).
In the SCSI midlayer this state is reflected by the 'locked' flag.
Now, if scsi_eh_lock_door() is _not_ executed due to a
blk_get_request() failure, the device remains unlocked, and as such the
'locked' flag would need to be _unset_.

So I was thinking more along these lines:

@@ -2030,7 +2037,8 @@ static void scsi_restart_operations(struct Scsi_Host *shost)
          */
         shost_for_each_device(sdev, shost) {
                 if (scsi_device_online(sdev) && sdev->was_reset && sdev->locked) {
-                       scsi_eh_lock_door(sdev);
+                       if (scsi_eh_lock_door(sdev) < 0)
+                               sdev->locked = 0;
                         sdev->was_reset = 0;
                 }
         }
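
To illustrate the intended flag handling outside the kernel, here is a
minimal userspace sketch. It is not kernel code: struct model_sdev,
model_eh_lock_door() and model_restart_operations() are hypothetical
stand-ins, the request_available field merely models whether a
non-blocking request allocation would succeed, and -EBUSY is an
arbitrary choice of negative error code. The real door-lock request
is asynchronous, so the sketch only models the case where submission
implies the door ends up locked.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical model of the few scsi_device fields discussed above. */
struct model_sdev {
	bool online;            /* models scsi_device_online() */
	bool was_reset;         /* device was reset during error handling */
	bool locked;            /* midlayer's view of the door state */
	bool request_available; /* assumption: would a NOWAIT allocation succeed? */
	bool door_locked;       /* assumption: door state on the device itself */
};

/*
 * Model of a door-lock helper that returns an error instead of
 * silently giving up when no request can be allocated.
 */
static int model_eh_lock_door(struct model_sdev *sdev)
{
	if (!sdev->request_available)
		return -EBUSY; /* arbitrary negative errno for the model */
	sdev->door_locked = true;
	return 0;
}

/*
 * Model of the per-device step in the restart path: if the door
 * cannot be re-locked, clear 'locked' so the flag matches the device.
 */
static void model_restart_operations(struct model_sdev *sdev)
{
	if (sdev->online && sdev->was_reset && sdev->locked) {
		if (model_eh_lock_door(sdev) < 0)
			sdev->locked = false;
		sdev->was_reset = false;
	}
}
```

With request_available set to false, the model ends with both
'locked' and 'was_reset' cleared and the door unlocked, so the flag
and the device agree. With request_available set to true, 'locked'
stays set and the door is locked again.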


Cheers,

Hannes
-- 
Dr. Hannes Reinecke                Kernel Storage Architect
hare@suse.de                              +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), Geschäftsführer: Felix Imendörffer

Thread overview: 30+ messages
2020-11-30  2:46 [PATCH v4 0/9] Rework runtime suspend and SPI domain validation Bart Van Assche
2020-11-30  2:46 ` [PATCH v4 1/9] block: Fix a race in the runtime power management code Bart Van Assche
2020-11-30  2:46 ` [PATCH v4 2/9] block: Introduce BLK_MQ_REQ_PM Bart Van Assche
2020-12-01 11:31   ` Christoph Hellwig
2020-12-02  6:50   ` Hannes Reinecke
2020-11-30  2:46 ` [PATCH v4 3/9] ide: Do not set the RQF_PREEMPT flag for sense requests Bart Van Assche
2020-12-01 11:29   ` Christoph Hellwig
2020-12-02  6:52   ` Hannes Reinecke
2020-11-30  2:46 ` [PATCH v4 4/9] ide: Mark power management requests with RQF_PM instead of RQF_PREEMPT Bart Van Assche
2020-12-01 11:31   ` Christoph Hellwig
2020-12-02  6:53   ` Hannes Reinecke
2020-11-30  2:46 ` [PATCH v4 5/9] scsi: Do not wait for a request in scsi_eh_lock_door() Bart Van Assche
2020-12-02  7:06   ` Hannes Reinecke
2020-12-03  5:10     ` Bart Van Assche
2020-12-03  7:18       ` Hannes Reinecke [this message]
2020-12-03  7:27         ` Ming Lei
2020-12-04 16:50           ` Bart Van Assche
2020-11-30  2:46 ` [PATCH v4 6/9] scsi_transport_spi: Set RQF_PM for domain validation commands Bart Van Assche
2020-12-01 11:31   ` Christoph Hellwig
2020-11-30  2:46 ` [PATCH v4 7/9] scsi: Only process PM requests if rpm_status != RPM_ACTIVE Bart Van Assche
2020-12-01 11:32   ` Christoph Hellwig
2020-12-02  7:14   ` Hannes Reinecke
2020-11-30  2:46 ` [PATCH v4 8/9] block: Remove RQF_PREEMPT and BLK_MQ_REQ_PREEMPT Bart Van Assche
2020-12-01 11:33   ` Christoph Hellwig
2020-12-02  7:15   ` Hannes Reinecke
2020-11-30  2:46 ` [PATCH v4 9/9] block: Do not accept any requests while suspended Bart Van Assche
2020-12-02  7:16   ` Hannes Reinecke
2020-12-02  1:51 ` [PATCH v4 0/9] Rework runtime suspend and SPI domain validation Martin K. Petersen
2020-12-06  0:01   ` Jens Axboe
2020-12-08  1:56 ` Martin K. Petersen
