linux-block.vger.kernel.org archive mirror
From: Hannes Reinecke <hare@suse.de>
To: Bart Van Assche <bvanassche@acm.org>,
	"Martin K . Petersen" <martin.petersen@oracle.com>
Cc: "James E . J . Bottomley" <jejb@linux.vnet.ibm.com>,
	Jens Axboe <axboe@kernel.dk>, Christoph Hellwig <hch@lst.de>,
	Ming Lei <ming.lei@redhat.com>,
	linux-scsi@vger.kernel.org, linux-block@vger.kernel.org,
	Alan Stern <stern@rowland.harvard.edu>,
	Can Guo <cang@codeaurora.org>,
	Stanley Chu <stanley.chu@mediatek.com>,
	"Rafael J . Wysocki" <rafael.j.wysocki@intel.com>
Subject: Re: [PATCH v4 5/9] scsi: Do not wait for a request in scsi_eh_lock_door()
Date: Thu, 3 Dec 2020 08:18:57 +0100
Message-ID: <b56cf3af-940f-62ed-2a79-eb80599e2f44@suse.de>
In-Reply-To: <6e5fbc73-881e-69c7-54ce-381b8b695b3c@acm.org>

On 12/3/20 6:10 AM, Bart Van Assche wrote:
> On 12/1/20 11:06 PM, Hannes Reinecke wrote:
>> On 11/30/20 3:46 AM, Bart Van Assche wrote:
>>> diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
>>> index d94449188270..6de6e1bf3dcb 100644
>>> --- a/drivers/scsi/scsi_error.c
>>> +++ b/drivers/scsi/scsi_error.c
>>> @@ -1993,7 +1993,12 @@ static void scsi_eh_lock_door(struct scsi_device *sdev)
>>>        struct request *req;
>>>        struct scsi_request *rq;
>>>
>>> -    req = blk_get_request(sdev->request_queue, REQ_OP_SCSI_IN, 0);
>>> +    /*
>>> +     * It is not guaranteed that a request is available nor that
>>> +     * sdev->request_queue is unfrozen. Hence the BLK_MQ_REQ_NOWAIT below.
>>> +     */
>>> +    req = blk_get_request(sdev->request_queue, REQ_OP_SCSI_IN,
>>> +                  BLK_MQ_REQ_NOWAIT);
>>>        if (IS_ERR(req))
>>>            return;
>>>        rq = scsi_req(req);
>>>
>>
>> Well ... had been thinking about that one, too.
>> The idea of this function is that prior to SCSI EH the device was locked
>> via scsi_set_medium_removal(). And during SCSI EH the device might have
>> become unlocked, so we need to lock it again.
>> However, scsi_set_medium_removal() not only issues the
>> PREVENT_ALLOW_MEDIUM_REMOVAL command, but also sets the 'locked' flag
>> based on the result.
>> So if we fail to get a request here, shouldn't we unset the 'locked'
>> flag, too?
> 
> Probably not. My interpretation of the 'locked' flag is that it
> represents the door state before error handling began. The following
> code in the SCSI error handler restores the door state after a bus reset:
> 
> 	if (scsi_device_online(sdev) && sdev->was_reset && sdev->locked) {
> 		scsi_eh_lock_door(sdev);
> 		sdev->was_reset = 0;
> 	}
> 
>> And what does happen if we fail here? There is no return value, hence
>> SCSI EH might run to completion, and the system will continue
>> with an unlocked door ...
>> Not sure if that's a good idea.
> 
> How about applying the following patch on top of patch 5/9?
> 
> diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
> index 6de6e1bf3dcb..feac7262e40e 100644
> --- a/drivers/scsi/scsi_error.c
> +++ b/drivers/scsi/scsi_error.c
> @@ -1988,7 +1988,7 @@ static void eh_lock_door_done(struct request *req, blk_status_t status)
>    * 	We queue up an asynchronous "ALLOW MEDIUM REMOVAL" request on the
>    * 	head of the devices request queue, and continue.
>    */
> -static void scsi_eh_lock_door(struct scsi_device *sdev)
> +static int scsi_eh_lock_door(struct scsi_device *sdev)
>   {
>   	struct request *req;
>   	struct scsi_request *rq;
> @@ -2000,7 +2000,7 @@ static void scsi_eh_lock_door(struct scsi_device *sdev)
>   	req = blk_get_request(sdev->request_queue, REQ_OP_SCSI_IN,
>   			      BLK_MQ_REQ_NOWAIT);
>   	if (IS_ERR(req))
> -		return;
> +		return PTR_ERR(req);
>   	rq = scsi_req(req);
> 
>   	rq->cmd[0] = ALLOW_MEDIUM_REMOVAL;
> @@ -2016,6 +2016,7 @@ static void scsi_eh_lock_door(struct scsi_device *sdev)
>   	rq->retries = 5;
> 
>   	blk_execute_rq_nowait(req->q, NULL, req, 1, eh_lock_door_done);
> +	return 0;
>   }
> 
>   /**
> @@ -2037,8 +2038,8 @@ static void scsi_restart_operations(struct Scsi_Host *shost)
>   	 * is no point trying to lock the door of an off-line device.
>   	 */
>   	shost_for_each_device(sdev, shost) {
> -		if (scsi_device_online(sdev) && sdev->was_reset && sdev->locked) {
> -			scsi_eh_lock_door(sdev);
> +		if (scsi_device_online(sdev) && sdev->was_reset &&
> +		    sdev->locked && scsi_eh_lock_door(sdev) == 0) {
>   			sdev->was_reset = 0;
>   		}
>   	}
> 
I probably didn't make myself clear.
As per SBC (in this case, sbc3r36), the effects of
PREVENT_ALLOW_MEDIUM_REMOVAL are reset by a successful LUN Reset,
Hard Reset, Power-On Reset, or an I_T Nexus loss. This incidentally
maps nicely onto SCSI EH, so after a successful SCSI EH the door will be
unlocked (which is why we need to call scsi_eh_lock_door()).
In the SCSI midlayer this state is reflected by the 'locked' flag.
Now, if scsi_eh_lock_door() is _not_ executed due to a
blk_get_request() failure, the device remains unlocked, and as such the
'locked' flag would need to be _unset_.

So I was thinking more along these lines:

@@ -2030,7 +2037,8 @@ static void scsi_restart_operations(struct Scsi_Host *shost)
          */
         shost_for_each_device(sdev, shost) {
                 if (scsi_device_online(sdev) && sdev->was_reset && sdev->locked) {
-                       scsi_eh_lock_door(sdev);
+                       if (scsi_eh_lock_door(sdev) < 0)
+                               sdev->locked = 0;
                         sdev->was_reset = 0;
                 }
         }
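
To make the difference between the two variants concrete, here is a rough
userspace model, not kernel code. struct model_sdev, its door_locked field
and the alloc_ok parameter are made up for illustration. The asynchronous
lock command is treated as if it completed immediately, and -EBUSY is an
arbitrary stand-in for whatever error blk_get_request() would report.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the few scsi_device fields discussed here. */
struct model_sdev {
	bool online;
	bool was_reset;
	bool locked;		/* what the midlayer believes */
	bool door_locked;	/* what the device actually does (model only) */
};

/*
 * Models an int-returning scsi_eh_lock_door(). alloc_ok simulates whether
 * the non-blocking request allocation succeeded.
 */
static int model_lock_door(struct model_sdev *sdev, bool alloc_ok)
{
	assert(sdev);
	if (!alloc_ok)
		return -EBUSY;	/* assumed error code, for the model only */
	sdev->door_locked = true;
	return 0;
}

/* Variant A: only clear was_reset if the lock request could be queued. */
static void restart_variant_a(struct model_sdev *sdev, bool alloc_ok)
{
	if (sdev->online && sdev->was_reset && sdev->locked &&
	    model_lock_door(sdev, alloc_ok) == 0)
		sdev->was_reset = false;
}

/* Variant B: on failure, drop 'locked' so it matches the device state. */
static void restart_variant_b(struct model_sdev *sdev, bool alloc_ok)
{
	if (sdev->online && sdev->was_reset && sdev->locked) {
		if (model_lock_door(sdev, alloc_ok) < 0)
			sdev->locked = false;
		sdev->was_reset = false;
	}
}
```

Starting from online, was_reset and locked all set, and alloc_ok == false:
variant A ends with 'locked' still set although the door is open, and
was_reset stays set; variant B ends with 'locked' cleared, so the flag
matches what the device is doing. When the allocation succeeds both
variants behave the same.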


Cheers,

Hannes
-- 
Dr. Hannes Reinecke                Kernel Storage Architect
hare@suse.de                              +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), Geschäftsführer: Felix Imendörffer

Thread overview: 30+ messages
2020-11-30  2:46 [PATCH v4 0/9] Rework runtime suspend and SPI domain validation Bart Van Assche
2020-11-30  2:46 ` [PATCH v4 1/9] block: Fix a race in the runtime power management code Bart Van Assche
2020-11-30  2:46 ` [PATCH v4 2/9] block: Introduce BLK_MQ_REQ_PM Bart Van Assche
2020-12-01 11:31   ` Christoph Hellwig
2020-12-02  6:50   ` Hannes Reinecke
2020-11-30  2:46 ` [PATCH v4 3/9] ide: Do not set the RQF_PREEMPT flag for sense requests Bart Van Assche
2020-12-01 11:29   ` Christoph Hellwig
2020-12-02  6:52   ` Hannes Reinecke
2020-11-30  2:46 ` [PATCH v4 4/9] ide: Mark power management requests with RQF_PM instead of RQF_PREEMPT Bart Van Assche
2020-12-01 11:31   ` Christoph Hellwig
2020-12-02  6:53   ` Hannes Reinecke
2020-11-30  2:46 ` [PATCH v4 5/9] scsi: Do not wait for a request in scsi_eh_lock_door() Bart Van Assche
2020-12-02  7:06   ` Hannes Reinecke
2020-12-03  5:10     ` Bart Van Assche
2020-12-03  7:18       ` Hannes Reinecke [this message]
2020-12-03  7:27         ` Ming Lei
2020-12-04 16:50           ` Bart Van Assche
2020-11-30  2:46 ` [PATCH v4 6/9] scsi_transport_spi: Set RQF_PM for domain validation commands Bart Van Assche
2020-12-01 11:31   ` Christoph Hellwig
2020-11-30  2:46 ` [PATCH v4 7/9] scsi: Only process PM requests if rpm_status != RPM_ACTIVE Bart Van Assche
2020-12-01 11:32   ` Christoph Hellwig
2020-12-02  7:14   ` Hannes Reinecke
2020-11-30  2:46 ` [PATCH v4 8/9] block: Remove RQF_PREEMPT and BLK_MQ_REQ_PREEMPT Bart Van Assche
2020-12-01 11:33   ` Christoph Hellwig
2020-12-02  7:15   ` Hannes Reinecke
2020-11-30  2:46 ` [PATCH v4 9/9] block: Do not accept any requests while suspended Bart Van Assche
2020-12-02  7:16   ` Hannes Reinecke
2020-12-02  1:51 ` [PATCH v4 0/9] Rework runtime suspend and SPI domain validation Martin K. Petersen
2020-12-06  0:01   ` Jens Axboe
2020-12-08  1:56 ` Martin K. Petersen
