From: Alan Stern <stern@rowland.harvard.edu>
To: Bart Van Assche <bvanassche@acm.org>
Cc: Stanley Chu <stanley.chu@mediatek.com>,
	Jens Axboe <axboe@kernel.dk>,
	linux-block@vger.kernel.org, Christoph Hellwig <hch@lst.de>,
	Ming Lei <ming.lei@redhat.com>, stable <stable@vger.kernel.org>,
	Can Guo <cang@codeaurora.org>,
	"Martin K. Petersen" <martin.petersen@oracle.com>,
	SCSI development list <linux-scsi@vger.kernel.org>
Subject: Re: [PATCH] block: Fix a race in the runtime power management code
Date: Fri, 28 Aug 2020 21:12:03 -0400
Message-ID: <20200829011203.GA486691@rowland.harvard.edu>
In-Reply-To: <31d6f204-21ae-88da-dbfc-3d7132f8bc03@acm.org>

On Fri, Aug 28, 2020 at 05:51:03PM -0700, Bart Van Assche wrote:
> On 2020-08-28 08:37, Alan Stern wrote:
> > On Thu, Aug 27, 2020 at 08:27:49PM -0700, Bart Van Assche wrote:
> >> On 2020-08-27 13:33, Alan Stern wrote:
> >>> It may not need to be that complicated.  What about something like this?
> > 
> >> I think this patch will break SCSI domain validation. The SCSI domain
> >> validation code calls scsi_device_quiesce() and that function in turn calls
> >> blk_set_pm_only(). The SCSI domain validation code submits SCSI commands with
> >> the BLK_MQ_REQ_PREEMPT flag. Since the above code postpones such requests
> >> while blk_set_pm_only() is in effect, I think the above patch will cause the
> >> SCSI domain validation code to deadlock.
> > 
> > Yes, you're right.
> > 
> > There may be an even simpler solution: Ensure that SCSI domain 
> > validation is mutually exclusive with runtime PM.  It's already mutually 
> > exclusive with system PM, so this makes sense.
> > 
> > What do you think of the patch below?
> > 
> > Alan Stern
> > 
> > 
> > Index: usb-devel/drivers/scsi/scsi_transport_spi.c
> > ===================================================================
> > --- usb-devel.orig/drivers/scsi/scsi_transport_spi.c
> > +++ usb-devel/drivers/scsi/scsi_transport_spi.c
> > @@ -1001,7 +1001,7 @@ spi_dv_device(struct scsi_device *sdev)
> >  	 * Because this function and the power management code both call
> >  	 * scsi_device_quiesce(), it is not safe to perform domain validation
> >  	 * while suspend or resume is in progress. Hence the
> > -	 * lock/unlock_system_sleep() calls.
> > +	 * lock/unlock_system_sleep() and scsi_autopm_get/put_device() calls.
> >  	 */
> >  	lock_system_sleep();
> >  
> > @@ -1018,10 +1018,13 @@ spi_dv_device(struct scsi_device *sdev)
> >  	if (unlikely(!buffer))
> >  		goto out_put;
> >  
> > +	if (scsi_autopm_get_device(sdev))
> > +		goto out_free;
> > +
> >  	/* We need to verify that the actual device will quiesce; the
> >  	 * later target quiesce is just a nice to have */
> >  	if (unlikely(scsi_device_quiesce(sdev)))
> > -		goto out_free;
> > +		goto out_autopm_put;
> >  
> >  	scsi_target_quiesce(starget);
> >  
> > @@ -1041,6 +1044,8 @@ spi_dv_device(struct scsi_device *sdev)
> >  
> >  	spi_initial_dv(starget) = 1;
> >  
> > + out_autopm_put:
> > +	scsi_autopm_put_device(sdev);
> >   out_free:
> >  	kfree(buffer);
> >   out_put:
> 
> Hi Alan,
> 
> I think this is only a part of the solution. scsi_target_quiesce() invokes
> scsi_device_quiesce() and that function in turn calls blk_set_pm_only(). So
> I think that the above is not sufficient to fix the deadlock mentioned in
> my previous email.

Sorry, it sounds like you misinterpreted my preceding email.  I meant to 
suggest that the patch above should be considered _instead of_ the patch 
that introduced BLK_MQ_REQ_PM.  So blk_queue_enter() would remain 
unchanged, using the BLK_MQ_REQ_PREEMPT flag to decide whether or not to 
postpone new requests.  Thus the deadlock you're concerned about would 
not arise.
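
As a rough illustration of the gating rule described above, here is a small
user-space model. It is only a sketch, not the kernel's blk_queue_enter():
the names model_queue, MODEL_REQ_PREEMPT and model_queue_may_enter() are
hypothetical stand-ins, and queue freezing, waiting and resume requests are
all left out. It shows one thing: while "pm only" mode is in effect, a
request is admitted only if it carries the preempt flag, which is why
domain validation commands submitted with BLK_MQ_REQ_PREEMPT would not be
postponed.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the BLK_MQ_REQ_PREEMPT request flag. */
#define MODEL_REQ_PREEMPT 0x1u

/* Hypothetical stand-in for the relevant part of a request queue. */
struct model_queue {
	int pm_only;	/* nonzero while "pm only" mode is in effect */
};

/*
 * Returns true if a new request may enter the queue now, false if it
 * would have to be postponed. Simplified: the real code also deals with
 * queue freezing and sleeps instead of returning false.
 */
static bool model_queue_may_enter(const struct model_queue *q,
				  unsigned int flags)
{
	if (q->pm_only == 0)
		return true;	/* normal operation: everything passes */

	/* pm-only mode: only preempt requests are let through */
	return (flags & MODEL_REQ_PREEMPT) != 0;
}
```

In this model an ordinary request (flags == 0) is held back once pm_only is
set, while a request carrying MODEL_REQ_PREEMPT still gets through.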

> Maybe it is possible to fix this by creating a new request queue and by
> submitting the DV SCSI commands to that new request queue. There may be
> better solutions.

I don't think that is necessary.  After all, quiescing is quiescing, 
whether it is done for runtime power management, domain validation, or 
anything else.  What we need to avoid is one thread trying to keep a 
device quiescent at the same time that another thread is re-activating 
it.

To some extent the SCSI core addresses this by allowing only one thread 
to put a device into the SDEV_QUIESCE state at a time.  Making domain 
validation mutually exclusive with runtime suspend is my attempt to 
prevent any odd corner-case interactions from cropping up.
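
The mutual exclusion relies on the runtime PM usage count: while a caller
holds a reference obtained through scsi_autopm_get_device(), the device is
not runtime-suspended. The following single-threaded user-space sketch
models just that rule. Everything in it (model_dev, model_autopm_get(),
model_autopm_put(), model_try_runtime_suspend()) is a hypothetical
stand-in, not kernel API, and it omits locking, error paths and
asynchronous suspend.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the runtime PM state of a device. */
struct model_dev {
	int usage_count;	/* outstanding "keep it active" references */
	bool suspended;		/* true while runtime-suspended */
};

/*
 * Models taking a runtime PM reference: bump the count and make sure
 * the device is active. Always succeeds in this sketch (returns 0).
 */
static int model_autopm_get(struct model_dev *d)
{
	d->usage_count++;
	d->suspended = false;
	return 0;
}

/* Models dropping the reference taken by model_autopm_get(). */
static void model_autopm_put(struct model_dev *d)
{
	d->usage_count--;
}

/*
 * Models a runtime suspend attempt: refused while any reference is
 * held, so it cannot overlap with the section between get and put.
 */
static bool model_try_runtime_suspend(struct model_dev *d)
{
	if (d->usage_count > 0)
		return false;
	d->suspended = true;
	return true;
}
```

In terms of the patch, domain validation would sit between the get and the
put, so a suspend attempt made during that window is refused and can only
succeed after the reference has been dropped.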

Alan Stern

Thread overview: 16+ messages
2020-08-24  3:06 [PATCH] block: Fix a race in the runtime power management code Bart Van Assche
2020-08-24 14:47 ` Alan Stern
2020-08-25  9:01 ` Stanley Chu
2020-08-25  9:11 ` Stanley Chu
2020-08-25 18:24   ` Alan Stern
2020-08-25 22:22     ` Bart Van Assche
2020-08-26  1:51       ` Alan Stern
2020-08-27  3:35         ` Bart Van Assche
2020-08-27 20:33           ` Alan Stern
2020-08-28  3:27             ` Bart Van Assche
2020-08-28 15:37               ` Alan Stern
2020-08-29  0:51                 ` Bart Van Assche
2020-08-29  1:12                   ` Alan Stern [this message]
2020-08-29  2:57                     ` Bart Van Assche
2020-08-26  2:58   ` Bart Van Assche
2020-08-26  4:00     ` Stanley Chu
