From: Alan Stern <stern@rowland.harvard.edu>
To: Martin Kepplinger <martin.kepplinger@puri.sm>
Cc: Bart Van Assche <bvanassche@acm.org>,
Can Guo <cang@codeaurora.org>,
linux-scsi@vger.kernel.org, linux-block@vger.kernel.org,
kernel@puri.sm
Subject: Re: [PATCH] block: Fix bug in runtime-resume handling
Date: Sat, 29 Aug 2020 14:56:53 -0400 [thread overview]
Message-ID: <20200829185653.GB501978@rowland.harvard.edu> (raw)
In-Reply-To: <6d22ec22-a0c7-6a9d-439e-38ef87b0207c@puri.sm>
On Sat, Aug 29, 2020 at 06:33:26PM +0200, Martin Kepplinger wrote:
> On 29.08.20 17:26, Alan Stern wrote:
> > Hmmm. I'm wondering about something you wrote back in June
> > (https://marc.info/?l=linux-scsi&m=159345778431615&w=2):
> >
> > blk_queue_enter() always - especially when sd is runtime
> > suspended and I try to mount as above - sets success to be true
> > for me, so never continues down to bkl_pm_request_resume(). All
> > I see is "PM: Removing info for No Bus:sda1".
> >
> > blk_queue_enter() would always set success to be true because pm
> > (derived from the BLK_MQ_REQ_PREEMPT flag) is true. But why was the
> > BLK_MQ_REQ_PREEMPT flag set? In other words, where was
> > blk_queue_enter() called from?
> >
> > Can you get a stack trace (i.e., call dump_stack()) at exactly this
> > point, that is, when pm is true and q->rpm_status is RPM_SUSPENDED? Or
> > do you already know the answer?
> >
> >
>
> I reverted any scsi/block out-of-tree fixes for this.
>
> when I try to mount, pm is TRUE (BLK_MQ_REQ_PREEMT set) and that's the
> first stack trace I get in this condition, inside of blk_queue_enter():
>
> There is more, but I don't know if that's interesting.
>
> [ 38.642202] CPU: 2 PID: 1522 Comm: mount Not tainted 5.8.0-1-librem5 #487
> [ 38.642207] Hardware name: Purism Librem 5r3 (DT)
> [ 38.642213] Call trace:
> [ 38.642233] dump_backtrace+0x0/0x210
> [ 38.642242] show_stack+0x20/0x30
> [ 38.642252] dump_stack+0xc8/0x128
> [ 38.642262] blk_queue_enter+0x1b8/0x2d8
> [ 38.642271] blk_mq_alloc_request+0x54/0xb0
> [ 38.642277] blk_get_request+0x34/0x78
> [ 38.642286] __scsi_execute+0x60/0x1c8
> [ 38.642291] scsi_test_unit_ready+0x88/0x118
> [ 38.642298] sd_check_events+0x110/0x158
> [ 38.642306] disk_check_events+0x68/0x188
> [ 38.642312] disk_clear_events+0x84/0x198
> [ 38.642320] check_disk_change+0x38/0x90
> [ 38.642325] sd_open+0x60/0x148
> [ 38.642330] __blkdev_get+0xcc/0x4c8
> [ 38.642335] __blkdev_get+0x278/0x4c8
> [ 38.642339] blkdev_get+0x128/0x1a8
> [ 38.642345] blkdev_open+0x98/0xb0
> [ 38.642354] do_dentry_open+0x130/0x3c8
> [ 38.642359] vfs_open+0x34/0x40
> [ 38.642366] path_openat+0xa30/0xe40
> [ 38.642372] do_filp_open+0x84/0x100
> [ 38.642377] do_sys_openat2+0x1f4/0x2b0
> [ 38.642382] do_sys_open+0x60/0xa8
> (...)
>
> and of course it doesn't work and /dev/sda1 disappears, see the initial
> discussion that led to your fix.
Great! That's exactly what I was looking for, thank you.
Bart, this is a perfect example of the potential race I've been talking
about in the other email thread. Suppose thread 0 is carrying out a
runtime suspend of a SCSI disk and at the same time, thread 1 is opening
the disk's block device (as we see in the stack trace here). Then we
could have the following:
Thread 0 Thread 1
-------- --------
Start runtime suspend
blk_pre_runtime_suspend calls
blk_set_pm_only and sets
q->rpm_status to RPM_SUSPENDING
Call sd_open -> ... -> scsi_test_unit_ready
-> __scsi_execute -> ...
-> blk_queue_enter
Sees BLK_MQ_REQ_PREEMPT set and
RPM_SUSPENDING queue status, so does
not postpone the request
blk_post_runtime_suspend sets
q->rpm_status to RPM_SUSPENDED
The drive goes into runtime suspend
Issues the TEST UNIT READY request
Request fails because the drive is suspended
One way to avoid this race is mutual exclusion: We could make sd_open
prevent the drive from being runtime suspended until it returns.
However I don't like this approach; it would mean tracking down every
possible pathway to __scsi_execute and making sure that runtime suspend
is blocked.
A more fine-grained approach would be to have __scsi_execute itself call
scsi_autopm_get/put_device whenever the rq_flags argument doesn't
contain RQF_PM. This way we wouldn't have to worry about missing any
possiible pathways. But it relies on an implicit assumption that
__scsi_execute is the only place where the PREEMPT flag gets set.
A third possibility is the approach I outlined before, adding a
BLK_MQ_REQ_PM flag. But to avoid the deadlock you pointed out, I would
make blk_queue_enter smarter about whether to postpone a request. The
logic would go like this:
If !blk_queue_pm_only:
Allow
If !BLK_MQ_REQ_PREEMPT:
Postpone
If q->rpm_status is RPM_ACTIVE:
Allow
If !BLK_MQ_REQ_PM:
Postpone
If q->rpm_status is RPM_SUSPENDED:
Postpone
Else:
Allow
The assumption here is that the PREEMPT flag is set whenever the PM flag
is.
I believe either the second or third possibility would work. The second
looks to be the simplest
What do you think?
Alan Stern
next prev parent reply other threads:[~2020-08-29 18:56 UTC|newest]
Thread overview: 68+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-06-23 11:10 [PATCH] scsi: sd: add runtime pm to open / release Martin Kepplinger
2020-06-24 13:33 ` Bart Van Assche
2020-06-25 8:16 ` Martin Kepplinger
2020-06-25 14:52 ` Alan Stern
2020-06-26 3:53 ` Bart Van Assche
2020-06-26 15:07 ` Bart Van Assche
2020-06-26 15:44 ` Alan Stern
2020-06-28 2:37 ` Bart Van Assche
2020-06-28 13:10 ` Alan Stern
2020-06-29 9:42 ` Martin Kepplinger
2020-06-29 16:15 ` Alan Stern
2020-06-29 16:56 ` Bart Van Assche
2020-06-29 17:40 ` Alan Stern
2020-06-30 3:33 ` Martin Kepplinger
2020-06-30 13:38 ` Alan Stern
2020-06-30 15:59 ` Bart Van Assche
2020-06-30 18:02 ` Alan Stern
2020-06-30 19:23 ` Bart Van Assche
2020-06-30 19:38 ` Alan Stern
2020-06-30 23:31 ` Bart Van Assche
2020-07-01 0:49 ` Alan Stern
2020-07-06 16:41 ` Alan Stern
2020-07-28 7:02 ` Martin Kepplinger
2020-07-28 20:02 ` Alan Stern
2020-07-29 14:12 ` Martin Kepplinger
2020-07-29 14:32 ` Alan Stern
2020-07-29 14:44 ` Martin K. Petersen
2020-07-29 14:56 ` Alan Stern
2020-07-29 14:46 ` James Bottomley
2020-07-29 14:53 ` James Bottomley
2020-07-29 15:40 ` Martin Kepplinger
2020-07-29 15:44 ` James Bottomley
2020-07-29 16:43 ` Martin Kepplinger
2020-07-29 18:25 ` Alan Stern
2020-07-29 18:29 ` James Bottomley
2020-07-30 8:52 ` Martin Kepplinger
2020-07-30 8:54 ` Martin Kepplinger
2020-07-30 15:10 ` Alan Stern
2020-08-04 9:39 ` Martin Kepplinger
2020-08-07 9:51 ` Martin Kepplinger
2020-08-07 14:30 ` Alan Stern
2020-08-08 6:59 ` Martin Kepplinger
2020-08-08 15:05 ` Alan Stern
2020-08-09 9:20 ` Martin Kepplinger
2020-08-09 15:26 ` Alan Stern
2020-08-10 12:03 ` Martin Kepplinger
2020-08-10 14:13 ` Alan Stern
2020-08-11 7:55 ` Martin Kepplinger
2020-08-11 13:48 ` Alan Stern
2020-08-23 14:57 ` [PATCH] block: Fix bug in runtime-resume handling Alan Stern
2020-08-24 17:48 ` Bart Van Assche
2020-08-24 20:13 ` Alan Stern
2020-08-26 7:48 ` Martin Kepplinger
2020-08-27 17:42 ` Martin Kepplinger
2020-08-27 20:29 ` Alan Stern
2020-08-29 7:24 ` Martin Kepplinger
2020-08-29 15:26 ` Alan Stern
2020-08-29 16:33 ` Martin Kepplinger
2020-08-29 18:56 ` Alan Stern [this message]
2020-08-30 0:38 ` Bart Van Assche
2020-08-30 1:06 ` Alan Stern
2020-07-29 15:40 ` [PATCH] scsi: sd: add runtime pm to open / release Alan Stern
2020-07-29 15:49 ` James Bottomley
2020-07-29 16:17 ` Alan Stern
2020-07-29 15:52 ` Martin Kepplinger
2020-07-29 18:10 ` Douglas Gilbert
2020-07-30 8:05 ` Martin Kepplinger
2020-07-30 15:14 ` Alan Stern
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20200829185653.GB501978@rowland.harvard.edu \
--to=stern@rowland.harvard.edu \
--cc=bvanassche@acm.org \
--cc=cang@codeaurora.org \
--cc=kernel@puri.sm \
--cc=linux-block@vger.kernel.org \
--cc=linux-scsi@vger.kernel.org \
--cc=martin.kepplinger@puri.sm \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).