* [PATCH] block: Fix bug in runtime-resume handling
From: Alan Stern @ 2020-08-23 14:57 UTC (permalink / raw)
To: Jens Axboe, Martin Kepplinger
Cc: Bart Van Assche, Can Guo, linux-scsi, linux-block, kernel
Runtime power-management support in the block layer has somehow gotten
messed up in the past few years. This area of code has been through a
lot of churn and it's not easy to track exactly when the bug addressed
here was introduced; it was present for a while, then fixed for a
while, and then it returned.
At any rate, the problem is that the block layer code thinks that it's
okay to issue a request with the BLK_MQ_REQ_PREEMPT flag set at any
time, even when the queue and underlying device are runtime suspended.
This belief is wrong; the flag merely indicates that it's okay to
issue the request while the queue and device are in the process of
suspending or resuming. When they are actually suspended, no requests
may be issued.
The symptom of this bug is that a runtime-suspended block device
doesn't resume as it should. The request which should cause a runtime
resume instead gets issued directly, without resuming the device
first. Of course the device can't handle it properly, the I/O
fails, and the device remains suspended.
The problem is fixed by checking that the queue's runtime-PM status
isn't RPM_SUSPENDED before allowing a request to be issued, and
queuing a runtime-resume request if it is. In particular, the inline
blk_pm_request_resume() routine is renamed blk_pm_resume_queue() and
the code is unified by merging the surrounding checks into the
routine. If the queue isn't set up for runtime PM, or there currently
is no restriction on allowed requests, the request is allowed.
Likewise if the BLK_MQ_REQ_PREEMPT flag is set and the status isn't
RPM_SUSPENDED. Otherwise a runtime resume is queued and the request
is blocked until conditions are more suitable.
Reported-and-tested-by: Martin Kepplinger <martin.kepplinger@puri.sm>
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
CC: Bart Van Assche <bvanassche@acm.org>
CC: <stable@vger.kernel.org> # 5.2
---
The bug goes back way before 5.2, but other changes will prevent the
patch from applying directly to earlier kernels, so I'm limiting the
@stable updates to 5.2 and after.
[as1941]
block/blk-core.c | 6 +++---
block/blk-pm.h | 14 +++++++++-----
2 files changed, 12 insertions(+), 8 deletions(-)
Index: usb-devel/block/blk-core.c
===================================================================
--- usb-devel.orig/block/blk-core.c
+++ usb-devel/block/blk-core.c
@@ -423,7 +423,8 @@ int blk_queue_enter(struct request_queue
* responsible for ensuring that that counter is
* globally visible before the queue is unfrozen.
*/
- if (pm || !blk_queue_pm_only(q)) {
+ if ((pm && q->rpm_status != RPM_SUSPENDED) ||
+ !blk_queue_pm_only(q)) {
success = true;
} else {
percpu_ref_put(&q->q_usage_counter);
@@ -448,8 +449,7 @@ int blk_queue_enter(struct request_queue
wait_event(q->mq_freeze_wq,
(!q->mq_freeze_depth &&
- (pm || (blk_pm_request_resume(q),
- !blk_queue_pm_only(q)))) ||
+ blk_pm_resume_queue(pm, q)) ||
blk_queue_dying(q));
if (blk_queue_dying(q))
return -ENODEV;
Index: usb-devel/block/blk-pm.h
===================================================================
--- usb-devel.orig/block/blk-pm.h
+++ usb-devel/block/blk-pm.h
@@ -6,11 +6,14 @@
#include <linux/pm_runtime.h>
#ifdef CONFIG_PM
-static inline void blk_pm_request_resume(struct request_queue *q)
+static inline int blk_pm_resume_queue(const bool pm, struct request_queue *q)
{
- if (q->dev && (q->rpm_status == RPM_SUSPENDED ||
- q->rpm_status == RPM_SUSPENDING))
- pm_request_resume(q->dev);
+ if (!q->dev || !blk_queue_pm_only(q))
+ return 1; /* Nothing to do */
+ if (pm && q->rpm_status != RPM_SUSPENDED)
+ return 1; /* Request allowed */
+ pm_request_resume(q->dev);
+ return 0;
}
static inline void blk_pm_mark_last_busy(struct request *rq)
@@ -44,8 +47,9 @@ static inline void blk_pm_put_request(st
--rq->q->nr_pending;
}
#else
-static inline void blk_pm_request_resume(struct request_queue *q)
+static inline int blk_pm_resume_queue(const bool pm, struct request_queue *q)
{
+ return 1;
}
static inline void blk_pm_mark_last_busy(struct request *rq)
* Re: [PATCH] block: Fix bug in runtime-resume handling
From: Bart Van Assche @ 2020-08-24 17:48 UTC (permalink / raw)
To: Alan Stern, Jens Axboe, Martin Kepplinger
Cc: Can Guo, linux-scsi, linux-block, kernel
On 2020-08-23 07:57, Alan Stern wrote:
> The problem is fixed by checking that the queue's runtime-PM status
> isn't RPM_SUSPENDED before allowing a request to be issued, [ ... ]
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Can, can you help with testing this patch?
Thanks,
Bart.
* Re: [PATCH] block: Fix bug in runtime-resume handling
From: Alan Stern @ 2020-08-24 20:13 UTC (permalink / raw)
To: Martin Kepplinger
Cc: Bart Van Assche, Can Guo, linux-scsi, linux-block, kernel
On Mon, Aug 24, 2020 at 10:48:32AM -0700, Bart Van Assche wrote:
> On 2020-08-23 07:57, Alan Stern wrote:
> > The problem is fixed by checking that the queue's runtime-PM status
> > isn't RPM_SUSPENDED before allowing a request to be issued, [ ... ]
>
> Reviewed-by: Bart Van Assche <bvanassche@acm.org>
>
> Can, can you help with testing this patch?
>
> Thanks,
>
> Bart.
Martin:
(I forgot to ask this question several weeks ago, while you were running
your tests. Better ask it now before I forget again...)
I suspect the old runtime-PM code in the block layer would have worked
okay in your SD cardreader test if the BLK_MQ_REQ_PREEMPT flag had not
been set. Do you know why the flag was set, or what line of code caused
it to be set?
As I recall, the request that failed was a read-ahead. It's not clear
to me why such a request would need to have BLK_MQ_REQ_PREEMPT set.
Although there may be other reasons for having that flag, at this point
we're concerned with requests that are part of the runtime-resume
procedure. A read-ahead does not fall into that category.
Do you think you can find the answer? Perhaps we should add a separate
BLK_MQ_REQ_PM flag.
Alan Stern
* Re: [PATCH] block: Fix bug in runtime-resume handling
From: Martin Kepplinger @ 2020-08-26 7:48 UTC (permalink / raw)
To: Alan Stern; +Cc: Bart Van Assche, Can Guo, linux-scsi, linux-block, kernel
On 24.08.20 22:13, Alan Stern wrote:
> On Mon, Aug 24, 2020 at 10:48:32AM -0700, Bart Van Assche wrote:
>> On 2020-08-23 07:57, Alan Stern wrote:
>>> The problem is fixed by checking that the queue's runtime-PM status
>>> isn't RPM_SUSPENDED before allowing a request to be issued, [ ... ]
>>
>> Reviewed-by: Bart Van Assche <bvanassche@acm.org>
>>
>> Can, can you help with testing this patch?
>>
>> Thanks,
>>
>> Bart.
>
> Martin:
>
> (I forgot to ask this question several weeks ago, while you were running
> your tests. Better ask it now before I forget again...)
>
> I suspect the old runtime-PM code in the block layer would have worked
> okay in your SD cardreader test if the BLK_MQ_REQ_PREEMPT flag had not
> been set. Do you know why the flag was set, or what line of code caused
> it to be set?
Correct. If not set, I could handle all I need in the SCSI error path.
>
> As I recall, the request that failed was a read-ahead. It's not clear
> to me why such a request would need to have BLK_MQ_REQ_PREEMPT set.
> Although there may be other reasons for having that flag, at this point
> we're concerned with requests that are part of the runtime-resume
> procedure. A read-ahead does not fall into that category.
>
> Do you think you can find the answer? Perhaps we should add a separate
> BLK_MQ_REQ_PM flag.
I guess I can, but give me some time as I'm currently on vacation.
>
> Alan Stern
>
* Re: [PATCH] block: Fix bug in runtime-resume handling
From: Martin Kepplinger @ 2020-08-27 17:42 UTC (permalink / raw)
To: Alan Stern; +Cc: Bart Van Assche, Can Guo, linux-scsi, linux-block, kernel
On 26.08.20 09:48, Martin Kepplinger wrote:
> On 24.08.20 22:13, Alan Stern wrote:
>> On Mon, Aug 24, 2020 at 10:48:32AM -0700, Bart Van Assche wrote:
>>> On 2020-08-23 07:57, Alan Stern wrote:
>>>> The problem is fixed by checking that the queue's runtime-PM status
>>>> isn't RPM_SUSPENDED before allowing a request to be issued, [ ... ]
>>>
>>> Reviewed-by: Bart Van Assche <bvanassche@acm.org>
>>>
>>> Can, can you help with testing this patch?
>>>
>>> Thanks,
>>>
>>> Bart.
>>
>> Martin:
>>
>> (I forgot to ask this question several weeks ago, while you were running
>> your tests. Better ask it now before I forget again...)
>>
>> I suspect the old runtime-PM code in the block layer would have worked
>> okay in your SD cardreader test if the BLK_MQ_REQ_PREEMPT flag had not
>> been set. Do you know why the flag was set, or what line of code caused
>> it to be set?
>
> Correct. if not set, I could handle all I need in the scsi error path.
This thread is becoming a bit confusing. I was thinking of REQ_FAILFAST_DEV,
but you're talking about something different.
The only place I see BLK_MQ_REQ_PREEMPT getting passed on is in
__scsi_execute(), which is what happens when mounting/unmounting. At least,
that's about the only place I can find.
I remember *only* your block pm fix would let me mount/unmount, but not
use files yet (REQ_FAILFAST_DEV and so on).
When I revert your fix and remove BLK_MQ_REQ_PREEMPT from being passed
on to blk_get_request() in __scsi_execute(), that line gets executed
exactly once during startup, and the /dev/sda device from the cardreader
then goes missing.
Is this what you're asking?
>
>>
>> As I recall, the request that failed was a read-ahead. It's not clear
>> to me why such a request would need to have BLK_MQ_REQ_PREEMPT set.
>> Although there may be other reasons for having that flag, at this point
>> we're concerned with requests that are part of the runtime-resume
>> procedure. A read-ahead does not fall into that category.
>>
>> Do you think you can find the answer? Perhaps we should add a separate
>> BLK_MQ_REQ_PM flag.
* Re: [PATCH] block: Fix bug in runtime-resume handling
From: Alan Stern @ 2020-08-27 20:29 UTC (permalink / raw)
To: Martin Kepplinger
Cc: Bart Van Assche, Can Guo, linux-scsi, linux-block, kernel
On Thu, Aug 27, 2020 at 07:42:43PM +0200, Martin Kepplinger wrote:
> On 26.08.20 09:48, Martin Kepplinger wrote:
> > On 24.08.20 22:13, Alan Stern wrote:
> >> Martin:
> >>
> >> (I forgot to ask this question several weeks ago, while you were running
> >> your tests. Better ask it now before I forget again...)
> >>
> >> I suspect the old runtime-PM code in the block layer would have worked
> >> okay in your SD cardreader test if the BLK_MQ_REQ_PREEMPT flag had not
> >> been set. Do you know why the flag was set, or what line of code caused
> >> it to be set?
> >
> > Correct. if not set, I could handle all I need in the scsi error path.
>
> this thread becomes a bit confusing. I thought about REQ_FAILFAST_DEV
> but you're talking about something different.
>
> the only place I see BLK_MQ_REQ_PREEMPT getting passed on is in
> __scsi_execute() which is the case when mounting/unmounting. At least
> that about the only place I can find.
Ah yes, I see what you mean.
> I remember *only* your block pm fix would let me mount/unmount, but not
> use files yet (REQ_FAILFAST_DEV and so on).
>
> When I revert your fix and remove BLK_MQ_REQ_PREEMPT from being passed
> on to blk_get_request() in __scsi_execute(), that line gets executed
> exactly once during startup and I'm missing the /dev/sda device from the
> cardreader then.
>
> Is this what you're asking?
Not quite sure, but it doesn't matter. Removing BLK_MQ_REQ_PREEMPT in
__scsi_execute() is probably not a safe thing to do.
Instead, look at sd_resume(). That routine calls __scsi_execute()
indirectly through sd_start_stop_device(), and the only reason it does
this is because the sdkp->device->manage_start_stop flag is set. You
ought to be able to clear this flag in sysfs, by writing to
/sys/block/sda/device/scsi_disk/*/manage_start_stop. If you do this
before allowing the card reader to go into runtime suspend, does it then
resume okay?
(Yes, I know you still won't be able to read it because of the FAILFAST
flag. I just want to know if the runtime resume actually takes place.)
Alan Stern
* Re: [PATCH] block: Fix bug in runtime-resume handling
From: Martin Kepplinger @ 2020-08-29 7:24 UTC (permalink / raw)
To: Alan Stern; +Cc: Bart Van Assche, Can Guo, linux-scsi, linux-block, kernel
On 27.08.20 22:29, Alan Stern wrote:
> On Thu, Aug 27, 2020 at 07:42:43PM +0200, Martin Kepplinger wrote:
>> On 26.08.20 09:48, Martin Kepplinger wrote:
>>> On 24.08.20 22:13, Alan Stern wrote:
>
>>>> Martin:
>>>>
>>>> (I forgot to ask this question several weeks ago, while you were running
>>>> your tests. Better ask it now before I forget again...)
>>>>
>>>> I suspect the old runtime-PM code in the block layer would have worked
>>>> okay in your SD cardreader test if the BLK_MQ_REQ_PREEMPT flag had not
>>>> been set. Do you know why the flag was set, or what line of code caused
>>>> it to be set?
>>>
>>> Correct. if not set, I could handle all I need in the scsi error path.
>>
>> this thread becomes a bit confusing. I thought about REQ_FAILFAST_DEV
>> but you're talking about something different.
>>
>> the only place I see BLK_MQ_REQ_PREEMPT getting passed on is in
>> __scsi_execute() which is the case when mounting/unmounting. At least
>> that about the only place I can find.
>
> Ah yes, I see what you mean.
>
>> I remember *only* your block pm fix would let me mount/unmount, but not
>> use files yet (REQ_FAILFAST_DEV and so on).
>>
>> When I revert your fix and remove BLK_MQ_REQ_PREEMPT from being passed
>> on to blk_get_request() in __scsi_execute(), that line gets executed
>> exactly once during startup and I'm missing the /dev/sda device from the
>> cardreader then.
>>
>> Is this what you're asking?
>
> Not quite sure, but it doesn't matter. Removing BLK_MQ_REQ_PREEMPT in
> __scsi_execute() is probably not a safe thing to do.
>
> Instead, look at sd_resume(). That routine calls __scsi_execute()
> indirectly through sd_start_stop_device(), and the only reason it does
> this is because the sdkp->device->manage_start_stop flag is set. You
> ought to be able to clear this flag in sysfs, by writing to
> /sys/block/sda/device/scsi_disk/*/manage_start_stop. If you do this
> before allowing the card reader to go into runtime suspend, does it then
> resume okay?
manage_start_stop in sysfs is 0 here.
>
> (Yes, I know you still won't be able to read it because of the FAILFAST
> flag. I just want to know if the runtime resume actually takes place.)
>
> Alan Stern
>
* Re: [PATCH] block: Fix bug in runtime-resume handling
From: Alan Stern @ 2020-08-29 15:26 UTC (permalink / raw)
To: Martin Kepplinger
Cc: Bart Van Assche, Can Guo, linux-scsi, linux-block, kernel
On Sat, Aug 29, 2020 at 09:24:30AM +0200, Martin Kepplinger wrote:
> On 27.08.20 22:29, Alan Stern wrote:
> > Instead, look at sd_resume(). That routine calls __scsi_execute()
> > indirectly through sd_start_stop_device(), and the only reason it does
> > this is because the sdkp->device->manage_start_stop flag is set. You
> > ought to be able to clear this flag in sysfs, by writing to
> > /sys/block/sda/device/scsi_disk/*/manage_start_stop. If you do this
> > before allowing the card reader to go into runtime suspend, does it then
> > resume okay?
>
> manage_start_stop in sysfs is 0 here.
Hmmm. I'm wondering about something you wrote back in June
(https://marc.info/?l=linux-scsi&m=159345778431615&w=2):
blk_queue_enter() always - especially when sd is runtime
suspended and I try to mount as above - sets success to be true
for me, so never continues down to blk_pm_request_resume(). All
I see is "PM: Removing info for No Bus:sda1".
blk_queue_enter() would always set success to be true because pm
(derived from the BLK_MQ_REQ_PREEMPT flag) is true. But why was the
BLK_MQ_REQ_PREEMPT flag set? In other words, where was
blk_queue_enter() called from?
Can you get a stack trace (i.e., call dump_stack()) at exactly this
point, that is, when pm is true and q->rpm_status is RPM_SUSPENDED? Or
do you already know the answer?
Alan Stern
* Re: [PATCH] block: Fix bug in runtime-resume handling
From: Martin Kepplinger @ 2020-08-29 16:33 UTC (permalink / raw)
To: Alan Stern; +Cc: Bart Van Assche, Can Guo, linux-scsi, linux-block, kernel
On 29.08.20 17:26, Alan Stern wrote:
> On Sat, Aug 29, 2020 at 09:24:30AM +0200, Martin Kepplinger wrote:
>> On 27.08.20 22:29, Alan Stern wrote:
>>> Instead, look at sd_resume(). That routine calls __scsi_execute()
>>> indirectly through sd_start_stop_device(), and the only reason it does
>>> this is because the sdkp->device->manage_start_stop flag is set. You
>>> ought to be able to clear this flag in sysfs, by writing to
>>> /sys/block/sda/device/scsi_disk/*/manage_start_stop. If you do this
>>> before allowing the card reader to go into runtime suspend, does it then
>>> resume okay?
>>
>> manage_start_stop in sysfs is 0 here.
>
> Hmmm. I'm wondering about something you wrote back in June
> (https://marc.info/?l=linux-scsi&m=159345778431615&w=2):
>
> blk_queue_enter() always - especially when sd is runtime
> suspended and I try to mount as above - sets success to be true
> for me, so never continues down to blk_pm_request_resume(). All
> I see is "PM: Removing info for No Bus:sda1".
>
> blk_queue_enter() would always set success to be true because pm
> (derived from the BLK_MQ_REQ_PREEMPT flag) is true. But why was the
> BLK_MQ_REQ_PREEMPT flag set? In other words, where was
> blk_queue_enter() called from?
>
> Can you get a stack trace (i.e., call dump_stack()) at exactly this
> point, that is, when pm is true and q->rpm_status is RPM_SUSPENDED? Or
> do you already know the answer?
>
>
I reverted all scsi/block out-of-tree fixes for this.
When I try to mount, pm is TRUE (BLK_MQ_REQ_PREEMPT set), and this is the
first stack trace I get in this condition, inside blk_queue_enter().
There is more, but I don't know if that's interesting:
[ 38.642202] CPU: 2 PID: 1522 Comm: mount Not tainted 5.8.0-1-librem5 #487
[ 38.642207] Hardware name: Purism Librem 5r3 (DT)
[ 38.642213] Call trace:
[ 38.642233] dump_backtrace+0x0/0x210
[ 38.642242] show_stack+0x20/0x30
[ 38.642252] dump_stack+0xc8/0x128
[ 38.642262] blk_queue_enter+0x1b8/0x2d8
[ 38.642271] blk_mq_alloc_request+0x54/0xb0
[ 38.642277] blk_get_request+0x34/0x78
[ 38.642286] __scsi_execute+0x60/0x1c8
[ 38.642291] scsi_test_unit_ready+0x88/0x118
[ 38.642298] sd_check_events+0x110/0x158
[ 38.642306] disk_check_events+0x68/0x188
[ 38.642312] disk_clear_events+0x84/0x198
[ 38.642320] check_disk_change+0x38/0x90
[ 38.642325] sd_open+0x60/0x148
[ 38.642330] __blkdev_get+0xcc/0x4c8
[ 38.642335] __blkdev_get+0x278/0x4c8
[ 38.642339] blkdev_get+0x128/0x1a8
[ 38.642345] blkdev_open+0x98/0xb0
[ 38.642354] do_dentry_open+0x130/0x3c8
[ 38.642359] vfs_open+0x34/0x40
[ 38.642366] path_openat+0xa30/0xe40
[ 38.642372] do_filp_open+0x84/0x100
[ 38.642377] do_sys_openat2+0x1f4/0x2b0
[ 38.642382] do_sys_open+0x60/0xa8
(...)
And of course it doesn't work and /dev/sda1 disappears; see the initial
discussion that led to your fix.
* Re: [PATCH] block: Fix bug in runtime-resume handling
From: Alan Stern @ 2020-08-29 18:56 UTC (permalink / raw)
To: Martin Kepplinger
Cc: Bart Van Assche, Can Guo, linux-scsi, linux-block, kernel
On Sat, Aug 29, 2020 at 06:33:26PM +0200, Martin Kepplinger wrote:
> On 29.08.20 17:26, Alan Stern wrote:
> > Hmmm. I'm wondering about something you wrote back in June
> > (https://marc.info/?l=linux-scsi&m=159345778431615&w=2):
> >
> > blk_queue_enter() always - especially when sd is runtime
> > suspended and I try to mount as above - sets success to be true
> > for me, so never continues down to blk_pm_request_resume(). All
> > I see is "PM: Removing info for No Bus:sda1".
> >
> > blk_queue_enter() would always set success to be true because pm
> > (derived from the BLK_MQ_REQ_PREEMPT flag) is true. But why was the
> > BLK_MQ_REQ_PREEMPT flag set? In other words, where was
> > blk_queue_enter() called from?
> >
> > Can you get a stack trace (i.e., call dump_stack()) at exactly this
> > point, that is, when pm is true and q->rpm_status is RPM_SUSPENDED? Or
> > do you already know the answer?
> >
> >
>
> I reverted any scsi/block out-of-tree fixes for this.
>
> when I try to mount, pm is TRUE (BLK_MQ_REQ_PREEMPT set) and that's the
> first stack trace I get in this condition, inside of blk_queue_enter():
>
> There is more, but I don't know if that's interesting.
>
> [ 38.642202] CPU: 2 PID: 1522 Comm: mount Not tainted 5.8.0-1-librem5 #487
> [ 38.642207] Hardware name: Purism Librem 5r3 (DT)
> [ 38.642213] Call trace:
> [ 38.642233] dump_backtrace+0x0/0x210
> [ 38.642242] show_stack+0x20/0x30
> [ 38.642252] dump_stack+0xc8/0x128
> [ 38.642262] blk_queue_enter+0x1b8/0x2d8
> [ 38.642271] blk_mq_alloc_request+0x54/0xb0
> [ 38.642277] blk_get_request+0x34/0x78
> [ 38.642286] __scsi_execute+0x60/0x1c8
> [ 38.642291] scsi_test_unit_ready+0x88/0x118
> [ 38.642298] sd_check_events+0x110/0x158
> [ 38.642306] disk_check_events+0x68/0x188
> [ 38.642312] disk_clear_events+0x84/0x198
> [ 38.642320] check_disk_change+0x38/0x90
> [ 38.642325] sd_open+0x60/0x148
> [ 38.642330] __blkdev_get+0xcc/0x4c8
> [ 38.642335] __blkdev_get+0x278/0x4c8
> [ 38.642339] blkdev_get+0x128/0x1a8
> [ 38.642345] blkdev_open+0x98/0xb0
> [ 38.642354] do_dentry_open+0x130/0x3c8
> [ 38.642359] vfs_open+0x34/0x40
> [ 38.642366] path_openat+0xa30/0xe40
> [ 38.642372] do_filp_open+0x84/0x100
> [ 38.642377] do_sys_openat2+0x1f4/0x2b0
> [ 38.642382] do_sys_open+0x60/0xa8
> (...)
>
> and of course it doesn't work and /dev/sda1 disappears, see the initial
> discussion that led to your fix.
Great! That's exactly what I was looking for, thank you.
Bart, this is a perfect example of the potential race I've been talking
about in the other email thread. Suppose thread 0 is carrying out a
runtime suspend of a SCSI disk and at the same time, thread 1 is opening
the disk's block device (as we see in the stack trace here). Then we
could have the following:
Thread 0                                Thread 1
--------                                --------
Start runtime suspend
blk_pre_runtime_suspend calls
blk_set_pm_only and sets
q->rpm_status to RPM_SUSPENDING
                                        Call sd_open -> ... ->
                                        scsi_test_unit_ready ->
                                        __scsi_execute -> ... ->
                                        blk_queue_enter
                                        Sees BLK_MQ_REQ_PREEMPT set and
                                        RPM_SUSPENDING queue status, so
                                        does not postpone the request
blk_post_runtime_suspend sets
q->rpm_status to RPM_SUSPENDED
The drive goes into runtime suspend
                                        Issues the TEST UNIT READY request
                                        Request fails because the drive
                                        is suspended
One way to avoid this race is mutual exclusion: We could make sd_open
prevent the drive from being runtime suspended until it returns.
However I don't like this approach; it would mean tracking down every
possible pathway to __scsi_execute and making sure that runtime suspend
is blocked.
A more fine-grained approach would be to have __scsi_execute itself call
scsi_autopm_get/put_device whenever the rq_flags argument doesn't
contain RQF_PM. This way we wouldn't have to worry about missing any
possible pathways. But it relies on an implicit assumption that
__scsi_execute is the only place where the PREEMPT flag gets set.
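To make the second approach concrete, here is a toy standalone model, not
kernel code: the names echo RQF_PM and scsi_autopm_get_device() /
scsi_autopm_put_device(), but the usage counter and helper functions below
are simplified stand-ins I invented for illustration.

```c
#include <assert.h>
#include <stdbool.h>

#define RQF_PM (1u << 0)   /* stand-in for the kernel's RQF_PM flag */

static int pm_usage;       /* toy runtime-PM usage count; > 0 keeps the
                              device resumed and blocks re-suspend */

static void autopm_get(void) { pm_usage++; }  /* models scsi_autopm_get_device */
static void autopm_put(void) { pm_usage--; }  /* models scsi_autopm_put_device */

/* Toy __scsi_execute: every request that is not itself part of runtime
 * PM first resumes and pins the device, so it can never be issued to a
 * suspended device.  Returns nonzero if the device was guaranteed awake
 * when the request was issued (pinned, or a PM request handled by the
 * PM core itself). */
static int scsi_execute_model(unsigned int rq_flags)
{
	bool pin = !(rq_flags & RQF_PM);
	int issued_awake;

	if (pin)
		autopm_get();              /* resume + hold the device */

	issued_awake = (pm_usage > 0) || (rq_flags & RQF_PM);
	/* ... the request itself would be issued here ... */

	if (pin)
		autopm_put();              /* allow suspend again */
	return issued_awake;
}
```

The point of the model is that the pin/unpin pair brackets the request, so
no path into the toy __scsi_execute has to remember to resume the device
itself.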
A third possibility is the approach I outlined before, adding a
BLK_MQ_REQ_PM flag. But to avoid the deadlock you pointed out, I would
make blk_queue_enter smarter about whether to postpone a request. The
logic would go like this:
If !blk_queue_pm_only:
        Allow
If !BLK_MQ_REQ_PREEMPT:
        Postpone
If q->rpm_status is RPM_ACTIVE:
        Allow
If !BLK_MQ_REQ_PM:
        Postpone
If q->rpm_status is RPM_SUSPENDED:
        Postpone
Else:
        Allow
The assumption here is that the PREEMPT flag is set whenever the PM flag
is.
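The decision logic above can be sketched as a standalone model, not kernel
code: REQ_PREEMPT and REQ_PM below stand in for BLK_MQ_REQ_PREEMPT and the
proposed BLK_MQ_REQ_PM, and pm_only stands in for blk_queue_pm_only(q).

```c
#include <assert.h>
#include <stdbool.h>

enum rpm_status { RPM_ACTIVE, RPM_SUSPENDING, RPM_SUSPENDED };

#define REQ_PREEMPT (1u << 0)   /* models BLK_MQ_REQ_PREEMPT */
#define REQ_PM      (1u << 1)   /* models the proposed BLK_MQ_REQ_PM */

/* Returns true if the request may be issued now, false if it must be
 * postponed until the queue's runtime-PM state changes. */
static bool may_enter_queue(bool pm_only, unsigned int flags,
                            enum rpm_status status)
{
	if (!pm_only)
		return true;            /* no PM restriction in effect */
	if (!(flags & REQ_PREEMPT))
		return false;           /* ordinary request: postpone */
	if (status == RPM_ACTIVE)
		return true;            /* device fully awake */
	if (!(flags & REQ_PM))
		return false;           /* PREEMPT without PM: postpone */
	return status != RPM_SUSPENDED; /* PM request: ok unless the
	                                   device is fully suspended */
}
```

Note how this closes the race from the timeline above: a PREEMPT-only
request (like the TEST UNIT READY from sd_open) is postponed as soon as the
queue is RPM_SUSPENDING, while genuine PM requests still get through.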
I believe either the second or third possibility would work. The second
looks to be the simplest.
What do you think?
Alan Stern
* Re: [PATCH] block: Fix bug in runtime-resume handling
From: Bart Van Assche @ 2020-08-30 0:38 UTC (permalink / raw)
To: Alan Stern, Martin Kepplinger; +Cc: Can Guo, linux-scsi, linux-block, kernel
On 2020-08-29 11:56, Alan Stern wrote:
> A third possibility is the approach I outlined before, adding a
> BLK_MQ_REQ_PM flag. But to avoid the deadlock you pointed out, I would
> make blk_queue_enter smarter about whether to postpone a request. The
> logic would go like this:
>
> If !blk_queue_pm_only:
>         Allow
> If !BLK_MQ_REQ_PREEMPT:
>         Postpone
> If q->rpm_status is RPM_ACTIVE:
>         Allow
> If !BLK_MQ_REQ_PM:
>         Postpone
> If q->rpm_status is RPM_SUSPENDED:
>         Postpone
> Else:
>         Allow
>
> The assumption here is that the PREEMPT flag is set whenever the PM flag
> is.
Hi Alan,
Although interesting, these solutions sound like workarounds to me. How
about fixing the root cause by modifying the SCSI DV implementation such
that it doesn't use scsi_device_quiesce() anymore? That change would
allow us to remove BLK_MQ_REQ_PREEMPT / RQF_PREEMPT from the block layer and
move these flags into the SCSI and IDE code.
Thanks,
Bart.
* Re: [PATCH] block: Fix bug in runtime-resume handling
From: Alan Stern @ 2020-08-30 1:06 UTC (permalink / raw)
To: Bart Van Assche
Cc: Martin Kepplinger, Can Guo, linux-scsi, linux-block, kernel
On Sat, Aug 29, 2020 at 05:38:50PM -0700, Bart Van Assche wrote:
> On 2020-08-29 11:56, Alan Stern wrote:
> > A third possibility is the approach I outlined before, adding a
> > BLK_MQ_REQ_PM flag. But to avoid the deadlock you pointed out, I would
> > make blk_queue_enter smarter about whether to postpone a request. The
> > logic would go like this:
> >
> > If !blk_queue_pm_only:
> >         Allow
> > If !BLK_MQ_REQ_PREEMPT:
> >         Postpone
> > If q->rpm_status is RPM_ACTIVE:
> >         Allow
> > If !BLK_MQ_REQ_PM:
> >         Postpone
> > If q->rpm_status is RPM_SUSPENDED:
> >         Postpone
> > Else:
> >         Allow
> >
> > The assumption here is that the PREEMPT flag is set whenever the PM flag
> > is.
>
> Hi Alan,
>
> Although interesting, these solutions sound like workarounds to me. How
> about fixing the root cause by modifying the SCSI DV implementation such
> that it doesn't use scsi_device_quiesce() anymore? That change would
> allow us to remove BLK_MQ_REQ_PREEMPT / RQF_PREEMPT from the block layer and
> move these flags into the SCSI and IDE code.
That's a perfectly reasonable approach, but I have no idea how to do it.
Any suggestions?
Alan Stern