* [PATCH] block: only call sched requeue_request() for scheduled requests
From: Omar Sandoval @ 2020-09-08 20:46 UTC
To: linux-block; +Cc: Jens Axboe, kernel-team, Yang Yang, Paolo Valente
From: Omar Sandoval <osandov@fb.com>
Yang Yang reported the following crash caused by requeueing a flush
request in Kyber:
[ 2.517297] Unable to handle kernel paging request at virtual address ffffffd8071c0b00
...
[ 2.517468] pc : clear_bit+0x18/0x2c
[ 2.517502] lr : sbitmap_queue_clear+0x40/0x228
[ 2.517503] sp : ffffff800832bc60 pstate : 00c00145
...
[ 2.517599] Process ksoftirqd/5 (pid: 51, stack limit = 0xffffff8008328000)
[ 2.517602] Call trace:
[ 2.517606] clear_bit+0x18/0x2c
[ 2.517619] kyber_finish_request+0x74/0x80
[ 2.517627] blk_mq_requeue_request+0x3c/0xc0
[ 2.517637] __scsi_queue_insert+0x11c/0x148
[ 2.517640] scsi_softirq_done+0x114/0x130
[ 2.517643] blk_done_softirq+0x7c/0xb0
[ 2.517651] __do_softirq+0x208/0x3bc
[ 2.517657] run_ksoftirqd+0x34/0x60
[ 2.517663] smpboot_thread_fn+0x1c4/0x2c0
[ 2.517667] kthread+0x110/0x120
[ 2.517669] ret_from_fork+0x10/0x18
This happens because Kyber doesn't track flush requests, so
kyber_finish_request() reads a garbage domain token. Only call the
scheduler's requeue_request() hook if RQF_ELVPRIV is set (like we do for
the finish_request() hook in blk_mq_free_request()). Now that we're
handling it in blk-mq, also remove the check from BFQ.
Reported-by: Yang Yang <yang.yang@vivo.com>
Signed-off-by: Omar Sandoval <osandov@fb.com>
---
block/bfq-iosched.c | 12 ------------
block/blk-mq-sched.h | 2 +-
2 files changed, 1 insertion(+), 13 deletions(-)
diff --git a/block/bfq-iosched.c b/block/bfq-iosched.c
index a4c0bec920cb..ee767fa000e4 100644
--- a/block/bfq-iosched.c
+++ b/block/bfq-iosched.c
@@ -5895,18 +5895,6 @@ static void bfq_finish_requeue_request(struct request *rq)
 	struct bfq_queue *bfqq = RQ_BFQQ(rq);
 	struct bfq_data *bfqd;
 
-	/*
-	 * Requeue and finish hooks are invoked in blk-mq without
-	 * checking whether the involved request is actually still
-	 * referenced in the scheduler. To handle this fact, the
-	 * following two checks make this function exit in case of
-	 * spurious invocations, for which there is nothing to do.
-	 *
-	 * First, check whether rq has nothing to do with an elevator.
-	 */
-	if (unlikely(!(rq->rq_flags & RQF_ELVPRIV)))
-		return;
-
 	/*
 	 * rq either is not associated with any icq, or is an already
 	 * requeued request that has not (yet) been re-inserted into
diff --git a/block/blk-mq-sched.h b/block/blk-mq-sched.h
index 126021fc3a11..e81ca1bf6e10 100644
--- a/block/blk-mq-sched.h
+++ b/block/blk-mq-sched.h
@@ -66,7 +66,7 @@ static inline void blk_mq_sched_requeue_request(struct request *rq)
 	struct request_queue *q = rq->q;
 	struct elevator_queue *e = q->elevator;
 
-	if (e && e->type->ops.requeue_request)
+	if ((rq->rq_flags & RQF_ELVPRIV) && e && e->type->ops.requeue_request)
 		e->type->ops.requeue_request(rq);
 }
 
--
2.28.0
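
Editor's sketch, not part of the patch: the guard added to
blk_mq_sched_requeue_request() can be modelled in user space roughly as
follows. Only RQF_ELVPRIV and the shape of the condition come from the
patch. The struct layouts, the sched_token field, the flag's bit position
and every model_* name are hypothetical stand-ins, not kernel definitions.

```c
/* Hypothetical user-space stand-ins; the bit position of the flag is arbitrary. */
#define RQF_ELVPRIV (1u << 0)

struct request {
	unsigned int rq_flags;
	int sched_token;	/* meaningful only when RQF_ELVPRIV is set */
};

struct elevator_ops {
	void (*requeue_request)(struct request *rq);
};

static int hook_calls;	/* counts how often the model hook ran */

/* Model scheduler hook: it trusts sched_token, much as a real hook trusts its private data. */
static void model_requeue_hook(struct request *rq)
{
	hook_calls++;
	rq->sched_token = -1;	/* "release" the token */
}

/* Same condition as the patched check: skip requests the scheduler never set up. */
static void model_sched_requeue(const struct elevator_ops *ops, struct request *rq)
{
	if ((rq->rq_flags & RQF_ELVPRIV) && ops && ops->requeue_request)
		ops->requeue_request(rq);
}

/* Requeue one untracked (flush-like) and one tracked request; return the hook call count. */
static int model_demo(void)
{
	struct elevator_ops ops = { .requeue_request = model_requeue_hook };
	struct request flush = { .rq_flags = 0, .sched_token = 12345 };
	struct request normal = { .rq_flags = RQF_ELVPRIV, .sched_token = 7 };

	hook_calls = 0;
	model_sched_requeue(&ops, &flush);	/* skipped: RQF_ELVPRIV not set */
	model_sched_requeue(&ops, &normal);	/* hook runs */
	return hook_calls;
}
```

In this model the hook runs once, for the request that carries
RQF_ELVPRIV. The flush-like request never reaches the hook, so its
uninitialized scheduler data is never interpreted.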
* Re: [PATCH] block: only call sched requeue_request() for scheduled requests
From: Jens Axboe @ 2020-09-08 23:42 UTC
To: Omar Sandoval, linux-block; +Cc: kernel-team, Yang Yang, Paolo Valente
On 9/8/20 2:46 PM, Omar Sandoval wrote:
> From: Omar Sandoval <osandov@fb.com>
>
> Yang Yang reported the following crash caused by requeueing a flush
> request in Kyber:
>
> [ 2.517297] Unable to handle kernel paging request at virtual address ffffffd8071c0b00
> ...
> [ 2.517468] pc : clear_bit+0x18/0x2c
> [ 2.517502] lr : sbitmap_queue_clear+0x40/0x228
> [ 2.517503] sp : ffffff800832bc60 pstate : 00c00145
> ...
> [ 2.517599] Process ksoftirqd/5 (pid: 51, stack limit = 0xffffff8008328000)
> [ 2.517602] Call trace:
> [ 2.517606] clear_bit+0x18/0x2c
> [ 2.517619] kyber_finish_request+0x74/0x80
> [ 2.517627] blk_mq_requeue_request+0x3c/0xc0
> [ 2.517637] __scsi_queue_insert+0x11c/0x148
> [ 2.517640] scsi_softirq_done+0x114/0x130
> [ 2.517643] blk_done_softirq+0x7c/0xb0
> [ 2.517651] __do_softirq+0x208/0x3bc
> [ 2.517657] run_ksoftirqd+0x34/0x60
> [ 2.517663] smpboot_thread_fn+0x1c4/0x2c0
> [ 2.517667] kthread+0x110/0x120
> [ 2.517669] ret_from_fork+0x10/0x18
>
> This happens because Kyber doesn't track flush requests, so
> kyber_finish_request() reads a garbage domain token. Only call the
> scheduler's requeue_request() hook if RQF_ELVPRIV is set (like we do for
> the finish_request() hook in blk_mq_free_request()). Now that we're
> handling it in blk-mq, also remove the check from BFQ.
Thanks, applied.
--
Jens Axboe