All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCHv2] blk-mq: use quiesced elevator switch when reinitializing queues
@ 2022-09-27 15:56 Keith Busch
  2022-09-27 15:59 ` Jens Axboe
  0 siblings, 1 reply; 2+ messages in thread
From: Keith Busch @ 2022-09-27 15:56 UTC (permalink / raw)
  To: axboe, linux-block; +Cc: Keith Busch, Ming Lei, Christoph Hellwig

From: Keith Busch <kbusch@kernel.org>

The hctx's run_work may be racing with the elevator switch when
reinitializing hardware queues. The queue is merely frozen in this
context, but that only prevents requests from allocating and doesn't
stop the hctx work from running. The work may get an elevator pointer
that's being torn down, and can result in use-after-free errors and
kernel panics (example below). Use the quiesced elevator switch instead,
and make the previous one static since it is now only used locally.

  nvme nvme0: resetting controller
  nvme nvme0: 32/0/0 default/read/poll queues
  BUG: kernel NULL pointer dereference, address: 0000000000000008
  #PF: supervisor read access in kernel mode
  #PF: error_code(0x0000) - not-present page
  PGD 80000020c8861067 P4D 80000020c8861067 PUD 250f8c8067 PMD 0
  Oops: 0000 [#1] SMP PTI
  Workqueue: kblockd blk_mq_run_work_fn
  RIP: 0010:kyber_has_work+0x29/0x70

...

  Call Trace:
   __blk_mq_do_dispatch_sched+0x83/0x2b0
   __blk_mq_sched_dispatch_requests+0x12e/0x170
   blk_mq_sched_dispatch_requests+0x30/0x60
   __blk_mq_run_hw_queue+0x2b/0x50
   process_one_work+0x1ef/0x380
   worker_thread+0x2d/0x3e0

Signed-off-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
v1->v2:

  Updates to changelog subject, message, and reviewers. The title
  exceeds the recommended 50-char limit, but anything shorter will not
  capture the intent.

 block/blk-mq.c   | 6 +++---
 block/blk.h      | 3 +--
 block/elevator.c | 4 ++--
 3 files changed, 6 insertions(+), 7 deletions(-)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index 29bb48de5bda..034b24aad3fe 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -4470,14 +4470,14 @@ static bool blk_mq_elv_switch_none(struct list_head *head,
 	list_add(&qe->node, head);
 
 	/*
-	 * After elevator_switch_mq, the previous elevator_queue will be
+	 * After elevator_switch, the previous elevator_queue will be
 	 * released by elevator_release. The reference of the io scheduler
 	 * module get by elevator_get will also be put. So we need to get
 	 * a reference of the io scheduler module here to prevent it to be
 	 * removed.
 	 */
 	__module_get(qe->type->elevator_owner);
-	elevator_switch_mq(q, NULL);
+	elevator_switch(q, NULL);
 	mutex_unlock(&q->sysfs_lock);
 
 	return true;
@@ -4509,7 +4509,7 @@ static void blk_mq_elv_switch_back(struct list_head *head,
 	kfree(qe);
 
 	mutex_lock(&q->sysfs_lock);
-	elevator_switch_mq(q, t);
+	elevator_switch(q, t);
 	mutex_unlock(&q->sysfs_lock);
 }
 
diff --git a/block/blk.h b/block/blk.h
index d7142c4d2fef..52432eab621e 100644
--- a/block/blk.h
+++ b/block/blk.h
@@ -270,8 +270,7 @@ bool blk_bio_list_merge(struct request_queue *q, struct list_head *list,
 
 void blk_insert_flush(struct request *rq);
 
-int elevator_switch_mq(struct request_queue *q,
-			      struct elevator_type *new_e);
+int elevator_switch(struct request_queue *q, struct elevator_type *new_e);
 void elevator_exit(struct request_queue *q);
 int elv_register_queue(struct request_queue *q, bool uevent);
 void elv_unregister_queue(struct request_queue *q);
diff --git a/block/elevator.c b/block/elevator.c
index c319765892bb..bd71f0fc4e4b 100644
--- a/block/elevator.c
+++ b/block/elevator.c
@@ -588,7 +588,7 @@ void elv_unregister(struct elevator_type *e)
 }
 EXPORT_SYMBOL_GPL(elv_unregister);
 
-int elevator_switch_mq(struct request_queue *q,
+static int elevator_switch_mq(struct request_queue *q,
 			      struct elevator_type *new_e)
 {
 	int ret;
@@ -723,7 +723,7 @@ void elevator_init_mq(struct request_queue *q)
  * need for the new one. this way we have a chance of going back to the old
  * one, if the new one fails init for some reason.
  */
-static int elevator_switch(struct request_queue *q, struct elevator_type *new_e)
+int elevator_switch(struct request_queue *q, struct elevator_type *new_e)
 {
 	int err;
 
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 2+ messages in thread

* Re: [PATCHv2] blk-mq: use quiesced elevator switch when reinitializing queues
  2022-09-27 15:56 [PATCHv2] blk-mq: use quiesced elevator switch when reinitializing queues Keith Busch
@ 2022-09-27 15:59 ` Jens Axboe
  0 siblings, 0 replies; 2+ messages in thread
From: Jens Axboe @ 2022-09-27 15:59 UTC (permalink / raw)
  To: Keith Busch, linux-block; +Cc: Christoph Hellwig, Ming Lei, Keith Busch

On Tue, 27 Sep 2022 08:56:52 -0700, Keith Busch wrote:
> From: Keith Busch <kbusch@kernel.org>
> 
> The hctx's run_work may be racing with the elevator switch when
> reinitializing hardware queues. The queue is merely frozen in this
> context, but that only prevents requests from allocating and doesn't
> stop the hctx work from running. The work may get an elevator pointer
> that's being torn down, and can result in use-after-free errors and
> kernel panics (example below). Use the quiesced elevator switch instead,
> and make the previous one static since it is now only used locally.
> 
> [...]

Applied, thanks!

[1/1] blk-mq: use quiesced elevator switch when reinitializing queues
      commit: 8237c01f1696bc53c470493bf1fe092a107648a6

Best regards,
-- 
Jens Axboe



^ permalink raw reply	[flat|nested] 2+ messages in thread

end of thread, other threads:[~2022-09-27 15:59 UTC | newest]

Thread overview: 2+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-09-27 15:56 [PATCHv2] blk-mq: use quiesced elevator switch when reinitializing queues Keith Busch
2022-09-27 15:59 ` Jens Axboe

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.