All of lore.kernel.org
 help / color / mirror / Atom feed
From: Christoph Hellwig <hch@lst.de>
To: Jens Axboe <axboe@kernel.dk>
Cc: Jinlong Chen <nickyc975@zju.edu.cn>, linux-block@vger.kernel.org
Subject: [PATCH 4/4] block: fix up elevator_type refcounting
Date: Thu, 20 Oct 2022 08:48:19 +0200	[thread overview]
Message-ID: <20221020064819.1469928-5-hch@lst.de> (raw)
In-Reply-To: <20221020064819.1469928-1-hch@lst.de>

From: Jinlong Chen <nickyc975@zju.edu.cn>

The current reference management logic of io scheduler modules contains
refcnt problems. For example, blk_mq_init_sched may fail before or after
the calling of e->ops.init_sched. If it fails before the calling, it does
nothing to the reference to the io scheduler module. But if it fails after
the calling, it releases the reference by calling kobject_put(&eq->kobj).

As the callers of blk_mq_init_sched can't know exactly where the failure
happens, they can't handle the reference to the io scheduler module
properly: releasing the reference on failure results in double-release if
blk_mq_init_sched has released it, and not releasing the reference results
in ghost reference if blk_mq_init_sched did not release it either.

The same problem also exists in io schedulers' init_sched implementations.

We can address the problem by adding releasing statements to the error
handling procedures of blk_mq_init_sched and init_sched implementations.
But that is counterintuitive and requires modifications to existing io
schedulers.

Instead, We make elevator_alloc get the io scheduler module references
that will be released by elevator_release. And then, we match each
elevator_get with an elevator_put. Therefore, each reference to an io
scheduler module explicitly has its own getter and releaser, and we no
longer need to worry about the refcnt problems.

The bugs and the patch can be validated with tools here:
https://github.com/nickyc975/linux_elv_refcnt_bug.git

Signed-off-by: Jinlong Chen <nickyc975@zju.edu.cn>
[hch: split out a few bits into separate patches, use a non-try
      module_get in elevator_alloc]
Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 block/blk-mq-sched.c |  1 +
 block/blk-mq.c       |  2 ++
 block/elevator.c     | 10 +++++++---
 3 files changed, 10 insertions(+), 3 deletions(-)

diff --git a/block/blk-mq-sched.c b/block/blk-mq-sched.c
index a4f7c101b53b2..68227240fdea3 100644
--- a/block/blk-mq-sched.c
+++ b/block/blk-mq-sched.c
@@ -555,6 +555,7 @@ static int blk_mq_init_sched_shared_tags(struct request_queue *queue)
 	return 0;
 }
 
+/* caller must have a reference to @e, will grab another one if successful */
 int blk_mq_init_sched(struct request_queue *q, struct elevator_type *e)
 {
 	unsigned int flags = q->tag_set->flags;
diff --git a/block/blk-mq.c b/block/blk-mq.c
index 9db8814cdd02b..098432d3caf1c 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -4591,6 +4591,8 @@ static void blk_mq_elv_switch_back(struct list_head *head,
 
 	mutex_lock(&q->sysfs_lock);
 	elevator_switch(q, t);
+	/* drop the reference acquired in blk_mq_elv_switch_none */
+	elevator_put(t);
 	mutex_unlock(&q->sysfs_lock);
 }
 
diff --git a/block/elevator.c b/block/elevator.c
index 887becfe6a8d4..e36d132b47e1e 100644
--- a/block/elevator.c
+++ b/block/elevator.c
@@ -165,6 +165,7 @@ struct elevator_queue *elevator_alloc(struct request_queue *q,
 	if (unlikely(!eq))
 		return NULL;
 
+	__elevator_get(e);
 	eq->type = e;
 	kobject_init(&eq->kobj, &elv_ktype);
 	mutex_init(&eq->sysfs_lock);
@@ -708,8 +709,9 @@ void elevator_init_mq(struct request_queue *q)
 	if (err) {
 		pr_warn("\"%s\" elevator initialization failed, "
 			"falling back to \"none\"\n", e->elevator_name);
-		elevator_put(e);
 	}
+
+	elevator_put(e);
 }
 
 /*
@@ -741,6 +743,7 @@ int elevator_switch(struct request_queue *q, struct elevator_type *new_e)
 static int elevator_change(struct request_queue *q, const char *elevator_name)
 {
 	struct elevator_type *e;
+	int ret;
 
 	/* Make sure queue is not in the middle of being removed */
 	if (!blk_queue_registered(q))
@@ -761,8 +764,9 @@ static int elevator_change(struct request_queue *q, const char *elevator_name)
 	e = elevator_get(q, elevator_name, true);
 	if (!e)
 		return -EINVAL;
-
-	return elevator_switch(q, e);
+	ret = elevator_switch(q, e);
+	elevator_put(e);
+	return ret;
 }
 
 ssize_t elv_iosched_store(struct request_queue *q, const char *buf,
-- 
2.30.2


  parent reply	other threads:[~2022-10-20  6:48 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-10-20  6:48 elevator refcount fixes Christoph Hellwig
2022-10-20  6:48 ` [PATCH 1/4] block: add proper helpers for elevator_type module refcount management Christoph Hellwig
2022-10-20  6:48 ` [PATCH 2/4] block: sanitize the elevator name before passing it to __elevator_change Christoph Hellwig
2022-10-20  6:48 ` [PATCH 3/4] block: check for an unchanged elevator earlier in __elevator_change Christoph Hellwig
2022-10-20  6:48 ` Christoph Hellwig [this message]
2022-10-20  8:34 ` elevator refcount fixes Jinlong Chen
2022-10-23  3:05 ` Jens Axboe

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20221020064819.1469928-5-hch@lst.de \
    --to=hch@lst.de \
    --cc=axboe@kernel.dk \
    --cc=linux-block@vger.kernel.org \
    --cc=nickyc975@zju.edu.cn \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.