* [PATCHSET v2 block/for-next] blkcg: Improve blkg config helpers and make iolatency init lazy
@ 2023-01-05 21:24 Tejun Heo
  2023-01-05 21:24 ` [PATCH 1/4] blkcg: Drop unnecessary RCU read [un]locks from blkg_conf_prep/finish() Tejun Heo
                   ` (3 more replies)
  0 siblings, 4 replies; 13+ messages in thread
From: Tejun Heo @ 2023-01-05 21:24 UTC (permalink / raw)
  To: axboe, josef, hch; +Cc: linux-block, linux-kernel

Hello,

v2 fixes the build failure caused by v1[1] forgetting to update bfq.

This patchset:

* Improves blkg config helpers so that they can be used consistently for all
  the existing use cases. This also allows the same bdev open instance to be
  kept across lazy init of rq_qos policies.

* Updates iolatency so that it initializes lazily when a latency target is
  set for the first time. This avoids registering the rq_qos policy when
  iolatency is not used which removes unnecessary calls into iolat from IO
  hot paths.

and contains the following four patches:

 0001-blkcg-Drop-unnecessary-RCU-read-un-locks-from-blkg_c.patch
 0002-blkcg-Restructure-blkg_conf_prep-and-friends.patch
 0003-blk-iolatency-s-blkcg_rq_qos-iolat_rq_qos.patch
 0004-blk-iolatency-Make-initialization-lazy.patch

and is also available in the following git branch.

 git://git.kernel.org/pub/scm/linux/kernel/git/tj/misc.git iolat-lazy-init-v2

diffstat follows. Thanks.

 block/bfq-cgroup.c    |    8 +++--
 block/blk-cgroup.c    |  120 ++++++++++++++++++++++++++++++++++++++++++----------------------------------
 block/blk-cgroup.h    |   10 +++---
 block/blk-iocost.c    |   58 +++++++++++++++++++++---------------
 block/blk-iolatency.c |   39 +++++++++++++++++++++---
 block/blk-rq-qos.h    |    2 -
 block/blk-throttle.c  |   16 ++++++----
 block/blk.h           |    6 ---
 8 files changed, 157 insertions(+), 102 deletions(-)

[1] https://lkml.kernel.org/r/20230105002007.157497-1-tj@kernel.org

--
tejun




^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH 1/4] blkcg: Drop unnecessary RCU read [un]locks from blkg_conf_prep/finish()
  2023-01-05 21:24 [PATCHSET v2 block/for-next] blkcg: Improve blkg config helpers and make iolatency init lazy Tejun Heo
@ 2023-01-05 21:24 ` Tejun Heo
  2023-01-08 17:02   ` Christoph Hellwig
  2023-01-05 21:24 ` [PATCH 2/4] blkcg: Restructure blkg_conf_prep() and friends Tejun Heo
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 13+ messages in thread
From: Tejun Heo @ 2023-01-05 21:24 UTC (permalink / raw)
  To: axboe, josef, hch; +Cc: linux-block, linux-kernel, Tejun Heo

Holding the queue lock now implies RCU read lock, so no need to use
rcu_read_[un]lock() explicitly. This shouldn't cause any behavior changes.

While at it, drop __acquires() annotation on the queue lock too. The
__acquires() part was already out of sync and it doesn't catch anything that
lockdep can't.

Signed-off-by: Tejun Heo <tj@kernel.org>
---
 block/blk-cgroup.c | 11 ++---------
 1 file changed, 2 insertions(+), 9 deletions(-)

diff --git a/block/blk-cgroup.c b/block/blk-cgroup.c
index ce6a2b7d3dfb..99674e23cf88 100644
--- a/block/blk-cgroup.c
+++ b/block/blk-cgroup.c
@@ -672,12 +672,11 @@ struct block_device *blkcg_conf_open_bdev(char **inputp)
  *
  * Parse per-blkg config update from @input and initialize @ctx with the
  * result.  @ctx->blkg points to the blkg to be updated and @ctx->body the
- * part of @input following MAJ:MIN.  This function returns with RCU read
- * lock and queue lock held and must be paired with blkg_conf_finish().
+ * part of @input following MAJ:MIN.  This function returns with queue lock
+ * held and must be paired with blkg_conf_finish().
  */
 int blkg_conf_prep(struct blkcg *blkcg, const struct blkcg_policy *pol,
 		   char *input, struct blkg_conf_ctx *ctx)
-	__acquires(rcu) __acquires(&bdev->bd_queue->queue_lock)
 {
 	struct block_device *bdev;
 	struct gendisk *disk;
@@ -699,7 +698,6 @@ int blkg_conf_prep(struct blkcg *blkcg, const struct blkcg_policy *pol,
 	if (ret)
 		goto fail;
 
-	rcu_read_lock();
 	spin_lock_irq(&q->queue_lock);
 
 	if (!blkcg_policy_enabled(q, pol)) {
@@ -728,7 +726,6 @@ int blkg_conf_prep(struct blkcg *blkcg, const struct blkcg_policy *pol,
 
 		/* Drop locks to do new blkg allocation with GFP_KERNEL. */
 		spin_unlock_irq(&q->queue_lock);
-		rcu_read_unlock();
 
 		new_blkg = blkg_alloc(pos, disk, GFP_KERNEL);
 		if (unlikely(!new_blkg)) {
@@ -742,7 +739,6 @@ int blkg_conf_prep(struct blkcg *blkcg, const struct blkcg_policy *pol,
 			goto fail_exit_queue;
 		}
 
-		rcu_read_lock();
 		spin_lock_irq(&q->queue_lock);
 
 		if (!blkcg_policy_enabled(q, pol)) {
@@ -778,7 +774,6 @@ int blkg_conf_prep(struct blkcg *blkcg, const struct blkcg_policy *pol,
 	radix_tree_preload_end();
 fail_unlock:
 	spin_unlock_irq(&q->queue_lock);
-	rcu_read_unlock();
 fail_exit_queue:
 	blk_queue_exit(q);
 fail:
@@ -805,10 +800,8 @@ EXPORT_SYMBOL_GPL(blkg_conf_prep);
  * with blkg_conf_prep().
  */
 void blkg_conf_finish(struct blkg_conf_ctx *ctx)
-	__releases(&ctx->bdev->bd_queue->queue_lock) __releases(rcu)
 {
 	spin_unlock_irq(&bdev_get_queue(ctx->bdev)->queue_lock);
-	rcu_read_unlock();
 	blkdev_put_no_open(ctx->bdev);
 }
 EXPORT_SYMBOL_GPL(blkg_conf_finish);
-- 
2.39.0



* [PATCH 2/4] blkcg: Restructure blkg_conf_prep() and friends
  2023-01-05 21:24 [PATCHSET v2 block/for-next] blkcg: Improve blkg config helpers and make iolatency init lazy Tejun Heo
  2023-01-05 21:24 ` [PATCH 1/4] blkcg: Drop unnecessary RCU read [un]locks from blkg_conf_prep/finish() Tejun Heo
@ 2023-01-05 21:24 ` Tejun Heo
  2023-01-10  7:09   ` Christoph Hellwig
  2023-01-05 21:24 ` [PATCH 3/4] blk-iolatency: s/blkcg_rq_qos/iolat_rq_qos/ Tejun Heo
  2023-01-05 21:24 ` [PATCH 4/4] blk-iolatency: Make initialization lazy Tejun Heo
  3 siblings, 1 reply; 13+ messages in thread
From: Tejun Heo @ 2023-01-05 21:24 UTC (permalink / raw)
  To: axboe, josef, hch; +Cc: linux-block, linux-kernel, Tejun Heo

We want to support lazy init of rq-qos policies so that iolatency is enabled
lazily on configuration instead of gendisk initialization. The way blkg
config helpers are structured now is a bit awkward for that. Let's
restructure:

* blkcg_conf_open_bdev() is renamed to blkg_conf_open_bdev(). The blkcg_
  prefix was used because the bdev opening step is blkg-independent.
  However, the distinction is too subtle and confuses more than it helps.
  Let's switch to the blkg_ prefix so that it's consistent with the type and
  other helper names.

* struct blkg_conf_ctx now remembers the original input string and is always
  initialized by the new blkg_conf_init().

* blkg_conf_open_bdev() is updated to take a pointer to blkg_conf_ctx like
  blkg_conf_prep() and can be called multiple times safely. Instead of
  modifying the double pointer to input string directly,
  blkg_conf_open_bdev() now sets blkg_conf_ctx->body.

* blkg_conf_finish() is renamed to blkg_conf_exit() for symmetry and now
  must be called on all blkg_conf_ctx's which were initialized with
  blkg_conf_init().

Combined, this allows the users to either open the bdev first or do it all
together with blkg_conf_prep(), which will help implement lazy init of
rq-qos policies.

Users are updated accordingly. No behavior change is intended by this patch.
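
To illustrate the calling convention described above, here is a minimal
user-space sketch. It is not the kernel code: struct conf_ctx, conf_init(),
conf_open_dev(), conf_prep() and conf_exit() are hypothetical stand-ins for
struct blkg_conf_ctx and the blkg_conf_*() helpers, the bdev reference is
modeled by a flag plus counters, and the queue lock is modeled by another
flag. It only shows the three properties the restructuring aims for: the
open step can be repeated safely, prep implies open, and exit can be called
unconditionally.

```c
#include <ctype.h>
#include <errno.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical user-space stand-in for struct blkg_conf_ctx. */
struct conf_ctx {
	char *input;		/* original input string */
	char *body;		/* part of input following "MAJ:MIN" */
	int dev_open;		/* models ctx->bdev being set */
	int locked;		/* models ctx->blkg being set (lock held) */
	unsigned int major, minor;
};

static int open_count;		/* device references taken */
static int put_count;		/* device references dropped */

static void conf_init(struct conf_ctx *ctx, char *input)
{
	*ctx = (struct conf_ctx){ .input = input };
}

/* Parse "MAJ:MIN" once; later calls on the same ctx are no-ops. */
static int conf_open_dev(struct conf_ctx *ctx)
{
	char *input = ctx->input;
	unsigned int major, minor;
	int key_len;

	if (ctx->dev_open)
		return 0;

	if (sscanf(input, "%u:%u%n", &major, &minor, &key_len) != 2)
		return -EINVAL;

	input += key_len;
	if (!isspace((unsigned char)*input))
		return -EINVAL;
	while (isspace((unsigned char)*input))
		input++;

	open_count++;
	ctx->major = major;
	ctx->minor = minor;
	ctx->body = input;
	ctx->dev_open = 1;
	return 0;
}

/* Implicitly opens the device, then "takes the lock". */
static int conf_prep(struct conf_ctx *ctx)
{
	int ret = conf_open_dev(ctx);

	if (ret)
		return ret;
	ctx->locked = 1;
	return 0;
}

/* Releases only what was actually acquired, so it is always safe. */
static void conf_exit(struct conf_ctx *ctx)
{
	if (ctx->locked)
		ctx->locked = 0;

	if (ctx->dev_open) {
		put_count++;
		ctx->body = NULL;
		ctx->dev_open = 0;
	}
}
```

A caller in this model would run conf_init(), optionally conf_open_dev(),
then conf_prep(), and always finish with conf_exit(), including on the error
paths, which is why the converted users jump to their exit labels instead of
returning directly.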

v2: bfq wasn't updated in v1 causing a build error. Fixed.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Josef Bacik <josef@toxicpanda.com>
Cc: Christoph Hellwig <hch@lst.de>
---
 block/bfq-cgroup.c    |   8 ++--
 block/blk-cgroup.c    | 105 +++++++++++++++++++++++++++---------------
 block/blk-cgroup.h    |  10 ++--
 block/blk-iocost.c    |  58 +++++++++++++----------
 block/blk-iolatency.c |   8 ++--
 block/blk-throttle.c  |  16 ++++---
 6 files changed, 127 insertions(+), 78 deletions(-)

diff --git a/block/bfq-cgroup.c b/block/bfq-cgroup.c
index a6e8da5f5cfd..97925793aee4 100644
--- a/block/bfq-cgroup.c
+++ b/block/bfq-cgroup.c
@@ -1115,9 +1115,11 @@ static ssize_t bfq_io_set_device_weight(struct kernfs_open_file *of,
 	struct bfq_group *bfqg;
 	u64 v;
 
-	ret = blkg_conf_prep(blkcg, &blkcg_policy_bfq, buf, &ctx);
+	blkg_conf_init(&ctx, buf);
+
+	ret = blkg_conf_prep(blkcg, &blkcg_policy_bfq, &ctx);
 	if (ret)
-		return ret;
+		goto out;
 
 	if (sscanf(ctx.body, "%llu", &v) == 1) {
 		/* require "default" on dfl */
@@ -1139,7 +1141,7 @@ static ssize_t bfq_io_set_device_weight(struct kernfs_open_file *of,
 		ret = 0;
 	}
 out:
-	blkg_conf_finish(&ctx);
+	blkg_conf_exit(&ctx);
 	return ret ?: nbytes;
 }
 
diff --git a/block/blk-cgroup.c b/block/blk-cgroup.c
index 99674e23cf88..d8e0625cd12d 100644
--- a/block/blk-cgroup.c
+++ b/block/blk-cgroup.c
@@ -626,68 +626,92 @@ u64 __blkg_prfill_u64(struct seq_file *sf, struct blkg_policy_data *pd, u64 v)
 EXPORT_SYMBOL_GPL(__blkg_prfill_u64);
 
 /**
- * blkcg_conf_open_bdev - parse and open bdev for per-blkg config update
- * @inputp: input string pointer
+ * blkg_conf_init - initialize a blkg_conf_ctx
+ * @ctx: blkg_conf_ctx to initialize
+ * @input: input string
  *
- * Parse the device node prefix part, MAJ:MIN, of per-blkg config update
- * from @input and get and return the matching bdev.  *@inputp is
- * updated to point past the device node prefix.  Returns an ERR_PTR()
- * value on error.
+ * Initialize @ctx which can be used to parse blkg config input string @input.
+ * Once initialized, @ctx can be used with blkg_conf_open_bdev() and
+ * blkg_conf_prep(), and must be cleaned up with blkg_conf_exit().
+ */
+void blkg_conf_init(struct blkg_conf_ctx *ctx, char *input)
+{
+	*ctx = (struct blkg_conf_ctx){ .input = input };
+}
+EXPORT_SYMBOL_GPL(blkg_conf_init);
+
+/**
+ * blkg_conf_open_bdev - parse and open bdev for per-blkg config update
+ * @ctx: blkg_conf_ctx initialized with blkg_conf_init()
  *
- * Use this function iff blkg_conf_prep() can't be used for some reason.
+ * Parse the device node prefix part, MAJ:MIN, of per-blkg config update from
+ * @ctx->input and get and store the matching bdev in @ctx->bdev. @ctx->body is
+ * set to point past the device node prefix.
+ *
+ * This function may be called multiple times on @ctx and the extra calls become
+ * NOOPs. blkg_conf_prep() implicitly calls this function. Use this function
+ * explicitly if bdev access is needed without resolving the blkcg / policy part
+ * of @ctx->input. Returns -errno on error.
  */
-struct block_device *blkcg_conf_open_bdev(char **inputp)
+int blkg_conf_open_bdev(struct blkg_conf_ctx *ctx)
 {
-	char *input = *inputp;
+	char *input = ctx->input;
 	unsigned int major, minor;
 	struct block_device *bdev;
 	int key_len;
 
+	if (ctx->bdev)
+		return 0;
+
 	if (sscanf(input, "%u:%u%n", &major, &minor, &key_len) != 2)
-		return ERR_PTR(-EINVAL);
+		return -EINVAL;
 
 	input += key_len;
 	if (!isspace(*input))
-		return ERR_PTR(-EINVAL);
+		return -EINVAL;
 	input = skip_spaces(input);
 
 	bdev = blkdev_get_no_open(MKDEV(major, minor));
 	if (!bdev)
-		return ERR_PTR(-ENODEV);
+		return -ENODEV;
 	if (bdev_is_partition(bdev)) {
 		blkdev_put_no_open(bdev);
-		return ERR_PTR(-ENODEV);
+		return -ENODEV;
 	}
 
-	*inputp = input;
-	return bdev;
+	ctx->body = input;
+	ctx->bdev = bdev;
+	return 0;
 }
 
 /**
  * blkg_conf_prep - parse and prepare for per-blkg config update
  * @blkcg: target block cgroup
  * @pol: target policy
- * @input: input string
- * @ctx: blkg_conf_ctx to be filled
+ * @ctx: blkg_conf_ctx initialized with blkg_conf_init()
+ *
+ * Parse per-blkg config update from @ctx->input and initialize @ctx
+ * accordingly. On success, @ctx->body points to the part of @ctx->input
+ * following MAJ:MIN, @ctx->bdev points to the target block device and
+ * @ctx->blkg to the blkg being configured.
  *
- * Parse per-blkg config update from @input and initialize @ctx with the
- * result.  @ctx->blkg points to the blkg to be updated and @ctx->body the
- * part of @input following MAJ:MIN.  This function returns with queue lock
- * held and must be paired with blkg_conf_finish().
+ * blkg_conf_open_bdev() may be called on @ctx beforehand. On success, this
+ * function returns with queue lock held and must be followed by
+ * blkg_conf_exit().
  */
 int blkg_conf_prep(struct blkcg *blkcg, const struct blkcg_policy *pol,
-		   char *input, struct blkg_conf_ctx *ctx)
+		   struct blkg_conf_ctx *ctx)
 {
-	struct block_device *bdev;
 	struct gendisk *disk;
 	struct request_queue *q;
 	struct blkcg_gq *blkg;
 	int ret;
 
-	bdev = blkcg_conf_open_bdev(&input);
-	if (IS_ERR(bdev))
-		return PTR_ERR(bdev);
-	disk = bdev->bd_disk;
+	ret = blkg_conf_open_bdev(ctx);
+	if (ret)
+		return ret;
+
+	disk = ctx->bdev->bd_disk;
 	q = disk->queue;
 
 	/*
@@ -765,9 +789,7 @@ int blkg_conf_prep(struct blkcg *blkcg, const struct blkcg_policy *pol,
 	}
 success:
 	blk_queue_exit(q);
-	ctx->bdev = bdev;
 	ctx->blkg = blkg;
-	ctx->body = input;
 	return 0;
 
 fail_preloaded:
@@ -777,7 +799,6 @@ int blkg_conf_prep(struct blkcg *blkcg, const struct blkcg_policy *pol,
 fail_exit_queue:
 	blk_queue_exit(q);
 fail:
-	blkdev_put_no_open(bdev);
 	/*
 	 * If queue was bypassing, we should retry.  Do so after a
 	 * short msleep().  It isn't strictly necessary but queue
@@ -793,18 +814,26 @@ int blkg_conf_prep(struct blkcg *blkcg, const struct blkcg_policy *pol,
 EXPORT_SYMBOL_GPL(blkg_conf_prep);
 
 /**
- * blkg_conf_finish - finish up per-blkg config update
- * @ctx: blkg_conf_ctx initialized by blkg_conf_prep()
+ * blkg_conf_exit - clean up per-blkg config update
+ * @ctx: blkg_conf_ctx initialized with blkg_conf_init()
  *
- * Finish up after per-blkg config update.  This function must be paired
- * with blkg_conf_prep().
+ * Clean up after per-blkg config update. This function must be called on all
+ * blkg_conf_ctx's initialized with blkg_conf_init().
  */
-void blkg_conf_finish(struct blkg_conf_ctx *ctx)
+void blkg_conf_exit(struct blkg_conf_ctx *ctx)
 {
-	spin_unlock_irq(&bdev_get_queue(ctx->bdev)->queue_lock);
-	blkdev_put_no_open(ctx->bdev);
+	if (ctx->blkg) {
+		spin_unlock_irq(&bdev_get_queue(ctx->bdev)->queue_lock);
+		ctx->blkg = NULL;
+	}
+
+	if (ctx->bdev) {
+		blkdev_put_no_open(ctx->bdev);
+		ctx->body = NULL;
+		ctx->bdev = NULL;
+	}
 }
-EXPORT_SYMBOL_GPL(blkg_conf_finish);
+EXPORT_SYMBOL_GPL(blkg_conf_exit);
 
 static void blkg_iostat_set(struct blkg_iostat *dst, struct blkg_iostat *src)
 {
diff --git a/block/blk-cgroup.h b/block/blk-cgroup.h
index 1e94e404eaa8..fe09e8b4c2a8 100644
--- a/block/blk-cgroup.h
+++ b/block/blk-cgroup.h
@@ -208,15 +208,17 @@ void blkcg_print_blkgs(struct seq_file *sf, struct blkcg *blkcg,
 u64 __blkg_prfill_u64(struct seq_file *sf, struct blkg_policy_data *pd, u64 v);
 
 struct blkg_conf_ctx {
+	char				*input;
+	char				*body;
 	struct block_device		*bdev;
 	struct blkcg_gq			*blkg;
-	char				*body;
 };
 
-struct block_device *blkcg_conf_open_bdev(char **inputp);
+void blkg_conf_init(struct blkg_conf_ctx *ctx, char *input);
+int blkg_conf_open_bdev(struct blkg_conf_ctx *ctx);
 int blkg_conf_prep(struct blkcg *blkcg, const struct blkcg_policy *pol,
-		   char *input, struct blkg_conf_ctx *ctx);
-void blkg_conf_finish(struct blkg_conf_ctx *ctx);
+		   struct blkg_conf_ctx *ctx);
+void blkg_conf_exit(struct blkg_conf_ctx *ctx);
 
 /**
  * bio_issue_as_root_blkg - see if this bio needs to be issued as root blkg
diff --git a/block/blk-iocost.c b/block/blk-iocost.c
index 6955605629e4..22a3639a7a05 100644
--- a/block/blk-iocost.c
+++ b/block/blk-iocost.c
@@ -3091,9 +3091,11 @@ static ssize_t ioc_weight_write(struct kernfs_open_file *of, char *buf,
 		return nbytes;
 	}
 
-	ret = blkg_conf_prep(blkcg, &blkcg_policy_iocost, buf, &ctx);
+	blkg_conf_init(&ctx, buf);
+
+	ret = blkg_conf_prep(blkcg, &blkcg_policy_iocost, &ctx);
 	if (ret)
-		return ret;
+		goto err;
 
 	iocg = blkg_to_iocg(ctx.blkg);
 
@@ -3112,12 +3114,14 @@ static ssize_t ioc_weight_write(struct kernfs_open_file *of, char *buf,
 	weight_updated(iocg, &now);
 	spin_unlock(&iocg->ioc->lock);
 
-	blkg_conf_finish(&ctx);
+	blkg_conf_exit(&ctx);
 	return nbytes;
 
 einval:
-	blkg_conf_finish(&ctx);
-	return -EINVAL;
+	ret = -EINVAL;
+err:
+	blkg_conf_exit(&ctx);
+	return ret;
 }
 
 static u64 ioc_qos_prfill(struct seq_file *sf, struct blkg_policy_data *pd,
@@ -3172,19 +3176,22 @@ static const match_table_t qos_tokens = {
 static ssize_t ioc_qos_write(struct kernfs_open_file *of, char *input,
 			     size_t nbytes, loff_t off)
 {
-	struct block_device *bdev;
+	struct blkg_conf_ctx ctx;
 	struct gendisk *disk;
 	struct ioc *ioc;
 	u32 qos[NR_QOS_PARAMS];
 	bool enable, user;
-	char *p;
+	char *body, *p;
 	int ret;
 
-	bdev = blkcg_conf_open_bdev(&input);
-	if (IS_ERR(bdev))
-		return PTR_ERR(bdev);
+	blkg_conf_init(&ctx, input);
 
-	disk = bdev->bd_disk;
+	ret = blkg_conf_open_bdev(&ctx);
+	if (ret)
+		goto err;
+
+	body = ctx.body;
+	disk = ctx.bdev->bd_disk;
 	ioc = q_to_ioc(disk->queue);
 	if (!ioc) {
 		ret = blk_iocost_init(disk);
@@ -3201,7 +3208,7 @@ static ssize_t ioc_qos_write(struct kernfs_open_file *of, char *input,
 	enable = ioc->enabled;
 	user = ioc->user_qos_params;
 
-	while ((p = strsep(&input, " \t\n"))) {
+	while ((p = strsep(&body, " \t\n"))) {
 		substring_t args[MAX_OPT_ARGS];
 		char buf[32];
 		int tok;
@@ -3290,7 +3297,7 @@ static ssize_t ioc_qos_write(struct kernfs_open_file *of, char *input,
 	blk_mq_unquiesce_queue(disk->queue);
 	blk_mq_unfreeze_queue(disk->queue);
 
-	blkdev_put_no_open(bdev);
+	blkg_conf_exit(&ctx);
 	return nbytes;
 einval:
 	spin_unlock_irq(&ioc->lock);
@@ -3300,7 +3307,7 @@ static ssize_t ioc_qos_write(struct kernfs_open_file *of, char *input,
 
 	ret = -EINVAL;
 err:
-	blkdev_put_no_open(bdev);
+	blkg_conf_exit(&ctx);
 	return ret;
 }
 
@@ -3351,22 +3358,25 @@ static const match_table_t i_lcoef_tokens = {
 static ssize_t ioc_cost_model_write(struct kernfs_open_file *of, char *input,
 				    size_t nbytes, loff_t off)
 {
-	struct block_device *bdev;
+	struct blkg_conf_ctx ctx;
 	struct request_queue *q;
 	struct ioc *ioc;
 	u64 u[NR_I_LCOEFS];
 	bool user;
-	char *p;
+	char *body, *p;
 	int ret;
 
-	bdev = blkcg_conf_open_bdev(&input);
-	if (IS_ERR(bdev))
-		return PTR_ERR(bdev);
+	blkg_conf_init(&ctx, input);
+
+	ret = blkg_conf_open_bdev(&ctx);
+	if (ret)
+		goto err;
 
-	q = bdev_get_queue(bdev);
+	body = ctx.body;
+	q = bdev_get_queue(ctx.bdev);
 	ioc = q_to_ioc(q);
 	if (!ioc) {
-		ret = blk_iocost_init(bdev->bd_disk);
+		ret = blk_iocost_init(ctx.bdev->bd_disk);
 		if (ret)
 			goto err;
 		ioc = q_to_ioc(q);
@@ -3379,7 +3389,7 @@ static ssize_t ioc_cost_model_write(struct kernfs_open_file *of, char *input,
 	memcpy(u, ioc->params.i_lcoefs, sizeof(u));
 	user = ioc->user_cost_model;
 
-	while ((p = strsep(&input, " \t\n"))) {
+	while ((p = strsep(&body, " \t\n"))) {
 		substring_t args[MAX_OPT_ARGS];
 		char buf[32];
 		int tok;
@@ -3426,7 +3436,7 @@ static ssize_t ioc_cost_model_write(struct kernfs_open_file *of, char *input,
 	blk_mq_unquiesce_queue(q);
 	blk_mq_unfreeze_queue(q);
 
-	blkdev_put_no_open(bdev);
+	blkg_conf_exit(&ctx);
 	return nbytes;
 
 einval:
@@ -3437,7 +3447,7 @@ static ssize_t ioc_cost_model_write(struct kernfs_open_file *of, char *input,
 
 	ret = -EINVAL;
 err:
-	blkdev_put_no_open(bdev);
+	blkg_conf_exit(&ctx);
 	return ret;
 }
 
diff --git a/block/blk-iolatency.c b/block/blk-iolatency.c
index ecdc10741836..3b3667f397a9 100644
--- a/block/blk-iolatency.c
+++ b/block/blk-iolatency.c
@@ -842,9 +842,11 @@ static ssize_t iolatency_set_limit(struct kernfs_open_file *of, char *buf,
 	u64 oldval;
 	int ret;
 
-	ret = blkg_conf_prep(blkcg, &blkcg_policy_iolatency, buf, &ctx);
+	blkg_conf_init(&ctx, buf);
+
+	ret = blkg_conf_prep(blkcg, &blkcg_policy_iolatency, &ctx);
 	if (ret)
-		return ret;
+		goto out;
 
 	iolat = blkg_to_lat(ctx.blkg);
 	p = ctx.body;
@@ -880,7 +882,7 @@ static ssize_t iolatency_set_limit(struct kernfs_open_file *of, char *buf,
 		iolatency_clear_scaling(blkg);
 	ret = 0;
 out:
-	blkg_conf_finish(&ctx);
+	blkg_conf_exit(&ctx);
 	return ret ?: nbytes;
 }
 
diff --git a/block/blk-throttle.c b/block/blk-throttle.c
index 6fb5a2f9e1ee..75841d1d9bf4 100644
--- a/block/blk-throttle.c
+++ b/block/blk-throttle.c
@@ -1369,9 +1369,11 @@ static ssize_t tg_set_conf(struct kernfs_open_file *of,
 	int ret;
 	u64 v;
 
-	ret = blkg_conf_prep(blkcg, &blkcg_policy_throtl, buf, &ctx);
+	blkg_conf_init(&ctx, buf);
+
+	ret = blkg_conf_prep(blkcg, &blkcg_policy_throtl, &ctx);
 	if (ret)
-		return ret;
+		goto out_finish;
 
 	ret = -EINVAL;
 	if (sscanf(ctx.body, "%llu", &v) != 1)
@@ -1390,7 +1392,7 @@ static ssize_t tg_set_conf(struct kernfs_open_file *of,
 	tg_conf_updated(tg, false);
 	ret = 0;
 out_finish:
-	blkg_conf_finish(&ctx);
+	blkg_conf_exit(&ctx);
 	return ret ?: nbytes;
 }
 
@@ -1562,9 +1564,11 @@ static ssize_t tg_set_limit(struct kernfs_open_file *of,
 	int ret;
 	int index = of_cft(of)->private;
 
-	ret = blkg_conf_prep(blkcg, &blkcg_policy_throtl, buf, &ctx);
+	blkg_conf_init(&ctx, buf);
+
+	ret = blkg_conf_prep(blkcg, &blkcg_policy_throtl, &ctx);
 	if (ret)
-		return ret;
+		goto out_finish;
 
 	tg = blkg_to_tg(ctx.blkg);
 	tg_update_carryover(tg);
@@ -1663,7 +1667,7 @@ static ssize_t tg_set_limit(struct kernfs_open_file *of,
 		tg->td->limit_valid[LIMIT_LOW]);
 	ret = 0;
 out_finish:
-	blkg_conf_finish(&ctx);
+	blkg_conf_exit(&ctx);
 	return ret ?: nbytes;
 }
 
-- 
2.39.0



* [PATCH 3/4] blk-iolatency: s/blkcg_rq_qos/iolat_rq_qos/
  2023-01-05 21:24 [PATCHSET v2 block/for-next] blkcg: Improve blkg config helpers and make iolatency init lazy Tejun Heo
  2023-01-05 21:24 ` [PATCH 1/4] blkcg: Drop unnecessary RCU read [un]locks from blkg_conf_prep/finish() Tejun Heo
  2023-01-05 21:24 ` [PATCH 2/4] blkcg: Restructure blkg_conf_prep() and friends Tejun Heo
@ 2023-01-05 21:24 ` Tejun Heo
  2023-01-10  7:09   ` Christoph Hellwig
  2023-01-05 21:24 ` [PATCH 4/4] blk-iolatency: Make initialization lazy Tejun Heo
  3 siblings, 1 reply; 13+ messages in thread
From: Tejun Heo @ 2023-01-05 21:24 UTC (permalink / raw)
  To: axboe, josef, hch; +Cc: linux-block, linux-kernel, Tejun Heo

The name was too generic given that there are multiple blkcg rq-qos
policies.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Josef Bacik <josef@toxicpanda.com>
---
 block/blk-iolatency.c | 2 +-
 block/blk-rq-qos.h    | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/block/blk-iolatency.c b/block/blk-iolatency.c
index 3b3667f397a9..3601345808d2 100644
--- a/block/blk-iolatency.c
+++ b/block/blk-iolatency.c
@@ -976,7 +976,7 @@ static void iolatency_pd_init(struct blkg_policy_data *pd)
 {
 	struct iolatency_grp *iolat = pd_to_lat(pd);
 	struct blkcg_gq *blkg = lat_to_blkg(iolat);
-	struct rq_qos *rqos = blkcg_rq_qos(blkg->q);
+	struct rq_qos *rqos = iolat_rq_qos(blkg->q);
 	struct blk_iolatency *blkiolat = BLKIOLATENCY(rqos);
 	u64 now = ktime_to_ns(ktime_get());
 	int cpu;
diff --git a/block/blk-rq-qos.h b/block/blk-rq-qos.h
index 1ef1f7d4bc3c..27f004fae66b 100644
--- a/block/blk-rq-qos.h
+++ b/block/blk-rq-qos.h
@@ -74,7 +74,7 @@ static inline struct rq_qos *wbt_rq_qos(struct request_queue *q)
 	return rq_qos_id(q, RQ_QOS_WBT);
 }
 
-static inline struct rq_qos *blkcg_rq_qos(struct request_queue *q)
+static inline struct rq_qos *iolat_rq_qos(struct request_queue *q)
 {
 	return rq_qos_id(q, RQ_QOS_LATENCY);
 }
-- 
2.39.0



* [PATCH 4/4] blk-iolatency: Make initialization lazy
  2023-01-05 21:24 [PATCHSET v2 block/for-next] blkcg: Improve blkg config helpers and make iolatency init lazy Tejun Heo
                   ` (2 preceding siblings ...)
  2023-01-05 21:24 ` [PATCH 3/4] blk-iolatency: s/blkcg_rq_qos/iolat_rq_qos/ Tejun Heo
@ 2023-01-05 21:24 ` Tejun Heo
  2023-01-10  7:10   ` Christoph Hellwig
  3 siblings, 1 reply; 13+ messages in thread
From: Tejun Heo @ 2023-01-05 21:24 UTC (permalink / raw)
  To: axboe, josef, hch; +Cc: linux-block, linux-kernel, Tejun Heo

Other rq_qos policies such as wbt and iocost are lazy-initialized when they
are configured for the first time for the device but iolatency is
initialized unconditionally from blkcg_init_disk() during gendisk init. Lazy
init is beneficial because rq_qos policies add runtime overhead when
initialized as every IO has to walk all registered rq_qos callbacks.

This patch switches iolatency to lazy initialization too so that it only
registers its rq_qos policy when it is first configured.
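
As a rough user-space model of the idea (not the kernel implementation), the
sketch below keeps a per-queue array of hooks that every IO walks, and
registers the latency hook only on first configuration. All names here
(struct queue, lat_hook(), lat_try_init(), submit_io(), MAX_POLICIES) are
hypothetical, and a pthread mutex stands in for the kernel mutex that makes
the "is it registered?" test and the registration atomic.

```c
#include <errno.h>
#include <pthread.h>
#include <stddef.h>

#define MAX_POLICIES 4

typedef void (*io_hook_fn)(void);

/* Hypothetical stand-in for a request queue with registered policies. */
struct queue {
	io_hook_fn hooks[MAX_POLICIES];
	size_t nr_hooks;
};

static unsigned long lat_hook_calls;

/* Per-IO work done by the latency policy in this model. */
static void lat_hook(void)
{
	lat_hook_calls++;
}

static int queue_has_hook(const struct queue *q, io_hook_fn fn)
{
	for (size_t i = 0; i < q->nr_hooks; i++)
		if (q->hooks[i] == fn)
			return 1;
	return 0;
}

/* Register the latency hook on first use; test and init are atomic. */
static int lat_try_init(struct queue *q)
{
	static pthread_mutex_t init_mutex = PTHREAD_MUTEX_INITIALIZER;
	int ret = 0;

	pthread_mutex_lock(&init_mutex);
	if (!queue_has_hook(q, lat_hook)) {
		if (q->nr_hooks < MAX_POLICIES)
			q->hooks[q->nr_hooks++] = lat_hook;
		else
			ret = -ENOSPC;
	}
	pthread_mutex_unlock(&init_mutex);

	return ret;
}

/* Hot path: every IO walks all registered hooks. */
static void submit_io(struct queue *q)
{
	for (size_t i = 0; i < q->nr_hooks; i++)
		q->hooks[i]();
}
```

In this model a queue that never has a latency target configured runs
submit_io() with an empty hook list, and calling lat_try_init() repeatedly
registers the hook only once.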

Note that there is a known race condition between blkcg config file writes
and del_gendisk() and this patch makes iolatency susceptible to it by
exposing the init path to race against the deletion path. However, that
problem already exists in iocost and is being worked on.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Josef Bacik <josef@toxicpanda.com>
Cc: Christoph Hellwig <hch@lst.de>
---
 block/blk-cgroup.c    |  8 --------
 block/blk-iolatency.c | 29 ++++++++++++++++++++++++++++-
 block/blk.h           |  6 ------
 3 files changed, 28 insertions(+), 15 deletions(-)

diff --git a/block/blk-cgroup.c b/block/blk-cgroup.c
index d8e0625cd12d..844579aff363 100644
--- a/block/blk-cgroup.c
+++ b/block/blk-cgroup.c
@@ -33,7 +33,6 @@
 #include "blk-cgroup.h"
 #include "blk-ioprio.h"
 #include "blk-throttle.h"
-#include "blk-rq-qos.h"
 
 /*
  * blkcg_pol_mutex protects blkcg_policy[] and policy [de]activation.
@@ -1322,14 +1321,8 @@ int blkcg_init_disk(struct gendisk *disk)
 	if (ret)
 		goto err_ioprio_exit;
 
-	ret = blk_iolatency_init(disk);
-	if (ret)
-		goto err_throtl_exit;
-
 	return 0;
 
-err_throtl_exit:
-	blk_throtl_exit(disk);
 err_ioprio_exit:
 	blk_ioprio_exit(disk);
 err_destroy_all:
@@ -1345,7 +1338,6 @@ int blkcg_init_disk(struct gendisk *disk)
 void blkcg_exit_disk(struct gendisk *disk)
 {
 	blkg_destroy_all(disk);
-	rq_qos_exit(disk->queue);
 	blk_throtl_exit(disk);
 }
 
diff --git a/block/blk-iolatency.c b/block/blk-iolatency.c
index 3601345808d2..3484393dbc4a 100644
--- a/block/blk-iolatency.c
+++ b/block/blk-iolatency.c
@@ -755,7 +755,7 @@ static void blkiolatency_enable_work_fn(struct work_struct *work)
 	}
 }
 
-int blk_iolatency_init(struct gendisk *disk)
+static int blk_iolatency_init(struct gendisk *disk)
 {
 	struct request_queue *q = disk->queue;
 	struct blk_iolatency *blkiolat;
@@ -830,6 +830,29 @@ static void iolatency_clear_scaling(struct blkcg_gq *blkg)
 	}
 }
 
+static int blk_iolatency_try_init(struct blkg_conf_ctx *ctx)
+{
+	static DEFINE_MUTEX(init_mutex);
+	int ret;
+
+	ret = blkg_conf_open_bdev(ctx);
+	if (ret)
+		return ret;
+
+	/*
+	 * blk_iolatency_init() may fail after rq_qos_add() succeeds which can
+	 * confuse iolat_rq_qos() test. Make the test and init atomic.
+	 */
+	mutex_lock(&init_mutex);
+
+	if (!iolat_rq_qos(ctx->bdev->bd_queue))
+		ret = blk_iolatency_init(ctx->bdev->bd_disk);
+
+	mutex_unlock(&init_mutex);
+
+	return ret;
+}
+
 static ssize_t iolatency_set_limit(struct kernfs_open_file *of, char *buf,
 			     size_t nbytes, loff_t off)
 {
@@ -844,6 +867,10 @@ static ssize_t iolatency_set_limit(struct kernfs_open_file *of, char *buf,
 
 	blkg_conf_init(&ctx, buf);
 
+	ret = blk_iolatency_try_init(&ctx);
+	if (ret)
+		goto out;
+
 	ret = blkg_conf_prep(blkcg, &blkcg_policy_iolatency, &ctx);
 	if (ret)
 		goto out;
diff --git a/block/blk.h b/block/blk.h
index 4c3b3325219a..78f1706cddca 100644
--- a/block/blk.h
+++ b/block/blk.h
@@ -392,12 +392,6 @@ static inline struct bio *blk_queue_bounce(struct bio *bio,
 	return bio;
 }
 
-#ifdef CONFIG_BLK_CGROUP_IOLATENCY
-int blk_iolatency_init(struct gendisk *disk);
-#else
-static inline int blk_iolatency_init(struct gendisk *disk) { return 0; };
-#endif
-
 #ifdef CONFIG_BLK_DEV_ZONED
 void disk_free_zone_bitmaps(struct gendisk *disk);
 void disk_clear_zone_settings(struct gendisk *disk);
-- 
2.39.0



* Re: [PATCH 1/4] blkcg: Drop unnecessary RCU read [un]locks from blkg_conf_prep/finish()
  2023-01-05 21:24 ` [PATCH 1/4] blkcg: Drop unnecessary RCU read [un]locks from blkg_conf_prep/finish() Tejun Heo
@ 2023-01-08 17:02   ` Christoph Hellwig
  2023-01-09 20:48     ` Tejun Heo
  0 siblings, 1 reply; 13+ messages in thread
From: Christoph Hellwig @ 2023-01-08 17:02 UTC (permalink / raw)
  To: Tejun Heo; +Cc: axboe, josef, hch, linux-block, linux-kernel

On Thu, Jan 05, 2023 at 11:24:29AM -1000, Tejun Heo wrote:
> Holding the queue lock now implies RCU read lock, so no need to use
> rcu_read_[un]lock() explicitly. This shouldn't cause any behavior changes.

How so?

> While at it, drop __acquires() annotation on the queue lock too. The
> __acquires() part was already out of sync and it doesn't catch anything that
> lockdep can't.

This makes sparse even more unhappy than it was before.  For now
please keep the annotation.


* Re: [PATCH 1/4] blkcg: Drop unnecessary RCU read [un]locks from blkg_conf_prep/finish()
  2023-01-08 17:02   ` Christoph Hellwig
@ 2023-01-09 20:48     ` Tejun Heo
  2023-01-10  6:49       ` Christoph Hellwig
  0 siblings, 1 reply; 13+ messages in thread
From: Tejun Heo @ 2023-01-09 20:48 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: axboe, josef, linux-block, linux-kernel

Hello, Christoph.

On Sun, Jan 08, 2023 at 06:02:40PM +0100, Christoph Hellwig wrote:
> On Thu, Jan 05, 2023 at 11:24:29AM -1000, Tejun Heo wrote:
> > Holding the queue lock now implies RCU read lock, so no need to use
> > rcu_read_[un]lock() explicitly. This shouldn't cause any behavior changes.
> 
> How so?

Now that all RCU flavors have been combined, holding a spin lock, disabling
irq, disabling preemption all imply RCU read lock.

> > While at it, drop __acquires() annotation on the queue lock too. The
> > __acquires() part was already out of sync and it doesn't catch anything that
> > lockdep can't.
> 
> This makes sparse even more unhappy than it was before.  For now
> please keep the annotation.

I can drop the changes but this actually bothers me. The annotation has been
broken for a *long* time and nobody noticed. Furthermore, I can't remember a
time when __acquires/__releases notation caught anything that lockdep
couldn't trivially and can't even think of a way how it could. AFAICS, these
annotations don't contribute anything other than preservation of themselves.
I don't see why we would want to keep them.

Thanks.

-- 
tejun


* Re: [PATCH 1/4] blkcg: Drop unnecessary RCU read [un]locks from blkg_conf_prep/finish()
  2023-01-09 20:48     ` Tejun Heo
@ 2023-01-10  6:49       ` Christoph Hellwig
  2023-01-10 18:24         ` Tejun Heo
  0 siblings, 1 reply; 13+ messages in thread
From: Christoph Hellwig @ 2023-01-10  6:49 UTC (permalink / raw)
  To: Tejun Heo; +Cc: Christoph Hellwig, axboe, josef, linux-block, linux-kernel

On Mon, Jan 09, 2023 at 10:48:55AM -1000, Tejun Heo wrote:
> Now that all RCU flavors have been combined, holding a spin lock, disabling
> irq, disabling preemption all imply RCU read lock.

Can you write it like this in the commit log, please? 

> I can drop the changes but this actually bothers me. The annotation has been
> broken for a *long* time and nobody noticed. Furthermore, I can't remember a
> time when __acquires/__releases notation caught anything that lockdep
> couldn't trivially and can't even think of a way how it could. AFAICS, these
> annotations don't contribute anything other than preservation of themselves.
> I don't see why we would want to keep them.

People have noticed it.  It just hasn't been a priority as there are
lots of even more problematic things.


* Re: [PATCH 2/4] blkcg: Restructure blkg_conf_prep() and friends
  2023-01-05 21:24 ` [PATCH 2/4] blkcg: Restructure blkg_conf_prep() and friends Tejun Heo
@ 2023-01-10  7:09   ` Christoph Hellwig
  2023-01-10 18:33     ` Tejun Heo
  0 siblings, 1 reply; 13+ messages in thread
From: Christoph Hellwig @ 2023-01-10  7:09 UTC (permalink / raw)
  To: Tejun Heo; +Cc: axboe, josef, hch, linux-block, linux-kernel

On Thu, Jan 05, 2023 at 11:24:30AM -1000, Tejun Heo wrote:
> * blkg_conf_open_bdev() is updated to take a pointer to blkg_conf_ctx like
>   blkg_conf_prep() and can be called multiple times safely. Instead of
>   modifying the double pointer to input string directly,
>   blkg_conf_open_bdev() now sets blkg_conf_ctx->body.

This looks pretty awkward for the external callers of blkg_conf_open_bdev
in blk-iocost.  I'd either keep the calling conventions as they are
at the moment, or just open code blkg_conf_open_bdev in blk-iocost.
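
To make the two calling conventions concrete, here is a user-space sketch of
both. It is only an illustration under assumptions: struct toy_conf_ctx and
every toy_ function are hypothetical stand-ins, the "device" is just a parsed
MAJ:MIN pair, and nothing here is taken from the actual patch.

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for a config context; fields chosen for the sketch. */
struct toy_conf_ctx {
	char *input;		/* full "MAJ:MIN rest..." string */
	char *body;		/* set on open: text following MAJ:MIN */
	unsigned int major;
	unsigned int minor;
	int opened;		/* nonzero once the device has been resolved */
};

static void toy_conf_init(struct toy_conf_ctx *ctx, char *input)
{
	memset(ctx, 0, sizeof(*ctx));
	ctx->input = input;
}

/* Old style: parse MAJ:MIN and advance the caller's pointer past it. */
static int toy_open_dev_old(char **inputp, unsigned int *major,
			    unsigned int *minor)
{
	char *p = *inputp;
	int key_len;

	if (sscanf(p, "%u:%u%n", major, minor, &key_len) != 2)
		return -EINVAL;
	p += key_len;
	if (!isspace((unsigned char)*p))
		return -EINVAL;
	while (isspace((unsigned char)*p))
		p++;
	*inputp = p;		/* caller's pointer is modified in place */
	return 0;
}

/*
 * New style: the result lands in ctx->body and ctx->input is left
 * untouched. A second call on an already opened context is a no-op,
 * so a caller and a helper it invokes can both call it safely.
 */
static int toy_conf_open_dev(struct toy_conf_ctx *ctx)
{
	char *p = ctx->input;
	int ret;

	if (ctx->opened)
		return 0;
	ret = toy_open_dev_old(&p, &ctx->major, &ctx->minor);
	if (ret)
		return ret;
	ctx->body = p;
	ctx->opened = 1;
	return 0;
}
```

The trade-off in the sketch is the one being debated: the context version
needs an init step and an extra struct at each call site, but it gives one
place to hang shared state and makes repeated opens harmless.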

* Re: [PATCH 3/4] blk-iolatency: s/blkcg_rq_qos/iolat_rq_qos/
  2023-01-05 21:24 ` [PATCH 3/4] blk-iolatency: s/blkcg_rq_qos/iolat_rq_qos/ Tejun Heo
@ 2023-01-10  7:09   ` Christoph Hellwig
  0 siblings, 0 replies; 13+ messages in thread
From: Christoph Hellwig @ 2023-01-10  7:09 UTC (permalink / raw)
  To: Tejun Heo; +Cc: axboe, josef, hch, linux-block, linux-kernel

Looks good:

Reviewed-by: Christoph Hellwig <hch@lst.de>

* Re: [PATCH 4/4] blk-iolatency: Make initialization lazy
  2023-01-05 21:24 ` [PATCH 4/4] blk-iolatency: Make initialization lazy Tejun Heo
@ 2023-01-10  7:10   ` Christoph Hellwig
  0 siblings, 0 replies; 13+ messages in thread
From: Christoph Hellwig @ 2023-01-10  7:10 UTC (permalink / raw)
  To: Tejun Heo; +Cc: axboe, josef, hch, linux-block, linux-kernel

Looks good:

Reviewed-by: Christoph Hellwig <hch@lst.de>

* Re: [PATCH 1/4] blkcg: Drop unnecessary RCU read [un]locks from blkg_conf_prep/finish()
  2023-01-10  6:49       ` Christoph Hellwig
@ 2023-01-10 18:24         ` Tejun Heo
  0 siblings, 0 replies; 13+ messages in thread
From: Tejun Heo @ 2023-01-10 18:24 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: axboe, josef, linux-block, linux-kernel

Hello,

On Tue, Jan 10, 2023 at 07:49:00AM +0100, Christoph Hellwig wrote:
> On Mon, Jan 09, 2023 at 10:48:55AM -1000, Tejun Heo wrote:
> > Now that all RCU flavors have been combined, holding a spin lock, disabling
> > irq, disabling preemption all imply RCU read lock.
> 
> Can you write it like this in the commit log, please? 

Sure, will do.

> > I can drop the changes but this actually bothers me. The annotation has been
> > broken for a *long* time and nobody noticed. Furthermore, I can't remember a
> > time when the __acquires/__releases annotations caught anything that lockdep
> > couldn't trivially catch, and I can't even think of how they could. AFAICS,
> > these annotations don't contribute anything other than preservation of
> > themselves. I don't see why we would want to keep them.
> 
> People have noticed it.  It just hasn't been a priority as there are
> lots of even more problematic things.

That doesn't really shed a positive light on them, does it? I'll drop this
part but can you think of actual reasons to keep these around other than to
keep sparse happy? I'm genuinely curious and have asked several people.
Nobody had a good answer.

Thanks.

-- 
tejun

* Re: [PATCH 2/4] blkcg: Restructure blkg_conf_prep() and friends
  2023-01-10  7:09   ` Christoph Hellwig
@ 2023-01-10 18:33     ` Tejun Heo
  0 siblings, 0 replies; 13+ messages in thread
From: Tejun Heo @ 2023-01-10 18:33 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: axboe, josef, linux-block, linux-kernel

On Tue, Jan 10, 2023 at 08:09:29AM +0100, Christoph Hellwig wrote:
> On Thu, Jan 05, 2023 at 11:24:30AM -1000, Tejun Heo wrote:
> > * blkg_conf_open_bdev() is updated to take a pointer to blkg_conf_ctx like
> >   blkg_conf_prep() and can be called multiple times safely. Instead of
> >   modifying the double pointer to input string directly,
> >   blkg_conf_open_bdev() now sets blkg_conf_ctx->body.
> 
> This looks pretty awkward for the external callers of blkg_conf_open_bdev
> in blk-iocost.  I'd either keep the calling conventions as they are
> at the moment, or just open code blkg_conf_open_bdev in blk-iocost.

Because we're coming in from cgroupfs, we aren't synchronizing properly
against blkdevs going away. For all config attempts coming in from the cgroup
side, we'll need to synchronize explicitly and these config helper blocks
look like a good place to do so. Please take a look at the thread with Yu
Kuai. I'll update the comment to include that, but yeah, let's keep it this
way for that reason.

Thanks.

-- 
tejun

Thread overview: 13+ messages
2023-01-05 21:24 [PATCHSET v2 block/for-next] blkcg: Improve blkg config helpers and make iolatency init lazy Tejun Heo
2023-01-05 21:24 ` [PATCH 1/4] blkcg: Drop unnecessary RCU read [un]locks from blkg_conf_prep/finish() Tejun Heo
2023-01-08 17:02   ` Christoph Hellwig
2023-01-09 20:48     ` Tejun Heo
2023-01-10  6:49       ` Christoph Hellwig
2023-01-10 18:24         ` Tejun Heo
2023-01-05 21:24 ` [PATCH 2/4] blkcg: Restructure blkg_conf_prep() and friends Tejun Heo
2023-01-10  7:09   ` Christoph Hellwig
2023-01-10 18:33     ` Tejun Heo
2023-01-05 21:24 ` [PATCH 3/4] blk-iolatency: s/blkcg_rq_qos/iolat_rq_qos/ Tejun Heo
2023-01-10  7:09   ` Christoph Hellwig
2023-01-05 21:24 ` [PATCH 4/4] blk-iolatency: Make initialization lazy Tejun Heo
2023-01-10  7:10   ` Christoph Hellwig