* [PATCH 0/2 block/for-next] block, blkcg, bfq: make bfq disable iocost and delete bfq prefix from cgroup filenames @ 2019-09-17 16:51 Paolo Valente 2019-09-17 16:51 ` [PATCH 1/2] blkcg: Make bfq disable iocost when enabled Paolo Valente 2019-09-17 16:51 ` [PATCH 2/2] block, bfq: delete "bfq" prefix from cgroup filenames Paolo Valente 0 siblings, 2 replies; 13+ messages in thread From: Paolo Valente @ 2019-09-17 16:51 UTC (permalink / raw) To: Jens Axboe Cc: linux-block, linux-kernel, ulf.hansson, linus.walleij, bfq-iosched, oleksandr, Tejun Heo, cgroups, Paolo Valente Hi Jens, here is the pair of patches from [1] and [2]. Thanks, Paolo [1] https://lkml.org/lkml/2019/9/16/469 [2] https://lore.kernel.org/linux-block/20190917151334.GI3084169@devbig004.ftw2.facebook.com/ Angelo Ruocco (1): block, bfq: delete "bfq" prefix from cgroup filenames Tejun Heo (1): blkcg: Make bfq disable iocost when enabled Documentation/admin-guide/cgroup-v2.rst | 8 +++-- block/bfq-cgroup.c | 48 +++++++++++++------------ block/bfq-iosched.c | 32 +++++++++++++++++ block/blk-iocost.c | 5 ++- include/linux/blk-cgroup.h | 5 +++ kernel/cgroup/cgroup.c | 2 ++ 6 files changed, 71 insertions(+), 29 deletions(-) -- 2.20.1 ^ permalink raw reply [flat|nested] 13+ messages in thread
* [PATCH 1/2] blkcg: Make bfq disable iocost when enabled 2019-09-17 16:51 [PATCH 0/2 block/for-next] block, blkcg, bfq: make bfq disable iocost and delete bfq prefix from cgroup filenames Paolo Valente @ 2019-09-17 16:51 ` Paolo Valente 2019-09-17 16:51 ` [PATCH 2/2] block, bfq: delete "bfq" prefix from cgroup filenames Paolo Valente 1 sibling, 0 replies; 13+ messages in thread From: Paolo Valente @ 2019-09-17 16:51 UTC (permalink / raw) To: Jens Axboe Cc: linux-block, linux-kernel, ulf.hansson, linus.walleij, bfq-iosched, oleksandr, Tejun Heo, cgroups, Paolo Valente From: Tejun Heo <tj@kernel.org> Both iocost and bfq implement weight based IO control. Currently, bfq is using io.bfq prefix but wants to drop the bfq part. To avoid interface conflict, make bfq disable iocost when it's selected as the IO scheduler for any block device on the system. iocost is only re-enabled when bfq is built as a module and unloaded. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Paolo Valente <paolo.valente@linaro.org> --- Documentation/admin-guide/cgroup-v2.rst | 8 ++++--- block/bfq-cgroup.c | 2 ++ block/bfq-iosched.c | 32 +++++++++++++++++++++++++ block/blk-iocost.c | 5 ++-- include/linux/blk-cgroup.h | 5 ++++ kernel/cgroup/cgroup.c | 2 ++ 6 files changed, 48 insertions(+), 6 deletions(-) diff --git a/Documentation/admin-guide/cgroup-v2.rst b/Documentation/admin-guide/cgroup-v2.rst index 3deacdc5e6d2..d4d06a970f8a 100644 --- a/Documentation/admin-guide/cgroup-v2.rst +++ b/Documentation/admin-guide/cgroup-v2.rst @@ -1406,9 +1406,11 @@ IO The "io" controller regulates the distribution of IO resources. This controller implements both weight based and absolute bandwidth or IOPS -limit distribution; however, weight based distribution is available -only if cfq-iosched is in use and neither scheme is available for -blk-mq devices. +limit distribution. Weight based distribution is implemented by +either iocost controller or bfq IO scheduler. When bfq is selected as +the IO scheduler for any block device, iocost is disabled and bfq's +implementation overrides for all devices. If bfq is built as a kernel +module, unloading it re-enables iocost. IO Interface Files diff --git a/block/bfq-cgroup.c b/block/bfq-cgroup.c index 86a607cf19a1..decda96770f4 100644 --- a/block/bfq-cgroup.c +++ b/block/bfq-cgroup.c @@ -1194,7 +1194,9 @@ struct bfq_group *bfq_create_group_hierarchy(struct bfq_data *bfqd, int node) } struct blkcg_policy blkcg_policy_bfq = { +#ifndef CONFIG_BLK_CGROUP_IOCOST .dfl_cftypes = bfq_blkg_files, +#endif .legacy_cftypes = bfq_blkcg_legacy_files, .cpd_alloc_fn = bfq_cpd_alloc, diff --git a/block/bfq-iosched.c b/block/bfq-iosched.c index 0319d6339822..21d1b08610b1 100644 --- a/block/bfq-iosched.c +++ b/block/bfq-iosched.c @@ -6382,6 +6382,36 @@ static void bfq_init_root_group(struct bfq_group *root_group, root_group->sched_data.bfq_class_idle_last_service = jiffies; } +#if defined(CONFIG_BFQ_GROUP_IOSCHED) && defined(CONFIG_BLK_CGROUP_IOCOST) +static bool bfq_enabled = false; + +static void bfq_enable(void) +{ + static DEFINE_MUTEX(bfq_enable_mutex); + + mutex_lock(&bfq_enable_mutex); + if (!bfq_enabled) { + pr_info("bfq-iosched: Overriding iocost\n"); + blkcg_policy_unregister(&blkcg_policy_iocost); + cgroup_add_dfl_cftypes(&io_cgrp_subsys, bfq_blkg_files); + bfq_enabled = true; + } + mutex_unlock(&bfq_enable_mutex); +} + +static void __exit bfq_disable(void) +{ + if (bfq_enabled) { + pr_info("bfq-iosched: Restoring iocost\n"); + cgroup_rm_cftypes(bfq_blkg_files); + blkcg_policy_register(&blkcg_policy_iocost); + } +} +#else +static void bfq_enable(void) {} +static void __exit bfq_disable(void) {} +#endif + static int bfq_init_queue(struct request_queue *q, struct elevator_type *e) { struct bfq_data *bfqd; @@ -6506,6 +6536,7 @@ static int bfq_init_queue(struct request_queue *q, struct elevator_type *e) bfq_init_entity(&bfqd->oom_bfqq.entity, bfqd->root_group); wbt_disable_default(q); + bfq_enable(); return 0; out_free: @@ -6823,6 +6854,7 @@ static void __exit bfq_exit(void) blkcg_policy_unregister(&blkcg_policy_bfq); #endif bfq_slab_kill(); + bfq_disable(); } module_init(bfq_init); diff --git a/block/blk-iocost.c b/block/blk-iocost.c index 3b39deb8b9f8..1ef5b443c09a 100644 --- a/block/blk-iocost.c +++ b/block/blk-iocost.c @@ -605,8 +605,6 @@ static u32 vrate_adj_pct[] = 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 4, 4, 4, 4, 4, 4, 4, 4, 8, 8, 8, 8, 8, 8, 8, 8, 16 }; -static struct blkcg_policy blkcg_policy_iocost; - /* accessors and helpers */ static struct ioc *rqos_to_ioc(struct rq_qos *rqos) { @@ -2434,7 +2432,7 @@ static struct cftype ioc_files[] = { {} }; -static struct blkcg_policy blkcg_policy_iocost = { +struct blkcg_policy blkcg_policy_iocost = { .dfl_cftypes = ioc_files, .cpd_alloc_fn = ioc_cpd_alloc, .cpd_free_fn = ioc_cpd_free, @@ -2442,6 +2440,7 @@ static struct blkcg_policy blkcg_policy_iocost = { .pd_init_fn = ioc_pd_init, .pd_free_fn = ioc_pd_free, }; +EXPORT_SYMBOL_GPL(blkcg_policy_iocost); static int __init ioc_init(void) { diff --git a/include/linux/blk-cgroup.h b/include/linux/blk-cgroup.h index bed9e43f9426..5669e3cfa1bc 100644 --- a/include/linux/blk-cgroup.h +++ b/include/linux/blk-cgroup.h @@ -815,6 +815,11 @@ static inline void blkcg_clear_delay(struct blkcg_gq *blkg) void blkcg_add_delay(struct blkcg_gq *blkg, u64 now, u64 delta); void blkcg_schedule_throttle(struct request_queue *q, bool use_memdelay); void blkcg_maybe_throttle_current(void); + +#ifdef CONFIG_BLK_CGROUP_IOCOST +extern struct blkcg_policy blkcg_policy_iocost; +#endif + #else /* CONFIG_BLK_CGROUP */ struct blkcg { diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c index 8be1da1ebd9a..4d015328ebb0 100644 --- a/kernel/cgroup/cgroup.c +++ b/kernel/cgroup/cgroup.c @@ -4059,6 +4059,7 @@ int cgroup_rm_cftypes(struct cftype *cfts) mutex_unlock(&cgroup_mutex); return ret; } +EXPORT_SYMBOL_GPL(cgroup_rm_cftypes); /** * cgroup_add_cftypes - add an array of cftypes to a subsystem @@ -4115,6 +4116,7 @@ int cgroup_add_dfl_cftypes(struct cgroup_subsys *ss, struct cftype *cfts) cft->flags |= __CFTYPE_ONLY_ON_DFL; return cgroup_add_cftypes(ss, cfts); } +EXPORT_SYMBOL_GPL(cgroup_add_dfl_cftypes); /** * cgroup_add_legacy_cftypes - add an array of cftypes for legacy hierarchies -- 2.20.1 ^ permalink raw reply related [flat|nested] 13+ messages in thread
* [PATCH 2/2] block, bfq: delete "bfq" prefix from cgroup filenames 2019-09-17 16:51 [PATCH 0/2 block/for-next] block, blkcg, bfq: make bfq disable iocost and delete bfq prefix from cgroup filenames Paolo Valente 2019-09-17 16:51 ` [PATCH 1/2] blkcg: Make bfq disable iocost when enabled Paolo Valente @ 2019-09-17 16:51 ` Paolo Valente 2019-09-17 21:04 ` Chaitanya Kulkarni 2019-09-17 21:32 ` Tejun Heo 1 sibling, 2 replies; 13+ messages in thread From: Paolo Valente @ 2019-09-17 16:51 UTC (permalink / raw) To: Jens Axboe Cc: linux-block, linux-kernel, ulf.hansson, linus.walleij, bfq-iosched, oleksandr, Tejun Heo, cgroups, Angelo Ruocco, Paolo Valente From: Angelo Ruocco <angeloruocco90@gmail.com> When bfq was merged into mainline, there were two I/O schedulers that implemented the proportional-share policy: bfq for blk-mq and cfq for legacy blk. bfq's interface files in the blkio/io controller have the same names as cfq. But the cgroups interface doesn't allow two entities to use the same name for their files, so for bfq we had to prepend the "bfq" prefix to each of its files. However no legacy code uses these modified file names. This naming also causes confusion, as, e.g., in [1]. Now cfq has gone with legacy blk, so there is no need any longer for these prefixes in (the never used) bfq names. In view of this fact, this commit removes these prefixes, thereby enabling legacy code to truly use the proportional share policy in blk-mq. [1] https://github.com/systemd/systemd/issues/7057 Signed-off-by: Angelo Ruocco <angeloruocco90@gmail.com> Signed-off-by: Paolo Valente <paolo.valente@linaro.org> --- block/bfq-cgroup.c | 46 +++++++++++++++++++++++----------------------- 1 file changed, 23 insertions(+), 23 deletions(-) diff --git a/block/bfq-cgroup.c b/block/bfq-cgroup.c index decda96770f4..7f0160f5155f 100644 --- a/block/bfq-cgroup.c +++ b/block/bfq-cgroup.c @@ -1213,7 +1213,7 @@ struct blkcg_policy blkcg_policy_bfq = { struct cftype bfq_blkcg_legacy_files[] = { { - .name = "bfq.weight", + .name = "weight", .flags = CFTYPE_NOT_ON_ROOT, .seq_show = bfq_io_show_weight_legacy, .write_u64 = bfq_io_set_weight_legacy, @@ -1227,42 +1227,42 @@ struct cftype bfq_blkcg_legacy_files[] = { /* statistics, covers only the tasks in the bfqg */ { - .name = "bfq.io_service_bytes", + .name = "io_service_bytes", .private = (unsigned long)&blkcg_policy_bfq, .seq_show = blkg_print_stat_bytes, }, { - .name = "bfq.io_serviced", + .name = "io_serviced", .private = (unsigned long)&blkcg_policy_bfq, .seq_show = blkg_print_stat_ios, }, #ifdef CONFIG_BFQ_CGROUP_DEBUG { - .name = "bfq.time", + .name = "time", .private = offsetof(struct bfq_group, stats.time), .seq_show = bfqg_print_stat, }, { - .name = "bfq.sectors", + .name = "sectors", .seq_show = bfqg_print_stat_sectors, }, { - .name = "bfq.io_service_time", + .name = "io_service_time", .private = offsetof(struct bfq_group, stats.service_time), .seq_show = bfqg_print_rwstat, }, { - .name = "bfq.io_wait_time", + .name = "io_wait_time", .private = offsetof(struct bfq_group, stats.wait_time), .seq_show = bfqg_print_rwstat, }, { - .name = "bfq.io_merged", + .name = "io_merged", .private = offsetof(struct bfq_group, stats.merged), .seq_show = bfqg_print_rwstat, }, { - .name = "bfq.io_queued", + .name = "io_queued", .private = offsetof(struct bfq_group, stats.queued), .seq_show = bfqg_print_rwstat, }, @@ -1270,66 +1270,66 @@ struct cftype bfq_blkcg_legacy_files[] = { /* the same statistics which cover the bfqg and its descendants */ { - .name = "bfq.io_service_bytes_recursive", + .name = "io_service_bytes_recursive", .private = (unsigned long)&blkcg_policy_bfq, .seq_show = blkg_print_stat_bytes_recursive, }, { - .name = "bfq.io_serviced_recursive", + .name = "io_serviced_recursive", .private = (unsigned long)&blkcg_policy_bfq, .seq_show = blkg_print_stat_ios_recursive, }, #ifdef CONFIG_BFQ_CGROUP_DEBUG { - .name = "bfq.time_recursive", + .name = "time_recursive", .private = offsetof(struct bfq_group, stats.time), .seq_show = bfqg_print_stat_recursive, }, { - .name = "bfq.sectors_recursive", + .name = "sectors_recursive", .seq_show = bfqg_print_stat_sectors_recursive, }, { - .name = "bfq.io_service_time_recursive", + .name = "io_service_time_recursive", .private = offsetof(struct bfq_group, stats.service_time), .seq_show = bfqg_print_rwstat_recursive, }, { - .name = "bfq.io_wait_time_recursive", + .name = "io_wait_time_recursive", .private = offsetof(struct bfq_group, stats.wait_time), .seq_show = bfqg_print_rwstat_recursive, }, { - .name = "bfq.io_merged_recursive", + .name = "io_merged_recursive", .private = offsetof(struct bfq_group, stats.merged), .seq_show = bfqg_print_rwstat_recursive, }, { - .name = "bfq.io_queued_recursive", + .name = "io_queued_recursive", .private = offsetof(struct bfq_group, stats.queued), .seq_show = bfqg_print_rwstat_recursive, }, { - .name = "bfq.avg_queue_size", + .name = "avg_queue_size", .seq_show = bfqg_print_avg_queue_size, }, { - .name = "bfq.group_wait_time", + .name = "group_wait_time", .private = offsetof(struct bfq_group, stats.group_wait_time), .seq_show = bfqg_print_stat, }, { - .name = "bfq.idle_time", + .name = "idle_time", .private = offsetof(struct bfq_group, stats.idle_time), .seq_show = bfqg_print_stat, }, { - .name = "bfq.empty_time", + .name = "empty_time", .private = offsetof(struct bfq_group, stats.empty_time), .seq_show = bfqg_print_stat, }, { - .name = "bfq.dequeue", + .name = "dequeue", .private = offsetof(struct bfq_group, stats.dequeue), .seq_show = bfqg_print_stat, }, @@ -1339,7 +1339,7 @@ struct cftype bfq_blkcg_legacy_files[] = { struct cftype bfq_blkg_files[] = { { - .name = "bfq.weight", + .name = "weight", .flags = CFTYPE_NOT_ON_ROOT, .seq_show = bfq_io_show_weight, .write = bfq_io_set_weight, -- 2.20.1 ^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: [PATCH 2/2] block, bfq: delete "bfq" prefix from cgroup filenames 2019-09-17 16:51 ` [PATCH 2/2] block, bfq: delete "bfq" prefix from cgroup filenames Paolo Valente @ 2019-09-17 21:04 ` Chaitanya Kulkarni 2019-09-17 21:32 ` Tejun Heo 1 sibling, 0 replies; 13+ messages in thread From: Chaitanya Kulkarni @ 2019-09-17 21:04 UTC (permalink / raw) To: Paolo Valente, Jens Axboe Cc: linux-block, linux-kernel, ulf.hansson, linus.walleij, bfq-iosched, oleksandr, Tejun Heo, cgroups, Angelo Ruocco Looks good to me. Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> On 09/17/2019 09:52 AM, Paolo Valente wrote: > From: Angelo Ruocco<angeloruocco90@gmail.com> > > When bfq was merged into mainline, there were two I/O schedulers that > implemented the proportional-share policy: bfq for blk-mq and cfq for > legacy blk. bfq's interface files in the blkio/io controller have the > same names as cfq. But the cgroups interface doesn't allow two > entities to use the same name for their files, so for bfq we had to > prepend the "bfq" prefix to each of its files. However no legacy code > uses these modified file names. This naming also causes confusion, as, > e.g., in [1]. > > Now cfq has gone with legacy blk, so there is no need any longer for > these prefixes in (the never used) bfq names. In view of this fact, this > commit removes these prefixes, thereby enabling legacy code to truly > use the proportional share policy in blk-mq. > > [1]https://github.com/systemd/systemd/issues/7057 > > Signed-off-by: Angelo Ruocco<angeloruocco90@gmail.com> > Signed-off-by: Paolo Valente<paolo.valente@linaro.org> ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH 2/2] block, bfq: delete "bfq" prefix from cgroup filenames 2019-09-17 16:51 ` [PATCH 2/2] block, bfq: delete "bfq" prefix from cgroup filenames Paolo Valente 2019-09-17 21:04 ` Chaitanya Kulkarni @ 2019-09-17 21:32 ` Tejun Heo 2019-09-18 5:18 ` Paolo Valente 2019-09-18 6:09 ` Ulf Hansson 1 sibling, 2 replies; 13+ messages in thread From: Tejun Heo @ 2019-09-17 21:32 UTC (permalink / raw) To: Paolo Valente Cc: Jens Axboe, linux-block, linux-kernel, ulf.hansson, linus.walleij, bfq-iosched, oleksandr, cgroups, Angelo Ruocco Hello, On Tue, Sep 17, 2019 at 06:51:48PM +0200, Paolo Valente wrote: > When bfq was merged into mainline, there were two I/O schedulers that > implemented the proportional-share policy: bfq for blk-mq and cfq for > legacy blk. bfq's interface files in the blkio/io controller have the > same names as cfq. But the cgroups interface doesn't allow two > entities to use the same name for their files, so for bfq we had to > prepend the "bfq" prefix to each of its files. However no legacy code > uses these modified file names. This naming also causes confusion, as, > e.g., in [1]. > > Now cfq has gone with legacy blk, so there is no need any longer for > these prefixes in (the never used) bfq names. In view of this fact, this > commit removes these prefixes, thereby enabling legacy code to truly > use the proportional share policy in blk-mq. So, I wrote the iocost switching patch and don't have a strong interest in whether bfq prefix should get dropped or not. However, I gotta point out that flipping interface this way is way out of the norm. In the previous release cycle, the right thing to do was dropping the bfq prefix but that wasn't possible because bfq's interface wasn't compatible at that point and didn't made to be compatible in time. Non-obviously different interface with the same name is a lot worse than giving it a new name, so the only acceptable course of action at that point was keeping the bfq prefix. Now that the interface has already been published in a released kernel, dropping the prefix would be something extremely unusual as there would already be users who will be affected by the interface flip-flop. We sometimes do change interfaces but I'm having a difficult time seeing the overriding rationales in this case. Thanks. -- tejun ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH 2/2] block, bfq: delete "bfq" prefix from cgroup filenames 2019-09-17 21:32 ` Tejun Heo @ 2019-09-18 5:18 ` Paolo Valente 2019-09-18 15:19 ` Tejun Heo 2019-09-18 6:09 ` Ulf Hansson 1 sibling, 1 reply; 13+ messages in thread From: Paolo Valente @ 2019-09-18 5:18 UTC (permalink / raw) To: Tejun Heo Cc: Jens Axboe, linux-block, linux-kernel, ulf.hansson, linus.walleij, bfq-iosched, oleksandr, cgroups, Angelo Ruocco > Il giorno 17 set 2019, alle ore 23:32, Tejun Heo <tj@kernel.org> ha scritto: > > Hello, > > On Tue, Sep 17, 2019 at 06:51:48PM +0200, Paolo Valente wrote: >> When bfq was merged into mainline, there were two I/O schedulers that >> implemented the proportional-share policy: bfq for blk-mq and cfq for >> legacy blk. bfq's interface files in the blkio/io controller have the >> same names as cfq. But the cgroups interface doesn't allow two >> entities to use the same name for their files, so for bfq we had to >> prepend the "bfq" prefix to each of its files. However no legacy code >> uses these modified file names. This naming also causes confusion, as, >> e.g., in [1]. >> >> Now cfq has gone with legacy blk, so there is no need any longer for >> these prefixes in (the never used) bfq names. In view of this fact, this >> commit removes these prefixes, thereby enabling legacy code to truly >> use the proportional share policy in blk-mq. > > So, I wrote the iocost switching patch and don't have a strong > interest in whether bfq prefix should get dropped or not. However, I > gotta point out that flipping interface this way is way out of the > norm. > > In the previous release cycle, the right thing to do was dropping the > bfq prefix but that wasn't possible because bfq's interface wasn't > compatible at that point and didn't made to be compatible in time. > Non-obviously different interface with the same name is a lot worse > than giving it a new name, so the only acceptable course of action at > that point was keeping the bfq prefix. > > Now that the interface has already been published in a released > kernel, dropping the prefix would be something extremely unusual as > there would already be users who will be affected by the interface > flip-flop. We sometimes do change interfaces but I'm having a > difficult time seeing the overriding rationales in this case. > This issue is a nightmare :) Userspace wants the weight to be called weight (I'm not reporting links to threads again). *Any* solution that gets to this is ok for me. A solution that both fulfills userspace request and doesn't break anything for hypothetical users of the current interface already made it to mainline, and Linus liked it too. It is: 19e9da9e86c4 ("block, bfq: add weight symlink to the bfq.weight cgroup parameter") But it was then reverted on Tejun's request to do exactly what we don't want do any longer now: cf8929885de3 ("cgroup/bfq: revert bfq.weight symlink change") So, Jens, Tejun, can we please just revert that revert? Thanks, Paolo > Thanks. > > -- > tejun ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH 2/2] block, bfq: delete "bfq" prefix from cgroup filenames 2019-09-18 5:18 ` Paolo Valente @ 2019-09-18 15:19 ` Tejun Heo 2019-09-18 16:19 ` Paolo Valente 0 siblings, 1 reply; 13+ messages in thread From: Tejun Heo @ 2019-09-18 15:19 UTC (permalink / raw) To: Paolo Valente Cc: Jens Axboe, linux-block, linux-kernel, ulf.hansson, linus.walleij, bfq-iosched, oleksandr, cgroups, Angelo Ruocco Hello, On Wed, Sep 18, 2019 at 07:18:50AM +0200, Paolo Valente wrote: > A solution that both fulfills userspace request and doesn't break > anything for hypothetical users of the current interface already made > it to mainline, and Linus liked it too. It is: Linus didn't like it. The implementation was a bit nasty. That was why it became a subject in the first place. > 19e9da9e86c4 ("block, bfq: add weight symlink to the bfq.weight cgroup parameter") > > But it was then reverted on Tejun's request to do exactly what we > don't want do any longer now: > cf8929885de3 ("cgroup/bfq: revert bfq.weight symlink change") Note that the interface was wrong at the time too. > So, Jens, Tejun, can we please just revert that revert? I think presenting both io.weight and io.bfq.weight interfaces are probably the best course of action at this point but why does it have to be a symlink? What's wrong with just creating another file with the same backing function? Thanks. -- tejun ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH 2/2] block, bfq: delete "bfq" prefix from cgroup filenames 2019-09-18 15:19 ` Tejun Heo @ 2019-09-18 16:19 ` Paolo Valente 2019-09-20 6:58 ` Paolo Valente 0 siblings, 1 reply; 13+ messages in thread From: Paolo Valente @ 2019-09-18 16:19 UTC (permalink / raw) To: Tejun Heo Cc: Jens Axboe, linux-block, linux-kernel, Ulf Hansson, Linus Walleij, noreply-spamdigest via bfq-iosched, oleksandr, cgroups, Angelo Ruocco > Il giorno 18 set 2019, alle ore 17:19, Tejun Heo <tj@kernel.org> ha scritto: > > Hello, > > On Wed, Sep 18, 2019 at 07:18:50AM +0200, Paolo Valente wrote: >> A solution that both fulfills userspace request and doesn't break >> anything for hypothetical users of the current interface already made >> it to mainline, and Linus liked it too. It is: > > Linus didn't like it. The implementation was a bit nasty. That was > why it became a subject in the first place. > >> 19e9da9e86c4 ("block, bfq: add weight symlink to the bfq.weight cgroup parameter") >> >> But it was then reverted on Tejun's request to do exactly what we >> don't want do any longer now: >> cf8929885de3 ("cgroup/bfq: revert bfq.weight symlink change") > > Note that the interface was wrong at the time too. > >> So, Jens, Tejun, can we please just revert that revert? > > I think presenting both io.weight and io.bfq.weight interfaces are > probably the best course of action at this point but why does it have > to be a symlink? What's wrong with just creating another file with > the same backing function? > I think a symlink would be much clearer for users, given the confusion already caused by two names for the same parameter. But let's hear others' opinion too. Thanks, Paolo > Thanks. > > -- > tejun ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH 2/2] block, bfq: delete "bfq" prefix from cgroup filenames 2019-09-18 16:19 ` Paolo Valente @ 2019-09-20 6:58 ` Paolo Valente 2019-09-20 13:05 ` Jens Axboe 0 siblings, 1 reply; 13+ messages in thread From: Paolo Valente @ 2019-09-20 6:58 UTC (permalink / raw) To: Tejun Heo Cc: Jens Axboe, linux-block, linux-kernel, Ulf Hansson, Linus Walleij, noreply-spamdigest via bfq-iosched, Oleksandr Natalenko, cgroups, Angelo Ruocco > Il giorno 18 set 2019, alle ore 18:19, Paolo Valente <paolo.valente@linaro.org> ha scritto: > > > >> Il giorno 18 set 2019, alle ore 17:19, Tejun Heo <tj@kernel.org> ha scritto: >> >> Hello, >> >> On Wed, Sep 18, 2019 at 07:18:50AM +0200, Paolo Valente wrote: >>> A solution that both fulfills userspace request and doesn't break >>> anything for hypothetical users of the current interface already made >>> it to mainline, and Linus liked it too. It is: >> >> Linus didn't like it. The implementation was a bit nasty. That was >> why it became a subject in the first place. >> >>> 19e9da9e86c4 ("block, bfq: add weight symlink to the bfq.weight cgroup parameter") >>> >>> But it was then reverted on Tejun's request to do exactly what we >>> don't want do any longer now: >>> cf8929885de3 ("cgroup/bfq: revert bfq.weight symlink change") >> >> Note that the interface was wrong at the time too. >> >>> So, Jens, Tejun, can we please just revert that revert? >> >> I think presenting both io.weight and io.bfq.weight interfaces are >> probably the best course of action at this point but why does it have >> to be a symlink? What's wrong with just creating another file with >> the same backing function? >> > > I think a symlink would be much clearer for users, given the confusion > already caused by two names for the same parameter. But let's hear > others' opinion too. > Jens, could you express your opinion on this? Any solution you and Tejun agree on is ok for me. Also this new (fourth) possible implementation of this fix, provided that then it is definitely ok for both of you. Thanks, Paolo > Thanks, > Paolo > >> Thanks. >> >> -- >> tejun ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH 2/2] block, bfq: delete "bfq" prefix from cgroup filenames 2019-09-20 6:58 ` Paolo Valente @ 2019-09-20 13:05 ` Jens Axboe 2019-09-21 6:55 ` Paolo Valente 0 siblings, 1 reply; 13+ messages in thread From: Jens Axboe @ 2019-09-20 13:05 UTC (permalink / raw) To: Paolo Valente, Tejun Heo Cc: linux-block, linux-kernel, Ulf Hansson, Linus Walleij, noreply-spamdigest via bfq-iosched, Oleksandr Natalenko, cgroups, Angelo Ruocco On 9/20/19 12:58 AM, Paolo Valente wrote: > > >> Il giorno 18 set 2019, alle ore 18:19, Paolo Valente <paolo.valente@linaro.org> ha scritto: >> >> >> >>> Il giorno 18 set 2019, alle ore 17:19, Tejun Heo <tj@kernel.org> ha scritto: >>> >>> Hello, >>> >>> On Wed, Sep 18, 2019 at 07:18:50AM +0200, Paolo Valente wrote: >>>> A solution that both fulfills userspace request and doesn't break >>>> anything for hypothetical users of the current interface already made >>>> it to mainline, and Linus liked it too. It is: >>> >>> Linus didn't like it. The implementation was a bit nasty. That was >>> why it became a subject in the first place. >>> >>>> 19e9da9e86c4 ("block, bfq: add weight symlink to the bfq.weight cgroup parameter") >>>> >>>> But it was then reverted on Tejun's request to do exactly what we >>>> don't want do any longer now: >>>> cf8929885de3 ("cgroup/bfq: revert bfq.weight symlink change") >>> >>> Note that the interface was wrong at the time too. >>> >>>> So, Jens, Tejun, can we please just revert that revert? >>> >>> I think presenting both io.weight and io.bfq.weight interfaces are >>> probably the best course of action at this point but why does it have >>> to be a symlink? What's wrong with just creating another file with >>> the same backing function? >>> >> >> I think a symlink would be much clearer for users, given the confusion >> already caused by two names for the same parameter. But let's hear >> others' opinion too. >> > > Jens, could you express your opinion on this? Any solution you and > Tejun agree on is ok for me. Also this new (fourth) possible > implementation of this fix, provided that then it is definitely ok for > both of you. Retaining both interfaces is arguably the right solution. It would be nice if we didn't have to, but the first bfq variant was incompatible with the in-kernel one, so we'll always have that out in the wild. Adding everything to stable doesn't work, as we still have existing kernels out there with the interface. In fact, in some ways that's worse, as you definitely don't want interfaces to change between two stable kernels. I know it's not ideal, and some better initial planning would have made it better, but we have to deal with the situation as it stands now. -- Jens Axboe ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH 2/2] block, bfq: delete "bfq" prefix from cgroup filenames 2019-09-20 13:05 ` Jens Axboe @ 2019-09-21 6:55 ` Paolo Valente 0 siblings, 0 replies; 13+ messages in thread From: Paolo Valente @ 2019-09-21 6:55 UTC (permalink / raw) To: Jens Axboe Cc: Tejun Heo, linux-block, linux-kernel, Ulf Hansson, Linus Walleij, noreply-spamdigest via bfq-iosched, Oleksandr Natalenko, cgroups, Angelo Ruocco > Il giorno 20 set 2019, alle ore 15:05, Jens Axboe <axboe@kernel.dk> ha scritto: > > On 9/20/19 12:58 AM, Paolo Valente wrote: >> >> >>> Il giorno 18 set 2019, alle ore 18:19, Paolo Valente <paolo.valente@linaro.org> ha scritto: >>> >>> >>> >>>> Il giorno 18 set 2019, alle ore 17:19, Tejun Heo <tj@kernel.org> ha scritto: >>>> >>>> Hello, >>>> >>>> On Wed, Sep 18, 2019 at 07:18:50AM +0200, Paolo Valente wrote: >>>>> A solution that both fulfills userspace request and doesn't break >>>>> anything for hypothetical users of the current interface already made >>>>> it to mainline, and Linus liked it too. It is: >>>> >>>> Linus didn't like it. The implementation was a bit nasty. That was >>>> why it became a subject in the first place. >>>> >>>>> 19e9da9e86c4 ("block, bfq: add weight symlink to the bfq.weight cgroup parameter") >>>>> >>>>> But it was then reverted on Tejun's request to do exactly what we >>>>> don't want do any longer now: >>>>> cf8929885de3 ("cgroup/bfq: revert bfq.weight symlink change") >>>> >>>> Note that the interface was wrong at the time too. >>>> >>>>> So, Jens, Tejun, can we please just revert that revert? >>>> >>>> I think presenting both io.weight and io.bfq.weight interfaces are >>>> probably the best course of action at this point but why does it have >>>> to be a symlink? What's wrong with just creating another file with >>>> the same backing function? >>>> >>> >>> I think a symlink would be much clearer for users, given the confusion >>> already caused by two names for the same parameter. But let's hear >>> others' opinion too. >>> >> >> Jens, could you express your opinion on this? Any solution you and >> Tejun agree on is ok for me. Also this new (fourth) possible >> implementation of this fix, provided that then it is definitely ok for >> both of you. > > Retaining both interfaces is arguably the right solution. So you also are voting for BFQ to create two files, instead of having a symlink, aren't you? I just want to be certain before submitting one more solution. Looking forward to your confirmation, Paolo > It would be > nice if we didn't have to, but the first bfq variant was incompatible > with the in-kernel one, so we'll always have that out in the wild. > Adding everything to stable doesn't work, as we still have existing > kernels out there with the interface. In fact, in some ways that's > worse, as you definitely don't want interfaces to change between two > stable kernels. > > I know it's not ideal, and some better initial planning would have > made it better, but we have to deal with the situation as it stands > now. > > -- > Jens Axboe ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH 2/2] block, bfq: delete "bfq" prefix from cgroup filenames 2019-09-17 21:32 ` Tejun Heo 2019-09-18 5:18 ` Paolo Valente @ 2019-09-18 6:09 ` Ulf Hansson 1 sibling, 0 replies; 13+ messages in thread From: Ulf Hansson @ 2019-09-18 6:09 UTC (permalink / raw) To: Tejun Heo, Paolo Valente Cc: Jens Axboe, linux-block, Linux Kernel Mailing List, Linus Walleij, 'Paolo Valente' via bfq-iosched, Oleksandr Natalenko, cgroups, Angelo Ruocco Tejun, Paolo, On Tue, 17 Sep 2019 at 23:32, Tejun Heo <tj@kernel.org> wrote: > > Hello, > > On Tue, Sep 17, 2019 at 06:51:48PM +0200, Paolo Valente wrote: > > When bfq was merged into mainline, there were two I/O schedulers that > > implemented the proportional-share policy: bfq for blk-mq and cfq for > > legacy blk. bfq's interface files in the blkio/io controller have the > > same names as cfq. But the cgroups interface doesn't allow two > > entities to use the same name for their files, so for bfq we had to > > prepend the "bfq" prefix to each of its files. However no legacy code > > uses these modified file names. This naming also causes confusion, as, > > e.g., in [1]. > > > > Now cfq has gone with legacy blk, so there is no need any longer for > > these prefixes in (the never used) bfq names. In view of this fact, this > > commit removes these prefixes, thereby enabling legacy code to truly > > use the proportional share policy in blk-mq. > > So, I wrote the iocost switching patch and don't have a strong > interest in whether bfq prefix should get dropped or not. However, I > gotta point out that flipping interface this way is way out of the > norm. > > In the previous release cycle, the right thing to do was dropping the > bfq prefix but that wasn't possible because bfq's interface wasn't > compatible at that point and didn't made to be compatible in time. Sounds like we really should send those relevant patches for stable, to set the correct ground. Then using a symlink, to make sure we don't brake current ABI, right? [...] Kind regards Uffe ^ permalink raw reply [flat|nested] 13+ messages in thread
* [PATCH 0/2] block, bfq: make bfq disable iocost and present a double interface @ 2019-10-01 19:33 Paolo Valente 2019-10-01 19:33 ` [PATCH 1/2] blkcg: Make bfq disable iocost when enabled Paolo Valente 0 siblings, 1 reply; 13+ messages in thread From: Paolo Valente @ 2019-10-01 19:33 UTC (permalink / raw) To: Jens Axboe Cc: linux-block, linux-kernel, ulf.hansson, linus.walleij, bfq-iosched, oleksandr, Tejun Heo, cgroups, Paolo Valente Hi Jens, the first patch in this series is Tejun's patch for making BFQ disable io.cost. The second patch makes BFQ present both the bfq-prefixes parameters and non-prefixed parameters, as suggested by Tejun [1]. In the first patch I've tried to use macros not to repeat code twice. checkpatch complains that these macros should be enclosed in parentheses. I don't see how to do it. I'm willing to switch to any better solution. Thanks, Paolo [1] https://lkml.org/lkml/2019/9/18/736 Paolo Valente (1): block, bfq: present a double cgroups interface Tejun Heo (1): blkcg: Make bfq disable iocost when enabled Documentation/admin-guide/cgroup-v2.rst | 8 +- Documentation/block/bfq-iosched.rst | 40 ++-- block/bfq-cgroup.c | 260 ++++++++++++------------ block/bfq-iosched.c | 32 +++ block/blk-iocost.c | 5 +- include/linux/blk-cgroup.h | 5 + kernel/cgroup/cgroup.c | 2 + 7 files changed, 201 insertions(+), 151 deletions(-) -- 2.20.1 ^ permalink raw reply [flat|nested] 13+ messages in thread
* [PATCH 1/2] blkcg: Make bfq disable iocost when enabled 2019-10-01 19:33 [PATCH 0/2] block, bfq: make bfq disable iocost and present a double interface Paolo Valente @ 2019-10-01 19:33 ` Paolo Valente 0 siblings, 0 replies; 13+ messages in thread From: Paolo Valente @ 2019-10-01 19:33 UTC (permalink / raw) To: Jens Axboe Cc: linux-block, linux-kernel, ulf.hansson, linus.walleij, bfq-iosched, oleksandr, Tejun Heo, cgroups, Paolo Valente From: Tejun Heo <tj@kernel.org> Both iocost and bfq implement weight based IO control. Currently, bfq is using io.bfq prefix but wants to drop the bfq part. To avoid interface conflict, make bfq disable iocost when it's selected as the IO scheduler for any block device on the system. iocost is only re-enabled when bfq is built as a module and unloaded. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Paolo Valente <paolo.valente@linaro.org> --- Documentation/admin-guide/cgroup-v2.rst | 8 ++++--- block/bfq-cgroup.c | 2 ++ block/bfq-iosched.c | 32 +++++++++++++++++++++++++ block/blk-iocost.c | 5 ++-- include/linux/blk-cgroup.h | 5 ++++ kernel/cgroup/cgroup.c | 2 ++ 6 files changed, 48 insertions(+), 6 deletions(-) diff --git a/Documentation/admin-guide/cgroup-v2.rst b/Documentation/admin-guide/cgroup-v2.rst index 0fa8c0e615c2..8374957213f1 100644 --- a/Documentation/admin-guide/cgroup-v2.rst +++ b/Documentation/admin-guide/cgroup-v2.rst @@ -1440,9 +1440,11 @@ IO The "io" controller regulates the distribution of IO resources. This controller implements both weight based and absolute bandwidth or IOPS -limit distribution; however, weight based distribution is available -only if cfq-iosched is in use and neither scheme is available for -blk-mq devices. +limit distribution. Weight based distribution is implemented by +either iocost controller or bfq IO scheduler. When bfq is selected as +the IO scheduler for any block device, iocost is disabled and bfq's +implementation overrides for all devices. If bfq is built as a kernel +module, unloading it re-enables iocost. IO Interface Files diff --git a/block/bfq-cgroup.c b/block/bfq-cgroup.c index 86a607cf19a1..decda96770f4 100644 --- a/block/bfq-cgroup.c +++ b/block/bfq-cgroup.c @@ -1194,7 +1194,9 @@ struct bfq_group *bfq_create_group_hierarchy(struct bfq_data *bfqd, int node) } struct blkcg_policy blkcg_policy_bfq = { +#ifndef CONFIG_BLK_CGROUP_IOCOST .dfl_cftypes = bfq_blkg_files, +#endif .legacy_cftypes = bfq_blkcg_legacy_files, .cpd_alloc_fn = bfq_cpd_alloc, diff --git a/block/bfq-iosched.c b/block/bfq-iosched.c index 0319d6339822..21d1b08610b1 100644 --- a/block/bfq-iosched.c +++ b/block/bfq-iosched.c @@ -6382,6 +6382,36 @@ static void bfq_init_root_group(struct bfq_group *root_group, root_group->sched_data.bfq_class_idle_last_service = jiffies; } +#if defined(CONFIG_BFQ_GROUP_IOSCHED) && defined(CONFIG_BLK_CGROUP_IOCOST) +static bool bfq_enabled = false; + +static void bfq_enable(void) +{ + static DEFINE_MUTEX(bfq_enable_mutex); + + mutex_lock(&bfq_enable_mutex); + if (!bfq_enabled) { + pr_info("bfq-iosched: Overriding iocost\n"); + blkcg_policy_unregister(&blkcg_policy_iocost); + cgroup_add_dfl_cftypes(&io_cgrp_subsys, bfq_blkg_files); + bfq_enabled = true; + } + mutex_unlock(&bfq_enable_mutex); +} + +static void __exit bfq_disable(void) +{ + if (bfq_enabled) { + pr_info("bfq-iosched: Restoring iocost\n"); + cgroup_rm_cftypes(bfq_blkg_files); + blkcg_policy_register(&blkcg_policy_iocost); + } +} +#else +static void bfq_enable(void) {} +static void __exit bfq_disable(void) {} +#endif + static int bfq_init_queue(struct request_queue *q, struct elevator_type *e) { struct bfq_data *bfqd; @@ -6506,6 +6536,7 @@ static int bfq_init_queue(struct request_queue *q, struct elevator_type *e) bfq_init_entity(&bfqd->oom_bfqq.entity, bfqd->root_group); wbt_disable_default(q); + bfq_enable(); return 0; out_free: @@ -6823,6 +6854,7 @@ static void __exit bfq_exit(void) blkcg_policy_unregister(&blkcg_policy_bfq); #endif bfq_slab_kill(); + bfq_disable(); } module_init(bfq_init); diff --git a/block/blk-iocost.c b/block/blk-iocost.c index 2a3db80c1dce..511bf80b6db3 100644 --- a/block/blk-iocost.c +++ b/block/blk-iocost.c @@ -605,8 +605,6 @@ static u32 vrate_adj_pct[] = 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 4, 4, 4, 4, 4, 4, 4, 4, 8, 8, 8, 8, 8, 8, 8, 8, 16 }; -static struct blkcg_policy blkcg_policy_iocost; - /* accessors and helpers */ static struct ioc *rqos_to_ioc(struct rq_qos *rqos) { @@ -2442,7 +2440,7 @@ static struct cftype ioc_files[] = { {} }; -static struct blkcg_policy blkcg_policy_iocost = { +struct blkcg_policy blkcg_policy_iocost = { .dfl_cftypes = ioc_files, .cpd_alloc_fn = ioc_cpd_alloc, .cpd_free_fn = ioc_cpd_free, @@ -2450,6 +2448,7 @@ static struct blkcg_policy blkcg_policy_iocost = { .pd_init_fn = ioc_pd_init, .pd_free_fn = ioc_pd_free, }; +EXPORT_SYMBOL_GPL(blkcg_policy_iocost); static int __init ioc_init(void) { diff --git a/include/linux/blk-cgroup.h b/include/linux/blk-cgroup.h index bed9e43f9426..5669e3cfa1bc 100644 --- a/include/linux/blk-cgroup.h +++ b/include/linux/blk-cgroup.h @@ -815,6 +815,11 @@ static inline void blkcg_clear_delay(struct blkcg_gq *blkg) void blkcg_add_delay(struct blkcg_gq *blkg, u64 now, u64 delta); void blkcg_schedule_throttle(struct request_queue *q, bool use_memdelay); void blkcg_maybe_throttle_current(void); + +#ifdef CONFIG_BLK_CGROUP_IOCOST +extern struct blkcg_policy blkcg_policy_iocost; +#endif + #else /* CONFIG_BLK_CGROUP */ struct blkcg { diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c index 080561bb8a4b..9c9a674c12bd 100644 --- a/kernel/cgroup/cgroup.c +++ b/kernel/cgroup/cgroup.c @@ -4059,6 +4059,7 @@ int cgroup_rm_cftypes(struct cftype *cfts) mutex_unlock(&cgroup_mutex); return ret; } +EXPORT_SYMBOL_GPL(cgroup_rm_cftypes); /** * cgroup_add_cftypes - add an array of cftypes to a subsystem @@ -4115,6 +4116,7 @@ int cgroup_add_dfl_cftypes(struct cgroup_subsys *ss, struct cftype *cfts) cft->flags |= __CFTYPE_ONLY_ON_DFL; return cgroup_add_cftypes(ss, cfts); } +EXPORT_SYMBOL_GPL(cgroup_add_dfl_cftypes); /** * cgroup_add_legacy_cftypes - add an array of cftypes for legacy hierarchies -- 2.20.1 ^ permalink raw reply related [flat|nested] 13+ messages in thread
end of thread, other threads:[~2019-10-01 19:33 UTC | newest] Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2019-09-17 16:51 [PATCH 0/2 block/for-next] block, blkcg, bfq: make bfq disable iocost and delete bfq prefix from cgroup filenames Paolo Valente 2019-09-17 16:51 ` [PATCH 1/2] blkcg: Make bfq disable iocost when enabled Paolo Valente 2019-09-17 16:51 ` [PATCH 2/2] block, bfq: delete "bfq" prefix from cgroup filenames Paolo Valente 2019-09-17 21:04 ` Chaitanya Kulkarni 2019-09-17 21:32 ` Tejun Heo 2019-09-18 5:18 ` Paolo Valente 2019-09-18 15:19 ` Tejun Heo 2019-09-18 16:19 ` Paolo Valente 2019-09-20 6:58 ` Paolo Valente 2019-09-20 13:05 ` Jens Axboe 2019-09-21 6:55 ` Paolo Valente 2019-09-18 6:09 ` Ulf Hansson 2019-10-01 19:33 [PATCH 0/2] block, bfq: make bfq disable iocost and present a double interface Paolo Valente 2019-10-01 19:33 ` [PATCH 1/2] blkcg: Make bfq disable iocost when enabled Paolo Valente
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).